Webhook Data Validation: How to Stop Bad Data from Polluting Your CRM
Practical patterns for validating webhook payloads before they hit your CRM - required fields, format checks, deduplication, rate limiting - with real examples in Make.com and n8n.
Verification note: This post was re-reviewed in May 2026. Public tool pricing, compliance rules, and platform capabilities should be checked against the source list at the end before making budget, legal, or deployment decisions. Private client metrics are not published unless they are safe, public, and verifiable.
Why webhook validation matters
Webhooks are the connective tissue of modern automation stacks. A form submits -> webhook fires -> data flows to your CRM. A call ends -> webhook fires -> outcome updates your pipeline.
Without validation, every webhook becomes a potential vector for bad data:
- A form with no email address creates a ghost contact
- A VAPI webhook with malformed fields breaks your workflow
- A Stripe webhook received twice creates a duplicate deal
- A malicious POST fills your CRM with junk
Validation is the checkpoint between "data exists" and "data enters your system."
The 5 validation layers
Layer 1: Structure validation
Does the payload match the expected shape?
Check:
- Required fields are present (email, phone, or whatever's minimum)
- Fields are of expected types (string, number, boolean)
- Nested objects/arrays exist if expected
Fail behavior: Reject with a 400 status code. Don't log as success.
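As a rough illustration, the structure check can be sketched in plain JavaScript. The function name validateStructure and the FIELD_TYPES spec are hypothetical, and the "email or phone" minimum is an assumption; swap in whatever your payload actually requires.

```javascript
// Hypothetical field spec: field name -> expected typeof result.
const FIELD_TYPES = { email: 'string', phone: 'string', name: 'string', budget: 'string' };

// Returns { ok, status, errors }. Status 400 mirrors the fail behavior above.
function validateStructure(payload) {
  if (payload === null || typeof payload !== 'object' || Array.isArray(payload)) {
    return { ok: false, status: 400, errors: ['Payload must be a JSON object'] };
  }
  const errors = [];
  // Assumed minimum: at least one contact channel is present.
  if (!payload.email && !payload.phone) {
    errors.push('Missing required field: email or phone');
  }
  // Type checks apply only to fields that are present.
  for (const [field, type] of Object.entries(FIELD_TYPES)) {
    if (payload[field] !== undefined && typeof payload[field] !== type) {
      errors.push(`Field "${field}" must be a ${type}`);
    }
  }
  return errors.length > 0
    ? { ok: false, status: 400, errors }
    : { ok: true, status: 200, errors: [] };
}
```

In n8n this could live in a Code node; in Make.com the same checks map onto filter conditions.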
Layer 2: Format validation
Do fields have valid values?
Check:
- Email matches regex (/^[^\s@]+@[^\s@]+\.[^\s@]+$/)
- Phone is parseable as E.164
- Dates are valid ISO 8601
- URLs are well-formed
Fail behavior: Route to an error queue. Log for human review.
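The four format checks could look roughly like this in JavaScript. The helper names are hypothetical, the E.164 pattern is the commonly used "plus sign, non-zero first digit, at most 15 digits" form, and the ISO 8601 pattern covers only the common date and date-time shapes, not the full standard.

```javascript
const EMAIL_RE = /^[^\s@]+@[^\s@]+\.[^\s@]+$/; // same pattern as the list above
const E164_RE = /^\+[1-9]\d{1,14}$/;           // "+", non-zero first digit, max 15 digits
// Date-only or date-time ISO 8601 shapes (a deliberate subset of the standard).
const ISO_RE = /^\d{4}-\d{2}-\d{2}(T\d{2}:\d{2}(:\d{2}(\.\d+)?)?(Z|[+-]\d{2}:\d{2})?)?$/;

function isValidEmail(value) {
  return typeof value === 'string' && EMAIL_RE.test(value);
}

function isValidE164(value) {
  return typeof value === 'string' && E164_RE.test(value);
}

// Must match the ISO shape and also be parseable by Date.
function isValidIsoDate(value) {
  return typeof value === 'string' && ISO_RE.test(value) && !Number.isNaN(Date.parse(value));
}

// Only http and https URLs are treated as well-formed here (an assumption).
function isValidUrl(value) {
  try {
    const url = new URL(value);
    return url.protocol === 'http:' || url.protocol === 'https:';
  } catch {
    return false;
  }
}
```

The email pattern is intentionally loose: it catches obvious junk without rejecting unusual but valid addresses.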
Layer 3: Business logic validation
Does the data make sense for your business?
Check:
- Budget value is within realistic range
- Deal amount isn't negative
- Timeline values are from expected set
- Source tag is from canonical list
Fail behavior: Accept but flag with a "review" tag.
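A minimal sketch of "accept but flag", assuming hypothetical field names (dealAmount, timeline, source) and placeholder canonical lists. The limits and values are illustrative only.

```javascript
// Placeholder canonical values and limits; replace with your own.
const TIMELINES = new Set(['asap', '1-3 months', '3-6 months', '6+ months']);
const SOURCES = new Set(['website-form', 'facebook-ads', 'referral']);
const MAX_DEAL_AMOUNT = 1000000;

// Returns the record with tags attached; problems are flagged, not rejected.
function applyBusinessRules(record) {
  const issues = [];
  if (typeof record.dealAmount === 'number') {
    if (record.dealAmount < 0) issues.push('negative deal amount');
    if (record.dealAmount > MAX_DEAL_AMOUNT) issues.push('deal amount above realistic range');
  }
  if (record.timeline !== undefined && !TIMELINES.has(record.timeline)) {
    issues.push('unknown timeline');
  }
  if (record.source !== undefined && !SOURCES.has(record.source)) {
    issues.push('unknown source');
  }
  return { ...record, tags: issues.length > 0 ? ['review'] : [], reviewReasons: issues };
}
```

Keeping the reasons alongside the tag gives the reviewer something concrete to check.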
Layer 4: Deduplication
Is this data already in the system?
Check:
- Normalized email or phone matches existing contact
- Same event ID already processed (idempotency)
- Same form submission within dedup window (e.g., 5 minutes)
Fail behavior: Update existing record instead of creating duplicate.
Layer 5: Security validation
Is the request actually from the expected source?
Check:
- Signature header matches expected HMAC (for providers that sign webhooks - Stripe, Shopify, GitHub)
- IP whitelist (if provider publishes allowed IPs)
- Shared secret in header or query param
Fail behavior: Reject with 401 Unauthorized. Log attempt.
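For signed webhooks, the core of the check is an HMAC comparison. The sketch below assumes the provider sends a plain hex HMAC-SHA256 of the raw body; real providers differ in the details (prefixes, base64 encoding, timestamped payloads), so treat verifyHmacSignature as a hypothetical helper and follow the provider's documentation for the exact scheme.

```javascript
const crypto = require('node:crypto');

// Compares a hex signature against HMAC-SHA256 of the raw request body.
// Assumption: the signature is a bare hex digest with no prefix.
function verifyHmacSignature(rawBody, signatureHex, secret) {
  if (typeof signatureHex !== 'string' || signatureHex.length === 0) return false;
  const expected = crypto.createHmac('sha256', secret).update(rawBody, 'utf8').digest();
  const received = Buffer.from(signatureHex, 'hex');
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  if (received.length !== expected.length) return false;
  return crypto.timingSafeEqual(received, expected);
}
```

The constant-time comparison avoids leaking how many leading characters of a guessed signature were correct. Verify against the raw body bytes, since re-serialized JSON can differ from what the sender signed.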
Implementation: Make.com
Basic structure validation
At the top of the webhook scenario, add a filter module that checks required fields:
email is not empty AND email contains @ AND phone is not empty
If false, route to error branch.
Email regex validation
Use a filter with:
email matches pattern ^[^\s@]+@[^\s@]+\.[^\s@]+$
Deduplication via HubSpot/GHL lookup
Before creating a contact:
- Search existing contacts by email (or normalized phone)
- If found -> update instead of create
- If not found -> create
Security via shared secret
Most webhook providers let you set a secret query parameter. In Make:
- Add a filter: _{query.secret}_ equals "YOUR-SECRET-HERE"
- If not, reject
For signed webhooks (Stripe, Shopify):
- Extract signature from header
- Compute HMAC using your secret
- Compare - if mismatch, reject
Implementation: n8n
Webhook node validation
Start with a Webhook trigger node. Immediately after, add a Code node with validation logic:
// Code node mode: Run Once for Each Item
// The Webhook node nests the request payload under "body"
const { email, phone, name } = $input.item.json.body ?? $input.item.json;
const errors = [];

// Required fields
if (!email && !phone) {
  errors.push('Must have email or phone');
}

// Email format
if (email && !/^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(email)) {
  errors.push('Invalid email format');
}

// Phone format (E.164 style: "+", non-zero first digit, 10 to 15 digits total)
if (phone && !/^\+[1-9]\d{9,14}$/.test(phone)) {
  errors.push('Invalid phone format');
}

if (errors.length > 0) {
  return { json: { valid: false, errors } };
}
return { json: { valid: true, data: { email, phone, name } } };
Then an IF node to branch on valid === true.
Signature verification (Stripe example)
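// Requires the crypto built-in to be allowed for Code nodes
const crypto = require('crypto');
const header = $input.item.json.headers['stripe-signature'] || '';
// Stripe signs the raw request body, so re-serialized JSON may not match byte-for-byte
const payload = JSON.stringify($input.item.json.body);
const secret = 'your-stripe-webhook-secret';
// Header format: t=<timestamp>,v1=<signature>
const parts = header.split(',').map((part) => part.split('='));
const timestamp = (parts.find(([key]) => key === 't') || [])[1];
const signatures = parts.filter(([key]) => key === 'v1').map(([, value]) => value || '');
// The signed payload is "<timestamp>.<body>"
const expectedSig = crypto
  .createHmac('sha256', secret)
  .update(`${timestamp}.${payload}`, 'utf8')
  .digest('hex');
const valid = signatures.some((sig) =>
  sig.length === expectedSig.length &&
  crypto.timingSafeEqual(Buffer.from(sig), Buffer.from(expectedSig)));
return { json: { valid } };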
Deduplication patterns
Pattern 1: Email/phone lookup before create
For every incoming lead:
- Normalize email (lowercase, trim)
- Normalize phone (E.164)
- Lookup contact by normalized email
- If found -> update fields (merge strategy: newer wins)
- If not found -> create new contact
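The lookup-before-create steps can be sketched as follows. A Map stands in for the CRM search, the function names are hypothetical, and the phone normalization is deliberately naive (it assumes a default country code of 1 when no "+" is present); a dedicated phone-parsing library handles far more cases. The sketch also assumes every lead carries an email.

```javascript
function normalizeEmail(email) {
  return typeof email === 'string' ? email.trim().toLowerCase() : '';
}

// Naive E.164 normalization: strips formatting and prepends an assumed
// default country code when the input has no leading "+".
function normalizePhone(phone, defaultCountryCode = '1') {
  if (typeof phone !== 'string') return '';
  const digits = phone.replace(/\D/g, '');
  if (digits === '') return '';
  return phone.trim().startsWith('+') ? `+${digits}` : `+${defaultCountryCode}${digits}`;
}

// contacts: Map keyed by normalized email, standing in for a CRM lookup.
function upsertContact(contacts, lead) {
  const email = normalizeEmail(lead.email);
  // Drop empty values so a sparse update cannot blank out stored fields.
  const incoming = Object.fromEntries(
    Object.entries({ ...lead, email, phone: normalizePhone(lead.phone) })
      .filter(([, value]) => value !== '' && value !== undefined && value !== null)
  );
  const existing = contacts.get(email);
  // Merge strategy "newer wins": incoming fields overwrite stored ones.
  const merged = existing ? { ...existing, ...incoming } : incoming;
  contacts.set(email, merged);
  return { action: existing ? 'updated' : 'created', contact: merged };
}
```

Filtering out empty incoming values is a design choice: without it, "newer wins" would let a second submission with no phone erase a phone number captured earlier.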
Pattern 2: Event ID idempotency
For events that might be delivered twice (Stripe, Shopify retry failed webhooks):
- Extract event ID from payload
- Check if we've processed this event ID before (store in Data Store or database)
- If yes -> skip (return 200 OK to prevent retry)
- If no -> process and record event ID
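A minimal sketch of the idempotency check, with a Set standing in for the Data Store or database table. handleEvent and processFn are hypothetical names, and the sketch assumes the event ID lives at event.id.

```javascript
// processed: a Set standing in for a Data Store or database table of event IDs.
function handleEvent(processed, event, processFn) {
  if (!event || typeof event.id !== 'string' || event.id === '') {
    return { status: 400, processed: false };
  }
  if (processed.has(event.id)) {
    // Already handled: acknowledge so the sender stops retrying.
    return { status: 200, processed: false };
  }
  processFn(event);
  // Record only after processing succeeds, so a failed run can be retried.
  processed.add(event.id);
  return { status: 200, processed: true };
}
```

With a real database, a unique constraint on the event ID column gives the same guarantee even when two deliveries arrive at the same moment.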
Pattern 3: Time-window dedup
For forms where users might accidentally double-submit:
- Check if same email submitted in last 5 minutes
- If yes -> treat as duplicate, don't create new contact or trigger new workflow
- If no -> process
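The time-window check might look like this, with a Map holding the last accepted submission time per email. The names are hypothetical, and an in-memory Map only works within a single process; a Data Store or database would be needed across workflow runs.

```javascript
const DEDUP_WINDOW_MS = 5 * 60 * 1000; // 5 minutes

// lastSeen: Map of normalized email -> timestamp (ms) of the last accepted submission.
function isDuplicateSubmission(lastSeen, email, now = Date.now()) {
  const key = email.trim().toLowerCase();
  const previous = lastSeen.get(key);
  if (previous !== undefined && now - previous < DEDUP_WINDOW_MS) {
    return true; // inside the window: treat as an accidental double-submit
  }
  lastSeen.set(key, now);
  return false;
}
```

Duplicates do not reset the timer here, so the window is measured from the first accepted submission rather than the most recent attempt.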
Rate limiting
If a webhook endpoint is public, it can be abused. Rate limiting prevents flooding.
Simple IP rate limit
Track request count per IP in a Data Store / Redis / database. If >N requests in time window (e.g., 10 requests/minute), reject with 429.
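A fixed-window counter is one simple way to implement this. The sketch below keeps counts in a Map (a stand-in for the Data Store, Redis, or database) and uses the 10 requests/minute example as its default; checkRateLimit is a hypothetical name.

```javascript
// counters: Map of ip -> { windowStart, count }.
function checkRateLimit(counters, ip, now = Date.now(), limit = 10, windowMs = 60000) {
  const entry = counters.get(ip);
  // No entry yet, or the previous window has expired: start a new window.
  if (!entry || now - entry.windowStart >= windowMs) {
    counters.set(ip, { windowStart: now, count: 1 });
    return { allowed: true, status: 200 };
  }
  entry.count += 1;
  if (entry.count > limit) {
    return { allowed: false, status: 429 };
  }
  return { allowed: true, status: 200 };
}
```

Fixed windows allow short bursts at window boundaries; a sliding window or token bucket is stricter if that matters for your endpoint.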
In-app rate limiting (if supported)
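Some destination platforms, such as GoHighLevel and HubSpot, enforce their own limits on inbound webhook or API traffic. Where a setting exists, configure it at the destination level, and check the platform's current documentation for the actual limits.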
Per-contact rate limit
Prevent the same email from triggering 20 workflows in an hour. Use a tag like "processed-today" with 24-hour expiry - skip if present.
Error handling
When validation fails:
Option 1: Reject with HTTP status
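Return 400/401/403 to the sender. Providers such as Stripe and Shopify treat a non-2xx response as a failed delivery and retry automatically (Stripe with exponential backoff), so a payload that is permanently invalid will keep arriving until the retries run out.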
Option 2: Accept but log to error queue
Return 200 OK but route the payload to an error-handling workflow:
- Store payload in a "Review" table
- Notify admin via Slack/email
- Don't create bad data in CRM
Option 2 is better when you can't risk upstream retries creating worse problems.
Option 3: Partial acceptance with flags
Accept the data, create the contact, but tag it "needs-review." Human reviews and cleans up. Not ideal but sometimes necessary for critical flows that can't miss any data.
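One way to tie the three options back to the validation layers is a small decision function. The outcome shape ({ layer, errors }) and the action names are hypothetical conventions for this sketch; the mapping follows the fail behaviors described for each layer earlier in the article.

```javascript
// Maps a validation outcome to a response strategy.
// outcome: { layer: 'security' | 'structure' | 'format' | 'business', errors: string[] }
function decideResponse(outcome) {
  if (outcome.errors.length === 0) {
    return { status: 200, action: 'create' };
  }
  switch (outcome.layer) {
    case 'security':
      return { status: 401, action: 'reject' }; // Option 1
    case 'structure':
      return { status: 400, action: 'reject' }; // Option 1
    case 'format':
      return { status: 200, action: 'error-queue' }; // Option 2
    case 'business':
      return { status: 200, action: 'create-with-flag', tag: 'needs-review' }; // Option 3
    default:
      // Unknown failure type: keep the payload out of the CRM but do not trigger retries.
      return { status: 200, action: 'error-queue' };
  }
}
```

Centralizing the decision makes it easier to change the policy for one layer without touching the validators themselves.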
Real webhook examples
Form submission webhook
Expected payload:
{
  "email": "jane@example.com",
  "phone": "+15551234567",
  "name": "Jane Doe",
  "business_type": "solar",
  "budget": "10k-25k"
}
Validation:
- email OR phone required
- email format valid
- business_type in canonical list
- budget from dropdown values
VAPI call webhook
Expected payload:
{
  "call_id": "uuid",
  "status": "completed",
  "duration": 180,
  "transcript": "...",
  "outcome": "qualified",
  "contact_phone": "+15551234567"
}
Validation:
- call_id present (for idempotency)
- status from expected enum
- duration is positive integer
- contact_phone is valid E.164
- Dedup by call_id to prevent double-processing
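The checks for this payload could be sketched as below. The status values in CALL_STATUSES are placeholders, not the provider's actual enum, and the Set of seen call IDs stands in for persistent storage; validateCallPayload is a hypothetical name.

```javascript
// Placeholder status values; use the enum your provider actually sends.
const CALL_STATUSES = new Set(['completed', 'failed', 'no-answer']);

// seenCallIds: Set of call IDs that have already been processed.
function validateCallPayload(payload, seenCallIds) {
  const errors = [];
  if (typeof payload.call_id !== 'string' || payload.call_id === '') {
    errors.push('call_id is required');
  } else if (seenCallIds.has(payload.call_id)) {
    // Already processed: report as a duplicate instead of re-validating.
    return { valid: true, duplicate: true, errors };
  }
  if (!CALL_STATUSES.has(payload.status)) {
    errors.push('Unexpected status');
  }
  if (!Number.isInteger(payload.duration) || payload.duration <= 0) {
    errors.push('duration must be a positive integer');
  }
  if (!/^\+[1-9]\d{1,14}$/.test(String(payload.contact_phone))) {
    errors.push('contact_phone must be E.164');
  }
  // Remember the call only when the payload is fully valid.
  if (errors.length === 0) seenCallIds.add(payload.call_id);
  return { valid: errors.length === 0, duplicate: false, errors };
}
```

Calling it twice with the example payload above would return duplicate: false the first time and duplicate: true the second.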
Stripe payment webhook
Expected payload: Standard Stripe event object.
Validation:
- Signature verification (critical - prevents webhook spoofing)
- Event type in expected list
- Idempotency by event ID
What NOT to do
1. Trust all incoming data. "It came from a webhook, so it must be fine." Webhooks can be malformed, malicious, or accidentally duplicated.
2. Build happy-path only. Your workflow works for valid data. What about invalid? What about missing fields? What about duplicates? Design for the failure cases.
3. Skip logging. When something goes wrong, you need a trail. Log every webhook receipt, even invalid ones, with enough context to debug.
4. Validate too strictly. If validation rejects 30% of real submissions because of a too-tight regex, you're losing leads. Validate what matters, accept what's valid.
5. Rely only on application-layer validation. If possible, validate at the database layer too. Unique constraints prevent dupes even if application logic has bugs.
Sources
Patterns in this article are industry-standard data validation practices, adapted from published standards (RFC 5321 for email addresses, ITU-T E.164 for phone numbers) and service documentation (Stripe webhooks, Shopify webhooks, GitHub webhooks - all of which document signature verification patterns). Implementation examples were tested across Make.com and n8n deployments.
Need help designing webhook validation for a specific integration? Let's talk - I can audit your current webhook endpoints and harden them.
Sources and verification
This article was reviewed in May 2026. Vendor pricing, platform features, ad policies, and telemarketing rules change often, so operational or budget decisions should be checked against the current source pages below before implementation.
- Supabase securing your API
- Supabase Row Level Security
- Supabase Data API hardening
- Vercel pricing and usage
Private client metrics, lead counts, appointment counts, cost reductions, and revenue examples are intentionally removed, softened, or framed as modeled examples unless they can be verified publicly without exposing client data.