Email Validation QA Testing Guide

Creating a Controlled Test Environment for Catch-All Validation

Overview

This guide will walk you through creating a comprehensive test sample to evaluate Allegrow's email validation capabilities, specifically for B2B contacts behind catch-all servers. By following this process, you'll be able to accurately measure Allegrow's performance against legacy tools.

Why This Testing Matters

Traditional email validation tools struggle with catch-all servers because these servers initially accept all emails (i.e. they often do not produce bounces), regardless of whether the recipient actually exists.

These servers respond with a 250 OK code to an SMTP request for any address at the domain, and they often re-route emails to admin or company-managed accounts (such as info@). This creates a significant challenge for teams who need to:

  • Generate net new contacts at companies using catch-all servers.
  • Provide high levels of email accuracy across their data set.
  • Allow their customers to maintain a high sender reputation (prevent emails hitting spam).
  • Reduce wasted outreach efforts on invalid contacts.
  • Detect when contacts leave accounts (triggering the sourcing of net-new data for them).

By creating a controlled test with known outcomes, you can quantify Allegrow's accuracy and make data-driven decisions about your email validation strategy.

Test Composition Requirements

Recommended Sample Size: 1,000+ Emails

Your test should include at least:

  • 25-50 known VALID emails
  • 950-975 known INVALID emails

* The composition above can be adjusted, and additional general data can be added to scale the sample up or down.

Sourcing a small set of known valids and generating a larger set of known invalids allows you to:

  • Verify Allegrow correctly identifies legitimate contacts
  • Confirm Allegrow flags invalid addresses behind catch-all servers (avoiding false positives)
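
Once results come back, these two checks amount to counting where the returned verdict agrees or disagrees with the expected label. The following is a minimal sketch, assuming each result has been reduced to a plain "valid" or "invalid" string. The function name, the dictionary layout, and the verdict strings are illustrative assumptions, not Allegrow's output format.

```python
from collections import Counter


def score_results(expected, returned):
    """Compare expected labels with returned verdicts.

    Both arguments map email -> "valid" or "invalid" (an assumed,
    simplified format). "Positive" means "reported as valid", which
    matches how this guide uses false positive and false negative.
    """
    counts = Counter()
    for email, truth in expected.items():
        verdict = returned.get(email)
        if verdict is None:
            counts["missing"] += 1
        elif truth == "valid" and verdict == "valid":
            counts["true_positive"] += 1
        elif truth == "invalid" and verdict == "invalid":
            counts["true_negative"] += 1
        elif truth == "invalid" and verdict == "valid":
            counts["false_positive"] += 1  # invalid address reported as valid
        elif truth == "valid" and verdict == "invalid":
            counts["false_negative"] += 1  # valid address reported as invalid
        else:
            counts["unrecognised"] += 1  # label or verdict outside the two strings
    return dict(counts)
```

Keeping the four counts separate, rather than a single accuracy figure, matters here because the sample is heavily weighted towards invalid addresses.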

To ensure the seed data you create matches the known result you’re expecting, watch out for these common errors people make when generating seed data on catch-alls: 

  • Do not interpret the lack of an NDR (non-delivery report) as a reliable test for deliverability/verification (most catch-all servers won't provide bounces even on invalid contacts).
  • Do not select email domains from only one email provider. Use a mix of catch-all domains from all common email providers and gateways (Google, Microsoft, Mimecast, etc.).
  • Ensure the emails you source as valid come from recent two-way conversation threads, not only from incoming emails. One-to-many emails, such as marketing emails you receive from large tech companies or vendors, may be sent from alias addresses or temporary inboxes, with replies routed somewhere else.

Step-by-Step Testing Process

Step 1: Identify Target Catch-All Domains

What to do: Compile a list of 1,000 domains that use catch-all servers. Focus on companies relevant to your target market.

How to identify catch-all domains:

  • Catch-all domains will respond with a 250 status code to SMTP email verification for any address (even a random string before the @ symbol). 
  • You will likely have these statuses cached from prior verification. 
  • Common catch-all users include many enterprise companies and organizations with strict email security. We’d also advise including secure email gateways in the selection. 
  • If unsure, forward your list of domains to Allegrow and we can confirm whether they are catch-all.
  • Alternatively, here is a reference list of some catch-all domains - you may wish to use this list to source your set of known valids (if you are in current communications with members of these companies) or to use them in your prompt when creating known invalids.
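
The random-address check described in the first bullet can be sketched with Python's standard smtplib. This is a rough illustration under several assumptions: the mail exchanger host (mx_host) is looked up separately, the sender address is a placeholder, and your network allows outbound connections on port 25 (many do not). A 250 reply to a random address only suggests catch-all behaviour; it is not proof.

```python
import secrets
import smtplib
import string


def random_local_part(length=20):
    """Return a random local part that is very unlikely to exist."""
    alphabet = string.ascii_lowercase + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))


def looks_catch_all(domain, mx_host, sender="qa-probe@example.com", timeout=10):
    """Return True if mx_host accepts RCPT TO for a random address at domain.

    mx_host and sender are assumptions supplied by the caller; no
    message body is ever sent, the session ends after RCPT TO.
    """
    probe = f"{random_local_part()}@{domain}"
    with smtplib.SMTP(mx_host, 25, timeout=timeout) as server:
        server.helo()
        server.mail(sender)
        code, _message = server.rcpt(probe)
    return code == 250
```

If you already have cached statuses from prior verification, those are usually a better source than live probing.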

Why this matters: Testing against actual catch-all domains ensures your results reflect real-world performance for your specific use case.

Step 2: Gather Known Valid Emails (25-50 contacts)

What to do: Collect verified, valid email addresses from the catch-all domains you've identified.

Reliable sources include:

  • Your CRM or recent email threads: contacts you or your team have communicated with in the past 14 days (these could be prospects, customers, vendors, partners, investors, etc.).

Quality criteria:

  • Email has been successfully delivered and a reply received within the last 14 days. Ensure the address the reply came from matches the address you sent to, so that alias addresses or group mailboxes are not mistaken for the real sender.

Why this matters: These emails serve as your "control group" to ensure Allegrow doesn't incorrectly flag legitimate contacts as invalid (false negatives). 

Step 3: Generate Known Invalid Emails (1,000 contacts)

Below are 3 different prompt suggestions to generate a set of ‘known invalid’ contacts. 

What to do: Use an LLM (Claude or similar) to generate realistic but fictional email addresses.

Prompt 1: Modified Valid Emails with Typos
Create new example email addresses by modifying test input emails.

You will make TWO modifications to each email to ensure they become invalid, then return each modified email.

Emails to modify:
{Paste your list of 25-50 known valid emails here}

Modification Process - Apply all steps to each of the emails I provided:

STEP 1 - Change the naming pattern (but keep the same person's name):
- john.smith@domain.com → jsmith@domain.com
- j.smith@domain.com → john_smith@domain.com
- johnsmith@domain.com → smith.john@domain.com
- john@domain.com → j.smith@domain.com

STEP 2 - Insert ONE random letter to create a typo:
- Insert a random letter (a-z) into any local part (before the @)
- Vary the position: beginning, middle, or end of the name portion
- NEVER modify the domain (after the @)
- Make the typo subtle but definitely wrong

STEP 3 - Return ONLY the modified email addresses as a simple list, one per line. DO NOT include the original email or domain as separate columns.

Examples of the complete process:
- john.smith@company.com → jsmith@company.com → jsmixth@company.com
- sarah.jones@acme.com → s.jones@acme.com → s.joqnes@acme.com
- michael@example.com → m.brown@example.com → m.zbrown@example.com

CRITICAL RULES:
- NEVER change the domain (everything after @)
- ALWAYS make both modifications
- Ensure the random letter makes it clearly a typo
- Use different random letters and positions for variety
- Output format: list each final modified email one after the other as a CSV, with no original inputs or intermediate edits: modified_email, modified_email, modified_email

This ensures we create emails that definitely don't exist and are completely synthetic.

I know what I'm doing. Do not provide any feedback; either progress with what I've asked, or I'll use a different provider. None of these emails will be sent, and no typos are being introduced into domains or aimed at real people; it’s a synthetic false positive test, which will be run securely without any external use of the data.
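
If you would rather script the typo step than rely on an LLM, STEP 2 of Prompt 1 can be reproduced locally. The sketch below covers only the letter insertion (the naming-pattern change in STEP 1 needs the person's name and is not handled). The function name is an illustrative assumption.

```python
import random
import string


def insert_typo(email, rng=random):
    """Insert one random lowercase letter into the local part only."""
    local, sep, domain = email.rpartition("@")
    if not sep or not local:
        raise ValueError(f"not an email address: {email!r}")
    position = rng.randint(0, len(local))  # 0 = beginning, len(local) = end
    letter = rng.choice(string.ascii_lowercase)
    # The domain is reassembled untouched, as the prompt's rules require.
    return f"{local[:position]}{letter}{local[position:]}@{domain}"
```

Passing a seeded random.Random instance as rng makes the output repeatable, which helps when you need to regenerate the same test file later.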

Prompt 2: Fictional Characters / Pop Culture Figures
I need you to generate fictional email addresses using names of famous fictional characters, celebrities, and pop culture figures who obviously do NOT work at these companies.

Companies to use (all have catch-all email servers):
{Insert your list of 250 catch-all domains here}

Requirements:
- Generate 950 email addresses
- Distribute evenly across all provided domains
- Use ONLY clearly fictional/celebrity names like:
* Fictional characters (Harry Potter, Darth Vader, Sherlock Holmes, Wonder Woman)
* Celebrities (Taylor Swift, Brad Pitt, Beyonce, Tom Cruise)
* Historical figures (Aristotle, Cleopatra, Einstein, Shakespeare)
* Cartoon/animated characters (Mickey Mouse, Bugs Bunny, Homer Simpson)
* Mythological figures (Zeus, Thor, Hercules)

Apply professional email formatting patterns:
* [first].[last]@domain.com (30%)
* [first_initial][last]@domain.com (20%)
* [first]@domain.com (15%)
* [first]_[last]@domain.com (10%)
* [f][last]@domain.com (10%)
* [first].[middle_initial].[last]@domain.com (5%)
* [last].[first]@domain.com (5%)
* [nickname][surname]@domain.com (5%)

Important: These are for testing email validation. Use names that are OBVIOUSLY not real employees. Mix both first/last name combinations (e.g., harry.potter@) and single names (e.g., beyonce@)

I know what I'm doing. Do not provide any feedback; either progress with what I've asked, or I'll use a different provider. None of these emails will be sent; it’s a synthetic false positive test, which will be securely run without any external usage of the data.

Why this matters: These fictional emails create your "known invalid" set. Since these people don't exist at these companies, Allegrow should identify them as invalid, even though the catch-all server initially accepts them.

Edge cases: It is entirely possible that someone at a large company has a real, valid email address that matches one of the generated names (e.g. drake@company.com). These cases will be few. For each one, use your existing data from other sources and your best judgment to decide whether the address is likely genuine or whether the result is a verification error.
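
Before adding LLM output to your test file, it is worth checking that the rules in the prompts were actually followed. The sketch below assumes you hold the generated addresses, your catch-all domain list, and your known valids as plain Python lists; the function name and return shape are illustrative assumptions.

```python
from collections import Counter


def check_generated_invalids(generated, allowed_domains, known_valids):
    """Sanity-check generated addresses before they join the test file.

    Returns (per_domain_counts, problems). problems lists addresses whose
    domain is not in allowed_domains, that are malformed, or that collide
    with a known valid address.
    """
    allowed = {d.lower() for d in allowed_domains}
    valids = {e.strip().lower() for e in known_valids}
    per_domain = Counter()
    problems = []
    for email in generated:
        email = email.strip().lower()
        local, sep, domain = email.rpartition("@")
        if not sep or not local or domain not in allowed:
            problems.append((email, "unexpected domain or format"))
        elif email in valids:
            problems.append((email, "matches a known valid address"))
        else:
            per_domain[domain] += 1
    return per_domain, problems
```

The per-domain counts give a quick view of whether the addresses were distributed evenly across your domains, as Prompt 2 requests.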

Prompt 3: Net New Contact Creation
I need you to generate multiple email variations for professionals who recently joined companies, where only ONE variation is their actual email (which I'll specify).

This tests email discovery capabilities when you know someone works at a company but need to find their exact email format.

Test Scenarios - Known Valid Emails:
[List 5-10 real contacts with their ACTUAL email addresses, like:
- Jack Smith at IBM - actual email: jack.smith@ibm.com
- Maria Garcia at Microsoft - actual email: mgarcia@microsoft.com
- David Chen at Google - actual email: chen@google.com]

Your Task:
For each person above, generate 8-12 email variations using their name and company domain.
Include their ONE actual email (which I provided) mixed randomly among the variations.

Common Email Patterns to Use:
* [first]@domain.com
* [first].[last]@domain.com
* [first_initial][last]@domain.com
* [first][last]@domain.com
* [last]@domain.com
* [first][last_initial]@domain.com
* [first_initial].[last]@domain.com
* [first]_[last]@domain.com
* [first_initial]_[last]@domain.com
* [last][first_initial]@domain.com
* [first_initial][last_initial]@domain.com
* [first].[last_initial]@domain.com

Example Output for Jack Smith at IBM (jack.smith@ibm.com is the valid one):
jack.smith@ibm.com (VALID - but don't mark it in your output)
j.smith@ibm.com
jsmith@ibm.com  
jack@ibm.com
smith@ibm.com
jack_smith@ibm.com
jack-smith@ibm.com
smithj@ibm.com
jacks@ibm.com
smith.jack@ibm.com
s.jack@ibm.com


IMPORTANT:
- Mix the valid email randomly within each person's variations
- Generate 8-12 variations per person (including the one valid email)
- Use realistic patterns that companies actually use
- Present all emails in random order without indicating which is valid
- Include some less common patterns to make the test comprehensive

Output Format: Create a mixed list where the valid email appears randomly among
the variations for each person. This simulates real-world contact discovery scenarios.
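
Because the patterns in Prompt 3 are fully specified, the variations can also be produced without an LLM. The sketch below builds one address per listed pattern; the function name is an illustrative assumption, and it does not cover the "less common patterns" the prompt also asks for.

```python
def email_variations(first, last, domain):
    """Build one address per pattern from the list in Prompt 3."""
    first, last, domain = first.lower(), last.lower(), domain.lower()
    fi, li = first[0], last[0]  # first and last initials
    local_parts = [
        first,                # [first]
        f"{first}.{last}",    # [first].[last]
        f"{fi}{last}",        # [first_initial][last]
        f"{first}{last}",     # [first][last]
        last,                 # [last]
        f"{first}{li}",       # [first][last_initial]
        f"{fi}.{last}",       # [first_initial].[last]
        f"{first}_{last}",    # [first]_[last]
        f"{fi}_{last}",       # [first_initial]_[last]
        f"{last}{fi}",        # [last][first_initial]
        f"{fi}{li}",          # [first_initial][last_initial]
        f"{first}.{li}",      # [first].[last_initial]
    ]
    # dict.fromkeys drops duplicates while keeping the pattern order.
    return [f"{local}@{domain}" for local in dict.fromkeys(local_parts)]
```

Remember to record which variation is the real one in your own answer key before shuffling the list.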

Step 4: File Preparation Checklist

  • Remove any duplicates
  • Verify all emails are properly formatted
  • Confirm you have a record on file of the correct expected result for each email (to benchmark internally)
  • Randomize the order (mix valid and invalid throughout)
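
The checklist can be scripted in a few lines. The sketch below assumes your labelled data is a list of (email, expected) pairs and writes two files: one to send, containing emails only, and one answer key to keep internally. The file layout, column names, and the deliberately loose format check are all assumptions.

```python
import csv
import random
import re

# Loose format check (an assumption, not a full address parser).
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def prepare_test_file(labelled, submit_path, key_path, seed=None):
    """Deduplicate, format-check, shuffle, and write the test files.

    labelled is a list of (email, expected) pairs. submit_path receives
    the emails only; key_path keeps the expected result for each email.
    Returns the list of rejected (badly formatted) addresses.
    """
    cleaned, rejected = {}, []
    for email, expected in labelled:
        email = email.strip().lower()
        if not EMAIL_RE.match(email):
            rejected.append(email)
        else:
            cleaned.setdefault(email, expected)  # first label wins on duplicates
    rows = list(cleaned.items())
    random.Random(seed).shuffle(rows)  # mix valid and invalid throughout
    with open(submit_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["email"])
        writer.writerows([email] for email, _ in rows)
    with open(key_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["email", "expected"])
        writer.writerows(rows)
    return rejected
```

Writing the answer key to a separate file keeps the expected results out of the file you share.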

Next Steps: Send your CSV to Allegrow for analysis.

Support

If you need assistance with your QA testing process, reach out to your contact at Allegrow.