The mass experiment of 2023 told us one thing clearly: AI-written cold email without guardrails is worse than no AI at all. Prospects learned to recognize the tone, and open-to-reply ratios dropped. But a smaller group of teams kept going, tuned the approach, and by early 2024 were seeing reply rates better than any human-only team we've measured. The difference is how they prompt, not which model they use.
The three-layer personalization pattern
The teams doing this well in 2024 structure every AI-written email in three layers:
- Fact layer (the opener) — one specific, verifiable fact about the prospect. Not a compliment, not a generic observation. A fact. Something you could cite.
- Relevance layer (the bridge) — one sentence that connects the fact to a problem the prospect plausibly has.
- Ask layer (the CTA) — short, concrete, low-friction. Always written by a human, never by AI.
AI handles layers 1 and 2. Humans handle layer 3 and the template around it. The fact layer is where most teams fail, because they feed the model LinkedIn About sections instead of real signals.
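The division of labor above can be sketched in a few lines. Everything here is illustrative: the function name and the placeholder layer strings are ours, not a standard, and in production the first two arguments would come from the model while the CTA stays fixed human copy.

```python
# Sketch of the three-layer assembly. The fact and bridge strings stand
# in for model output; the CTA is human-written and never varies per send.

def assemble_email(fact_sentence: str, bridge_sentence: str, cta: str) -> str:
    """Join the three layers into an email body: fact, bridge, then ask."""
    return "\n\n".join([fact_sentence, bridge_sentence, cta])

# Placeholder layer content, purely illustrative:
body = assemble_email(
    "Acme shipped its self-serve onboarding flow last Tuesday.",
    "Teams that launch self-serve usually hit a wall scoring trial signups.",
    "Worth a 15-minute look at how we score trials? Happy to send a loom first.",
)
```

The point of keeping assembly this dumb is that the human-written CTA can never be mangled by a generation step.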
What feeds a good opener
- Recent job changes (under 90 days)
- Funding rounds and hiring posts
- Tech stack additions or removals
- Recent podcast appearances or articles they wrote
- Specific product launches or press
What doesn't work: "I noticed you work at [company]" — that's not a fact, that's a field merge. The model will write something grammatically correct and totally empty.
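One cheap guard is to reject field merges and generic observations before they ever reach the model. A minimal sketch, assuming our own token patterns and phrase list (neither is exhaustive):

```python
import re

# Hypothetical filter: a "fact" that is really a merge field or a generic
# observation gets dropped instead of fed to the model.
MERGE_TOKEN = re.compile(r"\{\{.*?\}\}|\[[A-Za-z _]+\]")  # e.g. {{company}}, [company]
GENERIC_PHRASES = ("i noticed you work", "your company", "your role")

def is_real_fact(candidate: str) -> bool:
    """True only if the candidate is specific enough to cite."""
    if MERGE_TOKEN.search(candidate):
        return False
    lowered = candidate.lower()
    return not any(phrase in lowered for phrase in GENERIC_PHRASES)
```

Anything this filter rejects should fall back to skipping the prospect, not to a generic opener.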
The prompt pattern that actually works
You are writing one sentence. The sentence references this specific fact: [FACT]. The sentence must read as if a human who just read the fact wrote it. Do not flatter. Do not use the words "noticed", "saw", "came across". Do not start with "I". Maximum 18 words.
Constraints matter more than cleverness. The model will write worse sentences when you ask it to "be creative" than when you give it three hard rules.
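A minimal sketch of that pattern as code: the prompt template from above, plus a post-hoc check that re-enforces the same hard rules on whatever the model returns. The function names and the re-check step are our own convention, not a standard API.

```python
FORBIDDEN = ("noticed", "saw", "came across")

def build_opener_prompt(fact: str) -> str:
    # Mirrors the prompt pattern: one specific fact, three hard rules.
    return (
        "You are writing one sentence. The sentence references this "
        f"specific fact: {fact}. The sentence must read as if a human who "
        "just read the fact wrote it. Do not flatter. Do not use the words "
        '"noticed", "saw", "came across". Do not start with "I". '
        "Maximum 18 words."
    )

def passes_rules(sentence: str) -> bool:
    """Re-check the hard constraints on the model's output; models drift."""
    if len(sentence.split()) > 18 or sentence.lstrip().startswith("I "):
        return False
    lowered = sentence.lower()
    return not any(word in lowered for word in FORBIDDEN)
```

Regenerating on a failed `passes_rules` check is usually cheaper than loosening the constraints.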
The voice problem
GPT-4 has a default tone. If you use it without a voice instruction, every email you send ends up sounding identical across 1,000 prospects — and crucially, identical to every other company using GPT-4. The fix is a one-paragraph voice document that includes your typical sentence length, your forbidden words, and two or three example sentences you actually wrote yourself.
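One way to wire that voice document in is to prepend it to every generation request. The structure below is a sketch; the example sentences and forbidden-word list are placeholders you would replace with copy your team actually wrote.

```python
# Hypothetical one-paragraph voice document, sent with every request so
# output stops sounding like default GPT-4. Content here is placeholder.
VOICE_DOC = """Write like this team writes.
Typical sentence length: 8-14 words.
Forbidden words: leverage, streamline, revolutionize, excited.
Example sentences the team actually wrote:
- "Pipeline reviews eat Mondays. We got ours down to twenty minutes."
- "Most trial signups never hit the aha moment. Ours do."
"""

def with_voice(task_prompt: str) -> str:
    """Prefix the voice document so every generation carries the same tone."""
    return VOICE_DOC + "\n" + task_prompt
```

The same `VOICE_DOC` should front every prompt in the pipeline, not just the opener, or the layers drift apart in tone.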
What this lets a team actually do
A 2-person SDR team in 2024 can send 400 genuinely personalized emails per week without templating. That number used to require 6 people. The time saved goes into targeting and replies — which is where humans still outperform any model.
"We're not writing faster. We're targeting tighter. AI gave us back the hours we were burning on copy-paste."
Where the limit is
GPT-4 cannot:
- Decide if a prospect is worth contacting
- Write the CTA
- Handle replies beyond routing
- Judge tone for specific cultural contexts
Those are still human jobs. AI is the fastest way to do the mechanical parts — not a replacement for the parts that require judgement.