Case Study: Natural Language Property Search Without LLM Overhead

Real estate agents have a problem: buyers know what they want, but search interfaces don’t speak their language. A buyer says “three bedroom house with a pool near good schools.” The typical real estate portal offers 47 checkboxes, 12 dropdown menus, and a price slider.

When a real estate technology company asked us to solve this, they expected we’d recommend an LLM. Everyone recommends LLMs for natural language these days. We built something different, something that works better for this specific problem.

The Challenge

The company had tried the obvious approach. They connected GPT-4 to their property database and let it interpret queries. The results were impressive in demos and problematic in production:

Latency: 2-3 seconds per query felt slow compared to instant filter updates
Costs: At $0.01-0.03 per query, costs scaled linearly with usage
Hallucinations: The LLM occasionally invented property features that didn’t exist
Inconsistency: The same query produced different interpretations on different days
Voice integration: Response times made voice search frustrating

The LLM was overkill. Property search isn’t open-ended conversation. It’s a constrained domain with finite attributes: bedrooms, bathrooms, price, location, amenities, property type. The challenge isn’t understanding infinite possibilities; it’s mapping natural language to a structured schema reliably.

What We Built

We built a deterministic natural language parser specifically for real estate queries. No LLM required. The system parses queries into structured filters using pattern matching, synonym expansion, and domain-specific rules.

How It Works

When a user types “3 bed 2 bath under 500k with a pool in Austin”:

Tokenization breaks the query into components
Entity recognition identifies numbers, locations, and keywords
Synonym expansion maps “bed” to “bedrooms,” “bath” to “bathrooms”
Context rules interpret “under 500k” as “max_price: 500000”
Location parsing resolves “Austin” to geographic boundaries
Filter construction builds the database query

Total processing time: 3-15 milliseconds.

Handling Natural Language Variation

Real queries aren’t clean. People write “3br/2ba,” “three bedroom,” “3 beds,” and “3 bedrooms.” They say “pool” or “swimming pool” or “private pool.” They describe locations as “downtown,” “near downtown,” or “close to the city center.”

The system handles this through:

Synonym dictionaries: Maintained lists of equivalent terms for every searchable attribute. “Pool,” “swimming pool,” and “private pool” all map to the same filter.

Numeric parsing: Handles written numbers (“three”), digits (“3”), and common abbreviations (“3br”).

Location hierarchies: Understands that “Austin” contains neighborhoods like “East Austin,” which contains streets, which contains addresses. Queries at any level work correctly.

Implicit attributes: When someone says “large backyard,” the system maps this to lot size filters even though “backyard” isn’t a database field.

Voice Search Integration

The same parser powers both text and voice search. Because responses are instant, voice search feels natural:

User: “Show me three bedroom houses under 400k with a garage” System: displays results in 200ms

Compare this to an LLM-powered system where 2-3 seconds of processing creates awkward pauses in voice interaction.

Handling Ambiguity

Some queries are genuinely ambiguous. “Nice neighborhood” means different things to different people. “Good schools” requires external data about school ratings.

The system handles ambiguity through:

Clarification prompts: When a query can’t be fully resolved, the interface shows which parts were interpreted and asks about the rest. “I found houses matching your price and size. What does ’nice neighborhood’ mean to you?”

Fallback to filters: Unresolved terms become search suggestions rather than silent failures. The user sees “We couldn’t interpret ’nice neighborhood.’ You can filter by crime rate, walkability score, or median income.”

Learning from corrections: When users adjust filters after a query, the system logs the pattern for future synonym expansion.

The Results

After deployment across web and voice interfaces:

Response time dropped from 2-3 seconds to under 100ms (including network latency)
Per-query costs fell to effectively zero (compute is negligible)
Zero hallucinations because the system only references actual database fields
100% consistency because identical queries always produce identical results
Voice search adoption increased 340% once latency was no longer a barrier

The agents using the system reported that clients could find properties faster because they weren’t translating their desires into checkbox language. One agent described it as “finally letting people search the way they think.”

Why Not an LLM?

LLMs are remarkable for open-ended tasks where you can’t enumerate all possibilities. They’re poorly suited for constrained domains where:

The output is structured (database filters, not free text)
The domain is finite (property attributes don’t change daily)
Consistency matters (same query should always return same results)
Latency matters (users expect instant feedback)
Hallucinations are unacceptable (invented features waste everyone’s time)

Real estate search checks all five boxes. The right tool for this job was deterministic parsing, not probabilistic generation.

This doesn’t mean LLMs have no role in real estate. They’re excellent for:

Generating property descriptions from attributes
Answering open-ended questions about neighborhoods
Summarizing inspection reports
Drafting correspondence

But for the core search function, a purpose-built parser outperforms a general-purpose language model.

Technical Approach

For teams considering similar systems:

Start with synonym collection: The quality of natural language parsing depends entirely on the synonym dictionaries. We spent significant time analyzing actual user queries to build comprehensive mappings.

Design for extensibility: New property attributes and amenities appear regularly. The system architecture makes adding new parseable terms trivial.

Log everything: Every query, every interpretation, every user correction. This data drives continuous improvement of the parser.

Build confidence scores: Not all interpretations are equal. A query like “3 bed 2 bath” parses with high confidence. “Cozy cottage near water” parses with lower confidence and should prompt clarification.

Test with real queries: Academic NLP benchmarks don’t predict real-world performance. Testing with thousands of actual user queries revealed edge cases we’d never have anticipated.

Is This Right for Your Domain?

Deterministic NLP parsing works well when:

Your domain has a finite, enumerable set of attributes
Users are searching/filtering rather than having open-ended conversations
Consistency and speed matter more than handling novel situations
You need to integrate with voice interfaces
Per-query costs at scale are a concern

If your natural language needs are more open-ended, genuine conversation rather than structured search, an LLM is probably the right tool. But don’t assume LLMs are the answer just because the problem involves language.

Building a search experience for a constrained domain? Let’s discuss whether deterministic NLP or LLMs are right for your use case.

The Challenge#

What We Built#

How It Works#

Handling Natural Language Variation#

Voice Search Integration#

Handling Ambiguity#

The Results#

Why Not an LLM?#

Technical Approach#

Is This Right for Your Domain?#