Daniel Naber wrote:
Searching for Photokopie~ on a 230,000 document corpus takes 2.3 seconds here (AMD Athlon 2600+; other fuzzy terms get similar performance). As the number of terms doesn't increase so fast with more documents, it will not take 10 seconds for 1 million documents. So fuzzy search isn't *that* slow.

How long do non-fuzzy queries take? What is the ratio? How about a query with multiple fuzzy terms?


If someone launches a service but fails to test it with fuzzy queries, will they be subject to inadvertent denial of service when users start issuing fuzzy queries? Web-based search is particularly vulnerable: if a query takes a few seconds and the user hits the browser's STOP and RELOAD buttons, the first query keeps running on the server while the second one starts.
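
One way to limit the damage from abandoned queries is to run every search on a bounded worker pool with a time limit. The following is only a minimal sketch under stated assumptions, not part of any search library: the class name BoundedSearch and its methods are hypothetical, and the Callable stands in for whatever code actually executes the query.

```java
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper: caps concurrent searches and gives up on slow ones.
final class BoundedSearch {
    private final ExecutorService pool;
    private final long timeoutMillis;

    BoundedSearch(int maxConcurrentSearches, long timeoutMillis) {
        // At most maxConcurrentSearches queries run at once; the rest wait in a queue.
        this.pool = Executors.newFixedThreadPool(maxConcurrentSearches);
        this.timeoutMillis = timeoutMillis;
    }

    // Runs the search and returns its result, or empty if the time limit is exceeded.
    <T> Optional<T> run(Callable<T> search)
            throws InterruptedException, ExecutionException {
        Future<T> future = pool.submit(search);
        try {
            return Optional.ofNullable(future.get(timeoutMillis, TimeUnit.MILLISECONDS));
        } catch (TimeoutException e) {
            // Requests interruption of the worker thread. The search only stops
            // early if its code checks for interruption; otherwise it runs to
            // completion, but the caller is no longer blocked on it.
            future.cancel(true);
            return Optional.empty();
        }
    }

    void shutdown() {
        pool.shutdownNow();
    }
}
```

A caller would wrap each incoming query, for example `searcher.run(() -> executeQuery(q))` where executeQuery is again a placeholder, and return an error page when the result is empty. The fixed pool size bounds how many slow queries can run simultaneously, but the queue in front of it is unbounded in this sketch, so a production setup would probably also want to reject requests once the queue grows too long.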

This is not an imaginary problem. I have worked with several clients who have run into this in deployed applications.

Doug
