Hi,

I have a question about the way I worked around the 'TooManyClauses' exception when using wildcard queries (http://wiki.apache.org/lucene-java/LuceneFAQ#head-06fafb5d19e786a50fb3dfb8821a6af9f37aa831).


I am using Lucene in conjunction with Hibernate Search (http://www.hibernate.org/410.html). I am indexing 'Company' objects which contain multiple attributes, and the application supports different types of searches.

One type of search is a right-hand truncated (wildcard) search on the company name. If, e.g., the user searches for 'M', I initially constructed an 'M*' query. I have about 250,000 companies in the index. Without any modifications I get the 'TooManyClauses' exception, so at first I kept increasing 'maxClauseCount'. That works, but performance was terrible. I haven't tried working with a filter; instead I decided to try a different approach: I index every leading substring (prefix) of a string, e.g. 'Foo' is indexed as 'F', 'Fo' and 'Foo'. A search for 'M' then becomes a plain term query for 'M' rather than a wildcard query that expands into one clause per matching term.
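
To make the indexing idea concrete, here is a minimal sketch of the prefix expansion in plain Java. The class and method names (PrefixTerms, prefixes) are made up for illustration; this is not the actual Hibernate Search or Lucene analyzer code, only the term generation step. In a real setup these terms would presumably be emitted by a custom analyzer or token filter.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
public class PrefixTerms {

    // Returns every leading substring of value, shortest first.
    // An empty input yields an empty list.
    public static List<String> prefixes(String value) {
        List<String> terms = new ArrayList<>();
        for (int end = 1; end <= value.length(); end++) {
            terms.add(value.substring(0, end));
        }
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(prefixes("Foo")); // prints [F, Fo, Foo]
    }
}
```

A name of length n produces n terms, which is where the increase in index size comes from.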

This got rid of the 'TooManyClauses' exception and performance improved dramatically, but I would like some feedback from other users on whether this is a good approach or not.

Of course the index size increased, but that was no issue in this case. Are there any potential problems with ranking/scoring?

Thanks for any feedback.

--Hardy


--
Hartmut Ferentschik
Ekholmsv.339 ,1, 127 45 Skärholmen, Sweden
Phone: +46 855 923 676 (h); +46 704 225 097 (m)
