Hi,
I have a question about the way I worked around the 'TooManyClauses'
exception when using wildcard queries
(http://wiki.apache.org/lucene-java/LuceneFAQ#head-06fafb5d19e786a50fb3dfb8821a6af9f37aa831).
I am using Lucene in conjunction with Hibernate Search
(http://www.hibernate.org/410.html). I am indexing 'Company' objects
that contain multiple attributes, and the application supports different
types of searches.
One type of search is a right-truncated (prefix wildcard) search on the
company name. For example, if the user searches for 'M', I initially
constructed an 'M*' query. I have about 250,000 companies in the index.
Without any modifications I get the 'TooManyClauses' exception, and at
first I kept increasing 'maxClauseCount'. That works, but performance was
terrible. I haven't tried working with a filter; instead I decided to try
a different approach: I index every prefix (leading substring) of a
string, e.g. 'Foo' is indexed as 'F', 'Fo' and 'Foo'.
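
To make the idea concrete, here is a minimal sketch of the prefix
expansion in plain Java. The class and method names (PrefixTerms,
prefixes) are made up for illustration and are not part of Lucene or
Hibernate Search; in the real setup each returned string would be added
as a term to the company-name field, presumably via a custom analyzer or
field bridge.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not a Lucene or Hibernate Search API.
class PrefixTerms {

    // Returns every leading substring of the input, shortest first.
    // An empty input yields an empty list.
    static List<String> prefixes(String name) {
        List<String> result = new ArrayList<>();
        for (int end = 1; end <= name.length(); end++) {
            result.add(name.substring(0, end));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(prefixes("Foo")); // prints [F, Fo, Foo]
    }
}
```

With terms like these in the index, a search for 'M' becomes an exact
term lookup on 'M' rather than an 'M*' query that has to be expanded into
one clause per matching term.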
I got rid of the 'TooManyClauses' exception and performance improved by
orders of magnitude, but I would like some feedback from other users on
whether this is a good approach or not.
Of course the index size increased, but that was no issue in this case.
Are there any potential problems with ranking/scoring?
Thanks for any feedback.
--Hardy
--
Hartmut Ferentschik
Ekholmsv.339 ,1, 127 45 Skärholmen, Sweden
Phone: +46 855 923 676 (h); +46 704 225 097 (m)