Re: Terms.regex performance issue
Hi, I have the same problem: I am looking for infix autocomplete. Could you elaborate a bit on your QueryConverter and Suggester solution? Thank you! -- View this message in context: http://lucene.472066.n3.nabble.com/Terms-regex-performance-issue-tp3268994p3338273.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Terms.regex performance issue
Read http://lucene.472066.n3.nabble.com/suggester-issues-td3262718.html for more info about the QueryConverter. IMO the Suggester should make it easier to choose between QueryConverters. As for infix, the wiki says it's a planned feature, but the Suggester hasn't been worked on for a couple of months. So I guess we will have to wait :)
Re: Terms.regex performance issue
We do something like:

http://localhost:8983/solr/provs/terms?terms.fl=payor&terms.regex.flag=case_insensitive&terms.regex=%28.*%29WHAT USER TYPES%28.*%29&terms.limit=-1

We want matches anywhere in the terms, not just at the prefix.

On 8/19/11 5:21 PM, Chris Hostetter hossman_luc...@fucit.org wrote:
> : Subject: Terms.regex performance issue
> :
> : As I want to use it in an Autocomplete it has to be fast. Terms.prefix gets
> : results in around 100 milliseconds, while terms.regex is 10 to 20 times
> : slower.
>
> can you elaborate on how you are using terms.regex? what does your regex look like? .. particularly if your usecase is autocomplete terms.prefix seems like an odd choice.
>
> Possible XY Problem?
> https://people.apache.org/~hossman/#xyproblem
>
> Have you looked at using the Suggester plugin?
> https://wiki.apache.org/solr/Suggester
>
> -Hoss
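For anyone assembling a terms request like the one in this message from application code, here is a minimal sketch in Python. The parameter names (terms.fl, terms.regex, terms.regex.flag, terms.limit) come from the URL in the message; the function name build_terms_url and the sample values are hypothetical. It assumes the pattern produced by re.escape is also acceptable to the server-side regex engine, and it drops the capturing groups from the original pattern since only a match is needed.

```python
import re
from urllib.parse import urlencode


def build_terms_url(base_url, field, user_input):
    """Build a terms request URL that matches user_input anywhere in a term.

    Hypothetical helper: the parameter names are copied from the URL quoted
    in the thread, everything else is an illustration.
    """
    # Escape regex metacharacters so typed characters such as '.' or '('
    # are matched literally instead of being interpreted as regex syntax.
    # Assumption: the escaped form is valid for the server's regex engine.
    pattern = ".*" + re.escape(user_input) + ".*"
    params = [
        ("terms.fl", field),
        ("terms.regex", pattern),
        ("terms.regex.flag", "case_insensitive"),
        ("terms.limit", "-1"),
    ]
    # urlencode percent-encodes each value and joins the pairs with '&'.
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_terms_url("http://localhost:8983/solr/provs/terms", "payor", "blue"))
```

Letting urlencode do the joining avoids hand-built query strings with missing separators or unescaped spaces.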
Re: Terms.regex performance issue
Wait. Sometimes I get confused because Gmail renders * as bolding, so in my client it looks like you're searching infix (i.e. with leading and trailing wildcards). If that's the case, then your performance will always be poor, because it has to enumerate all the terms in the field... If it's just the bolding confusing me, then never mind.

Best
Erick

On Fri, Aug 19, 2011 at 8:27 PM, O. Klein kl...@octoweb.nl wrote:
> Terms.prefix was just to compare performance.
>
> The use case was terms.regex=.*query.* And as Markus pointed out, this will probably remain a bottleneck.
>
> I looked at the Suggester. But like many others I have been struggling to make it useful. It needs a custom queryConverter to give proper suggestions, but I haven't tried this yet.
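To illustrate why infix matching has to enumerate every term while prefix matching does not, here is a toy sketch in Python. It models the term dictionary as a plain sorted list, which is an assumption for illustration only and not how Lucene actually stores terms; the function names and sample terms are hypothetical.

```python
import bisect
import re


def prefix_matches(sorted_terms, prefix):
    """Find terms starting with prefix by seeking into the sorted list."""
    # Binary search jumps straight to the first term >= prefix, so only
    # the matching run of terms is visited.
    i = bisect.bisect_left(sorted_terms, prefix)
    matches = []
    while i < len(sorted_terms) and sorted_terms[i].startswith(prefix):
        matches.append(sorted_terms[i])
        i += 1
    return matches


def infix_matches(sorted_terms, fragment):
    """Find terms containing fragment anywhere: every term must be tested."""
    pattern = re.compile(".*" + re.escape(fragment) + ".*", re.IGNORECASE)
    # No ordering helps here: a match can start at any position in a term,
    # so the whole list is scanned and the regex runs once per term.
    return [term for term in sorted_terms if pattern.fullmatch(term)]


if __name__ == "__main__":
    terms = sorted(["aetna", "blue cross", "blue shield", "cigna", "true blue"])
    print(prefix_matches(terms, "blue"))
    print(infix_matches(terms, "blue"))
```

The prefix lookup cost depends on the number of matches, while the infix lookup cost grows with the total number of terms in the field.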
Re: Terms.regex performance issue
Yeah, I was searching infix. It worked very nicely for autocomplete. I made a custom QueryConverter for the Suggester so it gives proper suggestions for shingles. Will stick with that for now. Thanks for the feedback.
Re: Terms.regex performance issue
Ah, in that case, comparing prefix and regex is an apples-to-oranges comparison. I expect regex to be slower, but a fairer comparison would be prefix to stuff* (which may be changed into a prefix enumeration for all I know). Comparing infix to prefix doesn't really tell you much.

Best
Erick

P.S. There's no reason to do anything if you already have a solution that works, though.
Re: Terms.regex performance issue
Of course. That's why I compared prefix to bla* and saw it was already a lot slower.
Re: Terms.regex performance issue
I see now in the Suggester wiki: "Support for infix-suggestions is planned for FSTLookup (which would be the only structure to support these)."
Terms.regex performance issue
As I want to use it in an autocomplete, it has to be fast. Terms.prefix gets results in around 100 milliseconds, while terms.regex is 10 to 20 times slower. Not storing the field made it a bit faster, but not enough. The index is on a separate core and is only about 5 MB. Are there some tricks to make it work a lot faster? Or do I have to switch to ngrams or something?
Re: Terms.regex performance issue
TermsComponent uses java.util.regex, which is not particularly fast. If the number of terms grows, your CPU is going to overheat. I'd prefer an analyzer approach.
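As a rough illustration of the ngram idea raised in this thread, here is a minimal sketch in Python of an n-gram index that answers infix lookups without running a regex over every term. It is a sketch under stated assumptions, not the approach anyone in the thread implemented: in Solr this work would normally be done at index time by an n-gram analyzer (for example NGramFilterFactory) rather than in application code, and the function names, the gram size of 3, and the sample terms are all hypothetical.

```python
from collections import defaultdict


def build_ngram_index(terms, n=3):
    """Map every lowercase n-gram to the set of terms that contain it."""
    index = defaultdict(set)
    for term in terms:
        lowered = term.lower()
        for i in range(len(lowered) - n + 1):
            index[lowered[i:i + n]].add(term)
    return index


def infix_lookup(index, fragment, n=3):
    """Return terms containing fragment, using the n-gram index to narrow down."""
    fragment = fragment.lower()
    if len(fragment) < n:
        # Fragments shorter than one gram cannot be looked up in this sketch.
        return []
    grams = [fragment[i:i + n] for i in range(len(fragment) - n + 1)]
    # A term containing the fragment must contain every gram of the fragment.
    candidates = set.intersection(*(index.get(g, set()) for g in grams))
    # Grams can occur in a different order or position, so verify each candidate.
    return sorted(t for t in candidates if fragment in t.lower())


if __name__ == "__main__":
    idx = build_ngram_index(["Blue Cross", "True Blue", "Aetna"])
    print(infix_lookup(idx, "blue"))
```

The trade-off is a larger index in exchange for lookups that touch only the terms sharing grams with the typed fragment.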
Re: Terms.regex performance issue
: Subject: Terms.regex performance issue
:
: As I want to use it in an Autocomplete it has to be fast. Terms.prefix gets
: results in around 100 milliseconds, while terms.regex is 10 to 20 times
: slower.

can you elaborate on how you are using terms.regex? what does your regex look like? .. particularly if your usecase is autocomplete terms.prefix seems like an odd choice.

Possible XY Problem?
https://people.apache.org/~hossman/#xyproblem

Have you looked at using the Suggester plugin?
https://wiki.apache.org/solr/Suggester

-Hoss
Re: Terms.regex performance issue
Terms.prefix was just to compare performance. The use case was terms.regex=.*query.* And as Markus pointed out, this will probably remain a bottleneck. I looked at the Suggester. But like many others I have been struggling to make it useful. It needs a custom queryConverter to give proper suggestions, but I haven't tried this yet.