Re: Terms.regex performance issue

2011-09-15 Thread tbarbugli
Hi,
I have the same problem: I am looking for infix autocomplete. Could you
elaborate a bit on your QueryConverter/Suggester solution?
Thank you!

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Terms-regex-performance-issue-tp3268994p3338273.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Terms.regex performance issue

2011-09-15 Thread O. Klein
Read http://lucene.472066.n3.nabble.com/suggester-issues-td3262718.html for more
info about the QueryConverter. IMO the Suggester should make it easier to choose
between QueryConverters.

As for infix, the wiki says it's a planned feature, but the Suggester hasn't
been worked on for a couple of months. So I guess we will have to wait :)



Re: Terms.regex performance issue

2011-08-22 Thread Bill Bell
We do something like:

http://localhost:8983/solr/provs/terms?terms.fl=payor&terms.regex.flag=case_insensitive&terms.regex=%28.*%29WHAT USER TYPES%28.*%29&terms.limit=-1


We want matches anywhere in the term, not just at the prefix.
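
A minimal sketch of how such a request URL could be assembled safely, assuming Python on the client side. The host, core name (provs), and field (payor) are taken from the URL above; the function names are hypothetical. It quotes the user's input as a literal with \Q...\E so that regex metacharacters typed by the user are not interpreted by java.util.regex, and it drops the capturing parentheses around .* since the groups are not needed for matching.

```python
from urllib.parse import urlencode


def java_regex_quote(text: str) -> str:
    """Quote text as a java.util.regex literal using \\Q...\\E."""
    # An embedded \E would end the quoted section early, so close the quote,
    # emit an escaped backslash plus E, and reopen the quote.
    return "\\Q" + text.replace("\\E", "\\E\\\\E\\Q") + "\\E"


def build_terms_url(user_input: str,
                    base: str = "http://localhost:8983/solr/provs/terms",
                    field: str = "payor") -> str:
    """Build an infix TermsComponent request for whatever the user typed."""
    params = {
        "terms.fl": field,
        "terms.regex.flag": "case_insensitive",
        # Infix match: anything, then the literal input, then anything.
        "terms.regex": ".*" + java_regex_quote(user_input) + ".*",
        "terms.limit": "-1",
    }
    # urlencode percent-escapes the values and joins the pairs with '&'.
    return base + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_terms_url("blue cross"))
```

This only builds the URL; the cost of the infix regex on the server side is unchanged.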







Re: Terms.regex performance issue

2011-08-21 Thread Erick Erickson
Wait. Sometimes I get confused because Gmail will substitute
* for bolding, so in my client it looks like you're searching infix (i.e.
with leading and trailing wildcards). If that's the case, then your performance
will always be poor, because it has to enumerate all the terms in the field...

If it's just the bolding confusing me, then never mind.

Best
Erick
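
To illustrate the enumeration point, here is a minimal Python sketch. It is not Solr or Lucene code: it assumes the terms are simply held in a sorted list, which stands in for a sorted term dictionary, and the function names are hypothetical. A prefix lookup can jump to the first candidate and stop at the first non-match, while an infix lookup has nothing to jump to and must inspect every term.

```python
from bisect import bisect_left


def prefix_matches(sorted_terms, prefix):
    """Binary-search to the first candidate, then walk while the prefix holds."""
    matches = []
    i = bisect_left(sorted_terms, prefix)
    while i < len(sorted_terms) and sorted_terms[i].startswith(prefix):
        matches.append(sorted_terms[i])
        i += 1
    return matches


def infix_matches(sorted_terms, fragment):
    """Sort order does not help here: every term has to be inspected."""
    return [term for term in sorted_terms if fragment in term]


if __name__ == "__main__":
    terms = sorted(["aetna", "blue cross", "blue shield", "cigna", "red cross"])
    print(prefix_matches(terms, "blue"))
    print(infix_matches(terms, "cross"))
```

The prefix lookup touches only the matching terms plus a logarithmic number of probes; the infix lookup always touches all of them, so its cost grows with the size of the field's term list.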




Re: Terms.regex performance issue

2011-08-21 Thread O. Klein
Yeah, I was searching infix. It worked very nicely for autocomplete.

I made a custom QueryConverter for the Suggester so it gives proper
suggestions for shingles. I will stick with that for now.

Thanks for the feedback.



Re: Terms.regex performance issue

2011-08-21 Thread Erick Erickson
Ah, in that case, comparing prefix and regex is an apples-to-oranges
comparison. I expect regex to be slower, but a fairer comparison
would be prefix to stuff* (which may be changed into a prefix
enumeration for all I know). Comparing infix to prefix doesn't really tell
you much.

Best
Erick

P.S. There's no reason to do anything if you have a solution that works
already though.




Re: Terms.regex performance issue

2011-08-21 Thread O. Klein
Of course. That's why I compared prefix to bla* and saw that the regex was
already a lot slower.



Re: Terms.regex performance issue

2011-08-21 Thread O. Klein
I see now in the Suggester wiki: "Support for infix-suggestions is planned for
FSTLookup (which would be the only structure to support these)."




Terms.regex performance issue

2011-08-19 Thread O. Klein
As I want to use it in an autocomplete, it has to be fast. Terms.prefix gets
results in around 100 milliseconds, while terms.regex is 10 to 20 times
slower.

Not storing the field made it a bit faster, but not enough. The index is on a
separate core and only about 5 MB in size. Are there some tricks to make it work
a lot faster? Or do I have to switch to ngrams or something?







Re: Terms.regex performance issue

2011-08-19 Thread Markus Jelsma
TermsComponent uses java.util.regex, which is not particularly fast. As the
number of terms grows, your CPU is going to overheat. I'd prefer an analyzer
approach.
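
A minimal sketch of the idea behind an n-gram approach, written in Python rather than as Solr configuration. In Solr this work would typically be done at index time by an n-gram filter in the field's analyzer chain; everything below (function names, the gram size of 3, the sample terms) is an assumption for illustration only. The point is that the substring work moves to index time, so a lookup becomes a few dictionary hits instead of a regex run over every term.

```python
from collections import defaultdict


def ngrams(text, n):
    """Return the set of all contiguous substrings of length n."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def build_index(terms, n=3):
    """Map each lowercase n-gram to the set of terms containing it."""
    index = defaultdict(set)
    for term in terms:
        for gram in ngrams(term.lower(), n):
            index[gram].add(term)
    return index


def infix_lookup(index, fragment, n=3):
    """Find terms containing fragment by intersecting n-gram postings."""
    fragment = fragment.lower()
    grams = ngrams(fragment, n)
    if not grams:
        # Fragment is shorter than n: this sketch cannot answer it.
        return set()
    candidates = set.intersection(*(index.get(g, set()) for g in grams))
    # Sharing every n-gram does not guarantee a contiguous match, so verify.
    return {term for term in candidates if fragment in term.lower()}


if __name__ == "__main__":
    idx = build_index(["Blue Cross", "Blue Shield", "Red Cross"])
    print(infix_lookup(idx, "cros"))
```

The trade-off is a larger index, since every term is stored under each of its n-grams, and a minimum query length equal to the gram size.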



Re: Terms.regex performance issue

2011-08-19 Thread Chris Hostetter

: Subject: Terms.regex performance issue
: 
: As I want to use it in an Autocomplete it has to be fast. Terms.prefix gets
: results in around 100 milliseconds, while terms.regex is 10 to 20 times
: slower.

Can you elaborate on how you are using terms.regex? What does your regex
look like? Particularly if your use case is autocomplete, terms.prefix
seems like an odd choice.

Possible XY Problem?
https://people.apache.org/~hossman/#xyproblem

Have you looked at using the Suggester plugin?

https://wiki.apache.org/solr/Suggester


-Hoss


Re: Terms.regex performance issue

2011-08-19 Thread O. Klein
Terms.prefix was just to compare performance.

The use case was terms.regex=.*query.* And as Markus pointed out, this will
probably remain a bottleneck.

I looked at the Suggester. But like many others I have been struggling to
make it useful. It needs a custom QueryConverter to give proper suggestions,
but I haven't tried this yet.





