Re: Performance with search terms starting and ending with wildcards

2011-04-27 Thread Ueland
Hi!

Thanks for the reply.

We decided to give ngrams another try. After much tweaking and tuning, both
the index size and the speed were more than good enough for our needs.
So it looks like ngrams were the solution for us after all :)
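
For anyone finding this thread later, here is a minimal sketch of why ngrams help with *foo*. It is plain Python, not our Solr configuration: the gram size of 3 and the names build_index and infix_search are assumptions made up for the illustration. The idea is that an infix search becomes a handful of exact gram lookups instead of a scan over every term.

```python
from collections import defaultdict


def ngrams(term, n):
    """Return every substring of length n in term."""
    return {term[i:i + n] for i in range(len(term) - n + 1)}


def build_index(terms, n=3):
    """Map each n-gram to the set of terms that contain it."""
    index = defaultdict(set)
    for term in terms:
        for gram in ngrams(term, n):
            index[gram].add(term)
    return index


def infix_search(index, needle, n=3):
    """Find terms containing needle through gram lookups, not a term scan."""
    grams = ngrams(needle, n)
    if not grams:
        raise ValueError("needle is shorter than the gram size")
    # Intersect the posting sets of all grams, then verify each candidate,
    # because the grams could occur in a term without being adjacent.
    candidates = set.intersection(*(index.get(g, set()) for g in grams))
    return sorted(t for t in candidates if needle in t)


terms = ["foo", "foobar", "barfoo", "xfoox", "bar"]
index = build_index(terms)
print(infix_search(index, "foo"))  # ['barfoo', 'foo', 'foobar', 'xfoox']
```

The price is the extra gram entries in the index, which is where the tuning effort went.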

Best regards
Tor Henning Ueland

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Performance-with-search-terms-starting-and-ending-with-wildcards-tp2802561p2871451.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Performance with search terms starting and ending with wildcards

2011-04-11 Thread Otis Gospodnetic
Hi,

Perhaps you should give Lucene/Solr trunk a try and compare! The wildcard query
in trunk should be much faster.

Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



----- Original Message -----
> From: Ueland 
> To: solr-user@lucene.apache.org
> Sent: Sun, April 10, 2011 10:44:46 AM
> Subject: Performance with search terms starting and ending with wildcards
> 
> Hi!
> 
> I have been doing some testing with Solr and wildcards. Queries like:
> 
> - *foo
> - foo*
> 
> complete quickly (1-2 s) on a test index of about 40-50 GB.
> 
> But when I search for *foo*, the search time can easily climb to
> 30 seconds or more.
> 
> Any ideas on how that issue can be worked around?
> 
> One fix would be to change *foo* to (*foo or foo* or oof* or *oof) (is the
> reverse even needed?). But that will not give the same results as *foo*,
> logically enough.
> 
> I have also tried to set maxTimeAllowed, but that is simply ignored. I guess
> that is related to either sorting or the wildcard search itself.
> 
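
To illustrate the point in the quoted message that the rewritten query does not return the same results as *foo*, here is a small sketch in plain Python. The term list and the function name compare_rewrite are invented for the example; nothing here calls Solr. It shows that a term with "foo" strictly in the middle is matched by *foo* but by none of the four rewritten clauses.

```python
def compare_rewrite(terms, needle):
    """Return the terms that *needle* matches but the rewritten query misses."""
    rev = needle[::-1]
    prefix = {t for t in terms if t.startswith(needle)}      # foo*
    suffix = {t for t in terms if t.endswith(needle)}        # *foo
    rev_prefix = {t for t in terms if t.startswith(rev)}     # oof*
    rev_suffix = {t for t in terms if t.endswith(rev)}       # *oof
    infix = {t for t in terms if needle in t}                # *foo*
    rewritten = prefix | suffix | rev_prefix | rev_suffix
    # Keep only true infix matches that the rewrite failed to find.
    return sorted(infix - rewritten)


terms = ["foo", "foobar", "barfoo", "xfoox", "bar"]
print(compare_rewrite(terms, "foo"))  # ['xfoox']
```

In this sketch the reversed clauses add nothing useful when matched against the original terms; they can only pull in unrelated terms that happen to start or end with "oof".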


Re: Performance with search terms starting and ending with wildcards

2011-04-10 Thread Ueland
> Which version of Solr are you using?

Currently testing with 3.1

> NGrams could be an option, but could you give us the field definition in
> your schema? And the word count for this field in the index?

I won't share the complete schema, but I can summarize it:

For testing, we have around 30 fields that give us what we need from
documents ranging from a single line to several MB of plain text.
Because of this size, we have limited the copyFields to a maximum of 10 000
characters to keep the index size down a bit.

We did a quick test of n-grams. The issue then was that the index grew from
around 90 GB until the disk filled up at 300 GB. (We tested with more data and
fields, hence the larger index.)

The fact that an n-gram index becomes so large is a bit problematic.
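
A rough way to see where the growth comes from is to count the grams a single token produces. The sketch below is only arithmetic in Python; the gram-size ranges (2 to 10, and 3 only) are assumptions picked for illustration, not the settings from our test, and a gram count is not the same thing as bytes on disk.

```python
def ngram_count(token_length, min_gram, max_gram):
    """Number of n-grams emitted for one token over a range of gram sizes."""
    return sum(
        max(0, token_length - n + 1)  # tokens shorter than n yield no grams
        for n in range(min_gram, max_gram + 1)
    )


for length in (5, 10, 20):
    # token length, grams with sizes 2..10, grams with size 3 only
    print(length, ngram_count(length, 2, 10), ngram_count(length, 3, 3))
```

A 10-character token that was one term before turns into 45 grams with sizes 2 to 10, but only 8 with a fixed size of 3, so a narrow gram range is the first thing to try when the index grows too large.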

Another interesting note: even when I use the queryFilter to limit the documents
to search in, the query is extremely slow (30 s or more).



Re: Performance with search terms starting and ending with wildcards

2011-04-10 Thread lboutros
Which version of Solr are you using?

NGrams could be an option, but could you give us the field definition in your
schema? And the word count for this field in the index?

Ludovic.




-
Jouve
France.