Let's clear up some things about how Solr works.

1. Solr matches individual words (terms), not the whole text. So "Jason Bourne"
is split into ["Jason", "Bourne"]. The leading ".*" in your pattern does not
match preceding words; it can only match the beginning of a single word.

2. Query-time wildcards test every word in the index. This might be a billion
words. Of course that is slow. This is why we try to do things at index time.
With ngrams, there is one lookup, not a billion wildcard matches.

3. Regexes will almost always be the slowest way to do something in Solr, and 
are almost always too slow for production.
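
To make points 1 and 2 concrete, here is a rough sketch in plain Python. It is
not Solr code, and none of these names come from Solr: the tokenizer, the 2-4
character ngram sizes, and the helper functions are all assumptions made up for
illustration. It contrasts testing a regex against every term at query time
with a single dictionary lookup in an ngram index built ahead of time.

```python
import re
from collections import defaultdict

def tokenize(text):
    # Whitespace split plus lowercasing; a stand-in for a real analyzer chain.
    return text.lower().split()

def char_ngrams(term, min_n=2, max_n=4):
    # Every character substring of length min_n..max_n (sizes are an assumption).
    return {term[i:i + n]
            for n in range(min_n, max_n + 1)
            for i in range(len(term) - n + 1)}

def build_ngram_index(docs):
    # Index time: map each ngram of each term to the strings that contain it.
    index = defaultdict(set)
    for doc in docs:
        for term in tokenize(doc):
            for gram in char_ngrams(term):
                index[gram].add(doc)
    return index

def suggest_by_scan(docs, pattern):
    # Query time: the regex is tested against every single term, one at a time.
    regex = re.compile(pattern)
    return {doc for doc in docs
            for term in tokenize(doc) if regex.fullmatch(term)}

def suggest_by_ngram(index, fragment):
    # Query time: one lookup, no scan over the terms.
    return index.get(fragment.lower(), set())

docs = ["Jason Bourne", "Jason Statham", "James Bond"]
index = build_ngram_index(docs)

print(suggest_by_ngram(index, "Bour"))
print(suggest_by_ngram(index, "ason"))
print(suggest_by_scan(docs, ".*bour.*"))
# A pattern that spans two words matches nothing, because each term is
# tested on its own.
print(suggest_by_scan(docs, "jason.*bour.*"))
```

In this toy version a fragment longer than the largest ngram (4 characters here)
finds nothing, so the ngram sizes have to cover the fragments users will type.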

Now, what are you trying to do for the user? It seems like you have decided on 
a solution and are asking about that.

Solr already has many built-in solutions, so if we know the root problem, we 
may find an easy solution.

wunder

On Jun 6, 2013, at 4:53 AM, Prathik Puthran wrote:

> Basically I want the Suggester to return "Jason Bourne" as a suggestion
> for the ".*Bour.*" regex.
> 
> Thanks,
> Prathik
> 
> 
> On Thu, Jun 6, 2013 at 12:52 PM, Prathik Puthran <
> prathik.puthra...@gmail.com> wrote:
> 
>> This works even now i.e. when I search for "Jas" it suggests "Jason
>> Bourne". What I want is when I search for "Bour" or "ason" (any substring)
>> it should suggest me "Jason Bourne" .
>> 
>> 
>> On Thu, Jun 6, 2013 at 12:34 PM, Upayavira <u...@odoko.co.uk> wrote:
>> 
>>> Can you use the ShingleFilterFactory? It is ngrams for terms rather than
>>> characters. If you limited it to two term ngrams, when the user presses
>>> space after their first word, you could do a suggested query against
>>> your two term ngram field, which would suggest Jason Bourne, Jason
>>> Statham, etc. when you press space after "Jason".
>>> 
>>> Upayavira
>>> 
>>> On Thu, Jun 6, 2013, at 07:25 AM, Prathik Puthran wrote:
>>>> My use case is I want to search for any substring of the indexed string
>>>> and the Suggester should suggest the indexed string. What can I do to
>>>> make this work?
>>>> 
>>>> Thanks,
>>>> Prathik
>>>> 
>>>> 
>>>> On Thu, Jun 6, 2013 at 2:05 AM, Mikhail Khludnev
>>>> <mkhlud...@griddynamics.com> wrote:
>>>> 
>>>>> Please excuse my misunderstanding, but I always wonder why index-time
>>>>> processing is usually suggested. From my POV this is a case for
>>>>> query-time processing, i.e. PrefixQuery, aka the wildcard query Jason*.
>>>>> Ultra-fast term retrieval is also provided by TermsComponent.
>>>>> 
>>>>> 
>>>>> On Wed, Jun 5, 2013 at 8:09 PM, Jack Krupansky
>>>>> <j...@basetechnology.com> wrote:
>>>>> 
>>>>>> ngrams?
>>>>>> 
>>>>>> See:
>>>>>> http://lucene.apache.org/core/4_3_0/analyzers-common/org/apache/lucene/analysis/ngram/NGramFilterFactory.html
>>>>>> 
>>>>>> 
>>>>>> -- Jack Krupansky
>>>>>> 
>>>>>> -----Original Message----- From: Prathik Puthran
>>>>>> Sent: Wednesday, June 05, 2013 11:59 AM
>>>>>> To: solr-user@lucene.apache.org
>>>>>> Subject: Configuring lucene to suggest the indexed string for all the
>>>>>> searches of the substring of the indexed string
>>>>>> 
>>>>>> 
>>>>>> Hi,
>>>>>> 
>>>>>> Is it possible to configure solr to suggest the indexed string for all
>>>>>> the searches of the substring of the string?
>>>>>> 
>>>>>> Thanks,
>>>>>> Prathik
>>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> --
>>>>> Sincerely yours
>>>>> Mikhail Khludnev
>>>>> Principal Engineer,
>>>>> Grid Dynamics
>>>>> 
>>>>> <http://www.griddynamics.com>
>>>>> <mkhlud...@griddynamics.com>
>>>>> 
>>> 



