Re: Auto-Suggest within Tier Architecture

2020-02-24 Thread Paras Lehana
Hi Brett,

We, at IndiaMART, have Solr installed behind PHP servers which are
behind Varnish servers.

Yes, you are right: exposing the Solr URL is not a good idea. A single
service in between would do the trick.

You can try our service at dir.indiamart.com. We have client-side JS that
sends an AJAX request per keystroke. Do let me know if you have any other
queries.
:)
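
To make the "single service in between" idea concrete, here is a minimal sketch in Python of the part of such a service that turns browser input into a Solr request. It is only an illustration, not our production code: the host name, core name ("products") and handler path are hypothetical, and it assumes a suggest handler that accepts the standard suggest=true and suggest.q parameters.

```python
from urllib.parse import urlencode

# Hypothetical internal address of the Solr suggest handler (deepest tier).
SOLR_SUGGEST_URL = "http://solr.internal:8983/solr/products/suggest"


def build_suggest_url(user_input: str, max_len: int = 50) -> str:
    """Turn raw browser input into a fixed-shape Solr suggest request."""
    # Collapse runs of whitespace and cap the length of the term.
    term = " ".join(user_input.split())[:max_len]
    if not term:
        raise ValueError("empty query")
    # The client controls only the term; every other parameter is set
    # here, so a caller cannot reach other handlers or add parameters.
    params = {"suggest": "true", "suggest.q": term, "wt": "json"}
    return SOLR_SUGGEST_URL + "?" + urlencode(params)


print(build_suggest_url("  mp3   p "))
```

The point of the design is that the browser never sees the Solr URL: the middle tier accepts a single text parameter, encodes it, and forwards a request whose shape it fully controls.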

On Mon, 3 Feb 2020 at 22:10, Moyer, Brett wrote:

> Hello,
>
> I'd like to see how others have accomplished this. We have a 3-tier
> architecture, and Solr sits deep in T3, far from the end user. How do you
> make Auto-Suggest calls from the browser through the tiers down to Solr
> in T3? We essentially created a step at each tier, but I'd like to know
> what other approaches people have taken. Did you put Solr in T1? I assume
> not, since that would put it at risk. Thanks!
>
> Brett Moyer
> *
> This e-mail may contain confidential or privileged information.
> If you are not the intended recipient, please notify the sender
> immediately and then delete it.
>
> TIAA
> *
>


-- 
-- 
Regards,

*Paras Lehana* [65871]
Development Engineer, *Auto-Suggest*,
IndiaMART InterMESH Ltd,

11th Floor, Tower 2, Assotech Business Cresterra,
Plot No. 22, Sector 135, Noida, Uttar Pradesh, India 201305

Mob.: +91-9560911996
Work: 0120-4056700 | Extn:
*11096*



Auto-Suggest within Tier Architecture

2020-02-03 Thread Moyer, Brett
Hello,

I'd like to see how others have accomplished this. We have a 3-tier
architecture, and Solr sits deep in T3, far from the end user. How do you make
Auto-Suggest calls from the browser through the tiers down to Solr in T3? We
essentially created a step at each tier, but I'd like to know what other
approaches people have taken. Did you put Solr in T1? I assume not, since that
would put it at risk. Thanks!

Brett Moyer


Re: Type of auto suggest feature

2019-11-24 Thread Paras Lehana
Hey Artur,

If I have understood correctly, you want to suggest terms related to the
query. It would help if you described the use case as well. In the
meantime, please consider the following:

   1. Keep the different forms of a word as separate documents so that each
   can be suggested ("closed", "close" and "closing" should be different
   docs). Use stemming (Snowball Porter Stemmer Filter
   <https://lucene.apache.org/solr/guide/8_3/filter-descriptions.html#snowball-porter-stemmer-filter>)
   so that docs with different forms can be matched.
   2. The "interesting" terms are probably related terms in your case, which
   can be handled with the Synonym filter factory. Again, the related terms
   should be in different documents. Add all the related words to the
   synonyms file, separated by commas.
   3. Will your queries only ever contain a single term? If not, how do you
   want to handle multiple terms? This may require a few more analyzers and
   some tweaking of the query.
   4. If you also want to suggest terms for partial words (to suggest
   "closing" if the query is "clo"), use Edge NGrams
   <https://lucene.apache.org/solr/guide/8_3/tokenizers.html#edge-n-gram-tokenizer>.
   Use the Standard Tokenizer
   <https://lucene.apache.org/solr/guide/8_3/tokenizers.html#Tokenizers-StandardTokenizer>
   to split words. What do you want to achieve with the Shingle factory?
   5. I think all of the above can be handled simply, without the Suggester
   component. Anyway, keep exploring different approaches.
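
To illustrate point 4, here is a rough sketch in plain Python of what edge n-grams do at index time. It only mimics the idea (it is not Solr code); the function name and the min_gram/max_gram parameters are made up for the example, loosely mirroring the minimum and maximum gram sizes you would configure in Solr.

```python
def edge_ngrams(term: str, min_gram: int = 1, max_gram: int = 10) -> list[str]:
    """Return the leading-edge n-grams of a term, shortest first."""
    upper = min(max_gram, len(term))
    return [term[:n] for n in range(min_gram, upper + 1)]


# Index side: each suggestion is stored under all of its prefixes.
index: dict[str, set[str]] = {}
for word in ["close", "closed", "closing"]:
    for gram in edge_ngrams(word, min_gram=2):
        index.setdefault(gram, set()).add(word)

# Query side: the partial word is looked up as-is, without n-gramming.
print(sorted(index["clo"]))
```

Because the prefixes are produced at index time only, the partial input "clo" matches "close", "closed" and "closing" with a plain term lookup.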

Please let me know if you have any questions.

On Sun, 24 Nov 2019 at 19:11, Rudenko, Artur wrote:

> Hi,
> I am quite new to Solr and I am interested in implementing a kind of
> automatic term suggestion (not auto-complete) feature based on the user
> query.
> Users build a query (on multiple fields), and I am trying to help them
> refine it by suggesting additional terms based on the current query.
> The suggestions should contain synonyms and different word forms
> (query: close, result: closed, closing) and also some other "interesting"
> (hard to define what interesting is) terms and phrases based on that search.
>
> The queries are performed on a text field of about 1000 words, on document
> sets of about 20-50M.
>
> So far I have come up with a solution that uses the Suggester component
> over the 1000-word text field (copy field), as shown below, and I'm trying
> to find out how to add more "interesting" terms and phrases to it, based on
> the text field.
>
>
>  type="text_total_shingle_synonyms" indexed="true" stored="true"
> termVectors="true" termOffsets="true" termPositions="true" required="false"
> multiValued="true" />
>
> 
>
>  positionIncrementGap="100">
>   
> 
> 
> 
> 
>  protected="protwords.txt"/>
>  maxShingleSize="4" />
>   
>   
> 
>  synonyms="synonyms_suggest.txt" ignoreCase="true" expand="false"/> 
> 
> 
>  protected="protwords.txt"/>
> 
>
> 
> 
>
>
> Thanks,
> Artur Rudenko
>
>
>
> This electronic message may contain proprietary and confidential
> information of Verint Systems Inc., its affiliates and/or subsidiaries. The
> information is intended to be for the use of the individual(s) or
> entity(ies) named above. If you are not the intended recipient (or
> authorized to receive this e-mail for the intended recipient), you may not
> use, copy, disclose or distribute to anyone this message or any information
> contained in this message. If you have received this electronic message in
> error, please notify us by replying to this e-mail.
>


-- 
-- 
Regards,

*Paras Lehana* [65871]
Development Engineer, Auto-Suggest,
IndiaMART Intermesh Ltd.

8th Floor, Tower A, Advant-Navis Business Park, Sector 142,
Noida, UP, IN - 201303

Mob.: +91-9560911996
Work: 01203916600 | Extn:  *8173*



Type of auto suggest feature

2019-11-24 Thread Rudenko, Artur
Hi,
I am quite new to Solr and I am interested in implementing a kind of automatic
term suggestion (not auto-complete) feature based on the user query.
Users build a query (on multiple fields), and I am trying to help them refine
it by suggesting additional terms based on the current query.
The suggestions should contain synonyms and different word forms (query: close,
result: closed, closing) and also some other "interesting" (hard to define what
interesting is) terms and phrases based on that search.

The queries are performed on a text field of about 1000 words, on document sets
of about 20-50M.

So far I have come up with a solution that uses the Suggester component over
the 1000-word text field (copy field), as shown below, and I'm trying to find
out how to add more "interesting" terms and phrases to it, based on the text
field.

Thanks,
Artur Rudenko





Re: Auto-suggest in Solr

2015-07-12 Thread Zheng Lin Edwin Yeo
Thank you so much.

I'll read up on that and try that out.

Regards,
Edwin


On 12 July 2015 at 00:41, Erick Erickson  wrote:

> Cool! I've bookmarked it, much more thorough
>
> Erick
>
> On Sat, Jul 11, 2015 at 8:13 AM, Walter Underwood 
> wrote:
> > Thanks, this is very helpful.
> >
> > Suggester config is quite under documented. It took me longer than I
> expected to get it working.
> >
> > wunder
> > Walter Underwood
> > wun...@wunderwood.org
> > http://observer.wunderwood.org/  (my blog)
> >
> >
> > On Jul 10, 2015, at 6:30 PM, Alessandro Benedetti <
> benedetti.ale...@gmail.com> wrote:
> >
> >> Hi guys,
> >> just wrote a blog to integrate Erick's post and to explain in details
> with
> >> practical examples all the main Lookup implementations :
> >>
> >> http://alexbenedetti.blogspot.co.uk/2015/07/solr-you-complete-me.html
> >>
> >> I think this can be useful for Edwin to finally fix the config for the
> >> FreeTextSuggester ( which finally I clarified Erick, thanks to Mike
> answer
> >> in dev, and deep code analysis and testing :) )
> >>
> >> Cheers
> >>
> >> 2015-06-27 23:51 GMT+01:00 Alessandro Benedetti <
> benedetti.ale...@gmail.com>
> >> :
> >>
> >>> Thanks, Erick, i didn't have time to go again through the code.
> >>> But i will forward this to the Dev list.
> >>> Thank you for your time !
> >>>
> >>> Cheers
> >>>
> >>> 2015-06-27 16:19 GMT+01:00 Erick Erickson :
> >>>
> >>>> Alessandro:
> >>>>
> >>>> Going to have to defer to Mike McCandless et.al., they're the
> >>>> authorities here. Don't quite know whether they monitor this list,
> >>>> consider the dev list?
> >>>>
> >>>> Best,
> >>>> Erick
> >>>>
> >>>> On Fri, Jun 26, 2015 at 4:53 AM, Alessandro Benedetti
> >>>>  wrote:
> >>>>> Up, Can anyone gently take a look to my considerations related the
> >>>> FreeText
> >>>>> Suggester ?
> >>>>> I am curious to have more insight.
> >>>>> Eventually I will deeply analyse the code to understand my errors.
> >>>>>
> >>>>> Cheers
> >>>>>
> >>>>> 2015-06-19 11:53 GMT+01:00 Alessandro Benedetti <
> >>>> benedetti.ale...@gmail.com>
> >>>>> :
> >>>>>
> >>>>>> Actually the documentation is not clear enough.
> >>>>>> Let's try to understand this suggester.
> >>>>>>
> >>>>>> *Building*
> >>>>>> This suggester build a FST that it will use to provide the
> autocomplete
> >>>>>> feature running prefix searches on it .
> >>>>>> The terms it uses to generate the FST are the tokens produced by the
> >>>>>> "suggestFreeTextAnalyzerFieldType" .
> >>>>>>
> >>>>>> And this should be correct.
> >>>>>> So if we have a shingle token filter[1-3] ( we produce unigrams as
> >>>> well)
> >>>>>> in our analysis to keep it simple , from these original field
> values :
> >>>>>> "mp3 ipod"
> >>>>>> "mp3 player"
> >>>>>> "mp3 player ipod"
> >>>>>> "player of Real"
> >>>>>>
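> >>>>>> -> we produce this list of possible suggestions in our FST:
> >>>>>>
> >>>>>> "mp3"
> >>>>>> "ipod"
> >>>>>> "player"
> >>>>>> "of"
> >>>>>> "Real"
> >>>>>>
> >>>>>> "mp3 ipod"
> >>>>>> "mp3 player"
> >>>>>> "player ipod"
> >>>>>> "player of"
> >>>>>> "of Real"
> >>>>>>
> >>>>>> "mp3 player ipod"
> >>>>>> "player of Real"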
> >>>>>>
> >>>>>> From the documentation I read :
> >>>>>>
> >>>>>>> " ngrams: The max number of tokens out of which singles will be
> make
> >>>> the
> >>>>>>> dictionary. The default value is 2. Increasing this would mean you
> >>>> want

Re: Auto-suggest in Solr

2015-07-11 Thread Erick Erickson
Cool! I've bookmarked it; it's much more thorough.

Erick

On Sat, Jul 11, 2015 at 8:13 AM, Walter Underwood  wrote:
> Thanks, this is very helpful.
>
> Suggester config is quite under documented. It took me longer than I expected 
> to get it working.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
>
> On Jul 10, 2015, at 6:30 PM, Alessandro Benedetti 
>  wrote:
>
>> Hi guys,
>> just wrote a blog to integrate Erick's post and to explain in details with
>> practical examples all the main Lookup implementations :
>>
>> http://alexbenedetti.blogspot.co.uk/2015/07/solr-you-complete-me.html
>>
>> I think this can be useful for Edwin to finally fix the config for the
>> FreeTextSuggester ( which finally I clarified Erick, thanks to Mike answer
>> in dev, and deep code analysis and testing :) )
>>
>> Cheers
>>
>> 2015-06-27 23:51 GMT+01:00 Alessandro Benedetti 
>> :
>>
>>> Thanks, Erick, i didn't have time to go again through the code.
>>> But i will forward this to the Dev list.
>>> Thank you for your time !
>>>
>>> Cheers
>>>
>>> 2015-06-27 16:19 GMT+01:00 Erick Erickson :
>>>
>>>> Alessandro:
>>>>
>>>> Going to have to defer to Mike McCandless et.al., they're the
>>>> authorities here. Don't quite know whether they monitor this list,
>>>> consider the dev list?
>>>>
>>>> Best,
>>>> Erick
>>>>
>>>> On Fri, Jun 26, 2015 at 4:53 AM, Alessandro Benedetti
>>>>  wrote:
>>>>> Up, Can anyone gently take a look to my considerations related the
>>>> FreeText
>>>>> Suggester ?
>>>>> I am curious to have more insight.
>>>>> Eventually I will deeply analyse the code to understand my errors.
>>>>>
>>>>> Cheers
>>>>>
>>>>> 2015-06-19 11:53 GMT+01:00 Alessandro Benedetti <
>>>> benedetti.ale...@gmail.com>
>>>>> :
>>>>>
>>>>>> Actually the documentation is not clear enough.
>>>>>> Let's try to understand this suggester.
>>>>>>
>>>>>> *Building*
>>>>>> This suggester build a FST that it will use to provide the autocomplete
>>>>>> feature running prefix searches on it .
>>>>>> The terms it uses to generate the FST are the tokens produced by the
>>>>>> "suggestFreeTextAnalyzerFieldType" .
>>>>>>
>>>>>> And this should be correct.
>>>>>> So if we have a shingle token filter[1-3] ( we produce unigrams as
>>>> well)
>>>>>> in our analysis to keep it simple , from these original field values :
>>>>>> "mp3 ipod"
>>>>>> "mp3 player"
>>>>>> "mp3 player ipod"
>>>>>> "player of Real"
>>>>>>
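>>>>>> -> we produce this list of possible suggestions in our FST:
>>>>>>
>>>>>> "mp3"
>>>>>> "ipod"
>>>>>> "player"
>>>>>> "of"
>>>>>> "Real"
>>>>>>
>>>>>> "mp3 ipod"
>>>>>> "mp3 player"
>>>>>> "player ipod"
>>>>>> "player of"
>>>>>> "of Real"
>>>>>>
>>>>>> "mp3 player ipod"
>>>>>> "player of Real"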
>>>>>>
>>>>>> From the documentation I read :
>>>>>>
>>>>>>> " ngrams: The max number of tokens out of which singles will be make
>>>> the
>>>>>>> dictionary. The default value is 2. Increasing this would mean you
>>>> want
>>>>>>> more than the previous 2 tokens to be taken into consideration when
>>>> making
>>>>>>> the suggestions. "
>>>>>>
>>>>>>
>>>>>> This makes me confused, as I was not expecting this param to affect the
>>>>>> suggestion dictionary.
>>>>>> So I would like a clarification here from our masters :)
>>>>>> At this point let's see what happens at query time .
>>>>>>
>>>>>> *Query Time *
>>>>>> As my understanding the ngrams params will consider  the last N-1
>>>> tokens
>>>>>> the user put separated by the space separator.

Re: Auto-suggest in Solr

2015-07-11 Thread Walter Underwood
Thanks, this is very helpful.

Suggester config is quite under-documented. It took me longer than I expected
to get it working.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)


On Jul 10, 2015, at 6:30 PM, Alessandro Benedetti wrote:

> Hi guys,
> just wrote a blog to integrate Erick's post and to explain in details with
> practical examples all the main Lookup implementations :
> 
> http://alexbenedetti.blogspot.co.uk/2015/07/solr-you-complete-me.html
> 
> I think this can be useful for Edwin to finally fix the config for the
> FreeTextSuggester ( which finally I clarified Erick, thanks to Mike answer
> in dev, and deep code analysis and testing :) )
> 
> Cheers
> 
> 2015-06-27 23:51 GMT+01:00 Alessandro Benedetti 
> :
> 
>> Thanks, Erick, i didn't have time to go again through the code.
>> But i will forward this to the Dev list.
>> Thank you for your time !
>> 
>> Cheers
>> 
>> 2015-06-27 16:19 GMT+01:00 Erick Erickson :
>> 
>>> Alessandro:
>>> 
>>> Going to have to defer to Mike McCandless et.al., they're the
>>> authorities here. Don't quite know whether they monitor this list,
>>> consider the dev list?
>>> 
>>> Best,
>>> Erick
>>> 
>>> On Fri, Jun 26, 2015 at 4:53 AM, Alessandro Benedetti
>>>  wrote:
>>>> Up, Can anyone gently take a look to my considerations related the
>>> FreeText
>>>> Suggester ?
>>>> I am curious to have more insight.
>>>> Eventually I will deeply analyse the code to understand my errors.
>>>> 
>>>> Cheers
>>>> 
>>>> 2015-06-19 11:53 GMT+01:00 Alessandro Benedetti <
>>> benedetti.ale...@gmail.com>
>>>> :
>>>> 
>>>>> Actually the documentation is not clear enough.
>>>>> Let's try to understand this suggester.
>>>>> 
>>>>> *Building*
>>>>> This suggester build a FST that it will use to provide the autocomplete
>>>>> feature running prefix searches on it .
>>>>> The terms it uses to generate the FST are the tokens produced by the
>>>>> "suggestFreeTextAnalyzerFieldType" .
>>>>> 
>>>>> And this should be correct.
>>>>> So if we have a shingle token filter[1-3] ( we produce unigrams as
>>> well)
>>>>> in our analysis to keep it simple , from these original field values :
>>>>> "mp3 ipod"
>>>>> "mp3 player"
>>>>> "mp3 player ipod"
>>>>> "player of Real"
>>>>> 
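>>>>> -> we produce this list of possible suggestions in our FST:
>>>>>
>>>>> "mp3"
>>>>> "ipod"
>>>>> "player"
>>>>> "of"
>>>>> "Real"
>>>>>
>>>>> "mp3 ipod"
>>>>> "mp3 player"
>>>>> "player ipod"
>>>>> "player of"
>>>>> "of Real"
>>>>>
>>>>> "mp3 player ipod"
>>>>> "player of Real"
>>>>>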
>>>>> From the documentation I read :
>>>>> 
>>>>>> " ngrams: The max number of tokens out of which singles will be make
>>> the
>>>>>> dictionary. The default value is 2. Increasing this would mean you
>>> want
>>>>>> more than the previous 2 tokens to be taken into consideration when
>>> making
>>>>>> the suggestions. "
>>>>> 
>>>>> 
>>>>> This makes me confused, as I was not expecting this param to affect the
>>>>> suggestion dictionary.
>>>>> So I would like a clarification here from our masters :)
>>>>> At this point let's see what happens at query time .
>>>>> 
>>>>> *Query Time *
>>>>> As my understanding the ngrams params will consider  the last N-1
>>> tokens
>>>>> the user put separated by the space separator.
>>>>> 
>>>>> "Builds an ngram model from the text sent to {@link
>>>>>> * #build} and predicts based on the last grams-1 tokens in
>>>>>> * the request sent to {@link #lookup}. This tries to
>>>>>> * handle the "long tail" of suggestions for when the
>>>>>> * incoming query is a never before seen query string."
>>>>> 
>>>>> 
>>>>> Example , grams=3 should consider only the last 2 tokens

Re: Auto-suggest in Solr

2015-07-10 Thread Alessandro Benedetti
Hi guys,
I just wrote a blog post to complement Erick's post and to explain in
detail, with practical examples, all the main Lookup implementations:

http://alexbenedetti.blogspot.co.uk/2015/07/solr-you-complete-me.html

I think this can be useful for Edwin to finally fix the config for the
FreeTextSuggester (which I have finally clarified, Erick, thanks to Mike's
answer on the dev list and some deep code analysis and testing :) )
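
For anyone skimming the quoted thread below, the query-time reading discussed there ("predicts based on the last grams-1 tokens", so with grams=3 the input "special mp3 p" is reduced to "mp3 p") can be sketched in a few lines of Python. This is only an illustration of that reading, with a made-up function name and plain whitespace splitting instead of the configured analyzer; it is not the Lucene implementation.

```python
def last_context(query: str, grams: int) -> str:
    """Keep only the last grams-1 whitespace-separated tokens of the query."""
    keep = grams - 1
    tokens = query.split()
    # Guard against keep == 0, where a [-0:] slice would return everything.
    return " ".join(tokens[-keep:]) if keep > 0 else ""


print(last_context("special mp3 p", grams=3))  # mp3 p
```

With this reading, anything the user typed before the last grams-1 tokens has no influence on the suggestions.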

Cheers

2015-06-27 23:51 GMT+01:00 Alessandro Benedetti:

> Thanks, Erick, i didn't have time to go again through the code.
> But i will forward this to the Dev list.
> Thank you for your time !
>
> Cheers
>
> 2015-06-27 16:19 GMT+01:00 Erick Erickson :
>
>> Alessandro:
>>
>> Going to have to defer to Mike McCandless et.al., they're the
>> authorities here. Don't quite know whether they monitor this list,
>> consider the dev list?
>>
>> Best,
>> Erick
>>
>> On Fri, Jun 26, 2015 at 4:53 AM, Alessandro Benedetti
>>  wrote:
>> > Up, Can anyone gently take a look to my considerations related the
>> FreeText
>> > Suggester ?
>> > I am curious to have more insight.
>> > Eventually I will deeply analyse the code to understand my errors.
>> >
>> > Cheers
>> >
>> > 2015-06-19 11:53 GMT+01:00 Alessandro Benedetti <
>> benedetti.ale...@gmail.com>
>> > :
>> >
>> >> Actually the documentation is not clear enough.
>> >> Let's try to understand this suggester.
>> >>
>> >> *Building*
>> >> This suggester build a FST that it will use to provide the autocomplete
>> >> feature running prefix searches on it .
>> >> The terms it uses to generate the FST are the tokens produced by the
>> >>  "suggestFreeTextAnalyzerFieldType" .
>> >>
>> >> And this should be correct.
>> >> So if we have a shingle token filter[1-3] ( we produce unigrams as
>> well)
>> >> in our analysis to keep it simple , from these original field values :
>> >> "mp3 ipod"
>> >> "mp3 player"
>> >> "mp3 player ipod"
>> >> "player of Real"
>> >>
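>> >> -> we produce this list of possible suggestions in our FST:
>> >>
>> >> "mp3"
>> >> "ipod"
>> >> "player"
>> >> "of"
>> >> "Real"
>> >>
>> >> "mp3 ipod"
>> >> "mp3 player"
>> >> "player ipod"
>> >> "player of"
>> >> "of Real"
>> >>
>> >> "mp3 player ipod"
>> >> "player of Real"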
>> >>
>> >> From the documentation I read :
>> >>
>> >>> " ngrams: The max number of tokens out of which singles will be make
>> the
>> >>> dictionary. The default value is 2. Increasing this would mean you
>> want
>> >>> more than the previous 2 tokens to be taken into consideration when
>> making
>> >>> the suggestions. "
>> >>
>> >>
>> >> This makes me confused, as I was not expecting this param to affect the
>> >> suggestion dictionary.
>> >> So I would like a clarification here from our masters :)
>> >> At this point let's see what happens at query time .
>> >>
>> >> *Query Time *
>> >> As my understanding the ngrams params will consider  the last N-1
>> tokens
>> >> the user put separated by the space separator.
>> >>
>> >> "Builds an ngram model from the text sent to {@link
>> >>> * #build} and predicts based on the last grams-1 tokens in
>> >>> * the request sent to {@link #lookup}. This tries to
>> >>> * handle the "long tail" of suggestions for when the
>> >>> * incoming query is a never before seen query string."
>> >>
>> >>
>> >> Example , grams=3 should consider only the last 2 tokens
>> >>
>> >> special mp3 p -> mp3 p
>> >>
>> >> Then this query is analysed using the
>> "suggestFreeTextAnalyzerFieldType" .
>> >> We produce 3 tokens:
>> >> "mp3"
>> >> "p"
>> >> "mp3 p"
>> >>
>> >> And we run the prefix matching on the FST .
>> >>
>> >> *Conclusion*
>> >> My understanding is wrong for sure at some point, as the behaviour I
>> get
>> >> is different.
>> >> Can we discuss this , clarify this and eventually put it in the
>> official
>> >> documentation ?
>> 

Re: Auto-suggest in Solr

2015-06-27 Thread Alessandro Benedetti
Thanks, Erick. I didn't have time to go through the code again,
but I will forward this to the dev list.
Thank you for your time!

Cheers

2015-06-27 16:19 GMT+01:00 Erick Erickson :

> Alessandro:
>
> Going to have to defer to Mike McCandless et.al., they're the
> authorities here. Don't quite know whether they monitor this list,
> consider the dev list?
>
> Best,
> Erick
>
> On Fri, Jun 26, 2015 at 4:53 AM, Alessandro Benedetti
>  wrote:
> > Up, Can anyone gently take a look to my considerations related the
> FreeText
> > Suggester ?
> > I am curious to have more insight.
> > Eventually I will deeply analyse the code to understand my errors.
> >
> > Cheers
> >
> > 2015-06-19 11:53 GMT+01:00 Alessandro Benedetti <
> benedetti.ale...@gmail.com>
> > :
> >
> >> Actually the documentation is not clear enough.
> >> Let's try to understand this suggester.
> >>
> >> *Building*
> >> This suggester build a FST that it will use to provide the autocomplete
> >> feature running prefix searches on it .
> >> The terms it uses to generate the FST are the tokens produced by the
> >>  "suggestFreeTextAnalyzerFieldType" .
> >>
> >> And this should be correct.
> >> So if we have a shingle token filter[1-3] ( we produce unigrams as well)
> >> in our analysis to keep it simple , from these original field values :
> >> "mp3 ipod"
> >> "mp3 player"
> >> "mp3 player ipod"
> >> "player of Real"
> >>
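> >> -> we produce this list of possible suggestions in our FST:
> >>
> >> "mp3"
> >> "ipod"
> >> "player"
> >> "of"
> >> "Real"
> >>
> >> "mp3 ipod"
> >> "mp3 player"
> >> "player ipod"
> >> "player of"
> >> "of Real"
> >>
> >> "mp3 player ipod"
> >> "player of Real"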
> >>
> >> From the documentation I read :
> >>
> >>> " ngrams: The max number of tokens out of which singles will be make
> the
> >>> dictionary. The default value is 2. Increasing this would mean you want
> >>> more than the previous 2 tokens to be taken into consideration when
> making
> >>> the suggestions. "
> >>
> >>
> >> This makes me confused, as I was not expecting this param to affect the
> >> suggestion dictionary.
> >> So I would like a clarification here from our masters :)
> >> At this point let's see what happens at query time .
> >>
> >> *Query Time *
> >> As my understanding the ngrams params will consider  the last N-1 tokens
> >> the user put separated by the space separator.
> >>
> >> "Builds an ngram model from the text sent to {@link
> >>> * #build} and predicts based on the last grams-1 tokens in
> >>> * the request sent to {@link #lookup}. This tries to
> >>> * handle the "long tail" of suggestions for when the
> >>> * incoming query is a never before seen query string."
> >>
> >>
> >> Example , grams=3 should consider only the last 2 tokens
> >>
> >> special mp3 p -> mp3 p
> >>
> >> Then this query is analysed using the
> "suggestFreeTextAnalyzerFieldType" .
> >> We produce 3 tokens:
> >> "mp3"
> >> "p"
> >> "mp3 p"
> >>
> >> And we run the prefix matching on the FST .
> >>
> >> *Conclusion*
> >> My understanding is wrong for sure at some point, as the behaviour I get
> >> is different.
> >> Can we discuss this , clarify this and eventually put it in the official
> >> documentation ?
> >>
> >> Cheers
> >>
> >> 2015-06-19 6:40 GMT+01:00 Zheng Lin Edwin Yeo :
> >>
> >>> I'm implementing an auto-suggest feature in Solr, and I'd like to
> >>> achieve the following:
> >>>
> >>> For example, if the user enters "mp3", Solr might suggest "mp3 player",
> >>> "mp3 nano" and "mp3 music".
> >>> When the user enters "mp3 p", the suggestion should narrow down to "mp3
> >>> player".
> >>>
> >>> Currently, when I type "mp3 p", the suggester returns words that
> >>> start with the letter "p" only, and I'm getting results like "plan",
> >>> "production", etc.; it does not take the "mp3" token into
> >>> consideration.
> >>>

Re: Auto-suggest in Solr

2015-06-27 Thread Erick Erickson
Alessandro:

I'm going to have to defer to Mike McCandless et al.; they're the
authorities here. I don't quite know whether they monitor this list, so
perhaps try the dev list?

Best,
Erick

On Fri, Jun 26, 2015 at 4:53 AM, Alessandro Benedetti wrote:
> Up, Can anyone gently take a look to my considerations related the FreeText
> Suggester ?
> I am curious to have more insight.
> Eventually I will deeply analyse the code to understand my errors.
>
> Cheers
>
> 2015-06-19 11:53 GMT+01:00 Alessandro Benedetti 
> :
>
>> Actually the documentation is not clear enough.
>> Let's try to understand this suggester.
>>
>> *Building*
>> This suggester build a FST that it will use to provide the autocomplete
>> feature running prefix searches on it .
>> The terms it uses to generate the FST are the tokens produced by the
>>  "suggestFreeTextAnalyzerFieldType" .
>>
>> And this should be correct.
>> So if we have a shingle token filter[1-3] ( we produce unigrams as well)
>> in our analysis to keep it simple , from these original field values :
>> "mp3 ipod"
>> "mp3 player"
>> "mp3 player ipod"
>> "player of Real"
>>
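>> -> we produce this list of possible suggestions in our FST:
>>
>> "mp3"
>> "ipod"
>> "player"
>> "of"
>> "Real"
>>
>> "mp3 ipod"
>> "mp3 player"
>> "player ipod"
>> "player of"
>> "of Real"
>>
>> "mp3 player ipod"
>> "player of Real"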
>>
>> From the documentation I read :
>>
>>> " ngrams: The max number of tokens out of which singles will be make the
>>> dictionary. The default value is 2. Increasing this would mean you want
>>> more than the previous 2 tokens to be taken into consideration when making
>>> the suggestions. "
>>
>>
>> This makes me confused, as I was not expecting this param to affect the
>> suggestion dictionary.
>> So I would like a clarification here from our masters :)
>> At this point let's see what happens at query time .
>>
>> *Query Time *
>> As my understanding the ngrams params will consider  the last N-1 tokens
>> the user put separated by the space separator.
>>
>> "Builds an ngram model from the text sent to {@link
>>> * #build} and predicts based on the last grams-1 tokens in
>>> * the request sent to {@link #lookup}. This tries to
>>> * handle the "long tail" of suggestions for when the
>>> * incoming query is a never before seen query string."
>>
>>
>> Example , grams=3 should consider only the last 2 tokens
>>
>> special mp3 p -> mp3 p
>>
>> Then this query is analysed using the "suggestFreeTextAnalyzerFieldType" .
>> We produce 3 tokens:
>> "mp3"
>> "p"
>> "mp3 p"
>>
>> And we run the prefix matching on the FST .
>>
>> *Conclusion*
>> My understanding is wrong for sure at some point, as the behaviour I get
>> is different.
>> Can we discuss this , clarify this and eventually put it in the official
>> documentation ?
>>
>> Cheers
>>
>> 2015-06-19 6:40 GMT+01:00 Zheng Lin Edwin Yeo :
>>
>>> I'm implementing an auto-suggest feature in Solr, and I'd like to achieve
>>> the following:
>>>
>>> For example, if the user enters "mp3", Solr might suggest "mp3 player",
>>> "mp3 nano" and "mp3 music".
>>> When the user enters "mp3 p", the suggestion should narrow down to "mp3
>>> player".
>>>
>>> Currently, when I type "mp3 p", the suggester returns words that
>>> start with the letter "p" only, and I'm getting results like "plan",
>>> "production", etc.; it does not take the "mp3" token into
>>> consideration.
>>>
>>> I'm using Solr 5.1 and below is my configuration:
>>>
>>> In solrconfig.xml:
>>>
>>> 
>>>   
>>>
>>>  FreeTextLookupFactory
>>>  suggester_freetext_dir
>>>
>>> DocumentDictionaryFactory
>>> Suggestion
>>> Project
>>> suggestType
>>> 5
>>> false
>>> false
>>>   
>>> 
>>>
>>>
>>> In schema.xml
>>>
>>> >> positionIncrementGap="100">
>>> 
>>> >> pattern="[^a-zA-Z0-9]" replacement=" " />
>>> 
>>> >> maxShingleSize="6" outputUnigrams="false"/>
>>> 
>>> 
>>> >> pattern="[^a-zA-Z0-9]" replacement=" " />
>>> 
>>> >> maxShingleSize="6" outputUnigrams="true"/>
>>> 
>>> 
>>>
>>>
>>> Is there anything that I configured wrongly?
>>>
>>>
>>> Regards,
>>> Edwin
>>>
>>
>>
>>
>> --
>> --
>>
>> Benedetti Alessandro
>> Visiting card : http://about.me/alessandro_benedetti
>>
>> "Tyger, tyger burning bright
>> In the forests of the night,
>> What immortal hand or eye
>> Could frame thy fearful symmetry?"
>>
>> William Blake - Songs of Experience -1794 England
>>
>
>
>
> --
> --
>
> Benedetti Alessandro
> Visiting card : http://about.me/alessandro_benedetti
>
> "Tyger, tyger burning bright
> In the forests of the night,
> What immortal hand or eye
> Could frame thy fearful symmetry?"
>
> William Blake - Songs of Experience -1794 England


Re: Auto-suggest in Solr

2015-06-26 Thread Alessandro Benedetti
Bump: can anyone kindly take a look at my considerations regarding the
FreeTextSuggester?
I am curious to get more insight.
Eventually I will analyse the code in depth to understand my errors.
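
To make the "Building" part of my quoted message below easier to follow, here is a small Python sketch of the dictionary I expected: shingles of size 1 to 3 over the four example field values, followed by a plain prefix match. It is only a model of my understanding, using whitespace splitting and a sorted list instead of the real analyzer and FST; the function names are made up, and this is not what FreeTextSuggester necessarily does internally.

```python
def shingles(text: str, min_size: int = 1, max_size: int = 3) -> list[str]:
    """Return all word n-grams of the text from min_size to max_size words."""
    tokens = text.split()
    out = []
    for size in range(min_size, max_size + 1):
        for start in range(len(tokens) - size + 1):
            out.append(" ".join(tokens[start:start + size]))
    return out


values = ["mp3 ipod", "mp3 player", "mp3 player ipod", "player of Real"]
# De-duplicated suggestion dictionary built from every field value.
dictionary = sorted({s for v in values for s in shingles(v)})


def suggest(prefix: str) -> list[str]:
    """Plain prefix match over the dictionary (a stand-in for the FST)."""
    return [s for s in dictionary if s.startswith(prefix)]


print(len(dictionary))
print(suggest("mp3 p"))
```

Under this model, "mp3 p" narrows to the entries starting with "mp3 player", which is the behaviour Edwin is asking for.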

Cheers

2015-06-19 11:53 GMT+01:00 Alessandro Benedetti:

> Actually, the documentation is not clear enough.
> Let's try to understand this suggester.
>
> *Building*
> This suggester builds an FST that it uses to provide the autocomplete
> feature by running prefix searches on it.
> The terms it uses to generate the FST are the tokens produced by the
> "suggestFreeTextAnalyzerFieldType".
>
> And this should be correct.
> So if we have a shingle token filter [1-3] (we produce unigrams as well)
> in our analysis, to keep it simple, from these original field values:
> "mp3 ipod"
> "mp3 player"
> "mp3 player ipod"
> "player of Real"
>
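> -> we produce this list of possible suggestions in our FST:
>
> "mp3"
> "ipod"
> "player"
> "of"
> "Real"
>
> "mp3 ipod"
> "mp3 player"
> "player ipod"
> "player of"
> "of Real"
>
> "mp3 player ipod"
> "player of Real"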
>
> From the documentation I read :
>
>> " ngrams: The max number of tokens out of which singles will be make the
>> dictionary. The default value is 2. Increasing this would mean you want
>> more than the previous 2 tokens to be taken into consideration when making
>> the suggestions. "
>
>
> This confuses me, as I was not expecting this parameter to affect the
> suggestion dictionary.
> So I would like a clarification here from our masters :)
> At this point, let's see what happens at query time.
>
> *Query Time*
> As I understand it, the ngrams parameter makes the suggester consider only
> the last N-1 tokens the user typed, split on whitespace.
>
>> "Builds an ngram model from the text sent to {@link #build} and
>> predicts based on the last grams-1 tokens in the request sent to
>> {@link #lookup}. This tries to handle the "long tail" of suggestions
>> for when the incoming query is a never before seen query string."
>
>
> Example , grams=3 should consider only the last 2 tokens
>
> special mp3 p -> mp3 p
>
> Then this query is analysed using the "suggestFreeTextAnalyzerFieldType" .
> We produce 3 tokens :
> 
> 
> 
>
> And we run the prefix matching on the FST .
>
> *Conclusion*
> My understanding is wrong for sure at some point, as the behaviour I get
> is different.
> Can we discuss this , clarify this and eventually put it in the official
> documentation ?
>
> Cheers
>
> 2015-06-19 6:40 GMT+01:00 Zheng Lin Edwin Yeo :
>
>> I'm implementing an auto-suggest feature in Solr, and I'll like to achieve
>> the follwing:
>>
>> For example, if the user enters "mp3", Solr might suggest "mp3 player",
>> "mp3 nano" and "mp3 music".
>> When the user enters "mp3 p", the suggestion should narrow down to "mp3
>> player".
>>
>> Currently, when I type "mp3 p", the suggester is returning words that
>> starts with the letter "p" only, and I'm getting results like "plan",
>> "production", etc, and it does not take the "mp3" token into
>> consideration.
>>
>> I'm using Solr 5.1 and below is my configuration:
>>
>> In solrconfig.xml:
>>
>> 
>>   
>>
>>  FreeTextLookupFactory
>>  suggester_freetext_dir
>>
>> DocumentDictionaryFactory
>> Suggestion
>> Project
>> suggestType
>> 5
>> false
>> false
>>   
>> 
>>
>>
>> In schema.xml
>>
>> > positionIncrementGap="100">
>> 
>> > pattern="[^a-zA-Z0-9]" replacement=" " />
>> 
>> > maxShingleSize="6" outputUnigrams="false"/>
>> 
>> 
>> > pattern="[^a-zA-Z0-9]" replacement=" " />
>> 
>> > maxShingleSize="6" outputUnigrams="true"/>
>> 
>> 
>>
>>
>> Is there anything that I configured wrongly?
>>
>>
>> Regards,
>> Edwin
>>
>
>
>
> --
> --
>
> Benedetti Alessandro
> Visiting card : http://about.me/alessandro_benedetti
>
> "Tyger, tyger burning bright
> In the forests of the night,
> What immortal hand or eye
> Could frame thy fearful symmetry?"
>
> William Blake - Songs of Experience -1794 England
>



-- 
--

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England


Re: Auto-suggest in Solr

2015-06-22 Thread Alessandro Benedetti
Could any of our beloved super gurus take a look at my mail?
It could help Edwin as well :)

Cheers






Re: Auto-suggest in Solr

2015-06-19 Thread Zheng Lin Edwin Yeo
Ok sure.

> " ngrams: The max number of tokens out of which singles will be make the
> dictionary. The default value is 2. Increasing this would mean you want
> more than the previous 2 tokens to be taken into consideration when making
> the suggestions. "

I got confused by this, as I could not reproduce this behaviour when I use the
suggester. Since the default value is 2, I would expect a search for "mp3 p"
to return only suggestions that contain "mp3 ...", not ones based on the
letter "p" alone. But I have only been getting suggestions that start with
"p".
Even when I try a bigger ngrams value with a longer search, I get the same
results: the suggester only considers the last token when giving
suggestions.

I still could not achieve anything that considers 2 or more tokens when
returning suggestions.

So am I actually heading in the right direction with this?

Regards,
Edwin





Re: Auto-suggest in Solr

2015-06-19 Thread Alessandro Benedetti
Actually the documentation is not clear enough.
Let's try to understand this suggester.

*Building*
This suggester builds an FST that it uses to provide the autocomplete
feature by running prefix searches on it.
The terms it uses to generate the FST are the tokens produced by the
"suggestFreeTextAnalyzerFieldType".

And this should be correct.
So if, to keep it simple, we have a shingle token filter [1-3] (we produce
unigrams as well) in our analysis, from these original field values:
"mp3 ipod"
"mp3 player"
"mp3 player ipod"
"player of Real"

-> we produce this list of possible suggestions in our FST:
















From the documentation I read:

> " ngrams: The max number of tokens out of which singles will be make the
> dictionary. The default value is 2. Increasing this would mean you want
> more than the previous 2 tokens to be taken into consideration when making
> the suggestions. "


This confuses me, as I was not expecting this param to affect the
suggestion dictionary.
So I would like a clarification here from our masters :)
At this point let's see what happens at query time.

*Query Time*
As I understand it, the ngrams param makes the suggester consider the last
N-1 tokens the user typed, separated by spaces.

> "Builds an ngram model from the text sent to {@link #build} and predicts
> based on the last grams-1 tokens in the request sent to {@link #lookup}.
> This tries to handle the "long tail" of suggestions for when the incoming
> query is a never before seen query string."


Example: grams=3 should consider only the last 2 tokens

special mp3 p -> mp3 p
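
To make the truncation step concrete, here is a minimal Python sketch of the interpretation described in this message. The function name last_context is made up for illustration, and this is not necessarily what Lucene's FreeTextSuggester does internally:

```python
def last_context(query, grams):
    """Keep the last grams-1 whitespace-separated tokens of the typed query."""
    keep = grams - 1
    if keep <= 0:
        return ""
    tokens = query.split()
    return " ".join(tokens[-keep:])


print(last_context("special mp3 p", grams=3))  # mp3 p
print(last_context("special mp3 p", grams=2))  # p
```

Under this reading, the number of tokens that survive depends only on the grams value, not on the length of what the user typed.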

Then this query is analysed using the "suggestFreeTextAnalyzerFieldType".
We produce 3 tokens:




And we run the prefix matching on the FST.

*Conclusion*
My understanding is surely wrong at some point, as the behaviour I get is
different.
Can we discuss this, clarify it and eventually put it in the official
documentation?
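
As a rough sketch of the build and lookup steps described above, the following Python builds 1-3 word shingles from the four field values and runs a prefix lookup over them. A sorted list stands in for the FST, and whitespace tokenization plus lowercasing are assumptions of this sketch, not something taken from the actual analyzer chain:

```python
import bisect


def shingles(text, min_size=1, max_size=3):
    """Return the word n-grams of min_size..max_size tokens in text."""
    tokens = text.lower().split()  # assumption: whitespace tokenizer + lowercase
    return [
        " ".join(tokens[start:start + size])
        for size in range(min_size, max_size + 1)
        for start in range(len(tokens) - size + 1)
    ]


def build_terms(field_values):
    """Collect the distinct shingles in sorted order (stand-in for the FST)."""
    return sorted({s for value in field_values for s in shingles(value)})


def prefix_matches(sorted_terms, prefix):
    """Return every term that starts with prefix, using binary search."""
    lo = bisect.bisect_left(sorted_terms, prefix)
    hi = bisect.bisect_left(sorted_terms, prefix + "\uffff")
    return sorted_terms[lo:hi]


terms = build_terms(["mp3 ipod", "mp3 player", "mp3 player ipod", "player of Real"])
print(len(terms))                      # 12
print(prefix_matches(terms, "mp3 p"))  # ['mp3 player', 'mp3 player ipod']
```

In this sketch a lookup for "mp3 p" only returns terms that contain the "mp3" context, which is the behaviour I was expecting from the suggester.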

Cheers

2015-06-19 6:40 GMT+01:00 Zheng Lin Edwin Yeo :

> I'm implementing an auto-suggest feature in Solr, and I'd like to achieve
> the following:
>
> For example, if the user enters "mp3", Solr might suggest "mp3 player",
> "mp3 nano" and "mp3 music".
> When the user enters "mp3 p", the suggestions should narrow down to "mp3
> player".
>
> Currently, when I type "mp3 p", the suggester returns words that
> start with the letter "p" only, and I'm getting results like "plan",
> "production", etc.; it does not take the "mp3" token into consideration.
>
> I'm using Solr 5.1 and below is my configuration:
>
> In solrconfig.xml:
>
> 
>   
>
>  FreeTextLookupFactory
>  suggester_freetext_dir
>
> DocumentDictionaryFactory
> Suggestion
> Project
> suggestType
> 5
> false
> false
>   
> 
>
>
> In schema.xml
>
>  positionIncrementGap="100">
> 
>  pattern="[^a-zA-Z0-9]" replacement=" " />
> 
>  maxShingleSize="6" outputUnigrams="false"/>
> 
> 
>  pattern="[^a-zA-Z0-9]" replacement=" " />
> 
>  maxShingleSize="6" outputUnigrams="true"/>
> 
> 
>
>
> Is there anything that I configured wrongly?
>
>
> Regards,
> Edwin
>





Auto-suggest in Solr

2015-06-18 Thread Zheng Lin Edwin Yeo
I'm implementing an auto-suggest feature in Solr, and I'd like to achieve
the following:

For example, if the user enters "mp3", Solr might suggest "mp3 player",
"mp3 nano" and "mp3 music".
When the user enters "mp3 p", the suggestions should narrow down to "mp3
player".

Currently, when I type "mp3 p", the suggester returns words that
start with the letter "p" only, and I'm getting results like "plan",
"production", etc.; it does not take the "mp3" token into consideration.

I'm using Solr 5.1 and below is my configuration:

In solrconfig.xml:


  

 FreeTextLookupFactory
 suggester_freetext_dir

DocumentDictionaryFactory
Suggestion
Project
suggestType
5
false
false
  



In schema.xml















Is there anything that I configured wrongly?


Regards,
Edwin


Re: Auto suggest with adding accents

2014-08-04 Thread benjelloun
Has anyone found a solution to this problem?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Auto-suggest-with-adding-accents-tp4150379p4150972.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Auto suggest with adding accents

2014-08-01 Thread benjelloun
Hello,
With the new suggester, it does not work when the field is multiValued="true".


I need to try the patch "LUCENE-3842" to test autocomplete, but I don't know
how.
I have Solr 4.7.2, not the source code.
Can someone help?

Best regards,
Anass BENJELLOUN



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Auto-suggest-with-adding-accents-tp4150379p4150609.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Auto suggest with adding accents

2014-08-01 Thread Alexandre Rafalovitch
Perhaps the actual suggester module is a better fit then:

http://blog.mikemccandless.com/2012/09/lucenes-new-analyzing-suggester.html
http://romiawasthy.blogspot.fi/2014/06/configure-solr-suggester.html

Also:
http://jayant7k.blogspot.com/2014/03/an-interesting-suggester-in-solr.html
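
The general idea behind an analyzing suggester, as I understand it from the first link, is to match on a normalized form of the text while returning the original text. A minimal Python sketch of that idea follows; the helper names (fold, build_index, suggest) are made up for illustration, and a plain list is used instead of Solr's actual data structures:

```python
import unicodedata


def fold(text):
    """Lowercase and strip combining marks, e.g. 'Genève' -> 'geneve'."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def build_index(surface_forms):
    """Pair each original (surface) form with its folded matching key."""
    return sorted((fold(s), s) for s in surface_forms)


def suggest(index, query):
    """Match on the folded key, but return the original accented text."""
    key = fold(query)
    return [surface for folded, surface in index if folded.startswith(key)]


index = build_index(["genève", "general", "genetic"])
print(suggest(index, "gene"))   # ['general', 'genetic', 'genève']
print(suggest(index, "genev"))  # ['genève']
```

Both "gene" and "genè" fold to the same key here, so either query would bring back "genève" with its accent intact.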

Regards,
   Alex.
Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853




Re: Auto suggest with adding accents

2014-08-01 Thread Otis Gospodnetic
Aha.  I don't know if Solr Suggester can do that.  Let's see what others
say.  I know http://www.sematext.com/products/autocomplete/ could do that.

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/


On Fri, Aug 1, 2014 at 9:26 AM, benjelloun  wrote:

> Hello,
>
> You didn't understand my problem well; here is an example:
> the document contains the word "genève".
> q="gene" auto-suggestion gives "geneve"
> q="genè" auto-suggestion gives "genève"
>
> But what I need is for q="gene" to suggest "genève", with the accent, like
> a correction of the word.
> I tried adding a spellchecker to correct it, but the maximum number of
> characters it corrects is 2.
> Maybe there is another solution.
> Here is the schema of my field:
>
>  positionIncrementGap="100" omitNorms="true">
> 
> 
> 
>  ignoreCase="true"/>
> 
> 
> 
> 
> 
>   class="solr.StandardTokenizerFactory"/>replacement="$2"/>-->
> 
>  ignoreCase="true"/>
> 
> 
> 
> 
>
> Thanks, best regards,
> Anass BENJELLOUN


Re: Auto suggest with adding accents

2014-08-01 Thread benjelloun
hello,

You didn't understand my problem well; here is an example:
the document contains the word "genève".
q="gene" auto-suggestion gives "geneve"
q="genè" auto-suggestion gives "genève"

But what I need is for q="gene" to suggest "genève", with the accent, like
a correction of the word.
I tried adding a spellchecker to correct it, but the maximum number of
characters it corrects is 2.
Maybe there is another solution.
Here is the schema of my field:











 replacement="$2"/>-->







thanks best regards,
Anass BENJELLOUN








--
View this message in context: 
http://lucene.472066.n3.nabble.com/Auto-suggest-with-adding-accents-tp4150379p4150569.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Auto suggest with adding accents

2014-07-31 Thread Otis Gospodnetic
You need to do the opposite.  Make sure accents are NOT removed at index &
query time.

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/




Re: Auto suggest with adding accents

2014-07-31 Thread benjelloun
Hi,

q="gene" suggests "geneve".
ASCIIFoldingFilter works by stripping the accent.

What I need it to suggest is "genève".

Any idea?

Thanks,
best regards,
Anass BENJELLOUN



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Auto-suggest-with-adding-accents-tp4150379p4150392.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Auto suggest with adding accents

2014-07-31 Thread Ahmet Arslan
Hi,

What happens when you add ASCIIFoldingFilter to field type definition of 
suggestField?

Ahmet




Auto suggest with adding accents

2014-07-31 Thread benjelloun
Hello,

I'm trying to autosuggest French words with accents,
but if the user writes q="gene" it will not suggest "genève"; it will suggest
"general", "genetic" ...



  suggestDic
  org.apache.solr.spelling.suggest.Suggester
  org.apache.solr.spelling.suggest.fst.WFSTLookupFactory
  suggestFolder
  suggestField  
  true
  true
   suggest/emptyDic.txt

textSuggest
  
  
  

  suggests
  true
  
  suggestDic
  true
  6   
  true
  6 
  true  


  suggests

  
 
The field "suggestField" does not strip accents.

Thanks for help,

Best regards,
Anass BENJELLOUN




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Auto-suggest-with-adding-accents-tp4150379.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Auto Suggest

2014-07-28 Thread benjelloun
Hello Erick,

So in your opinion, what is the solution for using autosuggest with sentences? :)
An example would be very helpful.

Thanks,
best regards,
Anass BENJELLOUN



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Auto-Suggest-tp4149004p4149441.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Auto Suggest

2014-07-27 Thread Erick Erickson
No, although there's been some joy with using shingles. Autosuggest
works off of the _indexed tokens_. So the problem is really reducing
the tokenization to something that is multi-word.

Best,
Erick


On Thu, Jul 24, 2014 at 5:11 AM, benjelloun  wrote:

> Hello,
>
> Does solr.SuggestComponent work on a multiValued field to auto-suggest not
> only one word but the whole sentence?
>
>  indexed="true"/>
>
> Regards,
> Anass BENJELLOUN
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Auto-Suggest-tp4149004.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Auto Suggest

2014-07-24 Thread benjelloun
Hello,

Does solr.SuggestComponent work on a multiValued field to auto-suggest not only
one word but the whole sentence?



Regards,
Anass BENJELLOUN



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Auto-Suggest-tp4149004.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Auto Suggest - Time decay

2013-10-01 Thread Ing. Jorge Luis Betancourt Gonzalez
Sorry, I forgot the link:

[1] - http://wiki.apache.org/solr/SolrRelevancyFAQ


III International Winter School at UCI, 17 to 28 February 2014. See www.uci.cu


Re: Auto Suggest - Time decay

2013-10-01 Thread Ing. Jorge Luis Betancourt Gonzalez
For that core, just use a boost factor as explained in [1]:

You could use a query like this to see (before making any change) how your
suggestions will be retrieved. In this case a query for "goog" has been made,
and recent documents will be boosted (an extra bonus will be given to the
newer documents).

http://localhost:8983/solr/select?q={!boost 
b=recip(ms(NOW,manufacturedate_dt),3.16e-11,1,1)}goog

If this is enough for you you could poot the boost parameter in your request 
handler and make it even simpler so any query againsta this particular request 
handler will be automatically boosted by date.

PS: You could tweak the above formula used in the boost parameter for a more 
suitable to your needs.
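
To get a feel for what that boost does: Solr's recip(x,m,a,b) function 
evaluates a/(m*x+b), and 3.16e-11 is roughly 1 divided by the number of 
milliseconds in a year. The Python sketch below only mirrors that arithmetic 
outside Solr; the function name recip_boost and the sample ages are made up 
for illustration.

```python
# Sketch of the arithmetic behind recip(ms(NOW,manufacturedate_dt),3.16e-11,1,1).
# Solr evaluates recip(x, m, a, b) as a / (m * x + b); everything else here
# (function name, sample ages) is illustrative only.

MS_PER_DAY = 24 * 60 * 60 * 1000


def recip_boost(age_ms, m=3.16e-11, a=1.0, b=1.0):
    """Return the multiplicative boost for a document that is age_ms old."""
    return a / (m * age_ms + b)


if __name__ == "__main__":
    # Print the boost for a few document ages to see how quickly it falls off.
    for days in (0, 30, 365, 730):
        print(f"{days:>4} days old -> boost {recip_boost(days * MS_PER_DAY):.3f}")
```

With these constants a brand-new document keeps its full score and a 
one-year-old document is multiplied by about one half, so changing m is the 
simplest way to make the decay faster or slower.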

- Original message -
From: "SolrLover" 
To: solr-user@lucene.apache.org
Sent: Tuesday, October 1, 2013 12:19:51
Subject: Re: Auto Suggest - Time decay

I am using a totally separate core for storing the auto suggest keywords.

Would you be able to send me some more details on your implementation? 



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Auto-Suggest-Time-decay-tp4092965p4092969.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Auto Suggest - Time decay

2013-10-01 Thread SolrLover
I am using a totally separate core for storing the auto suggest keywords.

Would you be able to send me some more details on your implementation? 



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Auto-Suggest-Time-decay-tp4092965p4092969.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Auto Suggest - Time decay

2013-10-01 Thread Ing. Jorge Luis Betancourt Gonzalez
Are you using the Suggester component, or a separate core? I've used a 
separate core to store suggestions (queries performed on the frontend) and 
ordered those suggestions using a time decay function, and it works great for 
me.

Regards,

- Original message -
From: "SolrLover" 
To: solr-user@lucene.apache.org
Sent: Tuesday, October 1, 2013 12:12:13
Subject: Auto Suggest - Time decay

I am trying to implement an auto suggest based on a time decay function. I have
a separate index just to store auto suggest keywords.

I would be calculating the frequency over time rather than ranking on
frequency alone. 

I am thinking of using a database to perform the calculation and update the
SOLR index with the boost calculated by the time decay function. I am not
sure if there is a better way to do this...

I need to boost the terms based on their frequency over time.

Ex: when someone searches for 'apple' 1 times during an iPhone launch
(one particular day), that shouldn't make 'apple' always come up in the auto
suggestions when someone types in the keyword 'a'; rather, it should
lose its popularity exponentially.

Does anyone have any suggestions?




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Auto-Suggest-Time-decay-tp4092965.html
Sent from the Solr - User mailing list archive at Nabble.com.



Auto Suggest - Time decay

2013-10-01 Thread SolrLover
I am trying to implement an auto suggest based on a time decay function. I have
a separate index just to store auto suggest keywords.

I would be calculating the frequency over time rather than ranking on
frequency alone. 

I am thinking of using a database to perform the calculation and update the
SOLR index with the boost calculated by the time decay function. I am not
sure if there is a better way to do this...

I need to boost the terms based on their frequency over time.

Ex: when someone searches for 'apple' 1 times during an iPhone launch
(one particular day), that shouldn't make 'apple' always come up in the auto
suggestions when someone types in the keyword 'a'; rather, it should
lose its popularity exponentially.

Does anyone have any suggestions?
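
One way to picture "frequency over time" is an exponentially decayed count, in
which each past search contributes 0.5 ** (age / half_life) instead of 1. The
sketch below is only an assumption about how such a score could be computed
before it is pushed to the suggestion index; the names decayed_score and
half_life_days are hypothetical and nothing in it is Solr API.

```python
# Hypothetical sketch: exponentially decayed popularity for a suggestion term.
# Each search event counts for 0.5 ** (age_days / half_life_days), so a burst
# of searches on a single day fades instead of dominating forever.


def decayed_score(event_ages_days, half_life_days=7.0):
    """Sum the decayed weights of search events, given their ages in days."""
    return sum(0.5 ** (age / half_life_days) for age in event_ages_days)


if __name__ == "__main__":
    launch_burst = [60.0] * 1000                  # 1000 searches, all 60 days ago
    steady_term = [float(d) for d in range(30)]   # one search a day for 30 days
    print("burst :", round(decayed_score(launch_burst), 2))
    print("steady:", round(decayed_score(steady_term), 2))
```

With a one-week half-life, the old burst ends up scoring lower than a term
that is searched only once a day but consistently, which is the behaviour
described in the example above.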




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Auto-Suggest-Time-decay-tp4092965.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Auto-Suggest, spell check dictionary replication to slave issue

2013-06-06 Thread bbarani
It seems this feature has not been implemented yet:

https://issues.apache.org/jira/browse/SOLR-866



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Auto-Suggest-spell-check-dictionary-replication-to-slave-issue-tp4068562p4068739.html
Sent from the Solr - User mailing list archive at Nabble.com.


Auto-Suggest, spell check dictionary replication to slave issue

2013-06-06 Thread msreddy.hi
Hi All,

We create two dictionaries from an indexed field for the auto-suggest and spell
check features. When we configured replication from the master to the slaves,
the index replicates properly but the auto-suggest and spell check dictionaries
do not.

Is there a way to replicate the auto-suggest and spell check dictionaries
outside the index directory?

Please suggest.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Auto-Suggest-spell-check-dictionary-replication-to-slave-issue-tp4068562.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Where is the auto-suggest function gone?

2013-03-07 Thread alecx
Hello Upayavira, thanks for your reply.

In the example I can see the suggestions "dollar" and "dock" when I type
"do" in Solritas (http://localhost:8983/solr/collection1/browse?q=).

I have already changed the spellchecker's field from "name": I checked the
name field in the admin section and it holds no data for my indexed content,
so there is nothing to suggest.
Then I checked which field does contain data and set that field name as the
spellcheck field, but nothing happened - still no suggestions ...




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Where-is-the-auto-suggest-function-gone-tp4045520p4045531.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Where is the auto-suggest function gone?

2013-03-07 Thread Upayavira
Are you thinking of spellchecking? Where are you seeing suggestions?

If you are thinking of spellchecking, by default the spellchecker uses
the 'name' field, and you have likely indexed into the 'text' field,
hence no results being returned.

Upayavira

On Thu, Mar 7, 2013, at 01:12 PM, alecx wrote:
> Hi,
> 
> I just indexed the sample documents in the exampledocs folder and saw the
> search suggestions when I search for something in /browse.
> Afterwards I deleted the index (like described..) and indexed a folder of
> html+pdf files. Searching works but there are no suggestions.
> What do I need to adjust to make this work again?
> 
> Thanks in advance.
> 
> 
> 
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Where-is-the-auto-suggest-function-gone-tp4045520.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Where is the auto-suggest function gone?

2013-03-07 Thread alecx
Hi,

I just indexed the sample documents in the exampledocs folder and saw the
search suggestions when I search for something in /browse.
Afterwards I deleted the index (like described..) and indexed a folder of
html+pdf files. Searching works but there are no suggestions.
What do I need to adjust to make this work again?

Thanks in advance.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Where-is-the-auto-suggest-function-gone-tp4045520.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Prefix (facet.prefix) based auto-suggest on Multi-Valued field do not return results

2012-08-17 Thread Rajani Maski
Hi,

I think this is because of the space observed when facet.prefix is an empty
string. Please see below:


3
3
3
3
3
... so on

*But why is this space inserted?* As you can see below, there is no space in
the list of keywords taken from the search results.

Thanks & Regards
Rajani



On Fri, Aug 17, 2012 at 3:02 PM, Rajani Maski  wrote:

> Hi All,
>
> When I use facet.prefix on the KEYWORDS field (which is multi-valued), I don't
> get suggestions for the first value in this field.
>
> Example  :
>
> I have 2 documents with the field "KEYWORDS"  containing multiple values.
>
> 
> 偏振式3D成像原理
> 采用LED边缘发光的新技术
> 高级降噪运算法及画质增强技术可
> 
>
> 
> 紧凑机身,轻松携带
> 节能低耗,持久续航
> 
>
>
>
> If I use facet.prefix on the later values, I get the respective suggestions.
>
> BUT if I use facet.prefix on the first value (highlighted in red in the original
> message), e.g. facet.field=KEYWORDS&facet.prefix=偏振, there are no suggestions.
>
>
>
> What can be the reason?
>
>
>
>
>
> Thanks & Regards
> Rajani
>
>
>
>


Prefix (facet.prefix) based auto-suggest on Multi-Valued field do not return results

2012-08-17 Thread Rajani Maski
Hi All,

When I use facet.prefix on the KEYWORDS field (which is multi-valued), I don't
get suggestions for the first value in this field.

Example  :

I have 2 documents with the field "KEYWORDS"  containing multiple values.


偏振式3D成像原理
采用LED边缘发光的新技术
高级降噪运算法及画质增强技术可



紧凑机身,轻松携带
节能低耗,持久续航




If I use facet.prefix on the later values, I get the respective suggestions.

BUT if I use facet.prefix on the first value (highlighted in red in the original
message), e.g. facet.field=KEYWORDS&facet.prefix=偏振, there are no suggestions.



What can be the reason?





Thanks & Regards
Rajani


Re: Auto suggest on indexed file content filtered based on user

2012-04-25 Thread Dmitry Kan
On Wed, Apr 25, 2012 at 8:18 AM, prakash_ajp  wrote:

> Is it true that faceting is case sensitive? That would be disastrous for
> our
> requirement :(
>
>
It depends on your schema definition: if you lowercase your tokens on both
the index and query sides, faceting should not be case sensitive.


> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Auto-suggest-on-indexed-file-content-filtered-based-on-user-tp3934565p3937370.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>



-- 
Regards,

Dmitry Kan


Re: Auto suggest on indexed file content filtered based on user

2012-04-24 Thread prakash_ajp
Is it true that faceting is case sensitive? That would be disastrous for our
requirement :(

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Auto-suggest-on-indexed-file-content-filtered-based-on-user-tp3934565p3937370.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Auto suggest on indexed file content filtered based on user

2012-04-24 Thread prakash_ajp
The first one may not work because the number of users can be big. Besides,
the users can simply register themselves and start using it. It won't work
if an admin has to intervene in the registration process.

The second could work I guess. But the problem would be data duplication as
users might also share permissions to same files and folders. I understand
my requirement is a little complicated.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Auto-suggest-on-indexed-file-content-filtered-based-on-user-tp3934565p3937368.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Auto suggest on indexed file content filtered based on user

2012-04-24 Thread Doug Mittendorf
Another option is to use faceting (via the facet.prefix param) for your 
auto-suggest.  It's not as fast and scalable as using one of the 
Suggester implementations, but it does allow arbitrary fq parameters to 
be included in the request to limit the results.


http://wiki.apache.org/solr/SimpleFacetParameters#Facet_prefix_.28term_suggest.29
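
As a rough sketch of what such a request could look like, the Python below
builds a facet.prefix query restricted by an fq filter. The field names Text
and UserName come from this thread; the host, handler path and the helper name
suggest_url are assumptions, and the lowercasing only makes sense if the
faceted field is lowercased at index time.

```python
# Sketch: build a facet.prefix auto-suggest request restricted by an fq filter.
# Host, handler path and helper name are assumptions; adjust them to your setup.
from urllib.parse import urlencode


def suggest_url(prefix, user, base="http://localhost:8983/solr/select"):
    """Return a URL asking for terms in Text that start with prefix."""
    params = [
        ("q", "*:*"),
        ("rows", "0"),                     # only the facet counts are needed
        ("fq", f'UserName:"{user}"'),      # limit terms to this user's documents
        ("facet", "true"),
        ("facet.field", "Text"),
        ("facet.prefix", prefix.lower()),  # assumes a lowercased field
        ("facet.limit", "10"),
    ]
    return f"{base}?{urlencode(params)}"


if __name__ == "__main__":
    print(suggest_url("Text", "prakash"))
```

The user name is inserted into the filter as-is, so a real application would
need to escape quotes and other query syntax before building the fq value.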

Doug

On 04/24/2012 04:30 PM, Erick Erickson wrote:

I don't know if there is a really good solution here. The problem is that
suggester (and the trunk FST version) simply traverse the terms in
the index. there's not even a real concept of those terms belonging to
any document. Since your security level is on a document basis, that
makes things hard.

How many users do you have? And do you ever expect to search
across more than one user's files? If not, you could consider having
one core per user. Then the suggestions would be correct and since
the searches would be against the user's core, they'd never see
any documents they didn't own.

But that solution has some complexity involved, and if you have a zillion
users it can be difficult to get right.

You could consider having separate (dynamically-defined) fields that
hold the suggestion list for each individual user. That would be
administratively easier. Then your suggestions would simply go against
that user's suggestion field (suggestion_user1 e.g.).

None of this is elegant, but this is not an elegant problem given how
Solr is structured.

Best
Erick

On Tue, Apr 24, 2012 at 2:31 PM, prakash_ajp  wrote:

I read on a couple of other web pages that fq is not supported for suggester.
I even tried the query and it doesn't help. My understanding was, when the
suggest (spellcheck) index is built, only the field chosen is considered for
queries and the other fields from the main index are not available for
filtering purposes once the index is created.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Auto-suggest-on-indexed-file-content-filtered-based-on-user-tp3934565p3936144.html
Sent from the Solr - User mailing list archive at Nabble.com.




Re: Auto suggest on indexed file content filtered based on user

2012-04-24 Thread Erick Erickson
I don't know if there is a really good solution here. The problem is that
suggester (and the trunk FST version) simply traverse the terms in
the index. there's not even a real concept of those terms belonging to
any document. Since your security level is on a document basis, that
makes things hard.

How many users do you have? And do you ever expect to search
across more than one user's files? If not, you could consider having
one core per user. Then the suggestions would be correct and since
the searches would be against the user's core, they'd never see
any documents they didn't own.

But that solution has some complexity involved, and if you have a zillion
users it can be difficult to get right.

You could consider having separate (dynamically-defined) fields that
hold the suggestion list for each individual user. That would be
administratively easier. Then your suggestions would simply go against
that user's suggestion field (suggestion_user1 e.g.).

None of this is elegant, but this is not an elegant problem given how
Solr is structured.

Best
Erick

On Tue, Apr 24, 2012 at 2:31 PM, prakash_ajp  wrote:
> I read on a couple of other web pages that fq is not supported for suggester.
> I even tried the query and it doesn't help. My understanding was, when the
> suggest (spellcheck) index is built, only the field chosen is considered for
> queries and the other fields from the main index are not available for
> filtering purposes once the index is created.
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Auto-suggest-on-indexed-file-content-filtered-based-on-user-tp3934565p3936144.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: Auto suggest on indexed file content filtered based on user

2012-04-24 Thread Jeevanandam Madanagopal
Yes, only the field used to build the spellcheck index is available to suggest 
queries. I believe we are discussing two separate parts here: filtering 
documents in the search handler with the fq parameter, and spell suggestions.

Let's say you have a field for spellcheck, used to build the spell dictionary:



Use copyField to populate the spell field so that the dictionary gets created.

Then reference the spellcheck component in the default search handler's 
'last-components' section, like below:
 <arr name="last-components">
   <str>spellcheck</str>
 </arr>

You will then be able to apply both document filtering and spellcheck params 
to the search handler while querying. 

Detailed info: http://wiki.apache.org/solr/SpellCheckComponent [you have 
probably already gone through it :) ]

-Jeevanandam


On Apr 25, 2012, at 12:01 AM, prakash_ajp wrote:

> I read on a couple of other web pages that fq is not supported for suggester.
> I even tried the query and it doesn't help. My understanding was, when the
> suggest (spellcheck) index is built, only the field chosen is considered for
> queries and the other fields from the main index are not available for
> filtering purposes once the index is created.
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Auto-suggest-on-indexed-file-content-filtered-based-on-user-tp3934565p3936144.html
> Sent from the Solr - User mailing list archive at Nabble.com.



Re: Auto suggest on indexed file content filtered based on user

2012-04-24 Thread prakash_ajp
I read on a couple of other web pages that fq is not supported for suggester.
I even tried the query and it doesn't help. My understanding was, when the
suggest (spellcheck) index is built, only the field chosen is considered for
queries and the other fields from the main index are not available for
filtering purposes once the index is created.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Auto-suggest-on-indexed-file-content-filtered-based-on-user-tp3934565p3936144.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: Auto suggest on indexed file content filtered based on user

2012-04-24 Thread Klostermeyer, Michael
I'm new to Solr, but I would think the fq=[username] would work here.

http://wiki.apache.org/solr/CommonQueryParameters#fq

Mike

-Original Message-
From: prakash_ajp [mailto:prakash_...@yahoo.com] 
Sent: Tuesday, April 24, 2012 11:07 AM
To: solr-user@lucene.apache.org
Subject: Re: Auto suggest on indexed file content filtered based on user

Right now, the query is a very simple one, something like q=text. Basically, it 
would return ['textview', 'textviewer', ..]

But the issue is, the 'textviewer' could be from a file that is out of bounds 
for this user. So, ultimately I would like to include the userName in the 
query. As mentioned earlier, userName is another field in the main index.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Auto-suggest-on-indexed-file-content-filtered-based-on-user-tp3934565p3935765.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Auto suggest on indexed file content filtered based on user

2012-04-24 Thread Jeevanandam Madanagopal
On Apr 24, 2012, at 9:37 PM, prakash_ajp wrote:

> Right now, the query is a very simple one, something like q=text. Basically,
> it would return ['textview', 'textviewer', ..]
   hmm, so you're using default query field

> 
> But the issue is, the 'textviewer' could be from a file that is out of
> bounds for this user. So, ultimately I would like to include the userName in
> the query. As mentioned earlier, userName is another field in the main
> index.
   and you like to filter the result set along with userName field value
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Auto-suggest-on-indexed-file-content-filtered-based-on-user-tp3934565p3935765.html
> Sent from the Solr - User mailing list archive at Nabble.com.

In this scenario the 'fq' parameter will help you achieve the desired result.
Please refer to http://wiki.apache.org/solr/CommonQueryParameters#fq

Try this:   q=text&fq=userName:"prakash"

Let us know!

-Jeevanandam



Re: Auto suggest on indexed file content filtered based on user

2012-04-24 Thread prakash_ajp
Right now, the query is a very simple one, something like q=text. Basically,
it would return ['textview', 'textviewer', ..]

But the issue is, the 'textviewer' could be from a file that is out of
bounds for this user. So, ultimately I would like to include the userName in
the query. As mentioned earlier, userName is another field in the main
index.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Auto-suggest-on-indexed-file-content-filtered-based-on-user-tp3934565p3935765.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Auto suggest on indexed file content filtered based on user

2012-04-24 Thread Jeevanandam


can you please share a sample query?

-Jeevanandam


On 24-04-2012 1:49 pm, prakash_ajp wrote:
I am trying to implement an auto-suggest feature. The search feature already
exists and searches on file content in user's allotted workspace.

The following is from my schema that will be used for search indexing:

   
   

The search result is filtered by the user name. The suggest is implemented
as a searchComponent and the field 'Text' is used by the suggester and would
have to be filtered the same way the search is done. The problem with this
approach is, suggest works on a single field and there is no way to include
the UserName field as a filter.

What's the best way out from here?

Thanks in advance!
Jay

--
View this message in context:

http://lucene.472066.n3.nabble.com/Auto-suggest-on-indexed-file-content-filtered-based-on-user-tp3934565p3934565.html
Sent from the Solr - User mailing list archive at Nabble.com.


Auto suggest on indexed file content filtered based on user

2012-04-24 Thread prakash_ajp
I am trying to implement an auto-suggest feature. The search feature already
exists and searches on file content in user's allotted workspace.

The following is from my schema that will be used for search indexing:

   
   

The search result is filtered by the user name. The suggest is implemented
as a searchComponent and the field 'Text' is used by the suggester and would
have to be filtered the same way the search is done. The problem with this
approach is, suggest works on a single field and there is no way to include
the UserName field as a filter.

What's the best way out from here?

Thanks in advance!
Jay

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Auto-suggest-on-indexed-file-content-filtered-based-on-user-tp3934565p3934565.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Facet auto-suggest

2012-01-17 Thread Jan Høydahl
Hi,

Sure, you can use filters and facets for this. Start a query with 
...&facet.field=source&facet.field=topics&facet.field=type
When you click a "button", you set the corresponding filter (fq=source:people), 
and the new query will return the same facets with new counts. In the Audi 
example, you would disable buttons with 0 hits in the facet count.
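
A minimal sketch of the client-side half of this idea follows. It assumes the
default JSON response layout, where each entry under facet_fields is a flat
list of [value, count, value, count, ...], and that facet.mincount is left at
0 so zero-count values are returned at all. The sample data and the function
name dead_end_values are made up.

```python
# Sketch: decide which buttons to disable from a Solr facet response.
# Assumes the flat [value, count, value, count, ...] layout of facet_fields;
# the sample response below is invented for illustration.


def dead_end_values(facet_fields):
    """Map each facet field to the set of values whose count is 0."""
    disabled = {}
    for field, flat in facet_fields.items():
        pairs = zip(flat[0::2], flat[1::2])   # (value, count) pairs
        disabled[field] = {value for value, count in pairs if count == 0}
    return disabled


if __name__ == "__main__":
    sample = {
        "source": ["people", 12, "performance", 4],
        "type": ["videos", 7, "photos", 0],
    }
    print(dead_end_values(sample))
```

Each button whose value appears in the returned set would lead to an empty
result with the currently selected filters, so the UI can grey it out.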

For more in depth, see http://java.dzone.com/news/complex-solr-faceting

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr Training - www.solrtraining.com

On 17. jan. 2012, at 23:38, Jon Drukman wrote:

> I don't even know what to call this feature. Here's a website that shows
> the problem:
> 
> http://pulse.audiusanews.com/pulse/index.php
> 
> Notice that you can end up in a situation where there are no results.
> For example,
> in order, press: People, Performance, Technology, Photos. The client
> wants it so that when you click a button, it disables buttons that would
> lead to a dead end. In other words, after clicking Technology, the Photos
> button would be disabled.
> 
> Can Solr help with this?
> 
> -jsd-
> 



Facet auto-suggest

2012-01-17 Thread Jon Drukman
I don't even know what to call this feature. Here's a website that shows
the problem:

http://pulse.audiusanews.com/pulse/index.php

Notice that you can end up in a situation where there are no results.
For example,
in order, press: People, Performance, Technology, Photos. The client
wants it so that when you click a button, it disables buttons that would
lead to a dead end. In other words, after clicking Technology, the Photos
button would be disabled.

Can Solr help with this?

-jsd-



RE: Ebay Kleinanzeigen and Auto Suggest

2011-05-03 Thread Andy

--- On Tue, 5/3/11, Charton, Andre  wrote:
> 
> yes we do. 
> 
> If you use a limit number of categories (like 100) you can
> use dynamic fields with the termscomponent and by choosing a
> category specific prefix, like:
> 
> {schema.xml}
> ...
>  indexed="true" stored="false" multiValued="true"
> omitNorms="true"/>
> ...
> {schema.xml}
> 
> And within data import handler we script prefix from given
> category:
> 
> {data-config.xml}
>         function setCatPrefixFields(row) {
>             var catId = row.get('category');
>             var title = row.get('freetext');
>             var cat_prefix = "c" + catId + "_suggestion";
>             return row;
>         }
> {data-config.xml}
> 
> Then you we adapt these in our application layer by a
> specific request handler, regarding these prefix.
> 
> Pro:
>     - works fine for limit number of
> categories
> 
> Con:
>     - index is getting bigger, we measure
> increasing by ~40 percent


Very interesting.

Why did the index get bigger? You're still indexing the same title, just to 
different dynamic fields, right? So the total amount of data indexed should 
still be the same. Adding dynamic fields shouldn't increase the index size. 
What am I missing?

Andy


RE: Ebay Kleinanzeigen and Auto Suggest

2011-05-03 Thread Charton, Andre
Hi,

yes we do. 

If you have a limited number of categories (like 100), you can use dynamic 
fields with the TermsComponent and choose a category-specific prefix, like:

{schema.xml}
...

...
{schema.xml}

And within data import handler we script prefix from given category:

{data-config.xml}
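function setCatPrefixFields(row) {
    var catId = row.get('category');
    var title = row.get('freetext');
    // Dynamic field name for this category, e.g. "c42_suggestion"
    var cat_prefix = "c" + catId + "_suggestion";
    // Assumed step, not shown in the archived message: copy the title into
    // the category-specific dynamic field so it actually gets indexed.
    row.put(cat_prefix, title);
    return row;
}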
{data-config.xml}

We then handle these in our application layer with a specific request handler 
that takes these prefixes into account.

Pro:
- works fine for a limited number of categories

Con:
- the index gets bigger; we measured an increase of ~40 percent
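
To illustrate how the application layer might pick the right field, here is a
hedged Python sketch that derives the per-category field name and builds a
TermsComponent request for it. The "c<id>_suggestion" naming follows the
script above; the /terms handler URL matches the one used elsewhere in this
thread, and the helper names are hypothetical.

```python
# Sketch: query the TermsComponent against a category-specific dynamic field.
# Field naming follows the "c" + catId + "_suggestion" convention above;
# helper names and the base URL are assumptions.
from urllib.parse import urlencode


def suggestion_field(category_id):
    """Return the dynamic field that holds suggestions for one category."""
    return f"c{category_id}_suggestion"


def terms_url(category_id, prefix, base="http://localhost:8983/solr/terms"):
    """Build a terms request limited to the given category's field."""
    params = [
        ("terms.fl", suggestion_field(category_id)),
        ("terms.prefix", prefix),
        ("terms.sort", "count"),
    ]
    return f"{base}?{urlencode(params)}"


if __name__ == "__main__":
    print(terms_url(42, "audi"))
```

Because the category is encoded in the field name, no filter query is needed,
which is the point of the approach; the cost is one extra indexed field per
category.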

Regards

André Charton


-Original Message-
From: Eric Grobler [mailto:impalah...@googlemail.com] 
Sent: Wednesday, April 27, 2011 9:56 AM
To: solr-user@lucene.apache.org
Subject: Re: Ebay Kleinanzeigen and Auto Suggest

Hi Otis,

The new Solr 3.1 Suggester also does not support filter queries.

Is anyone using shingles with faceting on large data?

Regards
Ericz

On Tue, Apr 26, 2011 at 10:06 PM, Otis Gospodnetic <
otis_gospodne...@yahoo.com> wrote:

> Hi Eric,
>
> Before using the terms component, allow me to point out:
>
> * http://sematext.com/products/autocomplete/index.html (used on
> http://search-lucene.com/ for example)
>
> * http://wiki.apache.org/solr/Suggester
>
>
> Otis
> 
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> Lucene ecosystem search :: http://search-lucene.com/
>
>
>
> - Original Message 
> > From: Eric Grobler 
> > To: solr-user@lucene.apache.org
> > Sent: Tue, April 26, 2011 1:11:11 PM
> > Subject: Ebay Kleinanzeigen and Auto Suggest
> >
> > Hi
> >
> > Someone told me that ebay is using solr.
> > I was looking at their  Auto Suggest implementation and I guess they are
> > using Shingles and the  TermsComponent.
> >
> > I managed to get a satisfactory implementation but I have  a problem with
> > category specific filtering.
> > Ebay suggestions are sensitive  to categories like Cars and Pets.
> >
> > As far as I understand it is not  possible to using filters with a term
> > query.
> > Unless one uses multiple  fields or special prefixes for the words to
> index I
> > cannot think how to  implement this.
> >
> > Is their perhaps a workaround for this  limitation?
> >
> > Best  Regards
> > EricZ
> >
> > ---
> >
> > I am have  a shingle type like:
> >  > positionIncrementGap="100">
> > 
> >
> > > maxShingleSize="4"  />
> >
> >
> > 
> >
> >
> >
> > and a query like
> >
> http://localhost:8983/solr/terms?q=*%3A*&terms.fl=suggest_text&terms.sort=count&terms.prefix=audi
> >i
> >
>


Re: Ebay Kleinanzeigen and Auto Suggest

2011-04-27 Thread Eric Grobler
Hi Otis,

The new Solr 3.1 Suggester also does not support filter queries.

Is anyone using shingles with faceting on large data?

Regards
Ericz

On Tue, Apr 26, 2011 at 10:06 PM, Otis Gospodnetic <
otis_gospodne...@yahoo.com> wrote:

> Hi Eric,
>
> Before using the terms component, allow me to point out:
>
> * http://sematext.com/products/autocomplete/index.html (used on
> http://search-lucene.com/ for example)
>
> * http://wiki.apache.org/solr/Suggester
>
>
> Otis
> 
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> Lucene ecosystem search :: http://search-lucene.com/
>
>
>
> - Original Message 
> > From: Eric Grobler 
> > To: solr-user@lucene.apache.org
> > Sent: Tue, April 26, 2011 1:11:11 PM
> > Subject: Ebay Kleinanzeigen and Auto Suggest
> >
> > Hi
> >
> > Someone told me that ebay is using solr.
> > I was looking at their  Auto Suggest implementation and I guess they are
> > using Shingles and the  TermsComponent.
> >
> > I managed to get a satisfactory implementation but I have  a problem with
> > category specific filtering.
> > Ebay suggestions are sensitive  to categories like Cars and Pets.
> >
> > As far as I understand it is not  possible to using filters with a term
> > query.
> > Unless one uses multiple  fields or special prefixes for the words to
> index I
> > cannot think how to  implement this.
> >
> > Is their perhaps a workaround for this  limitation?
> >
> > Best  Regards
> > EricZ
> >
> > ---
> >
> > I am have  a shingle type like:
> >  > positionIncrementGap="100">
> > 
> >
> > > maxShingleSize="4"  />
> >
> >
> > 
> >
> >
> >
> > and a query like
> >
> http://localhost:8983/solr/terms?q=*%3A*&terms.fl=suggest_text&terms.sort=count&terms.prefix=audi
> >i
> >
>


Re: Ebay Kleinanzeigen and Auto Suggest

2011-04-26 Thread Eric Grobler
Thanks for the links Otis,

I will have a look.

Regards
Ericz

On Tue, Apr 26, 2011 at 10:06 PM, Otis Gospodnetic <
otis_gospodne...@yahoo.com> wrote:

> Hi Eric,
>
> Before using the terms component, allow me to point out:
>
> * http://sematext.com/products/autocomplete/index.html (used on
> http://search-lucene.com/ for example)
>
> * http://wiki.apache.org/solr/Suggester
>
>
> Otis
> 
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> Lucene ecosystem search :: http://search-lucene.com/
>
>
>
> - Original Message 
> > From: Eric Grobler 
> > To: solr-user@lucene.apache.org
> > Sent: Tue, April 26, 2011 1:11:11 PM
> > Subject: Ebay Kleinanzeigen and Auto Suggest
> >
> > Hi
> >
> > Someone told me that ebay is using solr.
> > I was looking at their  Auto Suggest implementation and I guess they are
> > using Shingles and the  TermsComponent.
> >
> > I managed to get a satisfactory implementation but I have  a problem with
> > category specific filtering.
> > Ebay suggestions are sensitive  to categories like Cars and Pets.
> >
> > As far as I understand it is not  possible to using filters with a term
> > query.
> > Unless one uses multiple  fields or special prefixes for the words to
> index I
> > cannot think how to  implement this.
> >
> > Is their perhaps a workaround for this  limitation?
> >
> > Best  Regards
> > EricZ
> >
> > ---
> >
> > I am have  a shingle type like:
> >  > positionIncrementGap="100">
> > 
> >
> > > maxShingleSize="4"  />
> >
> >
> > 
> >
> >
> >
> > and a query like
> >
> http://localhost:8983/solr/terms?q=*%3A*&terms.fl=suggest_text&terms.sort=count&terms.prefix=audi
> >i
> >
>


Re: Ebay Kleinanzeigen and Auto Suggest

2011-04-26 Thread Otis Gospodnetic
Hi Eric,

Before using the terms component, allow me to point out:

* http://sematext.com/products/autocomplete/index.html (used on 
http://search-lucene.com/ for example)

* http://wiki.apache.org/solr/Suggester


Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



- Original Message 
> From: Eric Grobler 
> To: solr-user@lucene.apache.org
> Sent: Tue, April 26, 2011 1:11:11 PM
> Subject: Ebay Kleinanzeigen and Auto Suggest
> 
> Hi
> 
> Someone told me that ebay is using solr.
> I was looking at their  Auto Suggest implementation and I guess they are
> using Shingles and the  TermsComponent.
> 
> I managed to get a satisfactory implementation but I have  a problem with
> category specific filtering.
> Ebay suggestions are sensitive  to categories like Cars and Pets.
> 
> As far as I understand it is not  possible to using filters with a term
> query.
> Unless one uses multiple  fields or special prefixes for the words to index I
> cannot think how to  implement this.
> 
> Is their perhaps a workaround for this  limitation?
> 
> Best  Regards
> EricZ
> 
> ---
> 
> I am have  a shingle type like:
>  positionIncrementGap="100">
> 
>
> maxShingleSize="4"  />
>
>
> 
> 
> 
> 
> and a query like
>http://localhost:8983/solr/terms?q=*%3A*&terms.fl=suggest_text&terms.sort=count&terms.prefix=audi
>i
> 


Ebay Kleinanzeigen and Auto Suggest

2011-04-26 Thread Eric Grobler
Hi

Someone told me that ebay is using solr.
I was looking at their Auto Suggest implementation and I guess they are
using Shingles and the TermsComponent.

I managed to get a satisfactory implementation but I have a problem with
category specific filtering.
Ebay suggestions are sensitive to categories like Cars and Pets.

As far as I understand it is not possible to use filters with a term
query.
Unless one uses multiple fields or special prefixes for the words to index I
cannot think how to implement this.

Is there perhaps a workaround for this limitation?

Best Regards
EricZ

---

I have a shingle type like:


  
   
   
  




and a query like
http://localhost:8983/solr/terms?q=*%3A*&terms.fl=suggest_text&terms.sort=count&terms.prefix=audi


Re: EdgeNgram Auto suggest - doubles ignore

2011-02-08 Thread Erick Erickson
I'm afraid I'll have to pass, I'm absolutely swamped at the moment. Perhaps
someone else can pick it up.

I will say that you should be getting terms back when you pre-lower-case
them, so look in your index via the admin page or Luke to see whether what's
really in the "name" field is what you think it is.

As for sorting, I haven't a clue. Start by backing out your custom sorting,
verify that things are as you expect for everything *except* sorting, and
then add it back in.

Best
Erick



On Tue, Feb 8, 2011 at 10:11 AM, johnnyisrael wrote:

>
> Hi Erick,
>
> If you have time, Can you please take a look and provide your comments (or)
> suggestions for this problem?
>
> Please let me know if you need any more information.
>
> Thanks,
>
> Johnny
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/EdgeNgram-Auto-suggest-doubles-ignore-tp2321919p2451828.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: EdgeNgram Auto suggest - doubles ignore

2011-02-08 Thread johnnyisrael

Hi Erick,

If you have time, Can you please take a look and provide your comments (or)
suggestions for this problem?

Please let me know if you need any more information.

Thanks,

Johnny
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/EdgeNgram-Auto-suggest-doubles-ignore-tp2321919p2451828.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: EdgeNgram Auto suggest - doubles ignore

2011-02-01 Thread johnnyisrael

Hi Erick,

I tried to use the terms component, but I ended up with the following problems.

Problem: 1

Custom Sort not working in terms component:

http://lucene.472066.n3.nabble.com/Term-component-sort-is-not-working-td1905059.html#a1909386

I want to sort using one of my custom fields [value_score]. I have already set
it in my configuration, but it is not sorting properly.

The following is the configuration in solrconfig.xml:

  

  
 
true
json
name
value_score desc
true
 

  termsComponent

  

The SOLR response tag is not returned based on sorted parameter.

Problem: 2

Cap sensitive problem: [I am searching for "Apple"]

http://localhost/solr/core1/terms?terms.fl=name&terms.prefix=apple <-- not
working

http://localhost/solr/core1/terms?terms.fl=name&terms.prefix=Apple <--
working

Tried regex to overcome cap-sensitive problem: 

http://localhost/solr/core1/terms?terms.fl=name&terms.regex=Apple&terms.regex.flag=case_insensitive

Will this regex-based search help with my requirement?

It is returning irrelevant results. I am using the same syntax as the one
mentioned in the wiki.

http://wiki.apache.org/solr/TermsComponent

Am I going wrong anywhere?

Please let me know if you need any more info.

Thanks,

Johnny
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/EdgeNgram-Auto-suggest-doubles-ignore-tp2321919p2399330.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: EdgeNgram Auto suggest - doubles ignore

2011-01-25 Thread Erick Erickson
OK, try this.

Use some analysis chain for your field like:






This can be a multiValued field, BTW.

now use the TermsComponent to fetch your data. See:
http://wiki.apache.org/solr/TermsComponent

and specify terms.prefix=apple e.g.
http://localhost:8983/solr/terms?terms.prefix=app&terms.fl=blivet

The return list should be what you want. Note that the returned
values will be lower cased, and you can only specify
lower case in your search term (all because of specifying
the lowercase filter in my example).

This should be very fast no matter what your index size, as the
return list size defaults to 10 (though you can specify different
numbers).
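
For readers who want to see the idea without a running Solr, here is a
minimal Python sketch of what a prefix lookup over whole-value, lowercased
terms does. It is only a model, not Solr code: build_terms and terms_prefix
are made-up names, the three sample values come from Johnny's earlier
message, and ordering hits by count is an assumption about the default sort.

```python
from collections import Counter


def build_terms(values):
    """Index each value as ONE lowercased term (keyword-style analysis), with its count."""
    return Counter(v.strip().lower() for v in values)


def terms_prefix(terms, prefix, limit=10):
    """Return up to `limit` (term, count) pairs that start with `prefix`.

    Hits are ordered by descending count, then alphabetically (assumed ordering).
    The prefix is NOT lowercased here, mirroring the lack of query-side analysis.
    """
    hits = [(t, c) for t, c in terms.items() if t.startswith(prefix)]
    hits.sort(key=lambda tc: (-tc[1], tc[0]))
    return hits[:limit]


terms = build_terms(["pineapple vers apple", "milk with apple", "apple milk shake"])
print(terms_prefix(terms, "apple"))
print(terms_prefix(terms, "milk"))
print(terms_prefix(terms, "Apple"))
```

In this model only "apple milk shake" comes back for the prefix "apple", and
a capitalized prefix finds nothing because every indexed term was lowercased.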

Best
Erick

On Tue, Jan 25, 2011 at 3:03 PM, johnnyisrael wrote:

>
> Hi Eric,
>
> What I want here is, lets say I have 3 documents like
>
> ["pineapple vers apple", "milk with apple", "apple milk shake" ]
>
> and If i search for "apple", it should return only "apple milk shake"
> because that term alone starts with the letter "apple" which I typed in. It
> should not bring others and if I type "milk" it should return only "milk
> with apple"
>
> I want an output Similar like a Google auto suggest.
>
> Is there a way to achieve  this without encapsulating with double quotes.
>
> Thanks,
>
> Johnny
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/EdgeNgram-Auto-suggest-doubles-ignore-tp2321919p2333602.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: EdgeNgram Auto suggest - doubles ignore

2011-01-25 Thread mesenthil

Right now our configuration says multivalues=true. But that need not be
"true" in our case. Will make it false and try and update this thread with
more details..
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/EdgeNgram-Auto-suggest-doubles-ignore-tp2321919p2334627.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: EdgeNgram Auto suggest - doubles ignore

2011-01-25 Thread Jonathan Rochkind
Ah, sorry, I got confused about your requirements, if you just want to 
match at the beginning of the field, it may be more possible.  Using 
edgegrams or wildcard. If you have a single-valued field. Do you have a 
single-valued or a multi-valued field?  That is, does each document have 
just one value, or multiple?   I still get confused about how to do it 
with edgegrams, even with single-valued field, but I think maybe it's 
possible.


_Definitely_ possible, with or without edgegrams, if you are 
willing/able to make a completely separate Solr index where each term 
for auto-suggest is a "document".  Yes.


The problem lies in what "results" are. In general, Solr's results are 
the documents you have in the Solr index. Thus it makes everything a lot 
easier to deal with if you have an index where each document in the 
index is a "term" for auto-suggest.   But that doesn't always meet 
requirements if you need to auto-suggest within existing fq's and such, 
and of course it takes more resources to run an additional solr index.


On 1/25/2011 5:03 PM, mesenthil wrote:

The index contains around 1.5 million documents. As this is used for
autosuggest feature, performance is an important factor.

So it looks like, using edgeNgram, it is difficult to achieve the
following:

Result should return only those terms where search letter is matching with
the first word only. For example, when we type "M",  it should return
"Mumford and Sons" and not "jackson Michael".


Jonathan,

Is it possible to achieve this when we have separate index using edgeNgram?



Re: EdgeNgram Auto suggest - doubles ignore

2011-01-25 Thread mesenthil

The index contains around 1.5 million documents. As this is used for
autosuggest feature, performance is an important factor. 

So it looks like, using edgeNgram, it is difficult to achieve the
following:

Result should return only those terms where search letter is matching with
the first word only. For example, when we type "M",  it should return
"Mumford and Sons" and not "jackson Michael". 


Jonathan,

Is it possible to achieve this when we have separate index using edgeNgram?
 
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/EdgeNgram-Auto-suggest-doubles-ignore-tp2321919p2334538.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: EdgeNgram Auto suggest - doubles ignore

2011-01-25 Thread Markus Jelsma
Oh, I should perhaps mention that EdgeNGrams will yield results a lot quicker 
than using wildcards, at the cost of a larger index. You should, of course, use 
EdgeNGrams if you worry about performance and have a huge index and a high number 
of queries per second.
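
To make the trade-off concrete, here is a rough Python model, assuming each
value is indexed as a single lowercased token. The function names are
illustrative and nothing here is Solr API: the scan stands in for the term
enumeration a wildcard needs at query time, and the dictionary stands in for
an index of pre-computed edge grams.

```python
from collections import defaultdict


def edge_grams(value, min_gram=1, max_gram=25):
    """Leading-edge grams of the whole lowercased value, shortest first."""
    v = value.lower()
    return [v[:n] for n in range(min_gram, min(len(v), max_gram) + 1)]


def build_gram_index(values):
    """Index time: map every edge gram to the values it came from (larger index)."""
    index = defaultdict(set)
    for value in values:
        for gram in edge_grams(value):
            index[gram].add(value)
    return index


def wildcard_scan(values, prefix):
    """Query time: find matches by checking every value against the prefix."""
    p = prefix.lower()
    return {v for v in values if v.lower().startswith(p)}


values = ["pineapple vers apple", "milk with apple", "apple milk shake"]
index = build_gram_index(values)
print(len(values), "values become", len(index), "indexed grams")
print(index["app"] == wildcard_scan(values, "app"))
```

Both routes return the same suggestions; the gram index answers with one
exact lookup but stores many more terms than the three original values.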

> Then you don't need NGrams at all. A wildcard will suffice or you can use
> the TermsComponent.
> 
> If these strings are indexed as single tokens (KeywordTokenizer with
> LowercaseFilter) you can simply do field:app* to retrieve the "apple milk
> shake". You can also use the string field type but then you must make sure
> the values are already lowercased before indexing.
> 
> Be careful though, there is no query time analysis for wildcard (and fuzzy)
> queries so make sure
> 
> > Hi Eric,
> > 
> > What I want here is, lets say I have 3 documents like
> > 
> > ["pineapple vers apple", "milk with apple", "apple milk shake" ]
> > 
> > and If i search for "apple", it should return only "apple milk shake"
> > because that term alone starts with the letter "apple" which I typed in.
> > It should not bring others and if I type "milk" it should return only
> > "milk with apple"
> > 
> > I want an output Similar like a Google auto suggest.
> > 
> > Is there a way to achieve  this without encapsulating with double quotes.
> > 
> > Thanks,
> > 
> > Johnny


Re: EdgeNgram Auto suggest - doubles ignore

2011-01-25 Thread Markus Jelsma
Then you don't need NGrams at all. A wildcard will suffice or you can use the 
TermsComponent.

If these strings are indexed as single tokens (KeywordTokenizer with 
LowercaseFilter) you can simply do field:app* to retrieve the "apple milk 
shake". You can also use the string field type but then you must make sure the 
values are already lowercased before indexing.

Be careful though, there is no query time analysis for wildcard (and fuzzy) 
queries so make sure 

> Hi Eric,
> 
> What I want here is, lets say I have 3 documents like
> 
> ["pineapple vers apple", "milk with apple", "apple milk shake" ]
> 
> and If i search for "apple", it should return only "apple milk shake"
> because that term alone starts with the letter "apple" which I typed in. It
> should not bring others and if I type "milk" it should return only "milk
> with apple"
> 
> I want an output Similar like a Google auto suggest.
> 
> Is there a way to achieve  this without encapsulating with double quotes.
> 
> Thanks,
> 
> Johnny


Re: EdgeNgram Auto suggest - doubles ignore

2011-01-25 Thread Jonathan Rochkind
I haven't figured out any way to achieve that AT ALL without making a 
separate Solr index just to serve autosuggest queries. At least when you 
want to auto-suggest on a multi-value field. Someone posted a crazy 
tricky way to do it with a single-valued field a while ago.  If you 
can/are willing to make a separate Solr index with a schema set up for 
auto-suggest specifically, it's easy. But from an existing schema, where 
you want to auto-suggest just based on the values in one field, it's a 
multi-valued field, and you want to allow matches in the middle of the 
field -- I don't think there's a way to do it.


On 1/25/2011 3:03 PM, johnnyisrael wrote:

Hi Eric,

What I want here is, lets say I have 3 documents like

["pineapple vers apple", "milk with apple", "apple milk shake" ]

and If i search for "apple", it should return only "apple milk shake"
because that term alone starts with the letter "apple" which I typed in. It
should not bring others and if I type "milk" it should return only "milk
with apple"

I want an output Similar like a Google auto suggest.

Is there a way to achieve  this without encapsulating with double quotes.

Thanks,

Johnny


Re: EdgeNgram Auto suggest - doubles ignore

2011-01-25 Thread johnnyisrael

Hi Eric,

What I want here is, lets say I have 3 documents like 

["pineapple vers apple", "milk with apple", "apple milk shake" ]

and If i search for "apple", it should return only "apple milk shake"
because that term alone starts with the letter "apple" which I typed in. It
should not bring others and if I type "milk" it should return only "milk
with apple"

I want an output Similar like a Google auto suggest.

Is there a way to achieve  this without encapsulating with double quotes.

Thanks,

Johnny
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/EdgeNgram-Auto-suggest-doubles-ignore-tp2321919p2333602.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: EdgeNgram Auto suggest - doubles ignore

2011-01-25 Thread Erick Erickson
Let's back up here because now I'm not clear what you actually want.
EdgeNGrams are a way of matching substrings, which is what's happening here.
Of course searching for "apple" matches all three examples, just as searching
for "apple" without grams would; that's the expected behavior.

So, we need a clear problem definition of what you're trying to do, along
with example queries (please post the results of adding &debugQuery=on).

Best
Erick

On Tue, Jan 25, 2011 at 8:29 AM, johnnyisrael wrote:

>
> Hi Eric,
>
> You are right, there is a copy field to EdgeNgram, I tried the
> configuration
> but it not working as expected.
>
> Configuration I tried:
>
> 
>
>  termVectors=”true”>
> 
> 
> 
> 
> 
> 
> 
> 
> 
>
>  positionIncrementGap=”100″>
> 
> 
> 
>  maxGramSize=”25″/>
> 
> 
> 
> 
> 
> 
>
>  omitNorms=”true” omitTermFreqAndPositions=”true” />
>  omitNorms=”true” omitTermFreqAndPositions=”true” />
>
> edgy_user_query
> 
>
> ==
>
> When I search for the term "apple".
>
> It is returning results for "pineapple vers apple", "milk with apple",
> "apple milk shake" ...
>
> Is there any other way to overcome this problem?
>
> Thanks,
>
> Johnny
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/EdgeNgram-Auto-suggest-doubles-ignore-tp2321919p2329370.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: EdgeNgram Auto suggest - doubles ignore

2011-01-25 Thread johnnyisrael

Hi Eric,

You are right, there is a copy field to EdgeNgram. I tried the configuration
but it is not working as expected.

Configuration I tried:





























edgy_user_query


==

When I search for the term "apple".

It is returning results for "pineapple vers apple", "milk with apple",
"apple milk shake" ...

Is there any other way to overcome this problem?

Thanks,

Johnny


-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/EdgeNgram-Auto-suggest-doubles-ignore-tp2321919p2329370.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: EdgeNgram Auto suggest - doubles ignore

2011-01-24 Thread Erick Erickson
See below.

On Mon, Jan 24, 2011 at 1:51 PM, johnnyisrael wrote:

>
> Hi,
>
> I am trying out the auto suggest using EdgeNgram.
>
> Using the following tutorial as a reference.
>
>
> http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/
>
> In the above tutorial, The below two lines has been clearly mentioned,
>
> "Note that it’s necessary to wrap the query in double-quotes as a phrase.
> Otherwise unpredictable and unwanted matches can occur."
>
> When i use double quotes as they said it works perfectly fine. I just want
> to know the reason for this behavior.
>
> Can anyone explain me why it behaves like that?
>
>
The reason here is that if you *don't* make it a phrase, then
you're ORing (or ANDing) the grams. So if you were
searching for won, your search would become
w OR wo OR won, which would match n-grams from
all over the place without regard to whether they appeared
in order.
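
A small Python sketch of that difference, under simplifying assumptions:
words are split on whitespace, grams start at length 1, and matches_any and
matches_phrase are hypothetical helpers that only imitate ORed terms versus
a phrase. They do not reproduce Lucene's real scoring or position handling.

```python
def edge_ngrams(token, min_gram=1, max_gram=25):
    """Leading-edge grams of one token, shortest first."""
    return [token[:n] for n in range(min_gram, min(len(token), max_gram) + 1)]


def index_grams(text):
    """Index-time analysis in this model: lowercase, split, gram every word."""
    grams = []
    for word in text.lower().split():
        grams.extend(edge_ngrams(word))
    return grams


def matches_any(query, doc_text):
    """Unquoted query: the query grams are ORed, so one shared gram is enough."""
    return bool(set(edge_ngrams(query.lower())) & set(index_grams(doc_text)))


def matches_phrase(query, doc_text):
    """Quoted query: the query grams must appear consecutively and in order."""
    q = edge_ngrams(query.lower())
    d = index_grams(doc_text)
    return any(d[i:i + len(q)] == q for i in range(len(d) - len(q) + 1))


print(edge_ngrams("won"))
print(matches_any("won", "water well"), matches_phrase("won", "water well"))
print(matches_phrase("won", "wonder"))
```

In this model "water well" matches the unquoted query because it shares the
gram "w", while the phrase form only accepts text that really starts a word
with "won".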


> I tried the alternate method mentioned in the responses section of the same
> tutorial [StandardTokenizerFactory and LowerCaseFilterFactory combination],
> it does not work fine as expected[bringing unwanted matches].
>
>
Hmmm, I don't think the StandardTokenizer & LowerCase combination was
being used for autosuggest; there was a copyField in there
that went to the EdgeNGram field (note that I only scanned the article).

Best
Erick



> Is there a best way to overcome this?
>
> Thanks,
>
> Johnny
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/EdgeNgram-Auto-suggest-doubles-ignore-tp2321919p2321919.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


EdgeNgram Auto suggest - doubles ignore

2011-01-24 Thread johnnyisrael

Hi,

I am trying out the auto suggest using EdgeNgram. 

Using the following tutorial as a reference.

http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/

In the above tutorial, the following two lines are clearly stated:

"Note that it’s necessary to wrap the query in double-quotes as a phrase.
Otherwise unpredictable and unwanted matches can occur."

When I use double quotes as they said, it works perfectly fine. I just want
to know the reason for this behavior.

Can anyone explain why it behaves like that?

I tried the alternate method mentioned in the responses section of the same
tutorial [StandardTokenizerFactory and LowerCaseFilterFactory combination],
but it does not work as expected [it brings unwanted matches].

Is there a best way to overcome this?

Thanks,

Johnny



-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/EdgeNgram-Auto-suggest-doubles-ignore-tp2321919p2321919.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Auto Suggest

2010-09-04 Thread Erick Erickson
Adding &debugQuery=on produced the following:
+edge:testing +edge:lots
+edge:testing +edge:lots

+PhraseQuery(edge:"te tes test testi testin testing")
+PhraseQuery(edge:"lo lot lots")


So one part of the answer is that multiple terms are broken up into multiple
phrase queries, one phrase for each term. This is with
LowerCaseTokenizerFactory and EdgeNGramFilterFactory.
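
As an illustration, the Python sketch below rebuilds the same parsed-query
string from the input "testing lots". It is a toy model rather than Solr's
query parser: the minimum gram size of 2 is inferred from the output above,
the field name "edge" is taken from it, and parsed_query is a made-up helper.

```python
def edge_ngrams(token, min_gram=2, max_gram=25):
    """Leading-edge grams of one token, shortest first."""
    return [token[:n] for n in range(min_gram, min(len(token), max_gram) + 1)]


def parsed_query(user_input, field="edge"):
    """One required phrase clause per whitespace-separated word, in input order."""
    clauses = []
    for word in user_input.lower().split():
        grams = " ".join(edge_ngrams(word))
        if grams:  # words shorter than min_gram produce no clause in this model
            clauses.append(f'+PhraseQuery({field}:"{grams}")')
    return " ".join(clauses)


print(parsed_query("testing lots"))
```

Each word becomes its own required phrase of grams, so in this model a
document has to satisfy every word separately for the whole query to match.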


So I don't see any reason why your query shouldn't work. Could you provide
your field type definitions, an example document that you think should be
found, and query output with &debugQuery=on?

Best
Erick


On Sat, Sep 4, 2010 at 10:27 AM, Jason Rutherglen <
jason.rutherg...@gmail.com> wrote:

> Luke,
>
> Thanks.  What happens if there are 3 terms?  It seems like the entire
> query can go into facet.prefix?
>
> On Fri, Sep 3, 2010 at 8:05 AM, Luke Tebbs 
> wrote:
> > What about if you do something like this? -
> >
> >
> facet=true&facet.mincount=1&q=apple&facet.limit=10&facet.prefix=mou&facet.field=term_suggest&qt=basic&wt=javabin&rows=0&version=1
> >
> >
> > Jason Rutherglen wrote:
> >>
> >> To clarify, the query analyzer returns that.  Variations such as
> >> "apple mou" also do not return anything.  Maybe Jay can comment and
> >> then we can amend the article?
> >>
> >> On Fri, Sep 3, 2010 at 6:12 AM, Jason Rutherglen
> >>  wrote:
> >>
> >>>
> >>> Analysis returns "app mou".
> >>>
> >>> On Thu, Sep 2, 2010 at 6:12 PM, Lance Norskog 
> wrote:
> >>>
> >>>>
> >>>> What does analysis.jsp show?
> >>>>
> >>>> On Thu, Sep 2, 2010 at 5:53 AM, Jason Rutherglen
> >>>>  wrote:
> >>>>
> >>>>>
> >>>>> I'm having a different issue with the EdgeNGram technique described
> >>>>> here:
> >>>>>
> http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/
> >>>>>
> >>>>> That is one word queries q=app on the query_text field, work fine
> >>>>> however "q=app mou" do not.  Why would this be or is there a
> >>>>> configuration that could be missing?
> >>>>>
> >>>>> On Wed, Sep 1, 2010 at 3:53 PM, Eric Grobler
> >>>>>  wrote:
> >>>>>
> >>>>>>
> >>>>>> Thanks for your feedback Robert,
> >>>>>>
> >>>>>> I will try that and see how Solr performs on my data - I think I
> will
> >>>>>> create
> >>>>>> a field that contains only important key/product terms from the
> text.
> >>>>>>
> >>>>>> Regards
> >>>>>> Johan
> >>>>>>
> >>>>>> On Wed, Sep 1, 2010 at 9:12 PM, Robert Petersen 
> >>>>>> wrote:
> >>>>>>
> >>>>>>
> >>>>>>>
> >>>>>>> We don't have that many, just a hundred thousand, and solr response
> >>>>>>> times (since the index's docs are small and not complex) are logged
> >>>>>>> as
> >>>>>>> typically 1 ms if not 0 ms.  It's funny but sometimes it is so fast
> >>>>>>> no
> >>>>>>> milliseconds have elapsed.  Incredible if you ask me...  :)
> >>>>>>>
> >>>>>>> Once you get SOLR to consider the whole phrase as just one big
> term,
> >>>>>>> the
> >>>>>>> wildcard is very fast.
> >>>>>>>
> >>>>>>> -Original Message-
> >>>>>>> From: Eric Grobler [mailto:impalah...@googlemail.com]
> >>>>>>> Sent: Wednesday, September 01, 2010 12:35 PM
> >>>>>>> To: solr-user@lucene.apache.org
> >>>>>>> Subject: Re: Auto Suggest
> >>>>>>>
> >>>>>>> Hi Robert,
> >>>>>>>
> >>>>>>> Interesting approach, how many documents do you have in Solr?
> >>>>>>> I have about 2 million and I just wonder if it might be a bit slow.
> >>>>>>>
> >>>>>>> Regards
> >>>>>>> Johan
> >>>>>>>

Re: Auto Suggest

2010-09-04 Thread Jason Rutherglen
Luke,

Thanks.  What happens if there are 3 terms?  It seems like the entire
query can go into facet.prefix?

On Fri, Sep 3, 2010 at 8:05 AM, Luke Tebbs  wrote:
> What about if you do something like this? -
>
> facet=true&facet.mincount=1&q=apple&facet.limit=10&facet.prefix=mou&facet.field=term_suggest&qt=basic&wt=javabin&rows=0&version=1
>
>
> Jason Rutherglen wrote:
>>
>> To clarify, the query analyzer returns that.  Variations such as
>> "apple mou" also do not return anything.  Maybe Jay can comment and
>> then we can amend the article?
>>
>> On Fri, Sep 3, 2010 at 6:12 AM, Jason Rutherglen
>>  wrote:
>>
>>>
>>> Analysis returns "app mou".
>>>
>>> On Thu, Sep 2, 2010 at 6:12 PM, Lance Norskog  wrote:
>>>
>>>>
>>>> What does analysis.jsp show?
>>>>
>>>> On Thu, Sep 2, 2010 at 5:53 AM, Jason Rutherglen
>>>>  wrote:
>>>>
>>>>>
>>>>> I'm having a different issue with the EdgeNGram technique described
>>>>> here:
>>>>> http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/
>>>>>
>>>>> That is one word queries q=app on the query_text field, work fine
>>>>> however "q=app mou" do not.  Why would this be or is there a
>>>>> configuration that could be missing?
>>>>>
>>>>> On Wed, Sep 1, 2010 at 3:53 PM, Eric Grobler
>>>>>  wrote:
>>>>>
>>>>>>
>>>>>> Thanks for your feedback Robert,
>>>>>>
>>>>>> I will try that and see how Solr performs on my data - I think I will
>>>>>> create
>>>>>> a field that contains only important key/product terms from the text.
>>>>>>
>>>>>> Regards
>>>>>> Johan
>>>>>>
>>>>>> On Wed, Sep 1, 2010 at 9:12 PM, Robert Petersen 
>>>>>> wrote:
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> We don't have that many, just a hundred thousand, and solr response
>>>>>>> times (since the index's docs are small and not complex) are logged
>>>>>>> as
>>>>>>> typically 1 ms if not 0 ms.  It's funny but sometimes it is so fast
>>>>>>> no
>>>>>>> milliseconds have elapsed.  Incredible if you ask me...  :)
>>>>>>>
>>>>>>> Once you get SOLR to consider the whole phrase as just one big term,
>>>>>>> the
>>>>>>> wildcard is very fast.
>>>>>>>
>>>>>>> -Original Message-
>>>>>>> From: Eric Grobler [mailto:impalah...@googlemail.com]
>>>>>>> Sent: Wednesday, September 01, 2010 12:35 PM
>>>>>>> To: solr-user@lucene.apache.org
>>>>>>> Subject: Re: Auto Suggest
>>>>>>>
>>>>>>> Hi Robert,
>>>>>>>
>>>>>>> Interesting approach, how many documents do you have in Solr?
>>>>>>> I have about 2 million and I just wonder if it might be a bit slow.
>>>>>>>
>>>>>>> Regards
>>>>>>> Johan
>>>>>>>
>>>>>>> On Wed, Sep 1, 2010 at 7:38 PM, Robert Petersen 
>>>>>>> wrote:
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> I do this by replacing the spaces with a '%' in a separate search
>>>>>>>>
>>>>>>>
>>>>>>> field
>>>>>>>
>>>>>>>>
>>>>>>>> which is not parsed nor tokenized and then you can wildcard across
>>>>>>>> the
>>>>>>>> whole phrase like you want and the spaces don't mess you up.  Just
>>>>>>>>
>>>>>>>
>>>>>>> store
>>>>>>>
>>>>>>>>
>>>>>>>> the original phrase with spaces in a separate field for returning to
>>>>>>>>
>>>>>>>
>>>>>>> the
>>>>>>>
>>>>>>>>
>>>>>>>> front end for display.
>>>>>>>>
>>>>>>>> -Original Message-
>>>>>>>> From: Jazz Globe [mailto:jazzgl...@hotmail.com]
>>>>>>>> Sent: Wednesday, September 01, 2010 7:33 AM
>>>>>>>> To: solr-user@lucene.apache.org
>>>>>>>> Subject: Auto Suggest
>>>>>>>>
>>>>>>>>
>>>>>>>> Hallo
>>>>>>>>
>>>>>>>> How would one implement a multiple term auto-suggest feature in Solr
>>>>>>>> that is filter sensitive?
>>>>>>>> For example, a user enters :
>>>>>>>> "mp3"
>>>>>>>>  and solr might suggest:
>>>>>>>>  ->   "mp3 player"
>>>>>>>>  ->   "mp3 nano"
>>>>>>>>  ->   "mp3 sony"
>>>>>>>> and then the user starts the second word :
>>>>>>>> "mp3 n"
>>>>>>>> and that narrows it down to:
>>>>>>>>  -> "mp3 nano"
>>>>>>>>
>>>>>>>> I had a quick look at the Terms Component.
>>>>>>>> I suppose it just returns term totals for the entire index and
>>>>>>>> cannot
>>>>>>>>
>>>>>>>
>>>>>>> be
>>>>>>>
>>>>>>>>
>>>>>>>> used with a filter or query?
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Johan
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>
>>>> --
>>>> Lance Norskog
>>>> goks...@gmail.com
>>>>
>>>>
>
>


Re: Auto Suggest

2010-09-04 Thread Jason Rutherglen
Dan,

Thanks... I wasn't clear in the original email about what the issue is.
The problem is that when the query contains multiple terms, no results
are returned.

Thanks

On Fri, Sep 3, 2010 at 8:33 AM, dan sutton  wrote:
> I set this up a few years ago with something like the following:
>
> 
>                
>                        
>                        
>                         pattern="([^a-z0-9])" replacement="" replace="all" />
>                         maxGramSize="20" minGramSize="1" />
>                
>                
>                        
>                        
>                         pattern="([^a-z0-9])" replacement="" replace="all" />
>                
>    
>
>  replacement="" replace="all" /> is the bit missing i think here
>
> This way the search is agnostic to case and any non-alphanum chars, this was
> to facilitate a location autocomplete for searching
>
> So is was a basic search, returning the top N results along with additional
> info to show in the autocomplete to our mod_perl servers, Results were
> cached in the mod_perl servers.
>
> Regards,
> Dan
>
> On Thu, Sep 2, 2010 at 1:53 PM, Jason Rutherglen > wrote:
>
>> I'm having a different issue with the EdgeNGram technique described
>> here:
>> http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/
>>
>> That is one word queries q=app on the query_text field, work fine
>> however "q=app mou" do not.  Why would this be or is there a
>> configuration that could be missing?
>>
>> On Wed, Sep 1, 2010 at 3:53 PM, Eric Grobler 
>> wrote:
>> > Thanks for your feedback Robert,
>> >
>> > I will try that and see how Solr performs on my data - I think I will
>> create
>> > a field that contains only important key/product terms from the text.
>> >
>> > Regards
>> > Johan
>> >
>> > On Wed, Sep 1, 2010 at 9:12 PM, Robert Petersen 
>> wrote:
>> >
>> >> We don't have that many, just a hundred thousand, and solr response
>> >> times (since the index's docs are small and not complex) are logged as
>> >> typically 1 ms if not 0 ms.  It's funny but sometimes it is so fast no
>> >> milliseconds have elapsed.  Incredible if you ask me...  :)
>> >>
>> >> Once you get SOLR to consider the whole phrase as just one big term, the
>> >> wildcard is very fast.
>> >>
>> >> -Original Message-
>> >> From: Eric Grobler [mailto:impalah...@googlemail.com]
>> >> Sent: Wednesday, September 01, 2010 12:35 PM
>> >> To: solr-user@lucene.apache.org
>> >> Subject: Re: Auto Suggest
>> >>
>> >> Hi Robert,
>> >>
>> >> Interesting approach, how many documents do you have in Solr?
>> >> I have about 2 million and I just wonder if it might be a bit slow.
>> >>
>> >> Regards
>> >> Johan
>> >>
>> >> On Wed, Sep 1, 2010 at 7:38 PM, Robert Petersen 
>> >> wrote:
>> >>
>> >> > I do this by replacing the spaces with a '%' in a separate search
>> >> field
>> >> > which is not parsed nor tokenized and then you can wildcard across the
>> >> > whole phrase like you want and the spaces don't mess you up.  Just
>> >> store
>> >> > the original phrase with spaces in a separate field for returning to
>> >> the
>> >> > front end for display.
>> >> >
>> >> > -Original Message-
>> >> > From: Jazz Globe [mailto:jazzgl...@hotmail.com]
>> >> > Sent: Wednesday, September 01, 2010 7:33 AM
>> >> > To: solr-user@lucene.apache.org
>> >> > Subject: Auto Suggest
>> >> >
>> >> >
>> >> > Hallo
>> >> >
>> >> > How would one implement a multiple term auto-suggest feature in Solr
>> >> > that is filter sensitive?
>> >> > For example, a user enters :
>> >> > "mp3"
>> >> >  and solr might suggest:
>> >> >  ->   "mp3 player"
>> >> >  ->   "mp3 nano"
>> >> >  ->   "mp3 sony"
>> >> > and then the user starts the second word :
>> >> > "mp3 n"
>> >> > and that narrows it down to:
>> >> >  -> "mp3 nano"
>> >> >
>> >> > I had a quick look at the Terms Component.
>> >> > I suppose it just returns term totals for the entire index and cannot
>> >> be
>> >> > used with a filter or query?
>> >> >
>> >> > Thanks
>> >> > Johan
>> >> >
>> >> >
>> >> >
>> >>
>> >
>>
>


Re: Auto Suggest

2010-09-03 Thread dan sutton
I set this up a few years ago with something like the following:















The pattern replace filter (pattern="([^a-z0-9])" replacement="" replace="all")
is the bit missing here, I think.

This way the search is agnostic to case and any non-alphanum chars, this was
to facilitate a location autocomplete for searching

So it was a basic search, returning the top N results along with additional
info to show in the autocomplete to our mod_perl servers. Results were
cached in the mod_perl servers.
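
The normalization described above can be sketched in Python as follows.
This is a rough stand-in for the analyzer chain, not the actual config: it
assumes lowercasing runs before the pattern replace, it grams only on the
index side (min 1, max 20), and the helper names and sample place names are
invented for the example.

```python
import re

_NON_ALNUM = re.compile(r"[^a-z0-9]")


def normalize(text):
    """Lowercase, then drop every character that is not a-z or 0-9."""
    return _NON_ALNUM.sub("", text.lower())


def edge_grams(text, min_gram=1, max_gram=20):
    """Index side only: leading-edge grams of the normalized text."""
    norm = normalize(text)
    return [norm[:n] for n in range(min_gram, min(len(norm), max_gram) + 1)]


def suggest(typed, places):
    """Return the places whose index-side grams contain the normalized input."""
    key = normalize(typed)
    return [p for p in places if key and key in edge_grams(p)]


places = ["St. John's", "Stockholm"]
print(normalize("St. John's"))
print(suggest("st jo", places))
```

Because both sides go through the same normalization, typing "st jo",
"St.Jo" or "STJO" all reduce to the same key before the gram lookup.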

Regards,
Dan

On Thu, Sep 2, 2010 at 1:53 PM, Jason Rutherglen  wrote:

> I'm having a different issue with the EdgeNGram technique described
> here:
> http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/
>
> That is one word queries q=app on the query_text field, work fine
> however "q=app mou" do not.  Why would this be or is there a
> configuration that could be missing?
>
> On Wed, Sep 1, 2010 at 3:53 PM, Eric Grobler 
> wrote:
> > Thanks for your feedback Robert,
> >
> > I will try that and see how Solr performs on my data - I think I will
> create
> > a field that contains only important key/product terms from the text.
> >
> > Regards
> > Johan
> >
> > On Wed, Sep 1, 2010 at 9:12 PM, Robert Petersen 
> wrote:
> >
> >> We don't have that many, just a hundred thousand, and solr response
> >> times (since the index's docs are small and not complex) are logged as
> >> typically 1 ms if not 0 ms.  It's funny but sometimes it is so fast no
> >> milliseconds have elapsed.  Incredible if you ask me...  :)
> >>
> >> Once you get SOLR to consider the whole phrase as just one big term, the
> >> wildcard is very fast.
> >>
> >> -Original Message-
> >> From: Eric Grobler [mailto:impalah...@googlemail.com]
> >> Sent: Wednesday, September 01, 2010 12:35 PM
> >> To: solr-user@lucene.apache.org
> >> Subject: Re: Auto Suggest
> >>
> >> Hi Robert,
> >>
> >> Interesting approach, how many documents do you have in Solr?
> >> I have about 2 million and I just wonder if it might be a bit slow.
> >>
> >> Regards
> >> Johan
> >>
> >> On Wed, Sep 1, 2010 at 7:38 PM, Robert Petersen 
> >> wrote:
> >>
> >> > I do this by replacing the spaces with a '%' in a separate search
> >> field
> >> > which is not parsed nor tokenized and then you can wildcard across the
> >> > whole phrase like you want and the spaces don't mess you up.  Just
> >> store
> >> > the original phrase with spaces in a separate field for returning to
> >> the
> >> > front end for display.
> >> >
> >> > -Original Message-
> >> > From: Jazz Globe [mailto:jazzgl...@hotmail.com]
> >> > Sent: Wednesday, September 01, 2010 7:33 AM
> >> > To: solr-user@lucene.apache.org
> >> > Subject: Auto Suggest
> >> >
> >> >
> >> > Hallo
> >> >
> >> > How would one implement a multiple term auto-suggest feature in Solr
> >> > that is filter sensitive?
> >> > For example, a user enters :
> >> > "mp3"
> >> >  and solr might suggest:
> >> >  ->   "mp3 player"
> >> >  ->   "mp3 nano"
> >> >  ->   "mp3 sony"
> >> > and then the user starts the second word :
> >> > "mp3 n"
> >> > and that narrows it down to:
> >> >  -> "mp3 nano"
> >> >
> >> > I had a quick look at the Terms Component.
> >> > I suppose it just returns term totals for the entire index and cannot
> >> be
> >> > used with a filter or query?
> >> >
> >> > Thanks
> >> > Johan
> >> >
> >> >
> >> >
> >>
> >
>


Re: Auto Suggest

2010-09-03 Thread Luke Tebbs

What about if you do something like this? -

facet=true&facet.mincount=1&q=apple&facet.limit=10&facet.prefix=mou&facet.field=term_suggest&qt=basic&wt=javabin&rows=0&version=1
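
A minimal sketch, in Python, of how a front end might split the typed text into such a request: the completed words go into q and the word still being typed goes into facet.prefix. The function name and the fallback to a match-all query are assumptions for illustration; the field name term_suggest is taken from the query above.

```python
from urllib.parse import urlencode

def facet_suggest_params(user_input: str, field: str = "term_suggest", limit: int = 10) -> dict:
    """Split the input into finished words (q) and the word in progress (facet.prefix)."""
    words = user_input.split()
    if user_input.endswith(" ") or not words:
        # A trailing space means the last word is finished; nothing to prefix-match yet.
        completed, prefix = words, ""
    else:
        completed, prefix = words[:-1], words[-1]
    return {
        "q": " ".join(completed) or "*:*",  # assumption: match all docs when no word is finished
        "rows": 0,                          # only the facet counts are needed
        "facet": "true",
        "facet.field": field,
        "facet.prefix": prefix,
        "facet.mincount": 1,
        "facet.limit": limit,
    }

# Query string for the input "apple mou"
query_string = urlencode(facet_suggest_params("apple mou"))
```

Because q restricts the document set before the facet counts are computed, the suggestions are limited to terms that occur together with the words already typed.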


Jason Rutherglen wrote:

To clarify, the query analyzer returns that.  Variations such as
"apple mou" also do not return anything.  Maybe Jay can comment and
then we can amend the article?

On Fri, Sep 3, 2010 at 6:12 AM, Jason Rutherglen
 wrote:
  

Analysis returns "app mou".

On Thu, Sep 2, 2010 at 6:12 PM, Lance Norskog  wrote:


What does analysis.jsp show?

On Thu, Sep 2, 2010 at 5:53 AM, Jason Rutherglen
 wrote:
  

I'm having a different issue with the EdgeNGram technique described
here: 
http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/

That is one word queries q=app on the query_text field, work fine
however "q=app mou" do not.  Why would this be or is there a
configuration that could be missing?

On Wed, Sep 1, 2010 at 3:53 PM, Eric Grobler  wrote:


Thanks for your feedback Robert,

I will try that and see how Solr performs on my data - I think I will create
a field that contains only important key/product terms from the text.

Regards
Johan

On Wed, Sep 1, 2010 at 9:12 PM, Robert Petersen  wrote:

  

We don't have that many, just a hundred thousand, and solr response
times (since the index's docs are small and not complex) are logged as
typically 1 ms if not 0 ms.  It's funny but sometimes it is so fast no
milliseconds have elapsed.  Incredible if you ask me...  :)

Once you get SOLR to consider the whole phrase as just one big term, the
wildcard is very fast.

-Original Message-
From: Eric Grobler [mailto:impalah...@googlemail.com]
Sent: Wednesday, September 01, 2010 12:35 PM
To: solr-user@lucene.apache.org
Subject: Re: Auto Suggest

Hi Robert,

Interesting approach, how many documents do you have in Solr?
I have about 2 million and I just wonder if it might be a bit slow.

Regards
Johan

On Wed, Sep 1, 2010 at 7:38 PM, Robert Petersen 
wrote:



I do this by replacing the spaces with a '%' in a separate search field
which is not parsed nor tokenized and then you can wildcard across the
whole phrase like you want and the spaces don't mess you up.  Just store
the original phrase with spaces in a separate field for returning to the
front end for display.

-Original Message-
From: Jazz Globe [mailto:jazzgl...@hotmail.com]
Sent: Wednesday, September 01, 2010 7:33 AM
To: solr-user@lucene.apache.org
Subject: Auto Suggest


Hallo

How would one implement a multiple term auto-suggest feature in Solr
that is filter sensitive?
For example, a user enters :
"mp3"
 and solr might suggest:
 ->   "mp3 player"
 ->   "mp3 nano"
 ->   "mp3 sony"
and then the user starts the second word :
"mp3 n"
and that narrows it down to:
 -> "mp3 nano"

I had a quick look at the Terms Component.
I suppose it just returns term totals for the entire index and cannot be
used with a filter or query?

Thanks
Johan



  


--
Lance Norskog
goks...@gmail.com

  




Re: Auto Suggest

2010-09-03 Thread Jason Rutherglen
To clarify, the query analyzer returns that.  Variations such as
"apple mou" also do not return anything.  Maybe Jay can comment and
then we can amend the article?

On Fri, Sep 3, 2010 at 6:12 AM, Jason Rutherglen
 wrote:
> Analysis returns "app mou".
>
> On Thu, Sep 2, 2010 at 6:12 PM, Lance Norskog  wrote:
>> What does analysis.jsp show?
>>
>> On Thu, Sep 2, 2010 at 5:53 AM, Jason Rutherglen
>>  wrote:
>>> I'm having a different issue with the EdgeNGram technique described
>>> here: 
>>> http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/
>>>
>>> That is one word queries q=app on the query_text field, work fine
>>> however "q=app mou" do not.  Why would this be or is there a
>>> configuration that could be missing?
>>>
>>> On Wed, Sep 1, 2010 at 3:53 PM, Eric Grobler  
>>> wrote:
>>>> Thanks for your feedback Robert,
>>>>
>>>> I will try that and see how Solr performs on my data - I think I will 
>>>> create
>>>> a field that contains only important key/product terms from the text.
>>>>
>>>> Regards
>>>> Johan
>>>>
>>>> On Wed, Sep 1, 2010 at 9:12 PM, Robert Petersen  wrote:
>>>>
>>>>> We don't have that many, just a hundred thousand, and solr response
>>>>> times (since the index's docs are small and not complex) are logged as
>>>>> typically 1 ms if not 0 ms.  It's funny but sometimes it is so fast no
>>>>> milliseconds have elapsed.  Incredible if you ask me...  :)
>>>>>
>>>>> Once you get SOLR to consider the whole phrase as just one big term, the
>>>>> wildcard is very fast.
>>>>>
>>>>> -Original Message-
>>>>> From: Eric Grobler [mailto:impalah...@googlemail.com]
>>>>> Sent: Wednesday, September 01, 2010 12:35 PM
>>>>> To: solr-user@lucene.apache.org
>>>>> Subject: Re: Auto Suggest
>>>>>
>>>>> Hi Robert,
>>>>>
>>>>> Interesting approach, how many documents do you have in Solr?
>>>>> I have about 2 million and I just wonder if it might be a bit slow.
>>>>>
>>>>> Regards
>>>>> Johan
>>>>>
>>>>> On Wed, Sep 1, 2010 at 7:38 PM, Robert Petersen 
>>>>> wrote:
>>>>>
>>>>> > I do this by replacing the spaces with a '%' in a separate search
>>>>> field
>>>>> > which is not parsed nor tokenized and then you can wildcard across the
>>>>> > whole phrase like you want and the spaces don't mess you up.  Just
>>>>> store
>>>>> > the original phrase with spaces in a separate field for returning to
>>>>> the
>>>>> > front end for display.
>>>>> >
>>>>> > -Original Message-
>>>>> > From: Jazz Globe [mailto:jazzgl...@hotmail.com]
>>>>> > Sent: Wednesday, September 01, 2010 7:33 AM
>>>>> > To: solr-user@lucene.apache.org
>>>>> > Subject: Auto Suggest
>>>>> >
>>>>> >
>>>>> > Hallo
>>>>> >
>>>>> > How would one implement a multiple term auto-suggest feature in Solr
>>>>> > that is filter sensitive?
>>>>> > For example, a user enters :
>>>>> > "mp3"
>>>>> >  and solr might suggest:
>>>>> >  ->   "mp3 player"
>>>>> >  ->   "mp3 nano"
>>>>> >  ->   "mp3 sony"
>>>>> > and then the user starts the second word :
>>>>> > "mp3 n"
>>>>> > and that narrows it down to:
>>>>> >  -> "mp3 nano"
>>>>> >
>>>>> > I had a quick look at the Terms Component.
>>>>> > I suppose it just returns term totals for the entire index and cannot
>>>>> be
>>>>> > used with a filter or query?
>>>>> >
>>>>> > Thanks
>>>>> > Johan
>>>>> >
>>>>> >
>>>>> >
>>>>>
>>>>
>>>
>>
>>
>>
>> --
>> Lance Norskog
>> goks...@gmail.com
>>
>


Re: Auto Suggest

2010-09-03 Thread Jason Rutherglen
Analysis returns "app mou".

On Thu, Sep 2, 2010 at 6:12 PM, Lance Norskog  wrote:
> What does analysis.jsp show?
>
> On Thu, Sep 2, 2010 at 5:53 AM, Jason Rutherglen
>  wrote:
>> I'm having a different issue with the EdgeNGram technique described
>> here: 
>> http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/
>>
>> That is one word queries q=app on the query_text field, work fine
>> however "q=app mou" do not.  Why would this be or is there a
>> configuration that could be missing?
>>
>> On Wed, Sep 1, 2010 at 3:53 PM, Eric Grobler  
>> wrote:
>>> Thanks for your feedback Robert,
>>>
>>> I will try that and see how Solr performs on my data - I think I will create
>>> a field that contains only important key/product terms from the text.
>>>
>>> Regards
>>> Johan
>>>
>>> On Wed, Sep 1, 2010 at 9:12 PM, Robert Petersen  wrote:
>>>
>>>> We don't have that many, just a hundred thousand, and solr response
>>>> times (since the index's docs are small and not complex) are logged as
>>>> typically 1 ms if not 0 ms.  It's funny but sometimes it is so fast no
>>>> milliseconds have elapsed.  Incredible if you ask me...  :)
>>>>
>>>> Once you get SOLR to consider the whole phrase as just one big term, the
>>>> wildcard is very fast.
>>>>
>>>> -Original Message-
>>>> From: Eric Grobler [mailto:impalah...@googlemail.com]
>>>> Sent: Wednesday, September 01, 2010 12:35 PM
>>>> To: solr-user@lucene.apache.org
>>>> Subject: Re: Auto Suggest
>>>>
>>>> Hi Robert,
>>>>
>>>> Interesting approach, how many documents do you have in Solr?
>>>> I have about 2 million and I just wonder if it might be a bit slow.
>>>>
>>>> Regards
>>>> Johan
>>>>
>>>> On Wed, Sep 1, 2010 at 7:38 PM, Robert Petersen 
>>>> wrote:
>>>>
>>>> > I do this by replacing the spaces with a '%' in a separate search
>>>> field
>>>> > which is not parsed nor tokenized and then you can wildcard across the
>>>> > whole phrase like you want and the spaces don't mess you up.  Just
>>>> store
>>>> > the original phrase with spaces in a separate field for returning to
>>>> the
>>>> > front end for display.
>>>> >
>>>> > -Original Message-
>>>> > From: Jazz Globe [mailto:jazzgl...@hotmail.com]
>>>> > Sent: Wednesday, September 01, 2010 7:33 AM
>>>> > To: solr-user@lucene.apache.org
>>>> > Subject: Auto Suggest
>>>> >
>>>> >
>>>> > Hallo
>>>> >
>>>> > How would one implement a multiple term auto-suggest feature in Solr
>>>> > that is filter sensitive?
>>>> > For example, a user enters :
>>>> > "mp3"
>>>> >  and solr might suggest:
>>>> >  ->   "mp3 player"
>>>> >  ->   "mp3 nano"
>>>> >  ->   "mp3 sony"
>>>> > and then the user starts the second word :
>>>> > "mp3 n"
>>>> > and that narrows it down to:
>>>> >  -> "mp3 nano"
>>>> >
>>>> > I had a quick look at the Terms Component.
>>>> > I suppose it just returns term totals for the entire index and cannot
>>>> be
>>>> > used with a filter or query?
>>>> >
>>>> > Thanks
>>>> > Johan
>>>> >
>>>> >
>>>> >
>>>>
>>>
>>
>
>
>
> --
> Lance Norskog
> goks...@gmail.com
>


Re: Auto Suggest

2010-09-03 Thread Jan Høydahl / Cominvent
Are you phrasing the query, like &q="app mou" ?
I guess with edgeNgram you use KeywordTokenizer which stores phrases as single 
terms.
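
A small Python sketch of why quoting can matter, assuming the guess above holds (the whole phrase is indexed as one token and then edge-n-grammed) and that the query side does not n-gram. It is a simplification of Solr's real query parsing, and the function names and the sample phrase "apple mouse" are illustrative only.

```python
def edge_ngrams(term: str, min_gram: int = 1, max_gram: int = 25) -> set:
    """All prefixes of `term` between min_gram and max_gram characters long."""
    return {term[:n] for n in range(min_gram, min(len(term), max_gram) + 1)}

def matches(indexed_grams: set, query_terms: list) -> bool:
    """True when every query term is one of the indexed grams."""
    return all(term in indexed_grams for term in query_terms)

# With a keyword tokenizer the whole phrase is ONE token, so the grams are
# prefixes of the phrase: "a", "ap", ..., "apple", "apple ", "apple m", ...
grams = edge_ngrams("apple mouse")

unquoted = "apple mou".split()   # the parser splits on whitespace: ["apple", "mou"]
quoted = ["apple mou"]           # a quoted phrase stays a single term

unquoted_hit = matches(grams, unquoted)   # False: "mou" is not a prefix of the phrase
quoted_hit = matches(grams, quoted)       # True: "apple mou" is a prefix of the phrase
```

Under the same assumptions "app mou" cannot match even when quoted, since it is not a prefix of "apple mouse".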

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Training in Europe - www.solrtraining.com

On 2. sep. 2010, at 14.53, Jason Rutherglen wrote:

> I'm having a different issue with the EdgeNGram technique described
> here: 
> http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/
> 
> That is one word queries q=app on the query_text field, work fine
> however "q=app mou" do not.  Why would this be or is there a
> configuration that could be missing?
> 
> On Wed, Sep 1, 2010 at 3:53 PM, Eric Grobler  
> wrote:
>> Thanks for your feedback Robert,
>> 
>> I will try that and see how Solr performs on my data - I think I will create
>> a field that contains only important key/product terms from the text.
>> 
>> Regards
>> Johan
>> 
>> On Wed, Sep 1, 2010 at 9:12 PM, Robert Petersen  wrote:
>> 
>>> We don't have that many, just a hundred thousand, and solr response
>>> times (since the index's docs are small and not complex) are logged as
>>> typically 1 ms if not 0 ms.  It's funny but sometimes it is so fast no
>>> milliseconds have elapsed.  Incredible if you ask me...  :)
>>> 
>>> Once you get SOLR to consider the whole phrase as just one big term, the
>>> wildcard is very fast.
>>> 
>>> -Original Message-
>>> From: Eric Grobler [mailto:impalah...@googlemail.com]
>>> Sent: Wednesday, September 01, 2010 12:35 PM
>>> To: solr-user@lucene.apache.org
>>> Subject: Re: Auto Suggest
>>> 
>>> Hi Robert,
>>> 
>>> Interesting approach, how many documents do you have in Solr?
>>> I have about 2 million and I just wonder if it might be a bit slow.
>>> 
>>> Regards
>>> Johan
>>> 
>>> On Wed, Sep 1, 2010 at 7:38 PM, Robert Petersen 
>>> wrote:
>>> 
>>>> I do this by replacing the spaces with a '%' in a separate search
>>> field
>>>> which is not parsed nor tokenized and then you can wildcard across the
>>>> whole phrase like you want and the spaces don't mess you up.  Just
>>> store
>>>> the original phrase with spaces in a separate field for returning to
>>> the
>>>> front end for display.
>>>> 
>>>> -Original Message-
>>>> From: Jazz Globe [mailto:jazzgl...@hotmail.com]
>>>> Sent: Wednesday, September 01, 2010 7:33 AM
>>>> To: solr-user@lucene.apache.org
>>>> Subject: Auto Suggest
>>>> 
>>>> 
>>>> Hallo
>>>> 
>>>> How would one implement a multiple term auto-suggest feature in Solr
>>>> that is filter sensitive?
>>>> For example, a user enters :
>>>> "mp3"
>>>>  and solr might suggest:
>>>>  ->   "mp3 player"
>>>>  ->   "mp3 nano"
>>>>  ->   "mp3 sony"
>>>> and then the user starts the second word :
>>>> "mp3 n"
>>>> and that narrows it down to:
>>>>  -> "mp3 nano"
>>>> 
>>>> I had a quick look at the Terms Component.
>>>> I suppose it just returns term totals for the entire index and cannot
>>> be
>>>> used with a filter or query?
>>>> 
>>>> Thanks
>>>> Johan
>>>> 
>>>> 
>>>> 
>>> 
>> 



Re: Auto Suggest

2010-09-02 Thread Lance Norskog
What does analysis.jsp show?

On Thu, Sep 2, 2010 at 5:53 AM, Jason Rutherglen
 wrote:
> I'm having a different issue with the EdgeNGram technique described
> here: 
> http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/
>
> That is one word queries q=app on the query_text field, work fine
> however "q=app mou" do not.  Why would this be or is there a
> configuration that could be missing?
>
> On Wed, Sep 1, 2010 at 3:53 PM, Eric Grobler  
> wrote:
>> Thanks for your feedback Robert,
>>
>> I will try that and see how Solr performs on my data - I think I will create
>> a field that contains only important key/product terms from the text.
>>
>> Regards
>> Johan
>>
>> On Wed, Sep 1, 2010 at 9:12 PM, Robert Petersen  wrote:
>>
>>> We don't have that many, just a hundred thousand, and solr response
>>> times (since the index's docs are small and not complex) are logged as
>>> typically 1 ms if not 0 ms.  It's funny but sometimes it is so fast no
>>> milliseconds have elapsed.  Incredible if you ask me...  :)
>>>
>>> Once you get SOLR to consider the whole phrase as just one big term, the
>>> wildcard is very fast.
>>>
>>> -Original Message-
>>> From: Eric Grobler [mailto:impalah...@googlemail.com]
>>> Sent: Wednesday, September 01, 2010 12:35 PM
>>> To: solr-user@lucene.apache.org
>>> Subject: Re: Auto Suggest
>>>
>>> Hi Robert,
>>>
>>> Interesting approach, how many documents do you have in Solr?
>>> I have about 2 million and I just wonder if it might be a bit slow.
>>>
>>> Regards
>>> Johan
>>>
>>> On Wed, Sep 1, 2010 at 7:38 PM, Robert Petersen 
>>> wrote:
>>>
>>> > I do this by replacing the spaces with a '%' in a separate search
>>> field
>>> > which is not parsed nor tokenized and then you can wildcard across the
>>> > whole phrase like you want and the spaces don't mess you up.  Just
>>> store
>>> > the original phrase with spaces in a separate field for returning to
>>> the
>>> > front end for display.
>>> >
>>> > -Original Message-
>>> > From: Jazz Globe [mailto:jazzgl...@hotmail.com]
>>> > Sent: Wednesday, September 01, 2010 7:33 AM
>>> > To: solr-user@lucene.apache.org
>>> > Subject: Auto Suggest
>>> >
>>> >
>>> > Hallo
>>> >
>>> > How would one implement a multiple term auto-suggest feature in Solr
>>> > that is filter sensitive?
>>> > For example, a user enters :
>>> > "mp3"
>>> >  and solr might suggest:
>>> >  ->   "mp3 player"
>>> >  ->   "mp3 nano"
>>> >  ->   "mp3 sony"
>>> > and then the user starts the second word :
>>> > "mp3 n"
>>> > and that narrows it down to:
>>> >  -> "mp3 nano"
>>> >
>>> > I had a quick look at the Terms Component.
>>> > I suppose it just returns term totals for the entire index and cannot
>>> be
>>> > used with a filter or query?
>>> >
>>> > Thanks
>>> > Johan
>>> >
>>> >
>>> >
>>>
>>
>



-- 
Lance Norskog
goks...@gmail.com


Re: Auto Suggest

2010-09-02 Thread Jason Rutherglen
I'm having a different issue with the EdgeNGram technique described
here: 
http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/

That is one word queries q=app on the query_text field, work fine
however "q=app mou" do not.  Why would this be or is there a
configuration that could be missing?

On Wed, Sep 1, 2010 at 3:53 PM, Eric Grobler  wrote:
> Thanks for your feedback Robert,
>
> I will try that and see how Solr performs on my data - I think I will create
> a field that contains only important key/product terms from the text.
>
> Regards
> Johan
>
> On Wed, Sep 1, 2010 at 9:12 PM, Robert Petersen  wrote:
>
>> We don't have that many, just a hundred thousand, and solr response
>> times (since the index's docs are small and not complex) are logged as
>> typically 1 ms if not 0 ms.  It's funny but sometimes it is so fast no
>> milliseconds have elapsed.  Incredible if you ask me...  :)
>>
>> Once you get SOLR to consider the whole phrase as just one big term, the
>> wildcard is very fast.
>>
>> -Original Message-
>> From: Eric Grobler [mailto:impalah...@googlemail.com]
>> Sent: Wednesday, September 01, 2010 12:35 PM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Auto Suggest
>>
>> Hi Robert,
>>
>> Interesting approach, how many documents do you have in Solr?
>> I have about 2 million and I just wonder if it might be a bit slow.
>>
>> Regards
>> Johan
>>
>> On Wed, Sep 1, 2010 at 7:38 PM, Robert Petersen 
>> wrote:
>>
>> > I do this by replacing the spaces with a '%' in a separate search
>> field
>> > which is not parsed nor tokenized and then you can wildcard across the
>> > whole phrase like you want and the spaces don't mess you up.  Just
>> store
>> > the original phrase with spaces in a separate field for returning to
>> the
>> > front end for display.
>> >
>> > -Original Message-
>> > From: Jazz Globe [mailto:jazzgl...@hotmail.com]
>> > Sent: Wednesday, September 01, 2010 7:33 AM
>> > To: solr-user@lucene.apache.org
>> > Subject: Auto Suggest
>> >
>> >
>> > Hallo
>> >
>> > How would one implement a multiple term auto-suggest feature in Solr
>> > that is filter sensitive?
>> > For example, a user enters :
>> > "mp3"
>> >  and solr might suggest:
>> >  ->   "mp3 player"
>> >  ->   "mp3 nano"
>> >  ->   "mp3 sony"
>> > and then the user starts the second word :
>> > "mp3 n"
>> > and that narrows it down to:
>> >  -> "mp3 nano"
>> >
>> > I had a quick look at the Terms Component.
>> > I suppose it just returns term totals for the entire index and cannot
>> be
>> > used with a filter or query?
>> >
>> > Thanks
>> > Johan
>> >
>> >
>> >
>>
>


Re: Auto Suggest

2010-09-01 Thread Eric Grobler
Thanks for your feedback Robert,

I will try that and see how Solr performs on my data - I think I will create
a field that contains only important key/product terms from the text.

Regards
Johan

On Wed, Sep 1, 2010 at 9:12 PM, Robert Petersen  wrote:

> We don't have that many, just a hundred thousand, and solr response
> times (since the index's docs are small and not complex) are logged as
> typically 1 ms if not 0 ms.  It's funny but sometimes it is so fast no
> milliseconds have elapsed.  Incredible if you ask me...  :)
>
> Once you get SOLR to consider the whole phrase as just one big term, the
> wildcard is very fast.
>
> -Original Message-
> From: Eric Grobler [mailto:impalah...@googlemail.com]
> Sent: Wednesday, September 01, 2010 12:35 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Auto Suggest
>
> Hi Robert,
>
> Interesting approach, how many documents do you have in Solr?
> I have about 2 million and I just wonder if it might be a bit slow.
>
> Regards
> Johan
>
> On Wed, Sep 1, 2010 at 7:38 PM, Robert Petersen 
> wrote:
>
> > I do this by replacing the spaces with a '%' in a separate search
> field
> > which is not parsed nor tokenized and then you can wildcard across the
> > whole phrase like you want and the spaces don't mess you up.  Just
> store
> > the original phrase with spaces in a separate field for returning to
> the
> > front end for display.
> >
> > -Original Message-----
> > From: Jazz Globe [mailto:jazzgl...@hotmail.com]
> > Sent: Wednesday, September 01, 2010 7:33 AM
> > To: solr-user@lucene.apache.org
> > Subject: Auto Suggest
> >
> >
> > Hallo
> >
> > How would one implement a multiple term auto-suggest feature in Solr
> > that is filter sensitive?
> > For example, a user enters :
> > "mp3"
> >  and solr might suggest:
> >  ->   "mp3 player"
> >  ->   "mp3 nano"
> >  ->   "mp3 sony"
> > and then the user starts the second word :
> > "mp3 n"
> > and that narrows it down to:
> >  -> "mp3 nano"
> >
> > I had a quick look at the Terms Component.
> > I suppose it just returns term totals for the entire index and cannot
> be
> > used with a filter or query?
> >
> > Thanks
> > Johan
> >
> >
> >
>


RE: Auto Suggest

2010-09-01 Thread Robert Petersen
We don't have that many, just a hundred thousand, and solr response
times (since the index's docs are small and not complex) are logged as
typically 1 ms if not 0 ms.  It's funny but sometimes it is so fast no
milliseconds have elapsed.  Incredible if you ask me...  :)

Once you get SOLR to consider the whole phrase as just one big term, the
wildcard is very fast.

-Original Message-
From: Eric Grobler [mailto:impalah...@googlemail.com] 
Sent: Wednesday, September 01, 2010 12:35 PM
To: solr-user@lucene.apache.org
Subject: Re: Auto Suggest

Hi Robert,

Interesting approach, how many documents do you have in Solr?
I have about 2 million and I just wonder if it might be a bit slow.

Regards
Johan

On Wed, Sep 1, 2010 at 7:38 PM, Robert Petersen 
wrote:

> I do this by replacing the spaces with a '%' in a separate search
field
> which is not parsed nor tokenized and then you can wildcard across the
> whole phrase like you want and the spaces don't mess you up.  Just
store
> the original phrase with spaces in a separate field for returning to
the
> front end for display.
>
> -Original Message-
> From: Jazz Globe [mailto:jazzgl...@hotmail.com]
> Sent: Wednesday, September 01, 2010 7:33 AM
> To: solr-user@lucene.apache.org
> Subject: Auto Suggest
>
>
> Hallo
>
> How would one implement a multiple term auto-suggest feature in Solr
> that is filter sensitive?
> For example, a user enters :
> "mp3"
>  and solr might suggest:
>  ->   "mp3 player"
>  ->   "mp3 nano"
>  ->   "mp3 sony"
> and then the user starts the second word :
> "mp3 n"
> and that narrows it down to:
>  -> "mp3 nano"
>
> I had a quick look at the Terms Component.
> I suppose it just returns term totals for the entire index and cannot
be
> used with a filter or query?
>
> Thanks
> Johan
>
>
>


Re: Auto Suggest

2010-09-01 Thread Eric Grobler
Hi Robert,

Interesting approach, how many documents do you have in Solr?
I have about 2 million and I just wonder if it might be a bit slow.

Regards
Johan

On Wed, Sep 1, 2010 at 7:38 PM, Robert Petersen  wrote:

> I do this by replacing the spaces with a '%' in a separate search field
> which is not parsed nor tokenized and then you can wildcard across the
> whole phrase like you want and the spaces don't mess you up.  Just store
> the original phrase with spaces in a separate field for returning to the
> front end for display.
>
> -Original Message-
> From: Jazz Globe [mailto:jazzgl...@hotmail.com]
> Sent: Wednesday, September 01, 2010 7:33 AM
> To: solr-user@lucene.apache.org
> Subject: Auto Suggest
>
>
> Hallo
>
> How would one implement a multiple term auto-suggest feature in Solr
> that is filter sensitive?
> For example, a user enters :
> "mp3"
>  and solr might suggest:
>  ->   "mp3 player"
>  ->   "mp3 nano"
>  ->   "mp3 sony"
> and then the user starts the second word :
> "mp3 n"
> and that narrows it down to:
>  -> "mp3 nano"
>
> I had a quick look at the Terms Component.
> I suppose it just returns term totals for the entire index and cannot be
> used with a filter or query?
>
> Thanks
> Johan
>
>
>


RE: Auto Suggest

2010-09-01 Thread Robert Petersen
I do this by replacing the spaces with a '%' in a separate search field
which is not parsed nor tokenized and then you can wildcard across the
whole phrase like you want and the spaces don't mess you up.  Just store
the original phrase with spaces in a separate field for returning to the
front end for display.
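
A minimal Python sketch of this approach, with an in-memory list standing in for the Solr index. The field names display and search_key and the helper names are hypothetical; in Solr the prefix match would be a wildcard query against the untokenized field.

```python
def to_search_key(phrase: str) -> str:
    """Replace spaces with '%' so the whole phrase behaves as one term."""
    return phrase.replace(" ", "%")

# One record per phrase: the key for matching, the original text for display.
PHRASES = ["mp3 player", "mp3 nano", "mp3 sony"]
DOCS = [{"display": p, "search_key": to_search_key(p)} for p in PHRASES]

def suggest(typed: str) -> list:
    """Emulate the wildcard query search_key:<typed with % for spaces>*."""
    key = to_search_key(typed)
    return [doc["display"] for doc in DOCS if doc["search_key"].startswith(key)]
```

With the sample data, suggest("mp3") returns all three phrases and suggest("mp3 n") narrows the list to "mp3 nano", which is the behaviour asked for in the original question.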

-Original Message-
From: Jazz Globe [mailto:jazzgl...@hotmail.com] 
Sent: Wednesday, September 01, 2010 7:33 AM
To: solr-user@lucene.apache.org
Subject: Auto Suggest


Hallo

How would one implement a multiple term auto-suggest feature in Solr
that is filter sensitive?
For example, a user enters :
"mp3"
  and solr might suggest:
  ->   "mp3 player"
  ->   "mp3 nano"
  ->   "mp3 sony"
and then the user starts the second word :
"mp3 n"
and that narrows it down to:
  -> "mp3 nano"

I had a quick look at the Terms Component.
I suppose it just returns term totals for the entire index and cannot be
used with a filter or query?

Thanks
Johan

  


Auto Suggest

2010-09-01 Thread Jazz Globe

Hallo

How would one implement a multiple term auto-suggest feature in Solr that is 
filter sensitive?
For example, a user enters :
"mp3"
  and solr might suggest:
  ->   "mp3 player"
  ->   "mp3 nano"
  ->   "mp3 sony"
and then the user starts the second word :
"mp3 n"
and that narrows it down to:
  -> "mp3 nano"

I had a quick look at the Terms Component.
I suppose it just returns term totals for the entire index and cannot be used 
with a filter or query?

Thanks
Johan

  

Re: Auto suggest with spell check

2010-08-05 Thread Grijesh.singh

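Here are the steps for getting auto-suggest and spellcheck in a single query.
Change the TermsComponent section in solrconfig.xml as follows:

<searchComponent name="termsComponent" class="org.apache.solr.handler.component.TermsComponent"/>

<requestHandler name="/terms" class="org.apache.solr.handler.component.SearchHandler">
  <lst name="defaults">
    <bool name="terms">true</bool>
  </lst>
  <arr name="components">
    <str>termsComponent</str>
    <str>spellcheck</str>
  </arr>
</requestHandler>

Use the query format below to get both auto-suggest and spellcheck
suggestions:
http://localhost:8983/solr/terms?terms.fl=text&terms.prefix=computr&spellcheck.q=computr&spellcheck=true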
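
A small Python sketch that builds this request for whatever the user has typed, so the same string is sent as both the terms prefix and the spellcheck input. The function name and defaults are illustrative; the host, handler path and field name are the ones from the example URL.

```python
from urllib.parse import urlencode

def terms_spellcheck_url(typed: str,
                         field: str = "text",
                         base: str = "http://localhost:8983/solr/terms") -> str:
    """Build one request that asks for term suggestions and a spelling suggestion."""
    params = [
        ("terms.fl", field),       # field to pull terms from
        ("terms.prefix", typed),   # auto-suggest: terms starting with the input
        ("spellcheck.q", typed),   # spellcheck: the same input, checked for typos
        ("spellcheck", "true"),
    ]
    return base + "?" + urlencode(params)

url = terms_spellcheck_url("computr")
```

Using urlencode also escapes spaces and other special characters in the typed text, which a hand-built URL would not.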
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Auto-suggest-with-spell-check-tp1015114p1025688.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Auto-suggest internal terms

2010-06-03 Thread Michael Kuhlmann
On 03.06.2010 16:45, Andrzej Bialecki wrote:
> You are right to a certain degree. Still, there are some contention
> points in Lucene/Solr, how threads are allocated on available CPU-s, and
> how the heap is used, which can make a two-JVM setup perform much better
> than a single-JVM setup given the same number of threads...

Allow me to not believe this! ;-) It's not Solr that allocates threads,
it's the web server (Jetty, Glassfish, or whatever). In a normal
configuration, it will use as many threads as are useful, so there's no
need to start a second web server on the same machine.

As for Lucene, there is some magic algorithm that lets a limited number of
threads reuse an IndexReader (as far as I have seen in the code, but the
details are unimportant). But at the very least, if you have a multi-core
setup, you'll get special IndexReader instances from Lucene per core. So
I don't see why you should scatter them across different VMs.

Greetings,
Michael


Re: Auto-suggest internal terms

2010-06-03 Thread Andrzej Bialecki
On 2010-06-03 13:38, Michael Kuhlmann wrote:
> On 03.06.2010 13:02, Andrzej Bialecki wrote:
>> ..., and deploy this
>> index in a separate JVM (to benefit from other CPUs than the one that
>> runs your Solr core)
> 
> Every known web server is multithreaded by default, so putting different
> Solr instances into different JVMs will be of no use.


You are right to a certain degree. Still, there are some contention
points in Lucene/Solr, how threads are allocated on available CPU-s, and
how the heap is used, which can make a two-JVM setup perform much better
than a single-JVM setup given the same number of threads...


-- 
Best regards,
Andrzej Bialecki <><
 ___. ___ ___ ___ _ _   __
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com



Re: Auto-suggest internal terms

2010-06-03 Thread Michael Kuhlmann
On 03.06.2010 13:02, Andrzej Bialecki wrote:
> ..., and deploy this
> index in a separate JVM (to benefit from other CPUs than the one that
> runs your Solr core)

Every known web server is multithreaded by default, so putting different
Solr instances into different JVMs will be of no use.

-Michael


Re: Auto-suggest internal terms

2010-06-03 Thread Andrzej Bialecki
On 2010-06-03 09:56, Michael Kuhlmann wrote:
> The only solution without "doing any custom work" would be to perform a
> normal query for each suggestion. But you might get into performance
> troubles with that, because suggestions are typically performed much
> more often than complete searches.

Actually, that's not a bad idea - if you can trim the size of the index
(either by using shingles instead of docs, or trimming the main index -
LUCENE-1812) so that the index fits completely in RAM, and deploy this
index in a separate JVM (to benefit from other CPUs than the one that
runs your Solr core) or another machine, then I think performance would
not be a big concern, and the functionality would be just what you wanted.

> 
> The much faster solution, which requires custom work, would be to build up a
> large TreeMap with each word as a key and the matching terms as the
> values.

That would consume an awful lot of RAM... see SOLR-1316 for some
measurements.


-- 
Best regards,
Andrzej Bialecki <><
 ___. ___ ___ ___ _ _   __
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com



Re: Auto-suggest internal terms

2010-06-03 Thread Michael Kuhlmann
The only solution without "doing any custom work" would be to perform a
normal query for each suggestion. But you might get into performance
troubles with that, because suggestions are typically performed much
more often than complete searches.

The much faster solution, which requires custom work, would be to build up a
large TreeMap with each word as a key and the matching terms as the
values.
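
A rough Python sketch of such a map, using a dict plus a sorted key list and bisect as a stand-in for a Java TreeMap. The sample phrases come from the question quoted below plus one made-up non-matching entry, and the function names are illustrative. As noted elsewhere in this thread, a structure like this can consume a lot of RAM on a large vocabulary.

```python
import bisect
from collections import defaultdict

def build_index(phrases):
    """Map every word to the phrases containing it; also return the sorted word list."""
    index = defaultdict(list)
    for phrase in phrases:
        for word in set(phrase.lower().split()):
            index[word].append(phrase)
    return index, sorted(index)

def suggest(index, keys, prefix, limit=10):
    """Collect phrases for every word that starts with `prefix`, in sorted word order."""
    prefix = prefix.lower()
    results = []
    # bisect jumps to the first word >= prefix, like TreeMap.tailMap in Java.
    for key in keys[bisect.bisect_left(keys, prefix):]:
        if not key.startswith(prefix):
            break  # sorted order: no later word can match
        for phrase in index[key]:
            if phrase not in results:
                results.append(phrase)
    return results[:limit]

PHRASES = ["french wine classes", "wine book discounts", "burgundy wine", "cheese plates"]
INDEX, KEYS = build_index(PHRASES)
```

Here suggest(INDEX, KEYS, "wine") returns the three phrases containing "wine", wherever the word appears in the phrase.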

-Michael

On 02.06.2010 22:01, Jay Hill wrote:
> I've got a situation where I'm looking to build an auto-suggest where any
> term entered will lead to suggestions. For example, if I type "wine" I want
> to see suggestions like this:
> 
> french *wine* classes
> *wine* book discounts
> burgundy *wine*
> 
> etc.
> 
> I've tried some tricks with shingles, but the only solution that worked was
> pre-processing my queries into a core in all variations.
> 
> Anyone know any tricks to accomplish this in Solr without doing any custom
> work?
> 
> -Jay
> 


RE: Auto-suggest internal terms

2010-06-02 Thread Tim Gilbert
I was interested in the same thing and stumbled upon this article:

http://www.mattweber.org/2009/05/02/solr-autosuggest-with-termscomponent-and-jquery/

I haven't followed through, but it looked promising to me.

Tim

-Original Message-
From: Jay Hill [mailto:jayallenh...@gmail.com] 
Sent: Wednesday, June 02, 2010 4:02 PM
To: solr-user@lucene.apache.org
Subject: Auto-suggest internal terms

I've got a situation where I'm looking to build an auto-suggest where any
term entered will lead to suggestions. For example, if I type "wine" I want
to see suggestions like this:

french *wine* classes
*wine* book discounts
burgundy *wine*

etc.

I've tried some tricks with shingles, but the only solution that worked was
pre-processing my queries into a core in all variations.

Anyone know any tricks to accomplish this in Solr without doing any custom
work?

-Jay


RE: Auto-suggest internal terms

2010-06-02 Thread Patrick Wilson
I'm painfully new to Solr so please be gentle if my suggestion is terrible!

Could you use highlighting to do this? Take the first n results from a query 
and show their highlights, customizing the highlights to show the desired 
number of words.

Just a thought.
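
A rough sketch of what such a request could look like, in Python. The
endpoint URL and the field name "title" are assumptions made for
illustration; hl, hl.fl, hl.snippets and hl.fragsize are standard Solr
highlighting parameters. Note that hl.fragsize is measured in characters,
not words, so the "desired number of words" can only be approximated.

```python
from urllib.parse import urlencode

SOLR_SELECT_URL = "http://localhost:8983/solr/select"  # hypothetical endpoint


def highlight_params(user_input, field="title", rows=5, fragsize=40):
    """Request parameters asking Solr for short highlighted snippets."""
    return {
        # The input is not escaped here; real code should escape query syntax.
        "q": f"{field}:{user_input}",
        "rows": rows,             # take the first n results
        "hl": "true",             # turn highlighting on
        "hl.fl": field,           # field to build snippets from
        "hl.snippets": 1,         # one snippet per document
        "hl.fragsize": fragsize,  # approximate snippet length in characters
        "wt": "json",
    }


def highlight_url(user_input, **kwargs):
    """Full request URL for the parameters above."""
    return SOLR_SELECT_URL + "?" + urlencode(highlight_params(user_input, **kwargs))
```

The snippets in the highlighting section of the response would then be
shown as suggestions. Each keystroke still costs a full query, so this may
be too slow for a busy site.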

Patrick

-Original Message-
From: Jay Hill [mailto:jayallenh...@gmail.com]
Sent: Wednesday, June 02, 2010 4:02 PM
To: solr-user@lucene.apache.org
Subject: Auto-suggest internal terms

I've got a situation where I'm looking to build an auto-suggest where any
term entered will lead to suggestions. For example, if I type "wine" I want
to see suggestions like this:

french *wine* classes
*wine* book discounts
burgundy *wine*

etc.

I've tried some tricks with shingles, but the only solution that worked was
pre-processing my queries into a core in all variations.

Anyone know any tricks to accomplish this in Solr without doing any custom
work?

-Jay


Auto-suggest internal terms

2010-06-02 Thread Jay Hill
I've got a situation where I'm looking to build an auto-suggest where any
term entered will lead to suggestions. For example, if I type "wine" I want
to see suggestions like this:

french *wine* classes
*wine* book discounts
burgundy *wine*

etc.

I've tried some tricks with shingles, but the only solution that worked was
pre-processing my queries into a core in all variations.

Anyone know any tricks to accomplish this in Solr without doing any custom
work?

-Jay


Re: multi term, multi field, auto suggest

2010-02-01 Thread Lukas Kahwe Smith

On 01.02.2010, at 13:27, Lukas Kahwe Smith wrote:

> 
> On 29.01.2010, at 15:40, Lukas Kahwe Smith wrote:
> 
>> I am still a bit unsure how to handle both the lowercased and the case 
>> preserved version:
>> 
>> So here are some examples:
>> UBS => ubs|UBS
>> Kreuzstrasse => kreuzstrasse|Kreuzstrasse
>> 
>> So when I type "Kreu" I would get a suggestion of "Kreuzstrasse" and with 
>> "kreu" I would get "kreuzstrasse".
>> Since I do not expect any words to start with a lowercase letter and still 
>> contain some upper case letter we should be fine with this approach.
>> 
>> As in I doubt there would be stuff like "fooBar" which would lead to 
>> suggestion both "foobar" and "fooBar".
>> 
>> How can I achieve this?
> 
> 
> I just noticed that I need the same thing for the word delimiter splitter. As 
> in some way to index both the splitted and the unsplitted version so that I 
> can use it in a facet search.
> 
> Hans-Peter => Hans|Peter|Hans-Peter


Sorry for the monologue.
I did see 
http://www.mail-archive.com/solr-user@lucene.apache.org/msg29786.html, which 
suggests a solution for lowercase indexing with mixed-case suggestions: 
concatenate the lowercased version and the original version, with a 
separator between them.

I guess I could just feed in the same data multiple times and apply the 
[indexterm]|[original] approach in userland somehow.

For example, "Hans-Peter" would be turned into 3 documents:
hans|Hans-Peter
peter|Hans-Peter
hans-peter|Hans-Peter

This solution would be quite cool indeed, since I could suggest "Hans-Peter" if 
someone searches for "Peter".
Since I will only use this for a prefix search, I could set the query analyzer 
to lowercase the search and it should find the results. I can then add some 
logic to the frontend display code to split off the suggested original term.
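
A minimal sketch of the userland part in Python. The function names are
hypothetical, and the in-memory list only stands in for the suggest index;
in Solr the prefix match would be done by the index itself.

```python
import re


def expand(original):
    """Build '[indexterm]|[original]' entries for one value."""
    lowered = original.lower()
    # Split on non-word characters (such as '-') and on underscores.
    parts = [p for p in re.split(r"[\W_]+", lowered) if p]
    entries = []
    for term in parts + [lowered]:
        entry = f"{term}|{original}"
        if entry not in entries:  # 'UBS' would otherwise yield 'ubs|UBS' twice
            entries.append(entry)
    return entries


def display_value(entry):
    """Split off the original term for display in the frontend."""
    return entry.split("|", 1)[1]


def suggest(entries, typed):
    """Lowercase the input, prefix-match the index term, return originals."""
    prefix = typed.lower()
    found = []
    for entry in entries:
        index_term = entry.split("|", 1)[0]
        if index_term.startswith(prefix) and display_value(entry) not in found:
            found.append(display_value(entry))
    return found
```

With expand("Hans-Peter") as the entries, typing "Pet" or "han" both lead to
the single suggestion "Hans-Peter".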

I am not aware of any magic inside the schema.xml that could do this work for 
me though. I am using the DataImportHandler to load the documents. I guess I 
could simply run the query multiple times, but that would screw up the indexing 
of the non auto suggest index. Then again, maybe I want to totally separate the 
two anyway.

regards,
Lukas Kahwe Smith
m...@pooteeweet.org





Re: multi term, multi field, auto suggest

2010-02-01 Thread Lukas Kahwe Smith

On 29.01.2010, at 15:40, Lukas Kahwe Smith wrote:

> I am still a bit unsure how to handle both the lowercased and the case 
> preserved version:
> 
> So here are some examples:
> UBS => ubs|UBS
> Kreuzstrasse => kreuzstrasse|Kreuzstrasse
> 
> So when I type "Kreu" I would get a suggestion of "Kreuzstrasse" and with 
> "kreu" I would get "kreuzstrasse".
> Since I do not expect any words to start with a lowercase letter and still 
> contain some upper case letter we should be fine with this approach.
> 
> As in I doubt there would be stuff like "fooBar" which would lead to 
> suggestion both "foobar" and "fooBar".
> 
> How can I achieve this?


I just noticed that I need the same thing for the word delimiter splitter, 
i.e. some way to index both the split and the unsplit version so that I can 
use it in a facet search.

Hans-Peter => Hans|Peter|Hans-Peter

regards,
Lukas Kahwe Smith
m...@pooteeweet.org




