Re: Auto-Suggest within Tier Architecture
Hi Brett,

We, at IndiaMART, have Solr installed behind PHP servers, which are in turn behind Varnish servers. Yes, you are right: exposing the Solr URL is not a good idea. A single service in between would do the trick.

You can try our service at dir.indiamart.com. We have a client-side JS that handles AJAX requests per keystroke.

Do let me know if you have any other queries. :)

On Mon, 3 Feb 2020 at 22:10, Moyer, Brett wrote:

> Hello,
>
> Looking to see how others accomplished this goal. We have a 3 Tier
> architecture, Solr is down deep in T3 far from the end user. How do you
> make Auto-Suggest calls from the Internet Browser through the Tiers down to
> Solr in T3? We essentially created steps down each tier, but I'm looking to
> know what other approaches people have created. Did you put your solr in
> T1, I assume not, that would put it at risk. Thanks!
>
> Brett Moyer

--
Regards,
Paras Lehana
Development Engineer, Auto-Suggest,
IndiaMART InterMESH Ltd
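The "single service in between" idea can be sketched roughly as follows. This is a minimal illustration, not IndiaMART's actual implementation: the internal host `solr.internal`, the core name `products`, and the dictionary name `default` are hypothetical, and the `/suggest` handler with `suggest.q` assumes a request handler backed by Solr's SuggestComponent. The point is that the browser only ever supplies the typed prefix, while the middle tier builds the Solr URL itself.

```python
from urllib.parse import urlencode

# Hypothetical internal address; it is never exposed to the browser.
SOLR_SUGGEST_URL = "http://solr.internal:8983/solr/products/suggest"
MAX_PREFIX_LENGTH = 50


def build_suggest_url(user_input, count=10):
    """Build the internal Solr suggest URL from untrusted browser input.

    Only the typed prefix comes from the user; every other parameter is
    fixed by the service, so callers cannot reach other Solr handlers.
    """
    # Collapse whitespace and cap the length before forwarding anything.
    prefix = " ".join(user_input.split())[:MAX_PREFIX_LENGTH]
    if not prefix:
        return None
    params = {
        "suggest": "true",
        "suggest.dictionary": "default",  # hypothetical dictionary name
        "suggest.q": prefix,
        "suggest.count": str(count),
        "wt": "json",
    }
    return SOLR_SUGGEST_URL + "?" + urlencode(params)
```

The PHP (or any other) tier would call something like `build_suggest_url` per keystroke, fetch the result from Solr, and return only the suggestion strings to the client-side JS.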
Auto-Suggest within Tier Architecture
Hello,

Looking to see how others accomplished this goal. We have a 3 Tier architecture; Solr is down deep in T3, far from the end user. How do you make Auto-Suggest calls from the Internet Browser through the Tiers down to Solr in T3? We essentially created steps down each tier, but I'm looking to know what other approaches people have created. Did you put your Solr in T1? I assume not, as that would put it at risk. Thanks!

Brett Moyer

This e-mail may contain confidential or privileged information. If you are not the intended recipient, please notify the sender immediately and then delete it.

TIAA
Re: Type of auto suggest feature
Hey Artur,

If I have understood correctly, you want to suggest terms related to the query. It would be helpful if you described the use case as well. Anyway, please go through the following:

1. Keep different forms of a word as different documents so that each can be suggested ("closed", "close" and "closing" should be different docs). Use stemming (Snowball Porter Stemmer Filter <https://lucene.apache.org/solr/guide/8_3/filter-descriptions.html#snowball-porter-stemmer-filter>) so that docs with different forms can be matched.
2. The "interesting" terms are probably related terms in your case, which can be addressed with the Synonym factory. Again, the related terms should be in different documents. Add all the related words to the synonyms file, separated by commas.
3. Will your query only have single terms? If not, how do you want to handle multiple terms? This may require a few more analyzers and some tweaking of the query.
4. If you still want to suggest terms for partial words (to suggest "closing" if the query is "clo"), use Edge NGrams <https://lucene.apache.org/solr/guide/8_3/tokenizers.html#edge-n-gram-tokenizer>. Use the Standard Tokenizer <https://lucene.apache.org/solr/guide/8_3/tokenizers.html#Tokenizers-StandardTokenizer> to split words. What do you want to achieve with the Shingle factory?
5. I think all of the above can be handled without the Suggester component. Anyway, keep exploring different ways.

Please do tell me if you have any queries.

On Sun, 24 Nov 2019 at 19:11, Rudenko, Artur wrote:

> Hi,
> I am quite new to Solr and I am interested in implementing a sort of auto
> terms suggest (not auto complete) feature based on the user query.
> A user builds a query (on multiple fields), and I am trying to help him
> refine it by suggesting more terms based on his current query.
> The suggestions should contain synonyms and different word forms
> (query: close, result: closed, closing) and also some other "interesting"
> (hard to define what interesting is) terms and phrases based on that search.
>
> The queries are performed on a text field with about 1000 words, on document
> sets of about 20-50M.
>
> So far I came up with a solution that uses the Suggester component over the
> 1000 words text field (copy field) as shown below, and I'm trying to find how
> to add to it more "interesting" terms and phrases based on the text field
>
> type="text_total_shingle_synonyms" indexed="true" stored="true"
> termVectors="true" termOffsets="true" termPositions="true" required="false"
> multiValued="true" />
>
> positionIncrementGap="100">
> protected="protwords.txt"/>
> maxShingleSize="4" />
> synonyms="synonyms_suggest.txt" ignoreCase="true" expand="false"/>
> protected="protwords.txt"/>
>
> Thanks,
> Artur Rudenko

--
Regards,
Paras Lehana
Development Engineer, Auto-Suggest,
IndiaMART Intermesh Ltd.
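Point 4 of the reply above (edge n-grams for partial words) can be illustrated outside Solr. The following is a rough Python sketch of the idea only, not Solr's Edge N-Gram implementation: `edge_ngrams`, `build_prefix_index`, and the `min_gram`/`max_gram` defaults are made up for the example. Each indexed term is expanded into its leading prefixes, so a partial query such as "clo" becomes an exact lookup.

```python
from collections import defaultdict


def edge_ngrams(term, min_gram=2, max_gram=10):
    """Return the leading prefixes of `term` between min_gram and max_gram characters."""
    upper = min(len(term), max_gram)
    return [term[:n] for n in range(min_gram, upper + 1)]


def build_prefix_index(terms, min_gram=2, max_gram=10):
    """Map every edge n-gram to the sorted list of terms that start with it."""
    index = defaultdict(set)
    for term in terms:
        for gram in edge_ngrams(term.lower(), min_gram, max_gram):
            index[gram].add(term)
    return {gram: sorted(matches) for gram, matches in index.items()}


index = build_prefix_index(["close", "closed", "closing", "cloud"])
print(index.get("clo", []))   # every term that starts with "clo"
print(index.get("clos", []))  # "cloud" no longer matches
```

The trade-off is index size: every term is stored once per prefix length, which is why a minimum and maximum gram size are normally configured.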
Type of auto suggest feature
Hi,

I am quite new to Solr and I am interested in implementing a sort of auto terms suggest (not auto complete) feature based on the user query. A user builds a query (on multiple fields), and I am trying to help him refine it by suggesting more terms based on his current query. The suggestions should contain synonyms and different word forms (query: close, result: closed, closing) and also some other "interesting" (hard to define what interesting is) terms and phrases based on that search.

The queries are performed on a text field with about 1000 words, on document sets of about 20-50M.

So far I came up with a solution that uses the Suggester component over the 1000 words text field (copy field) as shown below, and I'm trying to find how to add to it more "interesting" terms and phrases based on the text field.

Thanks,
Artur Rudenko

This electronic message may contain proprietary and confidential information of Verint Systems Inc., its affiliates and/or subsidiaries. The information is intended to be for the use of the individual(s) or entity(ies) named above. If you are not the intended recipient (or authorized to receive this e-mail for the intended recipient), you may not use, copy, disclose or distribute to anyone this message or any information contained in this message. If you have received this electronic message in error, please notify us by replying to this e-mail.
Re: Auto-suggest in Solr
Thank you so much. I'll read up on that and try that out.

Regards,
Edwin

On 12 July 2015 at 00:41, Erick Erickson wrote:

> Cool! I've bookmarked it, much more thorough
>
> Erick
Re: Auto-suggest in Solr
Cool! I've bookmarked it, much more thorough

Erick

On Sat, Jul 11, 2015 at 8:13 AM, Walter Underwood wrote:

> Thanks, this is very helpful.
>
> Suggester config is quite under documented. It took me longer than I expected
> to get it working.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/ (my blog)
Re: Auto-suggest in Solr
Thanks, this is very helpful.

Suggester config is quite under documented. It took me longer than I expected to get it working.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)

On Jul 10, 2015, at 6:30 PM, Alessandro Benedetti wrote:

> Hi guys,
> just wrote a blog to integrate Erick's post and to explain in details with
> practical examples all the main Lookup implementations :
>
> http://alexbenedetti.blogspot.co.uk/2015/07/solr-you-complete-me.html
>
> I think this can be useful for Edwin to finally fix the config for the
> FreeTextSuggester ( which finally I clarified Erick, thanks to Mike answer
> in dev, and deep code analysis and testing :) )
>
> Cheers
Re: Auto-suggest in Solr
Hi guys,

I just wrote a blog post to complement Erick's post and to explain in detail, with practical examples, all the main Lookup implementations:

http://alexbenedetti.blogspot.co.uk/2015/07/solr-you-complete-me.html

I think this can be useful for Edwin to finally fix the config for the FreeTextSuggester (which I have finally clarified, Erick, thanks to Mike's answer on the dev list and some deep code analysis and testing :) )

Cheers

2015-06-27 23:51 GMT+01:00 Alessandro Benedetti:

> Thanks, Erick, i didn't have time to go again through the code.
> But i will forward this to the Dev list.
> Thank you for your time !
>
> Cheers
Re: Auto-suggest in Solr
Thanks, Erick, I didn't have time to go through the code again.
But I will forward this to the Dev list.
Thank you for your time!

Cheers

2015-06-27 16:19 GMT+01:00 Erick Erickson:

> Alessandro:
>
> Going to have to defer to Mike McCandless et.al., they're the
> authorities here. Don't quite know whether they monitor this list,
> consider the dev list?
>
> Best,
> Erick
Re: Auto-suggest in Solr
Alessandro:

Going to have to defer to Mike McCandless et al.; they're the authorities here. I don't quite know whether they monitor this list, so consider the dev list?

Best,
Erick

On Fri, Jun 26, 2015 at 4:53 AM, Alessandro Benedetti wrote:

> Up, Can anyone gently take a look to my considerations related the FreeText
> Suggester ?
> I am curious to have more insight.
> Eventually I will deeply analyse the code to understand my errors.
>
> Cheers
Re: Auto-suggest in Solr
Up. Can anyone kindly take a look at my considerations regarding the FreeText Suggester?
I am curious to have more insight.
Eventually I will analyse the code in depth to understand my errors.

Cheers

2015-06-19 11:53 GMT+01:00 Alessandro Benedetti:

> Actually the documentation is not clear enough.
> Let's try to understand this suggester.
>
> *Building*
> This suggester builds an FST that it will use to provide the autocomplete
> feature, running prefix searches on it.
> The terms it uses to generate the FST are the tokens produced by the
> "suggestFreeTextAnalyzerFieldType".
>
> And this should be correct.
> So if we have a shingle token filter [1-3] (we produce unigrams as well)
> in our analysis, to keep it simple, from these original field values:
> "mp3 ipod"
> "mp3 player"
> "mp3 player ipod"
> "player of Real"
>
> -> we produce this list of possible suggestions in our FST:
>
>
> From the documentation I read:
>
>> "ngrams: The max number of tokens out of which singles will be make the
>> dictionary. The default value is 2. Increasing this would mean you want
>> more than the previous 2 tokens to be taken into consideration when making
>> the suggestions."
>
> This makes me confused, as I was not expecting this param to affect the
> suggestion dictionary.
> So I would like a clarification here from our masters :)
> At this point let's see what happens at query time.
>
> *Query Time*
> As I understand it, the ngrams param will consider the last N-1 tokens
> the user entered, separated by spaces.
>
>> "Builds an ngram model from the text sent to {@link
>> #build} and predicts based on the last grams-1 tokens in
>> the request sent to {@link #lookup}. This tries to
>> handle the "long tail" of suggestions for when the
>> incoming query is a never before seen query string."
>
> Example: grams=3 should consider only the last 2 tokens
>
> special mp3 p -> mp3 p
>
> Then this query is analysed using the "suggestFreeTextAnalyzerFieldType".
> We produce 3 tokens:
>
>
> And we run the prefix matching on the FST.
>
> *Conclusion*
> My understanding is wrong for sure at some point, as the behaviour I get
> is different.
> Can we discuss this, clarify this and eventually put it in the official
> documentation?
>
> Cheers
>
> 2015-06-19 6:40 GMT+01:00 Zheng Lin Edwin Yeo:
>
>> I'm implementing an auto-suggest feature in Solr, and I'd like to achieve
>> the following:
>>
>> For example, if the user enters "mp3", Solr might suggest "mp3 player",
>> "mp3 nano" and "mp3 music".
>> When the user enters "mp3 p", the suggestion should narrow down to "mp3
>> player".
>>
>> Currently, when I type "mp3 p", the suggester is returning words that
>> start with the letter "p" only, and I'm getting results like "plan",
>> "production", etc., and it does not take the "mp3" token into
>> consideration.
>>
>> I'm using Solr 5.1 and below is my configuration:
>>
>> In solrconfig.xml:
>>
>> FreeTextLookupFactory
>> suggester_freetext_dir
>> DocumentDictionaryFactory
>> Suggestion
>> Project
>> suggestType
>> 5
>> false
>> false
>>
>> In schema.xml:
>>
>> positionIncrementGap="100">
>> pattern="[^a-zA-Z0-9]" replacement=" " />
>> maxShingleSize="6" outputUnigrams="false"/>
>> pattern="[^a-zA-Z0-9]" replacement=" " />
>> maxShingleSize="6" outputUnigrams="true"/>
>>
>> Is there anything that I configured wrongly?
>>
>> Regards,
>> Edwin

--
Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England
Re: Auto-suggest in Solr
Can any of our beloved super gurus take a look at my mail? It could help Edwin as well :)

Cheers

2015-06-19 11:53 GMT+01:00 Alessandro Benedetti :
> Actually the documentation is not clear enough.
> Let's try to understand this suggester.
> [...]

--
Benedetti Alessandro
Re: Auto-suggest in Solr
Ok sure.

> " ngrams: The max number of tokens out of which singles will be make the
> dictionary. The default value is 2. Increasing this would mean you want
> more than the previous 2 tokens to be taken into consideration when making
> the suggestions. "

I got confused by this, as I could not get this behaviour when I use the suggester. Since the default value is 2, I would expect a search for "mp3 p" to return only suggestions that contain "mp3 ...", not everything that starts with the letter "p". But I have only been getting suggestions that start with "p".

Even when I try a bigger ngrams value with a longer search, I get the same results: the suggester considers only the last token when giving suggestions. I still have not managed to get suggestions that take 2 or more tokens into account.

So am I actually going in the right direction with this?

Regards,
Edwin

On 19 June 2015 at 18:53, Alessandro Benedetti wrote:
> Actually the documentation is not clear enough.
> Let's try to understand this suggester.
> [...]
Re: Auto-suggest in Solr
Actually the documentation is not clear enough.
Let's try to understand this suggester.

*Building*
This suggester builds an FST that it uses to provide the autocomplete feature by running prefix searches on it.
The terms it uses to generate the FST are the tokens produced by the "suggestFreeTextAnalyzerFieldType".

And this should be correct.
So, to keep it simple, if we have a shingle token filter [1-3] in our analysis (we produce unigrams as well), from these original field values:
"mp3 ipod"
"mp3 player"
"mp3 player ipod"
"player of Real"

-> we produce this list of possible suggestions in our FST:

From the documentation I read:

> " ngrams: The max number of tokens out of which singles will be make the
> dictionary. The default value is 2. Increasing this would mean you want
> more than the previous 2 tokens to be taken into consideration when making
> the suggestions. "

This confuses me, as I was not expecting this param to affect the suggestion dictionary.
So I would like a clarification here from our masters :)
At this point let's see what happens at query time.

*Query Time*
As I understand it, the ngrams param makes the suggester consider only the last N-1 tokens the user typed, separated by spaces.

> "Builds an ngram model from the text sent to {@link #build} and predicts
> based on the last grams-1 tokens in the request sent to {@link #lookup}.
> This tries to handle the "long tail" of suggestions for when the incoming
> query is a never before seen query string."

Example: grams=3 should consider only the last 2 tokens

special mp3 p -> mp3 p

Then this query is analysed using the "suggestFreeTextAnalyzerFieldType".
We produce 3 tokens:

And we run the prefix matching on the FST.

*Conclusion*
My understanding is surely wrong at some point, as the behaviour I get is different.
Can we discuss this, clarify it and eventually put it in the official documentation?

Cheers

2015-06-19 6:40 GMT+01:00 Zheng Lin Edwin Yeo :
> I'm implementing an auto-suggest feature in Solr, and I'll like to achieve
> the follwing:
> [...]

--
Benedetti Alessandro
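
To make the reading in the message above concrete, here is a minimal Python sketch of that mental model: shingles of 1 to 3 words form the dictionary, and a lookup prefix-matches on the last grams-1 tokens of the query. This is not how FreeTextLookupFactory is implemented; the function names (`shingles`, `build_dictionary`, `lookup`) are hypothetical, and lowercasing the tokens is an assumption about the analysis chain.

```python
def shingles(text, min_size=1, max_size=3):
    """Return every word n-gram of text with min_size..max_size tokens."""
    tokens = text.lower().split()
    out = []
    for start in range(len(tokens)):
        for size in range(min_size, max_size + 1):
            if start + size <= len(tokens):
                out.append(" ".join(tokens[start:start + size]))
    return out


def build_dictionary(values, max_size=3):
    """Collect the distinct shingles of all field values, sorted."""
    entries = set()
    for value in values:
        entries.update(shingles(value, 1, max_size))
    return sorted(entries)


def lookup(dictionary, query, grams=3):
    """Prefix-match on the last grams-1 tokens of the query (at least one)."""
    keep = max(grams - 1, 1)
    context = " ".join(query.lower().split()[-keep:])
    return [entry for entry in dictionary if entry.startswith(context)]


# Field values taken from the message; everything else is illustrative.
values = ["mp3 ipod", "mp3 player", "mp3 player ipod", "player of Real"]
dictionary = build_dictionary(values)
suggestions = lookup(dictionary, "special mp3 p", grams=3)
print(dictionary)
print(suggestions)
```

Under this model, grams=2 keeps only the final token "p", so every entry starting with "p" matches, which resembles what Edwin reports; whether the real suggester behaves this way is exactly the open question of the thread.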
Auto-suggest in Solr
I'm implementing an auto-suggest feature in Solr, and I'd like to achieve the following:

For example, if the user enters "mp3", Solr might suggest "mp3 player", "mp3 nano" and "mp3 music".
When the user enters "mp3 p", the suggestions should narrow down to "mp3 player".

Currently, when I type "mp3 p", the suggester returns words that start with the letter "p" only, so I get results like "plan", "production", etc.; it does not take the "mp3" token into consideration.

I'm using Solr 5.1 and below is my configuration:

In solrconfig.xml:

FreeTextLookupFactory
suggester_freetext_dir
DocumentDictionaryFactory
Suggestion
Project
suggestType
5
false
false

In schema.xml:

positionIncrementGap="100">
pattern="[^a-zA-Z0-9]" replacement=" " />
maxShingleSize="6" outputUnigrams="false"/>
pattern="[^a-zA-Z0-9]" replacement=" " />
maxShingleSize="6" outputUnigrams="true"/>

Is there anything that I configured wrongly?

Regards,
Edwin
Re: Auto suggest with adding accents
Has anyone found a solution to this problem?

--
View this message in context: http://lucene.472066.n3.nabble.com/Auto-suggest-with-adding-accents-tp4150379p4150972.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Auto suggest with adding accents
Hello,

With the new suggester, it does not work when the field is multiValued="true". I need to try the patch "LUCENE-3842" to test autocomplete, but I don't know how: I have Solr-4.7.2, not the source code. Can someone help?

Best regards,
Anass BENJELLOUN
Re: Auto suggest with adding accents
Perhaps the actual suggester module is a better fit then:
http://blog.mikemccandless.com/2012/09/lucenes-new-analyzing-suggester.html
http://romiawasthy.blogspot.fi/2014/06/configure-solr-suggester.html

Also:
http://jayant7k.blogspot.com/2014/03/an-interesting-suggester-in-solr.html

Regards,
Alex.
Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853

On Fri, Aug 1, 2014 at 3:21 PM, Otis Gospodnetic wrote:
> Aha. I don't know if Solr Suggester can do that. Let's see what others
> say. I know http://www.sematext.com/products/autocomplete/ could do that.
> [...]
Re: Auto suggest with adding accents
Aha. I don't know if Solr Suggester can do that. Let's see what others say. I know http://www.sematext.com/products/autocomplete/ could do that.

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/

On Fri, Aug 1, 2014 at 9:26 AM, benjelloun wrote:
> hello,
> [...]
> but what i need is q="gene" auto suggestion give "genève" with accent like
> correction of word.
> [...]
Re: Auto suggest with adding accents
Hello,

You did not quite understand my problem; let me give an example. The document contains the word "genève".
q="gene": auto-suggestion gives "geneve"
q="genè": auto-suggestion gives "genève"

But what I need is for q="gene" to suggest "genève" with the accent, like a correction of the word.
I tried adding a spellchecker to correct it, but the maximum number of characters it corrects is 2.
Maybe there is another solution. Here is my field schema:

positionIncrementGap="100" omitNorms="true">
ignoreCase="true"/>
class="solr.StandardTokenizerFactory"/>replacement="$2"/>-->
ignoreCase="true"/>

Thanks, best regards,
Anass BENJELLOUN

2014-07-31 18:41 GMT+02:00 Otis Gospodnetic-5 [via Lucene]:
> You need to do the opposite. Make sure accents are NOT removed at index &
> query time.
> [...]
Re: Auto suggest with adding accents
You need to do the opposite. Make sure accents are NOT removed at index & query time.

Otis

On Thu, Jul 31, 2014 at 5:49 PM, benjelloun wrote:
> hi,
>
> q="gene" it suggest "geneve"
> ASCIIFoldingFilter work like isolate accent
>
> what i need to suggest is "genève"
>
> any idea?
Re: Auto suggest with adding accents
Hi,

With q="gene" it now suggests "geneve".
ASCIIFoldingFilter works, but it strips the accent.

What I need it to suggest is "genève".

Any idea?

Thanks,
Best regards,
Anass BENJELLOUN
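
The behaviour asked for here (match without accents, but return the accented original) can be sketched in a few lines of Python: the folded form is used only as the lookup key, and the original surface form is what gets returned. This is only an illustration of the idea, not Solr code; as far as I understand it is also the principle behind the analyzing suggester, but treat that as an assumption. The names `fold`, `build_index` and `suggest` are hypothetical.

```python
import unicodedata
from bisect import bisect_left


def fold(text):
    """Lowercase and strip combining accents, e.g. 'Genève' -> 'geneve'."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def build_index(terms):
    """Sorted list of (folded key, original term) pairs."""
    return sorted((fold(term), term) for term in terms)


def suggest(index, prefix, limit=5):
    """Match on the folded key, but return the original accented terms."""
    key = fold(prefix)
    start = bisect_left(index, (key,))  # first pair whose key is >= the prefix
    results = []
    for folded, original in index[start:]:
        if not folded.startswith(key):
            break
        results.append(original)
        if len(results) == limit:
            break
    return results


# Sample terms are made up for the illustration.
index = build_index(["genève", "général", "genetic", "paris"])
print(suggest(index, "gene"))
print(suggest(index, "genè"))
```

Both "gene" and "genè" fold to the same key, so they return the same suggestions, each with its accents intact.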
Re: Auto suggest with adding accents
Hi,

What happens when you add ASCIIFoldingFilter to the field type definition of suggestField?

Ahmet

On Thursday, July 31, 2014 5:49 PM, benjelloun wrote:
> Hello,
>
> i'm trying to autosuggest frensh word with accents, but if the user write
> q="gene" it will not suggest "genève", it will suggest "general","genetic" ...
> [...]
Auto suggest with adding accents
Hello,

I'm trying to autosuggest French words with accents, but if the user writes q="gene" it will not suggest "genève"; it will suggest "general", "genetic" ...

suggestDic
org.apache.solr.spelling.suggest.Suggester
org.apache.solr.spelling.suggest.fst.WFSTLookupFactory
suggestFolder
suggestField
true
true
suggest/emptyDic.txt
textSuggest
suggests
true
suggestDic
true
6
true
6
true
suggests

The field "suggestField" does not strip accents.

Thanks for your help,

Best regards,
Anass BENJELLOUN
Re: Auto Suggest
Hello Erick,

So in your opinion, what is the solution for using autosuggest with whole sentences? :) An example would be very helpful.

Thanks,
Best regards,
Anass BENJELLOUN
Re: Auto Suggest
No, although there's been some joy with using shingles. Autosuggest works off of the _indexed tokens_. So the problem is really reducing the tokenization to something that is multi-word.

Best,
Erick

On Thu, Jul 24, 2014 at 5:11 AM, benjelloun wrote:
> Hello,
>
> Did solr.SuggestComponent work on MultiValued Field to Auto suggest not
> only one word but the whole sentence?
>
> indexed="true"/>
>
> Regards,
> Anass BENJELLOUN
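
Since suggestions come from the indexed tokens, what a prefix can match depends entirely on how the field is tokenized. The Python sketch below contrasts word-level tokens with one token per field value; it only illustrates the point and is not Solr code. The function names and the sample titles are made up.

```python
def word_tokens(values):
    """One token per word, as a word-level tokenizer would produce."""
    return sorted({word for value in values for word in value.lower().split()})


def whole_value_tokens(values):
    """One token per field value, keeping multi-word entries intact."""
    return sorted({value.lower().strip() for value in values})


def prefix_match(tokens, prefix):
    """Return every token that starts with the typed prefix."""
    prefix = prefix.lower()
    return [token for token in tokens if token.startswith(prefix)]


# Hypothetical values of a multivalued field.
titles = ["Solr in Action", "Solr Cookbook", "Lucene in Action"]
by_word = prefix_match(word_tokens(titles), "so")
by_value = prefix_match(whole_value_tokens(titles), "so")
print(by_word)
print(by_value)
```

With word-level tokens the prefix "so" can only ever return the single word "solr"; with whole values as tokens it returns complete multi-word entries. Shingles sit in between the two.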
Auto Suggest
Hello,

Does solr.SuggestComponent work on a multiValued field, to auto-suggest not just one word but the whole sentence?

Regards,
Anass BENJELLOUN
Re: Auto Suggest - Time decay
Sorry, I forgot the link:

[1] - http://wiki.apache.org/solr/SolrRelevancyFAQ

- Original message -
From: "Ing. Jorge Luis Betancourt Gonzalez"
To: solr-user@lucene.apache.org
Sent: Tuesday, 1 October 2013 13:34:03
Subject: Re: Auto Suggest - Time decay

> For that core just use a boost factor as explained on [1]:
> [...]

III International Winter School at UCI, 17 to 28 February 2014. See www.uci.cu
Re: Auto Suggest - Time decay
For that core, just use a boost factor as explained in [1].

You could use a query like this to see (before making any change) how your suggestions will be retrieved. In this case a query for "goog" is made, and recent documents are boosted (newer documents get an extra bonus):

http://localhost:8983/solr/select?q={!boost b=recip(ms(NOW,manufacturedate_dt),3.16e-11,1,1)}goog

If this is enough for you, you could put the boost parameter in your request handler and make it even simpler, so that any query against this particular request handler is automatically boosted by date.

PS: You could tweak the formula used in the boost parameter to better suit your needs.

- Original message -
From: "SolrLover"
To: solr-user@lucene.apache.org
Sent: Tuesday, 1 October 2013 12:19:51
Subject: Re: Auto Suggest - Time decay

> I am using a totally separate core for storing the auto suggest keywords.
>
> Would you be able to send me some more details on your implementation?
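
To get a feel for the numbers in that boost, here is a small Python sketch of the same arithmetic. Solr's recip(x,m,a,b) evaluates a/(m*x+b), and ms(NOW,field) is the document age in milliseconds; 3.16e-11 is roughly 1 divided by the number of milliseconds in a year. The helper name `date_boost` is made up for the illustration.

```python
MS_PER_DAY = 24 * 60 * 60 * 1000


def recip(x, m, a, b):
    """Same formula as Solr's recip function: a / (m*x + b)."""
    return a / (m * x + b)


def date_boost(age_days):
    """Boost for a document of the given age, using the constants from the query above."""
    return recip(age_days * MS_PER_DAY, 3.16e-11, 1, 1)


for age in (0, 30, 365, 730):
    print(age, round(date_boost(age), 3))
```

A brand new document gets a multiplier of 1, a one-year-old document about 0.5, and a two-year-old one about 0.33, so the decay is hyperbolic rather than exponential. Increasing m makes older documents fade faster.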
Re: Auto Suggest - Time decay
I am using a totally separate core for storing the auto suggest keywords.

Would you be able to send me some more details on your implementation?
Re: Auto Suggest - Time decay
Are you using the suggester component, or a separate core? I've used a separate core to store suggestions (queries performed on the frontend) and ordered them using a time decay function, and it works great for me.

Regards,

- Original message -
From: "SolrLover"
To: solr-user@lucene.apache.org
Sent: Tuesday, 1 October 2013 12:12:13
Subject: Auto Suggest - Time decay

> I am trying to implement an auto suggest based on time decay function. I
> have a separate index just to store auto suggest keywords.
> [...]
Auto Suggest - Time decay
I am trying to implement an auto-suggest based on a time decay function. I have a separate index just to store auto-suggest keywords.

I would be calculating the frequency over time rather than ranking on frequency alone.

I am thinking of using a database to perform the calculation and updating the Solr index with a boost calculated from the time decay function. I am not sure if there is a better way to do this...

I need to boost the terms based on their frequency over time. For example, when someone searches for 'apple' 1 times during an iPhone launch (one particular day), that shouldn't make 'apple' always come up in the auto-suggestions when someone types the keyword 'a'; rather, it should lose its popularity exponentially.

Does anyone have any suggestions?
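
One way to picture "frequency over time" is an exponentially decayed count, where each day's searches are weighted by their age. The Python sketch below is only an assumption about what such a calculation could look like: the function name `decayed_score`, the 7-day half-life and the sample numbers are all invented for the illustration.

```python
import math


def decayed_score(events, now, half_life_days=7.0):
    """Sum of daily counts, each halved for every half_life_days of age.

    events is an iterable of (day, count) pairs; now is the current day.
    """
    rate = math.log(2) / half_life_days
    return sum(count * math.exp(-rate * (now - day)) for day, count in events)


# A one-day spike versus a term searched steadily over the last 30 days.
spike = [(0, 1000)]
steady = [(day, 50) for day in range(31, 61)]

print(round(decayed_score(spike, now=0), 1))
print(round(decayed_score(spike, now=60), 1))
print(round(decayed_score(steady, now=60), 1))
```

Right after the spike the spiking term dominates, but two months later its score has shrunk to a small fraction, while the steadily searched term keeps a high score. The resulting value could then be stored as the weight or boost of each keyword document.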
Re: Auto-Suggest, spell check dictionary replication to slave issue
Seems like this feature is still yet to be implemented.. https://issues.apache.org/jira/browse/SOLR-866 -- View this message in context: http://lucene.472066.n3.nabble.com/Auto-Suggest-spell-check-dictionary-replication-to-slave-issue-tp4068562p4068739.html Sent from the Solr - User mailing list archive at Nabble.com.
Auto-Suggest, spell check dictionary replication to slave issue
Hi All, We create 2 dictionaries from an indexed field for the auto-suggest and spell check features. When we configured replication from master to slaves, the index replicates properly but the auto-suggest and spell check dictionaries do not. Is there a way to replicate the auto-suggest and spell check dictionaries outside the index directory? Please suggest. -- View this message in context: http://lucene.472066.n3.nabble.com/Auto-Suggest-spell-check-dictionary-replication-to-slave-issue-tp4068562.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Where is the auto-suggest function gone?
Hello Upayavira, thanks for your reply. In the example I can see the suggestions "dollar" and "dock" when I type "do" in Solritas (http://localhost:8983/solr/collection1/browse?q=). I already changed the field "name" of the spellchecker, because I checked the name field in the admin section and my indexed content had no data in it, so there was nothing to suggest. Then I checked which field does contain data and put that field name into the spellcheck field setting, but nothing happened - still no suggestions ... -- View this message in context: http://lucene.472066.n3.nabble.com/Where-is-the-auto-suggest-function-gone-tp4045520p4045531.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Where is the auto-suggest function gone?
Are you thinking of spellchecking? Where are you seeing suggestions? If you are thinking of spellchecking, by default the spellchecker uses the 'name' field, and you have likely indexed into the 'text' field, hence no results being returned. Upayavira On Thu, Mar 7, 2013, at 01:12 PM, alecx wrote: > Hi, > > I just indexed the sample documents in the exampledocs folder and saw the > search suggestions when I search for something in /browse. > Afterwards I deleted the index (like described..) and indexed a folder of > html+pdf files. Searching works but there are no suggestions. > What I need to adjust to make this work again? > > Thanks in advance. > > > > -- > View this message in context: > http://lucene.472066.n3.nabble.com/Where-is-the-auto-suggest-function-gone-tp4045520.html > Sent from the Solr - User mailing list archive at Nabble.com.
Where is the auto-suggest function gone?
Hi, I just indexed the sample documents in the exampledocs folder and saw the search suggestions when I searched for something in /browse. Afterwards I deleted the index (as described..) and indexed a folder of html+pdf files. Searching works but there are no suggestions. What do I need to adjust to make this work again? Thanks in advance. -- View this message in context: http://lucene.472066.n3.nabble.com/Where-is-the-auto-suggest-function-gone-tp4045520.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Prefix (facet.prefix) based auto-suggest on Multi-Valued field do not return results
Hi, I think this is because of the space observed - facet.prefix= "empty string" - Please see below 3 3 3 3 3 ... so on *But why is this space inserted?* If you see below, the list of keywords taken from search results , there is no space. Thanks & Regards Rajani On Fri, Aug 17, 2012 at 3:02 PM, Rajani Maski wrote: > Hi All, > > * When I do facet.prefix on a * KEYWORDS *field(this field is multi > valued) , I don't get suggestion for the first key in this field . * > > Example : > > I have 2 documents with the field "KEYWORDS" containing multiple values. > > > 偏振式3D成像原理 > 采用LED边缘发光的新技术 > 高级降噪运算法及画质增强技术可 > > > > 紧凑机身,轻松携带 > 节能低耗,持久续航 > > > > > If I do on next following strings - I get respective suggestions. > > BUT If I do facet.prefix on red colored string - facet.field=KEYWORDS& > facet.prefix=偏振 : there are no suggestions. > > > > What can be the reason? > > > > > > Thanks & Regards > Rajani > > > >
Prefix (facet.prefix) based auto-suggest on Multi-Valued field do not return results
Hi All, When I do facet.prefix on a KEYWORDS field (this field is multi valued), I don't get suggestions for the first key in this field. Example: I have 2 documents with the field "KEYWORDS" containing multiple values. 偏振式3D成像原理 采用LED边缘发光的新技术 高级降噪运算法及画质增强技术可 紧凑机身,轻松携带 节能低耗,持久续航 If I do facet.prefix on the following strings, I get the respective suggestions. BUT if I do facet.prefix on the red colored string (the first value) - facet.field=KEYWORDS&facet.prefix=偏振 - there are no suggestions. What can be the reason? Thanks & Regards Rajani
Re: Auto suggest on indexed file content filtered based on user
On Wed, Apr 25, 2012 at 8:18 AM, prakash_ajp wrote: > Is it true that faceting is case sensitive? That would be disastrous for > our > requirement :( > > it depends on your schema definition: if you lower case your tokens both for index and query sides, the faceting should not be case sensitive. > -- > View this message in context: > http://lucene.472066.n3.nabble.com/Auto-suggest-on-indexed-file-content-filtered-based-on-user-tp3934565p3937370.html > Sent from the Solr - User mailing list archive at Nabble.com. > -- Regards, Dmitry Kan
Re: Auto suggest on indexed file content filtered based on user
Is it true that faceting is case sensitive? That would be disastrous for our requirement :( -- View this message in context: http://lucene.472066.n3.nabble.com/Auto-suggest-on-indexed-file-content-filtered-based-on-user-tp3934565p3937370.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Auto suggest on indexed file content filtered based on user
The first one may not work because the number of users can be big. Besides, the users can simply register themselves and start using it. It won't work if an admin has to intervene in the registration process. The second could work I guess. But the problem would be data duplication as users might also share permissions to same files and folders. I understand my requirement is a little complicated. -- View this message in context: http://lucene.472066.n3.nabble.com/Auto-suggest-on-indexed-file-content-filtered-based-on-user-tp3934565p3937368.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Auto suggest on indexed file content filtered based on user
Another option is to use faceting (via the facet.prefix param) for your auto-suggest. It's not as fast and scalable as using one of the Suggester implementations, but it does allow arbitrary fq parameters to be included in the request to limit the results. http://wiki.apache.org/solr/SimpleFacetParameters#Facet_prefix_.28term_suggest.29 Doug On 04/24/2012 04:30 PM, Erick Erickson wrote: I don't know if there is a really good solution here. The problem is that suggester (and the trunk FST version) simply traverse the terms in the index. there's not even a real concept of those terms belonging to any document. Since your security level is on a document basis, that makes things hard. How many users do you have? And do you ever expect to search across more than one user's files? If not, you could consider having one core per user. Then the suggestions would be correct and since the searches would be against the user's core, they'd never see any documents they didn't own. But that solution has some complexity involved, and if you have a zillion users it can be difficult to get right. You could consider having separate (dynamically-defined) fields that had the suggestion list for each individual user. that would be administratively easier. Then you suggestions would simply go against that user's suggestion field (suggestion_user1 e.g.). None of this is elegant, but this is not an elegant problem given how Solr is structured. Best Erick On Tue, Apr 24, 2012 at 2:31 PM, prakash_ajp wrote: I read on a couple of other web pages that fq is not supported for suggester. I even tried the query and it doesn't help. My understanding was, when the suggest (spellcheck) index is built, only the field chosen is considered for queries and the other fields from the main index are not available for filtering purposes once the index is created. 
-- View this message in context: http://lucene.472066.n3.nabble.com/Auto-suggest-on-indexed-file-content-filtered-based-on-user-tp3934565p3936144.html Sent from the Solr - User mailing list archive at Nabble.com.
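As a rough sketch of the facet.prefix approach Doug describes, the request below combines a prefix facet with a per-user fq. The field names Text and UserName are taken from the original question. Lowercasing the typed prefix is an assumption that only holds if the facet field is lowercased at index time. The helper names are made up for the example.

```python
from urllib.parse import urlencode, parse_qs


def quote_term(value):
    """Wrap a value in double quotes, escaping backslashes and quotes for the query parser."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def facet_suggest_query(typed, user, field="Text", user_field="UserName", limit=10):
    """Build the query string for a facet.prefix suggestion restricted to one user."""
    params = [
        ("q", "*:*"),
        ("rows", "0"),  # only the facet counts are needed, not the documents
        ("fq", f"{user_field}:{quote_term(user)}"),
        ("facet", "true"),
        ("facet.field", field),
        ("facet.prefix", typed.lower()),  # assumes a lowercased facet field
        ("facet.limit", str(limit)),
        ("facet.mincount", "1"),
        ("wt", "json"),
    ]
    return urlencode(params)


query_string = facet_suggest_query("Text", "prakash")
```

The query string would be appended to the core's select handler URL. Because the fq restricts the document set before facet counts are computed, terms that occur only in other users' files should not come back as suggestions.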
Re: Auto suggest on indexed file content filtered based on user
I don't know if there is a really good solution here. The problem is that suggester (and the trunk FST version) simply traverse the terms in the index. there's not even a real concept of those terms belonging to any document. Since your security level is on a document basis, that makes things hard. How many users do you have? And do you ever expect to search across more than one user's files? If not, you could consider having one core per user. Then the suggestions would be correct and since the searches would be against the user's core, they'd never see any documents they didn't own. But that solution has some complexity involved, and if you have a zillion users it can be difficult to get right. You could consider having separate (dynamically-defined) fields that had the suggestion list for each individual user. that would be administratively easier. Then you suggestions would simply go against that user's suggestion field (suggestion_user1 e.g.). None of this is elegant, but this is not an elegant problem given how Solr is structured. Best Erick On Tue, Apr 24, 2012 at 2:31 PM, prakash_ajp wrote: > I read on a couple of other web pages that fq is not supported for suggester. > I even tried the query and it doesn't help. My understanding was, when the > suggest (spellcheck) index is built, only the field chosen is considered for > queries and the other fields from the main index are not available for > filtering purposes once the index is created. > > -- > View this message in context: > http://lucene.472066.n3.nabble.com/Auto-suggest-on-indexed-file-content-filtered-based-on-user-tp3934565p3936144.html > Sent from the Solr - User mailing list archive at Nabble.com.
Re: Auto suggest on indexed file content filtered based on user
Yes, I believe only the field used to build the spellcheck index is available to the suggest query. Filtering documents in the search handler with the fq parameter and spell suggestions are the two separate parts we are discussing here. Let's say you have a field for spellcheck, populated using copyField and used to build the spell dictionary. Once the dictionary is created, reference the spellcheck component in the default search handler's 'last-components' section, like below: spellcheck. Then you will be able to apply both document filtering and spellcheck params to the search handler while querying. Detailed info: http://wiki.apache.org/solr/SpellCheckComponent [you have probably already gone through it :) ] -Jeevanandam On Apr 25, 2012, at 12:01 AM, prakash_ajp wrote: > I read on a couple of other web pages that fq is not supported for suggester. > I even tried the query and it doesn't help. My understanding was, when the > suggest (spellcheck) index is built, only the field chosen is considered for > queries and the other fields from the main index are not available for > filtering purposes once the index is created. > > -- > View this message in context: > http://lucene.472066.n3.nabble.com/Auto-suggest-on-indexed-file-content-filtered-based-on-user-tp3934565p3936144.html > Sent from the Solr - User mailing list archive at Nabble.com.
Re: Auto suggest on indexed file content filtered based on user
I read on a couple of other web pages that fq is not supported for suggester. I even tried the query and it doesn't help. My understanding was, when the suggest (spellcheck) index is built, only the field chosen is considered for queries and the other fields from the main index are not available for filtering purposes once the index is created. -- View this message in context: http://lucene.472066.n3.nabble.com/Auto-suggest-on-indexed-file-content-filtered-based-on-user-tp3934565p3936144.html Sent from the Solr - User mailing list archive at Nabble.com.
RE: Auto suggest on indexed file content filtered based on user
I'm new to Solr, but I would think the fq=[username] would work here. http://wiki.apache.org/solr/CommonQueryParameters#fq Mike -Original Message- From: prakash_ajp [mailto:prakash_...@yahoo.com] Sent: Tuesday, April 24, 2012 11:07 AM To: solr-user@lucene.apache.org Subject: Re: Auto suggest on indexed file content filtered based on user Right now, the query is a very simple one, something like q=text. Basically, it would return ['textview', 'textviewer', ..] But the issue is, the 'textviewer' could be from a file that is out of bounds for this user. So, ultimately I would like to include the userName in the query. As mentioned earlier, userName is another field in the main index. -- View this message in context: http://lucene.472066.n3.nabble.com/Auto-suggest-on-indexed-file-content-filtered-based-on-user-tp3934565p3935765.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Auto suggest on indexed file content filtered based on user
On Apr 24, 2012, at 9:37 PM, prakash_ajp wrote: > Right now, the query is a very simple one, something like q=text. Basically, > it would return ['textview', 'textviewer', ..] hmm, so you're using default query field > > But the issue is, the 'textviewer' could be from a file that is out of > bounds for this user. So, ultimately I would like to include the userName in > the query. As mentioned earlier, userName is another field in the main > index. and you like to filter the result set along with userName field value > > -- > View this message in context: > http://lucene.472066.n3.nabble.com/Auto-suggest-on-indexed-file-content-filtered-based-on-user-tp3934565p3935765.html > Sent from the Solr - User mailing list archive at Nabble.com. in this scenario 'fq' parameter will facilitate to achieve your desire result. Please refer http://wiki.apache.org/solr/CommonQueryParameters#fq try this q=text&fq=userName:"prakash" Let us know! -Jeevanandam
Re: Auto suggest on indexed file content filtered based on user
Right now, the query is a very simple one, something like q=text. Basically, it would return ['textview', 'textviewer', ..] But the issue is, the 'textviewer' could be from a file that is out of bounds for this user. So, ultimately I would like to include the userName in the query. As mentioned earlier, userName is another field in the main index. -- View this message in context: http://lucene.472066.n3.nabble.com/Auto-suggest-on-indexed-file-content-filtered-based-on-user-tp3934565p3935765.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Auto suggest on indexed file content filtered based on user
can you please share a sample query? -Jeevanandam On 24-04-2012 1:49 pm, prakash_ajp wrote: I am trying to implement an auto-suggest feature. The search feature already exists and searches on file content in user's allotted workspace. The following is from my schema that will be used for search indexing: The search result is filtered by the user name. The suggest is implemented as a searchComponent and the field 'Text' is used by the suggester and would have to be filtered the same way the search is done. The problem with this approach is, suggest works on a single field and there is no way to include the UserName field as a filter. What's the best way out from here? Thanks in advance! Jay -- View this message in context: http://lucene.472066.n3.nabble.com/Auto-suggest-on-indexed-file-content-filtered-based-on-user-tp3934565p3934565.html Sent from the Solr - User mailing list archive at Nabble.com.
Auto suggest on indexed file content filtered based on user
I am trying to implement an auto-suggest feature. The search feature already exists and searches on file content in user's allotted workspace. The following is from my schema that will be used for search indexing: The search result is filtered by the user name. The suggest is implemented as a searchComponent and the field 'Text' is used by the suggester and would have to be filtered the same way the search is done. The problem with this approach is, suggest works on a single field and there is no way to include the UserName field as a filter. What's the best way out from here? Thanks in advance! Jay -- View this message in context: http://lucene.472066.n3.nabble.com/Auto-suggest-on-indexed-file-content-filtered-based-on-user-tp3934565p3934565.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Facet auto-suggest
Hi, Sure, you can use filters and facets for this. Start a query with ...&facet.field=source&facet.field=topics&facet.field=type When you click a "button", you set the corresponding filter (fq=source:people), and the new query will return the same facets with new counts. In the Audi example, you would disable buttons with 0 hits in the facet count. For more in depth, see http://java.dzone.com/news/complex-solr-faceting -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com Solr Training - www.solrtraining.com On 17. jan. 2012, at 23:38, Jon Drukman wrote: > I don't even know what to call this feature. Here's a website that shows > the problem: > > http://pulse.audiusanews.com/pulse/index.php > > Notice that you can end up in a situation where there are no results. > For example, > in order, press: People, Performance, Technology, Photos. The client > wants it so that when you click a button, it disables buttons that would > lead to a dead end. In other words, after clicking Technology, the Photos > button would be disabled. > > Can Solr help with this? > > -jsd- >
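A small sketch of the client-side step Jan describes: after each click, read the facet counts from the response and disable every button whose count is 0. It assumes the default JSON layout, where each facet field is a flat list of alternating values and counts, and that facet.mincount is left at 0 so zero-count values are still returned. The sample response is invented for the example.

```python
def facet_counts(flat):
    """Turn a flat [value, count, value, count, ...] list into a dict."""
    return dict(zip(flat[0::2], flat[1::2]))


def disabled_buttons(facet_fields):
    """For each facet field, list the values that would lead to zero results."""
    return {
        field: sorted(value for value, count in facet_counts(flat).items() if count == 0)
        for field, flat in facet_fields.items()
    }


# Hypothetical facet_counts.facet_fields section after applying fq=topics:technology
sample = {
    "source": ["people", 12, "performance", 3],
    "type": ["videos", 7, "photos", 0],
}

to_disable = disabled_buttons(sample)
```

In this made-up response only the "photos" button would be disabled, which matches the Audi example from the question.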
Facet auto-suggest
I don't even know what to call this feature. Here's a website that shows the problem: http://pulse.audiusanews.com/pulse/index.php Notice that you can end up in a situation where there are no results. For example, in order, press: People, Performance, Technology, Photos. The client wants it so that when you click a button, it disables buttons that would lead to a dead end. In other words, after clicking Technology, the Photos button would be disabled. Can Solr help with this? -jsd-
RE: Ebay Kleinanzeigen and Auto Suggest
--- On Tue, 5/3/11, Charton, Andre wrote: > > yes we do. > > If you use a limit number of categories (like 100) you can > use dynamic fields with the termscomponent and by choosing a > category specific prefix, like: > > {schema.xml} > ... > indexed="true" stored="false" multiValued="true" > omitNorms="true"/> > ... > {schema.xml} > > And within data import handler we script prefix from given > category: > > {data-config.xml} > function > setCatPrefixFields(row) { > > var catId = row.get('category'); > > var title = row.get('freetext'); > > var cat_prefix = "c" + catId + "_suggestion"; > > return row; > } > {data-config.xml} > > Then you we adapt these in our application layer by a > specific request handler, regarding these prefix. > > Pro: > - works fine for limit number of > categories > > Con: > - index is getting bigger, we measure > increasing by ~40 percent Very interesting. Why did the index get bigger? You're still indexing the same title, just to different dynamic fields, right? So the total amount of data indexed should still be the same. Adding dynamic fields shouldn't increase the index size. What am I missing? Andy
RE: Ebay Kleinanzeigen and Auto Suggest
Hi, yes we do. If you have a limited number of categories (like 100), you can use dynamic fields with the TermsComponent by choosing a category-specific prefix, like: {schema.xml} ... ... {schema.xml} And within the data import handler we script the prefix from the given category: {data-config.xml} function setCatPrefixFields(row) { var catId = row.get('category'); var title = row.get('freetext'); var cat_prefix = "c" + catId + "_suggestion"; return row; } {data-config.xml} Then we address these fields from our application layer through a specific request handler, using these prefixes. Pro: - works fine for a limited number of categories Con: - the index gets bigger; we measured an increase of ~40 percent Regards André Charton -Original Message- From: Eric Grobler [mailto:impalah...@googlemail.com] Sent: Wednesday, April 27, 2011 9:56 AM To: solr-user@lucene.apache.org Subject: Re: Ebay Kleinanzeigen and Auto Suggest Hi Otis, The new Solr 3.1 Suggester also does not support filter queries. Is anyone using shingles with faceting on large data? Regards Ericz On Tue, Apr 26, 2011 at 10:06 PM, Otis Gospodnetic < otis_gospodne...@yahoo.com> wrote: > Hi Eric, > > Before using the terms component, allow me to point out: > > * http://sematext.com/products/autocomplete/index.html (used on > http://search-lucene.com/ for example) > > * http://wiki.apache.org/solr/Suggester > > > Otis > > Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch > Lucene ecosystem search :: http://search-lucene.com/ > > > > - Original Message > > From: Eric Grobler > > To: solr-user@lucene.apache.org > > Sent: Tue, April 26, 2011 1:11:11 PM > > Subject: Ebay Kleinanzeigen and Auto Suggest > > > > Hi > > > > Someone told me that ebay is using solr. > > I was looking at their Auto Suggest implementation and I guess they are > > using Shingles and the TermsComponent. > > > > I managed to get a satisfactory implementation but I have a problem with > > category specific filtering. 
> > Ebay suggestions are sensitive to categories like Cars and Pets. > > > > As far as I understand it is not possible to using filters with a term > > query. > > Unless one uses multiple fields or special prefixes for the words to > index I > > cannot think how to implement this. > > > > Is their perhaps a workaround for this limitation? > > > > Best Regards > > EricZ > > > > --- > > > > I am have a shingle type like: > > > positionIncrementGap="100"> > > > > > > > maxShingleSize="4" /> > > > > > > > > > > > > > > and a query like > > > http://localhost:8983/solr/terms?q=*%3A*&terms.fl=suggest_text&terms.sort=count&terms.prefix=audi > >i > > >
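To make the category-prefixed dynamic field idea from this message concrete, here is a hedged sketch of how an application layer might pick the per-category field and build a TermsComponent request against it. The field naming ("c" + category id + "_suggestion") follows the script quoted above. The helper names and the sample category id are assumptions.

```python
from urllib.parse import urlencode, parse_qs


def category_field(cat_id):
    """Name of the dynamic suggestion field for one category, e.g. c42_suggestion."""
    return f"c{cat_id}_suggestion"


def category_terms_query(cat_id, typed, limit=10):
    """Build TermsComponent parameters that only look at one category's field."""
    return urlencode([
        ("terms", "true"),
        ("terms.fl", category_field(cat_id)),
        ("terms.prefix", typed),
        ("terms.sort", "count"),
        ("terms.limit", str(limit)),
    ])


# Hypothetical category id 42 (say, "Cars") and the typed prefix "audi"
query_string = category_terms_query(42, "audi")
```

Since the TermsComponent cannot take a filter query, the category is selected purely through the field name, which is why this approach only scales to a limited number of categories.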
Re: Ebay Kleinanzeigen and Auto Suggest
Hi Otis, The new Solr 3.1 Suggester also does not support filter queries. Is anyone using shingles with faceting on large data? Regards Ericz On Tue, Apr 26, 2011 at 10:06 PM, Otis Gospodnetic < otis_gospodne...@yahoo.com> wrote: > Hi Eric, > > Before using the terms component, allow me to point out: > > * http://sematext.com/products/autocomplete/index.html (used on > http://search-lucene.com/ for example) > > * http://wiki.apache.org/solr/Suggester > > > Otis > > Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch > Lucene ecosystem search :: http://search-lucene.com/ > > > > - Original Message > > From: Eric Grobler > > To: solr-user@lucene.apache.org > > Sent: Tue, April 26, 2011 1:11:11 PM > > Subject: Ebay Kleinanzeigen and Auto Suggest > > > > Hi > > > > Someone told me that ebay is using solr. > > I was looking at their Auto Suggest implementation and I guess they are > > using Shingles and the TermsComponent. > > > > I managed to get a satisfactory implementation but I have a problem with > > category specific filtering. > > Ebay suggestions are sensitive to categories like Cars and Pets. > > > > As far as I understand it is not possible to using filters with a term > > query. > > Unless one uses multiple fields or special prefixes for the words to > index I > > cannot think how to implement this. > > > > Is their perhaps a workaround for this limitation? > > > > Best Regards > > EricZ > > > > --- > > > > I am have a shingle type like: > > > positionIncrementGap="100"> > > > > > > > maxShingleSize="4" /> > > > > > > > > > > > > > > and a query like > > > http://localhost:8983/solr/terms?q=*%3A*&terms.fl=suggest_text&terms.sort=count&terms.prefix=audi > >i > > >
Re: Ebay Kleinanzeigen and Auto Suggest
Thanks for the links Otis, I will have a look. Regards Ericz On Tue, Apr 26, 2011 at 10:06 PM, Otis Gospodnetic < otis_gospodne...@yahoo.com> wrote: > Hi Eric, > > Before using the terms component, allow me to point out: > > * http://sematext.com/products/autocomplete/index.html (used on > http://search-lucene.com/ for example) > > * http://wiki.apache.org/solr/Suggester > > > Otis > > Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch > Lucene ecosystem search :: http://search-lucene.com/ > > > > - Original Message > > From: Eric Grobler > > To: solr-user@lucene.apache.org > > Sent: Tue, April 26, 2011 1:11:11 PM > > Subject: Ebay Kleinanzeigen and Auto Suggest > > > > Hi > > > > Someone told me that ebay is using solr. > > I was looking at their Auto Suggest implementation and I guess they are > > using Shingles and the TermsComponent. > > > > I managed to get a satisfactory implementation but I have a problem with > > category specific filtering. > > Ebay suggestions are sensitive to categories like Cars and Pets. > > > > As far as I understand it is not possible to using filters with a term > > query. > > Unless one uses multiple fields or special prefixes for the words to > index I > > cannot think how to implement this. > > > > Is their perhaps a workaround for this limitation? > > > > Best Regards > > EricZ > > > > --- > > > > I am have a shingle type like: > > > positionIncrementGap="100"> > > > > > > > maxShingleSize="4" /> > > > > > > > > > > > > > > and a query like > > > http://localhost:8983/solr/terms?q=*%3A*&terms.fl=suggest_text&terms.sort=count&terms.prefix=audi > >i > > >
Re: Ebay Kleinanzeigen and Auto Suggest
Hi Eric, Before using the terms component, allow me to point out: * http://sematext.com/products/autocomplete/index.html (used on http://search-lucene.com/ for example) * http://wiki.apache.org/solr/Suggester Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ - Original Message > From: Eric Grobler > To: solr-user@lucene.apache.org > Sent: Tue, April 26, 2011 1:11:11 PM > Subject: Ebay Kleinanzeigen and Auto Suggest > > Hi > > Someone told me that ebay is using solr. > I was looking at their Auto Suggest implementation and I guess they are > using Shingles and the TermsComponent. > > I managed to get a satisfactory implementation but I have a problem with > category specific filtering. > Ebay suggestions are sensitive to categories like Cars and Pets. > > As far as I understand it is not possible to using filters with a term > query. > Unless one uses multiple fields or special prefixes for the words to index I > cannot think how to implement this. > > Is their perhaps a workaround for this limitation? > > Best Regards > EricZ > > --- > > I am have a shingle type like: > positionIncrementGap="100"> > > > maxShingleSize="4" /> > > > > > > > and a query like >http://localhost:8983/solr/terms?q=*%3A*&terms.fl=suggest_text&terms.sort=count&terms.prefix=audi >i >
Ebay Kleinanzeigen and Auto Suggest
Hi Someone told me that ebay is using solr. I was looking at their Auto Suggest implementation and I guess they are using Shingles and the TermsComponent. I managed to get a satisfactory implementation but I have a problem with category specific filtering. Ebay suggestions are sensitive to categories like Cars and Pets. As far as I understand it is not possible to use filters with a term query. Unless one uses multiple fields or special prefixes for the words to index I cannot think how to implement this. Is there perhaps a workaround for this limitation? Best Regards EricZ --- I have a shingle type like: and a query like http://localhost:8983/solr/terms?q=*%3A*&terms.fl=suggest_text&terms.sort=count&terms.prefix=audi
Re: EdgeNgram Auto suggest - doubles ignore
I'm afraid I'll have to pass, I'm absolutely swamped at the moment. Perhaps someone else can pick it up. I will say that you should be getting terms back when you pre-lower-case them, so look in your index via the admin page or Luke to see if what's really in your index is what you think in the "name" field. As for sorting, I haven't a clue. Start by backing out your custom sorting, verifying that things are as you expect for everything *except* sorting and add it back in Best Erick On Tue, Feb 8, 2011 at 10:11 AM, johnnyisrael wrote: > > Hi Erick, > > If you have time, Can you please take a look and provide your comments (or) > suggestions for this problem? > > Please let me know if you need any more information. > > Thanks, > > Johnny > -- > View this message in context: > http://lucene.472066.n3.nabble.com/EdgeNgram-Auto-suggest-doubles-ignore-tp2321919p2451828.html > Sent from the Solr - User mailing list archive at Nabble.com. >
Re: EdgeNgram Auto suggest - doubles ignore
Hi Erick, If you have time, Can you please take a look and provide your comments (or) suggestions for this problem? Please let me know if you need any more information. Thanks, Johnny -- View this message in context: http://lucene.472066.n3.nabble.com/EdgeNgram-Auto-suggest-doubles-ignore-tp2321919p2451828.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: EdgeNgram Auto suggest - doubles ignore
Hi Erick, I tried to use the terms component and ended up with the following problems. Problem 1: Custom sort not working in terms component: http://lucene.472066.n3.nabble.com/Term-component-sort-is-not-working-td1905059.html#a1909386 I want to sort using one of my custom fields [value_score]. I already set it in my configuration, but it is not sorting properly. The following is the configuration in solrconfig.xml: true json name value_score desc true termsComponent The Solr response tag is not returned sorted according to the sort parameter. Problem 2: Case sensitivity problem: [I am searching for "Apple"] http://localhost/solr/core1/terms?terms.fl=name&terms.prefix=apple <-- not working http://localhost/solr/core1/terms?terms.fl=name&terms.prefix=Apple <-- working I tried a regex to overcome the case sensitivity problem: http://localhost/solr/core1/terms?terms.fl=name&terms.regex=Apple&terms.regex.flag=case_insensitive Will this regex based search help with my requirement? It is returning irrelevant results. I am using the same syntax as mentioned in the wiki. http://wiki.apache.org/solr/TermsComponent Am I going wrong anywhere? Please let me know if you need any more info. Thanks, Johnny -- View this message in context: http://lucene.472066.n3.nabble.com/EdgeNgram-Auto-suggest-doubles-ignore-tp2321919p2399330.html Sent from the Solr - User mailing list archive at Nabble.com.
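One possible way to get a case-insensitive prefix out of terms.regex, sketched under the assumption that the regex has to match the whole term, which is why ".*" is appended to what the user typed. Python's re module is used here only as a local stand-in for the Java regex engine Solr uses, so escaping of unusual characters may differ. The helper names and sample terms are made up.

```python
import re
from urllib.parse import urlencode, parse_qs


def prefix_pattern(typed):
    """Regex matching any term that starts with the typed text."""
    return re.escape(typed) + ".*"


def terms_regex_query(field, typed):
    """Build TermsComponent parameters for a case-insensitive prefix match."""
    return urlencode([
        ("terms.fl", field),
        ("terms.regex", prefix_pattern(typed)),
        ("terms.regex.flag", "case_insensitive"),
    ])


query_string = terms_regex_query("name", "apple")

# Local check of what the pattern would and would not accept.
sample_terms = ["Apple", "apple pie", "Pineapple", "APPLET"]
matched = [t for t in sample_terms
           if re.fullmatch(prefix_pattern("apple"), t, re.IGNORECASE)]
```

A regex has to be tested against every term in the field, so on a large index it is probably cheaper to lowercase the field at index time and keep using terms.prefix with lowercased input.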
Re: EdgeNgram Auto suggest - doubles ignore
OK, try this. Use some analysis chain for your field like: This can be a multiValued field, BTW. now use the TermsComponent to fetch your data. See: http://wiki.apache.org/solr/TermsComponent and specify terms.prefix=apple e.g. http://localhost:8983/solr/terms?terms.prefix=app&terms.fl=blivet The return list should be what you want. Note that the returned values will be lower cased, and you can only specify lower case in your search term (all because of specifying the lowercase filter in my example). This should be very fast no matter what your index size, as the return list size defaults to 10 (though you can specify different numbers). Best Erick On Tue, Jan 25, 2011 at 3:03 PM, johnnyisrael wrote: > > Hi Eric, > > What I want here is, lets say I have 3 documents like > > ["pineapple vers apple", "milk with apple", "apple milk shake" ] > > and If i search for "apple", it should return only "apple milk shake" > because that term alone starts with the letter "apple" which I typed in. It > should not bring others and if I type "milk" it should return only "milk > with apple" > > I want an output Similar like a Google auto suggest. > > Is there a way to achieve this without encapsulating with double quotes. > > Thanks, > > Johnny > -- > View this message in context: > http://lucene.472066.n3.nabble.com/EdgeNgram-Auto-suggest-doubles-ignore-tp2321919p2333602.html > Sent from the Solr - User mailing list archive at Nabble.com. >
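The analysis chain itself did not survive in the archive, so the following is only a guess at the behaviour being described: each field value is kept as a single lowercased term (as a keyword tokenizer plus lowercase filter would do), and terms.prefix then matches against the start of the whole value. The sketch simulates that in plain Python using Johnny's three example documents. Nothing here calls Solr.

```python
def index_terms(values):
    """Mimic keyword tokenizer + lowercase filter: one lowercased term per field value."""
    return sorted({value.lower() for value in values})


def terms_prefix(terms, prefix, limit=10):
    """Mimic terms.prefix: return terms that start with the (lowercase) prefix."""
    return [term for term in terms if term.startswith(prefix)][:limit]


docs = ["pineapple vers apple", "milk with apple", "apple milk shake"]
terms = index_terms(docs)

apple_hits = terms_prefix(terms, "apple")
milk_hits = terms_prefix(terms, "milk")
```

Under that assumption, typing "apple" only brings back "apple milk shake" and typing "milk" only brings back "milk with apple", which is the behaviour asked for in the quoted message.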
Re: EdgeNgram Auto suggest - doubles ignore
Right now our configuration says multiValued=true, but it need not be "true" in our case. I will set it to false, try again, and update this thread with more details.

-- View this message in context: http://lucene.472066.n3.nabble.com/EdgeNgram-Auto-suggest-doubles-ignore-tp2321919p2334627.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: EdgeNgram Auto suggest - doubles ignore
Ah, sorry, I got confused about your requirements. If you just want to match at the beginning of the field, it may be more feasible, using edgegrams or a wildcard, if you have a single-valued field.

Do you have a single-valued or a multi-valued field? That is, does each document have just one value, or multiple? I still get confused about how to do it with edgegrams, even with a single-valued field, but I think maybe it's possible.

It is _definitely_ possible, with or without edgegrams, if you are willing/able to make a completely separate Solr index where each term for auto-suggest is a "document".

Yes. The problem lies in what "results" are. In general, Solr's results are the documents you have in the Solr index. Thus everything is a lot easier to deal with if you have an index where each document is a "term" for auto-suggest. But that doesn't always meet requirements if you need to auto-suggest within existing fq's and such, and of course it takes more resources to run an additional Solr index.

On 1/25/2011 5:03 PM, mesenthil wrote:
> The index contains around 1.5 million documents. As this is used for autosuggest feature, performance is an important factor.
>
> So it looks like, using edgeNgram it is difficult to achieve the the following
>
> Result should return only those terms where search letter is matching with the first word only. For example, when we type "M", it should return "Mumford and Sons" and not "jackson Michael".
>
> Jonathan,
>
> Is it possible to achieve this when we have separate index using edgeNgram?
Re: EdgeNgram Auto suggest - doubles ignore
The index contains around 1.5 million documents. As this is used for the autosuggest feature, performance is an important factor.

So it looks like, using edgeNgram, it is difficult to achieve the following:

The result should return only those terms where the search letter matches the first word. For example, when we type "M", it should return "Mumford and Sons" and not "jackson Michael".

Jonathan,

Is it possible to achieve this when we have a separate index using edgeNgram?

-- View this message in context: http://lucene.472066.n3.nabble.com/EdgeNgram-Auto-suggest-doubles-ignore-tp2321919p2334538.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: EdgeNgram Auto suggest - doubles ignore
Oh, I should perhaps mention that EdgeNGrams will yield results a lot quicker than wildcards, at the cost of a larger index. You should, of course, use EdgeNGrams if you worry about performance and have a huge index and a large number of queries per second.

> Then you don't need NGrams at all. A wildcard will suffice or you can use the TermsComponent.
>
> If these strings are indexed as single tokens (KeywordTokenizer with LowercaseFilter) you can simply do field:app* to retrieve the "apple milk shake". You can also use the string field type but then you must make sure the values are already lowercased before indexing.
>
> Be careful though, there is no query time analysis for wildcard (and fuzzy) queries so make sure
>
> > Hi Eric,
> >
> > What I want here is, lets say I have 3 documents like
> >
> > ["pineapple vers apple", "milk with apple", "apple milk shake" ]
> >
> > and If i search for "apple", it should return only "apple milk shake" because that term alone starts with the letter "apple" which I typed in. It should not bring others and if I type "milk" it should return only "milk with apple"
> >
> > I want an output Similar like a Google auto suggest.
> >
> > Is there a way to achieve this without encapsulating with double quotes.
> >
> > Thanks,
> >
> > Johnny
Re: EdgeNgram Auto suggest - doubles ignore
Then you don't need NGrams at all. A wildcard will suffice or you can use the TermsComponent.

If these strings are indexed as single tokens (KeywordTokenizer with LowercaseFilter) you can simply do field:app* to retrieve the "apple milk shake". You can also use the string field type but then you must make sure the values are already lowercased before indexing.

Be careful though, there is no query time analysis for wildcard (and fuzzy) queries so make sure

> Hi Eric,
>
> What I want here is, lets say I have 3 documents like
>
> ["pineapple vers apple", "milk with apple", "apple milk shake" ]
>
> and If i search for "apple", it should return only "apple milk shake" because that term alone starts with the letter "apple" which I typed in. It should not bring others and if I type "milk" it should return only "milk with apple"
>
> I want an output Similar like a Google auto suggest.
>
> Is there a way to achieve this without encapsulating with double quotes.
>
> Thanks,
>
> Johnny
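The single-token approach described above can be imitated in a few lines. This is only a simulation in plain Python, not Solr code: keyword_lowercase stands in for KeywordTokenizer plus LowercaseFilter, and prefix_match stands in for a field:app* wildcard query. Both function names are made up for the illustration.

```python
def keyword_lowercase(value):
    # Stand-in for KeywordTokenizer + LowercaseFilter:
    # the whole field value becomes one lowercased token.
    return value.lower()


def prefix_match(docs, typed):
    # Stand-in for field:typed*. Wildcard queries get no query-time
    # analysis, so the typed text is lowercased by hand here.
    prefix = typed.lower()
    return [doc for doc in docs if keyword_lowercase(doc).startswith(prefix)]


docs = ["pineapple vers apple", "milk with apple", "apple milk shake"]
print(prefix_match(docs, "App"))   # ['apple milk shake']
print(prefix_match(docs, "milk"))  # ['milk with apple']
```

Because the whole value is one token, "pineapple vers apple" never matches "app": the prefix is compared against the start of the value, not against each word.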
Re: EdgeNgram Auto suggest - doubles ignore
I haven't figured out any way to achieve that AT ALL without making a separate Solr index just to serve autosuggest queries, at least when you want to auto-suggest on a multi-valued field. Someone posted a crazy tricky way to do it with a single-valued field a while ago.

If you can/are willing to make a separate Solr index with a schema set up for auto-suggest specifically, it's easy. But from an existing schema, where you want to auto-suggest just based on the values in one field, it's a multi-valued field, and you want to allow matches in the middle of the field, I don't think there's a way to do it.

On 1/25/2011 3:03 PM, johnnyisrael wrote:
> Hi Eric,
>
> What I want here is, lets say I have 3 documents like
>
> ["pineapple vers apple", "milk with apple", "apple milk shake" ]
>
> and If i search for "apple", it should return only "apple milk shake" because that term alone starts with the letter "apple" which I typed in. It should not bring others and if I type "milk" it should return only "milk with apple"
>
> I want an output Similar like a Google auto suggest.
>
> Is there a way to achieve this without encapsulating with double quotes.
>
> Thanks,
>
> Johnny
Re: EdgeNgram Auto suggest - doubles ignore
Hi Eric,

What I want here is this: let's say I have 3 documents like

["pineapple vers apple", "milk with apple", "apple milk shake" ]

If I search for "apple", it should return only "apple milk shake", because that is the only value that starts with the letters "apple" which I typed in. It should not return the others, and if I type "milk" it should return only "milk with apple".

I want output similar to Google's auto-suggest.

Is there a way to achieve this without wrapping the query in double quotes?

Thanks,
Johnny

-- View this message in context: http://lucene.472066.n3.nabble.com/EdgeNgram-Auto-suggest-doubles-ignore-tp2321919p2333602.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: EdgeNgram Auto suggest - doubles ignore
Let's back up here because now I'm not clear what you actually want. EdgeNGrams are a way of matching substrings, which is what's happening here. Of course searching for "apple" matches all three examples, just as searching for "apple" without grams would; that's the expected behavior.

So, we need a clear problem definition of what you're trying to do, along with example queries (please post the results of adding &debugQuery=on).

Best
Erick

On Tue, Jan 25, 2011 at 8:29 AM, johnnyisrael wrote:
> Hi Eric,
>
> You are right, there is a copy field to EdgeNgram, I tried the configuration but it not working as expected.
>
> Configuration I tried:
>
> termVectors="true">
> positionIncrementGap="100">
> maxGramSize="25"/>
> omitNorms="true" omitTermFreqAndPositions="true" />
> omitNorms="true" omitTermFreqAndPositions="true" />
> edgy_user_query
>
> ==
>
> When I search for the term "apple".
>
> It is returning results for "pineapple vers apple", "milk with apple", "apple milk shake" ...
>
> Is there any other way to overcome this problem?
>
> Thanks,
>
> Johnny
> --
> View this message in context: http://lucene.472066.n3.nabble.com/EdgeNgram-Auto-suggest-doubles-ignore-tp2321919p2329370.html
> Sent from the Solr - User mailing list archive at Nabble.com.
Re: EdgeNgram Auto suggest - doubles ignore
Hi Eric,

You are right, there is a copy field to EdgeNgram. I tried the configuration, but it is not working as expected.

Configuration I tried:

edgy_user_query

==

When I search for the term "apple", it returns results for "pineapple vers apple", "milk with apple", "apple milk shake" ...

Is there any other way to overcome this problem?

Thanks,
Johnny

-- View this message in context: http://lucene.472066.n3.nabble.com/EdgeNgram-Auto-suggest-doubles-ignore-tp2321919p2329370.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: EdgeNgram Auto suggest - doubles ignore
See below.

On Mon, Jan 24, 2011 at 1:51 PM, johnnyisrael wrote:
> Hi,
>
> I am trying out the auto suggest using EdgeNgram.
>
> Using the following tutorial as a reference.
>
> http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/
>
> In the above tutorial, The below two lines has been clearly mentioned,
>
> "Note that it’s necessary to wrap the query in double-quotes as a phrase. Otherwise unpredictable and unwanted matches can occur."
>
> When i use double quotes as they said it works perfectly fine. I just want to know the reason for this behavior.
>
> Can anyone explain me why it behaves like that?

The reason here is that if you *don't* make it a phrase, then you're ORing (or ANDing) the grams. So if you were searching for won, your search would become w OR wo OR won, which would match n-grams from all over the place without regard to whether they appeared in order.

> I tried the alternate method mentioned in the responses section of the same tutorial [StandardTokenizerFactory and LowerCaseFilterFactory combination], it does not work fine as expected[bringing unwanted matches].

Hmmm, I don't think the StandardTokenizer & LowerCase was being applied as autosuggest; there was a copyField in there that went to the EdgeNGram (note that I only scanned the article).

Best
Erick

> Is there a best way to overcome this?
>
> Thanks,
>
> Johnny
> --
> View this message in context: http://lucene.472066.n3.nabble.com/EdgeNgram-Auto-suggest-doubles-ignore-tp2321919p2321919.html
> Sent from the Solr - User mailing list archive at Nabble.com.
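Erick's explanation of why the phrase matters can be tried out with a small simulation. This is a simplified model in plain Python, not what Lucene does internally: it assumes every word is expanded into edge n-grams at index time, treats a phrase match as "the query grams appear in the document's gram list, in order and adjacent", and uses made-up example documents. The minimum gram size of 1 is an assumption; the maximum of 25 comes from the configuration quoted in this thread.

```python
def edge_ngrams(token, min_size=1, max_size=25):
    # "won" -> ["w", "wo", "won"]
    return [token[:n] for n in range(min_size, min(len(token), max_size) + 1)]


def index_grams(text):
    # Ordered list of edge n-grams for every word of a document.
    grams = []
    for word in text.lower().split():
        grams.extend(edge_ngrams(word))
    return grams


def matches_or(doc, query):
    # Unquoted query: any single gram of the query is enough to match.
    doc_grams = set(index_grams(doc))
    return any(gram in doc_grams for gram in edge_ngrams(query.lower()))


def matches_phrase(doc, query):
    # Quoted query: all grams must appear together and in order.
    doc_grams = index_grams(doc)
    wanted = edge_ngrams(query.lower())
    return any(
        doc_grams[i:i + len(wanted)] == wanted
        for i in range(len(doc_grams) - len(wanted) + 1)
    )


docs = ["won ton soup", "white wine", "wooden spoon"]  # hypothetical data
print([d for d in docs if matches_or(d, "won")])      # all three match via "w"
print([d for d in docs if matches_phrase(d, "won")])  # only "won ton soup"
```

In this model the OR form matches every document that contains any word starting with "w", which is the "unpredictable and unwanted matches" effect the tutorial warns about.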
EdgeNgram Auto suggest - doubles ignore
Hi,

I am trying out auto-suggest using EdgeNgram, with the following tutorial as a reference:

http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/

The tutorial clearly states the following:

"Note that it’s necessary to wrap the query in double-quotes as a phrase. Otherwise unpredictable and unwanted matches can occur."

When I use double quotes as they say, it works perfectly fine. I just want to know the reason for this behavior.

Can anyone explain to me why it behaves like that?

I tried the alternate method mentioned in the responses section of the same tutorial [StandardTokenizerFactory and LowerCaseFilterFactory combination], but it does not work as expected [it brings unwanted matches].

Is there a better way to overcome this?

Thanks,
Johnny

-- View this message in context: http://lucene.472066.n3.nabble.com/EdgeNgram-Auto-suggest-doubles-ignore-tp2321919p2321919.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Auto Suggest
Adding &debugQuery=on produced the following: +edge:testing +edge:lots +edge:testing +edge:lots +PhraseQuery(edge:"te tes test testi testin testing") +PhraseQuery(edge:"lo lot lots") So one part of the answer is that multiple terms are broken up into multiple phrase queries, one phrase for each term. This is with LowerCaseTokenizerFactory and EdgeNGramFilterFactory So I don't see any reason why your query shouldn't work. Could you provide your field type definitions, an example document that you think should be found and query output with &debugQuery=on? Best Erick
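The parsed query Erick posted can be reproduced by hand, which makes the "one phrase per term" behaviour easier to see. The sketch below is plain Python that only renders a string in the same shape as his debug output; it is not Solr's query parser. The minimum gram size of 2 is inferred from the output he shows, and the field name edge is taken from it.

```python
def edge_ngrams(term, min_size=2):
    # "lots" -> ["lo", "lot", "lots"]
    return [term[:n] for n in range(min_size, len(term) + 1)]


def parsed_query(user_input, field="edge"):
    # Each whitespace-separated term becomes its own required phrase of grams.
    clauses = []
    for term in user_input.lower().split():
        grams = " ".join(edge_ngrams(term))
        clauses.append(f'+PhraseQuery({field}:"{grams}")')
    return " ".join(clauses)


print(parsed_query("testing lots"))
# +PhraseQuery(edge:"te tes test testi testin testing") +PhraseQuery(edge:"lo lot lots")
```

Under this reading, a two-word input such as "app mou" needs a document whose grams satisfy both phrases, so the outcome depends on how the indexed field is tokenized.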
Re: Auto Suggest
Luke, Thanks. What happens if there are 3 terms? It seems like the entire query can go into facet.prefix?
Re: Auto Suggest
Dan,

Thanks... I wasn't clear in the original email about what the issue is. The issue is that when multiple terms are in the query, no results are returned.

Thanks

On Fri, Sep 3, 2010 at 8:33 AM, dan sutton wrote:
> I set this up a few years ago with something like the following:
>
> pattern="([^a-z0-9])" replacement="" replace="all" />
> maxGramSize="20" minGramSize="1" />
>
> pattern="([^a-z0-9])" replacement="" replace="all" />
>
> replacement="" replace="all" /> is the bit missing i think here
>
> This way the search is agnostic to case and any non-alphanum chars, this was to facilitate a location autocomplete for searching
>
> So is was a basic search, returning the top N results along with additional info to show in the autocomplete to our mod_perl servers, Results were cached in the mod_perl servers.
>
> Regards,
> Dan
Re: Auto Suggest
I set this up a few years ago with something like the following:

pattern="([^a-z0-9])" replacement="" replace="all" />
maxGramSize="20" minGramSize="1" />

pattern="([^a-z0-9])" replacement="" replace="all" />

replacement="" replace="all" /> is the bit missing, I think, here.

This way the search is agnostic to case and any non-alphanumeric characters; this was to facilitate a location autocomplete for searching.

So it was a basic search, returning the top N results along with additional info to show in the autocomplete to our mod_perl servers. Results were cached in the mod_perl servers.

Regards,
Dan
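The effect of Dan's analysis chain can be approximated outside Solr. The sketch below is a plain-Python imitation under stated assumptions: the pattern ([^a-z0-9]) and the gram sizes 1 and 20 are the values that survive in the quoted copy of his message, the lowercasing step is assumed from his remark that the search is agnostic to case, and the function names and the sample place name are invented for the example.

```python
import re


def normalize(text):
    # Lowercase, then drop everything that is not a-z or 0-9,
    # mirroring pattern="([^a-z0-9])" replacement="" replace="all".
    return re.sub(r"[^a-z0-9]", "", text.lower())


def index_grams(text, min_gram=1, max_gram=20):
    # Index side: edge n-grams of the normalized value.
    token = normalize(text)
    return [token[:n] for n in range(min_gram, min(len(token), max_gram) + 1)]


def matches(indexed_value, typed):
    # Query side: normalize only, then look for an exact gram.
    return normalize(typed) in index_grams(indexed_value)


print(normalize("St. John's, NL"))           # stjohnsnl
print(matches("St. John's, NL", "st john"))  # True
print(matches("St. John's, NL", "john"))     # False
```

Because spaces are removed on both sides, a multi-word input is compared as one run of characters, which may be why Dan calls the pattern replacement the missing bit for the multi-term case.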
Re: Auto Suggest
What about if you do something like this? - facet=true&facet.mincount=1&q=apple&facet.limit=10&facet.prefix=mou&facet.field=term_suggest&qt=basic&wt=javabin&rows=0&version=1 Jason Rutherglen wrote: To clarify, the query analyzer returns that. Variations such as "apple mou" also do not return anything. Maybe Jay can comment and then we can amend the article? On Fri, Sep 3, 2010 at 6:12 AM, Jason Rutherglen wrote: Analysis returns "app mou". On Thu, Sep 2, 2010 at 6:12 PM, Lance Norskog wrote: What does analysis.jsp show? On Thu, Sep 2, 2010 at 5:53 AM, Jason Rutherglen wrote: I'm having a different issue with the EdgeNGram technique described here: http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/ That is one word queries q=app on the query_text field, work fine however "q=app mou" do not. Why would this be or is there a configuration that could be missing? On Wed, Sep 1, 2010 at 3:53 PM, Eric Grobler wrote: Thanks for your feedback Robert, I will try that and see how Solr performs on my data - I think I will create a field that contains only important key/product terms from the text. Regards Johan On Wed, Sep 1, 2010 at 9:12 PM, Robert Petersen wrote: We don't have that many, just a hundred thousand, and solr response times (since the index's docs are small and not complex) are logged as typically 1 ms if not 0 ms. It's funny but sometimes it is so fast no milliseconds have elapsed. Incredible if you ask me... :) Once you get SOLR to consider the whole phrase as just one big term, the wildcard is very fast. -Original Message- From: Eric Grobler [mailto:impalah...@googlemail.com] Sent: Wednesday, September 01, 2010 12:35 PM To: solr-user@lucene.apache.org Subject: Re: Auto Suggest Hi Robert, Interesting approach, how many documents do you have in Solr? I have about 2 million and I just wonder if it might be a bit slow. 
Regards Johan On Wed, Sep 1, 2010 at 7:38 PM, Robert Petersen wrote: I do this by replacing the spaces with a '%' in a separate search field which is not parsed nor tokenized and then you can wildcard across the whole phrase like you want and the spaces don't mess you up. Just store the original phrase with spaces in a separate field for returning to the front end for display. -Original Message- From: Jazz Globe [mailto:jazzgl...@hotmail.com] Sent: Wednesday, September 01, 2010 7:33 AM To: solr-user@lucene.apache.org Subject: Auto Suggest Hallo How would one implement a multiple term auto-suggest feature in Solr that is filter sensitive? For example, a user enters : "mp3" and solr might suggest: -> "mp3 player" -> "mp3 nano" -> "mp3 sony" and then the user starts the second word : "mp3 n" and that narrows it down to: -> "mp3 nano" I had a quick look at the Terms Component. I suppose it just returns term totals for the entire index and cannot be used with a filter or query? Thanks Johan -- Lance Norskog goks...@gmail.com
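One way to extend Luke's request to any number of words is to treat every completed word as the query and only the word still being typed as the facet prefix. The sketch below is an assumption about how that could be wired up, not something the thread confirms: the function name suggest_query is hypothetical, the field term_suggest and the facet parameters come from Luke's URL, and falling back to the match-all query *:* for a single word is a choice made for this example.

```python
from urllib.parse import urlencode


def suggest_query(user_input, field="term_suggest", limit=10):
    """Split input into completed words (q) and the partial last word (facet.prefix)."""
    words = user_input.lower().split()
    if not words:
        return ""
    *complete, partial = words
    params = {
        "q": " ".join(complete) if complete else "*:*",  # assumed fallback for one word
        "rows": 0,                 # only facet counts are needed, not documents
        "facet": "true",
        "facet.field": field,
        "facet.prefix": partial,   # the word currently being typed
        "facet.limit": limit,
        "facet.mincount": 1,
    }
    return urlencode(params)


print(suggest_query("apple mou"))
print(suggest_query("apple mac mou"))
```

With three words, the first two go into q and only the last goes into facet.prefix. The suggestions returned this way are single terms from the facet field, so the front end would have to prepend the completed words when displaying them.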
Re: Auto Suggest
To clarify, the query analyzer returns that. Variations such as "apple mou" also do not return anything. Maybe Jay can comment and then we can amend the article?
Re: Auto Suggest
Analysis returns "app mou". On Thu, Sep 2, 2010 at 6:12 PM, Lance Norskog wrote: > What does analysis.jsp show?
Re: Auto Suggest
Are you phrasing the query, like &q="app mou" ? I guess with edgeNgram you use KeywordTokenizer which stores phrases as single terms. -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com Training in Europe - www.solrtraining.com On 2. sep. 2010, at 14.53, Jason Rutherglen wrote: > I'm having a different issue with the EdgeNGram technique described > here: > http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/ > > That is one word queries q=app on the query_text field, work fine > however "q=app mou" do not. Why would this be or is there a > configuration that could be missing?
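To make the KeywordTokenizer point concrete, here is a rough Python model of what such a field would hold. It is only a sketch under stated assumptions: the whole phrase is kept as one token, lowercased, and expanded into leading-edge n-grams. It is not Solr's actual analysis chain, and the function names and the "apple mouse" sample phrase are made up for illustration.

```python
def edge_ngrams(term, min_gram=1, max_gram=25):
    """Return the leading-edge n-grams (prefixes) of a single term."""
    longest = min(len(term), max_gram)
    return [term[:n] for n in range(min_gram, longest + 1)]


def index_phrase_as_keyword(phrase):
    """Model a keyword-tokenized field: one token per phrase, then edge n-grams."""
    return set(edge_ngrams(phrase.lower()))


if __name__ == "__main__":
    indexed = index_phrase_as_keyword("apple mouse")

    # A phrase-quoted query stays one term, so a true prefix of the phrase matches.
    print("apple mou" in indexed)

    # An unquoted query is split on whitespace first; the second word on its
    # own is not a prefix of the whole phrase, so it finds nothing.
    print("mou" in indexed)

    # In this model "app mou" is not a prefix of "apple mouse" either.
    print("app mou" in indexed)
```

Under this model, only inputs that are literal prefixes of the stored phrase can match, which would be one possible reason a multi-word query behaves differently from a one-word query.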
Re: Auto Suggest
What does analysis.jsp show? On Thu, Sep 2, 2010 at 5:53 AM, Jason Rutherglen wrote: > I'm having a different issue with the EdgeNGram technique described > here: > http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/ > > That is one word queries q=app on the query_text field, work fine > however "q=app mou" do not. Why would this be or is there a > configuration that could be missing? -- Lance Norskog goks...@gmail.com
Re: Auto Suggest
I'm having a different issue with the EdgeNGram technique described here: http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/ That is, one-word queries such as q=app on the query_text field work fine, but "q=app mou" does not. Why would this be, or is there a configuration that could be missing? On Wed, Sep 1, 2010 at 3:53 PM, Eric Grobler wrote: > Thanks for your feedback Robert, > > I will try that and see how Solr performs on my data - I think I will create > a field that contains only important key/product terms from the text. > > Regards > Johan
Re: Auto Suggest
Thanks for your feedback Robert, I will try that and see how Solr performs on my data - I think I will create a field that contains only important key/product terms from the text. Regards Johan On Wed, Sep 1, 2010 at 9:12 PM, Robert Petersen wrote: > We don't have that many, just a hundred thousand, and solr response > times (since the index's docs are small and not complex) are logged as > typically 1 ms if not 0 ms. It's funny but sometimes it is so fast no > milliseconds have elapsed. Incredible if you ask me... :) > > Once you get SOLR to consider the whole phrase as just one big term, the > wildcard is very fast.
RE: Auto Suggest
We don't have that many, just a hundred thousand, and solr response times (since the index's docs are small and not complex) are logged as typically 1 ms if not 0 ms. It's funny but sometimes it is so fast no milliseconds have elapsed. Incredible if you ask me... :) Once you get SOLR to consider the whole phrase as just one big term, the wildcard is very fast. -Original Message- From: Eric Grobler [mailto:impalah...@googlemail.com] Sent: Wednesday, September 01, 2010 12:35 PM To: solr-user@lucene.apache.org Subject: Re: Auto Suggest Hi Robert, Interesting approach, how many documents do you have in Solr? I have about 2 million and I just wonder if it might be a bit slow. Regards Johan
Re: Auto Suggest
Hi Robert, Interesting approach, how many documents do you have in Solr? I have about 2 million and I just wonder if it might be a bit slow. Regards Johan On Wed, Sep 1, 2010 at 7:38 PM, Robert Petersen wrote: > I do this by replacing the spaces with a '%' in a separate search field > which is not parsed nor tokenized and then you can wildcard across the > whole phrase like you want and the spaces don't mess you up. Just store > the original phrase with spaces in a separate field for returning to the > front end for display.
RE: Auto Suggest
I do this by replacing the spaces with a '%' in a separate search field which is not parsed nor tokenized and then you can wildcard across the whole phrase like you want and the spaces don't mess you up. Just store the original phrase with spaces in a separate field for returning to the front end for display. -Original Message- From: Jazz Globe [mailto:jazzgl...@hotmail.com] Sent: Wednesday, September 01, 2010 7:33 AM To: solr-user@lucene.apache.org Subject: Auto Suggest Hallo How would one implement a multiple term auto-suggest feature in Solr that is filter sensitive? For example, a user enters : "mp3" and solr might suggest: -> "mp3 player" -> "mp3 nano" -> "mp3 sony" and then the user starts the second word : "mp3 n" and that narrows it down to: -> "mp3 nano" I had a quick look at the Terms Component. I suppose it just returns term totals for the entire index and cannot be used with a filter or query? Thanks Johan
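For anyone who wants to try the '%' trick, here is a minimal Python sketch of the client side. The field names suggest_key and suggest_display are hypothetical, the lowercasing is my own assumption rather than something stated above, and characters that are special in the Lucene query syntax are not escaped. The optional fq parameter is one way the suggestions could be kept filter sensitive.

```python
from urllib.parse import urlencode


def to_suggest_key(phrase: str) -> str:
    """Turn a phrase into one untokenized term by joining its words with '%'."""
    # Lowercasing is an assumption of this sketch, not part of the description above.
    return "%".join(phrase.lower().split())


def build_suggest_params(user_input: str, filter_query: str = "") -> str:
    """Build a URL query string for a trailing-wildcard search on the key field."""
    params = [
        ("q", f"suggest_key:{to_suggest_key(user_input)}*"),
        ("fl", "suggest_display"),  # hypothetical field holding the original phrase
        ("rows", "10"),
    ]
    if filter_query:
        # A filter query restricts suggestions to the matching subset of docs.
        params.append(("fq", filter_query))
    return urlencode(params)


if __name__ == "__main__":
    print(to_suggest_key("mp3 nano"))
    print(build_suggest_params("mp3 n", "category:audio"))
```

The same to_suggest_key function would have to be applied at index time when filling the key field, so that the stored term and the typed prefix are built the same way.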
Auto Suggest
Hallo How would one implement a multiple term auto-suggest feature in Solr that is filter sensitive? For example, a user enters : "mp3" and solr might suggest: -> "mp3 player" -> "mp3 nano" -> "mp3 sony" and then the user starts the second word : "mp3 n" and that narrows it down to: -> "mp3 nano" I had a quick look at the Terms Component. I suppose it just returns term totals for the entire index and cannot be used with a filter or query? Thanks Johan
Re: Auto suggest with spell check
Given below are the steps for auto-suggest and spellcheck in a single query: In solrconfig.xml, change the TermsComponent request handler so that terms defaults to true and its components are termsComponent and spellcheck. Then use the query format given below to get both auto-suggest and spellcheck suggestions. http://localhost:8983/solr/terms?terms.fl=text&terms.prefix=computr&spellcheck.q=computr&spellcheck=true -- View this message in context: http://lucene.472066.n3.nabble.com/Auto-suggest-with-spell-check-tp1015114p1025688.html Sent from the Solr - User mailing list archive at Nabble.com.
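As a small illustration, the request above could be assembled like this in Python. This is only a sketch: the function name is made up, and the base URL and the "text" field are simply taken from the example URL, so adjust them to your own setup.

```python
from urllib.parse import urlencode


def terms_spellcheck_url(prefix, field="text",
                         base="http://localhost:8983/solr/terms"):
    """Build one request that asks for term suggestions and spellcheck together."""
    params = {
        "terms.fl": field,       # field to read terms from
        "terms.prefix": prefix,  # what the user has typed so far
        "spellcheck.q": prefix,  # same input, sent to the spellchecker
        "spellcheck": "true",
    }
    return base + "?" + urlencode(params)


if __name__ == "__main__":
    print(terms_spellcheck_url("computr"))
```

Using urlencode also takes care of escaping if the typed prefix contains spaces or other reserved characters.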
Re: Auto-suggest internal terms
On 03.06.2010 16:45, Andrzej Bialecki wrote: > You are right to a certain degree. Still, there are some contention > points in Lucene/Solr, how threads are allocated on available CPUs, and > how the heap is used, which can make a two-JVM setup perform much better > than a single-JVM setup given the same number of threads... Allow me not to believe this! ;-) It's not Solr that allocates threads, it's the web server (Jetty, Glassfish, or whatever). In a normal configuration, it will use as many threads as are useful, so there's no need to start a second web server on the same machine. As for Lucene, there is some magic algorithm that lets a limited number of threads reuse an IndexReader (as far as I have seen in the code, but the details are unimportant). But at the very least, if you have a multi-core setup, you'll get special IndexReader instances from Lucene per core. So I don't see why you should scatter them across different VMs. Greetings, Michael
Re: Auto-suggest internal terms
On 2010-06-03 13:38, Michael Kuhlmann wrote: > On 03.06.2010 13:02, Andrzej Bialecki wrote: >> ..., and deploy this >> index in a separate JVM (to benefit from other CPUs than the one that >> runs your Solr core) > > Every known webserver is multithreaded by default, so putting different > Solr instances into different JVMs will be of no use. You are right to a certain degree. Still, there are some contention points in Lucene/Solr, how threads are allocated on available CPUs, and how the heap is used, which can make a two-JVM setup perform much better than a single-JVM setup given the same number of threads... -- Best regards, Andrzej Bialecki Information Retrieval, Semantic Web Embedded Unix, System Integration http://www.sigram.com Contact: info at sigram dot com
Re: Auto-suggest internal terms
On 03.06.2010 13:02, Andrzej Bialecki wrote: > ..., and deploy this > index in a separate JVM (to benefit from other CPUs than the one that > runs your Solr core) Every known webserver is multithreaded by default, so putting different Solr instances into different JVMs will be of no use. -Michael
Re: Auto-suggest internal terms
On 2010-06-03 09:56, Michael Kuhlmann wrote: > The only solution without "doing any custom work" would be to perform a > normal query for each suggestion. But you might get into performance > troubles with that, because suggestions are typically performed much > more often than complete searches. Actually, that's not a bad idea - if you can trim the size of the index (either by using shingles instead of docs, or trimming the main index - LUCENE-1812) so that the index fits completely in RAM, and deploy this index in a separate JVM (to benefit from other CPUs than the one that runs your Solr core) or another machine, then I think performance would not be a big concern, and the functionality would be just what you wanted. > > The much faster solution that needs own work would be to build up a > large TreeMap with each word as the keys, and the matching terms as the > values. That would consume an awful lot of RAM... see SOLR-1316 for some measurements. -- Best regards, Andrzej Bialecki Information Retrieval, Semantic Web Embedded Unix, System Integration http://www.sigram.com Contact: info at sigram dot com
Re: Auto-suggest internal terms
The only solution without "doing any custom work" would be to perform a normal query for each suggestion. But you might run into performance trouble with that, because suggestions are typically requested much more often than complete searches. The much faster solution, which requires custom work, would be to build up a large TreeMap with each word as a key and the matching terms as the values. -Michael On 02.06.2010 22:01, Jay Hill wrote: > I've got a situation where I'm looking to build an auto-suggest where any > term entered will lead to suggestions. For example, if I type "wine" I want > to see suggestions like this: > > french *wine* classes > *wine* book discounts > burgundy *wine* > > etc. > > I've tried some tricks with shingles, but the only solution that worked was > pre-processing my queries into a core in all variations. > > Anyone know any tricks to accomplish this in Solr without doing any custom > work? > > -Jay >
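To show what the word-keyed map idea could look like, here is a small Python sketch. It swaps Java's TreeMap for a plain dict plus a sorted key list searched with bisect, which gives the same prefix-range lookup; the function names and the sample phrases are invented for illustration, and nothing here addresses the memory cost for a large vocabulary.

```python
from bisect import bisect_left
from collections import defaultdict


def build_word_index(phrases):
    """Map every lowercased word to the phrases that contain it."""
    index = defaultdict(list)
    for phrase in phrases:
        for word in set(phrase.lower().split()):
            index[word].append(phrase)
    # The sorted key list stands in for the ordered keys of a TreeMap.
    return dict(index), sorted(index)


def suggest(prefix, index, sorted_words, limit=10):
    """Collect phrases for every indexed word that starts with the prefix."""
    prefix = prefix.lower()
    results = []
    pos = bisect_left(sorted_words, prefix)
    while pos < len(sorted_words) and sorted_words[pos].startswith(prefix):
        for phrase in index[sorted_words[pos]]:
            if phrase not in results:
                results.append(phrase)
        pos += 1
    return results[:limit]


if __name__ == "__main__":
    phrases = ["french wine classes", "wine book discounts",
               "burgundy wine", "cheese platter"]
    index, words = build_word_index(phrases)
    print(suggest("wine", index, words))
```

Because every word of a phrase is a key, a word typed in the middle of a phrase still finds it, which is the behaviour asked for in the original question.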
RE: Auto-suggest internal terms
I was interested in the same thing and stumbled upon this article: http://www.mattweber.org/2009/05/02/solr-autosuggest-with-termscomponent-and-jquery/ I haven't followed through, but it looked promising to me. Tim -Original Message- From: Jay Hill [mailto:jayallenh...@gmail.com] Sent: Wednesday, June 02, 2010 4:02 PM To: solr-user@lucene.apache.org Subject: Auto-suggest internal terms I've got a situation where I'm looking to build an auto-suggest where any term entered will lead to suggestions. For example, if I type "wine" I want to see suggestions like this: french *wine* classes *wine* book discounts burgundy *wine* etc. I've tried some tricks with shingles, but the only solution that worked was pre-processing my queries into a core in all variations. Anyone know any tricks to accomplish this in Solr without doing any custom work? -Jay
RE: Auto-suggest internal terms
I'm painfully new to Solr so please be gentle if my suggestion is terrible! Could you use highlighting to do this? Take the first n results from a query and show their highlights, customizing the highlights to show the desired number of words. Just a thought. Patrick -Original Message- From: Jay Hill [mailto:jayallenh...@gmail.com] Sent: Wednesday, June 02, 2010 4:02 PM To: solr-user@lucene.apache.org Subject: Auto-suggest internal terms I've got a situation where I'm looking to build an auto-suggest where any term entered will lead to suggestions. For example, if I type "wine" I want to see suggestions like this: french *wine* classes *wine* book discounts burgundy *wine* etc. I've tried some tricks with shingles, but the only solution that worked was pre-processing my queries into a core in all variations. Anyone know any tricks to accomplish this in Solr without doing any custom work? -Jay
Auto-suggest internal terms
I've got a situation where I'm looking to build an auto-suggest where any term entered will lead to suggestions. For example, if I type "wine" I want to see suggestions like this: french *wine* classes *wine* book discounts burgundy *wine* etc. I've tried some tricks with shingles, but the only solution that worked was pre-processing my queries into a core in all variations. Anyone know any tricks to accomplish this in Solr without doing any custom work? -Jay
Re: multi term, multi field, auto suggest
On 01.02.2010, at 13:27, Lukas Kahwe Smith wrote: > > On 29.01.2010, at 15:40, Lukas Kahwe Smith wrote: > >> I am still a bit unsure how to handle both the lowercased and the case >> preserved version: >> >> So here are some examples: >> UBS => ubs|UBS >> Kreuzstrasse => kreuzstrasse|Kreuzstrasse >> >> So when I type "Kreu" I would get a suggestion of "Kreuzstrasse" and with >> "kreu" I would get "kreuzstrasse". >> Since I do not expect any words to start with a lowercase letter and still >> contain some upper case letter we should be fine with this approach. >> >> As in I doubt there would be stuff like "fooBar" which would lead to >> suggestion both "foobar" and "fooBar". >> >> How can I achieve this? > > > I just noticed that I need the same thing for the word delimiter splitter. As > in some way to index both the splitted and the unsplitted version so that I > can use it in a facet search. > > Hans-Peter => Hans|Peter|Hans-Peter Sorry for the monolog. I did see http://www.mail-archive.com/solr-user@lucene.apache.org/msg29786.html, which suggests a solution just for lowercase indexing with mixed case suggest via concatenating the lowercased version with some separator with the original version. I guess what I could just do is feed in the same data multiple times and do the approach of [indexterm]|[original] in user land somehow like "Hans-Peter" would be turned into 3 documents: hans|Hans-Peter peter|Hans-Peter hans-peter|Hans-Peter This solution would be quite cool indeed, since I could suggest "Hans-Peter" if someone searches for "Peter". Since I will just use this for a prefix search, I could just set the query analyzer to lowercase the search and it should find the results and I can then add some magic to the frontend display logic to split off the suggested original term. I am not aware of any magic inside the schema.xml that could do this work for me though. I am using the DatabaseHandler to load the documents. 
I guess I could simply run the query multiple times, but that would screw up the indexing of the non auto suggest index. Then again maybe I want to totally separate the two anyways. regards, Lukas Kahwe Smith m...@pooteeweet.org
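The [indexterm]|[original] idea described above could be prototyped in user land roughly as follows. This is a Python sketch under my own assumptions: words are split on any non-alphanumeric character, the lookup is a plain in-memory scan rather than a Solr prefix query, and an original term is assumed not to contain the '|' separator itself. The function names are made up.

```python
import re


def suggest_keys(original):
    """Build 'indexterm|original' keys: one per word part plus the whole term."""
    lowered = original.lower()
    parts = [part for part in re.split(r"[\W_]+", lowered) if part]
    keys = []
    for term in parts + [lowered]:
        key = f"{term}|{original}"
        if key not in keys:  # a single-word term would otherwise appear twice
            keys.append(key)
    return keys


def lookup(prefix, keys):
    """Lowercase the input, prefix-match the index term, return the originals."""
    prefix = prefix.lower()
    originals = []
    for key in keys:
        term, _, original = key.partition("|")
        if term.startswith(prefix) and original not in originals:
            originals.append(original)
    return originals


if __name__ == "__main__":
    keys = suggest_keys("Hans-Peter") + suggest_keys("UBS") + suggest_keys("Kreuzstrasse")
    print(keys)
    print(lookup("Pet", keys))
```

The display logic then only has to show what follows the separator, so typing "peter" or "Peter" would both suggest "Hans-Peter".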
Re: multi term, multi field, auto suggest
On 29.01.2010, at 15:40, Lukas Kahwe Smith wrote: > I am still a bit unsure how to handle both the lowercased and the case > preserved version: > > So here are some examples: > UBS => ubs|UBS > Kreuzstrasse => kreuzstrasse|Kreuzstrasse > > So when I type "Kreu" I would get a suggestion of "Kreuzstrasse" and with > "kreu" I would get "kreuzstrasse". > Since I do not expect any words to start with a lowercase letter and still > contain some upper case letter we should be fine with this approach. > > As in I doubt there would be stuff like "fooBar" which would lead to > suggestion both "foobar" and "fooBar". > > How can I achieve this? I just noticed that I need the same thing for the word delimiter splitter. As in some way to index both the splitted and the unsplitted version so that I can use it in a facet search. Hans-Peter => Hans|Peter|Hans-Peter regards, Lukas Kahwe Smith m...@pooteeweet.org