Thanks Mike! I will read the details tomorrow. I was using the Solr Guide as my reference, and it was not very clear about which type of Analyzer to provide to the Suggester. Thanks for the support; tomorrow I will run some experiments and let you know!
Cheers

2015-06-28 11:48 GMT+01:00 Michael McCandless <luc...@mikemccandless.com>:

> Which documentation are you reading?
>
> The analyzer you send to FreeTextSuggester should not make shingles
> itself: the suggester does this internally, based on the grams
> parameter.
>
> Maybe look at the TestFreeTextSuggester unit test as an example?
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Sat, Jun 27, 2015 at 6:52 PM, Alessandro Benedetti
> <benedetti.ale...@gmail.com> wrote:
> > Hi guys,
> > after reading the documentation for the FreetextSuggester I have some
> > doubts:
> >
> > Actually the documentation is not clear enough.
> > Let's try to understand this suggester.
> >
> > Building
> > This suggester build a FST that it will use to provide the autocomplete
> > feature running prefix searches on it.
> > The terms it uses to generate the FST are the tokens produced by the
> > "suggestFreeTextAnalyzerFieldType".
> >
> > And this should be correct.
> > So if we have a shingle token filter[1-3] (we produce unigrams as well) in
> > our analysis to keep it simple, from these original field values:
> > "mp3 ipod"
> > "mp3 player"
> > "mp3 player ipod"
> > "player of Real"
> >
> > -> we produce these list of possible suggestions in our FST:
> >
> > <mp3>
> > <player>
> > <ipod>
> > <real>
> > <of>
> >
> > <mp3 ipod>
> > <mp3 player>
> > <player ipod>
> > <player of>
> > <of real>
> >
> > <mp3 player ipod>
> > <player of real>
> >
> > From the documentation I read:
> >>
> >> " ngrams: The max number of tokens out of which singles will be make the
> >> dictionary. The default value is 2. Increasing this would mean you want more
> >> than the previous 2 tokens to be taken into consideration when making the
> >> suggestions. "
> >
> >
> > This makes me confused, as I was not expecting this param to affect the
> > suggestion dictionary.
> > So I would like a clarification here from our masters :)
> > At this point let's see what happens at query time.
> >
> > Query Time
> > As my understanding the ngrams params will consider the last N-1 tokens the
> > user put separated by the space separator.
> >
> >> "Builds an ngram model from the text sent to {@link
> >> * #build} and predicts based on the last grams-1 tokens in
> >> * the request sent to {@link #lookup}. This tries to
> >> * handle the "long tail" of suggestions for when the
> >> * incoming query is a never before seen query string."
> >
> >
> > Example, grams=3 should consider only the last 2 tokens
> >
> > special mp3 p -> mp3 p
> >
> > Then this query is analysed using the "suggestFreeTextAnalyzerFieldType".
> > We produce 3 tokens:
> > <mp3>
> > <p>
> > <mp3 p>
> >
> > And we run the prefix matching on the FST.
> >
> > Conclusion
> > My understanding is wrong for sure at some point, as the behaviour I get is
> > different.
> > Can we discuss this, clarify this and eventually put it in the official
> > documentation?
> >
> > Cheers
> >
> > --
> > --------------------------
> >
> > Benedetti Alessandro
> > Visiting card : http://about.me/alessandro_benedetti
> >
> > "Tyger, tyger burning bright
> > In the forests of the night,
> > What immortal hand or eye
> > Could frame thy fearful symmetry?"
> >
> > William Blake - Songs of Experience -1794 England
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>

--
--------------------------

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England
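
P.S. To make the example from the quoted message concrete, here is a minimal plain-Java sketch of the two steps as they are described in this thread: forming the 1..grams n-grams from unigram tokens, and keeping only the last grams-1 tokens of the request. It uses no Lucene classes and is not how FreeTextSuggester is implemented. The names NgramSketch, tokenize, ngrams and lookupContext are hypothetical, and the whitespace/lowercase tokenizer is an assumed stand-in for an analyzer that emits unigrams only (no shingles).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of Lucene or Solr.
class NgramSketch {

    // Assumed stand-in for a unigram-only analyzer: lowercase, split on whitespace.
    static List<String> tokenize(String text) {
        String trimmed = text.trim().toLowerCase();
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(Arrays.asList(trimmed.split("\\s+")));
    }

    // All n-grams of 1..maxGrams consecutive tokens, shortest first.
    static List<String> ngrams(String text, int maxGrams) {
        List<String> tokens = tokenize(text);
        List<String> out = new ArrayList<>();
        for (int n = 1; n <= maxGrams; n++) {
            for (int i = 0; i + n <= tokens.size(); i++) {
                out.add(String.join(" ", tokens.subList(i, i + n)));
            }
        }
        return out;
    }

    // Keep only the last (grams - 1) tokens of the request,
    // following the "special mp3 p -> mp3 p" example for grams=3.
    static String lookupContext(String query, int grams) {
        List<String> tokens = tokenize(query);
        int keep = Math.max(grams - 1, 0);
        int from = Math.max(tokens.size() - keep, 0);
        return String.join(" ", tokens.subList(from, tokens.size()));
    }

    public static void main(String[] args) {
        // prints [mp3, player, ipod, mp3 player, player ipod, mp3 player ipod]
        System.out.println(ngrams("mp3 player ipod", 3));
        // prints mp3 p
        System.out.println(lookupContext("special mp3 p", 3));
    }
}
```

If the suggester builds the n-grams internally from the grams parameter, as Mike says, then the analyzer would only need to do the tokenize step above.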