Re: FreeText Auto-suggest

Alessandro Benedetti Sun, 28 Jun 2015 12:56:55 -0700

Thanks, Mike!
Tomorrow I will read the details.
I was using the Solr Guide as my reference, and it was not clear about
which type of Analyzer to provide the Suggester with.
Thanks for the support.
Tomorrow I will run some experiments and let you know!
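
For anyone following along, here is a minimal sketch of the point Mike
makes below: the analyzer only emits single tokens, and the n-grams are
derived afterwards from the grams parameter. It is plain Python, not Lucene
code. The function names and the lowercase/whitespace "analyzer" are
assumptions made for illustration, and the real suggester also keeps its
data in an FST, which this ignores.

```python
from collections import Counter

def analyze(text):
    # Stand-in analyzer (assumption): lowercase + whitespace split, no shingles.
    return text.lower().split()

def ngrams(tokens, grams):
    # Every contiguous run of 1..grams tokens, joined with a single space.
    return [" ".join(tokens[i:i + n])
            for n in range(1, grams + 1)
            for i in range(len(tokens) - n + 1)]

def build_model(values, grams):
    # Count how often each n-gram occurs across all field values.
    counts = Counter()
    for value in values:
        counts.update(ngrams(analyze(value), grams))
    return counts

# Field values taken from the original message quoted below.
FIELD_VALUES = ["mp3 ipod", "mp3 player", "mp3 player ipod", "player of Real"]
model = build_model(FIELD_VALUES, grams=3)

# Print unigrams first, then bigrams, then trigrams.
for gram in sorted(model, key=lambda g: (g.count(" "), g)):
    print(gram, model[gram])
```

With grams=3 the keys are the same twelve entries listed in the original
message further down, with no shingle filter in the analyzer itself.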


Cheers

2015-06-28 11:48 GMT+01:00 Michael McCandless <luc...@mikemccandless.com>:

> Which documentation are you reading?
>
> The analyzer you send to FreeTextSuggester should not make shingles
> itself: the suggester does this internally, based on the grams
> parameter.
>
> Maybe look at the TestFreeTextSuggester unit test as an example?
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Sat, Jun 27, 2015 at 6:52 PM, Alessandro Benedetti
> <benedetti.ale...@gmail.com> wrote:
> > Hi guys,
> > after reading the documentation for the FreeTextSuggester I have some
> > doubts:
> >
> > Actually the documentation is not clear enough.
> > Let's try to understand this suggester.
> >
> > Building
> > This suggester builds an FST that it uses to provide the autocomplete
> > feature by running prefix searches on it.
> > The terms it uses to generate the FST are the tokens produced by the
> > "suggestFreeTextAnalyzerFieldType".
> >
> > And this should be correct.
> > So if, to keep it simple, we have a shingle token filter [1-3] in our
> > analysis (so we produce unigrams as well), from these original field
> > values:
> > "mp3 ipod"
> > "mp3 player"
> > "mp3 player ipod"
> > "player of Real"
> >
> > -> we produce this list of possible suggestions in our FST:
> >
> > <mp3>
> > <player>
> > <ipod>
> > <real>
> > <of>
> >
> > <mp3 ipod>
> > <mp3 player>
> > <player ipod>
> > <player of>
> > <of real>
> >
> > <mp3 player ipod>
> > <player of real>
> >
> > From the documentation I read:
> >>
> >> " ngrams: The max number of tokens out of which singles will be make the
> >> dictionary. The default value is 2. Increasing this would mean you want
> >> more than the previous 2 tokens to be taken into consideration when
> >> making the suggestions. "
> >
> >
> > This confuses me, as I was not expecting this param to affect the
> > suggestion dictionary.
> > So I would like a clarification here from our masters :)
> > At this point let's see what happens at query time.
> >
> > Query Time
> > As I understand it, the ngrams param will consider the last N-1 tokens
> > the user typed, separated by spaces.
> >
> >> "Builds an ngram model from the text sent to {@link #build} and
> >> predicts based on the last grams-1 tokens in the request sent to
> >> {@link #lookup}. This tries to handle the "long tail" of suggestions
> >> for when the incoming query is a never before seen query string."
> >
> >
> > Example: grams=3 should consider only the last 2 tokens
> >
> > special mp3 p -> mp3 p
> >
> > Then this query is analysed using the
> > "suggestFreeTextAnalyzerFieldType".
> > We produce 3 tokens:
> > <mp3>
> > <p>
> > <mp3 p>
> >
> > And we run the prefix matching on the FST .
> >
> > Conclusion
> > My understanding is surely wrong at some point, as the behaviour I get
> > is different.
> > Can we discuss and clarify this, and eventually put it in the official
> > documentation?
> >
> > Cheers
> >
> > --
> > --------------------------
> >
> > Benedetti Alessandro
> > Visiting card : http://about.me/alessandro_benedetti
> >
> > "Tyger, tyger burning bright
> > In the forests of the night,
> > What immortal hand or eye
> > Could frame thy fearful symmetry?"
> >
> > William Blake - Songs of Experience -1794 England
>
>
>
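
On the query-time question in the quoted message above, here is a rough
sketch of how a request could be matched against such an n-gram list. It
assumes, as far as the FreeTextSuggester javadoc describes it, that the
longest context is tried first and shorter ones are used as a backoff. The
n-gram list is copied from the example in the quoted message; the function
names are hypothetical, and scoring and the trailing-space case are left out.

```python
# N-grams from the example in the quoted message (grams=3).
NGRAMS = [
    "mp3", "player", "ipod", "real", "of",
    "mp3 ipod", "mp3 player", "player ipod", "player of", "of real",
    "mp3 player ipod", "player of real",
]

def backoff_prefixes(query, grams):
    # Keep at most the last `grams` tokens: grams-1 context tokens plus the
    # token being typed. Return them longest first, then progressively shorter.
    last = query.lower().split()[-grams:]
    return [" ".join(last[i:]) for i in range(len(last))]

def candidates(query, grams, ngram_keys=NGRAMS):
    # For each prefix, match only n-grams with the same number of tokens.
    result = []
    for prefix in backoff_prefixes(query, grams):
        size = prefix.count(" ") + 1
        matches = sorted(k for k in ngram_keys
                         if k.count(" ") + 1 == size and k.startswith(prefix))
        result.append((prefix, matches))
    return result

for prefix, matches in candidates("special mp3 p", grams=3):
    print(repr(prefix), "->", matches)
```

In this sketch "special mp3 p" finds nothing at the trigram level, then
"mp3 p" matches "mp3 player" and "p" matches "player".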


-- 
--------------------------

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

