Hmm, if I followed correctly, your use case is:

I type Asmtreadm and I want documents matching Amsterdam (even if the edit
distance is greater than 2).
First of all, I hope this is something you do only when you get 0 results;
otherwise the overhead can be large and you will lose a lot of precision,
which will confuse the customer.
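
The "only when you get 0 results" idea can be sketched roughly as below. This
is just an illustration in Python: strict_search and approximate_search are
hypothetical callables standing in for whatever sends the two Solr requests,
they are not part of any real client API.

```python
from typing import Callable, List

# A search function takes the user query and returns matching document ids.
SearchFn = Callable[[str], List[str]]


def search_with_fallback(
    query: str,
    strict_search: SearchFn,       # hypothetical: the normal, precise query
    approximate_search: SearchFn,  # hypothetical: the expensive fuzzy/gram query
) -> List[str]:
    """Run the precise query first; go approximate only on zero results."""
    results = strict_search(query)
    if results:
        # Precise hits exist, so the approximate query is never paid for.
        return results
    return approximate_search(query)
```

This way the cost and the precision loss of the approximate query are paid
only by the searches that would otherwise return nothing.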

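pf2 and pf3 build n-grams (of 2 and 3 words) from the whitespace-separated
tokens of the query, and run them as partial phrase queries that only affect
the scoring. They work on whole words, not on character grams inside a word,
so they are not a good fit for your problem.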
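
To make that concrete, here is a simplified picture of how word pairs and
triples are formed from a query. It is a sketch in Python under my own naming
(word_shingles is a made-up helper), not the actual edismax code.

```python
from typing import List


def word_shingles(query: str, size: int) -> List[str]:
    """Return every run of `size` consecutive whitespace-separated words."""
    words = query.split()
    # When the query has fewer than `size` words the range is empty,
    # so nothing is produced.
    return [" ".join(words[i:i + size]) for i in range(len(words) - size + 1)]


if __name__ == "__main__":
    print(word_shingles("hotel near amsterdam centre", 2))
    print(word_shingles("hotel near amsterdam centre", 3))
    print(word_shingles("Asmtreadm", 2))
```

A single-word query such as Asmtreadm gives no pairs and no triples at all, so
there is nothing for pf2 or pf3 to boost.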

Rather than grams, have you considered some sort of phonetic matching?
Could this help:
https://cwiki.apache.org/confluence/display/solr/Phonetic+Matching
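
To give an idea of what a phonetic encoder does, below is a minimal Python
sketch of classic American Soundex, which is one of several encoders. It only
illustrates the principle; it is not Solr's implementation, and the function
name soundex is my own choice.

```python
# Consonant groups that Soundex treats as sounding alike.
SOUNDEX_CODES = {}
for group, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                     ("l", "4"), ("mn", "5"), ("r", "6")):
    for letter in group:
        SOUNDEX_CODES[letter] = digit


def soundex(word: str) -> str:
    """Return the 4-character Soundex code of `word` ('' for no letters)."""
    letters = [ch for ch in word.lower() if ch.isalpha()]
    if not letters:
        return ""
    code = letters[0].upper()
    previous = SOUNDEX_CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate two equal codes
        digit = SOUNDEX_CODES.get(ch, "")  # vowels map to ""
        if digit and digit != previous:
            code += digit
            if len(code) == 4:
                break
        previous = digit  # a vowel resets it, so a repeated code counts again
    return code.ljust(4, "0")


if __name__ == "__main__":
    for name in ("Amsterdam", "Amstardam", "Asmtreadm"):
        print(name, soundex(name))
```

Words that share a code are treated as the same term at search time. Note that
an encoder of this kind is tolerant of vowel mistakes, but swapped consonants
can still change the code, so it is worth testing the encoders against your
real misspellings.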

Cheers

On 10 March 2016 at 08:47, elisabeth benoit <elisaelisael...@gmail.com>
wrote:

> I am trying to do approximate search with Solr. We've tried fuzzy search
> and spellcheck search; it works OK, but the edit distance is limited (to 2
> for DirectSolrSpellChecker in Solr 4.10.1). With the fuzzy operator, we've
> had performance issues, and I don't think you can have an edit distance of
> more than 2.
>
> What we used to do with a database was more efficient: storing trigrams
> with their position, and then searching around that position (not precisely
> at that position, since it's approximate search).
>
> The position is there to avoid, for a trigram like ams (amsterdam), getting
> answers where the same trigram is, for instance, at the end of the word. I
> would like answers with the same relative position between trigrams to score
> higher. Maybe using edismax's pf2 and pf3 is a way to do this. I don't see
> any other way. Please tell me if you do.
>
> From your answer, I get that the position is stored, but I don't understand
> how I can preserve the relative order between trigrams, apart from using pf2
> and pf3.
>
> Best regards,
> Elisabeth
>
> 2016-03-10 0:02 GMT+01:00 Alessandro Benedetti <abenede...@apache.org>:
>
> > If you store the positions for your tokens (and they are stored by default
> > if you don't omit them), you have the relative position in the index. [1]
> > I attach a blog post of mine, describing the Lucene internals in a little
> > more detail.
> >
> > Apart from that, can you explain the problem you are trying to solve ?
> > The high level user experience ?
> > What kind of search/autocompletion/relevancy tuning are you trying to
> > achieve ?
> > Maybe we can help better if we start from the problem :)
> >
> > Cheers
> >
> > [1] http://alexbenedetti.blogspot.co.uk/2015/07/exploring-solr-internals-lucene.html
> >
> > On 9 March 2016 at 15:02, elisabeth benoit <elisaelisael...@gmail.com>
> > wrote:
> >
> > > Hello Alessandro,
> > >
> > > You may be right. What would you use to keep relative order between, for
> > > instance, grams
> > >
> > > __a
> > > _am
> > > ams
> > > mst
> > > ste
> > > ter
> > > erd
> > > rda
> > > dam
> > > am_
> > >
> > > of amsterdam? pf2 and pf3? That's all I can think of. Please let me know
> > > if you have more insights.
> > >
> > > Best regards,
> > > Elisabeth
> > >
> > > 2016-03-08 17:46 GMT+01:00 Alessandro Benedetti <abenede...@apache.org>:
> > >
> > > > Elisabeth,
> > > > out of curiosity, could we know what you are trying to solve with that
> > > > complex way of tokenisation?
> > > > Solr is really good at storing positions along with tokens, so I am
> > > > curious to know why you are mixing things up.
> > > >
> > > > Cheers
> > > >
> > > > On 8 March 2016 at 10:08, elisabeth benoit <elisaelisael...@gmail.com>
> > > > wrote:
> > > >
> > > > > Thanks for your answer Emir,
> > > > >
> > > > > I'll check that out.
> > > > >
> > > > > Best regards,
> > > > > Elisabeth
> > > > >
> > > > > 2016-03-08 10:24 GMT+01:00 Emir Arnautovic <emir.arnauto...@sematext.com>:
> > > > >
> > > > > > Hi Elisabeth,
> > > > > > I don't think there is such a token filter, so you would have to
> > > > > > create your own token filter that takes a token and emits ngram
> > > > > > tokens of a specific length.
> > > > > > It should not be too hard to create such a filter - you can take a
> > > > > > look at how the ngram filter is coded - yours should be simpler
> > > > > > than that.
> > > > > >
> > > > > > Regards,
> > > > > > Emir
> > > > > >
> > > > > >
> > > > > > On 08.03.2016 08:52, elisabeth benoit wrote:
> > > > > >
> > > > > >> Hello,
> > > > > >>
> > > > > >> I'm using Solr 4.10.1. I'd like to index words with ngrams of
> > > > > >> fixed length with a position at the end.
> > > > > >>
> > > > > >> For instance, with fixed length 3, Amsterdam would be something
> > > > > >> like:
> > > > > >>
> > > > > >>
> > > > > >> a0 (two spaces added at beginning)
> > > > > >> am1
> > > > > >> ams2
> > > > > >> mst3
> > > > > >> ste4
> > > > > >> ter5
> > > > > >> erd6
> > > > > >> rda7
> > > > > >> dam8
> > > > > >> am9 (one more space in the end)
> > > > > >>
> > > > > >> The number at the end being the position.
> > > > > >>
> > > > > >> Does anyone have a clue how to achieve this?
> > > > > >>
> > > > > >> Best regards,
> > > > > >> Elisabeth
> > > > > >>
> > > > > >>
> > > > > > --
> > > > > > Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> > > > > > Solr & Elasticsearch Support * http://sematext.com/
> > > > > >
> > > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > --------------------------
> > > >
> > > > Benedetti Alessandro
> > > > Visiting card : http://about.me/alessandro_benedetti
> > > >
> > > > "Tyger, tyger burning bright
> > > > In the forests of the night,
> > > > What immortal hand or eye
> > > > Could frame thy fearful symmetry?"
> > > >
> > > > William Blake - Songs of Experience -1794 England
> > > >
> > >
> >
> >
> >
> >
>



