--- On Tue, 3/30/10, Andrzej Bialecki <a...@getopt.org> wrote:

> From: Andrzej Bialecki <a...@getopt.org>
> Subject: Re: SOLR-1316 How To Implement this autosuggest component ???
> To: solr-user@lucene.apache.org
> Date: Tuesday, March 30, 2010, 9:59 AM
> On 2010-03-30 15:42, Robert Muir wrote:
> > On Mon, Mar 29, 2010 at 11:34 PM, Andy <angelf...@yahoo.com> wrote:
> >
> >> Reading through this thread and SOLR-1316, there seem to be a lot of
> >> different ways to implement auto-complete in Solr. I've seen mentions
> >> of:
> >>
> >> EdgeNGrams
> >> TermsComponent
> >> Faceting
> >> TST
> >> Patricia Tries
> >> RadixTree
> >> DAWG
> >>
> >>
> >
> > Another idea is to use the Automaton support in the Lucene flexible
> > indexing branch to query the index directly with a DFA that represents
> > whatever terms you want back.
> > The idea is that there really isn't much gain in building a separate Pat,
> > Radix Tree, or DFA to do this when you can efficiently intersect a DFA with
> > the existing terms dictionary.
> >
> > I don't really understand what autosuggest needs to do, but if you are doing
> > things like looking for misspellings you can easily build a DFA that
> > recognizes terms within some short edit distance with the support that's
> > there (the LevenshteinAutomata class), to quickly get back candidates.
> >
> > You can intersect/concatenate/union these DFAs with prefix or suffix DFAs if
> > you want, too. I don't really understand what the algorithm should do, but I'm
> > happy to try to help.
> >
> 
> The problem is a bit more complicated. There are two issues:
> 
> * simple term-level completion often produces wrong results for
> multi-term queries (which are usually rewritten as "weak" phrase queries),
> 
> * the weights of suggestions should not correspond directly to IDF in
> the index - much better results can be obtained when they correspond to
> the frequency of terms/phrases in the query logs ...
> 
> TermsComponent and EdgeNGrams, while simple to use, suffer from both issues.
> 

Thanks.

I actually have 2 use cases for autosuggest:

1) The "normal" one - I want to suggest search terms to users after they've 
typed a few letters, just like Google Suggest. It looks like SOLR-1316 is the 
best option for this use case. Is that right?

2) I have a field "city" with values that are entered by users. When a user is 
entering his city, I want to make suggestions based on the cities that other 
users have already entered, in order to reduce the chance of duplicates. What 
method would you recommend for this use case?
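
To make the second use case concrete, here is a rough Python sketch of the behaviour I have in mind. It is not a Solr implementation and does not use any Solr API. The names `CitySuggester` and `normalize` are made up for illustration, and ranking by how many users entered each value is my own assumption, borrowed from the point above about weighting suggestions by frequency.

```python
import bisect
from collections import Counter


def normalize(city):
    """Case-fold and collapse whitespace so 'san  jose' and 'San Jose' compare equal."""
    return " ".join(city.split()).casefold()


class CitySuggester:
    """Hypothetical in-memory prefix suggester over cities already entered by users."""

    def __init__(self, entered_cities):
        counts = Counter()   # normalized city -> number of users who entered it
        display = {}         # normalized city -> first spelling seen, shown to the user
        for raw in entered_cities:
            key = normalize(raw)
            if not key:
                continue     # ignore empty or whitespace-only entries
            counts[key] += 1
            display.setdefault(key, " ".join(raw.split()))
        self._counts = counts
        self._display = display
        self._keys = sorted(counts)  # sorted, so all keys sharing a prefix are contiguous

    def suggest(self, prefix, limit=5):
        """Return up to `limit` known cities starting with `prefix`, most common first."""
        p = normalize(prefix)
        if not p:
            return []
        # Binary search for the first key >= prefix, then scan while the prefix matches.
        start = bisect.bisect_left(self._keys, p)
        matches = []
        for key in self._keys[start:]:
            if not key.startswith(p):
                break
            matches.append(key)
        # Most frequently entered first; ties broken alphabetically.
        matches.sort(key=lambda k: (-self._counts[k], k))
        return [self._display[k] for k in matches[:limit]]


suggester = CitySuggester([
    "San Francisco", "san francisco", "San Jose",
    "Santa Fe", "Boston", "  San   Jose ", "SAN JOSE",
])
print(suggester.suggest("san"))
```

The point of normalizing before counting is that "SAN JOSE" and "San Jose" collapse into one suggestion instead of being offered as two different cities, which is the duplication I am trying to avoid.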


