Re: SOLR-1316 How To Implement this autosuggest component ???

Andrzej Bialecki Tue, 30 Mar 2010 07:00:01 -0700

On 2010-03-30 15:42, Robert Muir wrote:

On Mon, Mar 29, 2010 at 11:34 PM, Andy<angelf...@yahoo.com>  wrote:

Reading through this thread and SOLR-1316, there seems to be a lot of
different ways to implement auto-complete in Solr. I've seen the mentions
of:

EdgeNGrams
TermsComponent
Faceting
TST
Patricia Tries
RadixTree
DAWG


Another idea is you can use the Automaton support in the lucene flexible
indexing branch: to query the index directly with a DFA that represents
whatever terms you want back.
The idea is that there really isn't much gain in building a separate Pat,
Radix Tree, or DFA to do this when you can efficiently intersect a DFA with
the existing terms dictionary.

I don't really understand what autosuggest needs to do, but if you are doing
things like looking for mispellings you can easily build a DFA that
recognizes terms within some short edit distance with the support thats
there (the LevenshteinAutomata class), to quickly get back candidates.

You can intersect/concatenate/union these DFAs with prefix or suffix DFAs if
you want too, don't really understand what the algorithm should do, but I'm
happy to try to help.


The problem is a bit more complicated. There are two issues:

* simple term-level completion often produces wrong results formulti-term queries (which are usually rewritten as "weak" phrase queries),

* the weights of suggestions should not correspond directly to IDF inthe index - much better results can be obtained when they correspond tothe frequency of terms/phrases in the query logs ...


TermsComponent and EdgeNGrams, while simple to use, suffer from both issues.

--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com

Re: SOLR-1316 How To Implement this autosuggest component ???

Reply via email to