[ 
https://issues.apache.org/jira/browse/SOLR-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789797#action_12789797
 ] 

Shalin Shekhar Mangar commented on SOLR-1316:
---------------------------------------------

{quote}My thinking was that the usual scenario is that you submit autosuggest 
queries soon after user starts typing the query, and the highest perceived 
value of such functionality is when it can suggest complete meaningful phrases 
and not just individual terms. I.e. when you start typing "token sug" it won't 
suggest "token sugar" but instead it will suggest "token suggestions".{quote}

Yes, but the choice between complete phrases and individual terms should be 
left to the user. It is controlled by the "queryAnalyzerFieldType" setting in 
SpellCheckComponent: we will index the tokens returned by that analyzer, so 
the user can configure whichever behavior they want. For example, with a 
KeywordAnalyzer we will index/suggest phrases, and with a WhitespaceAnalyzer 
we will index/suggest individual terms.
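
To illustrate the difference, here is a minimal sketch in plain Java. It does 
not use the Lucene analyzers or the patch's trie: KEYWORD and WHITESPACE are 
hypothetical stand-ins for the two analyzers, and a sorted set stands in for 
the lookup structure. It only shows how the choice of tokens decides whether 
phrases or terms come back.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;
import java.util.function.Function;

class AnalyzerChoiceSketch {
    // Hypothetical stand-in for a keyword-style analyzer: the whole input is one token.
    static final Function<String, List<String>> KEYWORD = s -> List.of(s);

    // Hypothetical stand-in for a whitespace-style analyzer: one token per word.
    static final Function<String, List<String>> WHITESPACE =
            s -> Arrays.asList(s.trim().split("\\s+"));

    // Build a sorted dictionary from whatever tokens the analyzer emits.
    static TreeSet<String> buildDictionary(List<String> entries,
                                           Function<String, List<String>> analyzer) {
        TreeSet<String> dict = new TreeSet<>();
        for (String entry : entries) {
            dict.addAll(analyzer.apply(entry));
        }
        return dict;
    }

    // Return every dictionary entry that starts with the prefix, in sorted order.
    static List<String> suggest(TreeSet<String> dict, String prefix) {
        List<String> out = new ArrayList<>();
        for (String candidate : dict.tailSet(prefix, true)) {
            if (!candidate.startsWith(prefix)) {
                break; // entries sharing a prefix are contiguous in sorted order
            }
            out.add(candidate);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> entries = List.of("token suggestions", "token sugar");
        TreeSet<String> phrases = buildDictionary(entries, KEYWORD);
        TreeSet<String> terms = buildDictionary(entries, WHITESPACE);
        System.out.println(suggest(phrases, "token sug")); // phrase-level suggestions
        System.out.println(suggest(terms, "sug"));         // term-level suggestions
    }
}
```

With the keyword-style dictionary the prefix "token sug" matches whole 
phrases. With the whitespace-style dictionary only single terms are stored, 
so the same prefix matches nothing and the query has to be looked up token by 
token.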

{quote}Such as? What you put there is what you get so the fact that we are 
getting complete phrases as suggestions is the consequence of the choice above 
- the trie in this case is populated with phrases. If we populate it with 
tokens, then we can return per-token suggestions, again - losing the added 
value I mentioned above.{quote}

My point was that SpellingResult is too coarse. It is a complete result (for 
all tokens given by "queryAnalyzerFieldType"). If that analyzer gives us 
multiple tokens, then we must get suggestions for each of them. In that case, 
returning a SpellingResult for each token is not right. Instead, the Suggestor 
should combine the suggestions for all tokens into a single SpellingResult 
object. I don't have a concrete alternative to propose. It looks like we may 
need to invent a custom type that represents the (suggestion, frequency) pair.
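
As a rough sketch of what I mean (plain Java, not the actual SpellingResult 
API): the names Suggestion, Lookup and combine below are hypothetical and 
only illustrate the shape. The per-token lookup returns a list of 
(suggestion, frequency) pairs, and a single combining step gathers them for 
all tokens.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SuggestionPairSketch {
    // Hypothetical value type: one suggested string and its frequency.
    static final class Suggestion {
        final String text;
        final int frequency;

        Suggestion(String text, int frequency) {
            this.text = text;
            this.frequency = frequency;
        }
    }

    // Hypothetical per-token lookup, e.g. backed by a trie.
    interface Lookup {
        List<Suggestion> lookup(String token);
    }

    // Gather the suggestions for every token into one combined result.
    // Token order is preserved; each list is sorted by descending frequency.
    static Map<String, List<Suggestion>> combine(List<String> tokens, Lookup lookup) {
        Map<String, List<Suggestion>> combined = new LinkedHashMap<>();
        for (String token : tokens) {
            List<Suggestion> sorted = new ArrayList<>(lookup.lookup(token));
            sorted.sort((a, b) -> Integer.compare(b.frequency, a.frequency));
            combined.put(token, sorted);
        }
        return combined;
    }

    public static void main(String[] args) {
        // Toy lookup with made-up frequencies, for illustration only.
        Lookup toy = token -> token.equals("tok")
                ? List.of(new Suggestion("token", 3), new Suggestion("tokenizer", 7))
                : List.of(new Suggestion("suggestions", 5));
        combine(List.of("tok", "sug"), toy).forEach((token, suggestions) -> {
            for (Suggestion s : suggestions) {
                System.out.println(token + " -> " + s.text + " (" + s.frequency + ")");
            }
        });
    }
}
```

The combined map could then be converted into one SpellingResult by the 
component, so the lookup code never has to build a SpellingResult itself.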

{quote}
For now I'm sure that we do NOT want to use the impl. of RadixTree in this 
patch, because it doesn't support our use case - I'll prepare a patch that 
removes this impl. Other implementations seem comparable with respect to speed, 
based on casual tests using /usr/share/dict/words, but I didn't run any exact 
benchmarks yet.
{quote}

OK. Go ahead with the patch and I'll try to find some time to compare the two 
methods. What about DAWGs? Are we still considering them?

{quote}
Shouldn't we be creating a separate AutoSuggestComponent like the 
SpellCheckComponent, having its own prepare, process and inform functions?
{quote}

We could do that, but as Andrej noted, we'd end up re-implementing a lot of 
its functionality. I'm not sure it is worth it. I agree that it'd be odd to 
use parameters prefixed with "spellcheck" for auto-suggest, and it would have 
been easier the other way around. Does anybody have a suggestion?

> Create autosuggest component
> ----------------------------
>
>                 Key: SOLR-1316
>                 URL: https://issues.apache.org/jira/browse/SOLR-1316
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>    Affects Versions: 1.4
>            Reporter: Jason Rutherglen
>            Assignee: Shalin Shekhar Mangar
>            Priority: Minor
>             Fix For: 1.5
>
>         Attachments: suggest.patch, suggest.patch, TST.zip
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> Autosuggest is a common search function that can be integrated
> into Solr as a SearchComponent. Our first implementation will
> use the TernaryTree found in Lucene contrib. 
> * Enable creation of the dictionary from the index or via Solr's
> RPC mechanism
> * What types of parameters and settings are desirable?
> * Hopefully in the future we can include user click through
> rates to boost those terms/phrases higher

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.