[ 
https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12600269#action_12600269
 ] 

Otis Gospodnetic commented on SOLR-572:
---------------------------------------

I haven't applied/tried the latest patch yet, but maybe it's
quicker/better to ask here.  I'm wondering/worried about the case
where the input is a multi-term query string and a subset (e.g. 2 of 5
terms) of the query terms is misspelled.

For example, what happens when the query is:

"london brigge is fallinge down"
(my 2 year old's current hit)

In this case the suggestions should be:
# brigge => bridge
# fallinge => falling (or fall, more likely)

Is there something in the response that will allow the client to
figure out the positioning of the spelling suggestions and piece
together the ideal alternative query, in this case "london bridge is
falling/fall down"?

Ideally, the client could piece together the new query string itself, so that 
it can, for example, italicize the words that were changed (see Google's "Did 
you mean").  If the current SpellCheckerRequestHandler (SCRH) returns only the 
final corrected string, e.g. "london bridge is falling down", the client has no 
easy/accurate way of figuring out what was changed, I think.  If the SCRH 
returned some mark-up that told the client which word(s) changed, then the 
client could do something with those changed words, e.g. "london 
bridge{was:brigge}...."
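
To make the idea concrete, here is a rough client-side sketch in Python, 
assuming the response carried the bridge{was:brigge} style mark-up suggested 
above.  The mark-up syntax and the helper names (render, changes) are 
hypothetical; nothing like this exists in the current patch.

```python
import re

# Hypothetical inline mark-up: corrected{was:original}
MARK = re.compile(r"(\S+?)\{was:([^}]*)\}")


def render(marked):
    """Return the corrected query with each changed word wrapped in <i>...</i>."""
    return MARK.sub(lambda m: "<i>{}</i>".format(m.group(1)), marked)


def changes(marked):
    """Return (original, corrected) pairs in the order they appear in the query."""
    return [(m.group(2), m.group(1)) for m in MARK.finditer(marked)]


if __name__ == "__main__":
    response = "london bridge{was:brigge} is falling{was:fallinge} down"
    print(render(response))
    print(changes(response))
```

One weakness of inline mark-up is that the client has to parse it back out of 
the string, and a query that legitimately contains "{was:" would confuse it.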

Or, if that has problems, maybe each word should be returned separately and 
sequentially:

<word="london"/> <!-- unchanged -->
<word="brigge">bridge</word>

or maybe with offset info:

<word="london" offset="0"/> <!-- unchanged -->
<word="brigge" offset="7">bridge</word>
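
With offsets, the client side becomes simple splicing.  A minimal sketch in 
Python, assuming the offsets are zero-based character positions into the 
original query string (so "brigge" starts at 7) and that the client has 
already parsed the response into (original, offset, corrected) tuples.  The 
function name and the tuple layout are made up for illustration only.

```python
def apply_suggestions(query, suggestions, fmt="<i>{}</i>"):
    """Splice corrections into `query` and return the new string.

    `suggestions` is a list of (original, offset, corrected) tuples, where
    `offset` is assumed to be the zero-based character position of `original`
    in `query`.  Each corrected word is passed through `fmt` so the client
    can highlight it; use fmt="{}" for the plain corrected query.
    """
    out = []
    pos = 0
    # Walk left to right so the untouched text between corrections is kept.
    for original, offset, corrected in sorted(suggestions, key=lambda s: s[1]):
        if query[offset:offset + len(original)] != original:
            raise ValueError(
                "offset {} does not point at {!r}".format(offset, original))
        out.append(query[pos:offset])
        out.append(fmt.format(corrected))
        pos = offset + len(original)
    out.append(query[pos:])
    return "".join(out)


if __name__ == "__main__":
    query = "london brigge is fallinge down"
    suggestions = [("brigge", 7, "bridge"), ("fallinge", 17, "falling")]
    print(apply_suggestions(query, suggestions))
    print(apply_suggestions(query, suggestions, fmt="{}"))
```

Unchanged words never need to appear in the response at all in this form, 
since the client copies them straight from the original query.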

Thoughts?


> Spell Checker as a Search Component
> -----------------------------------
>
>                 Key: SOLR-572
>                 URL: https://issues.apache.org/jira/browse/SOLR-572
>             Project: Solr
>          Issue Type: New Feature
>          Components: spellchecker
>    Affects Versions: 1.3
>            Reporter: Shalin Shekhar Mangar
>            Assignee: Grant Ingersoll
>            Priority: Minor
>             Fix For: 1.3
>
>         Attachments: SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, 
> SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, 
> SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch
>
>
> Expose the Lucene contrib SpellChecker as a Search Component. Provide the 
> following features:
> * Allow creating a spell index on a given field and make it possible to have 
> multiple spell indices -- one for each field
> * Give suggestions on a per-field basis
> * Given a multi-word query, give only one consistent suggestion
> * Process the query with the same analyzer specified for the source field and 
> process each token separately
> * Allow the user to specify minimum length for a token (optional)
> Consistency criteria for a multi-word query can consist of the following:
> * Preserve the correct words in the original query as they are
> * Never give duplicate words in a suggestion

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
