[
https://issues.apache.org/jira/browse/SOLR-1676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12793158#action_12793158
]
Shalin Shekhar Mangar commented on SOLR-1676:
---------------------------------------------
Although it is not documented anywhere, SpellCheckComponent passes
max(spellcheck.count, 5) to the Lucene spellchecker; see
AbstractLuceneSpellChecker, line 141 in trunk.
bq. The effect is that with a low value for spellcheck.count you might miss
good hits. In other words, the first item with spellcheck.count==1 is not
always the same item as with e.g. spellcheck.count==10.
That is true. It is a trade-off between accuracy and performance. We cannot
avoid this without internally fetching all results (or a large number of them)
and scoring each of them with a distance metric, which can make it very slow.
Do you have any suggestion on how we could improve the documentation?
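
To make the trade-off concrete, here is a toy sketch in Python. It is not the
Lucene or Solr code. It only models the two numbers mentioned in this thread:
the max(spellcheck.count, 5) floor and the factor of 10 applied to that value
when fetching candidates. The function names (levenshtein, suggest), the
candidate list, and its retrieval order are all hypothetical and chosen purely
for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Plain edit distance, standing in for the spellchecker's distance metric."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def suggest(word, ranked_candidates, count, min_count=5, factor=10):
    """Toy model: fetch a limited window of candidates, then re-rank by distance.

    ranked_candidates is assumed to be already ordered by the index's own
    retrieval score (hypothetical here); only the first
    max(count, min_count) * factor entries are ever scored.
    """
    num_sug = max(count, min_count)
    window = ranked_candidates[: num_sug * factor]
    rescored = sorted(window, key=lambda c: levenshtein(word, c))
    return rescored[:count]


# Hypothetical retrieval order: the closest word ("solr") sits at position 60,
# outside the 50-candidate window used for count=1 but inside the
# 100-candidate window used for count=10.
ranked = ["seller"] + [f"filler{i:02d}" for i in range(59)] + ["solr"]

print(suggest("sollr", ranked, count=1))       # ['seller']
print(suggest("sollr", ranked, count=10)[0])   # solr
```

With count=1 only the first 50 candidates are scored, so the closest match is
never seen. With count=10 the window grows to 100 and the first suggestion
changes, which is the behaviour described in the issue.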
> spellcheck.count has confusing default and documentation
> --------------------------------------------------------
>
> Key: SOLR-1676
> URL: https://issues.apache.org/jira/browse/SOLR-1676
> Project: Solr
> Issue Type: Bug
> Components: spellchecker
> Affects Versions: 1.4
> Reporter: Daniel Naber
> Priority: Minor
>
> It seems spellcheck.count does not just limit the number of results returned,
> as the documentation claims. Instead, this value is given to the Lucene
> SpellChecker class which multiplies it by 10 and then only fetches the first
> spellcheck.count*10 candidates, ignoring all others. The effect is that with
> a low value for spellcheck.count you might miss good hits. In other words,
> the first item with spellcheck.count==1 is not always the same item as with
> e.g. spellcheck.count==10.
> A possible fix would be to update the documentation (the comments in the
> sample solrconfig.xml) to mention this and to use a better default.
> The Lucene SpellChecker class says about the numSug parameter: "Thus, you
> should set this value to *at least* 5 for a good suggestion."