[
https://issues.apache.org/jira/browse/SOLR-2571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044917#comment-13044917
]
James Dyer commented on SOLR-2571:
----------------------------------
{quote}
what makes this 'decision' of correctlySpelled? Do you know?
{quote}
I took a quick look to find out. It's more complicated than I thought! Here's
the basic gist (I think!):
- If the instance of SolrSpellChecker returns frequency data and all
suggestions have frequency >0, TRUE.
- If the instance of SolrSpellChecker returns frequency data and any
suggestion has frequency == 0, FALSE.
- If the instance of SolrSpellChecker returns NO frequency data but has
suggestions, OMIT.
- If the instance of SolrSpellChecker returns NO suggestions, FALSE.
Possibly this isn't fully accurate, but I'm at least mostly correct here. It
seems like the discrepancy with DirectSolrSpellChecker is because it isn't
returning frequency info?
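The four rules above can be sketched roughly as follows. This is only an
illustration of the decision as described in this comment, not the actual code
in SpellCheckComponent.toNamedList(); the enum, class, and parameter names are
all hypothetical, and it assumes the "no suggestions" rule takes precedence
over the frequency rules.

```java
import java.util.List;

// Hypothetical stand-in for the three outcomes listed above; not a Solr type.
enum CorrectlySpelled { TRUE, FALSE, OMIT }

final class CorrectlySpelledSketch {
    /**
     * @param hasFrequencyData whether the spell checker returned frequency info
     * @param suggestionFreqs  one entry per suggestion; the values are ignored
     *                         when hasFrequencyData is false
     */
    static CorrectlySpelled decide(boolean hasFrequencyData, List<Integer> suggestionFreqs) {
        // Assumption: "no suggestions" is checked first and always yields FALSE.
        if (suggestionFreqs.isEmpty()) {
            return CorrectlySpelled.FALSE;
        }
        // Suggestions without frequency data: correctlySpelled is left out.
        if (!hasFrequencyData) {
            return CorrectlySpelled.OMIT;
        }
        // Frequencies are assumed non-negative, so "<= 0" means "== 0" here.
        for (int freq : suggestionFreqs) {
            if (freq <= 0) {
                return CorrectlySpelled.FALSE;
            }
        }
        return CorrectlySpelled.TRUE;
    }
}
```

Under this reading, a checker that returns suggestions but no frequency data
always lands in the OMIT branch, which would match the behavior suspected for
DirectSolrSpellChecker.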
This all happens in SpellCheckComponent.toNamedList() ... I'm guessing the code
here uses the presence or absence of frequency data as a kind of proxy
indicator of whether it's dealing with IndexBasedSpellChecker or
FileBasedSpellChecker. Possibly it would be better if each instance of
SolrSpellChecker had an "isCorrectlySpelled()" method that toNamedList() could
call? Maybe I should go open another jira issue for that?
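To make the suggestion concrete, here is a minimal sketch of what moving the
decision into each spell checker might look like. Everything in it is
hypothetical: the interface, the two implementations, and the render() helper
are illustrative stand-ins, not the real SolrSpellChecker API or the real
toNamedList() code.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical interface; the real SolrSpellChecker has a different API.
interface SpellCheckerSketch {
    // Optional.empty() means "omit correctlySpelled from the response".
    Optional<Boolean> isCorrectlySpelled(List<Integer> suggestionFreqs);
}

// Stand-in for a checker that has real frequency data (index-based).
final class IndexBackedSketch implements SpellCheckerSketch {
    @Override
    public Optional<Boolean> isCorrectlySpelled(List<Integer> suggestionFreqs) {
        if (suggestionFreqs.isEmpty()) {
            return Optional.of(false);
        }
        return Optional.of(suggestionFreqs.stream().allMatch(f -> f > 0));
    }
}

// Stand-in for a checker with no frequency data (file-based);
// only the number of suggestions matters, the values are ignored.
final class FileBackedSketch implements SpellCheckerSketch {
    @Override
    public Optional<Boolean> isCorrectlySpelled(List<Integer> suggestionFreqs) {
        return suggestionFreqs.isEmpty() ? Optional.of(false) : Optional.empty();
    }
}

final class ResponseWriterSketch {
    // The caller no longer infers the checker type from the data it sees;
    // it simply asks the checker and writes the flag only when one is given.
    static String render(SpellCheckerSketch checker, List<Integer> suggestionFreqs) {
        return checker.isCorrectlySpelled(suggestionFreqs)
                .map(flag -> "correctlySpelled=" + flag)
                .orElse("");
    }
}
```

With this shape, a new checker such as DirectSolrSpellChecker would state its
own answer instead of depending on whether it happens to return frequency
info.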
> IndexBasedSpellChecker "thresholdTokenFrequency" fails with a
> ClassCastException on startup
> -------------------------------------------------------------------------------------------
>
> Key: SOLR-2571
> URL: https://issues.apache.org/jira/browse/SOLR-2571
> Project: Solr
> Issue Type: Bug
> Components: spellchecker
> Affects Versions: 1.4.1, 3.1, 4.0
> Reporter: James Dyer
> Priority: Minor
> Labels: whereIsHossManWhenYouNeedHim
> Fix For: 3.3, 4.0
>
> Attachments: SOLR-2571.patch, SOLR-2571.patch, SOLR-2571.patch,
> SOLR-2571.solr3.2.patch
>
>
> When parsing the configuration for "thresholdTokenFrequency", the
> IndexBasedSpellChecker tries to pull a Float from the DataConfig.xml-derived
> NamedList. However, this comes through as a String. Therefore, a
> ClassCastException is always thrown whenever this parameter is specified.
> The code ought to be doing "Float.parseFloat(...)" on the value.
> This looks like a nice feature to use in cases where the data contains
> misspelled or rare words leading to spurious "correct" queries. I would have
> liked to have used this with a project we just completed; however, this bug
> prevented that. This issue came up recently on the users' mailing list, so I
> am raising an issue now.
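The failure mode in the issue description can be illustrated with a small
sketch. It is not the attached patch: a plain Map stands in for the
config-derived NamedList, and the class and method names are hypothetical. It
only shows why a direct cast fails when the value arrives as a String and how
parsing avoids that.

```java
import java.util.Map;

final class ThresholdParseSketch {
    // Hypothetical helper; "config" stands in for the config-derived NamedList.
    static float readThreshold(Map<String, Object> config, float defaultValue) {
        Object raw = config.get("thresholdTokenFrequency");
        if (raw == null) {
            return defaultValue;
        }
        // A direct cast such as (Float) raw throws ClassCastException
        // when the value arrives as a String, as the issue describes.
        if (raw instanceof Number) {
            return ((Number) raw).floatValue();
        }
        // Parse the textual form instead of casting.
        return Float.parseFloat(raw.toString().trim());
    }
}
```

Accepting both a Number and a String here is a design choice for the sketch,
so it behaves the same whichever type the configuration layer hands over.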