[ https://issues.apache.org/jira/browse/SOLR-630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12744860#action_12744860 ]
Alex Baranov commented on SOLR-630:
-----------------------------------

I propose closing this bug.

1) of->oft

Whether stop words are omitted depends on which parameter carries the query:
a. If the "q" parameter is used, the "queryAnalyzerFieldType" parameter determines the analyzer for the query. If "queryAnalyzerFieldType" is not specified, WhitespaceTokenizer is used.
b. If the "spellcheck.q" parameter is used, the query analyzer of the spellchecker field is used.

2) America->Americaa, america->[none]

I couldn't reproduce this. The results for "America" are the same as for "america".

However, the spellchecker really is case-sensitive. For example, if "AmErIcAa" is in the spellchecker index, it will not be suggested for either "America" or "america", but it will be suggested for "AmErIcA".

The reason both America->Americaa and america->Americaa occur lies in the n-gram method used by the Lucene spellchecker: "America" and "america" produce the same grams, and the only difference is the "startN" gram. There might still be a difference in the results, because the method boosts the relevance of a suggestion whose first N letters match those of the word being checked.

I'm not sure whether this case sensitivity is a bug or not. In any case, both finding suggestions and building the spellchecker index are delegated to the Lucene SpellChecker, so this is a Lucene issue, not a Solr one.

P.S. I believe the case-sensitivity issue can be avoided by configuring the analyzers properly (e.g. for the spellchecker field).
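To make the n-gram point concrete, here is a minimal Python sketch. It is not Lucene's code: the function names and the fixed gram size of 3 are assumptions chosen for illustration, and only the "gramN"/"startN" idea is taken from the comment above. It shows that "America" and "america" differ only in the gram that contains the first letter, while a mixed-case entry such as "AmErIcAa" shares no grams with either.

```python
def form_grams(word, n):
    """Return every contiguous substring of length n, in order."""
    return [word[i:i + n] for i in range(len(word) - n + 1)]


def gram_fields(word, n):
    """Group the grams as 'gramN' (all grams) and 'startN' (first gram only).

    The field names mimic the ones mentioned in the comment; the real
    Lucene index layout is not reproduced here.
    """
    grams = form_grams(word, n)
    return {"gram%d" % n: grams, "start%d" % n: grams[:1]}


def differing_positions(a, b, n):
    """Positions at which the n-grams of two equal-length words differ."""
    pairs = zip(form_grams(a, n), form_grams(b, n))
    return [i for i, (x, y) in enumerate(pairs) if x != y]


def shared_grams(a, b, n):
    """Grams that two words have in common (exact, case-sensitive match)."""
    return set(form_grams(a, n)) & set(form_grams(b, n))


if __name__ == "__main__":
    print(gram_fields("America", 3))
    print(gram_fields("america", 3))
    # Only the gram containing the first letter differs.
    print(differing_positions("America", "america", 3))
    # A mixed-case index entry shares nothing with the plain spellings...
    print(shared_grams("America", "AmErIcAa", 3))
    # ...but shares most grams with a query in the same mixed case.
    print(sorted(shared_grams("AmErIcA", "AmErIcAa", 3)))
```

Lowercasing both the indexed words and the query before forming grams would remove these differences, which is the effect the P.S. expects from configuring the analyzers.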

> Spellchecker should not be case-sensitive and should be stopwords-aware
> -----------------------------------------------------------------------
>
>                 Key: SOLR-630
>                 URL: https://issues.apache.org/jira/browse/SOLR-630
>             Project: Solr
>          Issue Type: Bug
>          Components: spellchecker
>            Reporter: Otis Gospodnetic
>            Priority: Minor
>             Fix For: 1.5
>
>
> Here are 2 more bugs:
> 1)
> Search for:
> united states of America
> Suggests:
> united states oft America
> It looks like the SC doesn't check stopwords, and "of" is a stopword. Thus, it does not exist in the index,
> but "oft" does, so SC suggests "oft" and thinks "of" is misspelled. I think the SC component should check the list of
> stopwords, too, no?
> 2)
> Search for:
> united states of America
> Suggests:
> united states oft Americaa
> The of->oft is described above. But note how SC suggested America->Americaa, but it didn't do that for "america".
> This looks like case-sensitivity problem. Shouldn't the SC be case-insensitive?
> I can't produce a patch now (no src handy), so I'm hoping Grant or somebody else can do it based on this report.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.