Hi James, hi list,

I can confirm that the index contains data within one Levenshtein edit of "ichtscheiben":

{
  "responseHeader": {
    "status": 0,
    "QTime": 0,
    "params": {
      "fl": "name,spell",
      "indent": "true",
      "q": "name:Sichtscheiben",
      "_": "1410423419758",
      "wt": "json",
      "rows": "50"
    }
  },
  "response": {
    "numFound": 6,
    "start": 0,
    "docs": [
      { "name": "Sichtscheiben", "spell": "Sichtscheiben" },
      { "name": "Sichtscheiben", "spell": "Sichtscheiben" },
      { "name": "Sichtscheiben", "spell": "Sichtscheiben" },
      { "name": "Sichtscheiben", "spell": "Sichtscheiben" },
      { "name": "Sichtscheiben", "spell": "Sichtscheiben" },
      { "name": "Sichtscheiben", "spell": "Sichtscheiben" }
    ]
  }
}

Multiple records exist that should match.

Thanks for the note on alternativeTermCount.

I've tried another term: "Transport". I get suggestions when I use "Transpor" and "Transpo", even "Transpotr", but "ransport" doesn't yield any suggestions. Maybe it depends on the beginning of the word and doesn't really have anything to do with stemming.

On 10.09.2014 15:19, Dyer, James wrote:

> Thomas,
>
> It looks like you've set things up correctly in that while the user is searching against a stemmed field ("name"), spellcheck is checking against a lightly-analyzed copy of it ("spell"). This is the right way to do it as spellcheck against stemmed forms is usually undesirable.
>
> But as you've experienced, you will sometimes get results (due to stemming) and also suggestions (because the spellchecker is looking at unstemmed forms). If you do not want spellcheck to return anything when you get results, you can set "spellcheck.maxResultsForSuggest=0".
>
> Now keeping in mind we're comparing unstemmed forms, can you verify you indeed have something in your index that is within 2 edits of "ichtscheiben"? My guess is you probably don't, which would be why you do not get spelling results in that case.
>
> Also, even if you do have something within 2 edits, if "ichtscheiben" occurs in your index, by default it won't try to correct it at all (even if the query returns nothing, maybe because of filters or other required terms on the query). In this case you need to set "spellcheck.alternativeTermCount" to a non-zero value (try maybe 5).
>
> See http://wiki.apache.org/solr/SpellCheckComponent#spellcheck.alternativeTermCount [1] and following sections.
>
> James Dyer
> Ingram Content Group
> (615) 213-4311
>
> -----Original Message-----
> From: Thomas Michael Engelke [mailto:thomas.enge...@posteo.de]
> Sent: Wednesday, September 10, 2014 5:00 AM
> To: Solr user
> Subject: Solr Spellcheck suggestions only return from /select handler when returning search results
>
> Hi,
>
> I'm experimenting with the Spellcheck component and have therefore used the example configuration for spell checking to try things out. My solrconfig.xml looks like this:
>
> <searchComponent name="spellcheck" class="solr.SpellCheckComponent">
>   <str name="queryAnalyzerFieldType">spell</str>
>   <!-- Multiple "Spell Checkers" can be declared and used by this component -->
>   <!-- a spellchecker built from a field of the main index -->
>   <lst name="spellchecker">
>     <str name="name">default</str>
>     <str name="field">spell</str>
>     <str name="classname">solr.DirectSolrSpellChecker</str>
>     <!-- the spellcheck distance measure used, the default is the internal levenshtein -->
>     <str name="distanceMeasure">internal</str>
>     <!-- uncomment this to require suggestions to occur in 1% of the documents
>       <float name="thresholdTokenFrequency">.01</float>
>     -->
>   </lst>
>   <!-- a spellchecker that can break or combine words.
>        See "/spell" handler below for usage -->
>   <lst name="spellchecker">
>     <str name="name">wordbreak</str>
>     <str name="classname">solr.WordBreakSolrSpellChecker</str>
>     <str name="field">spell</str>
>     <str name="combineWords">true</str>
>     <str name="breakWords">true</str>
>     <int name="maxChanges">10</int>
>   </lst>
> </searchComponent>
>
> And I've added the spellcheck component to my /select request handler:
>
> <requestHandler name="/select" class="solr.SearchHandler">
>   ...
>   <arr name="last-components">
>     <str>spellcheck</str>
>   </arr>
> </requestHandler>
>
> I have built up the spellchecker source in the schema.xml from the name field:
>
> <field name="spell" type="spell" indexed="true" stored="true" required="false" multiValued="false"/>
> <copyField source="name" dest="spell" maxChars="30000" />
> ...
> <fieldType name="spell" class="solr.TextField" positionIncrementGap="100">
>   <analyzer type="index">
>     <tokenizer class="solr.StandardTokenizerFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.StandardTokenizerFactory"/>
>   </analyzer>
> </fieldType>
>
> As I'm querying the /select request handler, I should get spellcheck suggestions with my results. However, I rarely get a suggestion. Examples:
>
> query: Sichtscheibe, spellcheck suggestion: Sichtscheiben (works)
> query: Sichtscheib, spellcheck suggestion: Sichtscheiben (works)
> query: ichtscheiben, no spellcheck suggestions
>
> As far as I can tell, I only get suggestions when I get real search results. I get results for the first 2 examples, because the German StemFilterFactory translates "Sichtscheibe" and "Sichtscheiben" into "Sichtscheib", so there are matches found. However, the third query should result in a suggestion, as the Levenshtein distance is less than in the second example.
>
> Suggestions, improvements, corrections?

Links:
------
[1] http://wiki.apache.org/solr/SpellCheckComponent#spellcheck.alternativeTermCount