Hi,
even though I have read a lot about it, none of my spellchecker
configurations works really well, and I have reached a dead end. Maybe
someone can help me with the following two questions:
- How can I get suggestions in their original capitalization, regardless
of the case used in the query?
- How do I configure a 'did you mean' spellchecker, as discussed in
https://issues.apache.org/jira/browse/SOLR-2585 (Context-Sensitive
Spelling Suggestions & Collations)?
I'm using the following environment:
- Solr 4.0-alpha (downloaded 25. June)
- Java 7
- schema.xml
<fieldType name="textSuggest" class="solr.TextField"
positionIncrementGap="100">
<analyzer>
<tokenizer class="solr.KeywordTokenizerFactory" />
<filter class="solr.LowerCaseFilterFactory" />
</analyzer>
</fieldType>
...
<field name="suggest" type="textSuggest" indexed="true" stored="true"
required="false" multiValued="true" />
- solrconfig.xml (suggester)
<requestHandler name="/hint"
class="org.apache.solr.handler.component.SearchHandler">
<lst name="defaults">
<str name="echoParams">all</str>
<str name="spellcheck">true</str>
<str name="spellcheck.dictionary">suggester</str>
<str name="spellcheck.extendedResults">true</str>
<str name="spellcheck.onlyMorePopular">false</str>
<str name="spellcheck.count">20</str>
</lst>
<arr name="components">
<str>suggester</str>
</arr>
</requestHandler>
<searchComponent name="suggester" class="solr.SpellCheckComponent">
<lst name="spellchecker">
<str name="name">suggester</str>
<str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
<str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>
<str name="field">suggest</str>
</lst>
</searchComponent>
- solrconfig.xml (spellcheck)
<requestHandler name="standard" class="solr.StandardRequestHandler"
default="true">
<lst name="defaults">
<str name="echoParams">all</str>
<int name="rows">10</int>
<str name="df">allfields</str>
<str name="spellcheck.extendedResults">true</str>
<str name="spellcheck.onlyMorePopular">false</str>
<str name="spellcheck.count">20</str>
</lst>
<arr name="last-components">
<str>spellcheck</str>
</arr>
</requestHandler>
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
<str name="queryAnalyzerFieldType">textSpell</str>
<lst name="spellchecker">
<str name="name">default</str>
<str name="field">suggest</str>
<str name="classname">solr.DirectSolrSpellChecker</str>
<str name="distanceMeasure">internal</str>
<float name="accuracy">0.1</float>
<int name="maxEdits">2</int>
<int name="minPrefix">1</int>
<int name="maxInspections">5</int>
<int name="minQueryLength">1</int>
<float name="maxQueryFrequency">0.1</float>
<float name="thresholdTokenFrequency">0.001</float>
</lst>
</searchComponent>
*Suggester problem*
With this configuration the suggester matches regardless of case, but the
hints are all returned in lower case.
Example: .../hint?q=da&wt=xml&spellcheck=true&spellcheck.build=true
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">173</int>
<lst name="params">
<str name="spellcheck">true</str>
<str name="echoParams">all</str>
<str name="spellcheck.extendedResults">true</str>
<str name="spellcheck.dictionary">suggester</str>
<str name="spellcheck.count">20</str>
<str name="spellcheck.onlyMorePopular">false</str>
<str name="spellcheck">true</str>
<str name="q">da</str>
<str name="wt">xml</str>
<str name="spellcheck.build">true</str>
</lst>
</lst>
<str name="command">build</str>
<lst name="spellcheck">
<lst name="suggestions">
<lst name="da">
<int name="numFound">20</int>
<int name="startOffset">0</int>
<int name="endOffset">2</int>
<arr name="suggestion">
<str>dat-marktspiegel spezial</str>
<str>data structures with c++ using stl</str>
<str>data warehouse</str>
<str>datan, ingeborg</str>
<str>datenbanken mit delphi</str>
<str>datenverschlüsselung</str>
<str>dauner, gabriele</str>
<str>dautermann, margit</str>
<str>david copperfield</str>
<str>david, horst</str>
<str>david, leo</str>
<str>david, nicholas</str>
<str>davis, charles t.</str>
<str>davis, edward l</str>
<str>davis, leslie dorfman</str>
<str>davis, stanley m.</str>
<str>davor kommt noch</str>
<str>davydova, irina n.</str>
<str>dawidowski, bernd</str>
<str>dayan, daniel</str>
</arr>
</lst>
<bool name="correctlySpelled">false</bool>
</lst>
</lst>
</response>
If I use plain solr.StrField as the field type instead, the suggestions
keep their original capitalization, but I get no suggestions if the query
starts with a lower-case character.
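To make clear what I am after, here is a small sketch in plain Python (not
Solr code; build_index and suggest are names I made up, and the
capitalization of the sample entries other than "David Copperfield" is
assumed). The idea is to match on a lower-cased key while returning the
original spelling:

```python
from bisect import bisect_left


def build_index(entries):
    """Sort entries by lower-cased key, keeping the original spelling alongside."""
    return sorted((entry.lower(), entry) for entry in entries)


def suggest(index, prefix, count=20):
    """Return up to `count` original-case entries whose lower-cased key starts with `prefix`."""
    key = prefix.lower()
    # First position whose lower-cased key is >= the lower-cased prefix.
    start = bisect_left(index, (key,))
    hits = []
    for lowered, original in index[start:]:
        if not lowered.startswith(key):
            break  # keys are sorted, so no later entry can match
        hits.append(original)
        if len(hits) == count:
            break
    return hits


# Sample entries; capitalization is assumed except for "David Copperfield".
index = build_index(["David Copperfield", "Data Warehouse", "Dayan, Daniel"])
```

With this, suggest(index, "da") and suggest(index, "DA") return the same
original-case entries. I don't know whether the Suggester can be configured
to behave like this.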
*Spelling problem*
One of the indexed entries in the field 'suggest' is "David Copperfield",
and I want this string to be offered as a suggestion for the query "David
opperfield".
Example: .../select?q="david+opperfield"&rows=0&wt=xml&spellcheck=true
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">15</int>
<lst name="params">
<str name="df">allfields</str>
<str name="echoParams">all</str>
<str name="spellcheck.extendedResults">true</str>
<str name="spellcheck.count">20</str>
<str name="spellcheck.onlyMorePopular">false</str>
<str name="rows">0</str>
<str name="spellcheck">true</str>
<str name="q">"david opperfield"</str>
<str name="wt">xml</str>
<str name="rows">0</str>
</lst>
</lst>
<result name="response" numFound="0" start="0"></result>
<lst name="spellcheck">
<lst name="suggestions">
<bool name="correctlySpelled">false</bool>
</lst>
</lst>
</response>
The same query without the quotes:
.../select?q=david+opperfield&rows=0&wt=xml&spellcheck=true
--> <bool name="correctlySpelled">true</bool>
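For reference, a quick sketch of the edit distances involved (plain
Levenshtein in Python; I am assuming this is close to what maxEdits counts,
and edit_distance is my own helper, not a Solr function). Whether the query
is split into words before the lookup depends on the textSpell analyzer,
which is not shown above, so the per-word numbers are only one possible
reading:

```python
def edit_distance(a, b):
    """Plain Levenshtein distance: insert, delete and substitute each cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


indexed = "david copperfield"  # lower-cased form of the indexed entry
query = "david opperfield"

# Distance when the query is compared as one string.
whole = edit_distance(query, indexed)

# Distances if each word were compared on its own (assumption, see above).
per_word = {word: edit_distance(word, indexed) for word in query.split()}
```

Compared as a whole string, the query is one edit away from the indexed
entry, which is within maxEdits=2, while each single word on its own is far
outside that limit.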
=?8-)
Uwe
Btw. Is there a DirectSolrSuggester corresponding to DirectSolrSpellChecker?