Hi,

Even though I have read a lot, none of my spellchecker configurations works really well, and I have reached a dead end. Maybe someone can help me with the following two questions:

- How can I get suggestions in their original capitalization, independent of the case used in the query?

- How do I configure 'did you mean' spellchecking, as discussed in https://issues.apache.org/jira/browse/SOLR-2585 (Context-Sensitive Spelling Suggestions & Collations)?


I'm using the following environment:
- Solr 4.0-alpha (downloaded on 25 June)
- Java 7
- schema.xml
     <fieldType name="textSuggest" class="solr.TextField" positionIncrementGap="100">
         <analyzer>
            <tokenizer class="solr.KeywordTokenizerFactory" />
            <filter class="solr.LowerCaseFilterFactory" />
         </analyzer>
      </fieldType>
      ...
      <field name="suggest" type="textSuggest" indexed="true" stored="true" required="false" multiValued="true" />
- solrconfig.xml (suggester)
   <requestHandler name="/hint" class="org.apache.solr.handler.component.SearchHandler">
      <lst name="defaults">
         <str name="echoParams">all</str>
         <str name="spellcheck">true</str>
         <str name="spellcheck.dictionary">suggester</str>
         <str name="spellcheck.extendedResults">true</str>
         <str name="spellcheck.onlyMorePopular">false</str>
         <str name="spellcheck.count">20</str>
      </lst>
      <arr name="components">
         <str>suggester</str>
      </arr>
   </requestHandler>
   <searchComponent name="suggester" class="solr.SpellCheckComponent">
      <lst name="spellchecker">
         <str name="name">suggester</str>
         <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
         <str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>
         <str name="field">suggest</str>
      </lst>
   </searchComponent>
- solrconfig.xml (spellcheck)
   <requestHandler name="standard" class="solr.StandardRequestHandler" default="true">
      <lst name="defaults">
         <str name="echoParams">all</str>
         <int name="rows">10</int>
         <str name="df">allfields</str>
         <str name="spellcheck.extendedResults">true</str>
         <str name="spellcheck.onlyMorePopular">false</str>
         <str name="spellcheck.count">20</str>
      </lst>
      <arr name="last-components">
         <str>spellcheck</str>
      </arr>
   </requestHandler>
   <searchComponent name="spellcheck" class="solr.SpellCheckComponent">
      <str name="queryAnalyzerFieldType">textSpell</str>
      <lst name="spellchecker">
         <str name="name">default</str>
         <str name="field">suggest</str>
         <str name="classname">solr.DirectSolrSpellChecker</str>
         <str name="distanceMeasure">internal</str>
         <float name="accuracy">0.1</float>
         <int name="maxEdits">2</int>
         <int name="minPrefix">1</int>
         <int name="maxInspections">5</int>
         <int name="minQueryLength">1</int>
         <float name="maxQueryFrequency">0.1</float>
         <float name="thresholdTokenFrequency">0.001</float>
      </lst>
   </searchComponent>

*Suggester problem*
With this configuration the suggester is case-insensitive, but all hints come back in lower case.
Example: .../hint?q=da&wt=xml&spellcheck=true&spellcheck.build=true
<?xml version="1.0" encoding="UTF-8"?>
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">173</int>
    <lst name="params">
      <str name="spellcheck">true</str>
      <str name="echoParams">all</str>
      <str name="spellcheck.extendedResults">true</str>
      <str name="spellcheck.dictionary">suggester</str>
      <str name="spellcheck.count">20</str>
      <str name="spellcheck.onlyMorePopular">false</str>
      <str name="spellcheck">true</str>
      <str name="q">da</str>
      <str name="wt">xml</str>
      <str name="spellcheck.build">true</str>
    </lst>
  </lst>
  <str name="command">build</str>
  <lst name="spellcheck">
    <lst name="suggestions">
      <lst name="da">
        <int name="numFound">20</int>
        <int name="startOffset">0</int>
        <int name="endOffset">2</int>
        <arr name="suggestion">
          <str>dat-marktspiegel spezial</str>
          <str>data structures with c++ using stl</str>
          <str>data warehouse</str>
          <str>datan, ingeborg</str>
          <str>datenbanken mit delphi</str>
          <str>datenverschlüsselung</str>
          <str>dauner, gabriele</str>
          <str>dautermann, margit</str>
          <str>david copperfield</str>
          <str>david, horst</str>
          <str>david, leo</str>
          <str>david, nicholas</str>
          <str>davis, charles t.</str>
          <str>davis, edward l</str>
          <str>davis, leslie dorfman</str>
          <str>davis, stanley m.</str>
          <str>davor kommt noch</str>
          <str>davydova, irina n.</str>
          <str>dawidowski, bernd</str>
          <str>dayan, daniel</str>
        </arr>
      </lst>
      <bool name="correctlySpelled">false</bool>
    </lst>
  </lst>
</response>
If I use plain solr.StrField as the field type instead, the suggestions keep their original capitalization, but I get no suggestions when the query starts with a lower-case character.
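
For reference, the StrField variant mentioned above would look roughly like the following sketch. The type name "stringSuggest" is hypothetical; only the class solr.StrField and the field definition come from the configuration above.

```xml
<!-- Sketch only: the type name "stringSuggest" is a hypothetical placeholder.
     solr.StrField indexes each value verbatim, without any analysis chain. -->
<fieldType name="stringSuggest" class="solr.StrField" />

<!-- Same field as in schema.xml above, switched to the non-analyzed type. -->
<field name="suggest" type="stringSuggest" indexed="true" stored="true" required="false" multiValued="true" />
```

Since StrField applies no analyzer, nothing lower-cases the indexed values, which would be consistent with a lower-case query prefix no longer matching the capitalized entries.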

*Spelling problem*
One of the indexed entries in the field 'suggest' is "David Copperfield", and I want this string returned as a suggestion for the query "David opperfield".
Example: .../select?q="david+opperfield"&rows=0&wt=xml&spellcheck=true
<?xml version="1.0" encoding="UTF-8"?>
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">15</int>
    <lst name="params">
      <str name="df">allfields</str>
      <str name="echoParams">all</str>
      <str name="spellcheck.extendedResults">true</str>
      <str name="spellcheck.count">20</str>
      <str name="spellcheck.onlyMorePopular">false</str>
      <str name="rows">0</str>
      <str name="spellcheck">true</str>
      <str name="q">"david opperfield"</str>
      <str name="wt">xml</str>
      <str name="rows">0</str>
    </lst>
  </lst>
  <result name="response" numFound="0" start="0"></result>
  <lst name="spellcheck">
    <lst name="suggestions">
      <bool name="correctlySpelled">false</bool>
    </lst>
  </lst>
</response>
The same query without the quotes, .../select?q=david+opperfield&rows=0&wt=xml&spellcheck=true, returns:
<bool name="correctlySpelled">true</bool>

=?8-)
Uwe

Btw, is there a DirectSolrSuggester corresponding to DirectSolrSpellChecker?

