[
https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12597930#action_12597930
]
bosmid edited comment on SOLR-572 at 5/19/08 5:05 AM:
----------------------------------------------------------
Character encodings for file-based dictionaries are now supported via the
characterEncoding property. The configuration for such a dictionary would look like this:
{code:xml}
<lst name="dictionary">
<str name="name">external</str>
<str name="type">file</str>
<str name="sourceLocation">spellings.txt</str>
<str name="characterEncoding">UTF-8</str>
<str name="spellcheckIndexDir">c:\spellchecker</str>
</lst>
{code}
The new code needs the latest lucene-spellchecker-2.4*.jar from Lucene trunk.
Since the SolrResourceLoader.getLines method doesn't support configurable encodings
(it treats everything as UTF-8), I wasn't sure how to add that support. I could
have added an overloaded method to SolrResourceLoader, but there is a TODO
comment, so I decided to create a getLines() method inside the SpellCheckComponent
class instead. What do you think of this?
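To make the idea concrete, here is a minimal sketch of what an encoding-aware getLines() could look like. It is not the code from the patch: the class name EncodingAwareLines, the InputStream-based signature, and the choice to skip blank lines and lines starting with "#" are all assumptions made for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Solr or of the attached patch.
final class EncodingAwareLines {

    /**
     * Reads the dictionary words from the stream, decoding bytes with the
     * charset named by characterEncoding (for example "UTF-8").
     * Charset.forName throws an unchecked exception if the name is unknown.
     */
    static List<String> getLines(InputStream in, String characterEncoding)
            throws IOException {
        Charset charset = Charset.forName(characterEncoding);
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                String word = line.trim();
                // Assumption: blank lines and "#" comment lines carry no data.
                if (word.isEmpty() || word.startsWith("#")) {
                    continue;
                }
                lines.add(word);
            }
        }
        return lines;
    }
}
```

The only real difference from a UTF-8-only reader is that the charset is passed to InputStreamReader explicitly instead of being fixed, so the value of characterEncoding from the configuration can be handed straight through.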
was (Author: bosmid):
Character encodings for file-based dictionaries are now supported via the
characterEncoding property. The configuration for such a dictionary would look like this:
{code:xml}
<lst name="dictionary">
<str name="name">external</str>
<str name="type">file</str>
<str name="location">spellings.txt</str>
<str name="characterEncoding">UTF-8</str>
<str name="spellcheckIndexDir">c:\spellchecker</str>
</lst>
{code}
The new code needs the latest lucene-spellchecker-2.4*.jar from Lucene trunk.
Since the SolrResourceLoader.getLines method doesn't support configurable encodings
(it treats everything as UTF-8), I wasn't sure how to add that support. I could
have added an overloaded method to SolrResourceLoader, but there is a TODO
comment, so I decided to create a getLines() method inside the SpellCheckComponent
class instead. What do you think of this?
> Spell Checker as a Search Component
> -----------------------------------
>
> Key: SOLR-572
> URL: https://issues.apache.org/jira/browse/SOLR-572
> Project: Solr
> Issue Type: New Feature
> Components: spellchecker
> Affects Versions: 1.3
> Reporter: Shalin Shekhar Mangar
> Fix For: 1.3
>
> Attachments: SOLR-572.patch, SOLR-572.patch, SOLR-572.patch,
> SOLR-572.patch
>
>
> Expose the Lucene contrib SpellChecker as a Search Component. Provide the
> following features:
> * Allow creating a spell index on a given field and make it possible to have
> multiple spell indices -- one for each field
> * Give suggestions on a per-field basis
> * Given a multi-word query, give only one consistent suggestion
> * Process the query with the same analyzer specified for the source field and
> process each token separately
> * Allow the user to specify minimum length for a token (optional)
> Consistency criteria for a multi-word query can consist of the following:
> * Preserve the correct words in the original query as they are
> * Never give duplicate words in a suggestion
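The two consistency criteria quoted above could be combined roughly as in the sketch below. This is only an illustration under stated assumptions, not the component's implementation: the class ConsistentSuggestion, the dictionary set, and the map of ranked candidate corrections per token are all hypothetical stand-ins for whatever the spell index would return.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration, not part of the SOLR-572 patch.
final class ConsistentSuggestion {

    /**
     * Builds one suggestion for a multi-word query.
     *
     * @param tokens     query tokens in their original order
     * @param dictionary words considered correctly spelled
     * @param candidates ranked corrections for each misspelled token
     */
    static String suggest(List<String> tokens,
                          Set<String> dictionary,
                          Map<String, List<String>> candidates) {
        // Reserve the correct words first so no correction duplicates them.
        Set<String> used = new HashSet<>();
        for (String token : tokens) {
            if (dictionary.contains(token)) {
                used.add(token);
            }
        }
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            if (dictionary.contains(token)) {
                out.add(token); // criterion 1: keep correct words as they are
                continue;
            }
            String chosen = token; // fall back to the original token
            List<String> ranked =
                candidates.getOrDefault(token, Collections.<String>emptyList());
            for (String candidate : ranked) {
                // criterion 2: take the best candidate not already in use
                if (used.add(candidate)) {
                    chosen = candidate;
                    break;
                }
            }
            out.add(chosen);
        }
        return String.join(" ", out);
    }
}
```

A greedy left-to-right pass like this is the simplest way to satisfy both criteria; it does not try to find the globally best combination of candidates.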