Re: File-based Spelling

Mark Fenbers Mon, 19 Oct 2015 03:42:03 -0700

OK. I removed it, started Solr, adn refreshed the query, but my resultsare the same, indicating that queryAnalyzerFieldType has nothing to dowith my problem.


New ideas??
Mark


On 10/19/2015 4:37 AM, Duck Geraint (ext) GBJH wrote:

"Yet, it claimed it found my misspelled word to be "fenber" without the "s""
I wonder if this is because you seem to applying a stemmer to your dictionary 
words.

Try removing the "<str name="queryAnalyzerFieldType">text_en</str>" line from 
your spellcheck search component definition.

Geraint


Geraint Duck
Data Scientist
Toxicology and Health Sciences
Syngenta UK
Email: geraint.d...@syngenta.com


-----Original Message-----
From: Mark Fenbers [mailto:mark.fenb...@noaa.gov]
Sent: 16 October 2015 19:43
To: solr-user@lucene.apache.org
Subject: Re: File-based Spelling

On 10/13/2015 9:30 AM, Dyer, James wrote:

Mark,

The older spellcheck implementations create an n-gram sidecar index, which is 
why you're seeing your name split into 2-grams like this.  See the IR Book by 
Manning et al, section 3.3.4 for more information.  Based on the results you're 
getting, I think it is loading your file correctly.  You should now try a query 
against this spelling index, using words *not* in the file you loaded that are 
within 1 or 2 edits from something that is in the dictionary.  If it doesn't 
yield suggestions, then post the relevant sections of the solrconfig.xml, 
schema.xml and also the query string you are trying.

James Dyer
Ingram Content Group

James, I've already done this.   My query string was "fenbers". This is
my last name which does *not* occur in the linux.words file.  It is only
1 edit distance from "fenders" which *is* in the linux.words file.  Yet, it claimed it found my 
misspelled word to be "fenber" without the "s"
and it gave me these 8 suggestions:
f en be r
f e nb er
f en b er
f e n be r
f en b e r
f e nb e r
f e n b er
f e n b e r

So I'm attaching the the entire solrconfig.xml and schema.xml that is in 
effect.  These are in a single file with all the block comments removed.

I'm also puzzled that you say "older implementations create a sidecar index"... 
because I am using v5.3.0, which was the latest version as of my download a month or two 
ago.  So, with my implementation being recent, why is an n-gram sidecar index still 
(seemingly) being produced?

thanks for the help!
Mark



________________________________


Syngenta Limited, Registered in England No 2710846;Registered Office : Syngenta 
Limited, European Regional Centre, Priestley Road, Surrey Research Park, 
Guildford, Surrey, GU2 7YH, United Kingdom
________________________________
  This message may contain confidential information. If you are not the 
designated recipient, please notify the sender immediately, and delete the 
original and any copies. Any use of the message by you is prohibited.

Re: File-based Spelling

Reply via email to