RE: Need help with spellcheck city name

Dyer, James Tue, 28 Sep 2010 07:16:12 -0700

You might want to look at SOLR-2010.  This patch works with the "collation" 
feature, having it test the collations it returns to ensure they'll return 
hits.  So if a user types "san jos" it will know that the combination "san 
jose" is in the index and "san ojos" is not.
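
To illustrate the idea, here is a rough Python sketch. It is not the code in the patch: the index contents, the suggestion lists, and the function name are all made up for the example. It tries the per-word suggestions in combination and keeps the first combination that would actually return hits.

```python
from itertools import product

# Hypothetical stand-in for the city index: phrases that would return hits.
INDEXED_CITIES = {"san jose", "santa clara", "swan"}


def first_valid_collation(suggestions, has_hits):
    """Return the first combination of per-word suggestions that has hits.

    suggestions: one list of candidates per query word, best candidate first.
    has_hits:    callable that reports whether a phrase matches the index.
    """
    for combo in product(*suggestions):
        candidate = " ".join(combo)
        if has_hits(candidate):
            return candidate
    return None  # no combination matched the index


# "san" is kept as is; "jos" has "ojos" ranked ahead of "jose".
suggestions = [["san"], ["ojos", "jose"]]
print(first_valid_collation(suggestions, lambda q: q in INDEXED_CITIES))
```

With these made-up inputs "san ojos" is rejected because it has no hits, so "san jose" is returned even though "ojos" ranks first for the single word.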


James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311


-----Original Message-----
From: Savannah Beckett [mailto:savannah_becket...@yahoo.com] 
Sent: Monday, September 27, 2010 7:45 PM
To: solr-user@lucene.apache.org
Cc: erickerick...@gmail.com
Subject: Re: Need help with spellcheck city name

No, I checked: there is a city called Swan in Iowa, so that suggestion is coming 
from the city index, and so is Clark.  But why does it favor Swan over San?  
Spellcheck gets weird after I treat the city name as one token.  If I do it the 
old way, it lets San through and corrects Jos to Ojos instead of Jose, because 
Ojos is ranked #1 and Jose is in the middle.  Any more suggestions?  Ranking by 
frequency first and then by score doesn't work either.  


 

________________________________
From: Erick Erickson <erickerick...@gmail.com>
To: solr-user@lucene.apache.org
Sent: Mon, September 27, 2010 5:24:25 PM
Subject: Re: Need help with spellcheck city name

Hmmm, did you rebuild your spelling index after the config changes?

And it really looks like somehow you're getting results from a field other
than city. Are you also sure that your cityname field is of type
autocomplete1?
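
On the rebuild question: the SpellCheckComponent wiki page describes a spellcheck.build=true request parameter that rebuilds the spelling index. Below is a small Python sketch that only assembles such a request URL. The host, core path, and handler name are assumptions for illustration and need to match your own setup.

```python
from urllib.parse import urlencode

# Assumed base URL and handler; adjust to your own Solr install.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def spellcheck_url(query, rebuild=False, base_url=SOLR_SELECT_URL):
    """Build a spellcheck request URL, optionally asking Solr to rebuild."""
    params = {"q": query, "spellcheck": "true"}
    if rebuild:
        # Ask the spellcheck component to rebuild its index on this request.
        params["spellcheck.build"] = "true"
    return base_url + "?" + urlencode(params)


print(spellcheck_url("san jos", rebuild=True))
```

Send the rebuild request once after changing the field type or reindexing, then query again without it.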

Shooting in the dark here, but these results are so weird that I suspect
it's something fundamental....

Best
Erick

On Mon, Sep 27, 2010 at 8:05 PM, Savannah Beckett <
savannah_becket...@yahoo.com> wrote:

> No, it doesn't work; I got weird results. I set my city name field to be
> parsed as a single token, as follows:
>
>        <fieldType name="autocomplete1" class="solr.TextField"
> positionIncrementGap="100">
>          <analyzer type="index">
>            <tokenizer class="solr.KeywordTokenizerFactory"/>
>            <filter class="solr.LowerCaseFilterFactory"/>
>          </analyzer>
>          <analyzer type="query">
>            <tokenizer class="solr.KeywordTokenizerFactory"/>
>            <filter class="solr.LowerCaseFilterFactory"/>
>          </analyzer>
>        </fieldType>
>
> I got the following result for spellcheck:
>
> <lst name="spellcheck">
>   <lst name="suggestions">
>     <lst name="san">
>       <int name="numFound">1</int>
>       <int name="startOffset">0</int>
>       <int name="endOffset">3</int>
>       <arr name="suggestion">
>         <str>swan</str>
>       </arr>
>     </lst>
>     <lst name="clar">
>       <int name="numFound">1</int>
>       <int name="startOffset">4</int>
>       <int name="endOffset">8</int>
>       <arr name="suggestion">
>         <str>clark</str>
>       </arr>
>     </lst>
>   </lst>
> </lst>
>
>
>
>
>
> ________________________________
> From: Tom Hill <solr-l...@worldware.com>
> To: solr-user@lucene.apache.org
> Sent: Mon, September 27, 2010 3:52:48 PM
> Subject: Re: Need help with spellcheck city name
>
> Maybe process the city name as a single token?
>
> On Mon, Sep 27, 2010 at 3:25 PM, Savannah Beckett
> <savannah_becket...@yahoo.com> wrote:
> > Hi,
> > I have the city name as a text field, and I want to run spellcheck on it.
> > I use the settings from http://wiki.apache.org/solr/SpellCheckComponent
> >
> > If I set up city name as a text field and spellcheck "San Jos" (meaning
> > San Jose), I get "ojos" as the suggestion for Jos.  I checked the extended
> > results and found that Jose is in the middle of all 10 suggestions in
> > terms of score and frequency.  I then set city name as a string field and
> > ran spellcheck again; I got Van for San and Ross for Jos, which is weird
> > because San is correct.
> >
> >
> > How do you set up the spellchecker to spellcheck city names?  City names
> > can have multiple words.
> > Thanks.
> >
> >
> >
>
>
>
>
>
