Re: EdgeNgramTokenFilter and positions

Walter Underwood Thu, 06 Sep 2012 18:13:38 -0700

Yes, that is exactly the bug. EdgeNgram should work like the synonym filter:
every n-gram should be emitted at the same position as its source token.
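
To make the expected behavior concrete, here is a minimal sketch in plain
Python. It is not the Lucene API: the function name is hypothetical, and the
gram sizes of 2 and 5 are assumptions chosen to match the "fleen" example
quoted below. Each n-gram keeps the position of the token it came from, which
is how stacked synonyms are positioned.

```python
def edge_ngrams_with_positions(tokens, min_gram=2, max_gram=5):
    """Return (position, gram) pairs for a list of source tokens.

    Illustrative only: all grams derived from one token share that
    token's position instead of advancing the position per gram.
    """
    result = []
    # Positions start at 1, matching the numbering in the analysis page.
    for position, token in enumerate(tokens, start=1):
        longest = min(max_gram, len(token))
        for size in range(min_gram, longest + 1):
            # Prefix of the token, emitted at the token's own position.
            result.append((position, token[:size]))
    return result


print(edge_ngrams_with_positions(["fleen"]))
# [(1, 'fl'), (1, 'fle'), (1, 'flee'), (1, 'fleen')]
```

With positions assigned this way, a phrase query sees the following token at
position 2 regardless of how many grams the first token produced.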


wunder

On Sep 6, 2012, at 5:51 PM, Otis Gospodnetic wrote:

> I don't know for sure, but I remember something around this being a problem, 
> yes ... maybe https://issues.apache.org/jira/browse/LUCENE-3907 ?
> 
> Otis 
> ----
> Performance Monitoring for Solr / ElasticSearch / HBase - 
> http://sematext.com/spm 
> 
> 
> 
> ----- Original Message -----
>> From: Walter Underwood <wun...@chegg.com>
>> To: "solr-user@lucene.apache.org" <solr-user@lucene.apache.org>
>> Cc: 
>> Sent: Wednesday, September 5, 2012 1:51 PM
>> Subject: EdgeNgramTokenFilter and positions
>> 
>> In the analysis page, the n-grams produced by EdgeNgramTokenFilter are at
>> sequential positions. This seems wrong, because an n-gram is associated with
>> a source token at a specific position. It also really messes up phrase
>> matches.
>> 
>> With the source text "fleen", these positions and tokens are generated:
>> 
>> 1,fl
>> 2,fle
>> 3,flee
>> 4,fleen
>> 
>> Is this a known bug? Fixed? I'm running 3.3.
>> 
>> wunder
>> --
>> Walter Underwood
>> Search Guy
>> wun...@chegg.com
>> 

--
Walter Underwood
wun...@wunderwood.org
