I'm fixing position increment in EdgeNGramTokenFilter to act like synonyms,
with each ngram at the same position as the source token. Currently, the
position is incremented for each output token, which breaks phrase searching
with edge ngrams.
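As a rough illustration of the difference described above, here is a minimal Python sketch that models tokens as (term, position increment) pairs. It is not the Lucene code or the actual patch: the function names, the `stacked` flag, and the sample input are all hypothetical, and tokens shorter than `min_gram` are simply dropped for brevity.

```python
from typing import List, Tuple

Token = Tuple[str, int]  # (term, position increment), a simplified model of a token


def edge_ngrams(term: str, min_gram: int, max_gram: int) -> List[str]:
    """Return the leading-edge ngrams of term, shortest first."""
    return [term[:n] for n in range(min_gram, min(max_gram, len(term)) + 1)]


def edge_ngram_filter(tokens: List[Token], min_gram: int, max_gram: int,
                      stacked: bool) -> List[Token]:
    """Hypothetical model of an edge ngram filter.

    stacked=True:  the first gram keeps the source token's increment and the
                   rest get 0, so all grams share the source position
                   (the synonym-like behavior).
    stacked=False: every gram after the first gets increment 1, so each
                   output token advances the position.
    """
    out: List[Token] = []
    for term, pos_inc in tokens:
        for i, gram in enumerate(edge_ngrams(term, min_gram, max_gram)):
            if i == 0:
                inc = pos_inc
            else:
                inc = 0 if stacked else 1
            out.append((gram, inc))
    return out


def absolute_positions(tokens: List[Token]) -> List[Tuple[str, int]]:
    """Turn increments into absolute positions, starting from -1 so that a
    first increment of 1 lands on position 0."""
    pos = -1
    result = []
    for term, inc in tokens:
        pos += inc
        result.append((term, pos))
    return result


source = [("new", 1), ("york", 1)]
per_token = absolute_positions(edge_ngram_filter(source, 1, 3, stacked=False))
same_pos = absolute_positions(edge_ngram_filter(source, 1, 3, stacked=True))
```

In this model, incrementing per output token puts "new" at position 2 and "yor" at position 5, so the two are no longer adjacent and a phrase match on them would fail. With stacked positions, every gram of "new" sits at position 0 and every gram of "york" at position 1.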
I could not find a current Jira issue for this. Is
Walter, sounds very interesting. Maybe just use this issue:
https://issues.apache.org/jira/browse/LUCENE-3907 ?
On Fri, Mar 1, 2013 at 10:41 AM, Walter Underwood wun...@wunderwood.org wrote:
I'm fixing position increment in EdgeNGramTokenFilter to act like synonyms,
with each ngram at the same
That is a pretty broad bug, but this fix does fall somewhere under "improve ngrams".
Maybe a specific bug linked to that one?
Incrementing positions might be the right thing for pure ngrams.
wunder
On Mar 1, 2013, at 11:02 AM, Robert Muir wrote:
Walter, sounds very interesting. Maybe just use this
Sure, you could just make a new issue and link it to that one if you
like. Thanks for looking at this!
On Fri, Mar 1, 2013 at 2:15 PM, Walter Underwood wun...@wunderwood.org wrote:
That is a pretty broad bug, but this fix does fall somewhere under "improve ngrams".
Maybe a specific bug linked to that one?
Wunder, you may be thinking of LUCENE-1224 from a few years ago?
http://search-lucene.com/?q=ngram&fc_project=Lucene&fc_type=issue
Otis
--
Solr & ElasticSearch Support
http://sematext.com/
On Fri, Mar 1, 2013 at 1:41 PM, Walter Underwood wun...@wunderwood.org wrote:
I'm fixing position
I can see an argument for NGramTokenFilter incrementing position for each
ngram, because they really are an ordered scan across the text. Pure ngrams are
a different text representation than words. That could be an option on the
token filter.
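To make the "option on the token filter" idea concrete, here is a small Python sketch of a full (non-edge) ngram scan over a single term, with a flag that chooses between advancing the position for each gram and stacking all grams at one position. The function names, the `increment_per_gram` flag, and the emission order (by start offset, then by length) are assumptions for illustration only, not the behavior or API of NGramTokenFilter.

```python
from typing import List, Tuple


def ngrams(term: str, min_gram: int, max_gram: int) -> List[str]:
    """Scan left to right, emitting every gram of length min_gram..max_gram
    that fits at each start offset (ordering is an assumption)."""
    grams = []
    for start in range(len(term)):
        for n in range(min_gram, max_gram + 1):
            if start + n <= len(term):
                grams.append(term[start:start + n])
    return grams


def ngram_positions(term: str, min_gram: int, max_gram: int,
                    increment_per_gram: bool) -> List[Tuple[str, int]]:
    """Hypothetical option: either each gram advances the position by one
    (an ordered scan across the text) or all grams share position 0."""
    result = []
    for i, gram in enumerate(ngrams(term, min_gram, max_gram)):
        position = i if increment_per_gram else 0
        result.append((gram, position))
    return result


scan = ngram_positions("abcd", 2, 2, increment_per_gram=True)
stacked = ngram_positions("abcd", 2, 2, increment_per_gram=False)
```

With bigrams over "abcd", the scanning variant yields "ab", "bc", "cd" at positions 0, 1, 2, which preserves their order for phrase-style matching on grams, while the stacked variant places all three at position 0.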
LUCENE-1224 is mostly concerned with ngram