I can see an argument for NGramTokenFilter incrementing position for each
ngram, because the ngrams really are an ordered scan across the text. Pure
ngrams are a different text representation from words. That behavior could be
an option on the token filter.
LUCENE-1224 is mostly concerned with ngram searching.
Wunder, you may be thinking of LUCENE-1224 from a few years ago?
http://search-lucene.com/?q=ngram&fc_project=Lucene&fc_type=issue
Otis
--
Solr & ElasticSearch Support
http://sematext.com/
On Fri, Mar 1, 2013 at 1:41 PM, Walter Underwood wrote:
> I'm fixing position increment in EdgeNgramTo
Sure, you could just make a new issue and link it to that one if you
like. Thanks for looking at this!
On Fri, Mar 1, 2013 at 2:15 PM, Walter Underwood wrote:
> That is a pretty broad bug, but this fix is somewhere in "improve ngrams".
> Maybe a specific bug linked to that one?
>
> Incrementing
That is a pretty broad bug, but this fix is somewhere in "improve ngrams".
Maybe a specific bug linked to that one?
Incrementing positions might be the right thing for pure ngrams.
wunder
On Mar 1, 2013, at 11:02 AM, Robert Muir wrote:
> Walter, sounds very interesting. Maybe just use this is
Walter, sounds very interesting. Maybe just use this issue:
https://issues.apache.org/jira/browse/LUCENE-3907 ?
On Fri, Mar 1, 2013 at 10:41 AM, Walter Underwood wrote:
> I'm fixing position increment in EdgeNgramTokenFilter to act like synonyms,
> with each ngram at the same position as the sour
I'm fixing position increment in EdgeNgramTokenFilter to act like synonyms,
with each ngram at the same position as the source token. Currently, the
position is incremented for each output token, which breaks phrase searching
with edge ngrams.
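To illustrate the difference, here is a minimal sketch in plain Python. It is not Lucene code: the function names (edge_ngrams, index_positions, phrase_matches) and the stacked flag are hypothetical, invented for this example, and the phrase check is a simplified stand-in for real phrase matching. It only shows why one position per ngram breaks a phrase query while synonym-style stacking does not.

```python
def edge_ngrams(token, min_gram=1, max_gram=3):
    """Return the leading substrings of token, min_gram to max_gram chars long."""
    upper = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, upper + 1)]


def index_positions(tokens, stacked, min_gram=1, max_gram=3):
    """Return (term, position) pairs for every edge ngram of every token.

    stacked=True:  all ngrams share the position of their source token,
                   the way synonyms do.
    stacked=False: the position advances for every ngram emitted.
    Simplification: a token that yields no ngrams does not advance the position.
    """
    postings = []
    position = -1
    for token in tokens:
        for i, gram in enumerate(edge_ngrams(token, min_gram, max_gram)):
            # Advance on the first ngram of a token, or on every ngram
            # when positions are not stacked.
            if not stacked or i == 0:
                position += 1
            postings.append((gram, position))
    return postings


def phrase_matches(postings, phrase_terms):
    """True if the phrase terms occur at consecutive positions."""
    positions = {}
    for term, pos in postings:
        positions.setdefault(term, set()).add(pos)
    starts = positions.get(phrase_terms[0], set())
    return any(
        all(start + offset in positions.get(term, set())
            for offset, term in enumerate(phrase_terms))
        for start in starts
    )


tokens = ["apple", "pie"]
stacked = index_positions(tokens, stacked=True)
unstacked = index_positions(tokens, stacked=False)

# With stacking, "app" sits at position 0 and "pi" at position 1,
# so the prefix phrase "app pi" is found.
print(phrase_matches(stacked, ["app", "pi"]))
# Without stacking, "app" lands at 2 and "pi" at 4, so the phrase is missed.
print(phrase_matches(unstacked, ["app", "pi"]))
```

In the unstacked case the ngrams of a single word spread over several positions, so the ngrams of the next word are no longer adjacent to the last ngram of the previous one.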
I could not find a current Jira issue for this. Is