I had tried RemoveDuplicatesTokenFilter earlier with no effect. When I looked at the source, it doesn't look at offsets at all, just position increments, so short of somebody finding a better way, I'm going to create a similar filter that compares offsets...
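
A rough sketch of the kind of offset-comparing filter described above, written in plain Java rather than against the Lucene TokenFilter API. It drops any token whose text and start/end offsets both match a token already seen in the stream. The `Token` class and `removeDuplicates` method are hypothetical names used only for illustration, and the offsets in `main` are made up to be consistent with the (4,6) mentioned in the thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class OffsetDedup {

    // Hypothetical stand-in for an analyzed token: term text plus offsets.
    static final class Token {
        final String text;
        final int start;
        final int end;

        Token(String text, int start, int end) {
            this.text = text;
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return text + "[" + start + "," + end + "]";
        }
    }

    // Keeps the first occurrence of each (text, start, end) triple and
    // drops later ones. Position increments are ignored (an assumption).
    static List<Token> removeDuplicates(List<Token> tokens) {
        Set<String> seen = new HashSet<>();
        List<Token> kept = new ArrayList<>();
        for (Token t : tokens) {
            // Offsets come first so the key stays unambiguous for any text.
            String key = t.start + ":" + t.end + ":" + t.text;
            if (seen.add(key)) {
                kept.add(t);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Edge n-grams of "exgirlfriend" and "ex" both yield "e" and "ex".
        List<Token> grams = Arrays.asList(
            new Token("e", 4, 5),
            new Token("ex", 4, 6),
            new Token("exg", 4, 7),
            new Token("e", 4, 5),
            new Token("ex", 4, 6));
        System.out.println(removeDuplicates(grams));
    }
}
```

Unlike RemoveDuplicatesTokenFilter, this sketch compares against every earlier token instead of only tokens at the same position. A real filter would more likely reset its "seen" set per position or per start offset to keep memory bounded.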

On Fri, Apr 2, 2010 at 2:07 PM, Erik Hatcher <erik.hatc...@gmail.com> wrote:
> Will adding the RemoveDuplicatesTokenFilter(Factory) do the trick here?
>
> Erik
>
> On Apr 2, 2010, at 4:13 PM, Joe Calderon wrote:
>
>> hello *, i have a field that is indexing the string "the
>> ex-girlfriend" as these tokens: [the, exgirlfriend, ex, girlfriend]
>> then they are passed to the edgengram filter, this allows me to match
>> different user spellings and allows for partial highlighting, however
>> a token like 'ex' would get generated twice which should be fine
>> except the highlighter seems to highlight that token twice even though
>> it has the same offsets (4,6)
>>
>> is there away to make the highlighter not highlight the same token
>> twice, or do i have to create a token filter that would dump tokens
>> with equal text and offsets ?
>>
>>
>> basically whats happening now is if i search
>>
>> 'the e', i get:
>> '<em>Seinfeld</em> The <em>E</em><em>E</em>x-Girlfriend'
>>
>> for 'the ex', i get:
>> '<em>Seinfeld</em> The <em>Ex</em><em>Ex</em>-Girlfriend'
>>
>> and so on
>>
>>
>> thx much
>>
>> --joe