On Nov 6, 2006, at 11:27 AM, hans meiser wrote:

Hi,

Did you take a look at ISOLatin1AccentFilter?

It does nearly what I need, but not perfectly.

    public final Token next() throws java.io.IOException {
        final Token t = input.next();
        if (t == null)
            return null;
        return new Token(removeAccents(t.termText()), t.startOffset(), t.endOffset(), t.type());
    }

Here, too, a new Token is created. My question is: why is the end offset not corrected for the newly created token? Sometimes the new token is longer than the original.
Complete code link:
http://developer.spikesource.com/spikewatch.logs/fedora-3-i386/2221/lucene/reports/clover/org/apache/lucene/analysis/ISOLatin1AccentFilter.html

For highlighting purposes, it's best to keep the offsets in the original text, not adjusted for token mutation.

        Erik
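
A minimal sketch of why the original offsets matter, assuming a highlighter that simply cuts the span [start, end) out of the original text. It does not use Lucene; fold() and highlight() are hypothetical helpers made up for illustration, and fold() handles only the "æ" ligature rather than everything removeAccents() covers.

```java
class OffsetDemo {
    // Hypothetical, simplified stand-in for accent folding: handles only the ae ligature.
    static String fold(String term) {
        return term.replace("\u00e6", "ae").replace("\u00c6", "AE");
    }

    // Wrap the span [start, end) of the ORIGINAL text in brackets, as a highlighter would.
    static String highlight(String text, int start, int end) {
        return text.substring(0, start)
                + "[" + text.substring(start, end) + "]"
                + text.substring(end);
    }

    public static void main(String[] args) {
        String text = "Encyclop\u00e6dia entry";   // original text, 12-character first word
        int start = 0;
        int end = 12;                               // offsets of the token in the original text

        String term = fold(text.substring(start, end)); // folded term is 13 characters long

        // Offsets kept from the original text: the bracket closes right after the word.
        System.out.println(highlight(text, start, end));

        // End offset "corrected" to the folded length: the span swallows the following space.
        System.out.println(highlight(text, start, start + term.length()));
    }
}
```

With the original offsets the highlighted span is exactly the word as it appears in the text. With the adjusted end offset the span runs one character too far, because the folded term is longer than the text it came from.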

