Re: whats the correct way to do normalisation?

hans meiser Mon, 06 Nov 2006 08:28:02 -0800

Hi,
   
  > Did you take a look at IsoLatin1AccentFilter ?
   
  It does nearly what I need, but not quite.
   
  public final Token next() throws java.io.IOException {
    final Token t = input.next();
    if (t == null)
      return null;
    return new Token(removeAccents(t.termText()), t.startOffset(), t.endOffset(),
                     t.type());
  }
   
  Here, too, a new Token is created. My question is: why is the end offset
  not corrected for the newly created token? Sometimes the new token is
  longer than the original one.
  Link to the complete code:

  http://developer.spikesource.com/spikewatch.logs/fedora-3-i386/2221/lucene/reports/clover/org/apache/lucene/analysis/ISOLatin1AccentFilter.html
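
  To make the question concrete, here is a small standalone sketch that
  does not use Lucene at all. It assumes that the start and end offsets
  are positions in the original input text rather than a measure of the
  term's length; under that assumption the offsets stay valid even when
  the folded term gets longer. The class OffsetDemo, the class Tok, and
  this cut-down removeAccents are hypothetical stand-ins for illustration,
  not the real filter code.

```java
public class OffsetDemo {

    // Hypothetical stand-in for a token: term text plus offsets into the source text.
    static final class Tok {
        final String text;
        final int start; // index of the first character in the source text
        final int end;   // index one past the last character in the source text

        Tok(String text, int start, int end) {
            this.text = text;
            this.start = start;
            this.end = end;
        }
    }

    // Simplified stand-in for removeAccents: only handles sharp s and u-umlaut.
    static String removeAccents(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '\u00DF': sb.append("ss"); break; // sharp s becomes two characters
                case '\u00FC': sb.append('u'); break;  // u-umlaut becomes one character
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    // Mirrors the shape of next(): new term text, offsets copied unchanged.
    static Tok fold(Tok t) {
        return new Tok(removeAccents(t.text), t.start, t.end);
    }

    public static void main(String[] args) {
        String source = "die Stra\u00DFe";
        Tok original = new Tok("Stra\u00DFe", 4, 10);
        Tok folded = fold(original);

        // The folded term is one character longer than the original term.
        System.out.println(folded.text + " [" + folded.start + "," + folded.end + ")");
        // The unchanged offsets still select the original word in the source text.
        System.out.println(source.substring(folded.start, folded.end));
    }
}
```

  In this sketch the folded term has seven characters while the offsets
  still span six, and substring() on the source returns the original
  word. If the end offset were moved to start plus the new term length,
  it would point past the end of the source text.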



 

                