Use ISOLatin1AccentFilter, although it is not perfect... So I made ISOLatin2AccentFilter for me and changed this method. We use our own analysers, so you would use something like this
result = new org.apache.lucene.analysis.WhitespaceTokenizer(reader); result = new ISOLatin2AccentFilter(result); result = new org.apache.lucene.analysis.LowerCaseFilter(result); * To replace accented characters in a String by unaccented equivalents. */ public final static String removeAccents(String input) { final StringBuffer output = new StringBuffer(); for (int i = 0; i < input.length(); i++) { switch (input.charAt(i)) { case '\u00C0' : // À case '\u00C1' : // Á case '\u00C2' : //  case '\u00C3' : // à case '\u00C5' : // Å output.append("A"); break; case '\u00C4' : // Ä case '\u00C6' : // Æ output.append("AE"); break; case '\u00C7' : // Ç output.append("C"); break; case '\u00C8' : // È case '\u00C9' : // É case '\u00CA' : // Ê case '\u00CB' : // Ë output.append("E"); break; case '\u00CC' : // Ì case '\u00CD' : // Í case '\u00CE' : // Î case '\u00CF' : // Ï output.append("I"); break; case '\u00D0' : // Ð output.append("D"); break; case '\u00D1' : // Ñ output.append("N"); break; case '\u00D2' : // Ò case '\u00D3' : // Ó case '\u00D4' : // Ô case '\u00D5' : // Õ case '\u00D8' : // Ø output.append("O"); break; case '\u00D6' : // Ö case '\u0152' : // Œ output.append("OE"); break; case '\u00DE' : // Þ output.append("TH"); break; case '\u00D9' : // Ù case '\u00DA' : // Ú case '\u00DB' : // Û output.append("U"); break; case '\u00DC' : // Ü output.append("UE"); break; case '\u00DD' : // Ý case '\u0178' : // Ÿ output.append("Y"); break; case '\u00E0' : // à case '\u00E1' : // á case '\u00E2' : // â case '\u00E3' : // ã case '\u00E5' : // å output.append("a"); break; case '\u00E4' : // ä case '\u00E6' : // æ output.append("ae"); break; case '\u00E7' : // ç output.append("c"); break; case '\u00E8' : // è case '\u00E9' : // é case '\u00EA' : // ê case '\u00EB' : // ë output.append("e"); break; case '\u00EC' : // ì case '\u00ED' : // í case '\u00EE' : // î case '\u00EF' : // ï output.append("i"); break; case '\u00F0' : // ð output.append("d"); break; case '\u00F1' : // ñ output.append("n"); break; case '\u00F2' : // ò case '\u00F3' : // ó case '\u00F4' : // ô case '\u00F5' : // õ case '\u00F8' : // ø output.append("o"); break; case '\u00F6' : // ö case '\u0153' : // œ output.append("oe"); break; case '\u00DF' : // ß output.append("ss"); break; case '\u00FE' : // þ output.append("th"); break; case '\u00F9' : // ù case '\u00FA' : // ú case '\u00FB' : // û output.append("u"); break; case '\u00FC' : // ü output.append("ue"); break; case '\u00FD' : // ý case '\u00FF' : // ÿ output.append("y"); break; default : output.append(input.charAt(i)); break; } } return output.toString(); } } Regards Uwe Goetzke Leiter Produktentwicklung Healy Hudson GmbH Procurement & Retail Solutions -----Ursprüngliche Nachricht----- Von: Sascha Fahl [mailto:[EMAIL PROTECTED] Gesendet: Dienstag, 18. November 2008 13:07 An: java-user@lucene.apache.org Betreff: Transforming german umlaute like ö,ä,ü,ß into oe, ae, ue, ss Hi, what is the best to transform the german umlaute ö,ä,ü,ß into oe, ae, ue, ss during the process of analyzing? Thanks, Sascha Fahl Softwareentwicklung evenity GmbH Zu den Mühlen 19 D-35390 Gießen Mail: [EMAIL PROTECTED] --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] ----------------------------------------------------------------------- Healy Hudson GmbH - D-55252 Mainz Kastel Geschäftsführer Christian Konhäuser - Amtsgericht Wiesbaden HRB 12076 Diese Email ist vertraulich. Wenn Sie nicht der beabsichtigte Empfänger sind, dürfen Sie die Informationen nicht offen legen oder benutzen. Wenn Sie diese Email durch einen Fehler bekommen haben, teilen Sie uns dies bitte umgehend mit, indem Sie diese Email an den Absender zurückschicken. Bitte löschen Sie danach diese Email. This email is confidential. If you are not the intended recipient, you must not disclose or use this information contained in it. If you have received this email in error please tell us immediately by return email and delete the document. --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]