Jon,

I too found some problems with the German analyzer recently. Here's what may help:

1. Try reading Joerg Caumanns' paper "A Fast and Simple Stemming Algorithm for German Words". It describes the stemming algorithm used by GermanAnalyzer.

2. German nouns are all capitalized, so I guess that's why the analyzer is case-sensitive. That only helps, though, if you are indexing well-written German and not emails or text messages!

3. The German stemmer converts umlauts into some funny form (the code is a bit tricky, and I didn't spend any time looking at it), so maybe that's why your umlaut searches find nothing. I think the main reason for this umlaut change is that many plurals are formed by adding an umlaut, e.g. Haus, Haeuser (the "ae" stands for a-umlaut, ä).
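To make point 3 concrete, here is a toy Python sketch of what I suspect is going on. It is not the real GermanStemmer code: the mapping table and the function names (fold_umlauts, build_index, prefix_search) are made up for illustration. It also assumes that wildcard terms reach the index without going through the analyzer, which is my understanding of the query parser but worth checking in Luke.

```python
# Toy illustration only: NOT the real GermanStemmer logic.
# The mapping and all function names are hypothetical.
UMLAUT_MAP = {"ä": "a", "ö": "o", "ü": "u", "ß": "ss"}


def fold_umlauts(term: str) -> str:
    """Replace each umlaut (and ß) with a plain-ASCII stand-in."""
    return "".join(UMLAUT_MAP.get(ch, ch) for ch in term)


def build_index(words):
    """Hypothetical index: the set of lower-cased, folded terms."""
    return {fold_umlauts(w.lower()) for w in words}


def prefix_search(index, prefix, analyze_prefix):
    """Return the indexed terms that start with the given prefix.

    With analyze_prefix=False the prefix is compared as typed, which
    is the assumed behaviour for wildcard queries.
    """
    if analyze_prefix:
        prefix = fold_umlauts(prefix.lower())
    return sorted(t for t in index if t.startswith(prefix))


index = build_index(["Änderung", "Haus", "Häuser", "Öl"])

# Raw prefix "ä" cannot match, because no indexed term contains "ä".
print(prefix_search(index, "ä", analyze_prefix=False))

# Folding the prefix the same way as the indexed terms finds the entry.
print(prefix_search(index, "ä", analyze_prefix=True))
```

If that is really the cause, one workaround would be to fold the umlauts in the wildcard term yourself before building the query.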
Finally, to really understand what's happening, get your hands on Luke. I just got it last week, and it's brilliant. It shows you everything about your indexes. You can also feed text to an Analyzer and see what it makes of it. This will show you the real reason why your umlaut search is failing.

Ciao,
Jonathan O'Connor
XCOM Dublin

"Jon Humble" <[EMAIL PROTECTED]>
01/03/2005 09:35
Please respond to "Lucene Users List" <lucene-user@jakarta.apache.org>
To: <lucene-user@jakarta.apache.org>
cc:
Subject: Questions about GermanAnalyzer/Stemmer

Hello,

We're using the GermanAnalyzer/Stemmer to index and search our (German) website. I have a few questions:

(1) Why is the GermanAnalyzer case-sensitive? None of the other language analyzers seem to be. What does this feature add?

(2) With the GermanAnalyzer, wildcard searches containing extended German characters do not seem to work. So a* is fine, but anä* or ö* always find zero results.

(3) In a similar vein to (2), wildcard searches with escaped special characters fail to find results. So a search for co\-operative works, but a search for co\-op* fails.

I will be grateful for any light that can be shed on these problems.

With thanks,
Jon.

Jon Humble BSc (Hons)
Software Engineer
eMail: [EMAIL PROTECTED]
TecSphere Ltd
Centre for Advanced Industry
Coble Dene, Royal Quays
Newcastle upon Tyne NE29 6DE
United Kingdom
Direct Dial: +44 (191) 270 31 06
Fax: +44 (191) 270 31 09
http://www.tecsphere.com