I recently upgraded to Lucene 3.0 and am seeing some new behavior that I don't understand. Perhaps someone can explain why.
I have a custom analyzer. Part of the analyzer uses the AsciiFoldingFilter. If I run a word with an umlaut through that analyzer using the AnalyzerDemo code in LIA2, as expected, I get the same word except that the umlauted letter is now a simple ascii letter (no umlaut). That's what I would expect and want. If I create a Queryparser using the call "new QueryParser(LUCENE_30, "body", myAnalyzer) and then call the parse() method passing the same word, I can see that the query parser has not removed the umlaut. The string it has is "+body: Europabörsen". I know I had to make a number of changes to the analyzer and the tokenizer to upgrade to 3.x. Is there something very different from the 2.x version that I'm likely missing. Anyone have any thoughts?