> LowerCaseFilter will not handle that. So whereas it is "safe" for
> English hard-coded strings, it isn't safe for all fields you might
> index in general.

This filter is a "safe" fallback that works identically regardless of the
locale set on your computer (or on the server). I believe this is a good
thing: it avoids the nasty surprises of a locale-sensitive environment.
Contrary to intuition, locale-sensitive methods are more often a headache
and a source of problems than they are worth. If you live in Turkey, I think
you should use the dedicated TurkishLowerCaseFilter, which handles Turkish
letter conversion better.

> Hopefully Unicode will never add a code point which lowercases to one with
> less code units (or I guess changes one of the lower ones to lowercase to
> more than one...)

I agree this is an assumption that will hold... but if you care to provide a
patch, then a simple test case like the one I provided would (I believe) be
sufficient to ensure this situation is caught early during automated testing.

Dawid
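
P.S. A minimal sketch of the kind of check discussed above. It is not the
actual test case from earlier in the thread. The class name
LowerCaseLengthCheck and both helper methods are hypothetical, only plain JDK
calls are used, and it assumes that lowercasing one code point at a time with
Character.toLowerCase(int) is a fair stand-in for what the filter does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class, for illustration only; not part of Lucene.
class LowerCaseLengthCheck {

    /**
     * Scans every Unicode code point and collects those whose lowercase
     * mapping occupies a different number of UTF-16 code units than the
     * original. An in-place, per-code-point lowercasing loop relies on
     * this list being empty.
     */
    static List<Integer> findLengthChangingCodePoints() {
        List<Integer> offenders = new ArrayList<>();
        for (int cp = Character.MIN_CODE_POINT; cp <= Character.MAX_CODE_POINT; cp++) {
            int lower = Character.toLowerCase(cp);
            if (Character.charCount(cp) != Character.charCount(lower)) {
                offenders.add(cp);
            }
        }
        return offenders;
    }

    /** Locale-independent lowercasing, one code point at a time. */
    static String lowerCaseByCodePoint(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        s.codePoints().map(Character::toLowerCase).forEach(sb::appendCodePoint);
        return sb.toString();
    }

    public static void main(String[] args) {
        // Any entries here would mean the length assumption no longer holds
        // on the JDK (and Unicode version) running this check.
        System.out.println("Length-changing code points: "
                + findLengthChangingCodePoints());

        // Same result whatever the default locale is.
        System.out.println(lowerCaseByCodePoint("TITLE"));

        // Locale-sensitive: under Turkish rules 'I' maps to dotless U+0131.
        System.out.println("TITLE".toLowerCase(Locale.forLanguageTag("tr")));
    }
}
```

Running the scan as part of an automated test would flag a JDK or Unicode
upgrade that breaks the assumption, long before it corrupts a term buffer.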