If you don't need to support case-sensitive search in your application, then you may be able to get away with adding string fields to your documents twice: a lowercase version for indexing only, and the verbatim value for storage. For example (this is Lucene 4 code, but the same idea applies):

    // indexed - not stored
    doc.add(new Field(fieldName, value.toLowerCase(), StringField.TYPE_NOT_STORED));
    // stored - not indexed
    doc.add(new Field(fieldName, value, StoredField.TYPE));

Of course, to preserve symmetry for search, you would also need to force string terms in your queries to lower case as well.

On Sat, Dec 1, 2012 at 1:02 AM, Dawid Weiss <dawid.we...@gmail.com> wrote:
> Iterating character-by-character is different than considering the
> entire string at once so your observation is correct, that's how it's
> supposed to work. In particular, note this in String#toLowerCase
> documentation:
>
> "Since case mappings are not always 1:1 char mappings, the resulting
> String may be a different length than the original String."
>
> So it simply cannot be the same as iterating char-by-char.
>
> Dawid
>
> On Sat, Dec 1, 2012 at 6:32 AM, Trejkaz <trej...@trypticon.org> wrote:
> > On Fri, Nov 30, 2012 at 8:22 PM, Ian Lea <ian....@gmail.com> wrote:
> >> Sounds like a side effect of possibly different, locale-dependent,
> >> results of using String.toLowerCase() and/or Character.toLowerCase().
> >>
> >> http://docs.oracle.com/javase/6/docs/api/java/lang/String.html#toLowerCase()
> >> specifically mentions Turkish.
> >>
> >> A Google search for "Character.toLowerCase() turkish" gets hits which
> >> sound relevant.
> >
> > Certainly Turkish has special rules because of that uppercase I with
> > dot. I was more wondering whether LowerCaseFilter was intentionally
> > doing it differently to String.toLowerCase() or whether it was some
> > kind of unintentional side-effect of using Character.toLowerCase()
> > iteratively.
> >
> > TX
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> > For additional commands, e-mail: java-user-h...@lucene.apache.org
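
P.S. To make the difference discussed in the quoted thread concrete, here is a small plain-Java sketch (no Lucene involved). The class and method names are made up for illustration, and the per-code-point loop is only my approximation of what a character-by-character filter does, not a copy of LowerCaseFilter. It lowercases the same input once code point by code point with Character.toLowerCase(int) and once as a whole string with String.toLowerCase(Locale.ROOT), then compares lengths.

```java
import java.util.Locale;

// Hypothetical demo class; not part of Lucene.
public class LowerCaseComparison {

    // Lowercase one code point at a time. Each input code point maps to
    // exactly one output code point, so 1:many case mappings cannot occur.
    static String lowerPerCodePoint(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            sb.appendCodePoint(Character.toLowerCase(cp));
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    // Lowercase the whole string at once with a fixed locale, so the result
    // does not depend on the JVM's default locale. Full case mappings may
    // change the length of the string.
    static String lowerWholeString(String s) {
        return s.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // U+0130 is LATIN CAPITAL LETTER I WITH DOT ABOVE.
        String input = "\u0130stanbul";
        String perCodePoint = lowerPerCodePoint(input);
        String whole = lowerWholeString(input);

        System.out.println("per code point length: " + perCodePoint.length());
        System.out.println("whole string length:   " + whole.length());
        System.out.println("equal: " + perCodePoint.equals(whole));
    }
}
```

With the dotted capital I, the whole-string version should come out one char longer, because outside the Turkish and Azeri locales it lowercases to "i" followed by a combining dot above, while the per-code-point version yields a plain "i". Whichever approach you pick, the point is to apply the same one at index time and at query time.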