Re: Another Q about Unicode, Folding Greek edition!

Don Wed, 09 Jun 2010 01:35:20 -0700

Nick Sabalausky wrote:

Thanks all for the helpful responses. Since we seem to have some real Unicode-knowledge people here, I'd like to repost a question I had asked elsewhere awhile back, but didn't get an answer:
--------------------------------------------------------------------------------
Can someone explain how folding-case differs from lower-case and why it should be used for case-insensitive matching instead of lower-case?
I was looking at this document, but still don't get it: http://www.unicode.org/reports/tr21/tr21-5.html
The only part I see that directly addresses that is this:

      Case-folding is more than just conversion to lowercase.
      For example, it handles cases such as the Greek sigma,
      so that "?????" and "????S" will match correctly.

Which references what it says earlier about sigma:

      Characters may also have different case mappings,
      depending on the context.

      For example, U+03A3 "Σ" capital sigma lowercases to
      U+03C3 "σ" small sigma if it is followed by another
      letter, but lowercases to U+03C2 "ς" small
      final sigma if it is not.
But I still don't see how that demonstrates a need for anything other than toLower provided that the given toLower routine is already properly handling the "end of word"/"not end of word" difference.

--------------------------------------------------------------------------------
Unless, it's just extra speed due to not having to handle things like the "end of word"/"not end of word" difference?

If you want to do a case-insensitive search for "as" in " basdaS " in English, you can just convert both to lowercase, and you'll find both occurrences.

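Now suppose you want to find "as" in the string " basdas ", where it's all in Greek. It still occurs twice, but if you convert it to lowercase, the two sigmas come out as different characters: U+03C3 (medial sigma) in the middle of the word and U+03C2 (final sigma) at the end. So a toLower() that handles the end-of-word rule correctly is exactly what breaks the search. Case folding maps both sigmas to the same character, so both occurrences match.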
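
Here is a rough sketch of that difference in Python rather than D, assuming Python 3.3 or later, where str.lower() applies the final-sigma rule and str.casefold() does Unicode case folding. The Greek string is just a made-up stand-in for " basdas " (it is not a real word), and the helper names are hypothetical.

```python
# Made-up Greek stand-in for " basdaS ": the pair alpha+sigma occurs twice,
# once in the middle of the word and once at the very end.
haystack = " ΒΑΣΔΑΣ "
needle = "ασ"  # lowercase alpha followed by medial sigma (U+03C3)


def count_with_lower(text, pattern):
    """Count matches after lowercasing both sides (the toLower approach)."""
    return text.lower().count(pattern.lower())


def count_with_casefold(text, pattern):
    """Count matches after case folding both sides."""
    return text.casefold().count(pattern.casefold())


# Lowercasing is context-sensitive: the last sigma becomes final sigma (U+03C2),
# so the two occurrences no longer look alike.
print(repr(haystack.lower()))
print(count_with_lower(haystack, needle))

# Case folding maps every sigma (capital, medial, final) to U+03C3.
print(repr(haystack.casefold()))
print(count_with_casefold(haystack, needle))
```

With lowercasing, only the mid-word occurrence is found; with case folding, both are. Writing the needle as "ΑΣ" does not help the lowercase version either: it lowercases to alpha plus final sigma, which then matches only the occurrence at the end of the word.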
