Re: [jdev] SASL debugging

Joe Hildebrand Sat, 10 Dec 2005 13:27:58 -0800

A better example is

Å (U+212B: ANGSTROM SIGN)
Å (U+00C5: LATIN CAPITAL LETTER A WITH RING ABOVE)
Å (U+0041: LATIN CAPITAL LETTER A and U+030A: COMBINING RING ABOVE)

These all look the same, and pretty much mean the same thing. Luckily, we don't have to argue about what the characters "mean", that's a job for the Unicode consortium. For example, they have decided that:


А (U+0410: CYRILLIC CAPITAL LETTER A)

does *not* map onto U+0041.  Whatever.

The important thing is that for all three of the Å's above, they all canonicalize (in NFKC) to the UTF-8 bytes:


C3 85 (hex)

or

C3 A5 (hex)

if you've got case folding turned on.

This way you can compare them together for equality.
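
Here is a rough way to check this yourself. It is only a sketch using Python's standard unicodedata module, not a real stringprep implementation, and the helper names canonical() and hex_bytes() are made up for illustration. NFKC gives the composed U+00C5; the decomposed forms (NFD/NFKD) are the ones that come out as 41 CC 8A.

```python
import unicodedata

# The three visually identical forms of A-with-ring listed above.
forms = {
    "ANGSTROM SIGN": "\u212b",
    "LATIN CAPITAL LETTER A WITH RING ABOVE": "\u00c5",
    "A + COMBINING RING ABOVE": "\u0041\u030a",
}

def canonical(s, fold_case=False):
    """NFKC-normalize s; optionally case-fold too (a simplification of stringprep)."""
    s = unicodedata.normalize("NFKC", s)
    if fold_case:
        # Fold, then normalize again, since folding can denormalize a string.
        s = unicodedata.normalize("NFKC", s.casefold())
    return s

def hex_bytes(s):
    """UTF-8 bytes of s as space-separated hex."""
    return " ".join(f"{b:02X}" for b in s.encode("utf-8"))

for name, s in forms.items():
    print(f"{name}: {hex_bytes(canonical(s))} / "
          f"{hex_bytes(canonical(s, fold_case=True))} (folded)")

# All three compare equal once normalized.
print(len({canonical(s) for s in forms.values()}) == 1)

# The Cyrillic A stays distinct from the Latin A.
print(canonical("\u0410") == "A")

# The decomposed form, for comparison.
print(hex_bytes(unicodedata.normalize("NFD", "\u00c5")))
```

The point of the design is that you never compare the raw input; you compare canonical(a) == canonical(b).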

Oh, another favorite example of mine is Ⅷ (U+2167: ROMAN NUMERAL EIGHT). This NFKC's to viii (with case folding; VIII without). There are some more examples here:

http://jabberstudio.org/cgi-bin/viewcvs.cgi/cvs/jabber-net/test/stringprep/


On Dec 10, 2005, at 1:49 PM, Yves Goergen wrote:

On 10.12.2005 12:28 (+0100), Matthias Wimmer wrote:

Examples of mapped characters are:
“℉” (U+2109, single character!) is mapped to “°f” (two characters),
“™” (U+2122, single character!) is mapped to “tm” (two characters),
“ℂ” (U+2102) is mapped to “c”,
“ℹ” (U+2139) is mapped to “i”,
“№” (U+2116, single character!) is mapped to “no” (two characters),
“²” (U+00B2) is mapped to “2”.
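
The mappings listed above can be reproduced approximately with Python's standard unicodedata module. This is a sketch under assumptions: it uses NFKC plus str.casefold() as a stand-in for the stringprep mapping tables, and nfkc_fold() is a hypothetical helper name. Python also ships a newer Unicode database than stringprep's Unicode 3.2, so other characters may differ.

```python
import unicodedata

def nfkc_fold(s):
    """Approximate the stringprep mapping step: NFKC, case-fold, NFKC again."""
    folded = unicodedata.normalize("NFKC", s).casefold()
    return unicodedata.normalize("NFKC", folded)

# The single characters from the list above.
examples = ["\u2109", "\u2122", "\u2102", "\u2139", "\u2116", "\u00b2"]

for ch in examples:
    mapped = nfkc_fold(ch)
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)} -> {mapped!r} "
          f"({len(mapped)} character(s))")
```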


What's the point in mapping similar-looking characters to another one?
They are simply not the same and mapping a character from one language
set to one of an arbitrary other language can disturb sorting things
very much. Imagine our alphabet was A,B,D,F,G,H,...,C,E only because C
and E were mapped to the greerillew language characters that look
similar (or vice versa). Well anyway, I don't think I need this for now.

I'll simply make sure it's Unicode-capable, plugging in a string
converter later is still possible.

--
Yves Goergen "LonelyPixel" <[EMAIL PROTECTED]>
"Does the movement of the trees make the wind blow?"
http://newsboard.unclassified.de - Unclassified NewsBoard Forum
