On 1/7/2011 9:51 PM, nicolas cellier wrote:
So, what the hell does this bitAnd: 16r3FFFFF mean?
Well, in Squeak's Character encoding, the bits above that mask don't encode
the character itself but the so-called #leadingChar. This leadingChar holds
information about the environment and the encoding which should be used to
interpret the charCode.

The background to this is Han unification (http://en.wikipedia.org/wiki/Han_unification). The language environment (encoded in the upper bits) disambiguates the character if necessary.

Cheers,
  - Andreas

In fact, charCode will most likely return a Unicode code point
(http://en.wikipedia.org/wiki/ISO/CEI_10646), except if leadingChar ~= 0, which
can be the case for some East Asian language environments.
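
To make the split concrete, here is a minimal sketch in Python rather than
Smalltalk. The mask is the 16r3FFFFF from the question. Treating everything
above those 22 bits as the leadingChar is an assumption drawn from the
description above, not something read from the Squeak sources. The helper name
split_value and the leadingChar number 5 are made up for illustration.

```python
# Sketch only: the mask is 16r3FFFFF from the message; the shift of 22 is
# inferred from that mask, and split_value is a hypothetical helper.
CHAR_CODE_MASK = 0x3FFFFF      # low 22 bits, written 16r3FFFFF in Smalltalk
LEADING_CHAR_SHIFT = 22        # assumption: leadingChar sits right above the mask


def split_value(value):
    """Return (leading_char, char_code) for a raw character value."""
    char_code = value & CHAR_CODE_MASK          # like: value bitAnd: 16r3FFFFF
    leading_char = value >> LEADING_CHAR_SHIFT  # the bits above the mask
    return leading_char, char_code


# Plain case: leadingChar is 0, so charCode is the code point itself.
print(split_value(0x4E2D))   # (0, 20013)

# Hypothetical tagged value: an arbitrary leadingChar of 5 over the same charCode.
tagged = (5 << LEADING_CHAR_SHIFT) | 0x4E2D
print(split_value(tagged))   # (5, 20013)
```

Both values carry the same charCode. Only the upper bits differ, and that is
the part the mask throws away.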

Note that a previous replacement - #codePoint - appears unsent...
This codePoint does not deal with leadingChar, so I'm not sure it's correct.

Hope it helps.

Nicolas

_______________________________________________
Beginners mailing list
Beginners@lists.squeakfoundation.org
http://lists.squeakfoundation.org/mailman/listinfo/beginners
