On 1/7/2011 9:51 PM, nicolas cellier wrote:
So, what the hell does this bitAnd: 16r3FFFFF mean? In Squeak's Character encoding, the bits above the low 22 don't encode the character itself but the so-called #leadingChar. This leadingChar holds information about the language environment and the encoding that should be used to interpret the charCode.
The background for this is Han unification (http://en.wikipedia.org/wiki/Han_unification). The language environment (encoded in the upper bits) disambiguates the character where necessary.
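To make the bit layout concrete, here is a rough sketch of the masking, written in Python purely for illustration. It is not Squeak's implementation: the helper names split_value and make_value are made up, and the leadingChar value 5 is an arbitrary placeholder rather than the id of any real language environment. The only things taken from the discussion are the mask 16r3FFFFF (the low 22 bits) and the idea that the bits above it carry the leadingChar.

```python
CHAR_CODE_BITS = 22          # 16r3FFFFF has exactly 22 one-bits
CHAR_CODE_MASK = 0x3FFFFF    # same value as Smalltalk's 16r3FFFFF


def split_value(value):
    """Split a raw character value into (leading_char, char_code).

    Hypothetical helper: the low 22 bits are the charCode, and
    everything above them is treated as the leadingChar.
    """
    return value >> CHAR_CODE_BITS, value & CHAR_CODE_MASK


def make_value(leading_char, char_code):
    """Hypothetical inverse of split_value: pack both parts into one integer."""
    return (leading_char << CHAR_CODE_BITS) | (char_code & CHAR_CODE_MASK)


# The same unified Han code point, once untagged and once tagged with an
# arbitrary (made-up) language-environment number.
plain = make_value(0, 0x4E2D)
tagged = make_value(5, 0x4E2D)

print(split_value(plain))
print(split_value(tagged))
```

Both values share the same charCode, so masking with 16r3FFFFF yields the same code point for each; only the upper bits tell them apart.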
Cheers, - Andreas
In fact, #charCode will most likely return a Unicode code point (http://en.wikipedia.org/wiki/ISO/CEI_10646), unless leadingChar ~= 0, which can be the case for some East Asian language environments. Note that an earlier replacement, #codePoint, appears to be unsent... #codePoint does not deal with leadingChar, so I'm not sure it's correct. Hope it helps. Nicolas
_______________________________________________ Beginners mailing list Beginners@lists.squeakfoundation.org http://lists.squeakfoundation.org/mailman/listinfo/beginners