Sean P. DeNigris <sean <at> clipperadams.com> writes:

> For Character, what is the difference between #asciiValue and #charCode
> (= #asciiValue bitAnd: 16r3FFFFF)?
>
> Thanks.
> Sean

#asciiValue suggests the character is encoded in ASCII. But hey, that's not general! What is the ASCII code of é?
It may still be used by legacy code dating from the days when all Smalltalk characters were in the ASCII set.
All? Well, all but the left and up arrows, maybe ;)

The modern replacement for #asciiValue is #charCode.

So what does this bitAnd: 16r3FFFFF mean?
Well, in the Squeak Character encoding, the bits above the low 22 (the ones kept by the 16r3FFFFF mask) don't encode the character itself but the so-called #leadingChar.
The leadingChar holds information about the environment and the encoding that should be used to interpret the charCode.
In fact, charCode will most likely return a Unicode code point (http://en.wikipedia.org/wiki/ISO/CEI_10646), except when leadingChar ~= 0, which can be the case in some East Asian language environments.

Note that an earlier replacement, #codePoint, appears to be unsent (nothing calls it).
#codePoint does not deal with leadingChar, so I'm not sure it's correct.

Hope it helps.

Nicolas
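
To make the masking described above concrete, here is a minimal sketch of the bit layout. It is written in Python as a model, not as Squeak code. The mask value comes from the question; the shift of 22 simply follows from the width of that mask. Treating every bit above the mask as the leadingChar is an assumption, and the function names and the helper make_value are hypothetical, introduced only for illustration.

```python
# Illustrative model only: this is Python, not Squeak's Character implementation.

CHAR_CODE_MASK = 0x3FFFFF   # same value as Smalltalk's 16r3FFFFF: 22 one-bits
CHAR_CODE_BITS = 22         # width of the mask above


def char_code(value: int) -> int:
    """Keep only the low 22 bits, as 'value bitAnd: 16r3FFFFF' does."""
    return value & CHAR_CODE_MASK


def leading_char(value: int) -> int:
    """Return the bits above the charCode.

    Assumption: everything above the low 22 bits is the leadingChar.
    """
    return value >> CHAR_CODE_BITS


def make_value(leading: int, code: int) -> int:
    """Hypothetical helper: pack a leadingChar and a charCode into one integer."""
    return (leading << CHAR_CODE_BITS) | (code & CHAR_CODE_MASK)


# U+00E9 is é. With leadingChar = 0 the raw value and the charCode coincide.
plain = make_value(0, 0xE9)

# The leadingChar value 5 is arbitrary, chosen only to show the effect.
tagged = make_value(5, 0xE9)

print(plain == char_code(plain))     # raw value equals charCode when leadingChar is 0
print(tagged == char_code(tagged))   # raw value differs once leadingChar is nonzero
print(char_code(tagged), leading_char(tagged))
```

In this model the raw value and the charCode only differ when the leadingChar is nonzero, which matches the point above that a selector ignoring leadingChar gives the same answer in most environments but not in all of them.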