On 21 Jun 2011, at 07:40, Slava Paperno wrote:

> VAR UTF-8
> 194
> 171
> 226
> 128
> 148
> 194
> 187
>
> The FIELD and the VAR UTF-16 reports are entirely predictable, but the VAR
> UTF-8 list is puzzling to me. I expected six bytes, not seven.
I didn't follow the earlier thread, so apologies if I'm not helping here.

You said you were puzzled by the UTF-8 list having seven bytes. But Unicode characters in UTF-8 may be from 1 to 4 bytes long, and the value of the first byte tells you how long the sequence is. A byte value below 128 is a 1-byte (ASCII) character. A byte value between 192 and 223 is the first byte of a 2-byte character (in practice 194 to 223, since 192 and 193 never occur in valid UTF-8). A byte value between 224 and 239 is the first byte of a 3-byte character, and one between 240 and 244 is the first byte of a 4-byte character. Values between 128 and 191 are continuation bytes that follow a first byte.

So in this case, 194 171 is the 2-byte sequence for the opening guillemet, the 226 value is the beginning of the 3-byte sequence 226 128 148 for em-dash, and 194 187 is the 2-byte sequence for the closing guillemet. That makes 2 + 3 + 2 = 7 bytes for three characters.

Cheers
Dave
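
P.S. In case it helps to see the grouping done mechanically, here is a minimal sketch in Python (not LiveCode) that splits the reported byte values into characters using the first-byte ranges described above. The function names lead_byte_length and split_utf8 are made up for this illustration, and the sketch assumes modern UTF-8 as defined in RFC 3629, where a sequence is at most four bytes long.

```python
def lead_byte_length(b):
    """Return the sequence length announced by a UTF-8 first byte.

    Returns 0 for a continuation byte (128 to 191), which cannot
    start a character. Hypothetical helper for illustration only.
    """
    if b < 128:
        return 1          # plain ASCII
    if b <= 191:
        return 0          # continuation byte
    if b <= 223:
        return 2          # first byte of a 2-byte character
    if b <= 239:
        return 3          # first byte of a 3-byte character
    if b <= 247:
        return 4          # first byte of a 4-byte character
    raise ValueError(f"{b} is not a valid UTF-8 byte")


def split_utf8(values):
    """Group a list of byte values into one sub-list per character."""
    groups = []
    i = 0
    while i < len(values):
        n = lead_byte_length(values[i])
        if n == 0:
            raise ValueError(f"unexpected continuation byte at index {i}")
        if i + n > len(values):
            raise ValueError(f"truncated sequence at index {i}")
        groups.append(values[i:i + n])
        i += n
    return groups


# The seven values from the VAR UTF-8 report.
reported = [194, 171, 226, 128, 148, 194, 187]

groups = split_utf8(reported)
decoded = [bytes(g).decode("utf-8") for g in groups]

for g, ch in zip(groups, decoded):
    # Print the code point rather than the character itself so the
    # output does not depend on the terminal's encoding.
    print(g, f"U+{ord(ch):04X}")
```

The three groups it finds are 194 171, then 226 128 148, then 194 187, which decode to U+00AB, U+2014 and U+00BB: three characters, seven bytes. The sketch only looks at first bytes to do the grouping and leaves full validation to Python's own decoder.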