Kenneth Whistler wrote:
"Unicode character (\uFFE2\uFF80\uFF93)"
> ...
What you are actually looking for is the UTF-8 sequence:

0xE2 0x80 0x93

The UTF-8 bytes E2 80 93 (the encoding of U+2013 EN DASH) all have the most significant bit set, so when they are treated as signed 8-bit values and widened to 16 bits they get *sign-extended*, producing FFE2 FF80 FF93. In a UTF-8 string literal it should suffice to rewrite this as \xE2\x80\x93. If that does not help, find out where the 16-bit widening/sign extension occurs.
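
To illustrate the effect, here is a minimal C sketch. It is not taken from the code under discussion; the function names are made up for this example, and it assumes the bytes pass through a signed 8-bit type somewhere before being widened, which is the usual way this happens.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper showing the buggy widening: a signed 8-bit value is
   converted straight to 16 bits, so a set high bit fills the upper byte. */
uint16_t widen_sign_extended(signed char c) {
    return (uint16_t)c;
}

/* Hypothetical helper showing the fix: convert to unsigned char first,
   so the upper byte of the 16-bit result stays zero. */
uint16_t widen_zero_extended(signed char c) {
    return (uint16_t)(unsigned char)c;
}

/* Prints both widenings for the three UTF-8 bytes of U+2013. */
void show_en_dash_widening(void) {
    /* E2 80 93 written as the values they take in a signed char. */
    const signed char utf8[3] = { -30, -128, -109 };
    for (int i = 0; i < 3; i++) {
        uint16_t bad = widen_sign_extended(utf8[i]);
        uint16_t good = widen_zero_extended(utf8[i]);
        assert((bad & 0xFF) == good);  /* the low byte is the same either way */
        printf("%04X vs %04X\n", (unsigned)bad, (unsigned)good);
    }
}
```

Calling show_en_dash_widening() prints FFE2 vs 00E2, FF80 vs 0080, and FF93 vs 0093. So wherever the widening happens, going through an unsigned 8-bit type (or masking with 0xFF) should keep the original byte values.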


markus

