Re: What encoding does u'...' syntax use?

Denis Kasak Sat, 21 Feb 2009 12:48:51 -0800

On Sat, Feb 21, 2009 at 9:45 PM, "Martin v. Löwis" <[email protected]> wrote:
>>> Indeed. As Python *can* encode all characters even in 2-byte mode
>>> (since PEP 261), it seems clear that Python's Unicode representation
>>> is *not* strictly UCS-2 anymore.
>>
>> Since we're already discussing this, I'm curious - why was UCS-2
>> chosen over plain UTF-16 or UTF-8 in the first place for Python's
>> internal storage?
>
> You mean, originally? Originally, the choice was only between UCS-2
> and UCS-4; choice was in favor of UCS-2 because of size concerns.
> UTF-8 was ruled out easily because it doesn't allow constant-size
> indexing; UTF-16 essentially for the same reason (plus there was
> no point to UTF-16, since there were no assigned characters outside
> the BMP).


Yes, I failed to realise how long ago the unicode data type was
implemented originally. :-)
Thanks for the explanation.
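
For anyone reading this in the archive, here is a rough sketch of the constant-size indexing point, assuming a Python 3 interpreter (where string literals are already Unicode). The helper name `widths` and the sample characters are my own picks for illustration, not anything from the Python source. It only measures how many bytes each encoding spends per character:

```python
# Sample characters: ASCII, Latin-1 range, elsewhere in the BMP, outside the BMP.
samples = ["a", "\u00e9", "\u20ac", "\U0001d11e"]

def widths(ch):
    """Return the encoded byte length of one character in each encoding.

    The "-le" codec variants are used so that no byte order mark is
    prepended and counted in the length.
    """
    return {
        "utf-8": len(ch.encode("utf-8")),
        "utf-16": len(ch.encode("utf-16-le")),
        "utf-32": len(ch.encode("utf-32-le")),
    }

for ch in samples:
    print("U+%04X" % ord(ch), widths(ch))
```

The UTF-8 width varies from character to character, and UTF-16 needs a surrogate pair for the last sample, so in neither case can the n-th character be found by multiplying n by a fixed size. A fixed-width form (4 bytes per character here) does allow that, at the cost of space.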

-- 
Denis Kasak
--
http://mail.python.org/mailman/listinfo/python-list
