On Thu, 30 Aug 2012 07:02:24 -0400, Roy Smith wrote:

> In article <[email protected]>,
>  Steven D'Aprano <[email protected]> wrote:
>
>> The only thing which is innovative here is that instead of the Python
>> compiler declaring that "all strings will be stored in UCS-2", the
>> compiler chooses an implementation for each string as needed. So some
>> strings will be stored internally as UCS-4, some as UCS-2, and some as
>> ASCII (which is a standard, but not the Unicode consortium's standard).
>
> Is the implementation smart enough to know that x == y is always False
> if x and y are using different internal representations?

But x == y is not necessarily always False just because x and y have
different representations. There may be circumstances where two strings
have different internal representations even though their content is the
same, so it's an unsafe optimization to automatically treat them as
unequal.

The closest existing equivalent here is the relationship between ints and
longs in Python 2. 42 == 42L even though they have different internal
representations and take up a different amount of space.

My expectation is that the initial implementation of PEP 393 will be
relatively unoptimized, and over the next few releases it will get more
efficient. That's usually the way these things go.

--
Steven
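
P.S. Here is a rough sketch of why the shortcut is unsafe. It is only a
toy model, not how CPython stores strings: the helper names store() and
same_text() are made up for illustration, and the array module stands in
for "the same code points held in storage units of different widths".

```python
from array import array


def store(text, typecode):
    """Hypothetical helper: pack the code points of text into fixed-width units."""
    return array(typecode, [ord(ch) for ch in text])


def same_text(a, b):
    """Compare by content (code points), ignoring the storage width."""
    return len(a) == len(b) and all(x == y for x, y in zip(a, b))


narrow = store("spam", "B")  # one byte per code point
wide = store("spam", "L")    # at least four bytes per code point

# The two objects use different unit sizes...
print(narrow.itemsize, wide.itemsize)

# ...yet they hold the same text, so "different representation
# implies unequal" would give the wrong answer here.
print(same_text(narrow, wide))

# Genuinely different content still compares unequal.
print(same_text(narrow, store("spam!", "L")))
```

The representation can at most be a hint for choosing a faster comparison
path; on its own it says nothing about whether the contents differ.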
