On 07/26/2013 07:21 AM, wxjmfa...@gmail.com wrote:
>>>> sys.getsizeof('––') - sys.getsizeof('–')
>
> I have already explained / commented this.

Maybe it got lost in translation, but I don't understand your point with that.

> Hint: To understand Unicode (and every coding scheme), you should
> understand "utf". The how and the *why*.

Hmm, so if Python used UTF-8 internally to represent unicode strings, wouldn't that punish *all* users (not just non-ASCII users), since finding the character at a given position in a string then requires an O(n) scan? UTF-32 I could see (and indeed that's essentially what FSR uses when necessary, does it not?), but not UTF-8 or UTF-16.