Re: UTF-8 question from Dive into Python 3

Antoine Pitrou Wed, 19 Jan 2011 11:54:13 -0800

On Wed, 19 Jan 2011 19:18:49 +0000 (UTC)
Tim Harig <[email protected]> wrote:
> On 2011-01-19, Antoine Pitrou <[email protected]> wrote:
> > On Wed, 19 Jan 2011 18:02:22 +0000 (UTC)
> > Tim Harig <[email protected]> wrote:
> >> Converting to a fixed byte
> >> representation (UTF-32/UCS-4) or separating all of the bytes for each
> >> UTF-8 sequence into 6-byte containers both make it possible to simply
> >> index the letters by a constant size.  You will note that Python does
> >> the former.
> >
> > Indeed, Python chose the wise option. Actually, I'd be curious about any
> > real-world software that successfully chose your proposed approach.
> 
> The point is basically the same.  I created an example because it
> was simpler to follow for demonstration purposes than an actual UTF-8
> conversion to any official multibyte format.  You obviously have no
> other purpose than to be contrary [...]


Right. You were the one who jumped in and tried to lecture everyone on
how UTF-8 was "big-endian", and now you are abandoning the one esoteric
argument you found in support of that.

> As soon as you start to convert to a multibyte format the endian issues
> occur.

Ok. Good luck with your "endian issues" which don't exist.
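
For anyone following along, here is a minimal sketch of the distinction
under discussion. It is not part of the original exchange: the sample
string and variable names are arbitrary, and it assumes Python 3. It only
illustrates that UTF-8 is defined as a single byte sequence, that UTF-32
has two serialized byte orders, and that indexing a str works on code
points either way.

```python
# Illustrative sample text (an assumption, not from the thread): 'h', e-acute,
# 'llo', a space, and the euro sign.
s = "h\u00e9llo \u20ac"

# UTF-8 is specified as a sequence of bytes, so there is exactly one
# serialized form; no byte-order variant exists.
utf8 = s.encode("utf-8")

# UTF-32 is specified in terms of 32-bit code units, so serializing it
# to bytes requires picking a byte order.
utf32_le = s.encode("utf-32-le")
utf32_be = s.encode("utf-32-be")

# Indexing a str is by code point; byte order never appears at this level.
second = s[1]

print(len(s), len(utf8), len(utf32_le))   # code points, UTF-8 bytes, UTF-32 bytes
print(utf8)
print(utf32_le[:4], utf32_be[:4])         # same code point, two byte orders
```

Byte order only shows up at the point where UTF-32 (or UTF-16) text is
written out as bytes; both variants decode back to the same str.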


-- 
http://mail.python.org/mailman/listinfo/python-list
