On Wed, 2004-12-29 at 23:54, Thomas Heller wrote:

> I found the discussion of unicode, in any python book I have, insufficient.

I couldn't agree more. I think explicit treatment of implicit
conversion, the role of the default encoding (sys.getdefaultencoding()),
the u'' literal syntax and the unicode() built-in, etc. would be helpful
to many.
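
A minimal sketch of the implicit conversion problem. In Python 2, combining
a str with a unicode object decodes the str using the default encoding
(normally ASCII). The helper implicit_style_decode is a hypothetical name
that performs that decode explicitly, so the sketch also runs on Python 3,
where the implicit conversion no longer exists. The b'' prefix needs
Python 2.6 or later; on older versions a plain string literal plays the
same role.

```python
import sys

# What the interpreter reports as its default encoding
# ('ascii' on a stock Python 2, 'utf-8' on Python 3).
default_encoding = sys.getdefaultencoding()


def implicit_style_decode(data, encoding='ascii'):
    """Hypothetical helper: decode a byte string the way Python 2 does
    implicitly when a str is combined with a unicode object."""
    return data.decode(encoding)


plain = b'cafe'            # pure ASCII bytes
accented = b'caf\xc3\xa9'  # UTF-8 bytes for "cafe" with an acute accent

# ASCII-only data converts silently, which hides the problem in testing.
joined = u'menu: ' + implicit_style_decode(plain)

# Non-ASCII data fails as soon as the ASCII codec is applied to it.
try:
    implicit_style_decode(accented)
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True
```

The point of the sketch is that code mixing the two types works until the
first non-ASCII byte arrives, and only then raises UnicodeDecodeError.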

A clear explanation of why Python strings, despite being assumed to be
ASCII, can contain any 8-bit data in any text encoding (or no text
encoding at all) may also help newbies.
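
To make that concrete, a small sketch: the byte string below carries no
record of its encoding, so the same bytes read as different text depending
on which codec is assumed, and they are not valid UTF-8 at all. The sample
bytes and the choice of codecs are picked only for illustration; b'' stands
in for the Python 2 str type and needs Python 2.6+ or Python 3.

```python
# Three bytes; nothing in the object says which encoding they are in.
raw = b'\xe9t\xe9'

# The same bytes interpreted under two different single-byte encodings.
as_latin1 = raw.decode('latin-1')
as_koi8r = raw.decode('koi8-r')

# Not every byte string is valid text in every encoding.
try:
    raw.decode('utf-8')
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

# A byte string can also hold data that is not text in any encoding,
# for example the first bytes of a PNG file.
binary = b'\x89PNG\r\n\x1a\n'
```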

I spent some time a while ago fighting to understand the way Python
handles encodings, and benefited significantly from it, but there really
needs to be a good explanation. In particular, the relationship between
'str' and 'unicode' objects, the way implicit conversion works with the
default encoding, and how explicit conversions between encodings and
to/from unicode work all need attention.
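
A short sketch of the explicit route, assuming UTF-8 input and a Latin-1
target (both chosen only for illustration): decode the bytes to unicode,
then encode the unicode object into the other encoding. It runs on Python
2.6+ and Python 3; b'' corresponds to the Python 2 str type.

```python
# Bytes known to be UTF-8 (the knowledge has to come from outside the data).
utf8_bytes = b'caf\xc3\xa9'

# Byte string -> unicode: name the source encoding explicitly.
text = utf8_bytes.decode('utf-8')

# unicode -> byte string in a different encoding.
latin1_bytes = text.encode('latin-1')

# Encoding fails if the target cannot represent a character...
try:
    text.encode('ascii')
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# ...unless an error handler is given as the second argument.
ascii_lossy = text.encode('ascii', 'replace')
```

Going through unicode in the middle is what makes the conversion between
two byte encodings well defined; there is no direct str-to-str step that
knows both encodings.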

It'd also be REALLY good to mention the role and importance of the
coding: line. An explanation of its relationship with the interpretation
of strings in the script, and with the default encoding, would also be
helpful, as IMO the script encodings PEP (PEP 263) only really makes
sense once you already understand the subject.
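
A sketch of what the coding: line does, under a few assumptions: the
source snippets are made up for illustration, run_source is a hypothetical
helper, and the code is written for Python 3, where compile() accepts
source as bytes and honours the declaration. The declaration only governs
how the bytes of the source file are decoded; it does not change
sys.getdefaultencoding().

```python
import sys

# The same one-line script saved in two encodings, each declared correctly.
source_latin1 = b"# -*- coding: latin-1 -*-\nname = u'caf\xe9'\n"
source_utf8 = b"# -*- coding: utf-8 -*-\nname = u'caf\xc3\xa9'\n"

# UTF-8 bytes with a wrong declaration: the literal is decoded as Latin-1.
source_mislabelled = b"# -*- coding: latin-1 -*-\nname = u'caf\xc3\xa9'\n"


def run_source(source_bytes):
    """Hypothetical helper: compile and run source bytes, return 'name'."""
    namespace = {}
    exec(compile(source_bytes, '<demo>', 'exec'), namespace)
    return namespace['name']


encoding_before = sys.getdefaultencoding()

from_latin1 = run_source(source_latin1)
from_utf8 = run_source(source_utf8)
from_mislabelled = run_source(source_mislabelled)

# Running declared sources leaves the default encoding untouched.
encoding_after = sys.getdefaultencoding()
```

With correct declarations both files produce the same four-character
unicode string; the mislabelled one produces five characters, because each
UTF-8 byte of the accented letter is read as a separate Latin-1 character.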

It wouldn't hurt to point C extension authors at things like the 'es'
(encoded string) format for PyArg_ParseTuple to help them make their
code better behaved with non-ASCII text.

--
Craig Ringer 
