Ed Kellett <e+python-l...@kellett.im>:

> So, Marko, I don't know what code you work on, but I think it's unfair
> to attack Python 3's unicode handling too hard if you haven't written
> a new project with it.

I don't believe the problem was solely in the difficulty of conversion.
It was primarily in the tricky (and apparently pointless)
decision-making. Which of these are text (really) and which are bytes
(really):

 * Pathnames

 * Commands

 * Stdin/Stdout/Stderr

 * Environment variables

 * URIs

 * HTTP header field names

 * HTTP header field values

 * HTTP methods

 * HTTP content

 * SMTP messages

 * Email messages

 * Base64 encodings

 * Hexadecimal encodings

 * JSON encoding

and so on.

And BTW, I have implemented an SMTP server in Python3 from scratch, and
was struggling with similar issues. The code is riddled with lines like
this:

    conn.request('EHLO {}\r\n'.format(domain).encode())

In Linux protocol and system programming, text is just the wrong
abstraction. I doubt there are very many situations where Python3's
UTF-32 strings are a more opportune abstraction than UTF-8 strings are.


Marko