On Wed, Apr 29, 2009 at 23:03, Terry Reedy <tjre...@udel.edu> wrote:

> Thomas Breuel wrote:
>
>> Sure. However, that requires you to provide meaningful, reproducible
>> counter-examples, rather than a stenographic formulation that might
>> hint some problem you apparently see (which I believe is just not
>> there).
>>
>> Well, here's another one: PEP 383 would disallow UTF-8 encodings of half
>> surrogates.
>
> By my reading, the current Unicode 5.1 definition of 'UTF-8' disallows
> that.

If we use conformance to Unicode 5.1 as the basis for our discussion, then PEP 383 is off the table anyway. I'm all for strict Unicode compliance. But apparently, the Python community doesn't care.

CESU-8 is described in Unicode Technical Report #26, so it at least has some official recognition. More importantly, it's also widely used. So, my question: what are the implications of PEP 383 for CESU-8 encodings on Python?

My meta-point is: there are probably many more such issues hidden away and it is a really bad idea to rush something like PEP 383 out. Unicode is hard anyway, and tinkering with its semantics requires a lot of thought.

Tom
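
To make the CESU-8 question concrete, here is a minimal sketch of how one CESU-8 byte sequence behaves under the error handlers that current Python 3 provides. It assumes the PEP 383 handler as it eventually shipped under the name `surrogateescape`, and uses U+1F600 purely as an illustrative non-BMP character; the helper `strict_utf8_rejects` is hypothetical and not part of any library.

```python
# CESU-8 encodes a non-BMP character as two 3-byte sequences, one per
# UTF-16 surrogate. U+1F600 -> U+D83D, U+DE00 -> ED A0 BD  ED B8 80.
cesu8 = b"\xed\xa0\xbd\xed\xb8\x80"


def strict_utf8_rejects(data):
    """Return True if the strict UTF-8 decoder refuses the bytes."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return True
    return False


# Strict UTF-8 treats encoded surrogates as malformed input.
rejected = strict_utf8_rejects(cesu8)

# surrogateescape maps each undecodable byte to a lone low surrogate
# (U+DC80..U+DCFF), so the original bytes survive a round trip.
escaped = cesu8.decode("utf-8", "surrogateescape")
roundtrip = escaped.encode("utf-8", "surrogateescape")

# surrogatepass decodes the two surrogates as code points; passing them
# through UTF-16 then joins the pair into the intended character.
passed = cesu8.decode("utf-8", "surrogatepass")
combined = passed.encode("utf-16", "surrogatepass").decode("utf-16")

print(rejected, len(escaped), roundtrip == cesu8, len(passed), len(combined))
```

Under these assumptions the CESU-8 bytes are preserved byte for byte by `surrogateescape`, but the decoded string holds six escape code points rather than the character the bytes were meant to represent; recovering that character needs a separate step such as the `surrogatepass` route sketched here.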