Martin v. Löwis wrote:
> Erik Max Francis wrote:
> >> The only reason is that nobody has needed one so far, and because
> >> it is quite some work to do if done correctly. Why do you need it?

Somebody asked me about generating UTF-32 (he didn't have a choice of output format). I was about to propose the obvious ``u.encode('utf-32')`` but discovered that it's missing. Someone proposed 'unicode-internal', but it depends on the build and is an ugly answer. Next time, I want Guido's Time Machine to just work, so I have to fix this ;-).

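A minimal sketch of generating UTF-32 by hand when no 'utf-32' codec is available, going through UTF-16 code units and combining surrogate pairs. The helper names (`utf16_units_to_utf32`, `encode_utf32`) are hypothetical and not part of any Python API; the sketch assumes little-endian output by default, writes no BOM, and simply rejects unpaired surrogates, which is only one of several possible behaviours.

```python
import struct


def utf16_units_to_utf32(units, byteorder="<"):
    """Pack an iterable of 16-bit UTF-16 code units as UTF-32 bytes.

    Hypothetical helper: surrogate pairs are combined into one code point,
    unpaired surrogates raise ValueError, and no BOM is emitted.
    """
    out = []
    it = iter(units)
    for unit in it:
        if 0xD800 <= unit <= 0xDBFF:
            # High surrogate: it must be followed by a low surrogate.
            low = next(it, None)
            if low is None or not (0xDC00 <= low <= 0xDFFF):
                raise ValueError("unpaired high surrogate: 0x%04X" % unit)
            cp = 0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00)
        elif 0xDC00 <= unit <= 0xDFFF:
            # Low surrogate with no preceding high surrogate.
            raise ValueError("unpaired low surrogate: 0x%04X" % unit)
        else:
            cp = unit
        out.append(struct.pack(byteorder + "I", cp))
    return b"".join(out)


def encode_utf32(text, byteorder="<"):
    """Encode text as UTF-32 via the existing UTF-16 codec (assumed approach)."""
    raw = text.encode("utf-16-le")
    units = struct.unpack("<%dH" % (len(raw) // 2), raw)
    return utf16_units_to_utf32(units, byteorder)
```

Passing `byteorder=">"` gives big-endian output instead. Whether a real codec should raise on unpaired surrogates, replace them, or pass them through is exactly the kind of behaviour that would need to be specified.
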
> > Why would it be "quite some work"? Converting from UTF-16 to UTF-32 is
> > pretty straightforward, and UTF-16 is already supported.
>
> I would like to see it correct, unlike the current UTF-16 codec. Perhaps
> whoever contributes an UTF-32 codec could also deal with the defects of
> the UTF-16 codec.

Now this is interesting, as I hoped to base my code on UTF-16 (and perhaps UTF-8 for combining surrogates)... Can you elaborate? I could attempt to fix UTF-16 as well but I don't have the expertise to choose the right behaviour, so you'll have to specify precisely what it should do (that it doesn't do now).

--
http://mail.python.org/mailman/listinfo/python-list