Tetsuya Morimoto added the comment:

> Another traditional issue with Japanese codecs is that people have different
> opinions on what the encoding should do. It may be that when we release the
> codec, somebody comes up and says that the codec is incorrect, and it should
> do something different for some code points, citing some other applications
> which he considers right. In particular for the Microsoft ones, people may
> claim that some version of Windows did things differently.

As for e-mail encoding, Japanese users should use utf-8, which would resolve most problems. However, for historical or compatibility reasons, that is still not the case today. I don't think these legacy codecs are needed within an individual application, but we sometimes run into encoding issues when an application interoperates with an external system such as e-mail.

> Now, for this set, the ones that got registered with IANA sound ok (in the
> sense that it is our bug if they fail to conform to the IANA spec, and IANA's
> fault if they fail to do what users expect). For the other ones, I wonder
> whether there is some official source that can be consulted for correctness.

Exactly. I'm currently looking for the euc-jp-ms and iso-2022-jp-ms specs in English. There are, of course, volunteer-written documents in Japanese:

http://www.wdic.org/w/WDIC/eucJP-ms
http://www.wdic.org/w/WDIC/ISO-2022-JP-MS

I could agree to dropping the encodings for which it is difficult to find an official source.

> On a different note: why do you claim that the code is written by Perky?
> (it's not you, is it?)

Right! The credit belongs to him; I'm only an assistant.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue23050>
_______________________________________