Tetsuya Morimoto added the comment:

> Another traditional issue with Japanese codecs is that people have different 
> opinions on what the encoding should do. It may be that when we release the 
> codec, somebody comes up and says that the codec is incorrect, and it should 
> do something different for some code points, citing some other applications 
> which he considers right. In particular for the Microsoft ones, people may 
> claim that some version of Windows did things differently.

For e-mail encoding, Japanese users should ideally use utf-8, which
would resolve most problems. However, for historical and
compatibility reasons, that is still not the case today. I don't think
these legacy codecs are needed within an individual application, but we
sometimes run into encoding issues when an application interacts with
an external system such as e-mail.
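
As a rough illustration of the kind of interoperability issue meant
here, the following is a minimal sketch using only codecs already in
the standard library (iso2022_jp, cp932, utf-8). The helper name
can_encode and the sample strings are made up for this example, and the
proposed euc-jp-ms / iso-2022-jp-ms codecs are deliberately not used,
since they are not part of the standard library.

```python
def can_encode(text, codec):
    """Return True if `text` can be encoded with the named codec."""
    try:
        text.encode(codec)
    except UnicodeEncodeError:
        return False
    return True


# Sample strings chosen for illustration only.
greeting = "日本語"        # characters covered by JIS X 0208
circled_one = "\u2460"     # CIRCLED DIGIT ONE, a vendor extension character

# Plain JIS X 0208 text round-trips through the legacy e-mail encoding.
mail_bytes = greeting.encode("iso2022_jp")
assert mail_bytes.decode("iso2022_jp") == greeting

# Vendor extension characters are where the codecs start to disagree.
for codec in ("iso2022_jp", "cp932", "utf-8"):
    print(codec, can_encode(greeting, codec), can_encode(circled_one, codec))
```

Text containing such vendor extension characters can be handled inside
a Windows-oriented application but cannot be written with the standard
iso2022_jp codec, which is the sort of gap the -ms variants are meant
to cover.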

> Now, for this set, the ones that got registered with IANA sound ok (in the 
> sense that it is our bug if they fail to conform to the IANA spec, and IANA's 
> fault if they fail to do what users expect). For the other ones, I wonder 
> whether there is some official source that can be consulted for correctness.

Exactly. I'm currently looking for the euc-jp-ms and iso-2022-jp-ms
specs in English. There are, of course, volunteer-maintained documents
in Japanese:
http://www.wdic.org/w/WDIC/eucJP-ms
http://www.wdic.org/w/WDIC/ISO-2022-JP-MS

I could agree with dropping any character encoding for which an
official source is difficult to find.

> On a different note: why do you claim that the code is written by Perky? 
> (it's not you, is it?)

Right! The credit belongs to him; I'm only an assistant.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue23050>
_______________________________________