Mingye Wang added the comment:

> Advice for final user:

This seems worth adding to the codecs documentation as a footnote. Perhaps
something like "(deprecated) ... gb2312 is an obsolete encoding from the 1980s.
Use gbk or gb18030 instead." would do.
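
To illustrate why such a footnote would help users, here is a minimal sketch
that checks which of the three codecs can encode a given character. The helper
name can_encode and the sample characters are my own choices for illustration,
not something from the docs or from this issue.

```python
def can_encode(text, encoding):
    """Return True if every character in text has a mapping in encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# Illustrative samples: U+4F53 is a common simplified ideograph, U+9AD4 is
# its traditional form, and U+1F600 lies outside the Basic Multilingual Plane.
samples = {
    "simplified": "\u4f53",
    "traditional": "\u9ad4",
    "emoji": "\U0001f600",
}

for label, char in samples.items():
    support = {enc: can_encode(char, enc) for enc in ("gb2312", "gbk", "gb18030")}
    print(label, support)
```

The repertoire grows with each codec: gb2312 covers only the original
simplified set, gbk adds the remaining common ideographs, and gb18030 covers
all of Unicode.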

> libiconv-1.14 is also using the wrong version.

Just a side note on whether libiconv is right or wrong here: I have reported
the GB18030 incompatibility as a libiconv bug.[1] From the replies, I learnt
that 1) the mapping libiconv currently uses was the official one published on
ftp.unicode.org at the time; 2) vendor implementations of gb2312 differed
historically. I have updated the corresponding section[2] on Wikipedia to
include these old references.
  [1]: https://lists.gnu.org/archive/html/bug-gnu-libiconv/2016-09/msg00004.html
  [2]: https://en.wikipedia.org/wiki/GB_2312#Two_implementations_of_GB2312

Still, being old and common does not necessarily mean being correct, as Ma Lin
has demonstrated by showing the character semantics. To back this up with a
source, I have added the GB2312-80 names for the glyphs in question to [2].
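
For anyone who wants to see how the codecs treat the glyphs in question, a
rough sketch follows. It assumes the disputed code points are the byte pairs
A1A4 and A1AA; those values are not stated in this message, so please check
them against the issue and [2]. The helper name decode_all is hypothetical.

```python
import unicodedata

# Assumption: these two GB2312 byte pairs are the ones under discussion.
DISPUTED = (b"\xa1\xa4", b"\xa1\xaa")
CODECS = ("gb2312", "gbk", "gb18030")


def decode_all(raw):
    """Decode raw with each codec; map codec name to the resulting string."""
    return {codec: raw.decode(codec) for codec in CODECS}


for raw in DISPUTED:
    for codec, text in decode_all(raw).items():
        # Print the code point and Unicode name so mappings can be compared.
        names = ", ".join(unicodedata.name(ch, "<unnamed>") for ch in text)
        print(f"{raw.hex()} {codec:8} U+{ord(text[0]):04X} {names}")
```

Comparing the printed Unicode names side by side shows whether a codec follows
the old ftp.unicode.org mapping or the one that gbk and gb18030 use.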

----------
nosy: +Artoria2e5

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue24036>
_______________________________________
