New submission from Hye-Shik Chang:

This patch adds CNS11643 support to Python's Unicode codecs. CNS11643 is a huge character set that is used in EUC-TW and ISO-2022-CN. CJKCodecs has had CNS11643 support for at least 4 years, but I dropped it when integrating into Python because of its huge size. EUC-TW and ISO-2022-CN aren't widely used, but they are still regarded as major encodings.
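Since these codecs have not always shipped with Python, code that depends on them may want to check for them at runtime. The following is a minimal sketch using the standard `codecs.lookup` call. The codec names `euc_tw` and `iso2022_cn` are assumptions for illustration; the names the patch actually registers are not stated here.

```python
import codecs


def codec_available(name):
    """Return True if the running interpreter can look up the named codec."""
    try:
        codecs.lookup(name)
    except LookupError:
        # codecs.lookup raises LookupError when no codec matches the name.
        return False
    return True


# "euc_tw" and "iso2022_cn" are hypothetical names for the CNS11643-based
# codecs; "big5" is included only as a Traditional Chinese codec for comparison.
for name in ("euc_tw", "iso2022_cn", "big5"):
    print(name, codec_available(name))
```

A caller could use such a check to fall back to another encoding when the CNS11643-based codecs are absent from a given build.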
In my patch, CNS11643 charset support can be disabled for light platforms by adding -DNO_CNS11643 to CFLAGS. The mapping source code for the charset is 900K, and it adds about 350K to _codecs_tw.so (on POSIX) or python26.dll (on Win32). What do you think about adding this code?

----------
components: Unicode
files: cns11643-r1.diff.gz
messages: 62282
nosy: hyeshik.chang
priority: low
severity: normal
status: open
title: Adding new CNS11643 support, a *huge* charset, in cjkcodecs
versions: Python 2.6, Python 3.0

Added file: http://bugs.python.org/file9408/cns11643-r1.diff.gz

__________________________________
Tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue2066>
__________________________________