CPython recognises both 'gbk' and 'cp936'; i.e. unicode('some string', 'gbk') does what you'd expect. IronPython 1.0.1 recognises only 'cp936'.
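
For anyone who wants to compare interpreters, here is a minimal sketch that probes which codec names a given Python resolves. It uses only codecs.lookup from the standard library; the helper name codec_available is hypothetical, and the list simply collects the names that come up in this post. Results will differ between CPython and IronPython (and between versions), so no expected output is shown.

```python
import codecs


def codec_available(name):
    """Return True if this interpreter can resolve the given codec name."""
    try:
        codecs.lookup(name)
        return True
    except LookupError:
        # codecs.lookup raises LookupError for names it cannot resolve
        return False


# Codec names mentioned in this post; extend the list as needed.
names = ['gbk', 'cp936', 'mac_roman', 'mac_greek', 'cp10000', 'cp10006']

for name in names:
    print('%-10s %s' % (name, codec_available(name)))
```

Running the same script under both interpreters and diffing the output gives a quick table of which names each one accepts.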

CPython recognises 'mac_roman', 'mac_greek', etc. IronPython doesn't. After a [rare] flash of inspiration, I tried 'cp10000', 'cp10006', etc., and IronPython recognises these, which CPython doesn't.

The "differences" document says:
"""
IronPython's _codecs module implementation is incomplete. There are several replace_error/lookup_error handlers that IronPython does not implement.
"""
It is not apparent whether this is intended to mean that the missing error handlers are the *only* known deficiency.

IronPython Bug #3214 mentions "import encodings" as fixing a LookupError. Well, you learn something new every day:

1. CPython permits one to import encodings, but it's not documented AFAICT, and it's *not* necessary in order to use 'gbk', 'mac_roman', etc.
2. After import encodings, IronPython recognises 'mac_roman' and 'mac_greek', but still not 'gbk'.

How much of the above is bug and how much is feature? What is this mysterious encodings module anyway? Does this mean the CPython test suite doesn't cover the above cases? Are the equivalences ('mac_roman' and 'cp10000', etc.) correct and official? Should I just dump all of the above into the IronPython Issue Tracker?

Cheers,
John