Julien added the comment:
Hi John, thanks for your contribution,
Looks like your implementation is missing some codepoints, like "\t":
>>> print("\t".encode(encoding='iso6937'))
[...]
UnicodeError: encoding with 'iso6937' codec failed (UnicodeError:
Unacceptable utf-8 character)
Probably due to the "range(0x20, "…, why `0x20`?
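
One way to spot gaps like this is to loop over a range of code points and
collect the ones the codec refuses to encode. This is only a sketch: the
helper name `unencodable` is made up for illustration, and since `iso6937` is
not a stdlib codec, the built-in `ascii` codec stands in for it here.

```python
def unencodable(codec, codepoints=range(0x80)):
    """Return the code points in `codepoints` that `codec` cannot encode."""
    missing = []
    for cp in codepoints:
        try:
            chr(cp).encode(codec)
        except UnicodeError:
            # Covers UnicodeEncodeError as well as a plain UnicodeError
            # raised by a custom codec.
            missing.append(cp)
    return missing


# 'ascii' is a stand-in; with the patch applied one would pass 'iso6937'.
print([hex(cp) for cp in unencodable("ascii", range(0x7E, 0x82))])
```

Run against the patched codec, any control character such as `\t` (0x09)
showing up in the result would point at the table starting too high.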
You also have problems decoding multibyte sequences, as you're missing the
`else: … result += chr(c[0])` branch in this case. So typically decoding
`\xc2\x20` will raise a `KeyError`, as `\x20` is _not_ in your decoding table.
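
To illustrate the kind of handling I mean, here is a minimal sketch of a
decoding loop that checks each lookup and raises a `UnicodeDecodeError` rather
than letting a `KeyError` escape. It is not your patch: the names
`decode_sketch`, `SINGLE`, `DOUBLE` and `LEAD_BYTES` are hypothetical, and the
few table entries are illustrative assumptions that should be checked against
the real charmap.

```python
# Illustrative subset only; a real table would be built from the charmap.
SINGLE = {0x09: "\t", 0x20: " ", 0x41: "A", 0x65: "e"}
# Assumed entries: 0xC2 is taken to be the non-spacing acute accent.
DOUBLE = {(0xC2, 0x65): "\u00e9", (0xC2, 0x20): "\u00b4"}
LEAD_BYTES = {0xC2}


def decode_sketch(data):
    """Decode `data` with the toy tables above, never leaking a KeyError."""
    result = []
    i = 0
    while i < len(data):
        byte = data[i]
        if byte in LEAD_BYTES:
            if i + 1 >= len(data):
                raise UnicodeDecodeError(
                    "iso6937", data, i, i + 1, "truncated multibyte sequence")
            pair = (byte, data[i + 1])
            if pair not in DOUBLE:
                raise UnicodeDecodeError(
                    "iso6937", data, i, i + 2, "invalid multibyte sequence")
            result.append(DOUBLE[pair])
            i += 2
        elif byte in SINGLE:
            result.append(SINGLE[byte])
            i += 1
        else:
            raise UnicodeDecodeError(
                "iso6937", data, i, i + 1, "byte not in decoding table")
    return "".join(result)


print(repr(decode_sketch(b"\xc2\x20")))
```

The point is only that every lookup is guarded, so a missing entry is
reported as a decoding error with a position instead of a bare `KeyError`.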
Also, please make your contribution conform to PEP 8: you're missing spaces
after commas and you're sometimes indenting with 8 spaces instead of 4.
I implemented a simple checker based on glibc localedata; it clearly shows your
decoding problems step by step, and should be easy to extend to check your
encoding function too (see attachment). It uses the ISO6937 file typically
found in the Debian locales package or via 'apt-get source glibc'.
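
For readers without the attachment at hand, the general idea can be sketched
as below. This is not the attached check_iso6937.py: the function names and
the two-entry `SAMPLE` are made up for illustration, and the parser assumes
the usual `<Uxxxx> /xNN...` line layout between `CHARMAP` and `END CHARMAP`
(range entries and other directives are simply skipped).

```python
import re

# Matches lines such as "<U00B4>     /xc2/x20     ACUTE ACCENT".
_ENTRY = re.compile(r"^<U([0-9A-Fa-f]{4,8})>\s+((?:/x[0-9A-Fa-f]{2})+)")


def parse_charmap(text):
    """Map each byte sequence in a charmap to its Unicode character."""
    table = {}
    in_charmap = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped == "CHARMAP":
            in_charmap = True
            continue
        if stripped == "END CHARMAP":
            break
        if not in_charmap:
            continue
        match = _ENTRY.match(stripped)
        if match:
            hex_bytes = match.group(2).split("/x")[1:]
            raw = bytes(int(h, 16) for h in hex_bytes)
            table[raw] = chr(int(match.group(1), 16))
    return table


def check_decoder(table, decode):
    """Return (bytes, expected, actual) for every entry `decode` gets wrong."""
    failures = []
    for raw, expected in table.items():
        try:
            actual = decode(raw)
        except Exception as exc:  # report the error instead of stopping
            actual = exc
        if actual != expected:
            failures.append((raw, expected, actual))
    return failures


# Hypothetical two-line excerpt, for demonstration only.
SAMPLE = """\
<code_set_name> ISO_6937
CHARMAP
<U0009>     /x09         CHARACTER TABULATION
<U00B4>     /xc2/x20     ACUTE ACCENT
END CHARMAP
"""

table = parse_charmap(SAMPLE)
# latin-1 stands in for the codec under test.
for failure in check_decoder(table, lambda raw: raw.decode("latin-1")):
    print(failure)
```

Swapping the lambda for `lambda raw: raw.decode("iso6937")` would exercise the
patched codec, and the same table can be walked in the other direction to
check encoding.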
----------
nosy: +sizeof
Added file: http://bugs.python.org/file45478/check_iso6937.py
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue24339>
_______________________________________