[issue19619] Blacklist base64, hex, ... codecs from bytes.decode() and str.encode()

Nick Coghlan Sat, 16 Nov 2013 01:17:18 -0800

Nick Coghlan added the comment:

The full input/output type specifications can't be implemented sensibly without 
also defining at least a ByteSequence ABC. While I think it's a good idea in 
the long run, there's no feasible way to design such a system in the time 
remaining before the Python 3.4 feature freeze.


However, we could do something much simpler as a blacklist API:

    def is_unicode_codec(name):
        """Returns true if this is the name of a known Unicode text encoding"""

    def set_as_non_unicode(name):
        """Indicates that the named codec is not a Unicode codec"""

The codecs module would then just maintain an internal set of all the names 
explicitly flagged as non-Unicode.
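
As a rough illustration only, the two functions could be backed by a plain 
set. The module-level set, the _normalize helper and the choice of codecs to 
flag are assumptions made for this sketch. Unlike the docstring above, it 
treats every codec that has not been flagged as a Unicode codec, and an 
unknown name raises LookupError from codecs.lookup():

```python
import codecs

# Hypothetical registry of codec names flagged as non-Unicode.
_non_unicode_codecs = set()


def _normalize(name):
    """Map aliases such as "base_64" onto the codec's canonical name."""
    return codecs.lookup(name).name


def set_as_non_unicode(name):
    """Indicate that the named codec is not a Unicode codec."""
    _non_unicode_codecs.add(_normalize(name))


def is_unicode_codec(name):
    """Return True unless the named codec has been flagged as non-Unicode."""
    return _normalize(name) not in _non_unicode_codecs


# Flag a few of the stdlib binary and text transforms (illustrative choice).
for _name in ("base64", "hex", "rot13"):
    set_as_non_unicode(_name)
```

With this in place, str.encode and bytes.decode would only need to call 
is_unicode_codec(name) before dispatching to the codec.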

Such an API remains useful even if the input/output type support is added in 
Python 3.5 (since "codecs.is_unicode_codec(name)" is a bit simpler to explain 
than the exact type restrictions).

Alternatively, implementing just the "encodes_to" and "decodes_to" attributes 
would be enough for str.encode, bytes.decode and bytearray.decode to reject 
known bad encodings early, leaving the input type checks to the codecs 
themselves for now (it is correctly defining "encode_from" and "decode_from" 
for many stdlib codecs that would need the ByteSequence ABC).
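
A minimal sketch of that early rejection, under stated assumptions: the 
"encodes_to" and "decodes_to" attributes do not exist on the real 
codecs.CodecInfo, so they are attached here by hand to a made-up "demo_hex" 
codec, and checked_decode is a hypothetical stand-in for what bytes.decode 
would do. Codecs without the attribute are assumed to be text encodings:

```python
import binascii
import codecs


def _demo_search(name):
    """Search function for a made-up bytes-to-bytes codec (illustration only)."""
    if name != "demo_hex":
        return None
    info = codecs.CodecInfo(
        encode=lambda data, errors="strict": (binascii.hexlify(data), len(data)),
        decode=lambda data, errors="strict": (binascii.unhexlify(data), len(data)),
        name="demo_hex",
    )
    # Assumed attribute names taken from the proposal; not a real CodecInfo API.
    info.encodes_to = bytes
    info.decodes_to = bytes
    return info


codecs.register(_demo_search)


def checked_decode(data, encoding="utf-8", errors="strict"):
    """Decode to str, rejecting codecs that declare a non-str output type."""
    info = codecs.lookup(encoding)
    # Only the output type is checked here; input checks stay in the codec.
    if getattr(info, "decodes_to", str) is not str:
        raise LookupError("%r is not a text encoding" % encoding)
    return info.decode(data, errors)[0]
```

Here checked_decode(b"abc") returns a str as usual, while 
checked_decode(b"6869", "demo_hex") raises LookupError before the codec is 
run; codecs.decode(b"6869", "demo_hex") still works for the bytes-to-bytes 
case.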

----------

_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue19619>
_______________________________________