New submission from STINNER Victor:

The codecs.StreamReaderWriter() class still has long-standing unfixed issues, 
such as issue #12508 (open since 2011). The owasp-pysec project even treats 
this issue as a security vulnerability:
https://github.com/ebranca/owasp-pysec/wiki/Unicode-string-silently-truncated

I propose to modify codecs.open() to reuse the io module: call io.open() with 
newline=''. The io module is now battle-tested and handles many corner cases 
of incremental codecs with multibyte encodings well.
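
To make the idea concrete, here is a rough sketch of what such a wrapper could 
look like. It is not the code from the attached PR: the name open_via_io is 
hypothetical, the parameters simply mirror the documented codecs.open() 
arguments, and the buffering default of -1 is an assumption.

```python
import io


def open_via_io(filename, mode='r', encoding=None, errors='strict',
                buffering=-1):
    """Hypothetical stand-in for codecs.open() built on io.open().

    Sketch only: the real change may handle modes and buffering differently.
    """
    if encoding is None:
        # No codec requested: hand the call to io.open() unchanged.
        return io.open(filename, mode, buffering)
    # A codec was requested, so the file must be opened in text mode.
    # Callers of codecs.open() may pass 'rb' or 'wb'; drop the 'b' flag.
    text_mode = mode.replace('b', '')
    # newline='' disables newline translation, so '\r\n' and '\n' are
    # read and written exactly as they appear in the data.
    return io.open(filename, text_mode, buffering,
                   encoding=encoding, errors=errors, newline='')
```

With newline='', a string containing '\r\n' is written and read back 
unchanged, which is the behavior callers of codecs.open() would expect to 
keep.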

With this change, codecs.open() cannot be used with non-text encodings... but 
I'm not sure that this feature ever worked in Python 3:

$ ./python -bb
Python 3.7.0a0
>>> import codecs
>>> f = codecs.open('test', 'w', encoding='rot13')
>>> f.write('hello')
TypeError: a bytes-like object is required, not 'str'
>>> f.write(b'hello')
TypeError: a bytes-like object is required, not 'dict'
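
For comparison, a small sketch of how the io module treats such codecs. The 
helper name accepted_by_io is hypothetical. It relies only on the fact that 
io.TextIOWrapper raises LookupError for a codec that is not a text encoding, 
while codecs.encode() can still apply the same codec directly to a string.

```python
import codecs
import io


def accepted_by_io(encoding):
    """Return True if io.TextIOWrapper accepts the given codec name.

    Hypothetical helper for illustration; it wraps an in-memory buffer so
    that no file is created.
    """
    try:
        io.TextIOWrapper(io.BytesIO(), encoding=encoding)
    except LookupError:
        # Raised for unknown codecs and for non-text codecs such as rot13.
        return False
    return True


print(accepted_by_io('utf-8'))
print(accepted_by_io('rot13'))
# str-to-str codecs remain usable through the one-shot helper functions.
print(codecs.encode('hello', 'rot13'))
```

So code that needs rot13 or a similar transform could keep calling 
codecs.encode() and codecs.decode() on whole strings instead of going through 
a file object.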

The next step would be to deprecate the codecs.StreamReaderWriter class and 
codecs.open(). But my latest attempt to deprecate them, PEP 400, wasn't a 
full success, so I now prefer to move step by step :-)

Attached PR:

* Modify codecs.open() to use io.open()
* Remove "; use codecs.open() to handle arbitrary codecs" from io.open() and 
_pyio.open() error messages
* Replace codecs.open() with open() at various places

----------
components: Unicode
messages: 289362
nosy: ezio.melotti, haypo, lemburg, serhiy.storchaka
priority: normal
severity: normal
status: open
title: Modify codecs.open() to use the io module instead of 
codecs.StreamReaderWriter()
versions: Python 3.7

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue29783>
_______________________________________