New submission from Nick Coghlan:
Issue 20404 points out that io.TextIOWrapper can't be used with binary
transform codecs like bz2 because the types are wrong.
By contrast, codecs.open() still defaults to working in binary mode, and just
switches to returning a different type based on the specified encoding (exactly
the kind of value-driven output type changes we're trying to eliminate from the
core text model):
>>> import codecs
>>> print(codecs.open('hex.txt').read())
b'aabbccddeeff'
>>> print(codecs.open('hex.txt', encoding='hex').read())
b'\xaa\xbb\xcc\xdd\xee\xff'
>>> print(codecs.open('hex.txt', encoding='utf-8').read())
aabbccddeeff
While for 3.4, I plan to just extend the issue 19619 blacklist to also cover
TextIOWrapper (and hence open()), it seems to me that there is a valid use case
for bytes-to-bytes transform support directly in the IO stack.
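As a rough illustration of the bytes-to-bytes case, the sketch below uses only the existing codecs.encode() and codecs.decode() functions with the "hex_codec" and "bz2_codec" names. It assumes Python 3.4 or later, where bytes.decode() rejects non-text codecs with a LookupError; the variable names are only for illustration.

```python
import codecs

payload = b"aabbccddeeff"

# Bytes in, bytes out: decoding with "hex_codec" turns hex digits into raw bytes.
raw = codecs.decode(payload, "hex_codec")

# The bz2 codec compresses on encode and decompresses on decode.
compressed = codecs.encode(raw, "bz2_codec")
restored = codecs.decode(compressed, "bz2_codec")

# The convenience method bytes.decode() only accepts text encodings
# (assumption: Python 3.4+ behaviour), so the same codec is refused here.
try:
    payload.decode("hex_codec")
    rejected = False
except LookupError:
    rejected = True

print(raw, restored == raw, rejected)
```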
A PEP for 3.5 could propose:
- providing a public API that allows codecs to be classified into at least the
  following groups ("binary" = memoryview compatible data exporters, including
  both bytes and bytearray):
  - text encodings (decodes binary to str, encodes str to bytes)
  - binary transforms (decodes *and* encodes binary to bytes)
  - text transforms (decodes and encodes str to str)
  - hybrid transforms (acts as both a binary transform *and* as a text
    transform)
  - hybrid encodings (decodes binary and potentially str to str, encodes binary
    and str to bytes)
  - arbitrary encodings (decodes and encodes object to object, without fitting
    any of the above categories)
- adding io.BinaryTransformWrapper that applies binary transforms when reading
and writing data (similar to the way TextIOWrapper applies text encodings)
- adding a "transform" parameter to open that inserts BinaryTransformWrapper
into the stack at the appropriate place (the PEP process would need to decide
between supporting just a single transform per stream or multiple). In text
mode, TextIOWrapper would be added to the stack after any binary transforms.
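
A minimal sketch of what such a wrapper might look like follows. The class below is purely hypothetical (no BinaryTransformWrapper exists in the io module), and it assumes whole payloads are transformed in one step with codecs.encode() and codecs.decode(); a real implementation would presumably use the incremental codec interfaces and the full io base class API.

```python
import codecs
import io


class BinaryTransformWrapper:
    """Illustrative stand-in only; no such class exists in the io module."""

    def __init__(self, buffer, transform):
        self.buffer = buffer          # underlying binary stream
        self.transform = transform    # name of a bytes-to-bytes codec

    def write(self, data):
        # Apply the transform's encode step, then pass the result down.
        self.buffer.write(codecs.encode(data, self.transform))
        return len(data)

    def read(self):
        # Read everything that remains and apply the decode step in one go.
        return codecs.decode(self.buffer.read(), self.transform)


# Writing: raw bytes go in, hex digits land in the underlying stream.
raw = io.BytesIO()
BinaryTransformWrapper(raw, "hex_codec").write(b"\xaa\xbb\xcc")
stored = raw.getvalue()

# Reading: hex digits come out of the stream, raw bytes come back.
raw.seek(0)
restored = BinaryTransformWrapper(raw, "hex_codec").read()

# In text mode the text decoding step would run after the binary transform;
# here that layering is imitated with a plain bytes.decode() call.
text = BinaryTransformWrapper(io.BytesIO(b"68656c6c6f"), "hex_codec").read().decode("utf-8")
```

Because this stand-in does not implement the buffered IO interface, it cannot be handed to io.TextIOWrapper directly; the final line only imitates the ordering described above.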
Optionally, the idea could also be extended to adding io.TextTransformWrapper
and a "text_transform" parameter, but those seem somewhat less useful.
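
The codec groups in the first item above could be approximated today by probing what a codec's encode step accepts. The helper below is a heuristic sketch, not the proposed public API: probe_codec is a made-up name, it only distinguishes three of the listed groups, and it relies on the codecs rejecting unsupported input types with an exception.

```python
import codecs


def probe_codec(name):
    """Heuristically classify a codec by what its encode step accepts."""

    def result_type(sample):
        try:
            return type(codecs.encode(sample, name))
        except Exception:
            return None  # the codec rejected this input type

    from_str = result_type("abc")
    from_bytes = result_type(b"abc")

    if from_str is bytes and from_bytes is None:
        return "text encoding"
    if from_bytes is bytes and from_str is None:
        return "binary transform"
    if from_str is str and from_bytes is None:
        return "text transform"
    return "other"


for codec_name in ("utf-8", "hex_codec", "bz2_codec", "rot_13"):
    print(codec_name, "->", probe_codec(codec_name))
```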
----------
components: IO, Interpreter Core, Library (Lib)
messages: 209398
nosy: benjamin.peterson, ezio.melotti, haypo, hynek, lemburg, ncoghlan, pitrou,
serhiy.storchaka, stutzbach
priority: normal
severity: normal
stage: needs patch
status: open
title: Add io.BinaryTransformWrapper and a "transform" parameter to open()
type: enhancement
versions: Python 3.5
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue20405>
_______________________________________