[issue19051] Unify buffered readers

2015-04-18 Thread Martin Panter

Martin Panter added the comment:

The lzma, gzip and bz2 modules now all use BufferedReader, so Serhiy’s patch 
is no longer relevant for them. Serhiy’s patch also changed the zipfile module, 
which may still be relevant. On the other hand, perhaps it would be better 
to use BufferedReader for the zipfile module as well.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19051
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19051] Unify buffered readers

2015-04-13 Thread Antoine Pitrou

Antoine Pitrou added the comment:

Is it still relevant, now that #23529 has been committed?




[issue19051] Unify buffered readers

2015-01-10 Thread Martin Panter

Martin Panter added the comment:

Parts of the patch here actually do the same thing as my LZMAFile patch for 
Issue 15955. I wish I had looked at the patch earlier! The difference is that I 
used a proposed max_length parameter for the decompressor rather than unlimited 
decompression, and I used the existing BufferedReader class rather than 
implementing a new custom one.

The changes for the “gzip” module could probably be merged with my GzipFile 
patch at Issue 15955 and made to use BufferedReader.
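
To illustrate the BufferedReader approach described above, here is a minimal sketch. It is not the patch from Issue 15955: the class name ZlibRawReader and its chunk_size parameter are made up for this example, and it uses zlib.decompressobj() (whose decompress() already accepts max_length) in place of the gzip or lzma decompressors. The raw stream never produces more than the caller's buffer can hold, and io.BufferedReader supplies read(), readline() and peek() on top of it.

```python
import io
import zlib


class ZlibRawReader(io.RawIOBase):
    """Hypothetical raw stream: decompresses at most len(b) bytes per call."""

    def __init__(self, fileobj, chunk_size=8192):
        self._fileobj = fileobj
        self._chunk_size = chunk_size
        self._decomp = zlib.decompressobj()

    def readable(self):
        return True

    def readinto(self, b):
        with memoryview(b) as view, view.cast("B") as byte_view:
            limit = len(byte_view)
            if limit == 0:
                return 0  # max_length=0 would mean "unlimited" to zlib
            while not self._decomp.eof:
                # Reuse input the decompressor has not consumed yet,
                # otherwise fetch another compressed chunk.
                raw = (self._decomp.unconsumed_tail
                       or self._fileobj.read(self._chunk_size))
                data = self._decomp.decompress(raw, limit)
                if data:
                    byte_view[:len(data)] = data
                    return len(data)
                if not raw:
                    raise EOFError("compressed stream ended before its "
                                   "end-of-stream marker")
            return 0  # end of the compressed stream


payload = b"spam and eggs\n" * 100000
reader = io.BufferedReader(ZlibRawReader(io.BytesIO(zlib.compress(payload))))
first = reader.readline()
rest = reader.read()
```

The buffering logic lives entirely in the existing BufferedReader class; the compression-specific part is reduced to one readinto() method.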




[issue19051] Unify buffered readers

2015-01-09 Thread Martin Panter

Martin Panter added the comment:

For what it’s worth, it would be better if compressed streams did limit the 
amount of data they decompressed, so that they are not susceptible to 
decompression bombs; see Issue 15955. But having a flexible-sized buffer could 
be useful in other cases.
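
As a rough illustration of limiting decompressed output, the sketch below uses the max_length argument that zlib's decompression objects already provide. The helper name decompress_bounded and the choice to raise ValueError are assumptions for this example only.

```python
import zlib


def decompress_bounded(data, limit):
    """Decompress a zlib stream, refusing to produce more than limit bytes."""
    decomp = zlib.decompressobj()
    # Ask for one byte more than allowed so that overflow is detectable.
    out = decomp.decompress(data, limit + 1)
    if len(out) > limit:
        raise ValueError("decompressed data exceeds %d bytes" % limit)
    if not decomp.eof:
        raise ValueError("compressed data is truncated")
    return out


# Ten million zero bytes compress to a tiny payload (a small "bomb").
bomb = zlib.compress(b"\0" * 10000000)
small = decompress_bounded(zlib.compress(b"hello"), 1024)
try:
    decompress_bounded(bomb, 1024 * 1024)
    rejected = False
except ValueError:
    rejected = True
```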

I haven’t looked closely at the code, but I wonder if there is much difference 
from the existing BufferedReader. Perhaps just that the underlying raw stream 
in this case can deliver data in arbitrary-sized chunks, but BufferedReader 
expects its raw stream to deliver data in limited-sized chunks?

If you exposed the buffer, it could be used to do many things more efficiently:

* readline() with custom newline or end-of-record codes, solving Issue 1152248, 
Issue 17083
* scan the buffer using string operations or regular expressions etc., e.g. to 
skip whitespace, read a run of unescaped symbols
* tentatively read data to see if a keyword is present, but roll back if the 
data doesn’t match the keyword



[issue19051] Unify buffered readers

2014-12-20 Thread Martin Panter

Changes by Martin Panter vadmium...@gmail.com:


--
nosy: +vadmium




[issue19051] Unify buffered readers

2013-09-20 Thread Antoine Pitrou

Antoine Pitrou added the comment:

If you want this, I think it should be somehow folded into existing classes 
(for example BufferedIOBase). Yet another implementation of readline() isn't 
really a good idea.




[issue19051] Unify buffered readers

2013-09-19 Thread Serhiy Storchaka

New submission from Serhiy Storchaka:

There are some classes in the gzip, bz2, lzma, and zipfile modules which 
implement the buffered reader interface. They read chunks of data from the 
underlying file object, decompress them, save the result in an internal buffer, 
and provide common methods to read from this buffer. Maintaining duplicated 
code is cumbersome and error-prone. The proposed preliminary patch moves the 
common code into a new private class, _io2._BufferedReaderMixin. If the 
proposal is accepted in general, I'm going to write a C version and move it 
into the io module, and perhaps even then merge it with io.BufferedIOBase.

The idea is that all buffered reading functions (read(), read1(), readline(), 
peek(), etc.) can be expressed in terms of one function which returns raw 
unbuffered data. Subclasses need to define only this one function and will get 
the whole buffered reader interface. In the case of the classes mentioned 
above, this function reads and decompresses a chunk of data from the underlying 
file. The HTTPResponse class will perhaps benefit too (issue19009).
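
The idea can be sketched roughly as follows. This is not the code from buffered_reader.diff: the names BufferedReaderMixin, _read_chunk, _fill and ZlibReader are hypothetical, only a few methods are shown, and zlib stands in for the real decompressors. Every buffered method is written against a single hook that returns the next chunk of raw data.

```python
import io
import zlib


class BufferedReaderMixin:
    """Hypothetical mixin: buffered reads built on one chunk-producing hook."""

    def _init_buffer(self):
        self._buf = b""
        self._eof = False

    def _read_chunk(self):
        """Return the next non-empty chunk of raw data, or b"" at EOF."""
        raise NotImplementedError

    def _fill(self):
        # Pull one more chunk into the internal buffer; False means EOF.
        if self._eof:
            return False
        chunk = self._read_chunk()
        if not chunk:
            self._eof = True
            return False
        self._buf += chunk
        return True

    def read(self, size=-1):
        if size is None or size < 0:
            while self._fill():
                pass
            size = len(self._buf)
        else:
            while len(self._buf) < size and self._fill():
                pass
        data, self._buf = self._buf[:size], self._buf[size:]
        return data

    def peek(self, size=0):
        if not self._buf:
            self._fill()
        return self._buf

    def readline(self):
        while True:
            end = self._buf.find(b"\n")
            if end >= 0:
                return self.read(end + 1)
            if not self._fill():
                return self.read()  # last line without a newline


class ZlibReader(BufferedReaderMixin):
    """Example subclass: the only compression-specific code is _read_chunk()."""

    def __init__(self, fileobj, chunk_size=8192):
        self._init_buffer()
        self._fileobj = fileobj
        self._chunk_size = chunk_size
        self._decomp = zlib.decompressobj()

    def _read_chunk(self):
        while True:
            raw = self._fileobj.read(self._chunk_size)
            if not raw:
                return b""
            # No max_length here, so the chunk size is unbounded.
            data = self._decomp.decompress(raw)
            if data:
                return data


payload = b"line one\nline two\n" + b"x" * 50000
r = ZlibReader(io.BytesIO(zlib.compress(payload)), chunk_size=64)
```

Note that the sketch decompresses each chunk without a size limit, which matches the unlimited-buffer design described in this message.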

--
components: IO, Library (Lib)
files: buffered_reader.diff
keywords: patch
messages: 198075
nosy: alanmcintyre, benjamin.peterson, nadeem.vawda, pitrou, serhiy.storchaka, 
stutzbach
priority: normal
severity: normal
stage: patch review
status: open
title: Unify buffered readers
type: enhancement
versions: Python 3.4
Added file: http://bugs.python.org/file31815/buffered_reader.diff




[issue19051] Unify buffered readers

2013-09-19 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Here are a benchmark script and its results.

--
Added file: http://bugs.python.org/file31816/read_bench.py




[issue19051] Unify buffered readers

2013-09-19 Thread Serhiy Storchaka

Changes by Serhiy Storchaka storch...@gmail.com:


Added file: http://bugs.python.org/file31817/read_bench_cmp




[issue19051] Unify buffered readers

2013-09-19 Thread Antoine Pitrou

Antoine Pitrou added the comment:

See issue12053 for a more flexible primitive.




[issue19051] Unify buffered readers

2013-09-19 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

This primitive does not fit the case of compressed streams well. A chunk of 
compressed data read from the underlying file object can decompress to an 
unpredictably large amount of data. We can't limit the size of the buffer.

Another point is that the buffer interface is not very appropriate for a Python 
implementation, and we want to leave as much Python code in gzip, bz2, lzma and 
zipfile as possible. Copying from bytes into a buffer and back will just waste 
resources.




[issue19051] Unify buffered readers

2013-09-19 Thread Serhiy Storchaka

Changes by Serhiy Storchaka storch...@gmail.com:


--
Removed message: http://bugs.python.org/msg198083
