Nir Aides n...@winpdb.org added the comment:
I actually meant how would you simulate zlib's absence on a system in which it
is present?
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue7610
Ezio Melotti ezio.melo...@gmail.com added the comment:
The easiest way is to set zlib to None or not import it at all.
Are you suggesting that test_zipfile should be always run with and without zlib
to check that everything (except the things that require zlib of course) works
in both the
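The suggestion above (setting zlib to None) could be tried roughly as follows. This is only a sketch: it assumes that the zipfile module keeps a module-level name `zlib` and treats a false value as "zlib missing", which is a CPython implementation detail. The helper names `roundtrip_stored` and `deflate_is_rejected` are made up for illustration.

```python
import io
import zipfile
from unittest import mock


def roundtrip_stored(payload):
    """Write and read back one member without compression (ZIP_STORED)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
        zf.writestr("member.txt", payload)
    with zipfile.ZipFile(buf) as zf:
        return zf.read("member.txt")


def deflate_is_rejected():
    """Return True if asking for ZIP_DEFLATED raises RuntimeError."""
    try:
        zipfile.ZipFile(io.BytesIO(), "w", zipfile.ZIP_DEFLATED)
    except RuntimeError:
        return True
    return False


# Assumption: zipfile looks up its own module-level `zlib` name at call time,
# so replacing it with None simulates a build without zlib.
with mock.patch.object(zipfile, "zlib", None):
    stored_ok = roundtrip_stored(b"hello") == b"hello"
    deflated_rejected = deflate_is_rejected()
```

The patch is undone when the `with` block exits, so other tests in the same process still see the real zlib.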
Ezio Melotti ezio.melo...@gmail.com added the comment:
Apparently that part of code is already tested in other tests that use deflated
mode, so I'll close this again. Thanks for the info.
--
stage: test needed -> committed/rejected
status: open -> closed
Changes by Antoine Pitrou pit...@free.fr:
--
assignee:  -> pitrou
resolution:  -> accepted
versions: +Python 2.7
Antoine Pitrou pit...@free.fr added the comment:
The patch has been committed in r77798 (trunk) and r77800 (py3k). Thank you!
I won't commit it to 2.6 and 3.1 because it's too involved to qualify as a bug
fix, though.
--
resolution: accepted -> fixed
stage: patch review ->
Ezio Melotti ezio.melo...@gmail.com added the comment:
Nir, could you also provide a test for the part that handles unconsumed data
(line 601 in zipfile.py)?
In r77809 (and r77810) I made a change to avoid using zlib when it's not
necessary (zlib is not always available), and I was going to
Nir Aides n...@winpdb.org added the comment:
The related scenario is a system without zlib. How do you suggest simulating
this in test?
--
Ezio Melotti ezio.melo...@gmail.com added the comment:
The test should just check that the part that handles unconsumed data works
when zlib is available. AFAIU if zlib is not available this part (i.e. the
content of the if) can be skipped so it doesn't need to be tested.
(When zlib is not
Nir Aides n...@winpdb.org added the comment:
Unconsumed data is compressed data. If the part which handles unconsumed data
does not work when zlib is available, then the existing tests would fail. In
any case the unconsumed buffer is an implementation detail of zipfile.
I see a point in
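For readers unfamiliar with the term, "unconsumed data" here is compressed input that the decompressor has not processed yet. The sketch below shows the general zlib mechanism only (`unconsumed_tail` after a size-limited `decompress()` call); it is not the code from zipfile.py, and the 100-byte limit and the payload are arbitrary choices for illustration.

```python
import zlib

payload = b"x" * 10000
compressed = zlib.compress(payload)

decomp = zlib.decompressobj()
pieces = []

# Ask for at most 100 bytes of output per call. Input that was not
# processed yet stays available in decomp.unconsumed_tail.
chunk = decomp.decompress(compressed, 100)
pieces.append(chunk)
while decomp.unconsumed_tail:
    chunk = decomp.decompress(decomp.unconsumed_tail, 100)
    pieces.append(chunk)

# flush() returns whatever output is still pending inside the decompressor.
tail = decomp.flush()
restored = b"".join(pieces) + tail
```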
Ezio Melotti ezio.melo...@gmail.com added the comment:
> If the part which handles unconsumed data does not work when zlib is
> available, then the existing tests would fail.
If the existing tests end up testing that part of code too then it's probably
fine. I tried to add a print inside the
Nir Aides n...@winpdb.org added the comment:
Uploaded an updated patch with read() which calls underlying stream enough
times to satisfy required read size.
--
Added file: http://bugs.python.org/file15941/zipfile_7610_py27_v5.diff
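The idea of a read() that keeps calling the underlying stream until the requested size is satisfied can be sketched like this. It is not the code from the attached patch; `ShortStream` and `read_fully` are hypothetical names, and the stream deliberately returns short reads to exercise the loop.

```python
class ShortStream:
    """Hypothetical stream whose read1() returns at most `step` bytes."""

    def __init__(self, data, step=3):
        self._data = data
        self._pos = 0
        self._step = step

    def read1(self, n):
        chunk = self._data[self._pos:self._pos + min(n, self._step)]
        self._pos += len(chunk)
        return chunk


def read_fully(stream, n):
    """Call stream.read1() repeatedly until n bytes are read or EOF is hit."""
    parts = []
    remaining = n
    while remaining > 0:
        chunk = stream.read1(remaining)
        if not chunk:  # empty result means end of stream
            break
        parts.append(chunk)
        remaining -= len(chunk)
    return b"".join(parts)
```

A shorter result than requested is then only possible at end of stream.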
Antoine Pitrou pit...@free.fr added the comment:
The patch looks rather good. Is `self.MAX_N` still necessary in read()? I guess
it's rare to read more than 2GB at once, though...
--
Nir Aides n...@winpdb.org added the comment:
Right, removed MAX_N from read(); remains in read1().
If good, what versions of Python is this patch desired for?
--
Added file: http://bugs.python.org/file15949/zipfile_7610_py27_v6.diff
Nir Aides n...@winpdb.org added the comment:
> I do not find the existing phrasing in the IO docs ambiguous, but since
> it is obviously possible to misinterpret it, it would be good to clarify
> it. Can you suggest an alternate phrasing that would be clearer?
Replace 'may' with 'will' or
Antoine Pitrou pit...@free.fr added the comment:
> Replace 'may' with 'will' or 'shall' everywhere the context indicates
> a mandatory requirement.
> Since this possibly affects the entire Python documentation, does it
> make sense to discuss this on python-dev?
Either that, or open a separate
Antoine Pitrou pit...@free.fr added the comment:
> The documentation of io.BufferedIOBase.read() reads "multiple raw
> reads may be issued to satisfy the byte count". I understood this
> language to mean satisfying read size is optional. Isn't it?
It's the reverse actually. It means that
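The behaviour under discussion can be observed with the standard io classes. The following is a small experiment, not part of the thread: `ShortRaw` is a hypothetical raw stream that returns at most 4 bytes per call, so a buffered read(10) has to issue several raw reads, while read1(10) issues at most one.

```python
import io


class ShortRaw(io.RawIOBase):
    """Hypothetical raw stream handing out at most `step` bytes per call."""

    def __init__(self, data, step=4):
        super().__init__()
        self._data = data
        self._pos = 0
        self._step = step
        self.calls = 0  # number of raw reads issued so far

    def readable(self):
        return True

    def readinto(self, b):
        self.calls += 1
        chunk = self._data[self._pos:self._pos + min(self._step, len(b))]
        b[:len(chunk)] = chunk
        self._pos += len(chunk)
        return len(chunk)


alphabet = b"abcdefghijklmnopqrstuvwxyz"

# read(10): the buffered layer keeps reading until 10 bytes are available.
raw = ShortRaw(alphabet)
data = io.BufferedReader(raw).read(10)
calls_for_read = raw.calls

# read1(10): at most one raw read, so the result may be shorter than 10.
raw2 = ShortRaw(alphabet)
partial = io.BufferedReader(raw2).read1(10)
calls_for_read1 = raw2.calls
```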
Nir Aides n...@winpdb.org added the comment:
May be a good idea to clear this up in the documentation.
http://en.wiktionary.org/wiki/may#Verb
(modal auxiliary verb, defective) To have permission to. Used in granting
permission and in questions to make polite requests.
--
R. David Murray rdmur...@bitdance.com added the comment:
I do not find the existing phrasing in the IO docs ambiguous, but since it is
obviously possible to misinterpret it, it would be good to clarify it. Can you
suggest an alternate phrasing that would be clearer?
--
nosy:
Nir Aides n...@winpdb.org added the comment:
I uploaded an update for Python 2.7.
> * you should probably write `n = sys.maxsize` instead of `n = 1 << 31 - 1`
sys.maxsize is 64 bit number on my system but the maximum value accepted by
zlib's decompress() seems to be INT_MAX defined in pyport.h
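The two limits being compared can be checked directly. This sketch assumes a platform where the C `int` is 32 bits wide (so INT_MAX is 2**31 - 1); `INT_MAX_32` and `clamp_read_size` are illustrative names, not identifiers from zipfile.

```python
import sys

# Assumption: C int is 32 bits, so INT_MAX is 2**31 - 1.
INT_MAX_32 = (1 << 31) - 1

# Pitfall: without parentheses the subtraction binds tighter than the shift,
# so this evaluates to 1 << 30 rather than 2**31 - 1.
without_parentheses = 1 << 31 - 1


def clamp_read_size(n, limit=INT_MAX_32):
    """Cap a requested size so it fits in a 32-bit C int."""
    return min(n, limit)


# On a 64-bit build sys.maxsize is far larger than INT_MAX_32.
clamped = clamp_read_size(sys.maxsize)
```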
Antoine Pitrou pit...@free.fr added the comment:
Some comments:
* you should probably write `n = sys.maxsize` instead of `n = 1 << 31 - 1`
* ZipExtFile.read() should support `n=None` as a synonym to `n=-1` (read
everything)
* `bytes` as a variable name isn't very good since it's the built-in name
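Two of the review points above, sketched on a hypothetical helper (`read_some` is not a zipfile function): `None` is treated like `-1`, and the local variable is called `data` so that the built-in `bytes` is not shadowed.

```python
import io


def read_some(stream, n=-1):
    """Read n bytes, or everything when n is None or negative."""
    if n is None or n < 0:
        return stream.read()
    data = stream.read(n)  # `data`, not `bytes`, to keep the built-in usable
    return data
```

For example, `read_some(io.BytesIO(b"abc"), None)` and `read_some(io.BytesIO(b"abc"), -1)` both read the whole stream.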
Nir Aides n...@winpdb.org added the comment:
If the sequence of readaheads is ['a\r', '\nb\n'], the first use of the pattern
will consume 'a', then the peek(2) will trigger a read() and the next use of
the pattern will consume '\r\n'.
I updated the patch and enhanced a little the inline
Antoine Pitrou pit...@free.fr added the comment:
> Since the peek is called with a value of 2, the newline sequence \r\n
> should be retrieved as is.
No, it doesn't follow. The \r can still appear at the end of a readahead, in
which case your algorithm will not eliminate the following \n.
That
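The boundary case being discussed (a '\r' at the end of one readahead and the matching '\n' at the start of the next) can be handled by holding back a trailing '\r' until the next chunk arrives. The generator below is an independent sketch of that technique, not the algorithm from the patch; `translate_chunks` is a made-up name.

```python
def translate_chunks(chunks):
    """Translate '\r\n' and '\r' to '\n' across arbitrary chunk boundaries."""
    pending_cr = False
    for chunk in chunks:
        if pending_cr:
            # Re-attach the '\r' held back from the previous chunk.
            chunk = "\r" + chunk
            pending_cr = False
        if chunk.endswith("\r"):
            # Cannot tell yet whether a '\n' follows: hold the '\r' back.
            chunk = chunk[:-1]
            pending_cr = True
        out = chunk.replace("\r\n", "\n").replace("\r", "\n")
        if out:
            yield out
    if pending_cr:
        # Stream ended on a lone '\r', which counts as a line ending.
        yield "\n"
```

With the readaheads from the example, `"".join(translate_chunks(["a\r", "\nb\n"]))` gives `"a\nb\n"`, with a single newline after `a`.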
Nir Aides n...@winpdb.org added the comment:
Right, I was reading the 3.1 docs by mistake.
I updated the patch. This time universal newlines are supported.
On my dataset (75MB 650K lines log file) the readline() speedup is x40 for 'r'
mode and x8 for 'rU' mode, and you can get an extra bump
Antoine Pitrou pit...@free.fr added the comment:
> I updated the patch. This time universal newlines are supported.
Thank you. Are you sure the Shortcut common case in readline() is
useful? BufferedIOBase.readline() in itself should be rather fast.
--
Antoine Pitrou pit...@free.fr added the comment:
Also, I'm not sure what happens in readline() in universal mode when the
chunk ends with a '\r' and there's a '\n' in the following chunk (see
the ugly check that your patch removes). Is there a test for that?
--
Nir Aides n...@winpdb.org added the comment:
> Thank you. Are you sure the Shortcut common case in readline()
> is useful? BufferedIOBase.readline() in itself should be rather fast.
On my dataset the shortcut speeds up readline() 400% on top of the default C
implementation.
I can take a look
Nir Aides n...@winpdb.org added the comment:
I uploaded a possible patch for Python 2.7.
The patch converts ZipExtFile into subclass of io.BufferedIOBase and drops most
of the original implementation.
However, the patch breaks current newline behavior of ZipExtFile. I figured
this was
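To illustrate what subclassing io.BufferedIOBase buys, here is a much simplified sketch of a decompressing reader. It is not the patch: `InflateReader` is a hypothetical class, it handles a plain zlib stream rather than a zip member, and it only implements read() and read1(). readline() and iteration are inherited from the io base classes.

```python
import io
import zlib


class InflateReader(io.BufferedIOBase):
    """Hypothetical reader that decompresses a zlib stream on the fly."""

    def __init__(self, fileobj, chunk_size=4096):
        super().__init__()
        self._fileobj = fileobj
        self._decomp = zlib.decompressobj()
        self._buffer = b""
        self._eof = False
        self._chunk_size = chunk_size

    def readable(self):
        return True

    def _fill(self):
        """Decompress one more chunk of input into the internal buffer."""
        raw = self._fileobj.read(self._chunk_size)
        if raw:
            self._buffer += self._decomp.decompress(raw)
        else:
            self._buffer += self._decomp.flush()
            self._eof = True

    def read(self, n=-1):
        if n is None or n < 0:
            while not self._eof:
                self._fill()
            n = len(self._buffer)
        else:
            # Keep reading until the request is satisfied or input ends.
            while len(self._buffer) < n and not self._eof:
                self._fill()
        data, self._buffer = self._buffer[:n], self._buffer[n:]
        return data

    def read1(self, n=-1):
        # Return whatever is buffered, filling only if the buffer is empty.
        while not self._buffer and not self._eof:
            self._fill()
        if n is None or n < 0:
            n = len(self._buffer)
        data, self._buffer = self._buffer[:n], self._buffer[n:]
        return data


text = b"first line\nsecond line\nthird line\n"
reader = InflateReader(io.BytesIO(zlib.compress(text)), chunk_size=8)
first = reader.readline()  # inherited from the io base classes
rest = reader.read()
```

Because read() and readline() share one buffer, mixing them returns consistent data.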
Antoine Pitrou pit...@free.fr added the comment:
I don't think we can remove the U option from 2.6/2.7; it was certainly
introduced for a reason and isn't inconsistent with the U option to the
built-in open().
On 3.x, behaviour is indeed inconsistent with the standard IO library, so maybe
we
Changes by lucifer luyun...@yahoo.com.cn:
--
components: Extension Modules
nosy: lucifer
severity: normal
status: open
title: Cannot use both read and readline method in same ZipExtFile object
type: behavior
versions: Python 2.6
New submission from lucifer luyun...@yahoo.com.cn:
Open a file inside a zip archive through the ZipFile.open method. If the read
method is invoked after the readline method on the same ZipExtFile object, the
data returned is not correct.
I was trying to get a ZipExtFile and pass it to pickle.load(f), and an
exception was thrown.
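A reproduction along the lines of the report might look like the sketch below. The archive is built in memory and the member names and contents are invented for illustration. On a Python version where the issue is fixed, both checks succeed; on an affected version the data after readline() would be wrong or pickle.load() would raise.

```python
import io
import pickle
import zipfile

text = b"first line\nsecond line\nthird line\n"
obj = {"answer": 42}

# Build a small archive in memory (member names are made up).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("data.txt", text)
    zf.writestr("obj.pkl", pickle.dumps(obj))

with zipfile.ZipFile(buf) as zf:
    # readline() followed by read() on the same ZipExtFile object.
    with zf.open("data.txt") as f:
        first = f.readline()
        rest = f.read()
    # pickle.load() mixes read() and readline() calls internally.
    with zf.open("obj.pkl") as f:
        loaded = pickle.load(f)
```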
Changes by Nir Aides n...@winpdb.org:
--
nosy: +nirai
___
Python-bugs-list mailing list