[issue7610] Cannot use both read and readline method in same ZipExtFile object

2010-01-28 Thread Nir Aides
Nir Aides n...@winpdb.org added the comment: I actually meant how would you simulate zlib's absence on a system in which it is present?

[issue7610] Cannot use both read and readline method in same ZipExtFile object

2010-01-28 Thread Ezio Melotti
Ezio Melotti ezio.melo...@gmail.com added the comment: The easiest way is to set zlib to None or not import it at all. Are you suggesting that test_zipfile should always be run with and without zlib to check that everything (except the things that require zlib of course) works in both the
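The "set zlib to None" idea can be sketched as follows. This assumes Python 3 (for `unittest.mock`) and that `zipfile` keeps its optional dependency in a module-level name called `zlib`, as current CPython does; the helper names, the member name and the payload are made up for illustration and are not part of test_zipfile.

```python
import io
import zipfile
from unittest import mock


def roundtrip_without_zlib(payload):
    """Write and read back a stored (uncompressed) member while zipfile.zlib is None."""
    buf = io.BytesIO()
    # Assumption: zipfile looks up its module-level `zlib` name at call time.
    with mock.patch.object(zipfile, "zlib", None):
        with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
            zf.writestr("member.txt", payload)  # hypothetical member name
        with zipfile.ZipFile(buf) as zf:
            return zf.read("member.txt")


def deflate_is_rejected_without_zlib():
    """Return True if asking for ZIP_DEFLATED fails while zlib is patched out."""
    with mock.patch.object(zipfile, "zlib", None):
        try:
            zipfile.ZipFile(io.BytesIO(), "w", zipfile.ZIP_DEFLATED)
        except RuntimeError:
            return True
    return False
```

Patching only for the duration of the `with` block keeps the rest of the test run unaffected, which is the main reason to prefer it over editing the import.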

[issue7610] Cannot use both read and readline method in same ZipExtFile object

2010-01-28 Thread Ezio Melotti
Ezio Melotti ezio.melo...@gmail.com added the comment: Apparently that part of code is already tested in other tests that use deflated mode, so I'll close this again. Thanks for the info. -- stage: test needed -> committed/rejected status: open -> closed

[issue7610] Cannot use both read and readline method in same ZipExtFile object

2010-01-27 Thread Antoine Pitrou
Changes by Antoine Pitrou pit...@free.fr: -- assignee: -> pitrou resolution: -> accepted versions: +Python 2.7

[issue7610] Cannot use both read and readline method in same ZipExtFile object

2010-01-27 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: The patch has been committed in r77798 (trunk) and r77800 (py3k). Thank you! I won't commit it to 2.6 and 3.1 because it's too involved to qualify as a bug fix, though. -- resolution: accepted -> fixed stage: patch review ->

[issue7610] Cannot use both read and readline method in same ZipExtFile object

2010-01-27 Thread Ezio Melotti
Ezio Melotti ezio.melo...@gmail.com added the comment: Nir, could you also provide a test for the part that handles unconsumed data (line 601 in zipfile.py)? In r77809 (and r77810) I made a change to avoid using zlib when it's not necessary (zlib is not always available), and I was going to

[issue7610] Cannot use both read and readline method in same ZipExtFile object

2010-01-27 Thread Nir Aides
Nir Aides n...@winpdb.org added the comment: The related scenario is a system without zlib. How do you suggest simulating this in a test?

[issue7610] Cannot use both read and readline method in same ZipExtFile object

2010-01-27 Thread Ezio Melotti
Ezio Melotti ezio.melo...@gmail.com added the comment: The test should just check that the part that handles unconsumed data works when zlib is available. AFAIU if zlib is not available this part (i.e. the content of the if) can be skipped so it doesn't need to be tested. (When zlib is not

[issue7610] Cannot use both read and readline method in same ZipExtFile object

2010-01-27 Thread Nir Aides
Nir Aides n...@winpdb.org added the comment: Unconsumed data is compressed data. If the part which handles unconsumed data does not work when zlib is available, then the existing tests would fail. In any case the unconsumed buffer is an implementation detail of zipfile. I see a point in
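The remark that unconsumed data is still compressed can be seen with the standard `zlib` decompression object. This is only a sketch of the zlib behaviour, not of the zipfile code under discussion; the function name and the payload are made up for illustration.

```python
import zlib


def decompress_in_two_steps(compressed, first_size):
    """Decompress at most first_size bytes, then finish from the unconsumed tail."""
    d = zlib.decompressobj()
    # Limit the amount of output produced by this call.
    first = d.decompress(compressed, first_size)
    # Input that was not consumed because of the limit is kept here,
    # still in compressed form.
    leftover = d.unconsumed_tail
    # Feeding the leftover back in (without a limit) yields the remaining output.
    rest = d.decompress(leftover) + d.flush()
    return first, leftover, rest


payload = bytes(range(256)) * 20
first, leftover, rest = decompress_in_two_steps(zlib.compress(payload), 100)
```

Here `first + rest` reproduces `payload`, while `leftover` itself is not readable text until it has been passed through the decompressor again.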

[issue7610] Cannot use both read and readline method in same ZipExtFile object

2010-01-27 Thread Ezio Melotti
Ezio Melotti ezio.melo...@gmail.com added the comment: "If the part which handles unconsumed data does not work when zlib is available, then the existing tests would fail." If the existing tests end up testing that part of code too then it's probably fine. I tried to add a print inside the

[issue7610] Cannot use both read and readline method in same ZipExtFile object

2010-01-18 Thread Nir Aides
Nir Aides n...@winpdb.org added the comment: Uploaded an updated patch with a read() which calls the underlying stream enough times to satisfy the requested read size. -- Added file: http://bugs.python.org/file15941/zipfile_7610_py27_v5.diff
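The behaviour described here, calling the underlying stream repeatedly until the requested size is satisfied or the stream ends, can be sketched as below. `ShortReader` and `read_fully` are hypothetical names for illustration; this is not the code from the attached patch.

```python
import io


class ShortReader(io.RawIOBase):
    """Hypothetical raw stream that hands out at most 3 bytes per call."""

    def __init__(self, payload):
        self._source = io.BytesIO(payload)

    def readable(self):
        return True

    def readinto(self, b):
        chunk = self._source.read(min(len(b), 3))
        b[:len(chunk)] = chunk
        return len(chunk)


def read_fully(raw, n):
    """Issue as many raw reads as needed to return n bytes (fewer only at EOF)."""
    parts = []
    remaining = n
    while remaining > 0:
        chunk = raw.read(remaining)
        if not chunk:  # empty result: end of stream
            break
        parts.append(chunk)
        remaining -= len(chunk)
    return b"".join(parts)
```

A single `ShortReader(b"abcdefghij").read(8)` returns only three bytes, whereas `read_fully(ShortReader(b"abcdefghij"), 8)` keeps reading until it has all eight.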

[issue7610] Cannot use both read and readline method in same ZipExtFile object

2010-01-18 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: The patch looks rather good. Is `self.MAX_N` still necessary in read()? I guess it's rare to read more than 2GB at once, though...

[issue7610] Cannot use both read and readline method in same ZipExtFile object

2010-01-18 Thread Nir Aides
Nir Aides n...@winpdb.org added the comment: Right, removed MAX_N from read(); remains in read1(). If good, what versions of Python is this patch desired for? -- Added file: http://bugs.python.org/file15949/zipfile_7610_py27_v6.diff

[issue7610] Cannot use both read and readline method in same ZipExtFile object

2010-01-17 Thread Nir Aides
Nir Aides n...@winpdb.org added the comment: "I do not find the existing phrasing in the IO docs ambiguous, but since it is obviously possible to misinterpret it, it would be good to clarify it. Can you suggest an alternate phrasing that would be clearer?" Replace 'may' with 'will' or

[issue7610] Cannot use both read and readline method in same ZipExtFile object

2010-01-17 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: "Replace 'may' with 'will' or 'shall' everywhere the context indicates a mandatory requirement." Since this possibly affects the entire Python documentation, does it make sense to discuss this on python-dev? Either that, or open a separate

[issue7610] Cannot use both read and readline method in same ZipExtFile object

2010-01-15 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: "The documentation of io.BufferedIOBase.read() reads 'multiple raw reads may be issued to satisfy the byte count'. I understood this language to mean satisfying read size is optional. Isn't it?" It's the reverse actually. It means that

[issue7610] Cannot use both read and readline method in same ZipExtFile object

2010-01-15 Thread Nir Aides
Nir Aides n...@winpdb.org added the comment: May be a good idea to clear this up in the documentation. http://en.wiktionary.org/wiki/may#Verb (modal auxiliary verb, defective) "To have permission to. Used in granting permission and in questions to make polite requests."

[issue7610] Cannot use both read and readline method in same ZipExtFile object

2010-01-15 Thread R. David Murray
R. David Murray rdmur...@bitdance.com added the comment: I do not find the existing phrasing in the IO docs ambiguous, but since it is obviously possible to misinterpret it, it would be good to clarify it. Can you suggest an alternate phrasing that would be clearer? -- nosy:

[issue7610] Cannot use both read and readline method in same ZipExtFile object

2010-01-14 Thread Nir Aides
Nir Aides n...@winpdb.org added the comment: I uploaded an update for Python 2.7. "* you should probably write `n = sys.maxsize` instead of `n = 1 << 31 - 1`" sys.maxsize is a 64-bit number on my system but the maximum value accepted by zlib's decompress() seems to be INT_MAX defined in pyport.h

[issue7610] Cannot use both read and readline method in same ZipExtFile object

2010-01-12 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: Some comments: * you should probably write `n = sys.maxsize` instead of `n = 1 << 31 - 1` * ZipExtFile.read() should support `n=None` as a synonym to `n=-1` (read everything) * `bytes` as a variable name isn't very good since it's the built-in name

[issue7610] Cannot use both read and readline method in same ZipExtFile object

2010-01-10 Thread Nir Aides
Nir Aides n...@winpdb.org added the comment: If the sequence of readaheads is ['a\r', '\nb\n'], the first use of the pattern will consume 'a', then the peek(2) will trigger a read() and the next use of the pattern will consume '\r\n'. I updated the patch and enhanced a little the inline

[issue7610] Cannot use both read and readline method in same ZipExtFile object

2010-01-05 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: "Since the peek is called with a value of 2, the newline sequence \r\n should be retrieved as is." No, it doesn't follow. The \r can still appear at the end of a readahead, in which case your algorithm will not eliminate the following \n. That
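The boundary case raised here, a '\r' at the end of one chunk with the matching '\n' at the start of the next, can be illustrated with a small stand-alone splitter. This is not the regex-and-peek(2) approach from the patch; it is a simpler stand-in that holds back a trailing '\r' until the next chunk is known, and `iter_universal_lines` is a hypothetical name.

```python
def iter_universal_lines(chunks):
    """Yield lines from text chunks, treating \\r, \\n and \\r\\n as one newline."""
    pending = ""  # text seen so far that is not yet terminated by a newline
    for chunk in chunks:
        data = pending + chunk
        # A trailing '\r' may be the first half of '\r\n': hold it back.
        if data.endswith("\r"):
            data, held = data[:-1], "\r"
        else:
            held = ""
        data = data.replace("\r\n", "\n").replace("\r", "\n")
        lines = data.split("\n")
        pending = lines.pop() + held
        for line in lines:
            yield line + "\n"
    if pending:
        # At end of input a held-back '\r' can only be a newline on its own.
        yield pending.replace("\r", "\n")
```

With the sequence `['a\r', '\nb\n']` quoted elsewhere in the thread, the held-back '\r' is joined with the following '\n' and the two are emitted as a single newline rather than two.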

[issue7610] Cannot use both read and readline method in same ZipExtFile object

2010-01-04 Thread Nir Aides
Nir Aides n...@winpdb.org added the comment: Right, I was reading the 3.1 docs by mistake. I updated the patch. This time universal newlines are supported. On my dataset (a 75 MB, 650K-line log file) the readline() speedup is 40x for 'r' mode and 8x for 'rU' mode, and you can get an extra bump

[issue7610] Cannot use both read and readline method in same ZipExtFile object

2010-01-04 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: "I updated the patch. This time universal newlines are supported." Thank you. Are you sure the "Shortcut common case" in readline() is useful? BufferedIOBase.readline() in itself should be rather fast.

[issue7610] Cannot use both read and readline method in same ZipExtFile object

2010-01-04 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: Also, I'm not sure what happens in readline() in universal mode when the chunk ends with a '\r' and there's a '\n' in the following chunk (see the ugly check that your patch removes). Is there a test for that?

[issue7610] Cannot use both read and readline method in same ZipExtFile object

2010-01-04 Thread Nir Aides
Nir Aides n...@winpdb.org added the comment: "Thank you. Are you sure the Shortcut common case in readline() is useful? BufferedIOBase.readline() in itself should be rather fast." On my dataset the shortcut speeds up readline() 400% on top of the default C implementation. I can take a look

[issue7610] Cannot use both read and readline method in same ZipExtFile object

2010-01-03 Thread Nir Aides
Nir Aides n...@winpdb.org added the comment: I uploaded a possible patch for Python 2.7. The patch converts ZipExtFile into a subclass of io.BufferedIOBase and drops most of the original implementation. However, the patch breaks the current newline behavior of ZipExtFile. I figured this was
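A rough sketch of why basing the class on io.BufferedIOBase helps: if read() and peek() are the only methods that touch the decompressed buffer, the inherited readline() is built on top of them, so read() and readline() cannot disagree about the stream position. `InflateReader` below is a hypothetical, much simplified class over a plain zlib stream, assumed for illustration; it is not the patch's ZipExtFile.

```python
import io
import zlib


class InflateReader(io.BufferedIOBase):
    """Hypothetical reader that decompresses a zlib stream read from fileobj."""

    def __init__(self, fileobj, chunk_size=64):
        self._fileobj = fileobj
        self._decomp = zlib.decompressobj()
        self._chunk_size = chunk_size
        self._buffer = b""  # decompressed bytes not yet handed to the caller
        self._eof = False

    def readable(self):
        return True

    def _fill(self):
        """Decompress one more chunk of input into the shared buffer."""
        raw = self._fileobj.read(self._chunk_size)
        if raw:
            self._buffer += self._decomp.decompress(raw)
        else:
            self._buffer += self._decomp.flush()
            self._eof = True

    def peek(self, n=1):
        """Return buffered bytes without consuming them."""
        while not self._buffer and not self._eof:
            self._fill()
        return self._buffer

    def read(self, n=-1):
        """Return n bytes (fewer only at EOF); None or a negative n reads everything."""
        if n is None or n < 0:
            while not self._eof:
                self._fill()
            n = len(self._buffer)
        else:
            while len(self._buffer) < n and not self._eof:
                self._fill()
        data, self._buffer = self._buffer[:n], self._buffer[n:]
        return data
```

No readline() is defined here; the one inherited from the io base classes calls peek() and read(), which is what keeps mixed use consistent in this sketch.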

[issue7610] Cannot use both read and readline method in same ZipExtFile object

2010-01-03 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: I don't think we can remove the U option from 2.6/2.7; it was certainly introduced for a reason and isn't inconsistent with the U option to the built-in open(). On 3.x, behaviour is indeed inconsistent with the standard IO library, so maybe we

[issue7610] Cannot use both read and readline method in same ZipExtFile object

2009-12-31 Thread lucifer
Changes by lucifer luyun...@yahoo.com.cn: -- components: Extension Modules nosy: lucifer severity: normal status: open title: Cannot use both read and readline method in same ZipExtFile object type: behavior versions: Python 2.6

[issue7610] Cannot use both read and readline method in same ZipExtFile object

2009-12-31 Thread lucifer
New submission from lucifer luyun...@yahoo.com.cn: After opening a file inside a zip archive through the ZipFile.open method, if the read method is invoked after the readline method on the same ZipExtFile object, the data returned is not correct. I was trying to get a ZipExtFile and pass it to pickle.load(f), and an exception was thrown.
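The scenario in this report can be exercised with a small in-memory archive. The sketch below reflects what one would expect on an interpreter that includes the fix (and has zlib available); the member names and contents are made up for illustration.

```python
import io
import pickle
import zipfile


def build_archive():
    """Create an in-memory archive with a text member and a pickled member."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("lines.txt", b"first\nsecond\nthird\n")
        zf.writestr("data.pkl", pickle.dumps({"answer": 42}))
    buf.seek(0)
    return buf


def mixed_reads(archive):
    """Call readline() and then read() on the same ZipExtFile object."""
    with zipfile.ZipFile(archive) as zf, zf.open("lines.txt") as member:
        head = member.readline()
        rest = member.read()
    return head, rest


def load_pickle(archive):
    """Pass a ZipExtFile straight to pickle.load(), as in the report."""
    with zipfile.ZipFile(archive) as zf, zf.open("data.pkl") as member:
        return pickle.load(member)
```

On a fixed interpreter `mixed_reads(build_archive())` returns the first line followed by exactly the remaining bytes, and `load_pickle(build_archive())` returns the original dictionary.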

[issue7610] Cannot use both read and readline method in same ZipExtFile object

2009-12-31 Thread Nir Aides
Changes by Nir Aides n...@winpdb.org: -- nosy: +nirai