[issue1675951] Performance for small reads and fix seek problem
Antoine Pitrou pit...@free.fr added the comment:

Thank you very much! I have kept the second approach (using PaddedFile at all times), since it is more regular and minimizes the chance of borderline cases. As for the supposed performance slowdown, it doesn't seem significant. On large blocks of data, I expect the compression/decompression cost to dominate anyway.

I've added a test case and committed the patch in r84976. Don't hesitate to contribute again.

--
resolution: -> fixed
stage: -> committed/rejected
status: open -> closed

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1675951
___
___
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue1675951] Performance for small reads and fix seek problem
Florian Festi florianfe...@users.sourceforge.net added the comment:

Stupid me! I ran the tests against my system's gzip version (Py 3.1). The performance issue is basically fixed by rev 77289. Performance is even a bit better than with my original patch, by maybe 10-20%. The only test case where it performs worse is:

Random 10485760 byte block test
Original gzip   Write: 20.452 s   Read: 2.931 s
New gzip        Write: 20.518 s   Read: 1.247 s

I don't know if it is worth bothering about. Maybe increasing the maximum chunk size improves this, but I haven't tried that yet.

With regard to seeking: I now have two patches that eliminate the need for seek() in normal operation (rewind obviously still needs seek()). Both are based on the PaddedFile class. The first patch creates a PaddedFile object only while switching from an old member to a new one, while the second wraps the fileobj all the time. Performance tests show that wrapping is cheap. The first patch is a bit ugly, while the second requires an implementation of seek() and may create problems if new methods of the fileobj are used that interfere with PaddedFile's internals. So I leave the choice of which one is preferred to the module owner.

The patch creates another problem which is not yet fixed: the implementation of .seekable() becomes wrong. As one can now use non-seekable files, the implementation should check whether the file object used for reading is really seekable. As this is my first Py3k work, I'd prefer it if this could be solved by someone else (but that should be pretty easy).

--
Added file: http://bugs.python.org/file18964/0001-Avoid-the-need-of-seek-ing-on-the-file-read.patch
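To illustrate the idea behind wrapping the file object, here is a minimal sketch of a push-back reader. It is not the code from either attached patch: the names PushbackReader, prepend() and OneWayStream are hypothetical and chosen only for this example. It shows how bytes that were read too far can be handed back without calling seek(), and how seekable() could report what the underlying file object really supports.

```python
import io


class PushbackReader:
    """Hypothetical wrapper: un-read bytes without seek() on the wrapped file."""

    def __init__(self, fileobj, prepend=b""):
        self._file = fileobj
        self._pending = prepend  # bytes to serve before touching the file again

    def read(self, size):
        # This sketch assumes size is a non-negative integer.
        if not self._pending:
            return self._file.read(size)
        head, self._pending = self._pending[:size], self._pending[size:]
        if len(head) < size:
            # Pending bytes ran out, top up from the real file.
            head += self._file.read(size - len(head))
        return head

    def prepend(self, data):
        # Hand back bytes that were consumed too early (for example the
        # start of the next gzip member) instead of seeking backwards.
        self._pending = data + self._pending

    def seekable(self):
        # Report what the wrapped object supports rather than assuming True.
        probe = getattr(self._file, "seekable", None)
        return bool(probe and probe())


class OneWayStream:
    """Stand-in for a pipe or socket: read() only, no seek()/tell()."""

    def __init__(self, data):
        self._buf = io.BytesIO(data)

    def read(self, size):
        return self._buf.read(size)


reader = PushbackReader(OneWayStream(b"abcdef"))
first = reader.read(4)       # consumes four bytes from the stream
reader.prepend(first[2:])    # give the last two bytes back
second = reader.read(3)      # served from the pushed-back bytes, then the stream
rest = reader.read(10)
```

Wrapping at all times, as in the second patch described above, means every read goes through one extra method call; the sketch keeps that fast path to a single attribute check.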
[issue1675951] Performance for small reads and fix seek problem
Changes by Florian Festi florianfe...@users.sourceforge.net:

Added file: http://bugs.python.org/file18965/0002-Avoid-the-need-of-seek-ing-on-the-file-read-2.patch
[issue1675951] Performance for small reads and fix seek problem
Antoine Pitrou pit...@free.fr added the comment:

Attached result of a run with stdlib gzip module only. Results indicate that performance is still as bad as on Python 2. The Python 3 gzip module also still makes use of tell() and seek(). So both arguments for including this patch are still valid.

Performance is easily improved by wrapping the file object in an io.BufferedReader or io.BufferedWriter:

Text 1 byte block test
Original gzip   Write: 2.125 s   Read: 0.683 s
New gzip        Write: 0.390 s   Read: 0.240 s

Text 4 byte block test
Original gzip   Write: 1.077 s   Read: 0.351 s
New gzip        Write: 0.204 s   Read: 0.132 s

Text 16 byte block test
Original gzip   Write: 1.119 s   Read: 0.353 s
New gzip        Write: 0.264 s   Read: 0.137 s

Still, fixing the seek()/tell() issue would be nice.

--
title: [gzip] Performance for small reads and fix seek problem -> Performance for small reads and fix seek problem
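For readers who want to try the buffering approach mentioned in the comment above, here is a minimal sketch. The helper names write_small_chunks and read_small_chunks are made up for this example, and it uses an in-memory io.BytesIO instead of a real file; it is not the benchmark script from the issue and makes no claim about timings.

```python
import gzip
import io


def write_small_chunks(payload, chunk_size=1):
    """Compress payload using many tiny write() calls behind a BufferedWriter."""
    sink = io.BytesIO()
    # Closing the BufferedWriter flushes it and closes the GzipFile, which
    # writes the gzip trailer. The GzipFile leaves `sink` itself open.
    with io.BufferedWriter(gzip.GzipFile(fileobj=sink, mode="wb")) as out:
        for start in range(0, len(payload), chunk_size):
            out.write(payload[start:start + chunk_size])
    return sink.getvalue()


def read_small_chunks(blob, chunk_size=1):
    """Decompress blob using many tiny read() calls behind a BufferedReader."""
    parts = []
    with io.BufferedReader(gzip.GzipFile(fileobj=io.BytesIO(blob), mode="rb")) as src:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            parts.append(chunk)
    return b"".join(parts)


payload = b"spam and eggs\n" * 200
blob = write_small_chunks(payload)
roundtrip = read_small_chunks(blob)
```

The buffer absorbs the small calls, so the GzipFile underneath only sees larger blocks.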