[issue34010] tarfile stream read performance regression

2018-07-04 Thread STINNER Victor
STINNER Victor added the comment:
> @Victor I think removing unused code is better than adding a test for it and
> maintaining it.
Sure. I reviewed your PR 8089.

[issue34010] tarfile stream read performance regression

2018-07-04 Thread INADA Naoki
INADA Naoki added the comment: @Victor I think removing unused code is better than adding a test for it and maintaining it. So I removed that unused code block in GH-8089.

[issue34010] tarfile stream read performance regression

2018-07-04 Thread miss-islington
miss-islington added the comment: New changeset d7a0ad7dd7bd7dfbdbf6be2c89fde5a71813628a by Miss Islington (bot) in branch '3.6': bpo-34010: Fix tarfile read performance regression (GH-8020) https://github.com/python/cpython/commit/d7a0ad7dd7bd7dfbdbf6be2c89fde5a71813628a

[issue34010] tarfile stream read performance regression

2018-07-04 Thread miss-islington
miss-islington added the comment: New changeset c1b75b5fb92fda0ac5b931d7b18c1418557cb7c4 by Miss Islington (bot) in branch '3.7': bpo-34010: Fix tarfile read performance regression (GH-8020) https://github.com/python/cpython/commit/c1b75b5fb92fda0ac5b931d7b18c1418557cb7c4

[issue34010] tarfile stream read performance regression

2018-07-04 Thread STINNER Victor
STINNER Victor added the comment: https://github.com/python/cpython/pull/8020/files/77a54a39aace1a38794884218abe801b85b54e62#diff-ef64d8b610dda67977a63a9837f46349
-buf = "".join(t)
+buf = b"".join(t)
@hajoscher: "It never caused a problem, since this line is never
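The one-character change in that diff matters because, in Python 3, a str separator cannot join bytes chunks. The snippet below is a minimal sketch of that behavior, not the tarfile code itself; the names chunks, str_join_error, and joined are made up for illustration. It is consistent with the thread's description of the line as unused code: the wrong separator only fails if the line is actually executed.

```python
# Illustrative only: 'chunks' stands in for the list 't' in the diff.
chunks = [b"abc", b"def"]

# Joining bytes items with a str separator raises TypeError in Python 3.
try:
    "".join(chunks)
except TypeError as exc:
    str_join_error = exc
else:
    str_join_error = None

# Joining with a bytes separator works and returns a bytes object.
joined = b"".join(chunks)
```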

[issue34010] tarfile stream read performance regression

2018-07-04 Thread INADA Naoki
INADA Naoki added the comment: thanks
resolution: -> fixed
stage: patch review -> resolved
status: open -> closed

[issue34010] tarfile stream read performance regression

2018-07-04 Thread miss-islington
Change by miss-islington: pull_requests: +7683

[issue34010] tarfile stream read performance regression

2018-07-04 Thread miss-islington
Change by miss-islington: pull_requests: +7684

[issue34010] tarfile stream read performance regression

2018-07-04 Thread INADA Naoki
INADA Naoki added the comment: New changeset 12a08c47601cadea8e7d3808502cdbcca87b2ce2 by INADA Naoki (hajoscher) in branch 'master': bpo-34010: Fix tarfile read performance regression (GH-8020) https://github.com/python/cpython/commit/12a08c47601cadea8e7d3808502cdbcca87b2ce2

[issue34010] tarfile stream read performance regression

2018-07-04 Thread INADA Naoki
Change by INADA Naoki:
keywords: +3.2regression
title: tarfile stream read performance -> tarfile stream read performance regression

[issue34010] tarfile stream read performance

2018-07-03 Thread hajoscher
hajoscher added the comment: Yes, the performance is really bad for large files, and so is the memory consumption. I will write something for NEWS.

[issue34010] tarfile stream read performance

2018-07-03 Thread INADA Naoki
Change by INADA Naoki: versions: -Python 3.4, Python 3.5

[issue34010] tarfile stream read performance

2018-07-01 Thread INADA Naoki
INADA Naoki added the comment: Nice catch. I confirmed this is a serious performance regression. Decompressing a file should be O(n), where n is the file size, but it is O(n^2) now. Although we have lived with this regression for a long time, I feel it is worth backporting. This can be a DoS vulnerability. Can
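The O(n) versus O(n^2) claim can be illustrated with a simple cost model. This is only a sketch under a stated assumption: each extension of a bytes object copies the whole accumulated buffer into a new object, while a list followed by one join copies every chunk once. The function names are hypothetical and nothing here measures tarfile itself.

```python
def bytes_copied_by_concat(total_size, chunk_size):
    """Model: every 'buf += chunk' builds a new object of the full new length."""
    copied = 0
    held = 0
    while held < total_size:
        step = min(chunk_size, total_size - held)
        held += step
        copied += held  # old contents plus the new chunk are copied
    return copied


def bytes_copied_by_join(total_size, chunk_size):
    """Model: a single join copies each byte exactly once."""
    return total_size


# Under this model, doubling the input size roughly quadruples the copying
# done by repeated concatenation, but only doubles it for list plus join.
small = bytes_copied_by_concat(1000, 1)
large = bytes_copied_by_concat(2000, 1)
```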

[issue34010] tarfile stream read performance

2018-06-30 Thread Roundup Robot
Change by Roundup Robot:
keywords: +patch
pull_requests: +7628
stage: -> patch review

[issue34010] tarfile stream read performance

2018-06-30 Thread hajoscher
New submission from hajoscher: Buffered reads of large files in a compressed tarfile stream perform poorly. The buffered read in tarfile _Stream extends a bytes object on every iteration. It is much more efficient to collect the chunks in a list and join them once at the end. Using a list can mean seconds instead of minutes. This
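The two accumulation strategies described in the submission can be sketched as follows. This is not the tarfile _Stream implementation; read_concat, read_join, and the bufsize default are hypothetical names and values chosen for illustration, and io.BytesIO stands in for the decompressed stream.

```python
import io


def read_concat(stream, size, bufsize=16 * 1024):
    """Accumulate by extending a bytes object (the slow pattern)."""
    buf = b""
    while len(buf) < size:
        chunk = stream.read(bufsize)
        if not chunk:
            break
        buf += chunk  # may copy everything accumulated so far
    return buf[:size]


def read_join(stream, size, bufsize=16 * 1024):
    """Accumulate chunks in a list and join once (the faster pattern)."""
    chunks = []
    total = 0
    while total < size:
        chunk = stream.read(bufsize)
        if not chunk:
            break
        chunks.append(chunk)
        total += len(chunk)
    return b"".join(chunks)[:size]


# Both functions return the same bytes; only the amount of copying differs.
data = bytes(range(256)) * 1000
slow = read_concat(io.BytesIO(data), 100000, 1024)
fast = read_join(io.BytesIO(data), 100000, 1024)
```

Tracking the running total separately avoids calling len() on a growing buffer and keeps the loop condition independent of how the chunks are stored.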