Gregory P. Smith added the comment:
I saw a small regression over 4k when using a 64k buffer on one of my machines
(dual core amd64 linux). With 32k everything (amd64 linux, armv7l 32-bit
linux, 64-bit os x 10.6) showed a dramatic improvement on the microbenchmark.
approaching 50% less cpu u
Roundup Robot added the comment:
New changeset 03a056c3b88e by Gregory P. Smith in branch '3.3':
Fixes issue #19929: Call os.read with 32768 within subprocess.Popen
http://hg.python.org/cpython/rev/03a056c3b88e
New changeset 4de4b5a4e405 by Gregory P. Smith in branch 'default':
Fixes issue #1992
Charles-François Natali added the comment:
> Roundup Robot added the comment:
>
> New changeset 03a056c3b88e by Gregory P. Smith in branch '3.3':
> Fixes issue #19929: Call os.read with 32768 within subprocess.Popen
> http://hg.python.org/cpython/rev/03a056c3b88e
Not that it bothers me, but AFAI
Antoine Pitrou added the comment:
Linux, 64-bit quad core:
With 4K buffer:
$ time ./python test_sub_read.py
0.25217683400114765
real 0m0.296s
user 0m0.172s
sys 0m0.183s
With 64K buffer:
$ time ./python test_sub_read.py
0.0925754177548
real 0m0.132s
user 0m0.051s
sys
Charles-François Natali added the comment:
> STINNER Victor added the comment:
>
> Since Popen.communicate() returns the whole content of the buffer, would it
> be safe to increase the buffer size? For example, use 4 GB as the buffer size?
Sure, if you want to pay the CPU and memory overhead of
STINNER Victor added the comment:
Oh, Charles-François Natali replied to my review by email, and it's not
archived on Rietveld. Copy of his message:
> http://bugs.python.org/review/18923/diff/9757/Lib/subprocess.py#newcode420
> Lib/subprocess.py:420: _PopenSelector = selectors.SelectSelector
>
STINNER Victor added the comment:
Where is the buffer size? The hardcoded 4096 value in Popen._communicate()?
data = os.read(key.fd, 4096)
I remember asking you where 4096 came from when you patched
subprocess to use selectors (#18923):
http://bugs.python.org/review/18923/#ps9827
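
For readers following along, here is a minimal sketch of how the size passed to os.read() affects the number of read calls needed to drain a pipe. It is not the code in Lib/subprocess.py: read_all() and demo() are hypothetical helpers written for this illustration, the 1 MiB payload is arbitrary, and a writer thread stands in for the child process. It assumes a POSIX system where pipes can be registered with a selector.

```python
import os
import selectors
import threading


def read_all(fd, bufsize):
    """Drain fd with os.read(fd, bufsize); return (data, number_of_reads)."""
    chunks = []
    reads = 0
    with selectors.DefaultSelector() as sel:
        sel.register(fd, selectors.EVENT_READ)
        # Loop until the only registered fd has reached EOF.
        while sel.get_map():
            for key, _events in sel.select():
                data = os.read(key.fd, bufsize)
                reads += 1
                if not data:
                    # Empty read means the writer closed its end.
                    sel.unregister(key.fd)
                else:
                    chunks.append(data)
    return b"".join(chunks), reads


def demo(bufsize, payload=b"x" * (1 << 20)):
    """Push payload through a pipe from a thread and drain it (hypothetical)."""
    r, w = os.pipe()

    def writer():
        view = memoryview(payload)
        while len(view):
            n = os.write(w, view)  # may write fewer bytes than requested
            view = view[n:]
        os.close(w)

    t = threading.Thread(target=writer)
    t.start()
    try:
        return read_all(r, bufsize)
    finally:
        t.join()
        os.close(r)


if __name__ == "__main__":
    for size in (4096, 32768, 65536):
        data, reads = demo(size)
        print(f"bufsize={size}: {len(data)} bytes in {reads} os.read() calls")
```

The exact call counts depend on the platform's pipe capacity and on scheduling, so the sketch prints them rather than assuming a value. The only firm bound is that each call returns at most bufsize bytes, so a smaller buffer needs at least proportionally more calls.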
New submission from Charles-François Natali:
This is a spinoff of issue #19506: currently, subprocess.communicate() uses a
4K buffer when reading data from pipes.
This was probably optimal a couple years ago, but nowadays most operating
systems have larger pipes (e.g. Linux has 64K), so we migh