New submission from Richard Oudkerk:

Piping significant amounts of data through a subprocess using Popen.communicate() is crazily slow on Windows.
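For readers without the attachment, a measurement of this kind can be sketched roughly as below. This is not push-thru-cat.py itself: the helper name push_through and the CAT_CMD stand-in (a child Python process that copies stdin to stdout, used in place of cat.exe) are assumptions made for illustration.

```python
import subprocess
import sys
import time

# Assumed stand-in for cat.exe: a child Python process copying stdin to stdout.
CAT_CMD = [
    sys.executable, "-c",
    "import sys, shutil; shutil.copyfileobj(sys.stdin.buffer, sys.stdout.buffer)",
]


def push_through(data, cmd=CAT_CMD):
    """Send data through cmd with communicate(); return (output, seconds)."""
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    # Time only the communicate() call, not the process start-up.
    start = time.perf_counter()
    out, _ = proc.communicate(data)
    return out, time.perf_counter() - start


if __name__ == "__main__":
    for mb in (1, 2, 4):
        data = b"x" * (mb << 20)
        out, secs = push_through(data)
        assert out == data  # the child must echo the input unchanged
        print("amount = %d MB; time taken = %.2f secs; rate = %.2f MB/s"
              % (mb, secs, mb / secs))
```

Doubling the amount on each iteration makes it easy to see whether the time taken roughly doubles or roughly quadruples.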
The attached program just pushes data through mingw's cat.exe.

Python 3.3:

    amount = 1 MB; time taken = 0.07 secs; rate = 13.51 MB/s
    amount = 2 MB; time taken = 0.31 secs; rate = 6.51 MB/s
    amount = 4 MB; time taken = 1.30 secs; rate = 3.08 MB/s
    amount = 8 MB; time taken = 5.43 secs; rate = 1.47 MB/s
    amount = 16 MB; time taken = 21.64 secs; rate = 0.74 MB/s
    amount = 32 MB; time taken = 87.36 secs; rate = 0.37 MB/s

Python 2.7:

    amount = 1 MB; time taken = 0.02 secs; rate = 66.67 MB/s
    amount = 2 MB; time taken = 0.03 secs; rate = 68.97 MB/s
    amount = 4 MB; time taken = 0.05 secs; rate = 76.92 MB/s
    amount = 8 MB; time taken = 0.10 secs; rate = 82.47 MB/s
    amount = 16 MB; time taken = 0.27 secs; rate = 60.38 MB/s
    amount = 32 MB; time taken = 0.88 secs; rate = 36.36 MB/s
    amount = 64 MB; time taken = 3.20 secs; rate = 20.03 MB/s
    amount = 128 MB; time taken = 12.36 secs; rate = 10.35 MB/s

For Python 3.3 this looks like O(n^2) complexity to me. 2.7 is better but still struggles for large amounts.

Changing Popen._readerthread() to read in chunks rather than using FileIO.readall() produces a huge speed up:

Python 3.3 with patch:

    amount = 1 MB; time taken = 0.01 secs; rate = 76.92 MB/s
    amount = 2 MB; time taken = 0.03 secs; rate = 76.92 MB/s
    amount = 4 MB; time taken = 0.04 secs; rate = 111.10 MB/s
    amount = 8 MB; time taken = 0.05 secs; rate = 148.14 MB/s
    amount = 16 MB; time taken = 0.10 secs; rate = 156.85 MB/s
    amount = 32 MB; time taken = 0.16 secs; rate = 198.75 MB/s
    amount = 64 MB; time taken = 0.31 secs; rate = 205.78 MB/s
    amount = 128 MB; time taken = 0.61 secs; rate = 209.82 MB/s

Maybe FileIO.readall() should do something similar for files whose size cannot be determined by stat().
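The chunked-read idea can be sketched roughly as follows. This is an illustration only, not the actual change to Popen._readerthread(): the names read_all_chunked, reader_thread and demo, and the CHUNK_SIZE value, are assumptions. The point is that each read() asks for a bounded amount and the pieces are joined once at the end, instead of relying on a single readall() on the pipe.

```python
import os
import threading

CHUNK_SIZE = 32 * 1024  # assumed value, chosen only for illustration


def read_all_chunked(fh, chunk_size=CHUNK_SIZE):
    """Read fh to EOF in bounded chunks and return everything as bytes."""
    chunks = []
    while True:
        data = fh.read(chunk_size)
        if not data:  # b"" means EOF
            break
        chunks.append(data)
    # One join at the end avoids repeatedly regrowing a single buffer.
    return b"".join(chunks)


def reader_thread(fh, buffer):
    """Hypothetical reader-thread body: store the output, then close fh."""
    buffer.append(read_all_chunked(fh))
    fh.close()


def demo(payload):
    """Write payload into an OS pipe and read it back on a reader thread."""
    rfd, wfd = os.pipe()
    result = []
    reader = threading.Thread(
        target=reader_thread,
        args=(os.fdopen(rfd, "rb", buffering=0), result),
    )
    reader.start()
    # Closing the write end on leaving the with-block signals EOF to the reader.
    with os.fdopen(wfd, "wb") as writer:
        writer.write(payload)
    reader.join()
    return result[0]
```

The demo uses an unbuffered read end so that the reader works on a raw FileIO object, which is the kind of object the report is about.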
----------
components: Library (Lib)
files: push-thru-cat.py
messages: 168813
nosy: sbt
priority: normal
severity: normal
stage: patch review
status: open
title: 500x speed up for Popen.communicate() on Windows
type: performance

Added file: http://bugs.python.org/file26952/push-thru-cat.py

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue15758>
_______________________________________