On Fri, 20 Oct 2017 14:31:22 -0500
David Beazley <[email protected]> wrote:

> I adapted this benchmark to Curio using streams and Curio's support for
> readinto().  Code is at
> https://gist.github.com/dabeaz/999dc7d08ddd2c0dea790de67948e756
> Support for readinto() is somewhat recent in Curio, so for testing you
> will need the latest version from GitHub (https://github.com/dabeaz/curio).
> However, here are the results I got on my machine:
>
> - vanilla asyncio achieves 145 MB/s
> - asyncio + uvloop achieves 340 MB/s
> - Curio achieves 550 MB/s
Thank you Dave!  I ran it on my machine and get roughly 910 MB/s, i.e.
more or less the same speed as a tweaked Tornado using readinto.

I also wrote a variant of the benchmark using socketserver and plain
sockets.  It gets 1150 MB/s, and most of the time seems to be spent in
the kernel, so I'm not quite sure it's possible to improve on that:
https://gist.github.com/pitrou/3ac31e82b4461cbc9b4eee151a47bfee

(note that running the server in a separate process doesn't improve
things; neither does using writev() or sendmsg() to send the two
buffers at once)

Regards

Antoine.


> Asyncio tests were run using:
> https://gist.github.com/pitrou/719e73c1df51e817d618186833a6e2cc
>
> Cheers,
> Dave
>
> > On Oct 18, 2017, at 1:04 PM, Antoine Pitrou <[email protected]> wrote:
> >
> > Hi,
> >
> > I am currently looking into ways to optimize large data transfers for
> > a distributed computing framework (https://github.com/dask/distributed/).
> > We are using Tornado, but the question is more general, as it turns
> > out that certain kinds of API are an impediment to such optimizations.
> >
> > To put it briefly, there are a couple of benchmarks discussed here:
> > https://github.com/tornadoweb/tornado/issues/2147#issuecomment-337187960
> >
> > - for Tornado, this benchmark:
> >   https://gist.github.com/pitrou/0f772867008d861c4aa2d2d7b846bbf0
> > - for asyncio, this benchmark:
> >   https://gist.github.com/pitrou/719e73c1df51e817d618186833a6e2cc
> >
> > Both implement a trivial form of framing using the "preferred" APIs
> > of each framework (IOStream for Tornado, Protocol for asyncio), and
> > then benchmark it over 100 MB frames using a simple echo
> > client/server.
> >
> > The results (on Python 3.6) are interesting:
> > - vanilla asyncio achieves 350 MB/s
> > - vanilla Tornado achieves 400 MB/s
> > - asyncio + uvloop achieves 600 MB/s
> > - an optimized Tornado IOStream with more sophisticated buffering
> >   logic (https://github.com/tornadoweb/tornado/pull/2166) achieves
> >   700 MB/s
> >
> > The latter result is especially interesting: uvloop uses hand-crafted
> > Cython code plus the C libuv library, yet a pure Python version of
> > Tornado does better thanks to improved buffering logic in the
> > streaming layer.
> >
> > Even the Tornado result is not ideal.  When profiling, we see that
> > 50% of the runtime is actual IO calls (socket.send and socket.recv),
> > but the rest is still overhead.  In particular, buffering on the read
> > side still incurs costly memory copies (b''.join calls take 22% of
> > the time!).
> >
> > For a framed layer, you shouldn't need so many copies.  Once you've
> > read the frame length, you can allocate the frame upfront and read
> > into it.  That is at odds, however, with the API exposed by asyncio's
> > Protocol: data_received() gives you a new bytes object as soon as
> > data arrives.  By then it's already too late: a spurious memory copy
> > has to occur.
> >
> > Tornado's IOStream is less constrained, but it supports too many read
> > schemes (including several types of callbacks).  So I crafted a
> > limited version of IOStream (*) that supports little functionality,
> > but is able to use socket.recv_into() when asked for a given number
> > of bytes.  When benchmarked, this version achieves 950 MB/s.  This is
> > still without any C code!
> >
> > (*) see
> > https://github.com/tornadoweb/tornado/compare/master...pitrou:stream_readinto?expand=1
> >
> > When profiling that limited version of IOStream, we see that 68% of
> > the runtime is actual IO calls (socket.send and socket.recv_into).
> > Still, 21% of the total runtime is spent allocating a 100 MB buffer
> > for each frame!  That's 70% of the non-IO overhead!  Whether or not
> > there are smart ways to reuse that writable buffer depends on how the
> > application intends to use the data: does it throw it away before the
> > next read or not?  It doesn't sound easily doable in the general
> > case.
> >
> > So I'm wondering which kinds of APIs async libraries could expose to
> > make those use cases faster.  I know curio and trio have socket
> > objects which would probably fit the bill.  I don't know if there are
> > higher-level concepts that would be as adequate for achieving the
> > highest performance.
> >
> > Also, since asyncio is the de facto standard now, I wonder if asyncio
> > might grow such a new API.  That may be troublesome: asyncio already
> > has Protocols and Streams, and people often complain about its
> > extensive API surface, which is difficult for beginners :-)
> >
> > Addendum: asyncio streams
> > -------------------------
> >
> > I didn't think asyncio streams would be a good solution, but I still
> > wrote a benchmark variant for them out of curiosity, and it turns out
> > I was right.  The results:
> > - vanilla asyncio streams achieve 300 MB/s
> > - asyncio + uvloop streams achieve 550 MB/s
> >
> > The benchmark script is at
> > https://gist.github.com/pitrou/202221ca9c9c74c0b48373ac89e15fd7
> >
> > Regards
> >
> > Antoine.
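
P.S. For concreteness, here is a rough sketch of the "allocate the frame
upfront and read into it" scheme discussed above, over plain blocking
sockets in the spirit of the socketserver variant.  This is only an
illustration of the technique, not the actual benchmark code; the 4-byte
length prefix, the names and the port are assumptions of mine:

import socketserver
import struct

def recv_exactly(sock, view):
    """Completely fill the writable memoryview using recv_into().
    Returns False if the peer closes the connection first."""
    while len(view):
        n = sock.recv_into(view)
        if n == 0:
            return False
        view = view[n:]
    return True

class FramedEchoHandler(socketserver.BaseRequestHandler):
    """Echo length-prefixed frames back, reading each frame directly
    into a buffer allocated upfront (no intermediate b''.join copies)."""

    def handle(self):
        sock = self.request
        header = bytearray(4)
        while True:
            if not recv_exactly(sock, memoryview(header)):
                return                      # client disconnected
            (length,) = struct.unpack("!I", header)
            frame = bytearray(length)       # allocate the frame upfront...
            if not recv_exactly(sock, memoryview(frame)):
                return
            sock.sendall(header)            # ...and echo it straight back
            sock.sendall(frame)

if __name__ == "__main__":
    with socketserver.TCPServer(("127.0.0.1", 8888), FramedEchoHandler) as server:
        server.serve_forever()

A client for this would just send a struct.pack("!I", len(payload))
header followed by the payload, then read the echo back the same way.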

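P.P.S. Regarding buffer reuse: *if* the application is known to be done
with a frame before the next read (which, as said above, doesn't hold in
the general case), the per-frame allocation could be amortized with a
grow-only scratch buffer along these lines.  Again, this is only an
illustrative sketch, not something the benchmarks above actually do:

class FrameScratchBuffer:
    """Reuse a single bytearray across frames and hand out a writable
    memoryview of the requested size.  Only safe if the caller is done
    with the previous frame before requesting the next one."""

    def __init__(self):
        self._buf = bytearray()

    def get(self, size):
        if len(self._buf) < size:
            # Grow only when needed; steady-state reads allocate nothing.
            self._buf = bytearray(size)
        return memoryview(self._buf)[:size]

In the handler sketched above, "frame = bytearray(length)" would become
"frame = scratch.get(length)" with one FrameScratchBuffer per
connection; socket.sendall() accepts the resulting memoryview directly.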