On 2025-06-24 00:32:26 +0200, Mild Shock wrote:
> So what does:
>
> stats = await asyncio.to_thread(os.stat, url)
>
> Well, it calls in a separate new secondary thread:
>
> os.stat(url)
>
> It happens that url is only a file path, and
> the file path points to an existing file. So the
> secondary thread computes the stats, and terminates,
>
> and the async framework hands the stats back to
> the main thread that did the await, and the main
> thread stops its waiting and continues to run
>
> cooperatively with the other tasks in the current
> event loop. The test case measures the wall time.
> The results are:
>
>
> node.js: 10 ms (usual Promises and stuff)
>
> JDK 24: 50 ms (using Threads, not yet VirtualThreads)
>
> pypy: 2000 ms
>
> I am only using one main task, sequentially on
> such await calls, with a couple of files, not
> more than 50 files.

So that's 2000 ms for 50 files, or 40 ms per file? This is indeed
extremely slow.

On my laptop, CPython 3.12 and PyPy 7.3 can stat 10000 files in 40 ms.
There is no way my laptop is 10000 times faster than your machine.

I also measured how long it takes to create and join a thread. Here
CPython with 72 µs is quite a bit faster than PyPy with 150 µs, but even
that should be able to do about 250 create thread/stat/join cycles in
40 ms (although of course that would be a stupid thing to do).

Oh, and lo and behold, that's about the performance I get with
asyncio.to_thread(os.stat, ...):

#v+
#!/usr/bin/python3
import asyncio
import os
import random
import time

filenames = []

def createfilenames(d, levels):
    # Build the list of paths only; the files are expected to exist already.
    if levels == 0:
        for i in range(10):
            filenames.append(f"{d}/f{i}")
    else:
        for i in range(10):
            dn = f"{d}/d{i}"
            createfilenames(dn, levels-1)

async def statfiles(filenames):
    count = 0
    total_length = 0
    t0 = time.monotonic()
    for fn in filenames:
        # One await at a time: each stat runs in a worker thread.
        st = await asyncio.to_thread(os.stat, fn)
        count += 1
        total_length += st.st_size
    t1 = time.monotonic()
    print(f"{count} files with {total_length} bytes stat'ed in {t1 - t0:.3f} seconds")

createfilenames(".", 3)
random.shuffle(filenames)
asyncio.run(statfiles(filenames))
#v-

% pypy3 stat_asyncio
10000 files with 13235076 bytes stat'ed in 1.415 seconds
% python3 stat_asyncio
10000 files with 13235076 bytes stat'ed in 0.788 seconds

        hjp

-- 
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | h...@hjp.at        |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"
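
For comparison, the create thread/stat/join measurement mentioned above
could be reproduced roughly like this. It is only a minimal sketch, not
the script the numbers above came from; the helper name
time_thread_cycles and the default cycle count are assumptions made up
for illustration:

```python
#!/usr/bin/python3
import os
import threading
import time


def time_thread_cycles(path, cycles=1000):
    """Return the average seconds per create thread/stat/join cycle.

    Hypothetical helper for illustration only.
    """
    t0 = time.perf_counter()
    for _ in range(cycles):
        # Start a fresh thread that does a single os.stat(), then wait for it.
        t = threading.Thread(target=os.stat, args=(path,))
        t.start()
        t.join()
    t1 = time.perf_counter()
    return (t1 - t0) / cycles


if __name__ == "__main__":
    per_cycle = time_thread_cycles(".")
    print(f"{per_cycle * 1e6:.0f} us per create thread/stat/join cycle")
```

If the per-cycle time this prints is in the same range as the per-file
time of the asyncio.to_thread() script, the cost is dominated by handing
the work to a thread and waiting for it, not by the stat itself.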
-- https://mail.python.org/mailman3//lists/python-list.python.org