Giampaolo Rodola' <g.rod...@gmail.com> added the comment:
Yes, file copy (open() + read() + write()) is of course more expensive than just "reading" a tree (os.walk(), glob()) or deleting it (rmtree()) and the "pure file copy" time adds up to the benchmark. And indeed it's not an coincidence that #33671 (which replaced read() + write() with sendfile()) shaved off a 5% gain from the benchmark I posted initially for Linux. Still, in a 8k small-files-tree scenario we're seeing ~9% gain on Linux, 20% on Windows and 30% on a SMB share on localhost vs. VirtualBox. I do not consider this a "hardly noticeable gain" as you imply: it is noticeable, exponential and measurable, even with cache being involved (as it is). Note that the number of stat() syscalls per file is being reduced from 6 to 1 (or more if follow_symlinks=False), and that is the real gist here. That *does* make a difference on a regular Windows fs and makes a huge difference with network filesystems in general, as a simple stat() call implies access to the network, not the disk. ---------- _______________________________________ Python tracker <rep...@bugs.python.org> <https://bugs.python.org/issue33695> _______________________________________ _______________________________________________ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com