STINNER Victor added the comment:

I enhanced bench_scandir2.py so that one command creates the test directory 
and separate commands run the benchmarks.

All commands:
- create: create the directory for tests (optional: you can also use an 
existing directory)
- bench: compare scandir+is_dir to listdir+stat, cached
- bench_nocache: compare scandir+is_dir to listdir+stat, flush disk caches
- bench_nostat: compare scandir to listdir, cached
- bench_nostat_nocache: compare scandir to listdir, flush disk caches
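
As a rough illustration of what the "bench" commands compare, here is a 
minimal sketch of the two approaches. It is not taken from bench_scandir2.py: 
the function names are hypothetical, and it assumes Python 3.6+, where 
os.scandir() is available and usable as a context manager.

```python
import os
import stat


def count_dirs_listdir(path):
    """listdir + stat: one extra stat() call for every entry."""
    count = 0
    for name in os.listdir(path):
        st = os.stat(os.path.join(path, name))
        if stat.S_ISDIR(st.st_mode):
            count += 1
    return count


def count_dirs_scandir(path):
    """scandir + is_dir: the entry type usually comes with the listing."""
    count = 0
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.is_dir():
                count += 1
    return count


if __name__ == "__main__":
    # Both functions should report the same number of directories.
    print(count_dirs_listdir("."), count_dirs_scandir("."))
```

The "nostat" variants would drop the stat() and is_dir() calls and only 
iterate over the names.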

--

New patch version 6, written for performance. Changes:

- On POSIX, decode the filename in C
- The _scandir() iterator now yields a list of items instead of a single item

With my benchmarks, I see that yielding 10 items at a time reduces the 
overhead of scandir on Linux (creating DirEntry objects). On Windows, the 
number of items has no effect, but I prefer to also fetch entries 10 at a 
time there to mimic POSIX. Later, on POSIX, we may call getdents() directly 
and yield the full getdents() result at once. According to strace, it's 
currently around 800 entries per getdents() syscall.
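
The batching idea can be sketched in pure Python. This is only an 
illustration under assumptions: the patch does this work in C, and the names 
iter_batches and iter_entries are hypothetical, not part of the patch.

```python
from itertools import islice


def iter_batches(entries, batch_size=10):
    """Yield lists of up to batch_size items from any iterable."""
    iterator = iter(entries)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def iter_entries(entries, batch_size=10):
    """Flatten the batches so the caller still sees single items."""
    for batch in iter_batches(entries, batch_size):
        yield from batch
```

The low-level iterator is resumed once per batch rather than once per item, 
while the caller still gets items one by one.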


Results of bench_scandir2.py on my laptop using SSD and ext4 filesystem:

- 110,100 entries (100,000 files, 100 symlinks, 10,000 directories)
- bench: 1.3x faster (scandir: 164.9 ms, listdir: 216.3 ms)
- bench_nostat: 0.4x faster (scandir: 104.0 ms, listdir: 38.5 ms)
- bench_nocache: 2.1x faster (scandir: 460.2 ms, listdir: 983.2 ms)
- bench_nostat_nocache: 2.2x faster (scandir: 480.4 ms, listdir: 1055.6 ms)
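
To read the "Nx faster" figures: they appear to be the listdir time divided 
by the scandir time, so a value below 1.0 means scandir was slower in that 
run. This is an assumption that matches the numbers above, not something 
checked against bench_scandir2.py; the helper name is hypothetical.

```python
def speedup(scandir_ms, listdir_ms):
    """Ratio of listdir time to scandir time (assumed formula)."""
    return listdir_ms / scandir_ms


# Timings copied from the SSD + ext4 results above.
results = {
    "bench": (164.9, 216.3),
    "bench_nostat": (104.0, 38.5),
    "bench_nocache": (460.2, 983.2),
    "bench_nostat_nocache": (480.4, 1055.6),
}

for name, (scandir_ms, listdir_ms) in results.items():
    print("%s: %.1fx" % (name, speedup(scandir_ms, listdir_ms)))
```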

Results of bench_scandir2.py on my laptop using NFS share (server: ext4 
filesystem) and slow wifi:

- 11,100 entries (10,000 files, 100 symlinks, 1,000 directories)
- bench: 1.3x faster (scandir: 22.5 ms, listdir: 28.9 ms)
- bench_nostat: 0.2x faster (scandir: 14.3 ms, listdir: 3.2 ms)

*** Timings with NFS are not reliable. Sometimes, a directory listing takes 
more than 30 seconds, but then it takes less than 100 ms. ***

Results of bench_scandir2.py on a Windows 7 VM using NTFS:

- 11,100 entries (10,000 files, 1,000 directories, 100 symlinks)
- bench: 9.9x faster (scandir: 58.3 ms, listdir: 578.5 ms)
- bench_nostat: 0.3x faster (scandir: 28.5 ms, listdir: 7.6 ms)

Results of bench_scandir2.py on my desktop PC using tmpfs (/tmp):

- 110,100 entries (100,000 files, 100 symlinks, 10,000 directories)
- bench: 1.3x faster (scandir: 149.2 ms, listdir: 189.2 ms)
- bench_nostat: 0.3x faster (scandir: 91.9 ms, listdir: 27.1 ms)

Results of bench_scandir2.py on my desktop PC using HDD and ext4:

- 110,100 entries (100,000 files, 100 symlinks, 10,000 directories)
- bench: 1.4x faster (scandir: 168.5 ms, listdir: 238.9 ms)
- bench_nostat: 0.4x faster (scandir: 107.5 ms, listdir: 41.9 ms)

----------
Added file: http://bugs.python.org/file38121/scandir-6.patch

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue22524>
_______________________________________