Hello

Speaking from my observations on millions of machines, the stat() call is 
*very slow* compared to readdir(), FindNextFile(), getdirentriesattr(), etc. 
When we switched from a file system indexer that stat()ed every file to one 
that only read directories, we saw an average speedup of about 10x.

You can probably attribute this partly to the much lower raw system call 
volume in file system indexing (no need to stat() each file, just read the 
directories), and partly to much less HD seeking (stat() has to jump around 
the HD, whereas all the entries of a directory usually fit in one block). If 
you only need to test for the existence of multiple files and don't need the 
extra information that stat() gives you, it might make sense to avoid the 
context switch/IO overhead by reading the directory instead.
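
As a rough illustration of that last point (a sketch only, not the indexer 
described above), here are two ways to test which of several names exist in 
one directory. The function names are made up for this example. Note that the 
two can disagree in edge cases: os.path.exists() follows symlinks, so a broken 
symlink shows up in the listing but fails the stat()-based check, and the 
in-memory comparison is case-sensitive even on case-insensitive file systems.

```python
import os

def existing_by_stat(directory, names):
    """One stat() call per candidate name (os.path.exists uses stat)."""
    return {n for n in names if os.path.exists(os.path.join(directory, n))}

def existing_by_listing(directory, names):
    """Read the directory once, then test membership in memory."""
    present = set(os.listdir(directory))
    return {n for n in names if n in present}
```

Whether the second form is actually faster depends on the file system and on 
how many names you check per directory, so it is worth measuring first.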

Rian

On Jan 31, 2011, at 4:43 AM, Antoine Pitrou wrote:

> On Mon, 31 Jan 2011 00:08:25 -0800
> Guido van Rossum <gu...@python.org> wrote:
>> 
>> (Basically I am biased to believe that stat() is a pretty slow system
>> call -- this may just be old NFS lore though.)
> 
> I don't know about NFS, but starting a Python interpreter located on a
> Samba share from a Windows VM is quite slow too.
> I think Martin is right for the common case: on a local filesystem on a
> modern Unix, stat() is certainly very fast. Remote or
> distributed filesystems seem to be more of a problem.
> 
> Regards
> 
> Antoine.
> 
> 
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: 
> http://mail.python.org/mailman/options/python-dev/rian%40dropbox.com
