On Tue, Jul 19, 2005 at 12:48:53PM -0600, Jonathan Briggs wrote:
> > this is pretty slow on reiser, at least compared with ext2/3, and I  
> > understand that it may be because the find command returns the names  
> > in a non-optimal order (ie readdir order?).
> 
> I think Reiser3 is slow more because with mtime, find has to stat each
> file. 

The two issues are related.

Readdir will return the filenames in hash order. Find will then go and
stat each file, still in hash order.

Problem is, the inodes are not sorted in directory hash order on the
disk. They tend to be in approximate creation order. So, the disk access
pattern is nearly random access, meaning every stat is likely to touch a
new block and readahead is completely useless.
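
A rough illustration of the access-pattern point above, not the patch or
test case discussed in this thread: a minimal Python sketch that lists a
directory in readdir order and then stats the entries sorted by inode
number. The function name is made up for illustration, and it assumes that
inode numbers roughly track on-disk (creation) order, which may not hold
on every filesystem.

```python
import os


def stat_in_inode_order(path):
    """Stat every entry in `path`, visiting entries sorted by inode number.

    Assumption: inode numbers roughly follow on-disk (creation) order, so
    sorting by them makes the stat traffic more sequential than readdir order.
    """
    # os.scandir yields entries in readdir order (hash order on reiserfs3).
    # On Unix, entry.inode() comes from the directory entry itself, so no
    # stat call is made while listing.
    with os.scandir(path) as it:
        entries = [(entry.inode(), entry.name) for entry in it]

    # Reorder the work list by inode number before touching any inode.
    entries.sort()

    results = []
    for ino, name in entries:
        st = os.lstat(os.path.join(path, name))
        results.append((name, ino, st.st_mtime))
    return results
```

Whether this helps in practice depends on how closely inode numbers match
the real block layout, so it would need measuring on the filesystem in
question.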



I once wrote a new hash for reiserfs3 specifically for maildir. This
hash caused files to be ordered in approximate creation order, matching
the inode order much more closely.


You will find both the patch and some benchmark results if you search
the archive (messageid [EMAIL PROTECTED]). It sped up
my test case by a factor of 6. (My test case read all the data too, though;
I don't think I ever tested just "find . -ls".)


In reiserfs3 the usefulness of the hash is limited, as the hash is a
per-filesystem setting. (So it is only useful if you have a dedicated
maildir filesystem.) I've lost track of reiserfs4 features - maybe you
can select hashes per directory now? Or maybe the whole thing is made
obsolete by putting the inodes with the directory entries?




-- 
Ragnar Kjørstad
Software Engineer
Scali - http://www.scali.com
Scaling the Linux Datacenter
