On Sun, Jul 06, 2003 at 05:48:24PM -0400, Theodore Ts'o wrote:
> On Sun, Jul 06, 2003 at 10:12:03PM +0100, Andrew Suffield wrote:
> > On Sun, Jul 06, 2003 at 10:28:07PM +0200, Koblinger Egmont wrote:
> > > Yes, when saying "random order" I obviously meant "in the order readdir()
> > > returns them". It's random for me. :-)))
> > >
> > > It can easily be different on different filesystems, or even on the same
> > > type of filesystem with different parameters (e.g. blocksize).
> >
> > I can't think of any reason why changing the blocksize would affect
> > this. Most filesystems return files in the sequence in which they were
> > added to the directory. ext2, ext3, and reiser all do this; xfs is the
> > only one likely to be used on a Debian system which doesn't.
>
> Err, no. If the htree (hash tree) indexing feature is turned on for
> ext2 or ext3 filesystems, they will be returned sorted by the hash of the
> filename --- effectively a random order. (Since the hash also
> includes a random, per-filesystem secret in order to avoid
> denial of service attacks by malicious users who might otherwise try
> to create huge numbers of files containing hash collisions.)

I can only presume this is new or obscure, since everything I tried had
the traditional behaviour. Can't see how to turn it on, either.

> I would be very, very surprised if reiserfs returned files in creation
> order.

Some trivial testing indicates that it does. Heck if I know how or
why.

> It is a really, really bad assumption to assume that files will be
> returned in the same order as they were created.

However, there's no real need to assume that; it was just an example.
As long as the sequence is more or less stable (which it should be, for
btrees; don't know about htree) then rsync won't be perturbed.

> > On ext2, as an example, stat()ting or open()ing a directory of 10000
> > files in the order returned by readdir() will be vastly quicker than
> > in some other sequence (like, say, bytewise lexicographic) due to the
> > way in which the filesystem looks up inodes. This has caused
> > significant performance issues for bugs.debian.org in the past.
>
> If you are using HTREE, and want to do a readdir() scan followed by
> something which opens or stat's all of the files, you will very badly
> want to sort the returned directory entries by the inode number
> (de->d_ino). Otherwise, the order returned by readdir() will be
> effectively random, with the resulting loss of performance which you
> alluded to because the filesystem needs to randomly seek and read all
> around the inode table.

Hmm, that's going to cause some trouble if htree becomes common. Is
there any way to test for this at runtime?

> The good news is that this particular optimization of sorting by inode
> number should work for all filesystems, and should speed up xfs as
> well as ext2/3 with HTREE.

What about ext[23] without htree? Mucking with the order returned by
readdir() has historically caused problems there...

-- 
  .''`.  ** Debian GNU/Linux ** | Andrew Suffield
 : :' :  http://www.debian.org/ | Dept. of Computing,
 `. `'                          | Imperial College,
   `-             -><-          | London, UK
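
For illustration, the sort-by-inode scan suggested in the quoted text could look roughly like the sketch below. It is written in Python rather than C for brevity, and the function names (names_by_inode, lstat_all) are hypothetical, not taken from any existing tool. It assumes that the inode number reported by the directory scan matches the one the later lstat() will touch, which should hold for ordinary files but not necessarily for mount points.

```python
import os


def names_by_inode(path):
    """Return the entry names in `path`, ordered by inode number.

    os.scandir() exposes the inode number from the directory entry
    itself (d_ino on Unix), so no per-file stat() is needed to sort.
    """
    with os.scandir(path) as entries:
        pairs = [(entry.inode(), entry.name) for entry in entries]
    pairs.sort()  # ascending inode number; name breaks ties
    return [name for _, name in pairs]


def lstat_all(path):
    """lstat() every entry in `path`, visiting them in inode order."""
    return {
        name: os.lstat(os.path.join(path, name))
        for name in names_by_inode(path)
    }
```

Whether this actually helps on a given filesystem would have to be measured; the sketch only shows the ordering step, not a benchmark.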