Vincent Lefevre <vinc...@vinc17.net> writes:
>On 2015-04-13 16:28:27 -0600, Bob Proulx wrote:
>> Without dir_index an ext filesystem with large directories is slow due
>> to the linear nature of directories.  But with dir_index it should be
>> using a B-tree data structure and should be much faster.

>So, why is it slow?

I don't think dir_index has anything to do with it. An index speeds up
lookups. You are not doing lookups; you are traversing the entire data
structure. A B-tree data structure can take longer to traverse than a
contiguous array data structure due to prefetching generally being
beneficial to arrays, but less so to pointer-based structures.

It's slow because every block of the directory needs to be read to get
the contents, even if every block contains empty entries. You don't
know that until you've read it.

>I also notice slowness with a large maildir directory:

>drwx------ 2 vlefevre vlefevre 8409088 2015-03-24 14:04:33 Mail/oldarc/cur/

>In this one, the files are real (145400 files), but I have a Perl
>script that basically reads the headers and it takes a lot of time
>(several dozens of minutes) after a reboot or dropping the caches
>as you suggested above. With a second run of this script, it just
>takes 8 seconds.

Your large directory is about 3.5 times the size of this one, so we
would expect all things being equal that it would take 30s to read the
larger directory based on the time of reading your maildir.

One thing that is likely not equal is fragmentation. It is quite
possible that your 30MB directory is fragmented across the disk and
involves many seeks to read it all.

If you really want to know if this is the case, use debugfs(8) to have
a look:

  # debugfs /dev/sda1                   # sub sda1 with your device
  debugfs:  blocks /path/to/directory   # path relative to root of filesystem

That will output all the blocks used by the directory, in the order of
the blocks in the directory. You'll be able to see how much seeking
would be needed to read those blocks linearly.

e.g.
  # debugfs /dev/mapper/m500-var
  debugfs 1.42.5 (29-Jul-2012)
  debugfs:  blocks /lib/dpkg/info
  8236 8207 8204 8221 8222 8223 8231 8232 8234 8333 8394 8395 8393 8396
  8399 8400 8402 8747 8913 9258 9289 9311 9433 9405 9432 9452 9407 32084
  32237 32238 32236 32245 32254 9555 9978 9908

You can see the blocks are reasonably contiguous until it jumps up to
the 32000's, and then back to the 9000's. If you see a lot of that in
your large empty directory, you'll find it slow to seek around the whole
lot. (In my case, that's on an SSD, so I don't care.)
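If the block list is too long to eyeball, a rough count of the breaks in
it can stand in for "how much seeking". What follows is only a sketch,
not part of debugfs: the helper names (parse_blocks, contiguous_runs,
count_long_jumps) and the 1000-block threshold for a "long" jump are
assumptions made for illustration. Paste the numbers printed by the
blocks command into the string.

```python
# Rough fragmentation summary for a block list as printed by the
# debugfs "blocks" command. Helper names and the long-jump threshold
# are illustrative assumptions, not anything debugfs provides.

def parse_blocks(text):
    """Extract the block numbers from whitespace-separated text."""
    return [int(tok) for tok in text.split() if tok.isdigit()]

def contiguous_runs(blocks):
    """Group blocks into (start, length) runs of consecutive numbers."""
    runs = []
    for b in blocks:
        if runs and b == runs[-1][0] + runs[-1][1]:
            start, length = runs[-1]
            runs[-1] = (start, length + 1)
        else:
            runs.append((b, 1))
    return runs

def count_long_jumps(blocks, threshold=1000):
    """Count adjacent pairs whose block numbers differ by more than threshold."""
    return sum(1 for a, b in zip(blocks, blocks[1:]) if abs(b - a) > threshold)

# Block list from the /lib/dpkg/info example above.
sample = """
8236 8207 8204 8221 8222 8223 8231 8232 8234 8333 8394 8395 8393 8396
8399 8400 8402 8747 8913 9258 9289 9311 9433 9405 9432 9452 9407 32084
32237 32238 32236 32245 32254 9555 9978 9908
"""

blocks = parse_blocks(sample)
runs = contiguous_runs(blocks)
long_jumps = count_long_jumps(blocks)

print("blocks:", len(blocks))
print("contiguous runs:", len(runs))
print("long jumps:", long_jumps)
```

The closer the number of runs is to the number of blocks, the less of
the directory is laid out sequentially. The long jumps are the ones most
likely to cost a real seek on a spinning disk.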