On Mar 24, 2010, at 11:43 PM, Tom Keiser wrote:

>> Our estimate too. But before drilling down, it seemed worth checking if
>> anyone else has a similar server - ext3 with 14,000 or more volumes in a
>> single vice partition - and has seen a difference. Note, tho, that it's not
>> #inodes or total disk usage in the partition. The servers that exhibited the
>> problem had a large number of mostly empty volumes.
>>
>
> Sure. Makes sense. The one thing that does come to mind is that
> regardless of the number of inodes, ISTR some people were having
> trouble with ext performance when htree indices were turned on because
> spatial locality of reference against the inode tables goes way down
> when you process files in the order returned by readdir(), since
> readdir() in htree mode returns files in hash chain order rather than
> more-or-less inode order. This could definitely have a huge impact on
> the salvager [especially GetVolumeSummary(), and to a lesser extent
> ListViceInodes() and friends]. I'm less certain how it would affect
> things in the volserver, but it would certainly have an effect on
> operations which delete clones, since the nuke code also calls
> ListViceInodes().
>
> In addition, with regard to ext htree indices I'll pose the
> (completely untested) hypothesis that htree indices aren't necessarily
> a net win for the namei workload. Given that namei goes great lengths
> to avoid large directories (with the notable exception of the /vicepXX
> root dir itself), it is conceivable that htree overhead is actually a
> net loss. I don't know for sure, but I'd say it's worth doing further
> study. In a volume with files>>dirs you're going to see on the order
> of ~256 files per namei directory. Certainly a linear search of on
> average 128 entries is expensive, but it may be worth verifying this
> empirically because we don't know how much overhead htree and its
> side-effects produce. Regrettably, there don't seem to be any
> published results on the threshold above which htree becomes a net
> win...
>
> Finally, you did tune2fs -O dir_index <dev> before populating the file
> system, right?

Didn't try that one, no. Good suggestions, Tom. We're working on duplicating this in one of our test cells; assuming we can we'll try out these and see what actually works.

Steve
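
For anyone who wants to experiment with the readdir() ordering effect Tom describes, here is a minimal sketch in Python of the usual workaround: collect the directory entries first, sort them by inode number, and only then stat() them. It is an illustration only, not the salvager's code; ListViceInodes() and GetVolumeSummary() are not reproduced here, and the helper names entries_in_inode_order() and stat_all() are made up for this example.

```python
import os

def entries_in_inode_order(path):
    """Return (inode, name) pairs for one directory, sorted by inode number."""
    with os.scandir(path) as it:
        # On POSIX systems entry.inode() is taken from the directory entry
        # itself, so this pass does not stat() each file.
        pairs = [(entry.inode(), entry.name) for entry in it]
    # readdir() order on an htree directory follows the name hash; sorting
    # by inode number restores roughly on-disk inode table order.
    pairs.sort()
    return pairs

def stat_all(path):
    """stat() every entry in inode order; returns {name: size in bytes}."""
    sizes = {}
    for _ino, name in entries_in_inode_order(path):
        st = os.stat(os.path.join(path, name), follow_symlinks=False)
        sizes[name] = st.st_size
    return sizes
```

Timing stat_all() against a plain readdir()-order loop on a cold cache, with and without dir_index, would be one way to check whether the ordering actually matters on a given partition.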