On Mar 24, 2010, at 11:43 PM, Tom Keiser wrote:

>> Our estimate too. But before drilling down, it seemed worth checking if 
>> anyone else has a similar server - ext3 with 14,000 or more volumes in a 
>> single vice partition - and has seen a difference. Note, tho, that it's not 
>> #inodes or total disk usage in the partition. The servers that exhibited the 
>> problem had a large number of mostly empty volumes.
>> 
> 
> Sure.  Makes sense.   The one thing that does come to mind is that
> regardless of the number of inodes, ISTR some people were having
> trouble with ext performance when htree indices were turned on because
> spatial locality of reference against the inode tables goes way down
> when you process files in the order returned by readdir(), since
> readdir() in htree mode returns files in hash chain order rather than
> more-or-less inode order.  This could definitely have a huge impact on
> the salvager [especially GetVolumeSummary(), and to a lesser extent
> ListViceInodes() and friends].  I'm less certain how it would affect
> things in the volserver, but it would certainly have an effect on
> operations which delete clones, since the nuke code also calls
> ListViceInodes().
> 
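
For illustration only, here is a minimal Python sketch of the usual workaround for the readdir() ordering issue described above: collect the directory entries first, sort them by inode number, and only then stat() them, so that inode-table reads are closer to sequential. This is not OpenAFS code; the function names are hypothetical, and whether the salvager or volserver would actually benefit is an untested assumption.

```python
import os


def entries_in_inode_order(path):
    """Return (name, inode) pairs for `path`, sorted by inode number."""
    with os.scandir(path) as it:
        # On POSIX, DirEntry.inode() is taken from the directory entry
        # itself, so no per-file stat() is issued while collecting.
        pairs = [(entry.name, entry.inode()) for entry in it]
    pairs.sort(key=lambda pair: pair[1])
    return pairs


def stat_all_sorted(path):
    """stat() every entry in inode order and return {name: size}."""
    sizes = {}
    for name, _inode in entries_in_inode_order(path):
        st = os.stat(os.path.join(path, name), follow_symlinks=False)
        sizes[name] = st.st_size
    return sizes
```

The sort costs memory proportional to the directory size, which is the trade-off against processing entries in the hash order that readdir() returns.
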
> In addition, with regard to ext htree indices I'll pose the
> (completely untested) hypothesis that htree indices aren't necessarily
> a net win for the namei workload.  Given that namei goes to great lengths
> to avoid large directories (with the notable exception of the /vicepXX
> root dir itself), it is conceivable that htree overhead is actually a
> net loss.  I don't know for sure, but I'd say it's worth doing further
> study.  In a volume with files>>dirs you're going to see on the order
> of ~256 files per namei directory.  Certainly a linear search of on
> average 128 entries is expensive, but it may be worth verifying this
> empirically because we don't know how much overhead htree and its
> side-effects produce.  Regrettably, there don't seem to be any
> published results on the threshold above which htree becomes a net
> win...
> 
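
One rough way to check the ~256-entry case empirically is sketched below in Python: populate a directory with a given number of empty files, then time name lookups in shuffled order. The helper name and its parameters are hypothetical, and the sketch assumes it is run once on a file system with dir_index enabled and once on one without it. It measures only lookup latency, not the other htree side effects mentioned above.

```python
import os
import random
import time


def time_lookups(directory, n_files=256, rounds=1, seed=0):
    """Create n_files empty files in `directory`, then return the mean
    time in seconds of one stat() by name, taken in shuffled order."""
    names = ["f%04d" % i for i in range(n_files)]
    for name in names:
        with open(os.path.join(directory, name), "w"):
            pass  # empty file; only the directory entry matters here

    order = names * rounds
    random.Random(seed).shuffle(order)

    start = time.perf_counter()
    for name in order:
        os.stat(os.path.join(directory, name))
    elapsed = time.perf_counter() - start
    return elapsed / len(order)
```

With a warm cache the numbers mostly reflect in-memory lookups, so a meaningful comparison would probably need the file system remounted between populating and timing; as written, the sketch does not do that.
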
> Finally, you did tune2fs -O dir_index <dev> before populating the file
> system, right?

Didn't try that one, no.

Good suggestions, Tom. We're working on duplicating this in one of our test 
cells; assuming we can, we'll try these out and see what actually works.

Steve

_______________________________________________
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info
