Ricardo M. Correia wrote:
> On Thu, 2008-02-07 at 22:51 -0700, Neil Perrin wrote:
>> I believe when a prototype using 1K dnodes was tested it showed an
>> unacceptable (30%?) hit on some benchmarks. So if we can possibly
>> avoid increasing the dnode size (by default) then we should do so.
> 
> Hmm, interesting..
> 
> Do you know the reason for such a performance hit?
> 
> Even with 1K dnode sizes, if the dnodes didn't have any extended 
> attributes and since metadata compression is enabled, the on-disk size 
> of metadnode blocks should remain approximately the same, right?
> 
> Could it be because the metadnode object became twice the size (logical 
> size) and therefore required another level of indirect blocks which, as 
> a consequence, required an additional disk seek for each metadnode block 
> read?

Actually, we tried with both the same (16k) dnode block size and a doubled 
(32k) dnode block size, to see if the "sidefetching" caused by reading in a 
whole block of dnodes was influencing things.  Both performed worse than 
512-byte dnodes (in 16k blocks).  However, the additional space would not 
necessarily be zero-filled; it would be used by additional block pointers (if 
the file size is in certain ranges).  So the additional i/o caused by larger 
blocks could be an issue.
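
For anyone who wants to eyeball the numbers, here's a rough back-of-the-envelope 
sketch (illustrative C only, not ZFS source; the dnode and block sizes are just 
the ones mentioned above) of how the packing changes under each configuration:

#include <stdio.h>

int
main(void)
{
	/*
	 * How many dnodes share a metadnode block, and how many bytes get
	 * pulled in for each dnode we actually wanted (worst case: one
	 * wanted dnode per block).
	 */
	struct layout {
		const char *name;
		int dnode_size;		/* bytes per dnode */
		int block_size;		/* bytes per metadnode block */
	} layouts[] = {
		{ "512-byte dnodes, 16k blocks", 512, 16 * 1024 },
		{ "1k dnodes, 16k blocks", 1024, 16 * 1024 },
		{ "1k dnodes, 32k blocks", 1024, 32 * 1024 },
	};

	for (int i = 0; i < 3; i++)
		printf("%s: %d dnodes/block, %d bytes read per wanted dnode\n",
		    layouts[i].name,
		    layouts[i].block_size / layouts[i].dnode_size,
		    layouts[i].block_size);

	return (0);
}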

Note, the dnode file is always stored with the max levels of indirection, so 
that could not be the issue.
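
To make the indirection point concrete, a quick illustrative calculation (the 
16k block size and fan-out of 128 block pointers per indirect block are assumed 
round numbers for the sketch, not pulled from the code):

#include <stdint.h>
#include <stdio.h>

/*
 * Levels of indirection *needed* for an object of a given logical size,
 * assuming 16k data blocks and a fan-out of 128 block pointers per
 * indirect block (illustrative numbers).  The dnode file is kept at the
 * max depth regardless, so growing it cannot add a level; this just
 * shows that doubling the size rarely crosses a level boundary anyway.
 */
static int
indirect_levels(uint64_t size, uint64_t blksz, int fanout)
{
	uint64_t blocks = (size + blksz - 1) / blksz;
	int levels = 0;

	while (blocks > 1) {
		blocks = (blocks + fanout - 1) / fanout;
		levels++;
	}
	return (levels);
}

int
main(void)
{
	printf("1 GB of dnodes: %d indirect levels\n",
	    indirect_levels(1ULL << 30, 16 * 1024, 128));
	printf("2 GB of dnodes: %d indirect levels\n",
	    indirect_levels(1ULL << 31, 16 * 1024, 128));
	return (0);
}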

Certainly more investigation is needed to determine the impact of 1k dnodes 
on various workloads.  But this is a bit of a diversion from the topic at 
hand -- EAs (or SAs as I call them; "extended attribute" already has a 
different meaning on Solaris).  I haven't heard anyone from Lustre comment on 
my proposed design...

--matt
