Re: RAIDframe performance vs. stripe size

Eduardo Horvath Mon, 14 May 2012 08:56:02 -0700

On Sat, 12 May 2012, Edgar Fuß wrote:

> > In general it won't access just one filesystem block.
> > It will try to readahead 64KB
> Oh, so this declustering seems to make matters even more
> complicated^Winteresting.
> 
> Staying with my example of a 16K fsbsize FFS on a 4+1 disc Level 5
> RAIDframe with a stripe size of 4*16k=64k:
> 
> Suppose a process does something that could immediately be satisfied
> by reading one fs block (probably it matters whether that's a small file,
> a small portion of a large file, a small directory, a portion of a large
> directory, inodes, free list or whatever?). Now, if that, as I understand,
> always causes FFS to in fact issue a 64k request to RAIDframe, this would
> need to read a full stripe and so need all but one disc. So it can't be
> parallelised with another process' request, can it? Does this mean I'm better
> off with a stripe size of 4*64k if I'm after low latency for concurrent
> access?
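
The arithmetic in the question can be sketched as follows. This is a
simplified model, not RAIDframe's actual layout code: the function name
and parameters are made up for illustration, parity rotation is ignored
(so the result is a set of data columns within a stripe, not physical
discs), and requests are assumed to map linearly onto stripe units.

```python
KB = 1024

def data_columns_touched(offset, length, stripe_unit, data_disks):
    """Return the set of data columns a request touches.

    Hypothetical helper: assumes a linear mapping of the array's address
    space onto stripe units and ignores where parity lives.
    """
    first_unit = offset // stripe_unit
    last_unit = (offset + length - 1) // stripe_unit
    return {unit % data_disks for unit in range(first_unit, last_unit + 1)}

# 4+1 RAID 5, 16k per disc (64k stripe): a 64k read spans all 4 data columns.
small_unit = data_columns_touched(0, 64 * KB, 16 * KB, 4)

# 64k per disc (256k stripe): an aligned 64k read stays on one column.
large_unit = data_columns_touched(64 * KB, 64 * KB, 64 * KB, 4)

# The same read starting 16k into a stripe unit straddles two columns.
unaligned = data_columns_touched(16 * KB, 64 * KB, 64 * KB, 4)

print(len(small_unit), len(large_unit), len(unaligned))  # 4 1 2
```

Under these assumptions the larger stripe unit only leaves the other
discs free for concurrent requests when the 64k requests are aligned to
the stripe unit.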


The problem here is NFS, which requires writes to be persistent before 
returning status to the caller.

Under normal operation, ufs will attempt to use the buffer cache in the 
most efficient manner, doing readahead and delaying writes as much as 
possible to maximize the number of clustered operations.

Now if NFS does not do similar clustering on writes (I don't know NFS that 
well, especially V3 and V4, which allegedly have write optimizations), then 
you get the situation where the underlying ufs will try to cluster reads 
(satisfying reads out of the buffer cache is much faster than hitting the 
platters) but write out only single filesystem blocks (to satisfy the NFS 
consistency requirements).  My understanding is that later versions of NFS 
(v3+) have a mechanism for the client side to request writes without the 
consistency guarantee, followed by a separate explicit sync operation.  But 
using those is the responsibility of the NFS client machine.  Of course, if 
all the files are on the order of one filesystem block, clustering won't 
happen at all.
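
To put rough numbers on the difference, here is a toy count of the write
requests reaching the device for one sequentially written file. It is
only a sketch: the function and its parameters are invented for
illustration, the 64k write cluster size is an assumption borrowed from
the readahead figure quoted above, and real ufs/NFS behaviour depends on
much more than this.

```python
import math

KB = 1024

def device_writes(file_bytes, fs_block, cluster, synchronous):
    """Count write requests issued for one sequentially written file.

    Toy model: synchronous writes go out one filesystem block at a time,
    delayed writes are merged into cluster-sized requests.
    """
    if synchronous:
        return math.ceil(file_bytes / fs_block)
    return math.ceil(file_bytes / cluster)

# 1 MB file, 16k filesystem blocks, assumed 64k cluster size.
print(device_writes(1024 * KB, 16 * KB, 64 * KB, synchronous=True))   # 64
print(device_writes(1024 * KB, 16 * KB, 64 * KB, synchronous=False))  # 16

# A file of about one filesystem block gains nothing from clustering.
print(device_writes(16 * KB, 16 * KB, 64 * KB, synchronous=True))     # 1
print(device_writes(16 * KB, 16 * KB, 64 * KB, synchronous=False))    # 1
```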

I think you should attempt to characterize your workload here to determine 
the size of the I/O operations the clients are requesting so you can 
decide if clustering is a benefit to you, and if not, turn it off.  (I 
think it can be tweaked with tunefs(8).)

Eduardo
