Bob Friesenhahn wrote:
> On Sat, 15 Mar 2008, Richard Elling wrote:
>>
>> My observation is that each metaslab is, by default, 1 MByte in size.
>> Space on each top-level vdev is allocated metaslab by metaslab, and ZFS
>> tries to fill a top-level vdev's metaslab before moving on to another
>> one.  So you should see eight 128 kByte allocations per top-level vdev
>> before the next top-level vdev is allocated.
>>
>> That said, the actual iops are sent in parallel.  So it is not 
>> unusual to see
>> many, most, or all of the top-level vdevs concurrently busy.
>>
>> Does this match your experience?
>
> I do see that all the devices are quite evenly busy.  There is no 
> doubt that the load balancing is quite good.  The main question is 
> whether there is any actual "striping" going on (breaking the data 
> into smaller chunks), or if the algorithm is simply load balancing. 
> Striping trades IOPS for bandwidth.

By my definition of striping, yes, it is going on.  But there are
different ways to spread the data.  Because of the way writes are
handled, ZFS rewards devices which can provide good sequential write
bandwidth, like disks.  Reads are another story: they read from
where the data is, which in turn depends on the conditions at
write time.
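
To make that concrete, here is a toy Python sketch of the allocation
pattern described above (1 MByte metaslabs, 128 kByte blocks, move on
to the next top-level vdev once the current metaslab is consumed).  It
only illustrates the pattern, not how the real allocator works:

METASLAB_SIZE = 1 * 1024 * 1024   # 1 MByte, the default mentioned above
BLOCK_SIZE    = 128 * 1024        # one 128 kByte record

def allocate(n_vdevs, n_blocks):
    """Return the vdev index chosen for each of n_blocks allocations."""
    placements = []
    vdev = 0          # current top-level vdev
    used = 0          # bytes used in that vdev's current metaslab
    for _ in range(n_blocks):
        if used + BLOCK_SIZE > METASLAB_SIZE:
            vdev = (vdev + 1) % n_vdevs   # move on to the next vdev
            used = 0
        placements.append(vdev)
        used += BLOCK_SIZE
    return placements

# With 4 vdevs this prints runs of eight allocations per vdev, the
# 8 x 128 kByte pattern mentioned above.
print(allocate(n_vdevs=4, n_blocks=40))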

The other behaviour you may see is that reads and writes are
coalesced when possible.  At the device level you may see your
smaller blocks being combined into larger I/Os.
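
As a rough sketch of what that coalescing means, assume a sorted queue
of (offset, length) requests for one device; adjacent requests can be
merged into a single larger I/O before being issued.  The real merging
happens in the ZFS I/O pipeline and the drivers, so treat this purely
as an illustration:

def coalesce(requests, max_io=1024 * 1024):
    """requests: list of (offset, length) tuples, sorted by offset."""
    merged = []
    for off, length in requests:
        if merged:
            prev_off, prev_len = merged[-1]
            # Merge if this request starts where the previous one ends
            # and the combined I/O stays under the size cap.
            if off == prev_off + prev_len and prev_len + length <= max_io:
                merged[-1] = (prev_off, prev_len + length)
                continue
        merged.append((off, length))
    return merged

# Eight contiguous 128 kByte writes become one 1 MByte device I/O.
reqs = [(i * 128 * 1024, 128 * 1024) for i in range(8)]
print(coalesce(reqs))   # [(0, 1048576)]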

>
> Using my application, I did some tests today.  The application was 
> used to do a balanced read/write of about 500GB of data in some tens 
> of thousands of reasonably large files.  The application sequentially 
> reads a file, then sequentially writes a file.  Several copies (2-6) 
> of the application were run at once for concurrency.  What I noticed 
> is that with hardly any CPU being used, the read+write bandwidth 
> seemed to be bottlenecked at about 280MB/second, with 'zpool iostat' 
> showing very balanced I/O between the reads and the writes.

But where is the bottleneck?  iostat will show bottlenecks in the
physical disks and channels.  vmstat or mpstat will show the
bottlenecks in the CPUs.  Seeing whether the app itself is the
bottleneck will require some analysis of the app.  Is it spending
its time blocked on I/O?
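
One crude way to answer that last question at the application level is
to time the read, process, and write phases separately.  If nearly all
of the wall-clock time is spent in the read and write calls, the app is
blocked on I/O and the disks and channels are the place to look.  The
file names and the process() step below are placeholders, not anything
from your setup:

import time

def copy_with_timing(src_path, dst_path, process, bufsize=128 * 1024):
    """Copy src to dst through process(), reporting time per phase."""
    t_read = t_proc = t_write = 0.0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            t0 = time.time()
            buf = src.read(bufsize)
            t_read += time.time() - t0
            if not buf:
                break
            t0 = time.time()
            buf = process(buf)            # the app's own processing step
            t_proc += time.time() - t0
            t0 = time.time()
            dst.write(buf)
            t_write += time.time() - t0
    print("read %.1fs  process %.1fs  write %.1fs" % (t_read, t_proc, t_write))

# Hypothetical usage, with a no-op processing step:
# copy_with_timing("/tank/in/file1", "/tank/out/file1", process=lambda b: b)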

>
> The system I set up is performing quite a bit differently than I 
> anticipated.  The I/O is bottlenecked and I find that my application 
> can do significant processing of the data without significantly 
> increasing the application run time.  So CPU time is almost free.
>
> If I were to assign a smaller block size for the filesystem, would that 
> provide more of the benefits of striping, or would it be detrimental to 
> performance due to the number of I/Os?

I would not expect to see much difference, but the proof is in the pudding.
Let us know what you find.
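
If you want to taste the pudding, one simple experiment is to run the
same sequential write-then-read against filesystems whose recordsize
has been set differently (for example, zfs set recordsize=32k on one of
them) and compare the throughput.  The paths and sizes below are
placeholders, and note that the read pass will largely be satisfied
from the ARC unless the file is bigger than memory:

import os
import time

def sequential_throughput(path, total_bytes=1 << 30, bufsize=128 * 1024):
    """Write then read total_bytes sequentially; return MB/s for each."""
    buf = b"\0" * bufsize
    t0 = time.time()
    with open(path, "wb") as f:
        written = 0
        while written < total_bytes:
            f.write(buf)
            written += bufsize
        f.flush()
        os.fsync(f.fileno())
    write_mbs = total_bytes / (time.time() - t0) / 1e6

    t0 = time.time()
    with open(path, "rb") as f:
        while f.read(bufsize):
            pass
    read_mbs = total_bytes / (time.time() - t0) / 1e6
    return write_mbs, read_mbs

# Hypothetical paths, one per recordsize being compared:
# print(sequential_throughput("/tank/default128k/testfile"))
# print(sequential_throughput("/tank/recsize32k/testfile"))
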
 -- richard

