Re: [zfs-discuss] Re: Re[2]: RAIDZ2 vs. ZFS RAID-10

2007-01-04 Thread Roch - PAE
Anton B. Rang writes:
   In our recent experience, RAID-5 is usually much slower than
   RAID-10 because each write requires two reads, an XOR calculation,
   and a write op, versus RAID-10's two write ops. Any advice is
   greatly appreciated.

   RAIDZ and RAIDZ2 do not suffer from this malady (the RAID5 write hole).
  
  1. This isn't the write hole.
  
  2. RAIDZ and RAIDZ2 suffer from read-modify-write overhead when
  updating a file in writes of less than 128K, but not when writing a
  new file or issuing large writes. 
   

I don't think this is stated correctly.

All filesystems incur a read-modify-write when an application
updates a portion of a block. The read I/O occurs only if the block
is not already in the memory cache. The write is potentially
deferred, and multiple block updates may be coalesced into a single
write I/O.

This is not RAIDZ specific.
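For illustration, here is a minimal user-level sketch of the
read-modify-write cycle a filesystem performs internally for a
partial-block update. The 128K block size and the update_partial()
helper are assumptions for the example, not ZFS code:

    /*
     * Minimal user-level sketch of a read-modify-write cycle,
     * assuming a hypothetical 128K block size.  update_partial()
     * is an illustrative helper, not ZFS internals.
     */
    #include <sys/types.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    #define BLOCKSIZE (128 * 1024)

    int
    update_partial(int fd, off_t off, const char *buf, size_t len)
    {
        off_t blkstart = off - (off % BLOCKSIZE);
        char *block = malloc(BLOCKSIZE);

        if (block == NULL)
            return (-1);
        /* Read: fetch the containing block (skipped if cached). */
        if (pread(fd, block, BLOCKSIZE, blkstart) < 0) {
            free(block);
            return (-1);
        }
        /* Modify: patch only the bytes the application wrote. */
        memcpy(block + (off - blkstart), buf, len);
        /* Write: the whole block goes back out (possibly deferred). */
        if (pwrite(fd, block, BLOCKSIZE, blkstart) < 0) {
            free(block);
            return (-1);
        }
        free(block);
        return (0);
    }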

ZFS stores files smaller than 128K (or smaller than the filesystem
recordsize) as a single block. Larger files are stored as multiple
recordsize blocks.

For RAID-Z, a block is spread across all devices of a group.
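
A rough back-of-the-envelope sketch of that spread, assuming a
hypothetical 5-disk single-parity group and ignoring the allocator's
padding and skip-sector rules:

    /*
     * Back-of-the-envelope spread of one block across a RAID-Z
     * group.  ndisks and the block size are illustrative; the real
     * allocator adds padding and skip sectors this ignores.
     */
    #include <stdio.h>

    int
    main(void)
    {
        int ndisks = 5;                     /* devices in the group */
        int ndata = ndisks - 1;             /* single-parity RAID-Z */
        long sectors = (128 * 1024) / 512;  /* 512-byte columns */
        long per_disk = (sectors + ndata - 1) / ndata;

        printf("%ld data sectors -> ~%ld per data disk across %d "
            "disks, plus ~%ld parity sectors\n", sectors, per_disk,
            ndata, per_disk);
        return (0);
    }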

-r

   


Re: [zfs-discuss] Re: Re[2]: RAIDZ2 vs. ZFS RAID-10

2007-01-04 Thread Anton Rang


On Jan 4, 2007, at 10:26 AM, Roch - PAE wrote:


All filesystems incur a read-modify-write when an application
updates a portion of a block.


For most Solaris file systems it is the page size, rather than
the block size, that affects read-modify-write; hence 8K (SPARC)
or 4K (x86/x64) writes do not require read-modify-write for
UFS/QFS, even when larger block sizes are used.

When direct I/O is enabled, UFS and QFS will write directly to
disk (without reading) for 512-byte-aligned I/O.
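
As a Solaris-specific sketch of that path (the path name and sizes
are made up for the example; directio(3C) is the advisory call UFS
honors):

    /*
     * Solaris-specific sketch: a 512-byte-aligned direct write on
     * UFS via directio(3C).  The path and sizes are made-up
     * examples.
     */
    #include <sys/types.h>
    #include <sys/fcntl.h>      /* directio(), DIRECTIO_ON */
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    int
    main(void)
    {
        char *buf;
        int fd = open("/ufs/testfile", O_WRONLY | O_CREAT, 0644);

        if (fd < 0 || directio(fd, DIRECTIO_ON) < 0) {
            perror("open/directio");
            return (1);
        }
        /*
         * 512-byte-aligned buffer, 512-byte-multiple length and
         * offset: UFS can send this straight to disk, no read.
         */
        if (posix_memalign((void **)&buf, 512, 8192) != 0)
            return (1);
        memset(buf, 'x', 8192);
        if (pwrite(fd, buf, 8192, 0) < 0)
            perror("pwrite");
        free(buf);
        (void) close(fd);
        return (0);
    }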


The read I/O occurs only if the block is not already in the memory cache.


Of course.


ZFS stores files smaller than 128K (or smaller than the filesystem
recordsize) as a single block. Larger files are stored as multiple
recordsize blocks.


So appending to any file smaller than 128K will result in a
read-modify-write cycle (modulo read caching), while a write to a
file which is not record-size-aligned (by default, 128K) also
results in a read-modify-write cycle.
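
Sketching that rule as a small predicate (ignoring the read cache,
and treating recordsize and file size as given inputs rather than
anything queried from ZFS):

    /*
     * Sketch of the rule above as a predicate; ignores caching,
     * and recordsize/filesize are assumed inputs.
     */
    #include <stdio.h>

    static int
    needs_rmw(long off, long len, long filesize, long recordsize)
    {
        long end = off + len;

        if (filesize == 0)
            return (0);   /* brand-new file: nothing old to read */
        if (filesize < recordsize)
            return (1);   /* single sub-record block: any update or
                             append rewrites that block */
        if (off % recordsize != 0 && off < filesize)
            return (1);   /* write starts inside an existing record */
        if (end % recordsize != 0 && end < filesize)
            return (1);   /* write ends inside an existing record */
        return (0);
    }

    int
    main(void)
    {
        /* Append 4K to a 10K file: prints 1 (read-modify-write). */
        printf("%d\n", needs_rmw(10240, 4096, 10240, 131072));
        /* Record-aligned 128K overwrite of a 1M file: prints 0. */
        printf("%d\n", needs_rmw(0, 131072, 1048576, 131072));
        return (0);
    }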



For RAID-Z, a block is spread across all devices of a group.


Which means that all devices are involved in the read and the
write; except, as I believe Casper pointed out, very small blocks
(less than 512 bytes per data device) will reside on a smaller set
of disks.
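
A quick sketch of that point, counting devices touched by one block
in a hypothetical 5-disk single-parity group (skip sectors again
ignored):

    /*
     * Devices touched by one block in a hypothetical 5-disk
     * single-parity RAID-Z group; skip sectors are ignored.
     */
    #include <stdio.h>

    int
    main(void)
    {
        int ndata = 4;      /* 5 disks: 4 data + 1 parity */
        long sizes[] = { 512, 1024, 4096, 131072 };
        size_t i;

        for (i = 0; i < sizeof (sizes) / sizeof (sizes[0]); i++) {
            long cols = (sizes[i] + 511) / 512;   /* data columns */
            long disks = (cols < ndata ? cols : ndata) + 1;
            printf("%6ld-byte block -> %ld of 5 disks\n",
                sizes[i], disks);
        }
        return (0);
    }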

Anton



Re: [zfs-discuss] Re: Re[2]: RAIDZ2 vs. ZFS RAID-10

2007-01-03 Thread Jason J. W. Williams

Hi Anton,

Thank you for the information. That is exactly our scenario. We're
70% write-heavy, and given the nature of the workload, our typical
writes are 10-20K. Again, the information is much appreciated.

Best Regards,
Jason

On 1/3/07, Anton B. Rang [EMAIL PROTECTED] wrote:

 In our recent experience, RAID-5 is usually much slower than
 RAID-10 because each write requires two reads, an XOR calculation,
 and a write op, versus RAID-10's two write ops. Any advice is
 greatly appreciated.

 RAIDZ and RAIDZ2 do not suffer from this malady (the RAID5 write hole).

1. This isn't the write hole.

2. RAIDZ and RAIDZ2 suffer from read-modify-write overhead when updating a file 
in writes of less than 128K, but not when writing a new file or issuing large 
writes.



