Re: [zfs-discuss] Re: Re[2]: RAIDZ2 vs. ZFS RAID-10
Anton B. Rang writes:

>> In our recent experience, RAID-5 — due to the two reads, an XOR
>> calculation, and a write op per write instruction — is usually much
>> slower than RAID-10 (two write ops). Any advice is greatly
>> appreciated. RAIDZ and RAIDZ2 do not suffer from this malady (the
>> RAID-5 write hole).
>
> 1. This isn't the write hole.
>
> 2. RAIDZ and RAIDZ2 suffer from read-modify-write overhead when
>    updating a file in writes of less than 128K, but not when writing
>    a new file or issuing large writes.

I don't think this is stated correctly. All filesystems incur a
read-modify-write when an application updates a portion of a block.
The read I/O occurs only if the block is not already in the memory
cache, and the write is potentially deferred, so multiple block
updates may occur per write I/O. This is not RAID-Z specific.

ZFS stores files smaller than 128K (or smaller than the filesystem
recordsize) as a single block. Larger files are stored as multiple
recordsize blocks. For RAID-Z, a block spreads onto all devices of a
group.

-r

This message posted from opensolaris.org

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
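The I/O-count argument above can be sketched with simple arithmetic. This is a simplified model I'm adding for illustration (the function names are mine, not ZFS code), counting device operations per logical write under the usual assumptions: RAID-5 partial-stripe updates need two reads and two writes, mirrors need two writes, and a RAID-Z full-stripe write needs no prior reads.

```python
def raid5_small_write_ios():
    # Partial-stripe update: read old data + read old parity,
    # then write new data + write new parity.
    return 2 + 2

def raid10_small_write_ios():
    # Mirror: write the block to both sides of the mirror.
    return 2

def raidz_full_stripe_write_ios(data_devices):
    # RAID-Z writes each block as a full, variable-width stripe,
    # so a new block costs one write per data device plus one
    # parity write, with no prior reads.
    return data_devices + 1

print(raid5_small_write_ios())          # 4
print(raid10_small_write_ios())         # 2
print(raidz_full_stripe_write_ios(4))   # 5
```

The point of the model is that RAID-Z avoids the read half of the RAID-5 penalty for new or full-record writes; the read-modify-write discussed above happens a level up, at the ZFS record, not at the parity stripe.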
Re: [zfs-discuss] Re: Re[2]: RAIDZ2 vs. ZFS RAID-10
On Jan 4, 2007, at 10:26 AM, Roch - PAE wrote:

> All filesystems will incur a read-modify-write when application is
> updating portion of a block.

For most Solaris file systems it is the page size, rather than the
block size, that affects read-modify-write; hence 8K (SPARC) or 4K
(x86/x64) writes do not require read-modify-write for UFS/QFS, even
when larger block sizes are used. When direct I/O is enabled, UFS and
QFS will write directly to disk (without reading) for 512-byte-aligned
I/O.

> The read I/O only occurs if the block is not already in memory
> cache.

Of course.

> ZFS stores files less than 128K (or less than the filesystem
> recordsize) as a single block. Larger files are stored as multiple
> recordsize blocks.

So appending to any file of less than 128K will result in a
read-modify-write cycle (modulo read caching), while a write to a
file which is not recordsize-aligned (by default, 128K) results in a
read-modify-write cycle.

> For RAID-Z a block spreads onto all devices of a group.

Which means that all devices are involved in the read and the write;
except, as I believe Casper pointed out, that very small blocks (less
than 512 bytes per data device) will reside on a smaller set of disks.

Anton
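Casper's small-block exception can be illustrated with a rough model of RAID-Z's dynamic striping. This is my own simplification for illustration (the function name is hypothetical, and real RAID-Z layout also places parity and padding): a block is split into 512-byte sectors and spread across at most the group's data devices, so a block smaller than 512 bytes per data device lands on fewer disks.

```python
import math

SECTOR = 512  # bytes per device sector, as in the message above

def raidz_data_devices_touched(block_size, data_devices):
    """Simplified model: the block's data is split into 512-byte
    sectors and spread across up to `data_devices` disks; tiny
    blocks occupy fewer disks than the full group."""
    sectors = max(1, math.ceil(block_size / SECTOR))
    return min(data_devices, sectors)

# A full 128K record on a group with 4 data devices touches all 4:
print(raidz_data_devices_touched(128 * 1024, 4))  # 4
# A 1K block is only 2 sectors, so only 2 data devices:
print(raidz_data_devices_touched(1024, 4))        # 2
# A 256-byte block fits in one sector on a single data device:
print(raidz_data_devices_touched(256, 4))         # 1
```

This is why "a block spreads onto all devices of a group" holds for typical record sizes but not for very small blocks.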
Re: [zfs-discuss] Re: Re[2]: RAIDZ2 vs. ZFS RAID-10
Hi Anton,

Thank you for the information. That is exactly our scenario: we're 70%
write-heavy, and given the nature of the workload, our typical writes
are 10-20K. Again, the information is much appreciated.

Best Regards,
Jason

On 1/3/07, Anton B. Rang [EMAIL PROTECTED] wrote:

>> In our recent experience, RAID-5 — due to the two reads, an XOR
>> calculation, and a write op per write instruction — is usually much
>> slower than RAID-10 (two write ops). Any advice is greatly
>> appreciated. RAIDZ and RAIDZ2 do not suffer from this malady (the
>> RAID-5 write hole).
>
> 1. This isn't the write hole.
>
> 2. RAIDZ and RAIDZ2 suffer from read-modify-write overhead when
>    updating a file in writes of less than 128K, but not when writing
>    a new file or issuing large writes.
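For a workload like this (10-20K updates against larger files), the read-modify-write cost Anton describes can be estimated with a back-of-the-envelope model. This is my own rough sketch, not measured ZFS behavior (real amplification depends on ARC caching and transaction-group coalescing): updating part of a record means reading the whole record (unless cached) and writing the whole record back.

```python
def rmw_io_bytes(recordsize, cached=False):
    """Device I/O bytes to update data inside one record:
    read the whole record (unless cached), then write it back."""
    read = 0 if cached else recordsize
    return read + recordsize

def amplification(write_size, recordsize, cached=False):
    """Ratio of device I/O to application data written."""
    return rmw_io_bytes(recordsize, cached) / write_size

# A 16K update into the default 128K record moves 256K of I/O:
print(amplification(16 * 1024, 128 * 1024))  # 16.0
# With recordsize matched to the write size, the same update moves 32K:
print(amplification(16 * 1024, 16 * 1024))   # 2.0
```

Under this model, matching the filesystem recordsize to the dominant write size is the lever that shrinks the penalty, which is consistent with the thread's point that the overhead appears only for sub-record updates.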