Hi Robert,

> Basically, the way RAID-Z works is that it spreads each FS block across
> all disks in a given VDEV (minus the parity/checksum disks). When you
> read data back from zfs, before it gets to the application zfs will check
> its checksum (the fs checksum, not a raid-z one), so it needs the entire
> fs block... which is spread across all data disks in a given vdev.

Thank you very much for correcting my long-time misconception.

On the other hand, isn't there room for improvement here? If it were possible to
break large writes (for instance, those larger than a preferred_read_size
parameter) into smaller blocks with individual checksums, we could still write
all of them in a single RAIDZ(2) line, avoid the RAIDx write penalty, and
improve read performance, because we would only need to issue a single read I/O
for each requested block. The full RAIDZ line would need to be accessed only in
the degraded RAID case.
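
To make the I/O counts concrete, here is a rough back-of-the-envelope sketch in
Python. It is only a simplified model, not how ZFS actually lays out data: the
function names are made up for illustration, preferred_read_size is the
hypothetical parameter mentioned above, and I assume that requests are aligned
and that each individually checksummed sub-block would sit on a single disk.

```python
import math

def ios_current(fs_block_size, n_data_disks, sector_size=512):
    """Disk reads for any read today (simplified model).

    The whole FS block must be fetched to verify its checksum, and the
    block is striped over the data disks of the vdev.
    """
    sectors = math.ceil(fs_block_size / sector_size)
    return min(sectors, n_data_disks)

def ios_proposed(request_size, preferred_read_size):
    """Disk reads under the proposal (assumption, not an existing feature).

    Each sub-block of preferred_read_size bytes carries its own checksum
    and is assumed to live on one disk; the request is assumed aligned.
    """
    return math.ceil(request_size / preferred_read_size)

# Example: 128 KiB FS block on a 4+1 RAID-Z vdev, 8 KiB random read,
# sub-blocks of 32 KiB.
print(ios_current(128 * 1024, 4))        # 4 reads, one per data disk
print(ios_proposed(8 * 1024, 32 * 1024)) # 1 read
```

In this toy model the random read drops from one I/O per data disk to a single
I/O, which is the whole point of the idea.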

I think this could make a big difference for write-once, read-many,
random-access applications such as DSS.

Is this feasible at all?

Nils
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
