Hi Robert,

> Basically, the way RAID-Z works is that it spreads FS block to all
> disks in a given VDEV, minus parity/checksum disks). Because when you
> read data back from zfs before it gets to application zfs will check
> it's checksum (fs checksum, not a raid-z one) so it needs entire fs
> block... which is spread to all data disks in a given vdev.
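
Just to check that I am reading this correctly, here is a toy model of the behaviour described above. It is my own simplification and not actual ZFS code: the function name data_disks_read and the 512-byte sector size are assumptions made purely for illustration, and it ignores parity placement, padding and caching.

```python
import math

def data_disks_read(block_size, data_disks, sector_size=512):
    """Toy model: how many data disks must be read to return one FS block.

    Assumes the block is cut into sectors that are spread over the data
    disks of the vdev, and that the whole block must be read back so the
    FS-level checksum can be verified before data reaches the application.
    """
    sectors = math.ceil(block_size / sector_size)
    # A block never touches more disks than it has sectors.
    return min(data_disks, sectors)

if __name__ == "__main__":
    # Hypothetical vdev with 4 data disks (parity disks not counted).
    for size in (512, 1024, 8 * 1024, 128 * 1024):
        print(size, data_disks_read(size, data_disks=4))
```

Under these assumptions, any block of at least four sectors keeps all four data disks busy for a single read, however small the part the application actually asked for.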
Thank you very much for correcting my long-standing misconception.

On the other hand, isn't there room for improvement here? If it were possible to break large writes into smaller blocks with individual checksums (for instance, writes larger than a preferred_read_size parameter), we could still write all of them in a single RAIDZ(2) line and avoid the RAIDx write penalty. Read performance would improve because we would only need to issue a single read I/O for each requested block, and would need to access the full RAIDZ line only in the degraded RAID case.

I think this could make a big difference for write-once, read-many, random-access applications such as DSS systems.

Is this feasible at all?

Nils
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss