On Jan 5, 2010, at 8:49 AM, Robert Milkowski wrote:

On 05/01/2010 16:00, Roch wrote:
That said, I truly am for an evolution for random read
workloads. Raid-Z on 4K sectors is quite appealing. It means
that small objects become nearly mirrored with good random read
performance while large objects are stored efficiently.

Have you got any benchmarks available (comparing 512B to 4K to classical RAID-5)?

Not fair!  A 512 byte random write workload will absolutely clobber a
RAID-5 implementation. It is the RAID-5 pathological worst case.
For many arrays, even a 4 KB random write workload will suck most
heinously.
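
To see why, count the disk I/Os behind one small host write (a rough
sketch, assuming a classic RAID-5 stripe with no NV write cache):

    read old data      1 I/O
    read old parity    1 I/O
    write new data     1 I/O
    write new parity   1 I/O
    -------------------------
    4 disk I/Os per 512 byte random write (plus the parity XOR)

raidz sidesteps that read-modify-write by always writing full,
variable-width stripes -- which is exactly what buys it the read
inflation described next.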

The raidz pathological worst case is a random read workload against a
many-column raidz where files have 128 KB records. Each record is striped
across all of the data columns and carries a single checksum, so the whole
record must be read back to verify it: one logical read fans out into a
read on every column. That read inflation is why it makes sense to match
recordsize to fixed-record workloads. This includes CIFS workloads, which
use 4 KB records. It is also why adding more columns to a raidz does not
improve performance for large records. Hence the 3-to-9-disk raidz limit
recommended in the zpool man page.
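
For a fixed-record workload the tuning is a one-liner; it only affects
files written after the change, and the dataset name here is hypothetical:

    # match the dataset's recordsize to the application's record size
    zfs set recordsize=4k tank/records
    # confirm the setting
    zfs get recordsize tank/records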

http://www.baarf.com

The problem is that while RAID-Z is really good for some workloads, it is really bad for others. Sometimes L2ARC can effectively mitigate the problem, but for some workloads it won't (the working set is simply too large). In such environments RAID-Z2 offers much worse performance than a similarly configured NetApp (RAID-DP, same number of disk drives).

If ZFS provided another RAID-5/RAID-6-style protection scheme with different characteristics -- slower writes, but much faster reads (comparable to RAID-DP) -- some customers would be very happy. Then perhaps a new kind of cache device would be needed to buffer writes to NV storage and make writes faster, as "HW" arrays have been doing for years.

A different parity layout still does not address the record checksum: the
whole record has to be read to verify it. This is only a problem for small,
random read workloads, which means L2ARC is a good solution.
If L2ARC is a set of HDDs, then you could gain some advantage, but IMHO
HDD and good performance do not belong in the same sentence anymore.
Game over -- SSDs win.
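
Adding an SSD as L2ARC is equally simple (pool and device names are
hypothetical):

    # attach a read cache (L2ARC) device to an existing pool
    zpool add tank cache c1t2d0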

A possible *workaround* is to use SVM to set up RAID-5 and create a zfs pool on top of it.
How does SVM handle the R5 write hole? IIRC SVM doesn't offer RAID-6.

IIRC, SVM does a prewrite. Dog slow. Also, SVM is, AFAICT, on life support. The source is out there if anyone wants to carry it forward. Actually, many of us
would be quite happy for SVM to fade from our memory :-)
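
For anyone who insists on trying the quoted workaround anyway, it would
look roughly like this (from memory, untested, device names hypothetical):

    # build a RAID-5 metadevice with SVM, then a plain zfs pool on top
    metainit d20 -r c1t1d0s0 c1t2d0s0 c1t3d0s0 -i 32k
    zpool create tank /dev/md/dsk/d20
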
 -- richard

