Just submitted the bug yesterday, on James's advice, so I don't have a 
number you can refer to yet... the "change request" number is 6894775 if that 
helps or is directly related to the future bug ID.

From what I've seen and read, this problem has been around for a while, but it 
only rears its ugly head under heavy I/O with large filesets, probably related 
to the large metadata sets you spoke of. We are using snv_118 x64, but from 
what I've read here it appears in snv_123 and snv_125 as well.
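For what it's worth, one way to check whether metadata really is the pressure 
point (assuming the standard ZFS arcstats kstat on these builds, with the 
arc_meta_used/arc_meta_limit counters) is to watch the ARC while the heavy 
I/O is running:

  # watch ARC metadata usage against its limit while the load runs;
  # arc_meta_used pinned at arc_meta_limit would support the theory
  kstat -m zfs -n arcstats | grep arc_meta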

We've tried installing SSDs to act as a read cache for the pool to reduce 
metadata hits on the physical disks, and as a last-ditch effort we even tried 
switching to the "latest" LSI-supplied itmpt driver from 2007 (after reading 
http://enginesmith.wordpress.com/2009/08/28/ssd-faults-finally-resolved/) and 
disabling the mpt driver, but we ended up with the same timeout issues. In our 
case, the drives in the JBODs are all WD (model WD1002FBYS-18A6B0) 1TB 7.2k 
SATA drives.
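For reference, this is how we set up the cache devices (the pool name and 
device names below are placeholders, not our actual layout):

  # add two SSDs as L2ARC read cache to the pool
  zpool add tank cache c2t0d0 c2t1d0
  # per-vdev stats every 5 seconds; cache devices are broken out
  # separately, so you can see whether reads are landing on them
  zpool iostat -v tank 5

In our case zpool iostat showed the SSDs taking reads, but the timeouts on the 
spindles persisted regardless.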

In revisiting our architecture, we compared it to Sun's x4540 Thumper 
offering, which uses the same controller with similar (though apparently 
customized) firmware and 48 disks. The difference is that they use 6 LSI1068e 
controllers, each of which has to deal with only 8 disks... obviously better 
for performance, but that architecture could be "hiding" the real I/O issue by 
distributing the load across so many controllers.
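If anyone wants to see how the load spreads (or doesn't) across controllers, 
iostat can aggregate by controller; I believe the -C flag is available on 
these builds:

  # extended stats, descriptive device names, aggregated per
  # controller, sampled every 5 seconds
  iostat -xnC 5

On our single-controller JBOD setup everything piles onto one controller line; 
on a Thumper-style layout the same load would be split six ways.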