On Sat, May 22, 2010 at 11:33 AM, Bob Friesenhahn
<bfrie...@simple.dallas.tx.us> wrote:
> On Fri, 21 May 2010, Demian Phillips wrote:
>
>> For years I have been running a zpool using a Fibre Channel array with
>> no problems. I would scrub every so often and dump huge amounts of
>> data (tens or hundreds of GB) around and it never had a problem
>> outside of one confirmed (by the array) disk failure.
>>
>> I upgraded to sol10x86 05/09 last year and since then I have
>> discovered any sufficiently high I/O from ZFS starts causing timeouts
>> and off-lining disks. This leads to failure (once rebooted and cleaned
>> all is well) long term because you can no longer scrub reliably.
>
> The problem could be with the device driver, your FC card, or the array
> itself.  In my case, issues I thought were to blame on my motherboard or
> Solaris were due to a defective FC card and replacing the card resolved the
> problem.
>
> If the problem is that your storage array is becoming overloaded with
> requests, then try adding this to your /etc/system file:
>
> * Set device I/O maximum concurrency
> * http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide#Device_I.2FO_Queue_Size_.28I.2FO_Concurrency.29
> set zfs:zfs_vdev_max_pending = 5
>
> Bob
> --
> Bob Friesenhahn
> bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
> GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
>

I've gone back to Solaris 10 11/06.
It's working fine, but I notice some differences in performance that
I think are key to the problem.

With the latest Solaris 10 (u8), throughput according to zpool iostat
was hitting about 115MB/sec, sometimes a little higher.

With 11/06 it maxes out at 40MB/sec.
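
To put those two figures side by side, here is a rough sketch in Python.
The 100 GB transfer size is only an illustrative assumption (the earlier
message mentions moving tens or hundreds of GB), and 1 GB is taken as
1024 MB. The rates are the approximate zpool iostat readings above, not
careful measurements.

```python
# Rough comparison of the two observed zpool iostat throughput figures.
U8_MB_PER_SEC = 115.0      # Solaris 10 u8, approximate peak reading
S1106_MB_PER_SEC = 40.0    # Solaris 10 11/06, observed maximum

# Hypothetical transfer size, chosen for illustration only.
TRANSFER_GB = 100
MB_PER_GB = 1024


def transfer_minutes(size_gb: float, mb_per_sec: float) -> float:
    """Return the minutes needed to move size_gb at a sustained mb_per_sec."""
    return size_gb * MB_PER_GB / mb_per_sec / 60.0


ratio = U8_MB_PER_SEC / S1106_MB_PER_SEC
u8_minutes = transfer_minutes(TRANSFER_GB, U8_MB_PER_SEC)
s1106_minutes = transfer_minutes(TRANSFER_GB, S1106_MB_PER_SEC)

print(f"u8 pushes about {ratio:.1f}x the data rate of 11/06")
print(f"{TRANSFER_GB} GB: {u8_minutes:.0f} min on u8, "
      f"{s1106_minutes:.0f} min on 11/06")
```

So u8 is driving the FC path at nearly three times the rate 11/06 does,
which would at least be consistent with the array being overloaded,
though that is still a guess on my part.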

Both setups are using mpio devices as far as I can tell.

The next step is to go back to u8 and see whether the tuning you
suggested helps. It really looks to me as though the OS is asking too
much of the FC chain I have.

The really puzzling thing is that I have just been told about a brand
new Dell Solaris x86 production box, using current and supported FC
devices and a supported SAN, that gets the same kind of problems when a
scrub is run. I'm going to investigate that and see if we can get a fix
from Oracle, since that box does have a support contract. It may shed
some light on the issue I am seeing on the older hardware.
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
