Chris,

Thanks for providing the details and the dump.
I shall look into this and update with my findings.

Thanks and regards,
Sanjeev

On Sun, Aug 09, 2009 at 05:53:12PM -0700, Chris Baker wrote:
> Hi Sanjeev
> 
> OK - had a chance to do more testing over the weekend. Firstly some extra 
> data:
> 
> With both mirror drives moved to ICH10R ports, a sudden disk power-off made 
> the mirror fault cleanly to the remaining drive with no problem.
> 
> A one-drive pool on the ICH10R, powered off under heavy write traffic, 
> causes the zpool/zfs hangs described above.
> 
> ZPool being tested is called "Remove" and consists of:
> c7t2d0s0 - attached to the ICH10R
> c8t0d0s0 - second disk attached to the Si3132 card with the Si3124 driver
> 
> This leads me to the following suspicions:
> (1) We have an Si3124 issue in not always detecting the drive removal, or in 
> failing to pass that info back to ZFS, even though we know the kernel noticed
> (2) In the event that the only disk in a pool goes faulted, the zpool/zfs 
> subsystem will block indefinitely waiting to get rid of the pending writes.
> 
> I've just recabled back to one disk on ICH10R and one on Si3132 and tried the 
> sudden off with the Si drive:
> 
> *) First try - mirror faulted and IO continued - good news but confusing
> *) Second try - zfs/zpool hung, couldn't even get a zpool status, tried a 
> savecore but savecore hung moving the data to a separate zpool
> *) Third try - zfs/zpool hung, ran savecore -L to a UFS filesystem I created 
> for that purpose
> 
> After the first try, dmesg shows:
> Aug 10 00:34:41 TS1  SATA device detected at port 0
> Aug 10 00:34:41 TS1 sata: [ID 663010 kern.info] /p...@0,0/pci8086,3...@1c,3/pci1095,7...@0 :
> Aug 10 00:34:41 TS1 sata: [ID 761595 kern.info]         SATA disk device at port 0
> Aug 10 00:34:41 TS1 sata: [ID 846691 kern.info]         model WDC WD5000AACS-00ZUB0
> Aug 10 00:34:41 TS1 sata: [ID 693010 kern.info]         firmware 01.01B01
> Aug 10 00:34:41 TS1 sata: [ID 163988 kern.info]         serial number WD-xxxxxxxxxxxxxx
> Aug 10 00:34:41 TS1 sata: [ID 594940 kern.info]         supported features:
> Aug 10 00:34:41 TS1 sata: [ID 981177 kern.info]          48-bit LBA, DMA, Native Command Queueing, SMART, SMART self-test
> Aug 10 00:34:41 TS1 sata: [ID 643337 kern.info]         SATA Gen2 signaling speed (3.0Gbps)
> Aug 10 00:34:41 TS1 sata: [ID 349649 kern.info]         Supported queue depth 32, limited to 31
> Aug 10 00:34:41 TS1 sata: [ID 349649 kern.info]         capacity = 976773168 sectors
> Aug 10 00:34:41 TS1 fmd: [ID 441519 daemon.error] SUNW-MSG-ID: ZFS-8000-FD, TYPE: Fault, VER: 1, SEVERITY: Major
> Aug 10 00:34:41 TS1 EVENT-TIME: Mon Aug 10 00:34:41 BST 2009
> Aug 10 00:34:41 TS1 PLATFORM:                                  , CSN:                                  , HOSTNAME: TS1
> Aug 10 00:34:41 TS1 SOURCE: zfs-diagnosis, REV: 1.0
> Aug 10 00:34:41 TS1 EVENT-ID: ab7df266-3380-4a35-e0bc-9056878fd182
> Aug 10 00:34:41 TS1 DESC: The number of I/O errors associated with a ZFS device exceeded
> Aug 10 00:34:41 TS1          acceptable levels.  Refer to http://sun.com/msg/ZFS-8000-FD for more information.
> Aug 10 00:34:41 TS1 AUTO-RESPONSE: The device has been offlined and marked as faulted.  An attempt
> Aug 10 00:34:41 TS1          will be made to activate a hot spare if available.
> Aug 10 00:34:41 TS1 IMPACT: Fault tolerance of the pool may be compromised.
> Aug 10 00:34:41 TS1 REC-ACTION: Run 'zpool status -x' and replace the bad device.
> 
> and after the second and third test, just:
> SATA device detached at port 0
> 
> Core files were tar-ed together and bzip2-ed and can be found at:
> 
> http://dl.getdropbox.com/u/1709454/dump.bakerci.200908100106.tar.bz2
> 
> Please let me know if you need any further core/debug. Apologies to readers 
> having all this inflicted by email digest.
> 
> Many thanks
> 
> Chris
> -- 
> This message posted from opensolaris.org
> _______________________________________________
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

-- 
----------------
Sanjeev Bagewadi
Solaris RPE 
Bangalore, India
