Re: zfs hang

2014-10-09 Thread Andriy Gapon
On 10/10/2014 04:27, Steve Wills wrote:
> Dying drives
> shouldn't cause panic, right?

There can be different shades of dying.  Returning errors is one thing, hanging
is a different thing.  There is a good reason why the system panics in that
case.  I am surprised, though, that the hang was not detected in the lower
layers like the ahci driver or CAM.

-- 
Andriy Gapon
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: zfs hang

2014-10-09 Thread Steve Wills
On Fri, Oct 10, 2014 at 02:35:14AM +0100, Steven Hartland wrote:
> 
> - Original Message - 
> From: "Steve Wills" 
> To: "Andriy Gapon" 
> Cc: ; 
> Sent: Friday, October 10, 2014 2:27 AM
> Subject: Re: zfs hang
> 
> 
> > On Wed, Oct 08, 2014 at 08:55:26AM +0300, Andriy Gapon wrote:
> >> On 08/10/2014 03:40, Steve Wills wrote:
> >> > Hi,
> >> > 
> >> > Not sure which thread this belongs to, but I have a zfs hang on one
> >> > of my boxes running r272152. Running procstat -kka looks like:
> >> > 
> >> > http://pastebin.com/szZZP8Tf
> >> > 
> >> > My zpool commands seem to be hung in spa_errlog_lock while others
> >> > are hung in zfs_lookup. Suggestions?
> >> 
> >> There are several threads in zio_wait.  If this is their permanent
> >> state then there is some problem with I/O somewhere below ZFS.
> > 
> > Thanks for the feedback. It seems one of my disks is dying, I rebooted
> > and it came up OK, but today I got:
> > 
> >  panic: I/O to pool 'rpool' appears to be hung on vdev guid . at '/dev/ada0p3'
> > 
> > I have screenshots and backtrace if anyone is interested. Dying drives
> > shouldn't cause panic, right?
> 
> It's the deadman timer kicking in, so yes, that's expected.
> 
> The following sysctls control this behaviour if you want to try and recover:
> vfs.zfs.deadman_synctime_ms: 100
> vfs.zfs.deadman_checktime_ms: 5000
> vfs.zfs.deadman_enabled: 1

Ah, ok. This pool has two disks, mirrored. I think one of them is dying: the
BIOS gives a SMART error on startup, but it still uses the disk fine. From what
I read of the zfs deadman design, it's for when the controller is acting up. So
I'm confused. Maybe this means both disks are dying?

Steve


Re: zfs hang

2014-10-09 Thread Steven Hartland


- Original Message -
From: "Steve Wills"
To: "Andriy Gapon"
Cc: ;
Sent: Friday, October 10, 2014 2:27 AM
Subject: Re: zfs hang

> On Wed, Oct 08, 2014 at 08:55:26AM +0300, Andriy Gapon wrote:
>> On 08/10/2014 03:40, Steve Wills wrote:
>> > Hi,
>> > 
>> > Not sure which thread this belongs to, but I have a zfs hang on one
>> > of my boxes running r272152. Running procstat -kka looks like:
>> > 
>> > http://pastebin.com/szZZP8Tf
>> > 
>> > My zpool commands seem to be hung in spa_errlog_lock while others
>> > are hung in zfs_lookup. Suggestions?
>> 
>> There are several threads in zio_wait.  If this is their permanent
>> state then there is some problem with I/O somewhere below ZFS.
> 
> Thanks for the feedback. It seems one of my disks is dying, I rebooted
> and it came up OK, but today I got:
> 
>  panic: I/O to pool 'rpool' appears to be hung on vdev guid . at '/dev/ada0p3'
> 
> I have screenshots and backtrace if anyone is interested. Dying drives
> shouldn't cause panic, right?


It's the deadman timer kicking in, so yes, that's expected.

The following sysctls control this behaviour if you want to try and recover:
vfs.zfs.deadman_synctime_ms: 100
vfs.zfs.deadman_checktime_ms: 5000
vfs.zfs.deadman_enabled: 1
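
A rough way to read these three knobs, as a simplified Python model rather
than the kernel code. It assumes the deadman works by comparing the age of the
oldest outstanding I/O on each vdev against deadman_synctime_ms, and repeats
that comparison every deadman_checktime_ms. The names DeadmanConfig,
hung_vdevs and next_check_ms are made up for this illustration.

```python
from dataclasses import dataclass


@dataclass
class DeadmanConfig:
    enabled: bool       # models vfs.zfs.deadman_enabled
    synctime_ms: int    # models vfs.zfs.deadman_synctime_ms
    checktime_ms: int   # models vfs.zfs.deadman_checktime_ms


def hung_vdevs(oldest_io_start_ms, now_ms, cfg):
    """Return the vdevs whose oldest outstanding I/O is older than synctime_ms.

    oldest_io_start_ms maps a vdev name to the time (in ms) at which its
    oldest still-pending I/O was issued. A non-empty result is the case in
    which the real deadman would panic (assumed behaviour, see lead-in).
    """
    if not cfg.enabled:
        return []
    return sorted(
        vdev
        for vdev, started in oldest_io_start_ms.items()
        if now_ms - started > cfg.synctime_ms
    )


def next_check_ms(now_ms, cfg):
    """Time at which the check would run again (assumed periodic)."""
    return now_ms + cfg.checktime_ms
```

In this model a drive that returns errors quickly never trips the check,
because its I/Os complete (with an error) and stop being outstanding; only an
I/O that never completes ages past the threshold. Setting enabled to False
corresponds to turning the panic off while trying to recover.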

   Regards
   Steve




Re: zfs hang

2014-10-09 Thread Steve Wills
On Wed, Oct 08, 2014 at 08:55:26AM +0300, Andriy Gapon wrote:
> On 08/10/2014 03:40, Steve Wills wrote:
> > Hi,
> > 
> > Not sure which thread this belongs to, but I have a zfs hang on one of
> > my boxes running r272152. Running procstat -kka looks like:
> > 
> > http://pastebin.com/szZZP8Tf
> > 
> > My zpool commands seem to be hung in spa_errlog_lock while others are
> > hung in zfs_lookup. Suggestions?
> 
> There are several threads in zio_wait.  If this is their permanent state then
> there is some problem with I/O somewhere below ZFS.

Thanks for the feedback. It seems one of my disks is dying, I rebooted and it
came up OK, but today I got:

  panic: I/O to pool 'rpool' appears to be hung on vdev guid . at '/dev/ada0p3'

I have screenshots and backtrace if anyone is interested. Dying drives
shouldn't cause panic, right?

Steve


Re: zfs hang

2014-10-07 Thread Andriy Gapon
On 08/10/2014 03:40, Steve Wills wrote:
> Hi,
> 
> Not sure which thread this belongs to, but I have a zfs hang on one of my
> boxes running r272152. Running procstat -kka looks like:
> 
> http://pastebin.com/szZZP8Tf
> 
> My zpool commands seem to be hung in spa_errlog_lock while others are hung in
> zfs_lookup. Suggestions?

There are several threads in zio_wait.  If this is their permanent state then
there is some problem with I/O somewhere below ZFS.
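
To tell whether that state is permanent, one option is to take two procstat
snapshots some time apart and compare how many threads have zio_wait on their
kernel stack. A minimal Python sketch follows. It assumes each stack frame in
the procstat -kk output is printed as name+0xoffset, and the function name
count_threads_in is made up for this illustration.

```python
from collections import Counter


def count_threads_in(procstat_text, frames):
    """Count threads whose kernel stack contains each of the given frames.

    procstat_text is the text output of `procstat -kka`, one thread per
    line. Frames are assumed to look like `name+0xoffset`; lines without
    such tokens (for example the header) contribute nothing.
    """
    counts = Counter()
    for line in procstat_text.splitlines():
        # Collect the function names that appear on this thread's stack.
        names = {tok.split("+", 1)[0] for tok in line.split() if "+0x" in tok}
        for frame in frames:
            if frame in names:
                counts[frame] += 1
    return counts
```

If the zio_wait count stays the same (and the same thread IDs are involved)
across snapshots taken a minute or more apart, the threads are likely stuck
rather than merely busy.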


-- 
Andriy Gapon