Re: [zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed

2008-08-04 Thread Miles Nordin
> "re" == Richard Elling <[EMAIL PROTECTED]> writes: > "pf" == Paul Fisher <[EMAIL PROTECTED]> writes: re> I was able to reproduce this in b93, but might have a re> different interpretation You weren't able to reproduce the hang of 'zpool status'? Your 'zpool status' was after the

Re: [zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed

2008-08-01 Thread Richard Elling
Hi Andy, answer & pointer below... Andrew Hisgen wrote: > Question embedded below... > > Richard Elling wrote: > ... >> If you surf to http://www.sun.com/msg/ZFS-8000-HC you'll >> see words to the effect that, >> The pool has experienced I/O failures. Since the ZFS pool property >> 'failmode

Re: [zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed

2008-08-01 Thread Andrew Hisgen
Question embedded below... Richard Elling wrote: ... > If you surf to http://www.sun.com/msg/ZFS-8000-HC you'll > see words to the effect that, > The pool has experienced I/O failures. Since the ZFS pool property > 'failmode' is set to 'wait', all I/Os (reads and writes) are > blocked. See

Re: [zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed

2008-07-31 Thread Ross Smith
Gave up on ZFS ever recovering. A shutdown attempt hung as expected. I hard-reset the computer. Ross > Date: Wed, 30 Jul 2008 11:17:08 -0700 > From: [EMAIL PROTECTED] > Subject: Re: [zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed > To: [EMAIL PROTECTED]

Re: [zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed

2008-07-30 Thread Jonathan Loran
falling short. > > Ross > > > > Date: Wed, 30 Jul 2008 09:48:34 -0500 > > From: [EMAIL PROTECTED] > > To: [EMAIL PROTECTED] > > CC: zfs-discuss@opensolaris.org > > Subject: Re: [zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed > >

Re: [zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed

2008-07-30 Thread Richard Elling
Peter Cudhea wrote: > Thanks, this is helpful. I was definitely misunderstanding the part that > the ZIL plays in ZFS. > > I found Richard Elling's discussion of the FMA response to the failure > very informative. I see how the device driver, the fault analysis > layer and the ZFS layer are all w

Re: [zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed

2008-07-30 Thread Peter Cudhea
Thanks, this is helpful. I was definitely misunderstanding the part that the ZIL plays in ZFS. I found Richard Elling's discussion of the FMA response to the failure very informative. I see how the device driver, the fault analysis layer and the ZFS layer are all working together. Though the

Re: [zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed

2008-07-30 Thread Neil Perrin
Peter Cudhea wrote: > Your point is well taken that ZFS should not duplicate functionality > that is already or should be available at the device driver level. In > this case, I think it misses the point of what ZFS should be doing that > it is not. > > ZFS does its own periodic commits to

Re: [zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed

2008-07-30 Thread Paul Fisher
Richard Elling wrote: > I was able to reproduce this in b93, but might have a different > interpretation of the conditions. More below... > > Ross Smith wrote: > >> A little more information today. I had a feeling that ZFS would >> continue quite some time before giving an error, and today I'v

Re: [zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed

2008-07-30 Thread Richard Elling
I was able to reproduce this in b93, but might have a different interpretation of the conditions. More below... Ross Smith wrote: > A little more information today. I had a feeling that ZFS would > continue quite some time before giving an error, and today I've shown > that you can carry on wo

Re: [zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed

2008-07-30 Thread Peter Cudhea
Your point is well taken that ZFS should not duplicate functionality that is already or should be available at the device driver level. In this case, I think it misses the point of what ZFS should be doing that it is not. ZFS does its own periodic commits to the disk, and it knows if those

Re: [zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed

2008-07-30 Thread Bob Friesenhahn
On Wed, 30 Jul 2008, Ross Smith wrote: > > I'm not saying that ZFS should be monitoring disks and drivers to > ensure they are working, just that if ZFS attempts to write data and > doesn't get the response it's expecting, an error should be logged > against the device regardless of what the dri

Re: [zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed

2008-07-30 Thread Ross Smith
0 > From: [EMAIL PROTECTED] > To: [EMAIL PROTECTED] > CC: zfs-discuss@opensolaris.org > Subject: Re: [zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed > > On Wed, 30 Jul 2008, Ross wrote: > > > > Imagine you had a raid-z array and pulled a drive as I'm

Re: [zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed

2008-07-30 Thread Bob Friesenhahn
On Wed, 30 Jul 2008, Ross wrote: > > Imagine you had a raid-z array and pulled a drive as I'm doing here. > Because ZFS isn't aware of the removal it keeps writing to that > drive as if it's valid. That means ZFS still believes the array is > online when in fact it should be degraded. If any o

Re: [zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed

2008-07-30 Thread Ross
Well yeah, this is obviously not a valid setup for my data, but if you read my first e-mail, the whole point of this test was that I had seen Solaris hang when a drive was removed from a fully redundant array (five sets of three way mirrors), and wanted to see what was going on. So I started wi

Re: [zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed

2008-07-29 Thread David Collier-Brown
was lost from the pool >> without realising that it simply hasn't mounted and they're actually >> looking at an empty folder. Firstly ZFS should be removing the mount >> point when problems occur, and secondly, ZFS list or ZFS status should >> include

Re: [zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed

2008-07-29 Thread Jonathan Loran
tten. I > can accept that with delayed writes files may occasionally be lost > when a failure happens, but I don't accept that we need to lose all > knowledge of the affected files when the filesystem has complete > knowledge of what is affected. If there are any working file

Re: [zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed

2008-07-29 Thread Ross Smith
ystem drive is unavailable, ZFS could try each pool in turn and attempt to store the log there. In fact e-mail alerts or external error logging would be a great addition to ZFS. Surely it makes sense that filesystem errors would be better off being stored and ha

Re: [zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed

2008-07-28 Thread Miles Nordin
> "mp" == Mattias Pantzare <[EMAIL PROTECTED]> writes: >> This is a big one: ZFS can continue writing to an unavailable >> pool. It doesn't always generate errors (I've seen it copy >> over 100MB before erroring), and if not spotted, this *will* >> cause data loss after you re

Re: [zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed

2008-07-28 Thread Ross Smith
ears fine to all intents & purposes. You can read it off the pool and copy it elsewhere. There doesn't seem to be any indication that it's going to disappear after a reboot. > Date: Mon, 28 Jul 2008 13:35:21 -0500 > From: [EMAIL PROTECTED] > To: [EMAIL PROTECTED] > S

Re: [zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed

2008-07-28 Thread Ross Smith
snv_91. I downloaded snv_94 today so I'll be testing with that tomorrow. > Date: Mon, 28 Jul 2008 09:58:43 -0700 > From: [EMAIL PROTECTED] > Subject: Re: [zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed > To: [EMAIL PROTECTED] > > Which OS and revision

Re: [zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed

2008-07-28 Thread Ross Smith
it does, the pool doesn't report any problems at all. > Date: Mon, 28 Jul 2008 13:03:24 -0500 > From: [EMAIL PROTECTED] > To: [EMAIL PROTECTED] > CC: zfs-discuss@opensolaris.org > Subject: Re: [zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed > > On

Re: [zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed

2008-07-28 Thread Bob Friesenhahn
On Mon, 28 Jul 2008, Ross wrote: > > TEST1: Opened File Browser, copied the test data to the pool. > Halfway through the copy I pulled the drive. THE COPY COMPLETED > WITHOUT ERROR. 'zpool list' reports the pool as online; however, 'zpool > status' hung as expected. Are you sure that this refere

Re: [zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed

2008-07-28 Thread Mattias Pantzare
> 4. While reading an offline disk causes errors, writing does not! >*** CAUSES DATA LOSS *** > > This is a big one: ZFS can continue writing to an unavailable pool. It > doesn't always generate errors (I've seen it copy over 100MB > before erroring), and if not spotted, this *will* cause da

Re: [zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed

2008-07-28 Thread Ross
Ok, after doing a lot more testing of this I've found it's not the Supermicro controller causing problems. It's purely ZFS, and it causes some major problems! I've even found one scenario that appears to cause huge data loss without any warning from ZFS - up to 30,000 files and 100MB of data m

Re: [zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed

2008-07-24 Thread Ross
Yeah, I thought of the storage forum today and found somebody else with the problem, and since my post a couple of people have reported similar issues on Thumpers. I guess the storage thread is the best place for this now: http://www.opensolaris.org/jive/thread.jspa?threadID=42507&tstart=0 T

Re: [zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed

2008-07-24 Thread Dave
I've discovered this as well - b81 to b93 (latest I've tried). I switched from my on-board SATA controller to AOC-SAT2-MV8 cards because the MCP55 controller caused random disk hangs. Now the SAT2-MV8 works as long as the drives are working correctly, but the system can't handle a drive failure

[zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed

2008-07-23 Thread Ross
Has anybody here got any thoughts on how to resolve this problem: http://www.opensolaris.org/jive/thread.jspa?messageID=261204&tstart=0 It sounds like two of us have been affected by this now, and it's a bit of a nuisance having your entire server hang when a drive is removed; it makes you worry about