> "re" == Richard Elling <[EMAIL PROTECTED]> writes:
> "pf" == Paul Fisher <[EMAIL PROTECTED]> writes:
re> I was able to reproduce this in b93, but might have a
re> different interpretation
You weren't able to reproduce the hang of 'zpool status'?
Your 'zpool status' was after the …
Hi Andy, answer & pointer below...

Andrew Hisgen wrote:
> Question embedded below...
>
> Richard Elling wrote:
> ...
>> If you surf to http://www.sun.com/msg/ZFS-8000-HC you'll
>> see words to the effect that:
>> The pool has experienced I/O failures. Since the ZFS pool property
>> 'failmode' is set to 'wait', all I/Os (reads and writes) are
>> blocked. See …
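Since 'failmode' is the property at issue here, a minimal zpool(1M) sketch may help. The pool name 'tank' and the wrapper functions are illustrative only, not from the thread:

```shell
# Sketch only; 'tank' is a placeholder pool name, and the wrapper
# functions exist purely for illustration.
# The failmode property controls what ZFS does when the last path to a
# pool's devices fails:
#   wait     - block all I/O until the device returns (the default)
#   continue - return EIO for new writes instead of hanging
#   panic    - panic the host so it can reboot or fail over
show_failmode()  { zpool get failmode "$1"; }
relax_failmode() { zpool set failmode=continue "$1"; }
```

With failmode=continue, a vanished device should surface as EIO on new writes rather than every I/O (and 'zpool status') blocking indefinitely.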
Gave up on ZFS ever recovering. A shutdown attempt hung as
expected. I hard-reset the computer.
Ross
…falling short.

Ross
Peter Cudhea wrote:
> Thanks, this is helpful. I was definitely misunderstanding the part that
> the ZIL plays in ZFS.
>
> I found Richard Elling's discussion of the FMA response to the failure
> very informative. I see how the device driver, the fault analysis
> layer and the ZFS layer are all working together. Though the …
Peter Cudhea wrote:
> Your point is well taken that ZFS should not duplicate functionality
> that is already or should be available at the device driver level. In
> this case, I think it misses the point of what ZFS should be doing that
> it is not.
>
> ZFS does its own periodic commits to the disk, and it knows if those …
Richard Elling wrote:
> I was able to reproduce this in b93, but might have a different
> interpretation of the conditions. More below...
>
> Ross Smith wrote:
>> A little more information today. I had a feeling that ZFS would
>> continue quite some time before giving an error, and today I've shown
>> that you can carry on wo…
On Wed, 30 Jul 2008, Ross Smith wrote:
>
> I'm not saying that ZFS should be monitoring disks and drivers to
> ensure they are working, just that if ZFS attempts to write data and
> doesn't get the response it's expecting, an error should be logged
> against the device regardless of what the driver …
On Wed, 30 Jul 2008, Ross wrote:
>
> Imagine you had a raid-z array and pulled a drive as I'm doing here.
> Because ZFS isn't aware of the removal it keeps writing to that
> drive as if it's valid. That means ZFS still believes the array is
> online when in fact it should be degraded. If any o…
Well yeah, this is obviously not a valid setup for my data, but if you read my
first e-mail, the whole point of this test was that I had seen Solaris hang
when a drive was removed from a fully redundant array (five sets of three way
mirrors), and wanted to see what was going on.
So I started wi…
…was lost from the pool
>> without realising that it simply hasn't mounted and they're actually
>> looking at an empty folder. Firstly ZFS should be removing the mount
>> point when problems occur, and secondly, ZFS list or ZFS status should
>> include …
…written. I
> can accept that with delayed writes files may occasionally be lost
> when a failure happens, but I don't accept that we need to lose all
> knowledge of the affected files when the filesystem has complete
> knowledge of what is affected. If there are any working file…
…system drive is unavailable, ZFS could try each pool in
turn and attempt to store the log there.

In fact e-mail alerts or external error logging would be a great addition to
ZFS. Surely it makes sense that filesystem errors would be better off being
stored and ha…
> "mp" == Mattias Pantzare <[EMAIL PROTECTED]> writes:
>> This is a big one: ZFS can continue writing to an unavailable
>> pool. It doesn't always generate errors (I've seen it copy
>> over 100MB before erroring), and if not spotted, this *will*
>> cause data loss after you re…
…appears fine to all
intents & purposes. You can read it off the pool and copy it elsewhere. There
doesn't seem to be any indication that it's going to disappear after a reboot.
snv_91. I downloaded snv_94 today so I'll be testing with that tomorrow.

> > Which OS and revision …
…it does, the pool doesn't report
any problems at all.
On Mon, 28 Jul 2008, Ross wrote:
>
> TEST1: Opened File Browser, copied the test data to the pool.
> Half way through the copy I pulled the drive. THE COPY COMPLETED
> WITHOUT ERROR. Zpool list reports the pool as online, however zpool
> status hung as expected.
Are you sure that this refere…
> 4. While reading an offline disk causes errors, writing does not!
>    *** CAUSES DATA LOSS ***
>
> This is a big one: ZFS can continue writing to an unavailable pool. It
> doesn't always generate errors (I've seen it copy over 100MB
> before erroring), and if not spotted, this *will* cause da…
Ok, after doing a lot more testing of this I've found it's not the Supermicro
controller causing problems. It's purely ZFS, and it causes some major
problems! I've even found one scenario that appears to cause huge data loss
without any warning from ZFS - up to 30,000 files and 100MB of data m…
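For anyone wanting to poke at this behaviour without sacrificing hardware, file-backed vdevs give a rough approximation. All names below are placeholders; a file vdev can't literally be pulled, so deleting its backing file while the pool is busy is only a loose stand-in for yanking a disk:

```shell
# Throwaway repro sketch using file-backed vdevs (names are placeholders).
# Build a disposable raidz pool from files under /tmp, make one leg
# vanish, then see what 'zpool status -x' reports.
make_test_pool() {
  mkfile 64m /tmp/vdev0 /tmp/vdev1 /tmp/vdev2
  zpool create testpool raidz /tmp/vdev0 /tmp/vdev1 /tmp/vdev2
}
drop_leg()     { rm -f /tmp/vdev2; }          # crude "drive pull"
check_health() { zpool status -x testpool; }
```

Copy data into the pool, call drop_leg mid-copy, and compare what the copy reports against what check_health says; remember to destroy the pool and the /tmp files afterwards.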
Yeah, I thought of the storage forum today and found somebody else with the
problem, and since my post a couple of people have reported similar issues on
Thumpers.
I guess the storage thread is the best place for this now:
http://www.opensolaris.org/jive/thread.jspa?threadID=42507&tstart=0
I've discovered this as well - b81 to b93 (latest I've tried). I
switched from my on-board SATA controller to AOC-SAT2-MV8 cards because
the MCP55 controller caused random disk hangs. Now the SAT2-MV8 works as
long as the drives are working correctly, but the system can't handle a
drive failure
Has anybody here got any thoughts on how to resolve this problem:
http://www.opensolaris.org/jive/thread.jspa?messageID=261204&tstart=0
It sounds like two of us have been affected by this now, and it's a bit of a
nuisance your entire server hanging when a drive is removed, makes you worry
about …