Re: [zfs-discuss] Re: Production ZFS Server Death (06/06)

2006-11-28 Thread Elizabeth Schwartz

Well, I fixed the HW, but I had one bad file, and the problem was that ZFS
was saying "delete the pool and restore from tape" when, it turns out, the
answer is just to find the file with the bad inode, delete it, clear the device,
and scrub.  Maybe it's more of a documentation problem, but it sure is
disconcerting to have a file system threatening to give up the game over one
bad file (and the real irony: it was a file in someone's TRASH!)

Anyway, I'm back in business without a restore (and with a rebuilt RAID), but
yeesh, it sure took a lot of escalating to get to the point where someone
knew to tell me to do a find -inum.
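
For anyone hitting the same situation, here is a minimal sketch of that
recovery sequence (the pool, device, inode number, and file path are
hypothetical, and the zpool status -v output varies by release):

    # 1. List the files ZFS reports as permanently damaged.
    zpool status -v tank

    # 2. Find the file owning the bad inode/object number reported above.
    find /tank -inum 123456 -print

    # 3. Delete the damaged file (in this case, one in a user's Trash).
    rm /tank/home/someuser/.Trash/damaged-file

    # 4. Clear the error counters on the affected device and re-verify.
    zpool clear tank c1t2d0
    zpool scrub tank
    zpool status -v tank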
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Production ZFS Server Death (06/06)

2006-11-28 Thread Toby Thain


On 28-Nov-06, at 10:01 PM, Elizabeth Schwartz wrote:

Well, I fixed the HW but I had one bad file, and the problem was  
that ZFS was saying "delete the pool and restore from tape" when,  
it turns out, the answer is just find the file with the bad inode,  
delete it, clear the device and scrub.  Maybe more of a  
documentation problme, but it sure is disconcerting to have a file  
system threatening to give up the game over one bad file (and the  
real irony: it was a file in someone's TRASH!)


Anyway I'm back in business without a restore (and with a rebuilt  
RAID) but yeesh, it sure took a lot of escalating to get to the  
point where someone knew to tell me to do a find -inum.


Do you now have a redundant ZFS configuration, to prevent future data  
loss/inconvenience?
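
For reference, a minimal sketch of what adding that redundancy can look
like (device names are hypothetical): attaching a second device converts a
single-disk vdev into a two-way mirror, and a new pool can be created
redundant from the start.

    # Convert an existing single-device pool into a two-way mirror.
    zpool attach tank c1t0d0 c1t1d0

    # Or build the redundancy in when creating a new pool.
    zpool create tank2 mirror c2t0d0 c2t1d0
    zpool status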

--T



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Production ZFS Server Death (06/06)

2006-11-28 Thread David Dyer-Bennet

On 11/28/06, Elizabeth Schwartz <[EMAIL PROTECTED]> wrote:

Well, I fixed the HW but I had one bad file, and the problem was that ZFS
was saying "delete the pool and restore from tape" when, it turns out, the
answer is just find the file with the bad inode, delete it, clear the device
and scrub.  Maybe more of a documentation problme, but it sure is
disconcerting to have a file system threatening to give up the game over one
bad file (and the real irony: it was a file in someone's TRASH!)


The ZFS documentation was assuming you wanted to recover the data, not
abandon it.  Which, realistically, isn't always what people want: when
you know a small number of files are trashed, it's often easier to
delete those files and either just go on, or restore only those files,
rather than do a full restore.  So yeah, perhaps the documentation could
be more helpful in that situation.


Anyway I'm back in business without a restore (and with a rebuilt RAID) but
yeesh, it sure took a lot of escalating to get to the point where someone
knew to tell me to do a find -inum.


Ah, if people here had realized that's what you needed to know, many
of us could have told you, I'm sure.  I, at least, hadn't realized that
was one of the problem points.  (Probably too focused on the ZFS
content to think about the general issues enough!)

Very glad you're back in service, anyway!
--
David Dyer-Bennet
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Production ZFS Server Death (06/06)

2006-11-29 Thread Cindy Swearingen

Hi Betsy,

Yes, part of this is a documentation problem.

I recently documented the find -inum scenario in the community version
of the admin guide.  Please see page 156 (well, for next time) here:


http://opensolaris.org/os/community/zfs/docs/

We're working on the larger issue as well.

Cindy




Elizabeth Schwartz wrote:
Well, I fixed the HW but I had one bad file, and the problem was that 
ZFS was saying "delete the pool and restore from tape" when, it turns 
out, the answer is just find the file with the bad inode, delete it, 
clear the device and scrub.  Maybe more of a documentation problme, but 
it sure is disconcerting to have a file system threatening to give up 
the game over one bad file (and the real irony: it was a file in 
someone's TRASH!)


Anyway I'm back in business without a restore (and with a rebuilt RAID) 
but yeesh, it sure took a lot of escalating to get to the point where 
someone knew to tell me to do a find -inum.





___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Production ZFS Server Death (06/06)

2006-12-01 Thread Al Hopper
On Tue, 28 Nov 2006, Elizabeth Schwartz wrote:

> Well, I fixed the HW but I had one bad file, and the problem was that ZFS

Hi Elizabeth,

Followup: when you say you "fixed the HW", I'm curious what you found,
and whether this experience with ZFS convinced you that your trusted RAID
H/W did, in fact, have issues?

Do you think it's likely that there are others running production
systems on RAID hardware they trust, not realizing it may have bugs
(causing data corruption) that have yet to be discovered?

> was saying "delete the pool and restore from tape" when, it turns out, the
> answer is just find the file with the bad inode, delete it, clear the device
> and scrub.  Maybe more of a documentation problme, but it sure is
> disconcerting to have a file system threatening to give up the game over one
> bad file (and the real irony: it was a file in someone's TRASH!)
>
> Anyway I'm back in business without a restore (and with a rebuilt RAID) but
> yeesh, it sure took a lot of escalating to get to the point where someone
> knew to tell me to do a find -inum.
>

Al Hopper  Logical Approach Inc, Plano, TX.  [EMAIL PROTECTED]
   Voice: 972.379.2133 Fax: 972.379.2134  Timezone: US CDT
OpenSolaris.Org Community Advisory Board (CAB) Member - Apr 2005
 OpenSolaris Governing Board (OGB) Member - Feb 2006
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Production ZFS Server Death (06/06)

2006-12-01 Thread Chad Leigh -- Shire.Net LLC


On Dec 1, 2006, at 9:50 AM, Al Hopper wrote:


Followup: When you say you "fixed the HW", I'm curious as to what you
found and if this experience with ZFS convinced you that your  
trusted RAID

H/W did, in fact, have issues?

Do you think that it's likely that there are others running production
systems on RAID systems that they trust, but don't realize may have  
bugs

(causing data corruption) that have yet to be discovered?


And this is different from any other storage system, how?  (ie, JBOD  
controllers and disks can also have subtle bugs that corrupt data)


Chad


---
Chad Leigh -- Shire.Net LLC
Your Web App and Email hosting provider
chad at shire.net





___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Production ZFS Server Death (06/06)

2006-12-01 Thread Dana H. Myers
Chad Leigh -- Shire.Net LLC wrote:
> 
> On Dec 1, 2006, at 9:50 AM, Al Hopper wrote:
> 
>> Followup: When you say you "fixed the HW", I'm curious as to what you
>> found and if this experience with ZFS convinced you that your trusted
>> RAID
>> H/W did, in fact, have issues?
>>
>> Do you think that it's likely that there are others running production
>> systems on RAID systems that they trust, but don't realize may have bugs
>> (causing data corruption) that have yet to be discovered?
> 
> And this is different from any other storage system, how?  (ie, JBOD
> controllers and disks can also have subtle bugs that corrupt data)

Of course, but there isn't the expectation of data reliability with a
JBOD that there is with some RAID configurations.

Dana
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Production ZFS Server Death (06/06)

2006-12-01 Thread Chad Leigh -- Shire.Net LLC


On Dec 1, 2006, at 4:34 PM, Dana H. Myers wrote:


Chad Leigh -- Shire.Net LLC wrote:


On Dec 1, 2006, at 9:50 AM, Al Hopper wrote:

Followup: When you say you "fixed the HW", I'm curious as to what  
you
found and if this experience with ZFS convinced you that your  
trusted

RAID
H/W did, in fact, have issues?

Do you think that it's likely that there are others running  
production
systems on RAID systems that they trust, but don't realize may  
have bugs

(causing data corruption) that have yet to be discovered?


And this is different from any other storage system, how?  (ie, JBOD
controllers and disks can also have subtle bugs that corrupt data)


Of course, but there isn't the expectation of data reliability with a
JBOD that there is with some RAID configurations.



There is not?  People buy disk drives and expect them to corrupt  
their data?  I expect the drives I buy to work fine (knowing that  
there could be bugs etc in them, the same as with my RAID systems).


Chad

---
Chad Leigh -- Shire.Net LLC
Your Web App and Email hosting provider
chad at shire.net





___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Production ZFS Server Death (06/06)

2006-12-01 Thread Dana H. Myers
Chad Leigh -- Shire.Net LLC wrote:
> 
> On Dec 1, 2006, at 4:34 PM, Dana H. Myers wrote:
> 
>> Chad Leigh -- Shire.Net LLC wrote:
>>>
>>> On Dec 1, 2006, at 9:50 AM, Al Hopper wrote:
>>>
 Followup: When you say you "fixed the HW", I'm curious as to what you
 found and if this experience with ZFS convinced you that your trusted
 RAID
 H/W did, in fact, have issues?

 Do you think that it's likely that there are others running production
 systems on RAID systems that they trust, but don't realize may have
 bugs
 (causing data corruption) that have yet to be discovered?
>>>
>>> And this is different from any other storage system, how?  (ie, JBOD
>>> controllers and disks can also have subtle bugs that corrupt data)
>>
>> Of course, but there isn't the expectation of data reliability with a
>> JBOD that there is with some RAID configurations.
>>
> 
> There is not?  People buy disk drives and expect them to corrupt their
> data?  I expect the drives I buy to work fine (knowing that there could
> be bugs etc in them, the same as with my RAID systems).

So, what do you think reliable RAID configurations are for?

Dana


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Production ZFS Server Death (06/06)

2006-12-01 Thread Ian Collins
Chad Leigh -- Shire.Net LLC wrote:

>
> On Dec 1, 2006, at 4:34 PM, Dana H. Myers wrote:
>
>> Chad Leigh -- Shire.Net LLC wrote:
>>
>>>
>>> And this is different from any other storage system, how?  (ie, JBOD
>>> controllers and disks can also have subtle bugs that corrupt data)
>>
>>
>> Of course, but there isn't the expectation of data reliability with a
>> JBOD that there is with some RAID configurations.
>>
>
> There is not?  People buy disk drives and expect them to corrupt 
> their data?  I expect the drives I buy to work fine (knowing that 
> there could be bugs etc in them, the same as with my RAID systems).
>
So you trust your important data to a single drive?  I doubt it.  But I
bet you do trust your data to a hardware RAID array.

Ian
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Production ZFS Server Death (06/06)

2006-12-01 Thread Toby Thain


On 1-Dec-06, at 6:29 PM, Chad Leigh -- Shire.Net LLC wrote:



On Dec 1, 2006, at 9:50 AM, Al Hopper wrote:


Followup: When you say you "fixed the HW", I'm curious as to what you
found and if this experience with ZFS convinced you that your  
trusted RAID

H/W did, in fact, have issues?

Do you think that it's likely that there are others running  
production
systems on RAID systems that they trust, but don't realize may  
have bugs

(causing data corruption) that have yet to be discovered?


And this is different from any other storage system, how?  (ie,  
JBOD controllers and disks can also have subtle bugs that corrupt  
data)



I think Al probably means, "running production systems on [any  
trusted storage system where errors can remain  
undetected]" (contrasted with ZFS).


--Toby



Chad


---
Chad Leigh -- Shire.Net LLC
Your Web App and Email hosting provider
chad at shire.net



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Production ZFS Server Death (06/06)

2006-12-01 Thread Toby Thain


On 1-Dec-06, at 6:36 PM, Chad Leigh -- Shire.Net LLC wrote:



On Dec 1, 2006, at 4:34 PM, Dana H. Myers wrote:


Chad Leigh -- Shire.Net LLC wrote:


On Dec 1, 2006, at 9:50 AM, Al Hopper wrote:

Followup: When you say you "fixed the HW", I'm curious as to  
what you
found and if this experience with ZFS convinced you that your  
trusted

RAID
H/W did, in fact, have issues?

Do you think that it's likely that there are others running  
production
systems on RAID systems that they trust, but don't realize may  
have bugs

(causing data corruption) that have yet to be discovered?


And this is different from any other storage system, how?  (ie, JBOD
controllers and disks can also have subtle bugs that corrupt data)


Of course, but there isn't the expectation of data reliability with a
JBOD that there is with some RAID configurations.



There is not?  People buy disk drives and expect them to corrupt  
their data?  I expect the drives I buy to work fine (knowing that  
there could be bugs etc in them, the same as with my RAID systems).


Yes, but in either case, ZFS will tell you. Other filesystems in  
general cannot.


--Toby



Chad

---
Chad Leigh -- Shire.Net LLC
Your Web App and Email hosting provider
chad at shire.net



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Production ZFS Server Death (06/06)

2006-12-01 Thread Chad Leigh -- Shire.Net LLC


On Dec 1, 2006, at 10:17 PM, Ian Collins wrote:


Chad Leigh -- Shire.Net LLC wrote:



On Dec 1, 2006, at 4:34 PM, Dana H. Myers wrote:


Chad Leigh -- Shire.Net LLC wrote:



And this is different from any other storage system, how?  (ie,  
JBOD

controllers and disks can also have subtle bugs that corrupt data)



Of course, but there isn't the expectation of data reliability  
with a

JBOD that there is with some RAID configurations.



There is not?  People buy disk drives and expect them to corrupt
their data?  I expect the drives I buy to work fine (knowing that
there could be bugs etc in them, the same as with my RAID systems).

So you trust your important data to a single drive?  I doubt it.   
But I

bet you do trust your data to a hardware RAID array.


Yes, but not because I expect a single drive to be more error prone
(versus total failure).  Total drive failure on a single disk loses
all your data.  But we are not talking about total failure; we are talking
about errors that corrupt data.  I buy individual drives with the
expectation that they are designed to be error-free and are, for the most
part, error-free, and I do not expect a RAID array to be more robust in
this regard (after all, the RAID is made up of a bunch of single drives).


Some people on this list think that RAID arrays are more likely
to corrupt your data than a JBOD (both with ZFS on top; for example, a
ZFS mirror of two RAID arrays versus a JBOD mirror or raidz).  There is no
proof of this, or even a reasonable hypothetical explanation for it,
that I have seen presented.


Chad



Ian


---
Chad Leigh -- Shire.Net LLC
Your Web App and Email hosting provider
chad at shire.net





___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Production ZFS Server Death (06/06)

2006-12-01 Thread Chad Leigh -- Shire.Net LLC


On Dec 1, 2006, at 10:42 PM, Toby Thain wrote:



On 1-Dec-06, at 6:36 PM, Chad Leigh -- Shire.Net LLC wrote:



On Dec 1, 2006, at 4:34 PM, Dana H. Myers wrote:


Chad Leigh -- Shire.Net LLC wrote:


On Dec 1, 2006, at 9:50 AM, Al Hopper wrote:

Followup: When you say you "fixed the HW", I'm curious as to  
what you
found and if this experience with ZFS convinced you that your  
trusted

RAID
H/W did, in fact, have issues?

Do you think that it's likely that there are others running  
production
systems on RAID systems that they trust, but don't realize may  
have bugs

(causing data corruption) that have yet to be discovered?


And this is different from any other storage system, how?  (ie,  
JBOD

controllers and disks can also have subtle bugs that corrupt data)


Of course, but there isn't the expectation of data reliability  
with a

JBOD that there is with some RAID configurations.



There is not?  People buy disk drives and expect them to corrupt  
their data?  I expect the drives I buy to work fine (knowing that  
there could be bugs etc in them, the same as with my RAID systems).


Yes, but in either case, ZFS will tell you.


And then kill your whole pool :-)


Other filesystems in general cannot.



While other file systems, when they become corrupt, allow you to  
salvage data :-)


Chad


--Toby



Chad

---
Chad Leigh -- Shire.Net LLC
Your Web App and Email hosting provider
chad at shire.net






---
Chad Leigh -- Shire.Net LLC
Your Web App and Email hosting provider
chad at shire.net





___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Production ZFS Server Death (06/06)

2006-12-01 Thread Ian Collins
Chad Leigh -- Shire.Net LLC wrote:

>
> On Dec 1, 2006, at 10:17 PM, Ian Collins wrote:
>
>> Chad Leigh -- Shire.Net LLC wrote:
>>
>>> There is not?  People buy disk drives and expect them to corrupt
>>> their data?  I expect the drives I buy to work fine (knowing that
>>> there could be bugs etc in them, the same as with my RAID systems).
>>>
>> So you trust your important data to a single drive?  I doubt it.   But I
>> bet you do trust your data to a hardware RAID array.
>
>
> Yes, but not because I expect a single drive to be more error prone 
> (versus total failure).  Total drive failure on a single disk loses 
> all your data.  But we are not talking total failure, we are talking 
> errors that corrupt data.  I buy individual drives with the 
> expectation that they are designed to be error free and are error 
> free for the most part and I do not expect a RAID array to be more 
> robust in this regard (after all, the RAID is made up of a bunch of 
> single drives).
>
But people expect RAID to protect them from the corruption caused by a
partial failure, say a bad block, which is a common failure mode.  The
worst system failure I experienced was caused by one half of a mirror
experiencing bad blocks and the corrupt data being nicely mirrored on
the other drive.  ZFS would have saved this system from failure.

> Some people on this list think that the RAID arrays are more likely 
> to corrupt your data than JBOD (both with ZFS on top, for example, a 
> ZFS mirror of 2 raid arrays or a JBOD mirror or raidz).  There is no 
> proof of this or even reasonable hypothetical explanation for this 
> that I have seen presented.
>
I don't think that's the issue here; it's more one of perceived data
integrity.  People who have been happily using a single RAID 5 are now
finding that the array has been silently corrupting their data.  People
expect errors from single drives, so they put them in a RAID knowing the
firmware will protect them from drive errors.  They often fail to
recognise that the RAID firmware may not be perfect.

ZFS looks to be the perfect tool for mirroring hardware RAID arrays,
with the advantage over other schemes of knowing which side of the
mirror has an error.  Thus ZFS can be used as a tool to complement,
rather than replace, hardware RAID.
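
As an illustration of that arrangement (a sketch only; the LUN names are
hypothetical), a ZFS mirror layered over two hardware RAID LUNs lets the
ZFS checksums decide which side of the mirror returned good data:

    # Each device below is a LUN exported by a separate hardware RAID array.
    zpool create shared mirror c3t0d0 c4t0d0

    # A periodic scrub reads every block and repairs whichever side fails
    # its checksum, using the good copy from the other array.
    zpool scrub shared
    zpool status -v shared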

Ian

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Production ZFS Server Death (06/06)

2006-12-01 Thread Chad Leigh -- Shire.Net LLC


On Dec 2, 2006, at 12:06 AM, Ian Collins wrote:


Chad Leigh -- Shire.Net LLC wrote:



On Dec 1, 2006, at 10:17 PM, Ian Collins wrote:


Chad Leigh -- Shire.Net LLC wrote:


There is not?  People buy disk drives and expect them to corrupt
their data?  I expect the drives I buy to work fine (knowing that
there could be bugs etc in them, the same as with my RAID systems).

So you trust your important data to a single drive?  I doubt  
it.   But I

bet you do trust your data to a hardware RAID array.



Yes, but not because I expect a single drive to be more error prone
(versus total failure).  Total drive failure on a single disk loses
all your data.  But we are not talking total failure, we are talking
errors that corrupt data.  I buy individual drives with the
expectation that they are designed to be error free and are error
free for the most part and I do not expect a RAID array to be more
robust in this regard (after all, the RAID is made up of a bunch of
single drives).


But people expect RAID to protect them from the corruption caused by a
partial failure, say a bad block, which is a common failure mode.


They do?  I must admit to no experience with the big standalone RAID
array storage units, just (expensive) HW RAID cards, but I have never
expected an array to protect me against data corruption.  Bad blocks
can be detected and remapped, and maybe the array can recalculate the
block from parity, etc., but that is a known disk error, not the
subtle kind of error created by the RAID array itself that is being
claimed here.



  The
worst system failure I experienced was caused by one half of a mirror
experiencing bad blocks and the corrupt data being nicely mirrored on
the other drive.  ZFS would have saved this system from failure.


None of my comments are meant to denigrate ZFS.  I am implementing it  
myself.





Some people on this list think that the RAID arrays are more likely
to corrupt your data than JBOD (both with ZFS on top, for example, a
ZFS mirror of 2 raid arrays or a JBOD mirror or raidz).  There is no
proof of this or even reasonable hypothetical explanation for this
that I have seen presented.


I don't think that the issue here, it's more one of perceived data
integrity.  People who have been happily using a single RAID 5 are now
finding that the array has been silently corrupting their data.


They are?  They are being told that the problems they are having are
due to that, but there is no proof.  It could be a bad driver, for
example.



People
expect errors form single drives,


They do?  The tech specs show very low failure rates for single  
drives in terms of bit errors.



so they put them in a RAID knowing the
firmware will protect them from drive errors.


The RAID firmware will not protect them from bit errors on block
reads unless the disk detects that the whole block is bad.  I admit
I don't know how well the disk itself can detect bit errors with CRC
or similar mechanisms.



They often fail to
recognise that the RAID firmware may not be perfect.


ZFS, JBOD disk controllers, drivers for said disk controllers, etc.
may not be perfect either.




ZFS looks to be the perfect tool for mirroring hardware RAID arrays,
with the advantage over other schemes of knowing which side of the
mirror has an error.  Thus ZFS can be used as a tool to compliment,
rather than replace hardware RAID.


I agree.  That is what I am doing :-)

Chad



Ian



---
Chad Leigh -- Shire.Net LLC
Your Web App and Email hosting provider
chad at shire.net





___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Production ZFS Server Death (06/06)

2006-12-01 Thread Ian Collins
Chad Leigh -- Shire.Net LLC wrote:

>
> On Dec 2, 2006, at 12:06 AM, Ian Collins wrote:
>
>> But people expect RAID to protect them from the corruption caused by a
>> partial failure, say a bad block, which is a common failure mode.
>
>
> They do?  I must admit no experience with the big standalone raid 
> array storage units, just (expensive) HW raid cards, but I have never 
> expected an array to protect me against data corruption.  Bad blocks 
> can be detected and remapped, and maybe the array can recalculate the 
> block from parity etc, but that is a known disk error, and not the 
> subtle kinds of errors created by the RAID array that are being 
> claimed here.
>
I must admit that 'they' in my experience have been Windows admins!

>> I don't think that the issue here, it's more one of perceived data
>> integrity.  People who have been happily using a single RAID 5 are now
>> finding that the array has been silently corrupting their data.
>
>
> They are?  They are being told that the problems they are having is 
> due to that but there is no proof.  It could be a bad driver for 
> example.

Either way, they are still finding errors they didn't know existed.

>>
>> ZFS looks to be the perfect tool for mirroring hardware RAID arrays,
>> with the advantage over other schemes of knowing which side of the
>> mirror has an error.  Thus ZFS can be used as a tool to compliment,
>> rather than replace hardware RAID.
>
>
> I agree.  That is what I am doing :-)
>
I'll be interested to see how you get on.

Cheers,

Ian
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Production ZFS Server Death (06/06)

2006-12-01 Thread Dana H. Myers
Chad Leigh -- Shire.Net LLC wrote:
> 
> On Dec 1, 2006, at 10:17 PM, Ian Collins wrote:
> 
>> Chad Leigh -- Shire.Net LLC wrote:
>>
>>>
>>> On Dec 1, 2006, at 4:34 PM, Dana H. Myers wrote:
>>>
 Chad Leigh -- Shire.Net LLC wrote:

>
> And this is different from any other storage system, how?  (ie, JBOD
> controllers and disks can also have subtle bugs that corrupt data)


 Of course, but there isn't the expectation of data reliability with a
 JBOD that there is with some RAID configurations.

>>>
>>> There is not?  People buy disk drives and expect them to corrupt
>>> their data?  I expect the drives I buy to work fine (knowing that
>>> there could be bugs etc in them, the same as with my RAID systems).
>>>
>> So you trust your important data to a single drive?  I doubt it.  But I
>> bet you do trust your data to a hardware RAID array.
> 
> Yes, but not because I expect a single drive to be more error prone
> (versus total failure).

Ah.  I was guessing you were thinking this.  You believe that a single
disk is no more prone to a soft data error than a reliable RAID configuration.
This is arguably incorrect.  Soft errors and hard failures are pretty
similar in that they both result in data corruption - where they differ
is in the recovery.

While I cannot quote a specific figure, disk drives are not immune
from soft errors.  Normally, the drive is able to detect the error
and correct it transparently.  As a result, the apparent soft error
rate is quite low for a typical drive.  However, there are limits to the
soft errors that a drive can detect and correct; it is possible for
an error to slip through the drive's controller uncorrected if it
spans more than some number of bits.  I don't actually know current estimates
for undetected soft errors, but it's small, like 1 in 10^30 bits or more.  My
guess could be wildly wrong - it doesn't change the outcome.  Perhaps
one of the actual disk experts can help here :-).

Thus, undetected soft errors don't happen very often but they do happen.

Further, the drive is attached to the computer via a cable and electronics
which themselves are not immune to errors.  Again, the probability of an
error is very small, so small that we take for granted the reliability
of the data.

A single disk can have an undetected error - if the OS doesn't check, it'll
never notice either.  Since this happens rarely, when it does happen, it
may not even be noticed, or have a lasting impact.  If a record is read from
a database with an error but not modified, the error won't be written back
to the disk and may not occur again.  An application may crash, and simply be
restarted.  Since programming errors occur far more frequently than soft read
errors (they do), it's probable that buggy software will be blamed rather
than a soft error from a disk.

>  Total drive failure on a single disk loses all
> your data.  But we are not talking total failure, we are talking errors
> that corrupt data.  I buy individual drives with the expectation that
> they are designed to be error free

That's not a reasonable expectation, but, for the above reasons, you may
retire and never have seen something that you believed was a soft error.
The very nature of soft errors tends to hide them, while the nature of
a hard failure exacerbates them.

> and are error free for the most part

For the most part?  They're not always error-free?  Remember, for every
so many detected errors, there's probably an undetected error.

> and I do not expect a RAID array to be more robust in this regard (after
> all, the RAID is made up of a bunch of single drives).

Some RAID configurations are not more robust than a single drive - a simple
stripe/concat for example.  It doesn't matter what kind of error is encountered,
a soft error or a hard failure, there's no redundancy of the data.  A hard
error can't be ignored, but a soft error probably is.

Reliable RAID configurations can tolerate at least one error - be it soft
or hard.  The real difference is the recovery protocol.

> Some people on this list think that the RAID arrays are more likely to
> corrupt your data than JBOD (both with ZFS on top, for example, a ZFS
> mirror of 2 raid arrays or a JBOD mirror or raidz).  There is no proof
> of this or even reasonable hypothetical explanation for this that I have
> seen presented.

We've had an example presented just last week in the field where a RAID
array running in a reliable configuration returned corrupt data as a result
of a faulty interconnect.  The error was detected by ZFS, but recovery
of the data was not possible because the RAID array, not ZFS, was trusted to
maintain data integrity.  The RAID array can only ensure the integrity
of the data inside the RAID array; it cannot detect or correct errors
occurring in the interconnect.

If ZFS had been managing the disks in that array as a JBOD, and the disks
were in a reliable configuration, the interconnect errors would have been
detected and corrected.

Re: [zfs-discuss] Re: Production ZFS Server Death (06/06)

2006-12-01 Thread Chad Leigh -- Shire.Net LLC


On Dec 2, 2006, at 12:24 AM, Ian Collins wrote:



ZFS looks to be the perfect tool for mirroring hardware RAID arrays,
with the advantage over other schemes of knowing which side of the
mirror has an error.  Thus ZFS can be used as a tool to compliment,
rather than replace hardware RAID.



I agree.  That is what I am doing :-)


I'll be interested to see how you get on.



Me too.  Right now I have one RAID-6 (8 + 1 hotspare = 1.7+ TB) array
with ZFS for testing, etc.  I only have a small amount of live data on
it (some email stores for a few of my own accounts).  I am setting up
some backup procedures and messing around.


I finally bought the drives over the US Thanksgiving holiday a week
ago and they arrived yesterday (8 + 1, plus a spare for the shelf).
I need to get my other controller purchased this month, set up the
array, run some benchmark tests on it for a day or two as a break-in,
and add it to the ZFS pool as a mirror element.  Then we can start
seeing how it works.


All the RAID controllers have battery backup, and the system itself
has redundant power supplies and an industrial UPS feeding it, so hopefully
we should see minimal crashing from power problems, etc. (leaving SW and HW
problems as sources).  The individual RAIDs are very fast.  One
has 1GB of battery-backed cache and the second may have 2GB of
battery-backed cache (once I buy it).  With the battery backing you
can turn on write caching with some confidence.  I am interested to
see how it works.  The whole mirror will be exported over NFS through
three separate (jumbo-frame) gigE interfaces to various servers.
The goal is to have it feed a bunch of "dumb", "low end" servers that
do the compute work and that can be replaced in a minute on failure
by mounting the file storage from the ZFS/NFS array and restarting
the crashed server's services on the new server.
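
For what it's worth, a hedged sketch of the NFS export side of such a setup
(the dataset, pool, host, and mount-point names are hypothetical):

    # Share a filesystem from the pool over NFS for the front-end servers.
    zfs create tank/mailstores
    zfs set sharenfs=on tank/mailstores

    # On a front-end Solaris server:
    mount -F nfs zfshost:/tank/mailstores /var/mailstores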


Cheers
Chad

---
Chad Leigh -- Shire.Net LLC
Your Web App and Email hosting provider
chad at shire.net





___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Production ZFS Server Death (06/06)

2006-12-01 Thread Dana H. Myers
Chad Leigh -- Shire.Net LLC wrote:
> 
> On Dec 2, 2006, at 12:06 AM, Ian Collins wrote:

[...]

>> I don't think that the issue here, it's more one of perceived data
>> integrity.  People who have been happily using a single RAID 5 are now
>> finding that the array has been silently corrupting their data.
> 
> They are?  They are being told that the problems they are having is due
> to that but there is no proof.  It could be a bad driver for example.

Or a bad cable, or a bad controller IC, or a bad cache RAM.  Or something.
The point is, the entire path from the disk to the main system memory
is the error domain.  ZFS sits at the top of this domain and thus can
detect and correct errors that something lower in the domain cannot.

>> People
>> expect errors form single drives,
> 
> They do?  The tech specs show very low failure rates for single drives
> in terms of bit errors.

Very low.  Almost never, perhaps, but not never.  Bit errors happen.
When they do, data is corrupted.  Hence, single drives corrupt data -
just not very often and not repeatably at will.  So those soft errors
are easy to ignore or dismiss as something else.

>> so they put them in a RAID knowing the
>> firmware will protect them from drive errors.
> 
> The RAID firmware will not protect them from bit errors on block reads
> unless the disk detects that the whole block is bad.  I admit not
> knowing how much the disk itself can detect bit errors with CRC or
> similar sorts of things.

Actually, some RAID configurations should be able to detect errors as
they calculate and check the parity block.

>> They often fail to
>> recognise that the RAID firmware may not be perfect.
> 
> ZFS, JBOS disk controllers, drivers for said disk controllers, etc may
> not be perfect either.

Sure.  Nothing's perfect.  What's your point?  ZFS sits on top of that
pile of imperfection, and is thus able to make the entire error domain
no worse than ZFS itself - a domain that is likely much worse to begin with.

Dana
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Production ZFS Server Death (06/06)

2006-12-02 Thread Casper . Dik

>While other file systems, when they become corrupt, allow you to  
>salvage data :-)


They allow you to salvage what you *think* is your data.

But in reality, you have no clue what the disks are giving you.

Casper
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Production ZFS Server Death (06/06)

2006-12-02 Thread Toby Thain


On 2-Dec-06, at 2:39 AM, Dana H. Myers wrote:


Chad Leigh -- Shire.Net LLC wrote:


On Dec 2, 2006, at 12:06 AM, Ian Collins wrote:


[...]


I don't think that the issue here, it's more one of perceived data
integrity.  People who have been happily using a single RAID 5  
are now

finding that the array has been silently corrupting their data.


They are?  They are being told that the problems they are having  
is due

to that but there is no proof.  It could be a bad driver for example.


Or a bad cable, or a bad controller IC, or a bad cache RAM. Or  
something.

The point is, the entire path from the disk to the main system memory
is the error domain.  ZFS sits at the top of this domain and thus can
detect and correct errors that something lower in the domain can not.


Right on. Many people don't seem to grasp this yet.

--T




Dana
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Production ZFS Server Death (06/06)

2006-12-02 Thread Chad Leigh -- Shire.Net LLC


On Dec 2, 2006, at 6:01 AM, [EMAIL PROTECTED] wrote:




While other file systems, when they become corrupt, allow you to
salvage data :-)



They allow you to salvage what you *think* is your data.

But in reality, you have no clue what the disks are giving you.


I stand by what I said.  If you have a massive disk failure, yes,
you are right.


When you have subtle corruption, some of the data and metadata is
bad, but not all.  In that case you can recover (and verify the data,
if you have the means to do so) the parts that did not get
corrupted.  My ZFS experience so far is that it basically said the
whole 20GB pool was dead, and I seriously doubt all 20GB was corrupted.


Chad




Casper


---
Chad Leigh -- Shire.Net LLC
Your Web App and Email hosting provider
chad at shire.net





___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Production ZFS Server Death (06/06)

2006-12-02 Thread Al Hopper
On Fri, 1 Dec 2006, Chad Leigh -- Shire.Net LLC wrote:

>
> On Dec 1, 2006, at 10:17 PM, Ian Collins wrote:
>
> > Chad Leigh -- Shire.Net LLC wrote:
> >
> >>
> >> On Dec 1, 2006, at 4:34 PM, Dana H. Myers wrote:
> >>
> >>> Chad Leigh -- Shire.Net LLC wrote:
> >>>
> 
>  And this is different from any other storage system, how?  (ie,
>  JBOD
>  controllers and disks can also have subtle bugs that corrupt data)
> >>>
> >>>
> >>> Of course, but there isn't the expectation of data reliability
> >>> with a
> >>> JBOD that there is with some RAID configurations.
> >>>
> >>
> >> There is not?  People buy disk drives and expect them to corrupt
> >> their data?  I expect the drives I buy to work fine (knowing that
> >> there could be bugs etc in them, the same as with my RAID systems).
> >>
> > So you trust your important data to a single drive?  I doubt it.
> > But I
> > bet you do trust your data to a hardware RAID array.
>
> Yes, but not because I expect a single drive to be more error prone
> (versus total failure).  Total drive failure on a single disk loses
> all your data.  But we are not talking total failure, we are talking
> errors that corrupt data.  I buy individual drives with the
> expectation that they are designed to be error free and are error
> free for the most part and I do not expect a RAID array to be more
> robust in this regard (after all, the RAID is made up of a bunch of
> single drives).
>
> Some people on this list think that the RAID arrays are more likely
> to corrupt your data than JBOD (both with ZFS on top, for example, a
> ZFS mirror of 2 raid arrays or a JBOD mirror or raidz).  There is no

Can you present a cut/paste where that assertion was made?

> proof of this or even reasonable hypothetical explanation for this
> that I have seen presented.
>
> Chad
>
> >
> > Ian
>
> ---
> Chad Leigh -- Shire.Net LLC
> Your Web App and Email hosting provider
> chad at shire.net
>
>

Al Hopper  Logical Approach Inc, Plano, TX.  [EMAIL PROTECTED]
   Voice: 972.379.2133 Fax: 972.379.2134  Timezone: US CDT
OpenSolaris.Org Community Advisory Board (CAB) Member - Apr 2005
 OpenSolaris Governing Board (OGB) Member - Feb 2006
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Production ZFS Server Death (06/06)

2006-12-02 Thread Al Hopper
On Sat, 2 Dec 2006, Chad Leigh -- Shire.Net LLC wrote:

>
> On Dec 2, 2006, at 12:06 AM, Ian Collins wrote:
>
> > Chad Leigh -- Shire.Net LLC wrote:
> >
> >>
> >> On Dec 1, 2006, at 10:17 PM, Ian Collins wrote:
> >>
> >>> Chad Leigh -- Shire.Net LLC wrote:
> >>>
>  There is not?  People buy disk drives and expect them to corrupt
>  their data?  I expect the drives I buy to work fine (knowing that
>  there could be bugs etc in them, the same as with my RAID systems).
> 
> >>> So you trust your important data to a single drive?  I doubt
> >>> it.   But I
> >>> bet you do trust your data to a hardware RAID array.
> >>
> >>
> >> Yes, but not because I expect a single drive to be more error prone
> >> (versus total failure).  Total drive failure on a single disk loses
> >> all your data.  But we are not talking total failure, we are talking
> >> errors that corrupt data.  I buy individual drives with the
> >> expectation that they are designed to be error free and are error
> >> free for the most part and I do not expect a RAID array to be more
> >> robust in this regard (after all, the RAID is made up of a bunch of
> >> single drives).
> >>
> > But people expect RAID to protect them from the corruption caused by a
> > partial failure, say a bad block, which is a common failure mode.
>
> They do?  I must admit no experience with the big standalone raid
> array storage units, just (expensive) HW raid cards, but I have never
> expected an array to protect me against data corruption.  Bad blocks
> can be detected and remapped, and maybe the array can recalculate the
> block from parity etc, but that is a known disk error, and not the
> subtle kinds of errors created by the RAID array that are being
> claimed here.
>
> >   The
> > worst system failure I experienced was caused by one half of a mirror
> > experiencing bad blocks and the corrupt data being nicely mirrored on
> > the other drive.  ZFS would have saved this system from failure.
>
> None of my comments are meant to denigrate ZFS.  I am implementing it
> myself.
>
> >
> >> Some people on this list think that the RAID arrays are more likely
> >> to corrupt your data than JBOD (both with ZFS on top, for example, a
> >> ZFS mirror of 2 raid arrays or a JBOD mirror or raidz).  There is no
> >> proof of this or even reasonable hypothetical explanation for this
> >> that I have seen presented.
> >>
> > I don't think that the issue here, it's more one of perceived data
> > integrity.  People who have been happily using a single RAID 5 are now
> > finding that the array has been silently corrupting their data.
>
> They are?  They are being told that the problems they are having is
> due to that but there is no proof.  It could be a bad driver for
> example.
>
> > People
> > expect errors form single drives,
>
> They do?  The tech specs show very low failure rates for single
> drives in terms of bit errors.
>
> > so they put them in a RAID knowing the
> > firmware will protect them from drive errors.
>
> The RAID firmware will not protect them from bit errors on block
> reads unless the disk detects that the whole block is bad.  I admit
> not knowing how much the disk itself can detect bit errors with CRC
> or similar sorts of things.

This is incorrect.  Let's take a simple example of an H/W RAID5 with 4 disk
drives.  If disk 1 returns a bad block when a stripe of data is read (and
does not indicate an error condition), the RAID firmware will calculate
the parity/CRC for the entire stripe (as it *always* does), "see" that
there is an error present, and transparently correct the error before
returning the corrected data upstream to the application (server).  It
can't correct every possible error - there will be limits depending on
which CRC algorithms are implemented and the extent of the faulty data.
But, in general, those algorithms, if correctly chosen and implemented,
will correct most errors, most of the time.
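
As a toy illustration of the parity arithmetic being described (not any
particular array's firmware), XOR parity over a stripe lets one missing or
suspect member be recomputed from the remaining members plus the parity:

    # Three data "blocks" reduced to single decimal bytes for illustration.
    d1=58 d2=197 d3=23
    parity=$(( d1 ^ d2 ^ d3 ))          # what RAID5 stores alongside the data

    # If d2 is lost or suspect, rebuild it from the others plus parity.
    rebuilt_d2=$(( d1 ^ d3 ^ parity ))
    printf 'parity=%d rebuilt_d2=%d (expected %d)\n' $parity $rebuilt_d2 $d2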

The main reason why not *all* possible errors can be corrected is that
there are compromises to be made in:

- the number of bits of CRC that will be calculated and stored
- the CPU and memory resources required to perform the CRC calculations
- limitations in the architecture of the RAID h/w, for example, how much
bandwidth is available between the CPU, memory, disk I/O controllers and
what level of bus contention can be tolerated
- whether the RAID vendor wishes to make any money (hardware costs must be
minimized)
- whether the RAID vendor wishes to win benchmarking comparisons with
their competition
- how smart the firmware developers are and how much pressure is put on
them to get the product to market
- blah, blah, blah

> > They often fail to
> > recognise that the RAID firmware may not be perfect.
>
> ZFS, JBOS disk controllers, drivers for said disk controllers, etc
> may not be perfect either.
>
> >
> > ZFS looks to be the perfect tool for mirroring hardware RAID arrays,
> > with the advantage over other schemes o

Re: [zfs-discuss] Re: Production ZFS Server Death (06/06)

2006-12-02 Thread Al Hopper
On Sat, 2 Dec 2006, Chad Leigh -- Shire.Net LLC wrote:

>
> On Dec 2, 2006, at 6:01 AM, [EMAIL PROTECTED] wrote:
>
> >
> >> While other file systems, when they become corrupt, allow you to
> >> salvage data :-)
> >
> >
> > They allow you to salvage what you *think* is your data.
> >
> > But in reality, you have no clue what the disks are giving you.
>
> I stand by what I said.  If you have a massive disk failure, yes.
> You are right.
>
> When you have subtle corruption, some of the data and meta data is
> bad but not all.  In that case you can recover (and verify the data
> if you have the means to do so) t he parts that did not get
> corrupted.  My ZFS experience so far is that it basically said the
> whole 20GB pool was dead and I seriously doubt all 20GB was corrupted.

That was because you built a pool with no redundancy.  In the case where
ZFS does not have a redundant config from which to try to reconstruct the
data, it (today) simply says: sorry, Charlie - your pool is corrupt.

Regards,

Al Hopper  Logical Approach Inc, Plano, TX.  [EMAIL PROTECTED]
   Voice: 972.379.2133 Fax: 972.379.2134  Timezone: US CDT
OpenSolaris.Org Community Advisory Board (CAB) Member - Apr 2005
 OpenSolaris Governing Board (OGB) Member - Feb 2006
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Production ZFS Server Death (06/06)

2006-12-02 Thread Toby Thain


On 2-Dec-06, at 12:56 PM, Al Hopper wrote:


On Sat, 2 Dec 2006, Chad Leigh -- Shire.Net LLC wrote:



On Dec 2, 2006, at 6:01 AM, [EMAIL PROTECTED] wrote:




While other file systems, when they become corrupt, allow you to
salvage data :-)



They allow you to salvage what you *think* is your data.

But in reality, you have no clue what the disks are giving you.


I stand by what I said.  If you have a massive disk failure, yes.
You are right.

When you have subtle corruption, some of the data and meta data is
bad but not all.  In that case you can recover (and verify the data
if you have the means to do so) t he parts that did not get
corrupted.  My ZFS experience so far is that it basically said the
whole 20GB pool was dead and I seriously doubt all 20GB was  
corrupted.


That was because you built a pool with no redundancy.  In the case  
where
ZFS does not have a redundant config from which to try to  
reconstruct the

data (today) it simply says: sorry charlie - you pool is corrupt.


Is that the whole story though? Even without redundancy, isn't there  
a lot of resilience against corruption (redundant metadata, etc)?


--Toby



Regards,

Al Hopper  Logical Approach Inc, Plano, TX.  [EMAIL PROTECTED]
   Voice: 972.379.2133 Fax: 972.379.2134  Timezone: US CDT
OpenSolaris.Org Community Advisory Board (CAB) Member - Apr 2005
 OpenSolaris Governing Board (OGB) Member - Feb 2006
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Production ZFS Server Death (06/06)

2006-12-02 Thread Chad Leigh -- Shire.Net LLC


On Dec 2, 2006, at 10:56 AM, Al Hopper wrote:


On Sat, 2 Dec 2006, Chad Leigh -- Shire.Net LLC wrote:



On Dec 2, 2006, at 6:01 AM, [EMAIL PROTECTED] wrote:




While other file systems, when they become corrupt, allow you to
salvage data :-)



They allow you to salvage what you *think* is your data.

But in reality, you have no clue what the disks are giving you.


I stand by what I said.  If you have a massive disk failure, yes.
You are right.

When you have subtle corruption, some of the data and meta data is
bad but not all.  In that case you can recover (and verify the data
if you have the means to do so) t he parts that did not get
corrupted.  My ZFS experience so far is that it basically said the
whole 20GB pool was dead and I seriously doubt all 20GB was  
corrupted.


That was because you built a pool with no redundancy.  In the case  
where
ZFS does not have a redundant config from which to try to  
reconstruct the

data (today) it simply says: sorry charlie - you pool is corrupt.


Where a RAID system would still be salvageable.

Chad

---
Chad Leigh -- Shire.Net LLC
Your Web App and Email hosting provider
chad at shire.net





___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Production ZFS Server Death (06/06)

2006-12-02 Thread Jeff Victor

Chad Leigh -- Shire.Net LLC wrote:


On Dec 2, 2006, at 10:56 AM, Al Hopper wrote:


On Sat, 2 Dec 2006, Chad Leigh -- Shire.Net LLC wrote:



On Dec 2, 2006, at 6:01 AM, [EMAIL PROTECTED] wrote:




While other file systems, when they become corrupt, allow you to
salvage data :-)


They allow you to salvage what you *think* is your data.

But in reality, you have no clue what the disks are giving you.



I stand by what I said.  If you have a massive disk failure, yes.
You are right.

When you have subtle corruption, some of the data and meta data is
bad but not all.  In that case you can recover (and verify the data
if you have the means to do so) t he parts that did not get
corrupted.  My ZFS experience so far is that it basically said the
whole 20GB pool was dead and I seriously doubt all 20GB was  corrupted.


That was because you built a pool with no redundancy.  In the case  where
ZFS does not have a redundant config from which to try to  reconstruct the
data (today) it simply says: sorry charlie - you pool is corrupt.


Where a RAID system would still be salvageable.


That is a comparison of apples to oranges.  The RAID system has Redundancy.  If 
the ZFS pool had been configured with redundancy, it would have fared at least as 
well as the RAID system.


Without redundancy, neither of them can magically reconstruct data.  The RAID 
system would simply be an AID system.



--
Jeff VICTOR  Sun Microsystemsjeff.victor @ sun.com
OS AmbassadorSr. Technical Specialist
Solaris 10 Zones FAQ:http://www.opensolaris.org/os/community/zones/faq
--
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Production ZFS Server Death (06/06)

2006-12-02 Thread Chad Leigh -- Shire.Net LLC


On Dec 2, 2006, at 12:29 PM, Jeff Victor wrote:


Chad Leigh -- Shire.Net LLC wrote:

On Dec 2, 2006, at 10:56 AM, Al Hopper wrote:

On Sat, 2 Dec 2006, Chad Leigh -- Shire.Net LLC wrote:



On Dec 2, 2006, at 6:01 AM, [EMAIL PROTECTED] wrote:




While other file systems, when they become corrupt, allow you to
salvage data :-)


They allow you to salvage what you *think* is your data.

But in reality, you have no clue what the disks are giving you.



I stand by what I said.  If you have a massive disk failure, yes.
You are right.

When you have subtle corruption, some of the data and meta data is
bad but not all.  In that case you can recover (and verify the data
if you have the means to do so) t he parts that did not get
corrupted.  My ZFS experience so far is that it basically said the
whole 20GB pool was dead and I seriously doubt all 20GB was   
corrupted.


That was because you built a pool with no redundancy.  In the  
case  where
ZFS does not have a redundant config from which to try to   
reconstruct the

data (today) it simply says: sorry charlie - you pool is corrupt.

Where a RAID system would still be salvageable.


That is a comparison of apples to oranges.  The RAID system has  
Redundancy.  If the ZFS pool had been configured with redundancy,  
it would have fared at least as well as the RAID system.


Without redundancy, neither of them can magically reconstruct  
data.  The RAID system would simply be an AID system.


That is not the question.  Assuming the error came OUT of the RAID  
system (which it did in this case as there was a bug in the driver  
and the cache did not get flushed in a certain shutdown situation),  
another FS would have been salvageable as the whole 20GB of the pool  
was not corrupt.


Chad

---
Chad Leigh -- Shire.Net LLC
Your Web App and Email hosting provider
chad at shire.net





___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Production ZFS Server Death (06/06)

2006-12-02 Thread Dick Davies

On 02/12/06, Chad Leigh -- Shire.Net LLC <[EMAIL PROTECTED]> wrote:


On Dec 2, 2006, at 10:56 AM, Al Hopper wrote:

> On Sat, 2 Dec 2006, Chad Leigh -- Shire.Net LLC wrote:



>> On Dec 2, 2006, at 6:01 AM, [EMAIL PROTECTED] wrote:



>> When you have subtle corruption, some of the data and meta data is
>> bad but not all.  In that case you can recover (and verify the data
>> if you have the means to do so) t he parts that did not get
>> corrupted.  My ZFS experience so far is that it basically said the
>> whole 20GB pool was dead and I seriously doubt all 20GB was
>> corrupted.



> That was because you built a pool with no redundancy.  In the case
> where
> ZFS does not have a redundant config from which to try to
> reconstruct the
> data (today) it simply says: sorry charlie - you pool is corrupt.



Where a RAID system would still be salvageable.


RAID level what? How is anything salvageable if you lose your only copy?

ZFS does store multiple copies of metadata in a single vdev, so I
assume we're talking about data here.
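
(A hedged aside: metadata ditto blocks are automatic, and on ZFS releases
that support the copies property the same idea can be extended to user data
within a single vdev - though it does not protect against losing the whole
device.  Dataset names below are hypothetical.)

    zfs create tank/important
    zfs set copies=2 tank/important    # keep two copies of each data block
    zfs get copies tank/important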

--
Rasputin :: Jack of All Trades - Master of Nuns
http://number9.hellooperator.net/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Production ZFS Server Death (06/06)

2006-12-02 Thread Rich Teer
On Sat, 2 Dec 2006, Al Hopper wrote:

> > Some people on this list think that the RAID arrays are more likely
> > to corrupt your data than JBOD (both with ZFS on top, for example, a
> > ZFS mirror of 2 raid arrays or a JBOD mirror or raidz).  There is no
> 
> Can you present a cut/paste where that assertion was made?

I don't want to put words in Chad's mouth, but I think he might be
misunderstanding representations that people make here about ZFS vs
HW RAID.  I don't think that people have asserted that "RAID arrays
are more likely to corrupt data than a JBOD"; what I think people ARE
asserting is that corruption is more likely to go undetected in a HW
RAID than in a JBOD with ZFS.  (A subtle, but important, difference.)

The reason for this is understandable: if you write some data to a
HW RAID device, you assume that unless otherwise notified, your data
is safe.  The HW RAID, by its very nature, is a black box that we
assume is OK.  With ZFS+JBOD, ZFS' built-in end-to-end error checking
will catch any silent errors created in the JBOD, when they happen,
and can correct them (or at least notify you) right away.

-- 
Rich Teer, SCSA, SCNA, SCSECA, OpenSolaris CAB member

President,
Rite Online Inc.

Voice: +1 (250) 979-1638
URL: http://www.rite-group.com/rich
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Production ZFS Server Death (06/06)

2006-12-06 Thread Chad Leigh -- Shire.Net LLC


On Dec 2, 2006, at 12:35 PM, Dick Davies wrote:


On 02/12/06, Chad Leigh -- Shire.Net LLC <[EMAIL PROTECTED]> wrote:


On Dec 2, 2006, at 10:56 AM, Al Hopper wrote:

> On Sat, 2 Dec 2006, Chad Leigh -- Shire.Net LLC wrote:



>> On Dec 2, 2006, at 6:01 AM, [EMAIL PROTECTED] wrote:



>> When you have subtle corruption, some of the data and meta data is
>> bad but not all.  In that case you can recover (and verify the  
data

>> if you have the means to do so) t he parts that did not get
>> corrupted.  My ZFS experience so far is that it basically said the
>> whole 20GB pool was dead and I seriously doubt all 20GB was
>> corrupted.



> That was because you built a pool with no redundancy.  In the case
> where
> ZFS does not have a redundant config from which to try to
> reconstruct the
> data (today) it simply says: sorry charlie - you pool is corrupt.



Where a RAID system would still be salvageable.


RAID level what? How is anything salvagable if you lose your only  
copy?




The whole RAID does not fail -- we are talking about corruption
here.  If you lose some inodes, your whole partition is not gone.


My ZFS pool could not be salvaged -- poof, the whole thing was gone (granted,
it was a test pool and not a raidz or mirror yet).  But still, for
what happened, I cannot believe that 20G of data got messed up
because a 1GB cache was not correctly flushed.


Sorry for "bailing" on this topic.  I did not mean to.  I was on
babysitting duty over the weekend and the new week hit with a vengeance
on Monday.  I will try to post a last post or two on the subject and
let it die.


Chad


ZFS does store multiple copies of metadata in a single vdev, so I
assume we're talking about data here.




---
Chad Leigh -- Shire.Net LLC
Your Web App and Email hosting provider
chad at shire.net





___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Production ZFS Server Death (06/06)

2006-12-07 Thread Jeremy Teo

The whole raid does not fail -- we are talking about corruption
here.  If you lose some inodes your whole partition is not gone.

My ZFS pool would not salvage -- poof, whole thing was gone (granted
it was a test one and not a raidz or mirror yet).  But still, for
what happened, I cannot believe that 20G of data got messed up
because a 1GB cache was not correctly flushed.


Chad, I think what you're asking for is for a zpool to allow you to
salvage whatever remaining data passes its checksums.

--
Regards,
Jeremy
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Production ZFS Server Death (06/06)

2006-12-09 Thread Richard Elling

Jeremy Teo wrote:

The whole raid does not fail -- we are talking about corruption
here.  If you lose some inodes your whole partition is not gone.

My ZFS pool would not salvage -- poof, whole thing was gone (granted
it was a test one and not a raidz or mirror yet).  But still, for
what happened, I cannot believe that 20G of data got messed up
because a 1GB cache was not correctly flushed.


If you mess up all 4 critical blocks (all copies of the same uberblock
metadata), then this is possible.  It is expected that this would be a
highly unlikely event, since the blocks are spread across the device.
Inserting a cache between what ZFS thinks is a device and the actual device
complicates this a bit, but it is not clear to me what alternative
would be better for all possible failure modes.


Chad, I think what you're saying is for a zpool to allow you to
salvage whatever remaining data that passes it's checksums.


That is the way I read this thread.  Perhaps a job for zdb (I speculate
because I've never used zdb)?  Perhaps a zdb expert can chime in...
 -- richard
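
(A hedged example for the curious - the device path is hypothetical and
zdb's output varies by build: zdb can at least dump the four on-disk vdev
labels of a device, which is where the uberblock copies Richard mentions
are kept.)

    # Print the configuration stored in each of the four vdev labels.
    zdb -l /dev/dsk/c1t2d0s0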

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss