Re: [zfs-discuss] The importance of ECC RAM for ZFS

2009-08-01 Thread Victor Latushkin

On 25.07.09 00:30, Rob Logan wrote:

  The post I read said OpenSolaris guest crashed, and the guy clicked
  the ``power off guest'' button on the virtual machine.

 I seem to recall guest hung. 99% of solaris hangs (without
 a crash dump) are hardware in nature. (my experience backed by
 an uptime of 1116days) so the finger is still
 pointed at VirtualBox's hardware implementation.

 as for ZFS requiring better hardware, you could turn
 off checksums and other protections so one isn't notified
 of issues making it act like the others.


You cannot turn off checksums and copies for metadata though, so even if you 
don't care about your data ZFS still cares about its metadata.


victor
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] The importance of ECC RAM for ZFS

2009-07-31 Thread Kurt Olsen
 On Jul 24, 2009, at 22:17, Bob Friesenhahn wrote:
 
 Most of the issues that I've read on this list would have been solved
 if there was a mechanism where the user / sysadmin could tell ZFS to
 simply go back until it found a TXG that worked.

 The trade off is that any transactions (and their data) after the
 working one would be lost. But at least you're not left with an
 un-importable pool.

I'm curious as to why people think rolling back txgs doesn't come with additional
costs beyond losing recent transactions. What are the odds that the data blocks
that were replaced by the discarded transactions haven't been overwritten?
Without a snapshot to hold the references, aren't those blocks considered free
and available for reuse?

Don't get me wrong, I do think that rolling back to previous uberblocks should
be an option vs. total pool loss, but it doesn't seem like one can reliably say
that the data is in some known good state.
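
To make the concern above concrete, here is a toy sketch in Python (not ZFS code). It assumes a hypothetical ToyPool in which every commit writes to a fresh or freed block, and in which a block is released for reuse as soon as a newer txg supersedes it. Real ZFS allocation and free handling are more involved; the class, its fields and the immediate-reuse policy are all assumptions made for illustration.

```python
import hashlib


def checksum(data: bytes) -> str:
    # SHA-256 stands in for whatever checksum the pool really uses.
    return hashlib.sha256(data).hexdigest()


class ToyPool:
    """Toy copy-on-write pool: each txg's uberblock points at one data block."""

    def __init__(self):
        self.blocks = {}      # block number -> bytes currently on disk
        self.uberblocks = []  # one (txg, block number, checksum) per commit
        self.free = []        # block numbers available for reuse

    def commit(self, data: bytes) -> None:
        # Never overwrite the live block: take a freed one, else a fresh one.
        new_block = self.free.pop(0) if self.free else len(self.blocks)
        self.blocks[new_block] = data
        if self.uberblocks:
            # No snapshot holds the previous txg's block, so it becomes free.
            self.free.append(self.uberblocks[-1][1])
        self.uberblocks.append((len(self.uberblocks), new_block, checksum(data)))

    def verify(self, txg: int) -> bool:
        # Does the block this txg points at still match its recorded checksum?
        _, block, expected = self.uberblocks[txg]
        return checksum(self.blocks[block]) == expected


pool = ToyPool()
for payload in (b"v0", b"v1", b"v2"):
    pool.commit(payload)

for txg in range(3):
    print("txg", txg, "still intact:", pool.verify(txg))
```

In this toy model, rolling back one txg lands on intact data, while rolling back two lands on a block that has since been reused. The recorded checksum is what reveals the difference.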
-- 
This message posted from opensolaris.org


Re: [zfs-discuss] The importance of ECC RAM for ZFS

2009-07-27 Thread Marc Bevand
dick hoogendijk dick at nagual.nl writes:
 
 Then why is it that most AMD MoBo's in the shops clearly state that ECC
 Ram is not supported on the MoBo?

To restate what Erik explained: *all* AMD CPUs support ECC RAM. However, poorly
written motherboard specs often confuse non-ECC vs. ECC with unbuffered vs.
registered (these are 2 completely unrelated technical characteristics). So,
don't blindly trust manuals saying ECC RAM is not supported.

-mrb



Re: [zfs-discuss] The importance of ECC RAM for ZFS

2009-07-26 Thread Erik Trimble

dick hoogendijk wrote:

 On Sat, 25 Jul 2009 21:58:48 +0000 (UTC)
 Marc Bevand m.bev...@gmail.com wrote:

  dick hoogendijk dick at nagual.nl writes:

   I live in Holland and it is not easy to find motherboards that (a)
   truly support ECC ram and (b) are (Open)Solaris compatible.

  Virtually all motherboards for AMD processors support ECC RAM because
  the memory controller is in the CPU and all AMD CPUs support ECC RAM.

 Then why is it that most AMD MoBo's in the shops clearly state that ECC
 Ram is not supported on the MoBo?

  
All /OPTERON/ chips support ECC: unbuffered (non-registered) ECC in the case
of the 100/1000 series, and registered ECC in the case of the
200/2000/800/8000 series.


I _believe_ all socket AM2, AM2+ and AM3 consumer chips (Phenom, Phenom 
II, Athlon X2, Athlon X3 and Athlon X4) also support unbuffered 
non-registered ECC.   The AMD Specs page for the above processors 
indicates I'm right about those CPUs.



I think what they're (the retail shops, that is) stating is consumer AMD 
CPUs won't take the server (i.e. registered) ECC DIMMs.


A quick glance at ASUS's website shows that all current consumer (i.e. 
socket AM2/2+/3) AMD motherboards from them support unregistered, 
unbuffered ECC.  I suspect it's the same for the other board makers, too.  


--
Erik Trimble
Java System Support
Mailstop:  usca22-123
Phone:  x17195
Santa Clara, CA



Re: [zfs-discuss] The importance of ECC RAM for ZFS

2009-07-26 Thread Erik Trimble
Erik Trimble wrote: 


 I _believe_ all socket AM2, AM2+ and AM3 consumer chips (Phenom,
 Phenom II, Athlon X2, Athlon X3 and Athlon X4) also support unbuffered
 non-registered ECC.   The AMD Specs page for the above processors
 indicates I'm right about those CPUs.


Quick correction:   the current AMD CPUs are  Phenom X3, Phenom X4, 
Phenom II, Athlon X2, Athlon, and Sempron. 

According to the Processor Data Sheets for all AMD CPUs, they /all/
support ECC RAM (in some form), all the way back to the Socket 754 chips.


--
Erik Trimble
Java System Support
Mailstop:  usca22-123
Phone:  x17195
Santa Clara, CA



Re: [zfs-discuss] The importance of ECC RAM for ZFS

2009-07-25 Thread Michael McCandless
Thanks for the numerous responses everyone!  Responding to some of the
answers...:

 ZFS has to trust the storage to have committed the data it 
 claims to have committed in the same way it has to trust the integrity 
 of the RAM it uses for checksummed data.

I hope that's not true.

Ie, I can understand that if an IO system lies in an fsync call
(returns before the bits are in fact on stable storage), ZFS might
lose the pool.  EG it seems like that may've been what happened on the
VB thread (though I agree since it was only the guest that crashed,
the writes should in fact have made it to disk, so...).

But if a bit flips in RAM, at a particularly unlucky moment, is there
any chance whatsoever that ZFS could lose the pool?  There seem to be
mixed opinions here so far... but if I were tallying votes it looks
like more people say "no, it cannot" than "yes, it may".

  For example, if the wrong bit flips at the wrong time, could I lose my
  entire RAID-Z pool instead of, say, corrupting one file's contents or
  metadata? Is there such a possibility?
 
 Not likely, but I don't think anyone has done such low-level
 analysis to prove it.

So this is exactly what I'm driving at -- has there really been no
such low level failure analysis?  Ie, if a bit error happens at point
XYZ in ZFS's code, what's the impact (for XYZ at all interesting
points)?

EG say (pure speculation) ZFS has a global checksum that's written on
closing the pool, and then later the pool cannot be imported when the
checksum is bad.  Since a bit error could corrupt that checksum, this
would in fact mean I could lose the pool due to an unluckily timed
bit error.

The decision (to use ECC or not) ought to be a basic cost/benefit
analysis, once one has the facts.  I'm trying to get to the facts
here... ie, if you don't use ECC just how bad is it when bit errors
inevitably happen?  If the effects are local (file/dir contents &
metadata get corrupted) that's one thing; if I could lose the pool
that's very different.

[Eventually] armed with the facts, one should be free to decide on ECC
or not just like one picks, say, the latest & greatest consumer hard
drive (higher risk of errors since they have no track record) or a
known good enterprise hard drive.

 You still have the processor to worry about though.

and

 NB many hard disk drives and controllers have only parity protected
 memory. So even if your main memory is ECC, it is unlikely that the
 entire data path is ECC protected.

These are good points -- even if you have ECC RAM, your CPU and PCI
bus and other parts of the data path could still flip bits.  So I'm
really hoping the answer is "no, you'll never lose the pool from
bit errors".

 http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6667683
 
 Most of the issues that I've read on this list would have been 
 solved if there was a mechanism where the user / sysadmin could tell 
 ZFS to simply go back until it found a TXG that worked.

This one sounds important!  Any means of disaster recovery would be
very welcome...

BTW is there some way for a user to vote/comment on bugs?  EG I think
I've hit this one:

  http://bugs.opensolaris.org/view_bug.do?bug_id=6807184

And would love to vote, share my config, situation, etc.  But I can't
find any links that let me, there are no comments on the bug, etc.


Re: [zfs-discuss] The importance of ECC RAM for ZFS

2009-07-25 Thread Ian Collins

Michael McCandless wrote:

 Thanks for the numerous responses everyone!  Responding to some of the
 answers...:

  ZFS has to trust the storage to have committed the data it
  claims to have committed in the same way it has to trust the integrity
  of the RAM it uses for checksummed data.

 I hope that's not true.

 Ie, I can understand that if an IO system lies in an fsync call
 (returns before the bits are in fact on stable storage) that ZFS might
 lose the pool.  EG it seems like that may've been what happened on the
 VB thread (though I agree since it was only the guest that crashed,
 the writes should in fact have made it to disk, so...).

 But if a bit flips in RAM, at a particularly unlucky moment, is there
 any chance whatsoever that ZFS could lose the pool?  There seems to be
 mixed opinions here so far... but if I were tallying votes it looks
 like more people say no, it cannot than yes it may.

  
I've never seen reports of that happening.  What I have seen is 
corrupted files.  Without checksums, the files would have been silently 
corrupted.


--
Ian.



Re: [zfs-discuss] The importance of ECC RAM for ZFS

2009-07-25 Thread Marc Bevand
dick hoogendijk dick at nagual.nl writes:
 
 I live in Holland and it is not easy to find motherboards that (a)
 truly support ECC ram and (b) are (Open)Solaris compatible.

Virtually all motherboards for AMD processors support ECC RAM because the 
memory controller is in the CPU and all AMD CPUs support ECC RAM.

I have heard of a few BIOSes that refuse to POST if ECC RAM is detected, but 
this is often an attempt to segment markets, rather than a real lack of 
ability to support ECC RAM.

-mrb



[zfs-discuss] The importance of ECC RAM for ZFS

2009-07-24 Thread Michael McCandless
I've read in numerous threads that it's important to use ECC RAM in a
ZFS file server.

My question is: is there any technical reason, in ZFS's design, that
makes it particularly important for ZFS to require ECC RAM?

Is ZFS especially vulnerable, moreso than other filesystems, to bit
errors in RAM?

For example, if the wrong bit flips at the wrong time, could I lose my
entire RAID-Z pool instead of, say, corrupting one file's contents or
metadata?  Is there such a possibility?

(Assume the rest of the hardware stack behaves, eg an fsync to the
drive won't return until the bytes are written to stable storage).

I had assumed that a bit error from RAM would only have a localized
effect (eg, corrupt the contents or metadata of file or directory)
each time it struck, but now I'm wondering if the failure could be
global because of something in ZFS's design, and that's why the
recommendation for ECC RAM is always so strong.

Some of the posts in this thread (Another user loses his pool...):

  http://opensolaris.org/jive/thread.jspa?threadID=108213&tstart=0

make me think ZFS may in fact require ECC RAM.


Re: [zfs-discuss] The importance of ECC RAM for ZFS

2009-07-24 Thread Rich Teer
On Fri, 24 Jul 2009, Michael McCandless wrote:

 I've read in numerous threads that it's important to use ECC RAM in a
 ZFS file server.
 
 My question is: is there any technical reason, in ZFS's design, that
 makes it particularly important for ZFS to require ECC RAM?

[...]

 Some of the posts in this thread (Another user loses his pool...):
 
   http://opensolaris.org/jive/thread.jspa?threadID=108213&tstart=0
 
 make me think ZFS may in fact require ECC RAM.

I don't think it's ZFS per se that requires ECC RAM.  More likely,
it's any application (in the use sense, not a program) that actually
cares about being able to detect--and preferably correct--errors in
memory.

Given that data integrity is presumably important in every non-gaming
computing use, I don't understand why people even consider not using
ECC RAM all the time.  The hardware cost delta is a red herring: how
much would undetected memory errors cost an organisation?  That's the
true cost of skimping on memory by using non-ECC RAM, IMHO.

HTH,

-- 
Rich Teer, SCSA, SCNA, SCSECA

URLs: http://www.rite-group.com/rich
  http://www.linkedin.com/in/richteer


Re: [zfs-discuss] The importance of ECC RAM for ZFS

2009-07-24 Thread Kyle McDonald

Michael McCandless wrote:

 I've read in numerous threads that it's important to use ECC RAM in a
 ZFS file server.

 My question is: is there any technical reason, in ZFS's design, that
 makes it particularly important for ZFS to require ECC RAM?

I think the basic idea is that if you're going to use ZFS to
protect your data from this sort of thing through the path to the stable
storage, then it seems like a shame (or a waste?) not to equally
protect the data both before it's given to ZFS for writing, and after
ZFS reads it back and returns it to you.


 -Kyle



Re: [zfs-discuss] The importance of ECC RAM for ZFS

2009-07-24 Thread dick hoogendijk
On Fri, 24 Jul 2009 07:19:40 -0700 (PDT)
Rich Teer rich.t...@rite-group.com wrote:

 Given that data integrity is presumably important in every non-gaming
 computing use, I don't understand why people even consider not using
 ECC RAM all the time.  The hardware cost delta is a red herring:

I live in Holland and it is not easy to find motherboards that (a)
truly support ECC ram and (b) are (Open)Solaris compatible.

-- 
Dick Hoogendijk -- PGP/GnuPG key: 01D2433D
+ http://nagual.nl/ | nevada / OpenSolaris 2010.02 B118
+ All that's really worth doing is what we do for others (Lewis Carrol)


Re: [zfs-discuss] The importance of ECC RAM for ZFS

2009-07-24 Thread dick hoogendijk
On Fri, 24 Jul 2009 10:44:36 -0400
Kyle McDonald kmcdon...@egenera.com wrote:
 ... then it seems  like a shame (or a waste?)  not to equally
 protect the data both before it's given to ZFS for writing, and after
 ZFS reads it back and returns it to you.

But that was not the question.
The question was: [quote] My question is: is there any technical
reason, in ZFS's design, that makes it particularly important for ZFS
to require ECC RAM?

-- 
Dick Hoogendijk -- PGP/GnuPG key: 01D2433D
+ http://nagual.nl/ | nevada / OpenSolaris 2010.02 B118
+ All that's really worth doing is what we do for others (Lewis Carrol)


Re: [zfs-discuss] The importance of ECC RAM for ZFS

2009-07-24 Thread Richard Elling

On Jul 24, 2009, at 3:18 AM, Michael McCandless wrote:


 I've read in numerous threads that it's important to use ECC RAM in a
 ZFS file server.

It is important to use ECC RAM.  The embedded market and
server market demand ECC RAM. It is only the el-cheapo PC
market that does not. Going back to some of the early studies
by IBM on errors in PC memory, it is really a shame that the
market has not moved on.

 My question is: is there any technical reason, in ZFS's design, that
 makes it particularly important for ZFS to require ECC RAM?

No.

 Is ZFS especially vulnerable, moreso than other filesystems, to bit
 errors in RAM?

No.  Except that ZFS actually does check data integrity. So ZFS can
detect if you had a problem.  Other file systems can be blissfully
ignorant of data corruption.

 For example, if the wrong bit flips at the wrong time, could I lose my
 entire RAID-Z pool instead of, say, corrupting one file's contents or
 metadata?  Is there such a possibility?

Not likely, but I don't think anyone has done such low-level
analysis to prove it.

 (Assume the rest of the hardware stack behaves, eg an fsync to the
 drive won't return until the bytes are written to stable storage).

 I had assumed that a bit error from RAM would only have a localized
 effect (eg, corrupt the contents or metadata of file or directory)
 each time it struck, but now I'm wondering if the failure could be
 global because of something in ZFS's design, and that's why the
 recommendation for ECC RAM is always so strong.

IMHO, the reason this gets discussed on zfs-discuss so frequently
is because ZFS detects data corruption and people start to
speculate about the source.

NB many hard disk drives and controllers have only parity protected
memory. So even if your main memory is ECC, it is unlikely that the
entire data path is ECC protected.

 Some of the posts in this thread (Another user loses his pool...):

  http://opensolaris.org/jive/thread.jspa?threadID=108213&tstart=0

 make me think ZFS may in fact require ECC RAM.

The root cause of this thread's woes has absolutely nothing to
do with ECC RAM. It has everything to do with VirtualBox configuration.
 -- richard



Re: [zfs-discuss] The importance of ECC RAM for ZFS

2009-07-24 Thread Robert Milkowski

dick hoogendijk wrote:

 On Fri, 24 Jul 2009 10:44:36 -0400
 Kyle McDonald kmcdon...@egenera.com wrote:

  ... then it seems  like a shame (or a waste?)  not to equally
  protect the data both before it's given to ZFS for writing, and after
  ZFS reads it back and returns it to you.

 But that was not the question.
 The question was: [quote] My question is: is there any technical
 reason, in ZFS's design, that makes it particularly important for ZFS
 to require ECC RAM?

  


No, there isn't.


--
Robert Milkowski
http://milek.blogspot.com



Re: [zfs-discuss] The importance of ECC RAM for ZFS

2009-07-24 Thread Nicolas Williams
On Fri, Jul 24, 2009 at 05:01:15PM +0200, dick hoogendijk wrote:
 On Fri, 24 Jul 2009 10:44:36 -0400
 Kyle McDonald kmcdon...@egenera.com wrote:
  ... then it seems  like a shame (or a waste?)  not to equally
  protect the data both before it's given to ZFS for writing, and after
  ZFS reads it back and returns it to you.
 
 But that was not the question.
 The question was: [quote] My question is: is there any technical
 reason, in ZFS's design, that makes it particularly important for ZFS
 to require ECC RAM?

The only thing I can think of is this: if a cosmic ray flips a bit in
memory holding a ZFS transaction that's already had all its checksums
computed, but hasn't hit disk yet, then you'll have a checksum
verification failure later when you read back the affected file (or
directory).  Using ECC memory avoids that.  You still have the processor
to worry about though.
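
A minimal sketch of the scenario above in Python, with SHA-256 standing in for the pool's checksum (an assumption; the function names are illustrative and this is not ZFS code). The checksum is taken while the block is still good, a bit flips in RAM before the write, and the mismatch only shows up on a later read.

```python
import hashlib


def checksum(block: bytes) -> bytes:
    # SHA-256 stands in for whichever checksum the pool is configured to use.
    return hashlib.sha256(block).digest()


def flip_bit(block: bytes, bit_index: int) -> bytes:
    # Return a copy of block with exactly one bit inverted.
    corrupted = bytearray(block)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


# 1. The block sits in RAM and its checksum is computed.
in_memory = b"file contents waiting in a transaction group"
stored_checksum = checksum(in_memory)

# 2. A bit flips in RAM before the block is written out.
on_disk = flip_bit(in_memory, 5)

# 3. A later read recomputes the checksum and compares it with the stored one.
read_ok = checksum(on_disk) == stored_checksum
print("read verifies:", read_ok)
```

The write itself succeeds without complaint. The damage is only reported when the affected block is read back.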

Nico


Re: [zfs-discuss] The importance of ECC RAM for ZFS

2009-07-24 Thread Miles Nordin
 "re" == Richard Elling richard.ell...@gmail.com writes:

 re> The root cause of this thread's woes have absolutely nothing
 re> to do with ECC RAM. It has everything to do with VirtualBox
 re> configuration.

What part of VirtualBox configuration?

The post I read said OpenSolaris guest crashed, and the guy clicked
the ``power off guest'' button on the virtual machine.  The host never
crashed.  So whether the IDE cache flush parameter was set or not,
whether the guest backing store was a file or a raw disk, seems
irrelevant to me.

Is there a correct way to configure it, or will any component of the
overall system other than ZFS always get blamed when ZFS loses a pool?




Re: [zfs-discuss] The importance of ECC RAM for ZFS

2009-07-24 Thread Rob Logan

 The post I read said OpenSolaris guest crashed, and the guy clicked
 the ``power off guest'' button on the virtual machine.

I seem to recall the guest hung. 99% of Solaris hangs (without
a crash dump) are hardware in nature (my experience, backed by
an uptime of 1116 days), so the finger is still
pointed at VirtualBox's hardware implementation.

As for ZFS requiring better hardware, you could turn
off checksums and other protections so one isn't notified
of issues, making it act like the others.

Rob


Re: [zfs-discuss] The importance of ECC RAM for ZFS

2009-07-24 Thread Bob Friesenhahn

On Fri, 24 Jul 2009, Miles Nordin wrote:

 The post I read said OpenSolaris guest crashed, and the guy clicked
 the ``power off guest'' button on the virtual machine.  The host never
 crashed.  so whether the IDE cache flush parameter was set or not,

Clicking ``power off guest'' is the same as walking up and pulling the
power cord out of the wall.  That is not how the guest operating
system is supposed to be shut down.


If VirtualBox does not at least flush pending writes (that it lied 
about) when the user clicks on ``power off guest'' then it has 
committed a crime.  Regardless, it has committed a crime.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/


Re: [zfs-discuss] The importance of ECC RAM for ZFS

2009-07-24 Thread Ian Collins

Rob Logan wrote:

 The post I read said OpenSolaris guest crashed, and the guy clicked
 the ``power off guest'' button on the virtual machine.

 I seem to recall guest hung. 99% of solaris hangs (without
 a crash dump) are hardware in nature. (my experience backed by
 an uptime of 1116days) so the finger is still
 pointed at VirtualBox's hardware implementation.

 as for ZFS requiring better hardware, you could turn
 off checksums and other protections so one isn't notified
 of issues making it act like the others.

Maybe not better hardware, but honest hardware.  Every piece of software 
depends to some extent on the devices it uses honouring their 
contracts.  ZFS has to trust the storage to have committed the data it 
claims to have committed in the same way it has to trust the integrity 
of the RAM it uses for checksummed data.


That's the price you pay for end to end checksums.

--
Ian.



Re: [zfs-discuss] The importance of ECC RAM for ZFS

2009-07-24 Thread Frank Middleton

On 07/24/09 04:35 PM, Bob Friesenhahn wrote:

 Regardless, it [VirtualBox] has committed a crime.


But ZFS is a journalled file system! Any hardware can lose a flush;
it's just more likely in a VM, especially when anything Microsoft
is involved, and the whole point of journalling is to prevent things
like this happening. However the issue is moot since CR 6667683 is
being addressed. Here's a related thought - does it make sense to
mirror ZFS on iscsi if the host drives are themselves ZFS mirrors?

The whole question of the requirement for ECC depends on your
tolerance for loss of files vs. errors in files. As Richard
Elling points out, there are other sources of error (e.g.,
no checking of PCI parity). But that isn't relevant to the ECC
on main memory question. You can disable checksumming, and then
ZFS is no worse in this regard than any other file system; bad
files get read and you either notice or you don't, but you won't
lose any because of fatal checksum errors and you still have all
the other great features of ZFS.

If you don't mirror, all bets are off. You should set copies=2 or
higher and cross your fingers. You should also disable file
checksumming in ZFS and in that sense degenerate to the behavior
of lesser file systems. However mirroring doesn't buy you much
here because it evidently doesn't double buffer the write before
calculating the checksum, so a stray bitflip can cause metadata or
data corruption, causing a mirrored file to have an unrecoverable
checksum failure (of course there are many other reasons to mirror).
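
A rough Python sketch of the distinction drawn above, under the stated assumption that both sides of a mirror are written from the same in-memory buffer after the checksum has been computed. The record layout and function names are made up for illustration and say nothing about how ZFS actually stages its writes.

```python
import hashlib


def checksum(block: bytes) -> bytes:
    # SHA-256 is a stand-in for the pool's configured checksum.
    return hashlib.sha256(block).digest()


def flip_bit(block: bytes, bit_index: int) -> bytes:
    # Return a copy of block with exactly one bit inverted.
    corrupted = bytearray(block)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


def mirrored_write(buffer: bytes, expected: bytes) -> dict:
    # Assumption: both mirror sides receive the same in-memory buffer.
    return {"side_a": buffer, "side_b": buffer, "checksum": expected}


def read_with_self_heal(record: dict):
    # Return the first copy whose checksum verifies, or None if neither does.
    for side in ("side_a", "side_b"):
        if checksum(record[side]) == record["checksum"]:
            return record[side]
    return None


original = b"block queued for write"
expected = checksum(original)

# Case 1: clean write, then one disk rots. The other side satisfies the read.
rotted = mirrored_write(original, expected)
rotted["side_a"] = flip_bit(original, 3)
healed = read_with_self_heal(rotted)

# Case 2: the bit flips in RAM after checksumming. Both sides get the bad buffer.
bad_buffer = mirrored_write(flip_bit(original, 3), expected)
lost = read_with_self_heal(bad_buffer)

print("disk-side corruption recovered:", healed == original)
print("RAM-side corruption recovered: ", lost is not None)
```

Under that assumption, redundancy repairs damage that happens on one device, but not damage that was already in the buffer before any copy was written.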

The real question is - what is the probability of this occurring?
IMO the typical SOHO user has a 1 in 10 to 1 in 100 chance of this
happening in a year of reasonably constant operation (a few dozen
writes/day). I believe that this can be mitigated by setting
copies=2, a good idea anyway if you have biggish disks since, as
Richard Elling has pointed out in his excellent blogs, if you need
to resilver after a disk failure you have a rather large possibility
of a disk read error causing file loss and copies=2 also mitigates
this. Note that hopefully fixing CR 6667683 should eliminate any
possibility of losing an entire mirrored or raidz pool.

So, it seems to me ZFS has a definite dependency on ECC for reliable
operation. However, for non-commercial uses (i.e., less than an
hour or so a day of writes) the probability of losing a file is
fairly small and can be mitigated still further by setting copies=2.
But to eliminate the possibility entirely, you must have ECC. You
should also make sure that the buses have at least parity if not
ECC and that this is actually checked - maybe Richard can comment
on this since I believe he thinks this is a more likely source
of errors.

HTH -- Frank









Re: [zfs-discuss] The importance of ECC RAM for ZFS

2009-07-24 Thread Ian Collins

Frank Middleton wrote:

 On 07/24/09 04:35 PM, Bob Friesenhahn wrote:

  Regardless, it [VirtualBox] has committed a crime.

 But ZFS is a journalled file system!

Even a journalled file system has to trust the journal.  If the storage
says the journal is committed and it isn't, all bets are off.

The issue we see here with ZFS appears to be the lack of a means of
rewinding to a known sane state when this happens.

 The whole question of the requirement for ECC depends on your
 tolerance for loss of files vs. errors in files. As Richard
 Elling points out, there are other sources of error (e.g.,
 no checking of PCI parity). But that isn't relevant to the ECC
 on main memory question. You can disable checksumming, and then
 ZFS is no worse in this regard than any other file system; bad
 files get read and you either notice or you don't, but you won't
 lose any because of fatal checksum errors and you still have all
 the other great features of ZFS,

That's probably the root of the issues we see here: ZFS does a great job
of telling you when something is irrevocably broken, but doesn't (yet)
offer a means of fixing the problem.  I guess ZFS is a bit like a single
bit parity scheme that reports, but does not correct, (gross) errors.
When these are used in an on-the-wire protocol, bad packets can either be
dropped or retransmitted.  With a file system, only the former option is
available; the original is lost.


Transmission protocols are always designed to manage data errors.  
Filesystems have traditionally been designed to ignore them, assuming 
the round trip from CPU to storage and back is 100% reliable.  ZFS has 
changed the rules.


--
Ian.



Re: [zfs-discuss] The importance of ECC RAM for ZFS

2009-07-24 Thread David Magda

On Jul 24, 2009, at 16:00, Miles Nordin wrote:

 Is there a correct way to configure it, or will always any
 component of the overall system other than ZFS get blamed when ZFS
 loses a pool?

By default VB does not respect the 'disk sync' command that a guest OS
could send--it's just ignored. This messes up ZFS' assumption about
transactions being safely on-disk.


Toby Thain posted a link to a VB forum posting on how to configure  
things so that the flush command is not silently ignored:


http://forums.virtualbox.org/viewtopic.php?f=8&t=13661&start=0

ZFS doesn't make many assumptions, but it does assume that when the  
disk says the data is safe it actually is. If the disk lies then  
this is where things go wonky (and why we have these giant threads).
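
For reference, the setting usually cited for this is an IgnoreFlush extradata key set through `VBoxManage setextradata`. The Python sketch below only builds the command line; it does not run anything. The key path is written from memory of the VirtualBox manual rather than taken from the linked forum post, and the VM name, the controller name (`piix3ide` for IDE) and the LUN number are placeholders, so check the manual for your version before relying on it.

```python
def ignore_flush_command(vm_name: str, controller: str = "piix3ide",
                         lun: int = 0, ignore: bool = False) -> list:
    """Build (but do not run) a VBoxManage call that sets IgnoreFlush.

    ignore=False asks VirtualBox to pass guest flush requests through.
    The key layout below is an assumption to verify against the manual.
    """
    key = f"VBoxInternal/Devices/{controller}/0/LUN#{lun}/Config/IgnoreFlush"
    return ["VBoxManage", "setextradata", vm_name, key, "1" if ignore else "0"]


# "opensolaris-guest" is a hypothetical VM name used only for illustration.
argv = ignore_flush_command("opensolaris-guest")
print(" ".join(argv))
```

Building the argument list separately makes it easy to review the exact key before handing it to a shell or to `subprocess.run`.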



Re: [zfs-discuss] The importance of ECC RAM for ZFS

2009-07-24 Thread Bob Friesenhahn

On Fri, 24 Jul 2009, Frank Middleton wrote:


 On 07/24/09 04:35 PM, Bob Friesenhahn wrote:

  Regardless, it [VirtualBox] has committed a crime.

 But ZFS is a journalled file system! Any hardware can lose a flush;


From my understanding, ZFS is not a journalled file system.  ZFS 
relies on ordered writes followed by a cache sync (to make sure that 
the bits are on disk) and so it does not use a journaled transaction 
rollback mechanism.


Here is a description of what a journaling file system is:

  http://en.wikipedia.org/wiki/Journaling_file_system

Notice that the second sentence introduces the notion of a race 
condition, but since ZFS uses ordered writes using freshly allocated 
space, there is no possibility of a race condition.  A journaling 
filesystem uses a journal (transaction log) to roll back (replace with 
previous data) the unordered writes in an incomplete transaction.  In 
the case of ZFS, it is only necessary to go back to the most recent 
checkpoint and any subsequent writes after that checkpoint are simply 
forgotten.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/


Re: [zfs-discuss] The importance of ECC RAM for ZFS

2009-07-24 Thread David Magda

On Jul 24, 2009, at 22:17, Bob Friesenhahn wrote:

 A journaling filesystem uses a journal (transaction log) to roll
 back (replace with previous data) the unordered writes in an
 incomplete transaction.  In the case of ZFS, it is only necessary to
 go back to the most recent checkpoint and any subsequent writes
 after that checkpoint are simply forgotten.


And fixing CR 6667683 is what would allow ZFS to properly /
automatically recover from a messed up power down:

  "need a way to rollback to an uberblock from a previous txg"
  http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6667683

Most of the issues that I've read on this list would have been  
solved if there was a mechanism where the user / sysadmin could tell  
ZFS to simply go back until it found a TXG that worked.


The trade off is that any transactions (and their data) after the
working one would be lost. But at least you're not left with an
un-importable pool.
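
The "go back until it found a TXG that worked" idea can be sketched in Python as a newest-to-oldest scan. This is only an illustration of the requested behaviour: the `Uberblock` tuple, its fields, and the use of SHA-256 are assumptions, and a real check would have to walk far more than one root block.

```python
import hashlib
from typing import NamedTuple, Optional, Sequence


class Uberblock(NamedTuple):
    txg: int
    root: bytes         # stand-in for the tree this uberblock points at
    root_checksum: str  # checksum recorded when the txg was committed


def checksum(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def newest_working_txg(uberblocks: Sequence[Uberblock]) -> Optional[Uberblock]:
    """Scan from newest to oldest; return the first uberblock that verifies."""
    for ub in sorted(uberblocks, key=lambda u: u.txg, reverse=True):
        if checksum(ub.root) == ub.root_checksum:
            return ub
    return None


history = [
    Uberblock(100, b"state at txg 100", checksum(b"state at txg 100")),
    Uberblock(101, b"state at txg 101", checksum(b"state at txg 101")),
    # txg 102 was acknowledged, but its data never reached the disk intact.
    Uberblock(102, b"garbage", checksum(b"state at txg 102")),
]

chosen = newest_working_txg(history)
print("import from txg:", chosen.txg if chosen else None)
```

Everything written after the chosen txg is discarded, which is exactly the trade off described above.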




Re: [zfs-discuss] The importance of ECC RAM for ZFS

2009-07-24 Thread Toby Thain


On 24-Jul-09, at 6:41 PM, Frank Middleton wrote:


 On 07/24/09 04:35 PM, Bob Friesenhahn wrote:

  Regardless, it [VirtualBox] has committed a crime.

 But ZFS is a journalled file system! Any hardware can lose a flush;


No, the problematic default in VirtualBox is flushes being *ignored*,  
which has a different failure mode. A host crash under this regime  
can potentially corrupt *any* journaled and transactional system  
(starting with filesystems and RDBMS) in a manner that does not occur  
on properly functioning bare metal that honours flushes, because  
their ordering assumptions no longer hold.
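
A toy Python model of why ignored flushes break ordering assumptions. The `ToyDisk` class is hypothetical: it keeps acknowledged writes in a volatile cache, and on a crash it models the worst case for the writer, where only the newest cached write happened to reach stable storage. It is a sketch of the failure mode, not of any real disk, hypervisor or filesystem.

```python
class ToyDisk:
    """Disk with a volatile write cache that may or may not honour flushes."""

    def __init__(self, honour_flush: bool):
        self.honour_flush = honour_flush
        self.cache = []   # acknowledged but not yet stable, in issue order
        self.stable = {}  # what survives a crash

    def write(self, key: str, value: str) -> None:
        self.cache.append((key, value))

    def flush(self) -> None:
        if self.honour_flush:
            self.stable.update(self.cache)
            self.cache.clear()
        # Otherwise: report success without making anything stable.

    def crash(self) -> dict:
        # Worst case: the cache had drained only its newest entry.
        if self.cache:
            key, value = self.cache[-1]
            self.stable[key] = value
        self.cache.clear()
        return dict(self.stable)


def transactional_update(disk: ToyDisk) -> None:
    disk.write("data", "new contents")
    disk.flush()  # barrier: data is supposed to be stable before the commit
    disk.write("commit", "points at new contents")
    # The crash happens here, before any final flush.


def consistent(state: dict) -> bool:
    # A commit record must never exist without the data it refers to.
    return "commit" not in state or "data" in state


honest = ToyDisk(honour_flush=True)
transactional_update(honest)
honest_state = honest.crash()

lying = ToyDisk(honour_flush=False)
transactional_update(lying)
lying_state = lying.crash()

print("flush honoured, consistent:", consistent(honest_state))
print("flush ignored, consistent: ", consistent(lying_state))
```

With the barrier honoured, the commit record can only ever appear alongside its data. With the barrier ignored, the commit record can survive on its own, which no journal or transaction scheme is designed to tolerate.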


Whether this is 'possible' with a guest-only crash is arguable - I
don't want to speak for Miles, but I suspect he was reasoning that a
guest crash would not interact with ignore-flush, as all I/O issued
up until the crash should eventually complete - making a
guest crash similar to a real crash. But the virtualised stack is
complex enough that I don't know if we can be certain about that.


I would say that ignoring flushes is still a suspect.



 it's just more likely in a VM, especially when anything Microsoft
 is involved,


I originally saw the problem on a Ubuntu system, 6 months ago. The  
subsystems which broke were ext3fs and InnoDB - both supposedly  
journaling.



 and the whole point of journalling is to prevent things
 like this happening.



It can ONLY do that when flushes/barriers/ordering are respected.

--Toby



 ...
 HTH -- Frank






