Re: [OpenIndiana-discuss] Zfs stability Scrubs

2012-10-24 Thread David Brodbeck
On Mon, Oct 15, 2012 at 5:02 PM, Richard Elling 
richard.ell...@richardelling.com wrote:

 There is some interesting research that shows how scrubs for RAID-5 systems
 can contaminate otherwise good data. The reason is that if a RAID-5 parity
 mismatch occurs, how do you know where the data corruption is when the disks
 themselves do not fail? In those cases, scrubs are evil. ZFS does not suffer
 from this problem because the checksums are stored in the parent's metadata.


A similar problem happens for traditional RAID-1 mirrors.  If mirror
verification shows the two disks differ, there's no way of knowing which is
correct.

-- 
David Brodbeck
System Administrator, Linguistics
University of Washington
___
OpenIndiana-discuss mailing list
OpenIndiana-discuss@openindiana.org
http://openindiana.org/mailman/listinfo/openindiana-discuss


Re: [OpenIndiana-discuss] Zfs stability Scrubs

2012-10-15 Thread Heinrich van Riel
On Mon, Oct 15, 2012 at 6:21 PM, Jason Matthews ja...@broken.net wrote:



 From: heinrich.vanr...@gmail.com [mailto:heinrich.vanr...@gmail.com]


  My point is that most high-end storage units have some form of data
  verification process that is active all the time.

 As does ZFS. The blocks are checksummed on each read. Assuming you have
 mirrors or parity redundancy, the misbehaving block is corrected,
 reallocated, etc.

 Right, I understand that ZFS checks data on each read; my point is about
checking the disks or data periodically.


 In my opinion, scrubs should be considered depending on the importance
  of the data, with the frequency based on the type of raidz, the rate of
  change, and the disk type used.

 One point of scrubs is to verify the data that you don't normally read.
 Otherwise, the errors would be found in real time upon the next read.


Understood; if full backups are executed weekly/monthly, no scrub is
required.


  Perhaps in the future ZFS will have the ability to limit resource
  allocation when scrubbing, like with BV where it can be set. Rebuild
  priority can also be set.

 There are tunables for this.

 Thanks, I did not know that and will research it; it had a fairly heavy
impact the other day while replacing a disk.


  Also, some high-end controllers have a port-verify (media read) for each
  disk that runs periodically when using their integrated RAID. Since in
  the world of ZFS it is recommended to use JBOD, I see it as more than
  just the filesystem. I have never deployed a system containing mission
  critical data using filesystem RAID protection other than with ZFS,
  since there is no protection in the others and I would much rather bank
  on the controller.


 Unfortunately my parser was unable to grok this. Seems like you would
 prefer
 a raid controller.



Sorry, it boils down to this: if ZFS is not an option, I use a RAID
controller when the data is important.
In fact I do not like to be tied to a specific controller; ZFS gives me the
freedom to change at any point.


 j.



___
OpenIndiana-discuss mailing list
OpenIndiana-discuss@openindiana.org
http://openindiana.org/mailman/listinfo/openindiana-discuss


Re: [OpenIndiana-discuss] Zfs stability Scrubs

2012-10-15 Thread Richard Elling

On Oct 15, 2012, at 3:00 PM, heinrich.vanr...@gmail.com wrote:

 Most of my storage background is with EMC CX and VNX, and those are used in a
 vast number of datacenters.
 They run a process called the sniffer that runs in the background and requests
 a read of all blocks on each disk individually for a specific LUN; if there is
 an unrecoverable read error, a Background Verify (BV) is requested by the
 process to check for data consistency. The unit will also conduct a proactive
 copy to a hot spare, I believe once the data has been verified, from the disk
 where the error(s) were seen.
 
 A BV is also requested when there is a LUN failover, enclosure path failure 
 or a storage processor failure.
 
 
 My point is that most high-end storage units have some form of data
 verification process that is active all the time.

Don't assume BV is data verification. On most midrange systems these scrubbers
just check for disks to report errors. While this should catch most media
errors, it does not catch phantom writes or other corruption in the datapath.
On systems with SATA disks, there is no way to add any additional checksums to
the sector, so they are SOL if there is data corruption that does not also
cause a disk failure. For SAS or FC disks, some vendors use larger sectors and
include per-sector checksums that can help catch some phantom-write or
datapath corruption.

There is some interesting research that shows how scrubs for RAID-5 systems can
contaminate otherwise good data. The reason is that if a RAID-5 parity mismatch
occurs, how do you know where the data corruption is when the disks themselves
do not fail? In those cases, scrubs are evil. ZFS does not suffer from this
problem because the checksums are stored in the parent's metadata.
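
For anyone who wants to see that in practice, below is a minimal, throwaway
sketch using file-backed vdevs (pool name, file paths and offsets are made up
for the example; do not try this on a pool you care about). Because every
block's checksum is stored in its parent, ZFS knows which side of the mirror
is wrong and repairs it during the scrub:

  mkfile 128m /var/tmp/d1 /var/tmp/d2
  zpool create demopool mirror /var/tmp/d1 /var/tmp/d2
  dd if=/dev/urandom of=/demopool/junk bs=1024k count=32
  zpool export demopool
  # clobber part of one half of the mirror, well past the front vdev labels
  dd if=/dev/urandom of=/var/tmp/d1 bs=1024k oseek=16 count=8 conv=notrunc
  zpool import -d /var/tmp demopool
  zpool scrub demopool
  zpool status -v demopool    # shows CKSUM errors on d1, repaired from d2
  zpool destroy demopool; rm /var/tmp/d1 /var/tmp/d2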

 In my opinion, scrubs should be considered depending on the importance of the
 data, with the frequency based on the type of raidz, the rate of change, and
 the disk type used.
 
 Perhaps in the future ZFS will have the ability to limit resource allocation
 when scrubbing, like with BV where it can be set. Rebuild priority can also
 be set.

Throttling exists today, but most people don't consider mdb a suitable method
for setting it :-(
Scrub priority is already the lowest priority; I don't see much need to increase it.
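
For the archives: the knobs Richard alludes to are ZFS kernel tunables such as
zfs_scrub_delay and zfs_resilver_delay (clock ticks to wait between scrub or
resilver I/Os when the pool is busy). Treat the following as a sketch only -
tunable names and defaults vary between illumos builds, so verify them on your
release before poking at a live kernel:

  # read the current value
  echo "zfs_scrub_delay/D" | mdb -k
  # slow scrubs down (a larger delay is gentler on production I/O);
  # this change does not survive a reboot
  echo "zfs_scrub_delay/W0t10" | mdb -kw
  # to make it persistent, add a line like this to /etc/system:
  #   set zfs:zfs_scrub_delay = 10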
 -- richard

 Also, some high-end controllers have a port-verify (media read) for each disk
 that runs periodically when using their integrated RAID. Since in the world of
 ZFS it is recommended to use JBOD, I see it as more than just the filesystem.
 I have never deployed a system containing mission critical data using
 filesystem RAID protection other than with ZFS, since there is no protection
 in the others and I would much rather bank on the controller.
 
 
 
 my few cents on scrubs. 
 
 
 
 Thanks
 
 
 
 
 
Re: [OpenIndiana-discuss] Zfs stability Scrubs

2012-10-15 Thread Jim Klimov

2012-10-16 3:57, Heinrich van Riel wrote:

Understood; if full backups are executed weekly/monthly, no scrub is
required.


I'd argue that this is not a completely true statement.

It might hold for raidzN backing storage with single-copy blocks, but if
mirrors and/or two or three copies are involved (e.g. for metadata blocks, or
ditto blocks on deduped pools), you have, say, a 50/50 or 33/67 chance of
reading any particular copy of a block during the backup procedure, and if
errors hide in the copies you did not read, you'll miss them.

That's where scrub should shine, by enforcing reads of all copies
of all blocks while walking the block pointer tree of the pool.

Hope I'm correct ;)
//Jim

___
OpenIndiana-discuss mailing list
OpenIndiana-discuss@openindiana.org
http://openindiana.org/mailman/listinfo/openindiana-discuss


Re: [OpenIndiana-discuss] Zfs stability Scrubs

2012-10-13 Thread Roel_D
Thank you all for the good answers!

So if I put it all together:
1. ZFS is, in mirror and RAID configs, the best currently available option for
reliable data
2. Even without scrubs, data is checked for integrity on every read
3. Unread data will not be checked for integrity
4. Scrubs will solve point 3
5. Real servers with good hardware (HCL), ECC memory and server-grade hard
disks have a very low chance of data loss/corruption when used with ZFS
6. Large modern drives with large capacity, like any drive over 750 GB, have a
higher chance of corruption
7. Real SAS and SCSI drives offer the best option for reliable data
8. So-called near-line SAS drives can give problems when combined with ZFS
because they haven't been tested very long
9. Checking your logs for hardware messages should be a daily job



Kind regards, 

The out-side

On 13 Oct 2012, at 05:26, Michael Stapleton
michael.staple...@techsologic.com wrote:

 I'm not a mathematician, but can anyone calculate the chance of the same
 8K data block on both submirrors going bad on terabyte drives, before
 the data is ever read and fixed automatically during normal read
 operations?
 And if you are not doing mirroring, you have already accepted a much
 larger margin of error for the sake of $.
 
 The VAST majority of data centers are not storing data in storage that
 does checksums to verify data, that is just the reality. Regular backups
 and site replication rule.
 
 I am not saying scrubs are a bad thing, just that they are being
 over-emphasized, and some people who do not really understand are getting the
 wrong impression that doing scrubs very often will somehow make them a
 lot safer.
 Scrubs help. But a lot of people who are worrying about scrubs are not
 even doing proper backups or regular DR testing.
 
 
 Mike
 
 On Fri, 2012-10-12 at 22:36 -0400, Doug Hughes wrote:
 
 So, a lot of people have already answered this in various ways.
 I'm going to provide a little bit of direct answer and focus for some of
 those other answers (and emphasis).
 
 On 10/12/2012 5:07 PM, Michael Stapleton wrote:
 It is easy to understand that ZFS scrubs can be useful. But how often do
 we scrub, or do the equivalent, on any other file system? UFS? VXFS?
 NTFS? ...
 ZFS has scrubs as a feature, but is it a need? I do not think so. Other
 file systems accept the risk, mostly because they cannot really do
 anything if there were errors.
 That's right. They cannot do anything. Why is that a good thing? If you
 have corruption on your filesystem because a block or even a single
 bit went wrong, wouldn't you want to know? Wouldn't you want to fix it?
 What if a number in an important financial document changed? It seems
 unlikely, but we've discovered at least 5 instances of spontaneous disk
 data corruption over the course of a couple of years. ZFS corrected them
 transparently. No data lost; automatic, clean, and transparent. The
 more data we create, the more that possibility of spontaneous data
 corruption becomes reality.
 It does no harm to do periodic scrubs, but I would not recommend doing
 them often or even at all if scrubs get in the way of production.
 What is the real risk of not doing scrubs?
 Data changing without you knowing it. Maybe this doesn't matter for an
 image file (though a JPEG could end up looking nasty or destroyed, and
 MPEG-4 could be permanently damaged, while in a TIFF or other uncompressed
 format you'd probably never know).
 
 
 Risk can not be eliminated, and we have to accept some risk.
 
 For example, data deduplication uses digests on data to detect
 duplication. Most dedup systems assume that if the digest is the same
 for two pieces of data, then the data must be the same.
 This assumption is not actually true. Two differing pieces of data can
 have the same digest, but the chance of this happening is so low that
 the risk is accepted.
 But the risk of data being flipped once you have TBs of data is way
 above 0%. You can also do your own erasure coding if you like. That
 would be one way to achieve the same effect outside of ZFS.
 
 
 I'm only writing this because I get the feeling some people think scrubs
 are a need. Maybe people associate doing scrubs with something like
 doing NTFS defrags?
 NTFS defrag only helps with performance; a scrub helps with
 integrity. Totally different things.
 
 

___
OpenIndiana-discuss mailing list
OpenIndiana-discuss@openindiana.org
http://openindiana.org/mailman/listinfo/openindiana-discuss


Re: [OpenIndiana-discuss] Zfs stability Scrubs

2012-10-13 Thread Roel_D
10. If Sun had listened to the engineers instead of the financial people, it
would now be the market leader in the server market ;-(



Re: [OpenIndiana-discuss] Zfs stability Scrubs

2012-10-13 Thread Jim Klimov

2012-10-13 2:06, Jan Owoc wrote:

All scrubbing does is put stress on the drives and verify that the data can
still be read from them. If a hard drive ever fails on you and you
need to replace it (how often does that happen?), then you know that just
last week all the other hard drives were able to read their data
under stress, so they are less likely to fail on you.


Also note that there are different types of media that are
impacted differently by I/O. CDs/DVDs and tape can pick up more
scratches upon reads, SSDs wear out upon writes, while HDDs
in stable conditions (good heat, power and vibration) don't
mind doing I/O as far as their media is concerned, though the
mechanics of the head movement can wear out - thus, see the disk's
ratings (e.g. 24x7 or not) and vendor-assumed lifetime.

I have heard a statement, which I am ready to accept but cannot
vouch for the validity of, that having the magnetic head
read the bits from the platter can actually help the media
hold its data, by aligning the magnetic domains to one of
their two valid positions. Due to Brownian motion and
other factors, these miniature crystals can turn around
in their little beds and spell zeroes or ones with less
and less exactness. Applying oriented magnetic fields can
push them back into one of the stable positions.

Well, whether that was crap or not - I'm not ready to say,
but one thing that is more likely true is that HDDs have
ECC on their sectors. If a read produces repairable bad
data, the HDD itself can try to repair the sector in-place
or by relocation to spare area, perhaps by applying stronger
fields to discern the bits better, and if it succeeds - it
would return no error to the HBA and return the fixed data.
If the repair result was wrong, ZFS would detect incorrect
data and issue its own repairs, using other copies or raidzN
permutations. Also note that this self-repair takes time
while the HDD does nothing else, and *that* IO timeout can
cause grief for RAID systems, HBA reset storms and so on
(hence the RAID editions of drives, TLER and so on).

On the other hand, if you're putting regular stress on the
disks and see some error counters (monitoring!) go high,
you can preemptively order and replace aging disks, instead
of trying to recover from a pool with reduced redundancy
a few days or months later.

HTH,
//Jim Klimov


___
OpenIndiana-discuss mailing list
OpenIndiana-discuss@openindiana.org
http://openindiana.org/mailman/listinfo/openindiana-discuss


Re: [OpenIndiana-discuss] Zfs stability Scrubs

2012-10-13 Thread Jim Klimov

A few more comments:

2012-10-13 11:56, Roel_D wrote:

Thank you all for the good answers!

So if i put it all together :
1. ZFS is, in mirror and RAID configs, the best currently available option for 
reliable data


Yes, though even ZFS is not a replacement for backups, because
data loss can be caused by reasons outside ZFS's control,
including admin errors, datacenter fires, code bugs and so on.


2. Without scrubs data is checked on every read for integrity


With normal reads, this check only takes place for the one
semi-randomly chosen copy of the block. If this copy is not
valid, other copies are consulted.


3. Unread data will not be checked for integrity
4. Scrubs will solve point 3.


Yes, because they enforce reads and checks of all copies.


5. Real servers with good hardware (HCL), ECC memory and servergrade harddisks 
have a very low chance of dataloss/corruption when used with ZFS.


Put otherwise, cheaper hardware tends to cause problems of
various kinds that cannot be detected and fixed by that
hardware, so corrupted data is propagated to ZFS, which
faithfully saves the trash to disk. Few programs do
verify-on-write to test the saved results...


6. Large modern drives with large capacity, like any drive over 750 GB, have a
higher chance of corruption


The bit-error rates are about the same for disks of the
past decade, roughly one bad bit per 10Tb of I/O. With
disk sizes and overall throughput growing, the chance of
hitting an error on a particular large disk increases.
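
As a back-of-the-envelope check (assuming the commonly quoted consumer-drive
spec of one unrecoverable read error per 10^14 bits, i.e. roughly 12.5 TB
read; the exact figure depends on the drive):

  $ echo "scale=3; (3 * 10^12 * 8) / 10^14" | bc
  .240

so a single end-to-end read of a 3 TB drive already carries an expected ~0.24
unrecoverable read errors - which is why scrubs plus redundancy matter more
as drives grow.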


7. Real SAS and SCSi drives offer the best option for reliable data
8. So called near-line SAS drives can give problems when combined with ZFS 
because they haven't been tested very long


There are also some architectural things and lessons learned,
like don't use SATA disks with SAS expanders, while direct
attachment of SATA disks to individual HBA ports works without
problems (i.e. Sun Thumpers are built like this - with six
eight-port HBAs on board to drive the 48 disks in the box).


9. Checking your logs for hardware messages should be a daily job


Better yet, some monitoring system (Nagios, Zabbix, whatever)
should check these logs, so you have one dashboard for all your
computers with a big green light on it, meaning no problems
detected anywhere. You can start worrying if the light goes non-green ;)
You should also run manual drills against the system to test
that the monitoring itself works correctly, though - but that
can be a non-daily routine.
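
On illumos/OpenIndiana the raw inputs for such a check are easy to script;
a rough sketch (the commands are standard, the filtering is just an example):

  # faults already diagnosed by FMA
  fmadm faulty
  # per-device soft/hard/transport error counters
  iostat -En | egrep "Errors|Vendor"
  # pools that are not healthy
  zpool list -H -o name,health | grep -v ONLINE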

//Jim

___
OpenIndiana-discuss mailing list
OpenIndiana-discuss@openindiana.org
http://openindiana.org/mailman/listinfo/openindiana-discuss


Re: [OpenIndiana-discuss] Zfs stability Scrubs

2012-10-13 Thread Jim Klimov

2012-10-13 7:26, Michael Stapleton wrote:

The VAST majority of data centers are not storing data in storage that
does checksums to verify data, that is just the reality. Regular backups
and site replication rule.


And this actually concerns me... we help maintain some deployments
built by customers including professional arrays like Sun Storagetek
6140 serving a few LUNs to directly attached servers (so it happens).

The arrays are black boxes to us - we don't know whether they use
something block-checksummed similar to ZFS inside, or can only
protect against whole-disk failures, where a device just stops
responding.

We still have little idea - in what config would the data be
safer to hold a ZFS pool, and which should give more performance:
* if we use the array with its internal RAID6, and the client
  computer makes a pool over the single LUN
* a couple of RAID6 array boxes in a mirror provided by arrays'
  firmware (independently of client computers, who see a MPxIO
  target LUN), and the computer makes a pool over the single
  multi-pathed LUN
* a couple of RAID6 array boxes in a mirror provided by ZFS
  (two independent LUNs mirrored by computer)
* serve LUNs from each disk in JBOD manner from the one or two
  arrays, and have ZFS construct pools over that.

Having expensive hardware RAIDs (already available on the customer's
site) serve as JBODs is kind of overkill - any well-built JBOD
costing a fraction of this array would suffice. But weighing the
data integrity known to be provided by ZFS against that only assumed
to be provided by black-box appliances, downgrading the arrays
to JBODs might be better. Who knows?.. (We don't; advice welcome.)



There are several more things to think about:

1) A redundant config without knowledge of which side of the mirror
   is good, or which permutation of RAID blocks yields the correct
   answer, is basically useless, and it can propagate errors by
   overwriting a copy that happened to be good with one that
   happened to be corrupted.

   For example, take a root mirror. You find that your OS can't
   boot. You can try to split the mirror into two separate disks,
   fsck each of them and if one is still correct, recreate the
   mirror using it as base (first half). Even if both disks give
   some errors, these might be in different parts of the data, so
   you have a chance of reconstructing the data using these two
   halves and/or backups. However, if your simplistic RAID just
   copies data from disk1 to disk2 in case of any discrepancies
   and unclean shutdowns, you're roughly 50% likely to corrupt a
   good disk2 with bad data from disk1.

   This setup assumed that bit-rot never occurred or was too rare,
   bus/RAM errors never happened or were ruled out by CRC/ECC,
   and instead disks died altogether, instantly becoming bricks
   (which could be quite true in the old days, and can still be
   probable with expensive enterprise hardware). Basically, this
   assumed that data written from a process was the same data that
   hit the disk platters and the same data that was returned upon
   reads (unless an IO error/deviceMissing were reported) - in that
   case old RAIDs could indeed propagate assumed-good data onto
   replacement disk(s) during reconstruction of the array.

2) Backups and replicas without means to verify them (checksums
   or at least three-way comparisons at some level) are also
   tainted, because you don't really know if what you read from
   them ever matches what you wrote to them (perhaps several years
   ago, counting from the moment the data was written onto RAID
   originally).

My few cents,
//Jim

___
OpenIndiana-discuss mailing list
OpenIndiana-discuss@openindiana.org
http://openindiana.org/mailman/listinfo/openindiana-discuss


Re: [OpenIndiana-discuss] Zfs stability Scrubs

2012-10-13 Thread Michael Stapleton
Nice list.
You could add:

10. Dedup comes with a price.


Mike




Re: [OpenIndiana-discuss] Zfs stability - our scrub script

2012-10-13 Thread Jim Klimov

2012-10-13 0:41, Doug Hughes wrote:

Yes, you should do a scrub, and no, there isn't very much risk in this. It
will scan your disks for bits that have gone stale or the like. You should
do it. We do a scrub once per week.


Just in case this helps anyone, here's the script we use to
initiate scrubbing from cron (e.g. once a week, on Fridays).
Just add a line to crontab and receive emails ;)

There's some config-initialization and include cruft at the
start (we have a large package of admin scripts); I hope the
absence of the config files (which can be used to override the
hardcoded defaults) and libraries won't preclude the script
from running on systems without our package:
# cat /opt/COSas/bin/zpool-scrub.sh
-
#!/bin/bash

# $Id: zpool-scrub.sh,v 1.6 2010/11/15 14:32:19 jim Exp $
# This script will go through all pools and scrub them one at a time.
#
# Use like this in crontab:
# 0 22 * * 5 [ -x /opt/COSas/bin/zpool-scrub.sh ] && /opt/COSas/bin/zpool-scrub.sh
#
# (C) 2007 nic...@aspiringsysadmin.com and commenters
#     http://aspiringsysadmin.com/blog/2007/06/07/scrub-your-zfs-file-systems-regularly/
# (C) 2009 Jim Klimov, cosmetic mods and logging; 2010 - locking
#

#[ x"$MAILRECIPIENT" = x ] && MAILRECIPIENT=ad...@domain.com
[ x"$MAILRECIPIENT" = x ] && MAILRECIPIENT=root

[ x"$ZPOOL" = x ]   && ZPOOL=/usr/sbin/zpool
[ x"$TMPFILE" = x ] && TMPFILE="/tmp/scrub.sh.$$.$RANDOM"
[ x"$LOCK" = x ]    && LOCK="/tmp/`basename $0`.`dirname $0 | sed 's/\//_/g'`.lock"

COSAS_BINDIR=`dirname $0`
if [ x"$COSAS_BINDIR" = x./ -o x"$COSAS_BINDIR" = x. ]; then
    COSAS_BINDIR=`pwd`
fi

# Source optional config files (they may override the defaults above)
[ x"$COSAS_CFGDIR" = x ] && COSAS_CFGDIR="$COSAS_BINDIR/../etc"
if [ -d "$COSAS_CFGDIR" ]; then
    [ -f "$COSAS_CFGDIR/COSas.conf" ] && \
        . "$COSAS_CFGDIR/COSas.conf"
    [ -f "$COSAS_CFGDIR/`basename $0`.conf" ] && \
        . "$COSAS_CFGDIR/`basename $0`.conf"
fi

[ ! -x "$ZPOOL" ] && exit 1

### Include this after config files, in case of RUNLEVEL_NOKICK mask override
RUN_CHECKLEVEL=""
[ -s "$COSAS_BINDIR/runlevel_check.include" ] && \
    . "$COSAS_BINDIR/runlevel_check.include" && \
    block_runlevel

# Check the lock file so two copies don't run at once
if [ -f "$LOCK" ]; then
    OLDPID=`head -n 1 "$LOCK"`
    BN=`basename $0`
    TRYOLDPID=`ps -ef | grep "$BN" | grep -v grep | awk '{ print $2 }' | grep "$OLDPID"`

    if [ x"$TRYOLDPID" != x ]; then
        LF=`cat "$LOCK"`
        echo "= ZPoolScrub wrapper aborted because another copy is running - lockfile found:
$LF
Aborting..." | wall
        exit 1
    fi
fi
echo $$ > "$LOCK"

scrub_in_progress() {
    ### Check that we're not yet shutting down
    if [ x"$RUN_CHECKLEVEL" != x ]; then
        if [ x"`check_runlevel`" != x ]; then
            echo "INFO: System is shutting down. Aborting scrub of pool '$1'!" >&2
            zpool scrub -s "$1"
            return 1
        fi
    fi

    if $ZPOOL status "$1" | grep "scrub in progress" > /dev/null; then
        return 0
    else
        return 1
    fi
}

RESULT=0
for pool in `$ZPOOL list -H -o name`; do
    echo "=== `TZ=UTC date` @ `hostname`: $ZPOOL scrub $pool started..."
    $ZPOOL scrub "$pool"

    while scrub_in_progress "$pool"; do sleep 60; done

    echo "=== `TZ=UTC date` @ `hostname`: $ZPOOL scrub $pool completed"

    # Report any pool whose last scrub did not end with 0 errors
    if ! $ZPOOL status "$pool" | grep "with 0 errors" > /dev/null; then
        $ZPOOL status "$pool" | tee -a "$TMPFILE"
        RESULT=$(($RESULT+1))
    fi
done

if [ -s "$TMPFILE" ]; then
    cat "$TMPFILE" | mailx -s "zpool scrub on `hostname` generated errors" "$MAILRECIPIENT"
fi

rm -f "$TMPFILE"

# Be nice, clean up
rm -f "$LOCK"

exit $RESULT

-


HTH,
//Jim Klimov

___
OpenIndiana-discuss mailing list
OpenIndiana-discuss@openindiana.org
http://openindiana.org/mailman/listinfo/openindiana-discuss


Re: [OpenIndiana-discuss] Zfs stability Scrubs

2012-10-13 Thread Michael Stapleton
Some basic thoughts:


The one advantage of using a storage array instead of a JBOD is the
write cache when doing random writes. But the cost is that you lose the
data integrity features if the ZFS pool is not configured with
redundancy.

ZFS works best when it has multiple direct paths to multiple physical
devices configured with mirrored VDevs.

So the bottom line for ZFS is that JBODs are almost always the best
choice, as long as the quality of the devices and device drivers is
similar.

SANs provide centralized administration and maintenance, which is their
main feature. 

If you could map actual hard drives from the SAN to ZFS everyone could
be happy.

Backups done while services are running all too often result in unhappy
people.
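
With ZFS, one common way around that is to back up from a snapshot, so the
source is frozen at a single, stable point in time. A minimal sketch (the pool
name "tank" and the host "backuphost" are placeholders):

  SNAP=tank@backup-`date +%Y%m%d`
  zfs snapshot -r "$SNAP"
  zfs send -R "$SNAP" | ssh backuphost zfs receive -d -F backuppool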

There are few easy answers when it comes to performance.

And the actual answer to most questions is "it depends".


Mike






___
OpenIndiana-discuss mailing list
OpenIndiana-discuss@openindiana.org
http://openindiana.org/mailman/listinfo/openindiana-discuss


[OpenIndiana-discuss] Zfs stability

2012-10-12 Thread Roel_D
Being on the list and reading all the ZFS problem and question posts makes me a
little scared.

I have 4 Sun X4140 servers running in the field for 4 years now, and they all
have ZFS mirrors (2x HD). They are running Solaris 10, and 1 is running Solaris
11. I also have some other servers running OI, also with ZFS.

The Solaris servers N E V E R had any ZFS scrub. I didn't even know such a
thing existed ;-)

Since it has all worked flawlessly for years, I am a huge Solaris/OI fan.

But how stable are things nowadays? Does one need to do a scrub? Or a resilver?

How come I see so much ZFS trouble?



Kind regards, 

The out-side
___
OpenIndiana-discuss mailing list
OpenIndiana-discuss@openindiana.org
http://openindiana.org/mailman/listinfo/openindiana-discuss


Re: [OpenIndiana-discuss] Zfs stability

2012-10-12 Thread Doug Hughes
Yes, you should do a scrub, and no, there isn't very much risk in this. It
will scan your disks for bits that have gone stale or the like. You should
do it. We do a scrub once per week.



___
OpenIndiana-discuss mailing list
OpenIndiana-discuss@openindiana.org
http://openindiana.org/mailman/listinfo/openindiana-discuss


Re: [OpenIndiana-discuss] Zfs stability

2012-10-12 Thread Robbie Crash
Also, the reason there's so much talk about broken ZFS is because nobody
complains when their pools aren't broken.





-- 
Seconds to the drop, but it seems like hours.

http://www.openmedia.ca
https://robbiecrash.me
___
OpenIndiana-discuss mailing list
OpenIndiana-discuss@openindiana.org
http://openindiana.org/mailman/listinfo/openindiana-discuss


Re: [OpenIndiana-discuss] Zfs stability Scrubs

2012-10-12 Thread Michael Stapleton
It is easy to understand that ZFS scrubs can be useful. But how often do
we scrub, or do the equivalent, on any other file system? UFS? VXFS?
NTFS? ...
ZFS has scrubs as a feature, but is it a need? I do not think so. Other
file systems accept the risk, mostly because they cannot really do
anything if there were errors.
It does no harm to do periodic scrubs, but I would not recommend doing
them often or even at all if scrubs get in the way of production.
What is the real risk of not doing scrubs? 

Risk can not be eliminated, and we have to accept some risk.

For example, data deduplication uses digests on data to detect
duplication. Most dedup systems assume that if the digest is the same
for two pieces of data, then the data must be the same.
This assumption is not actually true. Two differing pieces of data can
have the same digest, but the chance of this happening is so low that
the risk is accepted.


I'm only writing this because I get the feeling some people think scrubs
are a need. Maybe people associate doing scrubs with something like
doing NTFS defrags?

Just my 2 cents!


Mike






___
OpenIndiana-discuss mailing list
OpenIndiana-discuss@openindiana.org
http://openindiana.org/mailman/listinfo/openindiana-discuss


Re: [OpenIndiana-discuss] Zfs stability

2012-10-12 Thread Dan Swartzendruber
+1.  What the previous poster is missing is this: it's entirely possible for
sectors on a disk to go bad, and if you haven't read them in a while, you
might not notice.  Then, say, the other disk (in a mirror, for example) dies
entirely.  You are dismayed to realize your redundant disk configuration has
lost data for you anyway.



___
OpenIndiana-discuss mailing list
OpenIndiana-discuss@openindiana.org
http://openindiana.org/mailman/listinfo/openindiana-discuss


Re: [OpenIndiana-discuss] Zfs stability Scrubs

2012-10-12 Thread Roel_D
Maybe people associate doing scrubs with something like
doing NTFS defrags?

Well, having read all the posts, and because I installed napp-it (which has a
scrub scheduler) on my home server, I was almost at the point of assuming such.

I recently bought a second-hand X4140 just because it performs so well.

Until recently I had MySQL Cluster running on an old HP G3 with Solaris 10. It
served a lot of data, with heavy writes every 15 minutes. The whole cluster ran
in zones on ZFS storage. It worked like a charm, without scrubs, for 3 years
on 4 SCSI 73 GB drives. I had to stop it because I moved everything to an
X4140.

ZFS has saved me so much trouble and is so fast that I am afraid new OI users
will get scared when they read all the bad news.



Kind regards, 

The out-side

On 12 Oct 2012, at 23:07, Michael Stapleton
michael.staple...@techsologic.com wrote:

 Maybe people associate doing scrubs with something like
 doing NTFS defrags?

___
OpenIndiana-discuss mailing list
OpenIndiana-discuss@openindiana.org
http://openindiana.org/mailman/listinfo/openindiana-discuss


Re: [OpenIndiana-discuss] Zfs stability Scrubs

2012-10-12 Thread Reginald Beardsley


--- On Fri, 10/12/12, Michael Stapleton michael.staple...@techsologic.com 
wrote:


 
 I'm only writing this because I get the feeling some people
 think scrubs
 are a need. Maybe people associate doing scrubs with
 something like
 doing NTFS defrags?


I normally do scrubs when I think about it, which has meant a long time between
scrubs in most cases.  I got more interested in doing them regularly when I
encountered SMART errors for excessive sector remapping after a reboot.  I
don't know if a scrub would detect that or not.
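
(A scrub only exercises ZFS checksums; sector remapping happens below ZFS, so a
pool can look clean while a drive quietly runs out of spare sectors. If
smartmontools is installed, something along these lines shows the counters -
the device path and the -d option depend on your controller, so treat this as
a sketch:

  smartctl -a -d sat,12 /dev/rdsk/c5t0d0s0 | \
      egrep "Reallocated|Pending|Uncorrect"

Watching those attributes trend upward between scrubs is usually the earlier
warning.)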

The admin skills on this list vary from very high to very low. High-skill
admins take any threat to system integrity seriously and try to reduce it.

At a job I worked many years ago, the admins were replacing several failed 
disks every week in the RAID arrays.  If you have lots of disks, you will have 
lots of failures.  There are a lot of companies w/ many petabytes of data on 
disk.  Even w/ 4 TB drives, that's still a lot of drives.  And you're always 
stuck running disks which are several years old and failing more often.

Have Fun!
Reg

___
OpenIndiana-discuss mailing list
OpenIndiana-discuss@openindiana.org
http://openindiana.org/mailman/listinfo/openindiana-discuss


Re: [OpenIndiana-discuss] Zfs stability Scrubs

2012-10-12 Thread Michael Stapleton
The problem is when people are overly paranoid just because the feature
exists, and end up causing problems by running scrubs when they should not,
because they feel they need to. Skilled admins also understand SLAs.


Mike




___
OpenIndiana-discuss mailing list
OpenIndiana-discuss@openindiana.org
http://openindiana.org/mailman/listinfo/openindiana-discuss


Re: [OpenIndiana-discuss] Zfs stability Scrubs

2012-10-12 Thread Jan Owoc
On Fri, Oct 12, 2012 at 3:07 PM, Michael Stapleton
michael.staple...@techsologic.com wrote:
 It is easy to understand that ZFS scrubs can be useful. But how often do
 we scrub, or do the equivalent, on any other file system? UFS? VXFS?
 NTFS? ...

If your data has checksums, it is standard practice to periodically
verify the checksums and correct where necessary. ECC memory does a
scrub every once in a while :-). The filesystems you named don't have
checksums, so scrubbing would do them no good.


 For example, data deduplication uses digests on data to detect
 duplication. Most dedup systems assume that if the digest is the same
 for two pieces of data, then the data must be the same.
 This assumption is not actually true. Two differing pieces of data can
 have the same digest, but the chance of this happening is so low that
 the risk is accepted.

"So low" is an understatement. Have you ever taken 2 to the power of
256? (ZFS currently requires sha256 checksums if you want to do
dedup.) The chance of a block being different but having a duplicate
sha256 is 1 in
115792089237316195423570985008687907853269984665640564039457584007913129639936.

Just for fun, let's see what those odds give you. Say you were writing
all human information ever produced (2.56e+20 bytes) [1] on one ZFS
filesystem (with 1-byte blocksize). Let's say you were writing this
much data every second for the age of the known universe (4.3e+17 s).
Your odds of having one false positive with this amount of data are 1
in 1e+39.

[1] 
http://www.wired.co.uk/news/archive/2011-02/14/256-exabytes-of-human-information
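
(If you want to reproduce the 78-digit figure above, bc will print it for you;
note that bc wraps long numbers across lines by default:

  echo "2^256" | bc
)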


 I'm only writing this because I get the feeling some people think scrubs
 are a need. Maybe people associate doing scrubs with something like
 doing NTFS defrags?

All scrubbing does is put stress on the drives and verify that the data can
still be read from them. If a hard drive ever fails on you and you
need to replace it (how often does that happen?), then you know that just
last week all the other hard drives were able to read their data
under stress, so they are less likely to fail on you.
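
(Strictly speaking, a scrub also verifies every block's checksum against its
parent and repairs from redundancy where it can, not just "can the sector be
read". The quick ways to see the outcome, with "tank" as a placeholder pool
name:

  zpool status -x        # prints "all pools are healthy" if nothing needs attention
  zpool status -v tank   # per-vdev READ/WRITE/CKSUM counters, scrub results,
                         # and any files with permanent errors
)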


Jan

___
OpenIndiana-discuss mailing list
OpenIndiana-discuss@openindiana.org
http://openindiana.org/mailman/listinfo/openindiana-discuss


Re: [OpenIndiana-discuss] Zfs stability Scrubs

2012-10-12 Thread Jerry Kemp
But that's the deal with mailing lists everywhere, be they OI or whatever
else.

Be it some problem someone is having, some way to enhance a product,
or getting it to do something it was never intended to do.

Support mailing lists and forums wouldn't exist if people didn't have
problems they needed help overcoming.

Jerry


On 10/12/12 04:34 PM, Roel_D wrote:

 ZFS has saved me so much trouble and is so fast that I am afraid new OI
 users will get scared when they read all the bad news.
 
 

___
OpenIndiana-discuss mailing list
OpenIndiana-discuss@openindiana.org
http://openindiana.org/mailman/listinfo/openindiana-discuss


Re: [OpenIndiana-discuss] Zfs stability

2012-10-12 Thread James Carlson
On 10/12/12 16:45, Robbie Crash wrote:
 Also, the reason there's so much talk about broken ZFS is because nobody
 complains when their pools aren't broken.
 
 On Fri, Oct 12, 2012 at 3:55 PM, Roel_D openindi...@out-side.nl wrote:
 How come i see so much ZFS trouble?

I suspect there's more to it than that.  ZFS, unlike most file systems,
has a built-in checksum feature that checks block integrity.  If you
have problems on the drive, in the controller, in the DMA mechanism, or
in memory itself, you're liable to trip over ZFS checksum errors, which
ZFS will then try hard to repair from a mirror or RAID-Z reconstruction.

Because most other file systems don't have this capability, they just
don't notice.  Unless the drive itself flags the data as bad with an
uncorrectable low-level read error, the OS happily believes almost any
garbage it happens to read from the disk.

Thus, I believe that at least some of the people complaining about ZFS
stability problems here are actually getting a wonderful
canary-in-a-coal-mine warning out of ZFS about the reliability of the
hardware they own.  Whether those folks take that warning to heart or
simply wish it away by changing OSes, well, I guess that's up to them.

-- 
James Carlson 42.703N 71.076W carls...@workingcode.com

___
OpenIndiana-discuss mailing list
OpenIndiana-discuss@openindiana.org
http://openindiana.org/mailman/listinfo/openindiana-discuss


Re: [OpenIndiana-discuss] Zfs stability Scrubs

2012-10-12 Thread Doug Hughes
So, a lot of people have already answered this in various ways.
I'm going to provide a little bit of direct answer and add focus (and
emphasis) to some of those other answers.


On 10/12/2012 5:07 PM, Michael Stapleton wrote:

It is easy to understand that ZFS scrubs can be useful, but how often do
we scrub, or do the equivalent, on any other file system? UFS? VXFS?
NTFS? ...
ZFS has scrubs as a feature, but is it a need? I do not think so. Other
file systems accept the risk, mostly because they cannot really do
anything if there are errors.
That's right. They cannot do anything. Why is that a good thing? If you
have corruption on your filesystem because a block or even a single
bit went wrong, wouldn't you want to know? Wouldn't you want to fix it?
What if a number in an important financial document changed? It seems
unlikely, but we've discovered at least 5 instances of spontaneous disk
data corruption over the course of a couple of years. ZFS corrected them
transparently. No data lost: automatic, clean, and transparent. The
more data we create, the more that possibility of spontaneous data
corruption becomes a reality.

It does no harm to do periodic scrubs, but I would not recommend doing
them often or even at all if scrubs get in the way of production.
What is the real risk of not doing scrubs?
Data changing without you knowing it. Maybe this doesn't matter for an
image file (though a JPEG could end up looking nasty or destroyed, and
an MPEG-4 could be permanently damaged, but in a TIFF or other uncompressed
format, you'd probably never know).




Risk can not be eliminated, and we have to accept some risk.

For example, data deduplication uses digests on data to detect
duplication. Most dedup systems assume that if the digest is the same
for two pieces of data, then the data must be the same.
This assumption is not actually true. Two differing pieces of data can
have the same digest, but the chance of this happening is so low that
the risk is accepted.
But the risk of data being flipped once you have TBs of data is way
above 0%. You can also do your own erasure coding if you like. That
would be one way to achieve the same effect outside of ZFS.
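
(A detection-only cousin of that idea, for anyone curious, is simply
recording checksums and re-verifying them later; real erasure coding,
par2 for example, would also let you repair what you find. A minimal,
hypothetical sketch in Python, not tied to ZFS or any particular
filesystem:)

#!/usr/bin/env python3
# Poor-man's "scrub" for any filesystem: record sha256 digests for a
# tree, then re-run later to see whether any file's bits have changed.
# Detection only; unlike ZFS or erasure coding, it cannot repair.
import hashlib, json, os, sys

def digest(path, bufsize=1 << 20):
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(bufsize), b""):
            h.update(chunk)
    return h.hexdigest()

def walk(root):
    # Map each regular file under root to its digest.
    table = {}
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path):        # skip sockets, broken links, etc.
                table[path] = digest(path)
    return table

if __name__ == "__main__":
    # usage: checksums.py record /some/dir > manifest.json
    #        checksums.py verify < manifest.json
    if sys.argv[1] == "record":
        json.dump(walk(sys.argv[2]), sys.stdout, indent=1)
    else:
        manifest = json.load(sys.stdin)
        for path, old in manifest.items():
            if not os.path.isfile(path) or digest(path) != old:
                print("MISMATCH:", path)

Run it once in record mode to write a manifest, then later in verify mode
against that manifest; any line it prints is a file whose contents changed
(or that disappeared) since the manifest was written.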



I'm only writing this because I get the feeling some people think scrubs
are a need. Maybe people associate doing scrubs with something like
doing NTFS defrags?


An NTFS defrag only helps with performance; a scrub helps with
integrity. Totally different things.



___
OpenIndiana-discuss mailing list
OpenIndiana-discuss@openindiana.org
http://openindiana.org/mailman/listinfo/openindiana-discuss


Re: [OpenIndiana-discuss] Zfs stability Scrubs

2012-10-12 Thread Michael Stapleton
I'm not a mathematician, but can anyone calculate the chance of the same
8K data block on both submirrors going bad on terabyte drives, before
the data is ever read and fixed automatically during normal read
operations?
And if you are not doing mirroring, you have already accepted a much
larger margin of error for the sake of $.
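
(Back-of-envelope only, with made-up but typical-looking inputs: an
unrecoverable-read-error rate of 1 in 10^14 bits, a 1 TB drive, 8 KiB
blocks, and independent failures. None of that is exactly true of real
drives, so treat it as a feel for the magnitude, not an answer.)

# Rough feel for the odds asked about above; all inputs are assumptions.
BER   = 1e-14          # unrecoverable errors per bit read (vendor-style spec)
DRIVE = 1e12           # 1 TB drive, in bytes
BLOCK = 8 * 1024       # 8 KiB block

p_block  = BLOCK * 8 * BER      # chance a given block is unreadable on one disk
n_blocks = DRIVE / BLOCK        # block positions per drive

# Expected number of positions that are bad on BOTH submirrors at once,
# assuming the two disks fail independently.
both_bad = n_blocks * p_block ** 2
print("per-block error probability: %.2e" % p_block)
print("expected doubly-bad blocks:  %.2e" % both_bad)

With those inputs the expected count comes out around 5e-11, i.e. roughly
one chance in twenty billion that even a single block position is latently
bad on both sides of the mirror. Correlated failures (same batch, same
enclosure, same vibration) are what make the real-world picture worse than
this toy model.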

The VAST majority of data centers are not storing data in storage that
does checksums to verify data; that is just the reality. Regular backups
and site replication rule.

I am not saying scrubs are a bad thing, just that they are being
overemphasized, and some people who do not really understand them are
getting the wrong impression that doing scrubs very often will somehow
make them a lot safer.
Scrubs help. But a lot of people who are worrying about scrubs are not
even doing proper backups or regular DR testing.


Mike

On Fri, 2012-10-12 at 22:36 -0400, Doug Hughes wrote:

 So, a lot of people have already answered this in various ways.
 I'm going to provide a little bit of direct answer and add focus (and
 emphasis) to some of those other answers.
 
 On 10/12/2012 5:07 PM, Michael Stapleton wrote:
  It is easy to understand that ZFS scrubs can be useful, but how often do
  we scrub, or do the equivalent, on any other file system? UFS? VXFS?
  NTFS? ...
  ZFS has scrubs as a feature, but is it a need? I do not think so. Other
  file systems accept the risk, mostly because they cannot really do
  anything if there are errors.
 That's right. They cannot do anything. Why is that a good thing? If you
 have corruption on your filesystem because a block or even a single
 bit went wrong, wouldn't you want to know? Wouldn't you want to fix it?
 What if a number in an important financial document changed? It seems
 unlikely, but we've discovered at least 5 instances of spontaneous disk
 data corruption over the course of a couple of years. ZFS corrected them
 transparently. No data lost: automatic, clean, and transparent. The
 more data we create, the more that possibility of spontaneous data
 corruption becomes a reality.
  It does no harm to do periodic scrubs, but I would not recommend doing
  them often or even at all if scrubs get in the way of production.
  What is the real risk of not doing scrubs?
 Data changing without you knowing it. Maybe this doesn't matter for an
 image file (though a JPEG could end up looking nasty or destroyed, and
 an MPEG-4 could be permanently damaged, but in a TIFF or other uncompressed
 format, you'd probably never know).
 
 
  Risk can not be eliminated, and we have to accept some risk.
 
  For example, data deduplication uses digests on data to detect
  duplication. Most dedup systems assume that if the digest is the same
  for two pieces of data, then the data must be the same.
  This assumption is not actually true. Two differing pieces of data can
  have the same digest, but the chance of this happening is so low that
  the risk is accepted.
 But the risk of data being flipped once you have TBs of data is way
 above 0%. You can also do your own erasure coding if you like. That
 would be one way to achieve the same effect outside of ZFS.
 
 
  I'm only writing this because I get the feeling some people think scrubs
  are a need. Maybe people associate doing scrubs with something like
  doing NTFS defrags?
 
 
 An NTFS defrag only helps with performance; a scrub helps with
 integrity. Totally different things.
 
 
 ___
 OpenIndiana-discuss mailing list
 OpenIndiana-discuss@openindiana.org
 http://openindiana.org/mailman/listinfo/openindiana-discuss


___
OpenIndiana-discuss mailing list
OpenIndiana-discuss@openindiana.org
http://openindiana.org/mailman/listinfo/openindiana-discuss