Re: [zfs-discuss] What is your data error rate?
On 01/24/12 17:06, Gregg Wonderly wrote:
> What I've noticed is that when I have my drives in a situation of small
> airflow, and hence hotter operating temperatures, my disks will drop
> quite quickly.

While I *believe* the same thing, and have thus over-provisioned airflow in my cases (for both drives and memory), there are studies which failed to find a strong correlation between drive temperature and failure rates:

http://research.google.com/archive/disk_failures.pdf
http://www.usenix.org/events/fast07/tech/schroeder.html
Re: [zfs-discuss] What is your data error rate?
What I've noticed is that when I have my drives in a situation of small airflow, and hence hotter operating temperatures, my disks will drop quite quickly. I've now moved my systems into large cases with large amounts of airflow, using the IcyDock brand of removable drive enclosures:

http://www.newegg.com/Product/Product.aspx?Item=N82E16817994097
http://www.newegg.com/Product/Product.aspx?Item=N82E16817994113

I use the SASUC8I SATA/SAS controller to access 8 drives:

http://www.newegg.com/Product/Product.aspx?Item=N82E16816117157

I put it in PCI-e x16 slots on "graphics heavy" motherboards, which might have as many as 4x PCI-e x16 slots. I am replacing an old motherboard with this one:

http://www.tigerdirect.com/applications/SearchTools/item-details.asp?EdpNo=1124780

The case that I found to be a good match for my needs is the Raven:

http://www.newegg.com/Product/Product.aspx?Item=N82E16811163180

It has enough slots (7) to put 2x 3-in-2 and 1x 4-in-3 IcyDock bays in, providing 10 drives in hot-swap bays.

I really think that the big issue is that you must move the air. The drives really need to stay cool, or else you will see degraded performance and/or data loss much more often.

Gregg Wonderly

On 1/24/2012 9:50 AM, Stefan Ring wrote:
> After having read this mailing list for a little while, I get the
> impression that there are at least some people who regularly experience
> on-disk corruption that ZFS should be able to report and handle. I’ve
> been running a raidz1 on three 1TB consumer disks for approx. 2 years
> now (about 90% full), and I scrub the pool every 3-4 weeks and have
> never had a single error. From the oft-quoted 10^14 error rate that
> consumer disks are rated at, I should have seen an error by now -- the
> scrubbing process is not the only activity on the disks, after all, and
> the data transfer volume from that alone clocks in at almost exactly
> 10^14 by now. Not that I’m worried, of course, but it comes as a slight
> surprise to me. Or does the 10^14 rating just reflect the strength of
> the on-disk ECC algorithm?
Re: [zfs-discuss] zfs send recv without uncompressing data stream
On Tue, January 24, 2012 13:37, Jim Klimov wrote:
> One more rationale - compatibility, including future-proofing somewhat
> (the zfs-send format explicitly does not guarantee that it won't change
> incompatibly). I mean transfer of data between systems that do not
> implement the same set of compression algorithms in ZFS.

The format of 'zfs send' has now been committed:

> The format of the stream is committed. You will be able to receive your
> streams on future versions of ZFS.

http://docs.oracle.com/cd/E19253-01/816-5166/zfs-1m/index.html

This was fixed in some update of Solaris 10, though I can't find the exact one.

http://hub.opensolaris.org/bin/view/Community+Group+on/2008042301
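As a rough illustration of why that commitment matters (dataset and file names here are invented): a stream archived to a file today should remain receivable by a later ZFS version:

  # archive a send stream to a file today...
  zfs send tank/fs@2012-01-24 > /backup/fs-2012-01-24.zsend
  # ...and restore it later, possibly on a newer ZFS
  zfs receive tank/fs-restored < /backup/fs-2012-01-24.zsend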
Re: [zfs-discuss] zfs send recv without uncompressing data stream
2012-01-24 19:52, Jim Klimov wrote:
> 2012-01-24 13:05, Mickaël CANÉVET wrote:
>> Hi,
>>
>> Unless I misunderstood something, zfs send of a volume that has
>> compression activated uncompresses it. So if I do a zfs send | zfs
>> receive from a compressed volume to a compressed volume, my data are
>> uncompressed and compressed again. Right?
>>
>> Is there a more effective way to do it (without decompression and
>> recompression)?
>
> Rationale being that the two systems might demand different compression
> (i.e. "lzjb" or "none" on the original system and "gzip-9" on the
> backup one).

One more rationale - compatibility, including future-proofing somewhat (the zfs-send format explicitly does not guarantee that it won't change incompatibly). I mean transfer of data between systems that do not implement the same set of compression algorithms in ZFS.

Say, as a developer I find a way to use bzip2 or 7zip to compress my local system's blocks (just like gzip appeared recently, after there were only lzjb and none). If I zfs-send the compressed blocks as they are, another system won't be able to interpret them unless it supports the same algorithm and format. And since zfs-send can be used via files (i.e. distribution media with flar-like archives), there is no way for a dialog between zfs-sender and zfs-recipient to agree on a common format, besides using a fixed predefined one - uncompressed.

Using external programs to wrap that in the Unix way gets out of ZFS's scope and can be arranged by other software on the OSes.

HTH,
//Jim
Re: [zfs-discuss] zfs send recv without uncompressing data stream
On Jan 24, 2012, at 7:52 AM, Jim Klimov wrote:
> 2012-01-24 13:05, Mickaël CANÉVET wrote:
>> Hi,
>>
>> Unless I misunderstood something, zfs send of a volume that has
>> compression activated uncompresses it. So if I do a zfs send | zfs
>> receive from a compressed volume to a compressed volume, my data are
>> uncompressed and compressed again. Right?

correct

>> Is there a more effective way to do it (without decompression and
>> recompression)?
>
> While I can not confirm or deny this statement, it was my
> impression as well. Rationale being that the two systems
> might demand different compression (i.e. "lzjb" or "none"
> on the original system and "gzip-9" on the backup one).
> Just like you probably have different VDEV layouts, etc.
> Or perhaps even different encryption or dedup settings.

that "feature" falls out of the implementation.

> Compression, like many other components, lives on the
> layer "under" logical storage (userdata blocks), and
> gets applied to newly written blocks only (i.e. your
> datasets can have a mix of different compression levels
> for different files or even blocks within a file, if
> you switched the methods during dataset lifetime).
>
> Actually I would not be surprised if zfs-send userdata
> stream is even above the block level (i.e. it would seem
> normal to me if many small userdata blocks of original
> pool might become one big block on the recipient).
>
> So while some optimizations are possible, I think they
> would violate layering quite much.

data in the ARC is uncompressed. compression/decompression occurs in the ZIO pipeline layer below the DSL.

> But, for example, it might make sense for zfs-send to
> include the original compression algorithm information
> into the sent stream and send the compressed data (less
> network traffic or intermediate storage requirement,
> to say the least - at zero price of recompression to
> something perhaps more efficient), and if the recipient
> dataset's algorithm differs - unpack and recompress it
> on the receiving side.
>
> If that's not done already :)

the compression parameter value is sent, but as you mentioned above, blocks in a snapshot can be compressed with different algorithms, so you only actually get the last setting at the time of the snapshot.

> So far my over-the-net zfs sends are piped into gzip
> or pigz, ssh and gunzip, and that often speeds up the
> overall transfer. Probably can be done with less overhead
> by "ssh -C" for implementations that have it.

the UNIX philosophy is in play here :-) Sending the data uncompressed to stdout allows you to pipe it into various transport or transform programs.

 -- richard

--
ZFS Performance and Training
richard.ell...@richardelling.com
+1-760-896-4422
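To make the piping concrete, a sketch of the kind of transfer described above (the hostname "backuphost" and the dataset names are invented for illustration):

  # compress with pigz on the sender, decompress on the receiver
  zfs send tank/data@backup1 | pigz | \
      ssh backuphost "gunzip | zfs receive backup/data"

  # or let ssh compress the channel instead
  zfs send tank/data@backup1 | ssh -C backuphost "zfs receive backup/data"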
Re: [zfs-discuss] What is your data error rate?
On Tue, 24 Jan 2012, Jim Klimov wrote:
>> Or does the 10^14 rating just reflect the strength of the on-disk ECC
>> algorithm?
>
> I am not sure how much the algorithms differ between "enterprise" and
> "consumer" disks, while the UBER is said to differ about 100 times. It
> might also have to do with quality of materials (better steel in ball
> bearings, etc.) as well as better firmware/processors which optimize
> mechanical workloads and reduce mechanical wear. Maybe so, at least...

In addition to the above, an important factor is that enterprise disks with 10^16 ratings also offer considerably less storage density. Instead of 3TB of storage per drive, you get 400GB of storage per drive. So-called "nearline" enterprise storage drives fit in somewhere in the middle, with higher storage densities but also higher error rates.

Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
Re: [zfs-discuss] What is your data error rate?
2012-01-24 19:50, Stefan Ring wrote:
> After having read this mailing list for a little while, I get the
> impression that there are at least some people who regularly experience
> on-disk corruption that ZFS should be able to report and handle. I’ve
> been running a raidz1 on three 1TB consumer disks for approx. 2 years
> now (about 90% full), and I scrub the pool every 3-4 weeks and have
> never had a single error. From the oft-quoted 10^14 error rate that
> consumer disks are rated at, I should have seen an error by now -- the
> scrubbing process is not the only activity on the disks, after all, and
> the data transfer volume from that alone clocks in at almost exactly
> 10^14 by now. Not that I’m worried, of course, but it comes as a slight
> surprise to me. Or does the 10^14 rating just reflect the strength of
> the on-disk ECC algorithm?

I maintained several dozen storage servers for about 12 years, and I've seen quite a few drive deaths as well as automatically triggered RAID array rebuilds. But usually these were "infant deaths" in the first year, and those drives which passed the age test often give no noticeable problems for the next decade. Several 2-4 disk systems have worked as OpenSolaris SXCE servers with ZFS pools for root and data for years now, and also show no problems. However, most of these are branded systems and disks from Sun. I think we've only had one or two drives die, but happened to have cold spares due to over-ordering ;)

I do have a suspiciously high error rate on my home NAS, which was thrown together from whatever pieces I had at home at the time I left for an overseas trip. The box has been nearly unmaintained since then, and can suffer from physical causes known and unknown, such as the SATA cabling (varied and quite possibly bad), non-ECC memory, dust and overheating, etc. It is also possible that aging components such as the CPU and motherboard, which have about 5 years of active lifetime (including an overclocked past), can contribute to error rates.

The old 80GB root drive has had some bad sectors (READ errors in scrub and data access), and rpool was recreated with copies=2 a few times now, thanks to a LiveUSB, but the main data pool had no substantial errors until the CKSUM errors reported this winter (metadata:0x0 and then a dozen in-file checksum mismatches). Since one of the drives got itself lost soon after, and only reappeared after all the cables were replugged, I still tend to blame this on SATA cabling as the most probable root cause. I do not have an up-to-date SMART error report, and the box is not accessible at the moment, so I can't comment on lower-level errors in the main pool drives. They were new at the time I put the box together (almost a year ago now).

However, so far, much more than the discovered on-disk CKSUM errors (however they appeared), I am bothered by the tendency of this box to lock up and/or reboot after somewhat repeatable actions (such as destroying large snapshots of deduped datasets, etc.). I tend to write this off as shortcomings of the OS (i.e. memory hunger and lockup in scanrate hell as the most frequent cause), and this really bothers me more now - causing lots of downtime until some friend comes to that apartment to reboot the box.

> Or does the 10^14 rating just reflect the strength
> of the on-disk ECC algorithm?

I am not sure how much the algorithms differ between "enterprise" and "consumer" disks, while the UBER is said to differ about 100 times. It might also have to do with quality of materials (better steel in ball bearings, etc.) as well as better firmware/processors which optimize mechanical workloads and reduce mechanical wear. Maybe so, at least...

Finally, this is statistics. It does not "guarantee" that for some 90Tbits of transferred data you will certainly see an error (and just one, for that matter). Those drives which died young hopefully also count in the overall stats, moving the bar a bit higher for their better-made brethren.

Also, disk UBER regards media failures and the ability of the disk's cache, firmware and ECC to deal with them. After the disk sends the "correct" sector on the wire, many things can happen: noise in bad connectors, electromagnetic interference from all the motors in your computer onto the data cable, the ability or lack thereof of the data protocol (IDE, ATA, SCSI) to detect and/or recover from such incoming random bits between disk and HBA, errors in HBA chips and code, noise in old rusty PCI* connector slots, bitflips in non-ECC RAM or overheated CPUs, power surges from the PSU... There is a lot of stuff that can break :)

//Jim Klimov
Re: [zfs-discuss] zfs send recv without uncompressing data stream
2012-01-24 13:05, Mickaël CANÉVET wrote:
> Hi,
>
> Unless I misunderstood something, zfs send of a volume that has
> compression activated uncompresses it. So if I do a zfs send | zfs
> receive from a compressed volume to a compressed volume, my data are
> uncompressed and compressed again. Right?
>
> Is there a more effective way to do it (without decompression and
> recompression)?

While I can not confirm or deny this statement, it was my impression as well. Rationale being that the two systems might demand different compression (i.e. "lzjb" or "none" on the original system and "gzip-9" on the backup one). Just like you probably have different VDEV layouts, etc. Or perhaps even different encryption or dedup settings.

Compression, like many other components, lives on the layer "under" logical storage (userdata blocks), and gets applied to newly written blocks only (i.e. your datasets can have a mix of different compression levels for different files, or even blocks within a file, if you switched methods during the dataset's lifetime).

Actually, I would not be surprised if the zfs-send userdata stream is even above the block level (i.e. it would seem normal to me if many small userdata blocks of the original pool became one big block on the recipient).

So while some optimizations are possible, I think they would violate layering quite much.

But, for example, it might make sense for zfs-send to include the original compression algorithm information in the sent stream and send the compressed data (less network traffic or intermediate storage requirement, to say the least - at zero price of recompression to something perhaps more efficient), and if the recipient dataset's algorithm differs - unpack and recompress it on the receiving side. If that's not done already :)

So far my over-the-net zfs sends are piped into gzip or pigz, ssh and gunzip, and that often speeds up the overall transfer. Probably can be done with less overhead by "ssh -C" for implementations that have it.

//Jim
[zfs-discuss] What is your data error rate?
After having read this mailing list for a little while, I get the impression that there are at least some people who regularly experience on-disk corruption that ZFS should be able to report and handle.

I’ve been running a raidz1 on three 1TB consumer disks for approx. 2 years now (about 90% full), and I scrub the pool every 3-4 weeks and have never had a single error. From the oft-quoted 10^14 error rate that consumer disks are rated at, I should have seen an error by now -- the scrubbing process is not the only activity on the disks, after all, and the data transfer volume from that alone clocks in at almost exactly 10^14 by now. Not that I’m worried, of course, but it comes as a slight surprise to me.

Or does the 10^14 rating just reflect the strength of the on-disk ECC algorithm?
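A quick back-of-the-envelope tally of that expectation (assuming the rating means one unrecoverable error per 10^14 bits read, as vendor datasheets usually specify it, and that a scrub reads all allocated data):

  10^14 bits = 1.25 x 10^13 bytes  ~= 12.5 TB read per expected error

  scrub volume: 3 disks x 1 TB x ~90% full   ~= 2.7 TB per scrub
  2 years at one scrub every 3-4 weeks       ~= 26-35 scrubs
  total scrub reads: ~70-95 TB               ~= 6-8 x 10^14 bits

On the rated figure alone, then, several errors would not have been surprising.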
Re: [zfs-discuss] unable to access the zpool after issue a reboot
Sudheer,

I don't know what the module name is for dynapath, but you may want to include a forceload statement in /etc/system. This will cause the driver to load during initialization. Usually all the modules in the stack should be included, such as the sd driver.

Example:

forceload: drv/sd
forceload: drv/

HTH,
Dave

From: sureshkumar
To: zfs-discuss@opensolaris.org
Sent: Tuesday, January 24, 2012 6:03 AM
Subject: [zfs-discuss] unable to access the zpool after issue a reboot

Hi all,

I am new to Solaris & I am facing an issue with the dynapath [multipath s/w] for Solaris10u10 x86.

I am facing an issue with the zpool. What my problem is: I am unable to access the zpool after issuing a reboot. I am pasting the zpool status below.

==
bash-3.2# zpool status
  pool: test
 state: UNAVAIL
status: One or more devices could not be opened. There are insufficient
        replicas for the pool to continue functioning.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-3C
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        test        UNAVAIL      0     0     0  insufficient replicas
=

But all my devices are online & I am able to access them. When I export & import the zpool, the zpool comes back to the available state. I am not getting what's the problem with the reboot.

Any suggestions regarding this would be very helpful.

Thanks & Regards,
Sudheer.
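Following up on the /etc/system suggestion above, a sketch of what the complete stanza might look like (the "dynapath" module name below is a guess for illustration - confirm the real driver name, e.g. with modinfo, before using it):

  * /etc/system: force these drivers to load during early boot
  forceload: drv/sd
  forceload: drv/dynapath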
[zfs-discuss] zfs send recv without uncompressing data stream
Hi,

Unless I misunderstood something, zfs send of a volume that has compression activated uncompresses it. So if I do a zfs send | zfs receive from a compressed volume to a compressed volume, my data are uncompressed and compressed again. Right?

Is there a more effective way to do it (without decompression and recompression)?

Cheers,
Mickaël
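For concreteness, the round trip being asked about (dataset names here are invented):

  # both datasets have compression enabled, e.g. compression=lzjb
  zfs send tank/data@snap | zfs receive backup/data

As the replies in this thread confirm, blocks are decompressed when read on the sending side, travel uncompressed in the stream, and are recompressed according to the receiving dataset's compression property.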
Re: [zfs-discuss] unable to access the zpool after issue a reboot
On Tue, 24 Jan 2012, sureshkumar wrote:
>         NAME        STATE     READ WRITE CKSUM
>         test        UNAVAIL      0     0     0  insufficient replicas
> =
>
> But all my devices are online & I am able to access them. When I export
> & import the zpool, the zpool comes back to the available state. I am
> not getting what's the problem with the reboot.

The LUN on which this pool is based was not available within a reasonable time of when zfs tried to import it. It was available later. What storage technology is this LUN based on (local SAS/SATA, iSCSI, FC)?

Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
Re: [zfs-discuss] unable to access the zpool after issue a reboot
On Tue, Jan 24, 2012 at 05:33:39PM +0530, sureshkumar wrote:
> I am new to Solaris & I am facing an issue with the dynapath [multipath
> s/w] for Solaris10u10 x86.
>
> I am facing an issue with the zpool.
>
> What my problem is: I am unable to access the zpool after issuing a reboot.

I've seen this happen when the zpool was built on an iSCSI LUN. At reboot time, the ZFS import was done before the iSCSI driver was able to connect to its target. After the system was up, an export and import was successful. The solution was to add a new service that imported the zpool later during the reboot.

--
-Gary Mills--refurb--Winnipeg, Manitoba, Canada-
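A minimal sketch of that workaround as a legacy init script (the pool name "test" is taken from this thread; the script path and runlevel are illustrative - an SMF service with an explicit dependency on the iSCSI initiator would be the cleaner variant):

  #!/sbin/sh
  # /etc/rc3.d/S99zpoolimport: import the pool late in boot,
  # once the iSCSI initiator has had time to reach its target
  if ! /usr/sbin/zpool list test > /dev/null 2>&1; then
      /usr/sbin/zpool import test
  fi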
Re: [zfs-discuss] unable to access the zpool after issue a reboot
how did you issue " reboot", try shutdown -i6 -y -g0 Sent from my iPad On Jan 24, 2012, at 7:03, sureshkumar wrote: > Hi all, > > > I am new to Solaris & I am facing an issue with the dynapath [multipath s/w] > for Solaris10u10 x86 . > > I am facing an issue with the zpool. > > Whats my problem is unable to access the zpool after issue a reboot. > > I am pasting the zpool status below. > > == > bash-3.2# zpool status > pool: test > state: UNAVAIL > status: One or more devices could not be opened. There are insufficient > replicas for the pool to continue functioning. > action: Attach the missing device and online it using 'zpool online'. >see: http://www.sun.com/msg/ZFS-8000-3C > scan: none requested > config: > > NAME STATE READ WRITE CKSUM > test UNAVAIL 0 0 0 insufficient > replicas > = > > But all my devices are online & I am able to access them. > when I export & import the zpool , the zpool comes to back to available state. > > I am not getting whats the problem with the reboot. > > Any suggestions regarding this was very helpful. > > Thanks& Regards, > Sudheer. > ___ > zfs-discuss mailing list > zfs-discuss@opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] unable to access the zpool after issue a reboot
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of sureshkumar
>
> What my problem is: I am unable to access the zpool after issuing a reboot.
>
> ==
> bash-3.2# zpool status
>   pool: test
>         NAME        STATE     READ WRITE CKSUM
>         test        UNAVAIL      0     0     0  insufficient replicas

Can you do a "history" and tell us what your "zpool create" command was?
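(Presumably via the pool's built-in command log, which replays every zpool command run against the pool, including the original create - the pool name "test" is taken from the status output above:

  bash-3.2# zpool history test
)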
[zfs-discuss] unable to access the zpool after issue a reboot
Hi all,

I am new to Solaris & I am facing an issue with the dynapath [multipath s/w] for Solaris10u10 x86.

I am facing an issue with the zpool. What my problem is: I am unable to access the zpool after issuing a reboot. I am pasting the zpool status below.

==
bash-3.2# zpool status
  pool: test
 state: UNAVAIL
status: One or more devices could not be opened. There are insufficient
        replicas for the pool to continue functioning.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-3C
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        test        UNAVAIL      0     0     0  insufficient replicas
=

But all my devices are online & I am able to access them. When I export & import the zpool, the zpool comes back to the available state. I am not getting what's the problem with the reboot.

Any suggestions regarding this would be very helpful.

Thanks & Regards,
Sudheer.