Re: [zfs-discuss] What is your data error rate?

2012-01-24 Thread John Martin

On 01/24/12 17:06, Gregg Wonderly wrote:

What I've noticed is that when my drives are in a situation of low
airflow, and hence hotter operating temperatures, they will fail
quite quickly.


While I *believe* the same thing, and thus have over-provisioned
airflow in my cases (for both drives and memory), there
are studies which failed to find a strong correlation between
drive temperature and failure rates:

  http://research.google.com/archive/disk_failures.pdf

  http://www.usenix.org/events/fast07/tech/schroeder.html

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] What is your data error rate?

2012-01-24 Thread Gregg Wonderly
What I've noticed is that when my drives are in a situation of low
airflow, and hence hotter operating temperatures, they will fail quite
quickly.  I've now moved my systems into large cases with large amounts of
airflow, using the Icy Dock brand of removable drive enclosures.


http://www.newegg.com/Product/Product.aspx?Item=N82E16817994097
http://www.newegg.com/Product/Product.aspx?Item=N82E16817994113

I use the SASUC8I SATA/SAS controller to access 8 drives.

http://www.newegg.com/Product/Product.aspx?Item=N82E16816117157

I put it in a PCI-e x16 slot on "graphics heavy" motherboards, which might have as 
many as 4x PCI-e x16 slots.  I am replacing an old motherboard with this one:


http://www.tigerdirect.com/applications/SearchTools/item-details.asp?EdpNo=1124780

The case that I found to be a good match for my needs is the Raven

http://www.newegg.com/Product/Product.aspx?Item=N82E16811163180

It has enough drive bays (7) to fit 2x 3-in-2 and 1x 4-in-3 Icy Dock enclosures, 
providing 10 drives in hot-swap bays.


I really think that the big issue is that you must move the air.  The drives 
really need to stay cool or else you will see degraded performance and/or data 
loss much more often.
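
As a quick way to keep an eye on this, drive temperatures can usually be
read from SMART.  A minimal sketch, assuming smartmontools is installed
(it is not part of a stock Solaris install) and using an illustrative
device path:

smartctl -A /dev/rdsk/c0t1d0 | grep -i temperature   # SMART attribute table only

The exact device syntax differs between Solaris and other platforms.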


Gregg Wonderly

On 1/24/2012 9:50 AM, Stefan Ring wrote:

After having read this mailing list for a little while, I get the
impression that there are at least some people who regularly
experience on-disk corruption that ZFS should be able to report and
handle. I’ve been running a raidz1 on three 1TB consumer disks for
approx. 2 years now (about 90% full), and I scrub the pool every 3-4
weeks and have never had a single error. From the oft-quoted 10^14
error rate that consumer disks are rated at, I should have seen an
error by now -- the scrubbing process is not the only activity on the
disks, after all, and the data transfer volume from that alone clocks
in at almost exactly 10^14 by now.

Not that I’m worried, of course, but it comes at a slight surprise to
me. Or does the 10^14 rating just reflect the strength of the on-disk
ECC algorithm?
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs send recv without uncompressing data stream

2012-01-24 Thread David Magda
On Tue, January 24, 2012 13:37, Jim Klimov wrote:

> One more rationale - compatibility, including some future-proofing
> (the zfs-send format explicitly does not guarantee that it won't
> change incompatibly). I mean transfer of data between systems that
> do not implement the same set of compression algorithms in ZFS.

The format of 'zfs send' has now been committed:

> The format of the stream is committed. You will be able to receive your
> streams on future versions of ZFS.

http://docs.oracle.com/cd/E19253-01/816-5166/zfs-1m/index.html

This was fixed in some update of Solaris 10, though I can't find the exact
one.

http://hub.opensolaris.org/bin/view/Community+Group+on/2008042301


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs send recv without uncompressing data stream

2012-01-24 Thread Jim Klimov

2012-01-24 19:52, Jim Klimov wrote:

2012-01-24 13:05, Mickaël CANÉVET wrote:

Hi,

Unless I misunderstood something, zfs send of a volume that has
compression activated uncompresses it. So if I do a zfs send|zfs receive
from a compressed volume to a compressed volume, my data are
uncompressed and compressed again. Right?

Is there a more effective way to do it (without decompression and
recompression) ?



Rationale being that the two systems
might demand different compression (e.g. "lzjb" or "none"
on the original system and "gzip-9" on the backup one).



One more rationale - compatibility, including some future-proofing
(the zfs-send format explicitly does not guarantee that it won't
change incompatibly). I mean transfer of data between systems that
do not implement the same set of compression algorithms in ZFS.

Say, as a developer I find a way to use bzip2 or 7zip to
compress my local system's blocks (just like gzip appeared
recently, after there were only lzjb and none). If I zfs-send
the compressed blocks as they are, another system won't be
able to interpret them unless it supports the same algorithm
and format. And since zfs-send can be used via files (i.e.
distribution media with flar-like archives), there is no
way for the zfs sender and the zfs recipient to negotiate
a common format, besides using a fixed predefined one -
uncompressed.
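
For illustration, a minimal sketch of the file-based case (pool, snapshot
and file names are made up); any compression used for transport here is
external to ZFS:

zfs snapshot tank/data@backup1
zfs send tank/data@backup1 | gzip > /media/archive/tank-data-backup1.zfs.gz
# later, possibly on a different system:
gunzip -c /media/archive/tank-data-backup1.zfs.gz | zfs receive pool2/data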

Using external programs to wrap that in the Unix way gets
out of ZFS's scope and can be arranged by other software
on the OSes.

HTH,
//Jim

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs send recv without uncompressing data stream

2012-01-24 Thread Richard Elling
On Jan 24, 2012, at 7:52 AM, Jim Klimov wrote:
> 2012-01-24 13:05, Mickaël CANÉVET wrote:
>> Hi,
>> 
>> Unless I misunderstood something, zfs send of a volume that has
>> compression activated uncompresses it. So if I do a zfs send|zfs receive
>> from a compressed volume to a compressed volume, my data are
>> uncompressed and compressed again. Right?

correct

>> 
>> Is there a more effective way to do it (without decompression and
>> recompression) ?
> 
> 
> While I cannot confirm or deny this statement, it was my
> impression as well. Rationale being that the two systems
> might demand different compression (e.g. "lzjb" or "none"
> on the original system and "gzip-9" on the backup one).
> Just like you probably have different VDEV layouts, etc.
> Or perhaps even different encryption or dedup settings.

that "feature" falls out of the implementation.

> 
> Compression, like many other components, lives on the
> layer "under" logical storage (userdata blocks), and
> gets applied to newly written blocks only (i.e. your
> datasets can have a mix of different compression levels
> for different files or even blocks within a file, if
> you switched the methods during dataset lifetime).
> 
> Actually I would not be surprised if the zfs-send userdata
> stream is even above the block level (i.e. it would seem
> normal to me if many small userdata blocks of the original
> pool became one big block on the recipient).
> 
> So while some optimizations are possible, I think they
> would violate layering quite badly.

data in the ARC is uncompressed. compression/decompression 
occurs in the ZIO pipeline layer below the DSL.

> 
> But, for example, it might make sense for zfs-send to
> include the original compression algorithm information
> into the sent stream and send the compressed data (less
> network traffic or intermediate storage requirement,
> to say the least - at zero price of recompression to
> something perhaps more efficient), and if the recipient
> dataset's algorithm differs - unpack and recompress it
> on the receiving side.
> 
> If that's not done already :)

the compression parameter value is sent, but as you mentioned
above, blocks in a snapshot can be compressed with different
algorithms, so you only actually get the last setting at time of
snapshot.
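
In practice, the recompression on the receiving side is driven by the
target dataset's own compression property; a minimal sketch, with
hypothetical pool and dataset names:

zfs set compression=gzip-9 backup          # inherited by datasets created under "backup"
zfs snapshot tank/data@snap1
zfs send tank/data@snap1 | zfs receive backup/data   # blocks are rewritten, hence recompressed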

> 
> So far my over-the-net zfs sends are piped into gzip
> or pigz, ssh and gunzip, and that often speeds up the
> overall transfer. Probably can be done with less overhead
> by "ssh -C" for implementations that have it.

the UNIX philosophy is in play here :-) Sending the data uncompressed
to stdout allows you to pipe it into various transport or transform programs.
 -- richard

--
ZFS Performance and Training
richard.ell...@richardelling.com
+1-760-896-4422



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] What is your data error rate?

2012-01-24 Thread Bob Friesenhahn

On Tue, 24 Jan 2012, Jim Klimov wrote:



Or does the 10^14 rating just reflect the strength
of the on-disk ECC algorithm?


I am not sure how much the algorithms differ between
"enterprise" and "consumer" disks, while the UBER is
said to differ by about 100 times. It might also have
to do with the quality of materials (better steel in ball
bearings, etc.) as well as better firmware/processors
which optimize mechanical workloads and reduce the
mechanical wear. Maybe so, at least...


In addition to the above, an important factor is that enterprise disks 
with 10^16 ratings also offer considerably less storage density. 
Instead of 3TB storage per drive, you get 400GB storage per drive.


So-called "nearline" enterprise storage drives fit in somewhere in the 
middle, with higher storage densities, but also higher error rates.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] What is your data error rate?

2012-01-24 Thread Jim Klimov

2012-01-24 19:50, Stefan Ring wrote:

After having read this mailing list for a little while, I get the
impression that there are at least some people who regularly
experience on-disk corruption that ZFS should be able to report and
handle. I’ve been running a raidz1 on three 1TB consumer disks for
approx. 2 years now (about 90% full), and I scrub the pool every 3-4
weeks and have never had a single error. From the oft-quoted 10^14
error rate that consumer disks are rated at, I should have seen an
error by now -- the scrubbing process is not the only activity on the
disks, after all, and the data transfer volume from that alone clocks
in at almost exactly 10^14 by now.

Not that I’m worried, of course, but it comes at a slight surprise to
me. Or does the 10^14 rating just reflect the strength of the on-disk
ECC algorithm?


I maintained several dozen storage servers for about
12 years, and I've seen quite a few drive deaths as
well as automatically triggered RAID array rebuilds.
But usually these were "infant deaths" in the first
year, and those drives who passed the age test often
give no noticeable problems for the next decade.
Several 2-4 disk systems work as OpenSolaris SXCE
servers with ZFS pools for root and data for years
now, and also show now problems. However most of
these are branded systems and disks from Sun.
I think we've only had one or two drives die, but
happened to have cold-spares due to over-ordering ;)

I do have a suspiciously high error rate on my home NAS,
which was thrown together from whatever pieces I had
at home at the time I left for an overseas trip. The
box has been nearly unmaintained since then, and can suffer
from physical problems known and unknown, such as the
SATA cabling (varied and quite possibly bad), non-ECC
memory, dust and overheating, etc.

It is also possible that aging components such as the
CPU and motherboard, which have about 5 years of active
lifetime behind them (including an overclocked past),
contribute to the error rates.

The old 80GB root drive has had some bad sectors (READ
errors during scrubs and data access), and rpool has been
recreated with copies=2 a few times now, thanks to a LiveUSB,
but the main data pool had no substantial errors until
the CKSUM errors reported this winter (metadata:0x0 and
then a dozen in-file checksum mismatches). Since
one of the drives got itself lost soon after, and only
reappeared after all the cables were replugged, I still
tend to blame this on the SATA cabling as the most probable
root cause.

I do not have an up-to-date SMART error report, and
the box is not accessible at the moment, so I can't
comment on lower-level errors in the main pool drives.
They were new at the time I put the box together (almost
a year ago now).
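
When the box is reachable again, the low-level counters can be checked
even without a full SMART report; a sketch using standard Solaris tools:

iostat -En        # per-device soft/hard/transport error counters
fmdump -e         # FMA error telemetry (disk and transport ereports)
zpool status -v   # pool-level READ/WRITE/CKSUM counters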

However, so far I am bothered much more by the tendency
of this box to lock up and/or reboot after somewhat
repeatable actions (such as destroying large snapshots
of deduped datasets, etc.) than by the discovered on-disk
CKSUM errors, whichever way they appeared. I tend to write
this off as shortcomings of the OS (i.e. memory hunger
and lockups in scanrate hell as the most frequent cause),
and this really bothers me more now - it causes lots of
downtime until some friend comes to that apartment to
reboot the box.

> Or does the 10^14 rating just reflect the strength
> of the on-disk ECC algorithm?

I am not sure how much the algorithms differ between
"enterprise" and "consumer" disks, while the UBER is
said to differ by about 100 times. It might also have
to do with the quality of materials (better steel in ball
bearings, etc.) as well as better firmware/processors
which optimize mechanical workloads and reduce the
mechanical wear. Maybe so, at least...

Finally, this is statistics. It does not "guarantee"
that for some 90Tbits of transferred data you will
certainly see an error (and just one for that matter).
Those drives which died young hopefully also count
in the overall stats, moving the bar a bit higher
for their better-made brethren.
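
As a rough back-of-the-envelope check (assuming independent errors at
exactly 1 per 10^14 bits), the chance of reading 10^14 bits and seeing
no error at all is still about 1/e, i.e. roughly 37%:

echo 'scale=4; e(-1)' | bc -l    # probability of zero errors after 10^14 bits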

Also, disk UBER covers media failures and the ability
of the disk's cache, firmware and ECC to deal with them.
After the disk sends the "correct" sector onto the wire,
many things can still happen: noise in bad connectors,
electromagnetic interference from all the motors in
your computer onto the data cable, the ability (or lack
thereof) of the data protocol (IDE, ATA, SCSI) to
detect and/or recover from such random bits injected
between disk and HBA, errors in HBA chips and code,
noise in old rusty PCI* connector slots, bit flips in
non-ECC RAM or overheated CPUs, power surges from the PSU...
There is a lot of stuff that can break :)

//Jim Klimov
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs send recv without uncompressing data stream

2012-01-24 Thread Jim Klimov

2012-01-24 13:05, Mickaël CANÉVET wrote:

Hi,

Unless I misunderstood something, zfs send of a volume that has
compression activated uncompresses it. So if I do a zfs send|zfs receive
from a compressed volume to a compressed volume, my data are
uncompressed and compressed again. Right?

Is there a more effective way to do it (without decompression and
recompression) ?



While I cannot confirm or deny this statement, it was my
impression as well. Rationale being that the two systems
might demand different compression (e.g. "lzjb" or "none"
on the original system and "gzip-9" on the backup one).
Just like you probably have different VDEV layouts, etc.
Or perhaps even different encryption or dedup settings.

Compression, like many other components, lives on the
layer "under" logical storage (userdata blocks), and
gets applied to newly written blocks only (i.e. your
datasets can have a mix of different compression levels
for different files or even blocks within a file, if
you switched the methods during dataset lifetime).

Actually I would not be surprised if the zfs-send userdata
stream is even above the block level (i.e. it would seem
normal to me if many small userdata blocks of the original
pool became one big block on the recipient).

So while some optimizations are possible, I think they
would violate layering quite badly.

But, for example, it might make sense for zfs-send to
include the original compression algorithm information
into the sent stream and send the compressed data (less
network traffic or intermediate storage requirement,
to say the least - at zero price of recompression to
something perhaps more efficient), and if the recipient
dataset's algorithm differs - unpack and recompress it
on the receiving side.

If that's not done already :)

So far my over-the-net zfs sends are piped into gzip
or pigz, ssh and gunzip, and that often speeds up the
overall transfer. Probably can be done with less overhead
by "ssh -C" for implementations that have it.

//Jim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] What is your data error rate?

2012-01-24 Thread Stefan Ring
After having read this mailing list for a little while, I get the
impression that there are at least some people who regularly
experience on-disk corruption that ZFS should be able to report and
handle. I’ve been running a raidz1 on three 1TB consumer disks for
approx. 2 years now (about 90% full), and I scrub the pool every 3-4
weeks and have never had a single error. From the oft-quoted 10^14
error rate that consumer disks are rated at, I should have seen an
error by now -- the scrubbing process is not the only activity on the
disks, after all, and the data transfer volume from that alone clocks
in at almost exactly 10^14 by now.

Not that I’m worried, of course, but it comes at a slight surprise to
me. Or does the 10^14 rating just reflect the strength of the on-disk
ECC algorithm?
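
For reference, a back-of-the-envelope conversion, assuming the commonly
quoted consumer rating of one unrecoverable error per 10^14 bits read:
10^14 bits is about 12.5 TB, and a 90%-full 3x 1TB raidz1 reads roughly
2.7 TB per scrub, so two years of scrubbing alone passes that volume
several times over:

echo 'scale=2; 10^14 / 8 / 10^12' | bc   # ~12.50 TB read per expected error
echo 'scale=1; 3 * 0.9' | bc             # ~2.7 TB read across the disks per scrub
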
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] unable to access the zpool after issue a reboot

2012-01-24 Thread David Blasingame
Sudheer,


I don't know what the module name is for dynapath, but you may want to include 
a forceload statement in /etc/system.  This will cause the driver to load 
during initialization.  Usually all the modules in the stack should be 
included, such as the sd driver.


example:

forceload:  drv/sd
forceload:  drv/
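
If the dynapath module name is not known, one way to look it up (the grep
pattern is just a guess at the vendor's naming) is to check the loaded
modules and the driver bindings:

modinfo | grep -i dynapath             # currently loaded kernel modules
grep -i dynapath /etc/driver_aliases   # driver-to-device bindings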



HTH

Dave



 From: sureshkumar 
To: zfs-discuss@opensolaris.org 
Sent: Tuesday, January 24, 2012 6:03 AM
Subject: [zfs-discuss] unable to access the zpool after issue a reboot
 

Hi all,


I am new to Solaris & I am facing an issue with the dynapath [multipath s/w]
for Solaris 10u10 x86.

I am facing an issue with the zpool.

My problem is that I am unable to access the zpool after issuing a reboot.

I am pasting the zpool status below.

==
bash-3.2# zpool status
  pool: test
 state: UNAVAIL
 status: One or more devices could not be opened.  There are insufficient
        replicas for the pool to continue functioning.
 action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-3C
 scan: none requested
 config:

        NAME                     STATE     READ WRITE CKSUM
        test                     UNAVAIL      0     0     0  insufficient replicas
=
                      
But all my devices are online & I am able to access them.
When I export & import the zpool, it comes back to the available state.

I am not sure what the problem with the reboot is.

Any suggestions regarding this would be very helpful.

Thanks & Regards,
Sudheer.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] zfs send recv without uncompressing data stream

2012-01-24 Thread Mickaël CANÉVET
Hi,

Unless I misunderstood something, zfs send of a volume that has
compression activated uncompresses it. So if I do a zfs send|zfs receive
from a compressed volume to a compressed volume, my data are
uncompressed and compressed again. Right?

Is there a more effective way to do it (without decompression and
recompression) ?

Cheers,
Mickaël


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] unable to access the zpool after issue a reboot

2012-01-24 Thread Bob Friesenhahn

On Tue, 24 Jan 2012, sureshkumar wrote:


        NAME                     STATE     READ WRITE CKSUM
        test                     UNAVAIL      0     0     0  insufficient 
replicas
=                   
                      
But all my devices are online & I am able to access them.
when I export & import the zpool , the zpool comes to back to available state.

I am not getting whats the problem with the reboot.


The LUN on which this pool is based was not available within a 
reasonable time when zfs tried to import the pool at boot.  It became 
available later.


What storage technology is this LUN based on (local SAS/SATA, iSCSI, 
FC)?


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] unable to access the zpool after issue a reboot

2012-01-24 Thread Gary Mills
On Tue, Jan 24, 2012 at 05:33:39PM +0530, sureshkumar wrote:
> 
>I am new to Solaris & I am facing an issue with the dynapath [multipath
>s/w] for Solaris10u10 x86 .
> 
>I am facing an issue with the zpool.
> 
>Whats my problem is unable to access the zpool after issue a reboot.

I've seen this happen when the zpool was built on an iSCSI LUN.  At
reboot time, the ZFS import was done before the iSCSI driver was able
to connect to its target.  After the system was up, an export and
import was successful.  The solution was to add a new service that
imported the zpool later during the boot sequence.
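
A minimal sketch of such a workaround, done as an old-style rc script
rather than a full SMF service (pool name is illustrative):

#!/sbin/sh
# /etc/rc3.d/S99importpool - import a pool whose LUN was not yet
# reachable when ZFS tried to open it earlier in the boot
zpool list test > /dev/null 2>&1 || zpool import test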

-- 
-Gary Mills--refurb--Winnipeg, Manitoba, Canada-
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] unable to access the zpool after issue a reboot

2012-01-24 Thread Hung-Sheng Tsao (laoTsao)
How did you issue the "reboot"? Try:
shutdown -i6 -y -g0

Sent from my iPad

On Jan 24, 2012, at 7:03, sureshkumar  wrote:

> Hi all,
> 
> 
> I am new to Solaris & I am facing an issue with the dynapath [multipath s/w] 
> for Solaris10u10 x86 .
> 
> I am facing an issue with the zpool.
> 
> Whats my problem is unable to access the zpool after issue a reboot.
> 
> I am pasting the zpool status below.
> 
> ==
> bash-3.2# zpool status
>   pool: test
>  state: UNAVAIL
>  status: One or more devices could not be opened.  There are insufficient
> replicas for the pool to continue functioning.
>  action: Attach the missing device and online it using 'zpool online'.
>see: http://www.sun.com/msg/ZFS-8000-3C
>  scan: none requested
>  config:
> 
> NAME STATE READ WRITE CKSUM
> test UNAVAIL  0 0 0  insufficient 
> replicas
> = 
> 
> But all my devices are online & I am able to access them.
> when I export & import the zpool , the zpool comes to back to available state.
> 
> I am not getting whats the problem with the reboot.
> 
> Any suggestions regarding this was very helpful.
> 
> Thanks& Regards,
> Sudheer.
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] unable to access the zpool after issue a reboot

2012-01-24 Thread Edward Ned Harvey
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of sureshkumar
> 
> Whats my problem is unable to access the zpool after issue a reboot.
> 
> ==
> bash-3.2# zpool status
>   pool: test
>         NAME                     STATE     READ WRITE CKSUM
>         test                     UNAVAIL      0     0     0  insufficient
replicas

Can you do a "history" and tell us what your "zpool create" command was?

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] unable to access the zpool after issue a reboot

2012-01-24 Thread sureshkumar
Hi all,


I am new to Solaris & I am facing an issue with the dynapath [multipath
s/w] for Solaris 10u10 x86.

I am facing an issue with the zpool.

My problem is that I am unable to access the zpool after issuing a reboot.

I am pasting the zpool status below.

==
bash-3.2# zpool status
  pool: test
 state: UNAVAIL
 status: One or more devices could not be opened.  There are insufficient
replicas for the pool to continue functioning.
 action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-3C
 scan: none requested
 config:

NAME STATE READ WRITE CKSUM
test UNAVAIL  0 0 0  insufficient replicas
=

But all my devices are online & I am able to access them.
When I export & import the zpool, it comes back to the
available state.

I am not sure what the problem with the reboot is.

Any suggestions regarding this would be very helpful.
Thanks & Regards,
Sudheer.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss