Re: [zfs-discuss] Thoughts on ZFS Pool Backup Strategies

2010-03-22 Thread Constantin Gonzalez

Hi,

I agree 100% with Chris.

Notice the "on their own" part of the original post. Yes, nobody wants
to run zfs send or (s)tar by hand.

That's why Chris's script is so useful: you set it up, forget about it, and it
gets the job done for 80% of home users.

On another note, I was positively surprised by the availability of Crash Plan
for OpenSolaris:

  http://crashplan.com/

Their free service lets you back up your data to a friend's system over the
net in an encrypted way; the paid-for service uses CrashPlan's data centers at
less than Amazon S3 pricing.

While this may not be everyone's solution, I find it significant that they
explicitly support OpenSolaris. This either means they're OpenSolaris fans
or that they see potential in OpenSolaris home server users.


Cheers,
  Constantin

On 03/20/10 01:31 PM, Chris Gerhard wrote:


I'll say it again: neither 'zfs send' nor (s)tar is an enterprise (or
even home) backup system on their own; one or both can be components of
the full solution.



Up to a point. zfs send | zfs receive does make a very good backup scheme for 
the home user with a moderate amount of storage, especially when the entire 
backup will fit on a single drive, which I think would cover the majority of 
home users.

Using external drives and incremental zfs streams allows for extremely quick 
backups of large amounts of data.

It certainly does for me. 
http://chrisgerhard.wordpress.com/2007/06/01/rolling-incremental-backups/
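
For anyone who doesn't want to follow the link, the core of such a rolling
incremental scheme is roughly the following (a sketch only, with hypothetical
pool and snapshot names, not the exact script from the post):

  zfs snapshot tank@2010-03-22
  zfs send -i tank@2010-03-21 tank@2010-03-22 | zfs receive -F backup/tank

where "backup" is a pool living on the external drive and the last common
snapshot is kept on both sides so the next incremental has a base.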


--
Sent from OpenSolaris, http://www.opensolaris.org/

Constantin Gonzalez  Sun Microsystems GmbH, Germany
Principal Field Technologist   Blog: constantin.glez.de
Tel.: +49 89/4 60 08-25 91  Twitter: @zalez

Sitz d. Ges.: Sun Microsystems GmbH, Sonnenallee 1, 85551 Kirchheim-Heimstetten
Amtsgericht Muenchen: HRB 161028
Geschaeftsfuehrer: Thomas Schroeder
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Proposition of a new zpool property.

2010-03-22 Thread Andrew Gabriel

Robert Milkowski wrote:


To add my 0.2 cents...

I think starting/stopping scrub belongs to cron, smf, etc. and not to 
zfs itself.


However, what would be nice to have is the ability to freeze/resume a 
scrub and also to limit its rate of scrubbing.
One of the reasons is that when working in SAN environments one has to 
take into account more than just the server where the scrub will be running: 
while it might not impact that server, it might cause an issue for 
others, etc.


There's an RFE for this (pause/resume a scrub), or rather there was - 
unfortunately, it's got subsumed into another RFE/BUG and the 
pause/resume requirement got lost. I'll see about reinstating it.


--
Andrew
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Proposition of a new zpool property.

2010-03-22 Thread Svein Skogen

On 22.03.2010 02:13, Edward Ned Harvey wrote:

Actually ... Why should there be a ZFS property to share NFS, when you

can

already do that with share and dfstab?  And still the zfs property
exists.


Probably because it is easy to create new filesystems and clone them; as
NFS only works per filesystem, you need to edit dfstab every time you
add a filesystem.  With the nfs property, zfs creates the NFS export,
etc.


Either I'm missing something, or you are.

If I export /somedir and then I create a new zfs filesystem /somedir/foo/bar
then I don't have to mess around with dfstab, because it's a subdirectory of
an exported directory, it's already accessible via NFS.  So unless I
misunderstand what you're saying, you're wrong.

This is the only situation I can imagine where you would want to create a
ZFS filesystem and have it default to being NFS exported.


Actually, I can see some reasons for this. Some of us want directories 
mounted in the same place on all servers. Consider the following:


zfs set sharenfs=on pool/nfs
zfs create -o mountpoint=/home pool/nfs/home
zfs create -o mountpoint=/webpages pool/nfs/www
zfs create -o mountpoint=/someotherdir pool/nfs/otherdir

etc.
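
(A quick way to confirm that the children actually picked up the export,
assuming sharenfs is set on pool/nfs as above, is:

  zfs get -r sharenfs pool/nfs

which should show each child with SOURCE "inherited from pool/nfs".)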

So, I do see the point of the sharenfs attribute. ;)

//Svein

--

Sending mail from a temporary set up workstation, as my primary W500 is 
off for service. PGP not installed.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Proposition of a new zpool property.

2010-03-22 Thread Svein Skogen

On 21.03.2010 01:25, Robert Milkowski wrote:


To add my 0.2 cents...

I think starting/stopping scrub belongs to cron, smf, etc. and not to
zfs itself.

However, what would be nice to have is the ability to freeze/resume a
scrub and also to limit its rate of scrubbing.
One of the reasons is that when working in SAN environments one has to
take into account more than just the server where the scrub will be running:
while it might not impact that server, it might cause an issue for
others, etc.


Does cron happen to know how many other scrubs are running, bogging down 
your IO system? If the scrub scheduling was integrated into zfs itself, 
it would be a small step to include smf/sysctl settings for maximum 
number of parallel scrubs, meaning the next scrub could sit waiting 
until the running ones are finished.


//Svein

--

Sending mail from a temporary set up workstation, as my primary W500 is 
off for service. PGP not installed.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Proposition of a new zpool property.

2010-03-22 Thread Robert Milkowski

On 22/03/2010 01:13, Edward Ned Harvey wrote:

Actually ... Why should there be a ZFS property to share NFS, when you
   

can
 

already do that with share and dfstab?  And still the zfs property
exists.
   

Probably because it is easy to create new filesystems and clone them; as
NFS only works per filesystem, you need to edit dfstab every time you
add a filesystem.  With the nfs property, zfs creates the NFS export,
etc.
 

Either I'm missing something, or you are.

If I export /somedir and then I create a new zfs filesystem /somedir/foo/bar
then I don't have to mess around with dfstab, because it's a subdirectory of
an exported directory, it's already accessible via NFS.  So unless I
misunderstand what you're saying, you're wrong.


   


No, it is not a subdirectory; it is a filesystem mounted on top of the 
subdirectory.
So unless you use NFSv4 with mirror mounts or an automounter, other NFS 
versions will show you the contents of the directory and not the filesystem. It 
doesn't matter whether it is zfs or not.



--
Robert Milkowski
http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Proposition of a new zpool property.

2010-03-22 Thread Robert Milkowski

On 22/03/2010 08:49, Andrew Gabriel wrote:

Robert Milkowski wrote:


To add my 0.2 cents...

I think starting/stopping scrub belongs to cron, smf, etc. and not to 
zfs itself.


However, what would be nice to have is the ability to freeze/resume a 
scrub and also to limit its rate of scrubbing.
One of the reasons is that when working in SAN environments one has 
to take into account more than just the server where the scrub will be 
running: while it might not impact that server, it might cause an 
issue for others, etc.


There's an RFE for this (pause/resume a scrub), or rather there was - 
unfortunately, it's got subsumed into another RFE/BUG and the 
pause/resume requirement got lost. I'll see about reinstating it.




have you got the rfe/bug numbers?
I will try to find some time and get it implemented...

--
Robert Milkowski
http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] David Plaunt is currently away.

2010-03-22 Thread D Plaunt

I will be out of the office starting  22/03/2010 and will not return until
06/04/2010.

Hello,

I am currently working on a project and out of the office.  I will be
checking my messages twice a day but may be unavailable to follow up on your
requests.

If the matter requires immediate attention please send your request to
t...@brucetelecom.com or contact technical support at 1 866 517 2000 x 2 /
519 368 2000 x 2.

Thank you,
David

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Proposition of a new zpool property.

2010-03-22 Thread Edward Ned Harvey
 Does cron happen to know how many other scrubs are running, bogging
 down
 your IO system? If the scrub scheduling was integrated into zfs itself,

It doesn't need to.

Crontab entry:  /root/bin/scruball.sh

/root/bin/scruball.sh:
#!/usr/bin/bash
for filesystem in filesystem1 filesystem2 filesystem3 ; do 
  zfs scrub $filesystem
done


If you were talking about something else, for example, multiple machines all
scrubbing a SAN at the same time, then ZFS can't solve that any better than
cron, because it would require inter-machine communication to coordinate.  I
contend a shell script could actually handle that better than a built-in zfs
property anyway.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Proposition of a new zpool property.

2010-03-22 Thread Svein Skogen

On 22.03.2010 13:35, Edward Ned Harvey wrote:

Does cron happen to know how many other scrubs are running, bogging
down
your IO system? If the scrub scheduling was integrated into zfs itself,


It doesn't need to.

Crontab entry:  /root/bin/scruball.sh

/root/bin/scruball.sh:
#!/usr/bin/bash
for filesystem in filesystem1 filesystem2 filesystem3 ; do
   zfs scrub $filesystem
done


If you were talking about something else, for example, multiple machines all
scrubbing a SAN at the same time, then ZFS can't solve that any better than
cron, because it would require inter-machine communication to coordinate.  I
contend a shell script could actually handle that better than a built-in zfs
property anyway.



IIRC it's zpool scrub, and last time I checked, the zpool command 
exited (with status 0) as soon as it had started the scrub. Your command 
would start _ALL_ scrubs in parallel as a result.


//Svein

--

Sending mail from a temporary set up workstation, as my primary W500 is 
off for service. PGP not installed.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Proposition of a new zpool property.

2010-03-22 Thread Edward Ned Harvey
 No, it is not a subdirectory; it is a filesystem mounted on top of the
 subdirectory.
 So unless you use NFSv4 with mirror mounts or an automounter, other NFS
 versions will show you the contents of the directory and not the filesystem. It
 doesn't matter whether it is zfs or not.

Ok, I learned something here, that I want to share:

If you create a new zfs filesystem as a subdir of a zfs filesystem which is
exported via nfs and shared via cifs ...

The cifs clients see the contents of the child zfs filesystems.
But, as Robert said above, nfs clients do not see the contents of the child
zfs filesystem.

So, if you nest zfs filesystems inside each other (I don't) then the
sharenfs property of a parent can be inherited by a child, and if that's
your desired behavior, it's a cool feature.

For that matter, even if you do set the property, and you create a new child
filesystem with inheritance, that only means the server will auto-export the
filesystem.  It doesn't mean the client will auto-mount it, right?  So
what's the 2nd half of the solution?  Assuming you want the clients to see
the subdirectories as the server does.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Proposition of a new zpool property.

2010-03-22 Thread Edward Ned Harvey
 IIRC it's zpool scrub, and last time I checked, the zpool command
 exited (with status 0) as soon as it had started the scrub. Your
 command
 would start _ALL_ scrubs in parallel as a result.

You're right.  I did that wrong.  Sorry 'bout that.

So either way, if there's a zfs property for scrub, that still doesn't
prevent multiple scrubs from running simultaneously.  So ...  Presently
there's no way to avoid the simultaneous scrubs either way, right?  You have
to home-cook scripts to detect which scrubs are running on which
filesystems, and serialize the scrubs.  With, or without the property.

Don't get me wrong - I'm not discouraging the creation of the property.  But
if you want to avoid simul-scrub, you'd first have to create a mechanism for
that, and then you could create the autoscrub.
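
For what it's worth, a minimal sketch of such a mechanism (hypothetical pool
names; it just polls 'zpool status', so treat it as illustrative rather than
robust):

#!/usr/bin/bash
# scrub pools one at a time: start a scrub, wait for it to finish, move on
for pool in tank1 tank2 tank3 ; do
  zpool scrub $pool
  while zpool status $pool | grep "scrub in progress" > /dev/null ; do
    sleep 300
  done
done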

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS+CIFS: Volume Shadow Services, or Simple Symlink?

2010-03-22 Thread Edward Ned Harvey
 Not being a CIFS user, could you clarify/confirm for me..  is this
 just a presentation issue, ie making a directory icon appear in a
 gooey windows explorer (or mac or whatever equivalent) view for people
 to click on?  The windows client could access the .zfs/snapshot dir
 via typed pathname if it knows to look, or if it's made visible, yes?

You are correct.  A CIFS client by default will not show the hidden .zfs
directory, but if you either tick the "show hidden files" checkbox then
you'll see it, or if you type it into the address bar, then you can access
it.

However, my users were used to having a hidden .snapshots directory in
every directory.  I didn't want to tell them "You have to go to the parent
of all directories, and type in .zfs", mostly because they can't remember
"zfs" ... So the softlink just makes it visible and easy to remember.
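
The softlink in question is just something along these lines (hypothetical
share path):

  # make the snapshot directory visible under a name users already know
  ln -s .zfs/snapshot /tank/share/.snapshots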

  I promise you will never catch me creating backups of ZFS via CIFS.  ;-)
 
 Never say never..

Hehehehe.  Given the alternatives, I think this is a safe one.  I will never
backup a ZFS filesystem via CIFS client.  ;-)  

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Proposition of a new zpool property.

2010-03-22 Thread Robert Milkowski

On 22/03/2010 12:50, Edward Ned Harvey wrote:

No, it is not a subdirectory; it is a filesystem mounted on top of the
subdirectory.
So unless you use NFSv4 with mirror mounts or an automounter, other NFS
versions will show you the contents of the directory and not the filesystem. It
doesn't matter whether it is zfs or not.
 

Ok, I learned something here, that I want to share:

If you create a new zfs filesystem as a subdir of a zfs filesystem which is
exported via nfs and shared via cifs ...

The cifs clients see the contents of the child zfs filesystems.
But, as Robert said above, nfs clients do not see the contents of the child
zfs filesystem.

So, if you nest zfs filesystems inside each other (I don't) then the
sharenfs property of a parent can be inherited by a child, and if that's
your desired behavior, it's a cool feature.

For that matter, even if you do set the property, and you create a new child
filesystem with inheritance, that only means the server will auto-export the
   
filesystem.  It doesn't mean the client will auto-mount it, right?  So

what's the 2nd half of the solution?  Assuming you want the clients to see
the subdirectories as the server does.


   


Look for the mirror mounts feature in NFSv4.
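
Roughly how that looks from a Solaris client (hypothetical server and dataset
names; assumes the server is sharing over NFSv4):

  mount -F nfs -o vers=4 server:/pool/nfs /mnt
  ls /mnt/home        # crossing into the child filesystem
  nfsstat -m          # the child now appears as its own mount

With mirror mounts the client mounts server-side filesystems automatically as
you traverse into them, so newly created and shared children show up without
editing vfstab or automounter maps.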


--
Robert Milkowski
http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Performance on SATA Drive

2010-03-22 Thread Kashif Mumtaz
Hi, thanks for all the replies.

I have found the real culprit.
The hard disk was faulty. I changed the hard disk, and now ZFS performance is much 
better.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Proposition of a new zpool property.

2010-03-22 Thread Svein Skogen

On 22.03.2010 13:54, Edward Ned Harvey wrote:

IIRC it's zpool scrub, and last time I checked, the zpool command
exited (with status 0) as soon as it had started the scrub. Your
command
would start _ALL_ scrubs in parallel as a result.


You're right.  I did that wrong.  Sorry 'bout that.

So either way, if there's a zfs property for scrub, that still doesn't
prevent multiple scrubs from running simultaneously.  So ...  Presently
there's no way to avoid the simultaneous scrubs either way, right?  You have
to home-cook scripts to detect which scrubs are running on which
filesystems, and serialize the scrubs.  With, or without the property.

Don't get me wrong - I'm not discouraging the creation of the property.  But
if you want to avoid simul-scrub, you'd first have to create a mechanism for
that, and then you could create the autoscrub.



Which is exactly why I wanted it cooked into the zfs code itself. zfs 
knows how many fs'es it's scrubbing.


//Svein


--

Sending mail from a temporary set up workstation, as my primary W500 is 
off for service. PGP not installed.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Intel SASUC8I - worth every penny

2010-03-22 Thread Cooper Hubbell
 I've moved to 7200RPM 2.5" laptop drives over 3.5" drives, for a
 combination of reasons: lower power, better performance than
 comparably sized 3.5" drives, and generally lower capacities, meaning
 resilver times are smaller. They're a bit more $/GB, but not a lot.
 If you can stomach the extra cost (they run $220), I'd actually
 recommend getting an 8x 2.5" in 2x 5.25" enclosure from Supermicro.  It
 works nicely, plus it gives you a nice little place to put your SSD.   :-)

 -- 
 Erik Trimble
 Java System Support
 Mailstop:  usca22-123
 Phone:  x17195
 Santa Clara, CA

Regarding the 2.5" laptop drives, do the inherent error detection properties of 
ZFS subdue any concerns over a laptop drive's higher bit error rate or rated 
MTBF?  I've been reading about OpenSolaris and ZFS for several months now and 
am incredibly intrigued, but have yet to implement the solution in my lab.

Thanks!
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Intel SASUC8I - worth every penny

2010-03-22 Thread Erik Trimble

Cooper Hubbell wrote:
Regarding the 2.5" laptop drives, do the inherent error detection 
properties of ZFS subdue any concerns over a laptop drive's higher bit 
error rate or rated MTBF? I've been reading about OpenSolaris and ZFS 
for several months now and am incredibly intrigued, but have yet to 
implement the solution in my lab.

Thanks!
  
So far as I know, laptop drives have no higher error rates (i.e. 
unrecoverable errors per 1 billion bits read/written), and similar MTBF to 
standard consumer SATA drives.  Looking at a couple of spec sheets, MTBF 
is about 600,000 hrs for laptop drives, and 700,000 hrs for consumer 
3.5" drives.   Frankly, if I were concerned about individual component 
failures, I'd look outside the consumer space (in all form factors).


In both cases, they're not terribly reliable, which is why ZFS is so 
great.  :-)


And, yes, to answer your question, this is (one of) the main points 
behind ZFS: being able to provide a reliable service from unreliable parts.


--
Erik Trimble
Java System Support
Mailstop:  usca22-123
Phone:  x17195
Santa Clara, CA

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Intel SASUC8I - worth every penny

2010-03-22 Thread Svein Skogen

On 22.03.2010 16:24, Cooper Hubbell wrote:

I've moved to 7200RPM 2.5" laptop drives over 3.5" drives, for a
combination of reasons: lower power, better performance than
comparably sized 3.5" drives, and generally lower capacities, meaning
resilver times are smaller. They're a bit more $/GB, but not a lot.
If you can stomach the extra cost (they run $220), I'd actually
recommend getting an 8x 2.5" in 2x 5.25" enclosure from Supermicro.  It
works nicely, plus it gives you a nice little place to put your SSD.   :-)



--
Erik Trimble
Java System Support
Mailstop:  usca22-123
Phone:  x17195
Santa Clara, CA


Regarding the 2.5" laptop drives, do the inherent error detection properties of 
ZFS subdue any concerns over a laptop drive's higher bit error rate or rated MTBF?  
I've been reading about OpenSolaris and ZFS for several months now and am incredibly 
intrigued, but have yet to implement the solution in my lab.


Well ... the price difference means you can have mirrors of the laptop 
drives and still save money compared to the enterprise ones. With a 
modern patrol-reading (scrub or hardware RAID) array setup, and with 
some redundancy, you can reinterpret the "I" in RAID to mean "inexpensive" 
rather than "independent". ;)


//Svein

--

Sending mail from a temporary set up workstation, as my primary W500 is 
off for service. PGP not installed.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Thoughts on ZFS Pool Backup Strategies

2010-03-22 Thread David Dyer-Bennet

On Sat, March 20, 2010 07:31, Chris Gerhard wrote:

 Up to a point. zfs send | zfs receive does make a very good backup scheme
 for the home user with a moderate amount of storage, especially when the
 entire backup will fit on a single drive, which I think would cover the
 majority of home users.

My own fit on a single external drive; but I've noticed that I have a
rather small configuration (1.2TB nominal, less than 800GB used).  Most
people I hear describing building home NAS setups put between 4 and 10 of
the biggest drives they can buy in them -- much more capacity than mine
(but then I built mine in 2006, too). I'm not clear how much of it they
ever fill up :-).

 Using external drives and incremental zfs streams allows for extremely
 quick backups of large amounts of data.

 It certainly does for me.
 http://chrisgerhard.wordpress.com/2007/06/01/rolling-incremental-backups/

So far, for me it allows for endless failures and a LOT of reboots to free
stuck IO subsystems.

Your script seems to be using a simple zfs send -i; what I'm trying to do
is use an incremental replication stream, a -R -I thing (from memory; hope
that's right!).  This should propagate (for example) my every-2-hours
snapshots over onto the backup, even though I only back up to a given
drive every two or three days (three backup drives, rotating one
off-site).  Unfortunately, though, it doesn't work; hangs during the
receive eventually.  I'm waiting for the 2010.$Spring stable release to
see how it behaves there before I get really energetic about debugging; for
now I'm just forcing pool recreation and full backups each time
(destroying the filesystem also hangs).  Given that full backups run
through to completion, whereas incrementals fail even though they're
pushing a lot less data and take a lot less time, I'm not inclined to
blame my USB hardware.
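
For reference, the replication-stream form being described is roughly
(hypothetical pool and snapshot names):

  zfs send -R -I tank@backup-old tank@backup-new | zfs receive -Fd backuppool

where -R sends the dataset tree with its properties and snapshots, and -I
includes every intermediate snapshot between the two, which is what should
propagate the every-2-hours snapshots onto the backup drive.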
-- 
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Rethinking my zpool

2010-03-22 Thread Chris Dunbar
Thank you to all who responded. This response in particular was very helpful 
and I think I will stick with my current zpool configuration (choice a if 
you're reading below). I primarily host VMware virtual machines over NFS from 
this server's predecessor and this server will be doing the same thing. I think 
the 6 x 2-way mirror configuration gives me the best mix of performance and 
fault tolerance.

Regards,
Chris Dunbar

On Mar 19, 2010, at 5:44 PM, Erik Trimble wrote:

 Chris Dunbar - Earthside, LLC wrote:
  Hello,
 
  After being immersed in this list and other ZFS sites for the past few 
  weeks I am having some doubts about the zpool layout on my new server. It's 
  not too late to make a change so I thought I would ask for comments. My 
  current plan is to have 12 x 1.5 TB disks in what I would normally call a 
  RAID 10 configuration. That doesn't seem to be the right term here, but 
  there are 6 sets of mirrored disks striped together. I know that smaller 
  sets of disks are preferred, but how small is small? I am wondering if I 
  should break this into two sets of 6 disks. I do have a 13th disk available 
  as a hot spare. Would it be available for either pool if I went with two? 
  Finally, would I be better off with raidz2 or something else instead of the 
  striped mirrored sets? Performance and fault tolerance are my highest 
  priorities.
 
  Thank you,
  Chris Dunbar
 There's not much benefit I can see to having two pools if both are using 
 the same configuration (i.e. all mirrors or all raidz). There are reasons 
 to do so, but I don't see that they would be of any real benefit for 
 what you describe. A hot spare disk can be assigned to multiple pools 
 (often referred to as a "global hot spare").
 
 Preferences for raidz[123] configs is to have 4-6 data disks in the vdev.
 
 Realistically speaking, you have several different (practical) 
 configurations possible, in order of general performance:
 
 (a) 6 x 2-way mirrors + 1 pool hot spare - 9TB usable
 (b) 4 x 3-ways mirrors + 1 pool hot spare - 6TB usable
 (c) 1 6-disk raidz + 1 7-disk raidz - 16.5TB usable
 (d) 2 6-disk raidz + 1 pool hot spare - 15TB usable
 (e) 1 6-disk raidz2 + 1 7-disk raidz2 - 13.5TB usable
 (f) 2 6-disk raidz2 + 1 pool hot spare - 12TB usable
 (g) 1 6-disk raidz3 + 1 7-disk raidz3 - 10.5TB usable
 (h) 1 13-disk raidz3 - 15TB usable
 
 Given the size of your disks, resilvering is likely to have a 
 significant time problem in any RAIDZ[123] configuration. That is, 
 unless you are storing (almost exclusively) very large files, resilver 
 time is going to be significant, and can potentially be radically higher 
 than a mirrored config.
 
 The mirroring configs will out-perform raidz[123] on everything except 
 large streaming write/reads, and even then, it's a toss-up. 
 
 Overall, the (a), (d), and (f) configurations generally offer the best 
 balance of redundancy, space, and performance.
 
 Here are the chances of surviving disk failures (assuming hot spares 
 cannot be used; that is, all disk failures happen in a short period 
 of time) - note that all three can always survive a single disk failure:
 
 (a) 90% for 2, 73% for 3, 49% for 4, 25% for 5.
 (d) 55% for 2, 27% for 3, 0% for 4 or more
 (f) 100% for 2, 80% for 3, 56% for 4, 0% for 5.
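 
 (For example, the (a) figures follow from counting which failures are fatal:
 after one disk dies, only its mirror partner - 1 of the 11 remaining disks -
 kills the pool, so surviving a 2nd failure is 10/11, about 90%; after two
 non-fatal failures, 2 of the 10 remaining disks are fatal, giving
 10/11 * 8/10, about 73%; and so on.)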
 
 
 Depending on your exact requirements, I'd go with (a) or (f) as the best 
 choices - (a) if performance is more important, (f) if redundancy 
 overrides performance.
 
 -- 
 Erik Trimble
 Java System Support
 Mailstop: usca22-123
 Phone: x17195
 Santa Clara, CA
 

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Proposition of a new zpool property.

2010-03-22 Thread Richard Elling
On Mar 22, 2010, at 7:30 AM, Svein Skogen wrote:

 On 22.03.2010 13:54, Edward Ned Harvey wrote:
 IIRC it's zpool scrub, and last time I checked, the zpool command
 exited (with status 0) as soon as it had started the scrub. Your
 command
  would start _ALL_ scrubs in parallel as a result.
 
 You're right.  I did that wrong.  Sorry 'bout that.
 
 So either way, if there's a zfs property for scrub, that still doesn't
 prevent multiple scrubs from running simultaneously.  So ...  Presently
 there's no way to avoid the simultaneous scrubs either way, right?  You have
 to home-cook scripts to detect which scrubs are running on which
 filesystems, and serialize the scrubs.  With, or without the property.
 
 Don't get me wrong - I'm not discouraging the creation of the property.  But
 if you want to avoid simul-scrub, you'd first have to create a mechanism for
 that, and then you could create the autoscrub.
 
 
 Which is exactly why I wanted it cooked into the zfs code itself. zfs 
 knows how many fs'es it's scrubbing.

Nit: ZFS does not scrub file systems.  ZFS scrubs pools.  In most deployments
I've done or seen there are very few pools, with many file systems.

For appliances like NexentaStor or Oracle's Sun OpenStorage platforms, the
default smallest unit of deployment is one disk. In other words, there is no
case where multiple scrubs compete for the resources of a single disk because
a single disk only participates in one pool. In general, resource management 
works when you are resource constrained. Hence, it is quite acceptable to 
implement concurrent scrubs.

Bottom line: systems engineering is still required for optimal system operation.
 -- richard

ZFS storage and performance consulting at http://www.RichardElling.com
ZFS training on deduplication, NexentaStor, and NAS performance
Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com 





___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS send and receive corruption across a WAN link?

2010-03-22 Thread Richard Elling
On Mar 19, 2010, at 1:28 PM, Richard Jahnel wrote:
 They way we do this here is:
 
 zfs snapshot voln...@snapnow
 # code to break on error and email not shown.
 zfs send -i voln...@snapbefore voln...@snapnow | pigz -p4 -1 > file
 # code to break on error and email not shown.
 scp /dir/file u...@remote:/dir/file
 # code to break on error and email not shown.
 ssh u...@remote gzip -t /dir/file
 # code to break on error and email not shown.
 ssh u...@remote 'gunzip < /dir/file | zfs receive volname'
 
 It works for me and it sends a minimum amount of data across the wire, which 
 is tested to minimize the chance of in-flight issues. Except on Sundays, when 
 we do a full send.

NB. deduped streams should further reduce the snapshot size.
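
For example, on builds that have deduplicated send support, the incremental
above becomes something like (hypothetical dataset/snapshot names):

  zfs send -D -i volname@snapbefore volname@snapnow | pigz -p4 -1 > file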
 -- richard

ZFS storage and performance consulting at http://www.RichardElling.com
ZFS training on deduplication, NexentaStor, and NAS performance
Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com 





___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Proposition of a new zpool property.

2010-03-22 Thread Svein Skogen

On 22.03.2010 18:10, Richard Elling wrote:

On Mar 22, 2010, at 7:30 AM, Svein Skogen wrote:


On 22.03.2010 13:54, Edward Ned Harvey wrote:

IIRC it's zpool scrub, and last time I checked, the zpool command
exited (with status 0) as soon as it had started the scrub. Your
command
would start _ALL_ scrubs in parallel as a result.


You're right.  I did that wrong.  Sorry 'bout that.

So either way, if there's a zfs property for scrub, that still doesn't
prevent multiple scrubs from running simultaneously.  So ...  Presently
there's no way to avoid the simultaneous scrubs either way, right?  You have
to home-cook scripts to detect which scrubs are running on which
filesystems, and serialize the scrubs.  With, or without the property.

Don't get me wrong - I'm not discouraging the creation of the property.  But
if you want to avoid simul-scrub, you'd first have to create a mechanism for
that, and then you could create the autoscrub.



Which is exactly why I wanted it cooked into the zfs code itself. zfs 
knows how many fs'es it's scrubbing.


Nit: ZFS does not scrub file systems.  ZFS scrubs pools.  In most deployments
I've done or seen there are very few pools, with many file systems.

For appliances like NexentaStor or Oracle's Sun OpenStorage platforms, the
default smallest unit of deployment is one disk. In other words, there is no
case where multiple scrubs compete for the resources of a single disk because
a single disk only participates in one pool. In general, resource management
works when you are resource constrained. Hence, it is quite acceptable to
implement concurrent scrubs.

Bottom line: systems engineering is still required for optimal system operation.
  -- richard


When you hook up a monstrosity like 96 disks (the limit of those 
Supermicro 2.5"-drive SAS enclosures discussed on this list recently) to 
two 4-lane SAS controllers, the bottleneck is likely to be your 
controller, your PCI Express bus, or your memory bandwidth. You still 
want to be able to put some constraints on how hard you're pushing the 
hardware. ;)


//Svein

--

Sending mail from a temporary set up workstation, as my primary W500 is 
off for service. PGP not installed.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Proposition of a new zpool property.

2010-03-22 Thread Richard Elling
On Mar 22, 2010, at 10:36 AM, Svein Skogen wrote:
 On 22.03.2010 18:10, Richard Elling wrote:
 On Mar 22, 2010, at 7:30 AM, Svein Skogen wrote:
 
 On 22.03.2010 13:54, Edward Ned Harvey wrote:
 IIRC it's zpool scrub, and last time I checked, the zpool command
 exited (with status 0) as soon as it had started the scrub. Your
 command
  would start _ALL_ scrubs in parallel as a result.
 
 You're right.  I did that wrong.  Sorry 'bout that.
 
 So either way, if there's a zfs property for scrub, that still doesn't
 prevent multiple scrubs from running simultaneously.  So ...  Presently
 there's no way to avoid the simultaneous scrubs either way, right?  You 
 have
 to home-cook scripts to detect which scrubs are running on which
 filesystems, and serialize the scrubs.  With, or without the property.
 
 Don't get me wrong - I'm not discouraging the creation of the property.  
 But
 if you want to avoid simul-scrub, you'd first have to create a mechanism 
 for
 that, and then you could create the autoscrub.
 
 
  Which is exactly why I wanted it cooked into the zfs code itself. zfs 
  knows how many fs'es it's scrubbing.
 
 Nit: ZFS does not scrub file systems.  ZFS scrubs pools.  In most deployments
 I've done or seen there are very few pools, with many file systems.
 
 For appliances like NexentaStor or Oracle's Sun OpenStorage platforms, the
 default smallest unit of deployment is one disk. In other words, there is no
 case where multiple scrubs compete for the resources of a single disk because
 a single disk only participates in one pool. In general, resource management
 works when you are resource constrained. Hence, it is quite acceptable to
 implement concurrent scrubs.
 
 Bottom line: systems engineering is still required for optimal system 
 operation.
  -- richard
 
 When you hook up a monstrosity like 96 disks (the limit of those Supermicro 
 2.5"-drive SAS enclosures discussed on this list recently) to two 4-lane 
 SAS controllers, the bottleneck is likely to be your controller, your 
 PCI Express bus, or your memory bandwidth. You still want to be able to put 
 some constraints on how hard you're pushing the hardware. ;)

Scrub tends to be a random workload dominated by IOPS, not bandwidth.
But if you are so inclined to create an unbalanced system...

Bottom line: systems engineering is still required for optimal system operation 
:-)
 -- richard

ZFS storage and performance consulting at http://www.RichardElling.com
ZFS training on deduplication, NexentaStor, and NAS performance
Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com 





___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Proposition of a new zpool property.

2010-03-22 Thread Bill Sommerfeld
On 03/22/10 11:02, Richard Elling wrote:
 Scrub tends to be a random workload dominated by IOPS, not bandwidth.

you may want to look at this again post build 128; the addition of
metadata prefetch to scrub/resilver in that build appears to have
dramatically changed how it performs (largely for the better).

- Bill
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Proposition of a new zpool property.

2010-03-22 Thread Richard Elling
On Mar 22, 2010, at 11:33 AM, Bill Sommerfeld wrote:
 On 03/22/10 11:02, Richard Elling wrote:
 Scrub tends to be a random workload dominated by IOPS, not bandwidth.
 
 you may want to look at this again post build 128; the addition of
 metadata prefetch to scrub/resilver in that build appears to have
 dramatically changed how it performs (largely for the better).

Yes, it is better.  But still nowhere near platter speed.  All it takes is
one little seek...
 -- richard

ZFS storage and performance consulting at http://www.RichardElling.com
ZFS training on deduplication, NexentaStor, and NAS performance
Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com 



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] snapshots as versioning tool

2010-03-22 Thread Harry Putnam
This may be a bit dimwitted since I don't really understand how
snapshots work.  I mean the part concerning COW (copy on write) and
how it takes so little room.

But here I'm not asking about that.

It appears to me that the default snapshot setup shares some aspects
of a vcs (version control system) tool.

I wonder if any of you use it that way.

Here is one thing I've considered but not done yet.

When I do video projects or any projects for that matter.
I sometimes want backups every 10 minutes or so, so as not to lose
some piece of script that isn't finished or the like.

Or with something like a flash project, you might want to make sure
you will be able to recover a version from a while back.

So doing the project on zfs filesystem (maybe as nfs or cifs mount)
would offer a way to do that.

I wondered if it would be possible to run a snapshot system
independent of the default one.  I mean so a default setup of auto
snapshotting would continue unaffected.

I'm thinking of scripting something like 10 minute snapshots during
the time I'm working on a project, then just turn it off when not
working on it.  When project is done... zap all those snapshots.

Am I missing something basic that makes this a poor use of zfs?

Oh, something I meant to ask... is there some standard way to tell
before calling for a snapshot, if the directory structure has changed
at all, other than aging I mean.  Is there something better than
running `diff -r [...]' between existing structure and last snapshot.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] snapshots as versioning tool

2010-03-22 Thread Matt Cowger
This is totally doable, and a reasonable use of zfs snapshots - we do some 
similar things.

You can easily determine if the snapshot has changed by checking the output of 
zfs list for the snapshot.

--M

-Original Message-
From: zfs-discuss-boun...@opensolaris.org 
[mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Harry Putnam
Sent: Monday, March 22, 2010 1:34 PM
To: zfs-discuss@opensolaris.org
Subject: [zfs-discuss] snapshots as versioning tool

This may be a bit dimwitted since I don't really understand how
snapshots work.  I mean the part concerning COW (copy on write) and
how it takes so little room.

But here I'm not asking about that.

It appears to me that the default snapshot setup shares some aspects
of a vcs (version control system) tool.

I wonder if any of you use it that way.

Here is one thing I've considered but not done yet.

When I do video projects or any projects for that matter.
I sometimes want backups every 10 minutes or so, so as not to lose
some piece of script that isn't finished or the like.

Or with something like a flash project, you might want to make sure
you will be able to recover a version from a while back.

So doing the project on zfs filesystem (maybe as nfs or cifs mount)
would offer a way to do that.

I wondered if it would be possible to run a snapshot system
independent of the default one.  I mean so a default setup of auto
snapshotting would continue unaffected.

I'm thinking of scripting something like 10 minute snapshots during
the time I'm working on a project, then just turn it off when not
working on it.  When project is done... zap all those snapshots.

Am I missing something basic that makes this a poor use of zfs?

Oh, something I meant to ask... is there some standard way to tell
before calling for a snapshot, if the directory structure has changed
at all, other than aging I mean.  Is there something better than
running `diff -r [...]' between existing structure and last snapshot.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] snapshots as versioning tool

2010-03-22 Thread Ian Collins

On 03/23/10 09:34 AM, Harry Putnam wrote:

This may be a bit dimwitted since I don't really understand how
snapshots work.  I mean the part concerning COW (copy on write) and
how it takes so little room.

But here I'm not asking about that.

It appears to me that the default snapshot setup shares some aspects
of a vcs (version control system) tool.

   
It does, but on a filesystem rather than file level.  Or to put it 
another way, less fine grained than a traditional VCS.



I wonder if any of you use it that way.

   
I do for things I don't change very often, such as system configuration 
files.  I always snapshot my root pool before making any changes to 
files under /etc for example.



Here is one thing I've considered but not done yet.

When I do video projects or any projects for that matter.
I sometimes want backups every 10 minutes or so, so as not to lose
some piece of script that isn't finished or the like.

Or with something like a flash project, you might want to make sure
you will be able to recover a version from a while back.

So doing the project on zfs filesystem (maybe as nfs or cifs mount)
would offer a way to do that.

I wondered if it would be possible to run a snapshot system
independent of the default one.  I mean so a default setup of auto
snapshotting would continue unaffected.

   
You can, but I think you would be better off using a traditional VCS 
(such as Subversion) that works well with binary files.  If you have to 
work in windows, this is your best option (Tortoise SVN is the only 
reason I know to use windows!).

I'm thinking of scripting something like 10 minute snapshots during
the time I'm working on a project, then just turn it off when not
working on it.  When project is done... zap all those snapshots.

Am I missing something basic that makes this a poor use of zfs?

   
You don't really get to track versions of a file.  I find I commit very 
frequently (as soon as a new test passes) and use SVN as an undo if I 
mess up a change.  Tying commits to changes is different from tying them 
to time.



Oh, something I meant to ask... is there some standard way to tell
before calling for a snapshot, if the directory structure has changed
at all, other than aging I mean.  Is there something better than
running `diff -r [...]' between existing structure and last snapshot.

   

Not really; a ZFS diff is in the works, but not here yet.

--
Ian.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] snapshots as versioning tool

2010-03-22 Thread Edward Ned Harvey
 This may be a bit dimwitted since I don't really understand how
 snapshots work.  I mean the part concerning COW (copy on write) and
 how it takes so little room.

COW and snapshots are very simple to explain.  Suppose you're chugging along
using your filesystem, and then at one moment you tell the filesystem to
"freeze."  Well, suppose a minute later you tell the FS to overwrite some
block that's in use already.  Instead of overwriting the actual block on
disk, the FS will write to some unused space, and report back to you that
the operation is completed.  So now there's a copy of the block as it was
at the moment of the freeze, and there's another copy of the block as it
looks later in time.  The FS only needs to freeze the FS tables, to remember
which blocks belonged to which files in each of the snapshots.  Hence, Copy
On Write.

That being said, it's an inaccurate description to say COW "takes so little
room."  If anything, it takes more room than a filesystem which can't do
COW, because the FS must not delete any of the old blocks belonging to any
of the old snapshots of the filesystem.  The more frequently you take
snapshots, the older your oldest snap is, and the more volatile your
data is (changing large sequences of blocks rapidly), the more disk space
will be consumed.  No block can be freed as long as any one of the snaps
references it.

But suppose you have n snapshots.  In a non-COW filesystem, you would have
n-times the data.  While in COW, you still have 1x the total used data size,
plus the byte differentials necessary to resurrect any/all of the old
snapshots.
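
A rough way to see this accounting in action (hypothetical dataset and file
names, not from the thread):

  zfs snapshot tank/data@before
  # overwrite ~100MB of an existing file in place
  dd if=/dev/urandom of=/tank/data/somefile bs=1024k count=100 conv=notrunc
  zfs list -o name,used,refer tank/data tank/data@before

The snapshot's USED column grows by roughly the amount overwritten, because
the old blocks are now referenced only by the snapshot and cannot be freed.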


 I'm thinking of scripting something like 10 minute snapshots during
 the time I'm working on a project, then just turn it off when not
 working on it.  When project is done... zap all those snapshots.

Yup, that's absolutely easy.  Just set up a cron job to snap every 10
minutes, using a unique string in the snapname, like @myprojectsnap ...
and when you're all done, you zfs destroy anything which matches
@myprojectsnap
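
A minimal sketch of that (using the z3/projects dataset name mentioned later
in the thread; note Solaris cron has no */10 shorthand):

/root/bin/projectsnap.sh (hypothetical helper script):
#!/usr/bin/bash
# take a uniquely named project snapshot
zfs snapshot z3/projects@myprojectsnap-$(date +%Y%m%d%H%M)

Crontab entry, enabled while you're working on the project:
0,10,20,30,40,50 * * * * /root/bin/projectsnap.sh

Cleanup when the project is done:
zfs list -H -o name -t snapshot -r z3/projects | grep '@myprojectsnap' | xargs -n 1 zfs destroy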


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Proposition of a new zpool property.

2010-03-22 Thread Edward Ned Harvey
 In other words, there
 is no
 case where multiple scrubs compete for the resources of a single disk
 because
 a single disk only participates in one pool. 

Excellent point.  However, the problem scenario was described as a SAN.  I can
easily imagine a scenario where some SAN administrator created a pool of
raid 5+1 or raid 0+1, and the pool is divided up into 3 LUNs which are
presented to 3 different machines.  Hence, when Machine A is hammering on
the disks, it could also affect Machine B or C.

The catch that I keep repeating is that even a zfs property couldn't
possibly solve that problem.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] snapshots as versioning tool

2010-03-22 Thread Harry Putnam
Matt Cowger mcow...@salesforce.com writes:

 This is totally doable, and a reasonable use of zfs snapshots - we
 do some similar things.

Good, thanks for the input.

 You can easily determine if the snapshot has changed by checking the
 output of zfs list for the snapshot.

Do you mean to just grep it out of the output of 

  zfs list -t snapshot

Or is there some finer grained way to get it?

  (I mean barring feeding the exact snapshot name to
  zfs list [ which would mean finding the name first, of course] )

Here, it appears adding anything more to that command line causes it to
fail.

   zfs list -t  snapshot z3/projects
   cannot open 'z3/projects': operation not applicable to 
   datasets of this type 

An example command line from your usage might be handy.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] snapshots as versioning tool

2010-03-22 Thread Matt Cowger
zfs list | grep '@'  

zpool/f...@1154758324G  -   461G  -
zpool/f...@1208482   6.94G  -   338G  -
zpool/f...@daily.netbackup   1.07G  -   344G  -
zpool/f...@11547581.77G  -   242G  -
zpool/f...@12084822.26G  -   261G  -
zpool/f...@daily.netbackup 323M  -   266G  -

The first column of numbers there shows the size of the snapshot (i.e. how much has changed).
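
If you want to scope it to a single dataset, adding -r to the command Harry
tried should do it:

  zfs list -H -o name,used -t snapshot -r z3/projects

lists just that dataset's snapshots (and those of its descendants) along with
how much each has changed.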


-Original Message-
From: zfs-discuss-boun...@opensolaris.org 
[mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Harry Putnam
Sent: Monday, March 22, 2010 2:23 PM
To: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] snapshots as versioning tool

Matt Cowger mcow...@salesforce.com writes:

 This is totally doable, and a reasonable use of zfs snapshots - we
 do some similar things.

Good, thanks for the input.

 You can easily determine if the snapshot has changed by checking the
 output of zfs list for the snapshot.

Do you mean to just grep it out of the output of 

  zfs list -t snapshot

Or is there some finer grained way to get it?

  (I mean barring feeding the exact snapshot name to
  zfs list [ which would mean finding the name first, of course] )

Here, it appears adding anything more to that command line causes it to
fail.

   zfs list -t  snapshot z3/projects
   cannot open 'z3/projects': operation not applicable to 
   datasets of this type 

An example command line from your usage might be handy.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] snapshots as versioning tool

2010-03-22 Thread Brandon High
On Mon, Mar 22, 2010 at 1:58 PM, Ian Collins i...@ianshome.com wrote:

 On 03/23/10 09:34 AM, Harry Putnam wrote:

 Oh, something I meant to ask... is there some standard way to tell
 before calling for a snapshot, if the directory structure has changed
 at all, other than aging I mean.  Is there something better than
 running `diff -r [...]' between existing structure and last snapshot.



 Not really, there is ZFS diff is in the woks, but not here yet.



Someone pointed out that you can use bart, but that also scans the
directories. It might do what you want, but it doesn't work at the zpool/zfs
level, just at the file level.

-B

-- 
Brandon High : bh...@freaks.com
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Proposition of a new zpool property.

2010-03-22 Thread Brandon High
On Mon, Mar 22, 2010 at 12:21 PM, Richard Elling
richard.ell...@gmail.com wrote:

 Yes, it is better.  But still nowhere near platter speed.  All it takes is
 one little seek...



True, dat. I find that scrubs start very slow (< 20MB/s) with the disks at
near-100% utilization. Towards the end of the scrub, speeds are up in the
250+ MB/s range. It's on very slow disk (8x WD Green), so the seek penalty
is high.

I suspect this is because data and metadata has been scattered across the
disk due to churn from snapshots, etc. I've never noticed a slowdown in
regular use though, in fact local disk on my clients tends to be the
bottleneck when copying files.

-B

-- 
Brandon High : bh...@freaks.com
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS send and receive corruption across a WAN link?

2010-03-22 Thread Brandon High
On Mon, Mar 22, 2010 at 10:26 AM, Richard Elling
richard.ell...@gmail.com wrote:

 NB. deduped streams should further reduce the snapshot size.


I haven't seen a lot of discussion on the list regarding send dedup, but I
understand it'll use the DDT if you have dedup enabled on your dataset.
What's the process and penalty for using it on a dataset that is not already
deduped? Does it build a DDT for just the data in the send?

-B

-- 
Brandon High : bh...@freaks.com
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] CR 6880994 and pkg fix

2010-03-22 Thread Frank Middleton

On 03/21/10 03:24 PM, Richard Elling wrote:
 

I feel confident we are not seeing a b0rken drive here.  But something is
clearly amiss and we cannot rule out the processor, memory, or controller.


Absolutely no question of that, otherwise this list would be flooded :-).
However, the purpose of the post wasn't really to diagnose the hardware
but to ask about the behavior of ZFS under certain error conditions.


Frank reports that he sees this on the same file, /lib/libdlpi.so.1, so I'll go 
out
on a limb and speculate that there is something in the bit pattern for that
file that intermittently triggers a bit flip on this system. I'll also 
speculate that
this error will not be reproducible on another system.


Hopefully not, but you never know :-). However, this instance is different.
The example you quote shows both expected and actual checksums to be
the same. This time the expected and actual checksums are different and
fmdump isn't flagging any bad_ranges or set-bits (the behavior you observed
is still happening, but orthogonal to this instance at different times and not
always on this file).

Since the file itself is OK, and the expected checksums are always the same,
neither the file nor the metadata appears to be corrupted, so it appears
that both are making it into memory without error.

It would seem therefore that it is the actual checksum calculation that is
failing. But, only at boot time, the calculated (bad) checksums differ (out
of 16, 10, 3, and 3 are the same [1]) so it's not consistent. At this point it
would seem to be cpu or memory, but why only at boot? IMO it's an
old and feeble power supply under strain pushing cpu or memory to a
margin not seen during normal operation, which could be why diagnostics
never see anything amiss (and the importance of a good power supply).

FWIW the machine passed everything vts could throw at it for a couple
of days. Anyone got any suggestions for more targeted diagnostics?

There were several questions embedded in the original post, and I'm not
sure any of them have really been answered:

o Why is the file flagged by ZFS as fatally corrupted still accessible?
   [is this new behavior from b111b vs b125?].

o What possible mechanism could there be for the /calculated/ checksums
   of /four/ copies of just one specific file to be bad and no others?

o Why did this only happen at boot to just this one file which also is
   peculiarly subject to the bitflips you observed, also mostly at boot
  (sometimes at scrub)? I like the feeble power supply answer, but why
  just this one file? Bizarre...

# zpool get  failmode rpool
NAME   PROPERTY  VALUE SOURCE
rpool  failmode  wait  default

This machine is extremely memory limited, so I suspect that libdlpi.so.1 is
not in a cache. Certainly, a brand new copy wouldn't be, and there's no
problem writing and (much later) reading the new copy (or the old one,
for that matter). It remains to be seen if the brand new copy gets clobbered
at boot (the machine, for all its faults, remains busily up and operational
for months at a time). Maybe I should schedule a reboot out of curiosity :-).


This sort of specific error analysis is possible after b125. See CR6867188
for more details.


Wasn't this in b125? IIRC we upgraded to b125 for this very reason. There
certainly seems to be an overwhelming amount of data in the various logs!

Cheers -- Frank

[1]  This could be (3+1) * 4 where in one instance all 3+1 happen to be the
same. Does ZFS really read all 4 copies 4 times (by fmdump timestamp, 8
within 1uS, 40mS later, another 8,  again within 1uS)? Not sure what the
fmdump timestamps mean, so it's hard to find any pattern.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS send and receive corruption across a WAN link?

2010-03-22 Thread Lori Alt

On 03/22/10 05:04 PM, Brandon High wrote:
On Mon, Mar 22, 2010 at 10:26 AM, Richard Elling 
richard.ell...@gmail.com wrote:


NB. deduped streams should further reduce the snapshot size.


I haven't seen a lot of discussion on the list regarding send dedup, 
but I understand it'll use the DDT if you have dedup enabled on your 
dataset.
The send code (which is user-level) builds its own DDT no matter what, 
but it will use existing checksums if on-disk dedup is already in effect.


What's the process and penalty for using it on a dataset that is not 
already deduped?


The penalty is the cost of doing the checksums.


Does it build a DDT for just the data in the send?


Yes, currently limited to 20% of physical memory size.

Lori
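
For reference, a sketch of the invocation (dataset and host names below are
placeholders, and it assumes a build new enough to have the -D flag):

# full deduplicated stream; the sender-side DDT is built on the fly
zfs send -D tank/data@snap1 | ssh backuphost zfs receive -d backup

# incremental deduplicated stream between two snapshots
zfs send -D -i tank/data@snap1 tank/data@snap2 | ssh backuphost zfs receive -d backup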




-B

--
Brandon High : bh...@freaks.com


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
   


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS send and receive corruption across a WAN link?

2010-03-22 Thread Nicolas Williams
On Thu, Mar 18, 2010 at 10:38:00PM -0700, Rob wrote:
 Can a ZFS send stream become corrupt when piped between two hosts
 across a WAN link using 'ssh'?

No.  SSHv2 uses HMAC-MD5 and/or HMAC-SHA-1, depending on what gets
negotiated, for integrity protection.  The chances of random on-the-wire
corruption going undetected by link-layer CRCs, TCP's checksum, and SSHv2's
MACs are infinitesimally small.  I suspect the chances of local bit
flips due to cosmic rays and what not are higher.

A bigger problem is that SSHv2 connections do not survive corruption on
the wire.  That is, if corruption is detected then the connection gets
aborted.  If you were zfs send'ing 1TB across a long, narrow link and
corruption hit the wire while sending the last block, you'd have to
re-send the whole thing (but even then such corruption would still have
to get past link-layer and TCP checksums -- I've seen it happen, so it
is possible, but it is also unlikely).
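
If that matters, one workaround is to stage the stream in a file so a dropped
connection only costs the copy, not the send (a sketch; the paths and host
name are made up, and it assumes digest(1) and rsync are available):

# sender: capture the stream once, checksum it, copy with a resumable tool
zfs send tank/data@snap > /var/tmp/data_snap.zfs
digest -a sha256 /var/tmp/data_snap.zfs > /var/tmp/data_snap.zfs.sha256
rsync --partial /var/tmp/data_snap.zfs /var/tmp/data_snap.zfs.sha256 remote:/var/tmp/

# receiver: compare the digest against the shipped one, then receive
digest -a sha256 /var/tmp/data_snap.zfs
cat /var/tmp/data_snap.zfs.sha256
zfs receive -d tank < /var/tmp/data_snap.zfs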

Nico
-- 
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] snapshots as versioning tool

2010-03-22 Thread Edward Ned Harvey
  You can easily determine if the snapshot has changed by checking the
  output of zfs list for the snapshot.
 
 Do you mean to just grep it out of the output of
 
   zfs list -t snapshot

I think the point is: you can easily tell how many MB changed in a
snapshot, and therefore you can easily tell that the snapshot changed. But,
unfortunately, you can't easily tell which files changed. Yet.
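
For what it's worth, a sketch of the yes/no check (dataset names are
placeholders): a snapshot's used property is the space held only by that
snapshot, so a nonzero and growing value means blocks have changed since it
was taken.

# human-readable overview
zfs list -t snapshot -o name,used,creation -r tank/home

# script-friendly: exact byte count for one snapshot
zfs get -Hp -o value used tank/home@monday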

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] zfs send/receive and file system properties

2010-03-22 Thread Len Zaifman
I am trying to coordinate properties and data between 2 file servers.

on file server 1 I have:

zfs get all zfs52/export/os/sles10sp2
NAME                       PROPERTY       VALUE                                      SOURCE
zfs52/export/os/sles10sp2  type           filesystem                                 -
zfs52/export/os/sles10sp2  creation       Mon Mar 22 15:28 2010                      -
zfs52/export/os/sles10sp2  used           662M                                       -
zfs52/export/os/sles10sp2  available      49.4G                                      -
zfs52/export/os/sles10sp2  referenced     661M                                       -
zfs52/export/os/sles10sp2  compressratio  2.88x                                      -
zfs52/export/os/sles10sp2  mounted        yes                                        -
zfs52/export/os/sles10sp2  quota          50G                                        local
zfs52/export/os/sles10sp2  mountpoint     /export/os/sles10sp2                       local
zfs52/export/os/sles10sp2  sharenfs       r...@192.168.0.0/16,ro...@192.168.0.0/24   inherited from zfs52/export/os
zfs52/export/os/sles10sp2  checksum       on                                         default
zfs52/export/os/sles10sp2  compression    gzip                                       local
...

I use
zfs send zfs52/export/os/sles10...@hpffs52_201003221747 | ssh -c blowfish 
hpffs51 zfs receive -d zfs51

to copy it to another system, with the same mountpoint, sharenfs, quota and
compression properties.

on the other system I see:


zfs get all zfs51/export/os/sles10sp2
NAME                       PROPERTY       VALUE                  SOURCE
zfs51/export/os/sles10sp2  type           filesystem             -
zfs51/export/os/sles10sp2  creation       Mon Mar 22 20:00 2010  -
zfs51/export/os/sles10sp2  used           1.76G                  -
zfs51/export/os/sles10sp2  available      10.5T                  -
zfs51/export/os/sles10sp2  referenced     1.76G                  -
zfs51/export/os/sles10sp2  compressratio  1.00x                  -
zfs51/export/os/sles10sp2  mounted        yes                    -
zfs51/export/os/sles10sp2  quota          none                   default
zfs51/export/os/sles10sp2  mountpoint     /export/os/sles10sp2   inherited from zfs51/export
zfs51/export/os/sles10sp2  sharenfs       r...@192.168.0.0/16:@172.16.20.0/24:hpffs24-bkup:hpffs01-bkup,ro...@192.168.0.0/24:@172.16.20.0/24:hpffs24-bkup:hpffs01-bkup  inherited from zfs51/export
zfs51/export/os/sles10sp2  checksum       on                     default
zfs51/export/os/sles10sp2  compression    off                    default

The sharenfs and mountpoint settings came across fine, but what happened to
compression and quota?
Is there an option I need?
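
One thing I plan to try next (a sketch, assuming this build's zfs send
supports the -R flag): a replication stream carries locally-set properties
such as compression and quota along with the snapshot, so the receiving side
should not fall back to its own defaults. It is the same send command as
above, with -R added:

zfs send -R zfs52/export/os/sles10...@hpffs52_201003221747 | \
    ssh -c blowfish hpffs51 zfs receive -d zfs51

# then confirm on the receiving side
zfs get -o property,value,source compression,quota zfs51/export/os/sles10sp2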

Len Zaifman
Systems Manager, High Performance Systems
The Centre for Computational Biology
The Hospital for Sick Children
555 University Ave.
Toronto, Ont M5G 1X8

tel: 416-813-5513
email: leona...@sickkids.ca

This e-mail may contain confidential, personal and/or health 
information(information which may be subject to legal restrictions on use, 
retention and/or disclosure) for the sole use of the intended recipient. Any 
review or distribution by anyone other than the person for whom it was 
originally intended is strictly prohibited. If you have received this e-mail in 
error, please contact the sender and delete all copies.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] LSISAS2004 support

2010-03-22 Thread Bart Nabbe
All,

I did some digging and I was under the impression that the mr_sas driver was
supposed to support the LSISAS2004 HBA controller from LSI.
I added the pci id to the driver aliases for mr_sas, but the driver still
shows up as unattached (see below).
Did I miss something, or was my assumption that this controller is supported
in the dev branch flawed?
I'm running:  SunOS 5.11 snv_134 i86pc i386 i86pc Solaris.

Thanks in advance for any pointers.


node name:  pci1000,3010
Vendor: LSI Logic / Symbios Logic
Device: SAS2004 PCI-Express Fusion-MPT SAS-2 
[Spitfire]
Sub-Vendor: LSI Logic / Symbios Logic
binding name:   pciex1000,70
devfs path: /p...@0,0/pci8086,3...@3/pci1000,3010
pci path:   3,0,0
compatible name:
(pciex1000,70.1000.3010.2)(pciex1000,70.1000.3010)(pciex1000,70.2)(pciex1000,70)(pciexclass,010700)(pciexclass,0107)(pci1000,70.1000.3010.2)(pci1000,70.1000.3010)(pci1000,3010)(pci1000,70.2)(pci1000,70)(pciclass,010700)(pciclass,0107)
driver name:mr_sas
driver state:   Detached
assigned-addresses: 81030010
reg:3
compatible: pciex1000,70.1000.3010.2
model:  Serial Attached SCSI Controller
power-consumption:  1
devsel-speed:   0
interrupts: 1
subsystem-vendor-id:1000
subsystem-id:   3010
unit-address:   0
class-code: 10700
revision-id:2
vendor-id:  1000
device-id:  70
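
The alias step was along these lines (a sketch, not a verbatim transcript;
whether mr_sas is really meant to bind a plain SAS2004 IT-mode HBA, rather
than mpt_sas, is part of my question):

# add the alias to the driver and ask it to attach
update_drv -a -i '"pciex1000,70"' mr_sas
devfsadm -i mr_sas

# check the binding afterwards
prtconf -D | grep -i sas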

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] CR 6880994 and pkg fix

2010-03-22 Thread Richard Elling
On Mar 22, 2010, at 4:21 PM, Frank Middleton wrote:

 On 03/21/10 03:24 PM, Richard Elling wrote:
 
 I feel confident we are not seeing a b0rken drive here.  But something is
 clearly amiss and we cannot rule out the processor, memory, or controller.
 
 Absolutely no question of that, otherwise this list would be flooded :-).
 However, the purpose of the post wasn't really to diagnose the hardware
 but to ask about the behavior of ZFS under certain error conditions.
 
 Frank reports that he sees this on the same file, /lib/libdlpi.so.1, so I'll 
 go out
 on a limb and speculate that there is something in the bit pattern for that
 file that intermittently triggers a bit flip on this system. I'll also 
 speculate that
 this error will not be reproducible on another system.
 
 Hopefully not, but you never know :-). However, this instance is different.
 The example you quote shows both expected and actual checksums to be
 the same.

Look again, the checksums are different.

 This time the expected and actual checksums are different and
 fmdump isn't flagging any bad_ranges or set-bits (the behavior you observed
 is still happening, but orthogonal to this instance at different times and not
 always on this file).

don't forget the -V flag :-)
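
For example, something along these lines (a sketch; the class pattern is an
assumption about this build's ereport names):

# verbose dump of ZFS checksum ereports, including expected/actual
# checksums, bad_ranges and set-bits
fmdump -eV -c ereport.fs.zfs.checksum | less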

 Since the file itself is OK, and the expected checksums are always the same,
 neither the file nor the metadata appear to be corrupted, so it appears
 that both are making it into memory without error.
 
 It would seem therefore that it is the actual checksum calculation that is
 failing. But, only at boot time, the calculated (bad) checksums differ (out
 of 16, 10, 3, and 3 are the same [1]) so it's not consistent. At this point it
 would seem to be cpu or memory, but why only at boot? IMO it's an
 old and feeble power supply under strain pushing cpu or memory to a
 margin not seen during normal operation, which could be why diagnostics
 never see anything amiss (and the importance of a good power supply).
 
 FWIW the machine passed everything vts could throw at it for a couple
 of days. Anyone got any suggestions for more targeted diagnostics?
 
 There were several questions embedded in the original post, and I'm not
 sure any of them have really been answered:
 
 o Why is the file flagged by ZFS as fatally corrupted still accessible?
   [is this new behavior from b111b vs b125?].
 
 o What possible mechanism could there be for the /calculated/ checksums
   of /four/ copies of just one specific file to be bad and no others?

Broken CPU, HBA, bus, or memory.

 o Why did this only happen at boot to just this one file which also is
   peculiarly subject to the bitflips you observed, also mostly at boot
  (sometimes at scrub)? I like the feeble power supply answer, but why
  just this one file? Bizarre...

Broken CPU, HBA, bus, memory, or power supply.

 # zpool get  failmode rpool
 NAME   PROPERTY  VALUE SOURCE
 rpool  failmode  wait  default
 
 This machine is extremely memory limited, so I suspect that libdlpi.so.1 is
 not in a cache. Certainly, a brand new copy wouldn't be, and there's no
 problem writing and (much later) reading the new copy (or the old one,
 for that matter). It remains to be seen if the brand new copy gets clobbered
 at boot (the machine, for all its faults, remains busily up and operational
 for months at a time). Maybe I should schedule a reboot out of curiosity :-).
 
 This sort of specific error analysis is possible after b125. See CR6867188
 for more details.
 
 Wasn't this in b125? IIRC we upgraded to b125 for this very reason. There
 certainly seems to be an overwhelming amount of data in the various logs!
 
 Cheers -- Frank
 
 [1]  This could be (3+1) * 4 where in one instance all 3+1 happen to be the
 same. Does ZFS really read all 4 copies 4 times (by fmdump timestamp, 8
 within 1uS, 40mS later, another 8,  again within 1uS)? Not sure what the
 fmdump timestamps mean, so it's hard to find any pattern.

Transient failures are some of the most difficult to track down. Not all 
transient failures are random.
  -- richard

ZFS storage and performance consulting at http://www.RichardElling.com
ZFS training on deduplication, NexentaStor, and NAS performance
Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com 





___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] pool use from network poor performance

2010-03-22 Thread homerun
Hi

I now have two pools:

rpool 2-way mirror  ( pata )
data 4-way raidz2   ( sata )

If I access the data pool over the network (smb, nfs, ftp, sftp, etc.)
I get only 200 KB/s at most.
Compared to rpool, which gives XX MB/s to and from the network, it is slow.

Any ideas what the reason might be, and how to track it down?

Locally the data pool works reasonably fast for me.
# date ; mkfile 1G testfile ; date
Tuesday, March 23, 2010 07:52:19 AM EET
Tuesday, March 23, 2010 07:52:36 AM EET
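
In case it helps, here is what I can run to narrow it down (a sketch; the
command choice is just my guess): first measure the raw network path with no
disks involved, then watch the pool while a transfer is running.

# raw network throughput, no storage in the path
dd if=/dev/zero bs=1024k count=100 | ssh client 'cat > /dev/null'

# while copying a large file over nfs/smb, watch the pool and the disks
zpool iostat -v data 5
iostat -xn 5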


Some information about the system:
# cat /etc/release
   OpenSolaris Development snv_134 X86
   Copyright 2010 Sun Microsystems, Inc.  All Rights Reserved.
Use is subject to license terms.
 Assembled 01 March 2010

# isainfo -v
64-bit amd64 applications
ahf sse3 sse2 sse fxsr amd_3dnowx amd_3dnow amd_mmx mmx cmov amd_sysc
cx8 tsc fpu
32-bit i386 applications
ahf sse3 sse2 sse fxsr amd_3dnowx amd_3dnow amd_mmx mmx cmov amd_sysc
cx8 tsc fpu

thanks
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss