Re: [zfs-discuss] ZFS send and receive corruption across a WAN link?

2010-03-19 Thread Fajar A. Nugraha
On Fri, Mar 19, 2010 at 12:38 PM, Rob slewb...@yahoo.com wrote:
 Can a ZFS send stream become corrupt when piped between two hosts across a 
 WAN link using 'ssh'?

Unless the end computers themselves are bad (memory problems, etc.), the
answer should be no. ssh has its own error-detection mechanism, and the
zfs send stream itself is checksummed.
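
For reference, the usual pipeline is something like the sketch below; the host
and dataset names are only placeholders:

# full send of one snapshot over ssh into a dataset on the remote pool
zfs snapshot tank/foo@now
zfs send tank/foo@now | ssh host.uk zfs receive tank/bar
# later runs can send just the delta between two snapshots
zfs send -i tank/foo@before tank/foo@now | ssh host.uk zfs receive tank/bar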

-- 
Fajar
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Q : recommendations for zpool configuration

2010-03-19 Thread homerun
Greetings

I would like to get your recommendation on how to set up a new pool.

I have 4 new 1.5TB disks reserved for a new zpool.
I planned to grow/replace the existing small 4-disk ( raidz ) setup with the new,
bigger one.

As the new pool will be bigger and will hold more personally important data to be
stored for a long time, I would like your recommendations: should I recreate the
pool, or just replace the existing devices?

I have noted there is now raidz2 and have been wondering which would be better:
a pool with 2 mirrors, or one pool with 4 disks in raidz2.

So, at least, could someone explain these new raidz configurations?

Thanks
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Thoughts on ZFS Pool Backup Strategies

2010-03-19 Thread Khyron
Ahhh, this has been...interesting...some real personalities involved in
this
discussion.  :p  The following is long-ish but I thought a re-cap was in
order.
I'm sure we'll never finish this discussion, but I want to at least have a
new
plateau or base from which to consider these questions.

I've just read through EVERY post to this thread, so I want to recap the
best
points in the vein of the original thread, and set a new base for continuing
the conversation.  Personally, I'm less interested in the archival case;
rather,
I'm looking for the best way to either recover from a complete system
failure
or recover an individual file or file set from some backup media, most
likely
tape.

Now let's put all of this together, along with some definitions.  First, the
difference between archival storage (to tape or other) and backup.  I think
the best definition provided in this thread came from Darren Moffat as well.

As Carsten Aulbert mentioned, this discussion is fairly useless until we
start
using the same terminology to describe a set of actions.

For this discussion, I am defining archival as taking the data and placing
it
on some media - likely tape, but not necessarily - in the simplest format
possible that could hopefully be read by another device in the future.  This
could exclude capturing NTFS/NFSv4/ZFS ACLs, Solaris extended attributes,
or zpool properties (aka metadata for purposes of this discussion).  With an
archive, we may not go back and touch the data for a long time, if ever
again.

Backup, OTOH, is the act of making a perfect copy of the data to some
media (in my interest tape, but again, not necessarily) which includes all
of
the metadata associated with that data.  Such a copy would allow perfect
re-creation of the data in a new environment, recovery from a complete
system failure, or single file (or file set) recovery.  With a backup, we
have
the expectation that we may need to return to it shortly after it is
created,
so we have to be able to trust it...now.  Data restored from this backup
needs to be an exact replica of the original source - ZFS pool and dataset
properties, extended attributes, and ZFS ACLs included.

Now that I hopefully have common definitions for this conversation (and
I hope I captured Darren's meaning accurately), I'll divide this into 2
sections,
starting with NDMP.

NDMP:

For those who are unaware (and to clarify my own understanding), I'll take
a moment to describe NDMP.  NDMP was invented by NetApp to allow direct
backup of their Filers to tape backup servers, and eventually onto tape.  It
is designed to remove the need for indirect backup by backing up the NFS
or CIFS shared file systems on the clients.  Instead, we backup the shared
file systems directly from the Filer (or other file server - say Fishworks
box
or OpenSolaris server) to the backup server via the network.  We avoid
multiple copies of the shared file systems.  NDMP is a network-based
delivery mechanism to get data from a storage server to a backup server,
which is why the backup software must also speak NDMP.  Hopefully, my
description is mostly accurate, and it is clear why this might be useful for
people using (Open)Solaris + ZFS for tape backup or archival purposes.

Darren Moffat made the point that NDMP could be used to do the tape
splitting, but I'm not sure this is accurate.  If zfs send from a file
server
running (Open)Solaris to a tape drive over NDMP  is viable -- which it
appears to be to me -- then the tape splitting would be handled by the
tape backup application.  In my world, that's typically NetBackup or some
similar enterprise offering.  I see no reason why it couldn't be Amanda or
Bacula or Arkeia or something else.  THIS is why I am looking for faster
progress on NDMP.

Now, NDMP doesn't do you much good for a locally attached tape drive,
as Darren and Svein pointed out.  However, provided the software which is
installed on this fictional server can talk to the tape in an appropriate
way,
then all you have to do is pipe zfs send into it.  Right?  What did I
miss?
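
For a locally attached drive I'm picturing something as simple as the sketch
below (tape device path and block size are just placeholders, not a
recommendation):

# write a replication stream straight to a non-rewinding tape device...
zfs send -R tank/data@backup | dd of=/dev/rmt/0n obs=1048576
# ...and pull it back off the same way
dd if=/dev/rmt/0n ibs=1048576 | zfs receive -d tank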

ZVOLs and NTFS/NFSv4/ZFS ACLs:

The answer is zfs send to both of my questions about ZVOLs and ACLs.

At the center of all of this attention is zfs send.  As Darren Moffat
pointed
out, it has all the pieces to do a proper, complete and correct backup.  The
big remaining issue that I see is how do you place a zfs send stream on a
tape in a reliable fashion.  CR 6936195 would seem to handle one complaint
from Svein, Miles Nordin and others about reliability of the send stream on
the tape.  Again, I think NDMP may help answer this question for file
servers without attached tape devices.  For those with attached tape
devices,
what's the equivalent answer?  Who is doing this, and how?  I believe we've
seen Ed Harvey say NetBackup and Ian Collins say NetVault.  Do these
products capture all the metadata required to call this copy a backup?
That's my next question.

Finally, Damon Atkins said:

But their needs to be 

Re: [zfs-discuss] Q : recommendations for zpool configuration

2010-03-19 Thread taemun
A pool with a 4-wide raidz2 is a completely nonsensical idea. It has the
same amount of accessible storage as two striped mirrors. And would be
slower in terms of IOPS, and be harder to upgrade in the future (you'd need
to keep adding four drives for every expansion with raidz2 - with mirrors
you only need to add another two drives to the pool).

Just my $0.02

On 19 March 2010 18:28, homerun petri.j.kunn...@gmail.com wrote:

 Greetings

 I would like to get your recommendation how setup new pool.

 I have 4 new 1.5TB disks reserved to new zpool.
 I planned to crow/replace existing small 4 disks ( raidz ) setup with new
 bigger one.

 As new pool will be bigger and will have more personally important data to
 be stored long time, i like to ask your recommendations should i create
 recreate pool or just replace existing devices.

 I have noted there is now raidz2 and been thinking witch woul be better.
 A pool with 2 mirrors or one pool with 4 disks raidz2

 So at least could some explain these new raidz configurations

 Thanks
 --
 This message posted from opensolaris.org
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Q : recommendations for zpool configuration

2010-03-19 Thread Edho P Arief
On Fri, Mar 19, 2010 at 2:34 PM, taemun tae...@gmail.com wrote:
 A pool with a 4-wide raidz2 is a completely nonsensical idea. It has the
 same amount of accessible storage as two striped mirrors. And would be
 slower in terms of IOPS, and be harder to upgrade in the future (you'd need
 to keep adding four drives for every expansion with raidz2 - with mirrors
 you only need to add another two drives to the pool).
 Just my $0.02


but it can survive the failure of any 2 disks in the pool.

In striped mirror:
mirror1
  diskA
  diskB
mirror2
  diskC
  diskD

If diskA and diskB (or diskC and diskD) fail together, the entire
pool is lost.

In raidz2:
raidz2-1
  diskA
  diskB
  diskC
  diskD

Any combination of 2 disks can fail at the same time and the pool will still be intact.
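
For comparison, the two layouts would be created something like this (disk
names are only placeholders):

# two striped 2-way mirrors
zpool create tank mirror c0t0d0 c0t1d0 mirror c0t2d0 c0t3d0

# one 4-disk raidz2 vdev
zpool create tank raidz2 c0t0d0 c0t1d0 c0t2d0 c0t3d0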


-- 
O ascii ribbon campaign - stop html mail - www.asciiribbon.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Q : recommendations for zpool configuration

2010-03-19 Thread Daniel Carosone
On Fri, Mar 19, 2010 at 06:34:50PM +1100, taemun wrote:
 A pool with a 4-wide raidz2 is a completely nonsensical idea.

No, it's not - not completely.

 It has the same amount of accessible storage as two striped mirrors. And 
 would be slower in terms of IOPS, and be harder to upgrade in the future 

All that is true.

If those things weren't as important to you as error recovery, raidz2
makes fine sense: a 4-way raidz2 can tolerate the loss of any 2 disks.
The mirror pool may die with the loss of the wrong 2 disks.

 Just my $0.02

Cost and benefit valuation are left to the user according to their
circumstances. 

--
Dan.


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Q : recommendations for zpool configuration

2010-03-19 Thread homerun
Thanks for comments

So the possible choices are:

1) 2 two-way mirrors
2) 4-disk raidz2

BTW, can raidz have a spare? If so, is there one more possible choice:
a 3-disk raidz with 1 spare?

Here I prefer data availability over performance.
And if I sometime need to expand or change the setup, that is a problem for that time.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS/OSOL/Firewire...

2010-03-19 Thread Khyron
I'm also a Mac user.  I use Mozy instead of DropBox, but it sounds like
DropBox should get a place at the table.  I'm about to download it in a few
minutes.

I'm right now re-cloning my internal HD due to some HFS+ weirdness.  I
have to completely agree that ZFS would be a great addition to MacOS X,
and the best imaginable replacement for HFS+.  The file system and
associated problems are my only complaint with the entire OS.  I guess my
browser usage pattern is just too much for HFS+.

Of course, I'm the only person I know who said that Sun should have
bought Apple 10 years ago.  What do I know?

Getting better FireWire performance on OpenSolaris would be nice though.
Darwin drivers are open...hmmm.

On Thu, Mar 18, 2010 at 18:19, David Magda dma...@ee.ryerson.ca wrote:

 On Mar 18, 2010, at 14:23, Bob Friesenhahn wrote:

  On Thu, 18 Mar 2010, erik.ableson wrote:


 Ditto on the Linux front.  I was hoping that Solaris would be the
 exception, but no luck.  I wonder if Apple wouldn't mind lending one of the
 driver engineers to OpenSolaris for a few months...


 Perhaps the issue is the filesystem rather than the drivers.  Apple users
 have different expectations regarding data loss than Solaris and Linux users
 do.


 Apple users (of which I am one) expect things to Just Work. :)

 And there are Apple users and Apple users:

 http://daringfireball.net/2010/03/ode_to_diskwarrior_superduper_dropbox

 If anyone Apple is paying attention, perhaps you could re-open discussions
 with now-Oracle about getting ZFS into Mac OS. :)


 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss




-- 
You can choose your friends, you can choose the deals. - Equity Private

If Linux is faster, it's a Solaris bug. - Phil Harman

Blog - http://whatderass.blogspot.com/
Twitter - @khyron4eva
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Q : recommendations for zpool configuration

2010-03-19 Thread Daniel Carosone
On Fri, Mar 19, 2010 at 12:59:39AM -0700, homerun wrote:
 Thanks for comments
 
 So possible choises are :
 
 1) 2 2-way mirros
 2) 4 disks raidz2
 
 BTW , can raidz have spare ? so is there one posible choise more :
 3 disks raidz with 1 spare ?

raidz2 is basically this, with a pre-silvered spare.  With an
unsilvered spare, once a disk fails you have no redundancy until the
resilver completes, and if there are latent errors on the remaining
non-redundant disks you may lose data.

Other choices:

 - 4way raidz3
 - 4way mirror

Same space and fault tolerance, different performance.  This is an
easier choice, closer (but still not completely) to the nonsensical.

Another choice again:

 - 2 separate pools, each a 2-disk mirror

Data in one pool, backed up regularly by snapshot replication to the
second. Same space as a 4-way mirror, but this has tolerance to some
other kinds of problems that a single pool does not. 

Better still would be a backup pool in another machine/site. Perhaps
the disks you are replacing can go to this purpose?
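
A minimal sketch of that replication, with pool and snapshot names as
placeholders:

# initial full copy of the data pool into the backup pool
zfs snapshot -r data@backup-1
zfs send -R data@backup-1 | zfs receive -Fd backup
# later runs send only the changes between the last two snapshots
zfs snapshot -r data@backup-2
zfs send -R -i data@backup-1 data@backup-2 | zfs receive -Fd backup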

--
Dan.



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS/OSOL/Firewire...

2010-03-19 Thread Erik Ableson
Funny, I thought the same thing up until a couple of years ago when I  
thought Apple should have bought Sun :-)


Cordialement,

Erik Ableson

+33.6.80.83.58.28
Envoyé depuis mon iPhone

On 19 mars 2010, at 09:41, Khyron khyron4...@gmail.com wrote:


Of course, I'm the only person I know who said that Sun should have
bought Apple 10 years ago.  What do I know?

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] dedup rollback taking a long time.

2010-03-19 Thread John
My  rollback finished yesterday after about 7.5 days. It still wasn't ready 
to receive the last snapshot, so I rm'ed all the files (took 14 hours) and then 
issued the rollback command again, 2 minutes this time.

Ok, I now have many questions, some due to a couple of responses (which don't 
appear on the http://opensolaris.org/jive website)

One response was.

I think it has been shown by others that dedup requires LOTS of RAM and 
to be safe, an SSD L2ARC, especially with large (multi-TB) datasets.  Dedup 
is still very new, too.  People seem to forget that.

The other was

My only suggestion is if the machine is still showing any disk activity to try 
adding more RAM.  I don't know this for a fact but it seems that destroying 
deduped data when the dedup table doesn't fit in RAM is pathologically slow 
because the entire table is traversed for every deletion, or at least enough of 
it to hit the disk on every delete. 
I've seen a couple of people report that the process was able to complete in a 
sane amount of time after adding more RAM.

This information is based on what I remember of past conversations and is all 
available in the archives as well.

I currently have 4 GB of RAM and can't get any more into this box (4 x 2 TB hard
drives), so it sounds like I need bigger hardware. The question is how much
more. According to one post I have read, the poster claimed that the dedup
table would fill 13.4GB for his 1.7 TB of file space. Assuming this is true (8GB
per 1TB), do modern servers have enough RAM to use dedup
effectively? Is an SSD fast enough, or does the whole DDT need to be held in RAM?

I am currently planning a new file server for the company, which needs
to have space for approx 16 TB of files (twice what we are currently using), and
it will need to be much more focused on performance. So would the 2 solutions
have similar performance, and what difference does turning on compression make?

Both will have 20 hard disks (2 rpool, 2 SSD cache, 14 data as mirrored
pairs, and 2 hot spares)

Non-dedup:
16 x 2 TB giving 14 TB of file system space (2 spares)
2 x 80 GB SSD cache
16 GB RAM (2 GB for the system, 14 GB for ZFS; is this fine for non-dedup?)

Dedup (I am getting a 2.7 ratio at the moment on the secondary backup):
14 x 1 TB giving 6 TB of file system space (dedup of 2.3 and 2 spare slots for
upgrade)
2 x 160 GB SSD cache
64 GB RAM (2 GB system, 6 GB ZFS, 48 GB DDT; yes, I know I can't separate ZFS and
the DDT.)

The second system will be more upgradeable/future proof, but do people think 
the performance would be similar?


Thanks

John
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to manage scrub priority or defer scrub?

2010-03-19 Thread Tonmaus
 
  sata disks don't understand the prioritisation, so

 Er, the point was exactly that there is no discrimination, once the
 request is handed to the disk.

So, are you saying that SCSI drives do understand prioritisation (i.e. TCQ honours
the schedule from ZFS) while SATA/NCQ drives don't, or does it just boil down
to what Richard told us, SATA disks being too slow?

 If the internal-to-disk queue is enough to keep the heads saturated /
 seek bound, then a new high-priority-in-the-kernel request will get to
 the disk sooner, but may languish once there.

Thanks. That makes sense to me.


 
 You can shorten the number of outstanding IO's per vdev for the pool
 overall, or preferably the number scrub will generate (to avoid
 penalising all IO).

That sounds like a meaningful approach to addressing bottlenecks caused by 
zpool scrub to me.

 The tunables for each of these should be found readily, probably in
 the Evil Tuning Guide.

I think I should try to digest the Evil Tuning Guide with respect to this topic
at some point. Thanks for pointing me in a direction. Maybe what you have
suggested above (shortening the number of I/Os issued by scrub) is already
possible? If not, I think it would be a meaningful improvement to request.

 Disks with write cache effectively do this [command queueing] for
 writes, by pretending they complete immediately, but reads would block
 the channel until satisfied.  (This is all for ATA, which lacked this
 before NCQ. SCSI has had these capabilities for a long time).

As scrub is about reads, are you saying that this is still a problem with 
SATA/NCQ drives, or not? I am unsure what you mean at this point.

   limiting the number of concurrent IO's handed to the disk to try
   and avoid saturating the heads.

  Indeed, that was what I had in mind. With the addition that I think
  it is as well necessary to avoid saturating other components, such
  as CPU.

 Less important, since prioritisation can be applied there too, but
 potentially also an issue.  Perhaps you want to keep the cpu fan
 speed/noise down for a home server, even if the scrub runs longer.

Well, the only thing that was really remarkable while scrubbing was CPU load
constantly near 100%. I still think that is at least contributing to the
collapse of concurrent payload. I.e., it's all about services that take place
in the kernel: CIFS, ZFS, iSCSI. Mostly, about concurrent load within ZFS
itself. That means an implicit trade-off while a file is being served over
CIFS, for example.

 
 AHCI should be fine.  In practice if you see actv > 1 (with a small
 margin for sampling error) then NCQ is working.

Ok, and how is that with respect to mpt? My assertion that mpt supports NCQ
is mainly based on the marketing information provided by LSI that these
controllers offer NCQ support with SATA drives. How (with which tool) do I get at
this actv parameter?
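
(My guess is that plain iostat shows it; something like the sketch below, where
actv is the average number of commands active on the device:)

# extended per-device statistics, 5-second intervals; watch the actv column
iostat -xn 5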

Regards,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Thoughts on ZFS Pool Backup Strategies

2010-03-19 Thread Darren J Moffat

Now, NDMP doesn't do you much good for a locally attached tape drive,
as Darren and Svein pointed out.  However, provided the software which is
installed on this fictional server can talk to the tape in an
appropriate way,
then all you have to do is pipe zfs send into it.  Right?  What did I
miss?


Actually there is a case where NDMP is useful when the tape drive is
locally attached: if the data server is an appliance onto which you cannot
(either technically, or by policy, or both) install any backup agents.
The SS7000 falls into this category.  The SS7000 allows for a
locally attached tape drive.  The backup control software runs on
another machine and talks to the local NDMP service to move the data from
local disk to local tape.


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Rethinking my zpool

2010-03-19 Thread Chris Dunbar - Earthside, LLC
Hello,

After being immersed in this list and other ZFS sites for the past few weeks I 
am having some doubts about the zpool layout on my new server. It's not too 
late to make a change, so I thought I would ask for comments. My current plan is
to have 12 x 1.5 TB disks in what I would normally call a RAID 10
configuration. That doesn't seem to be the right term here, but there are 6
sets of mirrored disks striped together. I know that smaller sets of disks 
are preferred, but how small is small? I am wondering if I should break this 
into two sets of 6 disks. I do have a 13th disk available as a hot spare. Would 
it be available for either pool if I went with two? Finally, would I be better 
off with raidz2 or something else instead of the striped mirrored sets? 
Performance and fault tolerance are my highest priorities.

Thank you,
Chris Dunbar 

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS send and receive corruption across a WAN link?

2010-03-19 Thread David Dyer-Bennet

On Fri, March 19, 2010 00:38, Rob wrote:
 Can a ZFS send stream become corrupt when piped between two hosts across a
 WAN link using 'ssh'?

 For example a host in Australia sends a stream to a host in the UK as
 follows:

 # zfs send tank/f...@now | ssh host.uk zfs receive tank/bar

In general, errors would be detected by TCP (or by lower-level hardware
media error-checking), and the packet retransmitted.  I'm not sure what
error-checking ssh does on top of that (if any).

However, these legacy mechanisms aren't guaranteed to give  you the
less-than-one-wrong-bit-in-10^15 level of accuracy people tend to want for
enterprise backups today (or am I off a couple of orders of magnitude
there?).  They were defined when data rates were much slower and data
volumes much lower.

In addition, memory errors on the receiving host (after the TCP stack
turns the data over to the application), if undetected, could leave you
with corrupted data; not sure what the probability is there.

Every scheme has SOME weak spots.  The well-designed ones at least tell 
you the bit error rate.

-- 
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Q : recommendations for zpool configuration

2010-03-19 Thread David Dyer-Bennet

On Fri, March 19, 2010 02:28, homerun wrote:
 Greetings

 I would like to get your recommendation how setup new pool.

 I have 4 new 1.5TB disks reserved to new zpool.
 I planned to crow/replace existing small 4 disks ( raidz ) setup with new
 bigger one.

 As new pool will be bigger and will have more personally important data to
 be stored long time, i like to ask your recommendations should i create
 recreate pool or just replace existing devices.

Replacing existing drives runs risks to the data -- you're deliberately
reducing yourself to no redundancy for a while (while the resilver
happens).  It would probably be faster, and definitely safer, to back up
the data, recreate the pool, and restore the data.

 I have noted there is now raidz2 and been thinking witch woul be better.
 A pool with 2 mirrors or one pool with 4 disks raidz2

A pool with 2 mirrors will have the same available space as a 4-disk
raidz2.  It will generally perform better.

For small numbers of disks, I'm a big fan of using mirrors rather than
RAIDZ.  I've got an 8-disk hot-swap bay currently occupied by 3 2-disk
pairs (with 2 slots for future expansion; maybe a hot spare, and a space
to attach an additional disk during upgrades).

When expanding a vdev by replacing devices, it can be done much more
safely with a mirror than a RAIDZ group.  With a mirror, you can attach a
THIRD disk (in fact you can attach any number; one guy wrote about
creating a 47-way mirror).  So, instead of replacing one disk with a
bigger one (eliminating your redundancy during the resilver), attach the
bigger one as a third disk.  When that resilver is done, you can attach
the other new disk, if you have bay space; or detach one of the small
disks and THEN attach the other new disk.  When the second resilver is
done, detach the last small disk, and you have now increased  your mirror
vdev size without ever reducing your redundancy below 2 copies.  There's
no equivalent process for a RAIDZ group.
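
In command form the sequence is roughly the following, with placeholder disk
names (c0t0d0/c0t1d0 are the old small disks, c0t2d0/c0t3d0 the new large
ones):

# attach a third (larger) disk to the existing mirror and let it resilver
zpool attach tank c0t0d0 c0t2d0
# when that resilver is done, attach the second new disk (briefly a 4-way mirror)
zpool attach tank c0t0d0 c0t3d0
# after the second resilver completes, detach the two small disks
zpool detach tank c0t0d0
zpool detach tank c0t1d0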

 So at least could some explain these new raidz configurations

RAIDZ is single parity -- one drive is redundant data.  A RAIDZ vdev
will withstand the failure of one drive without loss of data, but NOT the
failure of 2 or more.  A RAIDZ pool of N drives (all the same size) has
N-1 drives worth of available capacity.

RAIDZ2 is double parity -- two drives are given to redundant data.  A
RAIDZ2 vdev will withstand the failure of one or two drives without loss
of data, but NOT the failure of 3 or more.  A RAIDZ2 pool of N drives (all
the same size) has N-2 drives worth of available capacity.

A problem with modern large drives is that they take a long time to
resilver in case of failure and replacement.  During that period, if you
started with one redundant drive, you're down to no redundant drives,
meaning that a failure during the resilver could lose your data.  (This is
one of the many reasons you should have backups *in addition* to using
redundant vdevs).  This has driven people to develop higher levels of
redundancy in parity schemes, such as RAIDZ2 (and RAIDZ3).
-- 
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS send and receive corruption across a WAN link?

2010-03-19 Thread Bob Friesenhahn

On Fri, 19 Mar 2010, David Dyer-Bennet wrote:


However, these legacy mechanisms aren't guaranteed to give  you the
less-than-one-wrong-bit-in-10^15 level of accuracy people tend to want for
enterprise backups today (or am I off a couple of orders of magnitude
there?).  They were defined when data rates were much slower and data
volumes much lower.


Are you sure?  Have you done any research on this?  You are saying 
that NSA+-grade crypto on the stream is insufficient to detect a 
modification to the data?


It seems that the main failure mode would be disconnect by ssh.

Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Thoughts on ZFS Pool Backup Strategies

2010-03-19 Thread Joerg Schilling
Darren J Moffat darren.mof...@oracle.com wrote:

 That assumes you are writing the 'zfs send' stream to a file or file-
 like media.  In many cases people using 'zfs send' for their backup
 strategy are writing it back out using 'zfs recv' into another
 pool.  In those cases the files can even be restored over NFS/CIFS by
 using the .zfs/snapshot directory

If you unpack the datastream from zfs send on a machine at a different location,
one that is safe against e.g. a fire that destroys the main machine, you may call
it a backup.

  Star implements incremental backups and restores based on POSIX compliant
  archives.

 ZFS filesystem have functionality beyond POSIX and some of that is 
 really very important for some people (especially those using CIFS)

As I have mentioned many times in the past, star, in contrast to other archivers I
know, has the right infrastructure to allow support for additional
metadata to be added easily. The main problem seems to be that some people inside Sun
signal that they are not interested in star, and that this discourages customers
that do not maintain their own software infrastructure. Adding missing features, on
the other hand, only makes sense if there is interest in using these features.

 Does Star (or any other POSIX archiver) backup:
   ZFS ACLs ?

Now that libsec finally supports the needed features, it only needs to be
defined and implemented. I have been waiting for a few years for a discussion to
define the textual format to be used in the tar headers...

   ZFS system attributes (as used by the CIFS server and locally) ?

star does support such things for Linux and FreeBSD; the problem on Solaris is
that the documentation of the interfaces for this Solaris-local feature is poor.
The way Sun tar archives the attributes is non-portable.

Could you point to documentation?

   ZFS dataset properties (compression, checksum etc) ?

Where is the documentation of the interfaces?


 If it doesn't then it is providing an archive of the data in the 
 filesystem, not a full/incremental copy of the ZFS dataset.  Which 
 depending on the requirements of the backup may not be enough.  In 
 otherwords you have data/metadata missing from your backup.

 The only tool I'm aware of today that provides a copy of the data, and 
 all of the ZPL metadata and all the ZFS dataset properties is 'zfs send'.

I encourage you to collaborate... Provide information for documenting the
interfaces and help to discuss the archive format extensions for the missing
features.

Jörg

-- 
 EMail:jo...@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
   j...@cs.tu-berlin.de(uni)  
   joerg.schill...@fokus.fraunhofer.de (work) Blog: 
http://schily.blogspot.com/
 URL:  http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Thoughts on ZFS Pool Backup Strategies

2010-03-19 Thread Joerg Schilling
Mike Gerdts mger...@gmail.com wrote:

  another server, where the data is immediately fed through zfs receive then
  it's an entirely viable backup technique.

 Richard Elling made an interesting observation that suggests that
 storing a zfs send data stream on tape is a quite reasonable thing to
 do.  Richard's background makes me trust his analysis of this much
 more than I trust the typical person that says that zfs send output is
 poison.

If it is on tape you can restore the whole filesystem, provided you have a new,
empty one to restore to, but you cannot cover all the typical uses of backups.

Jörg

-- 
 EMail:jo...@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
   j...@cs.tu-berlin.de(uni)  
   joerg.schill...@fokus.fraunhofer.de (work) Blog: 
http://schily.blogspot.com/
 URL:  http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Thoughts on ZFS Pool Backup Strategies

2010-03-19 Thread Darren J Moffat



On 19/03/2010 14:57, joerg.schill...@fokus.fraunhofer.de wrote:

Darren J Moffatdarren.mof...@oracle.com  wrote:


That assumes you are writing the 'zfs send' stream to a file or file-
like media.  In many cases people using 'zfs send' for their backup
strategy are writing it back out using 'zfs recv' into another
pool.  In those cases the files can even be restored over NFS/CIFS by
using the .zfs/snapshot directory


If you unpack the datastream from zfs send on a machine on a different location
that is safe against e.g. a fire that destroys the main machine, you may call
it a backup.


I'm curious: why isn't a 'zfs send' stream that is stored on a tape considered
a backup, yet the implication is that a tar archive stored on a tape is
considered a backup?



ZFS system attributes (as used by the CIFS server and locally) ?


star does support such things for Linux and FreeBSD, the problem on Solaris is
that the documentation of the interfaces for this Solaris local feature is poor.
The was Sun tar archives the attibutes is non-portable.

Could you point to documentation?


getattrat(3C) / setattrat(3C)

Even has example code in it.

This is what ls(1) uses.


ZFS dataset properties (compression, checksum etc) ?


Where is the documentation of the interfaces?


There isn't any for those because the libzfs interfaces are currently 
still private.   The best you can currently do is to parse the output of 
'zfs list' eg.

zfs list -H -o compression rpool/export/home

Not ideal but it is the only publicly documented interface for now.
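
So a backup script that wants to keep the dataset properties alongside the
stream could, for now, do something like this (the dataset name is just an
example):

# scripted, exact-value dump of all properties, recursively, tab-separated
zfs get -rHp -o name,property,value,source all rpool/export/home > zfs-props.txt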

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Rethinking my zpool

2010-03-19 Thread Scott Meilicke
You will get much better random IO with mirrors, and better reliability when a 
disk fails with raidz2. Six sets of mirrors are fine for a pool. From what I 
have read, a hot spare can be shared across pools. I think the correct term 
would be load balanced mirrors, vs RAID 10.

What kind of performance do you need? Maybe raidz2 will give you the 
performance you need. Maybe not. Measure the performance of each configuration 
and decide for yourself. I am a big fan of iometer for this type of work.

-Scott
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS/OSOL/Firewire...

2010-03-19 Thread Bob Friesenhahn

On Fri, 19 Mar 2010, Khyron wrote:

Getting better FireWire performance on OpenSolaris would be nice though.
Darwin drivers are open...hmmm.


OS-X is only (legally) used on Apple hardware.  Has anyone considered 
that since Firewire is important to Apple, they may have selected a 
particular Firewire chip which performs particularly well?


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Is this a sensible spec for an iSCSI storage box?

2010-03-19 Thread Scott Meilicke
 One of the reasons I am investigating solaris for
 this is sparse volumes and dedupe could really help
 here.  Currently we use direct attached storage on
 the dom0s and allocate an LVM to the domU on
 creation.  Just like your example above, we have lots
 of those 80G to start with please volumes with 10's
 of GB unused.  I also think this data set would
 dedupe quite well since there are a great many
 identical OS files across the domUs.  Is that
 assumption correct?

This is one reason I like NFS - thin by default, and no wasted space within a 
zvol. zvols can be thin as well, but opensolaris will not know the inside 
format of the zvol, and you may still have a lot of wasted space after a while 
as files inside of the zvol come and go. In theory dedupe should work well, but 
I would be careful about a possible speed hit. 
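
(For reference, a zvol can be made thin with the -s flag at creation time;
names and size below are placeholders:)

# sparse (thin-provisioned) 80 GB volume for a guest
zfs create -s -V 80G tank/xen/domU1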


 I've not seen an example of that before.  Do you mean
 having two 'head units' connected to an external JBOD
 enclosure or a proper HA cluster type configuration
 where the entire thing, disks and all, are
 duplicated?

I have not done any type of cluster work myself, but from what I have read on 
Sun's site, yes, you could connect the same jbod to two head units, 
active/passive, in an HA cluster, but no duplicate disks/jbod. When the active 
goes down, passive detects this and takes over the pool by doing an import. 
During the import, any outstanding transactions on the zil are replayed, 
whether they are on a slog or not. I believe this is how Sun does it on their 
open storage boxes (7000 series). Note - two jbods could be used, one for each 
head unit, making an active/active setup. Each jbod is active on one node, 
passive on the other.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Thoughts on ZFS Pool Backup Strategies

2010-03-19 Thread Julian Regel
Damon (and others)


For those wanting the ability to perform file backups/restores along with all
metadata, without resorting to third party applications: if you have a Sun
support contract, log a call asking that your organisation be added to the list
of users who want to see RFE #5004379 (want comprehensive backup strategy)
implemented.

I logged this last month and was told there are now 5 organisations asking for 
this. Considering this topic seems to crop up regularly on zfs-discuss, I'm 
guessing the actual number of people is higher but people don't know how to 
register their interest.

JR



  ___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS send and receive corruption across a WAN link?

2010-03-19 Thread David Dyer-Bennet

On Fri, March 19, 2010 09:49, Bob Friesenhahn wrote:
 On Fri, 19 Mar 2010, David Dyer-Bennet wrote:

 However, these legacy mechanisms aren't guaranteed to give  you the
 less-than-one-wrong-bit-in-10^15 level of accuracy people tend to want
 for
 enterprise backups today (or am I off a couple of orders of magnitude
 there?).  They were defined when data rates were much slower and data
 volumes much lower.

 Are you sure?  Have you done any research on this?  You are saying
 that NSA+-grade crypto on the stream is insufficient to detect a
 modification to the data?

I was referring to the tcp and hardware-level checksums.  I specifically
said I didn't know if SSH did anything on top of that (other people have
since said that it does, and it might well be plenty good enough; also
that ZFS itself has checksums in the send stream).

I don't think of stream crypto as inherently including validity checking,
though in practice I suppose it would always be a good idea.

 It seems that the main failure mode would be disconnect by ssh.

Sure, can't guarantee against aborted connections at whatever level
(actual interruption of IP connectivity).  But those are generally
detected and reported as an error; one shouldn't be left with the
impression the transfer succeeded.

-- 
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Thoughts on ZFS Pool Backup Strategies

2010-03-19 Thread Joerg Schilling
Darren J Moffat darr...@opensolaris.org wrote:

 I'm curious, why isn't a 'zfs send' stream that is stored on a tape yet 
 the implication is that a tar archive stored on a tape is considered a 
 backup ?

You cannot get a single file out of the zfs send datastream.

 ZFS system attributes (as used by the CIFS server and locally) ?
 
  star does support such things for Linux and FreeBSD, the problem on Solaris 
  is
  that the documentation of the interfaces for this Solaris local feature is 
  poor.
  The was Sun tar archives the attibutes is non-portable.
 
  Could you point to documentation?

 getattrat(3C) / setattrat(3C)

 Even has example code in it.

 This is what ls(1) uses.

It would easily be possible to add portable support, integrated into the
framework that already supports FreeBSD and Linux attributes.


 ZFS dataset properties (compression, checksum etc) ?
 
  Where is the documentation of the interfaces?

 There isn't any for those because the libzfs interfaces are currently 
 still private.   The best you can currently do is to parse the output of 
 'zfs list' eg.
   zfs list -H -o compression rpool/export/home

 Not ideal but it is the only publicly documented interface for now.

As long as there is no interface that supports what I discussed with
Jeff Bonwick in September 2004:

-   A public interface to get the property state

-   A public interface to read the file raw in compressed form

-   A public interface to write the file raw in compressed form

I am not sure whether this is of relevance for a backup. If there is a need
to change these states on a per-directory basis, there is a need for an easy-to-use
public interface.

Jörg

-- 
 EMail:jo...@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
   j...@cs.tu-berlin.de(uni)  
   joerg.schill...@fokus.fraunhofer.de (work) Blog: 
http://schily.blogspot.com/
 URL:  http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] zpool I/O error

2010-03-19 Thread Grant Lowe
Hi all,

I'm trying to delete a zpool and when I do, I get this error:

# zpool destroy oradata_fs1
cannot open 'oradata_fs1': I/O error
# 

The pools I have on this box look like this:

#zpool list
NAME  SIZE   USED  AVAILCAP  HEALTH  ALTROOT
oradata_fs1   532G   119K   532G 0%  DEGRADED  -
rpool 136G  28.6G   107G21%  ONLINE  -
#

Why can't I delete this pool? This is on Solaris 10 5/09 s10s_u7.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS send and receive corruption across a WAN link?

2010-03-19 Thread Bob Friesenhahn

On Fri, 19 Mar 2010, David Dyer-Bennet wrote:


I don't think of stream crypto as inherently including validity checking,
though in practice I suppose it would always be a good idea.


This is obviously a vital and necessary function of ssh in order to 
defend against man in the middle attacks.  The main requirement is 
to make sure that the transferred data can not be deciphered or 
modified by something other than the two end-points.  I don't know if 
ssh includes retry logic to request that modified data be 
retransmitted.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Thoughts on ZFS Pool Backup Strategies

2010-03-19 Thread Darren J Moffat

On 19/03/2010 16:11, joerg.schill...@fokus.fraunhofer.de wrote:

Darren J Moffatdarr...@opensolaris.org  wrote:


I'm curious, why isn't a 'zfs send' stream that is stored on a tape yet
the implication is that a tar archive stored on a tape is considered a
backup ?


You cannot get a single file out of the zfs send datastream.


I don't see that as part of the definition of a backup - you obviously 
do - so we will just have to disagree on that.



ZFS system attributes (as used by the CIFS server and locally) ?


star does support such things for Linux and FreeBSD, the problem on Solaris is
that the documentation of the interfaces for this Solaris local feature is poor.
The was Sun tar archives the attibutes is non-portable.

Could you point to documentation?


getattrat(3C) / setattrat(3C)

Even has example code in it.

This is what ls(1) uses.


It could be easily possible to add portable support integrated into the
framework that already supports FreeBSD and Linux attributes.


Great, do you have a time frame for when you will have this added to 
star then ?



ZFS dataset properties (compression, checksum etc) ?


Where is the documentation of the interfaces?


There isn't any for those because the libzfs interfaces are currently
still private.   The best you can currently do is to parse the output of
'zfs list' eg.
zfs list -H -o compression rpool/export/home

Not ideal but it is the only publicly documented interface for now.


As long as there is no interface that supports what I did discuss with
Jeff Bonwick in September 2004:

-   A public interface to get the property state


That would come from libzfs.  There are private interfaces just now that
are very likely what you need: zfs_prop_get()/zfs_prop_set(). They aren't
documented or public though and are subject to change at any time.



-   A public interface to read the file raw in compressed form


I think you are missing something about how ZFS works here.  Files 
aren't in a compressed form.  Some blocks of a file may be compressed if 
compression is enabled on the dataset.  Note that for compression and 
checksum properties they only indicate what algorithm will be used to 
compress (or checksum) blocks for new writes. It doesn't say what 
algorithm the blocks of a given file are compressed with.  In fact for 
any given file some blocks may be compressed and some not.  The reasons 
for a block not being compressed include: 1) it didn't compress 2) it 
was written when compression=off 3) it didn't compress enough.
It is even possible, if the user changed the value of compression, that
blocks within a file are compressed with different algorithms.


So you won't ever get this because ZFS just doesn't work like that.

In fact even 'zfs send' doesn't store compressed data.  The 'zfs
send' stream has the blocks in the form in which they exist in the in-memory
ARC, i.e. uncompressed.


In kernel it is possible to ask for a block in its RAW (ie compressed) 
form but that is only for consumers of arc_read() and zio_read() - way 
way way below the ZPL layer and applications like star.



-   A public interface to write the file raw in compressed form


Not even a private API exists for this.  There is no capability to send 
a RAW (ie compressed) block to arc_write() or zio_write().



I am not sure whether this is of relevance for a backup. If there is a need
to change the states, on a directory base, there is a need for an easy to use
public interface.


I don't understand what you mean by that, can you give me an example.

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Thoughts on ZFS Pool Backup Strategies

2010-03-19 Thread David Dyer-Bennet

On Fri, March 19, 2010 11:33, Darren J Moffat wrote:
 On 19/03/2010 16:11, joerg.schill...@fokus.fraunhofer.de wrote:
 Darren J Moffatdarr...@opensolaris.org  wrote:

 I'm curious, why isn't a 'zfs send' stream that is stored on a tape yet
 the implication is that a tar archive stored on a tape is considered a
 backup ?

 You cannot get a single file out of the zfs send datastream.

 I don't see that as part of the definition of a backup - you obviously
 do - so we will just have to disagree on that.

I used to.  Now I think more in terms of getting it from a snapshot
maintained online on the original storage server.

The overall storage strategy has to include retrieving files lost due to
user error over some time period, whether that's months or years.  And
having to restore an entire 100TB backup to spare disk somewhere to get
one file is clearly not on.

-- 
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Thoughts on ZFS Pool Backup Strategies

2010-03-19 Thread Darren J Moffat

On 19/03/2010 17:19, David Dyer-Bennet wrote:


On Fri, March 19, 2010 11:33, Darren J Moffat wrote:

On 19/03/2010 16:11, joerg.schill...@fokus.fraunhofer.de wrote:

Darren J Moffatdarr...@opensolaris.org   wrote:


I'm curious, why isn't a 'zfs send' stream that is stored on a tape yet
the implication is that a tar archive stored on a tape is considered a
backup ?


You cannot get a single file out of the zfs send datastream.


I don't see that as part of the definition of a backup - you obviously
do - so we will just have to disagree on that.


I used to.  Now I think more in terms of getting it from a snapshot
maintained online on the original storage server.


Exactly!  The single file retrieval due to user error case is best 
achieved by an automated snapshot system.   ZFS+CIFS even provides 
Windows Volume Shadow Services so that Windows users can do this on 
their own.
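
On OpenSolaris the auto-snapshot (time-slider) services make that largely a
two-step setup, roughly (dataset name is a placeholder):

# opt the dataset in to automatic snapshots...
zfs set com.sun:auto-snapshot=true tank/home
# ...and enable the schedules you want
svcadm enable svc:/system/filesystem/zfs/auto-snapshot:hourly
svcadm enable svc:/system/filesystem/zfs/auto-snapshot:daily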



The overall storage strategy has to include retrieving files lost due to
user error over some time period, whether that's months or years.  And
having to restore an entire 100TB backup to spare disk somewhere to get
one file is clearly not on.


Completely agree; nowhere was I suggesting that 'zfs send' out to tape
should be the whole backup strategy.  I even pointed to a presentation
given at LOSUG that shows how someone is doing this.


I'll say it again: neither 'zfs send' nor (s)tar is an enterprise (or
even home) backup system on its own; one or both can be components of
the full solution.


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Thoughts on ZFS Pool Backup Strategies

2010-03-19 Thread David Dyer-Bennet

On Fri, March 19, 2010 12:25, Darren J Moffat wrote:
 On 19/03/2010 17:19, David Dyer-Bennet wrote:

 On Fri, March 19, 2010 11:33, Darren J Moffat wrote:
 On 19/03/2010 16:11, joerg.schill...@fokus.fraunhofer.de wrote:
 Darren J Moffatdarr...@opensolaris.org   wrote:

 I'm curious, why isn't a 'zfs send' stream that is stored on a tape
 yet
 the implication is that a tar archive stored on a tape is considered
 a
 backup ?

 You cannot get a single file out of the zfs send datastream.

 I don't see that as part of the definition of a backup - you obviously
 do - so we will just have to disagree on that.

 I used to.  Now I think more in terms of getting it from a snapshot
 maintained online on the original storage server.

 Exactly!  The single file retrieval due to user error case is best
 achieved by an automated snapshot system.   ZFS+CIFS even provides
 Windows Volume Shadow Services so that Windows users can do this on
 their own.

I'll need to look into that, when I get a moment.  Not familiar with
Windows Volume Shadow Services, but having people at home able to do this
directly seems useful.

 The overall storage strategy has to include retrieving files lost due to
 user error over some time period, whether that's months or years.  And
 having to restore an entire 100TB backup to spare disk somewhere to
 get
 one file is clearly not on.

 Completely agree, no where was I suggesting that 'zfs send' out to tape
 should be the whole backup strategy.  I even pointed to a presentation
 given at LOSUG that shows how someone is doing this.

Sorry, didn't mean to sound like I was arguing with you (or suggest we
disagreed in that area); I intended to pontificate on the problem in
general.

 I'll say it again: neither 'zfs send' or (s)tar is an enterprise (or
 even home) backup system on their own one or both can be components of
 the full solution.

I'm seeing what a lot of professional and serious amateur photographers
are building themselves for storage on a mailing list I'm on.  Nearly
always it consists of two layers of storage servers, often with one
off-site (most of them keep current photos on LOCAL disk, instead of my
choice of working directly off storage server disk).

I'm in the fortunate position of having my backups less than the size of a
large single drive; so I'm rotating three backup drives, and intend to be
taking one of them off-site regularly (still in the process of converting
to this new scheme; the previous scheme used off-site optical disks).  I
use ZFS for the removable drives, so I can if necessary reach into them
and drag out single files fairly easily if necessary (but necessary
would require something happening to the online snapshot first).  People
with much bigger configurations look like they save money using tape for
the archival / disaster restore storage, but it's not economically viable
at my level.

-- 
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Error in zfs list output?

2010-03-19 Thread Brandon High
I think I'm seeing an error in the output from zfs list with regards
to snapshot space utilization.

In the first list, there are 818M used by snapshots, but the snaps
listed aren't using anything close to that amount. If I destroy the
first snapshot, then the second one suddenly jumps in space used to
813M, which seems about right and the USEDSNAP column makes sense.

Is this a bug in snapshot accounting or reporting, or is there
something I missed?

r...@basestar:/export/vmware# zfs list -t all -r -o space
tank/export/volumes/caliban
NAME
 AVAIL   USED  USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD
tank/export/volumes/caliban
 3.09T  1.59G  818M    813M  0  0
tank/export/volumes/cali...@zfs-auto-snap:hourly-2010-03-19-10:00
 -  2.88M -   -  -  -
tank/export/volumes/cali...@zfs-auto-snap:hourly-2010-03-19-11:00
 -  2.80M -   -  -  -
tank/export/volumes/cali...@zfs-auto-snap:hourly-2010-03-19-12:00
 -   200K -   -  -  -
tank/export/volumes/cali...@zfs-auto-snap:frequent-2010-03-19-12:15
 -   174K -   -  -  -
tank/export/volumes/cali...@zfs-auto-snap:frequent-2010-03-19-12:30
 -   252K -   -  -  -
tank/export/volumes/cali...@zfs-auto-snap:frequent-2010-03-19-12:45
 -   340K -   -  -  -
tank/export/volumes/cali...@zfs-auto-snap:hourly-2010-03-19-13:00
 -  0 -   -  -  -
tank/export/volumes/cali...@zfs-auto-snap:frequent-2010-03-19-13:00
 -  0 -   -  -  -
r...@basestar:/export/vmware# zfs destroy
tank/export/volumes/cali...@zfs-auto-snap:hourly-2010-03-19-10:00
r...@basestar:/export/vmware# zfs list -t all -r -o space
tank/export/volumes/caliban
NAME
 AVAIL   USED  USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD
tank/export/volumes
 3.09T  39.3G 0   47.1K  0  39.3G
tank/export/volumes/caliban
 3.09T  1.59G  815M    813M  0  0
tank/export/volumes/cali...@zfs-auto-snap:hourly-2010-03-19-11:00
 -   813M -   -  -  -
tank/export/volumes/cali...@zfs-auto-snap:hourly-2010-03-19-12:00
 -   200K -   -  -  -
tank/export/volumes/cali...@zfs-auto-snap:frequent-2010-03-19-12:15
 -   174K -   -  -  -
tank/export/volumes/cali...@zfs-auto-snap:frequent-2010-03-19-12:30
 -   252K -   -  -  -
tank/export/volumes/cali...@zfs-auto-snap:frequent-2010-03-19-12:45
 -   340K -   -  -  -
tank/export/volumes/cali...@zfs-auto-snap:hourly-2010-03-19-13:00
 -  0 -   -  -  -
tank/export/volumes/cali...@zfs-auto-snap:frequent-2010-03-19-13:00
 -  0 -   -  -  -
r...@basestar:/export/vmware# zfs destroy
tank/export/volumes/cali...@zfs-auto-snap:hourly-2010-03-19-11:00
r...@basestar:/export/vmware# zfs list -t all -r -o space
tank/export/volumes/caliban
NAME
 AVAIL   USED  USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD
tank/export/volumes/caliban
 3.09T   815M 2.04M    813M  0  0
tank/export/volumes/cali...@zfs-auto-snap:hourly-2010-03-19-12:00
 -   200K -   -  -  -
tank/export/volumes/cali...@zfs-auto-snap:frequent-2010-03-19-12:15
 -   174K -   -  -  -
tank/export/volumes/cali...@zfs-auto-snap:frequent-2010-03-19-12:30
 -   252K -   -  -  -
tank/export/volumes/cali...@zfs-auto-snap:frequent-2010-03-19-12:45
 -   340K -   -  -  -
tank/export/volumes/cali...@zfs-auto-snap:hourly-2010-03-19-13:00
 -  0 -   -  -  -
tank/export/volumes/cali...@zfs-auto-snap:frequent-2010-03-19-13:00
 -  0 -   -  -  -


--
Brandon High : bh...@freaks.com
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS send and receive corruption across a WAN link?

2010-03-19 Thread Richard Jahnel
The way we do this here is:


zfs snapshot volname@snapnow
#code to break on error and email not shown.
zfs send -i volname@snapbefore volname@snapnow | pigz -p4 -1 > file
#code to break on error and email not shown.
scp /dir/file u...@remote:/dir/file
#code to break on error and email not shown.
ssh u...@remote gzip -t /dir/file
#code to break on error and email not shown.
ssh u...@remote gunzip < /dir/file | zfs receive volname

It works for me and it sends a minimum amount of data across the wire, which is 
tested to minimize the chance of in-flight issues. Except on Sundays, when we do 
a full send.
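
For anyone wondering what the elided error handling might look like, here is a
minimal sketch under stated assumptions: bash, a placeholder volume "volname",
a placeholder host "remote", and a hypothetical mail alias, none of which are
Richard's actual names:

#!/usr/bin/bash
# Minimal error-handling sketch; volname, remote and the mail alias are
# hypothetical placeholders, not the real script.
set -o pipefail
fail() { echo "$1" | mailx -s "zfs send failed" storage-alerts@example.com; exit 1; }

zfs snapshot volname@snapnow                                || fail "snapshot failed"
zfs send -i volname@snapbefore volname@snapnow | pigz -p4 -1 > /dir/file \
                                                            || fail "send/compress failed"
scp /dir/file user@remote:/dir/file                         || fail "scp failed"
ssh user@remote "gzip -t /dir/file"                         || fail "remote archive corrupt"
ssh user@remote "gunzip < /dir/file | zfs receive volname"  || fail "receive failed"
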
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS send and receive corruption across a WAN link?

2010-03-19 Thread Ian Collins

On 03/20/10 09:28 AM, Richard Jahnel wrote:

The way we do this here is:


zfs snapshot volname@snapnow
#code to break on error and email not shown.
zfs send -i volname@snapbefore volname@snapnow | pigz -p4 -1 > file
#code to break on error and email not shown.
scp /dir/file u...@remote:/dir/file
#code to break on error and email not shown.
ssh u...@remote gzip -t /dir/file
#code to break on error and email not shown.
ssh u...@remote gunzip < /dir/file | zfs receive volname

It works for me and it sends a minimum amount of data across the wire, which is 
tested to minimize the chance of in-flight issues. Except on Sundays, when we do 
a full send.

Don't you trust the stream checksum?

--
Ian.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS send and receive corruption across a WAN link?

2010-03-19 Thread Richard Jahnel
no, but I'm slightly paranoid that way. ;)
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Error in zfs list output?

2010-03-19 Thread Miles Nordin
 bh == Brandon High bh...@freaks.com writes:

bh I think I'm seeing an error in the output from zfs list with
bh regards to snapshot space utilization.

No bug.  You just need to think harder about it: the space used cannot
be neatly put into buckets next to each snapshot that add up to the
total, just because of...math.  To help understand, suppose you
decide, just to fuck things up, that from now on every time you take a
snapshot you take two snapshots, with exactly zero filesystem writing
happening between the two.  What do you want 'zfs list' to say now?

What does happen if you do that is that it says all those snapshots use
zero space.

The space shown in zfs list is the amount you'd get back if you
deleted this one snapshot.  Yes, every time you delete a snapshot, all
the numbers reshuffle.  Yes, there is a whole cat's cradle of space
accounting information hidden in there that does not come out through
'zfs list'.
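
To see this concretely, here is a minimal sketch; tank/demo is an assumed test
dataset, not anything from Brandon's system:

# Hypothetical demonstration of snapshot space accounting.
zfs create tank/demo
mkfile 100m /tank/demo/bigfile
zfs snapshot tank/demo@a
zfs snapshot tank/demo@b        # no writes happened between @a and @b
rm /tank/demo/bigfile
zfs list -t all -o name,used,usedsnap -r tank/demo
# Each snapshot's USED is (near) zero: destroying either one alone frees
# nothing, because the other still references the same blocks.  The
# dataset's USEDSNAP shows the ~100M that comes back only once every
# snapshot referencing those blocks is destroyed, which is why the
# per-snapshot numbers reshuffle after each zfs destroy.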


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Rethinking my zpool

2010-03-19 Thread Brandon High
On Fri, Mar 19, 2010 at 5:32 AM, Chris Dunbar - Earthside, LLC 
cdun...@earthside.net wrote:

 if I went with two? Finally, would I be better off with raidz2 or something
 else instead of the striped mirrored sets? Performance and fault tolerance
 are my highest priorities.


Performance and fault tolerance are somewhat conflicting.

You'll have good fault tolerance and performance using a wide raidz3 stripe,
e.g. a 12-disk raidz3 with a spare.

You'll have the best fault tolerance using small raidz3 stripes with a
spare, for instance 2 x 6-disk raidz3. This uses 50% of your disks for
redundancy.

You'll have slightly better performance and slightly worse fault tolerance
using raidz2 instead in both cases above. I would not recommend using raidz,
as it will offer almost no real fault tolerance with drives the size of the
ones you're using.

You'll get the best performance and fault tolerance using 3-way mirrors,
but you sacrifice 2/3 of your disks to do it. Actually, I think that raidz3
has higher fault tolerance still, but the performance difference will be huge.

2-way mirrors are slightly worse for fault tolerance (below raidz2, I believe)
but still give good performance.

-B

-- 
Brandon High : bh...@freaks.com
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS/OSOL/Firewire...

2010-03-19 Thread Alex Blewitt
On 19 Mar 2010, at 15:30, Bob Friesenhahn wrote:

 On Fri, 19 Mar 2010, Khyron wrote:
 Getting better FireWire performance on OpenSolaris would be nice though.
 Darwin drivers are open...hmmm.
 
 OS-X is only (legally) used on Apple hardware.  Has anyone considered that 
 since Firewire is important to Apple, they may have selected a particular 
 Firewire chip which performs particularly well?

Darwin is open-source.

http://www.opensource.apple.com/source/xnu/xnu-1486.2.11/
http://www.opensource.apple.com/source/IOFireWireFamily/IOFireWireFamily-417.4.0/

Alex
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Rethinking my zpool

2010-03-19 Thread Erik Trimble

Chris Dunbar - Earthside, LLC wrote:

Hello,

After being immersed in this list and other ZFS sites for the past few weeks I am having 
some doubts about the zpool layout on my new server. It's not too late to make a change 
so I thought I would ask for comments. My current plan to to have 12 x 1.5 TB disks in a 
what I would normally call a RAID 10 configuration. That doesn't seem to be the right 
term here, but there are 6 sets of mirrored disks striped together. I know that 
smaller sets of disks are preferred, but how small is small? I am wondering 
if I should break this into two sets of 6 disks. I do have a 13th disk available as a hot 
spare. Would it be available for either pool if I went with two? Finally, would I be 
better off with raidz2 or something else instead of the striped mirrored sets? 
Performance and fault tolerance are my highest priorities.

Thank you,
Chris Dunbar

There's not much benefit I can see to having two pools if both are using 
the same configuration (i.e. all mirrors or all raidz). There are reasons 
to do so, but I don't see that they would be of any real benefit for 
what you describe.  A hot spare disk can be assigned to multiple pools 
(often referred to as a global hot spare).


Preferences for raidz[123] configs is to have 4-6 data disks in the vdev.

Realistically speaking, you have several different (practical) 
configurations possible, in order of general performance:


(a)  6 x 2-way mirrors + 1 pool hot spare - 9TB usable
(b)  4 x 3-way mirrors + 1 pool hot spare - 6TB usable
(c)  1 6-disk raidz + 1 7-disk raidz -  16.5TB usable
(d)  2 6-disk raidz + 1 pool hot spare - 15TB usable
(e)  1 6-disk raidz2 + 1 7-disk raidz2 - 13.5TB usable
(f)   2 6-disk raidz2 + 1 pool hot spare - 12TB usable
(g)  1 6-disk raidz3 + 1 7-disk raidz3 -  10.5TB usable
(h)  1 13-disk raidz3 - 15TB usable

Given the size of your disks, resilver time is likely to be a significant 
problem in any RAIDZ[123] configuration.  That is, unless you are storing 
(almost exclusively) very large files, resilvering is going to take a long 
time, and can potentially be radically slower than in a mirrored config.


The mirroring configs will out-perform raidz[123] on everything except 
large streaming writes/reads, and even then, it's a toss-up. 

Overall, the (a), (d), and (f) configurations generally offer the best 
balance of redundancy, space, and performance.


Here are the chances of surviving disk failures (assuming hot spares are 
unable to be used; that is, all disk failures happen in a short period 
of time) - note that all three can always survive a single disk failure:


(a)   90% for 2, 73% for 3, 49% for 4, 25% for 5.
(d)   55% for 2, 0% for 3 or more
(f)   100% for 2, 80% for 3, 56% for 4, 0% for 5.
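
To see where figures like these come from, here is one way to derive the (a)
line, assuming failures are equally likely across the 12 data disks and the
spare is left out of the count: after the first failure, the pool dies only if
a later failure lands on a disk whose mirror partner is already dead.  So
P(survive 2) = 10/11 ≈ 90%, P(survive 3) = 10/11 × 8/10 ≈ 73%,
P(survive 4) = 10/11 × 8/10 × 6/9 ≈ 49%, and
P(survive 5) = 10/11 × 8/10 × 6/9 × 4/8 ≈ 25%.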


Depending on your exact requirements, I'd go with (a) or (f) as the best 
choices - (a) if performance is more important, (f) if redundancy 
overrides performance.
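
For reference, here is a minimal sketch of what creating (a) and (f) might look
like; the c*t*d* device names are placeholders, not Chris's actual devices:

# Config (a): 6 x 2-way mirrors plus a global hot spare (device names assumed)
zpool create tank \
    mirror c1t0d0 c1t1d0  mirror c1t2d0 c1t3d0  mirror c1t4d0 c1t5d0 \
    mirror c2t0d0 c2t1d0  mirror c2t2d0 c2t3d0  mirror c2t4d0 c2t5d0 \
    spare c2t6d0

# Config (f): 2 x 6-disk raidz2 plus a global hot spare
zpool create tank \
    raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 \
    raidz2 c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0 c2t5d0 \
    spare c2t6d0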


--
Erik Trimble
Java System Support
Mailstop:  usca22-123
Phone:  x17195
Santa Clara, CA

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS/OSOL/Firewire...

2010-03-19 Thread Khyron
The point I think Bob was making is that FireWire is an Apple technology, so
they have a vested interest in making sure it works well on their systems and
with their OS.  They could even have a specific chipset that they exclusively
use in their systems, although I don't see why others couldn't source it (with
the exception that others may be too cheap to do so).  Given these factors,
it makes sense that FireWire performs brilliantly on Apple hardware/software,
while everyone else makes the bare minimum (or less) investment in it, if
that much.  So those open drivers, while they could be useful for learning
or other purposes, may not be directly usable for the systems people are
running with OpenSolaris.

At least, that's what I think Bob meant.

On Fri, Mar 19, 2010 at 17:08, Alex Blewitt alex.blew...@gmail.com wrote:

 On 19 Mar 2010, at 15:30, Bob Friesenhahn wrote:

  On Fri, 19 Mar 2010, Khyron wrote:
  Getting better FireWire performance on OpenSolaris would be nice though.
  Darwin drivers are open...hmmm.
 
  OS-X is only (legally) used on Apple hardware.  Has anyone considered
 that since Firewire is important to Apple, they may have selected a
 particular Firewire chip which performs particularly well?

 Darwin is open-source.

 http://www.opensource.apple.com/source/xnu/xnu-1486.2.11/

 http://www.opensource.apple.com/source/IOFireWireFamily/IOFireWireFamily-417.4.0/

 Alex




-- 
You can choose your friends, you can choose the deals. - Equity Private

If Linux is faster, it's a Solaris bug. - Phil Harman

Blog - http://whatderass.blogspot.com/
Twitter - @khyron4eva
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Usage of hot spares and hardware allocation capabilities.

2010-03-19 Thread Khyron
Responses inline...

On Tue, Mar 16, 2010 at 07:35, Robin Axelsson
gu99r...@student.chalmers.sewrote:

 I've been informed that newer versions of ZFS supports the usage of hot
 spares which is denoted for drives that are not in use but available for
 resynchronization/resilvering should one of the original drives fail in the
 assigned storage pool.


That is the definition of a hot spare, at least informally.  ZFS has
supported this for some time (if not from the beginning; I'm not in a
position to answer that).  It is *not* new.



 I'm a little sceptical about this because even the hot spare will be
 running for the same duration as the other disks in the pool and therefore
 will be exposed to the same levels of hardware degradation and failures
 unless it is put to sleep during the time it is not being used for storage.
 So, is there a sleep/hibernation/standby mode that the hot spares operate in
 or are they on all the time regardless of whether they are in use or not?


Not that I am aware of or have heard others report.  No such sleep mode
exists.  Sounds like you want a Copan storage system.  AFAIK, hot spares
are always spinning, that's why they are hot.



 Usually the hot spare is on a not so well-performing SAS/SATA controller,
 so given the scenario of a hard drive failure upon which a hot spare has
 been used for resilvering of say a raidz2 cluster, can I move the resilvered
 hot spare to the faster controller by letting it take the faulty hard
 drive's space using the zpool offline, zpool online commands?


Usually?  That's not my experience with multiple vendors' hardware RAID
arrays.  Usually it's on a channel used by storage disks.  Maybe someone
else has seen otherwise.  I'd be personally curious to know what system
puts a spare on a lower-performance channel.  That risks slowing the entire
device (RAID set/group) when the hot spare kicks in.

As for your questions, that doesn't make a lot of sense to me.  I don't even
get how that would work, but I'm not Wile E. Coyote, Super Genius either.



 To be more general; are the hard drives in the pool hard coded to their
 SAS/SATA channels or can I swap their connections arbitrarily if I would
 want to do that? Will zfs automatically identify the association of each
 drive of a given pool or tank and automatically reallocate them to put the
 pool/tank/filesystem back in place?


No.  Each disk in the pool has a unique ID, as I understand.  Thus, you
should be able to move a disk to another location (channel, slot) and it
would still be a part of the same pool and VDEV.
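
As a small illustration of both points, a sketch with assumed pool and device
names (tank, backup and c3t5d0 are placeholders):

# The same disk can be configured as a spare in more than one pool,
# which is what makes it a "global" hot spare:
zpool add tank   spare c3t5d0
zpool add backup spare c3t5d0

# Pool members are tracked by on-disk GUIDs rather than controller paths,
# so a disk moved to a different slot or channel is still recognized:
zpool export tank     # shuffle the cabling, then:
zpool import tank
zpool status tank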

All of that said, I saw this post when it originally came in.  I notice no
one has responded to it until now.  I don't know about anyone else, but I
know that I was offended when I read this; I wasn't sure how to take it.

Maybe you should not assume that people on this list don't know what
hot sparing is, or that ZFS just learned.  Just a suggestion.

-- 
You can choose your friends, you can choose the deals. - Equity Private

If Linux is faster, it's a Solaris bug. - Phil Harman

Blog - http://whatderass.blogspot.com/
Twitter - @khyron4eva
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] sympathetic (or just multiple) drive failures

2010-03-19 Thread zfs ml
Most discussions I have seen about RAID 5/6 and why it stops working seem to 
base their conclusions solely on single drive characteristics and statistics.
It seems to me there is a missing component in the discussion of drive 
failures in the real world context of a system that lives in an environment 
shared by all the system components - for instance, the video of the disks 
slowing down when they are yelled at is a good visual example of the negative 
effect of vibration on drives.  http://www.youtube.com/watch?v=tDacjrSCeq4


I thought the Google and CMU papers talked about a surprisingly high (higher 
than expected) rate of multiple failures of drives near each other, 
but I couldn't find it when I re-skimmed the papers just now.


What are people's experiences with multiple drive failures? Given that we 
often use same-brand/model/batch drives (even though we are not supposed to), 
the same enclosure, the same rack, etc. for a given raid 5/6/z1/z2/z3 system, should we 
be paying more attention to harmonics, vibration/isolation and non-intuitive 
system-level statistics that might be inducing close-proximity drive failures 
rather than just throwing more parity drives at the problem?


What if our enclosure and environmental factors push the system-level 
statistics for multiple drive failures beyond the (used by everyone) 
single-drive failure statistics, to the point where they essentially negate 
the positive effect of adding parity drives?


I realize this issue is not addressed because there is too much variability in 
the environments, etc., but I thought it would be interesting to see if anyone 
has experienced much in the way of multiple drive failures in close time proximity.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Rethinking my zpool

2010-03-19 Thread Darren J Moffat
12 disks in mirrored pairs is a small configuration.  The smaller sets 
you refer to might be the number of disks in a raidz/raidz2/raidz3 
top-level vdev.


You say performance is one of your top priorities but what is the 
workload ?  Mostly read ? Mostly write ?  Random ? Sequential ?



See the ZFS Best Practices guide on the solarisinternals.com site for 
guidance on how to select your pool layout.


http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide

In particular this part:

http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide#Storage_Pool_Performance_Considerations

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Thoughts on ZFS Pool Backup Strategies

2010-03-19 Thread erik.ableson
On 19 March 2010, at 17:11, Joerg Schilling wrote:

 I'm curious, why isn't a 'zfs send' stream that is stored on a tape considered 
 a backup, yet the implication is that a tar archive stored on a tape is 
 considered a backup?
 
 You cannot get a single file out of the zfs send datastream.

zfs send is a block-level transaction with no filesystem dependencies - it 
could be transmitting a couple of blocks that represent a portion of a file, 
not necessarily an entire file.  And since it can also be used to host a zvol 
with any filesystem format imaginable it doesn't want to know.

Going back to star as an example - from the man page :

Star archives and extracts multiple files to and from a single file called a 
tarfile. A tarfile is usually a magnetic tape, but it can be any file. In all 
cases, appearance of a directory name refers to the files and (recursively) 
subdirectories of that directory.

This process pulls files (repeat: files! not blocks) off of the top of a 
filesystem so it needs to be presented a filesystem with interpretable file 
objects (like almost all backup tools). ZFS confuses the issue by integrating 
volume management with filesystem management. zfs send is dealing with the 
volume and the blocks that represent the volume without any file-level 
dependence.

It addresses an entirely different type of backup need, that is to be able to 
restore or mirror (especially mirror to another live storage system) an entire 
volume at a point in time.  It does not replace the requirement for file-level 
backups which deal with a different level of granularity. Simply because the 
restore use-case is different.
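
For that first use case, mirroring an entire volume to another live storage
system, a minimal sketch (the host, pool and snapshot names are assumptions):

# Hypothetical volume-level replication.  -R preserves descendant datasets,
# snapshots and properties; -i sends only the blocks changed since @sunday.
zfs snapshot -r tank/vol@monday
zfs send -R -i tank/vol@sunday tank/vol@monday | \
    ssh mirrorhost zfs receive -F tank/vol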

For example, on my Mac servers, I run two different backup strategies 
concurrently - one is bootable clone from which I can restart the computer 
immediately in the case of a drive failure.  At the same time, I use the Time 
Machine backups for file level granularity that allows me to easily find a 
particular file at a particular moment. Before Time Machine, this role was 
fulfilled with Retrospect to a tape drive.  However, a block-level dump to tape 
had little interest in the first use case since the objective is to minimize 
the RTO.

For disaster recovery purposes any of these backup objects can be externalized. 
Offsite rotation of the disks used allow the management of the RPO. 

Remember that files exist in a filesystem context and need to be backed up in 
this context.  Volumes exist in another context and can be replicated/backed up 
in this context.

zfs send/recv =  EMC MirrorView, NetApp Snap Mirror, EqualLogic 
Auto-replication, HP StorageWorks Continuous Access, DataCore AIM, etc.
zfs send/recv ≠ star, Backup Exec, CommVault, ufsdump, bacula, zmanda, 
Retrospect, etc.

Erik

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Thoughts on ZFS Pool Backup Strategies

2010-03-19 Thread Khyron
Erik,

I don't think there was any confusion about the block nature of zfs send
vs. the file nature of star.  I think what this discussion is coming down to
is the best ways to utilize zfs send as a backup, since (as Darren Moffat has
noted) it supports all the ZFS objects and metadata.

I see 2 things coming out of this:

1. NDMP for putting zfs send streams on tape over the network.  So the
question I have now is for anyone who has used or is using NDMP on OSol.
How well does it work?  Pros?  Cons?  If people aren't using it, why not?  I
think this is one area where there are some gains to be made on the OSol
backup front.

I still need to go back and look at the best ways to use local tape drives
on OSol file servers running ZFS to capture ZFS objects and metadata (ZFS
ACLs, ZVOLs, etc.).

2. A new tool is required to provide some of the functionality desired, at
least as a supported backup method from Sun.  While someone in the
community may be interested in developing such a tool, Darren also noted
that the requisite APIs are currently private and still in flux.  They
haven't yet stabilized and been published.

To Ed Harvey:

Some questions about your use of NetBackup on your secondary server:

1. Do you successfully backup ZVOLs?  We know NetBackup should be able
to capture datasets (ZFS file systems) using straight POSIX semantics.
2. What version of NetBackup are you using?
3. You simply run the NetBackup agent locally on the (Open)Solaris server?

I thank everyone who has participated in this conversation for sharing their
thoughts, experiences and realities.  It has been most informative.

On Fri, Mar 19, 2010 at 13:11, erik.ableson eable...@me.com wrote:

 On 19 March 2010, at 17:11, Joerg Schilling wrote:

  I'm curious, why isn't a 'zfs send' stream that is stored on a tape considered
  a backup, yet the implication is that a tar archive stored on a tape is
  considered a backup?
 
  You cannot get a single file out of the zfs send datastream.

 zfs send is a block-level transaction with no filesystem dependencies - it
 could be transmitting a couple of blocks that represent a portion of a file,
 not necessarily an entire file.  And since it can also be used to host a
 zvol with any filesystem format imaginable it doesn't want to know.


snip

-- 
You can choose your friends, you can choose the deals. - Equity Private

If Linux is faster, it's a Solaris bug. - Phil Harman

Blog - http://whatderass.blogspot.com/
Twitter - @khyron4eva
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Thoughts on ZFS Pool Backup Strategies

2010-03-19 Thread Edward Ned Harvey
  ZFS+CIFS even provides
  Windows Volume Shadow Services so that Windows users can do this on
  their own.
 
 I'll need to look into that, when I get a moment.  Not familiar with
 Windows Volume Shadow Services, but having people at home able to do
 this
 directly seems useful.

I'd like to spin off this discussion into a new thread.  Any replies to this
one will surely just get buried in the (many) messages in this very long
thread...

New thread:
ZFS+CIFS:  Volume Shadow Services, or Simple Symlink?

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS/OSOL/Firewire...

2010-03-19 Thread Miles Nordin
 k == Khyron  khyron4...@gmail.com writes:

 k FireWire is an Apple technology, so they have a vested
 k interest in making sure it works well [...]  They could even
 k have a specific chipset that they exclusively use in their
 k systems,

yes, you keep repeating yourselves, but there are only a few firewire
host chips, like ohci and lynx, and apple uses the same ones as
everyone else, no magic.  Why would you speak such a complicated
fantasy out loud without any reason to believe it other than your
imaginations?

I also tried to use firewire on Solaris long ago and had a lot of
problems with it, both with the driver stack in Solaris and with the
embedded software inside a cheaper non-Oxford case (Prolific).  I
think y'all forum users should stick to SAS/SATA for external disks
and avoid firewire and USB both.

Realize, though, that it is not just the chip driver but the entire
software stack that influences speed and reliability.  Even above what
you normally consider the firewire stack, above all the mid-layer and
scsi emulation stuff, Mac OS X for example is rigorous about handling
force-unmounting, both with umount -f and disks that go away without
warning.  FreeBSD OTOH has major problems with force-unmounting,
panicking and waiting forever.  Solaris has problems too, with freezing
zpool maintenance commands, access to pools unrelated to the one with
the device that went away, and NFS serving anything while any zpool is
frozen.  This is a problem even if you don't make a habit of yanking
disks, because it can make diagnosing problems really difficult: what
if your case, like my non-Oxford one, has a firmware bug that makes it
freeze up sometimes?  Or a flaky power supply or a loose cable?  If the
OS does not stay up long enough to report the case detached, and stay
sane enough for you to figure out what makes it reattach (waiting a
while, rebooting the case, jiggling the power connector, jiggling the
data connector), then you will probably never figure out what's wrong
with it, as I didn't for months.  If I'd had the same broken case
on a Mac, I'd have realized almost immediately that it sometimes
detaches itself for no reason and reattaches when I cycle its power
switch, but not when I plug/unplug its data cable and not when I reboot
the Mac, so I'd know the case had buggy firmware, while with Solaris I
just get these crazy panic messages.  Once your exception
handling reaches a certain level of crappiness, you cannot touch
anything without everything collapsing.

And on Solaris all this freezing/panicking behavior depends a lot on which
disk driver you're using, while on Mac OS X it's, meh, basically working
the same for SATA, USB, FireWire, or the NFS client, and also you can
mount images with hdiutil over NFS without getting weird checksum
errors or deadlocks like you do with file- or lofiadm-backed ZFS.
(globalSAN iSCSI is still a mess though, worse than all other Mac disk
drivers and worse than the Solaris initiator.)

I do not like the Mac OS much because it's slow, because the
hardware's overpriced and fragile, because the only people running it
inside VM's are using piratebay copies, and because I distrust Apple
and strongly disapprove of their master plan both in intent and
practice like the way they crippled dtrace, the displayport bullshit,
and their terrible developer relations like nontransparent last-minute
API yanking and ``agreements'' where you even have to agree not to
discuss the agreement, and in general of their honing a talent for
manipulating people into exploitable corners by slowly convincing them
it's okay to feel lazy and entitled.  But yes they've got some things
relevant to server-side storage working better than Solaris does like
handling flakey disks sanely, and providing source for the stable
supported version of their OS not just the development version.


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] ZFS+CIFS: Volume Shadow Services, or Simple Symlink?

2010-03-19 Thread Edward Ned Harvey
  ZFS+CIFS even provides
  Windows Volume Shadow Services so that Windows users can do this on
  their own.
 
 I'll need to look into that, when I get a moment.  Not familiar with
 Windows Volume Shadow Services, but having people at home able to do
 this directly seems useful.

Even in a fully supported, all-MS environment, I've found the support for
Previous Versions to be spotty and unreliable at best.  Not to
mention, I think the user interface is simply non-intuitive.

 

As an alternative, here's what I do:

ln -s .zfs/snapshot snapshots

 

Voila.  All Windows or Mac or Linux or whatever users are able to easily
access snapshots.

 

It's worth noting that, in the default config of zfs-auto-snapshot, the snaps are
created with non-CIFS-compatible characters in the name (the ':' colon
character in the time).  So I also make it a habit during installation to
modify the zfs-auto-snapshot scripts and substitute that character.
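
If you'd rather not patch the bundled service, a minimal sketch of a home-grown
hourly snapshot job that simply avoids the colon (the dataset name is an
assumption):

#!/usr/bin/bash
# Hypothetical replacement snapshot job, run hourly from cron.  Using '.'
# instead of ':' in the timestamp keeps the names CIFS-friendly, so Windows
# users can browse them through the symlinked .zfs/snapshot directory.
zfs snapshot tank/home@auto-$(date +%Y-%m-%d_%H.%M)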

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Thoughts on ZFS Pool Backup Strategies

2010-03-19 Thread Edward Ned Harvey
  I'll say it again: neither 'zfs send' or (s)tar is an enterprise (or
  even home) backup system on their own one or both can be components
 of
  the full solution.

I would be pretty comfortable with a solution thusly designed:

#1  A small number of external disks, zfs send onto the disks, and rotate
offsite.  Then you're satisfying the ability to restore individual files,
but you're not getting the archivability and longevity of tapes.

#2  Also, zfs send onto tapes.  So if you ever need something older than
your removable disks, it's someplace reliable, just not readily accessible
if you only want a subset of files.


 I'm in the fortunate position of having my backups less than the size
 of a
 large single drive; so I'm rotating three backup drives, and intend to

It's of course convenient if your backup fits entirely inside a single
removable disk, but that's not a requirement.  You could always use
removable stripesets, or raidz, or whatever you want.  For example, you
could build a removable raidz pool out of 5 removable disks.
Just be sure you attach all 5 disks before you zpool import it.
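
A minimal sketch of that rotation, assuming a hypothetical 5-disk external set
and placeholder pool, device and snapshot names:

# Build the removable backup pool once:
zpool create backuppool raidz c5t0d0 c5t1d0 c5t2d0 c5t3d0 c5t4d0

# Each rotation: replicate, then export so the disk set can go offsite:
zfs snapshot -r tank@weekly
zfs send -R tank@weekly | zfs receive -Fd backuppool
zpool export backuppool

# Later, with all 5 disks attached again:
zpool import backuppool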

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Thoughts on ZFS Pool Backup Strategies

2010-03-19 Thread Edward Ned Harvey
 1. NDMP for putting zfs send streams on tape over the network.  So

Tell me if I missed something here.  I don't think I did.  I think this
sounds like crazy talk.

I used NDMP up till November, when we replaced our NetApp with a Solaris Sun
box.  In NDMP, to choose the source files, we had the ability to browse the
fileserver, select files, and specify file matching patterns.  My point is:
NDMP is file-based.  It doesn't allow you to spawn a process and back up a
data stream.

Unless I missed something.  Which I doubt.  ;-)


 To Ed Harvey:
 
 Some questions about your use of NetBackup on your secondary server:
 
 1. Do you successfully backup ZVOLs?  We know NetBackup should be able
 to capture datasets (ZFS file systems) using straight POSIX semantics.

I wonder if I'm confused by that question.  Backing up zvols, to me, would
imply something at a lower level than the filesystem.  No, we're not doing
that.  We just specify back up the following directory and all of its
subdirectories, just like any other typical backup tool.

The reason we bought NetBackup is because it intelligently supports all the
permissions, ACL's, weird (non-file) file types, and so on.  And it
officially supports ZFS, and you can pay for an enterprise support contract.

Basically, I consider the purchase cost of NetBackup to be insurance.
Although I never plan to actually use it for anything, because all our bases
are covered by zfs send to hard disks and tapes.  I actually trust the
zfs send solution more, but I can't claim that I, or anything I've ever
done, is 100% infallible.  So I need a commercial solution too, just so I
can point my finger somewhere if needed.


 2. What version of NetBackup are you using?

I could look it up, but I'd have to VPN in and open up a console, etc etc.
We bought it in November, so it's whatever was current 4-5 months ago.


 3. You simply run the NetBackup agent locally on the (Open)Solaris
 server?

Yup.  We're doing no rocket science with it.  Ours is the absolute most
basic NetBackup setup you could possibly have.  We're not using 90% of the
features of NetBackup.  It's installed on a Solaris 10 server, with locally
attached tape library, and it does backups directly from local disk to local
tape.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS/OSOL/Firewire...

2010-03-19 Thread Edward Ned Harvey
 It would appear that the bus bandwidth is limited to about 10MB/sec
 (~80Mbps) which is well below the theoretical 400Mbps that 1394 is
 supposed to be able to handle.  I know that these two disks can go
 significantly higher since I was seeing 30MB/sec when they were used on
 Macs previously in the same daisy-chain configuration.

I have not done 1394 in solaris or opensolaris.  But I have used it in
windows, mac, and Linux.  Many times for each one.  I never have even the
remotest problem with it in any of these other platforms.  I consider it
more universally reliable, even than USB, because occasionally I see a bad
USB driver on some boot CD or something, which can only drive USB around
11Mbit.  Again, I've never had anything but decent performance out of 1394.

Generally speaking, I use 1394 on:
Dell laptops
Lenovo laptops
Apple laptops
Apple XServe
HP laptops
... and maybe some dell servers...

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss