Re: [zfs-discuss] True in U4? Tar and cpio...save and restore ZFS File attributes and ACLs

2009-09-30 Thread Joerg Schilling
Ray Clark webcl...@rochester.rr.com wrote:

 The April 2009 ZFS Administration Guide states "...tar and cpio commands, 
 to save ZFS files.  All of these utilities save and restore ZFS file 
 attributes and ACLs."

Be careful, Sun tar and Sun cpio do not support sparse files.
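
A quick way to check this on your own data (just a sketch; the paths are
placeholders, and mkfile -n creates the file without allocating its blocks):

   # mkfile -n 100m /tank/test/sparse
   # du -h /tank/test/sparse
   # cd /tank/test; tar cf /tmp/sparse.tar sparse
   # mkdir /tmp/unpack; cd /tmp/unpack; tar xf /tmp/sparse.tar
   # du -h sparse

du on the original reports almost nothing allocated; if the archiver does not
preserve holes, du on the extracted copy reports the full 100 MB.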

Jörg

-- 
 EMail:jo...@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
   j...@cs.tu-berlin.de(uni)  
   joerg.schill...@fokus.fraunhofer.de (work) Blog: 
http://schily.blogspot.com/
 URL:  http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] ZFS : unable to mount a pool

2009-09-30 Thread Nicolas Szalay
Hello all,

I have a critical ZFS problem, quick history

I have a production machine whose backplane has burnt (literally). It had
two pools: applis and storage. Those pools are RAIDZ1 + 1 spare.
Then we switched to the backup machine, and all was fine.

The backup machine is an exact replica of the production one; files are rsynced
every night. I needed to reboot it to rename it, so I did. But the reboot
failed: ZFS panicked at boot. Following Sun's doc, I
moved /etc/zfs/zpool.cache away and rebooted. All fine. I mounted pool
storage without problem, but I am unable to mount pool applis.

Someone suggested the disks were re-labelled. I'm open to all advice; if I
can't get that data back, all will be lost...

Thanks for reading, and maybe helping.

Diagnostic outputs (made from the OpenSolaris LiveCD, to have a recent
ZFS):

--
# zpool import
  pool: applis
id: 5524311410139446438
 state: UNAVAIL
status: The pool was last accessed by another system.
action: The pool cannot be imported due to damaged devices or data.
   see: http://www.sun.com/msg/ZFS-8000-EY
config:

applis  UNAVAIL  insufficient replicas
  raidz1FAULTED  corrupted data
c7t0d0  FAULTED  corrupted data
c7t1d0  FAULTED  corrupted data
c8t2d0  FAULTED  corrupted data
c8t3d0  FAULTED  corrupted data
spares
  c5t4d0

--
# prtvtoc /dev/rdsk/c7t0d0
* /dev/rdsk/c7t0d0 partition map
*
* Dimensions:
* 512 bytes/sector
* 976773168 sectors
* 976773101 accessible sectors
*
* Flags:
*   1: unmountable
*  10: read-only
*
* Unallocated space:
*           First     Sector    Last
*           Sector    Count     Sector
*               34       222       255
*
*                          First     Sector    Last
* Partition  Tag  Flags    Sector    Count     Sector    Mount Directory
       0      4    00         256  976756495  976756750
       8     11    00   976756751      16384  976773134

--
# fstyp /dev/rdsk/c7t0d0
unknown_fstyp (no match)

--
# zpool import -f applis
cannot import 'applis': one or more device is currently unavailable


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Help importing pool with offline disk

2009-09-30 Thread Carson Gaspar
One of the disks in my RAIDZ array was behaving oddly (lots of bus errors) so I 
took it offline to replace it. I shut down the server, put in the replacement 
disk, and rebooted. Only to discover that a different drive had chosen that 
moment to fail completely. So I replace the failing (but not yet failed) drive 
and try and import the pool. Failure, because that disk is marked offline. Is 
there any way to recover from this?

System was running b118. Booting off my OS into single user mode causes the 
system to become extremely unhappy (any zfs command hangs the system for a very 
long time, and I get an error about being out of VM)... Booting off the osol 
live CD gives me:

  pool: media
id: 4928877878517118807
 state: UNAVAIL
status: The pool was last accessed by another system.
action: The pool cannot be imported due to damaged devices or data.
   see: http://www.sun.com/msg/ZFS-8000-EY
config:

media   UNAVAIL  insufficient replicas
  raidz1UNAVAIL  insufficient replicas
c7t5d0  UNAVAIL  cannot open
c7t2d0  ONLINE
c7t4d0  ONLINE
c7t3d0  ONLINE
c7t0d0  OFFLINE
c7t7d0  ONLINE
c7t1d0  ONLINE
c7t6d0  ONLINE
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS : unable to mount a pool

2009-09-30 Thread Nicolas Szalay
On Wednesday 30 September 2009 at 11:43 +0200, Nicolas Szalay wrote:
 Hello all,
 
 I have a critical ZFS problem, quick history
[snip]

A little addition: zdb -l /dev/rdsk/c7t0d0 sees the metadata.
Isn't it just the phys_path that is wrong?


LABEL 0

version=10
name='applis'
state=0
txg=3183748
pool_guid=5524311410139446438
hostid=566707831
hostname='solarisfiler2'
top_guid=254793396820920770
guid=5339011664685738178
vdev_tree
type='raidz'
id=0
guid=254793396820920770
nparity=1
metaslab_array=15
metaslab_shift=34
ashift=9
asize=2000377872384
is_log=0
children[0]
type='disk'
id=0
guid=6486634062425618987
path='/dev/dsk/c4t0d0s0'
devid='id1,s...@x0004d927f800/a'

phys_path='/p...@0,0/pci8086,2...@2/pci8086,3...@0,3/pci8086,3...@1/pci17d3,1...@e/s...@0,0:a'
whole_disk=1
DTL=80
children[1]
type='disk'
id=1
guid=5339011664685738178
path='/dev/dsk/c4t1d0s0'
devid='id1,s...@x0004d927f810/a'

phys_path='/p...@0,0/pci8086,2...@2/pci8086,3...@0,3/pci8086,3...@1/pci17d3,1...@e/s...@1,0:a'
whole_disk=1
DTL=79
children[2]
type='disk'
id=2
guid=2839810383588280229
path='/dev/dsk/c5t2d0s0'
devid='id1,s...@x0004d927f820/a'

phys_path='/p...@0,0/pci8086,2...@2/pci8086,3...@0,3/pci8086,3...@2/pci17d3,1...@e/s...@2,0:a'
whole_disk=1
DTL=78
children[3]
type='disk'
id=3
guid=2925754536128244731
path='/dev/dsk/c5t3d0s0'
devid='id1,s...@x0004d927f830/a'

phys_path='/p...@0,0/pci8086,2...@2/pci8086,3...@0,3/pci8086,3...@2/pci17d3,1...@e/s...@3,0:a'
whole_disk=1
DTL=77

LABEL 1

version=10
name='applis'
state=0
txg=3183748
pool_guid=5524311410139446438
hostid=566707831
hostname='solarisfiler2'
top_guid=254793396820920770
guid=5339011664685738178
vdev_tree
type='raidz'
id=0
guid=254793396820920770
nparity=1
metaslab_array=15
metaslab_shift=34
ashift=9
asize=2000377872384
is_log=0
children[0]
type='disk'
id=0
guid=6486634062425618987
path='/dev/dsk/c4t0d0s0'
devid='id1,s...@x0004d927f800/a'

phys_path='/p...@0,0/pci8086,2...@2/pci8086,3...@0,3/pci8086,3...@1/pci17d3,1...@e/s...@0,0:a'
whole_disk=1
DTL=80
children[1]
type='disk'
id=1
guid=5339011664685738178
path='/dev/dsk/c4t1d0s0'
devid='id1,s...@x0004d927f810/a'

phys_path='/p...@0,0/pci8086,2...@2/pci8086,3...@0,3/pci8086,3...@1/pci17d3,1...@e/s...@1,0:a'
whole_disk=1
DTL=79
children[2]
type='disk'
id=2
guid=2839810383588280229
path='/dev/dsk/c5t2d0s0'
devid='id1,s...@x0004d927f820/a'

phys_path='/p...@0,0/pci8086,2...@2/pci8086,3...@0,3/pci8086,3...@2/pci17d3,1...@e/s...@2,0:a'
whole_disk=1
DTL=78
children[3]
type='disk'
id=3
guid=2925754536128244731
path='/dev/dsk/c5t3d0s0'
devid='id1,s...@x0004d927f830/a'

phys_path='/p...@0,0/pci8086,2...@2/pci8086,3...@0,3/pci8086,3...@2/pci17d3,1...@e/s...@3,0:a'
whole_disk=1
DTL=77

LABEL 2

version=10
name='applis'
state=0
txg=3183748
pool_guid=5524311410139446438
hostid=566707831
hostname='solarisfiler2'
top_guid=254793396820920770
guid=5339011664685738178
vdev_tree
type='raidz'
id=0
guid=254793396820920770
nparity=1
metaslab_array=15
metaslab_shift=34
ashift=9
asize=2000377872384
is_log=0
children[0]
type='disk'
id=0
guid=6486634062425618987
path='/dev/dsk/c4t0d0s0'
devid='id1,s...@x0004d927f800/a'

phys_path='/p...@0,0/pci8086,2...@2/pci8086,3...@0,3/pci8086,3...@1/pci17d3,1...@e/s...@0,0:a'
whole_disk=1
DTL=80
children[1]

Re: [zfs-discuss] Best way to convert checksums

2009-09-30 Thread Darren J Moffat

Ray Clark wrote:

When using zfs send/receive to do the conversion, the receive creates a new 
file system:

   zfs snapshot zfs01/h...@before
   zfs send zfs01/h...@before | zfs receive afx01/home.sha256

Where do I get the chance to zfs set checksum=sha256 on the new file system 
before all of the files are written ???


Set it on the afx01 dataset before you do the receive and it will be 
inherited.
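
For example, something along these lines (only a sketch, reusing the names
from your mail):

   zfs set checksum=sha256 afx01
   zfs snapshot zfs01/h...@before
   zfs send zfs01/h...@before | zfs receive afx01/home.sha256
   zfs get checksum afx01/home.sha256

The receive writes brand-new blocks, so they pick up whatever checksum
property is in effect on the receiving dataset at that point.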


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS : unable to mount a pool

2009-09-30 Thread Victor Latushkin

On 30.09.09 14:30, Nicolas Szalay wrote:

On Wednesday 30 September 2009 at 11:43 +0200, Nicolas Szalay wrote:

Hello all,

I have a critical ZFS problem, quick history

[snip]

A little addition: zdb -l /dev/rdsk/c7t0d0 sees the metadata.


What does zdb -l /dev/rdsk/c7t0d0s0 show?

Victor


Isn't it just the phys_path that is wrong ?


LABEL 0

version=10
name='applis'
state=0
txg=3183748
pool_guid=5524311410139446438
hostid=566707831
hostname='solarisfiler2'
top_guid=254793396820920770
guid=5339011664685738178
vdev_tree
type='raidz'
id=0
guid=254793396820920770
nparity=1
metaslab_array=15
metaslab_shift=34
ashift=9
asize=2000377872384
is_log=0
children[0]
type='disk'
id=0
guid=6486634062425618987
path='/dev/dsk/c4t0d0s0'
devid='id1,s...@x0004d927f800/a'

phys_path='/p...@0,0/pci8086,2...@2/pci8086,3...@0,3/pci8086,3...@1/pci17d3,1...@e/s...@0,0:a'
whole_disk=1
DTL=80
children[1]
type='disk'
id=1
guid=5339011664685738178
path='/dev/dsk/c4t1d0s0'
devid='id1,s...@x0004d927f810/a'

phys_path='/p...@0,0/pci8086,2...@2/pci8086,3...@0,3/pci8086,3...@1/pci17d3,1...@e/s...@1,0:a'
whole_disk=1
DTL=79
children[2]
type='disk'
id=2
guid=2839810383588280229
path='/dev/dsk/c5t2d0s0'
devid='id1,s...@x0004d927f820/a'

phys_path='/p...@0,0/pci8086,2...@2/pci8086,3...@0,3/pci8086,3...@2/pci17d3,1...@e/s...@2,0:a'
whole_disk=1
DTL=78
children[3]
type='disk'
id=3
guid=2925754536128244731
path='/dev/dsk/c5t3d0s0'
devid='id1,s...@x0004d927f830/a'

phys_path='/p...@0,0/pci8086,2...@2/pci8086,3...@0,3/pci8086,3...@2/pci17d3,1...@e/s...@3,0:a'
whole_disk=1
DTL=77

LABEL 1

version=10
name='applis'
state=0
txg=3183748
pool_guid=5524311410139446438
hostid=566707831
hostname='solarisfiler2'
top_guid=254793396820920770
guid=5339011664685738178
vdev_tree
type='raidz'
id=0
guid=254793396820920770
nparity=1
metaslab_array=15
metaslab_shift=34
ashift=9
asize=2000377872384
is_log=0
children[0]
type='disk'
id=0
guid=6486634062425618987
path='/dev/dsk/c4t0d0s0'
devid='id1,s...@x0004d927f800/a'

phys_path='/p...@0,0/pci8086,2...@2/pci8086,3...@0,3/pci8086,3...@1/pci17d3,1...@e/s...@0,0:a'
whole_disk=1
DTL=80
children[1]
type='disk'
id=1
guid=5339011664685738178
path='/dev/dsk/c4t1d0s0'
devid='id1,s...@x0004d927f810/a'

phys_path='/p...@0,0/pci8086,2...@2/pci8086,3...@0,3/pci8086,3...@1/pci17d3,1...@e/s...@1,0:a'
whole_disk=1
DTL=79
children[2]
type='disk'
id=2
guid=2839810383588280229
path='/dev/dsk/c5t2d0s0'
devid='id1,s...@x0004d927f820/a'

phys_path='/p...@0,0/pci8086,2...@2/pci8086,3...@0,3/pci8086,3...@2/pci17d3,1...@e/s...@2,0:a'
whole_disk=1
DTL=78
children[3]
type='disk'
id=3
guid=2925754536128244731
path='/dev/dsk/c5t3d0s0'
devid='id1,s...@x0004d927f830/a'

phys_path='/p...@0,0/pci8086,2...@2/pci8086,3...@0,3/pci8086,3...@2/pci17d3,1...@e/s...@3,0:a'
whole_disk=1
DTL=77

LABEL 2

version=10
name='applis'
state=0
txg=3183748
pool_guid=5524311410139446438
hostid=566707831
hostname='solarisfiler2'
top_guid=254793396820920770
guid=5339011664685738178
vdev_tree
type='raidz'
id=0
guid=254793396820920770
nparity=1
metaslab_array=15
metaslab_shift=34
ashift=9
asize=2000377872384
is_log=0
children[0]
type='disk'
id=0
guid=6486634062425618987
path='/dev/dsk/c4t0d0s0'
devid='id1,s...@x0004d927f800/a'


Re: [zfs-discuss] zpool add issue with cache devices thru ldm 71713004

2009-09-30 Thread Bertrand Lesecq - Sun France - Support Engineer

Check the S10 U8 SRT; as I remember there is a way to add a cache device to
a pool.

On 09/29/09 18:23, Ted Ward wrote:

 Hello Claire.

 That feature is in OpenSolaris but not regular Solaris 10
 (http://www.opensolaris.org/os/community/zfs/version/10/):

 ZFS Pool Version 10

 This page describes the feature that is available with the ZFS
 on-disk format, version 10. This version includes support for the
 following feature:

 Devices can be added to a storage pool as "cache devices."
 These devices provide an additional layer of caching between main
 memory and disk. Using cache devices provides the greatest performance
 improvement for random read-workloads of mostly static content.

 This feature is available in the Solaris Express Community
 Edition, build 78.
 The Solaris 10 10/08 release includes ZFS pool version 10, but
 support for cache devices is not included in this Solaris release.

 ~Ted

 On 09/29/09 09:20, claire.grandal...@sun.com wrote:

  I could use some assistance on this case. I searched this error on
  sunsolve; although it did spit out a million things, I have not found
  anything that pinpoints this issue.

  T5240 w/ Solaris 10 5/09 U7, kernel patch #141414-10
  # zpool upgrade -v is at version 10, so it should have cache availability
  cust is trying to add a cache device to a zpool but it's failing
  ERROR: can not add to zpool name: pool must be upgraded to add these vdevs

  He's also trying to add the cache device into a guest domain; it's
  running on the domain controlled by ldm. Can this be done? Looks like
  he's at all the versions he needs to be to get this done. Is there a
  bug? I appreciate any assistance that can be provided.

  Thanks!

  Claire Grandalski
  OS - Technical Support Engineer
  Sun Microsystems, Inc.
  Operating Systems Technology Service Center
  Email: claire.grandal...@sun.com
  Phone: 1-800-USA-4SUN
  My Working Hours: 6am-2pm ET, Monday thru Friday
  My Manager's Email: dawn.b...@sun.com

  Submit, check and update tickets at http://www.sun.com/osc

 --
 Have a great day, and thank you for calling Sun!

 Ted (Thomas E.) Ward
 Technical Support Engineer
 Sun Microsystems, Inc.
 Operating Systems Technology Service Center
 Email: ted.w...@sun.com
 Phone: 303-464-4594
 My Working Hours: 9am-6pm MT, Monday thru Friday
 My Manager's Email: phil.w...@sun.com

 Submit, check and update tickets at http://www.sun.com/osc

--
Cordialement,
With kind regards,

Bertrand Lesecq
EMEA OS Systems TSC Community Lead Engineer
Graphics and Operating System technical support

Sun Microsystems France
13 av. Morane Saulnier - BP 53
78142 Velizy Cedex
France
Phone: (+33) 1.34.03.04.34 [ x30434 ]
Email: bertrand.les...@sun.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Comments on home OpenSolaris/ZFS server

2009-09-30 Thread Thomas Burgess
On Tue, Sep 29, 2009 at 7:28 AM, rwali...@washdcmail.com wrote:

 On Sep 29, 2009, at 2:41 AM, Eugen Leitl wrote:

  On Mon, Sep 28, 2009 at 06:04:01PM -0400, Thomas Burgess wrote:

 personally i like this case:


 http://www.newegg.com/Product/Product.aspx?Item=N82E16811219021

 it's got 20 hot swap bays, and it's surprisingly well built.  For the
 money,
 it's an amazing deal.


 You don't like http://www.supermicro.com/products/nfo/chassis_storage.cfm
  ?
 I must admit I don't have a price list of these.

 When running that many hard drives I would insist on redundant
 power supplies, and server motherboards with ECC memory. Unless
 it's for home use, where a downtime of days or weeks is not critical.


 I hadn't thought of going that way because I was looking for at least a
 somewhat pre-packaged system, but another poster pointed out how many more
 drives I could get by choosing case/motherboard separately.  I agree, with
 this much trouble it doesn't make sense to settle for fewer drive slots than
 I can get.

 For the money, it's a much better option; you'll be able to afford many
more drives.  In my opinion, for a home system, the more you can save on the
case and power supply, the more hard drives you can buy.  Right now 1 TB and
1.5 TB drives seem to be the best.  I used 1 TB drives and 2 CompactFlash
cards for the OS (with SATA-to-CompactFlash adapters).  They are really
small and easy to find a place to mount, which allows you to use the hot-swap
bays for even more storage.


 I agree completely with the ECC.  It's for home use, so the power supply
 issue isn't huge (though if it's possible that's a plus).  My concern with
 this particular option is noise.  It will be in a closet, but one with
 louvered doors right off a room where people watch TV.  Anything
 particularly loud would be an issue.  The comments on Newegg make this sound
 pretty loud.  Have you tried one outside of a server room environment?

 Yes, ECC is nice.  They also sell dual power supplies that FIT in a single
ATX slot; just look around.  The noise isn't THAT bad.  If you have it in a
closet I'll be very surprised if it's a problem; that's exactly what I do,
and it's in the SAME room as the TV and I don't notice it.  I have 2 Norco
4020's in there and 2 more 2U servers: one is running my router software
(OpenBSD with pf) and the other is an older HP ProLiant box.  I don't have a
problem at all with the noise.  I highly recommend that case.  It's not
designed to be quiet, but if you replace the stock fans with low-noise fans,
it's much, much quieter than you'd think, and it's designed well
enough to keep the drives cool.  It's perfect for home and small office use,
and will allow you to put more money into buying storage, which is the POINT
of what you are doing ANYWAY.



 Thanks,
 Ware

 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Incremental snapshot size

2009-09-30 Thread Brian Hubbleday
I am looking to use OpenSolaris/ZFS to create an iSCSI SAN to provide storage 
for a collection of virtual systems and replicate to an offsite device.

While testing the environment I was surprised to see the size of the 
incremental snapshots, which I need to send/receive over a WAN connection, 
considering this is supposed to be block level replication. Having run several 
tests making minor changes to large files, I now see what is happening. When I 
use a text editor to modify a file, the whole file is written back to disk and 
so the snapshot includes every written block, whether that block contains the 
same information as before or not.

Would it be possible to develop the incremental snapshot process so that it 
only contains blocks whose data actually changed, rather than every written 
block? Certainly in my environment, where we have large files (500 MB), the 
amount of data sent over the WAN would be drastically reduced.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Would ZFS work for a high-bandwidth video SAN?

2009-09-30 Thread Frank Middleton

On 09/29/09 10:23 PM, Marc Bevand wrote:


If I were you I would format every 1.5TB drive like this:
* 6GB slice for the root fs


As noted in another thread, 6GB is way too small. Based on
actual experience, an upgradable rpool must be more than
20GB. I would suggest at least 32GB; out of 1.5TB that's
still negligible. Recent release notes for image-update say
that at least 8GB free is required for an update. snv111b
as upgraded from a CD-installed image takes over 11GB without
any user applications like Firefox. Note also that a nominal
1.5TB drive really only has 1.36TB of actual space as reported
by zfs.

Can't speak to the 12-way mirror idea, but if you go this
route you might keep some slices for rpool backups. I have
found having a disk with such a backup invaluable...

How do you plan to do backups in general?

Cheers -- Frank

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Desire simple but complete copy - How?

2009-09-30 Thread paul
 It appears that I have waded into a quagmire.  Every option I can find
 (cpio, tar (Many versions!), cp, star, pax) has issues.  File size and
 filename or path length, and ACLs are common shortfalls.  Surely there is
 an easy answer he says naively!

 I simply want to copy one zfs filesystem tree to another, replicating it
 exactly.  Times, Permissions, hard links, symbolic links, sparse file
 holes, ACLs, extended attributes, and anything I don't know about.

 Can you give me a commandline with parameters?  I will then study what
 they mean.


Have you ruled out using 'zfs send' / 'zfs receive' for some reason? And
have you looked at rsync? I generally find rsync to be the easiest and
most reliable tool for replicating directory structures. You may want to
look at the GNU version of rsync (available at www.sunfreeware.com and
elsewhere).
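
For example, something like this (only a sketch; -H keeps hard links and -S
handles sparse files, while the -A/-X ACL and xattr options exist only in
newer rsync builds and still don't map ZFS/NFSv4 ACLs):

   rsync -aHSAX /tank/source/ /tank/copy/

If the ZFS ACLs must survive exactly, zfs send/receive is probably still the
safer route.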

Paul

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Would ZFS work for a high-bandwidth video SAN?

2009-09-30 Thread paul
 Also, one of those drives will need to be the boot drive. (Even if it's
 possible I don't want to boot from the data dive, need to keep it focused
 on video storage.) So it'll end up being 11 drives in the raid-z.
 --
 This message posted from opensolaris.org



FWIW, most enclosures like the ones we have been discussing lately have an
internal bay for a boot/OS drive--so you'll probably have all 12 hot-swap
bays available for data drives.

Paul

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Help importing pool with offline disk

2009-09-30 Thread paul
 One of the disks in my RAIDZ array was behaving oddly (lots of bus errors)
 so I took it offline to replace it. I shut down the server, put in the
 replacement disk, and rebooted. Only to discover that a different drive
 had chosen that moment to fail completely. So I replace the failing (but
 not yet failed) drive and try and import the pool. Failure, because that
 disk is marked offline. Is there any way to recover from this?

 System was running b118. Booting off my OS into single user mode causes
 the system to become extremely unhappy (any zfs command hangs the system
 for a very long time, and I get an error about being out of VM)... Booting
 of the osol live CD gives me:

   pool: media
 id: 4928877878517118807
  state: UNAVAIL
 status: The pool was last accessed by another system.
 action: The pool cannot be imported due to damaged devices or data.
see: http://www.sun.com/msg/ZFS-8000-EY
 config:

 media   UNAVAIL  insufficient replicas
   raidz1UNAVAIL  insufficient replicas
 c7t5d0  UNAVAIL  cannot open
 c7t2d0  ONLINE
 c7t4d0  ONLINE
 c7t3d0  ONLINE
 c7t0d0  OFFLINE
 c7t7d0  ONLINE
 c7t1d0  ONLINE
 c7t6d0  ONLINE
 --
 This message posted from opensolaris.org


zpool online media c7t0d0

Paul


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Incremental snapshot size

2009-09-30 Thread Richard Elling

On Sep 30, 2009, at 5:48 AM, Brian Hubbleday wrote:

I am looking to use Opensolaris/ZFS to create an iscsi SAN to  
provide storage for a collection of virtual systems and replicate to  
an offiste device.


While testing the environment I was surprised to see the size of the  
incremental snapshots, which I need to send/receive over a WAN  
connection, considering this is supposed to be block level  
replication. Having run several tests making minor changes to large  
files, I now see what is happening. When I use a text editor to  
modify a file, the whole file is written back to disk and so the  
snapshot includes every written block, whether that block contains  
the same information as before or not.


Yep, that is how most text editors work.

Would it be possible to develop the incremental snapshot process so  
that they only contain changed written blocks rather than every  
written block. Certainly in my environment where we have large files  
(500mb), the effect upon what is sent over the WAN would be  
drastically reduced.


That is how snapshots work. But your application (text editor) writes  
new data.

Maybe you can find another way to edit the files.
 -- richard

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Incremental snapshot size

2009-09-30 Thread Casper . Dik

On Sep 30, 2009, at 5:48 AM, Brian Hubbleday wrote:

 I am looking to use Opensolaris/ZFS to create an iscsi SAN to  
 provide storage for a collection of virtual systems and replicate to  
 an offiste device.

 While testing the environment I was surprised to see the size of the  
 incremental snapshots, which I need to send/receive over a WAN  
 connection, considering this is supposed to be block level  
 replication. Having run several tests making minor changes to large  
 files, I now see what is happening. When I use a text editor to  
 modify a file, the whole file is written back to disk and so the  
 snapshot includes every written block, whether that block contains  
 the same information as before or not.

Yep, that is how most text editors work.

And dedup will not help you in that case either; unless you append to the 
end or insert a 128K block in the middle of the file, the blocks
themselves will all be different.

 Would it be possible to develop the incremental snapshot process so  
 that they only contain changed written blocks rather than every  
 written block. Certainly in my environment where we have large files  
 (500mb), the effect upon what is sent over the WAN would be  
 drastically reduced.

That is how snapshots work. But your application (text editor) writes  
new data.
Maybe you can find another way to edit the files.

What type of changes are being made?

Casper

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Incremental snapshot size

2009-09-30 Thread Brian Hubbleday
I took binary dumps of the snapshots taken in between the edits, and this showed 
that there was actually very little change in the block structure; however, the 
incremental snapshots were very large. So the conclusion I draw from this is 
that the snapshot simply contains every written block since the last snapshot, 
regardless of whether the data in the block has changed or not. 

Okay, so snapshots work this way; I'm simply suggesting that things could be 
better.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Incremental snapshot size

2009-09-30 Thread Brian Hubbleday
Just realised I missed a rather important word out there, that could confuse.

So the conclusion I draw from this is that the --incremental-- snapshot simply 
contains every written block since the last snapshot regardless of whether the 
data in the block has changed or not.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] poor man's Drobo on FreeNAS

2009-09-30 Thread Eugen Leitl

Somewhat hairy, but interesting. FYI.

https://sourceforge.net/apps/phpbb/freenas/viewtopic.php?f=97&t=1902

-- 
Eugen* Leitl <a href="http://leitl.org">leitl</a> http://leitl.org
__
ICBM: 48.07100, 11.36820 http://www.ativel.com http://postbiota.org
8B29F6BE: 099D 78BA 2FD3 B014 B08A  7779 75B0 2443 8B29 F6BE
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Incremental snapshot size

2009-09-30 Thread Brian Hubbleday
I had a 50 MB ZFS volume that was an iSCSI target. This was mounted into a 
Windows system (NTFS) and shared on the network. I used notepad.exe on a remote 
system to add/remove a few bytes at the end of a 25 MB file.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Comments on home OpenSolaris/ZFS server

2009-09-30 Thread David Dyer-Bennet

On Wed, September 30, 2009 07:14, Thomas Burgess wrote:
 For the money, it's a much better option. you'll be able to afford many
 more drives.  In my opinion, for a home system, the more you can save on
 the
 case and power supply, the more hard drives you can buy.  Right now 1 TB
 and
 1.5 TB drives seem to be the best.  I used 1 TB drives and 2 compact flash
 cards for the os (with sata to compact flash adapters)  They are really
 small and easy to find a place to mount, which allows you to use the
 hotswap bays for even more storage.

I've been running a home ZFS server for a while now; mine currently has
two two-way mirrors of 400GB disks, i.e. 800GB usable data space.  I've
got a couple hundred GB free currently.  This server holds my music
collection, plus my digital photography, plus what's scanned of my film
photography, plus my ebook collection, plus the usual random personal
files &c.  And it serves as a backup pool for several laptops.

I can see that people heavily active in live audio or (especially) video
recording would fill disks considerably faster than my still photography
does (about 12MB per image, before I start editing it and storing extra
copies).  But I have to say that I'm finding the size NAS boxes people are
building for what they call home use to be rather startling.  I'm using
4 400GB disks with 100% redundancy; lots of people are talking about using
8 or more 1TB or bigger disks with 25% redundancy.  That's a hugely bigger
pool!  Do you actually fill up that space?  With what?

I've got 8 hot-swap bays but only 6 controller channels on the
motherboard.  And I'm using two of those for the boot disks.  I've thought
about going away from rotating disks for boot, to something like CF cards,
or USB.  USB is slow, but will that hurt me any when the system is being a
file server?

What going to USB does for me is free up two SATA controllers, so I can
expand my pool without buying another controller and messing about inside
the box.  Also, I don't need a mirrored pool for boot if it's a cheap USB
drive and I can keep a spare copy or two, and just swap them if there's
any problem with the first one.

-- 
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Desire simple but complete copy - How?

2009-09-30 Thread David Dyer-Bennet

On Wed, September 30, 2009 08:21, p...@paularcher.org wrote:
 It appears that I have waded into a quagmire.  Every option I can find
 (cpio, tar (Many versions!), cp, star, pax) has issues.  File size and
 filename or path length, and ACLs are common shortfalls.  Surely there
 is
 an easy answer he says naively!

 I simply want to copy one zfs filesystem tree to another, replicating it
 exactly.  Times, Permissions, hard links, symbolic links, sparse file
 holes, ACLs, extended attributes, and anything I don't know about.

 Can you give me a commandline with parameters?  I will then study what
 they mean.


 Have you ruled out using 'zfs send' / 'zfs receive' for some reason? And
 have you looked at rsync? I generally find rsync to be the easiest and
 most reliable tool for replicating directory structures. You may want to
 look at the GNU version of rsync (available at www.sunfreeware.com and
 elsewhere).

I had to discard an rsync-based backup scheme when I went to CIFS; rsync
doesn't handle extended attributes and ACLs, which CIFS uses.

And I haven't been able to make incremental replication send/receive work.
 Supposed to be working on that, but now I'm having trouble getting a
VirtualBox install that works (my real NAS is physical, but I'm using
virtual systems to test things).
-- 
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Comments on home OpenSolaris/ZFS server

2009-09-30 Thread Thomas Burgess
On Wed, Sep 30, 2009 at 10:48 AM, David Dyer-Bennet d...@dd-b.net wrote:


 On Wed, September 30, 2009 07:14, Thomas Burgess wrote:
  For the money, it's a much better option. you'll be able to afford many
  more drives.  In my opinion, for a home system, the more you can save on
  the
  case and power supply, the more hard drives you can buy.  Right now 1 TB
  and
  1.5 TB drives seem to be the best.  I used 1 TB drives and 2 compact
 flash
  cards for the os (with sata to compact flash adapters)  They are really
  small and easy to find a place to mount, which allows you to use the
  hotswap bays for even more storage.

 I've been running a home ZFS server for a while now; mine currently has
 two two-way mirrors of 400GB disks, i.e. 800GB usable data space.  I've
 got a couple hundred GB free currently.  This server holds my music
 collection, plus my digital photography, plus what's scanned of my film
 photography, plus my ebook collection, plus the usual random personal
 files c.  And it serves as a backup pool for several laptops.

 I can see that people heavily active in live audio or (especially) video
 recording would fill disks considerably faster than my still photography
 does (about 12MB per image, before I start editing it and storing extra
 copies).  But I have to say that I'm finding the size NAS boxes people are
 building for what they call home use to be rather startling.  I'm using
 4 400GB disks with 100% redundancy; lots of people are talking about using
 8 or more 1TB or bigger disks with 25% redundancy.  That's a hugely bigger
 pool!  Do you actually fill up that space?  With what?

 I've got 8 hot-swap bays but only 6 controller channels on the
 motherboard.  And I'm using two of those for the boot disks.  I've thought
 about going away from rotating disks for boot, to something like CF cards,
 or USB.  USB is slow, but will that hurt me any when the system is being a
 file server?

 What going to USB does for me is free up two SATA controllers, so I can
 expand my pool without buying another controller and messing about inside
 the box.  Also, I don't need a mirrored pool for boot if it's a cheap USB
 drive and I can keep a spare copy or two, and just swap them if there's
 any problem with the first one.

 I fill mine up with TV shows and movies.  I have a LOT of HD stuff, 1080p
and 720p.

A 1080p movie can take up from 8 GB to 20 GB depending on encoding.  I'm a
digital packrat.  I've replaced cable with a ZFS-backed network of HTPCs
running XBMC on the ionitx boards.  Each HTPC uses about 30 watts of power at
peak and does 1080p without a problem.  Each box also has a DVD player in it
if we want to watch an old DVD.  I also have rtorrent running, using RSS to
grab all the new shows, which normally show up a few minutes to an hour
after they air.  I've got them set to automatically sort and land in the
correct spot.  It's easy to fill up many TBs of space with whole seasons
of 720p and 1080p TV, and hundreds of movies.  Using XBMC and some of the
wonderful skins you can make some amazing alternatives to cable.  I just got
tired of channel surfing.  Also, I use my multi-TB system to run rsync
backups on all the computers I care about.  Snapshots allow me to return to
any day.

I'm also saving right now to build a backup system to have a second copy of
the stuff I don't want to lose.  I also try not to go over 50-70% full;
when I get that full I start looking at ways to upgrade.  I started with
Linux and an XFS-based system on a single TB drive and just kept expanding
it... when I found out about ZFS I knew that was the way to go.  I can't
stand to delete the stuff I have unless I find better copies... so as long as
there is new stuff coming out, I'll probably keep expanding my system.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Desire simple but complete copy - How?

2009-09-30 Thread Robert Thurlow

David Dyer-Bennet wrote:


And I haven't been able to make incremental replication send/receive work.
 Supposed to be working on that, but now I'm having trouble getting a
VirtualBox install that works (my real NAS is physical, but I'm using
virtual systems to test things).


I've had good success practicing and debugging zfs stuff
by creating small pools based on files and tinkering with
those, e.g.

# mkfile 100m /root/zpool_test1
# zpool create test1 /root/zpool_test1
# mkfile 100m /root/zpool_test2
# zpool create test2 /root/zpool_test2

This can get you a source of non-production data on a
real server.
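
From there you can walk through the exact incremental sequence you are
debugging, e.g. (a rough sketch with made-up snapshot and dataset names):

# zfs snapshot test1@snap1
# zfs send test1@snap1 | zfs receive test2/copy
# touch /test1/newfile
# zfs snapshot test1@snap2
# zfs send -i test1@snap1 test1@snap2 | zfs receive -F test2/copy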

Rob T
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Comments on home OpenSolaris/ZFS server

2009-09-30 Thread erik.ableson

Heh :-)  Disk usage is directly related to available space.

At home I have a 4x1Tb raidz filled to overflowing with music, photos,  
movies, archives, and backups for 4 other machines in the house. I'll  
be adding another 4 and an SSD shortly.


It starts with importing CDs into iTunes or WMP, then comes the TV  
recordings, then comes ripping your DVD collection... Hey disk is  
cheap, right?


Once you have gotten out  of the habit of using shiny discs for music,  
video is a logical progression. You also stop being finicky about  
minimizing file space - I've gone from high quality mp3 to lossless  
formats.


I also have some colleagues that have Flip Minos and equivalents that  
capture 720p video, and that just chews through disk space. Those 12MB  
shots of baby taking his/her first steps are now multi-gigabyte raw  
video files.


Trust me, it's easy.

Erik

On 30 sept. 2009, at 16:48, David Dyer-Bennet wrote:

I can see that people heavily active in live audio or (especially)  
video
recording would fill disks considerably faster than my still  
photography
does (about 12MB per image, before I start editing it and storing  
extra
copies).  But I have to say that I'm finding the size NAS boxes  
people are
building for what they call home use to be rather startling.  I'm  
using
4 400GB disks with 100% redundancy; lots of people are talking about  
using
8 or more 1TB or bigger disks with 25% redundancy.  That's a hugely  
bigger

pool!  Do you actually fill up that space?  With what?


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Incremental snapshot size

2009-09-30 Thread Ross Walker

On Sep 30, 2009, at 10:40 AM, Brian Hubbleday b...@delcam.com wrote:

Just realised I missed a rather important word out there, that could  
confuse.


So the conclusion I draw from this is that the --incremental--  
snapshot simply contains every written block since the last snapshot  
regardless of whether the data in the block has changed or not.


It's because ZFS is a COW file system so each block written is a new  
block.


-Ross

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Incremental snapshot size

2009-09-30 Thread Scott Meilicke
It is more cost, but a WAN Accelerator (Cisco WAAS, Riverbed, etc.) would be a 
big help.

Scott
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Desire simple but complete copy - How?

2009-09-30 Thread David Dyer-Bennet

On Wed, September 30, 2009 10:07, Robert Thurlow wrote:
 David Dyer-Bennet wrote:

 And I haven't been able to make incremental replication send/receive
 work.
  Supposed to be working on that, but now I'm having trouble getting a
 VirtualBox install that works (my real NAS is physical, but I'm using
 virtual systems to test things).

 I've had good success practicing and debugging zfs stuff
 by creating small pools based on files and tinkering with
 those, e.g.

 # mkfile 100m /root/zpool_test1
 # zpool create test1 /root/zpool_test1
 # mkfile 100m /root/zpool_test2
 # zpool create test2 /root/zpool_test2

 This can get you a source of non-production data on a
 real server.

That's where I started, and it's useful.  I left out a bit above -- I had
proven to my satisfaction that incremental replication streams don't work
in the software version on my server.  I'm trying to test current
versions, before I commit to upgrading the pools and/or filesystems with
the live data in them.
-- 
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Would ZFS work for a high-bandwidth video SAN?

2009-09-30 Thread Orvar Korvar
Many sysadmins recommend raidz2. The reason is, if a drive breaks and you have 
to rebuild your array, it will take a long time with a large drive. With a 4TB 
drive or larger, it could take a week to rebuild your array! During that week, 
there will be heavy load on the rest of the drives, which may break another 
drive - and all your data is lost. Are you willing to risk that? With 2TB 
drives, it might take 24h or more. 

I will soon be migrating to raidz2 because I expect to swap all my drives for 
larger and larger ones - first 2TB drives, then 4TB drives - while keeping the 
same number of drives in my array. When I get to 4TB drives, I will be glad I 
opted for raidz2.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] poor man's Drobo on FreeNAS

2009-09-30 Thread Scott Meilicke
Requires a login...
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] poor man's Drobo on FreeNAS

2009-09-30 Thread Thomas Burgess
just remove the s in https:// and you can read it

On Wed, Sep 30, 2009 at 12:11 PM, Scott Meilicke 
scott.meili...@craneaerospace.com wrote:

 Requires a login...
 --
 This message posted from opensolaris.org
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Would ZFS work for a high-bandwidth video SAN?

2009-09-30 Thread Marc Bevand
Frank Middleton f.middleton at apogeect.com writes:
 
 As noted in another thread, 6GB is way too small. Based on
 actual experience, an upgradable rpool must be more than
 20GB.

It depends on how minimal your install is.

The OpenSolaris install instructions recommend 8GB minimum, and I have
one OpenSolaris 2009.06 server using about 4GB, so I thought 6GB
would be sufficient. That said, I have never upgraded the rpool of
this server, but based on your comments I would recommend an rpool
of 15GB to the original poster.

-mrb

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Incremental snapshot size

2009-09-30 Thread erik.ableson
Depending on the data content that you're dealing with, you can compress the  
snapshots inline with the send/receive operations by piping the data  
through gzip.  Given that we've been talking about 500 MB text files,  
this seems to be a very likely solution. There was some mention in the  
Kernel Keynote in Australia of inline deduplication, i.e.  
compression :-), in the zfs send stream. But there remains the question  
of references to deduplicated blocks that no longer exist on the  
destination.
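
Something along these lines, for example (only a sketch; the dataset,
snapshot and host names are made up):

   zfs send -i tank/vol@mon tank/vol@tue | gzip -c | \
       ssh offsite 'gunzip -c | zfs receive tank/vol'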


Note that ZFS deduplication will eventually help in diminishing the  
overall volume you have to handle: while the output of the  
text editor will go to different physical blocks, many of those blocks  
will be identical to previously stored blocks (which will also be kept,  
since they exist in snapshots), so the send/receive operations  
will consist of far more block references and fewer complete blocks.


Erik

PS - this is pretty much the operational mode of all products that use  
snapshots.  It's even worse on a lot of other storage systems where  
the snapshot content must be written to a specific reserved volume  
(which is often very small compared to the main data store) rather  
than the host pool. Until deduplication becomes the standard method of  
managing blocks, the volume of data required by this use case will not  
change.


On 30 sept. 2009, at 16:35, Brian Hubbleday wrote:

I took binary dumps of the snapshots taken in between the edits and  
this showed that there was actually very little change in the block  
structure, however the incremental snapshots were very large. So the  
conclusion I draw from this is that the snapshot simply contains  
every written block since the last snapshot regardless of whether  
the data in the block has changed or not.


Okay so snapshots work this way, I'm simply suggesting that things  
could be better.


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Problem: ZFS Partition rewriten, how to recover data???

2009-09-30 Thread Darko Petkovski
I had a ZFS partition, about 1.37 TB, written using zfs113 for Mac. Then,
under FreeBSD 7.2, following a guide on the wiki, I ran 'zpool
create trunk', which ended up overwriting the partition. Now the question is
how to recover the partition, or to recover the data from it? Thanks


  ___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Help importing pool with offline disk

2009-09-30 Thread Carson Gaspar
 zpool online media c7t0d0

j...@opensolaris:~# zpool online media c7t0d0
cannot open 'media': no such pool

Already tried that ;-)
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Best way to convert checksums

2009-09-30 Thread Ray Clark
I made a typo... I only have one pool.  I should have typed:

   zfs snapshot zfs01/h...@before
   zfs send zfs01/h...@before | zfs receive zfs01/home.sha256

Does that change the answer?

And independently of whether it does or not, zfs01 is a pool, and the property 
is on the home zfs file system.

I cannot change it on the file system before doing the receive because the file 
system does not exist - it is created by the receive.

This raises a related question of whether the file system on the receiving end 
is ALL created using the checksum property from the source file system, or if 
the blocks and their present mix of checksums are faithfully recreated in the 
received file system?

Finally, is there any way to verify behavior after it is done?

Thanks for helping on this.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Best way to convert checksums

2009-09-30 Thread Darren J Moffat

Ray Clark wrote:

I made a typo... I only have one pool.  I should have typed:

   zfs snapshot zfs01/h...@before
   zfs send zfs01/h...@before | zfs receive zfs01/home.sha256

Does that change the answer?


No it doesn't change my answer


And independently if it does or not, zfs01 is a pool, and the property is on 
the home zfs file system.


It doesn't matter if zfs01 is the top-level dataset or not.

Before you do the receive do this:

zfs set checksum=sha256 zfs01
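
Afterwards you can confirm the property took effect with something like:

   zfs get -r checksum zfs01

Blocks written before the change keep their old checksum; only newly written
blocks, which includes everything the receive writes, use sha256.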

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Would ZFS work for a high-bandwidth video SAN?

2009-09-30 Thread Frank Middleton

On 09/30/09 12:59 PM, Marc Bevand wrote:


It depends on how minimal your install is.


Absolutely minimalist install from live CD subsequently updated
via pkg to snv111b. This machine is an old 32 bit PC used now
as an X terminal, so it doesn't need any additional software. It
now has a bigger slice of a larger pair of disks :-). snv122
also takes around 11GB after emptying /var/pkg/download.

# uname -a
SunOS host8 5.11 snv_111b i86pc i386 i86pc Solaris
# df -h
FilesystemSize  Used Avail Use% Mounted on
rpool/ROOT/opensolaris-2
   34G   13G   22G  37% /


There's around 765MB in /var/pkg/download that could be deleted,
and 1GB's worth of snapshots left by previous image-updates,
bringing it down to around 11GB, consistent with a minimalist
SPARC snv122 install with /var/pkg/download emptied and all but
the current BE and all snapshots deleted.


The OpenSolaris install instructions recommend 8GB minimum, I have


It actually says 8GB free space required. This is on top of the
space used by the base installation. This 8GB makes perfect sense
when you consider that the baseline has to be snapshotted, and
new code has to be downloaded and installed in a way that can be
rolled back. I can't explain why the snv111b baseline is 11GB vs.
the 6GB of the initial install, but this was a default install
followed by default image-updates.


one OpenSolaris 2009.06 server using about 4GB, so I thought 6GB
would be sufficient. That said I have never upgraded the rpool of
this server, but based on your commends I would recommend an rpool
of 15GB to the original poster.


The absolute minimum for an upgradable rpool is 20GB, for both
SPARC and X86. This assumes you religiously purge all unnecessary
files (such as /var/pkg/download) and keep swap, /var/dump,
/var/crash and /opt on another disk. You *really* don't want to
run out of space doing an image-update. The result is likely
to require a restore from backup of the rpool, or at best, loss
of some space that seems to vanish down a black hole.

Technically, the rpool was recovered from a baseline snapshot
several times onto a 20GB disk until I figured out empirically
that 8GB of free space was required for the image-update. I
really doubt your mileage will vary. Prudence says that 32GB
is much safer...

Cheers -- Frank


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] receive restarting a resilver

2009-09-30 Thread Ian Collins

I have a raidz2 pool on an x4500 running Solaris 10 update 7.

One of the drives has been replaced with a spare (too many errors), but 
the resilver restarts every time data is replicated to the pool with zfs 
receive.


I thought this problem was fixed long ago?

--
Ian.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Help importing pool with offline disk

2009-09-30 Thread Kees Nuyt
On Wed, 30 Sep 2009 11:01:13 PDT, Carson Gaspar
carson.gas...@gmail.com wrote:

 zpool online media c7t0d0

j...@opensolaris:~# zpool online media c7t0d0
cannot open 'media': no such pool

Already tried that ;-)

Perhaps you can try some subcommand of cfgadm to get c7t0d0
online, then import the pool again?
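
Something along these lines, perhaps (the exact ap_id is just a guess; take
it from the "cfgadm -al" output):

   # cfgadm -al
   # cfgadm -c configure c7::dsk/c7t0d0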
-- 
  (  Kees Nuyt
  )
c[_]
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Help importing pool with offline disk

2009-09-30 Thread paul
 zpool online media c7t0d0

 j...@opensolaris:~# zpool online media c7t0d0
 cannot open 'media': no such pool

 Already tried that ;-)
 --
 This message posted from opensolaris.org


D'oh! Of course, I should have been paying attention to the fact that the
pool wasn't imported.
My guess is that if you move /etc/zfs/zpool.cache out of the way, then
reboot, ZFS will have to figure out what disks are out there again, find
your disk, and realize it is online.

Paul

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Help importing pool with offline disk

2009-09-30 Thread Carson Gaspar
  zpool online media c7t0d0
 
  j...@opensolaris:~# zpool online media c7t0d0
  cannot open 'media': no such pool
 
  Already tried that ;-)
  --
  This message posted from opensolaris.org
 
 
 D'oh! Of course, I should have been paying attention
 to the fact that the
 pool wasn't imported.
 My guess is that if you move /etc/zfs/zfs.cache out
 of the way, then
 reboot, ZFS will have to figure out what disks are
 out there again, find
 your disk, and realize it is online.

Sadly, no. Booting off the OpenSolaris LiveCD (which has no cache) doesn't 
help. The offline nature of the disk must be in the ZFS data on the disks 
somewhere...

-- 
Carson
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Help importing pool with offline disk

2009-09-30 Thread Carson Gaspar
 On Wed, 30 Sep 2009 11:01:13 PDT, Carson Gaspar
 carson.gas...@gmail.com wrote:
 
  zpool online media c7t0d0
 
 j...@opensolaris:~# zpool online media c7t0d0
 cannot open 'media': no such pool
 
 Already tried that ;-)
 
 Perhaps you can try some subcommand of cfgadm to get
 c7t0d0
 online, then import the pool again?

cfgadm is happy - the offline problem is in ZFS somewhere

c7::dsk/c7t0d0 disk connectedconfigured   unknown
c7::dsk/c7t1d0 disk connectedconfigured   unknown
c7::dsk/c7t2d0 disk connectedconfigured   unknown
c7::dsk/c7t3d0 disk connectedconfigured   unknown
c7::dsk/c7t4d0 disk connectedconfigured   unknown
c7::dsk/c7t6d0 disk connectedconfigured   unknown
c7::dsk/c7t7d0 disk connectedconfigured   unknown

-- 
Carson
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Help importing pool with offline disk

2009-09-30 Thread Victor Latushkin

Carson Gaspar wrote:

zpool online media c7t0d0

j...@opensolaris:~# zpool online media c7t0d0
cannot open 'media': no such pool

Already tried that ;-)
--
This message posted from opensolaris.org



D'oh! Of course, I should have been paying attention
to the fact that the
pool wasn't imported.
My guess is that if you move /etc/zfs/zfs.cache out
of the way, then
reboot, ZFS will have to figure out what disks are
out there again, find
your disk, and realize it is online.


Sadly, no. Booting off the OpenSolaris LiveCD (which has no cache) doesn't help. The 
offline nature of the disk must be in the ZFS data on the disks somewhere...



is zdb happy with your pool?

Try e.g.

zdb -eud poolname

Victor

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] KCA ZFS keynote available

2009-09-30 Thread Cyril Plisko
On Tue, Sep 29, 2009 at 11:46 PM, Cyril Plisko
cyril.pli...@mountall.com wrote:
 On Tue, Sep 29, 2009 at 11:12 PM, Henrik Johansson henr...@henkis.net wrote:
 Hello everybody,
 The KCA ZFS keynote by Jeff and Bill seems to be available online now:
 http://blogs.sun.com/video/entry/kernel_conference_australia_2009_jeff
 It should probably be mentioned here; I might have missed it.

 Funny voices. Is it me, or is it just a Darth Vader theme?

Apparently it is me, or rather my mis-configured audiohd interrupt
setup. (Note to myself to always run zfs snapshot before tinkering
with drivers.conf files)

-- 
Regards,
Cyril
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] bigger zfs arc

2009-09-30 Thread Chris Banal
We have a production server which does nothing but NFS from ZFS. This
particular machine has plenty of free memory. Blogs and documentation state
that ZFS will use as much memory as is necessary, but how is necessary
calculated? If the memory is free and unused, would it not be beneficial to
grow the ARC beyond that calculated size, even if the extra cache isn't
likely to get hit often? When an L2ARC is attached, does it get used if
there is no memory pressure?
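
(For reference, here is roughly what I have been looking at so far; the
/etc/system tunable below is the commonly cited zfs_arc_max cap with an
example value, not something I am asserting is the right knob here:)

# kstat -p zfs:0:arcstats:size zfs:0:arcstats:c zfs:0:arcstats:c_max
  (size = current ARC bytes, c = current target, c_max = upper limit)

* in /etc/system, to cap the ARC at 4 GB (value in bytes); needs a reboot
set zfs:zfs_arc_max = 0x100000000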

Thanks,
Chris
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Best way to convert checksums

2009-09-30 Thread Ray Clark
Dynamite!

I don't feel comfortable leaving things implicit.  That is how 
misunderstandings happen.  

Would you please acknowledge that zfs send | zfs receive uses the checksum
setting on the receiving pool, rather than preserving the checksum algorithm
the blocks were written with on the sending side?
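
(To make the question concrete, here is the kind of sequence I have in mind,
with made-up pool and dataset names; it is the checksum reported by the last
command that I want to be sure of:)

# zfs get checksum oldpool/fs newpool
# zfs snapshot oldpool/fs@move
# zfs send oldpool/fs@move | zfs receive newpool/fs
# zfs get checksum newpool/fs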

Thanks a million!
--Ray
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Incremental snapshot size

2009-09-30 Thread Toby Thain


On 30-Sep-09, at 10:48 AM, Brian Hubbleday wrote:

I had a 50 MB ZFS volume that was an iSCSI target. This was mounted
into a Windows system (NTFS) and shared on the network. I used
notepad.exe on a remote system to add/remove a few bytes at the end
of a 25 MB file.


I'm astonished that's even possible with notepad.

I agree with Richard: it looks like your workflow needs attention.
Making random edits to very large, remotely stored flat files with
super-simplistic tools seems in defiance of five decades of data
management technology...


--T


--
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Best way to convert checksums

2009-09-30 Thread Ray Clark
Sinking feeling...

zfs01 was originally created with fletcher2. Doesn't this mean that the
root-level metadata in the pool was written with fletcher2, and so is not as
well protected?

If so, is there a way to fix this short of a backup and restore?
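
(For what it's worth, this is how I have been checking the current settings;
my understanding, which may well be wrong, is that changing the property only
affects blocks written afterwards:)

# zfs get -r checksum zfs01          (SOURCE column shows local vs. inherited)
# zfs set checksum=fletcher4 zfs01   (new writes only; existing blocks keep
                                      whatever checksum they were written with)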
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] ZFS/CIFS file order

2009-09-30 Thread Frans ter Borg
hi,

I'm using a Sun Unified Storage 7410 cluster, on which we access CIFS shares 
from WinXP and Win2000 clients.

If we map a CIFS share on the 7410 to a drive letter on a WinXP client, we 
observe that when we do a `dir` from a DOS box on the mapped drive, the files 
are shown in a seemingly random order. The files in the directory are 
scans (processed through about 20 high-speed human operated scanners) and they 
have filenames that could look like this:

scan001.tif
scan002.tif
scan003.tif
scan004.tif
scan005.tif

The numbers represent the order that they are scanned in. The directory listing 
that we get when we type dir, will show something like

scan003.tif
scan002.tif
scan005.tif
scan001.tif
scan004.tif

While if we copy the files to an NTFS or FAT32 filesystem, we will see the 
files in the desired order.

Now, why is this a problem? Obviously the 7410 cluster hasn't been around for 
long. The scanners used to scan to several Windows Storage Servers on an FCAL 
SAN, which needed to be replaced. The data was copied from the old storage 
boxes to the 7410. All of the older applications that perform batch processing 
on the files in these directories rely on the order in which the OS gives them 
the filenames, which, when using CIFS to the Windows Storage Servers, was 
alphabetical, but now seems random. This frustrates the batch processing to an 
unacceptable level. We don't have the option of rewriting the legacy 
applications in the short term, especially since in many cases the source code 
is no longer available. 

Is there a way to force the CIFS server to serve directory listings in 
alphabetical order? Or can ZFS be forced to serve directory listings to the 
CIFS server in alphabetical order?

A similar, but not quite identical issue on ext3 was discussed in this thread:

http://lists.centos.org/pipermail/centos/2009-January/071152.html

which confirms that NTFS uses a B-tree sort before it serves a directory 
listing. It also confirms that the original poster was able to solve the 
problem by turning dir_index off with tune2fs.
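
For reference, the ext3-side fix described there boils down to something like
the following, run against an unmounted ext3 filesystem (the device name is a
placeholder); it obviously doesn't apply to the 7410 directly, but it shows
the kind of knob involved:

# tune2fs -O ^dir_index /dev/sdXN    (disable hashed directory indexing)
# e2fsck -fD /dev/sdXN               (rebuild and optimize the directories)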

Is there a solution that will let the ZFS/CIFS server combo behave in a 
comparable way ?

Many thanks,

Frans

PS: I'm aware that I may not alter system settings on the 7410s other than 
through the WebUI, but using your input I hope to be able to convince Sun 
Support to help us out there. I have not yet sent this problem description to 
Sun Support, but will do so shortly.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Incremental snapshot size

2009-09-30 Thread Robert Milkowski

Ross Walker wrote:

On Sep 30, 2009, at 10:40 AM, Brian Hubbleday b...@delcam.com wrote:

Just realised I left a rather important word out there, which could 
confuse.


So the conclusion I draw from this is that the --incremental-- 
snapshot simply contains every written block since the last snapshot 
regardless of whether the data in the block has changed or not.


It's because ZFS is a COW file system so each block written is a new 
block.


It doesn't matter whether it is COW or not here.
He is probably, in effect, writing a brand-new file to the file system.
All file systems (save perhaps some with de-dup) would behave the same 
here.
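
You can see this for yourself by measuring the incremental stream directly
(pool and volume names below are just an example):

# zfs snapshot tank/vol@before
  ... rewrite the 25 MB file from the client ...
# zfs snapshot tank/vol@after
# zfs send -i @before tank/vol@after | wc -c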



--
Robert Milkowski
http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Help importing pool with offline disk

2009-09-30 Thread Carson Gaspar

Victor Latushkin wrote:

Carson Gaspar wrote:

zpool online media c7t0d0

j...@opensolaris:~# zpool online media c7t0d0
cannot open 'media': no such pool

Already tried that ;-)
--
This message posted from opensolaris.org



D'oh! Of course, I should have been paying attention
to the fact that the
pool wasn't imported.
My guess is that if you move /etc/zfs/zpool.cache out
of the way, then
reboot, ZFS will have to figure out what disks are
out there again, find
your disk, and realize it is online.


Sadly, no. Booting off the OpenSolaris LiveCD (which has no cache) 
doesn't help. The offline nature of the disk must be in the ZFS data 
on the disks somewhere...




is zdb happy with your pool?

Try e.g.

zdb -eud poolname


I'm booted back into snv118 (booting with the damaged pool disks disconnected so 
the host would come up without throwing up). After hot plugging the disks, I get:


bash-3.2# /usr/sbin/zdb -eud media
zdb: can't open media: File exists

zpool status media is hanging, and top shows that I'm spending ~50% of CPU 
time in the kernel - I'll see what it says when it finally returns. Let me know 
if there's anything else I can do to help you help me, including giving you a 
login in the server.


--
Carson
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Help importing pool with offline disk

2009-09-30 Thread Carson Gaspar

Carson Gaspar wrote:

Victor Latushkin wrote:

Carson Gaspar wrote:



is zdb happy with your pool?

Try e.g.

zdb -eud poolname


I'm booted back into snv118 (booting with the damaged pool disks 
disconnected so the host would come up without throwing up). After hot 
plugging the disks, I get:


bash-3.2# /usr/sbin/zdb -eud media
zdb: can't open media: File exists

zpool status media is hanging, and top shows that I'm spending ~50% of 
CPU time in the kernel - I'll see what it says when it finally returns. 
Let me know if there's anything else I can do to help you help me, 
including giving you a login in the server.


OK, things are now different (possibly better?):

bash-3.2# /usr/sbin/zpool status media
  pool: media
 state: FAULTED
status: One or more devices could not be opened.  There are insufficient
replicas for the pool to continue functioning.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-3C
 scrub: none requested
config:

NAMESTATE READ WRITE CKSUM
media   FAULTED  0 0 1  corrupted data
  raidz1DEGRADED 0 0 6
c7t5d0  UNAVAIL  0 0 0  cannot open
c7t2d0  ONLINE   0 0 0
c7t4d0  ONLINE   0 0 0
c7t3d0  ONLINE   0 0 0
c7t0d0  ONLINE   0 0 0
c7t7d0  ONLINE   0 0 0
c7t1d0  ONLINE   0 0 0
c7t6d0  ONLINE   0 0 0

I suspect that an uberblock rollback might help me - googling all the references 
now, but if someone has any advice, I'd be grateful.
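
In the meantime I'm poking at the on-disk labels read-only with zdb, roughly
like this (assuming whole-disk vdevs, so the label sits on slice 0; adjust the
slice if your layout differs):

# zdb -l /dev/rdsk/c7t0d0s0   (dump the four vdev labels: pool_guid, txg, state)
# zdb -e -u media             (active uberblock of the exported pool)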


--
Carson
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Help importing pool with offline disk

2009-09-30 Thread Carson Gaspar

Carson Gaspar wrote:

Carson Gaspar wrote:

Victor Latushkin wrote:

Carson Gaspar wrote:



is zdb happy with your pool?

Try e.g.

zdb -eud poolname


I'm booted back into snv118 (booting with the damaged pool disks 
disconnected so the host would come up without throwing up). After hot 
plugging the disks, I get:


bash-3.2# /usr/sbin/zdb -eud media
zdb: can't open media: File exists

zpool status media is hanging, and top shows that I'm spending ~50% 
of CPU time in the kernel - I'll see what it says when it finally 
returns. Let me know if there's anything else I can do to help you 
help me, including giving you a login in the server.


OK, things are now different (possibly better?):

bash-3.2# /usr/sbin/zpool status media
  pool: media
 state: FAULTED
status: One or more devices could not be opened.  There are insufficient
replicas for the pool to continue functioning.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-3C
 scrub: none requested
config:

NAMESTATE READ WRITE CKSUM
media   FAULTED  0 0 1  corrupted data
  raidz1DEGRADED 0 0 6
c7t5d0  UNAVAIL  0 0 0  cannot open
c7t2d0  ONLINE   0 0 0
c7t4d0  ONLINE   0 0 0
c7t3d0  ONLINE   0 0 0
c7t0d0  ONLINE   0 0 0
c7t7d0  ONLINE   0 0 0
c7t1d0  ONLINE   0 0 0
c7t6d0  ONLINE   0 0 0

I suspect that an uberblock rollback might help me - googling all the 
references now, but if someone has any advice, I'd be grateful.


I'll also note that the kernel is certainly doing _something_ with my pool... 
from iostat -n -x 5:


extended device statistics
r/sw/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
   40.55.4 1546.40.0  0.0  0.30.07.5   0  19 c7t0d0
   40.55.4 1546.40.0  0.0  0.60.0   12.1   0  31 c7t1d0
   44.15.8 1660.80.0  0.0  0.40.07.6   0  21 c7t2d0
   41.95.4 1546.40.0  0.0  0.30.06.6   0  22 c7t3d0
   40.75.8 1546.40.0  0.0  0.50.09.9   0  25 c7t4d0
   40.35.4 1546.40.0  0.0  0.40.08.5   0  20 c7t6d0
   40.55.4 1546.40.0  0.0  0.40.07.9   0  23 c7t7d0

--
Carson

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Comments on home OpenSolaris/ZFS server

2009-09-30 Thread Brandon High
On Mon, Sep 28, 2009 at 1:12 PM, Ware Adams rwali...@washdcmail.com wrote:
 SuperMicro 7046A-3 Workstation
 http://supermicro.com/products/system/4U/7046/SYS-7046A-3.cfm

I'm using a SuperChassis 743TQ-865B-SQ for my home NAS, which is what
that workstation uses. It's very LARGE and very quiet. Did I mention
it's HUGE? I bought two more 2800 rpm fans for it; the case is designed
for four but, to keep noise down, only ships with two, and I didn't
notice an increase in sound after adding them. You can find the fans
(part # FAN-0104L4) online.

I think the dual socket board you chose is a bit overkill for just a
NAS box. I used an ASUS motherboard because I wanted to use AMD, and
went with a 4850e and 8GB ECC memory. It got me a board that supports
ECC and PCI-X slots (so I could use the AOC-SAT-MV8 board). I also
host some (mostly idle) VMs on the machine and they run fine.

Supermicro has a rack that fits five 3.5" drives into three 5.25" bays. This
doesn't leave space for an optical drive, but I used a USB drive to
install the OS and don't need it anymore.

-B

-- 
Brandon High : bh...@freaks.com
If it wasn't for pacifists, we could achieve peace.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Hot Space vs. hot spares

2009-09-30 Thread Brandon High
This might have been mentioned on the list already and I just can't find it
now, or I might have misread something and come up with this ...

Right now, using hot spares is a typical method to increase storage
pool resiliency, since it minimizes the time that an array is
degraded. The downside is that drives assigned as hot spares are
essentially wasted. They take up space & power but don't provide
usable storage.

Depending on the number of spares you've assigned, you could have 7%
of your purchased capacity idle, assuming 1 spare per 14-disk shelf.
This is on top of the RAID6 / raidz[1-3] overhead.

What about using the free space in the pool to cover for the failed drive?

With bp rewrite, would it be possible to rebuild the vdev from parity
and simultaneously rewrite those blocks to a healthy device? In other
words, when there is free space, remove the failed device from the
zpool, resizing (shrinking) it on the fly and restoring full parity
protection for your data. If online shrinking doesn't work, create a
phantom file that accounts for all the space lost by the removal of
the device until an export / import.

It's not something I'd want to do with less than raidz2 protection,
and I imagine that replacing the failed device and expanding the
stripe width back to the original would have some negative performance
implications that would not occur otherwise. I also imagine it would
take a lot longer to rebuild / resilver at both device failure and
device replacement. You wouldn't be able to share a spare among many
vdevs either, but you wouldn't always need to if you leave some space
free on the zpool.

Provided that bp rewrite is committed, and vdev & zpool shrinks are
functional, could this work? It seems like a feature most applicable
to SOHO users, but I'm sure some enterprise users could find an
application for nearline storage where available space trumps
performance.

-B

-- 
Brandon High : bh...@freaks.com
Always try to do things in chronological order; it's less confusing that way.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] receive restarting a resilver

2009-09-30 Thread Ian Collins
 I have a raidz2 pool on an x4500 running Solaris 10 update 7.
 
 One of the drives has been replaced with a spare (too many errors), but 
 the resilver restarts every time data is replicated
 to the pool with zfs receive.
 
 I thought this problem was fixed long ago?

The bug was reported as 6705765 which was closed as a duplicate of 6655927.  
Unfortunately, this bug only mentions and provides a workaround for zpool 
status.

Is the problem with zfs receive down to the same root cause?  If so, is there a 
workaround other than suspending replication to this pool?

I'd rather not do that, as this system is a fall-back backup server.

Thanks,

-- 
Ian.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Hot Space vs. hot spares

2009-09-30 Thread Tim Cook
On Wed, Sep 30, 2009 at 7:06 PM, Brandon High bh...@freaks.com wrote:

 I might have this mentioned already on the list and can't find it now,
 or I might have misread something and come up with this ...

 Right now, using hot spares is a typical method to increase storage
 pool resiliency, since it minimizes the time that an array is
 degraded. The downside is that drives assigned as hot spares are
 essentially wasted. They take up space & power but don't provide
 usable storage.

 Depending on the number of spares you've assigned, you could have 7%
 of your purchased capacity idle, assuming 1 spare per 14-disk shelf.
 This is on top of the RAID6 / raidz[1-3] overhead.

 What about using the free space in the pool to cover for the failed drive?

 With bp rewrite, would it be possible to rebuild the vdev from parity
 and simultaneously rewrite those blocks to a healthy device? In other
 words, when there is free space, remove the failed device from the
 zpool, resizing (shrinking) it on the fly and restoring full parity
 protection for your data. If online shrinking doesn't work, create a
 phantom file that accounts for all the space lost by the removal of
 the device until an export / import.

 It's not something I'd want to do with less than raidz2 protection,
 and I imagine that replacing the failed device and expanding the
 stripe width back to the original would have some negative performance
 implications that would not occur otherwise. I also imagine it would
 take a lot longer to rebuild / resilver at both device failure and
 device replacement. You wouldn't be able to share a spare among many
 vdevs either, but you wouldn't always need to if you leave some space
 free on the zpool.

 Provided that bp rewrite is committed, and vdev & zpool shrinks are
 functional, could this work? It seems like a feature most applicable
 to SOHO users, but I'm sure some enterprise users could find an
 application for nearline storage where available space trumps
 performance.

 -B

 --
 Brandon High : bh...@freaks.com
 Always try to do things in chronological order; it's less confusing that
 way.



What are you hoping to accomplish?  You're still going to need a drive's
worth of free space, and if you're so performance-strapped that one drive
makes the difference, you've got some bigger problems on your hands.

To me it sounds like complexity for complexity's sake, and leaving yourself
with a far less flexible option in the face of a drive failure.

BTW, you shouldn't need one spare per tray of 14 disks.  Unless you've got
some known bad disks or environmental issues, one for every 2-3 trays should
be fine.  Quite frankly, if you're doing raid-z3, I'd feel comfortable with
one per thumper.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Hot Space vs. hot spares

2009-09-30 Thread Erik Trimble

Brandon High wrote:

I might have this mentioned already on the list and can't find it now,
or I might have misread something and come up with this ...

Right now, using hot spares is a typical method to increase storage
pool resiliency, since it minimizes the time that an array is
degraded. The downside is that drives assigned as hot spares are
essentially wasted. They take up space & power but don't provide
usable storage.

Depending on the number of spares you've assigned, you could have 7%
of your purchased capacity idle, assuming 1 spare per 14-disk shelf.
This is on top of the RAID6 / raidz[1-3] overhead.

What about using the free space in the pool to cover for the failed drive?

With bp rewrite, would it be possible to rebuild the vdev from parity
and simultaneously rewrite those blocks to a healthy device? In other
words, when there is free space, remove the failed device from the
zpool, resizing (shrinking) it on the fly and restoring full parity
protection for your data. If online shrinking doesn't work, create a
phantom file that accounts for all the space lost by the removal of
the device until an export / import.

It's not something I'd want to do with less than raidz2 protection,
and I imagine that replacing the failed device and expanding the
stripe width back to the original would have some negative performance
implications that would not occur otherwise. I also imagine it would
take a lot longer to rebuild / resilver at both device failure and
device replacement. You wouldn't be able to share a spare among many
vdevs either, but you wouldn't always need to if you leave some space
free on the zpool.

Provided that bp rewrite is committed, and vdev & zpool shrinks are
functional, could this work? It seems like a feature most applicable
to SOHO users, but I'm sure some enterprise users could find an
application for nearline storage where available space trumps
performance.

-B

  
What you describe makes no sense for single-parity vdevs, since it 
actually increases the likelihood of data loss. In multi-parity vdevs, 
even with the loss of one drive you still have parity protection, 
so why go to all that extra effort, and what does it gain you?



From a global perspective, multi-disk parity (e.g. raidz2 or raidz3) is 
the way to go instead of hot spares. 

Hot spares are useful for adding protection to a number of vdevs, not a 
single vdev.


--
Erik Trimble
Java System Support
Mailstop:  usca22-123
Phone:  x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Hot Space vs. hot spares

2009-09-30 Thread Matthew Ahrens

Brandon,

Yes, this is something that should be possible once we have bp rewrite (the 
ability to move blocks around).  One minor downside to hot space would be 
that it couldn't be shared among multiple pools the way that hot spares can.


Also depending on the pool configuration, hot space may be impractical.  For 
example if you are using wide RAIDZ[-N] stripes.  If you have say 4 top-level 
RAIDZ-2 vdevs each with 10 disks in it, you would have to keep your pool at 
most 3/4 full to be able to take advantage of hot space.  And if you wanted 
to tolerate any 2 disks failing, the pool could be at most 1/2 full. 
(Although one could imagine eventually recombining some of the remaining 18 
good disks to make another RAIDZ group.)


So I imagine that with this implementation at least (remove faulted top-level 
vdev), Hot Space would only be practical when using mirroring.  That said, 
once we have (top-level) device removal implemented, you could implement a 
poor-man's hot space with some simple scripts -- just remove the degraded 
top-level vdev from the pool.


FYI, I am currently working on bprewrite for device removal.

--matt

Brandon High wrote:

I might have this mentioned already on the list and can't find it now,
or I might have misread something and come up with this ...

Right now, using hot spares is a typical method to increase storage
pool resiliency, since it minimizes the time that an array is
degraded. The downside is that drives assigned as hot spares are
essentially wasted. They take up space & power but don't provide
usable storage.

Depending on the number of spares you've assigned, you could have 7%
of your purchased capacity idle, assuming 1 spare per 14-disk shelf.
This is on top of the RAID6 / raidz[1-3] overhead.

What about using the free space in the pool to cover for the failed drive?

With bp rewrite, would it be possible to rebuild the vdev from parity
and simultaneously rewrite those blocks to a healthy device? In other
words, when there is free space, remove the failed device from the
zpool, resizing (shrinking) it on the fly and restoring full parity
protection for your data. If online shrinking doesn't work, create a
phantom file that accounts for all the space lost by the removal of
the device until an export / import.

It's not something I'd want to do with less than raidz2 protection,
and I imagine that replacing the failed device and expanding the
stripe width back to the original would have some negative performance
implications that would not occur otherwise. I also imagine it would
take a lot longer to rebuild / resilver at both device failure and
device replacement. You wouldn't be able to share a spare among many
vdevs either, but you wouldn't always need to if you leave some space
free on the zpool.

Provided that bp rewrite is committed, and vdev & zpool shrinks are
functional, could this work? It seems like a feature most applicable
to SOHO users, but I'm sure some enterprise users could find an
application for nearline storage where available space trumps
performance.

-B



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Hot Space vs. hot spares

2009-09-30 Thread Matthew Ahrens

Erik Trimble wrote:
 From a global perspective, multi-disk parity (e.g. raidz2 or raidz3) is 
the way to go instead of hot spares.
Hot spares are useful for adding protection to a number of vdevs, not a 
single vdev.


Even when using raidz2 or 3, it is useful to have hot spares so that 
reconstruction can begin immediately.  Otherwise it would have to wait for 
the operator to physically remove the failed disk and insert a new one.
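
(Adding one after the fact is cheap, and the same spare device can be listed
in more than one pool; the device name below is just an example:)

# zpool add tank spare c5t9d0
# zpool status tank           (the spare shows up in its own section as AVAIL)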


--matt
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Comments on home OpenSolaris/ZFS server

2009-09-30 Thread Jorgen Lundman


I too went with a 5-in-3 drive cage for HDDs, in a nice portable Mini-ITX case 
with an Intel Atom. It's more of a SOHO NAS for home use than a beast, but I 
can still get about 10 TB in it.


http://lundman.net/wiki/index.php/ZFS_RAID

I can also recommend the embeddedSolaris project for making a small bootable 
Solaris. It is very flexible, and you can put the admin GUIs and so on onto it.


https://sourceforge.net/projects/embeddedsolaris/

Lund

--
Jorgen Lundman   | lund...@lundman.net
Unix Administrator   | +81 (0)3 -5456-2687 ext 1017 (work)
Shibuya-ku, Tokyo| +81 (0)90-5578-8500  (cell)
Japan| +81 (0)3 -3375-1767  (home)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Desire simple but complete copy - How?

2009-09-30 Thread Fajar A. Nugraha
On Wed, Sep 30, 2009 at 10:54 PM, David Dyer-Bennet d...@dd-b.net wrote:
 On Wed, September 30, 2009 10:07, Robert Thurlow wrote:
 David Dyer-Bennet wrote:

 And I haven't been able to make incremental replication send/receive
 work.
  Supposed to be working on that, but now I'm having trouble getting a
 VirtualBox install that works

 I've had good success practicing and debugging zfs stuff
 by creating small pools based on files and tinkering with
 those, e.g.

 That's where I started, and it's useful.  I left out a bit above -- I had
 proven to my satisfaction that incremental replication streams don't work
 in the software version on my server.  I'm trying to test current
 versions, before I commit to upgrading the pools and/or filesystems with
 the live data in them.
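
(As an aside, for anyone who wants to try the file-backed-pool trick mentioned
above, it is roughly:)

# mkfile 128m /var/tmp/d0 /var/tmp/d1
# zpool create testpool mirror /var/tmp/d0 /var/tmp/d1
  ... create snapshots, practice zfs send -R / -I into a second scratch pool ...
# zpool destroy testpool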

Are you using x86 or sparc? solaris or opensolaris?
If opensolaris on x86, you can use xvm (xen) to achieve the same
functionality as virtualbox.
If sparc T series, you can use LDOM.

-- 
Fajar
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Desire simple but complete copy - How?

2009-09-30 Thread David Dyer-Bennet

Fajar A. Nugraha wrote:


Are you using x86 or sparc? solaris or opensolaris?
If opensolaris on x86, you can use xvm (xen) to achieve the same
functionality as virtualbox.
If sparc T series, you can use LDOM.
  


x86, OpenSolaris.  But I'm not terribly attracted to the idea of 
switching to another, less familiar, virtualization product in hopes 
that it will work.  I really rather expected Sun's virtualization 
product to run Sun's OS.


--
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Desire simple but complete copy - How?

2009-09-30 Thread Fajar A. Nugraha
On Thu, Oct 1, 2009 at 8:46 AM, David Dyer-Bennet d...@dd-b.net wrote:
 Fajar A. Nugraha wrote:

 x86, OpenSolaris.  But I'm not terribly attracted to the idea of switching
 to another, less familiar, virtualization product in hopes that it will
 work.  I really rather expected Sun's virtualization product to run Sun's
 OS.

xvm is part of opensolaris, so it's Sun's virtualization product :D

-- 
Fajar
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] True in U4? Tar and cpio...save and restore ZFS File attributes and ACLs

2009-09-30 Thread Ray Clark
Joerg, thanks.  As you (of all people) know, this area is quite a quagmire.  I 
am confident that I don't have any sparse files, or if I do, that they are 
small and losing this property would not be a big impact.  I have determined 
that none of the files have extended attributes or ACLs.  Some are greater 
than 4 GB and have long paths, but Sun tar supports both if I include the E 
option.  I am trusting that, because it is recommended in the ZFS Admin Guide, 
it is my safest option with respect to any ZFS idiosyncrasies, given its 
limitations.  If only those were documented!
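
For the copy itself, the pipeline I have in mind looks roughly like this
(source and target paths are placeholders; the p on extract is meant to
preserve modes, and whether I need anything more is exactly what I'm unsure
about):

# cd /source/fs && tar cEf - . | ( cd /target/fs && tar xpf - )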

My next problem is that I want to do an exhaustive file compare afterwards, and 
diff is not large-file aware.  
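
One option I'm considering for the compare is to checksum both trees and diff
the (small) checksum lists instead, roughly like this (digest -v prints the
file name next to the hash; paths are placeholders):

# cd /source/fs && find . -type f -exec digest -v -a md5 {} \; | sort > /var/tmp/src.md5
# cd /target/fs && find . -type f -exec digest -v -a md5 {} \; | sort > /var/tmp/dst.md5
# diff /var/tmp/src.md5 /var/tmp/dst.md5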

I always wonder how applications that run on every OS known to man, such as 
star, can possibly have the right code to work around the idiosyncrasies and 
exploit the capabilities of all of those OSes.  Should I consider star for the 
compare?  For the copy?  (Recognizing that it cannot do the ACLs, but I don't 
have those.)
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Help importing pool with offline disk

2009-09-30 Thread Carson Gaspar

Carson Gaspar wrote:
I'll also note that the kernel is certainly doing _something_ with my 
pool... from iostat -n -x 5:


extended device statistics
r/sw/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
   40.55.4 1546.40.0  0.0  0.30.07.5   0  19 c7t0d0
   40.55.4 1546.40.0  0.0  0.60.0   12.1   0  31 c7t1d0
   44.15.8 1660.80.0  0.0  0.40.07.6   0  21 c7t2d0
   41.95.4 1546.40.0  0.0  0.30.06.6   0  22 c7t3d0
   40.75.8 1546.40.0  0.0  0.50.09.9   0  25 c7t4d0
   40.35.4 1546.40.0  0.0  0.40.08.5   0  20 c7t6d0
   40.55.4 1546.40.0  0.0  0.40.07.9   0  23 c7t7d0


And now I know what:

bash-3.2# pgrep zfsdle | wc
   15198   15198   86454
bash-3.2# uname -a
SunOS gandalf.taltos.org 5.11 snv_118 i86pc i386 i86xpv

I see a few other folks reporting this, but no responses.

I don't see any bugs filed against this, but I know the search engine is 
differently coded...


--
Carson
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Comments on home OpenSolaris/ZFS server

2009-09-30 Thread Michael Shadle
I looked at possibly doing one of those too, but only 5 disks was too
small for me, and I was too nervous about compatibility with Mini-ITX
stuff.

On Wed, Sep 30, 2009 at 6:22 PM, Jorgen Lundman lund...@gmo.jp wrote:

 I too went with a 5in3 case for HDDs, in a nice portable Mini-ITX case, with
 Intel Atom. More of a SOHO NAS for home use, rather than a beast. Still, I
 can get about 10TB in it.

 http://lundman.net/wiki/index.php/ZFS_RAID

 I can also recommend the embeddedSolaris project for making a small bootable
 Solaris. Very flexible and can put on the Admin GUIs, and so on.

 https://sourceforge.net/projects/embeddedsolaris/

 Lund

 --
 Jorgen Lundman       | lund...@lundman.net
 Unix Administrator   | +81 (0)3 -5456-2687 ext 1017 (work)
 Shibuya-ku, Tokyo    | +81 (0)90-5578-8500          (cell)
 Japan                | +81 (0)3 -3375-1767          (home)
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Help importing pool with offline disk

2009-09-30 Thread Carson Gaspar

Carson Gaspar wrote:

Carson Gaspar wrote:
I'll also note that the kernel is certainly doing _something_ with my 
pool... from iostat -n -x 5:


extended device statistics
r/sw/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
   40.55.4 1546.40.0  0.0  0.30.07.5   0  19 c7t0d0
   40.55.4 1546.40.0  0.0  0.60.0   12.1   0  31 c7t1d0
   44.15.8 1660.80.0  0.0  0.40.07.6   0  21 c7t2d0
   41.95.4 1546.40.0  0.0  0.30.06.6   0  22 c7t3d0
   40.75.8 1546.40.0  0.0  0.50.09.9   0  25 c7t4d0
   40.35.4 1546.40.0  0.0  0.40.08.5   0  20 c7t6d0
   40.55.4 1546.40.0  0.0  0.40.07.9   0  23 c7t7d0


And now I know what:

bash-3.2# pgrep zfsdle | wc
   15198   15198   86454
bash-3.2# uname -a
SunOS gandalf.taltos.org 5.11 snv_118 i86pc i386 i86xpv

I see a few other folks reporting this, but no responses.

I don't see any bugs filed against this, but I know the search engine is 
differently coded...


And they have all been spawned by:

bash-3.2# ps -fp 991
 UID   PID  PPID   CSTIME TTY TIME CMD
root   991 1   1 15:30:40 ?   1:50 
/usr/lib/sysevent/syseventconfd

I renamed /etc/sysevent/config/SUNW,EC_dev_status,ESC_dev_dle,sysevent.conf and 
restarted syseventd to stop the madness.
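
For anyone else who hits this, the stop-gap amounts to roughly the following
(the sysevent FMRI is from memory, so treat it as an assumption; pkill is just
to reap the zfsdle processes that were already spawned):

# pkill zfsdle
# mv /etc/sysevent/config/SUNW,EC_dev_status,ESC_dev_dle,sysevent.conf /var/tmp/
# svcadm restart svc:/system/sysevent:default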


Anyone know what has gone so horribly wrong?

The other reports I've seen were against snv_123, so the current release appears 
to have the same bug.


--
Carson

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Hot Space vs. hot spares

2009-09-30 Thread Richard Elling

On Sep 30, 2009, at 6:03 PM, Matthew Ahrens wrote:


Erik Trimble wrote:
From a global perspective, multi-disk parity (e.g. raidz2 or  
raidz3) is the way to go instead of hot spares.
Hot spares are useful for adding protection to a number of vdevs,  
not a single vdev.


Even when using raidz2 or 3, it is useful to have hot spares so that  
reconstruction can begin immediately.  Otherwise it would have to  
wait for the operator to physically remove the failed disk and  
insert a new one.


When I model these things, I use 8 hours logistical response time for
data centers and 48 hours for SOHO. When the disks were small, and
thus resilver times were short, the logistical response time could make
a big impact. With 2+ TB drives, the resilver time is becoming dominant.
As disks become larger but not faster, there will be a day when the
logistical response time becomes insignificant. In other words, you
won't need a spare to improve logistical response, but you can consider
using spares to extend the logistical response time to months. To take
this argument to its limit, it is possible that in our lifetime RAID
boxes will be disposable... the razor industry will be proud of us ;-)
 -- richard

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss