[zfs-discuss] Recover zpool/zfs

2008-11-07 Thread Thomas Kloeber

This is the 2nd attempt, so my apologies if this mail got to you already...

Folkses,

I'm in an absolute state of panic because I lost about 160GB of data
which were on an external USB disk.
Here is what happened:
1. I added a 500GB USB disk to my Ultra25/Solaris 10
2. I created a zpool and a zfs for the whole disk
3. I copied lots of data on to the disk
4. I 'zfs umount'ed the disk, unplugged the USB cable
5. I attached the disk to a PC/Solaris 10 and tried to get the data
6. but I couldn't find it (didn't know about 'zfs import' then)
7. re-attached the disk to the Ultra25/Solaris 10
and now 'zpool status' tells me that the pool is FAULTED and the data is
corrupted. SUN support tells me that I have lost all data. They also
suggested to try this disk on a Nevada system which I did. There the
zpool showed up correctly but the zfs file system didn't.
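In hindsight, the supported way to move the pool would have been to export it
before unplugging and then import it on the other machine - roughly, with the
pool name used here (other details may differ):

   zpool export usbDisk     # on the original host, before pulling the cable
   zpool import             # on the new host: scan attached devices for pools
   zpool import usbDisk     # then import it by name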

If I do a hex dump of the physical device I can see all the data but I
can't get to it... aaarrrggghhh

Has anybody successfully patched/tweaked/whatever a zpool or zfs to
recover from this?

I would be most and forever grateful if somebody could give me a hint.

Thanx,

Thomas

Following is the 'zpool status':
zpool status -xv
pool: usbDisk
state: FAULTED
status: One or more devices could not be used because the label is missing
or invalid. There are insufficient replicas for the pool to continue
functioning.
action: Destroy and re-create the pool from a backup source.
see: http://www.sun.com/msg/ZFS-8000-5E
scrub: none requested
config:

NAME          STATE      READ WRITE CKSUM
usbDisk       FAULTED       0     0     0  corrupted data
  c5t0d0s0    FAULTED       0     0     0  corrupted data

and the output of zdb:
disk1
version=4
name='disk1'
state=0
txg=4
pool_guid=10847216141970570446
vdev_tree
type='root'
id=0
guid=10847216141970570446
children[0]
type='disk'
id=0
guid=5919680419735522389
path='/dev/dsk/c1t1d0s1'
devid='id1,[EMAIL PROTECTED]/b'
whole_disk=0
metaslab_array=14
metaslab_shift=30
ashift=9
asize=141729202176
usbDisk
version=4
name='usbDisk'
state=0
txg=4
pool_guid=18060149023938226877
vdev_tree
type='root'
id=0
guid=18060149023938226877
children[0]
type='disk'
id=0
guid=4959744788625823079
path='/dev/dsk/c5t0d0s0'
devid='id1,[EMAIL PROTECTED]/a'
whole_disk=0
metaslab_array=14
metaslab_shift=32
ashift=9
asize=500100497408
obelix2:/# zdb -l /dev/dsk/c5t0d0s0

LABEL 0

version=4
name='WD'
state=0
txg=36
pool_guid=7944655440509665617
hostname='opensolaris'
top_guid=13767930161106254968
guid=13767930161106254968
vdev_tree
type='disk'
id=0
guid=13767930161106254968
path='/dev/dsk/c8t0d0p0'
devid='id1,[EMAIL PROTECTED]/q'
phys_path='/[EMAIL PROTECTED],0/pci1028,[EMAIL PROTECTED],1/[EMAIL 
PROTECTED]/[EMAIL PROTECTED],0:q'
whole_disk=0
metaslab_array=14
metaslab_shift=32
ashift=9
asize=500103118848
is_log=0
DTL=18

LABEL 1

version=4
name='WD'
state=0
txg=36
pool_guid=7944655440509665617
hostname='opensolaris'
top_guid=13767930161106254968
guid=13767930161106254968
vdev_tree
type='disk'
id=0
guid=13767930161106254968
path='/dev/dsk/c8t0d0p0'
devid='id1,[EMAIL PROTECTED]/q'
phys_path='/[EMAIL PROTECTED],0/pci1028,[EMAIL PROTECTED],1/[EMAIL 
PROTECTED]/[EMAIL PROTECTED],0:q'
whole_disk=0
metaslab_array=14
metaslab_shift=32
ashift=9
asize=500103118848
is_log=0
DTL=18

LABEL 2

version=4
name='usbDisk'
state=0
txg=4
pool_guid=18060149023938226877
top_guid=4959744788625823079
guid=4959744788625823079
vdev_tree
type='disk'
id=0
guid=4959744788625823079
path='/dev/dsk/c5t0d0s0'
devid='id1,[EMAIL PROTECTED]/a'
whole_disk=0
metaslab_array=14
metaslab_shift=32
ashift=9
asize=500100497408

LABEL 3

version=4
name='usbDisk'
state=0
txg=4
pool_guid=18060149023938226877
top_guid=4959744788625823079
guid=4959744788625823079
vdev_tree
type='disk'
id=0
guid=4959744788625823079
path='/dev/dsk/c5t0d0s0'
devid='id1,[EMAIL PROTECTED]/a'
whole_disk=0
metaslab_array=14
metaslab_shift=32
ashift=9
asize=500100497408
--
Intelligent Communication Software Vertriebs GmbH
Firmensitz: Kistlerhof Str. 111, 81379 München
Registergericht: Amtsgericht München, HRB 88283
Geschäftsführer: Albert Fuss


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Help recovering zfs filesystem

2008-11-07 Thread Sherwood Glazier
That's exactly what I was looking for.  Hopefully Sun will see fit to include 
this functionality in the OS soon.

Thanks!
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] recovering data from a dettach mirrored vdev

2008-11-07 Thread Krzys
I was wondering if this ever made it into zfs as a fix for bad labels?

On Wed, 7 May 2008, Jeff Bonwick wrote:

 Yes, I think that would be useful.  Something like 'zpool revive'
 or 'zpool undead'.  It would not be completely general-purpose --
 in a pool with multiple mirror devices, it could only work if
 all replicas were detached in the same txg -- but for the simple
 case of a single top-level mirror vdev, or a clean 'zpool split',
 it's actually pretty straightforward.

 Jeff

 On Tue, May 06, 2008 at 11:16:25AM +0100, Darren J Moffat wrote:
 Great tool, any chance we can have it integrated into zpool(1M) so that
 it can find and fixup on import detached vdevs as new pools ?

 I'd think it would be reasonable to extend the meaning of
 'zpool import -D' to list detached vdevs as well as destroyed pools.

 --
 Darren J Moffat
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS, Kernel Panic on import

2008-11-07 Thread Andrew
I woke up yesterday morning, only to discover my system kept rebooting..

It's been running fine for the last while. I upgraded to snv_98 a couple of weeks 
back (from 95), and had upgraded my RAID-Z zpool from version 11 to 13 for 
improved scrub performance.

After some research it turned out that, on bootup, importing my 4TB RAID-Z array 
was causing the system to panic (similar to this OP's error). I got that 
bypassed, and can now at least boot the system..

However, when I try anything (like mdb -kw), it tells me there is no 
command-line editing because "mdb: no terminal data available for TERM=vt320. 
term init failed: command-line editing and prompt will not be available". This 
means I can't really try what aldredmr had done in mdb, and I don't really have 
any experience with it. I upgraded to snv_100 (November), but am experiencing the 
exact same issues. 

If anyone has some insight, it would be greatly appreciated. Thanks
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Help recovering zfs filesystem

2008-11-07 Thread Nigel Smith
FYI, here are the link to the 'labelfix' utility.
It an attachment to one of Jeff Bonwick's posts on this thread:

http://www.opensolaris.org/jive/thread.jspa?messageID=229969

or here:

http://mail.opensolaris.org/pipermail/zfs-discuss/2008-May/047267.html
http://mail.opensolaris.org/pipermail/zfs-discuss/2008-May/047270.html

Regards
Nigel Smith
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] 'zfs recv' is very slow

2008-11-07 Thread Ian Collins
River Tarnell wrote:
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1

 hi,

 i have two systems, A (Solaris 10 update 5) and B (Solaris 10 update 6).  i'm
 using 'zfs send -i' to replicate changes on A to B.  however, the 'zfs recv' 
 on
 B is running extremely slowly.  
I'm sorry, I didn't notice the -i in your original message.

I get the same problem sending incremental streams between Thumpers.

-- 
Ian.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS problems which scrub can't find?

2008-11-07 Thread Matt Ingenthron
Off the lists, someone suggested to me that the "Inconsistent 
filesystem" error may be about the boot archive and not the ZFS filesystem (though I 
still don't know what's wrong with booting b99).

Regardless, I tried rebuilding the boot_archive with bootadm 
update-archive -vf and verified it by mounting it  and peeking inside.  
I also tried both with and without /etc/hostid.  I still get the same 
behavior.
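
A sketch of that rebuild-and-inspect sequence (the lofi mount assumes the
archive is an uncompressed UFS image at the usual path, which can vary by
build):

   bootadm update-archive -vf
   lofiadm -a /platform/i86pc/amd64/boot_archive    # prints e.g. /dev/lofi/1
   mount -F ufs -o ro /dev/lofi/1 /mnt
   ls /mnt                                          # peek inside
   umount /mnt; lofiadm -d /dev/lofi/1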

Any thoughts?

Thanks in advance,

- Matt

[EMAIL PROTECTED] wrote:
 Hi,

 After a recent pkg image-update to OpenSolaris build 100, my system 
 booted once and now will no longer boot.  After exhausting other 
 options, I am left wondering if there is some kind of ZFS issue a 
 scrub won't find.

 The current behavior is that it will load GRUB, but trying to boot the
 most recent boot environment (b100 based) I get Error 16: Inconsistent
 filesystem structure.  The pool has gone through two scrubs from a 
 livecd based on b101a without finding anything wrong.  If I select the 
 previous boot environment (b99 based), I get a kernel panic.

 I've tried replacing the /etc/hostid based on a hunch from one of the 
 engineers working on Indiana and ZFS boot.  I also tried rebuilding 
 the boot_archive and reloading the GRUB based on build 100.  I then 
 tried reloading the build 99 grub to hopefully get to where I could 
 boot build 99.  No luck with any of these thus far.

 More below, and some comments in this bug:
 http://defect.opensolaris.org/bz/show_bug.cgi?id=3965, though may need
 to be a separate bug.

 I'd appreciate any suggestions and be glad to gather any data to 
 diagnose this if possible.


 == Screen when trying to boot b100 after boot menu ==

  Booting 'opensolaris-15'

 bootfs rpool/ROOT/opensolaris-15
 kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS
 loading '/platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS' ...
 cpu: 'GenuineIntel' family 6 model 15 step 11
 [BIOS accepted mixed-mode target setting!]
   [Multiboot-kludge, loadaddr=0xbffe38, text-and-data=0x1931a8, bss=0x0,
 entry=0xc0]
 '/platform/i86pc/kernel/amd64/unix -B
 zfs-bootfs=rpool/391,bootpath=[EMAIL PROTECTED],0/pci1179,[EMAIL 
 PROTECTED],2/[EMAIL PROTECTED],0:a,diskdevid=id1,[EMAIL PROTECTED]/a' 

 is loaded
 module$ /platform/i86pc/$ISADIR/boot_archive
 loading '/platform/i86pc/$ISADIR/boot_archive' ...

 Error 16: Inconsistent filesystem structure

 Press any key to continue...



 == Booting b99 ==
 (by selecting the grub entry from the GRUB menu and adding -kd then 
 doing a :c to continue I get the following stack trace)

 debug_enter+37 ()
 panicsys+40b ()
 vpanic+15d ()
 panic+9c ()
 (lines above typed in from ::stack, lines below typed in from when it 
 dropped into the debugger)
 unix:die+ea ()
 unix:trap+3d0 ()
 unix:cmntrap+e9 ()
 unix:mutex_owner_running+d ()
 genunix:lookuppnat+bc ()
 genunix:vn_removeat+7c ()
 genunix:vn_remove+28 ()
 zfs:spa_config_write+18d ()
 zfs:spa_config_sync+102 ()
 zfs:spa_open_common+24b ()
 zfs:spa_open+1c ()
 zfs:dsl_dsobj_to_dsname+37 ()
 zfs:zfs_parse_bootfs+68 ()
 zfs:zfs_mountroot+10a ()
 genunix:fsop_mountroot+1a ()
 genunix:rootconf+d5 ()
 genunix:vfs_mountroot+65 ()
 genunix:main+e6 ()
 unix:_locore_start+92 ()

 panic: entering debugger (no dump device, continue to reboot)
 Loaded modules: [ scsi_vhci uppc sd zfs specfs pcplusmp cpu.generic ]
 kmdb: target stopped at:
 kmdb_enter+0xb: movq   %rax,%rdi



 == Output from zdb ==
 
 LABEL 0
 
version=10
name='rpool'
state=1
txg=327816
pool_guid=6981480028020800083
hostid=95693
hostname='opensolaris'
top_guid=5199095267524632419
guid=5199095267524632419
vdev_tree
type='disk'
id=0
guid=5199095267524632419
path='/dev/dsk/c4t0d0s0'
devid='id1,[EMAIL PROTECTED]/a'
phys_path='/[EMAIL PROTECTED],0/pci1179,[EMAIL PROTECTED],2/[EMAIL 
 PROTECTED],0:a'
whole_disk=0
metaslab_array=14
metaslab_shift=29
ashift=9
asize=90374406144
is_log=0
DTL=161
 
 LABEL 1
 
version=10
name='rpool'
state=1
txg=327816
pool_guid=6981480028020800083
hostid=95693
hostname='opensolaris'
top_guid=5199095267524632419
guid=5199095267524632419
vdev_tree
type='disk'
id=0
guid=5199095267524632419
path='/dev/dsk/c4t0d0s0'
devid='id1,[EMAIL PROTECTED]/a'
phys_path='/[EMAIL PROTECTED],0/pci1179,[EMAIL PROTECTED],2/[EMAIL 
 PROTECTED],0:a'
whole_disk=0
metaslab_array=14
metaslab_shift=29
ashift=9
asize=90374406144
is_log=0
DTL=161
 
 LABEL 2
 
version=10
name='rpool'
state=1
txg=327816

Re: [zfs-discuss] ZFS, Kernel Panic on import

2008-11-07 Thread Jim Dunham
Andrew,

 I woke up yesterday morning, only to discover my system kept  
 rebooting..

 It's been running fine for the last while. I upgraded to snv 98 a  
 couple weeks back (from 95), and had upgraded my RaidZ Zpool from  
 version 11 to 13 for improved scrub performance.

 After some research it turned out that, on bootup, importing my 4tb  
 raidZ array was causing the system to panic (similar to this OP's  
 error). I got that bypassed, and can now at least boot the system..

 However, when I try anything (like mdb -kw), it advises me that  
 there is no command line editing because: mdb: no terminal data  
 available for TERM=vt320. term init failed: command-line editing and  
 prompt will not be available. This means I can't really try what  
 aldredmr had done in mdb, and I really don't have any experience in  
 it. I upgraded to snv_100 (November), but experiencing the exact  
 same issues

 If anyone has some insight, it would be greatly appreciated. Thanks

I have the same problem SSH'ing in from my Mac OS X, which sets the  
TERM type to 'xterm-color', also not supported.

Do the following, depending on your default shell, and you should be  
all set.

TERM=vt100; export TERM
or
setenv TERM vt100


 -- 
 This message posted from opensolaris.org
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

Jim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] How to diagnose zfs - iscsi - nfs hang

2008-11-07 Thread Jacob Ritorto
I have a PC server running Solaris 10 5/08 which seems to frequently become 
unable to share zfs filesystems via the shareiscsi and sharenfs options.  It 
appears, from the outside, to be hung -- all clients just freeze, and while 
they're able to ping the host, they're not able to transfer nfs or iSCSI data.  
They're in the same subnet and I've found no network problems thus far.  

After hearing so much about the Marvell problems I'm beginning to wonder if 
they're the culprit, though they're supposed to be fixed in 127128-11, which is 
the kernel I'm running.  

I have an exact hardware duplicate of this machine running Nevada b91 (iirc) 
that doesn't exhibit this problem.

There's nothing in /var/adm/messages and I'm not sure where else to begin.  

Would someone please help me in diagnosing this failure?  

thx
jake
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Ended up in GRUB prompt after the installation on ZFS

2008-11-07 Thread jan damborsky

Hi ZFS team,

when testing installation with recent OpenSolaris builds,
we have been seeing that in some cases people end up
at the GRUB prompt after the installation - it seems that menu.lst
can't be accessed for some reason. At least the two following bugs
seem to describe the same manifestation of the problem,
whose root cause has not yet been identified:

4051 opensolaris b99b/b100a does not install on 1.5 TB disk or boot 
fails after install

4591 Install failure on a Sun Fire X4240 with Opensolaris 200811

In those bug reports, nothing indicates they might be ZFS related,
and there could well be more scenarios in which this happens.

But when I hit the problem today while testing the Automated Installer
(it is part of the Caiman project and will replace the current JumpStart
install technology), I was able to make GRUB find 'menu.lst' just by
using the 'zpool import' command - please see below for the detailed procedure.

Based on this, could you please take a look at these observations
and, if possible, help me understand whether there is anything obvious
that might be wrong, and whether you think this is somehow related to
ZFS technology?

Thank you very much for your help,
Jan


configuration:
--
HW: Ultra 20, 1GB RWM, 1 250GB SATA drive
SW: Opensolaris build 100, 64bit mode

steps used:
---
[1] OpenSolaris 100 installed using Automated Installer
   - Solaris 2 partition created during installation

* partition configuration before installation:

# fdisk -W - c2t0d0p0
...
* Id   Act  Bhead  Bsect  Bcyl  Ehead  Esect  Ecyl  Rsect     Numsect
  192  0    0      1      1     254    63     1023  16065     22491000


* partition configuration after installation:

# fdisk -W - c2t0d0p0
...
* Id   Act  Bhead  Bsect  Bcyl  Ehead  Esect  Ecyl  Rsect     Numsect
  192  0    0      1      1     254    63     1023  16065     22491000
  191  128  254    63     1023  254    63     1023  22507065  3000


[2] When I reboot the system after the installation, I ended up in GRUB 
prompt:

grub> root
(hd0,1,a): Filesystem type unknown, partition type 0xbf

grub> cat /rpool/boot/grub/menu.lst

Error 17: Cannot mount selected partition

grub>

[3] I rebooted into AI and did 'zpool import'
# zdb -l /dev/rdsk/c2t0d0s0 > /tmp/zdb_before_import.txt (attached)
# zpool import -f rpool
# zdb -l /dev/rdsk/c2t0d0s0 > /tmp/zdb_after_import.txt (attached)
# diff /tmp/zdb_before_import.txt /tmp/zdb_after_import.txt
7c7
< txg=21
---
> txg=2675
9c9
< hostid=4741222
---
> hostid=4247690
17a18
> devid='id1,[EMAIL PROTECTED]/a'
31c32
...
# reboot

[4] Now GRUB can access menu.lst and Solaris is booted

hypothesis
--
It seems that for some reason, when the ZFS pool was
created, the 'devid' information was not added to the
ZFS label.

When 'zpool import' was called, 'devid' got populated.

Looking at the GRUB ZFS plug-in, it seems that 'devid'
(the ZPOOL_CONFIG_DEVID attribute) is required in order to
be able to access the ZFS filesystem:

In grub/grub-0.95/stage2/fsys_zfs.c:

vdev_get_bootpath()
{
        ...
        if (strcmp(type, VDEV_TYPE_DISK) == 0) {
                if (vdev_validate(nv) != 0 ||
                    (nvlist_lookup_value(nv, ZPOOL_CONFIG_PHYS_PATH,
                    bootpath, DATA_TYPE_STRING, NULL) != 0) ||
                    (nvlist_lookup_value(nv, ZPOOL_CONFIG_DEVID,
                    devid, DATA_TYPE_STRING, NULL) != 0))
                        return (ERR_NO_BOOTPATH);
        ...
}
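
A quick way to check whether the on-disk labels carry a devid entry at all
(device path from the steps above; each label that has one contributes a
match):

   # zdb -l /dev/rdsk/c2t0d0s0 | grep devid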

additional observations:

[1] If 'devid' is populated during installation after 'zpool create'
operation, the problem doesn't occur.

[2] Following the described procedure, the problem is reproducible
at will on the system where it was initially reproduced
(please see above for the configuration)

[3] I was not able to reproduce that using exactly the same
procedure on following configurations:
* Ferrari 4000 with 160GB IDE disk
* vmware - installation done on IDE disk

[4] When installation is done into an existing Solaris2 partition containing
a Solaris instance, 'devid' is always populated and the problem
doesn't occur
(it doesn't matter whether the partition is marked 'active' or not).


LABEL 0

version=13
name='rpool'
state=0
txg=21
pool_guid=7190133845720836012
hostid=4741222
hostname='opensolaris'
top_guid=13119992024029372510
guid=13119992024029372510
vdev_tree
type='disk'
id=0
guid=13119992024029372510
path='/dev/dsk/c2t0d0s0'
phys_path='/[EMAIL PROTECTED],0/pci108e,[EMAIL PROTECTED]/[EMAIL 
PROTECTED],0:a'
whole_disk=0
metaslab_array=23
metaslab_shift=27
ashift=9
asize=15327035392
is_log=0

LABEL 1

version=13
name='rpool'
state=0
txg=21
pool_guid=7190133845720836012
hostid=4741222
hostname='opensolaris'

Re: [zfs-discuss] 'zfs recv' is very slow

2008-11-07 Thread Ian Collins
River Tarnell wrote:
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1

 Ian Collins:
   
 That's very slow.  What's the nature of your data?
 
  
 mainly two sets of mid-sized files; one of 200KB-2MB in size and other under
 50KB.  they are organised into subdirectories, A/B/C/file.  each directory
 has 18,000-25,000 files.  total data size is around 2.5TB.

 hm, something changed while i was writing this mail: now the transfer is
 running at 2MB/sec, and the read i/o has disappeared.  that's still slower 
 than
 i'd expect, but an improvement.

   
The transfer I mentioned just completed: 1.45TB sent in 84832 seconds
(17.9MB/sec).  This was during a working day when the server and network
were busy. 

The best ftp speed I managed was 59 MB/sec over the same network.

-- 
Ian.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Disk space usage of zfs snapshots and filesystems - my math doesn't add up

2008-11-07 Thread Miles Nordin
 n == none  [EMAIL PROTECTED] writes:

 n snapshots referring to old data which has been deleted from
 n the current filesystem and I'd like to find out which
 n snapshots refer to how much data

Imagine you have a filesystem containing ten 1MB files,

  zfs create root/export/home/carton/t
  cd t
  n=0; while [ $n -lt 10 ]; do mkfile 1m $n; n=$(( $n + 1 )); done

and you change nothing on the filesystem except to slowly delete one
file at a time.  After you delete each of them, you take another
snapshot.

  n=0; while [ $n -lt 10 ]; do 
 zfs snapshot root/export/home/carton/[EMAIL PROTECTED]; 
 rm $n; 
 n=$(( $n + 1 )); 
  done

When they're all gone, you stop. Now you want to know, destroying
which snapshot can save you space.

If you think about how you got the filesystem to this state, the ideal
answer is, only deleting the oldest snapshot will save space.
Deleting any middle snapshot saves no space.  Here's what 'zfs list'
says:

bash-3.2# zfs list -r root/export/home/carton/t
NAME  USED  AVAIL  REFER  MOUNTPOINT
root/export/home/carton/t10.2M  26.6G18K  /export/home/carton/t
root/export/home/carton/[EMAIL PROTECTED]  1.02M  -  10.0M  -
root/export/home/carton/[EMAIL PROTECTED]18K  -  9.04M  -
root/export/home/carton/[EMAIL PROTECTED]18K  -  8.04M  -
root/export/home/carton/[EMAIL PROTECTED]18K  -  7.03M  -
root/export/home/carton/[EMAIL PROTECTED]17K  -  6.03M  -
root/export/home/carton/[EMAIL PROTECTED]17K  -  5.03M  -
root/export/home/carton/[EMAIL PROTECTED]17K  -  4.03M  -
root/export/home/carton/[EMAIL PROTECTED]17K  -  3.02M  -
root/export/home/carton/[EMAIL PROTECTED]17K  -  2.02M  -
root/export/home/carton/[EMAIL PROTECTED]17K  -  1.02M  -

now, uselessly destroy a middle snapshot:

bash-3.2# zfs destroy root/export/home/carton/[EMAIL PROTECTED]
bash-3.2# zfs list -r root/export/home/carton/t
NAME  USED  AVAIL  REFER  MOUNTPOINT
root/export/home/carton/t10.2M  26.6G18K  /export/home/carton/t
root/export/home/carton/[EMAIL PROTECTED]  1.02M  -  10.0M  -
root/export/home/carton/[EMAIL PROTECTED]18K  -  9.04M  -
root/export/home/carton/[EMAIL PROTECTED]18K  -  8.04M  -
root/export/home/carton/[EMAIL PROTECTED]17K  -  6.03M  -
root/export/home/carton/[EMAIL PROTECTED]17K  -  5.03M  -
root/export/home/carton/[EMAIL PROTECTED]17K  -  4.03M  -
root/export/home/carton/[EMAIL PROTECTED]17K  -  3.02M  -
root/export/home/carton/[EMAIL PROTECTED]17K  -  2.02M  -
root/export/home/carton/[EMAIL PROTECTED]17K  -  1.02M  -

still using 10MB.  but if you destroy the oldest snapshot:

bash-3.2# zfs destroy root/export/home/carton/[EMAIL PROTECTED]
bash-3.2# zfs list -r root/export/home/carton/t
NAME  USED  AVAIL  REFER  MOUNTPOINT
root/export/home/carton/t9.17M  26.6G18K  /export/home/carton/t
root/export/home/carton/[EMAIL PROTECTED]  1.02M  -  9.04M  -
root/export/home/carton/[EMAIL PROTECTED]18K  -  8.04M  -
root/export/home/carton/[EMAIL PROTECTED]17K  -  6.03M  -
root/export/home/carton/[EMAIL PROTECTED]17K  -  5.03M  -
root/export/home/carton/[EMAIL PROTECTED]17K  -  4.03M  -
root/export/home/carton/[EMAIL PROTECTED]17K  -  3.02M  -
root/export/home/carton/[EMAIL PROTECTED]17K  -  2.02M  -
root/export/home/carton/[EMAIL PROTECTED]17K  -  1.02M  -

space is freed.  However, there are other cases where deleting a
middle snapshot will save space:

bash-3.2# mkfile 1m 0
bash-3.2# zfs snapshot root/export/home/carton/[EMAIL PROTECTED]
bash-3.2# mkfile 1m 1
bash-3.2# zfs snapshot root/export/home/carton/[EMAIL PROTECTED]
bash-3.2# rm 1
bash-3.2# zfs snapshot root/export/home/carton/[EMAIL PROTECTED]
bash-3.2# rm 0
bash-3.2# zfs list -r root/export/home/carton/t
NAME  USED  AVAIL  REFER  MOUNTPOINT
root/export/home/carton/t2.07M  26.6G18K  /export/home/carton/t
root/export/home/carton/[EMAIL PROTECTED]17K  -  1.02M  -
root/export/home/carton/[EMAIL PROTECTED]  1.02M  -  2.02M  -
root/export/home/carton/[EMAIL PROTECTED]17K  -  1.02M  -
bash-3.2# zfs destroy root/export/home/carton/[EMAIL PROTECTED]
bash-3.2# zfs list -r root/export/home/carton/t
NAME  USED  AVAIL  REFER  MOUNTPOINT
root/export/home/carton/t1.05M  26.6G18K  /export/home/carton/t
root/export/home/carton/[EMAIL PROTECTED]17K  -  1.02M  -
root/export/home/carton/[EMAIL PROTECTED]17K  -  1.02M  -

To me, it looks as though the USED column tells you ``how much space
would be freed if I destroyed the thing on this row,'' and the REFER
column tells you ``about what would 'du -s' report if I ran it on the
root of this dataset.''  The answer to your question is then to look
at the USED column, but with one caveat: after you delete 

Re: [zfs-discuss] ZFS, Kernel Panic on import

2008-11-07 Thread Andrew
Thanks a lot! Google didn't seem to cooperate as well as I had hoped.

Still no dice on the import. I only have shell access on my Blackberry Pearl 
from where I am, so it's kind of hard, but I'm managing.  I've tried the OP's 
exact commands, and even tried importing the array as read-only, yet the system 
still wants to panic.  I really hope I don't have to redo my array and lose 
everything, as I still have faith in ZFS...
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Is ZFS already the default file System for Solaris 10?

2008-11-07 Thread Kumar, Amit H.
Is ZFS already the default file System for Solaris 10?
If yes, has anyone tested it on Thumper?
Thank you,
Amit

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Kernel panic at zpool import

2008-11-07 Thread Andrew
Do you guys have any more information about this? I've tried the offset 
methods, zfs_recover, aok=1, mounting read only, yada yada, with still 0 luck. 
I have about 3TBs of data on my array, and I would REALLY hate to lose it.
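
For anyone trying the same thing, the tunables mentioned above are normally
set in /etc/system before the import attempt - they weaken assertions and
error handling, so use them with care:

   set aok=1
   set zfs:zfs_recover=1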

Thanks!
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Is ZFS already the default file System for Solaris 10?

2008-11-07 Thread Rich Teer
On Fri, 7 Nov 2008, Kumar, Amit H. wrote:

 Is ZFS already the default file System for Solaris 10?

ZFS isn't the default file system for Solaris 10, but it is
selectable as the root file system with the most recent update.

-- 
Rich Teer, SCSA, SCNA, SCSECA

CEO,
My Online Home Inventory

URLs: http://www.rite-group.com/rich
  http://www.linkedin.com/in/richteer
  http://www.myonlinehomeinventory.com
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to diagnose zfs - iscsi - nfs hang

2008-11-07 Thread Brent Jones
On Fri, Nov 7, 2008 at 9:11 AM, Jacob Ritorto [EMAIL PROTECTED] wrote:
 I have a PC server running Solaris 10 5/08 which seems to frequently become 
 unable to share zfs filesystems via the shareiscsi and sharenfs options.  It 
 appears, from the outside, to be hung -- all clients just freeze, and while 
 they're able to ping the host, they're not able to transfer nfs or iSCSI 
 data.  They're in the same subnet and I've found no network problems thus far.

 After hearing so much about the Marvell problems I'm beginning to wonder it 
 they're the culprit, though they're supposed to be fixed in 127128-11, which 
 is the kernel I'm running.

 I have an exact hardware duplicate of this machine running Nevada b91 (iirc) 
 that doesn't exhibit this problem.

 There's nothing in /var/adm/messages and I'm not sure where else to begin.

 Would someone please help me in diagnosing this failure?

 thx
 jake
 --
 This message posted from opensolaris.org
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


I saw this in Nev b87, where for whatever reason, CIFS and NFS would
completely hang and no longer serve requests (I don't use iscsi,
unable to confirm if that had hung too).
The server was responsive, SSH was fine and could execute commands,
clients could ping it and reach it, but CIFS and NFS were essentially
hung.
Intermittently, the system would recover and resume offering shares,
no triggering events could be correlated.
Since upgrading to newer builds, I haven't seen similar issues.

-- 
Brent Jones
[EMAIL PROTECTED]
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Disk space usage of zfs snapshots and filesystems - my math doesn't add up

2008-11-07 Thread Adam.Bodette
I really think there is something wrong with how space is being reported
by zfs list in terms of snapshots.

Stealing from the example earlier, where a new file system was created, 10
1MB files were created, and then snap, remove a file, snap, remove a
file was repeated until they were all gone, leaving:
 
 bash-3.2# zfs list -r root/export/home/carton/t
 NAME  USED  AVAIL  REFER  MOUNTPOINT
 root/export/home/carton/t10.2M  26.6G18K  
 /export/home/carton/t
 root/export/home/carton/[EMAIL PROTECTED]  1.02M  -  10.0M  -
 root/export/home/carton/[EMAIL PROTECTED]18K  -  9.04M  -
 root/export/home/carton/[EMAIL PROTECTED]18K  -  8.04M  -
 root/export/home/carton/[EMAIL PROTECTED]18K  -  7.03M  -
 root/export/home/carton/[EMAIL PROTECTED]17K  -  6.03M  -
 root/export/home/carton/[EMAIL PROTECTED]17K  -  5.03M  -
 root/export/home/carton/[EMAIL PROTECTED]17K  -  4.03M  -
 root/export/home/carton/[EMAIL PROTECTED]17K  -  3.02M  -
 root/export/home/carton/[EMAIL PROTECTED]17K  -  2.02M  -
 root/export/home/carton/[EMAIL PROTECTED]17K  -  1.02M  -

So the file system itself is now empty of files (18K refer for overhead)
but still using 10MB because of the snapshots still holding onto all 10
1MB files.

As I understand snapshots, the oldest one actually holds the file
after it is deleted, and any newer snapshot just points to what that
oldest one is holding.

So because the 0 snapshot was taken first it knows about all 10 files,
snap 1 only knows about 9, etc.

The refer numbers all match up correctly as that is how much data
existed at the time of the snapshot.

But the used seems wrong.

The 0 snapshot should be holding onto all 10 files so I would expect it
to be 10MB Used when it's only reporting 1MB used.  Where is the other
9MB hiding?  It only exists because a snapshot is holding it so that
space should be charged to a snapshot.  Since snapshot 1-9 should only
be pointing at the data held by 0 their numbers are correct.

To take the idea further you can delete snapshots 1-9 and snapshot 0
will still say it has 1MB Used, so where again is the other 9MB?

Adding up the total used by snapshots and the refer by the file
system *should* add up to the used for the file system for it all to
make sense right?

Another way to look at it: if you have all 10 snapshots and you delete 0,
I would expect snapshot 1 to change from 18K used (overhead) to 9MB used,
since it would now be the oldest snapshot and the official holder of the
data, with snapshots 2-9 now pointing at the data it is holding.  The
first 1MB file deleted would now be gone forever.

Am I missing something or is the math to account for snapshot space just
not working right in zfs list/get?

Adam
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] 'zfs recv' is very slow

2008-11-07 Thread Ian Collins
Brent Jones wrote:
 Theres been a couple threads about this now, tracked some bug ID's/ticket:

 6333409
 6418042
I see these are fixed in build 102.

Are they targeted to get back to Solaris 10 via a patch? 

If not, is it worth escalating the issue with support to get a patch?

-- 
Ian.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Recover zpool/zfs

2008-11-07 Thread Richard Elling
Thomas Kloeber wrote:
 This is the 2nd attempt, so my apologies, if this mail got to you 
 already...

 Folkses,

 I'm in an absolute state of panic because I lost about 160GB of data
 which were on an external USB disk.
 Here is what happened:
 1. I added a 500GB USB disk to my Ultra25/Solaris 10
 2. I created a zpool and a zfs for the whole disk
 3. I copied lots of data on to the disk
 4. I 'zfs umount'ed the disk, unplugged the USB cable
 5. I attached the disk to a PC/Solaris 10 and tried to get the data
 6. but I couldn't find it (didn't know about 'zfs import' then)
 7. re-attached the disk to the Ultra25/Solaris 10
 and now 'zpool status' tells me that the pool is FAULTED and the data is
 corrupted. SUN support tells me that I have lost all data. They also
 suggested to try this disk on a Nevada system which I did. There the
 zpool showed up correctly but the zfs file system didn't.

Can you explain this?  If the zpool import succeeded, then
what does zfs list show?
 -- richard


 If I do a hex dump of the physical device I can see all the data but I
 can't get to it... aaarrrggghhh

 Has anybody successfully patched/tweaked/whatever a zpool or zfs to
 recover from this?

 I would be most and for ever greatful I somebody could give me a hint.

 Thanx,

 Thomas

 Following is the 'zpool status':
 zpool status -xv
 pool: usbDisk
 state: FAULTED
 status: One or more devices could not be used because the the label is 
 missing
 or invalid. There are insufficient replicas for the pool to continue
 functioning.
 action: Destroy and re-create the pool from a backup source.
 see: http://www.sun.com/msg/ZFS-8000-5E
 scrub: none requested
 config:

 NAME STATE READ WRITE CKSUM
 usbDisk FAULTED 0 0 0 corrupted data
 c5t0d0s0 FAULTED 0 0 0 corrupted data

 and the output of zdb:
 disk1
 version=4
 name='disk1'
 state=0
 txg=4
 pool_guid=10847216141970570446
 vdev_tree
 type='root'
 id=0
 guid=10847216141970570446
 children[0]
 type='disk'
 id=0
 guid=5919680419735522389
 path='/dev/dsk/c1t1d0s1'
 devid='id1,[EMAIL PROTECTED]/b'
 whole_disk=0
 metaslab_array=14
 metaslab_shift=30
 ashift=9
 asize=141729202176
 usbDisk
 version=4
 name='usbDisk'
 state=0
 txg=4
 pool_guid=18060149023938226877
 vdev_tree
 type='root'
 id=0
 guid=18060149023938226877
 children[0]
 type='disk'
 id=0
 guid=4959744788625823079
 path='/dev/dsk/c5t0d0s0'
 devid='id1,[EMAIL PROTECTED]/a'
 whole_disk=0
 metaslab_array=14
 metaslab_shift=32
 ashift=9
 asize=500100497408
 obelix2:/# zdb -l /dev/dsk/c5t0d0s0
 
 LABEL 0
 
 version=4
 name='WD'
 state=0
 txg=36
 pool_guid=7944655440509665617
 hostname='opensolaris'
 top_guid=13767930161106254968
 guid=13767930161106254968
 vdev_tree
 type='disk'
 id=0
 guid=13767930161106254968
 path='/dev/dsk/c8t0d0p0'
 devid='id1,[EMAIL PROTECTED]/q'
 phys_path='/[EMAIL PROTECTED],0/pci1028,[EMAIL PROTECTED],1/[EMAIL 
 PROTECTED]/[EMAIL PROTECTED],0:q'
 whole_disk=0
 metaslab_array=14
 metaslab_shift=32
 ashift=9
 asize=500103118848
 is_log=0
 DTL=18
 
 LABEL 1
 
 version=4
 name='WD'
 state=0
 txg=36
 pool_guid=7944655440509665617
 hostname='opensolaris'
 top_guid=13767930161106254968
 guid=13767930161106254968
 vdev_tree
 type='disk'
 id=0
 guid=13767930161106254968
 path='/dev/dsk/c8t0d0p0'
 devid='id1,[EMAIL PROTECTED]/q'
 phys_path='/[EMAIL PROTECTED],0/pci1028,[EMAIL PROTECTED],1/[EMAIL 
 PROTECTED]/[EMAIL PROTECTED],0:q'
 whole_disk=0
 metaslab_array=14
 metaslab_shift=32
 ashift=9
 asize=500103118848
 is_log=0
 DTL=18
 
 LABEL 2
 
 version=4
 name='usbDisk'
 state=0
 txg=4
 pool_guid=18060149023938226877
 top_guid=4959744788625823079
 guid=4959744788625823079
 vdev_tree
 type='disk'
 id=0
 guid=4959744788625823079
 path='/dev/dsk/c5t0d0s0'
 devid='id1,[EMAIL PROTECTED]/a'
 whole_disk=0
 metaslab_array=14
 metaslab_shift=32
 ashift=9
 asize=500100497408
 
 LABEL 3
 
 version=4
 name='usbDisk'
 state=0
 txg=4
 pool_guid=18060149023938226877
 top_guid=4959744788625823079
 guid=4959744788625823079
 vdev_tree
 type='disk'
 id=0
 guid=4959744788625823079
 path='/dev/dsk/c5t0d0s0'
 devid='id1,[EMAIL PROTECTED]/a'
 whole_disk=0
 metaslab_array=14
 metaslab_shift=32
 ashift=9
 asize=500100497408
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
   

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Disk space usage of zfs snapshots and filesystems - my math doesn't add up

2008-11-07 Thread Adam.Bodette
I decided to do some more test situations to try to figure out how
adding/removing snapshots changes the space used reporting.

First I setup a test area, a new zfs file system and created some test
files and then created snapshots removing the files one by one.

 mkfile 1m 0
 mkfile 1m 1
 mkfile 1m 2
 mkfile 1m 3
 zfs snapshot u01/[EMAIL PROTECTED]
 rm 0
 zfs snapshot u01/[EMAIL PROTECTED]
 rm 1  
 zfs snapshot u01/[EMAIL PROTECTED]
 rm 2
 zfs snapshot u01/[EMAIL PROTECTED]
 rm 3
 zfs list -r u01/foo
NAMEUSED  AVAIL  REFER  MOUNTPOINT
u01/foo4.76M  1.13T  55.0K  /u01/foo
u01/[EMAIL PROTECTED]  1.18M  -  4.56M  -
u01/[EMAIL PROTECTED]  50.5K  -  3.43M  -
u01/[EMAIL PROTECTED]  50.5K  -  2.31M  -
u01/[EMAIL PROTECTED]  50.5K  -  1.18M  -


So 4M used on the file system but only 1M used by the snapshots so they
claim.

If I delete @1

 zfs destroy u01/[EMAIL PROTECTED]
 zfs list -r u01/foo  
NAMEUSED  AVAIL  REFER  MOUNTPOINT
u01/foo4.71M  1.13T  55.0K  /u01/foo
u01/[EMAIL PROTECTED]  2.30M  -  4.56M  -
u01/[EMAIL PROTECTED]  50.5K  -  2.31M  -
u01/[EMAIL PROTECTED]  50.5K  -  1.18M  -

now suddenly the @0 snapshot is claiming to use more space?

If I delete the newest of the snapshots @3

 zfs destroy u01/[EMAIL PROTECTED]
 zfs list -r u01/foo
NAMEUSED  AVAIL  REFER  MOUNTPOINT
u01/foo4.66M  1.13T  55.0K  /u01/foo
u01/[EMAIL PROTECTED]  2.30M  -  4.56M  -
u01/[EMAIL PROTECTED]  50.5K  -  2.31M  -

No change in the claimed used space by the @0 snapshot!

Now I delete the @2 snapshot

 zfs destroy u01/[EMAIL PROTECTED]
 zfs list -r u01/foo  
NAMEUSED  AVAIL  REFER  MOUNTPOINT
u01/foo4.61M  1.13T  55.0K  /u01/foo
u01/[EMAIL PROTECTED]  4.56M  -  4.56M  -

The @0 snapshot finally claims all the space it's really been holding
all along.

Up until that point, other than subtracting the space used (as reported by
df) from the used reported by zfs for the file system, there was no
way to know how much space the snapshots were really using.
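
A compact way to pull the numbers being compared side by side (dataset name
from the example above; the suffixed sizes still have to be added up by hand
on these builds):

   zfs list -rH -o name,used,referenced u01/foo
   df -h /u01/foo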

Something is not right in the space accounting for snapshots.

---
Adam 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of 
 [EMAIL PROTECTED]
 Sent: Friday, November 07, 2008 14:21
 To: [EMAIL PROTECTED]; zfs-discuss@opensolaris.org
 Subject: Re: [zfs-discuss] Disk space usage of zfs snapshots 
 andfilesystems-my math doesn't add up
 
 I really think there is something wrong with how space is 
 being reported
 by zfs list in terms of snapshots.
 
 Stealing for the example earlier where a new file system was 
 created, 10
 1MB files were created and then do snap, remove a file, snap, remove a
 file, until they are all gone and you are left with:
  
  bash-3.2# zfs list -r root/export/home/carton/t
  NAME  USED  AVAIL  REFER  MOUNTPOINT
  root/export/home/carton/t10.2M  26.6G18K  
  /export/home/carton/t
  root/export/home/carton/[EMAIL PROTECTED]  1.02M  -  10.0M  -
  root/export/home/carton/[EMAIL PROTECTED]18K  -  9.04M  -
  root/export/home/carton/[EMAIL PROTECTED]18K  -  8.04M  -
  root/export/home/carton/[EMAIL PROTECTED]18K  -  7.03M  -
  root/export/home/carton/[EMAIL PROTECTED]17K  -  6.03M  -
  root/export/home/carton/[EMAIL PROTECTED]17K  -  5.03M  -
  root/export/home/carton/[EMAIL PROTECTED]17K  -  4.03M  -
  root/export/home/carton/[EMAIL PROTECTED]17K  -  3.02M  -
  root/export/home/carton/[EMAIL PROTECTED]17K  -  2.02M  -
  root/export/home/carton/[EMAIL PROTECTED]17K  -  1.02M  -
 
 So the file system itself is now empty of files (18K refer 
 for overhead)
 but still using 10MB because of the snapshots still holding 
 onto all 10
 1MB files.
 
 By how I understand snapshots the oldest one actually holds the file
 after it is deleted and any newer snapshot just points to what that
 oldest one is holding.
 
 So because the 0 snapshot was taken first it knows about all 10 files,
 snap 1 only knows about 9, etc.
 
 The refer numbers all match up correctly as that is how much data
 existed at the time of the snapshot.
 
 But the used seems wrong.
 
 The 0 snapshot should be holding onto all 10 files so I would 
 expect it
 to be 10MB Used when it's only reporting 1MB used.  Where 
 is the other
 9MB hiding?  It only exists because a snapshot is holding it so that
 space should be charged to a snapshot.  Since snapshot 1-9 should only
 be pointing at the data held by 0 their numbers are correct.
 
 To take the idea further you can delete snapshots 1-9 and snapshot 0
 will still say it has 1MB Used, so where again is the other 9MB?
 
 Adding up the total used by snapshots and the refer by the file
 system *should* add up to the used for the file system for it all to
 make sense 

Re: [zfs-discuss] Is ZFS already the default file System for Solaris 10?

2008-11-07 Thread Neal Pollack

On 11/07/08 11:24, Kumar, Amit H. wrote:

Is ZFS already the default file System for Solaris 10?
If yes has anyone tested it on Thumper ??


Yes.   Formal Sun support is for Thumper running s10.  For the latest
ZFS bug fixes, it is important to run the most recent s10 update release.
Right now, that should be s10u6 any day now, if it's not already released
for download.

There are many customers on this list running Thumpers with s10u5 plus 
patches.


Cheers,

Neal


Thank you,
Amit
 
 
 



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
  


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] A couple of newbie questions about ZFS compression

2008-11-07 Thread Ross Becker
I'm about to enable compression on my ZFS filesystem, as most of the data I 
intend to store should be highly compressible.

Before I do so, I'd like to ask a couple of newbie questions

First -  if you were running a ZFS without compression, wrote some files to it, 
then turned compression on, will those original uncompressed files ever get 
compressed via some background work, or will they need to be copied in order to 
compress them?

Second- clearly the du command shows post-compression size; opensolaris 
doesn't have a man page for it, but I'm wondering if there's either an option 
to show original size for du, or if there's a suitable replacement I can use 
which will show me the uncompressed size of a directory full of files? (no, 
knowing the compression ratio of the whole filesystem and the du size isn't 
suitable;  I'm looking for a straight-up du substitute which would tell me 
original sizes)


Thanks
   Ross
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] A couple of newbie questions about ZFS compression

2008-11-07 Thread Ian Collins
Ross Becker wrote:
 I'm about to enable compression on my ZFS filesystem, as most of the data I 
 intend to store should be highly compressible.

 Before I do so, I'd like to ask a couple of newbie questions

 First -  if you were running a ZFS without compression, wrote some files to 
 it, then turned compression on, will those original uncompressed files ever 
 get compressed via some background work, or will they need to be copied in 
 order to compress them?

   
Changed filesystem compress properties only apply to new writes.
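
So to get existing data compressed it has to be rewritten - for example copied
in place, or sent into a new dataset. A rough illustration (dataset and file
names are only placeholders):

   zfs set compression=on tank/data
   zfs get compression,compressratio tank/data
   cp -p bigfile bigfile.tmp && mv bigfile.tmp bigfile   # rewritten blocks are now compressed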

 Second- clearly the du command shows post-compression size; opensolaris 
 doesn't have a man page for it, but I'm wondering if there's either an option 
 to show original size for du, or if there's a suitable replacement I can 
 use which will show me the uncompressed size of a directory full of files? 
 (no, knowing the compression ratio of the whole filesystem and the du size 
 isn't suitable;  I'm looking for a straight-up du substitute which would tell 
 me original sizes)

   
I guess the obvious question is why?  ls shows the file size.

-- 
Ian.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS, Kernel Panic on import

2008-11-07 Thread Victor Latushkin
Andrew,

Andrew wrote:
 Thanks a lot! Google didn't seem to cooperate as well as I had hoped.
 
 
 Still no dice on the import. I only have shell access on my
 Blackberry Pearl from where I am, so it's kind of hard, but I'm
 managing.. I've tried the OP's exact commands, and even trying to
 import array as ro, yet the system still wants to panic.. I really
 hope I don't have to redo my array, and lose everything as I still
 have faith in ZFS...

could you please post a little bit more details - at least panic string 
and stack backtrace during panic. That would help to get an idea about 
what might went wrong.

regards,
victor
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] A couple of newbie questions about ZFS compression

2008-11-07 Thread Ross Becker
The compress-on-write behavior is what I expected, but I wanted to validate 
that for sure.  Thank you.

On the 2nd question, the obvious answer is that I'm doing work where knowing 
the total file sizes tells me how much work has been completed, and I 
don't have any other feedback which tells me how far along a job is.  When it's 
a directory of 100+ files, or a whole tree with hundreds of files, it's not 
convenient to add the file sizes up to get the answer.  I could write a perl 
script, but it honestly should be a built-in command.
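
A rough stand-in until something like that exists - it sums the apparent file
sizes under the current directory, and will miscount names containing
whitespace:

   find . -type f | xargs ls -l | awk '{ sum += $5 } END { print sum }'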
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS + OpenSolaris for home NAS?

2008-11-07 Thread Peter Bridge
Just as a follow up.  I went ahead with the original hardware purchase, it was 
so much cheaper than the alternatives it was hard to resist.

Anyway, OpenSolaris 2008.05 installed very nicely, although it mentions 32-bit while 
booting, so I need to investigate that at some point.  The actual hardware 
seems stable and fast enough so far.  The CPU fan is much louder than I 
expected, so I'll probably swap that out since I want this box running 24/7 in 
the office and can't stand noisy machines.  Anyway, I'm booting from a laptop 
IDE drive with two SATA disks in a mirrored zpool.  This seems to work fine, 
although my testing has been limited due to some network problems...

I really need a step-by-step 'how to' for accessing this box from my OSX Leopard 
based MacBook Pro.

I've spent about 5 hours trying to get NFS working, with minimal progress.  I've 
tried with nwam disabled, although the two lines I entered to turn it off 
seemed to miss a lot of other config items, so I ended up turning it back on.  
The reason I did that was that DHCP was failing to get an address, or that's 
what it looked like, so I wanted to try a static address.  Anyway, I found a way 
to use a static address with nwam turned on.  To cut a long story 
short, I can now ping between the machines, and they have identical users 
(id=501) and groups (id=501), but OSX just refuses to connect to the 
NFS share in any way, from the command line with various flags, or via Cmd-K.

I'm going to clean-install OpenSolaris in the morning, maybe with the newest 
build (100?), and try again, but I'd really appreciate some help from someone who 
has got this working.
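
For concreteness, the sort of bare-bones command-line attempt involved looks
roughly like this (dataset name, server address and mount point here are only
placeholders):

   # on the OpenSolaris box
   zfs set sharenfs=rw tank/export
   share                                  # confirm the filesystem is exported
   # on the Mac
   showmount -e 192.168.1.10
   sudo mkdir -p /private/tmp/nfs-test
   sudo mount -t nfs 192.168.1.10:/tank/export /private/tmp/nfs-test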
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Disk space usage of zfs snapshots and filesystems - my math doesn't add up

2008-11-07 Thread none
Hi Miles,
Thanks for your reply.
My zfs situation is a little different from your test. I have a few early 
snapshots which I believe would still be sharing most of their data with the 
current filesystem. Then later I have snapshots which would still be holding 
onto data that is now deleted.

So I'd like to get more detailed reports on just what and how much of the data 
is shared between the snapshots and current filesystem, rather than relying on 
my memory. For example, a report showing for all data blocks where they are 
shared would be what I really need:
10% unique to storage/fs
20% shared by storage/fs, [EMAIL PROTECTED], [EMAIL PROTECTED]
40% shared by [EMAIL PROTECTED], [EMAIL PROTECTED]
10% unique to [EMAIL PROTECTED]
10% unique to [EMAIL PROTECTED]
10% unique to [EMAIL PROTECTED]


Unfortunately the USED column is of little help since it only shows you the 
data unique to that snapshot. In my case almost all data is shared amongst the 
snapshots so it only shows up in the USED of the whole fs.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Disk space usage of zfs snapshots and filesystems - my math doesn't add up

2008-11-07 Thread Miles Nordin
 a ==   [EMAIL PROTECTED] writes:
 c == Miles Nordin [EMAIL PROTECTED] writes:
 n == none  [EMAIL PROTECTED] writes:

 n Unfortunately the USED column is of little help since it only
 n shows you the data unique to that snapshot. In my case almost
 n all data is shared amongst the snapshots so it only shows up
 n in the USED of the whole fs.

Unfortunately sounds right to me.

bash-3.2# mkfile 1m 0
bash-3.2# zfs snapshot root/export/home/carton/[EMAIL PROTECTED]
bash-3.2# zfs snapshot root/export/home/carton/[EMAIL PROTECTED]
bash-3.2# rm 0
bash-3.2# zfs list -r root/export/home/carton/t
NAME  USED  AVAIL  REFER  MOUNTPOINT
root/export/home/carton/t1.04M  26.6G18K  /export/home/carton/t
root/export/home/carton/[EMAIL PROTECTED]  0  -  1.02M  -
root/export/home/carton/[EMAIL PROTECTED]  0  -  1.02M  -

It seems like taking a lot of snapshots can make 'zfs list' output
less useful.

I've trouble imagining an interface to expose the information you
want.  The obvious extension of what we have now is ``show
hypothetical 'zfs list' output after I'd deleted this dataset,'' but I
think that'd be clumsy to use and inefficient to implement.

Maybe btrfs will think of something, since AIUI their model is to have
mandatory infinite snapshots, and instead of explicitly requesting
point-in-time snapshots, your manual action is to free up space by
specifying ranges of snapshot you'd like to delete.  They'll likely
have some interface to answer questions like ``how much space is used
by the log between 2008-01-25 and 2008-02-22?'', and they will have
background log-quantization daemons that delete snapshot ranges
analogous to our daemons that take snapshots.

Maybe imagining infinitely-granular snapshots is the key to a better
interface: ``show me the USED value for snapshots [EMAIL PROTECTED] - [EMAIL 
PROTECTED]
inclusive,'' and you must always give a contiguous range.  That's
analogous to btrfs-world's hypothetical ``show me the space consumed
by the log between these two timestamps.''

the btrfs guy in here seemed to be saying there's dedup across clone
branches, which ZFS does not do.  I suppose that would complicate
space reporting and their interface to do it.

 a I really think there is something wrong with how space is
 a being reported by zfs list in terms of snapshots.

I also had trouble understanding it.

right:

  conduct tests and use them to form a rational explanation of the
  numbers' meanings.  If you can't determine one, ask for help and try
  harder.  Create experiments to debunk a series of proposed, wrong
  explanations.  If a right explanation doesn't emerge, complain it's
  broken chaos.

wrong:

  use the one-word column titles to daydream about what you THINK the
  numbers ought to mean if you'd designed it, and act obstinately
  surprised when the actual meaning doesn't match your daydream.

  One word isn't enough to get you and the designers on the same page.
  You must mentally prepare yourself to accept column-meanings other
  than your assumption.

Imagine the columns were given to you without headings.  Or, reread
your 'zfs list' output assuming the column headings are ORANGE,
TASMANIAN_DEVIL, and BARBIE, then try to understand what each means
through experiment rather than assumption.

 a But the used seems wrong.

%$#$%!  But I can only repeat myself:

 c USED column tells you ``how much
 c space would be freed if I destroyed the thing on this row,''

AND

 c after you delete something, the USED column will
 c reshuffle,

I'd been staring at the USED column for over a year, and did not know
either of those two things until I ran the experiment in my post.

I'm not saying it's necessarily working properly, but I see no
evidence of a flaw from the numbers in the mail I posted, and in your
writeup I see a bunch of assumptions that are not worth
untangling---paragraphs-long explanations don't work for everyone,
including me.  Adam, suggest you keep doing tests until you understand
the actual behavior, because at least the behavior seen in my post is
understandable and sane to me.


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS + OpenSolaris for home NAS?

2008-11-07 Thread Miles Nordin
 pb == Peter Bridge [EMAIL PROTECTED] writes:

pb I really need a step-by-step 'how to' to access this box from
pb my OSX Leopard

What you need for NFS on a laptop is a good automount daemon and a
'umount -f' command that actually does what the man page claims.

The automounter in Leopard works well.  The one in Tiger doesn't---has
quirks and often needs to be rebooted.  AFAICT both 10.4 and 10.5 have
an 'umount -f' that works better than Linux and *BSD.

You can use the automounter by editing files in /etc, but a better way
might be to use the Directory Utility.  Here is an outdated guide, not
for Leopard:

 http://www.behanna.org/osx/nfs/howto4.html

The Leopard steps are:

 1. Open Directory Utility
 2. pick Mounts in the tab-bar
 3. click the Lock in the lower-left corner and authenticate
 4. press +
 5. unroll Advanced Mount Parameters
 6. fill out the form something like this:

   Remote NFS URL: nfs://10.100.100.149/export
   Mount location: /Network/terabithia/export
   Advanced Mount Parameters: nosuid nodev locallocks
   [x] Ignore set user ID privileges

 7. Press Verify
 8. Press Apply, or press Command-S

the locallocks mount parameter should be removed.  I need it with old
versions of Linux and Mac OS 10.5.  I don't need it with
10.4+oldLinux, and hopefully 10.5+Solaris won't need it either.

There is a 'net' mount parameter which changes Finder's behavior.
Also it might be better to mount on some tree outside /Network and
/Volumes since these seem to have some strange special meanings.
ymmv.

At my site I also put 'umask 000' in /etc/launchd.conf to,
err, match user expectations of how the office used to work with
SMB.  For something fancier, you may have to get the Macs and Suns to
share userids with LDAP.  Someone recently posted a PDF here about an
ozzie site serious about Macs that'd done so:

 http://www.afp548.com/filemgmt_data/files/OSX%20HSM.pdf

Another thing worth knowing: Macs throw up these boxes that say
something like

 Server gone.

   [Disconnect]

with a big red Disconnect button.  If your NFS server reboots, they'll
throw one of these boxes at you.  It looks like you have only one
choice.  You actually have three:

 1. ignore the box
 2. press the tiny red [x] in the upper-left corner of the box
 3. press disconnect

If you do (3), Mac OS will do 'umount -f' for you.  At the time the
box appears, the umount has not been done yet.

If you do (1) or (2), any application using the NFS server (including
potentially Finder and all its windows, not just the NFS windows) will
pinwheel until the NFS server comes back.  When it does come back, all
the apps will continue without data loss, and if you did (1) the box
will also disappear on its own.  The error handling is quite good and
in line with Unixy expectations and NFS statelessness.


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS, Kernel Panic on import

2008-11-07 Thread Victor Latushkin
Andrew writes:
 hey Victor,
 
 Where would I find that? I'm still somewhat getting used to the
 Solaris environment. /var/adm/messages doesn't seem to show any panic
 info. I only have remote access via SSH, so I hope I can do
 something with dtrace to pull it.

Do you have anything in /var/crash/<hostname>?

If yes, then do something like this and provide output:

cd /var/crash/<hostname>
echo ::status | mdb -k <dump number>
echo ::stack | mdb -k <dump number>
echo ::msgbuf -v | mdb -k <dump number>

victor

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] 'zfs recv' is very slow

2008-11-07 Thread Ian Collins
Andrew Gabriel wrote:
 Ian Collins wrote:
   
 Brent Jones wrote:
 
 There have been a couple of threads about this now, tracking these bug IDs/tickets:

 6333409
 6418042
   
 I see these are fixed in build 102.

 Are they targeted to get back to Solaris 10 via a patch? 

 If not, is it worth escalating the issue with support to get a patch?
 

 Given the issue described is slow zfs recv over network, I suspect this is:

 6729347 Poor zfs receive performance across networks

 This is quite easily worked around by putting a buffering program 
 between the network and the zfs receive. There is a public domain 
 mbuffer which should work, although I haven't tried it as I wrote my 
 own. The buffer size you need is about 5 seconds worth of data. In my 
 case of 7200RPM disks (in a mirror and not striped) and a gigabit 
 ethernet link, the disks are the limiting factor at around 57MB/sec 
 sustained i/o, so I used a 250MB buffer to best effect. If I recall 
 correctly, that speeded up the zfs send/recv across the network by about 
 3 times, and it then ran at the disk platter speed.

   
Did this apply to incremental sends as well?  I can live with ~20MB/sec
for full sends, but ~1MB/sec for incremental sends is a killer.

-- 
Ian.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] 'zfs recv' is very slow

2008-11-07 Thread Andrew Gabriel
Ian Collins wrote:
 Andrew Gabriel wrote:
 Ian Collins wrote:
   
 Brent Jones wrote:
 
 There have been a couple of threads about this now, tracking these bug IDs/tickets:

 6333409
 6418042
   
 I see these are fixed in build 102.

 Are they targeted to get back to Solaris 10 via a patch? 

 If not, is it worth escalating the issue with support to get a patch?
 
 Given the issue described is slow zfs recv over network, I suspect this is:

 6729347 Poor zfs receive performance across networks

 This is quite easily worked around by putting a buffering program 
 between the network and the zfs receive. There is a public domain 
 mbuffer which should work, although I haven't tried it as I wrote my 
 own. The buffer size you need is about 5 seconds worth of data. In my 
 case of 7200RPM disks (in a mirror and not striped) and a gigabit 
 ethernet link, the disks are the limiting factor at around 57MB/sec 
 sustained i/o, so I used a 250MB buffer to best effect. If I recall 
 correctly, that speeded up the zfs send/recv across the network by about 
 3 times, and it then ran at the disk platter speed.
  
 Did this apply to incremental sends as well?  I can live with ~20MB/sec
 for full sends, but ~1MB/sec for incremental sends is a killer.

It doesn't help the ~1MB/sec periods in incrementals, but it does help 
the fast periods in incrementals.

-- 
Andrew
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] 'zfs recv' is very slow

2008-11-07 Thread Ian Collins
Andrew Gabriel wrote:
 Ian Collins wrote:
 Andrew Gabriel wrote:
 Ian Collins wrote:
  
 Brent Jones wrote:

 There have been a couple of threads about this now, tracking these
 bug IDs/tickets:

 6333409
 6418042
   
 I see these are fixed in build 102.

 Are they targeted to get back to Solaris 10 via a patch?
 If not, is it worth escalating the issue with support to get a patch?
 
 Given the issue described is slow zfs recv over network, I suspect
 this is:

 6729347 Poor zfs receive performance across networks

 This is quite easily worked around by putting a buffering program
 between the network and the zfs receive. There is a public domain
 mbuffer which should work, although I haven't tried it as I wrote
 my own. The buffer size you need is about 5 seconds worth of data.
 In my case of 7200RPM disks (in a mirror and not striped) and a
 gigabit ethernet link, the disks are the limiting factor at around
 57MB/sec sustained i/o, so I used a 250MB buffer to best effect. If
 I recall correctly, that speeded up the zfs send/recv across the
 network by about 3 times, and it then ran at the disk platter speed.
  
 Did this apply to incremental sends as well?  I can live with ~20MB/sec
 for full sends, but ~1MB/sec for incremental sends is a killer.

 It doesn't help the ~1MB/sec periods in incrementals, but it does help
 the fast periods in incrementals.

:)

I don't see the 5 second bursty behaviour described in the bug report. 
It's more like 5 second interval gaps in the network traffic while the
data is written to disk.

-- 
Ian.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS, Kernel Panic on import

2008-11-07 Thread Andrew
Not too sure if it's much help. I enabled kernel pages and curproc.
Let me know if I need to enable 'all' instead.
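
(The dump content is what dumpadm -c controls; a minimal sketch of the
relevant invocations, using only the content values documented in
dumpadm(1M), run as root:)

  dumpadm               # show current dump device, content, and savecore dir
  dumpadm -c curproc    # kernel pages plus the current process's pages
  dumpadm -c all        # all memory pages, if that turns out to be needed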

solaria crash # echo ::status | mdb -k
debugging live kernel (64-bit) on solaria
operating system: 5.11 snv_98 (i86pc)
solaria crash # echo ::stack | mdb -k
solaria crash # echo ::msgbuf -v | mdb -k
   TIMESTAMP   LOGCTL MESSAGE
2008 Nov  7 18:53:55 ff01c901dcf0   capacity = 1953525168 sectors
2008 Nov  7 18:53:55 ff01c901db70 /[EMAIL PROTECTED],0/pci1022,[EMAIL 
PROTECTED]/pci1095,[EMAIL PROTECTED] :
2008 Nov  7 18:53:55 ff01c901d9f0   SATA disk device at port 0
2008 Nov  7 18:53:55 ff01c901d870
model ST31000340AS
2008 Nov  7 18:53:55 ff01c901d6f0   firmware SD15
2008 Nov  7 18:53:55 ff01c901d570   serial number 
2008 Nov  7 18:53:55 ff01c901d3f0   supported features:
2008 Nov  7 18:53:55 ff01c901d270
 48-bit LBA, DMA, Native Command Queueing, SMART self-test
2008 Nov  7 18:53:55 ff01c901d0f0   SATA Gen1 signaling speed (1.5Gbps)
2008 Nov  7 18:53:55 ff01c901adf0   Supported queue depth 32, limited to 31
2008 Nov  7 18:53:55 ff01c901ac70   capacity = 1953525168 sectors
2008 Nov  7 18:53:55 ff01c901aaf0 /[EMAIL PROTECTED],0/pci1022,[EMAIL 
PROTECTED]/pci1095,[EMAIL PROTECTED] :
2008 Nov  7 18:53:55 ff01c901a970   SATA disk device at port 0
2008 Nov  7 18:53:55 ff01c901a7f0
model Maxtor 6L250S0
2008 Nov  7 18:53:55 ff01c901a670   firmware BANC1G10
2008 Nov  7 18:53:55 ff01c901a4f0   serial number
2008 Nov  7 18:53:55 ff01c901a370   supported features:
2008 Nov  7 18:53:55 ff01c901a2b0
 48-bit LBA, DMA, Native Command Queueing, SMART self-test
2008 Nov  7 18:53:55 ff01c901a130   SATA Gen1 signaling speed (1.5Gbps)
2008 Nov  7 18:53:55 ff01c901a070   Supported queue depth 32, limited to 31
2008 Nov  7 18:53:55 ff01c9017ef0   capacity = 490234752 sectors
2008 Nov  7 18:53:55 ff01c9017d70 pseudo-device: ramdisk1024
2008 Nov  7 18:53:55 ff01c9017bf0 ramdisk1024 is /pseudo/[EMAIL PROTECTED]
2008 Nov  7 18:53:55 ff01c9017a70 NOTICE: e1000g0 registered
2008 Nov  7 18:53:55 ff01c90179b0
pcplusmp: pci8086,100e (e1000g) instance 0 vector 0x14 ioapic 0x2 intin 0x14 is
bound to cpu 0
2008 Nov  7 18:53:55 ff01c90178f0
Intel(R) PRO/1000 Network Connection, Driver Ver. 5.2.12
2008 Nov  7 18:53:56 ff01c9017830 pseudo-device: lockstat0
2008 Nov  7 18:53:56 ff01c9017770 lockstat0 is /pseudo/[EMAIL PROTECTED]
2008 Nov  7 18:53:56 ff01c90176b0 sd6 at si31240: target 0 lun 0
2008 Nov  7 18:53:56 ff01c90175f0
sd6 is /[EMAIL PROTECTED],0/pci1022,[EMAIL PROTECTED]/pci1095,[EMAIL 
PROTECTED]/[EMAIL PROTECTED],0
2008 Nov  7 18:53:56 ff01c9017530 sd5 at si31242: target 0 lun 0
2008 Nov  7 18:53:56 ff01c9017470
sd5 is /[EMAIL PROTECTED],0/pci1022,[EMAIL PROTECTED]/pci1095,[EMAIL 
PROTECTED]/[EMAIL PROTECTED],0
2008 Nov  7 18:53:56 ff01c90173b0 sd4 at si31241: target 0 lun 0
2008 Nov  7 18:53:56 ff01c90172f0
sd4 is /[EMAIL PROTECTED],0/pci1022,[EMAIL PROTECTED]/pci1095,[EMAIL 
PROTECTED]/[EMAIL PROTECTED],0
2008 Nov  7 18:53:56 ff01c9017230
/[EMAIL PROTECTED],0/pci1022,[EMAIL PROTECTED]/pci1095,[EMAIL PROTECTED]/[EMAIL 
PROTECTED],0 (sd4) online
2008 Nov  7 18:53:56 ff01c9017170 /[EMAIL PROTECTED],0/pci1022,[EMAIL 
PROTECTED]/pci1095,[EMAIL PROTECTED] :
2008 Nov  7 18:53:56 ff01c90170b0   SATA disk device at port 1
2008 Nov  7 18:53:56 ff01c9087f30
model ST31000340AS
2008 Nov  7 18:53:56 ff01c9087e70   firmware SD15
2008 Nov  7 18:53:56 ff01c9087db0   serial number
2008 Nov  7 18:53:56 ff01c9087cf0   supported features:
2008 Nov  7 18:53:56 ff01c9087c30
 48-bit LBA, DMA, Native Command Queueing, SMART self-test
2008 Nov  7 18:53:56 ff01c9087b70   SATA Gen1 signaling speed (1.5Gbps)
2008 Nov  7 18:53:56 ff01c9087ab0   Supported queue depth 32, limited to 31
2008 Nov  7 18:53:56 ff01c90879f0   capacity = 1953525168 sectors
2008 Nov  7 18:53:56 ff01c9087930
/[EMAIL PROTECTED],0/pci1022,[EMAIL PROTECTED]/pci1095,[EMAIL PROTECTED]/[EMAIL 
PROTECTED],0 (sd6) online
2008 Nov  7 18:53:56 ff01c9087870
/[EMAIL PROTECTED],0/pci1022,[EMAIL PROTECTED]/pci1095,[EMAIL PROTECTED]/[EMAIL 
PROTECTED],0 (sd5) online
2008 Nov  7 18:53:56 ff01c90877b0 /[EMAIL PROTECTED],0/pci1022,[EMAIL 
PROTECTED]/pci1095,[EMAIL PROTECTED] :
2008 Nov  7 18:53:56 ff01c90876f0   SATA disk device at port 1
2008 Nov  7 18:53:56 ff01c9087630
model ST31000340AS
2008 Nov  7 18:53:56 ff01c9087570   firmware SD15
2008 Nov  7 18:53:56 ff01c90874b0   serial number 
2008 Nov  7 18:53:56 ff01c90873f0   supported features:
2008 Nov  7 18:53:56 ff01c9087330
 48-bit LBA, DMA, Native Command Queueing, SMART self-test
2008 Nov  7 18:53:56 ff01c9087270   SATA Gen1 signaling speed (1.5Gbps)
2008 Nov  7 18:53:56 ff01c90871b0   Supported queue depth 

Re: [zfs-discuss] 'zfs recv' is very slow

2008-11-07 Thread Andrew Gabriel
Ian Collins wrote:
 Andrew Gabriel wrote:
 Ian Collins wrote:
 Andrew Gabriel wrote:
 Given the issue described is slow zfs recv over network, I suspect
 this is:

 6729347 Poor zfs receive performance across networks

 This is quite easily worked around by putting a buffering program
 between the network and the zfs receive. There is a public domain
 mbuffer which should work, although I haven't tried it as I wrote
 my own. The buffer size you need is about 5 seconds worth of data.
 In my case of 7200RPM disks (in a mirror and not striped) and a
 gigabit ethernet link, the disks are the limiting factor at around
 57MB/sec sustained i/o, so I used a 250MB buffer to best effect. If
 I recall correctly, that speeded up the zfs send/recv across the
 network by about 3 times, and it then ran at the disk platter speed.
  
 Did this apply to incremental sends as well?  I can live with ~20MB/sec
 for full sends, but ~1MB/sec for incremental sends is a killer.
 It doesn't help the ~1MB/sec periods in incrementals, but it does help
 the fast periods in incrementals.

 :)
 
 I don't see the 5 second bursty behaviour described in the bug report. 
 It's more like 5 second interval gaps in the network traffic while the
 data is written to disk.

That is exactly the issue. When the zfs recv data has been written, zfs 
recv starts reading the network again, but there's only a tiny amount of 
data buffered in the TCP/IP stack, so it has to wait for the network to 
heave more data across. In effect, it's a single buffered copy. The 
addition of a buffer program turns it into a double-buffered (or cyclic 
buffered) copy, with the disks running flat out continuously, and the 
network streaming data across continuously at the disk platter speed.
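
As a concrete sketch of that workaround (I wrote my own buffer program,
so the mbuffer invocation and the 250MB figure below are illustrative
rather than a tested recipe; the hostname and dataset names are made up):

  # receiving host: listen on a TCP port, buffer ~250MB, feed zfs receive
  mbuffer -I 9090 -m 250M -s 128k | zfs receive -F tank/backup/fs

  # sending host: stream the snapshot through the buffer over the network
  zfs send tank/fs@snap | mbuffer -O recvhost:9090 -m 250M -s 128k

With the buffer absorbing the write bursts, zfs receive's disk i/o and
the network transfer can both run continuously instead of alternating.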

What are your theoretical max speeds for network and disk i/o?
Taking the smaller of the two, are you seeing the sustained send/recv
performance match it (excluding the ~1MB/sec periods, which are a
separate problem)?

The effect described in that bug is most obvious when the disk and
network speeds are of the same order of magnitude (as in the example I
gave above).  Given my disk i/o rate above, if the network is much
faster (say, 10Gbit), then it's going to cope with the bursty nature of
the traffic better.  If the network is much slower (say, 100Mbit), then
it's going to be running flat out anyway and again you won't notice the
bursty reads (a colleague measured only a 20% gain in that case, rather
than my 200% gain).

-- 
Andrew


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss