Re: [zfs-discuss] Migrating to ZFS

2010-06-04 Thread zfsnoob4
It's not easy to make Solaris slices on the boot drive.

As I am just realizing, the installer does not have any kind of partitioning 
software.

I have a Linux boot disc and I am contemplating using GParted to resize the Windows 
partition to create a raw 50GB empty partition. Can the installer format a raw 
partition into a Solaris FS? If it can, this will be easy (assuming it can also set up 
the dual boot properly).

This is what I'm thinking:
1) Use GParted to resize the Windows partition and thereby create a 50GB raw 
partition.
2) Use the OpenSolaris installer to format the raw partition into a Solaris FS.
3) Install OpenSolaris 2009.06; the setup should automatically configure the 
dual boot with Windows and OpenSolaris.
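
As a sanity check after step 1 (using /dev/sda purely as an example device name), 
something like this from the Linux boot disc should show the freed space:

# print the partition table including unallocated regions, so the
# new ~50GB gap left by shrinking Windows is visible
parted /dev/sda unit GiB print free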

Does that make sense?

Thanks again.

-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Depth of Scrub

2010-06-04 Thread sensille
Hi,

I have a small question about the depth of scrub in a raidz/2/3 configuration.
I'm quite sure scrub does not check spares or unused areas of the disks (it
could check whether the disks detect any errors there).
But what about the parity? Obviously it has to be checked, but I can't find
any indication of it in the literature. The man page only states that the
data is checksummed and that the redundancy is used only if that check fails.
Please tell me I'm wrong ;)

But what I'm really getting at with my question: how much coverage can be
reached with a find | xargs wc compared to a scrub? It misses the snapshots,
but does it miss anything beyond that?
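
To be concrete, the sort of command I have in mind is roughly this, with /tank 
standing in for the pool's mountpoint:

# read every plain file once; wc forces the data to be read and the
# byte counts are simply discarded (filenames with spaces need extra care)
find /tank -type f | xargs wc -c > /dev/null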

Thanks,
Arne
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] slog / log recovery is here!

2010-06-04 Thread R. Eulenberg
Sorry for reviving this old thread.

I even have this problem on my (production) backup server. I lost my system HDD 
and my separate ZIL device when the system crashed, and now I'm in trouble. The 
old system was running the latest version of osol/dev (snv_134) with zfs 
v22. 
After the server crash I was very optimistic about solving the problem the same 
day. That was a long time ago.
I set up a new system (osol 2009.06, updated to the latest version 
of osol/dev - snv_134 - with deduplication) and then tried to import my 
backup zpool, but it does not work.

# zpool import
  pool: tank1
id: 5048704328421749681
 state: UNAVAIL
status: The pool was last accessed by another system.
action: The pool cannot be imported due to damaged devices or data.
   see: http://www.sun.com/msg/ZFS-8000-EY
config:

tank1UNAVAIL  missing device
  raidz2-0   ONLINE
c7t5d0   ONLINE
c7t0d0   ONLINE
c7t6d0   ONLINE
c7t3d0   ONLINE
c7t1d0   ONLINE
c7t4d0   ONLINE
c7t2d0   ONLINE

# zpool import -f tank1
cannot import 'tank1': one or more devices is currently unavailable
Destroy and re-create the pool from
a backup source

None of the other options (-F, -X, -V, -D), nor any combination of them, helps either.
I cannot add / attach / detach / remove a vdev or the ZIL device either, 
because the system tells me there is no zpool 'tank1'.
In the last ten days I have read a lot of threads, troubleshooting guides and 
ZFS best-practice documentation, but I have not found a solution to 
my problem. I created a fake zpool with a separate ZIL device to combine the new 
ZIL file with my old zpool for import, but it doesn't work because of 
the differing GUID and checksum (I modified the name with a binary editor).
The output of:
e...@opensolaris:~# zdb -e tank1

Configuration for import:
vdev_children: 2
version: 22
pool_guid: 5048704328421749681
name: 'tank1'
state: 0
hostid: 946038
hostname: 'opensolaris'
vdev_tree:
type: 'root'
id: 0
guid: 5048704328421749681
children[0]:
type: 'raidz'
id: 0
guid: 16723866123388081610
nparity: 2
metaslab_array: 23
metaslab_shift: 30
ashift: 9
asize: 7001340903424
is_log: 0
create_txg: 4
children[0]:
type: 'disk'
id: 0
guid: 6858138566678362598
phys_path: 
'/p...@0,0/pci8086,2...@1e/pci11ab,1...@9/d...@0,0:a'
whole_disk: 1
DTL: 4345
create_txg: 4
path: '/dev/dsk/c7t5d0s0'
devid: 
'id1,s...@sata_samsung_hd103uj___s13pj1bq709050/a'
children[1]:
type: 'disk'
id: 1
guid: 16136237447458434520
phys_path: 
'/p...@0,0/pci8086,2...@1e/pci11ab,1...@9/d...@1,0:a'
whole_disk: 1
DTL: 4344
create_txg: 4
path: '/dev/dsk/c7t0d0s0'
devid: 
'id1,s...@sata_samsung_hd103uj___s13pjdwq317311/a'
children[2]:
type: 'disk'
id: 2
guid: 10876853602231471126
phys_path: 
'/p...@0,0/pci8086,2...@1e/pci11ab,1...@9/d...@2,0:a'
whole_disk: 1
DTL: 4343
create_txg: 4
path: '/dev/dsk/c7t6d0s0'
devid: 
'id1,s...@sata_hitachi_hdt72101__stf604mh14s56w/a'
children[3]:
type: 'disk'
id: 3
guid: 2384677379114262201
phys_path: 
'/p...@0,0/pci8086,2...@1e/pci11ab,1...@9/d...@3,0:a'
whole_disk: 1
DTL: 4342
create_txg: 4
path: '/dev/dsk/c7t3d0s0'
devid: 
'id1,s...@sata_samsung_hd103uj___s13pj1nq811135/a'
children[4]:
type: 'disk'
id: 4
guid: 15143849195434333247
phys_path: 
'/p...@0,0/pci8086,2...@1e/pci11ab,1...@9/d...@4,0:a'
whole_disk: 1
DTL: 4341
create_txg: 4
path: '/dev/dsk/c7t1d0s0'
devid: 
'id1,s...@sata_hitachi_hdt72101__stf604mh16v73w/a'
children[5]:
type: 'disk'
id: 5
guid: 11627603446133164653
phys_path: 

Re: [zfs-discuss] one more time: pool size changes

2010-06-04 Thread Marty Scholes
On Jun 3, 2010 7:35 PM, David Magda wrote:

 On Jun 3, 2010, at 13:36, Garrett D'Amore wrote:
 
  Perhaps you have been unlucky.  Certainly, there is
 a window with N 
  +1 redundancy where a single failure leaves the
 system exposed in  
  the face of a 2nd fault.  This is a statistics
 game...
 
 It doesn't even have to be a drive failure, but an
 unrecoverable read  
 error.

Well said.

Also include a controller burp, a bit flip somewhere, a drive going offline 
briefly, a momentary fibre-cable interruption, etc.  The list goes on.

My experience is that these weird, once-in-a-lifetime issues tend to present 
in clumps which are not as evenly distributed as statistics would lead you to 
believe.  Rather, like my kids, they save up their fun into coordinated bursts.

When these bursts happen, you get the ensuing conversations with stakeholders 
about how all of this redundancy you tricked them into purchasing has still 
left them exposed.  Not good times.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Depth of Scrub

2010-06-04 Thread Edward Ned Harvey
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of sensille
 
 I'm quite sure scrub does not check spares or unused areas of the disks
 (it
 could check if the disks detects any errors there).
 But what about the parity? Obviously it has to be checked, but I can't
 find
 any indications for it in the literature. The man page only states that
 the
 data is being checksummed and only if that fails the redundancy is
 being used.
 Please tell me I'm wrong ;)

If my understanding is correct, a scrub reads and checksums all the used
blocks on all the primary storage devices.  Meaning:  The scrub is not
checking log devices or spares, and I don't think it checks cache devices.
And as you said, it's not reading empty space.

The main reason to use scrub, as opposed to your find command (which has
some serious shortcomings) or even a zfs send > /dev/null command (which
has far fewer shortcomings) is:  When you just tell the system to read data,
you're only sure to read one half of redundant data.  You might
coincidentally just read the good side of the mirror, or whatever, and
therefore fail to detect the corrupted data on the other side of the mirror.
You've got to use the scrub.

It is very wise to perform a scrub occasionally, because you can only
correct errors as long as you still have redundancy.  If a device fails, and
degrades redundancy, and then some rarely used block is discovered to be
corrupt during the resilver ... too bad for you.
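
As a minimal sketch (pool name tank assumed), the periodic routine is just:

# walk every allocated block and verify its checksum against all redundancy
zpool scrub tank

# later: check progress and whether any errors were found and repaired
zpool status -v tank

Many people simply drop the scrub line into a weekly cron job.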


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS recovery tools

2010-06-04 Thread Sigbjørn Lie

David Magda wrote:

On Wed, June 2, 2010 02:20, Sigbjorn Lie wrote:

  

I have just recovered from a ZFS crash. During the antagonizing time
this took, I was surprised to learn how undocumented the tools and
options for ZFS recovery we're. I managed to recover thanks to some great
forum posts from Victor Latushkin, however without his posts I would
still be crying at night...



For the archives, from a private exchange:

Zdb(1M) is complicated and in-flux, so asking on zfs-discuss or calling
Oracle isn't a very onerous request IMHO.

As for recovery, see zpool(1M):

  

zpool import [-o mntopts] [ -o  property=value] ... [-d dir  | -c
 cachefile] [-D] [-f] [-R root] [-F [-n]] pool | id  [newpool]


[...]
  

-F
 Recovery mode for a non-importable pool. Attempt to return
 the pool to an importable state by discarding the last few
 transactions. Not all damaged pools can be recovered by
 using this option. If successful, the data from the
 discarded transactions is irretrievably lost. This option
 is ignored if the pool is importable or already imported.



http://docs.sun.com/app/docs/doc/819-2240/zpool-1m

This is available as of svn_128, and not in Solaris as of Update 8 (10/09):

http://bugs.opensolaris.org/view_bug.do?bug_id=6667683

This was part of PSARC 2009/479:

http://arc.opensolaris.org/caselog/PSARC/2009/479/
http://www.c0t0d0s0.org/archives/6067-PSARC-2009479-zpool-recovery-support.html
http://sparcv9.blogspot.com/2009/09/zpool-recovery-support-psarc2009479.html

Personally I'm waiting for Solaris 10u9 for a lot of these fixes and
updates [...].

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
  


Excellent! I wish I had known about these features when I was attempting 
to recover my pool using 2009.06/snv111.

I still believe there are some documentation updates to be done. While I was 
attempting to recover my pool and googling for information, I found none of 
these documents. What I did find was a lot of forum posts from people who did 
not manage to recover and assumed their data was lost.

The "ZFS Troubleshooting and Data Recovery" chapter of the Solaris ZFS Administration Guide and the ZFS Troubleshooting Guide at SolarisInternals would greatly benefit from being updated with the information you provided. One of the reasons for this is that they appear at the top of Google's rankings for "zfs recovery" as a search topic.  :) 

Thank you for the links.  :) 



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] slog / log recovery is here!

2010-06-04 Thread Sigbjørn Lie


R. Eulenberg wrote:

Sorry for reviving this old thread.

I even have this problem on my (productive) backup server. I lost my system-hdd and my separate ZIL-device while the system crashs and now I'm in trouble. The old system was running under the least version of osol/dev (snv_134) with zfs v22. 
After the server crashs I was very optimistic of solving the problems the same day. It's a long time ago.

I was setting up a new systen (osol 2009.06 and updating to the lastest version 
of osol/dev - snv_134 - with deduplication) and then I tried to import my 
backup zpool, but it does not work.

# zpool import
  pool: tank1
id: 5048704328421749681
 state: UNAVAIL
status: The pool was last accessed by another system.
action: The pool cannot be imported due to damaged devices or data.
   see: http://www.sun.com/msg/ZFS-8000-EY
config:

tank1UNAVAIL  missing device
  raidz2-0   ONLINE
c7t5d0   ONLINE
c7t0d0   ONLINE
c7t6d0   ONLINE
c7t3d0   ONLINE
c7t1d0   ONLINE
c7t4d0   ONLINE
c7t2d0   ONLINE

# zpool import -f tank1
cannot import 'tank1': one or more devices is currently unavailable
Destroy and re-create the pool from
a backup source

Any other option (-F, -X, -V, -D) and any combination of them doesn't helps too.
I can not add / attach / detach / remove a vdev and the ZIL-device either, 
because the system tells me: there is no zpool 'tank1'.
In the last ten days I read a lot of threads, guides to solve problems and best 
practice documentations with ZFS and so on, but I do not found a solution for 
my problem. I created a fake-zpool with separate ZIL-device to combine the new 
ZIL-file with my old zpool for importing them, but it doesn't work in course of 
the different GUID and checksum (the name I was modifiing by an binary editor).
The output of:
e...@opensolaris:~# zdb -e tank1

Configuration for import:
vdev_children: 2
version: 22
pool_guid: 5048704328421749681
name: 'tank1'
state: 0
hostid: 946038
hostname: 'opensolaris'
vdev_tree:
type: 'root'
id: 0
guid: 5048704328421749681
children[0]:
type: 'raidz'
id: 0
guid: 16723866123388081610
nparity: 2
metaslab_array: 23
metaslab_shift: 30
ashift: 9
asize: 7001340903424
is_log: 0
create_txg: 4
children[0]:
type: 'disk'
id: 0
guid: 6858138566678362598
phys_path: 
'/p...@0,0/pci8086,2...@1e/pci11ab,1...@9/d...@0,0:a'
whole_disk: 1
DTL: 4345
create_txg: 4
path: '/dev/dsk/c7t5d0s0'
devid: 
'id1,s...@sata_samsung_hd103uj___s13pj1bq709050/a'
children[1]:
type: 'disk'
id: 1
guid: 16136237447458434520
phys_path: 
'/p...@0,0/pci8086,2...@1e/pci11ab,1...@9/d...@1,0:a'
whole_disk: 1
DTL: 4344
create_txg: 4
path: '/dev/dsk/c7t0d0s0'
devid: 
'id1,s...@sata_samsung_hd103uj___s13pjdwq317311/a'
children[2]:
type: 'disk'
id: 2
guid: 10876853602231471126
phys_path: 
'/p...@0,0/pci8086,2...@1e/pci11ab,1...@9/d...@2,0:a'
whole_disk: 1
DTL: 4343
create_txg: 4
path: '/dev/dsk/c7t6d0s0'
devid: 
'id1,s...@sata_hitachi_hdt72101__stf604mh14s56w/a'
children[3]:
type: 'disk'
id: 3
guid: 2384677379114262201
phys_path: 
'/p...@0,0/pci8086,2...@1e/pci11ab,1...@9/d...@3,0:a'
whole_disk: 1
DTL: 4342
create_txg: 4
path: '/dev/dsk/c7t3d0s0'
devid: 
'id1,s...@sata_samsung_hd103uj___s13pj1nq811135/a'
children[4]:
type: 'disk'
id: 4
guid: 15143849195434333247
phys_path: 
'/p...@0,0/pci8086,2...@1e/pci11ab,1...@9/d...@4,0:a'
whole_disk: 1
DTL: 4341
create_txg: 4
path: '/dev/dsk/c7t1d0s0'
devid: 
'id1,s...@sata_hitachi_hdt72101__stf604mh16v73w/a'
children[5]:
type: 'disk'
id: 5
guid: 11627603446133164653

Re: [zfs-discuss] Migrating to ZFS

2010-06-04 Thread Edho P Arief
On Fri, Jun 4, 2010 at 2:59 PM, zfsnoob4 zfsnoob...@hotmail.co.uk wrote:
 It's not easy to make Solaris slices on the boot drive.

 As I am just realizing. The installer does not have any kind of partition 
 software.

 I have a linux boot disc and I am contemplating using gparted to resize the 
 win partition to create a raw 50GB empty partition. Can the installer format 
 a raw partition into a Solaris FS? If it can it will be easy (assuming it can 
 set up the dual boot properly).

 This is what I'm thinking:
 1) Use Gparted to resize the windows partition and therefore create a 50GB 
 raw partition.
 2) Use the opensolaris installer to format the raw partition into a Solaris 
 FS.
 3) Install opensolaris 2009.06, the setup should automatically configure the 
 dual boot with windows and opensolaris.

 Does that make sense?


that's exactly what I usually do


-- 
O ascii ribbon campaign - stop html mail - www.asciiribbon.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] ZFS Usage on drives

2010-06-04 Thread Andreas Iannou

Hello again,

 

I'm wondering if we can see the amount of usage for a drive in a ZFS raidz or 
mirror. I'm in the process of replacing some drives, but I want to replace the 
less-used drives first (maybe only 40-50% utilisation). Is there such a thing? 
I saw somewhere that a guy had 3 drives in a raidz, and one drive only needed 
612GB resilvered to be replaced.

 

I'm hoping that, as there's quite a bit of free space, some drives only hold a 
little data and therefore only need 200-300GB resilvered.

 

Thanks,

Andre
  
_
New, Used, Demo, Dealer or Private? Find it at CarPoint.com.au
http://clk.atdmt.com/NMN/go/206222968/direct/01/___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Depth of Scrub

2010-06-04 Thread Marty Scholes
 I have a small question about the depth of scrub in a
 raidz/2/3 configuration.
 I'm quite sure scrub does not check spares or unused
 areas of the disks (it
 could check if the disks detects any errors there).
 But what about the parity?

From some informal performance testing of RAIDZ2/3 arrays, I am confident that 
scrub reads the parity blocks and normal reads do not.

You can see this for yourself with iostat -x or zpool iostat -v

Start monitoring and watch read I/O.  You will see regularly that a RAIDZ3 
array will read from all but three drives, which I presume is the unread parity.

Do the same monitoring while a scrub is underway and you will see all drives 
being read from equally.
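
For example, the two views I mean are (pool name tank assumed):

# per-device view from ZFS itself, refreshed every 5 seconds
zpool iostat -v tank 5

# per-disk extended statistics from the kernel, every 5 seconds
iostat -xn 5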

My experience suggests something similar is taking place with mirrors.

If you think about it, having a scrub check everything but the parity would be 
a rather pointless operation.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] nfs share of nested zfs directories?

2010-06-04 Thread Pasi Kärkkäinen
On Fri, Jun 04, 2010 at 08:43:32AM -0400, Cassandra Pugh wrote:
Thank you, when I manually mount using the mount -t nfs4 option, I am
able to see the entire tree, however, the permissions are set as
nfsnobody.
Warning: rpc.idmapd appears not to be running.
 All uids will be mapped to the nobody uid.
 

Did you actually read the error message? :)
Finding a solution shouldn't be too difficult after that..

-- Pasi

-
Cassandra
(609) 243-2413
Unix Administrator
 
From a little spark may burst a mighty flame.
-Dante Alighieri
 
On Thu, Jun 3, 2010 at 4:33 PM, Brandon High [1]bh...@freaks.com wrote:
 
  On Thu, Jun 3, 2010 at 12:50 PM, Cassandra Pugh [2]cp...@pppl.gov
  wrote:
   The special case here is that I am trying to traverse NESTED zfs
  systems,
   for the purpose of having compressed and uncompressed directories.
 
  Make sure to use mount -t nfs4 on your linux client. The standard
  nfs type only supports nfs v2/v3.
 
  -B
  --
  Brandon High : [3]bh...@freaks.com
 
 References
 
Visible links
1. mailto:bh...@freaks.com
2. mailto:cp...@pppl.gov
3. mailto:bh...@freaks.com

 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Usage on drives

2010-06-04 Thread Freddie Cash
On Fri, Jun 4, 2010 at 6:36 AM, Andreas Iannou 
andreas_wants_the_w...@hotmail.com wrote:

  Hello again,

 I'm wondering if we can see the amount of usage for a drive in ZFS raidz
 mirror. I'm in the process of replacing some drives but I want to replace
 the less used drives first (maybe only 40-50% utilisation). Is there such a
 thing? I saw somewhere that a guy had 3 drives in a raidz, one drive only
 had to be resilvered 612Gb to replace.

 I'm hoping as theres quite a bit of free space that some drives only occupy
 a little and therefore only resilver 200-300Gb of data.


When in doubt, read the man page.  :)

zpool iostat -v

-- 
Freddie Cash
fjwc...@gmail.com
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Depth of Scrub

2010-06-04 Thread David Dyer-Bennet

On Fri, June 4, 2010 03:29, sensille wrote:
 Hi,

 I have a small question about the depth of scrub in a raidz/2/3
 configuration.
 I'm quite sure scrub does not check spares or unused areas of the disks
 (it
 could check if the disks detects any errors there).
 But what about the parity? Obviously it has to be checked, but I can't
 find
 any indications for it in the literature. The man page only states that
 the
 data is being checksummed and only if that fails the redundancy is being
 used.
 Please tell me I'm wrong ;)

I believe you're wrong.  Scrub checks all the blocks used by ZFS,
regardless of what's in them.  (It doesn't check free blocks.)

 But what I'm really targeting with my question: How much coverage can be
 reached with a find | xargs wc in contrast to scrub? It misses the
 snapshots, but anything beyond that?

Your find script misses the redundant data; scrub checks it all.

It may well miss some of the metadata as well, and probably misses the
redundant copies of metadata.

-- 
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] [zones-discuss] ZFS ARC cache issue

2010-06-04 Thread Robert Milkowski

On 04/06/2010 15:46, James Carlson wrote:

Petr Benes wrote:
   

add to /etc/system something like (value depends on your needs)

* limit greedy ZFS to 4 GiB
set zfs:zfs_arc_max = 4294967296

And yes, this has nothing to do with zones :-).
 

That leaves unanswered the underlying question: why do you need to do
this at all?  Isn't the ZFS ARC supposed to release memory when the
system is under pressure?  Is that mechanism not working well in some
cases ... ?

   


My understanding is that if kmem gets heavily fragmented, ZFS won't be 
able to give back much memory.
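
A quick way to watch this on a live system (just a sketch, not a tuning 
recommendation) is to compare the current ARC size against its target and cap:

# current ARC size, target and maximum, in bytes
kstat -p zfs:0:arcstats:size zfs:0:arcstats:c zfs:0:arcstats:c_max

If size stays pinned near c_max even while applications are being squeezed, 
capping the ARC in /etc/system as quoted above is the usual workaround.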



--
Robert Milkowski
http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] [zones-discuss] ZFS ARC cache issue

2010-06-04 Thread Garrett D'Amore
On Fri, 2010-06-04 at 16:03 +0100, Robert Milkowski wrote:
 On 04/06/2010 15:46, James Carlson wrote:
  Petr Benes wrote:
 
  add to /etc/system something like (value depends on your needs)
 
  * limit greedy ZFS to 4 GiB
  set zfs:zfs_arc_max = 4294967296
 
  And yes, this has nothing to do with zones :-).
   
  That leaves unanswered the underlying question: why do you need to do
  this at all?  Isn't the ZFS ARC supposed to release memory when the
  system is under pressure?  Is that mechanism not working well in some
  cases ... ?
 
 
 
 My understanding is that if kmem gets heavily fragmaneted ZFS won't be 
 able to give back much memory.
 

The slab allocator and virtual memory are designed to prevent memory
fragmentation.  That said, it is possible that certain devices which
need physically contiguous memory may be affected by physical address
fragmentation.  I'm not sure exactly what kind of fragmentation you're
talking about here though... 

- Garrett



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] nfs share of nested zfs directories?

2010-06-04 Thread Cassandra Pugh
Well, yes I understand I need to research the issue of running the idmapd
service, but I also need to figure out how to use nfsv4 and automount.
-
Cassandra
(609) 243-2413
Unix Administrator


From a little spark may burst a mighty flame.
-Dante Alighieri


On Fri, Jun 4, 2010 at 10:00 AM, Pasi Kärkkäinen pa...@iki.fi wrote:

 On Fri, Jun 04, 2010 at 08:43:32AM -0400, Cassandra Pugh wrote:
 Thank you, when I manually mount using the mount -t nfs4 option, I
 am
 able to see the entire tree, however, the permissions are set as
 nfsnobody.
 Warning: rpc.idmapd appears not to be running.
  All uids will be mapped to the nobody uid.
 

 Did you actually read the error message? :)
 Finding a solution shouldn't be too difficult after that..

 -- Pasi

 -
 Cassandra
 (609) 243-2413
 Unix Administrator
 
 From a little spark may burst a mighty flame.
 -Dante Alighieri
 
 On Thu, Jun 3, 2010 at 4:33 PM, Brandon High [1]bh...@freaks.com
 wrote:
 
   On Thu, Jun 3, 2010 at 12:50 PM, Cassandra Pugh [2]cp...@pppl.gov
   wrote:
The special case here is that I am trying to traverse NESTED zfs
   systems,
for the purpose of having compressed and uncompressed directories.
 
   Make sure to use mount -t nfs4 on your linux client. The standard
   nfs type only supports nfs v2/v3.
 
   -B
   --
   Brandon High : [3]bh...@freaks.com
 
  References
 
 Visible links
 1. mailto:bh...@freaks.com
 2. mailto:cp...@pppl.gov
 3. mailto:bh...@freaks.com

  ___
  zfs-discuss mailing list
  zfs-discuss@opensolaris.org
  http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Deduplication and ISO files

2010-06-04 Thread Ray Van Dolson
I'm running zpool version 23 (via ZFS fuse on Linux) and have a zpool
with deduplication turned on.

I am testing how well deduplication will work for the storage of many,
similar ISO files and so far am seeing unexpected results (or perhaps
my expectations are wrong).

The ISOs I'm testing with are the 32-bit and 64-bit versions of the
RHEL5 DVD ISOs.  While the two have their differences, they do contain a
lot of similar data as well.

If I explode both ISO files and copy them to my ZFS filesystem I see
about a 1.24x dedup ratio.

However, if I have only the ISO files on the ZFS filesystem, the ratio
is 1.00x -- no savings at all.

Does this make sense?  I'm going to experiment with other combinations
of ISO files as well...

Thanks,
Ray
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs list sizes - newbie question

2010-06-04 Thread Andres Noriega
Thanks... here's the requested output:

NAME            AVAIL   USED  USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD
vtl_pool        1020G  15.0T         0   46.3K              0      15.0T
vtl_pool/lun00  1.99T     1T         0   6.05G          1018G          0
vtl_pool/lun01  1.99T     1T         0   4.46G          1020G          0
vtl_pool/lun02  1.99T     1T         0   4.44G          1020G          0
vtl_pool/lun03  1.99T     1T         0   4.49G          1020G          0
vtl_pool/lun04  2.00T     1T         0    869M          1023G          0
vtl_pool/lun05  2.00T     1T         0    725M          1023G          0
vtl_pool/lun06  2.00T     1T         0    722M          1023G          0
vtl_pool/lun07  2.00T     1T         0    700M          1023G          0
vtl_pool/lun08  2.00T     1T         0    534M          1023G          0
vtl_pool/lun09  2.00T     1T         0    518M          1023G          0
vtl_pool/lun10  2.00T     1T         0    309M          1024G          0
vtl_pool/lun11  2.00T     1T         0   4.84M          1024G          0
vtl_pool/lun12  2.00T     1T         0   4.84M          1024G          0
vtl_pool/lun13  2.00T     1T         0   4.84M          1024G          0
vtl_pool/lun14  2.00T     1T         0   4.84M          1024G          0
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs list sizes - newbie question

2010-06-04 Thread Brandon High
On Thu, Jun 3, 2010 at 1:06 PM, Andres Noriega
andres.nori...@oracle.com wrote:
 Hi everyone, I have a question about the zfs list output. I created a large 
 zpool and then carved out 1TB volumes (zfs create -V 1T vtl_pool/lun##). 
 Looking at the zfs list output, I'm a little thrown off by the AVAIL amount. 
 Can anyone clarify for me why it is saying 2T?

You have a 16T zpool, and have created 15x 1T zvols, leaving 1T free.
Each zvol is mostly unused, so it has 1T available in its
refreservation, and an additional 1T available from the zpool.

The zvols won't actually hold 2T, because they were created with 1T of
space. The space beyond 1T can be used for snapshots though.
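
A quick way to see where those numbers come from (a sketch, reusing the 
poster's pool name) is:

# per-dataset accounting: how much of each zvol's 1T refreservation is used
zfs get -r used,available,refreservation vtl_pool

# pool-level free space, which every zvol can also draw on for snapshots
zpool list vtl_pool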

-B

-- 
Brandon High : bh...@freaks.com
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Deduplication and ISO files

2010-06-04 Thread Brandon High
On Fri, Jun 4, 2010 at 9:30 AM, Ray Van Dolson rvandol...@esri.com wrote:
 The ISO's I'm testing with are the 32-bit and 64-bit versions of the
 RHEL5 DVD ISO's.  While both have their differences, they do contain a
 lot of similar data as well.

Similar != identical.

Dedup works on blocks in zfs, so unless the iso files have identical
data aligned at 128k boundaries you won't see any savings.

 If I explode both ISO files and copy them to my ZFS filesystem I see
 about a 1.24x dedup ratio.

Each file starts a new block, so the identical files can be deduped.

-B

-- 
Brandon High : bh...@freaks.com
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS recovery tools

2010-06-04 Thread Miles Nordin
 sl == Sigbjørn Lie sigbj...@nixtra.com writes:

sl Excellent! I wish I would have known about these features when
sl I was attempting to recover my pool using 2009.06/snv111.

the OP tried the -F feature.  It doesn't work after you've lost zpool.cache:

op I was setting up a new systen (osol 2009.06 and updating to
op the lastest version of osol/dev - snv_134 - with
op deduplication) and then I tried to import my backup zpool, but
op it does not work.  

op # zpool import -f tank1 
op cannot import 'tank1': one or more devices is currently unavailable 
op Destroy and re-create the pool from a backup source

op Any other option (-F, -X, -V, -D) and any combination of them
op doesn't helps too.

I have been in here repeatedly warning about this incompleteness of
the feature while fanbois keep saying ``we have slog recovery so don't
worry.''

R., please let us know if the 'zdb -e -bcsvL zpool-name' incantation
Sigbjorn suggested ends up working for you or not.


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Depth of Scrub

2010-06-04 Thread Brandon High
On Fri, Jun 4, 2010 at 1:29 AM, sensille sensi...@gmx.net wrote:
 But what I'm really targeting with my question: How much coverage can be
 reached with a find | xargs wc in contrast to scrub? It misses the snapshots,
 but anything beyond that?

Your script will also update the atime on every file, which may not be
the desired effect.

-B

-- 
Brandon High : bh...@freaks.com
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ssd pool + ssd cache ?

2010-06-04 Thread zfsnoob4
I'm also considering adding a cheap SSD as a cache drive. The only problem is 
that SSDs lose performance over time because when something is deleted, it is 
not actually deleted. So the next time something is written to the same blocks, 
the drive must first erase them, then write.

To fix this, SSDs support a new command called TRIM, which automatically cleans 
the blocks after something is deleted.

Does anyone know if opensolaris supports Trim?
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Usage on drives

2010-06-04 Thread Brandon High
On Fri, Jun 4, 2010 at 6:36 AM, Andreas Iannou
andreas_wants_the_w...@hotmail.com wrote:
 I'm wondering if we can see the amount of usage for a drive in ZFS raidz
 mirror. I'm in the process of replacing some drives but I want to replace

By definition, a mirror has a copy of all the data on each drive.

A raidz vdev is auto-balancing, and effort is made to spread data
across as many devices as possible. Unless access patterns are
weird, each drive should hold the same amount of data, within a
reasonable margin of error.

-B

-- 
Brandon High : bh...@freaks.com
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs list sizes - newbie question

2010-06-04 Thread Andres Noriega
I understand now. So each vol's available space is reporting its reservation 
and whatever is still available in the pool. 

I appreciate the explanation. Thank you!

 On Thu, Jun 3, 2010 at 1:06 PM, Andres Noriega
 andres.nori...@oracle.com wrote:
  Hi everyone, I have a question about the zfs list
 output. I created a large zpool and then carved out
 1TB volumes (zfs create -V 1T vtl_pool/lun##).
 Looking at the zfs list output, I'm a little thrown
 off by the AVAIL amount. Can anyone clarify for me
 why it is saying 2T?
 
 You have a 16T zpool, and have created 15x 1T zvols,
 leaving 1T free.
 Each zvol is mostly unused, so it has 1T available in
 its
 refreservation, and an additional 1T available from
 the zpool.
 
 The zvols won't actually hold 2T, because they were
 created with 1T of
 space. The space beyond 1T can be used for snapshots
 though.
 
 -B
 
 -- 
 Brandon High : bh...@freaks.com
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discu
 ss

-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Migrating to ZFS

2010-06-04 Thread Brandon High
On Fri, Jun 4, 2010 at 12:59 AM, zfsnoob4 zfsnoob...@hotmail.co.uk wrote:
 This is what I'm thinking:
 1) Use Gparted to resize the windows partition and therefore create a 50GB 
 raw partition.
 2) Use the opensolaris installer to format the raw partition into a Solaris 
 FS.
 3) Install opensolaris 2009.06, the setup should automatically configure the 
 dual boot with windows and opensolaris.

 Does that make sense?

That will work fine.

Be aware that Solaris on x86 has two types of partitions. There are
fdisk partitions (c0t0d0p1, etc) which is what gparted, windows and
other tools will see. There are also Solaris partitions or slices
(c0t0d0s0). You can create or edit these with the 'format' command in
Solaris. These are created in an fdisk partition that is the SOLARIS2
type. So yeah, it's a partition table inside a partition table.

The caiman installer will allow you to create and install into fdisk
partitions. It creates a Solaris slice that uses the entire fdisk
partition.

If you want to change the size or layout of the slices, you can't do
it at install time.

-B

-- 
Brandon High : bh...@freaks.com
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs list sizes - newbie question

2010-06-04 Thread Freddie Cash
On Fri, Jun 4, 2010 at 11:41 AM, Andres Noriega
andres.nori...@oracle.comwrote:

 I understand now. So each vol's available space is reporting it's
 reservation and whatever is still available in the pool.

 I appreciate the explanation. Thank you!


If you want the available space to be a hard limit, have a look at the quota
property.

The reservation tells the pool to reserve that amount of space for the
dataset, meaning that space is no longer available to anything else in the
pool.

The quota tells the pool the max amount of storage the dataset can use, and
is reflected in the space available output of various tools (like zfs
list, df, etc).
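
A minimal sketch, with tank/data as a hypothetical filesystem (for zvols like 
the luns above, size is governed by volsize/refreservation rather than quota):

# hard-cap the dataset, including its snapshots, at 1T
zfs set quota=1T tank/data

# the limit now shows up in the available space reported by zfs list / df
zfs get quota,available tank/data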

-- 
Freddie Cash
fjwc...@gmail.com
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Small stalls slowing down rsync from holding network saturation every 5 seconds

2010-06-04 Thread Sandon Van Ness
On 06/01/2010 07:57 AM, Bob Friesenhahn wrote:
 On Mon, 31 May 2010, Sandon Van Ness wrote:
 With sequential writes I don't see how parity writing would be any
 different from when I just created a 20 disk zpool which is doing the
 same writes every 5 seconds but the only difference is it isn't maxing
 out CPU usage when doing the writes and and I don't see the transfer
 stall during the writes like I did on raidz2.

 I am not understanding the above paragraph, but hopefully you agree
 that raidz2 issues many more writes (based on vdev stripe width) to
 the underlying disks than a simple non-redundant load-shared pool
 does.  Depending on your system, this might not be an issue, but it is
 possible that there is an I/O threshold beyond which something
 (probably hardware) causes a performance issue.

 Bob

Interestingly enough, when I went to copy the data back I got even worse
download speeds than I did write speeds! It looks like I need some sort
of read-ahead, as unlike the writes it doesn't appear to be CPU bound:
using mbuffer/tar gives me full gigabit speeds. You can see in my graph
here:

http://uverse.houkouonchi.jp/stats/netusage/1.1.1.3_2.html

The weekly graph is when I was sending to the ZFS server, and the
daily graph shows it coming back. I stopped it and shut down the
computer for a while, which is the low-speed flat line, and then started
it up again, this time using mbuffer, and speeds are great. I don't see
why I am having trouble getting full speeds when doing reads unless it
needs to read ahead more than it is.

I decided to go ahead and use tar + mbuffer for the first pass and then
run rsync afterwards for the final sync, just to make sure nothing was missed.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Deduplication and ISO files

2010-06-04 Thread Ray Van Dolson
On Fri, Jun 04, 2010 at 11:16:40AM -0700, Brandon High wrote:
 On Fri, Jun 4, 2010 at 9:30 AM, Ray Van Dolson rvandol...@esri.com wrote:
  The ISO's I'm testing with are the 32-bit and 64-bit versions of the
  RHEL5 DVD ISO's.  While both have their differences, they do contain a
  lot of similar data as well.
 
 Similar != identical.
 
 Dedup works on blocks in zfs, so unless the iso files have identical
 data aligned at 128k boundaries you won't see any savings.
 
  If I explode both ISO files and copy them to my ZFS filesystem I see
  about a 1.24x dedup ratio.
 
 Each file starts a new block, so the identical files can be deduped.
 
 -B

Makes sense.  So, as someone else suggested, decreasing my block size
may improve the deduplication ratio.

recordsize I presume is the value to tweak?

Thanks,
Ray
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Deduplication and ISO files

2010-06-04 Thread Roy Sigurd Karlsbakk
 Makes sense.  So, as someone else suggested, decreasing my block size
 may improve the deduplication ratio.
 
 recordsize I presume is the value to tweak?

It is, but keep in mind that zfs will need about 150 bytes for each block. 1TB 
with 128k blocks will need about 1GB of memory for the index to stay in RAM; with 
64k blocks, double that, et cetera...
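
As a rough back-of-the-envelope check, using the 150 bytes per block assumed above:

# 1 TiB of unique data divided into 128 KiB blocks, 150 bytes of index per block
echo $(( (1024 * 1024 * 1024 * 1024 / (128 * 1024)) * 150 ))
# prints 1258291200, i.e. roughly 1.2 GB; halving the block size doubles it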

l2arc will help a lot if memory is low

Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
r...@karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum be presented intelligibly. It is 
an elementary imperative for all pedagogues to avoid excessive use of idioms of 
foreign origin. In most cases adequate and relevant synonyms exist in 
Norwegian.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Deduplication and ISO files

2010-06-04 Thread Nicolas Williams
On Fri, Jun 04, 2010 at 12:37:01PM -0700, Ray Van Dolson wrote:
 On Fri, Jun 04, 2010 at 11:16:40AM -0700, Brandon High wrote:
  On Fri, Jun 4, 2010 at 9:30 AM, Ray Van Dolson rvandol...@esri.com wrote:
   The ISO's I'm testing with are the 32-bit and 64-bit versions of the
   RHEL5 DVD ISO's.  While both have their differences, they do contain a
   lot of similar data as well.
  
  Similar != identical.
  
  Dedup works on blocks in zfs, so unless the iso files have identical
  data aligned at 128k boundaries you won't see any savings.
  
   If I explode both ISO files and copy them to my ZFS filesystem I see
   about a 1.24x dedup ratio.
  
  Each file starts a new block, so the identical files can be deduped.
  
  -B
 
 Makes sense.  So, as someone else suggested, decreasing my block size
 may improve the deduplication ratio.
 
 recordsize I presume is the value to tweak?

Yes, but I'd not expect that much commonality between 32-bit and 64-bit
Linux ISOs...

Do the same check again with the ISOs exploded, as you say.

Nico
-- 
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Deduplication and ISO files

2010-06-04 Thread Brandon High
On Fri, Jun 4, 2010 at 12:37 PM, Ray Van Dolson rvandol...@esri.com wrote:
 Makes sense.  So, as someone else suggested, decreasing my block size
 may improve the deduplication ratio.

It might. It might make your performance tank, too.

Decreasing the block size increases the size of the dedup table (DDT).
Every entry in the DDT uses somewhere around 250-270 bytes. If the DDT
gets too large to fit in memory, it will have to be read from disk,
which will destroy any sort of write performance (although an L2ARC on an
SSD can help).

If you move to 64k blocks, you'll double the DDT size and may not
actually increase your ratio. Moving to 8k blocks will increase your
DDT by a factor of 16, and still may not help.

Changing the recordsize will not affect files that are already in the
dataset. You'll have to recopy them to re-write with the smaller block
size.
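
A minimal sketch of that recopy (dataset and file names are hypothetical):

# smaller records only apply to data written after the change
zfs set recordsize=64k tank/isos

# rewrite an existing file so it picks up the new record size
cp /tank/isos/rhel5-x86_64.iso /tank/isos/rhel5-x86_64.iso.tmp
mv /tank/isos/rhel5-x86_64.iso.tmp /tank/isos/rhel5-x86_64.iso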

-B

-- 
Brandon High : bh...@freaks.com
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Deduplication and ISO files

2010-06-04 Thread Ray Van Dolson
On Fri, Jun 04, 2010 at 01:03:32PM -0700, Brandon High wrote:
 On Fri, Jun 4, 2010 at 12:37 PM, Ray Van Dolson rvandol...@esri.com wrote:
  Makes sense.  So, as someone else suggested, decreasing my block size
  may improve the deduplication ratio.
 
 It might. It might make your performance tank, too.
 
 Decreasing the block size increases the size of the dedup table (DDT).
 Every entry in the DDT uses somewhere around 250-270 bytes. If the DDT
 gets too large to fit in memory, it will have to be read from disk,
 which will destroy any sort of write performance (although a L2ARC on
 SSD can help)
 
 If you move to 64k blocks, you'll double the DDT size and may not
 actually increase your ratio. Moving to 8k blocks will increase your
 DDT by a factor of 16, and still may not help.
 
 Changing the recordsize will not affect files that are already in the
 dataset. You'll have to recopy them to re-write with the smaller block
 size.
 
 -B

Gotcha.  Just trying to make sure I understand how all this works, and
if I _would_ in fact see an improvement in dedupe-ratio by tweaking the
recordsize with our data-set.

Once we know that we can decide if it's worth the extra costs in
RAM/L2ARC.

Thanks all.

Ray
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] slog / log recovery is here!

2010-06-04 Thread Victor Latushkin

On Jun 4, 2010, at 5:01 PM, Sigbjørn Lie wrote:

 
 R. Eulenberg wrote:
 Sorry for reviving this old thread.
 
 I even have this problem on my (productive) backup server. I lost my 
 system-hdd and my separate ZIL-device while the system crashs and now I'm in 
 trouble. The old system was running under the least version of osol/dev 
 (snv_134) with zfs v22. After the server crashs I was very optimistic of 
 solving the problems the same day. It's a long time ago.
 I was setting up a new systen (osol 2009.06 and updating to the lastest 
 version of osol/dev - snv_134 - with deduplication) and then I tried to 
 import my backup zpool, but it does not work.
 
 # zpool import
  pool: tank1
id: 5048704328421749681
 state: UNAVAIL
 status: The pool was last accessed by another system.
 action: The pool cannot be imported due to damaged devices or data.
   see: http://www.sun.com/msg/ZFS-8000-EY
 config:
 
tank1UNAVAIL  missing device
  raidz2-0   ONLINE
c7t5d0   ONLINE
c7t0d0   ONLINE
c7t6d0   ONLINE
c7t3d0   ONLINE
c7t1d0   ONLINE
c7t4d0   ONLINE
c7t2d0   ONLINE
 
 # zpool import -f tank1
 cannot import 'tank1': one or more devices is currently unavailable
Destroy and re-create the pool from
a backup source
 
 Any other option (-F, -X, -V, -D) and any combination of them doesn't helps 
 too.
 I can not add / attach / detach / remove a vdev and the ZIL-device either, 
 because the system tells me: there is no zpool 'tank1'.
 In the last ten days I read a lot of threads, guides to solve problems and 
 best practice documentations with ZFS and so on, but I do not found a 
 solution for my problem. I created a fake-zpool with separate ZIL-device to 
 combine the new ZIL-file with my old zpool for importing them, but it 
 doesn't work in course of the different GUID and checksum (the name I was 
 modifiing by an binary editor).
 The output of:
 e...@opensolaris:~# zdb -e tank1
 
 Configuration for import:
vdev_children: 2
version: 22
pool_guid: 5048704328421749681
name: 'tank1'
state: 0
hostid: 946038
hostname: 'opensolaris'
vdev_tree:
type: 'root'
id: 0
guid: 5048704328421749681
children[0]:
type: 'raidz'
id: 0
guid: 16723866123388081610
nparity: 2
metaslab_array: 23
metaslab_shift: 30
ashift: 9
asize: 7001340903424
is_log: 0
create_txg: 4
children[0]:
type: 'disk'
id: 0
guid: 6858138566678362598
phys_path: 
 '/p...@0,0/pci8086,2...@1e/pci11ab,1...@9/d...@0,0:a'
whole_disk: 1
DTL: 4345
create_txg: 4
path: '/dev/dsk/c7t5d0s0'
devid: 
 'id1,s...@sata_samsung_hd103uj___s13pj1bq709050/a'
children[1]:
type: 'disk'
id: 1
guid: 16136237447458434520
phys_path: 
 '/p...@0,0/pci8086,2...@1e/pci11ab,1...@9/d...@1,0:a'
whole_disk: 1
DTL: 4344
create_txg: 4
path: '/dev/dsk/c7t0d0s0'
devid: 
 'id1,s...@sata_samsung_hd103uj___s13pjdwq317311/a'
children[2]:
type: 'disk'
id: 2
guid: 10876853602231471126
phys_path: 
 '/p...@0,0/pci8086,2...@1e/pci11ab,1...@9/d...@2,0:a'
whole_disk: 1
DTL: 4343
create_txg: 4
path: '/dev/dsk/c7t6d0s0'
devid: 
 'id1,s...@sata_hitachi_hdt72101__stf604mh14s56w/a'
children[3]:
type: 'disk'
id: 3
guid: 2384677379114262201
phys_path: 
 '/p...@0,0/pci8086,2...@1e/pci11ab,1...@9/d...@3,0:a'
whole_disk: 1
DTL: 4342
create_txg: 4
path: '/dev/dsk/c7t3d0s0'
devid: 
 'id1,s...@sata_samsung_hd103uj___s13pj1nq811135/a'
children[4]:
type: 'disk'
id: 4
guid: 15143849195434333247
phys_path: 
 '/p...@0,0/pci8086,2...@1e/pci11ab,1...@9/d...@4,0:a'
whole_disk: 1
DTL: 4341
create_txg: 4
path: '/dev/dsk/c7t1d0s0'
devid: 
 'id1,s...@sata_hitachi_hdt72101__stf604mh16v73w/a'
children[5]:
type: 

Re: [zfs-discuss] ZFS recovery tools

2010-06-04 Thread Victor Latushkin

On Jun 4, 2010, at 10:18 PM, Miles Nordin wrote:

 sl == Sigbjørn Lie sigbj...@nixtra.com writes:
 
sl Excellent! I wish I would have known about these features when
sl I was attempting to recover my pool using 2009.06/snv111.
 
 the OP tried the -F feature.  It doesn't work after you've lost zpool.cache:

Starting from build 128, option -F is a documented option for 'zpool import' and 
'zpool clear', and it has nothing to do with zpool.cache. The old -F has been 
renamed to -V.

In some cases it may be possible to extract configuration details from the 
in-pool copy of configuration by running 

zdb -eC poolname

regards
victor

 
op I was setting up a new systen (osol 2009.06 and updating to
op the lastest version of osol/dev - snv_134 - with
op deduplication) and then I tried to import my backup zpool, but
op it does not work.  
 
op # zpool import -f tank1 
op cannot import 'tank1': one or more devices is currently unavailable 
op Destroy and re-create the pool from a backup source
 
op Any other option (-F, -X, -V, -D) and any combination of them
op doesn't helps too.
 
 I have been in here repeatedly warning about this incompleteness of
 the feature while fanbois keep saying ``we have slog recovery so don't
 worry.''
 
 R., please let us know if the 'zdb -e -bcsvL zpool-name' incantation
 Sigbjorn suggested ends up working for you or not.
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Deduplication and ISO files

2010-06-04 Thread Victor Latushkin

On 05.06.10 00:10, Ray Van Dolson wrote:

On Fri, Jun 04, 2010 at 01:03:32PM -0700, Brandon High wrote:

On Fri, Jun 4, 2010 at 12:37 PM, Ray Van Dolson rvandol...@esri.com wrote:

Makes sense.  So, as someone else suggested, decreasing my block size
may improve the deduplication ratio.

It might. It might make your performance tank, too.

Decreasing the block size increases the size of the dedup table (DDT).
Every entry in the DDT uses somewhere around 250-270 bytes. If the DDT
gets too large to fit in memory, it will have to be read from disk,
which will destroy any sort of write performance (although a L2ARC on
SSD can help)

If you move to 64k blocks, you'll double the DDT size and may not
actually increase your ratio. Moving to 8k blocks will increase your
DDT by a factor of 16, and still may not help.

Changing the recordsize will not affect files that are already in the
dataset. You'll have to recopy them to re-write with the smaller block
size.

-B


Gotcha.  Just trying to make sure I understand how all this works, and
if I _would_ in fact see an improvement in dedupe-ratio by tweaking the
recordsize with our data-set.



You can use zdb -S to assess how effective deduplication can be without actually 
turning it on for your pool.
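
For example (pool name tank assumed):

# simulate dedup on the existing data and print a block histogram plus
# an estimated dedup ratio, without modifying the pool
zdb -S tank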


regards
victor
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Migrating to ZFS

2010-06-04 Thread Frank Cusack

On 6/4/10 11:46 AM -0700 Brandon High wrote:

Be aware that Solaris on x86 has two types of partitions. There are
fdisk partitions (c0t0d0p1, etc) which is what gparted, windows and
other tools will see. There are also Solaris partitions or slices
(c0t0d0s0). You can create or edit these with the 'format' command in
Solaris. These are created in an fdisk partition that is the SOLARIS2
type. So yeah, it's a partition table inside a partition table.


That's not correct, at least not technically.  Solaris *slices* within
the Solaris fdisk partition are not also known as partitions.  They
are simply known as slices.  By calling them Solaris partitions or
slices you are just adding confusion.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ssd pool + ssd cache ?

2010-06-04 Thread David Magda

On Jun 4, 2010, at 14:28, zfsnoob4 wrote:


Does anyone know if opensolaris supports Trim?


Not at this time.

Are you referring to a read cache or a write cache?
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] slog / log recovery is here!

2010-06-04 Thread Sigbjørn Lie

Victor Latushkin wrote:

On Jun 4, 2010, at 5:01 PM, Sigbjørn Lie wrote:

  

R. Eulenberg wrote:


Sorry for reviving this old thread.

I even have this problem on my (productive) backup server. I lost my system-hdd 
and my separate ZIL-device while the system crashs and now I'm in trouble. The 
old system was running under the least version of osol/dev (snv_134) with zfs 
v22. After the server crashs I was very optimistic of solving the problems the 
same day. It's a long time ago.
I was setting up a new systen (osol 2009.06 and updating to the lastest version 
of osol/dev - snv_134 - with deduplication) and then I tried to import my 
backup zpool, but it does not work.

# zpool import
 pool: tank1
   id: 5048704328421749681
state: UNAVAIL
status: The pool was last accessed by another system.
action: The pool cannot be imported due to damaged devices or data.
  see: http://www.sun.com/msg/ZFS-8000-EY
config:

   tank1UNAVAIL  missing device
 raidz2-0   ONLINE
   c7t5d0   ONLINE
   c7t0d0   ONLINE
   c7t6d0   ONLINE
   c7t3d0   ONLINE
   c7t1d0   ONLINE
   c7t4d0   ONLINE
   c7t2d0   ONLINE

# zpool import -f tank1
cannot import 'tank1': one or more devices is currently unavailable
   Destroy and re-create the pool from
   a backup source

Any other option (-F, -X, -V, -D) and any combination of them doesn't helps too.
I can not add / attach / detach / remove a vdev and the ZIL-device either, 
because the system tells me: there is no zpool 'tank1'.
In the last ten days I read a lot of threads, guides to solve problems and best 
practice documentations with ZFS and so on, but I do not found a solution for 
my problem. I created a fake-zpool with separate ZIL-device to combine the new 
ZIL-file with my old zpool for importing them, but it doesn't work in course of 
the different GUID and checksum (the name I was modifiing by an binary editor).
The output of:
e...@opensolaris:~# zdb -e tank1

Configuration for import:
   vdev_children: 2
   version: 22
   pool_guid: 5048704328421749681
   name: 'tank1'
   state: 0
   hostid: 946038
   hostname: 'opensolaris'
   vdev_tree:
   type: 'root'
   id: 0
   guid: 5048704328421749681
   children[0]:
   type: 'raidz'
   id: 0
   guid: 16723866123388081610
   nparity: 2
   metaslab_array: 23
   metaslab_shift: 30
   ashift: 9
   asize: 7001340903424
   is_log: 0
   create_txg: 4
   children[0]:
   type: 'disk'
   id: 0
   guid: 6858138566678362598
   phys_path: 
'/p...@0,0/pci8086,2...@1e/pci11ab,1...@9/d...@0,0:a'
   whole_disk: 1
   DTL: 4345
   create_txg: 4
   path: '/dev/dsk/c7t5d0s0'
   devid: 
'id1,s...@sata_samsung_hd103uj___s13pj1bq709050/a'
   children[1]:
   type: 'disk'
   id: 1
   guid: 16136237447458434520
   phys_path: 
'/p...@0,0/pci8086,2...@1e/pci11ab,1...@9/d...@1,0:a'
   whole_disk: 1
   DTL: 4344
   create_txg: 4
   path: '/dev/dsk/c7t0d0s0'
   devid: 
'id1,s...@sata_samsung_hd103uj___s13pjdwq317311/a'
   children[2]:
   type: 'disk'
   id: 2
   guid: 10876853602231471126
   phys_path: 
'/p...@0,0/pci8086,2...@1e/pci11ab,1...@9/d...@2,0:a'
   whole_disk: 1
   DTL: 4343
   create_txg: 4
   path: '/dev/dsk/c7t6d0s0'
   devid: 
'id1,s...@sata_hitachi_hdt72101__stf604mh14s56w/a'
   children[3]:
   type: 'disk'
   id: 3
   guid: 2384677379114262201
   phys_path: 
'/p...@0,0/pci8086,2...@1e/pci11ab,1...@9/d...@3,0:a'
   whole_disk: 1
   DTL: 4342
   create_txg: 4
   path: '/dev/dsk/c7t3d0s0'
   devid: 
'id1,s...@sata_samsung_hd103uj___s13pj1nq811135/a'
   children[4]:
   type: 'disk'
   id: 4
   guid: 15143849195434333247
   phys_path: 
'/p...@0,0/pci8086,2...@1e/pci11ab,1...@9/d...@4,0:a'
   whole_disk: 1
   DTL: 4341
   create_txg: 4
   path: '/dev/dsk/c7t1d0s0'
   devid: 
'id1,s...@sata_hitachi_hdt72101__stf604mh16v73w/a'
   children[5]:
   type: 'disk'
   id: 5
   guid: 11627603446133164653
   

Re: [zfs-discuss] Migrating to ZFS

2010-06-04 Thread Cindy Swearingen

Frank,

The format utility is not technically correct because it refers to
slices as partitions. Check the output below.

We might describe that the partition menu is used to partition the
disk into slices, but all of format refers to partitions, not slices.

I agree with Brandon's explanation, but no amount of explanation
resolves the confusion for those unfamiliar with how we use the
same term to describe different disk components.

Cindy

format> p


PARTITION MENU:
0  - change `0' partition
1  - change `1' partition
2  - change `2' partition
3  - change `3' partition
4  - change `4' partition
5  - change `5' partition
6  - change `6' partition
expand - expand label to use whole disk
select - select a predefined table
modify - modify a predefined partition table
name   - name the current table
print  - display the current table
label  - write partition map and label to the disk
!<cmd> - execute <cmd>, then return
quit
partition> p
Current partition table (original):
Total disk sectors available: 286722878 + 16384 (reserved sectors)

Part      Tag    Flag     First Sector        Size        Last Sector
  0        usr    wm              256     136.72GB         286722911
  1 unassigned    wm                0            0                 0
  2 unassigned    wm                0            0                 0
  3 unassigned    wm                0            0                 0
  4 unassigned    wm                0            0                 0
  5 unassigned    wm                0            0                 0
  6 unassigned    wm                0            0                 0
  8   reserved    wm        286722912       8.00MB         286739295

partition>




On 06/04/10 15:43, Frank Cusack wrote:

On 6/4/10 11:46 AM -0700 Brandon High wrote:

Be aware that Solaris on x86 has two types of partitions. There are
fdisk partitions (c0t0d0p1, etc) which is what gparted, windows and
other tools will see. There are also Solaris partitions or slices
(c0t0d0s0). You can create or edit these with the 'format' command in
Solaris. These are created in an fdisk partition that is the SOLARIS2
type. So yeah, it's a partition table inside a partition table.


That's not correct, at least not technically.  Solaris *slices* within
the Solaris fdisk partition, are not also known as partitions.  They
are simply known as slices.  By calling them Solaris partitions or
slices you are just adding confusion.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ssd pool + ssd cache ?

2010-06-04 Thread Brandon High
On Fri, Jun 4, 2010 at 11:28 AM, zfsnoob4 zfsnoob...@hotmail.co.uk wrote:
 Does anyone know if opensolaris supports Trim?

It does not. However, it doesn't really matter for a cache device.

The cache device is written to rather slowly, and only needs to have
low latency access on reads.

Most current-gen SSDs such as the Intel X25-M, Indilinx Barefoot, etc.
also support garbage collection, which reduces the need for TRIM. It's
important that you align blocks on a 4k or 8k boundary though. (OCZ
recommends 8k for the Vertex drives.) I think that most current drives
have between a 128k and 512k erase block size, which is another
alignment point you can use.

-B

-- 
Brandon High : bh...@freaks.com
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ssd pool + ssd cache ?

2010-06-04 Thread Brandon High
On Fri, Jun 4, 2010 at 2:59 PM, David Magda dma...@ee.ryerson.ca wrote:
 Are you referring to a read cache or a write cache?

A cache vdev is an L2ARC, used for reads.
A log vdev is a slog/zil, used for writes.

Oh, how we overload our terms.

-B

-- 
Brandon High : bh...@freaks.com
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Small stalls slowing down rsync from holding network saturation every 5 seconds

2010-06-04 Thread Bob Friesenhahn

On Fri, 4 Jun 2010, Sandon Van Ness wrote:


Interesting enough when I went to copy the data back I got even worse
download speeds than I did write speeds! It looks like i need some sort
of read-ahead as unlike the writes it doesn't appear to be CPU bound as
using mbuffer/tar gives me full gigabit speeds. You can see in my graph
here:

http://uverse.houkouonchi.jp/stats/netusage/1.1.1.3_2.html


I am still not sure what you are doing; however, it should not be 
surprising that gigabit ethernet is limited to one gigabit of traffic 
(1000 Mb/s) in either direction.  Theoretically you should be able to 
get a gigabit of traffic in both directions at once, but this depends 
on the quality of your ethernet switch, ethernet adaptor card, device 
driver, and capabilities of where the data is read and written to.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Small stalls slowing down rsync from holding network saturation every 5 seconds

2010-06-04 Thread Sandon Van Ness
On 06/04/2010 06:15 PM, Bob Friesenhahn wrote:
 On Fri, 4 Jun 2010, Sandon Van Ness wrote:

 Interesting enough when I went to copy the data back I got even worse
 download speeds than I did write speeds! It looks like i need some sort
 of read-ahead as unlike the writes it doesn't appear to be CPU bound as
 using mbuffer/tar gives me full gigabit speeds. You can see in my graph
 here:

 http://uverse.houkouonchi.jp/stats/netusage/1.1.1.3_2.html

 I am still not sure what you are doing, however, it should not
 surprise that gigabit ethernet is limited to one gigabit of traffic
 (1000 Mb/s) in either direction.  Theoretically you should be able to
 get a gigabit of traffic in both directions at once, but this depends
 on the quality of your ethernet switch, ethernet adaptor card, device
 driver, and capabilities of where the data is read and written to.

 Bob

The problem is that just using rsync I am not getting gigabit. For me
gigabit maxes out at around 930-940 megabits. When I use rsync alone I
was only getting around 720 megabits incoming. This is only when it's
reading from the block device. When reading from memory (i.e. cat a
few big files on the server to have them cached) it gets ~935 megabits.
The machine is easily able to sustain that read speed (and write speed), but
the problem is getting it to actually do it.

The only way I was able to get full gigabit (935 megabits) was using tar and
mbuffer, due to it acting as a read-ahead buffer. Is there any way to turn
the prefetch up? There really is no reason I should only be getting
720 megabits when copying files off with rsync (or NFS) like I am seeing.
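
For reference, the tar + mbuffer pipeline I ended up using looks roughly like 
this (host name, port and buffer size are just examples):

# receiving side: buffer 1 GB in RAM, then unpack
cd /destination && mbuffer -I 9090 -m 1G | tar xf -

# sending side: tar reads sequentially, mbuffer keeps the pipe full
tar cf - /tank/data | mbuffer -m 1G -O receiver:9090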
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss