[zfs-discuss] zpool list vs zfs list, size differs...

2009-02-07 Thread Johan Andersson
Hi,

New to OpenSolaris and ZFS...
Wondering about a size difference I see on my newly installed
OpenSolaris system, a homebuilt AMD Phenom box with SATA3 disks...

[code]
jo...@krynn:~$ zpool list
NAME    SIZE   USED  AVAIL  CAP  HEALTH  ALTROOT
rpool   696G  7.67G   688G   1%  ONLINE  -
zpool  2.72T   135K  2.72T   0%  ONLINE  -

jo...@krynn:~$ zfs list
NAME                      USED  AVAIL  REFER  MOUNTPOINT
rpool                    11.5G   674G    72K  /rpool
rpool/ROOT               3.78G   674G    18K  legacy
rpool/ROOT/opensolaris   3.78G   674G  3.65G  /
rpool/dump               3.87G   674G  3.87G  -
rpool/export             18.7M   674G    19K  /export
rpool/export/home        18.7M   674G    50K  /export/home
rpool/export/home/admin  18.6M   674G  18.6M  /export/home/admin
rpool/swap               3.87G   677G    16K  -
zpool                    94.3K  2.00T  26.9K  /zpool

jo...@krynn:~$ zpool status
  pool: rpool
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c3d0s0  ONLINE       0     0     0
            c4d0s0  ONLINE       0     0     0

errors: No known data errors

  pool: zpool
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        zpool       ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c3d1    ONLINE       0     0     0
            c4d1    ONLINE       0     0     0
            c6d0    ONLINE       0     0     0
            c6d1    ONLINE       0     0     0

errors: No known data errors
[/code]

The disks are all 750GB SATA3 disks. Why does zpool list show the raidz
pool (zpool) as 2.72TB while zfs list shows the /zpool filesystem as only 2.00TB?
Is this a limit of my server in some way, or something I can tune?

/Johan A
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool list vs zfs list, size differs...

2009-02-07 Thread Johan Andersson

Tomas Ögren wrote:

On 07 February, 2009 - Johan Andersson sent me these 1,5K bytes:

  

Hi,

New to OpenSolaris and ZFS...
Wondering about a size difference I see on my newly installed
OpenSolaris system, a homebuilt AMD Phenom box with SATA3 disks...


[code]
jo...@krynn:~$ zpool list
NAME    SIZE   USED  AVAIL  CAP  HEALTH  ALTROOT
rpool   696G  7.67G   688G   1%  ONLINE  -
zpool  2.72T   135K  2.72T   0%  ONLINE  -



The pool has disks that can hold ...
4 * 750*10^9 / 1024^4 =~ 2.72TB

  

jo...@krynn:~$ zfs list
NAME                      USED  AVAIL  REFER  MOUNTPOINT
rpool                    11.5G   674G    72K  /rpool
rpool/ROOT               3.78G   674G    18K  legacy
rpool/ROOT/opensolaris   3.78G   674G  3.65G  /
rpool/dump               3.87G   674G  3.87G  -
rpool/export             18.7M   674G    19K  /export
rpool/export/home        18.7M   674G    50K  /export/home
rpool/export/home/admin  18.6M   674G  18.6M  /export/home/admin
rpool/swap               3.87G   677G    16K  -
zpool                    94.3K  2.00T  26.9K  /zpool



In that pool, due to raidz, you can store about ...
3 * 750*10^9 / 1024^4 =~ 2TB
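
Spelled out, a quick back-of-the-envelope check (just a sketch; it assumes
each disk holds exactly 750 * 10^9 bytes and ignores ZFS label/metadata overhead):

[code]
$ echo "scale=2; 4 * 750*10^9 / 1024^4" | bc -l   # raw size, as reported by zpool list
2.72
$ echo "scale=2; 3 * 750*10^9 / 1024^4" | bc -l   # usable space after one disk of raidz1 parity
2.04
[/code]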

  
The disks are all 750GB SATA3 disks. Why does zpool list show the raidz
pool (zpool) as 2.72TB while zfs list shows the /zpool filesystem as only 2.00TB?

Is this a limit of my server in some way, or something I can tune?



Space worth about 1x750GB is lost to parity with raidz..

/Tomas
  

Thanks,
I didn't realize that the pool size included the parity data...
I should have thought of it... if I had bothered to calculate it.

*duh*


/Johan A
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Should I report this as a bug?

2009-02-07 Thread Kees Nuyt
On Wed, 04 Feb 2009 11:12:55 -0500, Kyle McDonald
kmcdon...@egenera.com wrote:

I jumpstarted my machine with sNV b106, and installed with ZFS root/boot.
It left me at a shell prompt in the JumpStart environment, with my ZFS 
root on /a.

I wanted to try out some things that I planned on scripting for the
JumpStart to run; one of these was creating a new ZFS pool from the
remaining disks. I looked at the zpool create manpage and saw that it
had a -R altroot option, and the exact same thing had just worked for
me with 'dladm aggr-create', so I thought I'd give that a try.

If the machine had been booted normally, my ZFS root would have been /, 
and a 'zpool create zdata0 ...' would have defaulted to mounting the new 
pool as /zdata0 right next to my ZFS root pool /zroot0. So I expected 
'zpool create -R /a zdata0 ...' to set the default mountpoint for the 
pool to /zdata0 with a temporary altroot=/a.

I gave it a try, and while it created the pool it failed to mount it at 
all. It reported that /a wasn't empty.

'zpool list', and 'zpool get all' show the altroot=/a. But 'zfs  get 
all  zdata0' shows the mountpoint=/a also, not the default of /zdata0.

Am I expecting the wrong thing here? or is this a bug?

My guess is that /a is occupied by the mount of the just-installed
root pool.

You'll have to create a new mountpoint, something like /b,
and have your zdata0 pool mount there temporarily.
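
One way to sketch it (device names below are placeholders, not from the
original setup): give the pool an explicit default mountpoint with -m so it
doesn't collide with whatever is already mounted under the altroot.

[code]
# altroot stays /a for the JumpStart session; /zdata0 is recorded as the
# persistent mountpoint (placeholder devices)
zpool create -R /a -m /zdata0 zdata0 c1t2d0 c1t3d0

# or, for a pool that already exists, point its top-level dataset elsewhere:
zfs set mountpoint=/zdata0 zdata0
[/code]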

-- 
  (  Kees Nuyt
  )
c[_]
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Solved - a big THANKS to Victor Latushkin @ Sun / Moscow

2009-02-07 Thread Gino
 FYI, I'm working on a workaround for broken devices.  As you note,
 some disks flat-out lie: you issue the synchronize-cache command,
 they say "got it, boss", yet the data is still not on stable storage.
 Why do they do this?  Because it performs better.  Well, duh --
 you can make stuff *really* fast if it doesn't have to be correct.

 The uberblock ring buffer in ZFS gives us a way to cope with this,
 as long as we don't reuse freed blocks for a few transaction groups.
 The basic idea: if we can't read the pool starting from the most
 recent uberblock, then we should be able to use the one before it,
 or the one before that, etc., as long as we haven't yet reused any
 blocks that were freed in those earlier txgs.  This allows us to
 use the normal load on the pool, plus the passage of time, as a
 displacement flush for disk caches that ignore the sync command.

 If we go back far enough in (txg) time, we will eventually find an
 uberblock all of whose dependent data blocks have made it to disk.
 I'll run tests with known-broken disks to determine how far back we
 need to go in practice -- I'll bet one txg is almost always enough.

 Jeff

Hi Jeff,
we just lost 2 pools on snv91.
Any news about your workaround to recover pools by discarding the last txg?

thanks
gino
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Alternatives to increasing the number of copies on a ZFS snapshot

2009-02-07 Thread Sriram Narayanan
How do I set the number of copies on a snapshot ? Based on the error
message, I believe that I cannot do so.
I already have a number of clones based on this snapshot, and would
like the snapshot to have more copies now.
For higher redundancy and peace of mind, what alternatives do I have ?

-- Sriram
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] A question on non-consecutive disk failures

2009-02-07 Thread Sriram Narayanan
From the presentation "ZFS - The last word in filesystems", page 22:
"In a multi-disk pool, ZFS survives any non-consecutive disk failures"

Questions:
If I have a 3 disk RAIDZ with disks A, B and C, then:
- if disk B fails, then will I be able to continue to read data if
disks A and C are still available?
If I have a 4 disk RAIDZ with disks A, B, C, and D, then:
- if disks A and B fail, then I won't be able to read from the mirror
any more. Is this understanding correct?
- if disks A and C fail, then I will be able to read from disks B
and D. Is this understanding correct?

-- Sriram
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Nested ZFS file systems are not visible over an NFS export

2009-02-07 Thread Sriram Narayanan
An update:

I'm using VMWare ESX 3.5 and VMWare ESXi 3.5 as the NFS clients.

I'm using 'zfs set sharenfs=on datapool/vmwarenfs' to make that zfs
filesystem accessible over NFS.

-- Sriram
On Sun, Feb 8, 2009 at 12:07 AM, Sriram Narayanan sriram...@gmail.com wrote:
 Hello:

 I have the following zfs structure
 datapool/vmwarenfs - which is available over NFS

 I have some ZFS filesystems as follows:
 datapool/vmwarenfs/basicVMImage
 datapool/vmwarenfs/basicvmim...@snapshot
 datapool/vmwarenfs/VMImage01- zfs cloned from basicvmim...@snapshot
 datapool/vmwarenfs/VMImage02- zfs cloned from basicvmim...@snapshot

 These are accessible via NFS as /datapool/vmwarenfs
with the subfolders VMImage01 and VMImage02

 What's happening right now:
 a. When I connect to datapool/vmwarenfs over NFS,
 - the contents of /datapool/vmwarenfs are visible and usable
 - VMImage01 and VMImage02 appear as empty sub-folders at the
 paths /datapool/vmwarenfs/VMImage01 and /datapool/vmwarenfs/VMImage02,
 but their contents are not visible.

 b. When I explicitly share VMImage01 and VMImage02 via NFS, then
 /datapool/vmwarenfs/VMImage01 - usable as a separate NFS share
 /datapool/vmwarenfs/VMImage02 - usable as a separate NFS share

 What I'd like to have:
 - attach over NFS to /datapool/vmwarenfs
 - view the ZFS filesystems VMImage01 and VMImage02 as sub folders
 under /datapool/vmwarenfs

 If needed, I can move VMImage01 and VMImage02 from datapool/vmwarenfs,
 and even re-create them elsewhere.

 -- Sriram

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Alternatives to increasing the number of copies on a ZFS snapshot

2009-02-07 Thread Mattias Pantzare
On Sat, Feb 7, 2009 at 19:33, Sriram Narayanan sri...@belenix.org wrote:
 How do I set the number of copies on a snapshot ? Based on the error
 message, I believe that I cannot do so.
I already have a number of clones based on this snapshot, and would
 like the snapshot to have more copies now.
For higher redundancy and peace of mind, what alternatives do I have ?

You have to set the number of copies before you write the file.
Snapshots don't write anything, so you can't change that on a snapshot.

Your best option (and really the only one, if you value your data) is mirroring (zpool attach).
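
A sketch of both options (pool, dataset and device names here are placeholders):

[code]
# copies only affects blocks written after the property is set;
# existing data and snapshots keep whatever they were written with.
zfs set copies=2 datapool/important

# Real redundancy: attach a second device to an existing single-disk vdev,
# which turns it into a mirror and resilvers automatically.
zpool attach datapool c1t0d0 c1t1d0
[/code]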
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] A question on non-consecutive disk failures

2009-02-07 Thread Frank Cusack
On February 8, 2009 12:04:22 AM +0530 Sriram Narayanan sri...@belenix.org 
wrote:
 From the presentation ZFS - The last word in filesystems, Page 22
   In a multi-disk pool, ZFS survives any non-consecutive disk failures

 Questions:
 If I have a 3 disk RAIDZ with disks A, B and C, then:
 - if disk b fails, then will I be able to continue to read data if
 disks A and C are still available ?

yes.  raidz allows for 1 disk failure.

 If I have a 4 disk RAIDZ with disks A, B, C, and D, then:
 - if disks a and b fail, then I won't be able to read from the mirror
 any more. Is this understanding correct ?

what mirror?  there is no mirror.  you have a raidz.  you can have 1
disk failure.

 - if disks a and c fail, then I will be able to read from disks b
 and d. Is this understanding correct ?

no.  you will lose all the data if 2 disks fail.

The part of the slides you are referring to is in reference to ditto
blocks, which allow failure of PARTS of a SINGLE disk.
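
If you actually need to survive two simultaneous whole-disk failures, that is
what raidz2 (double parity) is for; a minimal sketch with placeholder device names:

[code]
# raidz1 tolerates one failed disk per vdev; raidz2 tolerates any two.
zpool create tank raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0
[/code]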

-frank
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] A question on non-consecutive disk failures

2009-02-07 Thread Peter Tribble
On Sat, Feb 7, 2009 at 6:34 PM, Sriram Narayanan sri...@belenix.org wrote:
 From the presentation ZFS - The last word in filesystems, Page 22
In a multi-disk pool, ZFS survives any non-consecutive disk failures

 Questions:
 If I have a 3 disk RAIDZ with disks A, B and C, then:
 - if disk b fails, then will I be able to continue to read data if
 disks A and C are still available ?
 If I have a 4 disk RAIDZ with disks A, B, C, and D, then:
 - if disks a and b fail, then I won't be able to read from the mirror
 any more. Is this understanding correct ?
 - if disks a and c fail, then I will be able to read from disks b
 and d. Is this understanding correct ?

No. That quote is part of the discussion of ditto blocks.

See the following:

http://blogs.sun.com/bill/entry/ditto_blocks_the_amazing_tape

-- 
-Peter Tribble
http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] [cifs-discuss] Permissions / ACL setting for top directory of CIFS export

2009-02-07 Thread David Dyer-Bennet

On Sat, February 7, 2009 14:32, Alan.M.Wright wrote:
 Also, does this end up taking up extra metadata space compared to not
 having to have an ACL entry for each file?

 No, ZFS only stores ACLs.  It doesn't have or store a separate
 representation of the UNIX permissions bits.

So I won't worry about it.  I still worry when I see 11 ACL entries,
though, if only because I can't read through them and accurately tell what
they will do!

 If you set traditional UNIX-like permissions on a ZFS file/directory,
 ZFS sets the ACL to represent those permissions.

I've certainly seen that happen; a change made in ACL syntax can result in
the unix permission bits changing, in ways that represent the resulting
ACL permissions.

Sometimes I end up with Unix permissions of all dashes, though, when the
ACL actually allows quite a lot of access.  That's confusing.  But if it
allows the access I want, I can probably learn to stop worrying about it.

-- 
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] [cifs-discuss] Permissions / ACL setting for top directory of CIFS export

2009-02-07 Thread Afshin Salek
David Dyer-Bennet wrote:
 On Sat, February 7, 2009 14:32, Alan.M.Wright wrote:
 Also, does this end up taking up extra metadata space compared to not
 having to have an ACL entry for each file?
 No, ZFS only stores ACLs.  It doesn't have or store a separate
 representation of the UNIX permissions bits.
 
 So I won't worry about it.  I still worry when I see 11 ACL entries,
 though, if only because I can't read through them and accurately tell what
 they will do!


You can play with the aclmode and aclinherit properties of your ZFS
dataset to get different results if you want. Note, though, that these
properties don't affect the result when you're operating over CIFS:
the CIFS server always applies Windows inheritance rules when creating
new files/folders, and when you modify ACLs over CIFS you should always
end up with exactly what was sent over the wire.
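
For example (dataset name is a placeholder; the values shown are the stock ones):

[code]
# inspect the current settings for a dataset
zfs get aclmode,aclinherit tank/cifs
# e.g. let inherited ACL entries pass through unmodified on newly created files
zfs set aclinherit=passthrough tank/cifs
[/code]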

 If you set traditional UNIX-like permissions on a ZFS file/directory,
 ZFS sets the ACL to represent those permissions.
 
 I've certainly seen that happen; a change made in ACL syntax can result in
 the unix permission bits changing, in ways that represent the resulting
 ACL permissions.
 
 Sometimes I end up with Unix permissions of all dashes, though, when the
 ACL actually allows quite a lot of access.  That's confusing.  But if it
 allows the access I want, I can probably learn to stop worrying about it.
 

That's because ZFS only looks at owner@, group@ and everyone@ entries to
generate the Unix permissions, so, for example, if you don't have any owner@
entries you'll see --- for the owner part of the Unix permissions, even if
you have an entry for user joe, who is also the owner of the file.
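
A quick way to see and adjust that (the path is just a placeholder):

[code]
# show the full ACL, one entry per line
ls -v /tank/share/file.txt
# prepend an explicit owner@ entry so the owner bits get generated from it
chmod A+owner@:read_data/write_data:allow /tank/share/file.txt
[/code]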

Afshin

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Nested ZFS file systems are not visible over an NFS export

2009-02-07 Thread Richard Elling
This is not a ZFS question; it is an NFS question.
Solaris NFSv4 clients post-b77 will follow the mounts via a
mechanism called mirror mounts.  For other NFS clients, the behaviour
will be whatever their developers implemented.  Please consult the
appropriate NFS client forum for your system.

The announcement of mirror mounts for Solaris is here:
http://opensolaris.org/os/community/on/flag-days/pages/2007102201/
 -- richard
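
For what it's worth, ESX/ESXi 3.5 mount NFS over v3, so they will not follow
the child mounts; each nested filesystem ends up having to be mounted as its
own datastore. A rough sketch using the dataset names from the thread:

[code]
# children inherit sharenfs=on from the parent; verify with:
zfs get -r sharenfs datapool/vmwarenfs

# each child is still a separate filesystem, so the ESX client must mount
# /datapool/vmwarenfs/VMImage01 and /datapool/vmwarenfs/VMImage02 individually.
[/code]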


Sriram Narayanan wrote:
 An update:

 I'm using VMWare ESX 3.5 and VMWare ESXi 3.5 as the NFS clients.

 I'm using 'zfs set sharenfs=on datapool/vmwarenfs' to make that zfs
 filesystem accessible over NFS.

 -- Sriram
 On Sun, Feb 8, 2009 at 12:07 AM, Sriram Narayanan sriram...@gmail.com wrote:
   
 Hello:

 I have the following zfs structure
 datapool/vmwarenfs - which is available over NFS

 I have some ZFS filesystems as follows:
 datapool/vmwarenfs/basicVMImage
 datapool/vmwarenfs/basicvmim...@snapshot
 datapool/vmwarenfs/VMImage01- zfs cloned from basicvmim...@snapshot
 datapool/vmwarenfs/VMImage02- zfs cloned from basicvmim...@snapshot

 These are accessible via NFS as /datapool/vmwarenfs
with the subfolders VMImage01 and VMImage02

 What's happening right now:
 a. When I connect to datapool/vmwarenfs over NFS,
 - the contents of /datapool/vmwarenfs are visible and usable
 - VMImage01 and VMImage02 appear as empty sub-folders at the
 paths /datapool/vmwarenfs/VMImage01 and /datapool/vmwarenfs/VMImage02,
 but their contents are not visible.

 b. When I explicitly share VMImage01 and VMImage02 via NFS, then
 /datapool/vmwarenfs/VMImage01 - usable as a separate NFS share
 /datapool/vmwarenfs/VMImage02 - usable as a separate NFS share

 What I'd like to have:
 - attach over NFS to /datapool/vmwarenfs
 - view the ZFS filesystems VMImage01 and VMImage02 as sub folders
 under /datapool/vmwarenfs

 If needed, I can move VMImage01 and VMImage02 from datapool/vmwarenfs,
 and even re-create them elsewhere.

 -- Sriram

 
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
   

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] terabyte or terror-byte disk drives

2009-02-07 Thread Al Hopper
This is one of those "Doctor, it hurts when I ..." and the Doctor
says "then don't do that" stories.  Basically I installed a couple of
1TB WD Caviar Black drives in a Sun x2200M2 1U server in a ZFS
mirrored boot config - excellent drives, much faster than the previous
7,200 RPM drives, and the dual-controller config and 32MB cache really
work and allow a lot more IOPS.  I am delighted with the
price/performance (around $120 each at the time) compared with earlier
generations of SATA drives.  Highly recommended.

Then, as I've done many times in the past, I moved the box to get at
another system which was underneath the x2200 - with the system live
and the disks spinning.  Didn't even give it a second thought; I've done
this lots of times previously without any issues.  A week later, Bad
Things (TM) started happening with the box - and yes, you've guessed it,
one of the terror-byte drives was generating errors faster than the price
dial on your local gas pump.  A quick look at the logs revealed that,
yes, you've guessed it, the drive errors started when the box was
moved.  A zpool scrub generated thousands of errors on the damaged
drive.  Now it's offline.  Al is sad. :(
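
For anyone in the same spot, the usual recovery sequence is roughly this
(pool and device names below are placeholders, not mine):

[code]
zpool status -v mypool     # identify the failed device and any affected files
zpool replace mypool c3d1  # swap in a replacement disk and let it resilver
zpool scrub mypool         # re-verify the mirror once the resilver finishes
[/code]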

Moral of the story - you could do this (move a box with live disks) in
the days of 300, 400 and 500GB drives - but not any more with today's
high-density terror-byte drives.  And just FYI for those technocrats
who don't read disk drive specs - the WD 1TB drive has 3 platters
(roughly 333GB each); there is a 500GB and a 640GB version of the same
drive family that uses 2 platters of the *same* magnetic data density
- so be very careful with those high-density disk drives.

As I said - don't do this!

Just a heads up - it might just help someone else on the list who has
developed bad habits over the years..

Regards,

-- 
Al Hopper  Logical Approach Inc,Plano,TX a...@logical-approach.com
   Voice: 972.379.2133 Timezone: US CDT
OpenSolaris Governing Board (OGB) Member - Apr 2005 to Mar 2007
http://www.opensolaris.org/os/community/ogb/ogb_2005-2007/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss