[zfs-discuss] zpool list vs zfs list, size differs...
Hi,

New to OpenSolaris and ZFS... Wondering about a size difference I see on my newly installed OpenSolaris system, a homebuilt AMD Phenom system with SATA3 disks...

[code]
jo...@krynn:~$ zpool list
NAME    SIZE   USED  AVAIL  CAP  HEALTH  ALTROOT
rpool   696G  7.67G   688G   1%  ONLINE  -
zpool  2.72T   135K  2.72T   0%  ONLINE  -

jo...@krynn:~$ zfs list
NAME                      USED  AVAIL  REFER  MOUNTPOINT
rpool                    11.5G   674G    72K  /rpool
rpool/ROOT               3.78G   674G    18K  legacy
rpool/ROOT/opensolaris   3.78G   674G  3.65G  /
rpool/dump               3.87G   674G  3.87G  -
rpool/export             18.7M   674G    19K  /export
rpool/export/home        18.7M   674G    50K  /export/home
rpool/export/home/admin  18.6M   674G  18.6M  /export/home/admin
rpool/swap               3.87G   677G    16K  -
zpool                    94.3K  2.00T  26.9K  /zpool

jo...@krynn:~$ zpool status
  pool: rpool
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c3d0s0  ONLINE       0     0     0
            c4d0s0  ONLINE       0     0     0

errors: No known data errors

  pool: zpool
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        zpool       ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c3d1    ONLINE       0     0     0
            c4d1    ONLINE       0     0     0
            c6d0    ONLINE       0     0     0
            c6d1    ONLINE       0     0     0

errors: No known data errors
[/code]

The disks are all 750GB SATA3 disks. Why does zpool list report the raidz pool as 2.72TB while zfs list shows the /zpool filesystem as 2.00TB? Is this a limit of my server in some way, or something I can tune?

/Johan A
Re: [zfs-discuss] zpool list vs zfs list, size differs...
Tomas Ögren wrote:

> On 07 February, 2009 - Johan Andersson sent me these 1,5K bytes:
>
>> Hi,
>>
>> New to OpenSolaris and ZFS... Wondering about a size difference I see on my newly installed OpenSolaris system, a homebuilt AMD Phenom system with SATA3 disks...
>>
>> jo...@krynn:~$ zpool list
>> NAME    SIZE   USED  AVAIL  CAP  HEALTH  ALTROOT
>> rpool   696G  7.67G   688G   1%  ONLINE  -
>> zpool  2.72T   135K  2.72T   0%  ONLINE  -
>
> The pool has disks that can hold ... 4*750*10^9/1024/1024/1024/1024 =~ 2.72TB
>
>> jo...@krynn:~$ zfs list
>> NAME                      USED  AVAIL  REFER  MOUNTPOINT
>> rpool                    11.5G   674G    72K  /rpool
>> rpool/ROOT               3.78G   674G    18K  legacy
>> rpool/ROOT/opensolaris   3.78G   674G  3.65G  /
>> rpool/dump               3.87G   674G  3.87G  -
>> rpool/export             18.7M   674G    19K  /export
>> rpool/export/home        18.7M   674G    50K  /export/home
>> rpool/export/home/admin  18.6M   674G  18.6M  /export/home/admin
>> rpool/swap               3.87G   677G    16K  -
>> zpool                    94.3K  2.00T  26.9K  /zpool
>
> In that pool, due to raidz, you can store about ... 3*750*10^9/1024/1024/1024/1024 =~ 2TB
>
>> The disks are all 750GB SATA3 disks. Why does zpool list report the raidz pool as 2.72TB while zfs list shows the /zpool filesystem as 2.00TB? Is this a limit of my server in some way, or something I can tune?
>
> Space worth about 1x750GB is lost to parity with raidz..
>
> /Tomas

Thanks, I didn't realize that the pool counted in the parity data... I should have, though... if I had bothered to calc it *duh*

/Johan A
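For anyone who wants to sanity-check this on their own pool, here is the same arithmetic as a quick shell calculation (the 750 * 10^9 figure is the vendor's decimal capacity per disk; bc is only used as a calculator):

[code]
# raw pool capacity as 'zpool list' reports it (parity space included)
$ echo "scale=2; 4 * 750 * 10^9 / 1024^4" | bc
2.72

# usable capacity as 'zfs list' reports it: one disk's worth goes to raidz1 parity
$ echo "scale=2; 3 * 750 * 10^9 / 1024^4" | bc
2.04
[/code]

Roughly speaking, metadata and allocation overhead account for the small remaining gap down to the 2.00T that zfs list shows.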
Re: [zfs-discuss] Should I report this as a bug?
On Wed, 04 Feb 2009 11:12:55 -0500, Kyle McDonald kmcdon...@egenera.com wrote:

> I jumpstarted my machine with sNV b106, and installed with ZFS root/boot. It left me at a shell prompt in the JumpStart environment, with my ZFS root on /a. I wanted to try out some things that I planned on scripting for the JumpStart to run; one of these was creating a new ZFS pool from the remaining disks.
>
> I looked at the zpool create manpage and saw that it had a -R altroot option, and the exact same thing had just worked for me with 'dladm create-aggr', so I thought I'd give that a try.
>
> If the machine had been booted normally, my ZFS root would have been /, and a 'zpool create zdata0 ...' would have defaulted to mounting the new pool as /zdata0, right next to my ZFS root pool /zroot0. So I expected 'zpool create -R /a zdata0 ...' to set the default mountpoint for the pool to /zdata0 with a temporary altroot=/a.
>
> I gave it a try, and while it created the pool it failed to mount it at all. It reported that /a wasn't empty. 'zpool list' and 'zpool get all' show altroot=/a. But 'zfs get all zdata0' shows mountpoint=/a as well, not the default of /zdata0.
>
> Am I expecting the wrong thing here? Or is this a bug?
>
> -Kyle

My guess is that /a is occupied by the mount of the just-installed root pool. You'll have to create a new mountpoint, something like /b, and have your zdata0 pool mount there temporarily.

-- 
( Kees Nuyt )
c[_]
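Not an authoritative answer, but one thing worth trying from the JumpStart shell is to give the pool an explicit default mountpoint at creation time, so it doesn't try to mount on the altroot itself. A rough sketch (device names are made up, and I haven't verified how -R and -m interact on b106):

[code]
# hypothetical devices; -m sets the pool's persistent default mountpoint,
# -R only prefixes it with the temporary altroot for this boot
zpool create -R /a -m /zdata0 zdata0 c1t2d0 c1t3d0
zfs get mountpoint zdata0
[/code]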
Re: [zfs-discuss] Solved - a big THANKS to Victor Latushkin @ Sun / Moscow
> FYI, I'm working on a workaround for broken devices. As you note, some disks flat-out lie: you issue the synchronize-cache command, they say "got it, boss", yet the data is still not on stable storage. Why do they do this? Because it performs better. Well, duh -- you can make stuff *really* fast if it doesn't have to be correct.
>
> The uberblock ring buffer in ZFS gives us a way to cope with this, as long as we don't reuse freed blocks for a few transaction groups. The basic idea: if we can't read the pool starting from the most recent uberblock, then we should be able to use the one before it, or the one before that, etc., as long as we haven't yet reused any blocks that were freed in those earlier txgs. This allows us to use the normal load on the pool, plus the passage of time, as a displacement flush for disk caches that ignore the sync command. If we go back far enough in (txg) time, we will eventually find an uberblock all of whose dependent data blocks have made it to disk. I'll run tests with known-broken disks to determine how far back we need to go in practice -- I'll bet one txg is almost always enough.
>
> Jeff

Hi Jeff,

we just lost 2 pools on snv91. Any news about your workaround to recover pools by discarding the last txg?

thanks
gino
[zfs-discuss] Alternatives to increasing the number of copies on a ZFS snapshot
How do I set the number of copies on a snapshot? Based on the error message, I believe that I cannot do so.

I already have a number of clones based on this snapshot, and would like the snapshot to have more copies now. For higher redundancy and peace of mind, what alternatives do I have?

-- Sriram
[zfs-discuss] A question on non-consecutive disk failures
From the presentation "ZFS - The last word in filesystems", page 22:

"In a multi-disk pool, ZFS survives any non-consecutive disk failures"

Questions:

If I have a 3 disk RAIDZ with disks A, B and C, then:
- if disk B fails, will I be able to continue to read data if disks A and C are still available?

If I have a 4 disk RAIDZ with disks A, B, C, and D, then:
- if disks A and B fail, then I won't be able to read from the mirror any more. Is this understanding correct?
- if disks A and C fail, then I will be able to read from disks B and D. Is this understanding correct?

-- Sriram
Re: [zfs-discuss] Nested ZFS file systems are not visible over an NFS export
An update: I'm using VMWare ESX 3.5 and VMWare ESXi 3.5 as the NFS clients. I use 'zfs set sharenfs=on datapool/vmwarenfs' to make that zfs file system accessible over NFS.

-- Sriram

On Sun, Feb 8, 2009 at 12:07 AM, Sriram Narayanan sriram...@gmail.com wrote:

> Hello:
>
> I have the following zfs structure: datapool/vmwarenfs - which is available over NFS.
>
> I have some ZFS filesystems as follows:
> datapool/vmwarenfs/basicVMImage
> datapool/vmwarenfs/basicvmim...@snapshot
> datapool/vmwarenfs/VMImage01 - zfs cloned from basicvmim...@snapshot
> datapool/vmwarenfs/VMImage02 - zfs cloned from basicvmim...@snapshot
>
> These are accessible via NFS as /datapool/vmwarenfs with the subfolders VMImage01 and VMImage02.
>
> What's happening right now:
> a. When I connect to datapool/vmwarenfs over NFS,
>    - the contents of /datapool/vmwarenfs are visible and usable
>    - VMImage01 and VMImage02 appear as empty sub-folders at the paths /datapool/vmwarenfs/VMImage01 and /datapool/vmwarenfs/VMImage02, but their contents are not visible.
> b. When I explicitly share VMImage01 and VMImage02 via NFS, then
>    /datapool/vmwarenfs/VMImage01 - usable as a separate NFS share
>    /datapool/vmwarenfs/VMImage02 - usable as a separate NFS share
>
> What I'd like to have:
> - attach over NFS to /datapool/vmwarenfs
> - view the ZFS filesystems VMImage01 and VMImage02 as sub-folders under /datapool/vmwarenfs
>
> If needed, I can move VMImage01 and VMImage02 from datapool/vmwarenfs, and even re-create them elsewhere.
>
> -- Sriram
Re: [zfs-discuss] Alternatives to increasing the number of copies on a ZFS snapshot
On Sat, Feb 7, 2009 at 19:33, Sriram Narayanan sri...@belenix.org wrote:

> How do I set the number of copies on a snapshot? Based on the error message, I believe that I cannot do so.
>
> I already have a number of clones based on this snapshot, and would like the snapshot to have more copies now. For higher redundancy and peace of mind, what alternatives do I have?

You have to set the number of copies before you write the file. Snapshots don't write anything, so you can't change that on snapshots.

Your best option (and the only one, if you value your data) is mirroring (zpool attach).
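To make that concrete, roughly what the two options look like (dataset and device names are placeholders):

[code]
# 'copies' affects only blocks written after it is set; existing data,
# and therefore snapshots of it, keep the old copy count
zfs set copies=2 tank/data

# stronger protection: attach a second device so the vdev becomes a mirror
# (c1t0d0 is the existing disk, c1t1d0 the new one -- placeholder names)
zpool attach tank c1t0d0 c1t1d0
[/code]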
Re: [zfs-discuss] A question on non-consecutive disk failures
On February 8, 2009 12:04:22 AM +0530 Sriram Narayanan sri...@belenix.org wrote:

> From the presentation "ZFS - The last word in filesystems", page 22:
> "In a multi-disk pool, ZFS survives any non-consecutive disk failures"
>
> If I have a 3 disk RAIDZ with disks A, B and C, then:
> - if disk B fails, will I be able to continue to read data if disks A and C are still available?

Yes. raidz allows for 1 disk failure.

> If I have a 4 disk RAIDZ with disks A, B, C, and D, then:
> - if disks A and B fail, then I won't be able to read from the mirror any more. Is this understanding correct?

What mirror? There is no mirror. You have a raidz. You can have 1 disk failure.

> - if disks A and C fail, then I will be able to read from disks B and D. Is this understanding correct?

No. You will lose all the data if 2 disks fail.

The part of the slides you are referring to is in reference to ditto blocks, which allow failure of PARTS of a SINGLE disk.

-frank
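For reference, the ditto-block behaviour Frank mentions is, for user data, controlled by the copies property; a small hedged example (dataset name is hypothetical):

[code]
# keep two copies of every data block in this dataset; ZFS spreads them out,
# which guards against bad sectors or partial damage on one disk,
# not against losing two whole disks in a raidz1
zfs set copies=2 tank/important
zfs get copies tank/important
[/code]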
Re: [zfs-discuss] A question on non-consecutive disk failures
On Sat, Feb 7, 2009 at 6:34 PM, Sriram Narayanan sri...@belenix.org wrote:

> From the presentation "ZFS - The last word in filesystems", page 22:
> "In a multi-disk pool, ZFS survives any non-consecutive disk failures"
>
> Questions:
>
> If I have a 3 disk RAIDZ with disks A, B and C, then:
> - if disk B fails, will I be able to continue to read data if disks A and C are still available?
>
> If I have a 4 disk RAIDZ with disks A, B, C, and D, then:
> - if disks A and B fail, then I won't be able to read from the mirror any more. Is this understanding correct?
> - if disks A and C fail, then I will be able to read from disks B and D. Is this understanding correct?

No. That quote is part of the discussion of ditto blocks. See the following:
http://blogs.sun.com/bill/entry/ditto_blocks_the_amazing_tape

-- 
-Peter Tribble
http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/
Re: [zfs-discuss] [cifs-discuss] Permissions / ACL setting for top directory of CIFS export
On Sat, February 7, 2009 14:32, Alan.M.Wright wrote:

>> Also, does this end up taking up extra metadata space compared to not
>> having to have an ACL entry for each file?
>
> No, ZFS only stores ACLs. It doesn't have or store a separate
> representation of the UNIX permissions bits.

So I won't worry about it. I still worry when I see 11 ACL entries, though; if only because I can't read through it and accurately tell what it will do!

> If you set traditional UNIX-like permissions on a ZFS file/directory,
> ZFS sets the ACL to represent those permissions.

I've certainly seen that happen; a change made in ACL syntax can result in the unix permission bits changing, in ways that represent the resulting ACL permissions.

Sometimes I end up with Unix permissions of all dashes, though, when the ACL actually allows quite a lot of access. That's confusing. But if it allows the access I want, I can probably learn to stop worrying about it.

-- 
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info
Re: [zfs-discuss] [cifs-discuss] Permissions / ACL setting for top directory of CIFS export
David Dyer-Bennet wrote:

> On Sat, February 7, 2009 14:32, Alan.M.Wright wrote:
>
>>> Also, does this end up taking up extra metadata space compared to not
>>> having to have an ACL entry for each file?
>>
>> No, ZFS only stores ACLs. It doesn't have or store a separate
>> representation of the UNIX permissions bits.
>
> So I won't worry about it. I still worry when I see 11 ACL entries, though; if only because I can't read through it and accurately tell what it will do!

You can play with the aclmode and aclinherit properties of your ZFS dataset to get different results if you want. But note that playing with these properties doesn't affect the result when you're operating over CIFS. The CIFS server always applies Windows inheritance rules when creating new files/folders. When modifying ACLs over CIFS you should always see exactly what you've sent over the wire.

>> If you set traditional UNIX-like permissions on a ZFS file/directory,
>> ZFS sets the ACL to represent those permissions.
>
> I've certainly seen that happen; a change made in ACL syntax can result in the unix permission bits changing, in ways that represent the resulting ACL permissions.
>
> Sometimes I end up with Unix permissions of all dashes, though, when the ACL actually allows quite a lot of access. That's confusing. But if it allows the access I want, I can probably learn to stop worrying about it.

That's because ZFS only looks at owner@, group@ and everyone@ entries to generate Unix permissions, so for example if you don't have any owner@ entries you'll see --- for the owner part of Unix permissions, even if you have an entry for user joe which is also the owner of the file.

Afshin
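For what it's worth, the quickest way to see why the mode bits look odd is to read the ACL itself rather than the summarized ls -l output. A small example (file path and permissions are made up; syntax per the Solaris ACL extensions to ls(1) and chmod(1)):

[code]
# show the full NFSv4-style ACL in compact form, not just the mode bits
ls -V /tank/share/somefile

# add an explicit owner@ entry; only owner@, group@ and everyone@ entries
# feed back into the Unix permission bits
chmod A+owner@:read_data/write_data:allow /tank/share/somefile
[/code]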
Re: [zfs-discuss] Nested ZFS file systems are not visible over an NFS export
This is not a ZFS question; it is an NFS question. Solaris NFSv4 clients post-b77 will follow the mounts, via a method called mirror mounts. For other NFS clients, the behaviour will be whatever the client's developers implemented. Please consult the appropriate NFS client forum for your system.

The announcement of mirror mounts for Solaris is here:
http://opensolaris.org/os/community/on/flag-days/pages/2007102201/
-- richard

Sriram Narayanan wrote:

> An update: I'm using VMWare ESX 3.5 and VMWare ESXi 3.5 as the NFS clients. I use 'zfs set sharenfs=on datapool/vmwarenfs' to make that zfs file system accessible over NFS.
>
> -- Sriram
>
> On Sun, Feb 8, 2009 at 12:07 AM, Sriram Narayanan sriram...@gmail.com wrote:
>
>> Hello:
>>
>> I have the following zfs structure: datapool/vmwarenfs - which is available over NFS.
>>
>> I have some ZFS filesystems as follows:
>> datapool/vmwarenfs/basicVMImage
>> datapool/vmwarenfs/basicvmim...@snapshot
>> datapool/vmwarenfs/VMImage01 - zfs cloned from basicvmim...@snapshot
>> datapool/vmwarenfs/VMImage02 - zfs cloned from basicvmim...@snapshot
>>
>> These are accessible via NFS as /datapool/vmwarenfs with the subfolders VMImage01 and VMImage02.
>>
>> What's happening right now:
>> a. When I connect to datapool/vmwarenfs over NFS,
>>    - the contents of /datapool/vmwarenfs are visible and usable
>>    - VMImage01 and VMImage02 appear as empty sub-folders at the paths /datapool/vmwarenfs/VMImage01 and /datapool/vmwarenfs/VMImage02, but their contents are not visible.
>> b. When I explicitly share VMImage01 and VMImage02 via NFS, then
>>    /datapool/vmwarenfs/VMImage01 - usable as a separate NFS share
>>    /datapool/vmwarenfs/VMImage02 - usable as a separate NFS share
>>
>> What I'd like to have:
>> - attach over NFS to /datapool/vmwarenfs
>> - view the ZFS filesystems VMImage01 and VMImage02 as sub-folders under /datapool/vmwarenfs
>>
>> If needed, I can move VMImage01 and VMImage02 from datapool/vmwarenfs, and even re-create them elsewhere.
>>
>> -- Sriram
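And for clients that can't follow mirror mounts (ESX 3.5 speaks NFSv3, which has no such mechanism), the workaround Sriram already found is the practical one: share each child dataset explicitly and mount each as its own export on the client. A sketch, using the dataset names from the thread:

[code]
# each clone becomes its own NFS export; the ESX host then mounts each one
# as a separate datastore instead of expecting it under the parent mount
zfs set sharenfs=on datapool/vmwarenfs/VMImage01
zfs set sharenfs=on datapool/vmwarenfs/VMImage02
[/code]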
[zfs-discuss] terabyte or terror-byte disk drives
This is one of those "Doctor, it hurts when I do this..." / "Then don't do that" stories.

Basically I installed a couple of 1TB WD Caviar Black drives in a Sun x2200M2 1U server in a ZFS mirrored boot config. Excellent drives - much faster than previous 7,200 RPM drives, and the dual controller config and 32MB cache really work and allow a lot more IOPS. I am delighted with the price/performance (around $120 each at that time) compared with earlier generations of SATA drives. Highly recommended.

Then, as I've done many times in the past, I moved the box to get at another system which was underneath the x2200 - with the system live and disks spinning. Didn't even give it a second thought; I've done this lots of times previously without any issues.

A week later, Bad Things (TM) started happening with the box - and yes, you've guessed it, one of the terror-byte drives was generating errors faster than the price dial on your local gas pump. A quick look at the logs revealed that, yes, you've guessed it, the drive errors originated when the box was moved. A zpool scrub generated thousands of errors on the damaged drive. Now it's offline. Al is sad. :(

Moral of the story: you could get away with this (moving a box with live disks) in the days of 300, 400 and 500GB drives, but not any more with today's high density terror-byte drives. And just FYI for those technocrats that don't read disk drive specs - the WD 1TB drive has 3 platters (roughly 333GB each); there are 500GB and 640GB versions of the same drive family that use 2 platters of the *same* magnetic data density - so be very careful with those high density disk drives. As I said - don't do this!

Just a heads up - it might just help someone else on the list who has developed bad habits over the years..

Regards,

-- Al Hopper  Logical Approach Inc, Plano, TX  a...@logical-approach.com
Voice: 972.379.2133  Timezone: US CDT
OpenSolaris Governing Board (OGB) Member - Apr 2005 to Mar 2007
http://www.opensolaris.org/os/community/ogb/ogb_2005-2007/
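For anyone who ends up in the same spot, the usual recovery path for a mirrored pool looks roughly like this (device names are placeholders, not Al's actual layout):

[code]
# see which device is accumulating errors
zpool status -v rpool

# swap in a replacement disk; the mirror resilvers from the healthy side
zpool replace rpool c4d0s0 c5d0s0

# once the resilver completes, verify and clear the old error counters
zpool status rpool
zpool clear rpool
[/code]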