Re: [zfs-discuss] Can I create ZPOOL with missing disks?

2009-01-18 Thread Jim Klimov
...and, apparently, I can replace two drives at the same time (with two
commands), and the resilvering runs in parallel:

{code}
[r...@t2k1 /]# zpool status pool
  pool: pool
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-2Q
 scrub: resilver completed after 0h0m with 0 errors on Sun Jan 18 15:11:24 2009
config:

        NAME          STATE     READ WRITE CKSUM
        pool          DEGRADED     0     0     0
          raidz2      DEGRADED     0     0     0
            c1t0d0s3  ONLINE       0     0     0
            /ff1      OFFLINE      0     0     0
            c1t2d0s3  ONLINE       0     0     0
            /ff2      UNAVAIL      0     0     0  cannot open

errors: No known data errors
[r...@t2k1 /]# zpool replace pool /ff1 c1t1d0s3; zpool replace pool /ff2 c1t3d0s3
{code}

This took a while, about half a minute. Now, how is the array rebuild going?

{code}
[r...@t2k1 /]# zpool status pool
  pool: pool
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress for 0h0m, 0.48% done, 1h9m to go
config:

        NAME            STATE     READ WRITE CKSUM
        pool            DEGRADED     0     0     0
          raidz2        DEGRADED     0     0     0
            c1t0d0s3    ONLINE       0     0     0
            replacing   DEGRADED     0     0     0
              /ff1      OFFLINE      0     0     0
              c1t1d0s3  ONLINE       0     0     0
            c1t2d0s3    ONLINE       0     0     0
            replacing   DEGRADED     0     0     0
              /ff2      UNAVAIL      0     0     0  cannot open
              c1t3d0s3  ONLINE       0     0     0

errors: No known data errors
{code}

The progress meter tends to lie at first: resilvering actually takes roughly
30 minutes for this raidz2 of 4 x 60GB slices.
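
The settling estimate can be watched with a trivial loop like the one below
(just an illustrative sketch, nothing ZFS-specific):

{code}
# print the resilver progress line once a minute until the resilver is done
while zpool status pool | grep 'resilver in progress'; do
    sleep 60
done
{code}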

BTW, an earlier poster reported very slow synchronization when using real disks
together with sparse files on a single disk. I removed the sparse files as soon
as the array was initialized, and writing to two separate drives went reasonably
well.

I sent data from the latest snapshot of the oldpool to the newpool with 
{code}
zfs send -R oldp...@20090118-02-postupgrade | zfs  recv -vF -d newpool
{code}

Larger datasets went at the usual 13-20Mb/s (of course, small datasets and
snapshots only a few kilobytes in size took longer to open and close than to
actually copy, so their estimated speed came out as bytes or kilobytes per
second).
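
For a rough view of the actual transfer rate while the send/receive runs,
zpool iostat can be watched in parallel (a sketch; interval is in seconds):

{code}
# report read/write bandwidth for the target pool every 10 seconds
zpool iostat newpool 10
{code}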

//Jim


Re: [zfs-discuss] Can I create ZPOOL with missing disks?

2009-01-17 Thread Jim Klimov
Thanks to all who helped, despite the non-enterprise approach of this question ;)

While experimenting I discovered that Solaris /tmp doesn't seem to support
sparse files: mkfile -n still creates full-sized files, which can either use up
the swap space or simply not fit there. ZFS and UFS filesystems handle sparse
files fine, though. This was tested on Solaris 10u4, 10u6 and OpenSolaris b103.
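
In case anyone wants to check their own filesystem, the sparse-file test boils
down to comparing the nominal size with the blocks actually allocated (a quick
sketch, path made up):

{code}
# create a nominally 1GB sparse file and see how much space it really takes
mkfile -n 1g /some/fs/sparsetest
ls -l /some/fs/sparsetest   # reported size: 1GB
du -k /some/fs/sparsetest   # allocated KB: tiny if sparse, ~1000000 if not
rm /some/fs/sparsetest
{code}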

Other than that detail, the scenario suggested by Tomas Ögren and updated by
Daniel Rock works. Of the two variants, using zpool offline on the sparse files
is preferable: it takes only one command to complete and is more straightforward
(less error-prone).

The variant that removes a sparse file requires a zpool scrub; otherwise the
file remains open on the filesystem and keeps growing (consuming space) while I
copy data to the test pool. The consumed space is only released after the zpool
scrub, when the removed file is finally unlinked from the filesystem.

zpool replace works for both cases.
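
For reference, the rm-based variant boils down to roughly this sequence (a
sketch reusing the device and file names from the earlier status output, not a
literal transcript; the placeholder files must live on a filesystem that
supports sparse files, i.e. not /tmp):

{code}
# sparse placeholders, sized like the real 60GB slices
mkfile -n 60g /ff1
mkfile -n 60g /ff2
zpool create -f pool raidz2 c1t0d0s3 /ff1 c1t2d0s3 /ff2
rm /ff1 /ff2        # the files stay open (and can grow) until...
zpool scrub pool    # ...a scrub finally lets them be unlinked
# ...copy the data in, free the old disks, then:
zpool replace pool /ff1 c1t1d0s3
zpool replace pool /ff2 c1t3d0s3
{code}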


Re: [zfs-discuss] Can I create ZPOOL with missing disks?

2009-01-16 Thread Daniel Rock
Jim Klimov wrote:
 Is it possible to create a (degraded) zpool with placeholders specified instead
 of actual disks (parity or mirrors)? This is possible in linux mdadm (missing
 keyword), so I kinda hoped this can be done in Solaris, but didn't manage to.

Create sparse files with the size of the disks (mkfile -n ...).

Create a zpool with the free disks and the sparse files (zpool create -f 
...). Then immediately put the sparse files offline (zpool offline ...). 
Copy the files to the new zpool, destroy the old one and replace the 
sparse files with the now freed up disks (zpool replace ...).
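
Roughly, with made-up pool, file and disk names (and placeholder sizes matching
your real disks), the sequence would look like:

mkfile -n 500g /var/tmp/ph1
mkfile -n 500g /var/tmp/ph2
zpool create -f newpool raidz2 c1t2d0 c1t3d0 /var/tmp/ph1 /var/tmp/ph2
zpool offline newpool /var/tmp/ph1
zpool offline newpool /var/tmp/ph2
# copy the data (e.g. zfs send | zfs recv), then:
zpool destroy oldpool
zpool replace newpool /var/tmp/ph1 c1t0d0
zpool replace newpool /var/tmp/ph2 c1t1d0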

Remember: during data migration you are running without safety belts.
If a disk fails during the migration you will lose data.



Daniel


Re: [zfs-discuss] Can I create ZPOOL with missing disks?

2009-01-15 Thread Richard Elling
Jim Klimov wrote:
 Is it possible to create a (degraded) zpool with placeholders specified instead
 of actual disks (parity or mirrors)? This is possible in linux mdadm (missing
 keyword), so I kinda hoped this can be done in Solaris, but didn't manage to.

 Usecase scenario:

 I have a single server (or home workstation) with 4 HDD bays, sold with 2
 drives. Initially the system was set up with a ZFS mirror for data slices.
 Now we got 2 more drives and want to replace the mirror with a larger RAIDZ2
 set (say I don't want a RAID10 which is trivial to make).

 Technically I think that it should be possible to force creation of a degraded
 raidz2 array with two actual drives and two missing drives. Then I'd copy data
 from the old mirror pool to the new degraded raidz2 pool (zfs send | zfs recv),
 destroy the mirror pool and attach its two drives to repair the raidz2 pool.

 While obviously not an enterprise approach, this is useful while expanding
 home systems when I don't have a spare tape backup to dump my files on it
 and restore afterwards.

I would say it is definitely not a recommended approach for those who
love their data, whether enterprise or not.  But my opinion is really a
result of our environment at Sun (or any systems vendor).  Being here
blinds us to some opportunities. Please file an RFE at
http://bugs.opensolaris.org
 -- richard



Re: [zfs-discuss] Can I create ZPOOL with missing disks?

2009-01-15 Thread Tomas Ögren
On 15 January, 2009 - Jim Klimov sent me these 1,3K bytes:

 Is it possible to create a (degraded) zpool with placeholders specified instead
 of actual disks (parity or mirrors)? This is possible in linux mdadm (missing
 keyword), so I kinda hoped this can be done in Solaris, but didn't manage to.

 Usecase scenario:

 I have a single server (or home workstation) with 4 HDD bays, sold with 2
 drives. Initially the system was set up with a ZFS mirror for data slices.
 Now we got 2 more drives and want to replace the mirror with a larger RAIDZ2
 set (say I don't want a RAID10 which is trivial to make).

 Technically I think that it should be possible to force creation of a degraded
 raidz2 array with two actual drives and two missing drives. Then I'd copy data
 from the old mirror pool to the new degraded raidz2 pool (zfs send | zfs recv),
 destroy the mirror pool and attach its two drives to repair the raidz2 pool.

 While obviously not an enterprise approach, this is useful while expanding
 home systems when I don't have a spare tape backup to dump my files on it
 and restore afterwards.

 I think it's an (intended?) limitation in zpool command itself, since the
 kernel can very well live with degraded pools.

You can fake it..

kalv:/tmp# mkfile 64m realdisk1
kalv:/tmp# mkfile 64m realdisk2
kalv:/tmp# mkfile -n 64m fakedisk1
kalv:/tmp# mkfile -n 64m fakedisk2
kalv:/tmp# ls -la real* fake*
-rw------T 1 root root 67108864 2009-01-15 17:02 fakedisk1
-rw------T 1 root root 67108864 2009-01-15 17:02 fakedisk2
-rw------T 1 root root 67108864 2009-01-15 17:02 realdisk1
-rw------T 1 root root 67108864 2009-01-15 17:02 realdisk2
kalv:/tmp# du real* fake*
6   realdisk1
6   realdisk2
133 fakedisk1
133 fakedisk2


In reality, realdisk* should point at your real disks, while fakedisk* should
still point at sparse mkfile'd files of the same size as your real disks
(300GB or whatever).

kalv:/tmp# zpool create blah raidz2 /tmp/realdisk1 /tmp/realdisk2 /tmp/fakedisk1 /tmp/fakedisk2
kalv:/tmp# zpool status blah
  pool: blah
 state: ONLINE
 scrub: none requested
config:

        NAME                STATE     READ WRITE CKSUM
        blah                ONLINE       0     0     0
          raidz2            ONLINE       0     0     0
            /tmp/realdisk1  ONLINE       0     0     0
            /tmp/realdisk2  ONLINE       0     0     0
            /tmp/fakedisk1  ONLINE       0     0     0
            /tmp/fakedisk2  ONLINE       0     0     0

errors: No known data errors

Ok, so it's created fine. Let's accidentally introduce some problems..


kalv:/tmp# rm /tmp/fakedisk1
kalv:/tmp# rm /tmp/fakedisk2
kalv:/tmp# zpool scrub blah
kalv:/tmp# zpool status blah
  pool: blah
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-2Q
 scrub: scrub completed after 0h0m with 0 errors on Thu Jan 15 17:03:38 2009
config:

        NAME                STATE     READ WRITE CKSUM
        blah                DEGRADED     0     0     0
          raidz2            DEGRADED     0     0     0
            /tmp/realdisk1  ONLINE       0     0     0
            /tmp/realdisk2  ONLINE       0     0     0
            /tmp/fakedisk1  UNAVAIL      0     0     0  cannot open
            /tmp/fakedisk2  UNAVAIL      0     0     0  cannot open

errors: No known data errors


Still working.

At this point, you can start filling blah with data. Then after a while,
let's bring in the other real disks:

kalv:/tmp# mkfile 64m realdisk3
kalv:/tmp# mkfile 64m realdisk4
kalv:/tmp# zpool replace blah /tmp/fakedisk1 /tmp/realdisk3
kalv:/tmp# zpool replace blah /tmp/fakedisk2 /tmp/realdisk4
kalv:/tmp# zpool status blah
  pool: blah
 state: ONLINE
 scrub: resilver completed after 0h0m with 0 errors on Thu Jan 15 17:04:31 2009
config:

        NAME                STATE     READ WRITE CKSUM
        blah                ONLINE       0     0     0
          raidz2            ONLINE       0     0     0
            /tmp/realdisk1  ONLINE       0     0     0
            /tmp/realdisk2  ONLINE       0     0     0
            /tmp/realdisk3  ONLINE       0     0     0
            /tmp/realdisk4  ONLINE       0     0     0


Of course, try it out a bit before doing it for real.
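
And when you're done playing, the toy pool and its backing files go away just
as easily (assuming nothing real ended up on it):

kalv:/tmp# zpool destroy blah
kalv:/tmp# rm /tmp/realdisk*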

/Tomas
-- 
Tomas Ögren, st...@acc.umu.se, http://www.acc.umu.se/~stric/
|- Student at Computing Science, University of Umeå
`- Sysadmin at {cs,acc}.umu.se


Re: [zfs-discuss] Can I create ZPOOL with missing disks?

2009-01-15 Thread Jim Klimov
Thanks Tomas, I haven't checked yet, but your workaround looks feasible.

I've posted an RFE and referenced your approach as a workaround.
That's nearly what zpool should do under the hood, and perhaps it can be done
in the meantime with a wrapper script that detects min(physical storage sizes) ;)
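
Just to illustrate the idea, such a wrapper could look roughly like the sketch
below (purely hypothetical and untested, with the size passed in by hand rather
than detected):

{code}
#!/bin/sh
# mkdegraded.sh SIZE POOL DISK...   -- hypothetical sketch, not a real tool
# Pads the given disks with sparse placeholder files up to a 4-way raidz2,
# then takes the placeholders offline. SIZE should be the smallest real
# disk/slice size, e.g. 60g.
SIZE=$1; POOL=$2; shift 2
VDEVS="$*"
n=$#
i=1
while [ $n -lt 4 ]; do
    mkfile -n $SIZE /var/tmp/$POOL.missing$i
    VDEVS="$VDEVS /var/tmp/$POOL.missing$i"
    i=`expr $i + 1`
    n=`expr $n + 1`
done
zpool create -f $POOL raidz2 $VDEVS
# take the placeholders offline right away
while [ $i -gt 1 ]; do
    i=`expr $i - 1`
    zpool offline $POOL /var/tmp/$POOL.missing$i
done
{code}

Usage would then be something like: ./mkdegraded.sh 60g newpool c1t0d0s3 c1t2d0s3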

//Jim


Re: [zfs-discuss] Can I create ZPOOL with missing disks?

2009-01-15 Thread Jonathan
Tomas Ögren wrote:
 On 15 January, 2009 - Jim Klimov sent me these 1,3K bytes:
 
 Is it possible to create a (degraded) zpool with placeholders specified instead
 of actual disks (parity or mirrors)? This is possible in linux mdadm (missing
 keyword), so I kinda hoped this can be done in Solaris, but didn't manage to.

 Usecase scenario:

 I have a single server (or home workstation) with 4 HDD bays, sold with 2
 drives. Initially the system was set up with a ZFS mirror for data slices.
 Now we got 2 more drives and want to replace the mirror with a larger RAIDZ2
 set (say I don't want a RAID10 which is trivial to make).

 Technically I think that it should be possible to force creation of a degraded
 raidz2 array with two actual drives and two missing drives. Then I'd copy data
 from the old mirror pool to the new degraded raidz2 pool (zfs send | zfs recv),
 destroy the mirror pool and attach its two drives to repair the raidz2 pool.

 While obviously not an enterprise approach, this is useful while expanding
 home systems when I don't have a spare tape backup to dump my files on it
 and restore afterwards.

 I think it's an (intended?) limitation in zpool command itself, since the
 kernel can very well live with degraded pools.
 
 You can fake it..

[snip command set]

Summary: yes, that actually works and I've done it, but it's very slow!

I essentially did this myself when I migrated a pool of 4 two-way mirrors to
two 4-disk raidz vdevs (4x 500GB and 4x 1.5TB).  I can say from experience
that it works, but since I used 2 sparse files to simulate 2 disks on a single
physical disk, performance suffered and the migration took a long time.  IIRC
it took over 2 days to transfer 2TB of data.  I used rsync; at the time I
either didn't know about or had forgotten about zfs send/receive, which would
probably work better.  It took a couple more days to verify that everything
had transferred correctly with no bit rot (rsync -c).
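
For anyone doing this now, the send/receive version of that copy would be
roughly (snapshot name made up):

zfs snapshot -r oldpool@migrate
zfs send -R oldpool@migrate | zfs recv -vF -d newpool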

I think Sun avoids making things like this too easy because, from a business
standpoint, it's easier just to spend the money on enough hardware to do it
properly, without the chance of data loss and the extended downtime.
"Doesn't invest the time in it" may be a better phrase than "avoids", though.
I doubt Sun actually goes out of their way to make things harder for people.

Hope that helps,
Jonathan