Just my problem too ;) And ZFS disappointed me big time here!
I know ZFS is new and not every desired feature is implemented yet. I hope and 
believe more features are coming "soon", so I think I'll stay with ZFS and 
wait.

My idea was to start out with just as many state-of-the-art-sized disks as I 
really needed and could afford, and add disks as prices dropped and the zpool 
grew near full.
So I bought 4 Seagate 500GB disks. Now they are full, and meanwhile the price 
has dropped to ~1/3 and will continue to drop to ~1/5, I expect (I've just 
seen 1TB disks in stock at the same price the 500GB started at when released 
~2 years ago).
I thought I could buy one disk at a time and expand the raidz. I have realized 
that is (currently) *not* an option! -- You can't (currently) add (attach) 
disks to a raidz vdev - period!

What you can do, i.e. the only thing you can do (currently), is add a new 
raidz to an existing pool, somewhat like concatenating two or more raidz vdevs 
into a logical volume.
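
To illustrate (the pool name "tank" and the disk names are just placeholders, 
and the exact error text may differ between builds), this is roughly how it 
plays out:

  # zpool attach tank c2t0d0 c2t4d0     <- try to grow the existing raidz1
  cannot attach c2t4d0 to c2t0d0: can only attach to mirrors and top-level disks

  # zpool add tank raidz c2t4d0 c2t5d0 c2t6d0 c2t7d0
                                        <- accepted: a *second* raidz1 vdev

So you end up with two raidz1 top-level vdevs striped together in one pool, 
not one wider raidz1.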
So ZFS does not help you here; you'll have to buy an economically optimal 
bunch of disks each time you run out of space and group them into a new raidz 
vdev each time. The new raidz vdev may be part of the existing pool (volume) 
(most likely), or of a new one.
So with 8-port controller(s) you'd buy 4+4+4+4 or 4+4+8 or 8+8+8 or any number 
> 3 that fits your need and wallet at the time. For each set you lose one disk 
to redundancy. Buying 5+1+1+1+1... is not an option (yet).
Alternatively, of course, you could buy 2+2+2+2... and add mirrored pairs to 
the pool, but then you lose 50% to redundancy, which is not a budget approach. 
Buying 4+4+4+4 gives you at best 75% usable space for your money ((N-1)/N for 
each set, i.e. 3/4); that is when your pool is 100% full. But if your usage 
grows slowly from nothing, then adding mirror pairs could actually be more 
economic, and if it accelerates you could later add groups of raidz1 or raidz2.
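
A rough sketch of what I mean, with made-up disk names and round numbers for 
500GB drives:

  # zpool create tank mirror c2t0d0 c2t1d0    <- 2 disks, ~500GB usable (50%)
  # zpool add tank mirror c2t2d0 c2t3d0       <- +2 disks, ~1TB usable (still 50%)
  # zpool add -f tank raidz c3t0d0 c3t1d0 c3t2d0 c3t3d0
                                              <- +4 disks, ~1.5TB more (75% of
                                                 that set; -f because the
                                                 replication levels differ)

Each step only costs what you can afford right then, and the later, bigger 
purchases can go into the more space-efficient raidz groups.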

Note! You can't even undo what you have added to a pool. Being able to 
evacuate a vdev and replace it with a bigger one would have helped, but this 
isn't possible either (currently).
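
For example (hypothetical names again, and the exact wording depends on the 
release), trying to pull anything that carries data back out of the pool is 
simply refused:

  # zpool remove tank c2t0d0
  cannot remove c2t0d0: only inactive hot spares or cache devices can be removed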

I'd really like to see adding disks to a raidz vdev implemented. I'm no expert 
on the details, but having read a bit about ZFS, I think it shouldn't be that 
hard to implement (just extremely I/O intensive while in progress, like a 
scrub where every stripe needed correction). It should be possible to read 
stripe by stripe and recalculate/reshape into a narrower but longer stripe 
spanning the added disk(s); since more space has been added and the 
recalculated stripes would be narrower (at least not wider), everything should 
fit as a sequential process. One would need some persistent way to keep track 
of the progress that would survive and resume after power loss, panic, etc. A 
bitmap could do. Labeling the stripes with a version could be a way to let a 
mix of old short and new longer stripes coexist for a while: write new stripes 
(i.e. files) with the new width, and recalculate and reshape everything as an 
(optional) part of the next scrub. A constraint would probably be that you 
would each time have to add at least as much space as your biggest file 
(file|metadata = stripe, as far as I have understood) -- at least that is true 
if the reshape process could save the biggest/temporarily non-fitting stripes 
for the end of the process, to make sure there is always one good copy of 
every stripe on disk at any time, which is much of the point of ZFS.
An implementation of something like this is *very* welcome!
I would then also like to be able to convert my initial raidz1 to raidz2. 
Ideally I could start with a 2-disk raidz1 and end up with a giant raidz2, 
then split it into a reasonable number of disks per group: start a new raidz1 
growing from 2 disks at every 10 disks or so, probably stepping up to the then 
state-of-the-art disk size for each new vdev, and just before I run out of 
slots start replacing the by then ridiculously small (and slow, controller 
included) disks in the first raidz, and thus grow forever without necessarily 
needing bigger chassis or rack units.
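
As far as I understand, that last part works already today: you can replace 
the members of a raidz one by one with bigger disks (hypothetical names below; 
and I believe the extra space doesn't show up until every member has been 
replaced and resilvered, on some builds not until an export/import):

  # zpool replace tank c2t0d0 c4t0d0
  (wait for the resilver to finish, then repeat for c2t1d0, c2t2d0, ...)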

Backing up the whole thing, destroying and recreating the pool, and restoring 
everything every couple of months isn't really an option for me.
Actually I have no clue how to back up such a thing on a private budget. Tape 
drives that could cope are way too expensive, and tapes aren't that cheap 
compared to mid-range SATA disks. The best thing I can come up with is rsync 
to a clone system (built around your old PC/server but with similar disk 
capacity; less or no redundancy could do, since with budget HW there is no 
significantly cheaper way to build a downscaled clone except reducing/reusing 
old CPU, RAM and so on).
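
Something along these lines is what I have in mind (hostname and paths are of 
course placeholders), run from cron every night:

  # rsync -aHx --delete /tank/ backupbox:/backup/tank/

Plain rsync over ssh to the downscaled clone; no tape, no special software.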

-- And by the way, yes, I think this applies to professional use too. It could 
give substantial savings at any scale. Buying things you don't really need 
until next year has usually been a 50% waste of money in this business for at 
least the last 25 years. Next week or month you get more for less!
And in business use I wouldn't have one system, I'd have thousands! So it 
applies even more there, I think. The argument I've seen that a business could 
always afford buying, say, 6 disks at a time does not hold. It isn't just 6 
disks; it's 6 disks for each system running out of space, plus the sum of all 
the wasted space in each separate system.

ZFS is really good, make it better! Thanks :)

Pål
 
 