First of all, as other posters stressed, your data is not safe stored as a
single copy in the first place. Before doing anything to it, make a backup
and test that backup if at all possible. At the very least, do so for any
data that is worth more than the rest ;)

As was stressed in other posts, and not yet answered - how much disk space
do you actually have available on your 8 disks? I.e. can you shuffle some
files around in WinXP in order to free up at least one drive? Half of the
drives (ideally)?

How compressible is your data (i.e. videos vs. text files)? Is it already
compressed on the NTFS filesystem (if not, that's a pointer to freeing up
space)?

Depending on how much free space can actually be gained by moving and
compressing your data, a number of scenarios are possible, detailed below.

The point I'm trying to get to is: as soon as you can free up a single drive so 
you 
can move it into the Solaris machine, you can set it up as your ZFS pool.

Initially this pool would contain only a single vdev (a single drive, a
mirror, or a raidz group of drives; vdevs are concatenated to make up a
larger pool if there is more than one, as detailed below).
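
For illustration, that first pool takes just one command (device names
like c1t1d0 here and below are hypothetical - substitute whatever
format(1M) reports on your box):

  zpool create tank c1t1d0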

You create a filesystem dataset on the pool and enable compression (if
your data can be compressed). In recent Solaris and OpenSolaris you can use
gzip-9 to pack the data tighter on the drive. Keep in mind that this
setting only applies to data written *after* the value is set, so a dataset
can store objects written with mixed compression levels if the value is
changed on the fly. Alternatively, and simpler to support, you can make
several datasets with pre-defined compression levels (i.e. so as not to
waste CPU cycles compressing JPEGs and MP3s).
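
A sketch of what I mean, with made-up dataset names:

  zfs create tank/docs
  zfs set compression=gzip-9 tank/docs   # text and the like, packs well
  zfs create tank/media
  zfs set compression=off tank/media     # JPEGs/MP3s - don't waste the CPU

(The gzip-N levels need a recent build; plain compression=on, i.e. lzjb,
works everywhere.)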

Now, as you copy the data from NTFS to Solaris, you (are expected to) free
up at least one more drive, which can then be added to the ZFS pool; its
capacity is concatenated onto the same pool at that moment. If you free up
several drives at once, you can go for a raidz vdev instead.
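
Adding a drive is again one command - but note there is no way back, as
vdevs cannot be removed from a pool:

  zpool add tank c1t2d0   # the drive's capacity is striped onto the pool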

The best-case scenario is that you free up enough disks to build a
redundant ZFS pool right away (raidz1, raidz2 or mirror - in that order the
redundant pool's capacity decreases and data protection grows). Apparently
you don't expect to have enough drives to mirror all the data, so let's
skip that idea for now. The raidz levels require that you free up at least
two drives initially. AFAIK raidz vdevs cannot be expanded at the moment,
so the more drives you start with, the smaller the share of capacity you
lose to parity. As you progress with copying the data over, you can free up
some more drives and make another raidz vdev attached to the same pool.
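
Roughly, under the same hypothetical device naming:

  # best case - the pool is born redundant:
  zpool create tank raidz c1t1d0 c1t2d0 c1t3d0 c1t4d0
  # later on, a second raidz vdev concatenated onto the same pool:
  zpool add tank raidz c2t1d0 c2t2d0 c2t3d0 c2t4d0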

You can use a trick to make a raidz vdev with its redundancy disks missing
(you replace and resilver them later on). This is possible, but not
"production-ready" in any manner, and prone to losing the whole set of
several drives' worth of data whenever anything goes wrong. At my own risk,
I used it to make and populate a raidz2 pool of 4 devices while I only had
2 drives available at that moment (the other 2 were the old raid10 mirror's
components holding the original data).

The fake raidz redundancy devices trick is discussed in this thread:
[http://opensolaris.org/jive/thread.jspa?messageID=328360&tstart=0]
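
In short, the trick looks like this (entirely at your own risk; paths and
sizes are made up, and the sparse files must claim at least the size of the
real drives):

  # sparse files stand in for the two missing drives
  mkfile -n 1500g /var/tmp/fake1 /var/tmp/fake2
  zpool create tank raidz2 c1t1d0 c1t2d0 /var/tmp/fake1 /var/tmp/fake2
  # take the fakes offline right away so no data lands on them
  zpool offline tank /var/tmp/fake1
  zpool offline tank /var/tmp/fake2
  # ...copy the data over (with NO redundancy at this point!), then:
  zpool replace tank /var/tmp/fake1 c1t3d0   # resilvers onto a real drive
  zpool replace tank /var/tmp/fake2 c1t4d0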

In the worst-case scenario you'll have either a single pool of concatenated
disks, or a number of separate pools - much like your separate NTFS
filesystems now; in my opinion, the separate pools are the lesser of the
two evils. With separate ZFS pools you can move them around individually,
and you only lose one disk's worth of data if anything (drive, cabling,
software, power) goes wrong. With a concatenated pool, on the other hand,
you have all of the drives' free space pooled into one big available
bubble.

That's your choice to make. 

Later on, you can expand the single-drive vdevs into mirrors as you buy or
free up drives.
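
That is a simple attach, and the pool stays online throughout:

  # turn the single-drive vdev c1t1d0 into a two-way mirror:
  zpool attach tank c1t1d0 c2t1d0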

If you find that your data compresses well - so you start with a
single-drive concatenation pool and then discover you can free up several
drives at once for raidz sets - see if you can squeeze out at least 3-4
drives (counting a fake device for raidz redundancy, if you choose to try
the trick above). If you can, start a new pool made of raidz vdevs and
migrate the data from the single drives onto it, then scrap their pool and
reuse the disks. Remember that you currently cannot remove a vdev from a
pool.
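
The migration itself can be as simple as this (names made up; a plain cp
or rsync of the mounted filesystems works too):

  zfs snapshot oldpool/docs@migrate
  zfs send oldpool/docs@migrate | zfs receive newpool/docs
  # after verifying that everything landed on newpool:
  zpool destroy oldpool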

For such "temporary" pools (preferably redundant, or not) you can also use any 
number of older-smaller drives if you can get hands on them ;)

On a side note, copying this much data over the LAN would take ages. If
your disks are not too fragmented, you can typically expect 20-40 MB/s for
large files. Zillions of small files (or heavily fragmented disks) cause so
many mechanical seeks that speeds can fall to well under 1 MB/s. Easy to
see that copying a single 1.5 TB drive can take anywhere from half a day on
a gigabit LAN (1.5 TB at ~30 MB/s is roughly 14 hours) to about 2-3 days on
a 100 Mbit LAN (7-10 MB/s typical).

In your position, I would explore (on a testbed first!) whether you can use
the same machine to read NTFS and write ZFS. My ideas so far include either
dual-booting into Solaris with some kind of NTFS driver (for example, see
these posts: [http://blogs.sun.com/pradhap/entry/mount_ntfs_ext2_ext3_in]
or [http://blogs.sun.com/pradhap/entry/mounting_ntfs_partition_in_solaris]),
or virtualizing either Solaris or WinXP (see VirtualBox) and finding a way
to assign whole physical drives to the virtual machine. If you virtualize
OpenSolaris this way, make sure that a test pool (on a USB flash drive,
perhaps) can indeed be imported on a physical machine afterwards.
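
Something like this, assuming the stick shows up as c4t0d0 inside the VM:

  # inside the virtualized OpenSolaris:
  zpool create testpool c4t0d0
  zpool export testpool
  # then move the stick over to the physical Solaris box and try:
  zpool import testpool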

Quite possibly your data can be copied over faster this way than over the
LAN.

On another side note, expect your filesystem security settings to be lost
(unless you integrate with a Windows domain), and remember that ZFS
filenames are currently limited to 255 bytes - NTFS allows 255 *characters*,
so long multibyte (Unicode) names may not fit. That limitation bit me once
last year, so some NTFS files had to be renamed.

Hope this helps, let us know if it does ;)

//Jim Klimov