Re: [zfs-discuss] ZFS RAID-10
> Dennis Clarke wrote:
>> While ZFS may do a similar thing *I don't know* if there is a published
>> document yet that shows conclusively that ZFS will survive multiple disk
>> failures.
>
> ?? why not?  Perhaps this is just too simple and therefore doesn't get
> explained well.

That is not what I wrote. Once again, for the sake of clarity: I don't know
if there is a published document, anywhere, that shows by way of a concise
experiment that ZFS will actually perform RAID 1+0 and survive multiple disk
failures gracefully. I do not see why it would not, but there is no
conclusive proof that it will.

> Note that SVM (nee Solstice Disksuite) did not always do RAID-1+0, for
> many years it would do RAID-0+1. However, the data availability for
> RAID-1+0 is better than for an equivalent sized RAID-0+1, so it is just
> as well that ZFS does stripes of mirrors.
>  -- richard

My understanding is that SVM will do stripes of mirrors if all of the disk
or stripe components have the same geometry. This has been documented, well
described and laid bare for years. One may easily create two identical
stripes and then mirror them, then pull out multiple disks on both sides of
the mirror and life goes on, so long as one does not remove identical mirror
components on both sides at the same time. Common sense really.

Anyway, the point is that SVM does do RAID 1+0 and has for years. ZFS
probably does the same thing, but it adds a boatload of new features that
leave SVM light-years behind.

Dennis
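For anyone who wants to try the experiment Dennis describes without spare
hardware, here is a rough sketch using plain files as vdevs; the pool name
and file paths are made up, and files stand in for real disks:

  # mkfile 128m /var/tmp/d1 /var/tmp/d2 /var/tmp/d3 /var/tmp/d4
  # zpool create testpool mirror /var/tmp/d1 /var/tmp/d2 mirror /var/tmp/d3 /var/tmp/d4
  # zpool offline testpool /var/tmp/d1
  # zpool offline testpool /var/tmp/d4
  # zpool status testpool

With one component down in each mirror the pool reports DEGRADED but remains
readable and writable; trying to offline the second half of the same mirror
is refused, since no valid replica would remain.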
Re: [zfs-discuss] ZFS RAID-10
Dennis Clarke wrote:
> While ZFS may do a similar thing *I don't know* if there is a published
> document yet that shows conclusively that ZFS will survive multiple disk
> failures.

?? why not?  Perhaps this is just too simple and therefore doesn't get
explained well.

Note that SVM (nee Solstice Disksuite) did not always do RAID-1+0, for many
years it would do RAID-0+1. However, the data availability for RAID-1+0 is
better than for an equivalent sized RAID-0+1, so it is just as well that ZFS
does stripes of mirrors.
 -- richard
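A back-of-the-envelope comparison makes the availability point concrete.
Take the smallest interesting case, four disks of equal size. With RAID-1+0
(two mirrors, striped), a second disk failure is fatal only when it hits the
surviving half of the already-degraded mirror: 2 of the 6 possible two-disk
failure combinations kill the volume. With RAID-0+1 (two stripes, mirrored),
the first failure takes a whole stripe offline, so any second failure on the
other stripe is fatal: 4 of the 6 combinations. Same capacity, same disk
count, roughly half the exposure to a double failure.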
Re: [zfs-discuss] Re: Self-tuning recordsize
Reads? Maybe. Writes are another matter, namely the overhead associated with
turning a large write into a lot of small writes (checksums, for example).

Jeremy Teo wrote:
> Hello all,
>
> Isn't a large block size a simple case of prefetching? In other words, if
> we possessed an intelligent prefetch implementation, would there still be
> a need for large block sizes? (Thinking aloud) :)
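For anyone following along, the block size in question is the per-dataset
recordsize property, which can already be tuned by hand; a minimal sketch,
with a hypothetical dataset name:

  # zfs set recordsize=8K tank/db
  # zfs get recordsize tank/db

Matching the recordsize to the application's I/O size (8K for many
databases) avoids rewriting, and re-checksumming, a full 128K record for
every small update.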
Re: [zfs-discuss] ZFS RAID-10
> On Sun, 22 Oct 2006, Stephen Le wrote:
>
>> Is it possible to construct a RAID-10 array with ZFS? I've read through
>> the ZFS documentation, and it appears that the only way to create a
>> RAID-10 array would be to create two mirrored (RAID-1) emulated volumes
>> in ZFS and combine those to create the outer RAID-0 volume.
>>
>> Am I approaching this in the wrong way? Should I be using SVM to create
>> my RAID-1 volumes and then create a ZFS filesystem from those volumes?
>
> No - don't do that.  Here is a ZFS version of a RAID 10 config with 4
> disks:
>
> - from 817-2271.pdf -
>
> Creating a Mirrored Storage Pool
>
> To create a mirrored pool, use the mirror keyword, followed by any number
> of storage devices that will comprise the mirror. Multiple mirrors can be
> specified by repeating the mirror keyword on the command line. The
> following command creates a pool with two, two-way mirrors:
>
> # zpool create tank mirror c1d0 c2d0 mirror c3d0 c4d0
>
> The second mirror keyword indicates that a new top-level virtual device
> is being specified. Data is dynamically striped across both mirrors, with
> data being replicated between each disk appropriately.

We need to keep in mind that the exact same result may be achieved with
simple SVM:

  d1 1 2 /dev/dsk/c1d0s0 /dev/dsk/c3d0s0 -i 512b
  d2 1 2 /dev/dsk/c2d0s0 /dev/dsk/c4d0s0 -i 512b
  d3 -m d1

  metainit d1
  metainit d2
  metainit d3
  metattach d3 d2

At this point, if and only if all stripe components come from exactly
identical geometry disks or slices, you get a stripe of mirrors and not just
a mirror of stripes.

While ZFS may do a similar thing *I don't know* if there is a published
document yet that shows conclusively that ZFS will survive multiple disk
failures. However ZFS brings a lot of other great features.

Dennis Clarke
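As a small cross-check, metastat shows what SVM actually built from a set of
commands like the ones above (metadevice names as in Dennis's example):

  # metastat d3

metastat still reports d3 as a mirror whose two submirrors, d1 and d2, are
two-component stripes; the RAID-1+0 behaviour Dennis describes comes from
how SVM handles failures of individual components when the submirrors are
identically laid out, not from how the volume is displayed.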
Re: [zfs-discuss] ZFS RAID-10
On Oct 22, 2006, at 9:57 PM, Al Hopper wrote:
> On Sun, 22 Oct 2006, Stephen Le wrote:
>
>> Is it possible to construct a RAID-10 array with ZFS? I've read through
>> the ZFS documentation, and it appears that the only way to create a
>> RAID-10 array would be to create two mirrored (RAID-1) emulated volumes
>> in ZFS and combine those to create the outer RAID-0 volume.
>>
>> Am I approaching this in the wrong way? Should I be using SVM to create
>> my RAID-1 volumes and then create a ZFS filesystem from those volumes?
>
> No - don't do that.  Here is a ZFS version of a RAID 10 config with 4
> disks:

To further agree with/illustrate Al's point, here's an example of 'zpool
status' output which reflects this type of configuration. (Note that there
is one mirror set for each pair of drives. In this case, drive 1 on
controller 3 is mirrored to drive 1 on controller 4, and so on. This will
ensure continuity should one controller/bus/cable fail.)

[EMAIL PROTECTED]> zpool status
  pool: data
 state: ONLINE
 scrub: none requested
config:

        NAME         STATE     READ WRITE CKSUM
        data         ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c3t0d0   ONLINE       0     0     0
            c4t9d0   ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c3t1d0   ONLINE       0     0     0
            c4t10d0  ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c3t2d0   ONLINE       0     0     0
            c4t11d0  ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c3t3d0   ONLINE       0     0     0
            c4t12d0  ONLINE       0     0     0

errors: No known data errors
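For completeness, a pool with the layout shown above would have been created
with one mirror keyword per controller-spanning pair, along the lines of
(device names taken from the status output):

  # zpool create data mirror c3t0d0 c4t9d0 mirror c3t1d0 c4t10d0 \
        mirror c3t2d0 c4t11d0 mirror c3t3d0 c4t12d0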
Re: [zfs-discuss] ZFS RAID-10
On Sun, 22 Oct 2006, Stephen Le wrote:

> Is it possible to construct a RAID-10 array with ZFS? I've read through
> the ZFS documentation, and it appears that the only way to create a
> RAID-10 array would be to create two mirrored (RAID-1) emulated volumes
> in ZFS and combine those to create the outer RAID-0 volume.
>
> Am I approaching this in the wrong way? Should I be using SVM to create
> my RAID-1 volumes and then create a ZFS filesystem from those volumes?

No - don't do that.  Here is a ZFS version of a RAID 10 config with 4 disks:

- from 817-2271.pdf -

Creating a Mirrored Storage Pool

To create a mirrored pool, use the mirror keyword, followed by any number of
storage devices that will comprise the mirror. Multiple mirrors can be
specified by repeating the mirror keyword on the command line. The following
command creates a pool with two, two-way mirrors:

# zpool create tank mirror c1d0 c2d0 mirror c3d0 c4d0

The second mirror keyword indicates that a new top-level virtual device is
being specified. Data is dynamically striped across both mirrors, with data
being replicated between each disk appropriately.

--- end of quote from 817-2271.pdf page 38

Regards,

Al Hopper  Logical Approach Inc, Plano, TX.  [EMAIL PROTECTED]
           Voice: 972.379.2133 Fax: 972.379.2134  Timezone: US CDT
OpenSolaris.Org Community Advisory Board (CAB) Member - Apr 2005
OpenSolaris Governing Board (OGB) Member - Feb 2006
[zfs-discuss] Re: ZFS RAID-10
After some experimentation, it seems something like the following command
would create a RAID-10 equivalent:

  zpool create tank mirror disk1 disk2 mirror disk3 disk4
[zfs-discuss] ZFS RAID-10
Is it possible to construct a RAID-10 array with ZFS? I've read through the
ZFS documentation, and it appears that the only way to create a RAID-10
array would be to create two mirrored (RAID-1) emulated volumes in ZFS and
combine those to create the outer RAID-0 volume.

Am I approaching this in the wrong way? Should I be using SVM to create my
RAID-1 volumes and then create a ZFS filesystem from those volumes?
Re: [zfs-discuss] Re: [osol-discuss] Cloning a disk w/ ZFS in it
yeah disks need to be identical but why do you need to do "prtvtoc and
fmthard to duplicate the disk label (before the dd)"? I thought that dd
would take care of all of that... whenever I used dd I used it on slice 2
and I never had to do prtvtoc and fmthard... Just make sure disks are
identical and that is the key.

Regards,

Chris

On Fri, 20 Oct 2006, Richard Elling - PAE wrote:

> minor adjustments below...
>
> Darren J Moffat wrote:
>> Asif Iqbal wrote:
>>> Hi
>>>
>>> I have a X2100 with two 74G disks. I build the OS on the first disk
>>> with slice0 root 10G ufs, slice1 2.5G swap, slice6 25MB ufs and
>>> slice7 62G zfs. What is the fastest way to clone it to the second
>>> disk. I have to build 10 of those in 2 days. Once I build the disks
>>> I slam them to the other X2100s and ship it out.
>>
>> if clone really means make completely identical then do this:
>>
>> boot off cd or network.
>>
>> dd if=/dev/dsk/<src> of=/dev/dsk/<dst>
>>
>> Where <src> and <dst> are both locally attached.
>
> I use prtvtoc and fmthard to duplicate the disk label (before the dd)
>
> Note: the actual disk geometry may change between vendors or disk
> firmware revs. You will first need to verify that the geometries are
> similar, especially the total number of blocks.
>
> For dd, I'd use a larger block size than the default. Something like:
>   dd bs=1024k if=/dev/dsk/<src> of=/dev/dsk/<dst>
>
> The copy should go at media speed, approximately 50-70 MBytes/s for the
> X2100 disks.
>  -- richard
Re: [zfs-discuss] Re: [osol-discuss] Cloning a disk w/ ZFS in it
you don't really need to do the prtvtoc and fmthard with the old Sun labels
if you start at cylinder 0, since you're doing a bit -> bit copy with dd ..
but, keep in mind:

- The Sun VTOC is the first 512B and s2 *typically* should start at
  cylinder 0 (unless it's been redefined .. check!)

- The EFI label, though, reserves the first 17KB (34 blocks) and for a dd
  to work, you need to either:

  1) dd without the slice
     (eg: dd if=/dev/rdsk/c0t0d0 of=/dev/rdsk/c1t0d0 bs=128K)

  or

  2) prtvtoc / fmthard
     (eg: prtvtoc /dev/rdsk/c0t0d0s0 > /tmp/vtoc.out ;
          fmthard -s /tmp/vtoc.out /dev/rdsk/c1t0d0s0)

.je

On Oct 22, 2006, at 12:45, Krzys wrote:

> yeah disks need to be identical but why do you need to do "prtvtoc and
> fmthard to duplicate the disk label (before the dd)"? I thought that dd
> would take care of all of that... whenever I used dd I used it on slice 2
> and I never had to do prtvtoc and fmthard... Just make sure disks are
> identical and that is the key.
>
> Regards,
>
> Chris
>
> On Fri, 20 Oct 2006, Richard Elling - PAE wrote:
>
>> minor adjustments below...
>>
>> Darren J Moffat wrote:
>>> Asif Iqbal wrote:
>>>> Hi
>>>>
>>>> I have a X2100 with two 74G disks. I build the OS on the first disk
>>>> with slice0 root 10G ufs, slice1 2.5G swap, slice6 25MB ufs and
>>>> slice7 62G zfs. What is the fastest way to clone it to the second
>>>> disk. I have to build 10 of those in 2 days. Once I build the disks
>>>> I slam them to the other X2100s and ship it out.
>>>
>>> if clone really means make completely identical then do this:
>>>
>>> boot off cd or network.
>>>
>>> dd if=/dev/dsk/<src> of=/dev/dsk/<dst>
>>>
>>> Where <src> and <dst> are both locally attached.
>>
>> I use prtvtoc and fmthard to duplicate the disk label (before the dd)
>>
>> Note: the actual disk geometry may change between vendors or disk
>> firmware revs. You will first need to verify that the geometries are
>> similar, especially the total number of blocks.
>>
>> For dd, I'd use a larger block size than the default. Something like:
>>   dd bs=1024k if=/dev/dsk/<src> of=/dev/dsk/<dst>
>>
>> The copy should go at media speed, approximately 50-70 MBytes/s for the
>> X2100 disks.
>>  -- richard
[zfs-discuss] zpool question.
I have Solaris 10 U2 and I have a raidz partition set up on 5 disks. I just
added a new disk and was wondering, can I add another disk to the raidz? I
was able to add it to the pool but I do not think it added it to the raidz.

[13:38:41] /root > zpool status -v mypool2
  pool: mypool2
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        mypool2     ONLINE       0     0     0
          raidz     ONLINE       0     0     0
            c3t0d0  ONLINE       0     0     0
            c3t1d0  ONLINE       0     0     0
            c3t2d0  ONLINE       0     0     0
            c3t3d0  ONLINE       0     0     0
            c3t4d0  ONLINE       0     0     0
            c3t5d0  ONLINE       0     0     0

errors: No known data errors

[14:35:36] /root > zpool add mypool2 c3t6d0
invalid vdev specification
use '-f' to override the following errors:
/dev/dsk/c3t6d0s0 contains a ufs filesystem.
/dev/dsk/c3t6d0s4 contains a ufs filesystem.

[14:36:02] /root > zpool add -f mypool2 c3t6d0
[14:36:14] /root > zpool list
NAME       SIZE   USED   AVAIL   CAP   HEALTH   ALTROOT
mypool     278G   187G   90.6G   67%   ONLINE   -
mypool2    952G   367K    952G    0%   ONLINE   -

[14:36:21] /root > zpool status -v mypool2
  pool: mypool2
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        mypool2     ONLINE       0     0     0
          raidz     ONLINE       0     0     0
            c3t0d0  ONLINE       0     0     0
            c3t1d0  ONLINE       0     0     0
            c3t2d0  ONLINE       0     0     0
            c3t3d0  ONLINE       0     0     0
            c3t4d0  ONLINE       0     0     0
            c3t5d0  ONLINE       0     0     0
          c3t6d0    ONLINE       0     0     0

errors: No known data errors

Also, when will spare disks and raidz2 be released in Solaris 10? Does
anyone know when U3 will be coming out?

Thanks guys.

Chris
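For what it's worth, the second status output shows what happened: 'zpool
add' grew the pool with c3t6d0 as a second, unprotected top-level vdev
sitting next to the raidz group, rather than widening the raidz itself,
which zpool add cannot do. Growing a pool like this while keeping parity
protection would mean adding a whole new raidz group, along the lines of
(the extra disk names here are hypothetical):

  # zpool add mypool2 raidz c3t6d0 c3t7d0 c3t8d0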
Re: [zfs-discuss] Re: Self-tuning recordsize
Hello all,

Isn't a large block size a simple case of prefetching? In other words, if we
possessed an intelligent prefetch implementation, would there still be a
need for large block sizes? (Thinking aloud) :)

--
Regards,
Jeremy