[zfs-discuss] Re: zfs boot error recovery
hi Will, thanks for your answer.

Will Murnane schrieb:
> On 5/31/07, Jakob Praher <[EMAIL PROTECTED]> wrote:
>> c) si 3224 related question: is it possible to simply hot swap the disk?
>> (i have the disks in special hot-swappable units, but have no experience
>> with hotswapping under solaris, so i would like some feedback.)
> As it happens, I just tried this, albeit on a different card, and it went
> well. I have a Marvell 88SX6081 controller, and removing a disk caused no
> undue panic (as far as I can tell). When I added a new disk, the kernel
> detected it immediately, and then I had to run "cfgadm -c configure
> scsi0/1" or something like that. Then it Just Worked. I don't know if
> this is recommended or not... but it worked for me.

What is the best way to simulate a disk error under zfs? Before I add real data to the system, I want to make sure it works. My naive approach:

1) remove the disk from any pool membership (is this needed?):
   zpool detach xxx
   zpool detach yyy
2) the disk should now be free to be removed
3) pull the plug
4) see what happens
5) plug the disk back in
6) restore the zpool membership again

(1) and (6) should not really be needed, or do I see that incorrectly?

-- Jakob

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
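ps: perhaps instead of detach/attach i could use zpool offline/online, which would make steps (1) and (6) unnecessary. an untested sketch; the pool and device names are placeholders for my setup:

```shell
# Take the disk offline administratively; the pool stays imported,
# just degraded (pool/device names are placeholders).
zpool offline datapool c2t1d0

# ... pull the disk, then reinsert it (or a replacement) ...

# Bring it back, let zfs resilver, and verify.
zpool online datapool c2t1d0
zpool scrub datapool
zpool status -v datapool
```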
[zfs-discuss] Re: zfs boot error recovery
Jakob Praher schrieb:
> [...]
> b) what is the best way to replace a disk as fast as possible? adding a
> disk as a hotspare for the raidz is a good idea, but i would also like
> to replace a disk during runtime as simply as possible. the problem is
> that for the root pool the disks are labeled (the slices thing), so i
> cannot simply detach the volumes, replace the disk, and attach them
> again; i have to format the disk so that the slicing exists. is there
> some clever way to automatically re-label a replacement disk?

i found out that storing or getting the label information from another disk should work:

   prtvtoc /dev/rdsk/s2 | fmthard -s - /dev/rdsk/s2

for instance i could simply store the labels of all disks on the root pool, which should be available as long as any of the 8 disks is still available. so in case of repair i simply have to fmthard -s before attaching the replaced disk.

> c) si 3224 related question: is it possible to simply hot swap the disk?
> (i have the disks in special hot-swappable units, but have no experience
> with hotswapping under solaris, so i would like some feedback.)
> d) do you have best practices for systems like the above? what are the
> best resources on the web for learning about monitoring the health of a
> zfs system (like email notifications in case of disk failures...)?

thanks in advance
-- Jakob
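so i guess the full relabel-and-replace flow for one failed disk would look roughly like this. an untested sketch; the device names (c1t0d0 as a surviving disk, c1t4d0 as the replacement) are placeholders for the actual controller/target numbers:

```shell
# Copy the slice layout from a surviving disk onto the fresh replacement
# (device names are placeholders -- adjust to the real setup).
prtvtoc /dev/rdsk/c1t0d0s2 | fmthard -s - /dev/rdsk/c1t4d0s2

# Reattach the slices to their respective pools.
zpool replace rootpool c1t4d0s0   # mirror member
zpool replace datapool c1t4d0s1   # raidz member

# Watch the resilver finish.
zpool status rootpool datapool
```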
[zfs-discuss] zfs boot error recovery
hi all,

i would like to ask some questions regarding best practices for zfs recovery if disk errors occur. currently i have zfs boot (nv62) and the following setup:

- 2 si3224 controllers (each with 4 sata disks)
- 8 sata disks, same size, same type

i have two pools:
a) rootpool
b) datapool

the rootpool is a mirrored pool, where every disk has a slice (s0, which is 5 % of the whole disk) devoted to the rootpool, just for mirroring. the rest of each disk (s1) is added to the datapool, which is raidz. my idea is that if any disk is corrupt i am still able to boot.

now i have some questions:

a) if i want to boot from every disk in case of error, i have to set up grub on every disk, so that if the controller selects that disk for booting, the rootpool can be loaded from it.

b) what is the best way to replace a disk as fast as possible? adding a disk as a hotspare for the raidz is a good idea, but i would also like to replace a disk during runtime as simply as possible. the problem is that for the root pool the disks are labeled (the slices thing), so i cannot simply detach the volumes, replace the disk, and attach them again; i have to format the disk so that the slicing exists. is there some clever way to automatically re-label a replacement disk?

c) si 3224 related question: is it possible to simply hot swap the disk? (i have the disks in special hot-swappable units, but have no experience with hotswapping under solaris, so i would like some feedback.)

d) do you have best practices for systems like the above? what are the best resources on the web for learning about monitoring the health of a zfs system (like email notifications in case of disk failures...)?

thanks in advance
-- Jakob
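regarding (a), i assume the boot blocks need to be installed on the s0 slice of every mirror member with installgrub. an untested sketch; the device names are placeholders for my actual disks:

```shell
# Install the GRUB stage1/stage2 boot blocks into the s0 slice of every
# mirror member, so any single disk remains bootable
# (device names are placeholders).
for disk in c1t0d0 c1t1d0 c1t2d0 c1t3d0; do
    installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/${disk}s0
done
```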
[zfs-discuss] Re: opensolaris, zfs rootfs raidz
Erik Trimble wrote:
> On Thu, 2007-04-05 at 22:59 +0200, Jakob Praher wrote:
>> Hi Cyril,
>>
>> So to get this right:
>> Nevada == Solaris Express?!
>>
> Yes, it's a bit confusing. Think of "Nevada" as a distro name (in Linux
> terms), which uses the OpenSolaris source base. There are (generally)
> weekly builds, which is what you will see referred to as "B61". Solaris
> Express is the marketing name for periodic releases of specific builds
> of Nevada (so, every couple of months, a build of Nevada is released as
> "Solaris Express" - it's for people who want the latest technology,
> with _some_ support options, while not living on the absolute bleeding
> edge like us folks).

The thing is: I am creating a network-centered storage server, and for that I'd like to have a somewhat stable OS. I would like to use ZFS and then snapshot to another node (quite frequently), which should give me some DRBD-like behavior.

IMHO I wanted to have just one giant raidz zfs pool that can be booted from, and not bother with the rest. I thought it would be rather hard to support raidz as a root pool, though I gave it a try. Maybe I should just forget the root zfs stuff, if I nonetheless have to use 2 pools in order to have the rest use raidz, which is what I need for robustness. Maybe I will just take a hardware raid approach for the root partition (just to have failover support) and not go for the one-giant-root approach using zfs.

One ZFS-related question: if I use a ufs partition to boot into the ZFS partition (the "old rootfs" stuff), raidz should then be technically speaking possible? Since in this case grub is using UFS to load the platform kernel and the initial ramdisk? So maybe it should work to have a very small UFS partition, mirrored manually on several disks, and then to boot into a raidz ZFS. My ZFS partition FAULTED when I tried to boot via UFS on Solaris 10.

Is the root fs support mentioned in http://blogs.sun.com/tabriz/#are_you_ready_to_rumble supported in Solaris 10?

Thanks. I am sorry for so much noise on this file-system-related list.

-- Jakob
[zfs-discuss] Re: opensolaris, zfs rootfs raidz
Hi Cyril, thanks for your quick response!

Cyril Plisko wrote:
> On 4/5/07, Jakob Praher <[EMAIL PROTECTED]> wrote:
>> hi all,
>>
>> I am new to solaris.
>> I am creating a zfs filestore which should boot via rootfs.
>> The version of the system is: SunOS store1 5.10 Generic_118855-33 i86pc
>> i386 i86pc.
>>
>> Now I have seen that there is new rootfs support for solaris starting
>> with build snv_62.
>> (http://www.opensolaris.org/os/community/zfs/boot/zfsboot-manual/)
>>
>> Is it
>> a) possible to start from a raidz pool?
>
> No. At this point a raidz pool is not usable as a boot pool.

Is it possible then to use a mirror pool?

>> b) possible to update my version using a patch from the web to the above
>> version?
>
> Generally speaking there is no patch to get your system from any 5.x
> release to 5.x+1 release. You may, however, upgrade your current system
> from SunOS 5.10 to Nevada using the regular LiveUpgrade or
> DeadUpgrade(TM) procedure.
>
> (I would just install from scratch - assuming you have all your valuable
> data stored externally or on an exportable zpool.)

So to get this right: Nevada == Solaris Express?! This is a little bit confusing. I am very glad Ian Murdock joined Sun; hopefully system upgrade will become as easy as apt-get dist-upgrade. Is there an easy way to just get the latest solaris kernel via the web?

thanks
Jakob
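For reference, the LiveUpgrade route mentioned above would look roughly like this. An untested sketch; the boot environment name, spare slice, and media path are all placeholders:

```shell
# Create an alternate boot environment on a spare slice
# (BE name, slice, and media path are placeholders).
lucreate -n nevada -m /:/dev/dsk/c0t0d0s4:ufs

# Upgrade the new BE from the Nevada install media, activate it, reboot.
luupgrade -u -n nevada -s /cdrom/cdrom0
luactivate nevada
init 6
```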
[zfs-discuss] opensolaris, zfs rootfs raidz
hi all,

I am new to solaris. I am creating a zfs filestore which should boot via rootfs. The version of the system is: SunOS store1 5.10 Generic_118855-33 i86pc i386 i86pc.

Now I have seen that there is new rootfs support for solaris starting with build snv_62 (http://www.opensolaris.org/os/community/zfs/boot/zfsboot-manual/).

Is it
a) possible to start from a raidz pool?
b) possible to update my version using a patch from the web to the above version?

thanks in advance
-- Jakob
[zfs-discuss] Re: drbd using zfs send/receive?
Frank Cusack wrote:
> On September 18, 2006 5:45:08 PM +0200 Jakob Praher
> <[EMAIL PROTECTED]> wrote:
> huh. How do you create a SAN with NFS?

Sorry, okay: it would be Network Attached Storage, not the other way round. I guess you are right.

But while we are discussing NFS for distributed storage: what is your performance data for NFSv4 as a storage node? How well does the current Solaris NFSv4 stack interoperate with the Linux stack? Would you go for that? What about iSCSI on top of ZFS, is that an option? I did some research on iSCSI vs NFSv4 once and found that the overhead of transporting the fs metadata (in the NFSv4 case) is not the real problem for many scenarios. Especially the COMPOUND messages should help here.

>> I have been using DRBD on linux before and now am asking whether some
>> of you have experience with on-demand network filesystem mirrors.
> AFAIK, Solaris does not export file change notification to userland in
> any way that would be useful for on-demand filesystem replication. From
> looking at drbd for 5 minutes, it looks like the kind of notification
> that windows/linux/macos provides isn't what drbd uses; it does BLOCK
> LEVEL replication, and part of the software is a kernel module to
> export that data to userspace. It sounds like that distinction doesn't
> matter for what you are trying to achieve, and I believe that this
> block-by-block duplication isn't a great idea for zfs anyway. It might
> be neat if zfs could inform userland of each new txg.

yes, exactly. It is a block device driver that replicates, so it sits right underneath Linux's VFS. Okay, that is something I wanted to know.

Are there any good heartbeat control apps for Solaris out there? I mean, if I want to have failover (even if it is a little bit cheap), it should detect failures and react accordingly. Switching from sender to receiver should not be difficult, given that all you need is to make ZFS snapshots (and that is really cheap in ZFS).

>> Is this merely a hack, or can it be used to create some sort of
>> failover? Any pointers to solutions in that area are greatly
>> appreciated.
> See if <http://blogs.sun.com/timf/entry/zfs_automatic_snapshots_now_with>
> comes close. I have 2 setups, one using SC 3.2 with a SAN (both systems
> can access the same filesystem; yes, it's not as redundant as a remote
> node and remote filesystem, but it's for HA, not DR). I could add
> another JBOD to the SAN and configure zfs to mirror between the two
> enclosures to get rid of the SPoF of the JBOD backplane/midplane, but
> it's not worth it.

JBOD, SPoF: what are these things?

> The other setup is using my own cron script (zfs send | zfs recv) to
> send snapshots to a "remote" (just another server in the same rack)
> host. This is for a service that also has very high availability
> requirements but where I can't afford shared storage. I do a homegrown
> heartbeat and failover thing. I'm looking at replacing the cron script
> with the SMF service linked above, but I'm in no rush since the cron
> job works quite well. If zfs is otherwise a good solution for you, you
> might want to consider whether you really need true on-demand
> replication. Maybe 5-minute or even 1-minute recency is good enough. I
> would imagine that you don't actually get too much better than 30s with
> drbd anyway, since outside of fsync() data doesn't actually make it to
> disk (and then get replicated by drbd) more frequently than that for
> some generic application.

Okay, I think zfs is nice. I am using xfs+lvm2 on my linux boxes so far, and that works nicely too. SMF is the init.d replacement in solaris, right? What would that look like? What would SMF do, other than restart your app if it fails? Would you like to have a background task running instead of kicking it off with cron?

Thanks
-- Jakob
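For reference, the kind of cron-driven replication Frank describes could look roughly like the following. This is an untested sketch, not his actual script; the dataset, host, and state-file names are placeholders:

```shell
#!/bin/sh
# Minimal periodic replication sketch (placeholder names throughout).
# Takes a snapshot, sends the incremental delta since the last run,
# and remembers the snapshot name for next time.
DATASET=tank/data
REMOTE=backuphost
NOW=repl-$(date +%Y%m%d%H%M)

zfs snapshot $DATASET@$NOW

if [ -f /var/run/last-repl ]; then
    LAST=$(cat /var/run/last-repl)
    # Incremental send relative to the previous snapshot.
    zfs send -i $DATASET@$LAST $DATASET@$NOW | ssh $REMOTE zfs recv -F $DATASET
else
    # First run: full send.
    zfs send $DATASET@$NOW | ssh $REMOTE zfs recv -F $DATASET
fi

echo $NOW > /var/run/last-repl
```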
[zfs-discuss] drbd using zfs send/receive?
hi everyone,

I am planning on creating a local SAN via NFS(v4) and several redundant nodes. I have been using DRBD on linux before and now am asking whether some of you have experience with on-demand network filesystem mirrors. I have little Solaris sysadmin know-how yet, but I am interested in whether there is on-demand support for sending snapshots, i.e. not via a cron job, but via a kind of filesystem change notification system.

Is this merely a hack, or can it be used to create some sort of failover? E.g. DRBD has the master/slave option, which can be configured easily. Something like this would be nice out of the box. So in case of failure another node is the master, and when the former master is back again it is simply the slave, so that both have the current data available again.

Any pointers to solutions in that area are greatly appreciated.

-- Jakob
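The failover half of the master/slave idea can be approximated crudely on the receiving node. A hypothetical, untested sketch; the hostname, dataset, and takeover steps are placeholders, not a real HA solution:

```shell
#!/bin/sh
# Crude heartbeat sketch on the slave (all names are placeholders):
# if the master stops answering pings, promote this node.
MASTER=node1
while true; do
    if ! ping $MASTER 5 >/dev/null 2>&1; then
        # Master is gone: make the local replica writable and take over.
        zfs set readonly=off tank/data
        # ...start services, claim the shared IP, etc. (placeholders)...
        break
    fi
    sleep 10
done
```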