[OpenIndiana-discuss] Replacing both disks in a mirror set
Hi all,

I have a zpool on an oi_147 host system which is made up of 3 mirror sets:

  tank
    mirror-0  c11t5d0  c11t4d0
    mirror-1  c11t3d0  c11t2d0
    mirror-2  c11t1d0  c11t0d0

Both c11t5d0 and c11t4d0 (SATA 1TB disks, ST31000528AS) are developing errors; both disks have around one hundred pending sectors and I'm getting nervous :)

I'd like to add a third disk to mirror-0 so that I can let it resilver without decreasing parity (as replacing one disk would) and without increasing my overall risk of losing the whole zpool. A simple

  zpool attach tank c11t5d0 c12t0d0

should be enough to make mirror-0 a three-disk mirror set.

The problem, for me at least, arises here: how can I remove/replace disks so that I end up with the new disk (c12t0d0) in the c11t5d0 (or c11t4d0) disk bay, without powering off the system?

From googling around it seems that zpool offline cannot be used to replace a disk, and if I remove, for example, c11t5d0 with a "zpool detach tank c11t5d0", then when I move c12t0d0 to the bay where c11t5d0 was I fear I have to let it resilver from the beginning, which leaves me without a mirror for at least three days.

Is there some way to solve this without exporting the pool, powering off the host system, moving c12t0d0 into the c11t5d0 bay, and then restarting the system and importing the pool again?

Thanks.

Maurilio.
Re: [OpenIndiana-discuss] Replacing both disks in a mirror set
I'm not understanding your problem. If you add a 3rd temporary disk, wait for it to resilver, then replace c11t5d0, let the new disk resilver, then detach the temporary disk, you will never have less than 2 up-to-date disks in the mirror. What am I missing?

-----Original Message-----
From: Maurilio Longo [mailto:maurilio.lo...@libero.it]
Sent: Monday, October 08, 2012 8:58 AM
To: Discussion list for OpenIndiana
Subject: [OpenIndiana-discuss] Replacing both disks in a mirror set

[original message quoted in full; see above]
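For reference, a minimal sketch of the sequence Dan describes, using the pool and device names from this thread (tank, c11t5d0, and the temporary disk c12t0d0 come from Maurilio's post; c12t1d0 as the second new disk is a hypothetical name for illustration):

  # attach a temporary third disk to mirror-0 and wait for resilver to finish
  zpool attach tank c11t5d0 c12t0d0
  zpool status tank            # wait until resilver of c12t0d0 completes

  # replace the failing disk with a new one; the mirror keeps two healthy copies throughout
  zpool replace tank c11t5d0 c12t1d0   # c12t1d0 is a hypothetical new disk
  zpool status tank            # wait until resilver of c12t1d0 completes

  # finally drop the temporary disk from the mirror
  zpool detach tank c12t0d0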
Re: [OpenIndiana-discuss] Replacing both disks in a mirror set
Dan Swartzendruber wrote:
> I'm not understanding your problem. If you add a 3rd temporary disk, wait
> for it to resilver, then replace c11t5d0, let the new disk resilver, then
> detach the temporary disk, you will never have less than 2 up-to-date disks
> in the mirror. What am I missing?

Dan,

you're right, I was trying to find a way to move the new disk into the failing disk's bay instead of simply replacing the failing one :)

Thanks for the advice!

Maurilio.
Re: [OpenIndiana-discuss] Raid type selection for large # of ssds
Hi,

from what I understood from negative experience with a 12-drive SSD RAID set built with MDRaid on Linux, and from answers to a related question I raised recently on this list, it is not so easy to engineer a configuration using a large count of SSDs in any case.

The budget option, using SATA SSDs, seems to be critical in some respects. Using an SSD type whose controller relies on compression seems to be a suboptimal choice for any data that will not compress efficiently (which is more likely when writing as a stripe set (RAIDZn)). Another concern is SATA vs SAS in general, and the compatibility of SATA SSDs with the usual SAS HBAs and expanders. One should be aware that any of these aspects is prone to make the vdev unresponsive, or even kick drives out of the vdev. Should that be a systematic issue, the stripe set will not rebuild properly, or may even be lost in an instant, with no parity level offering protection.

Another option could be to look into a setup that uses an SLC or RAM-based ZIL device, and/or a large SSD-based L2ARC. That's what I am looking into currently.

BR
Sebastian
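A minimal sketch of the log/cache layout Sebastian mentions, under assumed names (the pool name tank is reused from the other thread; the c13 device names are hypothetical, not from this discussion):

  # mirrored SLC log (ZIL) devices plus a large SSD cache (L2ARC) device
  zpool add tank log mirror c13t0d0 c13t1d0
  zpool add tank cache c13t2d0

  # verify the resulting layout
  zpool status tank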
Re: [OpenIndiana-discuss] Raid type selection for large # of ssds
> I feel bad asking this question because I generally know what raid type to
> pick. I am about to configure 24 256GB SSD drives in a ZFS/Comstar
> deployment. This will serve as the datastore for a VMware deployment. Does
> anyone know what raid level would be best? I know the workload will
> determine a lot, but obviously there is varying workload across a VMware
> environment. Since we are talking about SSDs I don't see a particular
> reason not to create 1 big zfs pool, with the exception that I know people
> generally try to keep the drive count from getting out of control. Raid 10
> seems like a waste of space with little benefit in performance in this
> case. I am leaning towards raidz2 but wanted to get everyone's input. The
> datastore will host a fileserver and an Exchange server for about 50 users.
> The environment is all 10G and they have solid state disks in all desktops,
> so essentially that is the reason for such a large SSD deployment for a
> small # of users. There seem to be varying opinions, especially when you
> factor in trying to keep writes low for SSDs.

I can only share my experience with spinning rust. As others have said, I'd recommend against a single VDEV with all 24 drives. The chance of two or three dying at once is rather high with that number of drives; SSDs die too. A choice of 3x8 in RAIDz2 seems reasonable, or perhaps 2x7+1x8+spare (I know, it's uneven, but not by much, and a spare is a good thing). That'll give you IOPS comparable to 3x the IOPS of a single drive, which should be pretty good with most SSDs.

Also, keep in mind the problems with certain (or most?) SATA units connected to a SAS expander. I've seen pretty bad things happen with WD2001FASS drives in such a configuration (we had to pull about 160 drives and replace them with Hitachis to solve that problem - not too much data was lost, though, thanks to pure luck).

What sort of SSDs are these, btw?

Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 98013356
r...@karlsbakk.net
http://blogg.karlsbakk.net/
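A minimal sketch of the 3x8 RAIDz2 layout Roy suggests; the pool name ssdpool and the c2t* device names are hypothetical placeholders, not taken from the thread:

  # three 8-disk raidz2 vdevs in one pool (24 SSDs total)
  zpool create ssdpool \
      raidz2 c2t0d0  c2t1d0  c2t2d0  c2t3d0  c2t4d0  c2t5d0  c2t6d0  c2t7d0 \
      raidz2 c2t8d0  c2t9d0  c2t10d0 c2t11d0 c2t12d0 c2t13d0 c2t14d0 c2t15d0 \
      raidz2 c2t16d0 c2t17d0 c2t18d0 c2t19d0 c2t20d0 c2t21d0 c2t22d0 c2t23d0

  # the 2x7+1x8+spare alternative would instead reserve one disk as a hot spare:
  #   zpool add ssdpool spare c2t23d0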
[OpenIndiana-discuss] Running devfsadm -r alt-root inside a zone
Hi,

I have scripts running "devfsadm -r alt-root", and they can't run inside a zone: devfsadm complains that it can be run in the global zone only. Actually, with -r the command is not working on the real machine's devices, so is there any way I can let it work inside a zone? This would be nice in order to run distro_const inside a zone ;)

Gabriele.

Gabriele Bulfon - Sonicle S.r.l.
Tel +39 028246016 Int. 30 - Fax +39 028243880
via Enrico Fermi 44 - 20090, Assago - Milano - Italy
http://www.sonicle.com
Re: [OpenIndiana-discuss] Raid type selection for large # of ssds
> Also, keep in mind the problems with certain (or most?) SATA units
> connected to a SAS expander. I've seen pretty bad things happen with
> WD2001FASS drives in such a configuration (we had to pull about 160 drives
> and replace them with Hitachis to solve that problem - not too much data
> was lost, though, thanks to pure luck).

This blog post says a little about this problem:
http://gdamore.blogspot.no/2010/08/why-sas-sata-is-not-such-great-idea.html

Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 98013356
r...@karlsbakk.net
http://blogg.karlsbakk.net/
Re: [OpenIndiana-discuss] [developer] Preliminary Download link: Illumos based MartUX_OpenIndiana_Edition for SPARC LiveDVD (without installer)
True, but I'm talking about the native SVR4 packages for NoMachine NX on SPARC Solaris, already bundled for Solaris 8-10. It may work with minimal hacking on OpenIndiana for SPARC.

Jonathan, could you maybe give me some pointers on how you compiled it from source? I was trying to read the documentation that came with the source, and it was so confusing that I wasn't able to finish the task. I'm a big fan of NoMachine NX and would *love* to see it in action on the Solaris and illumos platforms! I already filed an RFE a while ago, when the illumos project was originally forked, but I haven't seen any activity since.

Cheers!

On Sun, Oct 7, 2012 at 12:53 PM, Jonathan Adams t12nsloo...@gmail.com wrote:
> On 5 October 2012 17:26, Alex Smith (K4RNT) shadowhun...@gmail.com wrote:
>> If SVR4 packages work on the current OpenIndiana, you may want to look
>> into NoMachine NX instead. That would probably be your easiest and best
>> solution for what you're trying to do. www.nomachine.com
>
> I got this to work relatively well on Solaris 10 (x86) and early
> OpenIndiana (x86) when compiling from source ... but it took a fair amount
> of hacking to get it to use system libraries where they existed ... there
> are however no Solaris x86 clients.
>
> Jon

--
"With the first link, the chain is forged. The first speech censured, the first thought forbidden, the first freedom denied, chains us all irrevocably." Those words were uttered by Judge Aaron Satie as wisdom and warning... The first time any man's freedom is trodden on we're all damaged.
- Jean-Luc Picard, quoting Judge Aaron Satie, Star Trek: TNG episode "The Drumhead"

- Alex Smith (K4RNT)
- Dulles Technology Corridor (Chantilly/Ashburn/Dulles), Virginia USA
Re: [OpenIndiana-discuss] Raid type selection for large # of ssds
An SLC SSD would probably be substantially slower than an array of MLC SSDs, and would be likely to slow down the system for sync writes.

----- Original message -----
> Another option could be to look into a setup that uses an SLC or RAM-based
> ZIL device, and/or a large SSD-based L2ARC. That's what I am looking into
> currently.
>
> BR
> Sebastian

Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 98013356
r...@karlsbakk.net
http://blogg.karlsbakk.net/
Re: [OpenIndiana-discuss] Raid type selection for large # of ssds
I still think this whole discussion is like renting a 40-meter-long truck to move your garden hose. We all know that it is possible to rent such a truck, but nobody tries to roll up the hose.

SSDs are good for fast reads and occasional writes. So don't use them for data storage of fast-changing data. Put the OSes on an SSD RAID construction and the rest on a SAS platter RAID construction.

Kind regards,

The out-side

On 8 Oct 2012, at 21:26, Roy Sigurd Karlsbakk r...@karlsbakk.net wrote:
> An SLC SSD would probably be substantially slower than an array of MLC
> SSDs, and would be likely to slow down the system for sync writes.
Re: [OpenIndiana-discuss] Replacing both disks in a mirror set
Maurilio,

at first a reminder: never ever detach a disk before you have a third disk that has already completed resilvering. The term "detach" is misleading, because it detaches the disk from the pool; afterwards you cannot access the disk's previous contents anymore. Your detached half of a mirror can neither be imported nor mounted, and also not even rescued (unlike a disk from a pool destroyed with zpool destroy). If I ever mentally recover from a zfs-encryption-caused 2TB (or 3 years!) data loss, then I may offer an implementation with less ambiguous naming to Illumos.

"zpool detach" suggests that you could still use this disk as a reserve backup copy of the pool you were detaching it from, and that you could simply "zpool attach" it again in case the other disk would die. Unfortunately, this is not the case. Well, you can of course attach it again, like any new or empty disk, but only if you have enough replicas, and that's not what one wanted if one fell into this misunderstanding trap. And there are no warnings in the zpool/zfs man pages.

What you want:

  zpool replace poolname <vdev to be replaced> <new vdev>

But last weekend I lost the 7 years of trust that I had in ZFS. Because on Oracle Solaris 11/11 x86, an encrypted and gzip-9 compressed mirror cannot be accessed anymore after VirtualBox forced me to remove power from the host machine. Since then a 1:1 mirror of 2TB disks cannot be mounted anymore; it always ends in a kernel panic due to a #pf in aes:aes_decrypt_contiguous_blocks. Well: TITANIC IS UNSINKABLE!

The problem is that scrub doesn't find an error, and so has nothing to auto-repair. Even zpool attach successfully completes a resilver, but the newly resilvered disk contains the same error. Be aware that ZFS is not free of bugs. If it stays like that (I contacted some folks for help), then my trust in ZFS has destroyed, VAPORIZED, 3 years of my work and life.

So, back to your question. To be as cautious as possible, this is what I would do in your case (see the sketch after this message):

0.) zpool offline <poolname> <vdev you want to replace>
1.) Physically remove this disk (important, because I have seen cases where zfs forgets that you offlined a vdev after a reboot).
2.) AFTER (!IMPORTANT!) you have physically disconnected the disk to be replaced, zpool detach it, or alternatively use
      zpool replace <poolname> <old vdev that you disconnected BEFORE, in order to keep it as a failsafe backup!> <new vdev>
3.) Depending on whether you did detach or replace in step 2.),
      zpool attach <poolname> <first vdev of this pool> <new vdev>
    or omit this step if you used zpool replace in step 2.)

NEVER TRUST ZFS TOO MUCH. What I do from now on: for each 1:1 mirror that I have I will take a third disk, resilver it, offline and physically disconnect it, and store it in a secure place. Because if you have as much bad luck as I had last weekend, ZFS replicates the data corruption, too. And then you could have 1000 disks mirrored; they would all contain the corruption. For this reason, you are only on the safe side if you physically disconnect a third copy!

Good luck!

%martin


On 10/8/12, Maurilio Longo maurilio.lo...@libero.it wrote:
> Dan Swartzendruber wrote:
>> I'm not understanding your problem. If you add a 3rd temporary disk, wait
>> for it to resilver, then replace c11t5d0, let the new disk resilver, then
>> detach the temporary disk, you will never have less than 2 up-to-date
>> disks in the mirror. What am I missing?
>
> Dan,
>
> you're right, I was trying to find a way to move the new disk into the
> failing disk's bay instead of simply replacing the failing one :)
>
> Thanks for the advice!
>
> Maurilio.

--
regards
%martin bochnig

http://wiki.openindiana.org/oi/MartUX_OpenIndiana+oi_151a+SPARC+LiveDVD
http://www.youtube.com/user/MartUXopensolaris
http://www.facebook.com/pages/MartUX_SPARC-OpenIndiana/357912020962940
https://twitter.com/MartinBochnig
http://www.martux.org (new page not yet online, but pretty soon)
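A minimal sketch of the replace-with-a-kept-backup variant Martin describes (his step 2 with zpool replace); the pool name tank and the failing disk c11t5d0 come from the thread, and c12t0d0 is the new disk Maurilio proposed:

  # 0) offline the failing disk, then physically pull it and set it aside as a last-resort copy
  zpool offline tank c11t5d0

  # 2) after the old disk is physically disconnected, replace it in the pool with the new disk
  zpool replace tank c11t5d0 c12t0d0

  # watch the resilver of the new disk
  zpool status tank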
Re: [OpenIndiana-discuss] Replacing both disks in a mirror set
> Unfortunately, this is not the case. Well, you can of course attach it
> again, like any new or empty disk, but only if you have enough replicas,
> and that's not what one wanted if one fell into this misunderstanding trap.
> And there are no warnings in the zpool/zfs man pages.
>
> What you want:
>
>   zpool replace poolname <vdev to be replaced> <new vdev>
>
> But last weekend I lost the 7 years of trust that I had in ZFS. Because on
> Oracle Solaris 11/11 x86, an encrypted and gzip-9 compressed mirror cannot
> be accessed anymore after VirtualBox forced me to remove power from the
> host machine. Since then a 1:1 mirror of 2TB disks cannot be mounted
> anymore; it always ends in a kernel panic due to a #pf in
> aes:aes_decrypt_contiguous_blocks. Well: TITANIC IS UNSINKABLE!

Aehm, to make this clear: I fell into the attach/detach trap a few years ago. Not this time! What I mentioned above is real data corruption caused by the allegedly UNSINKABLE ZFS itself, after VirtualBox froze my x86 host machine. That alone was enough to render my 2TB mirrored pool INACCESSIBLE! Just a simple freeze of the host.

I AM DEAD if I cannot recover this pool. Luckily it does _not_ contain the work for our distro, and also not my openXsun patches/gate. But it does contain about 5 thousand professional photos and 100 videos that I created during my time in Bosnia, which I wanted to publish as a book :( And many other things, like development branches I had been working on. All email, EVERYTHING. If it stays like that, I HAVE NO HOME ANYMORE, thanks to my blind trust in ZFS!

Here is the kernel panic as an attachment, to keep the text formatting readable: PANIC.txt.gz - 1kb

So BE WARNED! Don't risk your own data!!! For this reason I write to you, although I'm definitely not in the mood to do anything anymore :(

regards,
%martin
Re: [OpenIndiana-discuss] Replacing both disks in a mirror set
It seems Gmail corrupted the previous mail's attachment. Here it is again, this time as a plain text file: PANIC.txt - 6kb

--
regards
%martin

Oct 7 08:40:35 sun4me zfs: [ID 249136 kern.info] imported version 33 pool wonderhome using 33
Oct 7 08:43:03 sun4me unix: [ID 836849 kern.notice]
Oct 7 08:43:03 sun4me ^Mpanic[cpu0]/thread=ff00100fac20:
Oct 7 08:43:03 sun4me genunix: [ID 683410 kern.notice] BAD TRAP: type=e (#pf Page fault) rp=ff00100f9fa0 addr=d286f8e0
Oct 7 08:43:03 sun4me unix: [ID 10 kern.notice]
Oct 7 08:43:03 sun4me unix: [ID 839527 kern.notice] zpool-wonderhome:
Oct 7 08:43:03 sun4me unix: [ID 753105 kern.notice] #pf Page fault
Oct 7 08:43:03 sun4me unix: [ID 532287 kern.notice] Bad kernel fault at addr=0xd286f8e0
Oct 7 08:43:03 sun4me unix: [ID 243837 kern.notice] pid=5868, pc=0xd286f8e0, sp=0xff00100fa098, eflags=0x10297
Oct 7 08:43:03 sun4me unix: [ID 211416 kern.notice] cr0: 8005003b<pg,wp,ne,et,ts,mp,pe> cr4: 406f8<osxsav,xmme,fxsr,pge,mce,pae,pse,de>
Oct 7 08:43:03 sun4me unix: [ID 624947 kern.notice] cr2: d286f8e0
Oct 7 08:43:03 sun4me unix: [ID 625075 kern.notice] cr3: 3dca000
Oct 7 08:43:03 sun4me unix: [ID 625715 kern.notice] cr8: c
Oct 7 08:43:03 sun4me unix: [ID 10 kern.notice]
Oct 7 08:43:03 sun4me unix: [ID 592667 kern.notice] rdi: ff00100fa318 rsi: ff02e5f99650 rdx: fff8
Oct 7 08:43:03 sun4me unix: [ID 592667 kern.notice] rcx: d7014800 r8: ff00100fa0b0 r9: ff00100fa0b8
Oct 7 08:43:03 sun4me unix: [ID 592667 kern.notice] rax: 18 rbx: ff00100fa320 rbp: ff00100fa120
Oct 7 08:43:03 sun4me unix: [ID 592667 kern.notice] r10: d286f8e0 r11: ff00100fa630 r12: ff00100fa690
Oct 7 08:43:03 sun4me unix: [ID 592667 kern.notice] r13: ff00100fa168 r14: ff00100fa160 r15: 10
Oct 7 08:43:03 sun4me unix: [ID 592667 kern.notice] fsb: 0 gsb: fbc3ebc0 ds: 4b
Oct 7 08:43:03 sun4me unix: [ID 592667 kern.notice] es: 4b fs: 0 gs: 1c3
Oct 7 08:43:03 sun4me unix: [ID 592667 kern.notice] trp: e err: 10 rip: d286f8e0
Oct 7 08:43:03 sun4me unix: [ID 592667 kern.notice] cs: 30 rfl: 10297 rsp: ff00100fa098
Oct 7 08:43:03 sun4me unix: [ID 266532 kern.notice] ss: 38
Oct 7 08:43:03 sun4me unix: [ID 10 kern.notice]
Oct 7 08:43:03 sun4me genunix: [ID 655072 kern.notice] ff00100f9ec0 unix:die+131 ()
Oct 7 08:43:03 sun4me genunix: [ID 655072 kern.notice] ff00100f9f90 unix:trap+152b ()
Oct 7 08:43:03 sun4me genunix: [ID 655072 kern.notice] ff00100f9fa0 unix:cmntrap+e6 ()
Oct 7 08:43:03 sun4me genunix: [ID 802836 kern.notice] ff00100fa120 d286f8e0 ()
Oct 7 08:43:03 sun4me genunix: [ID 655072 kern.notice] ff00100fa1d0 kcf:ctr_mode_contiguous_blocks+116 ()
Oct 7 08:43:03 sun4me genunix: [ID 655072 kern.notice] ff00100fa230 aes:aes_decrypt_contiguous_blocks+ee ()
Oct 7 08:43:03 sun4me genunix: [ID 655072 kern.notice] ff00100fa2e0 kcf:gcm_decrypt_final+1fd ()
Oct 7 08:43:03 sun4me genunix: [ID 655072 kern.notice] ff00100fa460 aes:aes_decrypt_atomic+262 ()
Oct 7 08:43:03 sun4me genunix: [ID 655072 kern.notice] ff00100fa5d0 kcf:crypto_decrypt+202 ()
Oct 7 08:43:03 sun4me genunix: [ID 655072 kern.notice] ff00100fa720 zfs:zio_decrypt_data+2fb ()
Oct 7 08:43:03 sun4me genunix: [ID 655072 kern.notice] ff00100fa7a0 zfs:zio_decrypt+ff ()
Oct 7 08:43:03 sun4me genunix: [ID 655072 kern.notice] ff00100fa7d0 zfs:zio_pop_transforms+40 ()
Oct 7 08:43:03 sun4me genunix: [ID 655072 kern.notice] ff00100fa840 zfs:zio_done+14f ()
Oct 7 08:43:03 sun4me genunix: [ID 655072 kern.notice] ff00100fa870 zfs:zio_execute+8d ()
Oct 7 08:43:03 sun4me genunix: [ID 655072 kern.notice] ff00100fa8d0 zfs:zio_notify_parent+b1 ()
Oct 7 08:43:03 sun4me genunix: [ID 655072 kern.notice] ff00100fa940 zfs:zio_done+3a3 ()
Oct 7 08:43:03 sun4me genunix: [ID 655072 kern.notice] ff00100fa970 zfs:zio_execute+8d ()
Oct 7 08:43:03 sun4me genunix: [ID 655072 kern.notice] ff00100fa9d0 zfs:zio_notify_parent+b1 ()
Oct 7 08:43:03 sun4me genunix: [ID 655072 kern.notice] ff00100faa40 zfs:zio_done+3a3 ()
Oct 7 08:43:03 sun4me genunix: [ID 655072 kern.notice] ff00100faa70 zfs:zio_execute+8d ()
Oct 7 08:43:03 sun4me genunix: [ID 655072 kern.notice] ff00100fab10 genunix:taskq_thread+22e ()
Oct 7 08:43:03 sun4me genunix: [ID 655072 kern.notice] ff00100fab20 unix:thread_start+8 ()
Oct 7 08:43:03 sun4me unix: [ID 10 kern.notice]
Oct 7 08:43:03 sun4me genunix: [ID 672855 kern.notice] syncing file systems...
Oct 7 08:43:03 sun4me genunix: [ID 904073 kern.notice] done
Oct 7 08:43:04 sun4me genunix: [ID 111219 kern.notice] dumping to /dev/zvol/dsk/rpool/dump,
Re: [OpenIndiana-discuss] Replacing both disks in a mirror set
On 10/8/12, Dan Swartzendruber dswa...@druber.com wrote:
> Wow, Martin, that's a shocker. I've been doing exactly this to 'backup' my
> rpool :(

My full sympathy. This naming and lack of warnings is just brain-dead. I cannot understand how such smart engineers could name and implement it in that stupid a way.

But as written, this time I'm the victim of a real bug. So Oracle has absolutely no excuses. But I have no service plan. Let's PRAY that they respond at all :(
Re: [OpenIndiana-discuss] Raid type selection for large # of ssds
Thanks for the feedback. They will be either Samsung 830s or, if the timing is right, 840s. I am sorta leaning toward the zfs equivalent of raid 10 at this point. Do you guys see an issue using all of them in 1 pool/vdev in that scenario?

-----Original Message-----
From: Roy Sigurd Karlsbakk [mailto:r...@karlsbakk.net]
Sent: Monday, October 8, 2012 12:27 PM
To: Discussion list for OpenIndiana
Subject: Re: [OpenIndiana-discuss] Raid type selection for large # of ssds

> I can only share my experience with spinning rust. As others have said, I'd
> recommend against a single VDEV with all 24 drives. The chance of two or
> three dying at once is rather high with that number of drives; SSDs die
> too. A choice of 3x8 in RAIDz2 seems reasonable, or perhaps 2x7+1x8+spare
> (I know, it's uneven, but not by much, and a spare is a good thing).

[rest of quoted message snipped; see above]
Re: [OpenIndiana-discuss] Replacing both disks in a mirror set
On Oct 8, 2012, at 4:07 PM, Martin Bochnig mar...@martux.org wrote:

> Maurilio,
>
> at first a reminder: never ever detach a disk before you have a third disk
> that has already completed resilvering. The term "detach" is misleading,
> because it detaches the disk from the pool; afterwards you cannot access
> the disk's previous contents anymore. Your detached half of a mirror can
> neither be imported nor mounted, and also not even rescued (unlike a disk
> from a pool destroyed with zpool destroy). If I ever mentally recover from
> a zfs-encryption-caused 2TB (or 3 years!) data loss, then I may offer an
> implementation with less ambiguous naming to Illumos.
>
> "zpool detach" suggests that you could still use this disk as a reserve
> backup copy of the pool you were detaching it from.

No it doesn't -- there is no documentation that suggests this usage.

> And that you could simply "zpool attach" it again, in case the other disk
> would die.

You are confusing zpool detach and zpool split commands.
 -- richard

[rest of quoted message snipped]

--
richard.ell...@richardelling.com
+1-760-896-4422
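A minimal sketch of the distinction Richard points out, with the pool name tank from the thread and a hypothetical new pool name: zpool split peels one side off each mirror into a new, separately importable pool, whereas zpool detach simply discards the disk's membership in the pool.

  # split one half of each mirror out of "tank" into a new pool "tankbackup";
  # the split-off pool can later be imported on its own (e.g. on another host)
  zpool split tank tankbackup
  zpool import tankbackup

  # by contrast, a detached disk is NOT importable afterwards:
  #   zpool detach tank c11t5d0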
Re: [OpenIndiana-discuss] Replacing both disks in a mirror set
Good point on split vs detach. Unfortunately this particular misinformation seems widespread :(

-----Original Message-----
From: Richard Elling [mailto:richard.ell...@richardelling.com]
Sent: Monday, October 08, 2012 8:39 PM
To: Discussion list for OpenIndiana
Subject: Re: [OpenIndiana-discuss] Replacing both disks in a mirror set

[quoted message snipped; see above]
Re: [OpenIndiana-discuss] Raid type selection for large # of ssds
On Oct 8, 2012, at 2:07 PM, Roel_D openindi...@out-side.nl wrote:

> I still think this whole discussion is like renting a 40-meter-long truck
> to move your garden hose. We all know that it is possible to rent such a
> truck, but nobody tries to roll up the hose.
>
> SSDs are good for fast reads and occasional writes. So don't use them for
> data storage of fast-changing data.

This is not a true statement. There are a variety of SSDs that have been used for high-transaction-rate systems for a long time, eg RAMSAN. Other new entrants are well designed for high transaction rates (FusionIO, Violin). Not all SSDs are created equal, and you need to choose the best one to fit your needs.
 -- richard

> Put the OSes on an SSD RAID construction and the rest on a SAS platter RAID
> construction.
>
> Kind regards,
>
> The out-side

--
richard.ell...@richardelling.com
+1-760-896-4422
Re: [OpenIndiana-discuss] Replacing both disks in a mirror set
On 10/8/12, Richard Elling richard.ell...@richardelling.com wrote:
[...]
>> zpool detach suggests that you could still use this disk as a reserve
>> backup copy of the pool you were detaching it from.
>
> No it doesn't -- there is no documentation that suggests this usage.

To non-native speakers of English it sounds like that. Googling a bit shows that quite a few folks were victims of this misunderstanding.

BTW: the command names should be chosen in a 100% mnemonic manner, so that even if somebody doesn't read the entire man page, no misunderstanding is possible. And if these names are to be short (for whatever reason), then not too short. Simply throwing up a warning/confirmation dialog would be the minimum one could expect... If you move a small single-user file to the trash (where it still continues to live), you do get one. And for an entire half of a pool, you don't?? Does that make sense?

>> And that you could simply zpool attach it again, in case the other disk
>> would die.
>
> You are confusing zpool detach and zpool split commands.
>  -- richard

Well, I have known this for a few years now (but some still don't, as we just saw...). And at first, a few years back, I didn't know it myself.

However, this time I have a real problem, and it did not happen because of ambiguously chosen command names that I misunderstood. VBox caused the host to freeze, and since then the host's home mirror is no longer mountable. And that's just not in line with the fancy promos and all that hype :(

%martin
Re: [OpenIndiana-discuss] Replacing both disks in a mirror set
On 10/8/12, Martin Bochnig mar...@martux.org wrote:
> However, this time I have a real problem, and it did not happen because of
> ambiguously chosen command names that I misunderstood. VBox caused the host
> to freeze, and since then the host's home mirror is no longer mountable.
> And that's just not in line with the fancy promos and all that hype :(

Dear all,

A MIRACLE HAS HAPPENED! I tried it the entire night, and now, suddenly, importing read-only does not cause the crash (although it did a dozen times before):

  zpool import -o readonly=on wonderhome

WITH GOD's HELP (although I'm actually an atheist) it functioned :)

So, now I'll try to rescue the data to another pool via rsync. Sorry for the hectic. Let's hope copying the data will work without a new panic.

Cheers,
%martin
Re: [OpenIndiana-discuss] Replacing both disks in a mirror set
Ok, to stay correct: first via rsync, or "/usr/sbin/tar cEf", or cpio for the most important files. Then I'll try whether piping zfs send still works.

IT ALL DOESN'T MATTER TO ME. My DATA MAY BE RESCUED in a few hours :)))
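A minimal sketch of the read-only rescue path Martin describes; the pool name wonderhome comes from the thread, while the destination pool, paths, and snapshot name are assumptions for illustration:

  # import the damaged pool read-only so nothing more gets written to it
  zpool import -o readonly=on wonderhome

  # copy the most important files to another pool first
  rsync -aH /wonderhome/ /backuppool/wonderhome-rescue/
  # (or: cd /wonderhome && /usr/sbin/tar cEf - . | (cd /backuppool/wonderhome-rescue && tar xf -))

  # a full stream copy only works from an already-existing snapshot,
  # since a read-only imported pool cannot take new ones
  zfs send -R wonderhome@some_existing_snap | zfs receive -d backuppool/wonderhome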
Re: [OpenIndiana-discuss] Replacing both disks in a mirror set
Happy you got your stuff back.

On Tuesday, October 09, 2012 11:30 AM, Martin Bochnig wrote:
> Ok, to stay correct: first via rsync, or "/usr/sbin/tar cEf", or cpio for
> the most important files. Then I'll try whether piping zfs send still
> works.
>
> IT ALL DOESN'T MATTER TO ME. My DATA MAY BE RESCUED in a few hours :)))