Re: [zfs-discuss] Severe Problems on ZFS server
Hi Bob,

> The problem could be due to a faulty/failing disk, a poor connection with a
> disk, or some other hardware issue. A failing disk can easily make the
> system pause temporarily like that. As root you can run '/usr/sbin/fmdump -ef'
> to see all the fault events as they are reported. Be sure to execute
> '/usr/sbin/fmadm faulty' to see if a fault has already been identified on
> your system. Also execute '/usr/bin/iostat -xe' to see if there are errors
> reported against some of your disks, or if some are reported as being
> abnormally slow. You might also want to verify that your Solaris 10 is
> current. I notice that you did not identify which Solaris 10 release you
> are using.

Thanks a lot for these hints. I checked all of this. On my mirror server I
found a faulty DIMM with these commands, but on the main server exhibiting
the described problem everything seems fine.

>> On another machine with 6GB RAM I fired up a second virtual machine
>> (vbox). This drove the machine almost to a halt. The second vbox instance
>> never came up. I finally saw a panel raised by the first vbox instance
>> saying that there was not enough memory available (non-severe vbox error)
>> and the virtual machine was halted!! After killing the process of the
>> second vbox I could simply press resume and the first vbox machine
>> continued to work properly.
>
> Maybe you should read the VirtualBox documentation. There is a note about
> Solaris 10 and about how VirtualBox may fail if it can't get enough
> contiguous memory. Maybe I am just lucky, since I have run three VirtualBox
> instances at a time (2GB allocation each) on my system with no problem at
> all.

I have inserted

  set zfs:zfs_arc_max = 0x2

in /etc/system and rebooted the machine (it has 64GB of memory). Tomorrow
will show whether this did the trick!

Thanks a lot,

Andreas
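A minimal sketch of what such an ARC cap and its verification could look
like; the byte value below is purely illustrative (zfs_arc_max is given in
bytes), not the value actually used on this server:

  # /etc/system -- cap the ZFS ARC (illustrative value: 16 GB = 0x400000000 bytes)
  set zfs:zfs_arc_max = 0x400000000

  # after rebooting, confirm the cap took effect and watch the current ARC size
  kstat -p zfs:0:arcstats:c_max
  kstat -p zfs:0:arcstats:size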
Re: [zfs-discuss] Severe Problems on ZFS server
Hi all,

> we are encountering severe problems on our X4240 (64GB, 16 disks) running
> Solaris 10 and ZFS. From time to time (5-6 times a day)
>
> • FrontBase hangs or crashes
> • VBox virtual machines hang
> • other applications show a rubber effect (white screen) while their
>   windows are being moved
>
> I have been tearing my hair out wondering where this comes from. Could be
> software bugs, but in all these applications from different vendors? Could
> be a Solaris bug or bad memory!? Rather unlikely. Then I was hit by a
> thought: on another machine with 6GB RAM I fired up a second virtual
> machine (vbox). This drove the machine almost to a halt. The second vbox
> instance never came up. I finally saw a panel raised by the first vbox
> instance saying that there was not enough memory available (non-severe vbox
> error) and the virtual machine was halted!! After killing the process of
> the second vbox I could simply press resume and the first vbox machine
> continued to work properly.
>
> OK, now this starts to make sense. My idea is that ZFS is
> blocking/allocating all of the available system memory. When an app
> (FrontBase, VBox, ...) is started and suddenly requests larger chunks of
> memory from the system, the malloc calls fail, either because ZFS has
> allocated all the memory or because the system cannot release the memory
> quickly enough and make it available for the requesting apps. So the malloc
> fails or times out or whatever, which is not caught in the apps and makes
> them hang, crash, or stall for minutes.
>
> Does this make any sense? Any similar experiences?

Followup to my own message: on the X4240 I have

  set zfs:zfs_arc_max = 0x78000

in /etc/system. Would it be a good idea to reduce that to, say,

  set zfs:zfs_arc_max = 0x28000

?? Hints greatly appreciated!

Thanks,

Andreas
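A rough sketch of how one might confirm this theory while an application is
stalling; ::memstat is a standard Solaris mdb dcmd, though the exact
categories it prints vary between Solaris 10 updates (older updates fold ARC
data into the Kernel line):

  # kernel memory breakdown (run as root); compare what the kernel/ARC is
  # holding against what is left on the free lists
  echo "::memstat" | mdb -k

  # watch free memory and the page scan rate while reproducing a hang
  vmstat 5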
[zfs-discuss] Severe Problems on ZFS server
Hi all,

we are encountering severe problems on our X4240 (64GB, 16 disks) running
Solaris 10 and ZFS. From time to time (5-6 times a day)

• FrontBase hangs or crashes
• VBox virtual machines hang
• other applications show a rubber effect (white screen) while their windows
  are being moved

I have been tearing my hair out wondering where this comes from. Could be
software bugs, but in all these applications from different vendors? Could be
a Solaris bug or bad memory!? Rather unlikely. Then I was hit by a thought:
on another machine with 6GB RAM I fired up a second virtual machine (vbox).
This drove the machine almost to a halt. The second vbox instance never came
up. I finally saw a panel raised by the first vbox instance saying that there
was not enough memory available (non-severe vbox error) and the virtual
machine was halted!! After killing the process of the second vbox I could
simply press resume and the first vbox machine continued to work properly.

OK, now this starts to make sense. My idea is that ZFS is blocking/allocating
all of the available system memory. When an app (FrontBase, VBox, ...) is
started and suddenly requests larger chunks of memory from the system, the
malloc calls fail, either because ZFS has allocated all the memory or because
the system cannot release the memory quickly enough and make it available for
the requesting apps. So the malloc fails or times out or whatever, which is
not caught in the apps and makes them hang, crash, or stall for minutes.

Does this make any sense? Any similar experiences? What can I do about that?

Thanks a lot,

Andreas
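One quick check for whether the ARC really has grown to most of the 64GB is
to read the arcstats kstats, which report the current size, target and cap in
bytes (a sketch; kstat names as on a stock Solaris 10 system):

  # current ARC size, target size and hard cap, all in bytes
  kstat -p zfs:0:arcstats:size
  kstat -p zfs:0:arcstats:c
  kstat -p zfs:0:arcstats:c_max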
Re: [zfs-discuss] Replacing disk in zfs pool
Hi Ragnar,

>> I need to replace a disk in a zfs pool on a production server (X4240
>> running Solaris 10) today and won't have access to my documentation there.
>> That's why I would like to have a good plan on paper before driving to
>> that location. :-)
>>
>> The current tank pool looks as follows:
>>
>>   pool: tank
>>  state: ONLINE
>>  scrub: none requested
>> config:
>>
>>         NAME         STATE     READ WRITE CKSUM
>>         tank         ONLINE       0     0     0
>>           mirror     ONLINE       0     0     0
>>             c1t2d0   ONLINE       0     0     0
>>             c1t3d0   ONLINE       0     0     0
>>           mirror     ONLINE       0     0     0
>>             c1t5d0   ONLINE       0     0     0
>>             c1t4d0   ONLINE       0     0     0
>>           mirror     ONLINE       0     0     0
>>             c1t15d0  ONLINE       0     0     0
>>             c1t7d0   ONLINE       0     0     0
>>           mirror     ONLINE       0     0     0
>>             c1t8d0   ONLINE       0     0     0
>>             c1t9d0   ONLINE       0     0     0
>>           mirror     ONLINE       0     0     0
>>             c1t10d0  ONLINE       0     0     0
>>             c1t11d0  ONLINE       0     0     0
>>           mirror     ONLINE       0     0     0
>>             c1t12d0  ONLINE       0     0     0
>>             c1t13d0  ONLINE       0     0     0
>>
>> errors: No known data errors
>>
>> Note that disk c1t15d0 is being used and has taken over the duty of
>> c1t6d0. c1t6d0 failed and was replaced with a new disk a couple of months
>> ago. However, the new disk does not show up in /dev/rdsk and /dev/dsk. I
>> was told that the disk has to be initialized first with the SCSI BIOS. I
>> am going to do so today (reboot the server). Once the disk shows up in
>> /dev/rdsk I am planning to do the following:
>
> I don't think that the BIOS and rebooting part ever has to be true, at
> least I hope not. You shouldn't have to reboot just because you replace a
> hot-plug disk.

Hard to believe! But that's the most recent state of affairs. Not even the
Sun technician managed to make the disk show up in /dev/dsk. They have
replaced it three times assuming it to be defective! :-) I tried to remotely
reboot the server (with LOM) and go into the SCSI BIOS to initialize the
disk, but the BIOS requires a key combination to initialize the disk that
does not go through the remote connection (I don't remember which one).
That's why I am planning to drive to the remote location and do it manually,
with a server reboot and keyboard and screen attached, like in the very old
days. :-(

> Depending on the hardware and the state of your system, it might not be the
> problem at all, and rebooting may not help. Are the device links for c1t6*
> gone in /dev/(r)dsk? Then someone must have run a "devfsadm -C" or
> something like that. You could try "devfsadm -sv" to see if it wants to
> (re)create any device links. If you think that it looks good, run it with
> "devfsadm -v". If it is the HBA/RAID controller acting up and not showing
> recently inserted drives, you should be able to talk to it with a program
> from within the OS: raidctl for some LSI HBAs, and arcconf for some
> Sun/StorageTek HBAs.

I have /usr/sbin/raidctl on that machine and just studied the man page of
this tool, but I couldn't find hints on how to initialize the disk c1t6d0. It
just talks about setting up RAID volumes!? :-(

Thanks a lot,

Andreas
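A hedged sketch of the non-reboot route discussed above; the cfgadm
attachment-point name depends on the controller, so take it from the cfgadm
listing rather than from this example:

  # dry run: show which device links devfsadm would (re)create
  devfsadm -sv

  # actually create any missing /dev/dsk and /dev/rdsk links
  devfsadm -v

  # list attachment points; an unconfigured replacement disk may show up here
  cfgadm -al

  # if it does, configure it using the attachment point printed by cfgadm -al
  # (the ap_id below is a placeholder, not taken from this system)
  cfgadm -c configure c1::dsk/c1t6d0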
[zfs-discuss] Replacing disk in zfs pool
Hi all,

I need to replace a disk in a zfs pool on a production server (X4240 running
Solaris 10) today and won't have access to my documentation there. That's why
I would like to have a good plan on paper before driving to that location. :-)

The current tank pool looks as follows:

  pool: tank
 state: ONLINE
 scrub: none requested
config:

        NAME         STATE     READ WRITE CKSUM
        tank         ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c1t2d0   ONLINE       0     0     0
            c1t3d0   ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c1t5d0   ONLINE       0     0     0
            c1t4d0   ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c1t15d0  ONLINE       0     0     0
            c1t7d0   ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c1t8d0   ONLINE       0     0     0
            c1t9d0   ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c1t10d0  ONLINE       0     0     0
            c1t11d0  ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c1t12d0  ONLINE       0     0     0
            c1t13d0  ONLINE       0     0     0

errors: No known data errors

Note that disk c1t15d0 is being used and has taken over the duty of c1t6d0.
c1t6d0 failed and was replaced with a new disk a couple of months ago.
However, the new disk does not show up in /dev/rdsk and /dev/dsk. I was told
that the disk has to be initialized first with the SCSI BIOS. I am going to
do so today (reboot the server). Once the disk shows up in /dev/rdsk I am
planning to do the following:

  zpool attach tank c1t7d0 c1t6d0

This hopefully gives me a three-way mirror:

          mirror     ONLINE       0     0     0
            c1t15d0  ONLINE       0     0     0
            c1t7d0   ONLINE       0     0     0
            c1t6d0   ONLINE       0     0     0

And then a

  zpool detach tank c1t15d0

to get c1t15d0 out of the mirror, to finally have

          mirror     ONLINE       0     0     0
            c1t6d0   ONLINE       0     0     0
            c1t7d0   ONLINE       0     0     0

again. Is that a good plan? I am then intending to do

  zpool add tank mirror c1t14d0 c1t15d0

to add another 146GB to the pool. Please let me know if I am missing
anything. This is a production server. A failure of the pool would be fatal.

Thanks a lot,

Andreas
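A consolidated sketch of the plan above, with a resilver check between the
steps; device names are the ones from the post:

  # add c1t6d0 as a third side of the mirror that currently holds c1t15d0/c1t7d0
  zpool attach tank c1t7d0 c1t6d0

  # wait until the resilver is reported as completed before touching anything
  zpool status tank

  # only then drop c1t15d0 out of the three-way mirror
  zpool detach tank c1t15d0

  # and finally add the freed disks as a new mirrored vdev
  zpool add tank mirror c1t14d0 c1t15d0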
Re: [zfs-discuss] Removing SSDs from pool
Hi Khyron,

> No, he did *not* say that a mirrored SLOG has no benefit, redundancy-wise.
> He said that YOU do *not* have a mirrored SLOG. You have 2 SLOG devices
> which are striped. And if this machine is running Solaris 10, then you
> cannot remove a log device, because those updates have not made their way
> into Solaris 10 yet. You need pool version >= 19 to remove log devices, and
> S10 does not currently have patches to ZFS to get to a pool version >= 19.
> If your SLOG above were mirrored, you'd have "mirror" under "logs". And you
> probably would have "log", not "logs" - notice the "s" at the end meaning
> plural, meaning multiple independent log devices, not a mirrored pair of
> logs, which would effectively look like one device.

Thanks for the clarification! This is very annoying. My intent was to create
a log mirror. I used

  zpool add tank log c1t6d0 c1t7d0

and this was obviously wrong. Would

  zpool add tank mirror log c1t6d0 c1t7d0

have done what I intended to do? If so, it seems I have to tear down the tank
pool and recreate it from scratch!? Can I simply use

  zpool destroy -f tank

to do so?

Thanks,

Andreas
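For what it's worth, the usual syntax puts the mirror keyword after log, so a
sketch of the intended setup would look like the following; whether the pool
really has to be destroyed first follows from the point above that Solaris 10
cannot remove existing log devices:

  # what a mirrored log would have looked like when adding it
  # (note: "log mirror ...", not "mirror log ...")
  zpool add tank log mirror c1t6d0 c1t7d0

  # recreating the pool from scratch with a mirrored log, as discussed above;
  # this destroys all data in tank, so only after backing it up
  zpool destroy -f tank
  zpool create tank mirror c1t2d0 c1t3d0 mirror c1t4d0 c1t5d0 \
      log mirror c1t6d0 c1t7d0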
Re: [zfs-discuss] Removing SSDs from pool
Hi Edward,

thanks a lot for your detailed response!

>> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
>> boun...@opensolaris.org] On Behalf Of Andreas Höschler
>>
>> • I would like to remove the two SSDs as log devices from the pool and
>>   instead add them as a separate pool for sole use by the database to see
>>   how this enhances performance. I could certainly do
>>
>>     zpool detach tank c1t7d0
>>
>>   to remove one disk from the log mirror. But how can I get back the
>>   second SSD?
>
> If you're running solaris, sorry, you can't remove the log device. You
> better keep your log mirrored until you can plan for destroying and
> recreating the pool. Actually, in your example, you don't have a mirror of
> logs. You have two separate logs. This is fine for opensolaris (zpool >= 19),
> but not solaris (presently up to zpool 15). If this is solaris, and *either*
> one of those SSD's fails, then you lose your pool.

I run Solaris 10 (not OpenSolaris)! You say the log mirror

    pool: tank
   state: ONLINE
   scrub: none requested
  config:

          NAME        STATE     READ WRITE CKSUM
          tank        ONLINE       0     0     0
          ...
          logs
            c1t6d0    ONLINE       0     0     0
            c1t7d0    ONLINE       0     0     0

does not do me any good (redundancy-wise)!? Shouldn't I detach the second
drive then and try to use it for something else, maybe another machine? I
understand it is very dangerous to use SSDs for logs then (no redundancy)!?

> If you're running opensolaris, "man zpool" and look for "zpool remove".
>
> Is the database running locally on the machine?

Yes!

> Or at the other end of something like nfs? You should have better
> performance using your present config than just about any other config ...
> By enabling the log devices, such as you've done, you're dedicating the
> SSD's to sync writes. And that's what the database is probably doing. This
> config should be *better* than dedicating the SSD's as their own pool.
> Because with the dedicated log device on a stripe of mirrors, you're
> allowing the spindle disks to do what they're good at (sequential blocks)
> and allowing the SSD's to do what they're good at (low-latency IOPS).

OK! I actually have two machines here: one production machine (X4240 with 16
disks, no SSDs) with performance issues, and a development machine (X4140
with 6 disks and two SSDs) configured as shown in my previous mail. The
question for me is how to improve the performance of the production machine
and whether buying SSDs for this machine is worth the investment.

"zpool iostat" on the development machine with the SSDs gives me

                 capacity     operations    bandwidth
  pool         used  avail   read  write   read  write
  ----------  -----  -----  -----  -----  -----  -----
  rpool        114G   164G      0      4  13.5K  36.0K
  tank         164G   392G      3    131   444K  10.8M
  ----------  -----  -----  -----  -----  -----  -----

When I do that on the production machine without SSDs I get

  pool         used  avail   read  write   read  write
  ----------  -----  -----  -----  -----  -----  -----
  rpool       98.3G  37.7G      0      7  32.5K  36.9K
  tank         480G   336G     16     53  1.69M  2.05M
  ----------  -----  -----  -----  -----  -----  -----

It is interesting to note that the write bandwidth on the SSD machine is five
times higher. I take this as an indicator that the SSDs have some effect.

I am still wondering what your "if one SSD fails you lose your pool" means
for me. Would you recommend detaching one of the SSDs in the development
machine and adding it to the production machine with

  zpool add tank log c1t15d0

?? And how safe (reliable) is it to use SSDs for this? I mean, when do I have
to expect the SSD to fail and thus ruin the pool!?

Thanks a lot,

Andreas
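As a rough way to see whether the log SSDs are actually absorbing the
database's synchronous writes, one could watch per-vdev statistics while the
database is busy (a sketch; pool name as in this thread, interval chosen
arbitrarily):

  # per-vdev I/O statistics every 5 seconds; the "logs" section shows how much
  # write traffic the SSDs take compared to the spindle mirrors
  zpool iostat -v tank 5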
[zfs-discuss] Removing SSDs from pool
Hi all,

while setting up our X4140 I have - following suggestions - added two SSDs as
log devices as follows:

  zpool add tank log c1t6d0 c1t7d0

I currently have

  pool: rpool
 state: ONLINE
 scrub: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        rpool         ONLINE       0     0     0
          mirror      ONLINE       0     0     0
            c1t0d0s0  ONLINE       0     0     0
            c1t1d0s0  ONLINE       0     0     0

errors: No known data errors

  pool: tank
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c1t2d0  ONLINE       0     0     0
            c1t3d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c1t4d0  ONLINE       0     0     0
            c1t5d0  ONLINE       0     0     0
        logs
          c1t6d0    ONLINE       0     0     0
          c1t7d0    ONLINE       0     0     0

errors: No known data errors

We have performance problems, especially with FrontBase (a relational
database) running on this ZFS configuration, and need to look for
optimizations.

• I would like to remove the two SSDs as log devices from the pool and
  instead add them as a separate pool for sole use by the database, to see
  how this enhances performance. I could certainly do

    zpool detach tank c1t7d0

  to remove one disk from the log mirror. But how can I get back the second
  SSD?

Any experiences with running databases on ZFS pools? What can I do to tune
the performance? Maybe a smaller block size?

Thanks a lot,

Andreas
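On the block size question, the usual knob is the dataset recordsize, which
is commonly matched to the database page size. The dataset name and the 8K
page size below are assumptions for illustration only:

  # check the current recordsize (the default is 128K)
  zfs get recordsize tank

  # create a dedicated dataset for the database with a smaller recordsize,
  # matched to the database's page size (8K is only an assumed example)
  zfs create -o recordsize=8k tank/frontbase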
[zfs-discuss] SSD and ZFS
Hi all,

just after sending a message to sunmanagers I realized that my question
should rather have gone here. So, sunmanagers, please excuse the double post:

I have inherited an X4140 (8 SAS slots) and have just set up the system with
Solaris 10 09. I first set up the system on a mirrored pool over the first
two disks

  pool: rpool
 state: ONLINE
 scrub: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        rpool         ONLINE       0     0     0
          mirror      ONLINE       0     0     0
            c1t0d0s0  ONLINE       0     0     0
            c1t1d0s0  ONLINE       0     0     0

errors: No known data errors

and then tried to add the second pair of disks to this pool, which did not
work (the famous error message regarding the label; root pool BIOS issue). I
therefore simply created an additional pool tank:

  pool: rpool
 state: ONLINE
 scrub: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        rpool         ONLINE       0     0     0
          mirror      ONLINE       0     0     0
            c1t0d0s0  ONLINE       0     0     0
            c1t1d0s0  ONLINE       0     0     0

errors: No known data errors

  pool: tank
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c1t2d0  ONLINE       0     0     0
            c1t3d0  ONLINE       0     0     0

errors: No known data errors

So far so good. I have now replaced the last two SAS disks with 32GB SSDs and
am wondering how to add them to the system. I googled a lot for best
practices but found nothing so far that made me any wiser. My current
approach is still to simply do

  zpool add tank mirror c0t6d0 c0t7d0

as I would with normal disks, but I am wondering whether that's the right
approach to significantly increase system performance. Will ZFS automatically
use these SSDs and optimize accesses to tank? Probably! But it won't optimize
accesses to rpool, of course. I am not sure whether I need that or should
look for that. Should I try to get all disks into rpool in spite of the BIOS
label issue, so that SSDs are used for all accesses to the disk system?

Hints (best practices) are greatly appreciated!

Thanks a lot,

Andreas
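A sketch of the main options for the two SSDs, assuming the pool version on
this Solaris 10 release supports log and cache devices; which one pays off
depends on whether the workload is dominated by synchronous writes (log) or
by random reads (cache):

  # option 1: plain capacity -- add the SSDs as another mirrored data vdev
  zpool add tank mirror c0t6d0 c0t7d0

  # option 2: dedicated intent log for synchronous writes (mirrored)
  zpool add tank log mirror c0t6d0 c0t7d0

  # option 3: L2ARC read cache (cache devices are never mirrored)
  zpool add tank cache c0t6d0 c0t7d0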
Re: [zfs-discuss] Replacing faulty disk in ZFS pool
Hi Cindy,

> I think you can still offline the faulted disk, c1t6d0.

OK, here it gets tricky. I have

        NAME           STATE     READ WRITE CKSUM
        tank           DEGRADED     0     0     0
          mirror       ONLINE       0     0     0
            c1t2d0     ONLINE       0     0     0
            c1t3d0     ONLINE       0     0     0
          mirror       ONLINE       0     0     0
            c1t5d0     ONLINE       0     0     0
            c1t4d0     ONLINE       0     0     0
          mirror       DEGRADED     0     0     0
            spare      DEGRADED     0     0     0
              c1t6d0   FAULTED      0    19     0  too many errors
              c1t15d0  ONLINE       0     0     0
            c1t7d0     ONLINE       0     0     0
        spares
          c1t15d0      INUSE     currently in use

now. When I issue the command

  zpool offline tank c1t6d0

I get

  cannot offline c1t6d0: no valid replicas

?? However,

  zpool detach tank c1t6d0

seems to work!

  pool: tank
 state: ONLINE
 scrub: resilver completed after 0h22m with 0 errors on Thu Aug 6 22:55:37 2009
config:

        NAME         STATE     READ WRITE CKSUM
        tank         ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c1t2d0   ONLINE       0     0     0
            c1t3d0   ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c1t5d0   ONLINE       0     0     0
            c1t4d0   ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c1t15d0  ONLINE       0     0     0
            c1t7d0   ONLINE       0     0     0

errors: No known data errors

This looks like I can remove and physically replace c1t6d0 now! :-)

Thanks,

Andreas
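The remaining steps, sketched out from the advice earlier in this thread
(when to run the scrub is a judgment call on a loaded production box):

  # after physically replacing the failed disk, make the new disk a hot spare again
  zpool add tank spare c1t6d0

  # verify the pool and clear any stale error counters
  zpool scrub tank
  zpool clear tank
  zpool status tank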
Re: [zfs-discuss] Replacing faulty disk in ZFS pool
Hi all,

>> zpool add tank spare c1t15d0 ? After doing that, c1t6d0 is offline and
>> ready to be physically replaced?
>
> Yes, that is correct.

>>> Then you could physically replace c1t6d0 and add it back to the pool as a
>>> spare, like this:
>>>
>>>   # zpool add tank spare c1t6d0
>>>
>>> For a production system, the steps above might be the most efficient. Get
>>> the faulted disk replaced with a known good disk so the pool is no longer
>>> degraded, then physically replace the bad disk when you have the time and
>>> add it back to the pool as a spare. It is also good practice to run a
>>> zpool scrub to ensure the replacement is operational
>>
>> That would be zpool scrub tank in my case!?
>
> Yes.

>>> and use zpool clear to clear the previous errors on the pool.
>>
>> I assume the complete command for my case is zpool clear tank. Why do we
>> have to do that? Couldn't zfs realize that everything is fine again after
>> executing "zpool replace tank c1t6d0 c1t15d0"?
>
> Yes, sometimes the clear is not necessary, but it will also clear the error
> counts if need be.

I have done

  zpool add tank spare c1t15d0
  zpool replace tank c1t6d0 c1t15d0

now and waited for the completion of the resilvering process. "zpool status"
now gives me

 scrub: resilver completed after 0h22m with 0 errors on Thu Aug 6 22:55:37 2009
config:

        NAME           STATE     READ WRITE CKSUM
        tank           DEGRADED     0     0     0
          mirror       ONLINE       0     0     0
            c1t2d0     ONLINE       0     0     0
            c1t3d0     ONLINE       0     0     0
          mirror       ONLINE       0     0     0
            c1t5d0     ONLINE       0     0     0
            c1t4d0     ONLINE       0     0     0
          mirror       DEGRADED     0     0     0
            spare      DEGRADED     0     0     0
              c1t6d0   FAULTED      0    19     0  too many errors
              c1t15d0  ONLINE       0     0     0
            c1t7d0     ONLINE       0     0     0
        spares
          c1t15d0      INUSE     currently in use

errors: No known data errors

This does look like a final step is missing. Can I simply physically replace
c1t6d0 now, or do I have to do zpool offline tank c1t6d0 first? Moreover, it
seems I have to run a zpool clear in my case to get rid of the DEGRADED
message!? What is the missing bit here?

> zpool offline tank c1t6d0
> zpool replace tank c1t6d0
> zpool online tank c1t6d0

Just out of curiosity (since I used the other road this time): how does the
replace command know what exactly to do here? In my case I instructed the
system specifically to replace c1t6d0 with c1t15d0 by doing "zpool replace
tank c1t6d0 c1t15d0", but if I simply issue zpool replace tank c1t6d0, it
...!??

Thanks a lot,

Andreas
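For reference, a sketch of the complete spare-based sequence discussed in
this exchange, collected in one place (device names as in the thread):

  # make the good disk available as a hot spare
  zpool add tank spare c1t15d0

  # replace the faulted disk with the spare and let the mirror resilver
  zpool replace tank c1t6d0 c1t15d0

  # once resilvering has completed, detach the faulted disk so it can be pulled
  zpool detach tank c1t6d0

  # optionally clear old error counters and verify with a scrub
  zpool clear tank
  zpool scrub tank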
Re: [zfs-discuss] Replacing faulty disk in ZFS pool
Hi Cindy,

> Good job for using a mirrored configuration. :-)

Thanks!

> Your various approaches would work. My only comment about #2 is that it
> might take some time for the spare to kick in for the faulted disk. Both 1
> and 2 would take a bit more time than just replacing the faulted disk with
> a spare disk, like this:
>
>   # zpool replace tank c1t6d0 c1t15d0

You mean I can execute

  zpool replace tank c1t6d0 c1t15d0

without having made c1t15d0 a spare disk first with

  zpool add tank spare c1t15d0

? After doing that, c1t6d0 is offline and ready to be physically replaced?

> Then you could physically replace c1t6d0 and add it back to the pool as a
> spare, like this:
>
>   # zpool add tank spare c1t6d0
>
> For a production system, the steps above might be the most efficient. Get
> the faulted disk replaced with a known good disk so the pool is no longer
> degraded, then physically replace the bad disk when you have the time and
> add it back to the pool as a spare. It is also good practice to run a zpool
> scrub to ensure the replacement is operational

That would be zpool scrub tank in my case!?

> and use zpool clear to clear the previous errors on the pool.

I assume the complete command for my case is zpool clear tank. Why do we have
to do that? Couldn't zfs realize that everything is fine again after
executing "zpool replace tank c1t6d0 c1t15d0"?

> If the system is used heavily, then you might want to run the zpool scrub
> when system use is reduced.

That would be now! :-)

> If you were going to physically replace c1t6d0 while it was still attached
> to the pool, then you might offline it first.

Ok, this sounds like approach 3):

  zpool offline tank c1t6d0
  zpool online tank c1t6d0

Would that be it?

Thanks a lot!

Regards,

Andreas
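A sketch of what that in-place route would typically look like; whether the
online step alone is enough or a replace is also needed depends on how the
new disk is labeled, so treat this as an outline rather than a recipe:

  # take the faulted disk offline before pulling it
  zpool offline tank c1t6d0

  # ... physically swap the disk in the same slot ...

  # bring the slot back online and tell ZFS the device was replaced in place
  zpool online tank c1t6d0
  zpool replace tank c1t6d0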
[zfs-discuss] Replacing faulty disk in ZFS pool
Dear managers,

one of our servers (X4240) shows a faulty disk:

-bash-3.00# zpool status
  pool: rpool
 state: ONLINE
 scrub: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        rpool         ONLINE       0     0     0
          mirror      ONLINE       0     0     0
            c1t0d0s0  ONLINE       0     0     0
            c1t1d0s0  ONLINE       0     0     0

errors: No known data errors

  pool: tank
 state: DEGRADED
status: One or more devices are faulted in response to persistent errors.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Replace the faulted device, or use 'zpool clear' to mark the device
        repaired.
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        tank        DEGRADED     0     0     0
          mirror    ONLINE       0     0     0
            c1t2d0  ONLINE       0     0     0
            c1t3d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c1t5d0  ONLINE       0     0     0
            c1t4d0  ONLINE       0     0     0
          mirror    DEGRADED     0     0     0
            c1t6d0  FAULTED      0    19     0  too many errors
            c1t7d0  ONLINE       0     0     0

errors: No known data errors

I came up with the following possible approaches to solve the problem:

1) One way to reestablish redundancy would be to use the command

     zpool attach tank c1t7d0 c1t15d0

   to add c1t15d0 to the virtual device "c1t6d0 + c1t7d0". We would still
   have the faulty disk in the virtual device. We could then detach the
   faulty disk with the command

     zpool detach tank c1t6d0

2) Another approach would be to add a spare disk to tank

     zpool add tank spare c1t15d0

   and then replace the faulty disk:

     zpool replace tank c1t6d0 c1t15d0

In theory this is easy, but since I have never done it, and since this is a
production server, I would appreciate it if someone with more experience
would look over my plan before I issue these commands. What is the difference
between the two approaches? Which one do you recommend? And is that really
all that has to be done, or am I missing a bit? I mean, can c1t6d0 be
physically replaced after issuing "zpool detach tank c1t6d0" or "zpool
replace tank c1t6d0 c1t15d0"? I also found the command zpool offline tank ...
but am not sure whether it should be used in my case.

Hints are greatly appreciated!

Thanks a lot,

Andreas
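For comparison, a sketch of approach 1 as a full command sequence, with a
resilver check between the two steps:

  # grow the degraded mirror to a three-way mirror first
  zpool attach tank c1t7d0 c1t15d0

  # wait until the resilver is reported as completed
  zpool status tank

  # then drop the faulted disk out of the mirror
  zpool detach tank c1t6d0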