Re: [zfs-discuss] cannot replace X with Y: devices have different sector alignment
Thank you for the link! It turns out that, even though I bought the WD20EARS and ST32000542AS expecting a 4096-byte physical block size, they report 512. The new drive I bought correctly identifies itself as 4096-byte blocksize! So OI doesn't like merging it with the existing pool.

Note: the ST2000VX000-9YW1 reports a physical blocksize of 4096B. The other drives, which actually have 4096B blocks, report 512B physical blocks. This is misleading, but they do it anyway.

On Mon, Sep 24, 2012 at 4:32 PM, Timothy Coalson tsc...@mst.edu wrote:

I'm not sure how to definitively check the physical sector size on solaris/illumos, but on linux, hdparm -I (capital i) or smartctl -i will do it. OpenIndiana's smartctl doesn't output this information yet (and its smartctl doesn't work on SATA disks unless they are attached via a SAS chip). The issue is complicated by there being both a logical and a physical sector size; as far as I am aware, on current disks the logical size is always 512, which may be what was reported by what you ran.

Some quick googling suggests that it previously was not possible to report the physical sector size with an existing utility on solaris, so someone wrote their own: http://solaris.kuehnke.de/archives/18-Checking-physical-sector-size-of-disks-on-Solaris.html

So, if you want to be sure of the physical sector size, you could give that program a whirl (it compiled fine for me on oi_151a6, and runs, but it is not easy for me to attach a 4k-sector disk to one of my OI machines, so I haven't confirmed its correctness), or temporarily transplant the spare in question into a linux machine (or boot a live system) and use hdparm -I.

Tim
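The "different sector alignment" error is about ZFS's ashift, which is fixed per top-level vdev at creation time: ashift is log2 of the sector size the disk reported when the vdev was made. A small sketch (POSIX shell; the function name is mine, not a ZFS tool):

```shell
#!/bin/sh
# Sketch: the ashift a disk's reported physical block size maps to.
# ashift is log2(sector size): 512 -> 9, 4096 -> 12. A spare that reports
# 4096 (ashift 12) cannot replace a member of an ashift-9 vdev.
ashift_for() {
  size=$1
  shift_val=0
  while [ "$size" -gt 1 ]; do
    size=$((size / 2))
    shift_val=$((shift_val + 1))
  done
  echo "$shift_val"
}

ashift_for 512    # -> 9  (what the WD20EARS/ST32000542AS report)
ashift_for 4096   # -> 12 (what the ST2000VX000 reports)
```

The existing pool's ashift can be read on illumos with zdb (e.g. `zdb rspool | grep ashift`).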
Re: [zfs-discuss] cannot replace X with Y: devices have different sector alignment
Yet another weird thing - prtvtoc shows both drives as having the same sector size, etc.:

root@nas:~# prtvtoc /dev/rdsk/c16t5000C5002AA08E4Dd0
* /dev/rdsk/c16t5000C5002AA08E4Dd0 partition map
*
* Dimensions:
*     512 bytes/sector
*  3907029168 sectors
*  3907029101 accessible sectors
*
* Flags:
*   1: unmountable
*  10: read-only
*
* Unallocated space:
*       First     Sector    Last
*       Sector     Count    Sector
*           34        222       255
*
*                          First     Sector    Last
* Partition  Tag  Flags    Sector     Count    Sector  Mount Directory
       0      4    00        256 3907012495 3907012750
       8     11    00 3907012751      16384 3907029134

root@nas:~# prtvtoc /dev/rdsk/c16t5000C5005295F727d0
* /dev/rdsk/c16t5000C5005295F727d0 partition map
*
* Dimensions:
*     512 bytes/sector
*  3907029168 sectors
*  3907029101 accessible sectors
*
* Flags:
*   1: unmountable
*  10: read-only
*
* Unallocated space:
*       First     Sector    Last
*       Sector     Count    Sector
*           34        222       255
*
*                          First     Sector    Last
* Partition  Tag  Flags    Sector     Count    Sector  Mount Directory
       0      4    00        256 3907012495 3907012750
       8     11    00 3907012751      16384 3907029134

On Mon, Sep 24, 2012 at 12:20 AM, Timothy Coalson tsc...@mst.edu wrote:

I think you can fool a recent Illumos kernel into thinking a 4k disk is 512 (incurring a performance hit for that disk, and therefore for the vdev and pool, but to save a raidz1, it might be worth it): http://wiki.illumos.org/display/illumos/ZFS+and+Advanced+Format+disks - see "Overriding the Physical Sector Size". I don't know what you might have to do to coax it into doing the replace with a hot spare (zpool replace? export/import?).

Perhaps there should be a feature in ZFS that notifies you when a pool is created or imported with a hot spare that can't be automatically used in one or more vdevs? The whole point of hot spares is to have them automatically swap in when you aren't there to fiddle with things, which is a bad time to find out it won't work.
Tim

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
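For reference, the override that wiki page describes lives in sd.conf. A hedged, untested sketch of what such an entry might look like (the 8-character vendor-ID padding in the inquiry string must match the drive exactly, so check against the page before using it):

```
# /etc/driver/drv/sd.conf (illumos) - force the reported physical block
# size for one drive model, per the illumos wiki page "ZFS and Advanced
# Format disks". Example entry only; untested here:
sd-config-list =
    "ATA     ST2000VX000-9YW1", "physical-block-size:512";
```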
Re: [zfs-discuss] cannot replace X with Y: devices have different sector alignment
As does fdisk -G:

root@nas:~# fdisk -G /dev/rdsk/c16t5000C5002AA08E4Dd0
* Physical geometry for device /dev/rdsk/c16t5000C5002AA08E4Dd0
* PCYL   NCYL   ACYL  BCYL  NHEAD  NSECT  SECSIZ
  60800  60800  0     0     255    252    512

root@nas:~# fdisk -G /dev/rdsk/c16t5000C5005295F727d0
* Physical geometry for device /dev/rdsk/c16t5000C5005295F727d0
* PCYL   NCYL   ACYL  BCYL  NHEAD  NSECT  SECSIZ
  60800  60800  0     0     255    252    512
Re: [zfs-discuss] cannot replace X with Y: devices have different sector alignment
That's what I thought also, but since both prtvtoc and fdisk -G see the two disks as the same (and I have not overridden sector size), I am confused.

iostat -xnE:

c16t5000C5002AA08E4Dd0 Soft Errors: 0 Hard Errors: 323 Transport Errors: 489
Vendor: ATA Product: ST32000542AS Revision: CC34 Serial No: %FAKESERIAL%
Size: 2000.40GB <2000398934016 bytes>
Media Error: 207 Device Not Ready: 0 No Device: 116 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0

c16t5000C5005295F727d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA Product: ST2000VX000-9YW1 Revision: CV13 Serial No: %FAKESERIAL%
Size: 2000.40GB <2000398934016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0

zpool status:

  pool: rspool
 state: ONLINE
  scan: resilvered 719G in 65h28m with 0 errors on Fri Aug 24 04:21:44 2012
config:

        NAME                        STATE     READ WRITE CKSUM
        rspool                      ONLINE       0     0     0
          raidz1-0                  ONLINE       0     0     0
            c16t5000C5002AA08E4Dd0  ONLINE       0     0     0
            c16t5000C5002ABE78F5d0  ONLINE       0     0     0
            c16t5000C5002AC49840d0  ONLINE       0     0     0
            c16t50014EE057B72DD3d0  ONLINE       0     0     0
            c16t50014EE057B69208d0  ONLINE       0     0     0
        cache
          c4t2d0                    ONLINE       0     0     0
        spares
          c16t5000C5005295F727d0    AVAIL

errors: No known data errors

root@nas:~# zpool replace rspool c16t5000C5002AA08E4Dd0 c16t5000C5005295F727d0
cannot replace c16t5000C5002AA08E4Dd0 with c16t5000C5005295F727d0: devices have different sector alignment

On Mon, Sep 24, 2012 at 9:23 AM, Gregg Wonderly gregg...@gmail.com wrote:

What is the error message you are seeing on the replace? This sounds like a slice size/placement problem, but clearly, prtvtoc seems to think that everything is the same. Are you certain that you did prtvtoc on the correct drive, and not one of the active disks by mistake?
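One way to see the mismatch directly, since prtvtoc and fdisk -G only show logical geometry, is zdb, which dumps each top-level vdev's ashift. A sketch that parses a captured sample rather than the live pool (the sample values here are illustrative, not taken from rspool):

```shell
#!/bin/sh
# 'zdb rspool' (or 'zdb -C rspool') prints the pool configuration,
# including one 'ashift' line per top-level vdev: 9 means 512-byte
# sectors, 12 means 4K. Parsing a captured sample here:
cat <<'EOF' > zdb.sample
        children[0]:
            type: 'raidz'
            ashift: 9
EOF
awk -F': ' '/ashift/ { print "vdev ashift:", $2 }' zdb.sample
# -> vdev ashift: 9
```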
Re: [zfs-discuss] cannot replace X with Y: devices have different sector alignment
Any ideas?

On Mon, Sep 24, 2012 at 10:46 AM, LIC mesh licm...@gmail.com wrote:

That's what I thought also, but since both prtvtoc and fdisk -G see the two disks as the same (and I have not overridden sector size), I am confused.
[zfs-discuss] cannot replace X with Y: devices have different sector alignment
Well, this is a new one. Illumos/OpenIndiana let me add a device as a hot spare that evidently has a different sector alignment than all of the other drives in the array. So now I'm at the point that I /need/ a hot spare and it doesn't look like I have one. And, worse, the other spares I have are all the same model as said hot spare.

Is there anything I can do with this, or am I just going to be up the creek when any one of the other drives in the raidz1 fails?
Re: [zfs-discuss] Resliver making the system unresponsive
If we've found one bad disk, what are our options?

On Thu, Sep 30, 2010 at 10:12 AM, Richard Elling richard.ell...@gmail.com wrote:

On Sep 30, 2010, at 2:32 AM, Tuomas Leikola wrote:

On Thu, Sep 30, 2010 at 1:16 AM, Scott Meilicke scott.meili...@craneaerospace.com wrote:

Resilver speed has been beaten to death, I know, but is there a way to avoid this? For example, is more enterprisy hardware less susceptible to resilvers? This box is used for development VMs, but there is no way I would consider this for production with this kind of performance hit during a resilver.

According to http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6494473, resilver should in later builds have some option to limit rebuild speed in order to allow for more I/O during reconstruction, but I haven't found any guides on how to actually make use of this feature. Maybe someone can shed some light on this?

Simple: resilver activity is throttled using a delay method. Nothing to tune here. In general, if a resilver or scrub makes a system seem unresponsive, there is a root cause related to the I/O activity. To diagnose, I usually use iostat -zxCn 10 (or similar) and look for unusual asvc_t from a busy disk. One bad disk can ruin performance for the whole pool.
-- richard

--
OpenStorage Summit, October 25-27, Palo Alto, CA
http://nexenta-summit2010.eventbrite.com
ZFS and performance consulting
http://www.RichardElling.com
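Richard's iostat suggestion can be narrowed down mechanically. A sketch that flags any disk with a pathological average service time in a captured `iostat -xn` sample; the threshold (100 ms) and the sample values are mine, not from the thread:

```shell
#!/bin/sh
# Flag disks whose asvc_t (average service time, in ms) looks pathological.
# Field positions assume the illumos 'iostat -xn' layout: asvc_t is the
# 8th column and the device name is the last.
cat <<'EOF' > iostat.sample
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
   12.0   30.1  500.2  900.4  0.0  1.2    0.1    8.3   0  35 c1t9d0
    0.4    1.1    3.2    9.8  0.0  4.0    0.2  850.7   0  99 c1t20d0
EOF
awk 'NR > 1 && $8 > 100 { print $NF, "asvc_t =", $8 }' iostat.sample
# -> c1t20d0 asvc_t = 850.7
```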
[zfs-discuss] Is there any way to stop a resilver?
Is there any way to stop a resilver? We've got to stop this thing - at minimum, the completion estimate is 300,000 hours, and the maximum is in the millions. It's a raidz2 array, so it has the redundancy; we just need to get data off.
Re: [zfs-discuss] Is there any way to stop a resilver?
It's always running less than an hour. It usually starts at around a 300,000-hour estimate (at 1 minute in), goes up to an estimate in the millions (about 30 minutes in), and restarts. It never gets past 0.00% completion, and K resilvered, on any LUN.

64 LUNs: 32x5.44T and 32x10.88T, in 8 vdevs.

On Wed, Sep 29, 2010 at 11:40 AM, Scott Meilicke scott.meili...@craneaerospace.com wrote:

Has it been running long? Initially the numbers are *way* off. After a while it settles down into something reasonable. How many disks, and what size, are in your raidz2?

-Scott

--
We value your opinion! How may we serve you better? Please click the survey link to tell us how we are doing: http://www.craneae.com/surveys/satisfaction.htm
Your feedback is of the utmost importance to us. Thank you for your time.

Crane Aerospace & Electronics Confidentiality Statement: The information contained in this email message may be privileged and is confidential information intended only for the use of the recipient, or any employee or agent responsible to deliver it to the intended recipient. Any unauthorized use, distribution or copying of this information is strictly prohibited and may be unlawful. If you have received this communication in error, please notify the sender immediately and destroy the original message and all attachments from your electronic files.
[zfs-discuss] Fwd: Is there any way to stop a resilver?
This is an iSCSI/COMSTAR array. The head was running 2009.06 stable with version 14 ZFS, but we updated that to build 134 (we kept the old OS drives) - we did not, however, update the zpool; it's still version 14.

The targets are all running 2009.06 stable, each exporting 4 raidz1 LUNs of 6 drives - 8 shelves have 1TB drives, the other 8 have 2TB drives.

The head sees the filesystem as comprised of 8 vdevs of 8 iSCSI LUNs each, with SSD ZIL and SSD L2ARC.

On Wed, Sep 29, 2010 at 11:49 AM, Scott Meilicke scott.meili...@craneaerospace.com wrote:

What version of OS? Are snapshots running (turn them off)? So are there eight disks?
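The LUN counts in the two descriptions above are consistent; a quick check, pure arithmetic with the numbers from the thread:

```shell
#!/bin/sh
# 16 shelves x 4 raidz1 LUNs per shelf = 64 iSCSI LUNs, which the head
# groups into 8 vdevs of 8 LUNs each - matching "64 LUNs ... in 8 vdevs".
shelves=16
luns_per_shelf=4
luns=$(( shelves * luns_per_shelf ))
vdevs=$(( luns / 8 ))
echo "$luns LUNs in $vdevs vdevs"
# -> 64 LUNs in 8 vdevs
```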
Re: [zfs-discuss] Is there any way to stop a resilver?
Most likely an iSCSI timeout, but that was before my time here. Since then, there have been various individual drives lost along the way on the shelves, but never a whole LUN - so, theoretically, /except/ for iSCSI timeouts, there has been no great reason to resilver.

On Wed, Sep 29, 2010 at 11:51 AM, Lin Ling lin.l...@oracle.com wrote:

What caused the resilvering to kick off in the first place?

Lin
Re: [zfs-discuss] Resliver making the system unresponsive
Yeah, I'm having a combination of this and the "resilver constantly restarting" issue. And nothing to free up space.

It was recommended to me to replace any expanders I had between the HBA and the drives with extra HBAs, but my array doesn't have expanders. If yours does, you may want to try that. Otherwise, wait it out :(

On Wed, Sep 29, 2010 at 6:37 PM, Scott Meilicke sc...@kmclan.net wrote:

I should add I have 477 snapshots across all file systems. Most of them are hourly snaps (225 of them, anyway).

On Sep 29, 2010, at 3:16 PM, Scott Meilicke wrote:

This must be resilver day :) I just had a drive failure. The hot spare kicked in, and access to the pool over NFS was effectively zero for about 45 minutes. Currently the pool is still resilvering, but for some reason I can access the file system now.

Resilver speed has been beaten to death, I know, but is there a way to avoid this? For example, is more enterprisy hardware less susceptible to resilvers? This box is used for development VMs, but there is no way I would consider this for production with this kind of performance hit during a resilver.

My hardware:
Dell 2950
16G RAM
16-disk SAS chassis
LSI 3801 (I think) SAS card (1068e chip)
Intel X25-E SLOG off of the internal PERC 5/i RAID controller
Seagate 750G disks (7200.11)

I am running Nexenta CE 3.0.3 (SunOS rawhide 5.11 NexentaOS_134f i86pc i386 i86pc Solaris).

  pool: data01
 state: DEGRADED
status: One or more devices is currently being resilvered. The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Wed Sep 29 14:03:52 2010
    1.12T scanned out of 5.00T at 311M/s, 3h37m to go
    82.0G resilvered, 22.42% done
config:

        NAME           STATE     READ WRITE CKSUM
        data01         DEGRADED     0     0     0
          raidz2-0     ONLINE       0     0     0
            c1t8d0     ONLINE       0     0     0
            c1t9d0     ONLINE       0     0     0
            c1t10d0    ONLINE       0     0     0
            c1t11d0    ONLINE       0     0     0
            c1t12d0    ONLINE       0     0     0
            c1t13d0    ONLINE       0     0     0
            c1t14d0    ONLINE       0     0     0
          raidz2-1     DEGRADED     0     0     0
            c1t22d0    ONLINE       0     0     0
            c1t15d0    ONLINE       0     0     0
            c1t16d0    ONLINE       0     0     0
            c1t17d0    ONLINE       0     0     0
            c1t23d0    ONLINE       0     0     0
            spare-5    REMOVED      0     0     0
              c1t20d0  REMOVED      0     0     0
              c8t18d0  ONLINE       0     0     0  (resilvering)
            c1t21d0    ONLINE       0     0     0
        logs
          c0t1d0       ONLINE       0     0     0
        spares
          c8t18d0      INUSE     currently in use

errors: No known data errors

Thanks for any insights.

-Scott
--
This message posted from opensolaris.org
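As a sanity check, the ETA in Scott's zpool status can be reproduced from the figures on its own scan line (integer arithmetic, so it rounds slightly):

```shell
#!/bin/sh
# 5.00T total - 1.12T scanned = 3.88T to go, at 311M/s (treating T as TiB
# and M as MiB, which is how zpool status reports them).
remaining_mib=$(( (500 - 112) * 1048576 / 100 ))
eta_min=$(( remaining_mib / 311 / 60 ))
echo "about ${eta_min} minutes left"
# -> about 218 minutes left (zpool status said 3h37m)
```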
Re: [zfs-discuss] scrub: resilver in progress for 0h38m, 0.00% done, 1131207h51m to go
On Wed, Sep 22, 2010 at 8:13 PM, Richard Elling rich...@nexenta.com wrote:

On Sep 22, 2010, at 1:46 PM, LIC mesh wrote:

Something else is probably causing the slow I/O. What is the output of iostat -en? The best answer is all balls (balls == zeros).

Found a number of LUNs with errors this way. It looks like it has to do with network problems more so than the hardware, so we're going to try turning off LACP and using just 1 NIC.

For SATA drives, we find that zfs_vdev_max_pending = 2 can be needed in certain recovery cases.

We've played around with this on the individual shelves (it was originally set at 1 for quite a long time), but left the head at the default for build 134.

Yes. But some are not inexpensive. -- richard

What price range would we be looking at?

- Michael
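For what it's worth, the head's queue depth can be pinned the same way as the shelves'. A hedged sketch of the /etc/system line (takes effect at boot; the live value can also be changed with mdb -kw, not shown here):

```
* /etc/system fragment (Solaris/illumos) - limit per-vdev queued I/Os.
* Untested here; the value 2 is the one Richard suggests for SATA
* recovery cases above.
set zfs:zfs_vdev_max_pending = 2
```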
[zfs-discuss] scrub: resilver in progress for 0h38m, 0.00% done, 1131207h51m to go
What options are there to turn off or reduce the priority of a resilver?

This is on a 400TB iSCSI-based zpool (8 LUNs per raidz2 vdev, 4 LUNs per shelf, 6 drives per LUN - 16 shelves total). My client has gotten to the point that they just want to get their data off, but this resilver won't stop. Case in point:

$ dd if=/dev/zero of=out bs=64k count=20
(Ctrl-C'd out)
6+0 records in
6+0 records out
393216 bytes (393 kB) copied, 97.6778 s, 4.0 kB/s

This is /after/ an upgrade to build 134 on the head, hoping that zpool import's recovery mode would not resilver. Auto-snapshots are off, max_vdev_pending has been tuned down to 7, zfs_scrub_limit is set at 2, and zpool status is never run as root. Also, the filesystem has not been touched by anyone other than myself (including applications) in 2 months.

If we could flip a switch to put this thing into read-only mode (it's got the redundancy for it), that would totally fix this. I have been referred to http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6494473 - is this stable enough to build? Are there any other options to speed up getting data off of this monstrously huge filesystem?
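On the read-only wish: a read-only import does exist, but only in builds well after 134, so it is only an option after upgrading the head further. A sketch, with 'tank' as a placeholder pool name (the wrapper function below is mine, not part of ZFS):

```shell
#!/bin/sh
# Hedged sketch: re-import a pool read-only so no resilver or other
# writes can start. NOTE: 'zpool import -o readonly=on' is NOT in
# build 134; it arrived in later illumos builds.
make_pool_readonly() {
  pool=$1
  zpool export "$pool"
  zpool import -o readonly=on "$pool"
}
# make_pool_readonly tank   # not run here - needs the real pool
```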