Re: [zfs-discuss] cannot replace X with Y: devices have different sector alignment

2012-09-25 Thread LIC mesh
Thank you for the link!

Turns out that, even though I bought the WD20EARS and ST32000542AS
expecting a 4096-byte physical blocksize, they report 512.

The new drive I bought correctly identifies as having a 4096-byte blocksize!

So...OI doesn't like it merging with the existing pool.

Note: ST2000VX000-9YW1 reports physical blocksize of 4096B.  The other
drives that actually have 4096B blocks report 512B physical blocks.  This
is misleading, but they do it anyway.
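
For reference, on a Linux box the kernel's view of both numbers can be read
straight out of sysfs (sdb here is only a placeholder for whatever device
node the drive gets):

cat /sys/block/sdb/queue/logical_block_size
cat /sys/block/sdb/queue/physical_block_size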





On Mon, Sep 24, 2012 at 4:32 PM, Timothy Coalson tsc...@mst.edu wrote:

 I'm not sure how to definitively check physical sector size on
 solaris/illumos, but on linux, hdparm -I (capital i) or smartctl -i will do
 it.  OpenIndiana's smartctl doesn't output this information yet (and its
 smartctl doesn't work on SATA disks unless attached via a SAS chip).  The
 issue is complicated by having both a logical and a physical sector size,
 and as far as I am aware, on current disks, logical is always 512, which
 may be what is being reported in what you ran.  Some quick googling
 suggests that previously, it was not possible to use an existing utility to
 report the physical sector size on solaris, so someone wrote their own:


 http://solaris.kuehnke.de/archives/18-Checking-physical-sector-size-of-disks-on-Solaris.html

 So, if you want to make sure of the physical sector size, you could give
 that program a whirl (it compiled fine for me on oi_151a6, and runs, but it
 is not easy for me to attach a 4k sector disk to one of my OI machines, so
 I haven't confirmed its correctness), or temporarily transplant the spare
 in question to a linux machine (or live system) and use hdparm -I.
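
 For what it's worth, on a drive that reports 4k physical, the relevant
 lines look roughly like this (output trimmed, and the exact wording shifts
 a bit between versions):

 # hdparm -I /dev/sdb | grep -i 'sector size'
         Logical  Sector size:                   512 bytes
         Physical Sector size:                  4096 bytes

 # smartctl -i /dev/sdb | grep -i 'sector size'
 Sector Sizes:     512 bytes logical, 4096 bytes physical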

 Tim


 On Mon, Sep 24, 2012 at 2:37 PM, LIC mesh licm...@gmail.com wrote:

 Any ideas?


 On Mon, Sep 24, 2012 at 10:46 AM, LIC mesh licm...@gmail.com wrote:

 That's what I thought also, but since both prtvtoc and fdisk -G see the
 two disks as the same (and I have not overridden sector size), I am
 confused.
 *iostat -xnE:*
 c16t5000C5002AA08E4Dd0 Soft Errors: 0 Hard Errors: 323 Transport Errors:
 489
 Vendor: ATA  Product: ST32000542AS Revision: CC34 Serial No:
 %FAKESERIAL%
 Size: 2000.40GB 2000398934016 bytes
 Media Error: 207 Device Not Ready: 0 No Device: 116 Recoverable: 0
 Illegal Request: 0 Predictive Failure Analysis: 0
 c16t5000C5005295F727d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
 Vendor: ATA  Product: ST2000VX000-9YW1 Revision: CV13 Serial No:
 %FAKESERIAL%
 Size: 2000.40GB 2000398934016 bytes
 Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
 Illegal Request: 0 Predictive Failure Analysis: 0

 *zpool status:*
   pool: rspool
  state: ONLINE
   scan: resilvered 719G in 65h28m with 0 errors on Fri Aug 24 04:21:44
 2012
 config:

 NAME                        STATE     READ WRITE CKSUM
 rspool                      ONLINE       0     0     0
   raidz1-0                  ONLINE       0     0     0
     c16t5000C5002AA08E4Dd0  ONLINE       0     0     0
     c16t5000C5002ABE78F5d0  ONLINE       0     0     0
     c16t5000C5002AC49840d0  ONLINE       0     0     0
     c16t50014EE057B72DD3d0  ONLINE       0     0     0
     c16t50014EE057B69208d0  ONLINE       0     0     0
 cache
   c4t2d0                    ONLINE       0     0     0
 spares
   c16t5000C5005295F727d0    AVAIL

 errors: No known data errors

 *root@nas:~# zpool replace rspool c16t5000C5002AA08E4Dd0
 c16t5000C5005295F727d0*
 cannot replace c16t5000C5002AA08E4Dd0 with c16t5000C5005295F727d0:
 devices have different sector alignment
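
 Since prtvtoc and fdisk only show the label, it may also be worth checking
 what ZFS itself recorded: the alignment it enforces is the vdev's ashift,
 which zdb can display (a sketch; output format varies by build, and
 ashift=9 means 512B alignment while ashift=12 means 4KB):

 zdb -C rspool | grep ashift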



 On Mon, Sep 24, 2012 at 9:23 AM, Gregg Wonderly gregg...@gmail.com wrote:

 What is the error message you are seeing on the replace?  This sounds
 like a slice size/placement problem, but clearly, prtvtoc seems to think
 that everything is the same.  Are you certain that you did prtvtoc on the
 correct drive, and not one of the active disks by mistake?

 Gregg Wonderly

 As does fdisk -G:
 root@nas:~# fdisk -G /dev/rdsk/c16t5000C5002AA08E4Dd0
 * Physical geometry for device /dev/rdsk/c16t5000C5002AA08E4Dd0
 * PCYL NCYL ACYL BCYL NHEAD NSECT SECSIZ
  60800 60800 0    0    255   252   512
 You have new mail in /var/mail/root
 root@nas:~# fdisk -G /dev/rdsk/c16t5000C5005295F727d0
 * Physical geometry for device /dev/rdsk/c16t5000C5005295F727d0
 * PCYL NCYL ACYL BCYL NHEAD NSECT SECSIZ
  60800 60800 0    0    255   252   512



 On Mon, Sep 24, 2012 at 9:01 AM, LIC mesh licm...@gmail.com wrote:

 Yet another weird thing - prtvtoc shows both drives as having the same
 sector size,  etc:
 root@nas:~# prtvtoc /dev/rdsk/c16t5000C5002AA08E4Dd0
 * /dev/rdsk/c16t5000C5002AA08E4Dd0 partition map
 *
 * Dimensions:
 * 512 bytes/sector
 * 3907029168 sectors
 * 3907029101 accessible sectors
 *
 * Flags:
 *   1: unmountable
 *  10: read-only
 *
 * Unallocated space:
 *   First

Re: [zfs-discuss] cannot replace X with Y: devices have different sector alignment

2012-09-24 Thread LIC mesh
Yet another weird thing - prtvtoc shows both drives as having the same
sector size,  etc:
root@nas:~# prtvtoc /dev/rdsk/c16t5000C5002AA08E4Dd0
* /dev/rdsk/c16t5000C5002AA08E4Dd0 partition map
*
* Dimensions:
* 512 bytes/sector
* 3907029168 sectors
* 3907029101 accessible sectors
*
* Flags:
*   1: unmountable
*  10: read-only
*
* Unallocated space:
*       First     Sector    Last
*       Sector    Count     Sector
*           34       222       255
*
*                          First     Sector    Last
* Partition  Tag  Flags    Sector     Count    Sector  Mount Directory
       0      4    00        256 3907012495 3907012750
       8     11    00 3907012751      16384 3907029134
root@nas:~# prtvtoc /dev/rdsk/c16t5000C5005295F727d0
* /dev/rdsk/c16t5000C5005295F727d0 partition map
*
* Dimensions:
* 512 bytes/sector
* 3907029168 sectors
* 3907029101 accessible sectors
*
* Flags:
*   1: unmountable
*  10: read-only
*
* Unallocated space:
*       First     Sector    Last
*       Sector    Count     Sector
*           34       222       255
*
*                          First     Sector    Last
* Partition  Tag  Flags    Sector     Count    Sector  Mount Directory
       0      4    00        256 3907012495 3907012750
       8     11    00 3907012751      16384 3907029134





On Mon, Sep 24, 2012 at 12:20 AM, Timothy Coalson tsc...@mst.edu wrote:

 I think you can fool a recent Illumos kernel into thinking a 4k disk is
 512 (incurring a performance hit for that disk, and therefore the vdev and
 pool, but to save a raidz1, it might be worth it):

 http://wiki.illumos.org/display/illumos/ZFS+and+Advanced+Format+disks ,
 see Overriding the Physical Sector Size
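
 (The override goes in /kernel/drv/sd.conf; a sketch of what an entry for
 the spare in this thread might look like, with the caveat that the
 vendor/product strings are matched against inquiry data and the padding
 matters, so check the wiki page for the exact rules:

 sd-config-list =
     "ATA     ST2000VX000-9YW1", "physical-block-size:512";

 followed by a reboot, or update_drv -vf sd, so the sd driver rereads it.)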

 I don't know what you might have to do to coax it to do the replace with a
 hot spare (zpool replace? export/import?).  Perhaps there should be a
 feature in ZFS that notifies when a pool is created or imported with a hot
 spare that can't be automatically used in one or more vdevs?  The whole
 point of hot spares is to have them automatically swap in when you aren't
 there to fiddle with things, which is a bad time to find out it won't work.

 Tim

 On Sun, Sep 23, 2012 at 10:52 PM, LIC mesh licm...@gmail.com wrote:

 Well this is a new one

 Illumos/Openindiana let me add a device as a hot spare that evidently has
 a different sector alignment than all of the other drives in the array.

 So now I'm at the point that I /need/ a hot spare and it doesn't look
 like I have it.

 And, worse, the other spares I have are all the same model as said hot
 spare.

 Is there anything I can do with this or am I just going to be up the
 creek when any one of the other drives in the raidz1 fails?


 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] cannot replace X with Y: devices have different sector alignment

2012-09-24 Thread LIC mesh
As does fdisk -G:
root@nas:~# fdisk -G /dev/rdsk/c16t5000C5002AA08E4Dd0
* Physical geometry for device /dev/rdsk/c16t5000C5002AA08E4Dd0
* PCYL NCYL ACYL BCYL NHEAD NSECT SECSIZ
  60800 60800 0    0    255   252   512
You have new mail in /var/mail/root
root@nas:~# fdisk -G /dev/rdsk/c16t5000C5005295F727d0
* Physical geometry for device /dev/rdsk/c16t5000C5005295F727d0
* PCYL NCYL ACYL BCYL NHEAD NSECT SECSIZ
  60800 60800 0    0    255   252   512



On Mon, Sep 24, 2012 at 9:01 AM, LIC mesh licm...@gmail.com wrote:

 Yet another weird thing - prtvtoc shows both drives as having the same
 sector size,  etc:
 root@nas:~# prtvtoc /dev/rdsk/c16t5000C5002AA08E4Dd0
 * /dev/rdsk/c16t5000C5002AA08E4Dd0 partition map
 *
 * Dimensions:
 * 512 bytes/sector
 * 3907029168 sectors
 * 3907029101 accessible sectors
 *
 * Flags:
 *   1: unmountable
 *  10: read-only
 *
 * Unallocated space:
 *       First     Sector    Last
 *       Sector    Count     Sector
 *           34       222       255
 *
 *                          First     Sector    Last
 * Partition  Tag  Flags    Sector     Count    Sector  Mount Directory
        0      4    00        256 3907012495 3907012750
        8     11    00 3907012751      16384 3907029134
 root@nas:~# prtvtoc /dev/rdsk/c16t5000C5005295F727d0
 * /dev/rdsk/c16t5000C5005295F727d0 partition map
 *
 * Dimensions:
 * 512 bytes/sector
 * 3907029168 sectors
 * 3907029101 accessible sectors
 *
 * Flags:
 *   1: unmountable
 *  10: read-only
 *
 * Unallocated space:
 *       First     Sector    Last
 *       Sector    Count     Sector
 *           34       222       255
 *
 *                          First     Sector    Last
 * Partition  Tag  Flags    Sector     Count    Sector  Mount Directory
        0      4    00        256 3907012495 3907012750
        8     11    00 3907012751      16384 3907029134





 On Mon, Sep 24, 2012 at 12:20 AM, Timothy Coalson tsc...@mst.edu wrote:

 I think you can fool a recent Illumos kernel into thinking a 4k disk is
 512 (incurring a performance hit for that disk, and therefore the vdev and
 pool, but to save a raidz1, it might be worth it):

 http://wiki.illumos.org/display/illumos/ZFS+and+Advanced+Format+disks ,
 see Overriding the Physical Sector Size

 I don't know what you might have to do to coax it to do the replace with
 a hot spare (zpool replace? export/import?).  Perhaps there should be a
 feature in ZFS that notifies when a pool is created or imported with a hot
 spare that can't be automatically used in one or more vdevs?  The whole
 point of hot spares is to have them automatically swap in when you aren't
 there to fiddle with things, which is a bad time to find out it won't work.

 Tim

 On Sun, Sep 23, 2012 at 10:52 PM, LIC mesh licm...@gmail.com wrote:

 Well this is a new one

 Illumos/Openindiana let me add a device as a hot spare that evidently
 has a different sector alignment than all of the other drives in the array.

 So now I'm at the point that I /need/ a hot spare and it doesn't look
 like I have it.

 And, worse, the other spares I have are all the same model as said hot
 spare.

 Is there anything I can do with this or am I just going to be up the
 creek when any one of the other drives in the raidz1 fails?


 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss




___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] cannot replace X with Y: devices have different sector alignment

2012-09-24 Thread LIC mesh
That's what I thought also, but since both prtvtoc and fdisk -G see the two
disks as the same (and I have not overridden sector size), I am confused.
*iostat -xnE:*
c16t5000C5002AA08E4Dd0 Soft Errors: 0 Hard Errors: 323 Transport Errors:
489
Vendor: ATA  Product: ST32000542AS Revision: CC34 Serial No:
%FAKESERIAL%
Size: 2000.40GB 2000398934016 bytes
Media Error: 207 Device Not Ready: 0 No Device: 116 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
c16t5000C5005295F727d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA  Product: ST2000VX000-9YW1 Revision: CV13 Serial No:
%FAKESERIAL%
Size: 2000.40GB 2000398934016 bytes
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0

*zpool status:*
  pool: rspool
 state: ONLINE
  scan: resilvered 719G in 65h28m with 0 errors on Fri Aug 24 04:21:44 2012
config:

NAME                        STATE     READ WRITE CKSUM
rspool                      ONLINE       0     0     0
  raidz1-0                  ONLINE       0     0     0
    c16t5000C5002AA08E4Dd0  ONLINE       0     0     0
    c16t5000C5002ABE78F5d0  ONLINE       0     0     0
    c16t5000C5002AC49840d0  ONLINE       0     0     0
    c16t50014EE057B72DD3d0  ONLINE       0     0     0
    c16t50014EE057B69208d0  ONLINE       0     0     0
cache
  c4t2d0                    ONLINE       0     0     0
spares
  c16t5000C5005295F727d0    AVAIL

errors: No known data errors

*root@nas:~# zpool replace rspool c16t5000C5002AA08E4Dd0
c16t5000C5005295F727d0*
cannot replace c16t5000C5002AA08E4Dd0 with c16t5000C5005295F727d0: devices
have different sector alignment



On Mon, Sep 24, 2012 at 9:23 AM, Gregg Wonderly gregg...@gmail.com wrote:

 What is the error message you are seeing on the replace?  This sounds
 like a slice size/placement problem, but clearly, prtvtoc seems to think
 that everything is the same.  Are you certain that you did prtvtoc on the
 correct drive, and not one of the active disks by mistake?

 Gregg Wonderly

 As does fdisk -G:
 root@nas:~# fdisk -G /dev/rdsk/c16t5000C5002AA08E4Dd0
 * Physical geometry for device /dev/rdsk/c16t5000C5002AA08E4Dd0
 * PCYL NCYL ACYL BCYL NHEAD NSECT SECSIZ
  60800 60800 0    0    255   252   512
 You have new mail in /var/mail/root
 root@nas:~# fdisk -G /dev/rdsk/c16t5000C5005295F727d0
 * Physical geometry for device /dev/rdsk/c16t5000C5005295F727d0
 * PCYL NCYL ACYL BCYL NHEAD NSECT SECSIZ
  60800 60800 0    0    255   252   512



 On Mon, Sep 24, 2012 at 9:01 AM, LIC mesh licm...@gmail.com wrote:

 Yet another weird thing - prtvtoc shows both drives as having the same
 sector size,  etc:
 root@nas:~# prtvtoc /dev/rdsk/c16t5000C5002AA08E4Dd0
 * /dev/rdsk/c16t5000C5002AA08E4Dd0 partition map
 *
 * Dimensions:
 * 512 bytes/sector
 * 3907029168 sectors
 * 3907029101 accessible sectors
 *
 * Flags:
 *   1: unmountable
 *  10: read-only
 *
 * Unallocated space:
 *       First     Sector    Last
 *       Sector    Count     Sector
 *           34       222       255
 *
 *                          First     Sector    Last
 * Partition  Tag  Flags    Sector     Count    Sector  Mount Directory
        0      4    00        256 3907012495 3907012750
        8     11    00 3907012751      16384 3907029134
 root@nas:~# prtvtoc /dev/rdsk/c16t5000C5005295F727d0
 * /dev/rdsk/c16t5000C5005295F727d0 partition map
 *
 * Dimensions:
 * 512 bytes/sector
 * 3907029168 sectors
 * 3907029101 accessible sectors
 *
 * Flags:
 *   1: unmountable
 *  10: read-only
 *
 * Unallocated space:
 *       First     Sector    Last
 *       Sector    Count     Sector
 *           34       222       255
 *
 *                          First     Sector    Last
 * Partition  Tag  Flags    Sector     Count    Sector  Mount Directory
        0      4    00        256 3907012495 3907012750
        8     11    00 3907012751      16384 3907029134





 On Mon, Sep 24, 2012 at 12:20 AM, Timothy Coalson tsc...@mst.edu wrote:

 I think you can fool a recent Illumos kernel into thinking a 4k disk is
 512 (incurring a performance hit for that disk, and therefore the vdev and
 pool, but to save a raidz1, it might be worth it):

 http://wiki.illumos.org/display/illumos/ZFS+and+Advanced+Format+disks ,
 see Overriding the Physical Sector Size

 I don't know what you might have to do to coax it to do the replace with
 a hot spare (zpool replace? export/import?).  Perhaps there should be a
 feature in ZFS that notifies when a pool is created or imported with a hot
 spare that can't be automatically used in one or more vdevs?  The whole
 point of hot spares is to have them automatically swap in when you aren't
 there to fiddle with things, which is a bad time to find out it won't work.

 Tim

 On Sun, Sep 23, 2012 at 10:52 PM, LIC mesh licm...@gmail.com

Re: [zfs-discuss] cannot replace X with Y: devices have different sector alignment

2012-09-24 Thread LIC mesh
Any ideas?

On Mon, Sep 24, 2012 at 10:46 AM, LIC mesh licm...@gmail.com wrote:

 That's what I thought also, but since both prtvtoc and fdisk -G see the
 two disks as the same (and I have not overridden sector size), I am
 confused.
 *iostat -xnE:*
 c16t5000C5002AA08E4Dd0 Soft Errors: 0 Hard Errors: 323 Transport Errors:
 489
 Vendor: ATA  Product: ST32000542AS Revision: CC34 Serial No:
 %FAKESERIAL%
 Size: 2000.40GB 2000398934016 bytes
 Media Error: 207 Device Not Ready: 0 No Device: 116 Recoverable: 0
 Illegal Request: 0 Predictive Failure Analysis: 0
 c16t5000C5005295F727d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
 Vendor: ATA  Product: ST2000VX000-9YW1 Revision: CV13 Serial No:
 %FAKESERIAL%
 Size: 2000.40GB 2000398934016 bytes
 Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
 Illegal Request: 0 Predictive Failure Analysis: 0

 *zpool status:*
   pool: rspool
  state: ONLINE
   scan: resilvered 719G in 65h28m with 0 errors on Fri Aug 24 04:21:44 2012
 config:

 NAME                        STATE     READ WRITE CKSUM
 rspool                      ONLINE       0     0     0
   raidz1-0                  ONLINE       0     0     0
     c16t5000C5002AA08E4Dd0  ONLINE       0     0     0
     c16t5000C5002ABE78F5d0  ONLINE       0     0     0
     c16t5000C5002AC49840d0  ONLINE       0     0     0
     c16t50014EE057B72DD3d0  ONLINE       0     0     0
     c16t50014EE057B69208d0  ONLINE       0     0     0
 cache
   c4t2d0                    ONLINE       0     0     0
 spares
   c16t5000C5005295F727d0    AVAIL

 errors: No known data errors

 *root@nas:~# zpool replace rspool c16t5000C5002AA08E4Dd0
 c16t5000C5005295F727d0*
 cannot replace c16t5000C5002AA08E4Dd0 with c16t5000C5005295F727d0: devices
 have different sector alignment



 On Mon, Sep 24, 2012 at 9:23 AM, Gregg Wonderly gregg...@gmail.com wrote:

 What is the error message you are seeing on the replace?  This sounds
 like a slice size/placement problem, but clearly, prtvtoc seems to think
 that everything is the same.  Are you certain that you did prtvtoc on the
 correct drive, and not one of the active disks by mistake?

 Gregg Wonderly

 As does fdisk -G:
 root@nas:~# fdisk -G /dev/rdsk/c16t5000C5002AA08E4Dd0
 * Physical geometry for device /dev/rdsk/c16t5000C5002AA08E4Dd0
 * PCYL NCYL ACYL BCYL NHEAD NSECT SECSIZ
  60800 60800 0    0    255   252   512
 You have new mail in /var/mail/root
 root@nas:~# fdisk -G /dev/rdsk/c16t5000C5005295F727d0
 * Physical geometry for device /dev/rdsk/c16t5000C5005295F727d0
 * PCYL NCYL ACYL BCYL NHEAD NSECT SECSIZ
  60800 60800 0    0    255   252   512



 On Mon, Sep 24, 2012 at 9:01 AM, LIC mesh licm...@gmail.com wrote:

 Yet another weird thing - prtvtoc shows both drives as having the same
 sector size,  etc:
 root@nas:~# prtvtoc /dev/rdsk/c16t5000C5002AA08E4Dd0
 * /dev/rdsk/c16t5000C5002AA08E4Dd0 partition map
 *
 * Dimensions:
 * 512 bytes/sector
 * 3907029168 sectors
 * 3907029101 accessible sectors
 *
 * Flags:
 *   1: unmountable
 *  10: read-only
 *
 * Unallocated space:
 *       First     Sector    Last
 *       Sector    Count     Sector
 *           34       222       255
 *
 *                          First     Sector    Last
 * Partition  Tag  Flags    Sector     Count    Sector  Mount Directory
        0      4    00        256 3907012495 3907012750
        8     11    00 3907012751      16384 3907029134
 root@nas:~# prtvtoc /dev/rdsk/c16t5000C5005295F727d0
 * /dev/rdsk/c16t5000C5005295F727d0 partition map
 *
 * Dimensions:
 * 512 bytes/sector
 * 3907029168 sectors
 * 3907029101 accessible sectors
 *
 * Flags:
 *   1: unmountable
 *  10: read-only
 *
 * Unallocated space:
 *       First     Sector    Last
 *       Sector    Count     Sector
 *           34       222       255
 *
 *                          First     Sector    Last
 * Partition  Tag  Flags    Sector     Count    Sector  Mount Directory
        0      4    00        256 3907012495 3907012750
        8     11    00 3907012751      16384 3907029134





 On Mon, Sep 24, 2012 at 12:20 AM, Timothy Coalson tsc...@mst.edu wrote:

 I think you can fool a recent Illumos kernel into thinking a 4k disk is
 512 (incurring a performance hit for that disk, and therefore the vdev and
 pool, but to save a raidz1, it might be worth it):

 http://wiki.illumos.org/display/illumos/ZFS+and+Advanced+Format+disks ,
 see Overriding the Physical Sector Size

 I don't know what you might have to do to coax it to do the replace
 with a hot spare (zpool replace? export/import?).  Perhaps there should be
 a feature in ZFS that notifies when a pool is created or imported with a
 hot spare that can't be automatically used in one or more vdevs?  The whole
 point of hot spares is to have them automatically swap in when you aren't
 there to fiddle with things

[zfs-discuss] cannot replace X with Y: devices have different sector alignment

2012-09-23 Thread LIC mesh
Well this is a new one

Illumos/Openindiana let me add a device as a hot spare that evidently has a
different sector alignment than all of the other drives in the array.

So now I'm at the point that I /need/ a hot spare and it doesn't look like
I have it.

And, worse, the other spares I have are all the same model as said hot
spare.

Is there anything I can do with this or am I just going to be up the creek
when any one of the other drives in the raidz1 fails?
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Resilver making the system unresponsive

2010-09-30 Thread LIC mesh
If we've found one bad disk, what are our options?





On Thu, Sep 30, 2010 at 10:12 AM, Richard Elling
richard.ell...@gmail.com wrote:

 On Sep 30, 2010, at 2:32 AM, Tuomas Leikola wrote:

  On Thu, Sep 30, 2010 at 1:16 AM, Scott Meilicke 
 scott.meili...@craneaerospace.com wrote:
  Resilver speed has been beaten to death I know, but is there a way to
 avoid this? For example, is more enterprisy hardware less susceptible to
 resilvers? This box is used for development VMs, but there is no way I would
 consider this for production with this kind of performance hit during a
 resilver.
 
 
  According to
 
  http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6494473
 
  resilver should in later builds have some option to limit rebuild speed
 in order to allow for more IO during reconstruction, but I haven't found
 any guides on how to actually make use of this feature. Maybe someone can
 shed some light on this?

 Simple.  Resilver activity is throttled using a delay method.  Nothing to
 tune here.
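
 (In builds that have the throttle, the delay does surface as a kernel
 variable; a sketch of inspecting and zeroing it with mdb, assuming the
 variable is named zfs_resilver_delay in your build:

 echo zfs_resilver_delay/D | mdb -k
 echo zfs_resilver_delay/W0t0 | mdb -kw

 Note that zeroing it speeds the resilver at the cost of interactive I/O;
 raising it does the opposite.)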

 In general, if resilver or scrub make a system seem unresponsive, there is
 a
 root cause that is related to the I/O activity. To diagnose, I usually use
 iostat -zxCn 10
 (or similar) and look for unusual asvc_t from a busy disk.  One bad disk
 can ruin
 performance for the whole pool.
  -- richard

 --
 OpenStorage Summit, October 25-27, Palo Alto, CA
 http://nexenta-summit2010.eventbrite.com
 ZFS and performance consulting
 http://www.RichardElling.com












 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Is there any way to stop a resilver?

2010-09-29 Thread LIC mesh
Is there any way to stop a resilver?

We gotta stop this thing - at minimum, completion time is 300,000 hours, and
maximum is in the millions.

Raidz2 array, so it has the redundancy, we just need to get data off.
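
(For anyone searching the archives later: as far as I know, zpool scrub -s
<pool> cancels only a scrub, not a resilver; in builds of this vintage a
resilver cannot be stopped outright, only starved or waited out.)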
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Is there any way to stop a resilver?

2010-09-29 Thread LIC mesh
It's always running less than an hour.

It usually starts at around a 300,000h estimate (at 1m in), goes up to an
estimate in the millions (about 30 mins in), and restarts.

Never gets past 0.00% completion, and 0K resilvered on any LUN.

64 LUNs, 32x5.44T, 32x10.88T in 8 vdevs.




On Wed, Sep 29, 2010 at 11:40 AM, Scott Meilicke 
scott.meili...@craneaerospace.com wrote:

  Has it been running long? Initially the numbers are *way* off. After a
 while it settles down into something reasonable.

 How many disks, and what size, are in your raidz2?

 -Scott


 On 9/29/10 8:36 AM, LIC mesh licm...@gmail.com wrote:

 Is there any way to stop a resilver?

 We gotta stop this thing - at minimum, completion time is 300,000 hours,
 and maximum is in the millions.

 Raidz2 array, so it has the redundancy, we just need to get data off.


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Fwd: Is there any way to stop a resilver?

2010-09-29 Thread LIC mesh
This is an iSCSI/COMSTAR array.

The head was running 2009.06 stable with version 14 ZFS, but we updated that
to build 134 (kept the old OS drives) - did not, however, update the zpool -
it's still version 14.

The targets are all running 2009.06 stable, exporting 4 raidz1 LUNs each of
6 drives - 8 shelves have 1TB drives, the other 8 have 2TB drives.

The head sees the filesystem as comprised of 8 vdevs of 8 iSCSI LUNs each,
with SSD ZIL and SSD L2ARC.



On Wed, Sep 29, 2010 at 11:49 AM, Scott Meilicke 
scott.meili...@craneaerospace.com wrote:

  What version of OS?
 Are snapshots running (turn them off).

 So are there eight disks?



 On 9/29/10 8:46 AM, LIC mesh licm...@gmail.com wrote:

 It's always running less than an hour.

 It usually starts at around a 300,000h estimate (at 1m in), goes up to an
 estimate in the millions (about 30 mins in), and restarts.

 Never gets past 0.00% completion, and 0K resilvered on any LUN.

 64 LUNs, 32x5.44T, 32x10.88T in 8 vdevs.




 On Wed, Sep 29, 2010 at 11:40 AM, Scott Meilicke 
 scott.meili...@craneaerospace.com wrote:

 Has it been running long? Initially the numbers are *way* off. After a
 while it settles down into something reasonable.

 How many disks, and what size, are in your raidz2?

 -Scott


 On 9/29/10 8:36 AM, LIC mesh licm...@gmail.com wrote:

 Is there any way to stop a resilver?

 We gotta stop this thing - at minimum, completion time is 300,000 hours,
 and maximum is in the millions.

 Raidz2 array, so it has the redundancy, we just need to get data off.


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Is there any way to stop a resilver?

2010-09-29 Thread LIC mesh
Most likely an iSCSI timeout, but that was before my time here.

Since then, there have been various individual drives lost along the way on
the shelves, but never a whole LUN, so, theoretically, /except/ for iSCSI
timeouts, there has been no great reason to resilver.



On Wed, Sep 29, 2010 at 11:51 AM, Lin Ling lin.l...@oracle.com wrote:


 What caused the resilvering to kick off in the first place?

 Lin

 On Sep 29, 2010, at 8:46 AM, LIC mesh wrote:

 It's always running less than an hour.

 It usually starts at around a 300,000h estimate (at 1m in), goes up to an
 estimate in the millions (about 30 mins in), and restarts.

 Never gets past 0.00% completion, and 0K resilvered on any LUN.

 64 LUNs, 32x5.44T, 32x10.88T in 8 vdevs.




 On Wed, Sep 29, 2010 at 11:40 AM, Scott Meilicke 
 scott.meili...@craneaerospace.com wrote:

  Has it been running long? Initially the numbers are *way* off. After a
 while it settles down into something reasonable.

 How many disks, and what size, are in your raidz2?

 -Scott


 On 9/29/10 8:36 AM, LIC mesh licm...@gmail.com wrote:

 Is there any way to stop a resilver?

 We gotta stop this thing - at minimum, completion time is 300,000 hours,
 and maximum is in the millions.

 Raidz2 array, so it has the redundancy, we just need to get data off.





 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Resilver making the system unresponsive

2010-09-29 Thread LIC mesh
Yeah, I'm having a combination of this and the resilver-constantly-restarting
issue.

And nothing to free up space.

It was recommended to me to replace any expanders I had between the HBA and
the drives with extra HBAs, but my array doesn't have expanders.

If yours does, you may want to try that.

Otherwise, wait it out :(




On Wed, Sep 29, 2010 at 6:37 PM, Scott Meilicke sc...@kmclan.net wrote:

 I should add I have 477 snapshots across all file systems. Most of them
 are hourly snaps (225 of them anyway).

 On Sep 29, 2010, at 3:16 PM, Scott Meilicke wrote:

  This must be resilver day :)
 
  I just had a drive failure. The hot spare kicked in, and access to the
 pool over NFS was effectively zero for about 45 minutes. Currently the pool
 is still resilvering, but for some reason I can access the file system now.
 
  Resilver speed has been beaten to death I know, but is there a way to
 avoid this? For example, is more enterprisy hardware less susceptible to
 resilvers? This box is used for development VMs, but there is no way I would
 consider this for production with this kind of performance hit during a
 resilver.
 
  My hardware:
  Dell 2950
  16G ram
  16 disk SAS chassis
  LSI 3801 (I think) SAS card (1068e chip)
  Intel x25-e SLOG off of the internal PERC 5/i RAID controller
  Seagate 750G disks (7200.11)
 
  I am running Nexenta CE 3.0.3 (SunOS rawhide 5.11 NexentaOS_134f i86pc
 i386 i86pc Solaris)
 
   pool: data01
  state: DEGRADED
  status: One or more devices is currently being resilvered.  The pool will
continue to function, possibly in a degraded state.
  action: Wait for the resilver to complete.
  scan: resilver in progress since Wed Sep 29 14:03:52 2010
 1.12T scanned out of 5.00T at 311M/s, 3h37m to go
 82.0G resilvered, 22.42% done
  config:
 
        NAME           STATE     READ WRITE CKSUM
        data01         DEGRADED     0     0     0
          raidz2-0     ONLINE       0     0     0
            c1t8d0     ONLINE       0     0     0
            c1t9d0     ONLINE       0     0     0
            c1t10d0    ONLINE       0     0     0
            c1t11d0    ONLINE       0     0     0
            c1t12d0    ONLINE       0     0     0
            c1t13d0    ONLINE       0     0     0
            c1t14d0    ONLINE       0     0     0
          raidz2-1     DEGRADED     0     0     0
            c1t22d0    ONLINE       0     0     0
            c1t15d0    ONLINE       0     0     0
            c1t16d0    ONLINE       0     0     0
            c1t17d0    ONLINE       0     0     0
            c1t23d0    ONLINE       0     0     0
            spare-5    REMOVED      0     0     0
              c1t20d0  REMOVED      0     0     0
              c8t18d0  ONLINE       0     0     0  (resilvering)
            c1t21d0    ONLINE       0     0     0
        logs
          c0t1d0       ONLINE       0     0     0
        spares
          c8t18d0      INUSE     currently in use
 
  errors: No known data errors
 
  Thanks for any insights.
 
  -Scott
  --
  This message posted from opensolaris.org
  ___
  zfs-discuss mailing list
  zfs-discuss@opensolaris.org
  http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

 Scott Meilicke



 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] scrub: resilver in progress for 0h38m, 0.00% done, 1131207h51m to go

2010-09-23 Thread LIC mesh
On Wed, Sep 22, 2010 at 8:13 PM, Richard Elling rich...@nexenta.com wrote:

 On Sep 22, 2010, at 1:46 PM, LIC mesh wrote:

 Something else is probably causing the slow I/O.  What is the output of
 iostat -en ?  The best answer is all balls  (balls == zeros)
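
 (The -e columns are the soft, hard, and transport error counts plus a
 total, one row per device; roughly:

   ---- errors ---
   s/w h/w trn tot device
     0   0   0   0 c0t0d0

 so all balls means every row reads 0 0 0 0.)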

  Found a number of LUNs with errors this way; looks like it has to do with
network problems more so than the hardware, so we're going to try turning off
LACP and using just 1 NIC.



 For SATA drives, we find that zfs_vdev_max_pending = 2 can be needed in
 certain recovery cases.
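
 A sketch of checking and setting that on a live system, assuming the stock
 kernel variable name:

 echo zfs_vdev_max_pending/D | mdb -k
 echo zfs_vdev_max_pending/W0t2 | mdb -kw

 or, to persist across reboots, set zfs:zfs_vdev_max_pending = 2 in
 /etc/system.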

 We've played around with this on the individual shelves (it was originally
set at 1 for quite a long time), but left the head at the default for
build 134.



 Yes.  But some are not inexpensive.
  -- richard

 What price range would we be looking at?

 - Michael
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] scrub: resilver in progress for 0h38m, 0.00% done, 1131207h51m to go

2010-09-22 Thread LIC mesh
What options are there to turn off or reduce the priority of a resilver?

This is on a 400TB iSCSI based zpool (8 LUNs per raidz2 vdev, 4 LUNs per
shelf, 6 drives per LUN - 16 shelves total) - my client has gotten to the
point that they just want to get their data off, but this resilver won't
stop.

Case in point:
$ dd if=/dev/zero of=out bs=64k count=20;
(Ctrl-C'd out)
6+0 records in
6+0 records out
393216 bytes (393 kB) copied, 97.6778 s, 4.0 kB/s

This is /after/ an upgrade to build 134 on the head, hoping that zpool
import's recovery mode would not resilver.

Auto-snapshots are off, max_vdev_pending has been tuned down to 7,
zfs_scrub_limit is set at 2, zpool status is never run as root.  Also, the
filesystem has not been touched by anyone other than myself (including
applications) in 2 months.

If we could flip a switch to turn this thing into read-only mode (it's got
the redundancy for it), that would totally fix this.
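
(Newer ZFS builds do grow exactly that switch; on a system recent enough to
have it, something along the lines of

zpool export <pool>
zpool import -o readonly=on <pool>

brings the pool up without any resilver writes. As far as I know, build 134
predates it.)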

I have been referred to
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6494473 - is this
stable enough to build?

Are there any other options to speed up
getting-data-off-of-this-monstrously-huge-filesystem?
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss