Brandon,

Thanks for the reply.

I believe this is more related to the variable stripe size of RAIDZ
than to the fdisk MBR. I say this because the disk works without any issues
in a mirror configuration or standalone, reaching 80 MB/s burst transfer
rates.
In RAIDZ, however, the transfer rates are on the order of KB/s.

Of course, the variable stripe size does not respect any I/O alignment
when the disk's firmware does not expose the real 4K sector size, and thus
the performance is horrible.

Besides, the disk has an EFI label:

             Total disk size is 60800 cylinders
             Cylinder size is 32130 (512 byte) blocks

                                               Cylinders
      Partition   Status    Type          Start   End   Length    %
      =========   ======    ============  =====   ===   ======   ===
          1                 EFI               0  60800    60801    100

...that uses the whole disk:

prtvtoc command output:

* /dev/rdsk/c12t0d0 partition map
*
* Dimensions:
*     512 bytes/sector
* 1953525168 sectors
* 1953525101 accessible sectors
*
* Flags:
*   1: unmountable
*  10: read-only
*
* Unallocated space:
*       First     Sector    Last
*       Sector     Count    Sector
*          34       222       255
*
*                          First     Sector    Last
* Partition  Tag  Flags    Sector     Count    Sector  Mount Directory
       0      4    00        256 1953508495 1953508750
       8     11    00  1953508751     16384 1953525134



Though there's an MBR in there (you can check it out with dd), I know that
it doesn't affect the alignment because the usable slice starts at sector
256, and since that is a multiple of 8 (8 x 512-byte logical sectors = one
4K physical sector), it maintains the 4K physical sector alignment.
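
For anyone who wants to double-check the arithmetic, here is a minimal
sketch in Python (the sector numbers come from the prtvtoc output above; the
8-logical-sectors-per-physical figure is simply 4096 / 512):

# A 4K-sector drive packs 8 logical 512-byte sectors into each physical sector.
LOGICAL_BYTES = 512
PHYSICAL_BYTES = 4096
SECTORS_PER_PHYS = PHYSICAL_BYTES // LOGICAL_BYTES   # 8

def is_4k_aligned(start_lba):
    # A slice is aligned when its starting LBA sits on a physical-sector boundary.
    return start_lba % SECTORS_PER_PHYS == 0

print(is_4k_aligned(256))   # True  -> slice 0 above starts on a 4K boundary
print(is_4k_aligned(1))     # False -> a slice starting right after the MBR would not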

The situation would be different if the usable slice had started at
sector 1, right after the MBR in sector 0. That's because logical sectors 0
through 7 belong to the same 4K physical sector, so shifting everything by an
offset of one sector would definitely alter the I/O alignment.
To make this clearer for anyone reading about this for the first time:
if the logical and physical layouts are not aligned, an operation on one
logical stripe/cluster can partially spill into a neighboring physical
sector, and the resulting overhead degrades the final performance.
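
To put a number on that overhead, here is a small sketch (my own
illustration, not measured data) that counts how many 4K physical sectors a
logical I/O touches:

SECTORS_PER_PHYS = 4096 // 512   # 8 logical 512-byte sectors per physical sector

def physical_sectors_touched(start_lba, length_sectors):
    # Index of the first and last physical sector covered by the I/O.
    first = start_lba // SECTORS_PER_PHYS
    last = (start_lba + length_sectors - 1) // SECTORS_PER_PHYS
    return last - first + 1

# An aligned 4K write (8 sectors at LBA 256) touches exactly one physical
# sector; shift it by a single sector and it straddles two, so the firmware
# has to read-modify-write both of them.
print(physical_sectors_touched(256, 8))   # 1
print(physical_sectors_touched(257, 8))   # 2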

What would be awesome to do is trace all of the I/O that ZFS issues to the
pool and try to match it against the physical layout. I've already seen
someone's work along those lines: a DTrace script that records all the
accesses and is then used to build an animation (black and green sectors)
showing the activity on the disk. Incredibly awesome work. I can't find that
link right now.
That script would throw some light on this.
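
Until I find that link again, here is a rough sketch of the matching step in
Python. It assumes the per-I/O offsets and sizes have already been captured
somehow (the DTrace io:::start probe exposes them as args[0]->b_blkno and
args[0]->b_bcount); the (sector, bytes) input format and the summary it
prints are just my own invention for illustration:

SECTORS_PER_PHYS = 4096 // 512   # 8 logical sectors per 4K physical sector

def summarize(ios):
    # ios: iterable of (start_lba, size_in_bytes) pairs taken from a trace.
    aligned = misaligned = 0
    for start_lba, size_bytes in ios:
        length_sectors = (size_bytes + 511) // 512
        starts_on_boundary = start_lba % SECTORS_PER_PHYS == 0
        whole_physical_multiple = length_sectors % SECTORS_PER_PHYS == 0
        if starts_on_boundary and whole_physical_multiple:
            aligned += 1
        else:
            misaligned += 1
    return aligned, misaligned

# Made-up sample trace: one aligned 4K write and one shifted by a sector.
print(summarize([(256, 4096), (257, 4096)]))   # (1, 1)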

Regards,

Leandro.


On Thu, May 20, 2010 at 8:53 PM, Brandon High <bh...@freaks.com> wrote:

> On Sat, Apr 24, 2010 at 5:02 PM, Leandro Vanden Bosch
> <l.vbo...@gmail.com> wrote:
> > Confirmed then that the issue was with the WD10EARS.
> > I swapped it out with the old one and things look a lot better:
>
> The problem with the EARS drive is that it was not 4k aligned.
>
> The solaris partition table was, but that does not take into account
> the fdisk MBR. As a result, everything was off by one cylinder.
>
> -B
>
> --
> Brandon High : bh...@freaks.com
>
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
