2012-05-12 7:01, Jim Klimov wrote:
Overall, the practical question is whether the disk will
make it back into the live pool (ultimately without a
never-ending resilver), and how fast that can be done -
I don't want to risk the big pool with non-redundant
arrays for too long.

Here lies another "grumpy gripe", though it may be specific
to the oldish snv_117 on that box: the system is not making
its best possible effort to complete the resilver ASAP :)

According to "iostat 60", disk utilization across this raidz
set varies between 15% and 50% busy, queue lengths stay within
about 5 outstanding tasks, CPU kernel time is 2-7% with over
90% idle, and over 2GB of RAM remains free... Why won't it
complete the quest faster? Can some tire be kicked? ;)
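
For reference, the extended statistics below come from iostat; I am
quoting the flags from memory, so take this invocation as an
approximation rather than a literal transcript:

   # -x   : extended per-device statistics
   # -n   : descriptive cXtYdZ device names
   # -T d : print a date stamp before each 60-second sample
   iostat -xnT d 60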

Sat May 12 19:06:09 MSK 2012
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
  309.6    3.8 14863.0    5.0  0.0  4.7    0.0   15.0   0  65 c0t1d0
  312.5    3.9 14879.7    5.1  0.0  4.6    0.0   14.7   0  64 c4t3d0
  308.5    4.0 14855.0    5.2  0.0  4.7    0.0   15.1   0  66 c6t5d0
  310.7    3.9 14855.7    5.1  0.0  4.6    0.0   14.8   0  65 c7t6d0
    0.0  225.3    0.0 14484.2  0.0  8.1    0.0   36.0   0  83 c5t6d0
Sat May 12 19:07:09 MSK 2012
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
  228.0    3.0 6859.7    4.0  0.0  6.9    0.0   29.9   0  81 c0t1d0
  227.7    3.3 6850.0    4.3  0.0  6.9    0.0   30.0   0  81 c4t3d0
  228.1    3.4 6857.9    4.4  0.0  7.0    0.0   30.0   0  81 c6t5d0
  227.6    3.1 6860.4    4.1  0.0  7.1    0.0   30.7   0  82 c7t6d0
    0.0  225.8    0.0 6379.1  0.0  8.1    0.0   35.8   0  85 c5t6d0
...

During some one-minute samples the disks sit there doing almost nothing at all:

Sat May 12 19:01:09 MSK 2012
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
   10.7    0.8  665.4    0.7  0.0  0.1    0.0   11.4   0  13 c0t1d0
   10.7    0.9  667.5    0.7  0.0  0.1    0.0   11.6   0  13 c4t3d0
   10.7    0.8  666.4    0.7  0.0  0.1    0.0   11.9   0  13 c6t5d0
   10.7    0.9  668.5    0.7  0.0  0.1    0.0   11.6   0  13 c7t6d0
    0.1   15.5    0.6   20.3  0.0  0.0    0.0    0.2   0   0 c5t6d0


last pid: 18121;  load avg:  0.16,  0.15,  0.12; up 0+16:03:44 19:06:51
96 processes: 95 sleeping, 1 on cpu
CPU states: 96.6% idle,  0.2% user, 3.2% kernel, 0.0% iowait, 0.0% swap
Memory: 16G phys mem, 2476M free mem, 16G total swap, 16G free swap
...
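
If anyone knows which tire exactly: my guess is the per-txg resilver
time limit, but I am not sure the tunable exists under this name in
snv_117, so the snippet below is an assumption to be verified against
the live kernel before writing anything:

   # Check whether the tunable is present and its current value (ms):
   echo "zfs_resilver_min_time/D" | mdb -k

   # If present, let each txg spend more time on resilver I/O
   # (10000 ms is only an example; revert after the resilver completes):
   echo "zfs_resilver_min_time/W0t10000" | mdb -kw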

It has already taken 2 days of trying to resilver a 250GB
disk into the pool, yet it has never made it past 100GB of
progress. :( And it reports no errors that I can see... :)
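
For the record, resilver progress and error counters like the ones
mentioned above can be polled with a simple loop along these lines
(the pool name "pond" is only a placeholder):

   # Poll resilver progress and error counters every 10 minutes
   while true; do
       date
       zpool status -v pond | egrep -i 'resilver|errors'
       sleep 600
   done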

Well, that part seems to have been explained in my other
mails, and hopefully worked around by the hot spare.

//Jim