Hi,

I have a MySQL instance that, when I point more load at it, suddenly goes to 
100% in SYS. It can run fine for an hour, but eventually CPU utilization jumps 
from 5-15% to 100% in SYS, as shown in the mpstat output below:

# prtdiag | head
System Configuration: SUN MICROSYSTEMS SUN FIRE X4170 SERVER          
BIOS Configuration: American Megatrends Inc. 07060215 06/19/2009
BMC Configuration: IPMI 1.5 (KCS: Keyboard Controller Style)

==== Processor Sockets ====================================

Version                          Location Tag
-------------------------------- --------------------------
Intel(R) Xeon(R) CPU           X5570  @ 2.93GHz CPU 1
Intel(R) Xeon(R) CPU           X5570  @ 2.93GHz CPU 2
[...]
# uname -a
SunOS XXXX 5.10 Generic_142901-03 i86pc i386 i86pc

# mpstat 1
[...]
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0   18   0    0   440  108  350  167   32 8170   23  5267    3  97   0   0
  1    2   0    0    99    6  144   75   31 9119   15  4713    0 100   0   0
  2    1   0    0   146   12  225  118   26 6509    3  3983   27  73   0   0
  3    0   0    0   448  371  142   81   14 9800    8  4759    0 100   0   0
  4    0   0    0   429  341  137   72   18 9624   26  4035    1  99   0   0
  5    0   0    0   117   12  187   98   35 11200   23  4680    0 100   0   0
  6    2   0    0    92    3  138   66   26 10381   13  6218    1  99   0   0
  7    0   0    0   110    6  180   94   29 11391   14  4602    0 100   0   0
  8    0   0    0   149   17  237  119   26 10087   27  7168    1  99   0   0
  9    0   0    0    80    8  113   63   16 10225    8  4734    1  99   0   0
 10    0   0    0   129   11  128   66   25 9906   19  4686    0 100   0   0
 11    0   0    0   103    5  163   85   24 10138   12  4535    1  99   0   0
 12    3   0    0   126    7  214  108   28 12660   15  4514    1  99   0   0
 13   16   0    0    59    5  112   58   25 10227   29  4141    0 100   0   0
 14    0   0    0    54    6   85   50   23 11371   16  4004    0 100   0   0
 15   36   0    0    65    7   72   42   22 8902   16  3937    1  99   0   0
^C
r...@mk-mysqlcluster-2-1[~] 
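
To put numbers on the sample above, here is a minimal Python sketch that averages the usr and sys columns and sums smtx (spins on mutexes) across the CPUs. It is not part of the original diagnostics: the helper name summarize is made up for illustration, and the rows are simply pasted from the mpstat output in this post.

```python
# Per-CPU rows copied verbatim from the mpstat sample in this post.
MPSTAT = """\
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0   18   0    0   440  108  350  167   32 8170   23  5267    3  97   0   0
  1    2   0    0    99    6  144   75   31 9119   15  4713    0 100   0   0
  2    1   0    0   146   12  225  118   26 6509    3  3983   27  73   0   0
  3    0   0    0   448  371  142   81   14 9800    8  4759    0 100   0   0
  4    0   0    0   429  341  137   72   18 9624   26  4035    1  99   0   0
  5    0   0    0   117   12  187   98   35 11200   23  4680    0 100   0   0
  6    2   0    0    92    3  138   66   26 10381   13  6218    1  99   0   0
  7    0   0    0   110    6  180   94   29 11391   14  4602    0 100   0   0
  8    0   0    0   149   17  237  119   26 10087   27  7168    1  99   0   0
  9    0   0    0    80    8  113   63   16 10225    8  4734    1  99   0   0
 10    0   0    0   129   11  128   66   25 9906   19  4686    0 100   0   0
 11    0   0    0   103    5  163   85   24 10138   12  4535    1  99   0   0
 12    3   0    0   126    7  214  108   28 12660   15  4514    1  99   0   0
 13   16   0    0    59    5  112   58   25 10227   29  4141    0 100   0   0
 14    0   0    0    54    6   85   50   23 11371   16  4004    0 100   0   0
 15   36   0    0    65    7   72   42   22 8902   16  3937    1  99   0   0
"""


def summarize(text):
    """Return CPU count, mean usr/sys percentages, and summed smtx."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    # Whitespace splitting copes with the rows whose wide smtx value
    # pushes the later columns out of alignment.
    rows = [dict(zip(header, map(int, line.split()))) for line in lines[1:]]
    n = len(rows)
    return {
        "cpus": n,
        "avg_usr": sum(r["usr"] for r in rows) / n,
        "avg_sys": sum(r["sys"] for r in rows) / n,
        "total_smtx": sum(r["smtx"] for r in rows),
    }


if __name__ == "__main__":
    for key, value in summarize(MPSTAT).items():
        print(f"{key}: {value}")
```

For this one-second sample that works out to about 97.7% sys averaged over the 16 CPUs, with roughly 160,000 mutex spins in total, which is at least consistent with heavy kernel lock contention.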


I didn't have much time to look around and had to redirect live traffic 
elsewhere, but before I did I managed to capture the following:

# dtrace -n profile-997'{@[stack()]=count();}'
[...]
              unix`default_lock_delay+0x48
              unix`mutex_vector_enter+0x99
              zfs`dmu_zfetch_find+0x1f9
              zfs`dmu_zfetch+0xc9
              zfs`dbuf_read+0x27a
              zfs`dmu_buf_hold_array_by_dnode+0x287
              zfs`dmu_buf_hold_array+0x81
              zfs`dmu_read_uio+0x49
              zfs`zfs_read+0x15e
              zfs`zfs_shim_read+0xc
              genunix`fop_read+0x31
              genunix`read+0x188
              unix`sys_syscall+0x17b
             1861

              unix`mutex_delay_default+0xc
              unix`mutex_vector_enter+0x99
              zfs`dmu_zfetch_find+0x1b8
              zfs`dmu_zfetch+0xc9
              zfs`dbuf_read+0x27a
              zfs`dmu_buf_hold_array_by_dnode+0x287
              zfs`dmu_buf_hold_array+0x81
              zfs`dmu_read_uio+0x49
              zfs`zfs_read+0x15e
              zfs`zfs_shim_read+0xc
              genunix`fop_read+0x31
              genunix`read+0x188
              unix`sys_syscall+0x17b
             2170

              unix`mutex_delay_default+0xc
              unix`mutex_vector_enter+0x99
              zfs`dmu_zfetch_find+0x1f9
              zfs`dmu_zfetch+0xc9
              zfs`dbuf_read+0x27a
              zfs`dmu_buf_hold_array_by_dnode+0x287
              zfs`dmu_buf_hold_array+0x81
              zfs`dmu_read_uio+0x49
              zfs`zfs_read+0x15e
              zfs`zfs_shim_read+0xc
              genunix`fop_read+0x31
              genunix`read+0x188
              unix`sys_syscall+0x17b
             2215

              unix`mutex_delay_default+0xa
              unix`mutex_vector_enter+0x99
              zfs`dmu_zfetch_find+0x1f9
              zfs`dmu_zfetch+0xc9
              zfs`dbuf_read+0x27a
              zfs`dmu_buf_hold_array_by_dnode+0x287
              zfs`dmu_buf_hold_array+0x81
              zfs`dmu_read_uio+0x49
              zfs`zfs_read+0x15e
              zfs`zfs_shim_read+0xc
              genunix`fop_read+0x31
              genunix`pread+0x178
              unix`sys_syscall+0x17b
            17400

              unix`mutex_delay_default+0xa
              unix`mutex_vector_enter+0x99
              zfs`dmu_zfetch_find+0x1b8
              zfs`dmu_zfetch+0xc9
              zfs`dbuf_read+0x27a
              zfs`dmu_buf_hold_array_by_dnode+0x287
              zfs`dmu_buf_hold_array+0x81
              zfs`dmu_read_uio+0x49
              zfs`zfs_read+0x15e
              zfs`zfs_shim_read+0xc
              genunix`fop_read+0x31
              genunix`pread+0x178
              unix`sys_syscall+0x17b
            17551

              unix`mutex_delay_default+0xa
              unix`mutex_vector_enter+0x99
              zfs`dmu_zfetch_find+0x1b8
              zfs`dmu_zfetch+0xc9
              zfs`dbuf_read+0x27a
              zfs`dmu_buf_hold_array_by_dnode+0x287
              zfs`dmu_buf_hold_array+0x81
              zfs`dmu_read_uio+0x49
              zfs`zfs_read+0x15e
              zfs`zfs_shim_read+0xc
              genunix`fop_read+0x31
              genunix`read+0x188
              unix`sys_syscall+0x17b
            21427

              unix`mutex_delay_default+0xa
              unix`mutex_vector_enter+0x99
              zfs`dmu_zfetch_find+0x1f9
              zfs`dmu_zfetch+0xc9
              zfs`dbuf_read+0x27a
              zfs`dmu_buf_hold_array_by_dnode+0x287
              zfs`dmu_buf_hold_array+0x81
              zfs`dmu_read_uio+0x49
              zfs`zfs_read+0x15e
              zfs`zfs_shim_read+0xc
              genunix`fop_read+0x31
              genunix`read+0x188
              unix`sys_syscall+0x17b
            22052

              unix`mutex_delay_default+0x7
              unix`mutex_vector_enter+0x99
              zfs`dmu_zfetch_find+0x1f9
              zfs`dmu_zfetch+0xc9
              zfs`dbuf_read+0x27a
              zfs`dmu_buf_hold_array_by_dnode+0x287
              zfs`dmu_buf_hold_array+0x81
              zfs`dmu_read_uio+0x49
              zfs`zfs_read+0x15e
              zfs`zfs_shim_read+0xc
              genunix`fop_read+0x31
              genunix`pread+0x178
              unix`sys_syscall+0x17b
            22142

              unix`mutex_delay_default+0x7
              unix`mutex_vector_enter+0x99
              zfs`dmu_zfetch_find+0x1b8
              zfs`dmu_zfetch+0xc9
              zfs`dbuf_read+0x27a
              zfs`dmu_buf_hold_array_by_dnode+0x287
              zfs`dmu_buf_hold_array+0x81
              zfs`dmu_read_uio+0x49
              zfs`zfs_read+0x15e
              zfs`zfs_shim_read+0xc
              genunix`fop_read+0x31
              genunix`pread+0x178
              unix`sys_syscall+0x17b
            22375

              unix`mutex_delay_default+0x7
              unix`mutex_vector_enter+0x99
              zfs`dmu_zfetch_find+0x1b8
              zfs`dmu_zfetch+0xc9
              zfs`dbuf_read+0x27a
              zfs`dmu_buf_hold_array_by_dnode+0x287
              zfs`dmu_buf_hold_array+0x81
              zfs`dmu_read_uio+0x49
              zfs`zfs_read+0x15e
              zfs`zfs_shim_read+0xc
              genunix`fop_read+0x31
              genunix`read+0x188
              unix`sys_syscall+0x17b
            27345

              unix`mutex_delay_default+0x7
              unix`mutex_vector_enter+0x99
              zfs`dmu_zfetch_find+0x1f9
              zfs`dmu_zfetch+0xc9
              zfs`dbuf_read+0x27a
              zfs`dmu_buf_hold_array_by_dnode+0x287
              zfs`dmu_buf_hold_array+0x81
              zfs`dmu_read_uio+0x49
              zfs`zfs_read+0x15e
              zfs`zfs_shim_read+0xc
              genunix`fop_read+0x31
              genunix`read+0x188
              unix`sys_syscall+0x17b
            28144
#

The entire database is cached and, except for a tiny amount of writes, ZFS 
doesn't read anything at all from the disks; everything comes from the ARC. The 
stacks above suggest that the issue is related to ZFS and that lock contention 
may be involved. Even with the live traffic gone, it took a couple of minutes 
for ZFS to give all the CPUs back.
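
To make the pattern in the stacks easier to see, here is a rough Python sketch that totals the sample counts by dmu_zfetch_find call site and by system call. The tuples are transcribed by hand from the dtrace output in this post; the names STACKS and totals_by are invented for illustration, and because the output was truncated the totals cover only the stacks shown.

```python
from collections import Counter

# (leaf frame, dmu_zfetch_find offset, syscall, samples), transcribed from
# the stacks shown in this post. The dtrace output was truncated ("[...]"),
# so stacks with lower counts are missing from these totals.
STACKS = [
    ("default_lock_delay+0x48", "0x1f9", "read", 1861),
    ("mutex_delay_default+0xc", "0x1b8", "read", 2170),
    ("mutex_delay_default+0xc", "0x1f9", "read", 2215),
    ("mutex_delay_default+0xa", "0x1f9", "pread", 17400),
    ("mutex_delay_default+0xa", "0x1b8", "pread", 17551),
    ("mutex_delay_default+0xa", "0x1b8", "read", 21427),
    ("mutex_delay_default+0xa", "0x1f9", "read", 22052),
    ("mutex_delay_default+0x7", "0x1f9", "pread", 22142),
    ("mutex_delay_default+0x7", "0x1b8", "pread", 22375),
    ("mutex_delay_default+0x7", "0x1b8", "read", 27345),
    ("mutex_delay_default+0x7", "0x1f9", "read", 28144),
]


def totals_by(index):
    """Sum sample counts grouped by the tuple field at `index`."""
    out = Counter()
    for stack in STACKS:
        out[stack[index]] += stack[3]
    return out


if __name__ == "__main__":
    print("total samples:", sum(s[3] for s in STACKS))
    print("by call site :", dict(totals_by(1)))
    print("by syscall   :", dict(totals_by(2)))
```

Every stack shown is spinning in mutex_vector_enter called from dmu_zfetch_find, which is part of the ZFS prefetch code. The samples split almost evenly between the two call sites (+0x1b8 and +0x1f9), and both read and pread are affected.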

Any ideas?

-- 
Robert Milkowski
http://milek.blogspot.com
-- 
This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss