Dear Christian and all,

Can anyone advise?

Looking forward to your reply, thank you.


On Thu, Apr 24, 2014 at 1:51 PM, Indra Pramana wrote:

> Hi Christian,
> Good day to you, and thank you for your reply.
> On Wed, Apr 23, 2014 at 11:41 PM, Christian Balzer wrote:
>> > > > Using 32 concurrent writes, result is below. The speed really
>> > > > fluctuates.
>> > > >
>> > > > Total time run:         64.317049
>> > > > Total writes made:      1095
>> > > > Write size:             4194304
>> > > > Bandwidth (MB/sec):     68.100
>> > > >
>> > > > Stddev Bandwidth:       44.6773
>> > > > Max bandwidth (MB/sec): 184
>> > > > Min bandwidth (MB/sec): 0
>> > > > Average Latency:        1.87761
>> > > > Stddev Latency:         1.90906
>> > > > Max latency:            9.99347
>> > > > Min latency:            0.075849
>> > > >
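As a quick sanity check on these figures, the reported bandwidth and the number of writes in flight can be recomputed from the totals. The sketch below is only arithmetic on the numbers quoted above. It assumes that rados bench counts 1 MB as 2^20 bytes and it applies Little's law (operations in flight = throughput x average latency); neither assumption is stated in this thread.

```python
# Figures copied from the rados bench output quoted above.
total_time_s = 64.317049
writes = 1095
write_size_bytes = 4194304
avg_latency_s = 1.87761

# Assumption: "MB" in the report means 2**20 bytes.
bandwidth_mb_s = writes * write_size_bytes / 2**20 / total_time_s

# Completed writes per second over the whole run.
ops_per_s = writes / total_time_s

# Little's law: average operations in flight = throughput * average latency.
in_flight = ops_per_s * avg_latency_s

print(f"bandwidth: {bandwidth_mb_s:.1f} MB/s")
print(f"writes in flight: {in_flight:.1f}")
```

This comes out at roughly 68.1 MB/s and about 32 writes in flight, which agrees with the reported bandwidth and with the 32 concurrent writes, so the bandwidth and latency figures are at least consistent with each other.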
>> > > That is really weird, it should get faster, not slower. ^o^
>> > > I assume you've run this a number of times?
>> > >
>> > > Also my apologies, the default is 16 threads, not 1, but that still
>> > > isn't enough to get my cluster to full speed:
>> > > ---
>> > > Bandwidth (MB/sec):     349.044
>> > >
>> > > Stddev Bandwidth:       107.582
>> > > Max bandwidth (MB/sec): 408
>> > > ---
>> > > at 64 threads it will ramp up from a slow start to:
>> > > ---
>> > > Bandwidth (MB/sec):     406.967
>> > >
>> > > Stddev Bandwidth:       114.015
>> > > Max bandwidth (MB/sec): 452
>> > > ---
>> > >
>> > > But what stands out is your latency. I don't have a 10GBE network to
>> > > compare, but my Infiniband based cluster (going through at least one
>> > > switch) gives me values like this:
>> > > ---
>> > > Average Latency:        0.335519
>> > > Stddev Latency:         0.177663
>> > > Max latency:            1.37517
>> > > Min latency:            0.1017
>> > > ---
>> > >
>> > > Of course that latency is not just the network.
>> > >
>> >
>> > What else can contribute to this latency? Storage node load, disk speed,
>> > anything else?
>> >
>> That and the network itself are pretty much it. You should know once
>> you've run those tests with atop or iostat on the storage nodes.
>> >
>> > > I would suggest running atop (gives you more information at one
>> > > glance) or "iostat -x 3" on all your storage nodes during these tests
>> > > to identify any node or OSD that is overloaded in some way.
>> > >
>> >
>> > Will try.
>> >
>> Do that and let us know about the results.
> I have done some tests using iostat and noted some OSDs on a particular
> storage node going up to the 100% limit when I run the rados bench test.
> ====
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            1.09    0.00    0.92   21.74    0.00   76.25
> Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> sda               0.00     0.00    4.33   42.00    73.33  6980.00   304.46     0.29    6.22    0.00    6.86   1.50   6.93
> sdb               0.00     0.00    0.00   17.67     0.00  6344.00   718.19    59.64  854.26    0.00  854.26  56.60 *100.00*
> sdc               0.00     0.00   12.33   59.33    70.67 18882.33   528.92    36.54  509.80   64.76  602.31  10.51  75.33
> sdd               0.00     0.00    3.33   54.33    24.00 15249.17   529.71     1.29   22.45    3.20   23.63   1.64   9.47
> sde               0.00     0.33    0.00    0.67     0.00     4.00    12.00     0.30  450.00    0.00  450.00 450.00  30.00
>
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            1.38    0.00    1.13    7.75    0.00   89.74
> Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> sda               0.00     0.00    5.00   69.00    30.67 19408.50   525.38     4.29   58.02    0.53   62.18   2.00  14.80
> sdb               0.00     0.00    7.00   63.33    41.33 20911.50   595.82    13.09  826.96   88.57  908.57   5.48  38.53
> sdc               0.00     0.00    2.67   30.00    17.33  6945.33   426.29     0.21    6.53    0.50    7.07   1.59   5.20
> sdd               0.00     0.00    2.67   58.67    16.00 20661.33   674.26     4.89   79.54   41.00   81.30   2.70  16.53
> sde               0.00     0.00    0.00    1.67     0.00     6.67     8.00     0.01    3.20    0.00    3.20   1.60   0.27
>
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            0.97    0.00    0.55    6.73    0.00   91.75
> Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> sda               0.00     0.00    1.67   15.33    21.33   120.00    16.63     0.02    1.18    0.00    1.30   0.63   1.07
> sdb               0.00     0.00    4.33   62.33    24.00 13299.17   399.69     2.68   11.18    1.23   11.87   1.94  12.93
> sdc               0.00     0.00    0.67   38.33    70.67  7881.33   407.79    37.66  202.15    0.00  205.67  13.61  53.07
> sdd               0.00     0.00    3.00   17.33    12.00   166.00    17.51     0.05    2.89    3.11    2.85   0.98   2.00
> sde               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
>
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            1.29    0.00    0.92   24.10    0.00   73.68
> Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> sda               0.00     0.00    0.00   45.33     0.00  4392.50   193.79     0.62   13.62    0.00   13.62   1.09   4.93
> sdb               0.00     0.00    0.00    8.67     0.00  3600.00   830.77    63.87 1605.54    0.00 1605.54 115.38 *100.00*
> sdc               0.00     0.33    8.67   42.67    37.33  5672.33   222.45    16.88  908.78    1.38 1093.09   7.06  36.27
> sdd               0.00     0.00    0.33   31.00     1.33   629.83    40.29     0.06    1.91    0.00    1.94   0.94   2.93
> sde               0.00     0.00    0.00    0.33     0.00     1.33     8.00     0.12  368.00    0.00  368.00 368.00  12.27
>
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            1.59    0.00    0.88    4.82    0.00   92.70
> Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> sda               0.00     0.00    0.00   29.00     0.00   235.00    16.21     0.06    1.98    0.00    1.98   0.97   2.80
> sdb               0.00     6.00    4.33  114.67    38.67  6422.33   108.59     9.19  513.19  265.23  522.56   2.08  24.80
> sdc               0.00     0.00    0.00   20.67     0.00   124.00    12.00     0.04    2.00    0.00    2.00   1.03   2.13
> sdd               0.00     5.00    1.67   81.00    12.00   546.17    13.50     0.10    1.21    0.80    1.22   0.39   3.20
> sde               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
> ====
> The high utilisation also hits other OSDs on the same node at random;
> it is not confined to one particular OSD.
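Picking the saturated disks out of several iostat samples by eye gets tedious, so here is a minimal Python sketch that flags them. It assumes the text comes from "iostat -x" with each device row on a single line (as printed on a wide terminal) and a "Device" header naming the columns. The function name busy_devices and both thresholds are made up for illustration, not recommended values.

```python
def busy_devices(iostat_text, util_limit=70.0, await_limit_ms=100.0):
    """Return (device, %util, await) for rows exceeding either limit.

    Both limits are arbitrary illustration values, not tuning advice.
    """
    columns = None
    flagged = []
    for line in iostat_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0].startswith("Device"):
            columns = fields[1:]  # column names of the current sample
            continue
        if columns is None or len(fields) != len(columns) + 1:
            continue  # avg-cpu lines, banners, wrapped fragments
        try:
            # Strip the "*" emphasis markers used in the pasted output.
            numbers = [float(f.strip("*")) for f in fields[1:]]
        except ValueError:
            continue
        row = dict(zip(columns, numbers))
        util = row.get("%util", 0.0)
        await_ms = row.get("await", 0.0)
        if util >= util_limit or await_ms >= await_limit_ms:
            flagged.append((fields[0], util, await_ms))
    return flagged


# Three rows taken from the first iostat sample above.
sample = """\
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.00 0.00 4.33 42.00 73.33 6980.00 304.46 0.29 6.22 0.00 6.86 1.50 6.93
sdb 0.00 0.00 0.00 17.67 0.00 6344.00 718.19 59.64 854.26 0.00 854.26 56.60 *100.00*
sdc 0.00 0.00 12.33 59.33 70.67 18882.33 528.92 36.54 509.80 64.76 602.31 10.51 75.33
"""

for device, util, await_ms in busy_devices(sample):
    print(f"{device}: util {util:.0f}%, await {await_ms:.0f} ms")
```

On these three rows it reports sdb and sdc and skips sda. Reading the column names from the header rather than hard-coding positions is deliberate, since the iostat column layout differs between sysstat versions.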
> atop result on the node:
> ====
> ATOP -
> ceph-osd-07
> 2014/04/24
> 13:49:12
> ------                                                              10s
> elapsed
> PRC | sys    1.77s |  user   2.11s |              |               |
> #proc    164 |               | #trun      2 | #tslpi  2817  | #tslpu     0
> |               | #zombie    0 | clones     4  |
> |               | #exit      0 |
> CPU | sys      14% |  user     20% |              |  irq       1%
> |              |               | idle    632% | wait    133%
> |              |               | steal     0% | guest     0%
> |              | avgf 1.79GHz  | avgscal  54% |
> cpu | sys       6% |  user      7% |              |  irq       0%
> |              |               | idle     19% | cpu006 w 68%
> |              |               | steal     0% | guest     0%
> |              | avgf 2.42GHz  | avgscal  73% |
> cpu | sys       2% |  user      3% |              |  irq       0%
> |              |               | idle     88% | cpu002 w  7%
> |              |               | steal     0% | guest     0%
> |              | avgf 1.68GHz  | avgscal  50% |
> cpu | sys       2% |  user      2% |              |  irq       0%
> |              |               | idle     86% | cpu003 w 10%
> |              |               | steal     0% | guest     0%
> |              | avgf 1.67GHz  | avgscal  50% |
> cpu | sys       2% |  user      2% |              |  irq       0%
> |              |               | idle     75% | cpu001 w 21%
> |              |               | steal     0% | guest     0%
> |              | avgf 1.83GHz  | avgscal  55% |
> cpu | sys       1% |  user      2% |              |  irq       1%
> |              |               | idle     70% | cpu000 w 26%
> |              |               | steal     0% | guest     0%
> |              | avgf 1.85GHz  | avgscal  56% |
> cpu | sys       1% |  user      2% |              |  irq       0%
> |              |               | idle     97% | cpu004 w  1%
> |              |               | steal     0% | guest     0%
> |              | avgf 1.64GHz  | avgscal  49% |
> cpu | sys       1% |  user      1% |              |  irq       0%
> |              |               | idle     98% | cpu005 w  0%
> |              |               | steal     0% | guest     0%
> |              | avgf 1.60GHz  | avgscal  48% |
> cpu | sys       0% |  user      1% |              |  irq       0%
> |              |               | idle     98% | cpu007 w  0%
> |              |               | steal     0% | guest     0%
> |              | avgf 1.60GHz  | avgscal  48% |
> CPL | avg1    1.12 |               | avg5    0.90 |               |
> avg15   0.72 |               |              |               | csw   103682
> |               | intr   34330 |               |
> |               | numcpu     8 |
> MEM | tot    15.6G |               | free  158.2M |  cache  13.7G
> |              |  dirty 101.4M | buff   18.2M |               | slab
> 574.6M |               |              |               |
> |               |              |
> SWP | tot   518.0M |               | free  489.6M |
> |              |               |              |
> |              |               |              |               | vmcom
> 5.2G |               | vmlim   8.3G |
> PAG | scan  327450 |               |              |  stall      0
> |              |               |              |
> |              |               | swin       0 |
> |              |               | swout      0 |
> DSK |          sdb |               | busy     90% |  read    8115
> |              |  write    695 | KiB/r    130 |               | KiB/w
> 194 | MBr/s 103.34  |              | MBw/s  13.22  | avq     4.61
> |               | avio 1.01 ms |
> DSK |          sdc |               | busy     32% |  read      23
> |              |  write    431 | KiB/r      6 |               | KiB/w
> 318 | MBr/s   0.02  |              | MBw/s  13.41  | avq    34.86
> |               | avio 6.95 ms |
> DSK |          sda |               | busy     32% |  read      25
> |              |  write    674 | KiB/r      6 |               | KiB/w
> 193 | MBr/s   0.02  |              | MBw/s  12.76  | avq    41.00
> |               | avio 4.48 ms |
> DSK |          sdd |               | busy      7% |  read      26
> |              |  write    473 | KiB/r      7 |               | KiB/w
> 223 | MBr/s   0.02  |              | MBw/s  10.31  | avq    14.29
> |               | avio 1.45 ms |
> DSK |          sde |               | busy      2% |  read       0
> |              |  write      5 | KiB/r      0 |               | KiB/w
> 5 | MBr/s   0.00  |              | MBw/s   0.00  | avq     1.00
> |               | avio 44.8 ms |
> NET | transport    |  tcpi   21326 |              |  tcpo   27479 |
> udpi       0 |  udpo       0 | tcpao      0 |               | tcppo      2
> | tcprs      3  | tcpie      0 | tcpor      0  |              | udpnp
> 0  | udpip      0 |
> NET | network      |               | ipi    21326 |  ipo    14340
> |              |  ipfrw      0 | deliv  21326 |
> |              |               |              |               | icmpi
> 0 |               | icmpo      0 |
> NET | p2p2    ---- |  pcki   12659 |              |  pcko   20931 | si
> 124 Mbps |               | so  107 Mbps | coll       0  | mlti       0
> |               | erri       0 | erro       0  |              | drpi
> 0  | drpo       0 |
> NET | p2p1    ---- |  pcki    8565 |              |  pcko    6443 | si
> 106 Mbps |               | so 7911 Kbps | coll       0  | mlti       0
> |               | erri       0 | erro       0  |              | drpi
> 0  | drpo       0 |
> NET | lo      ---- |  pcki     108 |              |  pcko     108 | si
> 8 Kbps |               | so    8 Kbps | coll       0  | mlti       0
> |               | erri       0 | erro       0  |              | drpi
> 0  | drpo       0 |
>   PID         RUID              EUID              THR
> SYSCPU           USRCPU          VGROW           RGROW
> RDDSK          WRDSK          ST          EXC         S
> CPUNR           CPU         CMD         1/1
>  6881         root              root              538
> 0.74s            0.94s             0K            256K
> 1.0G         121.3M          --            -         S
> 3           17%         ceph-osd
> 28708         root              root              720
> 0.30s            0.69s           512K             -8K
> 160K         157.7M          --            -         S
> 3           10%         ceph-osd
> 31569         root              root              678
> 0.21s            0.30s           512K           -584K
> 156K         162.7M          --            -         S
> 0            5%         ceph-osd
> 32095         root              root              654
> 0.14s            0.16s             0K              0K
> 60K         105.9M          --            -         S
> 0            3%         ceph-osd
>    61         root              root                1
> 0.20s            0.00s             0K              0K
> 0K             0K          --            -         S
> 3            2%         kswapd0
> 10584         root              root                1
> 0.03s            0.02s           112K            112K
> 0K             0K          --            -         R
> 4            1%         atop
> 11618         root              root                1
> 0.03s            0.00s             0K              0K
> 0K             0K          --            -         S
> 6            0%         kworker/6:2
>    10         root              root                1
> 0.02s            0.00s             0K              0K
> 0K             0K          --            -         S
> 0            0%         rcu_sched
>    38         root              root                1
> 0.01s            0.00s             0K              0K
> 0K             0K          --            -         S
> 6            0%         ksoftirqd/6
>  1623         root              root                1
> 0.01s            0.00s             0K              0K
> 0K             0K          --            -         S
> 6            0%         kworker/6:1H
>  1993         root              root                1
> 0.01s            0.00s             0K              0K
> 0K             0K          --            -         S
> 2            0%         flush-8:48
>  2031         root              root                1
> 0.01s            0.00s             0K              0K
> 0K             0K          --            -         S
> 2            0%         flush-8:0
>  2032         root              root                1
> 0.01s            0.00s             0K              0K
> 0K             0K          --            -         S
> 0            0%         flush-8:16
>  2033         root              root                1
> 0.01s            0.00s             0K              0K
> 0K             0K          --            -         S
> 2            0%         flush-8:32
>  5787         root              root                1
> 0.01s            0.00s             0K              0K
> 4K             0K          --            -         S
> 3            0%         kworker/3:0
> 27605         root              root                1
> 0.01s            0.00s             0K              0K
> 0K             0K          --            -         S
> 1            0%         kworker/1:2
> 27823         root              root                1
> 0.01s            0.00s             0K              0K
> 0K             0K          --            -         S
> 0            0%         kworker/0:2
> 32511         root              root                1
> 0.01s            0.00s             0K              0K
> 0K             0K          --            -         S
> 2            0%         kworker/2:0
>  1536         root              root                1
> 0.00s            0.00s             0K              0K
> 0K             0K          --            -         S
> 2            0%         irqbalance
>   478         root              root                1
> 0.00s            0.00s             0K              0K
> 0K             0K          --            -         S
> 3            0%         usb-storage
>   494         root              root                1
> 0.00s            0.00s             0K              0K
> 0K             0K          --            -         S
> 1            0%         jbd2/sde1-8
>  1550         root              root                1
> 0.00s            0.00s             0K              0K
> 400K             0K          --            -         S
> 1            0%         xfsaild/sdb1
>  1750         root              root                1
> 0.00s            0.00s             0K              0K
> 128K             0K          --            -         S
> 2            0%         xfsaild/sdd1
>  1994         root              root                1
> 0.00s            0.00s             0K              0K
> 0K             0K          --            -         S
> 1            0%         flush-8:64
> ====
> I have tried trimming the SSD drives, but the problem persists. Previously,
> trimming the SSD drives helped to improve performance.
> Any advice is greatly appreciated.
> Thank you.
ceph-users mailing list
