Dear Christian and all,

Can anyone advise?
Looking forward to your reply, thank you. Cheers.

On Thu, Apr 24, 2014 at 1:51 PM, Indra Pramana <in...@sg.or.id> wrote:

> Hi Christian,
>
> Good day to you, and thank you for your reply.
>
> On Wed, Apr 23, 2014 at 11:41 PM, Christian Balzer <ch...@gol.com> wrote:
>
>> > > > Using 32 concurrent writes, result is below. The speed really
>> > > > fluctuates.
>> > > >
>> > > > Total time run:         64.317049
>> > > > Total writes made:      1095
>> > > > Write size:             4194304
>> > > > Bandwidth (MB/sec):     68.100
>> > > >
>> > > > Stddev Bandwidth:       44.6773
>> > > > Max bandwidth (MB/sec): 184
>> > > > Min bandwidth (MB/sec): 0
>> > > > Average Latency:        1.87761
>> > > > Stddev Latency:         1.90906
>> > > > Max latency:            9.99347
>> > > > Min latency:            0.075849
>> > > >
>> > > That is really weird; it should get faster, not slower. ^o^
>> > > I assume you've run this a number of times?
>> > >
>> > > Also my apologies, the default is 16 threads, not 1, but that still
>> > > isn't enough to get my cluster to full speed:
>> > > ---
>> > > Bandwidth (MB/sec):     349.044
>> > >
>> > > Stddev Bandwidth:       107.582
>> > > Max bandwidth (MB/sec): 408
>> > > ---
>> > > At 64 threads it will ramp up from a slow start to:
>> > > ---
>> > > Bandwidth (MB/sec):     406.967
>> > >
>> > > Stddev Bandwidth:       114.015
>> > > Max bandwidth (MB/sec): 452
>> > > ---
>> > >
>> > > But what stands out is your latency. I don't have a 10GbE network to
>> > > compare with, but my Infiniband-based cluster (going through at
>> > > least one switch) gives me values like this:
>> > > ---
>> > > Average Latency:        0.335519
>> > > Stddev Latency:         0.177663
>> > > Max latency:            1.37517
>> > > Min latency:            0.1017
>> > > ---
>> > >
>> > > Of course that latency is not just the network.
>> > >
>> >
>> > What else can contribute to this latency? Storage node load, disk
>> > speed, anything else?
>> >
>> That and the network itself are pretty much it; you should know once
>> you've run those tests with atop or iostat on the storage nodes.
>>
>> >
>> > > I would suggest running atop (gives you more information at one
>> > > glance) or "iostat -x 3" on all your storage nodes during these
>> > > tests to identify any node or OSD that is overloaded in some way.
>> > >
>> >
>> > Will try.
>> >
>> Do that and let us know about the results.
>>
>
> I have done some tests using iostat and noted some OSDs on a particular
> storage node hitting the 100% utilisation limit when I run the rados
> bench test.
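> For reference, a minimal sketch of the kind of commands involved (the
> pool name "bench" and the 60-second runtime are assumptions, not the
> exact values I used):
>
> ====
> # On a client: 4 MB object writes, 32 concurrent ops; --no-cleanup keeps
> # the objects around so seq/rand read benchmarks can follow.
> rados bench -p bench 60 write -t 32 --no-cleanup
>
> # Meanwhile, on each storage node: extended per-device statistics,
> # refreshed every 3 seconds.
> iostat -x 3
> ====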
>
> ====
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            1.09    0.00    0.92   21.74    0.00   76.25
>
> Device: rrqm/s wrqm/s   r/s    w/s  rkB/s     wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm    %util
> sda       0.00   0.00  4.33  42.00  73.33   6980.00   304.46     0.29    6.22    0.00    6.86   1.50     6.93
> sdb       0.00   0.00  0.00  17.67   0.00   6344.00   718.19    59.64  854.26    0.00  854.26  56.60  *100.00*
> sdc       0.00   0.00 12.33  59.33  70.67  18882.33   528.92    36.54  509.80   64.76  602.31  10.51    75.33
> sdd       0.00   0.00  3.33  54.33  24.00  15249.17   529.71     1.29   22.45    3.20   23.63   1.64     9.47
> sde       0.00   0.33  0.00   0.67   0.00      4.00    12.00     0.30  450.00    0.00  450.00 450.00    30.00
>
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            1.38    0.00    1.13    7.75    0.00   89.74
>
> Device: rrqm/s wrqm/s   r/s    w/s  rkB/s     wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm    %util
> sda       0.00   0.00  5.00  69.00  30.67  19408.50   525.38     4.29   58.02    0.53   62.18   2.00    14.80
> sdb       0.00   0.00  7.00  63.33  41.33  20911.50   595.82    13.09  826.96   88.57  908.57   5.48    38.53
> sdc       0.00   0.00  2.67  30.00  17.33   6945.33   426.29     0.21    6.53    0.50    7.07   1.59     5.20
> sdd       0.00   0.00  2.67  58.67  16.00  20661.33   674.26     4.89   79.54   41.00   81.30   2.70    16.53
> sde       0.00   0.00  0.00   1.67   0.00      6.67     8.00     0.01    3.20    0.00    3.20   1.60     0.27
>
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            0.97    0.00    0.55    6.73    0.00   91.75
>
> Device: rrqm/s wrqm/s   r/s    w/s  rkB/s     wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm    %util
> sda       0.00   0.00  1.67  15.33  21.33    120.00    16.63     0.02    1.18    0.00    1.30   0.63     1.07
> sdb       0.00   0.00  4.33  62.33  24.00  13299.17   399.69     2.68   11.18    1.23   11.87   1.94    12.93
> sdc       0.00   0.00  0.67  38.33  70.67   7881.33   407.79    37.66  202.15    0.00  205.67  13.61    53.07
> sdd       0.00   0.00  3.00  17.33  12.00    166.00    17.51     0.05    2.89    3.11    2.85   0.98     2.00
> sde       0.00   0.00  0.00   0.00   0.00      0.00     0.00     0.00    0.00    0.00    0.00   0.00     0.00
>
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            1.29    0.00    0.92   24.10    0.00   73.68
>
> Device: rrqm/s wrqm/s   r/s    w/s  rkB/s     wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm    %util
> sda       0.00   0.00  0.00  45.33   0.00   4392.50   193.79     0.62   13.62    0.00   13.62   1.09     4.93
> sdb       0.00   0.00  0.00   8.67   0.00   3600.00   830.77    63.87 1605.54    0.00 1605.54 115.38  *100.00*
> sdc       0.00   0.33  8.67  42.67  37.33   5672.33   222.45    16.88  908.78    1.38 1093.09   7.06    36.27
> sdd       0.00   0.00  0.33  31.00   1.33    629.83    40.29     0.06    1.91    0.00    1.94   0.94     2.93
> sde       0.00   0.00  0.00   0.33   0.00      1.33     8.00     0.12  368.00    0.00  368.00 368.00    12.27
>
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            1.59    0.00    0.88    4.82    0.00   92.70
>
> Device: rrqm/s wrqm/s   r/s    w/s  rkB/s     wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm    %util
> sda       0.00   0.00  0.00  29.00   0.00    235.00    16.21     0.06    1.98    0.00    1.98   0.97     2.80
> sdb       0.00   6.00  4.33 114.67  38.67   6422.33   108.59     9.19  513.19  265.23  522.56   2.08    24.80
> sdc       0.00   0.00  0.00  20.67   0.00    124.00    12.00     0.04    2.00    0.00    2.00   1.03     2.13
> sdd       0.00   5.00  1.67  81.00  12.00    546.17    13.50     0.10    1.21    0.80    1.22   0.39     3.20
> sde       0.00   0.00  0.00   0.00   0.00      0.00     0.00     0.00    0.00    0.00    0.00   0.00     0.00
> ====
>
> The high utilisation randomly affects other OSDs within the same node as
> well; it is not limited to one particular OSD.
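> To tie a saturated device such as sdb back to the OSD daemon using it,
> the mount table is usually enough (a sketch; the OSD number shown is
> made up, and the default /var/lib/ceph/osd mount layout is assumed):
>
> ====
> # Which OSD data directory lives on sdb?
> grep sdb /proc/mounts
> # e.g. /dev/sdb1 /var/lib/ceph/osd/ceph-12 xfs rw,noatime,... 0 0
> ====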
>
> atop result on the node:
>
> ====
> ATOP - ceph-osd-07    2014/04/24 13:49:12    ------    10s elapsed
>
> PRC | sys 1.77s | user 2.11s | #proc 164 | #trun 2 | #tslpi 2817 | #tslpu 0 | #zombie 0 | clones 4 | #exit 0 |
> CPU | sys 14% | user 20% | irq 1% | idle 632% | wait 133% | steal 0% | guest 0% | avgf 1.79GHz | avgscal 54% |
> cpu | sys  6% | user  7% | irq 0% | idle  19% | cpu006 w 68% | steal 0% | guest 0% | avgf 2.42GHz | avgscal 73% |
> cpu | sys  2% | user  3% | irq 0% | idle  88% | cpu002 w  7% | steal 0% | guest 0% | avgf 1.68GHz | avgscal 50% |
> cpu | sys  2% | user  2% | irq 0% | idle  86% | cpu003 w 10% | steal 0% | guest 0% | avgf 1.67GHz | avgscal 50% |
> cpu | sys  2% | user  2% | irq 0% | idle  75% | cpu001 w 21% | steal 0% | guest 0% | avgf 1.83GHz | avgscal 55% |
> cpu | sys  1% | user  2% | irq 1% | idle  70% | cpu000 w 26% | steal 0% | guest 0% | avgf 1.85GHz | avgscal 56% |
> cpu | sys  1% | user  2% | irq 0% | idle  97% | cpu004 w  1% | steal 0% | guest 0% | avgf 1.64GHz | avgscal 49% |
> cpu | sys  1% | user  1% | irq 0% | idle  98% | cpu005 w  0% | steal 0% | guest 0% | avgf 1.60GHz | avgscal 48% |
> cpu | sys  0% | user  1% | irq 0% | idle  98% | cpu007 w  0% | steal 0% | guest 0% | avgf 1.60GHz | avgscal 48% |
> CPL | avg1 1.12 | avg5 0.90 | avg15 0.72 | csw 103682 | intr 34330 | numcpu 8 |
> MEM | tot 15.6G | free 158.2M | cache 13.7G | dirty 101.4M | buff 18.2M | slab 574.6M |
> SWP | tot 518.0M | free 489.6M | vmcom 5.2G | vmlim 8.3G |
> PAG | scan 327450 | stall 0 | swin 0 | swout 0 |
> DSK | sdb | busy 90% | read 8115 | write 695 | KiB/r 130 | KiB/w 194 | MBr/s 103.34 | MBw/s 13.22 | avq  4.61 | avio 1.01 ms |
> DSK | sdc | busy 32% | read   23 | write 431 | KiB/r   6 | KiB/w 318 | MBr/s   0.02 | MBw/s 13.41 | avq 34.86 | avio 6.95 ms |
> DSK | sda | busy 32% | read   25 | write 674 | KiB/r   6 | KiB/w 193 | MBr/s   0.02 | MBw/s 12.76 | avq 41.00 | avio 4.48 ms |
> DSK | sdd | busy  7% | read   26 | write 473 | KiB/r   7 | KiB/w 223 | MBr/s   0.02 | MBw/s 10.31 | avq 14.29 | avio 1.45 ms |
> DSK | sde | busy  2% | read    0 | write   5 | KiB/r   0 | KiB/w   5 | MBr/s   0.00 | MBw/s  0.00 | avq  1.00 | avio 44.8 ms |
> NET | transport | tcpi 21326 | tcpo 27479 | udpi 0 | udpo 0 | tcpao 0 | tcppo 2 | tcprs 3 | tcpie 0 | tcpor 0 | udpnp 0 | udpip 0 |
> NET | network   | ipi 21326 | ipo 14340 | ipfrw 0 | deliv 21326 | icmpi 0 | icmpo 0 |
> NET | p2p2 ---- | pcki 12659 | pcko 20931 | si 124 Mbps | so  107 Mbps | coll 0 | mlti 0 | erri 0 | erro 0 | drpi 0 | drpo 0 |
> NET | p2p1 ---- | pcki  8565 | pcko  6443 | si 106 Mbps | so 7911 Kbps | coll 0 | mlti 0 | erri 0 | erro 0 | drpi 0 | drpo 0 |
> NET | lo   ---- | pcki   108 | pcko   108 | si   8 Kbps | so    8 Kbps | coll 0 | mlti 0 | erri 0 | erro 0 | drpi 0 | drpo 0 |
>
>   PID  RUID  EUID  THR  SYSCPU  USRCPU  VGROW   RGROW   RDDSK   WRDSK  ST EXC S CPUNR  CPU CMD          1/1
>  6881  root  root  538   0.74s   0.94s     0K    256K    1.0G  121.3M  --   - S     3  17% ceph-osd
> 28708  root  root  720   0.30s   0.69s   512K     -8K    160K  157.7M  --   - S     3  10% ceph-osd
> 31569  root  root  678   0.21s   0.30s   512K   -584K    156K  162.7M  --   - S     0   5% ceph-osd
> 32095  root  root  654   0.14s   0.16s     0K      0K     60K  105.9M  --   - S     0   3% ceph-osd
>    61  root  root    1   0.20s   0.00s     0K      0K      0K      0K  --   - S     3   2% kswapd0
> 10584  root  root    1   0.03s   0.02s   112K    112K      0K      0K  --   - R     4   1% atop
> 11618  root  root    1   0.03s   0.00s     0K      0K      0K      0K  --   - S     6   0% kworker/6:2
>    10  root  root    1   0.02s   0.00s     0K      0K      0K      0K  --   - S     0   0% rcu_sched
>    38  root  root    1   0.01s   0.00s     0K      0K      0K      0K  --   - S     6   0% ksoftirqd/6
>  1623  root  root    1   0.01s   0.00s     0K      0K      0K      0K  --   - S     6   0% kworker/6:1H
>  1993  root  root    1   0.01s   0.00s     0K      0K      0K      0K  --   - S     2   0% flush-8:48
>  2031  root  root    1   0.01s   0.00s     0K      0K      0K      0K  --   - S     2   0% flush-8:0
>  2032  root  root    1   0.01s   0.00s     0K      0K      0K      0K  --   - S     0   0% flush-8:16
>  2033  root  root    1   0.01s   0.00s     0K      0K      0K      0K  --   - S     2   0% flush-8:32
>  5787  root  root    1   0.01s   0.00s     0K      0K      4K      0K  --   - S     3   0% kworker/3:0
> 27605  root  root    1   0.01s   0.00s     0K      0K      0K      0K  --   - S     1   0% kworker/1:2
> 27823  root  root    1   0.01s   0.00s     0K      0K      0K      0K  --   - S     0   0% kworker/0:2
> 32511  root  root    1   0.01s   0.00s     0K      0K      0K      0K  --   - S     2   0% kworker/2:0
>  1536  root  root    1   0.00s   0.00s     0K      0K      0K      0K  --   - S     2   0% irqbalance
>   478  root  root    1   0.00s   0.00s     0K      0K      0K      0K  --   - S     3   0% usb-storage
>   494  root  root    1   0.00s   0.00s     0K      0K      0K      0K  --   - S     1   0% jbd2/sde1-8
>  1550  root  root    1   0.00s   0.00s     0K      0K    400K      0K  --   - S     1   0% xfsaild/sdb1
>  1750  root  root    1   0.00s   0.00s     0K      0K    128K      0K  --   - S     2   0% xfsaild/sdd1
>  1994  root  root    1   0.00s   0.00s     0K      0K      0K      0K  --   - S     1   0% flush-8:64
> ====
>
> I have tried to trim the SSD drives (see the sketch below), but the
> problem seems to persist. Last time, trimming the SSD drives helped to
> improve the performance.
>
> Any advice is greatly appreciated.
>
> Thank you.
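> The trim mentioned above was along these lines (a sketch; fstrim works
> on a mounted filesystem, and the mount point shown is an assumption):
>
> ====
> # Tell the SSD which blocks are unused; -v reports how much was trimmed.
> fstrim -v /var/lib/ceph/osd/ceph-12
> ====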
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com