Was there anything in the kernel or OVS logs? Comparing 'ovs-appctl coverage/show' output between the 10%-CPU hosts and the 100%-CPU hosts might give some hint.
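Something along these lines could be a starting point (just a sketch: node-good/node-bad are placeholder hostnames, and 14833 is the ovs-vswitchd PID from the 'top' output quoted below):

```shell
# Snapshot the coverage counters on a quiet (~10% CPU) host and on a busy
# (>100% CPU) host, then diff the snapshots to see which internal events
# fire disproportionately often on the busy one.
ssh node-good 'ovs-appctl coverage/show' > coverage-good.txt
ssh node-bad  'ovs-appctl coverage/show' > coverage-bad.txt
diff -u coverage-good.txt coverage-bad.txt

# On the busy host, profile the ovs-vswitchd main thread for 30 seconds
# to see where the CPU time actually goes.
perf record -g -p 14833 -- sleep 30
perf report --stdio | head -n 40
```

IIRC coverage/show reports recent per-second rates alongside cumulative totals, so even a single snapshot per host may be enough to spot the outlier events.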
'perf' might give some hint.

On 29 January 2016 at 16:05, Henry <[email protected]> wrote:
> Can anyone give me some hint?
>
> --- Original Message ---
> From: "Henry" <[email protected]>
> Sent: 15 January 2016 01:06:00
> To: "discuss" <[email protected]>
> Subject: [ovs-discuss] small traffic but high CPU load for ovs-vswitchd
>
> Hi experts,
>
> My company uses OVS 2.3.1 in an OpenStack deployment, and there is something
> strange about the ovs-vswitchd process:
>
> 1. The traffic is small (as you can see in the nmon result below, the bond
> ports eth2 & eth3 together total about 200 KB/s), but the ovs-vswitchd CPU
> usage is high (>100%).
> 2. Not all ovs-vswitchd processes eat so much CPU. I have hundreds of hosts
> acting as compute nodes, all running OVS 2.3.1 with a similar traffic load;
> on most of them ovs-vswitchd uses only ~10% CPU, but on dozens it uses more
> than 100%.
>
> What I want to know is:
> 1. What potential causes can explain such an issue?
> 2. What kind of debugging can I do to figure out why small traffic causes
> such high CPU load?
>
> Here is the nmon check for traffic load:
>
> +nmon-14i------[H for help]---Hostname=node00234---Refresh=2secs---09:27.03---+
> | Network I/O -----------------------------------------------------------------
> | I/F Name    Recv=KB/s Trans=KB/s  packin packout  insize outsize  Peak->Recv    Trans
> |        lo         0.0        0.0     0.0     0.0     0.0     0.0        58.4     58.4
> |      eth0         0.8       11.7     7.0    63.4   121.1   188.9    120158.2    616.4
> |      eth1         1.3        0.0     9.5     0.0   137.5     0.0         1.3      0.0
> |      eth2        99.1       33.7  1270.2   201.8    79.9   171.0       129.9     36.9
> |      eth3       100.1       51.6  1294.7   227.3    79.2   232.6       132.3     56.2
> | ovs-system        0.0        0.0     0.0     0.0     0.0     0.0         0.0      0.0
> |    br-int         0.1        0.0     2.5     0.0    56.8     0.0         0.2      0.0
> |  br-bond1        49.9        0.0   490.5     0.0   104.2     0.0        73.2      0.0
> |     bond0         2.1       11.7    16.5    63.4   130.5   188.9    120160.3    616.6
> | ovirtmgmt         1.9       11.7    16.5    63.4   116.5   188.9    115238.5    616.5
>
> The CPU usage for ovs-vswitchd:
>
> top - 09:27:33 up 281 days, 13:16, 2 users, load average: 2.74, 2.70, 2.72
> Tasks: 14 total, 1 running, 13 sleeping, 0 stopped, 0 zombie
> Cpu(s): 8.1%us, 14.2%sy, 0.0%ni, 76.7%id, 0.1%wa, 0.0%hi, 0.8%si, 0.0%st
> Mem:  132109680k total, 85618248k used, 46491432k free,   714884k buffers
> Swap:         0k total,        0k used,        0k free, 67934440k cached
>
>   PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
> 14833 root  10 -10 1004m 157m 7684 R 99.8  0.1  43033:59 ovs-vswitchd
> 14523 root  10 -10 1004m 157m 7684 S  8.3  0.1 669:07.90 handler26
> 14531 root  10 -10 1004m 157m 7684 S  1.3  0.1 104:36.08 revalidator34
> 14532 root  10 -10 1004m 157m 7684 S  1.0  0.1  97:09.84 revalidator35
> 14533 root  10 -10 1004m 157m 7684 S  0.7  0.1  96:36.16 revalidator36
> 14534 root  10 -10 1004m 157m 7684 S  0.7  0.1  96:47.22 revalidator37
> 14843 root  10 -10 1004m 157m 7684 S  0.0  0.1   0:00.02 urcu8
> 14524 root  10 -10 1004m 157m 7684 S  0.0  0.1   0:00.00 handler28
> 14525 root  10 -10 1004m 157m 7684 S  0.0  0.1   0:00.00 handler27
> 14526 root  10 -10 1004m 157m 7684 S  0.0  0.1   0:00.00 handler29
> 14527 root  10 -10 1004m 157m 7684 S  0.0  0.1   0:00.00 handler30
> 14528 root  10 -10 1004m 157m 7684 S  0.0  0.1   0:00.00 handler31
> 14529 root  10 -10 1004m 157m 7684 S  0.0  0.1   0:00.00 handler32
> 14530 root  10 -10 1004m 157m 7684 S  0.0  0.1   0:00.00 handler33
>
> No CPU affinity setting:
>
> [root@node00234 ~]# taskset -cp 14833
> pid 14833's current affinity list: 0-23
>
> More info about the datapath, and I'm sure megaflow is enabled:
>
> [root@node00234 ~]# ovs-dpctl show
> system@ovs-system:
>         lookups: hit:15858768478 missed:47078328606 lost:52900178
>         flows: 1716
>         port 0: ovs-system (internal)
>         port 1: br-int (internal)
>         port 2: br-bond1 (internal)
>         port 3: eth3
>         port 4: eth2
>         port 5: qvo75ba34b4-fd
>         port 6: qvo9f41063a-06
>         port 7: qvod0752feb-d2
>         port 8: qvo1eeb697a-f8
>         port 9: qvof9c1a38d-f7
>         port 10: qvocd8761b5-ed
>         port 12: qvo19b7291b-2b
>         port 13: qvo1b683926-41
>         port 14: qvofde29ea5-be
>         port 15: qvoae1324aa-b9
>         port 16: qvo443b155c-13
>         port 17: qvo9b7a893c-03
>         port 18: qvoa5421574-e5
>         port 20: qvod707714a-51
>         port 21: qvo9244013b-e6
>         port 22: qvo2ac86832-a7
>         port 23: qvo4080273b-10
>         port 24: qvob14d48f1-55
>         port 25: br-ha (internal)
>
> [root@node00234 ~]# ovs-appctl upcall/show
> system@ovs-system:
>         flows         : (current 59) (avg 716) (max 13893) (limit 200000)
>         dump duration : 5ms
>
>         34: (keys 0)
>         35: (keys 0)
>         36: (keys 0)
>         37: (keys 0)
>
> [root@node00234 ~]# ovs-appctl bond/show
> ---- bond1 ----
> bond_mode: balance-slb
> bond may use recirculation: no, Recirc-ID : -1
> bond-hash-basis: 0
> updelay: 0 ms
> downdelay: 0 ms
> next rebalance: 1392 ms
> lacp_status: off
> active slave mac: 90:1b:0e:24:70:92(eth3)
>
> slave eth2: enabled
>         may_enable: true
>         hash 51: 72 kB load
>         hash 54: 517 kB load
>
> slave eth3: enabled
>         active slave
>         may_enable: true
>         hash 10: 1 kB load
>         hash 48: 1 kB load
>         hash 96: 936 kB load
>
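Incidentally, the datapath stats quoted above already hint at where the load comes from: most lookups miss the kernel flow table and become upcalls. A quick back-of-the-envelope check, using the hit/missed counters from the 'ovs-dpctl show' output above (they are cumulative since boot, so this is only a rough signal):

```shell
# hit/missed values copied from the quoted 'ovs-dpctl show' output
hit=15858768478
missed=47078328606
awk -v h="$hit" -v m="$missed" \
    'BEGIN { printf "datapath miss rate: %.1f%%\n", 100 * m / (h + m) }'
# prints: datapath miss rate: 74.8%
```

If that ratio is at all representative of current traffic, a high upcall rate (note also the nonzero 'lost' counter) would be consistent with ovs-vswitchd burning a full core, and could explain why hosts with different traffic patterns behave so differently.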
> _______________________________________________
> discuss mailing list
> [email protected]
> http://openvswitch.org/mailman/listinfo/discuss
