Thanks for the replies!

We’re using v2.8.1.

I don’t completely understand Russell’s patch, but I don’t think our ACLs were 
taking advantage of it. Do the ACLs need to be “tagged” with port information 
in order for it to be useful?

Before, our ACLs matched only on L3 fields and above. I brought the number of 
flows down from 252981 to 16224 by modifying the ACLs to also match on the 
logical port. For example:
from-lport: (inport == "1" || inport == "2" || inport == "3") && (ip4.dst == $addressSet)
to-lport: (ip4.src == $addressSet) && (outport == "1" || outport == "2" || outport == "3")
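
In case a concrete form is useful, here is roughly how ACLs like these can be 
created with ovn-nbctl. This is just a sketch: the logical switch name "sw0", 
the priority, and the addresses are made up for illustration.

# Build the address set (addresses are illustrative):
ovn-nbctl create Address_Set name=addressSet addresses='"10.0.0.1", "10.0.0.2", "10.0.0.3"'
# ACLs scoped to specific logical ports:
ovn-nbctl acl-add sw0 from-lport 1000 \
    '(inport == "1" || inport == "2" || inport == "3") && ip4.dst == $addressSet' allow
ovn-nbctl acl-add sw0 to-lport 1000 \
    'ip4.src == $addressSet && (outport == "1" || outport == "2" || outport == "3")' allow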

Is that the right way to take advantage of Russell’s patch? It replaces the 
address-set match with an explicit list of logical ports on one side of the 
connection, but it does decrease the number of flows installed on each 
vswitchd.

Even with the decrease in flows in vswitchd, I’m still seeing the log messages. 
Do 16224 flows per vswitchd instance and 559 logical flows in ovn-sb sound 
reasonable?
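
For reference, counts like these can be gathered with something like the 
following (assuming the integration bridge is named br-int; the grep pattern 
is just a rough way to count one line per logical flow):

# OpenFlow flows installed in the local vswitchd (output includes one header line):
ovs-ofctl -O OpenFlow13 dump-flows br-int | wc -l
# Logical flows in the OVN southbound database:
ovn-sbctl lflow-list | grep -c 'table='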

Thanks,
—Kevin

> On Dec 13, 2017, at 7:55 AM, Mark Michelson <mmich...@redhat.com> wrote:
> 
> On 12/12/2017 03:26 PM, Kevin Lin wrote:
>> Hi again,
>> We’re trying to scale up our OVN deployment and we’re seeing some worrying 
>> log messages.
>> The topology is 32 containers connected to another 32 containers on 10 
>> different ports. This is running on 17 machines: one runs ovn-northd and 
>> ovsdb-server, and the other 16 each run ovn-controller, ovs-vswitchd, and 
>> ovsdb-server. We’re using an address set for the source group, but not the 
>> destination group. We’re also creating a different ACL for each port. So the 
>> ACLs look like:
>> One address set for { container1, container2, … container32 }
>> addressSet -> container1 on port 80
>> addressSet -> container1 on port 81
>> …
>> addressSet -> container1 on port 90
>> addressSet -> container2 on port 80
>> …
>> addressSet -> container32 on port 90
>> The ovn-controller log:
>> 2017-12-12T20:14:49Z|11878|timeval|WARN|Unreasonably long 1843ms poll interval (1840ms user, 0ms system)
>> 2017-12-12T20:14:49Z|11879|timeval|WARN|disk: 0 reads, 16 writes
>> 2017-12-12T20:14:49Z|11880|timeval|WARN|context switches: 0 voluntary, 21 involuntary
>> 2017-12-12T20:14:49Z|11881|poll_loop|DBG|wakeup due to [POLLIN] on fd 9 (172.31.11.193:48460<->172.31.2.181:6640) at lib/stream-fd.c:157 (36% CPU usage)
>> 2017-12-12T20:14:49Z|11882|poll_loop|DBG|wakeup due to [POLLIN] on fd 12 (<->/var/run/openvswitch/br-int.mgmt) at lib/stream-fd.c:157 (36% CPU usage)
>> 2017-12-12T20:14:49Z|11883|jsonrpc|DBG|tcp:172.31.2.181:6640: received reply, result=[], id="echo"
>> 2017-12-12T20:14:49Z|11884|netlink_socket|DBG|nl_sock_transact_multiple__ (Success): nl(len:36, type=38(family-defined), flags=9[REQUEST][ECHO], seq=b11, pid=2268452876
>> 2017-12-12T20:14:49Z|11885|netlink_socket|DBG|nl_sock_recv__ (Success): nl(len:136, type=36(family-defined), flags=0, seq=b11, pid=2268452876
>> 2017-12-12T20:14:49Z|11886|vconn|DBG|unix:/var/run/openvswitch/br-int.mgmt: received: OFPT_ECHO_REQUEST (OF1.3) (xid=0x0): 0 bytes of payload
>> 2017-12-12T20:14:49Z|11887|vconn|DBG|unix:/var/run/openvswitch/br-int.mgmt: sent (Success): OFPT_ECHO_REPLY (OF1.3) (xid=0x0): 0 bytes of payload
>> 2017-12-12T20:14:51Z|11888|timeval|WARN|Unreasonably long 1851ms poll interval (1844ms user, 8ms system)
>> 2017-12-12T20:14:51Z|11889|timeval|WARN|context switches: 0 voluntary, 11 involuntary
>> 2017-12-12T20:14:52Z|11890|poll_loop|DBG|wakeup due to [POLLIN] on fd 9 (172.31.11.193:48460<->172.31.2.181:6640) at lib/stream-fd.c:157 (73% CPU usage)
>> 2017-12-12T20:14:52Z|11891|jsonrpc|DBG|tcp:172.31.2.181:6640: received request, method="echo", params=[], id="echo"
>> 2017-12-12T20:14:52Z|11892|jsonrpc|DBG|tcp:172.31.2.181:6640: send reply, result=[], id="echo"
>> 2017-12-12T20:14:52Z|11893|netlink_socket|DBG|nl_sock_transact_multiple__ (Success): nl(len:36, type=38(family-defined), flags=9[REQUEST][ECHO], seq=b12, pid=2268452876
>> 2017-12-12T20:14:52Z|11894|netlink_socket|DBG|nl_sock_recv__ (Success): nl(len:136, type=36(family-defined), flags=0, seq=b12, pid=2268452876
>> 2017-12-12T20:14:52Z|11895|netdev_linux|DBG|Dropped 18 log messages in last 56 seconds (most recently, 3 seconds ago) due to excessive rate
>> 2017-12-12T20:14:52Z|11896|netdev_linux|DBG|unknown qdisc "mq"
>> 2017-12-12T20:14:54Z|11897|hmap|DBG|Dropped 15511 log messages in last 6 seconds (most recently, 0 seconds ago) due to excessive rate
>> 2017-12-12T20:14:54Z|11898|hmap|DBG|ovn/lib/expr.c:2644: 6 nodes in bucket (128 nodes, 64 buckets)
>> 2017-12-12T20:14:54Z|11899|timeval|WARN|Unreasonably long 1831ms poll interval (1828ms user, 4ms system)
>> 2017-12-12T20:14:54Z|11900|timeval|WARN|context switches: 0 voluntary, 12 involuntary
>> The log messages show up continuously, even when the network isn’t being 
>> used.
>> I poked around with Ethan Jackson, and he noted that the hmap counters seem 
>> unusually high:
>> root@ip-172-31-11-193:/# ovs-appctl -t ovn-controller coverage/show
>> Event coverage, avg rate over last: 5 seconds, last minute, last hour,  hash=d6ee5804:
>> hmap_pathological        2323.6/sec  2662.467/sec     2514.0069/sec   total: 9407536
>> hmap_expand              3596.8/sec  4121.283/sec     3890.8833/sec   total: 14604479
>> txn_unchanged              0.8/sec     0.917/sec        0.8658/sec    total: 5659
>> txn_incomplete             0.0/sec     0.000/sec        0.0008/sec    total: 33
>> txn_success                0.0/sec     0.000/sec        0.0006/sec    total: 24
>> poll_create_node           2.4/sec     2.750/sec        2.5986/sec    total: 18218
>> poll_zero_timeout          0.0/sec     0.000/sec        0.0100/sec    total: 71
>> rconn_queued               0.0/sec     0.050/sec        0.0531/sec    total: 252570
>> rconn_sent                 0.0/sec     0.050/sec        0.0531/sec    total: 252570
>> seq_change                 1.2/sec     1.383/sec        1.2992/sec    total: 8500
>> pstream_open               0.0/sec     0.000/sec        0.0000/sec    total: 1
>> stream_open                0.0/sec     0.000/sec        0.0000/sec    total: 6
>> unixctl_received           0.0/sec     0.000/sec        0.0019/sec    total: 7
>> unixctl_replied            0.0/sec     0.000/sec        0.0019/sec    total: 7
>> util_xalloc              2731550.2/sec 3129900.483/sec 569276.5414/sec total: 11057381035
>> vconn_open                 0.0/sec     0.000/sec        0.0000/sec    total: 4
>> vconn_received             0.0/sec     0.050/sec        0.0444/sec    total: 201
>> vconn_sent                 0.0/sec     0.000/sec        0.0144/sec    total: 253535
>> netdev_get_ifindex         0.4/sec     0.467/sec        0.4328/sec    total: 2822
>> netlink_received           0.4/sec     0.467/sec        0.4328/sec    total: 2822
>> netlink_sent               0.4/sec     0.467/sec        0.4328/sec    total: 2822
>> cmap_expand                0.0/sec     0.000/sec        0.0000/sec    total: 2
>> 47 events never hit
>> I’ve also attached the output of ovs-bugtool run from the machine running 
>> ovn-northd, and one of the machines running ovn-controller and ovs-vswitchd.
>> Thanks,
>> —Kevin
> 
> Hi Kevin,
> 
> In short, I agree with Ethan's assessment that the hmap numbers appear large. 
> The ACLs, combined with ovn-controller's algorithms, are causing that: the 
> ACLs result in ~250000 OpenFlow flows being generated, and ovn-controller 
> re-creates this flow table from scratch each time it runs its main loop.
> 
> These are as-yet unsolved problems in ovn-controller. The two high-level 
> thoughts on the matter are:
> 1) See if we can reduce the number of generated flows when ACLs are used.
> 2) See if we can apply incremental changes to the flow table rather than 
> re-creating it from scratch every time.
> 
> There are two commits that may be of interest to you.
> 
> * As Miguel mentioned in his response, Russell Bryant submitted a change [1] 
> within the past couple of months that helps with (1) a bit. The idea is that 
> if we can detect that a flow will never be needed on a specific machine, we 
> don't generate it. It won't cover everything, but it could potentially help 
> you. That change is present in 2.7.3 and 2.8.1.
> 
> * A change [2] by Han Zhou fixed an issue where we were generating the flow 
> table and then just throwing it away because vswitchd was not yet ready to 
> accept a new flow table from us. This change is present in 2.7.3 and 2.8.0.
> 
> These two commits aren't silver bullets, but if you're running a version of 
> OVS that does not already include them, you may see some improvement from an 
> upgrade.
> 
> Mark
> 
> [1] https://mail.openvswitch.org/pipermail/ovs-dev/2017-September/338847.html
> [2] https://mail.openvswitch.org/pipermail/ovs-dev/2017-April/331286.html