Hi all,

I have a question about the main thread in ovs-vswitchd.c. The details
follow, and any comments would be appreciated.

In ovs-vswitchd, the netdev_linux_rxq_recv function [1] and the
handle_flow_stats_request function [2] run on the same thread.
The former polls the tap interface, while the latter serves
ovs-ofctl dump-flows requests. Because the two functions share a thread,
a long-running flow dump delays packet forwarding.

As an example, assume that several million flow entries are registered. When
the ovs-ofctl dump-flows command is executed, handle_flow_stats_request takes
several seconds, and netdev_linux_rxq_recv does not run for the same amount of
time. As a result, packets from the tap interface are delayed by several
seconds. The following uprobe trace shows such a stall on thread 3115:

14:58:19 tid:3115 uprobe netdev_linux_rxq_recv
14:58:19 tid:3115 uprobe netdev_linux_rxq_recv
14:58:19 tid:3115 uprobe netdev_linux_rxq_recv
14:58:19 tid:3115 uprobe netdev_linux_rxq_recv
14:58:19 tid:3115 uprobe netdev_linux_rxq_recv
14:58:19 tid:3115 uprobe handle_flow_stats_request     # packet forwarding stalls from here
14:58:21 tid:3115 uretprobe  handle_flow_stats_request
14:58:21 tid:3115 uprobe netdev_linux_rxq_recv
14:58:21 tid:3115 uprobe netdev_linux_rxq_recv
14:58:21 tid:3115 uprobe netdev_linux_rxq_recv
14:58:21 tid:3115 uprobe netdev_linux_rxq_recv
14:58:21 tid:3115 uprobe netdev_linux_rxq_recv

1. I think it would be better to run these two functions on separate threads.
   Is there a reason they run on the same thread?
2. If my workload requires millions of flow entries, is there a way to avoid
   this delay?

Diagram
-------

+-------- Physical Server -----------------------------------------------------------+     +--- physical switch ---+
|                                                                                    |     |                       |
|                         +----- OvS/DPDK --------------+                            |     |                       |
| SRC of ping6            |                             |                            |     |   DST of ping6        |
| int1 (tap interface)----|---br-ext (actions=normal)---|---enp94s0f1 (physical nic)-|-----|---port1               |
|                         |                             |                            |     |                       |
|                         |   br-acl (1.6M dummy flows) |                            |     |                       |
|                         |                             |                            |     |                       |
|                         +-----------------------------+                            |     |                       |
+------------------------------------------------------------------------------------+     +-----------------------+

Setup
-----

# start ovs
ovs-ctl start

# change log level to debug
ovs-appctl vlog/set file::dbg
ovs-appctl vlog/list

# initialize DPDK
ovs-vsctl set Open_vSwitch . other_config:dpdk-init="true"
ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=f
ovs-vsctl set Open_vSwitch . other_config:dpdk-lcore-mask=f0
ovs-vsctl get Open_vSwitch . other_config
ovs-vsctl get Open_vSwitch . dpdk_version
ovs-vsctl get Open_vSwitch . dpdk_initialized
ovs-vswitchd --version  # ovs-vswitchd (Open vSwitch) 2.15.90, DPDK 21.05.0-rc0

# create 2 bridges
ovs-vsctl --may-exist add-br br-acl -- set Bridge br-acl datapath_type=netdev
ovs-vsctl --may-exist add-br br-ext -- set Bridge br-ext datapath_type=netdev

# br-ext: add mellanox nic
ovs-vsctl --may-exist add-port br-ext enp94s0f1 \
  -- set Interface enp94s0f1 type=dpdk options:dpdk-devargs=0000:5e:00.1

# br-ext: add internal port (int1)
ovs-vsctl add-port br-ext int1 -- set Interface int1 type=internal
ip link set int1 up

# br-ext: confirm the default flow (actions=NORMAL)
ovs-ofctl dump-flows br-ext  # actions=NORMAL

# br-acl: add dummy ports
ovs-vsctl add-port br-acl dummy1 -- set Interface dummy1  type=internal
ovs-vsctl add-port br-acl dummy2 -- set Interface dummy2  type=internal

# br-acl: add 1.6M dummy flows
ovs-ofctl replace-flows br-acl ./1.6M-dummy-flow.txt

# summary
ovs-vsctl show
#
# 8838f149-f719-4243-a7f2-5b9aa179cb7e
#     Bridge br-acl
#         datapath_type: netdev
#         Port dummy1
#             Interface dummy1
#                 type: internal
#         Port dummy2
#             Interface dummy2
#                 type: internal
#         Port br-acl
#             Interface br-acl
#                 type: internal
#     Bridge br-ext
#         datapath_type: netdev
#         Port br-ext
#             Interface br-ext
#                 type: internal
#         Port int1
#             Interface int1
#                 type: internal
#         Port enp94s0f1
#             Interface enp94s0f1
#                 type: dpdk
#                 options: {dpdk-devargs="0000:5e:00.1"}
#     ovs_version: "2.15.90"
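
The contents of 1.6M-dummy-flow.txt are not reproduced here. As a minimal
sketch, a file of distinct flows in the ovs-ofctl text format could be
generated as follows. The function name gen_dummy_flows, the match on nw_src,
the priority, and the drop action are all assumptions chosen only so that every
line is a unique flow; they may differ from the flows used in this test.

```shell
# Hypothetical generator for a dummy flow file (not the file used above).
# $1 = number of flows to emit, one flow per line, on stdout.
gen_dummy_flows() {
    awk -v n="$1" 'BEGIN {
        for (i = 0; i < n; i++) {
            # Spread i over the low 24 bits of 10.0.0.0/8, so every flow has
            # a different nw_src (up to 16777216 distinct flows).
            a = int(i / 65536) % 256
            b = int(i / 256) % 256
            c = i % 256
            printf "table=0,priority=100,ip,nw_src=10.%d.%d.%d,actions=drop\n", a, b, c
        }
    }'
}

# Print three flows as a quick check of the format.
gen_dummy_flows 3
```

For example, "gen_dummy_flows 1600000 > ./1.6M-dummy-flow.txt" would produce a
file with the number of entries used in this setup.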

Reproduction Steps
------------------

# send out ICMP requests in the background.
ping6 $(ip -6 neigh | awk '/int1/ {print $1}') -I int1 -D &

# while this command runs, the delay is visible in the output of the ping
# command.
ovs-ofctl dump-flows br-acl | wc -l
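
To read the delay off the ping output without scanning it by eye, the largest
round-trip time can be extracted with a small filter. This is a sketch under
the assumption that each reply line contains a "time=<value> ms" field, as
iputils ping prints it; max_rtt is a hypothetical helper name.

```shell
# Hypothetical helper: read ping output on stdin and print the largest
# round-trip time (in ms) found in any "time=<value>" field.
max_rtt() {
    awk '{
        for (i = 1; i <= NF; i++)
            if ($i ~ /^time=/) {
                t = substr($i, 6) + 0   # strip the leading "time="
                if (t > max) max = t
            }
    } END { printf "%.3f\n", max + 0 }'
}

# Possible usage on the test host (run dump-flows in another terminal):
#   ping6 -c 20 $(ip -6 neigh | awk '/int1/ {print $1}') -I int1 -D | max_rtt
```

Lines without a "time=" field, such as the ping summary, are ignored.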

Latencies
---------

For reference, here is the relationship between the number of flow entries and
the ping6 latency.

Number of flow entries in br-acl  Ping6 latency (ms)
8                                 0.185
16                                0.1691
32                                0.1797
64                                0.1631
128                               0.1778
256                               0.1672
512                               0.1774
1024                              0.1672
2048                              0.176
4096                              0.171
8192                              4.4824
16384                             16.3105
32768                             40.4221
65536                             89.9295
131072                            193.9651
262144                            404.6833
524288                            824.7896
1048576                           1647.5226
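
From about 131072 entries upward, the latency roughly doubles each time the
number of entries doubles. Assuming that this linear trend continues (an
assumption, not a measurement), the delay for a larger table can be estimated
from any one row. The sketch below uses a hypothetical helper,
estimate_stall_ms, and the last row of the table to extrapolate to 1.6M
entries.

```shell
# Hypothetical helper: linear extrapolation of the latency.
# $1 = reference flow count, $2 = latency in ms at that count,
# $3 = target flow count. Prints the estimated latency in ms.
estimate_stall_ms() {
    awk -v n="$1" -v ms="$2" -v target="$3" \
        'BEGIN { printf "%.0f\n", ms / n * target }'
}

# Last row of the table (1048576 entries, 1647.5226 ms) scaled to 1.6M entries.
estimate_stall_ms 1048576 1647.5226 1600000   # prints 2514
```

The result, about 2.5 seconds, is of the same order as the gap between
14:58:19 and 14:58:21 in the uprobe trace, which has only one-second
resolution.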

[1]: https://github.com/openvswitch/ovs/blob/f8be30acf2eb60d567bb7386b98f5cb58ddb9119/lib/netdev-linux.c#L1461
[2]: https://github.com/openvswitch/ovs/blob/f8be30acf2eb60d567bb7386b98f5cb58ddb9119/ofproto/ofproto.c#L4639

Best Regards,
Nobuhiro Miki