On Thu, Sep 11, 2025 at 12:11 PM Ilya Maximets <[email protected]> wrote:
>
> On 9/2/25 9:29 AM, Han Zhou wrote:
> > When adding ports to bridges with many existing ports, bridge
> > reconfiguration was spending significant CPU time in repeated calls
> > to ofproto_port_open_type() during MTU calculations.
> >
> > For N existing ports, this resulted in O(N) calls to
> > ofproto_port_open_type() per port addition.
> >
> > perf shows:
> > ...
> > +   75.47%  0.00%  ovs-vswitchd  ovs-vswitchd  [.] ofproto_run
> > +   66.17%  0.00%  ovs-vswitchd  ovs-vswitchd  [.] update_mtu
> > +   65.77%  7.27%  ovs-vswitchd  ovs-vswitchd  [.] find_min_mtu
> > +   65.77%  0.00%  ovs-vswitchd  ovs-vswitchd  [.] update_mtu_ofproto (inlined)
> > +   48.75%  1.38%  ovs-vswitchd  ovs-vswitchd  [.] ofport_is_mtu_overridden
> > +   48.56%  0.00%  ovs-vswitchd  ovs-vswitchd  [.] ofport_is_internal_or_patch (inlined)
> > +   37.19%  2.21%  ovs-vswitchd  ovs-vswitchd  [.] dpif_port_open_type
> > +   37.19%  0.00%  ovs-vswitchd  ovs-vswitchd  [.] ofproto_port_open_type (inlined)
> > ...
> >
> > With 5k existing ports, adding a new port took 5.149s.
> > After this patch, it takes 3.201s for the same operation.
>
> Hi, Han.  Thanks for the patch!
>
> I think it's an interesting change on its own, but I wonder what the
> datapath and the port types were in your test?
>
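For context, the update_mtu() path in the profile above boils down to a
full rescan of all ports for every changed port.  A simplified sketch
of find_min_mtu() from ofproto/ofproto.c (trimmed for brevity, not the
verbatim code):

    static int
    find_min_mtu(struct ofproto *p)
    {
        struct ofport *ofport;
        int mtu = 0;

        HMAP_FOR_EACH (ofport, hmap_node, &p->ports) {
            int dev_mtu;

            /* This check resolves the port type through
             * ofproto_port_open_type(), i.e. one type lookup per
             * existing port, hence O(N) work per update_mtu() call. */
            if (ofport_is_mtu_overridden(p, ofport)) {
                continue;
            }
            if (netdev_get_mtu(ofport->netdev, &dev_mtu)) {
                continue;
            }
            if (!mtu || dev_mtu < mtu) {
                mtu = dev_mtu;
            }
        }
        return mtu ? mtu : ETH_PAYLOAD_MAX;
    }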
Thanks Ilya!  Yes, you are right: I was using the sandbox without
specifying any port type.  I didn't realize it was a performance bug
specific to the dummy datapath.

> I see similar performance with the dummy datapath, but the reason for
> that is incorrect behavior of the netdev_dummy_set_config() function.
> It unconditionally calls netdev_change_seq_changed(), and that is what
> actually causes the quadratic complexity on port addition.  In
> ofproto_run() we collect all the ports that changed into the
> 'devnames' sset, and then update_port() is called for each of them;
> that calls update_mtu(), then update_mtu_ofproto(), which iterates
> over all the ports again.  So, we always do a nested iteration over
> all the ports, with N^2 complexity.  For real ports this should not
> normally happen, as they will not be marked as changed and so will not
> be added to the 'devnames' set.
>
> If I comment out the netdev_change_seq_changed() call, then it takes
> about 1 second to add a new dummy port at 5000 ports.  The time
> complexity is still quadratic here because of the very inefficient
> get_port_by_name() implementation in lib/dpif-netdev.c.  We don't have
> a hash map there that indexes by name, only by port number, so this
> function just iterates over all the ports and compares names.  If we
> add a new hash map there that indexes by name, then port addition in
> the dummy datapath (and the userspace datapath in general) should be a
> lot faster.
>

I tried the sandbox with "--dummy=system" and created ports with
type=internal.  It took around 1.2s at 5000 ports, and perf shows
exactly get_port_by_name() as the bottleneck, as you pointed out.

> If I do the same test, but adding tunnel ports into a Linux kernel
> datapath, I do not see the same issue as with dummy ports.  One tunnel
> port at 5000 ports can be added in about 350ms.  This is relatively
> fast because we're not doing a lot of syscalls for tunnel ports, since
> there is only one shared device for all of them in the kernel.
>

I tried the sandbox with "--dummy=system" and geneve ports.  The result
matches your observation: ~350ms for the 5000th port.  Then I tried 16k
ports in the same sandbox environment, and port-add took ~1.2s, which
shows linear growth; I'd guess ~5s for 64k ports.  However, in a scale
test environment running the kernel datapath with real geneve ports,
adding a new port at the 16k level took 5s.  We couldn't do perf
profiling in that environment yet; it may be due to VM/CPU differences.

> The situation is a bit worse for kernel ports that are actually backed
> by separate kernel interfaces.  The main problem here is all the
> OVS_VPORT_CMD_GET syscalls that are made for each port separately as
> part of the OFPROTO_PORT_FOR_EACH() macro inside the main iteration in
> vswitchd/bridge.c:bridge_delete_or_reconfigure_ports().
> Addition of one internal port into a Linux kernel datapath takes about
> 500ms at 5000 preexisting ports.  I suspect that if we change the
> logic to dump all the ports once instead of doing a separate syscall
> per port, this could be noticeably faster, but I didn't try.
>

This is less of a concern, because it is not reasonable to have that
many kernel interfaces.  The only use case I can imagine at that scale
is tunnel ports.

> All in all, the MTU-related code should not be a real issue.  It is
> only a problem if an interface reports a change every time you touch
> it, which is only the case for dummy ports.  This should never happen
> for any type of real port.
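Makes sense.  For the dummy part, I suppose it is enough to remember
the last applied config and skip the seq bump when nothing changed.  A
rough, untested sketch (the stored 'config' smap is a hypothetical
addition to struct netdev_dummy):

    static int
    netdev_dummy_set_config(struct netdev *netdev_,
                            const struct smap *args, char **errp OVS_UNUSED)
    {
        struct netdev_dummy *netdev = netdev_dummy_cast(netdev_);

        ovs_mutex_lock(&netdev->mutex);
        if (smap_equal(args, &netdev->config)) {
            /* Nothing actually changed: don't bump the change seq, so
             * ofproto_run() won't re-run update_port() for this port. */
            ovs_mutex_unlock(&netdev->mutex);
            return 0;
        }
        smap_destroy(&netdev->config);
        smap_clone(&netdev->config, args);

        /* ... apply the new settings as before ... */

        netdev_change_seq_changed(netdev_);
        ovs_mutex_unlock(&netdev->mutex);
        return 0;
    }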
> What we actually need to optimize:
>
>  1. Optimize lib/dpif-netdev.c:get_port_by_name() to avoid the linear
>     port lookup.
>
>  2. Re-work the OFPROTO_PORT_FOR_EACH() macro to dump all the ports
>     from the datapath once and then look them up in a local hash map,
>     instead of looking up every port with a separate syscall.
>
>  3. Fix netdev_dummy_set_config() to avoid flagging a change if
>     nothing actually changed.
>
> What do you think?

I agree, please ignore this patch.  And maybe 1) is the most important
(a rough sketch is in the P.S. below), 3) is good to have (see the
sketch above), and 2) is not important for real-world use cases.

Thanks,
Han

>
> Best regards, Ilya Maximets.
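P.S.  For 1), I imagine something along these lines.  A rough, untested
sketch; the 'ports_by_name' map and the 'name_node' field are
hypothetical additions to struct dp_netdev and struct dp_netdev_port,
maintained with hmap_insert()/hmap_remove() alongside the existing
port-number-indexed 'ports' map:

    static int
    get_port_by_name(struct dp_netdev *dp, const char *devname,
                     struct dp_netdev_port **portp)
    {
        struct dp_netdev_port *port;
        uint32_t hash = hash_string(devname, 0);

        /* O(1) on average, instead of scanning every port and
         * strcmp()ing its name. */
        HMAP_FOR_EACH_WITH_HASH (port, name_node, hash,
                                 &dp->ports_by_name) {
            if (!strcmp(netdev_get_name(port->netdev), devname)) {
                *portp = port;
                return 0;
            }
        }
        *portp = NULL;
        return ENODEV;
    }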
