Hi,

In our scalability test for OVN, we observed that the ovn-northd process does not scale well: the time to bind a logical port grows as the number of logical ports grows, regardless of whether the logical ports belong to the same logical switch. The most likely cause is build_ports(), called by ovnnb_db_run() [1], as described below.
Test description:

step 1: Create 6 logical switches. For each logical switch, create 200 logical ports.
step 2: Bind 200 lports from each logical switch on an OVN chassis.

Test results for step 2:

# of ports | # of ovn_ports allocated | CPU cycles spent in
           | in build_ports()         | build_ports(), in millions
-----------+--------------------------+---------------------------
200        | 200                      | 25
400        | 400                      | 50
600        | 600                      | 75
800        | 800                      | 93
1000       | 1000                     | 108
1200       | 1200                     | 125

We see that each time a logical port is bound on a hypervisor, join_logical_ports() in build_ports() allocates a struct ovn_port for every existing port in the southbound database [2], which accounts for the growing CPU cycle count.

My question is whether there is any particular reason to allocate that many struct ovn_port instances on every binding. It seems to me there is room to optimize this code for performance.

Thanks.

- Hui

[1] https://github.com/openvswitch/ovs/blob/master/ovn/northd/ovn-northd.c#L2529
[2] https://github.com/openvswitch/ovs/blob/master/ovn/northd/ovn-northd.c#L571