Hi,

Just some thoughts on the results.

test-conntrack.c now prepares only one batch of packets for each
thread, and repeatedly runs conntrack on the same batch until the
number of processed packets reaches the n_pkt parameter, which in
your case is 16777216.
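
Roughly, the loop has this shape (illustrative names, not the exact
test-conntrack.c symbols):

    /* Each thread builds a single batch up front... */
    struct dp_packet_batch batch;

    prepare_batch(&batch, batch_size);   /* hypothetical helper */

    /* ...then keeps feeding the very same batch to conntrack until
     * n_pkt (16777216 here) packets have been processed in total. */
    for (long long n = 0; n < n_pkt; n += batch_size) {
        run_ct_on_batch(ct, &batch);     /* hypothetical wrapper around
                                          * conntrack_execute() */
    }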

So I guess the optimization results largely come from the fact that
you removed the global lock on the timeout lists, which was taken and
released for every packet.

The split-bucket patch reduces that contention and thus increases the
connection setup rate; I think we may need a new benchmark for the
setup rate.



On Mon, Mar 7, 2022 at 20:39, Paolo Valerio <pvale...@redhat.com> wrote:

> This series aims to share the work done so far, and to start a
> discussion of a slightly different approach in terms of the way we
> handle the connections.
> It's not considered ready as further work and extensive tests are
> needed.
>
> These are some performance numbers obtained in the following way
> ($i is the number of threads):
>
> ./tests/ovstest test-conntrack benchmark $i 16777216 32 1
>
> $i   baseline  ct-scale  ct-bucket
>  2   16480 ms   3527 ms    3180 ms
>  4   37375 ms   4997 ms    4477 ms
>  8   63239 ms   9692 ms    8954 ms
> 16  131583 ms  19303 ms   18062 ms
>
> Both ct-scale and ct-bucket introduce a noticeable improvement in
> terms of performance. It's worth mentioning that ct-scale, keeping
> the exp_lists, is more efficient during the sweeping.
>
> There are two main logical changes that the series tries to introduce.
> The first one is the reintroduction of buckets, the second one is the
> removal of the expiration lists.
>
> The first two patches are a prereq and were picked (untouched) from
> this series (by Gaetan):
>
>
> https://patchwork.ozlabs.org/project/openvswitch/list/?series=249027&state=*
>
> The third patch is a follow-up to the following:
>
>
> https://patchwork.ozlabs.org/project/openvswitch/patch/20201129033255.64647-2-hepeng.0...@bytedance.com/
>
> Some logic has changed, but the overall idea remained the same.
> Additional details in the patch description.
>
> The fourth patch introduces the buckets and removes the expiration
> lists.
> The buckets have been reintroduced as cmaps instead of hmaps, also
> trying to narrow down the critical sections.
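>
> In rough terms (a simplified sketch; the actual layout in patch #4
> differs in the details), the arrangement is along these lines:
>
>     #include "cmap.h"
>     #include "ovs-thread.h"
>
>     #define CONNTRACK_BUCKETS 8
>
>     struct conntrack_bucket {
>         struct ovs_mutex lock;  /* Serializes writers in this bucket. */
>         struct cmap conns;      /* Connections; reads stay lock-free. */
>     };
>
>     struct conntrack {
>         struct conntrack_bucket buckets[CONNTRACK_BUCKETS];
>     };
>
>     /* A connection maps to a bucket through its hash. */
>     static inline uint32_t
>     hash_to_bucket(uint32_t hash)
>     {
>         return hash % CONNTRACK_BUCKETS;
>     }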
>
> An alternative approach may involve replacing the cmaps with linked
> lists (without increasing the number of locks) and increasing the
> number of buckets (possibly including the ability to change the
> number of buckets when the user changes the connection limit). This
> could improve the eviction of stale conns during packet processing,
> but it is currently out of scope for this series (it can be
> discussed, though).
>
> The insertion, namely the creation of the connection, can probably
> be improved (or at least an improvement can be evaluated), as, in
> the NAT case, it acquires the bucket locks by calling
> nat_get_unique_tuple_lock(). This is needed because we may need to
> know the reverse tuple before locking, so that the locks can be
> acquired in the right order to avoid ABBA deadlocks.
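>
> To illustrate the ordering constraint (simplified; the real
> nat_get_unique_tuple_lock() does more than this), taking two bucket
> locks safely looks like:
>
>     /* Acquire two bucket locks in a global order (lowest index
>      * first) so that threads taking the same pair of buckets can
>      * never deadlock against each other (ABBA). */
>     static void
>     buckets_lock(struct conntrack *ct, uint32_t b1, uint32_t b2)
>     {
>         if (b1 == b2) {
>             ovs_mutex_lock(&ct->buckets[b1].lock);
>         } else if (b1 < b2) {
>             ovs_mutex_lock(&ct->buckets[b1].lock);
>             ovs_mutex_lock(&ct->buckets[b2].lock);
>         } else {
>             ovs_mutex_lock(&ct->buckets[b2].lock);
>             ovs_mutex_lock(&ct->buckets[b1].lock);
>         }
>     }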
>
> Further details about locking may be found in patch #4 description.
>
> There's a known limitation, and it is mostly related to the zone
> limit. Keeping the stale entries leads to a mismatch between the
> current number of per-zone connections and the actual number of
> valid connections. This means that, with a small zone limit, we may
> wait a whole "next_wakeup" interval before the entries get cleaned
> up (assuming we have candidates for removal). To avoid this, a zone
> cleanup from the packet path has been added, but alternative
> approaches, such as triggering a per-zone cleanup when a threshold
> of connections has been reached (storing the connections per bucket
> and per zone during creation), may still be viable.
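>
> As a sketch of the threshold alternative (hypothetical helpers, not
> code from this series):
>
>     /* Hypothetical: called from the packet path; sweeps a zone once
>      * its (possibly stale) entry count nears the configured limit. */
>     static void
>     maybe_clean_zone(struct conntrack *ct, uint16_t zone)
>     {
>         uint32_t count = zone_conn_count(ct, zone);  /* hypothetical */
>         uint32_t limit = zone_conn_limit(ct, zone);  /* hypothetical */
>
>         if (limit && count >= limit * 9 / 10) {
>             conntrack_clean_zone(ct, zone);          /* hypothetical */
>         }
>     }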
>
> Expiration lists have been removed for the sake of evaluation, but
> they could be reintroduced and duplicated among the buckets. In this
> way we could distribute the contention (effectively reducing the
> number of connections per list) that currently affects write access
> to a single data structure on every connection update.
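>
> Schematically, duplicating the lists would mean something like this
> (again a sketch, reusing the bucket layout above):
>
>     struct conntrack_bucket {
>         struct ovs_mutex lock;
>         struct cmap conns;
>         /* One set of timeout lists per bucket instead of a single
>          * global set, so updates contend only within one bucket. */
>         struct ovs_list exp_lists[N_CT_TM];
>     };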
>
> Patch #6 reintegrates the command below with multiple buckets:
>
> ovs-appctl dpctl/ct-bkts
>
> A couple of minor patches have been kept out of the series for the
> time being, as they strongly depend on #4.
>
> Gaetan Rivet (3):
>       conntrack: Use a cmap to store zone limits
>       conntrack-tp: Use a cmap to store timeout policies
>       conntrack: Use an atomic conn expiration value
>
> Paolo Valerio (2):
>       conntrack: Split single cmap to multiple buckets.
>       conntrack: Make ovs-appctl dpctl/ct-bkts work with multiple buckets
>
> Peng He (1):
>       conntrack: Replaces nat_conn introducing key directionality.
>
>
>  lib/conntrack-private.h |   60 ++-
>  lib/conntrack-tp.c      |  102 ++--
>  lib/conntrack.c         | 1062 +++++++++++++++++++++++----------------
>  lib/conntrack.h         |    6 +-
>  lib/dpif-netdev.c       |    5 +-
>  tests/system-traffic.at |    5 +-
>  6 files changed, 707 insertions(+), 533 deletions(-)
>
>

-- 
hepeng