On 2/19/22 04:26, Peng He wrote: > From hepeng: > https://patchwork.ozlabs.org/project/openvswitch/patch/20200717015041.82746-1-hepeng.0...@bytedance.com/#2487473 > > also from guohongzhi <guohongz...@huawei.com>: > http://patchwork.ozlabs.org/project/openvswitch/patch/20200306130555.19884-1-guohongz...@huawei.com/ > > also from a discussion about the mixing use of RCU and refcount in the mail > list with Ilya Maximets, William Tu, Ben Pfaf, and Gaëtan Rivet. > > A summary, as quoted from Ilya: > > " > RCU for ofproto was introduced for one > and only one reason - to avoid freeing ofproto while rules are still > alive. This was done in commit f416c8d61601 ("ofproto: RCU postpone > rule destruction."). The goal was to allow using rules without > refcounting them within a single grace period. And that forced us > to postpone destruction of the ofproto for a single grace period. > Later commit 39c9459355b6 ("Use classifier versioning.") made it > possible for rules to be alive for more than one grace period, so > the commit made ofproto wait for 2 grace periods by double postponing. > As we can see now, that wasn't enough and we have to wait for more > than 2 grace periods in certain cases. > " > > In a short, the ofproto should have a longer life time than rule, if > the rule lasts for more than 2 grace periods, the ofproto should live > longer to ensure rule->ofproto is valid. It's hard to predict how long > a ofproto should live, thus we need to use refcount on ofproto to make > things easy. The controversial part is that we have already used RCU postpone > to delay ofproto destrution, if we have to add refcount, is it simpler to > use just refcount without RCU postpone? > > IMO, I think going back to the pure refcount solution is more > complicated than mixing using both. > > Gaëtan Rive asks some questions on guohongzhi's v2 patch: > > during ofproto_rule_create, should we use ofproto_ref > or ofproto_try_ref? how can we make sure the ofproto is alive? > > By using RCU, ofproto has three states: > > state 1: alive, with refcount >= 1 > state 2: dying, with refcount == 0, however pointer is valid > state 3: died, memory freed, pointer might be dangling. > > Without using RCU, there is no state 2, thus, we have to be very careful > every time we see a ofproto pointer. In contrast, with RCU, we can be sure > that it's alive at least in this grace peroid, so we can just check if > it is dying by ofproto_try_ref. > > This shows that by mixing use of RCU and refcount we can save a lot of work > worrying if ofproto is dangling. > > In short, the RCU part makes sure the ofproto is alive when we use it, > and the refcount part makes sure it lives longer enough. > > In this patch, I have merged guohongzhi's patch and mine, and fixes > accoring to the previous comments. > > v4->v5: > * fix the commits, remove the ref to wangyunjian's patch and > remove the comments describing the previous ofproto destruction code. > * fix group alloc leak issues. > > v5->v6: > * fix ofproto unref leak > * fix comments > > v6->v7: > * fix comments and typo > > Signed-off-by: Peng He <hepeng.0...@bytedance.com> > Signed-off-by: guohongzhi <guohongz...@huawei.com> > Acked-by: Mike Pattrick <m...@redhat.com> > Acked-by: Gaetan Rivet <gr...@u256.net> > --- > ofproto/ofproto-dpif-xlate-cache.c | 2 + > ofproto/ofproto-dpif-xlate-cache.h | 5 +-- > ofproto/ofproto-dpif-xlate.c | 14 +++--- > ofproto/ofproto-dpif.c | 24 +++++----- > ofproto/ofproto-provider.h | 2 + > ofproto/ofproto.c | 71 +++++++++++++++++++++++++++--- > ofproto/ofproto.h | 4 ++ > 7 files changed, 98 insertions(+), 24 deletions(-)
Thanks, everyone! Applied and backported down to 2.13. This patch doesn't fix the ofproto use-after-free while freeing meter ids, which is the main reason our CI fails periodically. But with introduction of the refcount that should not be hard to fix. I'll send a patch for this issue soon. Best regards, Ilya Maximets. _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev