It's that easy to reproduce? Thanks! Then I will have to dig into it again.
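For anyone following along, here is a minimal C sketch of the check this race
defeats. The names (struct ukey, already_handled) are simplified stand-ins of
my own, not the actual identifiers in ofproto/ofproto-dpif-upcall.c, where the
ukey is a struct udpif_key and the check sits in the revalidator's dump loop:

    /* Minimal sketch (simplified names, not the real OVS code) of the
     * "already dumped this round" check that the race defeats. */

    #include <stdbool.h>
    #include <stdint.h>

    struct ukey {
        uint64_t dump_seq;  /* Dump round in which this flow was last
                             * handled; 0 for a freshly created ukey. */
    };

    /* Each revalidator thread calls this for every megaflow in its batch. */
    static bool
    already_handled(const struct ukey *ukey, uint64_t current_dump_seq)
    {
        /* True if some revalidator already attributed this flow's stats
         * during the current dump round. */
        return ukey->dump_seq == current_dump_seq;
    }

    /* The race:
     *  1. Revalidator 1 handles megaflow A (ukey->dump_seq = current) and
     *     deletes it from the datapath.
     *  2. A PMD thread installs megaflow B with the same match/actions;
     *     try_ukey_replace() swaps in a new ukey with dump_seq == 0.
     *  3. Revalidator 2 still has A queued from the same dump.  It finds
     *     the new ukey, already_handled() returns false, and A's stats
     *     are attributed a second time.
     */

The one-line fix in the quoted patch below simply carries the old dump_seq
across the replacement, so step 3 skips the stale dump entry as intended.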
Eelco Chaudron <echau...@redhat.com> wrote on Fri, 30 Sep 2022 at 23:39:
>
>
> On 30 Sep 2022, at 17:26, Peng He wrote:
>
> > Eelco Chaudron <echau...@redhat.com> wrote on Fri, 30 Sep 2022 at 23:01:
> >
> >>
> >>
> >> On 23 Sep 2022, at 18:29, Peng He wrote:
> >>
> >>> The userspace datapath manages all the megaflows with a cmap. The cmap
> >>> data structure grows/shrinks during datapath processing, and doing so
> >>> re-positions megaflows. As a result, two revalidator threads
> >>> may process the same megaflow during one dump stage.
> >>>
> >>> Consider this situation: revalidator 1 processes a megaflow A and
> >>> decides to delete it from the datapath; meanwhile, megaflow A
> >>> is also queued in the process batch of revalidator 2. Normally it is ok
> >>> for revalidators to process the same megaflow multiple times, as the
> >>> dump_seq shows it has already been dumped and the stats will not be
> >>> contributed twice.
> >>>
> >>> Assume that right after A is deleted, a PMD thread generates
> >>> a new megaflow B which has the same match and actions as A. The ukey
> >>> of megaflow B will replace the one of megaflow A. Now ukey B is
> >>> new to the revalidator system and its dump seq is 0.
> >>>
> >>> Since the dump seq of ukey B is 0, when processing megaflow A,
> >>> revalidator 2 will not recognize that megaflow A has already been
> >>> dumped by revalidator 1 and will contribute the old megaflow A's stats
> >>> again, resulting in inconsistent stats between ukeys and megaflows.
> >>>
> >>> To fix this, the newly generated ukey B should take the dump_seq
> >>> of the replaced ukey A, to avoid the same megaflow being revalidated
> >>> twice in one dump stage.
> >>>
> >>> We observe in the production environment that the OpenFlow rules' stats
> >>> are sometimes amplified compared to the actual value. I believe this
> >>> is also the reason why there is sometimes a mismatch between the
> >>> ukey and megaflow in stats value. Eelco's patch
> >>> [ovs-dev] [PATCH v2 09/10] revalidator: Fix datapath statistics update
> >>> tried to fix it in the past.
> >>
> >> This sounds plausible, are your statistics extremely elevated?
> >> Mine are on the order of n_bytes=18446744073705804134 where it should be
> >> around 100.
> >>
> >> It looks more like an overflow.
> >
> > I just got the report from another team; I need to ask them. We will have
> > 7 days off due to the National Day holiday,
> > so it will take time to get the value. :(
>
> Enjoy your time off!! This fix is not solving my problem;
>
> I still get the error once out of X runs, n_bytes=18446744073709393054
>
> >> I’ll try to get my old setup up and run it continuously over the weekend
> >> and see if it’s replicated again.
> >>
> >> thanks!
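Before the quoted patch below, a quick note on how the double-counting shows
up in the counters: the revalidator credits only the increment between the
dumped datapath counters and the ukey's cached copy. A rough sketch, with
illustrative names rather than the exact OVS code:

    /* Rough sketch of why a fresh replacement ukey inflates OpenFlow
     * counters.  Names are illustrative, not the real OVS identifiers. */

    #include <stdint.h>

    struct flow_stats {
        uint64_t n_packets;
        uint64_t n_bytes;
    };

    struct ukey {
        struct flow_stats stats;  /* Datapath counters as of last push. */
    };

    static struct flow_stats
    stats_increment(struct ukey *ukey, const struct flow_stats *dumped)
    {
        /* Dumped datapath counters are cumulative, so only the increment
         * since the previous push is credited to the OpenFlow rule. */
        struct flow_stats delta = {
            .n_packets = dumped->n_packets - ukey->stats.n_packets,
            .n_bytes = dumped->n_bytes - ukey->stats.n_bytes,
        };

        ukey->stats = *dumped;
        return delta;
    }

With a freshly replaced ukey the cached copy is zero, so a stale dump of the
old flow credits its full lifetime counters a second time; and if the cached
copy ever exceeds the dumped counters, the unsigned subtraction wraps to a
near-2^64 value like the n_bytes figures quoted above.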
> >>> Signed-off-by: Peng He <hepeng.0...@bytedance.com>
> >>> ---
> >>>  ofproto/ofproto-dpif-upcall.c | 1 +
> >>>  1 file changed, 1 insertion(+)
> >>>
> >>> diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
> >>> index e8bbcfeaf..89fad1bdf 100644
> >>> --- a/ofproto/ofproto-dpif-upcall.c
> >>> +++ b/ofproto/ofproto-dpif-upcall.c
> >>> @@ -1877,6 +1877,7 @@ try_ukey_replace(struct umap *umap, struct udpif_key *old_ukey,
> >>>      ovs_mutex_lock(&new_ukey->mutex);
> >>>      cmap_replace(&umap->cmap, &old_ukey->cmap_node,
> >>>                   &new_ukey->cmap_node, new_ukey->hash);
> >>> +    new_ukey->dump_seq = old_ukey->dump_seq;
> >>>      ovsrcu_postpone(ukey_delete__, old_ukey);
> >>>      transition_ukey(old_ukey, UKEY_DELETED);
> >>>      transition_ukey(new_ukey, UKEY_VISIBLE);
> >>> --
> >>> 2.25.1
> >>
> >>
> >
> > --
> > hepeng

--
hepeng

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev