Thanks for the reviews! Eelco Chaudron <echau...@redhat.com> 于2023年6月9日周五 20:47写道:
> > > On 3 Jun 2023, at 2:01, Peng He wrote: > > > push_dp_ops only handles delete ops errors but ignores the modify > > ops results. It's better to handle all the dp operation errors in > > a consistent way. > > > > This patch prevents the inconsistency by considering modify failure > > in revalidators. > > > > To note, we cannot perform two state transitions and change ukey_state > > into UKEY_EVICTED directly here, because, if we do so, the > > sweep will remove the ukey alone and leave dp flow alive. Later, the > > dump will retrieve the dp flow and might even recover it. This will > > contribute the stats of this dp flow twice. > > > > v8->v9: add testsuite and delete INCONSISTENT ukey at revalidate_sweep. > > v9->v10: change the commit message and refine the test case. > > Hi Peng, > > Thanks for keeping this patch updated :) Some more comments below, but I > think the next rev should be all fine. > > Cheers, > > Eelco > > > Signed-off-by: Peng He <hepeng.0...@bytedance.com> > > --- > > ofproto/ofproto-dpif-upcall.c | 48 ++++++++++++++++++++++++---------- > > tests/dpif-netdev.at | 49 +++++++++++++++++++++++++++++++++++ > > 2 files changed, 84 insertions(+), 13 deletions(-) > > > > diff --git a/ofproto/ofproto-dpif-upcall.c > b/ofproto/ofproto-dpif-upcall.c > > index cd57fdbd9..5a75d9444 100644 > > --- a/ofproto/ofproto-dpif-upcall.c > > +++ b/ofproto/ofproto-dpif-upcall.c > > @@ -258,6 +258,7 @@ enum ukey_state { > > UKEY_CREATED = 0, > > UKEY_VISIBLE, /* Ukey is in umap, datapath flow install is > queued. */ > > UKEY_OPERATIONAL, /* Ukey is in umap, datapath flow is installed. > */ > > + UKEY_INCONSISTENT, /* Ukey is in umap, datapath flow is > inconsistent. */ > > UKEY_EVICTING, /* Ukey is in umap, datapath flow delete is > queued. */ > > UKEY_EVICTED, /* Ukey is in umap, datapath flow is deleted. */ > > UKEY_DELETED, /* Ukey removed from umap, ukey free is > deferred. */ > > @@ -1999,6 +2000,10 @@ transition_ukey_at(struct udpif_key *ukey, enum > ukey_state dst, > > * UKEY_VISIBLE -> UKEY_EVICTED > > * A handler attempts to install the flow, but the datapath > rejects it. > > * Consider that the datapath has already destroyed it. > > + * UKEY_OPERATIONAL -> UKEY_INCONSISTENT > > + * A revalidator modifies the flow with error returns. > > + * UKEY_INCONSISTENT -> UKEY_EVICTING > > + * A revalidator decides to evict the datapath flow. > > * UKEY_OPERATIONAL -> UKEY_EVICTING > > * A revalidator decides to evict the datapath flow. > > * UKEY_EVICTING -> UKEY_EVICTED > > @@ -2007,7 +2012,9 @@ transition_ukey_at(struct udpif_key *ukey, enum > ukey_state dst, > > * A revalidator has removed the ukey from the umap and is > deleting it. > > */ > > if (ukey->state == dst - 1 || (ukey->state == UKEY_VISIBLE && > > - dst < UKEY_DELETED)) { > > + dst < UKEY_DELETED) || > > + (ukey->state == UKEY_OPERATIONAL && > > Indentation is wrong here, but maybe we should fix both lines, so it’s > easier to read: > > if (ukey->state == dst - 1 || > (ukey->state == UKEY_VISIBLE && dst < UKEY_DELETED) || > (ukey->state == UKEY_OPERATIONAL && dst == UKEY_EVICTING)) { > > > > > + dst == UKEY_EVICTING)) { > > ukey->state = dst; > > } else { > > struct ds ds = DS_EMPTY_INITIALIZER; > > @@ -2472,26 +2479,31 @@ push_dp_ops(struct udpif *udpif, struct ukey_op > *ops, size_t n_ops) > > > > for (i = 0; i < n_ops; i++) { > > struct ukey_op *op = &ops[i]; > > - struct dpif_flow_stats *push, *stats, push_buf; > > - > > - stats = op->dop.flow_del.stats; > > - push = &push_buf; > > - > > - if (op->dop.type != DPIF_OP_FLOW_DEL) { > > - /* Only deleted flows need their stats pushed. */ > > - continue; > > - } > > > > if (op->dop.error) { > > - /* flow_del error, 'stats' is unusable. */ > > if (op->ukey) { > > ovs_mutex_lock(&op->ukey->mutex); > > - transition_ukey(op->ukey, UKEY_EVICTED); > > + if (op->ukey->state == UKEY_EVICTING) { > > How about not checking the UKEY_STATE’s as they are internal, but using > the operation type instead? > This will still result in errors in transition_ukey() for any corner cases > we might have missed (if some new operation gets added and fails). > > ovs_mutex_lock(&op->ukey->mutex); > - if (op->ukey->state == UKEY_EVICTING) { > + if (op->dop.type == DPIF_OP_FLOW_DEL) { > transition_ukey(op->ukey, UKEY_EVICTED); > - } else if (op->ukey->state == UKEY_OPERATIONAL) { > - /* Modify failed, ukey's state was UKEY_OPERATIONAL */ > + } else { > + /* Modification of the flow failed. */ > transition_ukey(op->ukey, UKEY_INCONSISTENT); > } > ovs_mutex_unlock(&op->ukey->mutex); > > > > + transition_ukey(op->ukey, UKEY_EVICTED); > > + } else if (op->ukey->state == UKEY_OPERATIONAL) { > > + /* Modify failed, ukey's state was UKEY_OPERATIONAL > */ > > + transition_ukey(op->ukey, UKEY_INCONSISTENT); > > + } > > ovs_mutex_unlock(&op->ukey->mutex); > > } > > continue; > > } > > > > + if (op->dop.type != DPIF_OP_FLOW_DEL) { > > + /* Only deleted flows need their stats pushed. */ > > + continue; > > + } > > + > > + struct dpif_flow_stats *push, *stats, push_buf; > > + > > + stats = op->dop.flow_del.stats; > > + push = &push_buf; > > + > > if (op->ukey) { > > ovs_mutex_lock(&op->ukey->mutex); > > transition_ukey(op->ukey, UKEY_EVICTED); > > @@ -2839,6 +2851,15 @@ revalidate(struct revalidator *revalidator) > > continue; > > } > > > > + if (ukey->state == UKEY_INCONSISTENT) { > > + ukey->dump_seq = dump_seq; > > + reval_op_init(&ops[n_ops++], UKEY_DELETE, udpif, ukey, > > + &recircs, &odp_actions); > > + ovs_mutex_unlock(&ukey->mutex); > > I think it would be nice to have a coverage counter here for this > situation. > > COVERAGE_INC(dumped_inconsisten_flow); > > > + continue; > > + } > > + > > + > > if (ukey->state <= UKEY_OPERATIONAL) { > > /* The flow is now confirmed to be in the datapath. */ > > transition_ukey(ukey, UKEY_OPERATIONAL); > > @@ -2927,13 +2948,14 @@ revalidator_sweep__(struct revalidator > *revalidator, bool purge) > > } > > ukey_state = ukey->state; > > if (ukey_state == UKEY_OPERATIONAL > > + || (ukey_state == UKEY_INCONSISTENT) > > || (ukey_state == UKEY_VISIBLE && purge)) { > > struct recirc_refs recircs = > RECIRC_REFS_EMPTY_INITIALIZER; > > bool seq_mismatch = (ukey->dump_seq != dump_seq > > && ukey->reval_seq != reval_seq); > > enum reval_result result; > > > > - if (purge) { > > + if (purge || ukey_state == UKEY_INCONSISTENT) { > > result = UKEY_DELETE; > > } else if (!seq_mismatch) { > > result = UKEY_KEEP; > > diff --git a/tests/dpif-netdev.at b/tests/dpif-netdev.at > > index baab60a22..5ee8e6f7e 100644 > > --- a/tests/dpif-netdev.at > > +++ b/tests/dpif-netdev.at > > @@ -716,3 +716,52 @@ AT_CHECK([test `ovs-vsctl get Interface p2 > statistics:tx_q0_packets` -gt 0 -a dn > > > > OVS_VSWITCHD_STOP > > AT_CLEANUP > > + > > +AT_SETUP([dpif-netdev - revalidators handle dp modification fail > correctly]) > > +OVS_VSWITCHD_START( > > + [add-port br0 p1 \ > > + -- set interface p1 type=dummy \ > > + -- set bridge br0 datapath-type=dummy \ > > + -- add-port br0 p2 \ > > + -- set interface p2 type=dummy -- > > + ]) > > + > > +AT_DATA([flows.txt], [dnl > > +table=0,in_port=p1,actions=p2 > > +]) > > + > > +ovs-ofctl add-flows br0 flows.txt > > As this is only a single flow, I would change it to something like: > > -AT_DATA([flows.txt], [dnl > -table=0,in_port=p1,actions=p2 > -]) > - > -ovs-ofctl add-flows br0 flows.txt > +ovs-ofctl add-flow br0 table=0,in_port=p1,actions=p2 > > +ovs-appctl netdev-dummy/receive p1 > 'ipv4(src=10.0.0.1,dst=10.0.0.2),tcp(src=1,dst=2)' > > +ovs-appctl netdev-dummy/receive p1 > 'ipv4(src=10.0.0.1,dst=10.0.0.2),tcp(src=1,dst=2)' > > + > > +AT_CHECK([ovs-appctl dpctl/dump-flows | sed 's/.*thread://' | > strip_xout_keep_actions ], [0], [ > > > +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), > packets:0, bytes:0, used:0.0s, actions:2 > > +]) > > + > > +dnl wait the dp flow enter OPERATIONAL > > Start with capital W for wait, and end with a dot. Some more re-writing: > > -dnl wait the dp flow enter OPERATIONAL > +dnl Wait for the dp flow to enter OPERATIONAL state. > > > +ovs-appctl revalidator/wait > > + > > +AT_CHECK([ovs-appctl revalidator/pause]) > > + > > +dnl delete all dp flows, so flow modify will fail. > > dnl Delete all dp flows, so flow modification will fail. > > > > +AT_CHECK([ovs-appctl dpctl/del-flows]) > > + > > +dnl replace openflow rules, let revalidator work > > +dnl Replace OpenFlow rules and let the revalidator work. > > > +AT_DATA([flows2.txt], [dnl > > +table=0,in_port=p1,ip actions=ct(commit) > > +]) > > Do this in one command > > > +AT_CHECK([ovs-appctl revalidator/resume]) > > + > > +dnl trigger revalidating and wait > > Trigger revalidation and wait for it to complete. > > > +AT_CHECK([ovs-ofctl --bundle replace-flows br0 flows2.txt]) > > -AT_CHECK([ovs-ofctl --bundle replace-flows br0 flows2.txt]) > > > +ovs-appctl revalidator/wait > > + > > +dnl inconsistent ukey should be deleted. > > +AT_CHECK([ovs-appctl upcall/show | grep keys | grep -q -v 0], [1]) > > If we add the new coverage counter we should add the check here also. > This test case constructs a case that the revalidator is trying to modify a flow that does not exist, so we can't have the counter here (the megaflow is not dumped, it has been deleted). We only test if the revalidator would remove the inconsistent ukey after it found the ukey is in the wrong state. > > > +dnl delete revalidator warnings log to let testsuite pass. > > +sed -i 's/.*failed to put.*$//' ovs-vswitchd.log > > +sed -i 's/.*failed to flow_del.*$//' ovs-vswitchd.log > > Should we not explicitly check for the flow(_del) error, as this will tell > us the test introduced the right problem to begin with? > We can check if the warning logs exist to make sure the error happens. So yes. > > > The above we can do through the OVS_VSWITCHD_STOP(["......"] command. > > > + > > +OVS_VSWITCHD_STOP > > +AT_CLEANUP > > -- > > 2.25.1 > > -- hepeng _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev