On 10 Dec 2022, at 1:37, Peng He wrote:
> Patch v5 has statistics issues. > > In order to solve this issue, we had a discussion. > > below is the quote of the email. > > ” > After a second thought, I think maybe keeping INCONSISTENT just for the > modify error is a better option. > > With current patch: > 1. > the modify error case: > OPERATIONAL -> INCONSISTENT -> EVICTING -> EVICTED > 2. > the delete error case: > EVICTING -> EVICTED > > Change both to INCONSISTENT: > > the modify error case: > did not change. > > the delete error case: > EVICTING -> INCONSISTENT -> EVICTED? > > “ > > And we agree to take the second solution. I know, but going over the state meanings again, UKEY_EVICTING means the following: /* Ukey is in umap, datapath flow delete is queued. */ Which now no longer is the case, so should a new state not make more sense? Any one else has some input on this?? > Eelco Chaudron <[email protected]> 于2022年12月8日周四 18:54写道: > >> >> >> On 27 Nov 2022, at 8:28, Peng He wrote: >> >>> push_dp_ops only handles delete ops errors but ignores the modify >>> ops results. It's better to handle all the dp operation errors in >>> a consistent way. >>> >>> We observe in the production environment that sometimes a megaflow >>> with wrong actions keep staying in datapath. The coverage command shows >>> revalidators have dumped several times, however the correct >>> actions are not set. This implies that the ukey's action does not >>> equal to the meagaflow's, i.e. revalidators think the underlying >>> megaflow's actions are correct however they are not. >>> >>> We also check the megaflow using the ofproto/trace command, and the >>> actions are not matched with the ones in the actual magaflow. By >>> performing a revalidator/purge command, the right actions are set. >>> >>> This patch prevents the inconsistency by considering modify failure >>> in revalidators. >>> >>> To note, we cannot perform two state transitions and change ukey_state >>> into UKEY_EVICTED directly here, because, if we do so, the >>> sweep will remove the ukey alone and leave dp flow alive. Later, the >>> dump will retrieve the dp flow and might even recover it. This will >>> contribute the stats of this dp flow twice. >>> >>> Signed-off-by: Peng He <[email protected]> >>> --- >>> ofproto/ofproto-dpif-upcall.c | 34 +++++++++++++++++++++++----------- >>> 1 file changed, 23 insertions(+), 11 deletions(-) >>> >>> diff --git a/ofproto/ofproto-dpif-upcall.c >> b/ofproto/ofproto-dpif-upcall.c >>> index 7ad728adf..c2cefbeb8 100644 >>> --- a/ofproto/ofproto-dpif-upcall.c >>> +++ b/ofproto/ofproto-dpif-upcall.c >>> @@ -2416,26 +2416,30 @@ push_dp_ops(struct udpif *udpif, struct ukey_op >> *ops, size_t n_ops) >>> >>> for (i = 0; i < n_ops; i++) { >>> struct ukey_op *op = &ops[i]; >>> - struct dpif_flow_stats *push, *stats, push_buf; >>> - >>> - stats = op->dop.flow_del.stats; >>> - push = &push_buf; >>> - >>> - if (op->dop.type != DPIF_OP_FLOW_DEL) { >>> - /* Only deleted flows need their stats pushed. */ >>> - continue; >>> - } >>> >>> if (op->dop.error) { >>> - /* flow_del error, 'stats' is unusable. */ >>> if (op->ukey) { >>> ovs_mutex_lock(&op->ukey->mutex); >>> - transition_ukey(op->ukey, UKEY_EVICTED); >>> + if (op->dop.type == DPIF_OP_FLOW_DEL) { >>> + transition_ukey(op->ukey, UKEY_EVICTED); >>> + } else { I think we could use a comment here to make sure why we set it to evicting. Maybe just a reference to the comment in revalidator_sweep__() might be enough. >>> + transition_ukey(op->ukey, UKEY_EVICTING); >>> + } >>> ovs_mutex_unlock(&op->ukey->mutex); >>> } >>> continue; >>> } >>> >>> + if (op->dop.type != DPIF_OP_FLOW_DEL) { >>> + /* Only deleted flows need their stats pushed. */ >>> + continue; >>> + } >>> + >>> + struct dpif_flow_stats *push, *stats, push_buf; >>> + >>> + stats = op->dop.flow_del.stats; >>> + push = &push_buf; >>> + >>> if (op->ukey) { >>> ovs_mutex_lock(&op->ukey->mutex); >>> transition_ukey(op->ukey, UKEY_EVICTED); >>> @@ -2848,6 +2852,14 @@ revalidator_sweep__(struct revalidator >> *revalidator, bool purge) >>> continue; >>> } >>> ukey_state = ukey->state; >>> + >>> + if (ukey_state == UKEY_EVICTING) { >>> + /* previous modify operation fails on this ukey and >> ukey_state >>> + * is set to UKEY_EVICTING, issue a delete operation on >> this >>> + * ukey. >>> + */ >>> + delete_op_init(udpif, &ops[n_ops++], ukey); This would cause the later push_ukey_ops() to call ukey_delete() which will call transition_ukey(ukey, UKEY_DELETED). Which I think will result in an invalid state transaction from UKEY_EVICTING to UKEY_DELETED. >> >> How can we be sure this state here is only for failing to modify >> operations? >> This state is set in other places in the code also. >> > > because revalidate_sweep is called after revaldiate. all the datapath ops > have already been > pushed, and if there are ukeys at UKEY_EVICTING, it's because previously > datapath ops > have errors. and if del ops fail, ukey is deleted. so only ukeys at > UKEY_EVICTING must > come from modify fail. First of all, I missed that p_purge_cb() was pausing the revalidators, so you are right that there is no other way to set UKEY_EVICTING other than the sweet phase. I also missed the part that the ukey_delete() will do the actual removal, and not the if (ukey_state == UKEY_EVICTED) statement lower. Maybe we should reconsider having a special state? Anyhow, I do think the comment in the code needs some changing to make this clear: if (ukey_state == UKEY_EVICTING) { /* A previous modify operation in push_dp_ops() failed and * caused the state to be UKEY_EVICTING. We need to manually * push a delete operation on this ukey to get both the * datapath flow and ukey removed. The later push_ukey_ops() * will call push_dp_ops() and ukey_delete() to accomplish * this. */ delete_op_init(udpif, &ops[n_ops++], ukey); } >> Maybe we should keep the INCONSISTENT state to avoid this (see v5)? >> What about the case the push_dp_ops()/push_ukey_ops() fails in revalidator_sweep__() for a none DELETE action, who will take care of this? The alternative fix might be better in this case. >>> + } >>> if (ukey_state == UKEY_OPERATIONAL >>> || (ukey_state == UKEY_VISIBLE && purge)) { >>> struct recirc_refs recircs = >> RECIRC_REFS_EMPTY_INITIALIZER; >>> -- >>> 2.25.1 >> >> > > -- > hepeng _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
