On Tue, Apr 07, 2020 at 02:57:03PM -0700, Yi-Hung Wei wrote: > On Tue, Apr 7, 2020 at 2:50 PM Yifeng Sun <pkusunyif...@gmail.com> wrote: > > > > Tun_metadata can be referened by flow and frozen_state at the same > > time. When ovs-vswitchd handles TLV table mod message, the involved > > tun_metadata gets freed. The call trace to free tun_metadata is > > shown as below: > > > > ofproto_run > > - handle_openflow > > - handle_single_part_openflow > > - handle_tlv_table_mod > > - tun_metadata_table_mod > > - tun_metadata_postpone_free > > > > Unfortunately, this tun_metadata can be still used by some frozen_state, > > and later on when frozen_state tries to access its tun_metadata table, > > ovs-vswitchd crashes. The call trace to access tun_metadata from > > frozen_state is shown as below: > > > > udpif_upcall_handler > > - recv_upcalls > > - process_upcall > > - frozen_metadata_to_flow > > > > It is unsafe for frozen_state to reference tun_table because tun_table > > is protected by RCU while the lifecycle of frozen_state can span several > > RCU quiesce states. Current code violates OVS's RCU protection mechanism. > > > > This patch fixes it by simply stopping frozen_state from referencing > > tun_table. If frozen_state needs tun_table, we can find the latest valid > > tun_table through ofproto_get_tun_tab() efficiently. > > > > A previous commit seems fixing the samiliar issue: > > 254878c18874f6 (ofproto-dpif-xlate: Fix segmentation fault caused by > > tun_table) > > > > VMware-BZ: #2526222 > > Signed-off-by: Yifeng Sun <pkusunyif...@gmail.com> > > --- > > v1->v2: Drop the fix based on reference count. It doesn't fit well with RCU > > mechanism. Thanks William and YiHung for the offline discussion. > > > > ofproto/ofproto-dpif-rid.h | 7 +++++++ > > ofproto/ofproto-dpif-upcall.c | 2 ++ > > 2 files changed, 9 insertions(+) > > > > diff --git a/ofproto/ofproto-dpif-rid.h b/ofproto/ofproto-dpif-rid.h > > index e5d02caf28a3..5235764a9885 100644 > > --- a/ofproto/ofproto-dpif-rid.h > > +++ b/ofproto/ofproto-dpif-rid.h > > @@ -115,6 +115,13 @@ frozen_metadata_from_flow(struct frozen_metadata *md, > > { > > memset(md, 0, sizeof *md); > > md->tunnel = flow->tunnel; > > + /* It is unsafe for frozen_state to reference tun_table because > > + * tun_table is protected by RCU while the lifecycle of frozen_state > > + * can span several RCU quiesce states. > > + * > > + * The latest valid tun_table can be found by ofproto_get_tun_tab() > > + * efficiently. */ > > + md->tunnel.metadata.tab = NULL;
tun_table is RCU-protected, should we use ovsrcu_set? > > md->metadata = flow->metadata; > > memcpy(md->regs, flow->regs, sizeof md->regs); > > md->in_port = flow->in_port.ofp_port; > > diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c > > index 8dfa05b71df4..949cd4dbaf6f 100644 > > --- a/ofproto/ofproto-dpif-upcall.c > > +++ b/ofproto/ofproto-dpif-upcall.c > > @@ -1535,6 +1535,8 @@ process_upcall(struct udpif *udpif, struct upcall > > *upcall, > > } > > > > frozen_metadata_to_flow(&state->metadata, &frozen_flow); > > + frozen_flow.tunnel.metadata.tab = ofproto_get_tun_tab( > > + &upcall->ofproto->up); > > > Thanks for the fix. I wonder if it makes sense to move > ofproto_get_tun_tab() into frozen_metadata_to_flow()? Therefore, we > do not need to call ofproto_get_tun_tab() to reset the tun_table for > other frozen state use case. > > Thanks, > > -Yi-Hung > > > > flow_get_metadata(&frozen_flow, > > &am->pin.up.base.flow_metadata); > > > > ofproto_dpif_send_async_msg(upcall->ofproto, am); > > -- > _______________________________________________ > dev mailing list > d...@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev