On Thu, Jan 25, 2024 at 4:21 AM Dumitru Ceara <dce...@redhat.com> wrote:
>
> On 1/25/24 06:44, Han Zhou wrote:
> > On Wed, Jan 24, 2024 at 8:39 PM Numan Siddique <num...@ovn.org> wrote:
> >>
> >> On Wed, Jan 24, 2024 at 10:53 PM Han Zhou <hz...@ovn.org> wrote:
> >>>
> >>> On Wed, Jan 24, 2024 at 4:23 AM Dumitru Ceara <dce...@redhat.com> wrote:
> >>>>
> >>>> On 1/24/24 06:01, Han Zhou wrote:
> >>>>> On Fri, Jan 19, 2024 at 2:50 AM Dumitru Ceara <dce...@redhat.com> wrote:
> >>>>>>
> >>>>>> On 1/11/24 16:31, num...@ovn.org wrote:
> >>>>>>> +
> >>>>>>> +void
> >>>>>>> +lflow_table_add_lflow(struct lflow_table *lflow_table,
> >>>>>>> +                      const struct ovn_datapath *od,
> >>>>>>> +                      const unsigned long *dp_bitmap, size_t dp_bitmap_len,
> >>>>>>> +                      enum ovn_stage stage, uint16_t priority,
> >>>>>>> +                      const char *match, const char *actions,
> >>>>>>> +                      const char *io_port, const char *ctrl_meter,
> >>>>>>> +                      const struct ovsdb_idl_row *stage_hint,
> >>>>>>> +                      const char *where,
> >>>>>>> +                      struct lflow_ref *lflow_ref)
> >>>>>>> +    OVS_EXCLUDED(fake_hash_mutex)
> >>>>>>> +{
> >>>>>>> +    struct ovs_mutex *hash_lock;
> >>>>>>> +    uint32_t hash;
> >>>>>>> +
> >>>>>>> +    ovs_assert(!od ||
> >>>>>>> +               ovn_stage_to_datapath_type(stage) == ovn_datapath_get_type(od));
> >>>>>>> +
> >>>>>>> +    hash = ovn_logical_flow_hash(ovn_stage_get_table(stage),
> >>>>>>> +                                 ovn_stage_get_pipeline(stage),
> >>>>>>> +                                 priority, match,
> >>>>>>> +                                 actions);
> >>>>>>> +
> >>>>>>> +    hash_lock = lflow_hash_lock(&lflow_table->entries, hash);
> >>>>>>> +    struct ovn_lflow *lflow =
> >>>>>>> +        do_ovn_lflow_add(lflow_table, od, dp_bitmap,
> >>>>>>> +                         dp_bitmap_len, hash, stage,
> >>>>>>> +                         priority, match, actions,
> >>>>>>> +                         io_port, ctrl_meter, stage_hint, where);
> >>>>>>> +
> >>>>>>> +    if (lflow_ref) {
> >>>>>>> +        /* lflow referencing is only supported if 'od' is not NULL. */
> >>>>>>> +        ovs_assert(od);
> >>>>>>> +
> >>>>>>> +        struct lflow_ref_node *lrn =
> >>>>>>> +            lflow_ref_node_find(&lflow_ref->lflow_ref_nodes, lflow, hash);
> >>>>>>> +        if (!lrn) {
> >>>>>>> +            lrn = xzalloc(sizeof *lrn);
> >>>>>>> +            lrn->lflow = lflow;
> >>>>>>> +            lrn->dp_index = od->index;
> >>>>>>> +            ovs_list_insert(&lflow_ref->lflows_ref_list,
> >>>>>>> +                            &lrn->lflow_list_node);
> >>>>>>> +            inc_dp_refcnt(&lflow->dp_refcnts_map, lrn->dp_index);
> >>>>>>> +            ovs_list_insert(&lflow->referenced_by, &lrn->ref_list_node);
> >>>>>>> +
> >>>>>>> +            hmap_insert(&lflow_ref->lflow_ref_nodes, &lrn->ref_node, hash);
> >>>>>>> +        }
> >>>>>>> +
> >>>>>>> +        lrn->linked = true;
> >>>>>>> +    }
> >>>>>>> +
> >>>>>>> +    lflow_hash_unlock(hash_lock);
> >>>>>>> +
> >>>>>>> +}
> >>>>>>> +
> >>>>>>
> >>>>>> This part is not thread safe.
> >>>>>>
> >>>>>> If two threads try to add logical flows that have different hashes and
> >>>>>> lflow_ref is not NULL we're going to have a race condition when
> >>>>>> inserting to the &lflow_ref->lflow_ref_nodes hash map because the two
> >>>>>> threads will take different locks.
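> >>>>>>
> >>>>>> To make the failure mode concrete, here is a standalone toy
> >>>>>> (plain pthreads, nothing OVN-specific, every name made up) that
> >>>>>> goes wrong the same way: the lock taken depends on the flow
> >>>>>> hash, but one structure is shared by all hashes:
> >>>>>>
> >>>>>>     /* build: cc -pthread toy_race.c */
> >>>>>>     #include <pthread.h>
> >>>>>>     #include <stdint.h>
> >>>>>>     #include <stdio.h>
> >>>>>>
> >>>>>>     #define N_LOCKS 4
> >>>>>>     static pthread_mutex_t hash_locks[N_LOCKS] = {
> >>>>>>         PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER,
> >>>>>>         PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER,
> >>>>>>     };
> >>>>>>
> >>>>>>     /* Stand-in for lflow_ref->lflow_ref_nodes: shared by all hashes. */
> >>>>>>     static size_t shared_ref_size;
> >>>>>>
> >>>>>>     static void
> >>>>>>     add_flow(uint32_t hash)
> >>>>>>     {
> >>>>>>         /* The lock we take depends on the flow hash... */
> >>>>>>         pthread_mutex_lock(&hash_locks[hash % N_LOCKS]);
> >>>>>>         /* ...but this counter is shared by *all* hashes, so two
> >>>>>>          * threads holding different locks update it concurrently. */
> >>>>>>         shared_ref_size++;
> >>>>>>         pthread_mutex_unlock(&hash_locks[hash % N_LOCKS]);
> >>>>>>     }
> >>>>>>
> >>>>>>     static void *
> >>>>>>     worker(void *base_)
> >>>>>>     {
> >>>>>>         uint32_t base = *(uint32_t *) base_;
> >>>>>>         for (uint32_t i = 0; i < 1000000; i++) {
> >>>>>>             add_flow(base + i * N_LOCKS);  /* fixed lock per thread */
> >>>>>>         }
> >>>>>>         return NULL;
> >>>>>>     }
> >>>>>>
> >>>>>>     int
> >>>>>>     main(void)
> >>>>>>     {
> >>>>>>         pthread_t t1, t2;
> >>>>>>         uint32_t b1 = 0, b2 = 1;  /* different hash classes */
> >>>>>>         pthread_create(&t1, NULL, worker, &b1);
> >>>>>>         pthread_create(&t2, NULL, worker, &b2);
> >>>>>>         pthread_join(t1, NULL);
> >>>>>>         pthread_join(t2, NULL);
> >>>>>>         printf("%zu\n", shared_ref_size);  /* usually < 2000000 */
> >>>>>>         return 0;
> >>>>>>     }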
> >>>>>>
> >>>>>
> >>>>> I think it is safe because a lflow_ref is always associated with an
> >>>>> object, e.g. port, datapath, lb, etc., and lflow generation for a
> >>>>> single such object is never executed in parallel, which is how the
> >>>>> parallel lflow build is designed.
> >>>>> Does it make sense?
> >>>>
> >>>> It happens that it's safe in this current patch set because indeed we
> >>>> always process individual ports, datapaths, lbs, etc., in the same
> >>>> thread.  However, this code (lflow_table_add_lflow()) is generic and
> >>>> there's nothing (not even a comment) that would warn developers in the
> >>>> future about the potential race if the lflow_ref is shared.
> >>>>
> >>>> I spoke to Numan offline a bit about this and I think the current plan
> >>>> is to leave it as is and add proper locking as a follow up (or in v7).
> >>>> But I think we still need a clear comment here warning users about
> >>>> this.  Maybe we should add a comment where the lflow_ref structure is
> >>>> defined too.
> >>>>
> >>>> What do you think?
> >>>
> >>> I totally agree with you about adding comments to explain the thread
> >>> safety considerations, and to make it clear that the lflow_ref should
> >>> always be associated with the object that is being processed by the
> >>> thread.
> >>> With regard to any follow-up change for proper locking, I am not sure
> >>> what scenario is your concern.  I think if we always make sure the
> >>> lflow_ref passed in is the one owned by the object, then the current
> >>> locking is sufficient.  And this is the natural case.
> >>>
> >>> However, I did think about a situation where there might be a potential
> >>> problem in the future when we need to maintain references for more than
> >>> one input for the same lflow.  The "main" input, which is the object
> >>> the thread is iterating over and generating lflows for, will have its
> >>> lflow_ref passed into this function, but we might also need to maintain
> >>> the reference of the lflow for a secondary input (or even a third).  In
> >>> that case it is not just the locking issue; first we need a proper way
> >>> to pass in the secondary lflow_ref, which is what I mentioned in the
> >>> review comments for v3 (the last paragraph):
> >>> https://mail.openvswitch.org/pipermail/ovs-dev/2023-December/410269.html
> >>> Regardless of that, it is not a problem for now, and I hope there is no
> >>> need to add references for more inputs for the I-P that matters for
> >>> production use cases.
> >>>
> >>
> >> I'll update the next version with the proper documentation.
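> >>
> >> Something along these lines, maybe (just a sketch; exact wording to
> >> be refined in the next version):
> >>
> >>     /* Thread safety: an lflow_ref is owned by exactly one input
> >>      * object (port, datapath, lb, ...).  lflow_table_add_lflow()
> >>      * may be called from multiple threads, but only as long as
> >>      * each thread passes the lflow_ref of the object that this
> >>      * thread alone is processing.  Passing another object's
> >>      * lflow_ref (e.g. the peer port's) is not protected by the
> >>      * lflow hash locks and can corrupt the ref map. */
> >>     struct lflow_ref;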
> >>
> >> @Han Regarding your comment that we may have the requirement in the
> >> future to add a logical flow to 2 or more lflow_refs, I doubt we would
> >> have such a requirement or scenario.
> >>
> >> Because calling lflow_table_add_lflow(..., lflow_ref1, lflow_ref2)
> >> is the same as calling lflow_table_add_lflow twice, i.e.
> >>
> >>     lflow_table_add_lflow(..., lflow_ref1)
> >>     lflow_table_add_lflow(..., lflow_ref2)
> >>
> >> In the second call, since the ovn_lflow already exists in the lflow
> >> table, it will just be referenced in lflow_ref2.
> >> It would be a problem for sure if these lflow_refs (ref1 and ref2)
> >> belong to different objects.
> >>
> >
> > The scenario you mentioned here is more like duplicated lflows generated
> > when processing different objects, which have different lflow_refs; that
> > is not the scenario I was talking about.
> > I am thinking about the case where a lflow is generated from multiple
> > inputs.  For example (extremely simplified just to express the idea),
> > when generating lflows for a LB object, we also consult other objects
> > such as datapaths.  Now that we have implemented I-P for LB, for the LB
> > object, for each lflow generated we will call
> > lflow_table_add_lflow(..., lb->lflow_ref) to maintain the reference
> > from the LB object to the lflow.  However, if in the future we want to
> > do I-P for datapaths, which means that when a datapath is deleted or
> > updated we need to quickly find the lflows that are linked to the
> > datapath, we need to maintain the reference from the datapath to the
> > same lflow, generated by the same call: lflow_table_add_lflow(...,
> > lb->lflow_ref, datapath->lflow_ref).  Or do you suggest that even in
> > this case we just call the function twice, with the second call's only
> > purpose being to add the reference to the secondary input?  Hmm, it
> > would actually work, as if we were trying to add a redundant lflow,
> > although it wastes some cycles on the ovn_lflow_find().  Of course the
> > locking issue would arise in this case.
> > Anyway, I don't have a realistic example to prove this is a requirement
> > of I-P for solving real production scale/performance issues.  In the
> > example I provided, datapath I-P seems to be a non-goal as far as I can
> > tell for now.  So I am not too worried.
> >
> >> However I do see a scenario where lflow_table_add_lflow() could be
> >> called with a different lflow_ref object.
> >>
> >> A few scenarios I could think of:
> >>
> >> 1. While generating logical flows for a logical switch port (of type
> >>    router) 'P1', there is a possibility that a contributor (most
> >>    likely by mistake) may call something like
> >>
> >>        lflow_table_add_lflow(...., P1->peer->lflow_ref)
> >>
> >>    Even in this case, we don't generate logical router port flows in
> >>    parallel with logical switch ports, and hence P1->peer->lflow_ref
> >>    may not be modified by multiple threads.  But it's a possibility,
> >>    and I think Dumitru is perhaps concerned about such scenarios.
> >>
> >
> > In this case it does seem to be a problem, because it is possible that
> > thread1 is processing P1 while thread2 is processing P1->peer, and now
> > both threads can access P1->peer->lflow_ref.  We should make it clear
> > that when processing an object, we only add lflow_refs for that object.
> >
> >> 2. While generating load balancer flows we may also generate logical
> >>    flows for the routers and store them in the od->lflow_ref (if we
> >>    happen to support I-P for router ports), i.e.
> >>
> >>        HMAP_FOR_EACH_PARALLEL (lb, lb_maps) {
> >>            generate_lflows_for_lb(lb, lb->lflow_ref)
> >>            FOR_EACH_ROUTER_OF_LB(lr, lb) {
> >>                generate_lb_lflows_for_router(lr, lr->lflow_ref)
> >>            }
> >>        }
> >>
> >>    In this case, it is possible that a logical router 'lr' may be
> >>    associated with multiple lbs, and this may corrupt lr->lflow_ref.
> >>
> >> We have 2 choices (after this patch series is accepted :)):
> >>
> >> a. Don't do anything.  The documentation added by this patch series
> >>    (which I'll update in the next version) is good enough, and
> >>    whenever such scenarios arise, the contributor will add the
> >>    thread-safe code or make sure not to use the lflow_ref of other
> >>    objects while generating lflows.  In scenario 2 above, the
> >>    contributor should not generate lflows for the routers in the
> >>    lb_maps loop; they should instead be generated in the lr_datapaths
> >>    loop or the lr_stateful loop.
> >>
> >> b. As a follow-up patch, make lflow_table_add_lflow() thread safe
> >>    for lflow_ref so that we are covered for the future scenarios
> >>    (see the sketch below).  The downside is that we have to maintain
> >>    hash locks for each lflow_ref and reconcile the lflow_ref hmap
> >>    after all the threads finish the lflow generation.  It does incur
> >>    some cost (both memory and CPU wise), but only during recomputes,
> >>    which should not be too frequent.
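> >>
> >> For (b), roughly something like this (sketch only; the OVS hmap /
> >> ovs-thread types are real, but LFLOW_REF_LOCK_MASK and the field
> >> names are made up):
> >>
> >>     struct lflow_ref {
> >>         struct hmap lflow_ref_nodes;
> >>         struct ovs_list lflows_ref_list;
> >>         /* New: hash locks protecting 'lflow_ref_nodes', mirroring
> >>          * the LFLOW_HASH_LOCK_MASK scheme of the lflow table. */
> >>         struct ovs_mutex ref_locks[LFLOW_REF_LOCK_MASK + 1];
> >>     };
> >>
> >>     /* Taken in addition to (and always after) the lflow table's
> >>      * hash lock, so the lock order stays fixed. */
> >>     static struct ovs_mutex *
> >>     lflow_ref_hash_lock(struct lflow_ref *ref, uint32_t hash)
> >>     {
> >>         struct ovs_mutex *lock =
> >>             &ref->ref_locks[hash & LFLOW_REF_LOCK_MASK];
> >>         ovs_mutex_lock(lock);
> >>         return lock;
> >>     }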
> >>
> >
> > I'd prefer solution (a) with proper documentation to emphasize the
> > rule/assumption.
> >
>
> Would it make sense to also try to detect whether a lflow_ref is used
> from multiple threads?  E.g., store the id of the last thread that
> accessed it and assert that it's the same as the thread currently
> accessing the lflow_ref?
>
> Just to make sure we crash explicitly in case such bugs are
> inadvertently introduced?
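>
> Something like this, perhaps (debug-only sketch; the struct and field
> names are hypothetical, and the check itself is best-effort rather
> than atomic):
>
>     #include <pthread.h>
>     #include <stdbool.h>
>
>     struct lflow_ref_owner {
>         pthread_t owner;    /* Last thread that touched the ref. */
>         bool owner_set;
>     };
>
>     static inline void
>     lflow_ref_assert_single_thread(struct lflow_ref_owner *o)
>     {
>         if (!o->owner_set) {
>             o->owner = pthread_self();
>             o->owner_set = true;
>         } else {
>             /* ovs_assert() (from OVS's util.h) aborts, so shared
>              * use crashes explicitly. */
>             ovs_assert(pthread_equal(o->owner, pthread_self()));
>         }
>     }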

I think it would be hard to make a decision here because I-P is handled
in a single thread.  How do we distinguish, in lflow_ref.c, between a
full recompute with parallel threads and single-threaded I-P?

Numan

> Regards,
> Dumitru
>
> > Thanks,
> > Han
> >
> >>
> >> Let me know what you think.  I'm fine with both, but personally prefer
> >> (a) as I don't see such scenarios in the near future.
> >>
> >> Thanks
> >> Numan
> >>
> >>> Thanks,
> >>> Han
> >>>
> >>>>
> >>>> Regards,
> >>>> Dumitru
> >>>>
> >>>>>
> >>>>> Thanks,
> >>>>> Han
> >>>>>
> >>>>>> That might corrupt the map.
> >>>>>>
> >>>>>> I guess, if we don't want to cause more performance degradation we
> >>>>>> should maintain as many lflow_ref instances as we do hash_locks,
> >>>>>> i.e., LFLOW_HASH_LOCK_MASK + 1.  Will that even be possible?
> >>>>>>
> >>>>>> Wdyt?
> >>>>>>
> >>>>>> Regards,
> >>>>>> Dumitru

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev