Re: [ovs-dev] [PATCH ovn v5 08/16] northd: Refactor lflow management into a separate module.

2024-01-29 Thread Han Zhou
On Mon, Jan 29, 2024 at 7:11 PM Numan Siddique  wrote:

> On Thu, Jan 25, 2024 at 1:08 AM Han Zhou  wrote:
> >
> > On Thu, Jan 11, 2024 at 7:32 AM  wrote:
> > >
> > > From: Numan Siddique 
> > >
> > > ovn_lflow_add() and other related functions/macros are now moved
> > > into a separate module - lflow-mgr.c.  This module maintains a
> > > table 'struct lflow_table' for the logical flows.  lflow table
> > > maintains a hmap to store the logical flows.
> > >
> > > It also maintains the logical switch and router dp groups.
> > >
> > > Previous commits, which added lflow incremental processing for
> > > the VIF logical ports, stored references to
> > > the logical ports' lflows using 'struct lflow_ref_list'.  This
> > > struct is renamed to 'struct lflow_ref' and is part of lflow-mgr.c.
> > > It is modified a bit to store resource-to-lflow references.
> > >
> > > Example usage of 'struct lflow_ref'.
> > >
> > > 'struct ovn_port' maintains 2 instances of lflow_ref, i.e.:
> > >
> > > struct ovn_port {
> > >...
> > >...
> > >struct lflow_ref *lflow_ref;
> > >struct lflow_ref *stateful_lflow_ref;
> >
> > Hi Numan,
> >
> > In addition to the lock discussion with you and Dumitru, I still want to
> > discuss another aspect of this patch regarding the second lflow_ref:
> > stateful_lflow_ref.
> > I understand that you added this to achieve finer-grained I-P, especially
> > for router ports. I am wondering how much performance gain comes from
> > this. From my understanding it shouldn't matter much, since each
> > ovn_port should be associated with a very limited number of lflows.
> > Could you provide more insight/data on this? I think it would be better
> > to keep things simple (i.e. one object, one lflow_ref list) unless the
> > benefit is obvious.
> >
> > I am also trying to run another performance regression test for
> > recompute, since I am a little concerned about the DP refcnt hmap
> > associated with each lflow. I understand it's necessary to handle the
> > duplicated lflow cases, but it looks heavy and removes the opportunities
> > for more efficient bitmap operations. Let me know if you have already
> > evaluated its performance.
> >
>
> Hi Han,
>
> I did some testing on a large-scale OVN database.
> The NB database has
>  - 1000 logical switches
>  - 500 routers
>  - 35253 load balancers
>  - Most of these load balancers are associated with all the
> logical switches and routers.
>
> For example, when I run this command:
>- ovn-nbctl set load_balancer 0d647ff9-4e49-4570-a05d-db670873b7ef
> options:foo=bar
>
> It results in changes to all the logical routers. And the function
> lflow_handle_lr_stateful_changes() is called.
> If you look at this function, for each changed router, it also loops
> through the router ports and calls
>  - build_lbnat_lflows_iterate_by_lrp()
>  - build_lbnat_lflows_iterate_by_lsp()
>
> With your suggestion we also need to call
> build_lswitch_and_lrouter_iterate_by_lsp() and
> build_lbnat_lflows_iterate_by_lrp().
>
> I measured the number of lflows referenced in op->lflow_ref and
> op->stateful_lflow_ref for each logical switch port and router port pair.
> Total lflows in lflow_ref (of all router ports and their peer switch
> ports) were 23000, and total lflows in stateful_lflow_ref were just 4000.
>
> So with just one lflow_ref (as per the suggestion), a small load balancer
> update like the above would result in generating 27000 logical flows, as
> compared to just 4000.
>
> I think it has a considerable cost in terms of CPU.  And perhaps it
> would matter more when ovn-northd runs on a DPU.  My preference would be
> to have a separate lflow_ref for stateful flows.
>
> Thanks
> Numan


Thanks for the data points! It makes sense to use separate lists.

Regards,
Han

>
>
>
> > Thanks,
> > Han
> >
> > > };
> > >
> > > All the logical flows generated by
> > > build_lswitch_and_lrouter_iterate_by_lsp() use ovn_port->lflow_ref.
> > >
> > > All the logical flows generated by build_lsp_lflows_for_lbnats()
> > > use ovn_port->stateful_lflow_ref.
> > >
> > > When handling the ovn_port changes incrementally, the lflows referenced
> > > in 'struct ovn_port' are cleared and regenerated and synced to the
> > > SB logical flows.
> > >
> > > eg.
> > >
> > > lflow_ref_clear_lflows(op->lflow_ref);
> > > build_lswitch_and_lrouter_iterate_by_lsp(op, ...);
> > > lflow_ref_sync_lflows_to_sb(op->lflow_ref, ...);
> > >
> > > This patch makes a few more changes:
> > >   -  Logical flows are now hashed without the logical
> > >  datapaths.  If a logical flow is referenced by just one
> > >  datapath, we don't rehash it.
> > >
> > >   -  The synthetic 'hash' column of sbrec_logical_flow now
> > >  doesn't use the logical datapath.  This means that
> > >  when ovn-northd is updated/upgraded and has this commit,
> > >  all the logical flows with 'logical_datapath' column
> > >  set will get deleted and re-added causing some disruptions.

Re: [ovs-dev] [PATCH ovn v5 08/16] northd: Refactor lflow management into a separate module.

2024-01-29 Thread Numan Siddique
On Thu, Jan 25, 2024 at 1:08 AM Han Zhou  wrote:
>
> On Thu, Jan 11, 2024 at 7:32 AM  wrote:
> >
> > From: Numan Siddique 
> >
> > ovn_lflow_add() and other related functions/macros are now moved
> > into a separate module - lflow-mgr.c.  This module maintains a
> > table 'struct lflow_table' for the logical flows.  lflow table
> > maintains a hmap to store the logical flows.
> >
> > It also maintains the logical switch and router dp groups.
> >
> > Previous commits, which added lflow incremental processing for
> > the VIF logical ports, stored references to
> > the logical ports' lflows using 'struct lflow_ref_list'.  This
> > struct is renamed to 'struct lflow_ref' and is part of lflow-mgr.c.
> > It is modified a bit to store resource-to-lflow references.
> >
> > Example usage of 'struct lflow_ref'.
> >
> > 'struct ovn_port' maintains 2 instances of lflow_ref, i.e.:
> >
> > struct ovn_port {
> >...
> >...
> >struct lflow_ref *lflow_ref;
> >struct lflow_ref *stateful_lflow_ref;
>
> Hi Numan,
>
> In addition to the lock discussion with you and Dumitru, I still want to
> discuss another aspect of this patch regarding the second lflow_ref:
> stateful_lflow_ref.
> I understand that you added this to achieve finer-grained I-P, especially
> for router ports. I am wondering how much performance gain comes from this.
> From my understanding it shouldn't matter much, since each ovn_port should
> be associated with a very limited number of lflows. Could you provide more
> insight/data on this? I think it would be better to keep things simple
> (i.e. one object, one lflow_ref list) unless the benefit is obvious.
>
> I am also trying to run another performance regression test for recompute,
> since I am a little concerned about the DP refcnt hmap associated with each
> lflow. I understand it's necessary to handle the duplicated lflow cases,
> but it looks heavy and removes the opportunities for more efficient bitmap
> operations. Let me know if you have already evaluated its performance.
>

Hi Han,

I did some testing on a large-scale OVN database.
The NB database has
 - 1000 logical switches
 - 500 routers
 - 35253 load balancers
 - Most of these load balancers are associated with all the
logical switches and routers.

For example, when I run this command:
   - ovn-nbctl set load_balancer 0d647ff9-4e49-4570-a05d-db670873b7ef
options:foo=bar

It results in changes to all the logical routers. And the function
lflow_handle_lr_stateful_changes() is called.
If you look at this function, for each changed router, it also loops
through the router ports and calls
 - build_lbnat_lflows_iterate_by_lrp()
 - build_lbnat_lflows_iterate_by_lsp()

With your suggestion we also need to call
build_lswitch_and_lrouter_iterate_by_lsp() and
build_lbnat_lflows_iterate_by_lrp().

I measured the number of lflows referenced in op->lflow_ref and
op->stateful_lflow_ref for each logical switch port and router port pair.
Total lflows in lflow_ref (of all router ports and their peer switch
ports) were 23000, and total lflows in stateful_lflow_ref were just 4000.

So with just one lflow_ref (as per the suggestion), a small load balancer
update like the above would result in generating 27000 logical flows, as
compared to just 4000.

I think it has a considerable cost in terms of CPU.  And perhaps it
would matter more when ovn-northd runs on a DPU.  My preference would be
to have a separate lflow_ref for stateful flows.
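The arithmetic above can be captured in a toy C model. The flow counts come from Numan's measurements; the function names and counters are hypothetical and purely illustrative, not part of ovn-northd:

```c
#include <assert.h>
#include <stddef.h>

/* Per the measurements: ~23000 lflows tracked by op->lflow_ref and
 * ~4000 by op->stateful_lflow_ref across the affected ports. */
enum { N_GENERIC_LFLOWS = 23000, N_STATEFUL_LFLOWS = 4000 };

/* With a single combined lflow_ref, handling an LB option change must
 * clear and regenerate every lflow the ports own. */
static size_t lb_change_cost_combined(void)
{
    return N_GENERIC_LFLOWS + N_STATEFUL_LFLOWS;
}

/* With a separate stateful_lflow_ref, only the stateful lflows are
 * cleared and regenerated on an LB change. */
static size_t lb_change_cost_split(void)
{
    return N_STATEFUL_LFLOWS;
}
```

On these numbers the split list does roughly one seventh of the regeneration work for the same NB update, which is the gain being argued for.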

Thanks
Numan


> Thanks,
> Han
>
> > };
> >
> > All the logical flows generated by
> > build_lswitch_and_lrouter_iterate_by_lsp() use ovn_port->lflow_ref.
> >
> > All the logical flows generated by build_lsp_lflows_for_lbnats()
> > use ovn_port->stateful_lflow_ref.
> >
> > When handling the ovn_port changes incrementally, the lflows referenced
> > in 'struct ovn_port' are cleared and regenerated and synced to the
> > SB logical flows.
> >
> > eg.
> >
> > lflow_ref_clear_lflows(op->lflow_ref);
> > build_lswitch_and_lrouter_iterate_by_lsp(op, ...);
> > lflow_ref_sync_lflows_to_sb(op->lflow_ref, ...);
> >
> > This patch makes a few more changes:
> >   -  Logical flows are now hashed without the logical
> >  datapaths.  If a logical flow is referenced by just one
> >  datapath, we don't rehash it.
> >
> >   -  The synthetic 'hash' column of sbrec_logical_flow now
> >  doesn't use the logical datapath.  This means that
> >  when ovn-northd is updated/upgraded and has this commit,
> >  all the logical flows with 'logical_datapath' column
> >  set will get deleted and re-added causing some disruptions.
> >
> >   -  With the commit [1] which added I-P support for logical
> >  port changes, multiple logical flows with same match 'M'
> >  and actions 'A' are generated and stored without the
> >  dp groups, which was not the case prior to
> >  that patch.
> >  One example to generate these lflows is:
> >  ovn-nbctl lsp-set-addresses sw0p1 "MAC1 IP1"

Re: [ovs-dev] [PATCH ovn v5 08/16] northd: Refactor lflow management into a separate module.

2024-01-26 Thread Dumitru Ceara
On 1/25/24 16:40, Numan Siddique wrote:
> On Thu, Jan 25, 2024 at 4:21 AM Dumitru Ceara  wrote:
>>
>> On 1/25/24 06:44, Han Zhou wrote:
>>> On Wed, Jan 24, 2024 at 8:39 PM Numan Siddique  wrote:

 On Wed, Jan 24, 2024 at 10:53 PM Han Zhou  wrote:
>
> On Wed, Jan 24, 2024 at 4:23 AM Dumitru Ceara  wrote:
>>
>> On 1/24/24 06:01, Han Zhou wrote:
>>> On Fri, Jan 19, 2024 at 2:50 AM Dumitru Ceara 
>>> wrote:

 On 1/11/24 16:31, num...@ovn.org wrote:
> +
> +void
> +lflow_table_add_lflow(struct lflow_table *lflow_table,
> +  const struct ovn_datapath *od,
> +  const unsigned long *dp_bitmap, size_t
>>> dp_bitmap_len,
> +  enum ovn_stage stage, uint16_t priority,
> +  const char *match, const char *actions,
> +  const char *io_port, const char
>>> *ctrl_meter,
> +  const struct ovsdb_idl_row *stage_hint,
> +  const char *where,
> +  struct lflow_ref *lflow_ref)
> +OVS_EXCLUDED(fake_hash_mutex)
> +{
> +struct ovs_mutex *hash_lock;
> +uint32_t hash;
> +
> +ovs_assert(!od ||
> +   ovn_stage_to_datapath_type(stage) ==
>>> ovn_datapath_get_type(od));
> +
> +hash = ovn_logical_flow_hash(ovn_stage_get_table(stage),
> + ovn_stage_get_pipeline(stage),
> + priority, match,
> + actions);
> +
> +hash_lock = lflow_hash_lock(&lflow_table->entries, hash);
> +struct ovn_lflow *lflow =
> +do_ovn_lflow_add(lflow_table, od, dp_bitmap,
> + dp_bitmap_len, hash, stage,
> + priority, match, actions,
> + io_port, ctrl_meter, stage_hint, where);
> +
> +if (lflow_ref) {
> +/* lflow referencing is only supported if 'od' is not
>>> NULL.
> */
> +ovs_assert(od);
> +
> +struct lflow_ref_node *lrn =
> +lflow_ref_node_find(&lflow_ref->lflow_ref_nodes,
>>> lflow,
>>> hash);
> +if (!lrn) {
> +lrn = xzalloc(sizeof *lrn);
> +lrn->lflow = lflow;
> +lrn->dp_index = od->index;
> +ovs_list_insert(&lflow_ref->lflows_ref_list,
> +&lrn->lflow_list_node);
> +inc_dp_refcnt(&lflow->dp_refcnts_map, lrn->dp_index);
> +ovs_list_insert(&lflow->referenced_by,
>>> &lrn->ref_list_node);
> +
> +hmap_insert(&lflow_ref->lflow_ref_nodes,
>>> &lrn->ref_node,
>>> hash);
> +}
> +
> +lrn->linked = true;
> +}
> +
> +lflow_hash_unlock(hash_lock);
> +
> +}
> +

 This part is not thread safe.

 If two threads try to add logical flows that have different hashes
>>> and
 lflow_ref is not NULL we're going to have a race condition when
 inserting to the &lflow_ref->lflow_ref_nodes hash map because the
>>> two
 threads will take different locks.

>>>
>>> I think it is safe because a lflow_ref is always associated with an
> object,
>>> e.g. port, datapath, lb, etc., and lflow generation for a single
>>> such
>>> object is never executed in parallel, which is how the parallel
>>> lflow
> build
>>> is designed.
>>> Does it make sense?
>>
>> It happens that it's safe in this current patch set because indeed we
>> always process individual ports, datapaths, lbs, etc, in the same
>> thread.  However, this code (lflow_table_add_lflow()) is generic and
>> there's nothing (not even a comment) that would warn developers in the
>> future about the potential race if the lflow_ref is shared.
>>
>> I spoke to Numan offline a bit about this and I think the current plan
>> is to leave it as is and add proper locking as a follow up (or in v7).
>> But I think we still need a clear comment here warning users about
>>> this.
>>  Maybe we should add a comment where the lflow_ref structure is
>>> defined
> too.
>>
>> What do you think?
>
> I totally agree with you about adding comments to explain the thread
>>> safety
> considerations, and make it clear that the lflow_ref should always be
> associated with the object that is being processed by the thread.
> With regard to any follow up change for proper locking, I am not sure
>>> what
> scenario is your concern. I think if we always make sure the lflow_ref
> passed in is the one owned by the object then the current locking is
> sufficient.

Re: [ovs-dev] [PATCH ovn v5 08/16] northd: Refactor lflow management into a separate module.

2024-01-25 Thread Numan Siddique
On Thu, Jan 25, 2024 at 4:21 AM Dumitru Ceara  wrote:
>
> On 1/25/24 06:44, Han Zhou wrote:
> > On Wed, Jan 24, 2024 at 8:39 PM Numan Siddique  wrote:
> >>
> >> On Wed, Jan 24, 2024 at 10:53 PM Han Zhou  wrote:
> >>>
> >>> On Wed, Jan 24, 2024 at 4:23 AM Dumitru Ceara  wrote:
> 
>  On 1/24/24 06:01, Han Zhou wrote:
> > On Fri, Jan 19, 2024 at 2:50 AM Dumitru Ceara 
> > wrote:
> >>
> >> On 1/11/24 16:31, num...@ovn.org wrote:
> >>> +
> >>> +void
> >>> +lflow_table_add_lflow(struct lflow_table *lflow_table,
> >>> +  const struct ovn_datapath *od,
> >>> +  const unsigned long *dp_bitmap, size_t
> > dp_bitmap_len,
> >>> +  enum ovn_stage stage, uint16_t priority,
> >>> +  const char *match, const char *actions,
> >>> +  const char *io_port, const char
> > *ctrl_meter,
> >>> +  const struct ovsdb_idl_row *stage_hint,
> >>> +  const char *where,
> >>> +  struct lflow_ref *lflow_ref)
> >>> +OVS_EXCLUDED(fake_hash_mutex)
> >>> +{
> >>> +struct ovs_mutex *hash_lock;
> >>> +uint32_t hash;
> >>> +
> >>> +ovs_assert(!od ||
> >>> +   ovn_stage_to_datapath_type(stage) ==
> > ovn_datapath_get_type(od));
> >>> +
> >>> +hash = ovn_logical_flow_hash(ovn_stage_get_table(stage),
> >>> + ovn_stage_get_pipeline(stage),
> >>> + priority, match,
> >>> + actions);
> >>> +
> >>> +hash_lock = lflow_hash_lock(&lflow_table->entries, hash);
> >>> +struct ovn_lflow *lflow =
> >>> +do_ovn_lflow_add(lflow_table, od, dp_bitmap,
> >>> + dp_bitmap_len, hash, stage,
> >>> + priority, match, actions,
> >>> + io_port, ctrl_meter, stage_hint, where);
> >>> +
> >>> +if (lflow_ref) {
> >>> +/* lflow referencing is only supported if 'od' is not
> > NULL.
> >>> */
> >>> +ovs_assert(od);
> >>> +
> >>> +struct lflow_ref_node *lrn =
> >>> +lflow_ref_node_find(&lflow_ref->lflow_ref_nodes,
> > lflow,
> > hash);
> >>> +if (!lrn) {
> >>> +lrn = xzalloc(sizeof *lrn);
> >>> +lrn->lflow = lflow;
> >>> +lrn->dp_index = od->index;
> >>> +ovs_list_insert(&lflow_ref->lflows_ref_list,
> >>> +&lrn->lflow_list_node);
> >>> +inc_dp_refcnt(&lflow->dp_refcnts_map, lrn->dp_index);
> >>> +ovs_list_insert(&lflow->referenced_by,
> > &lrn->ref_list_node);
> >>> +
> >>> +hmap_insert(&lflow_ref->lflow_ref_nodes,
> > &lrn->ref_node,
> > hash);
> >>> +}
> >>> +
> >>> +lrn->linked = true;
> >>> +}
> >>> +
> >>> +lflow_hash_unlock(hash_lock);
> >>> +
> >>> +}
> >>> +
> >>
> >> This part is not thread safe.
> >>
> >> If two threads try to add logical flows that have different hashes
> > and
> >> lflow_ref is not NULL we're going to have a race condition when
> >> inserting to the &lflow_ref->lflow_ref_nodes hash map because the
> > two
> >> threads will take different locks.
> >>
> >
> > I think it is safe because a lflow_ref is always associated with an
> >>> object,
> > e.g. port, datapath, lb, etc., and lflow generation for a single
> > such
> > object is never executed in parallel, which is how the parallel
> > lflow
> >>> build
> > is designed.
> > Does it make sense?
> 
>  It happens that it's safe in this current patch set because indeed we
>  always process individual ports, datapaths, lbs, etc, in the same
>  thread.  However, this code (lflow_table_add_lflow()) is generic and
>  there's nothing (not even a comment) that would warn developers in the
>  future about the potential race if the lflow_ref is shared.
> 
>  I spoke to Numan offline a bit about this and I think the current plan
>  is to leave it as is and add proper locking as a follow up (or in v7).
>  But I think we still need a clear comment here warning users about
> > this.
>   Maybe we should add a comment where the lflow_ref structure is
> > defined
> >>> too.
> 
>  What do you think?
> >>>
> >>> I totally agree with you about adding comments to explain the thread
> > safety
> >>> considerations, and make it clear that the lflow_ref should always be
> >>> associated with the object that is being processed by the thread.
> >>> With regard to any follow up change for proper locking, I am not sure
> > what
> >>> scenario is your concern. I think if we always make sure the lflow_ref
> >>> passed in is the one owned by the object then the current locking is
> >>> sufficient. And this is the natural case.

Re: [ovs-dev] [PATCH ovn v5 08/16] northd: Refactor lflow management into a separate module.

2024-01-25 Thread Dumitru Ceara
On 1/25/24 06:44, Han Zhou wrote:
> On Wed, Jan 24, 2024 at 8:39 PM Numan Siddique  wrote:
>>
>> On Wed, Jan 24, 2024 at 10:53 PM Han Zhou  wrote:
>>>
>>> On Wed, Jan 24, 2024 at 4:23 AM Dumitru Ceara  wrote:

 On 1/24/24 06:01, Han Zhou wrote:
> On Fri, Jan 19, 2024 at 2:50 AM Dumitru Ceara 
> wrote:
>>
>> On 1/11/24 16:31, num...@ovn.org wrote:
>>> +
>>> +void
>>> +lflow_table_add_lflow(struct lflow_table *lflow_table,
>>> +  const struct ovn_datapath *od,
>>> +  const unsigned long *dp_bitmap, size_t
> dp_bitmap_len,
>>> +  enum ovn_stage stage, uint16_t priority,
>>> +  const char *match, const char *actions,
>>> +  const char *io_port, const char
> *ctrl_meter,
>>> +  const struct ovsdb_idl_row *stage_hint,
>>> +  const char *where,
>>> +  struct lflow_ref *lflow_ref)
>>> +OVS_EXCLUDED(fake_hash_mutex)
>>> +{
>>> +struct ovs_mutex *hash_lock;
>>> +uint32_t hash;
>>> +
>>> +ovs_assert(!od ||
>>> +   ovn_stage_to_datapath_type(stage) ==
> ovn_datapath_get_type(od));
>>> +
>>> +hash = ovn_logical_flow_hash(ovn_stage_get_table(stage),
>>> + ovn_stage_get_pipeline(stage),
>>> + priority, match,
>>> + actions);
>>> +
>>> +hash_lock = lflow_hash_lock(&lflow_table->entries, hash);
>>> +struct ovn_lflow *lflow =
>>> +do_ovn_lflow_add(lflow_table, od, dp_bitmap,
>>> + dp_bitmap_len, hash, stage,
>>> + priority, match, actions,
>>> + io_port, ctrl_meter, stage_hint, where);
>>> +
>>> +if (lflow_ref) {
>>> +/* lflow referencing is only supported if 'od' is not
> NULL.
>>> */
>>> +ovs_assert(od);
>>> +
>>> +struct lflow_ref_node *lrn =
>>> +lflow_ref_node_find(&lflow_ref->lflow_ref_nodes,
> lflow,
> hash);
>>> +if (!lrn) {
>>> +lrn = xzalloc(sizeof *lrn);
>>> +lrn->lflow = lflow;
>>> +lrn->dp_index = od->index;
>>> +ovs_list_insert(&lflow_ref->lflows_ref_list,
>>> +&lrn->lflow_list_node);
>>> +inc_dp_refcnt(&lflow->dp_refcnts_map, lrn->dp_index);
>>> +ovs_list_insert(&lflow->referenced_by,
> &lrn->ref_list_node);
>>> +
>>> +hmap_insert(&lflow_ref->lflow_ref_nodes,
> &lrn->ref_node,
> hash);
>>> +}
>>> +
>>> +lrn->linked = true;
>>> +}
>>> +
>>> +lflow_hash_unlock(hash_lock);
>>> +
>>> +}
>>> +
>>
>> This part is not thread safe.
>>
>> If two threads try to add logical flows that have different hashes
> and
>> lflow_ref is not NULL we're going to have a race condition when
>> inserting to the &lflow_ref->lflow_ref_nodes hash map because the
> two
>> threads will take different locks.
>>
>
> I think it is safe because a lflow_ref is always associated with an
>>> object,
> e.g. port, datapath, lb, etc., and lflow generation for a single
> such
> object is never executed in parallel, which is how the parallel
> lflow
>>> build
> is designed.
> Does it make sense?

 It happens that it's safe in this current patch set because indeed we
 always process individual ports, datapaths, lbs, etc, in the same
 thread.  However, this code (lflow_table_add_lflow()) is generic and
 there's nothing (not even a comment) that would warn developers in the
 future about the potential race if the lflow_ref is shared.

 I spoke to Numan offline a bit about this and I think the current plan
 is to leave it as is and add proper locking as a follow up (or in v7).
 But I think we still need a clear comment here warning users about
> this.
  Maybe we should add a comment where the lflow_ref structure is
> defined
>>> too.

 What do you think?
>>>
>>> I totally agree with you about adding comments to explain the thread
> safety
>>> considerations, and make it clear that the lflow_ref should always be
>>> associated with the object that is being processed by the thread.
>>> With regard to any follow up change for proper locking, I am not sure
> what
>>> scenario is your concern. I think if we always make sure the lflow_ref
>>> passed in is the one owned by the object then the current locking is
>>> sufficient. And this is the natural case.
>>>
>>> However, I did think about a situation where there might be a potential
>>> problem in the future when we need to maintain references for more than
> one
>>> input for the same lflow. Fo

Re: [ovs-dev] [PATCH ovn v5 08/16] northd: Refactor lflow management into a separate module.

2024-01-24 Thread Han Zhou
On Wed, Jan 24, 2024 at 10:07 PM Han Zhou  wrote:
>
>
>
> On Thu, Jan 11, 2024 at 7:32 AM  wrote:
> >
> > From: Numan Siddique 
> >
> > ovn_lflow_add() and other related functions/macros are now moved
> > into a separate module - lflow-mgr.c.  This module maintains a
> > table 'struct lflow_table' for the logical flows.  lflow table
> > maintains a hmap to store the logical flows.
> >
> > It also maintains the logical switch and router dp groups.
> >
> > Previous commits, which added lflow incremental processing for
> > the VIF logical ports, stored references to
> > the logical ports' lflows using 'struct lflow_ref_list'.  This
> > struct is renamed to 'struct lflow_ref' and is part of lflow-mgr.c.
> > It is modified a bit to store resource-to-lflow references.
> >
> > Example usage of 'struct lflow_ref'.
> >
> > 'struct ovn_port' maintains 2 instances of lflow_ref, i.e.:
> >
> > struct ovn_port {
> >...
> >...
> >struct lflow_ref *lflow_ref;
> >struct lflow_ref *stateful_lflow_ref;
>
> Hi Numan,
>
> In addition to the lock discussion with you and Dumitru, I still want to
discuss another aspect of this patch regarding the second lflow_ref:
stateful_lflow_ref.
> I understand that you added this to achieve finer-grained I-P, especially
for router ports. I am wondering how much performance gain comes from this.
From my understanding it shouldn't matter much, since each ovn_port should
be associated with a very limited number of lflows. Could you provide more
insight/data on this? I think it would be better to keep things simple
(i.e. one object, one lflow_ref list) unless the benefit is obvious.
>
> I am also trying to run another performance regression test for
recompute, since I am a little concerned about the DP refcnt hmap
associated with each lflow. I understand it's necessary to handle the
duplicated lflow cases, but it looks heavy and removes the opportunities
for more efficient bitmap operations. Let me know if you have already
evaluated its performance.
>
> Thanks,
> Han
>

Sorry that I forgot to mention another minor comment for the struct
lflow_ref_node. It would be helpful to add more comments about the typical
life cycle of the lflow_ref_node, so that it is easier to understand why
an hmap is used in lflow_ref, and why 'linked' is needed in lflow_ref_node.
From my understanding the life cycle is something like:
1. created and linked at lflow_table_add_lflow()
2. unlinked when handling a change of the object that references it.
3. it may be re-linked when handling the same object change.
4. it is used to sync the lflow change (e.g. adding/removing DPs) to SB.
5. it is destroyed after syncing an unlinked lflow_ref to SB.

I think it would help people understand the code more easily. What do you
think?
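A minimal C sketch of that life cycle may make the 'linked' flag concrete. The struct and helper names below loosely follow the patch but are illustrative only, not the actual lflow-mgr code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Illustrative lflow_ref_node: 'linked' distinguishes nodes the owning
 * object still references from nodes awaiting sync-and-destroy. */
struct lflow_ref_node {
    bool linked;
};

/* 1. Created and linked at lflow_table_add_lflow(). */
static struct lflow_ref_node *
lrn_create(void)
{
    struct lflow_ref_node *lrn = calloc(1, sizeof *lrn);
    lrn->linked = true;
    return lrn;
}

/* 2. Unlinked when handling a change of the referencing object. */
static void
lrn_unlink(struct lflow_ref_node *lrn)
{
    lrn->linked = false;
}

/* 3. Possibly re-linked while handling the same object change (this is
 * why the node is kept in an hmap: lflow_table_add_lflow() can find it
 * again instead of allocating a duplicate). */
static void
lrn_relink(struct lflow_ref_node *lrn)
{
    lrn->linked = true;
}

/* 4+5. At SB sync time, a still-unlinked node is destroyed; a re-linked
 * node survives.  Returns true if the node was freed. */
static bool
lrn_sync_and_maybe_destroy(struct lflow_ref_node *lrn)
{
    if (!lrn->linked) {
        free(lrn);
        return true;
    }
    return false;
}
```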

Thanks,
Han

> > };
> >
> > All the logical flows generated by
> > build_lswitch_and_lrouter_iterate_by_lsp() use ovn_port->lflow_ref.
> >
> > All the logical flows generated by build_lsp_lflows_for_lbnats()
> > use ovn_port->stateful_lflow_ref.
> >
> > When handling the ovn_port changes incrementally, the lflows referenced
> > in 'struct ovn_port' are cleared and regenerated and synced to the
> > SB logical flows.
> >
> > eg.
> >
> > lflow_ref_clear_lflows(op->lflow_ref);
> > build_lswitch_and_lrouter_iterate_by_lsp(op, ...);
> > lflow_ref_sync_lflows_to_sb(op->lflow_ref, ...);
> >
> > This patch makes a few more changes:
> >   -  Logical flows are now hashed without the logical
> >  datapaths.  If a logical flow is referenced by just one
> >  datapath, we don't rehash it.
> >
> >   -  The synthetic 'hash' column of sbrec_logical_flow now
> >  doesn't use the logical datapath.  This means that
> >  when ovn-northd is updated/upgraded and has this commit,
> >  all the logical flows with 'logical_datapath' column
> >  set will get deleted and re-added causing some disruptions.
> >
> >   -  With the commit [1] which added I-P support for logical
> >  port changes, multiple logical flows with same match 'M'
> >  and actions 'A' are generated and stored without the
> >  dp groups, which was not the case prior to
> >  that patch.
> >  One example to generate these lflows is:
> >  ovn-nbctl lsp-set-addresses sw0p1 "MAC1 IP1"
> >  ovn-nbctl lsp-set-addresses sw1p1 "MAC1 IP1"
> >  ovn-nbctl lsp-set-addresses sw2p1 "MAC1 IP1"
> >
> >  Now with this patch we go back to the earlier way.  i.e
> >  one logical flow with logical_dp_groups set.
> >
> >   -  With this patch any updates to a logical port which
> >  doesn't result in new logical flows will not result in
> >  deletion and addition of same logical flows.
> >  Eg.
> >  ovn-nbctl set logical_switch_port sw0p1 external_ids:foo=bar
> >  will be a no-op to the SB logical flow table.
> >
> > [1] - 8bbd678("northd: Incremental processing of VIF additions in
'lflow' node.")
> >
> > Signed-off-by: Numan Siddique 
>

Re: [ovs-dev] [PATCH ovn v5 08/16] northd: Refactor lflow management into a separate module.

2024-01-24 Thread Han Zhou
On Thu, Jan 11, 2024 at 7:32 AM  wrote:
>
> From: Numan Siddique 
>
> ovn_lflow_add() and other related functions/macros are now moved
> into a separate module - lflow-mgr.c.  This module maintains a
> table 'struct lflow_table' for the logical flows.  lflow table
> maintains a hmap to store the logical flows.
>
> It also maintains the logical switch and router dp groups.
>
> Previous commits, which added lflow incremental processing for
> the VIF logical ports, stored references to
> the logical ports' lflows using 'struct lflow_ref_list'.  This
> struct is renamed to 'struct lflow_ref' and is part of lflow-mgr.c.
> It is modified a bit to store resource-to-lflow references.
>
> Example usage of 'struct lflow_ref'.
>
> 'struct ovn_port' maintains 2 instances of lflow_ref, i.e.:
>
> struct ovn_port {
>...
>...
>struct lflow_ref *lflow_ref;
>struct lflow_ref *stateful_lflow_ref;

Hi Numan,

In addition to the lock discussion with you and Dumitru, I still want to
discuss another aspect of this patch regarding the second lflow_ref:
stateful_lflow_ref.
I understand that you added this to achieve finer-grained I-P, especially
for router ports. I am wondering how much performance gain comes from this.
From my understanding it shouldn't matter much, since each ovn_port should
be associated with a very limited number of lflows. Could you provide more
insight/data on this? I think it would be better to keep things simple
(i.e. one object, one lflow_ref list) unless the benefit is obvious.

I am also trying to run another performance regression test for recompute,
since I am a little concerned about the DP refcnt hmap associated with each
lflow. I understand it's necessary to handle the duplicated lflow cases,
but it looks heavy and removes the opportunities for more efficient bitmap
operations. Let me know if you have already evaluated its performance.
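The duplicated-lflow case is why a plain per-datapath bit is not enough: two different objects can add the same lflow for the same datapath, and clearing the bit when the first reference goes away would break the second. A self-contained sketch with hypothetical names (not OVN code) of refcount-gated bit clearing:

```c
#include <assert.h>

#define MAX_DP 64

/* Per-lflow, per-datapath reference counts (a fixed array here; the
 * patch uses an hmap keyed by DP index). */
struct dp_refcnts {
    unsigned counts[MAX_DP];
};

/* An object adds the lflow for datapath 'dp'. */
static void
dp_ref(struct dp_refcnts *r, int dp)
{
    r->counts[dp]++;
}

/* An object drops its reference; returns 1 only when the datapath bit
 * may now safely be cleared from the lflow's DP group. */
static int
dp_unref(struct dp_refcnts *r, int dp)
{
    return --r->counts[dp] == 0;
}
```

A pure bitmap supports cheaper bulk operations (Han's concern), but by itself it cannot distinguish "one referent left" from "none left" for a shared datapath.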

Thanks,
Han

> };
>
> All the logical flows generated by
> build_lswitch_and_lrouter_iterate_by_lsp() use ovn_port->lflow_ref.
>
> All the logical flows generated by build_lsp_lflows_for_lbnats()
> use ovn_port->stateful_lflow_ref.
>
> When handling the ovn_port changes incrementally, the lflows referenced
> in 'struct ovn_port' are cleared and regenerated and synced to the
> SB logical flows.
>
> eg.
>
> lflow_ref_clear_lflows(op->lflow_ref);
> build_lswitch_and_lrouter_iterate_by_lsp(op, ...);
> lflow_ref_sync_lflows_to_sb(op->lflow_ref, ...);
>
> This patch makes a few more changes:
>   -  Logical flows are now hashed without the logical
>  datapaths.  If a logical flow is referenced by just one
>  datapath, we don't rehash it.
>
>   -  The synthetic 'hash' column of sbrec_logical_flow now
>  doesn't use the logical datapath.  This means that
>  when ovn-northd is updated/upgraded and has this commit,
>  all the logical flows with 'logical_datapath' column
>  set will get deleted and re-added causing some disruptions.
>
>   -  With the commit [1] which added I-P support for logical
>  port changes, multiple logical flows with same match 'M'
>  and actions 'A' are generated and stored without the
>  dp groups, which was not the case prior to
>  that patch.
>  One example to generate these lflows is:
>  ovn-nbctl lsp-set-addresses sw0p1 "MAC1 IP1"
>  ovn-nbctl lsp-set-addresses sw1p1 "MAC1 IP1"
>  ovn-nbctl lsp-set-addresses sw2p1 "MAC1 IP1"
>
>  Now with this patch we go back to the earlier way.  i.e
>  one logical flow with logical_dp_groups set.
>
>   -  With this patch any updates to a logical port which
>  doesn't result in new logical flows will not result in
>  deletion and addition of same logical flows.
>  Eg.
>  ovn-nbctl set logical_switch_port sw0p1 external_ids:foo=bar
>  will be a no-op to the SB logical flow table.
>
> [1] - 8bbd678("northd: Incremental processing of VIF additions in 'lflow'
node.")
>
> Signed-off-by: Numan Siddique 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v5 08/16] northd: Refactor lflow management into a separate module.

2024-01-24 Thread Han Zhou
On Wed, Jan 24, 2024 at 8:39 PM Numan Siddique  wrote:
>
> On Wed, Jan 24, 2024 at 10:53 PM Han Zhou  wrote:
> >
> > On Wed, Jan 24, 2024 at 4:23 AM Dumitru Ceara  wrote:
> > >
> > > On 1/24/24 06:01, Han Zhou wrote:
> > > > On Fri, Jan 19, 2024 at 2:50 AM Dumitru Ceara 
wrote:
> > > >>
> > > >> On 1/11/24 16:31, num...@ovn.org wrote:
> > > >>> +
> > > >>> +void
> > > >>> +lflow_table_add_lflow(struct lflow_table *lflow_table,
> > > >>> +  const struct ovn_datapath *od,
> > > >>> +  const unsigned long *dp_bitmap, size_t
> > > > dp_bitmap_len,
> > > >>> +  enum ovn_stage stage, uint16_t priority,
> > > >>> +  const char *match, const char *actions,
> > > >>> +  const char *io_port, const char
*ctrl_meter,
> > > >>> +  const struct ovsdb_idl_row *stage_hint,
> > > >>> +  const char *where,
> > > >>> +  struct lflow_ref *lflow_ref)
> > > >>> +OVS_EXCLUDED(fake_hash_mutex)
> > > >>> +{
> > > >>> +struct ovs_mutex *hash_lock;
> > > >>> +uint32_t hash;
> > > >>> +
> > > >>> +ovs_assert(!od ||
> > > >>> +   ovn_stage_to_datapath_type(stage) ==
> > > > ovn_datapath_get_type(od));
> > > >>> +
> > > >>> +hash = ovn_logical_flow_hash(ovn_stage_get_table(stage),
> > > >>> + ovn_stage_get_pipeline(stage),
> > > >>> + priority, match,
> > > >>> + actions);
> > > >>> +
> > > >>> +hash_lock = lflow_hash_lock(&lflow_table->entries, hash);
> > > >>> +struct ovn_lflow *lflow =
> > > >>> +do_ovn_lflow_add(lflow_table, od, dp_bitmap,
> > > >>> + dp_bitmap_len, hash, stage,
> > > >>> + priority, match, actions,
> > > >>> + io_port, ctrl_meter, stage_hint, where);
> > > >>> +
> > > >>> +if (lflow_ref) {
> > > >>> +/* lflow referencing is only supported if 'od' is not
NULL.
> > */
> > > >>> +ovs_assert(od);
> > > >>> +
> > > >>> +struct lflow_ref_node *lrn =
> > > >>> +lflow_ref_node_find(&lflow_ref->lflow_ref_nodes,
lflow,
> > > > hash);
> > > >>> +if (!lrn) {
> > > >>> +lrn = xzalloc(sizeof *lrn);
> > > >>> +lrn->lflow = lflow;
> > > >>> +lrn->dp_index = od->index;
> > > >>> +ovs_list_insert(&lflow_ref->lflows_ref_list,
> > > >>> +&lrn->lflow_list_node);
> > > >>> +inc_dp_refcnt(&lflow->dp_refcnts_map, lrn->dp_index);
> > > >>> +ovs_list_insert(&lflow->referenced_by,
> > > > &lrn->ref_list_node);
> > > >>> +
> > > >>> +hmap_insert(&lflow_ref->lflow_ref_nodes,
&lrn->ref_node,
> > > > hash);
> > > >>> +}
> > > >>> +
> > > >>> +lrn->linked = true;
> > > >>> +}
> > > >>> +
> > > >>> +lflow_hash_unlock(hash_lock);
> > > >>> +
> > > >>> +}
> > > >>> +
> > > >>
> > > >> This part is not thread safe.
> > > >>
> > > >> If two threads try to add logical flows that have different hashes
and
> > > >> lflow_ref is not NULL we're going to have a race condition when
> > > >> inserting to the &lflow_ref->lflow_ref_nodes hash map because the
two
> > > >> threads will take different locks.
> > > >>
> > > >
> > > > I think it is safe because a lflow_ref is always associated with an
> > object,
> > > > e.g. port, datapath, lb, etc., and lflow generation for a single
such
> > > > object is never executed in parallel, which is how the parallel
lflow
> > build
> > > > is designed.
> > > > Does it make sense?
> > >
> > > It happens that it's safe in this current patch set because indeed we
> > > always process individual ports, datapaths, lbs, etc, in the same
> > > thread.  However, this code (lflow_table_add_lflow()) is generic and
> > > there's nothing (not even a comment) that would warn developers in the
> > > future about the potential race if the lflow_ref is shared.
> > >
> > > I spoke to Numan offline a bit about this and I think the current plan
> > > is to leave it as is and add proper locking as a follow up (or in v7).
> > > But I think we still need a clear comment here warning users about
this.
> > >  Maybe we should add a comment where the lflow_ref structure is
defined
> > too.
> > >
> > > What do you think?
> >
> > I totally agree with you about adding comments to explain the thread
safety
> > considerations, and make it clear that the lflow_ref should always be
> > associated with the object that is being processed by the thread.
> > With regard to any follow up change for proper locking, I am not sure
what
> > scenario is your concern. I think if we always make sure the lflow_ref
> > passed in is the one owned by the object then the current locking is
> > sufficient. And this is the natural case.
> >
> > However, I did think about a situation where there might be a potential
> > pro

Re: [ovs-dev] [PATCH ovn v5 08/16] northd: Refactor lflow management into a separate module.

2024-01-24 Thread Numan Siddique
On Wed, Jan 24, 2024 at 10:53 PM Han Zhou  wrote:
>
> On Wed, Jan 24, 2024 at 4:23 AM Dumitru Ceara  wrote:
> >
> > On 1/24/24 06:01, Han Zhou wrote:
> > > On Fri, Jan 19, 2024 at 2:50 AM Dumitru Ceara  wrote:
> > >>
> > >> On 1/11/24 16:31, num...@ovn.org wrote:
> > >>> +
> > >>> +void
> > >>> +lflow_table_add_lflow(struct lflow_table *lflow_table,
> > >>> +  const struct ovn_datapath *od,
> > >>> +  const unsigned long *dp_bitmap, size_t
> > > dp_bitmap_len,
> > >>> +  enum ovn_stage stage, uint16_t priority,
> > >>> +  const char *match, const char *actions,
> > >>> +  const char *io_port, const char *ctrl_meter,
> > >>> +  const struct ovsdb_idl_row *stage_hint,
> > >>> +  const char *where,
> > >>> +  struct lflow_ref *lflow_ref)
> > >>> +OVS_EXCLUDED(fake_hash_mutex)
> > >>> +{
> > >>> +struct ovs_mutex *hash_lock;
> > >>> +uint32_t hash;
> > >>> +
> > >>> +ovs_assert(!od ||
> > >>> +   ovn_stage_to_datapath_type(stage) ==
> > > ovn_datapath_get_type(od));
> > >>> +
> > >>> +hash = ovn_logical_flow_hash(ovn_stage_get_table(stage),
> > >>> + ovn_stage_get_pipeline(stage),
> > >>> + priority, match,
> > >>> + actions);
> > >>> +
> > >>> +hash_lock = lflow_hash_lock(&lflow_table->entries, hash);
> > >>> +struct ovn_lflow *lflow =
> > >>> +do_ovn_lflow_add(lflow_table, od, dp_bitmap,
> > >>> + dp_bitmap_len, hash, stage,
> > >>> + priority, match, actions,
> > >>> + io_port, ctrl_meter, stage_hint, where);
> > >>> +
> > >>> +if (lflow_ref) {
> > >>> +/* lflow referencing is only supported if 'od' is not NULL.
> */
> > >>> +ovs_assert(od);
> > >>> +
> > >>> +struct lflow_ref_node *lrn =
> > >>> +lflow_ref_node_find(&lflow_ref->lflow_ref_nodes, lflow,
> > > hash);
> > >>> +if (!lrn) {
> > >>> +lrn = xzalloc(sizeof *lrn);
> > >>> +lrn->lflow = lflow;
> > >>> +lrn->dp_index = od->index;
> > >>> +ovs_list_insert(&lflow_ref->lflows_ref_list,
> > >>> +&lrn->lflow_list_node);
> > >>> +inc_dp_refcnt(&lflow->dp_refcnts_map, lrn->dp_index);
> > >>> +ovs_list_insert(&lflow->referenced_by,
> > > &lrn->ref_list_node);
> > >>> +
> > >>> +hmap_insert(&lflow_ref->lflow_ref_nodes, &lrn->ref_node,
> > > hash);
> > >>> +}
> > >>> +
> > >>> +lrn->linked = true;
> > >>> +}
> > >>> +
> > >>> +lflow_hash_unlock(hash_lock);
> > >>> +
> > >>> +}
> > >>> +
> > >>
> > >> This part is not thread safe.
> > >>
> > >> If two threads try to add logical flows that have different hashes and
> > >> lflow_ref is not NULL we're going to have a race condition when
> > >> inserting to the &lflow_ref->lflow_ref_nodes hash map because the two
> > >> threads will take different locks.
> > >>
> > >
> > > I think it is safe because a lflow_ref is always associated with an
> object,
> > > e.g. port, datapath, lb, etc., and lflow generation for a single such
> > > object is never executed in parallel, which is how the parallel lflow
> build
> > > is designed.
> > > Does it make sense?
> >
> > It happens that it's safe in this current patch set because indeed we
> > always process individual ports, datapaths, lbs, etc, in the same
> > thread.  However, this code (lflow_table_add_lflow()) is generic and
> > there's nothing (not even a comment) that would warn developers in the
> > future about the potential race if the lflow_ref is shared.
> >
> > I spoke to Numan offline a bit about this and I think the current plan
> > is to leave it as is and add proper locking as a follow up (or in v7).
> > But I think we still need a clear comment here warning users about this.
> >  Maybe we should add a comment where the lflow_ref structure is defined
> too.
> >
> > What do you think?
>
> I totally agree with you about adding comments to explain the thread safety
> considerations, and make it clear that the lflow_ref should always be
> associated with the object that is being processed by the thread.
> With regard to any follow up change for proper locking, I am not sure what
> scenario is your concern. I think if we always make sure the lflow_ref
> passed in is the one owned by the object then the current locking is
> sufficient. And this is the natural case.
>
> However, I did think about a situation where there might be a potential
> problem in the future when we need to maintain references for more than one
> input for the same lflow. For the "main" input, which is the object that
> the thread is iterating and generating lflow for, will have its lflow_ref
> passed in this function, but we might also need to ma

Re: [ovs-dev] [PATCH ovn v5 08/16] northd: Refactor lflow management into a separate module.

2024-01-24 Thread Han Zhou
On Wed, Jan 24, 2024 at 4:23 AM Dumitru Ceara  wrote:
>
> On 1/24/24 06:01, Han Zhou wrote:
> > On Fri, Jan 19, 2024 at 2:50 AM Dumitru Ceara  wrote:
> >>
> >> On 1/11/24 16:31, num...@ovn.org wrote:
> >>> +
> >>> +void
> >>> +lflow_table_add_lflow(struct lflow_table *lflow_table,
> >>> +  const struct ovn_datapath *od,
> >>> +  const unsigned long *dp_bitmap, size_t
> > dp_bitmap_len,
> >>> +  enum ovn_stage stage, uint16_t priority,
> >>> +  const char *match, const char *actions,
> >>> +  const char *io_port, const char *ctrl_meter,
> >>> +  const struct ovsdb_idl_row *stage_hint,
> >>> +  const char *where,
> >>> +  struct lflow_ref *lflow_ref)
> >>> +OVS_EXCLUDED(fake_hash_mutex)
> >>> +{
> >>> +struct ovs_mutex *hash_lock;
> >>> +uint32_t hash;
> >>> +
> >>> +ovs_assert(!od ||
> >>> +   ovn_stage_to_datapath_type(stage) ==
> > ovn_datapath_get_type(od));
> >>> +
> >>> +hash = ovn_logical_flow_hash(ovn_stage_get_table(stage),
> >>> + ovn_stage_get_pipeline(stage),
> >>> + priority, match,
> >>> + actions);
> >>> +
> >>> +hash_lock = lflow_hash_lock(&lflow_table->entries, hash);
> >>> +struct ovn_lflow *lflow =
> >>> +do_ovn_lflow_add(lflow_table, od, dp_bitmap,
> >>> + dp_bitmap_len, hash, stage,
> >>> + priority, match, actions,
> >>> + io_port, ctrl_meter, stage_hint, where);
> >>> +
> >>> +if (lflow_ref) {
> >>> +/* lflow referencing is only supported if 'od' is not NULL.
*/
> >>> +ovs_assert(od);
> >>> +
> >>> +struct lflow_ref_node *lrn =
> >>> +lflow_ref_node_find(&lflow_ref->lflow_ref_nodes, lflow,
> > hash);
> >>> +if (!lrn) {
> >>> +lrn = xzalloc(sizeof *lrn);
> >>> +lrn->lflow = lflow;
> >>> +lrn->dp_index = od->index;
> >>> +ovs_list_insert(&lflow_ref->lflows_ref_list,
> >>> +&lrn->lflow_list_node);
> >>> +inc_dp_refcnt(&lflow->dp_refcnts_map, lrn->dp_index);
> >>> +ovs_list_insert(&lflow->referenced_by,
> > &lrn->ref_list_node);
> >>> +
> >>> +hmap_insert(&lflow_ref->lflow_ref_nodes, &lrn->ref_node,
> > hash);
> >>> +}
> >>> +
> >>> +lrn->linked = true;
> >>> +}
> >>> +
> >>> +lflow_hash_unlock(hash_lock);
> >>> +
> >>> +}
> >>> +
> >>
> >> This part is not thread safe.
> >>
> >> If two threads try to add logical flows that have different hashes and
> >> lflow_ref is not NULL we're going to have a race condition when
> >> inserting to the &lflow_ref->lflow_ref_nodes hash map because the two
> >> threads will take different locks.
> >>
> >
> > I think it is safe because a lflow_ref is always associated with an
object,
> > e.g. port, datapath, lb, etc., and lflow generation for a single such
> > object is never executed in parallel, which is how the parallel lflow
build
> > is designed.
> > Does it make sense?
>
> It happens that it's safe in this current patch set because indeed we
> always process individual ports, datapaths, lbs, etc, in the same
> thread.  However, this code (lflow_table_add_lflow()) is generic and
> there's nothing (not even a comment) that would warn developers in the
> future about the potential race if the lflow_ref is shared.
>
> I spoke to Numan offline a bit about this and I think the current plan
> is to leave it as is and add proper locking as a follow up (or in v7).
> But I think we still need a clear comment here warning users about this.
>  Maybe we should add a comment where the lflow_ref structure is defined
too.
>
> What do you think?

I totally agree with you about adding comments to explain the thread safety
considerations, and make it clear that the lflow_ref should always be
associated with the object that is being processed by the thread.
With regard to any follow up change for proper locking, I am not sure what
scenario is your concern. I think if we always make sure the lflow_ref
passed in is the one owned by the object then the current locking is
sufficient. And this is the natural case.

However, I did think about a situation where there might be a potential
problem in the future when we need to maintain references for more than one
input for the same lflow. For the "main" input, which is the object that
the thread is iterating and generating lflow for, will have its lflow_ref
passed in this function, but we might also need to maintain the reference
of the lflow for a secondary input (or even third). In that case it is not
just the locking issue, but firstly we need to have a proper way to pass in
the secondary lflow_ref, which is what I had mentioned in the review
comments for v3:
https://mail.ope

Re: [ovs-dev] [PATCH ovn v5 08/16] northd: Refactor lflow management into a separate module.

2024-01-24 Thread Dumitru Ceara
On 1/24/24 06:01, Han Zhou wrote:
> On Fri, Jan 19, 2024 at 2:50 AM Dumitru Ceara  wrote:
>>
>> On 1/11/24 16:31, num...@ovn.org wrote:
>>> +
>>> +void
>>> +lflow_table_add_lflow(struct lflow_table *lflow_table,
>>> +  const struct ovn_datapath *od,
>>> +  const unsigned long *dp_bitmap, size_t
> dp_bitmap_len,
>>> +  enum ovn_stage stage, uint16_t priority,
>>> +  const char *match, const char *actions,
>>> +  const char *io_port, const char *ctrl_meter,
>>> +  const struct ovsdb_idl_row *stage_hint,
>>> +  const char *where,
>>> +  struct lflow_ref *lflow_ref)
>>> +OVS_EXCLUDED(fake_hash_mutex)
>>> +{
>>> +struct ovs_mutex *hash_lock;
>>> +uint32_t hash;
>>> +
>>> +ovs_assert(!od ||
>>> +   ovn_stage_to_datapath_type(stage) ==
> ovn_datapath_get_type(od));
>>> +
>>> +hash = ovn_logical_flow_hash(ovn_stage_get_table(stage),
>>> + ovn_stage_get_pipeline(stage),
>>> + priority, match,
>>> + actions);
>>> +
>>> +hash_lock = lflow_hash_lock(&lflow_table->entries, hash);
>>> +struct ovn_lflow *lflow =
>>> +do_ovn_lflow_add(lflow_table, od, dp_bitmap,
>>> + dp_bitmap_len, hash, stage,
>>> + priority, match, actions,
>>> + io_port, ctrl_meter, stage_hint, where);
>>> +
>>> +if (lflow_ref) {
>>> +/* lflow referencing is only supported if 'od' is not NULL. */
>>> +ovs_assert(od);
>>> +
>>> +struct lflow_ref_node *lrn =
>>> +lflow_ref_node_find(&lflow_ref->lflow_ref_nodes, lflow,
> hash);
>>> +if (!lrn) {
>>> +lrn = xzalloc(sizeof *lrn);
>>> +lrn->lflow = lflow;
>>> +lrn->dp_index = od->index;
>>> +ovs_list_insert(&lflow_ref->lflows_ref_list,
>>> +&lrn->lflow_list_node);
>>> +inc_dp_refcnt(&lflow->dp_refcnts_map, lrn->dp_index);
>>> +ovs_list_insert(&lflow->referenced_by,
> &lrn->ref_list_node);
>>> +
>>> +hmap_insert(&lflow_ref->lflow_ref_nodes, &lrn->ref_node,
> hash);
>>> +}
>>> +
>>> +lrn->linked = true;
>>> +}
>>> +
>>> +lflow_hash_unlock(hash_lock);
>>> +
>>> +}
>>> +
>>
>> This part is not thread safe.
>>
>> If two threads try to add logical flows that have different hashes and
>> lflow_ref is not NULL we're going to have a race condition when
>> inserting to the &lflow_ref->lflow_ref_nodes hash map because the two
>> threads will take different locks.
>>
> 
> I think it is safe because a lflow_ref is always associated with an object,
> e.g. port, datapath, lb, etc., and lflow generation for a single such
> object is never executed in parallel, which is how the parallel lflow build
> is designed.
> Does it make sense?

It happens that it's safe in this current patch set because indeed we
always process individual ports, datapaths, lbs, etc, in the same
thread.  However, this code (lflow_table_add_lflow()) is generic and
there's nothing (not even a comment) that would warn developers in the
future about the potential race if the lflow_ref is shared.

I spoke to Numan offline a bit about this and I think the current plan
is to leave it as is and add proper locking as a follow up (or in v7).
But I think we still need a clear comment here warning users about this.
 Maybe we should add a comment where the lflow_ref structure is defined too.

What do you think?

Regards,
Dumitru

> 
> Thanks,
> Han
> 
>> That might corrupt the map.
>>
>> I guess, if we don't want to cause more performance degradation we
>> should maintain as many lflow_ref instances as we do hash_locks, i.e.,
>> LFLOW_HASH_LOCK_MASK + 1.  Will that even be possible?
>>
>> Wdyt?
>>
>> Regards,
>> Dumitru
>>
> 

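The race Dumitru describes can be made concrete with a sketch of the hash-lock striping.  The names and the mask value below are illustrative assumptions (the real constant lives in the northd source): the point is only that the stripe is selected from the lflow hash, so two lflows with different hashes are protected by different mutexes, and a structure shared between them — such as a common lflow_ref hmap — would not be serialized by those locks.

```c
#include <stdint.h>

/* Illustrative stripe count; not the value from the OVN source. */
#define LFLOW_HASH_LOCK_MASK 511

/* Sketch of stripe selection: each lflow hash maps to one of
 * LFLOW_HASH_LOCK_MASK + 1 mutexes.  Holding the stripe lock serializes
 * access to the lflow table bucket for that hash only.  If two threads
 * add lflows with different hashes, they take different stripes, so any
 * structure shared between them (e.g. one lflow_ref hmap) is left
 * unprotected. */
static uint32_t
lflow_hash_lock_index(uint32_t hash)
{
    return hash & LFLOW_HASH_LOCK_MASK;
}
```

This is why the safety of lflow_table_add_lflow() rests entirely on the caller passing an lflow_ref that no other thread can be using concurrently.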


Re: [ovs-dev] [PATCH ovn v5 08/16] northd: Refactor lflow management into a separate module.

2024-01-23 Thread Han Zhou
On Fri, Jan 19, 2024 at 2:50 AM Dumitru Ceara  wrote:
>
> On 1/11/24 16:31, num...@ovn.org wrote:
> > +
> > +void
> > +lflow_table_add_lflow(struct lflow_table *lflow_table,
> > +  const struct ovn_datapath *od,
> > +  const unsigned long *dp_bitmap, size_t
dp_bitmap_len,
> > +  enum ovn_stage stage, uint16_t priority,
> > +  const char *match, const char *actions,
> > +  const char *io_port, const char *ctrl_meter,
> > +  const struct ovsdb_idl_row *stage_hint,
> > +  const char *where,
> > +  struct lflow_ref *lflow_ref)
> > +OVS_EXCLUDED(fake_hash_mutex)
> > +{
> > +struct ovs_mutex *hash_lock;
> > +uint32_t hash;
> > +
> > +ovs_assert(!od ||
> > +   ovn_stage_to_datapath_type(stage) ==
ovn_datapath_get_type(od));
> > +
> > +hash = ovn_logical_flow_hash(ovn_stage_get_table(stage),
> > + ovn_stage_get_pipeline(stage),
> > + priority, match,
> > + actions);
> > +
> > +hash_lock = lflow_hash_lock(&lflow_table->entries, hash);
> > +struct ovn_lflow *lflow =
> > +do_ovn_lflow_add(lflow_table, od, dp_bitmap,
> > + dp_bitmap_len, hash, stage,
> > + priority, match, actions,
> > + io_port, ctrl_meter, stage_hint, where);
> > +
> > +if (lflow_ref) {
> > +/* lflow referencing is only supported if 'od' is not NULL. */
> > +ovs_assert(od);
> > +
> > +struct lflow_ref_node *lrn =
> > +lflow_ref_node_find(&lflow_ref->lflow_ref_nodes, lflow,
hash);
> > +if (!lrn) {
> > +lrn = xzalloc(sizeof *lrn);
> > +lrn->lflow = lflow;
> > +lrn->dp_index = od->index;
> > +ovs_list_insert(&lflow_ref->lflows_ref_list,
> > +&lrn->lflow_list_node);
> > +inc_dp_refcnt(&lflow->dp_refcnts_map, lrn->dp_index);
> > +ovs_list_insert(&lflow->referenced_by,
&lrn->ref_list_node);
> > +
> > +hmap_insert(&lflow_ref->lflow_ref_nodes, &lrn->ref_node,
hash);
> > +}
> > +
> > +lrn->linked = true;
> > +}
> > +
> > +lflow_hash_unlock(hash_lock);
> > +
> > +}
> > +
>
> This part is not thread safe.
>
> If two threads try to add logical flows that have different hashes and
> lflow_ref is not NULL we're going to have a race condition when
> inserting to the &lflow_ref->lflow_ref_nodes hash map because the two
> threads will take different locks.
>

I think it is safe because a lflow_ref is always associated with an object,
e.g. port, datapath, lb, etc., and lflow generation for a single such
object is never executed in parallel, which is how the parallel lflow build
is designed.
Does it make sense?

Thanks,
Han

> That might corrupt the map.
>
> I guess, if we don't want to cause more performance degradation we
> should maintain as many lflow_ref instances as we do hash_locks, i.e.,
> LFLOW_HASH_LOCK_MASK + 1.  Will that even be possible?
>
> Wdyt?
>
> Regards,
> Dumitru
>
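The ownership rule Han relies on can be sketched as a partitioning predicate.  The function name and the modulo scheme here are illustrative assumptions, not the actual northd API; the design guarantee is simply that each object (port, datapath, LB) is processed by exactly one worker thread, so that object's lflow_ref is never accessed concurrently.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative sketch (not the real northd code): parallel lflow build
 * partitions objects across workers, e.g. by index modulo thread count.
 * Because exactly one thread satisfies this predicate for any given
 * object, that object's lflow_ref needs no lock of its own. */
static bool
thread_owns_object(size_t object_index, size_t thread_id, size_t n_threads)
{
    return object_index % n_threads == thread_id;
}
```

Under this scheme the per-hash stripe locks only need to protect the shared lflow table itself, not the per-object reference lists.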


Re: [ovs-dev] [PATCH ovn v5 08/16] northd: Refactor lflow management into a separate module.

2024-01-19 Thread Dumitru Ceara
On 1/18/24 22:39, Numan Siddique wrote:
>>> +void
>>> +ovn_dp_groups_clear(struct hmap *dp_groups)
>>> +{
>>> +struct ovn_dp_group *dpg;
>>> +HMAP_FOR_EACH_POP (dpg, node, dp_groups) {
>>> +bitmap_free(dpg->bitmap);
>>> +free(dpg);
>> This is duplicated in dec_ovn_dp_group_ref().  Also, should we assert
>> that all refcounts are 0 here?
> I don't think we can, since ovn_dp_groups_clear() will be called
> whenever a full recompute happens.  In that case we are destroying
> and rebuilding the internal data, and the ref counts may not be 0.

Hmm, OK, I think that might be fine.

Thanks for checking.

Regards,
Dumitru



Re: [ovs-dev] [PATCH ovn v5 08/16] northd: Refactor lflow management into a separate module.

2024-01-19 Thread Dumitru Ceara
On 1/11/24 16:31, num...@ovn.org wrote:
> +
> +void
> +lflow_table_add_lflow(struct lflow_table *lflow_table,
> +  const struct ovn_datapath *od,
> +  const unsigned long *dp_bitmap, size_t dp_bitmap_len,
> +  enum ovn_stage stage, uint16_t priority,
> +  const char *match, const char *actions,
> +  const char *io_port, const char *ctrl_meter,
> +  const struct ovsdb_idl_row *stage_hint,
> +  const char *where,
> +  struct lflow_ref *lflow_ref)
> +OVS_EXCLUDED(fake_hash_mutex)
> +{
> +struct ovs_mutex *hash_lock;
> +uint32_t hash;
> +
> +ovs_assert(!od ||
> +   ovn_stage_to_datapath_type(stage) == 
> ovn_datapath_get_type(od));
> +
> +hash = ovn_logical_flow_hash(ovn_stage_get_table(stage),
> + ovn_stage_get_pipeline(stage),
> + priority, match,
> + actions);
> +
> +hash_lock = lflow_hash_lock(&lflow_table->entries, hash);
> +struct ovn_lflow *lflow =
> +do_ovn_lflow_add(lflow_table, od, dp_bitmap,
> + dp_bitmap_len, hash, stage,
> + priority, match, actions,
> + io_port, ctrl_meter, stage_hint, where);
> +
> +if (lflow_ref) {
> +/* lflow referencing is only supported if 'od' is not NULL. */
> +ovs_assert(od);
> +
> +struct lflow_ref_node *lrn =
> +lflow_ref_node_find(&lflow_ref->lflow_ref_nodes, lflow, hash);
> +if (!lrn) {
> +lrn = xzalloc(sizeof *lrn);
> +lrn->lflow = lflow;
> +lrn->dp_index = od->index;
> +ovs_list_insert(&lflow_ref->lflows_ref_list,
> +&lrn->lflow_list_node);
> +inc_dp_refcnt(&lflow->dp_refcnts_map, lrn->dp_index);
> +ovs_list_insert(&lflow->referenced_by, &lrn->ref_list_node);
> +
> +hmap_insert(&lflow_ref->lflow_ref_nodes, &lrn->ref_node, hash);
> +}
> +
> +lrn->linked = true;
> +}
> +
> +lflow_hash_unlock(hash_lock);
> +
> +}
> +

This part is not thread safe.

If two threads try to add logical flows that have different hashes and
lflow_ref is not NULL we're going to have a race condition when
inserting to the &lflow_ref->lflow_ref_nodes hash map because the two
threads will take different locks.

That might corrupt the map.

I guess, if we don't want to cause more performance degradation we
should maintain as many lflow_ref instances as we do hash_locks, i.e.,
LFLOW_HASH_LOCK_MASK + 1.  Will that even be possible?

Wdyt?

Regards,
Dumitru



Re: [ovs-dev] [PATCH ovn v5 08/16] northd: Refactor lflow management into a separate module.

2024-01-11 Thread 0-day Robot
Bleep bloop.  Greetings Numan Siddique, I am a robot and I have tried out your patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Line has non-spaces leading whitespace
WARNING: Line has trailing whitespace
#4753 FILE: northd/northd.c:16002:


Lines checked: 5783, Warnings: 2, Errors: 0


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot