Re: [Bridge] [PATCH net-next v3 05/15] net: bridge: mcast: factor out port group del

2020-09-06 Thread Jakub Kicinski
On Mon, 7 Sep 2020 00:29:10 +0300 Nikolay Aleksandrov wrote:
> >> @@ -1641,16 +1647,7 @@ br_multicast_leave_group(struct net_bridge *br,
> >>if (p->flags & MDB_PG_FLAGS_PERMANENT)
> >>break;
> >>   
> >> -  rcu_assign_pointer(*pp, p->next);
> >> -  hlist_del_init(>mglist);
> >> -  del_timer(>timer);
> >> -  kfree_rcu(p, rcu);
> >> -  br_mdb_notify(br->dev, port, group, RTM_DELMDB,
> >> -p->flags | MDB_PG_FLAGS_FAST_LEAVE);  
> > 
> > And here we'll loose MDB_PG_FLAGS_FAST_LEAVE potentially?
> 
> Good catch, we will lose the fast leave indeed.
> This is a 1 line change to set the flag before notifying. Would you prefer
> me to send a v4 or a follow up for it?

I think it's cleaner if you send a v4. But not rush, I was planning to
apply this tomorrow morning PST, anyway.


Re: [Bridge] [PATCH net-next v3 06/15] net: bridge: mcast: add support for group query retransmit

2020-09-06 Thread Jakub Kicinski
On Mon, 7 Sep 2020 00:14:51 +0300 Nikolay Aleksandrov wrote:
> On 9/7/20 12:01 AM, Jakub Kicinski wrote:
> > On Sat,  5 Sep 2020 11:24:01 +0300 Nikolay Aleksandrov wrote:  
> >> We need to be able to retransmit group-specific and group-and-source
> >> specific queries. The new timer takes care of those.  
> > 
> > What guarantees that timer will not use pg after free? Do timer
> > callbacks hold the RCU read lock?
> 
> See the last patch, it guarantees no entry timer will be used when it's freed.

Ack.


Re: [Bridge] [PATCH net-next v3 05/15] net: bridge: mcast: factor out port group del

2020-09-06 Thread Nikolay Aleksandrov

On 9/7/20 12:29 AM, Nikolay Aleksandrov wrote:

On 9/6/20 11:56 PM, Jakub Kicinski wrote:

On Sat,  5 Sep 2020 11:24:00 +0300 Nikolay Aleksandrov wrote:

@@ -843,24 +843,11 @@ static int __br_mdb_del(struct net_bridge *br, struct 
br_mdb_entry *entry)
  if (!p->port || p->port->dev->ifindex != entry->ifindex)
  continue;
-    if (!hlist_empty(>src_list)) {
-    err = -EINVAL;
-    goto unlock;
-    }
-
  if (p->port->state == BR_STATE_DISABLED)
  goto unlock;
-    __mdb_entry_fill_flags(entry, p->flags);


Just from staring at the code it's unclear why the list_empty() check
and this __mdb_entry_fill_flags() are removed as well.



The hlist_empty check is added by patch 01 temporarily for correctness.
Otherwise I'd have to edit all open-coded pg del places and add src delete
code and then remove it here.



Obviously if I do a v4, I'll just move this patch before adding the hlist_empty
in the first place. :-)


__mdb_entry_fill_flags() are called by __mdb_fill_info() which is the only
function used to fill an mdb both for dumping and notifications after patch 08.


-    rcu_assign_pointer(*pp, p->next);
-    hlist_del_init(>mglist);
-    del_timer(>timer);
-    kfree_rcu(p, rcu);
+    br_multicast_del_pg(mp, p, pp);
  err = 0;
-
-    if (!mp->ports && !mp->host_joined &&
-    netif_running(br->dev))
-    mod_timer(>timer, jiffies);
  break;




+void br_multicast_del_pg(struct net_bridge_mdb_entry *mp,
+ struct net_bridge_port_group *pg,
+ struct net_bridge_port_group __rcu **pp)
+{
+    struct net_bridge *br = pg->port->br;
+    struct net_bridge_group_src *ent;
+    struct hlist_node *tmp;
+
+    rcu_assign_pointer(*pp, pg->next);
+    hlist_del_init(>mglist);
+    del_timer(>timer);
+    hlist_for_each_entry_safe(ent, tmp, >src_list, node)
+    br_multicast_del_group_src(ent);
+    br_mdb_notify(br->dev, pg->port, >addr, RTM_DELMDB, pg->flags);
+    kfree_rcu(pg, rcu);
+
+    if (!mp->ports && !mp->host_joined && netif_running(br->dev))
+    mod_timer(>timer, jiffies);
+}



@@ -1641,16 +1647,7 @@ br_multicast_leave_group(struct net_bridge *br,
  if (p->flags & MDB_PG_FLAGS_PERMANENT)
  break;
-    rcu_assign_pointer(*pp, p->next);
-    hlist_del_init(>mglist);
-    del_timer(>timer);
-    kfree_rcu(p, rcu);
-    br_mdb_notify(br->dev, port, group, RTM_DELMDB,
-  p->flags | MDB_PG_FLAGS_FAST_LEAVE);


And here we'll loose MDB_PG_FLAGS_FAST_LEAVE potentially?



Good catch, we will lose the fast leave indeed.
This is a 1 line change to set the flag before notifying. Would you prefer
me to send a v4 or a follow up for it?

Thanks,
  Nik


-    if (!mp->ports && !mp->host_joined &&
-    netif_running(br->dev))
-    mod_timer(>timer, jiffies);
+    br_multicast_del_pg(mp, p, pp);






Re: [Bridge] [PATCH net-next v3 05/15] net: bridge: mcast: factor out port group del

2020-09-06 Thread Nikolay Aleksandrov

On 9/6/20 11:56 PM, Jakub Kicinski wrote:

On Sat,  5 Sep 2020 11:24:00 +0300 Nikolay Aleksandrov wrote:

@@ -843,24 +843,11 @@ static int __br_mdb_del(struct net_bridge *br, struct 
br_mdb_entry *entry)
if (!p->port || p->port->dev->ifindex != entry->ifindex)
continue;
  
-		if (!hlist_empty(>src_list)) {

-   err = -EINVAL;
-   goto unlock;
-   }
-
if (p->port->state == BR_STATE_DISABLED)
goto unlock;
  
-		__mdb_entry_fill_flags(entry, p->flags);


Just from staring at the code it's unclear why the list_empty() check
and this __mdb_entry_fill_flags() are removed as well.



The hlist_empty check is added by patch 01 temporarily for correctness.
Otherwise I'd have to edit all open-coded pg del places and add src delete
code and then remove it here.

__mdb_entry_fill_flags() are called by __mdb_fill_info() which is the only
function used to fill an mdb both for dumping and notifications after patch 08.


-   rcu_assign_pointer(*pp, p->next);
-   hlist_del_init(>mglist);
-   del_timer(>timer);
-   kfree_rcu(p, rcu);
+   br_multicast_del_pg(mp, p, pp);
err = 0;
-
-   if (!mp->ports && !mp->host_joined &&
-   netif_running(br->dev))
-   mod_timer(>timer, jiffies);
break;




+void br_multicast_del_pg(struct net_bridge_mdb_entry *mp,
+struct net_bridge_port_group *pg,
+struct net_bridge_port_group __rcu **pp)
+{
+   struct net_bridge *br = pg->port->br;
+   struct net_bridge_group_src *ent;
+   struct hlist_node *tmp;
+
+   rcu_assign_pointer(*pp, pg->next);
+   hlist_del_init(>mglist);
+   del_timer(>timer);
+   hlist_for_each_entry_safe(ent, tmp, >src_list, node)
+   br_multicast_del_group_src(ent);
+   br_mdb_notify(br->dev, pg->port, >addr, RTM_DELMDB, pg->flags);
+   kfree_rcu(pg, rcu);
+
+   if (!mp->ports && !mp->host_joined && netif_running(br->dev))
+   mod_timer(>timer, jiffies);
+}



@@ -1641,16 +1647,7 @@ br_multicast_leave_group(struct net_bridge *br,
if (p->flags & MDB_PG_FLAGS_PERMANENT)
break;
  
-			rcu_assign_pointer(*pp, p->next);

-   hlist_del_init(>mglist);
-   del_timer(>timer);
-   kfree_rcu(p, rcu);
-   br_mdb_notify(br->dev, port, group, RTM_DELMDB,
- p->flags | MDB_PG_FLAGS_FAST_LEAVE);


And here we'll loose MDB_PG_FLAGS_FAST_LEAVE potentially?



Good catch, we will lose the fast leave indeed.
This is a 1 line change to set the flag before notifying. Would you prefer
me to send a v4 or a follow up for it?

Thanks,
 Nik


-   if (!mp->ports && !mp->host_joined &&
-   netif_running(br->dev))
-   mod_timer(>timer, jiffies);
+   br_multicast_del_pg(mp, p, pp);




Re: [Bridge] [PATCH net-next v3 06/15] net: bridge: mcast: add support for group query retransmit

2020-09-06 Thread Nikolay Aleksandrov

On 9/7/20 12:01 AM, Jakub Kicinski wrote:

On Sat,  5 Sep 2020 11:24:01 +0300 Nikolay Aleksandrov wrote:

We need to be able to retransmit group-specific and group-and-source
specific queries. The new timer takes care of those.


What guarantees that timer will not use pg after free? Do timer
callbacks hold the RCU read lock?



See the last patch, it guarantees no entry timer will be used when it's freed.


Re: [Bridge] [PATCH net-next v3 06/15] net: bridge: mcast: add support for group query retransmit

2020-09-06 Thread Jakub Kicinski
On Sat,  5 Sep 2020 11:24:01 +0300 Nikolay Aleksandrov wrote:
> We need to be able to retransmit group-specific and group-and-source
> specific queries. The new timer takes care of those.

What guarantees that timer will not use pg after free? Do timer
callbacks hold the RCU read lock?


Re: [Bridge] [PATCH net-next v3 05/15] net: bridge: mcast: factor out port group del

2020-09-06 Thread Jakub Kicinski
On Sat,  5 Sep 2020 11:24:00 +0300 Nikolay Aleksandrov wrote:
> @@ -843,24 +843,11 @@ static int __br_mdb_del(struct net_bridge *br, struct 
> br_mdb_entry *entry)
>   if (!p->port || p->port->dev->ifindex != entry->ifindex)
>   continue;
>  
> - if (!hlist_empty(>src_list)) {
> - err = -EINVAL;
> - goto unlock;
> - }
> -
>   if (p->port->state == BR_STATE_DISABLED)
>   goto unlock;
>  
> - __mdb_entry_fill_flags(entry, p->flags);

Just from staring at the code it's unclear why the list_empty() check
and this __mdb_entry_fill_flags() are removed as well.

> - rcu_assign_pointer(*pp, p->next);
> - hlist_del_init(>mglist);
> - del_timer(>timer);
> - kfree_rcu(p, rcu);
> + br_multicast_del_pg(mp, p, pp);
>   err = 0;
> -
> - if (!mp->ports && !mp->host_joined &&
> - netif_running(br->dev))
> - mod_timer(>timer, jiffies);
>   break;


> +void br_multicast_del_pg(struct net_bridge_mdb_entry *mp,
> +  struct net_bridge_port_group *pg,
> +  struct net_bridge_port_group __rcu **pp)
> +{
> + struct net_bridge *br = pg->port->br;
> + struct net_bridge_group_src *ent;
> + struct hlist_node *tmp;
> +
> + rcu_assign_pointer(*pp, pg->next);
> + hlist_del_init(>mglist);
> + del_timer(>timer);
> + hlist_for_each_entry_safe(ent, tmp, >src_list, node)
> + br_multicast_del_group_src(ent);
> + br_mdb_notify(br->dev, pg->port, >addr, RTM_DELMDB, pg->flags);
> + kfree_rcu(pg, rcu);
> +
> + if (!mp->ports && !mp->host_joined && netif_running(br->dev))
> + mod_timer(>timer, jiffies);
> +}

> @@ -1641,16 +1647,7 @@ br_multicast_leave_group(struct net_bridge *br,
>   if (p->flags & MDB_PG_FLAGS_PERMANENT)
>   break;
>  
> - rcu_assign_pointer(*pp, p->next);
> - hlist_del_init(>mglist);
> - del_timer(>timer);
> - kfree_rcu(p, rcu);
> - br_mdb_notify(br->dev, port, group, RTM_DELMDB,
> -   p->flags | MDB_PG_FLAGS_FAST_LEAVE);

And here we'll loose MDB_PG_FLAGS_FAST_LEAVE potentially?

> - if (!mp->ports && !mp->host_joined &&
> - netif_running(br->dev))
> - mod_timer(>timer, jiffies);
> + br_multicast_del_pg(mp, p, pp);


Re: [Bridge] [PATCH RFC 0/7] net: bridge: cfm: Add support for Connectivity Fault Management(CFM)

2020-09-06 Thread Horatiu Vultur via Bridge
The 09/04/2020 15:44, Stephen Hemminger wrote:
> 
> On Fri, 4 Sep 2020 09:15:20 +
> Henrik Bjoernlund  wrote:
> 
> > Connectivity Fault Management (CFM) is defined in 802.1Q section 12.14.
> >
> > Connectivity Fault Management (CFM) comprises capabilities for
> > detecting, verifying, and isolating connectivity failures in
> > Virtual Bridged Networks. These capabilities can be used in
> > networks operated by multiple independent organizations, each
> > with restricted management access to each other’s equipment.
> >
> > CFM functions are partitioned as follows:
> > — Path discovery
> > — Fault detection
> > — Fault verification and isolation
> > — Fault notification
> > — Fault recovery
> >
> > The primary CFM protocol shims are called Maintenance Points (MPs).
> > A MP can be either a MEP or a MHF.
> > The MEP:
> > -It is the Maintenance association End Point
> >  described in 802.1Q section 19.2.
> > -It is created on a specific level (1-7) and is assuring
> >  that no CFM frames are passing through this MEP on lower levels.
> > -It initiates and terminates/validates CFM frames on its level.
> > -It can only exist on a port that is related to a bridge.
> > The MHF:
> > -It is the Maintenance Domain Intermediate Point
> >  (MIP) Half Function (MHF) described in 802.1Q section 19.3.
> > -It is created on a specific level (1-7).
> > -It is extracting/injecting certain CFM frame on this level.
> > -It can only exist on a port that is related to a bridge.
> > -Currently not supported.
> >
> > There are defined the following CFM protocol functions:
> > -Continuity Check
> > -Loopback. Currently not supported.
> > -Linktrace. Currently not supported.
> >
> > This CFM component supports create/delete of MEP instances and
> > configuration of the different CFM protocols. Also status information
> > can be fetched and delivered through notification due to defect status
> > change.
> >
> > The user interacts with CFM using the 'cfm' user space client program, the
> > client talks with the kernel using netlink. The kernel will try to offload
> > the requests to the HW via switchdev API (not implemented yet).
> >
> > Any notification emitted by CFM from the kernel can be monitored in user
> > space by starting 'cfm_server' program.
> >
> > Currently this 'cfm' and 'cfm_server' programs are standalone placed in a
> > cfm repository https://github.com/microchip-ung/cfm but it is considered
> > to integrate this into 'iproute2'.
> >
> > Reviewed-by: Horatiu Vultur  
> > Signed-off-by: Henrik Bjoernlund  

Hi Stephen,

> 
> Could this be done in userspace? It is a control plane protocol.
> Could it be done by using eBPF?

I might be able to answer this. We have not considered this approach of
using eBPF. Because we want actually to push this in HW extending
switchdev API. I know that this series doesn't cover the switchdev part
but we posted like this because we wanted to get some feedback from
community. We had a similar approach for MRP, where we extended the
bridge and switchdev API, so we tought that is the way to go forward.

Regarding eBPF, I can't say that it would work or not because I lack
knowledge in this.

> 
> Adding more code in bridge impacts a large number of users of Linux distros.
> It creates bloat and potential security vulnerabilities.

-- 
/Horatiu