Re: [Bridge] [PATCH RFC] net: bridge: Clear offload_fwd_mark when passing frame up bridge interface.

2022-05-06 Thread Andrew Lunn
> Some safer alternatives to this patch are based on the idea that we
> could ignore skb->offload_fwd_mark coming from an unoffloaded bridge
> port (i.e. treat this condition at br1, not at br0). We could:
> - clear skb->offload_fwd_mark in br_handle_frame_finish(), if p->hwdom is 0
> - change nbp_switchdev_allowed_egress() to return true if cb->src_hwdom == 0

O.K, i will try out these solutions.

Thanks
 Andrew


Re: [Bridge] [PATCH RFC] net: bridge: Clear offload_fwd_mark when passing frame up bridge interface.

2022-05-06 Thread Stephen Hemminger
On Fri, 6 May 2022 03:18:22 +0200
Andrew Lunn  wrote:

> On Thu, May 05, 2022 at 04:07:20PM -0700, Stephen Hemminger wrote:
> > On Fri,  6 May 2022 00:59:04 +0200
> > Andrew Lunn  wrote:
> >   
> > > It is possible to stack bridges on top of each other. Consider the
> > > following which makes use of an Ethernet switch:
> > > 
> > >br1
> > >  /\
> > > /  \
> > >/\
> > >  br0.11wlan0
> > >|
> > >br0
> > >  /  |  \
> > > p1  p2  p3
> > > 
> > > br0 is offloaded to the switch. Above br0 is a vlan interface, for
> > > vlan 11. This vlan interface is then a slave of br1. br1 also has
> > > wireless interface as a slave. This setup trunks wireless lan traffic
> > > over the copper network inside a VLAN.
> > > 
> > > A frame received on p1 which is passed up to the bridge has the
> > > skb->offload_fwd_mark flag set to true, indicating it that the switch
> > > has dealt with forwarding the frame out ports p2 and p3 as
> > > needed. This flag instructs the software bridge it does not need to
> > > pass the frame back down again. However, the flag is not getting reset
> > > when the frame is passed upwards. As a result br1 sees the flag,
> > > wrongly interprets it, and fails to forward the frame to wlan0.
> > > 
> > > When passing a frame upwards, clear the flag.
> > > 
> > > RFC because i don't know the bridge code well enough if this is the
> > > correct place to do this, and if there are any side effects, could the
> > > skb be a clone, etc.
> > > 
> > > Fixes: f1c2eddf4cb6 ("bridge: switchdev: Use an helper to clear forward 
> > > mark")
> > > Signed-off-by: Andrew Lunn   
> > 
> > Bridging of bridges is not supposed to be allowed.
> > See:
> > 
> > bridge:br_if.c
> > 
> > /* No bridging of bridges */
> > if (dev->netdev_ops->ndo_start_xmit == br_dev_xmit) {
> > NL_SET_ERR_MSG(extack,
> >"Can not enslave a bridge to a bridge");
> > return -ELOOP;
> > }  
> 
> This is not direct bridging of bridges. There is a vlan interface in
> the middle. And even if it is not supposed to work, it does work, it
> is being used, and it regressed. This fixes the regression.
> 
>Andrew

The problem is that doing this kind of nested bridging screws up
Spanning Tree.


Re: [Bridge] [PATCH RFC] net: bridge: Clear offload_fwd_mark when passing frame up bridge interface.

2022-05-06 Thread Vladimir Oltean
Hi Andrew,

On Fri, May 06, 2022 at 12:59:04AM +0200, Andrew Lunn wrote:
> It is possible to stack bridges on top of each other. Consider the
> following which makes use of an Ethernet switch:
> 
>br1
>  /\
> /  \
>/\
>  br0.11wlan0
>|
>br0
>  /  |  \
> p1  p2  p3
> 
> br0 is offloaded to the switch. Above br0 is a vlan interface, for
> vlan 11. This vlan interface is then a slave of br1. br1 also has
> wireless interface as a slave. This setup trunks wireless lan traffic
> over the copper network inside a VLAN.
> 
> A frame received on p1 which is passed up to the bridge has the
> skb->offload_fwd_mark flag set to true, indicating it that the switch
> has dealt with forwarding the frame out ports p2 and p3 as
> needed. This flag instructs the software bridge it does not need to
> pass the frame back down again. However, the flag is not getting reset
> when the frame is passed upwards. As a result br1 sees the flag,
> wrongly interprets it, and fails to forward the frame to wlan0.
> 
> When passing a frame upwards, clear the flag.
> 
> RFC because i don't know the bridge code well enough if this is the
> correct place to do this, and if there are any side effects, could the
> skb be a clone, etc.

Each skb has its own offload_fwd_mark, so clearing it for this skb does
not affect a clone. And when a packet is simultaneously forwarded and
locally received, the order is first forward/flood it, then receive it.
Cloning takes place during forwarding using deliver_clone(), so it
shouldn't be the case that you are clearing the offload_fwd_mark for a
yet-to-be-forwarded packet, either. So I think we're good there.

> 
> Fixes: f1c2eddf4cb6 ("bridge: switchdev: Use an helper to clear forward mark")
> Signed-off-by: Andrew Lunn 
> ---
>  net/bridge/br_input.c | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
> index 196417859c4a..9327a5fad1df 100644
> --- a/net/bridge/br_input.c
> +++ b/net/bridge/br_input.c
> @@ -39,6 +39,13 @@ static int br_pass_frame_up(struct sk_buff *skb)
>   dev_sw_netstats_rx_add(brdev, skb->len);
>  
>   vg = br_vlan_group_rcu(br);
> +
> + /* Reset the offload_fwd_mark because there could be a stacked
> +  * bridge above, and it should not think this bridge it doing
> +  * that bridges work forward out its ports.

"this bridge is doing that bridge's work forwarding out its ports"

> +  */
> + br_switchdev_frame_unmark(skb);
> +
>   /* Bridge is just like any other port.  Make sure the
>* packet is allowed except in promisc mode when someone
>* may be running packet capture.
> -- 
> 2.36.0
>

The good thing with this patch is that it avoids conditionals.
The bad thing is that it prevents true offloading of this configuration
from being possible (when "wlan0" is "p4").

I don't know what hardware is capable of doing this, but I think it's
cautious to not exclude it, either.

Some safer alternatives to this patch are based on the idea that we
could ignore skb->offload_fwd_mark coming from an unoffloaded bridge
port (i.e. treat this condition at br1, not at br0). We could:
- clear skb->offload_fwd_mark in br_handle_frame_finish(), if p->hwdom is 0
- change nbp_switchdev_allowed_egress() to return true if cb->src_hwdom == 0