[PATCH net-next v2 1/1] bridge: return error code when deleting Vlan

2017-10-12 Thread Roman Mashak
v2:
 Return err immediately if nbp_vlan_delete() fails (pointed by David Ahern)

Signed-off-by: Roman Mashak 
---
 net/bridge/br_netlink.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
index f0e8268..1efdd48 100644
--- a/net/bridge/br_netlink.c
+++ b/net/bridge/br_netlink.c
@@ -527,11 +527,13 @@ static int br_vlan_info(struct net_bridge *br, struct 
net_bridge_port *p,
 
case RTM_DELLINK:
if (p) {
-   nbp_vlan_delete(p, vinfo->vid);
+   err = nbp_vlan_delete(p, vinfo->vid);
+   if (err)
+   break;
if (vinfo->flags & BRIDGE_VLAN_INFO_MASTER)
-   br_vlan_delete(p->br, vinfo->vid);
+   err = br_vlan_delete(p->br, vinfo->vid);
} else {
-   br_vlan_delete(br, vinfo->vid);
+   err = br_vlan_delete(br, vinfo->vid);
}
break;
}
-- 
1.9.1



Re: [PATCH net-next v2 1/1] bridge: return error code when deleting Vlan

2017-10-12 Thread David Ahern
On 10/12/17 7:51 AM, Roman Mashak wrote:
> v2:
>  Return err immediately if nbp_vlan_delete() fails (pointed by David Ahern)
> 
> Signed-off-by: Roman Mashak 
> ---
>  net/bridge/br_netlink.c | 8 +---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
> index f0e8268..1efdd48 100644
> --- a/net/bridge/br_netlink.c
> +++ b/net/bridge/br_netlink.c
> @@ -527,11 +527,13 @@ static int br_vlan_info(struct net_bridge *br, struct 
> net_bridge_port *p,
>  
>   case RTM_DELLINK:
>   if (p) {
> - nbp_vlan_delete(p, vinfo->vid);
> + err = nbp_vlan_delete(p, vinfo->vid);
> + if (err)
> + break;

I'm not sure a break is the right thing to do. Seems like you leave it
in a half configured state.

>   if (vinfo->flags & BRIDGE_VLAN_INFO_MASTER)
> - br_vlan_delete(p->br, vinfo->vid);
> + err = br_vlan_delete(p->br, vinfo->vid);
>   } else {
> - br_vlan_delete(br, vinfo->vid);
> + err = br_vlan_delete(br, vinfo->vid);
>   }
>   break;
>   }
> 

Why do you want to return the error code here? Walking the code paths
seems like ENOENT or err from switchdev_port_obj_del are the 2 error
possibilities.


Re: [PATCH net-next v2 1/1] bridge: return error code when deleting Vlan

2017-10-12 Thread Roman Mashak
On Thu, Oct 12, 2017 at 10:19 AM, David Ahern  wrote:
> On 10/12/17 7:51 AM, Roman Mashak wrote:
>> v2:
>>  Return err immediately if nbp_vlan_delete() fails (pointed by David Ahern)
>>
>> Signed-off-by: Roman Mashak 
>> ---
>>  net/bridge/br_netlink.c | 8 +---
>>  1 file changed, 5 insertions(+), 3 deletions(-)
>>
>> diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
>> index f0e8268..1efdd48 100644
>> --- a/net/bridge/br_netlink.c
>> +++ b/net/bridge/br_netlink.c
>> @@ -527,11 +527,13 @@ static int br_vlan_info(struct net_bridge *br, struct 
>> net_bridge_port *p,
>>
>>   case RTM_DELLINK:
>>   if (p) {
>> - nbp_vlan_delete(p, vinfo->vid);
>> + err = nbp_vlan_delete(p, vinfo->vid);
>> + if (err)
>> + break;
>
> I'm not sure a break is the right thing to do. Seems like you leave it
> in a half configured state.
>
>>   if (vinfo->flags & BRIDGE_VLAN_INFO_MASTER)
>> - br_vlan_delete(p->br, vinfo->vid);
>> + err = br_vlan_delete(p->br, vinfo->vid);
>>   } else {
>> - br_vlan_delete(br, vinfo->vid);
>> + err = br_vlan_delete(br, vinfo->vid);
>>   }
>>   break;
>>   }
>>
>
> Why do you want to return the error code here? Walking the code paths
> seems like ENOENT or err from switchdev_port_obj_del are the 2 error
> possibilities.

For example, if you attempt to delete a non-existing vlan on a port,
the current code succeeds and also sends event :

rtnetlink_rcv_msg
rtnl_bridge_dellink
   br_dellink
  br_afspec
 br_vlan_info

int br_dellink(..)
{
  ...
  err = br_afspec()
  if (err == 0)
  br_ifinfo_notify(RTM_NEWLINK, p);
}

This is misleading, so a proper errcode has to be produced.


Re: [PATCH net-next v2 1/1] bridge: return error code when deleting Vlan

2017-10-12 Thread Nikolay Aleksandrov
On 12/10/17 21:07, Roman Mashak wrote:
> On Thu, Oct 12, 2017 at 10:19 AM, David Ahern  wrote:
>> On 10/12/17 7:51 AM, Roman Mashak wrote:
>>> v2:
>>>  Return err immediately if nbp_vlan_delete() fails (pointed by David Ahern)
>>>
>>> Signed-off-by: Roman Mashak 
>>> ---
>>>  net/bridge/br_netlink.c | 8 +---
>>>  1 file changed, 5 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
>>> index f0e8268..1efdd48 100644
>>> --- a/net/bridge/br_netlink.c
>>> +++ b/net/bridge/br_netlink.c
>>> @@ -527,11 +527,13 @@ static int br_vlan_info(struct net_bridge *br, struct 
>>> net_bridge_port *p,
>>>
>>>   case RTM_DELLINK:
>>>   if (p) {
>>> - nbp_vlan_delete(p, vinfo->vid);
>>> + err = nbp_vlan_delete(p, vinfo->vid);
>>> + if (err)
>>> + break;
>>
>> I'm not sure a break is the right thing to do. Seems like you leave it
>> in a half configured state.
>>
>>>   if (vinfo->flags & BRIDGE_VLAN_INFO_MASTER)
>>> - br_vlan_delete(p->br, vinfo->vid);
>>> + err = br_vlan_delete(p->br, vinfo->vid);
>>>   } else {
>>> - br_vlan_delete(br, vinfo->vid);
>>> + err = br_vlan_delete(br, vinfo->vid);
>>>   }
>>>   break;
>>>   }
>>>
>>
>> Why do you want to return the error code here? Walking the code paths
>> seems like ENOENT or err from switchdev_port_obj_del are the 2 error
>> possibilities.
> 
> For example, if you attempt to delete a non-existing vlan on a port,
> the current code succeeds and also sends event :
> 
> rtnetlink_rcv_msg
> rtnl_bridge_dellink
>br_dellink
>   br_afspec
>  br_vlan_info
> 
> int br_dellink(..)
> {
>   ...
>   err = br_afspec()
>   if (err == 0)
>   br_ifinfo_notify(RTM_NEWLINK, p);
> }
> 
> This is misleading, so a proper errcode has to be produced.
> 

True, but you also change the expected behaviour because now a user can
clear all vlans with one request (1 - 4094), and after the change that
will fail with a partial delete if some vlan was missing.

This has been the behaviour forever and some script might depend on it.
Also IMO, and as David also mentioned, doing a partial delete is not good.







Re: [PATCH net-next v2 1/1] bridge: return error code when deleting Vlan

2017-10-12 Thread Jamal Hadi Salim

On 17-10-12 02:12 PM, Nikolay Aleksandrov wrote:

On 12/10/17 21:07, Roman Mashak wrote:



For example, if you attempt to delete a non-existing vlan on a port,
the current code succeeds and also sends event :

rtnetlink_rcv_msg
 rtnl_bridge_dellink
br_dellink
   br_afspec
  br_vlan_info

int br_dellink(..)
{
   ...
   err = br_afspec()
   if (err == 0)
   br_ifinfo_notify(RTM_NEWLINK, p);
}

This is misleading, so a proper errcode has to be produced.








True, but you also change the expected behaviour because now a user can
clear all vlans with one request (1 - 4094), and after the change that
will fail with a partial delete if some vlan was missing.



The issue is more subtle (per Roman above):
Try to delete a vlan (that doesnt  exist).
1) It says "success".
2) Worse: Another process listening (bridge monitor?) gets an _event_
 that  the vlan has been deleted (when it never existed in the first
 place).


This has been the behaviour forever and some script might depend on it.
Also IMO, and as David also mentioned, doing a partial delete is not good.



I think this is a bug (especially the event part).

cheers,
jamal



Re: [PATCH net-next v2 1/1] bridge: return error code when deleting Vlan

2017-10-12 Thread Nikolay Aleksandrov
On 13.10.2017 05:03, Jamal Hadi Salim wrote:
> On 17-10-12 02:12 PM, Nikolay Aleksandrov wrote:
>> On 12/10/17 21:07, Roman Mashak wrote:
> 
>>> For example, if you attempt to delete a non-existing vlan on a port,
>>> the current code succeeds and also sends event :
>>>
>>> rtnetlink_rcv_msg
>>>  rtnl_bridge_dellink
>>>     br_dellink
>>>    br_afspec
>>>   br_vlan_info
>>>
>>> int br_dellink(..)
>>> {
>>>    ...
>>>    err = br_afspec()
>>>    if (err == 0)
>>>    br_ifinfo_notify(RTM_NEWLINK, p);
>>> }
>>>
>>> This is misleading, so a proper errcode has to be produced.
>>>
>>
> 
> 
> 
>> True, but you also change the expected behaviour because now a user can
>> clear all vlans with one request (1 - 4094), and after the change that
>> will fail with a partial delete if some vlan was missing.
>>
> 
> The issue is more subtle (per Roman above):
> Try to delete a vlan (that doesnt  exist).
> 1) It says "success".
> 2) Worse: Another process listening (bridge monitor?) gets an _event_
>  that  the vlan has been deleted (when it never existed in the first
>  place).
> 
>> This has been the behaviour forever and some script might depend on it.
>> Also IMO, and as David also mentioned, doing a partial delete is not
>> good.
>>
> 
> I think this is a bug (especially the event part).
> 
> cheers,
> jamal
> 

Fair enough, but after the patch you get the opposite effect too - you
delete a couple of vlans but you don't generate an event because of an
error in the middle. That at least can be taken care of.

I do agree it's a bug, but there might be scripts that rely on it and
don't check the return value when clearing vlans. They will end up with
a partial clear and wrongly assumed state, so maybe leave the
opportunistic delete but count if anything was actually deleted and send
an event only then ?
That should make everyone happy :-)




Re: [PATCH net-next v2 1/1] bridge: return error code when deleting Vlan

2017-10-13 Thread David Ahern
On 10/12/17 8:15 PM, Nikolay Aleksandrov wrote:
> I do agree it's a bug, but there might be scripts that rely on it and
> don't check the return value when clearing vlans. They will end up with
> a partial clear and wrongly assumed state, so maybe leave the
> opportunistic delete but count if anything was actually deleted and send
> an event only then ?

+1


Re: [PATCH net-next v2 1/1] bridge: return error code when deleting Vlan

2017-10-13 Thread Roman Mashak
Nikolay Aleksandrov  writes:


[...]

>>> Why do you want to return the error code here? Walking the code paths
>>> seems like ENOENT or err from switchdev_port_obj_del are the 2 error
>>> possibilities.
>> 
>> For example, if you attempt to delete a non-existing vlan on a port,
>> the current code succeeds and also sends event :
>> 
>> rtnetlink_rcv_msg
>> rtnl_bridge_dellink
>>br_dellink
>>   br_afspec
>>  br_vlan_info
>> 
>> int br_dellink(..)
>> {
>>   ...
>>   err = br_afspec()
>>   if (err == 0)
>>   br_ifinfo_notify(RTM_NEWLINK, p);
>> }
>> 
>> This is misleading, so a proper errcode has to be produced.
>> 
>
> True, but you also change the expected behaviour because now a user can
> clear all vlans with one request (1 - 4094), and after the change that
> will fail with a partial delete if some vlan was missing.

Nikolay, would you like to have a crack at fixing this?

> This has been the behaviour forever and some script might depend on it.
> Also IMO, and as David also mentioned, doing a partial delete is not good.


Re: [PATCH net-next v2 1/1] bridge: return error code when deleting Vlan

2017-10-14 Thread Nikolay Aleksandrov
On 13/10/17 19:00, Roman Mashak wrote:
> Nikolay Aleksandrov  writes:
> 
> 
> [...]
> 
 Why do you want to return the error code here? Walking the code paths
 seems like ENOENT or err from switchdev_port_obj_del are the 2 error
 possibilities.
>>>
>>> For example, if you attempt to delete a non-existing vlan on a port,
>>> the current code succeeds and also sends event :
>>>
>>> rtnetlink_rcv_msg
>>> rtnl_bridge_dellink
>>>br_dellink
>>>   br_afspec
>>>  br_vlan_info
>>>
>>> int br_dellink(..)
>>> {
>>>   ...
>>>   err = br_afspec()
>>>   if (err == 0)
>>>   br_ifinfo_notify(RTM_NEWLINK, p);
>>> }
>>>
>>> This is misleading, so a proper errcode has to be produced.
>>>
>>
>> True, but you also change the expected behaviour because now a user can
>> clear all vlans with one request (1 - 4094), and after the change that
>> will fail with a partial delete if some vlan was missing.
> 
> Nikolay, would you like to have a crack at fixing this?
> 

Sure, need to finish something and will cook up a patch next week.

Thanks,
 Nik