[PATCH net-next v2 1/1] bridge: return error code when deleting Vlan
v2: Return err immediately if nbp_vlan_delete() fails (pointed by David Ahern) Signed-off-by: Roman Mashak --- net/bridge/br_netlink.c | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c index f0e8268..1efdd48 100644 --- a/net/bridge/br_netlink.c +++ b/net/bridge/br_netlink.c @@ -527,11 +527,13 @@ static int br_vlan_info(struct net_bridge *br, struct net_bridge_port *p, case RTM_DELLINK: if (p) { - nbp_vlan_delete(p, vinfo->vid); + err = nbp_vlan_delete(p, vinfo->vid); + if (err) + break; if (vinfo->flags & BRIDGE_VLAN_INFO_MASTER) - br_vlan_delete(p->br, vinfo->vid); + err = br_vlan_delete(p->br, vinfo->vid); } else { - br_vlan_delete(br, vinfo->vid); + err = br_vlan_delete(br, vinfo->vid); } break; } -- 1.9.1
Re: [PATCH net-next v2 1/1] bridge: return error code when deleting Vlan
On 10/12/17 7:51 AM, Roman Mashak wrote: > v2: > Return err immediately if nbp_vlan_delete() fails (pointed by David Ahern) > > Signed-off-by: Roman Mashak > --- > net/bridge/br_netlink.c | 8 +--- > 1 file changed, 5 insertions(+), 3 deletions(-) > > diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c > index f0e8268..1efdd48 100644 > --- a/net/bridge/br_netlink.c > +++ b/net/bridge/br_netlink.c > @@ -527,11 +527,13 @@ static int br_vlan_info(struct net_bridge *br, struct > net_bridge_port *p, > > case RTM_DELLINK: > if (p) { > - nbp_vlan_delete(p, vinfo->vid); > + err = nbp_vlan_delete(p, vinfo->vid); > + if (err) > + break; I'm not sure a break is the right thing to do. Seems like you leave it in a half configured state. > if (vinfo->flags & BRIDGE_VLAN_INFO_MASTER) > - br_vlan_delete(p->br, vinfo->vid); > + err = br_vlan_delete(p->br, vinfo->vid); > } else { > - br_vlan_delete(br, vinfo->vid); > + err = br_vlan_delete(br, vinfo->vid); > } > break; > } > Why do you want to return the error code here? Walking the code paths seems like ENOENT or err from switchdev_port_obj_del are the 2 error possibilities.
Re: [PATCH net-next v2 1/1] bridge: return error code when deleting Vlan
On Thu, Oct 12, 2017 at 10:19 AM, David Ahern wrote: > On 10/12/17 7:51 AM, Roman Mashak wrote: >> v2: >> Return err immediately if nbp_vlan_delete() fails (pointed by David Ahern) >> >> Signed-off-by: Roman Mashak >> --- >> net/bridge/br_netlink.c | 8 +--- >> 1 file changed, 5 insertions(+), 3 deletions(-) >> >> diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c >> index f0e8268..1efdd48 100644 >> --- a/net/bridge/br_netlink.c >> +++ b/net/bridge/br_netlink.c >> @@ -527,11 +527,13 @@ static int br_vlan_info(struct net_bridge *br, struct >> net_bridge_port *p, >> >> case RTM_DELLINK: >> if (p) { >> - nbp_vlan_delete(p, vinfo->vid); >> + err = nbp_vlan_delete(p, vinfo->vid); >> + if (err) >> + break; > > I'm not sure a break is the right thing to do. Seems like you leave it > in a half configured state. > >> if (vinfo->flags & BRIDGE_VLAN_INFO_MASTER) >> - br_vlan_delete(p->br, vinfo->vid); >> + err = br_vlan_delete(p->br, vinfo->vid); >> } else { >> - br_vlan_delete(br, vinfo->vid); >> + err = br_vlan_delete(br, vinfo->vid); >> } >> break; >> } >> > > Why do you want to return the error code here? Walking the code paths > seems like ENOENT or err from switchdev_port_obj_del are the 2 error > possibilities. For example, if you attempt to delete a non-existing vlan on a port, the current code succeeds and also sends event : rtnetlink_rcv_msg rtnl_bridge_dellink br_dellink br_afspec br_vlan_info int br_dellink(..) { ... err = br_afspec() if (err == 0) br_ifinfo_notify(RTM_NEWLINK, p); } This is misleading, so a proper errcode has to be produced.
Re: [PATCH net-next v2 1/1] bridge: return error code when deleting Vlan
On 12/10/17 21:07, Roman Mashak wrote: > On Thu, Oct 12, 2017 at 10:19 AM, David Ahern wrote: >> On 10/12/17 7:51 AM, Roman Mashak wrote: >>> v2: >>> Return err immediately if nbp_vlan_delete() fails (pointed by David Ahern) >>> >>> Signed-off-by: Roman Mashak >>> --- >>> net/bridge/br_netlink.c | 8 +--- >>> 1 file changed, 5 insertions(+), 3 deletions(-) >>> >>> diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c >>> index f0e8268..1efdd48 100644 >>> --- a/net/bridge/br_netlink.c >>> +++ b/net/bridge/br_netlink.c >>> @@ -527,11 +527,13 @@ static int br_vlan_info(struct net_bridge *br, struct >>> net_bridge_port *p, >>> >>> case RTM_DELLINK: >>> if (p) { >>> - nbp_vlan_delete(p, vinfo->vid); >>> + err = nbp_vlan_delete(p, vinfo->vid); >>> + if (err) >>> + break; >> >> I'm not sure a break is the right thing to do. Seems like you leave it >> in a half configured state. >> >>> if (vinfo->flags & BRIDGE_VLAN_INFO_MASTER) >>> - br_vlan_delete(p->br, vinfo->vid); >>> + err = br_vlan_delete(p->br, vinfo->vid); >>> } else { >>> - br_vlan_delete(br, vinfo->vid); >>> + err = br_vlan_delete(br, vinfo->vid); >>> } >>> break; >>> } >>> >> >> Why do you want to return the error code here? Walking the code paths >> seems like ENOENT or err from switchdev_port_obj_del are the 2 error >> possibilities. > > For example, if you attempt to delete a non-existing vlan on a port, > the current code succeeds and also sends event : > > rtnetlink_rcv_msg > rtnl_bridge_dellink >br_dellink > br_afspec > br_vlan_info > > int br_dellink(..) > { > ... > err = br_afspec() > if (err == 0) > br_ifinfo_notify(RTM_NEWLINK, p); > } > > This is misleading, so a proper errcode has to be produced. > True, but you also change the expected behaviour because now a user can clear all vlans with one request (1 - 4094), and after the change that will fail with a partial delete if some vlan was missing. This has been the behaviour forever and some script might depend on it. Also IMO, and as David also mentioned, doing a partial delete is not good.
Re: [PATCH net-next v2 1/1] bridge: return error code when deleting Vlan
On 17-10-12 02:12 PM, Nikolay Aleksandrov wrote: On 12/10/17 21:07, Roman Mashak wrote: For example, if you attempt to delete a non-existing vlan on a port, the current code succeeds and also sends event : rtnetlink_rcv_msg rtnl_bridge_dellink br_dellink br_afspec br_vlan_info int br_dellink(..) { ... err = br_afspec() if (err == 0) br_ifinfo_notify(RTM_NEWLINK, p); } This is misleading, so a proper errcode has to be produced. True, but you also change the expected behaviour because now a user can clear all vlans with one request (1 - 4094), and after the change that will fail with a partial delete if some vlan was missing. The issue is more subtle (per Roman above): Try to delete a vlan (that doesnt exist). 1) It says "success". 2) Worse: Another process listening (bridge monitor?) gets an _event_ that the vlan has been deleted (when it never existed in the first place). This has been the behaviour forever and some script might depend on it. Also IMO, and as David also mentioned, doing a partial delete is not good. I think this is a bug (especially the event part). cheers, jamal
Re: [PATCH net-next v2 1/1] bridge: return error code when deleting Vlan
On 13.10.2017 05:03, Jamal Hadi Salim wrote: > On 17-10-12 02:12 PM, Nikolay Aleksandrov wrote: >> On 12/10/17 21:07, Roman Mashak wrote: > >>> For example, if you attempt to delete a non-existing vlan on a port, >>> the current code succeeds and also sends event : >>> >>> rtnetlink_rcv_msg >>> rtnl_bridge_dellink >>> br_dellink >>> br_afspec >>> br_vlan_info >>> >>> int br_dellink(..) >>> { >>> ... >>> err = br_afspec() >>> if (err == 0) >>> br_ifinfo_notify(RTM_NEWLINK, p); >>> } >>> >>> This is misleading, so a proper errcode has to be produced. >>> >> > > > >> True, but you also change the expected behaviour because now a user can >> clear all vlans with one request (1 - 4094), and after the change that >> will fail with a partial delete if some vlan was missing. >> > > The issue is more subtle (per Roman above): > Try to delete a vlan (that doesnt exist). > 1) It says "success". > 2) Worse: Another process listening (bridge monitor?) gets an _event_ > that the vlan has been deleted (when it never existed in the first > place). > >> This has been the behaviour forever and some script might depend on it. >> Also IMO, and as David also mentioned, doing a partial delete is not >> good. >> > > I think this is a bug (especially the event part). > > cheers, > jamal > Fair enough, but after the patch you get the opposite effect too - you delete a couple of vlans but you don't generate an event because of an error in the middle. That at least can be taken care of. I do agree it's a bug, but there might be scripts that rely on it and don't check the return value when clearing vlans. They will end up with a partial clear and wrongly assumed state, so maybe leave the opportunistic delete but count if anything was actually deleted and send an event only then ? That should make everyone happy :-)
Re: [PATCH net-next v2 1/1] bridge: return error code when deleting Vlan
On 10/12/17 8:15 PM, Nikolay Aleksandrov wrote: > I do agree it's a bug, but there might be scripts that rely on it and > don't check the return value when clearing vlans. They will end up with > a partial clear and wrongly assumed state, so maybe leave the > opportunistic delete but count if anything was actually deleted and send > an event only then ? +1
Re: [PATCH net-next v2 1/1] bridge: return error code when deleting Vlan
Nikolay Aleksandrov writes: [...] >>> Why do you want to return the error code here? Walking the code paths >>> seems like ENOENT or err from switchdev_port_obj_del are the 2 error >>> possibilities. >> >> For example, if you attempt to delete a non-existing vlan on a port, >> the current code succeeds and also sends event : >> >> rtnetlink_rcv_msg >> rtnl_bridge_dellink >>br_dellink >> br_afspec >> br_vlan_info >> >> int br_dellink(..) >> { >> ... >> err = br_afspec() >> if (err == 0) >> br_ifinfo_notify(RTM_NEWLINK, p); >> } >> >> This is misleading, so a proper errcode has to be produced. >> > > True, but you also change the expected behaviour because now a user can > clear all vlans with one request (1 - 4094), and after the change that > will fail with a partial delete if some vlan was missing. Nikolay, would you like to have a crack at fixing this? > This has been the behaviour forever and some script might depend on it. > Also IMO, and as David also mentioned, doing a partial delete is not good.
Re: [PATCH net-next v2 1/1] bridge: return error code when deleting Vlan
On 13/10/17 19:00, Roman Mashak wrote: > Nikolay Aleksandrov writes: > > > [...] > Why do you want to return the error code here? Walking the code paths seems like ENOENT or err from switchdev_port_obj_del are the 2 error possibilities. >>> >>> For example, if you attempt to delete a non-existing vlan on a port, >>> the current code succeeds and also sends event : >>> >>> rtnetlink_rcv_msg >>> rtnl_bridge_dellink >>>br_dellink >>> br_afspec >>> br_vlan_info >>> >>> int br_dellink(..) >>> { >>> ... >>> err = br_afspec() >>> if (err == 0) >>> br_ifinfo_notify(RTM_NEWLINK, p); >>> } >>> >>> This is misleading, so a proper errcode has to be produced. >>> >> >> True, but you also change the expected behaviour because now a user can >> clear all vlans with one request (1 - 4094), and after the change that >> will fail with a partial delete if some vlan was missing. > > Nikolay, would you like to have a crack at fixing this? > Sure, need to finish something and will cook up a patch next week. Thanks, Nik