[iproute2] bridge: Fix garbled json output seen if a vlan filter is specified

2016-10-07 Thread Anuradha Karuppiah
From: anuradhak <anurad...@cumulusnetworks.com>

json objects were started but not completed if the fdb vlan did not
match the specified filter vlan.

Sample output:
$ bridge -j fdb show vlan 111
[{
"mac": "44:38:39:00:69:88",
"dev": "br0",
"vlan": 111,
"master": "br0",
"state": "permanent"
}
]
$ bridge -j fdb show vlan 100
[]
$

Signed-off-by: Anuradha Karuppiah <anurad...@cumulusnetworks.com>
---
 bridge/fdb.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/bridge/fdb.c b/bridge/fdb.c
index c6e0379..90f4b15 100644
--- a/bridge/fdb.c
+++ b/bridge/fdb.c
@@ -100,11 +100,6 @@ int print_fdb(const struct sockaddr_nl *who, struct 
nlmsghdr *n, void *arg)
if (filter_index && filter_index != r->ndm_ifindex)
return 0;
 
-   if (jw_global) {
-   jsonw_pretty(jw_global, 1);
-   jsonw_start_object(jw_global);
-   }
-
parse_rtattr(tb, NDA_MAX, NDA_RTA(r),
 n->nlmsg_len - NLMSG_LENGTH(sizeof(*r)));
 
@@ -114,6 +109,11 @@ int print_fdb(const struct sockaddr_nl *who, struct 
nlmsghdr *n, void *arg)
if (filter_vlan && filter_vlan != vid)
return 0;
 
+   if (jw_global) {
+   jsonw_pretty(jw_global, 1);
+   jsonw_start_object(jw_global);
+   }
+
if (n->nlmsg_type == RTM_DELNEIGH) {
if (jw_global)
jsonw_string_field(jw_global, "opCode", "deleted");
-- 
2.1.4



Re: [patch net-next v5 0/4] return offloaded stats as default and expose original sw stats

2016-06-23 Thread Anuradha Karuppiah
 we can't separate CPU and HW stats there. In some cases (or ASICs) HW
 counters do
 not include CPU generated packetsyou will have to add CPU
 generated pkt counters to the
 hw counters for such virtual device stats.
>>> Can you please provide and example how that could happen?
>>
>>example is the bridge vlan stats I mention below. These are usually counted
>>by attaching hw virtual counter resources. And CPU generated packets
>>in some cases maybe setup to bypass the ASIC pipeline because the CPU
>>has already made the required decisions. So, they may not be counted by
>>by such hw virtual counters.
>
> Bypass ASIC? How do the packets get on the wire?
>

Bypass the "forwarding pipeline" in the ASIC that is. Obviously the
ASIC ships the CPU generated packet out of the switch/front-panel
port. Continuing Roopa's example of vlan netdev stats To get the
HW stats counters are typically tied to the ingress and egress vlan hw
entries. All the incoming packets are subject to the ingress vlan
lookup irrespective of whether they get punted to the CPU or whether
they are forwarded to another front panel port. In that case the
ingress HW stats does represent all packets. However for CPU
originated packets egress vlan lookups are bypassed in the ASIC (this
is common forwarding option in most ASICs) and the packet shipped as
is out of front-panel port specified by the CPU. Which means these
packets will NOT be counted against the egress VLAN HW counter; hence
the need for summation.


Re: [PATCH iproute2 net-next v3 1/5] json_writer: allow base json data type to be array or object

2016-06-21 Thread Anuradha Karuppiah
On Tue, Jun 21, 2016 at 9:12 AM, Stephen Hemminger
<step...@networkplumber.org> wrote:
> On Mon, 20 Jun 2016 23:39:43 -0700
> Roopa Prabhu <ro...@cumulusnetworks.com> wrote:
>
>> From: Anuradha Karuppiah <anurad...@cumulusnetworks.com>
>>
>> This patch adds a type qualifier to json_writer. Type can be a
>> json object or array. This can be extended to other types like
>> json-string, json-number etc in the future.
>>
>> Signed-off-by: Anuradha Karuppiah <anurad...@cumulusnetworks.com>
>
> Since json writer is not used in many places yet, why not just
> get rid of the automatic object in the constructor.

I wanted to force the external api to start with an json-object or
json-array. It reduces the chance of mistakes vs. a typeless
constructor. With a typeless constructor you can accidentally end up
with a json output that doesn't pass json lint; especially if optional
params are being suppressed at different places.

>
>
>
> diff --git a/lib/json_writer.c b/lib/json_writer.c
> index 2af16e1..5e588d8 100644
> --- a/lib/json_writer.c
> +++ b/lib/json_writer.c
> @@ -102,7 +102,6 @@ json_writer_t *jsonw_new(FILE *f)
> self->depth = 0;
> self->pretty = false;
> self->sep = '\0';
> -   putc('{', self->out);
> }
> return self;
>  }
> @@ -114,7 +113,6 @@ void jsonw_destroy(json_writer_t **self_p)
>
> assert(self->depth == 0);
> jsonw_eol(self);
> -   fputs("}\n", self->out);
> fflush(self->out);
> free(self);
> *self_p = NULL;
> diff --git a/misc/ifstat.c b/misc/ifstat.c
> index abbb4e7..d9a7e50 100644
> --- a/misc/ifstat.c
> +++ b/misc/ifstat.c
> @@ -246,7 +246,6 @@ static void dump_raw_db(FILE *fp, int to_hist)
> h = hist_db;
> if (jw) {
> jsonw_pretty(jw, pretty);
> -   jsonw_name(jw, info_source);
> jsonw_start_object(jw);
> } else
> fprintf(fp, "#%s\n", info_source);
> @@ -452,7 +451,6 @@ static void dump_kern_db(FILE *fp)
>
> if (jw) {
> jsonw_pretty(jw, pretty);
> -   jsonw_name(jw, info_source);
> jsonw_start_object(jw);
> } else
> print_head(fp);
> @@ -466,8 +464,10 @@ static void dump_kern_db(FILE *fp)
> else
> print_one_if(fp, n, n->val);
> }
> -   if (json_output)
> -   fprintf(fp, "\n} }\n");
> +   if (jw) {
> +   jsonw_end_object(jw);
> +   jsonw_destroy();
> +   }
>  }
>
>  static void dump_incr_db(FILE *fp)
> @@ -478,7 +478,6 @@ static void dump_incr_db(FILE *fp)
> h = hist_db;
> if (jw) {
> jsonw_pretty(jw, pretty);
> -   jsonw_name(jw, info_source);
> jsonw_start_object(jw);
> } else
> print_head(fp);
> diff --git a/misc/nstat.c b/misc/nstat.c
> index a9e0f20..411cd87 100644
> --- a/misc/nstat.c
> +++ b/misc/nstat.c
> @@ -285,7 +285,6 @@ static void dump_kern_db(FILE *fp, int to_hist)
> h = hist_db;
> if (jw) {
> jsonw_pretty(jw, pretty);
> -   jsonw_name(jw, info_source);
> jsonw_start_object(jw);
> } else
> fprintf(fp, "#%s\n", info_source);


Re: [PATCH iproute2 net-next 3/5] bridge: add json support for bridge fdb show

2016-06-17 Thread Anuradha Karuppiah
On Fri, Jun 17, 2016 at 12:04 PM, Stephen Hemminger
<step...@networkplumber.org> wrote:
> On Fri, 17 Jun 2016 11:04:54 -0700
> Anuradha Karuppiah <anurad...@cumulusnetworks.com> wrote:
>
>> On Tue, May 31, 2016 at 12:22 PM, Anuradha Karuppiah
>> <anurad...@cumulusnetworks.com> wrote:
>> > On Tue, May 31, 2016 at 11:59 AM, Stephen Hemminger
>> > <step...@networkplumber.org> wrote:
>> >> On Fri, 27 May 2016 21:37:14 -0700
>> >> Roopa Prabhu <ro...@cumulusnetworks.com> wrote:
>> >>
>> >>> Sample output:
>> >>> $bridge -j fdb show
>> >>> [{
>> >>> "mac": "44:38:39:00:69:88",
>> >>> "dev": "swp2s0",
>> >>> "vlan": 2,
>> >>> "master": "br0",
>> >>> "state": "permanent"
>> >>> },{
>> >>> "mac": "00:02:00:00:00:01",
>> >>> "dev": "swp2s0",
>> >>> "vlan": 2,
>> >>> "master": "br0"
>> >>> },{
>> >>> "mac": "00:02:00:00:00:02",
>> >>> "dev": "swp2s1",
>> >>> "vlan": 2,
>> >>> "master": "br0"
>> >>> },{
>> >>> "mac": "44:38:39:00:69:89",
>> >>> "dev": "swp2s1",
>> >>> "master": "br0",
>> >>> "state": "permanent"
>> >>> },{
>> >>> "mac": "44:38:39:00:69:89",
>> >>> "dev": "swp2s1",
>> >>> "vlan": 2,
>> >>> "master": "br0",
>> >>> "state": "permanent"
>> >>> },{
>> >>> "mac": "44:38:39:00:69:88",
>> >>> "dev": "br0",
>> >>> "master": "br0",
>> >>> "state": "permanent"
>> >>> }
>> >>> ]
>> >>
>> >> In most JSON I have seen, the output would be:
>> >>
>> >> {
>> >>   "fdb" : [
>> >>  {
>> >>  "mac": "44:38:39:00:69:88",
>> >>  "dev": "swp2s0",
>> >>  "vlan": 2,
>> >>  "master": "br0",
>> >>  "state": "permanent"
>> >>  },
>> >> ...
>> >>]
>> >> }
>> >>
>> >> I.e never a bare array.
>> >>
>> > Yes Stephen, Adding an extra level would be one way to force the
>> > format to json-object. And that would definitely be the way to do it
>> > if we ever added a top level json dump - something like - "bridge -j
>> > show".
>> >
>> > But in the case of "bridge -j fdb show" that level is redundant. To be
>> > consistent we would have to add that extra level to all json dumps
>> > (even if they were already objects; such as the "bridge -j vlan
>> > show").The google json style guide recommends against adding hierarchy
>> > unless needed. And it is not that uncommon in java to have a
>> > json-array of objects for e.g. http://json-schema.org/example1.html
>> > talks about a schema that is an "array of products".
>> >
>> > What do you recommend?
>>
>> Hi Stephen,
>> We did a bit more digging around and found that other folks use json
>> output with top level array as well. Here’s a docker networks json
>> output sample -
>>
>> vagrant@host-21 ~ $ docker network inspect red
>> [
>> {
>> "Name": "red",
>> "Id": 
>> "d2fff9bafd7564c4012aa49f322fcd8f5743cc5ceb465dc218af5ba22c920981",
>> "Scope": "global",
>> "Driver": "overlay",
>> "EnableIPv6": false,
>> "IPAM": {
>> "Driver": "default",
>> "Options": {},
>> "Config": [
>> {
>> "Subnet": "10.252.20.0/24"
>> }
>> ]
>> },
>> "Internal": false,
>> "Containers": {
>> 
>> "c92084c1ebfb4f0a601537298c273078862207e3b564787ddd6ef564efbaca47":
>> {
>> "Name": "ctr21",
>> "EndpointID":
>> "e7468a70f13f1ea7b15445ab555374892ac41f71ea9023af1d9ede668bfd8742",
>> "MacAddress": "02:42:0a:fc:14:03",
>> "IPv4Address": "10.252.20.3/24",
>> "IPv6Address": ""
>> },
>> 
>> "ep-9bfc004b4046512f0f0104fe022d3686f5237ae08475741f8d520552cbb63d45":
>> {
>> "Name": "ctr22",
>> "EndpointID":
>> "9bfc004b4046512f0f0104fe022d3686f5237ae08475741f8d520552cbb63d45",
>> "MacAddress": "02:42:0a:fc:14:02",
>> "IPv4Address": "10.252.20.2/24",
>> "IPv6Address": ""
>> }
>> },
>> "Options": {},
>> "Labels": {}
>> }
>> ]
>>
>> Adding an additional namespace to all the json outputs (just to avoid
>> a top-level json-array for some) seems redundant. If a namespace is
>> needed for other reasons we can definitely add it. So we think it
>> would be better to just go with the top-level json-array for a
>> list/set-of-objects outputs.
>
> Ok, resubmit. I never claimed to be a JSON expert :=)

will do :)


Re: [PATCH iproute2 net-next 3/5] bridge: add json support for bridge fdb show

2016-06-17 Thread Anuradha Karuppiah
On Tue, May 31, 2016 at 12:22 PM, Anuradha Karuppiah
<anurad...@cumulusnetworks.com> wrote:
> On Tue, May 31, 2016 at 11:59 AM, Stephen Hemminger
> <step...@networkplumber.org> wrote:
>> On Fri, 27 May 2016 21:37:14 -0700
>> Roopa Prabhu <ro...@cumulusnetworks.com> wrote:
>>
>>> Sample output:
>>> $bridge -j fdb show
>>> [{
>>> "mac": "44:38:39:00:69:88",
>>> "dev": "swp2s0",
>>> "vlan": 2,
>>> "master": "br0",
>>> "state": "permanent"
>>> },{
>>> "mac": "00:02:00:00:00:01",
>>> "dev": "swp2s0",
>>> "vlan": 2,
>>> "master": "br0"
>>> },{
>>> "mac": "00:02:00:00:00:02",
>>> "dev": "swp2s1",
>>> "vlan": 2,
>>> "master": "br0"
>>> },{
>>> "mac": "44:38:39:00:69:89",
>>> "dev": "swp2s1",
>>> "master": "br0",
>>> "state": "permanent"
>>> },{
>>> "mac": "44:38:39:00:69:89",
>>> "dev": "swp2s1",
>>> "vlan": 2,
>>> "master": "br0",
>>> "state": "permanent"
>>> },{
>>> "mac": "44:38:39:00:69:88",
>>> "dev": "br0",
>>> "master": "br0",
>>> "state": "permanent"
>>> }
>>> ]
>>
>> In most JSON I have seen, the output would be:
>>
>> {
>>   "fdb" : [
>>  {
>>  "mac": "44:38:39:00:69:88",
>>  "dev": "swp2s0",
>>  "vlan": 2,
>>  "master": "br0",
>>  "state": "permanent"
>>  },
>> ...
>>]
>> }
>>
>> I.e never a bare array.
>>
> Yes Stephen, Adding an extra level would be one way to force the
> format to json-object. And that would definitely be the way to do it
> if we ever added a top level json dump - something like - "bridge -j
> show".
>
> But in the case of "bridge -j fdb show" that level is redundant. To be
> consistent we would have to add that extra level to all json dumps
> (even if they were already objects; such as the "bridge -j vlan
> show").The google json style guide recommends against adding hierarchy
> unless needed. And it is not that uncommon in java to have a
> json-array of objects for e.g. http://json-schema.org/example1.html
> talks about a schema that is an "array of products".
>
> What do you recommend?

Hi Stephen,
We did a bit more digging around and found that other folks use json
output with top level array as well. Here’s a docker networks json
output sample -

vagrant@host-21 ~ $ docker network inspect red
[
{
"Name": "red",
"Id": 
"d2fff9bafd7564c4012aa49f322fcd8f5743cc5ceb465dc218af5ba22c920981",
"Scope": "global",
"Driver": "overlay",
"EnableIPv6": false,
"IPAM": {
"Driver": "default",
"Options": {},
"Config": [
{
"Subnet": "10.252.20.0/24"
}
]
},
"Internal": false,
"Containers": {
"c92084c1ebfb4f0a601537298c273078862207e3b564787ddd6ef564efbaca47":
{
"Name": "ctr21",
"EndpointID":
"e7468a70f13f1ea7b15445ab555374892ac41f71ea9023af1d9ede668bfd8742",
"MacAddress": "02:42:0a:fc:14:03",
"IPv4Address": "10.252.20.3/24",
"IPv6Address": ""
},

"ep-9bfc004b4046512f0f0104fe022d3686f5237ae08475741f8d520552cbb63d45":
{
"Name": "ctr22",
"EndpointID":
"9bfc004b4046512f0f0104fe022d3686f5237ae08475741f8d520552cbb63d45",
"MacAddress": "02:42:0a:fc:14:02",
"IPv4Address": "10.252.20.2/24",
"IPv6Address": ""
}
},
"Options": {},
"Labels": {}
}
]

Adding an additional namespace to all the json outputs (just to avoid
a top-level json-array for some) seems redundant. If a namespace is
needed for other reasons we can definitely add it. So we think it
would be better to just go with the top-level json-array for a
list/set-of-objects outputs.

thanks
Anuradha.


Re: [PATCH iproute2 net-next 3/5] bridge: add json support for bridge fdb show

2016-05-31 Thread Anuradha Karuppiah
On Tue, May 31, 2016 at 11:59 AM, Stephen Hemminger
 wrote:
> On Fri, 27 May 2016 21:37:14 -0700
> Roopa Prabhu  wrote:
>
>> Sample output:
>> $bridge -j fdb show
>> [{
>> "mac": "44:38:39:00:69:88",
>> "dev": "swp2s0",
>> "vlan": 2,
>> "master": "br0",
>> "state": "permanent"
>> },{
>> "mac": "00:02:00:00:00:01",
>> "dev": "swp2s0",
>> "vlan": 2,
>> "master": "br0"
>> },{
>> "mac": "00:02:00:00:00:02",
>> "dev": "swp2s1",
>> "vlan": 2,
>> "master": "br0"
>> },{
>> "mac": "44:38:39:00:69:89",
>> "dev": "swp2s1",
>> "master": "br0",
>> "state": "permanent"
>> },{
>> "mac": "44:38:39:00:69:89",
>> "dev": "swp2s1",
>> "vlan": 2,
>> "master": "br0",
>> "state": "permanent"
>> },{
>> "mac": "44:38:39:00:69:88",
>> "dev": "br0",
>> "master": "br0",
>> "state": "permanent"
>> }
>> ]
>
> In most JSON I have seen, the output would be:
>
> {
>   "fdb" : [
>  {
>  "mac": "44:38:39:00:69:88",
>  "dev": "swp2s0",
>  "vlan": 2,
>  "master": "br0",
>  "state": "permanent"
>  },
> ...
>]
> }
>
> I.e never a bare array.
>
Yes Stephen, Adding an extra level would be one way to force the
format to json-object. And that would definitely be the way to do it
if we ever added a top level json dump - something like - "bridge -j
show".

But in the case of "bridge -j fdb show" that level is redundant. To be
consistent we would have to add that extra level to all json dumps
(even if they were already objects; such as the "bridge -j vlan
show").The google json style guide recommends against adding hierarchy
unless needed. And it is not that uncommon in java to have a
json-array of objects for e.g. http://json-schema.org/example1.html
talks about a schema that is an "array of products".

What do you recommend?


Re: [PATCH net-next v7 0/4] net: Introduce protodown flag.

2015-07-15 Thread Anuradha Karuppiah
On Wed, Jul 15, 2015 at 11:29 AM, Scott Feldman sfel...@gmail.com wrote:
 On Wed, Jul 15, 2015 at 8:53 AM, Anuradha Karuppiah
 anurad...@cumulusnetworks.com wrote:
 On Tue, Jul 14, 2015 at 10:08 PM, Scott Feldman sfel...@gmail.com wrote:
 On Tue, Jul 14, 2015 at 1:43 PM,  anurad...@cumulusnetworks.com wrote:
 From: Anuradha Karuppiah anurad...@cumulusnetworks.com

 User space daemons can detect errors in the network that need to be
 notified to the switch device drivers.

 Drivers can react to this error state by doing a phy-down on the
 switch-port which would result in a carrier-off locally and on the directly
 connected switch. Doing that would prevent loops and black-holes in the
 network.

 Hi Anuradha,

 Since this is a switch-thingy, can you move this to switchdev port
 attribute rather than adding another ndo op?  I know you started this
 patch set before switchdev port attrs, but I think it's worthwhile
 moving it to switchdev since clearly the use-case is switch device
 drivers.   Let me know if you need help converting over to switchdev
 port attr.

 Also, is the sysfs interface necessary?  The netlink/iproute2
 interface seems sufficient.

 -scott

 Yes, for the current use case (MLAG) it would be better to isolate the
 changes to switchdev. Also if we go with the switchdev option are you
 thinking about dropping the netdev-proto_down bool attribute
 altogether and only keeping that info in the switch drivers that
 support the op?

 Yes.

 You are right, the sysfs is not necessary. I can make it RO and just
 use it for displaying or drop it all together. What would you
 recommend?

 Drop it if it's not needed/used.

 On the iproute2 part, do we expect users to manually toggle this flag
 from the cmdline?  Seems kind of weird to walk up to a box and say ip
 link set dev DEV protodown on.  Isn't this more intended for external
 agents (i.e. MLAG) to toggle?  I guess if there is a netlink attr for
 the flag, there should be a user-interface.  But it seems like it'll
 be a source of confusion for 99.999% of the users out there that
 wonder what this knob does.

The MLAG app currently does use the iproute2 interface for toggling
protodown. So I may not be able to drop the iproute2 config option
easily.
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH net-next v7 0/4] net: Introduce protodown flag.

2015-07-15 Thread Anuradha Karuppiah
On Wed, Jul 15, 2015 at 11:46 AM, Scott Feldman sfel...@gmail.com wrote:
 On Wed, Jul 15, 2015 at 11:36 AM, Anuradha Karuppiah
 anurad...@cumulusnetworks.com wrote:
 On Wed, Jul 15, 2015 at 11:29 AM, Scott Feldman sfel...@gmail.com wrote:
 On Wed, Jul 15, 2015 at 8:53 AM, Anuradha Karuppiah
 anurad...@cumulusnetworks.com wrote:
 On Tue, Jul 14, 2015 at 10:08 PM, Scott Feldman sfel...@gmail.com wrote:
 On Tue, Jul 14, 2015 at 1:43 PM,  anurad...@cumulusnetworks.com wrote:
 From: Anuradha Karuppiah anurad...@cumulusnetworks.com

 User space daemons can detect errors in the network that need to be
 notified to the switch device drivers.

 Drivers can react to this error state by doing a phy-down on the
 switch-port which would result in a carrier-off locally and on the 
 directly
 connected switch. Doing that would prevent loops and black-holes in the
 network.

 Hi Anuradha,

 Since this is a switch-thingy, can you move this to switchdev port
 attribute rather than adding another ndo op?  I know you started this
 patch set before switchdev port attrs, but I think it's worthwhile
 moving it to switchdev since clearly the use-case is switch device
 drivers.   Let me know if you need help converting over to switchdev
 port attr.

 Also, is the sysfs interface necessary?  The netlink/iproute2
 interface seems sufficient.

 -scott

 Yes, for the current use case (MLAG) it would be better to isolate the
 changes to switchdev. Also if we go with the switchdev option are you
 thinking about dropping the netdev-proto_down bool attribute
 altogether and only keeping that info in the switch drivers that
 support the op?

 Yes.

 You are right, the sysfs is not necessary. I can make it RO and just
 use it for displaying or drop it all together. What would you
 recommend?

 Drop it if it's not needed/used.

 On the iproute2 part, do we expect users to manually toggle this flag
 from the cmdline?  Seems kind of weird to walk up to a box and say ip
 link set dev DEV protodown on.  Isn't this more intended for external
 agents (i.e. MLAG) to toggle?  I guess if there is a netlink attr for
 the flag, there should be a user-interface.  But it seems like it'll
 be a source of confusion for 99.999% of the users out there that
 wonder what this knob does.

 The MLAG app currently does use the iproute2 interface for toggling
 protodown. So I may not be able to drop the iproute2 config option
 easily.

 It can send netlink msg directly as easy as doing exec ip link set
 ..., I assume.

 Forgive me if I've asked this before: where is this MLAG app?
We are in the process of making the app ready for open-sourcing and
upstreaming this patch set was part of that.
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH net-next v4 1/4] net core: Add protodown support.

2015-07-15 Thread Anuradha Karuppiah
On Wed, Jul 15, 2015 at 10:31 AM, Stephen Hemminger
step...@networkplumber.org wrote:
 On Wed,  8 Jul 2015 14:04:22 -0700
 anurad...@cumulusnetworks.com wrote:

 diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
 index e20979d..99ebb01 100644
 --- a/include/linux/netdevice.h
 +++ b/include/linux/netdevice.h
 @@ -1041,6 +1041,11 @@ typedef u16 (*select_queue_fallback_t)(struct 
 net_device *dev,
   *   TX queue.
   * int (*ndo_get_iflink)(const struct net_device *dev);
   *   Called to get the iflink value of this device.
 + * void (*ndo_change_proto_flags)(struct net_device *dev,
 + * unsigned int proto_flags);
 + *   This function is used to pass protocol port state information
 + *   to the switch driver.
 + *
   */
  struct net_device_ops {
   int (*ndo_init)(struct net_device *dev);
 @@ -1211,6 +1216,8 @@ struct net_device_ops {
 int queue_index,
 u32 maxrate);
   int (*ndo_get_iflink)(const struct net_device 
 *dev);
 + int (*ndo_change_proto_flags)(struct net_device 
 *dev,
 +   unsigned int 
 proto_flags);
  };

  /**
 @@ -1502,6 +1509,10 @@ enum netdev_priv_flags {
   *
   *   @qdisc_tx_busylock: XXX: need comments on this one
   *
 + *   @proto_flags:   protocol port state information can be sent to the
 + *   switch driver and used to set the phys state of the
 + *   switch port.
 + *
   *   FIXME: cleanup struct net_device such that network protocol info
   *   moves out.
   */
 @@ -1762,6 +1773,7 @@ struct net_device {
  #endif
   struct phy_device *phydev;
   struct lock_class_key *qdisc_tx_busylock;
 + unsigned int proto_flags;
  };
  #define to_net_dev(d) container_of(d, struct net_device, dev)

 @@ -2982,6 +2994,7 @@ int dev_get_phys_port_id(struct net_device *dev,
struct netdev_phys_item_id *ppid);
  int dev_get_phys_port_name(struct net_device *dev,
  char *name, size_t len);
 +int dev_change_proto_flags(struct net_device *dev, unsigned int 
 proto_flags);
  struct sk_buff *validate_xmit_skb_list(struct sk_buff *skb, struct 
 net_device *dev);
  struct sk_buff *dev_hard_start_xmit(struct sk_buff *skb, struct net_device 
 *dev,
   struct netdev_queue *txq, int *ret);
 diff --git a/include/uapi/linux/if.h b/include/uapi/linux/if.h
 index 9cf2394..8d60fe7 100644
 --- a/include/uapi/linux/if.h
 +++ b/include/uapi/linux/if.h
 @@ -156,6 +156,12 @@ enum {
   IF_LINK_MODE_DORMANT,   /* limit upward transition to dormant */
  };

 +/* proto_flags - port state information can be passed to the switch driver 
 and
 + * used to determine the phys state of the switch port */
 +enum {
 + IF_PROTOF_DOWN  = 10  /* set switch port phys state down */
 +};
 +
  /*
   *   Device mapping structure. I'd just gone off and designed a
   *   beautiful scheme using only loadable modules with arguments
 diff --git a/net/core/dev.c b/net/core/dev.c
 index 6778a99..87571c4 100644
 --- a/net/core/dev.c
 +++ b/net/core/dev.c
 @@ -6076,6 +6076,26 @@ int dev_get_phys_port_name(struct net_device *dev,
  EXPORT_SYMBOL(dev_get_phys_port_name);

  /**
 + *   dev_change_proto_flags - update protocol port state information
 + *   @dev: device
 + *   @proto_flags: new value
 + *
 + *   This info can be used by switch drivers to set the phys state of the
 + *   port.
 + */
 +int dev_change_proto_flags(struct net_device *dev, unsigned int proto_flags)
 +{
 + const struct net_device_ops *ops = dev-netdev_ops;
 +
 + if (dev-proto_flags == proto_flags)
 + return 0;
 + if (!ops-ndo_change_proto_flags)
 + return -EOPNOTSUPP;
 + return ops-ndo_change_proto_flags(dev, proto_flags);
 +}
 +EXPORT_SYMBOL(dev_change_proto_flags);

 Rather than introducing yet another ndo op etc, can't this be handled under 
 some other
 event (like change flags)?  It seems that this one small feature has grown to 
 have bigger
 impact than necessary.

Yes, I really wanted to avoid another ndo as well. The advantage of
adding the ndo was to isolate functionality and verify if the
operation was actually performed.
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH net-next v7 2/4] netlink: changes for setting and clearing protodown via netlink.

2015-07-15 Thread Anuradha Karuppiah
-Original Message-
From: Stephen Hemminger [mailto:step...@networkplumber.org]
Sent: Wednesday, July 15, 2015 10:32 AM
To: anurad...@cumulusnetworks.com
Cc: da...@davemloft.net; sfel...@gmail.com; netdev@vger.kernel.org;
ro...@cumulusnetworks.com; go...@cumulusnetworks.com;
w...@cumulusnetworks.com
Subject: Re: [PATCH net-next v7 2/4] netlink: changes for setting and
clearing protodown via netlink.

On Tue, 14 Jul 2015 13:43:20 -0700
anurad...@cumulusnetworks.com wrote:

 From: Anuradha Karuppiah anurad...@cumulusnetworks.com

 Signed-off-by: Anuradha Karuppiah anurad...@cumulusnetworks.com
 Signed-off-by: Andy Gospodarek go...@cumulusnetworks.com
 Signed-off-by: Roopa Prabhu ro...@cumulusnetworks.com
 Signed-off-by: Wilson Kok w...@cumulusnetworks.com
 ---
  include/uapi/linux/if_link.h |1 +
  net/core/rtnetlink.c |   16 ++--
  2 files changed, 15 insertions(+), 2 deletions(-)

 diff --git a/include/uapi/linux/if_link.h
 b/include/uapi/linux/if_link.h index 2c7e8e3..24d68b7 100644
 --- a/include/uapi/linux/if_link.h
 +++ b/include/uapi/linux/if_link.h
 @@ -148,6 +148,7 @@ enum {
   IFLA_PHYS_SWITCH_ID,
   IFLA_LINK_NETNSID,
   IFLA_PHYS_PORT_NAME,
 + IFLA_PROTO_DOWN,
   __IFLA_MAX
  };

 diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index
 01ced4a..4104e27 100644
 --- a/net/core/rtnetlink.c
 +++ b/net/core/rtnetlink.c
 @@ -896,7 +896,9 @@ static noinline size_t if_nlmsg_size(const struct
net_device *dev,
  + rtnl_link_get_size(dev) /* IFLA_LINKINFO */
  + rtnl_link_get_af_size(dev) /* IFLA_AF_SPEC */
  + nla_total_size(MAX_PHYS_ITEM_ID_LEN) /* IFLA_PHYS_PORT_ID
*/
 -+ nla_total_size(MAX_PHYS_ITEM_ID_LEN); /*
IFLA_PHYS_SWITCH_ID */
 ++ nla_total_size(MAX_PHYS_ITEM_ID_LEN) /*
IFLA_PHYS_SWITCH_ID */
 ++ nla_total_size(1); /* IFLA_PROTO_DOWN */
 +
  }

  static int rtnl_vf_ports_fill(struct sk_buff *skb, struct net_device
 *dev) @@ -1082,7 +1084,8 @@ static int rtnl_fill_ifinfo(struct sk_buff
*skb, struct net_device *dev,
   (dev-ifalias 
nla_put_string(skb, IFLA_IFALIAS, dev-ifalias)) ||
   nla_put_u32(skb, IFLA_CARRIER_CHANGES,
 - atomic_read(dev-carrier_changes)))
 + atomic_read(dev-carrier_changes)) ||
 + nla_put_u8(skb, IFLA_PROTO_DOWN, dev-proto_down))
   goto nla_put_failure;

   if (1) {
 @@ -1319,6 +1322,7 @@ static const struct nla_policy
ifla_policy[IFLA_MAX+1] = {
   [IFLA_CARRIER_CHANGES]  = { .type = NLA_U32 },  /* ignored */
   [IFLA_PHYS_SWITCH_ID]   = { .type = NLA_BINARY, .len =
MAX_PHYS_ITEM_ID_LEN },
   [IFLA_LINK_NETNSID] = { .type = NLA_S32 },
 + [IFLA_PROTO_DOWN]   = { .type = NLA_U8 },
  };

  static const struct nla_policy ifla_info_policy[IFLA_INFO_MAX+1] = {
 @@ -1853,6 +1857,14 @@ static int do_setlink(const struct sk_buff *skb,
   }
   err = 0;

 + if (tb[IFLA_PROTO_DOWN]) {
 + err = dev_change_proto_down(dev,
 +
nla_get_u8(tb[IFLA_PROTO_DOWN]));
 + if (err)
 + goto errout;
 + status |= DO_SETLINK_NOTIFY;
 + }
 +
  errout:
   if (status  DO_SETLINK_MODIFIED) {
   if (status  DO_SETLINK_NOTIFY)

Why is this it's own attribute rather than a flag?

Hi Stephen,
I put out a version with this as a separate flags field (with proto_down
as a single bit). But there was some concern that proto_down will be the
only flag in the field so Dave recommended changing it to a boolean
attribute.
Anuradha.
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH net-next v7 0/4] net: Introduce protodown flag.

2015-07-15 Thread Anuradha Karuppiah
On Tue, Jul 14, 2015 at 10:08 PM, Scott Feldman sfel...@gmail.com wrote:
 On Tue, Jul 14, 2015 at 1:43 PM,  anurad...@cumulusnetworks.com wrote:
 From: Anuradha Karuppiah anurad...@cumulusnetworks.com

 User space daemons can detect errors in the network that need to be
 notified to the switch device drivers.

 Drivers can react to this error state by doing a phy-down on the
 switch-port which would result in a carrier-off locally and on the directly
 connected switch. Doing that would prevent loops and black-holes in the
 network.

 Hi Anuradha,

 Since this is a switch-thingy, can you move this to switchdev port
 attribute rather than adding another ndo op?  I know you started this
 patch set before switchdev port attrs, but I think it's worthwhile
 moving it to switchdev since clearly the use-case is switch device
 drivers.   Let me know if you need help converting over to switchdev
 port attr.

 Also, is the sysfs interface necessary?  The netlink/iproute2
 interface seems sufficient.

 -scott

Yes, for the current use case (MLAG) it would be better to isolate the
changes to switchdev. Also if we go with the switchdev option are you
thinking about dropping the netdev-proto_down bool attribute
altogether and only keeping that info in the switch drivers that
support the op?

You are right, the sysfs is not necessary. I can make it RO and just
use it for displaying or drop it all together. What would you
recommend?

Anuradha.
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH net-next v6 3/4] rocker: Handle protodown notifications.

2015-07-14 Thread Anuradha Karuppiah
On Tue, Jul 14, 2015 at 9:42 AM, Sergei Shtylyov
sergei.shtyl...@cogentembedded.com wrote:
 Hello.

 On 07/14/2015 06:32 PM, anurad...@cumulusnetworks.com wrote:

 From: Anuradha Karuppiah anurad...@cumulusnetworks.com


 protodown can be set by user space applications like MLAG on detecting
 errors on a switch port. This patch provides sample switch driver changes
 for handling protodown. Rocker PHYS disables the port in response to
 protodown.


 Signed-off-by: Anuradha Karuppiah anurad...@cumulusnetworks.com
 Signed-off-by: Andy Gospodarek go...@cumulusnetworks.com
 Signed-off-by: Roopa Prabhu ro...@cumulusnetworks.com
 Signed-off-by: Wilson Kok w...@cumulusnetworks.com
 ---
   drivers/net/ethernet/rocker/rocker.c |   19 ++-
   1 file changed, 18 insertions(+), 1 deletion(-)


 diff --git a/drivers/net/ethernet/rocker/rocker.c
 b/drivers/net/ethernet/rocker/rocker.c
 index 2d8578cade..ec950d9 100644
 --- a/drivers/net/ethernet/rocker/rocker.c
 +++ b/drivers/net/ethernet/rocker/rocker.c
 @@ -3983,7 +3983,8 @@ static int rocker_port_open(struct net_device *dev)

 napi_enable(rocker_port-napi_tx);
 napi_enable(rocker_port-napi_rx);
 -   rocker_port_set_enable(rocker_port, true);
 +   if (!(dev-proto_down))


Inner parens not needed.
Ack.


 +   rocker_port_set_enable(rocker_port, true);
 netif_start_queue(dev);
 return 0;

 @@ -4167,6 +4168,21 @@ static int rocker_port_get_phys_port_name(struct
 net_device *dev,
 return err ? -EOPNOTSUPP : 0;
   }

 +static int rocker_port_change_proto_down(struct net_device *dev,
 +bool proto_down)
 +{
 +   struct rocker_port *rocker_port = netdev_priv(dev);
 +
 +   if (rocker_port-dev-flags  IFF_UP) {
 +   if (proto_down)
 +   rocker_port_set_enable(rocker_port, false);
 +   else
 +   rocker_port_set_enable(rocker_port, true);


Why not:

 rocker_port_set_enable(rocker_port, !proto_down);
yes, that would be better. thanks for the review.

 [...]

 MBR, Sergei

--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH net-next v5 1/4] net core: Add protodown support.

2015-07-13 Thread Anuradha Karuppiah
On Mon, Jul 13, 2015 at 2:34 PM, David Miller da...@davemloft.net wrote:
 From: anurad...@cumulusnetworks.com
 Date: Thu,  9 Jul 2015 15:35:27 -0700

 +/* proto_flags - port state information can be passed to the switch driver 
 and
 + * used to determine the phys state of the switch port */
 +enum {
 + IF_PROTOF_DOWN  = 10  /* set switch port phys state down */
 +};

 Realistically, do we really foresee any other proto flags being added in
 the future?

 Unless there is a strong sense that we will have some, this is
 insanely overengineered with all of these bit masking capabilities and
 such and nested attributes.

 I'd say just do one boolean attribute and that's it.
Ack, I will resubmit with protodown as a boolean. Thanks for the review Dave.
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH net-next v4 4/4] ip link: proto_down config and display.

2015-07-08 Thread Anuradha Karuppiah
On Wed, Jul 8, 2015 at 6:01 PM, Stephen Hemminger
step...@networkplumber.org wrote:
 On Wed,  8 Jul 2015 14:04:25 -0700
 anurad...@cumulusnetworks.com wrote:

 From: Anuradha Karuppiah anurad...@cumulusnetworks.com

 This patch adds support to set and display the IF_PROTOF_DOWN proto_flag.
 One example user space application setting this flag is a multi-chassis
 LAG application to handle split-brain situation on peer-link failure.

 Example:
 root@net-next:~# ip link set eth1 protodown on
 root@net-next:~# ip link show eth1
 4: eth1: NO-CARRIER,BROADCAST,MULTICAST,UP mtu 1500 qdisc pfifo_fast state 
 DOWN mode DEFAULT group default qlen 1000 proto_flags DOWN
 link/ether 52:54:00:12:35:01 brd ff:ff:ff:ff:ff:ff
 root@net-next:~# ip link set eth1 protodown off
 root@net-next:~# ip link show eth1
 4: eth1: BROADCAST,MULTICAST,UP,LOWER_UP mtu 1500 qdisc pfifo_fast state 
 UP mode DEFAULT group default qlen 1000
 link/ether 52:54:00:12:35:01 brd ff:ff:ff:ff:ff:ff
 root@net-next:~#

 Signed-off-by: Anuradha Karuppiah anurad...@cumulusnetworks.com
 Signed-off-by: Andy Gospodarek go...@cumulusnetworks.com
 Signed-off-by: Roopa Prabhu ro...@cumulusnetworks.com
 Signed-off-by: Wilson Kok w...@cumulusnetworks.com
 ---
  include/linux/if.h  |6 ++
  include/linux/if_link.h |   12 
  ip/ipaddress.c  |   27 +++
  ip/iplink.c |   18 ++
  man/man8/ip-link.8.in   |8 
  5 files changed, 71 insertions(+)

 diff --git a/include/linux/if.h b/include/linux/if.h
 index a55a9e0..97f53d8 100644
 --- a/include/linux/if.h
 +++ b/include/linux/if.h
 @@ -156,6 +156,12 @@ enum {
   IF_LINK_MODE_DORMANT,   /* limit upward transition to dormant */
  };

 +/* proto_flags - port state information can be passed to the switch driver 
 and
 + * used to determine the phys state of the switch port */
 +enum {
 + IF_PROTOF_DOWN  = 10  /* set switch port phys state down */
 +};
 +
  /*
   *   Device mapping structure. I'd just gone off and designed a
   *   beautiful scheme using only loadable modules with arguments
 diff --git a/include/linux/if_link.h b/include/linux/if_link.h
 index 3d0d613..1d07f3f 100644
 --- a/include/linux/if_link.h
 +++ b/include/linux/if_link.h
 @@ -148,6 +148,7 @@ enum {
   IFLA_PHYS_SWITCH_ID,
   IFLA_LINK_NETNSID,
   IFLA_PHYS_PORT_NAME,
 + IFLA_PROTO_FLAGS_INFO,
   __IFLA_MAX
  };

 @@ -620,4 +621,15 @@ enum {

  #define IFLA_HSR_MAX (__IFLA_HSR_MAX - 1)

 +
 +/* proto_flags section */
 +enum {
 + IFLA_PROTO_FLAGS_UNSPEC,
 + IFLA_PROTO_FLAGS_VAL,
 + IFLA_PROTO_FLAGS_CHANGE,
 + __IFLA_PROTO_FLAGS_MAX,
 +};
 +
 +#define IFLA_PROTO_FLAGS_MAX (__IFLA_PROTO_FLAGS_MAX - 1)
 +
  #endif /* _LINUX_IF_LINK_H */
 diff --git a/ip/ipaddress.c b/ip/ipaddress.c
 index 340e1c9..4078023 100644
 --- a/ip/ipaddress.c
 +++ b/ip/ipaddress.c
 @@ -556,6 +556,30 @@ static void print_link_stats(FILE *fp, struct nlmsghdr 
 *n)
   fprintf(fp, %s, _SL_);
  }

 +static void print_proto_flags(FILE *fp, struct rtattr *tb)
 +{
 + struct rtattr *proto_info[IFLA_INFO_MAX+1];
 + uint32_t proto_flags;
 +
 + parse_rtattr_nested(proto_info, IFLA_PROTO_FLAGS_MAX, tb);
 +
 + if (proto_info[IFLA_PROTO_FLAGS_VAL]) {
 + proto_flags = *(uint32_t *)RTA_DATA(
 + proto_info[IFLA_PROTO_FLAGS_VAL]);
 + if (proto_flags) {
 + fprintf(fp,  proto_flags );
 +#define _PROTOF(f) if (proto_flags  IF_PROTOF_##f) { \
 +   proto_flags = ~IF_PROTOF_##f ; \
 +   fprintf(fp, #f %s, proto_flags ? , : ); }
 + _PROTOF(DOWN);
 +#undef _PROTOF
 + if (proto_flags)
 + fprintf(fp, %x, proto_flags);
 + fprintf(fp,  );
 + }
 + }
 +}
 +
  int print_linkinfo(const struct sockaddr_nl *who,
  struct nlmsghdr *n, void *arg)
  {
 @@ -669,6 +693,9 @@ int print_linkinfo(const struct sockaddr_nl *who,
   if (filter.showqueue)
   print_queuelen(fp, tb);

 + if (tb[IFLA_PROTO_FLAGS_INFO])
 + print_proto_flags(fp, tb[IFLA_PROTO_FLAGS_INFO]);
 +
   if (!filter.family || filter.family == AF_PACKET || show_details) {
   SPRINT_BUF(b1);
   fprintf(fp, %s, _SL_);
 diff --git a/ip/iplink.c b/ip/iplink.c
 index a4a4980..29cb8ce 100644
 --- a/ip/iplink.c
 +++ b/ip/iplink.c
 @@ -611,6 +611,24 @@ int iplink_parse(int argc, char **argv, struct 
 iplink_req *req,
   invarg(Invalid \link-netnsid\ value\n, 
 *argv);
   addattr32(req-n, sizeof(*req), IFLA_LINK_NETNSID,
 link_netnsid);
 + } else if (strcmp(*argv, protodown) == 0) {
 + struct rtattr *proto_info;
 + unsigned int proto_flags = 0;
 +
 + NEXT_ARG

Re: [PATCH net-next v4 1/4] net core: Add protodown support.

2015-07-08 Thread Anuradha Karuppiah
On Wed, Jul 8, 2015 at 4:07 PM, Jonathan Toppins
jtopp...@cumulusnetworks.com wrote:
 On 7/8/15 5:04 PM, anurad...@cumulusnetworks.com wrote:

 diff --git a/include/uapi/linux/if.h b/include/uapi/linux/if.h
 index 9cf2394..8d60fe7 100644
 --- a/include/uapi/linux/if.h
 +++ b/include/uapi/linux/if.h
 @@ -156,6 +156,12 @@ enum {
 IF_LINK_MODE_DORMANT,   /* limit upward transition to dormant */
   };

 +/* proto_flags - port state information can be passed to the switch
 driver and
 + * used to determine the phys state of the switch port */
 +enum {
 +   IF_PROTOF_DOWN  = 10  /* set switch port phys state down
 */


 It might be better to use BIT(0) provided by linux/bitopts.h

 +};
 +


Ack.
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH net-next v3 1/4] net core: Add IFF_PROTO_DOWN support.

2015-04-29 Thread Anuradha Karuppiah
On Wed, Apr 29, 2015 at 4:07 PM, Anuradha Karuppiah
anurad...@cumulusnetworks.com wrote:
 On Wed, Apr 29, 2015 at 3:13 PM, Stephen Hemminger
 step...@networkplumber.org wrote:
 On Mon, 27 Apr 2015 10:38:21 -0700
 anurad...@cumulusnetworks.com wrote:

 From: Anuradha Karuppiah anurad...@cumulusnetworks.com

 This patch introduces an IFF_PROTO_DOWN flag that can be used by
 user space applications to notify drivers that errors have been
 detected on the device.

 Signed-off-by: Anuradha Karuppiah anurad...@cumulusnetworks.com
 Signed-off-by: Andy Gospodarek go...@cumulusnetworks.com
 Signed-off-by: Roopa Prabhu ro...@cumulusnetworks.com
 Signed-off-by: Wilson Kok w...@cumulusnetworks.com

 I worry that adding another bit to an already complex state API
 will break userspace.

 There are lots of things besides iproute2 which look at those
 flags including routing daemons (quagga), network manager, netplugd,
 and switch controllers.

 Yes, I understand your concerns here. And tried to work around introducing
 a separate error flag by clearing IFF_UP on proto_down/detecting errors (as
 Scott also brought up earlier).

 That implementation failed because of the following reasons -
 1. There is no way to disambiguate between admin_down (!IFF_UP) and an
 APP/driver enforced error_down (IFF_PROTO_DOWN). Administrator or
 automation-scripts that monitor the config assumed that switch-port
 configuration had somehow fallen out of sync (and attempted to reinstate the
 admin_up repeatedly).

 2. Automatic error recovery was not possible; consider the following scenario
 for e.g.
a. The MLAG peer-link is down so the MLAG app on the secondary switch has
   proto_down’ed all the MLAG ports (including switch-port swp1) by 
 clearing
   IFF_UP.
b. At the same time the administrator is in the process of making some
   changes on the network connected to swp1. To avoid doing it live he 
 would
   admin_disable swp1 (!IFF_UP) by doing an ip link set swp1 down (this
   is a no-op as event #a has already cleared IFF_UP on swp1).
c. If the MLAG peer-link recovers at this point the MLAG app on the
   secondary switch would try to automatically recover the MLAG ports
   by clearing proto_down (i.e. setting IFF_UP); including on swp1. Doing
   that overrides the administrator’s directive to keep swp1 admin_down.
   Overriding an admin-down in a live network can be very dangerous so it
   is not possible to do auto-error-recovery unless we have a way to
   disambiguate between the admin and error states.

I have the need to disambiguate the error state but it doesn't have to be an
IFF_X attribute. Stephen, Do you think it would be more easily consumable if
it were a new/separate net_device attribute instead of being a new bit in
struct net_device flags?
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH net-next v3 0/4] net: Introduce IFF_PROTO_DOWN flag.

2015-04-28 Thread Anuradha Karuppiah
On Tue, Apr 28, 2015 at 12:37 PM, Scott Feldman sfel...@gmail.com wrote:
 On Tue, Apr 28, 2015 at 8:39 AM, Anuradha Karuppiah
 anurad...@cumulusnetworks.com wrote:


 On Mon, Apr 27, 2015 at 10:45 PM, Scott Feldman sfel...@gmail.com wrote:

 On Mon, Apr 27, 2015 at 10:38 AM,  anurad...@cumulusnetworks.com wrote:
  From: Anuradha Karuppiah anurad...@cumulusnetworks.com
 
  User space daemons can detect errors in the network that need to be
  notified to the switch device drivers.
 
  Drivers can react to this error state by doing a phy-down on the
  switch-port which would result in a carrier-off locally and on the
  directly connected switch. Doing that would prevent loops and
  black-holes in the network.

 (Sorry if this was asked earlier)

 Can the application simply send a SETLINK with IFF_UP clear and the
 port driver's ndo_stop would bring the PHY link down?


 Yes, Clearing IFF_UP on detecting errors (PROTO_DOWN) is possible and we
 tried
 that implementation as well. Unfortunately it failed because of the
 following
 reasons -

 1. There is no way to disambiguate between admin_down (!IFF_UP) and an
 APP/driver enforced error_down (IFF_PROTO_DOWN). Administrator or
 automation-scripts that monitor the config assumed that switch-port
 configuration had somehow fallen out of sync (and attempted to reinstate the
 admin_up repeatedly).

 2. Automatic error recovery was not possible; consider the following
 scenario
 for e.g.
a. The MLAG peer-link is down so the MLAG app on the secondary switch has
   proto_down’ed all the MLAG ports (including switch-port swp1) by
 clearing
   IFF_UP.
b. At the same time the administrator is in the process of making some
   changes on the network connected to swp1. To avoid doing it live he
 would
   admin_disable swp1 (!IFF_UP) by doing an ip link set swp1 down (this
   is a no-op as event #a has already cleared IFF_UP on swp1).
c. If the MLAG peer-link recovers at this point the MLAG app on the
   secondary switch would try to automatically recover the MLAG ports
   by clearing proto_down (i.e. setting IFF_UP); including on swp1. Doing
   that overrides the administrator’s directive to keep swp1 admin_down.
   Overriding an admin-down in a live network can be very dangerous so it
   is not possible to do auto-error-recovery unless we have a way to
   disambiguate between the admin and error states

 That makes sense.

 Dang, this is so close to IFF_DORMANT.  The interface can be IFF_UP
 and link mode can be DORMANT.  Can the port driver kill PHY link if
 dev-flagsIFF_DORMANT in ndo_set_rx_mode()?  Would require
 IFF_DORMANT is included in dev-flags in __dev_change_flags().

Yes, IFF_DORMANT does seem close to what is needed; in the current/standard
interpretation IFF_DORMANT keeps the switch port phy-up and running (and most
PDUs are also exchanged in the dormant state). Like you said we could
re-interpret IFF_DORMANT in this context to phy-down the switch-port;
unfortunately we are already using IFF_DORMANT as well (in its standard
interpretation)...

We are using the dormant mode (for the MLAG app itself) to hold the MLAG port
in a brief/transition-ary suspended state when the switch-port link/carrier up
happens. This has been done to co-ordinate states across the MLAG peer switches
and to ensure that egress port block masks are programmed on the peer switch
before transitioning the local switch port to an OPER_UP state. If we didn't do
that the dual-connected server would see duplicate packets every time a
link-down to link-up happened on a MLAG port.

So IFF_DORMANT re-interpretation is not going to be easily possible for the
MLAG use case.
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH net-next v3 3/4] rocker: Handle IFF_PROTODOWN by doing a PHYS-DOWN on the switch port.

2015-04-28 Thread Anuradha Karuppiah
On Mon, Apr 27, 2015 at 10:51 PM, Scott Feldman sfel...@gmail.com wrote:
 On Mon, Apr 27, 2015 at 10:38 AM,  anurad...@cumulusnetworks.com wrote:
 From: Anuradha Karuppiah anurad...@cumulusnetworks.com

 IFF_PROTODOWN can be set by user space applications like MLAG on detecting
 errors on a switch port. This patch provides sample switch driver changes
 for handling IFF_PROTODOWN. Rocker PHYS disables the port in response to
 protodown.

 Note: I understand Scott has some rocker changes on hold. I will re-spin
 this patch once his changes are in.

 Signed-off-by: Anuradha Karuppiah anurad...@cumulusnetworks.com
 Signed-off-by: Andy Gospodarek go...@cumulusnetworks.com
 Signed-off-by: Roopa Prabhu ro...@cumulusnetworks.com
 Signed-off-by: Wilson Kok w...@cumulusnetworks.com
 ---
  drivers/net/ethernet/rocker/rocker.c |   16 +++-
  1 file changed, 15 insertions(+), 1 deletion(-)

 diff --git a/drivers/net/ethernet/rocker/rocker.c 
 b/drivers/net/ethernet/rocker/rocker.c
 index a87b177..e3084e3 100644
 --- a/drivers/net/ethernet/rocker/rocker.c
 +++ b/drivers/net/ethernet/rocker/rocker.c
 @@ -3838,7 +3838,8 @@ static int rocker_port_open(struct net_device *dev)

 napi_enable(rocker_port-napi_tx);
 napi_enable(rocker_port-napi_rx);
 -   rocker_port_set_enable(rocker_port, true);
 +   if (!(dev-flags  IFF_PROTO_DOWN))
 +   rocker_port_set_enable(rocker_port, true);
 netif_start_queue(dev);
 return 0;

 @@ -4238,6 +4239,18 @@ static int rocker_port_swdev_port_stp_update(struct 
 net_device *dev, u8 state)
 return rocker_port_stp_update(rocker_port, state);
  }

 +static int rocker_port_swdev_port_phy_state_set(struct net_device *dev,
 +   bool enable)
 +{
 +   struct rocker_port *rocker_port = netdev_priv(dev);
 +
 +   if (enable  (dev-flags  IFF_UP)  !(dev-flags  
 IFF_PROTO_DOWN))
 +   rocker_port_set_enable(rocker_port, true);
 +   else
 +   rocker_port_set_enable(rocker_port, false);

 This isn't working like your expecting.  PHY link is still up on the
 rocker port, so the other end of the wire still sees link UP even when
 IFF_PROTO_DOWN is set locally.

I wanted to check with you first on the rocker handling; that is the main
reason for posting this an RFC (thanks for the review).

I used rocker_port_set_enable/PORT_PHYS_ENABLE for setting the phy state
because I assumed that in the case of an offloaded switch driver it would be
the equivalent of doing sdk port N en=1|0. Is that not the case? If not,
could you please point me in the right direction?
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH net-next v3 1/4] net core: Add IFF_PROTO_DOWN support.

2015-04-28 Thread Anuradha Karuppiah
On Mon, Apr 27, 2015 at 11:05 PM, Scott Feldman sfel...@gmail.com wrote:
 On Mon, Apr 27, 2015 at 10:38 AM,  anurad...@cumulusnetworks.com wrote:
 From: Anuradha Karuppiah anurad...@cumulusnetworks.com

 This patch introduces an IFF_PROTO_DOWN flag that can be used by
 user space applications to notify drivers that errors have been
 detected on the device.

 Signed-off-by: Anuradha Karuppiah anurad...@cumulusnetworks.com
 Signed-off-by: Andy Gospodarek go...@cumulusnetworks.com
 Signed-off-by: Roopa Prabhu ro...@cumulusnetworks.com
 Signed-off-by: Wilson Kok w...@cumulusnetworks.com
 ---
  include/uapi/linux/if.h |4 
  net/8021q/vlan_dev.c|3 ++-
  net/core/dev.c  |2 +-
  3 files changed, 7 insertions(+), 2 deletions(-)

 diff --git a/include/uapi/linux/if.h b/include/uapi/linux/if.h
 index 9cf2394..e263bd2 100644
 --- a/include/uapi/linux/if.h
 +++ b/include/uapi/linux/if.h
 @@ -66,6 +66,8 @@
   * @IFF_LOWER_UP: driver signals L1 up. Volatile.
   * @IFF_DORMANT: driver signals dormant. Volatile.
   * @IFF_ECHO: echo sent packets. Volatile.
 + * @IFF_PROTO_DOWN: protocol is down on the interface. Can be toggled
 + * through sysfs.

 This comment is stale, right?  I didn't see any sysfs support in the patchset.

Correct, No new sysfs has been added for this flag. But IFF_PROTO_DOWN
can be toggled via the already existing generic flags sysfs (similar
to say IFF_MULTICAST); hence the comment –

root@net-next:~# echo 0x81003  /sys/class/net/eth0/flags

root@net-next:~# cat /sys/class/net/eth0/flags

0x81003
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH net-next v3 3/4] rocker: Handle IFF_PROTODOWN by doing a PHYS-DOWN on the switch port.

2015-04-28 Thread Anuradha Karuppiah
On Tue, Apr 28, 2015 at 12:48 PM, Scott Feldman sfel...@gmail.com wrote:
 On Tue, Apr 28, 2015 at 8:49 AM, Anuradha Karuppiah
 anurad...@cumulusnetworks.com wrote:
 On Mon, Apr 27, 2015 at 10:51 PM, Scott Feldman sfel...@gmail.com wrote:
 On Mon, Apr 27, 2015 at 10:38 AM,  anurad...@cumulusnetworks.com wrote:
 From: Anuradha Karuppiah anurad...@cumulusnetworks.com

 IFF_PROTODOWN can be set by user space applications like MLAG on detecting
 errors on a switch port. This patch provides sample switch driver changes
 for handling IFF_PROTODOWN. Rocker PHYS disables the port in response to
 protodown.

 Note: I understand Scott has some rocker changes on hold. I will re-spin
 this patch once his changes are in.

 Signed-off-by: Anuradha Karuppiah anurad...@cumulusnetworks.com
 Signed-off-by: Andy Gospodarek go...@cumulusnetworks.com
 Signed-off-by: Roopa Prabhu ro...@cumulusnetworks.com
 Signed-off-by: Wilson Kok w...@cumulusnetworks.com
 ---
  drivers/net/ethernet/rocker/rocker.c |   16 +++-
  1 file changed, 15 insertions(+), 1 deletion(-)

 diff --git a/drivers/net/ethernet/rocker/rocker.c 
 b/drivers/net/ethernet/rocker/rocker.c
 index a87b177..e3084e3 100644
 --- a/drivers/net/ethernet/rocker/rocker.c
 +++ b/drivers/net/ethernet/rocker/rocker.c
 @@ -3838,7 +3838,8 @@ static int rocker_port_open(struct net_device *dev)

 napi_enable(rocker_port-napi_tx);
 napi_enable(rocker_port-napi_rx);
 -   rocker_port_set_enable(rocker_port, true);
 +   if (!(dev-flags  IFF_PROTO_DOWN))
 +   rocker_port_set_enable(rocker_port, true);
 netif_start_queue(dev);
 return 0;

 @@ -4238,6 +4239,18 @@ static int rocker_port_swdev_port_stp_update(struct 
 net_device *dev, u8 state)
 return rocker_port_stp_update(rocker_port, state);
  }

 +static int rocker_port_swdev_port_phy_state_set(struct net_device *dev,
 +   bool enable)
 +{
 +   struct rocker_port *rocker_port = netdev_priv(dev);
 +
 +   if (enable  (dev-flags  IFF_UP)  !(dev-flags  
 IFF_PROTO_DOWN))
 +   rocker_port_set_enable(rocker_port, true);
 +   else
 +   rocker_port_set_enable(rocker_port, false);

 This isn't working like your expecting.  PHY link is still up on the
 rocker port, so the other end of the wire still sees link UP even when
 IFF_PROTO_DOWN is set locally.

 I wanted to check with you first on the rocker handling; that is the main
 reason for posting this an RFC (thanks for the review).

 I used rocker_port_set_enable/PORT_PHYS_ENABLE for setting the phy state
 because I assumed that in the case of an offloaded switch driver it would be
 the equivalent of doing sdk port N en=1|0. Is that not the case? If not,
 could you please point me in the right direction?

 To put PHY link down, on PORT_PHYS_ENABLE true, rocker device (qemu)
 would need to down the link state for the port so other end of wire
 sees link down.  I think I remember looking into that but didn't
 implement it.  In qemu monitor, you can manually set the link up/down
 (set_link dev [on|off]).  We'd need to call the same from within the
 rocker device.

 So if we want to be able to test this with rocker, we'll need a device
 and driver change.

Thanks for the details, Scott. We will looking into implementing the necessary
changes to get it working with rocker.
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html