Re: [PATCH net-next 6/6] netdevsim: Add simple FIB resource controller via devlink

2018-04-09 Thread Jiri Pirko
Fri, Apr 06, 2018 at 11:22:29PM CEST, d...@cumulusnetworks.com wrote:
>On 4/5/18 11:52 PM, Jiri Pirko wrote:
>> Thu, Apr 05, 2018 at 11:06:41PM CEST, d...@cumulusnetworks.com wrote:
>>> On 4/5/18 2:10 PM, David Ahern wrote:

>>>> The ASIC here is the kernel tables in a namespace. It does not make
>>>> sense to have 2 devlink instances for a single namespace.
>>>
>>> I put this example controller in netdevsim per a suggestion from Ido.
>>> The netdevsim seemed like a good idea given that modules intention --
>>> testing network facilities. Perhaps I should have done this as a
>>> completely standalone module ...
>>>
>>> The intention is to treat the kernel's tables *per namespace* as a
>>> standalone entity that can be managed very similar to ASIC resources.
>> 
>> So you say you want to treat a namespace as an ASIC? That sounds very
>> odd to me :/
>
>Why? The kernel has forwarding tables, acl's, etc just like the ASIC,
>and each namespace is a separate set of tables.

I don't get it. What's the point? For HW, the reason is it has limited
resources and those resources are not mapped 1:1 with kernel object.
However, for kernel, that is meaningless.


>
>If you think about it, userspace "programs" the kernel just like mlxsw
>and userspace SDKs "program" an asic.

I don't give a  about sdks. I have no clue why you mention that here.


>
>
>>> Given that I can add a resource controller module
>>> (drivers/net/kern_res_mgr.c?) that creates a 'struct device' per network
>>> namespace with a devlink instance. In this case the device would very
>>> much be tied to the namespace 1:1.
>> 
>> That sounds more reasonable and accurate, yet still odd. You would not
>> have any netdevices there? Any ports?
>> 
>
>Sure, what ever ports are assigned to or created in the namespace.
>
>Nothing about the devlink API says it has to be a real h/w device.

Sure, it could represent something made-up, like netdevsim. However I
see a big misfit when you want to represent a namespace.


>Nothing about the devlink API says it can only be used for real h/w that
>has ports represented by netdevices that the devlink instance some how
>has "control" over.
>
>As the netdevsim demo shows, I can build an L3 resource controller for
>the kernel tables using just the devlink API and the in-kernel notifiers.


Re: [PATCH net-next 6/6] netdevsim: Add simple FIB resource controller via devlink

2018-04-06 Thread David Ahern
On 4/5/18 11:52 PM, Jiri Pirko wrote:
> Thu, Apr 05, 2018 at 11:06:41PM CEST, d...@cumulusnetworks.com wrote:
>> On 4/5/18 2:10 PM, David Ahern wrote:
>>>
>>> The ASIC here is the kernel tables in a namespace. It does not make
>>> sense to have 2 devlink instances for a single namespace.
>>
>> I put this example controller in netdevsim per a suggestion from Ido.
>> The netdevsim seemed like a good idea given that modules intention --
>> testing network facilities. Perhaps I should have done this as a
>> completely standalone module ...
>>
>> The intention is to treat the kernel's tables *per namespace* as a
>> standalone entity that can be managed very similar to ASIC resources.
> 
> So you say you want to treat a namespace as an ASIC? That sounds very
> odd to me :/

Why? The kernel has forwarding tables, acl's, etc just like the ASIC,
and each namespace is a separate set of tables.

If you think about it, userspace "programs" the kernel just like mlxsw
and userspace SDKs "program" an asic.


>> Given that I can add a resource controller module
>> (drivers/net/kern_res_mgr.c?) that creates a 'struct device' per network
>> namespace with a devlink instance. In this case the device would very
>> much be tied to the namespace 1:1.
> 
> That sounds more reasonable and accurate, yet still odd. You would not
> have any netdevices there? Any ports?
> 

Sure, what ever ports are assigned to or created in the namespace.

Nothing about the devlink API says it has to be a real h/w device.
Nothing about the devlink API says it can only be used for real h/w that
has ports represented by netdevices that the devlink instance some how
has "control" over.

As the netdevsim demo shows, I can build an L3 resource controller for
the kernel tables using just the devlink API and the in-kernel notifiers.
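
The counter-based controller described above can be sketched in user space. This is a minimal model only, assuming that an "add" notification may be vetoed by the listener; the names res_counter, res_counter_add, res_counter_del and RES_UNLIMITED are hypothetical and are not devlink or netdevsim API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space model; not kernel, devlink or netdevsim API. */
#define RES_UNLIMITED UINT64_MAX

struct res_counter {
	uint64_t max;	/* configured limit, RES_UNLIMITED if none */
	uint64_t num;	/* current occupancy */
};

/* Called for an "entry added" event; returns false to veto the add. */
bool res_counter_add(struct res_counter *rc)
{
	if (rc->num >= rc->max)
		return false;	/* limit reached, reject the new entry */
	rc->num++;
	return true;
}

/* Called for an "entry deleted" event. */
void res_counter_del(struct res_counter *rc)
{
	assert(rc->num > 0);	/* a delete must match an earlier add */
	rc->num--;
}
```

In this model one counter would exist per resource path (for example one for /IPv4/fib and one for /IPv4/fib-rules), and the occupancy reported to the user is simply the num field.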


Re: [PATCH net-next 6/6] netdevsim: Add simple FIB resource controller via devlink

2018-04-05 Thread Jiri Pirko
Thu, Apr 05, 2018 at 11:06:41PM CEST, d...@cumulusnetworks.com wrote:
>On 4/5/18 2:10 PM, David Ahern wrote:
>> 
>> The ASIC here is the kernel tables in a namespace. It does not make
>> sense to have 2 devlink instances for a single namespace.
>
>I put this example controller in netdevsim per a suggestion from Ido.
>The netdevsim seemed like a good idea given that modules intention --
>testing network facilities. Perhaps I should have done this as a
>completely standalone module ...
>
>The intention is to treat the kernel's tables *per namespace* as a
>standalone entity that can be managed very similar to ASIC resources.

So you say you want to treat a namespace as an ASIC? That sounds very
odd to me :/


>Given that I can add a resource controller module
>(drivers/net/kern_res_mgr.c?) that creates a 'struct device' per network
>namespace with a devlink instance. In this case the device would very
>much be tied to the namespace 1:1.

That sounds more reasonable and accurate, yet still odd. You would not
have any netdevices there? Any ports?


Re: [PATCH net-next 6/6] netdevsim: Add simple FIB resource controller via devlink

2018-04-05 Thread Jiri Pirko
Thu, Apr 05, 2018 at 10:10:29PM CEST, d...@cumulusnetworks.com wrote:
>On 4/5/18 11:27 AM, Jiri Pirko wrote:
>> Wed, Mar 28, 2018 at 03:22:00AM CEST, d...@cumulusnetworks.com wrote:
>>> Add devlink support to netdevsim and use it to implement a simple,
>>> profile based resource controller. Only one controller is needed
>>> per namespace, so the first netdevsim netdevice in a namespace
>>> registers with devlink. If that device is deleted, the resource
>>> settings are deleted.
>> 
>> I don't understand why you add 1:1 fixed relationship between
>> netnamespace and devlink instance. That is highly misleading and reader
>> might think that those 2 are somehow related. They are not. You can have
>> multiple devlink instances for many ports in a single namespace.
>
>The netdevsim devlink instance is an example of limiting the number of
>FIB entries and FIB rules for a namespace. It is currently limited to
>the init_net based on past discussion.
>
>It does not make sense to have multiple resource controllers for the
>same network namespace, hence the limit of only registering with devlink
>on the first device create.

Devlink instance represents an ASIC. 1:1. There is no relation with
network namespaces and should not be. I have no clue why you think so.

The model looks as I described it down below in the picture.


>
>> 
>> Could you please clarify?
>> 
>> Also, to see the relationship between individual netdevsim netdevices
>> and the parent devlink instance, we should use devlink_port
>> instances, like this: 
>> 
>>        devlink1                  devlink2
>>         |    |                    |    |
>> dl_port1_1    dlport1_2   dlport2_1    dlport2_2
>>     |             |           |            |
>>    eth0         eth1         eth2         eth3
>> 
>> Note that "devlink instance" represents one ASIC.
>> The address of the devlink instance is the bus address of the ASIC.
>> Here, you use address of some/first netdevsim netdev instance.
>
>The ASIC here is the kernel tables in a namespace. It does not make
>sense to have 2 devlink instances for a single namespace.

Again. No clue why you build relationship with namespace.


>
>> 
>> The way it is implemented in netdevsim by this patch is wrong on
>> so many levels :(
>> 
>> Could you please fix this? I'm more than happy to help you with this,
>> please say so. Thanks!
>
>What is there to fix?
>
>Not creating a netdevsim device per netdevsim netdevice? That is
>completely unrelated to the devlink change.

To fit the model. Multiple devlink instances, each representing one
"virtual" ASIC, devlink_port instances, 1 for each netdevsim port.
Netdevsim port should simulate real devices. No real device should have
1:1 relation with network namespace. That is just simply wrong.
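
The model argued for here (one devlink instance per ASIC, one devlink_port per netdevice) can be pictured with a small user-space sketch. The struct and function names below (dl_instance, dl_port, dl_find_parent) are hypothetical and exist only for illustration; they are not the devlink API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_PORTS 4

/* Hypothetical model of one devlink_port: it only records its netdevice. */
struct dl_port {
	const char *netdev;
};

/* Hypothetical model of one devlink instance, i.e. one (virtual) ASIC. */
struct dl_instance {
	const char *bus_addr;	/* made-up bus address of the ASIC */
	struct dl_port ports[MAX_PORTS];
	size_t n_ports;
};

/* Return the instance that owns the named netdevice, or NULL if none does. */
const struct dl_instance *dl_find_parent(const struct dl_instance *insts,
					 size_t n, const char *netdev)
{
	assert(netdev != NULL);

	for (size_t i = 0; i < n; i++)
		for (size_t p = 0; p < insts[i].n_ports; p++)
			if (strcmp(insts[i].ports[p].netdev, netdev) == 0)
				return &insts[i];
	return NULL;
}
```

Note that nothing in this model refers to a network namespace: two instances with two ports each can coexist, which is the layout drawn in the picture quoted above.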


Re: [PATCH net-next 6/6] netdevsim: Add simple FIB resource controller via devlink

2018-04-05 Thread David Ahern
On 4/5/18 2:10 PM, David Ahern wrote:
> 
> The ASIC here is the kernel tables in a namespace. It does not make
> sense to have 2 devlink instances for a single namespace.

I put this example controller in netdevsim per a suggestion from Ido.
The netdevsim seemed like a good idea given that modules intention --
testing network facilities. Perhaps I should have done this as a
completely standalone module ...

The intention is to treat the kernel's tables *per namespace* as a
standalone entity that can be managed very similar to ASIC resources.
Given that I can add a resource controller module
(drivers/net/kern_res_mgr.c?) that creates a 'struct device' per network
namespace with a devlink instance. In this case the device would very
much be tied to the namespace 1:1.


Re: [PATCH net-next 6/6] netdevsim: Add simple FIB resource controller via devlink

2018-04-05 Thread David Ahern
On 4/5/18 11:27 AM, Jiri Pirko wrote:
> Wed, Mar 28, 2018 at 03:22:00AM CEST, d...@cumulusnetworks.com wrote:
>> Add devlink support to netdevsim and use it to implement a simple,
>> profile based resource controller. Only one controller is needed
>> per namespace, so the first netdevsim netdevice in a namespace
>> registers with devlink. If that device is deleted, the resource
>> settings are deleted.
> 
> I don't understand why you add 1:1 fixed relationship between
> netnamespace and devlink instance. That is highly misleading and reader
> might think that those 2 are somehow related. They are not. You can have
> multiple devlink instances for many ports in a single namespace.

The netdevsim devlink instance is an example of limiting the number of
FIB entries and FIB rules for a namespace. It is currently limited to
the init_net based on past discussion.

It does not make sense to have multiple resource controllers for the
same network namespace, hence the limit of only registering with devlink
on the first device create.

> 
> Could you please clarify?
> 
> Also, to see the relationship between individual netdevsim netdevices
> and the parent devlink instance, we should use devlink_port
> instances, like this: 
> 
>        devlink1                  devlink2
>         |    |                    |    |
> dl_port1_1    dlport1_2   dlport2_1    dlport2_2
>     |             |           |            |
>    eth0         eth1         eth2         eth3
> 
> Note that "devlink instance" represents one ASIC.
> The address of the devlink instance is the bus address of the ASIC.
> Here, you use address of some/first netdevsim netdev instance.

The ASIC here is the kernel tables in a namespace. It does not make
sense to have 2 devlink instances for a single namespace.

> 
> The way it is implemented in netdevsim by this patch is wrong on
> so many levels :(
> 
> Could you please fix this? I'm more than happy to help you with this,
> please say so. Thanks!

What is there to fix?

Not creating a netdevsim device per netdevsim netdevice? That is
completely unrelated to the devlink change.


Re: [PATCH net-next 6/6] netdevsim: Add simple FIB resource controller via devlink

2018-04-05 Thread Jiri Pirko
Wed, Mar 28, 2018 at 03:22:00AM CEST, d...@cumulusnetworks.com wrote:
>Add devlink support to netdevsim and use it to implement a simple,
>profile based resource controller. Only one controller is needed
>per namespace, so the first netdevsim netdevice in a namespace
>registers with devlink. If that device is deleted, the resource
>settings are deleted.

I don't understand why you add 1:1 fixed relationship between
netnamespace and devlink instance. That is highly misleading and reader
might think that those 2 are somehow related. They are not. You can have
multiple devlink instances for many ports in a single namespace.

Could you please clarify?

Also, to see the relationship between individual netdevsim netdevices
and the parent devlink instance, we should use devlink_port
instances, like this: 

        devlink1                  devlink2
         |    |                    |    |
 dl_port1_1    dlport1_2   dlport2_1    dlport2_2
     |             |           |            |
    eth0         eth1         eth2         eth3

Note that "devlink instance" represents one ASIC.
The address of the devlink instance is the bus address of the ASIC.
Here, you use address of some/first netdevsim netdev instance.

The way it is implemented in netdevsim by this patch is wrong on
so many levels :(

Could you please fix this? I'm more than happy to help you with this,
please say so. Thanks!


[...]

>+	err = devlink_resource_register(devlink, "IPv4", (u64)-1,
>+					NSIM_RESOURCE_IPV4,
>+					DEVLINK_RESOURCE_ID_PARENT_TOP,
>+					&params, NULL);
>+	if (err) {
>+		pr_err("Failed to register IPv4 top resource\n");
>+		goto out;


this goto is pointless. Just return.


Re: [PATCH net-next 6/6] netdevsim: Add simple FIB resource controller via devlink

2018-03-29 Thread David Ahern
On 3/29/18 12:11 PM, David Miller wrote:
> From: Jakub Kicinski 
> Date: Tue, 27 Mar 2018 18:34:50 -0700
> 
>> On Tue, 27 Mar 2018 18:22:00 -0700, David Ahern wrote:
>>> +void nsim_devlink_setup(struct netdevsim *ns)
>>> +{
>  ...
>> nit: DaveM expressed preference to not have silent failures in a
>>  discussion about DebugFS, not sure it applies here, but why not
>>  handle errors?
> 
> Yes it is a concern.
> 
> David please address this as a follow-up.

Will do.


Re: [PATCH net-next 6/6] netdevsim: Add simple FIB resource controller via devlink

2018-03-29 Thread David Miller
From: Jakub Kicinski 
Date: Tue, 27 Mar 2018 18:34:50 -0700

> On Tue, 27 Mar 2018 18:22:00 -0700, David Ahern wrote:
>> +void nsim_devlink_setup(struct netdevsim *ns)
>> +{
 ...
> nit: DaveM expressed preference to not have silent failures in a
>  discussion about DebugFS, not sure it applies here, but why not
>  handle errors?

Yes it is a concern.

David please address this as a follow-up.

Thanks.


Re: [PATCH net-next 6/6] netdevsim: Add simple FIB resource controller via devlink

2018-03-27 Thread Jakub Kicinski
On Tue, 27 Mar 2018 18:22:00 -0700, David Ahern wrote:
> +void nsim_devlink_setup(struct netdevsim *ns)
> +{
> +	struct net *net = nsim_to_net(ns);
> +	bool *reg_devlink = net_generic(net, nsim_devlink_id);
> +	struct devlink *devlink;
> +	int err = -ENOMEM;
> +
> +	/* only one device per namespace controls devlink */
> +	if (!*reg_devlink) {
> +		ns->devlink = NULL;
> +		return;
> +	}
> +
> +	devlink = devlink_alloc(&nsim_devlink_ops, 0);
> +	if (!devlink)
> +		return;
> +
> +	err = devlink_register(devlink, &ns->dev);
> +	if (err)
> +		goto err_devlink_free;
> +
> +	err = devlink_resources_register(devlink);
> +	if (err)
> +		goto err_dl_unregister;
> +
> +	ns->devlink = devlink;
> +
> +	*reg_devlink = false;
> +
> +	return;
> +
> +err_dl_unregister:
> +	devlink_unregister(devlink);
> +err_devlink_free:
> +	devlink_free(devlink);
> +}

nit: DaveM expressed preference to not have silent failures in a
 discussion about DebugFS, not sure it applies here, but why not
 handle errors?
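
One way to avoid the silent failure raised here is to have the setup routine return an error code and let the caller decide what to do. The following is only a user-space sketch of that pattern; fake_devlink, fake_dev, fake_register and fake_devlink_setup are hypothetical stand-ins and not the functions from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical user-space stand-ins; not the devlink API. */
struct fake_devlink {
	int registered;
};

struct fake_dev {
	struct fake_devlink *devlink;
};

/* Knob that lets a caller simulate a registration failure. */
int fake_register_should_fail;

int fake_register(struct fake_devlink *dl)
{
	if (fake_register_should_fail)
		return -EINVAL;
	dl->registered = 1;
	return 0;
}

/* Returns 0 on success or a negative errno, instead of returning void. */
int fake_devlink_setup(struct fake_dev *dev)
{
	struct fake_devlink *dl;
	int err;

	assert(dev != NULL);

	dl = calloc(1, sizeof(*dl));
	if (!dl)
		return -ENOMEM;	/* nothing acquired yet, return directly */

	err = fake_register(dl);
	if (err)
		goto err_free;	/* undo the allocation before reporting */

	dev->devlink = dl;
	return 0;

err_free:
	free(dl);
	dev->devlink = NULL;
	return err;
}
```

A caller would then check the return value and fail device creation (or at least log the error) rather than continuing with no resource controller registered.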


[PATCH net-next 6/6] netdevsim: Add simple FIB resource controller via devlink

2018-03-27 Thread David Ahern
Add devlink support to netdevsim and use it to implement a simple,
profile based resource controller. Only one controller is needed
per namespace, so the first netdevsim netdevice in a namespace
registers with devlink. If that device is deleted, the resource
settings are deleted.

The resource controller allows a user to limit the number of IPv4 and
IPv6 FIB entries and FIB rules. The resource paths are:
/IPv4
/IPv4/fib
/IPv4/fib-rules
/IPv6
/IPv6/fib
/IPv6/fib-rules

The IPv4 and IPv6 top level resources are unlimited in size and can not
be changed. From there, the number of FIB entries and FIB rule entries
are unlimited by default. A user can specify a limit for the fib and
fib-rules resources:

$ devlink resource set netdevsim/netdevsim0 path /IPv4/fib size 96
$ devlink resource set netdevsim/netdevsim0 path /IPv4/fib-rules size 16
$ devlink resource set netdevsim/netdevsim0 path /IPv6/fib size 64
$ devlink resource set netdevsim/netdevsim0 path /IPv6/fib-rules size 16
$ devlink dev reload netdevsim/netdevsim0

such that the number of rules or routes is limited (96 ipv4 routes in the
example above):
$ for n in $(seq 1 32); do ip ro add 10.99.$n.0/24 dev eth1; done
Error: netdevsim: Exceeded number of supported fib entries.

$ devlink resource show netdevsim/netdevsim0
netdevsim/netdevsim0:
  name IPv4 size unlimited unit entry size_min 0 size_max unlimited size_gran 1 dpipe_tables non
    resources:
      name fib size 96 occ 96 unit entry size_min 0 size_max unlimited size_gran 1 dpipe_tables
...

With this template in place for resource management, it is fairly trivial
to extend and shows one way to implement a simple counter based resource
controller typical of network profiles.

Currently, devlink only supports initial namespace. Code is in place to
adapt netdevsim to a per namespace controller once the network namespace
issues are resolved.

Signed-off-by: David Ahern 
---
 drivers/net/Kconfig   |   1 +
 drivers/net/netdevsim/Makefile|   4 +
 drivers/net/netdevsim/devlink.c   | 294 ++
 drivers/net/netdevsim/fib.c   | 263 ++
 drivers/net/netdevsim/netdev.c|  12 +-
 drivers/net/netdevsim/netdevsim.h |  43 ++
 6 files changed, 616 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/netdevsim/devlink.c
 create mode 100644 drivers/net/netdevsim/fib.c

diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index 08b85215c2be..891846655000 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -500,6 +500,7 @@ source "drivers/net/hyperv/Kconfig"
 config NETDEVSIM
tristate "Simulated networking device"
depends on DEBUG_FS
+   depends on MAY_USE_DEVLINK
help
  This driver is a developer testing tool and software model that can
  be used to test various control path networking APIs, especially
diff --git a/drivers/net/netdevsim/Makefile b/drivers/net/netdevsim/Makefile
index 09388c06171d..449b2a1a1800 100644
--- a/drivers/net/netdevsim/Makefile
+++ b/drivers/net/netdevsim/Makefile
@@ -9,3 +9,7 @@ ifeq ($(CONFIG_BPF_SYSCALL),y)
 netdevsim-objs += \
bpf.o
 endif
+
+ifneq ($(CONFIG_NET_DEVLINK),)
+netdevsim-objs += devlink.o fib.o
+endif
diff --git a/drivers/net/netdevsim/devlink.c b/drivers/net/netdevsim/devlink.c
new file mode 100644
index ..bbdcf064ba10
--- /dev/null
+++ b/drivers/net/netdevsim/devlink.c
@@ -0,0 +1,294 @@
+/*
+ * Copyright (c) 2018 Cumulus Networks. All rights reserved.
+ * Copyright (c) 2018 David Ahern 
+ *
+ * This software is licensed under the GNU General License Version 2,
+ * June 1991 as shown in the file COPYING in the top-level directory of this
+ * source tree.
+ *
+ * THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS"
+ * WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING,
+ * BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE
+ * OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME
+ * THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
+ */
+
+#include 
+#include 
+#include 
+
+#include "netdevsim.h"
+
+static unsigned int nsim_devlink_id;
+
+/* place holder until devlink and namespaces is sorted out */
+static struct net *nsim_devlink_net(struct devlink *devlink)
+{
+   return &init_net;
+}
+
+/* IPv4
+ */
+static u64 nsim_ipv4_fib_resource_occ_get(struct devlink *devlink)
+{
+   struct net *net = nsim_devlink_net(devlink);
+
+   return nsim_fib_get_val(net, NSIM_RESOURCE_IPV4_FIB, false);
+}
+
+static struct devlink_resource_ops nsim_ipv4_fib_res_ops = {
+   .occ_get = nsim_ipv4_fib_resource_occ_get,
+};
+
+static u64 nsim_ipv4_fib_rules_res_occ_get(struct devlink *devlink)
+{
+   struct net *net = nsim_devlink_net(devlink);
+
+   retur