Re: Minimum MTU Mess

2016-09-12 Thread Jarod Wilson
On Mon, Sep 12, 2016 at 11:59:41AM +0900, YOSHIFUJI Hideaki wrote:
> 
> 
> Jarod Wilson wrote:
> > On Tue, Sep 06, 2016 at 04:55:29PM -0700, David Miller wrote:
> >> From: Jarod Wilson 
> >> Date: Fri, 2 Sep 2016 13:07:42 -0400
> >>
> >>> In any case, the number of "mtu < 68" and "#define FOO_MIN_MTU 68", or
> >>> variations thereof, under drivers/net/ is kind of crazy.
> >>
> >> Agreed, we can have a default and let the different cases provide
> >> overrides.
> >>
> >> Mostly what to do here is a function of the hardware though.
> > 
> > So I've been tinkering with this some, and it looks like having both
> > centralized min and max checking could be useful here. I'm hacking away at
> > drivers now, but the basis of all this would potentially look about like
> > the patch below, and each device would have to set dev->m{in,ax}_mtu one
> > way or another. Drivers using alloc_etherdev and/or ether_setup would get
> > the "default" values, and then they can be overridden. Probably need
> > something to make sure dev->max_mtu isn't set to 0 though...
> > 
> > Possibly on the right track here, or might there be a better way to
> > approach this?
> > 
> > diff --git a/include/uapi/linux/if_ether.h b/include/uapi/linux/if_ether.h
> > index 117d02e..864d6f2 100644
> > --- a/include/uapi/linux/if_ether.h
> > +++ b/include/uapi/linux/if_ether.h
> > @@ -35,6 +35,8 @@
> >  #define ETH_FRAME_LEN  1514/* Max. octets in frame sans 
> > FCS */
> >  #define ETH_FCS_LEN4   /* Octets in the FCS
> >  */
> >  
> > +#define ETH_MIN_MTU68  /* Min IPv4 MTU per RFC791  
> > */
> > +
> >  /*
> >   * These are the defined Ethernet Protocol ID's.
> >   */
> 
> Why don't we disable IPv4 if the MTU is lower than this value
> as we do for IPv6?

What will you be left with that is actually usable? Quite a few NIC
drivers already enforce this as a minimum MTU, and for drivers that really
want to allow less, they just set min_mtu to whatever they like. I'm
actually aiming to be 100% functionally identical wrt all existing minimum
mtu checks already in existence, just trying to improve how they're done.

-- 
Jarod Wilson
ja...@redhat.com



Re: Minimum MTU Mess

2016-09-12 Thread Jarod Wilson
On Mon, Sep 12, 2016 at 04:41:40AM +0200, Andrew Lunn wrote:
> > Actually breaking this up into easily digestable/mergeable chunks is going
> > to be kind of entertaining... Suggestions welcomed on that. First up is
> > obviously the core change, which touches just net/ethernet/eth.c,
> > net/core/dev.c, include/linux/netdevice.h and
> > include/uapi/linux/if_ether.h, and should let existing code continue to
> > Just Work(tm), though devices using ether_setup() that had no MTU range
> > checking (or one or the other missing) will wind up with new bounds.
> 
> Hi Jarod
> 
> Did you find any drivers which support jumbo packets, but don't have
> checks? These drivers, if there are any, need handling first, before
> this core change is made. Otherwise you introduce regressions.

Surprisingly, very few. There was dvb_net using eth_change_mtu and it's
max of 1500 while setting it's initial mtu to 4096, and I swear there was
at least one or two drivers that had no upper bounds checking at all that
I set max_mtu to IP_MAX_MTU (65535), but it's possible I missed something.

At the moment, things are roughly chopped up into:

1) core change
2) deprecate eth_change_mtu and remove in-kernel users
3) set m{in,ax}_mtu in ethernet drivers
4) set m{in,ax}_mtu in wireless drivers
5) set m{in,ax}_mtu in wan drivers
6) set m{in,ax}_mtu in usb ethernet drivers
7) set m{in,ax}_mtu in net infra
8) set m{in,ax}_mtu in virt drivers
9) set m{in,ax}_mtu in misc drivers

The ethernet drivers one is by far the largest, and was thinking I'd start
breaking that up next.

-- 
Jarod Wilson
ja...@redhat.com



Re: Minimum MTU Mess

2016-09-11 Thread YOSHIFUJI Hideaki


Jarod Wilson wrote:
> On Tue, Sep 06, 2016 at 04:55:29PM -0700, David Miller wrote:
>> From: Jarod Wilson 
>> Date: Fri, 2 Sep 2016 13:07:42 -0400
>>
>>> In any case, the number of "mtu < 68" and "#define FOO_MIN_MTU 68", or
>>> variations thereof, under drivers/net/ is kind of crazy.
>>
>> Agreed, we can have a default and let the different cases provide
>> overrides.
>>
>> Mostly what to do here is a function of the hardware though.
> 
> So I've been tinkering with this some, and it looks like having both
> centralized min and max checking could be useful here. I'm hacking away at
> drivers now, but the basis of all this would potentially look about like
> the patch below, and each device would have to set dev->m{in,ax}_mtu one
> way or another. Drivers using alloc_etherdev and/or ether_setup would get
> the "default" values, and then they can be overridden. Probably need
> something to make sure dev->max_mtu isn't set to 0 though...
> 
> Possibly on the right track here, or might there be a better way to
> approach this?
> 
> diff --git a/include/uapi/linux/if_ether.h b/include/uapi/linux/if_ether.h
> index 117d02e..864d6f2 100644
> --- a/include/uapi/linux/if_ether.h
> +++ b/include/uapi/linux/if_ether.h
> @@ -35,6 +35,8 @@
>  #define ETH_FRAME_LEN1514/* Max. octets in frame sans 
> FCS */
>  #define ETH_FCS_LEN  4   /* Octets in the FCS */
>  
> +#define ETH_MIN_MTU  68  /* Min IPv4 MTU per RFC791  */
> +
>  /*
>   *   These are the defined Ethernet Protocol ID's.
>   */

Why don't we disable IPv4 if the MTU is lower than this value
as we do for IPv6?

-- 
Hideaki Yoshifuji 
Technical Division, MIRACLE LINUX CORPORATION


Re: Minimum MTU Mess

2016-09-11 Thread Andrew Lunn
> Actually breaking this up into easily digestable/mergeable chunks is going
> to be kind of entertaining... Suggestions welcomed on that. First up is
> obviously the core change, which touches just net/ethernet/eth.c,
> net/core/dev.c, include/linux/netdevice.h and
> include/uapi/linux/if_ether.h, and should let existing code continue to
> Just Work(tm), though devices using ether_setup() that had no MTU range
> checking (or one or the other missing) will wind up with new bounds.

Hi Jarod

Did you find any drivers which support jumbo packets, but don't have
checks? These drivers, if there are any, need handling first, before
this core change is made. Otherwise you introduce regressions.

 Andrew


Re: Minimum MTU Mess

2016-09-09 Thread Jarod Wilson
On Thu, Sep 08, 2016 at 03:24:13AM +0200, Andrew Lunn wrote:
> > This is definitely going to require a few passes... (Working my way
> > through every driver with an ndo_change_mtu wired up right now to
> > see just how crazy this might get).
> 
> It might be something Coccinelle can help you with. Try describing the
> transformation you want to do, to their mailing list, and they might
> come up with a script for you.

>From looking everything over, I'd be very surprised if they could. The
places where things need changing vary quite wildly by driver, but I've
actually got a full set of compiling changes with a cumulative diffstat
of:

 153 files changed, 599 insertions(+), 1002 deletions(-)

Actually breaking this up into easily digestable/mergeable chunks is going
to be kind of entertaining... Suggestions welcomed on that. First up is
obviously the core change, which touches just net/ethernet/eth.c,
net/core/dev.c, include/linux/netdevice.h and
include/uapi/linux/if_ether.h, and should let existing code continue to
Just Work(tm), though devices using ether_setup() that had no MTU range
checking (or one or the other missing) will wind up with new bounds.

For the most part, after the initial patch, very few of the others
would have any direct interaction with any others, so they could all
be singletons, or small batches per-vendor, or whatever.

Full diffstat for the aid of discussion on how to break it up:

 drivers/char/pcmcia/synclink_cs.c  |  1 -
 drivers/firewire/net.c | 14 ++---
 drivers/infiniband/hw/nes/nes.c|  1 -
 drivers/infiniband/hw/nes/nes.h|  4 +-
 drivers/infiniband/hw/nes/nes_nic.c|  7 +--
 drivers/misc/sgi-xp/xpnet.c| 21 ++--
 drivers/net/ethernet/agere/et131x.c|  7 +--
 drivers/net/ethernet/altera/altera_tse.h   |  1 -
 drivers/net/ethernet/altera/altera_tse_main.c  | 12 ++---
 drivers/net/ethernet/amd/amd8111e.c|  5 +-
 drivers/net/ethernet/atheros/alx/hw.h  |  1 -
 drivers/net/ethernet/atheros/alx/main.c|  9 +---
 drivers/net/ethernet/atheros/atl1c/atl1c_main.c| 41 +-
 drivers/net/ethernet/atheros/atl1e/atl1e_main.c| 11 ++--
 drivers/net/ethernet/atheros/atlx/atl1.c   | 15 +++---
 drivers/net/ethernet/atheros/atlx/atl2.c   | 14 +++--
 drivers/net/ethernet/broadcom/b44.c|  5 +-
 drivers/net/ethernet/broadcom/bcm63xx_enet.c   | 30 +++
 drivers/net/ethernet/broadcom/bnx2.c   |  8 ++-
 drivers/net/ethernet/broadcom/bnx2.h   |  6 +--
 drivers/net/ethernet/broadcom/bnx2x/bnx2x.h|  2 +-
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c|  8 +--
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c   | 22 +++-
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c   |  4 ++
 drivers/net/ethernet/broadcom/bnxt/bnxt.c  |  7 +--
 drivers/net/ethernet/broadcom/tg3.c|  7 +--
 drivers/net/ethernet/brocade/bna/bnad.c|  7 +--
 drivers/net/ethernet/cadence/macb.c| 17 +++---
 drivers/net/ethernet/calxeda/xgmac.c   | 18 ++-
 drivers/net/ethernet/cavium/liquidio/lio_main.c| 15 ++
 .../net/ethernet/cavium/liquidio/octeon_network.h  |  2 +-
 drivers/net/ethernet/cavium/octeon/octeon_mgmt.c   |  5 +-
 drivers/net/ethernet/cavium/thunder/nicvf_main.c   | 10 ++--
 drivers/net/ethernet/chelsio/cxgb/cxgb2.c  |  2 -
 drivers/net/ethernet/cisco/enic/enic_main.c|  7 +--
 drivers/net/ethernet/cisco/enic/enic_res.h |  2 +-
 drivers/net/ethernet/dlink/dl2k.c  | 22 ++--
 drivers/net/ethernet/dlink/sundance.c  |  6 ++-
 drivers/net/ethernet/freescale/gianfar.c   |  9 ++--
 drivers/net/ethernet/hisilicon/hns/hns_enet.c  |  4 --
 drivers/net/ethernet/ibm/ehea/ehea_main.c  | 13 ++---
 drivers/net/ethernet/ibm/emac/core.c   |  7 +--
 drivers/net/ethernet/intel/e100.c  |  9 
 drivers/net/ethernet/intel/e1000/e1000_main.c  | 12 ++---
 drivers/net/ethernet/intel/e1000e/netdev.c | 14 +++--
 drivers/net/ethernet/intel/fm10k/fm10k_netdev.c| 15 ++
 drivers/net/ethernet/intel/i40e/i40e_main.c| 10 ++--
 drivers/net/ethernet/intel/i40evf/i40evf_main.c|  8 +--
 drivers/net/ethernet/intel/igb/e1000_defines.h |  3 +-
 drivers/net/ethernet/intel/igb/igb_main.c  | 16 ++
 drivers/net/ethernet/intel/igbvf/defines.h |  3 +-
 drivers/net/ethernet/intel/igbvf/netdev.c  | 14 ++---
 drivers/net/ethernet/intel/ixgb/ixgb_main.c| 16 ++
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c  | 11 ++--
 drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c  | 33 ++--
 drivers/net/ethernet/marvell/mvneta.c  | 36 -
 drivers/net/ethernet/marvell/mvpp2.c   | 36 -
 

Re: Minimum MTU Mess

2016-09-07 Thread Andrew Lunn
> This is definitely going to require a few passes... (Working my way
> through every driver with an ndo_change_mtu wired up right now to
> see just how crazy this might get).

It might be something Coccinelle can help you with. Try describing the
transformation you want to do, to their mailing list, and they might
come up with a script for you.

 Andrew


Re: Minimum MTU Mess

2016-09-07 Thread Jarod Wilson
On Wed, Sep 07, 2016 at 01:35:35PM -0700, Stephen Hemminger wrote:
> On Wed, 7 Sep 2016 15:53:56 -0400
> Jarod Wilson  wrote:
> 
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -6466,9 +6466,17 @@ int dev_set_mtu(struct net_device *dev, int new_mtu)
> > if (new_mtu == dev->mtu)
> > return 0;
> >  
> > -   /*  MTU must be positive.*/
> > -   if (new_mtu < 0)
> > +   if (new_mtu < dev->min_mtu) {
> > +   netdev_err(dev, "Invalid MTU %d requested, hw min %d\n",
> > +  new_mtu, dev->min_mtu);
> > return -EINVAL;
> > +   }
> > +
> > +   if (new_mtu > dev->max_mtu) {
> > +   netdev_err(dev, "Invalid MTU %d requested, hw max %d\n",
> > +  new_mtu, dev->min_mtu);
> > +   return -EINVAL;
> > +   }
> >  
> 
> Maybe don't log something that can be triggered from a user program.
> Or at least rate limit it.

Yeah, I was a little bit on the fence on whether to log anything, make it
netdev_err, netdev_dbg, or what. Quite a few drivers have a netdev_err for
failed MTU changes, while others also have netdev_info spew for successful
MTU changes. Maybe a rate-limited netdev_err is the way to go here.

-- 
Jarod Wilson
ja...@redhat.com



Re: Minimum MTU Mess

2016-09-07 Thread Jarod Wilson
On Wed, Sep 07, 2016 at 10:31:12PM +0200, Andrew Lunn wrote:
> Hi Jarod
> 
> > -   /*  MTU must be positive.*/
> > -   if (new_mtu < 0)
> > +   if (new_mtu < dev->min_mtu) {
> > +   netdev_err(dev, "Invalid MTU %d requested, hw min %d\n",
> > +  new_mtu, dev->min_mtu);
> > return -EINVAL;
> > +   }
> > +
> > +   if (new_mtu > dev->max_mtu) {
> > +   netdev_err(dev, "Invalid MTU %d requested, hw max %d\n",
> > +  new_mtu, dev->min_mtu);
> > +   return -EINVAL;
> > +   }
> 
> I doubt you can make such a big change like this in one go. Can you
> really guarantee all interfaces, of what ever type, will have some
> value for dev->min_mtu and dev->max_mtu? What may fly is something
> more like:
> 
> > +   if (dev->max_mtu && new_mtu > dev->max_mtu) {
> > +   netdev_err(dev, "Invalid MTU %d requested, hw max %d\n",
> > +  new_mtu, dev->min_mtu);
> > +   return -EINVAL;
> > +   }
> 
> Maybe in a few cycles you can add a WARN_ON(!dev->max_mtu), and a few
> cycles after that go with (new_mtu > dev->max_mtu).

My local tree actually has if (dev->max_mtu > 0 && new_mtu > dev->max_mtu)
since just after I'd sent my mail for exactly that reason, though looking
at alloc_netdev_mqs(), it does seem we're at least guaranteed the value
will be 0 if not otherwise initialized, so your version looks perfectly
fine, and in the min_mtu case, without any additional handling, things
behave exactly as they did before. This is definitely going to require a
few passes... (Working my way through every driver with an ndo_change_mtu
wired up right now to see just how crazy this might get).

-- 
Jarod Wilson
ja...@redhat.com



Re: Minimum MTU Mess

2016-09-07 Thread Stephen Hemminger
On Wed, 7 Sep 2016 15:53:56 -0400
Jarod Wilson  wrote:

> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -6466,9 +6466,17 @@ int dev_set_mtu(struct net_device *dev, int new_mtu)
>   if (new_mtu == dev->mtu)
>   return 0;
>  
> - /*  MTU must be positive.*/
> - if (new_mtu < 0)
> + if (new_mtu < dev->min_mtu) {
> + netdev_err(dev, "Invalid MTU %d requested, hw min %d\n",
> +new_mtu, dev->min_mtu);
>   return -EINVAL;
> + }
> +
> + if (new_mtu > dev->max_mtu) {
> + netdev_err(dev, "Invalid MTU %d requested, hw max %d\n",
> +new_mtu, dev->min_mtu);
> + return -EINVAL;
> + }
>  

Maybe don't log something that can be triggered from a user program.
Or at least rate limit it.


Re: Minimum MTU Mess

2016-09-07 Thread Andrew Lunn
Hi Jarod

> - /*  MTU must be positive.*/
> - if (new_mtu < 0)
> + if (new_mtu < dev->min_mtu) {
> + netdev_err(dev, "Invalid MTU %d requested, hw min %d\n",
> +new_mtu, dev->min_mtu);
>   return -EINVAL;
> + }
> +
> + if (new_mtu > dev->max_mtu) {
> + netdev_err(dev, "Invalid MTU %d requested, hw max %d\n",
> +new_mtu, dev->min_mtu);
> + return -EINVAL;
> + }

I doubt you can make such a big change like this in one go. Can you
really guarantee all interfaces, of what ever type, will have some
value for dev->min_mtu and dev->max_mtu? What may fly is something
more like:

> + if (dev->max_mtu && new_mtu > dev->max_mtu) {
> + netdev_err(dev, "Invalid MTU %d requested, hw max %d\n",
> +new_mtu, dev->min_mtu);
> + return -EINVAL;
> + }

Maybe in a few cycles you can add a WARN_ON(!dev->max_mtu), and a few
cycles after that go with (new_mtu > dev->max_mtu).

  Andrew


Re: Minimum MTU Mess

2016-09-07 Thread Jarod Wilson
On Tue, Sep 06, 2016 at 04:55:29PM -0700, David Miller wrote:
> From: Jarod Wilson 
> Date: Fri, 2 Sep 2016 13:07:42 -0400
> 
> > In any case, the number of "mtu < 68" and "#define FOO_MIN_MTU 68", or
> > variations thereof, under drivers/net/ is kind of crazy.
> 
> Agreed, we can have a default and let the different cases provide
> overrides.
> 
> Mostly what to do here is a function of the hardware though.

So I've been tinkering with this some, and it looks like having both
centralized min and max checking could be useful here. I'm hacking away at
drivers now, but the basis of all this would potentially look about like
the patch below, and each device would have to set dev->m{in,ax}_mtu one
way or another. Drivers using alloc_etherdev and/or ether_setup would get
the "default" values, and then they can be overridden. Probably need
something to make sure dev->max_mtu isn't set to 0 though...

Possibly on the right track here, or might there be a better way to
approach this?

diff --git a/include/uapi/linux/if_ether.h b/include/uapi/linux/if_ether.h
index 117d02e..864d6f2 100644
--- a/include/uapi/linux/if_ether.h
+++ b/include/uapi/linux/if_ether.h
@@ -35,6 +35,8 @@
 #define ETH_FRAME_LEN  1514/* Max. octets in frame sans FCS */
 #define ETH_FCS_LEN4   /* Octets in the FCS */
 
+#define ETH_MIN_MTU68  /* Min IPv4 MTU per RFC791  */
+
 /*
  * These are the defined Ethernet Protocol ID's.
  */
diff --git a/net/core/dev.c b/net/core/dev.c
index dd6ce59..d20fd5d 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6466,9 +6466,17 @@ int dev_set_mtu(struct net_device *dev, int new_mtu)
if (new_mtu == dev->mtu)
return 0;
 
-   /*  MTU must be positive.*/
-   if (new_mtu < 0)
+   if (new_mtu < dev->min_mtu) {
+   netdev_err(dev, "Invalid MTU %d requested, hw min %d\n",
+  new_mtu, dev->min_mtu);
return -EINVAL;
+   }
+
+   if (new_mtu > dev->max_mtu) {
+   netdev_err(dev, "Invalid MTU %d requested, hw max %d\n",
+  new_mtu, dev->min_mtu);
+   return -EINVAL;
+   }
 
if (!netif_device_present(dev))
return -ENODEV;
diff --git a/net/ethernet/eth.c b/net/ethernet/eth.c
index 66dff5e..2be0203 100644
--- a/net/ethernet/eth.c
+++ b/net/ethernet/eth.c
@@ -322,8 +322,6 @@ EXPORT_SYMBOL(eth_mac_addr);
  */
 int eth_change_mtu(struct net_device *dev, int new_mtu)
 {
-   if (new_mtu < 68 || new_mtu > ETH_DATA_LEN)
-   return -EINVAL;
dev->mtu = new_mtu;
return 0;
 }
@@ -357,6 +355,8 @@ void ether_setup(struct net_device *dev)
dev->type   = ARPHRD_ETHER;
dev->hard_header_len= ETH_HLEN;
dev->mtu= ETH_DATA_LEN;
+   dev->min_mtu= ETH_MIN_MTU;
+   dev->max_mtu= ETH_DATA_LEN;
dev->addr_len   = ETH_ALEN;
dev->tx_queue_len   = 1000; /* Ethernet wants good queues */
dev->flags  = IFF_BROADCAST|IFF_MULTICAST;

-- 
Jarod Wilson
ja...@redhat.com



Re: Minimum MTU Mess

2016-09-06 Thread David Miller
From: Jarod Wilson 
Date: Fri, 2 Sep 2016 13:07:42 -0400

> In any case, the number of "mtu < 68" and "#define FOO_MIN_MTU 68", or
> variations thereof, under drivers/net/ is kind of crazy.

Agreed, we can have a default and let the different cases provide
overrides.

Mostly what to do here is a function of the hardware though.


Minimum MTU Mess

2016-09-02 Thread Jarod Wilson
So... I had a bug reported, about a NIC that ceased to work, if it's MTU
was set to 0, then back to it's original value (1500). This got me
thinking... What does an MTU of 0 even mean? Why should it be allowed?

As it turns out, most (but not all) network drivers have a check in their
ndo_change_mtu function that returns -EINVAL if new_mtu < $magic_number,
which is often 68 (per page 25 of RFC 791), but is sometimes 64, or 60, or
46... Sometimes other manipulations are done. But the short version is
that it seems there's an almost universal need to check for a minimum MTU.
There's just some disagreement on what that minimum is.

So, rather than having nearly every ndo_change_mut callback do the exact
same thing, would it make sense to settle on one minimum MTU value, and
check that unilaterally (at least for certain netdev types) in
net/core/dev.c's dev_set_mtu()? Or is intentionally left vague, because
it's really up to the hardware to care?

Alternatively, perhaps each driver should set a netdev->min_mtu value, and
net/core/dev.c could check against that. Could even have a centralized
IP_MIN_MTU of 68 that all devices using ether_setup() and/or
alloc_etherdev() used by default.

In any case, the number of "mtu < 68" and "#define FOO_MIN_MTU 68", or
variations thereof, under drivers/net/ is kind of crazy.

In the interim, I think I'll just do a 3-line patch for this one driver
that mirrors all the existing drivers to keep it from going splat...

-- 
Jarod Wilson
ja...@redhat.com