On Tue, Jan 9, 2018 at 11:21 PM, <liuq...@huawei.com> wrote: > From: Keefe Liu <liuq...@huawei.com> > > The MTU of ipvlan interface should not bigger than the phy device, When we > run following scripts, we will find there are some problems. > Step1: > ip link add link eth0 name ipv1 type ipvlan mode l2 > ip netns add net1 > ip link set dev ipv1 netns net1 > Step2: > ip netns exec net1 ip link set dev ipv1 mtu 1501 > RTNETLINK answers: Invalid argument > dmesg info: "ipv1: Invalid MTU 1501 requested, hw max 1500" > Step3: > ip link set dev eth0 mtu 1600 > ip netns exec net1 ip link set dev ipv1 mtu 1501 > RTNETLINK answers: Invalid argument > dmesg info: "ipv1: Invalid MTU 1501 requested, hw max 1500" > Step4: > ip link set dev eth0 mtu 1400 > ip netns exec net1 ip link set dev ipv1 mtu 1500 > The result of Step2 is we expected, but the result of Step3 and Step4 > are not. > > This patch set ipvlan's MTU ranges from ETH_MIN_MTU to phy device's maximum > MTU. When we create a new ipvlan device, its MTU will be set to the same > with phy device if we not specify the mtu value. When we change the ipvlan > device's MTU, ipvlan_change_mtu() will make sure the new MTU no bigger than > the phy device's MTU. > > When we change the phy device's MTU, ipvlan devices attached to this > phy device will check their MTU value. If ipvlan's MTU bigger than phy's, > ipvlan's MTU will be set to the same with phy device, otherwise ipvlan's > MTU remains unchanged. > > Signed-off-by: Keefe Liu <liuq...@huawei.com> > --- > drivers/net/ipvlan/ipvlan_main.c | 36 ++++++++++++++++++++++++++++-------- > 1 file changed, 28 insertions(+), 8 deletions(-) > > diff --git a/drivers/net/ipvlan/ipvlan_main.c > b/drivers/net/ipvlan/ipvlan_main.c > index 30cb803..2ba071a 100644 > --- a/drivers/net/ipvlan/ipvlan_main.c > +++ b/drivers/net/ipvlan/ipvlan_main.c > @@ -34,11 +34,6 @@ struct ipvlan_netns { > .l3mdev_l3_rcv = ipvlan_l3_rcv, > }; > > -static void ipvlan_adjust_mtu(struct ipvl_dev *ipvlan, struct net_device > *dev) > -{ > - ipvlan->dev->mtu = dev->mtu; > -} > - > static int ipvlan_register_nf_hook(struct net *net) > { > struct ipvlan_netns *vnet = net_generic(net, ipvlan_netid); > @@ -380,12 +375,24 @@ static int ipvlan_get_iflink(const struct net_device > *dev) > return ipvlan->phy_dev->ifindex; > } > > +static int ipvlan_change_mtu(struct net_device *dev, int new_mtu) > +{ > + struct ipvl_dev *ipvlan = netdev_priv(dev); > + > + if (ipvlan->phy_dev->mtu < new_mtu) > + return -EINVAL; > + > + dev->mtu = new_mtu; > + return 0; > +} > + > static const struct net_device_ops ipvlan_netdev_ops = { > .ndo_init = ipvlan_init, > .ndo_uninit = ipvlan_uninit, > .ndo_open = ipvlan_open, > .ndo_stop = ipvlan_stop, > .ndo_start_xmit = ipvlan_start_xmit, > + .ndo_change_mtu = ipvlan_change_mtu, > .ndo_fix_features = ipvlan_fix_features, > .ndo_change_rx_flags = ipvlan_change_rx_flags, > .ndo_set_rx_mode = ipvlan_set_multicast_mac_filter, > @@ -581,10 +588,18 @@ int ipvlan_link_new(struct net *src_net, struct > net_device *dev, > } > } > > + if (!tb[IFLA_MTU]) > + dev->mtu = phy_dev->mtu; > + else if (dev->mtu > phy_dev->mtu) > + return -EINVAL; > + > + /* MTU range: 68 - phy_dev->max_mtu */ > + dev->min_mtu = ETH_MIN_MTU; > + dev->max_mtu = phy_dev->max_mtu; > + > ipvlan->phy_dev = phy_dev; > ipvlan->dev = dev; > ipvlan->sfeatures = IPVLAN_FEATURES; > - ipvlan_adjust_mtu(ipvlan, phy_dev); > INIT_LIST_HEAD(&ipvlan->addrs); > > /* TODO Probably put random address here to be presented to the > @@ -680,6 +695,8 @@ void ipvlan_link_setup(struct net_device *dev) > { > ether_setup(dev); > > + dev->min_mtu = 0; > + dev->max_mtu = ETH_MAX_MTU; > dev->priv_flags &= ~(IFF_XMIT_DST_RELEASE | IFF_TX_SKB_SHARING); > dev->priv_flags |= IFF_UNICAST_FLT | IFF_NO_QUEUE; > dev->netdev_ops = &ipvlan_netdev_ops; > @@ -775,8 +792,11 @@ static int ipvlan_device_event(struct notifier_block > *unused, > break; > > case NETDEV_CHANGEMTU: > - list_for_each_entry(ipvlan, &port->ipvlans, pnode) > - ipvlan_adjust_mtu(ipvlan, dev); > + list_for_each_entry(ipvlan, &port->ipvlans, pnode) { > + if (ipvlan->dev->mtu <= dev->mtu) > + continue; > + dev_set_mtu(ipvlan->dev, dev->mtu); > + } Please look at the suggestion that I had given you in the earlier thread. This approach will cause regression. Currently if the master has MTU set to 1500, so all it's slave will get 1500 mtu. If master changes from 1500 -> 1600, all slaves used to get the same. But in the above patch if master changes from 1500 -> 1600, all slaves stay at 1500 and another operation need to be performed per slave to change mtu. Your approach probably works when master is changing from 1600 -> 1500.
Also remember when master changes its mtu, you have to reset max_mtu for slaves with the new value. I still prefer the approach I had mentioned that uses 'mtu_adj'. In that approach you can leave those slaves which have changed their mtu to be lower than masters' but if master's mtu changes to larger value all other slaves will get updated mtu leaving behind the slaves who have opted to change their mtu on their own. Also the same thing is true when mtu get reduced at master. > break; > > case NETDEV_CHANGEADDR: > -- > 1.8.3.1 > >