Re: [openstack-dev] [neutron][ovs] The way we deal with MTU

Ihar Hrachyshka Wed, 15 Jun 2016 08:40:21 -0700

First, some context: Eugene and I talked it through on IRC, and he reported
that he cannot reproduce the issue on his setup, an Ubuntu hypervisor with
ovs 2.4:


http://eavesdrop.openstack.org/irclogs/%23openstack-neutron/%23openstack-neutron.2016-06-13.log.html#t2016-06-13T19:45:22

So I went and did some testing with the functional test I have implemented. I 
validated the following setups:

- ubuntu 14.04 + ovs 2.0.x
- centos 7 + ovs 2.4
- centos 7 + ovs 2.5

All of them fail the test. I also pushed the test without the fix into the
gate, and it failed there too:

https://review.openstack.org/329558

So we definitely have some sort of issue that is independent of the underlying
distribution and of the Open vSwitch version.
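
To make the symptom concrete, here is a toy model of the behaviour described
further down in this thread: a port plugged into a bridge cannot end up with
an MTU higher than the lowest MTU among the other ports. This is only an
illustration under that assumption. ToyBridge and the port names are
hypothetical; it is not Open vSwitch or kernel code, and it is not the
functional test from the review.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ToyBridge:
    """Toy model of a bridge that curtails port MTUs (illustration only)."""

    ports: Dict[str, int] = field(default_factory=dict)

    def plug(self, name: str, mtu: int) -> None:
        """Plug a port with its initial MTU into the bridge."""
        self.ports[name] = mtu

    def bridge_mtu(self) -> int:
        """The bridge inherits the lowest MTU among its ports."""
        return min(self.ports.values())

    def set_mtu(self, name: str, requested: int) -> int:
        """Try to set the MTU of a plugged port; return the value that sticks."""
        others = [mtu for port, mtu in self.ports.items() if port != name]
        # Assumption: the other ports put a ceiling on what can be set.
        ceiling = min(others) if others else requested
        self.ports[name] = min(requested, ceiling)
        return self.ports[name]


bridge = ToyBridge()
bridge.plug("tap-net-a", 1500)   # port on a network with MTU 1500
bridge.plug("qr-net-b", 1500)    # router port, created with a default MTU
effective = bridge.set_mtu("qr-net-b", 9000)  # network B wants jumbo frames
print(effective)
```

In this model the request for 9000 is silently curtailed, which is the kind of
outcome a test can catch by reading the MTU back after setting it.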

With that, I believe we should go forward with the fix as a short-term
solution: https://review.openstack.org/327651 (I removed WIP from it.)
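
The ordering that the fix relies on can be sketched as follows: move the
device into its namespace first, and only then set the MTU from inside that
namespace. This is a minimal sketch, not the code from the review. The helper
name mtu_commands and the device and namespace names are hypothetical; the
helper only builds the ip commands and does not run them (running them needs
root).

```python
from typing import List


def mtu_commands(device: str, namespace: str, mtu: int) -> List[List[str]]:
    """Return iproute2 argv lists: move the device first, then set its MTU."""
    if mtu <= 0:
        raise ValueError("mtu must be positive")
    return [
        # Step 1: move the device into the target network namespace.
        ["ip", "link", "set", device, "netns", namespace],
        # Step 2: set the MTU from inside that namespace.
        ["ip", "netns", "exec", namespace,
         "ip", "link", "set", "dev", device, "mtu", str(mtu)],
    ]


commands = mtu_commands("qr-example", "qrouter-example", 9000)
for argv in commands:
    print(" ".join(argv))
```

Note that, per the Open vSwitch feedback quoted below, this ordering works
only because ovs-vswitchd does not see the device once it is in another
namespace, so it should be read as a stopgap.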

I will also reach out to the ovs developers on the matter to see if they can
somehow allow us to disable the MTU curtailing and still stay supported.

Ihar

> On 13 Jun 2016, at 19:43, Eugene Nikanorov <enikano...@mirantis.com> wrote:
> 
> That's interesting.
> 
> 
> In our deployments we do something like br-ex (linux bridge, mtu 9000) - 
> OVSIntPort (mtu 65000) - br-floating (ovs bridge, mtu 1500) - br-int (ovs 
> bridge, mtu 1500).
> The qg ports are then created in br-int, traffic goes all the way through,
> and altogether that allows jumbo frames over the external network.
> 
> For that reason I thought that the MTU inside OVS doesn't really matter.
> This, however, is for ovs 2.4.1.
> 
> I wonder if that behavior has changed and if a description of it is
> available anywhere.
> 
> Thanks,
> Eugene.
> 
> On Mon, Jun 13, 2016 at 9:49 AM, Ihar Hrachyshka <ihrac...@redhat.com> wrote:
> Hi all,
> 
> In Mitaka, we introduced a bunch of changes to the way we handle MTU in
> Neutron/Nova, making sure that the whole instance data path (from the
> instance internal interface, through the hybrid bridge, into br-int) as well
> as the router data path (qr) have the proper MTU value set on all
> participating devices. On the hypervisor side, both Nova and Neutron take
> part in it, setting the MTU with the ip-link tool based on what the Neutron
> plugin calculates for us. So far so good.
> 
> It turns out that for OVS, this does not work as expected with regard to
> br-int. A bug was reported recently: https://launchpad.net/bugs/1590397
> 
> Briefly, when we try to set the MTU on a device that is plugged into a
> bridge, and the bridge already has another port with a lower MTU, the bridge
> itself inherits the MTU from that latter port, and the Linux kernel (?) does
> not allow setting the MTU on the first device at all, making ip link calls
> ineffective.
> 
> AFAIU this behaviour is consistent with Linux bridging rules: you can’t have
> ports with different MTUs plugged into the same bridge.
> 
> Now, that’s a huge problem for Neutron, because we plug ports that belong to 
> different networks (and that hence may have different MTUs) into the same 
> br-int bridge.
> 
> So I played with the code locally a bit and spotted that currently, we set
> the MTU for router ports before we move their devices into router
> namespaces. And once the device is in a namespace, ip-link actually works.
> So I wrote a fix with a functional test that proves the point:
> https://review.openstack.org/#/c/327651/ The fix was validated by the
> reporter of the original bug and seems to fix the issue for him.
> 
> It’s suspicious that it works from inside a namespace but not when the device 
> is still in the root namespace. So I reached out to Jiri Benc from our local 
> Open vSwitch team, and here is a quote:
> 
> ===
> 
> "It's a bug in ovs-vswitchd. It doesn't see the interface that's in
> other netns and thus cannot enforce the correct MTU.
> 
> We'll hopefully fix it and disallow incorrect MTU setting even across
> namespaces. However, it requires significant effort and rework of ovs
> name space handling.
> 
> You should not depend on the current buggy behavior. Don't set MTU of
> the internal interfaces higher than the rest of the bridge, it's not
> supported. Hacking this around by moving the interface to a netns is
> exploiting of a bug.
> 
> We can certainly discuss whether this limitation could be relaxed.
> Honestly, I don't know, it's for a discussion upstream. But as of now,
> it's not supported and you should not do it."
> 
> So basically, as long as we try to plug ports with different MTUs into the
> same bridge, we are relying on a bug in Open vSwitch that may break us at
> any time.
> 
> I guess our alternatives are:
> - either redesign bridge setup for openvswitch to e.g. maintain a bridge per 
> network;
> - or talk to ovs folks on whether they may support that for us.
> 
> I understand the former option is too scary. It raises lots of questions,
> including upgrade impact, since it will obviously introduce dataplane
> downtime. That would be a huge shift in paradigm, probably too huge to
> swallow. The latter option may not fly with vswitch folks. Any better ideas?
> 
> It’s also not clear whether we want to proceed with my immediate fix. Advice
> is welcome.
> 
> Thanks,
> Ihar
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 


