I believe this link describes the exact problem I've been experiencing:
https://bugs.launchpad.net/ubuntu/+source/qemu-kvm/+bug/785668
and although the original post targets KVM, later in the thread it moves
to LXC. This is an old bug report and I'm surprised that this has not
been addressed in recent kernels.
Peter
On 09/10/2015 02:52 PM, Peter Steele wrote:
I've configured a standard CentOS bridge/bond, the exact same setup
that I use for creating VMs. VMs on different hosts communicate
through the bridge without issues. Containers that use the identical
bridge however cannot reliably connect to containers on different
hosts. We've determined that it is some kind of ARP table issue, or at
least appears to be. I have reproduced a scenario where container A on
host X is unable to ping container B on host Y. If I connect to B from
host Y and ping say the gateway of our subnet, suddenly container A on
host X can ping container B. Here's a specific case. Container A on
host X has the IP 172.16.110.106. Its arp table looks like this:
# arp -v -n
Address                  HWtype  HWaddress           Flags Mask            Iface
172.16.110.112           ether   00:16:3e:21:c6:c5   C                     eth0
172.16.110.111           ether   00:16:3e:31:2a:f0   C                     eth0
172.16.110.110           ether   00:16:3e:5c:62:a9   C                     eth0
Entries: 4 Skipped: 0 Found: 4
Container B on host Y has the IP 172.16.110.113. Its arp table looks
like this:
# arp -v -n
Address                  HWtype  HWaddress           Flags Mask            Iface
172.16.110.112           ether   00:16:3e:21:c6:c5   C                     eth0
172.16.110.106           ether   00:16:3e:20:df:87   C                     eth0
Entries: 2 Skipped: 0 Found: 2
If I try to ping 172.16.110.113 (container B) from 172.16.110.106
(container A), I get a "Host Unreachable":
# ping 172.16.110.113
PING 172.16.110.113 (172.16.110.113) 56(84) bytes of data.
From 172.16.110.106 icmp_seq=1 Destination Host Unreachable
From 172.16.110.106 icmp_seq=2 Destination Host Unreachable
From 172.16.110.106 icmp_seq=3 Destination Host Unreachable
From 172.16.110.106 icmp_seq=4 Destination Host Unreachable
From 172.16.110.106 icmp_seq=5 Destination Host Unreachable
...
If while this ping is running I connect to container B and ping the
gateway (172.16.0.1), the ping running on container A suddenly starts
working:
...
From 172.16.110.106 icmp_seq=6 Destination Host Unreachable
From 172.16.110.106 icmp_seq=7 Destination Host Unreachable
From 172.16.110.106 icmp_seq=8 Destination Host Unreachable
64 bytes from 172.16.110.113: icmp_seq=57 ttl=64 time=993 ms
64 bytes from 172.16.110.113: icmp_seq=58 ttl=64 time=0.283 ms
64 bytes from 172.16.110.113: icmp_seq=59 ttl=64 time=0.274 ms
...
The arp table on container A now has an entry for container B:
# arp -v -n
Address                  HWtype  HWaddress           Flags Mask            Iface
172.16.110.112           ether   00:16:3e:21:c6:c5   C                     eth0
172.16.110.111           ether   00:16:3e:31:2a:f0   C                     eth0
172.16.110.113           ether   00:16:3e:65:2a:c5   C                     eth0
172.16.110.110           ether   00:16:3e:5c:62:a9   C                     eth0
Entries: 5 Skipped: 0 Found: 5
The arp table on container B of course now has an entry for the gateway:
# arp -v -n
Address                  HWtype  HWaddress           Flags Mask            Iface
172.16.110.112           ether   00:16:3e:21:c6:c5   C                     eth0
172.16.110.106           ether   00:16:3e:20:df:87   C                     eth0
172.16.0.1               ether   64:12:25:e3:d4:4c   C                     eth0
Entries: 3 Skipped: 0 Found: 3
We've been running VMs (KVM) with this identical bridged/bonded
configuration for years and have never had an issue with a VM on one
host being unable to communicate with a VM on another host. We've never
had to force an arp table update of any kind. Why do containers behave
in this manner?
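(One workaround that has been suggested for this class of problem, though I
haven't confirmed it addresses the bug above, is to have the container
announce itself with a gratuitous ARP after its interface comes up, so peers
on other hosts learn its MAC without waiting for traffic. A sketch, assuming
the eth0 device and the container B address from the examples, and the
iputils arping tool available on CentOS:)

```shell
# Hedged workaround sketch: send a gratuitous ARP from inside the container
# so other hosts' bridges and neighbors learn our MAC immediately.
# IP and DEV are taken from the examples above; adjust per container.
IP=172.16.110.113
DEV=eth0
# -U: send unsolicited (gratuitous) ARP announcements
# -c 3: send three of them
# -I: interface to send on
arping -U -c 3 -I "$DEV" "$IP"
```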
Peter
On 09/10/2015 01:01 PM, Bostjan Skufca wrote:
Hi Peter,
since you mentioned you are using bridged interfaces, is my assumption
correct that your containers' network connections are joined directly
to this bridge, so the containers talk to the world directly (L2) and not
via a routed (L3) network through the host OS?
Did you try a routed setup (using bond0 directly and creating a
dedicated br1 just for containers), taking the bridging functionality
of the Linux kernel out of the picture?
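(For illustration, a minimal sketch of the routed setup described above; the
names br1 and bond0 and the 172.16.110.0/24 subnet are assumptions carried
over from this thread, and the exact commands would need adapting to your
distribution's network scripts:)

```shell
# Routed-setup sketch: bond0 stays a plain host interface (never enslaved
# to a bridge); containers attach to a dedicated bridge br1 and the host
# routes between br1 and bond0.
ip link add br1 type bridge            # container-only bridge
ip addr add 172.16.110.1/24 dev br1    # host-side address, containers' gateway
ip link set br1 up
sysctl -w net.ipv4.ip_forward=1        # enable routing between br1 and bond0
# Each container's veth is then attached to br1, and the container uses
# 172.16.110.1 as its default gateway.
```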
I am very interested in your findings, as I have similar setup planned
for deployment in the coming weeks (vlan trunk -> bond -> bridge ->
containers).
(I had similar issues on bonded interfaces on VMware, where tap-based
OpenVPN server would not work at all. It had to do something with how
vmware handles bonds and this not being compatible with L2 stuff
coming out of VM. The second thing I remember very vaguely is that I
tried using bridged interface inside container too, and it did not
cooperate well with host's bridge, so I stopped using bridges inside
containers altogether.)
b.
_______________________________________________
lxc-users mailing list
lxc-users@lists.linuxcontainers.org
http://lists.linuxcontainers.org/listinfo/lxc-users