Hi all,

We have a very disturbing issue on a few of our production servers:
VMs get disconnected from their network.
* The VMs run on an oVirt host, so each VM is attached to an L2 bridge
  through a tap device.
* The bridge has an IP address on it and is connected via a bond.

Issue:
--------
When the VM pings a host outside (8.8.8.8):
* ARP who-has requests are sent to the bridge and forwarded, but the
  bridge does not forward the reply (is-at) back to the tap device
  (see the tcpdump output in [1]).

Two more interesting facts:
---------------------------
* Pinging the bridge IP directly succeeds.
* The host is a Cisco UCS host.

<Versions>:
[root@ucs1-b200-2 ~]# uname -r
2.6.32-573.7.1.el6.x86_64
[root@ucs1-b200-2 ~]# rpm -q libvirt
libvirt-0.10.2-54.el6.x86_64


<Network configuration on host>
[root@ucs1-b200-2 ~]# ip l
2: eth0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP qlen 1000
    link/ether 00:25:b5:0a:00:09 brd ff:ff:ff:ff:ff:ff
3: eth1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP qlen 1000
    link/ether 00:25:b5:0a:00:09 brd ff:ff:ff:ff:ff:ff
4: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
    link/ether 00:25:b5:0a:00:09 brd ff:ff:ff:ff:ff:ff
5: rhevm: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
    link/ether 00:25:b5:0a:00:09 brd ff:ff:ff:ff:ff:ff
11: vnet0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 500
    link/ether fe:1a:4a:23:12:a0 brd ff:ff:ff:ff:ff:ff
************
[root@ucs1-b200-2 ~]# ip -4 a
5: rhevm: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
    inet 10.35.19.149/22 brd 10.35.19.255 scope global rhevm
************
[root@ucs1-b200-2 ~]# brctl show
bridge name     bridge id               STP enabled     interfaces
rhevm           8000.0025b50a0009       no              bond0
                                                        vnet0
*************
[root@ucs1-b200-2 ~]# brctl showmacs rhevm | grep fe:1a:4a:23:12:a0
  2    fe:1a:4a:23:12:a0    yes           0.00

[1] tcpdump on the host
[root@ucs1-b200-2 ~]# tcpdump -n -i vnet0 "(host 10.35.16.244) and (icmp or arp)"
tcpdump: WARNING: vnet0: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on vnet0, link-type EN10MB (Ethernet), capture size 65535 bytes
11:12:11.943033 ARP, Request who-has 10.35.19.120 tell 10.35.16.244, length 28
11:12:11.943065 ARP, Request who-has 10.35.19.120 tell 10.35.16.244, length 42
11:12:12.942992 ARP, Request who-has 10.35.19.120 tell 10.35.16.244, length 28
11:12:12.943022 ARP, Request who-has 10.35.19.120 tell 10.35.16.244, length 42
11:12:13.057004 ARP, Request who-has 10.35.19.254 tell 10.35.16.244, length 28
11:12:13.057037 ARP, Request who-has 10.35.19.254 tell 10.35.16.244, length 42
11:12:13.943049 ARP, Request who-has 10.35.19.120 tell 10.35.16.244, length 28
11:12:13.943080 ARP, Request who-has 10.35.19.120 tell 10.35.16.244, length 42
11:12:14.057043 ARP, Request who-has 10.35.19.254 tell 10.35.16.244, length 28
........
[root@ucs1-b200-2 ~]# tcpdump -n -i rhevm "(host 10.35.16.244) and (icmp or arp)"
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on rhevm, link-type EN10MB (Ethernet), capture size 65535 bytes
11:12:50.072067 ARP, Request who-has 10.35.19.254 tell 10.35.16.244, length 28
11:12:50.072094 ARP, Request who-has 10.35.19.254 tell 10.35.16.244, length 42
11:12:50.072495 ARP, Reply 10.35.19.254 is-at 00:00:0c:07:ac:00, length 46
11:12:50.535085 ARP, Request who-has 10.35.19.120 tell 10.35.16.244, length 28
11:12:50.535106 ARP, Request who-has 10.35.19.120 tell 10.35.16.244, length 42
11:12:50.535372 ARP, Reply 10.35.19.120 is-at 00:1a:4a:23:13:cb, length 42
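
For anyone who wants to compare the two captures without reading every
line, here is a minimal sketch that counts ARP requests and replies in
tcpdump's text output. It was not used to produce the output above;
arp_summary is a hypothetical helper name, and it only assumes the
"ARP, Request" / "ARP, Reply" line format shown in [1].

```shell
# Count ARP requests and replies in tcpdump text output read from stdin.
# arp_summary is a hypothetical helper, not part of tcpdump or bridge-utils.
arp_summary() {
    awk '
        / ARP, Request / { req++ }   # who-has lines
        / ARP, Reply /   { rep++ }   # is-at lines
        END { printf "requests=%d replies=%d\n", req, rep }
    '
}

# Example invocation (assumes root and the interface names from this host):
#   tcpdump -n -l -c 20 -i vnet0 arp | arp_summary
#   tcpdump -n -l -c 20 -i rhevm arp | arp_summary
```

Fed the lines shown in [1], this would report requests=9 replies=0 for
vnet0 and requests=4 replies=2 for rhevm, which is the asymmetry
described above.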


-- 
Thanks,
Ido Barkan
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
