[Bug 1153364] Re: trouble with guest network connectivity when host is using a bonded interface
apport information

** Description changed:

i'm seeing poor/intermittent/degrading network connectivity for guests when the host is using a bonded interface.

in a nutshell, the network configuration is as follows: the physical interfaces [eth0 and eth1] are bonded together as bond0 [i've tried various bond modes - see below]. a bridge interface [br0] is configured with bond0 attached to it. all guests use br0 as their "forward" interface. my tests have generally included a single host with two guests running on it. both guests are running ubuntu 12.10.

it depends slightly on the particulars of the configuration, but the most prevalent symptom is that a newly booted guest will at first respond to pings [with little to no loss], and the guest will be able to ping other hosts on the network, but as time passes, more and more packets are dropped. eventually, virtually all ping requests go unanswered. in some cases, it appears that restarting networking on the guest will fix this, partially and temporarily. the guest will begin to reply 4-5 packets after restarting networking, but does not respond consistently, eventually failing again as before. i've also noticed that in some cases where ping against the guest has not yet begun to fail, pinging something else on the network from the guest causes the pings against the guest to abruptly fail.

i know this is all quite abstract - i've spent quite a bit of time trying to isolate various variables, and while i've made some progress, i think some guidance would be helpful. what i have noticed specifically is that if i attach a physical device [e.g. eth0 or eth1] to the bridge [instead of bond0], things seem to work ok. also, if i use active-backup as the bonding mode, things seem to work ok. i was initially using balance-alb as the bonding mode, and have also tested balance-rr. both exhibit the above symptoms. i've also tried various network card models for the guests [realtek, e1000, and virtio].
this has not had any impact on the symptoms. lastly, the two guests have been able to ping each other with no issues, regardless of the various network settings. at the moment, i have switched back to active-backup, so this is reflected in the below information.

here is a bit of configuration info:

host os/package info:

>lsb_release -rd
Description:	Ubuntu 12.10
Release:	12.10

>apt-cache policy qemu-kvm
qemu-kvm:
  Installed: 1.2.0+noroms-0ubuntu2.12.10.3
  Candidate: 1.2.0+noroms-0ubuntu2.12.10.3
  Version table:
 *** 1.2.0+noroms-0ubuntu2.12.10.3 0
        500 http://us.archive.ubuntu.com/ubuntu/ quantal-updates/main amd64 Packages
        100 /var/lib/dpkg/status
     1.2.0+noroms-0ubuntu2.12.10.2 0
        500 http://security.ubuntu.com/ubuntu/ quantal-security/main amd64 Packages
     1.2.0+noroms-0ubuntu2 0
        500 http://us.archive.ubuntu.com/ubuntu/ quantal/main amd64 Packages

>dpkg -l | grep -i virt
ii  libvirt-bin     0.9.13-0ubuntu12.2             amd64  programs for the libvirt library
ii  libvirt0        0.9.13-0ubuntu12.2             amd64  library for interfacing with different virtualization systems
ii  python-libvirt  0.9.13-0ubuntu12.2             amd64  libvirt Python bindings
ii  qemu-kvm        1.2.0+noroms-0ubuntu2.12.10.3  amd64  Full virtualization on supported hardware
ii  virtinst        0.600.2-1ubuntu1               all    Programs to create and clone virtual machines

>dpkg -l | grep -i qemu
ii  qemu-common  1.2.0+noroms-0ubuntu2.12.10.3  all    qemu common functionality (bios, documentation, etc)
ii  qemu-kvm     1.2.0+noroms-0ubuntu2.12.10.3  amd64  Full virtualization on supported hardware
ii  qemu-utils   1.2.0+noroms-0ubuntu2.12.10.3  amd64  qemu utilities
ii  vgabios      0.7a-3ubuntu2                  all    VGA BIOS software for the Bochs and Qemu emulated VGA card

host network config:

>egrep -v '(^[[:space:]]*#|^[[:space:]]*$)' /etc/network/interfaces
auto lo
iface lo inet loopback
auto eth0
iface eth0 inet manual
    bond-master bond0
auto eth1
iface eth1 inet manual
    bond-master bond0
auto bond0
iface bond0 inet manual
    bond-mode active-backup
    bond-slaves eth0 eth1
    bond-primary eth0
    bond-primary_reselect better
auto br0
iface br0 inet static
    bridge_ports bond0
    bridge_stp off
    bridge_waitport 0
    bridge_maxwait 0
    bridge_maxage 0
    bridge_fd 0
    bridge_ageing 0
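As a side note, checksum/segmentation offload on the slave NICs is a common suspect in bond+bridge setups like this one. A hypothetical variant of the eth0/eth1 stanzas that disables offloads at ifup - the ethtool flags here are illustrative assumptions, not something tested in this report:

```
# hypothetical additions to /etc/network/interfaces (not from the report):
# disable tx-checksum and segmentation offload on the bond slaves, to rule
# out offload interactions between bonding, the bridge, and virtio-net.
auto eth0
iface eth0 inet manual
    bond-master bond0
    post-up ethtool -K eth0 tx off gso off tso off || true
auto eth1
iface eth1 inet manual
    bond-master bond0
    post-up ethtool -K eth1 tx off gso off tso off || true
```

The `|| true` keeps ifup from failing on NICs whose driver does not support toggling a given offload.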
[Bug 1153364] Re: trouble with guest network connectivity when host is using a bonded interface
Thanks. Would you mind adding the kernel debug data by doing:

apport-collect 1153364

--
You received this bug notification because you are a member of Ubuntu Server Team, which is subscribed to qemu-kvm in Ubuntu.
https://bugs.launchpad.net/bugs/1153364

Title:
  trouble with guest network connectivity when host is using a bonded interface

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1153364/+subscriptions
--
Ubuntu-server-bugs mailing list
Ubuntu-server-bugs@lists.ubuntu.com
Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/ubuntu-server-bugs
[Bug 1153364] Re: trouble with guest network connectivity when host is using a bonded interface
i've added iptables -t mangle -A POSTROUTING -o br0 -p udp -m udp -j CHECKSUM --checksum-fill:

>iptables -vnt mangle -L --lin
Chain PREROUTING (policy ACCEPT 44532 packets, 46M bytes)
num   pkts bytes target     prot opt in     out     source               destination

Chain INPUT (policy ACCEPT 44307 packets, 46M bytes)
num   pkts bytes target     prot opt in     out     source               destination

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
num   pkts bytes target     prot opt in     out     source               destination

Chain OUTPUT (policy ACCEPT 37675 packets, 25M bytes)
num   pkts bytes target     prot opt in     out     source               destination

Chain POSTROUTING (policy ACCEPT 37675 packets, 25M bytes)
num   pkts bytes target     prot opt in     out     source               destination
1      301 27725 CHECKSUM   udp  --  *      br0     0.0.0.0/0            0.0.0.0/0            udp CHECKSUM fill

it doesn't appear to have had much impact though. pings are still exhibiting the generally erratic behaviors discussed.

a possibly unrelated note - reading through bug 1029430, i thought i'd also try not using vhost_net. i unloaded the module [as well as the macvtap module], and edited the guest's config, removing the vhost line. however, when starting the guest, the kernel modules are automatically loaded, and the guest appears to still be using the vhost_net module, according to ps output such as in my earlier note. i'm probably doing this wrong, but i'm not sure how.
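For reference, the usual way to keep a specific guest off vhost_net is per-interface in the libvirt domain XML, rather than unloading modules. A sketch, reusing the mac/bridge values from this report (edit via `virsh edit aurora`):

```xml
<!-- sketch: force the userspace (qemu) network backend for this NIC
     instead of vhost. only the <driver name='qemu'/> line is the change;
     the surrounding values are taken from the qemu command line above. -->
<interface type='bridge'>
  <mac address='52:54:00:f3:b2:32'/>
  <source bridge='br0'/>
  <model type='virtio'/>
  <driver name='qemu'/>
</interface>
```

After a full guest shutdown and restart, the qemu command line should then show `vhost=off` (or no vhost fd at all) on the -netdev option.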
Re: [Bug 1153364] Re: trouble with guest network connectivity when host is using a bonded interface
> ping aurora
> PING aurora.example.com (192.168.1.70): 56 data bytes
> 64 bytes from 192.168.1.70: icmp_seq=0 ttl=64 time=0.466 ms
> 92 bytes from xenon.example.com (192.168.1.60): Redirect Host(New addr: 192.168.1.70)
> Vr HL TOS Len ID Flg off TTL Pro cks Src Dst
> 4 5 00 0054 7b83 0 3f 01 7c14 192.168.1.123 192.168.1.70

This is a bit of a long shot, but is it possible this is the same error as https://bugs.launchpad.net/ubuntu/+source/nova/+bug/1029430 ? That bug and all its solutions/workarounds seem to be only about dhcp, but it may be causing general udp checksum problems. You might, at host boot, as per comment #16 (but doing all ports), try:

[ -e /dev/vhost-net ] && \
  sudo iptables -t mangle -A POSTROUTING -o br0 -p udp -m udp -j CHECKSUM --checksum-fill
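If the rule does help, one way to apply it at every host boot (a hypothetical approach, not something suggested in the thread) is a post-up hook on the bridge stanza; the other bridge_ options from the original /etc/network/interfaces are omitted here for brevity:

```
# hypothetical addition to the br0 stanza in /etc/network/interfaces:
# re-add the checksum-fill rule whenever the bridge comes up.
auto br0
iface br0 inet static
    bridge_ports bond0
    post-up [ -e /dev/vhost-net ] && iptables -t mangle -A POSTROUTING -o br0 -p udp -m udp -j CHECKSUM --checksum-fill || true
```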
[Bug 1153364] Re: trouble with guest network connectivity when host is using a bonded interface
no worries. i'm a bit embarrassed i couldn't offer a more directed initial report.

i do believe the vhost_net module is installed and working:

>lsmod | grep -i vhost
vhost_net              31874  1
macvtap                18294  1 vhost_net

>pp | grep -i vhost
root      2534     1  3 22:11 ?        00:00:22 /usr/bin/kvm -name aurora -S -M pc-1.0 -cpu core2duo,+lahf_lm,+dca,+xtpr,+cx16,+tm2,+est,+vmx,+ds_cpl,+pbe,+tm,+ht,+ss,+acpi,+ds -enable-kvm -m 1024 -smp 1,sockets=1,cores=1,threads=1 -uuid 542c39da-f539-6014-6f91-36575f0aef4e -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/aurora.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -device ahci,id=ahci0,bus=pci.0,addr=0x4 -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/srv/vc/disks/aurora,if=none,id=drive-sata0-0-0,format=qcow2 -device ide-hd,bus=ahci0.0,drive=drive-sata0-0-0,id=sata0-0-0,bootindex=1 -drive if=none,media=cdrom,id=drive-sata0-0-1,readonly=on,format=raw -device ide-cd,bus=ahci0.1,drive=drive-sata0-0-1,id=sata0-0-1 -netdev tap,fd=21,id=hostnet0,vhost=on,vhostfd=22 -device virtio-net-pci,tx=bh,netdev=hostnet0,id=net0,mac=52:54:00:f3:b2:32,bus=pci.0,addr=0x3 -vnc 0.0.0.0:0 -vga cirrus -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5

i've tested with lxc as you asked - it does not seem to exhibit this problem, for pings both from the container/guest against other devices on the network, and against the container/guest. i did notice, with some consistency, duplicate pings - but i know that this is sometimes just a largely innocuous side effect of certain types of load balancing, so i'm not necessarily terribly concerned about that. in addition, connectivity to the guest/container in general seemed to be fine, which was not the case with the prior testing.

a couple of other notes to add that i've come across [or remembered] since my previous post.
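The `vhost=on,vhostfd=22` token in that command line is the definitive indicator. A small sketch to pull that setting out of the process table for all running guests at once (the helper name is made up for illustration):

```shell
# a sketch: confirm from the process table whether any running qemu/kvm
# guest has vhost enabled. the vhost=on token format is taken from the
# ps output shown in this report.
check_vhost() {
    # reads command lines on stdin, prints the unique vhost= settings found
    grep -o 'vhost=[a-z]*' | sort -u
}
ps -ef | check_vhost
```

A guest started with the vhost backend shows `vhost=on`; after a successful switch to the userspace backend the token disappears from its command line.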
i have since also tested with balance-tlb, and this seems to work ok, with no symptoms of intermittent network connectivity - both for my kvm guests and for the lxc container/guest.

also, i had forgotten about it when i initially wrote up this submission, but initially i was using macvtap for my kvm guest network connectivity, and this is where i first saw the symptoms. i then switched to a bridged setup, partially to test things further, but also for reasons related to some of the limitations of macvtap [specifically, guests not being able to communicate with the host when using the same interface]. ultimately, i intend to stay with the bridged configuration because of this, but wanted to mention that the symptoms do appear to be present with both.

lastly, one other possibly interesting bit of info - as i was testing again this morning with balance-alb and lxc, i tested again with a kvm guest to ensure the symptom was still present. this time when pinging, not only were the symptoms still present, i saw some behavior i hadn't noticed previously:

ping aurora
PING aurora.example.com (192.168.1.70): 56 data bytes
64 bytes from 192.168.1.70: icmp_seq=0 ttl=64 time=0.466 ms
92 bytes from xenon.example.com (192.168.1.60): Redirect Host(New addr: 192.168.1.70)
Vr HL TOS Len ID Flg off TTL Pro cks Src Dst
4 5 00 0054 7b83 0 3f 01 7c14 192.168.1.123 192.168.1.70
64 bytes from 192.168.1.70: icmp_seq=1 ttl=64 time=0.279 ms
92 bytes from xenon.example.com (192.168.1.60): Redirect Host(New addr: 192.168.1.70)
Vr HL TOS Len ID Flg off TTL Pro cks Src Dst
4 5 00 0054 19c2 0 3f 01 ddd5 192.168.1.123 192.168.1.70
64 bytes from 192.168.1.70: icmp_seq=2 ttl=64 time=0.306 ms
92 bytes from xenon.example.com (192.168.1.60): Redirect Host(New addr: 192.168.1.70)
Vr HL TOS Len ID Flg off TTL Pro cks Src Dst
4 5 00 0054 8fc5 0 3f 01 67d2 192.168.1.123 192.168.1.70
64 bytes from 192.168.1.70: icmp_seq=3 ttl=64 time=0.278 ms
92 bytes from xenon.example.com (192.168.1.60): Redirect Host(New addr: 192.168.1.70)
Vr HL TOS Len ID Flg off TTL Pro cks Src Dst
4 5 00 0054 08f0 0 3f 01 eea7 192.168.1.123 192.168.1.70
64 bytes from 192.168.1.70: icmp_seq=4 ttl=64 time=0.285 ms
92 bytes from xenon.example.com (192.168.1.60): Redirect Host(New addr: 192.168.1.70)
Vr HL TOS Len ID Flg off TTL Pro cks Src Dst
4 5 00 0054 438e 0 3f 01 b409 192.168.1.123 192.168.1.70
64 bytes from 192.168.1.70: icmp_seq=5 ttl=64 time=0.327 ms
64 bytes from 192.168.1.70: icmp_seq=5 ttl=64 time=0.329 ms (DUP!)
64 bytes from 192.168.1.70: icmp_seq=6 ttl=64 time=0.292 ms
64 bytes from 192.168.1.70: icmp_seq=7 ttl=64 time=0.266 ms
92 bytes from xenon.example.com (192.168.1.60): Redirect Host(New addr: 192.168.1.70)
Vr HL TOS Len ID Flg off TTL Pro cks Src Dst
4 5 00 0054 2be0 0 3f 01 cbb7 192.168.1.123 192.168.1.70
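As an aside, those `Redirect Host` lines mean xenon (192.168.1.60, the virtualization host) is routing some of the echo requests at layer 3 - and telling the client to reach aurora directly - rather than purely bridging them at layer 2. If the redirects add noise while debugging, they could be silenced on the host with a sysctl fragment like the following (the file name and the choice to disable per-bridge are assumptions):

```
# hypothetical /etc/sysctl.d/60-no-redirects.conf on the virtualization host:
# stop emitting ICMP redirects for traffic the host ends up forwarding.
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.br0.send_redirects = 0
```

Applied with `sudo sysctl -p /etc/sysctl.d/60-no-redirects.conf`; this only quiets the symptom, it does not explain why bridged traffic is reaching the host's IP stack at all.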
[Bug 1153364] Re: trouble with guest network connectivity when host is using a bonded interface
apport information

** Tags added: apport-collected
[Bug 1153364] Re: trouble with guest network connectivity when host is using a bonded interface
Thanks for reporting this bug. (Sorry about the delay - I've sat and thought about this a few times but haven't yet had any definite thoughts about what would debug this.)

Do you have the vhost_net module installed? I'm pretty sure the bug is in qemu itself, but just to be sure would you mind testing with an lxc container to see if it has the same problem? Just:

sudo apt-get -y install lxc
cat > lxc.conf.custom << EOF
lxc.network.type=veth
lxc.network.link=br0
lxc.network.flags=up
EOF
sudo lxc-create -t ubuntu -n r1 -f lxc.conf.custom
sudo lxc-start -n r1

then log in as user ubuntu, password ubuntu, and see if networking stays up.

This seems very reminiscent of bug 997978.

** Also affects: linux (Ubuntu)
   Importance: Undecided
   Status: New

** Changed in: linux (Ubuntu)
   Importance: Undecided => High

** Changed in: qemu-kvm (Ubuntu)
   Importance: Undecided => High
[Bug 1153364] Re: trouble with guest network connectivity when host is using a bonded interface
some more information - while running a ping from another physical host against a guest, i did a bit of testing with tshark:

192.168.1.123 - other physical host on network
192.168.1.60 - virtual host
192.168.1.70 - virtual guest

on the virtual host, the current active slave is eth0, so i started there:

>cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: adaptive load balancing
Primary Slave: None
Currently Active Slave: eth0
MII Status: up
MII Polling Interval (ms): 0
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: eth0
MII Status: up
Speed: 100 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:19:b9:ec:43:f1
Slave queue ID: 0

tshark appears to indicate that the ping requests are reaching the physical interface on the virtual host:

>tshark -i eth0 'icmp[icmptype]==icmp-echo'
Capturing on eth0
0.00 192.168.1.123 -> 192.168.1.70 ICMP 98 Echo (ping) request id=0xa494, seq=540/7170, ttl=64
1.000273 192.168.1.123 -> 192.168.1.70 ICMP 98 Echo (ping) request id=0xa494, seq=541/7426, ttl=64
2.001328 192.168.1.123 -> 192.168.1.70 ICMP 98 Echo (ping) request id=0xa494, seq=542/7682, ttl=64
3.002381 192.168.1.123 -> 192.168.1.70 ICMP 98 Echo (ping) request id=0xa494, seq=543/7938, ttl=64
^C4 packets captured

next, tshark appears to indicate that the ping requests are reaching the bond interface:

>tshark -i bond0 'icmp[icmptype]==icmp-echo'
Capturing on bond0
0.00 192.168.1.123 -> 192.168.1.70 ICMP 98 Echo (ping) request id=0xa494, seq=796/7171, ttl=64
1.001077 192.168.1.123 -> 192.168.1.70 ICMP 98 Echo (ping) request id=0xa494, seq=797/7427, ttl=64
2.001996 192.168.1.123 -> 192.168.1.70 ICMP 98 Echo (ping) request id=0xa494, seq=798/7683, ttl=64
3.002751 192.168.1.123 -> 192.168.1.70 ICMP 98 Echo (ping) request id=0xa494, seq=799/7939, ttl=64
^C4 packets captured

continuing on, tshark appears to indicate that the ping requests are reaching the bridge interface:

>tshark -i br0 'icmp[icmptype]==icmp-echo'
Capturing on br0
0.00 192.168.1.123 -> 192.168.1.70 ICMP 98 Echo (ping) request id=0xa494, seq=665/39170, ttl=64
1.001045 192.168.1.123 -> 192.168.1.70 ICMP 98 Echo (ping) request id=0xa494, seq=666/39426, ttl=64
2.001173 192.168.1.123 -> 192.168.1.70 ICMP 98 Echo (ping) request id=0xa494, seq=667/39682, ttl=64
3.002232 192.168.1.123 -> 192.168.1.70 ICMP 98 Echo (ping) request id=0xa494, seq=668/39938, ttl=64
4.003298 192.168.1.123 -> 192.168.1.70 ICMP 98 Echo (ping) request id=0xa494, seq=669/40194, ttl=64
^C5 packets captured

while doing each of these captures, i was running a matching capture on the guest, and did not see any of these packets. while i'm not quite sure what [if any] the implication is, it would seem that somehow the packets are getting lost on their way to the guest, after they reach the bridge interface.
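A possible next step after the br0 capture (a sketch; the vnetN tap naming is an assumption about libvirt's defaults, not something shown in this thread): capture on the guest's own tap device on the host side. A packet seen on br0 but not on the tap pins the loss inside the bridge/tap forwarding path; a packet seen on the tap but not in the guest points at virtio/vhost instead.

```shell
# a sketch: find the tap (vnetN) members of the bridge, then capture the
# echo requests on each. adjust names to what `brctl show br0` reports.
list_taps() {
    # reads `brctl show` output on stdin, prints vnetN member names
    grep -o 'vnet[0-9]*'
}
if command -v brctl >/dev/null 2>&1; then
    for tap in $(brctl show br0 | list_taps); do
        echo "capturing on $tap"
        tshark -i "$tap" -c 10 'icmp[icmptype]==icmp-echo'
    done
fi
```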