Bug#701744: [xen] Update to hypervisor 4.0.1-5.6 or linux-image-2.6.32-5-xen-amd64 2.6.32-48 causes networking (VIF) failures
I am seeing this exclusively on the same two servers so far. Is anyone experiencing this on a server with more than one VM?

Drew Hastings

On 06/23/2013 03:41 PM, Drew Hastings wrote:
> I'm seeing these issues as well, for the first time today. I had two VM's fail. The unique thing about these dom0's was that they both only had one VM.
Bug#701744: [xen] Update to hypervisor 4.0.1-5.6 or linux-image-2.6.32-5-xen-amd64 2.6.32-48 causes networking (VIF) failures
I'm seeing these issues as well, for the first time today. I had two VM's fail. The unique thing about these dom0's was that they both only had one VM.

As an example, the xm info on one of them is:

    root@phx-1006:~# cat /etc/debian_version
    6.0.7
    root@phx-1006:~# xm info
    host                   : phx-1006
    release                : 2.6.32-5-xen-amd64
    version                : #1 SMP Mon Feb 25 02:51:39 UTC 2013
    machine                : x86_64
    nr_cpus                : 4
    nr_nodes               : 1
    cores_per_socket       : 4
    threads_per_core       : 1
    cpu_mhz                : 3100
    hw_caps                : bfebfbff:28100800::1f40:13bae3ff::0001:
    virt_caps              : hvm
    total_memory           : 8164
    free_memory            : 50
    node_to_cpu            : node0:0-3
    node_to_memory         : node0:50
    node_to_dma32_mem      : node0:50
    max_node_id            : 0
    xen_major              : 4
    xen_minor              : 0
    xen_extra              : .1
    xen_caps               : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
    xen_scheduler          : credit
    xen_pagesize           : 4096
    platform_params        : virt_start=0x8000
    xen_changeset          : unavailable
    xen_commandline        : placeholder dom0_mem=1024M
    cc_compiler            : gcc version 4.4.5 (Debian 4.4.5-8)
    cc_compile_by          : ijc
    cc_compile_domain      : hellion.org.uk
    cc_compile_date        : Wed Apr 17 18:59:41 UTC 2013
    xend_config_format     : 4

I am also only seeing a small mention of it in /var/log/messages:

    Jun 23 12:13:58 phx-1006 kernel: [3863294.436187] pub: port 2(vif4.0) entering disabled state

I have active-backup bonding set up between two NICs, which might be unrelated to the bug. I then create a bridge using that bonding device in /etc/network/interfaces.

--
Drew H
Bug#701744: [xen] Update to hypervisor 4.0.1-5.6 or linux-image-2.6.32-5-xen-amd64 2.6.32-48 causes networking (VIF) failures
I'm seeing this on Debian wheezy too, as Johnny Strom reported.

    # uname -a
    Linux minerve 3.2.0-4-amd64 #1 SMP Debian 3.2.41-2+deb7u2 x86_64 GNU/Linux
    # dpkg -l | grep hypervisor
    ii  xen-hypervisor-4.1-amd64  4.1.4-3+deb7u1  amd64  Xen Hypervisor on AMD64

The network is configured without (network-script network-bridge) in xend-config.sxp. In /etc/network/interfaces we have no entry for eth0, and this for the bridge (real values replaced by stars):

    auto br0
    iface br0 inet static
        address *
        netmask *
        broadcast *
        gateway *
        bridge_ports eth0

On the dom0, we can see this in kern.log:

    May 28 19:09:30 minerve kernel: [613987.529752] vif vif-11-0: vif11.0: Frag is bigger than frame.
    May 28 19:09:30 minerve kernel: [613987.539463] vif vif-11-0: vif11.0: fatal error; disabling device
    May 28 19:09:30 minerve kernel: [613987.550091] br0: port 2(vif11.0) entering forwarding state

On the console for the VM, I could see this multiple times:

    [233721.357648] device eth0 left promiscuous mode
    [233721.424093] device eth0 entered promiscuous mode

I don't have further access to this VM. On other VMs where this happened, I could see this in the logs:

    [2601471.852025] TCP: Peer 74.94.211.209:31059/80 unexpectedly shrunk window 4233113998:4233124218 (repaired)
    [2601479.916025] TCP: Peer 74.94.211.209:31059/80 unexpectedly shrunk window 4233113998:4233124218 (repaired)

--
Gabriel Filion
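For anyone who wants to check their own dom0 logs for the same symptom, here is a minimal sketch in Python. It only assumes the two message formats quoted in this thread (the plain `vifN.M:` prefix from the 2.6.32 dom0 and the `vif vif-N-M: vifN.M:` prefix from the 3.2 dom0); the function names `vif_events` and `fatally_disabled` are made up for illustration, and other kernel versions may word these messages differently.

```python
import re
from collections import defaultdict

# Matches both message styles quoted in this thread:
#   "vif3.1: Frag is bigger than frame."                   (2.6.32 dom0)
#   "vif vif-11-0: vif11.0: fatal error; disabling device" (3.2 dom0)
# Bridge messages such as "port 2(vif11.0) entering ..." do not match,
# because the vif name there is followed by ")" rather than ": ".
VIF_EVENT = re.compile(
    r"(?P<vif>vif\d+\.\d+): "
    r"(?P<event>Frag is bigger than frame\.|fatal error; disabling device)"
)

FATAL = "fatal error; disabling device"


def vif_events(lines):
    """Return {vif name: [matched events, in log order]}."""
    events = defaultdict(list)
    for line in lines:
        match = VIF_EVENT.search(line)
        if match:
            events[match.group("vif")].append(match.group("event"))
    return dict(events)


def fatally_disabled(events):
    """Return the sorted names of vifs that logged the fatal error."""
    return sorted(vif for vif, seen in events.items() if FATAL in seen)


# Sample lines copied from the reports in this thread.
SAMPLE = [
    "May 28 19:09:30 minerve kernel: [613987.529752] vif vif-11-0: vif11.0: Frag is bigger than frame.",
    "May 28 19:09:30 minerve kernel: [613987.539463] vif vif-11-0: vif11.0: fatal error; disabling device",
    "May 28 19:09:30 minerve kernel: [613987.550091] br0: port 2(vif11.0) entering forwarding state",
    "Feb 26 01:01:29 gate kernel: [12697.907512] vif3.1: Frag is bigger than frame.",
    "Feb 26 01:01:29 gate kernel: [12697.907550] vif3.1: fatal error; disabling device",
]

if __name__ == "__main__":
    for vif in fatally_disabled(vif_events(SAMPLE)):
        print(vif)
```

To run it against a real log, pass an open file instead of SAMPLE, for example `vif_events(open("/var/log/kern.log"))`; the path is the one mentioned above and may differ on other setups.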
Bug#701744: [xen] Update to hypervisor 4.0.1-5.6 or linux-image-2.6.32-5-xen-amd64 2.6.32-48 causes networking (VIF) failures
Hello,

We are also encountering this critical issue. I hope a solution will be found shortly.

Jürg
Bug#701744: [xen] Update to hypervisor 4.0.1-5.6 or linux-image-2.6.32-5-xen-amd64 2.6.32-48 causes networking (VIF) failures
Hello,

I was also hit by this problem. Perhaps it is related to the xen-netback module in the dom0 kernel? Have a look at: http://www.gossamer-threads.com/lists/xen/devel/275548

--
greetings
eMHa
Bug#701744: [Pkg-xen-devel] Bug#701744: [xen] Update to hypervisor 4.0.1-5.6 or linux-image-2.6.32-5-xen-amd64 2.6.32-48 causes networking (VIF) failures
On Thu, 2013-02-28 at 07:47 +0100, Ingo Jürgensmann wrote:
> On 26.02.2013 at 19:19, Ian Campbell i...@hellion.org.uk wrote:
>> So, was the hypervisor upgrade also accompanied by a kernel update, in either the dom0 or guest domains? If so what versions were involved and where?
>
> Additionally to the last mail: After running two days stable with the downgraded packages, I've now upgraded the hypervisor package to xen-hypervisor-4.0-amd64_4.0.1-5.6_amd64.deb again and rebooted. If the machine runs stable with that, I'd say the kernel package is broken. And if the issues pop up again now, it's the hypervisor.

I am 99% positive it will turn out to be the kernel, but please do let us know.

> All domUs are still running the updated kernel packages (pre DSA-2632-1), the dom0 the downgraded kernel package.

When you reach a conclusion (assuming it is the one I expect!) please can you confirm the fixed+broken kernel package versions so I can reassign as appropriate.

Unfortunately I'm about to go travelling for a week but I have asked one of my colleagues to try and take a look at this.

Ian.

--
Ian Campbell

A beginning is the time for taking the most delicate care that balances are correct.
-- Princess Irulan, Manual of Muad'Dib
Bug#701744: [Pkg-xen-devel] Bug#701744: [xen] Update to hypervisor 4.0.1-5.6 or linux-image-2.6.32-5-xen-amd64 2.6.32-48 causes networking (VIF) failures
On 28.02.2013 at 11:28, Ian Campbell i...@hellion.org.uk wrote:
> I am 99% positive it will turn out to be the kernel, but please do let us know.

Shortly before 18:00 today I reinstalled linux-image-2.6.32-5-xen-amd64_2.6.32-48_amd64.deb as the kernel and it took only 2.5 hours to trigger the issue. The hypervisor is xen-hypervisor-4.0-amd64 4.0.1-5.4 at this time.

>> All domUs are still running the updated kernel packages (pre DSA-2632-1), the dom0 the downgraded kernel package.
>
> When you reach a conclusion (assuming it is the one I expect!) please can you confirm the fixed+broken kernel package versions so I can reassign as appropriate.
>
> Unfortunately I'm about to go travelling for a week but I have asked one of my colleagues to try and take a look at this.

I will now upgrade to the DSA-2632-1 kernels and see whether that works or has the same issues. Will keep you updated.

--
Ciao...          //    Fon: 0381-2744150
      Ingo     \X/     http://blog.windfluechter.net
gpg pubkey: http://www.juergensmann.de/ij_public_key.asc
Bug#701744: [Pkg-xen-devel] Bug#701744: [xen] Update to hypervisor 4.0.1-5.6 or linux-image-2.6.32-5-xen-amd64 2.6.32-48 causes networking (VIF) failures
On 28.02.2013 at 11:28, Ian Campbell i...@hellion.org.uk wrote:
>> All domUs are still running the updated kernel packages (pre DSA-2632-1), the dom0 the downgraded kernel package.

linux-image-2.6.32-5-xen-amd64 2.6.32-48squeeze1 triggered the problem within 30 mins. ;-)

> When you reach a conclusion (assuming it is the one I expect!) please can you confirm the fixed+broken kernel package versions so I can reassign as appropriate.

Conclusion: it's the kernel... ;-)

Working:
- kernel: linux-image-2.6.32-5-xen-amd64_2.6.32-46_amd64.deb
- hypervisor: xen-hypervisor-4.0-amd64_4.0.1-5.4_amd64.deb, xen-hypervisor-4.0-amd64_4.0.1-5.6_amd64.deb

Not working kernels:
- linux-image-2.6.32-5-xen-amd64_2.6.32-48_amd64.deb
- linux-image-2.6.32-5-xen-amd64_2.6.32-48squeeze1_amd64.deb

So, I'm going back to linux-image-2.6.32-5-xen-amd64_2.6.32-46_amd64.deb...

--
Ciao...          //    Fon: 0381-2744150
      Ingo     \X/     http://blog.windfluechter.net
gpg pubkey: http://www.juergensmann.de/ij_public_key.asc
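The working and non-working kernel versions from this message can be turned into a small check. This is only a sketch in Python: the version sets are copied from the lists above and describe one reporter's findings for linux-image-2.6.32-5-xen-amd64, not a confirmed list of affected packages, and the function name `classify_dom0_kernel` is hypothetical.

```python
# Package versions of linux-image-2.6.32-5-xen-amd64 as reported in this
# thread. Anything not listed here was simply not tested by the reporter.
REPORTED_WORKING = {"2.6.32-46"}
REPORTED_BROKEN = {"2.6.32-48", "2.6.32-48squeeze1"}


def classify_dom0_kernel(version):
    """Return 'broken', 'working' or 'unknown' for a dom0 kernel package version."""
    if version in REPORTED_BROKEN:
        return "broken"
    if version in REPORTED_WORKING:
        return "working"
    return "unknown"


if __name__ == "__main__":
    for version in ("2.6.32-46", "2.6.32-48", "2.6.32-48squeeze1"):
        print(version, classify_dom0_kernel(version))
```

On a dom0, the installed version string could be read with `dpkg-query -W -f='${Version}' linux-image-2.6.32-5-xen-amd64` and passed to the function.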
Bug#701744: [Pkg-xen-devel] Bug#701744: [xen] Update to hypervisor 4.0.1-5.6 or linux-image-2.6.32-5-xen-amd64 2.6.32-48 causes networking (VIF) failures
On 26.02.2013 at 19:19, Ian Campbell i...@hellion.org.uk wrote:
> So, was the hypervisor upgrade also accompanied by a kernel update, in either the dom0 or guest domains? If so what versions were involved and where?

Additionally to the last mail: After running two days stable with the downgraded packages, I've now upgraded the hypervisor package to xen-hypervisor-4.0-amd64_4.0.1-5.6_amd64.deb again and rebooted. If the machine runs stable with that, I'd say the kernel package is broken. And if the issues pop up again now, it's the hypervisor.

All domUs are still running the updated kernel packages (pre DSA-2632-1), the dom0 the downgraded kernel package.

--
Ciao...          //    Fon: 0381-2744150
      Ingo     \X/     http://blog.windfluechter.net
gpg pubkey: http://www.juergensmann.de/ij_public_key.asc
Bug#701744: [xen] Update to hypervisor 4.0.1-5.6 or linux-image-2.6.32-5-xen-amd64 2.6.32-48 causes networking (VIF) failures
Package: xen
Version: 4.0.1-5.5
Severity: critical

--- Please enter the report below this line. ---

Hi!

Since the update last weekend in stable/squeeze I'm experiencing problems with running Xen on amd64 and multiple domUs losing their network connection/VIFs.

From http://blog.windfluechter.net/content/blog/2013/02/26/1597-xen-problems-vms-2632-5-xen-amd64

Unfortunately this update appears to be problematic on my Xen hosting server. Last night it happened for the second time that some of the virtual network interfaces disappeared or turned out to be non-working. For example I have two VMs: one running the webserver and one running the databases. Between these two VMs there's a bridge on the dom0 and both VMs have a VIF to that (internal) bridge. What happens is that this bridge becomes inaccessible from within the webserver VM.

Sadly there's not much to see in the log files. I just spotted this on dom0:

    Feb 26 01:01:29 gate kernel: [12697.907512] vif3.1: Frag is bigger than frame.
    Feb 26 01:01:29 gate kernel: [12697.907550] vif3.1: fatal error; disabling device
    Feb 26 01:01:29 gate kernel: [12697.919921] xenbr1: port 3(vif3.1) entering disabled state
    Feb 26 01:22:00 gate kernel: [13928.644888] vif2.1: Frag is bigger than frame.
    Feb 26 01:22:00 gate kernel: [13928.644920] vif2.1: fatal error; disabling device
    Feb 26 01:22:00 gate kernel: [13928.663571] xenbr1: port 2(vif2.1) entering disabled state
    Feb 26 01:40:44 gate kernel: [15052.629280] vif7.1: Frag is bigger than frame.
    Feb 26 01:40:44 gate kernel: [15052.629314] vif7.1: fatal error; disabling device
    Feb 26 01:40:44 gate kernel: [15052.641725] xenbr1: port 6(vif7.1) entering disabled state

This corresponds to the number of VMs having lost their internal connection to the bridge.
On the webserver VM I see this output:

    Feb 26 01:59:01 vserv1 kernel: [16113.539767] IPv6: sending pkt_too_big to self
    Feb 26 01:59:01 vserv1 kernel: [16113.539794] IPv6: sending pkt_too_big to self
    Feb 26 02:30:54 vserv1 kernel: [18026.407517] IPv6: sending pkt_too_big to self
    Feb 26 02:30:54 vserv1 kernel: [18026.407546] IPv6: sending pkt_too_big to self
    Feb 26 02:30:54 vserv1 kernel: [18026.434761] IPv6: sending pkt_too_big to self
    Feb 26 02:30:54 vserv1 kernel: [18026.434787] IPv6: sending pkt_too_big to self
    Feb 26 03:39:16 vserv1 kernel: [22128.768214] IPv6: sending pkt_too_big to self
    Feb 26 03:39:16 vserv1 kernel: [22128.768240] IPv6: sending pkt_too_big to self
    Feb 26 04:39:51 vserv1 kernel: [25764.250170] IPv6: sending pkt_too_big to self
    Feb 26 04:39:51 vserv1 kernel: [25764.250196] IPv6: sending pkt_too_big to self

Rebooting a VM results in a non-working VM: it gets paused on creation, the Xen scripts complain about hotplug scripts not working, and the Xen log shows this:

    [2013-02-25 13:06:34 5470] DEBUG (XendDomainInfo:101) XendDomainInfo.create(['vm', ['name', 'vserv1'], ['memory', '2048'], ['on_poweroff', 'destroy'], ['on_reboot', 'restart'], ['on_crash', 'restart'], ['on_xend_start', 'ignore'], ['on_xend_stop', 'ignore'], ['vcpus', '2'], ['oos', 1], ['bootloader', '/usr/lib/xen-4.0/bin/pygrub'], ['bootloader_args', ''], ['image', ['linux', ['root', '/dev/xvdb '], ['videoram', 4], ['tsc_mode', 0], ['nomigrate', 0]]], ['s3_integrity', 1], ['device', ['vbd', ['uname', 'phy:/dev/lv/vserv1-boot'], ['dev', 'xvda'], ['mode', 'w']]], ['device', ['vbd', ['uname', 'phy:/dev/lv/vserv1-disk'], ['dev', 'xvdb'], ['mode', 'w']]], ['device', ['vbd', ['uname', 'phy:/dev/lv/vserv1-swap'], ['dev', 'xvdc'], ['mode', 'w']]], ['device', ['vbd', ['uname', 'phy:/dev/lv/vserv1mirror'], ['dev', 'xvdd'], ['mode', 'w')
    [2013-02-25 13:06:34 5470] DEBUG (XendDomainInfo:2508) XendDomainInfo.constructDomain
    [2013-02-25 13:06:34 5470] DEBUG (balloon:220) Balloon: 210 KiB free; need 16384; done.
    [2013-02-25 13:06:34 5470] DEBUG (XendDomain:464) Adding Domain: 39
    [2013-02-25 13:06:34 5470] DEBUG (XendDomainInfo:2818) XendDomainInfo.initDomain: 39 256
    [2013-02-25 13:06:34 5781] DEBUG (XendBootloader:113) Launching bootloader as ['/usr/lib/xen-4.0/bin/pygrub', '--args=root=/dev/xvdb ', '--output=/var/run/xend/boot/xenbl.6040', '/dev/lv/vserv1-boot'].
    [2013-02-25 13:06:39 5470] DEBUG (XendDomainInfo:2845) _initDomain:shadow_memory=0x0, memory_static_max=0x8000, memory_static_min=0x0.
    [2013-02-25 13:06:39 5470] INFO (image:182) buildDomain os=linux dom=39 vcpus=2
    [2013-02-25 13:06:39 5470] DEBUG (image:721) domid = 39
    [2013-02-25 13:06:39 5470] DEBUG (image:722) memsize= 2048
    [2013-02-25 13:06:39 5470] DEBUG (image:723) image = /var/run/xend/boot/boot_kernel.xj7W_t
    [2013-02-25 13:06:39 5470] DEBUG (image:724) store_evtchn = 1
    [2013-02-25 13:06:39 5470] DEBUG (image:725) console_evtchn =
Bug#701744: [Pkg-xen-devel] Bug#701744: [xen] Update to hypervisor 4.0.1-5.6 or linux-image-2.6.32-5-xen-amd64 2.6.32-48 causes networking (VIF) failures
On Tue, 2013-02-26 at 18:42 +0100, Ingo Juergensmann wrote:
> Since the update last weekend in stable/squeeze I'm experiencing problems with running Xen on amd64 and multiple domUs losing their network connection/VIFs.

The hypervisor's involvement in the specifics of the networking is pretty minimal -- a kernel bug is much more likely IMHO.

In particular the messages you are seeing look a lot like those which would result from http://wiki.xen.org/wiki/Security_Announcements#XSA_39_Linux_netback_DoS_via_malicious_guest_ring..

So, was the hypervisor upgrade also accompanied by a kernel update, in either the dom0 or guest domains? If so what versions were involved and where?

Thanks,
Ian

--
Ian Campbell

pain, n.: One thing, at least it proves that you're alive!
Bug#701744: [Pkg-xen-devel] Bug#701744: [xen] Update to hypervisor 4.0.1-5.6 or linux-image-2.6.32-5-xen-amd64 2.6.32-48 causes networking (VIF) failures
On 26.02.2013 at 19:19, Ian Campbell i...@hellion.org.uk wrote:
>> Since the update last weekend in stable/squeeze I'm experiencing problems with running Xen on amd64 and multiple domUs losing their network connection/VIFs.
>
> The hypervisor's involvement in the specifics of the networking is pretty minimal -- a kernel bug is much more likely IMHO.
>
> In particular the messages you are seeing look a lot like those which would result from http://wiki.xen.org/wiki/Security_Announcements#XSA_39_Linux_netback_DoS_via_malicious_guest_ring..
>
> So, was the hypervisor upgrade also accompanied by a kernel update, in either the dom0 or guest domains? If so what versions were involved and where?

Yes, it was a full update, both on dom0 as well as on the domUs. I always try to keep the kernels on dom0 and domU at the same version. The blog post lists the packages that were updated last weekend:

    gate:~# dir /var/cache/apt/archives/
    base-files_6.0squeeze7_amd64.deb                  libxenstore3.0_4.0.1-5.6_amd64.deb
    bind9-host_1%3a9.7.3.dfsg-1~squeeze9_amd64.deb    linux-base_2.6.32-48_all.deb
    dbus_1.2.24-4+squeeze2_amd64.deb                  linux-image-2.6.32-5-amd64_2.6.32-48_amd64.deb
    dbus-x11_1.2.24-4+squeeze2_amd64.deb              linux-image-2.6.32-5-xen-amd64_2.6.32-48_amd64.deb
    firmware-linux-free_2.6.32-48_all.deb             lock
    gzip_1.3.12-9+squeeze1_amd64.deb                  openssh-client_1%3a5.5p1-6+squeeze3_amd64.deb
    host_1%3a9.7.3.dfsg-1~squeeze9_all.deb            openssh-server_1%3a5.5p1-6+squeeze3_amd64.deb
    libbind9-60_1%3a9.7.3.dfsg-1~squeeze9_amd64.deb   openssl_0.9.8o-4squeeze14_amd64.deb
    libcups2_1.4.4-7+squeeze3_amd64.deb               partial
    libdbus-1-3_1.2.24-4+squeeze2_amd64.deb           perl_5.10.1-17squeeze5_amd64.deb
    libdbus-glib-1-2_0.88-2.1+squeeze1_amd64.deb      perl-base_5.10.1-17squeeze5_amd64.deb
    libdns69_1%3a9.7.3.dfsg-1~squeeze9_amd64.deb      perl-modules_5.10.1-17squeeze5_all.deb
    libisc62_1%3a9.7.3.dfsg-1~squeeze9_amd64.deb      ssh_1%3a5.5p1-6+squeeze3_all.deb
    libisccc60_1%3a9.7.3.dfsg-1~squeeze9_amd64.deb    tzdata_2012g-0squeeze1_all.deb
    libisccfg62_1%3a9.7.3.dfsg-1~squeeze9_amd64.deb   xen-hypervisor-4.0-amd64_4.0.1-5.6_amd64.deb
    libldap-2.4-2_2.4.23-7.3_amd64.deb                xen-linux-system-2.6.32-5-xen-amd64_2.6.32-48_amd64.deb
    liblwres60_1%3a9.7.3.dfsg-1~squeeze9_amd64.deb    xenstore-utils_4.0.1-5.6_amd64.deb
    libperl5.10_5.10.1-17squeeze5_amd64.deb           xen-utils-4.0_4.0.1-5.6_amd64.deb
    libssl0.9.8_0.9.8o-4squeeze14_amd64.deb

The same kernel versions were updated in the domUs.

--
Ciao...          //    Fon: 0381-2744150
      Ingo     \X/     http://blog.windfluechter.net
gpg pubkey: http://www.juergensmann.de/ij_public_key.asc