Package: xen
Version: 4.0.1-5.5
Severity: critical

--- Please enter the report below this line. ---

Hi!

Since last weekend's update in stable/squeeze I've been experiencing problems running Xen on amd64: multiple domUs lose their network connection/VIFs.

From http://blog.windfluechter.net/content/blog/2013/02/26/1597-xen-problems-vms-2632-5-xen-amd64

Unfortunately this update appears to be problematic on my Xen hosting server. Last night, for the second time, some of the virtual network interfaces disappeared or stopped working. For example, I have two VMs: one running the webserver and one running the databases. Between these two VMs there's a bridge on the dom0, and both VMs have a VIF attached to that (internal) bridge. What happens is that this bridge becomes inaccessible from within the webserver VM.

Sadly there's not much to see in the log files. I just spotted this on dom0:

Feb 26 01:01:29 gate kernel: [12697.907512] vif3.1: Frag is bigger than frame.
Feb 26 01:01:29 gate kernel: [12697.907550] vif3.1: fatal error; disabling device
Feb 26 01:01:29 gate kernel: [12697.919921] xenbr1: port 3(vif3.1) entering disabled state
Feb 26 01:22:00 gate kernel: [13928.644888] vif2.1: Frag is bigger than frame.
Feb 26 01:22:00 gate kernel: [13928.644920] vif2.1: fatal error; disabling device
Feb 26 01:22:00 gate kernel: [13928.663571] xenbr1: port 2(vif2.1) entering disabled state
Feb 26 01:40:44 gate kernel: [15052.629280] vif7.1: Frag is bigger than frame.
Feb 26 01:40:44 gate kernel: [15052.629314] vif7.1: fatal error; disabling device
Feb 26 01:40:44 gate kernel: [15052.641725] xenbr1: port 6(vif7.1) entering disabled state

This corresponds to the number of VMs having lost their internal connection to the bridge. On the webserver VM I see this output:

Feb 26 01:59:01 vserv1 kernel: [16113.539767] IPv6: sending pkt_too_big to self
Feb 26 01:59:01 vserv1 kernel: [16113.539794] IPv6: sending pkt_too_big to self
Feb 26 02:30:54 vserv1 kernel: [18026.407517] IPv6: sending pkt_too_big to self
Feb 26 02:30:54 vserv1 kernel: [18026.407546] IPv6: sending pkt_too_big to self
Feb 26 02:30:54 vserv1 kernel: [18026.434761] IPv6: sending pkt_too_big to self
Feb 26 02:30:54 vserv1 kernel: [18026.434787] IPv6: sending pkt_too_big to self
Feb 26 03:39:16 vserv1 kernel: [22128.768214] IPv6: sending pkt_too_big to self
Feb 26 03:39:16 vserv1 kernel: [22128.768240] IPv6: sending pkt_too_big to self
Feb 26 04:39:51 vserv1 kernel: [25764.250170] IPv6: sending pkt_too_big to self
Feb 26 04:39:51 vserv1 kernel: [25764.250196] IPv6: sending pkt_too_big to self
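
For anyone who wants to check their own dom0 for the same symptom, here is a minimal sketch that pulls the disabled VIFs out of kernel log lines like the dom0 ones quoted above. It is only an illustration: the function names (disabled_vifs, domid_of) are made up for this example, the regular expression assumes the syslog line format shown in this report, and the sample lines are copied from the dom0 log above. The vifX.Y naming (device Y of domain X) is the usual netback convention.

```python
import re

# Matches dom0 kernel lines of the form quoted in this report, e.g.
#   Feb 26 01:01:29 gate kernel: [12697.907550] vif3.1: fatal error; disabling device
# The syslog prefix layout is an assumption based on those lines.
DISABLED_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+[\d:]{8})\s+\S+\s+kernel:\s+\[\s*[\d.]+\]\s+"
    r"(?P<vif>vif\d+\.\d+): fatal error; disabling device"
)


def disabled_vifs(lines):
    """Return {vif_name: timestamp of first disable} in order of appearance."""
    found = {}
    for line in lines:
        match = DISABLED_RE.match(line)
        if match:
            found.setdefault(match.group("vif"), match.group("stamp"))
    return found


def domid_of(vif):
    """vifX.Y is device Y of domain X, so return X as an integer."""
    return int(vif[3:].split(".")[0])


# Sample input: lines copied from the dom0 log in this report.
SAMPLE = """\
Feb 26 01:01:29 gate kernel: [12697.907512] vif3.1: Frag is bigger than frame.
Feb 26 01:01:29 gate kernel: [12697.907550] vif3.1: fatal error; disabling device
Feb 26 01:01:29 gate kernel: [12697.919921] xenbr1: port 3(vif3.1) entering disabled state
Feb 26 01:22:00 gate kernel: [13928.644920] vif2.1: fatal error; disabling device
Feb 26 01:40:44 gate kernel: [15052.629314] vif7.1: fatal error; disabling device
"""

for vif, stamp in disabled_vifs(SAMPLE.splitlines()).items():
    print("%s (domain %d) disabled at %s" % (vif, domid_of(vif), stamp))
```

To run it against a real log, pass an open file object (for example the dom0 kernel log) to disabled_vifs() instead of the sample lines; the domain IDs can then be compared with the output of "xm list".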

Rebooting a VM leaves it non-working: it gets paused on creation, the Xen scripts complain about hotplug scripts not working, and the Xen log shows this:

    [2013-02-25 13:06:34 5470] DEBUG (XendDomainInfo:101)
    XendDomainInfo.create(['vm', ['name', 'vserv1'], ['memory', '2048'],
    ['on_poweroff', 'destroy'], ['on_reboot', 'restart'], ['on_crash',
    'restart'], ['on_xend_start', 'ignore'], ['on_xend_stop', 'ignore'],
['vcpus', '2'], ['oos', 1], ['bootloader', '/usr/lib/xen-4.0/bin/pygrub'],
    ['bootloader_args', ''], ['image', ['linux', ['root', '/dev/xvdb '],
['videoram', 4], ['tsc_mode', 0], ['nomigrate', 0]]], ['s3_integrity', 1], ['device', ['vbd', ['uname', 'phy:/dev/lv/vserv1-boot'], ['dev', 'xvda'], ['mode', 'w']]], ['device', ['vbd', ['uname', 'phy:/dev/lv/vserv1-disk'],
    ['dev', 'xvdb'], ['mode', 'w']]], ['device', ['vbd', ['uname',
'phy:/dev/lv/vserv1-swap'], ['dev', 'xvdc'], ['mode', 'w']]], ['device', ['vbd', ['uname', 'phy:/dev/lv/vserv1mirror'], ['dev', 'xvdd'], ['mode',
    'w']]]])
    [2013-02-25 13:06:34 5470] DEBUG (XendDomainInfo:2508)
    XendDomainInfo.constructDomain
[2013-02-25 13:06:34 5470] DEBUG (balloon:220) Balloon: 2100000 KiB free;
    need 16384; done.
    [2013-02-25 13:06:34 5470] DEBUG (XendDomain:464) Adding Domain: 39
    [2013-02-25 13:06:34 5470] DEBUG (XendDomainInfo:2818)
    XendDomainInfo.initDomain: 39 256
[2013-02-25 13:06:34 5781] DEBUG (XendBootloader:113) Launching bootloader
    as ['/usr/lib/xen-4.0/bin/pygrub', '--args=root=/dev/xvdb  ',
    '--output=/var/run/xend/boot/xenbl.6040', '/dev/lv/vserv1-boot'].
    [2013-02-25 13:06:39 5470] DEBUG (XendDomainInfo:2845)
    _initDomain:shadow_memory=0x0, memory_static_max=0x80000000,
    memory_static_min=0x0.
    [2013-02-25 13:06:39 5470] INFO (image:182) buildDomain os=linux dom=39
    vcpus=2
    [2013-02-25 13:06:39 5470] DEBUG (image:721) domid      = 39
    [2013-02-25 13:06:39 5470] DEBUG (image:722) memsize            = 2048
    [2013-02-25 13:06:39 5470] DEBUG (image:723) image      =
    /var/run/xend/boot/boot_kernel.xj7W_t
    [2013-02-25 13:06:39 5470] DEBUG (image:724) store_evtchn   = 1
    [2013-02-25 13:06:39 5470] DEBUG (image:725) console_evtchn = 2
    [2013-02-25 13:06:39 5470] DEBUG (image:726) cmdline            =
    root=UUID=ed71a39f-fd2e-4035-8557-493686baa151 ro root=/dev/xvdb
    [2013-02-25 13:06:39 5470] DEBUG (image:727) ramdisk            =
    /var/run/xend/boot/boot_ramdisk.QavuAo
    [2013-02-25 13:06:39 5470] DEBUG (image:728) vcpus      = 2
    [2013-02-25 13:06:39 5470] DEBUG (image:729) features           =
    [2013-02-25 13:06:39 5470] DEBUG (image:730) flags      = 0
    [2013-02-25 13:06:39 5470] DEBUG (image:731) superpages     = 0
[2013-02-25 13:06:40 5470] INFO (XendDomainInfo:2367) createDevice: vbd : {'uuid': '04d99772-cf27-aecf-2d1b-c73eaf657410', 'bootable': 1, 'driver':
    'paravirtualised', 'dev': 'xvda', 'uname': 'phy:/dev/lv/vserv1-boot',
    'mode': 'w'}
[2013-02-25 13:06:40 5470] DEBUG (DevController:95) DevController: writing
    {'virtual-device': '51712', 'device-type': 'disk', 'protocol':
    'x86_64-abi', 'backend-id': '0', 'state': '1', 'backend':
    '/local/domain/0/backend/vbd/39/51712'} to
    /local/domain/39/device/vbd/51712.
[2013-02-25 13:06:40 5470] DEBUG (DevController:97) DevController: writing
    {'domain': 'vserv1', 'frontend': '/local/domain/39/device/vbd/51712',
    'uuid': '04d99772-cf27-aecf-2d1b-c73eaf657410', 'bootable': '1', 'dev':
    'xvda', 'state': '1', 'params': '/dev/lv/vserv1-boot', 'mode': 'w',
    'online': '1', 'frontend-id': '39', 'type': 'phy'} to
    /local/domain/0/backend/vbd/39/51712.
[2013-02-25 13:06:40 5470] INFO (XendDomainInfo:2367) createDevice: vbd : {'uuid': 'e46cb89f-3e54-41d2-53bd-759ed6c690d2', 'bootable': 0, 'driver':
    'paravirtualised', 'dev': 'xvdb', 'uname': 'phy:/dev/lv/vserv1-disk',
    'mode': 'w'}
[2013-02-25 13:06:40 5470] DEBUG (DevController:95) DevController: writing
    {'virtual-device': '51728', 'device-type': 'disk', 'protocol':
    'x86_64-abi', 'backend-id': '0', 'state': '1', 'backend':
    '/local/domain/0/backend/vbd/39/51728'} to
    /local/domain/39/device/vbd/51728.
[2013-02-25 13:06:40 5470] DEBUG (DevController:97) DevController: writing
    {'domain': 'vserv1', 'frontend': '/local/domain/39/device/vbd/51728',
    'uuid': 'e46cb89f-3e54-41d2-53bd-759ed6c690d2', 'bootable': '0', 'dev':
    'xvdb', 'state': '1', 'params': '/dev/lv/vserv1-disk', 'mode': 'w',
    'online': '1', 'frontend-id': '39', 'type': 'phy'} to
    /local/domain/0/backend/vbd/39/51728.
[2013-02-25 13:06:40 5470] INFO (XendDomainInfo:2367) createDevice: vbd : {'uuid': 'e2d61860-7448-1843-3935-6b63c5d2878e', 'bootable': 0, 'driver':
    'paravirtualised', 'dev': 'xvdc', 'uname': 'phy:/dev/lv/vserv1-swap',
    'mode': 'w'}
[2013-02-25 13:06:40 5470] DEBUG (DevController:95) DevController: writing
    {'virtual-device': '51744', 'device-type': 'disk', 'protocol':
    'x86_64-abi', 'backend-id': '0', 'state': '1', 'backend':
    '/local/domain/0/backend/vbd/39/51744'} to
    /local/domain/39/device/vbd/51744.
[2013-02-25 13:06:40 5470] DEBUG (DevController:97) DevController: writing
    {'domain': 'vserv1', 'frontend': '/local/domain/39/device/vbd/51744',
    'uuid': 'e2d61860-7448-1843-3935-6b63c5d2878e', 'bootable': '0', 'dev':
    'xvdc', 'state': '1', 'params': '/dev/lv/vserv1-swap', 'mode': 'w',
    'online': '1', 'frontend-id': '39', 'type': 'phy'} to
    /local/domain/0/backend/vbd/39/51744.
[2013-02-25 13:06:40 5470] INFO (XendDomainInfo:2367) createDevice: vbd : {'uuid': 'd314a46e-1ce9-0e8d-b009-3f08e29735f5', 'bootable': 0, 'driver':
    'paravirtualised', 'dev': 'xvdd', 'uname': 'phy:/dev/lv/vserv1mirror',
    'mode': 'w'}
[2013-02-25 13:06:40 5470] DEBUG (DevController:95) DevController: writing
    {'virtual-device': '51760', 'device-type': 'disk', 'protocol':
    'x86_64-abi', 'backend-id': '0', 'state': '1', 'backend':
    '/local/domain/0/backend/vbd/39/51760'} to
    /local/domain/39/device/vbd/51760.
[2013-02-25 13:06:40 5470] DEBUG (DevController:97) DevController: writing
    {'domain': 'vserv1', 'frontend': '/local/domain/39/device/vbd/51760',
    'uuid': 'd314a46e-1ce9-0e8d-b009-3f08e29735f5', 'bootable': '0', 'dev':
    'xvdd', 'state': '1', 'params': '/dev/lv/vserv1mirror', 'mode': 'w',
    'online': '1', 'frontend-id': '39', 'type': 'phy'} to
    /local/domain/0/backend/vbd/39/51760.
[2013-02-25 13:06:40 5470] DEBUG (XendDomainInfo:3400) Storing VM details:
    {'on_xend_stop': 'ignore', 'shadow_memory': '0', 'uuid':
    '04541225-6d3c-3cae-a4c4-0b6d4ccfac7a', 'on_reboot': 'restart',
'start_time': '1361794000.37', 'on_poweroff': 'destroy', 'bootloader_args': '', 'on_xend_start': 'ignore', 'on_crash': 'restart', 'xend/restart_count':
    '0', 'vcpus': '2', 'vcpu_avail': '3', 'bootloader':
    '/usr/lib/xen-4.0/bin/pygrub', 'image': "(linux (kernel ) (args
    'root=/dev/xvdb  ') (superpages 0) (tsc_mode 0) (videoram 4) (pci ())
    (nomigrate 0) (notes (HV_START_LOW 18446603336221196288) (FEATURES
    '!writable_page_tables|pae_pgdir_above_4gb') (VIRT_BASE
18446744071562067968) (GUEST_VERSION 2.6) (PADDR_OFFSET 0) (GUEST_OS linux) (HYPERCALL_PAGE 18446744071578882048) (LOADER generic) (SUSPEND_CANCEL 1)
    (PAE_MODE yes) (ENTRY 18446744071584289280) (XEN_VERSION xen-3.0)))",
    'name': 'vserv1'}
    [2013-02-25 13:06:40 5470] DEBUG (XendDomainInfo:1804) Storing domain
    details: {'console/ring-ref': '2143834', 'image/entry':
'18446744071584289280', 'console/port': '2', 'store/ring-ref': '2143835',
    'image/loader': 'generic', 'vm':
    '/vm/04541225-6d3c-3cae-a4c4-0b6d4ccfac7a',
    'control/platform-feature-multiprocessor-suspend': '1',
'image/hv-start-low': '18446603336221196288', 'image/guest-os': 'linux', 'cpu/1/availability': 'online', 'image/virt-base': '18446744071562067968', 'memory/target': '2097152', 'image/guest-version': '2.6', 'image/pae-mode': 'yes', 'description': '', 'console/limit': '1048576', 'image/paddr-offset':
    '0', 'image/hypercall-page': '18446744071578882048',
    'image/suspend-cancel': '1', 'cpu/0/availability': 'online',
    'image/features/pae-pgdir-above-4gb': '1',
'image/features/writable-page-tables': '0', 'console/type': 'xenconsoled',
    'name': 'vserv1', 'domid': '39', 'image/xen-version': 'xen-3.0',
    'store/port': '1'}
[2013-02-25 13:06:40 5470] DEBUG (DevController:95) DevController: writing
    {'protocol': 'x86_64-abi', 'state': '1', 'backend-id': '0', 'backend':
    '/local/domain/0/backend/console/39/0'} to
    /local/domain/39/device/console/0.
[2013-02-25 13:06:40 5470] DEBUG (DevController:97) DevController: writing
    {'domain': 'vserv1', 'frontend': '/local/domain/39/device/console/0',
    'uuid': 'c8819aed-c78f-02b8-0ef7-1600abd15add', 'frontend-id': '39',
    'state': '1', 'location': '2', 'online': '1', 'protocol': 'vt100'} to
    /local/domain/0/backend/console/39/0.
    [2013-02-25 13:06:40 5470] DEBUG (XendDomainInfo:1891)
    XendDomainInfo.handleShutdownWatch
[2013-02-25 13:06:40 5470] DEBUG (DevController:139) Waiting for devices
    vif2.
[2013-02-25 13:06:40 5470] DEBUG (DevController:139) Waiting for devices
    vif.
[2013-02-25 13:06:40 5470] DEBUG (DevController:139) Waiting for devices
    vscsi.
[2013-02-25 13:06:40 5470] DEBUG (DevController:139) Waiting for devices
    vbd.
    [2013-02-25 13:06:40 5470] DEBUG (DevController:144) Waiting for 51712.
[2013-02-25 13:06:40 5470] DEBUG (DevController:628) hotplugStatusCallback
    /local/domain/0/backend/vbd/39/51712/hotplug-status.

From my point of view, either the Xen hypervisor or the kernel seems to be broken, but it's hard for me to tell which.


I suspect the problem lies in the kernel side of the Xen VIF code, as rebooting the dom0 solves the problem temporarily without touching the domUs. But within a few hours (<6 hrs) the issue reappears. Although I assume that xend is responsible for adding/removing VIFs, restarting xend doesn't help at all. That's why I assume a kernel problem within the dom0.

I'm running 8 domUs at the moment. Each of them is connected to the outside world through xenbr0 and to the internal network through xenbr1 with RFC1918 addresses. I'm running a mixed setup of routed and bridged config:

(vif-script vif-bridge)
(network-script network-route)

But the server ran for several years with that setup without any problems, so I don't think that's the issue.

For now I'm forced to go back to a working kernel as I need to keep the server up and running.

--- System information. ---
Architecture: amd64
Kernel:       Linux 2.6.32-5-xen-amd64

gate:~# dpkg -l | grep xen
ii  libxenstore3.0                       4.0.1-5.6  Xenstore communications library for Xen
ii  linux-image-2.6.32-5-xen-amd64       2.6.32-48  Linux 2.6.32 for 64-bit PCs, Xen dom0 support
ii  xen-hypervisor-4.0-amd64             4.0.1-5.6  The Xen Hypervisor on AMD64
ii  xen-linux-system-2.6-xen-amd64       2.6.32+29  Xen system with Linux 2.6 for 64-bit PCs (meta-package)
ii  xen-linux-system-2.6.32-5-xen-amd64  2.6.32-48  Xen system with Linux 2.6.32 on 64-bit PCs (meta-package)
ii  xen-tools                            4.2-1      Tools to manage Xen virtual servers
ii  xen-utils-4.0                        4.0.1-5.6  XEN administrative tools
ii  xen-utils-common                     4.0.0-1    XEN administrative tools - common files
ii  xenstore-utils                       4.0.1-5.6  Xenstore utilities for Xen
ii  xenwatch                             0.5.4-2    Virtualization utilities, mostly for Xen


--
Ciao...            //      Fon: 0381-2744150
      Ingo       \X/       http://blog.windfluechter.net
Please don't share this address with Facebook or Google!
gpg pubkey: http://www.juergensmann.de/ij_public_key.asc
