On Fri, Jun 10, 2016 at 4:22 PM, Philipp Hahn <h...@univention.de> wrote:
> Hi,
>
> while trying to live migrate some VMs from a xen-4.1.6.1 host, "xc_save"
> crashes with a segmentation fault in tools/libxc/xc_domain_save.c:1141
>>     /*
>>      * Quick belt and braces sanity check.
>>      */
>>     for ( i = 0; i < dinfo->p2m_size; i++ )
>>     {
>>         mfn = pfn_to_mfn(i);
>>         if( (mfn != INVALID_P2M_ENTRY) && (mfn_to_pfn(mfn) != i) )
>                                             ^^^^^^^^^^^^^^^
> due to a de-reference through
>> #define pfn_to_mfn(_pfn) \
>>     ((xen_pfn_t) ((dinfo->guest_width==8) \
>>         ? (((uint64_t *)ctx->live_p2m)[(_pfn)]) \
>>         : ((((uint32_t *)ctx->live_p2m)[(_pfn)]) == 0xffffffffU \
>>         ? (-1UL) : (((uint32_t *)ctx->live_p2m)[(_pfn)]))))
>                                        ^^^^^^^^
> The VM is a 32bit Linux-PV-domain having maxmem=1997[MB]
>> (gdb) print _ctx
>> $1 = {hvirt_start = 4118806528, pt_levels = 3, max_mfn = 25690112, live_p2m
>> = 0x7f421cc2e000, live_m2p = 0x7f421d02e000, m2p_mfn0 = 8649728, dinfo =
>> {guest_width = 4, p2m_size = 1048576}}
>
> Note that p2m_size = 0x100000 = 4GiB_RAM/4KiB_page_size »
> maxmem_of_domU, so the loop doesn't end around the allocated 2GB, but
> tries to go up to the 32-bit maximum of 4GB and fails.
>
> I've added more debugging to verify that:
>> xc: detail: p2m_size=0x100000
> ...
>> xc: detail: i=0x7cfff mfn=183ecf2 live_m2p=7cfff
>> segfault
>
> I can reproduce that easily doing
>   /usr/lib/xen/bin/xc_save 28 $domid 0 0 1 28>/dev/null
>
> Doing a non-live-migration (28 $domid 0 0 0) also fails, so it doesn't
> depend on being live.
>
> Increasing the configured memory size of the domU to 2001 doesn't change
> the problem:
>> xc: detail: p2m_size=0x100000
>> xc: detail: i=0x7cfff mfn=183ecf2 live_m2p=7cfff
>> xc: detail: i=0x7d000 mfn=183ecf1 live_m2p=7d000
> ...
>> xc: detail: i=0x7d0fd mfn=10aebce live_m2p=7d0fd
>> xc: detail: i=0x7d0fe mfn=10aebcd live_m2p=7d0fe
>> xc: detail: i=0x7d0ff mfn=10aebcc live_m2p=7d0ff
>> Segmentation fault (core dumped)
>
> Rebooting the domU also doesn't fix the problem.
>
> There is an 8-year-old report about live migration on the mailing list:
> <http://lists.xenproject.org/archives/html/xen-devel/2008-10/msg00107.html>
>
> The host is Linux-3.10.71 amd64, while the guest is Linux-4.1.16 i386
> with PAE. The guest kernel reports
>> [ 0.000000] e820: last_pfn = 0x7d900 max_arch_pfn = 0x1000000
>
>> # cat iomem
>> 00000000-00000fff : reserved
>> 00001000-0009ffff : System RAM
>> 000a0000-000fffff : reserved
>>   000f0000-000fffff : System ROM
>> 00100000-7d8fffff : System RAM
>>   01000000-014e51d9 : Kernel code
>>   014e51da-016efd7f : Kernel data
>>   017a9000-0181cfff : Kernel bss
>> fee00000-fee00fff : Local APIC
>
>
> To me that looks like a bad mixture of
> 1. xc_domain_maximum_gpfn() not returning the right maximum,
> 2. xc_map_foreign_pages() mapping only the pages actually allocated.
>
> Any idea?
> If you need more info, I can provide that on request.
>
> I know that the old migration implementation has been removed with
> xen-4.6, but I still would like to know how to fix that since we will
> not have 4.6 in the next few months and I need working live migration.
>
> Thank you in advance and have a nice weekend.
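
The page counts in the report can be checked with a little arithmetic. With 4 KiB pages, p2m_size = 0x100000 corresponds to 4096 MiB, and the guest's last_pfn = 0x7d900 to 2009 MiB. With maxmem=1997 the last index printed is 0x7cfff, so the first unreadable one is presumably 0x7d000 (2000 MiB). After raising the memory to 2001 the last index printed is 0x7d0ff, so the next one would be 0x7d100 (2001 MiB). The following is only a sketch of that arithmetic and of a sanity loop bounded by the number of entries actually mapped. The helper names pfns_to_mib and count_p2m_mismatches, the nr_mapped parameter, and the plain arrays standing in for live_p2m and live_m2p are all hypothetical. This is not libxc code and not a proposed patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT_4K 12              /* 4 KiB pages, as in the report */
#define INVALID_P2M_32 0xffffffffU    /* invalid entry for a 32-bit guest */

/* Hypothetical helper: convert a page-frame count into MiB. */
static uint64_t pfns_to_mib(uint64_t nr_pfns)
{
    return (nr_pfns << PAGE_SHIFT_4K) >> 20;
}

/*
 * Hypothetical stand-in for the "belt and braces" loop. It walks at most
 * nr_mapped entries of a 32-bit p2m table instead of trusting p2m_size,
 * and returns how many valid entries fail the m2p round trip.
 */
static size_t count_p2m_mismatches(const uint32_t *p2m, size_t p2m_size,
                                   size_t nr_mapped,
                                   const uint32_t *m2p, size_t m2p_size)
{
    size_t bad = 0;
    size_t limit = p2m_size < nr_mapped ? p2m_size : nr_mapped;

    assert(p2m != NULL && m2p != NULL);

    for (size_t i = 0; i < limit; i++) {
        uint32_t mfn = p2m[i];

        if (mfn == INVALID_P2M_32)
            continue;                 /* hole in the guest's address space */
        if (mfn >= m2p_size || m2p[mfn] != i)
            bad++;                    /* round trip does not return to i */
    }
    return bad;
}
```

In this sketch the loop bound is the smaller of p2m_size and nr_mapped, so a p2m_size of 0x100000 cannot push the index past the end of a shorter mapping. How the real number of mapped entries would be obtained in xen-4.1 is not covered here.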
Philipp,

Given that 4.1 is long out of support, we won't be making a proper fix
in-tree (since it will never be released).

So what kind of resolution would be the most help to you? A patch you
can apply locally to allow the save/restore to work?

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel