[Xen-devel] [xen-4.1.6.1] SIGSEGV libxc/xc_save_domain.c: p2m_size >> configured_ram_size

2016-06-10 Thread Philipp Hahn
Hi,

while trying to live migrate some VMs from a xen-4.1.6.1 host, "xc_save"
crashes with a segmentation fault in tools/libxc/xc_domain_save.c:1141:
> /*
>  * Quick belt and braces sanity check.
>  */
>     for ( i = 0; i < dinfo->p2m_size; i++ )
>     {
>         mfn = pfn_to_mfn(i);
>         if( (mfn != INVALID_P2M_ENTRY) && (mfn_to_pfn(mfn) != i) )
 ^^^
due to a de-reference through
> #define pfn_to_mfn(_pfn)                                              \
>   ((xen_pfn_t) ((dinfo->guest_width==8)                               \
>                 ? (((uint64_t *)ctx->live_p2m)[(_pfn)])               \
>                 : ((((uint32_t *)ctx->live_p2m)[(_pfn)]) == 0xffffffffU \
>                    ? (-1UL) : (((uint32_t *)ctx->live_p2m)[(_pfn)]))))
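
For readers following the macro, here is a minimal standalone sketch of the lookup it performs, written as a function. The names `p2m_lookup` and `pfn_sketch_t` are illustrative assumptions and are not libxc identifiers; the point is only that the array read is unchecked, so any index beyond the mapped part of live_p2m faults.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for libxc's xen_pfn_t on a 64-bit toolstack (assumption). */
typedef unsigned long pfn_sketch_t;

/*
 * Look up one P2M entry the way the macro does: 64-bit guests store
 * 8-byte entries, 32-bit guests store 4-byte entries whose all-ones
 * value is widened to the toolstack's all-ones "invalid" marker.
 * The array read is unchecked, so pfn must lie inside the mapping.
 */
static inline pfn_sketch_t p2m_lookup(const void *live_p2m,
                                      unsigned int guest_width,
                                      size_t pfn)
{
    assert(guest_width == 4 || guest_width == 8);

    if (guest_width == 8)
        return (pfn_sketch_t)((const uint64_t *)live_p2m)[pfn];

    uint32_t entry = ((const uint32_t *)live_p2m)[pfn];
    return (entry == 0xffffffffU) ? (pfn_sketch_t)-1UL : (pfn_sketch_t)entry;
}
```

With guest_width = 4, as in the gdb output of this report, every lookup takes the second branch and reads a 4-byte entry.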

The VM is a 32bit Linux-PV-domain having maxmem=1997[MB]
> (gdb) print _ctx
> $1 = {hvirt_start = 4118806528, pt_levels = 3, max_mfn = 25690112, live_p2m = 
> 0x7f421cc2e000, live_m2p = 0x7f421d02e000, m2p_mfn0 = 8649728, dinfo = 
> {guest_width = 4, p2m_size = 1048576}}

Note that p2m_size = 0x100000 = 4GiB_RAM/4KiB_page_size, which is much
larger than maxmem_of_domU, so the loop doesn't end around the allocated
2GB, but tries to go up to the 32-bit maximum of 4GB and fails.
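
The arithmetic behind that comparison can be checked with a small sketch. The helper names and `SKETCH_PAGE_SHIFT` are assumptions made for illustration; only the 4 KiB page size, the 4 GiB limit and the maxmem value come from the report.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12  /* 4 KiB pages */

/* Number of page frames needed to cover `bytes` of guest address space. */
static inline uint64_t frames_for_bytes(uint64_t bytes)
{
    return (bytes + (1ULL << SKETCH_PAGE_SHIFT) - 1) >> SKETCH_PAGE_SHIFT;
}

/* Frames for a memory size given in MiB, as in the domU's maxmem setting. */
static inline uint64_t frames_for_mib(uint64_t mib)
{
    assert(mib < (1ULL << 44));  /* keep the shift below from overflowing */
    return frames_for_bytes(mib << 20);
}
```

frames_for_bytes(1ULL << 32) gives 0x100000 (1048576, the p2m_size printed by gdb), while frames_for_mib(1997) gives 0x7cd00 (511232), so the loop bound is roughly twice the number of frames the guest can own.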

I've added more debugging to verify that:
> xc: detail: p2m_size=0x100000
...
> xc: detail: i=0x7cfff mfn=183ecf2 live_m2p=7cfff
> segfault

I can reproduce that easily doing
 /usr/lib/xen/bin/xc_save 28 $domid 0 0 1 28>/dev/null

Doing a non-live-migration (28 $domid 0 0 0) also fails, so it doesn't
depend on being live.

Increasing the configured memory size of the domU to 2001 doesn't change
the problem:
> xc: detail: p2m_size=0x100000
> xc: detail: i=0x7cfff mfn=183ecf2 live_m2p=7cfff
> xc: detail: i=0x7d000 mfn=183ecf1 live_m2p=7d000
...
> xc: detail: i=0x7d0fd mfn=10aebce live_m2p=7d0fd
> xc: detail: i=0x7d0fe mfn=10aebcd live_m2p=7d0fe
> xc: detail: i=0x7d0ff mfn=10aebcc live_m2p=7d0ff
> Segmentation fault (core dumped)

Rebooting the domU also doesn't fix the problem.

There is an 8-year-old report about live migration on the mailing list:


The host is Linux-3.10.71 amd64, while the guest is Linux-4.1.16 i386
with PAE. The guest kernel reports
> [    0.000000] e820: last_pfn = 0x7d900 max_arch_pfn = 0x1000000

> # cat iomem 
> 00000000-00000fff : reserved
> 00001000-0009ffff : System RAM
> 000a0000-000fffff : reserved
>   000f0000-000fffff : System ROM
> 00100000-7d8fffff : System RAM
>   01000000-014e51d9 : Kernel code
>   014e51da-016efd7f : Kernel data
>   017a9000-0181cfff : Kernel bss
> fee00000-fee00fff : Local APIC


To me that looks like a bad mixture of
1. xc_domain_maximum_gpfn() not returning the right maximum,
2. xc_map_foreign_pages() mapping only the pages that are actually allocated.
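
One way to picture the mismatch between those two points is a sanity loop that is clamped to the entries that are really mapped. The following is a hypothetical standalone sketch, not a libxc patch: `count_p2m_mismatches`, `mapped_entries`, `m2p_entries` and `SKETCH_INVALID_ENTRY` are invented for illustration, and it assumes a 32-bit guest with 4-byte P2M entries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_INVALID_ENTRY 0xffffffffU  /* 32-bit guest "no frame" marker */

/*
 * Count P2M entries whose M2P reverse lookup disagrees, visiting only
 * indices that are both below p2m_size and inside the mapped range.
 */
static inline size_t count_p2m_mismatches(const uint32_t *p2m,
                                          size_t p2m_size,
                                          size_t mapped_entries,
                                          const uint32_t *m2p,
                                          size_t m2p_entries)
{
    size_t limit = (p2m_size < mapped_entries) ? p2m_size : mapped_entries;
    size_t mismatches = 0;

    assert(p2m != NULL && m2p != NULL);

    for (size_t i = 0; i < limit; i++) {
        uint32_t mfn = p2m[i];

        if (mfn == SKETCH_INVALID_ENTRY)
            continue;       /* hole in the guest's address space */
        if (mfn >= m2p_entries || m2p[mfn] != i)
            mismatches++;   /* reverse lookup disagrees or is out of range */
    }
    return mismatches;
}
```

Passing p2m_size = 1048576 with a much smaller mapped_entries stops the walk at the end of the mapping instead of reading past it. Whether clamping is the right fix, as opposed to correcting the reported maximum, is exactly the open question here.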

Any idea?
If you need more info, I can provide that on request.

I know that the old migration implementation has been removed with
xen-4.6, but I still would like to know how to fix that since we will
not have 4.6 in the next few months and I need working live migration.

Thank you in advance and have a nice weekend.

Philipp
-- 
Philipp Hahn
Open Source Software Engineer

Univention GmbH
be open.
Mary-Somerville-Str. 1
D-28359 Bremen
Tel.: +49 421 22232-0
Fax : +49 421 22232-99
h...@univention.de

http://www.univention.de/
Geschäftsführer: Peter H. Ganten
HRB 20755 Amtsgericht Bremen
Steuer-Nr.: 71-597-02876

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [xen-4.1.6.1] SIGSEGV libxc/xc_save_domain.c: p2m_size >> configured_ram_size

2016-06-13 Thread George Dunlap
Philipp,

Given that 4.1 is long out of support, we won't be making a proper fix
in-tree (since it will never be released).  So what kind of resolution
would be the most help to you?  A patch you can apply locally to allow
the save/restore to work?

 -George



Re: [Xen-devel] [xen-4.1.6.1] SIGSEGV libxc/xc_save_domain.c: p2m_size >> configured_ram_size

2016-06-13 Thread Philipp Hahn
Hello George,

first of all thank you for answering.

On 13.06.2016 at 12:15, George Dunlap wrote:
> On Fri, Jun 10, 2016 at 4:22 PM, Philipp Hahn  wrote:
>> while trying to live migrate some VMs from an xen-4.1.6.1 host "xc_save"
>> crashes with a segmentation fault in tools/libxc/xc_domain_save.c:1141
>>> /*
>>>  * Quick belt and braces sanity check.
>>>  */
>>>     for ( i = 0; i < dinfo->p2m_size; i++ )
>>>     {
>>>         mfn = pfn_to_mfn(i);
>>>         if( (mfn != INVALID_P2M_ENTRY) && (mfn_to_pfn(mfn) != i) )
>>  ^^^
>> due to a de-reference through
>>> #define pfn_to_mfn(_pfn)                                              \
>>>   ((xen_pfn_t) ((dinfo->guest_width==8)                               \
>>>                 ? (((uint64_t *)ctx->live_p2m)[(_pfn)])               \
>>>                 : ((((uint32_t *)ctx->live_p2m)[(_pfn)]) == 0xffffffffU \
>>>                    ? (-1UL) : (((uint32_t *)ctx->live_p2m)[(_pfn)]))))
...
> Given that 4.1 is long out of support, we won't be making a proper fix
> in-tree (since it will never be released).

I know that 4.1 is EOL.
I'm aware of Ubuntu still having xen-4.1 in one of their LTS versions
(Precise) and it's also in Debian-oldstable, which a lot of people (us
included) still use. I would prefer to update, but I can't for reasons
outside my direct control.

I'm already working with Stefan Bader from Canonical to backport most of
the XSAs to 4.1, so there already exists a "better" version outside of
the official Xen repositories.

> So what kind of resolution
> would be the most help to you?  A patch you can apply locally to allow
> the save/restore to work?

A patch is okay. I've already fixed a lot of other bugs in xen-4.1 by
patching the last release, so compiling my own version is no problem for me.

Philipp
