On Mon, Oct 22, 2018 at 3:22 PM Andrew Cooper <andrew.coop...@citrix.com> wrote:
>
> On 22/10/2018 22:17, Razvan Cojocaru wrote:
> > On 10/22/18 11:48 PM, Tamas K Lengyel wrote:
> >> On Thu, Oct 18, 2018 at 3:12 PM Razvan Cojocaru
> >> <rcojoc...@bitdefender.com> wrote:
> >>> On 10/18/18 11:08 PM, Tamas K Lengyel wrote:
> >>>> On Thu, Oct 18, 2018 at 4:09 AM Razvan Cojocaru
> >>>> <rcojoc...@bitdefender.com> wrote:
> >>>>> Hello,
> >>>>>
> >>>>> This series aims to prevent the display from freezing when
> >>>>> enabling altp2m and switching to a new view (and assorted problems
> >>>>> when resizing the display).
> >>>>>
> >>>>> The first patch propagates ept.ad changes to all active altp2ms,
> >>>>> and the second one allocates a new logdirty rangeset for each
> >>>>> new altp2m, and propagates (under lock) changes to all p2ms.
> >>>>>
> >>>>> The first patch is the same as:
> >>>>> [PATCH V4] x86/altp2m: propagate ept.ad changes to all active altp2ms
> >>>>> but as it is now required for the second one to apply cleanly, it
> >>>>> has been resent as part of this series.
> >>>>>
> >>>>> [PATCH 1/2] x86/altp2m: propagate ept.ad changes to all active altp2ms
> >>>>> [PATCH 2/2] x86/altp2m: fix display frozen when switching to a new
> >>>> Hi Razvan,
> >>>> I would be happy to give this a spin, can you push it as a git branch
> >>>> somewhere?
> >>> Sure, here you go:
> >>>
> >>> https://github.com/razvan-cojocaru/xen/tree/altp2m-logdirty-take1
> >> I ran into this crash when my config incorrectly pointed to a
> >> non-valid disk location:
> >>
> >> (XEN) Assertion 'p2m->sync.logdirty_ranges' failed at p2m-ept.c:1475
> >> (XEN) ----[ Xen-4.12-unstable x86_64 debug=y Not tainted ]----
> >> (XEN) CPU: 4
> >> (XEN) RIP: e008:[<ffff82d08033f40c>] p2m_uninit_altp2m_ept+0x29/0x2b
> >> (XEN) RFLAGS: 0000000000010246 CONTEXT: hypervisor
> >> (XEN) rax: ffff83046d27802c   rbx: ffff8304558dd880   rcx: 0000000000000000
> >> (XEN) rdx: ffff83046d277fff   rsi: 00000000004680c0   rdi: 0000000000000000
> >> (XEN) rbp: ffff83046d277d60   rsp: ffff83046d277d50   r8: ffff82d0809304a0
> >> (XEN) r9: 0000000000455940   r10: ffff82e008d01000   r11: 0000000000000017
> >> (XEN) r12: ffff8304558dd880   r13: ffff8304558df830   r14: ffff8304558df000
> >> (XEN) r15: fffffffffffffff8   cr0: 000000008005003b   cr4: 00000000003526e0
> >> (XEN) cr3: 000000005da16000   cr2: ffff880456cd6e80
> >> (XEN) fsb: 0000000000000000   gsb: ffff880467f40000   gss: 0000000000000000
> >> (XEN) ds: 002b   es: 002b   fs: 0000   gs: 0000   ss: e010   cs: e008
> >> (XEN) Xen code around <ffff82d08033f40c> (p2m_uninit_altp2m_ept+0x29/0x2b):
> >> (XEN) 00 48 83 c4 08 5b 5d c3 <0f> 0b 55 48 89 e5 41 56 41 55 41 54 53 48 8d 05
> >> (XEN) Xen call trace:
> >> (XEN)    [<ffff82d08033f40c>] p2m_uninit_altp2m_ept+0x29/0x2b
> >> (XEN)    [<ffff82d0803305ab>] p2m.c#p2m_teardown_altp2m+0x36/0x52
> >> (XEN)    [<ffff82d0803331b5>] p2m_final_teardown+0x11/0x28
> >> (XEN)    [<ffff82d08034509c>] paging_final_teardown+0x2e/0x3c
> >> (XEN)    [<ffff82d080276439>] arch_domain_destroy+0x50/0xa1
> >> (XEN)    [<ffff82d08020595c>] domain.c#complete_domain_destroy+0x86/0x159
> >> (XEN)    [<ffff82d080228f4f>] rcupdate.c#rcu_process_callbacks+0xa5/0x1cf
> >> (XEN)    [<ffff82d08023ae6b>] softirq.c#__do_softirq+0x71/0x9a
> >> (XEN)    [<ffff82d08023aede>] do_softirq+0x13/0x15
> >> (XEN)    [<ffff82d080275068>] domain.c#idle_loop+0x63/0xb9
> >> (XEN)
> >> (XEN)
> >> (XEN) ****************************************
> >> (XEN) Panic on CPU 4:
> >> (XEN) Assertion 'p2m->sync.logdirty_ranges' failed at p2m-ept.c:1475
> >> (XEN) ****************************************
> > Right, that one I've also come across now, that will be fixed in the
> > next series as a result of doing what Andrew has suggested, which is to say:
> >
> > "Please make all destroy functions idempotent. i.e.
> >
> >     if ( p2m->sync.logdirty_ranges )
> >     {
> >         rangeset_destroy(p2m->sync.logdirty_ranges);
> >         p2m->sync.logdirty_ranges = NULL;
> >     }
> >
> > and use this destroy function in the cleanup path of init()."
>
> Indeed.
>
> >
> >> With the config fixed it boots but when I run DRAKVUF on the domain I
> >> get the following crash:
> >>
> >> (XEN) ----[ Xen-4.12-unstable x86_64 debug=y Not tainted ]----
> >> (XEN) CPU: 0
> >> (XEN) RIP: e008:[<000000007bdb630c>] 000000007bdb630c
> >> (XEN) RFLAGS: 0000000000010282 CONTEXT: hypervisor (d0v5)
> >> (XEN) rax: 00000000ee138470   rbx: 0000000000000000   rcx: 000000008000b098
> >> (XEN) rdx: 0000000000000cf8   rsi: 0000000000000000   rdi: 000000046d2ef000
> >> (XEN) rbp: 0000000000000000   rsp: ffff83005da27a10   r8: 0000000000000cf8
> >> (XEN) r9: 0000000000000cf8   r10: ffff83005da27ab8   r11: ffff83005da27a08
> >> (XEN) r12: 0000000000000000   r13: 0000000000000000   r14: 0000000000000065
> >> (XEN) r15: 00000000000005a7   cr0: 0000000080050033   cr4: 0000000000372660
> >> (XEN) cr3: 000000046d2ef000   cr2: 00000000ee138470
> >> (XEN) fsb: 00007fe46d97bbc0   gsb: ffff880467f40000   gss: 0000000000000000
> >> (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e010   cs: e008
> >> (XEN) Xen code around <000000007bdb630c> (000000007bdb630c):
> >> (XEN) 80 74 0b 05 70 84 00 00 <c7> 00 00 00 00 e0 80 3d 7a 34 00 00 00 75 64 48
> >> (XEN) Xen stack trace from rsp=ffff83005da27a10:
> >> (XEN)    0000000000000000 0000000000000065 ffff83005da27a50 ffff82d08037aafc
> >> (XEN)    00000000fffffffe ffff82d08037ae14 0000000000000000 ffff83005da27a90
> >> (XEN)    0000000000372660 000000046d2ef000 0000000393e91000 ffff82d0809602b0
> >> (XEN)    000000fe00000000 ffff82d0802a3b98 ffffffffffffffff ffff83005da27ab8
> >> (XEN)    ffff83005da27b08 ffff82d0802a3511 ffff82d08046b028 ffff83005da27b08
> >> (XEN)    ffff82d0802a3511 ffff83005da27fff 0000138800000292 000082d0808176a0
> >> (XEN)    0000000000000000 ffff82d08023b889 0000000000000292 ffff82d08046b028
> >> (XEN)    ffff82d080451ac8 ffff82d080454af2 00000000000005a7 ffff83005da27b78
> >> (XEN)    ffff82d080251d6f ffff82d080250fcd 0000000000000028 ffff83005da27b88
> >> (XEN)    ffff83005da27b38 000000000000e010 ffff82d080454c73 ffff82d080451ac8
> >> (XEN)    ffff82d080454af2 00000000000005a7 0000000000000030 ffff83005da27bf8
> >> (XEN)    ffff82d080454c73 ffff83005da27be8 ffff82d0802aaebc ffff82d08033f3dc
> >> (XEN)    ffff82d080451ac8 ffff82d08037d969 ffff82d08037d95d ffff82d08037d969
> >> (XEN)    0b0f82d08037d95d ffff82d08037d969 ffff83005fe5b000 0000000000000000
> >> (XEN)    0000000000000000 ffff83005da27fff 0000000000000000 00007cffa25d83e7
> >> (XEN)    ffff82d08037da2d deadbeefdeadf00d ffff83018caf2530 ffff83005da27d38
> >> (XEN)    ffff83040a492830 ffff83005da27cc8 ffff83040bab2880 0000000000000000
> >> (XEN)    0000000000000000 deadbeefdeadf00d deadbeefdeadf00d 0000000000000000
> >> (XEN)    0000000000000000 ffff830451835000 0000000000000000 ffff83040a492000
> >> (XEN)    0000000600000000 ffff82d08033f3da 000000000000e008 0000000000010282
> >> (XEN) Xen call trace:
> >> (XEN)    [<000000007bdb630c>] 000000007bdb630c
> >> (XEN)
> >> (XEN) Pagetable walk from 00000000ee138470:
> >> (XEN)  L4[0x000] = 000000046d2ee063 ffffffffffffffff
> >> (XEN)  L3[0x003] = 000000005da11063 ffffffffffffffff
> >> (XEN)  L2[0x170] = 0000000000000000 ffffffffffffffff
> >> (XEN)
> >> (XEN) ****************************************
> >> (XEN) Panic on CPU 0:
> >> (XEN) FATAL PAGE FAULT
> >> (XEN) [error_code=0002]
> >> (XEN) Faulting linear address: 00000000ee138470
> >> (XEN) ****************************************
> >> (XEN)
> >> (XEN) Reboot in five seconds...
> > This one I'm not sure about. What does your introspection agent do at
> > that point?
>
> This crash is bizarre. Xen has most likely followed a corrupt function
> pointer, because none of Xen's .text section lives just below the 2G boundary.
>
It's reproducible and happens immediately after a successful call to
xc_altp2m_set_domain_state to enable altp2m.

Tamas
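
For reference, the idempotent-destroy suggestion quoted earlier in the thread can be sketched roughly as below. This is a self-contained illustration and not the actual Xen code: the struct layouts, the *_stub helpers, the *_sketch function names and the live_rangesets counter are all invented stand-ins. Only the p2m->sync.logdirty_ranges field name and the check/destroy/NULL shape come from the thread.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for Xen's types; only the shape matters here. */
struct rangeset { int unused; };
struct p2m_sync { struct rangeset *logdirty_ranges; };
struct p2m_domain { struct p2m_sync sync; };

/* Counts live allocations so a leak or double free would be visible. */
int live_rangesets;

/* Stand-in for allocating a rangeset (assumption, not Xen's API). */
struct rangeset *rangeset_new_stub(void)
{
    struct rangeset *r = calloc(1, sizeof(*r));

    if ( r )
        live_rangesets++;
    return r;
}

/* Stand-in for rangeset_destroy(); must never see a NULL pointer here. */
void rangeset_destroy_stub(struct rangeset *r)
{
    assert(r != NULL);
    free(r);
    live_rangesets--;
}

/*
 * Idempotent teardown: safe on a p2m that was never initialised, and
 * safe to call twice, because the pointer is checked and then cleared.
 */
void altp2m_uninit_sketch(struct p2m_domain *p2m)
{
    if ( p2m->sync.logdirty_ranges )
    {
        rangeset_destroy_stub(p2m->sync.logdirty_ranges);
        p2m->sync.logdirty_ranges = NULL;
    }
}

/*
 * init() reuses the same destroy function on its error path.
 * fail_later_step is a hypothetical knob simulating a later failure.
 */
int altp2m_init_sketch(struct p2m_domain *p2m, int fail_later_step)
{
    p2m->sync.logdirty_ranges = rangeset_new_stub();
    if ( !p2m->sync.logdirty_ranges )
        return -1;

    if ( fail_later_step )
    {
        altp2m_uninit_sketch(p2m);
        return -1;
    }

    return 0;
}
```

With this shape, a teardown path that runs after a failed or skipped init finds a NULL pointer and does nothing, so there is no need to assert that the rangeset exists.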