Re: [Xen-devel] Xen 4.11.1 panic
On Thu, Dec 20, 2018 at 12:36:35AM -0700, Jan Beulich wrote: > >>> On 19.12.18 at 13:54, wrote: > On 19.12.18 at 12:55, wrote: > >> On Wed, Dec 19, 2018 at 04:05:57AM -0700, Jan Beulich wrote: > >>> In any event, both Andrew and I must have overlooked the one > >>> crucial place due to which the assertion is indeed wrong from > >>> put_page_from_l2e(): > >>> > >>> int rc = _put_page_type(pg, false, mfn_to_page(_mfn(pfn))); > >>> > >>> Not allowing for preemption there is fine if the L2E is pointing to > >>> an L1 table, but is now wrong if the L2E points to another L2, > >>> which surely is the case when you see the assertion trigger. > >> > >> Should we just change false to true here, or should the cases above be > >> handled differently ? > > > > Switching from false to true here is just the initial part of the > > necessary change - if you did just this, you'd end up hitting > > the ASSERT() right after the line above. There's quite a bit > > more to it, and it needs to be done pretty carefully. > > Actually there was no reason to alter the free_l2_table() paths > in the XSA-273 fixes: A switch to shadow mode can only occur > when validating page tables. Therefore I think you could safely > revert the respective hunks, which includes deleting the > ASSERT() you found triggering. You mean, Xen is not going to fix this ? -- Manuel Bouyer NetBSD: 26 ans d'experience feront toujours la difference -- ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] Xen 4.11.1 panic
>>> On 19.12.18 at 13:54, wrote: On 19.12.18 at 12:55, wrote: >> On Wed, Dec 19, 2018 at 04:05:57AM -0700, Jan Beulich wrote: >>> In any event, both Andrew and I must have overlooked the one >>> crucial place due to which the assertion is indeed wrong from >>> put_page_from_l2e(): >>> >>> int rc = _put_page_type(pg, false, mfn_to_page(_mfn(pfn))); >>> >>> Not allowing for preemption there is fine if the L2E is pointing to >>> an L1 table, but is now wrong if the L2E points to another L2, >>> which surely is the case when you see the assertion trigger. >> >> Should we just change false to true here, or should the cases above be >> handled differently ? > > Switching from false to true here is just the initial part of the > necessary change - if you did just this, you'd end up hitting > the ASSERT() right after the line above. There's quite a bit > more to it, and it needs to be done pretty carefully. Actually there was no reason to alter the free_l2_table() paths in the XSA-273 fixes: A switch to shadow mode can only occur when validating page tables. Therefore I think you could safely revert the respective hunks, which includes deleting the ASSERT() you found triggering. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] Xen 4.11.1 panic
On Wed, Dec 19, 2018 at 05:54:42AM -0700, Jan Beulich wrote: > >>> On 19.12.18 at 12:55, wrote: > > On Wed, Dec 19, 2018 at 04:05:57AM -0700, Jan Beulich wrote: > >> In any event, both Andrew and I must have overlooked the one > >> crucial place due to which the assertion is indeed wrong from > >> put_page_from_l2e(): > >> > >> int rc = _put_page_type(pg, false, mfn_to_page(_mfn(pfn))); > >> > >> Not allowing for preemption there is fine if the L2E is pointing to > >> an L1 table, but is now wrong if the L2E points to another L2, > >> which surely is the case when you see the assertion trigger. > > > > Should we just change false to true here, or should the cases above be > > handled differently ? > > Switching from false to true here is just the initial part of the > necessary change - if you did just this, you'd end up hitting > the ASSERT() right after the line above. There's quite a bit > more to it, and it needs to be done pretty carefully. OK, so I'll wait for a more complete patch :) -- Manuel Bouyer NetBSD: 26 ans d'experience feront toujours la difference -- ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] Xen 4.11.1 panic
>>> On 19.12.18 at 12:55, wrote: > On Wed, Dec 19, 2018 at 04:05:57AM -0700, Jan Beulich wrote: >> In any event, both Andrew and I must have overlooked the one >> crucial place due to which the assertion is indeed wrong from >> put_page_from_l2e(): >> >> int rc = _put_page_type(pg, false, mfn_to_page(_mfn(pfn))); >> >> Not allowing for preemption there is fine if the L2E is pointing to >> an L1 table, but is now wrong if the L2E points to another L2, >> which surely is the case when you see the assertion trigger. > > Should we just change false to true here, or should the cases above be > handled differently ? Switching from false to true here is just the initial part of the necessary change - if you did just this, you'd end up hitting the ASSERT() right after the line above. There's quite a bit more to it, and it needs to be done pretty carefully. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] Xen 4.11.1 panic
On Wed, Dec 19, 2018 at 04:05:57AM -0700, Jan Beulich wrote: > >>> On 18.12.18 at 23:19, wrote: > > I tried updating my NetBSD dom0 to 4.11.1 (from 4.11.0 with security > > patches), > > Hmm, the issue stems from the XSA-273 changes, so did you perhaps > mean "with some security patches", and you didn't have those ones > applied? Yes, for some reason XSA-273 isn't in the list of patches I had for Xen 4.11.0. I guess I forgot it ... > > > and on a 32bits PV domU shutdown I get (100% reproductible): > > (XEN) Assertion 'preemptible' failed at mm.c:2493 > [...] > > The line number above doesn't match any one with a respective > ASSERT() in plain 4.11.1. There are a few nearby ones, and hence > I can only guess that it's the one that was recently added (in > PGT_l2_page_table handling of free_page_type()). Can you confirm > this please with the exact sources you've used for your build? Here's what I have in xen/arch/x86/mm.c: 2486 switch ( type & PGT_type_mask ) 2487 { 2488 case PGT_l1_page_table: 2489 free_l1_table(page); 2490 rc = 0; 2491 break; 2492 case PGT_l2_page_table: 2493 ASSERT(preemptible); 2494 rc = free_l2_table(page); 2495 break; 2496 case PGT_l3_page_table: 2497 ASSERT(preemptible); 2498 rc = free_l3_table(page); 2499 break; 2500 case PGT_l4_page_table: 2501 ASSERT(preemptible); 2502 rc = free_l4_table(page); 2503 break; 2504 default: This is in free_page_type() > > In any event, both Andrew and I must have overlooked the one > crucial place due to which the assertion is indeed wrong from > put_page_from_l2e(): > > int rc = _put_page_type(pg, false, mfn_to_page(_mfn(pfn))); > > Not allowing for preemption there is fine if the L2E is pointing to > an L1 table, but is now wrong if the L2E points to another L2, > which surely is the case when you see the assertion trigger. Should we just change false to true here, or should the cases above be handled differently ? -- Manuel Bouyer NetBSD: 26 ans d'experience feront toujours la difference -- ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] Xen 4.11.1 panic
>>> On 18.12.18 at 23:19, wrote: > I tried updating my NetBSD dom0 to 4.11.1 (from 4.11.0 with security patches), Hmm, the issue stems from the XSA-273 changes, so did you perhaps mean "with some security patches", and you didn't have those ones applied? > and on a 32bits PV domU shutdown I get (100% reproductible): > (XEN) Assertion 'preemptible' failed at mm.c:2493 > (XEN) [ Xen-4.11.1nb0 x86_64 debug=y Tainted: C ] > (XEN) CPU:1 > (XEN) RIP:e008:[] free_page_type+0x232/0x790 > (XEN) RFLAGS: 00010246 CONTEXT: hypervisor (d0v0) > (XEN) rax: 4000 rbx: 4401 rcx: 4000 > (XEN) rdx: 8300 rsi: 4401 rdi: 82e004215260 > (XEN) rbp: 82e004215260 rsp: 83023704fab8 r8: > (XEN) r9: r10: 82e0 r11: 82e004226000 > (XEN) r12: r13: 8302135d9000 r14: 10ff > (XEN) r15: 1000 cr0: 8005003b cr4: 2660 > (XEN) cr3: 00022f0f6000 cr2: 7f7ff60ce7a0 > (XEN) fsb: 7f7ff7ff36c0 gsb: 80ca42c0 gss: > (XEN) ds: 003f es: 003f fs: gs: ss: e010 cs: e008 > (XEN) Xen code around (free_page_type+0x232/0x790): > (XEN) 05 00 00 45 85 e4 75 02 <0f> 0b 8b 45 18 85 c0 74 18 89 c0 48 c1 e0 0c > 49 > (XEN) Xen stack trace from rsp=83023704fab8: > (XEN)83023704fe38 00ec 4401 82e004215260 > (XEN)82e004215240 00ff 10ff 1000 > (XEN)82d08028b83d 00ff8300bedfc000 83023704 82d0 > (XEN)82e004215260 82e004215240 8302135d9000 00210a92 > (XEN)820040019000 0200 82d08028bedf 01ff > (XEN)82e004215240 82d08028b25e 8302 83023704 > (XEN)4401 82e004215240 82e004206c60 00ff > (XEN)10ff 1000 82d08028b83d 0102 > (XEN)83023704 83020001 82e004215240 82e004206c60 > (XEN) 820040015010 820040015000 > (XEN)82d08028af30 0002 82e004206c60 > (XEN)820040015010 82d08028b3d1 00210363 8302135d9000 > (XEN)6401 82e004206c60 00ff > (XEN)10ff 1000 82d08028b83d 01ff82d08022697f > (XEN)83023704 83020001 82e004206c60 83023704fd10 > (XEN)8302135d9028 8302135d9000 8302135d9020 82e004206c70 > (XEN)82d08028bf1f 82d080274e6b 83023701ec00 e401 > (XEN)83023704 8000 8302135d9000 > (XEN)8302135d9018 deadbeefdeadf00d 0001 7f7ff7b32004 > (XEN)82d080278f83 8302135d9000 7f7ff7b32004 82d080208b2d > (XEN) Xen call trace: > (XEN)[] free_page_type+0x232/0x790 > (XEN)[] mm.c#_put_page_type+0x14d/0x380 > (XEN)[] mm.c#put_page_from_l2e+0xdf/0x110 > (XEN)[] free_page_type+0x2fe/0x790 > (XEN)[] mm.c#_put_page_type+0x14d/0x380 > (XEN)[] mm.c#put_page_from_l3e+0x1a0/0x1d0 The line number above doesn't match any one with a respective ASSERT() in plain 4.11.1. There are a few nearby ones, and hence I can only guess that it's the one that was recently added (in PGT_l2_page_table handling of free_page_type()). Can you confirm this please with the exact sources you've used for your build? In any event, both Andrew and I must have overlooked the one crucial place due to which the assertion is indeed wrong from put_page_from_l2e(): int rc = _put_page_type(pg, false, mfn_to_page(_mfn(pfn))); Not allowing for preemption there is fine if the L2E is pointing to an L1 table, but is now wrong if the L2E points to another L2, which surely is the case when you see the assertion trigger. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] Xen 4.11.1 panic
Hello, I tried updating my NetBSD dom0 to 4.11.1 (from 4.11.0 with security patches), and on a 32bits PV domU shutdown I get (100% reproductible): (XEN) Assertion 'preemptible' failed at mm.c:2493 (XEN) [ Xen-4.11.1nb0 x86_64 debug=y Tainted: C ] (XEN) CPU:1 (XEN) RIP:e008:[] free_page_type+0x232/0x790 (XEN) RFLAGS: 00010246 CONTEXT: hypervisor (d0v0) (XEN) rax: 4000 rbx: 4401 rcx: 4000 (XEN) rdx: 8300 rsi: 4401 rdi: 82e004215260 (XEN) rbp: 82e004215260 rsp: 83023704fab8 r8: (XEN) r9: r10: 82e0 r11: 82e004226000 (XEN) r12: r13: 8302135d9000 r14: 10ff (XEN) r15: 1000 cr0: 8005003b cr4: 2660 (XEN) cr3: 00022f0f6000 cr2: 7f7ff60ce7a0 (XEN) fsb: 7f7ff7ff36c0 gsb: 80ca42c0 gss: (XEN) ds: 003f es: 003f fs: gs: ss: e010 cs: e008 (XEN) Xen code around (free_page_type+0x232/0x790): (XEN) 05 00 00 45 85 e4 75 02 <0f> 0b 8b 45 18 85 c0 74 18 89 c0 48 c1 e0 0c 49 (XEN) Xen stack trace from rsp=83023704fab8: (XEN)83023704fe38 00ec 4401 82e004215260 (XEN)82e004215240 00ff 10ff 1000 (XEN)82d08028b83d 00ff8300bedfc000 83023704 82d0 (XEN)82e004215260 82e004215240 8302135d9000 00210a92 (XEN)820040019000 0200 82d08028bedf 01ff (XEN)82e004215240 82d08028b25e 8302 83023704 (XEN)4401 82e004215240 82e004206c60 00ff (XEN)10ff 1000 82d08028b83d 0102 (XEN)83023704 83020001 82e004215240 82e004206c60 (XEN) 820040015010 820040015000 (XEN)82d08028af30 0002 82e004206c60 (XEN)820040015010 82d08028b3d1 00210363 8302135d9000 (XEN)6401 82e004206c60 00ff (XEN)10ff 1000 82d08028b83d 01ff82d08022697f (XEN)83023704 83020001 82e004206c60 83023704fd10 (XEN)8302135d9028 8302135d9000 8302135d9020 82e004206c70 (XEN)82d08028bf1f 82d080274e6b 83023701ec00 e401 (XEN)83023704 8000 8302135d9000 (XEN)8302135d9018 deadbeefdeadf00d 0001 7f7ff7b32004 (XEN)82d080278f83 8302135d9000 7f7ff7b32004 82d080208b2d (XEN) Xen call trace: (XEN)[] free_page_type+0x232/0x790 (XEN)[] mm.c#_put_page_type+0x14d/0x380 (XEN)[] mm.c#put_page_from_l2e+0xdf/0x110 (XEN)[] free_page_type+0x2fe/0x790 (XEN)[] mm.c#_put_page_type+0x14d/0x380 (XEN)[] mm.c#put_page_from_l3e+0x1a0/0x1d0 (XEN)[] free_page_type+0x471/0x790 (XEN)[] mm.c#_put_page_type+0x14d/0x380 (XEN)[] put_page_type_preemptible+0xf/0x10 (XEN)[] domain.c#relinquish_memory+0xab/0x460 (XEN)[] domain_relinquish_resources+0x203/0x290 (XEN)[] domain_kill+0xbd/0x150 (XEN)[] do_domctl+0x7d3/0x1a90 (XEN)[] do_physdev_op_compat+0/0x70 (XEN)[] do_domctl+0/0x1a90 (XEN)[] pv_hypercall+0x20b/0x440 (XEN)[] lstar_enter+0xa2/0x120 (XEN)[] lstar_enter+0xae/0x120 (XEN)[] lstar_enter+0xa2/0x120 (XEN)[] lstar_enter+0xae/0x120 (XEN)[] lstar_enter+0xa2/0x120 (XEN)[] lstar_enter+0xae/0x120 (XEN)[] lstar_enter+0x110/0x120 (XEN) (XEN) (XEN) (XEN) Panic on CPU 1: (XEN) Assertion 'preemptible' failed at mm.c:2493 (XEN) (XEN) (XEN) Reboot in five seconds... -- Manuel Bouyer NetBSD: 26 ans d'experience feront toujours la difference -- ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel