Re: [Xen-devel] [PATCH v2] mm/page_alloc: make bootscrub happen in idle-loop

Sergey Dyasli Fri, 09 Nov 2018 00:50:32 -0800

On 08/11/2018 15:18, Roger Pau Monné wrote:
> On Thu, Nov 08, 2018 at 02:48:40PM +0000, Sergey Dyasli wrote:
>> (CCing Roger)
>>
>> On 08/11/2018 11:07, Andrew Cooper wrote:
>>> On 08/11/18 10:31, Jan Beulich wrote:
>>>>>>> On 07.11.18 at 19:20, <andrew.coop...@citrix.com> wrote:
>>>>> On 09/10/18 16:21, Sergey Dyasli wrote:
>>>>>> Scrubbing RAM during boot may take a long time on machines with lots
>>>>>> of RAM. Add 'idle' option to bootscrub which marks all pages dirty
>>>>>> initially so they will eventually be scrubbed in idle-loop on every
>>>>>> online CPU.
>>>>>>
>>>>>> It's guaranteed that the allocator will return scrubbed pages by doing
>>>>>> eager scrubbing during allocation (unless MEMF_no_scrub was provided).
>>>>>>
>>>>>> Use the new 'idle' option as the default one.
>>>>>>
>>>>>> Signed-off-by: Sergey Dyasli <sergey.dya...@citrix.com>
>>>>> This patch reliably breaks boot, although it's not immediately obvious how:
>>>>>
>>>>> (d9) (XEN) mcheck_poll: Machine check polling timer started.
>>>>> (d9) (XEN) xenoprof: Initialization failed. Intel processor family 6 model 60 is not supported
>>>>> (d9) (XEN) Dom0 has maximum 400 PIRQs
>>>>> (d9) (XEN) ----[ Xen-4.12-unstable  x86_64  debug=y   Not tainted ]----
>>>>> (d9) (XEN) CPU:    0
>>>>> (d9) (XEN) RIP:    e008:[<ffff82d080440ddb>] setup.c#cmdline_cook+0x1d/0x77
>>>>> (d9) (XEN) RFLAGS: 0000000000010282   CONTEXT: hypervisor
>>>>> (d9) (XEN) rax: ffff82d080406bdc   rbx: ffff8300c2c2c2c2   rcx: 0000000000000000
>>>>> (d9) (XEN) rdx: 00000007c7ffffff   rsi: ffff83000045c24b   rdi: ffff83000045c24b
>>>>> (d9) (XEN) rbp: ffff82d0804b7da8   rsp: ffff82d0804b7d98   r8:  ffff83003f057000
>>>>> (d9) (XEN) r9:  7fffffffffffffff   r10: 0000000000000000   r11: 0000000000000001
>>>>> (d9) (XEN) r12: ffff83003f0d8100   r13: 0000000000000000   r14: ffff82d0805f33d0
>>>>> (d9) (XEN) r15: 0000000000000002   cr0: 000000008005003b   cr4: 00000000001526e0
>>>>> (d9) (XEN) cr3: 000000003fea7000   cr2: ffff8300c2c2c2c2
>>>>> (d9) (XEN) fsb: 0000000000000000   gsb: 0000000000000000   gss: 0000000000000000
>>>>> (d9) (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
>>>>> (d9) (XEN) Xen code around <ffff82d080440ddb> (setup.c#cmdline_cook+0x1d/0x77):
>>>>> (d9) (XEN)  05 5e fc ff 48 0f 44 d8 <80> 3b 20 75 09 48 83 c3 01 80 3b 20 74 f7 80 3d
>>>>> (d9) (XEN) Xen stack trace from rsp=ffff82d0804b7d98:
>>>>> [...]
>>>>> (d9) (XEN) Xen call trace:
>>>>> (d9) (XEN)    [<ffff82d080440ddb>] setup.c#cmdline_cook+0x1d/0x77
>>>>> (d9) (XEN)    [<ffff82d080443b7f>] __start_xen+0x259c/0x292d
>>>>> (d9) (XEN)    [<ffff82d0802000f3>] __high_start+0x53/0x55
>>>> That's apparently the 2nd cmdline_cook() invocation, when producing
>>>> the Dom0 command line. I would suppose what "loader" points to has
>>>> been scrubbed by the time we get there (with synchronous scrubbing,
>>>> APs wouldn't be able to get going with this before reaching
>>>> heap_init_late()).
>>>
>>> This is via a PVH boot (like a lot of my development work), and does
>>> look to be a latent use-after-free.  Dropping the VM down to a single
>>> vcpu causes the problem to go away.
>>>
>>> Sergey is kindly investigating.
>>
>> Yes, this seems to be a bug in Xen PVH boot path. From the serial:
>>
>> (XEN) == mbi->mods_addr 0x46dce0
>>
>> which is marked as usable in e820:
>>
>> (XEN) PVH-e820 RAM map:
>> (XEN)  0000000000000000 - 00000000000a0000 (usable)
>> (XEN)  0000000000100000 - 0000000040000400 (usable)
>> (XEN)  00000000fc000000 - 00000000fc009040 (ACPI data)
>> (XEN)  00000000feff8000 - 00000000feffc000 (reserved)
>> (XEN)  00000000feffc000 - 00000000feffd000 (usable)
>> (XEN)  00000000feffd000 - 00000000ff000000 (reserved)
>>
>> This memory is then given to the allocator and scrubbed by secondary
>> CPUs, which leads to a use-after-free. Even with the cmdline issue fixed,
>> another FATAL PAGE FAULT occurs further down the boot path:
> 
> Right, shouldn't the scrub be started after Dom0 has been constructed?
> I would say the scrubbing should be started at the same time as
> before, which is just before jumping into Dom0 entry point IIRC?


No, this would only mask the issue again. Although it's unlikely, the
allocator might hand the memory holding the modules to someone else while
it is still in use, which can lead to silent memory corruption. Modules
are supposed to be freed by discard_initial_images(), which is already
called by pvh_load_kernel().
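
To make the overlap from the serial log concrete, here is a minimal
standalone sketch. It is not Xen code: the struct, enum and function names
are hypothetical, and it assumes the end address printed in the RAM map is
exclusive. It only checks whether an address such as mbi->mods_addr falls
inside a region the e820 map reports as usable, i.e. memory the allocator
is free to take and scrub.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified types; these are NOT Xen's e820 structures. */
enum region_type { REGION_USABLE, REGION_ACPI_DATA, REGION_RESERVED };

struct region {
    uint64_t start;          /* first byte of the region */
    uint64_t end;            /* assumed exclusive, as printed in the log */
    enum region_type type;
};

/* The PVH-e820 RAM map quoted from the serial log in this thread. */
static const struct region pvh_e820[] = {
    { 0x0000000000000000ULL, 0x00000000000a0000ULL, REGION_USABLE    },
    { 0x0000000000100000ULL, 0x0000000040000400ULL, REGION_USABLE    },
    { 0x00000000fc000000ULL, 0x00000000fc009040ULL, REGION_ACPI_DATA },
    { 0x00000000feff8000ULL, 0x00000000feffc000ULL, REGION_RESERVED  },
    { 0x00000000feffc000ULL, 0x00000000feffd000ULL, REGION_USABLE    },
    { 0x00000000feffd000ULL, 0x00000000ff000000ULL, REGION_RESERVED  },
};

#define PVH_E820_NR (sizeof(pvh_e820) / sizeof(pvh_e820[0]))

/* Return true if addr lies inside any region marked usable. */
static bool addr_in_usable_ram(const struct region *map, size_t nr,
                               uint64_t addr)
{
    assert(map != NULL || nr == 0);

    for ( size_t i = 0; i < nr; i++ )
        if ( map[i].type == REGION_USABLE &&
             addr >= map[i].start && addr < map[i].end )
            return true;

    return false;
}
```

Calling addr_in_usable_ram(pvh_e820, PVH_E820_NR, 0x46dce0) returns true:
the address sits in the second usable range, so nothing in the map itself
keeps the allocator away from the module list.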

--
Sergey

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
