Re: [Xen-devel] A good way to speed up the xl destroy time(guest page scrubbing)

2014-12-08 Thread Jan Beulich
>>> On 08.12.14 at 10:28,  wrote:

> On 12/08/2014 04:34 PM, Jan Beulich wrote:
> On 07.12.14 at 14:43,  wrote:
>> 
>>> On 12/05/2014 08:24 PM, Jan Beulich wrote:
>>> On 05.12.14 at 11:00,  wrote:
> 5. Potential workaround
> 5.1 Use per-cpu list in idle_loop()
> Delist a batch of pages from heap_list to a per-cpu list, then scrub the
> per-cpu list and free the pages back to heap_list.
>
> But Jan disagreed with this solution:
> "You should really drop the idea of removing pages temporarily.
> All you need to do is make sure a page being allocated and getting
> simultaneously scrubbed by another CPU won't get passed to the
> caller until the scrubbing finished."

 So you don't mention any downsides to this approach. If there are
 any, please name them. If there aren't, what's the reason not to
 go this route?
>>>
>>> The reason is that what you suggested was not very specific; I still have
>>> no idea how to implement a patch that can "make sure a page being
>>> allocated and getting simultaneously scrubbed by another CPU won't get
>>> passed to the caller until the scrubbing finished".
>> 
>> The scrubbing code would need to mark the page, and the allocation
>> code would need to spin on such marked pages until the mark clears.
>> 
> 
> Thanks a lot, that's much clearer!
> Then do you think it is safe to iterate the heap list without a spinlock
> in the scrubbing code?
> 
> Konrad also suggested a similar approach, which was to skip marked pages
> (instead of spinning) in the allocator, but I always got a panic during
> page_list_for_each(&heap_list) in the scrubbing code when not locking
> the heap list.
> The panic happened in page_list_next(); I think that's because the
> alloc/free path modified the heap list.

And that already answers your question above: no, it's not safe.
Switching to a read/write lock wouldn't necessarily be safe either, so this
won't work without some other helper constructs.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] A good way to speed up the xl destroy time(guest page scrubbing)

2014-12-08 Thread Bob Liu

On 12/08/2014 04:34 PM, Jan Beulich wrote:
 On 07.12.14 at 14:43,  wrote:
> 
>> On 12/05/2014 08:24 PM, Jan Beulich wrote:
>> On 05.12.14 at 11:00,  wrote:
 5. Potential workaround
 5.1 Use per-cpu list in idle_loop()
 Delist a batch of pages from heap_list to a per-cpu list, then scrub the
 per-cpu list and free the pages back to heap_list.

 But Jan disagreed with this solution:
 "You should really drop the idea of removing pages temporarily.
 All you need to do is make sure a page being allocated and getting
 simultaneously scrubbed by another CPU won't get passed to the
 caller until the scrubbing finished."
>>>
>>> So you don't mention any downsides to this approach. If there are
>>> any, please name them. If there aren't, what's the reason not to
>>> go this route?
>>
>> The reason is that what you suggested was not very specific; I still have
>> no idea how to implement a patch that can "make sure a page being
>> allocated and getting simultaneously scrubbed by another CPU won't get
>> passed to the caller until the scrubbing finished".
> 
> The scrubbing code would need to mark the page, and the allocation
> code would need to spin on such marked pages until the mark clears.
> 

Thanks a lot, that's much clearer!
Then do you think it is safe to iterate the heap list without a spinlock
in the scrubbing code?

Konrad also suggested a similar approach, which was to skip marked pages
(instead of spinning) in the allocator, but I always got a panic during
page_list_for_each(&heap_list) in the scrubbing code when not locking
the heap list.
The panic happened in page_list_next(); I think that's because the
alloc/free path modified the heap list.

-- 
Regards,
-Bob



Re: [Xen-devel] A good way to speed up the xl destroy time(guest page scrubbing)

2014-12-08 Thread Jan Beulich
>>> On 07.12.14 at 14:43,  wrote:

> On 12/05/2014 08:24 PM, Jan Beulich wrote:
> On 05.12.14 at 11:00,  wrote:
>>> 5. Potential workaround
>>> 5.1 Use per-cpu list in idle_loop()
>>> Delist a batch of pages from heap_list to a per-cpu list, then scrub the
>>> per-cpu list and free the pages back to heap_list.
>>>
>>> But Jan disagreed with this solution:
>>> "You should really drop the idea of removing pages temporarily.
>>> All you need to do is make sure a page being allocated and getting
>>> simultaneously scrubbed by another CPU won't get passed to the
>>> caller until the scrubbing finished."
>> 
>> So you don't mention any downsides to this approach. If there are
>> any, please name them. If there aren't, what's the reason not to
>> go this route?
> 
> The reason is that what you suggested was not very specific; I still have
> no idea how to implement a patch that can "make sure a page being
> allocated and getting simultaneously scrubbed by another CPU won't get
> passed to the caller until the scrubbing finished".

The scrubbing code would need to mark the page, and the allocation
code would need to spin on such marked pages until the mark clears.

Jan




Re: [Xen-devel] A good way to speed up the xl destroy time(guest page scrubbing)

2014-12-07 Thread Bob Liu

On 12/05/2014 08:24 PM, Jan Beulich wrote:
 On 05.12.14 at 11:00,  wrote:
>> 5. Potential workaround
>> 5.1 Use per-cpu list in idle_loop()
>> Delist a batch of pages from heap_list to a per-cpu list, then scrub the
>> per-cpu list and free the pages back to heap_list.
>>
>> But Jan disagreed with this solution:
>> "You should really drop the idea of removing pages temporarily.
>> All you need to do is make sure a page being allocated and getting
>> simultaneously scrubbed by another CPU won't get passed to the
>> caller until the scrubbing finished."
> 
> So you don't mention any downsides to this approach. If there are
> any, please name them. If there aren't, what's the reason not to
> go this route?

The reason is that what you suggested was not very specific; I still have
no idea how to implement a patch that can "make sure a page being
allocated and getting simultaneously scrubbed by another CPU won't get
passed to the caller until the scrubbing finished".

Thanks,
-Bob



Re: [Xen-devel] A good way to speed up the xl destroy time(guest page scrubbing)

2014-12-05 Thread Jan Beulich
>>> On 05.12.14 at 11:00,  wrote:
> 5. Potential workaround
> 5.1 Use per-cpu list in idle_loop()
> Delist a batch of pages from heap_list to a per-cpu list, then scrub the
> per-cpu list and free the pages back to heap_list.
> 
> But Jan disagreed with this solution:
> "You should really drop the idea of removing pages temporarily.
> All you need to do is make sure a page being allocated and getting
> simultaneously scrubbed by another CPU won't get passed to the
> caller until the scrubbing finished."

So you don't mention any downsides to this approach. If there are
any, please name them. If there aren't, what's the reason not to
go this route?

Jan




[Xen-devel] A good way to speed up the xl destroy time(guest page scrubbing)

2014-12-05 Thread Bob Liu
Hey folks,

In recent months I've been working on speeding up the 'xl destroy' time for
Xen guests with large RAM, but there is still no good solution yet.

I'm looking forward to more suggestions and appreciate all of your input.

(1) The problem
When running 'xl destroy' on a guest with large memory, we have to wait a
long time (~10 minutes for a guest with 1TB of memory). Most of the time is
spent on page scrubbing: every page needs to be scrubbed before it is freed
to the heap_list (the buddy system).

(2) The approach I've tried
1. When freeing a page to the buddy system, only mark it with a new flag
'need_scrub' instead of scrubbing it, so 'xl destroy' can return quickly.

2. Use all idle CPUs to do the real page scrubbing in parallel, in:
static void idle_loop(void)
{
    /* iterate the &heap_list and scrub any 'need_scrub' page */
}

3. Also, in the alloc_heap_page() path, 'need_scrub' pages can be
allocated and scrubbed synchronously. (If 'need_scrub' pages were skipped,
'xl create' for a new guest might fail when the system is busy, since no
idle CPUs could finish the scrubbing in time.)

4. Problem with this approach: lock contention
The &heap_list is protected by heap_lock, which is a spinlock. The
alloc/free paths may modify the heap list at any time with heap_lock held.

idle_loop() needs to iterate the heap list for every page it scrubs (it
won't modify the list, but will scrub page contents), so there is heavy
lock contention that slows down the alloc/free paths.

5. Potential workaround
5.1 Use per-cpu list in idle_loop()
Delist a batch of pages from heap_list to a per-cpu list, then scrub the
per-cpu list and free the pages back to heap_list.

But Jan disagreed with this solution:
"You should really drop the idea of removing pages temporarily.
All you need to do is make sure a page being allocated and getting
simultaneously scrubbed by another CPU won't get passed to the
caller until the scrubbing finished."

Another reason is that it's hard to say how many pages should be delisted
to the per-cpu list.

5.2 Use more page flags
Konrad suggested using more page flags and considering the 'cmpxchg'
instruction instead of a spinlock for idle_loop() to iterate the
&heap_list. But 'cmpxchg' is only suitable for protecting the state of a
single page; it's difficult to protect against the various race conditions
on a list.

(3) Other solutions for speeding up page scrubbing
1. George suggested:
* Have a "clean" freelist and a "dirty" freelist
* When destroying a domain, simply move pages to the dirty freelist
* Have idle vcpus scrub the dirty freelist before going to sleep
 - ...

* In alloc_domheap_pages():
 - If there are pages on the "clean" freelist, allocate them
 - If there are no pages on the "clean" freelist but there are on the
"dirty" freelist, scrub pages from the "dirty" freelist synchronously.

But lock contention is still a problem, and may be worse with two lists.

2. Delay page scrubbing to the page fault path
This means a 'need_scrub' page won't be scrubbed until the page table
mapping is set up in the page fault path. This is a popular approach under
Linux. But Konrad mentioned this approach is not suitable for Windows
guests, because Windows accesses every page during boot, so the Windows
boot time might be slowed down.

Any better ideas are welcome, and thanks again for your patience in
reading this long email.

-- 
Regards,
-Bob
