Re: [PATCH 2/2 v2] sched/wait: Introduce lock breaker in wake_up_page_bit

Christopher Lameter Thu, 14 Sep 2017 09:40:18 -0700

On Wed, 13 Sep 2017, Tim Chen wrote:

> Here's what the customer thinks happened and is willing to tell us.
> They have a parent process that spawns 10 children per core and
> kicks them off to run. The child processes all access a common library.
> We have 384 cores, so 3840 child processes are running. When migration
> occurs on a page in the common library, the first child that accesses the
> page will page fault and lock the page, with the other children also page
> faulting quickly and piling up in the page wait list until the first
> child is done.


I think we need some way to avoid migration in cases like this. This is
crazy. Page migration was not written to deal with something like this.
