-----Original Message-----
From: Andy Lutomirski [mailto:l...@amacapital.net] 
Sent: Wednesday, March 26, 2014 7:40 PM
To: Davis, Bud @ SSG - Link; umgwanakikb...@gmail.com
Cc: oneu...@suse.de; artem_fetis...@epam.com; pet...@infradead.org; 
kosaki.motoh...@jp.fujitsu.com; linux-kernel@vger.kernel.org
Subject: Re: Bug 71331 - mlock yields processor to lower priority process

On 03/21/2014 07:50 AM, jimmie.da...@l-3com.com wrote:
> 
> ________________________________________
> From: Mike Galbraith [umgwanakikb...@gmail.com]
> Sent: Friday, March 21, 2014 9:41 AM
> To: Davis, Bud @ SSG - Link
> Cc: oneu...@suse.de; artem_fetis...@epam.com; pet...@infradead.org; 
> kosaki.motoh...@jp.fujitsu.com; linux-kernel@vger.kernel.org
> Subject: RE: Bug 71331 - mlock yields processor to lower priority process
> 
> On Fri, 2014-03-21 at 14:01 +0000, jimmie.da...@l-3com.com wrote:
> 
>> If you call mlock () from a SCHED_FIFO task, you expect it to return
>> when done.  You don't expect it to block, and your task to be
>> pre-empted.
> 
> Say some of your pages are sitting in an nfs swapfile orbiting Neptune,
> how do they get home, and what should we do meanwhile?
> 
> -Mike
> 
> Two options.
> 
> #1. Return with a status value of EAGAIN.
> 
> or 
> 
> #2.  Don't return until you can do it.
> 
> If SCHED_FIFO is used, and mlock() is called, the intention of the user is 
> very clear.  Run this task until
> it is completed or it blocks (and until a bit ago, mlock() did not block).
> 
> SCHED_FIFO users don't care about fairness.  They want the system to do what 
> it is told.

I use mlock in real-time processes, but I do it in a separate thread.

Seriously, though, what do you expect the kernel to do?  When you call
mlock on a page that isn't present, the kernel will *read* that page.
mlock will, therefore, block until the IO finishes.

Some time around 3.9, the behavior changed a little bit: IIRC mlock used
to hold mmap_sem while sleeping.  Or maybe just mmap with MCL_FUTURE did
that.  In any case, the mlock code is less lock-happy than it was.  Is
it possible that you have two threads, and the non-mlock-calling thread
got blocked behind mlock, so it looked better?

--Andy

===================================================================================================================


Andy,

The example code attached to the bugzilla entry (there is a reference a few 
messages back in this thread) shows the problem.

Two threads, TaskA (high priority) and TaskB (low priority), are assigned to 
the same processor, explicitly for the guarantee that only one of them can 
execute at a time.  TaskA becomes eligible to run.  As part of its processing 
(which normally ends with a call to sem_wait()), it calls mlock().  TaskA then 
blocks, and TaskB begins running.  But wait: the system is designed so that 
TaskA runs until it is done (thus SCHED_FIFO and a priority higher than 
TaskB's).  Instead TaskA, the higher-priority task, is suspended and TaskB 
starts running.  In the code that led me on this endeavor :) {consisting of a 
lot of Ada threads}, the result was a segfault caused by data that TaskA had 
only half processed.
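
For readers without the bugzilla attachment at hand, here is a minimal sketch 
of the scheduling setup described above.  It is not the attached test case: 
the helper name make_fifo_attr and the priority values are made up for 
illustration, and pinning both threads to one CPU is left out because it needs 
a glibc-specific call.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>

/* Hypothetical helper: prepare attributes for a SCHED_FIFO thread at the
 * given priority.  Returns 0 on success or an errno-style code.
 * Preparing the attributes needs no privilege; actually creating a thread
 * with them needs CAP_SYS_NICE or a suitable RLIMIT_RTPRIO. */
int make_fifo_attr(pthread_attr_t *attr, int priority)
{
    struct sched_param sp = { .sched_priority = priority };
    int rc;

    assert(attr != NULL);
    if ((rc = pthread_attr_init(attr)) != 0)
        return rc;
    /* Without EXPLICIT_SCHED the new thread inherits the creator's policy. */
    if ((rc = pthread_attr_setinheritsched(attr, PTHREAD_EXPLICIT_SCHED)) != 0)
        return rc;
    /* Set the policy before the priority so the priority is checked
     * against the SCHED_FIFO range. */
    if ((rc = pthread_attr_setschedpolicy(attr, SCHED_FIFO)) != 0)
        return rc;
    return pthread_attr_setschedparam(attr, &sp);
}
```

TaskA would be created with, say, make_fifo_attr(&attr_a, 20) and TaskB with 
make_fifo_attr(&attr_b, 10).  With both on the same CPU, TaskB should only run 
while TaskA is blocked, which is exactly the assumption the mlock() change 
breaks.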

This is what I call 'blocking': the thread is no longer running and the 
scheduler puts someone else on the processor.  I don't mean 'takes a long time 
until it returns'.  Taking a long time is fine; the system design relies on 
priority-based scheduling and CPU affinity to ensure ordered access to 
application data.

mlock() now blocks.  I don't care how long mlock() takes; what I care about is 
the lower-priority process pre-empting me.  Only a limited number of syscalls 
block; those that do are documented and usually offer a way to choose between 
blocking and non-blocking behavior.

Can I change the system to deal with mlock() being a blocking syscall?  Yes, 
but this is a situation where working code that conforms to the documented API 
has stopped working.
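
One possible shape for such a change, along the lines of Andy's "do it in a 
separate thread", is sketched below.  The names lock_request, lock_start and 
lock_finished are hypothetical, and the sketch assumes the helper thread gets 
CPU time somewhere (for example on another CPU, or before the time-critical 
phase starts); it is not a drop-in fix for the bugzilla test case.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/mman.h>

struct lock_request {
    void      *addr;
    size_t     len;
    int        err;    /* valid once done != 0: 0, or errno from mlock() */
    atomic_int done;
    pthread_t  tid;    /* join this at shutdown */
};

/* Runs in the helper thread; this is the thread that may sleep on I/O. */
static void *lock_worker(void *arg)
{
    struct lock_request *req = arg;

    req->err = (mlock(req->addr, req->len) == 0) ? 0 : errno;
    atomic_store(&req->done, 1);   /* err is written before done is set */
    return NULL;
}

/* Hand the mlock() call to a helper thread.  Returns 0 or an errno-style
 * code from pthread_create(). */
int lock_start(struct lock_request *req, void *addr, size_t len)
{
    assert(req != NULL);
    req->addr = addr;
    req->len  = len;
    req->err  = 0;
    atomic_init(&req->done, 0);
    return pthread_create(&req->tid, NULL, lock_worker, req);
}

/* Non-blocking check the real-time thread can make before touching the
 * memory it wants locked. */
int lock_finished(struct lock_request *req)
{
    return atomic_load(&req->done);
}
```

The real-time thread never calls mlock() itself, so it cannot be put to sleep 
by it; it only polls lock_finished() and reads req.err once that returns 
non-zero.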

Thanks for looking at it.

Regards,
Bud Davis
