I understand the downside of a badly written realtime app. In my case the
application runs in userspace without making many syscalls, and by all means it
is a well-behaved application. Yes, I can wire memory and change the application
to use a mutex instead of a spinlock, and those changes should help, but they are
still working around the problem. I still believe the kernel should not lower the
realtime priority when blocking on resources. This can lead to priority
inversion, especially since these threads run at fixed priorities and the kernel
doesn't muck with them.
 
As you suggested, _sleep() should not adjust the priorities for realtime
threads.

Thanks,
Sushanth

--- On Thu, 4/5/12, John Baldwin <j...@freebsd.org> wrote:

> From: John Baldwin <j...@freebsd.org>
> Subject: Re: Startvation of realtime piority threads
> To: freebsd-hackers@freebsd.org, davi...@freebsd.org
> Date: Thursday, April 5, 2012, 9:01 AM
> On Thursday, April 05, 2012 1:07:55 am David Xu wrote:
> > On 2012/4/5 11:56, Konstantin Belousov wrote:
> > > On Wed, Apr 04, 2012 at 06:54:06PM -0700, Sushanth Rai wrote:
> > >> I have a multithreaded user space program that basically runs at
> > >> realtime priority. Synchronization between threads is done using a
> > >> spinlock. When running this program on an SMP system under heavy memory
> > >> pressure, I see that the thread holding the spinlock is starved of CPU.
> > >> The CPUs are effectively consumed by other threads that are spinning
> > >> for the lock to become available.
> > >>
> > >> After instrumenting the kernel a little bit, what I found was that
> > >> under memory pressure, when the user thread holding the spinlock traps
> > >> into the kernel due to a page fault, that thread sleeps until free
> > >> pages are available. The thread sleeps at PUSER priority (within
> > >> vm_waitpfault()). When it is ready to run, it is queued at PUSER
> > >> priority even though its base priority is realtime. The other sibling
> > >> threads that are spinning at realtime priority to acquire the spinlock
> > >> starve the owner of the spinlock.
> > >>
> > >> I was wondering if the sleep in vm_waitpfault() should be a
> > >> MAX(td_user_pri, PUSER) instead of just PUSER. I'm running on 7.2 and
> > >> it looks like this logic is the same in the trunk.
> > > It just so happens that your program stumbles upon a single sleep point
> > > in the kernel. If for whatever reason the thread in the kernel is put
> > > off CPU due to failure to acquire any resource without priority
> > > propagation, you would get the same effect. Only blockable primitives
> > > do priority propagation, that is, mutexes and rwlocks, AFAIR. In other
> > > words, any sx/lockmgr/sleep points are vulnerable to the same issue.
> > This is why I suggested that POSIX realtime priority should not be
> > boosted; it should be only higher than PRI_MIN_TIMESHARE but lower than
> > any priority that msleep() callers provide. The problem is that a
> > userland realtime thread's busy-looping code can starve a thread in the
> > kernel which is holding a critical resource. In the kernel we can avoid
> > writing dead-loop code, but userland code is not trustable.
> 
> Note that you have to be root to be rtprio, and that there is trustable
> userland code (just because you haven't used any doesn't mean it doesn't
> exist).
> 
> > If you search for "Realtime thread priorities" in December 2010 on the
> > @arch list, you may find the argument.
> 
> I think the bug here is that sched_sleep() should not lower the priority of
> an rtprio process.  It should arguably not raise the priority of an idprio
> process either, but sched_sleep() should probably only apply to timesharing
> threads.
> 
> All that said, userland rtprio code is going to have to be careful.  It
> should be using things like wired memory as Kostik suggested, and probably
> avoiding most system calls.  You can definitely blow your foot off quite
> easily in lots of ways with rtprio.
> 
> -- 
> John Baldwin
> _______________________________________________
> freebsd-hackers@freebsd.org
> mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
> To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"
>
