* Ingo Molnar <[EMAIL PROTECTED]> wrote:

> 
> * Ingo Molnar <[EMAIL PROTECTED]> wrote:
> 
> > hm, another thing: i think call_rcu() needs to take the read-lock.
> > Right now it assumes that it has the data structure private, but
> > that's only statistically true on PREEMPT_RT: another CPU may have
> > this CPU's RCU control structure in use. So IRQs-off (or preempt-off)
> > is not a guarantee to have the data structure, the read lock has to be
> > taken.
> 
> i've reworked the code to use the read-lock to access the per-CPU data
> RCU structures, and it boots with 2 CPUs and PREEMPT_RT now. The
> -40-05 patch can be downloaded from the usual place:

bah, it's leaking dentries at a massive scale. I'm giving up on this
variant for the time being and have gone towards a much simpler variant,
implemented in the -40-07 patch at:

   http://redhat.com/~mingo/realtime-preempt/

it's along the lines of Esben's patch, but with the conceptual bug fixed
via the rcu_read_lock_nesting code from Paul's patch.

there's a new CONFIG_PREEMPT_RCU option (always enabled on PREEMPT_RT).
It builds & boots fine on my 2-way box, doesn't leak dentries, and
networking is up and running.

first question (ignoring the grace-period problem): is this a correct
RCU implementation? The critical scenario is when a task gets migrated
to another CPU, so that current->rcu_data is that of another CPU. That
is why ->active_readers is an atomic variable now. [ Note that while
->active_readers may be decreased from another CPU, it is always
increased on the current CPU, so when a preemption-off section
determines that a quiescent state has passed, that determination stays
true until it enables preemption again. This is needed for correct
callback processing. ]

this implementation has the 'long grace periods' problem. Starvation
should only be possible if the system has zero idle time for a long
period, and even then it requires the permanent starvation of
involuntarily preempted rcu-read-locked tasks. Is there any way to
force such a situation? (which would turn this into a DoS)

[ in OOM situations we could force a quiescent state by walking all
tasks, checking for nonzero ->rcu_read_lock_nesting values, and
priority-boosting the affected tasks (to RT prio 99 or RT prio 1) --
a boost they would automatically drop when they decrease their
rcu_read_lock_nesting counter to zero. ]

        Ingo