On 01/18/2013 04:38 PM, Jan Kiszka wrote:
> This fixes a nasty bug on SMP boxes: We may migrate to root in the
> context of an IRQ handler, and then also to a different CPU. Therefore,
> we must not use domain contexts read before the invocation but update
> them afterward or use stable information like the domain reference.
>
> Signed-off-by: Jan Kiszka <[email protected]>
> ---
>
> We are still facing stalled, unkillable RT processes despite this fix,
> but at least the head domain status corruption (and related warnings)
> seems to be gone now.
>
> kernel/ipipe/core.c | 5 +++--
> 1 files changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/ipipe/core.c b/kernel/ipipe/core.c
> index 68af0b3..6aa9572 100644
> --- a/kernel/ipipe/core.c
> +++ b/kernel/ipipe/core.c
> @@ -1173,18 +1173,19 @@ static void dispatch_irq_head(unsigned int irq) /* hw
> interrupts off */
> head->irqs[irq].handler(irq, head->irqs[irq].cookie);
> __ipipe_run_irqtail(irq);
> hard_local_irq_disable();
> + p = ipipe_this_cpu_head_context();
> __clear_bit(IPIPE_STALL_FLAG, &p->status);
>
> /* Are we still running in the head domain? */
> if (likely(__ipipe_current_context == p)) {
> /* Did we enter this code over the head domain? */
> - if (old == p) {
> + if (old->domain == head) {
> /* Yes, do immediate synchronization. */
> if (__ipipe_ipending_p(p))
> __ipipe_sync_stage();
> return;
> }
> - __ipipe_set_current_context(old);
> + __ipipe_set_current_context(ipipe_this_cpu_root_context());
> }
>
> /*
Hi Jan,
I may have an issue reported which resembles the problem fixed by this
patch. If I understand your patch, it means that an irq handler may
migrate domain. But if I read __ipipe_sync_stage() code:
void __ipipe_do_sync_stage(void)
{
struct ipipe_percpu_domain_data *p;
struct ipipe_domain *ipd;
int irq;
p = __ipipe_current_context;
ipd = p->domain;
__set_bit(IPIPE_STALL_FLAG, &p->status);
smp_wmb();
if (ipd == ipipe_root_domain)
trace_hardirqs_off();
for (;;) {
irq = __ipipe_next_irq(p);
if (irq < 0)
break;
/*
* Make sure the compiler does not reorder wrongly, so
* that all updates to maps are done before the
* handler gets called.
*/
barrier();
if (test_bit(IPIPE_LOCK_FLAG, &ipd->irqs[irq].control))
continue;
if (ipd != ipipe_head_domain)
hard_local_irq_enable();
if (likely(ipd != ipipe_root_domain)) {
ipd->irqs[irq].handler(irq, ipd->irqs[irq].cookie);
__ipipe_run_irqtail(irq);
hard_local_irq_disable();
} else if (ipipe_virtual_irq_p(irq)) {
irq_enter();
ipd->irqs[irq].handler(irq, ipd->irqs[irq].cookie);
irq_exit();
root_stall_after_handler();
hard_local_irq_disable();
while (__ipipe_check_root_resched())
__ipipe_preempt_schedule_irq();
} else {
ipd->irqs[irq].handler(irq, ipd->irqs[irq].cookie);
root_stall_after_handler();
hard_local_irq_disable();
}
p = __ipipe_current_context;
}
if (ipd == ipipe_root_domain)
trace_hardirqs_on();
__clear_bit(IPIPE_STALL_FLAG, &p->status);
}
It seems to me that at the end of each loop, we re-read p, but we assume
that p->domain did not change. So I would add:
ipd = p->domain
right after the line
p = __ipipe_current_context
But I am not sure that we are supposed to allow irq handlers to migrate
domains.
Regards.
--
Gilles.
_______________________________________________
Xenomai mailing list
[email protected]
http://www.xenomai.org/mailman/listinfo/xenomai