On Tue, 2011-07-12 at 14:57 +0200, Jan Kiszka wrote: > On 2011-07-12 14:13, Jan Kiszka wrote: > > On 2011-07-12 14:06, Gilles Chanteperdrix wrote: > >> On 07/12/2011 01:58 PM, Jan Kiszka wrote: > >>> On 2011-07-12 13:56, Jan Kiszka wrote: > >>>> However, this parallel unsynchronized execution of the gatekeeper and > >>>> its target thread leaves an increasingly bad feeling on my side. Did we > >>>> really catch all corner cases now? I wouldn't guarantee that yet. > >>>> Specifically as I still have an obscure crash of a Xenomai thread on > >>>> Linux schedule() on my table. > >>>> > >>>> What if the target thread woke up due to a signal, continued much > >>>> further on a different CPU, blocked in TASK_INTERRUPTIBLE, and then the > >>>> gatekeeper continued? I wish we could already eliminate this complexity > >>>> and do the migration directly inside schedule()... > >>> > >>> BTW, we do we mask out TASK_ATOMICSWITCH when checking the task state in > >>> the gatekeeper? What would happen if we included it (state == > >>> (TASK_ATOMICSWITCH | TASK_INTERRUPTIBLE))? > >> > >> I would tend to think that what we should check is > >> xnthread_test_info(XNATOMIC). Or maybe check both, the interruptible > >> state and the XNATOMIC info bit. > > > > Actually, neither the info bits nor the task state is sufficiently > > synchronized against the gatekeeper yet. We need to hold a shared lock > > when testing and resetting the state. I'm not sure yet if that is > > fixable given the gatekeeper architecture. > > > > This may work (on top of the exit-race fix): > > diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c > index 50dcf43..90feb16 100644 > --- a/ksrc/nucleus/shadow.c > +++ b/ksrc/nucleus/shadow.c > @@ -913,20 +913,27 @@ static int gatekeeper_thread(void *data) > if ((xnthread_user_task(target)->state & ~TASK_ATOMICSWITCH) == > TASK_INTERRUPTIBLE) { > rpi_pop(target); > xnlock_get_irqsave(&nklock, s); > -#ifdef CONFIG_SMP > + > /* > - * If the task changed its CPU while in > - * secondary mode, change the CPU of the > - * underlying Xenomai shadow too. We do not > - * migrate the thread timers here, it would > - * not work. For a "full" migration comprising > - * timers, using xnpod_migrate_thread is > - * required. > + * Recheck XNATOMIC to avoid waking the shadow if the > + * Linux task received a signal meanwhile. > */ > - if (target->sched != sched) > - xnsched_migrate_passive(target, sched); > + if (xnthread_test_info(target, XNATOMIC)) { > +#ifdef CONFIG_SMP > + /* > + * If the task changed its CPU while in > + * secondary mode, change the CPU of the > + * underlying Xenomai shadow too. We do not > + * migrate the thread timers here, it would > + * not work. For a "full" migration comprising > + * timers, using xnpod_migrate_thread is > + * required. > + */ > + if (target->sched != sched) > + xnsched_migrate_passive(target, sched); > #endif /* CONFIG_SMP */ > - xnpod_resume_thread(target, XNRELAX); > + xnpod_resume_thread(target, XNRELAX); > + } > xnlock_put_irqrestore(&nklock, s); > xnpod_schedule(); > } > @@ -1036,6 +1043,7 @@ redo: > * to process this signal anyway. > */ > if (rthal_current_domain == rthal_root_domain) { > + XENO_BUGON(NUCLEUS, xnthread_test_info(thread, XNATOMIC)); > if (XENO_DEBUG(NUCLEUS) && (!signal_pending(this_task) > || this_task->state != TASK_RUNNING)) > xnpod_fatal > @@ -1044,6 +1052,8 @@ redo: > return -ERESTARTSYS; > } > > + xnthread_clear_info(thread, XNATOMIC); > + > /* "current" is now running into the Xenomai domain. */ > thread->gksched = NULL; > sched = xnsched_finish_unlocked_switch(thread->sched); > @@ -2650,6 +2660,8 @@ static inline void do_sigwake_event(struct task_struct > *p) > > xnlock_get_irqsave(&nklock, s); > > + xnthread_clear_info(thread, XNATOMIC); > + > if ((p->ptrace & PT_PTRACED) && !xnthread_test_state(thread, XNDEBUG)) { > sigset_t pending; > > > It totally ignores RPI and PREEMPT_RT for now. RPI is broken anyway,
I want to drop RPI in v3 for sure because it is misleading people. I'm still pondering whether we should do that earlier during the 2.6 timeframe. > ripping it out would allow to use solely XNATOMIC as condition in the > gatekeeper. > > /me is now looking to get this applied to a testbox that still crashes > suspiciously. > > Jan > > _______________________________________________ > Xenomai-core mailing list > Xenomai-core@gna.org > https://mail.gna.org/listinfo/xenomai-core -- Philippe. _______________________________________________ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core