On Sat, 2010-11-06 at 21:37 +0100, Gilles Chanteperdrix wrote:
> Anders Blomdell wrote:
> > Gilles Chanteperdrix wrote:
> >> Anders Blomdell wrote:
> >>> Gilles Chanteperdrix wrote:
> >>>> Jan Kiszka wrote:
> >>>>> Am 05.11.2010 00:24, Gilles Chanteperdrix wrote:
> >>>>>> Jan Kiszka wrote:
> >>>>>>> Am 04.11.2010 23:06, Gilles Chanteperdrix wrote:
> >>>>>>>> Jan Kiszka wrote:
> >>>>>>>>>>> At first sight, you are breaking more things here than you are
> >>>>>>>>>>> cleaning up.
> >>>>>>>>>> Still, it holds the SMP record for my test program, and it is
> >>>>>>>>>> still running with ftrace on (after 2 hours, where it previously
> >>>>>>>>>> failed after at most 23 minutes).
> >>>>>>>>> My version was indeed still buggy, I'm reworking it ATM.
> >>>>>>>>>
> >>>>>>>>>> If I get the gist of Jan's changes, they amount to using the IPI
> >>>>>>>>>> to transfer one bit of information ("your cpu needs to
> >>>>>>>>>> reschedule"):
> >>>>>>>>>>
> >>>>>>>>>> xnsched_set_resched:
> >>>>>>>>>> -      setbits((__sched__)->status, XNRESCHED);
> >>>>>>>>>>
> >>>>>>>>>> xnpod_schedule_handler:
> >>>>>>>>>> +  xnsched_set_resched(sched);
> >>>>>>>>>>    
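> >>>>>>>>>> i.e., if I read it right, something like this on the receiving
> >>>>>>>>>> side (only a sketch of my understanding, details guessed from
> >>>>>>>>>> ksrc/nucleus/pod.c):
> >>>>>>>>>>
> >>>>>>>>>> /* IPI handler, runs on the cpu that must reschedule: */
> >>>>>>>>>> void xnpod_schedule_handler(void)
> >>>>>>>>>> {
> >>>>>>>>>>         struct xnsched *sched = xnpod_current_sched();
> >>>>>>>>>>
> >>>>>>>>>>         /* Set XNRESCHED locally, so sched->status is only
> >>>>>>>>>>          * ever written by its owner cpu. */
> >>>>>>>>>>         xnsched_set_resched(sched);
> >>>>>>>>>>         xnpod_schedule();
> >>>>>>>>>> }
> >>>>>>>>>>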
> >>>>>>>>>> If you (we?) decide to keep the debug checks, under what
> >>>>>>>>>> circumstances would the current check trigger (in layman's terms,
> >>>>>>>>>> that I'll be able to understand)?
> >>>>>>>>> That's actually what /me is wondering as well. I do not yet see
> >>>>>>>>> how you can reliably detect a missed reschedule (that was the
> >>>>>>>>> purpose of the debug check), given the race between signaling
> >>>>>>>>> resched and processing the resched hints.
> >>>>>>>> The purpose of the debugging change is to detect a change of the
> >>>>>>>> scheduler state which was not followed by setting the XNRESCHED bit.
> >>>>>>> But that is nucleus business, nothing skins can screw up (as long as
> >>>>>>> they do not misuse APIs).
> >>>>>> Yes, but it happens that we modify the nucleus from time to time.
> >>>>>>
> >>>>>>>> Getting it to work is relatively simple: we add a "scheduler change
> >>>>>>>> set remotely" bit to the sched structure which is NOT among the
> >>>>>>>> status bits, and set this bit when changing a remote sched (under
> >>>>>>>> nklock). In the debug check code, if the scheduler state changed and
> >>>>>>>> the XNRESCHED bit is not set, we only consider this a bug if this
> >>>>>>>> new bit is not set either. All this is compiled out if debugging is
> >>>>>>>> not enabled.
> >>>>>>> I still see no benefit in this check. Where do you want to place the
> >>>>>>> bit setting? Aren't those just the same locations where
> >>>>>>> xnsched_set_[self_]resched already is today?
> >>>>>> Well no, that would be another bit in the sched structure, which
> >>>>>> would allow us to keep manipulating the status bits from the local
> >>>>>> cpu only. That supplementary bit would only be changed from a distant
> >>>>>> CPU, and would serve to detect the race which causes the false
> >>>>>> positive. The resched bits are set on the local cpu to get
> >>>>>> xnpod_schedule to trigger a rescheduling on the distant cpu; that new
> >>>>>> bit would be set on the remote cpu's sched, and only when debugging
> >>>>>> is enabled.
> >>>>>>
> >>>>>>> But maybe you can provide some motivating bug scenarios, real ones of
> >>>>>>> the past or realistic ones of the future.
> >>>>>> Of course. The bug is anything which changes the scheduler state but
> >>>>>> does not set the XNRESCHED bit. This happened when we started the SMP
> >>>>>> port. New scheduling policies would be good candidates for a revival of
> >>>>>> this bug.
> >>>>>>
> >>>>> You don't gain any worthwhile check if you cannot make the
> >>>>> instrumentation required for a stable detection simpler than the
> >>>>> proper solution to the problem itself. And that is what I'm still
> >>>>> skeptical about.
> >>>> The solution is simple, but finding the problem without the 
> >>>> instrumentation is way harder than with the instrumentation, so the 
> >>>> instrumentation is worth something.
> >>>>
> >>>> Reproducing the false positive is surprisingly easy with a simple
> >>>> dual-cpu semaphore ping-pong test. So, here is the (tested) patch,
> >>>> using a ridiculously long variable name to illustrate what I was
> >>>> thinking about:
> >>>>
> >>>> diff --git a/include/nucleus/sched.h b/include/nucleus/sched.h
> >>>> index 8888cf4..454b8e8 100644
> >>>> --- a/include/nucleus/sched.h
> >>>> +++ b/include/nucleus/sched.h
> >>>> @@ -108,6 +108,9 @@ typedef struct xnsched {
> >>>>         struct xnthread *gktarget;
> >>>>  #endif
> >>>>
> >>>> +#ifdef CONFIG_XENO_OPT_DEBUG_NUCLEUS
> >>>> +       int debug_resched_from_remote;
> >>>> +#endif
> >>>>  } xnsched_t;
> >>>>
> >>>>  union xnsched_policy_param;
> >>>> @@ -185,6 +188,8 @@ static inline int xnsched_resched_p(struct xnsched *sched)
> >>>>    xnsched_t *current_sched = xnpod_current_sched();                    \
> >>>>    __setbits(current_sched->status, XNRESCHED);                         \
> >>>>    if (current_sched != (__sched__))    {                               \
> >>>> +         if (XENO_DEBUG(NUCLEUS))                                      \
> >>>> +                 __sched__->debug_resched_from_remote = 1;             \
> >>>>        xnarch_cpu_set(xnsched_cpu(__sched__), current_sched->resched);  \
> >>>>    }                                                                    \
> >>>>  } while (0)
> >>>> diff --git a/ksrc/nucleus/pod.c b/ksrc/nucleus/pod.c
> >>>> index 4cb707a..50b0f49 100644
> >>>> --- a/ksrc/nucleus/pod.c
> >>>> +++ b/ksrc/nucleus/pod.c
> >>>> @@ -2177,6 +2177,10 @@ static inline int __xnpod_test_resched(struct xnsched *sched)
> >>>>                 xnarch_cpus_clear(sched->resched);
> >>>>         }
> >>>>  #endif
> >>>> +       if (XENO_DEBUG(NUCLEUS) && sched->debug_resched_from_remote) {
> >>>> +               sched->debug_resched_from_remote = 0;
> >>>> +               resched = 1;
> >>>> +       }
> >>>>         clrbits(sched->status, XNRESCHED);
> >>>>         return resched;
> >>>>  }
> >>>>
> >>>>
> >>>> I am still uncertain.
> >>> This will only work if everything is done under nklock; otherwise, two
> >>> almost simultaneous xnsched_resched_p calls from different cpus might
> >>> lead to one of the ipi wakeups seeing the 0 written while the first ipi
> >>> interrupt was handled.
> >> This is a patch artifact: the functions modified are xnsched_set_resched
> >> and xnpod_test_resched, and both run with the nklock held.
> >>
> > 
> > Isn't this a possible scenario?
> > 
> > CPU A                   CPU B                   CPU C
> > take nklock
> > remote = 1
> > send ipi #1
> > release nklock
> >                         take nklock             handle ipi
> >                         remote = 1              ack ipi #1
> >                         send ipi #2
> >                         release nklock
> >                                                 take nklock
> >                                                 if remote (==1)
> >                                                   remote = 0
> >                                                   resched = 1
> >                                                 release nklock
> >                                                 handle ipi
> >                                                 ack ipi #2
> >                                                 take nklock
> >                                                 if remote (==0)
> >                                                   OOPS!
> 
> No problem here, since handling the first IPI has taken into account the
> two scheduler state changes. So, no OOPS. The second IPI is spurious.
> 
> Anyway, after some thought, I think we are going to try and make the
> current situation work instead of going back to the old way.
> 
> You can find the patch which attempts to do so here:
> http://sisyphus.hd.free.fr/~gilles/sched_status.txt

Ack. At last, this addresses the real issues without asking for
regression funkiness: it fixes the missing barrier before XNSCHED is
tested in the xnpod_schedule pre-test, and it stops the sched->status
trashing caused by XNINIRQ/XNHTICK/XNRPICK updates done unsynced on
nklock.
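
Something like this is what I have in mind for the pre-test (just a
sketch, not lifted from your patch; I assume xnarch_memory_barrier()
or any equivalent smp barrier primitive):

	/* lockless xnpod_schedule() pre-test */
	/* Pair with the barrier implied by sending the IPI: make
	 * sure the resched bit a remote CPU set before kicking us
	 * is visible before we test it. */
	xnarch_memory_barrier();
	if (!xnsched_resched_p(sched))
		return;
	/* slow path: grab nklock and really reschedule */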

In short, this patch looks like it moves the local-only flags to where
they belong, i.e. anywhere you want but *outside* of the status word
holding the remotely accessed bits. XNRPICK seems to be handled
differently, but it makes sense to group it with the other RPI data as
you did, so fine with me.
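
IOW, I would expect the end result to look about like this (the lflags
field name is mine, just for illustration):

	typedef struct xnsched {
		xnflags_t status;	/* remotely accessed bits, e.g.
					 * XNRESCHED, always updated with
					 * nklock held */
		xnflags_t lflags;	/* local-only bits: XNINIRQ,
					 * XNHTICK, ... only ever touched
					 * by the owner CPU, no lock
					 * needed */
		...
	} xnsched_t;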

> 
> It even avoids the second IPI in this case. Experimental and only
> lightly tested for now.
> 

-- 
Philippe.


