Hi Jeff,

I'm continuing from our chat yesterday here because it might be of wider 
interest:  
http://linuxcnc.mah.priv.at/irc/%23linuxcnc-devel/2012-12-13.html#22:57:03

To recap:

- current 'sim' uses GNU Pth threads
- the rt-preempt merge candidate code uses Pthreads
- to reduce code duplication, we folded rt-preempt code into the old 'sim' code 
and dropped Pth
- runtests is fine on real machines, including threads.0
- runtests of threads.0 fails on a VirtualBox machine, which has exceptionally 
bad scheduling behaviour due to the underlying hypervisor scheme and the second 
OS 'underneath'.

Looking at the threads.0 test, it seems to imply there is a relative ordering 
of task execution, namely:
- two threads, a fast one, and a slow one with 10 times the period of the fast 
one
- the assumption seems to be the fast one executes 10 times before the slow 
one, resulting in a certain output pattern in the 'result' file, namely 
'1...10','1..10' which says the fast thread ran 10 times, then the slow one got 
scheduled.
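
A toy model of that assumption, in Python for brevity. This is only a sketch 
of the ordering described above, not the actual threads.0 HAL configuration; 
the function name and the tick-based timing are made up for illustration.

```python
def ideal_counts(ratio=10, slow_cycles=3):
    """Fast-thread runs observed by each slow-thread run under ideal timing."""
    counts = []
    fast_runs = 0
    for tick in range(1, ratio * slow_cycles + 1):
        fast_runs += 1            # the fast thread runs once per base period
        if tick % ratio == 0:     # the slow thread is due every `ratio` periods
            counts.append(fast_runs)
            fast_runs = 0
    return counts


print(ideal_counts())  # [10, 10, 10]
```

With perfectly punctual wake-ups every slow invocation sees exactly 10 fast 
invocations, which is the fixed pattern the test appears to expect.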

The only plausible explanation we have arrived at so far is the semantics of 
the threading libraries: 
- GNU Pth uses N:1 threading 
(http://en.wikipedia.org/wiki/Thread_%28computing%29#N:1_.28User-level_threading.29)
- Pthreads uses kernel-scheduled threads (1:1 with NPTL on Linux, rather than 
N:1)

If that is the case, then threads.0 really verifies the implementation of GNU 
Pth, but not some expected behaviour of HAL threads. At least I couldn't 
find a spec or comment which says 'even in sim mode, relative scheduling counts 
must remain fixed', which seems to happen by accident with Pth but not with 
Pthreads in my VirtualBox setup.

The reason why this test succeeds in usermode RT schemes, and on real machines 
seems to be that scheduling in these cases is precise enough to fit within the 
expected behavior time windows, whereas pthreads scheduling on Vbox is so 
massively off the scale that it violates the expected behavior. 
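
To illustrate how a late wake-up breaks the fixed pattern, here is a small 
model. It assumes (my simplification, not something taken from the RTAPI code) 
that the fast thread stays punctual while the slow thread's wake-up is delayed 
by a given latency; all names and numbers are hypothetical.

```python
def counts_with_latency(fast_period, ratio, slow_latency):
    """Fast wake-ups falling between consecutive slow-thread runs.

    Times are integers (e.g. microseconds). slow_latency[i] is the assumed
    delay of the i-th slow wake-up; fast wake-ups happen at k * fast_period.
    """
    slow_period = fast_period * ratio
    counts = []
    previous = 0
    for i, latency in enumerate(slow_latency):
        actual = (i + 1) * slow_period + latency
        # number of k with previous < k * fast_period <= actual
        counts.append(actual // fast_period - previous // fast_period)
        previous = actual
    return counts


print(counts_with_latency(1000, 10, [0, 0, 0]))     # [10, 10, 10]
print(counts_with_latency(1000, 10, [0, 2500, 0]))  # [10, 12, 8]
```

A single slow wake-up that is 2.5 fast periods late turns '10, 10, 10' into 
'10, 12, 8': the long-run ratio is unchanged, but the per-cycle counts are not.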

--

The question now is what to do with the result.

First, does it indicate a fatal error situation? I don't think so, because in 
'sim' mode all bets are off wrt relative scheduling anyway.

Second, what does that test actually say? 

If all it does is say 'well, GNU Pth just behaves _so_' then I'm unsure what 
we are testing against here. Of course intuitively one would _assume_ a thread 
10x as fast gets to run 10x as often, but that is not how it is implemented. 
If we wanted to guarantee a fixed relative scheduling rate, then the HAL/RTAPI 
threading code would have to ensure it, for instance by explicitly scheduling 
the slow thread after N invocations of the fast thread, where N is the period 
ratio. However, AFAICT that is not the case, and I am not at all sure it is 
the case with RT scheduling by the OS either: just because it's precise enough 
on average doesn't mean the behaviour isn't actually stochastic.

I guess the proper answer to all this is to firm up the HAL/RTAPI threading 
specification by explicitly stating what relative periods of HAL threads mean 
for expected invocation counts. I can only infer this from the code, but I 
would think the answer is:

If several threads are used, the relative timing suggests, *but does not 
guarantee*, a certain ratio of thread invocations.

If we can agree on that, it means threads.0 realistically is not an acceptance 
test - but we can make it a standalone measurement of relative scheduling 
count probabilities, or drop it altogether.
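
Such a standalone measurement could be as simple as turning the per-cycle 
counts into relative frequencies. A minimal sketch, assuming the counts have 
already been extracted from the test output (the parsing is left out, and the 
sample values below are invented):

```python
from collections import Counter


def count_distribution(counts):
    """Map each observed fast-per-slow count to its relative frequency."""
    total = len(counts)
    hist = Counter(counts)
    return {value: hist[value] / total for value in sorted(hist)}


print(count_distribution([10, 10, 9, 11]))  # {9: 0.25, 10: 0.5, 11: 0.25}
```

The test would then report the distribution instead of failing on any 
deviation from the ideal count.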

- Michael

_______________________________________________
Emc-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/emc-developers