Hi.

I like your proposals. It fixed problem. And more important it provides
better documentation for tested behaviour


2016-01-30 10:43 GMT+01:00 Ben Coman <[email protected]>:

> Case 17373 had an erroneous failure from
> ProcessTest>>testHighPriorityOverridesWaitTime
> and several others seem also to have suffered...
> * https://pharo.fogbugz.com/default.asp?17373
> * https://pharo.fogbugz.com/default.asp?17496
> * https://pharo.fogbugz.com/default.asp?17445
> * https://pharo.fogbugz.com/default.asp?7444
>
> These have comments that the problem can't be reproduced, which is
> common of occasional race conditions.  So rather than just try to
> solve this on the issue tracker, I thought I'd share a tip how to
> isolate and squash such intermittent failures, essentially by
> stressing that single test a large number of times like this...
>
>     1 to: 100 do: [  :n |  Transcript crShow: 'Try ' , n printString.
>           ProcessTest debug: #testHighPriorityOverridesWaitTime.  ]
>
> Since the debugger only focusses on a point in time snapshot of the
> running system, some Transcript instrumentation was useful to overview
> the dynamic behaviour, like this...
>
>     ProcessTest>>testHighPriorityOverridesWaitTime
>         | lowerPriorityWaitingLonger higherPriorityWaitingLess
> nextReadyProcess |
>         lowerPriorityWaitingLonger := [ ] forkAt: 11
>         higherPriorityWaitingLess :=   [ ] forkAt: 12.
>         nextReadyProcess := Processor nextReadyProcess.
>         Transcript cr;
>            tab;show: (lowerPriorityWaitingLonger priority ->
> lowerPriorityWaitingLonger identityHash); cr;
>            tab;show: (higherPriorityWaitingLess priority ->
> higherPriorityWaitingLess identityHash); cr;
>            tab;show: (nextReadyProcess priority -> nextReadyProcess
> identityHash).
>         self assert: nextReadyProcess equals: higherPriorityWaitingLess.
>
> which produces...
>   Try 1
>     11->343137024
>     12->40659200
>     12->40659200
>   Try 2
>     11->461794304
>     12->374996992
>     12->40659200
>
> Observe on the second try nextReadyProcess refers to same process as
> from the first try.  This can be cleaned up just prior to the asset
> check by adding ...
>    lowerPriorityWaitingLonger terminate.
>    higherPriorityWaitingLess terminate.
> which produces...
>   Try 1
>     11->1005688064
>     12->833273600
>     12->833273600
>   ...
>   Try 100
>     11->764450816
>     12->807973888
>     12->807973888
>
> However this is still a little fragile, since some process with
> priority greater than the running test could slip in before the call
> to nextReadyProcess, as can be seen by pushing the #to:do: loop out to
> 100000, which produces...
>   Try 32044
>     11->784361216
>     12->431928576
>     40->397533952
>
> or wrapping it by [   ]forkAt:39, which produces...
>   Try 1
>     11->431897088
>     12->400972288
>     30->338363648
>
> After several awkward failed attempts to fix this, it seemed best to
> observe actual behaviour rather than peeking into the dynamic data
> structures. Thus I propose...
>
>   ProcessTest>>testHighPriorityOverridesWaitTime
>     "The first process to resume will pass straight through the gate
>     while the other waits for the assert to whichRan."
>
>     | gate checkAssert priorityRan |
>     gate := Semaphore new signal.
>     checkAssert := Semaphore new.
>
>     [ gate wait. priorityRan := 11. checkAssert signal ] forkAt: 11.
>     [ gate wait. priorityRan := 12. checkAssert signal ] forkAt: 12.
>
>     checkAssert wait.
>     self assert: priorityRan=12 description: 'Second scheduled but
> higher priority should run first'.
>     gate signal.
>
>     checkAssert wait.
>     self assert: whichRan=11 description: 'First scheduled but lower
> priority should run after'.
>
> https://pharo.fogbugz.com/default.asp?17501
>
>

Reply via email to