Hello
Some time ago, I stumbled upon a challenging bug in Pharo. I tried
a few things, but the bug still eludes me. Maybe someone here has an idea?
The bug is that the value of `Processor activeProcess` is wrong inside a
process being stepped by a forked process.
In other words, let's say process D is the (frozen) process I am
debugging, and its code is to store the active process into some
variable with `p := Processor activeProcess`:
- If I step process D normally (with `D step`), then p is correct: it
holds process D
- If I fork to create a process F that steps process D, then p is
incorrect: it holds process F
You will find below the code of the two tests I am using to show the
bug, as well as a condensed version of my findings so far. If you have
any idea or lead as to where this bug could come from, I would be very
grateful.
Thomas Dupriez
-----
Here is the code of the failing test, where process F steps process D:
```
testActiveProcessInProcessSteppedInForkedProcess
	| s p D done |
	s := Semaphore new.
	done := false.
	"Create the debugged process"
	D := [ p := Processor activeProcess. done := true ] newProcess name: 'D'; yourself.
	"Until the debugged process has finished executing, fork a process to step it"
	[ done ] whileFalse: [
		[ D step. s signal ] forkNamed: 'F'.
		s wait ].
	self assert: D identicalTo: p
```
And here is the passing test, where we step process D directly:
```
testActiveProcessInProcessDirectlyStepped
	| s p D done |
	s := Semaphore new.
	done := false.
	"Create the debugged process"
	D := [ p := Processor activeProcess. done := true ] newProcess name: 'D'; yourself.
	"Until the debugged process has finished executing, step it directly"
	[ done ] whileFalse: [ D step ].
	self assert: D identicalTo: p
```
-----
Here are my findings so far:
The call chain of Process>>step is:
- Process>>step
- which calls Process>>evaluate:onBehalfOf:
- which calls BlockClosure>>ensure:
- which calls BlockClosure>>valueNoContextSwitch
1) Replacing the call to BlockClosure>>valueNoContextSwitch with a call
to BlockClosure>>value does not affect the results of the test
2) Since #valueNoContextSwitch is a primitive, it cannot be instrumented
easily. Instead, I instrumented the code of BlockClosure>>ensure: right
before and right after the call to check the active process. The value
is correct at both points, so the problem appears during the execution of
#valueNoContextSwitch and disappears before that call returns.
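To illustrate the kind of before/after check described in 2), here is a minimal standalone sketch that can be evaluated in a Playground. It is not the actual instrumentation placed in BlockClosure>>ensure:; the `log` collection and the sample block are assumptions made up for the example.

```smalltalk
"Hypothetical probe, not the real change made to BlockClosure>>ensure:"
| log result |
log := OrderedCollection new.
"Record the active process right before the primitive call"
log add: 'before: ' , Processor activeProcess name.
"Sample block standing in for the block that performs the step"
result := [ 3 + 4 ] valueNoContextSwitch.
"Record the active process right after the primitive call returns"
log add: 'after: ' , Processor activeProcess name.
log do: [ :line | Transcript show: line; cr ]
```

Nothing is stepped in this sketch, so it only shows where the two probes sit relative to the #valueNoContextSwitch call.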
3) The block being evaluated by #valueNoContextSwitch contains a call to
Context>>step, which ultimately calls
InstructionStream>>interpretNextV3PlusClosureInstructionFor: (the method
that reads the next bytecode and applies it to the execution it is
stepping). I instrumented this method to log the name of the active
process and the context being stepped during the execution of both
tests. The logs show a difference between the passing and failing tests:
- Passing test: the active process is D for a long time, then "Test
execution watch dog" for a bit, and finally "Morphic UI Process".
So everything looks in order: the active process is D until the test
ends and the UI process takes back control
- Failing test: The logged active process alternates between F and D,
and looks like this: (I put some F D patterns in bold for readability)
*F D* F D *F D*.....*F D* F F D F F *F D* F D *F D*...*F D* D D D D r M
M M M...
"M" is the Morphic UI Process, and "r" is a seemingly random process whose
name is "1006977792" in the log. I also logged the AST nodes being
stepped, but I don't really know how to make use of them.
4) I did some experiments by tweaking the tests, changing which process
creates D and which process steps it, and got surprising results:
4-1) Original failing test
In the original failing test, the test process creates the debugged
process and a fork, and the fork steps the debugged process (blue
arrow). This test fails.
4-2) Original passing test
In the original passing test, the test process creates the debugged
process and steps it. This test passes.
4-3) Forked process creates AND steps the debugged process
If the forked process is the one to create the debugged process, the
test passes!
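A possible shape for this variant, modeled on the two tests above. The selector and the exact structure are assumptions, not necessarily the code that was actually run.

```smalltalk
testActiveProcessInProcessCreatedAndSteppedInForkedProcess
	"Hypothetical reconstruction of variant 4-3): process F both creates and steps D"
	| s p D done |
	s := Semaphore new.
	done := false.
	[ "Create the debugged process from inside the forked process"
	  D := [ p := Processor activeProcess. done := true ] newProcess name: 'D'; yourself.
	  "Step D until it has finished executing, then wake up the test process"
	  [ done ] whileFalse: [ D step ].
	  s signal ] forkNamed: 'F'.
	"The test process only waits for F to finish"
	s wait.
	self assert: D identicalTo: p
```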
4-4) Forked process creates the debugged process, and TestProcess steps it
So maybe the test passes whenever the debugged process is a descendant
of the process stepping it? No, 4-5) shows that it is not necessary.
4-5) A forked process creates the debugged process. Another forked
process steps the debugged process