Hello

Some time ago, I stumbled upon a challenging bug in Pharo. I tried a few things, but the bug still eludes me. Maybe someone here has an idea?

The bug is that the value of `Processor activeProcess` is wrong inside a process being stepped by a forked process.

In other words, let's say process D is the (frozen) process I am debugging, and its code stores the active process into some variable with `p := Processor activeProcess`:
- If I step process D normally (with `D step`), then p is correct and equal to process D
- If I fork to create a process F that steps process D, then p is incorrect and equal to process F

You will find below the code of the two tests I am using to show the bug, as well as a condensed version of my findings so far. If you have any idea or lead as to where this bug could come from, I would be very grateful.

Thomas Dupriez

-----

Here is the code of the failing test, where process F steps process D:

```
testActiveProcessInProcessSteppedInForkedProcess
    | s p D done |
    s := Semaphore new. done := false.
    "Create debugged process"
    D := [p := Processor activeProcess. done := true] newProcess name: 'D'; yourself.
    "Until the execution of the debugged process is over, create a forked process to step it"
    [done]
        whileFalse: [
            [D step. s signal] forkNamed: 'F'.
            s wait.
        ].
    self assert: D identicalTo: p
```

And here is the passing test, where we step process D directly:

```
testActiveProcessInProcessDirectlyStepped
    | s p D done |
    s := Semaphore new. done := false.
    "Create debugged process"
    D := [p := Processor activeProcess. done := true] newProcess name: 'D'; yourself.
    "Until the execution of the debugged process is over, step it directly"
    [done]
        whileFalse: [
            D step.
        ].
    self assert: D identicalTo: p
```

-----

Here are my findings so far:

The call chain of Process>>step is:
- Process>>step
- which calls Process>>evaluate:onBehalfOf:
- which calls BlockClosure>>ensure:
- which calls BlockClosure>>valueNoContextSwitch

1) Replacing the call to BlockClosure>>valueNoContextSwitch with a call to BlockClosure>>value does not affect the results of the test

2) Since #valueNoContextSwitch is a primitive, it cannot be instrumented easily. I instrumented the code of BlockClosure>>ensure: right before and right after the call to check the value of the active process. There is no wrong value there, so the problem appears inside the execution of #valueNoContextSwitch and disappears before this method call returns.

3) The block being evaluated by #valueNoContextSwitch contains a call to Context>>step, which ultimately calls InstructionStream>>interpretNextV3PlusClosureInstructionFor: (the method that reads what the next bytecode is and applies it to the execution it is stepping). I instrumented this method to log the name of the active process and the context being stepped during the execution of both tests. The log shows a difference between the passing and failing tests:
- Passing test: the active process is D for a long time, then "Test execution watch dog" for a bit, and finally "Morphic UI Process". So everything looks in order: the active process is D until the test ends and the UI process takes back control.
- Failing test: the logged active process alternates between F and D, and looks like this (I put some F D patterns in bold for readability):
*F D* F D *F D*.....*F D* F F D F F *F D* F D *F D*...*F D* D D D D r M M M M...
"M" is the Morphic UI Process, and "r" is a seemingly random process whose name is "1006977792" in the log.
I also logged the AST nodes being stepped, but I don't really know how to exploit them.

4) I did some experiments by tweaking the tests, changing which process creates D and which process steps it, and had surprising results:
4-1) Original failing test
[diagram]

In the original failing test, the test process creates the debugged process and a fork, and the fork steps the debugged process (blue arrow). This test fails.

4-2) Original passing test
[diagram]

In the original passing test, the test process creates the debugged process and steps it. This test passes.

4-3) Forked process creates AND steps the debugged process
[diagram]

If the forked process is the one to create the debugged process, the test passes!

4-4) Forked process creates the debugged process, and TestProcess steps it
[diagram]

So maybe the test passes whenever the debugged process is a descendant of the process stepping it? No, 4-5) shows that it is not necessary.

4-5) A forked process creates the debugged process. Another forked process steps the debugged process
[diagram]

