Re: RFR: 8328665: serviceability/jvmti/vthread/PopFrameTest failed with a timeout [v3]

2024-03-27 Thread Serguei Spitsyn
On Wed, 27 Mar 2024 05:12:53 GMT, Chris Plummer  wrote:

>> Serguei Spitsyn has updated the pull request with a new target base due to a 
>> merge or a rebase. The incremental webrev excludes the unrelated changes 
>> brought in by the merge/rebase. The pull request contains three additional 
>> commits since the last revision:
>> 
>>  - “Merge”
>>  - review: updated test with one more call to notifyAtBreakpoint to reset 
>> the native state
>>  - 8328665: serviceability/jvmti/vthread/PopFrameTest failed with a timeout
>
> test/hotspot/jtreg/serviceability/jvmti/vthread/PopFrameTest/PopFrameTest.java
>  line 147:
> 
>> 145: }
>> 146: log("Main #B.2: got expected JVMTI_ERROR_NONE");
>> 147: notifyAtBreakpoint(); // needed to reset the native state
> 
> This doesn't make much sense to me.  The purpose of `notifyAtBreakpoint()` is 
> to unblock the Breakpoint event handler, which is waiting for this notify. 
> However, the event handler can only ever be entered once, yet we have a 
> previous `noitfyBreakpoint()` above and a 3rd `notifyAtBreakpoint()` below. 
> It seems that this call and the one below are no-ops. I don't see how 
> bp_sync_reached ever gets set true again after the first 
> `notifyAtBreakpoint()` call is made.

Thank you for the comment, Chris. You are right.
I've already figured my mistake/confusion (shared it on our team meeting) but 
tested my update which has been just pushed. My analysis in the bug report is 
incorrect, I'll update it soon. In fact, I don't really know what was the root 
cause of this deadlock because I can't reproduce the issue. Also, it is not 
easy to figure out by the code observation. The only idea I have is that it was 
a spurious wakeup on the `wait()` at line 52 in the Breakpoint handler of the 
native agent. I've added a couple of checks for robustness and added more 
tracing to get more details if this happen again.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/18419#discussion_r1540566744


Re: RFR: 8328665: serviceability/jvmti/vthread/PopFrameTest failed with a timeout [v3]

2024-03-26 Thread Chris Plummer
On Wed, 27 Mar 2024 02:57:40 GMT, Serguei Spitsyn  wrote:

>> This PR fixes a synchronization issue in the test:
>>   `test/hotspot/jtreg/serviceability/jvmti/vthread/PopFrameTest`
>>   
>> The method `notifyAtBreakpoint()` can notify the `TestTask` thread when it 
>> has not reached an expected breakpoint yet.
>> The fix is to add a call to the method `ensureAtBreakpoint()` one more time 
>> in the `B2` sub-test. It is needed after the top-most frame was popped with 
>> the JVMTI `PopFrame`, and the target thread needs to reach the breakpoint 
>> again after its execution was resumed.
>> 
>> The time is very intermittent. At least, I was not able to reproduce the 
>> timeout failure in thousands of mach5 runs with the `-Xcomp` option.
>> 
>> Testing:
>>  - Run the test 
>> `test/hotspot/jtreg/serviceability/jvmti/vthread/PopFrameTest` thousands 
>> times in mach5
>
> Serguei Spitsyn has updated the pull request with a new target base due to a 
> merge or a rebase. The incremental webrev excludes the unrelated changes 
> brought in by the merge/rebase. The pull request contains three additional 
> commits since the last revision:
> 
>  - “Merge”
>  - review: updated test with one more call to notifyAtBreakpoint to reset the 
> native state
>  - 8328665: serviceability/jvmti/vthread/PopFrameTest failed with a timeout

test/hotspot/jtreg/serviceability/jvmti/vthread/PopFrameTest/PopFrameTest.java 
line 147:

> 145: }
> 146: log("Main #B.2: got expected JVMTI_ERROR_NONE");
> 147: notifyAtBreakpoint(); // needed to reset the native state

This doesn't make much sense to me.  The purpose of `notifyAtBreakpoint()` is 
to unblock the Breakpoint event handler, which is waiting for this notify. 
However, the event handler can only ever be entered once, yet we have a 
previous `noitfyBreakpoint()` above and a 3rd `notifyAtBreakpoint()` below. It 
seems that this call and the one below are no-ops. I don't see how 
bp_sync_reached ever gets set true again after the first `notifyAtBreakpoint()` 
call is made.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/18419#discussion_r1540476191


Re: RFR: 8328665: serviceability/jvmti/vthread/PopFrameTest failed with a timeout [v3]

2024-03-26 Thread Serguei Spitsyn
> This PR fixes a synchronization issue in the test:
>   `test/hotspot/jtreg/serviceability/jvmti/vthread/PopFrameTest`
>   
> The method `notifyAtBreakpoint()` can notify the `TestTask` thread when it 
> has not reached an expected breakpoint yet.
> The fix is to add a call to the method `ensureAtBreakpoint()` one more time 
> in the `B2` sub-test. It is needed after the top-most frame was popped with 
> the JVMTI `PopFrame`, and the target thread needs to reach the breakpoint 
> again after its execution was resumed.
> 
> The time is very intermittent. At least, I was not able to reproduce the 
> timeout failure in thousands of mach5 runs with the `-Xcomp` option.
> 
> Testing:
>  - Run the test 
> `test/hotspot/jtreg/serviceability/jvmti/vthread/PopFrameTest` thousands 
> times in mach5

Serguei Spitsyn has updated the pull request with a new target base due to a 
merge or a rebase. The incremental webrev excludes the unrelated changes 
brought in by the merge/rebase. The pull request contains three additional 
commits since the last revision:

 - “Merge”
 - review: updated test with one more call to notifyAtBreakpoint to reset the 
native state
 - 8328665: serviceability/jvmti/vthread/PopFrameTest failed with a timeout

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/18419/files
  - new: https://git.openjdk.org/jdk/pull/18419/files/7843111d..1502492f

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=18419&range=02
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18419&range=01-02

  Stats: 61204 lines in 2409 files changed: 11744 ins; 7474 del; 41986 mod
  Patch: https://git.openjdk.org/jdk/pull/18419.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/18419/head:pull/18419

PR: https://git.openjdk.org/jdk/pull/18419