Re: RFR: 8318809: java/util/concurrent/ConcurrentLinkedQueue/WhiteBox.java shows intermittent failures on linux ppc64le and aarch64

2023-12-06 Thread Andrew Haley
On Tue, 5 Dec 2023 09:08:59 GMT, Andrew Haley wrote:

>> We've seen some rare failures of the CLQ Whitebox test on "less-strong" 
>> architectures, and the only thing which -- given my research -- could be the 
>> culprit is spuriously failing weakCAS (which is correct in terms of the 
>> implementation of CLQ).
>> 
>> After discussion with @DougLea, it was decided as the CLQ implementation 
>> does not guarantee what the failing test tests, and modifying the test would 
>> mean that it would generally not be able to enforce anything, the test is 
>> invalid and should be removed -- hence this PR.
>
> Few AArch64 HotSpot systems implement weak CAS as anything other than plain 
> CAS. In order to get to the root cause of this problem, it would help to know 
> on which AArch64 hardware this test failed.

> @theRealAph I think the problem here was that the whitebox test relied on a 
> presumption that was only true for stronger consistency architectures, and 
> rewriting the test would essentially be asserting that "a lot of 
> permutations are valid, and only internally observable", which is a 
> low-value test.

Oh right, so nothing to do with weak CAS, then. Fair enough.

-

PR Comment: https://git.openjdk.org/jdk/pull/16786#issuecomment-1842584686


Re: RFR: 8318809: java/util/concurrent/ConcurrentLinkedQueue/WhiteBox.java shows intermittent failures on linux ppc64le and aarch64

2023-12-05 Thread Viktor Klang
On Tue, 5 Dec 2023 09:08:59 GMT, Andrew Haley wrote:

>> We've seen some rare failures of the CLQ Whitebox test on "less-strong" 
>> architectures, and the only thing which -- given my research -- could be the 
>> culprit is spuriously failing weakCAS (which is correct in terms of the 
>> implementation of CLQ).
>> 
>> After discussion with @DougLea, it was decided as the CLQ implementation 
>> does not guarantee what the failing test tests, and modifying the test would 
>> mean that it would generally not be able to enforce anything, the test is 
>> invalid and should be removed -- hence this PR.
>
> Few AArch64 HotSpot systems implement weak CAS as anything other than plain 
> CAS. In order to get to the root cause of this problem, it would help to know 
> on which AArch64 hardware this test failed.

@theRealAph I think the problem here was that the whitebox test relied on a 
presumption that was only true for stronger consistency architectures, and 
rewriting the test would essentially be asserting that "a lot of permutations 
are valid, and only internally observable", which is a low-value test.

-

PR Comment: https://git.openjdk.org/jdk/pull/16786#issuecomment-1841562374


Re: RFR: 8318809: java/util/concurrent/ConcurrentLinkedQueue/WhiteBox.java shows intermittent failures on linux ppc64le and aarch64

2023-12-05 Thread Andrew Haley
On Wed, 22 Nov 2023 20:48:05 GMT, Viktor Klang wrote:

> We've seen some rare failures of the CLQ Whitebox test on "less-strong" 
> architectures, and the only thing which -- given my research -- could be the 
> culprit is spuriously failing weakCAS (which is correct in terms of the 
> implementation of CLQ).
> 
> After discussion with @DougLea, it was decided as the CLQ implementation does 
> not guarantee what the failing test tests, and modifying the test would mean 
> that it would generally not be able to enforce anything, the test is invalid 
> and should be removed -- hence this PR.

Few AArch64 HotSpot systems implement weak CAS as anything other than plain 
CAS. In order to get to the root cause of this problem, it would help to know 
on which AArch64 hardware this test failed.
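
As a rough illustration of what "spuriously failing weak CAS" means for code 
that retries, here is a minimal sketch. It is not the CLQ implementation: the 
class WeakCasSketch, its field, and its methods are hypothetical names made up 
for this example, and the only real JDK API involved is 
VarHandle.weakCompareAndSet, which is allowed to return false even when the 
expected value matches. A loop that simply retries on failure stays correct 
either way.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

// Hypothetical example class; not part of the JDK or of ConcurrentLinkedQueue.
final class WeakCasSketch {
    private volatile int value;

    // VarHandle giving CAS access to the "value" field.
    private static final VarHandle VALUE;
    static {
        try {
            VALUE = MethodHandles.lookup()
                    .findVarHandle(WeakCasSketch.class, "value", int.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Increment with weak CAS. A false return may be a real conflict or a
    // spurious failure; the loop treats both the same and retries.
    int incrementAndGet() {
        for (;;) {
            int current = value;          // volatile read
            int next = current + 1;
            if (VALUE.weakCompareAndSet(this, current, next)) {
                return next;
            }
        }
    }

    int get() {
        return value;
    }

    // Runs `threads` workers that each increment `perThread` times and
    // returns the final counter value after all workers have been joined.
    static int runDemo(int threads, int perThread) {
        WeakCasSketch counter = new WeakCasSketch();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.incrementAndGet();
                }
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runDemo(4, 10_000));
    }
}
```

The point of the sketch is only that a retry loop cannot observe whether a 
given failure was spurious. A test that asserts on intermediate internal state 
after a single CAS attempt is assuming more than such code promises.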

-

PR Comment: https://git.openjdk.org/jdk/pull/16786#issuecomment-1840319541


Re: RFR: 8318809: java/util/concurrent/ConcurrentLinkedQueue/WhiteBox.java shows intermittent failures on linux ppc64le and aarch64

2023-12-04 Thread Jaikiran Pai
On Wed, 22 Nov 2023 20:48:05 GMT, Viktor Klang wrote:

> We've seen some rare failures of the CLQ Whitebox test on "less-strong" 
> architectures, and the only thing which -- given my research -- could be the 
> culprit is spuriously failing weakCAS (which is correct in terms of the 
> implementation of CLQ).
> 
> After discussion with @DougLea, it was decided as the CLQ implementation does 
> not guarantee what the failing test tests, and modifying the test would mean 
> that it would generally not be able to enforce anything, the test is invalid 
> and should be removed -- hence this PR.

I see both Doug and Alan have reviewed this.

I'll go ahead and sponsor this now.

-

Marked as reviewed by jpai (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/16786#pullrequestreview-1763992520


Re: RFR: 8318809: java/util/concurrent/ConcurrentLinkedQueue/WhiteBox.java shows intermittent failures on linux ppc64le and aarch64

2023-12-04 Thread Doug Lea
On Wed, 22 Nov 2023 20:48:05 GMT, Viktor Klang wrote:

> We've seen some rare failures of the CLQ Whitebox test on "less-strong" 
> architectures, and the only thing which -- given my research -- could be the 
> culprit is spuriously failing weakCAS (which is correct in terms of the 
> implementation of CLQ).
> 
> After discussion with @DougLea, it was decided as the CLQ implementation does 
> not guarantee what the failing test tests, and modifying the test would mean 
> that it would generally not be able to enforce anything, the test is invalid 
> and should be removed -- hence this PR.

Looks good to me!

-

PR Comment: https://git.openjdk.org/jdk/pull/16786#issuecomment-1827901138


Re: RFR: 8318809: java/util/concurrent/ConcurrentLinkedQueue/WhiteBox.java shows intermittent failures on linux ppc64le and aarch64

2023-12-04 Thread Viktor Klang
On Wed, 22 Nov 2023 20:48:05 GMT, Viktor Klang wrote:

> We've seen some rare failures of the CLQ Whitebox test on "less-strong" 
> architectures, and the only thing which -- given my research -- could be the 
> culprit is spuriously failing weakCAS (which is correct in terms of the 
> implementation of CLQ).
> 
> After discussion with @DougLea, it was decided as the CLQ implementation does 
> not guarantee what the failing test tests, and modifying the test would mean 
> that it would generally not be able to enforce anything, the test is invalid 
> and should be removed -- hence this PR.

@AlanBateman If you happen to have a couple of seconds to spare, would you mind 
reviewing this?

-

PR Comment: https://git.openjdk.org/jdk/pull/16786#issuecomment-1827798624


Re: RFR: 8318809: java/util/concurrent/ConcurrentLinkedQueue/WhiteBox.java shows intermittent failures on linux ppc64le and aarch64

2023-12-04 Thread Alan Bateman
On Wed, 22 Nov 2023 20:48:05 GMT, Viktor Klang wrote:

> We've seen some rare failures of the CLQ Whitebox test on "less-strong" 
> architectures, and the only thing which -- given my research -- could be the 
> culprit is spuriously failing weakCAS (which is correct in terms of the 
> implementation of CLQ).
> 
> After discussion with @DougLea, it was decided as the CLQ implementation does 
> not guarantee what the failing test tests, and modifying the test would mean 
> that it would generally not be able to enforce anything, the test is invalid 
> and should be removed -- hence this PR.

Marked as reviewed by alanb (Reviewer).

-

PR Review: https://git.openjdk.org/jdk/pull/16786#pullrequestreview-1762984985