Re: RFR: 8318809: java/util/concurrent/ConcurrentLinkedQueue/WhiteBox.java shows intermittent failures on linux ppc64le and aarch64
On Tue, 5 Dec 2023 09:08:59 GMT, Andrew Haley wrote:

>> We've seen some rare failures of the CLQ Whitebox test on "less-strong"
>> architectures, and the only thing which -- given my research -- could be the
>> culprit is spuriously failing weakCAS (which is correct in terms of the
>> implementation of CLQ).
>>
>> After discussion with @DougLea, it was decided as the CLQ implementation
>> does not guarantee what the failing test tests, and modifying the test would
>> mean that it would generally not be able to enforce anything, the test is
>> invalid and should be removed -- hence this PR.
>
> Few AArch64 HotSpot systems implement weak CAS as anything other than plain
> CAS. In order to get to the root cause of this problem, it would help to know
> on which AArch64 hardware this test failed.

> @theRealAph I think the problem in this case was that the whitebox test in
> this case relied on a presumption that was only true for stronger consistency
> architectures, and rewriting the test would essentially be asserting that "a
> lot of permutations are valid, and only internally observable" which is a
> low-value test.

Oh right, so nothing to do with weak CAS, then. Fair enough.

- PR Comment: https://git.openjdk.org/jdk/pull/16786#issuecomment-1842584686
Re: RFR: 8318809: java/util/concurrent/ConcurrentLinkedQueue/WhiteBox.java shows intermittent failures on linux ppc64le and aarch64
On Tue, 5 Dec 2023 09:08:59 GMT, Andrew Haley wrote:

>> We've seen some rare failures of the CLQ Whitebox test on "less-strong"
>> architectures, and the only thing which -- given my research -- could be the
>> culprit is spuriously failing weakCAS (which is correct in terms of the
>> implementation of CLQ).
>>
>> After discussion with @DougLea, it was decided as the CLQ implementation
>> does not guarantee what the failing test tests, and modifying the test would
>> mean that it would generally not be able to enforce anything, the test is
>> invalid and should be removed -- hence this PR.
>
> Few AArch64 HotSpot systems implement weak CAS as anything other than plain
> CAS. In order to get to the root cause of this problem, it would help to know
> on which AArch64 hardware this test failed.

@theRealAph I think the problem in this case was that the whitebox test in
this case relied on a presumption that was only true for stronger consistency
architectures, and rewriting the test would essentially be asserting that "a
lot of permutations are valid, and only internally observable" which is a
low-value test.

- PR Comment: https://git.openjdk.org/jdk/pull/16786#issuecomment-1841562374
Re: RFR: 8318809: java/util/concurrent/ConcurrentLinkedQueue/WhiteBox.java shows intermittent failures on linux ppc64le and aarch64
On Wed, 22 Nov 2023 20:48:05 GMT, Viktor Klang wrote:

> We've seen some rare failures of the CLQ Whitebox test on "less-strong"
> architectures, and the only thing which -- given my research -- could be the
> culprit is spuriously failing weakCAS (which is correct in terms of the
> implementation of CLQ).
>
> After discussion with @DougLea, it was decided as the CLQ implementation does
> not guarantee what the failing test tests, and modifying the test would mean
> that it would generally not be able to enforce anything, the test is invalid
> and should be removed -- hence this PR.

Few AArch64 HotSpot systems implement weak CAS as anything other than plain
CAS. In order to get to the root cause of this problem, it would help to know
on which AArch64 hardware this test failed.

- PR Comment: https://git.openjdk.org/jdk/pull/16786#issuecomment-1840319541
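For context on the term used in this message: a weak CAS is allowed to fail spuriously, that is, to return false even when the current value equals the expected value, so callers normally wrap it in a retry loop. Below is a minimal sketch of that pattern. It assumes a plain AtomicInteger counter and is not taken from ConcurrentLinkedQueue or from the removed test; the class name WeakCasSketch and the method incrementWithWeakCas are hypothetical names chosen for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class WeakCasSketch {

    // Hypothetical helper: increments the counter using a weak CAS.
    // A weak CAS may return false for two reasons: another thread changed
    // the value, or the operation failed spuriously. The loop handles both
    // by re-reading the current value and trying again.
    static int incrementWithWeakCas(AtomicInteger counter) {
        for (;;) {
            int current = counter.get();
            int next = current + 1;
            if (counter.weakCompareAndSetVolatile(current, next)) {
                return next;
            }
            // Failed (real contention or spurious failure): retry.
        }
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger counter = new AtomicInteger();
        Thread[] threads = new Thread[4];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < 1000; j++) {
                    incrementWithWeakCas(counter);
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        // Every increment eventually succeeds, so no update is lost.
        System.out.println("total = " + counter.get());
    }
}
```

Because the loop retries on any failure, the result is the same whether or not the platform's weak CAS ever fails spuriously; only the number of iterations can differ. A test that asserts on the exact internal steps taken, rather than on the final result, would be sensitive to that difference.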
Re: RFR: 8318809: java/util/concurrent/ConcurrentLinkedQueue/WhiteBox.java shows intermittent failures on linux ppc64le and aarch64
On Wed, 22 Nov 2023 20:48:05 GMT, Viktor Klang wrote:

> We've seen some rare failures of the CLQ Whitebox test on "less-strong"
> architectures, and the only thing which -- given my research -- could be the
> culprit is spuriously failing weakCAS (which is correct in terms of the
> implementation of CLQ).
>
> After discussion with @DougLea, it was decided as the CLQ implementation does
> not guarantee what the failing test tests, and modifying the test would mean
> that it would generally not be able to enforce anything, the test is invalid
> and should be removed -- hence this PR.

I see both Doug and Alan have reviewed this. I'll go ahead and sponsor this now.

- Marked as reviewed by jpai (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/16786#pullrequestreview-1763992520
Re: RFR: 8318809: java/util/concurrent/ConcurrentLinkedQueue/WhiteBox.java shows intermittent failures on linux ppc64le and aarch64
On Wed, 22 Nov 2023 20:48:05 GMT, Viktor Klang wrote:

> We've seen some rare failures of the CLQ Whitebox test on "less-strong"
> architectures, and the only thing which -- given my research -- could be the
> culprit is spuriously failing weakCAS (which is correct in terms of the
> implementation of CLQ).
>
> After discussion with @DougLea, it was decided as the CLQ implementation does
> not guarantee what the failing test tests, and modifying the test would mean
> that it would generally not be able to enforce anything, the test is invalid
> and should be removed -- hence this PR.

Looks good to me!

- PR Comment: https://git.openjdk.org/jdk/pull/16786#issuecomment-1827901138
Re: RFR: 8318809: java/util/concurrent/ConcurrentLinkedQueue/WhiteBox.java shows intermittent failures on linux ppc64le and aarch64
On Wed, 22 Nov 2023 20:48:05 GMT, Viktor Klang wrote:

> We've seen some rare failures of the CLQ Whitebox test on "less-strong"
> architectures, and the only thing which -- given my research -- could be the
> culprit is spuriously failing weakCAS (which is correct in terms of the
> implementation of CLQ).
>
> After discussion with @DougLea, it was decided as the CLQ implementation does
> not guarantee what the failing test tests, and modifying the test would mean
> that it would generally not be able to enforce anything, the test is invalid
> and should be removed -- hence this PR.

@AlanBateman If you happen to have a couple of seconds to spare, would you mind reviewing this?

- PR Comment: https://git.openjdk.org/jdk/pull/16786#issuecomment-1827798624
Re: RFR: 8318809: java/util/concurrent/ConcurrentLinkedQueue/WhiteBox.java shows intermittent failures on linux ppc64le and aarch64
On Wed, 22 Nov 2023 20:48:05 GMT, Viktor Klang wrote:

> We've seen some rare failures of the CLQ Whitebox test on "less-strong"
> architectures, and the only thing which -- given my research -- could be the
> culprit is spuriously failing weakCAS (which is correct in terms of the
> implementation of CLQ).
>
> After discussion with @DougLea, it was decided as the CLQ implementation does
> not guarantee what the failing test tests, and modifying the test would mean
> that it would generally not be able to enforce anything, the test is invalid
> and should be removed -- hence this PR.

Marked as reviewed by alanb (Reviewer).

- PR Review: https://git.openjdk.org/jdk/pull/16786#pullrequestreview-1762984985