[ https://issues.apache.org/jira/browse/HADOOP-19071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17819215#comment-17819215 ]
ASF GitHub Bot commented on HADOOP-19071:
-----------------------------------------

steveloughran commented on PR #6537:
URL: https://github.com/apache/hadoop/pull/6537#issuecomment-1956544980

This is not good. But looking at the failures I don't know whether to categorise them as "test runner regression" or "brittle tests failing under the new test runner". Here are some of the ones I've looked at.

Test `TestDirectoryScanner.testThrottling`

This test measures how long things took; it is far too brittle against timing changes, both slower and faster.

```
java.lang.AssertionError: Throttle is too permissive
	at org.junit.Assert.fail(Assert.java:89)
	at org.junit.Assert.assertTrue(Assert.java:42)
	at org.apache.hadoop.hdfs.server.datanode.TestDirectoryScanner.testThrottling(TestDirectoryScanner.java:901)
```

I think the first step here is to move to AssertJ so asserts fail with meaningful messages, then see if the failure can be understood. Ideally you'd want a test which doesn't measure elapsed time, but instead uses counters in the code (here: of throttle events) to assert what took place.

Test `TestBlockListAsLongs.testFuzz`

I see this painfully often elsewhere: it means the protobuf library was built with a more recent Java 8 release than the early Oracle ones. It's fixable in your own build (use the older release) or by casting ByteBuffer to Buffer; otherwise we need to make sure tests run on a more recent build.
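The AssertJ move suggested for `TestDirectoryScanner.testThrottling` could be sketched as below. This is a sketch only: it assumes `org.assertj:assertj-core` on the test classpath, and the throttle-event counter is a hypothetical value, since DirectoryScanner does not necessarily expose such an accessor today. The point is asserting on a counted value with a message that reports the actual numbers, rather than on elapsed wall-clock time.

```java
// Sketch only: assumes org.assertj:assertj-core on the classpath.
// The throttle-event counter passed in is hypothetical, for illustration.
import static org.assertj.core.api.Assertions.assertThat;

public class ThrottleAssertSketch {

  // Assert on a counted value with a descriptive message, instead of
  // comparing elapsed wall-clock time as the current test does.
  static void assertThrottled(long throttleEvents, long blocksScanned) {
    assertThat(throttleEvents)
        .as("throttle events observed while scanning %d blocks", blocksScanned)
        .isGreaterThan(0);
  }

  public static void main(String[] args) {
    assertThrottled(3L, 1000L);   // passes
    try {
      assertThrottled(0L, 1000L); // fails; the message names both values
    } catch (AssertionError e) {
      System.out.println(e.getMessage());
    }
  }
}
```

A failure then reads as "throttle events observed while scanning 1000 blocks ... expecting 0L to be greater than 0L" instead of a bare `AssertionError`.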
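The cast workaround mentioned above is a one-line change. `ByteBuffer.position(int)` only gained a covariant `ByteBuffer` return type in Java 9, so code compiled against a newer JDK but run on an early Java 8 build hits `NoSuchMethodError`; casting to `Buffer` binds the call to `Buffer.position(int)`, which exists on both. A minimal sketch (class and method names are illustrative):

```java
import java.nio.Buffer;
import java.nio.ByteBuffer;

public class BufferCastSketch {

  // Sets the position via the Java-8-safe Buffer method and returns it.
  static int setPositionSafely(ByteBuffer buf, int pos) {
    // buf.position(pos) compiled on JDK 9+ binds to
    // ByteBuffer.position(int) -> ByteBuffer, which is missing on Java 8.
    // The cast makes it bind to Buffer.position(int) -> Buffer instead.
    ((Buffer) buf).position(pos);
    return buf.position();
  }

  public static void main(String[] args) {
    ByteBuffer buf = ByteBuffer.allocate(16);
    System.out.println(setPositionSafely(buf, 4)); // prints 4
  }
}
```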
```
java.lang.NoSuchMethodError: java.nio.ByteBuffer.position(I)Ljava/nio/ByteBuffer;
	at org.apache.hadoop.thirdparty.protobuf.IterableByteBufferInputStream.read(IterableByteBufferInputStream.java:143)
	at org.apache.hadoop.thirdparty.protobuf.CodedInputStream$StreamDecoder.read(CodedInputStream.java:2080)
	at org.apache.hadoop.thirdparty.protobuf.CodedInputStream$StreamDecoder.tryRefillBuffer(CodedInputStream.java:2831)
	at org.apache.hadoop.thirdparty.protobuf.CodedInputStream$StreamDecoder.refillBuffer(CodedInputStream.java:2777)
	at org.apache.hadoop.thirdparty.protobuf.CodedInputStream$StreamDecoder.readRawByte(CodedInputStream.java:2859)
	at org.apache.hadoop.thirdparty.protobuf.CodedInputStream$StreamDecoder.readRawVarint64SlowPath(CodedInputStream.java:2648)
	at org.apache.hadoop.thirdparty.protobuf.CodedInputStream$StreamDecoder.readRawVarint64(CodedInputStream.java:2641)
	at org.apache.hadoop.thirdparty.protobuf.CodedInputStream$StreamDecoder.readSInt64(CodedInputStream.java:2497)
	at org.apache.hadoop.hdfs.protocol.BlockListAsLongs$BufferDecoder$1.next(BlockListAsLongs.java:419)
	at org.apache.hadoop.hdfs.protocol.BlockListAsLongs$BufferDecoder$1.next(BlockListAsLongs.java:397)
	at org.apache.hadoop.hdfs.protocol.BlockListAsLongs$BufferDecoder.getBlockListAsLongs(BlockListAsLongs.java:375)
	at org.apache.hadoop.hdfs.protocol.TestBlockListAsLongs.checkReport(TestBlockListAsLongs.java:156)
	at org.apache.hadoop.hdfs.protocol.TestBlockListAsLongs.testFuzz(TestBlockListAsLongs.java:139)
```

Test `TestDFSAdmin.testDecommissionDataNodesReconfig`

```
java.lang.AssertionError
	at org.junit.Assert.fail(Assert.java:87)
	at org.junit.Assert.assertTrue(Assert.java:42)
	at org.junit.Assert.assertTrue(Assert.java:53)
	at org.apache.hadoop.hdfs.tools.TestDFSAdmin.testDecommissionDataNodesReconfig(TestDFSAdmin.java:1356)
```

Not a very meaningful message; I suspect that a different ordering of the threads is causing the assert to fail.

1. Move to AssertJ.
2. Analyse the error and see what the fix is.

Test `TestCacheDirectives`

```
	at org.apache.hadoop.test.GenericTestUtils.waitFor(GenericTestUtils.java:403)
	at org.apache.hadoop.test.GenericTestUtils.waitFor(GenericTestUtils.java:362)
	at org.apache.hadoop.hdfs.server.namenode.TestCacheDirectives.waitForCachedBlocks(TestCacheDirectives.java:760)
	at org.apache.hadoop.hdfs.server.namenode.TestCacheDirectives.teardown(TestCacheDirectives.java:173)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
```

This is a timeout during teardown; after this, subsequent tests are possibly going to fail. No obvious cause, though again I'd suspect race conditions.

Rather than say "hey, let's revert", I'd propose a "surefire update triggers test failures" task and see what can be done about addressing the failures, because we can't stay frozen on surefire versions.

> Update maven-surefire-plugin from 3.0.0 to 3.2.5
> -------------------------------------------------
>
>                 Key: HADOOP-19071
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19071
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: build, common
>    Affects Versions: 3.4.0, 3.5.0
>            Reporter: Shilun Fan
>            Assignee: Shilun Fan
>            Priority: Major
>              Labels: pull-request-available
>

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org