[ https://issues.apache.org/jira/browse/HDFS-17224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17775005#comment-17775005 ]
Ayush Saxena commented on HDFS-17224:
-------------------------------------

Two tests failed in the same class: [https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4996/21/testReport/org.apache.hadoop.hdfs/TestRollingUpgrade/]

The first one failed here: [https://github.com/apache/hadoop/blob/85af6c3a2850ffa0d3216bb62c19c55ab6e4dba3/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRollingUpgrade.java#L134]

That line is a kind of precheck: no rolling upgrade had been started at that point. It is the first command in the test, run with an illegal argument, and the line above it confirms that the command failed on the CLI as expected. So this MBean is coming from somewhere else.

I checked both of the failing tests. Both failed because the MBean was not null. The first one didn't have a GenericTestUtils.waitFor; the other did. HDFS-16336 added that wait for this same exception, so in the second test the exception shows up a bit further down, but it looks like the cause wasn't just latency.

An interesting thing to observe: each of the two failing tests uses its own MiniDFSCluster.
[From the first one|https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4996/21/testReport/org.apache.hadoop.hdfs/TestRollingUpgrade/testDFSAdminRollingUpgradeCommands/]
{noformat}
(itemName=startTime,itemType=javax.management.openmbean.SimpleType(name=java.lang.Long)))),contents={blockPoolId=BP-1679863569-172.17.0.2-1696910973814, createdRollbackImages=true, finalizeTime=0, startTime=1696910977372})>
{noformat}

[From the second one|https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4996/21/testReport/org.apache.hadoop.hdfs/TestRollingUpgrade/testRollback/]
{noformat}
(itemName=startTime,itemType=javax.management.openmbean.SimpleType(name=java.lang.Long)))),contents={blockPoolId=BP-1679863569-172.17.0.2-1696910973814, createdRollbackImages=true, finalizeTime=0, startTime=1696910977372})>
{noformat}

Both of these tests have their own MiniDFSCluster, *yet the exception carries the same {{blockPoolId}} and {{startTime}} in both.*

So, as [~ste...@apache.org] mentioned, this is either poor cleanup by some other test, which would be time-consuming or tough to track down IMO, or some test running in parallel and messing things up :(

I haven't played with these MBeans much, but maybe before starting the test we could check whether the MBean is registered and, if it is, unregister it. That may solve the problem if the cause is poor cleanup by some other test, though it would be tough to confirm whether it does. But if two tests are running in parallel and each does a rolling upgrade, it won't help.
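A minimal sketch of that "check whether it is registered, then unregister" idea, using only the standard {{javax.management}} API. The class and method names ({{MBeanCleanup}}, {{unregisterIfPresent}}) are made up for illustration and are not part of Hadoop, and the name of the MBean to clean up is left to the caller because I have not verified which registration is the stale one.

```java
import javax.management.InstanceNotFoundException;
import javax.management.MBeanRegistrationException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

/** Hypothetical test helper; not an existing Hadoop class. */
public final class MBeanCleanup {

  private MBeanCleanup() {
  }

  /**
   * Unregisters the MBean with the given name if one is registered.
   *
   * @return true if an MBean was removed, false if none was present
   */
  public static boolean unregisterIfPresent(MBeanServer server, ObjectName name)
      throws MBeanRegistrationException {
    if (!server.isRegistered(name)) {
      return false;
    }
    try {
      server.unregisterMBean(name);
      return true;
    } catch (InstanceNotFoundException e) {
      // Another thread removed it between the check and the call.
      return false;
    }
  }
}
```

A test could call this from its setup method with {{ManagementFactory.getPlatformMBeanServer()}} and the name of the bean it later queries (I am assuming that is something like {{Hadoop:service=NameNode,name=NameNodeInfo}}). As noted above, this would only help with leftovers from an earlier test, not with a test running concurrently in the same JVM.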
I think there is an annotation like {{@NotThreadSafe}}; a test annotated with it should run alone in a single thread. Maybe that can help, if I read this doc right: [https://maven.apache.org/surefire/maven-surefire-plugin/examples/fork-options-and-parallel-execution.html#parallel-test-execution-and-single-thread-execution]

> TestRollingUpgrade.testDFSAdminRollingUpgradeCommands failing intermittently
> ----------------------------------------------------------------------------
>
>                 Key: HDFS-17224
>                 URL: https://issues.apache.org/jira/browse/HDFS-17224
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: dfsadmin, test
>   Affects Versions: 3.4.0
>            Reporter: Steve Loughran
>            Priority: Major
>
> TestRollingUpgrade.testDFSAdminRollingUpgradeCommands failing because the
> static mbean isn't null. This is inevitably related to the fact that in test
> runs, the jvm is reused and so the mbean may be present from a previous test
> -maybe one which didn't clean up.
> it does not fail standalone