[jira] [Commented] (HDFS-17224) TestRollingUpgrade.testDFSAdminRollingUpgradeCommands failing intermittently

Ayush Saxena (Jira) Fri, 13 Oct 2023 10:39:05 -0700


    [ 
https://issues.apache.org/jira/browse/HDFS-17224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17775005#comment-17775005
 ]


Ayush Saxena commented on HDFS-17224:
-------------------------------------

Well two tests failed in the same class:
[https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4996/21/testReport/org.apache.hadoop.hdfs/TestRollingUpgrade/]

The first one failed here:
[https://github.com/apache/hadoop/blob/85af6c3a2850ffa0d3216bb62c19c55ab6e4dba3/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRollingUpgrade.java#L134]

 

This is kind of a precheck: rolling upgrade had never been kicked off at that 
point. It was the first invocation, with an illegal argument, and the line 
above confirms that the CLI call failed (as expected).

So, this MBean is coming from somewhere else....

Checking both the tests which failed: both failed with the MBean not being 
null. The first one didn't have a GenericTestUtils.waitFor; the other did, 
since HDFS-16336 added a wait, so there the same exception shows up a bit 
further down. That wait was added for the same exception as in this ticket, 
but it looks like it wasn't just some latency....

An interesting thing to observe: the two tests that failed each use their own 
MiniDFSCluster.
[From First 
one|https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4996/21/testReport/org.apache.hadoop.hdfs/TestRollingUpgrade/testDFSAdminRollingUpgradeCommands/]
{noformat}
(itemName=startTime,itemType=javax.management.openmbean.SimpleType(name=java.lang.Long)))),contents={blockPoolId=BP-1679863569-172.17.0.2-1696910973814,
 createdRollbackImages=true, finalizeTime=0, startTime=1696910977372})>

{noformat}
[From the Second 
One|https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4996/21/testReport/org.apache.hadoop.hdfs/TestRollingUpgrade/testRollback/]
{noformat}
(itemName=startTime,itemType=javax.management.openmbean.SimpleType(name=java.lang.Long)))),contents={blockPoolId=BP-1679863569-172.17.0.2-1696910973814,
 createdRollbackImages=true, finalizeTime=0, startTime=1696910977372})>
{noformat}
Both of these tests have their own MiniDFSCluster, *yet the same 
{{blockPoolId}} and {{startTime}} show up in the exception.*

 

So, as [~ste...@apache.org] mentioned, either some other test has poor cleanup 
(finding which one would be a bit time consuming or tough IMO), or there is 
some test running in parallel and messing things up :( 

I haven't played with these MBeans too much, but maybe before starting the 
test we could check whether the MBean is registered and unregister it if so. 
That may solve the problem if it is poor cleanup by some other test, though it 
would be tough to confirm whether it does or not...
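
A rough sketch of that check-and-unregister idea, using only the standard 
javax.management API. The class and method names here are made up for 
illustration, and the ObjectName string is my assumption about which bean the 
test reads. A real test would presumably pass 
ManagementFactory.getPlatformMBeanServer(); the demo below uses a throwaway 
MBeanServer and a JDK Timer as a stand-in bean so that it runs on its own:

```java
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.MBeanServerFactory;
import javax.management.ObjectName;
import javax.management.timer.Timer;

// Hypothetical helper, not part of Hadoop.
class MBeanCleanupSketch {
  // Assumption: the name under which the NameNode info bean is registered.
  static final String ASSUMED_NAME =
      "Hadoop:service=NameNode,name=NameNodeInfo";

  // Unregisters the bean if something is registered under that name.
  // Returns true if a leftover bean was found and removed.
  static boolean unregisterIfPresent(MBeanServer mbs, ObjectName name)
      throws JMException {
    if (!mbs.isRegistered(name)) {
      return false;
    }
    mbs.unregisterMBean(name);
    return true;
  }

  public static void main(String[] args) throws JMException {
    // Throwaway server so the demo does not touch the platform server.
    MBeanServer mbs = MBeanServerFactory.newMBeanServer();
    ObjectName name = new ObjectName(ASSUMED_NAME);
    // Stand-in for a bean left behind by an earlier test.
    mbs.registerMBean(new Timer(), name);
    System.out.println("removed leftover: " + unregisterIfPresent(mbs, name));
    System.out.println("removed again: " + unregisterIfPresent(mbs, name));
  }
}
```

The check and the unregister are two separate calls, so this only covers the 
leftover-from-an-earlier-test case, not another test registering the bean 
concurrently.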

But if two tests are running in parallel & each does rollingUpgrade then it 
won't help...

I think there is an annotation like {{@NotThreadSafe}}; a test annotated with 
it should run alone in a single thread. Maybe that can help, if I read this 
doc right:

[https://maven.apache.org/surefire/maven-surefire-plugin/examples/fork-options-and-parallel-execution.html#parallel-test-execution-and-single-thread-execution]

> TestRollingUpgrade.testDFSAdminRollingUpgradeCommands failing intermittently
> ----------------------------------------------------------------------------
>
>                 Key: HDFS-17224
>                 URL: https://issues.apache.org/jira/browse/HDFS-17224
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: dfsadmin, test
>    Affects Versions: 3.4.0
>            Reporter: Steve Loughran
>            Priority: Major
>
> TestRollingUpgrade.testDFSAdminRollingUpgradeCommands failing because the 
> static mbean isn't null. This is inevitably related to the fact that in test 
> runs, the jvm is reused and so the mbean may be present from a previous test 
> -maybe one which didn't clean up.
> it does not fail standalone



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
