Steve Loughran created HDFS-16934:
-------------------------------------

             Summary: org.apache.hadoop.hdfs.tools.TestDFSAdmin#testAllDatanodesReconfig regression
                 Key: HDFS-16934
                 URL: https://issues.apache.org/jira/browse/HDFS-16934
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: dfsadmin, test
    Affects Versions: 3.4.0, 3.3.5, 3.3.9
            Reporter: Steve Loughran


Jenkins test failure, as the logged output is in the wrong order for the
assertions. HDFS-16624 flipped the order; without that this would have worked.

{code}

java.lang.AssertionError
        at org.junit.Assert.fail(Assert.java:87)
        at org.junit.Assert.assertTrue(Assert.java:42)
        at org.junit.Assert.assertTrue(Assert.java:53)
        at org.apache.hadoop.hdfs.tools.TestDFSAdmin.testAllDatanodesReconfig(TestDFSAdmin.java:1149)
{code}


Here the code asserts on the contents of the output by position:
{code}
    assertTrue(outs.get(0).startsWith("Reconfiguring status for node"));
    assertTrue("SUCCESS: Changed property dfs.datanode.peer.stats.enabled".equals(outs.get(2))
        || "SUCCESS: Changed property dfs.datanode.peer.stats.enabled".equals(outs.get(1)));  // here
    assertTrue("\tFrom: \"false\"".equals(outs.get(3)) || "\tFrom: \"false\"".equals(outs.get(2)));
    assertTrue("\tTo: \"true\"".equals(outs.get(4)) || "\tTo: \"true\"".equals(outs.get(3)));
{code}

If you look at the log, the expected line does appear in that list, just in a
different place, which points to a race condition:
{code}
2023-02-24 01:02:06,275 [Listener at localhost/41795] INFO  tools.TestDFSAdmin (TestDFSAdmin.java:testAllDatanodesReconfig(1146)) - dfsadmin -status -livenodes output:
2023-02-24 01:02:06,276 [Listener at localhost/41795] INFO  tools.TestDFSAdmin (TestDFSAdmin.java:lambda$testAllDatanodesReconfig$0(1147)) - Reconfiguring status for node [127.0.0.1:41795]: started at Fri Feb 24 01:02:03 GMT 2023 and finished at Fri Feb 24 01:02:03 GMT 2023.
2023-02-24 01:02:06,276 [Listener at localhost/41795] INFO  tools.TestDFSAdmin (TestDFSAdmin.java:lambda$testAllDatanodesReconfig$0(1147)) - Reconfiguring status for node [127.0.0.1:34007]: started at Fri Feb 24 01:02:03 GMT 2023SUCCESS: Changed property dfs.datanode.peer.stats.enabled
2023-02-24 01:02:06,277 [Listener at localhost/41795] INFO  tools.TestDFSAdmin (TestDFSAdmin.java:lambda$testAllDatanodesReconfig$0(1147)) -    From: "false"
2023-02-24 01:02:06,277 [Listener at localhost/41795] INFO  tools.TestDFSAdmin (TestDFSAdmin.java:lambda$testAllDatanodesReconfig$0(1147)) -    To: "true"
2023-02-24 01:02:06,277 [Listener at localhost/41795] INFO  tools.TestDFSAdmin (TestDFSAdmin.java:lambda$testAllDatanodesReconfig$0(1147)) -  and finished at Fri Feb 24 01:02:03 GMT 2023.
2023-02-24 01:02:06,277 [Listener at localhost/41795] INFO  tools.TestDFSAdmin (TestDFSAdmin.java:lambda$testAllDatanodesReconfig$0(1147)) - SUCCESS: Changed property dfs.datanode.peer.stats.enabled
{code}
We have a race condition in output generation, and the assertions are clearly
too brittle.

For the 3.3.5 release I'm not going to make this a blocker. What I will do is
propose that the asserts move to AssertJ, with an assertion that the collection
"containsExactlyInAnyOrder" all the strings.

That will:
1. not be brittle.
2. give nice errors on failure.
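
A rough sketch of the idea, in plain Java so that it is self-contained (no AssertJ on the classpath needed). The class and method names here are hypothetical, the sample list is reconstructed from the log above, and the leading tab on the From/To lines is assumed from the string literals in the test. It only illustrates why a position-based check fails on this output while an order-independent containment check passes; an AssertJ version would presumably look something like assertThat(outs).contains(...) with the same three strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of TestDFSAdmin.
class ReconfigOutputCheck {

  static final String SUCCESS =
      "SUCCESS: Changed property dfs.datanode.peer.stats.enabled";

  /** Output lines reconstructed from the Jenkins log (tabs assumed before From/To). */
  static List<String> sampleOutput() {
    return Arrays.asList(
        "Reconfiguring status for node [127.0.0.1:41795]: started at Fri Feb 24 01:02:03 GMT 2023"
            + " and finished at Fri Feb 24 01:02:03 GMT 2023.",
        "Reconfiguring status for node [127.0.0.1:34007]: started at Fri Feb 24 01:02:03 GMT 2023"
            + SUCCESS,
        "\tFrom: \"false\"",
        "\tTo: \"true\"",
        " and finished at Fri Feb 24 01:02:03 GMT 2023.",
        SUCCESS);
  }

  /** The style of check the test uses today: the line must sit at index 1 or 2. */
  static boolean positionalCheck(List<String> outs, String line) {
    return line.equals(outs.get(2)) || line.equals(outs.get(1));
  }

  /** Order-independent check: returns the expected lines that appear nowhere in the output. */
  static List<String> missingLines(List<String> outs, List<String> expected) {
    List<String> missing = new ArrayList<>(expected);
    missing.removeAll(outs);
    return missing;
  }

  public static void main(String[] args) {
    List<String> outs = sampleOutput();
    List<String> expected = Arrays.asList(SUCCESS, "\tFrom: \"false\"", "\tTo: \"true\"");

    // The SUCCESS line is present, but at index 5, so the positional check fails.
    System.out.println("positional check passes: " + positionalCheck(outs, SUCCESS));
    // Every expected line is somewhere in the list, so nothing is reported missing.
    System.out.println("missing lines: " + missingLines(outs, expected));
  }
}
```

Reporting the missing lines, rather than a bare true/false, is also what gives the nicer error on failure: the message can say exactly which string was not found.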




