[jira] [Commented] (HDFS-15147) LazyPersistTestCase wait logic is error pruned

Jira Thu, 30 Jan 2020 11:59:12 -0800


    [ 
https://issues.apache.org/jira/browse/HDFS-15147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17026958#comment-17026958
 ]


Íñigo Goiri commented on HDFS-15147:
------------------------------------

This looks very good.
A few minor comments:
* Avoid the blank line removeal after BlockManager#4819 and FSNamesystem#4427.
* In addition to adding VisibleForTesting, it would be nice to have a comment 
on what getLastRedundancyMonitorTS() and getLazyPersistFileScrubberTS() 
represent and why we are exposing them.
* Make the constants in LazyPersistTestCase consistent with SEC, MS, MSEC or 
whichever we think is good.
* I would argue that we should use TimeUnits.SECONDS.toMillis(XXX) for the 
constants instead of multiplying. It's true that this then adds a cast from 
long to int but I would say that we should provide a waitFor that takes long as 
an input.
* Extract the key and value in LazyPersistTestCase#474. Given a name with 
before and after would help reading it. BTW, should we check for the value to 
be larger instead of equal?
* Complete the javadoc comment for shutdownDataNodes(). I don't think it 
currently parses.
* It might be good to give javadocs to all the wait methods (e.g., 
waitForScrubberCycle(), waitForRedundancyCount()....).
* Let's use logger format {} in the new logging.
* Should we wrap the joinUninterruptibly() somewhat? What's the difference with 
the old Uninterruptibles#joinUninterruptibly()  BTW?

> LazyPersistTestCase wait logic is error pruned
> ----------------------------------------------
>
>                 Key: HDFS-15147
>                 URL: https://issues.apache.org/jira/browse/HDFS-15147
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Ahmed Hussein
>            Assignee: Ahmed Hussein
>            Priority: Minor
>         Attachments: HDFS-15147-branch-2.10.001.patch, HDFS-15147.001.patch
>
>
> {{LazyPersistTestCase}} has some issues hat lead to inconsistent result of 
> the test cases:
> * the wait periods to change of status is too long. It reaches 10 secs in 
> some cases.
> * triggerBlockReport() only triggers FBR of DN with index 0. This is counter 
> intuitive because the JUnit tests restart the DN assuming that the restarted 
> DN will send a FBR. However, this never happens because the DN will get a new 
> index post restart.
> {code:java}
>   protected final void triggerBlockReport()
>       throws IOException, InterruptedException {
>     // Trigger block report to NN
>     DataNodeTestUtils.triggerBlockReport(cluster.getDataNodes().get(0));
>     Thread.sleep(10 * 1000);
>   }
> {code}
> [~inigoiri] suggested that we propagate the findings and fixes from 
> HDFS-13179 and HDFS-15144 into {{LazyPersistTestCase.java}}. This will 
> eventually reduce the runtime and make the test cases more stable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-15147) LazyPersistTestCase wait logic is error pruned

Reply via email to