[ 
https://issues.apache.org/jira/browse/HDFS-16143?focusedWorklogId=631631&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-631631
 ]

ASF GitHub Bot logged work on HDFS-16143:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 30/Jul/21 13:35
            Start Date: 30/Jul/21 13:35
    Worklog Time Spent: 10m 
      Work Description: virajjasani commented on a change in pull request #3235:
URL: https://github.com/apache/hadoop/pull/3235#discussion_r679753606



##########
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestEditLogTailer.java
##########
@@ -433,15 +440,28 @@ public void 
testStandbyTriggersLogRollsWhenTailInProgressEdits()
         NameNodeAdapter.mkdirs(active, getDirPath(i),
             new PermissionStatus("test", "test",
             new FsPermission((short)00755)), true);
+        // reset lastRollTimeMs in EditLogTailer.
+        active.getNamesystem().getEditLogTailer().resetLastRollTimeMs();

Review comment:
       Thanks for taking a look @jojochuang. 
   `EditLogTailer` runs a background thread that decides when to trigger a 
log roll by calling the Active Namenode's `rollEditLog()` API.
   ```
       private void doWork() {
         long currentSleepTimeMs = sleepTimeMs;
         while (shouldRun) {
           long editsTailed  = 0;
           try {
             // There's no point in triggering a log roll if the Standby hasn't
             // read any more transactions since the last time a roll was
             // triggered.
             boolean triggeredLogRoll = false;
             if (tooLongSinceLastLoad() &&
                 lastRollTriggerTxId < lastLoadedTxnId) {
               triggerActiveLogRoll();
               triggeredLogRoll = true;
             }
   ...
   ...
   ```
   
   What happens in this test is that, while we create new dirs in this for 
loop, the `EditLogTailer` thread keeps checking and intermittently triggers a 
log roll through an RPC call to the Active Namenode. This makes the test flaky, 
because it expects the Standby Namenode's last applied txn id to be less than 
the Active Namenode's last written txn id within a time limit. How long the 
`EditLogTailer` thread waits before triggering a log roll depends on 
`lastRollTimeMs`.
   In the above code, tooLongSinceLastLoad() refers to:
   ```
     /**
      * @return true if the configured log roll period has elapsed.
      */
     private boolean tooLongSinceLastLoad() {
       return logRollPeriodMs >= 0 && 
         (monotonicNow() - lastRollTimeMs) > logRollPeriodMs;
     }
   ```
   Hence, until `logRollPeriodMs` has elapsed since `lastRollTimeMs`, no log 
roll is triggered. However, we have no control over how long the mkdir calls 
in this for loop take, and `logRollPeriodMs` can easily elapse in the 
meantime. When that happens, the thread in `EditLogTailer` rolls the log, so 
the Standby Namenode's txnId is no longer less than that of the Active 
Namenode as the test expects, and the test fails.
   
   Hence, it is important for this test to keep resetting `lastRollTimeMs` 
while the mkdir calls are executing, so that `tooLongSinceLastLoad()` has no 
chance of returning true until we want it to.
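
   To illustrate the timing argument, here is a minimal standalone sketch. It 
is not the actual `EditLogTailer` code: the class `RollTimerSketch`, the fake 
clock, and the helper `rollTriggeredDuring` are hypothetical names made up for 
this example, and it assumes that `resetLastRollTimeMs()` simply sets 
`lastRollTimeMs` to the current monotonic time.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for the roll-timing state kept by EditLogTailer. */
public class RollTimerSketch {
  private final long logRollPeriodMs;
  // Fake monotonic clock, advanced explicitly by the caller (assumption:
  // the real code reads monotonicNow() instead).
  private final AtomicLong nowMs;
  private long lastRollTimeMs;

  public RollTimerSketch(long logRollPeriodMs, AtomicLong nowMs) {
    this.logRollPeriodMs = logRollPeriodMs;
    this.nowMs = nowMs;
    this.lastRollTimeMs = nowMs.get();
  }

  /** Same predicate as quoted above, evaluated against the fake clock. */
  public boolean tooLongSinceLastLoad() {
    return logRollPeriodMs >= 0
        && (nowMs.get() - lastRollTimeMs) > logRollPeriodMs;
  }

  /** Assumed behaviour of the test hook: restart the roll period from now. */
  public void resetLastRollTimeMs() {
    lastRollTimeMs = nowMs.get();
  }

  /**
   * Simulates `ops` mkdir-like calls that each take opCostMs and reports
   * whether the roll condition became true after any of them.
   */
  public static boolean rollTriggeredDuring(
      int ops, long opCostMs, long periodMs, boolean resetEachOp) {
    AtomicLong clock = new AtomicLong(0);
    RollTimerSketch timer = new RollTimerSketch(periodMs, clock);
    boolean triggered = false;
    for (int i = 0; i < ops; i++) {
      clock.addAndGet(opCostMs);        // the operation takes some time
      if (resetEachOp) {
        timer.resetLastRollTimeMs();    // what the patched test does per mkdir
      }
      triggered |= timer.tooLongSinceLastLoad();
    }
    return triggered;
  }

  public static void main(String[] args) {
    // 10 operations of 30 ms each against a 100 ms roll period.
    System.out.println(rollTriggeredDuring(10, 30, 100, false)); // prints "true"
    System.out.println(rollTriggeredDuring(10, 30, 100, true));  // prints "false"
  }
}
```

   Without the reset, the cumulative time of the operations exceeds the roll 
period partway through the loop. With a reset after every operation, the 
elapsed time seen by the predicate never exceeds the cost of a single 
operation.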




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 631631)
    Time Spent: 3.5h  (was: 3h 20m)

> TestEditLogTailer#testStandbyTriggersLogRollsWhenTailInProgressEdits is flaky
> -----------------------------------------------------------------------------
>
>                 Key: HDFS-16143
>                 URL: https://issues.apache.org/jira/browse/HDFS-16143
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: test
>            Reporter: Akira Ajisaka
>            Assignee: Viraj Jasani
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
>
>          Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3229/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
> {quote}
> [ERROR] 
> testStandbyTriggersLogRollsWhenTailInProgressEdits[0](org.apache.hadoop.hdfs.server.namenode.ha.TestEditLogTailer)
>   Time elapsed: 6.862 s  <<< FAILURE!
> java.lang.AssertionError
>       at org.junit.Assert.fail(Assert.java:87)
>       at org.junit.Assert.assertTrue(Assert.java:42)
>       at org.junit.Assert.assertTrue(Assert.java:53)
>       at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestEditLogTailer.testStandbyTriggersLogRollsWhenTailInProgressEdits(TestEditLogTailer.java:444)
> {quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
