[jira] [Commented] (HDFS-16042) DatanodeAdminMonitor scan should be delay based
[ https://issues.apache.org/jira/browse/HDFS-16042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17381509#comment-17381509 ]

Ahmed Hussein commented on HDFS-16042:
--------------------------------------

I have uploaded patches for branch-2.10, branch-3.2, and branch-3.3 and noted the conflicts (if any) in the comments above.
Yetus has not triggered since yesterday, but I compiled the patches locally anyway (perhaps the Jira precommit jobs no longer run, following the community discussion and HADOOP-17798).

[~Jim_Brennan], can you please merge those patches into their respective branches?

> DatanodeAdminMonitor scan should be delay based
> -----------------------------------------------
>
> Key: HDFS-16042
> URL: https://issues.apache.org/jira/browse/HDFS-16042
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode
> Reporter: Ahmed Hussein
> Assignee: Ahmed Hussein
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: HDFS-16042-branch-2.10.001.patch,
> HDFS-16042-branch-3.2.001.patch, HDFS-16042-branch-3.3.001.patch
>
> Time Spent: 1.5h
> Remaining Estimate: 0h
>
> In {{DatanodeAdminManager.activate()}}, the Monitor task is scheduled with a
> fixed rate, i.e. the period is measured from start1 -> start2.
> {code:java}
> executor.scheduleAtFixedRate(monitor, intervalSecs, intervalSecs,
>    TimeUnit.SECONDS);
> {code}
> According to the Java API docs for {{scheduleAtFixedRate}},
> {quote}If any execution of this task takes longer than its period, then
> subsequent executions may start late, but will not concurrently
> execute.{quote}
> It should be a fixed delay, so that the interval is measured from end1 -> start2.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
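The scheduling change requested in the issue description can be sketched as follows. This is a minimal illustration, not the committed patch: the class `FixedDelayMonitorSketch` and the helpers `activate` and `runsRepeatedly` are hypothetical names, and the 10 ms interval is chosen only to keep the demo fast. The one real API involved is `ScheduledExecutorService.scheduleWithFixedDelay`, the standard counterpart of `scheduleAtFixedRate`.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the scheduling call in
// DatanodeAdminManager.activate(); this is not the actual Hadoop code.
public class FixedDelayMonitorSketch {

  // Schedules the monitor so that each wait is measured from the END of the
  // previous run to the start of the next one, not from start to start.
  static ScheduledFuture<?> activate(ScheduledExecutorService executor,
      Runnable monitor, long interval, TimeUnit unit) {
    return executor.scheduleWithFixedDelay(monitor, interval, interval, unit);
  }

  // Runs a trivial monitor until it has executed `runs` times.
  // Returns false on timeout or interruption.
  static boolean runsRepeatedly(int runs) {
    ScheduledExecutorService executor =
        Executors.newSingleThreadScheduledExecutor();
    CountDownLatch latch = new CountDownLatch(runs);
    try {
      activate(executor, latch::countDown, 10, TimeUnit.MILLISECONDS);
      return latch.await(5, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    } finally {
      executor.shutdownNow(); // stop the periodic task so the JVM can exit
    }
  }

  public static void main(String[] args) {
    System.out.println("monitor ran 3 times: " + runsRepeatedly(3));
  }
}
```

Both methods take the same arguments (task, initial delay, interval, unit), so the swap is a one-line change at the call site; only the reference point for the interval differs.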
[jira] [Commented] (HDFS-16042) DatanodeAdminMonitor scan should be delay based
[ https://issues.apache.org/jira/browse/HDFS-16042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17380724#comment-17380724 ]

Ahmed Hussein commented on HDFS-16042:
--------------------------------------

*cherry-picking to branch-3.2*
{code:bash}
git cherry-pick -x a2a0283c7be8eac641a256f06731cb6e4bab3b09
{code}
Conflicts arise because HDFS-14854, "_Create improved decommission monitor implementation_", is missing from this branch.

*cherry-picking to branch-2.10*
{code:bash}
git cherry-pick -x a2a0283c7be8eac641a256f06731cb6e4bab3b09
{code}
Conflicts arise because HDFS-14854, "_Create improved decommission monitor implementation_", is missing from this branch.
[jira] [Commented] (HDFS-16042) DatanodeAdminMonitor scan should be delay based
[ https://issues.apache.org/jira/browse/HDFS-16042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17356692#comment-17356692 ]

Kihwal Lee commented on HDFS-16042:
-----------------------------------

It is not related to any incident. Since the default interval is 30 seconds, the impact of the change will not be great, but it is still the right thing to do.
If many nodes entering decommissioning and/or maintenance mode are introduced at once, the initial scan can last seconds. This initial scan is not subject to the max blocks per iteration limit. Changing the schedule from fixed rate to fixed delay will dampen that impact a bit in the long run.

The patch looks good.
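To make the effect of a long initial scan concrete, here is a small arithmetic model of the two policies using the 30-second default interval mentioned above. It is only a sketch: the class `ScanTimeline`, its method names, and the scan durations (one 45-second scan followed by two 5-second scans) are hypothetical values chosen for illustration, not measurements from a cluster. The fixed-rate model assumes the documented behavior that a late run starts as soon as the previous one finishes and that runs never overlap.

```java
import java.util.Arrays;

// Pure arithmetic model of the two scheduling policies; no threads involved.
public class ScanTimeline {

  // Fixed rate: run i is due at initialDelay + i * period, but cannot start
  // before the previous run has finished (runs never overlap).
  static long[] fixedRateStarts(long initialDelay, long period,
      long[] durations) {
    long[] starts = new long[durations.length];
    long prevEnd = 0;
    for (int i = 0; i < durations.length; i++) {
      long due = initialDelay + i * period;
      starts[i] = Math.max(due, prevEnd);
      prevEnd = starts[i] + durations[i];
    }
    return starts;
  }

  // Fixed delay: each run starts `delay` after the previous run ended.
  static long[] fixedDelayStarts(long initialDelay, long delay,
      long[] durations) {
    long[] starts = new long[durations.length];
    long next = initialDelay;
    for (int i = 0; i < durations.length; i++) {
      starts[i] = next;
      next = starts[i] + durations[i] + delay;
    }
    return starts;
  }

  public static void main(String[] args) {
    long[] scanSecs = {45, 5, 5}; // hypothetical: one long initial scan
    // prints [30, 75, 90]
    System.out.println(Arrays.toString(fixedRateStarts(30, 30, scanSecs)));
    // prints [30, 105, 140]
    System.out.println(Arrays.toString(fixedDelayStarts(30, 30, scanSecs)));
  }
}
```

In this model the first scan ends at t=75 under both policies. With a fixed rate the second scan starts at t=75, immediately after the first one finishes, whereas with a fixed delay it starts at t=105, leaving a full 30 seconds between scans.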
[jira] [Commented] (HDFS-16042) DatanodeAdminMonitor scan should be delay based
[ https://issues.apache.org/jira/browse/HDFS-16042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17356468#comment-17356468 ]

Jim Brennan commented on HDFS-16042:
------------------------------------

[~ahussein] I believe this was filed due to an incident where decommissions were happening during a rolling upgrade. They were taking longer, and with the fixed rate they were holding the namesystem lock frequently enough that it was impacting other operations during the upgrade.
[~kihwal], [~daryn] is this correct?

Seems like a reasonable change to me. I will commit if you guys are ok with it.
[jira] [Commented] (HDFS-16042) DatanodeAdminMonitor scan should be delay based
[ https://issues.apache.org/jira/browse/HDFS-16042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17352535#comment-17352535 ]

Ahmed Hussein commented on HDFS-16042:
--------------------------------------

Hey [~Jim_Brennan], can you please take a look at [GitHub Pull Request #3058|https://github.com/apache/hadoop/pull/3058]?
It is a pretty straightforward change.