[jira] [Work logged] (HDFS-16237) Record the BPServiceActor information that communicates with Standby

2021-09-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16237?focusedWorklogId=655140&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-655140
 ]

ASF GitHub Bot logged work on HDFS-16237:
-

Author: ASF GitHub Bot
Created on: 25/Sep/21 00:25
Start Date: 25/Sep/21 00:25
Worklog Time Spent: 10m 
  Work Description: jianghuazhu commented on pull request #3479:
URL: https://github.com/apache/hadoop/pull/3479#issuecomment-926980485


   Thanks @prasad-acit for the comment.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 655140)
Time Spent: 50m  (was: 40m)

> Record the BPServiceActor information that communicates with Standby
> 
>
> Key: HDFS-16237
> URL: https://issues.apache.org/jira/browse/HDFS-16237
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> When a BPServiceActor communicates with a Standby NameNode, the specific 
> BPServiceActor information should be recorded. Currently it is filtered out 
> directly.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15910) Replace bzero with explicit_bzero for better safety

2021-09-24 Thread Bryan Beaudreault (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17419988#comment-17419988
 ] 

Bryan Beaudreault commented on HDFS-15910:
--

The linked issue has been backported to branch 3.3, so this is no longer an 
issue for 3.3.2+

> Replace bzero with explicit_bzero for better safety
> ---
>
> Key: HDFS-15910
> URL: https://issues.apache.org/jira/browse/HDFS-15910
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: libhdfs++
>Affects Versions: 3.2.2
>Reporter: Gautham Banasandra
>Assignee: Gautham Banasandra
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> It is better to always use explicit_bzero since it guarantees that the buffer 
> will be cleared irrespective of the compiler optimizations - 
> https://man7.org/linux/man-pages/man3/bzero.3.html.
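The guarantee the description relies on can be illustrated with a portable stand-in. This is a sketch, not libhdfs++ code: `secure_zero` and the volatile-function-pointer trick are illustrative (explicit_bzero itself only exists in glibc 2.25 and later, and in some BSDs).

```c
#include <stddef.h>
#include <string.h>

/*
 * A plain memset(buf, 0, len) just before a buffer goes out of scope can
 * be removed by the compiler's dead-store elimination, since the buffer
 * is never read again. explicit_bzero guarantees the stores happen.
 * Where it is unavailable, calling memset through a volatile function
 * pointer has a similar effect, because the compiler cannot prove which
 * function will actually be called.
 */
static void *(*const volatile memset_ptr)(void *, int, size_t) = memset;

/* Illustrative stand-in for explicit_bzero. */
static void secure_zero(void *buf, size_t len) {
    memset_ptr(buf, 0, len);
}
```

The same pattern is used by several crypto libraries for wiping key material.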



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15977) Call explicit_bzero only if it is available

2021-09-24 Thread Bryan Beaudreault (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17419987#comment-17419987
 ] 

Bryan Beaudreault commented on HDFS-15977:
--

Just to close the loop, with that fix in place I was again able to build Hadoop 
3.3.1 from CentOS 6 with no other changes. Thanks again.

> Call explicit_bzero only if it is available
> ---
>
> Key: HDFS-15977
> URL: https://issues.apache.org/jira/browse/HDFS-15977
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: libhdfs++
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.2
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> CentOS/RHEL 7 ships glibc 2.17, which does not support explicit_bzero. We 
> don't want to drop support for CentOS/RHEL 7, so we should call 
> explicit_bzero only if it is available. 
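A fix of this shape typically uses a configure-time probe. A sketch in CMake (the cache variable name HAVE_EXPLICIT_BZERO is illustrative, not necessarily the one used in the Hadoop build):

```cmake
include(CheckSymbolExists)

# explicit_bzero appeared in glibc 2.25; CentOS/RHEL 7 (glibc 2.17) lacks it.
check_symbol_exists(explicit_bzero "string.h" HAVE_EXPLICIT_BZERO)
if(HAVE_EXPLICIT_BZERO)
  add_definitions(-DHAVE_EXPLICIT_BZERO)
endif()
```

The C++ code can then guard the call with `#ifdef HAVE_EXPLICIT_BZERO` and fall back to a memset-based wipe otherwise.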



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16237) Record the BPServiceActor information that communicates with Standby

2021-09-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16237?focusedWorklogId=655048&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-655048
 ]

ASF GitHub Bot logged work on HDFS-16237:
-

Author: ASF GitHub Bot
Created on: 24/Sep/21 18:57
Start Date: 24/Sep/21 18:57
Worklog Time Spent: 10m 
  Work Description: prasad-acit commented on pull request #3479:
URL: https://github.com/apache/hadoop/pull/3479#issuecomment-926853279


   @jianghuazhu Thanks for reporting the issue & the patch.
   Code changes are fine.
   +1


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 655048)
Time Spent: 40m  (was: 0.5h)

> Record the BPServiceActor information that communicates with Standby
> 
>
> Key: HDFS-16237
> URL: https://issues.apache.org/jira/browse/HDFS-16237
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> When a BPServiceActor communicates with a Standby NameNode, the specific 
> BPServiceActor information should be recorded. Currently it is filtered out 
> directly.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16000) HDFS : Rename performance optimization

2021-09-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16000?focusedWorklogId=655046&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-655046
 ]

ASF GitHub Bot logged work on HDFS-16000:
-

Author: ASF GitHub Bot
Created on: 24/Sep/21 18:39
Start Date: 24/Sep/21 18:39
Worklog Time Spent: 10m 
  Work Description: jojochuang commented on a change in pull request #2964:
URL: https://github.com/apache/hadoop/pull/2964#discussion_r715810510



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java
##
@@ -1091,6 +1091,24 @@ static void unprotectedUpdateCount(INodesInPath 
inodesInPath,
 }
   }
 
+  /**
+   * check that all parent directories have quotas set.
+   */
+  static boolean verifyIsQuota(INodesInPath iip, int pos) {

Review comment:
   It looks like pos is always assigned iip.length() - 1, so why not simplify 
the logic and remove the pos parameter?

##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirRenameOp.java
##
@@ -417,12 +434,32 @@ static RenameResult unprotectedRenameTo(FSDirectory fsd,
 
 // Ensure dst has quota to accommodate rename
 verifyFsLimitsForRename(fsd, srcIIP, dstIIP);
-verifyQuotaForRename(fsd, srcIIP, dstIIP);
+QuotaCounts srcPolicyCounts = new QuotaCounts.Builder(true).build();
+QuotaCounts dstPolicyCounts = new QuotaCounts.Builder(true).build();
+boolean srcIIPIsQuota = FSDirectory.verifyIsQuota(
+srcIIP, srcIIP.length() - 1);
+boolean dstIIPIsQuota = FSDirectory.verifyIsQuota(
+dstIIP, dstIIP.length() - 1);
+if (srcIIPIsQuota || dstIIPIsQuota) {
+  if (dstParent.getStoragePolicyID() ==
+  srcIIP.getLastINode().getStoragePolicyID()) {
+srcPolicyCounts = srcIIP.getLastINode().computeQuotaUsage(bsps);

Review comment:
   Lines 446 and 449 are the same. You could move that statement up to line 444.

##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java
##
@@ -1091,6 +1091,24 @@ static void unprotectedUpdateCount(INodesInPath 
inodesInPath,
 }
   }
 
+  /**
+   * check that all parent directories have quotas set.
+   */
+  static boolean verifyIsQuota(INodesInPath iip, int pos) {
+for (int i = (Math.min(pos, iip.length()) - 1); i > 0; i--) {
+  INode currNode = iip.getINode(i);
+  if (currNode == null) {
+continue;
+  }
+  if (currNode.isDirectory()) {
+if (currNode.isQuotaSet()) {

Review comment:
   If the bottom-most directory has a quota set, does that imply that all of 
its ancestor directories have quotas set?

##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirRenameOp.java
##
@@ -417,12 +434,32 @@ static RenameResult unprotectedRenameTo(FSDirectory fsd,
 
 // Ensure dst has quota to accommodate rename
 verifyFsLimitsForRename(fsd, srcIIP, dstIIP);
-verifyQuotaForRename(fsd, srcIIP, dstIIP);

Review comment:
   This entire method, unprotectedRenameTo(), needs a refactor. It's way too 
long and too complex to understand.

##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java
##
@@ -1091,6 +1091,24 @@ static void unprotectedUpdateCount(INodesInPath 
inodesInPath,
 }
   }
 
+  /**
+   * check that all parent directories have quotas set.
+   */
+  static boolean verifyIsQuota(INodesInPath iip, int pos) {

Review comment:
   The method name verifyIsQuota() isn't intuitive.
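For readers following the thread, here is one plausible reading of the ancestor walk being reviewed, with the suggested simplification applied (always starting from the last inode instead of passing pos). This is a toy sketch, not Hadoop code: `Node` stands in for INode, and `hasAncestorQuota` is a hypothetical name.

```java
import java.util.List;

/** Toy stand-in for INode: just a directory flag and a quota flag. */
class Node {
    final boolean isDirectory;
    final boolean quotaSet;
    Node(boolean isDirectory, boolean quotaSet) {
        this.isDirectory = isDirectory;
        this.quotaSet = quotaSet;
    }
}

class QuotaWalk {
    /**
     * Returns true if any ancestor directory of the last inode in the
     * path has a quota set. Walks from the parent of the last inode
     * toward the root, excluding index 0, matching the original loop
     * bound (i > 0).
     */
    static boolean hasAncestorQuota(List<Node> path) {
        for (int i = path.size() - 2; i > 0; i--) {
            Node n = path.get(i);
            if (n != null && n.isDirectory && n.quotaSet) {
                return true;
            }
        }
        return false;
    }
}
```

Dropping the pos parameter removes the Math.min clamp entirely, which is the simplification the reviewer is asking about.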




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 655046)
Time Spent: 40m  (was: 0.5h)

> HDFS : Rename performance optimization
> --
>
> Key: HDFS-16000
> URL: https://issues.apache.org/jira/browse/HDFS-16000
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, namenode
>Affects Versions: 3.1.4, 3.3.1
>Reporter: Xiangyi Zhu
>Assignee: Xiangyi Zhu
>Priority: Major
>  Labels: pull-request-available
> Attachments: 20210428-143238.svg, 20210428-171635-lambda.svg, 
> HDFS-16000.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> It takes a long time to move a large directory with rename. For example, it 
> takes about 40 seconds to move a directory with 10 million (1000W) entries. 
> When a large amount of data is deleted to the trash, the 

[jira] [Commented] (HDFS-16235) Deadlock in LeaseRenewer for static remove method

2021-09-24 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17419920#comment-17419920
 ] 

Renukaprasad C commented on HDFS-16235:
---

Thanks [~angerszhuuu] for the clarification. It's clear, and the PR changes are fine.

PR LGTM. [~ferhui] Thanks for the review; we shall merge the PR.

> Deadlock in LeaseRenewer for static remove method
> -
>
> Key: HDFS-16235
> URL: https://issues.apache.org/jira/browse/HDFS-16235
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: angerszhu
>Assignee: angerszhu
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-16235.001.patch, image-2021-09-23-19-31-57-337.png
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> !image-2021-09-23-19-31-57-337.png|width=3339,height=1936!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16233) Do not use exception handler to implement copy-on-write for EnumCounters

2021-09-24 Thread Erik Krogen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-16233:
---
Fix Version/s: 3.3.2
   3.2.3
   2.10.2
   3.4.0
   3.1.5

> Do not use exception handler to implement copy-on-write for EnumCounters
> 
>
> Key: HDFS-16233
> URL: https://issues.apache.org/jira/browse/HDFS-16233
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 2.10.2, 3.2.3, 3.3.2, 3.1.5
>
> Attachments: Screen Shot 2021-09-22 at 1.59.59 PM.png, 
> profile_c7_delete_asyncaudit.html
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> HDFS-14547 saves the NameNode heap space occupied by EnumCounters by 
> essentially implementing a copy-on-write strategy.
> At the beginning, all EnumCounters instances refer to the same 
> ConstEnumCounters to save heap space. When one is modified, an exception is 
> thrown and the exception handler converts the ConstEnumCounters into an 
> EnumCounters object and updates it.
> Using an exception handler for anything more than occasional work is bad for 
> performance. 
> Proposal: use the instanceof keyword to detect the type of the object and do 
> COW accordingly.
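The proposed instanceof-based copy-on-write can be sketched with simplified stand-ins. These classes (Counters, ConstCounters, QuotaUsage) are illustrative only, not the actual EnumCounters/ConstEnumCounters in Hadoop; they just demonstrate replacing the try/catch trigger with an instanceof check.

```java
import java.util.Arrays;

/** Mutable counter array, standing in for EnumCounters. */
class Counters {
    protected long[] values;
    Counters(int n) { values = new long[n]; }
    void add(int idx, long delta) { values[idx] += delta; }
    long get(int idx) { return values[idx]; }
}

/** Immutable shared singleton, standing in for ConstEnumCounters. */
final class ConstCounters extends Counters {
    // One instance shared by every quota object until its first write.
    static final ConstCounters EMPTY = new ConstCounters(4);
    ConstCounters(int n) { super(n); }
    @Override
    void add(int idx, long delta) {
        throw new UnsupportedOperationException("immutable");
    }
}

class QuotaUsage {
    private Counters counters = ConstCounters.EMPTY;

    /**
     * Copy-on-write: materialize a private mutable copy on the first
     * modification, detected with instanceof rather than by catching
     * the UnsupportedOperationException.
     */
    void add(int idx, long delta) {
        if (counters instanceof ConstCounters) {
            Counters copy = new Counters(counters.values.length);
            copy.values = Arrays.copyOf(counters.values, counters.values.length);
            counters = copy;
        }
        counters.add(idx, delta);
    }

    long get(int idx) { return counters.get(idx); }
}
```

The instanceof test is a cheap branch on the hot path, whereas throwing an exception materializes a stack trace, which is exactly the overhead the issue describes.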



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-16233) Do not use exception handler to implement copy-on-write for EnumCounters

2021-09-24 Thread Erik Krogen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen resolved HDFS-16233.

Resolution: Fixed

> Do not use exception handler to implement copy-on-write for EnumCounters
> 
>
> Key: HDFS-16233
> URL: https://issues.apache.org/jira/browse/HDFS-16233
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 2.10.2, 3.2.3, 3.3.2, 3.1.5
>
> Attachments: Screen Shot 2021-09-22 at 1.59.59 PM.png, 
> profile_c7_delete_asyncaudit.html
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> HDFS-14547 saves the NameNode heap space occupied by EnumCounters by 
> essentially implementing a copy-on-write strategy.
> At the beginning, all EnumCounters instances refer to the same 
> ConstEnumCounters to save heap space. When one is modified, an exception is 
> thrown and the exception handler converts the ConstEnumCounters into an 
> EnumCounters object and updates it.
> Using an exception handler for anything more than occasional work is bad for 
> performance. 
> Proposal: use the instanceof keyword to detect the type of the object and do 
> COW accordingly.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16233) Do not use exception handler to implement copy-on-write for EnumCounters

2021-09-24 Thread Erik Krogen (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17419878#comment-17419878
 ] 

Erik Krogen commented on HDFS-16233:


Merged to {{trunk}} and backported to branch-3.3, branch-3.2, branch-3.1, 
branch-2.10

Thanks [~weichiu]!

> Do not use exception handler to implement copy-on-write for EnumCounters
> 
>
> Key: HDFS-16233
> URL: https://issues.apache.org/jira/browse/HDFS-16233
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Major
>  Labels: pull-request-available
> Attachments: Screen Shot 2021-09-22 at 1.59.59 PM.png, 
> profile_c7_delete_asyncaudit.html
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> HDFS-14547 saves the NameNode heap space occupied by EnumCounters by 
> essentially implementing a copy-on-write strategy.
> At the beginning, all EnumCounters instances refer to the same 
> ConstEnumCounters to save heap space. When one is modified, an exception is 
> thrown and the exception handler converts the ConstEnumCounters into an 
> EnumCounters object and updates it.
> Using an exception handler for anything more than occasional work is bad for 
> performance. 
> Proposal: use the instanceof keyword to detect the type of the object and do 
> COW accordingly.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16233) Do not use exception handler to implement copy-on-write for EnumCounters

2021-09-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16233?focusedWorklogId=655010&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-655010
 ]

ASF GitHub Bot logged work on HDFS-16233:
-

Author: ASF GitHub Bot
Created on: 24/Sep/21 16:06
Start Date: 24/Sep/21 16:06
Worklog Time Spent: 10m 
  Work Description: xkrogen commented on pull request #3468:
URL: https://github.com/apache/hadoop/pull/3468#issuecomment-926746135


   Merged to `trunk`. I plan to backport to `branch-3.3` ~ `branch-2.10` as well


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 655010)
Time Spent: 1h 10m  (was: 1h)

> Do not use exception handler to implement copy-on-write for EnumCounters
> 
>
> Key: HDFS-16233
> URL: https://issues.apache.org/jira/browse/HDFS-16233
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Major
>  Labels: pull-request-available
> Attachments: Screen Shot 2021-09-22 at 1.59.59 PM.png, 
> profile_c7_delete_asyncaudit.html
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> HDFS-14547 saves the NameNode heap space occupied by EnumCounters by 
> essentially implementing a copy-on-write strategy.
> At the beginning, all EnumCounters instances refer to the same 
> ConstEnumCounters to save heap space. When one is modified, an exception is 
> thrown and the exception handler converts the ConstEnumCounters into an 
> EnumCounters object and updates it.
> Using an exception handler for anything more than occasional work is bad for 
> performance. 
> Proposal: use the instanceof keyword to detect the type of the object and do 
> COW accordingly.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16233) Do not use exception handler to implement copy-on-write for EnumCounters

2021-09-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16233?focusedWorklogId=655003&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-655003
 ]

ASF GitHub Bot logged work on HDFS-16233:
-

Author: ASF GitHub Bot
Created on: 24/Sep/21 15:35
Start Date: 24/Sep/21 15:35
Worklog Time Spent: 10m 
  Work Description: xkrogen merged pull request #3468:
URL: https://github.com/apache/hadoop/pull/3468


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 655003)
Time Spent: 1h  (was: 50m)

> Do not use exception handler to implement copy-on-write for EnumCounters
> 
>
> Key: HDFS-16233
> URL: https://issues.apache.org/jira/browse/HDFS-16233
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Major
>  Labels: pull-request-available
> Attachments: Screen Shot 2021-09-22 at 1.59.59 PM.png, 
> profile_c7_delete_asyncaudit.html
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> HDFS-14547 saves the NameNode heap space occupied by EnumCounters by 
> essentially implementing a copy-on-write strategy.
> At the beginning, all EnumCounters instances refer to the same 
> ConstEnumCounters to save heap space. When one is modified, an exception is 
> thrown and the exception handler converts the ConstEnumCounters into an 
> EnumCounters object and updates it.
> Using an exception handler for anything more than occasional work is bad for 
> performance. 
> Proposal: use the instanceof keyword to detect the type of the object and do 
> COW accordingly.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15977) Call explicit_bzero only if it is available

2021-09-24 Thread Bryan Beaudreault (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17419741#comment-17419741
 ] 

Bryan Beaudreault commented on HDFS-15977:
--

Thank you!

> Call explicit_bzero only if it is available
> ---
>
> Key: HDFS-15977
> URL: https://issues.apache.org/jira/browse/HDFS-15977
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: libhdfs++
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.2
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> CentOS/RHEL 7 ships glibc 2.17, which does not support explicit_bzero. We 
> don't want to drop support for CentOS/RHEL 7, so we should call 
> explicit_bzero only if it is available. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16231) Fix TestDataNodeMetrics#testReceivePacketSlowMetrics

2021-09-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16231?focusedWorklogId=654927&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-654927
 ]

ASF GitHub Bot logged work on HDFS-16231:
-

Author: ASF GitHub Bot
Created on: 24/Sep/21 11:14
Start Date: 24/Sep/21 11:14
Worklog Time Spent: 10m 
  Work Description: haiyang1987 edited a comment on pull request #3471:
URL: https://github.com/apache/hadoop/pull/3471#issuecomment-926538825


   @ferhui Thanks for your reply!
   There are two problems:
   1. The current code uses the wrong metric names, so the actual metrics 
(e.g. TotalPacketsReceived, TotalPacketsSlowWriteToMirror, 
TotalPacketsSlowWriteToDisk, TotalPacketsSlowWriteToOsCache) cannot be fetched:
 
   
 MetricsRecordBuilder dnMetrics = 
getMetrics(datanode.getMetrics().name());
 assertTrue("More than 1 packet received",
 getLongCounter("TotalPacketsReceived", dnMetrics) > 1L); 
 assertTrue("More than 1 slow packet to mirror",
 getLongCounter("TotalPacketsSlowWriteToMirror", dnMetrics) > 1L);
 assertCounter("TotalPacketsSlowWriteToDisk", 1L, dnMetrics);
 assertCounter("TotalPacketsSlowWriteToOsCache", 0L, dnMetrics);
   2. Currently, we need to pick the first or second DataNode of the write 
pipeline in order to read that DataNode's ReceivePacketSlowMetrics:
   
   ```
   List<DataNode> datanodes = cluster.getDataNodes();
   DataNode datanode = datanodes.get(0);
   // If the DataNode obtained here is the third node of the write pipeline,
   // metrics such as PacketsSlowWriteToMirror cannot be observed
   MetricsRecordBuilder dnMetrics = getMetrics(datanode.getMetrics().name());  
   ...
assertTrue("More than 1 slow packet to mirror",
 getLongCounter("PacketsSlowWriteToMirror", dnMetrics) > 1L);
   ...
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 654927)
Time Spent: 1h  (was: 50m)

> Fix TestDataNodeMetrics#testReceivePacketSlowMetrics
> 
>
> Key: HDFS-16231
> URL: https://issues.apache.org/jira/browse/HDFS-16231
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Haiyang Hu
>Assignee: Haiyang Hu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> TestDataNodeMetrics#testReceivePacketSlowMetrics fails with stacktrace:
> {code:java}
> java.lang.AssertionError: Expected exactly one metric for name 
> TotalPacketsReceived 
> Expected :1
> Actual   :0
>  
>   at org.junit.Assert.fail(Assert.java:89)
>   at org.junit.Assert.failNotEquals(Assert.java:835)
>   at org.junit.Assert.assertEquals(Assert.java:647)
>   at 
> org.apache.hadoop.test.MetricsAsserts.checkCaptured(MetricsAsserts.java:278)
>   at 
> org.apache.hadoop.test.MetricsAsserts.getLongCounter(MetricsAsserts.java:237)
>   at 
> org.apache.hadoop.hdfs.server.datanode.TestDataNodeMetrics.testReceivePacketSlowMetrics(TestDataNodeMetrics.java:200)
> {code}
> {code:java}
> // Wrong metric names in the current code, e.g. 
> TotalPacketsReceived,TotalPacketsSlowWriteToMirror,TotalPacketsSlowWriteToDisk,TotalPacketsSlowWriteToOsCache
>   MetricsRecordBuilder dnMetrics = 
> getMetrics(datanode.getMetrics().name());
>   assertTrue("More than 1 packet received",
>   getLongCounter("TotalPacketsReceived", dnMetrics) > 1L); 
>   assertTrue("More than 1 slow packet to mirror",
>   getLongCounter("TotalPacketsSlowWriteToMirror", dnMetrics) > 1L);
>   assertCounter("TotalPacketsSlowWriteToDisk", 1L, dnMetrics);
>   assertCounter("TotalPacketsSlowWriteToOsCache", 0L, dnMetrics);
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16231) Fix TestDataNodeMetrics#testReceivePacketSlowMetrics

2021-09-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16231?focusedWorklogId=654926&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-654926
 ]

ASF GitHub Bot logged work on HDFS-16231:
-

Author: ASF GitHub Bot
Created on: 24/Sep/21 11:13
Start Date: 24/Sep/21 11:13
Worklog Time Spent: 10m 
  Work Description: haiyang1987 edited a comment on pull request #3471:
URL: https://github.com/apache/hadoop/pull/3471#issuecomment-926538825


   @ferhui Thanks for your reply!
   There are two problems:
   1. The current code uses the wrong metric names, so the actual metrics 
(e.g. TotalPacketsReceived, TotalPacketsSlowWriteToMirror, 
TotalPacketsSlowWriteToDisk, TotalPacketsSlowWriteToOsCache) cannot be fetched:
 
   
 MetricsRecordBuilder dnMetrics = 
getMetrics(datanode.getMetrics().name());
 assertTrue("More than 1 packet received",
 getLongCounter("TotalPacketsReceived", dnMetrics) > 1L); 
 assertTrue("More than 1 slow packet to mirror",
 getLongCounter("TotalPacketsSlowWriteToMirror", dnMetrics) > 1L);
 assertCounter("TotalPacketsSlowWriteToDisk", 1L, dnMetrics);
 assertCounter("TotalPacketsSlowWriteToOsCache", 0L, dnMetrics);
   2. Currently, we need to pick the first or second DataNode of the write 
pipeline in order to read that DataNode's ReceivePacketSlowMetrics:
   
   ```
   List<DataNode> datanodes = cluster.getDataNodes();
   DataNode datanode = datanodes.get(0);
   // If the DataNode obtained here is the third node of the write pipeline,
   // metrics such as PacketsSlowWriteToMirror cannot be observed
   MetricsRecordBuilder dnMetrics = getMetrics(datanode.getMetrics().name());  
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 654926)
Time Spent: 50m  (was: 40m)

> Fix TestDataNodeMetrics#testReceivePacketSlowMetrics
> 
>
> Key: HDFS-16231
> URL: https://issues.apache.org/jira/browse/HDFS-16231
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Haiyang Hu
>Assignee: Haiyang Hu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> TestDataNodeMetrics#testReceivePacketSlowMetrics fails with stacktrace:
> {code:java}
> java.lang.AssertionError: Expected exactly one metric for name 
> TotalPacketsReceived 
> Expected :1
> Actual   :0
>  
>   at org.junit.Assert.fail(Assert.java:89)
>   at org.junit.Assert.failNotEquals(Assert.java:835)
>   at org.junit.Assert.assertEquals(Assert.java:647)
>   at 
> org.apache.hadoop.test.MetricsAsserts.checkCaptured(MetricsAsserts.java:278)
>   at 
> org.apache.hadoop.test.MetricsAsserts.getLongCounter(MetricsAsserts.java:237)
>   at 
> org.apache.hadoop.hdfs.server.datanode.TestDataNodeMetrics.testReceivePacketSlowMetrics(TestDataNodeMetrics.java:200)
> {code}
> {code:java}
> // Wrong metric names in the current code, e.g. 
> TotalPacketsReceived,TotalPacketsSlowWriteToMirror,TotalPacketsSlowWriteToDisk,TotalPacketsSlowWriteToOsCache
>   MetricsRecordBuilder dnMetrics = 
> getMetrics(datanode.getMetrics().name());
>   assertTrue("More than 1 packet received",
>   getLongCounter("TotalPacketsReceived", dnMetrics) > 1L); 
>   assertTrue("More than 1 slow packet to mirror",
>   getLongCounter("TotalPacketsSlowWriteToMirror", dnMetrics) > 1L);
>   assertCounter("TotalPacketsSlowWriteToDisk", 1L, dnMetrics);
>   assertCounter("TotalPacketsSlowWriteToOsCache", 0L, dnMetrics);
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16231) Fix TestDataNodeMetrics#testReceivePacketSlowMetrics

2021-09-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16231?focusedWorklogId=654925&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-654925
 ]

ASF GitHub Bot logged work on HDFS-16231:
-

Author: ASF GitHub Bot
Created on: 24/Sep/21 11:03
Start Date: 24/Sep/21 11:03
Worklog Time Spent: 10m 
  Work Description: haiyang1987 commented on pull request #3471:
URL: https://github.com/apache/hadoop/pull/3471#issuecomment-926538825


   @ferhui Thanks for your reply!
   There are two problems:
   1. The current code uses the wrong metric names, so the actual metrics 
(e.g. TotalPacketsReceived, TotalPacketsSlowWriteToMirror, 
TotalPacketsSlowWriteToDisk, TotalPacketsSlowWriteToOsCache) cannot be fetched:
 
   
 MetricsRecordBuilder dnMetrics = 
getMetrics(datanode.getMetrics().name());
 assertTrue("More than 1 packet received",
 getLongCounter("TotalPacketsReceived", dnMetrics) > 1L); 
 assertTrue("More than 1 slow packet to mirror",
 getLongCounter("TotalPacketsSlowWriteToMirror", dnMetrics) > 1L);
 assertCounter("TotalPacketsSlowWriteToDisk", 1L, dnMetrics);
 assertCounter("TotalPacketsSlowWriteToOsCache", 0L, dnMetrics);
   2. Currently, we need to pick the first or second DataNode of the write 
pipeline in order to read that DataNode's ReceivePacketSlowMetrics:
   
   ```
   List<DataNode> datanodes = cluster.getDataNodes();
   DataNode datanode = datanodes.get(0);
   // If the DataNode obtained here is the third node of the write pipeline,
   // metrics such as PacketsSlowWriteToMirror, PacketsSlowWriteToDisk and
   // PacketsSlowWriteToOsCache cannot be observed
   MetricsRecordBuilder dnMetrics = getMetrics(datanode.getMetrics().name());  
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 654925)
Time Spent: 40m  (was: 0.5h)

> Fix TestDataNodeMetrics#testReceivePacketSlowMetrics
> 
>
> Key: HDFS-16231
> URL: https://issues.apache.org/jira/browse/HDFS-16231
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Haiyang Hu
>Assignee: Haiyang Hu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> TestDataNodeMetrics#testReceivePacketSlowMetrics fails with stacktrace:
> {code:java}
> java.lang.AssertionError: Expected exactly one metric for name 
> TotalPacketsReceived 
> Expected :1
> Actual   :0
>  
>   at org.junit.Assert.fail(Assert.java:89)
>   at org.junit.Assert.failNotEquals(Assert.java:835)
>   at org.junit.Assert.assertEquals(Assert.java:647)
>   at 
> org.apache.hadoop.test.MetricsAsserts.checkCaptured(MetricsAsserts.java:278)
>   at 
> org.apache.hadoop.test.MetricsAsserts.getLongCounter(MetricsAsserts.java:237)
>   at 
> org.apache.hadoop.hdfs.server.datanode.TestDataNodeMetrics.testReceivePacketSlowMetrics(TestDataNodeMetrics.java:200)
> {code}
> {code:java}
> // Incorrect metric names in the current code, e.g. 
> TotalPacketsReceived,TotalPacketsSlowWriteToMirror,TotalPacketsSlowWriteToDisk,TotalPacketsSlowWriteToOsCache
>   MetricsRecordBuilder dnMetrics = 
> getMetrics(datanode.getMetrics().name());
>   assertTrue("More than 1 packet received",
>   getLongCounter("TotalPacketsReceived", dnMetrics) > 1L); 
>   assertTrue("More than 1 slow packet to mirror",
>   getLongCounter("TotalPacketsSlowWriteToMirror", dnMetrics) > 1L);
>   assertCounter("TotalPacketsSlowWriteToDisk", 1L, dnMetrics);
>   assertCounter("TotalPacketsSlowWriteToOsCache", 0L, dnMetrics);
> {code}






[jira] [Work logged] (HDFS-16237) Record the BPServiceActor information that communicates with Standby

2021-09-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16237?focusedWorklogId=654924&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-654924
 ]

ASF GitHub Bot logged work on HDFS-16237:
-

Author: ASF GitHub Bot
Created on: 24/Sep/21 10:59
Start Date: 24/Sep/21 10:59
Worklog Time Spent: 10m 
  Work Description: jianghuazhu commented on pull request #3479:
URL: https://github.com/apache/hadoop/pull/3479#issuecomment-926536279


   The errors reported by Jenkins don't look too serious.
   @ayushtkn @virajjasani, can you help review this PR?
   Thank you very much.




Issue Time Tracking
---

Worklog Id: (was: 654924)
Time Spent: 0.5h  (was: 20m)

> Record the BPServiceActor information that communicates with Standby
> 
>
> Key: HDFS-16237
> URL: https://issues.apache.org/jira/browse/HDFS-16237
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When a BPServiceActor communicates with a Standby NameNode, the specific 
> BPServiceActor information should be recorded. Currently it is filtered out 
> directly.






[jira] [Work logged] (HDFS-16237) Record the BPServiceActor information that communicates with Standby

2021-09-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16237?focusedWorklogId=654911&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-654911
 ]

ASF GitHub Bot logged work on HDFS-16237:
-

Author: ASF GitHub Bot
Created on: 24/Sep/21 10:19
Start Date: 24/Sep/21 10:19
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3479:
URL: https://github.com/apache/hadoop/pull/3479#issuecomment-926514927


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |  12m 17s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  31m 34s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 22s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   1m 18s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   1m  0s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 26s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 57s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 33s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   5m 18s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  23m 26s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 12s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 15s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |   1m 15s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  8s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |   1m  8s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 51s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 15s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 46s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 24s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m  6s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  20m 58s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  | 235m 26s |  |  hadoop-hdfs in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 47s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 344m  8s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3479/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3479 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux 403f8bb0ff0b 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / db27430bc1fe892574e051ccb236002b06c484c8 |
   | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3479/1/testReport/ |
   | Max. process+thread count | 3674 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3479/1/console |
   | versions | git=2.25.1 

[jira] [Commented] (HDFS-16235) Deadlock in LeaseRenewer for static remove method

2021-09-24 Thread Hui Fei (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17419706#comment-17419706
 ] 

Hui Fei commented on HDFS-16235:


[~angerszhuuu] Thanks for your explanation.

[~prasad-acit] Would you mind if I merge this PR?

> Deadlock in LeaseRenewer for static remove method
> -
>
> Key: HDFS-16235
> URL: https://issues.apache.org/jira/browse/HDFS-16235
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: angerszhu
>Assignee: angerszhu
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-16235.001.patch, image-2021-09-23-19-31-57-337.png
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> !image-2021-09-23-19-31-57-337.png|width=3339,height=1936!






[jira] [Assigned] (HDFS-16235) Deadlock in LeaseRenewer for static remove method

2021-09-24 Thread Hui Fei (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hui Fei reassigned HDFS-16235:
--

Assignee: angerszhu  (was: angerszhu)

> Deadlock in LeaseRenewer for static remove method
> -
>
> Key: HDFS-16235
> URL: https://issues.apache.org/jira/browse/HDFS-16235
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: angerszhu
>Assignee: angerszhu
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-16235.001.patch, image-2021-09-23-19-31-57-337.png
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> !image-2021-09-23-19-31-57-337.png|width=3339,height=1936!






[jira] [Work logged] (HDFS-16234) Improve DataNodeMetrics to initialize IBR more reasonable

2021-09-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16234?focusedWorklogId=654906&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-654906
 ]

ASF GitHub Bot logged work on HDFS-16234:
-

Author: ASF GitHub Bot
Created on: 24/Sep/21 09:42
Start Date: 24/Sep/21 09:42
Worklog Time Spent: 10m 
  Work Description: jianghuazhu commented on pull request #3469:
URL: https://github.com/apache/hadoop/pull/3469#issuecomment-926492611


   The errors reported by Jenkins don't look too serious.
   @ayushtkn @virajjasani, can you help review this PR?
   Thank you very much.




Issue Time Tracking
---

Worklog Id: (was: 654906)
Time Spent: 40m  (was: 0.5h)

> Improve DataNodeMetrics to initialize IBR more reasonable
> -
>
> Key: HDFS-16234
> URL: https://issues.apache.org/jira/browse/HDFS-16234
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> The DataNode initializes the IBR metrics in DataNodeMetrics before sending the 
> IBR, so the recorded data may deviate. The metrics should instead be updated 
> after the IBR has been sent to the NameNode, which seems more reasonable.
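As an illustration only (hypothetical names, not the actual DataNode code), the ordering change described in this issue can be sketched as follows: the metric is updated only after the IBR send has succeeded, so a failed send no longer skews the counter.

```java
import java.util.concurrent.atomic.AtomicLong;

// Minimal sketch with hypothetical names: record the IBR in the metrics
// only after the report has actually been sent to the NameNode.
public class IbrMetricsDemo {
  static final AtomicLong ibrCount = new AtomicLong();

  static void sendIbr(boolean sendSucceeds) {
    // Before the change, the counter was updated here, before the send,
    // so a failure below would still have been counted.
    if (!sendSucceeds) {
      return; // simulated RPC failure while reporting to the NameNode
    }
    // After the change: update the metric once the IBR is really sent.
    ibrCount.incrementAndGet();
  }

  public static void main(String[] args) {
    sendIbr(true);
    sendIbr(false);
    System.out.println(ibrCount.get()); // 1: only the successful send counted
  }
}
```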






[jira] [Work logged] (HDFS-16234) Improve DataNodeMetrics to initialize IBR more reasonable

2021-09-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16234?focusedWorklogId=654901&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-654901
 ]

ASF GitHub Bot logged work on HDFS-16234:
-

Author: ASF GitHub Bot
Created on: 24/Sep/21 09:09
Start Date: 24/Sep/21 09:09
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3469:
URL: https://github.com/apache/hadoop/pull/3469#issuecomment-926471278


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 54s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  34m 31s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 24s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   1m 18s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   4m 52s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 24s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 56s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 25s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m 19s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  24m 26s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 16s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 18s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |   1m 18s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 11s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |   1m 11s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 53s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 17s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 48s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 18s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m 22s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  24m 17s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  | 327m 44s |  |  hadoop-hdfs in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 37s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 435m 49s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3469/2/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3469 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux 5a067f0899b4 4.15.0-147-generic #151-Ubuntu SMP Fri Jun 18 
19:21:19 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / acc1cdf2204015ced364ecdfb66ca301a6bb |
   | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3469/2/testReport/ |
   | Max. process+thread count | 2475 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3469/2/console |
   | versions | 

[jira] [Work logged] (HDFS-16235) Deadlock in LeaseRenewer for static remove method

2021-09-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16235?focusedWorklogId=654884&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-654884
 ]

ASF GitHub Bot logged work on HDFS-16235:
-

Author: ASF GitHub Bot
Created on: 24/Sep/21 08:14
Start Date: 24/Sep/21 08:14
Worklog Time Spent: 10m 
  Work Description: AngersZh commented on pull request #3472:
URL: https://github.com/apache/hadoop/pull/3472#issuecomment-926436701


   > Since [HDFS-14575](https://issues.apache.org/jira/browse/HDFS-14575) is 
checked into trunk only, we don't need to backport this fix to release branches, 
I guess.
   
   yea




Issue Time Tracking
---

Worklog Id: (was: 654884)
Time Spent: 1h 20m  (was: 1h 10m)

> Deadlock in LeaseRenewer for static remove method
> -
>
> Key: HDFS-16235
> URL: https://issues.apache.org/jira/browse/HDFS-16235
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: angerszhu
>Assignee: angerszhu
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-16235.001.patch, image-2021-09-23-19-31-57-337.png
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> !image-2021-09-23-19-31-57-337.png|width=3339,height=1936!






[jira] [Work logged] (HDFS-16235) Deadlock in LeaseRenewer for static remove method

2021-09-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16235?focusedWorklogId=654883&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-654883
 ]

ASF GitHub Bot logged work on HDFS-16235:
-

Author: ASF GitHub Bot
Created on: 24/Sep/21 08:14
Start Date: 24/Sep/21 08:14
Worklog Time Spent: 10m 
  Work Description: virajjasani commented on pull request #3472:
URL: https://github.com/apache/hadoop/pull/3472#issuecomment-926436198


   Since HDFS-14575 is checked into trunk only, we don't need to backport this 
fix to release branches, I guess.




Issue Time Tracking
---

Worklog Id: (was: 654883)
Time Spent: 1h 10m  (was: 1h)

> Deadlock in LeaseRenewer for static remove method
> -
>
> Key: HDFS-16235
> URL: https://issues.apache.org/jira/browse/HDFS-16235
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: angerszhu
>Assignee: angerszhu
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-16235.001.patch, image-2021-09-23-19-31-57-337.png
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> !image-2021-09-23-19-31-57-337.png|width=3339,height=1936!






[jira] [Work logged] (HDFS-16235) Deadlock in LeaseRenewer for static remove method

2021-09-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16235?focusedWorklogId=654882&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-654882
 ]

ASF GitHub Bot logged work on HDFS-16235:
-

Author: ASF GitHub Bot
Created on: 24/Sep/21 08:11
Start Date: 24/Sep/21 08:11
Worklog Time Spent: 10m 
  Work Description: virajjasani commented on pull request #3472:
URL: https://github.com/apache/hadoop/pull/3472#issuecomment-926434079


   +1 (non-binding), thanks @AngersZh 




Issue Time Tracking
---

Worklog Id: (was: 654882)
Time Spent: 1h  (was: 50m)

> Deadlock in LeaseRenewer for static remove method
> -
>
> Key: HDFS-16235
> URL: https://issues.apache.org/jira/browse/HDFS-16235
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: angerszhu
>Assignee: angerszhu
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-16235.001.patch, image-2021-09-23-19-31-57-337.png
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> !image-2021-09-23-19-31-57-337.png|width=3339,height=1936!






[jira] [Work logged] (HDFS-16235) Deadlock in LeaseRenewer for static remove method

2021-09-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16235?focusedWorklogId=654879&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-654879
 ]

ASF GitHub Bot logged work on HDFS-16235:
-

Author: ASF GitHub Bot
Created on: 24/Sep/21 07:45
Start Date: 24/Sep/21 07:45
Worklog Time Spent: 10m 
  Work Description: virajjasani commented on a change in pull request #3472:
URL: https://github.com/apache/hadoop/pull/3472#discussion_r715385382



##
File path: 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/impl/LeaseRenewer.java
##
@@ -96,7 +96,9 @@ public static LeaseRenewer getInstance(final String authority,
* @param renewer Instance to be cleared from Factory
*/
   public static void remove(LeaseRenewer renewer) {
-Factory.INSTANCE.remove(renewer);
+synchronized (renewer) {
+  Factory.INSTANCE.remove(renewer);

Review comment:
   Isn't the remove method already synchronized?






Issue Time Tracking
---

Worklog Id: (was: 654879)
Time Spent: 50m  (was: 40m)

> Deadlock in LeaseRenewer for static remove method
> -
>
> Key: HDFS-16235
> URL: https://issues.apache.org/jira/browse/HDFS-16235
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: angerszhu
>Assignee: angerszhu
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-16235.001.patch, image-2021-09-23-19-31-57-337.png
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> !image-2021-09-23-19-31-57-337.png|width=3339,height=1936!






[jira] [Work logged] (HDFS-16235) Deadlock in LeaseRenewer for static remove method

2021-09-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16235?focusedWorklogId=654878&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-654878
 ]

ASF GitHub Bot logged work on HDFS-16235:
-

Author: ASF GitHub Bot
Created on: 24/Sep/21 07:42
Start Date: 24/Sep/21 07:42
Worklog Time Spent: 10m 
  Work Description: virajjasani commented on a change in pull request #3472:
URL: https://github.com/apache/hadoop/pull/3472#discussion_r715388456



##
File path: 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/impl/LeaseRenewer.java
##
@@ -96,7 +96,9 @@ public static LeaseRenewer getInstance(final String authority,
* @param renewer Instance to be cleared from Factory
*/
   public static void remove(LeaseRenewer renewer) {
-Factory.INSTANCE.remove(renewer);
+synchronized (renewer) {
+  Factory.INSTANCE.remove(renewer);

Review comment:
   I see, synchronizing on the object makes sense.






Issue Time Tracking
---

Worklog Id: (was: 654878)
Time Spent: 40m  (was: 0.5h)

> Deadlock in LeaseRenewer for static remove method
> -
>
> Key: HDFS-16235
> URL: https://issues.apache.org/jira/browse/HDFS-16235
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: angerszhu
>Assignee: angerszhu
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-16235.001.patch, image-2021-09-23-19-31-57-337.png
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> !image-2021-09-23-19-31-57-337.png|width=3339,height=1936!






[jira] [Work logged] (HDFS-16235) Deadlock in LeaseRenewer for static remove method

2021-09-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16235?focusedWorklogId=654877&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-654877
 ]

ASF GitHub Bot logged work on HDFS-16235:
-

Author: ASF GitHub Bot
Created on: 24/Sep/21 07:37
Start Date: 24/Sep/21 07:37
Worklog Time Spent: 10m 
  Work Description: virajjasani commented on a change in pull request #3472:
URL: https://github.com/apache/hadoop/pull/3472#discussion_r715385382



##
File path: 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/impl/LeaseRenewer.java
##
@@ -96,7 +96,9 @@ public static LeaseRenewer getInstance(final String authority,
* @param renewer Instance to be cleared from Factory
*/
   public static void remove(LeaseRenewer renewer) {
-Factory.INSTANCE.remove(renewer);
+synchronized (renewer) {
+  Factory.INSTANCE.remove(renewer);

Review comment:
   INSTANCE is `static final`, so we should synchronize on 
`LeaseRenewer.class`. But anyway, isn't the remove method already synchronized?
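A self-contained sketch of the lock-ordering idea behind the fix (the classes below are stand-ins, not the real `LeaseRenewer`/`Factory`): if every path takes the renewer instance's monitor before the factory's monitor, two threads can never hold the two monitors in opposite orders, which is what the reported deadlock requires.

```java
// Simplified stand-ins showing consistent lock ordering: lock the renewer
// first, then the factory, on every path that touches both.
public class LockOrderDemo {
  static final Object FACTORY = new Object(); // stands in for Factory.INSTANCE

  static void remove(Object renewer) {
    synchronized (renewer) {   // lock 1: the renewer instance
      synchronized (FACTORY) { // lock 2: the factory singleton
        // ... the Factory.INSTANCE.remove(renewer) body would run here ...
      }
    }
  }

  public static void main(String[] args) throws InterruptedException {
    Object renewer = new Object();
    Runnable r = () -> {
      for (int i = 0; i < 10_000; i++) {
        remove(renewer);
      }
    };
    Thread t1 = new Thread(r);
    Thread t2 = new Thread(r);
    t1.start();
    t2.start();
    t1.join();
    t2.join();
    System.out.println("no deadlock");
  }
}
```

If one path instead acquired the factory monitor first and the renewer monitor second, two threads could each hold one monitor while waiting for the other, which is the deadlock this issue reports.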






Issue Time Tracking
---

Worklog Id: (was: 654877)
Time Spent: 0.5h  (was: 20m)

> Deadlock in LeaseRenewer for static remove method
> -
>
> Key: HDFS-16235
> URL: https://issues.apache.org/jira/browse/HDFS-16235
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: angerszhu
>Assignee: angerszhu
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-16235.001.patch, image-2021-09-23-19-31-57-337.png
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> !image-2021-09-23-19-31-57-337.png|width=3339,height=1936!






[jira] [Work logged] (HDFS-16043) HDFS : Delete performance optimization

2021-09-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16043?focusedWorklogId=654875&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-654875
 ]

ASF GitHub Bot logged work on HDFS-16043:
-

Author: ASF GitHub Bot
Created on: 24/Sep/21 07:25
Start Date: 24/Sep/21 07:25
Worklog Time Spent: 10m 
  Work Description: jianghuazhu commented on a change in pull request #3063:
URL: https://github.com/apache/hadoop/pull/3063#discussion_r715377671



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
##
@@ -725,6 +743,9 @@ public void activate(Configuration conf, long blockTotal) {
 datanodeManager.activate(conf);
 this.redundancyThread.setName("RedundancyMonitor");
 this.redundancyThread.start();
+this.markedDeleteBlockScrubberThread.
+setName("MarkedDeleteBlockScrubberThread");
+this.markedDeleteBlockScrubberThread.start();

Review comment:
   Thanks @zhuxiangyi for the comment.
   It seems that markedDeleteBlockScrubberThread will run on both the active NN 
and the standby NN at the same time. Is that right?






Issue Time Tracking
---

Worklog Id: (was: 654875)
Time Spent: 4h  (was: 3h 50m)

> HDFS : Delete performance optimization
> --
>
> Key: HDFS-16043
> URL: https://issues.apache.org/jira/browse/HDFS-16043
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, namanode
>Affects Versions: 3.4.0
>Reporter: Xiangyi Zhu
>Assignee: Xiangyi Zhu
>Priority: Major
>  Labels: pull-request-available
> Attachments: 20210527-after.svg, 20210527-before.svg
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> The deletion of a large directory caused the NN to hold the lock for too long, 
> which caused our NameNode to be killed by ZKFC.
>  The flame graph shows that the main time-consuming work is the QuotaCount 
> calculation during removeBlocks(toRemovedBlocks) and inode deletion, with 
> removeBlocks(toRemovedBlocks) taking the larger share of the time.
> h3. solution:
> 1. RemoveBlocks is processed asynchronously. A thread is started in the 
> BlockManager to process the deleted blocks and control the lock time.
>  2. QuotaCount calculation optimization, this is similar to the optimization 
> of this Issue HDFS-16000.
> h3. Comparison before and after optimization:
> Delete 1000w Inode and 1000w block test.
>  *before:*
> remove inode elapsed time: 7691 ms
>  remove block elapsed time :11107 ms
>  *after:*
>  remove inode elapsed time: 4149 ms
>  remove block elapsed time :0 ms
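The "mark under the lock, scrub in batches from a dedicated thread" scheme described in the issue can be sketched as below. All names here (MarkedDeleteScrubber, markBlocksForDeletion, BATCH) are illustrative stand-ins, not the actual HDFS-16043 code, and the lock reacquisition per batch is shown only in comments:

```java
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

// Sketch of asynchronous block removal: the delete path only enqueues the
// blocks while holding the namesystem lock (O(1)), and a separate scrubber
// thread drains the queue in bounded batches so the lock is never held for
// the whole removal.
public class MarkedDeleteScrubber extends Thread {
    private final ConcurrentLinkedQueue<List<String>> markedDeleteQueue =
        new ConcurrentLinkedQueue<>();
    private static final int BATCH = 1000; // blocks removed per lock hold
    private volatile boolean running = true;

    // Called by the delete path while it still holds the namesystem lock:
    // constant-time, so the lock hold no longer depends on directory size.
    public void markBlocksForDeletion(List<String> blocks) {
        markedDeleteQueue.add(blocks);
    }

    public boolean isDrained() {
        return markedDeleteQueue.isEmpty();
    }

    @Override
    public void run() {
        // Keep draining until shutdown is requested AND the queue is empty.
        while (running || !markedDeleteQueue.isEmpty()) {
            List<String> blocks = markedDeleteQueue.poll();
            if (blocks == null) {
                try { Thread.sleep(10); } catch (InterruptedException e) { return; }
                continue;
            }
            for (int i = 0; i < blocks.size(); i += BATCH) {
                // lock.writeLock().lock();   // real code reacquires per batch
                int end = Math.min(i + BATCH, blocks.size());
                blocks.subList(i, end).forEach(b -> { /* blockManager-style removal */ });
                // lock.writeLock().unlock(); // other operations can run between batches
            }
        }
    }

    public void shutdown() { running = false; }
}
```

The key design point is that writers pay only the enqueue cost under the lock; the scrubber bounds each subsequent lock hold to BATCH blocks.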



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16043) HDFS : Delete performance optimization

2021-09-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16043?focusedWorklogId=654872=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-654872
 ]

ASF GitHub Bot logged work on HDFS-16043:
-

Author: ASF GitHub Bot
Created on: 24/Sep/21 07:10
Start Date: 24/Sep/21 07:10
Worklog Time Spent: 10m 
  Work Description: zhuxiangyi commented on a change in pull request #3063:
URL: https://github.com/apache/hadoop/pull/3063#discussion_r715369363



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
##
@@ -725,6 +743,9 @@ public void activate(Configuration conf, long blockTotal) {
 datanodeManager.activate(conf);
 this.redundancyThread.setName("RedundancyMonitor");
 this.redundancyThread.start();
+this.markedDeleteBlockScrubberThread.
+setName("MarkedDeleteBlockScrubberThread");
+this.markedDeleteBlockScrubberThread.start();

Review comment:
   @jianghuazhu Thanks a lot for your review.
   You are correct: BlockManager#close() needs to interrupt the 
markedDeleteBlockScrubberThread thread.






Issue Time Tracking
---

Worklog Id: (was: 654872)
Time Spent: 3h 50m  (was: 3h 40m)

> HDFS : Delete performance optimization
> --
>
> Key: HDFS-16043
> URL: https://issues.apache.org/jira/browse/HDFS-16043
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, namenode
>Affects Versions: 3.4.0
>Reporter: Xiangyi Zhu
>Assignee: Xiangyi Zhu
>Priority: Major
>  Labels: pull-request-available
> Attachments: 20210527-after.svg, 20210527-before.svg
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> Deleting a large directory caused the NameNode to hold the lock for too 
> long, which led to our NameNode being killed by ZKFC.
>  The flame graph shows that the main time-consuming work is the QuotaCount 
> calculation during removeBlocks(toRemovedBlocks) and inode deletion, with 
> removeBlocks(toRemovedBlocks) taking the larger share of the time.
> h3. Solution:
> 1. Process removeBlocks asynchronously: start a thread in the BlockManager 
> to process the deleted blocks and bound the lock hold time.
>  2. Optimize the QuotaCount calculation, similar to the optimization in 
> issue HDFS-16000.
> h3. Comparison before and after optimization:
> Test: delete 10 million inodes and 10 million blocks.
>  *before:*
> remove inode elapsed time: 7691 ms
>  remove block elapsed time: 11107 ms
>  *after:*
>  remove inode elapsed time: 4149 ms
>  remove block elapsed time: 0 ms






[jira] [Comment Edited] (HDFS-16235) Deadlock in LeaseRenewer for static remove method

2021-09-24 Thread angerszhu (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17419534#comment-17419534
 ] 

angerszhu edited comment on HDFS-16235 at 9/24/21, 6:34 AM:


[~prasad-acit] 

LeaseRenewer has a background thread that cleans up leases periodically, and 
task threads also call the remove method while doing file operations, but the 
two enter through different paths. The background thread holds LeaseRenewer's 
lock while waiting for LeaseRenewer.Factory's lock; the task thread holds 
LeaseRenewer.Factory's lock while waiting for LeaseRenewer's lock.

h3. In the task thread
When DFSClient's create is called, it calls LeaseRenewer.remove():
{code}
  public static void remove(LeaseRenewer renewer) {
    Factory.INSTANCE.remove(renewer);
  }
{code}
Factory.INSTANCE.remove(renewer) acquires LeaseRenewer.Factory's lock:
{code}
private synchronized void remove(final LeaseRenewer r) {
  final LeaseRenewer stored = renewers.get(r.factorykey);
  // Since a renewer may expire, the stored renewer can be different.
  if (r == stored) {
    // Expire LeaseRenewer daemon thread as soon as possible.
    r.clearClients();
    r.setEmptyTime(0);
    renewers.remove(r.factorykey);
  }
}
{code}

Calling r.setEmptyTime() then needs LeaseRenewer's lock:
{code}
  synchronized void setEmptyTime(long time) {
    emptyTime = time;
  }
{code}

h3. In the background LeaseRenewer thread
{code}
  private void run(final int id) throws InterruptedException {
    ..
      synchronized (this) {
        DFSClientFaultInjector.get().delayWhenRenewLeaseTimeout();
        dfsclientsCopy = new ArrayList<>(dfsclients);
        Factory.INSTANCE.remove(LeaseRenewer.this);
      }
      for (DFSClient dfsClient : dfsclientsCopy) {
        dfsClient.closeAllFilesBeingWritten(true);
      }
      break;
    } catch (IOException ie) {
      LOG.warn("Failed to renew lease for " + clientsString() + " for "
          + (elapsed/1000) + " seconds.  Will retry shortly ...", ie);
    }
...
  }
{code}
When Factory.INSTANCE.remove(LeaseRenewer.this) is called here, the thread 
already holds LeaseRenewer's lock and waits for LeaseRenewer.Factory's lock. 
The two threads then deadlock.

This is an obvious deadlock, and the picture above shows it clearly. I am not 
very familiar with Hadoop testing.
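The ABBA pattern described above (task thread: Factory lock then Renewer lock; background thread: Renewer lock then Factory lock) and its usual remedy, acquiring both monitors in one fixed order, can be sketched as follows. The class and lock names here are illustrative stand-ins, not Hadoop's actual code:

```java
// Minimal sketch of the deadlock's lock-ordering cause and the standard fix:
// every path acquires the two monitors in the same order.
public class LockOrderSketch {
    // Stand-ins for LeaseRenewer.Factory's monitor and LeaseRenewer's monitor.
    static final Object factoryLock = new Object();
    static final Object renewerLock = new Object();

    // Task-thread path: Factory lock, then Renewer lock
    // (as in Factory.remove() -> r.setEmptyTime()).
    static void taskThreadRemove() {
        synchronized (factoryLock) {
            synchronized (renewerLock) { /* setEmptyTime(0) equivalent */ }
        }
    }

    // Background-thread path. The buggy ordering was the reverse:
    //   synchronized (renewerLock) { synchronized (factoryLock) { ... } }
    // which, interleaved with taskThreadRemove(), can deadlock. The fixed
    // ordering below matches the task thread's order.
    static void backgroundRemove() {
        synchronized (factoryLock) {
            synchronized (renewerLock) { /* Factory.remove(this) equivalent */ }
        }
    }

    /** Runs both paths concurrently; returns true iff both threads finish. */
    static boolean runBoth() throws InterruptedException {
        Thread t1 = new Thread(() -> { for (int i = 0; i < 20_000; i++) taskThreadRemove(); });
        Thread t2 = new Thread(() -> { for (int i = 0; i < 20_000; i++) backgroundRemove(); });
        t1.start(); t2.start();
        t1.join(5_000); t2.join(5_000);
        return !t1.isAlive() && !t2.isAlive();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runBoth() ? "no deadlock" : "deadlock");
    }
}
```

With the consistent order, no thread can hold one monitor while waiting for the other in reverse, so the circular wait that the stack picture shows cannot form.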
 



> Deadlock in LeaseRenewer for static remove method
> -
>
> Key: HDFS-16235
> URL: https://issues.apache.org/jira/browse/HDFS-16235
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: 

[jira] [Work logged] (HDFS-16232) Fix java doc for BlockReaderRemote#newBlockReader

2021-09-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16232?focusedWorklogId=654864=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-654864
 ]

ASF GitHub Bot logged work on HDFS-16232:
-

Author: ASF GitHub Bot
Created on: 24/Sep/21 06:08
Start Date: 24/Sep/21 06:08
Worklog Time Spent: 10m 
  Work Description: tomscut edited a comment on pull request #3456:
URL: https://github.com/apache/hadoop/pull/3456#issuecomment-926369968


   > Looks like it just adds period to each of the sentences. +1
   
   I added several parameter descriptions and fixed the checkstyle issues 
incidentally. Thanks @jojochuang for your review.




Issue Time Tracking
---

Worklog Id: (was: 654864)
Time Spent: 1h 10m  (was: 1h)

> Fix java doc for BlockReaderRemote#newBlockReader
> -
>
> Key: HDFS-16232
> URL: https://issues.apache.org/jira/browse/HDFS-16232
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Fix java doc for BlockReaderRemote#newBlockReader.






[jira] [Work logged] (HDFS-16232) Fix java doc for BlockReaderRemote#newBlockReader

2021-09-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16232?focusedWorklogId=654863=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-654863
 ]

ASF GitHub Bot logged work on HDFS-16232:
-

Author: ASF GitHub Bot
Created on: 24/Sep/21 06:07
Start Date: 24/Sep/21 06:07
Worklog Time Spent: 10m 
  Work Description: tomscut edited a comment on pull request #3456:
URL: https://github.com/apache/hadoop/pull/3456#issuecomment-926369968


   > Looks like it just adds period to each of the sentences. +1
   
   I added several parameter descriptions and fixed the checkstyle issues 
incidentally. Thanks @jojochuang for your review.




Issue Time Tracking
---

Worklog Id: (was: 654863)
Time Spent: 1h  (was: 50m)

> Fix java doc for BlockReaderRemote#newBlockReader
> -
>
> Key: HDFS-16232
> URL: https://issues.apache.org/jira/browse/HDFS-16232
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Fix java doc for BlockReaderRemote#newBlockReader.






[jira] [Work logged] (HDFS-16232) Fix java doc for BlockReaderRemote#newBlockReader

2021-09-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16232?focusedWorklogId=654862=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-654862
 ]

ASF GitHub Bot logged work on HDFS-16232:
-

Author: ASF GitHub Bot
Created on: 24/Sep/21 06:05
Start Date: 24/Sep/21 06:05
Worklog Time Spent: 10m 
  Work Description: tomscut commented on pull request #3456:
URL: https://github.com/apache/hadoop/pull/3456#issuecomment-926369968


   > Looks like it just adds period to each of the sentences. +1
   
   I added several parameter descriptions. Thanks @jojochuang for your review.




Issue Time Tracking
---

Worklog Id: (was: 654862)
Time Spent: 50m  (was: 40m)

> Fix java doc for BlockReaderRemote#newBlockReader
> -
>
> Key: HDFS-16232
> URL: https://issues.apache.org/jira/browse/HDFS-16232
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Fix java doc for BlockReaderRemote#newBlockReader.


