[jira] [Comment Edited] (HDFS-11448) JN log segment syncing should support HA upgrade

2022-06-10 Thread JiangHua Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-11448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17545848#comment-17545848
 ] 

JiangHua Zhu edited comment on HDFS-11448 at 6/10/22 7:15 AM:
--

Hi [~hanishakoneru], nice to communicate with you.
In JNStorage, getCurrentDir() is not used anywhere.
If you don't mind, I'll remove the unused JNStorage#getCurrentDir().


was (Author: jianghuazhu):
Hi [~hanishakoneru], nice to communicate with you.
I found the newly added JNStorage#getCurrentDir() here. sd.getCurrentDir() is 
used in multiple places in this context, but JNStorage#getCurrentDir() itself 
is not used anywhere.
If you don't mind, I'll modify this to replace sd.getCurrentDir() with 
JNStorage#getCurrentDir().


> JN log segment syncing should support HA upgrade
> 
>
> Key: HDFS-11448
> URL: https://issues.apache.org/jira/browse/HDFS-11448
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
>Priority: Major
> Fix For: 3.0.0-alpha4
>
> Attachments: HDFS-11448.001.patch, HDFS-11448.002.patch, 
> HDFS-11448.003.patch
>
>
> HDFS-4025 adds support for synchronizing past log segments to JNs that missed 
> them. But, as pointed out by [~jingzhao], if the segment download happens 
> while an admin tries to roll back, it might fail ([see 
> comment|https://issues.apache.org/jira/browse/HDFS-4025?focusedCommentId=15850633&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15850633]).






[jira] [Work logged] (HDFS-16628) RBF: Kerberos user removing non-default namespace data failed

2022-06-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16628?focusedWorklogId=780259&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780259
 ]

ASF GitHub Bot logged work on HDFS-16628:
-

Author: ASF GitHub Bot
Created on: 10/Jun/22 09:47
Start Date: 10/Jun/22 09:47
Worklog Time Spent: 10m 
  Work Description: Hexiaoqiao commented on code in PR #4424:
URL: https://github.com/apache/hadoop/pull/4424#discussion_r894348979


##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/router/TestRouterTrash.java:
##
@@ -189,6 +189,41 @@ public void testMoveToTrashNoMountPoint() throws IOException,
     assertEquals(2, fileStatuses.length);
   }
 
+  @Test
+  public void testMoveToTrashNoMountPointWithKerBoersUser() throws IOException,

Review Comment:
   testMoveToTrashNoMountPointWithKerBoersUser -> testMoveToTrashWithKerberosUser?
   a. IMO, this case is not related to `NoMountPoint`, right?
   b. KerBoers -> Kerberos.
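
   For illustration, the renamed declaration might look like this (a sketch 
only; the final name is the author's call):
   ```java
   @Test
   public void testMoveToTrashWithKerberosUser() throws IOException,
       URISyntaxException, InterruptedException {
     // body unchanged; only the name and the "Kerberos" spelling change
   }
   ```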



##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/router/TestRouterTrash.java:
##
@@ -189,6 +189,41 @@ public void testMoveToTrashNoMountPoint() throws IOException,
     assertEquals(2, fileStatuses.length);
   }
 
+  @Test
+  public void testMoveToTrashNoMountPointWithKerBoersUser() throws IOException,
+      URISyntaxException, InterruptedException {
+    // Constructs the structure of the KerBoers user name
+    String kerBoersUser = "randomUser/d...@hadoop.com";

Review Comment:
   Suggest correcting `KerBoers` to `Kerberos`, both here and in the following 
spelling issues.



##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/router/TestRouterTrash.java:
##
@@ -189,6 +189,41 @@ public void testMoveToTrashNoMountPoint() throws IOException,
     assertEquals(2, fileStatuses.length);
   }
 
+  @Test
+  public void testMoveToTrashNoMountPointWithKerBoersUser() throws IOException,
+      URISyntaxException, InterruptedException {
+    // Constructs the structure of the KerBoers user name
+    String kerBoersUser = "randomUser/d...@hadoop.com";
+    UserGroupInformation ugi = UserGroupInformation.createRemoteUser(kerBoersUser);
+    MountTable addEntry = MountTable.newInstance(MOUNT_POINT,
+        Collections.singletonMap(ns1, MOUNT_POINT));
+    assertTrue(addMountTable(addEntry));
+    // current user client
+    MiniRouterDFSCluster.NamenodeContext nn1Context = cluster.getNamenode(ns1, null);
+    DFSClient currentUserClientNs0 = nnContext.getClient();
+    DFSClient currentUserClientNs1 = nn1Context.getClient();
+
+    currentUserClientNs0.setOwner("/", ugi.getShortUserName(), ugi.getShortUserName());
+    currentUserClientNs1.setOwner("/", ugi.getShortUserName(), ugi.getShortUserName());

Review Comment:
   Would only one DFSClient and NamenodeContext be enough for this unit test?



##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/router/TestRouterTrash.java:
##
@@ -189,6 +189,41 @@ public void testMoveToTrashNoMountPoint() throws IOException,
     assertEquals(2, fileStatuses.length);
   }
 
+  @Test
+  public void testMoveToTrashNoMountPointWithKerBoersUser() throws IOException,
+      URISyntaxException, InterruptedException {
+    // Constructs the structure of the KerBoers user name
+    String kerBoersUser = "randomUser/d...@hadoop.com";
+    UserGroupInformation ugi = UserGroupInformation.createRemoteUser(kerBoersUser);
+    MountTable addEntry = MountTable.newInstance(MOUNT_POINT,
+        Collections.singletonMap(ns1, MOUNT_POINT));
+    assertTrue(addMountTable(addEntry));
+    // current user client
+    MiniRouterDFSCluster.NamenodeContext nn1Context = cluster.getNamenode(ns1, null);
+    DFSClient currentUserClientNs0 = nnContext.getClient();
+    DFSClient currentUserClientNs1 = nn1Context.getClient();
+
+    currentUserClientNs0.setOwner("/", ugi.getShortUserName(), ugi.getShortUserName());
+    currentUserClientNs1.setOwner("/", ugi.getShortUserName(), ugi.getShortUserName());
+
+    // test user client
+    DFSClient testUserClientNs1 = nn1Context.getClient(ugi);
+    testUserClientNs1.mkdirs(MOUNT_POINT, new FsPermission("777"), true);
+    assertTrue(testUserClientNs1.exists(MOUNT_POINT));
+    // create test file
+    testUserClientNs1.create(FILE, true);
+    Path filePath = new Path(FILE);
+
+    FileStatus[] fileStatuses = routerFs.listStatus(filePath);
+    assertEquals(1, fileStatuses.length);
+    assertEquals(ugi.getShortUserName(), fileStatuses[0].getOwner());
+    // move to Trash
+    Configuration routerConf = routerContext.getConf();
+    FileSystem fs = DFSTestUtil.getFileSystemAs(ugi, routerConf);
+    Trash trash = new Trash(fs, routerConf);
+    assertTrue(trash.moveToTrash(filePath));

Review Comment:
   

[jira] [Work logged] (HDFS-16598) All datanodes [DatanodeInfoWithStorage[127.0.0.1:57448,DS-1b5f7e33-a2bf-4edc-9122-a74c995a99f5,DISK]] are bad. Aborting...

2022-06-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16598?focusedWorklogId=780260&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780260
 ]

ASF GitHub Bot logged work on HDFS-16598:
-

Author: ASF GitHub Bot
Created on: 10/Jun/22 09:49
Start Date: 10/Jun/22 09:49
Worklog Time Spent: 10m 
  Work Description: Hexiaoqiao commented on PR #4366:
URL: https://github.com/apache/hadoop/pull/4366#issuecomment-1152182901

   > @Hexiaoqiao do you mean to change all getReplicaInfo(ExtendedBlock b) to getReplicaInfo(String bpid, long blkid) in fine-grained lock in this PR?
   
   Sure. It is not necessary to check the GS (generation stamp) when acquiring 
the BP/VOLUME lock, which is totally unrelated to the GS, IMO. Thanks.
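
   A minimal sketch of the suggested shape (assuming a replica map keyed by 
pool id and block id; names are illustrative, not the exact FsDatasetImpl 
code):
   ```java
   // Sketch: resolve the replica by (bpid, blkid) alone, so callers that only
   // need to pick the right lock never compare generation stamps (GS).
   ReplicaInfo getReplicaInfo(String bpid, long blkid) throws IOException {
     ReplicaInfo info = volumeMap.get(bpid, blkid);
     if (info == null) {
       throw new ReplicaNotFoundException(
           "Replica not found for block pool " + bpid + ", block id " + blkid);
     }
     return info;
   }
   ```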




Issue Time Tracking
---

Worklog Id: (was: 780260)
Time Spent: 2h 20m  (was: 2h 10m)

> All datanodes 
> [DatanodeInfoWithStorage[127.0.0.1:57448,DS-1b5f7e33-a2bf-4edc-9122-a74c995a99f5,DISK]]
>  are bad. Aborting...
> --
>
> Key: HDFS-16598
> URL: https://issues.apache.org/jira/browse/HDFS-16598
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> org.apache.hadoop.hdfs.testPipelineRecoveryOnRestartFailure failed with a 
> stack trace like:
> {code:java}
> java.io.IOException: All datanodes 
> [DatanodeInfoWithStorage[127.0.0.1:57448,DS-1b5f7e33-a2bf-4edc-9122-a74c995a99f5,DISK]]
>  are bad. Aborting...
>   at 
> org.apache.hadoop.hdfs.DataStreamer.handleBadDatanode(DataStreamer.java:1667)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1601)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1587)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.processDatanodeOrExternalError(DataStreamer.java:1371)
>   at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:674)
> {code}
> After tracing the root cause, we found this bug was introduced by 
> [HDFS-16534|https://issues.apache.org/jira/browse/HDFS-16534], because the 
> client's block GS may be smaller than the DN's when pipeline recovery fails.






[jira] [Work logged] (HDFS-16601) Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try

2022-06-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16601?focusedWorklogId=780266&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780266
 ]

ASF GitHub Bot logged work on HDFS-16601:
-

Author: ASF GitHub Bot
Created on: 10/Jun/22 10:01
Start Date: 10/Jun/22 10:01
Worklog Time Spent: 10m 
  Work Description: Hexiaoqiao commented on PR #4369:
URL: https://github.com/apache/hadoop/pull/4369#issuecomment-1152193661

   > Fortunately, at present, as long as the failed exception is thrown to the 
client, the client defaults to thinking that the new DN is abnormal, and will 
exclude it and retry the transfer. During the retried transfer, the client 
will choose a new source DN and a new target DN. 
   
   Thanks for the further comment here. Agreed that this will improve fault 
tolerance for the transfer; however, we have to accept that when the source 
datanode itself has the issue and is chosen again on retry, we cannot avoid 
failing. I am not sure whether there is any way to expose exceptions that 
distinguish a source-node failure from a target-node failure? If there is, it 
will be helpful for the follow-up fault-tolerance improvements on the client 
side. 
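
   One hedged sketch of what differentiating the exceptions could look like 
(the class and names are hypothetical, not an existing Hadoop API):
   ```java
   import java.io.IOException;
   
   // Hypothetical: tag a transfer failure with the side that failed, so the
   // client can exclude the right datanode before retrying.
   class BlockTransferException extends IOException {
     enum Side { SOURCE, TARGET }
     private final Side side;
     BlockTransferException(Side side, String msg, Throwable cause) {
       super(msg, cause);
       this.side = side;
     }
     Side getSide() {
       return side;
     }
   }
   ```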




Issue Time Tracking
---

Worklog Id: (was: 780266)
Time Spent: 1h 20m  (was: 1h 10m)

> Failed to replace a bad datanode on the existing pipeline due to no more good 
> datanodes being available to try
> --
>
> Key: HDFS-16601
> URL: https://issues.apache.org/jira/browse/HDFS-16601
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> In our production environment, we found a bug with a stack trace like:
> {code:java}
> java.io.IOException: Failed to replace a bad datanode on the existing 
> pipeline due to no more good datanodes being available to try. (Nodes: 
> current=[DatanodeInfoWithStorage[127.0.0.1:59687,DS-b803febc-7b22-4144-9b39-7bf521cdaa8d,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:59670,DS-0d652bc2-1784-430d-961f-750f80a290f1,DISK]],
>  
> original=[DatanodeInfoWithStorage[127.0.0.1:59670,DS-0d652bc2-1784-430d-961f-750f80a290f1,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:59687,DS-b803febc-7b22-4144-9b39-7bf521cdaa8d,DISK]]).
>  The current failed datanode replacement policy is DEFAULT, and a client may 
> configure this via 
> 'dfs.client.block.write.replace-datanode-on-failure.policy' in its 
> configuration.
>   at 
> org.apache.hadoop.hdfs.DataStreamer.findNewDatanode(DataStreamer.java:1418)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.addDatanode2ExistingPipeline(DataStreamer.java:1478)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.handleDatanodeReplacement(DataStreamer.java:1704)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1605)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1587)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.processDatanodeOrExternalError(DataStreamer.java:1371)
>   at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:674)
> {code}
> And the root cause is that the DFSClient cannot perceive the exception from 
> TransferBlock during PipelineRecovery. If it fails during TransferBlock, the 
> DFSClient will retry all datanodes in the cluster and then fail.






[jira] [Work logged] (HDFS-16628) RBF: Kerberos user removing non-default namespace data failed

2022-06-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16628?focusedWorklogId=780267&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780267
 ]

ASF GitHub Bot logged work on HDFS-16628:
-

Author: ASF GitHub Bot
Created on: 10/Jun/22 10:02
Start Date: 10/Jun/22 10:02
Worklog Time Spent: 10m 
  Work Description: zhangxiping1 commented on code in PR #4424:
URL: https://github.com/apache/hadoop/pull/4424#discussion_r894365791


##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/router/TestRouterTrash.java:
##
@@ -189,6 +189,41 @@ public void testMoveToTrashNoMountPoint() throws IOException,
     assertEquals(2, fileStatuses.length);
   }
 
+  @Test
+  public void testMoveToTrashNoMountPointWithKerBoersUser() throws IOException,

Review Comment:
   OK, thank you for your correction.





Issue Time Tracking
---

Worklog Id: (was: 780267)
Time Spent: 40m  (was: 0.5h)

> RBF: Kerberos user removing non-default namespace data failed
> --
>
> Key: HDFS-16628
> URL: https://issues.apache.org/jira/browse/HDFS-16628
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rbf
>Affects Versions: 3.4.0
>Reporter: Xiping Zhang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Removing data through the Router fails when using a user such as 
> username/d...@hadoop.com.






[jira] [Work logged] (HDFS-16628) RBF: Kerberos user removing non-default namespace data failed

2022-06-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16628?focusedWorklogId=780268&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780268
 ]

ASF GitHub Bot logged work on HDFS-16628:
-

Author: ASF GitHub Bot
Created on: 10/Jun/22 10:04
Start Date: 10/Jun/22 10:04
Worklog Time Spent: 10m 
  Work Description: zhangxiping1 commented on code in PR #4424:
URL: https://github.com/apache/hadoop/pull/4424#discussion_r894367700


##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/router/TestRouterTrash.java:
##
@@ -189,6 +189,41 @@ public void testMoveToTrashNoMountPoint() throws IOException,
     assertEquals(2, fileStatuses.length);
   }
 
+  @Test
+  public void testMoveToTrashNoMountPointWithKerBoersUser() throws IOException,

Review Comment:
   For a, yes.





Issue Time Tracking
---

Worklog Id: (was: 780268)
Time Spent: 50m  (was: 40m)

> RBF: Kerberos user removing non-default namespace data failed
> --
>
> Key: HDFS-16628
> URL: https://issues.apache.org/jira/browse/HDFS-16628
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rbf
>Affects Versions: 3.4.0
>Reporter: Xiping Zhang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Removing data through the Router fails when using a user such as 
> username/d...@hadoop.com.






[jira] [Work logged] (HDFS-16628) RBF: Kerberos user removing non-default namespace data failed

2022-06-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16628?focusedWorklogId=780269&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780269
 ]

ASF GitHub Bot logged work on HDFS-16628:
-

Author: ASF GitHub Bot
Created on: 10/Jun/22 10:05
Start Date: 10/Jun/22 10:05
Worklog Time Spent: 10m 
  Work Description: zhangxiping1 commented on code in PR #4424:
URL: https://github.com/apache/hadoop/pull/4424#discussion_r894367932


##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/router/TestRouterTrash.java:
##
@@ -189,6 +189,41 @@ public void testMoveToTrashNoMountPoint() throws IOException,
     assertEquals(2, fileStatuses.length);
   }
 
+  @Test
+  public void testMoveToTrashNoMountPointWithKerBoersUser() throws IOException,
+      URISyntaxException, InterruptedException {
+    // Constructs the structure of the KerBoers user name
+    String kerBoersUser = "randomUser/d...@hadoop.com";

Review Comment:
   Haha, ok 



##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/router/TestRouterTrash.java:
##
@@ -189,6 +189,41 @@ public void testMoveToTrashNoMountPoint() throws IOException,
     assertEquals(2, fileStatuses.length);
   }
 
+  @Test
+  public void testMoveToTrashNoMountPointWithKerBoersUser() throws IOException,
+      URISyntaxException, InterruptedException {
+    // Constructs the structure of the KerBoers user name
+    String kerBoersUser = "randomUser/d...@hadoop.com";
+    UserGroupInformation ugi = UserGroupInformation.createRemoteUser(kerBoersUser);
+    MountTable addEntry = MountTable.newInstance(MOUNT_POINT,
+        Collections.singletonMap(ns1, MOUNT_POINT));
+    assertTrue(addMountTable(addEntry));
+    // current user client
+    MiniRouterDFSCluster.NamenodeContext nn1Context = cluster.getNamenode(ns1, null);
+    DFSClient currentUserClientNs0 = nnContext.getClient();
+    DFSClient currentUserClientNs1 = nn1Context.getClient();
+
+    currentUserClientNs0.setOwner("/", ugi.getShortUserName(), ugi.getShortUserName());
+    currentUserClientNs1.setOwner("/", ugi.getShortUserName(), ugi.getShortUserName());

Review Comment:
   It is a good idea to set the test user's permissions for both subclusters. 
Giving the test user permission to create the Trash directory in both 
subclusters avoids some potential false positives.
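
   A minimal sketch of that setup, reusing the names from the quoted test 
(illustrative only):
   ```java
   // Give the test user ownership of both subclusters' roots up front, so the
   // Trash directory can be created in either namespace without false positives.
   for (DFSClient client :
       new DFSClient[] {currentUserClientNs0, currentUserClientNs1}) {
     client.setOwner("/", ugi.getShortUserName(), ugi.getShortUserName());
   }
   ```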





Issue Time Tracking
---

Worklog Id: (was: 780269)
Time Spent: 1h  (was: 50m)

> RBF: Kerberos user removing non-default namespace data failed
> --
>
> Key: HDFS-16628
> URL: https://issues.apache.org/jira/browse/HDFS-16628
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rbf
>Affects Versions: 3.4.0
>Reporter: Xiping Zhang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Removing data through the Router fails when using a user such as 
> username/d...@hadoop.com.






[jira] [Work logged] (HDFS-16627) Improve BPServiceActor#register log: add NN addr

2022-06-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16627?focusedWorklogId=780270&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780270
 ]

ASF GitHub Bot logged work on HDFS-16627:
-

Author: ASF GitHub Bot
Created on: 10/Jun/22 10:06
Start Date: 10/Jun/22 10:06
Worklog Time Spent: 10m 
  Work Description: Hexiaoqiao commented on PR #4419:
URL: https://github.com/apache/hadoop/pull/4419#issuecomment-1152198199

   The latest build seems to throw some warnings. Let me trigger Jenkins 
again, and let's wait and see what it says.




Issue Time Tracking
---

Worklog Id: (was: 780270)
Time Spent: 1.5h  (was: 1h 20m)

> Improve BPServiceActor#register log: add NN addr
> ---
>
> Key: HDFS-16627
> URL: https://issues.apache.org/jira/browse/HDFS-16627
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.4.0
>Reporter: fanshilun
>Assignee: fanshilun
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> When I read the log, I think the NN address information should be added to 
> make the log information more complete.
> The log is as follows:
> {code:java}
> 2022-06-06 06:15:32,715 [BP-1990954485-172.17.0.2-1654496132136 heartbeating 
> to localhost/127.0.0.1:42811] INFO  datanode.DataNode 
> (BPServiceActor.java:register(819)) - Block pool 
> BP-1990954485-172.17.0.2-1654496132136 (Datanode Uuid 
> 7d4b5459-6f2b-4203-bf6f-d31bfb9b6c3f) service to localhost/127.0.0.1:42811 
> beginning handshake with NN.
> 2022-06-06 06:15:32,717 [BP-1990954485-172.17.0.2-1654496132136 heartbeating 
> to localhost/127.0.0.1:42811] INFO  datanode.DataNode 
> (BPServiceActor.java:register(847)) - Block pool 
> BP-1990954485-172.17.0.2-1654496132136 (Datanode Uuid 
> 7d4b5459-6f2b-4203-bf6f-d31bfb9b6c3f) service to localhost/127.0.0.1:42811 
> successfully registered with NN. {code}
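>
> A hedged sketch of the kind of log change being proposed (the parameter 
> names are illustrative; this is not the committed patch):
> {code:java}
> // Illustrative only: carry the NameNode address explicitly in the message.
> LOG.info("Block pool {} (Datanode Uuid {}) successfully registered with"
>     + " NN {}.", blockPoolId, datanodeUuid, nnAddr);
> {code}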






[jira] [Work logged] (HDFS-16601) Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try

2022-06-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16601?focusedWorklogId=780272&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780272
 ]

ASF GitHub Bot logged work on HDFS-16601:
-

Author: ASF GitHub Bot
Created on: 10/Jun/22 10:09
Start Date: 10/Jun/22 10:09
Worklog Time Spent: 10m 
  Work Description: ZanderXu commented on PR #4369:
URL: https://github.com/apache/hadoop/pull/4369#issuecomment-1152200912

   > the source datanode itself has the issue and is chosen again on retry
   
   It will choose the next datanode as the source datanode on retry.
   
   The code is like below, and `tried` is incremented by 1 on each retry.
   ```
 final DatanodeInfo src = original[tried % original.length];
 final DatanodeInfo[] targets = {nodes[d]};
 final StorageType[] targetStorageTypes = {storageTypes[d]};
   
 try {
   transfer(src, targets, targetStorageTypes, lb.getBlockToken());
 } catch (IOException ioe) {
   DFSClient.LOG.warn("Error transferring data from " + src + " to " +
   nodes[d] + ": " + ioe.getMessage());
   caughtException = ioe;
   // add the allocated node to the exclude list.
   exclude.add(nodes[d]);
   setPipeline(original, originalTypes, originalIDs);
   tried++;
   continue;
 }
   ```




Issue Time Tracking
---

Worklog Id: (was: 780272)
Time Spent: 1.5h  (was: 1h 20m)

> Failed to replace a bad datanode on the existing pipeline due to no more good 
> datanodes being available to try
> --
>
> Key: HDFS-16601
> URL: https://issues.apache.org/jira/browse/HDFS-16601
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> In our production environment, we found a bug with a stack trace like:
> {code:java}
> java.io.IOException: Failed to replace a bad datanode on the existing 
> pipeline due to no more good datanodes being available to try. (Nodes: 
> current=[DatanodeInfoWithStorage[127.0.0.1:59687,DS-b803febc-7b22-4144-9b39-7bf521cdaa8d,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:59670,DS-0d652bc2-1784-430d-961f-750f80a290f1,DISK]],
>  
> original=[DatanodeInfoWithStorage[127.0.0.1:59670,DS-0d652bc2-1784-430d-961f-750f80a290f1,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:59687,DS-b803febc-7b22-4144-9b39-7bf521cdaa8d,DISK]]).
>  The current failed datanode replacement policy is DEFAULT, and a client may 
> configure this via 
> 'dfs.client.block.write.replace-datanode-on-failure.policy' in its 
> configuration.
>   at 
> org.apache.hadoop.hdfs.DataStreamer.findNewDatanode(DataStreamer.java:1418)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.addDatanode2ExistingPipeline(DataStreamer.java:1478)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.handleDatanodeReplacement(DataStreamer.java:1704)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1605)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1587)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.processDatanodeOrExternalError(DataStreamer.java:1371)
>   at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:674)
> {code}
> And the root cause is that the DFSClient cannot perceive the exception from 
> TransferBlock during PipelineRecovery. If it fails during TransferBlock, the 
> DFSClient will retry all datanodes in the cluster and then fail.






[jira] [Work logged] (HDFS-16598) All datanodes [DatanodeInfoWithStorage[127.0.0.1:57448,DS-1b5f7e33-a2bf-4edc-9122-a74c995a99f5,DISK]] are bad. Aborting...

2022-06-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16598?focusedWorklogId=780273&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780273
 ]

ASF GitHub Bot logged work on HDFS-16598:
-

Author: ASF GitHub Bot
Created on: 10/Jun/22 10:09
Start Date: 10/Jun/22 10:09
Worklog Time Spent: 10m 
  Work Description: ZanderXu commented on PR #4366:
URL: https://github.com/apache/hadoop/pull/4366#issuecomment-1152201718

   Got it, I will do it.




Issue Time Tracking
---

Worklog Id: (was: 780273)
Time Spent: 2.5h  (was: 2h 20m)

> All datanodes 
> [DatanodeInfoWithStorage[127.0.0.1:57448,DS-1b5f7e33-a2bf-4edc-9122-a74c995a99f5,DISK]]
>  are bad. Aborting...
> --
>
> Key: HDFS-16598
> URL: https://issues.apache.org/jira/browse/HDFS-16598
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> org.apache.hadoop.hdfs.testPipelineRecoveryOnRestartFailure failed with a 
> stack trace like:
> {code:java}
> java.io.IOException: All datanodes 
> [DatanodeInfoWithStorage[127.0.0.1:57448,DS-1b5f7e33-a2bf-4edc-9122-a74c995a99f5,DISK]]
>  are bad. Aborting...
>   at 
> org.apache.hadoop.hdfs.DataStreamer.handleBadDatanode(DataStreamer.java:1667)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1601)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1587)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.processDatanodeOrExternalError(DataStreamer.java:1371)
>   at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:674)
> {code}
> After tracing the root cause, we found this bug was introduced by 
> [HDFS-16534|https://issues.apache.org/jira/browse/HDFS-16534], because the 
> client's block GS may be smaller than the DN's when pipeline recovery fails.






[jira] [Work logged] (HDFS-16600) Deadlock on DataNode

2022-06-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16600?focusedWorklogId=780274&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780274
 ]

ASF GitHub Bot logged work on HDFS-16600:
-

Author: ASF GitHub Bot
Created on: 10/Jun/22 10:14
Start Date: 10/Jun/22 10:14
Worklog Time Spent: 10m 
  Work Description: Hexiaoqiao commented on PR #4367:
URL: https://github.com/apache/hadoop/pull/4367#issuecomment-1152205735

   > Oh, I'm sorry, the failed UT is 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaPlacement.testSynchronousEviction.
   
   Thanks @ZanderXu for the information. I think it can cover this case; let's 
wait to hear what Ayush thinks about it.
   I just found that the latest build was not clean. Let me trigger Jenkins 
again, and let's wait and see what it says.




Issue Time Tracking
---

Worklog Id: (was: 780274)
Time Spent: 3h 20m  (was: 3h 10m)

> Deadlock on DataNode
> 
>
> Key: HDFS-16600
> URL: https://issues.apache.org/jira/browse/HDFS-16600
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> The UT 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.testSynchronousEviction 
> failed because of a deadlock, which was introduced by 
> [HDFS-16534|https://issues.apache.org/jira/browse/HDFS-16534]. 
> DeadLock:
> {code:java}
> // org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.createRbw line 1588 
> need a read lock
> try (AutoCloseableLock lock = lockManager.readLock(LockLevel.BLOCK_POOl,
> b.getBlockPoolId()))
> // org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.evictBlocks line 
> 3526 need a write lock
> try (AutoCloseableLock lock = lockManager.writeLock(LockLevel.BLOCK_POOl, 
> bpid))
> {code}






[jira] [Work logged] (HDFS-16601) Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try

2022-06-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16601?focusedWorklogId=780277&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780277
 ]

ASF GitHub Bot logged work on HDFS-16601:
-

Author: ASF GitHub Bot
Created on: 10/Jun/22 10:29
Start Date: 10/Jun/22 10:29
Worklog Time Spent: 10m 
  Work Description: Hexiaoqiao commented on PR #4369:
URL: https://github.com/apache/hadoop/pull/4369#issuecomment-1152217871

   Sorry for the unclear comment. I know now that it is a round-robin way to 
pick the source node, and at the third round it will pick the original node 
again (no matter whether it is a bad/slow node), though of course that is a 
tiny probability. Actually, I mean it will be helpful for the client to make 
many fault-tolerance improvements later if we could differentiate the 
exceptions about the transfer. Once more, this is not a blocking comment. 
Thanks again.




Issue Time Tracking
---

Worklog Id: (was: 780277)
Time Spent: 1h 40m  (was: 1.5h)

> Failed to replace a bad datanode on the existing pipeline due to no more good 
> datanodes being available to try
> --
>
> Key: HDFS-16601
> URL: https://issues.apache.org/jira/browse/HDFS-16601
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> In our production environment, we found a bug with a stack trace like:
> {code:java}
> java.io.IOException: Failed to replace a bad datanode on the existing 
> pipeline due to no more good datanodes being available to try. (Nodes: 
> current=[DatanodeInfoWithStorage[127.0.0.1:59687,DS-b803febc-7b22-4144-9b39-7bf521cdaa8d,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:59670,DS-0d652bc2-1784-430d-961f-750f80a290f1,DISK]],
>  
> original=[DatanodeInfoWithStorage[127.0.0.1:59670,DS-0d652bc2-1784-430d-961f-750f80a290f1,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:59687,DS-b803febc-7b22-4144-9b39-7bf521cdaa8d,DISK]]).
>  The current failed datanode replacement policy is DEFAULT, and a client may 
> configure this via 
> 'dfs.client.block.write.replace-datanode-on-failure.policy' in its 
> configuration.
>   at 
> org.apache.hadoop.hdfs.DataStreamer.findNewDatanode(DataStreamer.java:1418)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.addDatanode2ExistingPipeline(DataStreamer.java:1478)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.handleDatanodeReplacement(DataStreamer.java:1704)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1605)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1587)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.processDatanodeOrExternalError(DataStreamer.java:1371)
>   at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:674)
> {code}
> And the root cause is that the DFSClient cannot perceive the exception from 
> TransferBlock during PipelineRecovery. If it fails during TransferBlock, the 
> DFSClient will retry all datanodes in the cluster and then fail.






[jira] [Work logged] (HDFS-16601) Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try

2022-06-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16601?focusedWorklogId=780278&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780278
 ]

ASF GitHub Bot logged work on HDFS-16601:
-

Author: ASF GitHub Bot
Created on: 10/Jun/22 10:32
Start Date: 10/Jun/22 10:32
Worklog Time Spent: 10m 
  Work Description: ZanderXu commented on PR #4369:
URL: https://github.com/apache/hadoop/pull/4369#issuecomment-1152220339

   Got it, thanks @Hexiaoqiao.
   
   > Actually, I mean it will be helpful for the client to make many 
fault-tolerance improvements later if we could differentiate the exceptions 
about the transfer
   
   I will try to work on it.




Issue Time Tracking
---

Worklog Id: (was: 780278)
Time Spent: 1h 50m  (was: 1h 40m)

> Failed to replace a bad datanode on the existing pipeline due to no more good 
> datanodes being available to try
> --
>
> Key: HDFS-16601
> URL: https://issues.apache.org/jira/browse/HDFS-16601
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> In our production environment, we found a bug with a stack trace like:
> {code:java}
> java.io.IOException: Failed to replace a bad datanode on the existing 
> pipeline due to no more good datanodes being available to try. (Nodes: 
> current=[DatanodeInfoWithStorage[127.0.0.1:59687,DS-b803febc-7b22-4144-9b39-7bf521cdaa8d,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:59670,DS-0d652bc2-1784-430d-961f-750f80a290f1,DISK]],
>  
> original=[DatanodeInfoWithStorage[127.0.0.1:59670,DS-0d652bc2-1784-430d-961f-750f80a290f1,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:59687,DS-b803febc-7b22-4144-9b39-7bf521cdaa8d,DISK]]).
>  The current failed datanode replacement policy is DEFAULT, and a client may 
> configure this via 
> 'dfs.client.block.write.replace-datanode-on-failure.policy' in its 
> configuration.
>   at 
> org.apache.hadoop.hdfs.DataStreamer.findNewDatanode(DataStreamer.java:1418)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.addDatanode2ExistingPipeline(DataStreamer.java:1478)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.handleDatanodeReplacement(DataStreamer.java:1704)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1605)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1587)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.processDatanodeOrExternalError(DataStreamer.java:1371)
>   at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:674)
> {code}
> And the root cause is that the DFSClient cannot perceive the exception from 
> TransferBlock during PipelineRecovery. If it fails during TransferBlock, the 
> DFSClient will retry all datanodes in the cluster and then fail.






[jira] [Commented] (HDFS-16613) EC: Improve performance of decommissioning dn with many ec blocks

2022-06-10 Thread Hiroyuki Adachi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17552722#comment-17552722
 ] 

Hiroyuki Adachi commented on HDFS-16613:


[~caozhiqiang] , thank you for explaining in detail.

I think the data process you described is correct and your approach for 
improving performance is right. My concern was the reconstruction load on a 
large cluster where blocksToProcess is much larger than maxTransfers. But I 
found that I had misunderstood that the blocks held by the busy node would be 
reconstructed rather than replicated. So I think there is no problem using 
dfs.namenode.replication.max-streams-hard-limit for this purpose.

> EC: Improve performance of decommissioning dn with many ec blocks
> -
>
> Key: HDFS-16613
> URL: https://issues.apache.org/jira/browse/HDFS-16613
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ec, erasure-coding, namenode
>Affects Versions: 3.4.0
>Reporter: caozhiqiang
>Assignee: caozhiqiang
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2022-06-07-11-46-42-389.png, 
> image-2022-06-07-17-42-16-075.png, image-2022-06-07-17-45-45-316.png, 
> image-2022-06-07-17-51-04-876.png, image-2022-06-07-17-55-40-203.png, 
> image-2022-06-08-11-38-29-664.png, image-2022-06-08-11-41-11-127.png
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> In an HDFS cluster with a lot of EC blocks, decommissioning a DN is very slow. 
> The reason is that, unlike replicated blocks, which can be copied from any DN 
> holding the same replica, an EC block has to be copied from the 
> decommissioning DN itself.
> The configurations dfs.namenode.replication.max-streams and 
> dfs.namenode.replication.max-streams-hard-limit will limit the replication 
> speed, but increasing them creates a risk to the whole cluster's network. So 
> a new configuration should be added to limit the decommissioning DN, 
> distinguished from the cluster-wide max-streams limit.






[jira] [Work logged] (HDFS-16627) Improve BPServiceActor#register log: add NN addr

2022-06-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16627?focusedWorklogId=780302&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780302
 ]

ASF GitHub Bot logged work on HDFS-16627:
-

Author: ASF GitHub Bot
Created on: 10/Jun/22 12:42
Start Date: 10/Jun/22 12:42
Worklog Time Spent: 10m 
  Work Description: slfan1989 commented on PR #4419:
URL: https://github.com/apache/hadoop/pull/4419#issuecomment-1152319030

   > The latest build seems to throw some warnings. Let me trigger Jenkins 
again, and let's wait and see what it says.
   
   No problem, I will re-trigger Jenkins!
   




Issue Time Tracking
---

Worklog Id: (was: 780302)
Time Spent: 1h 40m  (was: 1.5h)

> Improve BPServiceActor#register log: add NN addr
> ---
>
> Key: HDFS-16627
> URL: https://issues.apache.org/jira/browse/HDFS-16627
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.4.0
>Reporter: fanshilun
>Assignee: fanshilun
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> When I read the log, I think the NN address information should be added to 
> make the log information more complete.
> The log is as follows:
> {code:java}
> 2022-06-06 06:15:32,715 [BP-1990954485-172.17.0.2-1654496132136 heartbeating 
> to localhost/127.0.0.1:42811] INFO  datanode.DataNode 
> (BPServiceActor.java:register(819)) - Block pool 
> BP-1990954485-172.17.0.2-1654496132136 (Datanode Uuid 
> 7d4b5459-6f2b-4203-bf6f-d31bfb9b6c3f) service to localhost/127.0.0.1:42811 
> beginning handshake with NN.
> 2022-06-06 06:15:32,717 [BP-1990954485-172.17.0.2-1654496132136 heartbeating 
> to localhost/127.0.0.1:42811] INFO  datanode.DataNode 
> (BPServiceActor.java:register(847)) - Block pool 
> BP-1990954485-172.17.0.2-1654496132136 (Datanode Uuid 
> 7d4b5459-6f2b-4203-bf6f-d31bfb9b6c3f) service to localhost/127.0.0.1:42811 
> successfully registered with NN. {code}






[jira] [Comment Edited] (HDFS-16613) EC: Improve performance of decommissioning dn with many ec blocks

2022-06-10 Thread Hiroyuki Adachi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17552722#comment-17552722
 ] 

Hiroyuki Adachi edited comment on HDFS-16613 at 6/10/22 2:28 PM:
-

[~caozhiqiang] , thank you for explaining in detail.

I think the data process you described is correct and your approach for 
improving performance is right. My concern was the reconstruction load on a 
large cluster where blocksToProcess is much larger than maxTransfers. But I 
found that I had misunderstood that the blocks held by the busy node would be 
reconstructed rather than replicated.

So I think there is no problem using 
dfs.namenode.replication.max-streams-hard-limit for this purpose. Basically, it 
is meant for the highest-priority replication, but the patch will not affect 
that: since computeReconstructionWorkForBlocks() processes the higher-priority 
replication queues first, the decommissioning node will not fill up with 
low-redundancy EC block replication tasks.
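
As a concrete illustration, the key below is the real hard-limit key while the 
value is arbitrary (a sketch, not a recommendation):
{code:java}
import org.apache.hadoop.conf.Configuration;

// Illustrative only: raise the per-DN hard limit on replication streams for
// decommissioning/maintenance nodes; tune to the cluster's network headroom.
Configuration conf = new Configuration();
conf.setInt("dfs.namenode.replication.max-streams-hard-limit", 100);
{code}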


was (Author: hadachi):
[~caozhiqiang] , thank you for explaining in detail.

I think the data process you described is correct and your approach for 
improving performance is right. My concern was the reconstruction load on a 
large cluster where blocksToProcess is much larger than maxTransfers. But I 
found that I had misunderstood that the blocks held by the busy node would be 
reconstructed rather than replicated. So I think there is no problem using 
dfs.namenode.replication.max-streams-hard-limit for this purpose.

> EC: Improve performance of decommissioning dn with many ec blocks
> -
>
> Key: HDFS-16613
> URL: https://issues.apache.org/jira/browse/HDFS-16613
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ec, erasure-coding, namenode
>Affects Versions: 3.4.0
>Reporter: caozhiqiang
>Assignee: caozhiqiang
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2022-06-07-11-46-42-389.png, 
> image-2022-06-07-17-42-16-075.png, image-2022-06-07-17-45-45-316.png, 
> image-2022-06-07-17-51-04-876.png, image-2022-06-07-17-55-40-203.png, 
> image-2022-06-08-11-38-29-664.png, image-2022-06-08-11-41-11-127.png
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> In an HDFS cluster with a lot of EC blocks, decommissioning a DN is very slow. 
> The reason is that, unlike replicated blocks, which can be copied from any DN 
> holding the same replica, an EC block has to be copied from the 
> decommissioning DN itself.
> The configurations dfs.namenode.replication.max-streams and 
> dfs.namenode.replication.max-streams-hard-limit will limit the replication 
> speed, but increasing them creates a risk to the whole cluster's network. So 
> a new configuration should be added to limit the decommissioning DN, 
> distinguished from the cluster-wide max-streams limit.






[jira] [Created] (HDFS-16630) Simplify extern wrapping for XPlatform dirent

2022-06-10 Thread Gautham Banasandra (Jira)
Gautham Banasandra created HDFS-16630:
-

 Summary: Simplify extern wrapping for XPlatform dirent
 Key: HDFS-16630
 URL: https://issues.apache.org/jira/browse/HDFS-16630
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: libhdfs++
Affects Versions: 3.4.0
Reporter: Gautham Banasandra
Assignee: Gautham Banasandra


Need to simplify the wrapping of the [extern "C" 
block|https://github.com/apache/hadoop/blob/7f5a34dfaa7e6fcb08af75ab40f67e50fe4d78ef/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfspp/lib/x-platform/c-api/extern/dirent.h#L25-L33]
 as described here - 
https://github.com/apache/hadoop/pull/4370#discussion_r892836982.






[jira] [Work logged] (HDFS-16463) Make dirent cross platform compatible

2022-06-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16463?focusedWorklogId=780342&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780342
 ]

ASF GitHub Bot logged work on HDFS-16463:
-

Author: ASF GitHub Bot
Created on: 10/Jun/22 15:20
Start Date: 10/Jun/22 15:20
Worklog Time Spent: 10m 
  Work Description: GauthamBanasandra commented on code in PR #4370:
URL: https://github.com/apache/hadoop/pull/4370#discussion_r894642999


##
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfspp/lib/x-platform/c-api/extern/dirent.h:
##
@@ -0,0 +1,35 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#ifndef NATIVE_LIBHDFSPP_LIB_CROSS_PLATFORM_C_API_EXTERN_DIRENT_H
+#define NATIVE_LIBHDFSPP_LIB_CROSS_PLATFORM_C_API_EXTERN_DIRENT_H
+
+/*
+ * We will use extern "C" only on Windows.
+ */
+#if defined(WIN32) && defined(__cplusplus)
+extern "C" {
+#endif
+
+#include "x-platform/c-api/core/dirent.h"

Review Comment:
   I've filed a JIRA to track this - 
https://issues.apache.org/jira/browse/HDFS-16630.





Issue Time Tracking
---

Worklog Id: (was: 780342)
Time Spent: 6.5h  (was: 6h 20m)

> Make dirent cross platform compatible
> -
>
> Key: HDFS-16463
> URL: https://issues.apache.org/jira/browse/HDFS-16463
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: libhdfs++
>Affects Versions: 3.4.0
> Environment: Windows 10
>Reporter: Gautham Banasandra
>Assignee: Gautham Banasandra
>Priority: Major
>  Labels: libhdfscpp, pull-request-available
>  Time Spent: 6.5h
>  Remaining Estimate: 0h
>
> [jnihelper.c|https://github.com/apache/hadoop/blob/1fed18bb2d8ac3dbaecc3feddded30bed918d556/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/jni_helper.c#L28]
>  in HDFS native client uses *dirent.h*. This header file isn't available on 
> Windows. Thus, we need to replace this with a cross-platform compatible 
> implementation for dirent.






[jira] [Work logged] (HDFS-16605) Improve Code With Lambda in hadoop-hdfs-rbf module

2022-06-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16605?focusedWorklogId=780344&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780344
 ]

ASF GitHub Bot logged work on HDFS-16605:
-

Author: ASF GitHub Bot
Created on: 10/Jun/22 15:27
Start Date: 10/Jun/22 15:27
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4375:
URL: https://github.com/apache/hadoop/pull/4375#issuecomment-1152481716

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |  17m 22s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 10 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  39m 33s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 53s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |   0m 48s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   0m 41s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 53s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m  1s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   1m  7s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   1m 42s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  23m 53s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 38s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 41s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |   0m 41s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 35s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |   0m 35s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 23s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   0m 40s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 38s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 54s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   1m 26s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  23m 31s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |  39m 45s |  |  hadoop-hdfs-rbf in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 44s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 159m 29s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4375/5/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4375 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux f534471f3f79 4.15.0-175-generic #184-Ubuntu SMP Thu Mar 24 17:48:36 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / cf3ebaf0ab1351c8d5754d256ef9bf61b51e |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4375/5/testReport/ |
   | Max. process+thread count | 2232 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: hadoop-hdfs-project/hadoop-hdfs-rbf |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4375/5/console |

[jira] [Resolved] (HDFS-16463) Make dirent cross platform compatible

2022-06-10 Thread Gautham Banasandra (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gautham Banasandra resolved HDFS-16463.
---
Fix Version/s: 3.4.0
   Resolution: Fixed

Merged PR https://github.com/apache/hadoop/pull/4370 to trunk.

> Make dirent cross platform compatible
> -
>
> Key: HDFS-16463
> URL: https://issues.apache.org/jira/browse/HDFS-16463
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: libhdfs++
>Affects Versions: 3.4.0
> Environment: Windows 10
>Reporter: Gautham Banasandra
>Assignee: Gautham Banasandra
>Priority: Major
>  Labels: libhdfscpp, pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 6.5h
>  Remaining Estimate: 0h
>
> [jnihelper.c|https://github.com/apache/hadoop/blob/1fed18bb2d8ac3dbaecc3feddded30bed918d556/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/jni_helper.c#L28]
>  in HDFS native client uses *dirent.h*. This header file isn't available on 
> Windows. Thus, we need to replace this with a cross-platform compatible 
> implementation for dirent.






[jira] [Work logged] (HDFS-16600) Deadlock on DataNode

2022-06-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16600?focusedWorklogId=780345&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780345
 ]

ASF GitHub Bot logged work on HDFS-16600:
-

Author: ASF GitHub Bot
Created on: 10/Jun/22 15:40
Start Date: 10/Jun/22 15:40
Worklog Time Spent: 10m 
  Work Description: ayushtkn commented on PR #4367:
URL: https://github.com/apache/hadoop/pull/4367#issuecomment-1152493868

   Makes sense to me. Thanx everyone 




Issue Time Tracking
---

Worklog Id: (was: 780345)
Time Spent: 3.5h  (was: 3h 20m)

> Deadlock on DataNode
> 
>
> Key: HDFS-16600
> URL: https://issues.apache.org/jira/browse/HDFS-16600
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> The UT 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.testSynchronousEviction 
> failed because of a deadlock, which was introduced by 
> [HDFS-16534|https://issues.apache.org/jira/browse/HDFS-16534]. 
> DeadLock:
> {code:java}
> // org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.createRbw line 1588 
> need a read lock
> try (AutoCloseableLock lock = lockManager.readLock(LockLevel.BLOCK_POOl,
> b.getBlockPoolId()))
> // org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.evictBlocks line 
> 3526 need a write lock
> try (AutoCloseableLock lock = lockManager.writeLock(LockLevel.BLOCK_POOl, 
> bpid))
> {code}






[jira] [Work logged] (HDFS-16605) Improve Code With Lambda in hadoop-hdfs-rbf module

2022-06-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16605?focusedWorklogId=780351&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780351
 ]

ASF GitHub Bot logged work on HDFS-16605:
-

Author: ASF GitHub Bot
Created on: 10/Jun/22 15:54
Start Date: 10/Jun/22 15:54
Worklog Time Spent: 10m 
  Work Description: slfan1989 commented on PR #4375:
URL: https://github.com/apache/hadoop/pull/4375#issuecomment-1152506400

   @goiri Please help me review the code again, thank you very much! Can you 
help merge it to the trunk branch?




Issue Time Tracking
---

Worklog Id: (was: 780351)
Time Spent: 2h  (was: 1h 50m)

> Improve Code With Lambda in hadoop-hdfs-rbf module
> --
>
> Key: HDFS-16605
> URL: https://issues.apache.org/jira/browse/HDFS-16605
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rbf
>Affects Versions: 3.4.0
>Reporter: fanshilun
>Assignee: fanshilun
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 2h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16619) impove HttpHeaders.Values And HttpHeaders.Names With recommended Class

2022-06-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16619?focusedWorklogId=780405&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780405
 ]

ASF GitHub Bot logged work on HDFS-16619:
-

Author: ASF GitHub Bot
Created on: 10/Jun/22 18:41
Start Date: 10/Jun/22 18:41
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4406:
URL: https://github.com/apache/hadoop/pull/4406#issuecomment-1152636309

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 41s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 2 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  36m 42s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 44s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |   1m 35s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   1m 24s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 45s |  |  trunk passed  |
   | -1 :x: |  javadoc  |   1m 24s | 
[/branch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4406/4/artifact/out/branch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt)
 |  hadoop-hdfs in trunk failed with JDK Private 
Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.  |
   | +1 :green_heart: |  javadoc  |   1m 50s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   3m 40s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  23m 18s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 21s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 27s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |   1m 27s |  |  
hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
 with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 generated 0 new + 
911 unchanged - 26 fixed = 911 total (was 937)  |
   | +1 :green_heart: |  compile  |   1m 17s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |   1m 17s |  |  
hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
 with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 generated 0 new 
+ 890 unchanged - 26 fixed = 890 total (was 916)  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   1m  1s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 23s |  |  the patch passed  |
   | -1 :x: |  javadoc  |   0m 58s | 
[/patch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4406/4/artifact/out/patch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt)
 |  hadoop-hdfs in the patch failed with JDK Private 
Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.  |
   | +1 :green_heart: |  javadoc  |   1m 29s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   3m 19s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m 26s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  | 242m 57s |  |  hadoop-hdfs in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   1m 13s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 351m 30s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4406/4/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4406 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsi

[jira] [Work logged] (HDFS-16627) improve BPServiceActor#register Log Add NN Addr

2022-06-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16627?focusedWorklogId=780407&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780407
 ]

ASF GitHub Bot logged work on HDFS-16627:
-

Author: ASF GitHub Bot
Created on: 10/Jun/22 18:50
Start Date: 10/Jun/22 18:50
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4419:
URL: https://github.com/apache/hadoop/pull/4419#issuecomment-1152642425

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 39s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  36m 38s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 42s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |   1m 36s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   1m 24s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 43s |  |  trunk passed  |
   | -1 :x: |  javadoc  |   1m 25s | 
[/branch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4419/4/artifact/out/branch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt)
 |  hadoop-hdfs in trunk failed with JDK Private 
Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.  |
   | +1 :green_heart: |  javadoc  |   1m 46s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   3m 46s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  23m 18s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 25s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 27s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |   1m 27s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 16s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |   1m 16s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   1m  1s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 24s |  |  the patch passed  |
   | -1 :x: |  javadoc  |   0m 59s | 
[/patch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4419/4/artifact/out/patch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt)
 |  hadoop-hdfs in the patch failed with JDK Private 
Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.  |
   | +1 :green_heart: |  javadoc  |   1m 32s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   3m 23s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m 29s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  | 253m  6s |  |  hadoop-hdfs in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   1m 13s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 361m 27s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4419/4/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4419 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 3f2eab394808 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 
23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 1c7aa6ab779987a4ead2426520ec1c6d109ed9a9 |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0

[jira] [Updated] (HDFS-16623) IllegalArgumentException in LifelineSender

2022-06-10 Thread Chris Nauroth (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-16623:
-
Fix Version/s: 3.4.0
   3.2.4
   3.3.4

> IllegalArgumentException in LifelineSender
> --
>
> Key: HDFS-16623
> URL: https://issues.apache.org/jira/browse/HDFS-16623
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.2.4, 3.3.4
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> In our production environment, an IllegalArgumentException occurred in the 
> LifelineSender at one DataNode which was undergoing GC at that time. 
> The buggy code is at line 1060 in BPServiceActor.java; the exception is 
> thrown because the sleep time is negative.
> {code:java}
> while (shouldRun()) {
>   try {
>     if (lifelineNamenode == null) {
>       lifelineNamenode = dn.connectToLifelineNN(lifelineNnAddr);
>     }
>     sendLifelineIfDue();
>     Thread.sleep(scheduler.getLifelineWaitTime());
>   } catch (InterruptedException e) {
>     Thread.currentThread().interrupt();
>   } catch (IOException e) {
>     LOG.warn("IOException in LifelineSender for " + BPServiceActor.this, e);
>   }
> }
> {code}
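
The root cause is visible in the loop above: Thread.sleep() throws 
IllegalArgumentException for a negative argument, and the scheduled wait time 
can go negative when a long GC pause pushes the next send time into the past. 
A minimal guard along these lines would avoid it (a sketch only; the committed 
fix may differ):

{code:java}
while (shouldRun()) {
  try {
    if (lifelineNamenode == null) {
      lifelineNamenode = dn.connectToLifelineNN(lifelineNnAddr);
    }
    sendLifelineIfDue();
    // Sketch only -- not necessarily the committed fix: clamp the wait
    // time so a scheduler running behind (e.g. after a long GC pause)
    // never hands Thread.sleep() a negative argument.
    long waitTime = scheduler.getLifelineWaitTime();
    if (waitTime > 0) {
      Thread.sleep(waitTime);
    }
  } catch (InterruptedException e) {
    Thread.currentThread().interrupt();
  } catch (IOException e) {
    LOG.warn("IOException in LifelineSender for " + BPServiceActor.this, e);
  }
}
{code}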



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16623) IllegalArgumentException in LifelineSender

2022-06-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16623?focusedWorklogId=780412&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780412
 ]

ASF GitHub Bot logged work on HDFS-16623:
-

Author: ASF GitHub Bot
Created on: 10/Jun/22 19:01
Start Date: 10/Jun/22 19:01
Worklog Time Spent: 10m 
  Work Description: cnauroth merged PR #4409:
URL: https://github.com/apache/hadoop/pull/4409




Issue Time Tracking
---

Worklog Id: (was: 780412)
Time Spent: 1h 10m  (was: 1h)

> IllegalArgumentException in LifelineSender
> --
>
> Key: HDFS-16623
> URL: https://issues.apache.org/jira/browse/HDFS-16623
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.2.4, 3.3.4
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> In our production environment, an IllegalArgumentException occurred in the 
> LifelineSender at one DataNode which was undergoing GC at that time. 
> The buggy code is at line 1060 in BPServiceActor.java; the exception is 
> thrown because the sleep time is negative.
> {code:java}
> while (shouldRun()) {
>   try {
>     if (lifelineNamenode == null) {
>       lifelineNamenode = dn.connectToLifelineNN(lifelineNnAddr);
>     }
>     sendLifelineIfDue();
>     Thread.sleep(scheduler.getLifelineWaitTime());
>   } catch (InterruptedException e) {
>     Thread.currentThread().interrupt();
>   } catch (IOException e) {
>     LOG.warn("IOException in LifelineSender for " + BPServiceActor.this, e);
>   }
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16609) Fix Flakes Junit Tests that often report timeouts

2022-06-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16609?focusedWorklogId=780435&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780435
 ]

ASF GitHub Bot logged work on HDFS-16609:
-

Author: ASF GitHub Bot
Created on: 10/Jun/22 20:58
Start Date: 10/Jun/22 20:58
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4382:
URL: https://github.com/apache/hadoop/pull/4382#issuecomment-1152733796

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 53s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 3 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  39m 44s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 42s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |   1m 30s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   1m 19s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 40s |  |  trunk passed  |
   | -1 :x: |  javadoc  |   1m 20s | 
[/branch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4382/5/artifact/out/branch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt)
 |  hadoop-hdfs in trunk failed with JDK Private 
Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.  |
   | +1 :green_heart: |  javadoc  |   1m 37s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   3m 44s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  25m 56s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 24s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 29s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |   1m 29s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 17s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |   1m 17s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   1m  1s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 24s |  |  the patch passed  |
   | -1 :x: |  javadoc  |   1m  0s | 
[/patch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4382/5/artifact/out/patch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt)
 |  hadoop-hdfs in the patch failed with JDK Private 
Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.  |
   | +1 :green_heart: |  javadoc  |   1m 31s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   3m 31s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  25m 20s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 374m  1s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4382/5/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 59s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 490m  4s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.TestClientProtocolForPipelineRecovery |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4382/5/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4382 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 381a990f87fc

[jira] [Work logged] (HDFS-16626) Under replicated blocks in dfsadmin report should contain pendingReconstruction‘s blocks

2022-06-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16626?focusedWorklogId=780438&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780438
 ]

ASF GitHub Bot logged work on HDFS-16626:
-

Author: ASF GitHub Bot
Created on: 10/Jun/22 21:05
Start Date: 10/Jun/22 21:05
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4420:
URL: https://github.com/apache/hadoop/pull/4420#issuecomment-1152738083

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 57s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  39m 33s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 42s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |   1m 33s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   1m 21s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 40s |  |  trunk passed  |
   | -1 :x: |  javadoc  |   1m 21s | 
[/branch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4420/2/artifact/out/branch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt)
 |  hadoop-hdfs in trunk failed with JDK Private 
Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.  |
   | +1 :green_heart: |  javadoc  |   1m 42s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   3m 48s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  26m 29s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 23s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 32s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |   1m 32s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 25s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |   1m 25s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   1m  3s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 34s |  |  the patch passed  |
   | -1 :x: |  javadoc  |   1m  4s | 
[/patch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4420/2/artifact/out/patch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt)
 |  hadoop-hdfs in the patch failed with JDK Private 
Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.  |
   | +1 :green_heart: |  javadoc  |   1m 36s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   4m  3s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  26m  5s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 377m 11s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4420/2/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m  0s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 495m 21s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.TestMissingBlocksAlert |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4420/2/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4420 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 9e07a4ebf679 4.15.0-166-gen

[jira] [Work logged] (HDFS-16625) Unit tests aren't checking for PMDK availability

2022-06-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16625?focusedWorklogId=780453&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780453
 ]

ASF GitHub Bot logged work on HDFS-16625:
-

Author: ASF GitHub Bot
Created on: 10/Jun/22 23:27
Start Date: 10/Jun/22 23:27
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4414:
URL: https://github.com/apache/hadoop/pull/4414#issuecomment-1152799203

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 40s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 2 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  66m 49s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 44s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |   1m 35s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   1m 25s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 48s |  |  trunk passed  |
   | -1 :x: |  javadoc  |   1m 25s | 
[/branch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4414/2/artifact/out/branch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt)
 |  hadoop-hdfs in trunk failed with JDK Private 
Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.  |
   | +1 :green_heart: |  javadoc  |   1m 50s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   3m 43s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  23m  5s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 21s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 27s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |   1m 27s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 18s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |   1m 18s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   1m  2s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 24s |  |  the patch passed  |
   | -1 :x: |  javadoc  |   0m 58s | 
[/patch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4414/2/artifact/out/patch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt)
 |  hadoop-hdfs in the patch failed with JDK Private 
Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.  |
   | +1 :green_heart: |  javadoc  |   1m 32s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   3m 24s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m 42s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 242m 19s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4414/2/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m 15s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 380m 48s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.TestClientProtocolForPipelineRecovery |
   |   | hadoop.hdfs.TestRollingUpgrade |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4414/2/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4414 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell de

[jira] [Work logged] (HDFS-16627) improve BPServiceActor#register Log Add NN Addr

2022-06-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16627?focusedWorklogId=780457&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780457
 ]

ASF GitHub Bot logged work on HDFS-16627:
-

Author: ASF GitHub Bot
Created on: 11/Jun/22 00:03
Start Date: 11/Jun/22 00:03
Worklog Time Spent: 10m 
  Work Description: slfan1989 commented on PR #4419:
URL: https://github.com/apache/hadoop/pull/4419#issuecomment-1152808552

   > The latest build seems throw some warning. Try to trigger jenkins again. 
Let's wait what it will say.
   
   Hi @Hexiaoqiao, thank you very much for your help in reviewing the code. I 
read the compilation report carefully and can confirm that the javadoc 
compilation error has nothing to do with this PR.
   
   javadoc has become stricter in the JDK 11 compilation environment, and 
javadoc is run twice after a PR commit.
   
   1st run
   Compiles the code of the current trunk directly. We can see the following 
report:
   
   [hadoop-hdfs in trunk failed with JDK Private 
Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4419/4/artifact/out/branch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt)
   
   2nd run
   Compiles and runs after the PR is merged. We can see the following report:
   
   [hadoop-hdfs in the patch failed with JDK Private 
Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4419/4/artifact/out/patch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt)
   
   Both runs report the following error:
   ```
   1 error
   100 warnings
   [INFO] 

   [INFO] BUILD FAILURE
   [INFO] 

   [INFO] Total time:  38.067 s
   [INFO] Finished at: 2022-06-10T14:10:17Z
   [INFO] 

   
   [ERROR] 
/home/jenkins/jenkins-agent/workspace/hadoop-multibranch_PR-4419/ubuntu-focal/src/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/startupprogress/package-info.java:20:
 error: reference not found
   [ERROR]  * This package provides a mechanism for tracking {@link NameNode} 
startup
   ```
   
   I try to fix it in this PR (https://github.com/apache/hadoop/pull/4423); 
after solving the error, javadoc compiles successfully with JDK 11.
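   
   For context, the usual ways to clear a JDK 11 javadoc "reference not found" 
error in a package-info.java are to import the linked class or to fully 
qualify it in the {@link} tag. A hypothetical before/after (the actual change 
in PR #4423 may differ):
   
   ```java
   // Hypothetical fix shape -- the actual change in PR #4423 may differ.
   // Before (fails under the stricter JDK 11 javadoc, because NameNode is
   // not resolvable from this package-info.java):
   //   /** ... tracking {@link NameNode} startup ... */
   // After: fully qualify the reference so javadoc can resolve it.
   /**
    * This package provides a mechanism for tracking
    * {@link org.apache.hadoop.hdfs.server.namenode.NameNode} startup.
    */
   package org.apache.hadoop.hdfs.server.namenode.startupprogress;
   ```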
   
   




Issue Time Tracking
---

Worklog Id: (was: 780457)
Time Spent: 2h  (was: 1h 50m)

> improve BPServiceActor#register Log Add NN Addr
> ---
>
> Key: HDFS-16627
> URL: https://issues.apache.org/jira/browse/HDFS-16627
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.4.0
>Reporter: fanshilun
>Assignee: fanshilun
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> When I read the log, I think the NN address information should be added to 
> make the log information more complete.
> The log is as follows:
> {code:java}
> 2022-06-06 06:15:32,715 [BP-1990954485-172.17.0.2-1654496132136 heartbeating 
> to localhost/127.0.0.1:42811] INFO  datanode.DataNode 
> (BPServiceActor.java:register(819)) - Block pool 
> BP-1990954485-172.17.0.2-1654496132136 (Datanode Uuid 
> 7d4b5459-6f2b-4203-bf6f-d31bfb9b6c3f) service to localhost/127.0.0.1:42811 
> beginning handshake with NN.
> 2022-06-06 06:15:32,717 [BP-1990954485-172.17.0.2-1654496132136 heartbeating 
> to localhost/127.0.0.1:42811] INFO  datanode.DataNode 
> (BPServiceActor.java:register(847)) - Block pool 
> BP-1990954485-172.17.0.2-1654496132136 (Datanode Uuid 
> 7d4b5459-6f2b-4203-bf6f-d31bfb9b6c3f) service to localhost/127.0.0.1:42811 
> successfully registered with NN. {code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16563) Namenode WebUI prints sensitive information on Token Expiry

2022-06-10 Thread fanshilun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17553009#comment-17553009
 ] 

fanshilun commented on HDFS-16563:
--

causes [MAPREDUCE-7387|https://issues.apache.org/jira/browse/MAPREDUCE-7387] 
too.

> Namenode WebUI prints sensitive information on Token Expiry
> ---
>
> Key: HDFS-16563
> URL: https://issues.apache.org/jira/browse/HDFS-16563
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namanode, security, webhdfs
>Affects Versions: 3.3.3
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.4
>
> Attachments: image-2022-04-27-23-01-16-033.png, 
> image-2022-04-27-23-28-40-568.png
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> Login to the Namenode WebUI.
> Wait for the token to expire. (Or modify the token refresh time 
> dfs.namenode.delegation.token.renew/update-interval to a lower value.)
> Refresh the WebUI after the token expiry.
> Full token information gets printed in the WebUI.
>  
> !image-2022-04-27-23-01-16-033.png!
> causes YARN-11172; all branches with this patch need that fix as well



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-16563) Namenode WebUI prints sensitive information on Token Expiry

2022-06-10 Thread fanshilun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17553009#comment-17553009
 ] 

fanshilun edited comment on HDFS-16563 at 6/11/22 12:37 AM:


causes MAPREDUCE-7387 too.


was (Author: slfan1989):
causes[ MAPREDUCE-7387|https://issues.apache.org/jira/browse/MAPREDUCE-7387] 
too.

> Namenode WebUI prints sensitive information on Token Expiry
> ---
>
> Key: HDFS-16563
> URL: https://issues.apache.org/jira/browse/HDFS-16563
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namanode, security, webhdfs
>Affects Versions: 3.3.3
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.4
>
> Attachments: image-2022-04-27-23-01-16-033.png, 
> image-2022-04-27-23-28-40-568.png
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> Login to the Namenode WebUI.
> Wait for the token to expire. (Or modify the token refresh time 
> dfs.namenode.delegation.token.renew/update-interval to a lower value.)
> Refresh the WebUI after the token expiry.
> Full token information gets printed in the WebUI.
>  
> !image-2022-04-27-23-01-16-033.png!
> causes YARN-11172; all branches with this patch need that fix as well



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16619) Fix HttpHeaders.Values And HttpHeaders.Names Deprecated Input.

2022-06-10 Thread fanshilun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

fanshilun updated HDFS-16619:
-
Summary: Fix HttpHeaders.Values And HttpHeaders.Names Deprecated Input.  
(was: impove HttpHeaders.Values And HttpHeaders.Names With recommended Class)

> Fix HttpHeaders.Values And HttpHeaders.Names Deprecated Input.
> --
>
> Key: HDFS-16619
> URL: https://issues.apache.org/jira/browse/HDFS-16619
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.4.0
>Reporter: fanshilun
>Assignee: fanshilun
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> HttpHeaders.Values and HttpHeaders.Names are deprecated; use 
> HttpHeaderValues and HttpHeaderNames instead.
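
For illustration, the migration looks roughly like this under Netty 4.1 (a 
sketch, not the actual patch, which may touch different call sites):

{code:java}
// Sketch of the migration described above (assuming Netty 4.1).
import io.netty.handler.codec.http.DefaultHttpHeaders;
import io.netty.handler.codec.http.HttpHeaderNames;
import io.netty.handler.codec.http.HttpHeaderValues;
import io.netty.handler.codec.http.HttpHeaders;

public class HeaderMigrationDemo {
  public static void main(String[] args) {
    HttpHeaders headers = new DefaultHttpHeaders();
    // Before (deprecated):
    // headers.set(HttpHeaders.Names.CONNECTION, HttpHeaders.Values.CLOSE);
    // After:
    headers.set(HttpHeaderNames.CONNECTION, HttpHeaderValues.CLOSE);
    System.out.println(headers.get(HttpHeaderNames.CONNECTION));
  }
}
{code}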



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16600) Deadlock on DataNode

2022-06-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16600?focusedWorklogId=780461&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780461
 ]

ASF GitHub Bot logged work on HDFS-16600:
-

Author: ASF GitHub Bot
Created on: 11/Jun/22 01:23
Start Date: 11/Jun/22 01:23
Worklog Time Spent: 10m 
  Work Description: slfan1989 commented on PR #4367:
URL: https://github.com/apache/hadoop/pull/4367#issuecomment-1152828770

   @Hexiaoqiao @ZanderXu @tomscut 
   
   I still have some doubts about this.
   
   1. I still hope ZanderXu can provide the deadlock exception stack trace; I 
will continue trying to reproduce this problem.
   
   2. I read the code of testSynchronousEviction carefully. This test uses the 
special storage policy LAZY_PERSIST, which asynchronously flushes memory 
blocks to disk; LazyWriter takes care of this work.
   Part of the code is as follows:
   ```
   private boolean saveNextReplica() {
     RamDiskReplica block = null;
     FsVolumeReference targetReference;
     FsVolumeImpl targetVolume;
     ReplicaInfo replicaInfo;
     boolean succeeded = false;
   
     try {
       block = ramDiskReplicaTracker.dequeueNextReplicaToPersist();
       if (block != null) {
         try (AutoCloseableLock lock = lockManager.writeLock(LockLevel.BLOCK_POOl,
             block.getBlockPoolId())) {
           replicaInfo = volumeMap.get(block.getBlockPoolId(), block.getBlockId());
   .
   ```
   If ZanderXu's judgment is correct, will this code also deadlock?
   
   3. I always have a question: why do we first take the block pool read lock 
and then the volume write lock? How was this lock order derived?
   
   4. I checked lockManager.writeLock(LockLevel.BLOCK_POOl, 
block.getBlockPoolId()) and found that the BLOCK_POOl write lock is also used 
when adding a volume, so will that deadlock as well?
   
   > In conclusion
   
   I don't think this is a deadlock. Could it be that createRbw got the read 
lock, which made evictBlocks wait a long time for the write lock, exceeding 
the waiting time of the JUnit test and eventually leading to an error?
   
   I think that to solve this problem completely we also need to look at the 
processing logic of LazyWriter; it should not be enough to just modify 
evictBlocks. (On the lock-order question, see the sketch below.)
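   
   On question 3, the usual rationale for a fixed lock order is that ABBA 
deadlocks become impossible when every thread acquires the coarse lock before 
the finer one. A standalone sketch with hypothetical names (not the 
FsDatasetImpl API):
   
   ```java
   // Hypothetical names, not the FsDatasetImpl API: if every thread takes
   // the block-pool lock (level 1) before the volume lock (level 2), two
   // threads can never hold each other's lock in reverse order (ABBA).
   public class LockOrderDemo {
     private final Object blockPoolLock = new Object();
     private final Object volumeLock = new Object();
   
     void writer() {
       synchronized (blockPoolLock) {   // level 1: always first
         synchronized (volumeLock) {    // level 2: always second
           // mutate per-volume state
         }
       }
     }
   
     void reader() {
       synchronized (blockPoolLock) {   // same order as writer(), so no ABBA
         synchronized (volumeLock) {
           // read per-volume state
         }
       }
     }
   }
   ```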
   




Issue Time Tracking
---

Worklog Id: (was: 780461)
Time Spent: 3h 40m  (was: 3.5h)

> Deadlock on DataNode
> 
>
> Key: HDFS-16600
> URL: https://issues.apache.org/jira/browse/HDFS-16600
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> The UT 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.testSynchronousEviction 
> failed because of a deadlock, which was introduced by 
> [HDFS-16534|https://issues.apache.org/jira/browse/HDFS-16534]. 
> Deadlock:
> {code:java}
> // org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.createRbw
> // line 1588 needs a read lock
> try (AutoCloseableLock lock = lockManager.readLock(LockLevel.BLOCK_POOl,
>     b.getBlockPoolId()))
> // org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.evictBlocks
> // line 3526 needs a write lock
> try (AutoCloseableLock lock = lockManager.writeLock(LockLevel.BLOCK_POOl,
>     bpid))
> {code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16609) Fix Flakes Junit Tests that often report timeouts

2022-06-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16609?focusedWorklogId=780465&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780465
 ]

ASF GitHub Bot logged work on HDFS-16609:
-

Author: ASF GitHub Bot
Created on: 11/Jun/22 01:36
Start Date: 11/Jun/22 01:36
Worklog Time Spent: 10m 
  Work Description: slfan1989 commented on PR #4382:
URL: https://github.com/apache/hadoop/pull/4382#issuecomment-1152830700

   @Hexiaoqiao @tomscut I read the test report carefully and believe that 
changing the timeout will not cause javadoc to report an error. I hope you can 
help me review the code again.




Issue Time Tracking
---

Worklog Id: (was: 780465)
Time Spent: 2h  (was: 1h 50m)

> Fix Flakes Junit Tests that often report timeouts
> -
>
> Key: HDFS-16609
> URL: https://issues.apache.org/jira/browse/HDFS-16609
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: fanshilun
>Assignee: fanshilun
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> While working on HDFS-16590, JUnit tests often reported errors. I found that 
> one type of failure is a timeout problem; these failures can be avoided by 
> increasing the timeout.
> The modified methods are as follows:
> 1.org.apache.hadoop.hdfs.TestFileCreation#testServerDefaultsWithMinimalCaching
> {code:java}
> [ERROR] 
> testServerDefaultsWithMinimalCaching(org.apache.hadoop.hdfs.TestFileCreation) 
>  Time elapsed: 7.136 s  <<< ERROR!
> java.util.concurrent.TimeoutException: 
> Timed out waiting for condition. 
> Thread diagnostics: 
> [WARNING] 
> org.apache.hadoop.hdfs.TestFileCreation.testServerDefaultsWithMinimalCaching(org.apache.hadoop.hdfs.TestFileCreation)
> [ERROR]   Run 1: TestFileCreation.testServerDefaultsWithMinimalCaching:277 
> Timeout Timed out ...
> [INFO]   Run 2: PASS{code}
> 2.org.apache.hadoop.hdfs.TestDFSShell#testFilePermissions
> {code:java}
> [ERROR] testFilePermissions(org.apache.hadoop.hdfs.TestDFSShell)  Time 
> elapsed: 30.022 s  <<< ERROR!
> org.junit.runners.model.TestTimedOutException: test timed out after 3 
> milliseconds
>   at java.lang.Thread.dumpThreads(Native Method)
>   at java.lang.Thread.getStackTrace(Thread.java:1549)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout.createTimeoutException(FailOnTimeout.java:182)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout.getResult(FailOnTimeout.java:177)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout.evaluate(FailOnTimeout.java:128)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.lang.Thread.run(Thread.java:748)
> [WARNING] 
> org.apache.hadoop.hdfs.TestDFSShell.testFilePermissions(org.apache.hadoop.hdfs.TestDFSShell)
> [ERROR]   Run 1: TestDFSShell.testFilePermissions TestTimedOut test timed out 
> after 3 mil...
> [INFO]   Run 2: PASS {code}
> 3.org.apache.hadoop.hdfs.server.sps.TestExternalStoragePolicySatisfier#testSPSWhenFileHasExcessRedundancyBlocks
> {code:java}
> [ERROR] 
> testSPSWhenFileHasExcessRedundancyBlocks(org.apache.hadoop.hdfs.server.sps.TestExternalStoragePolicySatisfier)
>   Time elapsed: 67.904 s  <<< ERROR!
> java.util.concurrent.TimeoutException: 
> Timed out waiting for condition. 
> [WARNING] 
> org.apache.hadoop.hdfs.server.sps.TestExternalStoragePolicySatisfier.testSPSWhenFileHasExcessRedundancyBlocks(org.apache.hadoop.hdfs.server.sps.TestExternalStoragePolicySatisfier)
> [ERROR]   Run 1: 
> TestExternalStoragePolicySatisfier.testSPSWhenFileHasExcessRedundancyBlocks:1379
>  Timeout
> [ERROR]   Run 2: 
> TestExternalStoragePolicySatisfier.testSPSWhenFileHasExcessRedundancyBlocks:1379
>  Timeout
> [INFO]   Run 3: PASS {code}
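
The shape of the change is simply raising per-test timeouts. A minimal hedged 
example (the values are illustrative, not the ones in the patch):

{code:java}
import org.junit.Test;

// Illustrative only -- the timeout values in the actual patch may differ.
public class TimeoutAdjustmentExample {
  // Was e.g. @Test(timeout = 30000): doubling absorbs slow CI runs while
  // still catching a genuinely hung test.
  @Test(timeout = 60000)
  public void slowButCorrect() throws Exception {
    Thread.sleep(100);   // stand-in for the real test work
  }
}
{code}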



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16613) EC: Improve performance of decommissioning dn with many ec blocks

2022-06-10 Thread caozhiqiang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17553020#comment-17553020
 ] 

caozhiqiang commented on HDFS-16613:


[~hadachi], thank you. Could you help review this PR [GitHub Pull Request 
#4398|https://github.com/apache/hadoop/pull/4398] and see whether this 
approach works?

> EC: Improve performance of decommissioning dn with many ec blocks
> -
>
> Key: HDFS-16613
> URL: https://issues.apache.org/jira/browse/HDFS-16613
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ec, erasure-coding, namenode
>Affects Versions: 3.4.0
>Reporter: caozhiqiang
>Assignee: caozhiqiang
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2022-06-07-11-46-42-389.png, 
> image-2022-06-07-17-42-16-075.png, image-2022-06-07-17-45-45-316.png, 
> image-2022-06-07-17-51-04-876.png, image-2022-06-07-17-55-40-203.png, 
> image-2022-06-08-11-38-29-664.png, image-2022-06-08-11-41-11-127.png
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> In an HDFS cluster with a lot of EC blocks, decommissioning a DataNode is 
> very slow. The reason is that, unlike replicated blocks, which can be copied 
> from any DataNode holding a replica, an EC block has to be replicated from 
> the decommissioning DataNode itself.
> The configurations dfs.namenode.replication.max-streams and 
> dfs.namenode.replication.max-streams-hard-limit limit the replication speed, 
> but increasing them puts the whole cluster's network at risk. So a new 
> configuration should be added to limit the decommissioning DataNode, 
> distinguished from the cluster-wide max-streams limit.
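
A sketch of the proposed knob (the property name below is invented for 
illustration; the real one is defined in the PR):

{code:java}
import org.apache.hadoop.conf.Configuration;

// Sketch only: decommissioning DataNodes get their own replication-stream
// limit instead of the cluster-wide one. The second property key here is
// hypothetical, for illustration.
public class DecommissionStreamLimit {
  static int maxStreamsFor(boolean decommissioning, Configuration conf) {
    int clusterWide = conf.getInt("dfs.namenode.replication.max-streams", 2);
    int decomLimit = conf.getInt(
        "dfs.namenode.decommission.replication.max-streams",  // hypothetical key
        clusterWide);
    return decommissioning ? decomLimit : clusterWide;
  }
}
{code}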



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-15042) Add more tests for ByteBufferPositionedReadable

2022-06-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15042?focusedWorklogId=780468&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780468
 ]

ASF GitHub Bot logged work on HDFS-15042:
-

Author: ASF GitHub Bot
Created on: 11/Jun/22 03:42
Start Date: 11/Jun/22 03:42
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #1747:
URL: https://github.com/apache/hadoop/pull/1747#issuecomment-1152847320

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   1m  6s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 2 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m 35s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  24m 49s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  23m  5s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |  21m 17s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   4m 48s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   6m 19s |  |  trunk passed  |
   | -1 :x: |  javadoc  |   1m 53s | 
[/branch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-1747/3/artifact/out/branch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt)
 |  hadoop-hdfs in trunk failed with JDK Private 
Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.  |
   | +1 :green_heart: |  javadoc  |   4m 59s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |  10m 24s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  24m 10s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 31s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   3m 28s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  22m 12s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |  22m 12s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  20m 40s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |  20m 40s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   4m 15s |  |  root: The patch generated 
0 new + 45 unchanged - 5 fixed = 45 total (was 50)  |
   | +1 :green_heart: |  mvnsite  |   6m 15s |  |  the patch passed  |
   | -1 :x: |  javadoc  |   1m 45s | 
[/patch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-1747/3/artifact/out/patch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt)
 |  hadoop-hdfs in the patch failed with JDK Private 
Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.  |
   | +1 :green_heart: |  javadoc  |   5m  0s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |  10m 49s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  24m 40s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  18m 48s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   3m 17s |  |  hadoop-hdfs-client in the patch 
passed.  |
   | -1 :x: |  unit  | 422m 59s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-1747/3/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m 58s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 691m 22s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.TestClientProtocolForPipelineRecovery |
   
   
   | Subsystem | Report/Notes |
   |