date:20210908

[jira] [Work logged] (HDFS-16187) SnapshotDiff behaviour with Xattrs and Acls is not consistent across NN restarts with checkpointing

2021-09-08 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-16187?focusedWorklogId=648396&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-648396
 ]

ASF GitHub Bot logged work on HDFS-16187:
-

Author: ASF GitHub Bot
Created on: 09/Sep/21 05:43
Start Date: 09/Sep/21 05:43
Worklog Time Spent: 10m 
  Work Description: bshashikant edited a comment on pull request #3340:
URL: https://github.com/apache/hadoop/pull/3340#issuecomment-915780582


   > @bshashikant , thank you for the patch.
   > 
   > I see that the patch works by converting logic from reference-equals to 
value-equals. In `AclStorage`, we maintain a reference-counted mapping of every 
immutable instance of every `AclFeature` currently in use by the namesystem. 
Inodes with identical ACL entries all point to the same `AclFeature`, so 
reference equals should work fine. This was an intentional choice of the ACL 
design to lower memory footprint and provide an inexpensive equality operation.
   > 
   > Xattrs do not use this same flyweight pattern though, so I can see why 
deep equals would be necessary there. I wonder if this bug really only happens 
for xattrs and not ACLs? If the bug happens for ACLs, then that might mean 
there is really something wrong with reference counting in `AclStorage`, 
causing duplicate instances for the same ACL entries.
   > 
   > Also, if value equals is necessary, then there is similar logic in 
`INodeFile` and `INodeFileAttributes`.
   
   Thanks @cnauroth for the explanation. I tested the ACL feature, and the 
issue seems to exist only with the xattr feature, not with ACLs, as you 
suggested. I have modified the patch accordingly. Thank you very much for the 
details.
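
The distinction discussed above, between reference equality over interned (flyweight) instances and value equality over independently created instances, can be sketched in plain Java. This is a minimal illustration and not HDFS code: `InternedFeature`, `PlainFeature`, and the string entries are hypothetical stand-ins for `AclFeature` and an xattr feature, and the reference counting that `AclStorage` performs is omitted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for an interned, immutable feature (flyweight). */
final class InternedFeature {
  // One shared instance per distinct entry list.
  private static final Map<List<String>, InternedFeature> POOL = new HashMap<>();

  private final List<String> entries;

  private InternedFeature(List<String> entries) {
    this.entries = entries;
  }

  /** Returns the shared instance for these entries, creating it on first use. */
  static synchronized InternedFeature of(String... entries) {
    List<String> key =
        Collections.unmodifiableList(new ArrayList<>(Arrays.asList(entries)));
    return POOL.computeIfAbsent(key, InternedFeature::new);
  }
}

/** Hypothetical stand-in for a feature that is NOT interned. */
final class PlainFeature {
  private final List<String> entries;

  PlainFeature(String... entries) {
    this.entries = new ArrayList<>(Arrays.asList(entries));
  }

  // Value equality: two instances are equal when their entries are equal.
  @Override
  public boolean equals(Object other) {
    return other instanceof PlainFeature
        && entries.equals(((PlainFeature) other).entries);
  }

  @Override
  public int hashCode() {
    return entries.hashCode();
  }
}

final class EqualityDemo {
  public static void main(String[] args) {
    InternedFeature acl1 = InternedFeature.of("user:alice:rw-");
    InternedFeature acl2 = InternedFeature.of("user:alice:rw-");
    System.out.println("interned, ==     : " + (acl1 == acl2));

    PlainFeature x1 = new PlainFeature("user.tag=a");
    PlainFeature x2 = new PlainFeature("user.tag=a");
    System.out.println("plain, ==        : " + (x1 == x2));
    System.out.println("plain, equals()  : " + x1.equals(x2));
  }
}
```

With interning, `==` is sufficient because identical entries always resolve to the same object. Without it, two objects built from the same data are distinct instances, so only `equals` treats them as the same.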


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 648396)
Time Spent: 1h 40m  (was: 1.5h)

> SnapshotDiff behaviour with Xattrs and Acls is not consistent across NN 
> restarts with checkpointing
> ---
>
> Key: HDFS-16187
> URL: https://issues.apache.org/jira/browse/HDFS-16187
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: snapshots
>Reporter: Srinivasu Majeti
>Assignee: Shashikant Banerjee
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> The test below shows that the snapshot diff between snapshots is not 
> consistent with Xattr (EZ here setting the Xattr) across NN restarts with a 
> checkpointed FsImage.
> {code:java}
> @Test
> public void testEncryptionZonesWithSnapshots() throws Exception {
>   final Path snapshottable = new Path("/zones");
>   fsWrapper.mkdir(snapshottable, FsPermission.getDirDefault(),
>   true);
>   dfsAdmin.allowSnapshot(snapshottable);
>   dfsAdmin.createEncryptionZone(snapshottable, TEST_KEY, NO_TRASH);
>   fs.createSnapshot(snapshottable, "snap1");
>   SnapshotDiffReport report =
>   fs.getSnapshotDiffReport(snapshottable, "snap1", "");
>   Assert.assertEquals(0, report.getDiffList().size());
>   report =
>   fs.getSnapshotDiffReport(snapshottable, "snap1", "");
>   System.out.println(report);
>   Assert.assertEquals(0, report.getDiffList().size());
>   fs.setSafeMode(SafeModeAction.SAFEMODE_ENTER);
>   fs.saveNamespace();
>   fs.setSafeMode(SafeModeAction.SAFEMODE_LEAVE);
>   cluster.restartNameNode(true);
>   report =
>   fs.getSnapshotDiffReport(snapshottable, "snap1", "");
>   Assert.assertEquals(0, report.getDiffList().size());
> }{code}
> {code:java}
> Pre Restart:
> Difference between snapshot snap1 and current directory under directory 
> /zones:
> Post Restart:
> Difference between snapshot snap1 and current directory under directory 
> /zones:
> M .{code}
> The side effect of this behavior is that distcp with snapshot diff would fail 
> with the error below, complaining that the target cluster has some data changed.
> {code:java}
> WARN tools.DistCp: The target has been modified since snapshot x
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Work logged] (HDFS-16187) SnapshotDiff behaviour with Xattrs and Acls is not consistent across NN restarts with checkpointing

2021-09-08 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-16187?focusedWorklogId=648395&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-648395
 ]

ASF GitHub Bot logged work on HDFS-16187:
-

Author: ASF GitHub Bot
Created on: 09/Sep/21 05:43
Start Date: 09/Sep/21 05:43
Worklog Time Spent: 10m 
  Work Description: bshashikant commented on pull request #3340:
URL: https://github.com/apache/hadoop/pull/3340#issuecomment-915780582


   > @bshashikant , thank you for the patch.
   > 
   > I see that the patch works by converting logic from reference-equals to 
value-equals. In `AclStorage`, we maintain a reference-counted mapping of every 
immutable instance of every `AclFeature` currently in use by the namesystem. 
Inodes with identical ACL entries all point to the same `AclFeature`, so 
reference equals should work fine. This was an intentional choice of the ACL 
design to lower memory footprint and provide an inexpensive equality operation.
   > 
   > Xattrs do not use this same flyweight pattern though, so I can see why 
deep equals would be necessary there. I wonder if this bug really only happens 
for xattrs and not ACLs? If the bug happens for ACLs, then that might mean 
there is really something wrong with reference counting in `AclStorage`, 
causing duplicate instances for the same ACL entries.
   > 
   > Also, if value equals is necessary, then there is similar logic in 
`INodeFile` and `INodeFileAttributes`.
   
   Thanks @cnauroth for the explanation. I tested the ACL feature, and the 
issue seems to exist only with the xattr feature, not with ACLs, as you 
suggested. I have modified the patch accordingly. Thank you very much for the 
explanation.



Issue Time Tracking
---

Worklog Id: (was: 648395)
Time Spent: 1.5h  (was: 1h 20m)


[jira] [Work logged] (HDFS-16213) Flaky test TestFsDatasetImpl#testDnRestartWithHardLink

2021-09-08 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-16213?focusedWorklogId=648375&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-648375
 ]

ASF GitHub Bot logged work on HDFS-16213:
-

Author: ASF GitHub Bot
Created on: 09/Sep/21 04:23
Start Date: 09/Sep/21 04:23
Worklog Time Spent: 10m 
  Work Description: virajjasani edited a comment on pull request #3386:
URL: https://github.com/apache/hadoop/pull/3386#issuecomment-915750759


   Thanks for spending some time on this, @LeonGao91. I understand that we can 
reproduce it locally only when we run the test twice, but Jenkins does sometimes 
report it as a failure in a single run, so the test is already flaky.
   
   Please check:
   
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3359/4/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
   
   https://user-images.githubusercontent.com/34790606/132622011-32d7c4bf-757d-4309-ac6d-ec4ef21b310e.png
   



Issue Time Tracking
---

Worklog Id: (was: 648375)
Time Spent: 4h  (was: 3h 50m)

> Flaky test TestFsDatasetImpl#testDnRestartWithHardLink
> --
>
> Key: HDFS-16213
> URL: https://issues.apache.org/jira/browse/HDFS-16213
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> Failure case: 
> [here|https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3359/4/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt]
> {code:java}
> [ERROR] testDnRestartWithHardLink(org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl)  Time elapsed: 7.768 s  <<< FAILURE!
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:87)
>   at org.junit.Assert.assertTrue(Assert.java:42)
>   at org.junit.Assert.assertTrue(Assert.java:53)
>   at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl.testDnRestartWithHardLink(TestFsDatasetImpl.java:1344)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>   at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>   at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.lang.Thread.run(Thread.java:748)
> {code}




[jira] [Work logged] (HDFS-16213) Flaky test TestFsDatasetImpl#testDnRestartWithHardLink

2021-09-08 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-16213?focusedWorklogId=648374&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-648374
 ]

ASF GitHub Bot logged work on HDFS-16213:
-

Author: ASF GitHub Bot
Created on: 09/Sep/21 04:22
Start Date: 09/Sep/21 04:22
Worklog Time Spent: 10m 
  Work Description: virajjasani commented on pull request #3386:
URL: https://github.com/apache/hadoop/pull/3386#issuecomment-915750759


   Thanks for spending some time on this, @LeonGao91. I understand that we can 
reproduce it locally only when we run the test twice, but Jenkins does sometimes 
report it as a failure.
   
   Please check:
   
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3359/4/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
   
   https://user-images.githubusercontent.com/34790606/132622011-32d7c4bf-757d-4309-ac6d-ec4ef21b310e.png
   



Issue Time Tracking
---

Worklog Id: (was: 648374)
Time Spent: 3h 50m  (was: 3h 40m)


[jira] [Work logged] (HDFS-16207) Remove NN logs stack trace for non-existent xattr query

2021-09-08 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-16207?focusedWorklogId=648373&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-648373
 ]

ASF GitHub Bot logged work on HDFS-16207:
-

Author: ASF GitHub Bot
Created on: 09/Sep/21 04:21
Start Date: 09/Sep/21 04:21
Worklog Time Spent: 10m 
  Work Description: cnauroth merged pull request #3375:
URL: https://github.com/apache/hadoop/pull/3375


   



Issue Time Tracking
---

Worklog Id: (was: 648373)
Time Spent: 0.5h  (was: 20m)

> Remove NN logs stack trace for non-existent xattr query
> ---
>
> Key: HDFS-16207
> URL: https://issues.apache.org/jira/browse/HDFS-16207
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.4.0, 2.10.2, 3.3.2, 3.2.4
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The NN logs a full stack trace every time getXAttrs is called for a 
> non-existent xattr. The logging adds no value, and the increased logging 
> load may harm performance. Something is now probing for xattrs, resulting in 
> many lines of:
> {code:bash}
> 2021-09-02 13:48:03,340 [IPC Server handler 5 on default port 59951] INFO  ipc.Server (Server.java:logException(3149)) - IPC Server handler 5 on default port 59951, call Call#17 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.getXAttrs from 127.0.0.1:59961
> java.io.IOException: At least one of the attributes provided was not found.
>   at org.apache.hadoop.hdfs.server.namenode.FSDirXAttrOp.getXAttrs(FSDirXAttrOp.java:134)
>   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getXAttrs(FSNamesystem.java:8472)
>   at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getXAttrs(NameNodeRpcServer.java:2317)
>   at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getXAttrs(ClientNamenodeProtocolServerSideTranslatorPB.java:1745)
>   at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:604)
>   at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:572)
>   at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:556)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1093)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1155)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1083)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1900)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:3088)
> {code}
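
One general way to avoid this kind of log noise is to log expected, client-caused exceptions as a single line and reserve full stack traces for unexpected failures. The sketch below only illustrates that idea; the message above does not say how the actual fix is implemented, and `TerseExceptionLogDemo`, `AttributeNotFoundException`, `addTerse`, and `render` are hypothetical names, not Hadoop APIs.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.HashSet;
import java.util.Set;

final class TerseExceptionLogDemo {
  /** Hypothetical exception type for an expected, client-caused condition. */
  static final class AttributeNotFoundException extends IOException {
    AttributeNotFoundException(String message) {
      super(message);
    }
  }

  // Exception classes that are rendered without a stack trace.
  private final Set<Class<? extends Throwable>> terse = new HashSet<>();

  void addTerse(Class<? extends Throwable> type) {
    terse.add(type);
  }

  /** Builds the log text for a failed call: one line if terse, else a full trace. */
  String render(String call, Throwable t) {
    if (terse.contains(t.getClass())) {
      // Throwable.toString() yields "<class name>: <message>".
      return call + ": " + t;
    }
    StringWriter out = new StringWriter();
    out.write(call + System.lineSeparator());
    t.printStackTrace(new PrintWriter(out, true));
    return out.toString();
  }

  public static void main(String[] args) {
    TerseExceptionLogDemo demo = new TerseExceptionLogDemo();
    demo.addTerse(AttributeNotFoundException.class);
    System.out.println(demo.render("getXAttrs",
        new AttributeNotFoundException("attribute not found")));
    System.out.println(demo.render("getXAttrs",
        new IOException("unexpected failure")));
  }
}
```

The lookup is by exact class, so a dedicated exception type is needed to separate the expected case from other `IOException`s.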




[jira] [Work logged] (HDFS-16213) Flaky test TestFsDatasetImpl#testDnRestartWithHardLink

2021-09-08 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-16213?focusedWorklogId=648372&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-648372
 ]

ASF GitHub Bot logged work on HDFS-16213:
-

Author: ASF GitHub Bot
Created on: 09/Sep/21 04:19
Start Date: 09/Sep/21 04:19
Worklog Time Spent: 10m 
  Work Description: virajjasani commented on a change in pull request #3386:
URL: https://github.com/apache/hadoop/pull/3386#discussion_r704939307



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java
##
@@ -1068,9 +1070,8 @@ static File moveBlockFiles(Block b, ReplicaInfo replicaInfo, File destdir)
   + srcReplica + " metadata to "
   + dstMeta, e);
 }
-if (LOG.isDebugEnabled()) {
-  LOG.info("Linked " + srcReplica.getBlockURI() + " to " + dstFile);
-}
+LOG.info("Linked {} to {} . Dest meta file: {}",

Review comment:
   This seems to be a bug: we guard with isDebugEnabled, but when DEBUG is 
enabled we log at INFO level. We can use either INFO or DEBUG directly.
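
The mismatch described in this review comment can be reproduced in a small, self-contained sketch. As an assumption for illustration only, it uses `java.util.logging` rather than the logging API in the patch, with `Level.FINE` standing in for DEBUG; `LogGuardDemo`, `CapturingHandler`, and the method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

final class LogGuardDemo {
  /** Collects published records so each variant's effect can be inspected. */
  static final class CapturingHandler extends Handler {
    final List<LogRecord> records = new ArrayList<>();

    @Override
    public void publish(LogRecord record) {
      if (isLoggable(record)) {
        records.add(record);
      }
    }

    @Override
    public void flush() {
    }

    @Override
    public void close() {
    }
  }

  /** Mismatched: guarded by the debug level (FINE) but emitted at INFO. */
  static void linkedMismatched(Logger log, String src, String dst) {
    if (log.isLoggable(Level.FINE)) {
      log.info("Linked " + src + " to " + dst);
    }
  }

  /** Consistent: a single level and a parameterized message, no guard. */
  static void linkedConsistent(Logger log, String src, String dst) {
    log.log(Level.INFO, "Linked {0} to {1}", new Object[] {src, dst});
  }

  /** Creates an isolated logger at the given level that writes to the handler. */
  static Logger newLogger(String name, Level level, Handler handler) {
    Logger log = Logger.getLogger(name);
    log.setUseParentHandlers(false);
    log.setLevel(level);
    handler.setLevel(Level.ALL);
    log.addHandler(handler);
    return log;
  }

  public static void main(String[] args) {
    CapturingHandler handler = new CapturingHandler();
    Logger log = newLogger("demo.info", Level.INFO, handler);
    linkedMismatched(log, "blk_1", "/dst/blk_1");
    System.out.println("after mismatched call: " + handler.records.size());
    linkedConsistent(log, "blk_1", "/dst/blk_1");
    System.out.println("after consistent call: " + handler.records.size());
  }
}
```

With the logger at INFO, the mismatched variant emits nothing even though the message is an INFO message; it appears only once the debug level is enabled. The consistent variant behaves the same at both settings.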





Issue Time Tracking
---

Worklog Id: (was: 648372)
Time Spent: 3h 40m  (was: 3.5h)


[jira] [Commented] (HDFS-16211) Complete some descriptions related to AuthToken

2021-09-08 Thread JiangHua Zhu (Jira)



[ 
https://issues.apache.org/jira/browse/HDFS-16211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17412306#comment-17412306
 ] 

JiangHua Zhu commented on HDFS-16211:
-

Thanks [~shv] for the comments.
I created this Jira when I saw that AuthToken currently has no description.
Thank you.

> Complete some descriptions related to AuthToken
> ---
>
> Key: HDFS-16211
> URL: https://issues.apache.org/jira/browse/HDFS-16211
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> In AuthToken, some description information is missing.
> The purpose of this jira is to complete some descriptions related to 
> AuthToken.
> /**
>  */
> public class AuthToken implements Principal {
>   ..
> }




[jira] [Commented] (HDFS-16210) RBF: Add the option of refreshCallQueue to RouterAdmin

2021-09-08 Thread Hui Fei (Jira)



[ 
https://issues.apache.org/jira/browse/HDFS-16210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17412296#comment-17412296
 ] 

Hui Fei commented on HDFS-16210:


I plan to cherry-pick this to branch-3.2 and branch-3.3 after setting up my local environment.

> RBF: Add the option of refreshCallQueue to RouterAdmin
> --
>
> Key: HDFS-16210
> URL: https://issues.apache.org/jira/browse/HDFS-16210
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Janus Chow
>Assignee: Janus Chow
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> We enabled FairCallQueue on RouterRpcServer, but the Router cannot 
> refreshCallQueue as the NameNode does.
> This ticket enables refreshCallQueue for the Router so that we don't 
> have to restart the Routers when updating FairCallQueue configurations.
>  
> The option does not refresh the call queue of the NameNodes; it only refreshes 
> the callQueue of the Router itself.




[jira] [Reopened] (HDFS-16210) RBF: Add the option of refreshCallQueue to RouterAdmin

2021-09-08 Thread Hui Fei (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-16210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hui Fei reopened HDFS-16210:



[jira] [Work logged] (HDFS-16210) RBF: Add the option of refreshCallQueue to RouterAdmin

2021-09-08 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-16210?focusedWorklogId=648317&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-648317
 ]

ASF GitHub Bot logged work on HDFS-16210:
-

Author: ASF GitHub Bot
Created on: 09/Sep/21 01:58
Start Date: 09/Sep/21 01:58
Worklog Time Spent: 10m 
  Work Description: ferhui commented on pull request #3379:
URL: https://github.com/apache/hadoop/pull/3379#issuecomment-915698438


   @symious Thanks for the contribution, @goiri Thanks for the review! Merged to trunk!



Issue Time Tracking
---

Worklog Id: (was: 648317)
Time Spent: 2h 20m  (was: 2h 10m)


[jira] [Resolved] (HDFS-16210) RBF: Add the option of refreshCallQueue to RouterAdmin

2021-09-08 Thread Hui Fei (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-16210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hui Fei resolved HDFS-16210.

Fix Version/s: 3.4.0
   Resolution: Fixed


[jira] [Work logged] (HDFS-16210) RBF: Add the option of refreshCallQueue to RouterAdmin

2021-09-08 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-16210?focusedWorklogId=648316&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-648316
 ]

ASF GitHub Bot logged work on HDFS-16210:
-

Author: ASF GitHub Bot
Created on: 09/Sep/21 01:57
Start Date: 09/Sep/21 01:57
Worklog Time Spent: 10m 
  Work Description: ferhui merged pull request #3379:
URL: https://github.com/apache/hadoop/pull/3379


   



Issue Time Tracking
---

Worklog Id: (was: 648316)
Time Spent: 2h 10m  (was: 2h)

> RBF: Add the option of refreshCallQueue to RouterAdmin
> --
>
> Key: HDFS-16210
> URL: https://issues.apache.org/jira/browse/HDFS-16210
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Janus Chow
>Assignee: Janus Chow
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> We enabled FairCallQueue for RouterRpcServer, but the Router cannot 
> refresh its call queue the way the NameNode does.
> This ticket enables refreshCallQueue for the Router so that we don't 
> have to restart the Routers when updating FairCallQueue configurations.
>  
> The option does not refresh the call queue of the NameNodes; it only 
> refreshes the call queue of the Router itself.




[jira] [Work logged] (HDFS-16213) Flaky test TestFsDatasetImpl#testDnRestartWithHardLink

2021-09-08 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-16213?focusedWorklogId=648242=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-648242
 ]

ASF GitHub Bot logged work on HDFS-16213:
-

Author: ASF GitHub Bot
Created on: 08/Sep/21 21:52
Start Date: 08/Sep/21 21:52
Worklog Time Spent: 10m 
  Work Description: LeonGao91 commented on pull request #3386:
URL: https://github.com/apache/hadoop/pull/3386#issuecomment-915598330


   Thanks @virajjasani for reporting this issue!
   It seems this fails consistently when the same test is run multiple times, 
but not when it is run for the first time (the happy case). As you mentioned, 
the following reproduces it consistently:
   
 @Test
 public void t1() throws Exception {
   testDnRestartWithHardLink();
   testDnRestartWithHardLink();
 }
   
   Based on this, I am not sure the root cause is the rare race condition you 
mentioned. I suspect it is due to cache behavior: when the same code is rerun, 
the replica loading changes slightly.
   
   I can spend some more time investigating in the next few days. Let's try to 
fix this in the unit test itself.




Issue Time Tracking
---

Worklog Id: (was: 648242)
Time Spent: 3.5h  (was: 3h 20m)

> Flaky test TestFsDatasetImpl#testDnRestartWithHardLink
> --
>
> Key: HDFS-16213
> URL: https://issues.apache.org/jira/browse/HDFS-16213
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> Failure case: 
> [here|https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3359/4/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt]
> {code:java}
> [ERROR] testDnRestartWithHardLink(org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl)  Time elapsed: 7.768 s  <<< FAILURE!
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:87)
>   at org.junit.Assert.assertTrue(Assert.java:42)
>   at org.junit.Assert.assertTrue(Assert.java:53)
>   at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl.testDnRestartWithHardLink(TestFsDatasetImpl.java:1344)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>   at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>   at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.lang.Thread.run(Thread.java:748)
> {code}




[jira] [Work logged] (HDFS-15516) Add info for create flags in NameNode audit logs

2021-09-08 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-15516?focusedWorklogId=648202=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-648202
 ]

ASF GitHub Bot logged work on HDFS-15516:
-

Author: ASF GitHub Bot
Created on: 08/Sep/21 20:38
Start Date: 08/Sep/21 20:38
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #2281:
URL: https://github.com/apache/hadoop/pull/2281#issuecomment-915553180


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m  0s |  |  Docker mode activated.  |
   | -1 :x: |  patch  |   0m 26s |  |  https://github.com/apache/hadoop/pull/2281 does not apply to trunk. Rebase required? Wrong Branch? See https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute for help.  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | GITHUB PR | https://github.com/apache/hadoop/pull/2281 |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2281/1/console |
   | versions | git=2.17.1 |
   | Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




Issue Time Tracking
---

Worklog Id: (was: 648202)
Time Spent: 2h  (was: 1h 50m)

> Add info for create flags in NameNode audit logs
> 
>
> Key: HDFS-15516
> URL: https://issues.apache.org/jira/browse/HDFS-15516
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Shashikant Banerjee
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-15516.001.patch, HDFS-15516.002.patch, 
> HDFS-15516.003.patch, HDFS-15516.004.patch, HDFS-15516.005.patch
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Currently, if a file create happens with flags such as overwrite, the audit 
> logs do not seem to contain any information about those flags. It would be 
> useful to add the create options to the audit logs, similar to what is done 
> for rename ops. 




[jira] [Commented] (HDFS-16191) [FGL] Fix FSImage loading issues on dynamic partitions

2021-09-08 Thread Renukaprasad C (Jira)



[ 
https://issues.apache.org/jira/browse/HDFS-16191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17412170#comment-17412170
 ] 

Renukaprasad C commented on HDFS-16191:
---

Thanks [~xinglin] for review & feedback.

 

org.apache.hadoop.util.PartitionedGSet#addNewPartitionIfNeeded: here we check 
the size of the partition and create and return a new partition if the size 
exceeds the threshold; otherwise we return the same partition.

private PartitionEntry addNewPartitionIfNeeded(
    PartitionEntry curPart, K key) {
  if (curPart.size() < DEFAULT_PARTITION_CAPACITY * DEFAULT_PARTITION_OVERFLOW
      || curPart.contains(key)) {
    return curPart;
  }
  return addNewPartition(key);
}

Here we add a new partition whenever the size exceeds the configured threshold. 

 

Once a new partition is added and some inodes are added to it, the iterator 
does not visit them (as we iterate only over the static partitions).

With the above patch, I have verified the functionality and the related UTs, 
which are working fine.
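
The iteration problem described above can be illustrated with a minimal, self-contained sketch. This is a heavily simplified model and not the PartitionedGSet implementation: the class name, the partition count, the capacity, and the overflow policy are all assumptions made for illustration, and inode ids are assumed to be non-negative.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, heavily simplified model; not the PartitionedGSet implementation.
final class StaticIterationSketch {
  static final int STATIC_PARTITIONS = 2; // assumed static partition count
  static final int CAPACITY = 3;          // assumed per-partition threshold

  private final List<List<Long>> partitions = new ArrayList<>();

  StaticIterationSketch() {
    for (int i = 0; i < STATIC_PARTITIONS; i++) {
      partitions.add(new ArrayList<>());
    }
  }

  // Adds an inode id; once the target static partition is full,
  // the id goes into a dynamically added partition instead.
  void put(long inodeId) {
    List<Long> target = partitions.get((int) (inodeId % STATIC_PARTITIONS));
    if (target.size() >= CAPACITY) {
      List<Long> last = partitions.get(partitions.size() - 1);
      if (partitions.size() == STATIC_PARTITIONS || last.size() >= CAPACITY) {
        last = new ArrayList<>();
        partitions.add(last);
      }
      target = last;
    }
    target.add(inodeId);
  }

  // Problematic traversal: visits only the first STATIC_PARTITIONS partitions.
  int countStaticOnly() {
    int n = 0;
    for (int i = 0; i < STATIC_PARTITIONS; i++) {
      n += partitions.get(i).size();
    }
    return n;
  }

  // Traversal over every partition, including dynamically added ones.
  int countAll() {
    int n = 0;
    for (List<Long> p : partitions) {
      n += p.size();
    }
    return n;
  }

  public static void main(String[] args) {
    StaticIterationSketch set = new StaticIterationSketch();
    for (long id = 0; id < 10; id++) {
      set.put(id);
    }
    System.out.println(set.countStaticOnly() + " of " + set.countAll()
        + " inodes visited by the static-only traversal");
  }
}
```

In this model, any id that lands in a dynamically added partition is invisible to the static-only traversal, which is the same shape of failure as the "Fail to find inode" warnings quoted in the issue.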

 

One issue I found here: static partitions were added as range key[0, 
16385], range key[1, 16385], range key[25, 16385], whereas dynamic 
partitions were added like inodefile[0, ], inodefile[0, Y 
InodeId]. When these nodes are compared to get the partition, we get the 
newly added partition iNodeFile[0, X inodeId] after range key[0, 16385] is full.

 

Let me check this scenario once again; I will raise any other issues I find 
for discussion. Meanwhile, you can also check the scenario where one partition 
gets full.

> [FGL] Fix FSImage loading issues on dynamic partitions
> --
>
> Key: HDFS-16191
> URL: https://issues.apache.org/jira/browse/HDFS-16191
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> When new partitions get added to PartitionedGSet, the iterator does not 
> consider them; it always iterates over the static partition count. This leads 
> to a flood of warn messages like the ones below.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139780 when saving the leases.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139781 when saving the leases.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139784 when saving the leases.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139785 when saving the leases.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139786 when saving the leases.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139788 when saving the leases.
> 2021-08-28 03:23:19,421 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139789 when saving the leases.
> 2021-08-28 03:23:19,421 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139790 when saving the leases.
> 2021-08-28 03:23:19,421 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139791 when saving the leases.
> 2021-08-28 03:23:19,421 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139793 when saving the leases.
> 2021-08-28 03:23:19,421 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139795 when saving the leases.
> 2021-08-28 03:23:19,422 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139796 when saving the leases.
> 2021-08-28 03:23:19,422 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139797 when saving the leases.
> 2021-08-28 03:23:19,422 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139800 when saving the leases.
> 2021-08-28 03:23:19,422 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139801 when saving the leases.




[jira] [Work logged] (HDFS-16213) Flaky test TestFsDatasetImpl#testDnRestartWithHardLink

2021-09-08 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-16213?focusedWorklogId=648189=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-648189
 ]

ASF GitHub Bot logged work on HDFS-16213:
-

Author: ASF GitHub Bot
Created on: 08/Sep/21 19:39
Start Date: 08/Sep/21 19:39
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3386:
URL: https://github.com/apache/hadoop/pull/3386#issuecomment-915516780


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   1m 42s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 2 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  41m 25s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   2m  1s |  |  trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   1m 49s |  |  trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   1m 19s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m  4s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 30s |  |  trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 50s |  |  trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   4m 40s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  24m 58s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 35s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 33s |  |  the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |   1m 33s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 16s |  |  the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |   1m 16s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 58s |  |  hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 240 unchanged - 3 fixed = 240 total (was 243)  |
   | +1 :green_heart: |  mvnsite  |   1m 27s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 58s |  |  the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 46s |  |  the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   4m 45s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  25m 17s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | -1 :x: |  unit  | 391m 45s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3386/7/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 42s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 511m 12s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.TestBlockTokenWrappingQOP |
   |   | hadoop.hdfs.TestReconstructStripedFileWithValidator |
   |   | hadoop.hdfs.TestDFSStripedOutputStreamWithRandomECPolicy |
   |   | hadoop.hdfs.tools.TestECAdmin |
   |   | hadoop.hdfs.TestDecommissionWithStripedBackoffMonitor |
   |   | hadoop.hdfs.TestHDFSFileSystemContract |
   |   | hadoop.hdfs.TestWriteConfigurationToDFS |
   |   | hadoop.hdfs.tools.TestDFSZKFailoverController |
   |   | hadoop.hdfs.web.TestWebHdfsFileSystemContract |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3386/7/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3386 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux 889f73ecc5d5 4.15.0-143-generic #147-Ubuntu SMP Wed Apr 14 16:10:11 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git

[jira] [Work logged] (HDFS-16191) [FGL] Fix FSImage loading issues on dynamic partitions

2021-09-08 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-16191?focusedWorklogId=648187=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-648187
 ]

ASF GitHub Bot logged work on HDFS-16191:
-

Author: ASF GitHub Bot
Created on: 08/Sep/21 19:37
Start Date: 08/Sep/21 19:37
Worklog Time Spent: 10m 
  Work Description: prasad-acit commented on a change in pull request #3351:
URL: https://github.com/apache/hadoop/pull/3351#discussion_r704716469



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java
##
@@ -1591,8 +1591,8 @@ void moveInodes() throws IOException {
 }
 
 if (count != totalInodes) {
-  String msg = String.format("moveInodes: expected to move %l inodes, " +
-  "but moved %l inodes", totalInodes, count);
+  String msg = String.format("moveInodes: expected to move %d inodes, " +

Review comment:
   For numeric values, use %d (decimal integers) or %f (floating-point 
numbers). %l is not a valid format conversion in Java, and using it leads to 
a runtime exception (UnknownFormatConversionException).
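
To illustrate the point of this review comment, here is a minimal, self-contained sketch. The class and method names are hypothetical and not taken from FSDirectory; only the message text mirrors the diff. It shows that `String.format` rejects `%l` with `java.util.UnknownFormatConversionException`, while `%d` formats `long` values as intended.

```java
import java.util.UnknownFormatConversionException;

// Hypothetical demo class; not part of the Hadoop code base.
final class FormatSpecifierDemo {

  // Mirrors the corrected message from the diff, using %d for long values.
  static String moveInodesMessage(long totalInodes, long count) {
    return String.format("moveInodes: expected to move %d inodes, "
        + "but moved %d inodes", totalInodes, count);
  }

  // Returns true if String.format rejects the format string
  // because it contains an unknown conversion.
  static boolean isRejected(String format, Object... args) {
    try {
      String.format(format, args);
      return false;
    } catch (UnknownFormatConversionException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println(moveInodesMessage(10L, 7L));
    System.out.println("%l rejected: "
        + isRejected("expected to move %l inodes", 10L));
  }
}
```

Because the exception is only thrown when the message is actually built, a mistake like this can stay hidden until the error path (here, a mismatch between the expected and moved inode counts) is exercised.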






Issue Time Tracking
---

Worklog Id: (was: 648187)
Time Spent: 1h  (was: 50m)

> [FGL] Fix FSImage loading issues on dynamic partitions
> --
>
> Key: HDFS-16191
> URL: https://issues.apache.org/jira/browse/HDFS-16191
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> When new partitions get added to PartitionedGSet, the iterator does not 
> consider them; it always iterates over the static partition count. This leads 
> to a flood of warn messages like the ones below.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139780 when saving the leases.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139781 when saving the leases.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139784 when saving the leases.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139785 when saving the leases.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139786 when saving the leases.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139788 when saving the leases.
> 2021-08-28 03:23:19,421 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139789 when saving the leases.
> 2021-08-28 03:23:19,421 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139790 when saving the leases.
> 2021-08-28 03:23:19,421 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139791 when saving the leases.
> 2021-08-28 03:23:19,421 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139793 when saving the leases.
> 2021-08-28 03:23:19,421 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139795 when saving the leases.
> 2021-08-28 03:23:19,422 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139796 when saving the leases.
> 2021-08-28 03:23:19,422 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139797 when saving the leases.
> 2021-08-28 03:23:19,422 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139800 when saving the leases.
> 2021-08-28 03:23:19,422 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139801 when saving the leases.




[jira] [Work logged] (HDFS-16213) Flaky test TestFsDatasetImpl#testDnRestartWithHardLink

2021-09-08 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-16213?focusedWorklogId=648159=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-648159
 ]

ASF GitHub Bot logged work on HDFS-16213:
-

Author: ASF GitHub Bot
Created on: 08/Sep/21 19:00
Start Date: 08/Sep/21 19:00
Worklog Time Spent: 10m 
  Work Description: LeonGao91 commented on a change in pull request #3386:
URL: https://github.com/apache/hadoop/pull/3386#discussion_r704693282



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java
##
@@ -1068,9 +1070,8 @@ static File moveBlockFiles(Block b, ReplicaInfo 
replicaInfo, File destdir)
   + srcReplica + " metadata to "
   + dstMeta, e);
 }
-if (LOG.isDebugEnabled()) {
-  LOG.info("Linked " + srcReplica.getBlockURI() + " to " + dstFile);
-}
+LOG.info("Linked {} to {} . Dest meta file: {}",

Review comment:
   Should this be at debug or trace level?






Issue Time Tracking
---

Worklog Id: (was: 648159)
Time Spent: 3h 10m  (was: 3h)

> Flaky test TestFsDatasetImpl#testDnRestartWithHardLink
> --
>
> Key: HDFS-16213
> URL: https://issues.apache.org/jira/browse/HDFS-16213
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> Failure case: 
> [here|https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3359/4/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt]
> {code:java}
> [ERROR] testDnRestartWithHardLink(org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl)  Time elapsed: 7.768 s  <<< FAILURE!
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:87)
>   at org.junit.Assert.assertTrue(Assert.java:42)
>   at org.junit.Assert.assertTrue(Assert.java:53)
>   at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl.testDnRestartWithHardLink(TestFsDatasetImpl.java:1344)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>   at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>   at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.lang.Thread.run(Thread.java:748)
> {code}




[jira] [Commented] (HDFS-16207) Remove NN logs stack trace for non-existent xattr query

2021-09-08 Thread Ahmed Hussein (Jira)



[ 
https://issues.apache.org/jira/browse/HDFS-16207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17412120#comment-17412120
 ] 

Ahmed Hussein commented on HDFS-16207:
--

I am fine with merging the changes.
Thanks [~cnauroth] for the review!

> Remove NN logs stack trace for non-existent xattr query
> ---
>
> Key: HDFS-16207
> URL: https://issues.apache.org/jira/browse/HDFS-16207
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.4.0, 2.10.2, 3.3.2, 3.2.4
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The NN logs a full stack trace every time getXAttrs is called for a 
> non-existent xattr. This logging adds no value, and the increased logging 
> load may harm performance. Something is now probing for xattrs, resulting in 
> many lines of:
> {code:bash}
> 2021-09-02 13:48:03,340 [IPC Server handler 5 on default port 59951] INFO  ipc.Server (Server.java:logException(3149)) - IPC Server handler 5 on default port 59951, call Call#17 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.getXAttrs from 127.0.0.1:59961
> java.io.IOException: At least one of the attributes provided was not found.
>   at org.apache.hadoop.hdfs.server.namenode.FSDirXAttrOp.getXAttrs(FSDirXAttrOp.java:134)
>   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getXAttrs(FSNamesystem.java:8472)
>   at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getXAttrs(NameNodeRpcServer.java:2317)
>   at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getXAttrs(ClientNamenodeProtocolServerSideTranslatorPB.java:1745)
>   at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:604)
>   at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:572)
>   at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:556)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1093)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1155)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1083)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1900)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:3088)
> {code}




[jira] [Commented] (HDFS-16191) [FGL] Fix FSImage loading issues on dynamic partitions

2021-09-08 Thread Xing Lin (Jira)



[ 
https://issues.apache.org/jira/browse/HDFS-16191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17412107#comment-17412107
 ] 

Xing Lin commented on HDFS-16191:
-

Hi [~prasad-acit], 

Thanks for working on this!

As you asked in the GitHub pull request, we don't support more partitions 
than NUM_RANGES_STATIC right now.
The key of an inode is hashed and then reduced modulo NUM_RANGES_STATIC in 
indexOf(). As a result, any partition with an id of NUM_RANGES_STATIC or 
larger will receive no insertions.

If we want to support dynamic partition numbers, we need to modify the indexOf() 
implementation as well. We need to replace `& (INodeMap.NUM_RANGES_STATIC -1)` 
with something like `% partition_num`. Also note that indexOf() is a static 
function, which means we cannot access instance variables from it. I don't 
know how to handle that yet. 
{code:java}
public static long indexOf(long[] key) {
  if (key[key.length - 1] == INodeId.ROOT_INODE_ID) {
    return key[0];
  }
  long idx = LARGE_PRIME * key[0];
  idx = (idx ^ (idx >> 32)) & (INodeMap.NUM_RANGES_STATIC - 1);
  return idx;
}
{code}
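
As a rough illustration of the point above, the following stand-alone sketch contrasts the masked index with a hypothetical variant that takes the partition count as a parameter. Everything here is an assumption made for illustration: the class name, the value of `LARGE_PRIME`, the value 256 for `NUM_RANGES_STATIC`, and the `dynamicIndexOf` method itself, which is not a proposed patch. `ROOT_INODE_ID` is set to 16385 to match the range keys mentioned in this thread.

```java
// Hypothetical stand-alone sketch; the constants are illustrative assumptions,
// not values read from the Hadoop source.
final class PartitionIndexSketch {
  static final long ROOT_INODE_ID = 16385L;   // matches the range keys in this thread
  static final long LARGE_PRIME = 512927357L; // assumed value, for illustration only
  static final int NUM_RANGES_STATIC = 256;   // assumed; a power of two so the mask works

  // Same shape as the quoted indexOf(): for non-root keys the mask bounds
  // the result to [0, NUM_RANGES_STATIC - 1].
  static long staticIndexOf(long[] key) {
    if (key[key.length - 1] == ROOT_INODE_ID) {
      return key[0];
    }
    long idx = LARGE_PRIME * key[0];
    return (idx ^ (idx >> 32)) & (NUM_RANGES_STATIC - 1);
  }

  // Hypothetical variant: the caller passes the current partition count,
  // which sidesteps the "static method cannot read instance state" problem.
  static long dynamicIndexOf(long[] key, int partitionCount) {
    if (key[key.length - 1] == ROOT_INODE_ID) {
      return key[0];
    }
    long idx = LARGE_PRIME * key[0];
    // floorMod keeps the result non-negative even if the hash is negative.
    return Math.floorMod(idx ^ (idx >> 32), (long) partitionCount);
  }

  public static void main(String[] args) {
    long[] key = {1L, 100L};
    System.out.println("static index:  " + staticIndexOf(key));
    System.out.println("dynamic index: " + dynamicIndexOf(key, 300));
  }
}
```

Passing the count as an argument is only one possible way around the static-method limitation. Note also that changing the partition count changes where existing keys map, so a real change would also have to deal with inodes that were placed under the old count.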

> [FGL] Fix FSImage loading issues on dynamic partitions
> --
>
> Key: HDFS-16191
> URL: https://issues.apache.org/jira/browse/HDFS-16191
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> When new partitions get added to PartitionedGSet, the iterator does not 
> consider them; it always iterates over the static partition count. This leads 
> to a flood of warn messages like the ones below.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139780 when saving the leases.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139781 when saving the leases.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139784 when saving the leases.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139785 when saving the leases.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139786 when saving the leases.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139788 when saving the leases.
> 2021-08-28 03:23:19,421 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139789 when saving the leases.
> 2021-08-28 03:23:19,421 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139790 when saving the leases.
> 2021-08-28 03:23:19,421 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139791 when saving the leases.
> 2021-08-28 03:23:19,421 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139793 when saving the leases.
> 2021-08-28 03:23:19,421 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139795 when saving the leases.
> 2021-08-28 03:23:19,422 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139796 when saving the leases.
> 2021-08-28 03:23:19,422 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139797 when saving the leases.
> 2021-08-28 03:23:19,422 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139800 when saving the leases.
> 2021-08-28 03:23:19,422 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139801 when saving the leases.




[jira] [Work logged] (HDFS-16191) [FGL] Fix FSImage loading issues on dynamic partitions

2021-09-08 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-16191?focusedWorklogId=648106=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-648106
 ]

ASF GitHub Bot logged work on HDFS-16191:
-

Author: ASF GitHub Bot
Created on: 08/Sep/21 17:37
Start Date: 08/Sep/21 17:37
Worklog Time Spent: 10m 
  Work Description: xinglin commented on pull request #3351:
URL: https://github.com/apache/hadoop/pull/3351#issuecomment-915436686


   If we want to add support for dynamic/variable partition numbers, we need to 
modify this function in INode.java as well. 
   Here idx is the partition id; it is reduced modulo INodeMap.NUM_RANGES_STATIC. 
   
   ```
   public static long indexOf(long[] key) {
     if (key[key.length - 1] == INodeId.ROOT_INODE_ID) {
       return key[0];
     }
     long idx = LARGE_PRIME * key[0];
     idx = (idx ^ (idx >> 32)) & (INodeMap.NUM_RANGES_STATIC - 1);
     return idx;
   }
   ```
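
As a rough illustration of the point above (not code from the PR): the quoted method masks the hash with `NUM_RANGES_STATIC - 1`, which only works for a fixed, power-of-two partition count. A minimal sketch of a variant that takes the current partition count as a parameter could look like the following. The class name, the `dynamicIndex` method, and both constant values are assumptions made for this example; the real constants live in `INode` and `INodeMap`.

```java
final class PartitionIndexSketch {
  // Assumed values for illustration only; not taken from this thread.
  static final long LARGE_PRIME = 512927357L;
  static final int NUM_RANGES_STATIC = 256;

  /** Mirrors the quoted logic: mask by a fixed power-of-two partition count. */
  static long staticIndex(long key) {
    long idx = LARGE_PRIME * key;
    return (idx ^ (idx >> 32)) & (NUM_RANGES_STATIC - 1);
  }

  /** Hypothetical variant: reduce by the current partition count, any positive value. */
  static long dynamicIndex(long key, int numPartitions) {
    long idx = LARGE_PRIME * key;
    // floorMod keeps the result in [0, numPartitions) even for negative hashes.
    return Math.floorMod(idx ^ (idx >> 32), (long) numPartitions);
  }
}
```

When `numPartitions` is a power of two, `dynamicIndex` yields the same value as the masked version, so the two agree for the static count.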


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 648106)
Time Spent: 50m  (was: 40m)

> [FGL] Fix FSImage loading issues on dynamic partitions
> --
>
> Key: HDFS-16191
> URL: https://issues.apache.org/jira/browse/HDFS-16191
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> When new partitions get added to PartitionGSet, the iterator does not consider 
> them; it always iterates over the static partition count. This leads to a 
> flood of warning messages like the ones below.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139780 when saving the leases.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139781 when saving the leases.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139784 when saving the leases.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139785 when saving the leases.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139786 when saving the leases.
> 2021-08-28 03:23:19,420 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139788 when saving the leases.
> 2021-08-28 03:23:19,421 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139789 when saving the leases.
> 2021-08-28 03:23:19,421 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139790 when saving the leases.
> 2021-08-28 03:23:19,421 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139791 when saving the leases.
> 2021-08-28 03:23:19,421 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139793 when saving the leases.
> 2021-08-28 03:23:19,421 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139795 when saving the leases.
> 2021-08-28 03:23:19,422 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139796 when saving the leases.
> 2021-08-28 03:23:19,422 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139797 when saving the leases.
> 2021-08-28 03:23:19,422 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139800 when saving the leases.
> 2021-08-28 03:23:19,422 WARN namenode.FSImageFormatPBINode: Fail to find 
> inode 139801 when saving the leases.
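
The failure mode described in the issue can be pictured with a small stand-alone sketch. The real PartitionGSet is not shown in this thread, so a plain list of lists is used as a stand-in, and every name below (`PartitionIterationSketch`, `STATIC_PARTITIONS`, `collectByStaticCount`, `collectAll`) is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class PartitionIterationSketch {
  // Hypothetical fixed partition count known at startup.
  static final int STATIC_PARTITIONS = 2;

  private final List<List<Long>> partitions = new ArrayList<>();

  PartitionIterationSketch() {
    for (int i = 0; i < STATIC_PARTITIONS; i++) {
      partitions.add(new ArrayList<>());
    }
  }

  /** Adds an inode id to an existing partition. */
  void put(int partition, long inodeId) {
    partitions.get(partition).add(inodeId);
  }

  /** Adds a new partition at runtime and returns its index. */
  int addPartition() {
    partitions.add(new ArrayList<>());
    return partitions.size() - 1;
  }

  /** Problematic shape: only the first STATIC_PARTITIONS partitions are visited. */
  List<Long> collectByStaticCount() {
    List<Long> ids = new ArrayList<>();
    for (int i = 0; i < STATIC_PARTITIONS; i++) {
      ids.addAll(partitions.get(i));
    }
    return ids;
  }

  /** Visits every partition that currently exists. */
  List<Long> collectAll() {
    List<Long> ids = new ArrayList<>();
    for (List<Long> p : partitions) {
      ids.addAll(p);
    }
    return ids;
  }
}
```

An inode placed in a partition created after startup is invisible to the first traversal, which is consistent with the "Fail to find inode" warnings quoted above.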




[jira] [Work logged] (HDFS-16191) [FGL] Fix FSImage loading issues on dynamic partitions

2021-09-08 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-16191?focusedWorklogId=648104=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-648104
 ]

ASF GitHub Bot logged work on HDFS-16191:
-

Author: ASF GitHub Bot
Created on: 08/Sep/21 17:31
Start Date: 08/Sep/21 17:31
Worklog Time Spent: 10m 
  Work Description: xinglin commented on a change in pull request #3351:
URL: https://github.com/apache/hadoop/pull/3351#discussion_r704628110



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java
##
@@ -1591,8 +1591,8 @@ void moveInodes() throws IOException {
 }
 
 if (count != totalInodes) {
-  String msg = String.format("moveInodes: expected to move %l inodes, " +
-  "but moved %l inodes", totalInodes, count);
+  String msg = String.format("moveInodes: expected to move %d inodes, " +

Review comment:
   Hi,
   
   Why do we change from %l to %d? Both variables are defined as long, not int. 
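
On the question above: `java.util.Formatter` has no `l` conversion. `%d` formats any integral argument, including `long`, so there is no C-style `%ld` in Java, and a bare `%l` is rejected at run time. A minimal sketch follows; the class and method names are hypothetical, and `Locale.ROOT` is used here only to make the output deterministic.

```java
import java.util.Locale;
import java.util.UnknownFormatConversionException;

final class FormatLongSketch {
  /** Builds the message with %d, which accepts long arguments. */
  static String describe(long totalInodes, long count) {
    return String.format(Locale.ROOT,
        "moveInodes: expected to move %d inodes, but moved %d inodes",
        totalInodes, count);
  }

  /** Returns true if the formatter rejects the %l conversion. */
  static boolean percentLIsRejected() {
    try {
      String.format(Locale.ROOT, "expected to move %l inodes", 5L);
      return false;
    } catch (UnknownFormatConversionException e) {
      return true;
    }
  }
}
```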






Issue Time Tracking
---

Worklog Id: (was: 648104)
Time Spent: 40m  (was: 0.5h)


[jira] [Commented] (HDFS-16211) Complete some descriptions related to AuthToken

2021-09-08 Thread Konstantin Shvachko (Jira)



[ 
https://issues.apache.org/jira/browse/HDFS-16211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17412096#comment-17412096
 ] 

Konstantin Shvachko commented on HDFS-16211:


Hi [~jianghuazhu]. Thanks for contributing.
Generally, changing documentation is a good thing.
But with this particular change I do not see how it clarifies anything about 
the {{AuthToken}} class.
Besides, since you commit your changes only into trunk, it increases the 
divergence between supported versions of Hadoop (3.3, 3.2, 2.10) and makes 
backports more complex.
If you are looking for some simpler tasks to get you started with Hadoop, I 
suggest searching for issues labeled "newbie" or "newbie++".

> Complete some descriptions related to AuthToken
> ---
>
> Key: HDFS-16211
> URL: https://issues.apache.org/jira/browse/HDFS-16211
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> In AuthToken, some description information is missing.
> The purpose of this jira is to complete some descriptions related to 
> AuthToken.
> /**
>  */
> public class AuthToken implements Principal {
>   ..
> }




[jira] [Commented] (HDFS-16207) Remove NN logs stack trace for non-existent xattr query

2021-09-08 Thread Chris Nauroth (Jira)



[ 
https://issues.apache.org/jira/browse/HDFS-16207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17412086#comment-17412086
 ] 

Chris Nauroth commented on HDFS-16207:
--

+1

Thank you for the contribution, Ahmed.  I see you asked Kihwal to review, so 
I'll wait a day before merging in case you really wanted Kihwal's opinion 
specifically.

> Remove NN logs stack trace for non-existent xattr query
> ---
>
> Key: HDFS-16207
> URL: https://issues.apache.org/jira/browse/HDFS-16207
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.4.0, 2.10.2, 3.3.2, 3.2.4
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The NN logs a full stack trace every time getXAttrs is called for a 
> non-existent xattr. This logging adds no value, and the increased logging 
> load may harm performance. Something is now probing for xattrs, resulting in 
> many lines of:
> {code:bash}
> 2021-09-02 13:48:03,340 [IPC Server handler 5 on default port 59951] INFO  
> ipc.Server (Server.java:logException(3149)) - IPC Server handler 5 on default 
> port 59951, call Call#17 Retry#0 
> org.apache.hadoop.hdfs.protocol.ClientProtocol.getXAttrs from 127.0.0.1:59961
> java.io.IOException: At least one of the attributes provided was not found.
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirXAttrOp.getXAttrs(FSDirXAttrOp.java:134)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getXAttrs(FSNamesystem.java:8472)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getXAttrs(NameNodeRpcServer.java:2317)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getXAttrs(ClientNamenodeProtocolServerSideTranslatorPB.java:1745)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:604)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:572)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:556)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1093)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1155)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1083)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1900)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:3088)
> {code}
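
One general way to cut this kind of noise is to keep a set of "expected" exception types and log only their one-line summary. The sketch below shows the idea in isolation. It is not the patch attached to this issue and does not use Hadoop's classes; `TerseExceptionLogSketch`, `XAttrNotFoundException`, `addTerse`, and `render` are all hypothetical names.

```java
import java.io.IOException;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class TerseExceptionLogSketch {
  /** Hypothetical exception type standing in for "attribute not found". */
  static final class XAttrNotFoundException extends IOException {
    XAttrNotFoundException(String msg) {
      super(msg);
    }
  }

  // Exception types whose stack traces should be left out of the log.
  private final Set<Class<? extends Throwable>> terse = ConcurrentHashMap.newKeySet();

  void addTerse(Class<? extends Throwable> type) {
    terse.add(type);
  }

  /** Returns the text that would be logged for a failed call. */
  String render(String call, Throwable t) {
    StringBuilder sb = new StringBuilder(call).append(": ").append(t);
    if (!terse.contains(t.getClass())) {
      // Unregistered types keep the full stack trace.
      for (StackTraceElement frame : t.getStackTrace()) {
        sb.append(System.lineSeparator()).append("\tat ").append(frame);
      }
    }
    return sb.toString();
  }
}
```

Matching on the exact class keeps the rule narrow: a generic `IOException` from the same call would still be logged in full.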




[jira] [Work logged] (HDFS-16213) Flaky test TestFsDatasetImpl#testDnRestartWithHardLink

2021-09-08 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-16213?focusedWorklogId=647927=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-647927
 ]

ASF GitHub Bot logged work on HDFS-16213:
-

Author: ASF GitHub Bot
Created on: 08/Sep/21 12:38
Start Date: 08/Sep/21 12:38
Worklog Time Spent: 10m 
  Work Description: virajjasani commented on pull request #3386:
URL: https://github.com/apache/hadoop/pull/3386#issuecomment-915201911


   The only way to avoid changing source code would be to avoid restarting the 
Datanode as part of this test. However, confirming the results after a Datanode 
restart is the primary purpose of this test, so IMHO we can't avoid the restart.




Issue Time Tracking
---

Worklog Id: (was: 647927)
Time Spent: 3h  (was: 2h 50m)

> Flaky test TestFsDatasetImpl#testDnRestartWithHardLink
> --
>
> Key: HDFS-16213
> URL: https://issues.apache.org/jira/browse/HDFS-16213
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> Failure case: 
> [here|https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3359/4/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt]
> {code:java}
> [ERROR] testDnRestartWithHardLink(org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl)  Time elapsed: 7.768 s  <<< FAILURE!
> [ERROR] testDnRestartWithHardLink(org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl)  Time elapsed: 7.768 s  <<< FAILURE!
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:87)
>   at org.junit.Assert.assertTrue(Assert.java:42)
>   at org.junit.Assert.assertTrue(Assert.java:53)
>   at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl.testDnRestartWithHardLink(TestFsDatasetImpl.java:1344)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>   at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>   at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.lang.Thread.run(Thread.java:748)
> {code}




[jira] [Work logged] (HDFS-16213) Flaky test TestFsDatasetImpl#testDnRestartWithHardLink

2021-09-08 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-16213?focusedWorklogId=647925=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-647925
 ]

ASF GitHub Bot logged work on HDFS-16213:
-

Author: ASF GitHub Bot
Created on: 08/Sep/21 12:34
Start Date: 08/Sep/21 12:34
Worklog Time Spent: 10m 
  Work Description: virajjasani edited a comment on pull request #3386:
URL: https://github.com/apache/hadoop/pull/3386#issuecomment-915135845


   Thanks @ayushtkn.
   
   > Just to know, does Jenkins also complain about this?
   
   Yes, Jenkins does report this, e.g. 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3359/4/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
   
   I also tried to reproduce it on a Mac. It reproduced once with a single 
test run (debug mode), but a simple rerun always fails with the same issue.
   
   > Again, just asking: is a test-only fix possible, without touching prod 
files?
   
   Sure thing, I would be happy to fix it that way, but given how replicas are 
processed by BlockPoolSlice#getVolumeMap after the Datanode is restarted, I 
think that might be tough to achieve for this particular test. At the same 
time, I have tried to describe the source code changes as clearly as possible 
so that devs understand that the flag is for test purposes only.
   
   FYI @LeonGao91 @Jing9, if you could also review, it would be great. Thanks




Issue Time Tracking
---

Worklog Id: (was: 647925)
Time Spent: 2h 50m  (was: 2h 40m)


[jira] [Work logged] (HDFS-16213) Flaky test TestFsDatasetImpl#testDnRestartWithHardLink

2021-09-08 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-16213?focusedWorklogId=647895=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-647895
 ]

ASF GitHub Bot logged work on HDFS-16213:
-

Author: ASF GitHub Bot
Created on: 08/Sep/21 11:22
Start Date: 08/Sep/21 11:22
Worklog Time Spent: 10m 
  Work Description: virajjasani edited a comment on pull request #3386:
URL: https://github.com/apache/hadoop/pull/3386#issuecomment-915135845


   Thanks @ayushtkn.
   
   > Just to know, does Jenkins also complain about this?
   
   Yes, Jenkins does report this, e.g. 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3359/4/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
   
   I also tried to reproduce it on a Mac. It reproduced once with a single 
test run (debug mode), but a simple rerun always fails with the same issue.
   
   > Again, just asking: is a test-only fix possible, without touching prod 
files?
   
   Sure thing, I would be happy to fix it that way, but given how replicas are 
processed by BlockPoolSlice#getVolumeMap after the Datanode is restarted, I 
think that might be tough to achieve for this particular test. At the same 
time, I have tried to describe the source code changes as clearly as possible 
so that devs understand that the flag is for test purposes only.
   
   FYI @LeonGao91, if you could also review, it would be great. Thanks




Issue Time Tracking
---

Worklog Id: (was: 647895)
Time Spent: 2h 40m  (was: 2.5h)


[jira] [Work logged] (HDFS-16213) Flaky test TestFsDatasetImpl#testDnRestartWithHardLink

2021-09-08 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-16213?focusedWorklogId=647887=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-647887
 ]

ASF GitHub Bot logged work on HDFS-16213:
-

Author: ASF GitHub Bot
Created on: 08/Sep/21 11:10
Start Date: 08/Sep/21 11:10
Worklog Time Spent: 10m 
  Work Description: virajjasani edited a comment on pull request #3386:
URL: https://github.com/apache/hadoop/pull/3386#issuecomment-915135845


   Thanks @ayushtkn.
   
   > Just to know, does Jenkins also complain about this?
   
   Yes, Jenkins does report this, e.g. 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3359/4/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
   
   I also tried to reproduce it on a Mac. It reproduced once with a single 
test run (debug mode), but a simple rerun always fails with the same issue.
   
   > Again, just asking: is a test-only fix possible, without touching prod 
files?
   
   Sure thing, I would be happy to fix it that way, but given how replicas are 
built up by the BlockPoolSlice instance after the Datanode is restarted, I 
think that might be tough to achieve for this particular test. At the same 
time, I have tried to describe the source code changes as clearly as possible 
so that devs understand that the flag is for test purposes only.
   
   FYI @LeonGao91, if you could also review, it would be great. Thanks




Issue Time Tracking
---

Worklog Id: (was: 647887)
Time Spent: 2.5h  (was: 2h 20m)


[jira] [Work logged] (HDFS-16213) Flaky test TestFsDatasetImpl#testDnRestartWithHardLink

2021-09-08 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-16213?focusedWorklogId=647885=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-647885
 ]

ASF GitHub Bot logged work on HDFS-16213:
-

Author: ASF GitHub Bot
Created on: 08/Sep/21 11:09
Start Date: 08/Sep/21 11:09
Worklog Time Spent: 10m 
  Work Description: virajjasani edited a comment on pull request #3386:
URL: https://github.com/apache/hadoop/pull/3386#issuecomment-915135845


   Thanks @ayushtkn.
   
   > Just to know, does Jenkins also complain about this?
   
   Yes, Jenkins does report this, e.g. 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3359/4/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
   
   I also tried to reproduce it on a Mac. It reproduced once with a single 
test run (debug mode), but a simple rerun always fails with the same issue.
   
   > Again, just asking: is a test-only fix possible, without touching prod 
files?
   
   Sure thing, I would be happy to fix this in tests only, but given how 
replicas are built up by the BlockPoolSlice instance after the Datanode is 
restarted, I think that might be tough to achieve. At the same time, I am 
trying to describe the source code changes as clearly as possible so that devs 
understand that the flag is for test use only.
   
   FYI @LeonGao91, if you could also review, it would be great. Thanks




Issue Time Tracking
---

Worklog Id: (was: 647885)
Time Spent: 2h 20m  (was: 2h 10m)


[jira] [Work logged] (HDFS-16213) Flaky test TestFsDatasetImpl#testDnRestartWithHardLink

2021-09-08 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-16213?focusedWorklogId=647884=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-647884
 ]

ASF GitHub Bot logged work on HDFS-16213:
-

Author: ASF GitHub Bot
Created on: 08/Sep/21 11:00
Start Date: 08/Sep/21 11:00
Worklog Time Spent: 10m 
  Work Description: virajjasani commented on pull request #3386:
URL: https://github.com/apache/hadoop/pull/3386#issuecomment-915135845


   Thanks @ayushtkn.
   
   > Just to know, does Jenkins also complain about this?
   
   Yes, Jenkins does report this, e.g. 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3359/4/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
   
   I also tried to reproduce it, on Mac only. With a single test run it 
reproduced once (in debug mode), but a simple rerun always fails with the 
same issue.
   
   > Again, just asking: is a test-only fix possible, without touching 
production files?
   
   Sure thing, I would be happy to fix this in tests only, but given how 
replicas are built up by the BlockPoolSlice instance after the Datanode is 
restarted, I think that might be tough to achieve.
   
   FYI @LeonGao91, it would be great if you could also review. Thanks


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 647884)
Time Spent: 2h 10m  (was: 2h)

> Flaky test TestFsDatasetImpl#testDnRestartWithHardLink
> --
>
> Key: HDFS-16213
> URL: https://issues.apache.org/jira/browse/HDFS-16213
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>




[jira] [Work logged] (HDFS-16213) Flaky test TestFsDatasetImpl#testDnRestartWithHardLink

2021-09-08 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-16213?focusedWorklogId=647882=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-647882
 ]

ASF GitHub Bot logged work on HDFS-16213:
-

Author: ASF GitHub Bot
Created on: 08/Sep/21 10:53
Start Date: 08/Sep/21 10:53
Worklog Time Spent: 10m 
  Work Description: ayushtkn commented on pull request #3386:
URL: https://github.com/apache/hadoop/pull/3386#issuecomment-915130973


   Thanks @virajjasani for the PR. I too tried to reproduce this by running it 
twice a couple of days back, but it didn't reproduce for me either. Is it 
OS-specific? I am on macOS and can try on Ubuntu if you say so.
   
   Just to know, does Jenkins also complain about this, since you said it 
reproduces when the test is rerun? Again, just asking: is a test-only fix 
possible, without touching production files?
   
   I think this is from some recent code; can you loop in the original 
developer and the reviewer as well?




Issue Time Tracking
---

Worklog Id: (was: 647882)
Time Spent: 2h  (was: 1h 50m)

> Flaky test TestFsDatasetImpl#testDnRestartWithHardLink
> --
>
> Key: HDFS-16213
> URL: https://issues.apache.org/jira/browse/HDFS-16213
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h
>  Remaining Estimate: 0h
>




[jira] [Work logged] (HDFS-16203) Discover datanodes with unbalanced block pool usage by the standard deviation

2021-09-08 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-16203?focusedWorklogId=647874=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-647874
 ]

ASF GitHub Bot logged work on HDFS-16203:
-

Author: ASF GitHub Bot
Created on: 08/Sep/21 10:23
Start Date: 08/Sep/21 10:23
Worklog Time Spent: 10m 
  Work Description: tomscut commented on pull request #3366:
URL: https://github.com/apache/hadoop/pull/3366#issuecomment-915112950


   Hi @aajisaka @goiri @virajjasani , could you please also review this? Thank 
you.




Issue Time Tracking
---

Worklog Id: (was: 647874)
Time Spent: 1h 20m  (was: 1h 10m)

> Discover datanodes with unbalanced block pool usage by the standard deviation
> -
>
> Key: HDFS-16203
> URL: https://issues.apache.org/jira/browse/HDFS-16203
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2021-09-01-19-16-27-172.png
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> *Discover datanodes with unbalanced volume usage by the standard deviation.*
> *Several scenarios can leave datanode disk usage unbalanced:*
>  1. A damaged disk is repaired and brought back online.
>  2. Disks are added to some datanodes.
>  3. Some disks are damaged, resulting in slow data writing.
>  4. A custom volume choosing policy is in use.
> When disk usage is unbalanced, a sudden increase in datanode write 
> traffic may result in busy I/O on disks with low volume usage, resulting in 
> decreased throughput across datanodes.
> We need to find these nodes in time to run diskBalance or do other processing. 
> Based on the usage of each volume on a datanode, we can calculate the standard 
> deviation of the volume usage. The more unbalanced the volumes, the higher the 
> standard deviation.
> *We can display the result on the namenode web UI and sort on it directly 
> to find the nodes whose volume usage is unbalanced.*
> *{color:#172b4d}This interface is only used to obtain metrics and does not 
> adversely affect namenode performance.{color}*
>  
> {color:#172b4d}!image-2021-09-01-19-16-27-172.png|width=581,height=216!{color}
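
The standard-deviation check described in the issue can be sketched as follows. This is a minimal illustration and not the code from the pull request: the class name `VolumeUsageStdDev`, the method `usageStdDev`, and the sample usage ratios are hypothetical, and the sketch assumes the population form of the standard deviation over per-volume usage ratios between 0.0 and 1.0.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class VolumeUsageStdDev {

  /**
   * Population standard deviation of per-volume usage ratios.
   * Returns 0.0 for a datanode that reports no volumes.
   */
  public static double usageStdDev(double[] usages) {
    if (usages.length == 0) {
      return 0.0;
    }
    double sum = 0.0;
    for (double u : usages) {
      sum += u;
    }
    double mean = sum / usages.length;
    double squaredDiffs = 0.0;
    for (double u : usages) {
      squaredDiffs += (u - mean) * (u - mean);
    }
    return Math.sqrt(squaredDiffs / usages.length);
  }

  public static void main(String[] args) {
    // Hypothetical datanodes: dn1 is evenly filled, dn2 is skewed.
    Map<String, double[]> datanodes = new LinkedHashMap<>();
    datanodes.put("dn1", new double[] {0.50, 0.50, 0.50, 0.50});
    datanodes.put("dn2", new double[] {0.90, 0.10, 0.85, 0.15});
    for (Map.Entry<String, double[]> e : datanodes.entrySet()) {
      System.out.printf("%s stddev=%.3f%n", e.getKey(), usageStdDev(e.getValue()));
    }
  }
}
```

Both sample datanodes have the same mean usage, so an average alone would not tell them apart; only the deviation separates the skewed node from the balanced one, which is why sorting on it surfaces the nodes of interest.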




[jira] [Work logged] (HDFS-16213) Flaky test TestFsDatasetImpl#testDnRestartWithHardLink

2021-09-08 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-16213?focusedWorklogId=647815=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-647815
 ]

ASF GitHub Bot logged work on HDFS-16213:
-

Author: ASF GitHub Bot
Created on: 08/Sep/21 08:26
Start Date: 08/Sep/21 08:26
Worklog Time Spent: 10m 
  Work Description: virajjasani commented on pull request #3386:
URL: https://github.com/apache/hadoop/pull/3386#issuecomment-915028364


   Also, in debug mode, I was able to reproduce the same flaky behaviour in a 
single run (only once). FYI @ferhui 
   Thanks




Issue Time Tracking
---

Worklog Id: (was: 647815)
Time Spent: 1h 50m  (was: 1h 40m)

> Flaky test TestFsDatasetImpl#testDnRestartWithHardLink
> --
>
> Key: HDFS-16213
> URL: https://issues.apache.org/jira/browse/HDFS-16213
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>




[jira] [Work logged] (HDFS-16186) Datanode kicks out hard disk logic optimization

2021-09-08 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-16186?focusedWorklogId=647764=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-647764
 ]

ASF GitHub Bot logged work on HDFS-16186:
-

Author: ASF GitHub Bot
Created on: 08/Sep/21 06:56
Start Date: 08/Sep/21 06:56
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3334:
URL: https://github.com/apache/hadoop/pull/3334#issuecomment-914971392


   :broken_heart: **-1 overall**
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 50s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 3 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  35m 54s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 23s |  |  trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   1m 16s |  |  trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   1m  1s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 23s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 58s |  |  trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 32s |  |  trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m  6s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  16m 25s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 10s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 13s |  |  the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |   1m 13s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  7s |  |  the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |   1m  7s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 52s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 14s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 47s |  |  the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 19s |  |  the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m 10s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  16m  6s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | -1 :x: |  unit  | 238m 11s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3334/8/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 47s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 327m 41s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.hdfs.server.datanode.checker.TestDatasetVolumeCheckerFailures |
   |   | hadoop.hdfs.server.datanode.checker.TestDatasetVolumeChecker |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3334/8/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3334 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux 8fda4494e9c3 4.15.0-136-generic #140-Ubuntu SMP Thu Jan 28 05:20:47 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 1e1dc86c1fb0d402d8a1e3f98d58933b521cd0de |
   | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3334/8/testReport/ |
   | Max.

[jira] [Work logged] (HDFS-16187) SnapshotDiff behaviour with Xattrs and Acls is not consistent across NN restarts with checkpointing

2021-09-08 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HDFS-16187?focusedWorklogId=647754=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-647754
 ]

ASF GitHub Bot logged work on HDFS-16187:
-

Author: ASF GitHub Bot
Created on: 08/Sep/21 06:19
Start Date: 08/Sep/21 06:19
Worklog Time Spent: 10m 
  Work Description: bshashikant commented on pull request #3340:
URL: https://github.com/apache/hadoop/pull/3340#issuecomment-914952043


   The failures don't seem related. @szetszwo , @smengcl , can you please have 
a look?




Issue Time Tracking
---

Worklog Id: (was: 647754)
Time Spent: 1h 20m  (was: 1h 10m)

> SnapshotDiff behaviour with Xattrs and Acls is not consistent across NN 
> restarts with checkpointing
> ---
>
> Key: HDFS-16187
> URL: https://issues.apache.org/jira/browse/HDFS-16187
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: snapshots
>Reporter: Srinivasu Majeti
>Assignee: Shashikant Banerjee
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> The test below shows that the snapshot diff is not consistent with Xattrs 
> (here the encryption zone sets the Xattr) across NN restarts with a 
> checkpointed FsImage.
> {code:java}
> @Test
> public void testEncryptionZonesWithSnapshots() throws Exception {
>   final Path snapshottable = new Path("/zones");
>   fsWrapper.mkdir(snapshottable, FsPermission.getDirDefault(),
>   true);
>   dfsAdmin.allowSnapshot(snapshottable);
>   dfsAdmin.createEncryptionZone(snapshottable, TEST_KEY, NO_TRASH);
>   fs.createSnapshot(snapshottable, "snap1");
>   SnapshotDiffReport report =
>   fs.getSnapshotDiffReport(snapshottable, "snap1", "");
>   Assert.assertEquals(0, report.getDiffList().size());
>   report =
>   fs.getSnapshotDiffReport(snapshottable, "snap1", "");
>   System.out.println(report);
>   Assert.assertEquals(0, report.getDiffList().size());
>   fs.setSafeMode(SafeModeAction.SAFEMODE_ENTER);
>   fs.saveNamespace();
>   fs.setSafeMode(SafeModeAction.SAFEMODE_LEAVE);
>   cluster.restartNameNode(true);
>   report =
>   fs.getSnapshotDiffReport(snapshottable, "snap1", "");
>   Assert.assertEquals(0, report.getDiffList().size());
> }{code}
> {code:java}
> Pre Restart:
> Difference between snapshot snap1 and current directory under directory 
> /zones:
> Post Restart:
> Difference between snapshot snap1 and current directory under directory 
> /zones:
> M .{code}
> The side effect of this behavior is that distcp with snapshot diff fails 
> with the error below, complaining that the target cluster has some changed data.
> {code:java}
> WARN tools.DistCp: The target has been modified since snapshot x
> {code}
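
The review discussion on pull request #3340 describes the fix as moving from reference equality to value equality for Xattrs. The sketch below illustrates why that distinction could produce a spurious `M .` entry after a restart. It does not use any HDFS classes: `Attr`, `changedByReference`, and `changedByValue` are hypothetical names, and it assumes that loading a checkpointed image yields separately constructed objects for the snapshot copy and the current copy of the same attribute.

```java
import java.util.Arrays;
import java.util.Objects;

public class XAttrEqualityDemo {

  /** Hypothetical stand-in for an immutable extended attribute. */
  static final class Attr {
    final String name;
    final byte[] value;

    Attr(String name, byte[] value) {
      this.name = name;
      this.value = value.clone();
    }

    @Override
    public boolean equals(Object o) {
      if (this == o) {
        return true;
      }
      if (!(o instanceof Attr)) {
        return false;
      }
      Attr other = (Attr) o;
      return name.equals(other.name) && Arrays.equals(value, other.value);
    }

    @Override
    public int hashCode() {
      return 31 * name.hashCode() + Arrays.hashCode(value);
    }
  }

  /** Reports a change whenever the two sides are different objects. */
  static boolean changedByReference(Attr snapshot, Attr current) {
    return snapshot != current;
  }

  /** Reports a change only when the contents differ. */
  static boolean changedByValue(Attr snapshot, Attr current) {
    return !Objects.equals(snapshot, current);
  }

  public static void main(String[] args) {
    // Before restart (assumed): snapshot and current share one instance.
    Attr shared = new Attr("user.zone", new byte[] {1, 2, 3});
    System.out.println("pre-restart, by reference: "
        + changedByReference(shared, shared));

    // After restart (assumed): each side is rebuilt from the saved image.
    Attr snapshotCopy = new Attr("user.zone", new byte[] {1, 2, 3});
    Attr currentCopy = new Attr("user.zone", new byte[] {1, 2, 3});
    System.out.println("post-restart, by reference: "
        + changedByReference(snapshotCopy, currentCopy));
    System.out.println("post-restart, by value: "
        + changedByValue(snapshotCopy, currentCopy));
  }
}
```

Under these assumptions the reference comparison flags a modification after the reload even though nothing changed, while the value comparison does not, which matches the pre-restart and post-restart diff output quoted in the issue.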




[jira] [Work logged] (HDFS-16187) SnapshotDiff behaviour with Xattrs and Acls is not consistent across NN restarts with checkpointing

[jira] [Work logged] (HDFS-16187) SnapshotDiff behaviour with Xattrs and Acls is not consistent across NN restarts with checkpointing

[jira] [Work logged] (HDFS-16213) Flaky test TestFsDatasetImpl#testDnRestartWithHardLink

[jira] [Work logged] (HDFS-16213) Flaky test TestFsDatasetImpl#testDnRestartWithHardLink

[jira] [Work logged] (HDFS-16207) Remove NN logs stack trace for non-existent xattr query

[jira] [Work logged] (HDFS-16213) Flaky test TestFsDatasetImpl#testDnRestartWithHardLink

[jira] [Commented] (HDFS-16211) Complete some descriptions related to AuthToken

[jira] [Commented] (HDFS-16210) RBF: Add the option of refreshCallQueue to RouterAdmin

[jira] [Reopened] (HDFS-16210) RBF: Add the option of refreshCallQueue to RouterAdmin

[jira] [Work logged] (HDFS-16210) RBF: Add the option of refreshCallQueue to RouterAdmin

[jira] [Resolved] (HDFS-16210) RBF: Add the option of refreshCallQueue to RouterAdmin

[jira] [Work logged] (HDFS-16210) RBF: Add the option of refreshCallQueue to RouterAdmin

[jira] [Work logged] (HDFS-16213) Flaky test TestFsDatasetImpl#testDnRestartWithHardLink

[jira] [Work logged] (HDFS-15516) Add info for create flags in NameNode audit logs

[jira] [Commented] (HDFS-16191) [FGL] Fix FSImage loading issues on dynamic partitions

[jira] [Work logged] (HDFS-16213) Flaky test TestFsDatasetImpl#testDnRestartWithHardLink

[jira] [Work logged] (HDFS-16191) [FGL] Fix FSImage loading issues on dynamic partitions

[jira] [Work logged] (HDFS-16213) Flaky test TestFsDatasetImpl#testDnRestartWithHardLink

[jira] [Commented] (HDFS-16207) Remove NN logs stack trace for non-existent xattr query

[jira] [Commented] (HDFS-16191) [FGL] Fix FSImage loading issues on dynamic partitions

[jira] [Work logged] (HDFS-16191) [FGL] Fix FSImage loading issues on dynamic partitions

[jira] [Work logged] (HDFS-16191) [FGL] Fix FSImage loading issues on dynamic partitions

[jira] [Commented] (HDFS-16211) Complete some descriptions related to AuthToken

[jira] [Commented] (HDFS-16207) Remove NN logs stack trace for non-existent xattr query

[jira] [Work logged] (HDFS-16213) Flaky test TestFsDatasetImpl#testDnRestartWithHardLink

[jira] [Work logged] (HDFS-16213) Flaky test TestFsDatasetImpl#testDnRestartWithHardLink

[jira] [Work logged] (HDFS-16213) Flaky test TestFsDatasetImpl#testDnRestartWithHardLink

[jira] [Work logged] (HDFS-16213) Flaky test TestFsDatasetImpl#testDnRestartWithHardLink

[jira] [Work logged] (HDFS-16213) Flaky test TestFsDatasetImpl#testDnRestartWithHardLink

[jira] [Work logged] (HDFS-16213) Flaky test TestFsDatasetImpl#testDnRestartWithHardLink

[jira] [Work logged] (HDFS-16213) Flaky test TestFsDatasetImpl#testDnRestartWithHardLink

[jira] [Work logged] (HDFS-16203) Discover datanodes with unbalanced block pool usage by the standard deviation

[jira] [Work logged] (HDFS-16213) Flaky test TestFsDatasetImpl#testDnRestartWithHardLink

[jira] [Work logged] (HDFS-16186) Datanode kicks out hard disk logic optimization

[jira] [Work logged] (HDFS-16187) SnapshotDiff behaviour with Xattrs and Acls is not consistent across NN restarts with checkpointing
