[jira] [Work logged] (HDFS-15998) Fix NullPointException In listOpenFiles

2021-05-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15998?focusedWorklogId=603412&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-603412
 ]

ASF GitHub Bot logged work on HDFS-15998:
-

Author: ASF GitHub Bot
Created on: 28/May/21 06:43
Start Date: 28/May/21 06:43
Worklog Time Spent: 10m 
  Work Description: haiyang1987 commented on pull request #3036:
URL: https://github.com/apache/hadoop/pull/3036#issuecomment-850184216


   add unit tests.
   
   @jojochuang Please take a look, Thanks!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 603412)
Time Spent: 1h  (was: 50m)

> Fix NullPointException In listOpenFiles
> ---
>
> Key: HDFS-15998
> URL: https://issues.apache.org/jira/browse/HDFS-15998
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.2.0
>Reporter: Haiyang Hu
>Assignee: Haiyang Hu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Using the Hadoop 3.2.0 client to execute the following command occasionally 
> throws an NPE:
> hdfs dfsadmin -Dfs.defaultFS=hdfs://xxx -listOpenFiles -blockingDecommission 
> -path /xxx
>  
> {quote}
>  org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFilesBlockingDecom(FSNamesystem.java:1917)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.listOpenFiles(FSNamesystem.java:1876)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.listOpenFiles(NameNodeRpcServer.java:1453)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.listOpenFiles(ClientNamenodeProtocolServerSideTranslatorPB.java:1894)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   ...
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.listOpenFiles(ClientNamenodeProtocolTranslatorPB.java:1952)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
>   at com.sun.proxy.$Proxy10.listOpenFiles(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocol.OpenFilesIterator.makeRequest(OpenFilesIterator.java:89)
>   at 
> org.apache.hadoop.hdfs.protocol.OpenFilesIterator.makeRequest(OpenFilesIterator.java:35)
>   at 
> org.apache.hadoop.fs.BatchedRemoteIterator.makeRequest(BatchedRemoteIterator.java:77)
>   at 
> org.apache.hadoop.fs.BatchedRemoteIterator.makeRequestIfNeeded(BatchedRemoteIterator.java:85)
>   at 
> org.apache.hadoop.fs.BatchedRemoteIterator.hasNext(BatchedRemoteIterator.java:99)
>   at 
> org.apache.hadoop.hdfs.tools.DFSAdmin.printOpenFiles(DFSAdmin.java:1006)
>   at 
> org.apache.hadoop.hdfs.tools.DFSAdmin.listOpenFiles(DFSAdmin.java:994)
>   at org.apache.hadoop.hdfs.tools.DFSAdmin.run(DFSAdmin.java:2431)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
>   at org.apache.hadoop.hdfs.tools.DFSAdmin.main(DFSAdmin.java:2590)
>  List open files failed.
>  listOpenFiles: java.lang.NullPointerException
> {quote}
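
For context, a minimal client-side sketch of the same call path (the programmatic 
equivalent of the dfsadmin command above). The URI "hdfs://xxx" and the path "/xxx" 
are just the placeholders from the report; this is only an illustration of how the 
call is made, not the fix.

{code:java}
import java.net.URI;
import java.util.EnumSet;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.RemoteIterator;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.OpenFileEntry;
import org.apache.hadoop.hdfs.protocol.OpenFilesIterator.OpenFilesType;

public class ListOpenFilesSketch {
  public static void main(String[] args) throws Exception {
    // Equivalent of: hdfs dfsadmin -listOpenFiles -blockingDecommission -path /xxx
    DistributedFileSystem dfs = (DistributedFileSystem)
        FileSystem.get(URI.create("hdfs://xxx"), new Configuration());
    RemoteIterator<OpenFileEntry> openFiles =
        dfs.listOpenFiles(EnumSet.of(OpenFilesType.BLOCKING_DECOMMISSION), "/xxx");
    while (openFiles.hasNext()) {   // the reported RemoteException(NPE) surfaces here
      System.out.println(openFiles.next().getFilePath());
    }
  }
}
{code}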



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16045) FileSystem.CACHE memory leak

2021-05-27 Thread Xiangyi Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17352937#comment-17352937
 ] 

Xiangyi Zhu commented on HDFS-16045:


[~hexiaoqiao] Thank you very much for your comment. Presto uses the 
"UserGroupInformation#createProxyUser" method to create the UGI, which needs the 
real super-user information to be passed in (this should be related to Kerberos), 
while the "FileSystem#get" API uses "UserGroupInformation#createRemoteUser" to 
create the UGI. The latter does not require real user information; the UGI it 
creates only carries the user name, so the UGIs created for the same user are of 
the same nature, and I think they could share the same FileSystem instance. My 
consideration may not be comprehensive; discussion is welcome.

 
{code:java}
public static UserGroupInformation createRemoteUser(String user, AuthMethod 
authMethod) {
  if (user == null || user.isEmpty()) {
throw new IllegalArgumentException("Null user");
  }
  Subject subject = new Subject();
  subject.getPrincipals().add(new User(user));
  UserGroupInformation result = new UserGroupInformation(subject);
  result.setAuthenticationMethod(authMethod);
  return result;
}{code}
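
To make the cache behaviour concrete, a minimal sketch of the reported effect (the 
cluster URI and user name below are hypothetical): each call builds a fresh UGI for 
the given user, so the two calls hit different cache keys and two FileSystem 
instances end up cached.

{code:java}
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class FileSystemCacheSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    URI uri = URI.create("hdfs://nameservice1/");  // hypothetical cluster URI
    // Each call creates a new UGI (new Subject) for "bob", so the cache key
    // comparison fails and a second instance is created and cached instead of
    // the first one being reused.
    FileSystem fs1 = FileSystem.get(uri, conf, "bob");
    FileSystem fs2 = FileSystem.get(uri, conf, "bob");
    System.out.println(fs1 == fs2);  // false: two cache entries for the same user
  }
}
{code}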

> FileSystem.CACHE memory leak
> 
>
> Key: HDFS-16045
> URL: https://issues.apache.org/jira/browse/HDFS-16045
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.4.0
>Reporter: Xiangyi Zhu
>Priority: Major
>
> {code:java}
> FileSystem get(final URI uri, final Configuration conf,
>  final String user){code}
> When the client has the cache enabled and uses the above API to create a 
> FileSystem instance for a specified user, the cache is effectively bypassed.
> A new UGI is created every time a FileSystem instance is created for the 
> specified user, and the cache compares entries by UGI.
> {code:java}
> public int hashCode() {
>  return (scheme + authority).hashCode() + ugi.hashCode() + (int)unique;
> }{code}
> Could the username be used instead of the UGI for the comparison, and would 
> that introduce other risks?
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16040) RpcQueueTime metric counts requeued calls as unique events.

2021-05-27 Thread Konstantin Shvachko (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated HDFS-16040:
---
Fix Version/s: 3.3.2
   3.2.3
   2.10.2
   3.1.5
 Hadoop Flags: Reviewed
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

I just committed this to all active branches. Thank you, [~simbadzina].

> RpcQueueTime metric counts requeued calls as unique events.
> ---
>
> Key: HDFS-16040
> URL: https://issues.apache.org/jira/browse/HDFS-16040
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.10.0, 3.3.0
>Reporter: Simbarashe Dzinamarira
>Assignee: Simbarashe Dzinamarira
>Priority: Major
> Fix For: 3.1.5, 2.10.2, 3.2.3, 3.3.2
>
> Attachments: HDFS-16040.001.patch, HDFS-16040.002.patch, 
> HDFS-16040.003.patch
>
>
> The RpcQueueTime metric is updated every time a call is re-queued while 
> waiting for the server state to reach the call's client's state ID. This is 
> in contrast to RpcProcessingTime, which is only updated when the call is 
> finally processed.
> On the Observer NameNode this can result in RpcQueueTimeNumOps being much 
> larger than RpcProcessingTimeNumOps. The re-queueing is an internal 
> optimization to avoid blocking and shouldn't result in an inflated metric.
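
A tiny self-contained illustration of the effect described above (plain Java with 
hypothetical counters standing in for the RPC metrics, not Hadoop code): if the 
queue-time sample is taken on every requeue, the op count is inflated by the number 
of requeues.

{code:java}
import java.util.concurrent.atomic.AtomicLong;

public class RequeueMetricSketch {
  public static void main(String[] args) {
    AtomicLong rpcQueueTimeNumOps = new AtomicLong();       // sampled per (re)queue
    AtomicLong rpcProcessingTimeNumOps = new AtomicLong();  // sampled once at the end

    int requeues = 5;  // call re-queued 5 times while waiting for the state id
    for (int i = 0; i <= requeues; i++) {
      rpcQueueTimeNumOps.incrementAndGet();   // described behaviour: one sample per pass
    }
    rpcProcessingTimeNumOps.incrementAndGet();

    // Prints 6 vs 1: the queue-time op count is inflated by the requeues.
    System.out.println(rpcQueueTimeNumOps + " vs " + rpcProcessingTimeNumOps);
  }
}
{code}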



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16046) TestBalanceProcedureScheduler timeouts

2021-05-27 Thread Akira Ajisaka (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17352931#comment-17352931
 ] 

Akira Ajisaka commented on HDFS-16046:
--

 !screenshot-1.png! 
After extending the timeout, it passed and took about 2 minutes.
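
For reference, one way to extend a JUnit 4 timeout for a whole test class (a sketch 
only; the 300-second value is an arbitrary choice, not necessarily what the fix 
will use):

{code:java}
import org.junit.Rule;
import org.junit.Test;
import org.junit.rules.Timeout;

public class TimeoutRuleSketch {
  // Allow each test in the class up to 5 minutes instead of the previous limit.
  @Rule
  public Timeout globalTimeout = Timeout.seconds(300);

  @Test
  public void slowScenario() throws Exception {
    Thread.sleep(1000);  // stands in for the scheduler down-and-recover work
  }
}
{code}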

> TestBalanceProcedureScheduler timeouts
> --
>
> Key: HDFS-16046
> URL: https://issues.apache.org/jira/browse/HDFS-16046
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rbf, test
>Reporter: Akira Ajisaka
>Priority: Major
> Attachments: image-2021-05-28-11-41-16-733.png, screenshot-1.png
>
>
> https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/520/testReport/org.apache.hadoop.tools.fedbalance.procedure/TestBalanceProcedureScheduler/testSchedulerDownAndRecoverJob/
> {quote}
> org.junit.runners.model.TestTimedOutException: test timed out after 6 
> milliseconds
>   at java.lang.Object.wait(Native Method)
>   at java.lang.Object.wait(Object.java:502)
>   at 
> org.apache.hadoop.tools.fedbalance.procedure.BalanceJob.waitJobDone(BalanceJob.java:220)
>   at 
> org.apache.hadoop.tools.fedbalance.procedure.BalanceProcedureScheduler.waitUntilDone(BalanceProcedureScheduler.java:189)
>   at 
> org.apache.hadoop.tools.fedbalance.procedure.TestBalanceProcedureScheduler.testSchedulerDownAndRecoverJob(TestBalanceProcedureScheduler.java:331)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:288)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:282)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.lang.Thread.run(Thread.java:748)
> {quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16046) TestBalanceProcedureScheduler timeouts

2021-05-27 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-16046:
-
Attachment: screenshot-1.png

> TestBalanceProcedureScheduler timeouts
> --
>
> Key: HDFS-16046
> URL: https://issues.apache.org/jira/browse/HDFS-16046
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rbf, test
>Reporter: Akira Ajisaka
>Priority: Major
> Attachments: image-2021-05-28-11-41-16-733.png, screenshot-1.png
>
>
> https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/520/testReport/org.apache.hadoop.tools.fedbalance.procedure/TestBalanceProcedureScheduler/testSchedulerDownAndRecoverJob/
> {quote}
> org.junit.runners.model.TestTimedOutException: test timed out after 6 
> milliseconds
>   at java.lang.Object.wait(Native Method)
>   at java.lang.Object.wait(Object.java:502)
>   at 
> org.apache.hadoop.tools.fedbalance.procedure.BalanceJob.waitJobDone(BalanceJob.java:220)
>   at 
> org.apache.hadoop.tools.fedbalance.procedure.BalanceProcedureScheduler.waitUntilDone(BalanceProcedureScheduler.java:189)
>   at 
> org.apache.hadoop.tools.fedbalance.procedure.TestBalanceProcedureScheduler.testSchedulerDownAndRecoverJob(TestBalanceProcedureScheduler.java:331)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:288)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:282)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.lang.Thread.run(Thread.java:748)
> {quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16046) TestBalanceProcedureScheduler timeouts

2021-05-27 Thread Akira Ajisaka (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17352929#comment-17352929
 ] 

Akira Ajisaka commented on HDFS-16046:
--

!image-2021-05-28-11-41-16-733.png!

I ran the test from IntelliJ and the test timed out.

> TestBalanceProcedureScheduler timeouts
> --
>
> Key: HDFS-16046
> URL: https://issues.apache.org/jira/browse/HDFS-16046
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rbf, test
>Reporter: Akira Ajisaka
>Priority: Major
> Attachments: image-2021-05-28-11-41-16-733.png
>
>
> https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/520/testReport/org.apache.hadoop.tools.fedbalance.procedure/TestBalanceProcedureScheduler/testSchedulerDownAndRecoverJob/
> {quote}
> org.junit.runners.model.TestTimedOutException: test timed out after 6 
> milliseconds
>   at java.lang.Object.wait(Native Method)
>   at java.lang.Object.wait(Object.java:502)
>   at 
> org.apache.hadoop.tools.fedbalance.procedure.BalanceJob.waitJobDone(BalanceJob.java:220)
>   at 
> org.apache.hadoop.tools.fedbalance.procedure.BalanceProcedureScheduler.waitUntilDone(BalanceProcedureScheduler.java:189)
>   at 
> org.apache.hadoop.tools.fedbalance.procedure.TestBalanceProcedureScheduler.testSchedulerDownAndRecoverJob(TestBalanceProcedureScheduler.java:331)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:288)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:282)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.lang.Thread.run(Thread.java:748)
> {quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16046) TestBalanceProcedureScheduler timeouts

2021-05-27 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-16046:
-
Attachment: image-2021-05-28-11-41-16-733.png

> TestBalanceProcedureScheduler timeouts
> --
>
> Key: HDFS-16046
> URL: https://issues.apache.org/jira/browse/HDFS-16046
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rbf, test
>Reporter: Akira Ajisaka
>Priority: Major
> Attachments: image-2021-05-28-11-41-16-733.png
>
>
> https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/520/testReport/org.apache.hadoop.tools.fedbalance.procedure/TestBalanceProcedureScheduler/testSchedulerDownAndRecoverJob/
> {quote}
> org.junit.runners.model.TestTimedOutException: test timed out after 6 
> milliseconds
>   at java.lang.Object.wait(Native Method)
>   at java.lang.Object.wait(Object.java:502)
>   at 
> org.apache.hadoop.tools.fedbalance.procedure.BalanceJob.waitJobDone(BalanceJob.java:220)
>   at 
> org.apache.hadoop.tools.fedbalance.procedure.BalanceProcedureScheduler.waitUntilDone(BalanceProcedureScheduler.java:189)
>   at 
> org.apache.hadoop.tools.fedbalance.procedure.TestBalanceProcedureScheduler.testSchedulerDownAndRecoverJob(TestBalanceProcedureScheduler.java:331)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:288)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:282)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.lang.Thread.run(Thread.java:748)
> {quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-13729) Fix broken links to RBF documentation

2021-05-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13729?focusedWorklogId=603374&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-603374
 ]

ASF GitHub Bot logged work on HDFS-13729:
-

Author: ASF GitHub Bot
Created on: 28/May/21 02:20
Start Date: 28/May/21 02:20
Worklog Time Spent: 10m 
  Work Description: oojas opened a new pull request #3059:
URL: https://github.com/apache/hadoop/pull/3059


   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 603374)
Time Spent: 40m  (was: 0.5h)

> Fix broken links to RBF documentation
> -
>
> Key: HDFS-13729
> URL: https://issues.apache.org/jira/browse/HDFS-13729
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Reporter: jwhitter
>Assignee: Gabor Bota
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 2.10.0, 3.2.0, 3.1.1, 2.9.2, 3.0.4
>
> Attachments: HADOOP-15589.001.patch, HDFS-13729-branch-2.001.patch, 
> hadoop_broken_link.png
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> A broken link on the page [http://hadoop.apache.org/docs/current/]
>  * HDFS
>  ** HDFS Router based federation. See the [user 
> documentation|http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HDFSRouterFederation.html]
>  for more details.
> The link for user documentation 
> [http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HDFSRouterFederation.html]
>  is not found.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-13729) Fix broken links to RBF documentation

2021-05-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13729?focusedWorklogId=603373&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-603373
 ]

ASF GitHub Bot logged work on HDFS-13729:
-

Author: ASF GitHub Bot
Created on: 28/May/21 02:19
Start Date: 28/May/21 02:19
Worklog Time Spent: 10m 
  Work Description: oojas closed pull request #3059:
URL: https://github.com/apache/hadoop/pull/3059


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 603373)
Time Spent: 0.5h  (was: 20m)

> Fix broken links to RBF documentation
> -
>
> Key: HDFS-13729
> URL: https://issues.apache.org/jira/browse/HDFS-13729
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Reporter: jwhitter
>Assignee: Gabor Bota
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 2.10.0, 3.2.0, 3.1.1, 2.9.2, 3.0.4
>
> Attachments: HADOOP-15589.001.patch, HDFS-13729-branch-2.001.patch, 
> hadoop_broken_link.png
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> A broken link on the page [http://hadoop.apache.org/docs/current/]
>  * HDFS
>  ** HDFS Router based federation. See the [user 
> documentation|http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HDFSRouterFederation.html]
>  for more details.
> The link for user documentation 
> [http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HDFSRouterFederation.html]
>  is not found.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-16046) TestBalanceProcedureScheduler timeouts

2021-05-27 Thread Akira Ajisaka (Jira)
Akira Ajisaka created HDFS-16046:


 Summary: TestBalanceProcedureScheduler timeouts
 Key: HDFS-16046
 URL: https://issues.apache.org/jira/browse/HDFS-16046
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: rbf, test
Reporter: Akira Ajisaka


https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/520/testReport/org.apache.hadoop.tools.fedbalance.procedure/TestBalanceProcedureScheduler/testSchedulerDownAndRecoverJob/
{quote}
org.junit.runners.model.TestTimedOutException: test timed out after 6 
milliseconds
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:502)
at 
org.apache.hadoop.tools.fedbalance.procedure.BalanceJob.waitJobDone(BalanceJob.java:220)
at 
org.apache.hadoop.tools.fedbalance.procedure.BalanceProcedureScheduler.waitUntilDone(BalanceProcedureScheduler.java:189)
at 
org.apache.hadoop.tools.fedbalance.procedure.TestBalanceProcedureScheduler.testSchedulerDownAndRecoverJob(TestBalanceProcedureScheduler.java:331)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:288)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:282)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.lang.Thread.run(Thread.java:748)
{quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16040) RpcQueueTime metric counts requeued calls as unique events.

2021-05-27 Thread Konstantin Shvachko (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17352920#comment-17352920
 ] 

Konstantin Shvachko commented on HDFS-16040:


Wow - clean build.
Adding my +1 to that. Will commit shortly.

> RpcQueueTime metric counts requeued calls as unique events.
> ---
>
> Key: HDFS-16040
> URL: https://issues.apache.org/jira/browse/HDFS-16040
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.10.0, 3.3.0
>Reporter: Simbarashe Dzinamarira
>Assignee: Simbarashe Dzinamarira
>Priority: Major
> Attachments: HDFS-16040.001.patch, HDFS-16040.002.patch, 
> HDFS-16040.003.patch
>
>
> The RpcQueueTime metric is updated every time a call is re-queued while 
> waiting for the server state to reach the call's client's state ID. This is 
> in contrast to RpcProcessingTime, which is only updated when the call is 
> finally processed.
> On the Observer NameNode this can result in RpcQueueTimeNumOps being much 
> larger than RpcProcessingTimeNumOps. The re-queueing is an internal 
> optimization to avoid blocking and shouldn't result in an inflated metric.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15946) Fix java doc in FSPermissionChecker

2021-05-27 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HDFS-15946:

Issue Type: Improvement  (was: Wish)

> Fix java doc in FSPermissionChecker
> ---
>
> Key: HDFS-15946
> URL: https://issues.apache.org/jira/browse/HDFS-15946
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Fix java doc for 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker#hasAclPermission.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-15946) Fix java doc in FSPermissionChecker

2021-05-27 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena resolved HDFS-15946.
-
Fix Version/s: 3.4.0
 Hadoop Flags: Reviewed
   Resolution: Fixed

> Fix java doc in FSPermissionChecker
> ---
>
> Key: HDFS-15946
> URL: https://issues.apache.org/jira/browse/HDFS-15946
> Project: Hadoop HDFS
>  Issue Type: Wish
>Reporter: tomscut
>Assignee: tomscut
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Fix java doc for 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker#hasAclPermission.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-16002) TestJournalNodeRespectsBindHostKeys#testHttpsBindHostKey very flaky

2021-05-27 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena resolved HDFS-16002.
-
Resolution: Duplicate

This was broken by HADOOP-16524 and has now been fixed by an addendum there.

> TestJournalNodeRespectsBindHostKeys#testHttpsBindHostKey very flaky
> ---
>
> Key: HDFS-16002
> URL: https://issues.apache.org/jira/browse/HDFS-16002
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Wei-Chiu Chuang
>Priority: Major
>
> This test appears to be failing a lot lately. I suspect it has to do with the 
> new change to support reloading httpserver2 certificates, but I've not looked 
> into it.
> {noformat}
> Stacktrace
> java.lang.NullPointerException
>   at sun.nio.fs.UnixPath.normalizeAndCheck(UnixPath.java:77)
>   at sun.nio.fs.UnixPath.<init>(UnixPath.java:71)
>   at sun.nio.fs.UnixFileSystem.getPath(UnixFileSystem.java:281)
>   at java.nio.file.Paths.get(Paths.java:84)
>   at 
> org.apache.hadoop.http.HttpServer2$Builder.makeConfigurationChangeMonitor(HttpServer2.java:609)
>   at 
> org.apache.hadoop.http.HttpServer2$Builder.createHttpsChannelConnector(HttpServer2.java:592)
>   at 
> org.apache.hadoop.http.HttpServer2$Builder.build(HttpServer2.java:518)
>   at 
> org.apache.hadoop.hdfs.qjournal.server.JournalNodeHttpServer.start(JournalNodeHttpServer.java:81)
>   at 
> org.apache.hadoop.hdfs.qjournal.server.JournalNode.start(JournalNode.java:238)
>   at 
> org.apache.hadoop.hdfs.qjournal.MiniJournalCluster.<init>(MiniJournalCluster.java:120)
>   at 
> org.apache.hadoop.hdfs.qjournal.MiniJournalCluster.<init>(MiniJournalCluster.java:47)
>   at 
> org.apache.hadoop.hdfs.qjournal.MiniJournalCluster$Builder.build(MiniJournalCluster.java:79)
>   at 
> org.apache.hadoop.hdfs.qjournal.server.TestJournalNodeRespectsBindHostKeys.testHttpsBindHostKey(TestJournalNodeRespectsBindHostKeys.java:180)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:288)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:282)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.lang.Thread.run(Thread.java:748)
> {noformat}
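
For what it's worth, the top of that trace is consistent with a null keystore 
location being handed to Paths.get() when the test does not configure SSL. The 
snippet below only illustrates that failure mode (the property name is 
hypothetical); it is not the HADOOP-16524 addendum itself.

{code:java}
import java.nio.file.Path;
import java.nio.file.Paths;

public class KeystoreGuardSketch {
  public static void main(String[] args) {
    // Hypothetical stand-in for the configured keystore location; null when unset.
    String keystoreLocation = System.getProperty("keystore.location");
    if (keystoreLocation == null) {
      // Paths.get(null) fails inside UnixPath.normalizeAndCheck, matching the
      // first frames of the stack trace above, so skip the change monitor.
      System.out.println("no keystore configured; skipping change monitor");
      return;
    }
    Path monitored = Paths.get(keystoreLocation);
    System.out.println("would watch " + monitored + " for certificate reloads");
  }
}
{code}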



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15915) Race condition with async edits logging due to updating txId outside of the namesystem log

2021-05-27 Thread Konstantin Shvachko (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17352653#comment-17352653
 ] 

Konstantin Shvachko commented on HDFS-15915:


[~daryn] would appreciate your review.

> Race condition with async edits logging due to updating txId outside of the 
> namesystem log
> --
>
> Key: HDFS-15915
> URL: https://issues.apache.org/jira/browse/HDFS-15915
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, namenode
>Reporter: Konstantin Shvachko
>Assignee: Konstantin Shvachko
>Priority: Major
> Fix For: 3.4.0, 3.1.5, 2.10.2, 3.2.3, 3.3.2
>
> Attachments: HDFS-15915-01.patch, HDFS-15915-02.patch, 
> HDFS-15915-03.patch, HDFS-15915-04.patch, HDFS-15915-05.patch, 
> testMkdirsRace.patch
>
>
> {{FSEditLogAsync}} creates an {{FSEditLogOp}} and populates its fields inside 
> {{FSNamesystem.writeLock}}. But one essential field, the transaction id of the 
> edits op, remains unset until the operation is scheduled for syncing. At that 
> time {{beginTransaction()}} will set the {{FSEditLogOp.txid}} and increment the 
> global transaction count. On a busy NameNode this event can fall outside the 
> write lock. 
> This causes problems for Observer reads. It can also potentially reshuffle 
> transactions, and the Standby will apply them in the wrong order.
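
A self-contained sketch of the ordering hazard described above (plain Java, not 
HDFS code): ops are created in a well-defined order under a lock, but if their 
sequence ids are assigned later, outside that lock, the ids can come out in a 
different order.

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

public class TxIdOrderSketch {
  static final AtomicLong txId = new AtomicLong();  // stands in for the global txid

  static class Op {
    final int creationOrder;  // order in which the op was built under the lock
    volatile long assignedTxId;
    Op(int creationOrder) { this.creationOrder = creationOrder; }
  }

  public static void main(String[] args) throws InterruptedException {
    List<Op> ops = new ArrayList<>();
    Object writeLock = new Object();   // stands in for the namesystem write lock
    synchronized (writeLock) {
      ops.add(new Op(1));
      ops.add(new Op(2));
    }
    // Ids are assigned outside the lock, possibly from different threads, so the
    // op created second can end up with the smaller txid.
    Thread t1 = new Thread(() -> ops.get(1).assignedTxId = txId.incrementAndGet());
    Thread t2 = new Thread(() -> ops.get(0).assignedTxId = txId.incrementAndGet());
    t1.start(); t2.start();
    t1.join(); t2.join();
    for (Op op : ops) {
      System.out.println("creationOrder=" + op.creationOrder
          + " txid=" + op.assignedTxId);
    }
  }
}
{code}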



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15915) Race condition with async edits logging due to updating txId outside of the namesystem log

2021-05-27 Thread Daryn Sharp (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17352565#comment-17352565
 ] 

Daryn Sharp commented on HDFS-15915:


I'm very nervous about this patch and need to thoroughly reacquaint myself with 
the code. Skimming the patch, I'm initially very worried about the added 
synchronization and the potential for deadlock, particularly during an edit log 
roll. We're in the midst of an upgrade cycle so I likely won't have time to 
review till early next, but in the meantime we will internally revert due to 
risk...

> Race condition with async edits logging due to updating txId outside of the 
> namesystem log
> --
>
> Key: HDFS-15915
> URL: https://issues.apache.org/jira/browse/HDFS-15915
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, namenode
>Reporter: Konstantin Shvachko
>Assignee: Konstantin Shvachko
>Priority: Major
> Fix For: 3.4.0, 3.1.5, 2.10.2, 3.2.3, 3.3.2
>
> Attachments: HDFS-15915-01.patch, HDFS-15915-02.patch, 
> HDFS-15915-03.patch, HDFS-15915-04.patch, HDFS-15915-05.patch, 
> testMkdirsRace.patch
>
>
> {{FSEditLogAsync}} creates an {{FSEditLogOp}} and populates its fields inside 
> {{FSNamesystem.writeLock}}. But one essential field, the transaction id of the 
> edits op, remains unset until the operation is scheduled for syncing. At that 
> time {{beginTransaction()}} will set the {{FSEditLogOp.txid}} and increment the 
> global transaction count. On a busy NameNode this event can fall outside the 
> write lock. 
> This causes problems for Observer reads. It can also potentially reshuffle 
> transactions, and the Standby will apply them in the wrong order.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-13522) RBF: Support observer node from Router-Based Federation

2021-05-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13522?focusedWorklogId=603055&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-603055
 ]

ASF GitHub Bot logged work on HDFS-13522:
-

Author: ASF GitHub Bot
Created on: 27/May/21 14:51
Start Date: 27/May/21 14:51
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3005:
URL: https://github.com/apache/hadoop/pull/3005#issuecomment-849700352


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   1m  0s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 11 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m 20s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  28m  1s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  28m 36s |  |  trunk passed with JDK 
Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  compile  |  24m 11s |  |  trunk passed with JDK 
Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   5m 14s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   5m 58s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   4m 17s |  |  trunk passed with JDK 
Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   5m 43s |  |  trunk passed with JDK 
Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |  11m 36s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  17m  5s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 28s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   4m 11s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  22m 49s |  |  the patch passed with JDK 
Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javac  |  22m 49s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  18m  9s |  |  the patch passed with JDK 
Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08  |
   | +1 :green_heart: |  javac  |  18m  9s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   3m 43s | 
[/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3005/14/artifact/out/results-checkstyle-root.txt)
 |  root: The patch generated 8 new + 439 unchanged - 1 fixed = 447 total (was 
440)  |
   | +1 :green_heart: |  mvnsite  |   5m 21s |  |  the patch passed  |
   | +1 :green_heart: |  xml  |   0m  3s |  |  The patch has no ill-formed XML 
file.  |
   | +1 :green_heart: |  javadoc  |   4m  8s |  |  the patch passed with JDK 
Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   5m 31s |  |  the patch passed with JDK 
Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |  10m 38s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  14m 56s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  17m 19s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   2m 40s |  |  hadoop-hdfs-client in the patch 
passed.  |
   | -1 :x: |  unit  | 384m 12s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3005/14/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | -1 :x: |  unit  |  30m 12s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3005/14/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt)
 |  hadoop-hdfs-rbf in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 59s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 675m 26s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.TestDFSShell |
   |   | hadoop.hdfs.server.namenode.ha.TestBootstrapStandby |
   |   | hadoop.hdfs.server.namenode.TestDecommissioningStatus |
   |   | 
hadoop.fs.viewfs.TestViewFileSystemOverloadSchemeHdfsFileSystemContract |
   |   | hadoop.hdfs.server.federation.router.TestObserverWithRouter

[jira] [Commented] (HDFS-16042) DatanodeAdminMonitor scan should be delay based

2021-05-27 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17352535#comment-17352535
 ] 

Ahmed Hussein commented on HDFS-16042:
--

Hey [~Jim_Brennan], can you please take a look at [GitHub Pull Request 
#3058|https://github.com/apache/hadoop/pull/3058]. It is a pretty straightforward 
change.

> DatanodeAdminMonitor scan should be delay based
> ---
>
> Key: HDFS-16042
> URL: https://issues.apache.org/jira/browse/HDFS-16042
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> In {{DatanodeAdminManager.activate()}}, the Monitor task is scheduled at a 
> fixed rate, i.e. the period runs from start1 -> start2.  
> {code:java}
> executor.scheduleAtFixedRate(monitor, intervalSecs, intervalSecs,
>TimeUnit.SECONDS);
> {code}
> According to Java API docs for {{scheduleAtFixedRate}},
> {quote}If any execution of this task takes longer than its period, then 
> subsequent executions may start late, but will not concurrently 
> execute.{quote}
> It should be a fixed delay, so the period runs from end1 -> start2.
>  
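
A minimal sketch of the suggested alternative (JDK API only, illustrative rather 
than the actual patch): scheduleWithFixedDelay measures the interval from the end 
of one run to the start of the next, so a scan that overruns its interval cannot 
cause back-to-back executions.

{code:java}
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class FixedDelaySketch {
  public static void main(String[] args) {
    ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
    Runnable monitor = () -> System.out.println("scan at " + System.currentTimeMillis());
    long intervalSecs = 30L;  // stand-in for the configured scan interval
    // The delay is measured end1 -> start2, so a long scan simply pushes the next
    // scan out instead of letting runs start immediately back to back.
    executor.scheduleWithFixedDelay(monitor, intervalSecs, intervalSecs, TimeUnit.SECONDS);
  }
}
{code}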



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16045) FileSystem.CACHE memory leak

2021-05-27 Thread Xiaoqiao He (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17352527#comment-17352527
 ] 

Xiaoqiao He commented on HDFS-16045:


Thanks [~zhuxiangyi] for raising this issue here. I think it may not be enough to 
use only the username to index the FileSystem instance. Consider different 
principals with the same username: they would be indexed to the same FileSystem 
instance, with no difference when requesting the Server, but the UGIs for those 
FileSystem instances should actually not be the same for different users. This 
could be a common case, especially for Presto. I have not thought about it 
deeply; any discussion is welcome. Thanks again.

> FileSystem.CACHE memory leak
> 
>
> Key: HDFS-16045
> URL: https://issues.apache.org/jira/browse/HDFS-16045
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.4.0
>Reporter: Xiangyi Zhu
>Priority: Major
>
> {code:java}
> FileSystem get(final URI uri, final Configuration conf,
>  final String user){code}
> When the client has the cache enabled and uses the above API to create a 
> FileSystem instance for a specified user, the cache is effectively bypassed.
> A new UGI is created every time a FileSystem instance is created for the 
> specified user, and the cache compares entries by UGI.
> {code:java}
> public int hashCode() {
>  return (scheme + authority).hashCode() + ugi.hashCode() + (int)unique;
> }{code}
> Could the username be used instead of the UGI for the comparison, and would 
> that introduce other risks?
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15714) HDFS Provided Storage Read/Write Mount Support On-the-fly

2021-05-27 Thread Bhavik Patel (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17352466#comment-17352466
 ] 

Bhavik Patel commented on HDFS-15714:
-

[~PhiloHe] "metadata wrapped in remote file status, like modification time, 
access time, permission, etc, HDFS will create its INode file accordingly and set 
Provided Storage type." ==> Are we persisting/maintaining the "Modification time" 
of the object at the HDFS end? If yes, could you please provide more details on 
where/how we are maintaining it?


> HDFS Provided Storage Read/Write Mount Support On-the-fly
> -
>
> Key: HDFS-15714
> URL: https://issues.apache.org/jira/browse/HDFS-15714
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, namenode
>Affects Versions: 3.4.0
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-15714-01.patch, 
> HDFS_Provided_Storage_Design-V1.pdf, HDFS_Provided_Storage_Performance-V1.pdf
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> HDFS Provided Storage (PS) is a feature to tier HDFS over other file systems. 
> In HDFS-9806, PROVIDED storage type was introduced to HDFS. Through 
> configuring external storage with PROVIDED tag for DataNode, user can enable 
> application to access data stored externally from the HDFS side. However, there 
> are two issues that need to be addressed. Firstly, mounting external storage 
> on-the-fly, namely dynamic mount, is lacking. It is necessary to get it 
> supported to flexibly combine HDFS with an external storage at runtime. 
> Secondly, PS write is not supported by current HDFS. But in real 
> applications, it is common to transfer data bi-directionally for read/write 
> between HDFS and external storage.
> Through this JIRA, we are presenting our work for PS write support and 
> dynamic mount support for both read & write. Please note in the community 
> several JIRAs have been filed for these topics. Our work is based on these 
> previous community work, with new design & implementation to support called 
> writeBack mount and enable admin to add any mount on-the-fly. We appreciate 
> those folks in the community for their great contribution! See their pending 
> JIRAs: HDFS-14805 & HDFS-12090.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-16045) FileSystem.CACHE memory leak

2021-05-27 Thread Xiangyi Zhu (Jira)
Xiangyi Zhu created HDFS-16045:
--

 Summary: FileSystem.CACHE memory leak
 Key: HDFS-16045
 URL: https://issues.apache.org/jira/browse/HDFS-16045
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs
Affects Versions: 3.4.0
Reporter: Xiangyi Zhu


{code:java}
FileSystem get(final URI uri, final Configuration conf,
 final String user){code}
When the client has the cache enabled and uses the above API to create a 
FileSystem instance for a specified user, the cache is effectively bypassed.

A new UGI is created every time a FileSystem instance is created for the 
specified user, and the cache compares entries by UGI.
{code:java}
public int hashCode() {
 return (scheme + authority).hashCode() + ugi.hashCode() + (int)unique;
}{code}
Could the username be used instead of the UGI for the comparison, and would that 
introduce other risks?

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16044) getListing call getLocatedBlocks even source is a directory

2021-05-27 Thread ludun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ludun updated HDFS-16044:
-
Description: 
In our production cluster, getListing is called very frequently and the RPC 
processing time is very high, so we tried to optimize the performance of the 
getListing request.
After some checking, we found that even when the source and its children are 
directories, the getListing request still calls getLocatedBlocks. 

The request is shown below, and needLocation is false:

{code:java}
2021-05-27 15:16:07,093 TRACE ipc.ProtobufRpcEngine: 1: Call -> 
8-5-231-4/8.5.231.4:25000: getListing {src: 
"/data/connector/test/topics/102test" startAfter: "" needLocation: false}
{code}

but the getListing request calls getLocatedBlocks 1000 times, which is not needed:

{code:java}
`---ts=2021-05-27 14:19:15;thread_name=IPC Server handler 86 on 
25000;id=e6;is_daemon=true;priority=5;TCCL=sun.misc.Launcher$AppClassLoader@5fcfe4b2
`---[35.068532ms] 
org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp:getListing()
+---[0.003542ms] 
org.apache.hadoop.hdfs.server.namenode.INodesInPath:getPathComponents() #214
+---[0.003053ms] 
org.apache.hadoop.hdfs.server.namenode.FSDirectory:isExactReservedName() #95
+---[0.002938ms] 
org.apache.hadoop.hdfs.server.namenode.FSDirectory:readLock() #218
+---[0.00252ms] 
org.apache.hadoop.hdfs.server.namenode.INodesInPath:isDotSnapshotDir() #220
+---[0.002788ms] 
org.apache.hadoop.hdfs.server.namenode.INodesInPath:getPathSnapshotId() #223
+---[0.002905ms] 
org.apache.hadoop.hdfs.server.namenode.INodesInPath:getLastINode() #224
+---[0.002785ms] 
org.apache.hadoop.hdfs.server.namenode.INode:getStoragePolicyID() #230
+---[0.002236ms] 
org.apache.hadoop.hdfs.server.namenode.INode:isDirectory() #233
+---[0.002919ms] 
org.apache.hadoop.hdfs.server.namenode.INode:asDirectory() #242
+---[0.003408ms] 
org.apache.hadoop.hdfs.server.namenode.INodeDirectory:getChildrenList() #243
+---[0.005942ms] 
org.apache.hadoop.hdfs.server.namenode.INodeDirectory:nextChild() #244
+---[0.002467ms] org.apache.hadoop.hdfs.util.ReadOnlyList:size() #245
+---[0.005481ms] 
org.apache.hadoop.hdfs.server.namenode.FSDirectory:getLsLimit() #247
+---[0.002176ms] 
org.apache.hadoop.hdfs.server.namenode.FSDirectory:getLsLimit() #248
+---[min=0.00211ms,max=0.005157ms,total=2.247572ms,count=1000] 
org.apache.hadoop.hdfs.util.ReadOnlyList:get() #252
+---[min=0.001946ms,max=0.005411ms,total=2.041715ms,count=1000] 
org.apache.hadoop.hdfs.server.namenode.INode:isSymlink() #253
+---[min=0.002176ms,max=0.005426ms,total=2.264472ms,count=1000] 
org.apache.hadoop.hdfs.server.namenode.INode:getLocalStoragePolicyID() #254
+---[min=0.002251ms,max=0.006849ms,total=2.351935ms,count=1000] 
org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp:getStoragePolicyID()
 #95
+---[min=0.006091ms,max=0.012333ms,total=6.439434ms,count=1000] 
org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp:createFileStatus() 
#257
+---[min=0.00269ms,max=0.004995ms,total=2.788194ms,count=1000] 
org.apache.hadoop.hdfs.protocol.HdfsLocatedFileStatus:getLocatedBlocks() #265
+---[0.003234ms] 
org.apache.hadoop.hdfs.protocol.DirectoryListing:<init>() #274
`---[0.002457ms] 
org.apache.hadoop.hdfs.server.namenode.FSDirectory:readUnlock() #277
{code}


  was:
In our production cluster, getListing is called very frequently and the RPC 
processing time is very high, so we tried to optimize the performance of the 
getListing request.
After some checking, we found that even when the source and its children are 
directories, the getListing request still calls getLocatedBlocks. 

The request is shown below, and needLocation is false:

{code:java}
2021-05-27 15:16:07,093 TRACE ipc.ProtobufRpcEngine: 1: Call -> 
8-5-231-4/8.5.231.4:25000: getListing {src: 
"/data/connector/test/topics/102test" startAfter: "" needLocation: false}
{code}



{code:java}
`---ts=2021-05-27 14:19:15;thread_name=IPC Server handler 86 on 
25000;id=e6;is_daemon=true;priority=5;TCCL=sun.misc.Launcher$AppClassLoader@5fcfe4b2
`---[35.068532ms] 
org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp:getListing()
+---[0.003542ms] 
org.apache.hadoop.hdfs.server.namenode.INodesInPath:getPathComponents() #214
+---[0.003053ms] 
org.apache.hadoop.hdfs.server.namenode.FSDirectory:isExactReservedName() #95
+---[0.002938ms] 
org.apache.hadoop.hdfs.server.namenode.FSDirectory:readLock() #218
+---[0.00252ms] 
org.apache.hadoop.hdfs.server.namenode.INodesInPath:isDotSnapshotDir() #220
+---[0.002788ms] 
org.apache.hadoop.hdfs.server.namenode.INodesInPath:getPathSnapshotId() #223
+---[0.002905ms] 
org.apache.hadoop.hdfs.server.namenode.INodesInPath:getLastINode() #224
+---[0.002785ms] 
org.apache.hadoop.hdfs.server.namenode.INode:getStoragePolicyID() #230
+--

[jira] [Updated] (HDFS-16044) getListing call getLocatedBlocks even source is a directory

2021-05-27 Thread ludun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ludun updated HDFS-16044:
-
Description: 
In our production cluster, getListing is called very frequently and the RPC 
processing time is very high, so we tried to optimize the performance of the 
getListing request.
After some checking, we found that even when the source and its children are 
directories, the getListing request still calls getLocatedBlocks. 

The request is shown below, and needLocation is false:

{code:java}
2021-05-27 15:16:07,093 TRACE ipc.ProtobufRpcEngine: 1: Call -> 
8-5-231-4/8.5.231.4:25000: getListing {src: 
"/data/connector/test/topics/102test" startAfter: "" needLocation: false}
{code}



{code:java}
`---ts=2021-05-27 14:19:15;thread_name=IPC Server handler 86 on 
25000;id=e6;is_daemon=true;priority=5;TCCL=sun.misc.Launcher$AppClassLoader@5fcfe4b2
`---[35.068532ms] 
org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp:getListing()
+---[0.003542ms] 
org.apache.hadoop.hdfs.server.namenode.INodesInPath:getPathComponents() #214
+---[0.003053ms] 
org.apache.hadoop.hdfs.server.namenode.FSDirectory:isExactReservedName() #95
+---[0.002938ms] 
org.apache.hadoop.hdfs.server.namenode.FSDirectory:readLock() #218
+---[0.00252ms] 
org.apache.hadoop.hdfs.server.namenode.INodesInPath:isDotSnapshotDir() #220
+---[0.002788ms] 
org.apache.hadoop.hdfs.server.namenode.INodesInPath:getPathSnapshotId() #223
+---[0.002905ms] 
org.apache.hadoop.hdfs.server.namenode.INodesInPath:getLastINode() #224
+---[0.002785ms] 
org.apache.hadoop.hdfs.server.namenode.INode:getStoragePolicyID() #230
+---[0.002236ms] 
org.apache.hadoop.hdfs.server.namenode.INode:isDirectory() #233
+---[0.002919ms] 
org.apache.hadoop.hdfs.server.namenode.INode:asDirectory() #242
+---[0.003408ms] 
org.apache.hadoop.hdfs.server.namenode.INodeDirectory:getChildrenList() #243
+---[0.005942ms] 
org.apache.hadoop.hdfs.server.namenode.INodeDirectory:nextChild() #244
+---[0.002467ms] org.apache.hadoop.hdfs.util.ReadOnlyList:size() #245
+---[0.005481ms] 
org.apache.hadoop.hdfs.server.namenode.FSDirectory:getLsLimit() #247
+---[0.002176ms] 
org.apache.hadoop.hdfs.server.namenode.FSDirectory:getLsLimit() #248
+---[min=0.00211ms,max=0.005157ms,total=2.247572ms,count=1000] 
org.apache.hadoop.hdfs.util.ReadOnlyList:get() #252
+---[min=0.001946ms,max=0.005411ms,total=2.041715ms,count=1000] 
org.apache.hadoop.hdfs.server.namenode.INode:isSymlink() #253
+---[min=0.002176ms,max=0.005426ms,total=2.264472ms,count=1000] 
org.apache.hadoop.hdfs.server.namenode.INode:getLocalStoragePolicyID() #254
+---[min=0.002251ms,max=0.006849ms,total=2.351935ms,count=1000] 
org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp:getStoragePolicyID()
 #95
+---[min=0.006091ms,max=0.012333ms,total=6.439434ms,count=1000] 
org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp:createFileStatus() 
#257
+---[min=0.00269ms,max=0.004995ms,total=2.788194ms,count=1000] 
org.apache.hadoop.hdfs.protocol.HdfsLocatedFileStatus:getLocatedBlocks() #265
+---[0.003234ms] 
org.apache.hadoop.hdfs.protocol.DirectoryListing:<init>() #274
`---[0.002457ms] 
org.apache.hadoop.hdfs.server.namenode.FSDirectory:readUnlock() #277
{code}


  was:
In our production cluster, getListing is called very frequently and the RPC 
processing time is very high, so we tried to optimize the performance of the 
getListing request.
After some checking, we found that even when the source and its children are 
directories, the getListing request still calls getLocatedBlocks. 

{code:java}
`---ts=2021-05-27 14:19:15;thread_name=IPC Server handler 86 on 
25000;id=e6;is_daemon=true;priority=5;TCCL=sun.misc.Launcher$AppClassLoader@5fcfe4b2
`---[35.068532ms] 
org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp:getListing()
+---[0.003542ms] 
org.apache.hadoop.hdfs.server.namenode.INodesInPath:getPathComponents() #214
+---[0.003053ms] 
org.apache.hadoop.hdfs.server.namenode.FSDirectory:isExactReservedName() #95
+---[0.002938ms] 
org.apache.hadoop.hdfs.server.namenode.FSDirectory:readLock() #218
+---[0.00252ms] 
org.apache.hadoop.hdfs.server.namenode.INodesInPath:isDotSnapshotDir() #220
+---[0.002788ms] 
org.apache.hadoop.hdfs.server.namenode.INodesInPath:getPathSnapshotId() #223
+---[0.002905ms] 
org.apache.hadoop.hdfs.server.namenode.INodesInPath:getLastINode() #224
+---[0.002785ms] 
org.apache.hadoop.hdfs.server.namenode.INode:getStoragePolicyID() #230
+---[0.002236ms] 
org.apache.hadoop.hdfs.server.namenode.INode:isDirectory() #233
+---[0.002919ms] 
org.apache.hadoop.hdfs.server.namenode.INode:asDirectory() #242
+---[0.003408ms] 
org.apache.hadoop.hdfs.server.namenode.INodeDirectory:getChildrenList() #243
+---[0.005942ms] 
org.apache.hadoop.hdfs.server.namenode.INodeDirectory:nextChild() #244
...
{code}

[jira] [Updated] (HDFS-16044) getListing call getLocatedBlocks even source is a directory

2021-05-27 Thread ludun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ludun updated HDFS-16044:
-
Attachment: HDFS-16044.00.patch

> getListing call getLocatedBlocks even source is a directory
> ---
>
> Key: HDFS-16044
> URL: https://issues.apache.org/jira/browse/HDFS-16044
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: ludun
>Assignee: ludun
>Priority: Major
> Attachments: HDFS-16044.00.patch
>
>
> In our production cluster, getListing is called very frequently and the RPC 
> processing time is very high, so we tried to optimize the performance of the 
> getListing request.
> After some investigation, we found that even when the source and its children 
> are directories, the getListing request still calls getLocatedBlocks. 
> {code:java}
> `---ts=2021-05-27 14:19:15;thread_name=IPC Server handler 86 on 
> 25000;id=e6;is_daemon=true;priority=5;TCCL=sun.misc.Launcher$AppClassLoader@5fcfe4b2
> `---[35.068532ms] 
> org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp:getListing()
> +---[0.003542ms] 
> org.apache.hadoop.hdfs.server.namenode.INodesInPath:getPathComponents() #214
> +---[0.003053ms] 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory:isExactReservedName() #95
> +---[0.002938ms] 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory:readLock() #218
> +---[0.00252ms] 
> org.apache.hadoop.hdfs.server.namenode.INodesInPath:isDotSnapshotDir() #220
> +---[0.002788ms] 
> org.apache.hadoop.hdfs.server.namenode.INodesInPath:getPathSnapshotId() #223
> +---[0.002905ms] 
> org.apache.hadoop.hdfs.server.namenode.INodesInPath:getLastINode() #224
> +---[0.002785ms] 
> org.apache.hadoop.hdfs.server.namenode.INode:getStoragePolicyID() #230
> +---[0.002236ms] 
> org.apache.hadoop.hdfs.server.namenode.INode:isDirectory() #233
> +---[0.002919ms] 
> org.apache.hadoop.hdfs.server.namenode.INode:asDirectory() #242
> +---[0.003408ms] 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory:getChildrenList() #243
> +---[0.005942ms] 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory:nextChild() #244
> +---[0.002467ms] org.apache.hadoop.hdfs.util.ReadOnlyList:size() #245
> +---[0.005481ms] 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory:getLsLimit() #247
> +---[0.002176ms] 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory:getLsLimit() #248
> +---[min=0.00211ms,max=0.005157ms,total=2.247572ms,count=1000] 
> org.apache.hadoop.hdfs.util.ReadOnlyList:get() #252
> +---[min=0.001946ms,max=0.005411ms,total=2.041715ms,count=1000] 
> org.apache.hadoop.hdfs.server.namenode.INode:isSymlink() #253
> +---[min=0.002176ms,max=0.005426ms,total=2.264472ms,count=1000] 
> org.apache.hadoop.hdfs.server.namenode.INode:getLocalStoragePolicyID() #254
> +---[min=0.002251ms,max=0.006849ms,total=2.351935ms,count=1000] 
> org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp:getStoragePolicyID()
>  #95
> +---[min=0.006091ms,max=0.012333ms,total=6.439434ms,count=1000] 
> org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp:createFileStatus()
>  #257
> +---[min=0.00269ms,max=0.004995ms,total=2.788194ms,count=1000] 
> org.apache.hadoop.hdfs.protocol.HdfsLocatedFileStatus:getLocatedBlocks() #265
> +---[0.003234ms] 
> org.apache.hadoop.hdfs.protocol.DirectoryListing:<init>() #274
> `---[0.002457ms] 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory:readUnlock() #277
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16044) getListing call getLocatedBlocks even source is a directory

2021-05-27 Thread ludun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17352349#comment-17352349
 ] 

ludun commented on HDFS-16044:
--

[~brahma], if you have time, please take a look at this issue.

> getListing call getLocatedBlocks even source is a directory
> ---
>
> Key: HDFS-16044
> URL: https://issues.apache.org/jira/browse/HDFS-16044
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: ludun
>Assignee: ludun
>Priority: Major
>
> In our production cluster, getListing is called very frequently and the RPC 
> processing time is very high, so we tried to optimize the performance of the 
> getListing request.
> After some investigation, we found that even when the source and its children 
> are directories, the getListing request still calls getLocatedBlocks. 
> {code:java}
> `---ts=2021-05-27 14:19:15;thread_name=IPC Server handler 86 on 
> 25000;id=e6;is_daemon=true;priority=5;TCCL=sun.misc.Launcher$AppClassLoader@5fcfe4b2
> `---[35.068532ms] 
> org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp:getListing()
> +---[0.003542ms] 
> org.apache.hadoop.hdfs.server.namenode.INodesInPath:getPathComponents() #214
> +---[0.003053ms] 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory:isExactReservedName() #95
> +---[0.002938ms] 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory:readLock() #218
> +---[0.00252ms] 
> org.apache.hadoop.hdfs.server.namenode.INodesInPath:isDotSnapshotDir() #220
> +---[0.002788ms] 
> org.apache.hadoop.hdfs.server.namenode.INodesInPath:getPathSnapshotId() #223
> +---[0.002905ms] 
> org.apache.hadoop.hdfs.server.namenode.INodesInPath:getLastINode() #224
> +---[0.002785ms] 
> org.apache.hadoop.hdfs.server.namenode.INode:getStoragePolicyID() #230
> +---[0.002236ms] 
> org.apache.hadoop.hdfs.server.namenode.INode:isDirectory() #233
> +---[0.002919ms] 
> org.apache.hadoop.hdfs.server.namenode.INode:asDirectory() #242
> +---[0.003408ms] 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory:getChildrenList() #243
> +---[0.005942ms] 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory:nextChild() #244
> +---[0.002467ms] org.apache.hadoop.hdfs.util.ReadOnlyList:size() #245
> +---[0.005481ms] 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory:getLsLimit() #247
> +---[0.002176ms] 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory:getLsLimit() #248
> +---[min=0.00211ms,max=0.005157ms,total=2.247572ms,count=1000] 
> org.apache.hadoop.hdfs.util.ReadOnlyList:get() #252
> +---[min=0.001946ms,max=0.005411ms,total=2.041715ms,count=1000] 
> org.apache.hadoop.hdfs.server.namenode.INode:isSymlink() #253
> +---[min=0.002176ms,max=0.005426ms,total=2.264472ms,count=1000] 
> org.apache.hadoop.hdfs.server.namenode.INode:getLocalStoragePolicyID() #254
> +---[min=0.002251ms,max=0.006849ms,total=2.351935ms,count=1000] 
> org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp:getStoragePolicyID()
>  #95
> +---[min=0.006091ms,max=0.012333ms,total=6.439434ms,count=1000] 
> org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp:createFileStatus()
>  #257
> +---[min=0.00269ms,max=0.004995ms,total=2.788194ms,count=1000] 
> org.apache.hadoop.hdfs.protocol.HdfsLocatedFileStatus:getLocatedBlocks() #265
> +---[0.003234ms] 
> org.apache.hadoop.hdfs.protocol.DirectoryListing:<init>() #274
> `---[0.002457ms] 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory:readUnlock() #277
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16044) getListing call getLocatedBlocks even source is a directory

2021-05-27 Thread ludun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17352348#comment-17352348
 ] 

ludun commented on HDFS-16044:
--

Checking the code shows that getLocatedBlocks is called in 
FSDirStatAndListingOp#getListing: 
{code:java}
for (int i = 0; i < numOfListing && locationBudget > 0; i++) {
  INode child = contents.get(startChild + i);
  byte childStoragePolicy = (includeStoragePolicy && !child.isSymlink())
      ? getStoragePolicyID(child.getLocalStoragePolicyID(),
          parentStoragePolicy)
      : parentStoragePolicy;
  listing[i] = createFileStatus(fsd, iip, child, childStoragePolicy,
      needLocation, false);
  listingCnt++;
  if (listing[i] instanceof HdfsLocatedFileStatus) {
    // Once we hit lsLimit locations, stop.
    // This helps to prevent excessively large response payloads.
    // Approximate #locations with locatedBlockCount() * repl_factor
    LocatedBlocks blks =
        ((HdfsLocatedFileStatus) listing[i]).getLocatedBlocks();
    locationBudget -= (blks == null) ? 0 :
        blks.locatedBlockCount() * listing[i].getReplication();
  }
}
{code}

Whether that branch is taken depends on the return value of createFileStatus, 
which is constructed in HdfsFileStatus#build: 
{code:java}
public HdfsFileStatus build() {
  if (null == locations && !isdir && null == symlink && !locatedStatus) {
    return new HdfsNamedFileStatus(length, isdir, replication, blocksize,
        mtime, atime, permission, flags, owner, group, symlink, path,
        fileId, childrenNum, feInfo, storagePolicy, ecPolicy);
  }
  return new HdfsLocatedFileStatus(length, isdir, replication, blocksize,
      mtime, atime, permission, flags, owner, group, symlink, path,
      fileId, childrenNum, feInfo, storagePolicy, ecPolicy, locations);
}
{code}

When isdir is true, it should return HdfsNamedFileStatus, not 
HdfsLocatedFileStatus.
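
A minimal sketch of that idea, reusing the field and constructor arguments from the build() snippet above (this is only an illustration, not the attached patch): treat a directory as never located, so build() returns HdfsNamedFileStatus even if a located status was requested.

{code:java}
// Sketch only: directories carry no block locations, so return the
// non-located status whenever isdir is true, regardless of locatedStatus.
public HdfsFileStatus build() {
  if (isdir
      || (null == locations && null == symlink && !locatedStatus)) {
    return new HdfsNamedFileStatus(length, isdir, replication, blocksize,
        mtime, atime, permission, flags, owner, group, symlink, path,
        fileId, childrenNum, feInfo, storagePolicy, ecPolicy);
  }
  return new HdfsLocatedFileStatus(length, isdir, replication, blocksize,
      mtime, atime, permission, flags, owner, group, symlink, path,
      fileId, childrenNum, feInfo, storagePolicy, ecPolicy, locations);
}
{code}

With that change, a directory entry in the getListing loop above would never be an instanceof HdfsLocatedFileStatus, so the extra getLocatedBlocks() call and the location-budget accounting would be skipped for it.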




> getListing call getLocatedBlocks even source is a directory
> ---
>
> Key: HDFS-16044
> URL: https://issues.apache.org/jira/browse/HDFS-16044
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: ludun
>Assignee: ludun
>Priority: Major
>
> In our production cluster, getListing is called very frequently and the RPC 
> processing time is very high, so we tried to optimize the performance of the 
> getListing request.
> After some investigation, we found that even when the source and its children 
> are directories, the getListing request still calls getLocatedBlocks. 
> {code:java}
> `---ts=2021-05-27 14:19:15;thread_name=IPC Server handler 86 on 
> 25000;id=e6;is_daemon=true;priority=5;TCCL=sun.misc.Launcher$AppClassLoader@5fcfe4b2
> `---[35.068532ms] 
> org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp:getListing()
> +---[0.003542ms] 
> org.apache.hadoop.hdfs.server.namenode.INodesInPath:getPathComponents() #214
> +---[0.003053ms] 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory:isExactReservedName() #95
> +---[0.002938ms] 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory:readLock() #218
> +---[0.00252ms] 
> org.apache.hadoop.hdfs.server.namenode.INodesInPath:isDotSnapshotDir() #220
> +---[0.002788ms] 
> org.apache.hadoop.hdfs.server.namenode.INodesInPath:getPathSnapshotId() #223
> +---[0.002905ms] 
> org.apache.hadoop.hdfs.server.namenode.INodesInPath:getLastINode() #224
> +---[0.002785ms] 
> org.apache.hadoop.hdfs.server.namenode.INode:getStoragePolicyID() #230
> +---[0.002236ms] 
> org.apache.hadoop.hdfs.server.namenode.INode:isDirectory() #233
> +---[0.002919ms] 
> org.apache.hadoop.hdfs.server.namenode.INode:asDirectory() #242
> +---[0.003408ms] 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory:getChildrenList() #243
> +---[0.005942ms] 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory:nextChild() #244
> +---[0.002467ms] org.apache.hadoop.hdfs.util.ReadOnlyList:size() #245
> +---[0.005481ms] 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory:getLsLimit() #247
> +---[0.002176ms] 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory:getLsLimit() #248
> +---[min=0.00211ms,max=0.005157ms,total=2.247572ms,count=1000] 
> org.apache.hadoop.hdfs.util.ReadOnlyList:get() #252
> +---[min=0.001946ms,max=0.005411ms,total=2.041715ms,count=1000] 
> org.apache.hadoop.hdfs.server.namenode.INode:isSymlink() #253
> +---[min=0.002176ms,max=0.005426ms,total=2.264472ms,count=1000] 
> org.apache.hadoop.hdfs.server.namenode.INode:getLocalStoragePolicyID() #254
> +---[min=0.002251ms,max=0.006849ms,total=2.351935ms,count=1000] 
> org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp:getStoragePolicyID()
>  #95
> ...
> {code}

[jira] [Created] (HDFS-16044) getListing call getLocatedBlocks even source is a directory

2021-05-27 Thread ludun (Jira)
ludun created HDFS-16044:


 Summary: getListing call getLocatedBlocks even source is a 
directory
 Key: HDFS-16044
 URL: https://issues.apache.org/jira/browse/HDFS-16044
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: ludun


In our production cluster, getListing is called very frequently and the RPC 
processing time is very high, so we tried to optimize the performance of the 
getListing request.
After some investigation, we found that even when the source and its children 
are directories, the getListing request still calls getLocatedBlocks. 

{code:java}
`---ts=2021-05-27 14:19:15;thread_name=IPC Server handler 86 on 
25000;id=e6;is_daemon=true;priority=5;TCCL=sun.misc.Launcher$AppClassLoader@5fcfe4b2
`---[35.068532ms] 
org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp:getListing()
+---[0.003542ms] 
org.apache.hadoop.hdfs.server.namenode.INodesInPath:getPathComponents() #214
+---[0.003053ms] 
org.apache.hadoop.hdfs.server.namenode.FSDirectory:isExactReservedName() #95
+---[0.002938ms] 
org.apache.hadoop.hdfs.server.namenode.FSDirectory:readLock() #218
+---[0.00252ms] 
org.apache.hadoop.hdfs.server.namenode.INodesInPath:isDotSnapshotDir() #220
+---[0.002788ms] 
org.apache.hadoop.hdfs.server.namenode.INodesInPath:getPathSnapshotId() #223
+---[0.002905ms] 
org.apache.hadoop.hdfs.server.namenode.INodesInPath:getLastINode() #224
+---[0.002785ms] 
org.apache.hadoop.hdfs.server.namenode.INode:getStoragePolicyID() #230
+---[0.002236ms] 
org.apache.hadoop.hdfs.server.namenode.INode:isDirectory() #233
+---[0.002919ms] 
org.apache.hadoop.hdfs.server.namenode.INode:asDirectory() #242
+---[0.003408ms] 
org.apache.hadoop.hdfs.server.namenode.INodeDirectory:getChildrenList() #243
+---[0.005942ms] 
org.apache.hadoop.hdfs.server.namenode.INodeDirectory:nextChild() #244
+---[0.002467ms] org.apache.hadoop.hdfs.util.ReadOnlyList:size() #245
+---[0.005481ms] 
org.apache.hadoop.hdfs.server.namenode.FSDirectory:getLsLimit() #247
+---[0.002176ms] 
org.apache.hadoop.hdfs.server.namenode.FSDirectory:getLsLimit() #248
+---[min=0.00211ms,max=0.005157ms,total=2.247572ms,count=1000] 
org.apache.hadoop.hdfs.util.ReadOnlyList:get() #252
+---[min=0.001946ms,max=0.005411ms,total=2.041715ms,count=1000] 
org.apache.hadoop.hdfs.server.namenode.INode:isSymlink() #253
+---[min=0.002176ms,max=0.005426ms,total=2.264472ms,count=1000] 
org.apache.hadoop.hdfs.server.namenode.INode:getLocalStoragePolicyID() #254
+---[min=0.002251ms,max=0.006849ms,total=2.351935ms,count=1000] 
org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp:getStoragePolicyID()
 #95
+---[min=0.006091ms,max=0.012333ms,total=6.439434ms,count=1000] 
org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp:createFileStatus() 
#257
+---[min=0.00269ms,max=0.004995ms,total=2.788194ms,count=1000] 
org.apache.hadoop.hdfs.protocol.HdfsLocatedFileStatus:getLocatedBlocks() #265
+---[0.003234ms] 
org.apache.hadoop.hdfs.protocol.DirectoryListing:<init>() #274
`---[0.002457ms] 
org.apache.hadoop.hdfs.server.namenode.FSDirectory:readUnlock() #277
{code}




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-16044) getListing call getLocatedBlocks even source is a directory

2021-05-27 Thread ludun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ludun reassigned HDFS-16044:


Assignee: ludun

> getListing call getLocatedBlocks even source is a directory
> ---
>
> Key: HDFS-16044
> URL: https://issues.apache.org/jira/browse/HDFS-16044
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: ludun
>Assignee: ludun
>Priority: Major
>
> In our production cluster, getListing is called very frequently and the RPC 
> processing time is very high, so we tried to optimize the performance of the 
> getListing request.
> After some investigation, we found that even when the source and its children 
> are directories, the getListing request still calls getLocatedBlocks. 
> {code:java}
> `---ts=2021-05-27 14:19:15;thread_name=IPC Server handler 86 on 
> 25000;id=e6;is_daemon=true;priority=5;TCCL=sun.misc.Launcher$AppClassLoader@5fcfe4b2
> `---[35.068532ms] 
> org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp:getListing()
> +---[0.003542ms] 
> org.apache.hadoop.hdfs.server.namenode.INodesInPath:getPathComponents() #214
> +---[0.003053ms] 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory:isExactReservedName() #95
> +---[0.002938ms] 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory:readLock() #218
> +---[0.00252ms] 
> org.apache.hadoop.hdfs.server.namenode.INodesInPath:isDotSnapshotDir() #220
> +---[0.002788ms] 
> org.apache.hadoop.hdfs.server.namenode.INodesInPath:getPathSnapshotId() #223
> +---[0.002905ms] 
> org.apache.hadoop.hdfs.server.namenode.INodesInPath:getLastINode() #224
> +---[0.002785ms] 
> org.apache.hadoop.hdfs.server.namenode.INode:getStoragePolicyID() #230
> +---[0.002236ms] 
> org.apache.hadoop.hdfs.server.namenode.INode:isDirectory() #233
> +---[0.002919ms] 
> org.apache.hadoop.hdfs.server.namenode.INode:asDirectory() #242
> +---[0.003408ms] 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory:getChildrenList() #243
> +---[0.005942ms] 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory:nextChild() #244
> +---[0.002467ms] org.apache.hadoop.hdfs.util.ReadOnlyList:size() #245
> +---[0.005481ms] 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory:getLsLimit() #247
> +---[0.002176ms] 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory:getLsLimit() #248
> +---[min=0.00211ms,max=0.005157ms,total=2.247572ms,count=1000] 
> org.apache.hadoop.hdfs.util.ReadOnlyList:get() #252
> +---[min=0.001946ms,max=0.005411ms,total=2.041715ms,count=1000] 
> org.apache.hadoop.hdfs.server.namenode.INode:isSymlink() #253
> +---[min=0.002176ms,max=0.005426ms,total=2.264472ms,count=1000] 
> org.apache.hadoop.hdfs.server.namenode.INode:getLocalStoragePolicyID() #254
> +---[min=0.002251ms,max=0.006849ms,total=2.351935ms,count=1000] 
> org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp:getStoragePolicyID()
>  #95
> +---[min=0.006091ms,max=0.012333ms,total=6.439434ms,count=1000] 
> org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp:createFileStatus()
>  #257
> +---[min=0.00269ms,max=0.004995ms,total=2.788194ms,count=1000] 
> org.apache.hadoop.hdfs.protocol.HdfsLocatedFileStatus:getLocatedBlocks() #265
> +---[0.003234ms] 
> org.apache.hadoop.hdfs.protocol.DirectoryListing:<init>() #274
> `---[0.002457ms] 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory:readUnlock() #277
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org