[ 
https://issues.apache.org/jira/browse/HDFS-16044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17465242#comment-17465242
 ] 

Ayush Saxena commented on HDFS-16044:
-------------------------------------

[~pilchard]/[~hexiaoqiao]
Though the last Yetus report is green, it didn't run the entire test suite; it 
ran only the tests in hdfs-client. The patch itself looks good, but it breaks a 
lot of tests, because we explicitly cast to HdfsLocatedFileStatus in a couple 
of places, even for directories. 
Two examples:
*One in Hdfs.java*
[https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/fs/Hdfs.java#L211]
*Second in DistributedFileSystem.java*
[https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DistributedFileSystem.java#L1297]

We would have to add checks in those places. If this can break our own code, 
it can presumably break other people's code as well. AFAIK changing public API 
behaviour is an incompatible change too; I am not sure whether this qualifies 
as one.
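
To make the concern concrete, here is a minimal sketch of the failure mode. It assumes the patch makes directory entries come back as a plain status object rather than a located one. Status and LocatedStatus are hypothetical stand-ins, not the real HdfsFileStatus and HdfsLocatedFileStatus classes, and the method names are made up for illustration.

```java
public class LocatedStatusCastSketch {
    // Hypothetical stand-in for a plain file status (no block locations).
    static class Status {
        final String name;
        final boolean dir;

        Status(String name, boolean dir) {
            this.name = name;
            this.dir = dir;
        }
    }

    // Hypothetical stand-in for a status that also carries block locations.
    static class LocatedStatus extends Status {
        final int blockCount;

        LocatedStatus(String name, int blockCount) {
            super(name, false);
            this.blockCount = blockCount;
        }
    }

    // Unchecked cast: throws ClassCastException when the entry is not a
    // LocatedStatus, which is the kind of breakage described above.
    static int blocksUnchecked(Status s) {
        return ((LocatedStatus) s).blockCount;
    }

    // Guarded version: check the runtime type before casting and treat
    // anything that is not located (for example a directory) as zero blocks.
    static int blocksChecked(Status s) {
        if (s instanceof LocatedStatus) {
            return ((LocatedStatus) s).blockCount;
        }
        return 0;
    }

    public static void main(String[] args) {
        Status file = new LocatedStatus("part-0", 3);
        Status dir = new Status("topics", true);

        System.out.println(blocksChecked(file)); // prints 3
        System.out.println(blocksChecked(dir));  // prints 0
        try {
            blocksUnchecked(dir);
        } catch (ClassCastException e) {
            System.out.println("unchecked cast failed for directory");
        }
    }
}
```

A guard like this keeps our own callers working, but it does nothing for external code that performs the same cast, which is why the compatibility question still stands.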

*Tests to try:*
TestViewFsHdfs
TestHDFSContractGetFileStatus

> Fix getListing call getLocatedBlocks even source is a directory
> ---------------------------------------------------------------
>
>                 Key: HDFS-16044
>                 URL: https://issues.apache.org/jira/browse/HDFS-16044
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>            Reporter: ludun
>            Assignee: ludun
>            Priority: Major
>         Attachments: HDFS-16044.00.patch, HDFS-16044.01.patch, 
> HDFS-16044.02.patch, HDFS-16044.03.patch, HDFS-16044.05.patch
>
>
> In our production cluster, getListing is called very frequently and the 
> processing time of the RPC request is very high, so we tried to optimize the 
> performance of the getListing request.
> After some investigation, we found that even when the source and its 
> children are directories, the getListing request still calls 
> getLocatedBlocks. 
> The request is shown below; note that needLocation is false:
> {code:java}
> 2021-05-27 15:16:07,093 TRACE ipc.ProtobufRpcEngine: 1: Call -> 
> 8-5-231-4/8.5.231.4:25000: getListing {src: 
> "/data/connector/test/topics/102test" startAfter: "" needLocation: false}
> {code}
> Yet the getListing request calls getLocatedBlocks 1000 times, which is not needed:
> {code:java}
> `---ts=2021-05-27 14:19:15;thread_name=IPC Server handler 86 on 
> 25000;id=e6;is_daemon=true;priority=5;TCCL=sun.misc.Launcher$AppClassLoader@5fcfe4b2
>     `---[35.068532ms] 
> org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp:getListing()
>         +---[0.003542ms] 
> org.apache.hadoop.hdfs.server.namenode.INodesInPath:getPathComponents() #214
>         +---[0.003053ms] 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory:isExactReservedName() #95
>         +---[0.002938ms] 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory:readLock() #218
>         +---[0.00252ms] 
> org.apache.hadoop.hdfs.server.namenode.INodesInPath:isDotSnapshotDir() #220
>         +---[0.002788ms] 
> org.apache.hadoop.hdfs.server.namenode.INodesInPath:getPathSnapshotId() #223
>         +---[0.002905ms] 
> org.apache.hadoop.hdfs.server.namenode.INodesInPath:getLastINode() #224
>         +---[0.002785ms] 
> org.apache.hadoop.hdfs.server.namenode.INode:getStoragePolicyID() #230
>         +---[0.002236ms] 
> org.apache.hadoop.hdfs.server.namenode.INode:isDirectory() #233
>         +---[0.002919ms] 
> org.apache.hadoop.hdfs.server.namenode.INode:asDirectory() #242
>         +---[0.003408ms] 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory:getChildrenList() #243
>         +---[0.005942ms] 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory:nextChild() #244
>         +---[0.002467ms] org.apache.hadoop.hdfs.util.ReadOnlyList:size() #245
>         +---[0.005481ms] 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory:getLsLimit() #247
>         +---[0.002176ms] 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory:getLsLimit() #248
>         +---[min=0.00211ms,max=0.005157ms,total=2.247572ms,count=1000] 
> org.apache.hadoop.hdfs.util.ReadOnlyList:get() #252
>         +---[min=0.001946ms,max=0.005411ms,total=2.041715ms,count=1000] 
> org.apache.hadoop.hdfs.server.namenode.INode:isSymlink() #253
>         +---[min=0.002176ms,max=0.005426ms,total=2.264472ms,count=1000] 
> org.apache.hadoop.hdfs.server.namenode.INode:getLocalStoragePolicyID() #254
>         +---[min=0.002251ms,max=0.006849ms,total=2.351935ms,count=1000] 
> org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp:getStoragePolicyID()
>  #95
>         +---[min=0.006091ms,max=0.012333ms,total=6.439434ms,count=1000] 
> org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp:createFileStatus()
>  #257
>         +---[min=0.00269ms,max=0.004995ms,total=2.788194ms,count=1000] 
> org.apache.hadoop.hdfs.protocol.HdfsLocatedFileStatus:getLocatedBlocks() #265
>         +---[0.003234ms] 
> org.apache.hadoop.hdfs.protocol.DirectoryListing:<init>() #274
>         `---[0.002457ms] 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory:readUnlock() #277
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

