[ https://issues.apache.org/jira/browse/HDFS-16044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17352348#comment-17352348 ]
ludun commented on HDFS-16044: ------------------------------ after check code, the getLocatedBlocks is called in FSDirStatAndListingOp#getListing: {code:java} for (int i = 0; i < numOfListing && locationBudget > 0; i++) { INode child = contents.get(startChild+i); byte childStoragePolicy = (includeStoragePolicy && !child.isSymlink()) ? getStoragePolicyID(child.getLocalStoragePolicyID(), parentStoragePolicy) : parentStoragePolicy; listing[i] = createFileStatus(fsd, iip, child, childStoragePolicy, needLocation, false); listingCnt++; if (listing[i] instanceof HdfsLocatedFileStatus) { // Once we hit lsLimit locations, stop. // This helps to prevent excessively large response payloads. // Approximate #locations with locatedBlockCount() * repl_factor LocatedBlocks blks = ((HdfsLocatedFileStatus)listing[i]).getLocatedBlocks(); locationBudget -= (blks == null) ? 0 : blks.locatedBlockCount() * listing[i].getReplication(); } } {code} It is based in return of createFileStatus, which create in HdfsFileStatus#build {code:java} public HdfsFileStatus build() { if (null == locations && !isdir && null == symlink && !locatedStatus) { return new HdfsNamedFileStatus(length, isdir, replication, blocksize, mtime, atime, permission, flags, owner, group, symlink, path, fileId, childrenNum, feInfo, storagePolicy, ecPolicy); } return new HdfsLocatedFileStatus(length, isdir, replication, blocksize, mtime, atime, permission, flags, owner, group, symlink, path, fileId, childrenNum, feInfo, storagePolicy, ecPolicy, locations); } {code} when isdir is true, it should return HdfsNamedFileStatus not HdfsLocatedFileStatus. > getListing call getLocatedBlocks even source is a directory > ----------------------------------------------------------- > > Key: HDFS-16044 > URL: https://issues.apache.org/jira/browse/HDFS-16044 > Project: Hadoop HDFS > Issue Type: Bug > Reporter: ludun > Assignee: ludun > Priority: Major > > In production cluster when call getListing very frequent. The processing > time of rpc request is very high. we try to optimize the performance of > getListing request. > After some check, we found that, even the source and child is dir, the > getListing request also call getLocatedBlocks. > {code:java} > `---ts=2021-05-27 14:19:15;thread_name=IPC Server handler 86 on > 25000;id=e6;is_daemon=true;priority=5;TCCL=sun.misc.Launcher$AppClassLoader@5fcfe4b2 > `---[35.068532ms] > org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp:getListing() > +---[0.003542ms] > org.apache.hadoop.hdfs.server.namenode.INodesInPath:getPathComponents() #214 > +---[0.003053ms] > org.apache.hadoop.hdfs.server.namenode.FSDirectory:isExactReservedName() #95 > +---[0.002938ms] > org.apache.hadoop.hdfs.server.namenode.FSDirectory:readLock() #218 > +---[0.00252ms] > org.apache.hadoop.hdfs.server.namenode.INodesInPath:isDotSnapshotDir() #220 > +---[0.002788ms] > org.apache.hadoop.hdfs.server.namenode.INodesInPath:getPathSnapshotId() #223 > +---[0.002905ms] > org.apache.hadoop.hdfs.server.namenode.INodesInPath:getLastINode() #224 > +---[0.002785ms] > org.apache.hadoop.hdfs.server.namenode.INode:getStoragePolicyID() #230 > +---[0.002236ms] > org.apache.hadoop.hdfs.server.namenode.INode:isDirectory() #233 > +---[0.002919ms] > org.apache.hadoop.hdfs.server.namenode.INode:asDirectory() #242 > +---[0.003408ms] > org.apache.hadoop.hdfs.server.namenode.INodeDirectory:getChildrenList() #243 > +---[0.005942ms] > org.apache.hadoop.hdfs.server.namenode.INodeDirectory:nextChild() #244 > +---[0.002467ms] org.apache.hadoop.hdfs.util.ReadOnlyList:size() #245 > +---[0.005481ms] > org.apache.hadoop.hdfs.server.namenode.FSDirectory:getLsLimit() #247 > +---[0.002176ms] > org.apache.hadoop.hdfs.server.namenode.FSDirectory:getLsLimit() #248 > +---[min=0.00211ms,max=0.005157ms,total=2.247572ms,count=1000] > org.apache.hadoop.hdfs.util.ReadOnlyList:get() #252 > +---[min=0.001946ms,max=0.005411ms,total=2.041715ms,count=1000] > org.apache.hadoop.hdfs.server.namenode.INode:isSymlink() #253 > +---[min=0.002176ms,max=0.005426ms,total=2.264472ms,count=1000] > org.apache.hadoop.hdfs.server.namenode.INode:getLocalStoragePolicyID() #254 > +---[min=0.002251ms,max=0.006849ms,total=2.351935ms,count=1000] > org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp:getStoragePolicyID() > #95 > +---[min=0.006091ms,max=0.012333ms,total=6.439434ms,count=1000] > org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp:createFileStatus() > #257 > +---[min=0.00269ms,max=0.004995ms,total=2.788194ms,count=1000] > org.apache.hadoop.hdfs.protocol.HdfsLocatedFileStatus:getLocatedBlocks() #265 > +---[0.003234ms] > org.apache.hadoop.hdfs.protocol.DirectoryListing:<init>() #274 > `---[0.002457ms] > org.apache.hadoop.hdfs.server.namenode.FSDirectory:readUnlock() #277 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) --------------------------------------------------------------------- To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org