[ https://issues.apache.org/jira/browse/HDFS-15196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17049928#comment-17049928 ]
Fengnan Li commented on HDFS-15196:
-----------------------------------

[~elgoiri] Without the fix the test will fail. Actually, https://issues.apache.org/jira/browse/HDFS-14739 introduced a bug that can make ls go into an infinite loop: the mount point is added to the listing as a qualified path that includes its parent, so the startAfter is always the smallest string across all children listings.

For example, in my test, with the namenode ls limit set to 5, it will first return file-0, file-1, file-2, file-3 and file-4. Without the fix, /parent/file-7 would be added to the listing, so the next batch is requested with startAfter as `/parent/file-7`, which is even smaller than file-0. The query sent to the downstream namenode will therefore return file-[0-4] again. With this fix there won't be such an issue.

I guess there is a reason that the mount point is appended as a dir, but I haven't dug into it much. I will look into that after this one.

[~ayushtkn] The result was put into a TreeMap before returning, so the order is preserved.

> RBF: RouterRpcServer getListing cannot list large dirs correctly
> ----------------------------------------------------------------
>
>                 Key: HDFS-15196
>                 URL: https://issues.apache.org/jira/browse/HDFS-15196
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Fengnan Li
>            Assignee: Fengnan Li
>            Priority: Critical
>         Attachments: HDFS-15196.001.patch, HDFS-15196.002.patch,
> HDFS-15196.003.patch, HDFS-15196.003.patch, HDFS-15196.004.patch,
> HDFS-15196.005.patch
>
>
> In RouterRpcServer, getListing function is handled as two parts:
> # Union all partial listings from destination ns + paths
> # Append mount points for the dir to be listed
> In the case of large dir which is bigger than DFSConfigKeys.DFS_LIST_LIMIT
> (with default value 1k), the batch listing will be used and the startAfter
> will be used to define the boundary of each batch listing.
> However, step 2 here will add existing mount points, which will mess up with
> the boundary of the batch, thus making the next batch startAfter wrong.
> The fix is just to append the mount points when there is no more batch query
> necessary.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
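
The batching problem described in this message can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the Router code: `namenode_listing`, `router_listing` and `list_all` are hypothetical helpers, the directory holds nine made-up files, the ls limit is 5 as in the test described above, and the mount entry is assumed to be ordered by its short name (`file-7`) while reporting the qualified name (`/parent/file-7`), which is one plausible reading of the comment.

```python
LIST_LIMIT = 5  # stand-in for the namenode ls limit from the test above

# Hypothetical directory content: file-0 .. file-9, except file-7 (the mount point).
FILES = sorted(f"file-{i}" for i in range(10) if i != 7)
MOUNT_KEY, MOUNT_NAME = "file-7", "/parent/file-7"  # assumed mount entry


def namenode_listing(start_after):
    """Return up to LIST_LIMIT names after start_after and the remaining count."""
    later = [name for name in FILES if name > start_after]
    return later[:LIST_LIMIT], max(len(later) - LIST_LIMIT, 0)


def router_listing(start_after, fixed):
    """One batch as seen by the client; 'fixed' delays the mount point append."""
    names, remaining = namenode_listing(start_after)
    batch = {name: name for name in names}
    # Buggy behavior: always append the mount point.
    # Fixed behavior: append it only when no further batch is needed.
    if not fixed or remaining == 0:
        batch[MOUNT_KEY] = MOUNT_NAME
    ordered = [batch[key] for key in sorted(batch)]  # ordered by short name
    return ordered, remaining


def list_all(fixed, max_calls=10):
    """Page through the directory; return (entries seen, calls used or None)."""
    seen, start_after = [], ""
    for calls in range(1, max_calls + 1):
        batch, remaining = router_listing(start_after, fixed)
        seen.extend(batch)
        if remaining == 0:
            return seen, calls
        start_after = batch[-1]  # client resumes after the last returned name
    return seen, None  # gave up: the listing never made progress


if __name__ == "__main__":
    print(list_all(fixed=True))
    print(list_all(fixed=False)[1])
```

In the unfixed path the client resumes from `/parent/file-7`, which sorts before `file-0` because `/` precedes `f`, so every call returns the first batch again and the loop only stops at the `max_calls` guard. With the append delayed until the last batch, the same directory is listed in two calls.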