[ https://issues.apache.org/jira/browse/HADOOP-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12558179#action_12558179 ]
Hairong Kuang commented on HADOOP-2566:
---------------------------------------

I am still not comfortable with this change:
1. Some shell commands, such as delete, copy, and rename, use globPath but don't need FileStatus.
2. globPath does not always call listPath for every directory. For example, globPath("/user/*/data") needs only to listPath("/user"). Returning FileStatus[] requires listPath on each user xx's home directory /user/xx and on /user/xx/data. This is a lot of overhead.

> need FileSystem#globStatus method
> ---------------------------------
>
>                 Key: HADOOP-2566
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2566
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: fs
>            Reporter: Doug Cutting
>            Assignee: Hairong Kuang
>             Fix For: 0.16.0
>
>
> To remove the cache of FileStatus in DFSPath (HADOOP-2565) without hurting
> performance, we must use file enumeration APIs that return FileStatus[]
> rather than Path[]. Currently we have FileSystem#globPaths(), but that
> method should be deprecated and replaced with a FileSystem#globStatus().
> We need to deprecate FileSystem#globPaths() in 0.16 in order to remove the
> cache in 0.17.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.