[ https://issues.apache.org/jira/browse/HADOOP-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12558179#action_12558179 ]
hairong edited comment on HADOOP-2566 at 1/11/08 5:01 PM:
----------------------------------------------------------------
I am still not comfortable with this change:
1. Some shell commands, such as delete, copy, and rename, use globPath but
don't need FileStatus.
2. globPath does not always call listPath for every directory. For example,
globPath("/user/*/data") only needs to call listPath("/user"). Returning
FileStatus[] requires additional listPath calls on each user xx's home
directory /user/xx and on the root /. This is a lot of overhead.
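To make the overhead in point 2 concrete, here is a minimal sketch that counts listPath calls on a toy in-memory namespace. It is not the Hadoop implementation: the class GlobCostSketch, its methods, and the user names are hypothetical. It assumes that a path-only glob appends the literal "data" component without checking it, and that a status can only be obtained by listing the parent directory. The extra listing of the root / mentioned above is not modeled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical toy namespace; not part of Hadoop.
public class GlobCostSketch {
    // Maps a directory to the names of its children.
    private final Map<String, List<String>> children = new TreeMap<>();
    // Number of listPath calls made so far.
    int listCalls = 0;

    void addDir(String dir, String... names) {
        children.put(dir, Arrays.asList(names));
    }

    // Stand-in for a directory listing; each call is counted.
    List<String> listPath(String dir) {
        listCalls++;
        return children.getOrDefault(dir, Collections.<String>emptyList());
    }

    // Path-only glob of "/user/*/data": one listing of /user, then the
    // literal "data" component is appended without being checked (assumption).
    List<String> globPathsOnly() {
        List<String> out = new ArrayList<>();
        for (String user : listPath("/user")) {
            out.add("/user/" + user + "/data");
        }
        return out;
    }

    // Status-returning glob: additionally lists each /user/xx to look up
    // the entry for "data" (assumption: status comes from the parent listing).
    List<String> globWithStatus() {
        List<String> out = new ArrayList<>();
        for (String path : globPathsOnly()) {
            String parent = path.substring(0, path.lastIndexOf('/'));
            if (listPath(parent).contains("data")) {
                out.add(path);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        GlobCostSketch fs = new GlobCostSketch();
        fs.addDir("/user", "alice", "bob", "carol");
        fs.addDir("/user/alice", "data");
        fs.addDir("/user/bob", "data");
        fs.addDir("/user/carol", "data");

        fs.globPathsOnly();
        System.out.println("path-only listPath calls: " + fs.listCalls);   // 1

        fs.listCalls = 0;
        fs.globWithStatus();
        System.out.println("with-status listPath calls: " + fs.listCalls); // 4
    }
}
```

In this toy model the number of listings grows from 1 to 1 + N for N user directories, which is the cost the comment is concerned about for callers that never look at the FileStatus.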
was (Author: hairong):
I am still not comfortable with this change:
1. Some of shell commands like delete, copy, and rename use globPath but don't
need FileStatus.
2. GlobPath does not always call listPath for every directory. For example,
globPath("/user/*/data") needs only to listPath("/user"). Returning
FileStatus[] requires listPath on each user xx's home directory /user/xx and
/user/xx/data. This is a lot of overhead.
> need FileSystem#globStatus method
> ---------------------------------
>
> Key: HADOOP-2566
> URL: https://issues.apache.org/jira/browse/HADOOP-2566
> Project: Hadoop
> Issue Type: Improvement
> Components: fs
> Reporter: Doug Cutting
> Assignee: Hairong Kuang
> Fix For: 0.16.0
>
>
> To remove the cache of FileStatus in DFSPath (HADOOP-2565) without hurting
> performance, we must use file enumeration APIs that return FileStatus[]
> rather than Path[]. Currently we have FileSystem#globPaths(), but that
> method should be deprecated and replaced with a FileSystem#globStatus().
> We need to deprecate FileSystem#globPaths() in 0.16 in order to remove the
> cache in 0.17.