[
https://issues.apache.org/jira/browse/HADOOP-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12558081#action_12558081
]
Raghu Angadi commented on HADOOP-2566:
--------------------------------------
globStatus would certainly be useful, since globPaths() is used in many places
where we really want globStatus(). globStatus is much more efficient in
those cases, since we often do {{for (path : globPaths(pattern)) { stat =
listStatus(path) ... } }}.
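
To illustrate the cost difference, here is a minimal sketch using a toy in-memory filesystem. CountingFs, getLength() and the calls counter are hypothetical names invented for this example; this is not the Hadoop FileSystem API, and it only supports '*' within a single path component. It just counts how many simulated round trips each pattern of use would make.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Pattern;

// Toy in-memory stand-in for a remote filesystem (hypothetical, NOT Hadoop).
class CountingFs {
    private final Map<String, Long> files = new TreeMap<>(); // path -> length
    int calls = 0; // simulated round trips to the filesystem

    void add(String path, long length) {
        files.put(path, length);
    }

    // Translate a glob into a regex; only '*' is supported, and it does
    // not cross '/' boundaries.
    private static Pattern toRegex(String glob) {
        String[] parts = glob.split("\\*", -1);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                sb.append("[^/]*");
            }
            sb.append(Pattern.quote(parts[i]));
        }
        return Pattern.compile(sb.toString());
    }

    // Returns matching paths only: one round trip.
    List<String> globPaths(String glob) {
        calls++;
        Pattern p = toRegex(glob);
        List<String> out = new ArrayList<>();
        for (String path : files.keySet()) {
            if (p.matcher(path).matches()) {
                out.add(path);
            }
        }
        return out;
    }

    // Per-path status lookup: one round trip per call.
    long getLength(String path) {
        calls++;
        return files.get(path);
    }

    // Returns matching paths together with their status: one round trip.
    List<Map.Entry<String, Long>> globStatus(String glob) {
        calls++;
        Pattern p = toRegex(glob);
        List<Map.Entry<String, Long>> out = new ArrayList<>();
        for (Map.Entry<String, Long> e : files.entrySet()) {
            if (p.matcher(e.getKey()).matches()) {
                out.add(e);
            }
        }
        return out;
    }
}

class GlobCallCountDemo {
    public static void main(String[] args) {
        CountingFs fs = new CountingFs();
        fs.add("/logs/a.log", 10);
        fs.add("/logs/b.log", 20);
        fs.add("/logs/c.log", 30);

        // Glob for paths, then look up each path's status separately.
        long total = 0;
        for (String p : fs.globPaths("/logs/*.log")) {
            total += fs.getLength(p);
        }
        System.out.println("globPaths + lookups: calls=" + fs.calls + " total=" + total);

        // Glob for statuses directly.
        fs.calls = 0;
        total = 0;
        for (Map.Entry<String, Long> e : fs.globStatus("/logs/*.log")) {
            total += e.getValue();
        }
        System.out.println("globStatus: calls=" + fs.calls + " total=" + total);
    }
}
```

In this toy model the first loop makes one call per matched path plus one for the glob itself, while the second makes a single call regardless of how many paths match.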
I am not sure if globPaths() can go away. One difference I see is that
globPaths("/non/existent/path/withoutglob") returns the simple path without any
filesystem interaction (as expected), whereas
globStatus("/non/existent/path/withoutglob") will ask the filesystem and will
return null (or an array with zero entries).
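
A rough sketch of that difference, again with a toy model rather than the real FileSystem class. LiteralPathFs and its calls counter are hypothetical, and returning an empty list from globStatus() is an assumption chosen for the example; the real method might return null instead.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Pattern;

// Toy model of the behavior described above (hypothetical, NOT Hadoop).
class LiteralPathFs {
    private final Set<String> existing = new TreeSet<>();
    int calls = 0; // simulated round trips to the filesystem

    void create(String path) {
        existing.add(path);
    }

    private static boolean hasGlob(String pattern) {
        for (char c : pattern.toCharArray()) {
            if ("*?[{".indexOf(c) >= 0) {
                return true;
            }
        }
        return false;
    }

    // A pattern without glob characters is echoed back untouched,
    // without asking the filesystem whether it exists.
    List<String> globPaths(String pattern) {
        if (!hasGlob(pattern)) {
            return Collections.singletonList(pattern);
        }
        return match(pattern);
    }

    // A status can only come from the filesystem, so this always asks it.
    // Assumption: a missing path yields an empty list here.
    List<String> globStatus(String pattern) {
        return match(pattern);
    }

    // Only '*' is supported, and it does not cross '/' boundaries.
    private List<String> match(String pattern) {
        calls++;
        String[] parts = pattern.split("\\*", -1);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                sb.append("[^/]*");
            }
            sb.append(Pattern.quote(parts[i]));
        }
        Pattern p = Pattern.compile(sb.toString());
        List<String> out = new ArrayList<>();
        for (String path : existing) {
            if (p.matcher(path).matches()) {
                out.add(path);
            }
        }
        return out;
    }
}

class NonGlobPathDemo {
    public static void main(String[] args) {
        LiteralPathFs fs = new LiteralPathFs();
        String missing = "/non/existent/path/withoutglob";

        List<String> paths = fs.globPaths(missing);
        System.out.println("globPaths: " + paths + " calls=" + fs.calls);

        List<String> statuses = fs.globStatus(missing);
        System.out.println("globStatus: " + statuses + " calls=" + fs.calls);
    }
}
```

Callers that currently rely on getting the literal path back for a file that does not exist yet would need to handle the empty (or null) result if they switched to globStatus().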
> need FileSystem#globStatus method
> ---------------------------------
>
> Key: HADOOP-2566
> URL: https://issues.apache.org/jira/browse/HADOOP-2566
> Project: Hadoop
> Issue Type: Improvement
> Components: fs
> Reporter: Doug Cutting
> Assignee: Hairong Kuang
> Fix For: 0.16.0
>
>
> To remove the cache of FileStatus in DFSPath (HADOOP-2565) without hurting
> performance, we must use file enumeration APIs that return FileStatus[]
> rather than Path[]. Currently we have FileSystem#globPaths(), but that
> method should be deprecated and replaced with a FileSystem#globStatus().
> We need to deprecate FileSystem#globPaths() in 0.16 in order to remove the
> cache in 0.17.