[
https://issues.apache.org/jira/browse/HADOOP-9377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14151368#comment-14151368
]
Eduardo Samano commented on HADOOP-9377:
----------------------------------------
Is there some reason not to fix this issue in version 2.4?
The workingDirectory feature is still unimplemented and listStatus is very
slow. It is unusable in practice: I need to list a folder with more than 2000
files and it takes several hours. No way.
Adding a new parameter to getFileStatus avoids breaking something in the future
when workingDirectory is implemented. It also means that the app only queries
the workingDirectory once per listStatus execution (instead of once per file).
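To make the difference concrete, here is a minimal sketch of the parameter approach described above. It is a toy model, not the actual FTPFileSystem code: ListingSketch, FakeServer, listPerFile, listOnce and qualify are hypothetical names, and the counter only stands in for the cost of opening an FTP connection.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (not Hadoop code): counts how often the working directory is looked up. */
final class ListingSketch {

    /** Hypothetical stand-in for the FTP server; each lookup models one new connection. */
    static final class FakeServer {
        int lookups = 0;

        String workingDirectory() {
            lookups++;
            return "/home/user";
        }
    }

    /** Per-file variant: every entry triggers its own working-directory lookup. */
    static List<String> listPerFile(FakeServer server, List<String> names) {
        List<String> out = new ArrayList<>();
        for (String name : names) {
            out.add(qualify(server.workingDirectory(), name));
        }
        return out;
    }

    /** Parameter variant: look the directory up once, then pass it to each call. */
    static List<String> listOnce(FakeServer server, List<String> names) {
        String workDir = server.workingDirectory();
        List<String> out = new ArrayList<>();
        for (String name : names) {
            out.add(qualify(workDir, name));
        }
        return out;
    }

    /** Resolves a relative name against the given working directory. */
    static String qualify(String workDir, String name) {
        return name.startsWith("/") ? name : workDir + "/" + name;
    }
}
```

Both variants return the same paths. For a listing of N files the first one performs N lookups and the second performs one, which is the saving I am talking about.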
But not caching the workingDirectory in the initialize() method doesn't make
sense. In the end getHomeDirectory queries the workingDirectory, so if the
workingDirectory feature is implemented in the future, it will be necessary to
reimplement the getHomeDirectory method and all the methods that depend on it
anyway. Caching the workingDirectory is a quick and sufficient solution, with
less impact than keeping a useless implementation.
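A rough sketch of what I mean by caching at initialization. This is a simplified model and not the real FTPFileSystem class: CachedWorkDirSketch, remoteHomeLookup and remoteCalls are hypothetical names, and the Supplier only stands in for the connection that fetches the directory from the server.

```java
import java.util.function.Supplier;

/** Toy model (not the real FTPFileSystem): fetches the directory once, then serves it from memory. */
final class CachedWorkDirSketch {
    // Hypothetical stand-in for connecting to the FTP server and asking for the directory.
    private final Supplier<String> remoteHomeLookup;
    private String cachedWorkDir;
    int remoteCalls = 0;

    CachedWorkDirSketch(Supplier<String> remoteHomeLookup) {
        this.remoteHomeLookup = remoteHomeLookup;
    }

    /** The only place that talks to the server. */
    void initialize() {
        remoteCalls++;
        cachedWorkDir = remoteHomeLookup.get();
    }

    /** Returns the cached value; no connection is opened here. */
    String getWorkingDirectory() {
        if (cachedWorkDir == null) {
            throw new IllegalStateException("initialize() has not been called");
        }
        return cachedWorkDir;
    }
}
```

With this shape, calling getWorkingDirectory() for each of the 2000 files costs a single remote lookup in total. The trade-off is that the cached value would go stale if the working directory could change, which is not possible as long as that feature stays unimplemented.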
> FTPFileSystem.listStatus() runs very slow, due to inappropriate call of
> filePath.makeQualified
> ----------------------------------------------------------------------------------------------
>
> Key: HADOOP-9377
> URL: https://issues.apache.org/jira/browse/HADOOP-9377
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs
> Affects Versions: 2.0.3-alpha
> Reporter: James Yu
> Attachments: HADOOP-9377.diff
>
>
> FTPFileSystem.listStatus() calls
> getFileStatus(ftpFiles[i], absolute) calls
> new FileStatus(....) calls
> filePath.makeQualified(...) calls
> fs.getWorkingDirectory() calls
> getHomeDirectory()
> which creates a new FTP connection every time to get the workingDirectory.
> This causes FTPFileSystem.listStatus() to take a long time to run (on average
> 3-6 seconds per file in my test).
> I attach a suggested fix for FTPFileSystem.java, only 4 lines of change.
> After the fix, the slowness is gone.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)