[ 
https://issues.apache.org/jira/browse/VFS-698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

krishnan updated VFS-698:
-------------------------
    Description: 
getChildren() on an SftpFileObject is very slow compared to the JSch 
implementation. The SftpATTRS that an "ls" call already returns for every 
child are fetched again for each child file, because the children are 
resolved independently. So if a directory contains 10 files, listing it 
results in 1 (ls) + 10 (stat) calls to the server.

For a folder with 100 files (AWS), getChildren() took about 35 secs instead 
of 1.5 secs.

 

*doListChildrenResolved:*

{{final FileObject fo = 
getFileSystem().resolveFile(getFileSystem().getFileSystemManager()}}
{{ .resolveName(getName(), UriParser.encode(name), NameScope.CHILD));}}

{{{color:#ff0000}((SftpFileObject) 
FileObjectUtils.getAbstractFileObject(fo)).setStat(stat.getAttrs());{color}}}

 

The resolveFile call creates an SftpFileObject and calls its resolve method, 
which fetches the stat (SftpATTRS) for each child file. This stat is already 
available from the 'ls' call we made. The setStat call above (highlighted in 
red) is redundant, since by then the stat for each child file has already 
been fetched one at a time.

The solution would be to avoid getting the stat for each child file after an 
'ls' call. Maybe the framework makes it difficult to do this easily.
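
To make the call counts concrete, here is a minimal, self-contained sketch. It does not use the VFS or JSch classes: FakeServer, Attrs and Child are hypothetical stand-ins that only count round trips, and setStat/getAttrs merely mirror the names used above. It assumes that resolving a child triggers one stat unless the attributes were seeded beforehand.

```java
import java.util.HashMap;
import java.util.Map;

public class ListingSketch {

    /** Hypothetical stand-in for the per-entry attributes an "ls" returns. */
    static final class Attrs {
        final long size;
        Attrs(long size) { this.size = size; }
    }

    /** Hypothetical fake server; it only counts round trips. */
    static final class FakeServer {
        private final Map<String, Attrs> files = new HashMap<>();
        int roundTrips = 0;

        FakeServer(int fileCount) {
            for (int i = 0; i < fileCount; i++) {
                files.put("file" + i, new Attrs(i));
            }
        }

        /** One round trip: names together with their attributes. */
        Map<String, Attrs> ls() {
            roundTrips++;
            return new HashMap<>(files);
        }

        /** One round trip per call. */
        Attrs stat(String name) {
            roundTrips++;
            return files.get(name);
        }
    }

    /** Hypothetical child object: fetches attributes lazily unless seeded. */
    static final class Child {
        private final FakeServer server;
        private final String name;
        private Attrs attrs;

        Child(FakeServer server, String name) {
            this.server = server;
            this.name = name;
        }

        void setStat(Attrs seeded) { this.attrs = seeded; }

        Attrs getAttrs() {
            if (attrs == null) {
                attrs = server.stat(name); // costs one round trip
            }
            return attrs;
        }
    }

    /** Pattern described in this issue: resolve first, seed afterwards. */
    static int naiveRoundTrips(int fileCount) {
        FakeServer server = new FakeServer(fileCount);
        for (Map.Entry<String, Attrs> e : server.ls().entrySet()) {
            Child child = new Child(server, e.getKey());
            child.getAttrs();           // resolve: triggers a stat
            child.setStat(e.getValue()); // too late, stat already fetched
        }
        return server.roundTrips;
    }

    /** Proposed direction: seed from the "ls" result before any access. */
    static int seededRoundTrips(int fileCount) {
        FakeServer server = new FakeServer(fileCount);
        for (Map.Entry<String, Attrs> e : server.ls().entrySet()) {
            Child child = new Child(server, e.getKey());
            child.setStat(e.getValue()); // reuse attributes from "ls"
            child.getAttrs();            // no extra round trip
        }
        return server.roundTrips;
    }

    public static void main(String[] args) {
        System.out.println("naive:  " + naiveRoundTrips(10));
        System.out.println("seeded: " + seededRoundTrips(10));
    }
}
```

For 10 files the first variant makes 11 round trips (1 ls + 10 stat) and the second makes 1. Whether the real SftpFileObject can be seeded before it is resolved is exactly the open question in this issue.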

 

 


> SFTP file attributes are fetched multiple times leading to very slow 
> directory listing
> --------------------------------------------------------------------------------------
>
>                 Key: VFS-698
>                 URL: https://issues.apache.org/jira/browse/VFS-698
>             Project: Commons VFS
>          Issue Type: Bug
>    Affects Versions: 2.3
>            Reporter: krishnan
>            Priority: Major



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
