[ https://issues.apache.org/jira/browse/SSHD-1217?focusedWorklogId=670583&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-670583 ]
ASF GitHub Bot logged work on SSHD-1217:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 27/Oct/21 09:23
            Start Date: 27/Oct/21 09:23
    Worklog Time Spent: 10m
      Work Description: tomaswolf opened a new pull request #206:
URL: https://github.com/apache/mina-sshd/pull/206

An Apache MINA sshd SFTP server configured to use an SftpFileSystem pointing to yet another SFTP server would serve directory listings only very slowly.

This was caused by the SFTP server implementation using Java FileSystem abstractions itself for listing directories. Using java.nio.file.Files, it ended up doing essentially the following when receiving an SSH_FXP_READDIR command:

```
try (DirectoryStream<Path> list = Files.newDirectoryStream(dir)) {
  Map<Path, BasicFileAttributes> toSend = new HashMap<>();
  for (Path p : list) {
    // One attribute lookup per directory entry
    toSend.put(p, Files.readAttributes(p, BasicFileAttributes.class));
  }
  replyToClient(SSH_FXP_NAME, toSend);
}
```

(Not literally. This omits a lot of special handling for empty directories, not sending too large messages back to the client, and so on. But ultimately it boiled down to the above.)

That is fine when the server-side file system from which files are served is local on the server. But when that file system itself is a remote file system, `newDirectoryStream` is a remote call sending SSH_FXP_OPENDIR and SSH_FXP_READDIR to the upstream server, and additionally each of these `readAttributes()` calls is yet another remote call sending SSH_FXP_LSTAT. This slows down getting the directory listing to the client tremendously.

The most annoying part of all this is that the SSH_FXP_READDIR to the upstream SFTP server in `newDirectoryStream` _already returned all the attributes_, but this is lost because the Java NIO File abstractions have no real support for getting a directory listing including attributes in one fell swoop. (No, using a FileVisitor and Files.walkFileTree with depth 1 wouldn't help.)
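As a rough illustration of the cost described above, here is a toy model that counts requests against a pretend upstream server. It does not use the MINA sshd API: `ListingCostSketch`, `FakeUpstream`, `naiveListing` and `forwardedListing` are hypothetical names invented for this sketch, attributes are reduced to a single file size, and the model assumes exactly one attribute lookup per entry (the real server may issue more).

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model only: counts requests against a pretend upstream SFTP server. */
public class ListingCostSketch {

    /** Hypothetical upstream server; attributes are reduced to a file size. */
    public static final class FakeUpstream {
        private final Map<String, Long> files = new LinkedHashMap<>();
        public int readDirRequests;
        public int lstatRequests;

        public FakeUpstream(int fileCount) {
            for (int i = 0; i < fileCount; i++) {
                files.put("file" + i, (long) i);
            }
        }

        /** Models SSH_FXP_READDIR: names and attributes arrive together. */
        public Map<String, Long> readDir() {
            readDirRequests++;
            return new LinkedHashMap<>(files);
        }

        /** Models SSH_FXP_LSTAT: one request per file. */
        public Long lstat(String name) {
            lstatRequests++;
            return files.get(name);
        }
    }

    /** Keeps only the names from the listing, then asks again for each entry. */
    public static Map<String, Long> naiveListing(FakeUpstream upstream) {
        Map<String, Long> toSend = new LinkedHashMap<>();
        for (String name : upstream.readDir().keySet()) {
            toSend.put(name, upstream.lstat(name));
        }
        return toSend;
    }

    /** Passes the listing reply on unchanged, attributes included. */
    public static Map<String, Long> forwardedListing(FakeUpstream upstream) {
        return upstream.readDir();
    }

    public static void main(String[] args) {
        FakeUpstream a = new FakeUpstream(2000);
        naiveListing(a);
        System.out.println("naive: readdir=" + a.readDirRequests + ", lstat=" + a.lstatRequests);

        FakeUpstream b = new FakeUpstream(2000);
        forwardedListing(b);
        System.out.println("forwarded: readdir=" + b.readDirRequests + ", lstat=" + b.lstatRequests);
    }
}
```

Both variants produce the same name-to-attributes map in this model; they differ only in how many requests the upstream sees, which is where the latency comes from when every request is a network round trip.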
So detect this case in the server's SftpSubsystem, and bypass the Java FileSystem abstraction if the file system is itself an SftpFileSystem. Simply issue SSH_FXP_READDIR requests and forward the whole reply, which includes file names and attributes, directly to the client.

For reading a directory containing 2000 files, this eliminates 10040 SSH_FXP_LSTAT calls; only the directory itself is stat'ed.

A corollary to this is that clients in general should avoid listing directories on an SftpFileSystem via java.nio.file.Files _if they also need the file attributes_. Instead do

```
SftpFileSystem fs = ...;
try (SftpClient client = fs.getClient();
     CloseableHandle dir = client.openDir(remDirPath)) {
  for (SftpClient.DirEntry entry : client.listDir(dir)) {
    Path path = dir.getFile().resolve(entry.getFilename());
    SftpClient.Attributes attrs = entry.getAttributes();
    // Do whatever you need with it
  }
}
```

Implementation note: another idea is to cache the attributes read in SSH_FXP_READDIR in `newDirectoryStream` on the Path objects returned, and make `readAttributes` return these cached attributes while the stream is not closed yet. Technically, this is doable; it'd be a hack similar to what Java's own UnixFileSystem does (and FileVisitor takes advantage of it). However, it would break some edge cases, like a client writing to one of the files during the directory stream iteration, and then reading the attributes again and expecting the new attributes but still getting the cached ones.

Side note: the SFTP protocol has no command to get _only_ a listing of names. SSH_FXP_READDIR _always_ returns both names and attributes.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: dev-unsubscr...@mina.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

            Worklog Id:     (was: 670583)
    Remaining Estimate: 0h
            Time Spent: 10m

> Slow performance listing huge number of files on Apache SSHD server
> -------------------------------------------------------------------
>
>                 Key: SSHD-1217
>                 URL: https://issues.apache.org/jira/browse/SSHD-1217
>             Project: MINA SSHD
>          Issue Type: Improvement
>    Affects Versions: 2.6.0
>            Reporter: Roberto Deandrea
>            Priority: Minor
>         Attachments: trace.ssh-frontend-sftplist.finest.log.zip
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Hi Thomas,
> I noticed slow performance when listing files on the front-end Apache SSHD
> server in the same scenario as https://issues.apache.org/jira/browse/SSHD-1215
> The front-end Apache SSHD server is configured with a Filesystem built upon
> SFTPFileSystemProvider to proxy files to an Apache SSHD back-end server.
>
> In the /inbox folder of the Apache SSHD backend server I have 2000 files.
> The client sftp ls command takes 2 secs on the backend Apache SSHD server,
> but about 48 secs on the front-end Apache SSHD server.
> With a greater number of files in the /inbox folder the times get worse.
>
> I have full traces of the sftp list commands to the front-end Apache SSHD
> server, attached to this jira.[^trace.frontend.sshd.log.zip]
> I looked through the traces on the front-end server and it seems to me that
> for every file in the folder the sftp client on the front-end server creates
> an SSH_MSG_CHANNEL_DATA, generating TCP traffic that slows down the
> performance of the list command.
> Obviously this does not happen when an sftp client connects directly to the
> backend Apache SSHD server.
> Can you take a look at the traces on the front-end Apache SSHD server?
> Do you think it's possible to change something to improve the performance of
> listing files in this situation?
>
> Thanks in advance
>
> Kind Regards
> Roberto
>
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
For additional commands, e-mail: dev-h...@mina.apache.org