[ https://issues.apache.org/jira/browse/SSHD-1173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17615236#comment-17615236 ]
Kirk Rasmussen edited comment on SSHD-1173 at 10/10/22 5:47 PM: ---------------------------------------------------------------- Here's simple test harness I was using to debug remote server. {code:java} private static void write(final SftpClient client, final String path, final int size, final int bufferSize) throws IOException { System.out.printf("write(\"%s\", %d, %d)%n", path, size, bufferSize); final Path uploadFile = createUploadFile(size); final StopWatch stopWatch = new StopWatch(); stopWatch.start(); try (InputStream inputStream = Files.newInputStream(uploadFile); OutputStream outputStream = client.write( path, bufferSize, SftpClient.OpenMode.Write, SftpClient.OpenMode.Create, SftpClient.OpenMode.Truncate)) { final byte[] buffer = new byte[16 * 1_024]; int sent = 0; int read; while ((read = inputStream.read(buffer)) != -1) { outputStream.write(buffer, 0, read); sent += read; System.out.print("."); // System.out.printf("\tsent=%-15d (%3.02f%%) :: %s%n", sent, (double)sent/size * 100.0, stopWatch.formatTime()); } System.out.printf("%n\tTotal: %s%n", stopWatch.formatTime()); } catch (SftpException ex) { System.out.printf("write(\"%s\") failed: %s (%d)%n", path, ex, ex.getStatus()); } } {code} It would deadlock w/o the SftpModuleProperties.CHUNK_IF_WINDOW_LESS_THAN_PACKET setting , e.g. {code:java} write("/sftp-test.txt", 1048576, 0) sent=1024 (0.10%) :: 00:00:00.126 sent=2048 (0.20%) :: 00:00:00.140 sent=3072 (0.29%) :: 00:00:00.140 sent=4096 (0.39%) :: 00:00:00.140 sent=5120 (0.49%) :: 00:00:00.140 sent=6144 (0.59%) :: 00:00:00.140 sent=7168 (0.68%) :: 00:00:00.140 sent=8192 (0.78%) :: 00:00:00.140 sent=9216 (0.88%) :: 00:00:00.141 sent=10240 (0.98%) :: 00:00:00.141 sent=11264 (1.07%) :: 00:00:00.141 sent=12288 (1.17%) :: 00:00:00.141 sent=13312 (1.27%) :: 00:00:00.141 sent=14336 (1.37%) :: 00:00:00.141 sent=15360 (1.46%) :: 00:00:00.141 sent=16384 (1.56%) :: 00:00:00.141 sent=17408 (1.66%) :: 00:00:00.142 sent=18432 (1.76%) :: 00:00:00.142 sent=19456 (1.86%) :: 00:00:00.142 sent=20480 (1.95%) :: 00:00:00.142 sent=21504 (2.05%) :: 00:00:00.142 sent=22528 (2.15%) :: 00:00:00.142 sent=23552 (2.25%) :: 00:00:00.142 sent=24576 (2.34%) :: 00:00:00.142 sent=25600 (2.44%) :: 00:00:00.143 sent=26624 (2.54%) :: 00:00:00.143 sent=27648 (2.64%) :: 00:00:00.143 sent=28672 (2.73%) :: 00:00:00.143 sent=29696 (2.83%) :: 00:00:00.143 sent=30720 (2.93%) :: 00:00:00.143 sent=31744 (3.03%) :: 00:00:00.143 {code} I don't have the full stack trace unfortunately and I lost access to the server but here's a partial grab (which is very similar to the original reporter's) {code:java} Caused by: java.io.InterruptedIOException: IoWriteFutureImpl[SftpChannelSubsystem[id=0, recipient=0]-ClientSessionImpl[cpq_dump@/10.18.242.4:8080][sftp][SSH_MSG_CHANNEL_DATA]]: Interrupted after 37644419 msec. at org.apache.sshd.common.future.DefaultSshFuture.lambda$await0$0(DefaultSshFuture.java:74) ~[sshd-common-2.8.0.jar:2.8.0] at org.apache.sshd.common.future.AbstractSshFuture.formatExceptionMessage(AbstractSshFuture.java:185) ~[sshd-common-2.8.0.jar:2.8.0] at org.apache.sshd.common.future.DefaultSshFuture.await0(DefaultSshFuture.java:73) ~[sshd-common-2.8.0.jar:2.8.0] at org.apache.sshd.common.future.AbstractSshFuture.verifyResult(AbstractSshFuture.java:109) ~[sshd-common-2.8.0.jar:2.8.0] at org.apache.sshd.common.io.AbstractIoWriteFuture.verify(AbstractIoWriteFuture.java:39) ~[sshd-common-2.8.0.jar:2.8.0] at org.apache.sshd.common.io.AbstractIoWriteFuture.verify(AbstractIoWriteFuture.java:30) ~[sshd-common-2.8.0.jar:2.8.0] at org.apache.sshd.common.future.VerifiableFuture.verify(VerifiableFuture.java:43) ~[sshd-common-2.8.0.jar:2.8.0] at org.apache.sshd.sftp.client.impl.DefaultSftpClient.send(DefaultSftpClient.java:301) ~[sshd-sftp-2.8.0.jar:2.8.0] at org.apache.sshd.sftp.client.impl.SftpOutputStreamAsync.flush(SftpOutputStreamAsync.java:207) ~[sshd-sftp-2.8.0.jar:2.8.0] at org.apache.sshd.sftp.client.impl.SftpOutputStreamAsync.write(SftpOutputStreamAsync.java:138) ~[sshd-sftp-2.8.0.jar:2.8.0] {code} With the SftpModuleProperties.CHUNK_IF_WINDOW_LESS_THAN_PACKET setting it no longer deadlocks but DefaultSftpClient still has potential as this waits forever. {code:java} writeFuture.verify(); {code} When I didn't have the SftpModuleProperties.CHUNK_IF_WINDOW_LESS_THAN_PACKET setting enabled I noticed even after the socket was closed the main thread was still deadlocked on the above. I did find there was one other workaround I found that worked ok was using a custom bufferSize (e.g. 15000). I had to play around until I found a buffer size that didn't deadlock so it wasn't as reliable as SftpModuleProperties.CHUNK_IF_WINDOW_LESS_THAN_PACKET . was (Author: JIRAUSER296597): Here's simple test harness I was using to debug remote server. {code:java} private static void write(final SftpClient client, final String path, final int size, final int bufferSize) throws IOException { System.out.printf("write(\"%s\", %d, %d)%n", path, size, bufferSize); final Path uploadFile = createUploadFile(size); final StopWatch stopWatch = new StopWatch(); stopWatch.start(); try (InputStream inputStream = Files.newInputStream(uploadFile); OutputStream outputStream = client.write( path, bufferSize, SftpClient.OpenMode.Write, SftpClient.OpenMode.Create, SftpClient.OpenMode.Truncate)) { final byte[] buffer = new byte[16 * 1_024]; int sent = 0; int read; while ((read = inputStream.read(buffer)) != -1) { outputStream.write(buffer, 0, read); sent += read; System.out.print("."); // System.out.printf("\tsent=%-15d (%3.02f%%) :: %s%n", sent, (double)sent/size * 100.0, stopWatch.formatTime()); } System.out.printf("%n\tTotal: %s%n", stopWatch.formatTime()); } catch (SftpException ex) { System.out.printf("write(\"%s\") failed: %s (%d)%n", path, ex, ex.getStatus()); } } {code} I did find there was one other workaround I found that worked ok was using a custom bufferSize (e.g. 15000) otherwise it would deadlock w/o the SftpModuleProperties.CHUNK_IF_WINDOW_LESS_THAN_PACKET setting , e.g. {code:java} write("/sftp-test.txt", 1048576, 0) sent=1024 (0.10%) :: 00:00:00.126 sent=2048 (0.20%) :: 00:00:00.140 sent=3072 (0.29%) :: 00:00:00.140 sent=4096 (0.39%) :: 00:00:00.140 sent=5120 (0.49%) :: 00:00:00.140 sent=6144 (0.59%) :: 00:00:00.140 sent=7168 (0.68%) :: 00:00:00.140 sent=8192 (0.78%) :: 00:00:00.140 sent=9216 (0.88%) :: 00:00:00.141 sent=10240 (0.98%) :: 00:00:00.141 sent=11264 (1.07%) :: 00:00:00.141 sent=12288 (1.17%) :: 00:00:00.141 sent=13312 (1.27%) :: 00:00:00.141 sent=14336 (1.37%) :: 00:00:00.141 sent=15360 (1.46%) :: 00:00:00.141 sent=16384 (1.56%) :: 00:00:00.141 sent=17408 (1.66%) :: 00:00:00.142 sent=18432 (1.76%) :: 00:00:00.142 sent=19456 (1.86%) :: 00:00:00.142 sent=20480 (1.95%) :: 00:00:00.142 sent=21504 (2.05%) :: 00:00:00.142 sent=22528 (2.15%) :: 00:00:00.142 sent=23552 (2.25%) :: 00:00:00.142 sent=24576 (2.34%) :: 00:00:00.142 sent=25600 (2.44%) :: 00:00:00.143 sent=26624 (2.54%) :: 00:00:00.143 sent=27648 (2.64%) :: 00:00:00.143 sent=28672 (2.73%) :: 00:00:00.143 sent=29696 (2.83%) :: 00:00:00.143 sent=30720 (2.93%) :: 00:00:00.143 sent=31744 (3.03%) :: 00:00:00.143 {code} I don't have the full stack trace unfortunately and I lost access to the server but here's a partial grab (which is very similar to the original reporter's) {code:java} Caused by: java.io.InterruptedIOException: IoWriteFutureImpl[SftpChannelSubsystem[id=0, recipient=0]-ClientSessionImpl[cpq_dump@/10.18.242.4:8080][sftp][SSH_MSG_CHANNEL_DATA]]: Interrupted after 37644419 msec. at org.apache.sshd.common.future.DefaultSshFuture.lambda$await0$0(DefaultSshFuture.java:74) ~[sshd-common-2.8.0.jar:2.8.0] at org.apache.sshd.common.future.AbstractSshFuture.formatExceptionMessage(AbstractSshFuture.java:185) ~[sshd-common-2.8.0.jar:2.8.0] at org.apache.sshd.common.future.DefaultSshFuture.await0(DefaultSshFuture.java:73) ~[sshd-common-2.8.0.jar:2.8.0] at org.apache.sshd.common.future.AbstractSshFuture.verifyResult(AbstractSshFuture.java:109) ~[sshd-common-2.8.0.jar:2.8.0] at org.apache.sshd.common.io.AbstractIoWriteFuture.verify(AbstractIoWriteFuture.java:39) ~[sshd-common-2.8.0.jar:2.8.0] at org.apache.sshd.common.io.AbstractIoWriteFuture.verify(AbstractIoWriteFuture.java:30) ~[sshd-common-2.8.0.jar:2.8.0] at org.apache.sshd.common.future.VerifiableFuture.verify(VerifiableFuture.java:43) ~[sshd-common-2.8.0.jar:2.8.0] at org.apache.sshd.sftp.client.impl.DefaultSftpClient.send(DefaultSftpClient.java:301) ~[sshd-sftp-2.8.0.jar:2.8.0] at org.apache.sshd.sftp.client.impl.SftpOutputStreamAsync.flush(SftpOutputStreamAsync.java:207) ~[sshd-sftp-2.8.0.jar:2.8.0] at org.apache.sshd.sftp.client.impl.SftpOutputStreamAsync.write(SftpOutputStreamAsync.java:138) ~[sshd-sftp-2.8.0.jar:2.8.0] {code} With the SftpModuleProperties.CHUNK_IF_WINDOW_LESS_THAN_PACKET setting it no longer deadlocks but DefaultSftpClient still has potential as this waits forever. {code:java} writeFuture.verify(); {code} When I didn't have the SftpModuleProperties.CHUNK_IF_WINDOW_LESS_THAN_PACKET setting enabled I noticed even after the socket was closed the main thread was still deadlocked on the above. > SFTP worker threads got stuck while processing PUT methods against one > specific SFTP server > ------------------------------------------------------------------------------------------- > > Key: SSHD-1173 > URL: https://issues.apache.org/jira/browse/SSHD-1173 > Project: MINA SSHD > Issue Type: Question > Affects Versions: 2.6.0 > Reporter: Pavel Pohner > Priority: Critical > Labels: SFTP, mina, sshd > Time Spent: 10m > Remaining Estimate: 0h > > Hey guys, > Recently I've encountered one remote SFTP server, that causes SFTP worker > threads to enter TIMED_WAITING state, from which they do not recover. Please, > take a look into this thread dump: > {code:java} > SFTP-worker-3 #2155 prio=5 os_prio=0 cpu=61692.09ms elapsed=1136151.86s > tid=0x00007fe41c005800 nid=0x18808 in Object.wait() > [0x00007fe145b1d000]SFTP-worker-3" #2155 prio=5 os_prio=0 cpu=61692.09ms > elapsed=1136151.86s tid=0x00007fe41c005800 nid=0x18808 in Object.wait() > [0x00007fe145b1d000] java.lang.Thread.State: TIMED_WAITING (on object > monitor) at java.lang.Object.wait(java.base@11.0.10/Native Method) - waiting > on <no object reference available> at > org.apache.sshd.common.future.DefaultSshFuture.await0(DefaultSshFuture.java:69) > - waiting to re-lock in wait() <0x00000006e3bbc420> (a > org.apache.sshd.common.channel.IoWriteFutureImpl) at > org.apache.sshd.common.future.AbstractSshFuture.verifyResult(AbstractSshFuture.java:109) > at > org.apache.sshd.common.io.AbstractIoWriteFuture.verify(AbstractIoWriteFuture.java:39) > at > org.apache.sshd.common.io.AbstractIoWriteFuture.verify(AbstractIoWriteFuture.java:30) > at > org.apache.sshd.common.future.VerifiableFuture.verify(VerifiableFuture.java:43) > at > org.apache.sshd.sftp.client.impl.DefaultSftpClient.send(DefaultSftpClient.java:292) > at > org.apache.sshd.sftp.client.impl.SftpOutputStreamAsync.flush(SftpOutputStreamAsync.java:210) > at > org.apache.sshd.sftp.client.impl.SftpOutputStreamAsync.write(SftpOutputStreamAsync.java:141) > at java.io.InputStream.transferTo(java.base@11.0.10/InputStream.java:705) at > com.mina.command.Put.replyInto(Put.java:55) at > com.sftpserver.MinaSftpHandler.channelReadInternal(MinaSftpHandler.java:251) > at > com.sftpserver.MinaSftpHandler.lambda$channelRead$0(MinaSftpHandler.java:125) > at com.sftpserver.MinaSftpHandler$$Lambda$392/0x0000000800764040.run(Unknown > Source) at > java.util.concurrent.Executors$RunnableAdapter.call(java.base@11.0.10/Executors.java:515) > at > java.util.concurrent.FutureTask.run(java.base@11.0.10/FutureTask.java:264) at > java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.10/ThreadPoolExecutor.java:1128) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.10/ThreadPoolExecutor.java:628) > at java.lang.Thread.run(java.base@11.0.10/Thread.java:834) > {code} > Seems like the thread is waiting for lock in > org.apache.sshd.common.future.DefaultSshFuture.await0(DefaultSshFuture.java), > I'm currently not sure what thread holds the lock and why it's not ever > released. > Also, I'm not able to reproduce this, but it constantly happens to all the > PUT methods targeting one specific remote server (which I don't own), so I > suppose there would be something odd with the remote server, but it doesn't > mean, that we shouldn't be able to deal with that. > Have you ever seen such behavior? Could it be Mina sshd's bug? (seems like > verify() in > org.apache.sshd.sftp.client.impl.DefaultSftpClient.send(DefaultSftpClient.java:292) > is called without any timeout, which then defaults to LONG.MAX here: at > org.apache.sshd.common.future.VerifiableFuture.verify(VerifiableFuture.java:43), > shouldn't we have configurable timeout here? or what's the point of default > timeout? what are we really waiting for in at > org.apache.sshd.common.future.DefaultSshFuture.await0(DefaultSshFuture.java:69)?) -- This message was sent by Atlassian Jira (v8.20.10#820010) --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@mina.apache.org For additional commands, e-mail: dev-h...@mina.apache.org