[ 
https://issues.apache.org/jira/browse/SSHD-754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16079668#comment-16079668
 ] 

Eugene Petrenko commented on SSHD-754:
--------------------------------------

It is indeed a non-trivial problem that I have discovered. First, I want to 
make sure the proposed fix is good enough to be converted into a pull request. 

The problem I have is that on the SSH server:
1) the channel remote window is set to 2GB
2) the client is slow to receive data (data is generated faster than it is 
consumed by the network/client)
2a) slow network connection
2b) rekey is running

Because of 1) and 2) we get an OOM: too many data chunks are queued.

The proposed solution is to limit the send queue only for DATA and 
EXTENDED_DATA messages (is that correct?) by blocking a sender that is too 
active (a deadlock is possible if we block an NIO/callback thread).
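
To make the idea concrete, here is a minimal sketch of such a limit, assuming 
a fixed byte budget for queued-but-unsent data. PendingBytesLimiter, 
tryReserve and release are hypothetical names of mine, not existing SSHD API. 
The wait has a timeout so that a caller is never blocked forever.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical helper (NOT part of Apache MINA SSHD): caps the number of
 * bytes that are queued for sending but not yet written out.
 */
public class PendingBytesLimiter {
    private final int maxPendingBytes;
    private final Semaphore budget;

    public PendingBytesLimiter(int maxPendingBytes) {
        if (maxPendingBytes <= 0) {
            throw new IllegalArgumentException("maxPendingBytes must be positive");
        }
        this.maxPendingBytes = maxPendingBytes;
        // Fair semaphore: waiting senders are served in arrival order.
        this.budget = new Semaphore(maxPendingBytes, true);
    }

    /**
     * Reserves room for a chunk before it is queued. Waits up to
     * timeoutMillis; returns false on timeout or interruption so the
     * caller can decide what to do instead of hanging.
     */
    public boolean tryReserve(int bytes, long timeoutMillis) {
        try {
            return budget.tryAcquire(clamp(bytes), timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return false;
        }
    }

    /** Returns the reserved room once the chunk has actually been written. */
    public void release(int bytes) {
        budget.release(clamp(bytes));
    }

    /** Bytes that can still be queued without waiting. */
    public int availableBytes() {
        return budget.availablePermits();
    }

    // A chunk larger than the whole budget is charged as the full budget,
    // so it can still be sent (alone) instead of waiting forever.
    private int clamp(int bytes) {
        if (bytes < 0) {
            throw new IllegalArgumentException("bytes must not be negative");
        }
        return Math.min(bytes, maxPendingBytes);
    }
}
```

The sender would call tryReserve(len, timeout) before queuing a DATA or 
EXTENDED_DATA packet, and release(len) from the write-completion callback. 
That wiring is my assumption and is not shown here. A false result on a 
NIO/callback thread should be handled without waiting again.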

An alternative solution would be to implement similar logic in 
org.apache.sshd.common.channel.ChannelOutputStream and 
org.apache.sshd.common.channel.ChannelAsyncOutputStream. However, there might 
be other usages of Channel/Session, so fixing those 2 classes may not be 
enough, and it would be tricky to avoid code duplication.

What would you say?


> OOM in sending data for channel
> -------------------------------
>
>                 Key: SSHD-754
>                 URL: https://issues.apache.org/jira/browse/SSHD-754
>             Project: MINA SSHD
>          Issue Type: Bug
>    Affects Versions: 1.1.0
>            Reporter: Eugene Petrenko
>
> I have an implementation of SSHD server with the library. It sends gigabytes 
> (e.g. 5GB) of data as command output. 
> Starting from Putty plink 0.68 (also includes plink 0.69) we started to have 
> OOM errors. Checking memory dumps showed that most of the memory is consumed 
> by the function
> org.apache.sshd.common.session.AbstractSession#writePacket(org.apache.sshd.common.util.buffer.Buffer)
> In the hprof I see thousands of PendingWriteFuture objects (btw, each holds a 
> reference to a logger instance). And those objects are only created from this 
> function. 
> It is clear the session is running through rekey. I see the kexState 
> indicating the progress. 
> Is there a way to artificially limit the sending queue, no matter whether 
> the related remote window allows sending that enormous amount of data? By my 
> estimate, the window was reported to be around 1.5 GB or more. Maybe such a 
> huge window size was caused by an arithmetic overflow that is fixed in 
> SSHD-701



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
