[
https://issues.apache.org/jira/browse/BROOKLYN-106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14270971#comment-14270971
]
Aled Sage commented on BROOKLYN-106:
------------------------------------
Looking again at the previously attached jstack, it shows the hang in the
same place: sshj is blocked trying to read the packet length:
{noformat}
"sftp reader" daemon prio=10 tid=0x00007f89a9775800 nid=0x5258 in Object.wait() [0x00007f897b1f0000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x00000000c9760cd8> (a net.schmizz.sshj.common.Buffer$PlainBuffer)
    at java.lang.Object.wait(Object.java:503)
    at net.schmizz.sshj.connection.channel.ChannelInputStream.read(ChannelInputStream.java:128)
    - locked <0x00000000c9760cd8> (a net.schmizz.sshj.common.Buffer$PlainBuffer)
    at net.schmizz.sshj.sftp.PacketReader.readIntoBuffer(PacketReader.java:49)
    at net.schmizz.sshj.sftp.PacketReader.getPacketLength(PacketReader.java:57)
    at net.schmizz.sshj.sftp.PacketReader.readPacket(PacketReader.java:73)
    at net.schmizz.sshj.sftp.PacketReader.run(PacketReader.java:85)
{noformat}
That suggests we are blocked waiting for the next packet to arrive, but that it
never arrives (with the TCP connection staying open). I suspect that the
brooklyn-server side is sending no additional packets.
Looking at [1], I wonder if we should enable TCP keepalive, so that probe
packets are sent. If we don't get the ack back then the brooklyn-side would be
able to tell that the connection was broken and so would fail (rather than
waiting forever in case another packet arrives).
[1] http://tldp.org/HOWTO/TCP-Keepalive-HOWTO/overview.html
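
As a rough illustration of the keepalive idea, here is a minimal sketch using only the JDK. The class name KeepAliveSocketFactory is hypothetical, and how (or whether) such a factory would be plugged into sshj or Brooklyn's ssh tool is an assumption that has not been checked here. It wraps an existing SocketFactory and sets SO_KEEPALIVE on every socket it hands out:

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.Socket;
import javax.net.SocketFactory;

/**
 * Hypothetical helper (not part of Brooklyn or sshj): delegates socket
 * creation to another factory and enables SO_KEEPALIVE on each socket.
 */
public class KeepAliveSocketFactory extends SocketFactory {
    private final SocketFactory delegate;

    public KeepAliveSocketFactory(SocketFactory delegate) {
        this.delegate = delegate;
    }

    // Ask the OS to send TCP keepalive probes on this socket when it is idle.
    private static Socket withKeepAlive(Socket socket) throws IOException {
        socket.setKeepAlive(true);
        return socket;
    }

    @Override
    public Socket createSocket() throws IOException {
        return withKeepAlive(delegate.createSocket());
    }

    @Override
    public Socket createSocket(String host, int port) throws IOException {
        return withKeepAlive(delegate.createSocket(host, port));
    }

    @Override
    public Socket createSocket(String host, int port, InetAddress localHost, int localPort)
            throws IOException {
        return withKeepAlive(delegate.createSocket(host, port, localHost, localPort));
    }

    @Override
    public Socket createSocket(InetAddress host, int port) throws IOException {
        return withKeepAlive(delegate.createSocket(host, port));
    }

    @Override
    public Socket createSocket(InetAddress address, int port, InetAddress localAddress, int localPort)
            throws IOException {
        return withKeepAlive(delegate.createSocket(address, port, localAddress, localPort));
    }
}
```

Note that Socket.setKeepAlive only switches the option on; the idle time before the first probe and the probe interval are operating-system settings (see [1]), so on default settings it could still take a long time before a dead connection is detected.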
> ssh command hangs (getting stdout/stderr) for vcloud-director
> -------------------------------------------------------------
>
> Key: BROOKLYN-106
> URL: https://issues.apache.org/jira/browse/BROOKLYN-106
> Project: Brooklyn
> Issue Type: Bug
> Affects Versions: 0.7.0-SNAPSHOT
> Reporter: Aled Sage
> Assignee: Aled Sage
> Attachments: debug.log.tgz, jstack.txt, messages.tgz, ssh-stdout.txt
>
>
> When deploying Tomcat to a CentOS 6.4 VM on VMware's vcloud-air, it hangs
> while installing Java!
> The Brooklyn web-console shows that it is still waiting for a result from the
> ssh command (which executed `sudo -E -n -S -- yum -y --nogpgcheck install
> java-1.7.0-openjdk-devel`).
> However, when logging into the VM I can see that the `yum` command has
> finished, and the /var/log/messages (attached) shows that the install
> completed.
> This fails repeatedly. It used to pass!
> The stdout is at 32040 bytes. The last few lines of the stdout (as shown in
> the web-console) are:
> {noformat}
> Installing : libtasn1-2.3-6.el6_5.x86_64        50/56
> Installing : gnutls-2.8.5-14.el6_5.x86_64       51/56
> Installing : 1:cups-libs-1.4.2-67.el6.x86_64    52/56
> {noformat}
> Could there be some buffer set to 32K, so it's stuck not reading the rest of
> the stdout (but `SshjToolPerformanceTest.testConsecutiveBigStdoutCommands`
> passes)?
> Why else would our ssh command be stuck, not returning?