Hello,

One of the issues we have recently been experiencing with Jenkins is that the 
slaves (nodes) go offline for no apparent reason and do not reconnect 
automatically.
When slaves appear offline, we try to launch/reconnect them manually, but that 
does not work either. However, we are still able to SSH into the machines using 
PuTTY.
The only workaround is to restart the Jenkins server, which helps until the 
problem surfaces again (typically after a week).

Instance Information
--------------------
Jenkins Server:            1.562
SSH Credentials Plugin:    1.6.1
SSH Slaves Plugin:         1.6

Thread dump of the slave nodes:
{dump}
"Channel reader thread: qa-linbuild-02" prio=5 WAITING
        java.lang.Object.wait(Native Method)
        java.lang.Object.wait(Object.java:485)
        com.trilead.ssh2.channel.ChannelManager.waitUntilChannelOpen(ChannelManager.java:109)
        com.trilead.ssh2.channel.ChannelManager.openSessionChannel(ChannelManager.java:583)
        com.trilead.ssh2.Session.<init>(Session.java:41)
        com.trilead.ssh2.Connection.openSession(Connection.java:1129)
        com.trilead.ssh2.SFTPv3Client.<init>(SFTPv3Client.java:99)
        com.trilead.ssh2.SFTPv3Client.<init>(SFTPv3Client.java:119)
        hudson.plugins.sshslaves.SSHLauncher.afterDisconnect(SSHLauncher.java:1160)
        hudson.slaves.SlaveComputer$2.onClosed(SlaveComputer.java:437)
        hudson.remoting.Channel.terminate(Channel.java:819)
        hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:76)

"Channel reader thread: qa-linbuild-03" prio=5 WAITING
        java.lang.Object.wait(Native Method)
        java.lang.Object.wait(Object.java:485)
        com.trilead.ssh2.channel.ChannelManager.waitUntilChannelOpen(ChannelManager.java:109)
        com.trilead.ssh2.channel.ChannelManager.openSessionChannel(ChannelManager.java:583)
        com.trilead.ssh2.Session.<init>(Session.java:41)
        com.trilead.ssh2.Connection.openSession(Connection.java:1129)
        com.trilead.ssh2.SFTPv3Client.<init>(SFTPv3Client.java:99)
        com.trilead.ssh2.SFTPv3Client.<init>(SFTPv3Client.java:119)
        hudson.plugins.sshslaves.SSHLauncher.afterDisconnect(SSHLauncher.java:1160)
        hudson.slaves.SlaveComputer$2.onClosed(SlaveComputer.java:437)
        hudson.remoting.Channel.terminate(Channel.java:819)
        hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:76)
{dump}

Also concerning is the number of threads in the BLOCKED state (126!). 
That doesn't seem normal, as there are no BLOCKED threads after the server is 
restarted.
{dump}
// 118 instances
"Computer.threadPoolForRemoting [#26]" daemon prio=5 BLOCKED
        hudson.plugins.sshslaves.SSHLauncher.afterDisconnect(SSHLauncher.java:1152)
        hudson.slaves.SlaveComputer$3.run(SlaveComputer.java:542)
        jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
        java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
        java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        java.util.concurrent.FutureTask.run(FutureTask.java:138)
        java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        java.lang.Thread.run(Thread.java:662)

// 8 instances
"Computer.threadPoolForRemoting [#2922]" daemon prio=5 BLOCKED
        hudson.plugins.sshslaves.SSHLauncher.launch(SSHLauncher.java:639)
        hudson.slaves.SlaveComputer$1.call(SlaveComputer.java:222)
        jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
        java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        java.util.concurrent.FutureTask.run(FutureTask.java:138)
        java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        java.lang.Thread.run(Thread.java:662)
{dump}
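
In case it helps anyone compare against their own server: the BLOCKED figure 
(118 + 8 = 126) can be tallied from a saved, uncondensed thread dump with a 
small script. The following is only a rough sketch. It assumes each thread 
header starts with the quoted thread name and ends with the thread state, as 
in the excerpts in this post; the function name count_thread_states and the 
SAMPLE text are made up for illustration.

```python
from collections import Counter


def count_thread_states(dump_text):
    """Count thread header lines per state in a plain-text thread dump.

    Assumption: a header line starts with the quoted thread name and its
    last whitespace-separated token is the state (WAITING, BLOCKED, ...).
    Stack frame lines and hand-written comments are ignored.
    """
    counts = Counter()
    for line in dump_text.splitlines():
        line = line.strip()
        if line.startswith('"'):
            # The state is the last token on the header line.
            counts[line.rsplit(None, 1)[-1]] += 1
    return counts


# Shortened, illustrative sample in the same layout as the dumps in this post.
SAMPLE = '''\
"Channel reader thread: qa-linbuild-02" prio=5 WAITING
        java.lang.Object.wait(Native Method)
"Computer.threadPoolForRemoting [#26]" daemon prio=5 BLOCKED
        hudson.plugins.sshslaves.SSHLauncher.afterDisconnect(SSHLauncher.java:1152)
"Computer.threadPoolForRemoting [#2922]" daemon prio=5 BLOCKED
        hudson.plugins.sshslaves.SSHLauncher.launch(SSHLauncher.java:639)
'''

if __name__ == "__main__":
    for state, number in count_thread_states(SAMPLE).most_common():
        print(state, number)
```

The excerpts in this post were condensed by hand (hence the "instances" 
comments), so the script would have to be run on the raw dump to reproduce 
the 126 figure.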

Looking forward to any ideas or suggestions.

Thank you.
Charles Chan
