[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave
Guillaume Boucherie commented on JENKINS-14332:
Hi, I just ran a test on the latest AWS Linux machine (kernel 3.14.35), with the same machine on both master and slave, and the problem is gone. The Jenkins version used is the latest stable, 1.609.1. Regards
This message was sent by Atlassian JIRA (v6.4.2#64017-sha1:e244265) -- You received this message because you are subscribed to the Google Groups Jenkins Issues group. To unsubscribe from this group and stop receiving emails from it, send an email to jenkinsci-issues+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/d/optout.
[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave
Sean Abbott commented on JENKINS-14332:
I was able to connect to the same slave from another Jenkins master using the same kernel and Jenkins version with no issues.
[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave
Sean Abbott commented on JENKINS-14332:
I have the same issue. The kernel is 3.14. Part of the problem is that when slave.jar fails, the agent is NOT shown as offline on the master, so the master keeps trying to send jobs:

Expanded the channel window size to 4MB
[05/11/15 18:02:23] [SSH] Starting slave process: cd /var/lib/jenkins java -Dfile.encoding=UTF8 -jar slave.jar
===[JENKINS REMOTING CAPACITY]===ERROR: Unexpected error in launching a slave. This is probably a bug in Jenkins.
java.lang.IllegalStateException: Already connected
    at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:448)
    at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:366)
    at hudson.plugins.sshslaves.SSHLauncher.startSlave(SSHLauncher.java:945)
    at hudson.plugins.sshslaves.SSHLauncher.access$400(SSHLauncher.java:133)
    at hudson.plugins.sshslaves.SSHLauncher$2.call(SSHLauncher.java:711)
    at hudson.plugins.sshslaves.SSHLauncher$2.call(SSHLauncher.java:696)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
[05/11/15 18:02:24] Launch failed - cleaning up connection
[05/11/15 18:02:24] [SSH] Connection closed.

Even though the log reports the connection as closed, Jenkins still reports the slave as up. My Jenkins master is on 1.596.2.
[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave
Bert Jan Schrijver commented on JENKINS-14332:
Downgrading the kernel on the master has definitely fixed it for us. It has been running for a week now without any trouble.
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira -- You received this message because you are subscribed to the Google Groups Jenkins Issues group. To unsubscribe from this group and stop receiving emails from it, send an email to jenkinsci-issues+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/d/optout.
[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave
Jonathan Langevin commented on JENKINS-14332:
I'm on Amazon EC2 with Ubuntu 14.04.1 LTS. I've downgraded the kernel (manually) to 3.4:

    wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.4.105-quantal/linux-headers-3.4.105-0304105-generic_3.4.105-0304105.201412012335_amd64.deb
    wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.4.105-quantal/linux-headers-3.4.105-0304105_3.4.105-0304105.201412012335_all.deb
    wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.4.105-quantal/linux-image-3.4.105-0304105-generic_3.4.105-0304105.201412012335_amd64.deb
    dpkg -i linux-*

Once I had the 3.4 kernel installed, I had to make it the default kernel on boot, so I followed the instructions here: http://statusq.org/archives/2012/10/24/4584/
I'm trying out some builds now to see how Jenkins behaves...
[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave
Bert Jan Schrijver commented on JENKINS-14332:
Same thing for us: Amazon Linux master with EC2 slaves plugin and Amazon Linux slaves. Builds were randomly hanging and slaves were timing out. We downgraded the kernel on the master this morning from 3.14.23-22.44.amzn1.x86_64 to 3.4.73-64.112.amzn1.x86_64 and haven't seen any issues since. I'll report back later.
[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave
Bert Jan Schrijver edited a comment on JENKINS-14332:
Same thing for us: Amazon Linux master with EC2 slaves plugin and Amazon Linux slaves. Builds were randomly hanging and slaves were timing out. We downgraded the kernel on the master this morning from 3.14.23-22.44.amzn1.x86_64 to 3.4.73-64.112.amzn1.x86_64 and haven't seen any issues since. Slaves are still running the 3.14 kernel (3.14.27-25.47.amzn1.x86_64). I'll report back later.
[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave
Danny Verbeek edited a comment on JENKINS-14332:
I have the same kind of issue.

Working combination:
Jenkins version: 1.588
Remoting version: 2.47
Master JVM: 1.7.0_65
Slave JVM: 1.6.0_33/1.7.0_65
Linux kernel master: 3.13.0-37-generic
Linux kernel slave: 3.2.0-69-virtual

Not working combination:
Jenkins version: 1.588
Remoting version: 2.47
Master JVM: 1.7.0_65
Slave JVM: 1.6.0_33/1.7.0_65 (tested both)
Linux kernel master: 3.13.0-37-generic
Linux kernel slave: 3.13.0-35-generic/3.13.0-39-generic (tested both)

The slave hangs on the POM parsing. Stack trace of slave.jar:

Full thread dump OpenJDK 64-Bit Server VM (24.65-b04 mixed mode):

"Attach Listener" daemon prio=10 tid=0x7f7eb0001800 nid=0x645 runnable [0x]
   java.lang.Thread.State: RUNNABLE

"Stream reader: maven process at Socket[addr=/127.0.0.1,port=46056,localport=55623]" prio=10 tid=0x7f7ec004f800 nid=0x611 runnable [0x7f7ea4825000]
   java.lang.Thread.State: RUNNABLE
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(SocketInputStream.java:152)
    at java.net.SocketInputStream.read(SocketInputStream.java:122)
    at java.io.FilterInputStream.read(FilterInputStream.java:133)
    at java.io.FilterInputStream.read(FilterInputStream.java:107)
    at hudson.util.StreamCopyThread.run(StreamCopyThread.java:60)

"java -XX:MaxPermSize=1024m -cp /tmp/maven3-agent.jar:/usr/share/maven/boot/plexus-classworlds-2.x.jar org.jvnet.hudson.maven3.agent.Maven3Main /usr/share/maven /tmp/slave.jar /tmp/maven3-interceptor.jar /tmp/maven3-interceptor-commons.jar 55623: stdout copier" prio=10 tid=0x7f7ec0046800 nid=0x607 runnable [0x7f7ea4623000]
   java.lang.Thread.State: RUNNABLE
    at java.io.FileInputStream.readBytes(Native Method)
    at java.io.FileInputStream.read(FileInputStream.java:272)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:273)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
    - locked 0xbbbc43f0 (a java.lang.UNIXProcess$ProcessPipeInputStream)
    at java.io.FilterInputStream.read(FilterInputStream.java:107)
    at hudson.util.StreamCopyThread.run(StreamCopyThread.java:60)

"Proc.executor [#1]" daemon prio=10 tid=0x7f7eb81e6000 nid=0x570 waiting on condition [0x7f7ea4724000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for 0xcec5cef0 (a java.util.concurrent.SynchronousQueue$TransferStack)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
    at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
    at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359)
    at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942)
    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

"process reaper" daemon prio=10 tid=0x7f7eb81ec800 nid=0x56c runnable [0x7f7ea485e000]
   java.lang.Thread.State: RUNNABLE
    at java.lang.UNIXProcess.waitForProcessExit(Native Method)
    at java.lang.UNIXProcess.access$500(UNIXProcess.java:54)
    at java.lang.UNIXProcess$4.run(UNIXProcess.java:227)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

"pool-1-thread-6" prio=10 tid=0x7f7eac011800 nid=0x557 waiting on condition [0x7f7ea5cba000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for 0xce80c268 (a java.util.concurrent.SynchronousQueue$TransferStack)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
    at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
    at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359)
    at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942)
    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
    at
[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave
Poul Henriksen commented on JENKINS-14332:
I had the same issue. Downgrading the Jenkins master kernel from 3.14.19-17.43.amzn1.x86_64 to 3.4.62-53.42.amzn1.x86_64 solved it.
[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave
Oleg Nenashev assigned JENKINS-14332 to Unassigned
Change By: Oleg Nenashev (29/Sep/14 12:16 PM)
Assignee: Oleg Nenashev
[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave
Adam commented on JENKINS-14332:
Hey Trevor, any chance you have the steps for moving to the 3.4.x kernel handy? I didn't see that kernel in the main repo.
Is anyone from the Jenkins team able to comment on whether they could help resolve this issue? I leave SSH sessions open overnight with no issues or drops, so I really can't understand why Jenkins keeps having these ping timeouts. Could we get a patch that disables the ping check entirely? I added "-Dhudson.remoting.Launcher.pingIntervalSec=0 -Dhudson.remoting.Launcher.pingTimeoutSec=0" to my startup on the master, but it didn't seem to help. I would appreciate any suggestion that avoids rolling back the kernel.
Trevor, it would be interesting to see the output of "sysctl -a" before and after the rollback. I wonder if a kernel parameter changed that might have an impact.
[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave
Guillaume Boucherie updated JENKINS-14332:
Hi, first, thank you very much for these tips. And yes, the current Amazon Linux AMI (ami-892fe1fe) can't be downgraded to 3.4.x. So I installed Ubuntu 12.04, which comes with a 3.2.x kernel (ami-00d12677), and everything works fine. I also tried Ubuntu 14.04 (3.13.x kernel), and there the problem occurs. I am attaching the output of "sysctl -a" for kernel 3.2 (sysctl_3.2.txt) and 3.13 (sysctl_3.13.txt) in case it helps. Regards
Change By: Guillaume Boucherie (18/Sep/14 9:58 AM)
Attachment: sysctl_3.2.txt
Attachment: sysctl_3.13.txt
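The comparison of "sysctl -a" output that Adam asked for can be done mechanically. The following is only a minimal sketch in Python, assuming that each attachment is a plain "sysctl -a" dump with one "key = value" pair per line. The function names are made up for illustration and do not belong to any Jenkins or kernel tooling.

```python
def parse_sysctl(text):
    """Parse 'key = value' lines from a sysctl -a dump into a dict."""
    params = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if not sep:
            continue  # skip lines that are not key/value pairs
        params[key.strip()] = value.strip()
    return params


def diff_sysctl(old_text, new_text):
    """Return (added, removed, changed) parameters between two dumps."""
    old, new = parse_sysctl(old_text), parse_sysctl(new_text)
    added = {k: new[k] for k in new.keys() - old.keys()}
    removed = {k: old[k] for k in old.keys() - new.keys()}
    changed = {
        k: (old[k], new[k])
        for k in old.keys() & new.keys()
        if old[k] != new[k]
    }
    return added, removed, changed


def diff_files(old_path, new_path):
    """Convenience wrapper: read two dump files and diff them."""
    with open(old_path, encoding="utf-8", errors="replace") as f:
        old_text = f.read()
    with open(new_path, encoding="utf-8", errors="replace") as f:
        new_text = f.read()
    return diff_sysctl(old_text, new_text)
```

With the attachments saved locally, diff_files("sysctl_3.2.txt", "sysctl_3.13.txt") would return the three dictionaries. Many differences between kernel versions are expected and harmless, so the result is a starting point for inspection rather than a diagnosis.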
[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave
Trevor Baker updated JENKINS-14332:
The instance I run Jenkins on was a 2013.09 version that was upgraded to 2014.03.x via yum updates. I presume you could add the 2013.09 yum repo into /etc/yum.repos.d/ and see the 3.4.x kernels. From experience, 3.4.83-70.111 works with everything else from 2014.03.2 at the latest revisions. I've attached a kernel param dump.
Change By: Trevor Baker (18/Sep/14 12:20 PM)
Attachment: 3.4.83-70.111.amzn1.x86_64_kernel_params.txt
[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave
Trevor Baker commented on JENKINS-14332:
This happens consistently when the Jenkins master is running Amazon Linux kernel 3.10.x. If you downgrade the kernel on the master to 3.4.x, it goes away. It does not seem to matter which kernel the slave runs. Every time I forget about this and "yum update" the kernel, it takes some head scratching before I remember to lock the master's kernel at 3.4.x in /etc/grub.conf.
[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave
Daniel Beck updated JENKINS-14332
Change By: Daniel Beck (01/Sep/14 2:43 PM)
Labels: remoting
[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave
Adam Edwards commented on JENKINS-14332:
I am seeing similar issues on EC2 Linux.

[EnvInject] - Loading node environment variables.
FATAL: hudson.remoting.RequestAbortedException: hudson.remoting.Channel$OrderlyShutdown: java.util.concurrent.TimeoutException: Ping started on 1409173580736 hasn't completed at 1409173820736
hudson.remoting.RequestAbortedException: hudson.remoting.RequestAbortedException: hudson.remoting.Channel$OrderlyShutdown: java.util.concurrent.TimeoutException: Ping started on 1409173580736 hasn't completed at 1409173820736
    at hudson.remoting.RequestAbortedException.wrapForRethrow(RequestAbortedException.java:41)
    at hudson.remoting.RequestAbortedException.wrapForRethrow(RequestAbortedException.java:34)
    at hudson.remoting.Request.call(Request.java:174)
    at hudson.remoting.Channel.call(Channel.java:739)
    at hudson.FilePath.act(FilePath.java:1009)
    at org.jenkinsci.plugins.envinject.service.EnvironmentVariablesNodeLoader.gatherEnvironmentVariablesNode(EnvironmentVariablesNodeLoader.java:44)
    at org.jenkinsci.plugins.envinject.EnvInjectListener.loadEnvironmentVariablesNode(EnvInjectListener.java:81)
    at org.jenkinsci.plugins.envinject.EnvInjectListener.setUpEnvironment(EnvInjectListener.java:39)
    at hudson.model.AbstractBuild$AbstractBuildExecution.createLauncher(AbstractBuild.java:581)
    at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:487)
    at hudson.model.Run.execute(Run.java:1706)
    at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
    at hudson.model.ResourceController.execute(ResourceController.java:88)
    at hudson.model.Executor.run(Executor.java:232)
Caused by: hudson.remoting.RequestAbortedException: hudson.remoting.Channel$OrderlyShutdown: java.util.concurrent.TimeoutException: Ping started on 1409173580736 hasn't completed at 1409173820736
    at hudson.remoting.Request.abort(Request.java:299)
    at hudson.remoting.Channel.terminate(Channel.java:802)
    at hudson.remoting.Channel$CloseCommand.execute(Channel.java:951)
    at hudson.remoting.Channel$2.handle(Channel.java:475)
    at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:60)
Caused by: hudson.remoting.Channel$OrderlyShutdown: java.util.concurrent.TimeoutException: Ping started on 1409173580736 hasn't completed at 1409173820736
    ... 3 more
Caused by: Command close created at
    at hudson.remoting.Command.<init>(Command.java:56)
    at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:945)
    at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:943)
    at hudson.remoting.Channel.close(Channel.java:1026)
    at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:110)
    at hudson.remoting.PingThread.ping(PingThread.java:120)
    at hudson.remoting.PingThread.run(PingThread.java:81)
Caused by: java.util.concurrent.TimeoutException: Ping started on 1409173580736 hasn't completed at 1409173820736
    ... 2 more
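The two numbers in the TimeoutException message look like epoch timestamps in milliseconds, so their difference shows how long the ping was outstanding before the channel was closed. A small Python sketch of that arithmetic follows. The regular expression only matches the message wording quoted in this thread, and the function name is hypothetical.

```python
import re

# Matches the wording seen in the log above; other Jenkins versions may differ.
PING_RE = re.compile(r"Ping started on (\d+) hasn't completed at (\d+)")


def ping_wait_seconds(message):
    """Return how long the ping was outstanding, in seconds, or None."""
    match = PING_RE.search(message)
    if match is None:
        return None
    started_ms, gave_up_ms = (int(group) for group in match.groups())
    return (gave_up_ms - started_ms) / 1000.0


msg = ("java.util.concurrent.TimeoutException: "
       "Ping started on 1409173580736 hasn't completed at 1409173820736")
print(ping_wait_seconds(msg))  # prints 240.0
```

The result is 240 seconds, that is, exactly four minutes between the start of the ping and the point at which it was declared dead.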
[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave
Beho Alheit edited a comment on JENKINS-14332:
I have this bug popping up with Mac Mini OSX SSH slaves. The Jenkins version is 1.564. I updated to 1.574 and the issue is still present. It takes a reboot of both the OSX slave and the Jenkins instance itself to fix this issue. This only occurs with the nodes running OSX. After running for a couple of days and processing an arbitrary number of builds, the timeout takes place. Please let me know if there are more details I can submit.
[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave
Beho Alheit commented on JENKINS-14332:
I have this bug popping up with Mac OSX SSH slaves. The Jenkins version is 1.564.
[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave
wbauer commented on JENKINS-14332:
Same issue here with Linux SSH slaves with Jenkins 1.564:

14:33:30 Started by user wbauer
14:33:30 [EnvInject] - Loading node environment variables.
14:33:30 ERROR: SEVERE ERROR occurs
14:33:30 org.jenkinsci.lib.envinject.EnvInjectException: hudson.remoting.ChannelClosedException: channel is already closed
14:33:30     at org.jenkinsci.plugins.envinject.service.EnvironmentVariablesNodeLoader.gatherEnvironmentVariablesNode(EnvironmentVariablesNodeLoader.java:75)
14:33:30     at org.jenkinsci.plugins.envinject.EnvInjectListener.loadEnvironmentVariablesNode(EnvInjectListener.java:81)
14:33:30     at org.jenkinsci.plugins.envinject.EnvInjectListener.setUpEnvironment(EnvInjectListener.java:39)
14:33:30     at hudson.model.AbstractBuild$AbstractBuildExecution.createLauncher(AbstractBuild.java:570)
14:33:30     at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:476)
14:33:30     at hudson.model.Run.execute(Run.java:1706)
14:33:30     at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
14:33:30     at hudson.model.ResourceController.execute(ResourceController.java:88)
14:33:30     at hudson.model.Executor.run(Executor.java:231)
14:33:30 Caused by: hudson.remoting.ChannelClosedException: channel is already closed
14:33:30     at hudson.remoting.Channel.send(Channel.java:541)
14:33:30     at hudson.remoting.Request.call(Request.java:129)
14:33:30     at hudson.remoting.Channel.call(Channel.java:739)
14:33:30     at hudson.FilePath.act(FilePath.java:1009)
14:33:30     at org.jenkinsci.plugins.envinject.service.EnvironmentVariablesNodeLoader.gatherEnvironmentVariablesNode(EnvironmentVariablesNodeLoader.java:44)
14:33:30     ... 8 more
14:33:30 Caused by: java.io.IOException
14:33:30     at hudson.remoting.Channel.close(Channel.java:1027)
14:33:30     at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:110)
14:33:30     at hudson.remoting.PingThread.ping(PingThread.java:120)
14:33:30     at hudson.remoting.PingThread.run(PingThread.java:81)
14:33:30 Caused by: java.util.concurrent.TimeoutException: Ping started on 1401187729473 hasn't completed at 1401187969473
14:33:30     ... 2 more
14:33:30 Notifying upstream projects of job completion
14:33:30 [EnvInject] - ERROR - SEVERE ERROR occurs: channel is already closed
14:33:33 Finished: FAILURE
[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave
Oleg Nenashev assigned JENKINS-14332 to Oleg Nenashev
Change By: Oleg Nenashev (19/May/14 9:21 AM)
Assignee: Oleg Nenashev
[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave
Oleg Nenashev edited a comment on JENKINS-14332 Repeated channel/timeout errors from Jenkins slave

The issue appears on my custom build of the Jenkins core, but it seems it could be reproduced on the newest versions as well. We experienced a network overload, which led to an exception in the PingThread on the Jenkins master, which closed the communication channel. However, the slave still shows as online and takes jobs, but any remote action fails (see the log above), so all scheduled builds fail with an error.

The issue affects ssh-slaves only:
* Linux SSH slaves are "online", but all jobs on them fail with the error above
* Windows services have reconnected automatically...
* Windows JNLP slaves have reconnected as well

I'm not sure whether anything in the issue is Linux-specific. On my installation the default TCP timeouts exceed the PingThread timeout on the master, hence there could be a collision on the server side.
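As a rough illustration of the mechanism described above, the sketch below uses plain java.util.concurrent to model a ping with its own timeout: if the reply is slower than the timeout, the watchdog fires its callback even though the underlying call is still in flight. This is not the Jenkins remoting implementation; PingWatchdog, pingOnce and the onDead parameter are hypothetical names chosen for the example, and the sleep only stands in for a stalled network.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class PingWatchdog {
    /**
     * Runs one ping and reports whether it completed within the timeout.
     * If the ping times out or fails, onDead is invoked; that is the point
     * where a real system would close the channel.
     */
    static boolean pingOnce(Callable<Void> ping, long timeoutMillis, Runnable onDead) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<Void> reply = executor.submit(ping);
            reply.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException | ExecutionException e) {
            onDead.run();
            return false;
        } catch (InterruptedException e) {
            // Preserve the interrupt for the caller and report the ping as not completed.
            Thread.currentThread().interrupt();
            return false;
        } finally {
            // Interrupts a ping that is still waiting so the worker thread can exit.
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        Runnable closeChannel = () -> System.out.println("ping timed out, closing channel");
        // A ping that answers immediately.
        boolean healthy = pingOnce(() -> null, 200, closeChannel);
        // A ping stalled for longer than the ping timeout (hypothetical slow network).
        boolean stalled = pingOnce(() -> { Thread.sleep(2000); return null; }, 200, closeChannel);
        System.out.println("healthy=" + healthy + " stalled=" + stalled);
    }
}
```

In this model the watchdog gives up before the stalled call returns. The comment above describes the same ordering: the PingThread timeout on the master is shorter than the default TCP timeouts.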
[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave
Oleg Nenashev updated JENKINS-14332 Repeated channel/timeout errors from Jenkins slave

Updated the issue's description.

Change By: Oleg Nenashev (19/May/14 9:24 AM)
Environment: jenkins-1.509.4 with remoting-2.36 ssh-slaves-1.2
Description: The issue appears on my custom build of the Jenkins core, but it seems it could be reproduced on the newest versions as well. We experienced a network overload, which led to an exception in the PingThread on the Jenkins master, which closed the communication channel. However, the slave still shows as online and takes jobs, but any remote action fails (see logs above), so all scheduled builds fail with an error. The issue affects ssh-slaves only: Linux SSH slaves are "online", but all jobs on them fail with the error above. Windows services have reconnected automatically... Windows JNLP slaves have reconnected as well.
Component/s: master-slave
Component/s: ssh-slaves
[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave
Oleg Nenashev updated JENKINS-14332 Repeated channel/timeout errors from Jenkins slave

Change By: Oleg Nenashev (19/May/14 9:25 AM)
Description: The issue appears on my custom build of the Jenkins core, but it seems it could be reproduced on the newest versions as well. We experienced a network overload, which led to an exception in the PingThread on the Jenkins master, which closed the communication channel. However, the slave still shows as online and takes jobs, but any remote action fails (see logs above), so all scheduled builds fail with an error. The issue affects ssh-slaves only:
* Linux SSH slaves are "online", but all jobs on them fail with the error above
* Windows services have reconnected automatically...
* Windows JNLP slaves have reconnected as well
[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave
Oleg Nenashev commented on JENKINS-14332 Repeated channel/timeout errors from Jenkins slave

The issue appears on remoting-2.36
[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave
Oleg Nenashev edited a comment on JENKINS-14332 Repeated channel/timeout errors from Jenkins slave

The issue appears on remoting-2.36. Nodes are still alive after the issue.
[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave
Oleg Nenashev commented on JENKINS-14332 Repeated channel/timeout errors from Jenkins slave

In my case the issue appeared after a network overload, so slave nodes were disconnected multiple times due to "recv failed" (TCP timeout).
* Linux SSH slaves are "online", but all jobs on them fail with the error above
* Windows services have reconnected automatically...
* Windows JNLP slaves have reconnected as well

I'm not sure whether anything in the issue is Linux-specific. On my installation the default TCP timeouts exceed the PingThread timeout on the master, hence there could be a collision on the server side.
[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave
Marcus Klein commented on JENKINS-14332 Repeated channel/timeout errors from Jenkins slave

My jobs are randomly failing with version 1.550 with the following exception:

FATAL: hudson.remoting.RequestAbortedException: hudson.remoting.Channel$OrderlyShutdown: java.util.concurrent.TimeoutException: Ping started on 1392852728137 hasn't completed at 1392852968137
hudson.remoting.RequestAbortedException: hudson.remoting.RequestAbortedException: hudson.remoting.Channel$OrderlyShutdown: java.util.concurrent.TimeoutException: Ping started on 1392852728137 hasn't completed at 1392852968137
    at hudson.remoting.RequestAbortedException.wrapForRethrow(RequestAbortedException.java:41)
    at hudson.remoting.RequestAbortedException.wrapForRethrow(RequestAbortedException.java:34)
    at hudson.remoting.Request.call(Request.java:174)
    at hudson.remoting.Channel.call(Channel.java:722)
    at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:167)
    at com.sun.proxy.$Proxy47.join(Unknown Source)
    at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:925)
    at hudson.Launcher$ProcStarter.join(Launcher.java:360)
    at hudson.tasks.Ant.perform(Ant.java:217)
    at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
    at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:785)
    at hudson.model.Build$BuildExecution.build(Build.java:199)
    at hudson.model.Build$BuildExecution.doRun(Build.java:160)
    at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:566)
    at hudson.model.Run.execute(Run.java:1678)
    at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
    at hudson.model.ResourceController.execute(ResourceController.java:88)
    at hudson.model.Executor.run(Executor.java:231)
Caused by: hudson.remoting.RequestAbortedException: hudson.remoting.Channel$OrderlyShutdown: java.util.concurrent.TimeoutException: Ping started on 1392852728137 hasn't completed at 1392852968137
    at hudson.remoting.Request.abort(Request.java:299)
    at hudson.remoting.Channel.terminate(Channel.java:782)
    at hudson.remoting.Channel$CloseCommand.execute(Channel.java:927)
    at hudson.remoting.Channel$2.handle(Channel.java:461)
    at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:60)
Caused by: hudson.remoting.Channel$OrderlyShutdown: java.util.concurrent.TimeoutException: Ping started on 1392852728137 hasn't completed at 1392852968137
    ... 3 more
Caused by: Command close created at
    at hudson.remoting.Command.<init>(Command.java:56)
    at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:921)
    at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:919)
    at hudson.remoting.Channel.close(Channel.java:1002)
    at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:110)
    at hudson.remoting.PingThread.ping(PingThread.java:120)
    at hudson.remoting.PingThread.run(PingThread.java:81)
Caused by: java.util.concurrent.TimeoutException: Ping started on 1392852728137 hasn't completed at 1392852968137
    ... 2 more

I rolled back to version 1.546 because I think it did not happen with that version.
[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave
Nickolay Rumyantsev commented on JENKINS-14332 Repeated channel/timeout errors from Jenkins slave

Having a similar issue. The slave is reconnected during the archiving of artifacts, when Jenkins is under load. Jenkins 1.543.
[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave
Julien Carsique edited a comment on JENKINS-14332 Repeated channel/timeout errors from Jenkins slave

Maybe not a Jenkins issue. Having the same issue, reproduced at every build for a given slave (but with nothing relevant in its logs), I tried to disconnect and reconnect the slave:

[05/06/13 14:06:04] Launching slave agent
$ ssh slavedns java -jar ~/bin/slave.jar
===[JENKINS REMOTING CAPACITY]==[JENKINS REMOTING CAPACITY]===channel started
channel started
Slave.jar version: 2.22
This is a Unix slave
Slave.jar version: 2.22
This is a Unix slave
Copied maven-agent.jar
Copied maven3-agent.jar
Copied maven3-interceptor.jar
Copied maven-agent.jar
Copied maven-interceptor.jar
Copied maven2.1-interceptor.jar
Copied plexus-classworld.jar
Copied maven3-agent.jar
Copied maven3-interceptor.jar
Copied classworlds.jar
Copied maven-interceptor.jar
Copied maven2.1-interceptor.jar
Copied plexus-classworld.jar
Copied classworlds.jar
Evacuated stdout
Evacuated stdout
ERROR: Unexpected error in launching a slave. This is probably a bug in Jenkins (...)
java.lang.IllegalStateException: Already connected
    at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:459)
    at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:339)
    at hudson.slaves.CommandLauncher.launch(CommandLauncher.java:122)
    at hudson.slaves.SlaveComputer$1.call(SlaveComputer.java:222)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Connection terminated
channel stopped
ERROR: Unexpected error in launching a slave. This is probably a bug in Jenkins (...)
java.lang.NullPointerException
    at org.jenkinsci.modules.slave_installer.impl.ComputerListenerImpl.onOnline(ComputerListenerImpl.java:32)
    at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:471)
    at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:339)
    at hudson.slaves.CommandLauncher.launch(CommandLauncher.java:122)
    at hudson.slaves.SlaveComputer$1.call(SlaveComputer.java:222)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
channel stopped
Connection terminated

Then the slave successfully reconnected itself. It turned out there was a looping thread consuming 100% CPU. Killing the process solved the issue. Strangely, some system and Java commands did not work (ps, cat, less, jstack, trace, ...) until it was killed, whereas other commands worked (top, jps, renice, kill, ...). That could explain the weird Jenkins timeout log, where the two timestamps are about 240 seconds (4 minutes) apart (java.util.concurrent.TimeoutException: Ping started on 1367743028681 hasn't completed at 1367743268682): the system was partially frozen, with really unstable response times.

Note the looping thread came from a previous job which stopped badly on timeout:

03:00:16.868 Build timed out (after 180 minutes). Marking the build as aborted.
03:00:16.873 Build was aborted
03:00:16.874 Archiving artifacts
03:00:16.874 ERROR: Failed to archive artifacts: **/log/*.log, tomcat*/nxserver/config/distribution.properties
03:00:16.875 hudson.remoting.ChannelClosedException: channel is already closed
03:00:16.876 at hudson.remoting.Channel.send(Channel.java:494)
03:00:16.876 at hudson.remoting.Request.call(Request.java:129)
03:00:16.876 at hudson.remoting.Channel.call(Channel.java:672)
03:00:16.876 at hudson.EnvVars.getRemote(EnvVars.java:212)
03:00:16.876 at hudson.model.Computer.getEnvironment(Computer.java:882)
03:00:16.876 at jenkins.model.CoreEnvironmentContributor.buildEnvironmentFor(CoreEnvironmentContributor.java:28)
03:00:16.876 at hudson.model.Run.getEnvironment(Run.java:2028)
03:00:16.876 at hudson.model.AbstractBuild.getEnvironment(AbstractBuild.java:927)
03:00:16.876 at hudson.tasks.ArtifactArchiver.perform(ArtifactArchiver.java:115)
03:00:16.876 at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
03:00:16.876 at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:798)
03:00:16.876 at