[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave

2015-07-08 Thread guillaume.bouche...@gmail.com (JIRA)

Guillaume Boucherie commented on JENKINS-14332
 
Hi, 
I just ran a test on the latest AWS Linux machine (kernel 3.14.35), using the same machine for both master and slave, and the problem is gone. The Jenkins version used is the latest stable release, 1.609.1.
Regards

This message was sent by Atlassian JIRA (v6.4.2#64017-sha1:e244265)

-- 
You received this message because you are subscribed to the Google Groups Jenkins Issues group.
To unsubscribe from this group and stop receiving emails from it, send an email to jenkinsci-issues+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave

2015-05-11 Thread seabb...@akamai.com (JIRA)

Sean Abbott commented on JENKINS-14332
 
I was able to connect to the same slave from another jenkins master using the same kernel and jenkins version with no issues...


[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave

2015-05-11 Thread seabb...@akamai.com (JIRA)

Sean Abbott commented on JENKINS-14332
 
I have the same issue. The kernel is 3.14. Part of the problem is that when slave.jar fails, the master does NOT show the agent as offline, so the master keeps trying to send jobs:

Expanded the channel window size to 4MB
[05/11/15 18:02:23] [SSH] Starting slave process: cd /var/lib/jenkins && java -Dfile.encoding=UTF8 -jar slave.jar
<===[JENKINS REMOTING CAPACITY]===>
ERROR: Unexpected error in launching a slave. This is probably a bug in Jenkins.
java.lang.IllegalStateException: Already connected
	at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:448)
	at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:366)
	at hudson.plugins.sshslaves.SSHLauncher.startSlave(SSHLauncher.java:945)
	at hudson.plugins.sshslaves.SSHLauncher.access$400(SSHLauncher.java:133)
	at hudson.plugins.sshslaves.SSHLauncher$2.call(SSHLauncher.java:711)
	at hudson.plugins.sshslaves.SSHLauncher$2.call(SSHLauncher.java:696)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
[05/11/15 18:02:24] Launch failed - cleaning up connection
[05/11/15 18:02:24] [SSH] Connection closed.

Even though the log reports the connection closed, Jenkins still reports the agent as up.
My jenkins master is on 1.596.2.


[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave

2015-02-17 Thread bert...@jpoint.nl (JIRA)

Bert Jan Schrijver commented on JENKINS-14332

Downgrading the kernel on the master has definitely fixed it for us.
Running for a week now without any trouble.

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave

2015-02-10 Thread j...@langevin.me (JIRA)

Jonathan Langevin commented on JENKINS-14332

I'm on Amazon EC2 w/ Ubuntu 14.04.1 LTS.

I've downgraded the kernel (manually) to 3.4.


wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.4.105-quantal/linux-headers-3.4.105-0304105-generic_3.4.105-0304105.201412012335_amd64.deb
wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.4.105-quantal/linux-headers-3.4.105-0304105_3.4.105-0304105.201412012335_all.deb
wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.4.105-quantal/linux-image-3.4.105-0304105-generic_3.4.105-0304105.201412012335_amd64.deb
dpkg -i linux-*


Once I had the 3.4 kernel installed, I had to make it the default kernel on boot, so I followed the instructions here: http://statusq.org/archives/2012/10/24/4584/
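
After rebooting, it is worth confirming which kernel is actually running before drawing conclusions from test builds. A minimal Python sketch, assuming the release string looks like the output of `uname -r`; the function names are illustrative and not part of any Jenkins or Ubuntu tooling:

```python
import platform
import re


def parse_kernel_release(release):
    """Split a `uname -r` style string into (major, minor, patch)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def is_3_4_series(release):
    """True when the release string belongs to the 3.4.x kernel series."""
    return parse_kernel_release(release)[:2] == (3, 4)


def running_kernel_is_3_4():
    """Check the kernel this interpreter runs on (same value as `uname -r` on Linux)."""
    return is_3_4_series(platform.release())
```

Calling `running_kernel_is_3_4()` on the machine after the reboot should return True if the 3.4 kernel really became the boot default.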

I'm trying out some builds now to see how Jenkins behaves...


[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave

2015-02-10 Thread bert...@jpoint.nl (JIRA)

Bert Jan Schrijver commented on JENKINS-14332

Same thing for us: Amazon Linux master with EC2 slaves plugin and Amazon Linux slaves. Builds were randomly hanging and slaves were timing out.
We downgraded the kernel on the master this morning from 3.14.23-22.44.amzn1.x86_64 to 3.4.73-64.112.amzn1.x86_64 and haven't seen any issues since. I'll report back later.


[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave

2015-02-10 Thread bert...@jpoint.nl (JIRA)

Bert Jan Schrijver edited a comment on JENKINS-14332

Same thing for us: Amazon Linux master with EC2 slaves plugin and Amazon Linux slaves. Builds were randomly hanging and slaves were timing out.
We downgraded the kernel on the master this morning from 3.14.23-22.44.amzn1.x86_64 to 3.4.73-64.112.amzn1.x86_64 and haven't seen any issues since. 
Slaves are still running 3.14 kernel (3.14.27-25.47.amzn1.x86_64).
I'll report back later.


[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave

2014-11-05 Thread dverbee...@gmail.com (JIRA)

Danny Verbeek commented on JENKINS-14332

I have the same kind of issue.

Working combination:
Jenkins version: 1.588
Remoting version: 2.47
Master JVM: 1.7.0_65
Slave JVM: 1.6.0_33/1.7.0_65
Linux kernel master: 3.13.0-37-generic
Linux kernel slave: 3.2.0-69-virtual


Not Working combination:
Jenkins version: 1.588
Remoting version: 2.47
Master JVM: 1.7.0_65
Slave JVM: 1.6.0_33/1.7.0_65(Tested both)
Linux kernel master: 3.13.0-37-generic
Linux kernel slave: 3.13.0-35-generic/3.13.0-39-generic(Tested both)

The slave hangs on the POM parsing
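
The two lists above can be compared mechanically to show which field differs. A small Python sketch using only the values quoted in this comment; the dictionary and function names are illustrative:

```python
# Values copied from the "Working" and "Not Working" lists in this comment.
WORKING = {
    "Jenkins version": "1.588",
    "Remoting version": "2.47",
    "Master JVM": "1.7.0_65",
    "Slave JVM": "1.6.0_33/1.7.0_65",
    "Linux kernel master": "3.13.0-37-generic",
    "Linux kernel slave": "3.2.0-69-virtual",
}

NOT_WORKING = dict(
    WORKING,
    **{"Linux kernel slave": "3.13.0-35-generic/3.13.0-39-generic"}
)


def differing_fields(first, second):
    """Return {field: (first value, second value)} for every field that differs."""
    fields = sorted(set(first) | set(second))
    return {
        name: (first.get(name), second.get(name))
        for name in fields
        if first.get(name) != second.get(name)
    }
```

With these values, `differing_fields(WORKING, NOT_WORKING)` contains a single entry, for "Linux kernel slave".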

stack trace slave.jar:
Full thread dump OpenJDK 64-Bit Server VM (24.65-b04 mixed mode):

"Attach Listener" daemon prio=10 tid=0x7f7eb0001800 nid=0x645 runnable [0x]
   java.lang.Thread.State: RUNNABLE

"Stream reader: maven process at Socket[addr=/127.0.0.1,port=46056,localport=55623]" prio=10 tid=0x7f7ec004f800 nid=0x611 runnable [0x7f7ea4825000]
   java.lang.Thread.State: RUNNABLE
	at java.net.SocketInputStream.socketRead0(Native Method)
	at java.net.SocketInputStream.read(SocketInputStream.java:152)
	at java.net.SocketInputStream.read(SocketInputStream.java:122)
	at java.io.FilterInputStream.read(FilterInputStream.java:133)
	at java.io.FilterInputStream.read(FilterInputStream.java:107)
	at hudson.util.StreamCopyThread.run(StreamCopyThread.java:60)

"java -XX:MaxPermSize=1024m -cp /tmp/maven3-agent.jar:/usr/share/maven/boot/plexus-classworlds-2.x.jar org.jvnet.hudson.maven3.agent.Maven3Main /usr/share/maven /tmp/slave.jar /tmp/maven3-interceptor.jar /tmp/maven3-interceptor-commons.jar 55623: stdout copier" prio=10 tid=0x7f7ec0046800 nid=0x607 runnable [0x7f7ea4623000]
   java.lang.Thread.State: RUNNABLE
	at java.io.FileInputStream.readBytes(Native Method)
	at java.io.FileInputStream.read(FileInputStream.java:272)
	at java.io.BufferedInputStream.read1(BufferedInputStream.java:273)
	at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
	- locked 0xbbbc43f0 (a java.lang.UNIXProcess$ProcessPipeInputStream)
	at java.io.FilterInputStream.read(FilterInputStream.java:107)
	at hudson.util.StreamCopyThread.run(StreamCopyThread.java:60)

"Proc.executor [#1]" daemon prio=10 tid=0x7f7eb81e6000 nid=0x570 waiting on condition [0x7f7ea4724000]
   java.lang.Thread.State: TIMED_WAITING (parking)
	at sun.misc.Unsafe.park(Native Method)
	- parking to wait for  0xcec5cef0 (a java.util.concurrent.SynchronousQueue$TransferStack)
	at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
	at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
	at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359)
	at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942)
	at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)

"process reaper" daemon prio=10 tid=0x7f7eb81ec800 nid=0x56c runnable [0x7f7ea485e000]
   java.lang.Thread.State: RUNNABLE
	at java.lang.UNIXProcess.waitForProcessExit(Native Method)
	at java.lang.UNIXProcess.access$500(UNIXProcess.java:54)
	at java.lang.UNIXProcess$4.run(UNIXProcess.java:227)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)

"pool-1-thread-6" prio=10 tid=0x7f7eac011800 nid=0x557 waiting on condition [0x7f7ea5cba000]
   java.lang.Thread.State: TIMED_WAITING (parking)
	at sun.misc.Unsafe.park(Native Method)
	- parking to wait for  0xce80c268 (a java.util.concurrent.SynchronousQueue$TransferStack)
	at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
	at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
	at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359)
	at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942)
	at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
	at 

[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave

2014-11-05 Thread dverbee...@gmail.com (JIRA)

Danny Verbeek edited a comment on JENKINS-14332

I have the same kind of issue.

Working combination:
Jenkins version: 1.588
Remoting version: 2.47
Master JVM: 1.7.0_65
Slave JVM: 1.6.0_33/1.7.0_65
Linux kernel master: 3.13.0-37-generic
Linux kernel slave: 3.2.0-69-virtual


Not Working combination:
Jenkins version: 1.588
Remoting version: 2.47
Master JVM: 1.7.0_65
Slave JVM: 1.6.0_33/1.7.0_65(Tested both)
Linux kernel master: 3.13.0-37-generic
Linux kernel slave: 3.13.0-35-generic/3.13.0-39-generic(Tested both)

The slave hangs on the POM parsing

stack trace slave.jar:

Full thread dump OpenJDK 64-Bit Server VM (24.65-b04 mixed mode):

"Attach Listener" daemon prio=10 tid=0x7f7eb0001800 nid=0x645 runnable [0x]
   java.lang.Thread.State: RUNNABLE

"Stream reader: maven process at Socket[addr=/127.0.0.1,port=46056,localport=55623]" prio=10 tid=0x7f7ec004f800 nid=0x611 runnable [0x7f7ea4825000]
   java.lang.Thread.State: RUNNABLE
	at java.net.SocketInputStream.socketRead0(Native Method)
	at java.net.SocketInputStream.read(SocketInputStream.java:152)
	at java.net.SocketInputStream.read(SocketInputStream.java:122)
	at java.io.FilterInputStream.read(FilterInputStream.java:133)
	at java.io.FilterInputStream.read(FilterInputStream.java:107)
	at hudson.util.StreamCopyThread.run(StreamCopyThread.java:60)

"java -XX:MaxPermSize=1024m -cp /tmp/maven3-agent.jar:/usr/share/maven/boot/plexus-classworlds-2.x.jar org.jvnet.hudson.maven3.agent.Maven3Main /usr/share/maven /tmp/slave.jar /tmp/maven3-interceptor.jar /tmp/maven3-interceptor-commons.jar 55623: stdout copier" prio=10 tid=0x7f7ec0046800 nid=0x607 runnable [0x7f7ea4623000]
   java.lang.Thread.State: RUNNABLE
	at java.io.FileInputStream.readBytes(Native Method)
	at java.io.FileInputStream.read(FileInputStream.java:272)
	at java.io.BufferedInputStream.read1(BufferedInputStream.java:273)
	at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
	- locked 0xbbbc43f0 (a java.lang.UNIXProcess$ProcessPipeInputStream)
	at java.io.FilterInputStream.read(FilterInputStream.java:107)
	at hudson.util.StreamCopyThread.run(StreamCopyThread.java:60)

"Proc.executor [#1]" daemon prio=10 tid=0x7f7eb81e6000 nid=0x570 waiting on condition [0x7f7ea4724000]
   java.lang.Thread.State: TIMED_WAITING (parking)
	at sun.misc.Unsafe.park(Native Method)
	- parking to wait for  0xcec5cef0 (a java.util.concurrent.SynchronousQueue$TransferStack)
	at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
	at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
	at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359)
	at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942)
	at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)

"process reaper" daemon prio=10 tid=0x7f7eb81ec800 nid=0x56c runnable [0x7f7ea485e000]
   java.lang.Thread.State: RUNNABLE
	at java.lang.UNIXProcess.waitForProcessExit(Native Method)
	at java.lang.UNIXProcess.access$500(UNIXProcess.java:54)
	at java.lang.UNIXProcess$4.run(UNIXProcess.java:227)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)

"pool-1-thread-6" prio=10 tid=0x7f7eac011800 nid=0x557 waiting on condition [0x7f7ea5cba000]
   java.lang.Thread.State: TIMED_WAITING (parking)
	at sun.misc.Unsafe.park(Native Method)
	- parking to wait for  0xce80c268 (a java.util.concurrent.SynchronousQueue$TransferStack)
	at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
	at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
	at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359)
	at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942)
	at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
	at 

[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave

2014-11-05 Thread p...@zmags.com (JIRA)

Poul Henriksen commented on JENKINS-14332

I had the same issue.

Downgrading the Jenkins master kernel from 3.14.19-17.43.amzn1.x86_64 to 3.4.62-53.42.amzn1.x86_64 solved it.


[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave

2014-09-29 Thread o.v.nenas...@gmail.com (JIRA)

Oleg Nenashev assigned JENKINS-14332 to Unassigned

Change By: Oleg Nenashev (29/Sep/14 12:16 PM)
Assignee: Oleg Nenashev (removed)


[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave

2014-09-18 Thread abedwar...@gmail.com (JIRA)

Adam commented on JENKINS-14332

Hey Trevor, any chance you have the steps for moving to the 3.4.x kernel handy? I didn't see that kernel in the main repo.

Is anyone from the Jenkins team able to comment on whether they could help resolve this issue? I leave SSH sessions open overnight with no issues or drops, so I really can't understand why Jenkins keeps having these ping timeouts. Could we get a patch that would disable the ping check entirely? I added "-Dhudson.remoting.Launcher.pingIntervalSec=0 -Dhudson.remoting.Launcher.pingTimeoutSec=0" to my startup on the master, but it didn't seem to help.

I would appreciate any suggestion to avoid rolling back the kernel.

Trevor, it would be interesting to see the output of "sysctl -a" before and after the rollback. I wonder if a kernel parameter changed that might have an impact.


[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave

2014-09-18 Thread guillaume.bouche...@gmail.com (JIRA)

Guillaume Boucherie updated JENKINS-14332

Hi,

First, thank you very much for these tips.
And yes, the current Amazon Linux AMI (ami-892fe1fe) can't be downgraded to 3.4.x.
So I installed Ubuntu 12.04, which comes with a 3.2.x kernel (ami-00d12677), and everything works fine.
I also tried Ubuntu 14.04 (3.13.x kernel), and it does not work.

I've attached the output of "sysctl -a" for kernel 3.2 (sysctl_3.2.txt) and kernel 3.13 (sysctl_3.13.txt) in case it helps.
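
To compare the two dumps, something like the following Python sketch could list the parameters whose values differ. It assumes each attachment contains plain `sysctl -a` output, one `name = value` pair per line; the function names are illustrative.

```python
def parse_sysctl(text):
    """Turn `sysctl -a` output into a {parameter: value} dictionary."""
    settings = {}
    for line in text.splitlines():
        name, separator, value = line.partition("=")
        if separator:  # skip lines that are not `name = value` pairs
            settings[name.strip()] = value.strip()
    return settings


def diff_sysctl(before, after):
    """Return {parameter: (before value, after value)} for every difference.

    A value of None means the parameter is missing from that dump.
    """
    changes = {}
    for name in sorted(set(before) | set(after)):
        old, new = before.get(name), after.get(name)
        if old != new:
            changes[name] = (old, new)
    return changes


def diff_sysctl_files(before_path, after_path):
    """Convenience wrapper that reads both dumps from disk."""
    with open(before_path, encoding="utf-8", errors="replace") as handle:
        before = parse_sysctl(handle.read())
    with open(after_path, encoding="utf-8", errors="replace") as handle:
        after = parse_sysctl(handle.read())
    return diff_sysctl(before, after)
```

`diff_sysctl_files("sysctl_3.2.txt", "sysctl_3.13.txt")` would return each differing parameter with its 3.2 and 3.13 values, which narrows the search to settings that actually changed between the two kernels.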

Regards

Change By: Guillaume Boucherie (18/Sep/14 9:58 AM)
Attachment: sysctl_3.2.txt
Attachment: sysctl_3.13.txt


[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave

2014-09-18 Thread tba...@circle.com (JIRA)

Trevor Baker updated JENKINS-14332

The instance I run jenkins on was a 2013.09 version that was upgraded to 2014.03.x via yum updates.  I presume you could add the 2013.09 yum repo into /etc/yum.repos.d/ and see the 3.4.x kernels.  From experience, 3.4.83-70.111 works with everything else from 2014.03.2 at the latest revisions.

I've attached a kernel param dump.

Change By: Trevor Baker (18/Sep/14 12:20 PM)
Attachment: 3.4.83-70.111.amzn1.x86_64_kernel_params.txt


[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave

2014-09-17 Thread tba...@circle.com (JIRA)

Trevor Baker commented on JENKINS-14332

This happens consistently when the jenkins master is running amazon linux kernel 3.10.x.  If you downgrade the kernel to 3.4.x on the master it goes away.  It does not seem to matter which kernel the slave runs.

Every time I forget about this and "yum update" the kernel it takes some head scratching before I remember to lock the master's kernel at 3.4.x in /etc/grub.conf


[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave

2014-09-01 Thread dan...@beckweb.net (JIRA)

Daniel Beck updated JENKINS-14332

Change By: Daniel Beck (01/Sep/14 2:43 PM)
Labels: remoting


[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave

2014-08-27 Thread abedwar...@gmail.com (JIRA)

Adam Edwards commented on JENKINS-14332

I am seeing similar issues on EC2 linux.

[EnvInject] - Loading node environment variables.
FATAL: hudson.remoting.RequestAbortedException: hudson.remoting.Channel$OrderlyShutdown: java.util.concurrent.TimeoutException: Ping started on 1409173580736 hasn't completed at 1409173820736
hudson.remoting.RequestAbortedException: hudson.remoting.RequestAbortedException: hudson.remoting.Channel$OrderlyShutdown: java.util.concurrent.TimeoutException: Ping started on 1409173580736 hasn't completed at 1409173820736
	at hudson.remoting.RequestAbortedException.wrapForRethrow(RequestAbortedException.java:41)
	at hudson.remoting.RequestAbortedException.wrapForRethrow(RequestAbortedException.java:34)
	at hudson.remoting.Request.call(Request.java:174)
	at hudson.remoting.Channel.call(Channel.java:739)
	at hudson.FilePath.act(FilePath.java:1009)
	at org.jenkinsci.plugins.envinject.service.EnvironmentVariablesNodeLoader.gatherEnvironmentVariablesNode(EnvironmentVariablesNodeLoader.java:44)
	at org.jenkinsci.plugins.envinject.EnvInjectListener.loadEnvironmentVariablesNode(EnvInjectListener.java:81)
	at org.jenkinsci.plugins.envinject.EnvInjectListener.setUpEnvironment(EnvInjectListener.java:39)
	at hudson.model.AbstractBuild$AbstractBuildExecution.createLauncher(AbstractBuild.java:581)
	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:487)
	at hudson.model.Run.execute(Run.java:1706)
	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
	at hudson.model.ResourceController.execute(ResourceController.java:88)
	at hudson.model.Executor.run(Executor.java:232)
Caused by: hudson.remoting.RequestAbortedException: hudson.remoting.Channel$OrderlyShutdown: java.util.concurrent.TimeoutException: Ping started on 1409173580736 hasn't completed at 1409173820736
	at hudson.remoting.Request.abort(Request.java:299)
	at hudson.remoting.Channel.terminate(Channel.java:802)
	at hudson.remoting.Channel$CloseCommand.execute(Channel.java:951)
	at hudson.remoting.Channel$2.handle(Channel.java:475)
	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:60)
Caused by: hudson.remoting.Channel$OrderlyShutdown: java.util.concurrent.TimeoutException: Ping started on 1409173580736 hasn't completed at 1409173820736
	... 3 more
Caused by: Command close created at
	at hudson.remoting.Command.<init>(Command.java:56)
	at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:945)
	at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:943)
	at hudson.remoting.Channel.close(Channel.java:1026)
	at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:110)
	at hudson.remoting.PingThread.ping(PingThread.java:120)
	at hudson.remoting.PingThread.run(PingThread.java:81)
Caused by: java.util.concurrent.TimeoutException: Ping started on 1409173580736 hasn't completed at 1409173820736
	... 2 more
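
The two numbers in the TimeoutException are epoch timestamps in milliseconds, so the length of the wait can be read straight from the message. A minimal Python sketch, assuming the message format shown in the trace above; the function names are illustrative:

```python
import re
from datetime import datetime, timezone

# Matches the wording of the ping timeout message quoted in the trace.
PING_TIMEOUT = re.compile(r"Ping started on (\d+) hasn't completed at (\d+)")


def _ping_times_ms(message):
    """Return (started_ms, gave_up_ms), or None if the message does not match."""
    match = PING_TIMEOUT.search(message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def ping_wait_seconds(message):
    """Seconds between the start of the ping and the moment it was abandoned."""
    times = _ping_times_ms(message)
    if times is None:
        return None
    started_ms, gave_up_ms = times
    return (gave_up_ms - started_ms) / 1000.0


def ping_started_utc(message):
    """Start of the ping as a timezone-aware UTC datetime."""
    times = _ping_times_ms(message)
    if times is None:
        return None
    return datetime.fromtimestamp(times[0] / 1000.0, tz=timezone.utc)
```

For the trace above this gives 240 seconds: the ping was abandoned four minutes after it started, on 2014-08-27 (UTC).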


[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave

2014-08-14 Thread behoalh...@gmail.com (JIRA)

Beho Alheit edited a comment on JENKINS-14332

I have this bug popping up with Mac Mini OSX SSH slaves.

Jenkins version is 1.564. 

Updated version to 1.574 and the issue is still present.

It takes a reboot of both the OSX slave and the Jenkins instance itself to fix this issue.
This only occurs with the nodes running OSX. After running for a couple of days and processing an arbitrary number of builds, the timeout takes place.

Please let me know if there are more details I can submit.


[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave

2014-07-29 Thread behoalh...@gmail.com (JIRA)

Beho Alheit commented on JENKINS-14332

I have this bug popping up with Mac OSX SSH slaves.

Jenkins version is 1.564


[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave

2014-05-27 Thread baue...@me.com (JIRA)

wbauer commented on JENKINS-14332

Repeated channel/timeout errors from Jenkins slave

Same issue here with Linux SSH slaves with Jenkins 1.564:

14:33:30 Started by user wbauer
14:33:30 EnvInject - Loading node environment variables.
14:33:30 ERROR: SEVERE ERROR occurs
14:33:30 org.jenkinsci.lib.envinject.EnvInjectException: hudson.remoting.ChannelClosedException: channel is already closed
14:33:30 	at org.jenkinsci.plugins.envinject.service.EnvironmentVariablesNodeLoader.gatherEnvironmentVariablesNode(EnvironmentVariablesNodeLoader.java:75)
14:33:30 	at org.jenkinsci.plugins.envinject.EnvInjectListener.loadEnvironmentVariablesNode(EnvInjectListener.java:81)
14:33:30 	at org.jenkinsci.plugins.envinject.EnvInjectListener.setUpEnvironment(EnvInjectListener.java:39)
14:33:30 	at hudson.model.AbstractBuild$AbstractBuildExecution.createLauncher(AbstractBuild.java:570)
14:33:30 	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:476)
14:33:30 	at hudson.model.Run.execute(Run.java:1706)
14:33:30 	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
14:33:30 	at hudson.model.ResourceController.execute(ResourceController.java:88)
14:33:30 	at hudson.model.Executor.run(Executor.java:231)
14:33:30 Caused by: hudson.remoting.ChannelClosedException: channel is already closed
14:33:30 	at hudson.remoting.Channel.send(Channel.java:541)
14:33:30 	at hudson.remoting.Request.call(Request.java:129)
14:33:30 	at hudson.remoting.Channel.call(Channel.java:739)
14:33:30 	at hudson.FilePath.act(FilePath.java:1009)
14:33:30 	at org.jenkinsci.plugins.envinject.service.EnvironmentVariablesNodeLoader.gatherEnvironmentVariablesNode(EnvironmentVariablesNodeLoader.java:44)
14:33:30 	... 8 more
14:33:30 Caused by: java.io.IOException
14:33:30 	at hudson.remoting.Channel.close(Channel.java:1027)
14:33:30 	at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:110)
14:33:30 	at hudson.remoting.PingThread.ping(PingThread.java:120)
14:33:30 	at hudson.remoting.PingThread.run(PingThread.java:81)
14:33:30 Caused by: java.util.concurrent.TimeoutException: Ping started on 1401187729473 hasn't completed at 1401187969473
14:33:30 	... 2 more
14:33:30 Notifying upstream projects of job completion
14:33:30 EnvInject - ERROR - SEVERE ERROR occurs: channel is already closed
14:33:33 Finished: FAILURE
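
The two numbers in the `Ping started on ... hasn't completed at ...` line of the trace above look like epoch timestamps in milliseconds (an assumption based on their magnitude). Under that assumption, a small Python sketch shows how long the master waited before it gave up on the ping; `ping_wait_seconds` is a hypothetical helper name, not part of Jenkins.

```python
import re

# Message copied from the trace above.
LINE = ("java.util.concurrent.TimeoutException: "
        "Ping started on 1401187729473 hasn't completed at 1401187969473")


def ping_wait_seconds(message: str) -> float:
    """Return the seconds between the two timestamps in a ping timeout message.

    Assumes both numbers are epoch milliseconds.
    """
    match = re.search(r"Ping started on (\d+) hasn't completed at (\d+)", message)
    if match is None:
        raise ValueError("not a ping timeout message")
    started_ms, checked_ms = (int(group) for group in match.groups())
    return (checked_ms - started_ms) / 1000.0


print(ping_wait_seconds(LINE))  # 240.0, i.e. 4 minutes
```

If the assumption holds, the ping was outstanding for four minutes before the channel was closed.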


[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave

2014-05-19 Thread o.v.nenas...@gmail.com (JIRA)

Oleg Nenashev assigned JENKINS-14332 to Oleg Nenashev

Repeated channel/timeout errors from Jenkins slave

Change By: Oleg Nenashev (19/May/14 9:21 AM)
Assignee: Oleg Nenashev


[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave

2014-05-19 Thread o.v.nenas...@gmail.com (JIRA)

Oleg Nenashev edited a comment on JENKINS-14332

Repeated channel/timeout errors from Jenkins slave

The issue appears on my custom build of the Jenkins core, but it seems it could be reproduced on the newest versions as well.

We experienced a network overload, which led to an exception in the PingThread on the Jenkins master, which in turn closed the communication channel. However, the slave is still shown as online and takes jobs, but any remote action fails (see the log above), so all scheduled builds fail with an error.

The issue affects ssh-slaves only:

	Linux SSH slaves are "online", but all jobs on them fail with the error above
	Windows services have reconnected automatically...
	Windows JNLP slaves have reconnected as well

I'm not sure if there's anything Linux-specific in the issue.
On my installation, the default TCP timeouts exceed the PingThread timeout on the master, hence there could be a collision on the server side.
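
The failure mode described in this comment can be pictured with a toy model. The sketch below is not Jenkins code: `ToyChannel`, `ping_once` and the tiny timeouts are all hypothetical, chosen only to show how a ping that misses its deadline closes the channel, after which every remote call fails even though nothing in the model marks the node as offline.

```python
import concurrent.futures
import time


class ChannelClosedError(Exception):
    """Raised when a call is attempted on a closed toy channel."""


class ToyChannel:
    """Hypothetical stand-in for a master-to-slave channel."""

    def __init__(self, round_trip_seconds: float) -> None:
        self.round_trip_seconds = round_trip_seconds
        self.closed = False

    def round_trip(self) -> str:
        time.sleep(self.round_trip_seconds)  # simulated network delay
        return "pong"

    def call(self) -> str:
        if self.closed:
            raise ChannelClosedError("channel is already closed")
        return self.round_trip()

    def close(self) -> None:
        self.closed = True


def ping_once(channel: ToyChannel, timeout_seconds: float) -> bool:
    """Ping the channel; close it if no reply arrives within the timeout."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    try:
        future = pool.submit(channel.round_trip)
        future.result(timeout=timeout_seconds)
        return True
    except concurrent.futures.TimeoutError:
        channel.close()  # the watchdog gives up and closes the channel
        return False
    finally:
        pool.shutdown(wait=False)


# An overloaded link: the reply takes longer than the ping timeout allows.
overloaded = ToyChannel(round_trip_seconds=0.3)
alive = ping_once(overloaded, timeout_seconds=0.05)
print("ping ok:", alive, "| channel closed:", overloaded.closed)
```

After the failed ping, `overloaded.call()` raises `ChannelClosedError` on every attempt. That matches the symptom reported here, where builds keep being scheduled on a node whose channel is already gone.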


[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave

2014-05-19 Thread o.v.nenas...@gmail.com (JIRA)

Oleg Nenashev updated JENKINS-14332

Repeated channel/timeout errors from Jenkins slave

Updated the issue's description.

Change By: Oleg Nenashev (19/May/14 9:24 AM)

Environment: jenkins-1.509.4 with remoting-2.36 ssh-slaves-1.2

Description:
The issue appears on my custom build of the Jenkins core, but it seems it could be reproduced on the newest versions as well. We experienced a network overload, which led to an exception in the PingThread on the Jenkins master, which in turn closed the communication channel. However, the slave is still shown as online and takes jobs, but any remote action fails (see logs above), so all scheduled builds fail with an error. The issue affects ssh-slaves only: Linux SSH slaves are "online", but all jobs on them fail with the error above. Windows services have reconnected automatically... Windows JNLP slaves have reconnected as well.

Component/s: master-slave
Component/s: ssh-slaves


[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave

2014-05-19 Thread o.v.nenas...@gmail.com (JIRA)

Oleg Nenashev updated JENKINS-14332

Repeated channel/timeout errors from Jenkins slave

Change By: Oleg Nenashev (19/May/14 9:25 AM)

Description:
The issue appears on my custom build of the Jenkins core, but it seems it could be reproduced on the newest versions as well. We experienced a network overload, which led to an exception in the PingThread on the Jenkins master, which in turn closed the communication channel. However, the slave is still shown as online and takes jobs, but any remote action fails (see logs above), so all scheduled builds fail with an error. The issue affects ssh-slaves only:
* Linux SSH slaves are "online", but all jobs on them fail with the error above
* Windows services have reconnected automatically...
* Windows JNLP slaves have reconnected as well


[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave

2014-05-18 Thread o.v.nenas...@gmail.com (JIRA)

Oleg Nenashev commented on JENKINS-14332

Repeated channel/timeout errors from Jenkins slave

The issue appears on remoting-2.36


[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave

2014-05-18 Thread o.v.nenas...@gmail.com (JIRA)

Oleg Nenashev edited a comment on JENKINS-14332

Repeated channel/timeout errors from Jenkins slave

The issue appears on remoting-2.36
Nodes are still alive after the issue


[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave

2014-05-18 Thread o.v.nenas...@gmail.com (JIRA)

Oleg Nenashev commented on JENKINS-14332

Repeated channel/timeout errors from Jenkins slave

In my case the issue appeared after a network overload, during which slave nodes were disconnected multiple times due to "recv failed" (TCP timeout).

	Linux SSH slaves are "online", but all jobs on them fail with the error above
	Windows services have reconnected automatically...
	Windows JNLP slaves have reconnected as well

I'm not sure if there's anything Linux-specific in the issue.
On my installation, the default TCP timeouts exceed the PingThread timeout on the master, hence there could be a collision on the server side.


[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave

2014-02-20 Thread marcus.kl...@open-xchange.com (JIRA)

Marcus Klein commented on JENKINS-14332

Repeated channel/timeout errors from Jenkins slave

My jobs are randomly failing with version 1.550 with the following exception:

FATAL: hudson.remoting.RequestAbortedException: hudson.remoting.Channel$OrderlyShutdown: java.util.concurrent.TimeoutException: Ping started on 1392852728137 hasn't completed at 1392852968137
hudson.remoting.RequestAbortedException: hudson.remoting.RequestAbortedException: hudson.remoting.Channel$OrderlyShutdown: java.util.concurrent.TimeoutException: Ping started on 1392852728137 hasn't completed at 1392852968137
	at hudson.remoting.RequestAbortedException.wrapForRethrow(RequestAbortedException.java:41)
	at hudson.remoting.RequestAbortedException.wrapForRethrow(RequestAbortedException.java:34)
	at hudson.remoting.Request.call(Request.java:174)
	at hudson.remoting.Channel.call(Channel.java:722)
	at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:167)
	at com.sun.proxy.$Proxy47.join(Unknown Source)
	at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:925)
	at hudson.Launcher$ProcStarter.join(Launcher.java:360)
	at hudson.tasks.Ant.perform(Ant.java:217)
	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:785)
	at hudson.model.Build$BuildExecution.build(Build.java:199)
	at hudson.model.Build$BuildExecution.doRun(Build.java:160)
	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:566)
	at hudson.model.Run.execute(Run.java:1678)
	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
	at hudson.model.ResourceController.execute(ResourceController.java:88)
	at hudson.model.Executor.run(Executor.java:231)
Caused by: hudson.remoting.RequestAbortedException: hudson.remoting.Channel$OrderlyShutdown: java.util.concurrent.TimeoutException: Ping started on 1392852728137 hasn't completed at 1392852968137
	at hudson.remoting.Request.abort(Request.java:299)
	at hudson.remoting.Channel.terminate(Channel.java:782)
	at hudson.remoting.Channel$CloseCommand.execute(Channel.java:927)
	at hudson.remoting.Channel$2.handle(Channel.java:461)
	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:60)
Caused by: hudson.remoting.Channel$OrderlyShutdown: java.util.concurrent.TimeoutException: Ping started on 1392852728137 hasn't completed at 1392852968137
	... 3 more
Caused by: Command close created at
	at hudson.remoting.Command.<init>(Command.java:56)
	at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:921)
	at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:919)
	at hudson.remoting.Channel.close(Channel.java:1002)
	at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:110)
	at hudson.remoting.PingThread.ping(PingThread.java:120)
	at hudson.remoting.PingThread.run(PingThread.java:81)
Caused by: java.util.concurrent.TimeoutException: Ping started on 1392852728137 hasn't completed at 1392852968137
	... 2 more

I rolled back to version 1.546 because I think it did not happen with that version.
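
The trace above nests several `Caused by:` layers, and the innermost one is the actual trigger. As a reading aid, here is a small Python sketch that lists the leading text of each layer; `cause_chain` is a hypothetical helper, and the sample string is abbreviated from the trace above (the `...` marks text cut for brevity).

```python
from typing import List


def cause_chain(trace: str) -> List[str]:
    """Return the leading text of each 'Caused by:' line, outermost first.

    For ordinary frames this is the exception class name; for lines without
    a further ': ' separator the whole remainder is returned.
    """
    prefix = "Caused by: "
    chain = []
    for line in trace.splitlines():
        line = line.strip()
        if line.startswith(prefix):
            chain.append(line[len(prefix):].split(": ", 1)[0])
    return chain


SAMPLE = """\
FATAL: hudson.remoting.RequestAbortedException: hudson.remoting.Channel$OrderlyShutdown: ...
Caused by: hudson.remoting.RequestAbortedException: hudson.remoting.Channel$OrderlyShutdown: ...
Caused by: hudson.remoting.Channel$OrderlyShutdown: java.util.concurrent.TimeoutException: ...
Caused by: Command close created at
Caused by: java.util.concurrent.TimeoutException: Ping started on 1392852728137 hasn't completed at 1392852968137
"""

for name in cause_chain(SAMPLE):
    print(name)
```

The last entry printed is `java.util.concurrent.TimeoutException`, so the request abort and the orderly shutdown both trace back to the ping timeout.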


[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave

2014-02-07 Thread nickolay.rumyant...@emc.com (JIRA)

Nickolay Rumyantsev commented on JENKINS-14332

Repeated channel/timeout errors from Jenkins slave

Having a similar issue: the slave is reconnected during the archiving of artifacts, when Jenkins is under load.
Jenkins 1.543.


[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave

2013-05-06 Thread jcarsi...@java.net (JIRA)

Julien Carsique edited a comment on JENKINS-14332

Repeated channel/timeout errors from Jenkins slave

Maybe not a Jenkins issue.

Having the same issue, reproduced at every build for a given slave (but with nothing relevant in its logs), I tried to disconnect and reconnect the slave:

[05/06/13 14:06:04] Launching slave agent
$ ssh slavedns java -jar ~/bin/slave.jar
===[JENKINS REMOTING CAPACITY]==[JENKINS REMOTING CAPACITY]===channel started
channel started
Slave.jar version: 2.22
This is a Unix slave
Slave.jar version: 2.22
This is a Unix slave
Copied maven-agent.jar
Copied maven3-agent.jar
Copied maven3-interceptor.jar
Copied maven-agent.jar
Copied maven-interceptor.jar
Copied maven2.1-interceptor.jar
Copied plexus-classworld.jar
Copied maven3-agent.jar
Copied maven3-interceptor.jar
Copied classworlds.jar
Copied maven-interceptor.jar
Copied maven2.1-interceptor.jar
Copied plexus-classworld.jar
Copied classworlds.jar
Evacuated stdout
Evacuated stdout
ERROR: Unexpected error in launching a slave. This is probably a bug in Jenkins
(...)java.lang.IllegalStateException: Already connected
	at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:459)
	at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:339)
	at hudson.slaves.CommandLauncher.launch(CommandLauncher.java:122)
	at hudson.slaves.SlaveComputer$1.call(SlaveComputer.java:222)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)
Connection terminated
channel stopped
ERROR: Unexpected error in launching a slave. This is probably a bug in Jenkins
(...)java.lang.NullPointerException
	at org.jenkinsci.modules.slave_installer.impl.ComputerListenerImpl.onOnline(ComputerListenerImpl.java:32)
	at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:471)
	at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:339)
	at hudson.slaves.CommandLauncher.launch(CommandLauncher.java:122)
	at hudson.slaves.SlaveComputer$1.call(SlaveComputer.java:222)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)
channel stopped
Connection terminated

Then the slave successfully reconnected itself.

It appeared there was a looping thread consuming 100% CPU. Killing the process solved the issue.

Strangely, some system and Java commands were not working (ps, cat, less, jstack, trace, ...) until it was killed, whereas other commands worked (top, jps, renice, kill, ...). That could explain the odd 4-minute Jenkins timeout log (java.util.concurrent.TimeoutException: Ping started on 1367743028681 hasn't completed at 1367743268682): the system was partially frozen, with really unstable response times.


[JIRA] [core] (JENKINS-14332) Repeated channel/timeout errors from Jenkins slave

2013-05-06 Thread jcarsi...@java.net (JIRA)

Julien Carsique edited a comment on JENKINS-14332

Repeated channel/timeout errors from Jenkins slave

Maybe not a Jenkins issue.

Having the same issue, reproduced at every build for a given slave (but with nothing relevant in its logs), I tried to disconnect and reconnect the slave:

[05/06/13 14:06:04] Launching slave agent
$ ssh slavedns java -jar ~/bin/slave.jar
===[JENKINS REMOTING CAPACITY]==[JENKINS REMOTING CAPACITY]===channel started
channel started
Slave.jar version: 2.22
This is a Unix slave
Slave.jar version: 2.22
This is a Unix slave
Copied maven-agent.jar
Copied maven3-agent.jar
Copied maven3-interceptor.jar
Copied maven-agent.jar
Copied maven-interceptor.jar
Copied maven2.1-interceptor.jar
Copied plexus-classworld.jar
Copied maven3-agent.jar
Copied maven3-interceptor.jar
Copied classworlds.jar
Copied maven-interceptor.jar
Copied maven2.1-interceptor.jar
Copied plexus-classworld.jar
Copied classworlds.jar
Evacuated stdout
Evacuated stdout
ERROR: Unexpected error in launching a slave. This is probably a bug in Jenkins
(...)java.lang.IllegalStateException: Already connected
	at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:459)
	at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:339)
	at hudson.slaves.CommandLauncher.launch(CommandLauncher.java:122)
	at hudson.slaves.SlaveComputer$1.call(SlaveComputer.java:222)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)
Connection terminated
channel stopped
ERROR: Unexpected error in launching a slave. This is probably a bug in Jenkins
(...)java.lang.NullPointerException
	at org.jenkinsci.modules.slave_installer.impl.ComputerListenerImpl.onOnline(ComputerListenerImpl.java:32)
	at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:471)
	at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:339)
	at hudson.slaves.CommandLauncher.launch(CommandLauncher.java:122)
	at hudson.slaves.SlaveComputer$1.call(SlaveComputer.java:222)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)
channel stopped
Connection terminated

Then the slave successfully reconnected itself.

It appeared there was a looping thread consuming 100% CPU. Killing the process solved the issue.

Strangely, some system and Java commands were not working (ps, cat, less, jstack, trace, ...) until it was killed, whereas other commands worked (top, jps, renice, kill, ...). That could explain the odd 4-minute Jenkins timeout log (java.util.concurrent.TimeoutException: Ping started on 1367743028681 hasn't completed at 1367743268682): the system was partially frozen, with really unstable response times.

Note that the looping thread came from a previous job which stopped badly on timeout:

03:00:16.868 Build timed out (after 180 minutes). Marking the build as aborted.
03:00:16.873 Build was aborted
03:00:16.874 Archiving artifacts
03:00:16.874 ERROR: Failed to archive artifacts: **/log/*.log, tomcat*/nxserver/config/distribution.properties
03:00:16.875 hudson.remoting.ChannelClosedException: channel is already closed
03:00:16.876 	at hudson.remoting.Channel.send(Channel.java:494)
03:00:16.876 	at hudson.remoting.Request.call(Request.java:129)
03:00:16.876 	at hudson.remoting.Channel.call(Channel.java:672)
03:00:16.876 	at hudson.EnvVars.getRemote(EnvVars.java:212)
03:00:16.876 	at hudson.model.Computer.getEnvironment(Computer.java:882)
03:00:16.876 	at jenkins.model.CoreEnvironmentContributor.buildEnvironmentFor(CoreEnvironmentContributor.java:28)
03:00:16.876 	at hudson.model.Run.getEnvironment(Run.java:2028)
03:00:16.876 	at hudson.model.AbstractBuild.getEnvironment(AbstractBuild.java:927)
03:00:16.876 	at hudson.tasks.ArtifactArchiver.perform(ArtifactArchiver.java:115)
03:00:16.876 	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
03:00:16.876 	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:798)
03:00:16.876 	at