Frequent connection losses: java.io.IOException

2017-05-22 Thread Andreas Tscharner

Hello World,

Our setup:
* Jenkins 2.32.2 on Debian Linux
  libvirt Plugin for Jenkins
* 8 VMs on the same server, running Windows 7, building our software
  KVM/QEMU VMs, using libvirt
* 4 Ubuntu machines containing 2 VMs each (=8) running Windows 10, 
testing our software

  KVM/QEMU VMs, using libvirt

We've experienced more and more connection problems lately, all having 
the same exception:



FATAL: Befehlsausführung fehlgeschlagen
java.io.IOException: Backing channel 'JNLP4-connect connection from 
build-vm1/192.168.168.160:1033' is disconnected.
	at 
hudson.remoting.RemoteInvocationHandler.channelOrFail(RemoteInvocationHandler.java:191)
	at 
hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:256)

at com.sun.proxy.$Proxy60.isAlive(Unknown Source)
at hudson.Launcher$RemoteLauncher$ProcImpl.isAlive(Launcher.java:1043)
at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:1035)
at hudson.tasks.CommandInterpreter.join(CommandInterpreter.java:154)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:108)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:65)
at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
	at 
hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:779)

at hudson.model.Build$BuildExecution.build(Build.java:205)
at hudson.model.Build$BuildExecution.doRun(Build.java:162)
	at 
hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:534)

at hudson.model.Run.execute(Run.java:1728)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
at hudson.model.ResourceController.execute(ResourceController.java:98)
at hudson.model.Executor.run(Executor.java:404)
Caused by: java.nio.channels.ClosedChannelException
	at 
org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer.onReadClosed(ChannelApplicationLayer.java:208)
	at 
org.jenkinsci.remoting.protocol.ApplicationLayer.onRecvClosed(ApplicationLayer.java:222)
	at 
org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.onRecvClosed(ProtocolStack.java:832)
	at 
org.jenkinsci.remoting.protocol.FilterLayer.onRecvClosed(FilterLayer.java:287)
	at 
org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.onRecvClosed(SSLEngineFilterLayer.java:181)
	at 
org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.switchToNoSecure(SSLEngineFilterLayer.java:283)
	at 
org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processWrite(SSLEngineFilterLayer.java:503)
	at 
org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processQueuedWrites(SSLEngineFilterLayer.java:248)
	at 
org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.doSend(SSLEngineFilterLayer.java:200)
	at 
org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.doCloseSend(SSLEngineFilterLayer.java:213)
	at 
org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.doCloseSend(ProtocolStack.java:800)
	at 
org.jenkinsci.remoting.protocol.ApplicationLayer.doCloseWrite(ApplicationLayer.java:173)
	at 
org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer$ByteBufferCommandTransport.closeWrite(ChannelApplicationLayer.java:311)

at hudson.remoting.Channel.close(Channel.java:1287)
at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:115)
at hudson.remoting.PingThread.ping(PingThread.java:130)
at hudson.remoting.PingThread.run(PingThread.java:86)


These appear randomly on all nodes; we've found no pattern and no way to 
reproduce them. There are also no more information in the logs.


Has anyone seen the same problems? Any hints what I should check?

TIA and best regards
Andreas
--
Andreas Tscharner 
--
"Intruder on level one. All Aliens please proceed to level one."
  -- Call in "Alien: Resurrection"

--
You received this message because you are subscribed to the Google Groups "Jenkins 
Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to jenkinsci-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/jenkinsci-users/42c33ada-cb0b-0ae5-a6b9-1ffb00cf2720%40gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Frequent connection losses: java.io.IOException

2017-05-22 Thread Andreas Tscharner

On 22.05.2017 10:13, Andreas Tscharner wrote:

Hello World,


Hello again,

Some more information


Our setup:
* Jenkins 2.32.2 on Debian Linux


jenkins@buildserver:~$ uname -a
Linux buildserver 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u2 
(2016-10-19) x86_64 GNU/Linux


jenkins@buildserver:~$ java -version
openjdk version "1.8.0_121"
OpenJDK Runtime Environment (build 1.8.0_121-8u121-b13-3-b13)
OpenJDK 64-Bit Server VM (build 25.121-b13, mixed mode)


  libvirt Plugin for Jenkins
* 8 VMs on the same server, running Windows 7, building our software
  KVM/QEMU VMs, using libvirt

Java version: 1.8.0_121, using the new Jenkins service


* 4 Ubuntu machines containing 2 VMs each (=8) running Windows 10,
testing our software
  KVM/QEMU VMs, using libvirt
Java Version: 1.01 (it worked better with this version), auto-started 
batch file running "java -jar slave.jar -jnlpCredentials ... -jnlpUrl ..."


TIA and best regards
Andreas
--
Andreas Tscharner 
--
"Intruder on level one. All Aliens please proceed to level one."
  -- Call in "Alien: Resurrection"

--
You received this message because you are subscribed to the Google Groups "Jenkins 
Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to jenkinsci-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/jenkinsci-users/e3971cc1-fa67-2b23-7acd-c607c2c4eeb9%40gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Frequent connection losses: java.io.IOException

2017-05-29 Thread Andreas Tscharner

On 22.05.2017 10:13, Andreas Tscharner wrote:

Hello World,



[snip]

FATAL: Befehlsausführung fehlgeschlagen
java.io.IOException: Backing channel 'JNLP4-connect connection from
build-vm1/192.168.168.160:1033' is disconnected.


No-one? No idea?
Or any pointer to a similar (old)er issue?

TIA and best regards
Andreas
--
Andreas Tscharner 
--
"Intruder on level one. All Aliens please proceed to level one."
  -- Call in "Alien: Resurrection"

--
You received this message because you are subscribed to the Google Groups "Jenkins 
Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to jenkinsci-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/jenkinsci-users/fb941d4e-348a-b2aa-213d-1258897e3f9c%40gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Frequent connection losses: java.io.IOException

2017-05-29 Thread Andreas Tscharner

On 22.05.2017 10:13, Andreas Tscharner wrote:

Hello World,



Hello again,

[snip]

FATAL: Befehlsausführung fehlgeschlagen
java.io.IOException: Backing channel 'JNLP4-connect connection from
build-vm1/192.168.168.160:1033' is disconnected.

[snip]

I have found some more information in /var/log/jenkins/jenkins.log:


Mai 28, 2017 2:02:59 PM hudson.slaves.ChannelPinger$1 onDead
INFORMATION: Ping failed. Terminating the channel JNLP4-connect 
connection from build-vm3/192.168.168.162:1033.
java.util.concurrent.TimeoutException: Ping started at 1495972739294 
hasn't completed by 1495972979294

at hudson.remoting.PingThread.ping(PingThread.java:130)
at hudson.remoting.PingThread.run(PingThread.java:86)


I've calculated the value between those two timestamps: 24. I 
suppose that are ms which would be 4 minutes.
Is there anything I could do? Why would the ping be dead for more than 4 
minutes?


TIA and best regards
Andreas
--
Andreas Tscharner 
--
"Intruder on level one. All Aliens please proceed to level one."
  -- Call in "Alien: Resurrection"

--
You received this message because you are subscribed to the Google Groups "Jenkins 
Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to jenkinsci-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/jenkinsci-users/ba3c7ec3-dfee-a128-2bc4-a31ec402e573%40gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Frequent connection losses: java.io.IOException

2017-05-29 Thread Kristian
Did you already looked at
https://wiki.jenkins-ci.org/display/JENKINS/Ping+Thread ?

2017-05-29 15:47 GMT+02:00 Andreas Tscharner :
> On 22.05.2017 10:13, Andreas Tscharner wrote:
>>
>> Hello World,
>>
>
> Hello again,
>
> [snip]
>>
>> FATAL: Befehlsausführung fehlgeschlagen
>> java.io.IOException: Backing channel 'JNLP4-connect connection from
>> build-vm1/192.168.168.160:1033' is disconnected.
>
> [snip]
>
> I have found some more information in /var/log/jenkins/jenkins.log:
>
>
> Mai 28, 2017 2:02:59 PM hudson.slaves.ChannelPinger$1 onDead
> INFORMATION: Ping failed. Terminating the channel JNLP4-connect connection
> from build-vm3/192.168.168.162:1033.
> java.util.concurrent.TimeoutException: Ping started at 1495972739294 hasn't
> completed by 1495972979294
> at hudson.remoting.PingThread.ping(PingThread.java:130)
> at hudson.remoting.PingThread.run(PingThread.java:86)
>
>
> I've calculated the value between those two timestamps: 24. I suppose
> that are ms which would be 4 minutes.
> Is there anything I could do? Why would the ping be dead for more than 4
> minutes?
>
> TIA and best regards
> Andreas
> --
> Andreas Tscharner 
> --
> "Intruder on level one. All Aliens please proceed to level one."
>   -- Call in "Alien: Resurrection"
>
> --
> You received this message because you are subscribed to the Google Groups
> "Jenkins Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to jenkinsci-users+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/jenkinsci-users/ba3c7ec3-dfee-a128-2bc4-a31ec402e573%40gmail.com.
> For more options, visit https://groups.google.com/d/optout.

-- 
You received this message because you are subscribed to the Google Groups 
"Jenkins Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to jenkinsci-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/jenkinsci-users/CAO5HUhNtpaENFF%2BsHPrnRoMS58cuNFRY-RzmAVEpm65zPNJQUA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Frequent connection losses: java.io.IOException

2017-05-29 Thread Stephen Connolly
On 29 May 2017 at 06:47, Andreas Tscharner  wrote:

> On 22.05.2017 10:13, Andreas Tscharner wrote:
>
>> Hello World,
>>
>>
> Hello again,
>
> [snip]
>
>> FATAL: Befehlsausführung fehlgeschlagen
>> java.io.IOException: Backing channel 'JNLP4-connect connection from
>> build-vm1/192.168.168.160:1033' is disconnected.
>>
> [snip]
>
> I have found some more information in /var/log/jenkins/jenkins.log:
>
>
> Mai 28, 2017 2:02:59 PM hudson.slaves.ChannelPinger$1 onDead
> INFORMATION: Ping failed. Terminating the channel JNLP4-connect connection
> from build-vm3/192.168.168.162:1033.
> java.util.concurrent.TimeoutException: Ping started at 1495972739294
> hasn't completed by 1495972979294
> at hudson.remoting.PingThread.ping(PingThread.java:130)
> at hudson.remoting.PingThread.run(PingThread.java:86)
>
>
> I've calculated the value between those two timestamps: 24. I suppose
> that are ms which would be 4 minutes.
> Is there anything I could do? Why would the ping be dead for more than 4
> minutes?


The ping is an application level ping, i.e. it submits a command into the
command queue and waits for the response to be generated.
The command channel can be functioning, but if there is a backlog in the
command queue on either side and the command takes longer than 4 minutes
(configurable) then the channel is disconnected.

The command queue is single threaded. So there are multiple things that can
cause the issue, e.g.:

* Transferring large artifacts between the master and the agent
* Plugin causing deadlock on the agent and prevent
* etc

You may want to investigate the agent logs

You may want to monitor Disk I/O and CPU and memory usage (especially if
running all things on the one machine)

You may want to monitor Network I/O if running agents over a remote network

HTH


>
> TIA and best regards
> Andreas
> --
> Andreas Tscharner 
> --
> "Intruder on level one. All Aliens please proceed to level one."
>   -- Call in "Alien: Resurrection"
>
> --
> You received this message because you are subscribed to the Google Groups
> "Jenkins Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to jenkinsci-users+unsubscr...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/ms
> gid/jenkinsci-users/ba3c7ec3-dfee-a128-2bc4-a31ec402e573%40gmail.com.
>
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"Jenkins Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to jenkinsci-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/jenkinsci-users/CA%2BnPnMymOoyTz4Pd8cnrEkP5U%2BsuzVjUA0sw_purYfuVvQbGJw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Frequent connection losses: java.io.IOException

2017-05-29 Thread Andreas Tscharner

On 29.05.2017 17:02, Kristian wrote:

Did you already looked at
https://wiki.jenkins-ci.org/display/JENKINS/Ping+Thread ?


No, thank you. I didn't realize that the ping caused the connection loss 
until yesterday (I am not a Java developer so I missed that)



Unfortunately: If I execute the line given on that page 
(Jenkins.instance.injector.getInstance(hudson.slaves.ChannelPinger.class).@pingInterval=-1) 
in the script console in Jenkins, I get the following exception:


groovy.lang.GroovyRuntimeException: Cannot set the property 
'pingInterval' because the backing field is final.
	at 
org.codehaus.groovy.reflection.CachedField.setProperty(CachedField.java:71)

at groovy.lang.MetaClassImpl.setAttribute(MetaClassImpl.java:2858)
at groovy.lang.MetaClassImpl.setAttribute(MetaClassImpl.java:3770)
	at 
org.codehaus.groovy.runtime.InvokerHelper.setAttribute(InvokerHelper.java:161)
	at 
org.codehaus.groovy.runtime.ScriptBytecodeAdapter.setField(ScriptBytecodeAdapter.java:332)

at Script1.run(Script1.groovy:1)
at groovy.lang.GroovyShell.evaluate(GroovyShell.java:585)
at groovy.lang.GroovyShell.evaluate(GroovyShell.java:623)
at groovy.lang.GroovyShell.evaluate(GroovyShell.java:594)
	at 
hudson.util.RemotingDiagnostics$Script.call(RemotingDiagnostics.java:142)
	at 
hudson.util.RemotingDiagnostics$Script.call(RemotingDiagnostics.java:114)

at hudson.remoting.LocalChannel.call(LocalChannel.java:45)
	at 
hudson.util.RemotingDiagnostics.executeGroovy(RemotingDiagnostics.java:111)

at jenkins.model.Jenkins._doScript(Jenkins.java:4283)
at jenkins.model.Jenkins.doScript(Jenkins.java:4255)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.kohsuke.stapler.Function$InstanceFunction.invoke(Function.java:335)
at org.kohsuke.stapler.Function.bindAndInvoke(Function.java:175)
	at 
org.kohsuke.stapler.Function.bindAndInvokeAndServeResponse(Function.java:108)

at org.kohsuke.stapler.MetaClass$1.doDispatch(MetaClass.java:124)
	at 
org.kohsuke.stapler.NameBasedDispatcher.dispatch(NameBasedDispatcher.java:58)

at org.kohsuke.stapler.Stapler.tryInvoke(Stapler.java:746)
at org.kohsuke.stapler.Stapler.invoke(Stapler.java:876)
at org.kohsuke.stapler.Stapler.invoke(Stapler.java:649)
at org.kohsuke.stapler.Stapler.service(Stapler.java:238)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
at 
org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:812)
	at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1669)

at 
hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:135)
	at 
hudson.plugins.greenballs.GreenBallFilter.doFilter(GreenBallFilter.java:59)

at 
hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:132)
at 
hudson.util.PluginServletFilter.doFilter(PluginServletFilter.java:126)
	at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)

at hudson.security.csrf.CrumbFilter.doFilter(CrumbFilter.java:49)
	at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
	at 
hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:84)
	at 
hudson.security.UnwrapSecurityExceptionFilter.doFilter(UnwrapSecurityExceptionFilter.java:51)
	at 
hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
	at 
jenkins.security.ExceptionTranslationFilter.doFilter(ExceptionTranslationFilter.java:117)
	at 
hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
	at 
org.acegisecurity.providers.anonymous.AnonymousProcessingFilter.doFilter(AnonymousProcessingFilter.java:125)
	at 
hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
	at 
org.acegisecurity.ui.rememberme.RememberMeProcessingFilter.doFilter(RememberMeProcessingFilter.java:142)
	at 
hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
	at 
org.acegisecurity.ui.AbstractProcessingFilter.doFilter(AbstractProcessingFilter.java:271)
	at 
hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
	at 
jenkins.security.BasicHeaderProcessor.doFilter(BasicHeaderProcessor.java:93)
	at 
hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
	at 
org.acegisecurity.context.HttpSessionContextIntegrationFilter.doFilter(HttpSessionContextIntegrationFilter.java:249)
	at 
hudson.security.HttpSessionContextIntegrationFilter2.doFilter(HttpSessionContextIntegrationFilter2.java:67)
	at 
hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
	at 
hudson.security

Re: Frequent connection losses: java.io.IOException

2017-05-29 Thread Andreas Tscharner

On 29.05.2017 17:31, Stephen Connolly wrote:

[snip]

The ping is an application level ping, i.e. it submits a command into
the command queue and waits for the response to be generated.
The command channel can be functioning, but if there is a backlog in the
command queue on either side and the command takes longer than 4 minutes
(configurable) then the channel is disconnected.

The command queue is single threaded. So there are multiple things that
can cause the issue, e.g.:

* Transferring large artifacts between the master and the agent
* Plugin causing deadlock on the agent and prevent
* etc

You may want to investigate the agent logs


I already have had a short look at them and did not find anything special.


You may want to monitor Disk I/O and CPU and memory usage (especially if
running all things on the one machine)


OK, I will do that. Jenkins and all 8 VMs are running on the same machine.


You may want to monitor Network I/O if running agents over a remote network


The VMs are in the local network using a simple bridge

Thank you very much and best regards
Andreas
--
Andreas Tscharner 
--
"Intruder on level one. All Aliens please proceed to level one."
  -- Call in "Alien: Resurrection"

--
You received this message because you are subscribed to the Google Groups "Jenkins 
Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to jenkinsci-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/jenkinsci-users/bc7ea735-8ae4-1a5d-cb54-eb73ed6d0724%40gmail.com.
For more options, visit https://groups.google.com/d/optout.