Frequent connection losses: java.io.IOException
Hello World,

Our setup:

* Jenkins 2.32.2 on Debian Linux
  libvirt Plugin for Jenkins
* 8 VMs on the same server, running Windows 7, building our software
  KVM/QEMU VMs, using libvirt
* 4 Ubuntu machines containing 2 VMs each (=8) running Windows 10, testing our software
  KVM/QEMU VMs, using libvirt

We have experienced more and more connection problems lately, all with the same exception:

FATAL: Befehlsausführung fehlgeschlagen [command execution failed]
java.io.IOException: Backing channel 'JNLP4-connect connection from build-vm1/192.168.168.160:1033' is disconnected.
	at hudson.remoting.RemoteInvocationHandler.channelOrFail(RemoteInvocationHandler.java:191)
	at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:256)
	at com.sun.proxy.$Proxy60.isAlive(Unknown Source)
	at hudson.Launcher$RemoteLauncher$ProcImpl.isAlive(Launcher.java:1043)
	at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:1035)
	at hudson.tasks.CommandInterpreter.join(CommandInterpreter.java:154)
	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:108)
	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:65)
	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:779)
	at hudson.model.Build$BuildExecution.build(Build.java:205)
	at hudson.model.Build$BuildExecution.doRun(Build.java:162)
	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:534)
	at hudson.model.Run.execute(Run.java:1728)
	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
	at hudson.model.ResourceController.execute(ResourceController.java:98)
	at hudson.model.Executor.run(Executor.java:404)
Caused by: java.nio.channels.ClosedChannelException
	at org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer.onReadClosed(ChannelApplicationLayer.java:208)
	at org.jenkinsci.remoting.protocol.ApplicationLayer.onRecvClosed(ApplicationLayer.java:222)
	at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.onRecvClosed(ProtocolStack.java:832)
	at org.jenkinsci.remoting.protocol.FilterLayer.onRecvClosed(FilterLayer.java:287)
	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.onRecvClosed(SSLEngineFilterLayer.java:181)
	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.switchToNoSecure(SSLEngineFilterLayer.java:283)
	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processWrite(SSLEngineFilterLayer.java:503)
	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processQueuedWrites(SSLEngineFilterLayer.java:248)
	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.doSend(SSLEngineFilterLayer.java:200)
	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.doCloseSend(SSLEngineFilterLayer.java:213)
	at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.doCloseSend(ProtocolStack.java:800)
	at org.jenkinsci.remoting.protocol.ApplicationLayer.doCloseWrite(ApplicationLayer.java:173)
	at org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer$ByteBufferCommandTransport.closeWrite(ChannelApplicationLayer.java:311)
	at hudson.remoting.Channel.close(Channel.java:1287)
	at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:115)
	at hudson.remoting.PingThread.ping(PingThread.java:130)
	at hudson.remoting.PingThread.run(PingThread.java:86)

These appear randomly on all nodes; we have found no pattern and no way to reproduce them, and there is no further information in the logs. Has anyone seen the same problem? Any hints on what I should check?
TIA and best regards
Andreas
--
Andreas Tscharner
"Intruder on level one. All Aliens please proceed to level one."
                                  -- Call in "Alien: Resurrection"
Re: Frequent connection losses: java.io.IOException
On 22.05.2017 10:13, Andreas Tscharner wrote:
> Hello World,

Hello again,

Some more information on our setup:

> * Jenkins 2.32.2 on Debian Linux

jenkins@buildserver:~$ uname -a
Linux buildserver 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u2 (2016-10-19) x86_64 GNU/Linux

jenkins@buildserver:~$ java -version
openjdk version "1.8.0_121"
OpenJDK Runtime Environment (build 1.8.0_121-8u121-b13-3-b13)
OpenJDK 64-Bit Server VM (build 25.121-b13, mixed mode)

>   libvirt Plugin for Jenkins
> * 8 VMs on the same server, running Windows 7, building our software
>   KVM/QEMU VMs, using libvirt

Java version: 1.8.0_121, using the new Jenkins service

> * 4 Ubuntu machines containing 2 VMs each (=8) running Windows 10, testing our software
>   KVM/QEMU VMs, using libvirt

Java version: 1.01 (it worked better with this version); auto-started batch file running "java -jar slave.jar -jnlpCredentials ... -jnlpUrl ..."

TIA and best regards
Andreas
Re: Frequent connection losses: java.io.IOException
On 22.05.2017 10:13, Andreas Tscharner wrote:
> Hello World,
> [snip]
> FATAL: Befehlsausführung fehlgeschlagen
> java.io.IOException: Backing channel 'JNLP4-connect connection from
> build-vm1/192.168.168.160:1033' is disconnected.

No one? No idea? Or any pointer to a similar (older) issue?

TIA and best regards
Andreas
Re: Frequent connection losses: java.io.IOException
On 22.05.2017 10:13, Andreas Tscharner wrote:
> Hello World,

Hello again,

> [snip]
> FATAL: Befehlsausführung fehlgeschlagen
> java.io.IOException: Backing channel 'JNLP4-connect connection from
> build-vm1/192.168.168.160:1033' is disconnected.
> [snip]

I have found some more information in /var/log/jenkins/jenkins.log:

Mai 28, 2017 2:02:59 PM hudson.slaves.ChannelPinger$1 onDead
INFORMATION: Ping failed. Terminating the channel JNLP4-connect connection from build-vm3/192.168.168.162:1033.
java.util.concurrent.TimeoutException: Ping started at 1495972739294 hasn't completed by 1495972979294
	at hudson.remoting.PingThread.ping(PingThread.java:130)
	at hudson.remoting.PingThread.run(PingThread.java:86)

I've calculated the difference between those two timestamps: 1495972979294 - 1495972739294 = 240000. Those are milliseconds, so the gap is exactly 4 minutes.
Is there anything I could do? Why would the ping be dead for more than 4 minutes?

TIA and best regards
Andreas
Re: Frequent connection losses: java.io.IOException
Have you already looked at https://wiki.jenkins-ci.org/display/JENKINS/Ping+Thread ?

2017-05-29 15:47 GMT+02:00 Andreas Tscharner:
> I have found some more information in /var/log/jenkins/jenkins.log:
>
> Mai 28, 2017 2:02:59 PM hudson.slaves.ChannelPinger$1 onDead
> INFORMATION: Ping failed. Terminating the channel JNLP4-connect connection
> from build-vm3/192.168.168.162:1033.
> java.util.concurrent.TimeoutException: Ping started at 1495972739294
> hasn't completed by 1495972979294
> [snip]
> Is there anything I could do? Why would the ping be dead for more than 4
> minutes?
Re: Frequent connection losses: java.io.IOException
On 29 May 2017 at 06:47, Andreas Tscharner wrote:
> [snip]
> Mai 28, 2017 2:02:59 PM hudson.slaves.ChannelPinger$1 onDead
> INFORMATION: Ping failed. Terminating the channel JNLP4-connect connection
> from build-vm3/192.168.168.162:1033.
> java.util.concurrent.TimeoutException: Ping started at 1495972739294
> hasn't completed by 1495972979294
> [snip]
> Is there anything I could do? Why would the ping be dead for more than 4
> minutes?

The ping is an application-level ping, i.e. it submits a command into the command queue and waits for the response to be generated. The command channel can be functioning, but if there is a backlog in the command queue on either side and the ping takes longer than 4 minutes (configurable) to complete, the channel is disconnected. The command queue is single-threaded.

So there are multiple things that can cause the issue, e.g.:

* Transferring large artifacts between the master and the agent
* A plugin causing a deadlock on the agent and preventing the ping response from being processed
* etc.

You may want to investigate the agent logs.
You may want to monitor disk I/O, CPU and memory usage (especially if running everything on the one machine).
You may want to monitor network I/O if running agents over a remote network.

HTH
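One quick way to check for such a backlog is to time a trivial remote call on each agent. The following is a rough sketch for the Jenkins script console, not an official diagnostic; it assumes RemotingDiagnostics.executeGroovy, the helper Jenkins itself uses to run Groovy over an agent channel:

import hudson.util.RemotingDiagnostics
import jenkins.model.Jenkins

// Time a trivial remote call on every online agent. Because the command
// queue is single-threaded, long or steadily growing round trips suggest
// the queue is backed up and the 4-minute ping deadline is at risk.
Jenkins.instance.computers.each { c ->
    def ch = c.channel
    if (ch != null && !c.offline) {
        long t0 = System.currentTimeMillis()
        RemotingDiagnostics.executeGroovy('return "pong"', ch)
        println "${c.name}: round trip ${System.currentTimeMillis() - t0} ms"
    }
}

Running this periodically while builds are active should show whether particular agents stall around the times of the disconnects.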
Re: Frequent connection losses: java.io.IOException
On 29.05.2017 17:02, Kristian wrote:
> Have you already looked at
> https://wiki.jenkins-ci.org/display/JENKINS/Ping+Thread ?

No, I hadn't; thank you. I didn't realize until yesterday that the ping causes the connection loss (I am not a Java developer, so I missed that).

Unfortunately, if I execute the line given on that page

Jenkins.instance.injector.getInstance(hudson.slaves.ChannelPinger.class).@pingInterval=-1

in the Jenkins script console, I get the following exception:

groovy.lang.GroovyRuntimeException: Cannot set the property 'pingInterval' because the backing field is final.
	at org.codehaus.groovy.reflection.CachedField.setProperty(CachedField.java:71)
	at groovy.lang.MetaClassImpl.setAttribute(MetaClassImpl.java:2858)
	at groovy.lang.MetaClassImpl.setAttribute(MetaClassImpl.java:3770)
	at org.codehaus.groovy.runtime.InvokerHelper.setAttribute(InvokerHelper.java:161)
	at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.setField(ScriptBytecodeAdapter.java:332)
	at Script1.run(Script1.groovy:1)
	at groovy.lang.GroovyShell.evaluate(GroovyShell.java:585)
	at groovy.lang.GroovyShell.evaluate(GroovyShell.java:623)
	at groovy.lang.GroovyShell.evaluate(GroovyShell.java:594)
	at hudson.util.RemotingDiagnostics$Script.call(RemotingDiagnostics.java:142)
	at hudson.util.RemotingDiagnostics$Script.call(RemotingDiagnostics.java:114)
	at hudson.remoting.LocalChannel.call(LocalChannel.java:45)
	at hudson.util.RemotingDiagnostics.executeGroovy(RemotingDiagnostics.java:111)
	at jenkins.model.Jenkins._doScript(Jenkins.java:4283)
	at jenkins.model.Jenkins.doScript(Jenkins.java:4255)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.kohsuke.stapler.Function$InstanceFunction.invoke(Function.java:335)
	at org.kohsuke.stapler.Function.bindAndInvoke(Function.java:175)
	at org.kohsuke.stapler.Function.bindAndInvokeAndServeResponse(Function.java:108)
	at org.kohsuke.stapler.MetaClass$1.doDispatch(MetaClass.java:124)
	at org.kohsuke.stapler.NameBasedDispatcher.dispatch(NameBasedDispatcher.java:58)
	at org.kohsuke.stapler.Stapler.tryInvoke(Stapler.java:746)
	at org.kohsuke.stapler.Stapler.invoke(Stapler.java:876)
	at org.kohsuke.stapler.Stapler.invoke(Stapler.java:649)
	at org.kohsuke.stapler.Stapler.service(Stapler.java:238)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
	at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:812)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1669)
	at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:135)
	at hudson.plugins.greenballs.GreenBallFilter.doFilter(GreenBallFilter.java:59)
	at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:132)
	at hudson.util.PluginServletFilter.doFilter(PluginServletFilter.java:126)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
	at hudson.security.csrf.CrumbFilter.doFilter(CrumbFilter.java:49)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
	at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:84)
	at hudson.security.UnwrapSecurityExceptionFilter.doFilter(UnwrapSecurityExceptionFilter.java:51)
	at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
	at jenkins.security.ExceptionTranslationFilter.doFilter(ExceptionTranslationFilter.java:117)
	at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
	at org.acegisecurity.providers.anonymous.AnonymousProcessingFilter.doFilter(AnonymousProcessingFilter.java:125)
	at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
	at org.acegisecurity.ui.rememberme.RememberMeProcessingFilter.doFilter(RememberMeProcessingFilter.java:142)
	at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
	at org.acegisecurity.ui.AbstractProcessingFilter.doFilter(AbstractProcessingFilter.java:271)
	at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
	at jenkins.security.BasicHeaderProcessor.doFilter(BasicHeaderProcessor.java:93)
	at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
	at org.acegisecurity.context.HttpSessionContextIntegrationFilter.doFilter(HttpSessionContextIntegrationFilter.java:249)
	at hudson.security.HttpSessionContextIntegrationFilter2.doFilter(HttpSessionContextIntegrationFilter2.java:67)
	at hudson.security.ChainedServletFilter$1.doFilter(ChainedServletFilter.java:87)
	at hudson.security
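The wiki one-liner fails because Groovy's direct field access (.@) refuses to write final fields. An untested sketch of a workaround, assuming the field really is the final instance field 'pingInterval' named in the exception (the field name varies across Jenkins versions): plain Java reflection with setAccessible(true) can write a final instance field on Java 8, and -1 should disable the ping thread for channels set up after the change (already-connected agents keep their existing ping thread). The cleaner route, if your Jenkins version still reads it, is the startup flag -Dhudson.slaves.ChannelPinger.pingInterval=-1 followed by a restart.

import hudson.slaves.ChannelPinger
import jenkins.model.Jenkins
import java.lang.reflect.Field

// Untested sketch: bypass Groovy's refusal to set a final field.
// setAccessible(true) suffices for a final *instance* field on Java 8.
def pinger = Jenkins.instance.injector.getInstance(ChannelPinger.class)
Field f = ChannelPinger.class.getDeclaredField('pingInterval')
f.setAccessible(true)
f.setInt(pinger, -1)  // -1 = disable the ping thread for new channels
println "pingInterval is now ${f.getInt(pinger)}"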
Re: Frequent connection losses: java.io.IOException
On 29.05.2017 17:31, Stephen Connolly wrote:
> [snip]
> The ping is an application-level ping, i.e. it submits a command into the
> command queue and waits for the response to be generated. The command
> channel can be functioning, but if there is a backlog in the command queue
> on either side and the ping takes longer than 4 minutes (configurable) to
> complete, the channel is disconnected. The command queue is single-threaded.
>
> So there are multiple things that can cause the issue, e.g.:
> [snip]
> You may want to investigate the agent logs.

I have already had a short look at them and did not find anything special.

> You may want to monitor disk I/O, CPU and memory usage (especially if
> running everything on the one machine).

OK, I will do that. Jenkins and all 8 VMs are running on the same machine.

> You may want to monitor network I/O if running agents over a remote network.

The VMs are in the local network, using a simple bridge.

Thank you very much and best regards
Andreas
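A lightweight way to start that monitoring from the Jenkins script console (a sketch assuming the Linux master described earlier; it just reads /proc, so dedicated tooling such as vmstat/iostat or a monitoring plugin is preferable for anything long-term):

// Sample load average and memory headroom on the master host, where
// Jenkins and all 8 build VMs run. Spikes that line up with the
// disconnect times would support the resource-starvation theory.
println "loadavg: " + new File('/proc/loadavg').text.trim()
new File('/proc/meminfo').readLines().take(3).each { println it }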