Hello,
I have vSphere slaves which are reverted to a snapshot before each build. These slaves are used to test the installation of our products.
Some builds wait a long time at the start and then crash with the following message:
Remote build on slave-seven64 (x86_64 seven uiTest java7 autoRestored win x86)
FATAL: channel is already closed
hudson.remoting.ChannelClosedException: channel is already closed
at hudson.remoting.Channel.send(Channel.java:541)
at hudson.remoting.Request.call(Request.java:129)
at hudson.remoting.Channel.call(Channel.java:739)
at hudson.EnvVars.getRemote(EnvVars.java:404)
at hudson.model.Computer.getEnvironment(Computer.java:927)
at jenkins.model.CoreEnvironmentContributor.buildEnvironmentFor(CoreEnvironmentContributor.java:29)
at hudson.model.Run.getEnvironment(Run.java:2248)
at hudson.model.AbstractBuild.getEnvironment(AbstractBuild.java:905)
at hudson.matrix.MatrixRun$MatrixRunExecution.decideWorkspace(MatrixRun.java:175)
at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:517)
at hudson.model.Run.execute(Run.java:1740)
at hudson.matrix.MatrixRun.run(MatrixRun.java:146)
at hudson.model.ResourceController.execute(ResourceController.java:89)
at hudson.model.Executor.run(Executor.java:240)
Caused by: java.io.IOException
at hudson.remoting.Channel.close(Channel.java:1027)
at hudson.slaves.ChannelPinger$1.onDead(ChannelPinger.java:110)
at hudson.remoting.PingThread.ping(PingThread.java:120)
at hudson.remoting.PingThread.run(PingThread.java:81)
Caused by: java.util.concurrent.TimeoutException: Ping started on 1411030796694 hasn't completed at 1411031036694
... 2 more
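As a side note, the two numbers in the TimeoutException look like epoch timestamps in milliseconds (that is my assumption, I have not checked the remoting source). Under that reading, the ping waited 240 seconds, i.e. 4 minutes, before the channel was closed. A minimal sketch of the arithmetic, where the constant and function names are only illustrative:

```python
# Timestamps copied from the TimeoutException message above.
# Assumption: both values are epoch times in milliseconds.
PING_STARTED_MS = 1411030796694
PING_GAVE_UP_MS = 1411031036694


def elapsed_seconds(start_ms, end_ms):
    """Return the gap between two millisecond timestamps, in seconds."""
    return (end_ms - start_ms) / 1000.0


timeout_s = elapsed_seconds(PING_STARTED_MS, PING_GAVE_UP_MS)
print("ping waited %.0f s (%.0f min)" % (timeout_s, timeout_s / 60))
```

So the build seems to hang for about 4 minutes on a slave that no longer answers, which matches the long wait I see before the crash.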
The problem happens very often, but it is not 100% reproducible. I see it on 2 different sets of computers. My slave configuration worked properly until fairly recently (about 6 months ago), but I don't know which Jenkins update introduced the problem.
Sometimes the Jenkins UI shows a blocked build as running on a slave while the computer hosting that slave is down according to the vSphere Client.
I also found the following message in the log of a running slave on which a build was waiting before it crashed:
sept. 18, 2014 3:30:59 PM hudson.remoting.jnlp.Main$CuiListener error
SEVERE: The server rejected the connection: ntedtop-seven64 is already connected to this master. Rejecting this connection.
java.lang.Exception: The server rejected the connection: ntedtop-seven64 is already connected to this master. Rejecting this connection.
at hudson.remoting.Engine.onConnectionRejected(Engine.java:286)
at hudson.remoting.Engine.run(Engine.java:261)
Is there a workaround for this problem?
Regards,
Grégoire