x29a edited a comment on Bug JENKINS-12235

Unfortunately, i experience this (very prominent!) problem as well. Here are some more infos and a possible workaround.

Setup:

  • Jenkins 1.512 on tomcat 6.0.36, in a VirtualBox Windows 7 Guest, JRE 1.7.0
  • Slave connected via JNLP client in a VirtualBox Windows 7 Guest, JRE 1.7.0

The Slave is started via:
java -Xmx512m -jar slave.jar -jnlpUrl http://jenkins/computer/slave/slave-agent.jnlp in order to see the log messages

First off, setting various values to these variables (in catalina on tomcat) did not seem to improve the behaviour:
-Dhudson.remoting.Launcher.pingTimeoutSec
-Dhudson.remoting.Launcher.pingIntervalSec
-Dhudson.slaves.ChannelPinger.pingInterval

I was getting the "channel already closed" exception quite frequently and mostly at the same spot during script execution. The job (between 12h and 16h) on the slave (via windows batch file) generates large amounts of documentation via doxygen and pipes the output into a logfile, so it uses quite some CPU and does not echo progress. Throttling the CPU so that the NIC wont suffer from the overload, did not help the problem though. Also, i performed continuous pings to the slave (from the master and back) and ping requests only seldomly failed (normal network tolerances).

To say this first: allthough jenkins failed with the above mentioned exception, the slave continued to perform its job "in the background", so if the exception came after 1h, i would see the updated documentation after 16h allthough jenkins already declared the job as failed.

For the chronology, these are the log excerpts:

In the live console on the jenkins WebUI i see (THE FIRST LINE IS THE LAST OUTPUT BY MY SCRIPT):

Jenkins WebUI
2013-05-03 18:36:50 - Processing: documentationA
FATAL: Unable to delete script file c:\temp\hudson3125329676016517230.bat
hudson.util.IOException2: remote file operation failed: c:\temp\hudson3125329676016517230.bat at hudson.remoting.Channel@1f12b9f:slave
	at hudson.FilePath.act(FilePath.java:900)
	at hudson.FilePath.act(FilePath.java:877)
	at hudson.FilePath.delete(FilePath.java:1262)
	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:101)
	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:60)
	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:802)
	at hudson.model.Build$BuildExecution.build(Build.java:199)
	at hudson.model.Build$BuildExecution.doRun(Build.java:160)
	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:584)
	at hudson.model.Run.execute(Run.java:1575)
	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
	at hudson.model.ResourceController.execute(ResourceController.java:88)
	at hudson.model.Executor.run(Executor.java:237)
Caused by: hudson.remoting.ChannelClosedException: channel is already closed
	at hudson.remoting.Channel.send(Channel.java:494)
	at hudson.remoting.Request.call(Request.java:129)
	at hudson.remoting.Channel.call(Channel.java:672)
	at hudson.FilePath.act(FilePath.java:893)
	... 13 more
Caused by: java.io.IOException: Unexpected termination of the channel
	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:50)
Caused by: java.io.EOFException
	at java.io.ObjectInputStream$BlockDataInputStream.peekByte(Unknown Source)
	at java.io.ObjectInputStream.readObject0(Unknown Source)
	at java.io.ObjectInputStream.readObject(Unknown Source)
	at hudson.remoting.Command.readFrom(Command.java:92)
	at hudson.remoting.ClassicCommandTransport.read(ClassicCommandTransport.java:59)
	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)
FATAL: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
hudson.remoting.RequestAbortedException: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
	at hudson.remoting.Request.call(Request.java:174)
	at hudson.remoting.Channel.call(Channel.java:672)
	at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:158)
	at $Proxy52.join(Unknown Source)
	at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:915)
	at hudson.Launcher$ProcStarter.join(Launcher.java:360)
	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:91)
	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:60)
	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:802)
	at hudson.model.Build$BuildExecution.build(Build.java:199)
	at hudson.model.Build$BuildExecution.doRun(Build.java:160)
	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:584)
	at hudson.model.Run.execute(Run.java:1575)
	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
	at hudson.model.ResourceController.execute(ResourceController.java:88)
	at hudson.model.Executor.run(Executor.java:237)
Caused by: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
	at hudson.remoting.Request.abort(Request.java:299)
	at hudson.remoting.Channel.terminate(Channel.java:732)
	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:69)
Caused by: java.io.IOException: Unexpected termination of the channel
	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:50)
Caused by: java.io.EOFException
	at java.io.ObjectInputStream$BlockDataInputStream.peekByte(Unknown Source)
	at java.io.ObjectInputStream.readObject0(Unknown Source)
	at java.io.ObjectInputStream.readObject(Unknown Source)
	at hudson.remoting.Command.readFrom(Command.java:92)
	at hudson.remoting.ClassicCommandTransport.read(ClassicCommandTransport.java:59)
	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)

In the JNLP clientlog on my slave i got:

JNLP Client on slave
Mai 03, 2013 7:16:43 PM hudson.remoting.SynchronousCommandTransport$ReaderThread run
SEVERE: I/O error in channel channel
java.net.SocketTimeoutException: Read timed out
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(Unknown Source)
        at java.net.SocketInputStream.read(Unknown Source)
        at java.io.BufferedInputStream.fill(Unknown Source)
        at java.io.BufferedInputStream.read(Unknown Source)
        at java.io.ObjectInputStream$PeekInputStream.peek(Unknown Source)
        at java.io.ObjectInputStream$BlockDataInputStream.peek(Unknown Source)
        at java.io.ObjectInputStream$BlockDataInputStream.peekByte(Unknown Source)
        at java.io.ObjectInputStream.readObject0(Unknown Source)
        at java.io.ObjectInputStream.readObject(Unknown Source)
        at hudson.remoting.Command.readFrom(Command.java:92)
        at hudson.remoting.ClassicCommandTransport.read(ClassicCommandTransport.java:59)
        at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)

Mai 03, 2013 7:16:43 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Terminated
Mai 03, 2013 7:16:56 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Locating server among [http://jenkins/]

And on the Tomcat server:

Tomcat server
May 03, 2013 7:16:42 PM hudson.remoting.SynchronousCommandTransport$ReaderThread run
SEVERE: I/O error in channel slave
java.io.IOException: Unexpected termination of the channel
        at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:50)
Caused by: java.io.EOFException
        at java.io.ObjectInputStream$BlockDataInputStream.peekByte(Unknown Source)
        at java.io.ObjectInputStream.readObject0(Unknown Source)
        at java.io.ObjectInputStream.readObject(Unknown Source)
        at hudson.remoting.Command.readFrom(Command.java:92)
        at hudson.remoting.ClassicCommandTransport.read(ClassicCommandTransport.java:59)
        at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)

May 03, 2013 7:16:42 PM jenkins.slaves.JnlpSlaveAgentProtocol$Handler$1 onClosed
WARNING: Channel reader thread: slave for + slave terminated
java.io.IOException: Unexpected termination of the channel
        at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:50)
Caused by: java.io.EOFException
        at java.io.ObjectInputStream$BlockDataInputStream.peekByte(Unknown Source)
        at java.io.ObjectInputStream.readObject0(Unknown Source)
        at java.io.ObjectInputStream.readObject(Unknown Source)
        at hudson.remoting.Command.readFrom(Command.java:92)
        at hudson.remoting.ClassicCommandTransport.read(ClassicCommandTransport.java:59)
        at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)

May 03, 2013 7:16:53 PM hudson.TcpSlaveAgentListener$ConnectionHandler run
INFO: Accepted connection #78 from /10.0.0.2:49300

It seemed, that the channel gets closed, when there is no data going through the connection (hence playing with the ping settings mentioned above). I cant definately say how long it would stay open, but in the case shown above, it was about 45min without output. Therefor i modified the script to call doxygen in a thread and output a "." every 15s. So far, no more closed channels If you cant modify your script to generate continuous output, maybe pipe your command (in the batch file) to some program which outputs the output or triggers a continuous output. Also, i noticed that the dots generated from my script modification are not shown in the WebUI until a newline is sent. Nevertheless, the channel did not get closed.

I hope that this investigation delivers some clues to fix this problem and make distributed working with jenkins more stable!

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira

--
You received this message because you are subscribed to the Google Groups "Jenkins Issues" group.
To unsubscribe from this group and stop receiving emails from it, send an email to jenkinsci-issues+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

Reply via email to