We are seeing 2 occurrences of a java.io.IOException when using Windows nodes as agents. 1. When new nodes are being spun up for Windows jobs, it appears that Jenkins will assign a newly spun up node (after establishing a connection) to service the build, but the connection gets terminated before the build even begins.
Started by user some_user
Running as SYSTEM
Building remotely on EC2 (amazon-ec2) - windows-label (i-0de1740c2107602b3) (windows-label) in workspace D:\dev\jenkins\workspace\WindowsStressTests\SimpleWindowsBuild11
FATAL: java.io.IOException: Pipe is already closed
hudson.remoting.FastPipedInputStream$ClosedBy: The pipe was closed at...
at hudson.remoting.FastPipedOutputStream.error(FastPipedOutputStream.java:100)
at hudson.remoting.FastPipedOutputStream.close(FastPipedOutputStream.java:90)
at hudson.plugins.ec2.util.Closeables.closeQuietly(Closeables.java:23)
at hudson.plugins.ec2.win.winrm.WindowsProcess$2.run(WindowsProcess.java:146)
Caused: java.io.IOException: Pipe is already closed
at hudson.remoting.FastPipedOutputStream.write(FastPipedOutputStream.java:154)
at hudson.remoting.FastPipedOutputStream.write(FastPipedOutputStream.java:138)
at hudson.remoting.ChunkedOutputStream.sendFrame(ChunkedOutputStream.java:89)
at hudson.remoting.ChunkedOutputStream.drain(ChunkedOutputStream.java:85)
at hudson.remoting.ChunkedOutputStream.write(ChunkedOutputStream.java:54)
at java.io.OutputStream.write(OutputStream.java:75)
at hudson.remoting.ChunkedCommandTransport.writeBlock(ChunkedCommandTransport.java:45)
at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.write(AbstractSynchronousByteArrayCommandTransport.java:46)
at hudson.remoting.Channel.send(Channel.java:721)
at hudson.remoting.Channel.close(Channel.java:1436)
Also: hudson.remoting.Channel$CallSiteStackTrace: Remote call to EC2 (amazon-ec2) - windows-label (i-0de1740c2107602b3)
at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1741)
at hudson.remoting.Request.call(Request.java:202)
at hudson.remoting.Channel.call(Channel.java:954)
at hudson.FilePath.act(FilePath.java:1069)
at hudson.FilePath.act(FilePath.java:1058)
at hudson.FilePath.mkdirs(FilePath.java:1243)
at hudson.model.AbstractProject.checkout(AbstractProject.java:1199)
at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:574)
at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:499)
at hudson.model.Run.execute(Run.java:1853)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
at hudson.model.ResourceController.execute(ResourceController.java:97)
at hudson.model.Executor.run(Executor.java:427)
Caused: hudson.remoting.RequestAbortedException
at hudson.remoting.Request.abort(Request.java:340)
at hudson.remoting.Channel.terminate(Channel.java:1038)
at hudson.remoting.Channel.close(Channel.java:1444)
at hudson.remoting.Channel.close(Channel.java:1403)
at hudson.slaves.SlaveComputer.closeChannel(SlaveComputer.java:843)
at hudson.slaves.SlaveComputer.access$100(SlaveComputer.java:108)
at hudson.slaves.SlaveComputer$2.run(SlaveComputer.java:734)
at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:59)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Finished: FAILURE
Afterwards, the node is left in an offline state with the message "Connection was broken" until we manually reconnect it (via the "Launch Agent" button in the Jenkins UI). We are consistently see this problem. Log from the node:
EC2 (amazon-ec2) - windows-label(i-0531ef3f07b2cdaee) booted at 1582839470000
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Waiting for password to be available. Sleeping 10s.
Connecting to (10.93.145.196) with WinRM as david_webb6
WinRM service responded. Waiting for WinRM service to stabilize on EC2 (amazon-ec2) - windows-label (i-0531ef3f07b2cdaee)
WinRM should now be ok on EC2 (amazon-ec2) - windows-label (i-0531ef3f07b2cdaee)
Connected with WinRM.
Creating tmp directory if it does not exist
remoting.jar sent remotely. Bootstrapping it
Launching via WinRM:java -jar C:\Windows\Temp\remoting.jar -workDir D:\dev\jenkins
<===[JENKINS REMOTING CAPACITY]===>Remoting version: 3.36.1
This is a Windows agent
ERROR: Failed to monitor for Free Swap Space
java.util.concurrent.TimeoutException
at hudson.remoting.Request$1.get(Request.java:316)
at hudson.remoting.Request$1.get(Request.java:240)
at hudson.remoting.FutureAdapter.get(FutureAdapter.java:59)
at hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor.monitorDetailed(AbstractAsyncNodeMonitorDescriptor.java:114)
at hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor.monitor(AbstractAsyncNodeMonitorDescriptor.java:78)
at hudson.node_monitors.AbstractNodeMonitorDescriptor$Record.run(AbstractNodeMonitorDescriptor.java:306)
ERROR: Failed to monitor for Free Disk Space
java.util.concurrent.TimeoutException
at hudson.remoting.Request$1.get(Request.java:316)
at hudson.remoting.Request$1.get(Request.java:240)
at hudson.remoting.FutureAdapter.get(FutureAdapter.java:59)
at hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor.monitorDetailed(AbstractAsyncNodeMonitorDescriptor.java:114)
at hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor.monitor(AbstractAsyncNodeMonitorDescriptor.java:78)
at hudson.node_monitors.AbstractNodeMonitorDescriptor$Record.run(AbstractNodeMonitorDescriptor.java:306)
ERROR: Failed to monitor for Free Temp Space
java.util.concurrent.TimeoutException
at hudson.remoting.Request$1.get(Request.java:316)
at hudson.remoting.Request$1.get(Request.java:240)
at hudson.remoting.FutureAdapter.get(FutureAdapter.java:59)
at hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor.monitorDetailed(AbstractAsyncNodeMonitorDescriptor.java:114)
at hudson.node_monitors.AbstractAsyncNodeMonitorDescriptor.monitor(AbstractAsyncNodeMonitorDescriptor.java:78)
at hudson.node_monitors.AbstractNodeMonitorDescriptor$Record.run(AbstractNodeMonitorDescriptor.java:306)
Agent successfully connected and online
ERROR: Connection terminated
hudson.remoting.FastPipedInputStream$ClosedBy: The pipe was closed at...
at hudson.remoting.FastPipedOutputStream.error(FastPipedOutputStream.java:100)
at hudson.remoting.FastPipedOutputStream.close(FastPipedOutputStream.java:90)
at hudson.plugins.ec2.util.Closeables.closeQuietly(Closeables.java:23)
at hudson.plugins.ec2.win.winrm.WindowsProcess$2.run(WindowsProcess.java:146)
Caused: java.io.IOException: Pipe is already closed
at hudson.remoting.FastPipedOutputStream.write(FastPipedOutputStream.java:154)
at hudson.remoting.FastPipedOutputStream.write(FastPipedOutputStream.java:138)
at hudson.remoting.ChunkedOutputStream.sendFrame(ChunkedOutputStream.java:89)
at hudson.remoting.ChunkedOutputStream.drain(ChunkedOutputStream.java:85)
at hudson.remoting.ChunkedOutputStream.write(ChunkedOutputStream.java:54)
at java.io.OutputStream.write(OutputStream.java:75)
at hudson.remoting.ChunkedCommandTransport.writeBlock(ChunkedCommandTransport.java:45)
at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.write(AbstractSynchronousByteArrayCommandTransport.java:46)
at hudson.remoting.Channel.send(Channel.java:721)
at hudson.remoting.Channel.close(Channel.java:1436)
at hudson.remoting.Channel.close(Channel.java:1403)
at hudson.slaves.SlaveComputer.closeChannel(SlaveComputer.java:843)
at hudson.slaves.SlaveComputer.access$100(SlaveComputer.java:108)
at hudson.slaves.SlaveComputer$2.run(SlaveComputer.java:734)
at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:59)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
ERROR: Connection terminated
java.io.EOFException
at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2736)
at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:3211)
at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:896)
at java.io.ObjectInputStream.<init>(ObjectInputStream.java:358)
at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:49)
at hudson.remoting.Command.readFrom(Command.java:140)
at hudson.remoting.Command.readFrom(Command.java:126)
at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:35)
at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:63)
Caused: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:77)
2. In the middle of a build, the node will go into a disconnected state and see the same error. This occurrence is less frequent and we're unable to consistently recreate it and it may be a different issue altogether but I wanted to at least bring it to your attention in case the information was helpful in solving our problem. |