[JIRA] [core] (JENKINS-23917) Protocol deadlock while uploading artifacts from ppc64

2015-01-20 Thread ji...@gmx.net (JIRA)

jimis jimis commented on JENKINS-23917

Hi, I'm experiencing the same issue using Jenkins 1.580.1 on ppc64 running AIX 5.3. In particular, I'm seeing the exact same bytes that you posted in the attached file "slavelog-from-slave.txt":

java.io.StreamCorruptedException: invalid stream header: 009AACED

Searching the web, I found the following explanation for this sequence of bytes:

Object stream data is preceded by a 4-byte 'magical' sequence AC ED 00 05. An ObjectInputStream will peek for this data at construction time rather than before the first read. And that's logical: one wants to be sure it is a proper stream before being too far in an application. The sequence is buffered by the ObjectOutputStream at construction time so that it is pushed on the stream at the first write. This method often leads to complexities in buffered situations or transferring via pipes or sockets. Fortunately there is a solution to all these problems that is as simple as it is effective: flush the ObjectOutputStream immediately after construction!

Looking at the similarity in the byte sequence, it looks like either an endianness issue or an off-by-one (misalignment) error.
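
To make the misalignment idea concrete, here is a minimal Java sketch. It is only an illustration and not Jenkins remoting code: the class name HeaderShiftDemo and the two stray bytes 00 9A are assumptions chosen to match the log above. It serializes a string, prepends the two extra bytes, and shows that ObjectInputStream then rejects the stream at construction time. This only shows that a two-byte shift is consistent with the reported header value; it says nothing about where such bytes would come from.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.StreamCorruptedException;
import java.io.UncheckedIOException;

class HeaderShiftDemo {
    /** Serializes one object and returns the raw stream bytes, header included. */
    static byte[] serialize(Object value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.flush(); // push the AC ED 00 05 header out right after construction
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Returns the header error message, or null if the stream header is accepted. */
    static String headerError(byte[] stream) {
        // The ObjectInputStream constructor reads and validates the 4-byte header.
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(stream))) {
            return null;
        } catch (StreamCorruptedException e) {
            return e.getMessage();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Prepends hypothetical stray bytes to simulate a reader that is out of step. */
    static byte[] withPrefix(byte[] prefix, byte[] stream) {
        byte[] shifted = new byte[prefix.length + stream.length];
        System.arraycopy(prefix, 0, shifted, 0, prefix.length);
        System.arraycopy(stream, 0, shifted, prefix.length, stream.length);
        return shifted;
    }

    public static void main(String[] args) {
        byte[] good = serialize("hello");
        System.out.printf("first four bytes: %02X %02X %02X %02X%n",
                good[0], good[1], good[2], good[3]);
        System.out.println("aligned: " + headerError(good));
        byte[] shifted = withPrefix(new byte[] {0x00, (byte) 0x9A}, good);
        System.out.println("shifted: " + headerError(shifted));
    }
}
```

With the two assumed bytes in front, the reader sees 00 9A AC ED as its header, i.e. the real magic number arrives two bytes late.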

Thank you for providing the workaround; I'll set up the build slave using JNLP.

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-- 
You received this message because you are subscribed to the Google Groups Jenkins Issues group.
To unsubscribe from this group and stop receiving emails from it, send an email to jenkinsci-issues+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


[JIRA] [core] (JENKINS-23917) Protocol deadlock while uploading artifacts from ppc64

2014-07-23 Thread cr...@2ndquadrant.com (JIRA)

Craig Ringer commented on JENKINS-23917

With -text, this is the log shown for the node on the master when it dies:


===[JENKINS REMOTING CAPACITY]==[HUDSON TRANSMISSION BEGINS]===channel started
Slave.jar version: 2.43
This is a Unix slave
Slave successfully connected and online
Jul 23, 2014 5:57:48 AM hudson.remoting.SynchronousCommandTransport$ReaderThread run
SEVERE: I/O error in channel channel
java.io.StreamCorruptedException: invalid stream header: AC64736F
	at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:800)
	at java.io.ObjectInputStream.init(ObjectInputStream.java:297)
	at hudson.remoting.ObjectInputStreamEx.init(ObjectInputStreamEx.java:40)
	at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:34)
	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)
channel stopped
ERROR: Connection terminated
java.io.IOException: Unexpected termination of the channel
	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:50)
Caused by: java.io.EOFException
	at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2323)
	at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:2792)
	at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:800)
	at java.io.ObjectInputStream.init(ObjectInputStream.java:298)
	at hudson.remoting.ObjectInputStreamEx.init(ObjectInputStreamEx.java:40)
	at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:34)
	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)
[07/23/14 05:57:49] [SSH] Connection closed.
ERROR: [07/23/14 05:57:49] slave agent was terminated
java.io.IOException: Unexpected termination of the channel
	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:50)
Caused by: java.io.EOFException
	at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2323)
	at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:2792)
	at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:800)
	at java.io.ObjectInputStream.init(ObjectInputStream.java:298)
	at hudson.remoting.ObjectInputStreamEx.init(ObjectInputStreamEx.java:40)
	at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:34)
	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)


and from the slave itself:


channel startedchannel started

Jul 23, 2014 5:57:48 AM hudson.remoting.SynchronousCommandTransport$ReaderThread run
SEVERE: I/O error in channel channel
java.io.StreamCorruptedException: invalid stream header: AC64736F
at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:800)
at java.io.ObjectInputStream.init(ObjectInputStream.java:297)
at hudson.remoting.ObjectInputStreamEx.init(ObjectInputStreamEx.java:40)
at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:34)
at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)
Jul 23, 2014 5:57:48 AM hudson.remoting.SynchronousCommandTransport$ReaderThread run
SEVERE: I/O error in channel channel
java.io.StreamCorruptedException: invalid stream header: AC64736F
at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:800)
at java.io.ObjectInputStream.init(ObjectInputStream.java:297)
at hudson.remoting.ObjectInputStreamEx.init(ObjectInputStreamEx.java:40)
at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:34)
at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)
channel stoppedchannel stopped
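
As an aside, the header value in these logs can be decoded by hand. The sketch below is a hypothetical helper (the class name HeaderWordDecoder is an assumption, not part of Jenkins remoting) that turns the eight hex digits from the exception message back into bytes, shows which of them are printable ASCII, and reports where, if anywhere, the serialization magic AC ED appears. For AC64736F the last three bytes are the printable characters "dso" and the magic is absent, which may hint that the reader started in the middle of some other data, though that is speculation.

```java
class HeaderWordDecoder {
    /** Converts a hex string such as "AC64736F" into its bytes. */
    static byte[] toBytes(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    /** Renders printable ASCII bytes as characters and everything else as '.'. */
    static String asciiView(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            int v = b & 0xFF;
            sb.append(v >= 0x20 && v < 0x7F ? (char) v : '.');
        }
        return sb.toString();
    }

    /** Returns the offset of the AC ED magic within the bytes, or -1 if absent. */
    static int magicOffset(byte[] bytes) {
        for (int i = 0; i + 1 < bytes.length; i++) {
            if ((bytes[i] & 0xFF) == 0xAC && (bytes[i + 1] & 0xFF) == 0xED) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        for (String header : new String[] {"ACED0005", "AC64736F"}) {
            byte[] bytes = toBytes(header);
            System.out.println(header + " ascii=" + asciiView(bytes)
                    + " magicOffset=" + magicOffset(bytes));
        }
    }
}
```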

[JIRA] [core] (JENKINS-23917) Protocol deadlock while uploading artifacts from ppc64

2014-07-23 Thread cr...@2ndquadrant.com (JIRA)

Craig Ringer commented on JENKINS-23917

I can't for the life of me figure out how to use the -tcp option to make the slave passively accept a TCP connection from the master.

In any case, -text still fails, so it seems unlikely to be an issue with SSH mangling binary data.


[JIRA] [core] (JENKINS-23917) Protocol deadlock while uploading artifacts from ppc64

2014-07-23 Thread cr...@2ndquadrant.com (JIRA)

Craig Ringer commented on JENKINS-23917

When I switch the node to JNLP (so it uses direct TCP/IP as a transport, rather than SSH) I can no longer reproduce this despite repeated test runs.

So far:

	Only observed on ppc64
	Only observed for SSH slaves
	Using the -text protocol does not help


[JIRA] [core] (JENKINS-23917) Protocol deadlock while uploading artifacts from ppc64

2014-07-23 Thread cr...@2ndquadrant.com (JIRA)

Craig Ringer commented on JENKINS-23917

I've left a JNLP worker running for some hours, running the same job with the 1 MB and 10 MB artifact sizes that caused intermittent problems over the SSH transport. No problems.


[JIRA] [core] (JENKINS-23917) Protocol deadlock while uploading artifacts from ppc64

2014-07-22 Thread cr...@2ndquadrant.com (JIRA)

Craig Ringer updated JENKINS-23917

Change By: Craig Ringer (22/Jul/14 7:18 AM)

Summary: Protocol deadlock while uploading artifacts from ppc64


[JIRA] [core] (JENKINS-23917) Protocol deadlock while uploading artifacts from ppc64

2014-07-22 Thread cr...@2ndquadrant.com (JIRA)

Craig Ringer commented on JENKINS-23917

I'm going to re-test with a simplified build and a single executor configured.


[JIRA] [core] (JENKINS-23917) Protocol deadlock while uploading artifacts from ppc64

2014-07-22 Thread cr...@2ndquadrant.com (JIRA)

Craig Ringer commented on JENKINS-23917

Interestingly, archiving worked with a trivial configuration: a dd command to create a dummy file and a trivial archiving step to copy it.


[JIRA] [core] (JENKINS-23917) Protocol deadlock while uploading artifacts from ppc64

2014-07-22 Thread cr...@2ndquadrant.com (JIRA)

Craig Ringer commented on JENKINS-23917

I've been able to reproduce this with a trivial job and after limiting the node to a single executor. It is not consistently reproducible; it's somewhat random. I suspect it depends on the input being archived (10 MB of randomly generated data), but it could also be just plain random.

I'll attach the config.xml and thread dumps.


[JIRA] [core] (JENKINS-23917) Protocol deadlock while uploading artifacts from ppc64

2014-07-22 Thread cr...@2ndquadrant.com (JIRA)

Craig Ringer commented on JENKINS-23917

If it is helpful, I can provision access to the build worker for anyone interested in this issue.


[JIRA] [core] (JENKINS-23917) Protocol deadlock while uploading artifacts from ppc64

2014-07-22 Thread cr...@2ndquadrant.com (JIRA)

Craig Ringer updated JENKINS-23917

Build log is:


Started by user Craig Ringer
[EnvInject] - Loading node environment variables.
Building remotely on fedora16-ppc64-Power7-osuosl-karman (ppc64 fedora16 ppc linux fedora) in workspace /home/jenkins/workspace/ppctest
[ppctest] $ /bin/sh -xe /tmp/hudson9160124340407748260.sh
+ dd if=/dev/urandom of=dummy.out bs=1M count=10
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.604488 s, 17.3 MB/s
Archiving artifacts


At the time the master stack dump was taken, another build was running. I'll see if I can capture another one once the master is idle.

Change By: Craig Ringer (22/Jul/14 3:53 PM)

Attachment: config.xml
Attachment: jenkins-master-stack.txt
Attachment: jenkins-slave-stack.txt


[JIRA] [core] (JENKINS-23917) Protocol deadlock while uploading artifacts from ppc64

2014-07-22 Thread cr...@2ndquadrant.com (JIRA)

Craig Ringer updated JENKINS-23917

Master at idle

Change By: Craig Ringer (22/Jul/14 4:32 PM)

Attachment: jenkins-master-idle-stack.txt


[JIRA] [core] (JENKINS-23917) Protocol deadlock while uploading artifacts from ppc64

2014-07-22 Thread dan...@beckweb.net (JIRA)

Daniel Beck updated JENKINS-23917

Change By: Daniel Beck (22/Jul/14 8:31 PM)

Labels: archiving artifact deadlock remoting slave ssh unix


[JIRA] [core] (JENKINS-23917) Protocol deadlock while uploading artifacts from ppc64

2014-07-22 Thread cr...@2ndquadrant.com (JIRA)

Craig Ringer edited a comment on JENKINS-23917

Attached a jstack dump for the master, which was idle except for the stuck connection.


[JIRA] [core] (JENKINS-23917) Protocol deadlock while uploading artifacts from ppc64

2014-07-22 Thread cr...@2ndquadrant.com (JIRA)

Craig Ringer commented on JENKINS-23917

I cancelled the job after 12 hours. Console output was:


Archiving artifacts
ERROR: Failed to archive artifacts: **/dummy.out
java.io.IOException: java.io.IOException: Failed to extract /home/jenkins/workspace/ppctest/transfer of 1 files
	at hudson.FilePath.readFromTar(FilePath.java:2119)
	at hudson.FilePath.copyRecursiveTo(FilePath.java:2031)
	at jenkins.model.StandardArtifactManager.archive(StandardArtifactManager.java:61)
	at hudson.tasks.ArtifactArchiver.perform(ArtifactArchiver.java:183)
	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:772)
	at hudson.model.AbstractBuild$AbstractBuildExecution.performAllBuildSteps(AbstractBuild.java:736)
	at hudson.model.Build$BuildExecution.post2(Build.java:183)
	at hudson.model.AbstractBuild$AbstractBuildExecution.post(AbstractBuild.java:685)
	at hudson.model.Run.execute(Run.java:1757)
	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
	at hudson.model.ResourceController.execute(ResourceController.java:88)
	at hudson.model.Executor.run(Executor.java:234)
Caused by: java.io.IOException
	at hudson.remoting.FastPipedInputStream.read(FastPipedInputStream.java:177)
	at hudson.util.HeadBufferingStream.read(HeadBufferingStream.java:61)
	at com.jcraft.jzlib.InflaterInputStream.fill(InflaterInputStream.java:175)
	at com.jcraft.jzlib.InflaterInputStream.read(InflaterInputStream.java:106)
	at org.apache.tools.tar.TarBuffer.readBlock(TarBuffer.java:257)
	at org.apache.tools.tar.TarBuffer.readRecord(TarBuffer.java:223)
	at hudson.org.apache.tools.tar.TarInputStream.read(TarInputStream.java:345)
	at java.io.FilterInputStream.read(FilterInputStream.java:107)
	at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1792)
	at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1769)
	at org.apache.commons.io.IOUtils.copy(IOUtils.java:1744)
	at hudson.util.IOUtils.copy(IOUtils.java:40)
	at hudson.FilePath.readFromTar(FilePath.java:2109)
	... 12 more

	at hudson.FilePath.copyRecursiveTo(FilePath.java:2038)
	at jenkins.model.StandardArtifactManager.archive(StandardArtifactManager.java:61)
	at hudson.tasks.ArtifactArchiver.perform(ArtifactArchiver.java:183)
	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:772)
	at hudson.model.AbstractBuild$AbstractBuildExecution.performAllBuildSteps(AbstractBuild.java:736)
	at hudson.model.Build$BuildExecution.post2(Build.java:183)
	at hudson.model.AbstractBuild$AbstractBuildExecution.post(AbstractBuild.java:685)
	at hudson.model.Run.execute(Run.java:1757)
	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
	at hudson.model.ResourceController.execute(ResourceController.java:88)
	at hudson.model.Executor.run(Executor.java:234)
Caused by: java.util.concurrent.ExecutionException: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
	at hudson.remoting.Request$1.get(Request.java:278)
	at hudson.remoting.Request$1.get(Request.java:210)
	at hudson.remoting.FutureAdapter.get(FutureAdapter.java:59)
	at hudson.FilePath.copyRecursiveTo(FilePath.java:2034)
	... 11 more
Caused by: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
	at hudson.remoting.Request.abort(Request.java:299)
	at hudson.remoting.Channel.terminate(Channel.java:802)
	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:69)
Caused by: java.io.IOException: Unexpected termination of the channel
	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:50)
Caused by: java.io.EOFException
	at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2323)
	at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:2792)
	at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:800)
	at java.io.ObjectInputStream.init(ObjectInputStream.java:298)
	at hudson.remoting.ObjectInputStreamEx.init(ObjectInputStreamEx.java:40)
	at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:34)
	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)
Build step 'Archive 

[JIRA] [core] (JENKINS-23917) Protocol deadlock while uploading artifacts from ppc64

2014-07-22 Thread cr...@2ndquadrant.com (JIRA)

Craig Ringer commented on JENKINS-23917

A re-run with the exact same data to archive succeeded. So this looks like an intermittent fault: timing, threading, memory, etc.


[JIRA] [core] (JENKINS-23917) Protocol deadlock while uploading artifacts from ppc64

2014-07-22 Thread cr...@2ndquadrant.com (JIRA)

Craig Ringer edited a comment on JENKINS-23917

A re-run with the exact same data to archive succeeded. So this looks like an intermittent fault: timing, threading, memory, etc.

A second re-run got stuck.

Fun.


[JIRA] [core] (JENKINS-23917) Protocol deadlock while uploading artifacts from ppc64

2014-07-22 Thread cr...@2ndquadrant.com (JIRA)

Craig Ringer commented on JENKINS-23917

I've progressively reduced the archive file size. It has got stuck with files as small as 128 KB.

So far, tests with 16 KB files haven't failed. I'm trying to narrow down whether it can occur with very small archive files (and is just less likely) or whether it's size-related.
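
For a rough sense of how many clean runs it takes to tell "less likely" apart from "never happens", here is a small Java sketch. It assumes that runs are independent and that each run fails with a fixed probability; the failure rates used are hypothetical and were not measured on this worker, and the class name CleanRunOdds is made up for the example.

```java
class CleanRunOdds {
    /** Probability that all runs pass when each run fails with the given rate. */
    static double allPass(double failureRate, int runs) {
        return Math.pow(1.0 - failureRate, runs);
    }

    /**
     * Smallest number of consecutive clean runs after which "they all passed
     * by luck" has probability at most alpha, for a hypothetical failure rate.
     */
    static int runsNeeded(double failureRate, double alpha) {
        if (failureRate <= 0.0 || failureRate >= 1.0 || alpha <= 0.0 || alpha >= 1.0) {
            throw new IllegalArgumentException("rates must be strictly between 0 and 1");
        }
        return (int) Math.ceil(Math.log(alpha) / Math.log(1.0 - failureRate));
    }

    public static void main(String[] args) {
        double alpha = 0.05; // accept a 5% chance of being fooled by luck
        for (double rate : new double[] {0.5, 0.1, 0.01}) { // hypothetical rates
            int n = runsNeeded(rate, alpha);
            System.out.printf("failure rate %.2f: %d clean runs (chance of luck %.4f)%n",
                    rate, n, allPass(rate, n));
        }
    }
}
```

Under these assumptions, a fault that hits one run in ten needs about 29 clean runs before its absence means much, so a handful of passing 16 KB runs does not rule it out.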


[JIRA] [core] (JENKINS-23917) Protocol deadlock while uploading artifacts from ppc64

2014-07-22 Thread cr...@2ndquadrant.com (JIRA)

Craig Ringer commented on JENKINS-23917

Just hit the same issue at a different place. Stuck at:


Started by user Craig Ringer
[EnvInject] - Loading node environment variables.


I scheduled another build while this one was still running, which is the first time I've done that. It got queued because there's only one executor, but if queuing still triggers communication with the slave over the SSH session, maybe that's why?

When I cancelled it, the exception was:


Started by user Craig Ringer
[EnvInject] - Loading node environment variables.
ERROR: SEVERE ERROR occurs
org.jenkinsci.lib.envinject.EnvInjectException: java.lang.InterruptedException
	at org.jenkinsci.plugins.envinject.service.EnvironmentVariablesNodeLoader.gatherEnvironmentVariablesNode(EnvironmentVariablesNodeLoader.java:77)
	at org.jenkinsci.plugins.envinject.EnvInjectListener.loadEnvironmentVariablesNode(EnvInjectListener.java:81)
	at org.jenkinsci.plugins.envinject.EnvInjectListener.setUpEnvironment(EnvInjectListener.java:39)
	at hudson.model.AbstractBuild$AbstractBuildExecution.createLauncher(AbstractBuild.java:589)
	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:493)
	at hudson.model.Run.execute(Run.java:1732)
	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
	at hudson.model.ResourceController.execute(ResourceController.java:88)
	at hudson.model.Executor.run(Executor.java:234)
Caused by: java.lang.InterruptedException
	at java.lang.Object.wait(Native Method)
	at hudson.remoting.Request.call(Request.java:146)
	at hudson.remoting.Channel.call(Channel.java:739)
	at hudson.FilePath.act(FilePath.java:1011)
	at org.jenkinsci.plugins.envinject.service.EnvironmentVariablesNodeLoader.gatherEnvironmentVariablesNode(EnvironmentVariablesNodeLoader.java:44)
	... 8 more


[JIRA] [core] (JENKINS-23917) Protocol deadlock while uploading artifacts from ppc64

2014-07-22 Thread cr...@2ndquadrant.com (JIRA)

Craig Ringer edited a comment on JENKINS-23917

Just hit the same issue at a different place. Stuck at:


Started by user Craig Ringer
[EnvInject] - Loading node environment variables.


I scheduled another build while this one was still running, which is the first time I've done that. It got queued because there's only one executor, but if queuing still triggers communication with the slave over the SSH session, maybe that's why?

When I cancelled it, the exception was:


Started by user Craig Ringer
[EnvInject] - Loading node environment variables.
ERROR: SEVERE ERROR occurs
org.jenkinsci.lib.envinject.EnvInjectException: java.lang.InterruptedException
	at org.jenkinsci.plugins.envinject.service.EnvironmentVariablesNodeLoader.gatherEnvironmentVariablesNode(EnvironmentVariablesNodeLoader.java:77)
	at org.jenkinsci.plugins.envinject.EnvInjectListener.loadEnvironmentVariablesNode(EnvInjectListener.java:81)
	at org.jenkinsci.plugins.envinject.EnvInjectListener.setUpEnvironment(EnvInjectListener.java:39)
	at hudson.model.AbstractBuild$AbstractBuildExecution.createLauncher(AbstractBuild.java:589)
	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:493)
	at hudson.model.Run.execute(Run.java:1732)
	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
	at hudson.model.ResourceController.execute(ResourceController.java:88)
	at hudson.model.Executor.run(Executor.java:234)
Caused by: java.lang.InterruptedException
	at java.lang.Object.wait(Native Method)
	at hudson.remoting.Request.call(Request.java:146)
	at hudson.remoting.Channel.call(Channel.java:739)
	at hudson.FilePath.act(FilePath.java:1011)
	at org.jenkinsci.plugins.envinject.service.EnvironmentVariablesNodeLoader.gatherEnvironmentVariablesNode(EnvironmentVariablesNodeLoader.java:44)
	... 8 more


followed by slave agent death with:


java.io.IOException: Unexpected termination of the channel
	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:50)
Caused by: java.io.EOFException
	at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2323)
	at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:2792)
	at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:800)
	at java.io.ObjectInputStream.init(ObjectInputStream.java:298)
	at hudson.remoting.ObjectInputStreamEx.init(ObjectInputStreamEx.java:40)
	at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:34)
	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)


[JIRA] [core] (JENKINS-23917) Protocol deadlock while uploading artifacts from ppc64

2014-07-22 Thread cr...@2ndquadrant.com (JIRA)

Craig Ringer commented on JENKINS-23917

A key clue is that slave.jar appears to be dying suddenly (the process terminates) every couple of jobs when running with a 16 KB archive file. It is unclear whether this was happening before with larger archives and I wasn't noticing because Jenkins was relaunching the worker.

This seems to happen after successful job completion though.

After reconfiguring the slave launcher with 


ulimit -c unlimited 


as prefix and 


-slaveLog "log-$(date -Iseconds).txt"


then rerunning the small 16 KB archive job until the slave died (3rd run), I was able to capture logs from both sides.

I'm now going to test base64-encoded streams (in case it's an SSH 8-bit-clean issue) and direct TCP/IP.
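
To illustrate why base64 would sidestep an 8-bit-clean problem, here is a minimal Java sketch using java.util.Base64. It is not a description of how slave.jar implements its text mode; the class name Base64TransportDemo is an assumption for the example. It encodes the four header bytes AC ED 00 05, two of which are above 0x7F, and checks that the encoded form contains only 7-bit ASCII and decodes back to the same bytes.

```java
import java.util.Arrays;
import java.util.Base64;

class Base64TransportDemo {
    /** Encodes raw bytes as a base64 string. */
    static String encode(byte[] raw) {
        return Base64.getEncoder().encodeToString(raw);
    }

    /** Decodes a base64 string back into raw bytes. */
    static byte[] decode(String text) {
        return Base64.getDecoder().decode(text);
    }

    /** True if every character fits in 7 bits, i.e. survives a non-8-bit-clean link. */
    static boolean isSevenBitClean(String text) {
        return text.chars().allMatch(c -> c < 0x80);
    }

    public static void main(String[] args) {
        byte[] header = {(byte) 0xAC, (byte) 0xED, 0x00, 0x05};
        String encoded = encode(header);
        System.out.println("encoded: " + encoded);
        System.out.println("7-bit clean: " + isSevenBitClean(encoded));
        System.out.println("round trip ok: " + Arrays.equals(decode(encoded), header));
    }
}
```

If a failure still occurs with an encoding like this, the transport stripping or altering high bytes becomes a much less likely explanation.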


[JIRA] [core] (JENKINS-23917) Protocol deadlock while uploading artifacts from ppc64

2014-07-22 Thread cr...@2ndquadrant.com (JIRA)

Craig Ringer updated JENKINS-23917

Attached logs from when a slave dies. slavelog-from-master is from the node log, taken from the web UI. slavelog-from-slave is from the file on the slave machine specified with the "-slaveLog" command-line parameter to the slave agent.

Change By: Craig Ringer (23/Jul/14 5:47 AM)

Attachment: slavelog-from-master.txt
Attachment: slavelog-from-slave.txt