Hi all,

For a while we've seen occasional hangs during the artifact stage of some jobs. 
Of course they always seem to happen when we're under the gun and need a 
critical build, as was the case this morning ;-)

Originally I thought that these hangs might be a disconnect between what the 
job is building and the list of artifacts. For example, I've definitely seen 
issues where if I change our build to create a new artifact, but there are jobs 
running that don't "know" how to produce the new artifact, the artifacting 
stage hangs. I've always theorized that in this case Jenkins searches our tree 
for the missing artifact, and is either bogged down by the sheer size of our 
source tree (which is huge), or is confused by our use of symlinks in the tree. 
But to be honest, I've never tried to do more than guess at the problem.

I *did* make a change like this a few days ago, and did see an artifacting 
hang. Since this was expected, I terminated that build. After that, new jobs 
succeeded until I got a hang out of the blue (with no build job changes) this 
morning.

Here's the stack trace from the build slave that is hung:

--snip--
"Executor #1 for MacBuildSlave-Speedy : executing ClientMacFullInstaller #700" Id=58 Group=main TIMED_WAITING on [B@5c8c89e2
        at java.lang.Object.wait(Native Method)
        -  waiting on [B@5c8c89e2
        at hudson.remoting.FastPipedInputStream.read(FastPipedInputStream.java:173)
        at hudson.util.HeadBufferingStream.read(HeadBufferingStream.java:61)
        at java.util.zip.InflaterInputStream.fill(InflaterInputStream.java:221)
        at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:141)
        at java.util.zip.GZIPInputStream.read(GZIPInputStream.java:90)
        at org.apache.tools.tar.TarBuffer.readBlock(TarBuffer.java:257)
        at org.apache.tools.tar.TarBuffer.readRecord(TarBuffer.java:223)
        at hudson.org.apache.tools.tar.TarInputStream.read(TarInputStream.java:345)
        at java.io.FilterInputStream.read(FilterInputStream.java:90)
        at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1025)
        at org.apache.commons.io.IOUtils.copy(IOUtils.java:999)
        at hudson.util.IOUtils.copy(IOUtils.java:36)
        at hudson.FilePath.readFromTar(FilePath.java:1759)
        at hudson.FilePath.copyRecursiveTo(FilePath.java:1685)
        at hudson.tasks.ArtifactArchiver.perform(ArtifactArchiver.java:116)
        at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
        at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:703)
        at hudson.model.AbstractBuild$AbstractRunner.performAllBuildSteps(AbstractBuild.java:678)
        at hudson.model.AbstractBuild$AbstractRunner.performAllBuildSteps(AbstractBuild.java:656)
        at hudson.model.Build$RunnerImpl.post2(Build.java:162)
        at hudson.model.AbstractBuild$AbstractRunner.post(AbstractBuild.java:625)
        at hudson.model.Run.run(Run.java:1435)
        at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
        at hudson.model.ResourceController.execute(ResourceController.java:88)
        at hudson.model.Executor.run(Executor.java:238)

--snip--

Note that we're running Jenkins 1.455.

I really want to get back to a state where we can trust that Jenkins will 
deliver builds without falling over. Is the above information helpful, or is 
more needed?

Note that I've got a complete threadDump stack trace of all the slaves and 
threads in case that is helpful, but it seemed like way too much to post here. 
Is there anything else I can do to help diagnose this problem while it's 
happening?
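
If a trimmed version would be more useful than the whole dump, something 
along these lines could cut it down to just the threads that mention the 
archiver. This is only a rough sketch, not something I have run against the 
real dump: the class name ThreadDumpFilter and the method blocksMentioning are 
made up for illustration, and it assumes the dump has been saved as plain text 
with a blank line between one thread's block and the next.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: keeps only the thread blocks of a saved thread dump
// that mention a given string (for example "ArtifactArchiver").
final class ThreadDumpFilter {

    // Assumption: thread blocks are separated by one or more blank lines.
    static List<String> blocksMentioning(String dump, String needle) {
        List<String> hits = new ArrayList<>();
        for (String block : dump.split("\\R\\s*\\R")) {
            if (block.contains(needle)) {
                hits.add(block.trim());
            }
        }
        return hits;
    }

    // Usage: java ThreadDumpFilter <saved-dump.txt> <text-to-look-for>
    public static void main(String[] args) throws Exception {
        byte[] raw = Files.readAllBytes(Paths.get(args[0]));
        String dump = new String(raw, StandardCharsets.UTF_8);
        for (String block : blocksMentioning(dump, args[1])) {
            System.out.println(block);
            System.out.println();
        }
    }
}
```

Filtering on "ArtifactArchiver" or "FilePath.copyRecursiveTo" should pick out 
the executor thread shown above; whichever thread is on the other end of that 
pipe would have to be found by filtering on a different frame.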

Unfortunately I cannot leave the build hung forever. I'll likely have to stop 
and restart it sometime today, so if anyone has suggestions as to what I can do 
next to look at the problem while it's still in front of me, please let me know 
soon.

Best,
--
Allen Cronce
