We found the issue! So I thought I'd let you know.
We downgraded to 1.424.2 but the issue was still there. After a good night's 
sleep we started to dig some more and found this in the log:

Jul 5, 2012 11:00:24 PM hudson.model.Executor run
SEVERE: Executor threw an exception
java.lang.NullPointerException
        at org.jenkinsci.plugins.artifactdeployer.ArtifactDeployerPublisher$DeleteRemoteArtifact.onDeleted(ArtifactDeployerPublisher.java:187)
        at org.jenkinsci.plugins.artifactdeployer.ArtifactDeployerPublisher$DeleteRemoteArtifact.onDeleted(ArtifactDeployerPublisher.java:171)
        at hudson.model.listeners.RunListener.fireDeleted(RunListener.java:208)
        at hudson.model.Run.delete(Run.java:1187)
        at hudson.model.AbstractBuild.delete(AbstractBuild.java:362)
        at hudson.tasks.LogRotator.perform(LogRotator.java:157)
        at hudson.model.Job.logRotate(Job.java:315)
        at hudson.model.Run.run(Run.java:1440)
        at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
        at hudson.model.ResourceController.execute(ResourceController.java:88)
        at hudson.model.Executor.run(Executor.java:175)

So for some reason the ArtifactDeployer throws an NPE for most builds that get 
removed. As you can see, it happens when a build has finished and the log 
rotator is removing old builds to make room for the new one. This exception 
causes the Executor not to notify the Future object that the upstream build is 
waiting on, and I get hundreds of angry users breathing down my neck :)
We removed the ArtifactDeployer and all is well.
I should probably file a bug report on it.
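
To illustrate the failure mode described above, here is a minimal sketch in 
plain Java. It is not Jenkins code: DeleteListener, finishUnguarded and 
finishGuarded are hypothetical names, and CompletableFuture stands in for the 
Future that the upstream build waits on. It only assumes that the listener 
callback runs before the future is completed, as the stack trace suggests.

```java
import java.util.concurrent.CompletableFuture;

public class ListenerFailureSketch {

    /** Hypothetical stand-in for a listener that is told when a build is deleted. */
    interface DeleteListener {
        void onDeleted(String buildId);
    }

    /** A listener that fails the way the stack trace above does. */
    static final DeleteListener FAULTY = buildId -> {
        throw new NullPointerException("simulated listener failure");
    };

    /** Unguarded: if the listener throws, the future is never completed. */
    static void finishUnguarded(DeleteListener listener, CompletableFuture<String> result) {
        listener.onDeleted("old-build");   // log rotation step, may throw
        result.complete("SUCCESS");        // skipped when the line above throws
    }

    /** Guarded: the listener failure is logged and the future is still completed. */
    static void finishGuarded(DeleteListener listener, CompletableFuture<String> result) {
        try {
            listener.onDeleted("old-build");
        } catch (RuntimeException e) {
            System.err.println("listener failed: " + e);
        } finally {
            result.complete("SUCCESS");
        }
    }

    /** Returns whether the waiter would be notified in the unguarded case. */
    public static boolean completesUnguarded() {
        CompletableFuture<String> result = new CompletableFuture<>();
        try {
            finishUnguarded(FAULTY, result);
        } catch (RuntimeException e) {
            // Comparable to the executor logging "Executor threw an exception".
        }
        return result.isDone();
    }

    /** Returns whether the waiter would be notified in the guarded case. */
    public static boolean completesGuarded() {
        CompletableFuture<String> result = new CompletableFuture<>();
        finishGuarded(FAULTY, result);
        return result.isDone();
    }

    public static void main(String[] args) {
        System.out.println("unguarded completes: " + completesUnguarded());
        System.out.println("guarded completes:   " + completesGuarded());
    }
}
```

In the unguarded variant anyone blocked on the future keeps waiting, which 
matches the hang we saw. Whether the real fix belongs in core, in the plugin, 
or in both is a separate question.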

Should I file two separate reports, one for core and one for ArtifactDeployer, 
or just one with both components in it?

Robert Sandell
Software Tools Engineer - Tools and Integration
Sony Mobile Communications

From: jenkinsci-dev@googlegroups.com [mailto:jenkinsci-dev@googlegroups.com] On 
Behalf Of Sandell, Robert
Sent: 5 July 2012 18:33
To: 'jenkinsci-dev@googlegroups.com'
Subject: AsyncFutureImpl.get problems in 1.447.2

Hi,
We upgraded yesterday from 1.424.2 to 1.447.2, and the first problems that we 
missed during testing have started to arrive.

We use the parameterized trigger quite heavily, and the issue is that 
sometimes, about every eighth build or so, the upstream build that is waiting 
for the downstream build it triggered keeps waiting even though the 
downstream build finished long ago.
A thread dump reveals that the executor is still waiting in AsyncFutureImpl.get 
so my guess is that for some reason the thread hasn't been notified about the 
build result as it should.
We also have an in-house plugin doing similar stuff and it can also hang the 
same way.
I haven't found anything that recreates the circumstances yet, just the 
"sometimes it happens" thing on a couple of critical jobs.
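
One way to make this kind of hang visible while debugging is to wait with a 
timeout instead of indefinitely. The sketch below is plain Java, not Jenkins 
code: waitForDownstream is a hypothetical helper, and CompletableFuture stands 
in for the future returned by the trigger.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedWaitSketch {

    /**
     * Waits for the downstream result for at most timeoutMillis.
     * Returns a marker string instead of blocking forever when the
     * future is never completed.
     */
    public static String waitForDownstream(CompletableFuture<String> downstream,
                                           long timeoutMillis) {
        try {
            return downstream.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            return "TIMED_OUT";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return "INTERRUPTED";
        } catch (ExecutionException e) {
            return "FAILED";
        }
    }

    public static void main(String[] args) {
        // A future that nobody ever completes, like the hang described above.
        CompletableFuture<String> neverNotified = new CompletableFuture<>();
        System.out.println(waitForDownstream(neverNotified, 100));

        // A future that was completed normally.
        CompletableFuture<String> notified = CompletableFuture.completedFuture("SUCCESS");
        System.out.println(waitForDownstream(notified, 100));
    }
}
```

A timeout does not fix the missing notification, but it turns a silent hang 
into something that can be logged and investigated.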

I've been going through the parts of core I can find that are involved in this; 
a git diff between Jenkins-1.424.2 and Jenkins-1.447.2 shows some changes in 
hudson.model.Executor that from what I can see shouldn't affect this, nothing 
in WorkUnitContext, and maybe something in hudson.model.Queue (but I tend to 
get lost in there every time I try to understand that code :) ).

Does anyone have any hints on where I can continue my investigation or ways of 
attack to try and recreate the issue to better debug it?




Robert Sandell
Software Tools Engineer
Tools and Integration

Sony Mobile Communications
Tel: +46 (0)10 80 12721
sonymobile.com<http://sonymobile.com/>