Seth commented on Bug JENKINS-22265

Our analysis indicates that this is caused by the following faulty Future#cancel implementation in https://github.com/jenkinsci/jenkins/blob/master/core/src/main/java/hudson/model/queue/FutureImpl.java :

@Override
public boolean cancel(boolean mayInterruptIfRunning) {
    Queue q = Jenkins.getInstance().getQueue();
    synchronized (q) {
        synchronized (this) {
            if(!executors.isEmpty()) {
                if(mayInterruptIfRunning)
                    for (Executor e : executors)
                        e.interrupt();
                return mayInterruptIfRunning;
            }
            return q.cancel(task);
        }
    }
}

Specifically, note that a FutureImpl retains references to its executor(s) long after those executors have finished executing that particular task. Because this method makes no attempt to check its own state to determine whether a transition to cancelled is possible, it seems that any client code holding a reference to a FutureImpl can cancel arbitrary future jobs on those executors an arbitrary number of times: each invocation of cancel(true), concurrent or not, produces another attempt to set the executor thread's interrupted bit.
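To make the missing state check concrete, here is a minimal sketch of a cancel that consults an explicit state before touching any executor. It is not the Jenkins code: GuardedFuture, FakeExecutor, and the State enum are hypothetical names invented for illustration, and the Queue locking and q.cancel(task) call from the real method are left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for hudson.model.Executor; it only counts interrupts.
class FakeExecutor {
    int interrupts = 0;

    void interrupt() {
        interrupts++;
    }
}

// Hypothetical sketch, not FutureImpl: cancel() checks an explicit state first.
class GuardedFuture {
    private enum State { PENDING, RUNNING, DONE, CANCELLED }

    private State state = State.PENDING;
    private final List<FakeExecutor> executors = new ArrayList<>();

    // Called when an executor picks up the task.
    synchronized void start(FakeExecutor e) {
        if (state == State.PENDING || state == State.RUNNING) {
            state = State.RUNNING;
            executors.add(e);
        }
    }

    // Called when the task completes; the executor references are dropped here.
    synchronized void finish() {
        if (state == State.RUNNING) {
            state = State.DONE;
            executors.clear();
        }
    }

    synchronized boolean cancel(boolean mayInterruptIfRunning) {
        // A finished or already-cancelled future must not interrupt anything.
        if (state == State.DONE || state == State.CANCELLED) {
            return false;
        }
        if (state == State.RUNNING) {
            if (!mayInterruptIfRunning) {
                return false;
            }
            for (FakeExecutor e : executors) {
                e.interrupt();
            }
            executors.clear();
        }
        state = State.CANCELLED;
        return true;
    }
}
```

With this shape, a cancel(true) issued after finish() returns false and sends no interrupt, and a second cancel(true) on a running task is a no-op.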

Unfortunately, the fix is neither simple nor clear, as the AsyncFutureImpl that this class extends is also an invalid future implementation: it is trivially possible to have a cancelled AsyncFutureImpl with both a value and a throwable. The fact that none of its fields are declared volatile also suggests there is a race between a notifying thread's write and a waiting thread's read. I would recommend looking at (or, ideally, using) Guava's com.google.common.util.concurrent.AbstractFuture implementation instead.
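For comparison, the invariant a valid future is expected to uphold (exactly one terminal state) can be observed with the JDK's own java.util.concurrent.CompletableFuture. This is only an illustration of the expected behaviour, using the JDK class rather than the Guava AbstractFuture recommended above; the TerminalStateDemo class and its method name are made up for this sketch.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.CompletableFuture;

// Hypothetical demo class; only the CompletableFuture calls are real JDK API.
class TerminalStateDemo {

    // Returns true if a cancelled future rejects a later value and a later throwable.
    static boolean cancelledStaysCancelled() {
        CompletableFuture<String> f = new CompletableFuture<>();

        boolean cancelled = f.cancel(true);           // first terminal transition
        boolean valueAccepted = f.complete("result"); // attempted late value
        boolean errorAccepted =
                f.completeExceptionally(new IllegalStateException("late failure"));

        boolean readThrowsCancellation;
        try {
            f.getNow(null);
            readThrowsCancellation = false;
        } catch (CancellationException expected) {
            readThrowsCancellation = true;
        }

        return cancelled
                && !valueAccepted
                && !errorAccepted
                && f.isCancelled()
                && readThrowsCancellation;
    }
}
```

Once the future is cancelled, later attempts to supply a value or a throwable are rejected, so a reader can never observe a cancelled future that also carries a result.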

The MultiJob plugin exercises this bug because, if so configured, it attempts to halt execution of all subtasks when any subtask fails (from https://github.com/jenkinsci/tikal-multijob-plugin/blob/master/src/main/java/com/tikal/jenkins/plugins/multijob/MultiJobBuilder.java ):

...
KillPhaseOnJobResultCondition killCondition = subTask.phaseConfig
		.getKillPhaseOnJobResultCondition();
if (killCondition.equals(KillPhaseOnJobResultCondition.NEVER))
	return false;
if (killCondition.isKillPhase(subTask.result)) {
	for (SubTask _subTask : subTasks)
		_subTask.future.cancel(true);
}
...

Our workaround has been to configure all our usages of the MultiJob plugin with "Kill the phase on: Never", so that we return from the first if statement and never reach the second block.
