[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Devin Nusbaum commented on JENKINS-50199:
Keith Ball, there is often a slight delay between when a new release becomes visible on the update center and when it is available on all mirrors; my guess is that this is what happened for you.
This message was sent by Atlassian Jira (v7.11.2#711002-sha1:fdc329d) -- You received this message because you are subscribed to the Google Groups "Jenkins Issues" group. To unsubscribe from this group and stop receiving emails from it, send an email to jenkinsci-issues+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/d/optout.
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Keith Ball commented on JENKINS-50199:
Forget it. It's there now.
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Keith Ball commented on JENKINS-50199:
I just tried to update the plugin but received a FileNotFoundException: http://archives.jenkins-ci.org/plugins/workflow-job/2.25/workflow-job.hpi
Failed to download from http://updates.jenkins-ci.org/download/plugins/workflow-job/2.25/workflow-job.hpi (redirected to: http://archives.jenkins-ci.org/plugins/workflow-job/2.25/workflow-job.hpi)
Did someone forget to publish the latest version?
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Devin Nusbaum updated JENKINS-50199:
I just released version 2.25 of the Pipeline Job plugin, which should fix the issue where completed builds resume after Jenkins is restarted. If anyone still sees the issue after upgrading the plugin, please comment with any additional information you can provide, such as the build folder of the build in question.
Note that there are other issues that may appear similar but seem to have distinct causes. If you are newly commenting on this issue and are seeing stuck executors without ever restarting Jenkins, please check whether JENKINS-45571, JENKINS-53223, or JENKINS-51568 is closer to your situation and, if so, comment there instead. Thanks!
Change By: Devin Nusbaum
Status: In Review -> Resolved
Resolution: Fixed
Released As: Pipeline Job 2.25
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
John Arnold commented on JENKINS-50199:
Devin Nusbaum, see https://issues.jenkins-ci.org/secure/attachment/43981/867.tar.gz for a build folder.
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
John Arnold updated JENKINS-50199:
Change By: John Arnold
Attachment: 867.tar.gz
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Roy Arnon edited a comment on JENKINS-50199:
Sam Van Oort, we have also recently encountered this issue, or something similar, as in our case the resuming jobs completed with SUCCESS. I commented here (https://issues.jenkins-ci.org/browse/JENKINS-33761?focusedCommentId=347675&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-347675), but it seems the problem is more relevant to this ticket.
In our case the resuming build was marked as SUCCESS, which marked it as successful in Bitbucket (with the bitbucket-branch-source plugin). The build itself never completed and in fact had a failing test. Unfortunately, this allowed one of our developers to merge a failing build into release.
If there is anything you need from our Jenkins instance, I am happy to provide it.
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Roy Arnon edited a comment on JENKINS-50199:
Sam Van Oort, we have also recently encountered this issue. I commented here (https://issues.jenkins-ci.org/browse/JENKINS-33761?focusedCommentId=347675&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-347675), but it seems the problem is more relevant to this ticket.
In our case the resuming build was marked as SUCCESS, which marked it as successful in Bitbucket (with the bitbucket-branch-source plugin). The build itself never completed and in fact had a failing test. Unfortunately, this allowed one of our developers to merge a failing build into release.
If there is anything you need from our Jenkins instance, I am happy to provide it.
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Roy Arnon commented on JENKINS-50199:
Sam Van Oort, we have also recently encountered this issue. I commented here, but it seems the problem is more relevant to this ticket.
In our case the resuming build was marked as SUCCESS, which marked it as successful in Bitbucket (with the bitbucket-branch-source plugin). Unfortunately, this allowed one of our developers to merge a failing build into release.
If there is anything you need from our Jenkins instance, I am happy to provide it.
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Devin Nusbaum updated JENKINS-50199:
Change By: Devin Nusbaum
Status: In Progress -> In Review
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Devin Nusbaum started work on JENKINS-50199:
Change By: Devin Nusbaum
Status: Reopened -> In Progress
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Sam Van Oort commented on JENKINS-50199:
Observations from digging in together:
1. We can reproduce the restarted builds simply by setting the FlowExecution to completed for an incomplete build and dirty-restarting.
2. On restart, that allows the build to resume even when it shouldn't. We can avoid this by marking the build completed upon loading if the execution is done.
3. The only place where the message about the build being done can be printed is WorkflowRun line 815, where it gets the StreamBuildListener and calls #finished on it (that prints the build completed message).
4. The build record has non-null logsToCopy. Since that field is nulled a few lines down from where the "build completed" message prints, just before saving the build, something prevented the build from being saved after that mutation. Devin Nusbaum speculates that it might be due to the BulkChange logic in WorkflowRun.save, which exits early if there's a BulkChange containing the WorkflowRun; if the BulkChange aborts, the WorkflowRun's save operation will be a no-op (and it may never be saved).
I have a small patch which closes the loophole around the completed execution but incomplete build (and passes a basic test case), but point 4 still needs some thought.
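The failure mode speculated about in point 4 can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions: BulkChangeSketch and RunRecord are hypothetical stand-ins written for this illustration, not the Jenkins BulkChange or WorkflowRun classes, and the "disk" is just a second field. It shows how a save that is deferred inside a bulk-change scope is lost if the scope is aborted instead of committed.

```java
import java.util.HashSet;
import java.util.Set;

/** Simplified stand-in for a bulk-change scope; NOT the Jenkins class. */
final class BulkChangeSketch {
    /** Records currently covered by an open bulk change. */
    private static final Set<RunRecord> OPEN = new HashSet<>();

    /** Hypothetical build record with an in-memory flag and a "disk" copy. */
    static final class RunRecord {
        boolean completed;        // in-memory state
        boolean completedOnDisk;  // what a restart would read back

        /** Early exit while a bulk change covers this record: nothing is written. */
        void save() {
            if (OPEN.contains(this)) {
                return; // deferred: the bulk change is expected to save on commit
            }
            completedOnDisk = completed;
        }
    }

    /** Open a bulk change covering the given record. */
    static void begin(RunRecord r) {
        OPEN.add(r);
    }

    /** Close the scope and perform the deferred save. */
    static void commit(RunRecord r) {
        OPEN.remove(r);
        r.save();
    }

    /** Close the scope without saving, so any deferred write is lost. */
    static void abort(RunRecord r) {
        OPEN.remove(r);
    }

    public static void main(String[] args) {
        RunRecord run = new RunRecord();
        begin(run);
        run.completed = true;
        run.save();   // no-op while the bulk change is open
        abort(run);   // the completed flag never reaches "disk"
        System.out.println("memory=" + run.completed + " disk=" + run.completedOnDisk);
    }
}
```

In this model a restart after the abort would read back completedOnDisk as false, which matches the "completed execution but incomplete build" state described above.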
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Devin Nusbaum commented on JENKINS-50199:
Sam Van Oort, any thoughts on adding something like the following patch as a temporary fix in workflow-job to prevent these pipelines from resuming while we continue investigating?

diff --git a/src/main/java/org/jenkinsci/plugins/workflow/job/WorkflowRun.java b/src/main/java/org/jenkinsci/plugins/workflow/job/WorkflowRun.java
index 38e6ac1..70d9474 100644
--- a/src/main/java/org/jenkinsci/plugins/workflow/job/WorkflowRun.java
+++ b/src/main/java/org/jenkinsci/plugins/workflow/job/WorkflowRun.java
@@ -752,2 +752,8 @@ public final class WorkflowRun extends Run implements F
 if (!completed == Boolean.TRUE) {
+if (execution.isComplete()) {
+LOGGER.log(Level.WARNING, "JENKINS-50199: Run completion inconsistent with execution, defaulting to failure for {0}", this);
+setResult(Result.FAILURE);
+completed = Boolean.TRUE;
+needsToPersist = true;
+} else {
 // we've been restarted while we were running. let's get the execution going again.
@@ -757,2 +763,3 @@ public final class WorkflowRun extends Run implements F
 Timer.get().submit(() -> Queue.getInstance().schedule(new AfterRestartTask(WorkflowRun.this), 0)); // JENKINS-31614
+}
 }
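The decision the proposed patch makes when a build record is loaded can be modelled outside Jenkins as a small pure function. The following is a minimal sketch, assuming only the two inputs visible in the patch (the run's persisted completed flag and whether the execution reports it is complete). ResumeGuardSketch, Action, and onLoad are hypothetical names introduced for this illustration and are not part of workflow-job.

```java
/** Standalone model of the on-load guard; names here are hypothetical. */
final class ResumeGuardSketch {
    /** What to do with a build record found on disk after a restart. */
    enum Action { NOTHING, MARK_FAILED, RESUME }

    /**
     * @param completed         persisted completion flag of the run (may be null)
     * @param executionComplete whether the persisted execution reports it is done
     */
    static Action onLoad(Boolean completed, boolean executionComplete) {
        if (Boolean.TRUE.equals(completed)) {
            return Action.NOTHING;      // consistent, finished build
        }
        if (executionComplete) {
            return Action.MARK_FAILED;  // inconsistent state: default to failure
        }
        return Action.RESUME;           // genuinely interrupted build
    }

    public static void main(String[] args) {
        // Run not marked completed, but its execution is done: do not resume.
        System.out.println(onLoad(null, true)); // prints MARK_FAILED
    }
}
```

Only the third branch schedules a resume; the middle branch corresponds to the WARNING log line and FAILURE result in the patch.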
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Devin Nusbaum commented on JENKINS-50199:
John Arnold, could you upload the whole build folder for one of the zombie builds? I'm most interested in the workflow directory adjacent to build.xml, to see if the serialized flow nodes tell us anything interesting, but any exceptions or warnings in the various log files would be interesting as well.
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
John Arnold commented on JENKINS-50199:
Devin Nusbaum, Sam Van Oort: is there any additional data I can gather for you? I have lots of repros of this problem.
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Devin Nusbaum commented on JENKINS-50199:
Sam Van Oort, I have been looking at the issue. As I commented here, I don't think that case is actually a reproduction, because it does complete if you increase the timeout, although I might be misunderstanding the test case. I have been trying to find a way to reproduce the problem, but have been unsuccessful so far.
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Sam Van Oort commented on JENKINS-50199:
Devin Nusbaum, I took a quick pass through the build.xml and, AFAICT, it fits my comment in https://issues.jenkins-ci.org/browse/JENKINS-50199?focusedCommentId=345572&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-345572 to a T. Would you be able to take a peek? It should tie to the issues with CpsContext.isReady and the reproduction case in https://github.com/svanoort/workflow-cps-plugin/tree/reproduce-hung-pipelines, and at this point a fresh perspective might be helpful.
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
John Arnold edited a comment on JENKINS-50199:
[~dnusbaum]
Pipeline Groovy: 2.54
Pipeline Job: 2.24
build.xml (redacted internal strings): https://pastebin.com/rqZ7EwcG
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
John Arnold commented on JENKINS-50199:
Pipeline Groovy: 2.54
Pipeline Job: 2.24
build.xml (redacted internal strings): https://pastebin.com/rqZ7EwcG
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Devin Nusbaum edited a comment on JENKINS-50199:
John Arnold, are you able to upload the build.xml for one of the stuck builds so that we can see all of its persisted state? (Please redact any sensitive information if you do upload it.) Also, can you please confirm which versions of the Pipeline Groovy Plugin and the Pipeline Job Plugin you are using? Thanks!
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Devin Nusbaum commented on JENKINS-50199:
John Arnold, are you able to upload the build.xml for one of the stuck builds so that we can see all of its persisted state? (Please redact any sensitive information if you do upload it.)
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
John Arnold edited a comment on JENKINS-50199:
Example job log: https://pastebin.com/LbnRRTw4
Jenkins 2.135
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
John Arnold commented on JENKINS-50199:
Example log: https://pastebin.com/LbnRRTw4
Jenkins 2.135
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
John Arnold updated JENKINS-50199:
Change By: John Arnold
Priority: Major -> Critical
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
John Arnold commented on JENKINS-50199:
I'm definitely still seeing this "stuck on resume" problem for completed builds AND aborted builds after a Jenkins reboot. I have Jenkins set to the maximum-performance, low-durability mode to disable this resume functionality, and it's still trying to resume. I can't stop the jobs that are stuck on resume; I have to do a hard kill via the /kill URI. Even trying to kill or .finish() them with a Groovy script doesn't seem to work reliably.
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Vivek Pandey assigned JENKINS-50199 to Devin Nusbaum:
Change By: Vivek Pandey
Assignee: Sam Van Oort -> Devin Nusbaum
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Sam Van Oort commented on JENKINS-50199:
Adding a note before I forget: this likely relates to JENKINS-45571. Based on some test scenarios that generate similar symptoms, I suspect both are potentially related to the CpsContext.isReady call hanging indefinitely, which seems to be a reproducible but obscure threading bug that I have yet to solve.
Symptoms if that's the same culprit:
1. The build never gets marked as complete, even though the execution may be. It appears the WorkflowRun never gets its "finish" method called and, as a consequence, may not have a result set.
2. The Pipeline still shows a running OneOffExecutor running its WorkflowJob as a flyweight task.
3. I generally see the CpsContext.isReady hang, and this is triggered when something fails in the early stages of loading and resuming the Pipeline program for an incomplete Pipeline after a restart.
4. Investigation showed that the AsynchronousExecution created by WorkflowRun#sleep never finished (probably due to the CpsStepContext hang).
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Christian V commented on JENKINS-50199:
Here are the contents of the log file after the last successful build step (findbugs):
[FINDBUGS] Computing warning deltas based on reference build #379
[Pipeline] }
[Pipeline] // stage
[Pipeline] }
[Pipeline] // withEnv
[Pipeline] }
[Pipeline] // node
Aborted by unknown
Aborted by unknown
Click here to forcibly terminate running steps
Click here to forcibly terminate running steps
Ready to run at Fri May 18 11:36:47 CEST 2018
Resuming build at Fri May 18 11:36:47 CEST 2018 after Jenkins restart
Ready to run at Fri May 18 11:54:24 CEST 2018
Resuming build at Fri May 18 11:54:24 CEST 2018 after Jenkins restart
Ready to run at Fri May 18 13:36:18 CEST 2018
Resuming build at Fri May 18 13:36:18 CEST 2018 after Jenkins restart
Aborted by
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Christian V commented on JENKINS-50199:
I'm also seeing this:
org.jenkinsci.remoting.util.ExecutorServiceUtils$FatalRejectedExecutionException: Cannot execute the command java.util.concurrent.FutureTask@134626b. The executor service is shutting down
    at hudson.remoting.SingleLaneExecutorService.execute(SingleLaneExecutorService.java:108)
    at java.util.concurrent.AbstractExecutorService.submit(Unknown Source)
    at com.google.common.util.concurrent.ForwardingExecutorService.submit(ForwardingExecutorService.java:110)
    at jenkins.util.InterceptingExecutorService.submit(InterceptingExecutorService.java:49)
    at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution$4.onSuccess(CpsFlowExecution.java:903)
    at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution$4.onSuccess(CpsFlowExecution.java:899)
    at org.jenkinsci.plugins.workflow.support.concurrent.Futures$1.run(Futures.java:150)
    at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:253)
    at com.google.common.util.concurrent.ExecutionList$RunnableExecutorPair.execute(ExecutionList.java:149)
    at com.google.common.util.concurrent.ExecutionList.add(ExecutionList.java:105)
    at com.google.common.util.concurrent.AbstractFuture.addListener(AbstractFuture.java:155)
    at org.jenkinsci.plugins.workflow.support.concurrent.Futures.addCallback(Futures.java:160)
    at org.jenkinsci.plugins.workflow.support.concurrent.Futures.addCallback(Futures.java:90)
    at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution.runInCpsVmThread(CpsFlowExecution.java:899)
    at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution.getCurrentExecutions(CpsFlowExecution.java:977)
    at org.jenkinsci.plugins.workflow.flow.FlowExecutionList$ItemListenerImpl.onLoaded(FlowExecutionList.java:180)
    at jenkins.model.Jenkins.<init>(Jenkins.java:972)
    at hudson.model.Hudson.<init>(Hudson.java:85)
    at hudson.model.Hudson.<init>(Hudson.java:81)
    at hudson.WebAppMain$3.run(WebAppMain.java:233)
I have updated to the latest Jenkins version and all my Pipeline plugins are up to date. This happened before the update, after the plugin updates, and after updating Jenkins. It's always the same exception.
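The message in the trace above says the task was submitted to an executor service that was already shutting down. The general behaviour can be reproduced with the plain JDK, as in the minimal sketch below. It uses only java.util.concurrent and is an analogue of the situation, not the Jenkins remoting SingleLaneExecutorService; ShutdownRejectionSketch is a hypothetical name introduced for this illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;

/** JDK-only analogue of submitting work to an executor that is shutting down. */
final class ShutdownRejectionSketch {
    /** Returns true if a submit after shutdown() is rejected. */
    static boolean submitAfterShutdownIsRejected() {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        executor.shutdown(); // no new tasks are accepted from here on
        try {
            executor.submit(() -> { });
            return false;
        } catch (RejectedExecutionException e) {
            return true; // the default rejection policy throws
        }
    }

    public static void main(String[] args) {
        System.out.println("rejected=" + submitAfterShutdownIsRejected());
    }
}
```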
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Mike Kozell reopened JENKINS-50199:
Change By: Mike Kozell
Resolution: Done (removed)
Status: Closed -> Reopened
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Mike Kozell updated an issue Jenkins / JENKINS-50199 Failed pipeline jobs stuck running after incorrect resume Change By: Mike Kozell Attachment: flownode-84.xml
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Sam Van Oort commented on JENKINS-50199 Re: Failed pipeline jobs stuck running after incorrect resume Mike Kozell That is... the exact opposite of what I would expect. Not sure what's going on there, since the ci.jenkins.io builder does both Windows and Linux builds and they succeed. Weird.
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Sam Van Oort updated JENKINS-50199 Fixes released with workflow-cps 2.50 and workflow-job 2.21. I would really appreciate it if you could install the latest versions and confirm the issue is resolved. CC: Mike Kozell, Matt Gaspar. Jenkins / JENKINS-50199 Failed pipeline jobs stuck running after incorrect resume Change By: Sam Van Oort Status: In Review -> Closed Resolution: Done
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Sam Van Oort commented on JENKINS-50199 Re: Failed pipeline jobs stuck running after incorrect resume Mike Kozell Are you building on Windows, by any chance? I'd try to set the platform encoding to UTF-8 – that error is safe to ignore though. You can comment out the whole test by adding "@Ignore" to src/test/java/org/jenkinsci/plugins/workflow/job/CLITest.java on line 85.
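Note on the encoding suggestion above: a quick way to see which platform encoding a JVM is using is to print the default charset. This is a minimal sketch with a made-up class name. It only reports the current setting; passing -Dfile.encoding=UTF-8 to the JVM that runs the build is one common way to change it, though whether that resolves the test failure discussed here is not confirmed by the thread.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of any Jenkins plugin.
class PlatformEncodingCheck {

    // True when the JVM's default charset is UTF-8.
    static boolean defaultIsUtf8() {
        return StandardCharsets.UTF_8.equals(Charset.defaultCharset());
    }

    public static void main(String[] args) {
        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println("file.encoding:   " + System.getProperty("file.encoding"));
        if (!defaultIsUtf8()) {
            System.out.println("Default is not UTF-8; consider -Dfile.encoding=UTF-8 for the build JVM.");
        }
    }
}
```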
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Sam Van Oort edited a comment on JENKINS-50199 Re: Failed pipeline jobs stuck running after incorrect resume Mike Kozell Also, I think we're clear for release of this now, but I'm aiming to delay until at least tonight EST if that'll permit enough time to land it on some real-world Jenkins masters as a sanity check.
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Sam Van Oort commented on JENKINS-50199 Re: Failed pipeline jobs stuck running after incorrect resume Mike Kozell Also, I think we're clear for release of this now, but I'm aiming to delay until at least tonight EST if that'll permit enough time to land it on some real-world Jenkins masters as a sanity check.
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Sam Van Oort commented on JENKINS-50199 Re: Failed pipeline jobs stuck running after incorrect resume Mike Kozell Sure! The code is here, modulo an extra line or two of logging and a couple of small tweaks since then (nothing substantial): https://github.com/jenkinsci/workflow-job-plugin/pull/96 https://github.com/jenkinsci/workflow-cps-plugin/pull/223 Feel free to add code review too if you like, though it's just gotten a final pass by the other two people that do a lot of Pipeline core development.
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Sam Van Oort updated JENKINS-50199 Jenkins / JENKINS-50199 Failed pipeline jobs stuck running after incorrect resume Change By: Sam Van Oort Status: In Progress -> In Review
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Sam Van Oort started work on JENKINS-50199 Change By: Sam Van Oort Status: Reopened -> In Progress
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Sam Van Oort commented on JENKINS-50199 Re: Failed pipeline jobs stuck running after incorrect resume Matt Gaspar Mike Kozell I appreciate your patience and assistance with diagnosing this. Please could you try the attached SNAPSHOTs (workflow-job.hpi, workflow-cps.hpi) and see if the issue is fully resolved? I've done several additional rounds of testing (including unit tests based on your scenarios) and fixes to make this as robust, and the fixes as comprehensive, as possible. These changes have received initial code review, and if all looks good they will be ready for full release soon, but obviously it would be helpful to have you try them out with your situations.
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Sam Van Oort updated an issue Jenkins / JENKINS-50199 Failed pipeline jobs stuck running after incorrect resume Change By: Sam Van Oort Attachment: workflow-cps.hpi
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Sam Van Oort updated an issue Jenkins / JENKINS-50199 Failed pipeline jobs stuck running after incorrect resume Change By: Sam Van Oort Attachment: workflow-job.hpi
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Jenn Briden updated an issue Jenkins / JENKINS-50199 Failed pipeline jobs stuck running after incorrect resume Change By: Jenn Briden Sprint: Pipeline - April 2018
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Matt Gaspar commented on JENKINS-50199 Re: Failed pipeline jobs stuck running after incorrect resume I didn't see your comment before I posted, but yes, workflows.xml contains the files from the workflow folder for the build. Only 45-47 were available in there. I did see that this particular one failed despite the console output saying SUCCESS near the bottom. Regardless, this was a build that had already run and completed, and it shouldn't have rerun after a restart of Jenkins.
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Matt Gaspar commented on JENKINS-50199 Re: Failed pipeline jobs stuck running after incorrect resume Sam Van Oort I've attached the build.xml for one of the builds that shows SUCCESS (at least in the console) but appears to want to replay after a Jenkins restart. I also attached workflows.xml, which contains the concatenated files from the workflow directory of the same build. I haven't found any other useful logs associated with this build beyond the exception in my last comment. Attachments: build.xml, workflows.xml
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Sam Van Oort commented on JENKINS-50199 Re: Failed pipeline jobs stuck running after incorrect resume Matt Gaspar Your data shows that we did indeed fail at some point in the loading of the heads/start nodes/flownodes, and switched to the fallback directory. Under your build workflow folder, please could you upload the files 40.xml through 47.xml, or whichever of them are present? Perhaps as a single matt-flownodes.xml? I also see that, paradoxically, the execution shows persistedClean = false (it thinks that we did not cleanly save data), and completed is false too (which may be a result of the preceding issues, making the build show as incomplete). I think both of those issues should be covered by fixes I've done now. Thanks!
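Note on the two flags mentioned above: for anyone wanting to check the same values in their own build.xml, the following is a minimal sketch that reads the first element with a given name from an XML document. The element names persistedClean and completed are taken from the comment; the surrounding XML layout in the sample is a made-up, heavily trimmed stand-in, and the class name is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration; not part of any Jenkins plugin.
class BuildXmlFlags {

    // Parses an XML string into a DOM document.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    // Returns the trimmed text of the first element with the given tag name, or null if absent.
    static String firstElementText(Document doc, String tag) {
        NodeList nodes = doc.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        // Assumed layout: a real build.xml contains far more than this.
        String sample = "<flow-build><completed>false</completed>"
                + "<execution><persistedClean>false</persistedClean></execution></flow-build>";
        Document doc = parse(sample);
        System.out.println("completed      = " + firstElementText(doc, "completed"));
        System.out.println("persistedClean = " + firstElementText(doc, "persistedClean"));
    }
}
```

To inspect a real file, the sample string could be replaced with the file's contents; a null result would simply mean the element is not present under that name.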
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Matt Gaspar updated an issue Jenkins / JENKINS-50199 Failed pipeline jobs stuck running after incorrect resume Change By: Matt Gaspar Attachment: workflows.xml
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Matt Gaspar updated an issue Jenkins / JENKINS-50199 Failed pipeline jobs stuck running after incorrect resume Change By: Matt Gaspar Attachment: build.xml
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Sam Van Oort commented on JENKINS-50199 Re: Failed pipeline jobs stuck running after incorrect resume Matt Gaspar Thanks, do you have the build.xml and the flowNodeStorage.xml or the XML files in the workflow subdirectory of the build? I have something that feels like it's 80% of the way to resolving this cluster of issues. It addresses most of the things noted by Mike Kozell, but it has one known issue still to resolve with shutdown-time persistence. I would like to identify why YOUR build is not showing as already complete, though. All signs point to something going wrong while loading the FlowExecution information, which makes the build look incomplete. Do you also have any other log messages associated with this build?
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Matt Gaspar commented on JENKINS-50199 Re: Failed pipeline jobs stuck running after incorrect resume
I'm seeing some similar behavior on a few masters running 2.89.3
workflow-cps: 2.48
workflow-job: 2.19
It even appears to try to resume a build that previously completed successfully, after any restart of Jenkins:
Job Console snippet
GitHub has been notified of this commit's build result
Finished: SUCCESS
Resuming build at Wed Apr 18 03:45:16 MST 2018 after Jenkins restart
Resuming build at Fri Apr 20 01:21:23 MST 2018 after Jenkins restart
Resuming build at Fri Apr 20 01:52:35 MST 2018 after Jenkins restart
Resuming build at Fri Apr 20 11:53:07 MST 2018 after Jenkins restart
Resuming build at Fri Apr 20 12:13:33 MST 2018 after Jenkins restart
jenkins.log
Apr 20, 2018 12:18:52 PM org.jenkinsci.plugins.workflow.cps.CpsFlowExecution$6 onFailure
WARNING: Failed to interrupt steps in Owner[JOB_NAME1:JOB_NAME #1]
java.lang.IllegalStateException: completed or broken execution
    at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution.onLoad(CpsFlowExecution.java:741)
    at org.jenkinsci.plugins.workflow.job.WorkflowRun.getExecution(WorkflowRun.java:821)
    at org.jenkinsci.plugins.workflow.job.WorkflowRun.onLoad(WorkflowRun.java:694)
    at hudson.model.RunMap.retrieve(RunMap.java:225)
    at hudson.model.RunMap.retrieve(RunMap.java:57)
    at jenkins.model.lazy.AbstractLazyLoadRunMap.load(AbstractLazyLoadRunMap.java:500)
    at jenkins.model.lazy.AbstractLazyLoadRunMap.load(AbstractLazyLoadRunMap.java:482)
    at jenkins.model.lazy.AbstractLazyLoadRunMap.getByNumber(AbstractLazyLoadRunMap.java:380)
    at jenkins.model.lazy.AbstractLazyLoadRunMap.search(AbstractLazyLoadRunMap.java:345)
    at jenkins.model.lazy.AbstractLazyLoadRunMap.newestBuild(AbstractLazyLoadRunMap.java:275)
    at jenkins.model.lazy.LazyLoadRunMapEntrySet$1.(LazyLoadRunMapEntrySet.java:65)
    at jenkins.model.lazy.LazyLoadRunMapEntrySet.iterator(LazyLoadRunMapEntrySet.java:63)
    at java.util.AbstractMap$2$1.(AbstractMap.java:411)
    at java.util.AbstractMap$2.iterator(AbstractMap.java:410)
    at hudson.util.RunList.iterator(RunList.java:113)
    at com.google.common.collect.Iterables$15.apply(Iterables.java:1128)
    at com.google.common.collect.Iterables$15.apply(Iterables.java:1125)
    at com.google.common.collect.Iterators$8.next(Iterators.java:812)
    at com.google.common.collect.Iterators$MergingIterator.(Iterators.java:1306)
    at com.google.common.collect.Iterators.mergeSorted(Iterators.java:1274)
    at com.google.common.collect.Iterables$14.iterator(Iterables.java:1113)
    at com.google.common.collect.Iterables$UnmodifiableIterable.iterator(Iterables.java:94)
    at hudson.util.RunList.iterator(RunList.java:113)
    at jenkins.widgets.RunListProgressiveRendering.compute(RunListProgressiveRendering.java:60)
    at jenkins.util.ProgressiveRendering$1.run(ProgressiveRendering.java:122)
    at jenkins.security.ImpersonatingScheduledExecutorService$1.run(ImpersonatingScheduledExecutorService.java:58)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecuto
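Note on the console snippet above: the symptom is a "Finished:" line followed by one or more "Resuming build at ..." lines. A minimal sketch for spotting that pattern in a saved console log is shown below. The class and method names are hypothetical, and the two line prefixes are taken from the snippet in this report rather than from any documented log format.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of any Jenkins plugin.
class ResumeAfterFinish {

    // Counts "Resuming build at ..." lines that appear after the first "Finished: ..." line.
    static int countResumesAfterFinish(List<String> consoleLines) {
        boolean finished = false;
        int count = 0;
        for (String line : consoleLines) {
            if (line.startsWith("Finished: ")) {
                finished = true;
            } else if (finished && line.startsWith("Resuming build at ")) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "Finished: SUCCESS",
                "Resuming build at Wed Apr 18 03:45:16 MST 2018 after Jenkins restart",
                "Resuming build at Fri Apr 20 01:21:23 MST 2018 after Jenkins restart");
        System.out.println("resumes after finish: " + countResumesAfterFinish(lines));
    }
}
```

A count above zero for a build would suggest it matches the behavior described in this comment; a resume line that appears before the build finished is expected and is not counted.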
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Sam Van Oort commented on JENKINS-50199 Re: Failed pipeline jobs stuck running after incorrect resume Mike Kozell Thanks for providing a reproduction. To update, I'm working on this still – a bit of a tricky one, but having a clear case for what triggers this is very helpful.
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Sam Van Oort reopened an issue Jenkins / JENKINS-50199 Failed pipeline jobs stuck running after incorrect resume Change By: Sam Van Oort Resolution: Fixed -> (cleared) Status: Closed -> Reopened
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Mike Kozell commented on JENKINS-50199 Re: Failed pipeline jobs stuck running after incorrect resume
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Mike Kozell updated an issue Jenkins / JENKINS-50199 Failed pipeline jobs stuck running after incorrect resume Change By: Mike Kozell Comment:
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Sam Van Oort updated JENKINS-50199 Resolved as of workflow-cps 2.47 and workflow-job 2.18. Jenkins / JENKINS-50199 Failed pipeline jobs stuck running after incorrect resume Change By: Sam Van Oort Status: In Review -> Closed Resolution: Fixed
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Sam Van Oort updated an issue Jenkins / JENKINS-50199 Failed pipeline jobs stuck running after incorrect resume Change By: Sam Van Oort Component/s: workflow-job-plugin Component/s: pipeline Component/s: workflow-api-plugin
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: SCM/JIRA link daemon commented on JENKINS-50199 Re: Failed pipeline jobs stuck running after incorrect resume
Code changed in jenkins
User: Sam Van Oort
Path:
 pom.xml
 src/main/java/org/jenkinsci/plugins/workflow/job/WorkflowRun.java
 src/test/java/org/jenkinsci/plugins/workflow/job/WorkflowRunRestartTest.java
 src/test/java/org/jenkinsci/plugins/workflow/job/WorkflowRunTest.java
http://jenkins-ci.org/commit/workflow-job-plugin/f0c26058f31d4f159a82a3cace52935e93f20701
Log:
 Merge pull request #93 from svanoort/fix-resume-issues
 Fix resume issues JENKINS-49686 and JENKINS-50199 and JENKINS-50407
Compare: https://github.com/jenkinsci/workflow-job-plugin/compare/e11cea623f61...f0c26058f31d
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Sam Van Oort commented on JENKINS-50199 Re: Failed pipeline jobs stuck running after incorrect resume Mike Kozell I'm attaching updated SNAPSHOTs (workflow-job.hpi, workflow-cps.hpi). I consider these finalized and release-ready, and they include a resolution of the issue above. Please can you give these a try?
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Sam Van Oort updated an issue Jenkins / JENKINS-50199 Failed pipeline jobs stuck running after incorrect resume Change By: Sam Van Oort Attachment: workflow-cps.hpi
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Sam Van Oort updated an issue Jenkins / JENKINS-50199 Failed pipeline jobs stuck running after incorrect resume Change By: Sam Van Oort Attachment: workflow-job.hpi
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Sam Van Oort commented on JENKINS-50199 Re: Failed pipeline jobs stuck running after incorrect resume
Mike Kozell I've attached plugin builds for the workflow-cps and workflow-job plugins (workflow-cps.hpi, workflow-job.hpi) that should resolve the issues for you. They also fix a whole bunch of other related problems around the same logic, and add an optimization and some serious extra bulletproofing. Please could you give them a try and let me know?
Or if you prefer to build them yourself, the PRs are here:
https://github.com/jenkinsci/workflow-cps-plugin/pull/216
https://github.com/jenkinsci/workflow-job-plugin/pull/93
Now, there's one remaining test failure I can sometimes trigger with my homemade durability-fuzzing tool in Maximum Durability mode. I'm not clear if it's an issue with the test harness or not, but if you do see a case where that results in zero flow nodes in the build directory, please let me know. I'm not comfortable releasing until I have it addressed and reviewed, though (hopefully in the next day or two).
Logs will show something like this (which is OK and sometimes expected in performance-optimized mode but NOT in maximum durability mode):
0.150 [id=1636] WARNING o.j.p.w.cps.CpsFlowExecution#initializeStorage: Tried to load head FlowNodes for execution OwnerNestedParallelDurableJob/6:NestedParallelDurableJob #6 but FlowNode was not found in storage for head id:FlowNodeId 1:25
0.151 [id=1636] WARNING o.j.p.w.cps.CpsFlowExecution#rebuildEmptyGraph: Failed to load pipeline heads, so faking some up for execution CpsFlowExecution[OwnerNestedParallelDurableJob/6:NestedParallelDurableJob #6]
0.161 [id=1636] WARNING o.j.p.w.cps.CpsFlowExecution#onLoad: Completed flow without FlowEndNode: CpsFlowExecution[OwnerNestedParallelDurableJob/6:NestedParallelDurableJob #6] heads:26::26:org.jenkinsci.plugins.workflow.graph.FlowStartNode[id=27]
Thank you!
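Note on the three warnings quoted above: to check a log for the same messages, one option is a small pattern match on the abbreviated logger name and method. The sketch below assumes the log uses the same abbreviated form (o.j.p.w.cps.CpsFlowExecution#method) as the excerpt in this comment; a production jenkins.log may print the fully qualified class name instead, in which case the pattern would need adjusting. The class name is made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of any Jenkins plugin.
class FlowWarningMatcher {

    // Matches the three warning sources quoted in the comment, in their abbreviated form.
    private static final Pattern WARNING = Pattern.compile(
            "WARNING o\\.j\\.p\\.w\\.cps\\.CpsFlowExecution#(initializeStorage|rebuildEmptyGraph|onLoad):");

    // Returns the method name that logged the warning, or null if the line does not match.
    static String warningSource(String logLine) {
        Matcher m = WARNING.matcher(logLine);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String line = "0.151 [id=1636] WARNING o.j.p.w.cps.CpsFlowExecution#rebuildEmptyGraph: "
                + "Failed to load pipeline heads, so faking some up for execution";
        System.out.println("source: " + warningSource(line));
    }
}
```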
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Sam Van Oort updated an issue Jenkins / JENKINS-50199 Failed pipeline jobs stuck running after incorrect resume Change By: Sam Van Oort Attachment: workflow-job.hpi
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Sam Van Oort updated an issue Jenkins / JENKINS-50199 Failed pipeline jobs stuck running after incorrect resume Change By: Sam Van Oort Attachment: workflow-cps.hpi
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Mike Kozell commented on JENKINS-50199 Re: Failed pipeline jobs stuck running after incorrect resume I have a dedicated test Jenkins master I can use for testing. I will just need to know where I can get the build and how to enable it.
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Sam Van Oort commented on JENKINS-50199 Re: Failed pipeline jobs stuck running after incorrect resume Piotr Paczyński Mike Kozell Would you by any chance be set up with a non-critical / test master and interested in trying out some SNAPSHOT builds of the proposed fixes? I think what we have now is more or less ready to go and would love to see how it plays for you guys in the wild.
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Sam Van Oort updated JENKINS-50199 Jenkins / JENKINS-50199 Failed pipeline jobs stuck running after incorrect resume Change By: Sam Van Oort Status: In Progress -> In Review
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Sam Van Oort started work on JENKINS-50199 Change By: Sam Van Oort Status: Open -> In Progress
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Message Title Piotr Paczyński commented on JENKINS-50199 Re: Failed pipeline jobs stuck running after incorrect resume We're also affected by this issue and seeing similar NPE errors. I can provide more details if needed.
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Message Title Sam Van Oort commented on JENKINS-50199 Re: Failed pipeline jobs stuck running after incorrect resume Mike Kozell We have now identified the root cause of https://issues.jenkins-ci.org/browse/JENKINS-49686, which accounts for the final missing piece, the NPE itself. The root cause is described in the comments there. Together with the partial fix preceding it, that fix should resolve most of the remaining issues and give a comprehensive solution to both parts of the problem.
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Message Title Sam Van Oort commented on JENKINS-50199 Re: Failed pipeline jobs stuck running after incorrect resume Mike Kozell No, I think we have enough data, thank you. I've got a potential fix that also adds additional diagnostic information almost ready to go. Just working through some test failures and then I can hand over a binary and PR to try out on a test instance.
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Message Title Sam Van Oort commented on JENKINS-50199 Re: Failed pipeline jobs stuck running after incorrect resume
So, here is what I can put together:
1. Flow completed successfully initially. Speculation: somehow there is invalid state in the list of persisted 'heads' (final FlowNodes) for the build with a null head stored, or that FlowNode wasn't persisted into storage for $reasons and cannot be loaded.
2. The flow ended in a way that does not evaluate as 'isComplete', or state was not persisted such that it would show as complete. This happens if either the number of 'head' (most recent) nodes for the flow is >1 or the final 'head' FlowNode is NOT the FlowEndNode as expected (it could be null). The Pipeline therefore shows as incomplete and eligible to try to resume (the resume may be blocked if resume is disabled explicitly, or it may fail).
3. Eventually the build is cleared from the SoftReference cache of builds, and one of the processes that iterates through builds is triggered a couple of days later (a variety of things can do this, and it's normal).
4. FlowNode storage is initialized. If it can't load one of the heads OR can't load one of the startNodes (starts of a block), it will treat the null as a problem, switch to fallback storage (to avoid overwriting original data which might be recoverable), and try to fake up a dummy set of startNodes/heads/etc.
5. This dummy storage will NOT show as complete, so it will try to resume, see that the build was marked "can't resume", and fail with the IOException observed.
6. This failure will invoke onProgramEnd, and somehow onNewHead runs with the residual FlowNode that isn't in this storage, triggering the NPE. This also triggers creation of the secondary fallback storage ("workflow-fallback/flownodeStore.xml"). Conclusion: something must be wrong internally here, but it's not clear where precisely yet.
7. The process from 5 triggers creation of a new FlowEndNode written to the FlowNodeStorage (observed twice in your workflow-fallback flowNodeStore.xml). Yet somehow this isn't persisted as the head for the Execution, and the build is once again marked as incomplete and eligible to be reloaded.
8. Because the previous step does not work correctly, it can happen again (seen 2x in this cycle).
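The completion condition in point 2 can be illustrated with a small model. This is only a sketch in Python, not the actual workflow-cps code: `FlowNode`, `FlowEndNode`, `is_complete`, and `eligible_for_resume` are hypothetical stand-ins that mirror the rule as described in the comment (exactly one head, and that head is the end node).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FlowNode:
    """Hypothetical stand-in for a persisted Pipeline node."""
    node_id: str


@dataclass
class FlowEndNode(FlowNode):
    """Hypothetical stand-in for the terminal node of a finished flow."""


def is_complete(heads: List[Optional[FlowNode]]) -> bool:
    """True only if there is exactly one head and it is a FlowEndNode."""
    if len(heads) != 1:
        return False
    # A null (None) head or any non-end node counts as incomplete.
    return isinstance(heads[0], FlowEndNode)


def eligible_for_resume(heads: List[Optional[FlowNode]]) -> bool:
    """An execution that does not look complete is a candidate for resume."""
    return not is_complete(heads)
```

Under this model a build whose only stored head is null is treated as still running, which is the state the steps above start from.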
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Message Title Mike Kozell commented on JENKINS-50199 Re: Failed pipeline jobs stuck running after incorrect resume
1. I didn't see any IOExceptions in the jenkins.log file. There is one in the workflow-fallback/flowNodeStore.xml file, which is included in this ticket. I did get an IOException after I aborted the build with the workaround.
java.io.IOException: Aborting build
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.codehaus.groovy.reflection.CachedConstructor.invoke(CachedConstructor.java:83)
    at org.codehaus.groovy.runtime.callsite.ConstructorSite$ConstructorSiteNoUnwrapNoCoerce.callConstructor(ConstructorSite.java:105)
    at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCallConstructor(CallSiteArray.java:60)
    at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callConstructor(AbstractCallSite.java:235)
    at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callConstructor(AbstractCallSite.java:247)
    at Script1.run(Script1.groovy:1)
    at groovy.lang.GroovyShell.evaluate(GroovyShell.java:585)
    at groovy.lang.GroovyShell.evaluate(GroovyShell.java:623)
    at groovy.lang.GroovyShell.evaluate(GroovyShell.java:594)
    at hudson.util.RemotingDiagnostics$Script.call(RemotingDiagnostics.java:142)
    at hudson.util.RemotingDiagnostics$Script.call(RemotingDiagnostics.java:114)
    at hudson.remoting.LocalChannel.call(LocalChannel.java:45)
    at hudson.util.RemotingDiagnostics.executeGroovy(RemotingDiagnostics.java:111)
    at jenkins.model.Jenkins._doScript(Jenkins.java:4360)
    at jenkins.model.Jenkins.doScript(Jenkins.java:4331)
    at java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:627)
    at org.kohsuke.stapler.Function$MethodFunction.invoke(Function.java:343)
    at org.kohsuke.stapler.Function.bindAndInvoke(Function.java:184)
    at org.kohsuke.stapler.Function.bindAndInvokeAndServeResponse(Function.java:117)
    at org.kohsuke.stapler.MetaClass$1.doDispatch(MetaClass.java:129)
    at org.kohsuke.stapler.NameBasedDispatcher.dispatch(NameBasedDispatcher.java:58)
    at org.kohsuke.stapler.Stapler.tryInvoke(Stapler.java:715)
    at org.kohsuke.stapler.Stapler.invoke(Stapler.java:845)
    at org.kohsuke.stapler.Stapler.invoke(Stapler.java:649)
    at org.kohsuke.stapler.Stapler.service(Stapler.java:238)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
    at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:841)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1650)
    at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:154)
    at com.smartcodeltd.jenkinsci.plugin.assetbundler.filters.LessCSS.doFilter(LessCSS.java:47)
    at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:151)
    at hudson.plugins.greenballs.GreenBallFilter.doFilter(GreenBallFilter.java:59)
    at hudson.util.PluginServletFilter$1.doFilter(PluginServletFilter.java:151)
    at hudson.util.PluginServletFilter.doFilter(PluginServletF
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Message Title Sam Van Oort commented on JENKINS-50199 Re: Failed pipeline jobs stuck running after incorrect resume Ahh, nevermind, I missed that in the listing. That confirms another part of the theory.
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Message Title Sam Van Oort commented on JENKINS-50199 Re: Failed pipeline jobs stuck running after incorrect resume Mike Kozell Also, do you see anything in a directory "workflow-fallback" (perhaps with a flowNodeStore.xml)?
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Message Title Sam Van Oort commented on JENKINS-50199 Re: Failed pipeline jobs stuck running after incorrect resume
Thanks Mike Kozell, that data is indeed very useful and solidly confirms my working theory (which I will explain in a moment). Before I explain (since it'll take a bit to write up and I'm hoping you're still around), could you please do one more analysis:
1. Do you see any IOExceptions in the Jenkins log or build log for these builds?
2. Can you search for "FlowStartNode" and "FlowEndNode" in both flowNodeStore.xml files and report the id and any error shown for that element?
Thanks!
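The search requested in item 2 could be scripted roughly as follows. This is a sketch under stated assumptions: the layout of flowNodeStore.xml is not shown anywhere in this thread, so the code simply walks every element and matches on either the tag name or a `class` attribute ending in `FlowStartNode` or `FlowEndNode`, then reads a child `<id>` element. The function name `find_start_end_nodes` and the `SAMPLE` document are hypothetical, and any error text would still need to be read by eye.

```python
import xml.etree.ElementTree as ET

KINDS = ("FlowStartNode", "FlowEndNode")


def find_start_end_nodes(xml_text):
    """Return (kind, id) pairs for elements that look like start/end nodes."""
    root = ET.fromstring(xml_text.strip())
    found = []
    for elem in root.iter():
        # Assumption: the node type appears in the tag or in a class attribute.
        labels = (elem.tag, elem.get("class", ""))
        for kind in KINDS:
            if any(label.endswith(kind) for label in labels):
                found.append((kind, elem.findtext("id")))
    return found


# Hypothetical sample; a real flowNodeStore.xml may be structured differently.
SAMPLE = """
<map>
  <entry><node class="org.example.FlowStartNode"><id>2</id></node></entry>
  <entry><node class="org.example.StepNode"><id>3</id></node></entry>
  <entry><node class="org.example.FlowEndNode"><id>7</id></node></entry>
</map>
"""

for kind, node_id in find_start_end_nodes(SAMPLE):
    print(kind, node_id)
```

Running it against both the `workflow` and `workflow-fallback` copies and comparing the ids would give the data asked for here.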
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Message Title Mike Kozell updated an issue Jenkins / JENKINS-50199 Failed pipeline jobs stuck running after incorrect resume Change By: Mike Kozell Attachment: workflow-fallback-flowNodeStore.xml
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Message Title Mike Kozell updated an issue Jenkins / JENKINS-50199 Failed pipeline jobs stuck running after incorrect resume Change By: Mike Kozell Attachment: flowNodeStore.xml
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Message Title Mike Kozell updated an issue Jenkins / JENKINS-50199 Failed pipeline jobs stuck running after incorrect resume Change By: Mike Kozell Attachment: build.xml
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Message Title Mike Kozell updated an issue Jenkins / JENKINS-50199 Failed pipeline jobs stuck running after incorrect resume Change By: Mike Kozell Attachment: build.log
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Message Title Sam Van Oort commented on JENKINS-50199 Re: Failed pipeline jobs stuck running after incorrect resume
Mike Kozell The NPEs you see look like a duplicate of https://issues.jenkins-ci.org/browse/JENKINS-49686 which I am investigating right now. May I please request the following diagnostic information:
1. Please attach both the config.xml for the job and a build.xml for one or more Pipelines that experienced this issue.
2. In the build's 'workflow' directory, there's a flownodeStore.xml file. Please GZIP it and attach it.
3. Can I request the last part of the build log for these builds (the final parts indicating the build completed and anything after that)?
4. Are all the Pipelines experiencing issues making use of parallels?
5. Do you have any fairly simple pipelines that will reproduce the issues, which you can provide code for?
There's also a fix in workflow-cps plugin 2.46 for some issues with Resume Disabled. Without that fix there may be odd behaviors if the Resume Disabled flag is modified after the build begins. If this has happened, the fix there will help; after gathering the above information, I would suggest installing this update and seeing if issues recur when Resume Disabled is on. If that does not resolve the resume issue, it may be necessary to toggle this option OFF for the meantime: "Do not allow the pipeline to resume if the master restarts": Enabled on all jobs
We hope to have a hotfix available soon to at least try out. Thank you
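For item 2 above, compressing the file before attaching it can be done with the Python standard library. A minimal sketch; the helper name `gzip_file` is made up for illustration, and the path passed to it is whatever the build's workflow directory is on the master in question.

```python
import gzip
import shutil
from pathlib import Path


def gzip_file(source):
    """Write a gzip-compressed copy next to the source file and return its path."""
    source = Path(source)
    target = source.with_name(source.name + ".gz")
    # Stream the copy so large node stores are not read into memory at once.
    with source.open("rb") as src, gzip.open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return target
```

The original file is left untouched, which matters here because it is the evidence being collected.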
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Message Title Sam Van Oort updated an issue Jenkins / JENKINS-50199 Failed pipeline jobs stuck running after incorrect resume Change By: Sam Van Oort Priority: Critical → Major
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Message Title Andrew Bayer assigned an issue to Sam Van Oort Jenkins / JENKINS-50199 Failed pipeline jobs stuck running after incorrect resume Change By: Andrew Bayer Assignee: Sam Van Oort
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Message Title Mike Kozell updated an issue Jenkins / JENKINS-50199 Failed pipeline jobs stuck running after incorrect resume Change By: Mike Kozell

h2. Setup
Jenkins v2.89.4 LTS
Pipeline API Plugin: 2.26
Pipeline Nodes and Processes Plugin: 2.19
Durable Task Plugin: 1.18
Pipeline Job Plugin: 2.17
Pipeline Shared Groovy Libraries Plugin: 2.9
Pipeline Supporting APIs Plugin: 2.18
Pipeline Groovy Plugin: 2.45
Script Security Plugin: 1.41
Pipeline Default Speed/Durability Level: Performance-Optimized
"Do not allow the pipeline to resume if the master restarts": Enabled on all jobs

h2. Problem
I logged into a Jenkins master and saw no builds running but there was a queue of about 10 jobs. When mousing over the queued jobs, I saw "pending - Already running 2 builds across all nodes". This is strange because no jobs were showing as running and no Jenkins agents or executors were showing any running builds.
I then ran "http://xx.xxx.xxx.xxx:8080/computer/api/xml?tree=computer[executors[currentExecutable[url]],oneOffExecutors[currentExecutable[url]]]&xpath=//url&wrapper=builds" which did show 5 builds were running. I checked these builds and they were red (failure) and were not running.

h2. Research
I checked the console log of a build that showed as running but wasn't and saw the line below near the top of the log.
_*Resume disabled by user, switching to high-performance, low-durability mode.*_
At the end of the log I saw the following:
*_Finished: FAILURE_*
*_Resuming build at Tue Mar 13 23:04:52 UTC 2018 after Jenkins restart_*

h2. Why Resume Build?
The build failed on *Mar 12, 2018 6:02:37 PM*. +Why did the build try to resume almost a day later+? The job and system durability are configured to not resume builds.
Below are some details taken from the API for the build.
_class "hudson.model.OneOffExecutor"
id "41"
keepLog false
number 41
queueId 7178
result "FAILURE"
timestamp 1520877757466
I checked the Java process on the server and it was last restarted on *March 02 2018*.
+What triggered the "Jenkins restart" identified on Mar 13 23:04:52 UTC 2018 since the Java process was not restarted?+
+Why does this get the build stuck in a "running" state when it's not running?+

h2. Scope
This issue can be seen across many of our Jenkins masters. In each case we see "Resuming build at x after Jenkins restart" occur a few days after the build failure or abort even though Java was not restarted. This issue didn't occur on Jenkins 2.60.3 running the older (pre-durability configurable) Pipeline plugins.

h2. Logs
I checked the jenkins.log file and saw the following when the build was attempting to be resumed.
{code:java}
Mar 13, 2018 9:29:56 PM org.jenkinsci.plugins.workflow.cps.CpsFlowExecution onLoad
WARNING: Pipeline state not properly persisted, cannot resume job/JENKINS-JOB-NAME1/42/
Mar 13, 2018 9:29:56 PM org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService reportProblem
WARNING: Unexpected exception in CPS VM thread: CpsFlowExecution[OwnerJENKINS-JOB-NAME1/42:JENKINS-JOB-NAME1 #42]
java.lang.NullPointe
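As a cross-check on the times quoted in the report above, the `timestamp` field can be converted from epoch milliseconds and compared with the "Resuming build" line. A small sketch, assuming the console log time and the reported failure time are both UTC; the names `epoch_millis_to_utc`, `build_time`, `resume_logged`, and `gap` are illustrative only.

```python
from datetime import datetime, timezone


def epoch_millis_to_utc(millis):
    """Convert a Jenkins API timestamp (milliseconds since epoch) to UTC."""
    # Whole seconds are enough here; the sub-second part is dropped.
    return datetime.fromtimestamp(millis // 1000, tz=timezone.utc)


build_time = epoch_millis_to_utc(1520877757466)
resume_logged = datetime(2018, 3, 13, 23, 4, 52, tzinfo=timezone.utc)
gap = resume_logged - build_time

print(build_time.strftime("%Y-%m-%d %H:%M:%S %Z"))
print(gap)
```

Under that assumption the timestamp lines up with the Mar 12, 2018 6:02:37 PM failure time in the report, and the resume attempt was logged a little over a day afterwards.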
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Message Title Mike Kozell updated an issue Jenkins / JENKINS-50199 Failed pipeline jobs stuck running after incorrect resume Change By: Mike Kozell

h2. Setup
Jenkins v2.89.4 LTS
Pipeline API Plugin: 2.26
Pipeline Nodes and Processes Plugin: 2.19
Durable Task Plugin: 1.18
Pipeline Job Plugin: 2.17
Pipeline Shared Groovy Libraries Plugin: 2.9
Pipeline Supporting APIs Plugin: 2.18
Script Security Plugin: 1.41
Pipeline Default Speed/Durability Level: Performance-Optimized
"Do not allow the pipeline to resume if the master restarts": Enabled on all jobs

h2. Problem
I logged into a Jenkins master and saw no builds running but there was a queue of about 10 jobs. When mousing over the queued jobs, I saw "pending - Already running 2 builds across all nodes". This is strange because no jobs were showing as running and no Jenkins agents or executors were showing any running builds.
I then ran "http://xx.xxx.xxx.xxx:8080/computer/api/xml?tree=computer[executors[currentExecutable[url]],oneOffExecutors[currentExecutable[url]]]&xpath=//url&wrapper=builds" which did show 5 builds were running. I checked these builds and they were red (failure) and were not running.

h2. Research
I checked the console log of a build that showed as running but wasn't and saw the line below near the top of the log.
_*Resume disabled by user, switching to high-performance, low-durability mode.*_
At the end of the log I saw the following:
*_Finished: FAILURE_*
*_Resuming build at Tue Mar 13 23:04:52 UTC 2018 after Jenkins restart_*

h2. Why Resume Build?
The build failed on *Mar 12, 2018 6:02:37 PM*. +Why did the build try to resume almost a day later+? The job and system durability are configured to not resume builds.
Below are some details taken from the API for the build.
_class "hudson.model.OneOffExecutor"
id "41"
keepLog false
number 41
queueId 7178
result "FAILURE"
timestamp 1520877757466
I checked the Java process on the server and it was last restarted on *March 02 2018*.
+What triggered the "Jenkins restart" identified on Mar 13 23:04:52 UTC 2018 since the Java process was not restarted?+
+Why does this get the build stuck in a "running" state when it's not running?+

h2. Scope
This issue can be seen across many of our Jenkins masters. In each case we see "Resuming build at x after Jenkins restart" occur a few days after the build failure or abort even though Java was not restarted. This issue didn't occur on Jenkins 2.60.3 running the older (pre-durability configurable) Pipeline plugins.

h2. Logs
I checked the jenkins.log file and saw the following when the build was attempting to be resumed.
{code:java}
Mar 13, 2018 9:29:56 PM org.jenkinsci.plugins.workflow.cps.CpsFlowExecution onLoad
WARNING: Pipeline state not properly persisted, cannot resume job/JENKINS-JOB-NAME1/42/
Mar 13, 2018 9:29:56 PM org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService reportProblem
WARNING: Unexpected exception in CPS VM thread: CpsFlowExecution[OwnerJENKINS-JOB-NAME1/42:JENKINS-JOB-NAME1 #42]
java.lang.NullPointerException
    at org.jenkinsci.p
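The executor query quoted in the Problem section of the report can also be issued and read from a script. The following is a sketch, not a tested integration: it only builds the same query string and parses a response of the shape the `wrapper=builds` parameter produces (a `<builds>` root holding `<url>` elements). The helper names, the example host, and the `SAMPLE_RESPONSE` text are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

TREE = ("computer[executors[currentExecutable[url]],"
        "oneOffExecutors[currentExecutable[url]]]")


def executor_query_url(base_url):
    """Build the /computer/api/xml query used in the report."""
    query = urlencode({"tree": TREE, "xpath": "//url", "wrapper": "builds"})
    return base_url.rstrip("/") + "/computer/api/xml?" + query


def running_build_urls(xml_text):
    """Extract the build URLs that executors claim to be running."""
    root = ET.fromstring(xml_text.strip())
    return [url.text for url in root.iter("url")]


# Hypothetical response body, shaped like the wrapper=builds output.
SAMPLE_RESPONSE = """
<builds>
  <url>http://jenkins.example:8080/job/JENKINS-JOB-NAME1/42/</url>
  <url>http://jenkins.example:8080/job/JENKINS-JOB-NAME1/41/</url>
</builds>
"""

print(executor_query_url("http://jenkins.example:8080/"))
print(running_build_urls(SAMPLE_RESPONSE))
```

Comparing the returned URLs with each build's `result` field is one way to spot builds that are finished but still hold a one-off executor.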
[JIRA] (JENKINS-50199) Failed pipeline jobs stuck running after incorrect resume
Title: Message Title Mike Kozell created an issue Jenkins / JENKINS-50199 Failed pipeline jobs stuck running after incorrect resume
Issue Type: Bug
Assignee: Unassigned
Components: pipeline, workflow-api-plugin, workflow-cps-plugin
Created: 2018-03-15 17:18
Priority: Critical
Reporter: Mike Kozell

Setup
Jenkins v2.89.4 LTS
Pipeline API Plugin: 2.26
Pipeline Nodes and Processes Plugin: 2.19
Durable Task Plugin: 1.18
Pipeline Job Plugin: 2.17
Pipeline Shared Groovy Libraries Plugin: 2.9
Pipeline Supporting APIs Plugin: 2.18
Script Security Plugin: 1.41
Pipeline Default Speed/Durability Level: Performance-Optimized
"Do not allow the pipeline to resume if the master restarts": Enabled on all jobs

Problem
I logged into a Jenkins master and saw no builds running but there was a queue of about 10 jobs. When mousing over the queued jobs, I saw "pending - Already running 2 builds across all nodes". This is strange because no jobs were showing as running and no Jenkins agents or executors were showing any running builds.
I then ran "http://xx.xxx.xxx.xxx:8080/computer/api/xml?tree=computer[executors[currentExecutable[url]],oneOffExecutors[currentExecutable[url]]]&xpath=//url&wrapper=builds" which did show 5 builds were running. I checked these builds and they were red (failure) and were not running.

Research
I checked the console log of a build that showed as running but wasn't and saw the line below near the top of the log.
Resume disabled by user, switching to high-performance, low-durability mode.
At the end of the log I saw the following:
Finished: FAILURE
Resuming build at Tue Mar 13 23:04:52 UTC 2018 after Jenkins restart

Why Resume Build?
The build failed on Mar 12, 2018 6:02:37 PM. Why did the build try to resume almost a day later? The job and system durability are configured to not resume builds.
Below are some details taken from the API for the build.
_class "hudson.model.OneOffExecutor"
id "41"
keepLog false
number 41
queueId 7178
result "FAILURE"
timestamp 1520