Dmitry Konstantinov created CASSANDRA-20833:
-----------------------------------------------

             Summary: Jenkins pipeline: add CauseOfInterruption logging to 
simplify troubleshooting
                 Key: CASSANDRA-20833
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20833
             Project: Apache Cassandra
          Issue Type: Improvement
          Components: Build
            Reporter: Dmitry Konstantinov


Currently, when a test split of the Jenkins pipeline is aborted because of a 
timeout or because an agent went offline, Jenkins aborts the other parallel 
steps too. All the steps, both the one that originally failed and the ones 
aborted as a result, look very similar in the Pipeline overview, so finding 
the original split failure is time-consuming.
Jenkins throws an exception in this case, but it has no message and its stack 
trace is generic and uninformative. Example:
{code}
org.jenkinsci.plugins.workflow.steps.FlowInterruptedException
        at PluginClassLoader for workflow-durable-task-step//org.jenkinsci.plugins.workflow.support.steps.ExecutorStepExecution$RemovedNodeListener.lambda$cancelOwnerExecution$7(ExecutorStepExecution.java:432)
        at PluginClassLoader for workflow-durable-task-step//org.jenkinsci.plugins.workflow.support.steps.ExecutorStepExecution$PlaceholderTask$1.onSuccess(ExecutorStepExecution.java:524)
        at PluginClassLoader for workflow-durable-task-step//org.jenkinsci.plugins.workflow.support.steps.ExecutorStepExecution$PlaceholderTask$1.onSuccess(ExecutorStepExecution.java:520)
        at com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1139)
        at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:31)
        at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1307)
        at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:1070)
        at com.google.common.util.concurrent.AbstractFuture.set(AbstractFuture.java:794)
        at com.google.common.util.concurrent.SettableFuture.set(SettableFuture.java:49)
        at PluginClassLoader for workflow-cps//org.jenkinsci.plugins.workflow.cps.CpsFlowExecution$5.onSuccess(CpsFlowExecution.java:1118)
        at PluginClassLoader for workflow-cps//org.jenkinsci.plugins.workflow.cps.CpsFlowExecution$5.onSuccess(CpsFlowExecution.java:1096)
        at PluginClassLoader for workflow-cps//org.jenkinsci.plugins.workflow.cps.CpsFlowExecution$4$1.run(CpsFlowExecution.java:1012)
        at PluginClassLoader for workflow-cps//org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService.lambda$wrap$2(CpsVmExecutorService.java:85)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
        at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
        at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:139)
        at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
        at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:68)
        at jenkins.util.ErrorLoggingExecutorService.lambda$wrap$0(ErrorLoggingExecutorService.java:51)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
        at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at PluginClassLoader for workflow-cps//org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$1.call(CpsVmExecutorService.java:53)
        at PluginClassLoader for workflow-cps//org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$1.call(CpsVmExecutorService.java:50)
        at org.codehaus.groovy.runtime.GroovyCategorySupport$ThreadCategoryInfo.use(GroovyCategorySupport.java:136)
        at org.codehaus.groovy.runtime.GroovyCategorySupport.use(GroovyCategorySupport.java:275)
        at PluginClassLoader for workflow-cps//org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService.lambda$categoryThreadFactory$0(CpsVmExecutorService.java:50)
        at java.base/java.lang.Thread.run(Unknown Source)
        Suppressed: org.jenkinsci.plugins.workflow.actions.ErrorAction$ErrorId: 363f0c37-5777-48ab-92e6-9aa56ec5431d
{code}

FlowInterruptedException carries additional information that may help here:
List<CauseOfInterruption> getCauses()
https://github.com/jenkinsci/workflow-step-api-plugin/blob/518c5dcb24c0d692abe25aea4c03f8273ac06ca4/src/main/java/org/jenkinsci/plugins/workflow/steps/FlowInterruptedException.java#L54

We can extract and print this information, which contains the name of the 
step that originally failed.
Example of the original split failure:
{code}
CauseOfInterruption: org.jenkinsci.plugins.workflow.support.steps.ExecutorStepExecution$RemovedNodeCause - Agent was removed
{code}
Example of a split aborted because of an error in another split:
{code}
CauseOfInterruption: org.jenkinsci.plugins.workflow.cps.steps.ParallelStep$FailFastCause - Failed in branch jvm-dtest jdk11 10/16
{code}
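
The extraction described above could look roughly like the following. This 
is only a sketch, not the actual pipeline change: the nested 
CauseOfInterruption, FlowInterruptedException, RemovedNodeCause and 
FailFastCause classes are simplified stand-ins written for this example so 
that it runs without Jenkins on the classpath, and describeCauses is a 
hypothetical helper name. In a real pipeline the exception and cause types 
come from Jenkins itself, and the logging would sit in the catch block around 
the split.

```java
import java.util.List;
import java.util.stream.Collectors;

class InterruptionCauseLogging {

    // Stand-in (assumption) for the Jenkins CauseOfInterruption type.
    abstract static class CauseOfInterruption {
        abstract String getShortDescription();
    }

    // Stand-in (assumption) for FlowInterruptedException; only getCauses() is modelled.
    static class FlowInterruptedException extends InterruptedException {
        private final List<CauseOfInterruption> causes;

        FlowInterruptedException(List<CauseOfInterruption> causes) {
            this.causes = List.copyOf(causes);
        }

        List<CauseOfInterruption> getCauses() {
            return causes;
        }
    }

    // Hypothetical cause: the agent running the split was removed.
    static class RemovedNodeCause extends CauseOfInterruption {
        @Override
        String getShortDescription() {
            return "Agent was removed";
        }
    }

    // Hypothetical cause: a sibling parallel branch failed first.
    static class FailFastCause extends CauseOfInterruption {
        private final String branch;

        FailFastCause(String branch) {
            this.branch = branch;
        }

        @Override
        String getShortDescription() {
            return "Failed in branch " + branch;
        }
    }

    // Builds one log line per cause: "CauseOfInterruption: <class> - <description>".
    static List<String> describeCauses(FlowInterruptedException e) {
        return e.getCauses().stream()
                .map(c -> "CauseOfInterruption: " + c.getClass().getName()
                        + " - " + c.getShortDescription())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        try {
            // Simulate a split that is aborted because another branch failed.
            throw new FlowInterruptedException(
                    List.of(new FailFastCause("jvm-dtest jdk11 10/16")));
        } catch (FlowInterruptedException e) {
            describeCauses(e).forEach(System.out::println);
        }
    }
}
```

Because the stand-in classes are nested here, the printed class names differ 
from the real Jenkins ones shown in the examples above; only the line format 
is meant to match.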


Note: to use it, the method has to be added to the approved Jenkins API:
{code}
method org.jenkinsci.plugins.workflow.steps.FlowInterruptedException getCauses 
{code}




--
This message was sent by Atlassian Jira
(v8.20.10#820010)
