[ https://issues.apache.org/jira/browse/OOZIE-3613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
longfei.wang updated OOZIE-3613:
--------------------------------
    Description: 
It turns out that Oozie cannot kill a Spark job launched by an Oozie Spark action, because the action's externalChildIDs, which are parsed from the Spark job's stdout log, cannot be obtained in time.

As the code shows, SparkMain extracts the externalChildIDs (the real Spark application IDs on YARN) from the Spark job's log file only after the Spark job has finished (normally or abnormally), and killing the Spark job depends on these externalChildIDs.
/*****
try {
    runSpark(sparkArgs.toArray(new String[sparkArgs.size()]));
}
finally {
    System.out.println("\n<<< Invocation of Spark command completed <<<\n");
    writeExternalChildIDs(logFile, SPARK_JOB_IDS_PATTERNS, "Spark");
}
*****/
This logic can be found in [oozie|https://github.com/apache/oozie]/sharelib/spark/src/main/java/org/apache/oozie/action/hadoop/*SparkMain.java*.
/*****
yarnClient = createYarnClient(context, jobConf);
String appExternalId = action.getExternalId();
killExternalApp(action, yarnClient, appExternalId);
killExternalChildApp(action, yarnClient, appExternalId);
killExternalChildAppByTags(action, yarnClient, jobConf, appExternalId);
*****/
This logic can be found in [oozie|https://github.com/apache/oozie]/core/src/main/java/org/apache/oozie/action/hadoop/*JavaActionExecutor.java*.

However, these externalChildIDs cannot be obtained before the Spark action finishes. So killing a running Spark job launched by a Spark action only kills the Oozie Spark action, not the running Spark job itself. This looks like a bug and should be improved.

  was:
It turns out that Oozie cannot kill a Spark job launched by an Oozie Spark action, because the action's externalChildIDs, which are parsed from the Spark job's stdout log, cannot be obtained in time.

As the code shows, SparkMain extracts the externalChildIDs (the real Spark application IDs on YARN) from the Spark job's log file only after the Spark job has finished (normally or abnormally), and killing the Spark job depends on these externalChildIDs.

This logic can be found in [oozie|https://github.com/apache/oozie]/sharelib/spark/src/main/java/org/apache/oozie/action/hadoop/*SparkMain.java*.

However, these externalChildIDs cannot be obtained before the Spark action finishes. So killing a running Spark job launched by a Spark action only kills the Oozie Spark action, not the running Spark job itself. This looks like a bug and should be improved.
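For illustration, here is a minimal sketch (not Oozie's actual implementation) of how the child Spark application could be killed through the public YarnClient API without waiting for externalChildIDs to be parsed from the launcher log, assuming the child application carries a YARN application tag that can be derived from the Oozie action. The class name and the childTag parameter below are hypothetical.
/*****
import java.io.IOException;
import java.util.EnumSet;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.api.records.ApplicationReport;
import org.apache.hadoop.yarn.api.records.YarnApplicationState;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.exceptions.YarnException;

public class KillChildAppByTagSketch {

    // Kills every RUNNING YARN application that carries the given tag.
    // childTag is assumed to be a tag attached to the Spark job at submit time.
    public static void killByTag(Configuration conf, String childTag)
            throws IOException, YarnException {
        YarnClient yarnClient = YarnClient.createYarnClient();
        yarnClient.init(conf);
        yarnClient.start();
        try {
            for (ApplicationReport report
                    : yarnClient.getApplications(EnumSet.of(YarnApplicationState.RUNNING))) {
                if (report.getApplicationTags().contains(childTag)) {
                    // Kill the child Spark application by its YARN id directly,
                    // instead of relying on externalChildIDs scraped from the log.
                    yarnClient.killApplication(report.getApplicationId());
                }
            }
        } finally {
            yarnClient.stop();
        }
    }
}
*****/
If a tag-based lookup along these lines were run at kill time, the child Spark application could be stopped even though writeExternalChildIDs has not run yet.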
> oozie cannot kill spark action
> ------------------------------
>
>                 Key: OOZIE-3613
>                 URL: https://issues.apache.org/jira/browse/OOZIE-3613
>             Project: Oozie
>          Issue Type: Bug
>            Reporter: longfei.wang
>            Priority: Critical
>         Attachments: 屏幕快照 2020-12-01 下午3.27.00.png
>
>
> It turns out that Oozie cannot kill a Spark job launched by an Oozie Spark action, because the action's externalChildIDs, which are parsed from the Spark job's stdout log, cannot be obtained in time.
> As the code shows, SparkMain extracts the externalChildIDs (the real Spark application IDs on YARN) from the Spark job's log file only after the Spark job has finished (normally or abnormally), and killing the Spark job depends on these externalChildIDs.
> /*****
> try {
>     runSpark(sparkArgs.toArray(new String[sparkArgs.size()]));
> }
> finally {
>     System.out.println("\n<<< Invocation of Spark command completed <<<\n");
>     writeExternalChildIDs(logFile, SPARK_JOB_IDS_PATTERNS, "Spark");
> }
> *****/
> This logic can be found in [oozie|https://github.com/apache/oozie]/sharelib/spark/src/main/java/org/apache/oozie/action/hadoop/*SparkMain.java*.
> /*****
> yarnClient = createYarnClient(context, jobConf);
> String appExternalId = action.getExternalId();
> killExternalApp(action, yarnClient, appExternalId);
> killExternalChildApp(action, yarnClient, appExternalId);
> killExternalChildAppByTags(action, yarnClient, jobConf, appExternalId);
> *****/
> This logic can be found in [oozie|https://github.com/apache/oozie]/core/src/main/java/org/apache/oozie/action/hadoop/*JavaActionExecutor.java*.
> However, these externalChildIDs cannot be obtained before the Spark action finishes. So killing a running Spark job launched by a Spark action only kills the Oozie Spark action, not the running Spark job itself. This looks like a bug and should be improved.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)