[ https://issues.apache.org/jira/browse/OOZIE-2582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15504558#comment-15504558 ]
Rohini Palaniswamy commented on OOZIE-2582:
-------------------------------------------

A few comments:

1) Can you move the declarations below from PigMain to LauncherMain and replace references to oozie.action.externalChildIDs with that constant?
{code}
public static final String ACTION_PREFIX = "oozie.action.";
public static final String EXTERNAL_CHILD_IDS = ACTION_PREFIX + "externalChildIDs";
public static final String EXTERNAL_ACTION_STATS = ACTION_PREFIX + "stats.properties";
public static final String EXTERNAL_STATS_WRITE = ACTION_PREFIX + "external.stats.write";
{code}

2) https://issues.apache.org/jira/browse/OOZIE-2503?focusedCommentId=15315945&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15315945

The Distcp and Spark child job URLs were added by [~satishsaley] only in this release (OOZIE-2471 and OOZIE-2503), in Oozie 4.3. But they were implemented by overriding getCaptureOutput to return true, implementing getActionData, and writing to output.properties instead of externalChildIDs. I had asked him to create OOZIE-2561 to remove that and write directly to externalChildIDs. Can you take care of that in this patch?

HiveMain, Hive2Main and SqoopMain also need to be changed similarly, but that code existed in previous releases. So for jobs that were launched before the upgrade, we still need to read from output.properties.
i.e.
{code}
String externalIDs = actionData.get(LauncherMapper.ACTION_DATA_EXTERNAL_CHILD_IDS);
if (externalIDs != null) {
    context.setExternalChildIDs(externalIDs);
    LOG.info(XLog.STD, "Hadoop Jobs launched : [{0}]", externalIDs);
}
else if (LauncherMapperHelper.hasOutputData(actionData)) {
    // Load stored Hadoop jobs ids and promote them as external child ids
    // This is for jobs launched with older release during upgrade to Oozie 4.3
    Properties props = PropertiesUtils.stringToProperties(
            actionData.get(LauncherMapper.ACTION_DATA_OUTPUT_PROPS));
    if (props.get(LauncherMain.HADOOP_JOBS) != null) {
        // Reuse externalIDs so the log line below prints the ids actually set
        externalIDs = (String) props.get(LauncherMain.HADOOP_JOBS);
        context.setExternalChildIDs(externalIDs);
        LOG.info(XLog.STD, "Hadoop Jobs launched : [{0}]", externalIDs);
    }
}
{code}

Your current patch writes to externalChildIds in addition to the properties file. With the above change you only need to write to the externalChildIds file. Please also remove readExternalChildIDs in JavaActionExecutor, as it will be unused after the above change. This will also avoid the duplication and waste of storing child ids in two fields (action output and external child ids) of WorkflowAction in the database.

4) Can you put the job ids in a LinkedHashSet first (to maintain the order in which the jobs were launched) and then create the string? Calling toString() and doing a contains check every time is inefficient.
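For illustration only, the LinkedHashSet suggestion in point 4 could be sketched roughly as below. This is a minimal sketch, not code from the patch or from Oozie: the class name `ChildIdJoiner`, the method `join`, and the sample job ids are all hypothetical, and it assumes the ids arrive as an iterable of strings in launch order.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Oozie: collects job ids in launch order,
// drops duplicates, and builds the separator-delimited string once at the end.
public class ChildIdJoiner {

    public static String join(Iterable<String> jobIds, String separator) {
        // LinkedHashSet preserves first-insertion order and rejects duplicates
        // without rebuilding and scanning a string on every append.
        Set<String> unique = new LinkedHashSet<>();
        for (String jobId : jobIds) {
            unique.add(jobId);
        }
        // Create the final string a single time, after all ids are collected.
        return String.join(separator, unique);
    }

    public static void main(String[] args) {
        // Sample ids are made up for the example.
        List<String> launched = Arrays.asList("job_1", "job_2", "job_1", "job_3");
        System.out.println(join(launched, ","));
    }
}
```

Besides avoiding the repeated toString() call, a set compares whole ids, whereas a substring check such as contains("job_1") would also match inside "job_10". The excerpt from the patch that prompted the comment follows.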
{code}
if (!sb.toString().contains(jobId)) {
    sb.append(separator).append(jobId);
}
{code}

> Populating external child Ids for action failures
> -------------------------------------------------
>
>                 Key: OOZIE-2582
>                 URL: https://issues.apache.org/jira/browse/OOZIE-2582
>             Project: Oozie
>          Issue Type: Bug
>          Components: core
>            Reporter: Abhishek Bafna
>            Assignee: Abhishek Bafna
>             Fix For: 4.3.0
>
>         Attachments: OOZIE-2582-00.patch, OOZIE-2582-01.patch,
> OOZIE-2582-02.patch, OOZIE-2582-03.patch, OOZIE-2582-04.patch,
> OOZIE-2582-05.patch
>
>
> Currently Oozie external child ids are populated into workflow bean, when the
> job/action completes successfully. It should populate external child ids in
> case of job failures as well.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)