[jira] [Commented] (OOZIE-2582) Populating external child Ids for action failures

Rohini Palaniswamy (JIRA) Mon, 19 Sep 2016 13:15:32 -0700

    [ https://issues.apache.org/jira/browse/OOZIE-2582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15504558#comment-15504558 ]


Rohini Palaniswamy commented on OOZIE-2582:
-------------------------------------------

A few comments:
  1) Can you move the declarations below from PigMain to LauncherMain and
replace references to the literal oozie.action.externalChildIDs with that
constant?
{code}
    public static final String ACTION_PREFIX = "oozie.action.";
    public static final String EXTERNAL_CHILD_IDS = ACTION_PREFIX + "externalChildIDs";
    public static final String EXTERNAL_ACTION_STATS = ACTION_PREFIX + "stats.properties";
    public static final String EXTERNAL_STATS_WRITE = ACTION_PREFIX + "external.stats.write";
{code}
  2) 
https://issues.apache.org/jira/browse/OOZIE-2503?focusedCommentId=15315945&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15315945

The Distcp and Spark child job URLs were added by [~satishsaley] only in this
release, Oozie 4.3 (OOZIE-2471 and OOZIE-2503). But they were implemented by
overriding getCaptureOutput to return true, implementing getActionData, and
writing to output.properties instead of externalChildIDs. I had asked him to
create OOZIE-2561 to remove that and write directly to externalChildIDs. Can
you take care of that in this patch? HiveMain, Hive2Main and SqoopMain also
need to be changed similarly, but that code existed in previous releases. So
for jobs that were launched before the upgrade, we still need to read from
output.properties, i.e.

{code}
String externalIDs = actionData.get(LauncherMapper.ACTION_DATA_EXTERNAL_CHILD_IDS);
if (externalIDs != null) {
    context.setExternalChildIDs(externalIDs);
    LOG.info(XLog.STD, "Hadoop Jobs launched : [{0}]", externalIDs);
}
else if (LauncherMapperHelper.hasOutputData(actionData)) {
    // Load stored Hadoop job ids and promote them to external child ids.
    // This is for jobs launched with an older release during upgrade to Oozie 4.3.
    Properties props = PropertiesUtils.stringToProperties(
            actionData.get(LauncherMapper.ACTION_DATA_OUTPUT_PROPS));
    if (props.get(LauncherMain.HADOOP_JOBS) != null) {
        // Reassign so the log line below prints the ids that were actually set
        externalIDs = (String) props.get(LauncherMain.HADOOP_JOBS);
        context.setExternalChildIDs(externalIDs);
        LOG.info(XLog.STD, "Hadoop Jobs launched : [{0}]", externalIDs);
    }
}
{code}
Your current patch writes to externalChildIds in addition to the properties
file. With the above change you only need to write to the externalChildIds
file. Please also remove readExternalChildIDs in JavaActionExecutor, as it will
be unused after the above change. This also avoids storing the child ids
redundantly in two fields (action output and external child ids) of
WorkflowAction in the database.

4)  Can you first put the job ids in a LinkedHashSet (to maintain the order in
which jobs were launched) and then create the string? Calling toString() and
doing a contains check for every job id is inefficient.
{code}
if (!sb.toString().contains(jobId)) {
    sb.append(separator).append(jobId);
}
{code}
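
A minimal sketch of the LinkedHashSet approach suggested above, assuming the job ids are available as a list of strings. The class name ChildIdJoiner and the method joinUniqueJobIds are hypothetical and not taken from the patch. Besides avoiding the repeated toString() calls, a set compares whole ids, whereas a substring contains check would wrongly treat an id as already present if it is a prefix of another id.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration only; not part of Oozie.
public class ChildIdJoiner {

    // De-duplicates job ids while keeping first-seen (launch) order,
    // then builds the separated string once at the end.
    public static String joinUniqueJobIds(List<String> jobIds, String separator) {
        Set<String> unique = new LinkedHashSet<>(jobIds);
        return String.join(separator, unique);
    }

    public static void main(String[] args) {
        List<String> launched = Arrays.asList("job_1_0001", "job_1_0002", "job_1_0001");
        System.out.println(joinUniqueJobIds(launched, ","));
    }
}
```

Each insertion into the set is a constant-time hash lookup on average, so the string is built in a single pass instead of being re-created for every id.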

> Populating external child Ids for action failures
> -------------------------------------------------
>
>                 Key: OOZIE-2582
>                 URL: https://issues.apache.org/jira/browse/OOZIE-2582
>             Project: Oozie
>          Issue Type: Bug
>          Components: core
>            Reporter: Abhishek Bafna
>            Assignee: Abhishek Bafna
>             Fix For: 4.3.0
>
>         Attachments: OOZIE-2582-00.patch, OOZIE-2582-01.patch, 
> OOZIE-2582-02.patch, OOZIE-2582-03.patch, OOZIE-2582-04.patch, 
> OOZIE-2582-05.patch
>
>
> Currently Oozie external child ids are populated into the workflow bean only
> when the job/action completes successfully. They should be populated in case
> of job failures as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
