[ 
https://issues.apache.org/jira/browse/GOBBLIN-2020?focusedWorklogId=911099&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-911099
 ]

ASF GitHub Bot logged work on GOBBLIN-2020:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 22/Mar/24 17:24
            Start Date: 22/Mar/24 17:24
    Worklog Time Spent: 10m 
      Work Description: homatthew commented on code in PR #3900:
URL: https://github.com/apache/gobblin/pull/3900#discussion_r1535938998


##########
gobblin-temporal/src/main/java/org/apache/gobblin/temporal/ddm/activity/impl/CommitActivityImpl.java:
##########
@@ -135,9 +141,21 @@ public Callable<Void> apply(final Map.Entry<String, JobState.DatasetState> entry
 
       IteratorExecutor.logFailures(result, null, 10);
 
+      Set<String> failedDatasetUrns = new HashSet<>();
+      for (JobState.DatasetState datasetState : datasetStatesByUrns.values()) {
+        // Set the overall job state to FAILED if the job failed to process any dataset
+        if (datasetState.getState() == JobState.RunningState.FAILED) {
+          failedDatasetUrns.add(datasetState.getDatasetUrn());
+        }
+      }
+      if (!failedDatasetUrns.isEmpty()) {
+        String allFailedDatasets = String.join(", ", failedDatasetUrns);
+        log.error("Failed to commit dataset state for dataset(s) {}" + String.join(", ", failedDatasetUrns));

Review Comment:
   I think you meant to use the variable from line 152 (`allFailedDatasets`) here.
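
   A minimal sketch of what this suggestion might look like, not the actual change in the PR. The class `FailedDatasetLogSketch`, the `RunningState` enum, and the helper `joinFailedUrns` are hypothetical stand-ins so the example compiles without Gobblin on the classpath. It assumes `log` in the original code is an SLF4J logger, in which case the joined string would be passed as an argument for the `{}` placeholder instead of being concatenated onto the format string.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class FailedDatasetLogSketch {
  // Hypothetical stand-in for JobState.RunningState.
  enum RunningState { COMMITTED, FAILED }

  // Collects the URNs of all datasets in the FAILED state and joins them
  // into one comma-separated string (sorted, for stable output).
  static String joinFailedUrns(Map<String, RunningState> statesByUrn) {
    Set<String> failedDatasetUrns = new TreeSet<>();
    for (Map.Entry<String, RunningState> entry : statesByUrn.entrySet()) {
      if (entry.getValue() == RunningState.FAILED) {
        failedDatasetUrns.add(entry.getKey());
      }
    }
    return String.join(", ", failedDatasetUrns);
  }

  public static void main(String[] args) {
    Map<String, RunningState> states = new LinkedHashMap<>();
    states.put("urn:a", RunningState.FAILED);
    states.put("urn:b", RunningState.COMMITTED);
    states.put("urn:c", RunningState.FAILED);

    String allFailedDatasets = joinFailedUrns(states);
    if (!allFailedDatasets.isEmpty()) {
      // Assuming an SLF4J logger, the equivalent call would be:
      //   log.error("Failed to commit dataset state for dataset(s) {}", allFailedDatasets);
      // The joined string is computed once and reused rather than rebuilt.
      System.err.println("Failed to commit dataset state for dataset(s) " + allFailedDatasets);
    }
  }
}
```

   Passing the value as a separate argument lets the placeholder be substituted; with `+`, the literal `{}` would remain in the logged message.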





Issue Time Tracking
-------------------

    Worklog Id:     (was: 911099)
    Time Spent: 1h 20m  (was: 1h 10m)

> Fixes failed workflow paths in Temporal to properly emit GTE and fail job 
> when commit fails
> -------------------------------------------------------------------------------------------
>
>                 Key: GOBBLIN-2020
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-2020
>             Project: Apache Gobblin
>          Issue Type: Improvement
>            Reporter: William Lo
>            Priority: Major
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> There are a few bugs in Gobblin-Temporal execution mode:
> 1. If the publishing step fails, the activity does not report a failure 
> because a post-commit step that checks the dataset states is missing
> 2. No GTEs are emitted upon job failure, which makes tracking difficult
> 3. Some metadata propagation of the flow execution ID to workflows is 
> incorrect because worker configs are read instead of job props
> 4. The GenerateWus activity does not return the right number of workunits 
> created because it counts top-level multiworkunits



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
