[
https://issues.apache.org/jira/browse/GOBBLIN-1857?focusedWorklogId=872016&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-872016
]
ASF GitHub Bot logged work on GOBBLIN-1857:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 20/Jul/23 15:57
Start Date: 20/Jul/23 15:57
Worklog Time Spent: 10m
Work Description: Will-Lo commented on code in PR #3719:
URL: https://github.com/apache/gobblin/pull/3719#discussion_r1269670013
##########
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/HelixJobsMapping.java:
##########
@@ -96,15 +96,17 @@ public HelixJobsMapping(Config sysConfig, URI fsUri, String rootDir) {
 }
 public static String createPlanningJobId (Properties jobPlanningProps) {
+ long planningJobId = PropertiesUtils.getPropAsBoolean(jobPlanningProps, GobblinClusterConfigurationKeys.USE_GENERATED_JOBEXECUTION_IDS, "false") ?
+ System.currentTimeMillis() : PropertiesUtils.getPropAsLong(jobPlanningProps, ConfigurationKeys.FLOW_EXECUTION_ID_KEY, System.currentTimeMillis());
Review Comment:
The advantage is primarily for tracking. GobblinTrackingEvents have an
execution ID in their metadata, and it would work nicely if that execution ID
also allowed the caller to identify the flow execution ID, which lets the user
perform cancel() etc.
For earlyStop, maybe we can append something to the jobName, but it'll be
hacky. I feel like that feature should be redesigned in general to follow
Gobblin conventions better.
Issue Time Tracking
-------------------
Worklog Id: (was: 872016)
Time Spent: 50m (was: 40m)
> Allow override key for job execution ID in Helix Gobblin Cluster
> ----------------------------------------------------------------
>
> Key: GOBBLIN-1857
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1857
> Project: Apache Gobblin
> Issue Type: Bug
> Components: gobblin-cluster
> Reporter: William Lo
> Assignee: Hung Tran
> Priority: Major
> Time Spent: 50m
> Remaining Estimate: 0h
>
> The job execution ID is automatically inferred from the flow execution ID if
> the job was orchestrated by Gobblin-as-a-Service. However, this can lead to
> bugs in conjunction with other job keys such as earlyStop, since they can
> create multiple planningJobs and job IDs that are sent to Helix. These jobs
> would then have the same execution ID, which Helix rejects.
> When those configurations are set, we want to force the Gobblin cluster to
> default to creating the job execution ID from the current timestamp.
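
The selection rule described above can be sketched in plain Java as follows.
This is only an illustration under stated assumptions, not the code from the
PR: the class name, the method name, and both property key strings are
hypothetical stand-ins (the real keys are the constants
GobblinClusterConfigurationKeys.USE_GENERATED_JOBEXECUTION_IDS and
ConfigurationKeys.FLOW_EXECUTION_ID_KEY), and the current time is passed in
as a parameter so the behavior is easy to check.

```java
import java.util.Properties;

public class PlanningJobIdSketch {
  // Hypothetical key names, used here only for illustration. The real values
  // are defined by Gobblin's configuration key classes.
  static final String USE_GENERATED_IDS_KEY = "example.useGeneratedJobExecutionIds";
  static final String FLOW_EXECUTION_ID_KEY = "example.flowExecutionId";

  /**
   * Picks the numeric ID for a planning job.
   * If the override flag is true, the timestamp is always used.
   * Otherwise the flow execution ID is used when present, with the
   * timestamp as the fallback.
   */
  static long choosePlanningJobId(Properties props, long nowMillis) {
    boolean useGenerated =
        Boolean.parseBoolean(props.getProperty(USE_GENERATED_IDS_KEY, "false"));
    if (useGenerated) {
      return nowMillis;
    }
    String flowExecutionId = props.getProperty(FLOW_EXECUTION_ID_KEY);
    return flowExecutionId != null ? Long.parseLong(flowExecutionId) : nowMillis;
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty(FLOW_EXECUTION_ID_KEY, "1689868620000");

    // Without the override, the planning job ID follows the flow execution ID.
    System.out.println(choosePlanningJobId(props, System.currentTimeMillis()));

    // With the override, the planning job ID comes from the timestamp instead.
    props.setProperty(USE_GENERATED_IDS_KEY, "true");
    System.out.println(choosePlanningJobId(props, System.currentTimeMillis()));
  }
}
```

With the override unset, every planning job created for the same flow
execution gets the same ID, which is the collision the issue describes. With
the override set, the link back to the flow execution ID is lost, which is the
tracking trade-off raised in the review comment.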