[ https://issues.apache.org/jira/browse/GOBBLIN-2050?focusedWorklogId=915644&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-915644 ]
ASF GitHub Bot logged work on GOBBLIN-2050:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 20/Apr/24 00:08
Start Date: 20/Apr/24 00:08
Worklog Time Spent: 10m
Work Description: Will-Lo commented on code in PR #3931:
URL: https://github.com/apache/gobblin/pull/3931#discussion_r1573066383
##########
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinClusterConfigurationKeys.java:
##########
@@ -50,6 +50,8 @@ public class GobblinClusterConfigurationKeys {
public static final boolean DEFAULT_STANDALONE_CLUSTER_MODE = false;
// Root working directory for Gobblin cluster
public static final String CLUSTER_WORK_DIR = GOBBLIN_CLUSTER_PREFIX + "workDir";
+ // Root working dir without appending the application name, keeping CLUSTER_WORK_DIR property for backward compatibility
+ public static final String CLUSTER_ABSOLUTE_WORK_DIR = GOBBLIN_CLUSTER_PREFIX + "absolute.workDir";
Review Comment:
It's done by the caller (config file), which can dynamically use properties
in the job to append to the folder. The config file can also append the yarn
application ID, if you so desire, by determining the path at runtime. The issue
is that many of the other files and folders generated by an E2E Gobblin cluster
on a yarn app do not append the yarn application ID either (a leaky
abstraction). This approach gives the caller more explicit control over their
folder behavior.
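A minimal sketch of the resolution order described in the comment above, not
the code from the PR. The class WorkDirResolver, its resolve method, the value
"gobblin.cluster." for GOBBLIN_CLUSTER_PREFIX, and the legacy suffix of
application name plus application ID are all assumptions made for
illustration; only the two key suffixes come from the diff.

```java
import java.util.Properties;

/** Hypothetical helper for illustration; not part of Gobblin. */
final class WorkDirResolver {
  // Assumed prefix value; the diff hunk does not show its definition.
  static final String GOBBLIN_CLUSTER_PREFIX = "gobblin.cluster.";
  static final String CLUSTER_WORK_DIR = GOBBLIN_CLUSTER_PREFIX + "workDir";
  static final String CLUSTER_ABSOLUTE_WORK_DIR = GOBBLIN_CLUSTER_PREFIX + "absolute.workDir";

  /**
   * Returns the absolute work dir unchanged when the caller configured one.
   * Otherwise falls back to the legacy root and appends a suffix
   * (assumed here to be application name and application ID).
   */
  static String resolve(Properties config, String appName, String appId) {
    String absolute = config.getProperty(CLUSTER_ABSOLUTE_WORK_DIR);
    if (absolute != null && !absolute.isEmpty()) {
      // The caller owns the full path, including any runtime-derived parts.
      return absolute;
    }
    String root = config.getProperty(CLUSTER_WORK_DIR);
    if (root == null || root.isEmpty()) {
      throw new IllegalArgumentException("Neither work dir property is set");
    }
    return root + "/" + appName + "/" + appId;
  }

  public static void main(String[] args) {
    Properties config = new Properties();
    config.setProperty(CLUSTER_WORK_DIR, "/user/gobblin");
    // Legacy key only: the suffix is appended.
    System.out.println(resolve(config, "myApp", "application_1"));

    config.setProperty(CLUSTER_ABSOLUTE_WORK_DIR, "/tmp/gobblin/run-42");
    // Absolute key set: the path is used exactly as given.
    System.out.println(resolve(config, "myApp", "application_1"));
  }
}
```

With this shape, a caller that wants one folder to delete after the yarn app
closes sets only the absolute key and builds the path in its own config.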
Issue Time Tracking
-------------------
Worklog Id: (was: 915644)
Time Spent: 2h (was: 1h 50m)
> Allow configurable token file paths and cluster work directories to allow for
> easy cleanup after yarn app closes
> ----------------------------------------------------------------------------------------------------------------
>
> Key: GOBBLIN-2050
> URL: https://issues.apache.org/jira/browse/GOBBLIN-2050
> Project: Apache Gobblin
> Issue Type: Improvement
> Reporter: William Lo
> Priority: Major
> Time Spent: 2h
> Remaining Estimate: 0h
>
> Gobblin Yarn Application Launcher has some issues where directories used for
> the job can persist after the job ends. It also creates a number of temporary
> files which can grow out of control.
> We want to be able to:
> 1. Clean up directories effectively
> 2. Use explicit paths so that temp files for token renewal and cluster work
> files can be consolidated under the same folder.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)