William Lo created GOBBLIN-2126:
-----------------------------------
Summary: Implement caching for resources uploaded to HDFS by
Gobblin Yarn jobs
Key: GOBBLIN-2126
URL: https://issues.apache.org/jira/browse/GOBBLIN-2126
Project: Apache Gobblin
Issue Type: Improvement
Reporter: William Lo
Currently, Gobblin Yarn jobs re-upload their jars to HDFS on every
execution.
Instead, we want to keep a running cache, similar to MR's, so that files are
not repeatedly uploaded to HDFS, which is a slow operation. The cache would be
cleaned up at a monthly interval (the interval can be made configurable in the
future).
This should significantly speed up the bootstrapping of a YARN application in
Gobblin for Temporal.
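
As a rough illustration of the behavior described above, here is a minimal
sketch of a month-bucketed cache with an upload-if-absent check and a cleanup
pass. It is not the actual Gobblin implementation. To stay runnable without a
cluster, it uses java.nio.file as a stand-in for the HDFS client. The class and
method names (JarCacheSketch, uploadIfAbsent, cleanOldBuckets), the "yyyy-MM"
directory layout, the name-plus-size cache-hit check, and the choice to retain
the previous month's bucket are all assumptions made for the example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.time.YearMonth;
import java.time.format.DateTimeParseException;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical sketch of a month-bucketed jar cache; local paths stand in for HDFS. */
class JarCacheSketch {

  /**
   * Copies localJar into cacheRoot/<yyyy-MM>/ unless a file with the same
   * name and size is already cached there. Returns true if a copy happened.
   */
  static boolean uploadIfAbsent(Path localJar, Path cacheRoot, YearMonth month)
      throws IOException {
    Path bucket = cacheRoot.resolve(month.toString()); // e.g. "2024-08"
    Files.createDirectories(bucket);
    Path cached = bucket.resolve(localJar.getFileName());
    if (Files.exists(cached) && Files.size(cached) == Files.size(localJar)) {
      return false; // cache hit: skip the slow upload
    }
    Files.copy(localJar, cached, StandardCopyOption.REPLACE_EXISTING);
    return true;
  }

  /**
   * Deletes month buckets older than the previous month (assumed retention,
   * so jobs started near a month boundary still find their jars).
   * Returns the number of buckets removed.
   */
  static int cleanOldBuckets(Path cacheRoot, YearMonth current) throws IOException {
    if (!Files.isDirectory(cacheRoot)) {
      return 0;
    }
    List<Path> buckets;
    try (Stream<Path> entries = Files.list(cacheRoot)) {
      buckets = entries.filter(Files::isDirectory).collect(Collectors.toList());
    }
    int removed = 0;
    for (Path bucket : buckets) {
      YearMonth bucketMonth;
      try {
        bucketMonth = YearMonth.parse(bucket.getFileName().toString());
      } catch (DateTimeParseException e) {
        continue; // not a cache bucket; leave it alone
      }
      if (bucketMonth.isBefore(current.minusMonths(1))) {
        deleteRecursively(bucket);
        removed++;
      }
    }
    return removed;
  }

  /** Deletes a directory tree, children before parents. */
  private static void deleteRecursively(Path dir) throws IOException {
    List<Path> paths;
    try (Stream<Path> walk = Files.walk(dir)) {
      paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
    }
    for (Path p : paths) {
      Files.delete(p);
    }
  }
}
```

In this sketch a launcher would call uploadIfAbsent once per jar with
YearMonth.now() before submitting the application, and cleanOldBuckets could
run on any launch or on a schedule. A real version would presumably go through
the Hadoop FileSystem API and might compare checksums rather than sizes.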
--
This message was sent by Atlassian Jira
(v8.20.10#820010)