phet commented on code in PR #4030:
URL: https://github.com/apache/gobblin/pull/4030#discussion_r1762107332


##########
gobblin-yarn/src/main/java/org/apache/gobblin/yarn/YarnHelixUtils.java:
##########
@@ -229,10 +229,11 @@ public static boolean retainKLatestJarCachePaths(Path 
parentCachePath, int k, Fi
     // Cleanup old cache if necessary
     List<FileStatus> jarDirs =
         Arrays.stream(fs.exists(parentCachePath) ? 
fs.listStatus(parentCachePath) : new 
FileStatus[0]).sorted().collect(Collectors.toList());
-    if (jarDirs.size() > k) {
-      return fs.delete(jarDirs.get(0).getPath(), true);
+    boolean deletesSuccessful = true;
+    for (int i = 0; i < jarDirs.size() - k; i++) {
+      deletesSuccessful = deletesSuccessful && 
fs.delete(jarDirs.get(i).getPath(), true);

Review Comment:
   `&=`



##########
gobblin-yarn/src/main/java/org/apache/gobblin/yarn/GobblinYarnAppLauncher.java:
##########
@@ -590,6 +590,15 @@ ApplicationId setupAndSubmitApplication() throws 
IOException, YarnException, Int
     
amContainerLaunchContext.setEnvironment(YarnHelixUtils.getEnvironmentVariables(this.yarnConfiguration));
     
amContainerLaunchContext.setCommands(Lists.newArrayList(buildApplicationMasterCommand(applicationId.toString(),
 resource.getMemory())));
 
+    if (this.jarCacheEnabled) {
+      Path jarCachePath = YarnHelixUtils.calculateJarCachePath(this.config);
+      // Retain at least the current and last month's jars to handle 
executions running for ~30 days max
+      boolean cleanedSuccessfully = 
YarnHelixUtils.retainKLatestJarCachePaths(jarCachePath.getParent(), 2, this.fs);

Review Comment:
   would be great to add this info to a code comment here (in the caller, since 
it's the combination of the two being used, so less effective in either's 
javadoc).  the crux is that retention of K=2 won't save us in cases where K=3 
might exceed any FS quotas.



##########
gobblin-utility/src/main/java/org/apache/gobblin/util/filesystem/HdfsJarUploadUtils.java:
##########
@@ -39,15 +39,15 @@ public class HdfsJarUploadUtils {
    * given the {@link FileStatus} of a jarFile and the path of directory that 
contains jar.
    * Snapshot dirs should not be shared, as different jobs may be using 
different versions of it.
    * @param fs
-   * @param localJar
+   * @param localJarPath
    * @param unsharedJarsDir
    * @param jarCacheDir
    * @return
    * @throws IOException
    */
-  public static Path calculateDestJarFile(FileSystem fs, FileStatus localJar, 
Path unsharedJarsDir, Path jarCacheDir) throws IOException {
-    Path uploadDir = localJar.getPath().getName().contains("SNAPSHOT") ? 
unsharedJarsDir : jarCacheDir;
-    Path destJarFile = new Path(fs.makeQualified(uploadDir), 
localJar.getPath().getName());
+  public static Path calculateDestJarFilePath(FileSystem fs, String 
localJarPath, Path unsharedJarsDir, Path jarCacheDir) throws IOException {
+    Path uploadDir = localJarPath.contains("SNAPSHOT") ? unsharedJarsDir : 
jarCacheDir;

Review Comment:
   seeing this invoked, `localJarPath` looks more like `localJarBasename`, not 
the (full) path.
   
   (or simply `jarName`)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to