AlexanderAshitkin commented on code in PR #388:
URL: 
https://github.com/apache/maven-build-cache-extension/pull/388#discussion_r2404789258


##########
src/main/java/org/apache/maven/buildcache/xml/CacheConfigImpl.java:
##########
@@ -575,7 +575,24 @@ public String getLocalRepositoryLocation() {
     public List<DirName> getAttachedOutputs() {
         checkInitializedState();
         final AttachedOutputs attachedOutputs = 
getConfiguration().getAttachedOutputs();
-        return attachedOutputs == null ? Collections.emptyList() : 
attachedOutputs.getDirNames();
+        if (attachedOutputs == null) {
+            return getDefaultAttachedOutputs();
+        }
+        return attachedOutputs.getDirNames();
+    }
+
+    private List<DirName> getDefaultAttachedOutputs() {

Review Comment:
   > Build failures when running mvn test after cache restoration
   
   Let me please share some background. The primary goal for the cache was to 
provide lightweight and fast build for CI. Saving and restoring compilation 
directories interferes with this goal, because it effectively duplicates IO and 
upload efforts - it is necessary to pack and upload almost the same jar twice. 
For example, Gradle, which restores classes, skips caching jar files by default 
to avoid doing expensive work twice. In Maven, there is no easy way to find all 
input/output directories for the project because plugins might have own 
input/output configurations. Because of that, instead of caching "classes" the 
choice was mad in favor of caching artifacts, for both correctness and 
performance reasons.
   
   All in all - though restoring classes might be helpful in local development, 
it is better to consider this as a separate, opt-in feature. It might require 
additional efforts to implement correctly. Default save / restore should be 
based on the reactor introspection.



##########
src/main/java/org/apache/maven/buildcache/CacheUtils.java:
##########
@@ -193,16 +209,27 @@ public static void unzip(Path zip, Path out) throws 
IOException {
                     throw new RuntimeException("Bad zip entry");
                 }
                 if (entry.isDirectory()) {
-                    Files.createDirectory(file);
+                    if (!Files.exists(file)) {
+                        Files.createDirectories(file);
+                    }
+                    directoryTimestamps.put(file, entry.getTime());
                 } else {
                     Path parent = file.getParent();
-                    Files.createDirectories(parent);
+                    if (!Files.exists(parent)) {
+                        Files.createDirectories(parent);
+                    }
                     Files.copy(zis, file, StandardCopyOption.REPLACE_EXISTING);
+                    Files.setLastModifiedTime(file, 
FileTime.fromMillis(entry.getTime()));
                 }
-                Files.setLastModifiedTime(file, 
FileTime.fromMillis(entry.getTime()));
                 entry = zis.getNextEntry();
             }
         }
+
+        // Set directory timestamps after all files have been extracted to 
avoid them being
+        // updated by file creation operations
+        for (Map.Entry<Path, Long> dirEntry : directoryTimestamps.entrySet()) {
+            Files.setLastModifiedTime(dirEntry.getKey(), 
FileTime.fromMillis(dirEntry.getValue()));

Review Comment:
   It seems that, based on this logic, the `LastModifiedTime` will always be 
earlier than the creation timestamp. I cannot rule out undefined behavior, such 
as automatic correction using the max(creation time, last modified time), 
ignoring the value altogether, or even raising an exception.



##########
src/main/java/org/apache/maven/buildcache/CacheUtils.java:
##########
@@ -165,13 +167,26 @@ public static boolean zip(final Path dir, final Path zip, 
final String glob) thr
                     "*".equals(glob) ? null : 
FileSystems.getDefault().getPathMatcher("glob:" + glob);
             Files.walkFileTree(dir, new SimpleFileVisitor<Path>() {
 
+                @Override
+                public FileVisitResult preVisitDirectory(Path path, 
BasicFileAttributes attrs) throws IOException {
+                    if (!path.equals(dir)) {

Review Comment:
   Why is this necessary? Including empty directories in the zip file doesn't 
provide much value, and non-empty directories will be added by the `visitFile` 
method. Additionally, once a glob pattern is provided, this logic should adhere 
to the method contract and verify compliance with the glob.



##########
src/main/java/org/apache/maven/buildcache/xml/CacheConfigImpl.java:
##########
@@ -575,7 +575,24 @@ public String getLocalRepositoryLocation() {
     public List<DirName> getAttachedOutputs() {
         checkInitializedState();
         final AttachedOutputs attachedOutputs = 
getConfiguration().getAttachedOutputs();
-        return attachedOutputs == null ? Collections.emptyList() : 
attachedOutputs.getDirNames();
+        if (attachedOutputs == null) {
+            return getDefaultAttachedOutputs();
+        }
+        return attachedOutputs.getDirNames();
+    }
+
+    private List<DirName> getDefaultAttachedOutputs() {

Review Comment:
   1) This should be an opt-in feature. For example, CI builds do not need to 
restore classes. Restoring classes will slow  down the builds in projects with 
large compile outputs or slow files systems.
   2) More reliable approach is to restore output directories based on the 
project model instead of using hardcoded settings.
   
   



##########
src/main/java/org/apache/maven/buildcache/CacheUtils.java:
##########
@@ -193,16 +209,27 @@ public static void unzip(Path zip, Path out) throws 
IOException {
                     throw new RuntimeException("Bad zip entry");
                 }
                 if (entry.isDirectory()) {
-                    Files.createDirectory(file);
+                    if (!Files.exists(file)) {

Review Comment:
   `createDirectory` throws 
[FileAlreadyExistsException](https://docs.oracle.com/javase/8/docs/api/java/nio/file/FileAlreadyExistsException.html),
 ensuring that the zip will be unpacked to a clean directory. Adding 
`Files.exists(file)` means that the zip could be unpacked to a directory 
already containing classes (say from a previous branch), which could lead to an 
erroneous result. 
   
   Also - restoring added empty directories from the zip file requires 
additional I/O, which is an unnecessary slowdown.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to