AlexanderAshitkin commented on code in PR #388:
URL:
https://github.com/apache/maven-build-cache-extension/pull/388#discussion_r2404789258
##########
src/main/java/org/apache/maven/buildcache/xml/CacheConfigImpl.java:
##########
@@ -575,7 +575,24 @@ public String getLocalRepositoryLocation() {
public List<DirName> getAttachedOutputs() {
checkInitializedState();
final AttachedOutputs attachedOutputs =
getConfiguration().getAttachedOutputs();
- return attachedOutputs == null ? Collections.emptyList() :
attachedOutputs.getDirNames();
+ if (attachedOutputs == null) {
+ return getDefaultAttachedOutputs();
+ }
+ return attachedOutputs.getDirNames();
+ }
+
+ private List<DirName> getDefaultAttachedOutputs() {
Review Comment:
> Build failures when running mvn test after cache restoration
Let me please share some background. The primary goal for the cache was to
provide lightweight and fast build for CI. Saving and restoring compilation
directories interferes with this goal, because it effectively duplicates IO and
upload efforts - it is necessary to pack and upload almost the same jar twice.
For example, Gradle, which restores classes, skips caching jar files by default
to avoid doing expensive work twice. In Maven, there is no easy way to find all
input/output directories for the project because plugins might have own
input/output configurations. Because of that, instead of caching "classes" the
choice was mad in favor of caching artifacts, for both correctness and
performance reasons.
All in all - though restoring classes might be helpful in local development,
it is better to consider this as a separate, opt-in feature. It might require
additional efforts to implement correctly. Default save / restore should be
based on the reactor introspection.
##########
src/main/java/org/apache/maven/buildcache/CacheUtils.java:
##########
@@ -193,16 +209,27 @@ public static void unzip(Path zip, Path out) throws
IOException {
throw new RuntimeException("Bad zip entry");
}
if (entry.isDirectory()) {
- Files.createDirectory(file);
+ if (!Files.exists(file)) {
+ Files.createDirectories(file);
+ }
+ directoryTimestamps.put(file, entry.getTime());
} else {
Path parent = file.getParent();
- Files.createDirectories(parent);
+ if (!Files.exists(parent)) {
+ Files.createDirectories(parent);
+ }
Files.copy(zis, file, StandardCopyOption.REPLACE_EXISTING);
+ Files.setLastModifiedTime(file,
FileTime.fromMillis(entry.getTime()));
}
- Files.setLastModifiedTime(file,
FileTime.fromMillis(entry.getTime()));
entry = zis.getNextEntry();
}
}
+
+ // Set directory timestamps after all files have been extracted to
avoid them being
+ // updated by file creation operations
+ for (Map.Entry<Path, Long> dirEntry : directoryTimestamps.entrySet()) {
+ Files.setLastModifiedTime(dirEntry.getKey(),
FileTime.fromMillis(dirEntry.getValue()));
Review Comment:
It seems that, based on this logic, the `LastModifiedTime` will always be
earlier than the creation timestamp. I cannot rule out undefined behavior, such
as automatic correction using the max(creation time, last modified time),
ignoring the value altogether, or even raising an exception.
##########
src/main/java/org/apache/maven/buildcache/CacheUtils.java:
##########
@@ -165,13 +167,26 @@ public static boolean zip(final Path dir, final Path zip,
final String glob) thr
"*".equals(glob) ? null :
FileSystems.getDefault().getPathMatcher("glob:" + glob);
Files.walkFileTree(dir, new SimpleFileVisitor<Path>() {
+ @Override
+ public FileVisitResult preVisitDirectory(Path path,
BasicFileAttributes attrs) throws IOException {
+ if (!path.equals(dir)) {
Review Comment:
Why is this necessary? Including empty directories in the zip file doesn't
provide much value, and non-empty directories will be added by the `visitFile`
method. Additionally, once a glob pattern is provided, this logic should adhere
to the method contract and verify compliance with the glob.
##########
src/main/java/org/apache/maven/buildcache/xml/CacheConfigImpl.java:
##########
@@ -575,7 +575,24 @@ public String getLocalRepositoryLocation() {
public List<DirName> getAttachedOutputs() {
checkInitializedState();
final AttachedOutputs attachedOutputs =
getConfiguration().getAttachedOutputs();
- return attachedOutputs == null ? Collections.emptyList() :
attachedOutputs.getDirNames();
+ if (attachedOutputs == null) {
+ return getDefaultAttachedOutputs();
+ }
+ return attachedOutputs.getDirNames();
+ }
+
+ private List<DirName> getDefaultAttachedOutputs() {
Review Comment:
1) This should be an opt-in feature. For example, CI builds do not need to
restore classes. Restoring classes will slow down the builds in projects with
large compile outputs or slow files systems.
2) More reliable approach is to restore output directories based on the
project model instead of using hardcoded settings.
##########
src/main/java/org/apache/maven/buildcache/CacheUtils.java:
##########
@@ -193,16 +209,27 @@ public static void unzip(Path zip, Path out) throws
IOException {
throw new RuntimeException("Bad zip entry");
}
if (entry.isDirectory()) {
- Files.createDirectory(file);
+ if (!Files.exists(file)) {
Review Comment:
`createDirectory` throws
[FileAlreadyExistsException](https://docs.oracle.com/javase/8/docs/api/java/nio/file/FileAlreadyExistsException.html),
ensuring that the zip will be unpacked to a clean directory. Adding
`Files.exists(file)` means that the zip could be unpacked to a directory
already containing classes (say from a previous branch), which could lead to an
erroneous result.
Also - restoring added empty directories from the zip file requires
additional I/O, which is an unnecessary slowdown.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]