[ https://issues.apache.org/jira/browse/MNG-7389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17472093#comment-17472093 ]
Thomas Skjølberg commented on MNG-7389:
---------------------------------------

[~michael-o] please elaborate. Is the dependencies component not the right home for managing the cache?

> Incremental .m2 cache cleanup for CI
> ------------------------------------
>
>                 Key: MNG-7389
>                 URL: https://issues.apache.org/jira/browse/MNG-7389
>             Project: Maven
>          Issue Type: New Feature
>          Components: Dependencies
>            Reporter: Thomas Skjølberg
>            Priority: Minor
>
> One or more popular continuous integration services are unable to properly
> manage the .m2 repository cache, resulting in wasted resources in the form
> of increased CI runtime and bandwidth consumption.
>
> *CircleCI cache behaviour:*
> - immutable cache entries
> - default behaviour is to wipe the cache each time a pom file is modified
> (i.e. using pom hash as a cache key)
> - cache entries TTL > weeks
>
> So CircleCI always has a cache containing only the necessary artifacts, but
> has to download all dependencies every time the pom file changes.
>
> *GitHub Actions cache behaviour:*
> - (effectively) mutable cache entries
> - incremental cache (if it gets too big, it is wiped)
> - cache entries TTL 1 week
>
> So GitHub Actions works well if the cache entries expire from time to time;
> otherwise the cache keeps growing.
>
> *Summary*
>
> Perhaps this does not look so bad at first glance, but for a project under
> active development, with a lot of artifacts, the pom file changes often. For
> example, we have apps with 100 dependencies and automatic dependency bumping
> via Renovate, in addition to a hierarchy of libraries.
>
> Key takeaways: time is wasted on
> - saving caches in CI
> - loading caches in CI
> - loading artifacts from the external artifact store
>
> This happens quite a lot. From the artifact store perspective, this probably
> multiplies the load by a factor of 10.
>
> Possible solution: a way to define a "transaction" for artifact use, i.e.
> 1. run command to mark start of transaction
> 2. run one or more maven commands
> 3. run command to mark end of transaction, deleting artifacts not in use.
>
> For reference, Gradle has the same problem.
>
> Proof of concept:
> * CircleCI: [https://github.com/entur/maven-orb]
> * GitHub Actions: [https://github.com/skjolber/tidy-cache-github-action]
>
> The implementation uses instrumentation to record artifact access, then
> deletes the artifacts not recorded.
>
> *Alternatives:*
>
> I tried the last-accessed file timestamp first, but it turns out most CI
> filesystems are mounted without that option. However, it should also be
> possible to update the modified timestamp and/or record read access in some
> existing metadata file.
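
To illustrate the "end of transaction" step described in the issue, here is a minimal sketch in Python. It is not the code used by the linked proof-of-concept projects, and nothing like it exists in Maven today. The function name `prune_unused` and its parameters are hypothetical, and it assumes the set of accessed artifact files has already been recorded by some other means (for example, the instrumentation mentioned in the issue).

```python
from pathlib import Path
from typing import Iterable, List


def prune_unused(repo_root: Path, accessed: Iterable[Path]) -> List[Path]:
    """Delete every file under repo_root that is not listed in `accessed`.

    `accessed` is assumed to hold the files recorded as read during the
    "transaction"; how they were recorded is outside this sketch.
    Returns the list of files that were removed.
    """
    keep = {p.resolve() for p in accessed}
    removed = []

    # Pass 1: remove files that were never accessed.
    for path in sorted(repo_root.rglob("*")):
        if path.is_file() and path.resolve() not in keep:
            path.unlink()
            removed.append(path)

    # Pass 2: remove directories left empty, deepest first, so that a
    # parent emptied by removing its children is also cleaned up.
    directories = [p for p in repo_root.rglob("*") if p.is_dir()]
    for directory in sorted(directories, key=lambda p: len(p.parts), reverse=True):
        if not any(directory.iterdir()):
            directory.rmdir()

    return removed
```

In a CI job this would run after the Maven commands and before the cache is saved, so that the saved cache holds only what the build actually used.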