Re: [gradle-dev] Task Optimization

Steve Appling Mon, 28 Sep 2009 05:17:52 -0700


Adam Murdoch wrote:

Steve Appling wrote:
Adam Murdoch wrote:
Hi,
It sounds to me like the generic solution might actually be easier than the hard-coded solution, once you chase down all the edge cases, and will also end up more accurate and reusable. Given that we want to throw away the hard-coded solution as soon as 0.8 is out and replace it with a generic solution, I wonder if it's worth pursuing the hard-coded solution at all.
Hans Dockter wrote:
Hi,
I have implemented a task optimization functionality that we might put into 0.8. I have uploaded my branch to: http://github.com/hansd/gradle/tree/optim
A couple of comments:
1.) The task history is now stored in gradle user home with some hash that relates it to the actual project. The base for the hash is the path of the root dir. We might have issues if a subproject takes part in multiple multi-project builds, if the output is sensitive to the respective multi-project build. The only way I see to solve such a problem, would be to have multiple output dirs.
We want a unique identifier for the build, not for the project. At this stage, the settings dir path would do. Or the project dir of the root project.
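
To make the "identifier for the build" idea concrete, here is a minimal sketch in Python (not Gradle's actual code). It assumes the identifier is simply a hash of the settings dir path, as suggested above; the function names and the `taskHistory` directory name are hypothetical.

```python
import hashlib
from pathlib import Path


def build_id(settings_dir: Path) -> str:
    """Hypothetical build identifier: SHA-1 of the settings dir's absolute path."""
    canonical = str(settings_dir.resolve())
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()


def history_dir(gradle_user_home: Path, settings_dir: Path) -> Path:
    """Where the task history for one build would live under the user home.

    "taskHistory" is an assumed directory name, used only for illustration.
    """
    return gradle_user_home / "taskHistory" / build_id(settings_dir)
```

Because the hash is taken over the build's settings dir rather than a project dir, a subproject that takes part in two multi-project builds would get two separate histories, one per build.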
We change the build directories for a project based off of several conditions to effectively build different products in the same suite from the same collection of sub-projects. For this to not cause problems for us, I think we would need the task history to actually go somewhere under the build directory. This would have the added "benefit" that the task history would be removed when you did a clean, so you would no longer need the doesOutputExists() method - which I think is just there to handle cleans after successful task execution.
There's a couple of problems with storing the state under the build directory and using its existence to decide whether to rebuild or not:
- It doesn't work for tasks that generate output outside the build directory. For example, in Gradle's build the install task generates its output in the $gradle_installPath directory. If you do a clean, then next time install is executed, it will reinstall the distribution, regardless of whether anything has changed since last install. Or, if you install, then delete the install directory, the install task will not reinstall the distribution without a clean being executed.

This is a very good point.

- It loses history. I'd like to collect profiling information in the history, so we can use it for things like reporting, and task scheduling, and providing better execution feedback on the various UIs. Storing this in the build directory isn't going to work.
I think your problem is better solved instead by making the artifacts the first-class citizens of the history store, rather than tasks. That is, for a given output file/directory we store the identifier of the task which produced it, plus the input which that task used. Then, we skip the execution of a task if its output files were most recently built by that task with the same input it has now.

>
> The task identifier is some combination of build identifier + task path.
> The input is some aggregate of the task's input properties and files.
>
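
A minimal sketch of this artifact-centric store, in Python and purely illustrative (it is not taken from the optim branch). The names `input_fingerprint` and `ArtifactHistory` are hypothetical, file contents are passed in as bytes to keep the example self-contained, and the `exists` callback stands in for a real file-system check.

```python
import hashlib


def input_fingerprint(properties: dict, files: dict) -> str:
    """Aggregate input properties and file contents into a single hash.

    `files` maps a path to its content as bytes (a stand-in for reading disk).
    """
    h = hashlib.sha1()
    for name in sorted(properties):
        h.update(f"prop:{name}={properties[name]!r}\n".encode("utf-8"))
    for path in sorted(files):
        h.update(f"file:{path}:".encode("utf-8"))
        h.update(hashlib.sha1(files[path]).digest())
    return h.hexdigest()


class ArtifactHistory:
    """For each output path, remember which task produced it and with what input."""

    def __init__(self):
        self._last_producer = {}  # output path -> (task_id, fingerprint)

    def record(self, task_id, fingerprint, outputs):
        for out in outputs:
            self._last_producer[out] = (task_id, fingerprint)

    def is_up_to_date(self, task_id, fingerprint, outputs, exists):
        # Skip only if every output still exists and was most recently
        # built by this task with the same input. A task that declares
        # no outputs is never skipped.
        return bool(outputs) and all(
            exists(out) and self._last_producer.get(out) == (task_id, fingerprint)
            for out in outputs
        )
```

Since the check looks at the outputs themselves, a deleted install directory forces a re-run, and the history can live outside the build directory without depending on a clean to invalidate it.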

I generally like this solution, but we may have another wrinkle. We have some tasks in different sub-projects that contribute to the same output directory. As long as you are matching both the task and the output directory (and allow the history to contain multiple tasks with the same output directory and multiple output directories for a single task) I think this will work.
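
One way to picture that matching rule is a history keyed on the (task, output directory) pair, so that two tasks writing to the same directory do not invalidate each other. This is only a sketch in Python under that assumption; `PairKeyedHistory` and the task paths used with it are hypothetical.

```python
class PairKeyedHistory:
    """History keyed on (task id, output directory) pairs."""

    def __init__(self):
        self._records = {}  # (task_id, output_dir) -> input fingerprint

    def record(self, task_id, output_dirs, fingerprint):
        # One task may record several output directories.
        for out in output_dirs:
            self._records[(task_id, out)] = fingerprint

    def is_up_to_date(self, task_id, output_dirs, fingerprint):
        # Every (task, directory) pair must have been recorded with the
        # same fingerprint; a task with no outputs is never up to date.
        return bool(output_dirs) and all(
            (task_id, out) in self._records
            and self._records[(task_id, out)] == fingerprint
            for out in output_dirs
        )
```

With this keying, recording task `:a:docs` and then `:b:docs` against the same directory leaves both up to date, which a store keyed on the directory alone would not.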

On a related topic, I really don't like all of the script cache information to be stored under the user home directory. It seems that putting this under a .gradle in the root project would be better. That way the script caches go away when a project directory is deleted. I currently have 745 directories directly under my home/scriptCache directory.
Is it the fact that the scripts are cached under ~/.gradle that you don't like, or the fact that they aren't being cleaned up when they are no longer needed?

It is really just that they are never cleaned up that bothers me.

I think we have a similar problem under ~/.gradle/wrapper and ~/.gradle/cache.
There's a few problems with moving the scripts to the root project dir:

- It doesn't solve the problem for ~/.gradle/wrapper and ~/.gradle/cache.
- It doesn't solve the problem for scripts which are compiled before we know the root project dir, such as init scripts.
- It doesn't work for read-only workspaces.
There may not be quite as many files under ~/.gradle/wrapper and ~/.gradle/cache, but they take up much more space. It would be nice to come up with a solution which cleaned up everything we cache.
Some possible solutions:

- A task or command-line option which garbage collects ~/.gradle.
- The gradle command periodically garbage collects ~/.gradle, based on some threshold. This could be number of invocations since last garbage collect, time since last garbage collect, total size of ~/.gradle, or free disk space.
- We garbage collect a cache whenever we write to it (no more than once per build).
- Don't cache anything under ~/.gradle. For example, store everything under the root project dir, including the ivy cache. For those things where we don't know the root project dir, store in a .gradle dir in the directory containing the thing.

I would have said that I prefer this, but it doesn't handle read-only workspaces or init scripts. I don't know how important this is. This solution also will duplicate the downloaded ivy files for different projects, which is in line with my desire to keep project information together, but will slow things down in general :(.

All things considered, I guess I would vote for a task or command line option to garbage collect everything (perhaps it can get rid of the silly temp/groovy-generated directories as well). I don't want to take the time to do this on each build - it's slow enough already. All of this is a very minor concern - we can certainly do this later. Thanks for explaining some of the reasons behind it.
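
As a rough illustration of what such an explicit garbage-collect step could do, here is a Python sketch that removes top-level cache entries older than a given age. The function name, the age-based policy, and the use of modification time as a staleness signal are all assumptions; a directory's modification time changes only when entries are added to or removed from it, so a real implementation would likely need a better "last used" marker.

```python
import shutil
import time
from pathlib import Path


def collect_garbage(cache_root: Path, max_age_days: float, now=None) -> list:
    """Delete top-level entries under cache_root older than max_age_days.

    Returns the names of the entries that were removed, in sorted order.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    removed = []
    if not cache_root.is_dir():
        return removed
    for entry in sorted(cache_root.iterdir()):
        # lstat() so that a dangling symlink does not raise.
        if entry.lstat().st_mtime < cutoff:
            if entry.is_dir() and not entry.is_symlink():
                shutil.rmtree(entry)
            else:
                entry.unlink()
            removed.append(entry.name)
    return removed
```

It would be run on demand, for example against the scriptCache directory with a 30-day limit, so that no time is spent on it during ordinary builds.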

We could probably combine some of these.
Overall I like this approach and think that it can really help. I am concerned about introducing this at the last minute, however. I also think that it needs something like the @Input annotation that Adam suggested before it is really useful. If you want to implement some of these suggested changes and delay (yet again) to test this some more, then we will be glad to try it out in our project and give you more feedback. I think it would probably be wiser to move this to early 0.9.
I'm keen to get started on this as soon as 0.8 is out.


Adam


--
Steve Appling
Automated Logic Research Team

