Adam Murdoch wrote:


Steve Appling wrote:


Adam Murdoch wrote:
Hi,

It sounds to me like the generic solution might actually be easier than the hard-coded solution, once you chase down all the edge cases, and will also end up more accurate and reusable. Given that we want to throw away the hard-coded solution as soon as 0.8 is out and replace it with a generic solution, I wonder if it's worth pursuing the hard-coded solution at all.


Hans Dockter wrote:
Hi,

I have implemented task optimization functionality that we might put into 0.8. I have uploaded my branch to: http://github.com/hansd/gradle/tree/optim

A couple of comments:

1.) The task history is now stored in the gradle user home, under a hash that relates it to the actual project. The base for the hash is the path of the root dir. We might have issues if a subproject takes part in multiple multi-project builds and its output is sensitive to the respective multi-project build. The only way I see to solve such a problem would be to have multiple output dirs.

We want a unique identifier for the build, not for the project. At this stage, the settings dir path would do. Or the project dir of the root project.
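To make the idea of a per-build identifier concrete, here is a rough sketch that hashes the settings dir path and uses the result to pick a history directory under the gradle user home. The function names, the choice of SHA-1, and the "taskHistory" directory name are all assumptions made for illustration; none of them come from the branch.

```python
import hashlib
import posixpath
from pathlib import Path


def build_id(settings_dir: str) -> str:
    """Return a stable identifier for a build, derived from its settings dir path."""
    # Normalise the path text only (no file system access), so that a
    # trailing slash or a "./" segment does not change the identifier.
    normalised = posixpath.normpath(settings_dir)
    return hashlib.sha1(normalised.encode("utf-8")).hexdigest()


def history_dir(gradle_user_home: str, settings_dir: str) -> Path:
    """Locate the task history for one build under the gradle user home."""
    # "taskHistory" is a made-up directory name for this sketch.
    return Path(gradle_user_home) / "taskHistory" / build_id(settings_dir)
```

Because the identifier depends only on the settings dir, a subproject that takes part in two multi-project builds gets two separate histories, one per build.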

We change the build directories for a project based on several conditions to effectively build different products in the same suite from the same collection of sub-projects. For this to not cause problems for us, I think we would need the task history to actually go somewhere under the build directory. This would have the added "benefit" that the task history would be removed when you do a clean, so you would no longer need the doesOutputExists() method - which I think is just there to handle cleans after successful task execution.


There are a couple of problems with storing the state under the build directory and using its existence to decide whether to rebuild or not:

- It doesn't work for tasks that generate output outside the build directory. For example, in Gradle's build the install task generates its output in the $gradle_installPath directory. If you do a clean, then the next time install is executed, it will reinstall the distribution, regardless of whether anything has changed since the last install. Or, if you install, then delete the install directory, the install task will not reinstall the distribution without a clean being executed.
This is a very good point.


- It loses history. I'd like to collect profiling information in the history, so we can use it for things like reporting, and task scheduling, and providing better execution feedback on the various UIs. Storing this in the build directory isn't going to work.

I think your problem is better solved instead by making the artifacts the first-class citizens of the history store, rather than tasks. That is, for a given output file/directory we store the identifier of the task which produced it, plus the input which that task used. Then, we skip the execution of a task if its output files were most recently built by that task with the same input it has now.

The task identifier is some combination of build identifier + task path.
The input is some aggregate of the task's input properties and files.

I generally like this solution, but we may have another wrinkle. We have some tasks in different sub-projects that contribute to the same output directory. As long as you are matching both the task and the output directory (and allow the history to contain multiple tasks with the same output directory and multiple output directories for a single task) I think this will work.
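A minimal sketch of what an artifact-keyed history might look like, assuming an in-memory map and a SHA-1 aggregate of the inputs. The class and function names and the task identifier format are hypothetical. Entries are keyed by (output path, task identifier), so several tasks may share one output directory and one task may own several outputs. Checking that the outputs still exist on disk is left out.

```python
import hashlib
from typing import Dict, Iterable, Tuple


def input_hash(properties: Dict[str, str], file_digests: Dict[str, str]) -> str:
    """Aggregate a task's input properties and input file digests into one hash."""
    h = hashlib.sha1()
    for kind, items in (("prop", properties), ("file", file_digests)):
        # Sort the keys so the aggregate does not depend on insertion order.
        for key in sorted(items):
            h.update(f"{kind}\0{key}\0{items[key]}\0".encode("utf-8"))
    return h.hexdigest()


class ArtifactHistory:
    """Maps (output path, task id) to the input hash used for the last build."""

    def __init__(self) -> None:
        self._entries: Dict[Tuple[str, str], str] = {}

    def record(self, task_id: str, outputs: Iterable[str], inputs: str) -> None:
        """Remember that task_id produced each output from the given inputs."""
        for output in outputs:
            self._entries[(output, task_id)] = inputs

    def is_up_to_date(self, task_id: str, outputs: Iterable[str], inputs: str) -> bool:
        """True if every output was last recorded for this task with these inputs."""
        outputs = list(outputs)
        # A task that declares no outputs is never skipped.
        return bool(outputs) and all(
            self._entries.get((output, task_id)) == inputs for output in outputs
        )
```

With this shape, two tasks from different sub-projects can both record the same output directory without overwriting each other's entry.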

On a related topic, I really don't like all of the script cache information being stored under the user home directory. It seems that putting this under a .gradle in the root project would be better. That way the script caches go away when a project directory is deleted. I currently have 745 directories directly under my home/scriptCache directory.


Is it the fact that the scripts are cached under ~/.gradle that you don't like, or the fact that they aren't being cleaned up when they are no longer needed?
It is really just that they are never cleaned up that bothers me.


I think we have a similar problem under ~/.gradle/wrapper and ~/.gradle/cache.

There are a few problems with moving the scripts to the root project dir:

- It doesn't solve the problem for ~/.gradle/wrapper and ~/.gradle/cache.

- It doesn't solve the problem for scripts which are compiled before we know the root project dir, such as init scripts.

- It doesn't work for read-only workspaces.

There may not be quite as many files under ~/.gradle/wrapper and ~/.gradle/cache, but they take up much more space. It would be nice to come up with a solution which cleaned up everything we cache.

Some possible solutions:

- A task or command-line option which garbage collects ~/.gradle.

- The gradle command periodically garbage collects ~/.gradle, based on some threshold. This could be the number of invocations since the last garbage collection, the time since the last garbage collection, the total size of ~/.gradle, or the free disk space.

- We garbage collect a cache whenever we write to it (no more than once per build).

- Don't cache anything under ~/.gradle. For example, store everything under the root project dir, including the ivy cache. For those things where we don't know the root project dir, store in a .gradle dir in the directory containing the thing.
I would have said that I prefer this, but it doesn't handle read-only workspaces or init scripts. I don't know how important this is. This solution will also duplicate the downloaded ivy files for different projects, which is in line with my desire to keep project information together, but will slow things down in general :(.

All things considered, I guess I would vote for a task or command-line option to garbage collect everything (perhaps it can get rid of the silly temp/groovy-generated directories as well). I don't want to take the time to do this on each build - it's slow enough already. All of this is a very minor concern - we can certainly do this later. Thanks for explaining some of the reasons behind it.


We could probably combine some of these.
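As one way the options might combine, the sketch below pairs an explicit collection step (what a task or command-line option would call) with a time-based threshold check (what a periodic collector would call). Treating modification time as the sign that an entry is no longer needed is an assumption, as are the function names and the 30-day default.

```python
import os
import shutil
import time
from typing import List, Optional

DAY = 24 * 60 * 60  # seconds


def stale_entries(cache_dir: str, max_age_days: float, now: Optional[float] = None) -> List[str]:
    """Return top-level entries of cache_dir not modified within max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * DAY
    return sorted(
        entry.path for entry in os.scandir(cache_dir)
        if entry.stat().st_mtime < cutoff
    )


def collect_garbage(cache_dir: str, max_age_days: float = 30,
                    now: Optional[float] = None) -> List[str]:
    """Delete stale entries and return the paths that were removed."""
    removed = stale_entries(cache_dir, max_age_days, now)
    for path in removed:
        if os.path.isdir(path) and not os.path.islink(path):
            shutil.rmtree(path)
        else:
            os.remove(path)
    return removed


def should_collect(last_collected: float, min_interval_days: float,
                   now: Optional[float] = None) -> bool:
    """Threshold check: has enough time passed since the last collection?"""
    now = time.time() if now is None else now
    return now - last_collected >= min_interval_days * DAY
```

An explicit command would call collect_garbage directly, while the gradle command could call it only when should_collect returns true, which keeps the cost off most builds.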


Overall I like this approach and think that it can really help. I am concerned about introducing this at the last minute, however. I also think that it needs something like the @Input annotation that Adam suggested before it is really useful. If you want to implement some of these suggested changes and delay (yet again) to test this some more, then we will be glad to try it out in our project and give you more feedback. I think it would probably be wiser to move this to early 0.9.


I'm keen to get started on this as soon as 0.8 is out.


Adam


--
Steve Appling
Automated Logic Research Team
