Hi,

It sounds to me like the generic solution might actually be easier than the hard-coded solution, once you chase down all the edge cases, and will also end up more accurate and reusable. Given that we want to throw away the hard-coded solution as soon as 0.8 is out and replace it with a generic solution, I wonder if it's worth pursuing the hard-coded solution at all.


Hans Dockter wrote:
Hi,

I have implemented a task optimization functionality that we might put into 0.8. I have uploaded my branch to: http://github.com/hansd/gradle/tree/optim

A couple of comments:

1.) The task history is now stored in the gradle user home, under a hash that relates it to the actual project. The base for the hash is the path of the root dir. We might have issues if a subproject takes part in multiple multi-project builds and its output is sensitive to the respective multi-project build. The only way I see to solve such a problem would be to have multiple output dirs.

We want a unique identifier for the build, not for the project. At this stage, the settings dir path would do. Or the project dir of the root project.
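To illustrate keying the history on the build rather than the project, here is a minimal sketch. It assumes the settings dir path identifies the build; the class name BuildHistoryLocation and the "task-history" directory name are made up for the example and are not part of Gradle.

```java
import java.io.File;
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not Gradle API: maps a build to its history directory.
final class BuildHistoryLocation {
    private BuildHistoryLocation() {}

    // The build is identified by the absolute path of its settings dir,
    // so a subproject shared by two builds gets two separate histories.
    static File forBuild(File gradleUserHome, File settingsDir) {
        String key = sha1Hex(settingsDir.getAbsolutePath());
        // "task-history" is an assumed directory name for this sketch.
        return new File(new File(gradleUserHome, "task-history"), key);
    }

    private static String sha1Hex(String text) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-1");
            byte[] hash = digest.digest(text.getBytes(StandardCharsets.UTF_8));
            // A SHA-1 digest is 20 bytes, i.e. 40 hex digits once zero-padded.
            return String.format("%040x", new BigInteger(1, hash));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 is not available", e);
        }
    }
}
```

The same subproject then gets a different history directory in each multi-project build it takes part in.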


2.) Each task now has a doesOutputExists() method, which defaults to false. So far, all archive tasks have a custom implementation which checks for the existence of the archive. The test task also has a custom implementation which checks for at least one test results file. I hope that we find a way to automate this in 0.9 by introducing a generic notion of task output.

We already have the notion to some degree: properties can be marked with @OutputFile and @OutputDirectory. The default doesOutputExists() could make use of these.
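A rough sketch of what such a default could look like. The annotations are redeclared here as stand-ins so that the example compiles on its own, and OutputCheck and SampleTestTask are hypothetical names, not existing classes.

```java
import java.io.File;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

// Stand-ins for the real annotations, declared locally so the sketch is self-contained.
@Retention(RetentionPolicy.RUNTIME) @Target(ElementType.METHOD) @interface OutputFile {}
@Retention(RetentionPolicy.RUNTIME) @Target(ElementType.METHOD) @interface OutputDirectory {}

// Hypothetical helper: a generic default for doesOutputExists().
final class OutputCheck {
    private OutputCheck() {}

    static boolean doesOutputExists(Object task) {
        boolean declaresOutput = false;
        for (Method getter : task.getClass().getMethods()) {
            boolean isFile = getter.isAnnotationPresent(OutputFile.class);
            boolean isDirectory = getter.isAnnotationPresent(OutputDirectory.class);
            if (!isFile && !isDirectory) continue;
            // Only no-arg getters returning a File are considered.
            if (getter.getParameterCount() != 0
                    || !File.class.isAssignableFrom(getter.getReturnType())) continue;
            declaresOutput = true;
            File output;
            try {
                getter.setAccessible(true);
                output = (File) getter.invoke(task);
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot read " + getter.getName(), e);
            }
            if (output == null) return false;
            boolean exists = isFile ? output.isFile() : output.isDirectory();
            if (!exists) return false;
        }
        // Keeps the current default: no declared outputs means "false".
        return declaresOutput;
    }
}

// Hypothetical task used only to exercise the sketch.
class SampleTestTask {
    private final File resultsDir;
    SampleTestTask(File resultsDir) { this.resultsDir = resultsDir; }
    @OutputDirectory public File getResultsDir() { return resultsDir; }
}
```

Tasks with stricter needs, such as the test task wanting at least one results file, could still override the default.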


3.) So far there are onlyIf implementations only for the test and the jar task provided by the Java plugin. Tomorrow I will add an onlyIf modification for the test task when the Groovy plugin is applied. For 0.9 we want to automate the onlyIf statements based on the information we have on the input arguments of a task.

4.) What about the other tasks? For java compile the Ant javac task has its own optimization checking for changed files. I'm not sure about groovyc, I need to check. The Ant Javadoc/Groovydoc tasks do not check for changed files. To optimize them we would need to check for changed source files. The same is true for the code quality stuff. I'm not sure whether I will have time to get this done before 0.8. I would use Tom's change detection stuff. I haven't had a look at that yet. For 0.9 I guess the SourceSets will be a good place for source change detection. For 0.8 it might already be good enough to distinguish between no changes/do nothing and do the full thing.


I think you can pretty quickly do something general for all tasks with file inputs:

- In the onlyIf predicate, calculate the set of (file path, timestamp) for all input files in the history. You could create a hash from this.

- In the onlyIf predicate, skip the task if the input files hash == the input files hash from last successful execution and task.doesOutputExists()

- execute the task

- store the input files hash in the history.
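The steps above could be sketched roughly as follows. InputFilesHash is a hypothetical helper, SHA-1 is just one possible choice of hash, and reading and writing the history is left out.

```java
import java.io.File;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not Gradle API.
final class InputFilesHash {
    private InputFilesHash() {}

    // Hashes the (absolute path, last-modified timestamp) pair of every input file.
    static String of(Collection<File> inputFiles) {
        List<String> entries = new ArrayList<>();
        for (File file : inputFiles) {
            // lastModified() returns 0 for a missing file, so deletions change the hash too.
            entries.add(file.getAbsolutePath() + '\n' + file.lastModified() + '\n');
        }
        Collections.sort(entries); // the hash must not depend on iteration order
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-1");
            for (String entry : entries) {
                digest.update(entry.getBytes(StandardCharsets.UTF_8));
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 is not available", e);
        }
    }

    // Skip only if the inputs are unchanged since the last successful
    // execution AND the output is still there. A null history never matches.
    static boolean canSkip(String currentHash, String lastSuccessfulHash, boolean outputExists) {
        return outputExists && currentHash.equals(lastSuccessfulHash);
    }
}
```

The onlyIf predicate would then be the negation of canSkip(...), and the current hash is written to the history only after the task has executed successfully.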

5.) The onlyIf optimization needs to be disabled if any build.gradle which is part of the multi-project build, the settings.gradle or an init.gradle changes. Therefore a ScriptSource object now has a method hasChanged which defaults to true. The DefaultScriptCompilerFactory sets it to false if a script is read from the cache. I'm not very happy about the latter mechanism. To me this looks like a hint that the ScriptSource should be responsible for the compilation, instead of the compile class having a side effect on the state of ScriptSource. I will think about this in more detail tomorrow.


I think a better approach is to use the properties of the task. This is more accurate, in that it catches changes to the task configuration that aren't the result of changes to the build/init/settings scripts. Some types of changes we don't catch by checking whether the scripts have changed:

* Task is configured using -PsomeProperty=value, and that value is different from the last execution.
* Task is configured using a system property, and that value is different from the last execution.
* Task is configured based on the DAG, and the DAG contains different tasks from the last execution.
* Task is configured by a 3rd party plugin, and that plugin has changed since the last execution.
* Task is configured by buildSrc code, and that code has changed since the last execution.
* Task is configured using properties from an imported build.xml, and that build.xml has changed.
* Task is configured using properties from gradle.properties, ...
* ... you get the idea ...

So, checking whether the scripts have changed since last execution doesn't come close to accurately detecting if we need to re-execute a task. It also means we unnecessarily re-execute tasks when an unrelated change has been made to the build script.

I think accuracy is really important with this stuff. It absolutely must be reliable, or people will just run clean all the time to get a reliable build. We want to avoid this.

I would suggest instead that we add an @Input annotation which one can use to mark up the properties of a task which contribute in some significant way to the output of the task. The input of a task is stored in the history, and the set of input files is simply treated as one piece of input.
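As a sketch of the idea only: @Input here is the proposed annotation, declared locally so the example compiles, and InputSnapshot and SampleDocTask are made-up names for illustration.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

// The proposed annotation, declared here so the sketch is self-contained.
@Retention(RetentionPolicy.RUNTIME) @Target(ElementType.METHOD) @interface Input {}

// Hypothetical helper: captures every @Input property of a task.
final class InputSnapshot {
    private InputSnapshot() {}

    static Map<String, Object> of(Object task) {
        Map<String, Object> snapshot = new TreeMap<>();
        for (Method getter : task.getClass().getMethods()) {
            if (!getter.isAnnotationPresent(Input.class) || getter.getParameterCount() != 0) continue;
            try {
                getter.setAccessible(true);
                snapshot.put(getter.getName(), getter.invoke(task));
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot read " + getter.getName(), e);
            }
        }
        return snapshot;
    }

    // Up to date only if every input equals its value from the last
    // successful execution and the output still exists.
    static boolean isUpToDate(Map<String, Object> current,
                              Map<String, Object> lastSuccessful,
                              boolean outputExists) {
        return outputExists && current.equals(lastSuccessful);
    }
}

// Hypothetical task: the input files hash is just one more input.
class SampleDocTask {
    private final String windowTitle;
    private final String inputFilesHash;
    SampleDocTask(String windowTitle, String inputFilesHash) {
        this.windowTitle = windowTitle;
        this.inputFilesHash = inputFilesHash;
    }
    @Input public String getWindowTitle() { return windowTitle; }
    @Input public String getInputFilesHash() { return inputFilesHash; }
}
```

It does not matter where a value came from (-P, a system property, a plugin, buildSrc): if the property value differs from last time, the task runs again, and an unrelated edit to the build script changes nothing in the snapshot.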

6.) The GradleInternal class now exposes the settings and the init script ScriptSource objects. It also provides a convenience method to check whether any ScriptSource object has changed. To get hold of the settings object it registers as a BuildListener. I think there should be a better way. I will think more about this tomorrow.


Remove the settings file, perhaps? :)

I'm not completely sure whether we want to push this into 0.8 or not. Feedback is welcome.


I don't think it will be reliable enough.


Adam

