Re: [gradle-dev] Task Optimization

Adam Murdoch Thu, 24 Sep 2009 17:10:53 -0700

Hi,

It sounds to me like the generic solution might actually be easier than the hard-coded solution, once you chase down all the edge cases, and will also end up more accurate and reusable. Given that we want to throw away the hard-coded solution as soon as 0.8 is out and replace it with a generic solution, I wonder if it's worth pursuing the hard-coded solution at all.



Hans Dockter wrote:

Hi,
I have implemented a task optimization functionality that we might put into 0.8. I have uploaded my branch to: http://github.com/hansd/gradle/tree/optim
A couple of comments:
1.) The task history is now stored in the gradle user home with a hash that relates it to the actual project. The base for the hash is the path of the root dir. We might have issues if a subproject takes part in multiple multi-project builds, if the output is sensitive to the respective multi-project build. The only way I see to solve such a problem would be to have multiple output dirs.

We want a unique identifier for the build, not for the project. At this stage, the settings dir path would do. Or the project dir of the root project.

2.) Each task now has a doesOutputExists() method which defaults to false. So far all archive tasks have a custom implementation which checks for the existence of the archive. The test task also has a custom implementation which checks for at least one test results file. I hope that we find a way to automate this in 0.9 by introducing a generic notion of task output.

We already have the notion to some degree: properties can be marked with @OutputFile and @OutputDirectory. The default doesOutputExists() could make use of these.
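As a rough sketch of what such a default could look like: the code below scans a task's public getters for output annotations and reports whether every declared output exists. The annotations are re-declared locally so the example is self-contained; they only stand in for @OutputFile and @OutputDirectory. The class OutputCheckSketch, the JarLikeTask example, and the assumption that annotated getters take no arguments and return a File are all hypothetical.

```java
import java.io.File;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

public class OutputCheckSketch {
    // Local stand-ins for the annotations mentioned above (assumption:
    // they are placed on no-arg getters that return a File).
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface OutputFile {}

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface OutputDirectory {}

    // Returns true only if the task declares at least one output and every
    // declared output exists. A task with no declared outputs yields false,
    // matching the current default of doesOutputExists().
    static boolean doesOutputExists(Object task) throws ReflectiveOperationException {
        boolean sawOutput = false;
        for (Method m : task.getClass().getMethods()) {
            boolean isFile = m.isAnnotationPresent(OutputFile.class);
            boolean isDir = m.isAnnotationPresent(OutputDirectory.class);
            if (!isFile && !isDir) {
                continue;
            }
            sawOutput = true;
            File out = (File) m.invoke(task);
            if (out == null) {
                return false;
            }
            if (isFile && !out.isFile()) {
                return false;
            }
            if (isDir && !out.isDirectory()) {
                return false;
            }
        }
        return sawOutput;
    }

    // Hypothetical task with a single archive output.
    public static class JarLikeTask {
        private final File archive;

        public JarLikeTask(File archive) {
            this.archive = archive;
        }

        @OutputFile
        public File getArchive() {
            return archive;
        }
    }
}
```

With something like this, the archive tasks would not need their own implementation; the test task would still need a custom check, since "at least one results file" is more than "the directory exists".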

3.) So far there are onlyIf implementations only for the test and the jar task provided by the Java plugin. Tomorrow I will add an onlyIf modification for the test task for when the Groovy plugin is applied. For 0.9 we want to automate the onlyIf statements based on the information we have on the input arguments of a task.
4.) What about the other tasks? For java compile the Ant javac task has its own optimization checking for changed files. I'm not sure about groovyc, I need to check. The Ant Javadoc/Groovydoc tasks do not check for changed files. To optimize them we would need to check for changed source files. The same is true for the code quality stuff. I'm not sure whether I will have time to get this done before 0.8. I would use Tom's change detection stuff. I haven't had a look at that yet. For 0.9 I guess the SourceSets will be a good place for source change detection. For 0.8 it might be already good enough to distinguish between no changes/do nothing and do the full thing.

I think you can pretty quickly do something general for all tasks with file inputs:

- In the onlyIf predicate, calculate the set of (file path, timestamp) for all input files in the history. You could create a hash from this.

- In the onlyIf predicate, skip the task if the input files hash == the input files hash from last successful execution and task.doesOutputExists()


- execute the task

- store the input files hash in the history.
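The steps above could be sketched roughly as follows. This is only an illustration: the class InputHashSketch, the method names hashInputs and canSkip, and the choice of SHA-1 are assumptions, and how the hash is read from and written to the history is left out.

```java
import java.io.File;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

public class InputHashSketch {
    // Builds one hash from the (file path, timestamp) pair of every input file.
    static String hashInputs(Collection<File> inputFiles) {
        List<String> entries = new ArrayList<>();
        for (File f : inputFiles) {
            entries.add(f.getAbsolutePath() + "\n" + f.lastModified());
        }
        // Sort so the hash does not depend on the iteration order of the inputs.
        Collections.sort(entries);
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            for (String entry : entries) {
                md.update(entry.getBytes(StandardCharsets.UTF_8));
                md.update((byte) 0); // separator between entries
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // The onlyIf decision: skip only when the inputs match the last successful
    // execution and the output is still there. A null history entry (task never
    // ran successfully) never allows a skip.
    static boolean canSkip(String currentHash, String lastSuccessfulHash, boolean outputExists) {
        return outputExists && currentHash.equals(lastSuccessfulHash);
    }
}
```

The task would run when canSkip returns false, and the current hash would be written to the history only after the task has executed successfully.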

5.) The onlyIf optimization needs to be disabled if any build.gradle which is part of the multi-project build, the settings.gradle or an init.gradle changes. Therefore a ScriptSource object now has a method hasChanged which defaults to true. The DefaultScriptCompilerFactory sets it to false if a script is read from the cache. I'm not very happy about the latter mechanism. To me this looks like a hint that the ScriptSource should be responsible for the compilation, instead of the compile class having a side effect on the state of ScriptSource. I will think about this in more detail tomorrow.

I think a better approach is to use the properties of the task. This is more accurate, in that it catches changes to the task configuration that aren't the result of changes to the build/init/settings scripts. Some types of changes we don't catch by checking if the scripts have changed:

* Task is configured using -PsomeProperty=value, and that value is different to last execution.
* Task is configured using system property, and that value is different to last execution.
* Task is configured based on the DAG, and the DAG contains different tasks to last execution.
* Task is configured by a 3rd party plugin, and that plugin has changed since last execution
* Task is configured by buildSrc code, and that code has changed since last execution
* Task is configured using properties from an imported build.xml, and that build.xml has changed
* Task is configured using properties from gradle.properties, ...
* ... you get the idea ...

So, checking whether the scripts have changed since last execution doesn't come close to accurately detecting if we need to re-execute a task. It also means we unnecessarily re-execute tasks when an unrelated change has been made to the build script.

I think accuracy is really important with this stuff. It absolutely must be reliable, or people will just run clean all the time to get a reliable build. We want to avoid this.

I would suggest instead that we add an @Input annotation which one can use to mark up the properties of a task which contribute in some significant way to the output of the task. The input of a task is stored in the history, and the set of input files is simply treated as one piece of input.
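To make the @Input idea a bit more concrete, here is a minimal sketch. The annotation is declared locally as a stand-in for the proposed one, and InputSnapshotSketch, snapshot, inputsUnchanged and CompileLikeTask are hypothetical names. It assumes annotated properties are no-arg getters whose values have a stable string form, and it models the input files as a single @Input property holding their hash.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

public class InputSnapshotSketch {
    // Local stand-in for the proposed @Input annotation.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Input {}

    // Collects the current value of every @Input property, keyed by getter name.
    // This map is what would be stored in the task history.
    static Map<String, String> snapshot(Object task) throws ReflectiveOperationException {
        Map<String, String> values = new TreeMap<>();
        for (Method m : task.getClass().getMethods()) {
            if (m.isAnnotationPresent(Input.class)) {
                values.put(m.getName(), String.valueOf(m.invoke(task)));
            }
        }
        return values;
    }

    // True when the task's inputs match the snapshot from the last execution.
    // A null snapshot (no history) never matches.
    static boolean inputsUnchanged(Object task, Map<String, String> lastSnapshot)
            throws ReflectiveOperationException {
        return snapshot(task).equals(lastSnapshot);
    }

    // Hypothetical task: one configuration property plus the input files hash.
    public static class CompileLikeTask {
        private final String targetVersion;
        private final String inputFilesHash;

        public CompileLikeTask(String targetVersion, String inputFilesHash) {
            this.targetVersion = targetVersion;
            this.inputFilesHash = inputFilesHash;
        }

        @Input
        public String getTargetVersion() {
            return targetVersion;
        }

        @Input
        public String getInputFilesHash() {
            return inputFilesHash;
        }
    }
}
```

Because the comparison is on the resulting property values, it does not matter whether a value came from a build script, a -P property, a system property or a plugin.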

6.) The GradleInternal class now exposes the settings and the init script ScriptSource objects. It also provides a convenience method to check whether any ScriptSource object has changed. To get hold of the settings object it registers as a BuildListener. I think there should be a better way. I will think more about this tomorrow.


Remove the settings file, perhaps? :)

I'm not completely sure whether we want to push this into 0.8 or not. Feedback is welcome.


I don't think it will be reliable enough.


Adam

