On 30 January 2013 03:35, Luke Daley <[email protected]> wrote:

>
> On 29/01/2013, at 10:13 PM, Daz DeBoer <[email protected]>
> wrote:
>
> >
> >
> > On 29 January 2013 12:53, Adam Murdoch <[email protected]>
> wrote:
> >
> > On 30/01/2013, at 2:52 AM, Daz DeBoer wrote:
> >
> >> On 29 January 2013 00:44, Adam Murdoch <[email protected]>
> wrote:
> >>
> >> On 29/01/2013, at 10:45 AM, Marcin Erdmann wrote:
> >>
> >>> On 01/16/2013 07:05 AM, Adam Murdoch wrote:
> >>>> I wonder if we should change the plan a little. We've just added a
> `TestReport` task type that can generate the HTML report. The goal is that
> at some point we remove the reporting from the `Test` task type and use
> this task instead. To make this work well, we need to make some
> improvements to the task graph. These are the same improvements that are
> already in the spec.
> >>>>
> >>>> So, if you're interested, you could do something like this:
> >>>>
> >>>> 1. Make the task graph changes. There are two parts to this: schedule
> the dashboard task to run after the reports have been generated, without
> adding a hard dependency, and automatically add the dashboard task to the
> graph whenever any reporting task is added to the task graph.
> >>>> 2. Change the `TestReport` task to implement `Reporting` so that it
> is included in the dashboard.
> >>>> 3. When the dashboard plugin is applied, add in a `TestReport` task
> and disable the test report in the `Test` task.
> >>>> 4. At some point later, once the above is stable, move #3 to the Java
> plugin and deprecate the test report in the `Test` task.
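> >>>>
> >>>> For step 3, the wiring might look something like the sketch below (the
> `TestReport` and `Test` property names used here are assumptions, not the
> final API):
> >>>>
> >>>> // rough sketch of the dashboard plugin's wiring for the Java plugin
> >>>> plugins.withType(JavaPlugin) {
> >>>>     def aggregateReport = project.task('aggregateTestReport', type: TestReport)
> >>>>     project.tasks.withType(Test) { testTask ->
> >>>>         testTask.testReport = false // disable the per-task HTML report
> >>>>         aggregateReport.reportOn testTask // hypothetical way to feed it results
> >>>>     }
> >>>> }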
> >>>
> >>> I had a first look at how we could implement that 'always runs after'
> dependency between tasks.
> >>
> >> Thanks for looking into this.
> >>
> >>>
> >>> From what I can tell, the reordering logic should go into
> DefaultTaskExecutionPlan.addToTaskGraph(). My first idea is to check, every
> time a task is added, whether it has to run before a task that is already
> in the executionPlan map. That means that, even though the API should
> probably look like
> >>>
> >>> project.tasks.withType(Reporting).all {
> >>>    buildDashboard.runsAfter(it) // or maybe alwaysRunsAfter()? method
> name should show in a more explicit way that only a 'soft' dependency is
> defined here…
> >>> }
> >>
> >> Something we eventually should be able to do with this feature is
> declare things like:
> >>
> >> * `clean` and all its dependencies must run before anything else.
> >> * Configuration tasks should run before Validation tasks, and
> Validation tasks should run before Verification tasks, and Verification
> tasks should run before Publish tasks, and Publish tasks should run after
> everything else. For example, validate that the user's repository
> credentials are ok before running the unit and integration tests, before
> uploading the jar.
> >> * A resource clean up task must run after the tasks that use the
> resource. For example, stop jetty after the integ tests have executed (if
> it executes at all).
> >>
> >> So, there are a few things here:
> >>
> >> * Both sides of the predicate can be a collection of tasks.
> >> * The collection of tasks is a subset of the tasks in the task graph.
> >> * The predicate can be 'should run after' or 'must run after'.
> >>
> >> So, it feels like this is a constraint that should be attached to the
> task graph, rather than to individual tasks, and the above Task.runsAfter()
> method might simply be a convenience method for adding the constraint to
> the task graph.
> >>
> >> For this first story, we only need to be able to declare: task `n` must
> run after all tasks in collection `m`. We can add all the other stuff
> later. Which means we could just go with the Task.runsAfter() for now. I'd
> call it 'mustRunAfter()' or something like that.
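> >>
> >> Usage might then look something like this (task names are just for
> illustration):
> >>
> >> integTest.mustRunAfter unitTest
> >> // 'gradle integTest' would not pull unitTest into the graph,
> >> // but 'gradle unitTest integTest' would run unitTest first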
> >>
> >>
> >>>
> >>> we would need to store the information about the soft dependency on
> both tasks. It needs to be stored on the task that should run before,
> because we need to act if that task is added to the executionPlan after the
> task it should run after has already been added. It also needs to be stored
> on the task that should run after (I'll explain why in a moment). When the
> first situation happens, we should probably take the newly added task that
> should run before, together with all the tasks it depends on (also
> transitively), and move it in front of the task that should run after,
> which is already in the executionPlan. If the added task depends (also
> transitively) on the task that should be executed after it, then we should
> throw an exception (CircularReferenceException?). When moving the task and
> the tasks it depends on, we should also make sure that we're not moving any
> task that runsAfter() something in front of a task it should run after;
> that's why I believe soft dependencies should also be stored on the task
> that runs after. If that happens, we should probably throw an exception
> (CircularReferenceException?).
> >>
> >> I think the implementation depends on how 'must run after' affects the
> transitive dependencies. It would make a lot of sense if the semantics were
> the same as for the command-line, so that:
> >>
> >> gradle n m
> >>
> >> implies:
> >>
> >> 1. All tasks with name `m` must run after all tasks with name `n`.
> >> 2. Add all tasks with name `n` to the task graph.
> >> 3. Add all tasks with name `m` to the task graph.
> >>
> >> When `m` must run after `n`, then Gradle should run `n` and all its
> dependencies before `m` and all its dependencies. Any dependencies in
> common are executed before `n`, and if `m` is in the dependencies of `n`,
> then fail (where 'dependencies' == all hard and soft dependencies and all
> their dependencies).
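> >>
> >> To make that concrete (task names made up):
> >>
> >> task c; task d
> >> task n(dependsOn: c)
> >> task m(dependsOn: [c, d])
> >>
> >> With `m` must run after `n`, `gradle n m` would execute c, n, d, m: the
> shared dependency `c` runs before `n`, and the rest of `m`'s dependencies
> run between `n` and `m`.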
> >>
> >> It also depends on how failures affect these dependencies. There are
> two options. Given `n` must run after `m`:
> >>
> >> * A failure in `m` prevents `n` from being executed.
> >> * A failure in `m` does not affect whether `n` is executed or not.
> >>
> >> To keep with the command-line semantics, we would use the second option.
> >>
> >> Implementation-wise, I would think about busting up building the task
> graph into 2 steps:
> >>
> >> 1. Build the task graph proper, with a node for each task in the graph
> and edges to represent the various types of dependencies.
> >> 2. Once the graph is built, calculate the execution plan:
> >>      - Take each node that has no incoming edges, sort them and then
> traverse each in turn.
> >>      - To traverse a node
> >>              - Take each soft dependency, sort them and traverse each
> in turn.
> >>              - Take each hard dependency, sort them and traverse each
> in turn.
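> >>
> >> Step 2 in rough pseudocode (all names here are illustrative):
> >>
> >> def executionPlan = []
> >> def visited = [] as Set
> >> def traverse
> >> traverse = { node ->
> >>     if (!visited.add(node)) return
> >>     node.softDependencies.sort().each { traverse(it) } // ordering edges
> >>     node.hardDependencies.sort().each { traverse(it) } // dependsOn edges
> >>     executionPlan << node.task // post-order: dependencies land first
> >> }
> >> graph.nodes.findAll { it.incomingEdges.empty }.sort().each { traverse(it) }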
> >>
> >> Perhaps instead of "soft dependencies" we should refer to these as
> "task ordering rules", or something like that?
> >>
> >> - A "dependency" is "Task A cannot run without Task B running first"
> >> - An "ordering" is "Task B must run after Task A" (making no statement
> about whether Task A will be run or not)
> >
> > That's a pretty good term for this.
> >
> > There are, however, multiple dimensions here. Given some constraint `m`
> before `n`:
> >
> > - Must vs should. For some constraints, `n` can only run after `m` and
> for others, it's preferred but not required that `m` run first. For
> example, stopping the web container must happen after the integration
> tests. Or it's better if the tests run before publishing, but if some tests
> need to run after (e.g. smoke tests), then that's fine.
> > - What happens when `m` fails? For some constraints, `n` must not run,
> for others `n` may run. For example, we must not publish if the unit tests
> fail, but we may run the integration tests if the unit tests fail.
> > - Must `n` be present in the task graph if `m` is present? For some
> constraints, if `m` is added, then `n` must also be added. For example, if
> I add a start container task, then I must also add a stop container task.
> > - Must `m` be present in the task graph if `n` is present? For example,
> a regular dependency.
> > - Must `n` be executed if `m` is executed? For some constraints, if `m`
> is executed, then `n` must be executed as well. For example, if I start the
> container, I must stop the container as well.
> >
> > So, do we want to call "`m` must run before `n`, `n` cannot run on `m`
> failure, `n` does not need to be present if `m` is, `m` must be present if
> `n` is, `n` does not need to be executed" a "dependency" and everything
> else an "ordering"? Or call them all "orderings" (and a "dependency" is-a
> "ordering")? Something else?
> >
> >
> > It would be nice to keep these dimensions somewhat separate. I would say
> that an "ordering" includes the "must vs should" dimension, but does not
> say anything about the "if a is present then must have b" or "only run a if
> b succeeds" dimension. They are separate types of rules.
> >       • Regular dependency of X on Y
> >               • a "X must run after Y" ordering
> >               • a "given X, add Y to the graph" rule
> >               • a "do not execute X if Y fails" rule.
> >       • TestReport task
> >               • a simple "TestReport must run after all Test" ordering
> >       • Ant depends="X, Y"
> >               • a simple "Y should run after X" ordering (or maybe a
> "must")
> >       • unit-test before publish (without dependency)
> >               • "publish must run after unit-test" rule
> >               • "do not run publish if unit-test fails" rule
> >       • unit-test before integ-test
> >               • "integ-test should run after unit-test"
> >       • Container Stop/Start tasks (finaliser pattern)
> >               • "Given Start, add Stop to the graph" rule
> >               • "IntegTests must run after Start" rule
> >               • "Stop must run after IntegTests" rule
> >               • If another task has a "must run after IntegTests" rule,
> add a "must run after Stop" rule
> >               • "Always execute Stop if Start succeeds" rule
> > So we have rules for:
> >       • Adding tasks to the graph
> >       • Rules that determine task ordering
> >       • Rules that determine if execution of a task is dependent on
> successful execution of another.
> > 1 & 2 would be used to construct the execution graph with "should"
> ordering rules being discarded when they don't fit.
> >
> > 3 would be used while executing the graph. Part of this work would be to
> rework the current implicit rules for this, where we stop as soon as a task
> fails. We'll need to traverse the entire graph looking for any task that is
> in a "can execute" state. It might be tricky to do this in a way that
> preserves the current behaviour and ordering exactly.
> >
> > Regarding terminology (and probably the DSL), we probably need to look
> for common patterns:
> > - regular dependency uses a task dependency
> > - unit-test before publish uses a soft task dependency (must run after
> and don't run if fails, but don't add to the graph)
> > - TestReport task uses a task ordering rule ("must")
> > - unit-test before integ-test uses a task ordering hint ("should")
> > - Start/Stop container uses a task finaliser
> >
> > Just some name suggestions; I'm not tied to any of them. I guess these
> are some high-level constructs that would be built on top of our 3 types of
> task execution rules above.
>
> I don't like “soft dependency” at all. This isn't a dependency of any
> sort. There's no relationship between these things.
>

I disagree. The use-case we have is that if unit-test is part of the
execution graph, then publish should only proceed if and when unit-test
succeeds. So it's exactly the same as a regular dependency that is added
only if unit-test is already in the execution graph for another reason. If
we were running in parallel, we'd need to wait for unit-test to succeed
before commencing publish.

I'm not sure that a) this is a particularly valid use-case or b) that 'soft
dependency' is a great name. But it's certainly very close to how we
currently model a dependency.
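
Something like the following, as a purely hypothetical DSL:

publish.softDependsOn unitTest
// roughly: publish.mustRunAfter(unitTest) plus "skip publish if unitTest
// was scheduled and failed", but without ever adding unitTest to the graph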

>
> I always conceptualised this general problem as requiring 3 concepts:
>
> 1. Task dependencies
> 2. Ordering optimisations
> 3. Resource usage
>

> #1 has defined semantics as it is.
> #2 is about optimising the order for usability and is never strictly
> required. It doesn't change any semantics WRT failures etc.
> #3 is the container case. One or more tasks need to use something that has
> a lifecycle, implemented as start/stop tasks. If the start task fails, the
> tasks depending on the resource can't run. If the start task succeeds, the
> stop task must be executed some time after the last task executes that
> requires the resource.
>

This is effectively the 1st, 4th and 5th cases in the 'common patterns'
list above, but doesn't cover the 'unit-test must succeed before publish'
case or the 'TestReport' use case.

Which is why I don't think these 3 concepts are sufficient, particularly
when you consider the "only run if X succeeds" dimension. One possibility
is to model this with different low-level dimensions (reiterating from
before):
1. Rules for adding tasks to the graph
2. Rules and hints for ordering task execution (both "must" and "should")
3. Rules for controlling subsequent task execution based on the result of
current task execution
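
A rough sketch of how those three kinds of rules might be represented
internally (all names hypothetical):

interface TaskGraphRule {}
class AddToGraphRule implements TaskGraphRule { Task given; Task toAdd }
class OrderingRule implements TaskGraphRule { Task before; Task after; boolean must }
class SuccessRule implements TaskGraphRule { Task task; Task mustSucceedFirst }

The higher-level constructs (regular dependency, soft dependency, ordering
hint, finaliser) would then each decompose into some combination of these.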

-- 
Darrell (Daz) DeBoer
Principal Engineer, Gradleware
http://www.gradleware.com
