Re: Brooklyn Feature Proposal - Declarative and Retryable Workflow

Alex Heneveld Wed, 31 Aug 2022 04:52:58 -0700

Geoff, Peter, all --

Excellent input.  My thoughts:

DSL:  I like this idea.  I think it could be built as an incremental
improvement atop the YAML proposal.  I think by starting with YAML first we
get some advantages:  we get a model (the YAML maps) that we are used to
parsing and working with in Apache Brooklyn, including type assist in the
composer text editor, and a GUI to show and even write workflow wouldn't be
a huge undertaking (as composer converts between YAML text and graphical
already).  This is a declarative model that tools can easily use at design
time, and also a runtime model that we can process and show progress to a
user.  With a DSL there are some extra steps, as it would have to be
converted into some form of model to work with (probably with source line
number mappings so we can map back).  (I would NOT want us to lose the
ability to reason about and show the execution, so am reluctant to lean too
much on the DSL execution of eg things like Groovy.)  And I think a DSL is
additive -- eg the following based on the first example in the doc:

   container image my/google-cloud
      command "gcloud dataproc jobs submit spark --BUCKET=gs://${BUCKET}"
      env { BUCKET: $brooklyn:config("bucket") }
      on-error retry;
   set-sensor spark-output=${1.stdout}

would get processed into something in order to execute it ... and the model
that corresponds to the YAML seems as good a candidate as any.  (I'm not
sure I like that DSL, but the point applies to most any DSL.)

So I would suggest people can evolve the DSL in parallel and this shouldn't
block implementation of the YAML proposal -- assuming we are agreed on the
functionality and flow.

STEPS:  I went back and forth between the 3 options you list, (a) a list of
steps with no IDs, (b) a map of IDs where every step must say what to do
next, and (c) the current proposal which is (b) plus a default
numero-alphabetic ordering.  I settled on (c) because it has the
conciseness of (a) plus enforced labelling for readability and options for
flow-control (next) as well as extensibility, without the cumbersomeness of
(b) and the restrictions on extensibility it implies.  In terms of related
work the "init.d" mechanism seemed a good one because of the readability
and extensibility it gives (in my experience).  And I've actually used
something similar in some AB projects including TOSCA and it has worked
well in practice.  (However it did not have the "next" or "condition"
options that you raise questions about.)
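
To make the ordering in (c) concrete, here is a minimal Python sketch.  It
assumes "numero-alphabetic" means that digit runs in an ID compare as
numbers and everything else compares as text; the proposal may define it
differently.  The step IDs and the helpers `step_sort_key` and
`ordered_ids` are made up for illustration.

```python
import re

def step_sort_key(step_id):
    """Split an ID into digit and non-digit runs so numbers compare as numbers.

    Assumption: this is one plausible reading of "numero-alphabetic".
    """
    parts = re.findall(r"\d+|\D+", step_id)
    # Tag each part so an int is never compared directly with a str.
    return [(0, int(p), "") if p.isdigit() else (1, 0, p) for p in parts]

def ordered_ids(steps):
    """Return the step IDs in default execution order."""
    return sorted(steps, key=step_sort_key)

# Hypothetical workflow: a map of ID -> step definition (bodies omitted).
base = {"1-submit": {}, "2-record": {}, "10-cleanup": {}}

# An extension slots a step in purely by choosing its ID.
extended = dict(base)
extended["5-notify"] = {}

print(ordered_ids(base))      # ['1-submit', '2-record', '10-cleanup']
print(ordered_ids(extended))  # ['1-submit', '2-record', '5-notify', '10-cleanup']
print(sorted(base))           # plain string sort: ['1-submit', '10-cleanup', '2-record']
```

An extension only has to pick an ID that sorts into the right place, much
as with init.d script names, and no existing step needs editing.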

Aside:  if we're asking people to write and read workflow in YAML, even
with plans for graphical tools and/or a DSL atop it, then I think the
emphasis should be on making it as natural as possible.  To Peter's point
there will always be a bit of a disconnect but I've looked at places where
it has been done well -- eg CircleCI -- and places where it hasn't been (no
names, but you know extremely verbose YAML) and tried to emulate the
former.  Explicit IDs force the user to do a bit of documentation, and make
it easier to correlate subsequent behaviour or graphical components.
Having good default behaviour so keys can be omitted, and having a
shorthand for common tasks, these also make a big difference to authoring.
With (b), requiring a "next" all the time, we need an extra line in many
steps, and we lose the ability to support the shorthand.

CONDITION:  Again I played with `if: { ..., then: { ... } }` and reached
the conclusion it's not a nice YAML experience (and gets worse with
"else"!).  Ugly to write and ugly to work with.  The `condition` block I've
used in circle-ci and elsewhere to achieve this and thought it was a more
pleasant experience -- the best I came across.  Pleasant is relative of
course -- and this I think is an area where DSL and visual will in time
help even more.  A DSL could express `if ... then ... elseif ... else ... `
in a more natural way, and easily generate the YAML.  Visually any step
with a condition I would see the route(s) in to it branching, and the
condition shown on the branch leading to the step, and the else branch
leading to the next step in the ordering, and a line after the conditional
step going to the next step (or to a different step if `next` is explicitly
indicated).  But if we're starting with YAML then a condition keeps the
model flat and simpler.
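
As a rough illustration of the flat model, the Python sketch below runs
steps in order and simply skips any step whose `condition` does not hold.
The condition keys (`target`, `equals`, `not`) are borrowed from the
examples quoted in this thread; the evaluator, the step IDs and the
variable names are assumptions, not the proposal's actual semantics.

```python
def condition_holds(condition, variables):
    """Evaluate a small condition map against workflow variables."""
    value = variables.get(condition["target"])
    result = True
    if "equals" in condition:
        result = result and value == condition["equals"]
    if "not" in condition:
        # Negate a nested check applied to the same target.
        nested = dict(condition["not"], target=condition["target"])
        result = result and not condition_holds(nested, variables)
    return result

def run(steps, variables):
    """Return the IDs of the steps that would execute, in order."""
    executed = []
    for step_id in sorted(steps):  # plain sort is enough for these IDs
        condition = steps[step_id].get("condition")
        if condition is not None and not condition_holds(condition, variables):
            continue  # condition not met: fall through to the next step
        executed.append(step_id)
    return executed

steps = {
    "1-print-date": {
        "condition": {"target": "skip_date", "not": {"equals": True}},
        "ssh": "echo today is `date`",
    },
    "2-finish": {"ssh": "echo done"},
}

print(run(steps, {"skip_date": False}))  # ['1-print-date', '2-finish']
print(run(steps, {"skip_date": True}))   # ['2-finish']
```

There is no `then` block and no nesting: the "else" path is just the next
step in the ordering.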

NEXT:  This is probably the aspect I'm least confident of.  But in my
exploration it feels good, IMO better than the alternatives.  Geoff you
bring Dijkstra's "GOTO Considered Harmful" statement to mind -- the
argument being that goto can encourage developers to make a messy program,
(ab)using it when there are other better control structures.  Except when
coming from YAML most of the possible better control structures give a
worse UX to write, in my experience -- we end up with the `if: { ..., then:
... }` or introduce a similar `while: { condition: { ... }, do: { ... } }`
-- both of which I think introduce more cognitive load.

Aside:  "NESTING considered harmful".  In my experience, requiring a lot of
nesting within commands also makes the UX unpleasant -- speaking from a
YAML perspective -- and makes the workflow harder to reason about.  As soon
as things are nested, either in `if` or `while` or nested `workflow` then
I've got something more complicated to work with -- visually it is hard to
read (note I am speaking explicitly about YAML; if we layer a DSL on top I
would embrace nesting and eschew goto in that), and there is implicit
context/depth to consider when resolving labels and execution flow.  The
proposed model sees
the workflow as a flat set of steps, so code and user can know exactly
where in execution something is occurring, based on its ID.  The only
nesting is an explicit nested workflow which establishes a sub-context.  If
we have lightweight nesting a la if or while then we lose this runtime
clarity.  (To be clear, for a DSL written on top of this, I would take a
very different view of this!)

LET:  +1 to this suggestion as a shorthand for the cumbersome
`set-workflow-variable`.  Improves readability.  I could see some simple
maths being supported as well here.

LOOPS:  There are three main ways loops could be done; one is with
explicit nested workflow running over a target eg 1..10; another is
combining a condition with an increment step and a "next" entry to repeat
the loop while the condition is satisfied; and the final one is using the
special "retry" type (especially suited to time-based retries, waiting for
some event or correcting errors).  For the reasons under NEXT, I think a
"while" statement at this level is not wanted -- but I think it would be
good at the DSL level.  I've added some worked examples in the document at
https://docs.google.com/document/d/1u02Bi6sS8Fkf1s7UzRRMnvLhA477bqcyxGa0nJesqkI/edit#heading=h.1af8dhexg3n5
.
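
For the second style of loop (condition plus increment plus "next"), a
minimal Python sketch of how a flat interpreter might behave is below.  It
is not taken from the linked document: the step IDs, the `action` and
`condition` callables and the `max_steps` guard are all hypothetical
stand-ins for whatever the real step types would do.

```python
def run(steps, variables, max_steps=100):
    """Run a flat map of steps, honouring optional `condition` and `next`."""
    order = sorted(steps)  # default order; plain sort is enough for these IDs
    trace = []
    index = 0
    while index < len(order) and len(trace) < max_steps:
        step_id = order[index]
        step = steps[step_id]
        condition = step.get("condition")
        if condition is None or condition(variables):
            trace.append(step_id)
            step.get("action", lambda v: None)(variables)
            if "next" in step:
                # Explicit jump: continue from the named step.
                index = order.index(step["next"])
                continue
        # No jump (or condition not met): go to the next step in order.
        index += 1
    return trace

steps = {
    "1-init": {"action": lambda v: v.update(i=0, total=0)},
    "2-work": {"action": lambda v: v.update(total=v["total"] + v["i"])},
    "3-increment": {"action": lambda v: v.update(i=v["i"] + 1)},
    "4-repeat": {"condition": lambda v: v["i"] < 3, "next": "2-work"},
    "5-done": {},
}

variables = {}
trace = run(steps, variables)
print(variables)              # {'i': 3, 'total': 3}
print(trace.count("2-work"))  # 3
```

At every point the position in the workflow is a single step ID, which is
the runtime clarity argued for under NEXT and NESTING.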

REUSE:  As for adding a workflow to the catalog, I hadn't fleshed this out
except to satisfy myself that we could declare a new type "my-task" in a
catalog.bom, have it extend { type: workflow } and declare { steps: { ... }
}.  The type "my-task" could then be used in any workflow step or in an
effector definition which wants steps.  The new type could itself be
extended, with Map.putAll done to merge steps (inserting new and
overwriting existing labels).  I think there will be a need to declare
inputs and to specify inputs as part of this, and that side of things I
hadn't thought about but assumed we'd be able to use the `parameters`
syntax we use elsewhere along with default values.  I've added a sketch of
this at
https://docs.google.com/document/d/1u02Bi6sS8Fkf1s7UzRRMnvLhA477bqcyxGa0nJesqkI/edit#heading=h.rl70czdae7la
.
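
The merge behaviour is easy to picture with plain maps.  The Python sketch
below uses `dict.update` as the analogue of Java's `Map.putAll`; the step
IDs and the placeholder step descriptions are invented for illustration
and say nothing about how the catalog would actually resolve types.

```python
def extend_steps(parent_steps, child_steps):
    """Merge an extension's steps over a parent's, putAll-style."""
    merged = dict(parent_steps)  # copy, so the parent type is left untouched
    merged.update(child_steps)   # same key overwrites, new key is added
    return merged

# Hypothetical steps declared by the catalog type "my-task".
my_task_steps = {
    "1-fetch": "fetch input",
    "2-process": "process with default settings",
    "3-publish": "publish result",
}

# Hypothetical steps declared by a type extending "my-task".
extension_steps = {
    "2-process": "process with custom settings",  # overwrites existing label
    "4-notify": "send notification",              # inserts a new label
}

merged = extend_steps(my_task_steps, extension_steps)
for step_id in sorted(merged):
    print(step_id, "->", merged[step_id])
```

Because steps are keyed by ID, an extension can replace one step without
restating the others, which a plain list of steps would not allow.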

TESTS:  Excellent observation.  This would be absolutely transformative for
the brooklyn tests mechanism.  Currently that allows test scripts to be
written in a DSL a bit like the one described here.  The existing
assertions and methods there could be implemented as workflow steps so that
writing tests and viewing and re-running tests all become much nicer.

As for how to proceed:

INCREMENTAL IMPLEMENTATION:  An initial set of workflow using a small
subset of types would be the starting point, and allowing this to be used
to declare sensors and effectors and policies.  I would then see most of
the other tasks able to proceed in parallel, with the exception of the UI
which will need the workflow metadata-as-sensors fleshed out a bit more.

WHO:  I'm maybe best placed to start with the initial set of workflow, but
I'd aim to do that pretty quickly keeping it minimal per ^, and then I'd
welcome collaboration on everything else, much of which is parallelizable.
I will post an update to this list when there is something to look at for the first
set.  You're right I'm raring to go but I'm also very happy to have others
do big swathes of it.  I'm also really grateful for the comments, so keep
them coming.  It'll be a few days before I start, so please LMK if ^^^
makes sense!

Best
Alex

On Tue, 30 Aug 2022 at 23:22, Geoff Macartney <[email protected]> wrote:

> Hi Alex et al.
>
> Here are some thoughts on the proposal.
>
> Cheers
> Geoff
>
> # General thoughts
>
> 1. Adding a procedural "sub-language" like this to Brooklyn could give
> it a whole new level of capability, which is very exciting.
> 2. At the same time this capability could add a whole new dimension of
> complexity and difficulty, so I think it will be very important to
> make sure the implementation and tooling give users lots of support
> for traceability and debugging. It's good to see the emphasis put on
> this in "The Other Big Idea: Traceability and Recoverability",
> hopefully the implementation will also emphasise it.
> 3. "A graphical workflow visualiser is not planned at this time" is
> understandable but I feel probably the more support we can add for
> this in the UI the better the chances of it succeeding.
> 4. Peter's suggestion of a DSL could potentially simplify some of the
> considerations below, what do you think of that suggestion?
>
> # About the workflow language
>
> 1. I find the document currently confusing in terms of what sort of
> language it is describing - is it a workflow, typically expressed as a
> directed graph of nodes (tasks) or a sequence of tasks (which would be
> rather more like a procedural language)?
> 2. if it is a workflow then I'd have thought there shouldn't be any
> notion of ordering ("Step References, Numero-Alphabetic Ordering and
> Extensibility"). Ordering should be defined only by the graph (value
> of "Next" field, and the ids) in this case, don't you think?
> 3. if it is a sequence of tasks then I think the current mechanism of
> ids (the map keys) and Next is awkward. `1-2-http-request` and Next
> puts me in mind of BASIC, with Next playing the role of GOTO. In this
> case I think it has the potential of introducing the same problems as
> GOTO, and might better be done without. Rather it might be preferable
> to express the workflow as an array of steps (sequencing), with
> support for selection (`condition`/`if`), iteration (maybe consider
> introducing `while`? see question below about iteration), and
> "functions" (independently defined named workflows, either in the
> catalog or elsewhere in the blueprint). Ids would be optional, only
> required for steps whose results are referenced elsewhere, not as
> sequencers or labels for Next:
>
> ```yaml
> steps:
>   - id: sparc-job
>     type: container
>     image: my/google-cloud
>     command: gcloud dataproc jobs submit spark --BUCKET=gs://${BUCKET}
>     env:
>       BUCKET: $brooklyn:config("bucket")
>     on-error: retry
>   - set-sensor: spark-output=${sparc-job.stdout}
> ```
>
> 4. This might also work well with a slightly different markup for
> `condition` (which might be nicer as `if`?), adding a `then` defining
> a (sub) sequence of steps:
>
> ```yaml
> steps:
>   - if:
>       target: ${scratch.skip_date}
>       not: { equals: true }
>       then:
>         - ssh: echo today is `DATE`
>         - <other commands in here...>
> ```
> we could also add `else`.
>
> 5. Can you add some more details about how you see iteration in the
> document - there is that section "Multiple targets and looping" but it
> doesn't really have examples of the latter. Could we introduce a
> `while` construct?
>
> 6. How about renaming `set-workflow-variable` to a simpler `let` or
> `set`? You don't actually give an example of its use, is it something
> like the following? (I'm imagining a convention for a one line
> definition for convenience, with variable name followed by keyword
> `be` to indicate assignment to what follows):
>
> ```yaml
> steps:
>   - let: my-scratch be false
>   - if:
>       target: whatever
>       equals: something
>       then:
>         - let: my-scratch be true
> ```
>
> 7. Could you add some more detail about how independent workflows can
> be defined, e.g. in the catalog (or as a separate workflow in the
> blueprint?) Can these be parameterised, like function definitions? The
> use of `parameters` in the example in section "Request/State Unique
> Identifiers" isn't clear to me.
>
> 8. Would you foresee adding any support for testing all this to the
> Brooklyn tests mechanism? Might be valuable.
>
> # About how to organise implementing all this
>
> 1. Can we sequence work on this to do the simpler bits first and get
> experience with how it works before proceeding to more advanced things
> like nested workflows with per-target conditions ("Multiple Targets
> and looping"). I would particularly like to hope that the UI side of
> things can progress in step with the workflow definitions, rather than
> leave all the UI work to the end.
> 2. (How) could the work for this be spread across the community? I've
> no doubt you're raring to go on this but it would be good if it didn't
> all fall on your shoulders!
>
>
> On Mon, 29 Aug 2022 at 23:51, Geoff Macartney <[email protected]> wrote:
> >
> > Hi Alex,
> >
> > I've done a first pass on the document, and it's very impressive.
> > Adding a procedural "sub-language" like this to Brooklyn could give it
> > a whole new level of capability, which is very exciting. I have some
> > thoughts on some of the details proposed which I will try to write up
> > this week.
> >
> > I share the concerns about YAML which I think Peter expressed very
> > well. His suggestion of a DSL instead of YAML is interesting and I
> > think would be worth considering. I also have some reservations about
> > some of the constructs you're proposing (well, at least one of them)
> > and some perhaps relatively minor suggestions for changes in
> > structure. My bigger concern is that adding a new programming language
> > within Blueprints like this could add a whole new dimension of
> > complexity. I'm asking myself, "how would I debug this" when things go
> > wrong. I think that's worth some discussion as much as the details of
> > the language. There are also points where I simply have questions and
> > would like some more detail.
> >
> > I'll try to get more detailed thoughts written up this week.
> >
> > Cheers
> > Geoff
> >
> >
> >
> > On Sat, 27 Aug 2022 at 00:05, Peter Abramowitsch <[email protected]> wrote:
> >>
> >> Hi Alex,
> >> I haven't been involved with the Brooklyn team for a long while so take
> >> this suggestion with as little or as much importance as you see at face
> >> value.   Your proposal for a richer specification language to guide
> >> realtime behavior is much appreciated and I think it is a great idea.
> >> You've obviously thought very deeply as to how it could be applied in
> >> different areas of a blueprint.
> >>
> >> My one comment is whether going for a declarative solution, especially
> >> one based on YAML is optimal.  Sure Yaml is well known, easy to
> >> eyeball, but it has two drawbacks that make me wonder if it is the
> >> best platform for your idea.  The first is that it is a format-based
> >> language.  Working in large infrastructure projects, small errors can
> >> have disastrous consequences, so as little as a missing or extra tab
> >> could result in destroying a data resource or bringing down a complex
> >> system.   The other, more philosophical comment has to do with the
> >> clumsiness of describing procedural concepts in a declarative
> >> language.  (anyone have fun with XSL doing anything significant?)
> >>
> >> So my suggestion would be to look into DSLs instead of Yaml.  Very nice
> >> ones can be created with little effort in Ruby, Python, JS - and even
> >> Java.  In addition to having the language's own interpreter check the
> >> syntax for you, you get lots of freebies such as being able to do line
> >> by line debugging - and of course the obvious advantage that there is
> >> no code layer between the DSL and its implementation, whereas with
> >> Yaml, someone needs to write the code that converts the grammar into
> >> behavior, catch errors etc.
> >>
> >> What do you think?
> >>
> >> Peter
> >>
> >> On Wed, Aug 24, 2022 at 8:44 AM Alex Heneveld <[email protected]> wrote:
> >>
> >> > Hi folks,
> >> >
> >> > I'd like Apache Brooklyn to allow more sophisticated workflow to be
> >> > written in YAML.
> >> >
> >> > As many of you know, we have a powerful task framework in java, but
> >> > only a very limited subset is currently exposed via YAML.  I think we
> >> > could generalize this without a mammoth effort, and get a very nice
> >> > way for users to write complex effectors, sensor feeds, etc, directly
> >> > in YAML.
> >> >
> >> > At [1] please find details of the proposal.
> >> >
> >> > This includes the ability to branch and retry on error.  It can also
> >> > give us the ability to retry/resume on an Apache Brooklyn server
> >> > failover.
> >> >
> >> > Comments welcome!
> >> >
> >> > Best
> >> > Alex
> >> >
> >> >
> >> > [1]
> >> > https://docs.google.com/document/d/1u02Bi6sS8Fkf1s7UzRRMnvLhA477bqcyxGa0nJesqkI/edit?usp=sharing
> >> >
>
