Re: Experience with workflow at Hippo Webworks

Johan Stuyts Mon, 08 Mar 2004 03:03:18 -0800

On Mon, 08 Mar 2004 11:41:12 +0100, Unico Hommes <[EMAIL PROTECTED]> wrote:

Johan Stuyts wrote:
On Fri, 05 Mar 2004 17:18:53 -0500, Stefano Mazzocchi <[EMAIL PROTECTED]> wrote:
Johan Stuyts wrote:
Experience with workflow at Hippo Webworks
==========================================
At Hippo we used OSWorkflow to implement a workflow solution in a demo. Below are our experiences.

As people with different levels of experience are interested in workflow, I will start with a (very) brief introduction to workflow.
Workflow introduction
---------------------
Very simply put, workflow serves two purposes:
- to determine who can do what at which time with an object;
- to generate a list of pending tasks for users.
An example of the first is that an editor (who) can only publish (do
what) a document (an object) after a writer has asked for a review (at
which time).
The list of documents to be reviewed is an example of a pending task
list for an editor.
Each object type can have its own specific workflow.
The demo workflow
-----------------
The demo we created has a workflow for a basic document type, a web
page. I have attached a diagram of it.
A document gets created by a writer. The writer is not allowed to
publish his document directly; he has to ask the editor for review.
The editor can easily review documents because we generate a list of
documents waiting for review. The editor can click on the document and
can either approve or disapprove. If the document gets approved it is
published on the public server.
If the document gets disapproved, the writer cannot ask for a review without editing it first. Editing the document after it has been approved will bring the document back to the editing state too. After making his changes the writer can ask for a review of the new version.
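
The demo workflow described above can be sketched as a small state machine. This is only an illustration reconstructed from the prose, not from the attached diagram and not OSWorkflow code: the names DocRole, DocAction and DocState are hypothetical, and the rule that nobody edits a document while it waits for review is an assumption.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical names; reconstructed from the description, not from the diagram.
enum DocRole { WRITER, EDITOR }

enum DocAction { EDIT, REQUEST_REVIEW, APPROVE, DISAPPROVE }

enum DocState {
    EDITING, WAITING_FOR_REVIEW, PUBLISHED, DISAPPROVED;

    /** Who can do what at which time: the actions enabled for a role in this state. */
    Set<DocAction> allowed(DocRole role) {
        switch (this) {
            case EDITING:
                return role == DocRole.WRITER
                        ? EnumSet.of(DocAction.EDIT, DocAction.REQUEST_REVIEW)
                        : EnumSet.noneOf(DocAction.class);
            case WAITING_FOR_REVIEW:
                // Assumption: only the editor can act while a review is pending.
                return role == DocRole.EDITOR
                        ? EnumSet.of(DocAction.APPROVE, DocAction.DISAPPROVE)
                        : EnumSet.noneOf(DocAction.class);
            case PUBLISHED:
            case DISAPPROVED:
                // Editing is the only way back; no review request without an edit.
                return role == DocRole.WRITER
                        ? EnumSet.of(DocAction.EDIT)
                        : EnumSet.noneOf(DocAction.class);
            default:
                throw new AssertionError(this);
        }
    }

    /** Returns the state after the action, or throws if the action is not enabled. */
    DocState apply(DocRole role, DocAction action) {
        if (!allowed(role).contains(action)) {
            throw new IllegalStateException(role + " cannot " + action + " in state " + this);
        }
        switch (action) {
            case EDIT:           return EDITING;
            case REQUEST_REVIEW: return WAITING_FOR_REVIEW;
            case APPROVE:        return PUBLISHED;
            case DISAPPROVE:     return DISAPPROVED;
            default:             throw new AssertionError(action);
        }
    }
}
```

For example, DocState.EDITING.apply(DocRole.WRITER, DocAction.REQUEST_REVIEW) yields WAITING_FOR_REVIEW, while the same request from DISAPPROVED throws because the writer has to edit first.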
Implementation
--------------
For the document repository we use Slide. For the workflow engine we
used OSWorkflow. We connected these two using Slide interceptors.
wow, supercool!! I want it :-)

When a document is created the interceptor checks to see whether a workflow already exists. It does this by retrieving the workflow ID from a WebDAV property of the document. If it doesn't exist a new workflow is created in the workflow store.
Interesting terminology you use here: let me ask you this before we get
confused: "workflow" is for you an instance of the model or the model
itself?
I use the same term for both the model and the instance :">
When our frontend retrieves the tree of documents, the interceptor will retrieve the workflow for each document.
Seems to be the instance. Ok, careful though, because normally people
refer to workflow as the "model", not the instance.
I will be more explicit in further messages.
Looking at the role of the user the interceptor will determine which actions are enabled. The enabled actions (including their display text and activation URLs) are set in a WebDAV property of the document.

For the generation of the pending task list we used the OSWorkflow query API to find the documents which are in the waiting-for-review state. The approve and disapprove actions are passed to the frontend in the same way as the commands for a writer.

Not all actions are directly shown in the menu, because the user invokes them implicitly. The edit action for example is invoked by the interceptor each time the user saves the document.
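
The interceptor logic described in this section might look roughly like the sketch below. It deliberately uses neither the Slide interceptor API nor the OSWorkflow API: a plain Map stands in for the WebDAV properties of a document, InMemoryWorkflowStore stands in for the workflow store, and the property names, state names and role names are all made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for the workflow store; NOT the OSWorkflow API.
class InMemoryWorkflowStore {
    private final Map<Long, String> stateById = new HashMap<>();
    private long nextId = 1;

    long createInstance(String initialState) {
        long id = nextId++;
        stateById.put(id, initialState);
        return id;
    }

    String stateOf(long id) {
        return stateById.get(id);
    }

    int size() {
        return stateById.size();
    }
}

// Stand-in for a repository interceptor; NOT the Slide interceptor API.
class WorkflowInterceptorSketch {
    static final String WORKFLOW_ID = "workflow-id";         // hypothetical property name
    static final String ENABLED_ACTIONS = "enabled-actions"; // hypothetical property name

    private final InMemoryWorkflowStore store;

    WorkflowInterceptorSketch(InMemoryWorkflowStore store) {
        this.store = store;
    }

    /** Before a store: create a workflow instance only if the document has none yet. */
    void preStore(Map<String, String> properties) {
        if (!properties.containsKey(WORKFLOW_ID)) {
            long id = store.createInstance("editing");
            properties.put(WORKFLOW_ID, Long.toString(id));
        }
    }

    /** On retrieval: write the actions enabled for this role into a property. */
    void onRetrieve(Map<String, String> properties, String role) {
        // Assumes preStore has already run for this document.
        long id = Long.parseLong(properties.get(WORKFLOW_ID));
        properties.put(ENABLED_ACTIONS, enabledActions(store.stateOf(id), role));
    }

    private static String enabledActions(String state, String role) {
        if (role.equals("writer") && state.equals("editing")) {
            return "request-review";
        }
        if (role.equals("editor") && state.equals("waiting-for-review")) {
            return "approve,disapprove";
        }
        return "";
    }
}
```

Because preStore checks the property before creating anything, calling it several times for one store of a document still leaves a single workflow instance.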
Issues
------
We encountered issues with both Slide and OSWorkflow during the
implementation.
Before we used Slide, we used the Cocoon repository. The semantics of the repository interceptors and the Slide interceptors are not the same. With the repository interceptor we were able to add a property to the document in postStoreContent(...). In Slide we had to do this in preStoreContent(...).
IMHO, makes more sense to add metadata in pre-saving than in
post-saving. It's way more efficient for clustered environments.
I don't care what's better. I just thought that two technologies used heavily in Cocoon having different semantics for the same concept was confusing.
Well I care. The interception mechanism in the repository was crafted after Slide's. If there are diverging semantics or you need extra functionality it can be fixed quite easily.

Apart from that the Slide interceptors work very well, but (in the version of Slide we used) they get called a lot. A single store of a document invoked preStoreContent(...) and postStoreContent(...) multiple times.

well, this is a bug then. there should be a way to connect to an atomic event for a content store... you might want to bring this up on slide-dev

OK. I will look into this (making sure we don't add the same interceptor multiple times).

Interception in Slide is quite low level. For instance, when versioning is turned on, one WebDAV PUT will typically result in 2n pre-store and 2n post-store calls, where n is the number of versions the resource will have after transaction commit. (I even suspect the factor 2 in that is actually the number of stores configured for the scope the resource is in, but I'd have to check that.) Anyway, I think that the interception model has been superseded by Daniel Florey's recent work on events in Slide, but I still have to look into that too.

OSWorkflow performed great too. The only disadvantage was the complexity of state machines that can be expressed. As you can see in the attached diagram nested states are used. OSWorkflow does not support these.

The more I hear about workflows, the more I think that writing them with flow and continuations makes more sense than writing a finite state machine.

I don't like procedural code to handle complex state. You wind up with a lot of if-statements and it is difficult to determine what happens when a particular action gets invoked. A state machine has a lot of context: I am in state X, so all operations on this state and its parent states are valid. A state machine also hides a lot of implementation details. No need to check what the value of the current-state variable is.

Continuations do that even better. There isn't even an explicit notion of state at all. I don't want to rule out procedural code for handling conditional logic in workflow yet. I think the State pattern Guido described very effectively solves the nested case complexity. I am very interested in pursuing this more.

I am more interested in how easy it is to read an implemented state machine. Of course it is possible to implement a workflow using continuations, but how easily can I visualize this and use it to discuss the workflow with customers? Keeping a diagram up-to-date each time some JavaScript and/or Java changes is cumbersome.

If you put a state machine in an XML document using a well-defined schema, it should be easy to understand and to keep a complete overview while defining the state machine (I see a state machine as a single concern as I posted before). And it is easy, if you are a diagram-layout expert ;), to generate an up-to-date diagram.
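
As a rough illustration of generating a diagram from an XML definition, the sketch below turns transition elements into Graphviz DOT text, leaving the layout to Graphviz. The XML format (a workflow element containing transition elements with from, to and action attributes) is invented for this example; it is not the OSWorkflow descriptor format.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class StateMachineDiagram {
    /** Converts a state machine in a made-up XML format to Graphviz DOT text. */
    static String toDot(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            StringBuilder dot = new StringBuilder("digraph workflow {\n");
            // One DOT edge per <transition>, in document order.
            NodeList transitions = doc.getElementsByTagName("transition");
            for (int i = 0; i < transitions.getLength(); i++) {
                Element t = (Element) transitions.item(i);
                dot.append("  \"").append(t.getAttribute("from"))
                   .append("\" -> \"").append(t.getAttribute("to"))
                   .append("\" [label=\"").append(t.getAttribute("action"))
                   .append("\"];\n");
            }
            return dot.append("}\n").toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("invalid state machine XML", e);
        }
    }
}
```

Since the diagram is derived from the definition itself, it cannot drift out of date the way a hand-drawn one can.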

Although the attached workflow does not contain parallel states, we think it might be needed for some document types. A newsletter for example follows the same workflow as the attached one. But parallel to this is a mailing workflow for sending it to the newsletter subscribers.
In the mailing workflow the user can send a test email of the current
version to himself. When he is satisfied he can send the final version
to the newsletter subscribers. After this, he can neither send a test
email nor send it to the subscribers.
But what to do if a mistake in the newsletter is found after sending it to the subscribers? The subscribers won't be happy to receive another copy, so the mailing actions should stay blocked. But not correcting the newsletter on the website looks sloppy. Therefore the editing/reviewing/publishing workflow has to remain active.
this screams for long-lasting continuations!
How would you handle parallel states using continuations? If you want a unique continuation point for each possible combination of states, the number of continuation points will explode.
Won't there always be one continuation per workflow instance? That's constant, no?

If you do this you will have to keep a state variable (to a Java object) because you return to the same point. Using the GoF State pattern works great for simple state machines and I use it a lot. But the pattern does not talk about nested and/or parallel states, which become incomprehensible when coded in Java; the state machine logic gets intermixed with the document logic.
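
To make the parallel-state point concrete, here is a minimal sketch of the newsletter example with two independent regions. All names are hypothetical and the publishing states are taken from the description of the demo workflow, not from the diagram. Each region keeps its own state variable, so the definition grows with the sum of the state counts, whereas a flattened machine needs one state per combination.

```java
// Hypothetical names; two parallel regions of one newsletter workflow.
enum PublishingRegion { EDITING, WAITING_FOR_REVIEW, PUBLISHED, DISAPPROVED }

enum MailingRegion { NOT_SENT, SENT }

class NewsletterWorkflow {
    PublishingRegion publishing = PublishingRegion.EDITING;
    MailingRegion mailing = MailingRegion.NOT_SENT;

    /** Sends a test email to the user; only allowed before the final mailing. */
    void sendTest() {
        requireNotSent();
    }

    /** Sends the final version; afterwards both mailing actions stay blocked. */
    void sendToSubscribers() {
        requireNotSent();
        mailing = MailingRegion.SENT;
    }

    /** Editing belongs to the other region, so it stays possible after mailing. */
    void edit() {
        publishing = PublishingRegion.EDITING;
    }

    private void requireNotSent() {
        if (mailing == MailingRegion.SENT) {
            throw new IllegalStateException("newsletter was already sent to the subscribers");
        }
    }

    /** Number of states a flattened (non-parallel) machine would need. */
    static int flattenedStateCount() {
        return PublishingRegion.values().length * MailingRegion.values().length;
    }

    /** Number of states defined when the regions are kept separate. */
    static int regionStateCount() {
        return PublishingRegion.values().length + MailingRegion.values().length;
    }
}
```

With four publishing states and two mailing states the flattened machine already needs eight states against six for the parallel regions, and the gap widens with every region that is added.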


--
Unico


--
Johan Stuyts
