Guido Casper wrote:
Daniel Fagerstrom wrote:
So a pipeline for input handling could look like:
g -> t* -> store -> act -> [select] -> g -> t* -> s.
I'm still not convinced by this symmetry thing :-)
The requirements for inbound data flow seem to be too different from those of outbound data flow.
For outbound data flow everything is converted to a string, which is quite easy and nicely supported by XML's weakly typed nature (IMO one major reason for XML's power and success) and a powerful transformation language.
For inbound data flow (as you already mentioned) you need strongly typed data, which requires parsing, validation and error handling. I do see value in putting this data - once grabbed and converted by the forms framework - into some sort of pipeline. What I'm unsure about is whether these pipelines will be as powerful as weakly typed pipelines. I believe Cocoon's pipelines achieve this level of component reusability because of their weakly typed (and therefore loosely coupled) nature.
Now IIUC you suggest a pipelining architecture for inbound data flow with a DOM-like data model.
I guess I was a little bit unclear. The typed DOM is only for storing data in the application. The inbound dataflow is an ordinary Cocoon pipeline (SAX, untyped). There might be cases where one would like to use validation and type info for the inbound data, but I think that should be done within the pipeline components, not in the communication between them.
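A minimal sketch of that split, with type checks kept inside a component while only untyped data travels between components. It is written in Python for brevity and is not Cocoon code: the (name, raw text) tuples stand in for SAX events, and `request_generator`, `trim_transformer` and `validating_transformer` are hypothetical names.

```python
from typing import Callable, Dict, Iterable, Iterator, Tuple

# (field name, raw text): untyped, a stand-in for SAX character data.
Event = Tuple[str, str]


def request_generator(params: Dict[str, str]) -> Iterator[Event]:
    """Hypothetical generator: emits raw request parameters as events."""
    for name, raw in params.items():
        yield name, raw


def trim_transformer(events: Iterable[Event]) -> Iterator[Event]:
    """An ordinary untyped transformer: strips surrounding whitespace."""
    for name, raw in events:
        yield name, raw.strip()


def validating_transformer(
    events: Iterable[Event],
    types: Dict[str, Callable[[str], object]],
) -> Iterator[Event]:
    """Uses type info internally but still passes plain strings downstream."""
    for name, raw in events:
        try:
            types.get(name, str)(raw)   # the typed check stays inside the component
            yield name, raw
        except ValueError:
            yield name + "-error", raw  # flag the problem as ordinary untyped data


params = {"age": " 42 ", "qty": "many"}
types = {"age": int, "qty": int}
result = list(
    validating_transformer(trim_transformer(request_generator(params)), types)
)
```

Because every stage consumes and produces the same untyped events, the components stay interchangeable, which is the loose coupling discussed above.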
--- O ---
Stepping back a little bit, the main point of my RT was to discuss design patterns for webapps (or more generally Cocoon based applications in all supported environments), rather than any sitemap extensions or the like. All that I proposed can be done within the current Cocoon. I am not proposing any new mechanisms (although some stuff could possibly be done in a more convenient way with new sitemap constructs).
So, a webapp consists of a controller for multi page flow or workflow (flowscripts). In each step in the flow, input is read, an internal state is updated and output is produced. We all agree that output production should be XML based. My main message is that it is a good idea to use XML for the inbound dataflow as well and, at least in part, for storage too.
For the input part this can be done by calling a pipeline with the processPipelineTo[...] function in flowscripts. The pipeline typically starts with a stream or request generator or any generator that reads from a module source connected to an input stream or something similar.
The state typically consists of a backend (DB, EJB etc) and some session state (session attributes, flowscript variables). For form handling it is a good design pattern to store input data in a "form model" instead of writing directly to the backend, especially if the backend is of a transactional nature. This pattern is used in CForms. The form model is a (typically typed) data structure where you gather the input before writing it to the backend. In update operations, the form model is also loaded with data from the backend before it is "edited" through the web GUI.
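The form model pattern can be sketched roughly as follows. This is an illustration only, in Python for brevity: `FormModel`, `load`, `set_field` and `commit` are hypothetical names rather than CForms API, and a plain dict stands in for the backend.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


@dataclass
class FormModel:
    """Hypothetical typed staging area between user input and the backend."""

    # Field name -> converter that parses raw request text into a typed value.
    types: Dict[str, Callable[[str], Any]]
    values: Dict[str, Any] = field(default_factory=dict)
    errors: Dict[str, str] = field(default_factory=dict)

    def load(self, record: Dict[str, Any]) -> None:
        """Pre-fill the model from the backend before an update operation."""
        self.values = dict(record)

    def set_field(self, name: str, raw: str) -> None:
        """Parse one raw input string; record an error instead of raising."""
        try:
            self.values[name] = self.types[name](raw)
            self.errors.pop(name, None)
        except (KeyError, ValueError) as exc:
            self.errors[name] = str(exc)

    def commit(self, backend: Dict[str, Any]) -> bool:
        """Write to the backend only if every field parsed cleanly."""
        if self.errors:
            return False
        backend.update(self.values)
        return True


backend = {"name": "Ada", "age": 36}      # stand-in for DB/EJB state
model = FormModel(types={"name": str, "age": int})
model.load(backend)
model.set_field("age", "not a number")    # invalid input stays in the model
first_try = model.commit(backend)         # refused, backend untouched
age_after_first_try = backend["age"]
model.set_field("age", "37")
second_try = model.commit(backend)        # accepted, backend updated
```

The backend is only touched in `commit`, so a half-filled or invalid form never reaches it.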
What I propose is that the "form model" idea is usable not only in CForms based form handling but in other kinds of webapps as well. So it would be practical if the utilities for handling some appropriate typed data structure were made available outside CForms. Then the mechanisms for loading data (XML, Java beans, DB etc) into the store and saving from it could be reused in other contexts too.
To work well with the rest of Cocoon, XML seems to be the most practical choice.
Since AFAIK there is no standard DOM-like data model/API carrying strongly typed data, we would have to come up with our own, and I feel it might eventually look similar to the Woody widget hierarchy.
You are AFAIK right that there is no standard API for accessing type info or detailed validation info. There is a standard, http://www.w3.org/TR/2004/REC-DOM-Level-3-Val-20040127/, for checking which elements you can add at a given point in a tree, and a note, http://www.w3.org/TR/2002/NOTE-DOM-Level-3-AS-20020725/, on accessing schema info. Both are IMO overkill for our current needs. What we need is to attach a schema to a DOM, validate the DOM, ask a text node or attribute which schema data type it has and whether it is valid, and get its content as a Java object.
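To make those four operations concrete, here is a minimal sketch of a typed facade over a plain DOM. It uses Python's `xml.dom.minidom` for brevity where the mail talks about Java objects. `TypedDocument`, `type_of`, `value_of`, `is_valid` and `validate` are hypothetical names, and the `SCHEMA` dict is a stand-in for real XML Schema or Relax NG datatypes.

```python
from datetime import date
from xml.dom import minidom

# Hypothetical "schema": element name -> (datatype name, parser).
SCHEMA = {
    "name": ("xs:string", str),
    "age": ("xs:integer", int),
    "born": ("xs:date", date.fromisoformat),
}


class TypedDocument:
    """Hypothetical wrapper adding type and validity queries to a plain DOM."""

    def __init__(self, dom, schema):
        self.dom = dom
        self.schema = schema

    def _text(self, element):
        # Concatenate the element's direct text children.
        return "".join(
            n.data for n in element.childNodes if n.nodeType == n.TEXT_NODE
        ).strip()

    def type_of(self, element):
        """Schema datatype name for an element, or None if undeclared."""
        entry = self.schema.get(element.tagName)
        return entry[0] if entry else None

    def value_of(self, element):
        """Content as a native object; raises ValueError if it does not parse."""
        _, parse = self.schema[element.tagName]
        return parse(self._text(element))

    def is_valid(self, element):
        try:
            self.value_of(element)
            return True
        except (KeyError, ValueError):
            return False

    def validate(self):
        """Tag names of all declared elements whose content fails to parse."""
        return [
            e.tagName
            for e in self.dom.getElementsByTagName("*")
            if e.tagName in self.schema and not self.is_valid(e)
        ]


xml = "<person><name>Ada</name><age>36</age><born>10 Dec 1815</born></person>"
doc = TypedDocument(minidom.parseString(xml), SCHEMA)
age = doc.dom.getElementsByTagName("age")[0]
age_type = doc.type_of(age)      # datatype declared for <age>
age_value = doc.value_of(age)    # content as a native integer
invalid = doc.validate()         # <born> is not an ISO date
```

All untyped access still goes through the standard DOM (`doc.dom`); only the type and validity queries are extensions.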
So even if we would need some proprietary enhancements, we can still use DOM core, events, XMLSchema, Relax-NG etc, and implementations of them. The data type part of the different schema languages is designed precisely for creating the strong type system for XML that is needed for describing data to/from databases, programs etc.
My point is actually that the storage aspect of the widget hierarchy has so many similarities with typed XML that it seems to be a massive re-invention of the wheel to have our own proprietary solution with its own schema language, own data types, own API etc.
So what exactly is it that makes the Woody widget hierarchy unsuitable for being the data model of the inbound pipelining architecture?
The main problem is the impedance mismatch with XML. As soon as you want to go from widgets to XML or back, you have to write a binding for this.
Having said all that, it occurred to me that what you are describing may in parts correspond to an XForms-like architecture.
It certainly does, that was one of the sources of inspiration. I forgot to refer to it.
Ignore for a moment that XForms is a client-side technology and think about the artifacts a developer has to create. There is a form description whose datatypes and validation rules are described by an XML Schema (IIUC). Once data validation and everything else is done, the data is sent to the server as a (weakly typed) XML document, which may be fed into a pipeline. When entering the pipeline, the document may be validated against the very same XML Schema that validates the form on the client side. The XML flowing through the pipeline is still weakly typed, yet it might be treated in a strongly typed manner (such as the XPath 2.0 data model to be accessed by XSLT 2.0 or XQuery or some super-duper XUpdate-like whatever next generation XML manipulation language).
Yes, something like that!
However, as it stands today there is no standard way to access such a strongly typed data model (like XML Schema's PSVI, the Post-Schema-Validation Infoset) via Java (AFAIK). Maybe XMLBeans (as Steven suggested) could help here?
As said above, all the untyped aspects of the data structure can be used through the DOM interfaces, but we need some proprietary extensions for the type and validation info.
So what about an XML-adaptor-like component (you could even today do a processPipelineToDOM("pipelineWithWoodyGenerator")) instead of changing Woody's internal data model?
Yes, having a DOM facade to the widget hierarchy is one way to realise what I propose. I believe that there are good reasons for having a stronger separation of concerns within CForms between the data store and the control aspects of widgets, but I will return to that in more detail later.
/Daniel