Re: Pipeline support

Felix Meschberger Tue, 10 Feb 2009 03:14:06 -0800

Hi Juanjo,

Juan José Vázquez Delgado wrote:
> In response to this thread [1] on the Apache Cocoon Dev list, I have
> been working on a minimal sample [2] concerning the resolution of
> pipelines in Apache Sling. IMHO, having pipeline support in Sling is
> an important feature in terms of separation of concerns.


Agreed.

> On the other hand, because it's important not to reinvent the wheel,
> IMHO we should take advantage of the Cocoon community's efforts in
> one way or another.
> 
> Right now, the Cocoon team is working on a new, refactored release of
> the framework named Cocoon 3. AFAIK, this release is intended to
> be a more minimal version of Cocoon 2.2 and IMHO more suitable to be
> integrated into Sling. For the time being (alpha-1), Cocoon artifacts
> are not released as OSGi bundles.
> 
> The sample [2] is just a proof of concept using Cocoon 3 pipelines
> inside Sling as it currently stands, that is, without changes to the
> Sling core.
> 
> Nevertheless, IMHO Sling should have more natural pipeline support,
> with Cocoon pipeline definitions as Sling scripts. Until now, dynamic
> resources have been rendered by two kinds of animals: servlets and
> scripts. What about having pipelines as a new kind of animal?
> 
> Comments and ideas are welcome.

I think your approach is very lightweight (which is good) and
straightforward. The problem I see is that you actually need two
resources: one providing the actual content to be processed, and one
defining the pipeline and pointing to the content to be processed.
(As an aside, you don't need a Node: resource.adaptTo(Map.class) gives
you the properties you need.)

The problem is not merely that two resources are involved; to me, the
real problem is the pointer from the pipeline definition to the data
source to be processed.

How about turning this around and have the pointer to the pipeline
definition in the data resource, as in:

         /a/b/data
              +-- sling:resourceType = "sling/pipeline"
              +-- sling:pipeline = "/the/path/to/the/pipeline"

         /the/path/to/the/pipeline
              +-- sling:definition = [
                       "step one",
                       "step two"
                   ]

You then request the /a/b/data resource, which causes the PipeLineServlet
to kick in and construct the pipeline from the pipeline definition at
/the/path/to/the/pipeline. The pipeline definition can be as complex as
it needs to be (I am not fluent enough in this to make a full judgement).
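
To illustrate the lookup order, here is a rough sketch in plain Java.
It is not Sling API: a Map stands in for the repository, and the class
name PipelineLookup and the method resolveSteps are made up for this
example. Only the property names and paths come from the layout above.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PipelineLookup {
    // Hypothetical stand-in for the repository: path -> properties.
    static final Map<String, Map<String, Object>> REPO = new HashMap<>();

    static {
        Map<String, Object> data = new HashMap<>();
        data.put("sling:resourceType", "sling/pipeline");
        data.put("sling:pipeline", "/the/path/to/the/pipeline");
        REPO.put("/a/b/data", data);

        Map<String, Object> pipeline = new HashMap<>();
        pipeline.put("sling:definition", Arrays.asList("step one", "step two"));
        REPO.put("/the/path/to/the/pipeline", pipeline);
    }

    // Follows the pointer from the data resource to the pipeline
    // definition and returns the configured steps.
    static List<String> resolveSteps(String dataPath) {
        Map<String, Object> data = REPO.get(dataPath);
        if (data == null || !"sling/pipeline".equals(data.get("sling:resourceType"))) {
            throw new IllegalArgumentException("Not a pipeline resource: " + dataPath);
        }
        Object pointer = data.get("sling:pipeline");
        Map<String, Object> definition =
                pointer == null ? null : REPO.get(pointer.toString());
        if (definition == null) {
            throw new IllegalStateException("No pipeline definition at: " + pointer);
        }
        @SuppressWarnings("unchecked")
        List<String> steps = (List<String>) definition.get("sling:definition");
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(resolveSteps("/a/b/data"));
    }
}
```

The point of the sketch is only the direction of the pointer: the
request targets the data, and the data names its pipeline.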

Now, the question is: how do we generate the initial (XML) data stream
for the pipeline? Here I would take a two-option approach, both options
involving a custom AbstractGenerator:

 * option 1: there is a sling:pipelineResourceType property. This
      is used to render the data resource and use the result as input
      for the pipeline: The pipeline servlet would include the
      processing of the resource overwriting the sling:resourceType with
      the value of the sling:pipelineResourceType.

 * option 2: there is no sling:pipelineResourceType property: Here the
      data resource contains the XML to be processed. You would then
      adapt the resource to an InputStream and use that directly as
      the input for the pipeline.
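
The choice between the two options could look roughly like the sketch
below. Again this is plain Java and not Sling or Cocoon API: the class
PipelineInput, the "xml" property holding the raw content, and the
renderer function (standing in for "include the resource with an
overwritten resource type") are all assumptions for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.function.Function;

public class PipelineInput {
    // Returns the XML stream that feeds the pipeline.
    // renderer is a hypothetical hook: given a resource type, it renders
    // the data resource with that type and returns the resulting XML.
    static InputStream open(Map<String, Object> props,
                            Function<String, String> renderer) {
        Object type = props.get("sling:pipelineResourceType");
        String xml;
        if (type != null) {
            // Option 1: render the resource with the overriding type.
            xml = renderer.apply(type.toString());
        } else {
            // Option 2: the resource itself holds the XML
            // ("xml" is a made-up property name for this sketch).
            xml = (String) props.get("xml");
        }
        if (xml == null) {
            throw new IllegalArgumentException("No XML input available");
        }
        return new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
    }

    // Small helper to drain a stream into a String (UTF-8).
    static String readAll(InputStream in) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Either way the generator only ever sees an InputStream, so the rest of
the pipeline does not need to know which option was taken.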


There is yet another alternative, which also sounds intriguing: We
define a ScriptEngineFactory for the ".pipeline" extension. Files with
the extension .pipeline would be pipeline configurations, which would be
interpreted by the PipelineScriptEngine. The second part of the
processing -- preparation of the input data -- would be analogous to the
above, with the same two options:

         /a/b/data
              +-- sling:resourceType = "sling/pipeline/sample"

         /apps/sling/pipeline/sample/html.pipeline
              "file with pipeline config"

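For the script engine variant, a bare-bones factory on top of the
standard javax.script API might look like the sketch below. The config
format (one step per line) is purely an assumption, since the format of
a .pipeline file is still open, and the helpers parse and parseSteps
are made up here. A real engine would build and run the pipeline
instead of just returning the list of steps.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import javax.script.AbstractScriptEngine;
import javax.script.Bindings;
import javax.script.ScriptContext;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineFactory;
import javax.script.ScriptException;
import javax.script.SimpleBindings;

public class PipelineScriptEngineFactory implements ScriptEngineFactory {
    public String getEngineName() { return "Pipeline Script Engine (sketch)"; }
    public String getEngineVersion() { return "0.1"; }
    // Registers the engine for files ending in ".pipeline".
    public List<String> getExtensions() { return Collections.singletonList("pipeline"); }
    public List<String> getMimeTypes() { return Collections.emptyList(); }
    public List<String> getNames() { return Collections.singletonList("pipeline"); }
    public String getLanguageName() { return "pipeline"; }
    public String getLanguageVersion() { return "0.1"; }

    public Object getParameter(String key) {
        if (ScriptEngine.ENGINE.equals(key)) return getEngineName();
        if (ScriptEngine.ENGINE_VERSION.equals(key)) return getEngineVersion();
        if (ScriptEngine.NAME.equals(key)) return getNames().get(0);
        if (ScriptEngine.LANGUAGE.equals(key)) return getLanguageName();
        if (ScriptEngine.LANGUAGE_VERSION.equals(key)) return getLanguageVersion();
        return null;
    }

    public String getMethodCallSyntax(String obj, String m, String... args) {
        throw new UnsupportedOperationException("not a programming language");
    }
    public String getOutputStatement(String toDisplay) {
        throw new UnsupportedOperationException("not a programming language");
    }
    public String getProgram(String... statements) {
        return String.join("\n", statements);
    }
    public ScriptEngine getScriptEngine() { return new PipelineScriptEngine(this); }

    // Assumed config format: one pipeline step per non-empty line.
    static List<String> parse(Reader reader) throws IOException {
        List<String> steps = new ArrayList<>();
        BufferedReader in = new BufferedReader(reader);
        String line;
        while ((line = in.readLine()) != null) {
            line = line.trim();
            if (!line.isEmpty()) steps.add(line);
        }
        return steps;
    }

    // Convenience wrapper for in-memory configs.
    static List<String> parseSteps(String config) {
        try {
            return parse(new StringReader(config));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static class PipelineScriptEngine extends AbstractScriptEngine {
        private final ScriptEngineFactory factory;
        PipelineScriptEngine(ScriptEngineFactory factory) { this.factory = factory; }

        public Object eval(String script, ScriptContext context) throws ScriptException {
            return eval(new StringReader(script), context);
        }
        // "Evaluating" a .pipeline script here only parses the steps.
        public Object eval(Reader reader, ScriptContext context) throws ScriptException {
            try {
                return parse(reader);
            } catch (IOException e) {
                throw new ScriptException(e);
            }
        }
        public Bindings createBindings() { return new SimpleBindings(); }
        public ScriptEngineFactory getFactory() { return factory; }
    }
}
```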

WDYT?

Regards
Felix
