Re: Exploring Corona

2008-03-28 Thread Steven Dolg



Carsten Ziegeler wrote:

Ralph Goers wrote:

Consider this:

URL baseUrl = new URL("file:///C:/temp/");
Pipeline pipeline = new NonCachingPipeline();
pipeline.addComponent(new FileGenerator(new URL(baseUrl, "xyz.xml")));
pipeline.addComponent(new XSLTTransformer(new URL(baseUrl, "xyz.xslt")));
pipeline.addComponent(new XMLSerializer());
pipeline.invoke(new InvocationImpl(System.out));

This simple pipeline has these potentially cacheable components; 
xyz.xml, xyz.xslt, the result of the XSLT transformation, and the 
final result of the pipeline. As it relates to the pipeline I don't 
see how the URL.getLastModified() really helps as it could apply to 
any of these items, two of which aren't even URLs.



Hmm, I think this isn't different to what we have today with sources.
Today: FileGenerator, XSLTTransformer use a source as input
   For caching: this source provides a validity object
URLs: FileGenerator, XSLTTransformer use a url as input
   For caching: this url provides a last modified date
XMLSerializer in both cases returns a fake (or always valid) validity 
object/last modified.

Thanks for responding ;-)
This is exactly the way I implemented the simple caching approach for 
Corona.
Patch from me is still due (I know, shame on me) - work load is 
currently quite high...


Now, as I responded to Steven, last modified covers most use cases but 
not all of the use cases the validity object can handle. This is where 
we have to think about a good way to have the same.


Carsten


Re: Exploring Corona

2008-03-28 Thread Rainer Pruy
It is essential to keep the different layers straight here.

The example is somewhere at the level of the pipeline API, or probably the 
sitemap API implementation.
Here caching is a question of the implementation of the components.
It actually will depend on different implementations of generators, 
transformers or serializers (cache-enabled or not).

URL cache support is an issue for implementing the cache support within a 
component.
E.g. the FileGenerator might use getLastModified() or similar methods to 
determine cache control info for its own cacheability.
Also the transformer might use such information for determining whether the 
script used is still valid.

Thus, it is not really surprising that the example does not benefit from the 
cache info methods provided by URL implementations: it's a different layer.

However, e.g. when trying to decide whether the "cached" result of the 
FileGenerator() *component* is still valid, it will come in handy
to have information on whether the file did change in between.
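
To make the component-level view a bit more concrete, here is a minimal sketch of a component that keeps its own cached result and re-reads the file only when the last-modified timestamp has changed. CachingFileReader and its loads counter are made up for this example and are not part of Corona; it only assumes the input is a local file.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical component-level cache; not a Corona class.
final class CachingFileReader {
    private final Path file;
    private String cachedContent;
    private long cachedLastModified = -1L;
    int loads = 0; // how often the file was actually read

    CachingFileReader(Path file) {
        this.file = file;
    }

    String read() throws IOException {
        long current = Files.getLastModifiedTime(file).toMillis();
        // Reload only if nothing is cached yet or the file changed in between.
        if (cachedContent == null || current != cachedLastModified) {
            cachedContent = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            cachedLastModified = current;
            loads++;
        }
        return cachedContent;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("xyz", ".xml");
        Files.write(file, "<doc/>".getBytes(StandardCharsets.UTF_8));
        CachingFileReader reader = new CachingFileReader(file);
        reader.read();
        reader.read(); // served from the cache, the timestamp is unchanged
        System.out.println("loads = " + reader.loads);
        Files.delete(file);
    }
}
```

The pipeline itself never sees the timestamp here; the decision stays inside the component, which is the layering described above.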

Rainer

Ralph Goers wrote:
> Consider this:
> 
> URL baseUrl = new URL("file:///C:/temp/");
> Pipeline pipeline = new NonCachingPipeline();
> pipeline.addComponent(new FileGenerator(new URL(baseUrl, "xyz.xml")));
> pipeline.addComponent(new XSLTTransformer(new URL(baseUrl, "xyz.xslt")));
> pipeline.addComponent(new XMLSerializer());
> pipeline.invoke(new InvocationImpl(System.out));
> 
> This simple pipeline has these potentially cacheable components;
> xyz.xml, xyz.xslt, the result of the XSLT transformation, and the final
> result of the pipeline. As it relates to the pipeline I don't see how
> the URL.getLastModified() really helps as it could apply to any of these
> items, two of which aren't even URLs.
> 
> Ralph
> 
> Steven Dolg wrote:
>>
>>
>> Carsten Ziegeler wrote:
>>> Steven Dolg wrote:
>>>> How about:
>>>>
>>>> URL url = new URL("some url");
>>>> URLConnection connection = url.openConnection();
>>>> connection.getLastModified();
>>>>
>>>> Not sure if this really works in all cases, but appears to be quite
>>>> suitable and easily extensible.

>>> Yes, this works for many cases, but not for cases like where you have
>>> an expiry date etc. What do you mean by "easily extensible"?
>> url.openConnection() actually returns a subclass of URLConnection
>> depending on the protocol of the URL.
>> So custom protocol implementations can return their own subclasses that
>> implement this (and other methods) accordingly.
>> And - at least theoretically - provide additional methods for handling
>> specific stuff, e.g. expiration dates.
>>>
>>> Carsten
>>>



Re: Exploring Corona

2008-03-28 Thread Carsten Ziegeler

Ralph Goers wrote:

Consider this:

URL baseUrl = new URL("file:///C:/temp/");
Pipeline pipeline = new NonCachingPipeline();
pipeline.addComponent(new FileGenerator(new URL(baseUrl, "xyz.xml")));
pipeline.addComponent(new XSLTTransformer(new URL(baseUrl, "xyz.xslt")));
pipeline.addComponent(new XMLSerializer());
pipeline.invoke(new InvocationImpl(System.out));

This simple pipeline has these potentially cacheable components; 
xyz.xml, xyz.xslt, the result of the XSLT transformation, and the final 
result of the pipeline. As it relates to the pipeline I don't see how 
the URL.getLastModified() really helps as it could apply to any of these 
items, two of which aren't even URLs.



Hmm, I think this isn't different to what we have today with sources.
Today: FileGenerator, XSLTTransformer use a source as input
   For caching: this source provides a validity object
URLs: FileGenerator, XSLTTransformer use a url as input
   For caching: this url provides a last modified date
XMLSerializer in both cases returns a fake (or always valid) validity 
object/last modified.


Now, as I responded to Steven, last modified covers most use cases but 
not all of the use cases the validity object can handle. This is where 
we have to think about a good way to have the same.
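
Just to illustrate the gap, a rough sketch of how a small validity abstraction could cover both the last-modified case and the expiry case. All types below (Validity, LastModifiedValidity, ExpiresValidity, ValidityDemo) are hypothetical and made up for this example; they are neither the existing Cocoon validity classes nor anything in Corona.

```java
import java.util.function.LongSupplier;

// Hypothetical types for illustration only.
interface Validity {
    // Returns true if a cached result may still be used at time 'now' (millis).
    boolean isValid(long now);
}

// Valid as long as the resource still reports the timestamp seen when caching.
final class LastModifiedValidity implements Validity {
    private final long cachedLastModified;
    private final LongSupplier currentLastModified;

    LastModifiedValidity(long cachedLastModified, LongSupplier currentLastModified) {
        this.cachedLastModified = cachedLastModified;
        this.currentLastModified = currentLastModified;
    }

    @Override
    public boolean isValid(long now) {
        return currentLastModified.getAsLong() == cachedLastModified;
    }
}

// Valid until a fixed point in time, regardless of modification dates.
final class ExpiresValidity implements Validity {
    private final long expiresAt;

    ExpiresValidity(long expiresAt) {
        this.expiresAt = expiresAt;
    }

    @Override
    public boolean isValid(long now) {
        return now < expiresAt;
    }
}

final class ValidityDemo {
    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        Validity unchanged = new LastModifiedValidity(42L, () -> 42L);
        Validity expired = new ExpiresValidity(now - 1);
        System.out.println(unchanged.isValid(now) + " " + expired.isValid(now));
    }
}
```

A plain last-modified date can only express the first kind; the second kind needs some extra information from the source.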


Carsten
--
Carsten Ziegeler
[EMAIL PROTECTED]


Re: Exploring Corona

2008-03-28 Thread Ralph Goers

Consider this:

URL baseUrl = new URL("file:///C:/temp/");
Pipeline pipeline = new NonCachingPipeline();
pipeline.addComponent(new FileGenerator(new URL(baseUrl, "xyz.xml")));
pipeline.addComponent(new XSLTTransformer(new URL(baseUrl, "xyz.xslt")));
pipeline.addComponent(new XMLSerializer());
pipeline.invoke(new InvocationImpl(System.out));

This simple pipeline has these potentially cacheable components; 
xyz.xml, xyz.xslt, the result of the XSLT transformation, and the final 
result of the pipeline. As it relates to the pipeline I don't see how 
the URL.getLastModified() really helps as it could apply to any of these 
items, two of which aren't even URLs.


Ralph

Steven Dolg wrote:



Carsten Ziegeler wrote:

Steven Dolg wrote:

How about:

URL url = new URL("some url");
URLConnection connection = url.openConnection();
connection.getLastModified();

Not sure if this really works in all cases, but appears to be quite 
suitable and easily extensible.


Yes, this works for many cases, but not for cases like where you have 
an expiry date etc. What do you mean by "easily extensible"?
url.openConnection() actually returns a subclass of URLConnection 
depending on the protocol of the URL.
So custom protocol implementations can return their own subclasses that 
implement this (and other methods) accordingly.
And - at least theoretically - provide additional methods for handling 
specific stuff, e.g. expiration dates.


Carsten



Re: Exploring Corona

2008-03-28 Thread Steven Dolg



Carsten Ziegeler wrote:

Steven Dolg wrote:

How about:

URL url = new URL("some url");
URLConnection connection = url.openConnection();
connection.getLastModified();

Not sure if this really works in all cases, but appears to be quite 
suitable and easily extensible.


Yes, this works for many cases, but not for cases like where you have 
an expiry date etc. What do you mean by "easily extensible"?
url.openConnection() actually returns a subclass of URLConnection 
depending on the protocol of the URL.
So custom protocol implementations can return their own subclasses that implement 
this (and other methods) accordingly.
And - at least theoretically - provide additional methods for handling 
specific stuff, e.g. expiration dates.
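
As a small sketch of what is already available without any custom subclass: the standard java.net.URLConnection offers getLastModified() and also getExpiration(), and both return 0 when the value is not known. The helper class UrlCacheInfo and the temp file in main are made up for this example and are not part of Corona.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLConnection;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper that collects cache-related info from a URL.
final class UrlCacheInfo {
    final long lastModified; // 0 if the protocol handler does not know it
    final long expiration;   // 0 if the protocol handler does not know it

    private UrlCacheInfo(long lastModified, long expiration) {
        this.lastModified = lastModified;
        this.expiration = expiration;
    }

    static UrlCacheInfo of(URL url) throws IOException {
        // openConnection() returns a protocol-specific URLConnection subclass.
        URLConnection connection = url.openConnection();
        long lastModified = connection.getLastModified();
        long expiration = connection.getExpiration();
        // Reading the header values may have opened the resource; close it again.
        connection.getInputStream().close();
        return new UrlCacheInfo(lastModified, expiration);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("xyz", ".xml");
        Files.write(file, "<doc/>".getBytes("UTF-8"));
        UrlCacheInfo info = UrlCacheInfo.of(file.toUri().toURL());
        System.out.println("lastModified=" + info.lastModified
                + ", expiration=" + info.expiration);
        Files.delete(file);
    }
}
```

Anything beyond these two values would indeed need a custom URLConnection subclass and a cast on the caller's side.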


Carsten



Re: Exploring Corona

2008-03-28 Thread Carsten Ziegeler

Steven Dolg wrote:

How about:

URL url = new URL("some url");
URLConnection connection = url.openConnection();
connection.getLastModified();

Not sure if this really works in all cases, but appears to be quite 
suitable and easily extensible.


Yes, this works for many cases, but not for cases like where you have an 
expiry date etc. What do you mean by "easily extensible"?


Carsten

--
Carsten Ziegeler
[EMAIL PROTECTED]


Re: Exploring Corona

2008-03-28 Thread Steven Dolg



Carsten Ziegeler wrote:

Reinhard Poetz wrote:
ok. Steven and I will work on Corona next week again so that the code 
reflects the "layered design" that we have discussed recently. When 
doing this we will also improve the package structure to make it 
cleaner in general (and more OSGi friendly in particular).


Great :) I'll hold my breath till then (and try to get some ideas 
about the url and caching stuff)



How about:

URL url = new URL("some url");
URLConnection connection = url.openConnection();
connection.getLastModified();

Not sure if this really works in all cases, but appears to be quite 
suitable and easily extensible.



Carsten



Re: Exploring Corona

2008-03-28 Thread Carsten Ziegeler

Reinhard Poetz wrote:
ok. Steven and I will work on Corona next week again so that the code 
reflects the "layered design" that we have discussed recently. When 
doing this we will also improve the package structure to make it 
cleaner in general (and more OSGi friendly in particular).


Great :) I'll hold my breath till then (and try to get some ideas about 
the url and caching stuff)


Carsten

--
Carsten Ziegeler
[EMAIL PROTECTED]


Re: Exploring Corona

2008-03-28 Thread Reinhard Poetz

Carsten Ziegeler wrote:
Interesting stuff - thanks Reinhard and Steven for starting this and 
sharing it with us.


Finally I had time to have a *brief* look at it and I have some remarks :)


:-)

I think the pipeline api and sitemap api should be separate things. So 
the invocation should rather be in the pipeline api as the base of 
executing pipelines. We could then split this into two modules.


good idea



I'm not sure if actions belong to the pipeline api; i think they are 
rather sitemap specific. All they do wrt to the pipeline is to change 
the invocation perhaps. So this could also be done before starting the 
pipeline and get the action stuff out of the pipeline api.


Since I wasn't sure if we need actions in the sitemap language at all, we just 
made them work. Maybe we can merge them with the controller integration which 
hasn't been thought through either.


The classes should be put into different packages: we should separate 
between the pure api, helper classes and implementations. This makes it 
easier to use the stuff in an osgi environment.


ok. Steven and I will work on Corona next week again so that the code reflects 
the "layered design" that we have discussed recently. When doing this we will 
also improve the package structure to make it cleaner in general (and 
more OSGi friendly in particular).


--
Reinhard Pötz, Managing Director, {Indoqa} GmbH
  http://www.indoqa.com/en/people/reinhard.poetz/

Member of the Apache Software Foundation
Apache Cocoon Committer, PMC member, PMC Chair    [EMAIL PROTECTED]


Re: Exploring Corona

2008-03-27 Thread Sylvain Wallez

Carsten Ziegeler wrote:
Interesting stuff - thanks Reinhard and Steven for starting this and 
sharing it with us.


Finally I had time to have a *brief* look at it and I have some 
remarks :)


I think the pipeline api and sitemap api should be separate things. So 
the invocation should rather be in the pipeline api as the base of 
executing pipelines. We could then split this into two modules.


I'm not sure if actions belong to the pipeline api; i think they are 
rather sitemap specific. All they do wrt to the pipeline is to change 
the invocation perhaps. So this could also be done before starting the 
pipeline and get the action stuff out of the pipeline api.


Yes, actions definitely don't belong to the pipeline API. They are 
sitemap control structures, just like matchers and selectors. The main 
difference between matcher and action (besides the pattern/src 
attribute) is that actions are allowed to have side effects while 
matchers should not.


The classes should be put into different packages: we should separate 
between the pure api, helper classes and implementations. This makes 
it easier to use the stuff in an osgi environment.


Ok, final comment for today, the idea of abstracting the consumer and 
the producer seems appealing. It's like the javax.xml stuff (Result, 
Source); the javax.xml stuff has the advantage that the implementation 
knows which results and sources are possible: there are only a 
handful of subclasses; adding own results or sources simply is not 
supported.

I fear we will have to follow the same path (which might not be bad).


Reminds me of some old thoughts I had about a Cocoon 3. This can be the 
role of a collection of adapters that would convert data for components 
that can't directly talk to each other. This complicates the picture a 
bit, but would allow for advanced things such as non-XML pipelines, 
mixing SAX, DOM and StAX transparently to e.g. perform some 
content-aware construction of the pipeline, etc.


Sylvain

--
Sylvain Wallez - http://bluxte.net



Re: Exploring Corona

2008-03-27 Thread Torsten Curdt


On Mar 27, 2008, at 19:14, Carsten Ziegeler wrote:
Interesting stuff - thanks Reinhard and Steven for starting this and  
sharing it with us.


Finally I had time to have a *brief* look at it and I have some  
remarks :)


I think the pipeline api and sitemap api should be separate things.


+1

So the invocation should rather be in the pipeline api as the base  
of executing pipelines. We could then split this into two modules.


I'm not sure if actions belong to the pipeline api; i think they are  
rather sitemap specific. All they do wrt to the pipeline is to  
change the invocation perhaps. So this could also be done before  
starting the pipeline and get the action stuff out of the pipeline  
api.


+1

cheers
--
Torsten


Re: Exploring Corona

2008-03-27 Thread Carsten Ziegeler
Interesting stuff - thanks Reinhard and Steven for starting this and 
sharing it with us.


Finally I had time to have a *brief* look at it and I have some remarks :)

I think the pipeline api and sitemap api should be separate things. So 
the invocation should rather be in the pipeline api as the base of 
executing pipelines. We could then split this into two modules.


I'm not sure if actions belong to the pipeline api; i think they are 
rather sitemap specific. All they do wrt to the pipeline is to change 
the invocation perhaps. So this could also be done before starting the 
pipeline and get the action stuff out of the pipeline api.


The classes should be put into different packages: we should separate 
between the pure api, helper classes and implementations. This makes it 
easier to use the stuff in an osgi environment.


Ok, final comment for today, the idea of abstracting the consumer and 
the producer seems appealing. It's like the javax.xml stuff (Result, 
Source); the javax.xml stuff has the advantage that the implementation 
knows which results and sources are possible: there are only a handful 
of subclasses; adding own results or sources simply is not supported.

I fear we will have to follow the same path (which might not be bad).

Carsten
--
Carsten Ziegeler
[EMAIL PROTECTED]


Exploring Corona

2008-03-21 Thread Reinhard Poetz


Today I have added Corona to our whiteboard section in SVN 
(http://svn.apache.org/repos/asf/cocoon/whiteboard/corona/). It mostly mimics 
the existing concepts of pipelines and sitemaps as you know from Cocoon 2.x. The 
available test cases are a good starting point to explore the sources. We also 
hope that this email explains some of our ideas.


What already works?
===================

PIPELINE API
°°°°°°°°°°°°


So far we have created a (minimalistic) pipeline API (o.a.c.c.pipeline.Pipeline 
[1]) that works based on two fundamental concepts:


1. The first component of a pipeline is of type
   o.a.c.c.pipeline.component.Starter. The last component is of type
   o.a.c.c.pipeline.component.Finisher.

2. In order to link components with each other, the first has to be
   a o.a.c.c.pipeline.component.Producer, the latter
   a o.a.c.c.pipeline.component.Consumer.


When the pipeline links the components, it merely checks whether the above 
mentioned interfaces are present. So the pipeline does not know about the 
specific capabilities or the compatibility of the components.
It is the responsibility of the Producer to decide whether a specific Consumer 
can be linked to it or not (that is, whether it can produce output in the 
desired format of the Consumer or not). It is also conceivable that a Producer 
is capable of accepting different types of Consumers and adjusts the output 
format according to the actual Consumer.
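
The linking rules above could be sketched roughly as follows. The interface names mirror the ones mentioned in the text, but their methods (setConsumer, execute, accept) as well as SimplePipeline, StringSource and StringSink are assumptions made up for this example; this is not the actual Corona code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-ins for the interfaces named above.
interface PipelineComponent {}
interface Consumer extends PipelineComponent {}
interface Producer extends PipelineComponent {
    // The producer decides whether it can feed this particular consumer.
    void setConsumer(Consumer consumer);
}
interface Starter extends PipelineComponent { void execute(); }
interface Finisher extends PipelineComponent {}

final class SimplePipeline {
    private final List<PipelineComponent> components = new ArrayList<>();

    void addComponent(PipelineComponent component) { components.add(component); }

    void execute() {
        if (components.isEmpty()) throw new IllegalStateException("empty pipeline");
        PipelineComponent first = components.get(0);
        PipelineComponent last = components.get(components.size() - 1);
        if (!(first instanceof Starter)) throw new IllegalStateException("first must be a Starter");
        if (!(last instanceof Finisher)) throw new IllegalStateException("last must be a Finisher");
        // The pipeline only checks that the interfaces are present, nothing more.
        for (int i = 0; i < components.size() - 1; i++) {
            PipelineComponent current = components.get(i);
            PipelineComponent next = components.get(i + 1);
            if (!(current instanceof Producer) || !(next instanceof Consumer)) {
                throw new IllegalStateException("cannot link component " + i);
            }
            ((Producer) current).setConsumer((Consumer) next);
        }
        ((Starter) first).execute();
    }
}

// Example components: a producer that only accepts StringSink consumers.
final class StringSink implements Consumer, Finisher {
    final StringBuilder received = new StringBuilder();
    void accept(String data) { received.append(data); }
}

final class StringSource implements Starter, Producer {
    private StringSink sink;

    @Override
    public void setConsumer(Consumer consumer) {
        if (!(consumer instanceof StringSink)) {
            throw new IllegalArgumentException("unsupported consumer type");
        }
        sink = (StringSink) consumer;
    }

    @Override
    public void execute() { sink.accept("hello"); }
}
```

The compatibility check lives in StringSource.setConsumer, not in the pipeline, which is the point of the second concept.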


There are SAX-based components that implement these concepts.

These concepts are more general than the implementation of Cocoon 2.x which has 
explicit methods to set generators, transformers, serializers and readers on 
pipelines.



SITEMAP
°°°°°°°

The sitemap engine works similarly to Cocoon 2.x. The 
o.a.c.c.sitemap.SitemapBuilder reads XML and creates a tree of 
o.a.c.c.sitemap.node.SitemapNode objects that know their parent node, their 
child nodes and their parameters.


The o.a.c.c.sitemap.node.AbstractSitemapNode handles the node relationships and 
parameters in a general way.
However there are two annotations (@o.a.c.c.sitemap.node.annotations.NodeChild 
and @o.a.c.c.sitemap.node.annotations.Parameter) to make the access of specific 
child nodes and parameters more explicit.
The NodeChild annotation can be used to store a certain child node in a separate 
member variable instead of the collection of all children (e.g. 
o.a.c.c.sitemap.node.PipelineNode receives its ErrorNode in the errorNode member 
variable).
The Parameter annotation works the same way, but causes parameters to be stored 
in separate member variables (e.g. o.a.c.c.sitemap.node.MatchNode receives its 
pattern in the pattern member variable).



When the sitemap is being executed, the invocation traverses the tree of 
SitemapNodes. Each node returns a o.a.c.c.sitemap.node.InvocationResult that 
indicates the execution state. This is one of NONE, PROCESSED, and COMPLETED:


* NONE means that the node did not do any processing whatsoever (e.g. a 
MatchNode did not match).


* PROCESSED means that the node did some processing, but the traversal should 
continue (e.g. the GenerateNode installed a Generator on the pipeline, but some 
other components might still be pending).


* COMPLETED means that the node did some processing and the traversal should 
stop, since the invocation processing is completed (e.g. the PipelineNode 
executed the pipeline).



Nodes that act as a switch (e.g. MatchNode, ErrorNode, etc.) aggregate the 
individual results of their children.
So a MatchNode will respond with NONE if and only if all of its children return 
NONE, and with COMPLETED otherwise.
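
The aggregation rule for switch-like nodes could be sketched like this. Modelling InvocationResult as an enum and the helper ResultAggregation are assumptions for this example; only the three state names and the rule itself come from the description above.

```java
import java.util.List;

// Assumption: the three states modelled as a plain enum.
enum InvocationResult { NONE, PROCESSED, COMPLETED }

final class ResultAggregation {
    private ResultAggregation() {}

    // NONE if and only if every child returned NONE, COMPLETED otherwise.
    static InvocationResult aggregate(List<InvocationResult> childResults) {
        for (InvocationResult result : childResults) {
            if (result != InvocationResult.NONE) {
                return InvocationResult.COMPLETED;
            }
        }
        return InvocationResult.NONE;
    }
}
```

Note that under this rule a child returning PROCESSED is enough for the switch node to report COMPLETED.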



EXECUTION CONTEXT
°°°°°°°°°°°°°°°°°

When a pipeline, and in turn the sitemap, is invoked, the execution context 
is passed along. Since context has so many different meanings to us, we called this 
execution context o.a.c.c.sitemap.Invocation. It contains input parameters, 
sitemap parameters and a component provider and gives access to the result.


Since the sitemap should be useable in any environment, the Invocation doesn't 
have any environment specific dependencies (e.g. the Servlet API). Hence, the 
input parameters are a general map. However, our idea is that environment 
specific parameters (e.g. the HTTPRequest) can be put into this map too and can 
be made accessible by an accessor helper class.
So if a component needs access to environment specific parameters e.g. the 
HTTPRequest, it uses the appropriate accessor helper class. All components and 
accessor helper classes that belong to a certain environment (iow. are not 
generally available) should be bundled together. This creates a core module that 
is useable in any environment and additional modules for specific purposes.
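
The accessor helper idea could look roughly like the sketch below. SimpleInvocation, FakeHttpRequest, HttpAccessor and the map key are all made up for this example; the point is only that the invocation itself stays free of environment-specific types.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an environment-specific object (e.g. an HTTP request).
final class FakeHttpRequest {
    final String uri;
    FakeHttpRequest(String uri) { this.uri = uri; }
}

// Environment-neutral invocation: it only knows about a general map.
final class SimpleInvocation {
    private final Map<String, Object> parameters = new HashMap<>();

    void setParameter(String key, Object value) { parameters.put(key, value); }
    Object getParameter(String key) { return parameters.get(key); }
}

// Accessor helper that would live in the environment-specific module.
final class HttpAccessor {
    private static final String REQUEST_KEY = "http.request"; // assumed key name

    private HttpAccessor() {}

    static void setRequest(SimpleInvocation invocation, FakeHttpRequest request) {
        invocation.setParameter(REQUEST_KEY, request);
    }

    // Returns the request, or null when running outside an HTTP environment.
    static FakeHttpRequest getRequest(SimpleInvocation invocation) {
        Object value = invocation.getParameter(REQUEST_KEY);
        return value instanceof FakeHttpRequest ? (FakeHttpRequest) value : null;
    }
}
```

A component that needs the request calls HttpAccessor.getRequest(invocation) and never touches the key or the cast itself.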


Since the sitemap shouldn't depend on a specific component container, the 
o.a.c.c.sitemap.ComponentProvider was introduced as an abstraction for specific 
containers. So far we have implemented an o.a.c.c.sitemap.SpringComponentProvider 
that enc