Re: Momento and Cocoon [was Re: Jisp 3.0 moved to GPL licence]

2004-03-05 Thread Daniel Fagerstrom
Alan wrote:
* Daniel Fagerstrom <[EMAIL PROTECTED]> [2004-02-23 15:21]:

XSLT



A MomentoSource would also give a good way to use Momento together with 
XSLT and XQuery in Cocoon. Here we need to extend the ordinary use of 
sources somewhat, let me explain:


The Source interface provides a getInputStream method; in Cocoon some 
Sources implement org.apache.excalibur.xml.sax.XMLizable, which provides 
a toSAX method as well. SAX or Streams are probably not the most 
efficient way to communicate with an XML db, so to make the pseudo 
protocol idea usable together with Momento, we should provide a way to 
get a DOM structure from a pseudo protocol. This could be done by 
introducing a new interface:


interface DOMizable {
  org.w3c.dom.Node getNode();
}


Momento, with Cocoon in mind, lends itself to streaming.

Momento would readily support a read-only W3 DOM, but a read write
W3 DOM is quite ugly.

W3 DOM lets you create inconsistent documents, which is not in
keeping with the C in ACID. (Examples if you want them.) There
is no way to specify the start and end of an atomic transaction
through the DOM API.

Momento uses XUpdate since one can specify a set of modifications,
and Momento can process those modifications as an atomic
transaction. XUpdate expresses all document modifications, and
does so declaratively. Momento can then make sense of your
intentions.
In a pipeline, XML input can be transformed into an XUpdate
statement. I suppose one could create an XUpdate using JXTemplate from
Flow as well.
XUpdate is really the method of choice for updating Momento.
Both XUpdate and SAX input are a good way to get data into
Momento.
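
For readers unfamiliar with XUpdate, a set of modifications might be assembled roughly as in the Java sketch below. It only builds and serializes an xupdate:modifications document with the JDK's DOM and JAXP classes. The class name XUpdateSketch, its method names and the select paths are made up for illustration, and how Momento actually accepts such a document is not shown anywhere in this thread.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class XUpdateSketch {
    // Namespace used by the XUpdate working draft.
    static final String XUPDATE_NS = "http://www.xmldb.org/xupdate";

    /** Builds one modifications document holding an update and a remove. */
    static Document buildModifications(String updateSelect, String newText, String removeSelect) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().newDocument();

            Element mods = doc.createElementNS(XUPDATE_NS, "xupdate:modifications");
            mods.setAttribute("version", "1.0");
            doc.appendChild(mods);

            // Replace the text content of the selected node.
            Element update = doc.createElementNS(XUPDATE_NS, "xupdate:update");
            update.setAttribute("select", updateSelect);
            update.setTextContent(newText);
            mods.appendChild(update);

            // Remove another node as part of the same set of modifications.
            Element remove = doc.createElementNS(XUPDATE_NS, "xupdate:remove");
            remove.setAttribute("select", removeSelect);
            mods.appendChild(remove);
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Serializes the document with the JAXP identity transformer. */
    static String serialize(Document doc) {
        try {
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            identity.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The paths are hypothetical; any XPath the target document supports would do.
        Document doc = buildModifications("/site/page[1]/title", "Home", "/site/page[2]");
        System.out.println(serialize(doc));
    }
}
```

Because both changes travel in one document, a store that processes the document as a unit has a natural transaction boundary, which is the point made above about atomicity.
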
I don't know if you and I are talking about the same thing here, but
the sight of org.w3c.dom.Node leaves me cold. It is a nice
in-memory interface, but a poor interface for persistence.
If W3 DOM were the way to modify a Momento document, the
application developer would have to be prepared to catch all
kinda hel.., er, exceptions, since there are a bunch of stupid
things that Momento won't allow.
I only talked about read only access of DOM documents from XSLT, don't 
worry ;)



or something similar. If the MomentoSource implements DOMizable, we have 
direct access to nodes in the XML db.


Now we are prepared to connect Momento to XSLT. In Cocoon we can use 
Saxon through the org.apache.cocoon.transformation.TraxTransformer, you 
just need to change cocoon.xconf a little bit to use Saxon instead of 
Xalan. There is also a TraxGenerator in the scratchpad that could be 
used with some small modifications.


Momento connects to XSLT using a Saxon NodeInfo interface. It could
connect to Xalan just as easily (through read-only W3 DOM?).
Yes, that's the idea. It can connect to Saxon through read-only DOM as 
well; I don't know if there are any drawbacks with this, though.

I would guess that Momento mainly would be accessed through the document 
function in XSLT and XQuery. Saxon uses JAXP 1.1 as the external API to the 
transformer, and the URLs in the document functions are resolved by using 
an implementation of javax.xml.transform.URIResolver that is provided by 
the TraxTransformer.


The above is somewhat confusing for me. Momento does support the
JAXP API. XUpdate is implemented as a SAX filter. It seems like
Momento would work nicely as a source, sink, or filter for
SAX events.

I've imagined that a pipeline would start with a Momento
document and an XSLT transform or XQuery query.

Something along these lines:


  
   xslt="index-document.xslt"/>
  
  


(It is easier for me to express myself as a Cocoon user.)
I rather propose:


  
  
  

The idea is that the xslt generator can be used with any source. For 
this to be efficient with Momento we must arrange things so that the XSLT 
processor can access Momento as a read-only DOM. This will not happen 
today in Cocoon. So what I describe is how to extend the involved 
mechanisms in Cocoon so that the XSLT processor gets the Momento DOM as input.

This is done by creating a new interface, let us call it 
ReadOnlyDOMizable to avoid confusion ;) so that we can check if a 
source (e.g. the Momento source) can return a DOM. We also need to 
extend the URIResolver in the XSLT processor implementation so that it 
returns a DOMSource if the input source implements ReadOnlyDOMizable, 
a SAXSource if the input source implements XMLizable, and a StreamSource 
otherwise. That is all.
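
The three-way choice described above could be sketched in Java roughly as follows. This is not the Excalibur or Cocoon code: PseudoSource, XMLizable and ReadOnlyDOMizable are local stand-ins declared only so the example compiles against the JDK alone, the map lookup stands in for the real source resolver, and the URIs are invented. The sketch also does nothing to enforce that the returned DOM is read-only.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.Source;
import javax.xml.transform.URIResolver;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.XMLFilterImpl;

final class ResolverSketch {
    // Stand-ins (assumptions) for the Excalibur Source and XMLizable types
    // and for the proposed ReadOnlyDOMizable interface.
    interface PseudoSource { InputStream getInputStream(); }
    interface XMLizable { void toSAX(ContentHandler handler) throws SAXException; }
    interface ReadOnlyDOMizable { Node getNode(); }

    /** Minimal XMLReader that replays an XMLizable; feature negotiation is left out. */
    static final class XMLizableReader extends XMLFilterImpl {
        private final XMLizable xml;
        XMLizableReader(XMLizable xml) { this.xml = xml; }
        @Override public void parse(InputSource input) throws SAXException {
            xml.toSAX(getContentHandler());
        }
    }

    /** Picks DOMSource, SAXSource or StreamSource depending on what the source offers. */
    static final class CapabilityResolver implements URIResolver {
        private final Map<String, PseudoSource> sources = new HashMap<>();
        void register(String href, PseudoSource source) { sources.put(href, source); }

        @Override public Source resolve(String href, String base) {
            PseudoSource source = sources.get(href); // stands in for the real source resolver
            if (source == null) return null;         // the processor then resolves the URI itself
            if (source instanceof ReadOnlyDOMizable) {
                return new DOMSource(((ReadOnlyDOMizable) source).getNode(), href);
            }
            if (source instanceof XMLizable) {
                return new SAXSource(new XMLizableReader((XMLizable) source), new InputSource(href));
            }
            return new StreamSource(source.getInputStream(), href);
        }
    }

    static InputStream bytes() { return new ByteArrayInputStream("<root/>".getBytes()); }

    // Two example sources: one hands out a DOM node, one only emits SAX events.
    static final class DomBacked implements PseudoSource, ReadOnlyDOMizable {
        private final Document doc = newDocument();
        public InputStream getInputStream() { return bytes(); }
        public Node getNode() { return doc; }
    }

    static final class SaxBacked implements PseudoSource, XMLizable {
        public InputStream getInputStream() { return bytes(); }
        public void toSAX(ContentHandler handler) throws SAXException {
            handler.startDocument();
            handler.startElement("", "root", "root", new AttributesImpl());
            handler.endElement("", "root", "root");
            handler.endDocument();
        }
    }

    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    static CapabilityResolver newResolver() {
        CapabilityResolver resolver = new CapabilityResolver();
        resolver.register("momento://db/doc", new DomBacked());
        resolver.register("sax://doc", new SaxBacked());
        resolver.register("plain://doc", ResolverSketch::bytes); // stream only
        return resolver;
    }

    public static void main(String[] args) {
        CapabilityResolver resolver = newResolver();
        for (String href : new String[] {"momento://db/doc", "sax://doc", "plain://doc"}) {
            System.out.println(href + " -> " + resolver.resolve(href, null).getClass().getSimpleName());
        }
    }
}
```

The order of the instanceof checks matters: a source that offers both a DOM and SAX events is handed over as a DOM, which matches the preference stated above.
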


The implementation of the URIResolver that is used is 
org.apache.excalibur.xml.xslt.XSLTProcessorImpl. In its current 
incarnation it uses the Excalibur source resolver to get the source and 
then returns a javax.xml.transform.stream.StreamSource. For use with 
Momento we need an implementation of URIResolver that checks if the 
source is DOMizable and in that case returns a 
javax.xml.transform.dom.DOMS

Re: Momento and Cocoon [was Re: Jisp 3.0 moved to GPL licence]

2004-02-26 Thread Alan
Responding now after having spent a week in California and a week
working on my web site (http://engrm.com/). Announcing Momento
created a communication burden for me that I am learning how to
shoulder. After a little more work on my web site, I ought to be
able to return to Momento coding.

Cocoon is a large application. I'm primarily familiar with
Cocoon output: pipelines. I'm also familiar with XSLT. 

I know nothing of Cocoon internals, and very little about Flow
and CForms. Although I look forward to learning about both,
the following comments are going to be based on some wild,
unfounded assumptions about Cocoon.

* Daniel Fagerstrom <[EMAIL PROTECTED]> [2004-02-23 15:21]:
> Upayavira wrote:
> >Reinhard Poetz wrote:
> >>From: Alan
> >>>Working on it. As noted, I have JAXP implemented and SAX interface
> >>>   to XUpdate. I have APIs. I am going to start working on services
> >>>   next.
> >>>  A Cocoon generator that takes a Momento data source and an XSLT
> >>>   transform would be a start.
> >>>
> >>>   I'm not sure how to get information into Momento via Cocoon. I'm
> >>>   thinking about some sort of Woody binding, but that goes beyond
> >>>   my current understanding of Cocoon.

> >>speaking without following this thread closely: What about implementing 
> >>a Momento source?

> >Yup. Alan, take a look at the XMLDBSource and XMLDBSourceFactory. I 
> >think you'll find them reasonably similar to what you might want to do 
> >(in src/blocks/xmldb/java/org/apache/cocoon/components/source/impl)

> >If you implemented a MomentoSource, and made it implement 
> >ModifiableSource, then you would be able to read/write from within 
> >Cocoon. With this, you would be able to use Woody's binding 
> >functionality to bind forms directly to Momento data.

I really want to see Momento work with CForms.

> >You could also do something like the XMLDBTransformer to allow updates 
> >(src/blocks/xmldb/java/org/apache/cocoon/transformation/XMLDBTransformer.java). 

> >[NB. with an XML:DB interface to Momento, you wouldn't need to do 
> >anything to interface to Cocoon].

Isn't XML:DB deadish?

I wrote Momento because at the time Xindice was zero traffic,
dbXML was proprietary, eXist wouldn't install with Cocoon, and
the XML:DB site hadn't been updated in two years. 

Also, it doesn't look like it will make the most of Momento.
I don't like the collections concept, I much prefer one big
document.

Still, an XML:DB interface shouldn't be too difficult to implement.

--- In response to  Mr. Fagerstrom --- 

> Pseudo protocol
> ===

> In Cocoon (or actually Avalon Excalibur), we have a generalization of 
> protocols, java.net.URL, called pseudo protocol 
> org.apache.excalibur.source.Source, there are also various extensions of 
> Source like ModifiableSource, TraversableSource among others. Pseudo 
> protocols are an excellent way of separating the location of data from 
> what to do with it. If you package a data source as a pseudo protocol 
> you can access it by using its URL, e.g. 
> momento://dbpath/collection#xpath(foo/bar), through Cocoon's source 
> resolver. This makes it possible to use sources for things like src attributes 
> in the sitemap, the document function in XSLT and XQuery, hrefs in the 
> [X|C]IncludeTransformer, in the SourceWritingTransformer and within 
> flowscripts.

> A MomentoSource would thus give a lot of flexibility in using Momento in 
> Cocoon. Especially if it allows using XPath(2.0) in the URLs and if it 
> is a modifiable source.

Since Momento is pageable, and multi-threaded, you should be able to
yank stuff out of Momento from the sitemap, maybe dumping it
into your pipeline, without parsing a document.

When a Momento URL resolves, it will not require a document to
be loaded and parsed, so it means that things like XLink or
XPointer will not suffer that performance hit I read about.

In implementing XLink or XPointer (or whatever) one could read
documents into Momento as an intermediate, caching step.

> XSLT
> 

> A MomentoSource would also give a good way to use Momento together with 
> XSLT and XQuery in Cocoon. Here we need to extend the ordinary use of 
> sources somewhat, let me explain:

> The Source interface provides a getInputStream method; in Cocoon some 
> Sources implement org.apache.excalibur.xml.sax.XMLizable, which provides 
> a toSAX method as well. SAX or Streams are probably not the most 
> efficient way to communicate with an XML db, so to make the pseudo 
> protocol idea usable together with Momento, we should provide a way to 
> get a DOM structure from a pseudo protocol. This could be done by 
> introducing a new interface:

> interface DOMizable {
>   org.w3c.dom.Node getNode();
> }

Momento, with Cocoon in mind, lends itself to streaming.

Momento would readily support a read-only W3 DOM, but a read write
W3 DOM is quite ugly.

W3 DOM

Re: Momento and Cocoon [was Re: Jisp 3.0 moved to GPL licence]

2004-02-23 Thread Alan
* Geoff Howard <[EMAIL PROTECTED]> [2004-02-24 00:31]:
> Upayavira wrote:
> 
> >[changing subject...]
> >
> >Reinhard Poetz wrote:
> >
> >>From: Alan
> >>
> >>>* Geoff Howard <[EMAIL PROTECTED]> [2004-02-22 18:47]:
> >>>  
> >>>
> Alan wrote:
> 
> 
> >* Upayavira <[EMAIL PROTECTED]> [2004-02-22 07:58]:
> >  
> >
> >>I tend to think that Momento isn't suited to this need.   
> >
> >>However, as an XML data repository, it seems very interesting.
> >> 
> >
> >I've got a better idea of how Jisp is used in Cocoon from reading
> > all the discussion after my post.
> > 
> > I suggested Momento because someone suggested Xindice which led
> > me to believe Jisp handled an XML persistence task.
> >
> > Might not be the best bet, no.   
> > 
> 
> Still, I think finding a way to use momento to reduce 
> >>>
> >>>memory overhead   
> >>>
> in
> working with large xml datasets has great potential.  No one really 
> knows how great, but a demo/sample using it would be a 
> >>>
> >>>start...  (hint   
> >>>
> hint :)  )
> 
> >>>
> >>>Working on it. As noted, I have JAXP implemented and SAX interface
> >>>   to XUpdate. I have APIs. I am going to start working on services
> >>>   next.
> >>
> 
> JAXP... see below
> 
> >>>  A Cocoon generator that takes a Momento data source and an XSLT
> >>>   transform would be a start.
> >>>
> >>>   I'm not sure how to get information into Momento via Cocoon. I'm
> >>>   thinking about some sort of Woody binding, but that goes beyond
> >>>   my current understanding of Cocoon.
> >>>  
> >>
> >>speaking without following this thread closely: What about 
> >>implementing a Momento source? 
> >
> 
> I'm starting to wonder if I'm being dense... wouldn't the easiest first 
> test integration be to use Momento as the JAXP xslt processor to reduce 
> memory overhead on transformations of large data sets?  Maybe I've 
> misunderstood where/what momento is as a project?   The jaxp processor 
> is declared in cocoon.xconf  (see instructions for switching to saxon 
> for example).

I created a blog entry today:

 Momento Inline
 2004/02/23 12:26:54

 There is a mode of operation for Momento that I've not considered
 at length. Inline operation with XSLT. That is, operation where
 Momento is not a data source, rather it is a transient document
 object model.

 This applies when performing a transform against a document that
 may be too large to fit in system memory. A common use case is XML
 generated from an SQL query. The SQL result set can be streamed as
 a series of SAX events that clogs memory as the XSLT engine tries
 to build a document representing a large data set.

 There is nothing preventing Momento from building a document,
 organized and clustered at the get go. More interesting would be
 for Momento to build the document in memory, writing it to disk
 only when memory runs out.

 Currently, Momento writes its pages as it fills them. Momento might
 delay a page write until a page fills, when it makes sense to do
 so, but it pretty much writes the pages to disk as it writes nodes
 and strings to the pages. It pools pages in memory using weak
 references. When no one is writing to or reading from a page, the
 weak reference will be the only reference, thus it is eligible for
 garbage collection. Momento would have to intercept the garbage
 collector's desire to release the page, and write it out before the
 memory is released.

 This means I need to develop a deeper understanding of weak
 references in Java. If there is no way to hook collection before
 the fact, I'd have to rethink the paging engine so that it would
 explicitly release pages as part of a MRU cache.

 A hybrid of this could be used to maintain XSLT output as part of a
 cache system. Momento would build the result of an XSLT transform
 organized and clustered, writing it to disk when memory is tight. A
 cache could use this Momento document as a source of SAX events or
 as a W3 DOM document, discarding it when the upstream dependencies
 change. 
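
The fallback mentioned in the blog entry, explicitly releasing pages from a cache instead of waiting on the garbage collector, could look roughly like the Java sketch below. Everything in it is hypothetical: Page, PageWriter and PageCache are not Momento classes. It also evicts the least recently used page, which is an assumption on my part; the entry says "MRU", and the eviction policy Momento would actually want is not stated. The motivation is that a weak reference is cleared before the application hears about it through a ReferenceQueue, so a dirty page cannot be written out from that notification.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class PageCacheSketch {
    /** Hypothetical page: an id, a buffer and a dirty flag. */
    static final class Page {
        final int id;
        final byte[] data;
        boolean dirty;
        Page(int id, int size) {
            this.id = id;
            this.data = new byte[size];
        }
    }

    /** Stand-in for whatever writes a page to disk. */
    interface PageWriter {
        void write(Page page);
    }

    /** Access-ordered map that flushes a dirty page before dropping it. */
    static final class PageCache extends LinkedHashMap<Integer, Page> {
        private final int capacity;
        private final PageWriter writer;

        PageCache(int capacity, PageWriter writer) {
            super(16, 0.75f, true); // true = iterate in access order, eldest first
            this.capacity = capacity;
            this.writer = writer;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<Integer, Page> eldest) {
            if (size() <= capacity) {
                return false; // still room, keep everything
            }
            Page victim = eldest.getValue();
            if (victim.dirty) {
                writer.write(victim); // write back before the page leaves memory
                victim.dirty = false;
            }
            return true; // tell LinkedHashMap to drop the eldest entry
        }
    }

    /** Fills a two-page cache with three dirty pages and reports which ids were written. */
    static List<Integer> demo() {
        List<Integer> written = new ArrayList<>();
        PageCache cache = new PageCache(2, page -> written.add(page.id));
        for (int id = 1; id <= 3; id++) {
            Page page = new Page(id, 4096);
            page.dirty = true;
            if (id == 3) {
                cache.get(1); // touch page 1 so that page 2 becomes the eldest
            }
            cache.put(id, page);
        }
        return written;
    }

    public static void main(String[] args) {
        System.out.println("pages written on eviction: " + demo());
    }
}
```

With this arrangement the cache, not the collector, decides when a page leaves memory, so the write-back always happens while the page is still strongly reachable.
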

So, no Geoff, I'm only getting around to thinking of Momento as a
transient document object model. I'd designed it as a persistence
engine, so I'd considered Momento to be a source of data, a
place where data lives.

(I'll get back to everyone else soon. Great stuff everyone. Thank
you so very much!)

-- 
Alan / [EMAIL PROTECTED] / http://engrm.com/
aim/yim: alanengrm - icq: 228631855 - msn: [EMAIL PROTECTED]


Re: Momento and Cocoon [was Re: Jisp 3.0 moved to GPL licence]

2004-02-23 Thread Geoff Howard
Upayavira wrote:

[changing subject...]

Reinhard Poetz wrote:

From: Alan

* Geoff Howard <[EMAIL PROTECTED]> [2004-02-22 18:47]:
  

Alan wrote:


* Upayavira <[EMAIL PROTECTED]> [2004-02-22 07:58]:
  

I tend to think that Momento isn't suited to this need.   

However, as an XML data repository, it seems very interesting.
 
I've got a better idea of how Jisp is used in Cocoon from reading
 all the discussion after my post.
 
 I suggested Momento because someone suggested Xindice which led
 me to believe Jisp handled an XML persistence task.

 Might not be the best bet, no.   
 
Still, I think finding a way to use momento to reduce 
memory overhead   

in
working with large xml datasets has great potential.  No one really 
knows how great, but a demo/sample using it would be a 
start...  (hint   

hint :)  )

Working on it. As noted, I have JAXP implemented and SAX interface
   to XUpdate. I have APIs. I am going to start working on services
   next.

JAXP... see below

  A Cocoon generator that takes a Momento data source and an XSLT
   transform would be a start.
   I'm not sure how to get information into Momento via Cocoon. I'm
   thinking about some sort of Woody binding, but that goes beyond
   my current understanding of Cocoon.
  
speaking without following this thread closely: What about 
implementing a Momento source? 

I'm starting to wonder if I'm being dense... wouldn't the easiest first 
test integration be to use Momento as the JAXP xslt processor to reduce 
memory overhead on transformations of large data sets?  Maybe I've 
misunderstood where/what momento is as a project?   The jaxp processor 
is declared in cocoon.xconf  (see instructions for switching to saxon 
for example).

Yup. Alan, take a look at the XMLDBSource and XMLDBSourceFactory. I 
think you'll find them reasonably similar to what you might want to do 
(in src/blocks/xmldb/java/org/apache/cocoon/components/source/impl)

If you implemented a MomentoSource, and made it implement 
ModifiableSource, then you would be able to read/write from within 
Cocoon. With this, you would be able to use Woody's binding 
functionality to bind forms directly to Momento data.

You could also do something like the XMLDBTransformer to allow updates 
(src/blocks/xmldb/java/org/apache/cocoon/transformation/XMLDBTransformer.java). 

[NB. with an XML:DB interface to Momento, you wouldn't need to do 
anything to interface to Cocoon].


These are also good ideas for the write aspect, but I see benefit in the 
read aspect if I understood correctly.

Geoff


Re: Momento and Cocoon [was Re: Jisp 3.0 moved to GPL licence]

2004-02-23 Thread Daniel Fagerstrom
Upayavira wrote:
Reinhard Poetz wrote:
From: Alan
Working on it. As noted, I have JAXP implemented and SAX interface
   to XUpdate. I have APIs. I am going to start working on services
   next.
  A Cocoon generator that takes a Momento data source and an XSLT
   transform would be a start.
   I'm not sure how to get information into Momento via Cocoon. I'm
   thinking about some sort of Woody binding, but that goes beyond
   my current understanding of Cocoon.
  


speaking without following this thread closely: What about implementing 
a Momento source?
 

Yup. Alan, take a look at the XMLDBSource and XMLDBSourceFactory. I 
think you'll find them reasonably similar to what you might want to do 
(in src/blocks/xmldb/java/org/apache/cocoon/components/source/impl)

If you implemented a MomentoSource, and made it implement 
ModifiableSource, then you would be able to read/write from within 
Cocoon. With this, you would be able to use Woody's binding 
functionality to bind forms directly to Momento data.

You could also do something like the XMLDBTransformer to allow updates 
(src/blocks/xmldb/java/org/apache/cocoon/transformation/XMLDBTransformer.java). 

[NB. with an XML:DB interface to Momento, you wouldn't need to do 
anything to interface to Cocoon].

Hope this helps.

Regards, Upayavira
I agree with the above suggestions and would like to provide some more 
technical details.

Pseudo protocol
===
In Cocoon (or actually Avalon Excalibur), we have a generalization of 
protocols, java.net.URL, called pseudo protocol 
org.apache.excalibur.source.Source, there are also various extensions of 
Source like ModifiableSource, TraversableSource among others. Pseudo 
protocols are an excellent way of separating the location of data from 
what to do with it. If you package a data source as a pseudo protocol 
you can access it by using its URL, e.g. 
momento://dbpath/collection#xpath(foo/bar), through Cocoon's source 
resolver. This makes it possible to use sources for things like src attributes 
in the sitemap, the document function in XSLT and XQuery, hrefs in the 
[X|C]IncludeTransformer, in the SourceWritingTransformer and within 
flowscripts.

A MomentoSource would thus give a lot of flexibility in using Momento in 
Cocoon. Especially if it allows using XPath(2.0) in the URLs and if it 
is a modifiable source.
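
To make the URL form concrete, a source factory would first have to take such a location apart. The Java sketch below does that with java.net.URI only. MomentoUri is a made-up helper, and the xpath(...) fragment syntax is simply taken from the example URL above rather than from any published Momento or Cocoon specification.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Hypothetical helper that splits a momento: location into its parts. */
final class MomentoUri {
    final String database; // authority part, "dbpath" in the example URL
    final String path;     // path part, "/collection" in the example URL
    final String xpath;    // expression inside xpath(...), or null if absent

    private MomentoUri(String database, String path, String xpath) {
        this.database = database;
        this.path = path;
        this.xpath = xpath;
    }

    static MomentoUri parse(String location) {
        URI uri;
        try {
            uri = new URI(location);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("malformed location: " + location, e);
        }
        if (!"momento".equals(uri.getScheme())) {
            throw new IllegalArgumentException("not a momento: location: " + location);
        }
        // The fragment carries the optional XPath selection.
        String fragment = uri.getFragment();
        String xpath = null;
        if (fragment != null && fragment.startsWith("xpath(") && fragment.endsWith(")")) {
            xpath = fragment.substring("xpath(".length(), fragment.length() - 1);
        }
        return new MomentoUri(uri.getAuthority(), uri.getPath(), xpath);
    }

    public static void main(String[] args) {
        MomentoUri parsed = parse("momento://dbpath/collection#xpath(foo/bar)");
        System.out.println(parsed.database + " " + parsed.path + " " + parsed.xpath);
    }
}
```

A real MomentoSource would hand the database and path to the store and evaluate the XPath there, so that only the selected nodes are ever read.
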

XSLT

A MomentoSource would also give a good way to use Momento together with 
XSLT and XQuery in Cocoon. Here we need to extend the ordinary use of 
sources somewhat, let me explain:

The Source interface provides a getInputStream method; in Cocoon some 
Sources implement org.apache.excalibur.xml.sax.XMLizable, which provides 
a toSAX method as well. SAX or Streams are probably not the most 
efficient way to communicate with an XML db, so to make the pseudo 
protocol idea usable together with Momento, we should provide a way to 
get a DOM structure from a pseudo protocol. This could be done by 
introducing a new interface:

interface DOMizable {
   org.w3c.dom.Node getNode();
}
or something similar. If the MomentoSource implements DOMizable, we have 
direct access to nodes in the XML db.

Now we are prepared to connect Momento to XSLT. In Cocoon we can use 
Saxon through the org.apache.cocoon.transformation.TraxTransformer, you 
just need to change cocoon.xconf a little bit to use Saxon instead of 
Xalan. There is also a TraxGenerator in the scratchpad that could be 
used with some small modifications.

I would guess that Momento mainly would be accessed through the document 
function in XSLT and XQuery. Saxon uses JAXP 1.1 as the external API to the 
transformer, and the URLs in the document functions are resolved by using 
an implementation of javax.xml.transform.URIResolver that is provided by 
the TraxTransformer.

The implementation of the URIResolver that is used is 
org.apache.excalibur.xml.xslt.XSLTProcessorImpl. In its current 
incarnation it uses the Excalibur source resolver to get the source and 
then returns a javax.xml.transform.stream.StreamSource. For use with 
Momento we need an implementation of URIResolver that checks if the 
source is DOMizable and in that case returns a 
javax.xml.transform.dom.DOMSource instead. This can be done by extending 
the Excalibur XSLTProcessorImpl and changing the XSLTProcessor in 
cocoon.xconf.

XQuery
==
XQuery in Saxon uses a proprietary API (there is no standard in this 
area yet). So we need a specialized SaxonXQueryGenerator. Saxon uses the 
JAXP URIResolver for XQuery also, so the above described mechanisms can 
be used here as well. Unfortunately Saxon is MPL 1.0, which is not 
compatible with the ASL, so we cannot have Saxon as a part of Cocoon :(

  --- o0o ---

Sorry for all the technical details ;)

As you can see, for reading from Momento, the only Momento specific code 
is in the MomentoSource, everything else is using DOM, JAXP and Cocoon 
APIs. Therefore the proposed mechanisms would give an efficient way of 
usin

RE: Momento and Cocoon [was Re: Jisp 3.0 moved to GPL licence]

2004-02-23 Thread Reinhard Poetz

From: Upayavira

> >speaking without following this thread closely:
> >What about implementing a Momento source?
> >  
> >
> Yup. Alan, take a look at the XMLDBSource and XMLDBSourceFactory. I 
> think you'll find them reasonably similar to what you might 
> want to do 
> (in src/blocks/xmldb/java/org/apache/cocoon/components/source/impl)
> 
> If you implemented a MomentoSource, and made it implement 
> ModifiableSource, then you would be able to read/write from within 
> Cocoon. With this, you would be able to use Woody's binding 
> functionality to bind forms directly to Momento data.
> 
> You could also do something like the XMLDBTransformer to 
> allow updates 
> (src/blocks/xmldb/java/org/apache/cocoon/transformation/XMLDBTransformer.java).
> 
> [NB. with an XML:DB interface to Momento, you wouldn't need to do 
> anything to interface to Cocoon].
> 
> Hope this helps.

Thanks, this is exactly what I was thinking of!

Best,
Reinhard