(Sorry to approach the page limit ... it is worth explaining.)

To our local Cocoon-2.0b2 I have added a CatalogEntityResolver onto the entityResolver hook that is provided by the Xerces parser. Cocoon can now utilise the power of OASIS Catalogs or XML Catalogs. These provide a standards-based mechanism to resolve Public Identifiers and System Identifiers to local filenames, to other identifiers, or even to remote network resources. So references to external DTDs, sets of character entities such as mathematical symbols, fragments of XML documents, complete sub-documents, non-XML data chunks (like images), etc. can all be centrally managed and resolved locally.

The XML documents that we want to serve with Apache Cocoon already exist in another information system. Each XML document instance declares its DTD (Document Type Definition) as an external file. This external DTD also includes entity sets such as ISOnum, ISOlat1, etc. The DTD declaration has a Formal Public Identifier and a System Identifier which points to a remote URL. These XML instances cannot be changed.

Whether you have validation=yes or not, the parser will still want to resolve all of the entities that are required by the XML instance. So it will happily go across the network to get them, and it will do this every time that the document is processed. This is obviously a needless overhead. Additionally, if your Cocoon is an off-line server then it is broken, because it cannot retrieve the network-based resources.

As far as I know, the sitemap cannot be used to specify the location of these resources, because this resolution of the external entities is under the control of the guts of the parser and the XML model.

The following article eloquently describes the need for all parsers and XML frameworks to be capable of utilising entity resolvers. Very few do that yet (SP/nsgmls and XMetaL) while others have hooks that are not utilised.
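For readers unfamiliar with the hook, here is a minimal sketch of the general idea in Java. It is not the change made to JaxpParser.java and it is not the Arbortext catalog code: the class LocalEntityResolver, its addPublic method, the newReaderWith helper, and the in-memory map are hypothetical stand-ins for a real catalog loaded from a file. Only the SAX and JAXP pieces (EntityResolver, InputSource, XMLReader.setEntityResolver, SAXParserFactory) are standard.

```java
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;

/**
 * Hypothetical stand-in for a catalog-backed resolver. It maps
 * Public Identifiers to local copies of DTDs and entity sets.
 */
class LocalEntityResolver implements EntityResolver {

    // Public Identifier -> local System Identifier.
    // Assumption: a real catalog would be read from a catalog file.
    private final Map<String, String> publicIdMap = new HashMap<>();

    /** Register a local copy for a given Public Identifier. */
    void addPublic(String publicId, String localUri) {
        publicIdMap.put(publicId, localUri);
    }

    /** Called by the parser for every external entity it needs. */
    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        String local = (publicId == null) ? null : publicIdMap.get(publicId);
        if (local == null) {
            // No mapping: the parser opens the System Identifier itself.
            return null;
        }
        InputSource source = new InputSource(local);
        source.setPublicId(publicId);
        return source;
    }

    /** Create a SAX reader with the given resolver attached. */
    static XMLReader newReaderWith(EntityResolver resolver)
            throws ParserConfigurationException, SAXException {
        XMLReader reader =
            SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setEntityResolver(resolver);
        return reader;
    }
}
```

Returning null from resolveEntity asks the parser to fall back to the System Identifier as written, so in this sketch any entity without a mapping would still be fetched from its original location.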
Arbortext make their Java code available to the public domain.

  If You Can Name It, You Can Claim It!
  by Norman Walsh
  www.arbortext.com/Think_Tank/XML_Resources/Issue_Three/issue_three.html

There are also some other links which extol entity management:
  www.auslig.gov.au/anzmeta/catalog.html

The Walsh document made it very easy to hook up the entity resolver. A handful of lines were added to components/parser/JaxpParser.java to load the catalogs and to set the entity resolver. Excellent: Cocoon is now automatically using the local entities and there is a speedup in processing.

I believe that this capability should be added to the core Apache Cocoon. The code changes can be supplied, if appropriate.

regards, David Crossley

---------------------------------------
> Date: Fri, 23 Feb 2001 14:03:43 +1100
> From: David Crossley <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
>
> Back in July/August 2000 there was a small discussion
> on this list about XML Public Identifier resolution.
>
> There is a very good article by Norm Walsh explaining the
> importance of using an SGML Open Catalog (OASIS Catalog)
> for resolving Public Identifiers to local file copies
> of the relevant DTDs and other entities.
>
> This document also provides access to Java classes for
> implementing catalogs for entity management. The Cocoon
> administrator at each server should be able to add new
> entries to their XML catalog.
>
> www.arbortext.com/Think_Tank/XML_Resources/Issue_Three/issue_three.html
> www.oasis-open.org/cover/topics.html#fpi-fsi
>
> regards, David Crossley
>
> ---------------------------------------
> > Date: Sat, 05 Aug 2000 20:58:50 +0200
> > Subject: [Cocoon Devel] Re: DTD PUBLIC ID resolution
> > From: Stefano Mazzocchi [EMAIL PROTECTED]
> >
> > Hans Ulrich Niedermann wrote:
> > >
> > > I'd like to have a mechanism that maps some known PUBLIC IDs from the
> > > <!DOCTYPE> declaration to the corresponding local URIs (similar to
> > > SGML catalog files).
> > > This would allow one to write XML files with the
> > > "canonical" URI for the used DTDs and still use a local copy for
> > > validation and default value gathering, which increases both
> > > reliability and speed.
> > >
> > > Do you think such a mechanism makes sense?
> >
> > Sure it does, it's called "catalog" and it goes back to the old SGML
> > days.
> >
> > > Has anybody seen such a thing implemented yet?
> >
> > I'm pretty sure all good parsers implement one (I know Xerces does)
> >
> > > Where could/should such a thing be hooked into the C2 processing chain?
> >
> > If we use Xerces, we can use their API and provide the catalog
> > ourselves.... or use directly SAX EntityResolver...hmmmm, probably
> > better using SAX anyway...
> >
> > > Where and how should the configuration, i.e. the mapping from PUBLIC
> > > to SYSTEM be stored?
> >
> > Good question. I haven't thought about it (yet). Should the sitemap
> > contain the semantics to describe schema catalogs as well?
> >
> > > I don't mean to distract you from important things but perhaps we
> > > should think about it before every API and config file spec is set
> > > into stone.
> >
> > Totally. Thanks for bringing this up.
> >
> > > I'd be willing to contribute some code if/when I can figure out how
> > > the C2 internals really are supposed to work.
> >
> > Same here :)
> >
> > Anyway, I'll dive into C2 very soon, expect tons of
> > "what-the-hell-is-this?" :)
> >
> > --
> > Stefano Mazzocchi      One must still have chaos in oneself to be
> >                          able to give birth to a dancing star.
> > <[EMAIL PROTECTED]>                             Friedrich Nietzsche
> > --------------------------------------------------------------------