The problem
-----------
It seems that some tools still do not use a Catalog Entity
Resolver [1] to locate local copies of DTDs. Hence these tools
break, because there are no actual DTDs on the website. This is
limiting the uptake of the Apache Document DTDs and drawing lots
of user questions.

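
As a rough illustration of what a catalog-aware tool does, here is a
minimal sketch in Python. It is not the xml-commons-resolver
implementation. The public identifier, remote URL, local path and the
class name LocalCatalogResolver are hypothetical placeholders; only
the EntityResolver base class and its resolveEntity method come from
Python's standard xml.sax.handler module.

```python
from xml.sax.handler import EntityResolver

# Hypothetical placeholders; the real values live in the Forrest catalog.
PUBLIC_ID = "-//EXAMPLE//DTD Document V1.2//EN"
REMOTE_URL = "http://example.invalid/dtd/document-v12.dtd"
LOCAL_COPY = "/opt/example/schema/dtd/document-v12.dtd"


class LocalCatalogResolver(EntityResolver):
    """Map public identifiers to local files, else keep the system identifier."""

    def __init__(self, catalog):
        # catalog: mapping of public identifier -> local file path
        self.catalog = dict(catalog)

    def resolveEntity(self, publicId, systemId):
        # A catalog hit avoids any network access; a miss returns the
        # original system identifier, which the parser would then fetch.
        return self.catalog.get(publicId, systemId)


resolver = LocalCatalogResolver({PUBLIC_ID: LOCAL_COPY})
resolved_known = resolver.resolveEntity(PUBLIC_ID, REMOTE_URL)
resolved_unknown = resolver.resolveEntity("-//EXAMPLE//DTD Unknown//EN", REMOTE_URL)
```

A tool that wires a resolver like this into its parser never goes to
the network for DTDs listed in the catalog. A tool that does not use
one falls through to the remote URL, which is the case this proposal
addresses.
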
Some background
---------------
Forrest and Cocoon both use the Catalog Entity Resolver and this
works fine in all situations (command-line, webapp). Any sensible
XML tool can utilise the xml-commons-resolver.jar, and the Catalog
supplied by Forrest. Some other tools either do not use it, or
the user has not configured it.

The solution
------------
Place the document-v* DTDs and supporting files on the Forrest
website and keep them up-to-date with Forrest CVS. Add an HTML page
to explain that there are better solutions and that this is only
a fall-back solution.

Some issues
-----------
1) The DTDs will still be managed in the Forrest CVS at [2].

2) The Forrest build system is complex. It would be good to
   automate the publishing of DTD versions, but that may not be
   possible.

3) The Forrest website is built using the "stable" version of
   Forrest (currently v0.5.1). So how will DTDs from the current
   CVS (v0.6-dev) get into the website CVS [3]? Manual copy?
   See 4).

4) If a committer changes the DTDs in CVS, then the website copies
   will be out-of-sync. Will committers remember to do the manual
   copy? See 3).

5) Extra impact on the Apache webserver. Is this a big bandwidth
   consumer? See some estimates in [4].

6) What will be the URLs for the DTDs?
   http://xml.apache.org/forrest/dtd/...
   We presume that Forrest will not be a top-level project soon.

7) Does the Apache webserver deliver the supporting files using
   the appropriate Content-Type? Is that "text/plain"? Does it
   matter? We have *.dtd and *.mod and *.pen and *.ent extensions.

8) We have two separate directories in CVS. The /dtd/ and the
   /entity/ directories are parallel. We can probably merge
   everything into /dtd/ and change the Catalogs.

9) We will never know if the Catalog Entity Resolver gets broken
   after an upgrade. Forrest will still work but will be slower,
   downloading the DTD and supporting files on each document
   parse. We can probably add a test document in the "forrest seed
   site" to detect failure.

10) Cocoon has a copy of the DTDs and supporting files in its own
    CVS so that it can build its own documentation via its webapp
    and command-line builds. This can still continue, but needs
    a better solution. Perhaps Forrest can provide one later.

11) Do we need to ask infrastructure@ about this proposal or just
    do it?

Next steps
----------
* Discuss this proposal on cocoon-dev and forrest-dev.
* If no new issues surface, then summarise and call a vote if
  needed.
* Modify Forrest build system to copy the DTDs into the website
  CVS.

References
----------
[1] Apache Catalog Entity Resolver
    http://xml.apache.org/commons/components/resolver/

[2] DTDs in Forrest CVS
    http://cvs.apache.org/viewcvs/xml-forrest/src/core/context/resources/schema/

[3] Forrest website CVS
    http://cvs.apache.org/viewcvs/xml-site/targets/forrest/

[4] Some recent mail discussion:
    http://marc.theaimsgroup.com/?l=xml-cocoon-dev&m=107042081805837

[5] Jira issue FOR-107
    http://issues.cocoondev.org/jira//secure/ViewIssue.jspa?key=FOR-107