On Feb 10, Steinar H. Gunderson ([EMAIL PROTECTED]) wrote:
> On Fri, Feb 10, 2006 at 07:44:23AM -0500, Neil Roeth wrote:
> > Correct on both counts: it would be just "pushing it ahead", and I am
> > reluctant at this time to turn off DTDDECL in stable, because that is
> > removing behavior, though the longer the package remains in unstable with
> > DTDDECL handling turned off and no complaints, the less reluctant I will
> > be to make the corresponding change in stable.
>
> What does the DTDDECL thing _do_, BTW? I don't think I've gotten to that part
> yet :-)

An SGML declaration defines the syntax of SGML used in the document, down to
what characters are allowed, capacities, case sensitivity, etc.  This is how
the rules for XML and HTML are specified.  In short, the SGML declaration
specifies the syntax of elements, while the DTD specifies the structure of the
document in terms of elements.

There are three ways to determine the SGML declaration to use:

(1) Use the default implied SGML declaration built into OpenSP.

(2) Explicitly specify it on the command line, e.g.,

      onsgmls -s /usr/share/xml/declaration/xml.dcl foo.xml

(3) Use one implied by the DTD of the document and a DTDDECL entry in the
    SGML catalog.  This entry means that if using the specified DTD, use the
    corresponding SGML declaration.

The test document in this case does use a DTD that has a DTDDECL entry in the
catalog, so OpenSP is trying to use that SGML declaration to parse the
document.  One difference between this way of finding an SGML declaration and
the other two is that the document has to be parsed once, at least to some
extent, in order to find its DTD, and then that DTD has to be used to find any
corresponding DTDDECL entries.  Conceptually, you could do a first pass over
the document to find the DTD, find any DTDDECL entries, read the corresponding
SGML declaration, then make a second pass over the document with the new SGML
declaration.  I think it is this multipass requirement of DTDDECL handling
that is causing this bug, since a pipe cannot be reread for the second pass.

> But will a workaround for this be to simply use a temporary file instead of
> reading from a pipe? I guess I can do that, I don't need the validation in
> production code...

Yes, that will work.  A temporary file has no problem being reread.  You could
also pipe the SGML declaration before the document, e.g.

      cat /usr/share/sgml/html/dtd/4.01/HTML4.decl test8k.html | onsgmls -s

I think an explicit SGML declaration passed to OpenSP tells it to skip looking
for one any other way.
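
For what it's worth, both workarounds could be wrapped up roughly as below.
This is only a sketch, not something I have run against the failing case: the
function names and the VALIDATOR variable are made up for illustration
(VALIDATOR exists only so another command can stand in for "onsgmls -s"), and
the declaration file is whatever is appropriate for your document.

```shell
# Workaround 1: spool stdin to a temporary file, which can be reread freely,
# then hand the file name to the validator.
# VALIDATOR is a hypothetical override; it defaults to "onsgmls -s".
validate_via_tempfile() {
    tmp=$(mktemp) || return 1
    cat > "$tmp"
    ${VALIDATOR:-onsgmls -s} "$tmp"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Workaround 2: put a known SGML declaration in front of the document on the
# pipe. $1 is the declaration file; the document itself is read from stdin.
validate_with_decl() {
    cat "$1" - | ${VALIDATOR:-onsgmls -s}
}
```

Usage would be something like "generate_html | validate_via_tempfile", or
"validate_with_decl /usr/share/sgml/html/dtd/4.01/HTML4.decl < test8k.html",
where generate_html stands for whatever produces your document.  The second
form only helps when the right declaration is known ahead of time.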
If you know what SGML declaration to use, that's fine; the trick is that it
depends on the DTD of the document, and figuring out the SGML declaration that
corresponds to the DTD of the document is the whole point of the DTDDECL
handling that is causing this bug!

HTH

-- 
Neil Roeth