Michael Glavassevich wrote:
[3] In the xerces2-j FAQ on configuration for validation [4] there is a note that reads: "An application may choose to create a configuration that does not have a DTD validator but has an XML Schema validator. This will turn Xerces into a non-compliant processor according to XML 1.0 and XML Schema specifications, thus the validation/augmentation outcome is undefined." I completely disagree with this note and think Xerces is incorrect in this behavior. I suppose their reasoning is that the presence of a <!DOCTYPE> in the instance requires that DTD validation be performed, but this isn't sanctioned by the XML spec. The presence of <!DOCTYPE> is NOT the signal to perform validation [5], although many processors believe it is.

This section of the FAQ is targeted at XNI parser configuration authors. I admit that could be made clearer. In Xerces the DTD validator component (even when validation is turned off) provides attribute defaults, attribute types from the DTD and determines which white space characters are element content whitespace (also known as ignorable whitespace). A parser configuration which doesn't include a DTD validator in the pipeline may be missing properties from the infoset, therefore it is not a compliant XML 1.0 processor. What validating and non-validating processors do is described in another FAQ (http://xml.apache.org/xerces2-j/faq-write.html#faq-2).
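
To make the attribute-default point concrete, here is a minimal sketch using the standard JAXP API rather than an XNI parser configuration. The class name AttrDefaultDemo, the sample document and the "color" attribute are made up for illustration. It assumes the default parser that ships with the JDK, which is derived from Xerces. Validation is off, yet the default declared in the internal subset should still appear on the element.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class AttrDefaultDemo {
    // Hypothetical sample: the internal subset declares a default value
    // for the "color" attribute, and the instance omits the attribute.
    static final String XML =
        "<?xml version=\"1.0\"?>\n"
      + "<!DOCTYPE note [\n"
      + "  <!ELEMENT note (#PCDATA)>\n"
      + "  <!ATTLIST note color CDATA \"red\">\n"
      + "]>\n"
      + "<note>hello</note>";

    // Parse with validation turned off and return the value the parser
    // reports for the attribute that was never written in the instance.
    static String readColor() throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(false); // non-validating, as in the discussion
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(XML)));
        return doc.getDocumentElement().getAttribute("color");
    }

    public static void main(String[] args) throws Exception {
        // The value comes from the ATTLIST default, not from the instance.
        System.out.println(readColor());
    }
}
```

A custom XNI configuration that leaves the DTD validator out of the pipeline is the case the FAQ note warns about. This sketch only shows what the stock configuration supplies and does not build such a pipeline.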



[4] http://xml.apache.org/xerces2-j/faq-pcfp.html

Thank you for the reply, but it seems not quite correct. I'm aware you're an expert on these matters but, at the extreme risk of teaching grandmother to suck eggs, I have to point out there is nothing about the infoset in the XML 1.0 specification. There are only things that every processor is required to do and a separate list of things a validating processor is required to do. A non-validating processor is not required to read the external DTD at all (unless standalone="yes" is specified). But it is permitted to. It is specifically permitted to know the definitions of general entities defined in the external subset (or not) even though it is not required to read it. It is not required, because it has this knowledge, to finish the job of validating.


If there are any issues, it seems Xerces has brought them on itself. For example, it won't allow a user to turn on schema-validation or specify a schema language unless validation is also on. Xerces supports XML pipelining in at least two ways, SAX and XNI. A pipeline, by definition, should be independent of what feeds it and to what its output is sent, but the Xerces doc goes out of its way to say that the result of a certain pipeline configuration is undefined. It's like a stem cell saying "Don't use me as a brain cell!"

Bob Foster

