Sandy, would it be fair to say that Schema was designed mostly to deal with typing (both element and content), whereas the DTD wasn't designed with a strong typing model and therefore took a more scrutinizing approach as to whether or not content existed and whether or not it was desired (particularly in the case of whitespace)?
Thanks,
Brion

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Friday, December 13, 2002 10:57 AM
To: [EMAIL PROTECTED]
Subject: Re: Question on Feature: http://apache.org/xml/features/dom/include-ignorable-whitespace

> But I do think it would be appropriate to have an
> additional feature which permits performing that filtering on the basis of
> schema knowledge,

I agree that it *could* be useful to some users that schema validation also marks some whitespace "ignorable". But ...

The infoset (after DTD validation) has a property for "element content whitespace". So it's clear that the infoset is making "element content whitespace" special. But in PSVI (the result of schema validation), nothing was made special for whitespace within elements with element-only content. So it *seems* that schema doesn't want to make such a distinction. This is why the parser doesn't (and I don't think it should) have a feature for that.

There is also another argument. Though the XML processing model is not defined, no one said that you can't do both DTD and schema validation. (Actually, the schema spec implies that, sometimes, DTD validation should happen before schema validation, because of the ENTITY type.) So if DTD validation says a whitespace is ignorable, but schema says no, or the other way around (DTD: no; schema: ignorable), how do you report that to the application?

So IMO, if people really think that "schema-element-only-content-whitespace" is a significant concept and should be marked special by a processor, then such a requirement should be raised to the schema WG, so that the PSVI will include something for that. Then we can add something to the parser (but I'm not sure how to do that without conflicting with the ignorable whitespace concept).
Cheers,
Sandy Gao
Software Developer, IBM Canada
(1-905) 413-3255
[EMAIL PROTECTED]

Joseph Kesselman/Watson/ [EMAIL PROTECTED]
12/13/2002 09:29 AM
Please respond to xerces-j-user

To: [EMAIL PROTECTED]
cc:
Subject: Re: Question on Feature: http://apache.org/xml/features/dom/include-ignorable-whitespace

Sandy, we agree that we disagree slightly.

"Ignorable" isn't a term in any of the official W3C documentation; it's an informal phrase that SAX adopted. The actual W3C-defined concept, from the XML spec, is "whitespace in element content" -- in other words, whitespace appearing in a place where only elements are considered valid.

I agree that the XML Recommendation itself, where this phrase appears, is only aware of DTDs and thus defines it only in terms of DTDs. However, since schemas are considered an extension of the concept of validity, I believe it's reasonable to allow schemas to also provide information about what is and isn't "element content".

I've no objection to Xerces defaulting to looking only at the DTD -- indeed, if I remember correctly the default behavior now is that whitespace-in-element-content is *not* removed unless you enable the proper feature. But I do think it would be appropriate to have an additional feature which permits performing that filtering on the basis of schema knowledge, if that's what the user wants. The alternative would be for them to query the PSVI APIs and post-process the document themselves, which is both inconvenient and relatively inefficient.
______________________________________
Joe Kesselman / IBM Research

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
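
For readers following the thread, here is a minimal sketch of the DTD-based behavior both messages take as the baseline. It is an illustration, not code from either poster. It uses the standard JAXP setting DocumentBuilderFactory.setIgnoringElementContentWhitespace, which, as far as I know, Xerces-based implementations map onto the feature named in the subject line; treat that mapping as an assumption. The class name, method name, and sample document are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of Xerces or of the original thread.
class ElementContentWhitespaceDemo {

    // Sample document (an assumption for illustration). The internal DTD
    // subset declares <list> with element-only content, so the newlines and
    // indentation between the <item> children are "whitespace in element
    // content" in the sense of the XML Recommendation.
    static final String XML =
          "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE list [\n"
        + "  <!ELEMENT list (item+)>\n"
        + "  <!ELEMENT item (#PCDATA)>\n"
        + "]>\n"
        + "<list>\n  <item>a</item>\n  <item>b</item>\n</list>";

    // Parses the sample with DTD validation and returns the number of child
    // nodes (elements and text nodes) directly under the root element.
    public static int countChildren(boolean dropElementContentWhitespace) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // The JAXP documentation says the whitespace setting relies on the
            // content model and therefore needs validating mode.
            factory.setValidating(true);
            factory.setIgnoringElementContentWhitespace(dropElementContentWhitespace);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(XML)));
            return doc.getDocumentElement().getChildNodes().getLength();
        } catch (Exception e) {
            // Wrapped so callers do not have to handle checked exceptions.
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // Whitespace text nodes between the items are kept in the first case
        // and filtered out in the second.
        System.out.println("kept:    " + countChildren(false));
        System.out.println("dropped: " + countChildren(true));
    }
}
```

Note that the filtering here is driven entirely by the DTD content model. If the same structure were described only by a schema, this setting would have nothing to go on, which is the gap Joe's proposed additional feature (or a PSVI-based post-processing pass) would have to fill.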
