include-ignorable-whitespace

sandygao 13 Dec 2002 15:57:25 -0000

> But I do think it would be appropriate to have an
> additional feature which permits performing that filtering on the basis of
> schema knowledge,


I agree that it *could* be useful to some users if schema validation also
marked some whitespace as "ignorable". But ...

The infoset (after DTD validation) has a property for "element content
whitespace". So it's clear that the infoset is making "element content
whitespace" special.

But in the PSVI (the result of schema validation), nothing special is
defined for whitespace within elements that have element-only content. So
it *seems* that the schema spec doesn't want to make such a distinction.
This is why the parser doesn't (and I don't think it should) have a feature
for that.

There is also another argument. Though the XML processing model is not
defined, no one said that you can't do both DTD and schema validation.
(Actually, the schema spec implies that, sometimes, DTD processing should
happen before schema validation, because of the ENTITY type.) So if DTD
validation says some whitespace is ignorable, but schema says no, or the
other way around (DTD: no; schema: ignorable), how do you report that to
the application?

So IMO, if people really think that
"schema-element-only-content-whitespace" is a significant concept and
should be marked as special by a processor, then that requirement should be
raised with the schema WG, so that the PSVI will include something for it.
Then we can add something to the parser (though I'm not sure how to do so
without conflicting with the existing ignorable whitespace concept).

Cheers,
Sandy Gao
Software Developer, IBM Canada
(1-905) 413-3255
[EMAIL PROTECTED]



From:     Joseph Kesselman/Watson/[EMAIL PROTECTED]
Date:     12/13/2002 09:29 AM
To:       [EMAIL PROTECTED]
cc:
Subject:  Re: Question on Feature: http://apache.org/xml/features/dom/include-ignorable-whitespace
Please respond to xerces-j-user



Sandy, we agree that we disagree slightly.

"Ignorable" isn't a term in any of the official W3C documentation; it's an
informal phrase that SAX adopted. The actual W3C-defined concept, from the
XML spec, is "whitespace in element content" -- in other words, whitespace
appearing in a place where only elements are considered valid.

I agree that the XML Recommendation itself, where this phrase appears, is
only aware of DTDs and thus defines it only in terms of DTDs. However,
since schemas are considered an extension of the concept of validity, I
believe it's reasonable to allow schemas to also provide information about
what is and isn't "element content".

I've no objection to Xerces defaulting to looking only at the DTD --
indeed, if I remember correctly the default behavior now is that
whitespace-in-element-content is *not* removed unless you turn the
include-ignorable-whitespace feature off (it defaults to true).
But I do think it would be appropriate to have an
additional feature which permits performing that filtering on the basis of
schema knowledge, if that's what the user wants. The alternative would be
for them to query the PSVI APIs and post-process the document themselves,
which is both inconvenient and relatively inefficient.
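
For comparison, here is a rough sketch of both routes using only standard
JAXP/DOM calls. setIgnoringElementContentWhitespace() is the JAXP switch for
the DTD-based behavior. The stripByHeuristic() method is a hypothetical
stand-in for post-processing: it does NOT consult the PSVI or any schema, it
simply assumes that an element with at least one element child has
element-only content. The class name, method names, and sample document are
all made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class IgnorableWhitespaceToggle {
    // Hypothetical sample: <root> has element-only content per the DTD.
    static final String XML =
        "<!DOCTYPE root [\n"
      + "  <!ELEMENT root (a, b)>\n"
      + "  <!ELEMENT a EMPTY>\n"
      + "  <!ELEMENT b EMPTY>\n"
      + "]>\n"
      + "<root>\n  <a/>\n  <b/>\n</root>";

    /** Parses with DTD validation; optionally drops element content whitespace. */
    static Document parse(String xml, boolean dropWhitespace) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // JAXP documents that the setting below relies on validating mode.
        factory.setValidating(true);
        factory.setIgnoringElementContentWhitespace(dropWhitespace);
        return factory.newDocumentBuilder()
                      .parse(new InputSource(new StringReader(xml)));
    }

    /**
     * Heuristic post-processing (an assumption, not schema knowledge):
     * removes whitespace-only text nodes from any element that has at
     * least one element child.
     */
    static void stripByHeuristic(Node node) {
        boolean hasElementChild = false;
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeType() == Node.ELEMENT_NODE) {
                hasElementChild = true;
            }
        }
        Node c = node.getFirstChild();
        while (c != null) {
            Node next = c.getNextSibling(); // remember before removal
            if (c.getNodeType() == Node.ELEMENT_NODE) {
                stripByHeuristic(c);
            } else if (hasElementChild
                    && c.getNodeType() == Node.TEXT_NODE
                    && c.getNodeValue().trim().isEmpty()) {
                node.removeChild(c);
            }
            c = next;
        }
    }

    static int rootChildCount(Document doc) {
        return doc.getDocumentElement().getChildNodes().getLength();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("default parse:    " + rootChildCount(parse(XML, false)));
        System.out.println("parser filtering: " + rootChildCount(parse(XML, true)));
        Document doc = parse(XML, false);
        stripByHeuristic(doc.getDocumentElement());
        System.out.println("post-processed:   " + rootChildCount(doc));
    }
}
```

Note that the heuristic would wrongly strip significant whitespace from
mixed content such as <p><b>x</b> <i>y</i></p>, which is exactly why real
content-model knowledge (from a DTD or a schema) is needed to do this
safely.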

______________________________________
Joe Kesselman  / IBM Research


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]





