On Dec 18, 2009, at 16:36 , Cyril Concolato wrote:
> On 18/12/2009 15:58, Robin Berjon wrote:
>> I don't think that looking at XHTML is the best idea if you want a normative 
>> definition for XML :)
> I agree but the XML spec is so indigestible sometimes that it's hard to find 
> the proper info. It was a bit digested in XHTML :)

Heh, fair enough. I think ed5 is decently readable, but maybe that's due to 
years of reading the previous versions...

>> P+C doesn't tie processors to a particular version of XML, and lists its 
>> white space characters accordingly (and defensively). If you're certain that 
>> you will only ever get content that comes from a conforming XML 1.0 
>> implementation, then you probably don't need to check for this.
> I don't read it like that. P&C explicitly references XML 1.0 and never 
> mentions 1.1. So I thought the behavior was conformant to 1.0. It's fine if 
> the spec also handles 1.1 but it should be mentioned. The rationale for 
> the choice of space characters should also be indicated and the differences 
> between XML 1.0 and XML 1.1 should be present.

I beg to differ. I think that we should build specifications that can handle 
future changes to the stack without listing all the versions that are 
supported. P+C is built for XML 1.0, and it's great that it has the resilience 
to handle changes to 1.1 without a hitch — but who knows what XML 4.2 might 
add? We can't guarantee that it'll work, but we can try (and if it does work, I 
don't think that we should list it either). I certainly don't think that it's 
the right place to document potential differences between versions of XML — as 
your XHTML example shows, that kind of information goes stale.

Furthermore, I didn't say that the differences between XML 1.0 and 1.1 are the 
rationale for this choice — I was merely indicating that using 1.1 you could 
get such characters and that P+C's robustness against that was a plus. I wasn't 
in Marcos's brain when that part was written but my specification exegesis 
antennae suspect that the listed class of characters corresponds to the Unicode 
white space character class (and therefore to what Unicode-aware processors 
would consider white space, notably \s in regular expressions).
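
To make the \s point concrete, here is a minimal Python sketch. It is only an 
illustration: the names XML10_SPACE, CANDIDATES, is_xml10_space, 
is_regex_space and only_regex_space are made up for this example, the 
candidate characters are a hand-picked sample rather than the list from P+C, 
and Python's \s is an approximation of the Unicode white space class, not 
an exact match for it.

```python
import re

# White space per the S production of XML 1.0: space, tab, CR, LF.
XML10_SPACE = frozenset("\u0020\u0009\u000D\u000A")

# A hand-picked sample of characters outside that set (illustrative only,
# not the list used by P+C).
CANDIDATES = {
    "NEXT LINE": "\u0085",
    "NO-BREAK SPACE": "\u00A0",
    "LINE SEPARATOR": "\u2028",
    "IDEOGRAPHIC SPACE": "\u3000",
}


def is_xml10_space(ch: str) -> bool:
    """True if ch is one of the four XML 1.0 white space characters."""
    return ch in XML10_SPACE


def is_regex_space(ch: str) -> bool:
    """True if ch matches \\s in a Python str (Unicode) pattern."""
    return re.fullmatch(r"\s", ch) is not None


def only_regex_space(chars: dict) -> list:
    """Names of characters that \\s accepts but XML 1.0's S does not."""
    return sorted(
        name
        for name, ch in chars.items()
        if is_regex_space(ch) and not is_xml10_space(ch)
    )


if __name__ == "__main__":
    for name in only_regex_space(CANDIDATES):
        print(name)
```

A processor that only tests for the four XML 1.0 characters would treat every 
character in the sample as ordinary content, whereas a \s-based check would 
treat them as white space.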

-- 
Robin Berjon - http://berjon.com/



