Philip Taylor wrote:
Shane McCarron wrote:
Julian Reschke wrote:
It's clear that if RDFa is to be used with prefix declarations done
with xmlns, then mixing uppercase and lowercase declarations is not
going to work.
I think restricting prefixes to be lower-case (insert proper Unicode
terminology here) would be acceptable; it's easy to live with, and
avoids introducing yet another prefix declaration mechanism.
I would not be opposed to adding text in the RDFa in HTML definition
like "prefix names SHOULD be defined in lower-case to help ensure
maximum portability among parsers, since it is common for DOM-based
parsers to not preserve the case of attribute names."
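
As a rough illustration of the case-folding concern, here is a minimal sketch using Python's standard-library html.parser as a stand-in for an HTML-family parser. It is a tokenizer, not a DOM-based or HTML5-conforming parser, and the markup and prefix URIs are hypothetical. It only shows that attribute names are reported in lower case, so two prefixes that differ only by case become indistinguishable.

```python
from html.parser import HTMLParser


class AttrCollector(HTMLParser):
    """Records attribute (name, value) pairs as the HTML parser reports them."""

    def __init__(self):
        super().__init__()
        self.attrs = []

    def handle_starttag(self, tag, attrs):
        # html.parser hands over attribute names already lower-cased.
        self.attrs.extend(attrs)


def html_attrs(markup):
    collector = AttrCollector()
    collector.feed(markup)
    collector.close()
    return collector.attrs


# Hypothetical markup: two prefix declarations that differ only by case.
markup = (
    '<div xmlns:Foo="http://example.org/a#" '
    'xmlns:foo="http://example.org/b#"></div>'
)
names = [name for name, _ in html_attrs(markup)]
print(names)
```

Both declarations come back under the same name, so a processor working from this output cannot tell which URI was meant for "Foo". Other HTML parsers may go further and drop one of the duplicates entirely.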
If portability isn't guaranteed in a very simple case like this, then
it sounds like the specification would have failed at the fundamental
task of specifying behaviour that will be interoperably implemented.
(Once portability is guaranteed, it might be good to recommend against
using non-lowercase prefixes because they might have surprising (but
portable) behaviour, but that's a very different reason.)
I don't see there being any need to change the definition of
XML-based languages like RDFa for XHTML. After all, in XML case is
preserved. Or is it someone's goal that documents be able to be
parsed as EITHER XML or HTML? It's not my goal. If I define a
document using an HTML family language, I expect it to be parsed
using an HTML family parser. If I define it using an XHTML family
language then I expect it to be parsed using an XML-conforming
parser. Such a parser would preserve the case of element and
attribute names.
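
For comparison, a minimal sketch of the XML side, assuming Python's standard-library expat bindings with namespace processing left off so that xmlns declarations are reported as ordinary attributes. The markup and prefix URIs are hypothetical.

```python
import xml.parsers.expat


def xml_attr_names(markup):
    """Return attribute names exactly as an XML parser reports them."""
    names = []

    def on_start(tag, attrs):
        # attrs is a dict of attribute name -> value for this element.
        names.extend(attrs.keys())

    # No namespace_separator is given, so expat does no namespace
    # processing and xmlns:* declarations show up as plain attributes.
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start
    parser.Parse(markup, True)
    return names


# Hypothetical markup: two prefix declarations that differ only by case.
markup = (
    '<div xmlns:Foo="http://example.org/a#" '
    'xmlns:foo="http://example.org/b#"/>'
)
xml_names = xml_attr_names(markup)
print(sorted(xml_names))
```

Here the two declarations stay distinct, which is the behaviour described above for XML-conforming parsers.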
People will read the RDFa-in-XHTML specs and guides and tutorials and
examples, and use the same syntax in their own pages. Then they'll
serve their pages as text/html and expect it to work the same.
A survey of random pages from dmoz.org about a year ago found that
~18% used an XHTML doctype, and ~0.03% were served as
application/xhtml+xml. On the Alexa top 200 a bit earlier
(http://lists.w3.org/Archives/Public/public-html/2007Aug/1248.html), a
third used an XHTML doctype and three quarters of those were not
well-formed XML. So: Any new markup will be overwhelmingly served as
text/html, and most of it that claims to be XHTML won't be usable with
an XML parser.
Thus, the XHTML syntax will mostly be processed using the
RDFa-in-text/html processing rules. If those rules don't do what
people expect (after they've read the XHTML-focused specs and guides
and tutorials and examples), then they will be surprised and unhappy
and it will be a bad situation.
To make the situation better, either (a) the RDFa-in-XHTML
documentation should all be removed and replaced with
RDFa-in-text/html documentation so that people won't be encouraged to
use the wrong syntax in their pages; or (b) the RDFa-in-XHTML syntax
should give the same results (as far as possible, given the
backward-compatibility constraints) when processed with the
RDFa-in-text/html processing rules.
I presume (a) isn't going to happen. That leaves (b), which would
require coordination between RDFa-in-XHTML and RDFa-in-text/html, and
seems likely to require changes to the RDFa-in-XHTML spec to smooth
out the differences.
Wow, Philip, you're using an 8-gauge shotgun to hunt baby bunnies here.
Can I take a leap of faith and guess that the pages in that 18% which
declare an XHTML doctype but are not well-formed XML are probably also
not using RDFa?
The RDFa in XHTML spec doesn't need to change if a new document covering
RDFa in HTML is created. Does it? Maybe a cross-reference between the
documents, with a general warning about differences between the two
documents would be good.
As it is, there's probably going to be confusion about XHTML versus HTML
with the HTML5 spec. I'm rather waiting for someone to use <br> in XHTML5.
Shelley