On May 20, 2006, at 8:49 AM, David Powell wrote: (at great length)

I'm going to re-organize David's argument a little and deal with one of his last points first.

Foreign attributes are bad, and are inherently less interoperable
than Extension Elements.

I would say that furious debates about elements-vs-attributes have been going on since the dawn of XML in 1998, but that would be untrue; they've been going on since the dawn of XML's precursor SGML in 1986. They have never led anywhere. Once you've noticed that if you need nested element structure you're stuck with elements, and that if you don't want two things with the same name attributes can help, there really aren't any deterministic decision procedures.

I note with pleasure that *all* known XML APIs allow you to retrieve elements and attributes with about the same degree of difficulty.
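
As a minimal sketch of that point, here is how one API (Python's standard xml.etree.ElementTree) reads a foreign attribute and an extension element. The extension namespace http://example.org/ext and the name "mood" are made up for illustration; only the Atom namespace is real.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
EXT = "http://example.org/ext"  # hypothetical extension namespace

doc = f"""<entry xmlns="{ATOM}" xmlns:x="{EXT}">
  <title x:mood="cheerful">Hello</title>
  <x:mood>cheerful</x:mood>
</entry>"""

entry = ET.fromstring(doc)

# Foreign attribute on atom:title: one lookup by namespace-qualified name.
title = entry.find(f"{{{ATOM}}}title")
attr_value = title.get(f"{{{EXT}}}mood")

# Extension element under atom:entry: also one lookup by qualified name.
elem_value = entry.findtext(f"{{{EXT}}}mood")

print(attr_value, elem_value)
```

Either way it is a single call keyed on namespace name plus local name.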

So, my conclusion: I disagree with Powell. Let people put extensions wherever they feel like putting them (they will anyhow), remembering that human-readability is a virtue. If models try to micro-manage the element/attribute thing, those models are broken; don't use them. If software arbitrarily discards markup because the markup doesn't match its idiosyncratic ideas about elements and attributes, that software is non-conformant and non-interoperable.

Software that deals with XML, such as an XHTML document, doesn't have
much choice but to model the document using generic XML concepts and
tools - Infosets, DOM, SAX, strings containing XML tags, etc.

For Atom though, it is useful to model feeds and entries in terms of
some other data model: OO, RDBMS, WebDAV (I've been doing it as RDF,
but that is a dirty word around these parts).

Well yes, but each and every one of those non-XML models fails to capture the information content of perfectly-legal XML in one way or another. The Atom data format *could* have been designed in such a way as to be conformant with one or more of those models, but it wasn't. Extensions to Atom can be designed in such a way as to fit into some particular data model, or not.

So I think it's really questionable to try to normatively reverse engineer the WG consensus to try to pretend that Atom documents can usefully be processed as anything but XML.

Section 6 of RFC4287 is flawed. It is an ugly mix

I agree. In fact, what Simple Extension *really* says is "this property can trivially be modeled in RDF" and what Structured Extension really says is "doesn't directly map to RDF", but I failed to convince the WG either to remove this hand-waving or to be clear about what we really meant. Having said that, these notions fortunately have exactly zero normative effect on implementors.

of my (overly)
strict PaceExtensionConstruct proposal[1], and an (overly) liberal
philosophy that the existence of foreign markup anywhere won't break
implementations, so shouldn't be disallowed.

I have no comment on your proposal, but the philosophy you describe does in fact represent the consensus of the WG and the IETF community; your opinion that it is overly liberal is interesting but not particularly relevant to implementors.

atompub's charter states:

Atom consists of:
    * A conceptual model of a resource
    * A concrete syntax for this model

Extension elements are defined to have both a model and a syntax, but
Atom's allowance for foreign attributes to appear anywhere is a case
of syntax that has no corresponding model. Atom doesn't really explain
what foreign attributes are intended for.

Extension elements also, as noted above, have *no normative effect*. It is arguable that the design of Atom departed from the charter in that the model was never explicitly specified. To me, this seems like sound design; the success of the Web has been based on careful specification of the content and sequence of the interchanged messages without any attempt to standardize on a model. This is A Good Thing, and demonstrably works.

And Atom doesn't really explain what foreign elements are intended for, either.

It seems like they could be
an extension point, but given that many implementations will have an
application model that isn't based on the XML Infoset (as described
above),

There's a word for implementations (especially intermediaries, as you notice) that aren't based on the Infoset: broken. That's because RFC4287 is explicitly defined only in terms of the Infoset. Go ahead and try to impose any models that are appropriate for your application needs; I do this all the time. But don't change the Infoset.

it seems very unwise to create an extension proposal which
depends on the precise syntax of an element being preserved.

The "precise syntax" claim is utterly bogus. RFC4287 properly standardizes at the Infoset level, thus it makes zero difference whether I say <title>Café</title> or <title>Caf&#xe9;</title>. I personally think that an extension proposal should

(a) be useful
(b) be readable
(c) not rely on anything that isn't preserved in the XML Infoset
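
The character-reference example above can be checked with a short sketch, again using Python's standard xml.etree.ElementTree as one convenient parser. The literal form is written with the escape \u00e9 only so the snippet does not depend on source-file encoding.

```python
import xml.etree.ElementTree as ET

# The same title, once with a literal e-acute and once with a character reference.
literal = ET.fromstring("<title>Caf\u00e9</title>")
char_ref = ET.fromstring("<title>Caf&#xe9;</title>")

# After parsing, both carry identical character content.
same_text = literal.text == char_ref.text
print(same_text)
```

Anything that could tell the two apart would be relying on lexical detail the parser does not report, which is what (c) rules out.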

foreign attributes appear to provide a
class of extension (if that is what it is) that will be much less
interoperable.

This is only true for software which ignores the fact that RFC4287 is specified only in terms of the XML Infoset. If you lose information because it doesn't match up with some ex post facto model you've dreamed up, you cannot expect to achieve interoperability.

Some guidance in how to design extensions is definitely missing from
the RFC, perhaps an Informational RFC explaining the issues would be
appropriate.

Agreed. But I predict you'll have a devilish hard time building any consensus that goes further than my (a), (b), and (c) above.

The lack of standardisation is not necessarily a bad thing,
implementations are free to implement what is appropriate to their
requirements - if implementations were required to preserve everything
perfectly it would massively raise the cost of integrating Atom with
existing systems.

For good interoperability, they should not do violence to the content of what they are given, as seen in information-set terms. Not much more (or less) can be asked.

As an extension proposal makes greater requirements on software, the
chances of information loss and interop problems increase.

Obviously, but the notion that this depends on whether you use an attribute or an element seems really, really bizarre to me. An intermediary that drops markup it doesn't recognize won't last long in the marketplace, whether those are elements or attributes.
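
For what it's worth, not dropping unrecognized markup takes no special effort in a generic XML toolkit. The following is a rough sketch of an intermediary that rewrites atom:title and passes everything else through, using Python's xml.etree.ElementTree. The function name retitle, the extension namespace http://example.org/ext, and the names lang-hint and rating are all hypothetical.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
EXT = "http://example.org/ext"  # hypothetical extension namespace

source = (
    f'<entry xmlns="{ATOM}" xmlns:x="{EXT}">'
    '<title x:lang-hint="fr">Old title</title>'
    '<x:rating>5</x:rating>'
    '</entry>'
)

def retitle(xml_text, new_title):
    """Change atom:title; leave every other element and attribute untouched."""
    entry = ET.fromstring(xml_text)
    entry.find(f"{{{ATOM}}}title").text = new_title
    return ET.tostring(entry, encoding="unicode")

# Re-parse the intermediary's output and look for the foreign markup.
result = ET.fromstring(retitle(source, "New title"))
title = result.find(f"{{{ATOM}}}title")
print(title.text, title.get(f"{{{EXT}}}lang-hint"), result.findtext(f"{{{EXT}}}rating"))
```

ElementTree may rewrite namespace prefixes on output, but the namespace names, local names, and values of the foreign attribute and the extension element both survive the round trip.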

Interoperability should take priority over concerns that 'approach X looks
better than Y', and other unjustifiable minor concerns.

Yes, and interoperability is based on the normative rules in RFC4287, right?

Perhaps foreign attributes could be clarified as being a
3rd class of extension and reincorporated into the Atom model, with
the disclaimer that they are less interoperable than Simple &
Structured Extensions?

I would vociferously resist any such claim. -Tim
