Re: Atom 1.0 xml:base/URI funnies

Antone Roundy Tue, 19 Jul 2005 13:39:40 -0700

Reluctantly, I must admit that I can't find anything in RFC 3986 or the xml:base spec to convince me that the "same document reference" rule doesn't cause the problems for Tim's feed that have been asserted. The existence of the text regarding same document references, and the fact that that text points explicitly to section 5.1, where the method of determining the base URI (including one embedded in content) is defined, lead me to conclude that it is intended that dereferencing a base URI would result in retrieval of the same data as exists within the scope of the setting of the base URI, and that base URIs are not intended as prefixes for convenience.


On Tuesday, July 19, 2005, at 02:28  AM, A. Pagaltzis wrote:

* David Powell <[EMAIL PROTECTED]> [2005-07-19 08:25]:

Why does xml:base allow for relative base URIs and stacking
then? If xml:base can only describe the actual source URI of
the document, then these features don't make sense.

I think it does...or at least could. Consider the following pseudo-XML (element names have no significance--line numbers are for convenient reference below):


1   <a xml:base="http://www.example.com/">
2     <b>Welcome to example.com</b>
3     <c xml:base="/new/">
4       <b>Here's what's new on example.com</b>
5       <d href="foo.html">foo</d>
6       <d href="bar.html">bar</d>
7     </c>
8     <c xml:base="/popular/">
9       <b>Here's what's popular on example.com</b>
10      <d href="qwerty.html">qwerty</d>
11      <d href="asdf.html">asdf</d>
12      <c xml:base="atom/">
13        <b>Here's the popular Atom stuff</b>
14        <d href="link.html">link</d>
15      </c>
16    </c>
17  </a>

If you dereference http://www.example.com/, you get this whole document, or at least lines 2-16. If you dereference http://www.example.com/new/, you get a document containing lines 3-7, or at least 4-6 (the "what's new" page). If you dereference http://www.example.com/popular/, you get lines 8-16, or at least lines 9-15 (the "what's popular" page). If you dereference http://www.example.com/popular/atom/, you get lines 12-15, or at least 13-14 (the "what's popular with Atom" page).

The entire document is a composite of the documents at http://www.example.com/new/ and http://www.example.com/popular/, the latter of which is in turn a composite of the document at http://www.example.com/popular/atom/ along with some additional data that originates at http://www.example.com/popular/. xml:base enables this compositing without requiring adjustment of the relative URIs. It makes it look to the consumer as if it had gotten different parts of the document from different places, so that their relative URIs can be resolved correctly.
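To make the stacking concrete, here is a minimal sketch in Python that walks a trimmed copy of the pseudo-XML above and resolves each href against the xml:base values in scope. The function name resolve_links is made up for illustration, and the sketch assumes that urllib.parse.urljoin is an adequate stand-in for the RFC 3986 resolution algorithm.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# ElementTree expands the predefined "xml:" prefix to this namespace.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"

# Trimmed copy of the pseudo-XML example (the <b> text elements are omitted).
DOC = """\
<a xml:base="http://www.example.com/">
  <c xml:base="/new/">
    <d href="foo.html">foo</d>
    <d href="bar.html">bar</d>
  </c>
  <c xml:base="/popular/">
    <d href="qwerty.html">qwerty</d>
    <d href="asdf.html">asdf</d>
    <c xml:base="atom/">
      <d href="link.html">link</d>
    </c>
  </c>
</a>
"""


def resolve_links(element, inherited_base=""):
    """Yield (href, absolute URI) pairs for element and its descendants."""
    # An xml:base value is itself resolved against the base in scope
    # outside the element, which is what makes relative bases stack.
    base = urljoin(inherited_base, element.get(XML_BASE, ""))
    href = element.get("href")
    if href is not None:
        yield href, urljoin(base, href)
    for child in element:
        yield from resolve_links(child, base)


links = dict(resolve_links(ET.fromstring(DOC)))
for href, uri in links.items():
    print(f"{href:12} -> {uri}")
```

Under these assumptions, foo.html resolves to http://www.example.com/new/foo.html and link.html to http://www.example.com/popular/atom/link.html, just as if each fragment had been fetched from its own location.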

The example in the xml:base spec [1] uses a relative URI in the
<olist xml:base="/hotpicks/"> element, after defining an
absolute URI in <doc xml:base="http://example.org/today/"> at
the top of the document.

[1] http://www.w3.org/TR/xmlbase/#syntax


That example says: the content of the root element can be found
in the resource at <http://example.org/today/>, and the content
of the olist tag can be found in the resource at
<http://example.org/hotpicks/>. xml:base is quite apparently
being used as “a prefix for calculating relative URIs” instead of
“the source URI for the material found inside this tag.”

As you can see above, I reached the opposite conclusion.

Now, xml:base appears to try to address the situation where an
aggregate document may contain fragments from many sources, and
each of which thus has its own base URI. But the devilish detail
is that RFC-specified behaviour means that if a useragent were to
find a link to <http://example.org/today/> somewhere inside the
example document except inside the olist tag, or a link to
<http://example.org/hotpicks/> inside the olist tag, it may not
retrieve that URL – instead it would have to consider the XML
document itself to be the document found at the respective URL.

...

It is the xml:base TR which is at odds with this; applying
same-document reference behaviour to fragments of an aggregate
document is non-sensical.

The problem lies not in applying same-document reference behavior, but in copying EXCERPTS from source documents that have links to fragments that aren't part of the excerpt. The same-document reference behavior is desirable if both the link and the fragment it links to are copied into the destination document. But there is no way to link to non-excerpted fragments. The URI spec would have to say that if the fragment isn't found in the current document, you can fetch the base URI to see if it exists there (it could even say that you can only do this if the current base URI was embedded in the content). If the fragment doesn't exist at the base URI, it's a broken link.
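As a rough sketch of that hypothetical rule (it is not in RFC 3986 or the xml:base spec), the lookup might go like this in Python. Everything here is an assumption made for illustration: the function find_fragment, the fetch_ids callable standing in for "retrieve the base URI and collect its fragment identifiers", and the sample IDs.

```python
def find_fragment(fragment, local_ids, base_uri, fetch_ids, base_embedded=True):
    """Classify a fragment reference under the proposed fallback rule."""
    # Normal same-document behavior: the target was excerpted along
    # with the link, so it is found locally.
    if fragment in local_ids:
        return "same-document"
    # Proposed fallback: look in the resource at the base URI, optionally
    # only when that base URI was embedded in the content.
    if base_embedded and fragment in fetch_ids(base_uri):
        return "base-document"
    return "broken"


# Hypothetical data: the source page has two anchors, the excerpt copied one.
SOURCE_IDS = {"http://example.org/today/": {"intro", "details"}}


def fetch_ids(uri):
    """Stand-in for a real fetch; returns the IDs known for a URI."""
    return SOURCE_IDS.get(uri, set())


excerpt_ids = {"intro"}
for frag in ("intro", "details", "missing"):
    print(frag, find_fragment(frag, excerpt_ids, "http://example.org/today/", fetch_ids))
```

With this made-up data, "intro" is found in the excerpt itself, "details" is found only by falling back to the base URI, and "missing" is a broken link.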

A hackish solution to the "Tim's Feed Conundrum" would be to set xml:base not to 'http://www.tbray.org/ongoing/', but to 'http://www.tbray.org/ongoing/foo', where "foo" doesn't actually exist, but is just used to ensure that relative references don't end up being identical to the base URI. Then, instead of <link href='' /> (which would be a same-document reference...I think I was wrong in the other thread), you could say <link href='./' />.
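A quick way to check the hack is to resolve both references and compare each result with its base URI. This is only a sketch: is_same_document is a made-up helper that encodes my reading of the RFC 3986 rule (a reference is same-document if, ignoring any fragment, it resolves to the base URI), and it assumes urllib.parse.urljoin matches the RFC's resolution algorithm.

```python
from urllib.parse import urldefrag, urljoin

REAL_BASE = "http://www.tbray.org/ongoing/"
DUMMY_BASE = "http://www.tbray.org/ongoing/foo"  # "foo" need not exist


def is_same_document(base, reference):
    """True if reference resolves to base, ignoring any fragment."""
    target = urldefrag(urljoin(base, reference)).url
    return target == urldefrag(base).url


# href='' against the real base resolves to the base itself.
print(urljoin(REAL_BASE, ""), is_same_document(REAL_BASE, ""))
# href='./' against the dummy base resolves to the same URI,
# but that URI no longer equals the base.
print(urljoin(DUMMY_BASE, "./"), is_same_document(DUMMY_BASE, "./"))
```

Both references resolve to http://www.tbray.org/ongoing/, but only the first counts as a same-document reference under this reading. Note that href='./' against the real base would still be same-document, so the dummy segment is what does the work.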

The other solution I can think of would be for the Atom spec to say that the same-document reference rule from the URI spec does not apply to the atom:link element. But that's kinda lame too--it would basically mean that Atom uses base URIs as prefixes for convenience, rather than to rectify the base URI of data taken from somewhere else, which seems to me to be their intent.
