Reluctantly, I must admit that I can't find anything in RFC 3986 or the xml:base spec to convince me that the "same document reference" rule doesn't cause the problems for Tim's feed that have been asserted. The text regarding same document references exists, and it points explicitly to section 5.1, which defines the method of determining the base URI (including when it is embedded in content). That leads me to conclude that dereferencing a base URI is intended to retrieve the same data as exists within the scope where that base URI is set, and that base URIs are not intended as prefixes for convenience.

On Tuesday, July 19, 2005, at 02:28  AM, A. Pagaltzis wrote:
* David Powell <[EMAIL PROTECTED]> [2005-07-19 08:25]:
Why does xml:base allow for relative base URIs and stacking
then? If xml:base can only describe the actual source URI of
the document, then these features don't make sense.
I think it does...or at least could. Consider the following pseudo-XML (element names have no significance--line numbers are for convenient reference below):

1   <a xml:base="http://www.example.com/">
2     <b>Welcome to example.com</b>
3     <c xml:base="/new/">
4       <b>Here's what's new on example.com</b>
5       <d href="foo.html">foo</d>
6       <d href="bar.html">bar</d>
7     </c>
8     <c xml:base="/popular/">
9       <b>Here's what's popular on example.com</b>
10      <d href="qwerty.html">qwerty</d>
11      <d href="asdf.html">asdf</d>
12      <c xml:base="atom/">
13        <b>Here's the popular Atom stuff</b>
14        <d href="link.html">link</d>
15      </c>
16    </c>
17  </a>

If you dereference http://www.example.com/, you get this whole document, or at least lines 2-16. If you dereference http://www.example.com/new/, you get a document containing lines 3-7, or at least 4-6 (the "what's new" page). If you dereference http://www.example.com/popular/, you get lines 8-16, or at least lines 9-15 (the "what's popular" page). If you dereference http://www.example.com/popular/atom/, you get lines 12-15, or at least 13-14 (the "what's popular with Atom" page).

The entire document is a composite of the documents at http://www.example.com/new/ and http://www.example.com/popular/. The latter is in turn a composite of the document at http://www.example.com/popular/atom/ along with some additional data that originates at http://www.example.com/popular/. xml:base enables this compositing without requiring adjustment of the relative URIs. It makes it look to the consumer as if it had gotten different parts of the document from different places, so that their relative URIs can be resolved correctly.
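
To make the stacking concrete, here is a rough Python sketch of how a consumer might resolve the hrefs in the pseudo-XML above. It assumes the standard library's xml.etree.ElementTree and urllib.parse.urljoin; the function name resolve_links is my own invention, and the document is the example above with the line numbers stripped. It is only an illustration of the resolution rules as I read them, not anything mandated by either spec.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# The "xml" prefix is predeclared, so the parser reports xml:base under this name.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"

DOC = """\
<a xml:base="http://www.example.com/">
  <b>Welcome to example.com</b>
  <c xml:base="/new/">
    <d href="foo.html">foo</d>
    <d href="bar.html">bar</d>
  </c>
  <c xml:base="/popular/">
    <d href="qwerty.html">qwerty</d>
    <d href="asdf.html">asdf</d>
    <c xml:base="atom/">
      <d href="link.html">link</d>
    </c>
  </c>
</a>
"""

def resolve_links(elem, base=""):
    """Yield the absolute URI of every href, honouring stacked xml:base."""
    # A relative xml:base is itself resolved against the inherited base.
    base = urljoin(base, elem.get(XML_BASE, ""))
    if "href" in elem.attrib:
        yield urljoin(base, elem.attrib["href"])
    for child in elem:
        yield from resolve_links(child, base)

links = list(resolve_links(ET.fromstring(DOC)))
for link in links:
    print(link)
```

The link inside the innermost element comes out under http://www.example.com/popular/atom/, which is the base you get by stacking "atom/" on "/popular/" on the absolute URI at the top.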

The example in the xml:base spec [1] uses a relative URI in the
<olist xml:base="/hotpicks/"> element, after defining an
absolute URI in <doc xml:base="http://example.org/today/"> at
the top of the document.

[1] http://www.w3.org/TR/xmlbase/#syntax

That example says: the content of the root element can be found
in the resource at <http://example.org/today/>, and the content
of the olist tag can be found in the resource at
<http://example.org/hotpicks/>. xml:base is quite apparently
being used as “a prefix for calculating relative URIs” instead of
“the source URI for the material found inside this tag.”
As you can see above, I reached the opposite conclusion.

Now, xml:base appears to try to address the situation where an
aggregate document may contain fragments from many sources, and
each of which thus has its own base URI. But the devilish detail
is that RFC-specified behaviour means that if a useragent were to
find a link to <http://example.org/today/> somewhere inside the
example document except inside the olist tag, or a link to
<http://example.org/hotpicks/> inside the olist tag, it may not
retrieve that URL – instead it would have to consider the XML
document itself to be the document found at the respective URL.
...
It is the xml:base TR which is at odds with this; applying
same-document reference behaviour to fragments of an aggregate
document is non-sensical.
The problem lies not in applying same-document reference behavior, but in copying EXCERPTS from source documents that have links to fragments that aren't part of the excerpt. The same-document reference behavior is desirable if both the link and the fragment it links to are copied into the destination document. But there is no way to link to non-excerpted fragments. The URI spec would have to say that if the fragment isn't found in the current document, you can fetch the base URI to see if it exists there (it could even say that you can only do this if the current base URI was embedded in the content). If the fragment doesn't exist at the base URI either, it's a broken link.
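
For what it's worth, the fallback rule I'm imagining could be sketched roughly like this in Python. To be clear, this is NOT what RFC 3986 says today; it is the hypothetical amendment described above. The function name locate_fragment and the fetch_ids callback (which stands in for "dereference the URI and list the fragment identifiers found there") are made up for illustration, and comparing URIs as plain strings is a simplification of the RFC's comparison rules.

```python
from urllib.parse import urldefrag, urljoin

def locate_fragment(reference, base_uri, local_ids, fetch_ids):
    """Say where the fragment named by `reference` would be found.

    local_ids: fragment identifiers present in the current (aggregate) document.
    fetch_ids: hypothetical callback that retrieves a URI and returns its identifiers.
    """
    target, fragment = urldefrag(urljoin(base_uri, reference))
    if target != urldefrag(base_uri)[0]:
        # Not a same-document reference; ordinary retrieval applies.
        return "different document"
    if not fragment:
        return "same document"
    if fragment in local_ids:
        # Link and fragment were both copied into the destination document.
        return "found locally"
    if fragment in fetch_ids(target):
        # Proposed fallback: the fragment was not part of the excerpt.
        return "found at base URI"
    return "broken link"

# Usage with a stand-in for the network fetch:
def source_ids(uri):
    return {"pick1", "pick2"}

base = "http://example.org/hotpicks/"
print(locate_fragment("#pick1", base, {"pick1"}, source_ids))
print(locate_fragment("#pick2", base, {"pick1"}, source_ids))
print(locate_fragment("#pick9", base, {"pick1"}, source_ids))
```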

A hackish solution to the "Tim's Feed Conundrum" would be to set xml:base not to 'http://www.tbray.org/ongoing/', but to 'http://www.tbray.org/ongoing/foo', where "foo" doesn't actually exist, but is just used to ensure that relative references don't end up being identical to the base URI. Then, instead of <link href='' /> (which would be a same-document reference...I think I was wrong in the other thread), you could say <link href='./' />.
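
Here is a quick check of that hack in Python, using urllib.parse from the standard library. The helper is_same_document is my own shorthand for the same-document test (resolve the reference, drop any fragment, compare to the base) and compares plain strings, which is a simplification. The "foo" segment is the made-up placeholder from the paragraph above.

```python
from urllib.parse import urldefrag, urljoin

def is_same_document(reference, base_uri):
    """True if `reference` resolves to `base_uri`, ignoring any fragment."""
    resolved = urldefrag(urljoin(base_uri, reference))[0]
    return resolved == urldefrag(base_uri)[0]

real_base = "http://www.tbray.org/ongoing/"
hack_base = "http://www.tbray.org/ongoing/foo"  # "foo" need not exist

# With the real base, an empty href resolves to the base itself.
print(urljoin(real_base, ""), is_same_document("", real_base))

# With the hack base, "./" still resolves to the intended URI,
# but it no longer equals the base, so it is not a same-document reference.
print(urljoin(hack_base, "./"), is_same_document("./", hack_base))
```

The second line prints the URI we actually want the link to point at, with the same-document test coming out false, which is the whole point of the hack.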

The other solution I can think of would be for the Atom spec to say that the same-document reference rule from the URI spec does not apply to the atom:link element. But that's kinda lame too--it would basically mean that Atom uses base URIs as prefixes for convenience, rather than to rectify the base URI of data taken from somewhere else, which seems to me to be their intent.
