Reluctantly, I must admit that I can't find anything in RFC 3986 or the
xml:base spec to convince me that the "same document reference" rule
doesn't cause the problems for Tim's feed that have been asserted. The
existence of the text regarding same document references, and the fact
that that text points explicitly to section 5.1, where the method of
determining the base URI (including one embedded in content) is defined,
leads me to conclude that it is intended that dereferencing a base URI
would result in retrieval of the same data as exists within the scope
of the setting of the base URI, and that base URIs are not intended as
prefixes for convenience.
On Tuesday, July 19, 2005, at 02:28 AM, A. Pagaltzis wrote:
* David Powell <[EMAIL PROTECTED]> [2005-07-19 08:25]:
Why does xml:base allow for relative base URIs and stacking
then? If xml:base can only describe the actual source URI of
the document, then these features don't make sense.
I think it does...or at least could. Consider the following pseudo-XML
(element names have no significance--line numbers are for convenient
reference below):
1 <a xml:base="http://www.example.com/">
2 <b>Welcome to example.com</b>
3 <c xml:base="/new/">
4 <b>Here's what's new on example.com</b>
5 <d href="foo.html">foo</d>
6 <d href="bar.html">bar</d>
7 </c>
8 <c xml:base="/popular/">
9 <b>Here's what's popular on example.com</b>
10 <d href="qwerty.html">qwerty</d>
11 <d href="asdf.html">asdf</d>
12 <c xml:base="atom/">
13 <b>Here's the popular Atom stuff</b>
14 <d href="link.html">link</d>
15 </c>
16 </c>
17 </a>
If you dereference http://www.example.com/, you get this whole
document, or at least lines 2-16. If you dereference
http://www.example.com/new/, you get a document containing lines 3-7,
or at least 4-6 (the "what's new" page). If you dereference
http://www.example.com/popular/, you get lines 8-16, or at least lines
9-15 (the "what's popular" page). If you dereference
http://www.example.com/popular/atom/, you get lines 12-15, or at least
13-14 (the "what's popular with Atom" page).
The entire document is a composite of the documents at
http://www.example.com/new/ and http://www.example.com/popular/; the
latter is in turn a composite of the document at
http://www.example.com/popular/atom/ along with some additional data
that originates at http://www.example.com/popular/. xml:base enables
this compositing without requiring adjustment of the relative URIs. It
makes it look to the consumer as if it had gotten different parts of
the document from different places, so that their relative URIs can be
resolved correctly.
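To make the stacking concrete, here is a rough Python sketch of how a
consumer might resolve the hrefs in an abridged copy of the pseudo-XML
above. It assumes the standard library's urljoin is an acceptable
stand-in for RFC 3986 reference resolution; the function name
resolved_links is hypothetical and made up for illustration.

```python
from urllib.parse import urljoin
import xml.etree.ElementTree as ET

# ElementTree exposes the predefined xml: prefix as this namespace URI.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"

# Abridged copy of the pseudo-XML example (one link per section).
DOC = """<a xml:base="http://www.example.com/">
  <b>Welcome to example.com</b>
  <c xml:base="/new/">
    <d href="foo.html">foo</d>
  </c>
  <c xml:base="/popular/">
    <d href="qwerty.html">qwerty</d>
    <c xml:base="atom/">
      <d href="link.html">link</d>
    </c>
  </c>
</a>"""


def resolved_links(element, base=""):
    """Yield an absolute URI for every href, honoring stacked xml:base."""
    # A relative xml:base is itself resolved against the inherited base;
    # an element without xml:base simply keeps the inherited one.
    base = urljoin(base, element.get(XML_BASE, ""))
    if "href" in element.attrib:
        yield urljoin(base, element.get("href"))
    for child in element:
        yield from resolved_links(child, base)


links = list(resolved_links(ET.fromstring(DOC)))
for link in links:
    print(link)
# http://www.example.com/new/foo.html
# http://www.example.com/popular/qwerty.html
# http://www.example.com/popular/atom/link.html
```

Each link resolves as if its section had been fetched from its own
URI, which is the compositing behavior described above.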
The example in the xml:base spec [1] uses a relative URI in the
<olist xml:base="/hotpicks/"> element, after defining an
absolute URI in <doc xml:base="http://example.org/today/"> at
the top of the document.
[1] http://www.w3.org/TR/xmlbase/#syntax
That example says: the content of the root element can be found
in the resource at <http://example.org/today/>, and the content
of the olist tag can be found in the resource at
<http://example.org/hotpicks/>. xml:base is quite apparently
being used as “a prefix for calculating relative URIs” instead of
“the source URI for the material found inside this tag.”
As you can see above, I reached the opposite conclusion.
Now, xml:base appears to try to address the situation where an
aggregate document may contain fragments from many sources, and
each of which thus has its own base URI. But the devilish detail
is that RFC-specified behaviour means that if a useragent were to
find a link to <http://example.org/today/> somewhere inside the
example document except inside the olist tag, or a link to
<http://example.org/hotpicks/> inside the olist tag, it may not
retrieve that URL – instead it would have to consider the XML
document itself to be the document found at the respective URL.
...
It is the xml:base TR which is at odds with this; applying
same-document reference behaviour to fragments of an aggregate
document is non-sensical.
The problem lies not in applying same-document reference behavior, but
in copying EXCERPTS from source documents that have links to fragments
that aren't part of the excerpt. The same-document reference behavior
is desirable if both the link and the fragment it links to are copied
into the destination document. But there is no way to link to
non-excerpted fragments. The URI spec would have to say that if the
fragment isn't found in the current document, you can fetch the base
URI to see if it exists there (it could even say that you can only do
this if the current base URI was embedded in the content). If the
fragment doesn't exist at the base URI, it's a broken link.
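As a sketch only, the lookup order I'm suggesting might look like the
following in Python. To be clear, this is the hypothetical rule from
the paragraph above, not anything RFC 3986 currently says; the function
name, its parameters, and the return labels are all made up for
illustration, and fetch_ids_at_base stands in for whatever would
retrieve the base URI and list the fragment identifiers found there.

```python
def locate_fragment(fragment, local_ids, fetch_ids_at_base):
    """Apply the proposed (hypothetical) fallback rule for a fragment.

    local_ids: fragment identifiers present in the current document.
    fetch_ids_at_base: callable returning the identifiers found in the
        document retrieved from the base URI; only called if needed.
    """
    # Same-document behavior first: both link and target were copied.
    if fragment in local_ids:
        return "same-document"
    # Proposed fallback: the target was not excerpted, so ask the source.
    if fragment in fetch_ids_at_base():
        return "at-base-uri"
    # Found in neither place.
    return "broken"


excerpt_ids = {"intro"}                           # copied into the aggregate
source_ids = lambda: {"intro", "details"}         # pretend fetch of the base URI

print(locate_fragment("intro", excerpt_ids, source_ids))    # same-document
print(locate_fragment("details", excerpt_ids, source_ids))  # at-base-uri
print(locate_fragment("missing", excerpt_ids, source_ids))  # broken
```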
A hackish solution to the "Tim's Feed Conundrum" would be to set
xml:base not to 'http://www.tbray.org/ongoing/', but to
'http://www.tbray.org/ongoing/foo', where "foo" doesn't actually exist,
but is just used to ensure that relative references don't end up being
identical to the base URI. Then, instead of <link href='' /> (which
would be a same-document reference...I think I was wrong in the other
thread), you could say <link href='./' />.
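Here is a quick Python check of that hack, assuming urljoin behaves
like RFC 3986 resolution for these inputs and that a plain string
comparison (no normalization) is close enough to "identical to the
base URI aside from the fragment". The helper name is_same_document is
hypothetical.

```python
from urllib.parse import urljoin, urldefrag


def is_same_document(base, ref):
    """True if ref resolves to the base URI, ignoring any fragment."""
    target, _fragment = urldefrag(urljoin(base, ref))
    return target == urldefrag(base)[0]


real_base = "http://www.tbray.org/ongoing/"
fake_base = "http://www.tbray.org/ongoing/foo"  # "foo" need not exist

# With the real base, both the empty reference and './' point back at
# the base URI itself, so both are same-document references.
print(is_same_document(real_base, ""))    # True
print(is_same_document(real_base, "./"))  # True

# With the fake base, './' still resolves to the real location but is
# no longer identical to the base URI.
print(urljoin(fake_base, "./"))           # http://www.tbray.org/ongoing/
print(is_same_document(fake_base, "./"))  # False

# An empty href is a same-document reference under either base, which
# is why the hack needs './' rather than ''.
print(is_same_document(fake_base, ""))    # True
```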
The other solution I can think of would be for the Atom spec to say
that the same-document reference rule from the URI spec does not apply
to the atom:link element. But that's kinda lame too--it would
basically mean that Atom uses base URIs as prefixes for convenience,
rather than to rectify the base URI of data taken from somewhere else,
which seems to me to be their intent.