Eran,

I've made a couple of minor comments on this proposal, which in general I like: it does seem to be the well-known location to end all well-known locations (which I reckon is about the only justification there could be for a new well-known location).

Eran Hammer-Lahav wrote:
Context

The /site-meta proposal (a known-location solution for site metadata) [1]
includes a simple XML format for representing site metadata directly or via
links. In discussing the proposal and the appropriate format for the list of
meta resources, John Panzer suggested using a simpler text format [2]
directly based on the content of the Link header [3].

While I see the value of an XML format for this data, and was the main
supporter of it, I now strongly support the idea of using a super-simple
text-based document. Partially because it fits better with the current
use-cases, and partially because I am an editor of a "competing" XML format
which covers this use case (XRDS/XRD) but is too complex to be positioned as
the default form.

I would like /site-meta to list a single text-based format with a clear
Content-type associated with it. I also want the spec to explicitly allow
user-agents to request other representations of the /site-meta resource with
the default being the super-simple-text-based version. One such
representation (I expect to be widely supported) will be
application/xrd+xml.
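
To make the negotiation idea concrete, here is a minimal sketch of how a server might choose between the default text representation and application/xrd+xml based on the Accept header. It is an illustration only: the function name choose_representation is hypothetical, application/site-meta is used purely as a placeholder name for the default type (the thread has not settled on one), and wildcard ranges such as */* are deliberately not matched, so they simply fall through to the default.

```python
# Placeholder default type; the thread has not settled on a name.
DEFAULT_TYPE = 'application/site-meta'
SUPPORTED = [DEFAULT_TYPE, 'application/xrd+xml']


def choose_representation(accept_header):
    """Pick a media type for /site-meta from an Accept header value.

    Hypothetical helper: falls back to DEFAULT_TYPE when nothing
    supported is requested explicitly.
    """
    if not accept_header:
        return DEFAULT_TYPE
    candidates = []
    for position, item in enumerate(accept_header.split(',')):
        parts = [p.strip() for p in item.split(';')]
        media_type = parts[0].lower()
        quality = 1.0
        for param in parts[1:]:
            name, _, value = param.partition('=')
            if name.strip().lower() == 'q':
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat an unreadable q-value as "not wanted"
        if media_type in SUPPORTED and quality > 0:
            # Highest q wins; ties go to the type listed first.
            candidates.append((-quality, position, media_type))
    if candidates:
        return min(candidates)[2]
    return DEFAULT_TYPE
```

A client that sends no Accept header, or only types the server does not offer, gets the simple text version, which matches the "default" behaviour described above.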


Some Questions (and answers)

- Should the /site-meta text format be restricted to a set of links or
provide an easy path for extensions of some other kinds of records?

While I can't come up with compelling use cases for /site-meta to
directly include other metadata, it is likely someone else will in the
future.

I fully understand the desire for extensibility and for not imposing restrictions unnecessarily. However, I do think it would be a big mistake to allow a /site-meta file to include anything other than links to data. Let's imagine you allowed, say, Dublin Core and Creative Commons to be encoded in a /site-meta file directly. Why not? They're well-defined, well used metadata systems that can often be applied to a whole site.

Why make people put this in a separate file when it could, surely, go in the /site-meta file? Well, you could allow it, and any other metadata - and hey presto you've just reinvented a WKL for POWDER, XRD and whatever comes next.

No... if /site-meta is the WKL to end all WKLs then it has to be just a set of pointers to where the 'real data' actually is. So I would say that there is a case for deliberately limiting the extensibility. As you go on to point out, if it supports an HTTP Link-like structure, that's already flexible and it meets the need. When extensibility leads to mission creep, things will go wrong.



By replacing each record in John's proposal:

---
/robots.txt rel="robots"
/p3p.xml rel="privacy"
http://other.example.net/example rel="http://example.com/rel";
---

with actual Link headers:

---
Link: </robots.txt>; rel="robots"
Link: </p3p.xml>; rel="privacy"
Link: <http://other.example.net/example>; rel="http://example.com/rel";
---

other record types can be added in the future.

Indeed. Here are two that come to mind:

Link: </styles.css>; rel="stylesheet"; type="text/css"
Link: </powder.xml>; rel="describedby"; type="text/powder+xml"

The mobile world would probably like something like

Link: <http://m.example.com>;
  rel="http://example.org/mobile-vocab#mobile_entry_page";

Link: <http://example.com>;
  rel="http://example.org/mobile-vocab#desktop_entry_page";

(I'm basing this on the metaTXT work just getting going [PA1])

Oops... I'm straying into mission creep there, aren't I? I mean, are those URIs links or metadata? I hope it doesn't matter - I've used URIs where URIs are allowed.

One thing I have done in my first 2 examples is to include the type attribute (which if we're following the HTTP Link format is allowed and, IMHO, should be encouraged!)

 This also means the same code
used to read Link headers (or HTTP headers in general) can be used for this
format. This also plays nicely with the idea of equating links in /site-meta
to Links in individual resources' HTTP response headers.
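
As a rough illustration of how little code reading such a document would take, here is a minimal sketch of a parser for a /site-meta body made of Link lines. It is not a full implementation of the Link header draft: the function name parse_site_meta is hypothetical, it assumes one link per Link line, unfolds continuation lines the way HTTP headers are unfolded, tolerates a trailing semicolon, and ignores anything that is not a Link record.

```python
import re

# Matches one "Link: <target>; param=value; ..." line.
LINK_LINE = re.compile(r'^Link:\s*<([^>]*)>\s*(.*)$', re.IGNORECASE)
# Matches one ;name="value" or ;name=value parameter.
PARAM = re.compile(r';\s*([\w-]+)\s*=\s*(?:"([^"]*)"|([^\s;]+))')


def parse_site_meta(text):
    """Return a list of (target, params) tuples, one per Link line."""
    links = []
    # Unfold continuation lines (a line break followed by whitespace).
    unfolded = re.sub(r'\r?\n[ \t]+', ' ', text)
    for line in unfolded.splitlines():
        match = LINK_LINE.match(line.strip())
        if not match:
            continue  # skip blank lines and non-Link records
        target, rest = match.groups()
        params = {}
        for name, quoted, bare in PARAM.findall(rest):
            params[name.lower()] = quoted if quoted else bare
        links.append((target, params))
    return links
```

Each entry pairs the link target with a dictionary of its parameters, so a consumer interested only in, say, rel="describedby" can filter the list without caring what other relations the site publishes.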

- Should /site-meta define its own content type, use an existing content
type, or define a new generic content type?

If we take the route of using an HTTP-header-like format for /site-meta, is
there value in making this format generally available for other resources?
RFC 2616 offers a similar construct in the form of message/http. It seems
that as long as the document can be considered a valid HTTP request or
response, we can use this content type.

So /site-meta can be considered a body-less HTTP response with Link headers.
The question is, is such a header-fragment allowed in a message/http
document? It is not clear if in this use-case, the Date header may be
omitted, which is otherwise required for a valid response header. The Date
header makes little sense in this context and should be omitted. Note that
the HTTP header for GET /site-meta must still include Date.


In Conclusion

1. The idea of allowing multiple representations for /site-meta resources
suggests the use of a more generic content type for the default (and the
only required) representation than application/site-meta.

I'd stick with one format. Choice can be overrated and leads to confusion (and you thought I was a dripping wet liberal? Only when it suits me ;-) )


2. There is value in using a single mechanism for metadata discovery, both
for an individual resource (via HTTP Link header or HTML/ATOM Link element)
and for a domain authority (via /site-meta list of links). Using the exact
same semantics between HTTP Link and /site-meta links seems productive.

Agreed. And this further supports the one-format point.


3. Preparing for some unknown need for extending /site-meta while not
increasing complexity (assuming Link header structure is simple enough)
seems like a good idea.

Yes - but the flexibility is in the relationship and content types. Signposts can point to towns, multi-lane highways, country dirt tracks and little ol' houses on the prairie, but they're still signposts and that's what, for me, /site-meta is about. Enough with the flexibility.

Actually, at least 3 use cases - robots.txt, p3p and POWDER - all have their own method of defining which sections of Web sites they refer to. If there is an argument for making /site-meta more complex or flexible, I'd say it would be in the area of defining a common method of doing that - but that means re-writing those specs so let's not go there.



Action Items

* Change /site-meta draft to use the Link header format instead of the
current XML proposal.

+1

* If allowed, use message/http as the default content type for /site-meta.
If not, register a new content type, preferably something like
application/http-header-fragment, or just application/site-meta.

Why application? I'd say text was more appropriate. Application suggests something really complicated that needs a lot of processing. This is just a bunch of links and a little syntactic sugar.


* Clarify that the content of /site-meta does not describe any actual
resource or URI, but the abstract concept of 'web site' or 'domain
authority', expressed as an HTTP header. In practice, it is still just a
registry for resource locations to avoid more known-location solutions.

+1


Thoughts?

EHL


[1] http://tools.ietf.org/html/draft-nottingham-site-meta-00
[2] http://www.abstractioneer.org/2008/11/one-site-meta-to-rule-them-all.html
[3] http://tools.ietf.org/html/draft-nottingham-http-link-header-02





[PA1] http://www.visibilitymobile.com/Whitepaper_On_MetaTXT.pdf

--

Phil Archer
w. http://philarcher.org/
