Re: Transparent Metalinks?

Nils Thu, 05 Feb 2009 14:34:22 -0800

On Jan 20, 1:05 am, Anthony Bryan <[email protected]> wrote:
> Eran, thanks for joining us!
>
> On Sat, Jan 17, 2009 at 2:10 PM, Eran Hammer-Lahav <[email protected]> 
> wrote:
>
> > Again, there is nothing 'technically' wrong with this approach any
> > others have taken a similar position that a descriptor can be
> > collapsed into a single resource URI. But a consensus is building that
> > this is the wrong way of doing things. What you might want to
> > consider, since you find the semantic discussion as nonsense (which I
> > can respect) is the deployment ramifications of using the Accept
> > header. Many platforms limit access to such headers, some proxies
> > mishandle Vary headers (which BTW, the spec should require with to any
> > Accept reply), and some providers will not allow using it on their
> > servers. You might want to read John Panzer's view of this [1].
>
> what should the spec require? could you propose some text, I'm not
> familiar w/ that.
>
> Eran started a thread about TCN on the HTTP list 
> at http://lists.w3.org/Archives/Public/ietf-http-wg/2009JanMar/0014.html
> (it wouldn't hurt the draft process for metalink people to be involved
> on there :) which includes Mark's reply:
>
> "To my knowledge, caching intermediaries haven't deployed it (i.e.,
> they'll work with TCN, but they won't be able to serve negotiated
> requests from cache... somebody please correct me if I'm wrong).
>
> I'm not sure about browser implementation, but I did a quick check of
> the request headers seen by a very high-traffic Web site, and a
> vanishingly small number contained the Negotiate header..."


A typical website usually has little to negotiate...
Likely only image representations, if different ones are even
available.

What you will likely see far more often is:
Vary: Accept-Encoding
A lot of sites offer either "plain" or gzip encoding. This is, as far
as I know, relatively widely deployed.
Hence I guess (but have no concrete data to back up that guess) that
most proxies today in fact have no problem dealing with this.

>
> (someone had been working on a metalink plugin for squid).
>
> I figured it wouldn't hurt to quote what we use now & what we could
> use in the future from Eran's draft directly:
>
> http://tools.ietf.org/html/draft-hammer-discovery-01
>
> Appendix A.2.1. HTTP Response Header
>
>    When a resource representation is retrieved using and HTTP GET
>    request, the server includes in the response a header pointing to the
>    location of the descriptor document.  For example, POWDER uses the
>    'Link' response header to create an association between the resource
>    and its descriptor.  XRDS [XRDS] (based on the Yadis protocol
>    [Yadis]) uses a similar approach, but since the Link header was not
>    available when Yadis was first drafted, it defines a custom header
>    X-XRDS-Location which serves a similar but less generic purpose.
>
>    [+] Self Declaration -  using the Link header, any resource can point
>       to its descriptor documents.

Nice to have, but in my opinion not a strict requirement, at least
from the download manager perspective.
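
For illustration, here is a rough sketch of how a download manager
might pull the descriptor location out of a Link response header. The
"describedby" relation, the example URLs and the function names are my
assumptions, not something the draft prescribes, and the parser
ignores corner cases such as commas or semicolons inside quoted
values.

```python
import re

# One link-value: <target> followed by zero or more ";name=value" parameters.
_LINK_RE = re.compile(r'<([^>]*)>((?:\s*;\s*[^;,]+)*)')


def parse_link_header(value):
    """Return a list of (target, params) tuples from a Link header value."""
    links = []
    for target, raw_params in _LINK_RE.findall(value):
        params = {}
        for part in raw_params.split(";"):
            if "=" in part:
                name, _, val = part.partition("=")
                params[name.strip().lower()] = val.strip().strip('"')
        links.append((target, params))
    return links


def find_descriptor(value, rel="describedby"):
    """Return the first target whose rel parameter contains `rel`, else None."""
    for target, params in parse_link_header(value):
        if rel in params.get("rel", "").lower().split():
            return target
    return None


# Hypothetical header value, as a server might send it.
example = ('<http://example.com/file.iso.metalink>; rel="describedby"; '
           'type="application/metalink+xml", '
           '<http://example.com/other>; rel="alternate"')
descriptor_url = find_descriptor(example)
```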

>
>    [-] Direct Descriptor Access -  the header is only accessible when
>       requesting the resource itself via an HTTP GET request.  While
>       HTTP GET is meant to be a safe operation, it is still possible for
>       some resource to have side-effects.
>
>    [+] Web Architecture Compliant -  uses the Link header which is an
>       IETF Internet Standard [[ currently a standard-track draft ]], and
>       is consistent with HTTP protocol design.

It is a draft of a draft of a proposed standard,
which bears the risk that the draft may change after the first
implementations arrive.

>    [-] Scale and Technology Agnostic -  since discovery accounts for a
>       small percent of resource requests, the extra Link header is
>       wasteful.  For some hosted servers, access to HTTP headers is
>       limited and will prevent implementation.
>
>    [+] Extensible -  the Link header provides built-in extensibility by
>       allowing new link relationships, mime-types, and other extensions.

Extensibility is a strong plus.

>
>    Minimum roundtrips to retrieve the resource descriptor: 2

The additional round trip wastes resources.

>
> Appendix A.2.2. HTTP Response Header Via HEAD
>
>    Same as the HTTP Response Header method but used with an HTTP HEAD
>    request.  The idea of using the HEAD method is to solve the wasteful
>    overhead of including the Link header in every reply.  By limiting
>    the appearance of the Link header only to HEAD responses, typical GET
>    requests are not encumbered by the extra bytes.
>
>    [+] Self Declaration -  Same as the HTTP Response Header method.
>
>    [-] Direct Descriptor Access -  Same as the HTTP Response Header
>       method.
>
>    [-] Web Architecture Compliant -  HTTP HEAD should return the exact
>       same response as HTTP GET with the sole exception that the
>       response body is omitted.  By adding headers only to the HEAD
>       response, this solution violates the HTTP protocol and might not
>       work properly with proxies as they can return the header of the
>       cached GET request.
>
>    [+] Scale and Technology Agnostic -  solves the wasted bandwidth
>       associated with the HTTP Response Header method, but still suffers
>       from the limitation imposed by requiring access to HTTP headers.

However, the second round trip becomes "mandatory".
When using the Header method and there is no Link header you won't do
a second request.
Here you need to do a second request, always, if you're trying to
discover something.
(The second request either being the GET of the download or the GET of
the metalink, depending on the presence of a metalink).
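
To spell the counting out, a small sketch that lists the requests a
client ends up making under each of the two header methods. The method
names "get" and "head" and the function itself are only labels I made
up for this comparison.

```python
def requests_made(method, link_present):
    """List the requests a client issues for one download attempt.

    method: "get"  -> Link header sent on normal GET responses
            "head" -> Link header sent on HEAD responses only
    link_present: whether the server advertises a metalink at all
    """
    if method == "get":
        # The first GET already is the download; a second request is
        # only needed when a Link header shows up.
        steps = ["GET resource"]
        if link_present:
            steps.append("GET metalink")
        return steps
    if method == "head":
        # The HEAD carries no body, so some GET must always follow.
        follow_up = "GET metalink" if link_present else "GET resource"
        return ["HEAD resource", follow_up]
    raise ValueError("unknown method: %r" % (method,))
```

Without a metalink on the server, the "get" variant costs one request
and the "head" variant costs two.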

>
>    [+] Extensible -  Same as the HTTP Response Header method.
>
>    Minimum roundtrips to retrieve the resource descriptor: 2

Actually worse than the Header method. See above.

>
> Appendix A.2.3. HTTP Content Negotiation
>
>    Using the HTTP Accept request header or Transparent Content
>    Negotiation as defined in [RFC2295], the consumer informs the server
>    it is interested in the descriptor and not the resource itself, to
>    which the server responds with the descriptor document or its
>    location.  In Yadis, the consumer sends an HTTP GET (or HEAD) request
>    to the resource URI with an Accept header and content-type
>    application/xrds+xml.  This informs the server of the consumer's
>    discovery interest, which in turn may reply with the descriptor
>    document itself, redirect to it, or return its location via the
>    X-XRDS-Location response header.
>
>    [-] Self Declaration -  does not address as it focuses on the
>       consumer declaring its intentions.

Not strictly required for the current use case.
The Header method could additionally be used if Self Declaration is
wanted.

>
>    [+] Direct Descriptor Access -  provides a simple method for directly
>       requesting the descriptor document.

Biggest plus, which seals the deal for me.

>
>    [-] Web Architecture Compliant -  while it can be argued that the
>       descriptor can be considered another representation of the
>       resource, it is very much external to it.  Using the Accept header
>       to request a separate resource (as opposed to a different
>       representation of the same resource) violates web architecture.

Again, this is the philosophical dispute over whether the "descriptor"
in our case is a completely different resource or merely a
representation.

>       It also prevents using the discovery content-type as a valid
>       (self-standing) web resource having its own descriptor.

I don't see/get this.
Anyway, I don't think that this affects our use case.

>
>    [-] Scale and Technology Agnostic -  requires access to HTTP request
>       and response headers, as well as the registration of multiple
>       handlers for the same resource URI based on the Accept header.  In
>       addition, improper use or implementation of the Vary header in
>       conjunction with the Accept header will cause caches to serve the
>       descriptor document instead of the resource itself - a great
>       concern to large providers with frequently visited front-pages.

All major implementations at least seem to support Vary well enough
(at least for Accept-Encoding).
The mere risk that some implementations are broken shouldn't
keep us from using it.
After all, nobody turned off the web even though clients have lots of
bugs in all major areas, from http/https to html/css/js.
Those few broken implementations need to be corrected, or they will
someday be replaced by working competitors anyway.
"Backward compatibility" is always neat to have, but it should not
completely prevent the deployment of new technology, methods or uses.
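
A toy model of what the Vary concern boils down to, assuming a cache
that builds its key from the URL plus whatever request headers the
response's Vary header names. The class and function names are made up
for this sketch, and real proxies are considerably more involved.

```python
def cache_key(url, request_headers, vary):
    """Build a cache key from the URL plus the request headers named in Vary."""
    names = [n.strip().lower() for n in vary.split(",") if n.strip()]
    lowered = {k.lower(): v for k, v in request_headers.items()}
    return (url,) + tuple((n, lowered.get(n, "")) for n in sorted(names))


class VaryCache:
    """Minimal in-memory cache keyed on URL and Vary-selected headers."""

    def __init__(self):
        self._vary = {}     # url -> Vary value of the last stored response
        self._entries = {}  # cache key -> stored body

    def store(self, url, request_headers, vary, body):
        self._vary[url] = vary
        self._entries[cache_key(url, request_headers, vary)] = body

    def lookup(self, url, request_headers):
        vary = self._vary.get(url)
        if vary is None:
            return None
        return self._entries.get(cache_key(url, request_headers, vary))


URL = "http://example.com/file.iso"  # hypothetical resource

# Server sends "Vary: Accept": both representations are kept apart.
good = VaryCache()
good.store(URL, {"Accept": "*/*"}, "Accept", b"ISO DATA")
good.store(URL, {"Accept": "application/metalink+xml"}, "Accept", b"<metalink/>")

# Server forgets Vary: the metalink overwrites the file in the cache.
bad = VaryCache()
bad.store(URL, {"Accept": "*/*"}, "", b"ISO DATA")
bad.store(URL, {"Accept": "application/metalink+xml"}, "", b"<metalink/>")
```

In the second case a plain browser request gets the metalink from the
cache, which is exactly the failure the draft warns about. It is a
server misconfiguration, not a flaw in the negotiation approach.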

>
>    [-] Extensible -  applies an implicit relationship type to the
>       descriptor mime-type, limiting descriptor formats to a single
>       purpose.  It also prevents using existing mime-types from being
>       used as a descriptor format.

Valid, but not an issue for our use case.

>
>    Minimum roundtrips to retrieve the resource descriptor: 1
>

That's what I'm after.
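
A minimal sketch of the single-request variant from the client side,
using Python's urllib. The media type application/metalink+xml, the
q-value for the fallback and the helper names are my assumptions; the
request is only built here, not sent.

```python
import urllib.request

# Assumed media type for metalink documents.
METALINK_TYPE = "application/metalink+xml"


def build_request(url):
    """Build a GET request that prefers a metalink but accepts anything."""
    return urllib.request.Request(
        url,
        headers={"Accept": METALINK_TYPE + ", */*;q=0.1"},
    )


def is_metalink(content_type):
    """Tell from the response Content-Type whether a metalink came back."""
    return content_type.split(";", 1)[0].strip().lower() == METALINK_TYPE


# Hypothetical download URL.
req = build_request("http://example.com/file.iso")
```

A server that knows nothing about metalinks just returns the file, so
the one request is never wasted: the client checks the Content-Type of
the response and either parses the metalink or keeps downloading.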


Nils
