Hi,

On Tue, Jun 30, 2009 at 04:59:55PM -0400, Ant Bryan wrote:
> On Thu, Jun 25, 2009 at 8:27 PM, Peter Poeml<[email protected]> wrote:
> > There is one thing in the Internet Draft that I'd like to bring to our
> > attention. Section 4.2.18 sets up a strong requirement:
> >  All IRIs MUST lead to identical files.
> >
> > While surely this would be the intention, in practice I know many
> > examples where this either isn't the case or, although attempted, is
> > hard to assure.
> >
> > Content verification is there to help -- one of the purposes of metalinks.
> >
> > It might make sense to put this in a different way.
> >
> > Without any content verification being done (well, it is optional!), it
> > is a relatively hard requirement to make. When a content delivery
> > infrastructure reaches a certain scale, it becomes difficult though, as
> > we know. In particular this is true for collaborative mirror networks
> > formed by volunteers, where, in fact, the referenced IRIs might be
> > outside of the control of the content provider at all. (Security comes
> > into play here as well.)
> 
> the current text is lacking, I'm glad you brought it up! at the least
> it should be something like:
> 
> All IRIs under each "metalink:file" container Element SHOULD lead to
> identical files.
> 
> because we don't want ALL the IRIs under separate <file> elements to
> lead to identical files.

Ah, yes, that makes sense. Good spot.

> yes, MUST is too strong. the point is that, by design, IRIs that
> are included in a metalink SHOULD lead to identical files.
> will metalinks contain IRIs that aren't identical files? yes. do we
> want it to no longer be a valid metalink if a mirror network is out of
> sync & a file isn't identical? no
> I'm just trying to document the design purpose, that each IRI is a
> valid way to get the same exact file, under perfect conditions.
> 
> clients should weed out files that have different sizes, & reject
> chunks that don't have the correct checksum, etc. metalink
> generators will attempt to include IRIs that point to identical files
> to be most helpful, but we want to protect against accidents or
> malicious people that would possibly want to lead to incorrect
> downloads.
> 
> maybe content verification should be required for future versions (as
> in this version for the IETF)? even tho it is extremely important,
> some lazy implementers :) did not want to add support. as of now, I
> think only TheWorld browser does not support checksums. it's clear
> that most implementers see the value in it.

Well summarized - I think that's exactly what we should document.
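
To make the client behaviour quoted above a bit more concrete, here is a
minimal Python sketch, not taken from any existing client. The function
names, the dict of mirror sizes, and the choice of SHA-1 as the default are
all assumptions made for illustration only.

```python
import hashlib


def filter_by_size(candidates, expected_size):
    """Weed out mirrors whose reported size differs from the expected one.

    candidates: dict mapping IRI -> size in bytes as reported by that mirror,
    or None when the mirror reports no size (such mirrors are kept, since
    nothing contradicts them yet).
    """
    return [iri for iri, size in candidates.items()
            if size is None or size == expected_size]


def chunk_is_valid(data, expected_hex, algorithm="sha1"):
    """Return True if the chunk's digest matches the expected hex digest."""
    return hashlib.new(algorithm, data).hexdigest() == expected_hex.lower()
```

A client along these lines would drop mismatching mirrors before starting
the download and re-fetch any chunk for which chunk_is_valid() returns
False, preferably from a different mirror.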

> > I would tend to make this a SHOULD, for practical reasons. Also, the
> > text could/should expand both on the implications.
> 
> care to throw in some expansion text? :)

See below for more thoughts about your suggestions :-)
(I saw your change)

> > Alternatively, would the following be an idea?
> >  All referenced IRIs SHOULD lead to identical resources, if the
> >  Metalink includes a "metalink:verification" container with at least
> >  one "metalink:hash" element. All referenced IRIs MUST be identical, if
> >  the latter is not the case.
> 
> If the Metalink Document includes a "metalink:verification" container
> element with at least one "metalink:hash" element, all referenced IRIs
> SHOULD lead to identical resources.
> If the Metalink Document does not include a "metalink:verification"
> container element with at least one "metalink:hash" element, all
> referenced IRIs MUST be identical.
> 
> I don't know, do you think that is necessary or better?

That's better, I think.
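
As a rough illustration of how a validator might apply the two-case wording
quoted above, here is a small Python sketch. The namespace URI and the
nesting of "verification" directly under "file" are assumptions on my part
and may not match the draft exactly; the function name is made up.

```python
import xml.etree.ElementTree as ET

# Assumed namespace; adjust to whatever the draft finally specifies.
NS = {"metalink": "urn:ietf:params:xml:ns:metalink"}


def identical_files_requirement(file_element):
    """Return the requirement level that applies to one metalink:file.

    "SHOULD" if the file carries a metalink:verification container with at
    least one metalink:hash child, "MUST" otherwise.
    """
    hashes = file_element.findall("metalink:verification/metalink:hash", NS)
    return "SHOULD" if hashes else "MUST"
```

The check is done per "file" element, in line with the point above that
IRIs under separate <file> elements are not expected to be identical.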


My head is spinning around this, and I'm not sure. I'd like to hear
what more folks think about it.

I think that, in the end, we'll want to have something like what you
summed up above, with some things not mandatory, but documenting the
intended use cases, spiced with good recommendations (and reasoning) for
content verification.

But if we agree that we should make content verification mandatory, it
would be even better, I think. Especially if implementors have largely
done it anyway, so far!

On Tue, Jun 30, 2009 at 05:50:36PM -0400, Ant Bryan wrote:
> 
> I changed it to this for now
> 
>    The "metalink:url" element contains the IRI of a file.  All IRIs
>    contained in each metalink:resources element SHOULD lead to identical
>    files.
> 
> looking a bit more, maybe this should be defined under <file>
> 
>    The "metalink:file" element represents an individual file, acting as
>    a container for metadata and data associated with the file.
> 
> or <resources>,
> 
>    The "metalink:resources" element acts as a container for metadata and
>    data associated with the listed files.  It contains one or more
>    metalink:url child elements.  It can also contain one or more
>    metalink:metadata child elements.
> 
>  because we want this to include <metadata> (anyone come up with a
> better name? :), that is a torrent or metalink listed in <metadata>
> won't be the same file as in <url> but will eventually lead to it if
> you use that <metadata>

Good thought.

Unfortunately, no idea for a better name for "metadata". The name is
correct for the purpose, just not very intuitive. Confusingly, the
content is another URL, but it does lead to another description and not
to the file itself. But it looks clean and correct, once you think about
it.

Peter
-- 
"WARNING: This bug is visible to non-employees. Please be respectful!"
 
SUSE LINUX Products GmbH
Research & Development
