On Tue, Sep 2, 2008 at 8:23 AM, Ian Macfarlane <[EMAIL PROTECTED]> wrote:
> Dear Anthony,
>
> One extra alteration where I think the wording could be slightly tidied up:
>
> "When one or more metalink:url elements have a preference attribute
>  value of "100", other metalink:url elements SHOULD NOT be used,
>  unless these cannot be processed (e.g. are "bittorrent" etc, and this
>  is not supported by the Metalink Processor, or the servers are down)."
>
> Here "these" could potentially be misread as the "non-100" elements
> rather than the "100" elements. I think slightly clarifying the
> wording here would be beneficial. I suggest something along the lines
> of:
>
> "When one or more metalink:url elements have a preference attribute
> value of "100", other metalink:url elements SHOULD NOT be used, unless
> the elements with a preference of 100 cannot be processed (e.g. if
> they are of a type which is not supported by the Metalink Processor,
> such as bittorrent, or if the servers are unavailable)."

Changed.
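
For anyone implementing this, here is a rough Python sketch of how the revised wording might be read. It is only an illustration, not text from the draft: the Url tuple, the "available" flag (a stand-in for "the servers are up") and the select_urls name are all made up for the example.

```python
from typing import List, NamedTuple, Optional, Set


class Url(NamedTuple):
    """Hypothetical stand-in for a metalink:url element."""
    iri: str
    preference: int            # 1..100, with 100 used first
    url_type: Optional[str]    # e.g. "bittorrent", or None when no type is given
    available: bool = True     # assumption: result of some reachability check


def usable(url: Url, supported_types: Set[str]) -> bool:
    # An element can be processed if its server is up and its type,
    # when one is given, is supported by this processor.
    return url.available and (url.url_type is None or url.url_type in supported_types)


def select_urls(urls: List[Url], supported_types: Set[str]) -> List[Url]:
    candidates = [u for u in urls if usable(u, supported_types)]
    top = [u for u in candidates if u.preference == 100]
    if top:
        # At least one preference-100 element can be processed,
        # so the other elements are not used.
        return top
    # None of the preference-100 elements can be processed:
    # fall back to the rest, highest preference first.
    return sorted(candidates, key=lambda u: u.preference, reverse=True)
```

With a processor that has no BitTorrent support, a preference-100 torrent and a preference-100 mirror that is down would both be skipped, and the lower-preference mirrors would be used instead.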

> Also, I still think the "type" definitions such as "http", "https" etc
> should be removed, as per the reasons given in the previous emails.

It looks like you're getting your way. :)

You & others have made good points against it.

> Thank you for taking the time to look at my suggestions.

Thanks again for taking the time to make them. Your comments were the
first & very helpful!

> ps: I have not had a chance to look through the entire revised
> document - these comments are based on the revisions described in your
> email.
>
> 2008/8/28 Anthony Bryan <[EMAIL PROTECTED]>:
>> On Thu, Aug 28, 2008 at 7:14 AM, Ian Macfarlane <[EMAIL PROTECTED]> wrote:
>>> Hi Anthony,
>>>
>>> Thanks for your reply. A few comments about these changes:
>>
>> Thanks again for the patience & taking the time to help out.
>>
>> I'm keeping up to date versions at
>> http://metalinks.svn.sourceforge.net/viewvc/metalinks/internetdraft/draft-bryan-metalink-01.txt?view=markup
>>
>>> (1) With regards to this new wording:
>>>
>>> "  6.  The value "bittorrent" signifies that the IRI leads to a
>>>     BitTorrent .torrent file as specified in [BITTORRENT].  Metalink
>>>     Processors that do not support BitTorrent should ignore this type
>>>     and also ignore metalink:url elements which retrieve files that
>>>     end with the extension ".torrent"."
>>>
>>> This implies that the file extension still overrides the type
>>> attribute even if the type is not "bittorrent" - I might suggest
>>> adding to the end:
>>>
>>> ", unless the metalink:url element has a type attribute which the
>>> Metalink Processor supports".
>>>
>>> It's definitely a real corner case, but it's good to specify the
>>> correct behavior for future proofing (what if a new file format comes
>>> out called "bittorrent2" which extends bittorrent and uses .torrent
>>> files, but which existing "bittorrent1" processors can't handle?)
>>
>> Yes, better to be clear. I've added it.
>>
>>> Also my original point regarding the location of the ".torrent" text
>>> in the IRI isn't dealt with by this new text - I would suggest
>>> explicitly stating that this means when the IRI path ends with the
>>> characters ".torrent" (or alternatively, as you suggest, require the
>>> "type" attribute for bittorrent).
>>
>> I think it's good to require the "type" attribute for bittorrent so
>> I've changed it. (In practice, it has always been used.) This way, as
>> you say, FTP or HTTP etc IRIs that don't obviously lead to a torrent
>> can be disregarded by Metalink processors, even though they would
>> most likely be rejected anyway for not matching the file size or
>> hash of the other files.
>>
>>> (2):
>>>
>>> "What about requiring that "bittorrent" is used as a "type" attribute
>>> since .torrent files can be acquired from multiple methods, & just
>>> examining the IRI as you mentioned can be misleading?"
>>>
>>> The difference between the ".torrent" naming issue and the "http://"
>>> naming issue is that the first is simply part of the path, and doesn't
>>> really mean anything (there's no reason you couldn't serve a web page
>>> with a .torrent extension, if you have the right Content-Type).
>>
>> I didn't think of that.
>>
>>> However, for "http://" etc, this is the IRI's scheme itself, which has
>>> an explicit unalterable meaning. They're really two very different
>>> things.
>>>
>>> (technically ed2k/magnet/rsync URIs don't need a type either, as the
>>> scheme provides the required information - it's only the BitTorrent
>>> protocol which is different as there is not a 'torrent' URI scheme per
>>> se).
>>>
>>> No strong objection either way to requiring "type" for "bittorrent",
>>> so long as any explicit "type" attribute specified overrides any file
>>> type "sniffing", but I'm slightly in favour of requiring the type
>>> attribute for where it can't be inferred from the scheme and dropping
>>> sniffing altogether.
>>
>> Ok, "type" attribute for bittorrent is required.
>>
>>> (3):
>>>
>>> "A Metalink Processor MAY download different segments of a file from
>>> more than one IRI simultaneously, and when doing so SHOULD first use
>>> the highest priority IRIs and then use lower ones."
>>>
>>> I agree that this is a difficult one. Some possible suggestions:
>>>
>>> - When one or more resources have a value of "100", no other resources
>>> should be used, unless these cannot be processed (e.g. are bittorrent
>>> etc and this is not supported, or the servers are down).
>>>
>>> - Any resources with a value of "1" should not be used unless all
>>> other resources cannot be processed (e.g. are bittorrent etc and this
>>> is not supported, or the servers are down).
>>>
>>> I think at least those two are valuable enough to include (probably a 
>>> SHOULD).
>>
>>   metalink:url elements MAY have a preference attribute, whose value
>>   MUST be a number from 1 to 100 for priority, with 100 used first and
>>   1 used last.  Multiple metalink:url elements can have the same
>>   preference, e.g. ten mirrors could have preference="100".  A Metalink
>>   Processor MAY download different segments of a file from more than
>>   one IRI simultaneously, and when doing so SHOULD first use the
>>   highest priority IRIs and then use lower ones.
>>
>>   When one or more metalink:url elements have a preference attribute
>>   value of "100", other metalink:url elements SHOULD NOT be used,
>>   unless these cannot be processed (e.g. are "bittorrent" etc, and this
>>   is not supported by the Metalink Processor, or the servers are down).
>>
>>   Any metalink:url elements with a preference attribute value of "1"
>>   SHOULD NOT be used unless all other metalink:url elements cannot be
>>   processed (e.g. are "bittorrent" etc and this is not supported by the
>>   Metalink Processor, or the servers are down).
>>
>>> Lastly, it might be possible to do something based on the 'initial
>>> digit', e.g. if the initial digit is higher, all servers with lower
>>> digits should not be used (unless the higher ones cannot be
>>> processed), and the others should have their work distributed evenly
>>> based on the minor digit. For example if you have three resources with
>>> preferences of 89, 91 and 95 - the one with 89 would not be used
>>> (unless the other two can't be used), and the processor would try and
>>> distribute more work to the resource with a value of 95 than the one
>>> with 91 (e.g. 5 times more, or something along those lines - or you
>>> could leave the exact distribution down to the metalink processor). I
>>> think this sort of behavior could be no stronger than a SHOULD though.
>>
>> This could be interesting, I want to consult the authors of Metalink
>> clients first though.
>>
>>> (4):
>>>
>>> " In this example, a subdirectory debian-amd64/sarge/ will be created
>>>  and a file named Contents-amd64.gz will be created inside it.  The
>>>  path MUST be relative.  The path MUST NOT begin with a "/" or contain
>>>  "../" or "./".  Metalink Processors MUST NOT allow directory traversal."
>>>
>>> I think the actual correct form for this should be:
>>>
>>> " In this example, a subdirectory debian-amd64/sarge/ will be created
>>>  and a file named Contents-amd64.gz will be created inside it.  The
>>>  path MUST be relative.  The path MUST NOT begin with a "/", "./" or
>>>  "../", contain "/../", or end with "/..".  Metalink Processors MUST
>>>  NOT allow directory traversal."
>>>
>>> (./ at the start could cause some badly written applications to change
>>> to their current directory, but /./ anywhere else should be fine I
>>> think).
>>>
>>> I think it would be good if you could get a second opinion on this
>>> wording from someone who knows this a bit better than I.
>>
>> I've fixed it & hopefully we'll have corrections :)
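
To make the path rule concrete, a minimal Python sketch of the check as Ian worded it. The function name is made up, and this is one possible reading rather than anything the draft mandates. One thing the literal wording seems to miss is a name that is exactly "..", so the sketch rejects any ".." segment, which also covers a leading "../", an embedded "/../" and a trailing "/..".

```python
def is_safe_relative_path(name: str) -> bool:
    """Return True if a file name attribute passes the relative-path rule."""
    # Reject empty names, absolute paths, and a leading "./".
    if not name or name.startswith(("/", "./")):
        return False
    # Any ".." segment would allow directory traversal.
    # A "/./" in the middle of the path is allowed, as suggested above.
    return ".." not in name.split("/")
```

Under this reading "debian-amd64/sarge/Contents-amd64.gz" passes, while "/etc/passwd", "../x", "a/../b" and "a/.." are refused. It only looks at "/" as a separator, so a processor on a filesystem with other separators would need more than this.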
>>
>>> (5) It might also be worth adding information as to how to deal with
>>> characters which are invalid in the filesystem - I'd suggest something
>>> like:
>>>
>>> "A Metalink Processor MAY alter the name of the subdirectory or file
>>> if they contain characters which are invalid in the destination
>>> filesystem."
>>>
>>> (that way it can be left to the processor itself to decide what to
>>> rename it to on any particular filesystem, or even reject it if
>>> desired).
>>
>> That sounds good, added.
>>
>>> (6) "What do you suggest about dealing with multiple hash types?" -
>>> obviously it would be better for a processor to check multiple hashes,
>>> as it's a good way to prevent malicious altering of the files. This
>>> needs to be left down to the metalink processor though. Something
>>> like:
>>>
>>> "When multiple hash types are provided, a Metalink Processor
>>> MAY verify using more than one of these hash types".
>>
>> Added. Currently, I think most only do one.
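
As an illustration of checking more than one hash type, a small Python sketch using the standard hashlib module. It assumes the "type" values in the metalink happen to match hashlib's algorithm names (e.g. "sha1", "sha256"), which is not something the draft promises, and it simply skips types it does not recognise. The function name is made up.

```python
import hashlib
from typing import Dict


def verify_hashes(data: bytes, expected: Dict[str, str]) -> Dict[str, bool]:
    """Check data against every hash type this processor can compute.

    expected maps a hash type to its hex digest, as found in the
    <hash type="..."> elements. Unknown types are skipped (an assumption).
    """
    results: Dict[str, bool] = {}
    for hash_type, hex_digest in expected.items():
        if hash_type not in hashlib.algorithms_guaranteed:
            continue  # this processor cannot compute that type
        actual = hashlib.new(hash_type, data).hexdigest()
        # Digests in metalinks may carry stray whitespace or upper case.
        results[hash_type] = actual == hex_digest.strip().lower()
    return results
```

A processor following the "encouraged to check all hash types" wording could treat the download as good only if every value in the result is True. What to do when the result is empty is left to the processor.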
>>
>>> Also you write:
>>>
>>> "An issue could be if someone malicious makes a metalink where the MD5
>>> matches that of something published by a legit group, but also
>>> includes a SHA-256 checksum, and if clients prefer & only verify
>>> SHA-256, then the file could appear to be good even if the downloader
>>> looked inside the metalink & compared the MD5 (if the legit group
>>> didn't also use & publish SHA-256 checksums too)."
>>>
>>> That's an interesting case, but if their metalink processor didn't
>>> support md5 and only supported SHA-256, it'd be the same scenario too.
>>> I don't think this is too much of a concern. It's not unreasonable to
>>> change the "MAY" suggested above for checking multiple hash types to a
>>> "SHOULD", but for reasons such as performance, this might not always
>>> be desirable, and I'm not sure you'd gain as much security as you
>>> might think from doing this. Most people don't even know what md5 etc
>>> are, and a user who cares about this will probably make sure their
>>> metalink processor is one which checks all hashes.
>>>
>>> It's quite in keeping with standards to write something like this:
>>>
>>> "Metalink processors are encouraged to check all hash types given
>>> which they are able to process"
>>>
>>> This will probably lead to checking all as the common behavior while
>>> not preventing people choosing not to do this for whatever reason.
>>
>> Yes, I think it ultimately comes down to downloading .metalinks from
>> someone you trust, just like any other type of download.
>>
>>> (7): "is the only thing changed?" - also the indentation changed (e.g.
>>> <description> should be indented one more level, as it's a child of
>>> <file>).
>>
>> Hopefully I've got this right now, let me know.
>>
>>> (8) An additional point I've just noticed - it specifies that it
>>> should ignore resources with a "type" of bittorrent if it is
>>> unsupported. ("Metalink Processors that do not support BitTorrent MUST
>>> ignore" ...) [as changed in your additional email from should]. This
>>> should probably be removed from the bittorrent subsection (point 6 in
>>> 4.2.17.2), and moved to above the list in 4.2.17.2, and state
>>> something like "Metalink Processors that do not support a specified
>>> type of resource MUST ignore that resource". This is both future
>>> proof, and handles the case of Metalink Processors not supporting ed2k
>>> etc.
>>
>> Good point.
>>
>>> Best wishes
>>>
>>> Ian Macfarlane
>>>
>>> ps: No objections to forwarding any of this to the metalink list.
>>
>> Thanks!
>>
>>> 2008/8/28 Anthony Bryan <[EMAIL PROTECTED]>:
>>>> Hi Ian,
>>>>
>>>> Great comments, thank you so much for taking the time to examine this!
>>>> These are issues that needed to be addressed.
>>>>
>>>> Do you mind if I forward this to the metalink-discussion list?
>>>>
>>>> I'll put the changes here, let me know if they are an improvement, or
>>>> suggest a change.
>>>>
>>>> On Wed, Aug 27, 2008 at 7:33 AM, Ian Macfarlane <[EMAIL PROTECTED]> wrote:
>>>>> A few comments regarding the draft at
>>>>> http://tools.ietf.org/html/draft-bryan-metalink-00
>>>>>
>>>>> (1) With regards to the "type" attribute of the metalink:url element
>>>>> in 4.2.17.2, I think it should be made clear that this overrides any
>>>>> file extension sniffing specified in 4.2.17.
>>>>
>>>> 4.2.17.2.  The "type" Attribute
>>>>
>>>>  metalink:url elements MAY have a "type" attribute that indicates the
>>>>  IRI type.  The "type" attribute overrides any file extension sniffing
>>>>  specified above.
>>>>
>>>>> (2) With regards to the metalink:url element in 4.2.17, it is not
>>>>> clear if the IRI must end with ".torrent", or if the path should, e.g.
>>>>> does http://example.com/file.torrent?id=1 count? What about
>>>>> http://example.com/generate.php?file.torrent
>>>>
>>>>  6.  The value "bittorrent" signifies that the IRI leads to a
>>>>      BitTorrent .torrent file as specified in [BITTORRENT].  Metalink
>>>>      Processors that do not support BitTorrent should ignore this type
>>>>      and also ignore metalink:url elements which retrieve files that
>>>>      end with the extension ".torrent".
>>>>
>>>>> (3) Also with regards to the "type" attribute of the metalink:url
>>>>> element in 4.2.17.2, it's slightly inconsistent to allow both
>>>>> http/https/ftp/etc as well as "bittorrent" as types, as a .torrent
>>>>> file itself can be sent over any of these protocols. There is explicit
>>>>> information about the protocol from the scheme in these URLs. I would
>>>>> suggest "direct" (or omit altogether) for this type of file, and the
>>>>> Metalink Processor should infer the protocol from the scheme used.
>>>>
>>>> What about requiring that "bittorrent" is used as a "type" attribute
>>>> since .torrent files can be acquired from multiple methods, & just
>>>> examining the IRI as you mentioned can be misleading?
>>>>
>>>>> (4) With regards to the metalink:url element "preference" attribute in
>>>>> 4.2.17.1, it is not entirely clear if a Metalink Processor which can
>>>>> download simultaneously should download from two locations where one
>>>>> has a lower priority. A comment such as "A Metalink Processor SHOULD
>>>>> do xxx" would be helpful.
>>>>
>>>> I'm not sure what to put here :)
>>>>
>>>> This sentence would accurately describe what they do now. In our
>>>> pre-ID version, we have a "maxconnections" attribute where you can
>>>> limit the number of segments for a download.
>>>>
>>>> "A Metalink Processor MAY download different segments of a file from
>>>> more than one IRI simultaneously, and when doing so SHOULD first use
>>>> the highest priority IRIs and then use lower ones."
>>>>
>>>>> (5) With regards to the "name" attribute in 4.1.3.1, instead of "Only
>>>>> relative paths are allowed" I think using the formal restrictive
>>>>> language of the standards process is a good idea here, e.g. "The path
>>>>> MUST be relative". It might also be a good idea to add that the path
>>>>> MUST NOT begin with a "/" or contain "../" (and possibly "./")
>>>>> [technically this should be starting with ../ or ./ or containing /..
>>>>> or /. I think].
>>>>
>>>>  In this example, a subdirectory debian-amd64/sarge/ will be created
>>>>  and a file named Contents-amd64.gz will be created inside it.  The
>>>>  path MUST be relative.  The path MUST NOT begin with a "/" or contain
>>>>  "../" or "./".  Metalink Processors MUST NOT allow directory traversal.
>>>>
>>>>> (6) Under 4.1.4 where it says "This specification assigns no
>>>>> significance to the order of metalink:url elements" it might be useful
>>>>> to include a reference to the "preference" attribute.
>>>>
>>>>  This specification assigns no significance to the order of metalink:
>>>>  url elements.  Significance is determined by the value of the
>>>>  "preference" attribute of the metalink:url elements.
>>>>
>>>>> (7) With regards to verification (4.1.6.1 and 4.2.4.1) there is no
>>>>> information as to how a Metalink Processor should deal with one (can
>>>>> it ignore it) or deal with multiple hash types (e.g. if there is MD5
>>>>> and SHA1, MUST / MAY / MUST NOT it check more than one?). Also, it
>>>>> might be useful to extend metalink documents with new verification
>>>>> methods before they arrive in the standard. Perhaps unknown types
>>>>> could be allowed here? The same comments mostly apply to digital
>>>>> signatures too (4.2.13).
>>>>
>>>> I agree that unknown hash types or digital signatures should be allowed.
>>>>
>>>>  This document defines nine initial values for hash types.  It may be
>>>>  useful to extend Metalink documents with new verification methods, so
>>>>  unknown types are allowed.
>>>>
>>>> and
>>>>
>>>>  metalink:signature elements MUST have a "type" attribute.  The initial
>>>>  value of "type" is the string that is non-empty and matches "pgp".
>>>>  It may be useful to extend Metalink documents with new types of
>>>>  digital signatures, so unknown types are allowed.
>>>>
>>>>
>>>> What do you suggest about dealing with multiple hash types?
>>>>
>>>> An issue could be if someone malicious makes a metalink where the MD5
>>>> matches that of something published by a legit group, but also
>>>> includes a SHA-256 checksum, and if clients prefer & only verify
>>>> SHA-256, then the file could appear to be good even if the downloader
>>>> looked inside the metalink & compared the MD5 (if the legit group
>>>> didn't also use & publish SHA-256 checksums too).
>>>>
>>>>> (8) Formatting nit last - the use of spacing in the nesting of XML
>>>>> elements is pretty inconsistent - so instead of this on page 4:
>>>>>
>>>>> <?xml version="1.0" encoding="UTF-8" ?>
>>>>>  <metalink version="3.0" xmlns="http://metalinker.org">
>>>>>    <published>2008-05-15T12:23:23Z</published>
>>>>>    <files>
>>>>>      <file name="example.ext">
>>>>>      <description>A description of the example file for download.
>>>>>      </description>
>>>>>      <verification>
>>>>>        <hash type="md5">83b1a04f18d6782cfe0407edadac377f</hash>
>>>>>        <hash type="sha1">80bc95fd391772fa61c91ed68567f0980bb45fd9
>>>>>        </hash>
>>>>>      </verification>
>>>>>      <resources>
>>>>>        <url>ftp://ftp.example.com/example.ext</url>
>>>>>        <url>http://example.com/example.ext</url>
>>>>>        <url>http://example.com/example.ext.torrent</url>
>>>>>      </resources>
>>>>>      </file>
>>>>>    </files>
>>>>>  </metalink>
>>>>>
>>>>> It would be much nicer if it were nested something like this:
>>>>>
>>>>> <?xml version="1.0" encoding="UTF-8" ?>
>>>>>  <metalink version="3.0" xmlns="http://metalinker.org">
>>>>>    <published>2008-05-15T12:23:23Z</published>
>>>>>    <files>
>>>>>      <file name="example.ext">
>>>>>        <description>A description of the example file for
>>>>> download.</description>
>>>>>        <verification>
>>>>>          <hash type="md5">83b1a04f18d6782cfe0407edadac377f</hash>
>>>>>          <hash type="sha1">80bc95fd391772fa61c91ed68567f0980bb45fd9
>>>>>          </hash>
>>>>>        </verification>
>>>>>        <resources>
>>>>>          <url>ftp://ftp.example.com/example.ext</url>
>>>>>          <url>http://example.com/example.ext</url>
>>>>>          <url>http://example.com/example.ext.torrent</url>
>>>>>        </resources>
>>>>>      </file>
>>>>>    </files>
>>>>>  </metalink>
>>>>
>>>> Done, if:
>>>>
>>>>      <description>A description of the example file for
>>>> download.</description>
>>>>
>>>> is the only thing changed?
>>>>


-- 
(( Anthony Bryan ... Metalink [ http://www.metalinker.org ]
 )) Easier, More Reliable, Self Healing Downloads
_______________________________________________
Int-area mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/int-area
