On Aug 5, 3:24 pm, Anthony Bryan <[email protected]> wrote:
> On Tue, Aug 4, 2009 at 9:38 AM, Hampus Wessman<[email protected]>
> wrote:
>
> > Hello everyone!
>
> > This is not a review of the internet draft document as such, but rather
> > some more general changes to the structure of the format that I think
> > would make metalinks a lot easier to use in computer programs. The
> > changes should be fairly easy to add to the ID if Anthony and the rest
> > of you like them. Sorry for suggesting all these changes at this late
> > stage, but I think they are important so please take a look at them at
> > least.
>
> it's not too late by any means, thanks for taking the time Hampus!
>
> > My suggestions would make the new format backwards incompatible, but
> > AFAIK the ID isn't completely compatible with most current
> > implementations anyway (not meta data at least). I think it is more
> > important to make the standard as good as possible than making it
> > backwards compatible. Clients with support for 3.0 will be able to add
> > support for the new standard easily anyway.
>
> it's been my intention to keep the ID version as close to the current
> version as possible (at least for assisting downloads), until it MUST
> not be.
>
> this is because the ID is a re-specification of something we have a
> few years experience with, and 50+ programs that currently support it.
> at my last count, 9 of those were closed source & will be slow to
> update. most of the open source clients will probably be slow to
> update, even in the current "search & replace" version.
>
> I've been trying to balance an attempt at (almost) perfect and
> backwards compatible. I've tried to slim things down & make them
> simpler.
>
> now is the perfect time for change!
>
> how bad are things currently? & how much better will we make them with
> changes? what will be the incentive for authors to do more work?
>
> also, it's probably a good time to discuss what to do in the
> changeover period to convert back & forth between versions. a python
> script, .exe for windows users, XSLT, a web service...
>
> these are some great suggestions!
>
> why don't we take them on, starting with less invasive first. that
> would be #3, 4, 2, then 1 I think.
>

I like the idea 'Change 2: remove "piece" attribute from piece hashes'.
Actually, aria2 sorts piece hash data by its index!
I think the current ID is very well written in terms of the compatibility
and improvements Anthony mentioned, but hey, I'm not saying there is no
room for change ;)

> so for #3, you suggest we remove metadata inheritance & these elements
> from <files>:
> copyright
> description
> identity
> language
> license
> logo
> os
> publisher
> version
>
> that makes things quite a bit simpler...
>

I agree with change #3. Metadata inheritance is too complicated for its
own good. I think a metalink file is generally produced by a machine, not
a human, so the generator can copy all metadata into every file without
complaint, and we should not care about the size of the XML. If size
matters, we can use gzip to transfer a compressed file.

>
>
> > Here's my suggested changes:
>
> > Change 1: Remove unnecessary tags that carry no information
>
> > The metalink format contains some tags that could be removed without
> > losing ANY functionality. I'm thinking about <files>, <verification> and
> > <resources>. They may look pretty to humans, but I think the format
> > would be easier to deal with if they were removed. A metalink contains
> > one or more files, which contain hashes and urls (among other things).
> > The following xml structure reflects this hierarchy just as well as the
> > current one:
>
> > <metalink>
> >   <file name="example.ext">
> >     <identity>Example</identity>
> >     <hash type="md5">2156346474343745</hash>
> >     <url>http://example.com/</url>
> >     <url>ftp://ftp.example.com/</url>
> >   </file>
> >   <file name="example2.ext">
> >     ...
> >   </file>
> > </metalink>
>
> > (I skipped some details here, like <?xml ...)
>
> > In my experience it would be easier to parse/load/read a metalink with
> > that structure. It may depend on how you do that, but I can't think of
> > any situation when it would make it harder.
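
To make Change 1 concrete, here is a minimal Python sketch of reading the
flattened layout quoted above. It assumes no XML namespace and that <hash>
and <url> sit directly under <file>, as in the example. The helper name
load_flat_metalink and the dictionary layout are made up for illustration;
they are not part of the draft or of any existing client.

```python
import xml.etree.ElementTree as ET

# Sample document in the flattened layout proposed in Change 1
# (no <files>, <verification>, or <resources> wrappers).
SAMPLE = """\
<metalink>
  <file name="example.ext">
    <identity>Example</identity>
    <hash type="md5">2156346474343745</hash>
    <url>http://example.com/</url>
    <url>ftp://ftp.example.com/</url>
  </file>
</metalink>
"""


def load_flat_metalink(xml_text):
    """Return one dict per <file> element (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    files = []
    for file_el in root.findall("file"):
        files.append({
            "name": file_el.get("name"),
            "identity": file_el.findtext("identity"),
            # Hashes keyed by their "type" attribute.
            "hashes": {h.get("type"): h.text for h in file_el.findall("hash")},
            # URLs kept in document order.
            "urls": [u.text for u in file_el.findall("url")],
        })
    return files


files = load_flat_metalink(SAMPLE)
```

With the wrapper elements gone, every lookup is a direct child of <file>,
so the loader needs no intermediate steps for <verification> or
<resources>.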
>
> > Change 2: remove "piece" attribute from piece hashes
>
> > The internet draft does state that the "piece" attribute starts at zero
> > and "increases", which probably means that you must supply the chunk
> > checksums / piece hashes in the right order (the first one first and so
> > on). This is really good. Otherwise you need to sort them each time you
> > load a metalink file.
>
> > If you supply the piece hashes in the correct order, then you don't need
> > the "piece" attribute as the order of xml elements is significant (you
> > can't, for example, show the <p> tags in an xhtml document in any
> > order!). Having the piece attribute will without doubt make people
> > believe you can supply them in any order, as that is the only reason for
> > having it.
>
> > My suggestion: remove the "piece" attribute and require that the piece
> > hashes are placed in the correct order.
>
> > Change 3: Remove (and forget about!) meta data inheritance
>
> > This is a confusing and unnecessary part of the standard, which makes it
> > harder for applications to read metalinks and only gives us some kind of
> > "compression" in return (i.e. some duplicates of tags can be removed in
> > multi-file metalinks, at times). If we really want small files, then an
> > XML-format is the wrong way to achieve that. In that case we should
> > investigate alternative solutions, because there will be better ones.
>
> > Even though this feature might be useful in some situations, I think the
> > complexity it adds to every application that wants to load a
> > metalink is too high a price to pay. It is far more important that
> > metalinks are easy to deal with (and easy to understand!) than that they
> > are as small as possible. Remember, XML isn't small and will never be!
> > Let's focus on what we are good at instead (i.e. being a nice and easy xml
> > format that bundles data about files).
>
> > Change 4: Add meta data about the metalink (i.e. about the whole
> > metalink as such)
>
> > Screenshot of DTA: http://hampuswessman.se/dta_metalink.png
>
> > A metalink contains a collection of files. The current standard only
> > makes it possible to add meta data (i.e. identity, description, ...) for
> > each separate file. Many clients display information about the
> > collection as such (i.e. the whole metalink). See the DTA screenshot
> > above for an example. These clients apparently interpret the contents of
> > the metalink wrong as there is no such data in the metalink format. The
> > "meta data inheritance" mentioned in Change 3 is probably one reason for
> > this confusion.
>
> > Now to the solution. I like the way that e.g. DTA presents the metalink
> > and so I think we should adapt the format after this. More precisely, we
> > remove all kinds of "meta data inheritance" (see change 3) and then we
> > add some new tags directly under <metalink>, like <identity> and
> > <description>. Exactly which can be determined later on. This way there
> > would be some meta data about the <metalink> and some about each <file>
> > and it would be placed directly under those tags (only).
>
> > This would make the metalink format behave more like many people who
> > come into contact with it for the first time expect it to work (in my
> > very limited experience). It would also be very useful. An example is a
> > good way to describe why:
>
> > A web site presents their 10 favorite open source games in an article.
> > They want everyone to be able to download these games easily. A metalink
> > would, of course, be perfect! They add all the 10 games to the metalink
> > and write short descriptions (and so on) for each file/game. They also
> > set the <identity> of the metalink (i.e. of the whole collection of files)
> > to "Our 10 favorite games" and add a description to the whole metalink
> > which describes what kind of file collection this is.
>
> > When using DTA to download this fictional metalink, we would be
> > presented with the description of the metalink at the top and then each
> > file below. We can then choose which files we actually want and so on...
> > Perfect!!
>
> > Example xml:
> > <metalink>
> >   <identity>The best 10 open source games ever</identity>
> >   <description>...</description>
>
> >   <file name="wesnoth.exe">
> >     <identity>Wesnoth</identity>
> >     ...
> >   </file>
>
> >   <file name="superpong.exe">
> >     <identity>Super Pong 3000</identity>
> >     ...
> >   </file>
> >   ...
> > </metalink>
>
> > That was a far too long e-mail... Kudos to everyone that got this far!
>
> > Keep up the good work with the internet draft, Anthony.
>
> --
> (( Anthony Bryan ... Metalink [http://www.metalinker.org]
>  )) Easier, More Reliable, Self Healing Downloads
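
As an illustration of Changes 3 and 4 taken together, the following Python
sketch reads collection-level metadata only from direct children of
<metalink> and file-level metadata only from direct children of each
<file>, with no inheritance in either direction. It assumes the
namespace-free layout from the example xml above. The function name
describe and the returned tuple shape are hypothetical, not something the
draft or DTA defines.

```python
import xml.etree.ElementTree as ET

# Trimmed version of the example from the message; the "..." description
# is kept as written there.
SAMPLE = """\
<metalink>
  <identity>The best 10 open source games ever</identity>
  <description>...</description>
  <file name="wesnoth.exe">
    <identity>Wesnoth</identity>
  </file>
  <file name="superpong.exe">
    <identity>Super Pong 3000</identity>
  </file>
</metalink>
"""


def describe(xml_text):
    """Return (collection metadata, per-file metadata); hypothetical helper."""
    root = ET.fromstring(xml_text)
    # Collection metadata: direct children of <metalink> only.
    collection = {
        "identity": root.findtext("identity"),
        "description": root.findtext("description"),
    }
    # File metadata: direct children of each <file> only, nothing inherited.
    files = [(f.get("name"), f.findtext("identity")) for f in root.findall("file")]
    return collection, files


collection, files = describe(SAMPLE)
print(collection["identity"])
for name, identity in files:
    print(f"  {name}: {identity}")
```

Because each lookup is confined to one level, a client can show the
collection title at the top and the per-file titles below it without any
merge step between the two.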
