awesome, thanks Jack!

it's very nice of you to keep this updated, working, & also committed
back to ATS.

btw, the README is great. I'm going to include some of it here:


                                        Metalink

   Try not to download the same file twice.  Improve cache efficiency
   and speed up downloads.

   Take standard headers and knowledge about objects in the cache and
   potentially rewrite those headers so that a client will use a URL
   that's already cached instead of one that isn't.  The headers are
   specified in [RFC 6249] (Metalink/HTTP: Mirrors and Hashes) and
   [RFC 3230] (Instance Digests in HTTP) and are sent by various
   download redirectors or content distribution networks.


1.  Who Cares?

   More important than saving a little bit of bandwidth, this saves
   users from frustration.

   A lot of download sites distribute the same files from many
   different mirrors and users don't know which mirrors are already
   cached.  These sites often present users with a simple download
   button, but the button doesn't predictably access the same mirror,
   or a mirror that's already cached.  To users it seems like the
   download works sometimes (takes seconds) and not others (takes
   hours), which is frustrating.

   An extreme example of this happens when users share a limited,
   possibly unreliable internet connection, as is common in parts of
   Africa for example.

   [How to cache openSUSE repositories with Squid] is another,
   different example of a use case where picking a URL that's already
   cached is valuable.

2.  What it Does

   When it sees a response with a "Location: ..." header and a
   "Digest: SHA-256=..." header, it checks if the URL in the Location
   header is already cached.  If it isn't, then it tries to find a URL
   that is cached to use instead.  It looks in the cache for some
   object that matches the digest in the Digest header and if it
   succeeds, then it rewrites the Location header with the URL from
   that object.

   This way a client should get sent to a URL that's already cached
   and won't download the file again.
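
   The decision described above can be sketched roughly as follows.
   This is only an illustration in Python, not the plugin's code (the
   real plugin is metalink.cc, written against the Traffic Server
   API).  ToyCache, parse_sha256_digest and rewrite_location are
   hypothetical names made up for this sketch, and the cache is
   assumed to be a simple in-memory lookup by URL and by digest.

```python
import base64
import hashlib


class ToyCache:
    """Hypothetical in-memory stand-in for the proxy cache (an assumption)."""

    def __init__(self):
        self._by_url = {}
        self._url_by_digest = {}

    def store(self, url, body):
        # Index the object by URL and by the base64 SHA-256 of its body,
        # the same encoding used in a "Digest: SHA-256=..." header.
        digest = base64.b64encode(hashlib.sha256(body).digest()).decode("ascii")
        self._by_url[url] = body
        self._url_by_digest.setdefault(digest, url)

    def has_url(self, url):
        return url in self._by_url

    def url_for_digest(self, digest):
        return self._url_by_digest.get(digest)


def parse_sha256_digest(header_value):
    """Return the base64 SHA-256 value from a Digest header, or None."""
    for part in header_value.split(","):
        # Split on the first "=" only; base64 padding may add more.
        algorithm, sep, value = part.strip().partition("=")
        if sep and algorithm.upper() == "SHA-256":
            return value
    return None


def rewrite_location(headers, cache):
    """Return headers, with Location rewritten to a cached URL if possible."""
    location = headers.get("Location")
    digest_header = headers.get("Digest")
    if location is None or digest_header is None:
        return headers  # not a response this sketch cares about
    if cache.has_url(location):
        return headers  # the redirect target is already cached
    digest = parse_sha256_digest(digest_header)
    if digest is None:
        return headers
    cached_url = cache.url_for_digest(digest)
    if cached_url is None:
        return headers  # nothing in the cache matches the digest
    rewritten = dict(headers)
    rewritten["Location"] = cached_url
    return rewritten
```

   If the Location URL is already cached, or no cached object matches
   the digest, the headers come back unchanged, so the client is never
   redirected anywhere the origin didn't already vouch for via the
   digest.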

On Wed, Feb 5, 2014 at 6:33 PM, Jack Bates <[email protected]> wrote:
> I just pushed an update to the Metalink plugin for Apache Traffic Server [1].
> The update fixes a segfault that was reported by Faysal Banna [2].
>
> [1]  https://cwiki.apache.org/confluence/display/TS/Metalink
> [2]  https://github.com/jablko/dedup/issues/1
>
> I pushed the updated plugin to GitHub (hopefully it will also be distributed
> with the next Traffic Server release).
> To build it,
>
>    1) download the updated metalink.cc file [3],
>    2) replace the plugins/experimental/metalink/metalink.cc file in your
> Traffic Server source tree with the updated file,
>    3) and rebuild Traffic Server by rerunning "make".
>
> [3]  https://raw.github.com/jablko/dedup/master/metalink.cc
>
> Here is a real example of what this plugin does:
> If I download the latest version of LibreOffice, their download redirector
> (MirrorBrain) sends me to tdf.mirror.rafal.ca.
> But if several users are all sitting behind Traffic Server and someone has
> already downloaded this file from mirror.nexcess.net, the response from
> MirrorBrain is rewritten to redirect me to that mirror instead. Then I will
> get the file that Traffic Server cached, instead of going out over our
> internet connection.
>
> Here is the response I get from MirrorBrain:
>
> $ curl -v download.documentfoundation.org/libreoffice/stable/4.2.0/rpm/x86/LibreOffice_4.2.0_Linux_x86_rpm.tar.gz > /dev/null
>
> < HTTP/1.1 302 Found
> < Link: <http://tdf.mirror.rafal.ca/libreoffice/stable/4.2.0/rpm/x86/LibreOffice_4.2.0_Linux_x86_rpm.tar.gz>; rel=duplicate; pri=1; geo=ca
> < Link: <http://mirror.nexcess.net/tdf/libreoffice/stable/4.2.0/rpm/x86/LibreOffice_4.2.0_Linux_x86_rpm.tar.gz>; rel=duplicate; pri=2; geo=us
> < Digest: SHA-256=YVJGdJtB7E2kxPTBjLPBwd4zhlgiDclzqBUTWyvzGkk=
> < Location: http://tdf.mirror.rafal.ca/libreoffice/stable/4.2.0/rpm/x86/LibreOffice_4.2.0_Linux_x86_rpm.tar.gz
>
> But if I cause the proxy to cache the file from mirror.nexcess.net ...
>
> $ curl -vx localhost:8080 mirror.nexcess.net/tdf/libreoffice/stable/4.2.0/rpm/x86/LibreOffice_4.2.0_Linux_x86_rpm.tar.gz > /dev/null
>
> ... then the proxy will rewrite the response from MirrorBrain:
>
> $ curl -vx localhost:8080 download.documentfoundation.org/libreoffice/stable/4.2.0/rpm/x86/LibreOffice_4.2.0_Linux_x86_rpm.tar.gz > /dev/null
>
> < HTTP/1.1 302 Found
> < Location: HTTP://mirror.nexcess.net/tdf/libreoffice/stable/4.2.0/rpm/x86/LibreOffice_4.2.0_Linux_x86_rpm.tar.gz
>
> And I will get the file that the proxy cached.
> I hope other people will find this useful.
>
> --
> You received this message because you are subscribed to the Google Groups
> "Metalink Discussion" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/metalink-discussion.
> For more options, visit https://groups.google.com/groups/opt_out.



-- 
(( Anthony Bryan ... Metalink [ http://www.metalinker.org ]
  )) Easier, More Reliable, Self Healing Downloads
