On Friday, May 18, 2012 5:11:58 PM UTC+5:30, Jack Bates wrote:
>
> Hi, I started work on a plugin for Apache Traffic Server and I would
> love any feedback (and maybe implementation advice) from the Metalink
> community
>
> Traffic Server is a caching proxy and the goal of this plugin is to
> help it work better with files distributed from multiple mirrors or
> content distribution networks. Currently downloading a file that is
> already cached from a different mirror is a cache miss. A lot of
> download sites present users with a simple download button that
> doesn't always redirect them to the same mirror, which defeats the
> benefit of a caching proxy and frustrates users
>
> I would love to hear any of your thoughts on how caching proxies could
> work better with content distribution networks
>
> For this first attempt at this plugin, the approach taken is to use
> RFC 6249, Metalink/HTTP: Mirrors and Hashes. The plugin listens for
> responses that are an HTTP redirect and have "Link: <...>;
> rel=duplicate" headers, then scans the URLs for one that already
> exists in the cache. If found then it transforms the response,
> replacing the "Location: ..." header with the URL that already exists
> in the cache
>
> The code is up on GitHub [1] and works just enough that, given a
> response with a "Location: ..." header that's not cached and a "Link:
> <...>; rel=duplicate" header that is cached, it will rewrite the
> "Location: ..." header with the cached URL
>
> I would love any feedback on this approach
>

Excellent start, Jack. I am also a GSoC student this year, working on the RFC 6249 implementation in KGet. What you have on GitHub looks really great.
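To make the rewrite step in the quoted message concrete, here is a minimal sketch in Python (not the plugin's C/C++, and not the Traffic Server API). It assumes the response headers arrive as a plain dict with a "Location" string and a list of "Link" values, and that `is_cached` is a hypothetical stand-in for the proxy's cache lookup. The Link parsing is deliberately simplified and does not cover the full header grammar (for example, a ";" or "," inside a quoted parameter value).

```python
import re

# One link-value: <URI> followed by its ";"-separated parameters.
# Simplification: no ">" inside the URI, no ";" or "," inside quoted values.
_LINK_VALUE = re.compile(r'<([^>]*)>((?:\s*;\s*[^;,]+)*)')
_PARAM = re.compile(r';\s*([^=;,\s]+)\s*(?:=\s*(?:"([^"]*)"|([^;,\s]*)))?')


def duplicate_urls(link_headers):
    """Return URLs from Link header values whose rel includes "duplicate"."""
    urls = []
    for header in link_headers:
        for url, params in _LINK_VALUE.findall(header):
            for name, quoted, token in _PARAM.findall(params):
                # rel may hold several space-separated relation types.
                rels = (quoted or token).lower().split()
                if name.lower() == "rel" and "duplicate" in rels:
                    urls.append(url)
                    break
    return urls


def rewrite_location(headers, is_cached):
    """Point Location at a cached rel=duplicate URL, if one exists.

    headers: dict with "Location" (str) and "Link" (list of str); assumed shape.
    is_cached: hypothetical callable standing in for the proxy's cache lookup.
    """
    location = headers.get("Location")
    if location is None or is_cached(location):
        return headers  # nothing to redirect, or already a cache hit
    for url in duplicate_urls(headers.get("Link", [])):
        if is_cached(url):
            return {**headers, "Location": url}
    return headers  # no cached duplicate: leave the response untouched


if __name__ == "__main__":
    cached = {"http://b.example/f.iso"}
    response = {
        "Location": "http://a.example/f.iso",
        "Link": ['<http://b.example/f.iso>; rel=duplicate'],
    }
    print(rewrite_location(response, cached.__contains__)["Location"])
```

The function returns a new dict rather than mutating the response, which keeps the "no cached duplicate" path trivially side-effect free.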
> We are also thinking of using RFC 3230, Instance Digests in HTTP.
> Given a response with a "Location: ..." header that's not cached and a
> "Digest: ..." header, the plugin would check if another URL with the
> same digest already exists in the cache and rewrite the
> "Location: ..." header with that URL if so
>
> Still more ideas include:
>
> * Remember URLs for the same file so future requests for any of
>   these URLs use the same cache key. A problem is how to prevent a
>   malicious domain from distributing false information about URLs it
>   doesn't control. This could be addressed with a whitelist of domains
>
> * Making decisions about the best mirror to choose, e.g. one that
>   is most cost efficient, faster, or more local
>
> * Use content digest to detect or repair download errors
>
> Finally, can anyone in the Metalink community recommend a reusable C/C++
> solution for checking if a "Link: ..." header has a "rel=duplicate"
> parameter? For now I am parsing these headers from scratch with
> memchr(), but I expect that I am neglecting some accumulated wisdom on
> getting all the RFC rules right, and maybe interoperating with
> nonconformant implementations. Please let me know if you know a better
> way
>

I believe aria2 supports Metalink/HTTP and may have the reusable code you are looking for. I have gone through the code, but as my implementation is Qt/KDE dependent, I am not sure how much use it will be to me. Still, go ahead and have a look at it. Here's the link:
https://github.com/tatsuhiro-t/aria2

All the best. :)

>
> Here is a similar message [2] on the Traffic Server developers list,
> with slightly more detail
>
> We run Traffic Server here at a rural village in Rwanda for faster,
> more reliable internet access.
> I am working on this as part of the
> Google Summer of Code
>
> [1] https://github.com/jablko/dedup
> [2] http://mail-archives.apache.org/mod_mbox/trafficserver-dev/201205.mbox/%3C4FAE78FB.1070404%40nottheoilrig.com%3E
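
As a follow-up to the "Digest: ..." idea in the quoted message, here is a rough Python sketch of how a digest-to-URL index might look. The `DigestIndex` class and its method names are hypothetical, invented for illustration; a real plugin would keep this mapping in the proxy's own cache. The parser only splits the comma-separated `algorithm=value` pairs of a Digest header and does not verify the digests themselves.

```python
def parse_digest(value):
    """Split a Digest header value into (algorithm, encoded-digest) pairs."""
    pairs = []
    for item in value.split(","):
        # Split on the first "=" only, so base64 padding stays in the digest.
        algorithm, sep, digest = item.strip().partition("=")
        if sep and digest:
            # Algorithm names are compared case-insensitively.
            pairs.append((algorithm.strip().lower(), digest.strip()))
    return pairs


class DigestIndex:
    """Hypothetical in-memory map from instance digest to a cached URL."""

    def __init__(self):
        self._by_digest = {}

    def remember(self, url, digest_header):
        """Record that `url` is cached with the digests in `digest_header`."""
        for key in parse_digest(digest_header):
            # Keep the first URL seen for a digest.
            self._by_digest.setdefault(key, url)

    def lookup(self, digest_header):
        """Return a cached URL sharing any digest in the header, else None."""
        for key in parse_digest(digest_header):
            if key in self._by_digest:
                return self._by_digest[key]
        return None


if __name__ == "__main__":
    index = DigestIndex()
    index.remember("http://b.example/f.iso", "SHA-256=abc123==")
    print(index.lookup("sha-256=abc123=="))
```

A URL returned by `lookup` could then replace the "Location: ..." header in the same way as a cached rel=duplicate URL. Whether a digest from a response can be trusted is the same question as the whitelist concern raised in the quoted message.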
