On 04/06/12 04:41 AM, Per Jessen wrote:
Hi Jack

Not sure how valid these comments might be; I have zero knowledge about
ATS.

Your solution appears to address what I think of as the "first half of
the issue" - that a file may have several URLs (one per mirror).  Pick
the wrong one, and a previously cached copy will not be found.

Yes, exactly. I have only tackled the first half of the issue you describe.

If looking up the currently cached content is fast/efficient, rewriting
the header accordingly sounds okay, but I can't help thinking that it
would be easier to do what I do with Squid: rewrite the URLs when they
are stored.
If <primary> is the primary location, e.g.
http://download.services.openoffice.org, and <mirror1-9> are mirrors,
then files retrieved from <mirror1-9> are stored as if they were
fetched from <primary>.  On subsequent retrievals, you would have a
direct cache hit with no need to look at the header.
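
If I understand the rewrite-on-store approach correctly, it might look roughly like the following as a Squid store_id_program helper. This is only a sketch, assuming Squid's StoreID helper protocol with concurrency 0 (no channel-ID prefix), and the mirror hostnames below are invented:

    #!/usr/bin/env python3
    # Sketch of a Squid store_id_program helper: requests to any known
    # mirror are stored under the primary URL, so a later request via a
    # different mirror is a direct cache hit.  Mirror hostnames are
    # invented; a real deployment would generate the list somehow.
    import sys
    from urllib.parse import urlsplit, urlunsplit

    PRIMARY = "download.services.openoffice.org"
    MIRRORS = {"mirror1.example.org", "mirror2.example.org"}

    for line in sys.stdin:
        fields = line.split()
        url = fields[0] if fields else ""
        parts = urlsplit(url)
        if parts.hostname in MIRRORS:
            store_id = urlunsplit(
                (parts.scheme, PRIMARY, parts.path, parts.query, ""))
            sys.stdout.write("OK store-id=" + store_id + "\n")
        else:
            # No rewrite: cache the object under its real URL.
            sys.stdout.write("ERR\n")
        sys.stdout.flush()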

Hmm, is there any way to discover the list of mirrors automatically? I know you retrieve the list of mirrors from http://mirrors.opensuse.org/list/all.html, and that you would prefer something less messy than scraping this HTML. But I think the proxy administrator must manually configure where to find the list of mirrors for each different content distribution network (openSUSE, OpenOffice, etc.)

A strong motivation for using Metalink is that no manual intervention is required from the proxy administrator: any content distribution network that supports Metalink should be discovered automatically.

We are also thinking of examining "Digest: ..." headers. If a response
has a "Digest: ..." header and a "Location: ..." header whose URL is
not already cached, then the plugin would check the cache for a
matching digest. If one is found, it would rewrite the "Location: ..."
header to point at the cached URL.
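
To make the idea concrete, here is a rough sketch of that logic in Python pseudocode; cache.lookup_url() and cache.lookup_digest() are hypothetical stand-ins for whatever the real plugin would call in the Traffic Server cache API:

    # Hedged sketch of the proposed rewrite; the cache object and its
    # two lookup methods are hypothetical, not a real ATS interface.
    def rewrite_location(response, cache):
        location = response.headers.get("Location")
        digest = response.headers.get("Digest")
        if not location or not digest:
            return                  # nothing to examine
        if cache.lookup_url(location):
            return                  # "Location: ..." URL already cached
        cached_url = cache.lookup_digest(digest)
        if cached_url is not None:
            # The same content is cached under another URL: send the
            # client there instead of to the uncached mirror.
            response.headers["Location"] = cached_url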

I'm not really very familiar with Metalink. What is your thinking
behind wanting to use the digest to identify a cached object?

My thinking is that looking up cached content by digest might yield some additional cache hits where scanning the list of "Link: <...>; rel=duplicate" headers would not, e.g. if the content was originally downloaded from a server outside of the CDN, in which case its URL would not appear among the "Link: <...>; rel=duplicate" headers.

It might also be more efficient, because the digest needs to be looked up only once, versus scanning a possibly long list of "Link: <...>; rel=duplicate" URLs.
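
To illustrate the difference, a small invented example (the hostnames and digest value are made up):

    # With Link scanning, the cache is probed once per rel=duplicate
    # URL; with the digest, it is probed once, and can also match a
    # copy originally fetched from outside the CDN.
    headers = {
        "Digest": "SHA-256=bjQLnP1zepicpUTmu3gKLHiQHTzNzh2hRGjBhevoB0=",
        "Link": ("<http://mirror2.example.org/f.iso>; rel=duplicate, "
                 "<http://mirror3.example.org/f.iso>; rel=duplicate"),
    }

    def duplicate_urls(link_header):
        # Naive split on "," is enough for this sketch; real Link
        # header parsing is more involved.
        for entry in link_header.split(","):
            url, _, params = entry.partition(";")
            if "rel=duplicate" in params:
                yield url.strip().lstrip("<").rstrip(">")

    candidates = list(duplicate_urls(headers["Link"]))  # N cache probes
    digest_key = headers["Digest"]                      # 1 cache probe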

