Re: [gradle-dev] Strategy for minimising network traffic during dependency resolution.

Luke Daley Fri, 30 Mar 2012 02:44:34 -0700

On 30/03/2012, at 5:53 AM, Adam Murdoch wrote:

> 
> On 30/03/2012, at 8:35 AM, Daz DeBoer wrote:
> 
>> After a little pondering, I'd favour an approach that is simple to describe 
>> and doesn't result in unexpected behaviour; I think an extra HEAD request 
>> here or there is ok.
>> 
> >> How about we perform a HEAD request if we have any cache candidates, be they 
> >> local files or previous accesses to this URL?
>> So the logic would be:
> >> 1. Do we have any cache candidates?
> >>    - If not, just HTTP GET the resource and we're done.
> >> 2. HTTP HEAD to get the resource meta-data (and possibly the SHA1).
> >>    - If we got a 404, the resource is missing, we're done.
> >>    - If we match a cached URL resource, just use it and we're done.
> >> 3. If we have a local file candidate, HTTP GET SHA1.
> >>    - If published SHA1 was found and matches then we can cache the URL 
> >>      resource and we're done.
> >> 4. HTTP GET the actual resource.
>> Pros:
>> - We can get the SHA1 from the headers if available, and avoid the GET-SHA1 
>> call.
>> - If a local file matches, we can cache the URL resolution as if we did an 
>> HTTP GET, since we have the full HTTP headers + the content. We never have a 
>> cached resource without an origin.
>> - After initially using a file from say .m2/repo to satisfy a request, from 
>> then on it will be just like we actually downloaded it from the URL. So 
>> there are no residual effects of using a local file in place of a downloaded 
>> one. Use of local files is a pure optimisation.
>> - If the artifact is missing altogether, we get a single 404 for the HEAD, 
>> rather than 404 for the SHA1 + 404 for the GET
>> - It's simpler to understand, I think.
> 
> - This approach works nicely as a decoration over all the transports we're 
> interested in (http, sftp, webdav, local file, network file). These all offer 
> a way to get at least (content-length + last-modified-time) without fetching 
> the entire content. So, we could have a number of Resource implementations 
> that sit directly on top of the transport and which don't care about caching, 
> and a single Resource implementation that sits on top of this to apply this 
> caching algorithm. This would allow us, for example, to start efficiently 
> caching file resources, regardless of whether they are sitting on local or 
> network file system.
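
To make the layering Adam describes concrete, here is a minimal sketch in Java. All names in it (Resource, ResourceCache, decorate) are hypothetical and are not Gradle's API. It assumes "unchanged" means the same content-length and last-modified-time, and it keeps the cache in memory purely for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; these names are hypothetical, not Gradle's API.

/** What every transport-level resource is assumed to offer. */
interface Resource {
    long contentLength();   // cheap: no content transfer
    long lastModified();    // cheap: no content transfer
    byte[] read();          // expensive: fetches the whole content
}

/** Single caching layer that can wrap a Resource from any transport. */
final class ResourceCache {

    /** Cached content together with the metadata it was fetched under. */
    private static final class Entry {
        final long contentLength;
        final long lastModified;
        final byte[] content;

        Entry(long contentLength, long lastModified, byte[] content) {
            this.contentLength = contentLength;
            this.lastModified = lastModified;
            this.content = content;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();

    /** Wraps a transport-level resource; the key identifies it in the cache. */
    Resource decorate(final String key, final Resource delegate) {
        return new Resource() {
            @Override public long contentLength() { return delegate.contentLength(); }
            @Override public long lastModified() { return delegate.lastModified(); }

            @Override public byte[] read() {
                // Ask the transport for the cheap metadata first.
                long length = delegate.contentLength();
                long modified = delegate.lastModified();
                Entry entry = entries.get(key);
                if (entry != null && entry.contentLength == length
                        && entry.lastModified == modified) {
                    return entry.content; // metadata unchanged: reuse cached content
                }
                // Metadata differs or nothing cached: fetch and remember.
                byte[] content = delegate.read();
                entries.put(key, new Entry(length, modified, content));
                return content;
            }
        };
    }
}
```

The transport-specific Resource implementations know nothing about caching here; only the decorator does, so a local or network file resource would be cached the same way as an HTTP one.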


I've actually made this kind of thing NOT the responsibility of the 
ExternalResource (named so to differentiate from Ivy's Resource type) object.

https://github.com/gradle/gradle/blob/master/subprojects/core-impl/src/main/groovy/org/gradle/api/internal/externalresource/transfer/ExternalResourceAccessor.java

My thinking was that we were likely to need different cache optimisation 
strategies for different transports. It's starting to look like that's not 
going to be the case.
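
For reference, the HEAD-first logic Daz outlines above could be sketched roughly as follows. This is only an illustration in Java: HeadFirstResolver, Transport, Meta and the other names are hypothetical, and treating "matches a cached URL resource" as equal content-length and last-modified-time is an assumption on my part, not something settled in this thread.

```java
import java.util.Optional;

// Illustrative sketch only. Every type and method name below is hypothetical
// and is not part of Gradle's actual API.
final class HeadFirstResolver {

    /** Metadata from a HEAD request; sha1 is empty when no header carries it. */
    static final class Meta {
        final long contentLength;
        final long lastModified;
        final Optional<String> sha1;

        Meta(long contentLength, long lastModified, Optional<String> sha1) {
            this.contentLength = contentLength;
            this.lastModified = lastModified;
            this.sha1 = sha1;
        }

        /** Assumed meaning of "matches": same length and modification time. */
        boolean matches(Meta other) {
            return contentLength == other.contentLength
                    && lastModified == other.lastModified;
        }
    }

    /** Assumed transport abstraction; an empty Optional stands for a 404. */
    interface Transport {
        Optional<Meta> head(String url);
        Optional<String> getSha1(String url);
        byte[] get(String url);
    }

    /** A previous download of this URL, stored with its origin metadata. */
    static final class CachedUrlResource {
        final Meta meta;
        final byte[] content;

        CachedUrlResource(Meta meta, byte[] content) {
            this.meta = meta;
            this.content = content;
        }
    }

    /** A local file candidate (for example from .m2/repo) with its SHA1. */
    static final class LocalFile {
        final String sha1;
        final byte[] content;

        LocalFile(String sha1, byte[] content) {
            this.sha1 = sha1;
            this.content = content;
        }
    }

    /** Returns the content, or empty if missing. A null argument means "no candidate". */
    static Optional<byte[]> resolve(String url, Transport transport,
                                    CachedUrlResource cached, LocalFile local) {
        // No cache candidates: just GET the resource.
        if (cached == null && local == null) {
            return Optional.of(transport.get(url));
        }
        // HEAD for the metadata (and possibly the SHA1).
        Optional<Meta> head = transport.head(url);
        if (!head.isPresent()) {
            return Optional.empty(); // 404: the resource is missing
        }
        Meta meta = head.get();
        // A cached URL resource that still matches is used as-is.
        if (cached != null && cached.meta.matches(meta)) {
            return Optional.of(cached.content);
        }
        // Local file candidate: prefer a SHA1 from the headers, else GET it.
        if (local != null) {
            Optional<String> published =
                    meta.sha1.isPresent() ? meta.sha1 : transport.getSha1(url);
            if (published.isPresent() && published.get().equalsIgnoreCase(local.sha1)) {
                // A real implementation would also store (meta, content) as a
                // cached URL resource at this point.
                return Optional.of(local.content);
            }
        }
        // Nothing usable: GET the actual resource.
        return Optional.of(transport.get(url));
    }
}
```

In this sketch a missing artifact costs a single HEAD, and a local file hit costs a HEAD plus at most one SHA1 fetch, which is the trade-off listed in the pros above.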

-- 
Luke Daley
Principal Engineer, Gradleware 
http://gradleware.com
