On 24/03/2012, at 1:39 AM, Luke Daley wrote:

>
> On 22/03/2012, at 7:53 PM, Adam Murdoch wrote:
>
>>
>> On 23/03/2012, at 3:48 AM, Daz DeBoer wrote:
>>
>>> On 22 March 2012 05:02, Luke Daley wrote:
>>>
>>> On 22/03/2012, at 7:15 AM, Daz DeBoer wrote:
>>>> I guess one question is whether the lastModified date can actually be
>>>> considered some sort of meta-data about the artifact itself, or if it's
>>>> actually more like meta-data about the artifact retrieval. I kind of think
>>>> the lastModified date should be only used when checking if a particular
>>>> URL retrieval is up-to-date: and that this value should not leak into the
>>>> model further.
>>>
>>> I've got no problem conceptualising it as meta data about the resolution.
>>> Say if we provided some kind of report on the contents of the cache
>>> relevant to the particular build you are working on. I can see providing
>>> the url and last modified of the artifacts when they were resolved in a
>>> particular repository being useful information.
>>>
>>>> At the moment, if we get the same artifact from 2 different URLs in the
>>>> same repository, it looks like we will overwrite the first retrieval
>>>> (lastModified + url) when we cache the second. This doesn't feel great to
>>>> me.
>>>
>>> For the “by-repository” cache, yes. But that in itself is interesting
>>> information (i.e. given the same repository and module vectors we got
>>> something at a different URL). It becomes more interesting if the
>>> repository is actually serving back redirects too. We could potentially
>>> then do some short circuiting here by using this info (need to think about
>>> that more).
>>>
>>> We could be missing a concept here;
>>
>> We are. There're two things here:
>>
>> * Artifacts. This is a binary object that is published as part of a module.
>
> I'm not clear on the term “module” here. Is it as simple as the module being
> "commons-lang:commons-lang" and the artifact being the particular
> instance of this module that is "commons-lang:commons-lang:2.4" ?
>
> Or is it that an “artifact” is just a kind of more abstract description,
> usually group/name/version, that ultimately resolves to some binary resource?
Not quite. Here's the current model (this stuff is all up for grabs post-1.0):

A module represents some software component. It has:
* An identifier (group, module-name).
* One or more versions.

A module version represents some release or build of the software component. A module version is the thing that gets published to a repository. It has:
* An identifier (group, module, version).
* Some meta-data (status, and later, things like where the source can be found, what was used to build the artifacts, and so on).
* Some dependency meta-data (let's ignore configurations for now).
* Zero or more artifacts.

An artifact represents some binary object (for want of a better term) that is published as part of a module version. It has:
* An identifier (group, module, version, artifact-name).
* Some meta-data (type, and so on).
* Some dependency meta-data.
* Binary content.

Resources then add in the concept of some binary content at a location.

>
>> * Resources. This is a binary object at a given location.
>>
>> An artefact has:
>> * An artifact id.
>> * A (file) resource that represents its location in the cache.
>> * A resource that represents its origin.
>
> Given just an artifact, I don't think we can talk about its origin. At least
> at the moment we don't as we ask Gradle to effectively “select” the
> origin/resource for an artifact. It's not really intrinsic to the artifact
> itself.
>
> Or should I just add the implication that you are talking about
> artifact-at-repository here?

That's right. I guess we're talking about a resolved artifact here. A resolved artifact is-an artifact that also has:
* A file resource that represents where the artifact's binary content can be found.
* An origin resource that represents the location where the artifact was found.

>>
>> My vote is #3: separate out the artefact and resource caches, and use a HEAD
>> request instead of a GET if-modified-since request.
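
To make the terms above concrete, here is a rough sketch of the module / module version / artifact / resolved artifact model as plain Java records. Every type and field name here is hypothetical and chosen only for illustration; none of it is Gradle's actual API, the repository URL and cache path are made up, and dependency meta-data is left out entirely.

```java
import java.net.URI;
import java.nio.file.Path;

// Hypothetical types for illustration only; not Gradle's real API.

/** A software component, identified by (group, module-name). */
record ModuleId(String group, String name) {}

/** One release or build of a module; this is what gets published. */
record ModuleVersionId(ModuleId module, String version) {}

/** Identifies one binary object published as part of a module version. */
record ArtifactId(ModuleVersionId moduleVersion, String artifactName) {}

/** An artifact: an identifier plus some meta-data (only the type here). */
record Artifact(ArtifactId id, String type) {}

/** Some binary content at a location. */
record Resource(URI location) {}

/**
 * A resolved artifact: the artifact plus the file holding its content and
 * the origin it was found at. Records cannot extend each other, so the
 * "is-an artifact" relationship is modelled by composition in this sketch.
 */
record ResolvedArtifact(Artifact artifact, Path file, Resource origin) {}

class ModelSketch {
    public static void main(String[] args) {
        ModuleId module = new ModuleId("commons-lang", "commons-lang");
        ModuleVersionId version = new ModuleVersionId(module, "2.4");
        Artifact jar = new Artifact(new ArtifactId(version, "commons-lang"), "jar");

        // The origin URL and the cache path are made-up example values.
        ResolvedArtifact resolved = new ResolvedArtifact(
                jar,
                Path.of("cache", "commons-lang-2.4.jar"),
                new Resource(URI.create("http://repo.example/commons-lang-2.4.jar")));

        System.out.println(resolved.artifact().id().moduleVersion().version());
    }
}
```

The point of the split is that the origin and the cache file hang off the resolved artifact, not off the artifact itself, which matches Luke's observation that the origin is not intrinsic to an artifact.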
>
> My gut feel is that we should sort this out now while it's fresh and we are
> discussing it. It doesn't seem like a lot of work to me.

I agree. We should certainly tidy up the behaviour for 1.0 (use a HEAD request, compare content-length and last-modified-time, and also make sure that a last-modified-time is only ever used with the URL from which it was obtained and never across different URLs). I think that this will happen pretty naturally by restructuring the caching APIs. So, we should do this now.

--
Adam Murdoch
Gradle Co-founder
http://www.gradle.org
VP of Engineering, Gradleware Inc. - Gradle Training, Support, Consulting
http://www.gradleware.com
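
The up-to-date check described above (HEAD request, compare content-length and last-modified-time, and only ever against the URL the values came from) could be sketched roughly as follows. This is a minimal illustration using the JDK's `java.net.http` client, not the actual Gradle implementation; `RemoteMetadata`, `CachedResource`, and `HeadCheck` are hypothetical names, and treating a missing header as "unknown" is an assumption of this sketch.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Instant;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Optional;
import java.util.OptionalLong;

/** What a HEAD request reported for a URL (hypothetical type, not Gradle API). */
record RemoteMetadata(Instant lastModified, long contentLength) {}

/** A cached retrieval; the metadata is only meaningful for this exact URL. */
record CachedResource(URI url, RemoteMetadata metadata) {}

class HeadCheck {
    /**
     * True only if the cached entry was obtained from the same URL and both
     * last-modified-time and content-length match the remote values.
     */
    static boolean isUpToDate(CachedResource cached, URI url, RemoteMetadata remote) {
        return cached.url().equals(url) && cached.metadata().equals(remote);
    }

    /**
     * Issues a HEAD request. Returns empty for a non-200 response or when
     * either header is missing (an assumption of this sketch). A malformed
     * header value surfaces as an unchecked parse exception.
     */
    static Optional<RemoteMetadata> head(HttpClient client, URI url)
            throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(url)
                .method("HEAD", HttpRequest.BodyPublishers.noBody())
                .build();
        HttpResponse<Void> response =
                client.send(request, HttpResponse.BodyHandlers.discarding());
        if (response.statusCode() != 200) {
            return Optional.empty();
        }
        Optional<String> lastModified = response.headers().firstValue("Last-Modified");
        OptionalLong length = response.headers().firstValueAsLong("Content-Length");
        if (lastModified.isEmpty() || length.isEmpty()) {
            return Optional.empty();
        }
        Instant modified = ZonedDateTime
                .parse(lastModified.get(), DateTimeFormatter.RFC_1123_DATE_TIME)
                .toInstant();
        return Optional.of(new RemoteMetadata(modified, length.getAsLong()));
    }
}
```

Keeping the URL inside `CachedResource` means a last-modified-time obtained from one URL can never be compared against a response from a different URL, even when both serve the same artifact.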
