On 11/05/2013, at 2:40 AM, Adam Murdoch <adam.murd...@gradleware.com> wrote:

> 
> On 10/05/2013, at 2:33 AM, Luke Daley <luke.da...@gradleware.com> wrote:
> 
>> 
>> On 09/05/2013, at 12:51 AM, Adam Murdoch <adam.murd...@gradleware.com> wrote:
>> 
>>> Hi,
>>> 
>>> Just revisiting the old 'how do we deal with names for domain objects' 
>>> question.
>>> 
>>> Currently, our polymorphic container is-a NamedDomainObjectContainer. As 
>>> such it forces every domain object to have a unique name. The problem with 
>>> this approach is that I need to encode the type into the name.
>>> 
>>> For example, we are planning to use a `binaries` container to own all the 
>>> binaries produced by the project. If I have a `main` Java library, then we 
>>> need to add (by convention) a `mainClasses` ClassesDirectory binary and a 
>>> `mainJar` Jar binary for this library. If I have a `main` Groovy library 
>>> built for Groovy 1.8 and Groovy 2.0, then I need to add 
>>> `mainGroovy18Classes` and `mainGroovy20Classes` and `mainGroovy18Jar` and 
>>> `mainGroovy20Jar`. If I have a `main` C library built for windows and linux 
>>> with 32-bit and 64-bit and static and shared and debug and release 
>>> variants, well… then we need lots of names.
>>> 
>>> I'd like to change things so we can get rid of the type from the name. 
>>> This, of course, still leaves the other dimensions packed into the name. We 
>>> don't have a good solution for this yet, but I suspect one will emerge.
>>> 
>>> The approach I'd like to take (which isn't a new idea - can't remember who 
>>> suggested it) is to make (name, type) the unique identifier for a thing.
>>> 
>>> The DSL for defining an object would change from this:
>>> 
>>> binaries {
>>>    mainStaticLibrary(StaticLibraryBinary) { … }
>>>    mainSharedLibrary(SharedLibraryBinary) { … }
>>> }
>>> 
>>> To:
>>> 
>>> binaries {
>>>    staticLibraries {
>>>        main { … }
>>>    }
>>>    sharedLibraries {
>>>        main { … }
>>>    }
>>> }
>>> 
>>> Also:
>>> 
>>> publications {
>>>    maven { 
>>>        main { … }
>>>    }
>>>    ivy { 
>>>        main { … }
>>>    }
>>> }
>>> 
>>> 
>>> To find something:
>>> 
>>> // look something up by name
>>> def main = binaries.main  // fails if there are multiple binaries with name `main`
>>> 
>>> // look something up by type and name
>>> def main = binaries.staticLibraries.main
>>> 
>>> // look something up by super type and name
>>> def main = binaries.nativeLibraries.main // fails if there are multiple native libraries with name `main`
>> 
>> Are we automatically doing the lookup here? i.e. drop package, camel case 
>> and pluralise? Or is there an explicit mapping between a type and its "dsl 
>> name"?
> 
> Possibly one and then the other as a fallback. We might also discover the 
> mappings by inspecting the types (eg annotations). Or all of the above. We 
> also need to know the implementation type for a given contract type and this 
> would be declared in the same way. At the moment we're using explicit 
> registration for this.
> 
> Whatever we come up with here would be reused for declaring tasks, meaning 
> that the task types would no longer need to be visible on the script 
> classpath. And this means we can get rid of the boilerplate for custom 
> plugins, we can isolate plugins from each other, and we can more easily 
> inject task types into scripts (eg from some other script).
> 
> 
>> 
>>> // look up all things by type
>>> def libs = binaries.staticLibraries
>>> 
>>> And to deal with the other dimensions, you'd use the existing collection 
>>> stuff:
>>> 
>>> // All windows static libs
>>> def libs = binaries.staticLibraries.matching { it.platform.operatingSystem == operatingSystems.windows }
>>> 
>>> We can come up with conveniences for the other dimensions. Perhaps a 
>>> map-based selector:
>>> 
>>> def libs = binaries.staticLibraries(platform.operatingSystem: 
>>> operatingSystems.windows)
>>> 
>>> Thoughts? I don't think this plan quite gets it right, but it feels better 
>>> than the current DSL.
>> 
>> I don't think it really fixes the problem. I think we want to get away from 
>> the idea of an immutable name altogether. Instead, we probably want a way to 
>> extract an identifier based on differentiating characteristics. This seems 
>> to be how the names are used as we've discussed.
> 
> Right, this is exactly the goal. So far, no-one has come up with a good 
> proposal for a DSL that solves this well.
> 
>> 
>> Combining this with some of the deferred configuration stuff, I think you want 
>> to tell Gradle which characteristics to use to construct the identifier. 
>> This also means you don't have to predict names. 
>> 
>> Disclaimer: I haven't really thought this through beyond reading this email.
>> 
>> Let's say we have the following interface:
>> 
>> interface Identifiable {
>>      String getIdentifier()
>>      void setIdentifier(String identifierPattern)
>> }
>> 
>> An identifier pattern is a tokenisable (at runtime) string, based on 
>> properties of the thing…
>> 
>> class HttpRepository extends AbstractIdentifiable {  
>>      …
>>      String getUrl()
>> }
>> 
>> def repository = new HttpRepository(url: "http://org.com")
>> repository.identifier = "#url"
>> assert repository.identifier == "http://org.com"
>> 
>> The benefit here is that types can supply a default naming strategy, which 
>> could be based on important characteristics. Assuming that we find some way 
>> to actually "lock" objects after they've been configured with the deferring 
>> stuff, we could also make the actual value immutable at this time. We could 
>> defer preventing collisions until this point.
>> 
>> The idea is also that the identifier is only used for output and for 
>> deriving names (would be good if we could avoid this in the future too). If 
>> you are programmatically looking for something, you find it by looking up 
>> characteristics. I don't think plugins do this often anyway. It doesn't seem 
>> to me that infrastructure code often pulls objects from containers via their 
>> name. You're usually agnostic to containers at this level and work directly 
>> with the instances. Configurers (a.k.a users) do this all the time, but they 
>> should have the knowledge required to identify the thing they want. If you 
>> add in a singleFile type approach here (i.e. I'm expecting there to be one 
>> thing in this collection and I want it) then I think it would work.
> 
> This is roughly what we're aiming for, I think. There are still some issues 
> to sort out.
> 
> One problem is that the set of attributes that identifies a thing is not 
> constant for a given type:
> 
> - We want to be able to add more variant attributes in a backwards compatible 
> way. For example, we want to be able to add debug vs release variants for 
> native binaries. Given that binaries with different values of this attribute 
> are mutually exclusive, this attribute needs to be baked into the identifier 
> of the thing. If you're using the default identifier scheme, then the names 
> for tasks and output directories and so on will change in a breaking way. If 
> you're using your own scheme, then your build is going to break because 
> you'll start generating duplicate identifiers for the new variants.

Changing the default identifier pattern would have to be a breaking change. I 
don't see any way around this. Changing the names of derived tasks is breaking. 
This is no worse than the current situation, but not much better either I guess.

Perhaps the way forward is to make such issues easy to diagnose and fix.


> - The identifier often includes attributes that aren't relevant for building 
> the thing. For example, the 'debug' variant of the 'main' native library 
> includes the identity of the 'main' native library in its identifier. Or if I 
> define 'snapshot' and 'production' publications, then the fact of whether a 
> publication is 'snapshot' or 'production' has to be included in its 
> identifier.

What if the default identifier is just essentially the type? In other words, be 
as conservative as possible.

Only once there is more than one thing do you need to provide an identifier 
pattern that can differentiate one from another.
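
A minimal sketch of that rule, in plain Java rather than anything that exists 
in Gradle. The class `IdentifierPatternSketch`, its `resolve` and `identify` 
methods, and the treatment of `#name` as a property token are all assumptions 
made up for illustration; only the `#url` example comes from earlier in this 
thread.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not Gradle API.
final class IdentifierPatternSketch {
    // Assumption: a token is '#' followed by a simple property name.
    private static final Pattern TOKEN = Pattern.compile("#([A-Za-z][A-Za-z0-9]*)");

    // Replaces every #property token in the pattern with that property's value.
    static String resolve(String pattern, Map<String, String> properties) {
        Matcher matcher = TOKEN.matcher(pattern);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String name = matcher.group(1);
            String value = properties.get(name);
            if (value == null) {
                throw new IllegalArgumentException("Unknown property: " + name);
            }
            // quoteReplacement stops '$' and '\' in the value being interpreted.
            matcher.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    // Conservative default: with no pattern, the identifier is just the type name.
    static String identify(String typeName, String pattern, Map<String, String> properties) {
        return pattern == null ? typeName : resolve(pattern, properties);
    }
}
```

With no pattern, `identify("HttpRepository", null, props)` gives back the type 
name. With the pattern `#url` it gives back the value of the `url` property.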

> - The identifier often does not include all the attributes that can be 
> relevant for building the thing. For example, if I build all my binaries on 
> windows and only ever use visual c++, then the identifier for my binaries 
> doesn't need to include either the operating system or compiler version. More 
> generally, if a given attribute has the same value for all objects of a given 
> type, then that attribute is not relevant for identifying the objects.

I think the only thing that can know what is interesting about a thing (and 
therefore what should form its identifier) is whoever is creating it.

Alternatively, you could tell us what the potential identifiers of a type are 
via annotations on properties and we could work it out based on uniqueness. I 
don't see that working out though. The rules could get complex, and it wouldn't 
allow the use of arbitrary identifiers.
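
For what it's worth, here is roughly what "work it out based on uniqueness" 
could look like, as a sketch in plain Java. Everything in it is an assumption 
for illustration: the class `DistinguishingAttributesSketch`, representing each 
object as a map of attribute name to value, and joining values with `-`. It 
does nothing about two objects that agree on every attribute.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not Gradle API.
final class DistinguishingAttributesSketch {
    // Attribute names whose values differ between at least two of the objects.
    static List<String> distinguishing(List<Map<String, String>> objects) {
        Set<String> names = new LinkedHashSet<>();
        for (Map<String, String> object : objects) {
            names.addAll(object.keySet());
        }
        List<String> result = new ArrayList<>();
        for (String name : names) {
            Set<String> values = new HashSet<>();
            for (Map<String, String> object : objects) {
                values.add(object.get(name)); // a missing attribute counts as null
            }
            if (values.size() > 1) {
                result.add(name);
            }
        }
        return result;
    }

    // Identifier = type name plus the values of the distinguishing attributes.
    // With a single object nothing distinguishes it, so it is just the type name.
    static List<String> identifiers(String typeName, List<Map<String, String>> objects) {
        List<String> keys = distinguishing(objects);
        List<String> ids = new ArrayList<>();
        for (Map<String, String> object : objects) {
            StringBuilder id = new StringBuilder(typeName);
            for (String key : keys) {
                id.append('-').append(object.get(key));
            }
            ids.add(id.toString());
        }
        return ids;
    }
}
```

Two static libraries that share `os=windows` and differ only in `buildType` 
come out as `staticLibrary-debug` and `staticLibrary-release`. It also shows 
the breakage discussed above: as soon as a new attribute starts to vary, every 
inferred identifier changes.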

> If there's only a single instance of a given type, then no attributes other 
> than the type are relevant.

Which is why I think this would/should be the default.


> It feels like there are 2 things going on here:
> 
> - The role a thing plays in the build: This is the maven central repository, 
> this is the (single) publication, this is the api binary of the java library, 
> this is the test java source, this is the groovy 1.8 binary for the groovy 
> library, this is the free flavour of the main Android application, this is 
> the 32-bit windows jni library for the main java library.
> - The attributes that affect the output of the thing: target byte code level, 
> groovy runtime version, c++ compiler version, target operating system, debug 
> or release binary, etc.
> 
> Sometimes attributes from the second group are included in the role (or, 
> perhaps, are inferred from the role), sometimes not at all. For me, a good 
> DSL would use the role of a thing to identify that thing and to generate 
> names for that thing - output files names, task names, and so on. It would 
> also use the role of a thing to infer the roles of the things it is composed 
> from.

It's an interesting idea. Starting to border on a lot of magic and 
frameworkiness though on first consideration. The coupling that this would 
require might save some wiring, but will limit flexibility unless you can opt 
out. That's a vague/general feeling though. All depends on implementation of 
course.

> You can split up the attributes of a thing into 3 groups:
> 
> 1. The attributes that form the identity of the thing.

So what are the attributes here for a publication? Specifically when there is 
only one? The "module" name?

> 2. The attributes that affect the output of the thing.

AKA inputs, right?

> 3. The attributes whose values are derived from these other values.

Do you have an example of this? I don't get it.

> All the DSL proposals so far require a multi-phase approach to configuring a 
> thing. First we have to configure the attributes that form the identity, then 
> we have to configure the remaining attributes from #2 (if we care), and then 
> configure the attributes from #3 (if we care). The current DSL just uses name 
> for the identity and this is immutable once configured.
> 
> Possibly we can't avoid this. A multi-phase approach does offer some 
> interesting options. For example, we might configure the attributes of #2, 
> and these become immutable. Once this has been done for all objects of a 
> given type, we can infer #1, and then configure the attributes of #3.

Aren't we already headed towards multi-phase configuration with deferred 
configuration regardless?
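
To check I'm reading the phases the same way, a rough sketch in plain Java: 
configure the inputs (#2), lock them and fix the identity (#1), and only then 
configure the derived values (#3). `PhasedThingSketch` and all of its methods 
are hypothetical names for illustration, not existing or proposed Gradle API, 
and the identifier is simply passed in rather than inferred.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of a thing configured in phases; not Gradle API.
final class PhasedThingSketch {
    private final Map<String, String> inputs = new LinkedHashMap<>();
    private final Map<String, String> derived = new LinkedHashMap<>();
    private boolean inputsLocked;
    private String identifier;

    // Phase 1: attributes that affect the output (#2). Rejected once locked.
    void input(String name, String value) {
        if (inputsLocked) {
            throw new IllegalStateException("Inputs are locked");
        }
        inputs.put(name, value);
    }

    // Phase 2: inputs become immutable and the identity (#1) is fixed.
    void lockInputs(String identifier) {
        if (inputsLocked) {
            throw new IllegalStateException("Inputs are already locked");
        }
        this.inputsLocked = true;
        this.identifier = identifier;
    }

    // Phase 3: derived values (#3) can only be set once the identity is known.
    void derived(String name, String value) {
        if (!inputsLocked) {
            throw new IllegalStateException("Identity is not known yet");
        }
        derived.put(name, value);
    }

    String getIdentifier() {
        if (!inputsLocked) {
            throw new IllegalStateException("Identity is not known yet");
        }
        return identifier;
    }
}
```

Setting an input after `lockInputs(...)` fails, as does asking for the 
identifier or setting a derived value before it.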

-- 
Luke Daley
Principal Engineer, Gradleware 
http://gradleware.com

Join me at the Gradle Summit 2013, June 13th and 14th in Santa Clara, CA: 
http://www.gradlesummit.com

