Re: [gradle-dev] some thoughts on the dsl for multiple outputs for jvm based projects

Luke Daley Thu, 07 Feb 2013 02:16:26 -0800

On 07/02/2013, at 4:06 AM, Adam Murdoch <[email protected]> wrote:


> 
> On 06/02/2013, at 9:02 PM, Luke Daley wrote:
> 
>> 
>> On 06/02/2013, at 12:57 AM, Adam Murdoch <[email protected]> wrote:
>> 
>>> 
>>> On 06/02/2013, at 10:45 AM, Luke Daley wrote:
>>> 
>>>> 
>>>> 
>>>> On 05/02/2013, at 23:08, Adam Murdoch <[email protected]> wrote:
>>>> 
>>>>> 
>>>>> On 06/02/2013, at 2:27 AM, Daz DeBoer wrote:
>>>>> 
>>>>>> On 4 February 2013 15:50, Adam Murdoch <[email protected]> wrote:
>>>>>> 
>>>>>> On 05/02/2013, at 5:12 AM, Daz DeBoer wrote:
>>>>>> 
>>>>>>> On 4 February 2013 00:07, Adam Murdoch <[email protected]> wrote:
>>>>>>>> Hi,
>>>>>>>> 
>>>>>>>> So, we're planning to have a bunch of 'jvm binaries' that can be built
>>>>>>>> from various language source sets and other things. There will be a few
>>>>>>>> different types of binaries, such as class directory binaries and jar
>>>>>>>> binaries, possibly some others.
>>>>>>>> 
>>>>>>>> Something we need to sort out is how to structure the DSL for these
>>>>>>>> executable things. The current plan is to have a single container that
>>>>>>>> owns all of these jvm binaries, so you might declare something like this:
>>>>>>>> 
>>>>>>>> jvm {
>>>>>>>>   binaries {
>>>>>>>>       mainClasses(ClassesDirectoryBinary) {
>>>>>>>>           … some inputs and other configuration ...
>>>>>>>>       }
>>>>>>>>       mainJar(JarBinary) {
>>>>>>>>           … some inputs and other configuration …
>>>>>>>>       }
>>>>>>>>   }
>>>>>>>> }
>>>>>>>> 
>>>>>>>> There might be a similar container for native binaries:
>>>>>>>> 
>>>>>>>> native {
>>>>>>>>   binaries {
>>>>>>>>       windowsX86DebugShared(SharedLibraryBinary) {
>>>>>>>>           … some inputs and other configuration …
>>>>>>>>       }
>>>>>>>>       windowsX86DebugStatic(StaticLibraryBinary) {
>>>>>>>>           ...
>>>>>>>>       }
>>>>>>>>       windowsX86DebugExe(ExecutableBinary) {
>>>>>>>>           …
>>>>>>>>       }
>>>>>>>>   }
>>>>>>>> }
>>>>>>>> 
>>>>>>>> Some questions:
>>>>>>>> 
>>>>>>>> * Is using a flat name the best way to identify these things? Once you
>>>>>>>> add a few dimensions, the names start to get awkward. This certainly can
>>>>>>>> be the case for native binaries, and can also be the case for jvm
>>>>>>>> binaries. For example, I might have (feature, binary type, groovy
>>>>>>>> version, jvm version) as relevant dimensions for a Groovy library that
>>>>>>>> targets multiple groovy versions and jvm versions.
>>>>>>> 
>>>>>>> Are the names of these things important at all? Or in general are we
>>>>>>> just forcing users to come up with a name that adds little value?
>>>>>> 
>>>>>> I think it varies for different types of things. For some things, a name 
>>>>>> is a natural way of identifying the thing. For other things (most 
>>>>>> things?) it makes more sense to identify a thing by its type and some 
>>>>>> attributes about the thing.
>>>>>> 
>>>>>> The complication is that the set of attributes that identify a thing 
>>>>>> vary based on what I'm building. For example:
>>>>>> 
>>>>>> * If I have a single publication, then I want to refer to it as 'the 
>>>>>> publication'. The other stuff (type, groupId, artefactId, version) are 
>>>>>> just attributes of the publication.
>>>>>> * If I publish 2 maven modules, then I want to refer to them as the 'api 
>>>>>> publication' and the 'impl publication', say.
>>>>>> * If I build debug and release variants of my windows executable, then I 
>>>>>> want to refer to them as the 'debug executable' and the 'release 
>>>>>> executable'. All the other stuff (windows, amd64, multi-threaded, 
>>>>>> visual-c++ compiler, optimisation-level) are just attributes of the 
>>>>>> executable.
>>>>>> * If I build debug and release variants on windows and linux for x86 and 
>>>>>> amd64, then I want to refer to them using a tuple such as (windows, 
>>>>>> amd64, release).
>>>>>> 
>>>>>> That is, a thing often just has a bunch of attributes, any of which 
>>>>>> could be used to identify it, and it's how the thing is different to the 
>>>>>> others that is useful for identifying it.
>>>>>> 
>>>>>> Right, so is "name" just another one of those ways of identifying? 
>>>>>> Sometimes I want to give something a meaningful name, sometimes forcing 
>>>>>> me to come up with a name is a pain in the ass.
>>>>>> 
>>>>>> One nice aspect of ditching the name is that a thing can more naturally 
>>>>>> live in different containers and be grouped in different ways. Which 
>>>>>> would mean that some of these questions about how things are grouped 
>>>>>> become less important - just group them whichever way you like.
>>>>>> 
>>>>>> 
>>>>>>> How
>>>>>>> often does a user need to differentiate between them by name?
>>>>>> 
>>>>>> There are a few main reasons, I think:
>>>>>> 
>>>>>> 1. To configure something that some other logic (a plugin, say) has 
>>>>>> already defined.
>>>>>> 2. To configure the tasks that do work with the thing (compile it, 
>>>>>> generate the pom.xml for it, publish it).
>>>>>> 3. To find the thing to use it as input for some other thing.
>>>>>> 4. To refer to the thing before the 'identifying' attributes have been 
>>>>>> calculated. For example, to refer to a publication before the version 
>>>>>> has been calculated.
>>>>>> 
>>>>>> None of this necessarily requires a name - this is just what the name is 
>>>>>> used for at the moment.
>>>>>> 
>>>>>> And I'm not sure any of these are the 'standard' case either. Again I 
>>>>>> refer to repositories: imagine that we used the new "name(Type)" syntax. 
>>>>>> Users would be forced to come up with a name for each of their 
>>>>>> repositories, which would likely not be used elsewhere. Instead, we give 
>>>>>> the ability to supply a name _if_ they want to refer to the repository 
>>>>>> elsewhere.
>>>>>> 
>>>>>> One thing that concerns me about the "name(Type) {}" syntax is that it's 
>>>>>> possibly trickier to document, and trickier for users to grok what's 
>>>>>> going on. In some cases it might make for a cleaner DSL, but I'm not 
>>>>>> certain it's worth the cost.
>>>>>>> We could consider a DSL similar to the repositories syntax:
>>>>>>> 
>>>>>>> jvm {
>>>>>>>   binaries {
>>>>>>>       classes {
>>>>>>>           name "main" // optional
>>>>>>>           … some inputs and other configuration ...
>>>>>>>       }
>>>>>>>       jar {
>>>>>>>           ... we generate a sensible name ...
>>>>>>>           … some inputs and other configuration …
>>>>>>>       }
>>>>>>>   }
>>>>>>> }
>>>>>>> 
>>>>>>> It's possible that we treat this as a standard pattern, whereby a
>>>>>>> NamedDomainObjectContainer could support both with some sort of DSL
>>>>>>> magic:
>>>>>>> 
>>>>>>> container {
>>>>>>>     name(Type) {}
>>>>>>>     subtype { /* generated name */ }
>>>>>>> }
>>>>>>> 
>>>>>>> Or maybe get rid of the 'name' method altogether, and go with:
>>>>>>> 
>>>>>>> // In all cases the added element must provide a unique name, which
>>>>>>> // may or may not be configured explicitly.
>>>>>>> container {
>>>>>>>      // eg 'publication' for 'publications' container, or 'dependency'
>>>>>>>      // for 'dependencies' container.
>>>>>>>      generalType(SubType) {}
>>>>>>>      // eg 'ivy' for 'publications' or 'project' for 'dependencies'
>>>>>>>      subType { }
>>>>>>> }
>>>>>> 
>>>>>> These are both interesting options for defining things. One question is 
>>>>>> how do I get something out again, to either configure it or use it?
>>>>>> 
>>>>>> There would be options:
>>>>>> container.findOne({attrib == "value"})
>>>>>> container.findOne(attrib1: "value", attrib2: "value")
>>>>>> container['name']
>>>>>> container.name
>>>>>> 
>>>>>> Note that I'm not suggesting doing away with "name" altogether, but 
>>>>>> instead making it optional.
>>>>> 
>>>>> It might be interesting to push this further, and make name a decoration 
>>>>> of some kind. We've already discussed here a few cases where sometimes 
>>>>> name is relevant and sometimes it's not. This isn't a function of the type 
>>>>> of thing, but it is instead a function of how the thing is used. Here are 
>>>>> some other cases:
>>>>> 
>>>>> * Sometimes a piece of code is used as a task and sometimes as an action. 
>>>>> A task is really just an action with a name. The name allows us to do 
>>>>> some useful stuff with the piece of code (e.g. track its history, declare 
>>>>> dependencies and so on), but sometimes we don't care about this useful 
>>>>> stuff.
>>>> 
>>>> The task name is also the primary interface between the user and Gradle.
>>> 
>>> Indeed. This is part of the 'useful stuff'.
>> 
>> Point taken, but I think it's worth pointing out that this is beyond 
>> fundamental to the way that Gradle works currently.
>> 
>>>>> * When using, say, a JavaSourceSet as an input, we don't care about the 
>>>>> name of the source set. We just care that it can describe some source 
>>>>> files and compile dependencies. If we keep name off JavaSourceSet, we 
>>>>> allow other interesting implementations that can be used as input (but 
>>>>> not necessarily output) without forcing each one to have an arbitrary 
>>>>> name.
>>>> 
>>>> How do we require names for this now?
>>> 
>>> Because these things (sometimes) need to be buildable, and to build 
>>> something we currently need a name for it. Whereas to consume something, we 
>>> don't need an identity if we have an object reference to the thing.
>> 
>> I still don't get it. There are all kinds of unnamed buildable things, e.g. 
>> file collections.
>> 
>>>>> * Coming from the other direction: Some of our domain objects are defined 
>>>>> using attributes other than a name. For example, dependencies are defined 
>>>>> using (group, module, version). However, these are treated as the 
>>>>> identifier of the dependency and cannot be changed, even though it's quite 
>>>>> ok that these are changed, up to the point that they are consumed.  In 
>>>>> other words, they're just attributes of the dependency. Having a 
>>>>> consistent way to define domain objects in terms of their attributes, and 
>>>>> making identity a decoration, would mean dependencies and publish 
>>>>> artefacts can be defined and used in the same way as everything else.
>>>> 
>>>> 
>>>> 
>>>>> 
>>>>> Putting together a few ideas from this thread (this DSL isn't quite 
>>>>> right, but should give the idea):
>>>>> 
>>>>> // defines a NativeExecutable, with a generated name. With some AST magic
>>>>> // the name might be 'someNativeBinary'
>>>>> def someNativeBinary = items.nativeExecutable { os 'windows';
>>>>>     architecture 'amd64'; debug: true }
>>>> 
>>>> I know it's not the point but we should be _very_ careful about introducing 
>>>> any more AST transforms. Their use can be very confusing for users and 
>>>> could make IDE support even more difficult.
>>> 
>>> Absolutely. It needs to be worth it.
>>> 
>>> I don't see this particular transform as overly risky. The IDE can infer 
>>> the return type of items.nativeExecutable()
>> 
>> How could it infer it? 
>> 
>> I can see how it might be possible with sophisticated flow analysis. But 
>> that would mean the IDE needs to know which plugins have been applied at 
>> which point in the script and which factories they add to “items”.
>> 
>>> and hence the type of someNativeBinary just fine. It doesn't introduce any 
>>> new syntax. It just takes advantage of an otherwise quite natural syntax, 
>>> ie this statement would work just fine without the transform.
>> 
>> I'm not convinced, but I don't think it matters right now.
>> 
>>>>> // defines an IvyPublication, with a provided name
>>>>> def myPublication = items.ivyPublication { name 'main'; organisation: 
>>>>> 'my-org'; module: 'my-module' }
>>>> 
>>>> So items is just a factory?
>>> 
>>> Maybe. There are 2 parts: creating things and finding things. Maybe `items` 
>>> can do both, maybe there are 2 separate things.
>> 
>> It at least needs to be the graph. I would think factory like behaviour 
>> would be a convenience and not fundamental. Actually, more correctly, it 
>> needs to be a query engine for the graph. The graph is already there in the 
>> connections between objects, we just need a way to dig out parts.
>> 
>>>>> // do some things with the publication
>>>>> myPublication.revision = '1.2'
>>>>> publishing.publications << myPublication
>>>> 
>>>> Why would there even be a publications container? Couldn't you just query 
>>>> the items graph for all of the publications?
>>> 
>>> Good question. Currently, the publications container declares the purpose 
>>> or role of a publication. When it's in the container, it's a public output 
>>> of the project. When it's not, it's a publication used for some other 
>>> (undisclosed) purpose and we can't infer anything about it beyond how to 
>>> build it.
>> 
>> This could be a characteristic of the publication itself, not of its 
>> context. Then finding the “public” publications just becomes a more refined 
>> query.
> 
> It might be interesting to combine this with the inferencing we've been 
> talking about, so that any built item marked as 'published' ends up in the 
> right publication in the right repository. For example:
> 
> apply plugin: 'publishing'
> apply plugin: 'maven-publish'
> 
> publishing { repositories { maven { url 'http://myserver' } } }
> 
> def mainJar = …. some kind of lookup ….
> mainJar {
>     publishedAs groupId: 'myGroup', artifactId: 'myArtifact' // with implicit version
> }

Seems like it might be better to think about items only having knowledge of 
their upstream. In the above example you have a jar knowing that it can be 
published (that this might have come from an extension type thing doesn't 
change things I think).

Here's the kind of thing I'm mentally playing with…

create(MavenRepository) { url 'http://myserver' }

create(MavenPublication) { 
        id groupId: 'myGroup', artifactId: 'myArtifact' // with implicit version
        component  find(JavaLibraryComponent, { /* some predicate(s) for 
getting the right one */ })
        publishTo findAll(MavenRepository)
}
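
To make the create/find/findAll idea a bit more concrete, here is a rough, 
runnable sketch in plain Groovy. None of it is Gradle API: the Items class 
and the three stand-in model classes are made up for illustration, and it 
assumes that 'find' means "exactly one match, otherwise fail".

```groovy
// Hypothetical stand-ins for the model types; not the real Gradle classes.
class MavenRepository { String url }
class JavaLibraryComponent { String name }
class MavenPublication {
    String groupId
    String artifactId
    JavaLibraryComponent component
    List publishTo = []
}

// Hypothetical flat pool of items, queried by type plus an optional predicate.
class Items {
    private final List all = []

    // Instantiate the type, configure it with the closure, and remember it.
    def create(Class type, Closure config = {}) {
        def item = type.getDeclaredConstructor().newInstance()
        item.with(config)   // runs the closure with the new item as delegate
        all << item
        item
    }

    // Every item of the given type that satisfies the predicate.
    List findAll(Class type, Closure predicate = { true }) {
        all.findAll { type.isInstance(it) && predicate(it) }
    }

    // Exactly one match; anything else is treated as an error (an assumption).
    def find(Class type, Closure predicate = { true }) {
        def matches = findAll(type, predicate)
        if (matches.size() != 1) {
            throw new IllegalStateException(
                "Expected one ${type.simpleName}, found ${matches.size()}")
        }
        matches[0]
    }
}

def items = new Items()
items.create(MavenRepository) { url = 'http://myserver' }
items.create(JavaLibraryComponent) { name = 'main' }

// The publication only knows about its upstream: a component and repositories.
def publication = items.create(MavenPublication) {
    groupId = 'myGroup'
    artifactId = 'myArtifact'
    component = items.find(JavaLibraryComponent) { it.name == 'main' }
    publishTo = items.findAll(MavenRepository)
}
```

In this sketch the jar/component knows nothing about being published; the 
publication pulls in what it needs by querying, and names are only one 
attribute a predicate might match on.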


> 
> Then, we can infer something like:
> 
> * There is a Maven publication `myGroup:myArtifact:${project.version}` 

How are you inferring it's a maven publication? The existence of a maven 
repository?

> * The main jar (and its meta-data) should be included in this publication.
> * This publication should be published to the Maven repository 
> `http://myserver`

What is this inference based on? The fact that there is only one maven 
repository?

> * This publication can also be published to maven local.
> * All the other stuff that can be inferred about a Maven publication.






-- 
Luke Daley
Principal Engineer, Gradleware 
http://gradleware.com

