On 17/05/2012, at 9:20 PM, Luke Daley wrote:

> 
> On 17/05/2012, at 1:15 AM, Adam Murdoch wrote:
> 
>> On 17/05/2012, at 6:12 AM, Luke Daley wrote:
>> 
>>> Howdy,
>>> 
>>> Here are some disjointed thoughts about JavaScript support, as provided by 
>>> the different 3rd party plugins.
>>> 
>>> The biggest problem I can see is that none of these plugins play well 
>>> together, which to some extent is a symptom of there being many different 
>>> ways to “do JavaScript”. Also, I think it's a symptom of there being no 
>>> core modelling in this area.
>>> 
>>> I've been contemplating what a “javascript-base” plugin might provide if we 
>>> were to do such a thing.
>>> 
>>> Providing something like sourceSets would probably be a good start. At the 
>>> moment most tasks are using references to directories and finding all js 
>>> files manually, which is unnecessary and defeats task chaining through 
>>> buildable inputs. I'm not sure what else a JavaScript source set would do 
>>> besides be a glorified FileTree. I'm not sure at this point what else there 
>>> is to model, or is worth modelling.
>> 
>> A couple more things a SourceSet would provide:
>> * A way to declare the purpose of some javascript source files: is it part 
>> of the web app, or is it a test script, or is it a test fixture, or is it a 
>> tool, or …
> 
> Which would just be name of the sourceSet right?

That, plus the meaning we attach to the name: e.g. 'main' vs 'test'.

> 
>> * A way to declare the dependencies of some javascript source. In general, a 
>> set of javascript source has dependencies on other javascript libraries, and 
>> on various other kinds of artefacts, such as css or images.
> 
> I'm not sure javascript ever really does depend on css or images. What did 
> you have in mind here? I think the combination of js + css + images is 
> probably another concept.

Quite possibly.

The use case I have in mind is where a javascript library expects a certain set
of css rules to have been applied, where the rules come from a css script provided
as part of that library. And the css script in turn expects a certain set of
images and other resources to be available at some relative urls, where the images
are also provided as part of that library. That is, pretty much any widget/ui
library. An example would be jquery-ui.

I can see a few ways to model this:
1. Introduce a 'web library' concept, which is simply a file tree of resources 
to be bundled in a web application. A javascript library like the above could 
be packaged [1] as a web library containing the javascript, css and other
resources.
2. Introduce a 'javascript library' and a 'css library' concept (possibly both
as specialisations of 'web library'). The javascript library would depend on
the css library at runtime, and the web application bundling would need to take 
care of honouring this dependency. This approach would model something like 
jquery-ui + separate themes.
3. Introduce a 'javascript library' concept with 2 usages: 'runtime' and 
'deployment' [2]. The runtime usage defines the dependencies required to 
execute the script, e.g. for unit testing, and the deployment usage defines the 
dependencies required at deployment time. The web application bundling or 
deployment would need to take care of honouring the deployment time 
dependencies.

None of these are really mutually exclusive; we could do all 3. We probably
don't want to do 1), except perhaps to model that a css library is-a web
library. We don't want to model a javascript library as a web library.

[1] packaging == the way (or ways) that a component is bundled into artefacts. 
A library may have multiple packagings. Some examples for a javascript library: 
as a stand alone script, as a file tree containing the script + css + 
resources, as a zip containing the script, dependencies, documentation, samples 
and so on.
[2] usage == some way that a component can be used. A library may have multiple
usages. Some examples (from outside the javascript domain): the compile-time and
runtime usages of a java library, or the link-time and runtime usages of a c++
library.
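To make option 3) concrete, a build script using it might look something like
the following. This is purely a hypothetical sketch: the 'javascript-library'
plugin, the 'usages' block and the module coordinates are all invented names,
not existing Gradle features.

```groovy
// Hypothetical sketch only: none of these names exist in Gradle today.
apply plugin: 'javascript-library'

javascript {
    usages {
        runtime {
            // scripts required to execute the code, e.g. for unit testing
            dependencies {
                library 'org.jquery:jquery:1.7.2'
            }
        }
        deployment {
            // css + images the scripts expect to find at deployment time
            dependencies {
                library 'org.jquery:jquery-ui-theme:1.8.20'
            }
        }
    }
}
```

The web application bundling would then honour the 'deployment' usage when
assembling the bundle, while the test infrastructure would only need the
'runtime' usage.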


> 
>> A few other things we should model:
>> * The concept of a javascript library.
> 
> This is a very loose concept. I'm not sure what we could do besides naming 
> attributes and source. 
> 
> We could look at modelling enough so that we can generate module
> definitions; more on this in the next point.
> 
>> * Dependency management for javascript libraries. This, most importantly, 
>> would involve defining a convention for publishing a javascript component, 
>> and automating the work that has to happen to follow this convention.
> 
> AFAICT, we'd be carving new ground here. 

That's exactly the idea.

Dependencies would be declared using exactly the same DSL as we use to declare 
dependencies on java libraries, or c++ libraries. Javascript libraries would be 
published using exactly the same DSL as java libraries or c++ libraries. And 
into the same format as these things.

Of course, we're going to have to improve those DSLs to better deal with this. 
And we need to improve the schemes we use to publish to binary repositories.
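As a sketch, declaring dependencies might then look like this. Again, this is
hypothetical: the 'js' and 'testJs' configurations and the module coordinates
are assumptions about a future DSL, not something Gradle supports today.

```groovy
// Hypothetical sketch: the 'js' and 'testJs' configurations are invented
// names, by analogy with 'compile' and 'testCompile' for java.
repositories {
    maven { url 'http://repo.mycompany.com/releases' }
}

dependencies {
    js 'org.jquery:jquery:1.7.2'        // resolved like any other module
    testJs 'org.jasmine:jasmine:1.2.0'  // only needed to run the tests
}
```

Publishing would reuse the same mechanism and repository formats as java and
c++ libraries.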

> 
> There is no established metadata format for declaring dependencies. At least
> in the short term, we'd be expressing dependencies in terms of downloading
> raw js files, or zips then extracting them to somewhere in a js source tree. 

I think it's really important that dependency management is in there from the 
start. We can improve any awkwardness over time.

For example, there's an 'on-boarding' problem, where most javascript libraries 
are not published to a binary repository in a well-defined format. But this is
certainly not unique to the javascript world; it's a problem in the c++ and
java worlds, too.

However, these libraries are really only published in a handful of formats. We 
can probably come up with a repository type that can resolve the distributions 
at their origin, download them into the cache and munge them into the 
normalised layout appropriate for the type of library. We could even build an 
importer on top of this, that republishes these normalised distributions to the 
corporate repository manager. Perhaps we even publish some definitions for 
popular javascript libraries up on repo.gradle.org (i.e. don't publish the 
libraries, but publish enough metadata to download and use them from their 
origin).
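A sketch of what such a repository definition might look like, with every name
invented purely for illustration:

```groovy
// Hypothetical 'origin-resolving' repository type: it would resolve a
// module to the library's own download site, fetch the distribution,
// and munge it into a normalised layout in the local cache.
repositories {
    javascriptOrigins {
        // metadata-only definitions (no artefacts) hosted by us;
        // artefacts are fetched from each library's origin
        definitions 'http://repo.gradle.org/javascript-definitions'
    }
}
```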


> The only benefit I can see to doing this (opposed to checking this stuff into 
> the source tree) would be that we could build up a model of the dependency 
> graph locally in the build and potentially use this information to drive 
> bundlers/compressors.

There's all the benefits of declarative dependency management.

From a development point of view, I can declare a dependency on jquery, point 
the build at my corporate repository (or some public repository) and Gradle 
takes care of downloading jquery, injecting it into the correct contexts at 
test and runtime, and setting up the meta-data in the IDE.

From an enterprise point of view, there are many benefits: I don't have to open 
up the firewall to fetch this stuff. I can run each library through my 
procurement and audit process before making it available in the corporate 
repository. I have all the reporting goodness about which licenses and 
libraries are being used across my organisation. I can share javascript 
libraries and related assets in a controlled way between my teams. And so on.

In short, declarative dependency management >> checking stuff into the source 
tree.

> 
> There are runtime-based resolution mechanisms though (e.g.
> http://requirejs.org/, https://github.com/unscriptable/curl) that use either 
> AMD or UMD (http://addyosmani.com/writing-modular-js/ was the best 
> explanation I found). This is really for managing scoping between “modules” 
> in a large JavaScript codebase. It's unclear what the build time implications
> of this kind of thing are, as it relates to dependency management.
> 
>> * The connection between javascript and web applications. Javascript almost 
>> always ends up in a web application. Our model should reflect this fact.
> 
> Unsure on this, depending on what you mean. Something like a JavaScript 
> source set should have no knowledge of web applications.

Right. It would work in the same way as Java: A Java source set does not have 
any knowledge of web applications, but when you tell Gradle that you're building
a web application, it knows that certain compiled classes need to end up in a 
certain location in the resulting bundle. Same for Javascript. When you tell 
Gradle that you're building a web application, certain javascript source files 
need to end up in a certain location in the resulting bundle.
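As a sketch of how that convention might surface (the 'javascript' plugin and
the 'sourceSets.main.js' notation are invented for illustration):

```groovy
// Hypothetical: applying both plugins would wire the transformed output
// of the javascript 'main' source set into the war at /js by default,
// analogous to compiled classes ending up in WEB-INF/classes.
apply plugin: 'javascript'
apply plugin: 'war'

// the implicit convention, shown explicitly:
war {
    into('js') {
        from sourceSets.main.js.output
    }
}
```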

> 
>> our infrastructure should automate the bundling, there should be conventions 
>> for how this happens, and our IDE integration should expose this concept.
> 
> I'm unsure what kind of conventions we'll be able to impose, given the sheer 
> number of different tools that do the same thing.

The key here is that these tools do 'the same thing', i.e. they are different 
implementations of an abstract lifecycle. The focus should be on what the
tools produce, rather than on how they go about it.

There are a few aspects to this. We might do something like:
* Define an abstract lifecycle for javascript projects. For the most part, this 
won't be any different to the abstract lifecycle for java or c++ projects.
* Define a 'main' and 'test' source set, with source files living in 
'src/$sourceSet/js'. Same for coffeescript.
* A source set may or may not be transformed before it is ready for execution. 
This might include compilation, concatenation, minification.
* Tests are executed against the transformed output of the main and test source 
sets.
* When building a javascript library, the transformed output of the main source 
set is published (plus meta-data).
* When building a web application, the transformed output of the main source 
set is bundled in the web application, at, say, '/js'.
* Start handling the concept of variants, to allow things such as a minified 
and non-minified variant to be built.


> I think the best we could do is model the kind of processing pipeline that's 
> typically involved here in a way that lets people use whatever tools they 
> want at each processing step. In the Java world, we can get away with 
> standardising on a compiler and a bundler for the most part where that level 
> of standardisation will not be possible in the JavaScript space.

It's not any different in the java space. People use various kinds of 
compilers, code generators, and transformers to end up with the output that is 
ready to execute. Same with bundling (e.g. a regular jar, a fat jar, using jar 
jar, or an obfuscator).

> 
> I think the difference between C++ and JavaScript here is that common 
> abstractions will be harder to find in the JS space because of its immaturity 
> and the different approaches that different tools take. I think we'll have to 
> pitch lower, and focus on providing general processing/pipelining 
> abstractions and runtime/execution abstractions. I don't see this the same as 
> abstracting over C++ compilers, there are more established patterns in that 
> space allowing us to abstract a little higher.

I don't think there's really any practical difference. The abstractions are 
there in the Javascript world, and they're exactly the same ones that are in 
the c++ and java worlds:
* I write some code that I want someone to execute.
* I transform the source code into an executable form.
* I write some tests for it.
* I run the tests against the executable code.
* I package up the executable code.
* I publish the result, or deploy the result. Or both.
* My code has dependencies on other code.
* Some of these dependencies are written by other teams.
* I want to use an IDE to edit and execute my code.

And so on.

> 
>> There is also some goodness we can reuse from the c++ dependency management 
>> stuff we want to do:
>> * The idea of variants: a minified variant of a script vs a non-minified 
>> variant of a script.
>> * The idea of packaging: a library distributed as a script vs a library 
>> distributed in a zip with supporting resources, dependencies and so on.
>> * The idea of bundling: a published library may include some or all of the 
>> dependencies of that library.
> 
> These mostly address the case of building a JS library to be consumed by 
> others. This will not be the common case by far. Pulling dependencies, and 
> combining with local code to become part of the web application that's being 
> built will be the common case.

It's exactly the same thing. My consuming a dependency is someone else's 
publishing, whether that person is using Gradle or some other tool or 
publishing manually. The above things form the contract between the publisher 
and consumer.

These concepts are not just applicable to dependency management, either. 
They're very much independent of whether you're sharing your output with 
another team. For example, my CI release build might build 3 variants of my web 
application, one for the development environment, one for QA, and one for 
production. My dev variant would bundle the non-minified script. The QA and 
production variants would bundle the minified script.
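That might surface in a build script along these lines (the 'webApplications'
container and the 'scripts'/'variants' notation are invented names):

```groovy
// Hypothetical sketch: three variants of the one web application,
// differing only in which script variant they bundle.
webApplications {
    dev        { scripts js.variants.nonMinified }
    qa         { scripts js.variants.minified }
    production { scripts js.variants.minified }
}
```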


>  Publishing JS would only be done for very large orgs, or for multi project 
> builds where you need to share js across apps in the build.
> 
> It's unclear to me atm how appealing the idea of building your JS as an 
> independent project in a multi project build would be. I imagine most 
> developers would prefer to not do this, unless they explicitly need to share 
> the JS.
> 
>>> Not sure what we would do code wise to achieve this, but promotion of using 
>>> SourceTask as a super class would probably do wonders in this space. 
>>> 
>>> Providing some JavaScript runtime support would be useful too. At the very 
>>> least, this would be support for RhinoJS. It could eventually expand to 
>>> include PhantomJS and possibly real browsers automated with WebDriver too. 
>>> A lot of the javascript tools (compressors, static analysis etc.) are 
>>> written in JavaScript, hence the need for a runtime.
>>> 
>>> As for testing, I'm still not sure what to do. The different test 
>>> frameworks have different ways of working that would be hard to generalise 
>>> like we did with JUnit and TestNG I think.
>> 
>> What are some of the candidates we might consider?
> 
> http://pivotal.github.com/jasmine/
> 
> Requires a HTML driver page, that includes Jasmine, the CUT and the test 
> cases. Can be used in a pure JS environment, but you'd need to bundle the CUT 
> and test cases in some fashion and throw them at a runtime. Some projects 
> create test suites of different subsets of the tests, which means more than 
> one HTML driver page.
> 
> For automation, people typically do one of four things:
> 
> 1. Start a http server that serves the html driver page, then automate a real 
> browser to hit it, then capture the results
> 2. Use Rhino with http://www.envjs.com/ (a Javascript DOM impl) 
> 3. Use PhantomJS (i.e. headless webkit)
> 4. Use HtmlUnit
> 
> http://visionmedia.github.com/mocha/ (same kind of story as above)
> 
> http://docs.jquery.com/QUnit (same kind of story as above)
> 
> Looking at it again, there is clearly a pattern here. We could provide a “DOM 
> runtime” abstraction/infrastructure with potential impls for WebDriver (i.e. 
> real browsers), Rhino + envjs, PhantomJS and HtmlUnit. We then would just 
> point them at a html page (probably served through a server that we start) 
> and then capture the DOM after the page loads. More focussed testing support 
> could build on this.
> 
> There's another interesting option in 
> http://code.google.com/p/js-test-driver. The stated goal of this tool is to 
> bring a JUnit type workflow to JS testing. Here's how it works:
> 
> You start a js-test-driver process that serves up a html page. You then point 
> one or more browsers (real or otherwise) at this page, this “connects” the 
> browser to the server in a bi-directional way. You then can tell the server 
> to “run the tests”, which in turn triggers the execution of the tests in all 
> of the connected browsers, then collects and aggregates the results.
> 
> What's nice about this tool is that it has IDEA and Eclipse plugins for 
> starting the server and running tests via the IDE, and it also spits out 
> JUnit XML. There are also adapters available for QUnit and Jasmine, that 
> allow them to run in this environment. This might be a compelling place to 
> start.
> 
> There's some overlap between these two approaches that we could exploit. At 
> build time, we could use js-test-driver to manage the generation of the 
> driver html page and results XML and use our “DOM runtime” machinery to point 
> browsers at the js-test-driver server.
>  
> 
>>> I think the best we could do is provide some abstractions around JavaScript 
>>> runtimes and JavaScript + HTML runtimes. This would allow different 
>>> focussed JavaScript testing plugins to code against these abstractions. At 
>>> the moment, most of the existing javascript testing plugins couple a 
>>> testing framework with a runtime. This is something users will want to 
>>> control. They may even want to use multiple, e.g. run their JS unit tests 
> in IE, then Firefox, etc.
>>> 
>>> No actions to take at this stage, just sharing my thoughts so far.
>> 
>> I think there's a huge overlap with javascript and c++. For everything 
>> you've written above, you could replace 'javascript' with 'c++' and still 
>> make sense. And I think whatever we do in the javascript space will benefit 
>> our c++ support, and vice versa.
> 
> You're dead right here. I didn't see it initially.
> 
>> It all boils down to 1) no obvious conventions, so we either need to choose 
>> one, or abstract over several, and 2) much of the Gradle model is 
>> jvm-centric, even where it's abstract, so we need to detangle the model from 
>> the jvm, and model more stuff. In particular, the 'build my jar but pretend 
>> we're general-purpose' view of the world that our dependency model has, has
>> to go.
> 
> -- 
> Luke Daley
> Principal Engineer, Gradleware 
> http://gradleware.com
> 


--
Adam Murdoch
Gradle Co-founder
http://www.gradle.org
VP of Engineering, Gradleware Inc. - Gradle Training, Support, Consulting
http://www.gradleware.com
