Re: How to name modules, automatic and otherwise

Stephen Colebourne Mon, 27 Feb 2017 05:48:09 -0800

Having spent further time considering naming, I have come to the
conclusion there is only one acceptable solution - highest package
name (reverse DNS). It now seems to me that this can almost be
"proven" as follows:

Consider a module that contains three packages:

module ??? {
   exports com.foo.willow;
   exports com.foo.willow.model;
   exports com.foo.willow.util;
}

It is possible to refactor such a module into three separate modules,
where each module consists of one package:

module ??? {
   exports com.foo.willow;
   requires transitive ???.model;
   requires transitive ???.util;
}
module ??? {
   exports com.foo.willow.model;
}
module ??? {
   exports com.foo.willow.util;
}

Note here that as a general rule, it is always possible to refactor a
module into a number of single-package modules. (I can't actually
prove this offhand, but it seems an entirely reasonable claim).

Now, ask yourself the question - what should the module name be for
each single-package module?

There is only one possible answer that is not completely ridiculous -
the module name must be the same as the package name. Any other answer
involves the invention of some other name that does not uniquely
reference the content of the single-package module.

module com.foo.willow {
   exports com.foo.willow;
   requires transitive com.foo.willow.model;
   requires transitive com.foo.willow.util;
}
module com.foo.willow.model {
   exports com.foo.willow.model;
}
module com.foo.willow.util {
   exports com.foo.willow.util;
}

Stepping back up to the original, it therefore follows that the module
name must be a summary of the packages included. Which means, in
general, the "highest" of the packages included/exported - aka reverse
DNS.

module com.foo.willow {
   exports com.foo.willow;
   exports com.foo.willow.model;
   exports com.foo.willow.util;
}
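
As a rough illustration of what "highest" could mean here, the sketch below computes the longest dotted prefix shared by a module's packages. The class and method names are hypothetical, not part of any JPMS API. Note that the shared prefix is only guaranteed to be one of the contained packages when one package is an ancestor of all the others, as in the willow example.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, for illustration only.
final class ModuleNames {
    private ModuleNames() {}

    /** Returns the longest dotted prefix shared by all package names, or "" if there is none. */
    static String highestPackage(List<String> packages) {
        if (packages.isEmpty()) {
            return "";
        }
        String[] common = packages.get(0).split("\\.");
        int length = common.length;
        for (String pkg : packages) {
            String[] parts = pkg.split("\\.");
            // The shared prefix can never be longer than the shortest name seen so far.
            length = Math.min(length, parts.length);
            for (int i = 0; i < length; i++) {
                if (!common[i].equals(parts[i])) {
                    length = i; // first differing segment ends the shared prefix
                    break;
                }
            }
        }
        return String.join(".", Arrays.copyOf(common, length));
    }
}
```

Calling `ModuleNames.highestPackage` with the three willow packages above yields `com.foo.willow`, which is the proposed module name.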

In fact, the primacy of package naming is fundamental to the whole
Jigsaw design. The JPMS won't load two modules where the packages
conflict. Exports are expressed at the package level too.
And the developer writing code is actually importing from a package,
not a module. There is thus no getting away from the fact that modules
are built on top of the existing package concept, simply providing
packages with greater security.

Given all this, a module name SHOULD match one of the packages it contains.

In essence, there are two worlds that JPMS seeks to keep separate -
development and build system. The build system world has versions,
artifacts, groups, organizations, jar-files, project names, etc - none
of which affect bytecode. The development world has class, package and
module names - all ending up in bytecode. Module names should
therefore be completely uninfluenced by versions, artifacts, groups,
organizations, jar-file names and project names. The fact that
jar-files are being used to package modules is a big confusion to this
whole naming debate, as the jar-file name is from the build system
world, not the development world.

With regard to automatic modules, this implies that automatic modules
must contain the Module-Name MANIFEST information (as it takes too
long to scan a jar for packages). Allowing the command line to map jar
files to module names would also be acceptable.
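
To show why a manifest entry is cheap compared with scanning a jar, here is a minimal sketch that reads the name from manifest text using the standard `java.util.jar.Manifest` class. The attribute name `Module-Name` is taken from the proposal quoted in this thread and is an assumption about what tools would finally look for. The class `ManifestModuleName` is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Manifest;

// Hypothetical helper, for illustration only.
final class ManifestModuleName {
    private ManifestModuleName() {}

    /** Parses manifest text, such as the content of META-INF/MANIFEST.MF. */
    static Manifest parse(String text) {
        try {
            return new Manifest(new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the value of the assumed Module-Name main attribute, or null if it is absent. */
    static String read(Manifest manifest) {
        return manifest.getMainAttributes().getValue("Module-Name");
    }
}
```

For a real jar the `Manifest` would come from `java.util.jar.JarFile.getManifest()` rather than from a string, but the lookup is the same single attribute read.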

On 16 February 2017 at 16:48,  <mark.reinh...@oracle.com> wrote:
> I do know, however, of at least
> one major, well-known project whose developers intend to adopt the
> project-name-prefix convention for their module names.

Mark indicates that some projects are intent on using short names. To
stop this (where developers try and be "clever" and anti-social) the
module-info compiler and/or jar tool MUST emit a warning (I'd prefer
error, but there are probably some edge cases). The warning would be
something like "the module name must match or be related to one of the
contained packages". While this does not absolutely prevent developers
doing the wrong thing, it would provide additional force to the rule.
Potentially Maven Central could also validate this.
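
A check of this kind could be quite small. The sketch below assumes one possible reading of "match or be related to": the module name either equals a contained package or is a dotted prefix of one. That interpretation, and the class and method names, are my own illustration rather than anything specified for javac or the jar tool.

```java
import java.util.Collection;

// Hypothetical validation helper, for illustration only.
final class ModuleNameCheck {
    private ModuleNameCheck() {}

    /**
     * Returns true if the module name equals one of the packages,
     * or is a dotted prefix of one (e.g. "com.foo" for "com.foo.willow").
     */
    static boolean matchesContainedPackage(String moduleName, Collection<String> packages) {
        for (String pkg : packages) {
            if (pkg.equals(moduleName) || pkg.startsWith(moduleName + ".")) {
                return true;
            }
        }
        return false;
    }
}
```

Under this reading, a module named `com.foo.willow` passes for the packages above, while a short name such as `willow` fails and would trigger the warning.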

In summary, module naming must match package naming, because modules
are ultimately just a collection of packages with some additional
security rules. Any module can always be reduced to a number of
single-package modules, and since there is only one possible name a
single-package module should have, the combined module should have a
name based on the packages contained - which will almost certainly be
the highest contained package name.

Stephen

On 16 February 2017 at 23:19, Stephen Colebourne <scolebou...@joda.org> wrote:
> On 16 February 2017 at 16:48,  <mark.reinh...@oracle.com> wrote:
>>  This can be done very simply, with a single new JAR-file manifest 
>> `Module-Name` attribute
>
> I welcome this.
>
>> The reversed domain-name approach was sensible in the early days of Java,
>> before we had development tools sophisticated enough to help us deal with
>> the occasional conflict.  We have such tools now, so going forward the
>> superior readability of short module and package names that start with
>> project or product names is preferable to the onerous verbosity of those
>> that start with reversed domain names.
>
> What tools?
>
> With short identifiers clashes are inevitable. Because the module name
> is baked inside the module in binary format, the only way to resolve
> the clash is to rewrite the module. The Java platform has not demanded
> anything like this before, and I can't see how it meets the reliable
> configuration requirement. Rewriting modules as part of the build
> system is a red line for me. I need to be able to see that the module
> on the module path is the same bits as that from the source of jars.
>
> The standard case to consider is as follows:
>
> - In 2017, a company creates an internal foundation library called
> "willow" and it becomes very popular within the company and is used
> 100s of times
> - In 2018, an unrelated open source project starts up with the name
> "willow" and becomes very popular. Both now publish modules with the
> name "willow" (one privately, one publicly).
> - In 2019, the company wants to use the open source "willow" library
> (directly or indirectly), but can't due to name clash
> - In 2020, the company wants to open source their "willow" library,
> but can't due to name clash
>
> The plan outlined, favouring short IDs, provides no solution to this
> problem that I can see. There simply isn't the breadth of identifier
> to avoid clashes like this (you can't possibly predict the future
> where you might need to coexist with an open source module that
> doesn't even exist yet). Proposal (A) only tackles automatic modules,
> and not the bigger problem where names are baked into the module
> itself.
>
> The simplest and most consistent option is reverse DNS everywhere.
> Everyone understands it and few will object!
>
> An alternative option would be that open source can use short names,
> but companies "must" use reverse DNS. But this is far from ideal given
> how projects move from private to public, or how companies merge.
>
> Another alternative is some form of group, that may or may not map
> onto maven's group, where most of the time it does not have to be
> specified:
>
> module mainlib from com.mycompany {
>   requires base;  // implicit, favours group 'com.mycompany' if there is a clash
>   requires willow;  // uses 'com.mycompany' because there is a clash
>   requires willow from org.joda;  // explicitly specified, but only needed to resolve a clash
> }
>
> With this approach, the clash can be resolved, but only needs to be by
> the first module in the graph to pull both in. Any transitive use of
> the two willow modules would be fine.
>
> In summary, I recognise the desire for short, pretty identifiers.
> However, I remain of the opinion that they are highly dangerous for
> the wider ecosystem without some additional ability to qualify them.
> There are many more private jars than public jars, and the clashes seen
> today on Maven Central are just the tip of the iceberg of this
> problem.
>
> Stephen
