Re: Mutable modules

David M. Lloyd Fri, 20 May 2016 08:21:13 -0700

On 05/20/2016 09:12 AM, Alan Bateman wrote:

On 18/05/2016 22:47, David M. Lloyd wrote:


I mean in *our* current concept of a module, we can add/remove/modify
the contents of a module (its "class path") at run time.  It is up to
the user to ensure that doing so makes sense.

I don't think I can relate to the use case. As you probably know then
ZIP files have historically had their central directory mapped into
memory. Removing or replacing a file that is memory mapped will likely
lead to processes accessing the mapped file to crash (SIGBUS usually).
So if users are really doing such hairy things they would need a lot of
insight into what is running and whether the file is opened before
taking this risk.

No, we don't support replacement of JAR files; more like we symbolically
remove the resources and add overlays. Internally, we generally explode
JAR files for various reasons. It has nothing to do with JAR files,
more to do with the logical presence and absence of class files and
resources.

Our modules each correspond to their own class loader: so far so good,
we can just have one Module per class loader.  Problem is that we
support circularity, and also we support dependencies that go across
module systems with isolated namespaces (basically, our module loaders
are a higher order of the exact same concept of class loaders).

If there are cyclic relationships between your modules then it will be
problematic. Do you see much of this? If you've read Alex's JavaOne
slides then you'll know that some of us like Kirk Knoernschild's book on
Java Application Architecture and section "4.4 Cyclic Dependencies - the
Death Knell" where he poses the question "Are Cycles Always Bad?". I
don't want to say too much on this topic here as it is listed as an open
issue on the JSR issues list.

We use circular dependencies in both our static module layouts and also
in our dynamic deployment system. I don't think we have a clear path
out if we can't support them; it will certainly be a difficult situation.

Our modules support specifications including the content of the module
("resource loaders") and the dependencies of the module. At run time,
custom ModuleLoader implementations can change the resource loader
list and/or the dependency list at any time, causing the module to be
relinked on the spot; the most useful aspect of this is the ability to
incrementally deploy applications which may include circular
dependencies.

Aside from cycles then what other use-cases do you have here? I read
"change the resource loader list" to mean that the set of resources in
the module changes, which is a bit weird if those resources are class
files that have already been loaded. Maybe there is dynamic code
generation with class bytes generated to the file system or into
somewhere virtual? Or maybe these resources are something else, data
files? I'm just trying to understand what you mean as we are using
different terminology.

Within our module concept, the "class path" is a sort of colloquial term
which refers to the series of resource loaders used to locate classes
and resources within each module. Functionally it's somewhat analogous
to the URLClassPath concept in the JDK, in the way that resources are
sought (i.e. by a linear search from start to end of the list).
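
As a rough illustration of that linear search, here is a minimal sketch.
Every name in it (ResourceLoaderSearch, ResourceLoader, find) is made up
for the example and is not our actual API; the in-memory maps stand in
for exploded JARs or overlays.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical names throughout; this is an illustration, not a real module API.
final class ResourceLoaderSearch {
    /** One entry in a module's list of resource loaders. */
    interface ResourceLoader {
        Optional<byte[]> find(String path);
    }

    /** Backs a loader with an in-memory map, standing in for an exploded JAR. */
    static ResourceLoader of(Map<String, String> contents) {
        return path -> Optional.ofNullable(contents.get(path))
                .map(s -> s.getBytes(StandardCharsets.UTF_8));
    }

    /** Linear search from the start of the list to the end; the first hit wins. */
    static Optional<byte[]> find(List<ResourceLoader> loaders, String path) {
        for (ResourceLoader loader : loaders) {
            Optional<byte[]> hit = loader.find(path);
            if (hit.isPresent()) {
                return hit;
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        ResourceLoader overlay = of(Map.of("conf/app.properties", "overlay"));
        ResourceLoader base = of(Map.of("conf/app.properties", "base"));
        // The overlay is listed first, so it shadows the base copy.
        byte[] found = find(List.of(overlay, base), "conf/app.properties").get();
        System.out.println(new String(found, StandardCharsets.UTF_8)); // prints "overlay"
    }
}
```

Placing an overlay earlier in the list is what makes "symbolically
removing" or replacing a resource possible without touching the
underlying files.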

This is separate from the module dependency list, which, when combined
with the resource loader list, is used to construct an index by path
(which is a superset of packages that includes not just classes but also
resources) which refers to dependencies and/or internal resources.

When the resource list is changed, all future lookups for classes and
resources will use the new index. If there are already classes loaded
from the previous list, and those classes are sufficiently incompatible
with the new code, obviously this will result in errors; however, this
is usually not the case when (say) making incremental changes during
development. This is generally an edge case, but it is one which we
presently support.
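
To make the "new index for all future lookups" behaviour concrete, here
is a small sketch under simplifying assumptions: the index is an
immutable path-to-supplier map that is rebuilt and swapped in whole on
each relink. The names (RelinkableModule, relink, supplierOf) are
invented for the example, and the real index holds more than a string
per path.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: a path index that is rebuilt and swapped on relink.
final class RelinkableModule {
    /** Immutable snapshot: path -> name of the source that supplies it. */
    private volatile Map<String, String> index = Map.of();

    /**
     * Rebuilds the index from an ordered list of (source name, paths) entries.
     * Earlier sources win; the finished map replaces the old one in one write,
     * so lookups see either the old index or the new one, never a mixture.
     */
    void relink(List<Map.Entry<String, List<String>>> sources) {
        Map<String, String> fresh = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> source : sources) {
            for (String path : source.getValue()) {
                fresh.putIfAbsent(path, source.getKey());
            }
        }
        index = Map.copyOf(fresh);
    }

    /** Returns the supplier of a path, or null if nothing provides it yet. */
    String supplierOf(String path) {
        return index.get(path);
    }

    public static void main(String[] args) {
        RelinkableModule module = new RelinkableModule();
        module.relink(List.of(Map.entry("own-resources", List.of("p/C.class"))));
        System.out.println(module.supplierOf("q/D.class")); // prints "null"

        // An additive change: a dependency supplying q/ is added later.
        module.relink(List.of(
                Map.entry("own-resources", List.of("p/C.class")),
                Map.entry("dependency:q", List.of("q/D.class"))));
        System.out.println(module.supplierOf("q/D.class")); // prints "dependency:q"
    }
}
```

Nothing here protects classes that were already loaded through the old
index; as noted above, that part remains the responsibility of whoever
triggers the relink.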

Changing the dependency list has effects that are somewhat similar.
Existing loaded classes which have already linked against classes of the
previous dependency may malfunction when doing this. But in the hot
deployment situation, most of the time these changes are additive, so
the effect is generally to enable previously unlinked code to become
linkable. This could happen, for example, when deploying a JAR some
time after a first JAR was deployed, which resolves some missing
dependencies in the first JAR.

We also support delegating "fallback" class loading decisions to
outside suppliers for dynamic class loading behavior (this was done to
support a dynamic OSGi environment).  The ongoing integrity of the
system is up to the party doing the relinking (the EE deployer or the
OSGi resolver); most of the time it can reason about what is "safe"
and what might cause system breakage (but still might be useful to do
anyway).  These are the features we can't seem to support under
Jigsaw, architecturally speaking.

This sounds like class loader delegation to resolve types that are not
in the module.

Exactly. Our OSGi people have told me in the past that OSGi can't
function to spec without this ability.
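
For readers less familiar with the pattern, the following is a minimal
sketch of such a fallback hook using only the standard ClassLoader API.
It is not our implementation: FallbackClassLoader and its fallback
function are hypothetical, and the function stands in for whatever an
outside party such as an OSGi resolver would decide at that moment.

```java
import java.util.function.Function;

// Hypothetical sketch of a "fallback" hook; not an actual module loader.
final class FallbackClassLoader extends ClassLoader {
    /** Asked, per class name, which loader (if any) should supply the type. */
    private final Function<String, ClassLoader> fallback;

    FallbackClassLoader(ClassLoader parent, Function<String, ClassLoader> fallback) {
        super(parent);
        this.fallback = fallback;
    }

    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        // loadClass() calls this only after the parent has failed, so the
        // outside supplier is consulted as a last resort.
        ClassLoader delegate = fallback.apply(name);
        if (delegate == null) {
            throw new ClassNotFoundException(name);
        }
        return delegate.loadClass(name);
    }

    public static void main(String[] args) throws Exception {
        String self = FallbackClassLoader.class.getName();
        ClassLoader appLoader = FallbackClassLoader.class.getClassLoader();
        // Parent is the bootstrap loader (null), which cannot see this class,
        // so resolution falls through to the supplied decision function.
        FallbackClassLoader loader =
                new FallbackClassLoader(null, n -> n.equals(self) ? appLoader : null);
        System.out.println(loader.loadClass(self) == FallbackClassLoader.class); // prints "true"
    }
}
```

Because the function is consulted on every miss, its answer can change
over time, which is the dynamic behaviour at issue here.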

Specifically this includes (but is not limited to) changing the
package set associated with a JDK module at run time, something that
this native code block makes impossible.  Also the ability to
dynamically change module dependencies is an essential ingredient to
making this work.

Suppose that module m has package p and p.C has been loaded. Are you
saying that you can drop package p from the module?

Yes - although normally you would only drop p if you knew specifically
that p.C *hadn't* been loaded, for obvious reasons. It's probably more
common to *add* than to drop.

As things currently stand in JDK 9 then packages may be added to modules
at runtime, the main use case is the dynamic proxy to a public interface
in a non-exported package. So I can relate to adding packages for code
gen cases, I'm less sure about a module starting out as an XML API and
suddenly changing into a JDBC driver. Do you really mean the same module
instance?

A more apt example might be to update a part of a module which hasn't
yet been loaded.

In my view, architecturally speaking, most of the constraints imposed
by the core module framework should be layer policy.  If the system's
core module layer wants to maintain strict, static integrity, name
constraints, version syntax and semantics, etc., that's fine, but why
should all modules everywhere be forced to the same constraints?

Using module names as an example, then it should be possible to develop
a module that is deployed on the application module path or instantiated
in a layer of modules that a container creates. The author of the module
(that chooses the name) isn't going to know in advance how the module is
deployed. I'm not even sure how such a module could be compiled or how
anyone could depend on it when the characters or format can vary like
this. I see there is an issue on the JSR issues list so I don't want to
say any more on this topic.

The idea of a "module" that I keep referring to is one that is described
in the JSR requirements: "... named, self-describing program components
consisting of code and data". A plain old JAR file meets this
definition just as readily as the current Jigsaw concept. Java 8 javac
can compile things that more than meet the definition of "module". The
important and relevant part of a module is the ABI it exposes, not its
name or even its internal structure (which should be encapsulated in
numerous senses of the word). This becomes even more clear when you are
assembling an environment consisting of hundreds of artifacts from
hundreds of authors. We package many, many artifacts whose authors
never gave a thought to modularity at all.

The responsibility of assigning a name to an ABI and behavior has to lie
with the environment assembler; it just doesn't make sense in the hands
of the original author, who is necessarily concerned only with the
parameters of their problem space, and not with that of any greater
ecosystem in which their project might find itself. This is the very
definition of encapsulation.

For an example of why we may need flexible naming, I can deploy an
application into a Java EE container called "my-cool-application.ear".
In this case, I might expect my module to be named
"my-cool-application.ear" or perhaps "my-cool-application". I might
expect nested JARs to correspond to modules named by their relative JAR
locations (including "/" separators). Really the only constraint is the
validity of the name on the filesystem. Java EE certainly places no
limitations on this today.

I might name my modules based on Maven group and artifact IDs, which
have different syntax requirements, or by OSGi bundle name. You get the
idea.
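
To show how quickly these naming schemes collide with an identifier-based
rule, here is a small check. It assumes, purely for illustration, that a
module name must be a dot-separated sequence of Java identifiers; the
class name, the method name, and the sample names other than
"my-cool-application.ear" are made up, and reserved words are not
rejected, to keep the sketch short.

```java
// Hypothetical check, assuming names must be dot-separated Java identifiers.
final class ModuleNameCheck {
    /** Returns true if every dot-separated part of the name is a Java identifier. */
    static boolean isDottedIdentifier(String name) {
        for (String part : name.split("\\.", -1)) {
            if (part.isEmpty() || !Character.isJavaIdentifierStart(part.charAt(0))) {
                return false;
            }
            for (int i = 1; i < part.length(); i++) {
                if (!Character.isJavaIdentifierPart(part.charAt(i))) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String[] candidates = {
            "com.example.util",                      // conventional dotted name
            "my-cool-application.ear",               // EE deployment name
            "my-cool-application.ear/lib/util.jar",  // nested JAR by relative path
            "org.example:example-lib",               // Maven-style group:artifact
        };
        // Only the first candidate passes; '-', '/' and ':' are not identifier characters.
        for (String candidate : candidates) {
            System.out.println(candidate + " -> " + isDottedIdentifier(candidate));
        }
    }
}
```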

The point is that, yes, you are correct: you *don't* know how a module
is going to be deployed; not before Java 9 and likely not after it
either. It doesn't even make any sense to define the module name inside
the module when the name is going to be 100% dependent on the
environment in which the module is used.

You can't, on one hand, define a universal namespace and syntax for
modules and their versions in the JDK or establish hard constraints on
layer and module graph structure, and on the other hand expect other
module systems with differing existing constraints to unify on the JDK
module system. You're basically cutting these systems off at the knees
and forcing them to reinvent everything, unless you completely
coincidentally have a system that already conforms to this structure (if
so, you are either very fortunate or maybe starting off in a rather
privileged position).

There is no way that existing containers and class loading
environments (other than, apparently, WebLogic) can conform to
Jigsaw's constraints without losing functionality (and I'm trying hard
to find ways to make it work).  This is where most of my raised issues
are coming from.

The module system imposes surprisingly few constraints.  If you are using
your own class loaders then the delegation needs to respect module
readability, something that should not be controversial.

See earlier posts about the controversy of making "public" no longer be
"public". The email thread is still unresolved and unanswered.

However it is possible that you are still at the starting line because
you have a dependency graph with cycles and/or modules that don't have
names that can be expressed as a Java identifier, is that right?

Yes, and also isolated module namespaces which nevertheless need to link
with one another. Also version syntax and schemes which are not
compatible with the Jigsaw scheme. But these issues are all raised in
the document.

All these problems seem surmountable to me, but it becomes
substantially more difficult when it is necessary to report all of a
module's packages to the module when it is created, since this
information is now not easily changed.

I'm surprised that this is an issue as module membership is critical to
access control.

Sure, I get that, but it's only critical to *new* access control rules
that were introduced with Jigsaw. The classical rules wherein
public=public that we have relied on this past decade would ease this
situation substantially, since in this case the only reason module
information would pass to the JVM would be for diagnostics; since each
class has a module membership, the JVM theoretically would already have
access to everything it needs for this purpose.

As long as we're talking access control though... the idea of replacing
this idea with "friend packages" and using them to selectively expand
package-private access has never been resolved or even seriously
discussed as far as I can see. It's been brought up several times and
basically ignored. But this idea would allow not only modules but *all*
Java code to take advantage of better security by removing public
qualifiers from things that are not public instead of relying on special
packages (because package identity is defined by class loader and
package name, meaning modules are not central to the concept though they
can easily take advantage of it), which makes far more sense to me at
least (in particular, within the JDK code itself - all those "shared
secrets" classes!) and imposes far less risk on the access control model
by keeping it homogeneous instead of making it bi-layered. I would
deeply wish to resurrect this discussion at some point. Readability and
exports have been nothing but a problem for users as far as I can see;
the current security model certainly isn't doing me any favors.


--
- DML
