On 05/20/2016 09:12 AM, Alan Bateman wrote:
On 18/05/2016 22:47, David M. Lloyd wrote:

I mean in *our* current concept of a module, we can add/remove/modify
the contents of a module (its "class path") at run time.  It is up to
the user to ensure that doing so makes sense.
I don't think I can relate to the use case. As you probably know then
ZIP files have historically had their central directory mapped into
memory. Removing or replacing a file that is memory mapped will likely
cause processes accessing the mapped file to crash (SIGBUS usually).
So if users are really doing such hairy things they would need a lot of
insight into what is running and whether the file is opened before
taking this risk.

No, we don't support replacement of JAR files; more like we symbolically remove the resources and add overlays. Internally, we generally explode JAR files for various reasons. It has nothing to do with JAR files, more to do with the logical presence and absence of class files and resources.

Our modules each correspond to their own class loader: so far so good,
we can just have one Module per class loader.  Problem is that we
support circularity, and also we support dependencies that go across
module systems with isolated namespaces (basically, our module loaders
are a higher order of the exact same concept of class loaders).
If there are cyclic relationships between your modules then it will be
problematic. Do you see much of this? If you've read Alex's JavaOne
slides then you'll know that some of us like Kirk Knoernschild's book on
Java Application Architecture and section "4.4 Cyclic Dependencies - the
Death Knell" where he poses the question "Are Cycles Always Bad?". I
don't want to say too much on this topic here as it is listed as an open
issue on the JSR issues list.

We use circular dependencies in both our static module layouts and also in our dynamic deployment system. I don't think we have a clear path out if we can't support them; it will certainly be a difficult situation.

Our modules support specifications including the content of the module
("resource loaders") and the dependencies of the module. At run time,
custom ModuleLoader implementations can change the resource loader
list and/or the dependency list at any time, causing the module to be
relinked on the spot; the most useful aspect of this is the ability to
incrementally deploy applications which may include circular
dependencies.
Aside from cycles then what other use-cases do you have here? I read
"change the resource loader list" to mean that the set of resources in
the module changes, which is a bit weird if those resources are class
files that have already been loaded. Maybe there is dynamic code
generation with class bytes generated to the file system or into
somewhere virtual? Or maybe these resources are something else, data
files? I'm just trying to understand what you mean as we are using
different terminology.

Within our module concept, the "class path" is a sort of colloquial term which refers to the series of resource loaders used to locate classes and resources within each module. Functionally it's somewhat analogous to the URLClassPath concept in the JDK, in the way that resources are sought (i.e. by a linear search from start to end of the list).
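A minimal sketch of that linear search, for illustration only. The names here (ResourcePathSketch, ResourceLoader, lookup) are hypothetical and are not our actual API; the in-memory maps stand in for exploded JARs and overlays.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch; none of these names come from a real module system API.
final class ResourcePathSketch {
    /** One entry on a module's "class path": yields the bytes for a name, if it has them. */
    interface ResourceLoader {
        Optional<byte[]> find(String name);
    }

    /** A loader backed by an in-memory map, standing in for an exploded JAR or an overlay. */
    static ResourceLoader of(Map<String, byte[]> content) {
        return name -> Optional.ofNullable(content.get(name));
    }

    /** Linear search from the start to the end of the list; the first hit wins. */
    static Optional<byte[]> lookup(List<ResourceLoader> path, String name) {
        for (ResourceLoader loader : path) {
            Optional<byte[]> hit = loader.find(name);
            if (hit.isPresent()) {
                return hit;
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        ResourceLoader overlay = of(Map.of("a.txt", new byte[] {1}));
        ResourceLoader base = of(Map.of("a.txt", new byte[] {2}, "b.txt", new byte[] {3}));
        List<ResourceLoader> path = List.of(overlay, base);
        System.out.println(lookup(path, "a.txt").get()[0]);    // found in the overlay first
        System.out.println(lookup(path, "b.txt").get()[0]);    // falls through to the base
        System.out.println(lookup(path, "c.txt").isPresent()); // present in neither
    }
}
```

Because the first hit wins, placing an overlay ahead of the base content is enough to shadow a resource without touching the underlying files.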

This is separate from the module dependency list, which, when combined with the resource loader list, is used to construct an index by path (which is a superset of packages that includes not just classes but also resources) which refers to dependencies and/or internal resources.
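To make the index concrete, here is a rough sketch of how such a path index could be built. Everything in it is an assumption made for illustration: the names (PathIndexSketch, buildIndex, pathOf), the string labels for sources, and in particular the choice to list dependencies ahead of local content.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a path index; not taken from any real implementation.
final class PathIndexSketch {
    /** Directory part of a resource name: "org/example/Foo.class" -> "org/example". */
    static String pathOf(String resourceName) {
        int slash = resourceName.lastIndexOf('/');
        return slash < 0 ? "" : resourceName.substring(0, slash);
    }

    /**
     * Builds path -> ordered list of sources. Paths cover every resource, not just
     * classes, which is why they are a superset of packages. Listing dependencies
     * before local content is an assumption of this sketch.
     */
    static Map<String, List<String>> buildIndex(Map<String, List<String>> dependencyPaths,
                                                List<String> localResources) {
        Map<String, List<String>> index = new LinkedHashMap<>();
        dependencyPaths.forEach((dependency, paths) -> {
            for (String path : paths) {
                index.computeIfAbsent(path, k -> new ArrayList<>()).add("dependency:" + dependency);
            }
        });
        for (String resource : localResources) {
            List<String> sources = index.computeIfAbsent(pathOf(resource), k -> new ArrayList<>());
            if (!sources.contains("local")) {
                sources.add("local");
            }
        }
        return index;
    }

    public static void main(String[] args) {
        Map<String, List<String>> index = buildIndex(
                Map.of("lib", List.of("org/example", "org/lib")),
                List.of("org/example/Foo.class", "META-INF/config.properties"));
        index.forEach((path, sources) -> System.out.println(path + " -> " + sources));
    }
}
```

Note that "META-INF" ends up in the index even though it is not a package, which is the sense in which paths are a superset of packages.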

When the resource list is changed, all future lookups for classes and resources will use the new index. If there are already classes loaded from the previous list, and those classes are sufficiently incompatible with the new code, obviously this will result in errors; however, this is usually not the case when (say) making incremental changes during development. This is generally an edge case, but it is one which we presently support.
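One way to picture this is an index that is replaced wholesale, so that only future lookups are affected. This is a simplified sketch under that assumption; RelinkSketch, relink and locate are hypothetical names, and plain strings stand in for resource loaders.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch: the index is an immutable snapshot that is swapped atomically.
final class RelinkSketch {
    private final AtomicReference<Map<String, String>> index = new AtomicReference<>();

    RelinkSketch(Map<String, String> initial) {
        relink(initial);
    }

    /** Publishes a new snapshot; lookups that start after this call see the new index. */
    void relink(Map<String, String> newIndex) {
        index.set(Collections.unmodifiableMap(new HashMap<>(newIndex)));
    }

    /** Returns the source registered for a path in the current snapshot, or null. */
    String locate(String path) {
        return index.get().get(path);
    }

    public static void main(String[] args) {
        RelinkSketch module = new RelinkSketch(Map.of("org/example", "v1-resources"));
        String before = module.locate("org/example"); // like a class that is already loaded
        module.relink(Map.of("org/example", "v2-resources", "org/example/extra", "v2-resources"));
        System.out.println(before);
        System.out.println(module.locate("org/example"));
        System.out.println(module.locate("org/example/extra"));
    }
}
```

The value obtained before the swap is untouched, which mirrors the situation above: whatever was loaded from the previous list stays loaded, and whether it remains compatible with the new content is up to whoever triggered the relink.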

Changing the dependency list has effects that are somewhat similar. Existing loaded classes which have already linked against classes of the previous dependency may malfunction when doing this. But in the hot deployment situation, most of the time these changes are additive, so the effect is generally to enable previously unlinked code to become linkable. This could happen, for example, when deploying a JAR some time after a first JAR was deployed, which resolves some missing dependencies in the first JAR.
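The additive case can be sketched as follows. This is an illustrative model only: HotDeploySketch and its Module class are hypothetical, class names are plain strings, and resolution looks only at direct dependencies so that a cycle cannot cause unbounded recursion.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical sketch of additive relinking during incremental deployment.
final class HotDeploySketch {
    static final class Module {
        final String name;
        final Set<String> classes;
        /** Mutable at run time; adding an entry is the "relink" in this sketch. */
        final List<Module> dependencies = new CopyOnWriteArrayList<>();

        Module(String name, Set<String> classes) {
            this.name = name;
            this.classes = classes;
        }

        /** Returns the name of the module that defines className, or null if unresolved. */
        String resolve(String className) {
            if (classes.contains(className)) {
                return name;
            }
            for (Module dependency : dependencies) {
                if (dependency.classes.contains(className)) {
                    return dependency.name;
                }
            }
            return null;
        }
    }

    public static void main(String[] args) {
        Module first = new Module("first.jar", Set.of("app.Service"));
        System.out.println(first.resolve("app.Repository")); // second JAR not deployed yet

        // Some time later the second JAR arrives; the two depend on each other.
        Module second = new Module("second.jar", Set.of("app.Repository"));
        first.dependencies.add(second);
        second.dependencies.add(first);

        System.out.println(first.resolve("app.Repository"));
        System.out.println(second.resolve("app.Service"));
    }
}
```

Nothing already resolved in the first module changes here; the only effect of adding the dependency is that a previously unresolvable name becomes resolvable, in both directions of the cycle.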

We also support delegating "fallback" class loading decisions to
outside suppliers for dynamic class loading behavior (this was done to
support a dynamic OSGi environment).  The ongoing integrity of the
system is up to the party doing the relinking (the EE deployer or the
OSGi resolver); most of the time it can reason about what is "safe"
and what might cause system breakage (but still might be useful to do
anyway).  These are the features we can't seem to support under
Jigsaw, architecturally speaking.
This sounds like class loader delegation to resolve types that are not
in the module.

Exactly. Our OSGi people have told me in the past that OSGi can't function to spec without this ability.
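For illustration, this kind of fallback delegation can be expressed with the standard ClassLoader.findClass hook, which loadClass calls only after the parent has failed. The FallbackLoaderSketch wrapper, the Function-based fallback and the tryLoad helper are assumptions of this sketch, not our real API or OSGi's.

```java
import java.util.function.Function;

// Hypothetical sketch: an outside supplier decides where unresolved names go.
final class FallbackLoaderSketch {
    static final class FallbackClassLoader extends ClassLoader {
        private final Function<String, ClassLoader> fallback;

        FallbackClassLoader(ClassLoader parent, Function<String, ClassLoader> fallback) {
            super(parent);
            this.fallback = fallback;
        }

        /** Called by loadClass only when the parent could not find the class. */
        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            ClassLoader delegate = fallback.apply(name);
            if (delegate == null) {
                throw new ClassNotFoundException(name);
            }
            return delegate.loadClass(name);
        }
    }

    /** Convenience for the demo: returns null instead of throwing. */
    static Class<?> tryLoad(ClassLoader loader, String name) {
        try {
            return loader.loadClass(name);
        } catch (ClassNotFoundException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        ClassLoader app = FallbackLoaderSketch.class.getClassLoader();
        String self = FallbackLoaderSketch.class.getName();
        // The platform loader cannot see application classes, so the fallback is consulted.
        ClassLoader loader = new FallbackClassLoader(
                ClassLoader.getPlatformClassLoader(),
                name -> name.equals(self) ? app : null);
        System.out.println(tryLoad(loader, self) == FallbackLoaderSketch.class);
        System.out.println(tryLoad(loader, "java.lang.String") == String.class);
        System.out.println(tryLoad(loader, "no.such.Type"));
    }
}
```

The decision about which loader (if any) serves a name is made at lookup time by the supplier, which is what lets a resolver change its answer as bundles come and go.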

Specifically this includes (but is not limited to) changing the
package set associated with a JDK module at run time, something that
this native code block makes impossible.  Also the ability to
dynamically change module dependencies is an essential ingredient to
making this work.
Suppose that module m has package p and p.C has been loaded. Are you
saying that you can drop package p from the module?

Yes - although normally you would only drop p if you knew specifically that p.C *hadn't* been loaded, for obvious reasons. It's probably more common to *add* than to drop.

As things currently stand in JDK 9, packages may be added to modules
at runtime; the main use case is the dynamic proxy to a public interface
in a non-exported package. So I can relate to adding packages for code
gen cases, I'm less sure about a module starting out as an XML API and
suddenly changing into a JDBC driver. Do you really mean the same module
instance?

A more apt example might be to update a part of a module which hasn't yet been loaded.

In my view, architecturally speaking, most of the constraints imposed
by the core module framework should be layer policy.  If the system's
core module layer wants to maintain strict, static integrity, name
constraints, version syntax and semantics, etc., that's fine, but why
should all modules everywhere be forced to the same constraints?
Using module names as an example, then it should be possible to develop
a module that is deployed on the application module path or instantiated
in a layer of modules that a container creates. The author of the module
(that chooses the name) isn't going to know in advance how the module is
deployed. I'm not even sure how such a module could be compiled or how
anyone could depend on it when the characters or format can vary like
this. I see there is an issue on the JSR issues list so I don't want to
say any more on this topic.

The idea of a "module" that I keep referring to is one that is described in the JSR requirements: "... named, self-describing program components consisting of code and data". A plain old JAR file meets this definition just as readily as the current Jigsaw concept. Java 8 javac can compile things that more than meet the definition of "module". The important and relevant part of a module is the ABI it exposes, not its name or even its internal structure (which should be encapsulated in numerous senses of the word). This becomes even more clear when you are assembling an environment consisting of hundreds of artifacts from hundreds of authors. We package many, many artifacts whose authors never gave a thought to modularity at all.

The responsibility of assigning a name to an ABI and behavior has to lie with the environment assembler; it just doesn't make sense in the hands of the original author, who is necessarily concerned only with the parameters of their problem space, and not with that of any greater ecosystem in which their project might find itself. This is the very definition of encapsulation.

For an example of why we may need flexible naming, I can deploy an application into a Java EE container called "my-cool-application.ear". In this case, I might expect my module to be named "my-cool-application.ear" or perhaps "my-cool-application". I might expect nested JARs to correspond to modules named by their relative JAR locations (including "/" separators). Really the only constraint is the validity of the name on the filesystem. Java EE certainly places no limitations on this today.

I might name my modules based on Maven group and artifact IDs, which have different syntax requirements, or by OSGi bundle name. You get the idea.
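As a rough check of how such names fare, the sketch below uses java.lang.module.ModuleDescriptor.newModule, which in the JDK as eventually released (Java 9 and later) throws IllegalArgumentException for a name that is not a legal module name. That API postdates the builds under discussion here, and the example names and the ModuleNameSketch wrapper are made up for illustration.

```java
import java.lang.module.ModuleDescriptor;

// Sketch only: probes which environment-assigned names the JDK accepts as module names.
final class ModuleNameSketch {
    static boolean isLegalModuleName(String name) {
        try {
            ModuleDescriptor.newModule(name);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] candidates = {
            "my.cool.application",         // dotted Java identifiers
            "my-cool-application.ear",     // Java EE deployment name
            "my-cool-application.ear/lib/util.jar", // nested JAR by relative location
            "org.example:my-artifact"      // Maven-style group and artifact ID
        };
        for (String candidate : candidates) {
            System.out.println(candidate + " -> " + isLegalModuleName(candidate));
        }
    }
}
```

Only the first candidate is a sequence of Java identifiers separated by dots; the others contain characters such as "-", "/" or ":" that identifiers do not allow.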

The point is that, yes, you are correct: you *don't* know how a module is going to be deployed; not before Java 9 and likely not after it either. It doesn't even make any sense to define the module name inside the module when the name is going to be 100% dependent on the environment in which the module is used.

You can't, on one hand, define a universal namespace and syntax for modules and their versions in the JDK or establish hard constraints on layer and module graph structure, and on the other hand expect other module systems with differing existing constraints to unify on the JDK module system. You're basically cutting these systems off at the knees and forcing them to reinvent everything, unless you completely coincidentally have a system that already conforms to this structure (if so, you are either very fortunate or maybe starting off in a rather privileged position).

There is no way that existing containers and class loading
environments (other than, apparently, WebLogic) can conform to
Jigsaw's constraints without losing functionality (and I'm trying hard
to find ways to make it work).  This is where most of my raised issues
are coming from.
The module system imposes surprisingly few constraints.  If you are using
your own class loaders then the delegation needs to respect module
readability, something that should not be controversial.

See earlier posts about the controversy of making "public" no longer be "public". The email thread is still unresolved and unanswered.

However it is possible that you are still at the starting line because
you have a dependency graph with cycles and/or modules that don't have
names that can be expressed as a Java identifier, is that right?

Yes, and also isolated module namespaces which nevertheless need to link with one another. Also version syntax and schemes which are not compatible with the Jigsaw scheme. But these issues are all raised in the document.

All these problems seem surmountable to me, but it becomes
substantially more difficult when it is necessary to report all of a
module's packages to the module when it is created, since this
information is now not easily changed.
I'm surprised that this is an issue as module membership is critical to
access control.

Sure, I get that, but it's only critical to *new* access control rules that were introduced with Jigsaw. The classical rules wherein public=public that we have relied on this past decade would ease this situation substantially, since in this case the only reason module information would pass to the JVM would be for diagnostics; since each class has a module membership, the JVM theoretically would already have access to everything it needs for this purpose.

As long as we're talking access control, though: the idea of replacing this mechanism with "friend packages", and using them to selectively expand package-private access, has never been resolved or even seriously discussed as far as I can see. It's been brought up several times and basically ignored. But this idea would allow not only modules but *all* Java code to take advantage of better security, by removing public qualifiers from things that are not public instead of relying on special packages. Package identity is defined by class loader and package name, so modules are not central to the concept, though they can easily take advantage of it. That makes far more sense, to me at least (in particular within the JDK code itself: all those "shared secrets" classes!), and it imposes far less risk on the access control model by keeping it homogeneous instead of making it bi-layered. I would dearly like to resurrect this discussion at some point. Readability and exports have been nothing but a problem for users as far as I can see; the current security model certainly isn't doing me any favors.

--
- DML
