Re: JDK-8275509: (jlink) SystemModulesPlugin generates a jdk.internal.module.SystemModules$all.class which isn't reproducible

2021-12-01 Thread Ioi Lam




On 11/7/21 9:44 PM, Jaikiran Pai wrote:

Hello Ioi,

On 02/11/21 12:13 am, Ioi Lam wrote:

Hi Jaikiran,

Thanks for writing the test case to explore the problems in this area.

Please see my comments below:
...

Generally speaking, CDS has two levels of archiving:

[1] archiving class metadata -- classes in the 
$JAVA_HOME/lib/classlist are considered to be frequently loaded 
classes. They are parsed from classfiles and stored into the CDS 
archive. At run time, instead of parsing the classes from classfiles, 
the VM directly use the pre-parsed version of these classes (as 
InstanceKlass* in C++).


At runtime, all such pre-parsed classes are initially in the "loaded" 
state. This means their static constructors will be executed when 
these classes are referenced for the first time. So as far as Java 
semantic is concerned, there's no difference between a pre-parsed 
class vs a class loaded from classfile.


E.g, the examples of loggers in static initializers will be executed 
at runtime.


[2] archiving heap objects

As shown in your test, we cannot arbitrarily archive the static 
fields that were initialized during -Xshare:dump, because they may 
have environment dependency.


The strategy used by CDS is to archive only a few static fields in a 
small number of carefully hand-picked system classes. You can see the 
list in


https://urldefense.com/v3/__https://github.com/openjdk/jdk/blob/977154400be786c500f36ba14188bff79db57075/src/hotspot/share/cds/heapShared.cpp*L97__;Iw!!ACWV5N9M2RV99hQ!eWsSVUa8qjZ0BWlbOonNdDtE7dcU3w4c9Su5hb24IXirxZFdPoS6wVBMi-78hA$ 



Thank you for that link. That helped. So essentially even though the 
list of classes used for archiving class metadata isn't very tightly 
controlled, the list of objects which are archived in the heap is much 
more selective.


The reason why my PoC ended up reproducing this issue is because it 
just so happened that I selected a class (ModuleDescriptor) which 
(indirectly) is hand-picked in that list of classes that can end up in 
the archived heap.


These static fields are stored into the CDS archive. At run time, 
these fields are essentially copied into the Java heap, and then 
picked up by code like this:


https://urldefense.com/v3/__https://github.com/openjdk/jdk/blob/977154400be786c500f36ba14188bff79db57075/src/java.base/share/classes/jdk/internal/module/ModuleBootstrap.java*L163__;Iw!!ACWV5N9M2RV99hQ!eWsSVUa8qjZ0BWlbOonNdDtE7dcU3w4c9Su5hb24IXirxZFdPoS6wVABkO6Q0w$ 



    public static ModuleLayer boot() {
    Counters.start();

    ModuleLayer bootLayer;
    ArchivedBootLayer archivedBootLayer = ArchivedBootLayer.get();
    if (archivedBootLayer != null) {
    assert canUseArchivedBootLayer();
    bootLayer = archivedBootLayer.bootLayer();
    BootLoader.getUnnamedModule(); // trigger  of 
BootLoader.
CDS.defineArchivedModules(ClassLoaders.platformClassLoader(), 
ClassLoaders.appClassLoader());


    // assume boot layer has at least one module providing a 
service

    // that is mapped to the application class loader.
    JLA.bindToLoader(bootLayer, ClassLoaders.appClassLoader());
    } else {
    bootLayer = boot2();
    }

In the case of the module graph, we remove things that depend on the 
environment (such as CLASSPATH)


https://urldefense.com/v3/__https://github.com/openjdk/jdk/blob/977154400be786c500f36ba14188bff79db57075/src/hotspot/share/cds/heapShared.cpp*L190__;Iw!!ACWV5N9M2RV99hQ!eWsSVUa8qjZ0BWlbOonNdDtE7dcU3w4c9Su5hb24IXirxZFdPoS6wVAx46l0Bg$ 



The remaining parts of the archived module graph only depend on the 
following system properties:


    private static boolean canUseArchivedBootLayer() {
    return getProperty("jdk.module.upgrade.path") == null &&
   getProperty("jdk.module.path") == null &&
   getProperty("jdk.module.patch.0") == null &&   // 
--patch-module
   getProperty("jdk.module.main") == null &&  // 
--module
   getProperty("jdk.module.addmods.0") == null &&    // 
--add-modules
   getProperty("jdk.module.limitmods") == null && // 
--limit-modules
   getProperty("jdk.module.addreads.0") == null &&    // 
--add-reads
   getProperty("jdk.module.addexports.0") == null &&  // 
--add-exports
   getProperty("jdk.module.addopens.0") == null; // 
--add-opens

    }

As a result, we will invalidate the archived module graph if these 
properties differ between dump time and run time. The Java code above 
only asserts that the check has already been done. The actual check 
is done in here:


https://urldefense.com/v3/__https://github.com/openjdk/jdk/blob/977154400be786c500f36ba14188bff79db57075/src/hotspot/share/runtime/arguments.cpp*L1339__;Iw!!ACWV5N9M2RV99hQ!eWsSVUa8qjZ0BWlbOonNdDtE7dcU3w4c9Su5hb24IXirxZFdPoS6wVAwq_becQ$ 



Understood.



Am I misunderstanding the severity of this issue or is this serious 
enough 

Re: [External] : Re: JDK-8275509: (jlink) SystemModulesPlugin generates a jdk.internal.module.SystemModules$all.class which isn't reproducible

2021-11-07 Thread Jaikiran Pai

Hello Ioi,

On 02/11/21 12:13 am, Ioi Lam wrote:

Hi Jaikiran,

Thanks for writing the test case to explore the problems in this area.

Please see my comments below:
...

Generally speaking, CDS has two levels of archiving:

[1] archiving class metadata -- classes in the 
$JAVA_HOME/lib/classlist are considered to be frequently loaded 
classes. They are parsed from classfiles and stored into the CDS 
archive. At run time, instead of parsing the classes from classfiles, 
the VM directly use the pre-parsed version of these classes (as 
InstanceKlass* in C++).


At runtime, all such pre-parsed classes are initially in the "loaded" 
state. This means their static constructors will be executed when 
these classes are referenced for the first time. So as far as Java 
semantic is concerned, there's no difference between a pre-parsed 
class vs a class loaded from classfile.


E.g, the examples of loggers in static initializers will be executed 
at runtime.


[2] archiving heap objects

As shown in your test, we cannot arbitrarily archive the static fields 
that were initialized during -Xshare:dump, because they may have 
environment dependency.


The strategy used by CDS is to archive only a few static fields in a 
small number of carefully hand-picked system classes. You can see the 
list in


https://github.com/openjdk/jdk/blob/977154400be786c500f36ba14188bff79db57075/src/hotspot/share/cds/heapShared.cpp#L97 



Thank you for that link. That helped. So essentially even though the 
list of classes used for archiving class metadata isn't very tightly 
controlled, the list of objects which are archived in the heap is much 
more selective.


The reason why my PoC ended up reproducing this issue is because it just 
so happened that I selected a class (ModuleDescriptor) which 
(indirectly) is hand-picked in that list of classes that can end up in 
the archived heap.


These static fields are stored into the CDS archive. At run time, 
these fields are essentially copied into the Java heap, and then 
picked up by code like this:


https://github.com/openjdk/jdk/blob/977154400be786c500f36ba14188bff79db57075/src/java.base/share/classes/jdk/internal/module/ModuleBootstrap.java#L163 



    public static ModuleLayer boot() {
    Counters.start();

    ModuleLayer bootLayer;
    ArchivedBootLayer archivedBootLayer = ArchivedBootLayer.get();
    if (archivedBootLayer != null) {
    assert canUseArchivedBootLayer();
    bootLayer = archivedBootLayer.bootLayer();
    BootLoader.getUnnamedModule(); // trigger  of 
BootLoader.
CDS.defineArchivedModules(ClassLoaders.platformClassLoader(), 
ClassLoaders.appClassLoader());


    // assume boot layer has at least one module providing a 
service

    // that is mapped to the application class loader.
    JLA.bindToLoader(bootLayer, ClassLoaders.appClassLoader());
    } else {
    bootLayer = boot2();
    }

In the case of the module graph, we remove things that depend on the 
environment (such as CLASSPATH)


https://github.com/openjdk/jdk/blob/977154400be786c500f36ba14188bff79db57075/src/hotspot/share/cds/heapShared.cpp#L190 



The remaining parts of the archived module graph only depend on the 
following system properties:


    private static boolean canUseArchivedBootLayer() {
    return getProperty("jdk.module.upgrade.path") == null &&
   getProperty("jdk.module.path") == null &&
   getProperty("jdk.module.patch.0") == null &&   // 
--patch-module
   getProperty("jdk.module.main") == null &&  // 
--module
   getProperty("jdk.module.addmods.0") == null &&    // 
--add-modules
   getProperty("jdk.module.limitmods") == null && // 
--limit-modules
   getProperty("jdk.module.addreads.0") == null &&    // 
--add-reads
   getProperty("jdk.module.addexports.0") == null &&  // 
--add-exports
   getProperty("jdk.module.addopens.0") == null; // 
--add-opens

    }

As a result, we will invalidate the archived module graph if these 
properties differ between dump time and run time. The Java code above 
only asserts that the check has already been done. The actual check is 
done in here:


https://github.com/openjdk/jdk/blob/977154400be786c500f36ba14188bff79db57075/src/hotspot/share/runtime/arguments.cpp#L1339 



Understood.



Am I misunderstanding the severity of this issue or is this serious 
enough that -Xshare:off should be default (or heap archiving disabled 
somehow by default till this is fixed) to prevent issues like these 
which can at the minimal be hard to debug bugs and on the extreme end 
perhaps leak things from the build server where the JDK was built? I 
guess it all boils down to which exact classes get 
replaced/mapped/copied over from the heap archive? Is there a 
definitive list that can be derived in the current JDK?




I believe the currently implementation is still safe 

Re: [External] : Re: JDK-8275509: (jlink) SystemModulesPlugin generates a jdk.internal.module.SystemModules$all.class which isn't reproducible

2021-11-01 Thread Ioi Lam

Hi Jaikiran,

Thanks for writing the test case to explore the problems in this area.

Please see my comments below:

On 10/29/21 5:55 AM, Jaikiran Pai wrote:

Hello Ioi (and others),

On 22/10/21 1:39 pm, Jaikiran Pai wrote:

Hello Ioi,

On 22/10/21 12:29 pm, Ioi Lam wrote:



On 10/21/21 9:09 PM, Jaikiran Pai wrote:

Hello Ioi,




This is my initial analysis of the problem.

>>> Can anyone think of similar problems that may happen 
elsewhere?


The static constructors of enum classes are executed at both CDS 
dump time and run time. E.g.,


    public enum Modifier {
    OPEN
    }

The  method essentially does this:

    public static final Modifier OPEN = new Modifier("OPEN");

If a reference of Modifier.OPEN is stored inside the CDS archived 
heap during dump time, it will be different than the value of 
Modifier.OPEN that is re-created at runtime by the execution of 
Modifier.


I have almost next to nothing knowledge about CDS internals. My 
only understanding of it is based on some documentation that I have 
read. One of them being this one 
https://docs.oracle.com/en/java/javase/17/vm/class-data-sharing.html#GUID-7EAA3411-8CF0-4D19-BD05-DF5E1780AA91.


Based on that documentation (and other similar ones), it was my 
understanding that CDS was meant to store/share class "metadata" 
like it states in that doc:


"When the JVM starts, the shared archive is memory-mapped to allow 
sharing of read-only JVM metadata for these classes among multiple 
JVM processes."


But from what you explain in that enum example, it looks like it 
also stores class instance data that is computed at build time on 
the host where the JDK image was generated? Did I understand it 
correctly? Is this only for enums or does it also store the static 
initialization data of "class" types too? If it does store the 
static init data of class types too, then wouldn't such data be 
host/build time specific and as such the classes that need to be 
enrolled into the default CDS archive of the JDK should be very 
selective (by reviewing what they do in their static init)? Like I 
said, I haven't looked into this in detail so perhaps it already is 
selective in the JDK build?




Hi Jaikiran,


Thank you very much for the detailed response.

CDS also has the ability to archive Java heap object. Since 
https://bugs.openjdk.java.net/browse/JDK-8244778 , we have archived 
the entire module graph to improve start-up time. At run time, the 
module graph (as well as other archived heap objects) are loaded 
from the CDS archive and put into the Java heap (either through 
memory mapping or copying).


That is interesting and something that I hadn't known.



You can see the related code in 
jdk.internal.module.ModuleBootstrap::boot()
I just had a look at it and it's quite elaborate and it'll take a me 
while to fully grasp it (if at all) given its understandable complexity.


When the module system has started up, the module graph will 
reference a copy of the OPEN enum object that was created as part of 
the archive. However, the Modifier. will also be executed at 
VM start-up, causing a second copy of the OPEN enum object to be 
stored into the static field Modified::OPEN.


Thank you for that detail. That helps me understand this a bit more 
(and opens a few questions). To be clear - the VM startup code which 
creates that other copy, ends up creating that copy because that 
piece of initialization happens even before the module system has 
fully started up and created those references from the archived 
state? Otherwise, the classloaders I believe would be smart enough to 
not run that static init again, had the module system with that graph 
from the archived state been fully "up"?


So would this mean that this not just impacts enums but essentially 
every class referenced within the module system (of just boot layer?) 
that has state which is initialized during static init? To be more 
precise, consider the very common example of loggers which are 
typically static initialized and stored in a static (final) field:


private static final java.util.logger.Logger logger = 
Logger.getLogger(SomeClass.class);


If the boot layer module graph has any classes which has state like 
this, it would then mean that if such classes do get initialized very 
early on during VM startup, then they too are impacted and the module 
graph holding instances of such classes will end up using a different 
instance for such fields as compared to the rest of the application 
code?


In essence, such classes which get accessed early (before module 
system with the archived graph is "up") during VM startup can end up 
_virtually_ having their static initialization run twice (I 
understand it won't be run twice, but that's the end result, isn't it)?


I was really curious why this was only applicable to enums and why 
other static initialization (like the one I explained above) wasn't 
impacted. So I decided to do a small PoC. To come up with an 

Re: JDK-8275509: (jlink) SystemModulesPlugin generates a jdk.internal.module.SystemModules$all.class which isn't reproducible

2021-10-29 Thread Jaikiran Pai

Hello Ioi (and others),

On 22/10/21 1:39 pm, Jaikiran Pai wrote:

Hello Ioi,

On 22/10/21 12:29 pm, Ioi Lam wrote:



On 10/21/21 9:09 PM, Jaikiran Pai wrote:

Hello Ioi,




This is my initial analysis of the problem.

>>> Can anyone think of similar problems that may happen 
elsewhere?


The static constructors of enum classes are executed at both CDS 
dump time and run time. E.g.,


    public enum Modifier {
    OPEN
    }

The  method essentially does this:

    public static final Modifier OPEN = new Modifier("OPEN");

If a reference of Modifier.OPEN is stored inside the CDS archived 
heap during dump time, it will be different than the value of 
Modifier.OPEN that is re-created at runtime by the execution of 
Modifier.


I have almost next to nothing knowledge about CDS internals. My only 
understanding of it is based on some documentation that I have read. 
One of them being this one 
https://docs.oracle.com/en/java/javase/17/vm/class-data-sharing.html#GUID-7EAA3411-8CF0-4D19-BD05-DF5E1780AA91.


Based on that documentation (and other similar ones), it was my 
understanding that CDS was meant to store/share class "metadata" 
like it states in that doc:


"When the JVM starts, the shared archive is memory-mapped to allow 
sharing of read-only JVM metadata for these classes among multiple 
JVM processes."


But from what you explain in that enum example, it looks like it 
also stores class instance data that is computed at build time on 
the host where the JDK image was generated? Did I understand it 
correctly? Is this only for enums or does it also store the static 
initialization data of "class" types too? If it does store the 
static init data of class types too, then wouldn't such data be 
host/build time specific and as such the classes that need to be 
enrolled into the default CDS archive of the JDK should be very 
selective (by reviewing what they do in their static init)? Like I 
said, I haven't looked into this in detail so perhaps it already is 
selective in the JDK build?




Hi Jaikiran,


Thank you very much for the detailed response.

CDS also has the ability to archive Java heap object. Since 
https://bugs.openjdk.java.net/browse/JDK-8244778 , we have archived 
the entire module graph to improve start-up time. At run time, the 
module graph (as well as other archived heap objects) are loaded from 
the CDS archive and put into the Java heap (either through memory 
mapping or copying).


That is interesting and something that I hadn't known.



You can see the related code in 
jdk.internal.module.ModuleBootstrap::boot()
I just had a look at it and it's quite elaborate and it'll take a me 
while to fully grasp it (if at all) given its understandable complexity.


When the module system has started up, the module graph will 
reference a copy of the OPEN enum object that was created as part of 
the archive. However, the Modifier. will also be executed at 
VM start-up, causing a second copy of the OPEN enum object to be 
stored into the static field Modified::OPEN.


Thank you for that detail. That helps me understand this a bit more 
(and opens a few questions). To be clear - the VM startup code which 
creates that other copy, ends up creating that copy because that piece 
of initialization happens even before the module system has fully 
started up and created those references from the archived state? 
Otherwise, the classloaders I believe would be smart enough to not run 
that static init again, had the module system with that graph from the 
archived state been fully "up"?


So would this mean that this not just impacts enums but essentially 
every class referenced within the module system (of just boot layer?) 
that has state which is initialized during static init? To be more 
precise, consider the very common example of loggers which are 
typically static initialized and stored in a static (final) field:


private static final java.util.logger.Logger logger = 
Logger.getLogger(SomeClass.class);


If the boot layer module graph has any classes which has state like 
this, it would then mean that if such classes do get initialized very 
early on during VM startup, then they too are impacted and the module 
graph holding instances of such classes will end up using a different 
instance for such fields as compared to the rest of the application code?


In essence, such classes which get accessed early (before module 
system with the archived graph is "up") during VM startup can end up 
_virtually_ having their static initialization run twice (I understand 
it won't be run twice, but that's the end result, isn't it)?


I was really curious why this was only applicable to enums and why other 
static initialization (like the one I explained above) wasn't impacted. 
So I decided to do a small PoC. To come up with an illustration that 
this impacts regular static initialization where enums aren't involved, 
I decided to add a small piece of code in the 

Re: JDK-8275509: (jlink) SystemModulesPlugin generates a jdk.internal.module.SystemModules$all.class which isn't reproducible

2021-10-22 Thread Jaikiran Pai

Hello Ioi,

On 22/10/21 12:29 pm, Ioi Lam wrote:



On 10/21/21 9:09 PM, Jaikiran Pai wrote:

Hello Ioi,




This is my initial analysis of the problem.

>>> Can anyone think of similar problems that may happen elsewhere?

The static constructors of enum classes are executed at both CDS 
dump time and run time. E.g.,


    public enum Modifier {
    OPEN
    }

The  method essentially does this:

    public static final Modifier OPEN = new Modifier("OPEN");

If a reference of Modifier.OPEN is stored inside the CDS archived 
heap during dump time, it will be different than the value of 
Modifier.OPEN that is re-created at runtime by the execution of 
Modifier.


I have almost next to nothing knowledge about CDS internals. My only 
understanding of it is based on some documentation that I have read. 
One of them being this one 
https://docs.oracle.com/en/java/javase/17/vm/class-data-sharing.html#GUID-7EAA3411-8CF0-4D19-BD05-DF5E1780AA91.


Based on that documentation (and other similar ones), it was my 
understanding that CDS was meant to store/share class "metadata" like 
it states in that doc:


"When the JVM starts, the shared archive is memory-mapped to allow 
sharing of read-only JVM metadata for these classes among multiple 
JVM processes."


But from what you explain in that enum example, it looks like it also 
stores class instance data that is computed at build time on the host 
where the JDK image was generated? Did I understand it correctly? Is 
this only for enums or does it also store the static initialization 
data of "class" types too? If it does store the static init data of 
class types too, then wouldn't such data be host/build time specific 
and as such the classes that need to be enrolled into the default CDS 
archive of the JDK should be very selective (by reviewing what they 
do in their static init)? Like I said, I haven't looked into this in 
detail so perhaps it already is selective in the JDK build?




Hi Jaikiran,


Thank you very much for the detailed response.

CDS also has the ability to archive Java heap object. Since 
https://bugs.openjdk.java.net/browse/JDK-8244778 , we have archived 
the entire module graph to improve start-up time. At run time, the 
module graph (as well as other archived heap objects) are loaded from 
the CDS archive and put into the Java heap (either through memory 
mapping or copying).


That is interesting and something that I hadn't known.



You can see the related code in 
jdk.internal.module.ModuleBootstrap::boot()
I just had a look at it and it's quite elaborate and it'll take a me 
while to fully grasp it (if at all) given its understandable complexity.


When the module system has started up, the module graph will reference 
a copy of the OPEN enum object that was created as part of the 
archive. However, the Modifier. will also be executed at VM 
start-up, causing a second copy of the OPEN enum object to be stored 
into the static field Modified::OPEN.


Thank you for that detail. That helps me understand this a bit more (and 
opens a few questions). To be clear - the VM startup code which creates 
that other copy, ends up creating that copy because that piece of 
initialization happens even before the module system has fully started 
up and created those references from the archived state? Otherwise, the 
classloaders I believe would be smart enough to not run that static init 
again, had the module system with that graph from the archived state 
been fully "up"?


So would this mean that this not just impacts enums but essentially 
every class referenced within the module system (of just boot layer?) 
that has state which is initialized during static init? To be more 
precise, consider the very common example of loggers which are typically 
static initialized and stored in a static (final) field:


private static final java.util.logger.Logger logger = 
Logger.getLogger(SomeClass.class);


If the boot layer module graph has any classes which has state like 
this, it would then mean that if such classes do get initialized very 
early on during VM startup, then they too are impacted and the module 
graph holding instances of such classes will end up using a different 
instance for such fields as compared to the rest of the application code?


In essence, such classes which get accessed early (before module system 
with the archived graph is "up") during VM startup can end up 
_virtually_ having their static initialization run twice (I understand 
it won't be run twice, but that's the end result, isn't it)?


-Jaikiran



Re: JDK-8275509: (jlink) SystemModulesPlugin generates a jdk.internal.module.SystemModules$all.class which isn't reproducible

2021-10-22 Thread Ioi Lam




On 10/21/21 9:09 PM, Jaikiran Pai wrote:

Hello Ioi,




This is my initial analysis of the problem.

>>> Can anyone think of similar problems that may happen elsewhere?

The static constructors of enum classes are executed at both CDS dump 
time and run time. E.g.,


    public enum Modifier {
    OPEN
    }

The  method essentially does this:

    public static final Modifier OPEN = new Modifier("OPEN");

If a reference of Modifier.OPEN is stored inside the CDS archived 
heap during dump time, it will be different than the value of 
Modifier.OPEN that is re-created at runtime by the execution of 
Modifier.


I have almost next to nothing knowledge about CDS internals. My only 
understanding of it is based on some documentation that I have read. 
One of them being this one 
https://docs.oracle.com/en/java/javase/17/vm/class-data-sharing.html#GUID-7EAA3411-8CF0-4D19-BD05-DF5E1780AA91.


Based on that documentation (and other similar ones), it was my 
understanding that CDS was meant to store/share class "metadata" like 
it states in that doc:


"When the JVM starts, the shared archive is memory-mapped to allow 
sharing of read-only JVM metadata for these classes among multiple JVM 
processes."


But from what you explain in that enum example, it looks like it also 
stores class instance data that is computed at build time on the host 
where the JDK image was generated? Did I understand it correctly? Is 
this only for enums or does it also store the static initialization 
data of "class" types too? If it does store the static init data of 
class types too, then wouldn't such data be host/build time specific 
and as such the classes that need to be enrolled into the default CDS 
archive of the JDK should be very selective (by reviewing what they do 
in their static init)? Like I said, I haven't looked into this in 
detail so perhaps it already is selective in the JDK build?




Hi Jaikiran,

CDS also has the ability to archive Java heap object. Since 
https://bugs.openjdk.java.net/browse/JDK-8244778 , we have archived the 
entire module graph to improve start-up time. At run time, the module 
graph (as well as other archived heap objects) are loaded from the CDS 
archive and put into the Java heap (either through memory mapping or 
copying).


You can see the related code in jdk.internal.module.ModuleBootstrap::boot()

When the module system has started up, the module graph will reference a 
copy of the OPEN enum object that was created as part of the archive. 
However, the Modifier. will also be executed at VM start-up, 
causing a second copy of the OPEN enum object to be stored into the 
static field Modified::OPEN.


When an application tries to get the OPEN enum, it will get the copy 
stored in the static field Modified::OPEN. If you walk the module graph, 
you get get a reference to the other copy. Since these are two distinct 
heap objects, the == operator will return comparing them.


Thanks
- Ioi





Re: JDK-8275509: (jlink) SystemModulesPlugin generates a jdk.internal.module.SystemModules$all.class which isn't reproducible

2021-10-21 Thread Jaikiran Pai

Hello Ioi,

On 21/10/21 9:59 pm, Ioi Lam wrote:



On 10/21/21 5:25 AM, Alan Bateman wrote:

On 21/10/2021 10:49, Jaikiran Pai wrote:

:

Digging into it, it appears that since the ModuleDescriptor#equals() 
calls equals() on enum types (in this specific case on 
ModuleDescriptor.Requires.Modifier) and since enum type equality is 
implemented as identity checks, those identity checks are 
surprisingly failing. More specifically 
ModuleDescriptor.Requires.Modifier.MANDATED == 
ModuleDescriptor.Requires.Modifier.MANDATED is equating to false 
because at runtime I see that two different instances of 
ModuleDescriptor.Requires.Modifier.MANDATED have been loaded (by the 
same boot module classloader). Although I use 
ModuleDescriptor.Requires.Modifier.MANDATED as an example, the same 
is reproducible with other enum values like 
ModuleDescriptor.Requires.Modifier.TRANSITIVE.


This appears to be specific to CDS since running the above program 
with:


java -Xshare:off EnumEquality

succeeds and the ModuleDescriptor equality check passes.

In short, it looks like there is some general issue with CDS and 
equality checks with enums and perhaps deserves a separate JBS issue?
I've asked Ioi Lam to comment on this, off-hand I'm not aware of any 
issues with CDS here but it may be related to the archiving of object 
graphs.


-Alan


Hi Jaikiran and Alan,

Thanks for reporting this issue. It's a bug in CDS. I have filed 
https://bugs.openjdk.java.net/browse/JDK-8275731 and am working on a fix.


Thank you for looking into this.



This is my initial analysis of the problem.

>>> Can anyone think of similar problems that may happen elsewhere?

The static constructors of enum classes are executed at both CDS dump 
time and run time. E.g.,


    public enum Modifier {
    OPEN
    }

The  method essentially does this:

    public static final Modifier OPEN = new Modifier("OPEN");

If a reference of Modifier.OPEN is stored inside the CDS archived heap 
during dump time, it will be different than the value of Modifier.OPEN 
that is re-created at runtime by the execution of Modifier.


I have almost next to nothing knowledge about CDS internals. My only 
understanding of it is based on some documentation that I have read. One 
of them being this one 
https://docs.oracle.com/en/java/javase/17/vm/class-data-sharing.html#GUID-7EAA3411-8CF0-4D19-BD05-DF5E1780AA91.


Based on that documentation (and other similar ones), it was my 
understanding that CDS was meant to store/share class "metadata" like it 
states in that doc:


"When the JVM starts, the shared archive is memory-mapped to allow 
sharing of read-only JVM metadata for these classes among multiple JVM 
processes."


But from what you explain in that enum example, it looks like it also 
stores class instance data that is computed at build time on the host 
where the JDK image was generated? Did I understand it correctly? Is 
this only for enums or does it also store the static initialization data 
of "class" types too? If it does store the static init data of class 
types too, then wouldn't such data be host/build time specific and as 
such the classes that need to be enrolled into the default CDS archive 
of the JDK should be very selective (by reviewing what they do in their 
static init)? Like I said, I haven't looked into this in detail so 
perhaps it already is selective in the JDK build?


-Jaikiran




Re: JDK-8275509: (jlink) SystemModulesPlugin generates a jdk.internal.module.SystemModules$all.class which isn't reproducible

2021-10-21 Thread Ioi Lam




On 10/21/21 5:25 AM, Alan Bateman wrote:

On 21/10/2021 10:49, Jaikiran Pai wrote:

:

Digging into it, it appears that since the ModuleDescriptor#equals() 
calls equals() on enum types (in this specific case on 
ModuleDescriptor.Requires.Modifier) and since enum type equality is 
implemented as identity checks, those identity checks are 
surprisingly failing. More specifically 
ModuleDescriptor.Requires.Modifier.MANDATED == 
ModuleDescriptor.Requires.Modifier.MANDATED is equating to false 
because at runtime I see that two different instances of 
ModuleDescriptor.Requires.Modifier.MANDATED have been loaded (by the 
same boot module classloader). Although I use 
ModuleDescriptor.Requires.Modifier.MANDATED as an example, the same 
is reproducible with other enum values like 
ModuleDescriptor.Requires.Modifier.TRANSITIVE.


This appears to be specific to CDS since running the above program with:

java -Xshare:off EnumEquality

succeeds and the ModuleDescriptor equality check passes.

In short, it looks like there is some general issue with CDS and 
equality checks with enums and perhaps deserves a separate JBS issue?
I've asked Ioi Lam to comment on this, off-hand I'm not aware of any 
issues with CDS here but it may be related to the archiving of object 
graphs.


-Alan


Hi Jaikiran and Alan,

Thanks for reporting this issue. It's a bug in CDS. I have filed 
https://bugs.openjdk.java.net/browse/JDK-8275731 and am working on a fix.


This is my initial analysis of the problem.

>>> Can anyone think of similar problems that may happen elsewhere?

The static constructors of enum classes are executed at both CDS dump 
time and run time. E.g.,


    public enum Modifier {
    OPEN
    }

The  method essentially does this:

    public static final Modifier OPEN = new Modifier("OPEN");

If a reference of Modifier.OPEN is stored inside the CDS archived heap 
during dump time, it will be different than the value of Modifier.OPEN 
that is re-created at runtime by the execution of Modifier.


Thanks
- Ioi


Re: JDK-8275509: (jlink) SystemModulesPlugin generates a jdk.internal.module.SystemModules$all.class which isn't reproducible

2021-10-21 Thread Alan Bateman

On 21/10/2021 10:49, Jaikiran Pai wrote:

:

Digging into it, it appears that since the ModuleDescriptor#equals() 
calls equals() on enum types (in this specific case on 
ModuleDescriptor.Requires.Modifier) and since enum type equality is 
implemented as identity checks, those identity checks are surprisingly 
failing. More specifically ModuleDescriptor.Requires.Modifier.MANDATED 
== ModuleDescriptor.Requires.Modifier.MANDATED is equating to false 
because at runtime I see that two different instances of 
ModuleDescriptor.Requires.Modifier.MANDATED have been loaded (by the 
same boot module classloader). Although I use 
ModuleDescriptor.Requires.Modifier.MANDATED as an example, the same is 
reproducible with other enum values like 
ModuleDescriptor.Requires.Modifier.TRANSITIVE.


This appears to be specific to CDS since running the above program with:

java -Xshare:off EnumEquality

succeeds and the ModuleDescriptor equality check passes.

In short, it looks like there is some general issue with CDS and 
equality checks with enums and perhaps deserves a separate JBS issue?
I've asked Ioi Lam to comment on this, off-hand I'm not aware of any 
issues with CDS here but it may be related to the archiving of object 
graphs.


-Alan


Re: JDK-8275509: (jlink) SystemModulesPlugin generates a jdk.internal.module.SystemModules$all.class which isn't reproducible

2021-10-21 Thread Jaikiran Pai

Hello Alan,

On 19/10/21 7:40 pm, Alan Bateman wrote:

On 19/10/2021 14:49, Jaikiran Pai wrote:


Ah! So this exact same investigation had already happened a few weeks 
back then. I haven't subscribed to that list, so missed it. I see in 
one of those messages this part:


"Off hand I can't think of any issues with the ModuleDescriptor 
hashCode. It is computed at link time and should be deterministic. If 
I were to guess then then this may be something to do with the module 
version recorded at compile-time at that is one of the components 
that the hash is based on."


To be clear, is the ModuleDescriptor#hashCode() expected to return 
reproducible (same) hashCode across multiple runs? What currently 
changes the hashCode() across multiple runs is various components 
within ModuleDescriptor's hashCode() implementation using the 
hashCode() of the enums (specifically the various Modifier enums).


The discussion on jigsaw-dev didn't get to the bottom of the issue at 
the time, mostly because it wasn't easy to reproduce.


Now that the issue is clearer then we should fix it. Aside from 
reproducible builds then I expect it is possible to use a 
ModuleDescriptor.Builder to create a ModuleDescriptor that is equal to 
a ModuleDescriptor in boot layer configuration but with a different 
hashCode.


Based on this input, one of the tests I have included for verifying this 
proposed hashCode fix, involves dealing with a boot layer 
ModuleDescriptor and then verifying it's hashCode against a 
ModuleDescriptor that is built for the same module using the 
ModuleDescriptor.Builder. It does reproduce the hashCode issue. However, 
that test seems to have exposed a different bug with CDS and equality 
checks against enums (which impact ModuleDescriptor#equals()).


More precisely, consider this trivial Java code:

import java.lang.module.*;
import java.io.*;
import java.util.*;

public class EnumEquality {

    public static void main(final String[] args) throws Exception {
    String moduleName = "java.sql";
    // load the "java.sql" module from boot layer
    Optional bootModule = 
ModuleLayer.boot().findModule(moduleName);

    if (bootModule.isEmpty()) {
    throw new RuntimeException(moduleName + " module is missing 
in boot layer");

    }
    ModuleDescriptor m1 = bootModule.get().getDescriptor();
    // now recreate the same module using the ModuleDescriptor.read 
on the module's module-info.class

    ModuleDescriptor m2;
    try (InputStream moduleInfo = 
bootModule.get().getResourceAsStream("module-info.class")) {

    if (moduleInfo == null) {
    throw new RuntimeException("Could not locate 
module-info.class in " + moduleName + " module");

    }
    // this will internally use a ModuleDescriptor.Builder to 
construct the ModuleDescriptor

    m2 = ModuleDescriptor.read(moduleInfo);
    }
    if (!m1.equals(m2)) {
    // root cause - the enums aren't "equal"
    for (ModuleDescriptor.Requires r1 : m1.requires()) {
    if (r1.name().equals("java.base")) {
    for (ModuleDescriptor.Requires r2 : m2.requires()) {
    if (r2.name().equals("java.base")) {
    System.out.println("Modifiers r1 " + 
r1.modifiers() + " r2 " + r2.modifiers()
    + " --> equals? " + 
r1.modifiers().equals(r2.modifiers()));

    }
    }
    }
    }

    throw new RuntimeException("ModuleDescriptor(s) aren't 
equal: \n" + m1 + "\n" + m2);

    }
    System.out.println("Success");
    }
}

This program uses "java.sql" as the module under test. This "java.sql" 
is a boot layer module and among other things has:


    requires transitive java.logging;
    requires transitive java.transaction.xa;
    requires transitive java.xml;

in its module definition[1]. The program first loads this module's 
ModuleDescriptor into an instance m1, using the boot() module layer. It 
then "reconstructs" this same module by reading the module-info.class of 
this module, using the ModuleDescriptor.read() API (which internally 
calls ModuleDescriptor.Builder). This is stored into m2. m1 and m2 are 
then checked for equality (using a call to equals() method). This 
equality check keeps failing consistently.


Digging into it, it appears that since the ModuleDescriptor#equals() 
calls equals() on enum types (in this specific case on 
ModuleDescriptor.Requires.Modifier) and since enum type equality is 
implemented as identity checks, those identity checks are surprisingly 
failing. More specifically ModuleDescriptor.Requires.Modifier.MANDATED 
== ModuleDescriptor.Requires.Modifier.MANDATED is equating to false 
because at runtime I see that two different instances of 
ModuleDescriptor.Requires.Modifier.MANDATED have been loaded (by the 
same boot module classloader). Although I 

Re: JDK-8275509: (jlink) SystemModulesPlugin generates a jdk.internal.module.SystemModules$all.class which isn't reproducible

2021-10-19 Thread Alan Bateman

On 19/10/2021 14:49, Jaikiran Pai wrote:


Ah! So this exact same investigation had already happened a few weeks 
back then. I haven't subscribed to that list, so missed it. I see in 
one of those messages this part:


"Off hand I can't think of any issues with the ModuleDescriptor 
hashCode. It is computed at link time and should be deterministic. If 
I were to guess then then this may be something to do with the module 
version recorded at compile-time at that is one of the components that 
the hash is based on."


To be clear, is the ModuleDescriptor#hashCode() expected to return 
reproducible (same) hashCode across multiple runs? What currently 
changes the hashCode() across multiple runs is various components 
within ModuleDescriptor's hashCode() implementation using the 
hashCode() of the enums (specifically the various Modifier enums).


The discussion on jigsaw-dev didn't get to the bottom of the issue at 
the time, mostly because it wasn't easy to reproduce.


Now that the issue is clearer then we should fix it. Aside from 
reproducible builds then I expect it is possible to use a 
ModuleDescriptor.Builder to create a ModuleDescriptor that is equal to a 
ModuleDescriptor in boot layer configuration but with a different hashCode.


On the surface, changing the MD hash code to use the ordinal of the 
modifiers should be okay. If the enums values are re-ordered then it 
should also be okay but clearly doing so would end up with a build that 
is not binary identical to a build done with a different order. I think 
we need to think though if there are any implications.


-Alan


Re: JDK-8275509: (jlink) SystemModulesPlugin generates a jdk.internal.module.SystemModules$all.class which isn't reproducible

2021-10-19 Thread Jaikiran Pai

Hello Claes,

On 19/10/21 7:07 pm, Claes Redestad wrote:

On 2021-10-19 14:31, Jaikiran Pai wrote:
The other option I experimented with was to make 
ModuleDescriptor#hashCode() generate the same hashcode across 
multiple JVM runs. Although I do have a "working" version of that 
change, I decided not to spend too much time on it because the 
java.lang.Object#hashCode() contract itself clearly states that this 
value isn't expected to be same across multiple JVM runs. So whatever 
I do here is going to be brittle. 


I'm assuming the cause for ModuleDescriptor#hashCode being is due to 
the various enums not having an explicitly defined hashCode? 


You are right. That was what was causing the change in values. I just 
sent a separate reply in this thread with additional details.



I think this should be fixed.


Okay, I'll pursue this path then.



Either way you're going to be brittle since the patch to emit a 0 is 
relying on the ModuleDescriptor#hashCode implementation disallowing 0 
as a hash value (a 0 will force a recalculation). While it'll only 
happen at most once per module these relatively expensive calculations 
are something we want to avoid on startup if we can do so for free.


Understood. Thank you for these inputs.

-Jaikiran




Re: JDK-8275509: (jlink) SystemModulesPlugin generates a jdk.internal.module.SystemModules$all.class which isn't reproducible

2021-10-19 Thread Jaikiran Pai

Hello Alan,

On 19/10/21 6:59 pm, Alan Bateman wrote:

On 19/10/2021 13:31, Jaikiran Pai wrote:
This relates to the intermittent failures in 
tools/jlink/JLinkReproducibleTest.java test case which has been 
ProblemListed for a while now. The root cause is 
https://bugs.openjdk.java.net/browse/JDK-8275509. I couldn't find any 
specific mailing lists for jlink tool and I remember seeing 
jlink/jpackage related discussions on this mailing list previously, 
so creating this discussion here.


Reproducible builds have been a game of whack-a-mole. Many issues have 
been fixed in recent releases to the JDK is a lot better than it used 
to be. As it happens, someone else interested in reproducible builds 
brought up the issue of the hashCode on jigsaw-dev a few weeks ago [1].


Ah! So this exact same investigation had already happened a few weeks 
back then. I haven't subscribed to that list, so missed it. I see in one 
of those messages this part:


"Off hand I can't think of any issues with the ModuleDescriptor 
hashCode. It is computed at link time and should be deterministic. If I 
were to guess then then this may be something to do with the module 
version recorded at compile-time at that is one of the components that 
the hash is based on."


To be clear, is the ModuleDescriptor#hashCode() expected to return 
reproducible (same) hashCode across multiple runs? What currently 
changes the hashCode() across multiple runs is various components within 
ModuleDescriptor's hashCode() implementation using the hashCode() of the 
enums (specifically the various Modifier enums). For example here 
https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/lang/module/ModuleDescriptor.java#L330 
(the mods.hashCode()). Since the hashCode() returned by enums is 
literally through a call to java.lang.Object#hashCode(), those 
hashCode() value ended up changing across JVM runs, in one of the setups 
I was testing (which I didn't consider a surprise since that's what the 
Object#hashCode() stated).


The other approach that I talked about in my previous mail of trying to 
make ModuleDescriptor#hashCode() reproducible involved using the enum's 
ordinal value as a part of the hashCode() computation instead of calling 
the enum's hashCode() method. Very crudely, it looked like:


diff --git 
a/src/java.base/share/classes/java/lang/module/ModuleDescriptor.java 
b/src/java.base/share/classes/java/lang/module/ModuleDescriptor.java

index a412dd753cc..13c8cd04360 100644
--- a/src/java.base/share/classes/java/lang/module/ModuleDescriptor.java
+++ b/src/java.base/share/classes/java/lang/module/ModuleDescriptor.java
@@ -327,7 +327,7 @@ public class ModuleDescriptor
  */
 @Override
 public int hashCode() {
-    int hash = name.hashCode() * 43 + mods.hashCode();
+    int hash = name.hashCode() * 43 + enumOrdinalHashCode(mods);
 if (compiledVersion != null)
 hash = hash * 43 + compiledVersion.hashCode();
 if (rawCompiledVersion != null)
@@ -505,7 +505,7 @@ public class ModuleDescriptor
  */
 @Override
 public int hashCode() {
-    int hash = mods.hashCode();
+    int hash = enumOrdinalHashCode(mods);
 hash = hash * 43 + source.hashCode();
 return hash * 43 + targets.hashCode();
 }
@@ -708,7 +708,7 @@ public class ModuleDescriptor
  */
 @Override
 public int hashCode() {
-    int hash = mods.hashCode();
+    int hash = enumOrdinalHashCode(mods);
 hash = hash * 43 + source.hashCode();
 return hash * 43 + targets.hashCode();
 }
@@ -2261,7 +2261,7 @@ public class ModuleDescriptor
 int hc = hash;
 if (hc == 0) {
 hc = name.hashCode();
-    hc = hc * 43 + Objects.hashCode(modifiers);
+    hc = hc * 43 + enumOrdinalHashCode(modifiers);
 hc = hc * 43 + requires.hashCode();
 hc = hc * 43 + Objects.hashCode(packages);
 hc = hc * 43 + exports.hashCode();
@@ -2546,6 +2546,21 @@ public class ModuleDescriptor
 .collect(Collectors.joining(" "));
 }

+    /**
+ * Generates and returns a hashcode for the enum instances. The 
returned hashcode
+ * is a sum of each of the enum instances' {@link Enum#ordinal() 
ordinal} value.

+ */
+    private static int enumOrdinalHashCode(final IterableEnum> enums) {

+    int h = 0;
+    for (final Enum e : enums) {
+    if (e == null) {
+    continue;
+    }
+    h += e.ordinal();
+    }
+    return h;
+    }
+
 private static >
 int compare(T obj1, T obj2) {
 if (obj1 != null) {

With this change (and this change only, no changes in 
SystemModulesPlugin were needed) I was able to consistently run that 
test without any failures. But I didn't pursue this effort further 
because I thought making 

Re: JDK-8275509: (jlink) SystemModulesPlugin generates a jdk.internal.module.SystemModules$all.class which isn't reproducible

2021-10-19 Thread Claes Redestad

On 2021-10-19 14:31, Jaikiran Pai wrote:
The other option I experimented with was to make 
ModuleDescriptor#hashCode() generate the same hashcode across multiple 
JVM runs. Although I do have a "working" version of that change, I 
decided not to spend too much time on it because the 
java.lang.Object#hashCode() contract itself clearly states that this 
value isn't expected to be same across multiple JVM runs. So whatever 
I do here is going to be brittle. 


I'm assuming the cause for ModuleDescriptor#hashCode being is due to the 
various enums not having an explicitly defined hashCode? I think this 
should be fixed.


Either way you're going to be brittle since the patch to emit a 0 is 
relying on the ModuleDescriptor#hashCode implementation disallowing 0 as 
a hash value (a 0 will force a recalculation). While it'll only happen 
at most once per module these relatively expensive calculations are 
something we want to avoid on startup if we can do so for free.


/Claes



Re: JDK-8275509: (jlink) SystemModulesPlugin generates a jdk.internal.module.SystemModules$all.class which isn't reproducible

2021-10-19 Thread Alan Bateman

On 19/10/2021 13:31, Jaikiran Pai wrote:
This relates to the intermittent failures in 
tools/jlink/JLinkReproducibleTest.java test case which has been 
ProblemListed for a while now. The root cause is 
https://bugs.openjdk.java.net/browse/JDK-8275509. I couldn't find any 
specific mailing lists for jlink tool and I remember seeing 
jlink/jpackage related discussions on this mailing list previously, so 
creating this discussion here.


Reproducible builds have been a game of whack-a-mole. Many issues have 
been fixed in recent releases to the JDK is a lot better than it used to 
be. As it happens, someone else interested in reproducible builds 
brought up the issue of the hashCode on jigsaw-dev a few weeks ago [1].


The system modules plugin pre-dates the archiving of the module graph in 
the CDS archive so the performance profile is probably very different 
from when the plugin was created. I'll defer to Claes on what benchmarks 
to run.


As regards changing the plugin then if we aren't creating the MD with a 
hashCode then I'd prefer to drop the hashCode parameter from the Builder 
and the shared secret. In other words, do some cleanup rather than leave 
it orfaned.


-Alan

[1] 
https://mail.openjdk.java.net/pipermail/jigsaw-dev/2021-September/014708.html





JDK-8275509: (jlink) SystemModulesPlugin generates a jdk.internal.module.SystemModules$all.class which isn't reproducible

2021-10-19 Thread Jaikiran Pai
This relates to the intermittent failures in 
tools/jlink/JLinkReproducibleTest.java test case which has been 
ProblemListed for a while now. The root cause is 
https://bugs.openjdk.java.net/browse/JDK-8275509. I couldn't find any 
specific mailing lists for jlink tool and I remember seeing 
jlink/jpackage related discussions on this mailing list previously, so 
creating this discussion here.


The jlink tool uses plugins to let them transform contents of the 
modules that are part of the image. One such plugin is the 
SystemModulesPlugin and as per its javadoc specification its role is to 
prevent parsing of module-info.class files during JVM startup. To do so, 
it creates specialized dynamically generated class files (during jlink). 
The class file(s) are then bundled into the image that gets generated.


One such class file is a dynamically generated 
jdk/internal/module/SystemModules$all.class. The SystemModulesPlugin 
generates the bytecode for this class. The details of that bytecode are 
very implementation specific. One part of the bytecode generation 
involves bytecode for an internal method called "moduleDescriptors()". 
One of the statements generated for this method implementation is a call 
to jdk.internal.module.Builder.build(int hashCode) method[1]. What this 
call does is, it creates a (pre-populated/validated) instance of the 
java.lang.module.ModuleDescriptor.


The SystemModulesPlugin, when generating the bytecode of this method, 
uses the hashCode() of the ModuleDescriptor of the current JVM instance 
(through a typical ModuleDescriptor#hashCode() call)[2]. By doing so, it 
ends up generating bytecode which "embeds" the current runtime specific 
hashcode into the generated class file.


The contract of java.lang.Object#hashCode() states:

"
...
This integer need not remain consistent from one execution of an 
application to another execution of the same application.


"

Effectively, what this means is, if jlink is used to generate an image 
with the exact same set of modules (requirements, package, exports, 
opens etc...) it still can (and does) end up generating a 
jdk/internal/module/SystemModules$all.class file whose binary content 
will differ across these runs, thus being non-reproducible.


The implementation in java.lang.module.ModuleDescriptor is such that if 
the hashcode passed to it (during construction) is 0, then it lazily 
computes the correct hashcode whenever the invocation of hashCode() 
happens at runtime. What that means is, the SystemModulesPlugin in its 
bytecode generation for the moduleDescriptors() method could always pass 
0 as the hashcode to the jdk.internal.module.Builder.build(int hashCode) 
call. That's something that I experimented with and after that change, 
with 100s of runs of the JLinkReproducibleTest, I no longer get any 
failures. However, given that the SystemModulesPlugin's goal appears to 
be to reduce the booting time for system modules, I was wondering if 
this change would introduce any performance penalty that is big enough. 
What this change will end up doing is, whenever the next time the 
hashCode method on instances of the system module's ModuleDescriptor 
gets called, it will have to compute it and that computation is 
relatively expensive (given how many "components" it uses to calculate 
it[3]). It's a (mostly) one time thing for each instance, but I don't 
know how expensive that would be. Do we have any existing benchmarks in 
this area that I could reuse to see what performance impact this might 
have? Keeping aside the performance issue for a bit, is this proposed 
patch something worth considering:


--- 
a/src/jdk.jlink/share/classes/jdk/tools/jlink/internal/plugins/SystemModulesPlugin.java
+++ 
b/src/jdk.jlink/share/classes/jdk/tools/jlink/internal/plugins/SystemModulesPlugin.java
@@ -1099,7 +1099,13 @@ public final class SystemModulesPlugin extends 
AbstractPlugin {

 mv.visitVarInsn(ALOAD, MD_VAR);
 pushInt(mv, index);
 mv.visitVarInsn(ALOAD, BUILDER_VAR);
-    mv.visitLdcInsn(md.hashCode());
+    // Let the ModuleDescriptor hashcode be computed at 
runtime.

+    // Embedding the current hashcode of the ModuleDescriptor
+    // into the bytecode of a generated class can cause the 
generated
+    // bytecode to be not reproducible, since an object's 
hashcode is allowed

+    // to change across JVM runs.
+    mv.visitLdcInsn(0); // the hashcode to be passed to the
+  // 
jdk.internal.module.Builder.build(int) method
 mv.visitMethodInsn(INVOKEVIRTUAL, 
MODULE_DESCRIPTOR_BUILDER,

 "build", "(I)Ljava/lang/module/ModuleDescriptor;",
 false);


The other option I experimented with was to make 
ModuleDescriptor#hashCode() generate the same hashcode across multiple 
JVM runs. Although I do have a