Hello Alan,

On 19/10/21 6:59 pm, Alan Bateman wrote:
On 19/10/2021 13:31, Jaikiran Pai wrote:
This relates to the intermittent failures in tools/jlink/JLinkReproducibleTest.java test case which has been ProblemListed for a while now. The root cause is https://bugs.openjdk.java.net/browse/JDK-8275509. I couldn't find any specific mailing lists for jlink tool and I remember seeing jlink/jpackage related discussions on this mailing list previously, so creating this discussion here.

Reproducible builds have been a game of whack-a-mole. Many issues have been fixed in recent releases to the JDK is a lot better than it used to be. As it happens, someone else interested in reproducible builds brought up the issue of the hashCode on jigsaw-dev a few weeks ago [1].

Ah! So this exact same investigation had already happened a few weeks back then. I haven't subscribed to that list, so missed it. I see in one of those messages this part:

"Off hand I can't think of any issues with the ModuleDescriptor hashCode. It is computed at link time and should be deterministic. If I were to guess then then this may be something to do with the module version recorded at compile-time at that is one of the components that the hash is based on."

To be clear, is the ModuleDescriptor#hashCode() expected to return reproducible (same) hashCode across multiple runs? What currently changes the hashCode() across multiple runs is various components within ModuleDescriptor's hashCode() implementation using the hashCode() of the enums (specifically the various Modifier enums). For example here https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/lang/module/ModuleDescriptor.java#L330 (the mods.hashCode()). Since the hashCode() returned by enums is literally through a call to java.lang.Object#hashCode(), those hashCode() value ended up changing across JVM runs, in one of the setups I was testing (which I didn't consider a surprise since that's what the Object#hashCode() stated).

The other approach that I talked about in my previous mail of trying to make ModuleDescriptor#hashCode() reproducible involved using the enum's ordinal value as a part of the hashCode() computation instead of calling the enum's hashCode() method. Very crudely, it looked like:

diff --git a/src/java.base/share/classes/java/lang/module/ModuleDescriptor.java b/src/java.base/share/classes/java/lang/module/ModuleDescriptor.java
index a412dd753cc..13c8cd04360 100644
--- a/src/java.base/share/classes/java/lang/module/ModuleDescriptor.java
+++ b/src/java.base/share/classes/java/lang/module/ModuleDescriptor.java
@@ -327,7 +327,7 @@ public class ModuleDescriptor
          */
         @Override
         public int hashCode() {
-            int hash = name.hashCode() * 43 + mods.hashCode();
+            int hash = name.hashCode() * 43 + enumOrdinalHashCode(mods);
             if (compiledVersion != null)
                 hash = hash * 43 + compiledVersion.hashCode();
             if (rawCompiledVersion != null)
@@ -505,7 +505,7 @@ public class ModuleDescriptor
          */
         @Override
         public int hashCode() {
-            int hash = mods.hashCode();
+            int hash = enumOrdinalHashCode(mods);
             hash = hash * 43 + source.hashCode();
             return hash * 43 + targets.hashCode();
         }
@@ -708,7 +708,7 @@ public class ModuleDescriptor
          */
         @Override
         public int hashCode() {
-            int hash = mods.hashCode();
+            int hash = enumOrdinalHashCode(mods);
             hash = hash * 43 + source.hashCode();
             return hash * 43 + targets.hashCode();
         }
@@ -2261,7 +2261,7 @@ public class ModuleDescriptor
         int hc = hash;
         if (hc == 0) {
             hc = name.hashCode();
-            hc = hc * 43 + Objects.hashCode(modifiers);
+            hc = hc * 43 + enumOrdinalHashCode(modifiers);
             hc = hc * 43 + requires.hashCode();
             hc = hc * 43 + Objects.hashCode(packages);
             hc = hc * 43 + exports.hashCode();
@@ -2546,6 +2546,21 @@ public class ModuleDescriptor
                 .collect(Collectors.joining(" "));
     }

+    /**
+     * Generates and returns a hashcode for the enum instances. The returned hashcode +     * is a sum of each of the enum instances' {@link Enum#ordinal() ordinal} value.
+     */
+    private static int enumOrdinalHashCode(final Iterable<? extends Enum<?>> enums) {
+        int h = 0;
+        for (final Enum<?> e : enums) {
+            if (e == null) {
+                continue;
+            }
+            h += e.ordinal();
+        }
+        return h;
+    }
+
     private static <T extends Object & Comparable<? super T>>
     int compare(T obj1, T obj2) {
         if (obj1 != null) {

With this change (and this change only, no changes in SystemModulesPlugin were needed) I was able to consistently run that test without any failures. But I didn't pursue this effort further because I thought making ModuleDescriptor#hashCode() return reproducible result wasn't a goal.

Do you think we should be making ModuleDescriptor#hashCode() deterministic and reproducible across runs? If so, if that above patch looks reasonable, I can clean it up a bit and run some further tests.


-Jaikiran

Reply via email to