> This PR introduces an option to output stable names for the lambda classes in 
> the JDK. A stable name consists of two parts: the first is the 
> predefined value `$$Lambda$` appended to the name of the lambda capturing 
> class, and the second is a 64-bit hash value. Thus, it looks like 
> `lambdaCapturingClass$$Lambda$hashValue`.
> Parameters used to create a stable hash are a superset of the parameters used 
> for lambda class archiving when the CDS dumping option is enabled. The 
> parameters common to both are used in the same form as in the low-level 
> implementation of the archiving process 
> (`SystemDictionaryShared::add_lambda_proxy_class`).
> We decided to use the `CRC32` algorithm from the standard Java library. We 
> compute two 32-bit hashes from the parameters used to create stable names 
> and then combine those two 32-bit hashes into one 64-bit hash value.
> We chose `CRC32` because it is a well-specified hash function, and we don't 
> need to write additional code in the JDK. `SHA-256`, `MD5`, and all other hash 
> functions that rely on `MessageDigest` use lambdas in the implementation, so 
> they are unsuitable for our purpose. We also considered a few different hash 
> functions with a low collision rate. All these functions would require at 
> least 100 lines of additional code in the JDK. The best alternative we found 
> is 64-bit `MurmurHash2`: 
> https://commons.apache.org/proper/commons-codec/jacoco/org.apache.commons.codec.digest/MurmurHash2.java.html.
> If adding a new hash implementation (e.g., Murmur2) to the JDK is 
> preferred, this PR can be easily modified.
> We also consulted a post that compares different hash functions: 
> https://softwareengineering.stackexchange.com/questions/49550/which-hashing-algorithm-is-best-for-uniqueness-and-speed/145633#145633
> We also tested the `CRC32` hash function against half a billion generated 
> strings, and there were no collisions. Note that the capturing-class name is 
> also part of the lambda class name, so collisions can only occur between 
> lambdas of the same class. Given the relatively low number of lambdas per 
> class, we do not expect name collisions. Nevertheless, every tool that uses 
> this feature should handle potential collisions on its own.
> A general approximation of hash collision probabilities is available 
> here: https://preshing.com/20110504/hash-collision-probabilities/.
> 
> The JDK currently appends the value of an atomic integer counter after 
> `$$Lambda$`, so the names of the lambdas depend on the creation order. In 
> `TestStableLambdaNames`, we generate all the lambdas twice. In the first 
> run, the method `createPlainLambdas` generates the following lambdas:
> 
> - TestStableLambdaNames$$Lambda$1/0x0000000800c00400
> - TestStableLambdaNames$$Lambda$2/0x0000000800c01800
> - TestStableLambdaNames$$Lambda$3/0x0000000800c01a38
> 
> The same method in the second run generates lambdas with different names:
> 
> - TestStableLambdaNames$$Lambda$1471/0x0000000800d10000
> - TestStableLambdaNames$$Lambda$1472/0x0000000800d10238
> - TestStableLambdaNames$$Lambda$1473/0x0000000800d10470
> 
> If we use the introduced flag, the generated lambdas are:
> 
> - TestStableLambdaNames$$Lambda$65ba26bbc6c7500d/0x0000000800c00400
> - TestStableLambdaNames$$Lambda$1569c8c4abe3ab18/0x0000000800c01800
> - TestStableLambdaNames$$Lambda$493c0ecaaf682428/0x0000000800c01a38
> 
> In the second run of the method, the generated lambdas are:
> 
> - TestStableLambdaNames$$Lambda$65ba26bbc6c7500d/0x0000000800d10000
> - TestStableLambdaNames$$Lambda$1569c8c4abe3ab18/0x0000000800d10238
> - TestStableLambdaNames$$Lambda$493c0ecaaf682428/0x0000000800d10470
> 
> We can see that the introduced hash value does not change between the two 
> calls of the method `createPlainLambdas`, which was not the case in the JDK 
> run without this change. These names are taken directly from the test.
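
The naming scheme described in the quoted text can be sketched roughly as follows. This is only an illustration, not the code in the PR: the class `StableLambdaNameSketch`, its methods, the descriptor string, and in particular the way the two 32-bit `CRC32` values are derived (one over the input, one over the reversed input) are assumptions made for the example. The PR description does not say how its two hashes are computed or how the hex value is formatted.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical sketch of "capturingClass$$Lambda$<64-bit hash>"; not the PR's implementation.
public class StableLambdaNameSketch {

    // CRC32 over the UTF-8 bytes of s; the 32-bit result sits in the low bits of the long.
    public static long crc32(String s) {
        CRC32 crc = new CRC32();
        crc.update(s.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    // Assumption: the second 32-bit hash is taken over the reversed input.
    // The two halves are packed into one 64-bit value (high word, low word).
    public static long hash64(String descriptor) {
        long high = crc32(descriptor);
        long low = crc32(new StringBuilder(descriptor).reverse().toString());
        return (high << 32) | low;
    }

    // Builds a name that depends only on its inputs, not on creation order.
    public static String stableName(String capturingClass, String descriptor) {
        return capturingClass + "$$Lambda$" + Long.toHexString(hash64(descriptor));
    }

    public static void main(String[] args) {
        // The descriptor stands in for the real hashing parameters (made up here).
        String descriptor = "run:()Ljava/lang/Runnable;";
        System.out.println(stableName("TestStableLambdaNames", descriptor));
        // A second call with the same inputs yields the same name.
        System.out.println(stableName("TestStableLambdaNames", descriptor));
    }
}
```

Because the name is a pure function of the capturing class and the hashed parameters, repeated calls produce identical names, which is the property the test output above demonstrates. Two lambdas of the same class whose parameters hash to the same value would still collide, so callers need their own collision handling.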

Strahinja Stanojevic has refreshed the contents of this pull request, and 
previous commits have been removed. The incremental views will show differences 
compared to the previous content of the PR. The pull request contains one new 
commit since the last revision:

  Calculate stable names for lambda classes using different hash function.

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/10024/files
  - new: https://git.openjdk.org/jdk/pull/10024/files/dd8e592d..6179915a

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=10024&range=04
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=10024&range=03-04

  Stats: 0 lines in 0 files changed: 0 ins; 0 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/10024.diff
  Fetch: git fetch https://git.openjdk.org/jdk pull/10024/head:pull/10024

PR: https://git.openjdk.org/jdk/pull/10024
