Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached
On Wed, 10 Feb 2021 14:08:22 GMT, PROgrm_JARvis wrote: >>> Hi Claes, >>> Would flattening the state of MD5 bring any further improvements? >>> [plevart@92bf48f](https://github.com/plevart/jdk/commit/92bf48ff58f0ce9648e49466dbf1befebbf49083) >> >> I think it might, marginally, but it seemed to me that the implCompress0 MD5 >> intrinsic is using `state` so I've not tried that yet since rewriting and >> checking the intrinsic code is a lot of work. (I've blogged about my >> experiments in this area and mentioned this issue in passing: >> https://cl4es.github.io/2021/01/04/Investigating-MD5-Overheads.html#accidental-constraints) > >> >> >> @JarvisCraft This pull request has been inactive for more than 4 weeks and >> will be automatically closed if another 4 weeks passes without any activity. >> To avoid this, simply add a new comment to the pull request. Feel free to >> ask for assistance if you need help with progressing this pull request >> towards integration! > @JarvisCraft Can you re-run your benchmarks in light of the changes that > @cl4es pointed out in the previous comment? We think this PR can be closed. My apologies for the late reply, I'll try re-running these with the merged branch ASAP (the delay is due to me changing the workstation thus having some job to reconfigure the environment). - PR: https://git.openjdk.java.net/jdk/pull/1821
Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached
On Wed, 10 Feb 2021 14:08:22 GMT, PROgrm_JARvis wrote: >>> Hi Claes, >>> Would flattening the state of MD5 bring any further improvements? >>> [plevart@92bf48f](https://github.com/plevart/jdk/commit/92bf48ff58f0ce9648e49466dbf1befebbf49083) >> >> I think it might, marginally, but it seemed to me that the implCompress0 MD5 >> intrinsic is using `state` so I've not tried that yet since rewriting and >> checking the intrinsic code is a lot of work. (I've blogged about my >> experiments in this area and mentioned this issue in passing: >> https://cl4es.github.io/2021/01/04/Investigating-MD5-Overheads.html#accidental-constraints) > >> >> >> @JarvisCraft This pull request has been inactive for more than 4 weeks and >> will be automatically closed if another 4 weeks passes without any activity. >> To avoid this, simply add a new comment to the pull request. Feel free to >> ask for assistance if you need help with progressing this pull request >> towards integration! @JarvisCraft Can you re-run your benchmarks in light of the changes that @cl4es pointed out in the previous comment? We think this PR can be closed. - PR: https://git.openjdk.java.net/jdk/pull/1821
Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached
On Wed, 10 Feb 2021 14:08:22 GMT, PROgrm_JARvis wrote: >>> Hi Claes, >>> Would flattening the state of MD5 bring any further improvements? >>> [plevart@92bf48f](https://github.com/plevart/jdk/commit/92bf48ff58f0ce9648e49466dbf1befebbf49083) >> >> I think it might, marginally, but it seemed to me that the implCompress0 MD5 >> intrinsic is using `state` so I've not tried that yet since rewriting and >> checking the intrinsic code is a lot of work. (I've blogged about my >> experiments in this area and mentioned this issue in passing: >> https://cl4es.github.io/2021/01/04/Investigating-MD5-Overheads.html#accidental-constraints) > >> >> >> @JarvisCraft This pull request has been inactive for more than 4 weeks and >> will be automatically closed if another 4 weeks passes without any activity. >> To avoid this, simply add a new comment to the pull request. Feel free to >> ask for assistance if you need help with progressing this pull request >> towards integration! FWIW, I ended up doing two related improvements: - #1855 - which reduced overheads of constructing MD5, SHA1, SHA2, SHA5 `MessageDigest` objects, and slightly improving digest performance - #1933 - reducing cost of `MessageDigest.getInstance` in general by better internal caching of `Constructor` objects, among other things. With this there's now no extra allocations when looking up something that has been looked up before. Together these enhancements bring the overheads of `UUID.nameUUIDFromBytes` down close to what you can get from cloning from a `ThreadLocal` `MD5` object as suggested here. Taking the ambient overheads of `ThreadLocal`s into account I'd say it's not worth adding this cache, and would suggest withdrawing this PR and closing the RFE. - PR: https://git.openjdk.java.net/jdk/pull/1821
Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached
On Thu, 7 Jan 2021 17:09:14 GMT, Claes Redestad wrote: >> Hi Claes, >> Would flattening the state of MD5 bring any further improvements? >> https://github.com/plevart/jdk/commit/92bf48ff58f0ce9648e49466dbf1befebbf49083 > >> Hi Claes, >> Would flattening the state of MD5 bring any further improvements? >> [plevart@92bf48f](https://github.com/plevart/jdk/commit/92bf48ff58f0ce9648e49466dbf1befebbf49083) > > I think it might, marginally, but it seemed to me that the implCompress0 MD5 > intrinsic is using `state` so I've not tried that yet since rewriting and > checking the intrinsic code is a lot of work. (I've blogged about my > experiments in this area and mentioned this issue in passing: > https://cl4es.github.io/2021/01/04/Investigating-MD5-Overheads.html#accidental-constraints) > > > @JarvisCraft This pull request has been inactive for more than 4 weeks and > will be automatically closed if another 4 weeks passes without any activity. > To avoid this, simply add a new comment to the pull request. Feel free to ask > for assistance if you need help with progressing this pull request towards > integration! - PR: https://git.openjdk.java.net/jdk/pull/1821
Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached
On Thu, 7 Jan 2021 16:39:48 GMT, Peter Levart wrote: >>> I have to say that introducing a ThreadLocal here seems like a step in the >>> wrong direction. With a ThreadLocal, if I read this correctly, a >>> MessageDigest will be cached with each thread that ever calls this API, and >>> it won't be garbage collected until the thread dies. Some threads live >>> indefinitely, e.g., the ones in the fork-join common pool. Is this what we >>> want? Is UUID creation so frequent that each thread needs its own, or could >>> thread safety be handled using a simple locking protocol? >> >> Fair enough; the solution proposed by Claes in #1855 seems to be a better >> alternative. >> >> Currently, I will keep this PR open but I expect this to be superseded by >> the latter. > > Hi Claes, > Would flattening the state of MD5 bring any further improvements? > https://github.com/plevart/jdk/commit/92bf48ff58f0ce9648e49466dbf1befebbf49083 > Hi Claes, > Would flattening the state of MD5 bring any further improvements? > [plevart@92bf48f](https://github.com/plevart/jdk/commit/92bf48ff58f0ce9648e49466dbf1befebbf49083) I think it might, marginally, but it seemed to me that the implCompress0 MD5 intrinsic is using `state` so I've not tried that yet since rewriting and checking the intrinsic code is a lot of work. (I've blogged about my experiments in this area and mentioned this issue in passing: https://cl4es.github.io/2021/01/04/Investigating-MD5-Overheads.html#accidental-constraints) - PR: https://git.openjdk.java.net/jdk/pull/1821
Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached
On Sun, 20 Dec 2020 22:41:33 GMT, PROgrm_JARvis wrote: >>> I have to say that introducing a ThreadLocal here seems like a step in the >>> wrong direction. With a ThreadLocal, if I read this correctly, a >>> MessageDigest will be cached with each thread that ever calls this API, and >>> it won't be garbage collected until the thread dies. Some threads live >>> indefinitely, e.g., the ones in the fork-join common pool. Is this what we >>> want? Is UUID creation so frequent that each thread needs its own, or could >>> thread safety be handled using a simple locking protocol? >> >> This is a good point. Significant effort has gone into recent releases to >> reduce the use of TLs in the (JDK-8245309, JDK-8245308, JDK-8245304, >> JDK-8245302) so adding new usages is a disappointing. So I think this PR >> does need good justifications, and probably a lot more exploration of >> options. > >> I have to say that introducing a ThreadLocal here seems like a step in the >> wrong direction. With a ThreadLocal, if I read this correctly, a >> MessageDigest will be cached with each thread that ever calls this API, and >> it won't be garbage collected until the thread dies. Some threads live >> indefinitely, e.g., the ones in the fork-join common pool. Is this what we >> want? Is UUID creation so frequent that each thread needs its own, or could >> thread safety be handled using a simple locking protocol? > > Fair enough; the solution proposed by Claes in #1855 seems to be a better > alternative. > > Currently, I will keep this PR open but I expect this to be superseded by the > latter. Hi Claes, Would flattening the state of MD5 bring any further improvements? https://github.com/plevart/jdk/commit/92bf48ff58f0ce9648e49466dbf1befebbf49083 - PR: https://git.openjdk.java.net/jdk/pull/1821
Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached
On Sun, 20 Dec 2020 19:48:43 GMT, Alan Bateman wrote: > I have to say that introducing a ThreadLocal here seems like a step in the > wrong direction. With a ThreadLocal, if I read this correctly, a > MessageDigest will be cached with each thread that ever calls this API, and > it won't be garbage collected until the thread dies. Some threads live > indefinitely, e.g., the ones in the fork-join common pool. Is this what we > want? Is UUID creation so frequent that each thread needs its own, or could > thread safety be handled using a simple locking protocol? Fair enough; the solution proposed by Claes in #1855 seems to be a better alternative. Currently, I will keep this PR open but I expect this to be superseded by the latter. - PR: https://git.openjdk.java.net/jdk/pull/1821
Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached
On Sun, 20 Dec 2020 20:45:55 GMT, Claes Redestad wrote: >>> I have to say that introducing a ThreadLocal here seems like a step in the >>> wrong direction. With a ThreadLocal, if I read this correctly, a >>> MessageDigest will be cached with each thread that ever calls this API, and >>> it won't be garbage collected until the thread dies. Some threads live >>> indefinitely, e.g., the ones in the fork-join common pool. Is this what we >>> want? Is UUID creation so frequent that each thread needs its own, or could >>> thread safety be handled using a simple locking protocol? >> >> This is a good point. Significant effort has gone into recent releases to >> reduce the use of TLs in the (JDK-8245309, JDK-8245308, JDK-8245304, >> JDK-8245302) so adding new usages is a disappointing. So I think this PR >> does need good justifications, and probably a lot more exploration of >> options. > > As Stuart argues a TL approach is likely to look better in a microbenchmark, > but make things worse in other regards. > > I started exploring options to this in #1855 (not logged an RFE, yet), and I > think there is potential to get most of the gains seen here without the > introduction of a `ThreadLocal`. The current patch also speeds up MD5.digest > for small chunks (+19% for 16 bytes, but less relative impact on larger > chunks, down to being lost in the noise on 16Kb). Speed-up of > `UUID.nameUUIDFromBytes` is somewhat modest right now (+4%, -17% > allocations), but I think it can be improved further without complicating > things too much. The MD5 intrinsic added by JDK-8250902 adds some constraints > on the implementation that holds back some polish. This can be fixed, but > requires some coordination. A trick that could be used here instead of `ThreadLocal` is to store an instance of MD5 retrieved via MessageDigest.getInstance, but clone it before use. See [this commit](https://github.com/openjdk/jdk/pull/1855/commits/2f266316d62ca875c38e2f771412d02625414bf4). This maintains thread safety, avoids the allure of TLs, and gives a substantial speed-up on the `UUIDBench.fromType3Bytes` micro (comparing it against the other optimizations that were already in #1855): Benchmark (size) Mode Cnt ScoreError Units UUIDBench.fromType3Bytes2 thrpt 12 1.523 ± 0.066 ops/us UUIDBench.fromType3Bytes:·gc.alloc.rate.norm2 thrpt 12 408.036 ± 0.003B/op Benchmark (size) Mode Cnt ScoreError Units UUIDBench.fromType3Bytes2 thrpt 12 2.186 ± 0.228 ops/us UUIDBench.fromType3Bytes:·gc.alloc.rate.norm2 thrpt 12 264.023 ± 0.006B/op - PR: https://git.openjdk.java.net/jdk/pull/1821
Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached
On Sun, 20 Dec 2020 19:48:43 GMT, Alan Bateman wrote: >> I have to say that introducing a ThreadLocal here seems like a step in the >> wrong direction. With a ThreadLocal, if I read this correctly, a >> MessageDigest will be cached with each thread that ever calls this API, and >> it won't be garbage collected until the thread dies. Some threads live >> indefinitely, e.g., the ones in the fork-join common pool. Is this what we >> want? Is UUID creation so frequent that each thread needs its own, or could >> thread safety be handled using a simple locking protocol? >> >> I'd like to see a demonstration that fetching and allocating an MD5 >> MessageDigest instance on each call is indeed a performance problem (either >> because it's too slow, or because it takes too much memory). Based on that >> we might or might not want to do some optimization of some form or another. >> Or we might want to optimize creation in MessageDigest instead of caching >> instances in UUID. More fundamentally, I'd like to see an answer to the >> question, "What problem is this change intended to solve?" > >> I have to say that introducing a ThreadLocal here seems like a step in the >> wrong direction. With a ThreadLocal, if I read this correctly, a >> MessageDigest will be cached with each thread that ever calls this API, and >> it won't be garbage collected until the thread dies. Some threads live >> indefinitely, e.g., the ones in the fork-join common pool. Is this what we >> want? Is UUID creation so frequent that each thread needs its own, or could >> thread safety be handled using a simple locking protocol? > > This is a good point. Significant effort has gone into recent releases to > reduce the use of TLs in the (JDK-8245309, JDK-8245308, JDK-8245304, > JDK-8245302) so adding new usages is a disappointing. So I think this PR does > need good justifications, and probably a lot more exploration of options. As Stuart argues a TL approach is likely to look better in a microbenchmark, but make things worse in other regards. I started exploring options to this in #1855 (not logged an RFE, yet), and I think there is potential to get most of the gains seen here without the introduction of a `ThreadLocal`. The current patch also speeds up MD5.digest for small chunks (+19% for 16 bytes, but less relative impact on larger chunks, down to being lost in the noise on 16Kb). Speed-up of `UUID.nameUUIDFromBytes` is somewhat modest right now (+4%, -17% allocations), but I think it can be improved further without complicating things too much. The MD5 intrinsic added by JDK-8250902 adds some constraints on the implementation that holds back some polish. This can be fixed, but requires some coordination. - PR: https://git.openjdk.java.net/jdk/pull/1821
Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached
On Fri, 18 Dec 2020 20:11:26 GMT, Stuart Marks wrote: > I have to say that introducing a ThreadLocal here seems like a step in the > wrong direction. With a ThreadLocal, if I read this correctly, a > MessageDigest will be cached with each thread that ever calls this API, and > it won't be garbage collected until the thread dies. Some threads live > indefinitely, e.g., the ones in the fork-join common pool. Is this what we > want? Is UUID creation so frequent that each thread needs its own, or could > thread safety be handled using a simple locking protocol? This is a good point. Significant effort has gone into recent releases to reduce the use of TLs in the (JDK-8245309, JDK-8245308, JDK-8245304, JDK-8245302) so adding new usages is a disappointing. So I think this PR does need good justifications, and probably a lot more exploration of options. - PR: https://git.openjdk.java.net/jdk/pull/1821
Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached
On Fri, 18 Dec 2020 19:04:36 GMT, Sean Mullan wrote: >>> MD5 and DES were removed as SE requirements in JDK 14. See >>> https://bugs.openjdk.java.net/browse/JDK-8214483 for more information. >>> However, there are no plans to remove the implementations from the JDK at >>> this time. >> >> In this case, should a bug report be filled to require specifying behaviour >> for `UUID#nameUUIDFromBytes(byte[])` in case of MD5 not being present? > >> > MD5 and DES were removed as SE requirements in JDK 14. See >> > https://bugs.openjdk.java.net/browse/JDK-8214483 for more information. >> > However, there are no plans to remove the implementations from the JDK at >> > this time. >> >> In this case, should a bug report be filled to require specifying behaviour >> for `UUID#nameUUIDFromBytes(byte[])` in case of MD5 not being present? > > Perhaps it should throw `UnsupportedOperationException`, but I'll defer to > someone who works in this area. It may also be useful to support type 5 UUIDs > which use SHA-1. I have to say that introducing a ThreadLocal here seems like a step in the wrong direction. With a ThreadLocal, if I read this correctly, a MessageDigest will be cached with each thread that ever calls this API, and it won't be garbage collected until the thread dies. Some threads live indefinitely, e.g., the ones in the fork-join common pool. Is this what we want? Is UUID creation so frequent that each thread needs its own, or could thread safety be handled using a simple locking protocol? I'd like to see a demonstration that fetching and allocating an MD5 MessageDigest instance on each call is indeed a performance problem (either because it's too slow, or because it takes too much memory). Based on that we might or might not want to do some optimization of some form or another. Or we might want to optimize creation in MessageDigest instead of caching instances in UUID. More fundamentally, I'd like to see an answer to the question, "What problem is this change intended to solve?" - PR: https://git.openjdk.java.net/jdk/pull/1821
Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached
On Fri, 18 Dec 2020 14:42:38 GMT, PROgrm_JARvis wrote: > > MD5 and DES were removed as SE requirements in JDK 14. See > > https://bugs.openjdk.java.net/browse/JDK-8214483 for more information. > > However, there are no plans to remove the implementations from the JDK at > > this time. > > In this case, should a bug report be filled to require specifying behaviour > for `UUID#nameUUIDFromBytes(byte[])` in case of MD5 not being present? Perhaps it should throw `UnsupportedOperationException`, but I'll defer to someone who works in this area. It may also be useful to support type 5 UUIDs which use SHA-1. - PR: https://git.openjdk.java.net/jdk/pull/1821
Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached
On Fri, 18 Dec 2020 15:48:52 GMT, PROgrm_JARvis wrote: >> Might be fun to try, but it looks like rewriting to have MD5 to only use >> transient state will be a significant effort, and might just end up >> shuffling over allocations from `getInstance` to `digest`, which could >> regress code that reuse a digest instance (in a thread safe manner). > >> Might be fun to try, but it looks like rewriting to have MD5 to only use >> transient state will be a significant effort, and might just end up >> shuffling over allocations from `getInstance` to `digest`, which could >> regress code that reuse a digest instance (in a thread safe manner). > > I've thought about using `ThreadLocal` at holder level and my naive benchmark > shows significant improvement in speed compared to current implementation: > https://gist.github.com/JarvisCraft/61cd45f83aea7f33cba9510e523a5c4e > > As of the state from single thread's perspective, it seems to be safe to > reuse `MessageDigest` in this case: > >> The `digest` method can be called once for a given number of updates. After >> `digest` has been called, the MessageDigest object is reset to its >> initialized state. I've pushed the change utilizing `ThreadLocal` and my benchmark to correspond to the discussion. - PR: https://git.openjdk.java.net/jdk/pull/1821
Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached [v3]
> Please review this change moving lookup of MD5 digest in `java.lang.UUID` to > an internal holder class. PROgrm_JARvis has updated the pull request incrementally with two additional commits since the last revision: - 8258588: add Md5MessageDigestLookup benchmark - 8258588: make UUID#Md5Digest thread-safe using ThreadLocal - Changes: - all: https://git.openjdk.java.net/jdk/pull/1821/files - new: https://git.openjdk.java.net/jdk/pull/1821/files/37db4c56..1abb7cfe Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk=1821=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk=1821=01-02 Stats: 87 lines in 2 files changed: 77 ins; 5 del; 5 mod Patch: https://git.openjdk.java.net/jdk/pull/1821.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1821/head:pull/1821 PR: https://git.openjdk.java.net/jdk/pull/1821
Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached
On Fri, 18 Dec 2020 15:24:21 GMT, Claes Redestad wrote: > Might be fun to try, but it looks like rewriting to have MD5 to only use > transient state will be a significant effort, and might just end up shuffling > over allocations from `getInstance` to `digest`, which could regress code > that reuse a digest instance (in a thread safe manner). I've thought about using `ThreadLocal` at holder level and my naive benchmark shows significant improvement in speed compared to current implementation: https://gist.github.com/JarvisCraft/61cd45f83aea7f33cba9510e523a5c4e As of the state from single thread's perspective, it seems to be safe to reuse `MessageDigest` in this case: > The `digest` method can be called once for a given number of updates. After > `digest` has been called, the MessageDigest object is reset to its > initialized state. - PR: https://git.openjdk.java.net/jdk/pull/1821
Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached
On Fri, 18 Dec 2020 14:59:10 GMT, PROgrm_JARvis wrote: >> A more general issue is that this patch assumes the `MessageDigest` object >> returned is statically shareable, which implies it being stateless and >> thread-safe. >> >> This doesn't seem to be the case. See >> [MD5.java](https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/sun/security/provider/MD5.java) >> and the >> [DigestBase.java](https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/sun/security/provider/DigestBase.java) >> base class, which both have mutating buffers for doing the digest. > >> A more general issue is that this patch assumes the `MessageDigest` object >> returned is statically shareable, which implies it being stateless and >> thread-safe. >> >> This doesn't seem to be the case. See >> [MD5.java](https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/sun/security/provider/MD5.java) >> and the >> [DigestBase.java](https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/sun/security/provider/DigestBase.java) >> base class, which both have mutating buffers for doing the digest. > > Wow, this is interesting. In this case I will check if there is a way to > implement a thread-safe mechanism there. Will report here once there is some > result. Might be fun to try, but it looks like rewriting to have MD5 to only use transient state will be a significant effort, and might just end up shuffling over allocations from `getInstance` to `digest`, which could regress code that reuse a digest instance (in a thread safe manner). - PR: https://git.openjdk.java.net/jdk/pull/1821
Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached
On Fri, 18 Dec 2020 14:55:08 GMT, Claes Redestad wrote: > A more general issue is that this patch assumes the `MessageDigest` object > returned is statically shareable, which implies it being stateless and > thread-safe. > > This doesn't seem to be the case. See > [MD5.java](https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/sun/security/provider/MD5.java) > and the > [DigestBase.java](https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/sun/security/provider/DigestBase.java) > base class, which both have mutating buffers for doing the digest. Wow, this is interesting. In this case I will check if there is a way to implement a thread-safe mechanism there. Will report here once there is some result. - PR: https://git.openjdk.java.net/jdk/pull/1821
Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached
On Fri, 18 Dec 2020 14:42:38 GMT, PROgrm_JARvis wrote: >>> I've looked through [Standard Algorithms section for >>> MessageDigest](https://docs.oracle.com/en/java/javase/15/docs/specs/security/standard-names.html#messagedigest-algorithms) >>> and is says >>> >>> > Algorithm names that _can_ be specified >>> >>> And the javadoc of `MessageDigest` says: >>> >>> > Every implementation of the Java platform is required to support the >>> > following standard `MessageDigest` algorithms: >>> > >>> > * `SHA-1` >>> > * `SHA-256` >>> >>> So I cannot find any requirement for `MD5` to be present. Although I >>> believe that every implementation does provide it, it may be essential to >>> either specify it or describe the behavior for its absence in case of >>> `UUID`'s usage. >> >> MD5 and DES were removed as SE requirements in JDK 14. See >> https://bugs.openjdk.java.net/browse/JDK-8214483 for more information. >> However, there are no plans to remove the implementations from the JDK at >> this time. > >> MD5 and DES were removed as SE requirements in JDK 14. See >> https://bugs.openjdk.java.net/browse/JDK-8214483 for more information. >> However, there are no plans to remove the implementations from the JDK at >> this time. > > In this case, should a bug report be filled to require specifying behaviour > for `UUID#nameUUIDFromBytes(byte[])` in case of MD5 not being present? A more general issue is that this patch assumes the `MessageDigest` object returned is statically shareable, which implies it being stateless and thread-safe. This doesn't seem to be the case. See [MD5.java](https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/sun/security/provider/MD5.java) and the [DigestBase.java](https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/sun/security/provider/DigestBase.java) base class, which both have mutating buffers for doing the digest. - PR: https://git.openjdk.java.net/jdk/pull/1821
Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached
On Fri, 18 Dec 2020 14:37:53 GMT, Sean Mullan wrote: > MD5 and DES were removed as SE requirements in JDK 14. See > https://bugs.openjdk.java.net/browse/JDK-8214483 for more information. > However, there are no plans to remove the implementations from the JDK at > this time. In this case, should a bug report be filled to require specifying behaviour for `UUID#nameUUIDFromBytes(byte[])` in case of MD5 not being present? - PR: https://git.openjdk.java.net/jdk/pull/1821
Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached
On Fri, 18 Dec 2020 14:29:00 GMT, PROgrm_JARvis wrote: >> There are pre-existing microbenchmarks for java.security under >> test/micro/org/openjdk/bench/java/security >> >> You can build and run these using `make test TEST=micro:YourBenchmark`. >> Refer to >> [doc/testing.md](https://github.com/openjdk/jdk/blob/master/doc/testing.md) >> for some required configuration steps. >> >> I'd suggest starting with a simple micro that zooms in on >> MessageDigest.getInstance("MD5"), maybe like so: >> >> /* >> * Copyright (c) 2020, Oracle and/or its affiliates. All rights reserved. >> * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. >> * >> * This code is free software; you can redistribute it and/or modify it >> * under the terms of the GNU General Public License version 2 only, as >> * published by the Free Software Foundation. >> * >> * This code is distributed in the hope that it will be useful, but WITHOUT >> * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or >> * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License >> * version 2 for more details (a copy is included in the LICENSE file that >> * accompanied this code). >> * >> * You should have received a copy of the GNU General Public License version >> * 2 along with this work; if not, write to the Free Software Foundation, >> * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. >> * >> * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA >> * or visit www.oracle.com if you need additional information or have any >> * questions. >> */ >> package org.openjdk.bench.java.security; >> >> import java.security.DigestException; >> import java.security.MessageDigest; >> import java.security.NoSuchAlgorithmException; >> import java.security.NoSuchProviderException; >> import java.util.Random; >> import java.util.concurrent.TimeUnit; >> import org.openjdk.jmh.annotations.Benchmark; >> import org.openjdk.jmh.annotations.BenchmarkMode; >> import org.openjdk.jmh.annotations.Fork; >> import org.openjdk.jmh.annotations.Measurement; >> import org.openjdk.jmh.annotations.Mode; >> import org.openjdk.jmh.annotations.OutputTimeUnit; >> import org.openjdk.jmh.annotations.Param; >> import org.openjdk.jmh.annotations.Scope; >> import org.openjdk.jmh.annotations.Setup; >> import org.openjdk.jmh.annotations.State; >> import org.openjdk.jmh.annotations.Warmup; >> >> /** >> * Tests speed of looking up MessageDigests. >> */ >> @State(Scope.Thread) >> @BenchmarkMode(Mode.AverageTime) >> @OutputTimeUnit(TimeUnit.NANOSECONDS) >> @Warmup(iterations = 5, time = 1) >> @Measurement(iterations = 10, time = 1) >> @Fork(value = 3) >> public class GetMessageDigest { >> >> @Benchmark >> public MessageDigest getMD5Digest() throws NoSuchAlgorithmException { >> return MessageDigest.getInstance("MD5"); >> } >> } > >> I'd suggest starting with a simple micro that zooms in on >> MessageDigest.getInstance("MD5"), maybe like so: > > Thanks, that's what I wanted to hear. > I will implement it now. > I've looked through [Standard Algorithms section for > MessageDigest](https://docs.oracle.com/en/java/javase/15/docs/specs/security/standard-names.html#messagedigest-algorithms) > and is says > > > Algorithm names that _can_ be specified > > And the javadoc of `MessageDigest` says: > > > Every implementation of the Java platform is required to support the > > following standard `MessageDigest` algorithms: > > > > * `SHA-1` > > * `SHA-256` > > So I cannot find any requirement for `MD5` to be present. Although I believe > that every implementation does provide it, it may be essential to either > specify it or describe the behavior for its absence in case of `UUID`'s usage. MD5 and DES were removed as SE requirements in JDK 14. See https://bugs.openjdk.java.net/browse/JDK-8214483 for more information. However, there are no plans to remove the implementations from the JDK at this time. - PR: https://git.openjdk.java.net/jdk/pull/1821
Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached
On Fri, 18 Dec 2020 14:01:19 GMT, Claes Redestad wrote: > I'd suggest starting with a simple micro that zooms in on > MessageDigest.getInstance("MD5"), maybe like so: Thanks, that's what I wanted to hear. I will implement it now. - PR: https://git.openjdk.java.net/jdk/pull/1821
Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached
On Fri, 18 Dec 2020 13:39:48 GMT, Alan Bateman wrote: > Can you look in test/micro for existing examples? Yes, of course, the question is more about the following: should I simply cover `UUID#nameUUIDFromBytes(byte[])` by the benchmark or should I rather write a comparison benchmark which would compare finding the MessageDigest on every iteration against using the holder for this purpose (without any direct reference to `UUID` class). > It might be that the right place to cache is in `MessageDigest` rather than > `UUID` so that other code can benefit too. `UUID.Md5Digest` may be moved to `MessageDigest` but this seems to be a case to implement something similar to this in the way similar to `StandardCharsets`. > One other point is that Standard Algorithms specifies that MD5 is supported > by MessageDigest so the JDK would be broken it could not be found. If the > eventual patch is in UUID then this aspect will need a bit of cleanup. I've looked through [Standard Algorithms section for MessageDigest](https://docs.oracle.com/en/java/javase/15/docs/specs/security/standard-names.html#messagedigest-algorithms) and is says > Algorithm names that *can* be specified And the javadoc of `MessageDigest` says: > Every implementation of the Java platform is required to support the > following standard `MessageDigest` algorithms: > - `SHA-1` > - `SHA-256` So I cannot find any requirement for `MD5` to be present. Although I believe that every implementation does provide it, it may be essential to either specify it or describe the behavior for its absence in case of `UUID`'s usage. - PR: https://git.openjdk.java.net/jdk/pull/1821
Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached
On Fri, 18 Dec 2020 13:39:48 GMT, Alan Bateman wrote: >>> If there is a holder class for the MessageDigest then its initialisation >>> can fail, this would allow the orThrow method to go away >> >> As of allowing `Md5Digest` instead of explicitly throwing the exception from >> `orThrow` I think that the latter is more appropriate as in case of allowing >> the class initialization to fail all non-first calls will lead to >> `NoClassDefFoundError` which will be counterintuitive from users' >> perspective although the behaviour is simply unspecified for this in >> contract of `nameUUIDFromBytes` (Javadoc of `MessageDigest` requires the >> existence of `SHA-1` and `SHA-256` but nothing is said about `MD5` (should a >> bug-report be raised for the purpose of specifying this detail in the >> Javadoc?). >> >>> Are you planning to add a microbenchmark to demonstrate the issue? >> >> As for testing I am ready to write a benchmark but I am unsure what is the >> canonical way to write the one comparing the value before and after >> changeset. Or should it simply be a comparison of the very approach used >> here, not exactly the changed method (i.e. writing the benchmarks using the >> try/catch variant and holder variant and comparing these)? Could you please >> give any example of such if there are some. >> >> Thanks in advance! > > Can you look in test/micro for existing examples? > > It might be that the right place to cache is in MessageDigest rather than > UUID so that other code can benefit too. > > One other point is that Standard Algorithms specifies that MD5 is supported > by MessageDigest so the JDK would be broken it could not be found. If the > eventual patch is in UUID then this aspect will need a bit of cleanup. There are pre-existing microbenchmarks for java.security under test/micro/org/openjdk/bench/java/security You can build and run these using `make test TEST=micro:YourBenchmark`. Refer to [doc/testing.md](https://github.com/openjdk/jdk/blob/master/doc/testing.md) for some required configuration steps. I'd suggest starting with a simple micro that zooms in on MessageDigest.getInstance("MD5"), maybe like so: /* * Copyright (c) 2020, Oracle and/or its affiliates. All rights reserved. * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. * * This code is free software; you can redistribute it and/or modify it * under the terms of the GNU General Public License version 2 only, as * published by the Free Software Foundation. * * This code is distributed in the hope that it will be useful, but WITHOUT * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License * version 2 for more details (a copy is included in the LICENSE file that * accompanied this code). * * You should have received a copy of the GNU General Public License version * 2 along with this work; if not, write to the Free Software Foundation, * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. * * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA * or visit www.oracle.com if you need additional information or have any * questions. */ package org.openjdk.bench.java.security; import java.security.DigestException; import java.security.MessageDigest; import java.security.NoSuchAlgorithmException; import java.security.NoSuchProviderException; import java.util.Random; import java.util.concurrent.TimeUnit; import org.openjdk.jmh.annotations.Benchmark; import org.openjdk.jmh.annotations.BenchmarkMode; import org.openjdk.jmh.annotations.Fork; import org.openjdk.jmh.annotations.Measurement; import org.openjdk.jmh.annotations.Mode; import org.openjdk.jmh.annotations.OutputTimeUnit; import org.openjdk.jmh.annotations.Param; import org.openjdk.jmh.annotations.Scope; import org.openjdk.jmh.annotations.Setup; import org.openjdk.jmh.annotations.State; import org.openjdk.jmh.annotations.Warmup; /** * Tests speed of looking up MessageDigests. */ @State(Scope.Thread) @BenchmarkMode(Mode.AverageTime) @OutputTimeUnit(TimeUnit.NANOSECONDS) @Warmup(iterations = 5, time = 1) @Measurement(iterations = 10, time = 1) @Fork(value = 3) public class GetMessageDigest { @Benchmark @Warmup(iterations = 5, time = 1) @Measurement(iterations = 10, time = 1) public MessageDigest getMD5Digest() throws NoSuchAlgorithmException { return MessageDigest.getInstance("MD5"); } } - PR: https://git.openjdk.java.net/jdk/pull/1821
Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached
On Fri, 18 Dec 2020 13:27:02 GMT, PROgrm_JARvis wrote: >> Are you planning to add a microbenchmark to demonstrate the issue? >> If there is a holder class for the MessageDigest then its initialisation can >> fail, this would allow the orThrow method to go away > >> If there is a holder class for the MessageDigest then its initialisation can >> fail, this would allow the orThrow method to go away > > As of allowing `Md5Digest` instead of explicitly throwing the exception from > `orThrow` I think that the latter is more appropriate as in case of allowing > the class initialization to fail all non-first calls will lead to > `NoClassDefFoundError` which will be counterintuitive from users' perspective > although the behaviour is simply unspecified for this in contract of > `nameUUIDFromBytes` (Javadoc of `MessageDigest` requires the existence of > `SHA-1` and `SHA-256` but nothing is said about `MD5` (should a bug-report be > raised for the purpose of specifying this detail in the Javadoc?). > >> Are you planning to add a microbenchmark to demonstrate the issue? > > As for testing I am ready to write a benchmark but I am unsure what is the > canonical way to write the one comparing the value before and after > changeset. Or should it simply be a comparison of the very approach used > here, not exactly the changed method (i.e. writing the benchmarks using the > try/catch variant and holder variant and comparing these)? Could you please > give any example of such if there are some. > > Thanks in advance! Can you look in test/micro for existing examples? It might be that the right place to cache is in MessageDigest rather than UUID so that other code can benefit too. One other point is that Standard Algorithms specifies that MD5 is supported by MessageDigest so the JDK would be broken it could not be found. If the eventual patch is in UUID then this aspect will need a bit of cleanup. - PR: https://git.openjdk.java.net/jdk/pull/1821
Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached
On Fri, 18 Dec 2020 09:10:26 GMT, Alan Bateman wrote: > If there is a holder class for the MessageDigest then its initialisation can > fail, this would allow the orThrow method to go away As of allowing `Md5Digest` instead of explicitly throwing the exception from `orThrow` I think that the latter is more appropriate as in case of allowing the class initialization to fail all non-first calls will lead to `NoClassDefFoundError` which will be counterintuitive from users' perspective although the behaviour is simply unspecified for this in contract of `nameUUIDFromBytes` (Javadoc of `MessageDigest` requires the existence of `SHA-1` and `SHA-256` but nothing is said about `MD5` (should a bug-report be raised for the purpose of specifying this detail in the Javadoc?). > Are you planning to add a microbenchmark to demonstrate the issue? As for testing I am ready to write a benchmark but I am unsure what is the canonical way to write the one comparing the value before and after changeset. Or should it simply be a comparison of the very approach used here, not exactly the changed method (i.e. writing the benchmarks using the try/catch variant and holder variant and comparing these)? Could you please give any example of such if there are some. Thanks in advance! - PR: https://git.openjdk.java.net/jdk/pull/1821
Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached
On Thu, 17 Dec 2020 13:36:17 GMT, PROgrm_JARvis wrote: > Please review this change moving lookup of MD5 digest in `java.lang.UUID` to > an internal holder class. Are you planning to add a microbenchmark to demonstrate the issue? If there is a holder class for the MessageDigest then its initialisation can fail, this would allow the orThrow method to go away - PR: https://git.openjdk.java.net/jdk/pull/1821
Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached [v2]
> Please review this change moving lookup of MD5 digest in `java.lang.UUID` to > an internal holder class. PROgrm_JARvis has updated the pull request incrementally with one additional commit since the last revision: 8258588: add missing " UUIDs" to comment of UUID$Md5Digest - Changes: - all: https://git.openjdk.java.net/jdk/pull/1821/files - new: https://git.openjdk.java.net/jdk/pull/1821/files/73eb2c4f..37db4c56 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk=1821=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk=1821=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/1821.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1821/head:pull/1821 PR: https://git.openjdk.java.net/jdk/pull/1821
RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached
Please review this change moving lookup of MD5 digest in `java.lang.UUID` to an internal holder class. - Commit messages: - 8258588: MD5 MessageDigest in java.util.UUID should be cached Changes: https://git.openjdk.java.net/jdk/pull/1821/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk=1821=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8258588 Stats: 33 lines in 1 file changed: 26 ins; 6 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/1821.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1821/head:pull/1821 PR: https://git.openjdk.java.net/jdk/pull/1821