Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached

2021-02-15 Thread PROgrm_JARvis
On Wed, 10 Feb 2021 14:08:22 GMT, PROgrm_JARvis 
 wrote:

>>> Hi Claes,
>>> Would flattening the state of MD5 bring any further improvements?
>>> [plevart@92bf48f](https://github.com/plevart/jdk/commit/92bf48ff58f0ce9648e49466dbf1befebbf49083)
>> 
>> I think it might, marginally, but it seemed to me that the implCompress0 MD5 
>> intrinsic is using `state` so I've not tried that yet since rewriting and 
>> checking the intrinsic code is a lot of work. (I've blogged about my 
>> experiments in this area and mentioned this issue in passing: 
>> https://cl4es.github.io/2021/01/04/Investigating-MD5-Overheads.html#accidental-constraints)
>
>> 
>> 
>> @JarvisCraft This pull request has been inactive for more than 4 weeks and 
>> will be automatically closed if another 4 weeks passes without any activity. 
>> To avoid this, simply add a new comment to the pull request. Feel free to 
>> ask for assistance if you need help with progressing this pull request 
>> towards integration!

> @JarvisCraft Can you re-run your benchmarks in light of the changes that 
> @cl4es pointed out in the previous comment? We think this PR can be closed.

My apologies for the late reply,

I'll try re-running these with the merged branch ASAP (the delay is due to me 
changing the workstation thus having some job to reconfigure the environment).

-

PR: https://git.openjdk.java.net/jdk/pull/1821


Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached

2021-02-12 Thread Alan Bateman
On Wed, 10 Feb 2021 14:08:22 GMT, PROgrm_JARvis 
 wrote:

>>> Hi Claes,
>>> Would flattening the state of MD5 bring any further improvements?
>>> [plevart@92bf48f](https://github.com/plevart/jdk/commit/92bf48ff58f0ce9648e49466dbf1befebbf49083)
>> 
>> I think it might, marginally, but it seemed to me that the implCompress0 MD5 
>> intrinsic is using `state` so I've not tried that yet since rewriting and 
>> checking the intrinsic code is a lot of work. (I've blogged about my 
>> experiments in this area and mentioned this issue in passing: 
>> https://cl4es.github.io/2021/01/04/Investigating-MD5-Overheads.html#accidental-constraints)
>
>> 
>> 
>> @JarvisCraft This pull request has been inactive for more than 4 weeks and 
>> will be automatically closed if another 4 weeks passes without any activity. 
>> To avoid this, simply add a new comment to the pull request. Feel free to 
>> ask for assistance if you need help with progressing this pull request 
>> towards integration!

@JarvisCraft Can you re-run your benchmarks in light of the changes that @cl4es 
pointed out in the previous comment? We think this PR can be closed.

-

PR: https://git.openjdk.java.net/jdk/pull/1821


Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached

2021-02-10 Thread Claes Redestad
On Wed, 10 Feb 2021 14:08:22 GMT, PROgrm_JARvis 
 wrote:

>>> Hi Claes,
>>> Would flattening the state of MD5 bring any further improvements?
>>> [plevart@92bf48f](https://github.com/plevart/jdk/commit/92bf48ff58f0ce9648e49466dbf1befebbf49083)
>> 
>> I think it might, marginally, but it seemed to me that the implCompress0 MD5 
>> intrinsic is using `state` so I've not tried that yet since rewriting and 
>> checking the intrinsic code is a lot of work. (I've blogged about my 
>> experiments in this area and mentioned this issue in passing: 
>> https://cl4es.github.io/2021/01/04/Investigating-MD5-Overheads.html#accidental-constraints)
>
>> 
>> 
>> @JarvisCraft This pull request has been inactive for more than 4 weeks and 
>> will be automatically closed if another 4 weeks passes without any activity. 
>> To avoid this, simply add a new comment to the pull request. Feel free to 
>> ask for assistance if you need help with progressing this pull request 
>> towards integration!

FWIW, I ended up doing two related improvements:

- #1855 - which reduced overheads of constructing MD5, SHA1, SHA2, SHA5 
`MessageDigest` objects, and slightly improving digest performance
- #1933 - reducing cost of `MessageDigest.getInstance` in general by better 
internal caching of `Constructor` objects, among other things. With this 
there's now no extra allocations when looking up something that has been looked 
up before.

Together these enhancements bring the overheads of `UUID.nameUUIDFromBytes` 
down close to what you can get from cloning from a `ThreadLocal` `MD5` object 
as suggested here. Taking the ambient overheads of `ThreadLocal`s into account 
I'd say it's not worth adding this cache, and would suggest withdrawing this PR 
and closing the RFE.

-

PR: https://git.openjdk.java.net/jdk/pull/1821


Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached

2021-02-10 Thread PROgrm_JARvis
On Thu, 7 Jan 2021 17:09:14 GMT, Claes Redestad  wrote:

>> Hi Claes,
>> Would flattening the state of MD5 bring any further improvements?
>> https://github.com/plevart/jdk/commit/92bf48ff58f0ce9648e49466dbf1befebbf49083
>
>> Hi Claes,
>> Would flattening the state of MD5 bring any further improvements?
>> [plevart@92bf48f](https://github.com/plevart/jdk/commit/92bf48ff58f0ce9648e49466dbf1befebbf49083)
> 
> I think it might, marginally, but it seemed to me that the implCompress0 MD5 
> intrinsic is using `state` so I've not tried that yet since rewriting and 
> checking the intrinsic code is a lot of work. (I've blogged about my 
> experiments in this area and mentioned this issue in passing: 
> https://cl4es.github.io/2021/01/04/Investigating-MD5-Overheads.html#accidental-constraints)

> 
> 
> @JarvisCraft This pull request has been inactive for more than 4 weeks and 
> will be automatically closed if another 4 weeks passes without any activity. 
> To avoid this, simply add a new comment to the pull request. Feel free to ask 
> for assistance if you need help with progressing this pull request towards 
> integration!

-

PR: https://git.openjdk.java.net/jdk/pull/1821


Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached

2021-01-07 Thread Claes Redestad
On Thu, 7 Jan 2021 16:39:48 GMT, Peter Levart  wrote:

>>> I have to say that introducing a ThreadLocal here seems like a step in the 
>>> wrong direction. With a ThreadLocal, if I read this correctly, a 
>>> MessageDigest will be cached with each thread that ever calls this API, and 
>>> it won't be garbage collected until the thread dies. Some threads live 
>>> indefinitely, e.g., the ones in the fork-join common pool. Is this what we 
>>> want? Is UUID creation so frequent that each thread needs its own, or could 
>>> thread safety be handled using a simple locking protocol?
>> 
>> Fair enough; the solution proposed by Claes in #1855 seems to be a better 
>> alternative.
>> 
>> Currently, I will keep this PR open but I expect this to be superseded by 
>> the latter.
>
> Hi Claes,
> Would flattening the state of MD5 bring any further improvements?
> https://github.com/plevart/jdk/commit/92bf48ff58f0ce9648e49466dbf1befebbf49083

> Hi Claes,
> Would flattening the state of MD5 bring any further improvements?
> [plevart@92bf48f](https://github.com/plevart/jdk/commit/92bf48ff58f0ce9648e49466dbf1befebbf49083)

I think it might, marginally, but it seemed to me that the implCompress0 MD5 
intrinsic is using `state` so I've not tried that yet since rewriting and 
checking the intrinsic code is a lot of work. (I've blogged about my 
experiments in this area and mentioned this issue in passing: 
https://cl4es.github.io/2021/01/04/Investigating-MD5-Overheads.html#accidental-constraints)

-

PR: https://git.openjdk.java.net/jdk/pull/1821


Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached

2021-01-07 Thread Peter Levart
On Sun, 20 Dec 2020 22:41:33 GMT, PROgrm_JARvis 
 wrote:

>>> I have to say that introducing a ThreadLocal here seems like a step in the 
>>> wrong direction. With a ThreadLocal, if I read this correctly, a 
>>> MessageDigest will be cached with each thread that ever calls this API, and 
>>> it won't be garbage collected until the thread dies. Some threads live 
>>> indefinitely, e.g., the ones in the fork-join common pool. Is this what we 
>>> want? Is UUID creation so frequent that each thread needs its own, or could 
>>> thread safety be handled using a simple locking protocol?
>> 
>> This is a good point. Significant effort has gone into recent releases to 
>> reduce the use of TLs in the (JDK-8245309, JDK-8245308, JDK-8245304, 
>> JDK-8245302) so adding new usages is a disappointing. So I think this PR 
>> does need good justifications, and probably a lot more exploration of 
>> options.
>
>> I have to say that introducing a ThreadLocal here seems like a step in the 
>> wrong direction. With a ThreadLocal, if I read this correctly, a 
>> MessageDigest will be cached with each thread that ever calls this API, and 
>> it won't be garbage collected until the thread dies. Some threads live 
>> indefinitely, e.g., the ones in the fork-join common pool. Is this what we 
>> want? Is UUID creation so frequent that each thread needs its own, or could 
>> thread safety be handled using a simple locking protocol?
> 
> Fair enough; the solution proposed by Claes in #1855 seems to be a better 
> alternative.
> 
> Currently, I will keep this PR open but I expect this to be superseded by the 
> latter.

Hi Claes,
Would flattening the state of MD5 bring any further improvements?
https://github.com/plevart/jdk/commit/92bf48ff58f0ce9648e49466dbf1befebbf49083

-

PR: https://git.openjdk.java.net/jdk/pull/1821


Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached

2020-12-20 Thread PROgrm_JARvis
On Sun, 20 Dec 2020 19:48:43 GMT, Alan Bateman  wrote:

> I have to say that introducing a ThreadLocal here seems like a step in the 
> wrong direction. With a ThreadLocal, if I read this correctly, a 
> MessageDigest will be cached with each thread that ever calls this API, and 
> it won't be garbage collected until the thread dies. Some threads live 
> indefinitely, e.g., the ones in the fork-join common pool. Is this what we 
> want? Is UUID creation so frequent that each thread needs its own, or could 
> thread safety be handled using a simple locking protocol?

Fair enough; the solution proposed by Claes in #1855 seems to be a better 
alternative.

Currently, I will keep this PR open but I expect this to be superseded by the 
latter.

-

PR: https://git.openjdk.java.net/jdk/pull/1821


Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached

2020-12-20 Thread Claes Redestad
On Sun, 20 Dec 2020 20:45:55 GMT, Claes Redestad  wrote:

>>> I have to say that introducing a ThreadLocal here seems like a step in the 
>>> wrong direction. With a ThreadLocal, if I read this correctly, a 
>>> MessageDigest will be cached with each thread that ever calls this API, and 
>>> it won't be garbage collected until the thread dies. Some threads live 
>>> indefinitely, e.g., the ones in the fork-join common pool. Is this what we 
>>> want? Is UUID creation so frequent that each thread needs its own, or could 
>>> thread safety be handled using a simple locking protocol?
>> 
>> This is a good point. Significant effort has gone into recent releases to 
>> reduce the use of TLs in the (JDK-8245309, JDK-8245308, JDK-8245304, 
>> JDK-8245302) so adding new usages is a disappointing. So I think this PR 
>> does need good justifications, and probably a lot more exploration of 
>> options.
>
> As Stuart argues a TL approach is likely to look better in a microbenchmark, 
> but make things worse in other regards.
> 
> I started exploring options to this in #1855 (not logged an RFE, yet), and I 
> think there is potential to get most of the gains seen here without the 
> introduction of a `ThreadLocal`. The current patch also speeds up MD5.digest 
> for small chunks (+19% for 16 bytes, but less relative impact on larger 
> chunks, down to being lost in the noise on 16Kb). Speed-up of 
> `UUID.nameUUIDFromBytes` is somewhat modest right now (+4%, -17% 
> allocations), but I think it can be improved further without complicating 
> things too much. The MD5 intrinsic added by JDK-8250902 adds some constraints 
> on the implementation that holds back some polish. This can be fixed, but 
> requires some coordination.

A trick that could be used here instead of `ThreadLocal` is to store an 
instance of MD5 retrieved via MessageDigest.getInstance, but clone it before 
use. See [this 
commit](https://github.com/openjdk/jdk/pull/1855/commits/2f266316d62ca875c38e2f771412d02625414bf4).
 

This maintains thread safety, avoids the allure of TLs, and gives a substantial 
speed-up on the `UUIDBench.fromType3Bytes` micro (comparing it against the 
other optimizations that were already in #1855):

 
Benchmark  (size)   Mode  Cnt   
 ScoreError   Units
UUIDBench.fromType3Bytes2  thrpt   12   
 1.523 ±  0.066  ops/us
UUIDBench.fromType3Bytes:·gc.alloc.rate.norm2  thrpt   12  
408.036 ±  0.003B/op

Benchmark  (size)   Mode  Cnt   
 ScoreError   Units
UUIDBench.fromType3Bytes2  thrpt   12   
 2.186 ±  0.228  ops/us
UUIDBench.fromType3Bytes:·gc.alloc.rate.norm2  thrpt   12  
264.023 ±  0.006B/op

-

PR: https://git.openjdk.java.net/jdk/pull/1821


Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached

2020-12-20 Thread Claes Redestad
On Sun, 20 Dec 2020 19:48:43 GMT, Alan Bateman  wrote:

>> I have to say that introducing a ThreadLocal here seems like a step in the 
>> wrong direction. With a ThreadLocal, if I read this correctly, a 
>> MessageDigest will be cached with each thread that ever calls this API, and 
>> it won't be garbage collected until the thread dies. Some threads live 
>> indefinitely, e.g., the ones in the fork-join common pool. Is this what we 
>> want? Is UUID creation so frequent that each thread needs its own, or could 
>> thread safety be handled using a simple locking protocol?
>> 
>> I'd like to see a demonstration that fetching and allocating an MD5 
>> MessageDigest instance on each call is indeed a performance problem (either 
>> because it's too slow, or because it takes too much memory). Based on that 
>> we might or might not want to do some optimization of some form or another. 
>> Or we might want to optimize creation in MessageDigest instead of caching 
>> instances in UUID. More fundamentally, I'd like to see an answer to the 
>> question, "What problem is this change intended to solve?"
>
>> I have to say that introducing a ThreadLocal here seems like a step in the 
>> wrong direction. With a ThreadLocal, if I read this correctly, a 
>> MessageDigest will be cached with each thread that ever calls this API, and 
>> it won't be garbage collected until the thread dies. Some threads live 
>> indefinitely, e.g., the ones in the fork-join common pool. Is this what we 
>> want? Is UUID creation so frequent that each thread needs its own, or could 
>> thread safety be handled using a simple locking protocol?
> 
> This is a good point. Significant effort has gone into recent releases to 
> reduce the use of TLs in the (JDK-8245309, JDK-8245308, JDK-8245304, 
> JDK-8245302) so adding new usages is a disappointing. So I think this PR does 
> need good justifications, and probably a lot more exploration of options.

As Stuart argues a TL approach is likely to look better in a microbenchmark, 
but make things worse in other regards.

I started exploring options to this in #1855 (not logged an RFE, yet), and I 
think there is potential to get most of the gains seen here without the 
introduction of a `ThreadLocal`. The current patch also speeds up MD5.digest 
for small chunks (+19% for 16 bytes, but less relative impact on larger chunks, 
down to being lost in the noise on 16Kb). Speed-up of `UUID.nameUUIDFromBytes` 
is somewhat modest right now (+4%, -17% allocations), but I think it can be 
improved further without complicating things too much. The MD5 intrinsic added 
by JDK-8250902 adds some constraints on the implementation that holds back some 
polish. This can be fixed, but requires some coordination.

-

PR: https://git.openjdk.java.net/jdk/pull/1821


Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached

2020-12-20 Thread Alan Bateman
On Fri, 18 Dec 2020 20:11:26 GMT, Stuart Marks  wrote:

> I have to say that introducing a ThreadLocal here seems like a step in the 
> wrong direction. With a ThreadLocal, if I read this correctly, a 
> MessageDigest will be cached with each thread that ever calls this API, and 
> it won't be garbage collected until the thread dies. Some threads live 
> indefinitely, e.g., the ones in the fork-join common pool. Is this what we 
> want? Is UUID creation so frequent that each thread needs its own, or could 
> thread safety be handled using a simple locking protocol?

This is a good point. Significant effort has gone into recent releases to 
reduce the use of TLs in the (JDK-8245309, JDK-8245308, JDK-8245304, 
JDK-8245302) so adding new usages is a disappointing. So I think this PR does 
need good justifications, and probably a lot more exploration of options.

-

PR: https://git.openjdk.java.net/jdk/pull/1821


Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached

2020-12-18 Thread Stuart Marks
On Fri, 18 Dec 2020 19:04:36 GMT, Sean Mullan  wrote:

>>> MD5 and DES were removed as SE requirements in JDK 14. See 
>>> https://bugs.openjdk.java.net/browse/JDK-8214483 for more information. 
>>> However, there are no plans to remove the implementations from the JDK at 
>>> this time.
>> 
>> In this case, should a bug report be filled to require specifying behaviour 
>> for `UUID#nameUUIDFromBytes(byte[])` in case of MD5 not being present?
>
>> > MD5 and DES were removed as SE requirements in JDK 14. See 
>> > https://bugs.openjdk.java.net/browse/JDK-8214483 for more information. 
>> > However, there are no plans to remove the implementations from the JDK at 
>> > this time.
>> 
>> In this case, should a bug report be filled to require specifying behaviour 
>> for `UUID#nameUUIDFromBytes(byte[])` in case of MD5 not being present?
> 
> Perhaps it should throw `UnsupportedOperationException`, but I'll defer to 
> someone who works in this area. It may also be useful to support type 5 UUIDs 
> which use SHA-1.

I have to say that introducing a ThreadLocal here seems like a step in the 
wrong direction. With a ThreadLocal, if I read this correctly, a MessageDigest 
will be cached with each thread that ever calls this API, and it won't be 
garbage collected until the thread dies. Some threads live indefinitely, e.g., 
the ones in the fork-join common pool. Is this what we want? Is UUID creation 
so frequent that each thread needs its own, or could thread safety be handled 
using a simple locking protocol?

I'd like to see a demonstration that fetching and allocating an MD5 
MessageDigest instance on each call is indeed a performance problem (either 
because it's too slow, or because it takes too much memory). Based on that we 
might or might not want to do some optimization of some form or another. Or we 
might want to optimize creation in MessageDigest instead of caching instances 
in UUID. More fundamentally, I'd like to see an answer to the question, "What 
problem is this change intended to solve?"

-

PR: https://git.openjdk.java.net/jdk/pull/1821


Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached

2020-12-18 Thread Sean Mullan
On Fri, 18 Dec 2020 14:42:38 GMT, PROgrm_JARvis 
 wrote:

> > MD5 and DES were removed as SE requirements in JDK 14. See 
> > https://bugs.openjdk.java.net/browse/JDK-8214483 for more information. 
> > However, there are no plans to remove the implementations from the JDK at 
> > this time.
> 
> In this case, should a bug report be filled to require specifying behaviour 
> for `UUID#nameUUIDFromBytes(byte[])` in case of MD5 not being present?

Perhaps it should throw `UnsupportedOperationException`, but I'll defer to 
someone who works in this area. It may also be useful to support type 5 UUIDs 
which use SHA-1.

-

PR: https://git.openjdk.java.net/jdk/pull/1821


Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached

2020-12-18 Thread PROgrm_JARvis
On Fri, 18 Dec 2020 15:48:52 GMT, PROgrm_JARvis 
 wrote:

>> Might be fun to try, but it looks like rewriting to have MD5 to only use 
>> transient state will be a significant effort, and might just end up 
>> shuffling over allocations from `getInstance` to `digest`, which could 
>> regress code that reuse a digest instance (in a thread safe manner).
>
>> Might be fun to try, but it looks like rewriting to have MD5 to only use 
>> transient state will be a significant effort, and might just end up 
>> shuffling over allocations from `getInstance` to `digest`, which could 
>> regress code that reuse a digest instance (in a thread safe manner).
> 
> I've thought about using `ThreadLocal` at holder level and my naive benchmark 
> shows significant improvement in speed compared to current implementation: 
> https://gist.github.com/JarvisCraft/61cd45f83aea7f33cba9510e523a5c4e
> 
> As of the state from single thread's perspective, it seems to be safe to 
> reuse `MessageDigest` in this case:
> 
>> The `digest` method can be called once for a given number of updates. After 
>> `digest` has been called, the MessageDigest object is reset to its 
>> initialized state.

I've pushed the change utilizing `ThreadLocal` and my benchmark to correspond 
to the discussion.

-

PR: https://git.openjdk.java.net/jdk/pull/1821


Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached [v3]

2020-12-18 Thread PROgrm_JARvis
> Please review this change moving lookup of MD5 digest in `java.lang.UUID` to 
> an internal holder class.

PROgrm_JARvis has updated the pull request incrementally with two additional 
commits since the last revision:

 - 8258588: add Md5MessageDigestLookup benchmark
 - 8258588: make UUID#Md5Digest thread-safe using ThreadLocal

-

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/1821/files
  - new: https://git.openjdk.java.net/jdk/pull/1821/files/37db4c56..1abb7cfe

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk=1821=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk=1821=01-02

  Stats: 87 lines in 2 files changed: 77 ins; 5 del; 5 mod
  Patch: https://git.openjdk.java.net/jdk/pull/1821.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/1821/head:pull/1821

PR: https://git.openjdk.java.net/jdk/pull/1821


Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached

2020-12-18 Thread PROgrm_JARvis
On Fri, 18 Dec 2020 15:24:21 GMT, Claes Redestad  wrote:

> Might be fun to try, but it looks like rewriting to have MD5 to only use 
> transient state will be a significant effort, and might just end up shuffling 
> over allocations from `getInstance` to `digest`, which could regress code 
> that reuse a digest instance (in a thread safe manner).

I've thought about using `ThreadLocal` at holder level and my naive benchmark 
shows significant improvement in speed compared to current implementation: 
https://gist.github.com/JarvisCraft/61cd45f83aea7f33cba9510e523a5c4e

As of the state from single thread's perspective, it seems to be safe to reuse 
`MessageDigest` in this case:

> The `digest` method can be called once for a given number of updates. After 
> `digest` has been called, the MessageDigest object is reset to its 
> initialized state.

-

PR: https://git.openjdk.java.net/jdk/pull/1821


Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached

2020-12-18 Thread Claes Redestad
On Fri, 18 Dec 2020 14:59:10 GMT, PROgrm_JARvis 
 wrote:

>> A more general issue is that this patch assumes the `MessageDigest` object 
>> returned is statically shareable, which implies it being stateless and 
>> thread-safe.
>> 
>> This doesn't seem to be the case. See 
>> [MD5.java](https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/sun/security/provider/MD5.java)
>>  and the 
>> [DigestBase.java](https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/sun/security/provider/DigestBase.java)
>>  base class, which both have mutating buffers for doing the digest.
>
>> A more general issue is that this patch assumes the `MessageDigest` object 
>> returned is statically shareable, which implies it being stateless and 
>> thread-safe.
>> 
>> This doesn't seem to be the case. See 
>> [MD5.java](https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/sun/security/provider/MD5.java)
>>  and the 
>> [DigestBase.java](https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/sun/security/provider/DigestBase.java)
>>  base class, which both have mutating buffers for doing the digest.
> 
> Wow, this is interesting. In this case I will check if there is a way to 
> implement a thread-safe mechanism there. Will report here once there is some 
> result.

Might be fun to try, but it looks like rewriting to have MD5 to only use 
transient state will be a significant effort, and might just end up shuffling 
over allocations from `getInstance` to `digest`, which could regress code that 
reuse a digest instance (in a thread safe manner).

-

PR: https://git.openjdk.java.net/jdk/pull/1821


Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached

2020-12-18 Thread PROgrm_JARvis
On Fri, 18 Dec 2020 14:55:08 GMT, Claes Redestad  wrote:

> A more general issue is that this patch assumes the `MessageDigest` object 
> returned is statically shareable, which implies it being stateless and 
> thread-safe.
> 
> This doesn't seem to be the case. See 
> [MD5.java](https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/sun/security/provider/MD5.java)
>  and the 
> [DigestBase.java](https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/sun/security/provider/DigestBase.java)
>  base class, which both have mutating buffers for doing the digest.

Wow, this is interesting. In this case I will check if there is a way to 
implement a thread-safe mechanism there. Will report here once there is some 
result.

-

PR: https://git.openjdk.java.net/jdk/pull/1821


Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached

2020-12-18 Thread Claes Redestad
On Fri, 18 Dec 2020 14:42:38 GMT, PROgrm_JARvis 
 wrote:

>>> I've looked through [Standard Algorithms section for 
>>> MessageDigest](https://docs.oracle.com/en/java/javase/15/docs/specs/security/standard-names.html#messagedigest-algorithms)
>>>  and is says
>>> 
>>> > Algorithm names that _can_ be specified
>>> 
>>> And the javadoc of `MessageDigest` says:
>>> 
>>> > Every implementation of the Java platform is required to support the 
>>> > following standard `MessageDigest` algorithms:
>>> > 
>>> > * `SHA-1`
>>> > * `SHA-256`
>>> 
>>> So I cannot find any requirement for `MD5` to be present. Although I 
>>> believe that every implementation does provide it, it may be essential to 
>>> either specify it or describe the behavior for its absence in case of 
>>> `UUID`'s usage.
>> 
>> MD5 and DES were removed as SE requirements in JDK 14. See 
>> https://bugs.openjdk.java.net/browse/JDK-8214483 for more information. 
>> However, there are no plans to remove the implementations from the JDK at 
>> this time.
>
>> MD5 and DES were removed as SE requirements in JDK 14. See 
>> https://bugs.openjdk.java.net/browse/JDK-8214483 for more information. 
>> However, there are no plans to remove the implementations from the JDK at 
>> this time.
> 
> In this case, should a bug report be filled to require specifying behaviour 
> for `UUID#nameUUIDFromBytes(byte[])` in case of MD5 not being present?

A more general issue is that this patch assumes the `MessageDigest` object 
returned is statically shareable, which implies it being stateless and 
thread-safe.

This doesn't seem to be the case. See 
[MD5.java](https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/sun/security/provider/MD5.java)
 and the 
[DigestBase.java](https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/sun/security/provider/DigestBase.java)
 base class, which both have mutating buffers for doing the digest.

-

PR: https://git.openjdk.java.net/jdk/pull/1821


Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached

2020-12-18 Thread PROgrm_JARvis
On Fri, 18 Dec 2020 14:37:53 GMT, Sean Mullan  wrote:

> MD5 and DES were removed as SE requirements in JDK 14. See 
> https://bugs.openjdk.java.net/browse/JDK-8214483 for more information. 
> However, there are no plans to remove the implementations from the JDK at 
> this time.

In this case, should a bug report be filled to require specifying behaviour for 
`UUID#nameUUIDFromBytes(byte[])` in case of MD5 not being present?

-

PR: https://git.openjdk.java.net/jdk/pull/1821


Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached

2020-12-18 Thread Sean Mullan
On Fri, 18 Dec 2020 14:29:00 GMT, PROgrm_JARvis 
 wrote:

>> There are pre-existing microbenchmarks for java.security under 
>> test/micro/org/openjdk/bench/java/security
>> 
>> You can build and run these using `make test TEST=micro:YourBenchmark`. 
>> Refer to 
>> [doc/testing.md](https://github.com/openjdk/jdk/blob/master/doc/testing.md) 
>> for some required configuration steps. 
>> 
>> I'd suggest starting with a simple micro that zooms in on 
>> MessageDigest.getInstance("MD5"), maybe like so:
>> 
>> /*
>>  * Copyright (c) 2020, Oracle and/or its affiliates. All rights reserved.
>>  * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
>>  *
>>  * This code is free software; you can redistribute it and/or modify it
>>  * under the terms of the GNU General Public License version 2 only, as
>>  * published by the Free Software Foundation.
>>  *
>>  * This code is distributed in the hope that it will be useful, but WITHOUT
>>  * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>>  * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
>>  * version 2 for more details (a copy is included in the LICENSE file that
>>  * accompanied this code).
>>  *
>>  * You should have received a copy of the GNU General Public License version
>>  * 2 along with this work; if not, write to the Free Software Foundation,
>>  * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
>>  *
>>  * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
>>  * or visit www.oracle.com if you need additional information or have any
>>  * questions.
>>  */
>> package org.openjdk.bench.java.security;
>> 
>> import java.security.DigestException;
>> import java.security.MessageDigest;
>> import java.security.NoSuchAlgorithmException;
>> import java.security.NoSuchProviderException;
>> import java.util.Random;
>> import java.util.concurrent.TimeUnit;
>> import org.openjdk.jmh.annotations.Benchmark;
>> import org.openjdk.jmh.annotations.BenchmarkMode;
>> import org.openjdk.jmh.annotations.Fork;
>> import org.openjdk.jmh.annotations.Measurement;
>> import org.openjdk.jmh.annotations.Mode;
>> import org.openjdk.jmh.annotations.OutputTimeUnit;
>> import org.openjdk.jmh.annotations.Param;
>> import org.openjdk.jmh.annotations.Scope;
>> import org.openjdk.jmh.annotations.Setup;
>> import org.openjdk.jmh.annotations.State;
>> import org.openjdk.jmh.annotations.Warmup;
>> 
>> /**
>>  * Tests speed of looking up MessageDigests.
>>  */
>> @State(Scope.Thread)
>> @BenchmarkMode(Mode.AverageTime)
>> @OutputTimeUnit(TimeUnit.NANOSECONDS)
>> @Warmup(iterations = 5, time = 1)
>> @Measurement(iterations = 10, time = 1)
>> @Fork(value = 3)
>> public class GetMessageDigest {
>> 
>> @Benchmark
>> public MessageDigest getMD5Digest() throws NoSuchAlgorithmException {
>> return MessageDigest.getInstance("MD5");
>> }
>> }
>
>> I'd suggest starting with a simple micro that zooms in on 
>> MessageDigest.getInstance("MD5"), maybe like so:
> 
> Thanks, that's what I wanted to hear.
> I will implement it now.

> I've looked through [Standard Algorithms section for 
> MessageDigest](https://docs.oracle.com/en/java/javase/15/docs/specs/security/standard-names.html#messagedigest-algorithms)
>  and is says
> 
> > Algorithm names that _can_ be specified
> 
> And the javadoc of `MessageDigest` says:
> 
> > Every implementation of the Java platform is required to support the 
> > following standard `MessageDigest` algorithms:
> > 
> > * `SHA-1`
> > * `SHA-256`
> 
> So I cannot find any requirement for `MD5` to be present. Although I believe 
> that every implementation does provide it, it may be essential to either 
> specify it or describe the behavior for its absence in case of `UUID`'s usage.

MD5 and DES were removed as SE requirements in JDK 14. See 
https://bugs.openjdk.java.net/browse/JDK-8214483 for more information. However, 
there are no plans to remove the implementations from the JDK at this time.

-

PR: https://git.openjdk.java.net/jdk/pull/1821


Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached

2020-12-18 Thread PROgrm_JARvis
On Fri, 18 Dec 2020 14:01:19 GMT, Claes Redestad  wrote:

> I'd suggest starting with a simple micro that zooms in on 
> MessageDigest.getInstance("MD5"), maybe like so:

Thanks, that's what I wanted to hear.
I will implement it now.

-

PR: https://git.openjdk.java.net/jdk/pull/1821


Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached

2020-12-18 Thread PROgrm_JARvis
On Fri, 18 Dec 2020 13:39:48 GMT, Alan Bateman  wrote:

> Can you look in test/micro for existing examples?

Yes, of course, the question is more about the following: should I simply cover 
`UUID#nameUUIDFromBytes(byte[])` by the benchmark or should I rather write a 
comparison benchmark which would compare finding the MessageDigest on every 
iteration against using the holder for this purpose (without any direct 
reference to `UUID` class).

> It might be that the right place to cache is in `MessageDigest` rather than 
> `UUID` so that other code can benefit too.

`UUID.Md5Digest` may be moved to `MessageDigest` but this seems to be a case to 
implement something similar to this in the way similar to `StandardCharsets`.

> One other point is that Standard Algorithms specifies that MD5 is supported 
> by MessageDigest so the JDK would be broken it could not be found. If the 
> eventual patch is in UUID then this aspect will need a bit of cleanup.

I've looked through [Standard Algorithms section for 
MessageDigest](https://docs.oracle.com/en/java/javase/15/docs/specs/security/standard-names.html#messagedigest-algorithms)
 and is says

> Algorithm names that *can* be specified

And the javadoc of `MessageDigest` says:
> Every implementation of the Java platform is required to support the 
> following standard `MessageDigest` algorithms:
> - `SHA-1`
> - `SHA-256`

So I cannot find any requirement for `MD5` to be present. Although I believe 
that every implementation does provide it, it may be essential to either 
specify it or describe the behavior for its absence in case of `UUID`'s usage.

-

PR: https://git.openjdk.java.net/jdk/pull/1821


Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached

2020-12-18 Thread Claes Redestad
On Fri, 18 Dec 2020 13:39:48 GMT, Alan Bateman  wrote:

>>> If there is a holder class for the MessageDigest then its initialisation 
>>> can fail, this would allow the orThrow method to go away
>> 
>> As of allowing `Md5Digest` instead of explicitly throwing the exception from 
>> `orThrow` I think that the latter is more appropriate as in case of allowing 
>> the class initialization to fail all non-first calls will lead to 
>> `NoClassDefFoundError` which will be counterintuitive from users' 
>> perspective although the behaviour is simply unspecified for this in 
>> contract of `nameUUIDFromBytes` (Javadoc of `MessageDigest` requires the 
>> existence of `SHA-1` and `SHA-256` but nothing is said about `MD5` (should a 
>> bug-report be raised for the purpose of specifying this detail in the 
>> Javadoc?).
>> 
>>> Are you planning to add a microbenchmark to demonstrate the issue?
>> 
>> As for testing I am ready to write a benchmark but I am unsure what is the 
>> canonical way to write the one comparing the value before and after 
>> changeset. Or should it simply be a comparison of the very approach used 
>> here, not exactly the changed method (i.e. writing the benchmarks using the 
>> try/catch variant and holder variant and comparing these)? Could you please 
>> give any example of such if there are some.
>> 
>> Thanks in advance!
>
> Can you look in test/micro for existing examples?
> 
> It might be that the right place to cache is in MessageDigest rather than 
> UUID so that other code can benefit too.
> 
> One other point is that Standard Algorithms specifies that MD5 is supported 
> by MessageDigest so the JDK would be broken it could not be found. If the 
> eventual patch is in UUID then this aspect will need a bit of cleanup.

There are pre-existing microbenchmarks for java.security under 
test/micro/org/openjdk/bench/java/security

You can build and run these using `make test TEST=micro:YourBenchmark`. Refer 
to [doc/testing.md](https://github.com/openjdk/jdk/blob/master/doc/testing.md) 
for some required configuration steps. 

I'd suggest starting with a simple micro that zooms in on 
MessageDigest.getInstance("MD5"), maybe like so:

/*
 * Copyright (c) 2020, Oracle and/or its affiliates. All rights reserved.
 * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
 *
 * This code is free software; you can redistribute it and/or modify it
 * under the terms of the GNU General Public License version 2 only, as
 * published by the Free Software Foundation.
 *
 * This code is distributed in the hope that it will be useful, but WITHOUT
 * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
 * version 2 for more details (a copy is included in the LICENSE file that
 * accompanied this code).
 *
 * You should have received a copy of the GNU General Public License version
 * 2 along with this work; if not, write to the Free Software Foundation,
 * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
 *
 * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
 * or visit www.oracle.com if you need additional information or have any
 * questions.
 */
package org.openjdk.bench.java.security;

import java.security.DigestException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.NoSuchProviderException;
import java.util.Random;
import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Param;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Warmup;

/**
 * Tests speed of looking up MessageDigests.
 */
@State(Scope.Thread)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Warmup(iterations = 5, time = 1)
@Measurement(iterations = 10, time = 1)
@Fork(value = 3)
public class GetMessageDigest {

@Benchmark
@Warmup(iterations = 5, time = 1)
@Measurement(iterations = 10, time = 1)
public MessageDigest getMD5Digest() throws NoSuchAlgorithmException {
return MessageDigest.getInstance("MD5");
}
}

-

PR: https://git.openjdk.java.net/jdk/pull/1821


Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached

2020-12-18 Thread Alan Bateman
On Fri, 18 Dec 2020 13:27:02 GMT, PROgrm_JARvis 
 wrote:

>> Are you planning to add a microbenchmark to demonstrate the issue?
>> If there is a holder class for the MessageDigest then its initialisation can 
>> fail, this would allow the orThrow method to go away
>
>> If there is a holder class for the MessageDigest then its initialisation can 
>> fail, this would allow the orThrow method to go away
> 
> As of allowing `Md5Digest` instead of explicitly throwing the exception from 
> `orThrow` I think that the latter is more appropriate as in case of allowing 
> the class initialization to fail all non-first calls will lead to 
> `NoClassDefFoundError` which will be counterintuitive from users' perspective 
> although the behaviour is simply unspecified for this in contract of 
> `nameUUIDFromBytes` (Javadoc of `MessageDigest` requires the existence of 
> `SHA-1` and `SHA-256` but nothing is said about `MD5` (should a bug-report be 
> raised for the purpose of specifying this detail in the Javadoc?).
> 
>> Are you planning to add a microbenchmark to demonstrate the issue?
> 
> As for testing I am ready to write a benchmark but I am unsure what is the 
> canonical way to write the one comparing the value before and after 
> changeset. Or should it simply be a comparison of the very approach used 
> here, not exactly the changed method (i.e. writing the benchmarks using the 
> try/catch variant and holder variant and comparing these)? Could you please 
> give any example of such if there are some.
> 
> Thanks in advance!

Can you look in test/micro for existing examples?

It might be that the right place to cache is in MessageDigest rather than UUID 
so that other code can benefit too.

One other point is that Standard Algorithms specifies that MD5 is supported by 
MessageDigest so the JDK would be broken it could not be found. If the eventual 
patch is in UUID then this aspect will need a bit of cleanup.

-

PR: https://git.openjdk.java.net/jdk/pull/1821


Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached

2020-12-18 Thread PROgrm_JARvis
On Fri, 18 Dec 2020 09:10:26 GMT, Alan Bateman  wrote:

> If there is a holder class for the MessageDigest then its initialisation can 
> fail, this would allow the orThrow method to go away

As of allowing `Md5Digest` instead of explicitly throwing the exception from 
`orThrow` I think that the latter is more appropriate as in case of allowing 
the class initialization to fail all non-first calls will lead to 
`NoClassDefFoundError` which will be counterintuitive from users' perspective 
although the behaviour is simply unspecified for this in contract of 
`nameUUIDFromBytes` (Javadoc of `MessageDigest` requires the existence of 
`SHA-1` and `SHA-256` but nothing is said about `MD5` (should a bug-report be 
raised for the purpose of specifying this detail in the Javadoc?).

> Are you planning to add a microbenchmark to demonstrate the issue?

As for testing I am ready to write a benchmark but I am unsure what is the 
canonical way to write the one comparing the value before and after changeset. 
Or should it simply be a comparison of the very approach used here, not exactly 
the changed method (i.e. writing the benchmarks using the try/catch variant and 
holder variant and comparing these)? Could you please give any example of such 
if there are some.

Thanks in advance!

-

PR: https://git.openjdk.java.net/jdk/pull/1821


Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached

2020-12-18 Thread Alan Bateman
On Thu, 17 Dec 2020 13:36:17 GMT, PROgrm_JARvis 
 wrote:

> Please review this change moving lookup of MD5 digest in `java.lang.UUID` to 
> an internal holder class.

Are you planning to add a microbenchmark to demonstrate the issue?
If there is a holder class for the MessageDigest then its initialisation can 
fail, this would allow the orThrow method to go away

-

PR: https://git.openjdk.java.net/jdk/pull/1821


Re: RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached [v2]

2020-12-17 Thread PROgrm_JARvis
> Please review this change moving lookup of MD5 digest in `java.lang.UUID` to 
> an internal holder class.

PROgrm_JARvis has updated the pull request incrementally with one additional 
commit since the last revision:

  8258588: add missing " UUIDs" to comment of UUID$Md5Digest

-

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/1821/files
  - new: https://git.openjdk.java.net/jdk/pull/1821/files/73eb2c4f..37db4c56

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk=1821=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk=1821=00-01

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/1821.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/1821/head:pull/1821

PR: https://git.openjdk.java.net/jdk/pull/1821


RFR: 8258588: MD5 MessageDigest in java.util.UUID should be cached

2020-12-17 Thread PROgrm_JARvis
Please review this change moving lookup of MD5 digest in `java.lang.UUID` to an 
internal holder class.

-

Commit messages:
 - 8258588: MD5 MessageDigest in java.util.UUID should be cached

Changes: https://git.openjdk.java.net/jdk/pull/1821/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk=1821=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8258588
  Stats: 33 lines in 1 file changed: 26 ins; 6 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/1821.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/1821/head:pull/1821

PR: https://git.openjdk.java.net/jdk/pull/1821