Re: RFR: 8196334: Optimize UUID#fromString

2020-03-08 Thread Claes Redestad

Hi Peter,

sorry about defaulting to throughput measurement - while average time
has its advantages (especially on nanobenchmarks), most benchmarks are
intuitively "higher is better", so I lean that way by old habit.

Thanks for trying out your ideas, though! It's important we try out and
report the result of reasonable alternatives, even when they fall short.
I've done a number of experiments here that I should have probably made
a note of somewhere...

Thanks!

/Claes

On 2020-03-08 12:59, Peter Levart wrote:

Sorry, bash this.

I just noticed that the units used in benchmark are ops/us and that the 
best performing code is the one which is in place now.


I usually use ns/op as a unit in JMH benchmarks, so this led me to false 
belief that alternatives are better performing where in fact are worse.


Sorry for noise.

Regards, Peter


On 3/8/20 12:46 PM, Peter Levart wrote:

Hi,

When I noticed this optimization, I had some ideas how to push this 
even further, but only now had some time to try them out.


Here are some incremental improvements [1] to the already brilliant 
idea with the following JMH results (the baseline is JDK build with 
patch for 8196334 already applied):


Benchmark (size)   Mode  Cnt   Score   Error Units
UUIDBench.uuidFromString   2  thrpt    5  37.344 ± 0.257 ops/us
UUIDBench.uuidFromStringAlt1   2  thrpt    5  29.069 ± 0.137 ops/us
UUIDBench.uuidFromStringAlt2   2  thrpt    5  27.308 ± 0.204 ops/us
UUIDBench.uuidFromStringAlt3   2  thrpt    5  23.195 ± 0.885 ops/us

As can be seen, the trick was to eliminate branches (alt1, alt2) and 
to reduce the number of operations (shifts) in alt3. One last trick in 
alt3 was to use the array length in a check that is then fused with 
array index bounds check (I think) so that we get one of them for free.


What do you think? Is this good enough to warrant another change? As 
you can see, I haven't prepared the patch yet,  just benchmark.


Regards, Peter

[1] 
http://cr.openjdk.java.net/~plevart/jdk-dev/8196334_UUID.fromString/UUIDBench.java 




On 3/2/20 8:26 AM, Claes Redestad wrote:

On 2020-03-02 08:14, Alan Bateman wrote:


http://cr.openjdk.java.net/~redestad/8196334/open.02/

Looks good.


Thanks - pushed!

/Claes






Re: RFR: 8196334: Optimize UUID#fromString

2020-03-08 Thread Peter Levart

Sorry, bash this.

I just noticed that the units used in benchmark are ops/us and that the 
best performing code is the one which is in place now.


I usually use ns/op as a unit in JMH benchmarks, so this led me to false 
belief that alternatives are better performing where in fact are worse.


Sorry for noise.

Regards, Peter


On 3/8/20 12:46 PM, Peter Levart wrote:

Hi,

When I noticed this optimization, I had some ideas how to push this 
even further, but only now had some time to try them out.


Here are some incremental improvements [1] to the already brilliant 
idea with the following JMH results (the baseline is JDK build with 
patch for 8196334 already applied):


Benchmark (size)   Mode  Cnt   Score   Error Units
UUIDBench.uuidFromString   2  thrpt    5  37.344 ± 0.257 ops/us
UUIDBench.uuidFromStringAlt1   2  thrpt    5  29.069 ± 0.137 ops/us
UUIDBench.uuidFromStringAlt2   2  thrpt    5  27.308 ± 0.204 ops/us
UUIDBench.uuidFromStringAlt3   2  thrpt    5  23.195 ± 0.885 ops/us

As can be seen, the trick was to eliminate branches (alt1, alt2) and 
to reduce the number of operations (shifts) in alt3. One last trick in 
alt3 was to use the array length in a check that is then fused with 
array index bounds check (I think) so that we get one of them for free.


What do you think? Is this good enough to warrant another change? As 
you can see, I haven't prepared the patch yet,  just benchmark.


Regards, Peter

[1] 
http://cr.openjdk.java.net/~plevart/jdk-dev/8196334_UUID.fromString/UUIDBench.java



On 3/2/20 8:26 AM, Claes Redestad wrote:

On 2020-03-02 08:14, Alan Bateman wrote:


http://cr.openjdk.java.net/~redestad/8196334/open.02/

Looks good.


Thanks - pushed!

/Claes






Re: RFR: 8196334: Optimize UUID#fromString

2020-03-08 Thread Peter Levart

Hi,

When I noticed this optimization, I had some ideas how to push this even 
further, but only now had some time to try them out.


Here are some incremental improvements [1] to the already brilliant idea 
with the following JMH results (the baseline is JDK build with patch for 
8196334 already applied):


Benchmark (size)   Mode  Cnt   Score   Error Units
UUIDBench.uuidFromString   2  thrpt    5  37.344 ± 0.257 ops/us
UUIDBench.uuidFromStringAlt1   2  thrpt    5  29.069 ± 0.137 ops/us
UUIDBench.uuidFromStringAlt2   2  thrpt    5  27.308 ± 0.204 ops/us
UUIDBench.uuidFromStringAlt3   2  thrpt    5  23.195 ± 0.885 ops/us

As can be seen, the trick was to eliminate branches (alt1, alt2) and to 
reduce the number of operations (shifts) in alt3. One last trick in alt3 
was to use the array length in a check that is then fused with array 
index bounds check (I think) so that we get one of them for free.


What do you think? Is this good enough to warrant another change? As you 
can see, I haven't prepared the patch yet,  just benchmark.


Regards, Peter

[1] 
http://cr.openjdk.java.net/~plevart/jdk-dev/8196334_UUID.fromString/UUIDBench.java



On 3/2/20 8:26 AM, Claes Redestad wrote:

On 2020-03-02 08:14, Alan Bateman wrote:


http://cr.openjdk.java.net/~redestad/8196334/open.02/

Looks good.


Thanks - pushed!

/Claes




Re: RFR: 8196334: Optimize UUID#fromString

2020-03-01 Thread Claes Redestad

On 2020-03-02 08:14, Alan Bateman wrote:


http://cr.openjdk.java.net/~redestad/8196334/open.02/

Looks good.


Thanks - pushed!

/Claes


Re: RFR: 8196334: Optimize UUID#fromString

2020-03-01 Thread Alan Bateman

On 01/03/2020 21:05, Claes Redestad wrote:

:

Filed https://bugs.openjdk.java.net/browse/JDK-8240266 to investigate
if we can improve the branch elimination in Character.digit and removed
the comment.

Also moved parse4Nibbles up before fromString and fixed the redundant
FQDN for Array:

http://cr.openjdk.java.net/~redestad/8196334/open.02/

Looks good.

-Alan.


Re: RFR: 8196334: Optimize UUID#fromString

2020-03-01 Thread Claes Redestad




On 2020-03-01 20:41, Alan Bateman wrote:

On 01/03/2020 18:19, Claes Redestad wrote:

:

After discussing it a bit offline, Andriy and I would like to simplify
the dash-checking code a bit to make it more clear, as there seems to be
no obvious performance drawback of simplifying to (ch1 == '-' && ...).

New webrev: http://cr.openjdk.java.net/~redestad/8196334/open.01/

This looks good and keeping the old code for the size != 36 case gets 
around the compatibility issues discussed here with previous attempts to 
improve it.


A few minor nits: The effective TODO to optimize Character.digit in the 
declaration of NIBBLES seems like something for JIRA, not a comment 
here. I assume the fully qualified class name for java.util.Arrays is 
because this patch was initially developed outside of java.util? Can 
parse4Nibbles be moved to after the declaration of NIBBLES to avoid it 
being in the middle of the two fromString methods?


Filed https://bugs.openjdk.java.net/browse/JDK-8240266 to investigate
if we can improve the branch elimination in Character.digit and removed
the comment.

Also moved parse4Nibbles up before fromString and fixed the redundant
FQDN for Array:

http://cr.openjdk.java.net/~redestad/8196334/open.02/

Thanks!

/Claes


Re: RFR: 8196334: Optimize UUID#fromString

2020-03-01 Thread Alan Bateman

On 01/03/2020 18:19, Claes Redestad wrote:

:

After discussing it a bit offline, Andriy and I would like to simplify
the dash-checking code a bit to make it more clear, as there seems to be
no obvious performance drawback of simplifying to (ch1 == '-' && ...).

New webrev: http://cr.openjdk.java.net/~redestad/8196334/open.01/

This looks good and keeping the old code for the size != 36 case gets 
around the compatibility issues discussed here with previous attempts to 
improve it.


A few minor nits: The effective TODO to optimize Character.digit in the 
declaration of NIBBLES seems like something for JIRA, not a comment 
here. I assume the fully qualified class name for java.util.Arrays is 
because this patch was initially developed outside of java.util? Can 
parse4Nibbles be moved to after the declaration of NIBBLES to avoid it 
being in the middle of the two fromString methods?


-Alan.


Re: RFR: 8196334: Optimize UUID#fromString

2020-03-01 Thread Claes Redestad




On 2020-02-29 00:58, Ivan Gerasimov wrote:

Sounds reasonable.

I think the current proposal is clearly a progress, so it looks good.


Ok!

After discussing it a bit offline, Andriy and I would like to simplify
the dash-checking code a bit to make it more clear, as there seems to be
no obvious performance drawback of simplifying to (ch1 == '-' && ...).

New webrev: http://cr.openjdk.java.net/~redestad/8196334/open.01/

Thanks!

/Claes


Re: RFR: 8196334: Optimize UUID#fromString

2020-02-28 Thread Ivan Gerasimov

Sounds reasonable.

I think the current proposal is clearly a progress, so it looks good.

I agree that any further optimization can be considered within a 
separate enhancement request.


With kind regards,

Ivan

On 2/28/20 2:59 PM, Claes Redestad wrote:

Hi Ivan,

I've considered trying it out, but it gets complicated as we don't want
to penalize -XX:-UseCompactStrings.

I think adding special methods for UUID inside StringLatin1 is a step
in the wrong direction, but maybe there's something more generic we can
consider. However, short of adding String.value()+coder() access via
JavaLangAccess I'm not sure there's anything that'd net us any
significant gain. And I don't think it's worth it to open up that
particular can of worms right now just for this.

/Claes

On 2020-02-28 20:51, Ivan Gerasimov wrote:

Hi Claes and Andriy!

It looks good overall.

I wonder if it'll be even faster if the fast path were implemented in 
StringLatin1 (for compact strings only) and was called via 
SharedSecrets/JavaLangAccess?


Then, you could operate directly on bytes and avoid dealing with longs.

With kind regards,

Ivan


On 2/28/20 6:22 AM, Claes Redestad wrote:

Hi all,

please review this patch to optimize UUID#fromString.

Jon Chambers originally proposed a patch that used a strict parser to
get a similar speed-up, but I failed to adapt it in a way that
could fall back to the less strict behavior while maintaining a
reasonable speed-up in the fast-path case. Sorry, Jon!

The patch proposed here was recently contributed by Andriy Plokhotnyuk
(OCA signed), and manages to get more than a 3x speed-up on the new
fromString microbenchmark, while falling back gently to the current,
less strict implementation if ever needed. I've done some light edits,
and added a simple microbenchmark.

Bug:    https://bugs.openjdk.java.net/browse/JDK-8196334
Webrev: http://cr.openjdk.java.net/~redestad/8196334/open.00/

Testing: tier1-3

Thanks!

/Claes



--
With kind regards,
Ivan Gerasimov



Re: RFR: 8196334: Optimize UUID#fromString

2020-02-28 Thread Claes Redestad

Hi Ivan,

I've considered trying it out, but it gets complicated as we don't want
to penalize -XX:-UseCompactStrings.

I think adding special methods for UUID inside StringLatin1 is a step
in the wrong direction, but maybe there's something more generic we can
consider. However, short of adding String.value()+coder() access via
JavaLangAccess I'm not sure there's anything that'd net us any
significant gain. And I don't think it's worth it to open up that
particular can of worms right now just for this.

/Claes

On 2020-02-28 20:51, Ivan Gerasimov wrote:

Hi Claes and Andriy!

It looks good overall.

I wonder if it'll be even faster if the fast path were implemented in 
StringLatin1 (for compact strings only) and was called via 
SharedSecrets/JavaLangAccess?


Then, you could operate directly on bytes and avoid dealing with longs.

With kind regards,

Ivan


On 2/28/20 6:22 AM, Claes Redestad wrote:

Hi all,

please review this patch to optimize UUID#fromString.

Jon Chambers originally proposed a patch that used a strict parser to
get a similar speed-up, but I failed to adapt it in a way that
could fall back to the less strict behavior while maintaining a
reasonable speed-up in the fast-path case. Sorry, Jon!

The patch proposed here was recently contributed by Andriy Plokhotnyuk
(OCA signed), and manages to get more than a 3x speed-up on the new
fromString microbenchmark, while falling back gently to the current,
less strict implementation if ever needed. I've done some light edits,
and added a simple microbenchmark.

Bug:    https://bugs.openjdk.java.net/browse/JDK-8196334
Webrev: http://cr.openjdk.java.net/~redestad/8196334/open.00/

Testing: tier1-3

Thanks!

/Claes




Re: RFR: 8196334: Optimize UUID#fromString

2020-02-28 Thread Ivan Gerasimov

Hi Claes and Andriy!

It looks good overall.

I wonder if it'll be even faster if the fast path were implemented in 
StringLatin1 (for compact strings only) and was called via 
SharedSecrets/JavaLangAccess?


Then, you could operate directly on bytes and avoid dealing with longs.

With kind regards,

Ivan


On 2/28/20 6:22 AM, Claes Redestad wrote:

Hi all,

please review this patch to optimize UUID#fromString.

Jon Chambers originally proposed a patch that used a strict parser to
get a similar speed-up, but I failed to adapt it in a way that
could fall back to the less strict behavior while maintaining a
reasonable speed-up in the fast-path case. Sorry, Jon!

The patch proposed here was recently contributed by Andriy Plokhotnyuk
(OCA signed), and manages to get more than a 3x speed-up on the new
fromString microbenchmark, while falling back gently to the current,
less strict implementation if ever needed. I've done some light edits,
and added a simple microbenchmark.

Bug:    https://bugs.openjdk.java.net/browse/JDK-8196334
Webrev: http://cr.openjdk.java.net/~redestad/8196334/open.00/

Testing: tier1-3

Thanks!

/Claes


--
With kind regards,
Ivan Gerasimov



RFR: 8196334: Optimize UUID#fromString

2020-02-28 Thread Claes Redestad

Hi all,

please review this patch to optimize UUID#fromString.

Jon Chambers originally proposed a patch that used a strict parser to
get a similar speed-up, but I failed to adapt it in a way that
could fall back to the less strict behavior while maintaining a
reasonable speed-up in the fast-path case. Sorry, Jon!

The patch proposed here was recently contributed by Andriy Plokhotnyuk
(OCA signed), and manages to get more than a 3x speed-up on the new
fromString microbenchmark, while falling back gently to the current,
less strict implementation if ever needed. I've done some light edits,
and added a simple microbenchmark.

Bug:https://bugs.openjdk.java.net/browse/JDK-8196334
Webrev: http://cr.openjdk.java.net/~redestad/8196334/open.00/

Testing: tier1-3

Thanks!

/Claes