Re: RFR: 8196334: Optimize UUID#fromString
Hi Peter, sorry about defaulting to throughput measurement - while average time has its advantages (especially on nanobenchmarks), most benchmarks are intuitively "higher is better", so I lean that way by old habit. Thanks for trying out your ideas, though! It's important we try out and report the result of reasonable alternatives, even when they fall short. I've done a number of experiments here that I should have probably made a note of somewhere... Thanks! /Claes On 2020-03-08 12:59, Peter Levart wrote: Sorry, bash this. I just noticed that the units used in benchmark are ops/us and that the best performing code is the one which is in place now. I usually use ns/op as a unit in JMH benchmarks, so this led me to false belief that alternatives are better performing where in fact are worse. Sorry for noise. Regards, Peter On 3/8/20 12:46 PM, Peter Levart wrote: Hi, When I noticed this optimization, I had some ideas how to push this even further, but only now had some time to try them out. Here are some incremental improvements [1] to the already brilliant idea with the following JMH results (the baseline is JDK build with patch for 8196334 already applied): Benchmark (size) Mode Cnt Score Error Units UUIDBench.uuidFromString 2 thrpt 5 37.344 ± 0.257 ops/us UUIDBench.uuidFromStringAlt1 2 thrpt 5 29.069 ± 0.137 ops/us UUIDBench.uuidFromStringAlt2 2 thrpt 5 27.308 ± 0.204 ops/us UUIDBench.uuidFromStringAlt3 2 thrpt 5 23.195 ± 0.885 ops/us As can be seen, the trick was to eliminate branches (alt1, alt2) and to reduce the number of operations (shifts) in alt3. One last trick in alt3 was to use the array length in a check that is then fused with array index bounds check (I think) so that we get one of them for free. What do you think? Is this good enough to warrant another change? As you can see, I haven't prepared the patch yet, just benchmark. Regards, Peter [1] http://cr.openjdk.java.net/~plevart/jdk-dev/8196334_UUID.fromString/UUIDBench.java On 3/2/20 8:26 AM, Claes Redestad wrote: On 2020-03-02 08:14, Alan Bateman wrote: http://cr.openjdk.java.net/~redestad/8196334/open.02/ Looks good. Thanks - pushed! /Claes
Re: RFR: 8196334: Optimize UUID#fromString
Sorry, bash this. I just noticed that the units used in benchmark are ops/us and that the best performing code is the one which is in place now. I usually use ns/op as a unit in JMH benchmarks, so this led me to false belief that alternatives are better performing where in fact are worse. Sorry for noise. Regards, Peter On 3/8/20 12:46 PM, Peter Levart wrote: Hi, When I noticed this optimization, I had some ideas how to push this even further, but only now had some time to try them out. Here are some incremental improvements [1] to the already brilliant idea with the following JMH results (the baseline is JDK build with patch for 8196334 already applied): Benchmark (size) Mode Cnt Score Error Units UUIDBench.uuidFromString 2 thrpt 5 37.344 ± 0.257 ops/us UUIDBench.uuidFromStringAlt1 2 thrpt 5 29.069 ± 0.137 ops/us UUIDBench.uuidFromStringAlt2 2 thrpt 5 27.308 ± 0.204 ops/us UUIDBench.uuidFromStringAlt3 2 thrpt 5 23.195 ± 0.885 ops/us As can be seen, the trick was to eliminate branches (alt1, alt2) and to reduce the number of operations (shifts) in alt3. One last trick in alt3 was to use the array length in a check that is then fused with array index bounds check (I think) so that we get one of them for free. What do you think? Is this good enough to warrant another change? As you can see, I haven't prepared the patch yet, just benchmark. Regards, Peter [1] http://cr.openjdk.java.net/~plevart/jdk-dev/8196334_UUID.fromString/UUIDBench.java On 3/2/20 8:26 AM, Claes Redestad wrote: On 2020-03-02 08:14, Alan Bateman wrote: http://cr.openjdk.java.net/~redestad/8196334/open.02/ Looks good. Thanks - pushed! /Claes
Re: RFR: 8196334: Optimize UUID#fromString
Hi, When I noticed this optimization, I had some ideas how to push this even further, but only now had some time to try them out. Here are some incremental improvements [1] to the already brilliant idea with the following JMH results (the baseline is JDK build with patch for 8196334 already applied): Benchmark (size) Mode Cnt Score Error Units UUIDBench.uuidFromString 2 thrpt 5 37.344 ± 0.257 ops/us UUIDBench.uuidFromStringAlt1 2 thrpt 5 29.069 ± 0.137 ops/us UUIDBench.uuidFromStringAlt2 2 thrpt 5 27.308 ± 0.204 ops/us UUIDBench.uuidFromStringAlt3 2 thrpt 5 23.195 ± 0.885 ops/us As can be seen, the trick was to eliminate branches (alt1, alt2) and to reduce the number of operations (shifts) in alt3. One last trick in alt3 was to use the array length in a check that is then fused with array index bounds check (I think) so that we get one of them for free. What do you think? Is this good enough to warrant another change? As you can see, I haven't prepared the patch yet, just benchmark. Regards, Peter [1] http://cr.openjdk.java.net/~plevart/jdk-dev/8196334_UUID.fromString/UUIDBench.java On 3/2/20 8:26 AM, Claes Redestad wrote: On 2020-03-02 08:14, Alan Bateman wrote: http://cr.openjdk.java.net/~redestad/8196334/open.02/ Looks good. Thanks - pushed! /Claes
Re: RFR: 8196334: Optimize UUID#fromString
On 2020-03-02 08:14, Alan Bateman wrote: http://cr.openjdk.java.net/~redestad/8196334/open.02/ Looks good. Thanks - pushed! /Claes
Re: RFR: 8196334: Optimize UUID#fromString
On 01/03/2020 21:05, Claes Redestad wrote: : Filed https://bugs.openjdk.java.net/browse/JDK-8240266 to investigate if we can improve the branch elimination in Character.digit and removed the comment. Also moved parse4Nibbles up before fromString and fixed the redundant FQDN for Array: http://cr.openjdk.java.net/~redestad/8196334/open.02/ Looks good. -Alan.
Re: RFR: 8196334: Optimize UUID#fromString
On 2020-03-01 20:41, Alan Bateman wrote: On 01/03/2020 18:19, Claes Redestad wrote: : After discussing it a bit offline, Andriy and I would like to simplify the dash-checking code a bit to make it more clear, as there seems to be no obvious performance drawback of simplifying to (ch1 == '-' && ...). New webrev: http://cr.openjdk.java.net/~redestad/8196334/open.01/ This looks good and keeping the old code for the size != 36 case gets around the compatibility issues discussed here with previous attempts to improve it. A few minor nits: The effective TODO to optimize Character.digit in the declaration of NIBBLES seems like something for JIRA, not a comment here. I assume the fully qualified class name for java.util.Arrays is because this patch was initially developed outside of java.util? Can parse4Nibbles be moved to after the declaration of NIBBLES to avoid it being in the middle of the two fromString methods? Filed https://bugs.openjdk.java.net/browse/JDK-8240266 to investigate if we can improve the branch elimination in Character.digit and removed the comment. Also moved parse4Nibbles up before fromString and fixed the redundant FQDN for Array: http://cr.openjdk.java.net/~redestad/8196334/open.02/ Thanks! /Claes
Re: RFR: 8196334: Optimize UUID#fromString
On 01/03/2020 18:19, Claes Redestad wrote: : After discussing it a bit offline, Andriy and I would like to simplify the dash-checking code a bit to make it more clear, as there seems to be no obvious performance drawback of simplifying to (ch1 == '-' && ...). New webrev: http://cr.openjdk.java.net/~redestad/8196334/open.01/ This looks good and keeping the old code for the size != 36 case gets around the compatibility issues discussed here with previous attempts to improve it. A few minor nits: The effective TODO to optimize Character.digit in the declaration of NIBBLES seems like something for JIRA, not a comment here. I assume the fully qualified class name for java.util.Arrays is because this patch was initially developed outside of java.util? Can parse4Nibbles be moved to after the declaration of NIBBLES to avoid it being in the middle of the two fromString methods? -Alan.
Re: RFR: 8196334: Optimize UUID#fromString
On 2020-02-29 00:58, Ivan Gerasimov wrote: Sounds reasonable. I think the current proposal is clearly a progress, so it looks good. Ok! After discussing it a bit offline, Andriy and I would like to simplify the dash-checking code a bit to make it more clear, as there seems to be no obvious performance drawback of simplifying to (ch1 == '-' && ...). New webrev: http://cr.openjdk.java.net/~redestad/8196334/open.01/ Thanks! /Claes
Re: RFR: 8196334: Optimize UUID#fromString
Sounds reasonable. I think the current proposal is clearly a progress, so it looks good. I agree that any further optimization can be considered within a separate enhancement request. With kind regards, Ivan On 2/28/20 2:59 PM, Claes Redestad wrote: Hi Ivan, I've considered trying it out, but it gets complicated as we don't want to penalize -XX:-UseCompactStrings. I think adding special methods for UUID inside StringLatin1 is a step in the wrong direction, but maybe there's something more generic we can consider. However, short of adding String.value()+coder() access via JavaLangAccess I'm not sure there's anything that'd net us any significant gain. And I don't think it's worth it to open up that particular can of worms right now just for this. /Claes On 2020-02-28 20:51, Ivan Gerasimov wrote: Hi Claes and Andriy! It looks good overall. I wonder if it'll be even faster if the fast path were implemented in StringLatin1 (for compact strings only) and was called via SharedSecrets/JavaLangAccess? Then, you could operate directly on bytes and avoid dealing with longs. With kind regards, Ivan On 2/28/20 6:22 AM, Claes Redestad wrote: Hi all, please review this patch to optimize UUID#fromString. Jon Chambers originally proposed a patch that used a strict parser to get a similar speed-up, but I failed to adapt it in a way that could fall back to the less strict behavior while maintaining a reasonable speed-up in the fast-path case. Sorry, Jon! The patch proposed here was recently contributed by Andriy Plokhotnyuk (OCA signed), and manages to get more than a 3x speed-up on the new fromString microbenchmark, while falling back gently to the current, less strict implementation if ever needed. I've done some light edits, and added a simple microbenchmark. Bug: https://bugs.openjdk.java.net/browse/JDK-8196334 Webrev: http://cr.openjdk.java.net/~redestad/8196334/open.00/ Testing: tier1-3 Thanks! /Claes -- With kind regards, Ivan Gerasimov
Re: RFR: 8196334: Optimize UUID#fromString
Hi Ivan, I've considered trying it out, but it gets complicated as we don't want to penalize -XX:-UseCompactStrings. I think adding special methods for UUID inside StringLatin1 is a step in the wrong direction, but maybe there's something more generic we can consider. However, short of adding String.value()+coder() access via JavaLangAccess I'm not sure there's anything that'd net us any significant gain. And I don't think it's worth it to open up that particular can of worms right now just for this. /Claes On 2020-02-28 20:51, Ivan Gerasimov wrote: Hi Claes and Andriy! It looks good overall. I wonder if it'll be even faster if the fast path were implemented in StringLatin1 (for compact strings only) and was called via SharedSecrets/JavaLangAccess? Then, you could operate directly on bytes and avoid dealing with longs. With kind regards, Ivan On 2/28/20 6:22 AM, Claes Redestad wrote: Hi all, please review this patch to optimize UUID#fromString. Jon Chambers originally proposed a patch that used a strict parser to get a similar speed-up, but I failed to adapt it in a way that could fall back to the less strict behavior while maintaining a reasonable speed-up in the fast-path case. Sorry, Jon! The patch proposed here was recently contributed by Andriy Plokhotnyuk (OCA signed), and manages to get more than a 3x speed-up on the new fromString microbenchmark, while falling back gently to the current, less strict implementation if ever needed. I've done some light edits, and added a simple microbenchmark. Bug: https://bugs.openjdk.java.net/browse/JDK-8196334 Webrev: http://cr.openjdk.java.net/~redestad/8196334/open.00/ Testing: tier1-3 Thanks! /Claes
Re: RFR: 8196334: Optimize UUID#fromString
Hi Claes and Andriy! It looks good overall. I wonder if it'll be even faster if the fast path were implemented in StringLatin1 (for compact strings only) and was called via SharedSecrets/JavaLangAccess? Then, you could operate directly on bytes and avoid dealing with longs. With kind regards, Ivan On 2/28/20 6:22 AM, Claes Redestad wrote: Hi all, please review this patch to optimize UUID#fromString. Jon Chambers originally proposed a patch that used a strict parser to get a similar speed-up, but I failed to adapt it in a way that could fall back to the less strict behavior while maintaining a reasonable speed-up in the fast-path case. Sorry, Jon! The patch proposed here was recently contributed by Andriy Plokhotnyuk (OCA signed), and manages to get more than a 3x speed-up on the new fromString microbenchmark, while falling back gently to the current, less strict implementation if ever needed. I've done some light edits, and added a simple microbenchmark. Bug: https://bugs.openjdk.java.net/browse/JDK-8196334 Webrev: http://cr.openjdk.java.net/~redestad/8196334/open.00/ Testing: tier1-3 Thanks! /Claes -- With kind regards, Ivan Gerasimov
RFR: 8196334: Optimize UUID#fromString
Hi all, please review this patch to optimize UUID#fromString. Jon Chambers originally proposed a patch that used a strict parser to get a similar speed-up, but I failed to adapt it in a way that could fall back to the less strict behavior while maintaining a reasonable speed-up in the fast-path case. Sorry, Jon! The patch proposed here was recently contributed by Andriy Plokhotnyuk (OCA signed), and manages to get more than a 3x speed-up on the new fromString microbenchmark, while falling back gently to the current, less strict implementation if ever needed. I've done some light edits, and added a simple microbenchmark. Bug:https://bugs.openjdk.java.net/browse/JDK-8196334 Webrev: http://cr.openjdk.java.net/~redestad/8196334/open.00/ Testing: tier1-3 Thanks! /Claes