On Fri, 25 Feb 2022 06:22:42 GMT, Jatin Bhateja <jbhat...@openjdk.org> wrote:

>> Summary of changes:
>> - Intrinsify Math.round(float) and Math.round(double) APIs.
>> - Extend auto-vectorizer to infer vector operations on encountering scalar IR nodes for above intrinsics.
>> - Test creation using new IR testing framework.
>> 
>> Following are the performance number of a JMH micro included with the patch
>> 
>> Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server)
>> 
>> 
>> Benchmark | TESTSIZE | Baseline AVX3 (ops/ms) | Withopt AVX3 (ops/ms) | Gain ratio | Baseline AVX2 (ops/ms) | Withopt AVX2 (ops/ms) | Gain ratio
>> -- | -- | -- | -- | -- | -- | -- | --
>> FpRoundingBenchmark.test_round_double | 1024.00 | 504.15 | 2209.54 | 4.38 | 510.36 | 548.39 | 1.07
>> FpRoundingBenchmark.test_round_double | 2048.00 | 293.64 | 1271.98 | 4.33 | 293.48 | 274.01 | 0.93
>> FpRoundingBenchmark.test_round_float | 1024.00 | 825.99 | 4754.66 | 5.76 | 751.83 | 2274.13 | 3.02
>> FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 | 388.52 | 1334.18 | 3.43
>> 
>> 
>> Kindly review and share your feedback.
>> 
>> Best Regards,
>> Jatin
> 
> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:
> 
>   8279508: Adding descriptive comments.

Other than this the patch looks good to me.
What testing have you done?

src/hotspot/cpu/x86/x86.ad line 7263:

> 7261:     __ vector_round_float_avx($dst$$XMMRegister, $src$$XMMRegister, $xtmp1$$XMMRegister,
> 7262:                               $xtmp2$$XMMRegister, $xtmp3$$XMMRegister, $xtmp4$$XMMRegister,
> 7263:                               ExternalAddress(vector_float_signflip()), new_mxcsr, $scratch$$Register, vlen_enc);

The vector_float_signflip() here should be replaced by vector_all_bits_set().
cvtps2dq description:
If a converted result cannot be represented in the destination format, the floating-point invalid exception is raised, and if this exception is masked, the indefinite integer value (2^(w-1), where w represents the number of bits in the destination format) is returned.

src/hotspot/cpu/x86/x86.ad line 7280:

> 7278:     __ vector_round_float_evex($dst$$XMMRegister, $src$$XMMRegister, $xtmp1$$XMMRegister,
> 7279:                                $xtmp2$$XMMRegister, $ktmp1$$KRegister, $ktmp2$$KRegister,
> 7280:                                ExternalAddress(vector_float_signflip()), new_mxcsr, $scratch$$Register, vlen_enc);

The vector_float_signflip() here should be replaced by vector_all_bits_set().

src/hotspot/cpu/x86/x86.ad line 7295:

> 7293:     __ vector_round_double_evex($dst$$XMMRegister, $src$$XMMRegister, $xtmp1$$XMMRegister,
> 7294:                                 $xtmp2$$XMMRegister, $ktmp1$$KRegister, $ktmp2$$KRegister,
> 7295:                                 ExternalAddress(vector_double_signflip()), new_mxcsr, $scratch$$Register, vlen_enc);

The vector_double_signflip() here should be replaced by vector_all_bits_set().

vcvtpd2qq description:
If a converted result cannot be represented in the destination format, the floating-point invalid exception is raised, and if this exception is masked, the indefinite integer value (2^(w-1), where w represents the number of bits in the destination format) is returned.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7094