On Fri, 7 May 2021 18:31:15 GMT, Jatin Bhateja <jbhat...@openjdk.org> wrote:

>> Current VectorAPI Java side implementation expresses rotateLeft and
>> rotateRight operation using following operations:-
>>
>>     vec1 = lanewise(VectorOperators.LSHL, n)
>>     vec2 = lanewise(VectorOperators.LSHR, n)
>>     res = lanewise(VectorOperations.OR, vec1 , vec2)
>>
>> This patch moves above handling from Java side to C2 compiler which
>> facilitates dismantling the rotate operation if target ISA does not support
>> a direct rotate instruction.
>>
>> AVX512 added vector rotate instructions vpro[rl][v][dq] which operate over
>> long and integer type vectors. For other cases (i.e. sub-word type vectors
>> or for targets which do not support direct rotate operations ) instruction
>> sequence comprising of vector SHIFT (LEFT/RIGHT) and vector OR is emitted.
>>
>> Please find below the performance data for included JMH benchmark.
>> Machine: Intel(R) Xeon(R) Platinum 8280 CPU @ 2.70GHz (Cascade lake Server)
>>
>> ``
>>
>> Benchmark | (TESTSIZE) | Shift | Baseline (ops/ms) | Withopt (ops/ms) | Gain %
>> -- | -- | -- | -- | -- | --
>> RotateBenchmark.testRotateLeftI | (size) | (shift) | 7384.747 | 7706.652 | 4.36
>> RotateBenchmark.testRotateLeftI | 64 | 11 | 3723.305 | 3816.968 | 2.52
>> RotateBenchmark.testRotateLeftI | 128 | 11 | 1811.521 | 1966.05 | 8.53
>> RotateBenchmark.testRotateLeftI | 256 | 11 | 7133.296 | 7715.047 | 8.16
>> RotateBenchmark.testRotateLeftI | 64 | 21 | 3612.144 | 3886.225 | 7.59
>> RotateBenchmark.testRotateLeftI | 128 | 21 | 1815.422 | 1962.753 | 8.12
>> RotateBenchmark.testRotateLeftI | 256 | 21 | 7216.353 | 7677.165 | 6.39
>> RotateBenchmark.testRotateLeftI | 64 | 31 | 3602.008 | 3892.297 | 8.06
>> RotateBenchmark.testRotateLeftI | 128 | 31 | 1882.163 | 1958.887 | 4.08
>> RotateBenchmark.testRotateLeftI | 256 | 31 | 11819.443 | 11912.864 | 0.79
>> RotateBenchmark.testRotateLeftI | 64 | 11 | 5978.475 | 6060.189 | 1.37
>> RotateBenchmark.testRotateLeftI | 128 | 11 | 2965.179 | 3060.969 | 3.23
>> RotateBenchmark.testRotateLeftI | 256 | 11 | 11479.579 | 11684.148 | 1.78
>> RotateBenchmark.testRotateLeftI | 64 | 21 | 5904.903 | 6094.409 | 3.21
>> RotateBenchmark.testRotateLeftI | 128 | 21 | 2969.879 | 3074.1 | 3.51
>> RotateBenchmark.testRotateLeftI | 256 | 21 | 11531.654 | 12155.954 | 5.41
>> RotateBenchmark.testRotateLeftI | 64 | 31 | 5730.918 | 6112.514 | 6.66
>> RotateBenchmark.testRotateLeftI | 128 | 31 | 2937.19 | 2976.297 | 1.33
>> RotateBenchmark.testRotateLeftI | 256 | 31 | 16159.184 | 16459.462 | 1.86
>> RotateBenchmark.testRotateLeftI | 64 | 11 | 8154.982 | 8396.089 | 2.96
>> RotateBenchmark.testRotateLeftI | 128 | 11 | 4142.224 | 4292.049 | 3.62
>> RotateBenchmark.testRotateLeftI | 256 | 11 | 15958.154 | 16163.518 | 1.29
>> RotateBenchmark.testRotateLeftI | 64 | 21 | 8098.805 | 8504.279 | 5.01
>> RotateBenchmark.testRotateLeftI | 128 | 21 | 4137.598 | 4314.868 | 4.28
>> RotateBenchmark.testRotateLeftI | 256 | 21 | 16201.666 | 15992.958 | -1.29
>> RotateBenchmark.testRotateLeftI | 64 | 31 | 8027.169 | 8484.379 | 5.70
>> RotateBenchmark.testRotateLeftI | 128 | 31 | 4146.29 | 4039.681 | -2.57
>> RotateBenchmark.testRotateLeftL | 256 | 31 | 3566.176 | 3805.248 | 6.70
>> RotateBenchmark.testRotateLeftL | 64 | 11 | 1820.219 | 1962.866 | 7.84
>> RotateBenchmark.testRotateLeftL | 128 | 11 | 917.085 | 1007.334 | 9.84
>> RotateBenchmark.testRotateLeftL | 256 | 11 | 3592.139 | 3973.698 | 10.62
>> RotateBenchmark.testRotateLeftL | 64 | 21 | 1827.63 | 1999.711 | 9.42
>> RotateBenchmark.testRotateLeftL | 128 | 21 | 907.104 | 1002.997 | 10.57
>> RotateBenchmark.testRotateLeftL | 256 | 21 | 3780.962 | 3873.489 | 2.45
>> RotateBenchmark.testRotateLeftL | 64 | 31 | 1830.121 | 1955.63 | 6.86
>> RotateBenchmark.testRotateLeftL | 128 | 31 | 891.411 | 982.138 | 10.18
>> RotateBenchmark.testRotateLeftL | 256 | 31 | 5890.544 | 6100.594 | 3.57
>> RotateBenchmark.testRotateLeftL | 64 | 11 | 2984.329 | 3021.971 | 1.26
>> RotateBenchmark.testRotateLeftL | 128 | 11 | 1485.109 | 1527.689 | 2.87
>> RotateBenchmark.testRotateLeftL | 256 | 11 | 5903.411 | 6083.775 | 3.06
>> RotateBenchmark.testRotateLeftL | 64 | 21 | 2925.37 | 3050.958 | 4.29
>> RotateBenchmark.testRotateLeftL | 128 | 21 | 1486.432 | 1537.155 | 3.41
>> RotateBenchmark.testRotateLeftL | 256 | 21 | 5853.721 | 6000.682 | 2.51
>> RotateBenchmark.testRotateLeftL | 64 | 31 | 2896.116 | 3072.783 | 6.10
>> RotateBenchmark.testRotateLeftL | 128 | 31 | 1483.132 | 1546.588 | 4.28
>> RotateBenchmark.testRotateLeftL | 256 | 31 | 8059.206 | 8218.047 | 1.97
>> RotateBenchmark.testRotateLeftL | 64 | 11 | 4022.416 | 4195.52 | 4.30
>> RotateBenchmark.testRotateLeftL | 128 | 11 | 2084.296 | 2068.238 | -0.77
>> RotateBenchmark.testRotateLeftL | 256 | 11 | 7971.832 | 8172.819 | 2.52
>> RotateBenchmark.testRotateLeftL | 64 | 21 | 4032.036 | 4344.469 | 7.75
>> RotateBenchmark.testRotateLeftL | 128 | 21 | 2068.957 | 2138.685 | 3.37
>> RotateBenchmark.testRotateLeftL | 256 | 21 | 8140.63 | 8003.283 | -1.69
>> RotateBenchmark.testRotateLeftL | 64 | 31 | 4088.621 | 4296.091 | 5.07
>> RotateBenchmark.testRotateLeftL | 128 | 31 | 2007.753 | 2088.455 | 4.02
>> RotateBenchmark.testRotateRightI | 256 | 31 | 7358.793 | 7548.976 | 2.58
>> RotateBenchmark.testRotateRightI | 64 | 11 | 3648.868 | 3897.47 | 6.81
>> RotateBenchmark.testRotateRightI | 128 | 11 | 1862.73 | 1969.964 | 5.76
>> RotateBenchmark.testRotateRightI | 256 | 11 | 7268.806 | 7790.588 | 7.18
>> RotateBenchmark.testRotateRightI | 64 | 21 | 3577.79 | 3979.675 | 11.23
>> RotateBenchmark.testRotateRightI | 128 | 21 | 1773.243 | 1921.088 | 8.34
>> RotateBenchmark.testRotateRightI | 256 | 21 | 7084.974 | 7609.912 | 7.41
>> RotateBenchmark.testRotateRightI | 64 | 31 | 3688.781 | 3909.65 | 5.99
>> RotateBenchmark.testRotateRightI | 128 | 31 | 1845.978 | 1928.316 | 4.46
>> RotateBenchmark.testRotateRightI | 256 | 31 | 11463.228 | 12179.833 | 6.25
>> RotateBenchmark.testRotateRightI | 64 | 11 | 5678.052 | 6028.573 | 6.17
>> RotateBenchmark.testRotateRightI | 128 | 11 | 2990.419 | 3070.409 | 2.67
>> RotateBenchmark.testRotateRightI | 256 | 11 | 11780.283 | 12105.261 | 2.76
>> RotateBenchmark.testRotateRightI | 64 | 21 | 5827.8 | 6020.208 | 3.30
>> RotateBenchmark.testRotateRightI | 128 | 21 | 2904.852 | 3047.154 | 4.90
>> RotateBenchmark.testRotateRightI | 256 | 21 | 11359.146 | 12060.401 | 6.17
>> RotateBenchmark.testRotateRightI | 64 | 31 | 5823.207 | 6079.82 | 4.41
>> RotateBenchmark.testRotateRightI | 128 | 31 | 2984.484 | 3045.719 | 2.05
>> RotateBenchmark.testRotateRightI | 256 | 31 | 16200.504 | 16376.475 | 1.09
>> RotateBenchmark.testRotateRightI | 64 | 11 | 8118.399 | 8315.407 | 2.43
>> RotateBenchmark.testRotateRightI | 128 | 11 | 4130.745 | 4092.588 | -0.92
>> RotateBenchmark.testRotateRightI | 256 | 11 | 15842.168 | 16469.119 | 3.96
>> RotateBenchmark.testRotateRightI | 64 | 21 | 7855.164 | 8188.913 | 4.25
>> RotateBenchmark.testRotateRightI | 128 | 21 | 4114.378 | 4035.56 | -1.92
>> RotateBenchmark.testRotateRightI | 256 | 21 | 15636.117 | 16289.632 | 4.18
>> RotateBenchmark.testRotateRightI | 64 | 31 | 8108.067 | 7996.517 | -1.38
>> RotateBenchmark.testRotateRightI | 128 | 31 | 3997.547 | 4153.58 | 3.90
>> RotateBenchmark.testRotateRightL | 256 | 31 | 3685.99 | 3814.384 | 3.48
>> RotateBenchmark.testRotateRightL | 64 | 11 | 1787.875 | 1916.541 | 7.20
>> RotateBenchmark.testRotateRightL | 128 | 11 | 940.141 | 990.383 | 5.34
>> RotateBenchmark.testRotateRightL | 256 | 11 | 3745.968 | 3920.667 | 4.66
>> RotateBenchmark.testRotateRightL | 64 | 21 | 1877.94 | 1998.072 | 6.40
>> RotateBenchmark.testRotateRightL | 128 | 21 | 933.536 | 1004.61 | 7.61
>> RotateBenchmark.testRotateRightL | 256 | 21 | 3744.763 | 3947.427 | 5.41
>> RotateBenchmark.testRotateRightL | 64 | 31 | 1864.818 | 1978.277 | 6.08
>> RotateBenchmark.testRotateRightL | 128 | 31 | 906.965 | 998.692 | 10.11
>> RotateBenchmark.testRotateRightL | 256 | 31 | 5910.469 | 6062.429 | 2.57
>> RotateBenchmark.testRotateRightL | 64 | 11 | 2914.64 | 3033.127 | 4.07
>> RotateBenchmark.testRotateRightL | 128 | 11 | 1491.344 | 1543.936 | 3.53
>> RotateBenchmark.testRotateRightL | 256 | 11 | 5801.818 | 6098.892 | 5.12
>> RotateBenchmark.testRotateRightL | 64 | 21 | 2881.328 | 3089.547 | 7.23
>> RotateBenchmark.testRotateRightL | 128 | 21 | 1485.969 | 1526.231 | 2.71
>> RotateBenchmark.testRotateRightL | 256 | 21 | 5783.495 | 5957.649 | 3.01
>> RotateBenchmark.testRotateRightL | 64 | 31 | 3008.182 | 3026.323 | 0.60
>> RotateBenchmark.testRotateRightL | 128 | 31 | 1464.566 | 1546.825 | 5.62
>> RotateBenchmark.testRotateRightL | 256 | 31 | 8208.124 | 8361.437 | 1.87
>> RotateBenchmark.testRotateRightL | 64 | 11 | 4062.465 | 4319.412 | 6.32
>> RotateBenchmark.testRotateRightL | 128 | 11 | 2029.995 | 2086.497 | 2.78
>> RotateBenchmark.testRotateRightL | 256 | 11 | 8183.789 | 8193.087 | 0.11
>> RotateBenchmark.testRotateRightL | 64 | 21 | 4092.686 | 4193.712 | 2.47
>> RotateBenchmark.testRotateRightL | 128 | 21 | 2036.854 | 2038.927 | 0.10
>> RotateBenchmark.testRotateRightL | 256 | 21 | 8155.015 | 8175.792 | 0.25
>> RotateBenchmark.testRotateRightL | 64 | 31 | 3960.629 | 4263.922 | 7.66
>> RotateBenchmark.testRotateRightL | 128 | 31 | 1996.862 | 2055.486 | 2.94
>>
>> ``
>
> Jatin Bhateja has updated the pull request incrementally with one additional
> commit since the last revision:
>
>   8266054: Review comments resolution.

Java code updates look good.

I believe you can remove the four "*-Rotate_*.template" files, now that you
leverage existing templates?

Also, I believe the ancillary changes to `gen-template.sh` are no longer
strictly required, now that we defer to method calls for ROL/ROR?

-------------

PR: https://git.openjdk.java.net/jdk/pull/3720
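
For readers following the thread, here is a minimal scalar sketch of the shift/OR decomposition that the quoted description refers to. It is not the Vector API code or the C2 lowering from the PR. The class name `RotateSketch` and its methods are hypothetical, chosen only for illustration. The quoted pseudo-code abbreviates the shift counts; a rotate by n pairs a left shift by n with an unsigned right shift by (lane width - n), which is what the sketch does per element and checks against `Integer.rotateLeft` and friends.

```java
public class RotateSketch {
    // Rotate-left for a 32-bit lane built from two shifts and an OR.
    // Java masks int shift counts to the low 5 bits, so n == 0 also works:
    // (32 - 0) is masked to 0 and the result is x | x == x.
    static int rotateLeftViaShifts(int x, int n) {
        return (x << n) | (x >>> (32 - n));
    }

    // Rotate-right is the mirror image: swap the two shift directions.
    static int rotateRightViaShifts(int x, int n) {
        return (x >>> n) | (x << (32 - n));
    }

    // Same idea for a 64-bit lane; long shift counts are masked to 6 bits.
    static long rotateLeftViaShifts(long x, int n) {
        return (x << n) | (x >>> (64 - n));
    }

    static long rotateRightViaShifts(long x, int n) {
        return (x >>> n) | (x << (64 - n));
    }

    public static void main(String[] args) {
        int[] shifts = {11, 21, 31}; // the shift counts used in the quoted benchmark
        int xi = 0x12345678;
        long xl = 0x0123456789ABCDEFL;
        for (int n : shifts) {
            // Compare the decomposition against the JDK's own rotate methods.
            boolean ok = rotateLeftViaShifts(xi, n) == Integer.rotateLeft(xi, n)
                    && rotateRightViaShifts(xi, n) == Integer.rotateRight(xi, n)
                    && rotateLeftViaShifts(xl, n) == Long.rotateLeft(xl, n)
                    && rotateRightViaShifts(xl, n) == Long.rotateRight(xl, n);
            System.out.println("shift " + n + " matches JDK rotate: " + ok);
        }
    }
}
```

On a target with a direct rotate instruction the two shifts and the OR collapse into one operation, which is the motivation the quoted description gives for moving the decomposition into the compiler.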