This patch extends the existing macro logic inferencing algorithm to handle masked logic operations.

Existing algorithm:
1. Identify logic cone roots.
2. Pack parent and logic child nodes into a MacroLogic node in a bottom-up traversal if the input constraint is met, i.e. the maximum number of inputs a macro logic node can have.
3. Perform symbolic evaluation of the logic expression tree by assigning to each input the value corresponding to a truth table column.
4. The inputs together with the encoded function represent a macro logic node, which mimics a truth table.

Modification:
Extended the packing algorithm to operate on both predicated and non-predicated logic nodes. The following rules define the criteria under which nodes get packed into a macro logic node:
1. The parent and both child nodes are all unmasked, or all masked with the same predicate.
2. A masked parent can be packed with its left child if the child is predicated and both have the same predicate.
3. A masked parent can be packed with its right child if the child is un-predicated or has a matching predication condition.
4. An unmasked parent can be packed with an unmasked child.

The new jtreg test case added with the patch exhaustively covers the different combinations of predication of parent and child nodes.

Performance numbers for the JMH benchmark included with the patch are given below.
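For reviewers less familiar with the encoding, steps 3 and 4 of the existing algorithm and the packing rules listed above can be illustrated with a small standalone sketch. This is not the C2 code from the patch: the class `MacroLogicSketch`, the `Logic3` interface, and the methods `encode`, `evaluate`, `canPackLeft` and `canPackRight` are hypothetical names, the three-input limit and the column constants 0xF0/0xCC/0xAA are assumed from the usual ternary-logic truth-table convention, and a mask is modelled simply as a nullable object (null meaning unmasked). The two `canPack` methods are only my reading of rules 1 to 4.

```java
import java.util.Objects;

class MacroLogicSketch {
    // Truth-table columns for a three-input function (assumed convention):
    // bit i of each constant is the value that input takes in table row i.
    static final int A = 0xF0, B = 0xCC, C = 0xAA;

    // Hypothetical stand-in for a logic expression tree over three inputs.
    interface Logic3 { int apply(int a, int b, int c); }

    // Step 3: evaluate the expression symbolically on the column constants.
    // The low 8 bits of the result are the encoded function (the truth table).
    static int encode(Logic3 expr) {
        return expr.apply(A, B, C) & 0xFF;
    }

    // Step 4: inputs plus encoded function behave like a truth-table lookup,
    // applied independently to every bit position.
    static int evaluate(int func, int a, int b, int c) {
        int result = 0;
        for (int bit = 0; bit < 32; bit++) {
            int row = (((a >>> bit) & 1) << 2)
                    | (((b >>> bit) & 1) << 1)
                    |  ((c >>> bit) & 1);
            result |= ((func >>> row) & 1) << bit;
        }
        return result;
    }

    // Rules 2 and 4: a masked parent needs a left child with the same mask;
    // an unmasked parent needs an unmasked child.
    static boolean canPackLeft(Object parentMask, Object childMask) {
        if (parentMask == null) return childMask == null;
        return Objects.equals(parentMask, childMask);
    }

    // Rules 3 and 4: a masked parent accepts a right child that is unmasked
    // or carries the same mask; an unmasked parent needs an unmasked child.
    static boolean canPackRight(Object parentMask, Object childMask) {
        if (parentMask == null) return childMask == null;
        return childMask == null || Objects.equals(parentMask, childMask);
    }
}
```

With these definitions, `encode((a, b, c) -> (a & b) | c)` yields 0xEA and `encode((a, b, c) -> a ^ b ^ c)` yields 0x96, and `evaluate` with either encoded value reproduces the original expression on arbitrary int inputs. Rule 1 corresponds to both `canPackLeft` and `canPackRight` holding for the same parent.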
Machine Configuration: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (40C 2S Icelake Server)

Benchmark | ARRAYLEN | Baseline (ops/s) | Withopt (ops/s) | Gain (withopt/baseline)
-- | -- | -- | -- | --
o.o.b.vm.compiler.MacroLogicOpt.workload1_caller | 64 | 2365.421 | 5136.283 | 2.171403315
o.o.b.vm.compiler.MacroLogicOpt.workload1_caller | 128 | 2034.1 | 4073.381 | 2.002547072
o.o.b.vm.compiler.MacroLogicOpt.workload1_caller | 256 | 1568.694 | 2811.975 | 1.792558013
o.o.b.vm.compiler.MacroLogicOpt.workload1_caller | 512 | 883.261 | 1662.771 | 1.882536419
o.o.b.vm.compiler.MacroLogicOpt.workload1_caller | 1024 | 469.513 | 732.81 | 1.560787454
o.o.b.vm.compiler.MacroLogicOpt.workload2_caller | 64 | 273.049 | 552.106 | 2.022003377
o.o.b.vm.compiler.MacroLogicOpt.workload2_caller | 128 | 219.624 | 359.775 | 1.63814064
o.o.b.vm.compiler.MacroLogicOpt.workload2_caller | 256 | 131.649 | 182.23 | 1.384211046
o.o.b.vm.compiler.MacroLogicOpt.workload2_caller | 512 | 71.452 | 81.522 | 1.140933774
o.o.b.vm.compiler.MacroLogicOpt.workload2_caller | 1024 | 37.427 | 41.966 | 1.121276084
o.o.b.vm.compiler.MacroLogicOpt.workload3_caller | 64 | 2805.759 | 3383.16 | 1.205791374
o.o.b.vm.compiler.MacroLogicOpt.workload3_caller | 128 | 2069.012 | 2250.37 | 1.087654397
o.o.b.vm.compiler.MacroLogicOpt.workload3_caller | 256 | 1098.766 | 1101.996 | 1.002939661
o.o.b.vm.compiler.MacroLogicOpt.workload3_caller | 512 | 470.035 | 484.732 | 1.031267884
o.o.b.vm.compiler.MacroLogicOpt.workload3_caller | 1024 | 202.827 | 209.073 | 1.030794717
o.o.b.jdk.incubator.vector.MaskedLogicOpts.bitwiseBlendOperationInt128 | 256 | 3435.989 | 4418.09 | 1.285827749
o.o.b.jdk.incubator.vector.MaskedLogicOpts.bitwiseBlendOperationInt128 | 512 | 1524.803 | 1678.201 | 1.100601848
o.o.b.jdk.incubator.vector.MaskedLogicOpts.bitwiseBlendOperationInt128 | 1024 | 972.501 | 1166.734 | 1.199725244
o.o.b.jdk.incubator.vector.MaskedLogicOpts.bitwiseBlendOperationInt256 | 256 | 5980.85 | 7584.17 | 1.268075608
o.o.b.jdk.incubator.vector.MaskedLogicOpts.bitwiseBlendOperationInt256 | 512 | 3258.108 | 3939.23 | 1.209054457
o.o.b.jdk.incubator.vector.MaskedLogicOpts.bitwiseBlendOperationInt256 | 1024 | 1475.365 | 1511.159 | 1.024261115
o.o.b.jdk.incubator.vector.MaskedLogicOpts.bitwiseBlendOperationInt512 | 256 | 4208.766 | 4220.678 | 1.002830283
o.o.b.jdk.incubator.vector.MaskedLogicOpts.bitwiseBlendOperationInt512 | 512 | 2056.651 | 2049.489 | 0.99651764
o.o.b.jdk.incubator.vector.MaskedLogicOpts.bitwiseBlendOperationInt512 | 1024 | 1110.461 | 1116.448 | 1.005391455
o.o.b.jdk.incubator.vector.MaskedLogicOpts.bitwiseBlendOperationLong256 | 256 | 3259.348 | 3947.94 | 1.211266793
o.o.b.jdk.incubator.vector.MaskedLogicOpts.bitwiseBlendOperationLong256 | 512 | 1515.147 | 1536.647 | 1.014190042
o.o.b.jdk.incubator.vector.MaskedLogicOpts.bitwiseBlendOperationLong256 | 1024 | 911.58 | 1030.54 | 1.130498695
o.o.b.jdk.incubator.vector.MaskedLogicOpts.bitwiseBlendOperationLong512 | 256 | 2034.611 | 2073.764 | 1.019243482
o.o.b.jdk.incubator.vector.MaskedLogicOpts.bitwiseBlendOperationLong512 | 512 | 1110.659 | 1116.093 | 1.004892591
o.o.b.jdk.incubator.vector.MaskedLogicOpts.bitwiseBlendOperationLong512 | 1024 | 559.269 | 559.651 | 1.000683034
o.o.b.jdk.incubator.vector.MaskedLogicOpts.maskedLogicOperationsInt128 | 256 | 3636.141 | 4446.505 | 1.222863745
o.o.b.jdk.incubator.vector.MaskedLogicOpts.maskedLogicOperationsInt128 | 512 | 1433.145 | 1681.261 | 1.173126934
o.o.b.jdk.incubator.vector.MaskedLogicOpts.maskedLogicOperationsInt128 | 1024 | 1000.107 | 1172.866 | 1.172740517
o.o.b.jdk.incubator.vector.MaskedLogicOpts.maskedLogicOperationsInt256 | 256 | 5568.313 | 7670.259 | 1.37748345
o.o.b.jdk.incubator.vector.MaskedLogicOpts.maskedLogicOperationsInt256 | 512 | 3350.108 | 3927.803 | 1.172440709
o.o.b.jdk.incubator.vector.MaskedLogicOpts.maskedLogicOperationsInt256 | 1024 | 1495.966 | 1541.56 | 1.030477965
o.o.b.jdk.incubator.vector.MaskedLogicOpts.maskedLogicOperationsInt512 | 256 | 4230.379 | 4282.154 | 1.012238856
o.o.b.jdk.incubator.vector.MaskedLogicOpts.maskedLogicOperationsInt512 | 512 | 2029.801 | 2049.638 | 1.009772879
o.o.b.jdk.incubator.vector.MaskedLogicOpts.maskedLogicOperationsInt512 | 1024 | 1108.738 | 1118.897 | 1.00916267
o.o.b.jdk.incubator.vector.MaskedLogicOpts.maskedLogicOperationsLong256 | 256 | 3802.801 | 3783.537 | 0.99493426
o.o.b.jdk.incubator.vector.MaskedLogicOpts.maskedLogicOperationsLong256 | 512 | 1546.244 | 1552.691 | 1.004169458
o.o.b.jdk.incubator.vector.MaskedLogicOpts.maskedLogicOperationsLong256 | 1024 | 1017.512 | 1020.075 | 1.002518889
o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMaskedLogicOperationsInt128 | 256 | 4159.835 | 4527.676 | 1.088426825
o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMaskedLogicOperationsInt128 | 512 | 1665.335 | 1733.04 | 1.040655484
o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMaskedLogicOperationsInt128 | 1024 | 1150.319 | 1181.935 | 1.02748455
o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMaskedLogicOperationsInt256 | 256 | 6989.791 | 7382.883 | 1.056238019
o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMaskedLogicOperationsInt256 | 512 | 3711.362 | 3911.921 | 1.054039191
o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMaskedLogicOperationsInt256 | 1024 | 1540.341 | 1554.175 | 1.008981128
o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMaskedLogicOperationsInt512 | 256 | 4164.559 | 4213.546 | 1.01176283
o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMaskedLogicOperationsInt512 | 512 | 2072.91 | 2079.105 | 1.002988552
o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMaskedLogicOperationsInt512 | 1024 | 1112.678 | 1116.675 | 1.003592234
o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMaskedLogicOperationsLong256 | 256 | 3702.998 | 3906.093 | 1.0548461
o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMaskedLogicOperationsLong256 | 512 | 1536.571 | 1546.043 | 1.006164375
o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMaskedLogicOperationsLong256 | 1024 | 996.906 | 1013.649 | 1.016794964
o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMaskedLogicOperationsLong512 | 256 | 2045.594 | 2048.966 | 1.001648421
o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMaskedLogicOperationsLong512 | 512 | 1111.933 | 1117.689 | 1.005176571
o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMaskedLogicOperationsLong512 | 1024 | 559.971 | 561.144 | 1.002094751

Kindly review and share your feedback.

Best Regards,
Jatin

-------------

Commit messages:
 - 8273322: Enhance macro logic optimization for masked logic operations.

Changes: https://git.openjdk.java.net/jdk/pull/6893/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6893&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8273322
  Stats: 1413 lines in 12 files changed: 1370 ins; 6 del; 37 mod
  Patch: https://git.openjdk.java.net/jdk/pull/6893.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/6893/head:pull/6893

PR: https://git.openjdk.java.net/jdk/pull/6893