On Wed, 25 Feb 2026 11:15:33 GMT, Eric Fang <[email protected]> wrote:

>> `VectorMaskCastNode` is used to cast a vector mask from one type to another 
>> type. The cast may be generated by calling the vector API `cast` or 
>> generated by the compiler. For example, some vector mask operations like 
>> `trueCount` require the input mask to be of an integer type, so for floating 
>> point type masks, the compiler will cast the mask to the corresponding 
>> integer type mask automatically before doing the mask operation. This kind 
>> of cast is very common.
>> 
>> If the vector element size is not changed, the `VectorMaskCastNode` doesn't 
>> generate code; otherwise code is generated to extend or narrow the 
>> mask. This IR node is not free whether or not it generates code, because it 
>> may block some optimizations. For example:
>> 1. `(VectorStoreMask (VectorMaskCast (VectorLoadMask x)))`, where the middle 
>> `VectorMaskCast` blocks the optimization `(VectorStoreMask 
>> (VectorLoadMask x)) => (x)`.
>> 2. `(VectorMaskToLong (VectorMaskCast (VectorLongToMask x)))`, which blocks 
>> the optimization `(VectorMaskToLong (VectorLongToMask x)) => (x)`.
>> 
>> In these IR patterns, the value of the input `x` is not changed, so we can 
>> safely do the optimization. But if the input value is changed, we can't 
>> eliminate the cast.
>> 
>> The general idea of this PR is to introduce an `uncast_mask` helper function, 
>> which can be used to uncast a chain of `VectorMaskCastNode`, like the 
>> existing `Node::uncast(bool)` function. The function returns the first node 
>> that is not a `VectorMaskCastNode`.
>> 
>> The intended use case is when the IR pattern to be optimized may contain one 
>> or more consecutive `VectorMaskCastNode` and this does not affect the 
>> correctness of the optimization. Then this function can be called to 
>> eliminate the `VectorMaskCastNode` chain.
>> 
>> Current optimizations related to `VectorMaskCastNode` include:
>> 1. `(VectorMaskCast (VectorMaskCast x)) => (x)`, see JDK-8356760.
>> 2. `(XorV (VectorMaskCast (VectorMaskCmp src1 src2 cond)) (Replicate -1)) => 
>> (VectorMaskCast (VectorMaskCmp src1 src2 ncond))`, see JDK-8354242.
>> 
>> This PR does the following optimizations:
>> 1. Extends the optimization pattern `(VectorMaskCast (VectorMaskCast x)) => 
>> (x)` to `(VectorMaskCast (VectorMaskCast ... (VectorMaskCast x))) => (x)`. 
>> This is correct as long as the types of the head and tail 
>> `VectorMaskCastNode` are consistent.
>> 2. Supports a new optimization pattern `(VectorStoreMask (VectorMaskCast ... 
>> (VectorLoadMask x))) => (x)`. Since the value before and after the pattern 
>> is a boolean vect...
>
> Eric Fang has updated the pull request with a new target base due to a merge 
> or a rebase. The pull request now contains 16 commits:
> 
>  - Improve the code comment and tests
>  - Merge branch 'master' into JDK-8370863-mask-cast-opt
>  - Refine the JTReg tests
>  - Add clearer comments to VectorMaskCastIdentityTest.java
>  - Update copyright year to 2026
>  - Merge branch 'master' into JDK-8370863-mask-cast-opt
>  - Convert the check condition for vector length into an assertion
>    
>    Also refined the tests.
>  - Refine code comments
>  - Merge branch 'master' into JDK-8370863-mask-cast-opt
>  - Merge branch 'master' into JDK-8370863-mask-cast-opt
>  - ... and 6 more: https://git.openjdk.org/jdk/compare/c0c1775a...dcd64ad1
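
For readers following along, here is a minimal sketch of the idea in the
quoted description: skip a chain of mask casts, then check whether what is
left is the `(VectorStoreMask (VectorLoadMask x)) => (x)` shape. This is a toy
model in plain Java, not the HotSpot code. The class `MaskCastChainSketch`,
the `Op` enum, and the methods `uncastMask` and `identity` are names made up
for illustration, and any additional preconditions the real change enforces
(for example on vector length) are left out.

```java
import java.util.Objects;

/** Toy model of a mask IR graph. Hypothetical names, not HotSpot code. */
public class MaskCastChainSketch {
    enum Op { INPUT, VECTOR_LOAD_MASK, VECTOR_MASK_CAST, VECTOR_STORE_MASK }

    /** A node with at most one input; INPUT nodes have none. */
    record Node(Op op, Node in, String name) {
        static Node input(String name) {
            return new Node(Op.INPUT, null, name);
        }
        static Node of(Op op, Node in) {
            return new Node(op, Objects.requireNonNull(in), op.name());
        }
    }

    /** Skips consecutive mask casts and returns the first node that is not a cast. */
    static Node uncastMask(Node n) {
        while (n.op() == Op.VECTOR_MASK_CAST) {
            n = n.in();
        }
        return n;
    }

    /**
     * Models (VectorStoreMask (VectorMaskCast ... (VectorLoadMask x))) => x.
     * Returns the node unchanged when the pattern does not apply.
     */
    static Node identity(Node n) {
        if (n.op() != Op.VECTOR_STORE_MASK) {
            return n;
        }
        Node src = uncastMask(n.in());
        return src.op() == Op.VECTOR_LOAD_MASK ? src.in() : n;
    }

    public static void main(String[] args) {
        Node x = Node.input("x");
        Node load = Node.of(Op.VECTOR_LOAD_MASK, x);
        Node cast1 = Node.of(Op.VECTOR_MASK_CAST, load);
        Node cast2 = Node.of(Op.VECTOR_MASK_CAST, cast1);
        Node store = Node.of(Op.VECTOR_STORE_MASK, cast2);
        // The store folds to the original input because only casts sit in between.
        System.out.println(identity(store) == x); // prints "true"
    }
}
```

In this model, the load, two casts, and store chain folds away completely.
That is the outcome the IR rules in the new test expect.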

I'm getting some failures with your new test:


Failed IR Rules (1) of Methods (1)
----------------------------------
1) Method 
"compiler.vectorapi.VectorStoreMaskIdentityTest::testVectorMaskStoreIdentityFloat256"
 - [Failed IR rules: 1]:
   * @IR rule 1: "@compiler.lib.ir_framework.IR(phase={DEFAULT}, 
applyIfPlatformAnd={}, applyIfCPUFeatureOr={"sve", "true", "avx2", "true"}, 
counts={"_#V#LOAD_VECTOR_Z#_", "_@4", ">= 3", "_#VECTOR_LOAD_MASK#_", "= 0", 
"_#VECTOR_STORE_MASK#_", "= 0", "_#VECTOR_MASK_CAST#_", "= 0"}, failOn={}, 
applyIfPlatform={}, applyIfPlatformOr={}, applyIfOr={}, 
applyIfCPUFeatureAnd={}, applyIf={"MaxVectorSize", "> 16"}, 
applyIfCPUFeature={}, applyIfAnd={}, applyIfNot={})"
     > Phase "PrintIdeal":
       - counts: Graph contains wrong number of nodes:
         * Constraint 2: "(\\d+(\\s){2}(VectorLoadMask.*)+(\\s){2}===.*)"
           - Failed comparison: [found] 1 = 0 [given]
             - Matched node:
               * 277  VectorLoadMask  === _ 106  [[ 275 ]]  #vectormask<F,4> 
!jvms: VectorMask::fromArray @ bci:47 (line 209) 
VectorStoreMaskIdentityTest::testTwoCastsKernel @ bci:5 (line 75) 
VectorStoreMaskIdentityTest::testVectorMaskStoreIdentityFloat256 @ bci:29 (line 
271)
         * Constraint 3: "(\\d+(\\s){2}(VectorStoreMask.*)+(\\s){2}===.*)"
           - Failed comparison: [found] 1 = 0 [given]
             - Matched node:
               * 254  VectorStoreMask  === _ 271 272  [[ 216 ]]  #vectors<Z,4> 
!jvms: AbstractMask::intoArray @ bci:50 (line 75) 
VectorStoreMaskIdentityTest::testTwoCastsKernel @ bci:20 (line 77) 
VectorStoreMaskIdentityTest::testVectorMaskStoreIdentityFloat256 @ bci:29 (line 
271)
         * Constraint 4: "(\\d+(\\s){2}(VectorMaskCast.*)+(\\s){2}===.*)"
           - Failed comparison: [found] 2 = 0 [given]
             - Matched nodes (2):
               * 271  VectorMaskCast  === _ 275  [[ 254 ]]  #vectormask<J,4> 
!jvms: ShortVector64$ShortMask64::cast @ bci:54 (line 665) 
VectorStoreMaskIdentityTest::testTwoCastsKernel @ bci:13 (line 77) 
VectorStoreMaskIdentityTest::testVectorMaskStoreIdentityFloat256 @ bci:29 (line 
271)
               * 275  VectorMaskCast  === _ 277  [[ 271 ]]  #vectormask<S,4> 
!orig=[5222],[2038] !jvms: FloatVector128$FloatMask128::cast @ bci:54 (line 
654) VectorStoreMaskIdentityTest::testTwoCastsKernel @ bci:9 (line 76) 
VectorStoreMaskIdentityTest::testVectorMaskStoreIdentityFloat256 @ bci:29 (line 
271)


This was run on an x64 machine with extra flags:
`-ea -esa -XX:CompileThreshold=100 -XX:+UnlockExperimentalVMOptions -server 
-XX:-TieredCompilation`

And with the same flags, I get a failure on an aarch64 machine:

Failed IR Rules (1) of Methods (1)
----------------------------------
1) Method 
"compiler.vectorapi.VectorStoreMaskIdentityTest::testVectorMaskStoreIdentityLong"
 - [Failed IR rules: 1]:
   * @IR rule 1: "@compiler.lib.ir_framework.IR(phase={DEFAULT}, 
applyIfPlatformAnd={}, applyIfCPUFeatureOr={"asimd", "true"}, 
counts={"_#V#LOAD_VECTOR_Z#_", "_@2", ">= 3", "_#VECTOR_LOAD_MASK#_", "= 0", 
"_#VECTOR_STORE_MASK#_", "= 0", "_#VECTOR_MASK_CAST#_", "= 0"}, failOn={}, 
applyIfPlatform={}, applyIfPlatformOr={}, applyIfOr={}, 
applyIfCPUFeatureAnd={}, applyIf={"MaxVectorSize", ">= 16"}, 
applyIfCPUFeature={}, applyIfAnd={}, applyIfNot={})"
     > Phase "PrintIdeal":
       - counts: Graph contains wrong number of nodes:
         * Constraint 2: "(\\d+(\\s){2}(VectorLoadMask.*)+(\\s){2}===.*)"
           - Failed comparison: [found] 1 = 0 [given]
             - Matched node:
               * 278  VectorLoadMask  === _ 101  [[ 277 ]]  #vectorx<J,2> 
!jvms: VectorMask::fromArray @ bci:47 (line 209) 
VectorStoreMaskIdentityTest::testThreeCastsKernel @ bci:5 (line 85) 
VectorStoreMaskIdentityTest::testVectorMaskStoreIdentityLong @ bci:55 (line 220)
         * Constraint 3: "(\\d+(\\s){2}(VectorStoreMask.*)+(\\s){2}===.*)"
           - Failed comparison: [found] 1 = 0 [given]
             - Matched node:
               * 238  VectorStoreMask  === _ 263 264  [[ 194 ]]  #vectord<Z,2> 
!jvms: AbstractMask::intoArray @ bci:50 (line 75) 
VectorStoreMaskIdentityTest::testThreeCastsKernel @ bci:24 (line 88) 
VectorStoreMaskIdentityTest::testVectorMaskStoreIdentityLong @ bci:55 (line 220)
         * Constraint 4: "(\\d+(\\s){2}(VectorMaskCast.*)+(\\s){2}===.*)"
           - Failed comparison: [found] 3 = 0 [given]
             - Matched nodes (3):
               * 263  VectorMaskCast  === _ 275  [[ 238 ]]  #vectord<I,2> 
!jvms: DoubleVector128$DoubleMask128::cast @ bci:54 (line 650) 
VectorStoreMaskIdentityTest::testThreeCastsKernel @ bci:17 (line 88) 
VectorStoreMaskIdentityTest::testVectorMaskStoreIdentityLong @ bci:55 (line 220)
               * 275  VectorMaskCast  === _ 277  [[ 263 ]]  #vectorx<D,2> 
!orig=[3350] !jvms: IntVector64$IntMask64::cast @ bci:54 (line 661) 
VectorStoreMaskIdentityTest::testThreeCastsKernel @ bci:13 (line 87) 
VectorStoreMaskIdentityTest::testVectorMaskStoreIdentityLong @ bci:55 (line 220)
               * 277  VectorMaskCast  === _ 278  [[ 275 ]]  #vectord<I,2> 
!jvms: LongVector128$LongMask128::cast @ bci:54 (line 651) 
VectorStoreMaskIdentityTest::testThreeCastsKernel @ bci:9 (line 86) 
VectorStoreMaskIdentityTest::testVectorMaskStoreIdentityLong @ bci:55 (line 220)


And with other flags, also on aarch64, I get:
`-XX:-TieredCompilation -XX:VerifyIterativeGVN=1110`


Failed IR Rules (1) of Methods (1)
----------------------------------
1) Method 
"compiler.vectorapi.VectorStoreMaskIdentityTest::testVectorMaskStoreIdentityInt"
 - [Failed IR rules: 1]:
   * @IR rule 1: "@compiler.lib.ir_framework.IR(phase={DEFAULT}, 
applyIfPlatformAnd={}, applyIfCPUFeatureOr={"asimd", "true", "avx", "true"}, 
counts={"_#V#LOAD_VECTOR_Z#_", "_@4", ">= 3", "_#VECTOR_LOAD_MASK#_", "= 0", 
"_#VECTOR_STORE_MASK#_", "= 0", "_#VECTOR_MASK_CAST#_", "= 0"}, failOn={}, 
applyIfPlatform={}, applyIfPlatformOr={}, applyIfOr={}, 
applyIfCPUFeatureAnd={}, applyIf={"MaxVectorSize", ">= 16"}, 
applyIfCPUFeature={}, applyIfAnd={}, applyIfNot={})"
     > Phase "PrintIdeal":
       - counts: Graph contains wrong number of nodes:
         * Constraint 2: "(\\d+(\\s){2}(VectorLoadMask.*)+(\\s){2}===.*)"
           - Failed comparison: [found] 1 = 0 [given]
             - Matched node:
               * 278  VectorLoadMask  === _ 118  [[ 277 ]]  #vectorx<I,4> 
!jvms: VectorMask::fromArray @ bci:47 (line 209) 
VectorStoreMaskIdentityTest::testThreeCastsKernel @ bci:5 (line 85) 
VectorStoreMaskIdentityTest::testVectorMaskStoreIdentityInt @ bci:55 (line 184)
         * Constraint 3: "(\\d+(\\s){2}(VectorStoreMask.*)+(\\s){2}===.*)"
           - Failed comparison: [found] 1 = 0 [given]
             - Matched node:
               * 220  VectorStoreMask  === _ 256 257  [[ 168 ]]  #vectord<Z,4> 
!jvms: AbstractMask::intoArray @ bci:50 (line 75) 
VectorStoreMaskIdentityTest::testThreeCastsKernel @ bci:24 (line 88) 
VectorStoreMaskIdentityTest::testVectorMaskStoreIdentityInt @ bci:55 (line 184)
         * Constraint 4: "(\\d+(\\s){2}(VectorMaskCast.*)+(\\s){2}===.*)"
           - Failed comparison: [found] 3 = 0 [given]
             - Matched nodes (3):
               * 256  VectorMaskCast  === _ 274  [[ 220 ]]  #vectord<S,4> 
!jvms: FloatVector128$FloatMask128::cast @ bci:54 (line 654) 
VectorStoreMaskIdentityTest::testThreeCastsKernel @ bci:17 (line 88) 
VectorStoreMaskIdentityTest::testVectorMaskStoreIdentityInt @ bci:55 (line 184)
               * 274  VectorMaskCast  === _ 277  [[ 256 ]]  #vectorx<F,4> 
!orig=5368,[4300] !jvms: ShortVector64$ShortMask64::cast @ bci:54 (line 665) 
VectorStoreMaskIdentityTest::testThreeCastsKernel @ bci:13 (line 87) 
VectorStoreMaskIdentityTest::testVectorMaskStoreIdentityInt @ bci:55 (line 184)
               * 277  VectorMaskCast  === _ 278  [[ 274 ]]  #vectord<S,4> 
!orig=[5231] !jvms: IntVector128$IntMask128::cast @ bci:54 (line 665) 
VectorStoreMaskIdentityTest::testThreeCastsKernel @ bci:9 (line 86) 
VectorStoreMaskIdentityTest::testVectorMaskStoreIdentityInt @ bci:55 (line 184)
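
To make the "Failed comparison" lines above easier to read, here is a small
sketch that applies the three constraint regexes from the reports to the node
lines matched in the x64 run and counts the hits. It only illustrates the
counting. It is not how the IR framework is implemented, and the class
`IrCountSketch` and the method `countMatches` are made-up names. The node lines
are copied from the x64 report and cut off after the type.

```java
import java.util.List;
import java.util.regex.Pattern;

/** Illustrative only: counts dump lines that match an IR-rule regex. */
public class IrCountSketch {
    // Node lines copied from the x64 failure report, truncated after the type.
    static final List<String> DUMP = List.of(
        "277  VectorLoadMask  === _ 106  [[ 275 ]]  #vectormask<F,4>",
        "275  VectorMaskCast  === _ 277  [[ 271 ]]  #vectormask<S,4>",
        "271  VectorMaskCast  === _ 275  [[ 254 ]]  #vectormask<J,4>",
        "254  VectorStoreMask  === _ 271 272  [[ 216 ]]  #vectors<Z,4>");

    /** Builds the constraint regex for a node name, as printed in the report. */
    static String constraint(String nodeName) {
        return "(\\d+(\\s){2}(" + nodeName + ".*)+(\\s){2}===.*)";
    }

    /** Returns the number of lines that contain a match for the regex. */
    static long countMatches(String regex, List<String> lines) {
        Pattern p = Pattern.compile(regex);
        return lines.stream().filter(l -> p.matcher(l).find()).count();
    }

    public static void main(String[] args) {
        // The rule expects "= 0" for each; these are the counts found instead.
        System.out.println(countMatches(constraint("VectorLoadMask"), DUMP));  // prints "1"
        System.out.println(countMatches(constraint("VectorStoreMask"), DUMP)); // prints "1"
        System.out.println(countMatches(constraint("VectorMaskCast"), DUMP));  // prints "2"
    }
}
```

The counts 1, 1 and 2 correspond to the `[found]` values in the x64 report. In
other words, the whole load, cast, cast, store chain is still present in the
ideal graph.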

-------------

PR Comment: https://git.openjdk.org/jdk/pull/28313#issuecomment-3959167455
