On Fri, 5 Dec 2025 08:10:32 GMT, Eric Fang <[email protected]> wrote:
>> `VectorMaskCastNode` is used to cast a vector mask from one type to another
>> type. The cast may be generated by calling the vector API `cast` or
>> generated by the compiler. For example, some vector mask operations like
>> `trueCount` require the input mask to be of an integer type, so for
>> floating-point masks the compiler automatically casts the mask to the
>> corresponding integer-type mask before doing the mask operation. This kind
>> of cast is very common.
>>
>> If the vector element size is not changed, the `VectorMaskCastNode` doesn't
>> generate code; otherwise code is generated to extend or narrow the mask.
>> This IR node is not free whether or not it generates code, because it may
>> block some optimizations. For example:
>> 1. `(VectorStoreMask (VectorMaskCast (VectorLoadMask x)))`: the middle
>>    `VectorMaskCast` prevents the optimization
>>    `(VectorStoreMask (VectorLoadMask x)) => (x)`.
>> 2. `(VectorMaskToLong (VectorMaskCast (VectorLongToMask x)))`, which blocks
>>    the optimization `(VectorMaskToLong (VectorLongToMask x)) => (x)`.
>>
>> In these IR patterns, the value of the input `x` is not changed, so we can
>> safely do the optimization. But if the input value is changed, we can't
>> eliminate the cast.
>>
>> The general idea of this PR is to introduce an `uncast_mask` helper
>> function, which can be used to uncast a chain of `VectorMaskCastNode`, like
>> the existing `Node::uncast(bool)` function. The function returns the first
>> node that is not a `VectorMaskCastNode`.
>>
>> The intended use case is when the IR pattern to be optimized may contain one
>> or more consecutive `VectorMaskCastNode` and this does not affect the
>> correctness of the optimization. Then this function can be called to
>> eliminate the `VectorMaskCastNode` chain.
>>
>> Current optimizations related to `VectorMaskCastNode` include:
>> 1. `(VectorMaskCast (VectorMaskCast x)) => (x)`, see JDK-8356760.
>> 2. `(XorV (VectorMaskCast (VectorMaskCmp src1 src2 cond)) (Replicate -1)) =>
>>    (VectorMaskCast (VectorMaskCmp src1 src2 ncond))`, see JDK-8354242.
>>
>> This PR does the following optimizations:
>> 1. Extends the optimization pattern `(VectorMaskCast (VectorMaskCast x)) =>
>>    (x)` to `(VectorMaskCast (VectorMaskCast ... (VectorMaskCast x))) => (x)`.
>>    This is correct as long as the types of the head and tail
>>    `VectorMaskCastNode` are consistent.
>> 2. Supports a new optimization pattern `(VectorStoreMask (VectorMaskCast ...
>>    (VectorLoadMask x))) => (x)`. Since the value before and after the pattern
>>    is a boolean vect...
>
> Eric Fang has updated the pull request with a new target base due to a merge
> or a rebase. The incremental webrev excludes the unrelated changes brought in
> by the merge/rebase. The pull request contains five additional commits since
> the last revision:
>
>  - Refine the test code and comments
>  - Merge branch 'master' into JDK-8370863-mask-cast-opt
>  - Don't read and write the same memory in the JMH benchmarks
>  - Merge branch 'master' into JDK-8370863-mask-cast-opt
>  - 8370863: VectorAPI: Optimize the VectorMaskCast chain in specific patterns

Thanks for your review! @galderz

-------------

PR Review: https://git.openjdk.org/jdk/pull/28313#pullrequestreview-3537647873
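
For readers who want a concrete picture of the `uncast_mask` idea from the quoted description, here is a minimal sketch. It is not the HotSpot code: `Op`, `Node`, `simplify_store_mask` and `demo` are hypothetical, simplified stand-ins invented for illustration, and the real C2 transformation may need extra checks (for example on vector length or element type) that are omitted here.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for C2 IR nodes (NOT the HotSpot classes).
enum class Op { Input, VectorLoadMask, VectorMaskCast, VectorStoreMask };

struct Node {
    Op op;
    const Node* in;  // single input; nullptr for a leaf
};

// Skip a chain of consecutive VectorMaskCast nodes and return the first
// node that is not a VectorMaskCast.
const Node* uncast_mask(const Node* n) {
    while (n != nullptr && n->op == Op::VectorMaskCast) {
        n = n->in;
    }
    return n;
}

// Sketch of (VectorStoreMask (VectorMaskCast ... (VectorLoadMask x))) => x.
// Returns x when the pattern matches, otherwise returns n unchanged.
const Node* simplify_store_mask(const Node* n) {
    if (n == nullptr || n->op != Op::VectorStoreMask) {
        return n;
    }
    const Node* src = uncast_mask(n->in);
    if (src != nullptr && src->op == Op::VectorLoadMask) {
        return src->in;
    }
    return n;
}

// Builds store(cast(cast(load(x)))) and checks that both helpers see through
// the cast chain.
bool demo() {
    Node x{Op::Input, nullptr};
    Node load{Op::VectorLoadMask, &x};
    Node cast1{Op::VectorMaskCast, &load};
    Node cast2{Op::VectorMaskCast, &cast1};
    Node store{Op::VectorStoreMask, &cast2};
    return uncast_mask(&cast2) == &load && simplify_store_mask(&store) == &x;
}
```

The point of the sketch is only that the pattern match looks at the first non-cast node rather than the direct input, so any number of intermediate casts no longer blocks the rewrite.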
