On Tue, 8 Jul 2025 09:07:00 GMT, Xiaohong Gong <xg...@openjdk.org> wrote:
>>> > > > > https://github.com/openjdk/jdk/blob/f2d2eef988c57cc9f6194a8fd5b2b422035ee68f/test/hotspot/jtreg/compiler/vectorization/runner/ArrayTypeConvertTest.java#L388-L392 >>> > > > >>> > > > >>> > > > Actually I didn't change the min vector size for `char` vectors in >>> > > > this patch. Relaxing `short` vectors to 32-bit is to support the >>> > > > vector cast for Vector API, and there is no `char` species in it. Do >>> > > > you think it's better to do the same change for `char` as well? This >>> > > > will just benefit auto-vectorization. >>> > > >>> > > >>> > > Hi @XiaohongGong thanks for asking. In many auto-vectorization cases >>> > > involving `char`, the vector elements are represented using `T_SHORT` >>> > > as the `BasicType`, rather than `T_CHAR`. >>> > > This is because, in Java, operands of subword types are always promoted >>> > > to `int` before any arithmetic operation. As a result, when handling a >>> > > node like `ConvD2I`, we don’t initially know its actual subword type. >>> > > Later, the SuperWord phase propagates a narrowed integer type backward >>> > > to help determine the correct subword type. See: >>> > > https://github.com/openjdk/jdk/blob/f2d2eef988c57cc9f6194a8fd5b2b422035ee68f/src/hotspot/share/opto/superword.cpp#L2551-L2558 >>> > > >>> > > Since SuperWord assigns `T_SHORT` to `StoreC` early on >>> > > https://github.com/openjdk/jdk/blob/f2d2eef988c57cc9f6194a8fd5b2b422035ee68f/src/hotspot/share/opto/superword.cpp#L2646-L2650 >>> > > >>> > > the entire propagation chain tends to use `T_SHORT` as well. >>> > > This applies to most operations, with the exception of a few like >>> > > `RShiftI`, `Abs`, and `ReverseBytesI`, which are handled separately. >>> > > So your change already benefits many char-related vectorization cases >>> > > like `convertDoubleToChar` above. That’s why we can safely relax the IR >>> > > condition mentioned earlier. >>> > >>> > >>> > Thanks for your input! It's really helpful to me. Does this mean it >>> > always use `T_SHORT` for char vectors in SLP? If so, it's safe that we do >>> > not need to consider `T_CHAR` in vector IRs in backend? >>> >>> No, we don't always use `T_SHORT` for char vectors. As mentioned earlier, >>> for operations like `RShiftI`, `Abs`, and `ReverseBytesI`, the compiler >>> needs to preserve the higher-order bits of the first operand. Therefore, >>> SuperWord still needs to assign them precise subword types. See: >>> >>> https://github.com/openjdk/jdk/blob/f2d2eef988c57cc9f6194a8fd5b2b422035ee68f/src/hotspot/share/opto/superword.cpp#L2583-L2589 >> >> Yes, I see. Thanks! What I mean is for cases that SLP will use the sub... > >> > > Hi @XiaohongGong, is there any way we can implement 2HF -> 2S and 2S -> >> > > 2HF in these match rules ? >> > > https://github.com/openjdk/jdk/blob/f2d2eef988c57cc9f6194a8fd5b2b422035ee68f/src/hotspot/cpu/aarch64/aarch64_vector.ad#L4697 >> > > >> > > https://github.com/openjdk/jdk/blob/f2d2eef988c57cc9f6194a8fd5b2b422035ee68f/src/hotspot/cpu/aarch64/aarch64_vector.ad#L4679 >> > > >> > > The `fcvtn` and `fcvtl` instructions do not support these arrangements. >> > > I was wondering if there is any other way we can implement these by any >> > > chance? >> > >> > >> > Do you mean `2HF -> 2F` and `2F -> 2HF` ? >> > Yes, it does not support the 32-bit arrangements. Vector conversion is a >> > kind of lanewise vector operation. For such cases, we usually use the same >> > arrangements with 64-bit vector size for 32-bit ones. That means we can >> > reuse the `T4H` and `T4S` to implement it. Hence, current match rules can >> > cover the conversions between `2HF` and `2F`. >> > Consider there is no such conversion cases in Vector API, I didn't change >> > the comment in the match rules. I think this may benefit >> > auto-vectorization. Currently, do we have cases that can match these rules >> > with SLP? >> >> Sorry yes I meant 2HF <-> 2F. Yes, currently there are no such cases in >> VectorAPI as we do not support Float16 Vectors yet but this will benefit >> autovectorization cases. I think in this case this may also benefit 2D <-> >> 2HF as well (eventually we might add support for D <-> HF as well). Yes we >> have some JTREG tests that match these rules currently like - >> `test/hotspot/jtreg/compiler/vectorization/TestFloat16VectorConvChain.java`, >> `test/hotspot/jtreg/compiler/vectorization/TestFloatConversionsVector.java`. > > Thanks! So per my understanding, things that I just need is updating comment > (e.g. `// 4HF to 4F`) of rules like `vcvtHFtoF`, right? For conversions > between double and HF, we do not need any new rules as it will be actually > `double -> float -> HF`, right? > > > > > Hi @XiaohongGong, is there any way we can implement 2HF -> 2S and 2S > > > > > -> 2HF in these match rules ? > > > > > https://github.com/openjdk/jdk/blob/f2d2eef988c57cc9f6194a8fd5b2b422035ee68f/src/hotspot/cpu/aarch64/aarch64_vector.ad#L4697 > > > > > > > > > > https://github.com/openjdk/jdk/blob/f2d2eef988c57cc9f6194a8fd5b2b422035ee68f/src/hotspot/cpu/aarch64/aarch64_vector.ad#L4679 > > > > > > > > > > The `fcvtn` and `fcvtl` instructions do not support these > > > > > arrangements. I was wondering if there is any other way we can > > > > > implement these by any chance? > > > > > > > > > > > > Do you mean `2HF -> 2F` and `2F -> 2HF` ? > > > > Yes, it does not support the 32-bit arrangements. Vector conversion is > > > > a kind of lanewise vector operation. For such cases, we usually use the > > > > same arrangements with 64-bit vector size for 32-bit ones. That means > > > > we can reuse the `T4H` and `T4S` to implement it. Hence, current match > > > > rules can cover the conversions between `2HF` and `2F`. > > > > Consider there is no such conversion cases in Vector API, I didn't > > > > change the comment in the match rules. I think this may benefit > > > > auto-vectorization. Currently, do we have cases that can match these > > > > rules with SLP? > > > > > > > > > Sorry yes I meant 2HF <-> 2F. Yes, currently there are no such cases in > > > VectorAPI as we do not support Float16 Vectors yet but this will benefit > > > autovectorization cases. I think in this case this may also benefit 2D > > > <-> 2HF as well (eventually we might add support for D <-> HF as well). > > > Yes we have some JTREG tests that match these rules currently like - > > > `test/hotspot/jtreg/compiler/vectorization/TestFloat16VectorConvChain.java`, > > > > > > `test/hotspot/jtreg/compiler/vectorization/TestFloatConversionsVector.java`. > > > > > > Thanks! So per my understanding, things that I just need is updating > > comment (e.g. `// 4HF to 4F`) of rules like `vcvtHFtoF`, right? For > > conversions between double and HF, we do not need any new rules as it will > > be actually `double -> float -> HF`, right? > > Yes please and also for `4F to 4HF` case for `vcvtF2HF`. Thanks! > > As for the double to half float conversion - yes with the current > infrastructure it would be ConvD2F -> ConvF2HF which will be autovectorized > to generate corresponding vector nodes. Sooner or later, support for ConvD2HF > (and its vectorized version) might be added upstream (support already > available in `lworld+fp16` branch of Valhalla here - > https://github.com/openjdk/valhalla/blob/0ed65b9a63405e950c411835120f0f36e326aaaa/src/hotspot/cpu/aarch64/aarch64_vector.ad#L4535). > You do not have to add any new rules now for this case. I was just hinting > at possible D<->HF implementation in the future. As the max vector length was > 64bits, I did not add any implementation for Neon vcvtD2HF or vcvtHF2D in > Valhalla. Maybe we can do two `fcvtl/fcvtn` to convert D to F and then F to > HF for this specific case but we can think about that later :) Make sense to me. The latest change has been updated together with the relative jtreg tests. Would you mind taking another look at it? Thanks! ------------- PR Comment: https://git.openjdk.org/jdk/pull/26057#issuecomment-3050730818