Re: RFR: 8359419: AArch64: Relax min vector length to 32-bit for short vectors [v3]

Xiaohong Gong Tue, 08 Jul 2025 02:10:10 -0700

On Tue, 8 Jul 2025 09:00:53 GMT, Xiaohong Gong <xg...@openjdk.org> wrote:


>>> > > > https://github.com/openjdk/jdk/blob/f2d2eef988c57cc9f6194a8fd5b2b422035ee68f/test/hotspot/jtreg/compiler/vectorization/runner/ArrayTypeConvertTest.java#L388-L392
>>> > > 
>>> > > 
>>> > > Actually I didn't change the min vector size for `char` vectors in this 
>>> > > patch. Relaxing `short` vectors to 32-bit is to support the vector cast 
>>> > > for Vector API, and there is no `char` species in it. Do you think it's 
>>> > > better to do the same change for `char` as well? This will just benefit 
>>> > > auto-vectorization.
>>> > 
>>> > 
>>> > Hi @XiaohongGong thanks for asking. In many auto-vectorization cases 
>>> > involving `char`, the vector elements are represented using `T_SHORT` as 
>>> > the `BasicType`, rather than `T_CHAR`.
>>> > This is because, in Java, operands of subword types are always promoted 
>>> > to `int` before any arithmetic operation. As a result, when handling a 
>>> > node like `ConvD2I`, we don’t initially know its actual subword type. 
>>> > Later, the SuperWord phase propagates a narrowed integer type backward to 
>>> > help determine the correct subword type. See:
>>> > https://github.com/openjdk/jdk/blob/f2d2eef988c57cc9f6194a8fd5b2b422035ee68f/src/hotspot/share/opto/superword.cpp#L2551-L2558
>>> > 
>>> > Since SuperWord assigns `T_SHORT` to `StoreC` early on
>>> > https://github.com/openjdk/jdk/blob/f2d2eef988c57cc9f6194a8fd5b2b422035ee68f/src/hotspot/share/opto/superword.cpp#L2646-L2650
>>> > 
>>> > the entire propagation chain tends to use `T_SHORT` as well.
>>> > This applies to most operations, with the exception of a few like 
>>> > `RShiftI`, `Abs`, and `ReverseBytesI`, which are handled separately.
>>> > So your change already benefits many char-related vectorization cases 
>>> > like `convertDoubleToChar` above. That’s why we can safely relax the IR 
>>> > condition mentioned earlier.
>>> 
>>> Thanks for your input! It's really helpful to me. Does this mean it always 
>>> use `T_SHORT` for char vectors in SLP? If so, it's safe that we do not need 
>>> to consider `T_CHAR` in vector IRs in backend?
>> 
>> No, we don't always use `T_SHORT` for char vectors. As mentioned earlier, 
>> for operations like `RShiftI`, `Abs`, and `ReverseBytesI`, the compiler 
>> needs to preserve the higher-order bits of the first operand. Therefore, 
>> SuperWord still needs to assign them precise subword types. See:
>> https://github.com/openjdk/jdk/blob/f2d2eef988c57cc9f6194a8fd5b2b422035ee68f/src/hotspot/share/opto/superword.cpp#L2583-L2589
>
>> > > > > https://github.com/openjdk/jdk/blob/f2d2eef988c57cc9f6194a8fd5b2b422035ee68f/test/hotspot/jtreg/compiler/vectorization/runner/ArrayTypeConvertTest.java#L388-L392
>> > > > 
>> > > > 
>> > > > Actually I didn't change the min vector size for `char` vectors in 
>> > > > this patch. Relaxing `short` vectors to 32-bit is to support the 
>> > > > vector cast for Vector API, and there is no `char` species in it. Do 
>> > > > you think it's better to do the same change for `char` as well? This 
>> > > > will just benefit auto-vectorization.
>> > > 
>> > > 
>> > > Hi @XiaohongGong thanks for asking. In many auto-vectorization cases 
>> > > involving `char`, the vector elements are represented using `T_SHORT` as 
>> > > the `BasicType`, rather than `T_CHAR`.
>> > > This is because, in Java, operands of subword types are always promoted 
>> > > to `int` before any arithmetic operation. As a result, when handling a 
>> > > node like `ConvD2I`, we don’t initially know its actual subword type. 
>> > > Later, the SuperWord phase propagates a narrowed integer type backward 
>> > > to help determine the correct subword type. See:
>> > > https://github.com/openjdk/jdk/blob/f2d2eef988c57cc9f6194a8fd5b2b422035ee68f/src/hotspot/share/opto/superword.cpp#L2551-L2558
>> > > 
>> > > Since SuperWord assigns `T_SHORT` to `StoreC` early on
>> > > https://github.com/openjdk/jdk/blob/f2d2eef988c57cc9f6194a8fd5b2b422035ee68f/src/hotspot/share/opto/superword.cpp#L2646-L2650
>> > > 
>> > > the entire propagation chain tends to use `T_SHORT` as well.
>> > > This applies to most operations, with the exception of a few like 
>> > > `RShiftI`, `Abs`, and `ReverseBytesI`, which are handled separately.
>> > > So your change already benefits many char-related vectorization cases 
>> > > like `convertDoubleToChar` above. That’s why we can safely relax the IR 
>> > > condition mentioned earlier.
>> > 
>> > 
>> > Thanks for your input! It's really helpful to me. Does this mean it always 
>> > use `T_SHORT` for char vectors in SLP? If so, it's safe that we do not 
>> > need to consider `T_CHAR` in vector IRs in backend?
>> 
>> No, we don't always use `T_SHORT` for char vectors. As mentioned earlier, 
>> for operations like `RShiftI`, `Abs`, and `ReverseBytesI`, the compiler 
>> needs to preserve the higher-order bits of the first operand. Therefore, 
>> SuperWord still needs to assign them precise subword types. See:
>> 
>> https://github.com/openjdk/jdk/blob/f2d2eef988c57cc9f6194a8fd5b2b422035ee68f/src/hotspot/share/opto/superword.cpp#L2583-L2589
> 
> Yes, I see. Thanks! What I mean is for cases that SLP will use the subword 
> types, it will be actu...

> > > Hi @XiaohongGong, is there any way we can implement 2HF -> 2S and 2S -> 
> > > 2HF in these match rules ?
> > > https://github.com/openjdk/jdk/blob/f2d2eef988c57cc9f6194a8fd5b2b422035ee68f/src/hotspot/cpu/aarch64/aarch64_vector.ad#L4697
> > > 
> > > https://github.com/openjdk/jdk/blob/f2d2eef988c57cc9f6194a8fd5b2b422035ee68f/src/hotspot/cpu/aarch64/aarch64_vector.ad#L4679
> > > 
> > > The `fcvtn` and `fcvtl` instructions do not support these arrangements. I 
> > > was wondering if there is any other way we can implement these by any 
> > > chance?
> > 
> > 
> > Do you mean `2HF -> 2F` and `2F -> 2HF` ?
> > Yes, it does not support the 32-bit arrangements. Vector conversion is a 
> > kind of lanewise vector operation. For such cases, we usually use the same 
> > arrangements with 64-bit vector size for 32-bit ones. That means we can 
> > reuse the `T4H` and `T4S` to implement it. Hence, current match rules can 
> > cover the conversions between `2HF` and `2F`.
> > Consider there is no such conversion cases in Vector API, I didn't change 
> > the comment in the match rules. I think this may benefit 
> > auto-vectorization. Currently, do we have cases that can match these rules 
> > with SLP?
> 
> Sorry yes I meant 2HF <-> 2F. Yes, currently there are no such cases in 
> VectorAPI as we do not support Float16 Vectors yet but this will benefit 
> autovectorization cases. I think in this case this may also benefit 2D <-> 
> 2HF as well (eventually we might add support for D <-> HF as well). Yes we 
> have some JTREG tests that match these rules currently like - 
> `test/hotspot/jtreg/compiler/vectorization/TestFloat16VectorConvChain.java`, 
> `test/hotspot/jtreg/compiler/vectorization/TestFloatConversionsVector.java`.

Thanks! So per my understanding, things that I just need is updating comment 
(e.g. `// 4HF to 4F`) of rules like `vcvtHFtoF`, right? For conversions between 
double and HF, we do not need any new rules as it will be actually `double -> 
float -> HF`, right?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/26057#issuecomment-3048019109

Re: RFR: 8359419: AArch64: Relax min vector length to 32-bit for short vectors [v3]

Reply via email to