On Wed, 22 Nov 2023 07:05:21 GMT, Eric Liu <e...@openjdk.org> wrote:

>> Vector API defines zero-extend operations [1], which are going to be 
>> intrinsified and generated to `VectorUCastNode` by C2. This patch adds 
>> backend implementation for `VectorUCastNode` on AArch64.
>> 
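For readers unfamiliar with the term: zero extension widens each lane by filling the new upper bits with zeros instead of copying the sign bit. Below is a minimal scalar sketch of that semantics in plain Java. It is neither the Vector API call nor the code in this patch, and the names `ZeroExtendSketch`, `byteToInt` and `intToLong` are made up for illustration.

```java
import java.util.Arrays;

// Hypothetical scalar reference for zero extension; not part of the patch.
final class ZeroExtendSketch {

    // Widen each byte to an int, treating the byte as unsigned (0..255).
    static int[] byteToInt(byte[] src) {
        int[] dst = new int[src.length];
        for (int i = 0; i < src.length; i++) {
            dst[i] = Byte.toUnsignedInt(src[i]); // equivalent to src[i] & 0xFF
        }
        return dst;
    }

    // Widen each int to a long, treating the int as unsigned.
    static long[] intToLong(int[] src) {
        long[] dst = new long[src.length];
        for (int i = 0; i < src.length; i++) {
            dst[i] = Integer.toUnsignedLong(src[i]); // equivalent to src[i] & 0xFFFFFFFFL
        }
        return dst;
    }

    public static void main(String[] args) {
        byte[] bytes = {1, -1, (byte) 0x80};
        // Zero extension keeps the bit pattern: prints [1, 255, 128]
        System.out.println(Arrays.toString(byteToInt(bytes)));
        // A plain (int) cast would sign-extend instead and give [1, -1, -128].
    }
}
```

The benchmark names in the table (`byte2Int`, `int2Long`, and so on) correspond to conversions of this kind, applied to whole vectors at once.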
>> The micro benchmark shows a significant performance improvement. On my test 
>> machine (SVE, 256-bit), the results are as follows:
>> 
>> 
>> 
>>   Benchmark                     Before     After       Units   Gain
>>   VectorZeroExtend.byte2Int     3168.251   243012.399  ops/ms  75.70
>>   VectorZeroExtend.byte2Long    3212.201   216291.588  ops/ms  66.33
>>   VectorZeroExtend.byte2Short   3391.968   182655.365  ops/ms  52.85
>>   VectorZeroExtend.int2Long     1012.197    80448.553  ops/ms  78.48
>>   VectorZeroExtend.short2Int    1812.471   153416.828  ops/ms  83.65
>>   VectorZeroExtend.short2Long   1788.382   129794.814  ops/ms  71.58
>> 
>> 
>> On other (Neon) systems, we get a similar performance boost, since 
>> intrinsification now succeeds there as well.
>> 
>> Since `VectorUCastNode` is currently only used for the Vector API's zero 
>> extension, this patch also adds assertions to the node definitions to 
>> clarify their usage.
>> 
>> [TEST]
>> compiler/vectorapi and jdk/incubator/vector passed on NEON and SVE machines.
>> 
>> [1] 
>> https://github.com/openjdk/jdk/blob/master/src/jdk.incubator.vector/share/classes/jdk/incubator/vector/VectorOperators.java#L726
>
> Eric Liu has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   small fix
>   
>   Change-Id: Icfe9619af1c9e7d5ea8cac457ccebb4eec5c34ad

That looks nice. Thanks.

-------------

Marked as reviewed by aph (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/16670#pullrequestreview-1757770099
