On Wed, 1 Feb 2023 14:20:05 GMT, Jatin Bhateja <jbhat...@openjdk.org> wrote:

>> src/jdk.incubator.vector/share/classes/jdk/incubator/vector/AbstractMask.java
>>  line 236:
>> 
>>> 234:         } else if (offset >= limit) {
>>> 235:             return vectorSpecies().maskAll(false);
>>> 236:         } else if (limit - offset >= length()) {
>> 
>> Can you move this else if check at the top, this is the most general case 
>> appearing in the loop and hence two extra uncommon trap jumps before it for 
>> special cases may penalize this.
>
> Your patch re-organized the code to take care of following cases :-
> 1.   offset < 0                         : Still falling over to old code.
> 2.   offset >= limit                  : Special case optimized 
> 3.   (limit - offset) >= length  :  Most general case.
> 4.   Partial in range check       :  New intrinsic, existing java side 
> implementation in checkIndex0 is also composed of intrinsified calls,  
> infrequently taken paths will anyways lead to uncommon traps.
> 
> Have you tried introducing just case 3 (first) and then case 2 in existing 
> indexInRange implementation.

> Thanks for the review @jatin-bhateja !
> 
> > Have you tried introducing just case 3 (first) and then case 2 in existing 
> > indexInRange implementation.
> 
> Yes, I tried with this way as well, together with the performance testing 
> compared with current version. The result is:
> 
> 1. For cases that the array size is aligned with the vector size, the 
> performance is almost the same with each other.
> 2. For cases that the array size is not aligned (which needs the masked 
> operation), the performance with current version can improve about 5%~6% 
> compared with the way as you suggested. But the improvement is limited to the 
> cases with smaller vector size, whose tail loop size is bigger.
> 
> So, adding the new intrinsic has no side effect for me, except for the more 
> complex java side code and one additional intrinsic.

Another benefit is the code change will be simpler, that almost all the files 
except the `AbstractMask.java` do not need to be changed. I will make the 
modification and rerun the benchmarks to decide whether it's necessary to add 
the new intrinsics. Thanks!

-------------

PR: https://git.openjdk.org/jdk/pull/12064

Reply via email to