On Tue, 4 Oct 2022 06:49:53 GMT, Jatin Bhateja <jbhat...@openjdk.org> wrote:

>> @merykitty Thanks for the suggestion. I will update the instruct to use 
>> kmovwl. I will also experiment with kshiftrw and let you know.
>
>> You can use `kmovwl` instead which will relax the avx512bw constraint, 
>> however, you will need avx512vl for `evcvtps2ph`. Thanks.
> 
> Yes, in general all AVX512VL targets support AVX512BW, but cloud instances 
> give freedom to enable custom features. Regarding K0, as per section 
> "15.6.1.1" of SDM, expectation is that K0 can appear in source and 
> destination of regular non predication context, k0 should always contain all 
> true mask so it should be unmodifiable for subsequent usages i.e. should not 
> be present as destination of a mask manipulating instruction. Your suggestion 
> is to have that in source but it may not work either. Changing existing 
> sequence to use kmovw and replace AVX512BW with AVX512VL will again mean 
> introducing an additional predication check for this pattern.

Ah, I get it: in predicated instructions the encoding of k0 is treated 
specially and refers to an all-set mask, but the register itself may not 
actually contain that value, so using it as a source of `kshiftrw` may fail. 
In that case I think we can generate an all-set mask on the fly using 
`kxnorw(ktmp, ktmp)`, which saves a GPR here. Thanks.
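
To illustrate why XNOR-ing a mask register with itself works, here is a rough 
software model of the two 16-bit opmask operations. It is only a sketch and not 
HotSpot code: the `kxnorw` and `kshiftrw` functions below are plain C++ 
stand-ins for the instructions, and the shift count of 8 is a hypothetical 
value chosen for the example.

```cpp
#include <cstdint>

// Software model of 16-bit opmask operations (illustration only, not the
// HotSpot assembler API).

// XNOR of two 16-bit masks. When both operands are the same register the
// result is all ones, whatever the register held before.
constexpr std::uint16_t kxnorw(std::uint16_t a, std::uint16_t b) {
    return static_cast<std::uint16_t>(~(a ^ b));
}

// Logical right shift of a 16-bit mask; counts above 15 produce zero.
constexpr std::uint16_t kshiftrw(std::uint16_t k, unsigned imm8) {
    return imm8 > 15 ? std::uint16_t{0}
                     : static_cast<std::uint16_t>(k >> imm8);
}

// An arbitrary (even stale) ktmp value still yields the all-set mask.
static_assert(kxnorw(0x1234, 0x1234) == 0xFFFF, "all-set mask expected");

// Hypothetical follow-up: shift the all-set mask to keep the low 8 lanes.
static_assert(kshiftrw(kxnorw(0, 0), 8) == 0x00FF, "low 8 lanes expected");
```

The point of the model is that the result does not depend on the previous 
contents of `ktmp`, so no GPR is needed to materialize the constant first.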

-------------

PR: https://git.openjdk.org/jdk/pull/9781
