On 23/02/2022 19:19, Richard Henderson wrote:
On 2/23/22 11:43, Matheus K. Ferst wrote:
Note that rotlv does the masking itself:
/*
 * Expand D = A << (B % element bits)
 *
 * Unlike scalar shifts, where it is easy for the target front end
 * to include the modulo as part of the expansion.  If the target
 * naturally includes the modulo as part of the operation, great!
 * If the target has some other behaviour from out-of-range shifts,
 * then it could not use this function anyway, and would need to
 * do its own expansion with custom functions.
 */
Using tcg_gen_rotlv_vec(vece, vrt, vra, vrb) works on PPC but fails on
x86. It looks like a problem on the i386 backend. It's using
VPS[RL]LV[DQ], but instead of this modulo behavior, these instructions
write zero to the element[1]. I'm not sure how to fix that.
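
To make the failure mode above concrete, here is a minimal scalar sketch of one 32-bit lane. It assumes the rotate is built from a left shift and a right shift whose results are zero for counts of 32 or more, as described for VPS[RL]LV[DQ]. The helper names (shl_lane, shr_lane, rotl_unmasked, rotl_masked) are hypothetical and are not QEMU or Intel APIs.

```c
#include <stdint.h>

/* Per-lane model of a variable shift that yields 0 for counts >= 32. */
static uint32_t shl_lane(uint32_t a, uint32_t n)
{
    return n < 32 ? a << n : 0;
}

static uint32_t shr_lane(uint32_t a, uint32_t n)
{
    return n < 32 ? a >> n : 0;
}

/* Rotate-left built from two shifts, with no masking of the count. */
static uint32_t rotl_unmasked(uint32_t a, uint32_t b)
{
    /* For b > 32, (32 - b) wraps to a huge unsigned count, so both halves are 0. */
    return shl_lane(a, b) | shr_lane(a, 32 - b);
}

/* Same rotate, with the count first reduced modulo the element size. */
static uint32_t rotl_masked(uint32_t a, uint32_t b)
{
    return rotl_unmasked(a, b & 31);
}
```

With a count of 33, rotl_unmasked() returns 0 for any input, while rotl_masked() rotates by 1, which is the modulo behavior the comment quoted above asks for.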
You don't want to use tcg_gen_rotlv_vec directly, but tcg_gen_rotlv_vec.
I guess there is a typo here. Did you mean tcg_gen_gvec_rotlv? Or
tcg_gen_rotlv_mod_vec?
The generic modulo is being applied here:
static void tcg_gen_rotlv_mod_vec(unsigned vece, TCGv_vec d,
                                  TCGv_vec a, TCGv_vec b)
{
    TCGv_vec t = tcg_temp_new_vec_matching(d);
    TCGv_vec m = tcg_constant_vec_matching(d, vece, (8 << vece) - 1);

    tcg_gen_and_vec(vece, t, b, m);
    tcg_gen_rotlv_vec(vece, d, a, t);
    tcg_temp_free_vec(t);
}
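
For reference, the (8 << vece) - 1 constant above can be checked with a small standalone sketch. It assumes vece is log2 of the element size in bytes (0 for 8-bit up to 3 for 64-bit elements). The functions rot_count_mask and rotl_elem are hypothetical scalar models of a single lane, not part of TCG.

```c
#include <stdint.h>

/* Mask that reduces a rotate count modulo the element width in bits. */
static uint64_t rot_count_mask(unsigned vece)
{
    return (8u << vece) - 1;
}

/* Scalar model of one lane: rotate a left by (b % element bits). */
static uint64_t rotl_elem(unsigned vece, uint64_t a, uint64_t b)
{
    unsigned bits = 8u << vece;
    /* All-ones value covering exactly one element. */
    uint64_t lane = bits == 64 ? UINT64_MAX : (UINT64_C(1) << bits) - 1;
    unsigned n = (unsigned)(b & rot_count_mask(vece));

    a &= lane;
    if (n == 0) {
        return a;   /* avoid shifting right by the full element width */
    }
    return ((a << n) | (a >> (bits - n))) & lane;
}
```

The masks come out as 7, 15, 31 and 63 for vece 0 through 3, so an out-of-range count such as 9 on a byte element behaves like a count of 1.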
I can see that this method is called when we use tcg_gen_gvec_rotlv to
implement vrl[bhwd], and they are working as expected. For vrl[wd]nm and
vrl[wd]mi, however, we can't call tcg_gen_rotlv_mod_vec directly in the
.fniv implementation because it is not exposed in tcg-op.h. Is there any
other way to use this method? Should we add it to the header file?
Thanks,
Matheus K. Ferst
Instituto de Pesquisas ELDORADO <http://www.eldorado.org.br/>
Software Analyst