On 2/22/22 04:36, matheus.fe...@eldorado.org.br wrote:
+ tcg_gen_movi_i64(disj, 0);
The init here means there's one more OR generated than necessary. Though perhaps it gets
folded away...
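To illustrate, here is a minimal plain-C model of the loop (not TCG, and not part of the patch) that skips the zero init by writing the first conjunction straight into disj. The name eval_minterms and the first flag are hypothetical, and the hand-rolled bit scan stands in for ctz64.

```c
#include <stdint.h>

/* Plain-C stand-in for the generated ops; a sketch only, not QEMU code. */
static uint64_t eval_minterms(uint64_t a, uint64_t b, uint64_t c, uint8_t imm)
{
    uint64_t disj = 0;   /* only returned as-is when imm == 0 */
    int first = 1;

    while (imm) {
        /* Index of the lowest set bit (stand-in for ctz64). */
        int lsb = 0;
        while (!((imm >> lsb) & 1)) {
            lsb++;
        }
        /* Invert to match the PowerISA bit numbering. */
        int bit = 7 - lsb;

        uint64_t conj = (bit & 0x4) ? a : ~a;
        conj &= (bit & 0x2) ? b : ~b;
        conj &= (bit & 0x1) ? c : ~c;

        if (first) {
            disj = conj;     /* first term: plain move, no OR */
            first = 0;
        } else {
            disj |= conj;
        }

        /* Unset the least significant bit that is set. */
        imm &= imm - 1;
    }
    return disj;
}
```

In the translator, the equivalent would presumably be a mov for the first term and an or for the rest, with imm == 0 still needing the explicit movi.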
+
+ /* Iterate over set bits from the least to the most significant bit */
+ while (imm) {
+ /*
+ * Get the next bit to be processed with ctz64. Invert the result of
+ * ctz64 to match the indexing used by PowerISA.
+ */
+ bit = 7 - ctz64(imm);
+ if (bit & 0x4) {
+ tcg_gen_mov_i64(conj, a);
+ } else {
+ tcg_gen_not_i64(conj, a);
+ }
+ if (bit & 0x2) {
+ tcg_gen_and_i64(conj, conj, b);
+ } else {
+ tcg_gen_andc_i64(conj, conj, b);
+ }
+ if (bit & 0x1) {
+ tcg_gen_and_i64(conj, conj, c);
+ } else {
+ tcg_gen_andc_i64(conj, conj, c);
+ }
+ tcg_gen_or_i64(disj, disj, conj);
+
+ /* Unset the least significant bit that is set */
+ imm &= imm - 1;
I guess this works, though it's not nearly optimal.
It's certainly a good fallback for the out-of-line function.
Table 145 has the folded equivalent functions. Implementing all 256 of them as is, twice,
for both i64 and vec could be tedious. But we could cherry-pick the easiest, or most
commonly used, or something, and let all other imm values go through to the out-of-line function.
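A rough sketch of that split, again in plain C rather than TCG. The function names are hypothetical, and the handful of imm values in the switch were derived from the loop's bit indexing above, not copied from Table 145, so they would need checking against the ISA before use.

```c
#include <stdbool.h>
#include <stdint.h>

/* Generic minterm evaluation, standing in for the out-of-line helper. */
static uint64_t eval_fallback(uint64_t a, uint64_t b, uint64_t c, uint8_t imm)
{
    uint64_t r = 0;

    for (int bit = 0; bit < 8; bit++) {
        /* PowerISA numbering: bit 0 is the most significant bit of imm. */
        if (imm & (0x80 >> bit)) {
            uint64_t conj = (bit & 0x4) ? a : ~a;
            conj &= (bit & 0x2) ? b : ~b;
            conj &= (bit & 0x1) ? c : ~c;
            r |= conj;
        }
    }
    return r;
}

/* Hypothetical inline fast path: returns true if imm was handled. */
static bool eval_fast(uint64_t a, uint64_t b, uint64_t c, uint8_t imm,
                      uint64_t *out)
{
    switch (imm) {
    case 0x00: *out = 0;              return true;
    case 0x01: *out = a & b & c;      return true;
    case 0x03: *out = a & b;          return true;
    case 0x0f: *out = a;              return true;
    case 0x3c: *out = a ^ b;          return true;
    case 0x3f: *out = a | b;          return true;
    case 0x7f: *out = a | b | c;      return true;
    case 0xff: *out = ~(uint64_t)0;   return true;
    default:                          return false;
    }
}

/* Try the cherry-picked cases first, otherwise take the generic path. */
static uint64_t eval(uint64_t a, uint64_t b, uint64_t c, uint8_t imm)
{
    uint64_t r;

    return eval_fast(a, b, c, imm, &r) ? r : eval_fallback(a, b, c, imm);
}
```

Keeping the generic path around also gives a cheap cross-check: running all 256 imm values through both and comparing the results would catch a mistyped case.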
r~