https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108583
--- Comment #18 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
> >
> > Ack, that also tracks with what I tried before, we don't indeed track ranges
> > for vector ops. The general case can still be handled slightly better (I
> > think)
> > but it doesn't become as clear of a win as this one.
> >
> > > You probably did so elsewhere some time ago, but what exactly are those
> > > four instructions? (pointers to specifications appreciated)
> >
> > For NEON we use:
> > https://developer.arm.com/documentation/ddi0596/2021-12/SIMD-FP-Instructions/ADDHN--ADDHN2--Add-returning-High-Narrow-
>
> so thats a add + pack high
>

Yes, though with no overflow, the addition is done in twice the precision
of the original type. So it's more a widening add + pack high which narrows
it back and zero extends.

> > https://developer.arm.com/documentation/ddi0596/2021-12/SIMD-FP-Instructions/UADDW--UADDW2--Unsigned-Add-Wide-
>
> and that unpacks (zero-extends) the high/low part of one operand of an add
>
> I wonder if we'd open-code the pack / unpack and use regular add whether
> combine can synthesize uaddw and addhn? The pack and unpack would be
> vec_perms on GIMPLE (plus V_C_E).

I don't think so for addhn, because it wouldn't truncate the top bits, it
truncates the bottom bits.

The instruction does

    element1 = Elem[operand1, e, 2*esize];
    element2 = Elem[operand2, e, 2*esize];

So it widens on input.

>
> So the difficulty here will be to decide whether that's in the end
> better than what the pattern handling code does now, right? Because
> I think most targets will be able to do the above but lacking the
> special adds it will be slower because of the extra packing/unpacking?
>
> That said, can we possibly do just that costing (would be a first in
> the pattern code I guess) with a target hook? Or add optabs for
> the addh operations so we can query support?

We could, the alternative wouldn't be correct for costing I think..
if we generate *+ , vec_perm that's gonna be more expensive.
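
To make the "truncates the bottom bits" point concrete, here is a rough
scalar sketch of one lane of each instruction. It assumes 16-bit wide
elements narrowing to 8 bits and the non-rounding form of ADDHN. The function
names are made up for illustration; this is not GCC or Arm code, only a model
of the per-lane arithmetic described above.

```c
#include <assert.h>
#include <stdint.h>

/* One ADDHN lane (hypothetical helper): both inputs are read at the wide
   element size (2*esize = 16 bits), added modulo 2^16, and the HIGH half
   of the sum is kept, i.e. the bottom 8 bits are dropped. */
uint8_t addhn_lane_u16(uint16_t a, uint16_t b)
{
    uint16_t sum = (uint16_t)(a + b);
    return (uint8_t)(sum >> 8);
}

/* One UADDW lane (hypothetical helper): a wide operand plus a
   zero-extended narrow operand, producing a wide result. */
uint16_t uaddw_lane_u8(uint16_t wide, uint8_t narrow)
{
    return (uint16_t)(wide + (uint16_t)narrow);
}

/* For contrast: a regular add followed by a plain truncating narrow keeps
   the LOW half, which is not what ADDHN computes. */
uint8_t add_then_truncate_u16(uint16_t a, uint16_t b)
{
    return (uint8_t)(a + b);
}

/* Small self-check of the three models on one input pair. */
void addhn_model_self_check(void)
{
    /* 0x1234 + 0x00ff = 0x1333 */
    assert(uaddw_lane_u8(0x1234, 0xff) == 0x1333);
    assert(addhn_lane_u16(0x1234, 0x00ff) == 0x13);        /* high half */
    assert(add_then_truncate_u16(0x1234, 0x00ff) == 0x33); /* low half  */
}
```

For the same inputs the two narrowing results differ (0x13 vs 0x33), so an
open-coded add plus a truncating pack would have to be combined with a shift
before it could match addhn.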