Hi!

On Thu, Jul 09, 2020 at 09:14:44PM -0500, Xiong Hu Luo wrote:
> Move V4SF to V4SI, init vector like V4SI and move to V4SF back.
> Better instruction sequence could be generated on Power9:
>
> The point is to use lwz to avoid converting the single-precision to
> double-precision upon load, pack four 32-bit data into one 128-bit
> register directly.

> +  rtx tmpSF[4];
> +  rtx tmpSI[4];
> +  rtx tmpDI[4];
> +  rtx mrgDI[4];

Don't use upper case in variable names like this please.  Either tmpsf
or tmp_sf is fine.

> +  emit_move_insn (target, gen_lowpart (V4SFmode, tmpV2DI));

(This is a good example of why: it isn't obvious from just seeing this
that the tmpV2DI is a variable, while the V4SFmode is a symbolic
constant).

Looks fine other than that :-)


Segher
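
An illustrative aside on the quoted description: the idea of moving the
four single-precision values as raw 32-bit integers and merging them
pairwise into 64-bit halves can be sketched in plain C.  This is only a
rough model under stated assumptions, not the patch's RTL: the helper
names float_bits, merge_words and pack_v4sf are hypothetical, floats are
assumed to be IEEE 754 binary32, and the element order within each
64-bit half is chosen arbitrarily here (the real order depends on
endianness).

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterpret a float's bits as a 32-bit integer.
   No value conversion happens, which mirrors loading the word with an
   integer load instead of a floating-point load.  */
static uint32_t
float_bits (float f)
{
  uint32_t u;
  memcpy (&u, &f, sizeof u);
  return u;
}

/* Hypothetical helper: merge two 32-bit words into one 64-bit word,
   with HI in the upper half (shift left by 32, then OR).  */
static uint64_t
merge_words (uint32_t hi, uint32_t lo)
{
  return ((uint64_t) hi << 32) | lo;
}

/* Hypothetical helper: pack four floats into the two 64-bit halves of
   a 128-bit value.  The element order is an assumption made for this
   sketch only.  */
static void
pack_v4sf (const float in[4], uint64_t out[2])
{
  out[0] = merge_words (float_bits (in[0]), float_bits (in[1]));
  out[1] = merge_words (float_bits (in[2]), float_bits (in[3]));
}
```

Calling pack_v4sf on {1.0f, 2.0f, -0.5f, 0.0f} leaves the four bit
patterns side by side in out[0] and out[1], untouched by any
single-to-double conversion.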