Hello Segher: On 21/02/23 4:34 pm, Segher Boessenkool wrote: > Hi! > > On Tue, Feb 21, 2023 at 02:18:25PM +0530, Ajit Agarwal wrote: >> This patch replaces fmr instruction 6 cycles with 2 cycles xxlor instruction >> for p7 and p8 architecture. >> >> I have implemented with switch and cases otherwise it is difficult to >> accommodate >> xxlor with p7 and p8 and fmr for other architectures. > > Please domn't use a switch, it isn't needed. Instead use the "isa" > attribute (with p7v here), and put the preferred alternative first.
I am not sure how this is possible without switch and using only "isa". > >> rs6000: fmr gets used instead of faster xxlor [PR93571] > > rs6000: Use xxlor instead of fmr where possible > >> This patch replaces 6 cycles fmr instruction with xxlor >> 2 cycles in p8 and p7 architecture. > > No, it also does it on all later architectures. > > Do you have any actual timings (i.e. from hardware, not documentation)? > >> * config/rs6000/rs6000.md (*movdf_hardfloat64): Replace fmr with xxlor >> instruction. > > Line too long. And, that is not what the patch does. Changelog should > be totally boring just saying what the patch changes. If the patch > changes things other than what thechangelog says your reviewer will > think something went missin somewhere :-) I will correct this. > >> - "@ >> - stfd%U0%X0 %1,%0 >> - lfd%U1%X1 %0,%1 >> - xxlor %0,%1,%1 > > That is not what is currently in trunk, so your patch cannot apply. > >> + switch (which_alternative) { >> + case 0 : return "stfd%U0%X0 %1,%0"; >> + case 1 : return "lfd%U1%X1 %0,%1"; > > Formatting is all incorrect. We dom't need or want a switch at all, but > correct would be: > switch (which_alternative) > { > case 0: > return "stfd%U0%X0 %1,%0"; > case 1: > return "lfd%U1%X1 %0,%1"; > > etc. I will correct that. > >> + case 2 : if ((TARGET_VSX || TARGET_P8_VECTOR) >> + && !TARGET_P9_VECTOR >> + && !TARGET_POWER10) >> + return "xxlor %0,%1,%1"; >> + else >> + return "fmr %0,%1"; > > Ah, so you are excluding p9 and p10 here. Hrm. That should be written > TARGET_VSX && !TARGET_P9_VECTOR, none of the rest is needed; but is that > a good idea at all? > > Please use %xN for VSX arguments whenever possible. If this alternative > allows only the low numbered vector registers, that is a hint that you > probably should write this differently (and %xN is harmless then). > >> + return "unreachable"; > > No, never do that. There is "gcc_unreachable ()" if you need it. > I will also correct this. > So, let's first do actual timings, and see if it is better on p9 and > p10 as well (or at least not worse). > > > Segher Thanks & Regards Ajit