Hi Segher!
Thanks for your helpful comments! Segher Boessenkool <seg...@kernel.crashing.org> writes: > Hi! > > On Fri, Nov 25, 2022 at 09:21:21PM +0800, Jiufu Guo wrote: >> "Kewen.Lin" <li...@linux.ibm.com> writes: >> > on 2022/9/15 16:30, Jiufu Guo wrote: >> >> For a complicate 64bit constant, blow is one instruction-sequence to >> >> build: >> >> lis 9,0x800a >> >> ori 9,9,0xabcd >> >> sldi 9,9,32 >> >> oris 9,9,0xc167 >> >> ori 9,9,0xfa16 >> >> >> >> while we can also use below sequence to build: >> >> lis 9,0xc167 >> >> lis 10,0x800a >> >> ori 9,9,0xfa16 >> >> ori 10,10,0xabcd >> >> rldimi 9,10,32,0 >> >> This sequence is using 2 registers to build high and low part firstly, >> >> and then merge them. >> >> In parallel aspect, this sequence would be faster. (Ofcause, using 1 more >> >> register with potential register pressure). > > And crucially this patch only uses two registers if can_create_pseudo_p. > Please mention that. OK. > >> >> * config/rs6000/rs6000.cc (rs6000_emit_set_long_const): Update 64bit >> >> constant build. > > If you don't give details of what this does, just say "Update." please. > But update to what? > > "Generate more parallel code if can_create_pseudo_p." maybe? Thanks. > >> >> + rtx H = gen_reg_rtx (DImode); >> >> + rtx L = gen_reg_rtx (DImode); > > Please don't use all-uppercase variable names, those are for macros. In > fact, don't use uppercase in variable (and function etc.) names at all, > unless there is a really good reason to. > > Just call it "high" and "low", or "hi" and "lo", or something? Thanks. > >> >> --- /dev/null >> >> +++ b/gcc/testsuite/gcc.target/powerpc/parall_5insn_const.c >> >> @@ -0,0 +1,27 @@ >> >> +/* { dg-do run } */ >> >> +/* { dg-options "-O2 -mdejagnu-cpu=power7 -save-temps" } */ >> > >> > Why do we need power7 here? >> power8/9 are also ok for this case. Actually, O just want to >> avoid to use new p10 instruction, like "pli", and then selected >> an old arch option. > > Why does it need _at least_ p7, is the question (as I understand it). > > To prohibit pli etc. you can do -mno-prefixed (which works on all older > CPUs just as well), or skip the test if prefixed insns are enabled, or > scan for the then generated code as well. The first option is by far > the simplest. Thanks for your suggestion! -mno-prefixed is great, it meets the intention here. > >> --- /dev/null >> +++ b/gcc/testsuite/gcc.target/powerpc/parall_5insn_const.c >> @@ -0,1 +1,27 @@ >> +/* { dg-do run } */ >> +/* { dg-options "-O2 -mdejagnu-cpu=power8 -save-temps" } */ >> +/* { dg-require-effective-target has_arch_ppc64 } */ > > p8 here makes no sense, we also want good and correct code generated for > older CPUs, so we should not preevent testing on those. For that reason > you shouldn't use -mcpu= when not needed. Like here :-) Yeap, thanks! BR, Jeff (Jiufu)