Hi,
Andrew Pinski <pins...@gmail.com> writes:

> On Thu, Dec 29, 2022 at 11:45 PM Segher Boessenkool
> <seg...@kernel.crashing.org> wrote:
>>
>> Hi!
>>
>> On Fri, Dec 30, 2022 at 10:22:31AM +0800, Jiufu Guo wrote:
>> > Considering the limitations of CSE, I try to find other places
>> > to handle this issue, and notice DSE can optimize below code:
>> > "[sfp:DI]=x:DI ; y:SI=[sfp:DI]" to "y:SI=x:DI#0".
>> >
>> > So, I drafted a patch to update DSE to handle DI->DF/SF.
>> > The patch updates "extract_low_bits" to get mode change
>> > with subreg.
>> >
>> > diff --git a/gcc/expmed.cc b/gcc/expmed.cc
>> > index b12b0e000c2..5e36331082c 100644
>> > --- a/gcc/expmed.cc
>> > +++ b/gcc/expmed.cc
>> > @@ -2439,7 +2439,10 @@ extract_low_bits (machine_mode mode, machine_mode
>> > src_mode, rtx src)
>> >
>> >    if (!targetm.modes_tieable_p (src_int_mode, src_mode))
>> >      return NULL_RTX;
>> > -  if (!targetm.modes_tieable_p (int_mode, mode))
>> > +  if (!targetm.modes_tieable_p (int_mode, mode)
>> > +      && !(known_le (GET_MODE_BITSIZE (mode), GET_MODE_BITSIZE (src_mode))
>> > +           && GET_MODE_CLASS (mode) == MODE_FLOAT
>> > +           && GET_MODE_CLASS (src_mode) == MODE_INT))
>> >      return NULL_RTX;
>> >
>> >    src = gen_lowpart (src_int_mode, src);
>>
>> Ah!  This simply shows rs6000_modes_tieable_p is decidedly non-optimal:
>> it does not allow tying a scalar float to anything else.  No such thing
>> is required, or good apparently.  I wonder why we have such restrictions
>> at all in rs6000; is it just unfortunate history, was it good at one
>> point in time?
>
> The documentation for TARGET_MODES_TIEABLE_P says the following:
> If TARGET_HARD_REGNO_MODE_OK (r, mode1) and TARGET_HARD_REGNO_MODE_OK
> (r, mode2) are always the same for any r, then TARGET_MODES_TIEABLE_P
> (mode1, mode2) should be true. If they differ for any r, you should
> define this hook to return false unless some other mechanism ensures
> the accessibility of the value in a narrower mode.
>
> even though rs6000_hard_regno_mode_ok_uncached's comment has the following:
>   /* The float registers (except for VSX vector modes) can only hold floating
>      modes and DImode.  */
>
> TARGET_P8_VECTOR and TARGET_P9_VECTOR has special cased different modes now:
>       if (TARGET_P8_VECTOR && (mode == SImode))
>         return 1;
>
>       if (TARGET_P9_VECTOR && (mode == QImode || mode == HImode))
>         return 1;
> Which I suspect that means rs6000_modes_tieable_p should return true
> for SImode and SFmode if TARGET_P8_VECTOR is true. Likewise for
> TARGET_P9_VECTOR and SFmode and QImode/HImode too.
>

Thanks for your great comments!

modes_tieable_p is invoked from a few places besides extract_low_bits,
so relaxing the restriction in this hook may benefit more passes.
We may update modes_tieable_p to cover as many cases as possible.
A hacked patch for "float vs. int" is listed at the end of this mail.

Back to the issue of this PR: optimizing a float load of a value that
was stored from an integer register.  DSE works mostly within a basic
block, so updating modes_tieable_p (or extract_low_bits) cannot handle
some cases like:

double __attribute__ ((noipa))
foo_df (DF arg, int flag)
{
  if (flag == 2)
    return arg.a[3];
  return 0.0;
}

I'm thinking about a way to handle this case.

BR,
Jeff (Jiufu)

>
> Thanks,
> Andrew Pinski
>
>>
>>
>> Segher

(To be refined.)
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index b3a609f3aa3..8088a608be6 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -1959,6 +1959,17 @@ rs6000_hard_regno_mode_ok (unsigned int regno, machine_mode mode)
 static bool
 rs6000_modes_tieable_p (machine_mode mode1, machine_mode mode2)
 {
+
+  if ((GET_MODE_CLASS (mode1) == MODE_FLOAT
+       && (GET_MODE_SIZE (mode2) == UNITS_PER_FP_WORD
+           || (TARGET_P8_VECTOR && (mode2 == SImode))
+           || (TARGET_P9_VECTOR && (mode2 == QImode || mode2 == HImode))))
+      || (GET_MODE_CLASS (mode2) == MODE_FLOAT
+          && (GET_MODE_SIZE (mode1) == UNITS_PER_FP_WORD
+              || (TARGET_P8_VECTOR && (mode1 == SImode))
+              || (TARGET_P9_VECTOR && (mode1 == QImode || mode1 == HImode)))))
+    return true;
+
   if (mode1 == PTImode || mode1 == OOmode || mode1 == XOmode
       || mode2 == PTImode || mode2 == OOmode || mode2 == XOmode)
     return mode1 == mode2;
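
For experimenting with the foo_df case mentioned earlier in this mail, a
self-contained version might look like the sketch below.  The definition
of DF is not shown in this mail, so the struct layout here is an
assumption.  bar_df and the NOIPA macro are hypothetical additions:
bar_df is a variant with no branch in front of the load, added only for
comparison, and NOIPA just lets the file build with compilers that lack
the noipa attribute.

```c
/* Assumed layout: this mail does not show the definition of DF.  */
typedef struct DF
{
  double a[4];
  long l;
} DF;

/* noipa is a GCC attribute; fall back to noinline for other compilers.
   NOIPA is a hypothetical helper macro, not part of the original test.  */
#if defined (__GNUC__) && !defined (__clang__)
#define NOIPA __attribute__ ((noipa))
#else
#define NOIPA __attribute__ ((noinline))
#endif

/* The case from the mail: the load of arg.a[3] sits behind a branch.  */
double NOIPA
foo_df (DF arg, int flag)
{
  if (flag == 2)
    return arg.a[3];
  return 0.0;
}

/* Hypothetical variant for comparison: no branch before the load.  */
double NOIPA
bar_df (DF arg)
{
  return arg.a[3];
}
```

Comparing the code generated for foo_df and bar_df, with and without the
hacked patch, should show whether the conditional case is still left with
a store followed by a float load.  The sketch itself makes no claim about
what a given GCC revision emits.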