On Mon, 16 Jan 2017, Toma Tabacu wrote:

> After searching through the archives, I have found an interesting bit of
> information about DIV.G/MOD.G in the original submission thread:
> 
> > > Ruan Beihong 23 July 2008:
> > > 
> > > I've seen the Loongson 2F manual carefully. The (d)div(u) is 
> > > internally splited into one (d)div(u).g and one (d)mod(u).g. So I said 
> > > before was wrong. The truth is that, (d)div(u).g and (d)mod(u).g are 
> > > always faster than (d)div(u), at least the time spend on mflo/mfhi is 
> > > saved. 
> > > 
> > > James Ruan 
> > 
> > Richard Sandiford 24 July 2008:
> > 
> > OK, great.  In that case, it should simply be a case of disabling
> > the divmod-related insns for Loongson, in addition to your patch.
> > (Probably stating the obvious there, sorry.)
> > 
> > Richard
> 
> Here's the link for part 1 of the submission thread (has the quotes from 
> above):
> https://gcc.gnu.org/ml/gcc-patches/2008-07/msg01529.html
> and here's part 2:
> https://gcc.gnu.org/ml/gcc-patches/2008-11/msg00273.html

 Thanks for digging this out!

> If DIV.G/MOD.G are faster, according to Ruan Beihong, and also smaller than DIV
> (or the same size [1]), as pointed out by Maciej, then I am led to the same
> conclusion as Richard Sandiford: that only DIV.G/MOD.G should be generated for
> Loongson.
> 
> I think it would still be a good idea to add a test for separated DIV.G/MOD.G,
> though.

 Possibly, though the combined tests need to stay then, to make sure 
generic DIV/DIVU is not ever produced.
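
 A separated test could look roughly like the sketch below.  The 
`-march=loongson2f' option and the exact scan patterns are assumptions on 
my part and would need checking against the existing gcc.target/mips 
tests; the function names are made up for illustration.

```c
/* { dg-do compile } */
/* { dg-options "-march=loongson2f" } */  /* Assumed option, untested.  */

/* Quotient only: expect a lone DIV.G/DIVU.G.  */
int quot (int a, int b) { return a / b; }
unsigned int uquot (unsigned int a, unsigned int b) { return a / b; }

/* Remainder only: expect a lone MOD.G/MODU.G.  */
int rem (int a, int b) { return a % b; }
unsigned int urem (unsigned int a, unsigned int b) { return a % b; }

/* Assumed scan patterns: the Loongson forms appear, the generic
   DIV/DIVU and the HI/LO moves do not.  */
/* { dg-final { scan-assembler "\tdiv\\.g\t" } } */
/* { dg-final { scan-assembler "\tdivu\\.g\t" } } */
/* { dg-final { scan-assembler "\tmod\\.g\t" } } */
/* { dg-final { scan-assembler "\tmodu\\.g\t" } } */
/* { dg-final { scan-assembler-not "\tdivu?\t" } } */
/* { dg-final { scan-assembler-not "\tmf(hi|lo)\t" } } */
```

 Keeping each operation in its own function stops the compiler from 
combining a quotient and a remainder of the same operands, which is what 
the existing combined tests cover.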

> What are your thoughts on this ?
> Have I misunderstood something in the context of the submission thread ?
> 
> Regards,
> Toma
> 
> [1] I've noticed that GCC generates the same TEQ instruction twice if both
> DIV.G and MOD.G are needed, which makes the sequence just as big as
> DIV + TEQ + MFHI + MFLO; this seems unnecessary to me.

 This ought to be handled then, likely by adding Loongson-specific RTL 
insns matching the `divmod<mode>4' and `udivmod<mode>4' expanders.  It may 
be as simple as, say (conceptually, untested):

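;; Operand 0 is earlyclobber: DIV.G writes it before MOD.G reads
;; operands 1 and 2, so it must not share a register with either.
(define_insn "<u>divmod<GPR:mode>4_loongson"
  [(set (match_operand:GPR 0 "register_operand" "=&d")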
        (any_div:GPR (match_operand:GPR 1 "register_operand" "d")
                     (match_operand:GPR 2 "register_operand" "d")))
   (set (match_operand:GPR 3 "register_operand" "=d")
        (any_mod:GPR (match_dup 1)
                     (match_dup 2)))]
  "TARGET_LOONGSON_2EF"
{
  return mips_output_division
    ("<GPR:d>div<u>.g\t%0,%1,%2\;<GPR:d>mod<u>.g\t%3,%1,%2", operands);
}
  [(set_attr "type" "idiv")
   (set_attr "mode" "<GPR:MODE>")])

although any final fix will have to take an instruction count adjustment 
into account too, as `mips_idiv_insns' won't as it stands handle the new 
case.
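
 To illustrate the counting issue, here is a toy model in plain C.  It is 
not GCC code: the function and parameter names are hypothetical stand-ins, 
and the cost of the zero check (one instruction for a TEQ trap, two for a 
branch plus BREAK) is my assumption about how the existing count works.

```c
#include <stdbool.h>

/* Hypothetical model of the divide-by-zero check cost:
   one TEQ if traps are available, else a branch plus BREAK.  */
static int
zero_check_insns (bool check_zero_div, bool divide_traps)
{
  if (!check_zero_div)
    return 0;
  return divide_traps ? 1 : 2;
}

/* Roughly what a single-division count assumes: one division
   instruction plus one zero check.  */
int
single_div_insns (bool check_zero_div, bool divide_traps)
{
  return 1 + zero_check_insns (check_zero_div, divide_traps);
}

/* What the combined Loongson pattern would need: DIV.G and MOD.G
   sharing one zero check on the divisor.  */
int
loongson_divmod_insns (bool check_zero_div, bool divide_traps)
{
  return 2 + zero_check_insns (check_zero_div, divide_traps);
}
```

 With the check enabled and traps available this gives 3 for the combined 
pattern against 2 for a single division, so a length computed on the 
single-division assumption would come out one instruction short.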

  Maciej
