Re: [PATCH v6 01/10] Initial TI PRU GCC port

2019-06-09 Thread Segher Boessenkool
Hi Dimitar,

Just some comments, do with it what you want :-)

On Sun, Jun 09, 2019 at 11:01:38PM +0300, Dimitar Dimitrov wrote:
> +; An unfortunate side effect is that quite a few invalid RTL patterns are
> +; generated.  For example:
> +;  ... (zero_extend:SI (match_operand:SI ...)) ...

You could perhaps use a mode iterator like rs6000's EXTHI and friends.

> +; These patterns are harmless since no pass should generate such RTL.  This
> +; shortcut allows us to keep small and concise machine description patterns.

But the generated code (of GCC itself) will be larger.  It's also not
harmless in that it can complicate your debugging, problems take longer
to spot.


> +(define_insn 
> "add_impl_"

Many lines are too long.  Not that it is clear how to fix that in this
particular case ;-)


> +  [(set_attr "type" "alu,alu,alu")])

You can just say

  [(set_attr "type" "alu")])

if all alternatives are the same value for an attribute.


> +(define_insn "one_impl_"

The standard pattern is called one_cmpl, so maybe you want one_cmpl_impl
here?  one_impl looks like a typo.


> +(define_subst "alu2_zext_subst"
> +  [(set (match_operand:EQD 0 "" "")
> + (ALUOP2:EQD (zero_extend:EQD (match_operand:EQD 1 "" ""]

I don't know if this actually works for define_subst, but at least in many
other cases you can write this like

(define_subst "alu2_zext_subst"
  [(set (match_operand:EQD 0)
(ALUOP2:EQD (zero_extend:EQD (match_operand:EQD 1]

(you can omit trailing empty string arguments).


> +(define_predicate "pru_muldst_operand"
> +  (match_code "subreg,reg")
> +{
> +  if (register_operand (op, mode))
> +{
> +  int regno;
> +
> +  if (REG_P (op))
> + regno = REGNO (op);
> +  else if (GET_CODE (op) == SUBREG && REG_P (SUBREG_REG (op)))
> + regno = REGNO (SUBREG_REG (op));
> +  else
> + return 0;
> +
> +  return REGNO_REG_CLASS (regno) == MULDST_REGS
> +  || regno >= FIRST_PSEUDO_REGISTER;
> +}
> +  return 0;
> +})


> +static bool
> +pru_hard_regno_scratch_ok (unsigned int regno)
> +{
> +  /* Don't allow hard registers that might be part of the frame pointer.
> + Some places in the compiler just test for [HARD_]FRAME_POINTER_REGNUM
> + and don't handle a frame pointer that spans more than one register.
> + TODO: Fix those faulty places.  */
> +
> +  if ((!reload_completed || frame_pointer_needed)
> +  && ((regno >= HARD_FRAME_POINTER_REGNUM
> +&& regno <= HARD_FRAME_POINTER_REGNUM + 3)
> +   || (regno >= FRAME_POINTER_REGNUM
> +   && regno <= FRAME_POINTER_REGNUM + 3)))
> +return false;

Use IN_RANGE?


> +  /* QBxx conditional branching cannot cope with block reordering.  */
> +  if (flag_reorder_blocks_and_partition)
> +{
> +  inform (input_location, "%<-freorder-blocks-and-partition%> "
> +   "not supported on this architecture");
> +  flag_reorder_blocks_and_partition = 0;
> +  flag_reorder_blocks = 1;
> +}

What you cannot cope with is the hot/cold partitioning, I guess --
otherwise you'd have to disable reorder_blocks itself, and that would
result in pretty terrible code.


> +; There is no pipeline, so our scheduling description is simple.
> +(define_automaton "pru")
> +(define_cpu_unit "cpu" "pru")
> +
> +(define_insn_reservation "everything" 1 (match_test "true") "cpu")

Because you have a scheduling description, INSN_SCHEDULING is defined,
and that makes combine not create SUBREGs of MEM.  Which is pretty
important :-)


It looks like a quite complete port, congratulations :-)


Segher


Re: [PATCH v6 01/10] Initial TI PRU GCC port

2019-06-11 Thread Dimitar Dimitrov
On неделя, 9 юни 2019 г. 17:38:58 EEST Segher Boessenkool wrote:
> Hi Dimitar,
> 
> Just some comments, do with it what you want :-)
> 
> On Sun, Jun 09, 2019 at 11:01:38PM +0300, Dimitar Dimitrov wrote:
> > +; An unfortunate side effect is that quite a few invalid RTL patterns are
> > +; generated.  For example:
> > +;  ... (zero_extend:SI (match_operand:SI ...)) ...
> 
> You could perhaps use a mode iterator like rs6000's EXTHI and friends.
The machine description already extensively uses mode iterators.

The example in that comment is written with SI mode for brevity.

> 
> > +; These patterns are harmless since no pass should generate such RTL. 
> > This +; shortcut allows us to keep small and concise machine description
> > patterns.
> But the generated code (of GCC itself) will be larger.  It's also not
> harmless in that it can complicate your debugging, problems take longer
> to spot.
I'm afraid that's the best I can do without tooling upgrades.

Reference: https://gcc.gnu.org/ml/gcc-patches/2018-09/msg00688.html

> 
> > +(define_insn
> > "add_impl_ > _zext_op2>"
> Many lines are too long.  Not that it is clear how to fix that in this
> particular case ;-)
Those ones I'm not sure how to break. The C code should conform to GNU 
standards, though.

> 
> > +  [(set_attr "type" "alu,alu,alu")])
> 
> You can just say
> 
>   [(set_attr "type" "alu")])
> 
> if all alternatives are the same value for an attribute.

I'll fix it. Thanks.

> 
> > +(define_insn "one_impl_"
> 
> The standard pattern is called one_cmpl, so maybe you want one_cmpl_impl
> here?  one_impl looks like a typo.
I'll update to one_cmpl_impl. Thanks.

> 
> > +(define_subst "alu2_zext_subst"
> > +  [(set (match_operand:EQD 0 "" "")
> > +   (ALUOP2:EQD (zero_extend:EQD (match_operand:EQD 1 "" ""]
> 
> I don't know if this actually works for define_subst, but at least in many
> other cases you can write this like
> 
> (define_subst "alu2_zext_subst"
>   [(set (match_operand:EQD 0)
>   (ALUOP2:EQD (zero_extend:EQD (match_operand:EQD 1]
> 
> (you can omit trailing empty string arguments).
Indeed, genmddump generates the same output with or without the empty strings. 
I was not sure which is the preferred form, though. I see lots of MD 
definitions in the GCC tree with empty strings.

> 
> > +(define_predicate "pru_muldst_operand"
> > +  (match_code "subreg,reg")
> > +{
> > +  if (register_operand (op, mode))
> > +{
> > +  int regno;
> > +
> > +  if (REG_P (op))
> > +   regno = REGNO (op);
> > +  else if (GET_CODE (op) == SUBREG && REG_P (SUBREG_REG (op)))
> > +   regno = REGNO (SUBREG_REG (op));
> > +  else
> > +   return 0;
> > +
> > +  return REGNO_REG_CLASS (regno) == MULDST_REGS
> > +|| regno >= FIRST_PSEUDO_REGISTER;
> > +}
> > +  return 0;
> > +})
> > 
> > 
> > +static bool
> > +pru_hard_regno_scratch_ok (unsigned int regno)
> > +{
> > +  /* Don't allow hard registers that might be part of the frame pointer.
> > + Some places in the compiler just test for
> > [HARD_]FRAME_POINTER_REGNUM
> > + and don't handle a frame pointer that spans more than one register.
> > + TODO: Fix those faulty places.  */
> > +
> > +  if ((!reload_completed || frame_pointer_needed)
> > +  && ((regno >= HARD_FRAME_POINTER_REGNUM
> > +  && regno <= HARD_FRAME_POINTER_REGNUM + 3)
> > + || (regno >= FRAME_POINTER_REGNUM
> > + && regno <= FRAME_POINTER_REGNUM + 3)))
> > +return false;
> 
> Use IN_RANGE?
Sounds good. Will rewrite it.

> 
> > +  /* QBxx conditional branching cannot cope with block reordering.  */
> > +  if (flag_reorder_blocks_and_partition)
> > +{
> > +  inform (input_location, "%<-freorder-blocks-and-partition%> "
> > + "not supported on this architecture");
> > +  flag_reorder_blocks_and_partition = 0;
> > +  flag_reorder_blocks = 1;
> > +}
> 
> What you cannot cope with is the hot/cold partitioning, I guess --
> otherwise you'd have to disable reorder_blocks itself, and that would
> result in pretty terrible code.
Yes.

> 
> > +; There is no pipeline, so our scheduling description is simple.
> > +(define_automaton "pru")
> > +(define_cpu_unit "cpu" "pru")
> > +
> > +(define_insn_reservation "everything" 1 (match_test "true") "cpu")
> 
> Because you have a scheduling description, INSN_SCHEDULING is defined,
> and that makes combine not create SUBREGs of MEM.  Which is pretty
> important :-)
Would lack of INSN_SCHEDULING result in a more efficient target code?  Is it 
recommended?  I added dummy scheduling as a precaution to avoid issues like 
PR78883 for AVR.

BTW, today I tested with and without scheduling description for PRU.  I didn't 
get any new testsuite failures.
> 
> 
> It looks like a quite complete port, congratulations :-)
> 
> 
> Segher

Thank you,
Dimitar




Re: [PATCH v6 01/10] Initial TI PRU GCC port

2019-06-14 Thread Segher Boessenkool
On Wed, Jun 12, 2019 at 06:08:13AM +0300, Dimitar Dimitrov wrote:
> On неделя, 9 юни 2019 г. 17:38:58 EEST Segher Boessenkool wrote:
> > On Sun, Jun 09, 2019 at 11:01:38PM +0300, Dimitar Dimitrov wrote:
> > > +; An unfortunate side effect is that quite a few invalid RTL patterns are
> > > +; generated.  For example:
> > > +;  ... (zero_extend:SI (match_operand:SI ...)) ...
> > 
> > You could perhaps use a mode iterator like rs6000's EXTHI and friends.
> The machine description already extensively uses mode iterators.

I meant specifically type iterators like this.  We can write e.g.

  (zero_extend:EXTSI (match_operand:SI ...))

and it will only generate patterns for those modes that SI can be extended
to.  This might be hard to make work with more than one mode in your
pattern though :-(

> > > +(define_subst "alu2_zext_subst"
> > > +  [(set (match_operand:EQD 0 "" "")
> > > + (ALUOP2:EQD (zero_extend:EQD (match_operand:EQD 1 "" ""]
> > 
> > I don't know if this actually works for define_subst, but at least in many
> > other cases you can write this like
> > 
> > (define_subst "alu2_zext_subst"
> >   [(set (match_operand:EQD 0)
> > (ALUOP2:EQD (zero_extend:EQD (match_operand:EQD 1]
> > 
> > (you can omit trailing empty string arguments).
> Indeed, genmddump generates the same output with or without the empty 
> strings. 
> I was not sure which is the preferred form, though. I see lots of MD 
> definitions in the GCC tree with empty strings.

Yes, this is a pretty new feature :-)

Usually all those trailing "" are just noise.  Sometimes there *could*
be something non-empty there, and then you sometimes want to keep it,
as documentation or whatever.

> > > +; There is no pipeline, so our scheduling description is simple.
> > > +(define_automaton "pru")
> > > +(define_cpu_unit "cpu" "pru")
> > > +
> > > +(define_insn_reservation "everything" 1 (match_test "true") "cpu")
> > 
> > Because you have a scheduling description, INSN_SCHEDULING is defined,
> > and that makes combine not create SUBREGs of MEM.  Which is pretty
> > important :-)
> Would lack of INSN_SCHEDULING result in a more efficient target code?  Is it 
> recommended?  I added dummy scheduling as a precaution to avoid issues like 
> PR78883 for AVR.

SUBREGs of MEM are an obsolete feature.  Maybe we should decouple it from
INSN_SCHEDULING; as it is now this workaround is non-obvious and frankly
a bit silly.  I'll see what I can do.


Segher