RE: [PATCH][MIPS] Implement O32 FPXX ABI (GCC)

2014-05-17 Thread Matthew Fortune
Hi Richard,

I know we are bombarding you with quite large features but if you have chance
to comment on this patch it would be useful.

There is only one further major patch to post and then all recent work from
Imagination will be visible. Once I've sent that out we should probably
discuss the order of reviewing and committing the work.

Regards,
Matthew

> -Original Message-
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-ow...@gcc.gnu.org]
> On Behalf Of Matthew Fortune
> Sent: 07 May 2014 16:22
> To: Richard Sandiford
> Cc: Joseph Myers (jos...@codesourcery.com); ma...@codesourcery.com; Moore,
> Catherine (catherine_mo...@mentor.com); Rich Fuhler; 'gcc-
> patc...@gcc.gnu.org' (gcc-patches@gcc.gnu.org)
> Subject: [PATCH][MIPS] Implement O32 FPXX ABI (GCC)
> 
> Attached is the patch to implement the O32 FPXX ABI described here:
> 
> https://dmz-portal.mips.com/wiki/MIPS_O32_ABI_-_FR0_and_FR1_Interlinking
> 
> Corresponding binutils and glibc patches are here:
> http://sourceware.org/ml/binutils/2014-05/msg2.html
> http://sourceware.org/ml/libc-alpha/2014-05/msg00044.html
> 
> A few comments on the patch:
> *) The hook to enable LRA is a temporary measure. This patch is dependent
>on the LRA work from Robert.
> *) Dwarf debug for 64-bit values in floating point values for FPXX can't
>be strictly correct for both 32-bit and 64-bit registers but opts to
>describe one 64-bit register as that is what the FPXX ABI is emulating.
>I have not yet checked what exactly happens in GDB when confronted with
>this and 32-bit registers. This also impacts frame information described
>via mips_save_reg and mips_restore_reg. Advice on this would be
>appreciated.
> *) ISA_HAS_MXHC1 could be defined as true for all three O32 FP ABIs but
>I left out FP32 to maintain historic behaviour. It should be safe to
>Include it though. Thoughts?
> *) Because GCC can be built to have mfpxx or mfp64 as the default option
>the ASM_SPEC has to handle these specially such that they are not
>passed in conjunction with -msingle-float. Depending on how all this
>option handling pans out then this may also need to account for
>msoft-float as well. It is an error to have -msoft-float and -mfp64 in
>the assembler.
> *) The REGISTER_PREFIX patch is included here as I need it here but can't
>commit it yet even though it is approved.
> 
> I am still working through all the testing that this needs but I am fairly
> happy with the current state of this patch. I'll report more test status
> as I get it.
> 
> Regards,
> Matthew
> 
> 2014-05-07  Matthew Fortune  
> 
> gcc/
>   * config.gcc (--with-fp): New option.
>   * config.in (HAVE_AS_MODULE): New config define.
>   * config/mips/mips-protos.h (mips_hard_regno_caller_save_mode): New
>   prototype.
>   * config/mips/mips.c (mips_get_arg_info): Ensure V2SFmode is only
>   handled specially with TARGET_PAIRED_SINGLE_FLOAT.
>   (mips_return_mode_in_fpr_p): Likewise.
>   (mips16_call_stub_mode_suffix): Likewise.
>   (mips_return_fpr_pair): O32 FPR pairs are split over even registers.
>   (mips16_build_call_stub): Likewise.
>   (mips_output_64bit_xfer): Use mthc1 whenever TARGET_HAS_MXHC1.  Add
>   specific cases for TARGET_FPXX to move via memory.
>   (mips_dwarf_register_span): For TARGET_FPXX pretend that modes larger
>   than UNITS_PER_FPREG do not span registers.
>   (mips_file_start): Switch to using .module instead of .gnu_attribute.
>   No longer support FP ABI 4 (-mips32r2 -mfp64), replace with FP ABI 6.
>   Add FP ABI 5 (-mfpxx).
>   (mips_save_reg, mips_restore_reg): Do not split saves for TARGET_FPXX.
>   (mips_secondary_reload_class): Only reload FP_REGS via GPRS if the
>   register has been chosen.
>   (mips_hard_regno_caller_save_mode): Implement.
>   (mips_option_override): ABI check for TARGET_FLOATXX.
>   (mips_conditional_register_usage): Redefine O32 FP64 to match O32 FP32
>   callee-saved behaviour.
>   (TARGET_LRA_P) TEMPORARY, Define and enable LRA.
>   * config/mips/mips.h (TARGET_CPU_CPP_BUILTINS): TARGET_FPXX is
>   __mips_fpr==0.
>   (MIPS_FP32_OPTION_SPEC): New define.
>   (OPTION_DEFAULT_SPECS): Pass through --with-fp=* to -mfp*.
>   (ISA_HAS_MXHC1): True for TARGET_FPXX.
>   (ASM_SPEC): Pass through mfpxx/mfp64 but not for msingle-float.
>   (REGISTER_PREFIX): Define.
>   (HARD_REGNO_CALL_PART_CLOBBERED): Define.  Implement O32 FPXX ABI.
>   (HARD_REGNO_CALLER_SAVE_MODE): Likewise.
>   (SECONDARY_MEMORY_NEEDED): Likewise.
>   (HAVE_AS_MODULE): Define default.
>   * config/mips/mips.md (define_attr enabled): Implement O32 FPXX ABI.
>   (move_doubleword_fpr): Use ISA_HAS_MXHC1 instead of
>   TARGET_FLOAT64.
>   * config/mips/mips.opt (mfpxx): New target option.
>   * config/mips/mti-elf.h (DRIVER_SELF_SPECS): Infer FP ABI from arch.
>  

Re: [PATCH][MIPS] Implement O32 FPXX ABI (GCC)

2014-05-17 Thread Richard Sandiford
Matthew Fortune  writes:
> Hi Richard,
>
> I know we are bombarding you with quite large features

Yeah :-)  Please bear in mind I'm doing this in my spare time
and this kind of thing takes in the order of hours to review properly.
I'd been prioritising the binutils patches because those can go in now,
whereas AIUI, because of the copyright situation, the GCC patches
couldn't go in for a while even if they were reviewed today.

My priority for MIPS GCC this weekend is to fix the handling of
eliminated registers when checking the extra memory constraints
for Robert's LRA patch.  I'll try to get to this too over the
weekend if I have time.

Thanks,
Richard


Re: [PATCH][MIPS] Implement O32 FPXX ABI (GCC)

2014-05-18 Thread Richard Sandiford
Matthew Fortune  writes:
> *) Dwarf debug for 64-bit values in floating point values for FPXX can't
>be strictly correct for both 32-bit and 64-bit registers but opts to
>describe one 64-bit register as that is what the FPXX ABI is emulating.
>I have not yet checked what exactly happens in GDB when confronted with
>this and 32-bit registers. This also impacts frame information described
>via mips_save_reg and mips_restore_reg. Advice on this would be
>appreciated.

I'm not sure what's best either.  Clearly it's something that needs
to be spelled out in the ABI, but I can imagine it would be dictated
by what consumers like the unwinder and gdb find easiest to handle.

> *) ISA_HAS_MXHC1 could be defined as true for all three O32 FP ABIs but
>I left out FP32 to maintain historic behaviour. It should be safe to
>Include it though. Thoughts?

Sounds like the right call to me FWIW.  Enabling it for FP32 is a separate
change really.

> *) Because GCC can be built to have mfpxx or mfp64 as the default option
>the ASM_SPEC has to handle these specially such that they are not
>passed in conjunction with -msingle-float. Depending on how all this
>option handling pans out then this may also need to account for
>msoft-float as well. It is an error to have -msoft-float and -mfp64 in
>the assembler.

The assembler and GCC shouldn't treat the options differently though.
Either it should be OK for both or neither.

> @@ -5141,7 +5141,7 @@ mips_get_arg_info (struct mips_arg_info *info, const 
> CUMULATIVE_ARGS *cum,
>|| SCALAR_FLOAT_TYPE_P (type)
>|| VECTOR_FLOAT_TYPE_P (type))
>&& (GET_MODE_CLASS (mode) == MODE_FLOAT
> -  || mode == V2SFmode)
> +  || (TARGET_PAIRED_SINGLE_FLOAT && mode == V2SFmode))
>&& GET_MODE_SIZE (mode) <= UNITS_PER_FPVALUE);
>break;

This looks odd.  We shouldn't have V2SF values if there's no ISA support
for them.

> @@ -5636,7 +5636,7 @@ mips_return_fpr_pair (enum machine_mode mode,
>  {
>int inc;
>  
> -  inc = (TARGET_NEWABI ? 2 : MAX_FPRS_PER_FMT);
> +  inc = ((TARGET_NEWABI || mips_abi == ABI_32) ? 2 : MAX_FPRS_PER_FMT);

Formatting nit: no extra brackets here.

> @@ -6508,13 +6508,27 @@ mips_output_64bit_xfer (char direction, unsigned int 
> gpreg, unsigned int fpreg)
>if (TARGET_64BIT)
>  fprintf (asm_out_file, "\tdm%cc1\t%s,%s\n", direction,
>reg_names[gpreg], reg_names[fpreg]);
> -  else if (TARGET_FLOAT64)
> +  else if (ISA_HAS_MXHC1)
>  {
>fprintf (asm_out_file, "\tm%cc1\t%s,%s\n", direction,
>  reg_names[gpreg + TARGET_BIG_ENDIAN], reg_names[fpreg]);
>fprintf (asm_out_file, "\tm%chc1\t%s,%s\n", direction,
>  reg_names[gpreg + TARGET_LITTLE_ENDIAN], reg_names[fpreg]);
>  }
> +  else if (TARGET_FLOATXX && direction == 't')
> +{
> +  /* Use the argument save area to move via memory.  */
> +  fprintf (asm_out_file, "\tsw\t%s,0($sp)\n", reg_names[gpreg]);
> +  fprintf (asm_out_file, "\tsw\t%s,4($sp)\n", reg_names[gpreg + 1]);
> +  fprintf (asm_out_file, "\tldc1\t%s,0($sp)\n", reg_names[fpreg]);
> +}
> +  else if (TARGET_FLOATXX && direction == 'f')
> +{
> +  /* Use the argument save area to move via memory.  */
> +  fprintf (asm_out_file, "\tsdc1\t%s,0($sp)\n", reg_names[fpreg]);
> +  fprintf (asm_out_file, "\tlw\t%s,0($sp)\n", reg_names[gpreg]);
> +  fprintf (asm_out_file, "\tlw\t%s,4($sp)\n", reg_names[gpreg + 1]);
> +}

The argument save area might be in use.  E.g. if an argument register
gets spilled, we'll generally try to spill it to the save area rather
than create a new stack slot for it.

This case should always be handled via SECONDARY_MEMORY_NEEDED.

> @@ -10499,7 +10544,7 @@ mips_for_each_saved_acc (HOST_WIDE_INT sp_offset, 
> mips_save_restore_fn fn)
>  static void
>  mips_save_reg (rtx reg, rtx mem)
>  {
> -  if (GET_MODE (reg) == DFmode && !TARGET_FLOAT64)
> +  if (GET_MODE (reg) == DFmode && !TARGET_FLOAT64 && !TARGET_FLOATXX)

TARGET_FLOAT32, and elsewhere.

> @@ -12202,7 +12247,8 @@ mips_secondary_reload_class (enum reg_class rclass,
>   return NO_REGS;
>  
>/* Otherwise, we need to reload through an integer register.  */
> -  return GR_REGS;
> +  if (regno >= 0)
> +return GR_REGS;
>  }
>if (FP_REG_P (regno))
>  return reg_class_subset_p (rclass, GR_REGS) ? NO_REGS : GR_REGS;

Why's this change needed?  Although I assume it's dead code if you tested
against LRA.

> @@ -12210,6 +12256,22 @@ mips_secondary_reload_class (enum reg_class rclass,
>return NO_REGS;
>  }
>  
> +/* Implement HARD_REGNO_CALLER_SAVE_MODE.
> +   Always save floating-point registers using their current mode to avoid
> +   using a 64-bit load/store when a 64-bit FP register only contains a 32-bit
> +   mode.  */
> +
> +enum machine_mode
> +mips_hard_regno_calle

RE: [PATCH][MIPS] Implement O32 FPXX ABI (GCC)

2014-05-20 Thread Matthew Fortune
Hi Richard,

Apologies for appearing to be pushy on this. I appreciate all of this work
is done in your spare time, I doubt I could manage that. I missed the
important part which was to ask if you had thoughts on the user visible
parts of the patch but you have been through the code now anyway...

Richard Sandiford  writes:
> Matthew Fortune  writes:
> > *) Dwarf debug for 64-bit values in floating point values for FPXX can't
> >be strictly correct for both 32-bit and 64-bit registers but opts to
> >describe one 64-bit register as that is what the FPXX ABI is
> emulating.
> >I have not yet checked what exactly happens in GDB when confronted
> with
> >this and 32-bit registers. This also impacts frame information
> described
> >via mips_save_reg and mips_restore_reg. Advice on this would be
> >appreciated.
> 
> I'm not sure what's best either.  Clearly it's something that needs
> to be spelled out in the ABI, but I can imagine it would be dictated
> by what consumers like the unwinder and gdb find easiest to handle.

I'll be looking into this in detail this week.

> > *) Because GCC can be built to have mfpxx or mfp64 as the default option
> >the ASM_SPEC has to handle these specially such that they are not
> >passed in conjunction with -msingle-float. Depending on how all this
> >option handling pans out then this may also need to account for
> >msoft-float as well. It is an error to have -msoft-float and -mfp64 in
> >the assembler.
> 
> The assembler and GCC shouldn't treat the options differently though.
> Either it should be OK for both or neither.

I wasn't sure if you were applying this rule to options set by --with- at
configure time as well as user supplied options. If I move this logic
to OPTION_DEFAULT_SPECS instead and not apply the --with-fp option if
-msingle-float is given then that will fix the issue too. Is that OK?

> > @@ -5141,7 +5141,7 @@ mips_get_arg_info (struct mips_arg_info *info,
> const CUMULATIVE_ARGS *cum,
> >  || SCALAR_FLOAT_TYPE_P (type)
> >  || VECTOR_FLOAT_TYPE_P (type))
> >  && (GET_MODE_CLASS (mode) == MODE_FLOAT
> > -|| mode == V2SFmode)
> > +|| (TARGET_PAIRED_SINGLE_FLOAT && mode == V2SFmode))
> >  && GET_MODE_SIZE (mode) <= UNITS_PER_FPVALUE);
> >break;
> 
> This looks odd.  We shouldn't have V2SF values if there's no ISA support
> for them.

True. This is a safety measure against future vector support. I/We wish to
completely restrict this ABI extension to the paired single-float
hardware feature. This detail is easy to forget, I would like to keep
the check but you get final call of course.
 
> > @@ -5636,7 +5636,7 @@ mips_return_fpr_pair (enum machine_mode mode,
> >  {
> >int inc;
> >
> > -  inc = (TARGET_NEWABI ? 2 : MAX_FPRS_PER_FMT);
> > +  inc = ((TARGET_NEWABI || mips_abi == ABI_32) ? 2 : MAX_FPRS_PER_FMT);
> 
> Formatting nit: no extra brackets here.

Is this a no brackets at all case, or brackets on the condition? There are
both styles in GCC but it is difficult to tell which is most common/correct.

> > @@ -6508,13 +6508,27 @@ mips_output_64bit_xfer (char direction, unsigned
> int gpreg, unsigned int fpreg)
> >if (TARGET_64BIT)
> >  fprintf (asm_out_file, "\tdm%cc1\t%s,%s\n", direction,
> >  reg_names[gpreg], reg_names[fpreg]);
> > -  else if (TARGET_FLOAT64)
> > +  else if (ISA_HAS_MXHC1)
> >  {
> >fprintf (asm_out_file, "\tm%cc1\t%s,%s\n", direction,
> >reg_names[gpreg + TARGET_BIG_ENDIAN], reg_names[fpreg]);
> >fprintf (asm_out_file, "\tm%chc1\t%s,%s\n", direction,
> >reg_names[gpreg + TARGET_LITTLE_ENDIAN], reg_names[fpreg]);
> >  }
> > +  else if (TARGET_FLOATXX && direction == 't')
> > +{
> > +  /* Use the argument save area to move via memory.  */
> > +  fprintf (asm_out_file, "\tsw\t%s,0($sp)\n", reg_names[gpreg]);
> > +  fprintf (asm_out_file, "\tsw\t%s,4($sp)\n", reg_names[gpreg + 1]);
> > +  fprintf (asm_out_file, "\tldc1\t%s,0($sp)\n", reg_names[fpreg]);
> > +}
> > +  else if (TARGET_FLOATXX && direction == 'f')
> > +{
> > +  /* Use the argument save area to move via memory.  */
> > +  fprintf (asm_out_file, "\tsdc1\t%s,0($sp)\n", reg_names[fpreg]);
> > +  fprintf (asm_out_file, "\tlw\t%s,0($sp)\n", reg_names[gpreg]);
> > +  fprintf (asm_out_file, "\tlw\t%s,4($sp)\n", reg_names[gpreg + 1]);
> > +}
> 
> The argument save area might be in use.  E.g. if an argument register
> gets spilled, we'll generally try to spill it to the save area rather
> than create a new stack slot for it.
> 
> This case should always be handled via SECONDARY_MEMORY_NEEDED.

It is handled via SECONDARY_MEMORY_NEEDED for normal code generation. This
function needs a better name! It is solely used as part of generating a
mips16 stub and as such the argument save area is fair game as far as I can
see, 

Re: [PATCH][MIPS] Implement O32 FPXX ABI (GCC)

2014-05-21 Thread Richard Sandiford
Matthew Fortune  writes:
>> > *) Because GCC can be built to have mfpxx or mfp64 as the default option
>> >the ASM_SPEC has to handle these specially such that they are not
>> >passed in conjunction with -msingle-float. Depending on how all this
>> >option handling pans out then this may also need to account for
>> >msoft-float as well. It is an error to have -msoft-float and -mfp64 in
>> >the assembler.
>> 
>> The assembler and GCC shouldn't treat the options differently though.
>> Either it should be OK for both or neither.
>
> I wasn't sure if you were applying this rule to options set by --with- at
> configure time as well as user supplied options. If I move this logic
> to OPTION_DEFAULT_SPECS instead and not apply the --with-fp option if
> -msingle-float is given then that will fix the issue too. Is that OK?

--with-foo is just a way of setting the default -mfoo option when no
explicit -mfoo option is given.  We shouldn't generally distinguish
between them.

>> > @@ -5141,7 +5141,7 @@ mips_get_arg_info (struct mips_arg_info *info,
>> const CUMULATIVE_ARGS *cum,
>> > || SCALAR_FLOAT_TYPE_P (type)
>> > || VECTOR_FLOAT_TYPE_P (type))
>> > && (GET_MODE_CLASS (mode) == MODE_FLOAT
>> > -   || mode == V2SFmode)
>> > +   || (TARGET_PAIRED_SINGLE_FLOAT && mode == V2SFmode))
>> > && GET_MODE_SIZE (mode) <= UNITS_PER_FPVALUE);
>> >break;
>> 
>> This looks odd.  We shouldn't have V2SF values if there's no ISA support
>> for them.
>
> True. This is a safety measure against future vector support. I/We wish to
> completely restrict this ABI extension to the paired single-float
> hardware feature. This detail is easy to forget, I would like to keep
> the check but you get final call of course.

But the problem is that the change gives a specific behaviour for
!TARGET_PAIRED_SINGLE_FLOAT && mode == V2SFmode, which at the moment
should be a nonsensical condition.  I want to avoid defining an ABI
for something that shouldn't happen.

I'd be happy with an assert like:

  gcc_assert (TARGET_PAIRED_SINGLE_FLOAT || mode != V2SFmode);

instead.

>> > @@ -5636,7 +5636,7 @@ mips_return_fpr_pair (enum machine_mode mode,
>> >  {
>> >int inc;
>> >
>> > -  inc = (TARGET_NEWABI ? 2 : MAX_FPRS_PER_FMT);
>> > +  inc = ((TARGET_NEWABI || mips_abi == ABI_32) ? 2 : MAX_FPRS_PER_FMT);
>> 
>> Formatting nit: no extra brackets here.
>
> Is this a no brackets at all case, or brackets on the condition? There are
> both styles in GCC but it is difficult to tell which is most common/correct.

Sorry, I meant just the new brackets.

>> > @@ -6508,13 +6508,27 @@ mips_output_64bit_xfer (char direction, unsigned
>> int gpreg, unsigned int fpreg)
>> >if (TARGET_64BIT)
>> >  fprintf (asm_out_file, "\tdm%cc1\t%s,%s\n", direction,
>> > reg_names[gpreg], reg_names[fpreg]);
>> > -  else if (TARGET_FLOAT64)
>> > +  else if (ISA_HAS_MXHC1)
>> >  {
>> >fprintf (asm_out_file, "\tm%cc1\t%s,%s\n", direction,
>> >   reg_names[gpreg + TARGET_BIG_ENDIAN], reg_names[fpreg]);
>> >fprintf (asm_out_file, "\tm%chc1\t%s,%s\n", direction,
>> >   reg_names[gpreg + TARGET_LITTLE_ENDIAN], reg_names[fpreg]);
>> >  }
>> > +  else if (TARGET_FLOATXX && direction == 't')
>> > +{
>> > +  /* Use the argument save area to move via memory.  */
>> > +  fprintf (asm_out_file, "\tsw\t%s,0($sp)\n", reg_names[gpreg]);
>> > +  fprintf (asm_out_file, "\tsw\t%s,4($sp)\n", reg_names[gpreg + 1]);
>> > +  fprintf (asm_out_file, "\tldc1\t%s,0($sp)\n", reg_names[fpreg]);
>> > +}
>> > +  else if (TARGET_FLOATXX && direction == 'f')
>> > +{
>> > +  /* Use the argument save area to move via memory.  */
>> > +  fprintf (asm_out_file, "\tsdc1\t%s,0($sp)\n", reg_names[fpreg]);
>> > +  fprintf (asm_out_file, "\tlw\t%s,0($sp)\n", reg_names[gpreg]);
>> > +  fprintf (asm_out_file, "\tlw\t%s,4($sp)\n", reg_names[gpreg + 1]);
>> > +}
>> 
>> The argument save area might be in use.  E.g. if an argument register
>> gets spilled, we'll generally try to spill it to the save area rather
>> than create a new stack slot for it.
>> 
>> This case should always be handled via SECONDARY_MEMORY_NEEDED.
>
> It is handled via SECONDARY_MEMORY_NEEDED for normal code generation. This
> function needs a better name! It is solely used as part of generating a
> mips16 stub and as such the argument save area is fair game as far as I can
> see, this code is essentially pre-prologue or post-epilogue of the callee.
>
> Shall I rename the function (and related functions which are solely
> applicable to mips16 stubs) to make it clear?

I think the name's OK.  I'd just forgotten how the function was used.

So yeah, this looks good after all.

>> > @@ -12202,7 +12247,8 @@ mips_secondary_reload_class (enum reg_class
>> rclass,
>> >return NO_REGS;
>> >
>> >/* Otherwise, we need to reloa

RE: [PATCH][MIPS] Implement O32 FPXX ABI (GCC)

2014-05-22 Thread Matthew Fortune
Richard Sandiford  writes:
> Matthew Fortune  writes:
> >> > *) Because GCC can be built to have mfpxx or mfp64 as the default
> option
> >> >the ASM_SPEC has to handle these specially such that they are not
> >> >passed in conjunction with -msingle-float. Depending on how all
> this
> >> >option handling pans out then this may also need to account for
> >> >msoft-float as well. It is an error to have -msoft-float and -mfp64
> in
> >> >the assembler.
> >>
> >> The assembler and GCC shouldn't treat the options differently though.
> >> Either it should be OK for both or neither.
> >
> > I wasn't sure if you were applying this rule to options set by --with- at
> > configure time as well as user supplied options. If I move this logic
> > to OPTION_DEFAULT_SPECS instead and not apply the --with-fp option if
> > -msingle-float is given then that will fix the issue too. Is that OK?
> 
> --with-foo is just a way of setting the default -mfoo option when no
> explicit -mfoo option is given.  We shouldn't generally distinguish
> between them.

OK. I don't mind what we do here so I'm happy to just relax the (new)
assembler restrictions that would prevent -msingle-float -mfp64. I need to
check if these ended up in the .module binutils patch or are in the FPXX
patch.
 
> >> > @@ -5141,7 +5141,7 @@ mips_get_arg_info (struct mips_arg_info *info,
> >> const CUMULATIVE_ARGS *cum,
> >> >   || SCALAR_FLOAT_TYPE_P (type)
> >> >   || VECTOR_FLOAT_TYPE_P (type))
> >> >   && (GET_MODE_CLASS (mode) == MODE_FLOAT
> >> > - || mode == V2SFmode)
> >> > + || (TARGET_PAIRED_SINGLE_FLOAT && mode == 
> >> > V2SFmode))
> >> >   && GET_MODE_SIZE (mode) <= UNITS_PER_FPVALUE);
> >> >break;
> >>
> >> This looks odd.  We shouldn't have V2SF values if there's no ISA support
> >> for them.
> >
> > True. This is a safety measure against future vector support. I/We wish
> to
> > completely restrict this ABI extension to the paired single-float
> > hardware feature. This detail is easy to forget, I would like to keep
> > the check but you get final call of course.
> 
> But the problem is that the change gives a specific behaviour for
> !TARGET_PAIRED_SINGLE_FLOAT && mode == V2SFmode, which at the moment
> should be a nonsensical condition.  I want to avoid defining an ABI
> for something that shouldn't happen.
> 
> I'd be happy with an assert like:
> 
>   gcc_assert (TARGET_PAIRED_SINGLE_FLOAT || mode != V2SFmode);

OK. It's just for future safety so this looks good.

> instead.
> >> > @@ -12202,7 +12247,8 @@ mips_secondary_reload_class (enum reg_class
> >> rclass,
> >> >  return NO_REGS;
> >> >
> >> >/* Otherwise, we need to reload through an integer register.
> */
> >> > -  return GR_REGS;
> >> > +  if (regno >= 0)
> >> > +return GR_REGS;
> >> >  }
> >> >if (FP_REG_P (regno))
> >> >  return reg_class_subset_p (rclass, GR_REGS) ? NO_REGS : GR_REGS;
> >>
> >> Why's this change needed?  Although I assume it's dead code if you
> tested
> >> against LRA.
> >
> > It is an optimisation actually (and this code is used by LRA). Without it
> > We end up with reload registers for floating point values being reloaded
> > via GR_REGS even though they can be stored to memory directly if a
> > FP_REG was used. This results in extra moves between GP and FP. The test
> > cases will fail if this code is removed (call-clobbered-3 and
> > call-clobbered-5).
> 
> Ah, OK.  In that case I think we should instead change:
> 
>   if (MEM_P (x)
> && (GET_MODE_SIZE (mode) == 4 || GET_MODE_SIZE (mode) == 8))
>   /* In this case we can use lwc1, swc1, ldc1 or sdc1.  We'll use
>  pairs of lwc1s and swc1s if ldc1 and sdc1 are not supported.  */
>   return NO_REGS;
> 
> to:
> 
>   if (MEM_P (x) || regno < 0)
>   {
> /* In this case we can use combinations of LWC1, SWC1, LDC1 and
>SDC1.  */
> gcc_checking_assert (GET_MODE_SIZE (mode) % 4 == 0);
> return NO_REGS;
>   }
> 
> Please do it as a separate patch though.

OK
 
> >> > diff --git a/gcc/testsuite/gcc.target/mips/call-clobbered-2.c
> >> b/gcc/testsuite/gcc.target/mips/call-clobbered-2.c
> >> > new file mode 100644
> >> > index 000..f47cdda
> >> > --- /dev/null
> >> > +++ b/gcc/testsuite/gcc.target/mips/call-clobbered-2.c
> >> > @@ -0,0 +1,21 @@
> >> > +/* Check that we handle call-clobbered FPRs correctly.  */
> >> > +/* { dg-skip-if "code quality test" { *-*-* } { "-O0" } { "" } } */
> >> > +/* { dg-options "-mabi=32 -modd-spreg -mfp32 -ffixed-f0 -ffixed-f1 -
> >> ffixed-f2 -ffixed-f3 -ffixed-f4 -ffixed-f5 -ffixed-f6 -ffixed-f7 -
> ffixed-f8
> >> -ffixed-f9 -ffixed-f10 -ffixed-f11 -ffixed-f12 -ffixed-f13 -ffixed-f14 -
> >> ffixed-f15 -ffixed-f16 -ffixed-f17 -ffixed-f18 -ffixed-f19 -ffixed-f20 -
> >> ffixed-f22 -ffixed-f24 -ffixed-f26 -ffixed-

Re: [PATCH][MIPS] Implement O32 FPXX ABI (GCC)

2014-05-22 Thread Richard Sandiford
Matthew Fortune  writes:
> Richard Sandiford  writes:
>> Matthew Fortune  writes:
>> >> > *) Because GCC can be built to have mfpxx or mfp64 as the default
>> option
>> >> >the ASM_SPEC has to handle these specially such that they are not
>> >> >passed in conjunction with -msingle-float. Depending on how all
>> this
>> >> >option handling pans out then this may also need to account for
>> >> >msoft-float as well. It is an error to have -msoft-float and -mfp64
>> in
>> >> >the assembler.
>> >>
>> >> The assembler and GCC shouldn't treat the options differently though.
>> >> Either it should be OK for both or neither.
>> >
>> > I wasn't sure if you were applying this rule to options set by --with- at
>> > configure time as well as user supplied options. If I move this logic
>> > to OPTION_DEFAULT_SPECS instead and not apply the --with-fp option if
>> > -msingle-float is given then that will fix the issue too. Is that OK?
>> 
>> --with-foo is just a way of setting the default -mfoo option when no
>> explicit -mfoo option is given.  We shouldn't generally distinguish
>> between them.
>
> OK. I don't mind what we do here so I'm happy to just relax the (new)
> assembler restrictions that would prevent -msingle-float -mfp64. I need to
> check if these ended up in the .module binutils patch or are in the FPXX
> patch.

Let's step back a bit.  Please explain which case you were trying to
handle with the specs patch.  Rejecting -msingle-float -mfp64 seems fine.
Fiddling with the specs to stop that combination reaching the assembler
is what seemed odd.

>> >> > diff --git a/gcc/testsuite/gcc.target/mips/call-clobbered-2.c
>> >> b/gcc/testsuite/gcc.target/mips/call-clobbered-2.c
>> >> > new file mode 100644
>> >> > index 000..f47cdda
>> >> > --- /dev/null
>> >> > +++ b/gcc/testsuite/gcc.target/mips/call-clobbered-2.c
>> >> > @@ -0,0 +1,21 @@
>> >> > +/* Check that we handle call-clobbered FPRs correctly.  */
>> >> > +/* { dg-skip-if "code quality test" { *-*-* } { "-O0" } { "" } } */
>> >> > +/* { dg-options "-mabi=32 -modd-spreg -mfp32 -ffixed-f0 -ffixed-f1 -
>> >> ffixed-f2 -ffixed-f3 -ffixed-f4 -ffixed-f5 -ffixed-f6 -ffixed-f7 -
>> ffixed-f8
>> >> -ffixed-f9 -ffixed-f10 -ffixed-f11 -ffixed-f12 -ffixed-f13 -ffixed-f14 -
>> >> ffixed-f15 -ffixed-f16 -ffixed-f17 -ffixed-f18 -ffixed-f19 -ffixed-f20 -
>> >> ffixed-f22 -ffixed-f24 -ffixed-f26 -ffixed-f28 -ffixed-f30" } */
>> >> > +
>> >> > +void bar (void);
>> >> > +float a;
>> >> > +float
>> >> > +foo ()
>> >> > +{
>> >> > +  float b = a + 1.0f;
>> >> > +  bar();
>> >> > +  return b;
>> >> > +}
>> >> > +/* { dg-final { scan-assembler-times "lwc1" 2 } } */
>> >> > +/* { dg-final { scan-assembler-not "swc1" } } */
>> >> > +/* { dg-final { scan-assembler-times "sdc1" 2 } } */
>> >> > +/* { dg-final { scan-assembler-not "mtc" } } */
>> >> > +/* { dg-final { scan-assembler-not "mfc" } } */
>> >> > +/* { dg-final { scan-assembler-not "mthc" } } */
>> >> > +/* { dg-final { scan-assembler-not "mfhc" } } */
>> >>
>> >> Why require sdc1 here?  Would Chao-Ying's patch make this use SWC1 and
>> LWC1
>> >> exclusively?
>> >
>> > The SDC1 instructions are for callee-saved registers. I'll add the
>> > check for two corresponding LDC1 instructions in the epilogue though
>> > since I've noticed them being missing.
>> 
>> I'm probably being dense, sorry, but why is SDC1 needed for -mfp32
>> -modd-spreg?  Can't we just save and restore the odd register?
>> That's technically more correct too, since we shouldn't really touch
>> -ffixed registers.
>
> You are correct and I am/was aware of this...
>
> GCC is saving too much of the callee-saved registers when single-precision
> values are live in them but this is historic behaviour. The code which
> controls this is very complex and I'd be worried about breaking it. The
> fix would be invasive as all the logic is honed towards the notion of
> 64-bit callee-saved registers.

But that's because it was written before separate odd spregs came along.
I'm not convinced the situation is as bad as you say.

> One thing to say is that this test is
> a very aggressive test of ABI not normal compiler behaviour. Normally
> an even-numbered callee-saved register would be used first followed by the
> odd meaning that it is rare to only use the odd register. That flips the
> problem round to single precision data in even registers...
>
> I believe that prior to mips32 introducing odd-numbered single-precision
> registers then GCC will have been saving and restoring using sdc1/ldc1
> for even-numbered registers regardless of whether they are used for
> single or double precision data (for mips1 it also will do pairs of
> lwc1/swc1).

Sure.  And saving and restoring both using LDC1 and SDC1 is still
the right thing to do if both registers are clobbered.  But if only
one is clobbered then we should save just that.  It's really just
the same principle as Chao-ying's patch, but applied to the prologue
and epilogue.

> My vote FWIW is to l

RE: [PATCH][MIPS] Implement O32 FPXX ABI (GCC)

2014-05-22 Thread Matthew Fortune
> Let's step back a bit.  Please explain which case you were trying to
> handle with the specs patch.  Rejecting -msingle-float -mfp64 seems
> fine.
> Fiddling with the specs to stop that combination reaching the assembler
> is what seemed odd.

So, perhaps this is a 'vendor' issue too like some other debates we have
had in this area. I presume the --with arguments are intended to
be used to configure a single purpose toolchain that targets a specific
ABI. So if used to construct a single-float toolchain then --with-fp=64
would not be used. And if configured using --with-fp=64 then it is OK to
have an error if using -msingle-float.

I am actually considering implications of adding support for MIPS64r6
to GCC which I have not posted yet, sorry :-). MIPSr6 only supports
64-bit registers which naturally leads to needing -mfp64. MIPSr6 does
however also support a single-precision only variant.

For a single purpose native toolchain then --with-fp=64 can be used xor
--with-fpu=single to get the desired tools.

In a multilib toolchain then the vendor specific DRIVER_SELF_SPECS need
to infer -mfp64 for -mabi=32 and MIPSr6.  The FPXX patch currently has:

mti-linux.h:
  /* If no FP option is specified, infer one from the ABI/ISA level.  */\
  "%{!mfp*: %{mabi=32: %{" MIPS_FP32_OPTION_SPEC\
  ": -mfp32;: -mfpxx}}}",   \
 
With MIPSr6 added this would naturally become

mti-linux.h:
  /* If no FP option is specified, infer one from the ABI/ISA level.  */\
  "%{!mfp*: %{mabi=32: %{" MIPS_FP32_OPTION_SPEC\
  ": -mfp32;%{" MIPS_FP64_OPTION_SPEC ": -mfp64;: -mfpxx}}}",

With:
#define MIPS_FP64_OPTION_SPEC "mips32r6"

But I suppose it could be changed to the following such that
msingle-float prevents inferring a non-default -mfp setting.

mti-linux.h:
  /* If no FP option is specified, infer one from the ABI/ISA level.  */\
  "%{!mfp*: %{mabi=32: %{!msingle-float:%{" MIPS_FP32_OPTION_SPEC   
 \
  ": -mfp32;" MIPS_FP64_OPTION_SPEC ": -mfp64;: -mfpxx",

With:
#define MIPS_FP64_OPTION_SPEC "mips32r6"

The goal is to support a multilib toolchain which will generate correct
code by just using the architecture option and then be able to add other
options for non-default ABI variations.

mips-mti-linux-gnu -mips32r6 ==> gets O32 FP64
mips-mti-linux-gnu -mips32r6 -msingle-float ==> gets O32 FPSINGLE
mips-mti-linux-gnu -mips32r5 ==> gets O32 FPXX
mips-mti-linux-gnu -mips1 ==> gets O32 FP32

It seems strange if the -mips32r6 -msingle-float required you to set
-mfp32 as that should not matter. Single-float can only possibly mean
32-bit registers.

> >> >> > diff --git a/gcc/testsuite/gcc.target/mips/call-clobbered-2.c
> >> >> b/gcc/testsuite/gcc.target/mips/call-clobbered-2.c
> >> >> > new file mode 100644
> >> >> > index 000..f47cdda
> >> >> > --- /dev/null
> >> >> > +++ b/gcc/testsuite/gcc.target/mips/call-clobbered-2.c
> >> >> > @@ -0,0 +1,21 @@
> >> >> > +/* Check that we handle call-clobbered FPRs correctly.  */
> >> >> > +/* { dg-skip-if "code quality test" { *-*-* } { "-O0" } { "" }
> } */
> >> >> > +/* { dg-options "-mabi=32 -modd-spreg -mfp32 -ffixed-f0 -
> ffixed-f1 -
> >> >> ffixed-f2 -ffixed-f3 -ffixed-f4 -ffixed-f5 -ffixed-f6 -ffixed-f7 -
> >> ffixed-f8
> >> >> -ffixed-f9 -ffixed-f10 -ffixed-f11 -ffixed-f12 -ffixed-f13 -
> ffixed-f14 -
> >> >> ffixed-f15 -ffixed-f16 -ffixed-f17 -ffixed-f18 -ffixed-f19 -
> ffixed-f20 -
> >> >> ffixed-f22 -ffixed-f24 -ffixed-f26 -ffixed-f28 -ffixed-f30" } */
> >> >> > +
> >> >> > +void bar (void);
> >> >> > +float a;
> >> >> > +float
> >> >> > +foo ()
> >> >> > +{
> >> >> > +  float b = a + 1.0f;
> >> >> > +  bar();
> >> >> > +  return b;
> >> >> > +}
> >> >> > +/* { dg-final { scan-assembler-times "lwc1" 2 } } */
> >> >> > +/* { dg-final { scan-assembler-not "swc1" } } */
> >> >> > +/* { dg-final { scan-assembler-times "sdc1" 2 } } */
> >> >> > +/* { dg-final { scan-assembler-not "mtc" } } */
> >> >> > +/* { dg-final { scan-assembler-not "mfc" } } */
> >> >> > +/* { dg-final { scan-assembler-not "mthc" } } */
> >> >> > +/* { dg-final { scan-assembler-not "mfhc" } } */
> >> >>
> >> >> Why require sdc1 here?  Would Chao-Ying's patch make this use SWC1
> and
> >> LWC1
> >> >> exclusively?
> >> >
> >> > The SDC1 instructions are for callee-saved registers. I'll add the
> >> > check for two corresponding LDC1 instructions in the epilogue
> though
> >> > since I've noticed them being missing.
> >>
> >> I'm probably being dense, sorry, but why is SDC1 needed for -mfp32
> >> -modd-spreg?  Can't we just save and restore the odd register?
> >> That's technically more correct too, since we shouldn't really touch
> >> -ffixed registers.
> >
> > You are correct and I am/was aware of this...
> >
> > GCC is saving too much of the callee-saved registers when single-
> precision
> > values are live in them but this is historic behaviour. The code which
> > controls this is very complex 

RE: [PATCH][MIPS] Implement O32 FPXX ABI (GCC)

2014-05-22 Thread Maciej W. Rozycki
On Thu, 22 May 2014, Matthew Fortune wrote:

> GCC is saving too much of the callee-saved registers when single-precision
> values are live in them but this is historic behaviour. The code which
> controls this is very complex and I'd be worried about breaking it. The
> fix would be invasive as all the logic is honed towards the notion of
> 64-bit callee-saved registers. One thing to say is that this test is
> a very aggressive test of ABI not normal compiler behaviour. Normally
> an even-numbered callee-saved register would be used first followed by the
> odd meaning that it is rare to only use the odd register. That flips the
> problem round to single precision data in even registers...
> 
> I believe that prior to mips32 introducing odd-numbered single-precision
> registers then GCC will have been saving and restoring using sdc1/ldc1
> for even-numbered registers regardless of whether they are used for
> single or double precision data (for mips1 it also will do pairs of
> lwc1/swc1).

 This is required by the architecture, on legacy MIPS processors any 
arithmetic operation writing to a single floating-point (S format) or a 
single fixed-point (W format) destination is specified to leave the upper 
32 bits of the destination in an undefined state.  Depending on the FPU 
implementation and the state of the CP0 Status.FR bit (when implemented) 
the upper 32 bits are located in either the upper half of the 64-bit FPR 
selected or the odd-numbered 32-bit FPR corresponding to the even-numbered 
FPR selected.  See the StoreFPR pseudocode in the R4400 processor manual 
or other legacy MIPS documentation that you undoubtedly have handy.

 So while making changes in this area you need to take care to retain 
saving/restoring full double FPRs on pre MIPS32/MIPS64 processors as their 
upper halves may be clobbered even if only single operations are used.

  Maciej


RE: [PATCH][MIPS] Implement O32 FPXX ABI (GCC)

2014-05-22 Thread Maciej W. Rozycki
On Thu, 22 May 2014, Matthew Fortune wrote:

> I'm not confident that it would be the right thing to change this
> behaviour. While the test is slightly weird in that the compiler generates
> code that touches a -ffixed-reg this is just an ABI test.
> 
> == Taken from MIPS ABI supplement ==
> Only registers $f20.$f30 are preserved across a function call. All other 
> float-
> ing-point registers can change across a function call. However, functions
> that use any of $f20.$f30 for single-precision operations only must still save
> and restore the corresponding odd-numbered register since the odd-num-
> bered register contents are left undefined by single-precision operations
> == end == 
> 
> There are comments in the code to assert that this is behaviour is very much
> intentional as well:
> 
> == mips.c ==
> static bool
> mips_save_reg_p (unsigned int regno)
> {
>   if (mips_cfun_call_saved_reg_p (regno))
> {
>   if (mips_cfun_might_clobber_call_saved_reg_p (regno))
> return true;
> 
>   /* Save both registers in an FPR pair if either one is used.  This is
>  needed for the case when MIN_FPRS_PER_FMT == 1, which allows the odd
>  register to be used without the even register.  */
>   if (FP_REG_P (regno)
>   && MAX_FPRS_PER_FMT == 2
>   && mips_cfun_might_clobber_call_saved_reg_p (regno + 1))
> return true;
> }
> 
>   /* We need to save the incoming return address if __builtin_eh_return
>  is being used to set a different return address.  */
>   if (regno == RETURN_ADDR_REGNUM && crtl->calls_eh_return)
> return true;
> 
>   return false;
> }
> == end ==
> 
> I therefore take back my comment about the prologue/epilogue saving too much
> callee-save register state :-) It is doing exactly what is required by the
> arch. I doubt very much that any MIPS implementation would end up with
> incorrect operation if you ran the following and ended up with $f18 and $f20
> being different or undefined behaviour if $f20 were accessed as a double but 
> that is actually what the spec says.
> 
> ADD.D $f20, $f10, $f10
> MOV.D $f18, $f20
> SWC1 $f20, 0($sp)
> MTC1 $2, $f20
> LWC1 $f20, 0($sp)
> ADD.D $f16, $f18, $f20
> ($f16 should be 4*$f10)

 Neither of MTC1/LWC1 is arithmetic so a different rule applies.  This 
code will work just fine on a 32-bit FPU or with the CP0 Status.FR bit 
clear where implemented because non-arithmetic FPR 32-bit write operations 
do not change the corresponding high-order 32-bit FPR.  It will only break 
with the CP0 Status.FR bit set, where the upper half of the 64-bit FPR 
written will be left undefined, but that is actually no different to the 
current architecture definition.

 This code would break on legacy processors if you substituted MTC1 with 
an arithmetic operation e.g.:

ADD.D $f20, $f10, $f10
MOV.D $f18, $f20
SWC1 $f20, 0($sp)
CVT.S.D $f20, $f20
LWC1 $f20, 0($sp)
ADD.D $f16, $f18, $f20
($f16 should be 4*$f10)

> Presumably, to account for MIPS I then a pair of SWC1s/LWC1s are required
> by the architecture to allow the resulting even numbered register to be
> accessed as a double.
> 
> MIPS32 to my knowledge makes no re-assuring statement that accessing the
> odd numbered register with a single precision operation does not clobber
> the lower 32-bits of the double it is overlaid with.

 But the MIPS psABI also says that only MIPS I ISA instructions may be 
used in a compliant program, so by building a program for the MIPS32 or 
higher ISA you've already broken the ABI as it stands.  So I think we 
could relax the rule on FPR saving/restoration from MIPS32 ISA up, as an 
ABI extension that we already made anyway.

  Maciej


Re: [PATCH][MIPS] Implement O32 FPXX ABI (GCC)

2014-05-22 Thread Richard Sandiford
Matthew Fortune  writes:
>> Let's step back a bit.  Please explain which case you were trying to
>> handle with the specs patch.  Rejecting -msingle-float -mfp64 seems
>> fine.
>> Fiddling with the specs to stop that combination reaching the assembler
>> is what seemed odd.
>
> So, perhaps this is a 'vendor' issue too like some other debates we have
> had in this area. I presume the --with arguments are intended to
> be used to configure a single purpose toolchain that targets a specific
> ABI. So if used to construct a single-float toolchain then --with-fp=64
> would not be used. And if configured using --with-fp=64 then it is OK to
> have an error if using -msingle-float.

Yeah, it sounds like it could be a vendor issue.  IMO there are two useful
ways that a configuration can handle --with: (a) build the same multilibs
regardless of the --with option but let the --with option control the
default; or (b) deselect any multilibs that are incompatible with the
--with option.  The simple configurations like mips64*-linux-gnu and
mips64-elf do (a), but for configurations that usually have lots of
multilibs I can imagine it would make sense to do (b).  Obviously your
call for your toolchains (as long as we can avoid the ASM_SPEC magic).

> I am actually considering implications of adding support for MIPS64r6
> to GCC which I have not posted yet, sorry :-).

I don't think you're making anyone suffer from an undersupply of patches. :-)

> MIPSr6 only supports 64-bit registers which naturally leads to needing
> -mfp64. MIPSr6 does however also support a single-precision only variant.
>
> For a single purpose native toolchain then --with-fp=64 can be used xor
> --with-fpu=single to get the desired tools.

Hmm, does that really mean that an FPU that only implements S/W-format
instructions must still have 64-bit registers?  And are MTHC1 and MFHC1
therefore always supported for single-float-only FPUs?  The r6 documentation
I downloaded made it sound like 32-bit FPUs were still allowed for
S/W-only FPUs but that 64-bit FPUs are required if D/L-format
instructions are supported.

Or is this more a single-precision+MSA thing?

(Can't really respond to the later specs stuff without understanding
this part first.)

>> >> >> > diff --git a/gcc/testsuite/gcc.target/mips/call-clobbered-2.c
>> >> >> b/gcc/testsuite/gcc.target/mips/call-clobbered-2.c
>> >> >> > new file mode 100644
>> >> >> > index 000..f47cdda
>> >> >> > --- /dev/null
>> >> >> > +++ b/gcc/testsuite/gcc.target/mips/call-clobbered-2.c
>> >> >> > @@ -0,0 +1,21 @@
>> >> >> > +/* Check that we handle call-clobbered FPRs correctly.  */
>> >> >> > +/* { dg-skip-if "code quality test" { *-*-* } { "-O0" } { "" }
>> } */
>> >> >> > +/* { dg-options "-mabi=32 -modd-spreg -mfp32 -ffixed-f0 -
>> ffixed-f1 -
>> >> >> ffixed-f2 -ffixed-f3 -ffixed-f4 -ffixed-f5 -ffixed-f6 -ffixed-f7 -
>> >> ffixed-f8
>> >> >> -ffixed-f9 -ffixed-f10 -ffixed-f11 -ffixed-f12 -ffixed-f13 -
>> ffixed-f14 -
>> >> >> ffixed-f15 -ffixed-f16 -ffixed-f17 -ffixed-f18 -ffixed-f19 -
>> ffixed-f20 -
>> >> >> ffixed-f22 -ffixed-f24 -ffixed-f26 -ffixed-f28 -ffixed-f30" } */
>> >> >> > +
>> >> >> > +void bar (void);
>> >> >> > +float a;
>> >> >> > +float
>> >> >> > +foo ()
>> >> >> > +{
>> >> >> > +  float b = a + 1.0f;
>> >> >> > +  bar();
>> >> >> > +  return b;
>> >> >> > +}
>> >> >> > +/* { dg-final { scan-assembler-times "lwc1" 2 } } */
>> >> >> > +/* { dg-final { scan-assembler-not "swc1" } } */
>> >> >> > +/* { dg-final { scan-assembler-times "sdc1" 2 } } */
>> >> >> > +/* { dg-final { scan-assembler-not "mtc" } } */
>> >> >> > +/* { dg-final { scan-assembler-not "mfc" } } */
>> >> >> > +/* { dg-final { scan-assembler-not "mthc" } } */
>> >> >> > +/* { dg-final { scan-assembler-not "mfhc" } } */
>> >> >>
>> >> >> Why require sdc1 here?  Would Chao-Ying's patch make this use SWC1
>> and
>> >> LWC1
>> >> >> exclusively?
>> >> >
>> >> > The SDC1 instructions are for callee-saved registers. I'll add the
>> >> > check for two corresponding LDC1 instructions in the epilogue
>> though
>> >> > since I've noticed them being missing.
>> >>
>> >> I'm probably being dense, sorry, but why is SDC1 needed for -mfp32
>> >> -modd-spreg?  Can't we just save and restore the odd register?
>> >> That's technically more correct too, since we shouldn't really touch
>> >> -ffixed registers.
>> >
>> > You are correct and I am/was aware of this...
>> >
>> > GCC is saving too much of the callee-saved registers when single-
>> precision
>> > values are live in them but this is historic behaviour. The code which
>> > controls this is very complex and I'd be worried about breaking it.
>> The
>> > fix would be invasive as all the logic is honed towards the notion of
>> > 64-bit callee-saved registers.
>> 
>> But that's because it was written before separate odd spregs came along.
>> I'm not convinced the situation is as bad as you say.
>>
>> > One thing to say is that this test is
>> > a very aggressive test of ABI not normal c

Re: [PATCH][MIPS] Implement O32 FPXX ABI (GCC)

2014-05-22 Thread Maciej W. Rozycki
On Thu, 22 May 2014, Richard Sandiford wrote:

> I can't hope to match Maciej's reply on the details, but like he says,
> my understanding was that this:
> 
> > ADD.D $f20, $f10, $f10
> > MOV.D $f18, $f20
> > SWC1 $f20, 0($sp)
> > MTC1 $2, $f20
> > LWC1 $f20, 0($sp)
> > ADD.D $f16, $f18, $f20
> > ($f16 should be 4*$f10)
> 
> really is required to work for -modd-spreg.  Specifically I thought the
> LWC1 would force $f20 to be become "uninterpreted" and then the ADD.D
> would need to (re)interpret the register pair as D-format.

 But as I wrote it has always worked, even with the MIPS I R2010 FPU.  All 
the CP1 transfer instructions are non-arithmetic and operate on FPRs in 
the raw manner (note that MOV.S and MOV.D do not fall in this class).  And 
they actually have to, or otherwise a sequence like this (typical for a 
MIPS I function epilogue):

lwc1$f20,0(sp)
# <- hw interrupt taken here
lwc1$f21,4(sp)
jr  ra
 addiu  sp,sp,8

would break on a system with a proper OS such as Linux if a hardware 
interrupt exception was taken right after the first LWC1 that would cause 
a context switch and the other process also switched (lazily) the FP 
context.  Once back to this process, the second LWC1 would fault with a 
Coprocessor Unusable exception that would pull the whole FP context back.  
Once that has completed the LWC1 instruction would only replace one half 
of the double value stored in $f20 (note that the FP context switch 
handler may well use LDC1 instructions if supported by hardware; Linux 
does where possible).

 Yes, it happens that we restore even-numbered FPRs first, before their 
odd-numbered counterparts each, but neither the ABI nor hardware mandates 
that, we could as well swap the order and that would have to work too.

  Maciej


RE: [PATCH][MIPS] Implement O32 FPXX ABI (GCC)

2014-06-02 Thread Matthew Fortune
Hi Richard,

I've realised that I may need to do 'something' to prevent GCC from loading or
storing DFmode/DImode values to/from FPRs using pairs of SWC1/LWC1 when using
an unaligned address. Initial tests show that when loading from an unaligned
address (4-byte aligned) then GCC loads the two halves of a 64-bit value into
GPRs and then moves across to FPRs. This is good but I don't know if it is
guaranteed.

From what I can tell the backend doesn't specifically deal with loading
unaligned data but rather the normal load/store patterns are used by the
middle end. As such I'm not sure there is anything to prevent direct loads
to FPRs by parts.

Do you know one way or the other if unaligned doubles can currently be loaded
via pairs of lwc1 (same for store) and if so can you advise on an approach I 
could try to prevent this for FPXX? I will try to figure this out on my own in
the meantime.

Regards,
Matthew 



Re: [PATCH][MIPS] Implement O32 FPXX ABI (GCC)

2014-06-02 Thread Richard Sandiford
Matthew Fortune  writes:
> I've realised that I may need to do 'something' to prevent GCC from loading or
> storing DFmode/DImode values to/from FPRs using pairs of SWC1/LWC1 when using
> an unaligned address. Initial tests show that when loading from an unaligned
> address (4-byte aligned) then GCC loads the two halves of a 64-bit value into
> GPRs and then moves across to FPRs. This is good but I don't know if it is
> guaranteed.
>
> From what I can tell the backend doesn't specifically deal with loading
> unaligned data but rather the normal load/store patterns are used by the
> middle end. As such I'm not sure there is anything to prevent direct loads
> to FPRs by parts.
>
> Do you know one way or the other if unaligned doubles can currently be loaded
> via pairs of lwc1 (same for store) and if so can you advise on an approach I 
> could try to prevent this for FPXX? I will try to figure this out on my own in
> the meantime.

The port does handle some widths of unaligned access via the
{insv,extv,extzv}misalign patterns.  That's how an unaligned DImode
value will be handled on 64-bit targets.

The MIPS *misalign patterns only handle integer modes, so for other
types of mode the target-independent code will fall back to using an
integer load followed by a subreg (or a subreg followed by an integer
store).  IIRC that's how an unaligned DFmode access will be handled on
64-bit targets.

For modes that are larger or smaller than *misalign can handle,
the target-independent code has to split the access up into smaller
pieces and reassemble them.  And these pieces have to have integer modes.
E.g. on 32-bit targets a 4-byte-misaligned load into (reg:DF x) could be
done by loading (subreg:SI (reg:DF x) 0) and (subreg:SI (reg:DF x) 4).
The thing that prevents these individual loads from using LWC1 is
CANNOT_CHANGE_MODE_CLASS, which (among other things) makes it invalid
for any target-independent code to reduce a subreg of an FPR pair
to an individual FPR.

[FWIW, the reason MIPS doesn't define {insv,extv,extzv}misalign for things
like DImode on 32-bit targets is because there's no special architecture
feature than can be used.  It's just a case of decomposing the access.
Since that's a general technique, we want to make the target-independent
code do it as well as possible rather than handle it in a port-specific way.]

So yeah, the combination of (a) STRICT_ALIGNMENT, (b) the lack of unaligned
floating-point load/store patterns and (c) CANNOT_CHANGE_MODE_CLASS should
guarantee what you want.

Thanks,
Richard


RE: [PATCH][MIPS] Implement O32 FPXX ABI (GCC)

2014-06-02 Thread Matthew Fortune
Richard Sandiford  writes:
> Matthew Fortune  writes:
> > I've realised that I may need to do 'something' to prevent GCC from
> loading or
> > storing DFmode/DImode values to/from FPRs using pairs of SWC1/LWC1
> when using
> > an unaligned address. Initial tests show that when loading from an
> unaligned
> > address (4-byte aligned) then GCC loads the two halves of a 64-bit
> value into
> > GPRs and then moves across to FPRs. This is good but I don't know if
> it is
> > guaranteed.
> >
> > From what I can tell the backend doesn't specifically deal with
> loading
> > unaligned data but rather the normal load/store patterns are used by
> the
> > middle end. As such I'm not sure there is anything to prevent direct
> loads
> > to FPRs by parts.
> >
> > Do you know one way or the other if unaligned doubles can currently be
> loaded
> > via pairs of lwc1 (same for store) and if so can you advise on an
> approach I
> > could try to prevent this for FPXX? I will try to figure this out on
> my own in
> > the meantime.
> 
> The port does handle some widths of unaligned access via the
> {insv,extv,extzv}misalign patterns.  That's how an unaligned DImode
> value will be handled on 64-bit targets.
> 
> The MIPS *misalign patterns only handle integer modes, so for other
> types of mode the target-independent code will fall back to using an
> integer load followed by a subreg (or a subreg followed by an integer
> store).  IIRC that's how an unaligned DFmode access will be handled on
> 64-bit targets.
> 
> For modes that are larger or smaller than *misalign can handle,
> the target-independent code has to split the access up into smaller
> pieces and reassemble them.  And these pieces have to have integer
> modes.
> E.g. on 32-bit targets a 4-byte-misaligned load into (reg:DF x) could be
> done by loading (subreg:SI (reg:DF x) 0) and (subreg:SI (reg:DF x) 4).
> The thing that prevents these individual loads from using LWC1 is
> CANNOT_CHANGE_MODE_CLASS, which (among other things) makes it invalid
> for any target-independent code to reduce a subreg of an FPR pair
> to an individual FPR.
> 
> [FWIW, the reason MIPS doesn't define {insv,extv,extzv}misalign for
> things
> like DImode on 32-bit targets is because there's no special architecture
> feature than can be used.  It's just a case of decomposing the access.
> Since that's a general technique, we want to make the target-independent
> code do it as well as possible rather than handle it in a port-specific
> way.]
> 
> So yeah, the combination of (a) STRICT_ALIGNMENT, (b) the lack of
> unaligned
> floating-point load/store patterns and (c) CANNOT_CHANGE_MODE_CLASS
> should
> guarantee what you want.

Thanks for all the details. I did not know if CANNOT_CHANGE_MODE_CLASS
would give this guarantee as I am conscious of MIPS 1 having to do pairs
of LWC1/SWC1 for doubles. However, if I understand the code correctly
the LWC1/SWC1 pairs are generated by splits for MIPS 1 and not directly
from target-independent code so CANNOT_CHANGE_MODE_CLASS does not impact
that explicit splitting logic. Is that right?

Regards,
Matthew


Re: [PATCH][MIPS] Implement O32 FPXX ABI (GCC)

2014-06-02 Thread Richard Sandiford
Matthew Fortune  writes:
> Richard Sandiford  writes:
>> Matthew Fortune  writes:
>> > I've realised that I may need to do 'something' to prevent GCC from
>> loading or
>> > storing DFmode/DImode values to/from FPRs using pairs of SWC1/LWC1
>> when using
>> > an unaligned address. Initial tests show that when loading from an
>> unaligned
>> > address (4-byte aligned) then GCC loads the two halves of a 64-bit
>> value into
>> > GPRs and then moves across to FPRs. This is good but I don't know if
>> it is
>> > guaranteed.
>> >
>> > From what I can tell the backend doesn't specifically deal with
>> loading
>> > unaligned data but rather the normal load/store patterns are used by
>> the
>> > middle end. As such I'm not sure there is anything to prevent direct
>> loads
>> > to FPRs by parts.
>> >
>> > Do you know one way or the other if unaligned doubles can currently be
>> loaded
>> > via pairs of lwc1 (same for store) and if so can you advise on an
>> approach I
>> > could try to prevent this for FPXX? I will try to figure this out on
>> my own in
>> > the meantime.
>> 
>> The port does handle some widths of unaligned access via the
>> {insv,extv,extzv}misalign patterns.  That's how an unaligned DImode
>> value will be handled on 64-bit targets.
>> 
>> The MIPS *misalign patterns only handle integer modes, so for other
>> types of mode the target-independent code will fall back to using an
>> integer load followed by a subreg (or a subreg followed by an integer
>> store).  IIRC that's how an unaligned DFmode access will be handled on
>> 64-bit targets.
>> 
>> For modes that are larger or smaller than *misalign can handle,
>> the target-independent code has to split the access up into smaller
>> pieces and reassemble them.  And these pieces have to have integer
>> modes.
>> E.g. on 32-bit targets a 4-byte-misaligned load into (reg:DF x) could be
>> done by loading (subreg:SI (reg:DF x) 0) and (subreg:SI (reg:DF x) 4).
>> The thing that prevents these individual loads from using LWC1 is
>> CANNOT_CHANGE_MODE_CLASS, which (among other things) makes it invalid
>> for any target-independent code to reduce a subreg of an FPR pair
>> to an individual FPR.
>> 
>> [FWIW, the reason MIPS doesn't define {insv,extv,extzv}misalign for
>> things
>> like DImode on 32-bit targets is because there's no special architecture
>> feature than can be used.  It's just a case of decomposing the access.
>> Since that's a general technique, we want to make the target-independent
>> code do it as well as possible rather than handle it in a port-specific
>> way.]
>> 
>> So yeah, the combination of (a) STRICT_ALIGNMENT, (b) the lack of
>> unaligned
>> floating-point load/store patterns and (c) CANNOT_CHANGE_MODE_CLASS
>> should
>> guarantee what you want.
>
> Thanks for all the details. I did not know if CANNOT_CHANGE_MODE_CLASS
> would give this guarantee as I am conscious of MIPS 1 having to do pairs
> of LWC1/SWC1 for doubles. However, if I understand the code correctly
> the LWC1/SWC1 pairs are generated by splits for MIPS 1 and not directly
> from target-independent code so CANNOT_CHANGE_MODE_CLASS does not impact
> that explicit splitting logic. Is that right?

Yeah, that's right.  All splits of floating-point values are done in the
MIPS code and the MIPS code is free to ignore CANNOT_CHANGE_MODE_CLASS.

One of the many reasons for defining CANNOT_CHANGE_MODE_CLASS the way it
is now is that the FPRs are always little-endian.  If the target-independent
code ever did try to access one half of a pair, it would access the wrong
one on big-endian targets.  So CCCM is by no means just a technicality at
the moment.  Other targets also rely on CCCM for similarly important reasons.

Thanks,
Richard


RE: [PATCH][MIPS] Implement O32 FPXX ABI (GCC)

2014-06-06 Thread Matthew Fortune
Richard Sandiford  writes:
> Matthew Fortune  writes: 
> > MIPSr6 only supports 64-bit registers which naturally leads to needing
> > -mfp64. MIPSr6 does however also support a single-precision only
> variant.
> >
> > For a single purpose native toolchain then --with-fp=64 can be used
> xor
> > --with-fpu=single to get the desired tools.
> 
> Hmm, does that really mean that an FPU that only implements S/W-format
> instructions must still have 64-bit registers?  And are MTHC1 and MFHC1

No, the single-precision only is 32-bit registers as you would assume.

> therefore always supported for single-float-only FPUs?  The r6
> documentation
> I downloaded made it sound like 32-bit FPUs were still allowed for
> S/W-only FPUs but that 64-bit FPUs are required if D/L-format
> instructions are supported.
> 
> Or is this more a single-precision+MSA thing?

No it is me overcomplicating things. We can deal with it as part of the
R6 patch. I have however modified the mti vendor specs to infer -mfpxx
for specific architectures rather than for everything other than mips1
to make it ready for R6.

> (Can't really respond to the later specs stuff without understanding
> this part first.)
> 
> >> >> >> > diff --git a/gcc/testsuite/gcc.target/mips/call-clobbered-2.c
> >> >> >> b/gcc/testsuite/gcc.target/mips/call-clobbered-2.c
> >> >> >> > new file mode 100644
> >> >> >> > index 000..f47cdda
> >> >> >> > --- /dev/null
> >> >> >> > +++ b/gcc/testsuite/gcc.target/mips/call-clobbered-2.c
> >> >> >> > @@ -0,0 +1,21 @@
> >> >> >> > +/* Check that we handle call-clobbered FPRs correctly.  */
> >> >> >> > +/* { dg-skip-if "code quality test" { *-*-* } { "-O0" } { ""
> }
> >> } */
> >> >> >> > +/* { dg-options "-mabi=32 -modd-spreg -mfp32 -ffixed-f0 -
> >> ffixed-f1 -
> >> >> >> ffixed-f2 -ffixed-f3 -ffixed-f4 -ffixed-f5 -ffixed-f6 -ffixed-
> f7 -
> >> >> ffixed-f8
> >> >> >> -ffixed-f9 -ffixed-f10 -ffixed-f11 -ffixed-f12 -ffixed-f13 -
> >> ffixed-f14 -
> >> >> >> ffixed-f15 -ffixed-f16 -ffixed-f17 -ffixed-f18 -ffixed-f19 -
> >> ffixed-f20 -
> >> >> >> ffixed-f22 -ffixed-f24 -ffixed-f26 -ffixed-f28 -ffixed-f30" }
> */
> >> >> >> > +
> >> >> >> > +void bar (void);
> >> >> >> > +float a;
> >> >> >> > +float
> >> >> >> > +foo ()
> >> >> >> > +{
> >> >> >> > +  float b = a + 1.0f;
> >> >> >> > +  bar();
> >> >> >> > +  return b;
> >> >> >> > +}
> >> >> >> > +/* { dg-final { scan-assembler-times "lwc1" 2 } } */
> >> >> >> > +/* { dg-final { scan-assembler-not "swc1" } } */
> >> >> >> > +/* { dg-final { scan-assembler-times "sdc1" 2 } } */
> >> >> >> > +/* { dg-final { scan-assembler-not "mtc" } } */
> >> >> >> > +/* { dg-final { scan-assembler-not "mfc" } } */
> >> >> >> > +/* { dg-final { scan-assembler-not "mthc" } } */
> >> >> >> > +/* { dg-final { scan-assembler-not "mfhc" } } */
> >> >> >>
> >> >> >> Why require sdc1 here?  Would Chao-Ying's patch make this use
> SWC1
> >> and
> >> >> LWC1
> >> >> >> exclusively?
> >> >> >
> >> >> > The SDC1 instructions are for callee-saved registers. I'll add
> the
> >> >> > check for two corresponding LDC1 instructions in the epilogue
> >> though
> >> >> > since I've noticed them being missing.
> >> >>
> >> >> I'm probably being dense, sorry, but why is SDC1 needed for -mfp32
> >> >> -modd-spreg?  Can't we just save and restore the odd register?
> >> >> That's technically more correct too, since we shouldn't really
> touch
> >> >> -ffixed registers.
> >> >
> >> > You are correct and I am/was aware of this...
> >> >
> >> > GCC is saving too much of the callee-saved registers when single-
> >> precision
> >> > values are live in them but this is historic behaviour. The code
> which
> >> > controls this is very complex and I'd be worried about breaking it.
> >> The
> >> > fix would be invasive as all the logic is honed towards the notion
> of
> >> > 64-bit callee-saved registers.
> >>
> >> But that's because it was written before separate odd spregs came
> along.
> >> I'm not convinced the situation is as bad as you say.
> >>
> >> > One thing to say is that this test is
> >> > a very aggressive test of ABI not normal compiler behaviour.
> Normally
> >> > an even-numbered callee-saved register would be used first followed
> by
> >> the
> >> > odd meaning that it is rare to only use the odd register. That
> flips
> >> the
> >> > problem round to single precision data in even registers...
> >> >
> >> > I believe that prior to mips32 introducing odd-numbered single-
> >> precision
> >> > registers then GCC will have been saving and restoring using
> sdc1/ldc1
> >> > for even-numbered registers regardless of whether they are used for
> >> > single or double precision data (for mips1 it also will do pairs of
> >> > lwc1/swc1).
> >>
> >> Sure.  And saving and restoring both using LDC1 and SDC1 is still
> >> the right thing to do if both registers are clobbered.  But if only
> >> one is clobbered then we should save just that.  It's really just
> >> the same principle as Chao-ying's patch, but applied t