[PATCH] [ARC] Add single/double IEEE precission FPU support.

2016-02-01 Thread Claudiu Zissulescu
In this patch, we add support for the new FPU instructions available with
ARC V2 processors.  The new FPU instructions cover both single and
double precision IEEE formats.  While single precision is available
for both ARC EM and ARC HS processors, double precision is only
available for ARC HS.  ARC EM will make use of the double precision assist
instructions, which are in fact FPX double instructions.  The double
precision floating point instructions use odd-even register pairs to
hold 64-bit datums, exactly as the ldd/std instructions do.

In addition to the changes required for GCC to support the FPU
instructions, we forced all 64-bit datums to use odd-even register pairs
(HS only), as we observed better usage of ldd/std and fewer generated
move instructions.  A byproduct of this optimization is a new ABI, which
places 64-bit arguments into odd-even register pairs.  This behavior
can be selected using the -mabi option.

Feedback is welcome,
Claudiu

gcc/
2016-02-01  Claudiu Zissulescu  

* config/arc/arc-modes.def (CC_FPU, CC_FPUE, CC_FPU_UNEQ): New
modes.
* config/arc/arc-opts.h (FPU_SP, FPU_SF, FPU_SC, FPU_SD, FPU_DP)
(FPU_DF, FPU_DC, FPU_DD, FPX_DP): Define.
(arc_abi_type): New enum.
* config/arc/arc-protos.h (arc_init_cumulative_args): Declare.
* config/arc/arc.c (TARGET_STRICT_ARGUMENT_NAMING): Define.
(arc_init): Check FPU options.
(get_arc_condition_code): Handle new CC_FPU* modes.
(arc_select_cc_mode): Likewise.
(arc_conditional_register_usage): Allow 64 bit datum into even-odd
register pair only. Allow access for ARCv2 accumulator.
(gen_compare_reg): Whenever we have FPU support use FPU compare
instructions.
(arc_setup_incoming_varargs): Handle even-odd register pair (ARC
HS only).
(arc_strict_argument_naming): New function.
(arc_init_cumulative_args): Likewise.
(arc_hard_regno_nregs): Likewise.
(arc_function_args_impl): Likewise.
(arc_arg_partial_bytes): Use arc_function_args_impl function.
(arc_function_arg): Likewise.
(arc_function_arg_advance): Likewise.
(arc_reorg): Don't generate brcc insns when FPU compare
instructions are involved.
* config/arc/arc.h (TARGET_DPFP): Add TARGET_FP_DPAX condition.
(TARGET_OPTFPE): Add condition when ARC EM can use optimized
floating point emulation.
(ACC_REG_FIRST, ACC_REG_LAST, ACCL_REGNO, ACCH_REGNO): Define.
(CUMULATIVE_ARGS): New structure.
(INIT_CUMULATIVE_ARGS): Use arc_init_cumulative_args.
(REVERSE_CONDITION): Add new CC_FPU* modes.
(TARGET_HARD_FLOAT, TARGET_FP_SINGLE, TARGET_FP_DOUBLE)
(TARGET_FP_SFUZED, TARGET_FP_DFUZED, TARGET_FP_SCONV)
(TARGET_FP_DCONV, TARGET_FP_SSQRT, TARGET_FP_DSQRT)
(TARGET_FP_DPAX): Define.
* config/arc/arc.md (ARCV2_ACC): New constant.
(type): New fpu type attribute.
(SDF): Conditional iterator.
(cstore, cbranch): Change expand condition.
(addsf3, subsf3, mulsf3, adddf3, subdf3, muldf3): New expands,
handles FPU/FPX cases as well.
* config/arc/arc.opt (mfpu, mabi): New options.
* config/arc/fpx.md (addsf3_fpx, subsf3_fpx, mulsf3_fpx):
Renamed.
(adddf3, muldf3, subdf3): Removed.
* config/arc/predicates.md (proper_comparison_operator): Recognize
CC_FPU* modes.
* config/arc/fpu.md: New file.
* doc/invoke.texi (ARC Options): Document mabi and mfpu options.
---
 gcc/config/arc/arc-modes.def |   5 +
 gcc/config/arc/arc-opts.h|  27 ++
 gcc/config/arc/arc-protos.h  |   1 +
 gcc/config/arc/arc.c | 398 -
 gcc/config/arc/arc.h |  70 --
 gcc/config/arc/arc.md| 150 ++-
 gcc/config/arc/arc.opt   |  56 +
 gcc/config/arc/fpu.md| 580 +++
 gcc/config/arc/fpx.md|  64 +
 gcc/config/arc/predicates.md |  10 +
 gcc/doc/invoke.texi  |  91 ++-
 11 files changed, 1310 insertions(+), 142 deletions(-)
 create mode 100644 gcc/config/arc/fpu.md

diff --git a/gcc/config/arc/arc-modes.def b/gcc/config/arc/arc-modes.def
index b64a596..1f4bf95 100644
--- a/gcc/config/arc/arc-modes.def
+++ b/gcc/config/arc/arc-modes.def
@@ -35,3 +35,8 @@ CC_MODE (CC_FPX);
 VECTOR_MODES (INT, 4);/*V4QI V2HI */
 VECTOR_MODES (INT, 8);/*   V8QI V4HI V2SI */
 VECTOR_MODES (INT, 16);   /* V16QI V8HI V4SI V2DI */
+
+/* FPU conditon flags. */
+CC_MODE (CC_FPU);
+CC_MODE (CC_FPUE);
+CC_MODE (CC_FPU_UNEQ);
diff --git a/gcc/config/arc/arc-opts.h b/gcc/config/arc/arc-opts.h
index 0f12885..be628e5 100644
--- a/gcc/config/arc/arc-opts.h
+++ b/gcc/config/arc/arc-opts.h
@@ -27,3 +27,30 @@ enum processor_type
   PROCESSOR_ARCEM,
   PROCESSOR_ARCHS
 };
+
+/* Single precisio

Re: [PATCH] [ARC] Add single/double IEEE precission FPU support.

2016-02-12 Thread Joern Wolfgang Rennecke



On 10/02/16 13:31, Claudiu Zissulescu wrote:

> Please find attached the amended patch for FPU instructions.
>
> Ok to apply?

> +(define_insn "*cmpdf_fpu"

I'm wondering - could you compare with +zero using a literal (adding an
alternative)?
(No need to hold up the main patch, but you can consider it for a
follow-up patch)

> (define_insn "*cmpsf_fpu_uneq"
> +  [(set (reg:CC_FPU_UNEQ CC_REG)
> +   (compare:CC_FPU_UNEQ
> +(match_operand:DF 0 "even_register_operand"  "r")

Typo: probably should be *cmpdf_fpu_uneq

> +case CC_FPUmode:
> +  return !((code == LTGT) || (code == UNEQ));

Strictly speaking, this shouldn't accept unsigned comparisons,
although I can't think of a scenario where these would be presented
in this mode,
and the failure mode would just be an abort in get_arc_condition_code.

Otherwise, this is OK.


RE: [PATCH] [ARC] Add single/double IEEE precission FPU support.

2016-02-16 Thread Claudiu Zissulescu
Thanks Joern,

Committed: r233451



Re: [PATCH] [ARC] Add single/double IEEE precission FPU support.

2016-02-02 Thread Joern Wolfgang Rennecke



On 01/02/16 13:57, Claudiu Zissulescu wrote:



  VECTOR_MODES (INT, 16);   /* V16QI V8HI V4SI V2DI */
+
+/* FPU conditon flags. */

Typo

+   error ("FPU double precission options are available for ARC HS only.");


There should be no period at the end of the error message string.

+  if (TARGET_HS && (arc_fpu_build & FPX_DP))
+   error ("FPU double precission assist "

Typo.  And Ditto.


+  case EQ:
+  case NE:
+  case UNORDERED:
+  case UNLT:
+  case UNLE:
+  case UNGT:
+  case UNGE:
+   return CC_FPUmode;
+
+  case LT:
+  case LE:
+  case GT:
+  case GE:
+  case ORDERED:
+   return CC_FPUEmode;

cse and other code transformations are likely to do better if you use
just one mode for these.  It is also very odd to have comparisons and their
inverse use different modes.  Have you done any benchmarking for this?

@@ -1282,6 +1363,16 @@ arc_conditional_register_usage (void)
arc_hard_regno_mode_ok[60] = 1 << (int) S_MODE;
 }

+  /* ARCHS has 64-bit data-path which makes use of the even-odd paired
+ registers.  */
+  if (TARGET_HS)
+{
+  for (regno = 1; regno < 32; regno +=2)
+   {
+ arc_hard_regno_mode_ok[regno] = S_MODES;
+   }
+}
+

Does TARGET_HS with -mabi=default allow for passing DFmode / DImode
arguments in odd registers?  I fear you might run into reload trouble when
trying to access the values.
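
To make the effect of the quoted loop concrete, here is a minimal stand-alone model of it. This is not the GCC code: the mode-class bits, the table name and the helper names are simplified, hypothetical stand-ins for S_MODES and arc_hard_regno_mode_ok, and every core register is assumed to start out able to hold a 64-bit value.

```c
#include <stdbool.h>

/* Hypothetical, simplified mode classes; the real port has more.  */
enum { S_MODE_BIT = 1 << 0, D_MODE_BIT = 1 << 1 };
#define S_MODES_MODEL (S_MODE_BIT)
#define D_MODES_MODEL (S_MODE_BIT | D_MODE_BIT)

/* Stand-in for arc_hard_regno_mode_ok, core registers only.  */
static int mode_ok_model[32];

/* Mirror of the quoted loop: on HS, odd core registers are limited to
   single-word modes, so a 64-bit datum has to start in an even register.  */
static void
init_mode_ok_model (bool target_hs)
{
  for (int regno = 0; regno < 32; regno++)
    mode_ok_model[regno] = D_MODES_MODEL;
  if (target_hs)
    for (int regno = 1; regno < 32; regno += 2)
      mode_ok_model[regno] = S_MODES_MODEL;
}

/* True if a 64-bit value may start in REGNO under the model.  */
static bool
can_hold_64bit (int regno)
{
  return (mode_ok_model[regno] & D_MODE_BIT) != 0;
}
```

With the HS restriction enabled, can_hold_64bit (2) is true and can_hold_64bit (3) is false, which is exactly the situation the question is about: an ABI that hands a DImode argument to r3 would conflict with the table.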

+arc_hard_regno_nregs (int regno,
...
+  if ((regno < FIRST_PSEUDO_REGISTER)
+  && (HARD_REGNO_MODE_OK (regno, mode)
+ || (mode == BLKmode)))
+return words;
+  return 0;

This prima facie contradicts HARD_REGNO_NREGS, which considers the
larger sizes of simd vector and dma config registers.
I see that there is no actual conflict as the vector registers are not
used for argument passing, but the comment in the function only states
what the function does - not quite correctly, as detailed before - and
not what it is for.

So, either the mxp support has to be removed before this patch goes in,
or arc_hard_regno_nregs has to handle simd registers properly, or the
comment at the top should state the limited applicability of this
function, and there should be an assert to check that the register
number passed is suitable - e.g.:
gcc_assert (regno < ARC_FIRST_SIMD_VR_REG)
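
A rough sketch of the third option (limited applicability stated in the comment, plus an assert) is given below. It is a stand-alone model rather than the patch's function: the constants and the function name are hypothetical, and plain assert stands in for gcc_assert.

```c
#include <assert.h>

/* Hypothetical values for illustration; the real ones live in arc.h.  */
#define UNITS_PER_WORD_MODEL 4
#define FIRST_SIMD_VR_REG_MODEL 64

/* Return the number of word-sized core registers needed to hold
   SIZE_IN_BYTES starting at REGNO.  Only valid for registers that can be
   used for argument passing; SIMD and DMA config registers have other
   sizes and must not be passed in, hence the assert.  */
static int
arg_regno_nregs_model (int regno, int size_in_bytes)
{
  assert (regno < FIRST_SIMD_VR_REG_MODEL);
  return (size_in_bytes + UNITS_PER_WORD_MODEL - 1) / UNITS_PER_WORD_MODEL;
}
```

Under these assumptions an 8-byte argument needs two registers and a 5-byte block also rounds up to two.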

+/* Given an CUMULATIVE_ARGS, this function returns an RTX if the

Typo: C is not a vowel.

+  if (!named && TARGET_HS)
+{
+  /* For unamed args don't try fill up the reg-holes.  */
+  reg_idx = cum->last_reg;
+  /* Only interested in the number of regs.  */

You should make up your mind what the priorities for stdarg are.
Traditionally, lots of gcc ports have supported broken code that lacks
declarations of variadic functions, and furthermore have placed
emphasis on simplicity of varargs/stdarg callee code, at the expense
of normal code.  Often for compatibility with a pre-existing
compiler, sometimes by just copying from existing ports without
stopping to consider the ramifications.
If you make argument passing different for stdarg declared functions,
the broken code that lacks declarations won't work any more.
Ignoring registers for argument passing is not helping the caller's
code density.  So the only objective that might be furthered here
is stdarg callee simplicity.  But if you really want that, and ignore
compatibility with broken code, the logical thing to do is not to
pass any unnamed arguments in registers.

If stdarg caller's code size is considered important, and stdarg
callees mostly irrelevant (as mostly associated with I/O, and
linked in just once per function), this aligns well with supporting
broken code: it shouldn't matter if the argument is anonymous or
not, it's the same effort for the caller to pass it.

One further thing to consider when forging new ABIs is that
partial argument passing is there solely for the convenience of
stdarg callees, and/or the programmer who wrote that part of
the t

RE: [PATCH] [ARC] Add single/double IEEE precission FPU support.

2016-02-03 Thread Claudiu Zissulescu
First, I will split this patch in two. The first part will deal with the FPU
instructions. The second patch will try to address a new ABI optimized for
odd-even registers, as the comments on -mabi=optimized are numerous and I
need to carefully prepare an answer.
The remainder of this email will focus on the FPU patch.

> +  case EQ:
> +  case NE:
> +  case UNORDERED:
> +  case UNLT:
> +  case UNLE:
> +  case UNGT:
> +  case UNGE:
> +   return CC_FPUmode;
> +
> +  case LT:
> +  case LE:
> +  case GT:
> +  case GE:
> +  case ORDERED:
> +   return CC_FPUEmode;
> 
> cse and other code transformations are likely to do better if you use
> just one mode for these.  It is also very odd to have comparisons and their
> inverse use different modes.  Have you done any benchmarking for this?

Right, ORDERED should be in CC_FPUmode. The inspiration for the
CC_FPU/CC_FPUE modes is the arm port. The reason for having two modes,
CC_FPU and CC_FPUE, is to emit signaling FPU compare instructions.  We can
use a single CC_FPU mode here instead of two, but we may lose functionality.
Regarding benchmarks, I do not have an established benchmark for this;
however, as far as I can see, the code generated for the FPU looks clean.
Please let me know whether it is acceptable to go ahead with CC_FPU/CC_FPUE
plus the ORDERED fix, or whether I should use a single mode.
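
For reference, the two-mode selection with ORDERED moved to CC_FPU could be sketched as below. This is a stand-alone model, not the code in arc.c: the enums are hypothetical stand-ins for the rtx comparison codes and machine modes, and LTGT/UNEQ (which the patch routes to CC_FPU_UNEQ) are left out.

```c
/* Hypothetical stand-ins for rtx codes and CC modes.  */
enum cmp_code
{
  EQ, NE, UNORDERED, ORDERED, UNLT, UNLE, UNGT, UNGE, LT, LE, GT, GE
};

enum cc_mode { CC_FPU, CC_FPUE };

/* Pick the CC mode for a floating point comparison.  Only the ordered
   relational codes get CC_FPUE, the mode intended for the signaling
   compare; everything else, now including ORDERED, gets CC_FPU.  */
static enum cc_mode
select_fpu_cc_mode (enum cmp_code code)
{
  switch (code)
    {
    case LT:
    case LE:
    case GT:
    case GE:
      return CC_FPUE;
    default:
      return CC_FPU;
    }
}
```

In this form ORDERED and UNORDERED share a mode, so a comparison and its inverse no longer end up in different modes for that pair; LT and UNGE still do.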

> +  /* ARCHS has 64-bit data-path which makes use of the even-odd paired
> + registers.  */
> +  if (TARGET_HS)
> +{
> +  for (regno = 1; regno < 32; regno +=2)
> +   {
> + arc_hard_regno_mode_ok[regno] = S_MODES;
> +   }
> +}
> +
> 
> Does TARGET_HS with -mabi=default allow for passing DFmode / DImode
> arguments
> in odd registers?  I fear you might run into reload trouble when trying to
> access the values.

Although I haven't bumped into this issue so far, I can't say it cannot
happen. Hence, I would create a new register class to hold the odd-even
registers, so the above code would no longer be needed. What do you say?

> still in "subdf3":
> +  else if (TARGET_FP_DOUBLE)
> 
> So this implies that both (TARGET_DPFP) and (TARGET_FP_DOUBLE) might be
> true at the same time.  In that case, do we really want to prefer the
> (TARGET_DPFP) expansion?

TARGET_DPFP (FPX instructions) and TARGET_FP_DOUBLE (FPU) are mutually
exclusive. There should be a check in the arc_init() function for this case.

> +(define_insn "*cmpsf_trap_fpu"
> 
> That name makes as little sense to me as having two separate modes
> CC_FPU and CC_FPUE for positive / negated usage and having two
> comparison patterns per precision that do the same but pretend to be
> dissimilar.
> 

The F{S/D}CMPF instruction is similar to the F{S/D}CMP instruction in cases
when either of the instruction operands is a signaling NaN. The FxCMPF
instruction updates the invalid flag (FPU_STATUS.IV) when either of the
operands is a quiet or signaling NaN, whereas the FxCMP instruction updates
the invalid flag (FPU_STATUS.IV) only when either of the operands is a
signaling NaN. We need to use FxCMPF only if we keep both CC_FPU and
CC_FPUE; otherwise, we shall use only the FxCMP instruction.
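
The difference between the two compare flavours can be modelled in a few lines of C. The sketch below assumes the usual IEEE 754 split (a quiet compare flags only signaling NaNs, a signaling compare flags any NaN) and the common binary32 encoding in which the top fraction bit marks a quiet NaN. The function names are hypothetical, and the code works on raw bit patterns rather than on the hardware flag.

```c
#include <stdbool.h>
#include <stdint.h>

/* IEEE 754 binary32 fields.  Treating the top fraction bit as the
   "quiet" bit is an assumption that holds on most architectures.  */
#define F32_EXP_MASK  UINT32_C (0x7f800000)
#define F32_FRAC_MASK UINT32_C (0x007fffff)
#define F32_QUIET_BIT UINT32_C (0x00400000)

static bool
is_nan (uint32_t bits)
{
  return (bits & F32_EXP_MASK) == F32_EXP_MASK
	 && (bits & F32_FRAC_MASK) != 0;
}

static bool
is_snan (uint32_t bits)
{
  return is_nan (bits) && (bits & F32_QUIET_BIT) == 0;
}

/* Model of a quiet compare (FxCMP-like): invalid only for sNaN.  */
static bool
quiet_cmp_sets_invalid (uint32_t a, uint32_t b)
{
  return is_snan (a) || is_snan (b);
}

/* Model of a signaling compare (FxCMPF-like): invalid for any NaN.  */
static bool
signaling_cmp_sets_invalid (uint32_t a, uint32_t b)
{
  return is_nan (a) || is_nan (b);
}
```

For a quiet NaN operand such as 0x7fc00000 only the signaling model reports invalid; for a signaling NaN such as 0x7fa00000 both do, which is the case where the two instructions behave alike.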

> Also, the agglomeration of S/D with FU{S,Z}ED is confusing.  Could you
> spare another underscore? 

Is this better?

#define TARGET_FP_SP_BASE   ((arc_fpu_build & FPU_SP) != 0)
#define TARGET_FP_DP_BASE   ((arc_fpu_build & FPU_DP) != 0)
#define TARGET_FP_SP_FUSED  ((arc_fpu_build & FPU_SF) != 0)
#define TARGET_FP_DP_FUSED  ((arc_fpu_build & FPU_DF) != 0)
#define TARGET_FP_SP_CONV   ((arc_fpu_build & FPU_SC) != 0)
#define TARGET_FP_DP_CONV   ((arc_fpu_build & FPU_DC) != 0)
#define TARGET_FP_SP_SQRT   ((arc_fpu_build & FPU_SD) != 0)
#define TARGET_FP_DP_SQRT   ((arc_fpu_build & FPU_DD) != 0)
#define TARGET_FP_DP_AX ((arc_fpu_build & FPX_DP) != 0)
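
As a quick illustration of how such macros behave, here is a stand-alone sketch. The bit values below are made up for the example (the real ones are defined in arc-opts.h), and arc_fpu_build is modelled as a plain variable.

```c
/* Hypothetical bit assignments, for illustration only.  */
enum
{
  FPU_SP = 1 << 0,
  FPU_SF = 1 << 1,
  FPU_DP = 1 << 4,
  FPU_DF = 1 << 5,
  FPX_DP = 1 << 8
};

/* Stand-in for the variable set from the -mfpu option.  */
static int arc_fpu_build;

#define TARGET_FP_SP_BASE  ((arc_fpu_build & FPU_SP) != 0)
#define TARGET_FP_DP_BASE  ((arc_fpu_build & FPU_DP) != 0)
#define TARGET_FP_SP_FUSED ((arc_fpu_build & FPU_SF) != 0)
#define TARGET_FP_DP_FUSED ((arc_fpu_build & FPU_DF) != 0)
#define TARGET_FP_DP_AX    ((arc_fpu_build & FPX_DP) != 0)
```

Setting arc_fpu_build to FPU_SP | FPU_SF makes TARGET_FP_SP_BASE and TARGET_FP_SP_FUSED true while all the double precision tests stay false, so each feature can be queried independently of the others.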

Thanks,
Claudiu


Re: [PATCH] [ARC] Add single/double IEEE precission FPU support.

2016-02-03 Thread Joern Wolfgang Rennecke



On 03/02/16 15:02, Claudiu Zissulescu wrote:

> First, I will split this patch in two. The first part will deal with the FPU
> instructions. The second patch, will try to address a new abi optimized for
> odd-even registers as the comments for the mabi=optimized are numerous and I
> need to carefully prepare for an answer.
> The remaining of this email will focus on FPU patch.

> > +  case EQ:
> > +  case NE:
> > +  case UNORDERED:
> > +  case UNLT:
> > +  case UNLE:
> > +  case UNGT:
> > +  case UNGE:
> > +   return CC_FPUmode;
> > +
> > +  case LT:
> > +  case LE:
> > +  case GT:
> > +  case GE:
> > +  case ORDERED:
> > +   return CC_FPUEmode;
> >
> > cse and other code transformations are likely to do better if you use
> > just one mode for these.  It is also very odd to have comparisons and their
> > inverse use different modes.  Have you done any benchmarking for this?
>
> Right, the ORDERED should be in CC_FPUmode. An inspiration point for
> CC_FPU/CC_FPUE mode is the arm port.

I can't see how this code in the arm port can actually work correctly.
When, for instance, combine simplifies a comparison, the comparison code can
change, and it will use SELECT_CC_MODE to find a new mode for the
comparison.  Thus, whether a comparison traps or not on qNaNs will depend on
the whims of combine.
Also, the trapping comparisons don't show the side effect of trapping on
qNaNs, which means they can be speculated.

To make the trapping comparisons safe, they should display the side effect
in the rtl, and only be used when requested by options, type attributes,
pragmas etc.
They could almost be safe to use by default for -ffinite-math-only, except
that when the frontend knows how to tell qNaNs and sNaNs apart, and
speculates a comparison after some integer/fp mixed computation when it can
infer that no sNaN will occur, you could still get an unexpected signal.

> The reason why having the two CC_FPU and CC_FPUE modes is to emit signaling
> FPU compare instructions.

I don't know if your compare instructions are signalling for quiet NaNs
(I hope they're not), but the mode of the comparison result shouldn't be
used to distinguish that - it's not safe, see above.
The purpose of the mode of the result is to distinguish different
interpretations for the bit patterns inside the comparison result flags.

> We can use a single CC_FPU mode here instead of two, but we may lose
> functionality.

Can you define what that functionality actually is, and show some simple
test code to demonstrate how it works with your port extension?

> Regarding benchmarks, I do not have an establish benchmark for this, however,
> as far as I could see the code generated for FPU looks clean.
> Please let me know if it is acceptable to go with CC_FPU/CC_FPUE, and ORDERED
> fix further on.

No, there should be a discernible and achievable purpose for comparison
modes, which you have not demonstrated so far for the CC_FPU/CC_FPUE
dichotomy.

> Or, to have a single mode.

Yes.



> > +  /* ARCHS has 64-bit data-path which makes use of the even-odd paired
> > + registers.  */
> > +  if (TARGET_HS)
> > +{
> > +  for (regno = 1; regno < 32; regno +=2)
> > +   {
> > + arc_hard_regno_mode_ok[regno] = S_MODES;
> > +   }
> > +}
> > +
> >
> > Does TARGET_HS with -mabi=default allow for passing DFmode / DImode
> > arguments in odd registers?  I fear you might run into reload trouble
> > when trying to access the values.
>
> Although, I haven't bump into this issue until now, I do not say it may not
> happen. Hence, I would create a new register class to hold the odd-even
> registers. Hence the above code will not be needed. What do u say?

That would have been possible a couple of years ago, but these days, all
the constituent registers of a multi-reg hard register have to be in a
constraint's register class for the constraint to match.
You could fudge this by using two classes for two-reg allocations, one with
0,1, 4,5, 8,9 ... the other with 2,3, 6,7, 10,11 ... , but then you need
another pair for four-register allocations, and maybe you want to add
various union and intersection classes, and the register allocators are
rather rubbish when it comes to balancing allocations between classes of
similar size and utility, so you should really try to avoid this.
A way to avoid this is not to give the option of using the old ABI while
enforcing alignment in registers.
Or you could use a different mode for the argument passing when it ends up
unaligned; I suppose BLKmode should work, using a vector to designate the
constituent registers of the function argument.

> > +(define_insn "*cmpsf_trap_fpu"
> >
> > That name makes as little sense to me as having two separate modes
> > CC_FPU and CC_FPUE for positive / negated usage and having two
> > comparison patterns per precision that do the same but pretend to be
> > dissimilar.
>
> The F{S/D}CMPF instruction is similar to the F{S/D}CMP instruction

Oops. I missed the 'f' suffix.  So the "*trap_fpu" patterns really are
different...

  in cases when either of the instruction opera

Re: [PATCH] [ARC] Add single/double IEEE precission FPU support.

2016-02-05 Thread Joern Wolfgang Rennecke
P.S.: if code that is missing prototypes for stdarg functions is of no
concern, there is another ABI alternative that might give good code density
for architectures like ARC that have pre-decrement addressing modes and
allow immediates to be pushed:

You could put all unnamed arguments on the stack (thus simplifying varargs
processing), and leave all registers not used for argument passing
call-saved.
Thus, the callers wouldn't have to worry about saving these registers or
reloading their values from the stack.

For gcc, this would require making the call fusage really work - probably
involving a hook to tell the middle-end that the port really wants that - or
a kludge to make affected call insn not look like call insns, similar to
the sfuncs.


RE: [PATCH] [ARC] Add single/double IEEE precission FPU support.

2016-02-05 Thread Claudiu Zissulescu
> P.S.: if code that is missing prototypes for stdarg functions is of no 
> concern,
> there is another ABI alternative that might give good code density for
> architectures like ARC that have pre-decrement addressing modes and allow
> immediates to be pushed:
> 
> You could put all unnamed arguments on the stack (thus simplifying varargs
> processing), and leave all registers not used for argument passing call-saved.
> Thus, the callers wouldn't have to worry about saving these registers or
> reloading their values from the stack.
> 
> For gcc, this would require making the call fusage really work - probably
> involving a hook to tell the middle-end that the port really wants that - or a
> kludge to make affected call insn not look like call insns, similar to the 
> sfuncs.

Unfortunately, we need to be compatible with the previous ABI for the time
being.
I am now investigating passing the DI-like modes in register pairs that are
not even-odd aligned. The biggest challenge is how to pass such a mode
partially without introducing odd/even register classes.


RE: [PATCH] [ARC] Add single/double IEEE precission FPU support.

2016-02-09 Thread Claudiu Zissulescu
Please find attached a reworked patch. It doesn't contain the ABI modifications 
as I notified you earlier in an email.  Also, you may have extra comments 
regarding these original observations:

>+  /* ARCHS has 64-bit data-path which makes use of the even-odd paired
>+ registers.  */
>+  if (TARGET_HS)
>+{
>+  for (regno = 1; regno < 32; regno +=2)
>+   {
>+ arc_hard_regno_mode_ok[regno] = S_MODES;
>+   }
>+}
>+
>
>Does TARGET_HS with -mabi=default allow for passing DFmode / DImode 
>arguments
>in odd registers?  I fear you might run into reload trouble when trying to
>access the values.

The current ABI passes the DI-like modes in any register pair. This should not 
be an issue as the movdi_insn and movdf_insn should handle those exceptional 
cases. As for partial passing of arguments, move_block_from_reg() should take 
care of exceptional cases like DImode.

>+ if (!link_insn
>+ /* Avoid FPU instructions.  */
>+ || (GET_MODE (SET_DEST
>+   (PATTERN (link_insn))) == CC_FPUmode)
>+ || (GET_MODE (SET_DEST
>+   (PATTERN (link_insn))) == CC_FPU_UNEQmode)
>+ || (GET_MODE (SET_DEST
>+   (PATTERN (link_insn))) == CC_FPUEmode))
>
>It's pointless to search for the CC setter and then bail out this late.
>The mode is also accessible in the CC user, so after we have computed
>pc_target, we can check the condition code register in the comparison
>XEXP (pc_target, 1) for its mode.

In most cases, checking only the CC user may be sufficient. However, there
are cases (I found only one) where the CC user has a different mode than
the CC setter.  This happens when running the gcc.dg/pr56424.c test. Here,
the CC_FPU mode cstore is simplified by the following steps, losing the
CC_FPU mode:

In the expand:
   18: cc:CC_FPU=cmp(r159:DF,r162:DF)
   19: r163:SI=cc:CC_FPU<0
   20: r161:QI=r163:SI#0
   21: r153:SI=zero_extend(r161:QI)
   22: cc:CC_ZN=cmp(r153:SI,0)
   23: pc={(cc:CC_ZN!=0)?L28:pc}

Then after combine we get this:
   18: cc:CC_FPU=cmp(r2:DF,r4:DF)
  REG_DEAD r4:DF
  REG_DEAD r2:DF
   23: pc={(cc:CC_ZN<0)?L28:pc}
  REG_DEAD cc:CC_ZN
  REG_BR_PROB 6102

Ok to apply?
Claudiu


0001-ARC-Add-single-double-IEEE-precission-FPU-support.patch
Description: 0001-ARC-Add-single-double-IEEE-precission-FPU-support.patch


Re: [PATCH] [ARC] Add single/double IEEE precission FPU support.

2016-02-09 Thread Joern Wolfgang Rennecke



On 09/02/16 15:34, Claudiu Zissulescu wrote:

> Most of the cases checking only the CC user may be sufficient. However, there
> are cases (only one which I found), where the CC user has a different mode than
> of the CC setter.  This is happening when running gcc.dg/pr56424.c test. Here,
> the C_FPU mode cstore is simplified by the following steps losing the CC_FPU
> mode:
>
> In the expand:
> 18: cc:CC_FPU=cmp(r159:DF,r162:DF)
> 19: r163:SI=cc:CC_FPU<0
> 20: r161:QI=r163:SI#0
> 21: r153:SI=zero_extend(r161:QI)
> 22: cc:CC_ZN=cmp(r153:SI,0)
> 23: pc={(cc:CC_ZN!=0)?L28:pc}
>
> Then after combine we get this:
> 18: cc:CC_FPU=cmp(r2:DF,r4:DF)
>    REG_DEAD r4:DF
>    REG_DEAD r2:DF
> 23: pc={(cc:CC_ZN<0)?L28:pc}
>    REG_DEAD cc:CC_ZN
>    REG_BR_PROB 6102


That sounds like a bug.  Have you looked more closely at what's going on?


RE: [PATCH] [ARC] Add single/double IEEE precission FPU support.

2016-02-10 Thread Claudiu Zissulescu
> > In the expand:
> > 18: cc:CC_FPU=cmp(r159:DF,r162:DF)
> > 19: r163:SI=cc:CC_FPU<0
> > 20: r161:QI=r163:SI#0
> > 21: r153:SI=zero_extend(r161:QI)
> > 22: cc:CC_ZN=cmp(r153:SI,0)
> > 23: pc={(cc:CC_ZN!=0)?L28:pc}
> >
> > Then after combine we get this:
> > 18: cc:CC_FPU=cmp(r2:DF,r4:DF)
> >REG_DEAD r4:DF
> >REG_DEAD r2:DF
> > 23: pc={(cc:CC_ZN<0)?L28:pc}
> >REG_DEAD cc:CC_ZN
> >REG_BR_PROB 6102
> 
> That sound like a bug.  Have you looked more closely what's going on?

fwprop1 collapses insn 20 into insn 21. No surprise so far. Then the
combiner first merges insns 19 and 21 into insn 21 (this seems sane), and
then combines the resulting insn 21 into insn 22. Finally, insn 22 changes
the condition of the jump (insn 23).
The last steps are a bit too aggressive, but I can see the logic in them.
Practically, insn 22 tells the combiner how to change a CC_FPU mode into a
CC_ZN mode, resulting in the modification of insns 21 to 23. However, I
cannot understand why the combiner chooses CC_ZN instead of CC_FPU.


RE: [PATCH] [ARC] Add single/double IEEE precission FPU support.

2016-02-10 Thread Claudiu Zissulescu
> That sound like a bug.  Have you looked more closely what's going on?

Right, I found it. I forgot to set C_MODE for the CC_FPU* modes in
arc_mode_class[]. I will prepare a new patch with the proper handling.

Thanks!


RE: [PATCH] [ARC] Add single/double IEEE precission FPU support.

2016-02-10 Thread Claudiu Zissulescu
Please find attached the amended patch for FPU instructions.

Ok to apply?


0001-ARC-Add-single-double-IEEE-precission-FPU-support.patch
Description: 0001-ARC-Add-single-double-IEEE-precission-FPU-support.patch