RE: [PATCH 1/2][ARC] Add support for ARCv2 CPUs

2015-11-11 Thread Claudiu Zissulescu
This patch is committed (without the gen_compare_reg change). 

Thanks Joern,
Claudiu

> Apart from the gen_compare_reg change, the patch is OK.
> If the v2 support mostly works like support for the other subtargets, you may
> check it in without the gen_compare_reg change.
> If that change is required because of particular code paths taken with the v2
> port, you may check in the whole patch.
> 
> The operand-swapping in gen_compare_reg was not expected to be
> triggered when re-generating a comparison, as comparisons gleaned from
> existing instructions are supposed to already have the operands in the right
> order.
> Do you have a testcase that triggers the assert independently from the
> v2 support?
> If you can name a pre-existing testcase to trigger the assert, the patch is
> approved for separate check-in.
> If you have a new testcase, is it in a form and of a legal status that it can 
> be
> submitted for inclusion in the gcc regression tests suite?


RE: [PATCH 1/2][ARC] Add support for ARCv2 CPUs

2015-11-10 Thread Claudiu Zissulescu

> If you can name a pre-existing testcase to trigger the assert, the patch is
> approved for separate check-in.

The patch solves the gcc.dg/pr29921-2.c error, visible for ARC700 architecture. 
I will prepare a new patch for this error. 

Thank you for the review,
Claudiu


Re: [PATCH 1/2][ARC] Add support for ARCv2 CPUs

2015-11-10 Thread Joern Wolfgang Rennecke



On 30/10/15 11:19, Claudiu Zissulescu wrote:

Hi,

Please find the updated patch. I will defer the secondary reload optimization 
which will use the ld  instructions with LIMM, for the time being.



Apart from the gen_compare_reg change, the patch is OK.
If the v2 support mostly works like support for the other subtargets, 
you may check it in without the gen_compare_reg

change.
If that change is required because of particular code paths taken with 
the v2 port, you may check in the whole patch.


The operand-swapping in gen_compare_reg was not expected to be triggered 
when re-generating a comparison,
as comparisons gleaned from existing instructions are supposed to 
already have the operands in the right order.
Do you have a testcase that triggers the assert independently from the 
v2 support?
If you can name a pre-existing testcase to trigger the assert, the patch 
is approved for separate check-in.
If you have a new testcase, is it in a form and of a legal status that 
it can be submitted for inclusion in the

gcc regression tests suite?


RE: [PATCH 1/2][ARC] Add support for ARCv2 CPUs

2015-10-30 Thread Claudiu Zissulescu
Hi,

Please find the updated patch. I will defer the secondary reload optimization 
which will use the ld  instructions with LIMM, for the time being.

Thank you,
Claudiu

gcc/ChangeLog:

2015-08-27  Claudiu Zissulescu
 

 
* common/config/arc/arc-common.c (arc_handle_option): Handle ARCv2  
 
options.
 
* config/arc/arc-opts.h: Add ARCv2 CPUs.
 
* config/arc/arc-protos.h (arc_secondary_reload_conv): Prototype.   
 
* config/arc/arc.c (arc_secondary_reload): Handle subreg (reg)  
 
situation, and store instructions with large offsets.   
 
(arc_secondary_reload_conv): New function.  
 
(arc_init): Add ARCv2 options.  
 
(arc_conditional_register_usage): Select the proper register usage  
 
for ARCv2 processors.   
 
(arc_handle_interrupt_attribute): ILINK2 is only valid for ARCv1
 
architecture.   
 
(arc_compute_function_type): Likewise.  
 
(arc_print_operand): Handle new ARCv2 punctuation characters.   
 
(arc_return_in_memory): ARCv2 ABI returns in registers up to 16 
 
bytes.  
 
(workaround_arc_anomaly, arc_asm_insn_p, arc_loop_hazard): New  
 
function.   
 
(arc_reorg, arc_hazard): Use it.
 
(gen_compare_reg): Eliminate false assert situations.   
 
* config/arc/arc.h (TARGET_CPU_CPP_BUILTINS): Define __HS__ and 
 
__EM__. 
 
(ASM_SPEC): Add ARCv2 options.  
 
(TARGET_NORM): ARC HS has norm instructions by default. 
 
(TARGET_OPTFPE): Use optimized floating point emulation for ARC 
 
HS. 
 
(TARGET_AT_DBR_CONDEXEC): Only for ARC600 family.   
 
(TARGET_EM, TARGET_HS, TARGET_V2, TARGET_MPYW, TARGET_MULTI):   
 
Define. 
 
(SIGNED_INT16, TARGET_MPY, TARGET_ARC700_MPY, TARGET_ANY_MPY):  
 
Likewise.   
 
(TARGET_ARC600_FAMILY, TARGET_ARCOMPACT_FAMILY): Likewise.  
 

Re: [PATCH 1/2][ARC] Add support for ARCv2 CPUs

2015-10-23 Thread Joern Wolfgang Rennecke



On 30/09/15 11:42, Claudiu Zissulescu wrote:

This patch adds basic support for Synopsys' ARCv2 CPUs.



There is an awful lot of places that have TARGET_ARC700 || TARGET_V2 etc.
Maybe it's time for some new feature-oriented macros?  Like:

TARGET_ARC600_FAMILY  ARC600 and ARC601, for ARC600 pipeline idiosyncrasies
(currently tested by ruling out everything else... OK when that was just the
 ARC700,  and ARCtangent-A5 shared lots of the same issues, but 
backwards now)


TARGET_LP_WR_INTERLOCK loop count register can be read in very next
   instruction after it has been written to by an ordinary instruction
   (Which unfortunately comes at a penalty for every LP use).

TARGET_MPY   MPY support

TARGET_SLOW_CARRY high latency for carry input to arithmetic


comments to the patch in detail:

arc.c

For arc_secondary_reload:

The new code is puzzling at first glance (address reloading is generally
the job of find_reloads_toplev and its direct and indirect callees).
Sifting through the port I can see why it may be needed, but this is far
from obvious, so it requires a comment.  Something like:

  /* For ARC base register + offset addressing, the validity of the address
 is mode-dependent for most of the offset range, as the offset can be
 scaled by the access size.
 We don't expose these as mode-dependent addresses in the
 mode_dependent_address_p target hook, because that would disable
 lots of optimizations, and most uses of these addresses are for 32 or
 64 bit accesses anyways, which are fine.
 However, that leaves some addresses for 8 / 16 bit values not properly
 reloaded by the generic code, which is why we have to schedule 
secondary

 reloads for these.  */

Also, the test is unnecessarily strict for HImode.  Instead of

 (INTVAL (XEXP (addr, 1)) < -256
  || INTVAL (XEXP (addr, 1)) > 255)

you can use:
 !RTX_OK_FOR_OFFSET_P (mode, XEXP (addr, 1))
which takes potential scaling for HImode into account.

If you want to allow LIMMs for loads, this is also the place to
avoid unnecessary reloads in the first place - rather than making them
no-ops as the comment in arc_secondary_reload_conv suggests.
In arc_secondary_reload, you would just have to demur scheduling a reloads
for a large offset if in_p is set (that wouldn't allow them in the general
case, though - just the mode-dependent stuff).
To make this work, there would have to be a pattern-alternative to
match, though.  Currently, we don't allow large offsets for ARC memory
operands.  You could create a special constraint (not officially a memory
constraint) that allows MEMs with large offsets, and add it to the load
alternatives of the move patterns.
Another approach you could experiment with (don't know if it'll work)
would be to allow the full 32 bit offset range in arc_legitimate_address_p,
but claim that that which can not be stored with a 32 bit store is mode
dependent.  The SH is sort of a precedent for this as it had pre-decrement /
post-increment addresses which are valid only for store / load, 
respectively.

But it also has to use predicates to separate load and store operands.
You'd probably will also need to make arc's move_dest_operand predicate
reject larger offsets in MEMs.
We already reject scaled indices there.


arc_secondary_reload_conv:
I don't see why you test the address range again - isn't this function
only called when you have already scheduled a reload for the express
purpose of reloading the address?

If you do want to re-do the test (for sanity checking?), again,
  !RTX_OK_FOR_OFFSET_P (GET_MODE (mem), XEXP (addr, 1))
will consider the scaling for HImode.

in arc_loop_hazard
typo (moved, pre-existing, but still...): instert

arc.md

"*commutative_binary_mult_comparison_result_used":
You will also have to fix mult_operator in predicates.md to make this work
for TARGET_V2 .

"mulhisi3_imm" / "umulhisi3_imm"
I suspect that the hardware does not support multiplying with arbitrary
32 bit constants.  If so, the predicates and constraints for these
alternatives are wrong.  Without these properly in place, cse / combine etc.
might generate constants the hardware process as the compiler thinks it
does.

"simple_return"
Here you say a return from interrupt is nocond - but you don't change
the "p_return_i" pattern, so presumably the machine-independent 
if-conversion

can still generate a conditional return from interrupt.

predicates.md:

proper_comparison_operator

When are we passed a comparison where the first operand is VOIDmode?
I thought we generally canonicalize so that this does not happen?