Re: [i386] logical shift right in shrd

2014-06-21 Thread Uros Bizjak
On Fri, Jun 20, 2014 at 10:42 PM, Marc Glisse marc.gli...@inria.fr wrote:

 as reported in PR 61503, there seems to be a typo in the shrd pattern. I
 think it is quite unlikely to cause any problem, because the pattern is 1
 instruction too long for combine to recognize it (by the way, if someone has
 suggestions for PR 55583...). But it is still better to fix it.

 Bootstrap+testsuite on x86_64-linux-gnu.

 2014-06-21  Marc Glisse  marc.gli...@inria.fr

 PR target/61503
 * config/i386/i386.md (x86_64_shrd, x86_shrd): Replace ashiftrt
 with lshiftrt.

OK for mainline and 4.9.

Thanks,
Uros.
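
For readers following along, here is a minimal C sketch of why the shift in
the shrd pattern has to be logical. It is only an illustration, not the
i386.md pattern itself: the names shrd32 and shrd32_arith are made up for
this note, and the sign fill of an arithmetic shift is emulated by hand so
that the result does not depend on implementation-defined behaviour.

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit double shift right: shift DST right by N and fill the vacated
   high bits with the low bits of SRC.  The shift of DST must be logical
   (zero fill) so that the OR with SRC's bits is not disturbed.  */
uint32_t shrd32(uint32_t dst, uint32_t src, unsigned n)
{
    assert(n < 32);
    if (n == 0)
        return dst;             /* avoid the undefined shift by 32 below */
    return (dst >> n) | (src << (32 - n));
}

/* What the same formula gives if DST is shifted arithmetically instead:
   the sign bit of DST is smeared into the bits that should come from SRC.  */
uint32_t shrd32_arith(uint32_t dst, uint32_t src, unsigned n)
{
    assert(n < 32);
    if (n == 0)
        return dst;
    /* Emulate the sign fill of an arithmetic shift portably.  */
    uint32_t fill = (dst & 0x80000000u) ? ~(UINT32_MAX >> n) : 0;
    return ((dst >> n) | fill) | (src << (32 - n));
}
```

The two functions agree whenever the top bit of dst is clear and differ
only when it is set.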


Re: Move DECL_INIT_PRIORITY/FINI_PRIORITY to symbol table

2014-06-21 Thread Andreas Schwab
Jan Hubicka hubi...@ucw.cz writes:

 this patch moves init and fini priorities to symbol table instead of trees.
 They are already in on-side hashtables, but the hashtables are now maintained
 by symbol table.  This is needed for correctness with LTO.

This breaks gcc.dg/initpri3.c.  The constructor and destructor are
miscompiled to unconditionally call abort.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
And now for something completely different.


Re: Fortran OpenMP UDR fixes, nested handling fixes etc.

2014-06-21 Thread Tobias Burnus

Jakub Jelinek wrote:

Bootstrap/regtest pending, does this look ok?


Except for the module/resolved issues discussed elsewhere, it looks good
to me.


Thanks!

Tobias


2014-06-20  Jakub Jelinek  ja...@redhat.com

* gimplify.c (gimplify_scan_omp_clauses) <case OMP_CLAUSE_MAP,
OMP_CLAUSE_TO, OMP_CLAUSE_FROM>: Make sure OMP_CLAUSE_SIZE is
non-NULL.
<case OMP_CLAUSE_ALIGNED>: Gimplify OMP_CLAUSE_ALIGNED_ALIGNMENT.
(gimplify_adjust_omp_clauses_1): Make sure OMP_CLAUSE_SIZE is
non-NULL.
(gimplify_adjust_omp_clauses): Likewise.
* omp-low.c (lower_rec_simd_input_clauses,
lower_rec_input_clauses, expand_omp_simd): Handle non-constant
safelen the same as safelen(1).
* tree-nested.c (convert_nonlocal_omp_clauses,
convert_local_omp_clauses): Handle OMP_CLAUSE_ALIGNED.  For
OMP_CLAUSE_{MAP,TO,FROM} if not decl use walk_tree.
(convert_nonlocal_reference_stmt, convert_local_reference_stmt):
Fixup handling of GIMPLE_OMP_TARGET.
(convert_tramp_reference_stmt, convert_gimple_call): Handle
GIMPLE_OMP_TARGET.
gcc/fortran/
* dump-parse-tree.c (show_omp_namelist): Use n->udr->udr instead
of n->udr.
* f95-lang.c (gfc_init_builtin_functions): Initialize
BUILT_IN_ASSUME_ALIGNED.
* gfortran.h (gfc_omp_namelist): Change udr field type to
struct gfc_omp_namelist_udr.
(gfc_omp_namelist_udr): New type.
(gfc_get_omp_namelist_udr): Define.
(gfc_resolve_code): New prototype.
* match.c (gfc_free_omp_namelist): Free name->udr.
* module.c (intrinsics): Add INTRINSIC_USER.
(mio_expr): Handle INTRINSIC_USER and non-resolved EXPR_FUNCTION.
* openmp.c (gfc_match_omp_clauses): Adjust initialization of n->udr.
(gfc_match_omp_declare_reduction): Treat len=: the same as len=*.
Set attr.flavor on omp_{out,in,priv,orig} artificial variables.
(struct resolve_omp_udr_callback_data): New type.
(resolve_omp_udr_callback, resolve_omp_udr_callback2,
resolve_omp_udr_clause): New functions.
(resolve_omp_clauses): Adjust for n->udr changes, resolve UDR clauses
here.
(omp_udr_callback): Don't check for implicitly declared functions
here.
(gfc_resolve_omp_udr): Don't call gfc_resolve.  Don't check for
implicitly declared subroutines here.
* resolve.c (resolve_code): Renamed to ...
(gfc_resolve_code): ... this.  No longer static.
(gfc_resolve_blocks, generate_component_assignments, resolve_codes):
Adjust callers.
* trans-openmp.c (gfc_omp_privatize_by_reference): Don't privatize
by reference type (C_PTR) variables.
(gfc_omp_finish_clause): Make sure OMP_CLAUSE_SIZE is non-NULL.
(gfc_trans_omp_udr_expr): Remove.
(gfc_trans_omp_array_reduction_or_udr): Adjust for n->udr changes.
Don't call gfc_trans_omp_udr_expr, even for sym->attr.dimension
expand it as assignment or subroutine call.
gcc/testsuite/
* gfortran.dg/gomp/udr2.f90 (f7, f9): Add !$omp parallel with
reduction clause.
* gfortran.dg/gomp/udr4.f90 (f4): Likewise.
Remove "Label is never defined" expected error.
* gfortran.dg/gomp/udr8.f90: New test.
libgomp/
* testsuite/libgomp.fortran/aligned1.f03: New test.
* testsuite/libgomp.fortran/nestedfn5.f90: New test.
* testsuite/libgomp.fortran/target7.f90: Surround loop spawning
tasks with !$omp parallel !$omp single.
* testsuite/libgomp.fortran/target8.f90: New test.
* testsuite/libgomp.fortran/udr4.f90 (foo UDR, bar UDR): Adjust
not to use trim in the combiner, instead call elemental function.
(fn): New elemental function.
* testsuite/libgomp.fortran/udr6.f90 (do_add, dp_add, dp_init):
Make elemental.
* testsuite/libgomp.fortran/udr7.f90 (omp_priv, omp_orig, omp_out,
omp_in): Likewise.
* testsuite/libgomp.fortran/udr12.f90: New test.
* testsuite/libgomp.fortran/udr13.f90: New test.
* testsuite/libgomp.fortran/udr14.f90: New test.
* testsuite/libgomp.fortran/udr15.f90: New test.


Re: [PATCH][MIPS] Enable load-load/store-store bonding

2014-06-21 Thread Richard Sandiford
Hi Sameera,

Thanks for the patch.

Sameera Deshpande sameera.deshpa...@imgtec.com writes:
 diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
 index b5b5ba7..9804ef2 100644
 --- a/gcc/config/mips/mips.c
 +++ b/gcc/config/mips/mips.c
 @@ -18813,6 +18813,9 @@ mips_option_override (void)
   if (TARGET_MICROMIPS && TARGET_MIPS16)
     error ("unsupported combination: %s", "-mips16 -mmicromips");
  
 +  if (TARGET_FIX_24K && TUNE_P5600)
 +    error ("unsupported combination: %s", "-mtune=p5600 -mfix-24k");
 +
/* Save the base compression state and process flags as though we
   were generating uncompressed code.  */
mips_base_compression_flags = TARGET_COMPRESSION;

Although it's a bit of an odd combination, we need to accept
-mfix-24k -mtune=p5600 and continue to implement the 24k workarounds.
The idea is that a distributor can build for a common base architecture,
add -mfix- options for processors that might run the code, and add -mtune=
for the processor that's most of interest optimisation-wise.

We should just make the pairing of stores conditional on !TARGET_FIX_24K.

 diff --git a/gcc/config/mips/mips.opt b/gcc/config/mips/mips.opt
 index b9cfd62..d4135cf 100644
 --- a/gcc/config/mips/mips.opt
 +++ b/gcc/config/mips/mips.opt
 @@ -445,3 +445,7 @@ Enum(mips_lib_setting) String(tiny) Value(MIPS_LIB_TINY)
  
  msched-weight
  Target Report Var(TARGET_SCHED_WEIGHT) Undocumented
 +
 +mld-st-pairing
 +Target Report Var(TARGET_ENABLE_LD_ST_PAIRING)
 +Enable load/store pairing

Other options are just TARGET_ + the capitalised form of the option name,
so I'd prefer TARGET_LD_ST_PAIRING instead.  Although ld might be misleading
since it's an abbreviation for load rather than the LD instruction.
Maybe -mload-store-pairs, since plurals are more common than -ing?
Not sure that's a great suggestion though.

If we want a user-level option then it needs to be documented in
invoke.texi.

 diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h
 index 86ca419..4478f81e 100644
 --- a/gcc/config/mips/mips.h
 +++ b/gcc/config/mips/mips.h
 @@ -3184,3 +3184,6 @@ extern GTY(()) struct target_globals *mips16_globals;
 with arguments ARGS.  */
  #define PMODE_INSN(NAME, ARGS) \
(Pmode == SImode ? NAME ## _si ARGS : NAME ## _di ARGS)
 +
 +#define ENABLE_LD_ST_PAIRING \
 +  (TARGET_ENABLE_LD_ST_PAIRING && TUNE_P5600)

The patch requires -mld-st-pairing to be passed explicitly even for
-mtune=p5600.  Is that because it's not a consistent enough win for us
to enable it by default?  It sounded from the description like it should
be an improvement more often than not.

We should allow pairing even without -mtune=p5600.

 diff --git a/gcc/config/mips/mips.md b/gcc/config/mips/mips.md
 index 7229e8f..05605c5 100644
 --- a/gcc/config/mips/mips.md
 +++ b/gcc/config/mips/mips.md
 @@ -780,6 +780,7 @@
  
  (define_mode_iterator MOVEP1 [SI SF])
  (define_mode_iterator MOVEP2 [SI SF])
 +(define_mode_iterator JOINLDST1 [SI SF DF])

Maybe:

(define_mode_iterator JOIN_MODE [
  SI
  (DI "TARGET_64BIT")
  (SF "TARGET_HARD_FLOAT")
  (DF "TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT")])

and then extend:

 @@ -883,6 +884,8 @@
  (define_mode_attr loadx [(SF "lwxc1") (DF "ldxc1") (V2SF "ldxc1")])
  (define_mode_attr storex [(SF "swxc1") (DF "sdxc1") (V2SF "sdxc1")])
  
 +(define_mode_attr insn_type [(SI "") (SF "fp") (DF "fp")])
 +
  ;; The unextended ranges of the MIPS16 addiu and daddiu instructions
  ;; are different.  Some forms of unextended addiu have an 8-bit immediate
  ;; field but the equivalent daddiu has only a 5-bit field.

this accordingly.

 @@ -7442,6 +7445,153 @@
{ return MIPS_CALL (jal, operands, 0, -1); }
[(set_attr type call)
 (set_attr insn_count 3)])
 +
 +(define_insn *join2_load_storeJOINLDST1:mode
 +  [(parallel
 +[(set (match_operand:JOINLDST1 0 nonimmediate_operand =reg,m)
 +   (match_operand:JOINLDST1 1 nonimmediate_operand m,reg))
 + (set (match_operand:JOINLDST1 2 nonimmediate_operand =reg,m)
 +   (match_operand:JOINLDST1 3 nonimmediate_operand m,reg))])]
 +  "ENABLE_LD_ST_PAIRING && reload_completed"
 +  {
 +output_asm_insn (mips_output_move (operands[0], operands[1]), operands);
 +output_asm_insn (mips_output_move (operands[2], operands[3]), 
 operands[2]);
 +return ;
 +  }
 +  [(set_attr move_type insn_typeload,insn_typestore)
 +   (set_attr_alternative insn_count
 + [(mult (symbol_ref mips_load_store_insns (operands[1], insn))
 +(const_int 2))
 +  (mult (symbol_ref mips_load_store_insns (operands[0], insn))
 +(const_int 2))])])

Outer (parallel ...)s are redundant in a define_insn.

It would be better to add the mips_load_store_insns for each operand
rather than multiplying one of them by 2.  Or see the next bit
for an alternative.

 +;;2 SI/SF/DF loads are joined.
 +(define_peephole2
 +  [(set (match_operand:JOINLDST1 0 register_operand)
 + (mem:JOINLDST1 (plus:SI (match_operand:SI 1 register_operand)
 + (match_operand:SI 2 

Re: [i386] logical shift right in shrd

2014-06-21 Thread Marc Glisse

On Sat, 21 Jun 2014, Uros Bizjak wrote:


On Fri, Jun 20, 2014 at 10:42 PM, Marc Glisse marc.gli...@inria.fr wrote:


as reported in PR 61503, there seems to be a typo in the shrd pattern. I
think it is quite unlikely to cause any problem, because the pattern is 1
instruction too long for combine to recognize it (by the way, if someone has
suggestions for PR 55583...). But it is still better to fix it.

Bootstrap+testsuite on x86_64-linux-gnu.

2014-06-21  Marc Glisse  marc.gli...@inria.fr

PR target/61503
* config/i386/i386.md (x86_64_shrd, x86_shrd): Replace ashiftrt
with lshiftrt.


OK for mainline and 4.9.


Thanks.

Er, I am sorry, I don't know what happened, but when testing the backport 
to 4.9 I got an obvious failure in the testsuite, which I am sure should 
also happen on trunk, but somehow I didn't see it (I am almost sure I 
tested the right branch though). Anyway, here is an updated patch, that 
did pass bootstrap+testsuite both on trunk and 4.9. I haven't committed 
anything yet, is the new patch ok?


2014-06-21  Marc Glisse  marc.gli...@inria.fr

PR target/61503
* config/i386/i386.md (x86_64_shrd, x86_shrd,
ix86_rotrdwi3_doubleword): Replace ashiftrt with lshiftrt.

--
Marc Glisse
Index: gcc/config/i386/i386.md
===
--- gcc/config/i386/i386.md (revision 211865)
+++ gcc/config/i386/i386.md (working copy)
@@ -9601,37 +9601,37 @@
 (match_operand:DWI 1 register_operand)
 (match_operand:QI 2 nonmemory_operand)))
  (clobber (reg:CC FLAGS_REG))])
(match_dup 3)]
   TARGET_CMOVE
   [(const_int 0)]
   ix86_split_shift_insn (operands, operands[3], DWImode); DONE;)
 
 (define_insn "x86_64_shrd"
   [(set (match_operand:DI 0 "nonimmediate_operand" "+r*m")
-        (ior:DI (ashiftrt:DI (match_dup 0)
+        (ior:DI (lshiftrt:DI (match_dup 0)
                   (match_operand:QI 2 "nonmemory_operand" "Jc"))
                 (ashift:DI (match_operand:DI 1 "register_operand" "r")
                   (minus:QI (const_int 64) (match_dup 2)))))
    (clobber (reg:CC FLAGS_REG))]
   "TARGET_64BIT"
   "shrd{q}\t{%s2%1, %0|%0, %1, %2}"
   [(set_attr "type" "ishift")
    (set_attr "prefix_0f" "1")
    (set_attr "mode" "DI")
    (set_attr "athlon_decode" "vector")
    (set_attr "amdfam10_decode" "vector")
    (set_attr "bdver1_decode" "vector")])
 
 (define_insn "x86_shrd"
   [(set (match_operand:SI 0 "nonimmediate_operand" "+r*m")
-        (ior:SI (ashiftrt:SI (match_dup 0)
+        (ior:SI (lshiftrt:SI (match_dup 0)
                   (match_operand:QI 2 "nonmemory_operand" "Ic"))
                 (ashift:SI (match_operand:SI 1 "register_operand" "r")
                   (minus:QI (const_int 32) (match_dup 2)))))
    (clobber (reg:CC FLAGS_REG))]
   ""
   "shrd{l}\t{%s2%1, %0|%0, %1, %2}"
   [(set_attr "type" "ishift")
    (set_attr "prefix_0f" "1")
    (set_attr "mode" "SI")
    (set_attr "pent_pair" "np")
@@ -10069,27 +10069,27 @@
(rotatert:DWI (match_operand:DWI 1 register_operand 0)
   (match_operand:QI 2 shift_immediate_operand S)))
   (clobber (reg:CC FLAGS_REG))
   (clobber (match_scratch:DWIH 3 =r))]
  
  #
  reload_completed
  [(set (match_dup 3) (match_dup 4))
   (parallel
[(set (match_dup 4)
-(ior:DWIH (ashiftrt:DWIH (match_dup 4) (match_dup 2))
+(ior:DWIH (lshiftrt:DWIH (match_dup 4) (match_dup 2))
   (ashift:DWIH (match_dup 5)
(minus:QI (match_dup 6) (match_dup 2)
 (clobber (reg:CC FLAGS_REG))])
   (parallel
[(set (match_dup 5)
-(ior:DWIH (ashiftrt:DWIH (match_dup 5) (match_dup 2))
+(ior:DWIH (lshiftrt:DWIH (match_dup 5) (match_dup 2))
   (ashift:DWIH (match_dup 3)
(minus:QI (match_dup 6) (match_dup 2)
 (clobber (reg:CC FLAGS_REG))])]
 {
   operands[6] = GEN_INT (GET_MODE_BITSIZE (MODEmode));
 
   split_double_mode (DWImode, operands[0], 1, operands[4], operands[5]);
 })
 
 (define_insn "*bmi2_rorx<mode>3_1"


Re: [PATCH GCC]Add 'force-dwarf-lexical-blocks' command line option

2014-06-21 Thread Herman, Andrei

I will be on vacation until July 13.
I'll have access to my e-mail occasionally.

If you could send me please your comments, both style and content, 
pertaining to all three patches related to this subject, I will make all 
the needed changes and I could submit a new version, as soon as I get back.


Thanks and regards,
Andrei Herman
Mentor Graphics Corporation
Israel branch

On 6/20/2014 12:09 AM, Joseph S. Myers wrote:

On Sun, 1 Jun 2014, Herman, Andrei wrote:


+  /* The -fforce-dwarf-lexical-blocks option is only relevant when debug
+ info is in DWARF4 format */
+  if (flag_force_dwarf_blocks) {


Watch coding style: the opening '{' always goes on the next line.


+fforce-dwarf-lexical-blocks
+C C++ Var(flag_force_dwarf_blocks)
+Force generation of lexical blocks in dwarf output


I don't see a good reason for this not to be supported for ObjC and ObjC++
as well.  Say DWARF, not dwarf.


+@item -fforce-dwarf-lexical-blocks
+Produce debug information (a DW_TAG_lexical_block) for every function
+body, loop body, switch body, case statement, if-then and if-else statement,
+even if the body is a single statement.  Likewise, a lexical block will be
+emitted for the first label of a statement.  This block ends at the end of the
+current lexical scope, or when a break, continue, goto or return statement is
+encountered at the same lexical scope level.  This option is usefull for
+coverage tools that utilize the dwarf debug information.
+This option only applies to C/C++ code and is available when using DWARF
+Version 4 or higher.


Use @code{} markup for keywords (if, else, break, continue, goto, return).
useful not usefull.  DWARF not dwarf.


+/* Create a block_loc struct for a statement list created on behalf of
+   flag_force_dwarf_blocks.  We use this for label or forced c99 scopes.  */
+
+void
+push_block_info (tree block, location_t loc, bool is_label)
+{
+  if (TREE_CODE(block) != STATEMENT_LIST)


Watch coding style: space before '(' in function and macro calls (and
similar calls such as sizeof) (many places in this patch, not just this
one).


+tree
+pop_block_info (location_t loc)


It's not documented in codingconventions.html, but I think it's preferred
to avoid returning values through reference arguments (see e.g.
https://gcc.gnu.org/ml/gcc-patches/2013-11/msg00198.html).


+{
+  block_loc  tl = NULL;


Excess space between block_loc and tl.


@@ -4679,7 +4712,7 @@ c_parser_compound_statement_nostart (c_parser *parser)
expressions being rejected later.  */

static void
-c_parser_label (c_parser *parser)
+c_parser_label (c_parser *parser, bool prev_label)


You're adding a new argument - you need to update the comment above this
function to explain the semantics of this argument.

In general, make sure that new functions have comments above them that
explain the semantics of the arguments (by name) and any return value.


+/* If current scope is a label scope, pop it from block info stack
+   and close it's compound statement.  */


its not it's.





Re: [patch] change specific int128 - generic intN

2014-06-21 Thread Marc Glisse

(Adding libstdc++@ in Cc: so they see the patch at
https://gcc.gnu.org/ml/gcc-patches/2014-06/msg01655.html )

On Sat, 21 Jun 2014, DJ Delorie wrote:


New version of https://gcc.gnu.org/ml/gcc-patches/2014-04/msg00723.html

This is a big patch, but it includes all the features/changes/support
requested since the initial patch.  Tested with identical results
before/after on x86-64 EXCEPT for g++.dg/ext/int128-1.C, which assumes
__int128 is not supported with -std= but as per previous discussion,
it now is (because __intN might be used for size_t, which is
required).  Tested on msp430 with significant improvements but a few
regressions (expected).

This patch replaces the current int128 support with a generic intN
support, which happens to provide int128 as well as up to three
additional intN types per target.  I will post a separate patch for
the msp430 backend demonstrating the use of this new feature for
__int20, as well as a third to fix some problems with PSImode in
general.

The general idea is that genmodes has a new macro INT_N() for
tm-modes.def, which gives the mode and precision to use for
target-specific intN types.  These types are always created, but the
parser will error with unsupported if the target does not support
that mode for that compile (i.e. because of command line switches,
etc).  There's an INT_N(TI,128) in the common sources.

If the target defines any types larger than long long, those types
(like int128 before) will be used for type promotion, but types
smaller than long long will not, to avoid confusing the C type
promotion rules.  Otherwise, it's up to the source being compiled to
specify __intN types as desired.


Nice. A couple quick comments on the parts I understand while
maintainers are resting for the week-end.


Index: libstdc++-v3/src/c++11/limits.cc
===

[...]

+#if !defined(__STRICT_ANSI__)


Since the test on __STRICT_ANSI__ is removed for all other uses, it would 
seem consistent to me to remove this one as well. Besides, you are already 
testing __GLIBCXX_USE_INT_N_0, which as far as I understand is protected 
by !flag_iso (with the exception of size_t).



-#if !defined(__STRICT_ANSI__) && defined(_GLIBCXX_USE_INT128)
+  // Conditionalizing on __STRICT_ANSI__ here will break any port that
+  // uses one of these types for size_t.
+#if defined(__GLIBCXX_USE_INT_N_0)
   template<>
-    struct __is_integral_helper<__int128>
+    struct __is_integral_helper<__GLIBCXX_TYPE_INT_N_0>


Since the check for __STRICT_ANSI__ is removed, do we need to add
__extension__ in front of __GLIBCXX_TYPE_INT_N_0 to avoid warning with
-Wsystem-headers?


--- gcc/cp/rtti.c   (revision 211858)
+++ gcc/cp/rtti.c   (working copy)
@@ -1506,31 +1506,44 @@ emit_support_tinfo_1 (tree bltn)

void
emit_support_tinfos (void)
{
  /* Dummy static variable so we can put nullptr in the array; it will be
 set before we actually start to walk the array.  */
-  static tree *const fundamentals[] =
+  static tree * fundamentals[] =
  {
void_type_node,
boolean_type_node,
wchar_type_node, char16_type_node, char32_type_node,
char_type_node, signed_char_type_node, unsigned_char_type_node,
short_integer_type_node, short_unsigned_type_node,
integer_type_node, unsigned_type_node,
long_integer_type_node, long_unsigned_type_node,
long_long_integer_type_node, long_long_unsigned_type_node,
-int128_integer_type_node, int128_unsigned_type_node,
float_type_node, double_type_node, long_double_type_node,
dfloat32_type_node, dfloat64_type_node, dfloat128_type_node,
+#define FUND_INT_N_IDX 22
+// These eight are for intN_t nodes
+nullptr_type_node, nullptr_type_node, nullptr_type_node, 
nullptr_type_node,
+nullptr_type_node, nullptr_type_node, nullptr_type_node, 
nullptr_type_node,
nullptr_type_node,
0
  };
-  int ix;
+  int ix, i;
  tree bltn_type, dtor;

+  ix = FUND_INT_N_IDX;
+  for (i = 0; i < NUM_INT_N_ENTS; i ++)
+if (int_n_enabled_p[i])
+  {
+   fundamentals [ix++] = int_n_trees[i].signed_type;
+   fundamentals [ix++] = int_n_trees[i].unsigned_type;
+  }
+  fundamentals [ix++] = nullptr_type_node;
+  fundamentals [ix++] = 0;
+
  push_abi_namespace ();
  bltn_type = xref_tag (class_type,
get_identifier (__fundamental_type_info),
/*tag_scope=*/ts_current, false);
  pop_abi_namespace ();
  if (!COMPLETE_TYPE_P (bltn_type))


That seems complicated. You just need to call emit_support_tinfo_1 on
each of the types (see how fundamentals is used at the end of the
function), no need to put everything in the array.

--
Marc Glisse
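
To make the last suggestion concrete, here is a rough C sketch of the
"just call the emitter on each type" shape. Everything in it is a stand-in
invented for this note: type_id replaces tree, struct int_n_entry plays the
role of the int_n_enabled_p/int_n_trees pair, and emit_one plays the role of
emit_support_tinfo_1. It only shows that a second loop over the enabled
entries removes the need for placeholder slots in the fixed array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a tree node.  */
typedef int type_id;

/* Hypothetical stand-in for one intN table entry.  */
struct int_n_entry {
    int enabled;            /* plays the role of int_n_enabled_p[i] */
    type_id signed_type;
    type_id unsigned_type;
};

/* Records what was emitted, so the sketch can be checked.  */
struct emitted {
    type_id seen[32];
    size_t count;
};

/* Plays the role of emit_support_tinfo_1: handle a single type.  */
void emit_one(struct emitted *out, type_id t)
{
    assert(out->count < sizeof out->seen / sizeof out->seen[0]);
    out->seen[out->count++] = t;
}

/* Walk the fixed list first, then the enabled intN entries, calling the
   per-type emitter directly instead of patching entries into the array.  */
void emit_all(struct emitted *out,
              const type_id *fixed, size_t n_fixed,
              const struct int_n_entry *ents, size_t n_ents)
{
    for (size_t i = 0; i < n_fixed; i++)
        emit_one(out, fixed[i]);
    for (size_t i = 0; i < n_ents; i++)
        if (ents[i].enabled) {
            emit_one(out, ents[i].signed_type);
            emit_one(out, ents[i].unsigned_type);
        }
}
```

With this shape the fixed array can stay const, since nothing writes to it
at run time.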


Re: [PATCH] Fix up -march=native handling under KVM (PR target/61570)

2014-06-21 Thread Jakub Jelinek
On Fri, Jun 20, 2014 at 03:22:52PM -0700, H.J. Lu wrote:
 On Fri, Jun 20, 2014 at 2:42 PM, Jakub Jelinek ja...@redhat.com wrote:
  --- gcc/config/i386/driver-i386.c.jj2014-05-14 14:45:54.0 +0200
  +++ gcc/config/i386/driver-i386.c   2014-06-20 18:59:57.805006358 +0200
  @@ -745,6 +745,11 @@ const char *host_detect_local_cpu (int a
  /* Assume Core 2.  */
  cpu = core2;
  }
  + else if (has_longmode)
  +   /* Perhaps some emulator?  Assume x86-64, otherwise gcc
  +  -march=native would be unusable for 64-bit compilations,
  +  as all the CPUs below are 32-bit only.  */
  +   cpu = x86-64;
else if (has_sse3)
  /* It is Core Duo.  */
  cpu = pentium-m;
 
  Jakub
 
 host_detect_local_cpu guesses the cpu based on the real processors.
 It doesn't work with emulators due to some conflicts.  This isn't the
 only place which has the same issue.  I prefer something like
 this.

I'm fine with your patch too.  Let's wait and see which one Uros (or the
other i?86 maintainers) picks.

Jakub


RE: [PATCH, cpp] Fix line directive bug

2014-06-21 Thread Nicholas Ormrod
Hello all,

(Re-adding gcc-patches, since it got dropped and missed six emails)

=== CPP FEATURE SUGGESTION ===

Adding line directives inside of a macro expansion to differentiate between 
system tokens and user tokens is a valid solution. As Manuel pointed out, there 
would need to be many line directives.

So, if "#define FOO(x) (x) + 1/0" was in a system file, and was instantiated
from a user file like "FOO(2/0)", the "2/0" are user tokens (which should raise
a div-by-zero warning), but the "1/0" are system tokens (and so should not
raise a warning). This would need to expand as follows:

# 1 "user.cpp" 3
(
# 1 "user.cpp"
2/0
# 1 "user.cpp" 3
) + 1/0
# 2 "user.cpp"

The last line directive is crucial: it marks the rest of the file as user
tokens again, so that warnings are not suppressed there.

Currently, compiling a user program with FOO(2/0) yields a single div-by-zero 
error. Compiling that same program with -no-integrated-cpp yields two errors: 
the original and correct 2/0 error, but also the 1/0 system error. So adjusting 
the preprocessor to emit extra line directives would be consistent with the 
integrated-cpp mode.
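
As an illustration of how a consumer of preprocessed output could track the
system/user state from such directives, here is a small C sketch. It is not
the cpp implementation; the function names are invented for this note, and
it assumes the usual linemarker shape (# LINE "FILE" FLAGS...) with flag 3
meaning "system header".

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Return 1 if LINE is a linemarker carrying flag 3 (system header),
   0 if it is a linemarker without flag 3, and -1 if it is not a
   linemarker at all.  */
int linemarker_flag3(const char *line)
{
    unsigned lineno;
    char file[256];
    int consumed = 0;

    /* %n is only stored if the closing quote was matched.  */
    if (sscanf(line, "# %u \"%255[^\"]\"%n", &lineno, file, &consumed) != 2
        || consumed == 0)
        return -1;

    /* Scan the trailing flags, if any.  */
    const char *p = line + consumed;
    for (;;) {
        char *end;
        long flag = strtol(p, &end, 10);
        if (end == p)
            break;
        if (flag == 3)
            return 1;
        p = end;
    }
    return 0;
}

/* For each input line, store -1 for a linemarker, otherwise 1 if the
   line falls in a system region and 0 if it is user code.  */
void classify_lines(const char *const *lines, size_t n, int *is_system)
{
    int in_system = 0;
    for (size_t i = 0; i < n; i++) {
        int f = linemarker_flag3(lines[i]);
        if (f >= 0) {
            in_system = f;
            is_system[i] = -1;
        } else {
            is_system[i] = in_system;
        }
    }
}
```

Fed the expansion shown above, this marks "(" and ") + 1/0" as system,
"2/0" as user, and anything after the final directive as user again. If
that final directive is missing, everything after the macro stays marked
as system.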


=== NEXT STEPS ===

The bugzilla report now has two independent reports, so this is an issue which 
is currently affecting the community. Both reports, it should be noted, stem 
from the use of ccache, which caches preprocessed output (since this bug is not 
present with integrated preprocessing).

The status quo is subtle and infrequent, but confounding. The extremely 
specific requirements to trigger the bug mean that it is effectively 
undiagnosable (my initial bug report was the result of a very chance set of 
circumstances and a lot of elbow grease, which ended up being the reason for a 
years-old abnormality in our error messages). Further, when the bug does 
happen, it is severely detrimental, since it disables warnings in the rest of 
the file. Given these two circumstances, undiagnosability and high-impact, I 
think that a fix should be applied immediately.

We currently have two possible approaches. I see the trade-offs between these 
two solutions as follows:

Adding full line directives is not implemented, and will be more complicated 
than my patch. It will require someone else to implement it, but should fix the 
problem in its entirety.

My solution is simple, battle-tested (I've had it in our production gcc for a 
few months now), and ready to roll. The primary downside is that it will not 
suppress system-token errors in macro expansions (though in my experience this
is not a problem).



Cheers,
Nicholas



[Patch, Fortran] Some coarray fixes

2014-06-21 Thread Tobias Burnus
This patch primarily adds a check that the A argument (= 
source/result) of a collective is definable. I found the issue when a 
co_* test case didn't work with vector subscripts. (gfortran doesn't do 
a copy-out.)



The patch additionally fixes one issue I found on the way: 
gfc_check_vardef_context with context == NULL segfaulted for vector 
subscripts.



And I fixed two issues I encountered with coindexed strings:

a) gfc_conv_string_tmp requires that the type argument is a pointer – 
otherwise, it will ICE. (See also other uses of that function)
b) get_scalar_to_descriptor_type: If the argument is a pointer, the type 
and hence the dtype is wrong.


I found those while writing a test case for coindexed strings and type 
conversion; I will later submit the test case together with some other 
coarray-related patches, but to clean up my trunk and to make strings 
already usable, I have included those bits of the patch already.


Built and regtested on x86-64-gnu-linux.
OK for the trunk?

Tobias
gcc/fortran/
2014-06-21  Tobias Burnus  bur...@net-b.de

	* check.c (check_co_minmaxsum): Add definable check.
	* expr.c (gfc_check_vardef_context): Fix context == NULL case.
	* trans-expr.c (get_scalar_to_descriptor_type): Handle pointer arguments.
	* trans-intrinsic.c (gfc_conv_intrinsic_caf_get): Fix generation of temporary
	strings.

gcc/testsuite/
2014-06-21  Tobias Burnus  bur...@net-b.de

	* gfortran.dg/coarray_collectives_7.f90: New.

diff --git a/gcc/fortran/check.c b/gcc/fortran/check.c
index bd3eff6..10944eb 100644
--- a/gcc/fortran/check.c
+++ b/gcc/fortran/check.c
@@ -1307,6 +1307,18 @@ check_co_minmaxsum (gfc_expr *a, gfc_expr *result_image, gfc_expr *stat,
   if (!variable_check (a, 0, false))
 return false;
 
+  if (!gfc_check_vardef_context (a, false, false, false, "argument 'A' with "
+				 "INTENT(INOUT)"))
+    return false;
+
+  if (gfc_has_vector_subscript (a))
+    {
+      gfc_error ("Argument 'A' with INTENT(INOUT) at %L of the intrinsic "
+		 "subroutine %s shall not have a vector subscript",
+		 &a->where, gfc_current_intrinsic);
+  return false;
+}
+
   if (result_image != NULL)
 {
   if (!type_check (result_image, 1, BT_INTEGER))
diff --git a/gcc/fortran/expr.c b/gcc/fortran/expr.c
index f0238c1..feb089e 100644
--- a/gcc/fortran/expr.c
+++ b/gcc/fortran/expr.c
@@ -4956,10 +4956,11 @@ gfc_check_vardef_context (gfc_expr* e, bool pointer, bool alloc_obj,
 			  en = n->expr;
 			  if (gfc_dep_compare_expr (ec, en) == 0)
 			{
-			  gfc_error_now ("Elements with the same value at %L"
-	  " and %L in vector subscript"
-	  " in a variable definition"
-	  " context (%s)", &(ec->where),
+			  if (context)
+gfc_error_now ("Elements with the same value at %L"
+	" and %L in vector subscript"
+	" in a variable definition"
+	" context (%s)", &(ec->where),
 	 &(en->where), context);
 			  return false;
 			}
diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c
index d67d737..7ee0206 100644
--- a/gcc/fortran/trans-expr.c
+++ b/gcc/fortran/trans-expr.c
@@ -57,6 +57,8 @@ get_scalar_to_descriptor_type (tree scalar, symbol_attribute attr)
   else
 akind = GFC_ARRAY_ASSUMED_SHAPE_CONT;
 
+  if (POINTER_TYPE_P (TREE_TYPE (scalar)))
+scalar = TREE_TYPE (scalar);
   return gfc_get_array_type_bounds (TREE_TYPE (scalar), 0, 0, NULL, NULL, 1,
 akind, !(attr.pointer || attr.target));
 }
diff --git a/gcc/fortran/trans-intrinsic.c b/gcc/fortran/trans-intrinsic.c
index 548fd9f..a0c7421 100644
--- a/gcc/fortran/trans-intrinsic.c
+++ b/gcc/fortran/trans-intrinsic.c
@@ -1258,7 +1258,8 @@ gfc_conv_intrinsic_caf_get (gfc_se *se, gfc_expr *expr, tree lhs, tree lhs_kind)
 	{
 	  gfc_clear_attr (attr);
 	  if (array_expr->ts.type == BT_CHARACTER)
-	res_var = gfc_conv_string_tmp (se, type, argse.string_length);
+	res_var = gfc_conv_string_tmp (se, build_pointer_type (type),
+	   argse.string_length);
 	  else
 	res_var = gfc_create_var (type, "caf_res");
 	  dst_var = gfc_conv_scalar_to_descriptor (&argse, res_var, attr);
diff --git a/gcc/testsuite/gfortran.dg/coarray_collectives_8.f90 b/gcc/testsuite/gfortran.dg/coarray_collectives_8.f90
new file mode 100644
index 000..aa97b7f
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/coarray_collectives_7.f90
@@ -0,0 +1,14 @@
+! { dg-do compile }
+! { dg-options -fcoarray=lib }
+!
+! As SOURCE is INTENT(INOUT), it must be definable,
+! cf. J3/14-147
+!
+
+intrinsic :: co_sum, co_min, co_max
+integer :: vec(3), idx(3)
+
+call co_sum(vec(idx)) ! { dg-error Argument 'A' with INTENT\\(INOUT\\) at .1. of the intrinsic subroutine co_sum shall not have a vector subscript }
+call co_min(vec([1,3,2])) ! { dg-error Argument 'A' with INTENT\\(INOUT\\) at .1. of the intrinsic subroutine co_min shall not have a vector subscript }
+call co_sum(vec([1,1,1])) ! { dg-error Elements with the same value at .1. and .2. in vector subscript in a variable definition context \\(argument 'A' 

[webdoc, patch, committed] svn.html - retire fortran-caf

2014-06-21 Thread Tobias Burnus
Now that all changes of the fortran-caf branch are in the GCC 4.10 
trunk, it makes sense to retire that branch.


That's what I now did with the attached patch.

Tobias
Index: svn.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/svn.html,v
retrieving revision 1.197
diff -p -u -r1.197 svn.html
--- svn.html	11 Jun 2014 18:49:25 -	1.197
+++ svn.html	21 Jun 2014 19:00:41 -
@@ -485,11 +485,6 @@ the command <code>svn log --stop-on-copy
 <h4>Language-specific</h4>
 
 <dl>
-  <dt>fortran-caf</dt>
-  <dd>This branch is for coarray changes to the Fortran front end.  It is
-maintained by Tobias Burnus
-&lt;<a href="mailto:bur...@gcc.gnu.org">bur...@gcc.gnu.org</a>&gt;.</dd>
-
   <dt>fortran-dev</dt>
   <dd>This branch is for disruptive changes to the Fortran front end,
  especially for OOP development and 
@@ -1178,6 +1173,12 @@ be prefixed with the initials of the dis
 for array constructor refactoring using splay-tree and other areas of
 optimization.  It was maintained by Jerry DeLisle
 &lt;<a href="mailto:jvdeli...@gcc.gnu.org">jvdeli...@gcc.gnu.org</a>&gt;.</dd>
+
+  <dt>fortran-caf</dt>
+  <dd>This branch contained experimental changes to the Fortran front end for
+implementing the library calls for coarray communication.  It was
+maintained by Tobias Burnus
+&lt;<a href="mailto:bur...@gcc.gnu.org">bur...@gcc.gnu.org</a>&gt;.</dd>
 </dl>
 
 </body>


[RTL] (vec_select (vec_concat a b) c) may be just a or b

2014-06-21 Thread Marc Glisse

Hello,

this is another small simplification of RTL for vectors. Note that it 
doesn't really solve the problem, because these simplifications are only 
performed for single-use objects. If I start from vectors [a,b] and [c,d] 
and concatenate them into [a,b,c,d], then extract both halves, as in the 
original testcase in the PR, we won't notice that those are the original 
vectors. Still, better than nothing...
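
As a reading aid, the test the simplification performs can be sketched in
plain C on arrays of selector indices. This is a sketch of the idea only,
not the simplify-rtx.c code: the function names are invented here, l0 and l1
are the element counts of the two concatenated operands, and the mode and
side-effect checks of the real patch are left out.

```c
#include <assert.h>

/* Return 1 if sel[0..n-1] is the consecutive run START, START+1, ...  */
int is_consecutive_run(const int *sel, int n, int start)
{
    assert(n > 0);
    for (int i = 0; i < n; i++)
        if (sel[i] != start + i)
            return 0;
    return 1;
}

/* Given a selection SEL of NSEL elements out of the concatenation of a
   first operand with L0 elements and a second operand with L1 elements,
   return 0 if SEL is exactly the first operand, 1 if it is exactly the
   second operand, and -1 otherwise.  */
int selected_half(const int *sel, int nsel, int l0, int l1)
{
    if (nsel == l0 && is_consecutive_run(sel, nsel, 0))
        return 0;
    if (nsel == l1 && is_consecutive_run(sel, nsel, l0))
        return 1;
    return -1;
}
```

For [a,b] concatenated with [c,d], the selectors {0,1} and {2,3} give back
the first and second operand respectively, while {1,2} matches neither.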


(we output a vzeroupper for the testcase, that seems unnecessary)

Bootstrap+testsuite on x86_64-linux-gnu.

2014-06-22  Marc Glisse  marc.gli...@inria.fr

PR target/44551
gcc/
* simplify-rtx.c (simplify_binary_operation_1) VEC_SELECT:
Optimize inverse of a VEC_CONCAT.
gcc/testsuite/
* gcc.target/i386/pr44551-1.c: New file.

--
Marc Glisse
Index: gcc/simplify-rtx.c
===
--- gcc/simplify-rtx.c  (revision 211867)
+++ gcc/simplify-rtx.c  (working copy)
@@ -3359,20 +3359,64 @@ simplify_binary_operation_1 (enum rtx_co
  unsigned int i0 = INTVAL (XVECEXP (trueop1, 0, 0));
  unsigned int i1 = INTVAL (XVECEXP (trueop1, 0, 1));
  rtx subop0, subop1;
 
  gcc_assert (i0 < 2 && i1 < 2);
  subop0 = XEXP (trueop0, i0);
  subop1 = XEXP (trueop0, i1);
 
  return simplify_gen_binary (VEC_CONCAT, mode, subop0, subop1);
}
+
+ /* If we select one half of a vec_concat, return that.  */
+ if (GET_CODE (trueop0) == VEC_CONCAT
+ && CONST_INT_P (XVECEXP (trueop1, 0, 0)))
+   {
+ rtx subop0 = XEXP (trueop0, 0);
+ rtx subop1 = XEXP (trueop0, 1);
+ enum machine_mode mode0 = GET_MODE (subop0);
+ enum machine_mode mode1 = GET_MODE (subop1);
+ int li = GET_MODE_SIZE (GET_MODE_INNER (mode0));
+ int l0 = GET_MODE_SIZE (mode0) / li;
+ int l1 = GET_MODE_SIZE (mode1) / li;
+ int i0 = INTVAL (XVECEXP (trueop1, 0, 0));
+ if (i0 == 0 && !side_effects_p (op1) && mode == mode0)
+   {
+ bool success = true;
+ for (int i = 1; i < l0; ++i)
+   {
+ rtx j = XVECEXP (trueop1, 0, i);
+ if (!CONST_INT_P (j) || INTVAL (j) != i)
+   {
+ success = false;
+ break;
+   }
+   }
+ if (success)
+   return subop0;
+   }
+ if (i0 == l0 && !side_effects_p (op0) && mode == mode1)
+   {
+ bool success = true;
+ for (int i = 1; i < l1; ++i)
+   {
+ rtx j = XVECEXP (trueop1, 0, i);
+ if (!CONST_INT_P (j) || INTVAL (j) != i0 + i)
+   {
+ success = false;
+ break;
+   }
+   }
+ if (success)
+   return subop1;
+   }
+   }
}
 
   if (XVECLEN (trueop1, 0) == 1
   && CONST_INT_P (XVECEXP (trueop1, 0, 0))
   && GET_CODE (trueop0) == VEC_CONCAT)
{
  rtx vec = trueop0;
  int offset = INTVAL (XVECEXP (trueop1, 0, 0)) * GET_MODE_SIZE (mode);
 
  /* Try to find the element in the VEC_CONCAT.  */
Index: gcc/testsuite/gcc.target/i386/pr44551-1.c
===
--- gcc/testsuite/gcc.target/i386/pr44551-1.c   (revision 0)
+++ gcc/testsuite/gcc.target/i386/pr44551-1.c   (working copy)
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx" } */
+
+#include <immintrin.h>
+
+__m128i
+foo (__m256i x, __m128i y)
+{
+  __m256i r = _mm256_insertf128_si256(x, y, 1);
+  __m128i a = _mm256_extractf128_si256(r, 1);
+  return a;
+}
+
+/* { dg-final { scan-assembler-not "vinsertf" } } */
+/* { dg-final { scan-assembler-not "vextractf" } } */


Re: [patch] change specific int128 -> generic intN

2014-06-21 Thread Joseph S. Myers
The changes to dwarf2asm.c, cppbuiltin.c, optabs.c, defaults.h, expr.c, 
expmed.c, tree-dfa.c, simplify-rtx.c, lto-object.c, loop-iv.c, varasm.c, 
the msp430 back end and some of the stor-layout.c changes don't look like 
they should depend on the rest of the patch.  I think it would help review 
if anything that can reasonably be separated from the main intN support is 
posted separately, as a much smaller patch with its own self-contained 
rationale (I presume all those changes should work fine without the main 
intN support), and then the intN patch only contains things directly 
related to intN support.

-- 
Joseph S. Myers
jos...@codesourcery.com


[PATCH][BUILDROBOT] nios2: Include builtins.h

2014-06-21 Thread Jan-Benedict Glaw
Hi!

The nios2 backend was forgotten; it also needs to include builtins.h:

g++ -c   -g -O2 -DIN_GCC  -DCROSS_DIRECTORY_STRUCTURE  -fno-exceptions 
-fno-rtti -fasynchronous-unwind-tables -W -Wall -Wwrite-strings -Wcast-qual 
-Wmissing-format-attribute -Woverloaded-virtual -pedantic -Wno-long-long 
-Wno-variadic-macros -Wno-overlength-strings -fno-common  -DHAVE_CONFIG_H -I. 
-I. -I/home/jbglaw/repos/gcc/gcc -I/home/jbglaw/repos/gcc/gcc/. 
-I/home/jbglaw/repos/gcc/gcc/../include 
-I/home/jbglaw/repos/gcc/gcc/../libcpp/include -I/opt/cfarm/gmp-latest/include 
-I/opt/cfarm/mpfr-latest/include -I/opt/cfarm/mpc-latest/include  
-I/home/jbglaw/repos/gcc/gcc/../libdecnumber 
-I/home/jbglaw/repos/gcc/gcc/../libdecnumber/dpd -I../libdecnumber 
-I/home/jbglaw/repos/gcc/gcc/../libbacktrace -o nios2.o -MT nios2.o -MMD -MP 
-MF ./.deps/nios2.TPo /home/jbglaw/repos/gcc/gcc/config/nios2/nios2.c
/home/jbglaw/repos/gcc/gcc/config/nios2/nios2.c: In function ‘tree_node* 
nios2_merge_decl_attributes(tree_node*, tree_node*)’:
/home/jbglaw/repos/gcc/gcc/config/nios2/nios2.c:3225: warning: unknown 
conversion type character ‘E’ in format
/home/jbglaw/repos/gcc/gcc/config/nios2/nios2.c:3225: warning: format ‘%qs’ 
expects type ‘char*’, but argument 2 has type ‘tree_node*’
/home/jbglaw/repos/gcc/gcc/config/nios2/nios2.c:3225: warning: too many 
arguments for format
/home/jbglaw/repos/gcc/gcc/config/nios2/nios2.c: At global scope:
/home/jbglaw/repos/gcc/gcc/config/nios2/nios2.c:3349: error: 
‘std_build_builtin_va_list’ was not declared in this scope
/home/jbglaw/repos/gcc/gcc/config/nios2/nios2.c:3349: error: 
‘std_fn_abi_va_list’ was not declared in this scope
/home/jbglaw/repos/gcc/gcc/config/nios2/nios2.c:3349: error: 
‘std_canonical_va_list_type’ was not declared in this scope
/home/jbglaw/repos/gcc/gcc/config/nios2/nios2.c:3349: error: too many 
initializers for ‘gcc_target’
make[1]: *** [nios2.o] Error 1

Committed as obvious.

MfG, JBG



2014-06-21  Jan-Benedict Glaw  jbg...@lug-owl.de

gcc/
* config/nios2/nios2.c: Include builtins.h.




diff --git a/gcc/config/nios2/nios2.c b/gcc/config/nios2/nios2.c
index 354e3d9..a4e60c6 100644
--- a/gcc/config/nios2/nios2.c
+++ b/gcc/config/nios2/nios2.c
@@ -52,6 +52,7 @@
 #include "stor-layout.h"
 #include "varasm.h"
 #include "calls.h"
+#include "builtins.h"
 
 /* Forward function declarations.  */
 static bool prologue_saved_reg_p (unsigned);


-- 
  Jan-Benedict Glaw  jbg...@lug-owl.de  +49-172-7608481
Signature of: They that give up essential liberty to obtain temporary safety,
the second  : deserve neither liberty nor safety.  (Ben Franklin)




Re: Another AIX Bootstrap failure

2014-06-21 Thread David Edelsohn
 Index: testsuite/gcc.dg/localalias.c
 ===
 --- testsuite/gcc.dg/localalias.c   (revision 0)
 +++ testsuite/gcc.dg/localalias.c   (revision 0)
 @@ -0,0 +1,42 @@
 +/* This test checks that local aliases behave sanely.  This is necessary for 
 code correctness
 +   of aliases introduced by ipa-visibility pass.
 +
 +   If this test fails either aliases needs to be disabled on given target on 
 aliases with
 +   proper semantic needs to be implemented.  This is problem with e.g. AIX 
 .set pseudo-op
 +   that implementes alias syntactically (by substituting in assembler) 
 rather as alternative
 +   symbol defined on a target's location.  */
 +
 +/* { dg-do run }
 +   { dg-options "-Wstrict-aliasing=2 -fstrict-aliasing" }
 +   { dg-require-alias "" }
 +   { dg-xfail-if "" { powerpc-ibm-aix* } { "*" } { "" } }
 +   { dg-additional-sources "localalias-2.c" } */
 +extern void abort (void);
 +extern int test2count;
 +int testcount;
 +__attribute__ ((weak,noinline))
 +void test(void)
 +{
 +  testcount++;
 +}
 +__attribute ((alias("test")))
 +static void test2(void);
 +
 +void main()
 +{
 +  test2();
 +  /* This call must bind locally.  */
 +  if (!testcount)
 +abort ();
 +  test();
 +  /* Depending on linker choice, this one may bind locally
 + or to the other unit.  */
 +  if (!testcount && !test2count)
 +abort();
 +  tt();
 +
 +  if ((testcount != 1 || test2count != 3)
 +   && (testcount != 3 || test2count != 1))
 +abort ();
 +  reutrn 0;
^ typo
 +}

return 0;

You probably should run the testcases before committing them.

Thanks, David


Re: Another AIX Bootstrap failure

2014-06-21 Thread Jan Hubicka
  +  /* Depending on linker choice, this one may bind locally
  + or to the other unit.  */
  +  if (!testcount && !test2count)
  +abort();
  +  tt();
  +
  +  if ((testcount != 1 || test2count != 3)
  +   && (testcount != 3 || test2count != 1))
  +abort ();
  +  reutrn 0;
 ^ typo
  +}
 
 return 0;
 
 You probably should run the testcases before committing them.

Uhm, sorry. I must have messed up testing.
I committed the fix.

Honza


[PATCH 1/6] rs6000: Remove O alternative from lshrsi3

2014-06-21 Thread Segher Boessenkool
Nothing will ever generate RTL matching this alternative.  Maybe long
ago this was needed, but not anymore.

Bootstrapped and tested on powerpc64-linux, {-m64,-m64/-mtune=power8,
-m32,-m32/-mpowerpc64}, no regressions.  Okay to apply?


Segher


2014-06-21  Segher Boessenkool  seg...@kernel.crashing.org

gcc/
* config/rs6000/rs6000.md (lshrsi3, and its two dot patterns):
Remove O alternative.

---
 gcc/config/rs6000/rs6000.md | 43 +++
 1 file changed, 19 insertions(+), 24 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index c6e85b3..9d92d8f 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -4537,16 +4537,15 @@ (define_split
   "")
 
 (define_insn "lshrsi3"
-  [(set (match_operand:SI 0 "gpc_reg_operand" "=r,r,r")
-   (lshiftrt:SI (match_operand:SI 1 "gpc_reg_operand" "r,r,r")
-(match_operand:SI 2 "reg_or_cint_operand" "O,r,i")))]
+  [(set (match_operand:SI 0 "gpc_reg_operand" "=r,r")
+   (lshiftrt:SI (match_operand:SI 1 "gpc_reg_operand" "r,r")
+(match_operand:SI 2 "reg_or_cint_operand" "r,i")))]
   ""
   "@
-  mr %0,%1
   srw %0,%1,%2
   srwi %0,%1,%h2"
-  [(set_attr "type" "integer,shift,shift")
-   (set_attr "var_shift" "no,yes,no")])
+  [(set_attr "type" "shift")
+   (set_attr "var_shift" "yes,no")])
 
 (define_insn *lshrsi3_64
   [(set (match_operand:DI 0 gpc_reg_operand =r,r)
@@ -4561,23 +4560,21 @@ (define_insn *lshrsi3_64
(set_attr var_shift yes,no)])
 
 (define_insn 
-  [(set (match_operand:CC 0 cc_reg_operand =x,x,x,?y,?y,?y)
-   (compare:CC (lshiftrt:SI (match_operand:SI 1 gpc_reg_operand 
r,r,r,r,r,r)
-(match_operand:SI 2 reg_or_cint_operand 
O,r,i,O,r,i))
+  [(set (match_operand:CC 0 cc_reg_operand =x,x,?y,?y)
+   (compare:CC (lshiftrt:SI (match_operand:SI 1 gpc_reg_operand 
r,r,r,r)
+(match_operand:SI 2 reg_or_cint_operand 
r,i,r,i))
(const_int 0)))
-   (clobber (match_scratch:SI 3 =X,r,r,X,r,r))]
+   (clobber (match_scratch:SI 3 =r,r,r,r))]
   TARGET_32BIT
   @
-   mr. %1,%1
srw. %3,%1,%2
srwi. %3,%1,%h2
#
-   #
#
-  [(set_attr type logical,shift,shift,shift,shift,shift)
-   (set_attr var_shift no,yes,no,no,yes,no)
+  [(set_attr type shift)
+   (set_attr var_shift yes,no,yes,no)
(set_attr dot yes)
-   (set_attr length 4,4,4,8,8,8)])
+   (set_attr length 4,4,8,8)])
 
 (define_split
   [(set (match_operand:CC 0 cc_reg_not_cr0_operand )
@@ -4594,24 +4591,22 @@ (define_split
   )
 
 (define_insn 
-  [(set (match_operand:CC 3 cc_reg_operand =x,x,x,?y,?y,?y)
-   (compare:CC (lshiftrt:SI (match_operand:SI 1 gpc_reg_operand 
r,r,r,r,r,r)
-(match_operand:SI 2 reg_or_cint_operand 
O,r,i,O,r,i))
+  [(set (match_operand:CC 3 cc_reg_operand =x,x,?y,?y)
+   (compare:CC (lshiftrt:SI (match_operand:SI 1 gpc_reg_operand 
r,r,r,r)
+(match_operand:SI 2 reg_or_cint_operand 
r,i,r,i))
(const_int 0)))
-   (set (match_operand:SI 0 gpc_reg_operand =r,r,r,r,r,r)
+   (set (match_operand:SI 0 gpc_reg_operand =r,r,r,r)
(lshiftrt:SI (match_dup 1) (match_dup 2)))]
   TARGET_32BIT
   @
-   mr. %0,%1
srw. %0,%1,%2
srwi. %0,%1,%h2
#
-   #
#
-  [(set_attr type logical,shift,shift,shift,shift,shift)
-   (set_attr var_shift no,yes,no,no,yes,no)
+  [(set_attr type shift)
+   (set_attr var_shift yes,no,yes,no)
(set_attr dot yes)
-   (set_attr length 4,4,4,8,8,8)])
+   (set_attr length 4,4,8,8)])
 
 (define_split
   [(set (match_operand:CC 3 cc_reg_not_cr0_operand )
-- 
1.8.1.4



[PATCH 3/6] rs6000: Merge ashlsi3 and ashldi3

2014-06-21 Thread Segher Boessenkool
As the previous patch.

Bootstrapped and tested on powerpc64-linux, {-m64,-m64/-mtune=power8,
-m32,-m32/-mpowerpc64}, no regressions.  Okay to apply?


Segher


gcc/
* config/rs6000/rs6000.md (ashlsi3, two anonymous define_insns
and define_splits, ashldi3, *ashldi3_internal1, *ashldi3_internal2
and split, *ashldi3_internal3 and split): Delete, merge into...
(ashl<mode>3, ashl<mode>3_dot, ashl<mode>3_dot2): New.
(*ashlsi3_64): Fix formatting.  Replace "i" by "n".

---
 gcc/config/rs6000/rs6000.md | 177 +++-
 1 file changed, 43 insertions(+), 134 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index f023162..77c2161 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -4376,22 +4376,23 @@ (define_split
(const_int 0)))]
   )
 
-(define_insn ashlsi3
-  [(set (match_operand:SI 0 gpc_reg_operand =r,r)
-   (ashift:SI (match_operand:SI 1 gpc_reg_operand r,r)
-  (match_operand:SI 2 reg_or_cint_operand r,i)))]
+
+(define_insn ashlmode3
+  [(set (match_operand:GPR 0 gpc_reg_operand =r,r)
+   (ashift:GPR (match_operand:GPR 1 gpc_reg_operand r,r)
+   (match_operand:GPR 2 reg_or_cint_operand r,n)))]
   
   @
-   slw %0,%1,%2
-   slwi %0,%1,%h2
+   slwd %0,%1,%2
+   slwdi %0,%1,%hH2
   [(set_attr type shift)
(set_attr var_shift yes,no)])
 
 (define_insn *ashlsi3_64
   [(set (match_operand:DI 0 gpc_reg_operand =r,r)
-   (zero_extend:DI
+   (zero_extend:DI
(ashift:SI (match_operand:SI 1 gpc_reg_operand r,r)
-  (match_operand:SI 2 reg_or_cint_operand r,i]
+  (match_operand:SI 2 reg_or_cint_operand r,n]
   TARGET_POWERPC64
   @
slw %0,%1,%2
@@ -4399,69 +4400,58 @@ (define_insn *ashlsi3_64
   [(set_attr type shift)
(set_attr var_shift yes,no)])
 
-(define_insn 
-  [(set (match_operand:CC 0 cc_reg_operand =x,x,?y,?y)
-   (compare:CC (ashift:SI (match_operand:SI 1 gpc_reg_operand r,r,r,r)
-  (match_operand:SI 2 reg_or_cint_operand 
r,i,r,i))
+(define_insn_and_split *ashlmode3_dot
+  [(set (match_operand:CC 3 cc_reg_operand =x,x,?y,?y)
+   (compare:CC (ashift:GPR (match_operand:GPR 1 gpc_reg_operand 
r,r,r,r)
+   (match_operand:GPR 2 reg_or_cint_operand 
r,n,r,n))
(const_int 0)))
-   (clobber (match_scratch:SI 3 =r,r,r,r))]
-  TARGET_32BIT
+   (clobber (match_scratch:GPR 0 =r,r,r,r))]
+  MODEmode == Pmode  rs6000_gen_cell_microcode
   @
-   slw. %3,%1,%2
-   slwi. %3,%1,%h2
+   slwd. %0,%1,%2
+   slwdi. %0,%1,%hH2
#
#
+   reload_completed
+  [(set (match_dup 0)
+   (ashift:GPR (match_dup 1)
+   (match_dup 2)))
+   (set (match_dup 3)
+   (compare:CC (match_dup 0)
+   (const_int 0)))]
+  
   [(set_attr type shift)
(set_attr var_shift yes,no,yes,no)
(set_attr dot yes)
(set_attr length 4,4,8,8)])
 
-(define_split
-  [(set (match_operand:CC 0 cc_reg_not_cr0_operand )
-   (compare:CC (ashift:SI (match_operand:SI 1 gpc_reg_operand )
-  (match_operand:SI 2 reg_or_cint_operand ))
-   (const_int 0)))
-   (clobber (match_scratch:SI 3 ))]
-  TARGET_32BIT  reload_completed
-  [(set (match_dup 3)
-   (ashift:SI (match_dup 1) (match_dup 2)))
-   (set (match_dup 0)
-   (compare:CC (match_dup 3)
-   (const_int 0)))]
-  )
-
-(define_insn 
+(define_insn_and_split *ashlmode3_dot2
   [(set (match_operand:CC 3 cc_reg_operand =x,x,?y,?y)
-   (compare:CC (ashift:SI (match_operand:SI 1 gpc_reg_operand r,r,r,r)
-  (match_operand:SI 2 reg_or_cint_operand 
r,i,r,i))
+   (compare:CC (ashift:GPR (match_operand:GPR 1 gpc_reg_operand 
r,r,r,r)
+   (match_operand:GPR 2 reg_or_cint_operand 
r,n,r,n))
(const_int 0)))
-   (set (match_operand:SI 0 gpc_reg_operand =r,r,r,r)
-   (ashift:SI (match_dup 1) (match_dup 2)))]
-  TARGET_32BIT
+   (set (match_operand:GPR 0 gpc_reg_operand =r,r,r,r)
+   (ashift:GPR (match_dup 1)
+   (match_dup 2)))]
+  MODEmode == Pmode  rs6000_gen_cell_microcode
   @
-   slw. %0,%1,%2
-   slwi. %0,%1,%h2
+   slwd. %0,%1,%2
+   slwdi. %0,%1,%hH2
#
#
+   reload_completed
+  [(set (match_dup 0)
+   (ashift:GPR (match_dup 1)
+   (match_dup 2)))
+   (set (match_dup 3)
+   (compare:CC (match_dup 0)
+   (const_int 0)))]
+  
   [(set_attr type shift)
(set_attr var_shift yes,no,yes,no)
(set_attr dot yes)
(set_attr length 4,4,8,8)])
 
-(define_split
-  [(set (match_operand:CC 3 cc_reg_not_cr0_operand )
-   (compare:CC (ashift:SI (match_operand:SI 1 gpc_reg_operand )
-  (match_operand:SI 2 reg_or_cint_operand ))
-   (const_int 0)))
-   (set (match_operand:SI 0 

[PATCH 4/6] rs6000: Merge rotlsi3 and rotldi3

2014-06-21 Thread Segher Boessenkool
This uses the rotl* extended mnemonics instead of the rlw*nm and rld*cl
mnemonics, because they are shorter and more importantly they look the
same for 32-bit and 64-bit.
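
For readers less familiar with these mnemonics, the operation being emitted
is a plain rotate left. The sketch below shows it in portable C; it assumes,
as I understand the Power ISA, that rotlw/rotlwi are shorthand for
rlwnm/rlwinm with a full mask, and that the 64-bit forms behave the same way
on doublewords. The function names rotl32 and rotl64 are illustrative only.

```c
#include <stdint.h>

/* Rotate a 32-bit value left by n bits; only the low five bits of n are
   used, which is how a word rotate treats its amount.  */
static uint32_t
rotl32 (uint32_t x, unsigned n)
{
  n &= 31;
  /* Avoid the undefined shift by 32 when n is zero.  */
  return n ? (x << n) | (x >> (32 - n)) : x;
}

/* Rotate a 64-bit value left by n bits; only the low six bits of n are
   used.  */
static uint64_t
rotl64 (uint64_t x, unsigned n)
{
  n &= 63;
  return n ? (x << n) | (x >> (64 - n)) : x;
}
```

The two functions differ only in width and mask, which is the same symmetry
the shared mnemonic spelling exploits in the merged pattern.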

Bootstrapped and tested on powerpc64-linux, {-m64,-m64/-mtune=power8,
-m32,-m32/-mpowerpc64}, no regressions.  Okay to apply?


Segher


gcc/
* config/rs6000/rs6000.md (rotlsi3, *rotlsi3_internal2 and split,
*rotlsi3_internal3 and split, rotldi3, *rotldi3_internal2 and split,
*rotldi3_internal3 and split): Delete, merge into...
(rotl<mode>3, rotl<mode>3_dot, rotl<mode>3_dot2): New.
(*rotlsi3_64): Fix formatting.  Fix condition.  Replace "i" by "n".
Use rotlw extended mnemonic.

---
 gcc/config/rs6000/rs6000.md | 175 
 1 file changed, 45 insertions(+), 130 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 77c2161..665fced 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -3853,92 +3853,82 @@ (define_insn *extzvdi_internal2
   [(set_attr type shift)
(set_attr dot yes)])
 
-(define_insn rotlsi3
-  [(set (match_operand:SI 0 gpc_reg_operand =r,r)
-   (rotate:SI (match_operand:SI 1 gpc_reg_operand r,r)
-  (match_operand:SI 2 reg_or_cint_operand r,i)))]
+
+(define_insn rotlmode3
+  [(set (match_operand:GPR 0 gpc_reg_operand =r,r)
+   (rotate:GPR (match_operand:GPR 1 gpc_reg_operand r,r)
+   (match_operand:GPR 2 reg_or_cint_operand r,n)))]
   
   @
-   rlwnm %0,%1,%2,0x
-   rlwinm %0,%1,%h2,0x
+   rotlwd %0,%1,%2
+   rotlwdi %0,%1,%hH2
   [(set_attr type shift)
(set_attr var_shift yes,no)])
 
 (define_insn *rotlsi3_64
   [(set (match_operand:DI 0 gpc_reg_operand =r,r)
-   (zero_extend:DI
+   (zero_extend:DI
(rotate:SI (match_operand:SI 1 gpc_reg_operand r,r)
-  (match_operand:SI 2 reg_or_cint_operand r,i]
-  TARGET_64BIT
+  (match_operand:SI 2 reg_or_cint_operand r,n]
+  TARGET_POWERPC64
   @
-   rlwnm %0,%1,%2,0x
-   rlwinm %0,%1,%h2,0x
+   rotlw %0,%1,%2
+   rotlwi %0,%1,%h2
   [(set_attr type shift)
(set_attr var_shift yes,no)])
 
-(define_insn *rotlsi3_internal2
-  [(set (match_operand:CC 0 cc_reg_operand =x,x,?y,?y)
-   (compare:CC (rotate:SI (match_operand:SI 1 gpc_reg_operand r,r,r,r)
-  (match_operand:SI 2 reg_or_cint_operand 
r,i,r,i))
+(define_insn_and_split *rotlmode3_dot
+  [(set (match_operand:CC 3 cc_reg_operand =x,x,?y,?y)
+   (compare:CC (rotate:GPR (match_operand:GPR 1 gpc_reg_operand 
r,r,r,r)
+   (match_operand:GPR 2 reg_or_cint_operand 
r,n,r,n))
(const_int 0)))
-   (clobber (match_scratch:SI 3 =r,r,r,r))]
-  
+   (clobber (match_scratch:GPR 0 =r,r,r,r))]
+  MODEmode == Pmode  rs6000_gen_cell_microcode
   @
-   rlwnm. %3,%1,%2,0x
-   rlwinm. %3,%1,%h2,0x
+   rotlwd. %0,%1,%2
+   rotlwdi. %0,%1,%hH2
#
#
+   reload_completed
+  [(set (match_dup 0)
+   (rotate:GPR (match_dup 1)
+   (match_dup 2)))
+   (set (match_dup 3)
+   (compare:CC (match_dup 0)
+   (const_int 0)))]
+  
   [(set_attr type shift)
(set_attr var_shift yes,no,yes,no)
(set_attr dot yes)
(set_attr length 4,4,8,8)])
 
-(define_split
-  [(set (match_operand:CC 0 cc_reg_not_micro_cr0_operand )
-   (compare:CC (rotate:SI (match_operand:SI 1 gpc_reg_operand )
-  (match_operand:SI 2 reg_or_cint_operand ))
-   (const_int 0)))
-   (clobber (match_scratch:SI 3 ))]
-  reload_completed
-  [(set (match_dup 3)
-   (rotate:SI (match_dup 1) (match_dup 2)))
-   (set (match_dup 0)
-   (compare:CC (match_dup 3)
-   (const_int 0)))]
-  )
-
-(define_insn *rotlsi3_internal3
+(define_insn_and_split *rotlmode3_dot2
   [(set (match_operand:CC 3 cc_reg_operand =x,x,?y,?y)
-   (compare:CC (rotate:SI (match_operand:SI 1 gpc_reg_operand r,r,r,r)
-  (match_operand:SI 2 reg_or_cint_operand 
r,i,r,i))
+   (compare:CC (rotate:GPR (match_operand:GPR 1 gpc_reg_operand 
r,r,r,r)
+   (match_operand:GPR 2 reg_or_cint_operand 
r,n,r,n))
(const_int 0)))
-   (set (match_operand:SI 0 gpc_reg_operand =r,r,r,r)
-   (rotate:SI (match_dup 1) (match_dup 2)))]
-  
+   (set (match_operand:GPR 0 gpc_reg_operand =r,r,r,r)
+   (rotate:GPR (match_dup 1)
+   (match_dup 2)))]
+  MODEmode == Pmode  rs6000_gen_cell_microcode
   @
-   rlwnm. %0,%1,%2,0x
-   rlwinm. %0,%1,%h2,0x
+   rotlwd. %0,%1,%2
+   rotlwdi. %0,%1,%hH2
#
#
+   reload_completed
+  [(set (match_dup 0)
+   (rotate:GPR (match_dup 1)
+   (match_dup 2)))
+   (set (match_dup 3)
+   (compare:CC (match_dup 0)
+   (const_int 0)))]
+  
   [(set_attr type shift)

[PATCH 2/6] rs6000: Merge lshrsi3 and lshrdi3

2014-06-21 Thread Segher Boessenkool
For this create a new mode_attr hH.

Also change "i" constraints on the shift amount to "n", which better
describes what it really is (GCC takes the integer value of these
operands and does arithmetic on them; symbolic constants will not work
here).
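
As an illustration of why a numeric constant is needed, the sketch below
models what the per-mode output modifier does with the shift amount. It
assumes that the "h" modifier prints the low five bits of the constant and
"H" the low six bits; that is my reading of the rs6000 operand printer, not
something stated in this patch. The function name shift_amount_field and its
parameters are hypothetical.

```c
/* Hypothetical model of the per-mode shift-amount modifier: a 32-bit
   (SImode) shift keeps five bits of the constant, a 64-bit (DImode)
   shift keeps six.  The arithmetic only makes sense on an actual
   integer value, not on a symbolic constant.  */
static unsigned
shift_amount_field (long value, int mode_bits)
{
  unsigned mask = mode_bits == 64 ? 63u : 31u;
  return (unsigned) value & mask;
}
```

With this model, an amount of 33 becomes 1 for a 32-bit shift but stays 33
for a 64-bit shift.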

Also merge the dot insns with the corresponding splitters.  To do
this, don't allow the dot insns for CBE non-microcode mode at all
(it previously would just split it back always).

Bootstrapped and tested on powerpc64-linux, {-m64,-m64/-mtune=power8,
-m32,-m32/-mpowerpc64}, no regressions.  Okay to apply?


Segher


2014-06-21  Segher Boessenkool  seg...@kernel.crashing.org

gcc/
* config/rs6000/rs6000.md (hH): New define_mode_attr.
(lshrsi3, two anonymous define_insns and define_splits,
lshrdi3, *lshrdi3_internal1, *lshrdi3_internal2 and split,
*lshrdi3_internal3 and split): Delete, merge into...
(lshr<mode>3, lshr<mode>3_dot, lshr<mode>3_dot2): New.
(*lshrsi3_64): Fix formatting.  Replace "i" by "n".

---
 gcc/config/rs6000/rs6000.md | 183 
 1 file changed, 47 insertions(+), 136 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 9d92d8f..f023162 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -438,6 +438,9 @@ (define_mode_attr dbits [(QI "56") (HI "48") (SI "32")])
 ;; ISEL/ISEL64 target selection
 (define_mode_attr sel [(SI "") (DI "64")])
 
+;; Bitmask for shift instructions
+(define_mode_attr hH [(SI "h") (DI "H")])
+
 ;; Suffix for reload patterns
 (define_mode_attr ptrsize [(SI "32bit")
    (DI "64bit")])
@@ -4536,92 +4539,82 @@ (define_split
(const_int 0)))]
   )
 
-(define_insn lshrsi3
-  [(set (match_operand:SI 0 gpc_reg_operand =r,r)
-   (lshiftrt:SI (match_operand:SI 1 gpc_reg_operand r,r)
-(match_operand:SI 2 reg_or_cint_operand r,i)))]
+
+(define_insn lshrmode3
+  [(set (match_operand:GPR 0 gpc_reg_operand =r,r)
+   (lshiftrt:GPR (match_operand:GPR 1 gpc_reg_operand r,r)
+ (match_operand:GPR 2 reg_or_cint_operand r,n)))]
   
   @
-  srw %0,%1,%2
-  srwi %0,%1,%h2
+   srwd %0,%1,%2
+   srwdi %0,%1,%hH2
   [(set_attr type shift)
(set_attr var_shift yes,no)])
 
 (define_insn *lshrsi3_64
   [(set (match_operand:DI 0 gpc_reg_operand =r,r)
-   (zero_extend:DI
+   (zero_extend:DI
(lshiftrt:SI (match_operand:SI 1 gpc_reg_operand r,r)
-(match_operand:SI 2 reg_or_cint_operand r,i]
+(match_operand:SI 2 reg_or_cint_operand r,n]
   TARGET_POWERPC64
   @
-  srw %0,%1,%2
-  srwi %0,%1,%h2
+   srw %0,%1,%2
+   srwi %0,%1,%h2
   [(set_attr type shift)
(set_attr var_shift yes,no)])
 
-(define_insn 
-  [(set (match_operand:CC 0 cc_reg_operand =x,x,?y,?y)
-   (compare:CC (lshiftrt:SI (match_operand:SI 1 gpc_reg_operand 
r,r,r,r)
-(match_operand:SI 2 reg_or_cint_operand 
r,i,r,i))
+(define_insn_and_split *lshrmode3_dot
+  [(set (match_operand:CC 3 cc_reg_operand =x,x,?y,?y)
+   (compare:CC (lshiftrt:GPR (match_operand:GPR 1 gpc_reg_operand 
r,r,r,r)
+ (match_operand:GPR 2 reg_or_cint_operand 
r,n,r,n))
(const_int 0)))
-   (clobber (match_scratch:SI 3 =r,r,r,r))]
-  TARGET_32BIT
+   (clobber (match_scratch:GPR 0 =r,r,r,r))]
+  MODEmode == Pmode  rs6000_gen_cell_microcode
   @
-   srw. %3,%1,%2
-   srwi. %3,%1,%h2
+   srwd. %0,%1,%2
+   srwdi. %0,%1,%hH2
#
#
+   reload_completed
+  [(set (match_dup 0)
+   (lshiftrt:GPR (match_dup 1)
+ (match_dup 2)))
+   (set (match_dup 3)
+   (compare:CC (match_dup 0)
+   (const_int 0)))]
+  
   [(set_attr type shift)
(set_attr var_shift yes,no,yes,no)
(set_attr dot yes)
(set_attr length 4,4,8,8)])
 
-(define_split
-  [(set (match_operand:CC 0 cc_reg_not_cr0_operand )
-   (compare:CC (lshiftrt:SI (match_operand:SI 1 gpc_reg_operand )
-(match_operand:SI 2 reg_or_cint_operand ))
-   (const_int 0)))
-   (clobber (match_scratch:SI 3 ))]
-  TARGET_32BIT  reload_completed
-  [(set (match_dup 3)
-   (lshiftrt:SI (match_dup 1) (match_dup 2)))
-   (set (match_dup 0)
-   (compare:CC (match_dup 3)
-   (const_int 0)))]
-  )
-
-(define_insn 
+(define_insn_and_split *lshrmode3_dot2
   [(set (match_operand:CC 3 cc_reg_operand =x,x,?y,?y)
-   (compare:CC (lshiftrt:SI (match_operand:SI 1 gpc_reg_operand 
r,r,r,r)
-(match_operand:SI 2 reg_or_cint_operand 
r,i,r,i))
+   (compare:CC (lshiftrt:GPR (match_operand:GPR 1 gpc_reg_operand 
r,r,r,r)
+ (match_operand:GPR 2 reg_or_cint_operand 
r,n,r,n))
(const_int 0)))
-   (set (match_operand:SI 0 gpc_reg_operand =r,r,r,r)
-   (lshiftrt:SI (match_dup 1) (match_dup 2)))]
-  TARGET_32BIT
+   

[PATCH 6/6] rs6000: Merge the var_shift yes/no alternatives

2014-06-21 Thread Segher Boessenkool
All instructions that are var_shift for some alternative have the shift
amount as operands[2].

This patch introduces an attribute "maybe_var_shift".  If that is set to
"yes", the default value of "var_shift" is set based on the operands[2]
value.
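
The defaulting rule can be sketched in a few lines of C. This is only a
model of the attribute logic described above, not GCC code; the struct
insn_model, the enum operand_kind and the function default_var_shift are
hypothetical names introduced for illustration.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the properties the attribute inspects.  */
enum operand_kind { OPERAND_REGISTER, OPERAND_CONST_INT };

struct insn_model
{
  bool type_is_shift;          /* the "type" attribute is "shift" */
  bool maybe_var_shift;        /* the new attribute is "yes" */
  enum operand_kind operand2;  /* what operands[2] turned out to be */
};

/* Default for var_shift: true only for a shift insn that opted in via
   maybe_var_shift and whose operands[2] is a register.  */
static bool
default_var_shift (const struct insn_model *insn)
{
  if (insn->type_is_shift && insn->maybe_var_shift)
    return insn->operand2 == OPERAND_REGISTER;
  return false;
}
```

Because the answer now comes from the operand itself, a single alternative
that accepts either a register or an integer no longer needs a hand-written
per-alternative yes/no list.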

With that, we can merge the var_shift yes/no cases everywhere.  Do so.

Also change some more "i" to "n".

Bootstrapped and tested on powerpc64-linux, {-m64,-m64/-mtune=power8,
-m32,-m32/-mpowerpc64}, no regressions.  Okay to apply?


Segher


2014-06-21  Segher Boessenkool  seg...@kernel.crashing.org

gcc/
* config/rs6000/rs6000.md (maybe_var_shift): New define_attr.
(var_shift): Use it.
(rotl<mode>3, *rotlsi3_64, *rotl<mode>3_dot, *rotl<mode>3_dot2,
*rotlsi3_internal4, *rotlsi3_internal5, *rotlsi3_internal6,
*rotlsi3_internal8le, *rotlsi3_internal8be, *rotlsi3_internal9le,
*rotlsi3_internal9be, *rotlsi3_internal10le, *rotlsi3_internal10be,
*rotlsi3_internal11le, *rotlsi3_internal11be, *rotlsi3_internal12le,
*rotlsi3_internal12be, ashl<mode>3, *ashlsi3_64, *ashl<mode>3_dot,
*ashl<mode>3_dot2, lshr<mode>3, *lshrsi3_64, *lshr<mode>3_dot,
*lshr<mode>3_dot2, *ashr<mode>3, *ashrsi3_64, *ashr<mode>3_dot,
*ashr<mode>3_dot2, *rotldi3_internal4, *rotldi3_internal5,
*rotldi3_internal6, *rotldi3_internal7le, *rotldi3_internal7be,
*rotldi3_internal8le, *rotldi3_internal8be, *rotldi3_internal9le,
*rotldi3_internal9be, *rotldi3_internal10le, *rotldi3_internal10be,
*rotldi3_internal11le, *rotldi3_internal11be, *rotldi3_internal12le,
*rotldi3_internal12be, *rotldi3_internal13le, *rotldi3_internal13be,
*rotldi3_internal14le, *rotldi3_internal14be, *rotldi3_internal15le,
*rotldi3_internal15be): Use the new attribute.  Merge register and
integer alternatives.

---
 gcc/config/rs6000/rs6000.md | 753 +++-
 1 file changed, 332 insertions(+), 421 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index d67b4e4..c716bae 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -205,9 +205,20 @@ (define_attr "update" "no,yes"
 (const_string "yes")
 (const_string "no")))
 
+;; Is this instruction using operands[2] as shift amount, and can that be a
+;; register?
+;; This is used for shift insns.
+(define_attr "maybe_var_shift" "no,yes" (const_string "no"))
+
 ;; Is this instruction using a shift amount from a register?
 ;; This is used for shift insns.
-(define_attr "var_shift" "no,yes" (const_string "no"))
+(define_attr "var_shift" "no,yes"
+  (if_then_else (and (eq_attr "type" "shift")
+(eq_attr "maybe_var_shift" "yes"))
+   (if_then_else (match_operand 2 "gpc_reg_operand")
+ (const_string "yes")
+ (const_string "no"))
+   (const_string "no")))
 
 ;; Define floating point instruction sub-types for use with Xfpu.md
 (define_attr "fp_type" "fp_default,fp_addsub_s,fp_addsub_d,fp_mul_s,fp_mul_d,fp_div_s,fp_div_d,fp_maddsub_s,fp_maddsub_d,fp_sqrt_s,fp_sqrt_d" (const_string "fp_default"))
@@ -3855,39 +3866,33 @@ (define_insn *extzvdi_internal2
 
 
 (define_insn rotlmode3
-  [(set (match_operand:GPR 0 gpc_reg_operand =r,r)
-   (rotate:GPR (match_operand:GPR 1 gpc_reg_operand r,r)
-   (match_operand:GPR 2 reg_or_cint_operand r,n)))]
+  [(set (match_operand:GPR 0 gpc_reg_operand =r)
+   (rotate:GPR (match_operand:GPR 1 gpc_reg_operand r)
+   (match_operand:GPR 2 reg_or_cint_operand rn)))]
   
-  @
-   rotlwd %0,%1,%2
-   rotlwdi %0,%1,%hH2
+  rotlwd%I2 %0,%1,%2
   [(set_attr type shift)
-   (set_attr var_shift yes,no)])
+   (set_attr maybe_var_shift yes)])
 
 (define_insn *rotlsi3_64
-  [(set (match_operand:DI 0 gpc_reg_operand =r,r)
+  [(set (match_operand:DI 0 gpc_reg_operand =r)
(zero_extend:DI
-   (rotate:SI (match_operand:SI 1 gpc_reg_operand r,r)
-  (match_operand:SI 2 reg_or_cint_operand r,n]
+   (rotate:SI (match_operand:SI 1 gpc_reg_operand r)
+  (match_operand:SI 2 reg_or_cint_operand rn]
   TARGET_POWERPC64
-  @
-   rotlw %0,%1,%2
-   rotlwi %0,%1,%h2
+  rotlw%I2 %0,%1,%h2
   [(set_attr type shift)
-   (set_attr var_shift yes,no)])
+   (set_attr maybe_var_shift yes)])
 
 (define_insn_and_split *rotlmode3_dot
-  [(set (match_operand:CC 3 cc_reg_operand =x,x,?y,?y)
-   (compare:CC (rotate:GPR (match_operand:GPR 1 gpc_reg_operand 
r,r,r,r)
-   (match_operand:GPR 2 reg_or_cint_operand 
r,n,r,n))
+  [(set (match_operand:CC 3 cc_reg_operand =x,?y)
+   (compare:CC (rotate:GPR (match_operand:GPR 1 gpc_reg_operand r,r)
+   (match_operand:GPR 2 reg_or_cint_operand 
rn,rn))
(const_int 0)))
-   (clobber (match_scratch:GPR 0 =r,r,r,r))]
+   (clobber (match_scratch:GPR 0 =r,r))]
   MODEmode == Pmode  

[PATCH 5/6] rs6000: Merge ashrsi3 and ashrdi3

2014-06-21 Thread Segher Boessenkool
The last (and ugliest) kind of shift.

Bootstrapped and tested on powerpc64-linux, {-m64,-m64/-mtune=power8,
-m32,-m32/-mpowerpc64}, no regressions.  Okay to apply?


Segher


2014-06-21  Segher Boessenkool  seg...@kernel.crashing.org

gcc/
* config/rs6000/rs6000.md (ashrsi3, two anonymous define_insns and 
define_splits,
ashrdi3, *ashrdi3_internal1, *ashrdi3_internal2 and split,
*ashrdi3_internal3 and split): Delete, merge into...
(ashr<mode>3): New expander.
(*ashr<mode>3, ashr<mode>3_dot, ashr<mode>3_dot2): New.
(*ashrsi3_64): Fix formatting.  Replace "i" by "n".

---
 gcc/config/rs6000/rs6000.md | 210 +---
 1 file changed, 63 insertions(+), 147 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 665fced..d67b4e4 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -5001,22 +5001,44 @@ (define_split
   )
 
 
-(define_insn ashrsi3
-  [(set (match_operand:SI 0 gpc_reg_operand =r,r)
-   (ashiftrt:SI (match_operand:SI 1 gpc_reg_operand r,r)
-(match_operand:SI 2 reg_or_cint_operand r,i)))]
+(define_expand ashrmode3
+  [(set (match_operand:GPR 0 gpc_reg_operand )
+   (ashiftrt:GPR (match_operand:GPR 1 gpc_reg_operand )
+ (match_operand:GPR 2 reg_or_cint_operand )))]
+  
+{
+  /* The generic code does not generate optimal code for the low word
+ (it should be a rlwimi and a rot).  Until we have target code to
+ solve this generically, keep this expander.  */
+
+  if (MODEmode == DImode  !TARGET_POWERPC64)
+{
+  if (CONST_INT_P (operands[2]))
+   {
+ emit_insn (gen_ashrdi3_no_power (operands[0], operands[1], 
operands[2]));
+ DONE;
+   }
+  else
+   FAIL;
+}
+})
+
+(define_insn *ashrmode3
+  [(set (match_operand:GPR 0 gpc_reg_operand =r,r)
+   (ashiftrt:GPR (match_operand:GPR 1 gpc_reg_operand r,r)
+ (match_operand:GPR 2 reg_or_cint_operand r,n)))]
   
   @
-   sraw %0,%1,%2
-   srawi %0,%1,%h2
+   srawd %0,%1,%2
+   srawdi %0,%1,%hH2
   [(set_attr type shift)
(set_attr var_shift yes,no)])
 
 (define_insn *ashrsi3_64
   [(set (match_operand:DI 0 gpc_reg_operand =r,r)
-   (sign_extend:DI
+   (sign_extend:DI
(ashiftrt:SI (match_operand:SI 1 gpc_reg_operand r,r)
-(match_operand:SI 2 reg_or_cint_operand r,i]
+(match_operand:SI 2 reg_or_cint_operand r,n]
   TARGET_POWERPC64
   @
sraw %0,%1,%2
@@ -5024,50 +5046,53 @@ (define_insn *ashrsi3_64
   [(set_attr type shift)
(set_attr var_shift yes,no)])
 
-(define_insn 
-  [(set (match_operand:CC 0 cc_reg_operand =x,x,?y,?y)
-   (compare:CC (ashiftrt:SI (match_operand:SI 1 gpc_reg_operand 
r,r,r,r)
-(match_operand:SI 2 reg_or_cint_operand 
r,i,r,i))
+(define_insn_and_split *ashrmode3_dot
+  [(set (match_operand:CC 3 cc_reg_operand =x,x,?y,?y)
+   (compare:CC (ashiftrt:GPR (match_operand:GPR 1 gpc_reg_operand 
r,r,r,r)
+ (match_operand:GPR 2 reg_or_cint_operand 
r,n,r,n))
(const_int 0)))
-   (clobber (match_scratch:SI 3 =r,r,r,r))]
-  
+   (clobber (match_scratch:GPR 0 =r,r,r,r))]
+  MODEmode == Pmode  rs6000_gen_cell_microcode
   @
-   sraw. %3,%1,%2
-   srawi. %3,%1,%h2
+   srawd. %0,%1,%2
+   srawdi. %0,%1,%hH2
#
#
+   reload_completed
+  [(set (match_dup 0)
+   (ashiftrt:GPR (match_dup 1)
+ (match_dup 2)))
+   (set (match_dup 3)
+   (compare:CC (match_dup 0)
+   (const_int 0)))]
+  
   [(set_attr type shift)
(set_attr var_shift yes,no,yes,no)
(set_attr dot yes)
(set_attr length 4,4,8,8)])
 
-(define_split
-  [(set (match_operand:CC 0 cc_reg_not_micro_cr0_operand )
-   (compare:CC (ashiftrt:SI (match_operand:SI 1 gpc_reg_operand )
-(match_operand:SI 2 reg_or_cint_operand ))
-   (const_int 0)))
-   (clobber (match_scratch:SI 3 ))]
-  reload_completed
-  [(set (match_dup 3)
-   (ashiftrt:SI (match_dup 1) (match_dup 2)))
-   (set (match_dup 0)
-   (compare:CC (match_dup 3)
-   (const_int 0)))]
-  )
-
-(define_insn 
+(define_insn_and_split *ashrmode3_dot2
   [(set (match_operand:CC 3 cc_reg_operand =x,x,?y,?y)
-   (compare:CC (ashiftrt:SI (match_operand:SI 1 gpc_reg_operand 
r,r,r,r)
-(match_operand:SI 2 reg_or_cint_operand 
r,i,r,i))
+   (compare:CC (ashiftrt:GPR (match_operand:GPR 1 gpc_reg_operand 
r,r,r,r)
+ (match_operand:GPR 2 reg_or_cint_operand 
r,n,r,n))
(const_int 0)))
-   (set (match_operand:SI 0 gpc_reg_operand =r,r,r,r)
-   (ashiftrt:SI (match_dup 1) (match_dup 2)))]
-  
+   (set (match_operand:GPR 0 gpc_reg_operand =r,r,r,r)
+   (ashiftrt:GPR (match_dup 1)
+ (match_dup