Re: [PATCH,GRAPHITE] Fix for P1 bug 58028

2014-02-27 Thread Richard Biener
On Wed, Feb 26, 2014 at 10:13 PM, Tobias Grosser tob...@grosser.es wrote:
 On 02/26/2014 10:09 PM, Mircea Namolaru wrote:

 This patch fixes the libgomp problems:
 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58028

 2014-02-26 Tobias Grosser tob...@grosser.es
 Mircea Namolaru mnamo...@inria.fr

 Fix for bug 58028
 * graphite-clast-to-gimple.c (set_cloog_options):
 Don't remove scalar dimensions

 Index: gcc/graphite-clast-to-gimple.c
 ===
 --- gcc/graphite-clast-to-gimple.c (revision 207298)
 +++ gcc/graphite-clast-to-gimple.c (working copy)
 @@ -1522,6 +1522,13 @@

 variables. */
 options-save_domains = 1;

 + /* Do not remove scalar dimensions. Cloog be default removes scalar

 + dimensions very early from the input schedule. However, they are
 + necessary to correctly derive from the saved domains
 + (options-save_domains) the relationship between the generated loops
 + and the schedule dimensions they are generated from. */

 + options-noscalars = 1;
 +
 /* Disable optimizations and make cloog generate source code closer to the
 input. This is useful for debugging, but later we want the optimized
 code.

 Tested on x86-64 Linux, no regressions for c/c++/fortran.

 Got a notification failure from gcc-patches for the initial submission of
 this
 patch. However, Tobias (on cc) received it and already sent his comments,
 (see http://gcc.gnu.org/ml/gcc-patches/2014-02/msg01540.html), that are
 addressed in this re-submission.


 Yes, very nice. Thanks Mircea for reacting so quickly. This patch look good
 to me.

 @Release-managers: Is this patch OK for trunk?

It fixes a P1 regression, so yes.

Richard.

 Cheers,
 Tobias



Re: PR60147: fix ICE with DECL_NAMELIST and tree-pretty-printer.c

2014-02-27 Thread Richard Biener
On Wed, Feb 26, 2014 at 11:19 PM, Tobias Burnus bur...@net-b.de wrote:
 Dear all,

 as suggested by Richard, it now only prints the namelist name and no longer
 the variables of the namelist.

 Bootstrapped on x86-64-gnu-linux and currently regtesting.
 OK for the trunk when it succeeds?

Works for me, but doesn't the simpler

Index: tree-pretty-print.c
===
--- tree-pretty-print.c (revision 208066)
+++ tree-pretty-print.c (working copy)
@@ -1390,6 +1390,7 @@
 case FIELD_DECL:
 case DEBUG_EXPR_DECL:
 case NAMESPACE_DECL:
+case NAMELIST_DECL:
   dump_decl_name (buffer, node, flags);
   break;

also work?  It's odd that we need to do sth special just for namelist-decls.

Richard.

 Tobias


 On February 22, 2014 10:00, Tobias Burnus wrote:

 Since GCC 4.9, gfortran generates a DECL_NAMELIST (for DWARF's
 DW_TAG_namelist) - they are stored in the BIND_EXPR. Namelists are a bit
 boring: They only group variable names and the namelist name is only used in
 I/O statements (READ, WRITE) to permit a fancy data input [and output] - and
 for the debugger.

 Due to DW_TAG_namelist, namelists are now exposed to the middle end - and
 I forgot to handle them also in the tree pretty printer - hence
 -fdump-tree-original now ICEs.

 For the pretty printer one has two options: Ignoring (or NYI;) the decl
 or dumping it. The attached patch does the latter.

 Bootstrapped (C/C++/Fortran) and regtested on x86-64-gnu-linux.
 OK for the trunk?

 Tobias


Re: [PR debug/57232] don't record non-fixed reg as CFA base value

2014-02-27 Thread Richard Biener
On Thu, Feb 27, 2014 at 6:28 AM, Alexandre Oliva aol...@redhat.com wrote:
 Some ports were failing an assertion check that was supposed to make
 sure some RTX created a new VALUE, rather than reuse an existing one in
 the cselib tables.  The reason the value was already there was that we'd
 recorded the register in the permanent table as the CFA, but the
 register was correctly used for other purposes by the compiler itself,
 we'd just failed to refrain from take note of that in this particular
 code path.  Other paths that register the CFA base value are guarded by
 this condition, and guarding this one fixed the problem on the affected
 ports.

 Regstrapped on x86_64-linux-gnu and i686-linux-gnu; in the PR, the
 problem is confirmed fixed by this patch on the rx and lm32 ports too.
 Ok?

Ok.

Thanks,
Richard.


 for  gcc/ChangeLog

 PR debug/57232
 * var-tracking.c (vt_initialize): Apply the same condition to
 preserve the CFA base value.
 ---
  gcc/var-tracking.c |3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

 diff --git a/gcc/var-tracking.c b/gcc/var-tracking.c
 index c5ce1dd..65d8285 100644
 --- a/gcc/var-tracking.c
 +++ b/gcc/var-tracking.c
 @@ -9924,7 +9924,8 @@ vt_initialize (void)
val = cselib_lookup_from_insn (reg, GET_MODE (reg), 1,
  VOIDmode, get_insns ());
preserve_value (val);
 -  cselib_preserve_cfa_base_value (val, REGNO (reg));
 +  if (reg != hard_frame_pointer_rtx  fixed_regs[REGNO (reg)])
 +   cselib_preserve_cfa_base_value (val, REGNO (reg));
expr = plus_constant (GET_MODE (stack_pointer_rtx),
 stack_pointer_rtx, -ofst);
cselib_add_permanent_equiv (val, expr, get_insns ());


 --
 Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
 You must be the change you wish to see in the world. -- Gandhi
 Be Free! -- http://FSFLA.org/   FSF Latin America board member
 Free Software Evangelist Red Hat Brazil Toolchain Engineer


Re: [PR debug/59992 #1/2] avoid quadratic behavior for the removal of useless values

2014-02-27 Thread Richard Biener
On Thu, Feb 27, 2014 at 6:54 AM, Alexandre Oliva aol...@redhat.com wrote:
 We indirectly call remove_useless_values quite often during
 vt_initialize; at least once per extended basic block.  On functions
 with thousands of small basic blocks, each adding permanent and
 temporary entries to the table, that turns out to be quite expensive:
 the permanent entries pile up and trigger the quadratic behavior
 mentioned in teh guard of one of the two calls of remove_useless_values.
 This patch moves the guard so that it applies to the other as well.

 I wasn't entirely sure this wouldn't invalidate assumptions made in
 callers of cselib_preserve_only_values (the function called at the end
 of each extended basic block), but some analysis, plus comparing before
 and after assembly output ;-), made sure it didn't ;-)

 This patch alone cut down the vt_initialize time of insn-recog on
 generic i686 from more than 1700 seconds (~84% of the total compile
 time) to about 500 seconds.

 Regstrapped on x86_64-linux-gnu and i686-linux-gnu, along with the other
 patch for PR debug/59992 that I'm about to post.

Ok.

Thanks,
Richard.


 for  gcc/ChangeLog

 PR debug/59992
 * cselib.c (remove_useless_values): Skip to avoid quadratic
 behavior if the condition moved from...
 (cselib_process_insn): ... here holds.
 ---
  gcc/cselib.c |   16 +---
  1 file changed, 9 insertions(+), 7 deletions(-)

 diff --git a/gcc/cselib.c b/gcc/cselib.c
 index 525e717..dabd2d3 100644
 --- a/gcc/cselib.c
 +++ b/gcc/cselib.c
 @@ -662,6 +662,14 @@ remove_useless_values (void)
  {
cselib_val **p, *v;

 +  if (n_useless_values = MAX_USELESS_VALUES
 +  /* remove_useless_values is linear in the hash table size.  Avoid
 + quadratic behavior for very large hashtables with very few
 +useless elements.  */
 +  || ((unsigned int)n_useless_values
 + = (cselib_hash_table.elements () - n_debug_values) / 4))
 +return;
 +
/* First pass: eliminate locations that reference the value.  That in
   turn can make more values useless.  */
do
 @@ -2693,13 +2701,7 @@ cselib_process_insn (rtx insn)

cselib_current_insn = NULL_RTX;

 -  if (n_useless_values  MAX_USELESS_VALUES
 -  /* remove_useless_values is linear in the hash table size.  Avoid
 - quadratic behavior for very large hashtables with very few
 -useless elements.  */
 -   ((unsigned int)n_useless_values
 -  (cselib_hash_table.elements () - n_debug_values) / 4))
 -remove_useless_values ();
 +  remove_useless_values ();
  }

  /* Initialize cselib for one pass.  The caller must also call

 --
 Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
 You must be the change you wish to see in the world. -- Gandhi
 Be Free! -- http://FSFLA.org/   FSF Latin America board member
 Free Software Evangelist Red Hat Brazil Toolchain Engineer


Re: [PATCH, GOMP4] Fix OpenACC async clause

2014-02-27 Thread Thomas Schwinge
Hi!

On Wed, 26 Feb 2014 19:32:35 +0400, Ilmir Usmanov i.usma...@samsung.com wrote:
 This patch is pretty obvious. Currently ASYNC clause cannot have 
 integer-expression-list. Patch fixes this.
 
 OK for gomp4 branch?

 From df76a29ebf869687209d7a606e243624cc136dbc Mon Sep 17 00:00:00 2001
 From: Ilmir Usmanov i.usma...@samsung.com
 Date: Wed, 26 Feb 2014 19:04:47 +0400
 Subject: [PATCH 4/5] Fix ASYNC
 
 ---
  gcc/gimplify.c  | 4 ++--
  gcc/omp-low.c   | 4 ++--
  gcc/tree-core.h | 6 +++---
  gcc/tree.c  | 6 +++---
  4 files changed, 10 insertions(+), 10 deletions(-)

Sure, with ChangeLog added, and the following changes folded in, I think:

 --- a/gcc/gimplify.c
 +++ b/gcc/gimplify.c
 @@ -6545,11 +6545,11 @@ gimplify_adjust_omp_clauses (tree *list_p)
   case OMP_CLAUSE_OACC_DEVICE:
   case OMP_CLAUSE_DEVICE_RESIDENT:
   case OMP_CLAUSE_USE_DEVICE:
 + case OMP_CLAUSE_ASYNC:
   case OMP_CLAUSE_GANG:
   case OMP_CLAUSE_WAIT:

GANG/ASYNC/WAIT, for consistency?

 --- a/gcc/tree.c
 +++ b/gcc/tree.c
 @@ -263,6 +263,7 @@ unsigned const char omp_clause_num_ops[] =
1, /* OMP_CLAUSE_OACC_DEVICE  */
1, /* OMP_CLAUSE_DEVICE_RESIDENT  */
1, /* OMP_CLAUSE_USE_DEVICE  */
 +  1, /* OMP_CLAUSE_ASYNC  */
1, /* OMP_CLAUSE_GANG  */
1, /* OMP_CLAUSE_WAIT  */

Likewise.

For consistency, also move OMP_CLAUSE_ASYNC_EXPR in gcc/tree.h, and also
adapt OMP_CLAUSE_ASYNC in gcc/tree-pretty-print.c:dump_omp_clause?


Grüße,
 Thomas


pgpQJ0AeLLvJc.pgp
Description: PGP signature


Re: [PATCH][AArch64] Use Cortex-A57 rtx costs for the generic CPU

2014-02-27 Thread Richard Earnshaw

On 26/02/14 14:48, Kyrill Tkachov wrote:

Hi all,

The generic rtx cost table was written with AArch32 cores in mind. It would be a
better idea to use the Cortex-A57 costs for the generic CPU in aarch64. That way
we schedule for the Cortex-A53 and do instruction selection for the Cortex-A57.

Since generic is the default CPU, this patch can change default code generation
but is highly unlikely to affect correctness (unless it exposes latent bugs)
since it only affects instruction selection.

Tested aarch64-none-elf on a model with no regressions.

Is it ok for trunk now or shall we wait for stage1?

Thanks,
Kyrill

2014-02-26  Kyrylo Tkachov  kyrylo.tkac...@arm.com

  * config/aarch64/aarch64.c (generic_tunings):
  Use cortexa57_extra_costs.



I think this is the right direction to go, and as such I'd prefer that 
we did it for 4.9.


Please give the RMs 24 hours to comment.  But otherwise OK.

R.




Re: PR60147: fix ICE with DECL_NAMELIST and tree-pretty-printer.c

2014-02-27 Thread Tobias Burnus
Hi Richard, hi all,

I wrote:
 @@ -1713,21 +1733,24 @@ dump_generic_node (pretty_printer *buffer, tree node, 
 int spc, int flags,
 case BIND_EXPR:
...
 for (op0 = BIND_EXPR_VARS (node); op0; op0 = DECL_CHAIN (op0))
   {
-print_declaration (buffer, op0, spc+2, flags);
+if (TREE_CODE(op0) == NAMELIST_DECL)
...

Richard Biener wrote:
 Works for me, but doesn't the simpler

 --- tree-pretty-print.c (revision 208066)
 +++ tree-pretty-print.c (working copy)
 @@ -1390,6 +1390,7 @@
 case FIELD_DECL:
 case DEBUG_EXPR_DECL:
 case NAMESPACE_DECL:
+case NAMELIST_DECL:
   dump_decl_name (buffer, node, flags);
   break;

 also work?  It's odd that we need to do sth special just for namelist-decls.

It won't do - at least not like that. NAMELIST_DECL are stored in BIND_EXPR,
hence, one ends up in the code I quote above. That code calls
  print_declaration()

NAMELIST_DECL are created as
  decl = make_node (NAMELIST_DECL);
  TREE_TYPE (decl) = void_type_node;
  NAMELIST_DECL_ASSOCIATED_DECL (decl) = build_constructor (NULL_TREE, 
nml_decls);
Hence, TREE_TYPE() is void_type_node and dereferencing that one leads to an ICE.


However, the following should work - do you like it more?

Tobias


--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -2686,6 +2686,13 @@ print_declaration (pretty_printer *buffer, tree t, int 
spc, int flags)
 {
   INDENT (spc);

+  if (TREE_CODE(t) == NAMELIST_DECL)
+{
+   pp_string(buffer, namelist );
+   dump_decl_name (buffer, t, flags);
+   goto done;
+}
+
   if (TREE_CODE (t) == TYPE_DECL)
 pp_string (buffer, typedef );

@@ -2767,6 +2774,7 @@ print_declaration (pretty_printer *buffer, tree t, int 
spc, int flags)
   pp_right_bracket (buffer);
 }

+done:
   pp_semicolon (buffer);
 }


Re: [PING^3][PATCH] Add a couple of dialect and warning options regarding Objective-C instance variable scope

2014-02-27 Thread Dimitris Papavasiliou

Ping!

On 02/20/2014 12:11 PM, Dimitris Papavasiliou wrote:

Hello all,

Pinging this patch review request again. See previous messages quoted
below for details.

Regards,
Dimitris

On 02/13/2014 04:22 PM, Dimitris Papavasiliou wrote:

Hello,

Pinging this patch review request. Can someone involved in the
Objective-C language frontend have a quick look at the description of
the proposed features and tell me if it'd be ok to have them in the
trunk so I can go ahead and create proper patches?

Thanks,
Dimitris

On 02/06/2014 11:25 AM, Dimitris Papavasiliou wrote:

Hello,

This is a patch regarding a couple of Objective-C related dialect
options and warning switches. I have already submitted it a while ago
but gave up after pinging a couple of times. I am now informed that
should have kept pinging until I got someone's attention so I'm
resending it.

The patch is now against an old revision and as I stated originally it's
probably not in a state that can be adopted as is. I'm sending it as is
so that the implemented features can be assesed in terms of their
usefulness and if they're welcome I'd be happy to make any necessary
changes to bring it up-to-date, split it into smaller patches, add
test-cases and anything else that is deemed necessary.

Here's the relevant text from my initial message:

Two of these switches are related to a feature request I submitted a
while ago, Bug 56044
(http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56044). I won't reproduce
the entire argument here since it is available in the feature request.
The relevant functionality in the patch comes in the form of two
switches:

-Wshadow-ivars which controls the local declaration of ‘somevar’ hides
instance variable warning which curiously is enabled by default instead
of being controlled at least by -Wshadow. The patch changes it so that
this warning can be enabled and disabled specifically through
-Wshadow-ivars as well as with all other shadowing-related warnings
through -Wshadow.

The reason for the extra switch is that, while searching through the
Internet for a solution to this problem I have found out that other
people are inconvenienced by this particular warning as well so it might
be useful to be able to turn it off while keeping all the other
shadowing-related warnings enabled.

-flocal-ivars which when true, as it is by default, treats instance
variables as having local scope. If false (-fno-local-ivars) instance
variables must always be referred to as self-ivarname and references of
ivarname resolve to the local or global scope as usual.

I've also taken the opportunity of adding another switch unrelated to
the above but related to instance variables:

-fivar-visibility which can be set to either private, protected (the
default), public and package. This sets the default instance variable
visibility which normally is implicitly protected. My use-case for it is
basically to be able to set it to public and thus effectively disable
this visibility mechanism altogether which I find no use for and
therefore have to circumvent. I'm not sure if anyone else feels the same
way towards this but I figured it was worth a try.

I'm attaching a preliminary patch against the current revision in case
anyone wants to have a look. The changes are very small and any blatant
mistakes should be immediately obvious. I have to admit to having
virtually no knowledge of the internals of GCC but I have tried to keep
in line with formatting guidelines and general style as well as looking
up the particulars of the way options are handled in the available
documentation to avoid blind copy-pasting. I have also tried to test the
functionality both in my own (relatively large, or at least not too
small) project and with small test programs and everything works as
expected. Finallly, I tried running the tests too but these fail to
complete both in the patched and unpatched version, possibly due to the
way I've configured GCC.

Dimitris








Re: [PATCH,GRAPHITE] Fix for P1 bug 58028

2014-02-27 Thread Tobias Burnus
Hi Tobias and Mircea,

On 02/26/2014 10:09 PM, Mircea Namolaru wrote:
 + /* Do not remove scalar dimensions. Cloog be default removes scalar
 + dimensions very early from the input schedule. However, they are

Small nit: Instead of 'be default' it should be 'by default'

Tobias


Re: PR60147: fix ICE with DECL_NAMELIST and tree-pretty-printer.c

2014-02-27 Thread Richard Biener
On Thu, Feb 27, 2014 at 10:37 AM, Tobias Burnus
tobias.bur...@physik.fu-berlin.de wrote:
 Hi Richard, hi all,

 I wrote:
 @@ -1713,21 +1733,24 @@ dump_generic_node (pretty_printer *buffer, tree 
 node, int spc, int flags,
 case BIND_EXPR:
 ...
 for (op0 = BIND_EXPR_VARS (node); op0; op0 = DECL_CHAIN (op0))
   {
-print_declaration (buffer, op0, spc+2, flags);
+if (TREE_CODE(op0) == NAMELIST_DECL)
 ...

 Richard Biener wrote:
 Works for me, but doesn't the simpler

 --- tree-pretty-print.c (revision 208066)
 +++ tree-pretty-print.c (working copy)
 @@ -1390,6 +1390,7 @@
 case FIELD_DECL:
 case DEBUG_EXPR_DECL:
 case NAMESPACE_DECL:
+case NAMELIST_DECL:
   dump_decl_name (buffer, node, flags);
   break;

 also work?  It's odd that we need to do sth special just for namelist-decls.

 It won't do - at least not like that. NAMELIST_DECL are stored in BIND_EXPR,
 hence, one ends up in the code I quote above. That code calls
   print_declaration()

 NAMELIST_DECL are created as
   decl = make_node (NAMELIST_DECL);
   TREE_TYPE (decl) = void_type_node;
   NAMELIST_DECL_ASSOCIATED_DECL (decl) = build_constructor (NULL_TREE, 
 nml_decls);
 Hence, TREE_TYPE() is void_type_node and dereferencing that one leads to an 
 ICE.

Where is it dereferenced?  void_type_node is a valid type.


 However, the following should work - do you like it more?

 Tobias


 --- a/gcc/tree-pretty-print.c
 +++ b/gcc/tree-pretty-print.c
 @@ -2686,6 +2686,13 @@ print_declaration (pretty_printer *buffer, tree t, int 
 spc, int flags)
  {
INDENT (spc);

 +  if (TREE_CODE(t) == NAMELIST_DECL)
 +{
 +   pp_string(buffer, namelist );
 +   dump_decl_name (buffer, t, flags);
 +   goto done;
 +}
 +
if (TREE_CODE (t) == TYPE_DECL)
  pp_string (buffer, typedef );

 @@ -2767,6 +2774,7 @@ print_declaration (pretty_printer *buffer, tree t, int 
 spc, int flags)
pp_right_bracket (buffer);
  }

 +done:
pp_semicolon (buffer);

Instead of the goto please duplicate the pp_semicolon and return.

Ok with that change.  You probably still want to add the NAMELIST_DECL
case to dump_generic_node (to avoid the  Unknown tree: ...).

Thanks,
Richard.

  }


Re: [PATCH, GOMP4] Fix OpenACC async clause

2014-02-27 Thread Ilmir Usmanov

Hi Thomas!

Fixed patch and ChangeLog in attachment.

On 27.02.2014 12:59, Thomas Schwinge wrote:

GANG/ASYNC/WAIT, for consistency?

Fixed.

For consistency, also move OMP_CLAUSE_ASYNC_EXPR in gcc/tree.h, and also
adapt OMP_CLAUSE_ASYNC in gcc/tree-pretty-print.c:dump_omp_clause?

It is already adapted:

case OMP_CLAUSE_ASYNC:
  pp_string (buffer, async);
  if (OMP_CLAUSE_DECL (clause))
{
  pp_character(buffer, '(');
  dump_generic_node (buffer, OMP_CLAUSE_DECL (clause),
 spc, flags, false);
  pp_character(buffer, ')');
}
  break;


--
Ilmir.
From bff6eb6dd930316bab91f222c1673e699fa97b96 Mon Sep 17 00:00:00 2001
From: Ilmir Usmanov i.usma...@samsung.com
Date: Thu, 27 Feb 2014 13:46:24 +0400
Subject: [PATCH] Fix ASYNC

---
 gcc/gimplify.c  | 4 ++--
 gcc/omp-low.c   | 4 ++--
 gcc/tree-core.h | 6 +++---
 gcc/tree.c  | 6 +++---
 gcc/tree.h  | 6 +++---
 5 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index fd4305c..d0a4779 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -6186,10 +6186,10 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p,
 	case OMP_CLAUSE_DEVICE_RESIDENT:
 	case OMP_CLAUSE_USE_DEVICE:
 	case OMP_CLAUSE_GANG:
+	case OMP_CLAUSE_ASYNC:
 	case OMP_CLAUSE_WAIT:
 	case OMP_NO_CLAUSE_CACHE:
 	case OMP_CLAUSE_INDEPENDENT:
-	case OMP_CLAUSE_ASYNC:
 	case OMP_CLAUSE_WORKER:
 	case OMP_CLAUSE_VECTOR:
 	case OMP_CLAUSE_NUM_GANGS:
@@ -6546,10 +6546,10 @@ gimplify_adjust_omp_clauses (tree *list_p)
 	case OMP_CLAUSE_DEVICE_RESIDENT:
 	case OMP_CLAUSE_USE_DEVICE:
 	case OMP_CLAUSE_GANG:
+	case OMP_CLAUSE_ASYNC:
 	case OMP_CLAUSE_WAIT:
 	case OMP_NO_CLAUSE_CACHE:
 	case OMP_CLAUSE_INDEPENDENT:
-	case OMP_CLAUSE_ASYNC:
 	case OMP_CLAUSE_WORKER:
 	case OMP_CLAUSE_VECTOR:
 	case OMP_CLAUSE_NUM_GANGS:
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 6dec687..eec862e 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -1776,10 +1776,10 @@ scan_sharing_clauses (tree clauses, omp_context *ctx)
 	case OMP_CLAUSE_DEVICE_RESIDENT:
 	case OMP_CLAUSE_USE_DEVICE:
 	case OMP_CLAUSE_GANG:
+	case OMP_CLAUSE_ASYNC:
 	case OMP_CLAUSE_WAIT:
 	case OMP_NO_CLAUSE_CACHE:
 	case OMP_CLAUSE_INDEPENDENT:
-	case OMP_CLAUSE_ASYNC:
 	case OMP_CLAUSE_WORKER:
 	case OMP_CLAUSE_VECTOR:
 	case OMP_CLAUSE_NUM_GANGS:
@@ -1916,10 +1916,10 @@ scan_sharing_clauses (tree clauses, omp_context *ctx)
 	case OMP_CLAUSE_DEVICE_RESIDENT:
 	case OMP_CLAUSE_USE_DEVICE:
 	case OMP_CLAUSE_GANG:
+	case OMP_CLAUSE_ASYNC:
 	case OMP_CLAUSE_WAIT:
 	case OMP_NO_CLAUSE_CACHE:
 	case OMP_CLAUSE_INDEPENDENT:
-	case OMP_CLAUSE_ASYNC:
 	case OMP_CLAUSE_WORKER:
 	case OMP_CLAUSE_VECTOR:
 	case OMP_CLAUSE_NUM_GANGS:
diff --git a/gcc/tree-core.h b/gcc/tree-core.h
index fcdeb44..e9aeb65 100644
--- a/gcc/tree-core.h
+++ b/gcc/tree-core.h
@@ -286,6 +286,9 @@ enum omp_clause_code {
   size-expression: * | integer-expression.  */
   OMP_CLAUSE_GANG,
 
+  /* OpenACC clause: async [(integer-expression)].  */
+  OMP_CLAUSE_ASYNC,
+
   /* OpenACC clause/directive: wait [(integer-expression-list)].  */
   OMP_CLAUSE_WAIT,
 
@@ -372,9 +375,6 @@ enum omp_clause_code {
   /* OpenACC clause: independent.  */
   OMP_CLAUSE_INDEPENDENT,
 
-  /* OpenACC clause: async [(integer-expression)].  */
-  OMP_CLAUSE_ASYNC,
-
   /* OpenACC clause: worker [( [num:] integer-expression)].  */
   OMP_CLAUSE_WORKER,
 
diff --git a/gcc/tree.c b/gcc/tree.c
index 0655db0..d9a577b 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -264,6 +264,7 @@ unsigned const char omp_clause_num_ops[] =
   1, /* OMP_CLAUSE_DEVICE_RESIDENT  */
   1, /* OMP_CLAUSE_USE_DEVICE  */
   1, /* OMP_CLAUSE_GANG  */
+  1, /* OMP_CLAUSE_ASYNC  */
   1, /* OMP_CLAUSE_WAIT  */
   1, /* OMP_NO_CLAUSE_CACHE  */
   1, /* OMP_CLAUSE__LOOPTEMP_  */
@@ -292,7 +293,6 @@ unsigned const char omp_clause_num_ops[] =
   0, /* OMP_CLAUSE_TASKGROUP  */
   1, /* OMP_CLAUSE__SIMDUID_  */
   0, /* OMP_CLAUSE_INDEPENDENT  */
-  1, /* OMP_CLAUSE_ASYNC  */
   1, /* OMP_CLAUSE_WORKER  */
   1, /* OMP_CLAUSE_VECTOR  */
   1, /* OMP_CLAUSE_NUM_GANGS  */
@@ -322,6 +322,7 @@ const char * const omp_clause_code_name[] =
   device_resident,
   use_device,
   gang,
+  async,
   wait,
   _cache_,
   _looptemp_,
@@ -350,7 +351,6 @@ const char * const omp_clause_code_name[] =
   taskgroup,
   _simduid_,
   independent,
-  async,
   worker,
   vector,
   num_gangs,
@@ -11062,9 +11062,9 @@ walk_tree_1 (tree *tp, walk_tree_fn func, void *data,
 	case OMP_CLAUSE_DEVICE_RESIDENT:
 	case OMP_CLAUSE_USE_DEVICE:
 	case OMP_CLAUSE_GANG:
+	case OMP_CLAUSE_ASYNC:
 	case OMP_CLAUSE_WAIT:
 	case OMP_NO_CLAUSE_CACHE:
-	case OMP_CLAUSE_ASYNC:
 	case OMP_CLAUSE_WORKER:
 	case OMP_CLAUSE_VECTOR:
 	case OMP_CLAUSE_NUM_GANGS:
diff --git a/gcc/tree.h b/gcc/tree.h
index 4d598d1..84801ff 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -1315,15 +1315,15 @@ extern void protected_set_expr_location (tree, location_t);
 #define 

Re: [PATCH,GRAPHITE] Fix for P1 bug 58028

2014-02-27 Thread Rainer Orth
Tobias Burnus tobias.bur...@physik.fu-berlin.de writes:

 On 02/26/2014 10:09 PM, Mircea Namolaru wrote:
 + /* Do not remove scalar dimensions. Cloog be default removes scalar
 + dimensions very early from the input schedule. However, they are

 Small nit: Instead of 'be default' it should be 'by default'

... and two spaces after the full stops.  While you're at it, please fix
the CLooG capitalization ;-)

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH, GOMP4] Fix OpenACC async clause

2014-02-27 Thread Thomas Schwinge
Hi Ilmir!

On Thu, 27 Feb 2014 13:55:36 +0400, Ilmir Usmanov i.usma...@samsung.com wrote:
 Fixed patch and ChangeLog in attachment.

By the way, what I usually do is to put a change's ChangeLog snippet into
that Git commit's commit message, and then once I get to the point where
I push something upstream, I can just take it from there, and put it into
the respective ChangeLog file(s).  Just a suggestions -- no idea whether
that fits your workflow.


 On 27.02.2014 12:59, Thomas Schwinge wrote:
  For consistency, also move OMP_CLAUSE_ASYNC_EXPR in gcc/tree.h, and also
  adapt OMP_CLAUSE_ASYNC in gcc/tree-pretty-print.c:dump_omp_clause?
 It is already adapted:
  case OMP_CLAUSE_ASYNC:

Move between handling of OMP_CLAUSE_GANG and OMP_CLAUSE_ASYNC?

 2014-02-27  Ilmir Usmanov  i.usma...@samsung.com
 
   Fix OpenACC ASYNC clause: it can contain integer-expression-list.

s%can%cannot, I think?

   * gcc/
   tree-core.h (enum omp_clause_code): Likewise.
   tree.h (OMP_CLAUSE_ASYNC_EXPR): Reorder.
   tree.c (omp_clause_num_ops, omp_clause_code_name): Reorder clauses.
   omp-low.c (scan_sharing_clauses): Likewise.
   gimplify.c (gimplify_scan_omp_clauses): Likewise.

Can't start that with »Likewise« ;-), so change that or reorder the
entries.

With that changed, it's OK to commit (don't need to repost), thanks!


Grüße,
 Thomas


pgpmxlR9xEbFU.pgp
Description: PGP signature


Re: PR60147: fix ICE with DECL_NAMELIST and tree-pretty-printer.c

2014-02-27 Thread Tobias Burnus
On Thu, Feb 27, 2014 at 10:43:33AM +0100, Richard Biener wrote:

  hence, one ends up in the code I quote above. That code calls
print_declaration()
...
  Hence, TREE_TYPE() is void_type_node and dereferencing that one leads to an 
  ICE.

Sorry, I misremembered. In any case, the following leads to an ICE.

print_declaration has the code:
  if (TREE_CODE (t) != FUNCTION_DECL)
  if (DECL_INITIAL (t))
...
  dump_generic_node (buffer, DECL_INITIAL (t), spc, flags, false);
which handles the CONSTRUCTOR - which fails with:
else if (TREE_CODE (TREE_TYPE (node)) == RECORD_TYPE
as TREE_TYPE (node) == NULL.


 Instead of the goto please duplicate the pp_semicolon and return.
 
 Ok with that change.  You probably still want to add the NAMELIST_DECL
 case to dump_generic_node (to avoid the  Unknown tree: ...).

Fine with me. I assume that that is only relevant for calls to debug_tree()
in the debugger? Because otherwise, it shouldn't be reachable.

Thanks for the review!

Tobias


Re: PR60147: fix ICE with DECL_NAMELIST and tree-pretty-printer.c

2014-02-27 Thread Richard Biener
On Thu, Feb 27, 2014 at 11:39 AM, Tobias Burnus
tobias.bur...@physik.fu-berlin.de wrote:
 On Thu, Feb 27, 2014 at 10:43:33AM +0100, Richard Biener wrote:

  hence, one ends up in the code I quote above. That code calls
print_declaration()
 ...
  Hence, TREE_TYPE() is void_type_node and dereferencing that one leads to 
  an ICE.

 Sorry, I misremembered. In any case, the following leads to an ICE.

 print_declaration has the code:
   if (TREE_CODE (t) != FUNCTION_DECL)
   if (DECL_INITIAL (t))
 ...
   dump_generic_node (buffer, DECL_INITIAL (t), spc, flags, false);
 which handles the CONSTRUCTOR - which fails with:
 else if (TREE_CODE (TREE_TYPE (node)) == RECORD_TYPE
 as TREE_TYPE (node) == NULL.


 Instead of the goto please duplicate the pp_semicolon and return.

 Ok with that change.  You probably still want to add the NAMELIST_DECL
 case to dump_generic_node (to avoid the  Unknown tree: ...).

 Fine with me. I assume that that is only relevant for calls to debug_tree()
 in the debugger? Because otherwise, it shouldn't be reachable.

Yes.

 Thanks for the review!

 Tobias


Re: C++ PATCH for lto/53808 (devirtualization of defaulted virtual dtor)

2014-02-27 Thread Andreas Schwab
Jason Merrill ja...@redhat.com writes:

 diff --git a/gcc/testsuite/g++.dg/opt/devirt4.C 
 b/gcc/testsuite/g++.dg/opt/devirt4.C
 new file mode 100644
 index 000..26e8ee6
 --- /dev/null
 +++ b/gcc/testsuite/g++.dg/opt/devirt4.C
 @@ -0,0 +1,16 @@
 +// PR c++/53808
 +// Devirtualization + inlining should produce a non-virtual
 +// call to ~foo.
 +// { dg-options -O -fdevirtualize }
 +// { dg-final { scan-assembler _ZN3fooD2Ev } }
 +
 +struct foo {
 + virtual ~foo();
 +};
 +struct bar : public foo {
 + virtual void zed();
 +};
 +void f() {
 + foo *x(new bar);
 + delete x;
 +}

$ gcc/xg++ -Bgcc/ ../gcc/testsuite/g++.dg/opt/devirt4.C -nostdinc++ 
-Iia64-suse-linux/libstdc++-v3/include/ia64-suse-linux 
-Iia64-suse-linux/libstdc++-v3/include -Ilibstdc++-v3/libsupc++ 
-I../libstdc++-v3/include/backward -I../libstdc++-v3/testsuite/util 
-std=gnu++11 -O -fdevirtualize -ffat-lto-objects -S -o devirt4.s
$ cat devirt4.s 
.file   devirt4.C
.pred.safe_across_calls p1-p5,p16-p63
.sdata
.align 8
.LC0:
data8   _ZTV3bar#+16
.align 8
.LC1:
data8   _ZTV3bar#+32
.text
.align 16
.global _Z1fv#
.type   _Z1fv#, @function
.proc _Z1fv#
_Z1fv:
[.LFB0:]
.prologue 12, 32
.save ar.pfs, r33
alloc r33 = ar.pfs, 0, 3, 1, 0
.save rp, r32
mov r32 = b0
mov r34 = r1
.body
addl r35 = 8, r0
;;
br.call.sptk.many b0 = _Znwm#
mov r1 = r34
;;
addl r14 = @gprel(.LC0), gp
;;
ld8 r14 = [r14]
;;
st8 [r8] = r14
cmp.eq p6, p7 = 0, r8
(p6) br.cond.dpnt .L1
mov r35 = r8
addl r14 = @gprel(.LC1), gp
;;
ld8 r14 = [r14]
;;
ld8 r15 = [r14], 8
;;
mov b6 = r15
ld8 r1 = [r14]
br.call.sptk.many b0 = b6
;;
mov r1 = r34
.L1:
mov ar.pfs = r33
mov b0 = r32
br.ret.sptk.many b0
;;
.endp _Z1fv#
.ident  GCC: (GNU) 4.9.0 20140227 (experimental) [trunk revision 
208194]

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
And now for something completely different.


Re: [PATCH, GOMP4] Fix OpenACC async clause

2014-02-27 Thread Ilmir Usmanov

Committed as r208197.

--
Ilmir.


Re: [PATCH] Fix PR 60268

2014-02-27 Thread Andreas Schwab
Thanks, this is what I have installed.

Andreas.

* config/m68k/m68k.c (m68k_option_override): Disable
-flive-range-shrinkage for classic m68k.
(m68k_override_options_after_change): Likewise.

diff --git a/gcc/config/m68k/m68k.c b/gcc/config/m68k/m68k.c
index f20d071..7f7d668 100644
--- a/gcc/config/m68k/m68k.c
+++ b/gcc/config/m68k/m68k.c
@@ -642,6 +642,7 @@ m68k_option_override (void)
   flag_schedule_insns = 0;
   flag_schedule_insns_after_reload = 0;
   flag_modulo_sched = 0;
+  flag_live_range_shrinkage = 0;
 }
 
   if (m68k_sched_cpu != CPU_UNKNOWN)
@@ -665,6 +666,7 @@ m68k_override_options_after_change (void)
   flag_schedule_insns = 0;
   flag_schedule_insns_after_reload = 0;
   flag_modulo_sched = 0;
+  flag_live_range_shrinkage = 0;
 }
 }
 
-- 
1.9.0

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
And now for something completely different.


Re: [PATCH][AARCH64]PR60034

2014-02-27 Thread Marcus Shawcroft
On 21 February 2014 04:24, Kugan kugan.vivekanandara...@linaro.org wrote:

 Compiling inline asm results in ICE (PR60034). Alignment calculation in
 aarch64_classify_address for (symbol_ref:DI (*.LANCHOR4) [flags
 0x182])) seems wrong here.

Hi Kugan,

+  else if (SYMBOL_REF_FLAGS (sym))
+ align = GET_MODE_ALIGNMENT (GET_MODE (sym));

This is inserted into the LO_SUM handling in the function
aarch64_classify_address(), the code in question is checking the
alignment of the object to ensure that a scaled address instruction
would be valid. The proposed code is testing if any of a bunch of
unrelated predicate flags have been set on the symbol and using that
to gate whether GET_MODE_ALIGNMENT would give accurate alignment
information on the symbol. I'm not convinced that the presence of
SYMBOL_REF_FLAGS states anything definitive about the relevance of
GET_MODE_ALIGNMENT.   The test looks like it fails because a section
anchor has been introduced and we fail to determine anything sensible
about the alignment of a section anchor.  How about this instead?

 if (SYMBOL_REF_BLOCK (sym))
   align = SYMBOL_REF_BLOCK (sym)-alignment;


 Fixing this also  caused a regression for pr38151.c, which is due to
 complex type being allocated with wrong alignment. Attached patch fixes
 these issues.

It ~might~ be beneficial to increase data_alignment here as suggest
for performance reasons, but the existing alignment should not cause
breakage... this issue suggest to me that the SYMBOL_REF_FLAGS
approach is at fault.

Cheers
/Marcus


[Patch ARM] Fix failing gcc.dg/atomics/atomic-exec-2.c - movmisalignmode expanders.

2014-02-27 Thread Ramana Radhakrishnan

Hi,

	This is a case where we end up with an ICE in movmisalignmode where 
mode is DImode because the address so generated is aligned to 32 bits as 
a result of V_C_E on an _Atomic Complex float temporary to a DImode 
temporary. The problem also shows up only when you have -mfpu=neon 
turned on because these expanders are only valid on hardware where you 
have neon and not elsewhere.


Now, the problem we have is that the address in this case is

unspec:DI [
(mem/c:DI (plus:SI (reg/f:SI 1073)
(const_int -8 [0xfff8])) [0  S8 A32])
] UNSPEC_MISALIGNED_ACCESS)) ../gcc/besttry.c:38 -1
 (nil))


Now the problem here is that we don't allow anything but the plus 
(virtual*-regs) (offset) type addresses in our predicates but the 
expander allows the generation in that form. So, forcing the memory 
addresses to be the right form is the right thing to be doing here with 
movmisalignmode unfortunately. Given we have no other way of 
controlling illegitimate misaligned addresses, this is the only place to 
fix this at this point of time. Longer term maybe we should look at 
relaxing the predicates and allowing LRA / reload to deal with this but 
that's not suitable for stage4.


Tested by bootstrap on arm-linux-gnueabihf with neon, no regressions and 
the testcase passes.


Will apply if RMs don't object in 24 hours.

regards
Ramana


* config/arm/neon.md (movmisalignmode): Legitimize addresses not 
allowed by recognizers.



P.S. I had my share of the fun with the C11 atomics and realized that 
the standard allows for the size and alignment of atomic types to be 
different from the equivalent base types (n1570 - 6.2.5 = 27). Keeping 
the 2 out of sync is a bit unfortunate but required for atomic access 
(in this case we can only have atomic access to 8byte objects on 8 byte 
aligned addresses).



--
Ramana Radhakrishnan
Principal Engineer
ARM Ltd.diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index 2f06e42..aad420c 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -245,12 +245,23 @@
 UNSPEC_MISALIGNED_ACCESS))]
   TARGET_NEON  !BYTES_BIG_ENDIAN  unaligned_access
 {
+  rtx adjust_mem;
   /* This pattern is not permitted to fail during expansion: if both arguments
  are non-registers (e.g. memory := constant, which can be created by the
  auto-vectorizer), force operand 1 into a register.  */
   if (!s_register_operand (operands[0], MODEmode)
!s_register_operand (operands[1], MODEmode))
 operands[1] = force_reg (MODEmode, operands[1]);
+
+  if (s_register_operand (operands[0], MODEmode))
+adjust_mem = operands[1];
+  else
+adjust_mem = operands[0];
+
+  /* Legitimize address.  */
+  if (!neon_vector_mem_operand (adjust_mem, 2, true))
+XEXP (adjust_mem, 0) = force_reg (Pmode, XEXP (adjust_mem, 0));
+
 })
 
 (define_insn *movmisalignmode_neon_store

Re: [PATCH, AArch64] Define __ARM_NEON by default

2014-02-27 Thread Marcus Shawcroft
On 24 February 2014 10:01, Ian Bolton ian.bol...@arm.com wrote:
 Hi,

 This is needed for when people are porting their aarch32 code to aarch64.
 They will have #ifdef __ARM_NEON (as specified in ACLE) and their intrinsics
 currently won't get used on aarch64 because it's not defined there by
 default.

 This patch defines __ARM_NEON so long as we are not using general regs only.

 Tested on simple testcase to ensure __ARM_NEON was defined.

 OK for trunk?

OK provide no objection in next 24 hours.

/Marcus


Re: [PING][PATCH][AARCH64]Resolves testsuite/gcc.target/aarch64/aapcs64/ret-func-1.c regression

2014-02-27 Thread Marcus Shawcroft
On 24 February 2014 09:49, Renlin Li renlin...@arm.com wrote:

 gcc/testsuite/ChangeLog:

 2014-02-03  Renlin Li renlin...@arm.com

 * gcc.target/aarch64/aapcs64/validate_memory.h: Move f32in64 and
 i32in128 cases
 outside special big-endian processing block.


This is is a fix for a broken test case, this is OK.

/Marcus


Re: [PATCH][AARCH64]Adjust address with offset assembler format

2014-02-27 Thread Marcus Shawcroft
On 12 February 2014 16:10, Renlin Li renlin...@arm.com wrote:

 gcc/ChangeLog:

 2014-02-12  Renlin Li  renlin...@arm.com

 * config/aarch64/aarch64.c (aarch64_print_operand_address): Adjust the
 output asm format
 by adding a space  between base register and offset.

OK for stage-1.

/Marcus


Re: copyright dates in binutils (and includes/)

2014-02-27 Thread Joel Brobecker
Hi Alan,

My two cents...

 --- a/bfd/elf32-sparc.c
 +++ b/bfd/elf32-sparc.c
 @@ -1,7 +1,5 @@
  /* SPARC-specific support for 32-bit ELF
 -   Copyright 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002,
 -   2003, 2004, 2005, 2006, 2007, 2010, 2011
 -   Free Software Foundation, Inc.
 +   Copyright (C) 1993-2014 Free Software Foundation, Inc.
  
 This file is part of BFD, the Binary File Descriptor library.
  
 Does anyone have a violent objection to committing updates in
 bfd, binutils, elfcpp, gas, gold, gprof, ld, and opcodes?

FWIW, I find that the new ouptut is a nice improvement.

I should mention, however, that for us to use ranges like this,
the FSF asked us to add a note explaining that the copyright years
could be abbreviated into a range. See gdb/README (at the end).
I suspect that you'll need the same note for binutils.

 How about includes/ too?  The choices there are
 a) apply to just binutils owned files, 
 b) apply to binutils+gdb files,
 c) apply to the lot, and update gcc/include/ too.
 
 I see Joel already updated include/gdb, but the script makes a further
 small change.  So choice (b) in include/gdb consists of patches like
 the following:

IIRC, the FSF does not require the (C), but I think they would
prefer it. Either way, it doesn't get in the way for future updates,
so I'd go with at least (b).

As for (c), I'll let the GCC folks answer.

-- 
Joel


Re: [AArch64] Fix possible wrong code generation when comparing DImode values.

2014-02-27 Thread Marcus Shawcroft
On 24 February 2014 18:17, James Greenhalgh james.greenha...@arm.com wrote:

 gcc/

 2014-02-24  James Greenhalgh  james.greenha...@arm.com

 * config/aarch64/aarch64-simd.md
 (aarch64_cmoptabdi): Always split.
 (*aarch64_cmoptabdi): New.
 (aarch64_cmtstdi): Always split.
 (*aarch64_cmtstdi): New.

OK for stage-1

/Marcus


Re: C++ PATCH for lto/53808 (devirtualization of defaulted virtual dtor)

2014-02-27 Thread Jason Merrill
Hmm, I wonder why we aren't devirtualizing that call on ia64.  Does it 
work with -O3?


Jason


Re: C++ PATCH for lto/53808 (devirtualization of defaulted virtual dtor)

2014-02-27 Thread Andreas Schwab
Jason Merrill ja...@redhat.com writes:

 Hmm, I wonder why we aren't devirtualizing that call on ia64.  Does it
 work with -O3?

That doesn't change anything fundamentally.

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
And now for something completely different.


[Patch ARM] Allow any register for DImode values in Thumb2.

2014-02-27 Thread Ramana Radhakrishnan

Hi

I noticed that for T32 we don't allow any old register for DImode 
values. The restriction of an even register is true only for ARM state 
because the ISA doesn't allow any old register in this place. In a few 
large .i files that I had knocking about, noticed a nice drop in stack 
usage and a generally improved register allocation strategy.


Queued for stage1 after suitable testing including a bootstrap and 
regression test in Thumb2 found no issues.


regards
Ramana

DATE  Ramana Radhakrishnan  ramana.radhakrish...@arm.com

* config/arm/arm.c (arm_hard_regno_mode_ok): Loosen 
restrictions on core registers for DImode values in Thumb2.


--
Ramana Radhakrishnan
Principal Engineer
ARM Ltd.diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index b49f43e..73dc04a 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -22593,12 +22593,19 @@ arm_hard_regno_mode_ok (unsigned int regno, enum 
machine_mode mode)
 }
 
   /* We allow almost any value to be stored in the general registers.
- Restrict doubleword quantities to even register pairs so that we can
- use ldrd.  Do not allow very large Neon structure opaque modes in
- general registers; they would use too many.  */
+ Restrict doubleword quantities to even register pairs in ARM state
+ so that we can use ldrd.  Do not allow very large Neon structure
+ opaque modes in general registers; they would use too many.  */
   if (regno = LAST_ARM_REGNUM)
-return !(TARGET_LDRD  GET_MODE_SIZE (mode)  4  (regno  1) != 0)
-   ARM_NUM_REGS (mode) = 4;
+{
+  if (ARM_NUM_REGS (mode)  4)
+ return FALSE;
+
+  if (TARGET_THUMB2)
+   return TRUE;
+
+  return !(TARGET_LDRD  GET_MODE_SIZE (mode)  4  (regno  1) != 0);
+}
 
   if (regno == FRAME_POINTER_REGNUM
   || regno == ARG_POINTER_REGNUM)

Re: C++ PATCH for lto/53808 (devirtualization of defaulted virtual dtor)

2014-02-27 Thread Richard Biener
On Thu, Feb 27, 2014 at 2:53 PM, Andreas Schwab sch...@suse.de wrote:
 Jason Merrill ja...@redhat.com writes:

 Hmm, I wonder why we aren't devirtualizing that call on ia64.  Does it
 work with -O3?

 That doesn't change anything fundamentally.

I think the vtable lookup sequence is different and nobody cared to adjust
the gimple matcher to also match the ia64 sequence.

Richard.

 Andreas.

 --
 Andreas Schwab, SUSE Labs, sch...@suse.de
 GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
 And now for something completely different.


Re: C++ PATCH for lto/53808 (devirtualization of defaulted virtual dtor)

2014-02-27 Thread Jason Merrill

On 02/27/2014 09:03 AM, Richard Biener wrote:

Jason Merrill ja...@redhat.com writes:

Hmm, I wonder why we aren't devirtualizing that call on ia64.

I think the vtable lookup sequence is different and nobody cared to adjust
the gimple matcher to also match the ia64 sequence.


Ah.  So xfail on ia64?

Jason



Re: C++ PATCH for lto/53808 (devirtualization of defaulted virtual dtor)

2014-02-27 Thread Richard Biener
On Thu, Feb 27, 2014 at 3:51 PM, Jason Merrill ja...@redhat.com wrote:
 On 02/27/2014 09:03 AM, Richard Biener wrote:

 Jason Merrill ja...@redhat.com writes:

 Hmm, I wonder why we aren't devirtualizing that call on ia64.

 I think the vtable lookup sequence is different and nobody cared to adjust

 the gimple matcher to also match the ia64 sequence.


 Ah.  So xfail on ia64?

OTOH the g++dg/ipa/devirt-* testcases seem to run fine everywhere?

Richard.

 Jason



Re: [PATCHv2/AARCH64 2/3] Fix TLS for ILP32.

2014-02-27 Thread Yufeng Zhang

Hi,

On 02/26/14 02:25, Andrew Pinski wrote:

Hi,
   With ILP32, some simple usage of TLS variables causes an unrecognizable
instruction due to needing to use SImode for loading pointers from memory.
This fixes the three (tlsie_small, tlsle_small, tlsdesc_small) patterns to
support SImode for pointers.  I modified them to be like what was done for
the GOT patterns.


The patch looks good to me (but I cannot approve).

There are some minor indentation issues though; see below.



OK?  Build and tested on aarch64-elf with no regressions.

Thanks,
Andrew Pinski

* config/aarch64/aarch64.c (aarch64_load_symref_appropriately):
Handle TLS for ILP32.
* config/aarch64/aarch64.md (tlsie_small): Rename to ...
(tlsie_small_mode): this and handle PTR.
(tlsie_small_sidi): New pattern.
(tlsle_small): Change to an expand to handle ILP32.
(tlsle_small_mode): New pattern.
(tlsdesc_small): Rename to ...
(tlsdesc_small_mode): this and handle PTR.
---
  gcc/ChangeLog |   12 +
  gcc/config/aarch64/aarch64.c  |   48 +++
  gcc/config/aarch64/aarch64.md |   54 +++-
  3 files changed, 96 insertions(+), 18 deletions(-)


[snip]


diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 99a6ac8..7d8a645 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -3581,35 +3581,65 @@
[(set_attr type call)
 (set_attr length 16)])

-(define_insn tlsie_small
-  [(set (match_operand:DI 0 register_operand =r)
-(unspec:DI [(match_operand:DI 1 aarch64_tls_ie_symref S)]
+(define_insn tlsie_small_mode
+  [(set (match_operand:PTR 0 register_operand =r)
+(unspec:PTR [(match_operand 1 aarch64_tls_ie_symref S)]
   UNSPEC_GOTSMALLTLS))]

-  adrp\\t%0, %A1\;ldr\\t%0, [%0, #%L1]
+  adrp\\t%0, %A1\;ldr\\t%w0, [%0, #%L1]
[(set_attr type load1)
 (set_attr length 8)]
  )

-(define_insn tlsle_small
+
+(define_insn tlsie_small_sidi
[(set (match_operand:DI 0 register_operand =r)
-(unspec:DI [(match_operand:DI 1 register_operand r)
-   (match_operand:DI 2 aarch64_tls_le_symref S)]
+   (zero_extend:DI
+  (unspec:SI [(match_operand 1 aarch64_tls_ie_symref S)]
+ UNSPEC_GOTSMALLTLS)))]
+  
+  adrp\\t%0, %A1\;ldr\\t%w0, [%0, #%L1]
+  [(set_attr type load1)
+   (set_attr length 8)]
+)
+
+
+(define_expand tlsle_small
+  [(set (match_operand 0 register_operand =r)
+(unspec [(match_operand 1 register_operand r)
+   (match_operand 2 aarch64_tls_le_symref S)]
+   UNSPEC_GOTSMALLTLS))]


The last two lines were not indented well.


+  
+{
+  enum machine_mode mode = GET_MODE (operands[0]);
+  emit_insn ((mode == DImode
+ ? gen_tlsle_small_di
+ : gen_tlsle_small_si) (operands[0],
+  operands[1],
+  operands[2]));


Likewise here.


Thanks,
Yufeng



+  DONE;
+
+})
+
+(define_insn tlsle_small_mode
+  [(set (match_operand:P 0 register_operand =r)
+(unspec:P [(match_operand:P 1 register_operand r)
+   (match_operand 2 aarch64_tls_le_symref S)]
   UNSPEC_GOTSMALLTLS))]

-  add\\t%0, %1, #%G2\;add\\t%0, %0, #%L2
+  add\\t%w0, %w1, #%G2\;add\\t%w0, %w0, #%L2
[(set_attr type alu_reg)
 (set_attr length 8)]
  )

-(define_insn tlsdesc_small
-  [(set (reg:DI R0_REGNUM)
-(unspec:DI [(match_operand:DI 0 aarch64_valid_symref S)]
+(define_insn tlsdesc_small_mode
+  [(set (reg:PTR R0_REGNUM)
+(unspec:PTR [(match_operand 0 aarch64_valid_symref S)]
   UNSPEC_TLSDESC))
 (clobber (reg:DI LR_REGNUM))
 (clobber (match_scratch:DI 1 =r))]
TARGET_TLS_DESC
-  adrp\\tx0, %A0\;ldr\\t%1, [x0, #%L0]\;add\\tx0, x0, 
%L0\;.tlsdesccall\\t%0\;blr\\t%1
+  adrp\\tx0, %A0\;ldr\\t%w1, [x0, #%L0]\;add\\tw0,w0, 
%L0\;.tlsdesccall\\t%0\;blr\\t%1
[(set_attr type call)
 (set_attr length 16)])





Re: [PATCHv2/AARCH64 3/3] Support ILP32 multi-lib

2014-02-27 Thread Yufeng Zhang

On 02/26/14 02:25, Andrew Pinski wrote:

Hi,
   This is the final patch which adds support for the dynamic linker and
multi-lib directories for ILP32.  I did not change multi-arch support as
I did not know what it should be changed to and internally here at Cavium,
we don't use multi-arch.  Updated for the new names that were decided on.


OK?  Build and tested for aarch64-linux-gnu with and without 
--with-multilib-list=lp64,ilp32.


Looks good to me, but I cannot approve.  I have a couple of minor 
comments on the changelog entry.




Thanks,
Andrew Pinski

* config/aarch64/aarch64-linux.h (GLIBC_DYNAMIC_LINKER): 
/lib/ld-linux-aarch64_ilp32.so.1
is used for ILP32.
(LINUX_TARGET_LINK_SPEC): Update linker script for ILP32.
file whose name depends on -mabi= and -mbig-endian.
* config/aarch64/t-aarch64-linux (MULTILIB_OSDIRNAMES): Handle LP64 
better
and handle ilp32 too.
(MULTILIB_OPTIONS): Delete.
(MULTILIB_DIRNAMES): Delete.
---
  gcc/ChangeLog  |   11 +++
  gcc/config/aarch64/aarch64-linux.h |4 ++--
  gcc/config/aarch64/t-aarch64-linux |7 ++-
  3 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 155ce45..a0cdc58 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,16 @@
  2014-02-25  Andrew Pinskiapin...@cavium.com

+   * config/aarch64/aarch64-linux.h (GLIBC_DYNAMIC_LINKER): 
/lib/ld-linux32-aarch64_ilp32.so.1


s/linux32/linux


+   is used for ILP32.
+   (LINUX_TARGET_LINK_SPEC): Update linker script for ILP32.
+   file whose name depends on -mabi= and -mbig-endian.
+   * config/aarch64/t-aarch64-linux (MULTILIB_OSDIRNAMES): Handle LP64 
better
+   and handle ilp32 too.


s/ilp32/ILP32

Thanks,
Yufeng





+   (MULTILIB_OPTIONS): Delete.
+   (MULTILIB_DIRNAMES): Delete.
+
+2014-02-25  Andrew Pinskiapin...@cavium.com
+
* config/aarch64/aarch64.c (aarch64_load_symref_appropriately):
Handle TLS for ILP32.
* config/aarch64/aarch64.md (tlsie_small): Rename to ...
diff --git a/gcc/config/aarch64/aarch64-linux.h 
b/gcc/config/aarch64/aarch64-linux.h
index a8f0771..48beafb 100644
--- a/gcc/config/aarch64/aarch64-linux.h
+++ b/gcc/config/aarch64/aarch64-linux.h
@@ -21,7 +21,7 @@
  #ifndef GCC_AARCH64_LINUX_H
  #define GCC_AARCH64_LINUX_H

-#define GLIBC_DYNAMIC_LINKER /lib/ld-linux-aarch64%{mbig-endian:_be}.so.1
+#define GLIBC_DYNAMIC_LINKER 
/lib/ld-linux-aarch64%{mbig-endian:_be}%{mabi=ilp32:_ilp32}.so.1

  #define CPP_SPEC %{pthread:-D_REENTRANT}

@@ -33,7 +33,7 @@
 -dynamic-linker  GNU_USER_DYNAMIC_LINKER   \
 -X \
 %{mbig-endian:-EB} %{mlittle-endian:-EL} \
-   -maarch64linux%{mbig-endian:b}
+   -maarch64linux%{mabi=ilp32:32}%{mbig-endian:b}

  #define LINK_SPEC LINUX_TARGET_LINK_SPEC

diff --git a/gcc/config/aarch64/t-aarch64-linux 
b/gcc/config/aarch64/t-aarch64-linux
index 147452b..d6a678e 100644
--- a/gcc/config/aarch64/t-aarch64-linux
+++ b/gcc/config/aarch64/t-aarch64-linux
@@ -22,10 +22,7 @@ LIB1ASMSRC   = aarch64/lib1funcs.asm
  LIB1ASMFUNCS = _aarch64_sync_cache_range

  AARCH_BE = $(if $(findstring TARGET_BIG_ENDIAN_DEFAULT=1, $(tm_defines)),_be)
-MULTILIB_OSDIRNAMES = .=../lib64$(call 
if_multiarch,:aarch64$(AARCH_BE)-linux-gnu)
+MULTILIB_OSDIRNAMES = mabi.lp64=../lib64$(call 
if_multiarch,:aarch64$(AARCH_BE)-linux-gnu)
  MULTIARCH_DIRNAME = $(call if_multiarch,aarch64$(AARCH_BE)-linux-gnu)

-# Disable the multilib for linux-gnu targets for the time being; focus
-# on the baremetal targets.
-MULTILIB_OPTIONS=
-MULTILIB_DIRNAMES   =
+MULTILIB_OSDIRNAMES += mabi.ilp32=../libilp32





[Ping] Re: [C++ Patch] PR 60253

2014-02-27 Thread Paolo Carlini

On 02/21/2014 12:14 PM, Paolo Carlini wrote:

Hi,

unless we have reasons to believe that the diagnostic quality could 
regress in some circumstances, we can easily resolve this ICE on 
invalid regression by always returning error_mark_node after error 
(thus outside SFINAE too).

Pinging this patchlet...

Thanks!
Paolo.


Fwd: [PATCH, ARM] Improve 64 bit division performance

2014-02-27 Thread Charles Baylis
[resending as text/plain]

Hi

These patches optimise 64 bit division by removing the use of the
__gnu_[u]ldivmod_helper functions and hence avoiding the redundant
calculation of the remainder in those functions.

Bootstrapped, tested and checked for arm-unknown-linux-gnueabihf.

Benchmarked on Chromebook and Raspberry Pi using attached divbench3.c.
Loop1 varies the divisor and loop2 varies the dividend.

Chromebook:

before:
loop1 unsigned: 3.474419
loop2 unsigned: 6.564871
loop1 signed:   4.127967
loop2 signed:   6.071490

after:
loop1 unsigned: 2.781364
loop2 unsigned: 6.166478
loop1 signed:   2.800974
loop2 signed:   6.129588

Raspberry pi:
before
loop1 unsigned:28.881753
loop2 unsigned:19.876385
loop1 signed:  32.074941
loop2 signed:  20.594860

after:
loop1 unsigned:24.893846
loop2 unsigned:19.537562
loop1 signed:  25.334509
loop2 signed:  19.615088

Any comments? OK for stage 1?


Patch 1:

2014-02-27  Charles Baylis  charles.bay...@linaro.org

* config/arm/bpabi.S (__aeabi_uldivmod): Perform division using call
to __udivmoddi4.


Patch 2:

2014-02-27  Charles Baylis  charles.bay...@linaro.org

* config/arm/bpabi.S (__aeabi_ldivmod): Perform signed division via
call to __udivmoddi4 and fixing up for negative operands.
From 35254b813303e7fb40eb8aa0bb749216fd8f96fc Mon Sep 17 00:00:00 2001
From: Charles Baylis charles.bay...@linaro.org
Date: Tue, 25 Feb 2014 18:34:38 +
Subject: [PATCH 1/2] Optimise __aeabi_uldivmod

2014-02-25  Charles Baylis  charles.bay...@linaro.org

	* config/arm/bpabi.S (__aeabi_uldivmod): Perform division using call
	to __udivmoddi4.
	* config/arm/bpabi.S (__aeabi_uldivmod): Optimise stack pointer
	manipulation.
---
 libgcc/config/arm/bpabi.S | 25 -
 1 file changed, 20 insertions(+), 5 deletions(-)

diff --git a/libgcc/config/arm/bpabi.S b/libgcc/config/arm/bpabi.S
index 7772301..e020af5 100644
--- a/libgcc/config/arm/bpabi.S
+++ b/libgcc/config/arm/bpabi.S
@@ -120,6 +120,16 @@ ARM_FUNC_START aeabi_ulcmp
 #endif
 .endm
 
+/* we can use STRD/LDRD on v5TE and later, and any Thumb-2 architecture. */
+#if (defined(__ARM_EABI__)\
+  (defined(__thumb2__)  \
+ || (__ARM_ARCH = 5  defined(__TARGET_FEATURE_DSP
+#define CAN_USE_LDRD 1
+#else
+#define CAN_USE_LDRD 0
+#endif
+
+
 #ifdef L_aeabi_ldivmod
 
 ARM_FUNC_START aeabi_ldivmod
@@ -149,18 +159,23 @@ ARM_FUNC_START aeabi_uldivmod
 	cfi_start	__aeabi_uldivmod, LSYM(Lend_aeabi_uldivmod)
 	test_div_by_zero unsigned
 
-	sub sp, sp, #8
-#if defined(__thumb2__)
-	mov ip, sp
-	push {ip, lr}
+#if defined(__thumb2__)  CAN_USE_LDRD
+	sub ip, sp, #8
+	strd ip,lr, [sp, #-16]!
 #else
+	sub sp, sp, #8
 	do_push {sp, lr}
 #endif
 98:	cfi_push 98b - __aeabi_uldivmod, 0xe, -0xc, 0x10
-	bl SYM(__gnu_uldivmod_helper) __PLT__
+	bl SYM(__udivmoddi4) __PLT__
 	ldr lr, [sp, #4]
+#if CAN_USE_LDRD
+	ldrd r2, r3, [sp, #8]
+	add sp, sp, #16
+#else
 	add sp, sp, #8
 	do_pop {r2, r3}
+#endif
 	RET
 	cfi_end	LSYM(Lend_aeabi_uldivmod)
 
-- 
1.8.3.2

From 975d9c624e77ee00476e6866250b0e2e31461fca Mon Sep 17 00:00:00 2001
From: Charles Baylis charles.bay...@linaro.org
Date: Tue, 25 Feb 2014 16:27:59 +
Subject: [PATCH 2/2] Optimise __aeabi_ldivmod

2014-02-25  Charles Baylis  charles.bay...@linaro.org

* config/arm/bpabi.S (__aeabi_ldivmod): Perform signed division using
	unsigned division via call to __udivmoddi4 and additional logic.
---
 libgcc/config/arm/bpabi.S | 74 +++
 1 file changed, 69 insertions(+), 5 deletions(-)

diff --git a/libgcc/config/arm/bpabi.S b/libgcc/config/arm/bpabi.S
index e020af5..8b75a28 100644
--- a/libgcc/config/arm/bpabi.S
+++ b/libgcc/config/arm/bpabi.S
@@ -136,20 +136,84 @@ ARM_FUNC_START aeabi_ldivmod
 	cfi_start	__aeabi_ldivmod, LSYM(Lend_aeabi_ldivmod)
 	test_div_by_zero signed
 
-	sub sp, sp, #8
-#if defined(__thumb2__)
-	mov ip, sp
-	push {ip, lr}
+#if defined(__thumb2__)  CAN_USE_LDRD
+	sub ip, sp, #8
+	strd ip,lr, [sp, #-16]!
 #else
+	sub sp, sp, #8
 	do_push {sp, lr}
 #endif
+	cmp xxh, #0
+	blt 1f
+	cmp yyh, #0
+	blt 2f
+
+98:	cfi_push 98b - __aeabi_ldivmod, 0xe, -0xc, 0x10
+	bl SYM(__udivmoddi4) __PLT__
+	ldr lr, [sp, #4]
+#if CAN_USE_LDRD
+	ldrd r2, r3, [sp, #8]
+	add sp, sp, #16
+#else
+	add sp, sp, #8
+	do_pop {r2, r3}
+#endif
+	RET
+1: /* xxh:xxl is negative */
+	rsbs xxl, xxl, #0
+	sbc xxh, xxh, xxh, lsl #1
+	cmp yyh, #0
+	blt 3f
+98:	cfi_push 98b - __aeabi_ldivmod, 0xe, -0xc, 0x10
+	bl SYM(__udivmoddi4) __PLT__
+	ldr lr, [sp, #4]
+#if CAN_USE_LDRD
+	ldrd r2, r3, [sp, #8]
+	add sp, sp, #16
+#else
+	add sp, sp, #8
+	do_pop {r2, r3}
+#endif
+	rsbs xxl, xxl, #0
+	sbc xxh, xxh, xxh, lsl #1
+	rsbs yyl, yyl, #0
+	sbc yyh, yyh, yyh, lsl #1
+	RET
+
+2: /* only yyh:yyl is negative */
+	rsbs yyl, 

Re: C++ PATCH for lto/53808 (devirtualization of defaulted virtual dtor)

2014-02-27 Thread Martin Jambor
On Thu, Feb 27, 2014 at 04:00:22PM +0100, Richard Biener wrote:
 On Thu, Feb 27, 2014 at 3:51 PM, Jason Merrill ja...@redhat.com wrote:
  On 02/27/2014 09:03 AM, Richard Biener wrote:
 
  Jason Merrill ja...@redhat.com writes:
 
  Hmm, I wonder why we aren't devirtualizing that call on ia64.
 
  I think the vtable lookup sequence is different and nobody cared to adjust
 
  the gimple matcher to also match the ia64 sequence.
 
 
  Ah.  So xfail on ia64?
 
 OTOH the g++dg/ipa/devirt-* testcases seem to run fine everywhere?

This is probably the same issue as the one described in
http://gcc.gnu.org/ml/gcc/2012-08/msg00055.html and the subsequent
thread.  The problem however turned out to be slightly more difficult
and I was not interested enough in ia64 to care again.

Martin


C++ PATCH for c++/60353 (bogus error with unnamed class)

2014-02-27 Thread Jason Merrill
The code I factored out of expand_or_defer_fn_1 into 
tentative_decl_linkage is intended for use with functions that are 
defined; using it on functions that have not been defined causes problems.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit 2e63bb7138e59e7e04e2649fb3d8086fffc238a7
Author: Jason Merrill ja...@redhat.com
Date:   Thu Feb 27 09:15:02 2014 -0500

	PR c++/60353
	PR c++/55877
	* decl2.c (tentative_decl_linkage): Don't mess with functions that
	are not yet defined.

diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
index 1afe16e..dfc532d 100644
--- a/gcc/cp/decl2.c
+++ b/gcc/cp/decl2.c
@@ -2564,7 +2564,8 @@ tentative_decl_linkage (tree decl)
be handled.  */;
   else if (vague_linkage_p (decl))
 {
-  if (TREE_CODE (decl) == FUNCTION_DECL)
+  if (TREE_CODE (decl) == FUNCTION_DECL
+	   decl_defined_p (decl))
 	{
 	  DECL_EXTERNAL (decl) = 1;
 	  DECL_NOT_REALLY_EXTERN (decl) = 1;
@@ -2586,11 +2587,8 @@ tentative_decl_linkage (tree decl)
 	  DECL_INTERFACE_KNOWN (decl) = 1;
 	}
 	}
-  else
-	{
-	  gcc_assert (TREE_CODE (decl) == VAR_DECL);
-	  maybe_commonize_var (decl);
-	}
+  else if (TREE_CODE (decl) == VAR_DECL)
+	maybe_commonize_var (decl);
 }
 }
 
diff --git a/gcc/testsuite/g++.dg/other/anon6.C b/gcc/testsuite/g++.dg/other/anon6.C
new file mode 100644
index 000..2fd0942
--- /dev/null
+++ b/gcc/testsuite/g++.dg/other/anon6.C
@@ -0,0 +1,8 @@
+// PR c++/60353
+
+struct A {
+  A(int);
+};
+typedef struct {
+  A format;
+} B;


Re: [PATCHv2/AARCH64 2/3] Fix TLS for ILP32.

2014-02-27 Thread Marcus Shawcroft
On 26 February 2014 02:25, Andrew Pinski apin...@cavium.com wrote:
 Hi,
   With ILP32, some simple usage of TLS variables causes an unrecognizable
 instruction due to needing to use SImode for loading pointers from memory.
 This fixes the three (tlsie_small, tlsle_small, tlsdesc_small) patterns to
 support SImode for pointers.  I modified them to be like what was done for
 the GOT patterns.

 OK?  Build and tested on aarch64-elf with no regressions.

 Thanks,
 Andrew Pinski

 * config/aarch64/aarch64.c (aarch64_load_symref_appropriately):
 Handle TLS for ILP32.
 * config/aarch64/aarch64.md (tlsie_small): Rename to ...
 (tlsie_small_mode): this and handle PTR.
 (tlsie_small_sidi): New pattern.
 (tlsle_small): Change to an expand to handle ILP32.
 (tlsle_small_mode): New pattern.
 (tlsdesc_small): Rename to ...
 (tlsdesc_small_mode): this and handle PTR.

Thanks Andrew, this is OK for stage-1, but please fix Yufeng's layout nit.
/Marcus


patch to fix PR59222

2014-02-27 Thread Vladimir Makarov
The following patch fixes

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59222

The patch was successfully bootstrapped and tested on x86/x86-64.

Committed as rev. 208201.

2014-02-27  Vladimir Makarov  vmaka...@redhat.com

PR target/59222
* lra.c (lra_emit_add): Check SUBREG too.

Index: lra.c
===
--- lra.c	(revision 208174)
+++ lra.c	(working copy)
@@ -340,8 +340,9 @@ lra_emit_add (rtx x, rtx y, rtx z)
 	  base = a1;
 	  index = a2;
 	}
-  if (! REG_P (base)
-	  || (index != NULL_RTX  ! REG_P (index))
+  if (! (REG_P (base) || GET_CODE (base) == SUBREG)
+	  || (index != NULL_RTX
+	   ! (REG_P (index) || GET_CODE (index) == SUBREG))
 	  || (disp != NULL_RTX  ! CONSTANT_P (disp))
 	  || (scale != NULL_RTX  ! CONSTANT_P (scale)))
 	{


Re: [C++ Patch] PR 60253

2014-02-27 Thread Jason Merrill

OK.

Jason


Re: [C++ patch] for C++/52369

2014-02-27 Thread Jason Merrill

On 02/23/2014 02:36 PM, Fabien Chêne wrote:

 * cp/method.c (walk_field_subobs): improve the diagnostic
 locations for both REFERENCE_TYPEs and non-static const members.


It's important to have the error location be the place where the actual 
problem is, namely the constructor definition.  Instead of changing it, 
please add an inform pointing out the location of the member in question.


Jason



Re: copyright dates in binutils (and includes/)

2014-02-27 Thread Joseph S. Myers
On Thu, 27 Feb 2014, Joel Brobecker wrote:

 I should mention, however, that for us to use ranges like this,
 the FSF asked us to add a note explaining that the copyright years
 could be abbreviated into a range. See gdb/README (at the end).
 I suspect that you'll need the same note for binutils.

And, where a gap in the years is being implicitly filled in by conversion 
to a range, make sure that either (a) there was a public version control 
repository for binutils during that year, or (b) there was a release 
(including beta releases, Cygnus releases etc., not just official 
releases) during that year.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH/AARCH64 1/3] Add AARCH64 ILP32 PCH support

2014-02-27 Thread Kyle McMartin
On Tue, Feb 25, 2014 at 06:25:11PM -0800, Andrew Pinski wrote:
 -#elif defined(__aarch64__)
 +#elif defined(__aarch64__)  defined(__LP64__)
  # define TRY_EMPTY_VM_SPACE  0x10
 +#elif defined(__aarch64__)
 +# define TRY_EMPTY_VM_SPACE  0x6000

awesome. thanks andrew!

--kyle


[PATCH] [rtl-optimization/49847] Fix cse to handle cc0 setter and cc0 user in different blocks

2014-02-27 Thread Jeff Law


As discussed in 49847, a few years ago GCC was changed to add EH edges 
for exceptions that might arise from a floating point comparison.  That 
change made it possible for a cc0-setter and cc0-user to end up in 
different blocks, separated by a NOTE.


After discussing reverting the change, duplicating the cc0-setter and a 
couple other ideas, eventually Richi and myself settled on relaxing the 
long standing restrictions on the relative locations of the cc0 setter 
and cc0 user.


We agreed to update the documentation and fault in fixes due to low 
priority of cc0 targets (and doubly so since this really only affects 
cc0 targets with used with non-call-exceptions).


This patch updates the documentation and fixes the only known failure 
due to this change in cse.c with a patch from Mikael.


Tested with a cross compiler.  I'll probably fire off a m68k bootstrap 
for good measure, but it'll be several days before that completes.


Installed on the trunk.

Jeff
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 8a78716..a20cee3 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,12 @@
+2014-02-27  Mikael Pettersson  mi...@it.uu.se
+   Jeff Law  l...@redhat.com
+
+   PR rtl-optimization/49847
+   * cse.c (fold_rtx) Handle case where cc0 setter and cc0 user
+   are in different blocks.
+   * doc/tm.texi (Condition Code Status): Update documention for
+   relative locations of cc0-setter and cc0-user.
+
 2014-02-27  Vladimir Makarov  vmaka...@redhat.com
 
PR target/59222
diff --git a/gcc/cse.c b/gcc/cse.c
index cffa553..dba85f1 100644
--- a/gcc/cse.c
+++ b/gcc/cse.c
@@ -3199,9 +3199,27 @@ fold_rtx (rtx x, rtx insn)
 
 #ifdef HAVE_cc0
  case CC0:
-   folded_arg = prev_insn_cc0;
-   mode_arg = prev_insn_cc0_mode;
-   const_arg = equiv_constant (folded_arg);
+   /* The cc0-user and cc0-setter may be in different blocks if
+  the cc0-setter potentially traps.  In that case PREV_INSN_CC0
+  will have been cleared as we exited the block with the
+  setter.
+
+  While we could potentially track cc0 in this case, it just
+  doesn't seem to be worth it given that cc0 targets are not
+  terribly common or important these days and trapping math
+  is rarely used.  The combination of those two conditions
+  necessary to trip this situation is exceedingly rare in the
+  real world.  */
+   if (!prev_insn_cc0)
+ {
+   const_arg = NULL_RTX;
+ }
+   else
+ {
+   folded_arg = prev_insn_cc0;
+   mode_arg = prev_insn_cc0_mode;
+   const_arg = equiv_constant (folded_arg);
+ }
break;
 #endif
 
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index f204936..f7024a7 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -5900,8 +5900,13 @@ most instructions do not affect it.  The latter category 
includes
 most RISC machines.
 
 The implicit clobbering poses a strong restriction on the placement of
-the definition and use of the condition code, which need to be in adjacent
-insns for machines using @code{(cc0)}.  This can prevent important
+the definition and use of the condition code.  In the past the definition
+and use were always adjacent.  However, recent changes to support trapping
+arithmatic may result in the definition and user being in different blocks.
+Thus, there may be a @code{NOTE_INSN_BASIC_BLOCK} between them.  Additionally,
+the definition may be the source of exception handling edges.
+
+These restrictions can prevent important
 optimizations on some machines.  For example, on the IBM RS/6000, there
 is a delay for taken branches unless the condition code register is set
 three instructions earlier than the conditional branch.  The instruction
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 128a5f7..be4cb12 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,9 @@
+2014-02-27  Mikael Pettersson  mi...@it.uu.se
+Jeff Law  l...@redhat.com
+
+PR rtl-optimization/49847
+* g++.dg/pr49847.C: New test.
+
 2014-02-27  Marek Polacek  pola...@redhat.com
 
PR middle-end/59223
diff --git a/gcc/testsuite/g++.dg/pr49847.C b/gcc/testsuite/g++.dg/pr49847.C
new file mode 100644
index 000..b047713
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pr49847.C
@@ -0,0 +1,7 @@
+/* { dg-do compile } */
+/* { dg-options -O -fnon-call-exceptions } */
+int f (float g)
+{
+  try { return g = 0; }
+  catch (...) {}
+}


Re: C++ PATCH for lto/53808 (devirtualization of defaulted virtual dtor)

2014-02-27 Thread Jan Hubicka
 On Thu, Feb 27, 2014 at 2:53 PM, Andreas Schwab sch...@suse.de wrote:
  Jason Merrill ja...@redhat.com writes:
 
  Hmm, I wonder why we aren't devirtualizing that call on ia64.  Does it
  work with -O3?
 
  That doesn't change anything fundamentally.
 
 I think the vtable lookup sequence is different and nobody cared to adjust
 the gimple matcher to also match the ia64 sequence.

We sort of handle the function descriptors thorough the devirtualization code.
I will check why we fail here.

Honza
 
 Richard.
 
  Andreas.
 
  --
  Andreas Schwab, SUSE Labs, sch...@suse.de
  GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
  And now for something completely different.


Re: [C++ Patch] PR 60314 (ICE with decltype(auto))

2014-02-27 Thread Jason Merrill

On 02/25/2014 05:03 AM, Paolo Carlini wrote:

here we ICE exactly as we did in c++/53756: the only difference is the
use of decltype(auto) instead of auto. Now, if we compare is_cxx_auto to
is_auto (the front-end helper), evidently there is an inconsistency
about the handling of decltype(auto) and the below fixes the ICE.
However, also clearly the patchlet needs a review, because an out of
class decltype(auto) is already fine. Also, I'm not 100% sure we don't
need a decltype_auto_die, etc.


I think we do need a decltype_auto_die.

Jason



Re: [RFA][rtl-optimization/52714] Do not allow combine to create invalid RTL

2014-02-27 Thread Jeff Law

On 02/17/14 02:28, Eric Botcazou wrote:


Although it's probably time to concede defeat in this particular case, I don't
think we should use the NONJUMP_INSN_P big hammer because we want to eliminate
branches if possible.  So I'd just add the minimal test to the two existing
conditions so as to prevent the invalid RTL from being created, e.g.

(!JUMP_P (i3) || SET_DEST (set[01]) == pc_rtx)

Done.  Attached is the final iteration that got installed.


Verified with a cross compiler.  Bootstrap on m68k will start at some 
point and finish in the next week or so.


jeff
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index a20cee3..16c499b 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,10 @@
+2014-02-27  Jeff Law  l...@redhat.com
+
+   PR rtl-optimization/52714
+   * combine.c (try_combine): When splitting an unrecognized PARALLEL
+   into two independent simple sets, if I3 is a jump, ensure the
+   pattern we place into I3 is a (set (pc) ...)
+
 2014-02-27  Mikael Pettersson  mi...@it.uu.se
Jeff Law  l...@redhat.com
 
diff --git a/gcc/combine.c b/gcc/combine.c
index 1b598b4..fc473b6 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -3712,6 +3712,9 @@ try_combine (rtx i3, rtx i2, rtx i1, rtx i0, int 
*new_direct_jump_p,
 #ifdef HAVE_cc0
   !reg_referenced_p (cc0_rtx, set0)
 #endif
+ /* If I3 is a jump, ensure that set0 is a jump so that
+we do not create invalid RTL.  */
+  (!JUMP_P (i3) || SET_DEST (set0) == pc_rtx)
 )
{
  newi2pat = set1;
@@ -3726,6 +3729,9 @@ try_combine (rtx i3, rtx i2, rtx i1, rtx i0, int 
*new_direct_jump_p,
 #ifdef HAVE_cc0
!reg_referenced_p (cc0_rtx, set1)
 #endif
+  /* If I3 is a jump, ensure that set1 is a jump so that
+ we do not create invalid RTL.  */
+   (!JUMP_P (i3) || SET_DEST (set1) == pc_rtx)
  )
{
  newi2pat = set0;
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index be4cb12..b95bcfd 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@
+2014-02-27  Jeff Law  l...@redhat.com
+
+   PR rtl-optimization/52714
+   * gcc.c-torture/compile/pr52714.c: New test.
+
 2014-02-27  Mikael Pettersson  mi...@it.uu.se
 Jeff Law  l...@redhat.com
 
diff --git a/gcc/testsuite/gcc.c-torture/compile/pr52714.c 
b/gcc/testsuite/gcc.c-torture/compile/pr52714.c
new file mode 100644
index 000..03d4990
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/compile/pr52714.c
@@ -0,0 +1,25 @@
+
+int __re_compile_fastmap(unsigned char *p)
+{
+unsigned char **stack;
+unsigned size;
+unsigned avail;
+
+stack = __builtin_alloca(5 * sizeof(unsigned char*));
+if (stack == 0)
+   return -2;
+size = 5;
+avail = 0;
+
+for (;;) {
+   switch (*p++) {
+   case 0:
+   if (avail == size)
+   return -2;
+   stack[avail++] = p;
+   }
+}
+
+return 0;
+}
+


Re: [PATCH] [rtl-optimization/49847] Fix cse to handle cc0 setter and cc0 user in different blocks

2014-02-27 Thread H.J. Lu
On Thu, Feb 27, 2014 at 11:02 AM, Jeff Law l...@redhat.com wrote:

 As discussed in 49847, a few years ago GCC was changed to add EH edges for
 exceptions that might arise from a floating point comparison.  That change
 made it possible for a cc0-setter and cc0-user to end up in different
 blocks, separated by a NOTE.

 After discussing reverting the change, duplicating the cc0-setter and a
 couple other ideas, eventually Richi and myself settled on relaxing the long
 standing restrictions on the relative locations of the cc0 setter and cc0
 user.

 We agreed to update the documentation and fault in fixes due to low priority
 of cc0 targets (and doubly so since this really only affects cc0 targets
 with used with non-call-exceptions).

 This patch updates the documentation and fixes the only known failure due to
 this change in cse.c with a patch from Mikael.

 Tested with a cross compiler.  I'll probably fire off a m68k bootstrap for
 good measure, but it'll be several days before that completes.

 Installed on the trunk.

 Jeff

 diff --git a/gcc/ChangeLog b/gcc/ChangeLog
 index 8a78716..a20cee3 100644
 --- a/gcc/ChangeLog
 +++ b/gcc/ChangeLog
 @@ -1,3 +1,12 @@
 +2014-02-27  Mikael Pettersson  mi...@it.uu.se
 +   Jeff Law  l...@redhat.com
 +
 +   PR rtl-optimization/49847
 +   * cse.c (fold_rtx) Handle case where cc0 setter and cc0 user
 +   are in different blocks.
 +   * doc/tm.texi (Condition Code Status): Update documention for
 +   relative locations of cc0-setter and cc0-user.
 +

This breaks bootstrap:

You should edit ../../src-trunk/gcc/doc/tm.texi.in rather than
../../src-trunk/gcc/doc/tm.texi .
make[6]: *** [s-tm-texi] Error 1
make[6]: *** Waiting for unfinished jobs

-- 
H.J.


Re: [PATCH] [rtl-optimization/49847] Fix cse to handle cc0 setter and cc0 user in different blocks

2014-02-27 Thread H.J. Lu
On Thu, Feb 27, 2014 at 11:50 AM, H.J. Lu hjl.to...@gmail.com wrote:
 On Thu, Feb 27, 2014 at 11:02 AM, Jeff Law l...@redhat.com wrote:

 As discussed in 49847, a few years ago GCC was changed to add EH edges for
 exceptions that might arise from a floating point comparison.  That change
 made it possible for a cc0-setter and cc0-user to end up in different
 blocks, separated by a NOTE.

 After discussing reverting the change, duplicating the cc0-setter and a
 couple other ideas, eventually Richi and myself settled on relaxing the long
 standing restrictions on the relative locations of the cc0 setter and cc0
 user.

 We agreed to update the documentation and fault in fixes due to low priority
 of cc0 targets (and doubly so since this really only affects cc0 targets
 with used with non-call-exceptions).

 This patch updates the documentation and fixes the only known failure due to
 this change in cse.c with a patch from Mikael.

 Tested with a cross compiler.  I'll probably fire off a m68k bootstrap for
 good measure, but it'll be several days before that completes.

 Installed on the trunk.

 Jeff

 diff --git a/gcc/ChangeLog b/gcc/ChangeLog
 index 8a78716..a20cee3 100644
 --- a/gcc/ChangeLog
 +++ b/gcc/ChangeLog
 @@ -1,3 +1,12 @@
 +2014-02-27  Mikael Pettersson  mi...@it.uu.se
 +   Jeff Law  l...@redhat.com
 +
 +   PR rtl-optimization/49847
 +   * cse.c (fold_rtx) Handle case where cc0 setter and cc0 user
 +   are in different blocks.
 +   * doc/tm.texi (Condition Code Status): Update documention for
 +   relative locations of cc0-setter and cc0-user.
 +

 This breaks bootstrap:

 You should edit ../../src-trunk/gcc/doc/tm.texi.in rather than
 ../../src-trunk/gcc/doc/tm.texi .
 make[6]: *** [s-tm-texi] Error 1
 make[6]: *** Waiting for unfinished jobs


I checked in this as an obvious fix.

-- 
H.J.
---
Index: ChangeLog
===
--- ChangeLog (revision 208204)
+++ ChangeLog (working copy)
@@ -1,3 +1,8 @@
+2014-02-27  H.J. Lu  hongjiu...@intel.com
+
+ * doc/tm.texi.in (Condition Code Status): Update documention for
+ relative locations of cc0-setter and cc0-user.
+
 2014-02-27  Jeff Law  l...@redhat.com

  PR rtl-optimization/52714
Index: doc/tm.texi.in
===
--- doc/tm.texi.in (revision 208204)
+++ doc/tm.texi.in (working copy)
@@ -4484,8 +4484,13 @@ most instructions do not affect it.  The
 most RISC machines.

 The implicit clobbering poses a strong restriction on the placement of
-the definition and use of the condition code, which need to be in adjacent
-insns for machines using @code{(cc0)}.  This can prevent important
+the definition and use of the condition code.  In the past the definition
+and use were always adjacent.  However, recent changes to support trapping
+arithmatic may result in the definition and user being in different blocks.
+Thus, there may be a @code{NOTE_INSN_BASIC_BLOCK} between them.  Additionally,
+the definition may be the source of exception handling edges.
+
+These restrictions can prevent important
 optimizations on some machines.  For example, on the IBM RS/6000, there
 is a delay for taken branches unless the condition code register is set
 three instructions earlier than the conditional branch.  The instruction


[gomp4] Gimplification: Merge gimplify_oacc_parallel into gimplify_omp_workshare. (was: [gomp4 9/9] OpenACC: Basic support for #pragma acc parallel.)

2014-02-27 Thread Thomas Schwinge
Hi!

On Wed, 6 Nov 2013 20:53:00 +0100, I wrote:
   gcc/
   * gimplify.c [...]
   (gimplify_oacc_parallel): New function.

To get rid of code duplication, I have applied the following in r208206:

commit 9ffb216dd43bda84f56ce7fe68ae15cc08110924
Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4
Date:   Thu Feb 27 19:56:28 2014 +

Gimplification: Merge gimplify_oacc_parallel into gimplify_omp_workshare.

gcc/
* gimplify.c (gimplify_oacc_parallel): Merge into
gimplify_omp_workshare.  Update all callers.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@208206 
138bc75d-0d04-0410-961f-82ee72b054a4

diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
index 9f5941f..3d9b06d 100644
--- gcc/ChangeLog.gomp
+++ gcc/ChangeLog.gomp
@@ -1,3 +1,8 @@
+2014-02-27  Thomas Schwinge  tho...@codesourcery.com
+
+   * gimplify.c (gimplify_oacc_parallel): Merge into
+   gimplify_omp_workshare.  Update all callers.
+
 2014-02-27  Ilmir Usmanov  i.usma...@samsung.com
 
Fix OpenACC ASYNC clause: it cannot contain integer-expression-list.
diff --git gcc/gimplify.c gcc/gimplify.c
index d0a4779..6dbabfa 100644
--- gcc/gimplify.c
+++ gcc/gimplify.c
@@ -6572,38 +6572,6 @@ gimplify_adjust_omp_clauses (tree *list_p)
   delete_omp_context (ctx);
 }
 
-/* Gimplify the contents of an OACC_PARALLEL statement.  This involves
-   gimplification of the body, as well as scanning the body for used
-   variables.  We need to do this scan now, because variable-sized
-   decls will be decomposed during gimplification.  */
-
-static void
-gimplify_oacc_parallel (tree *expr_p, gimple_seq *pre_p)
-{
-  tree expr = *expr_p;
-  gimple g;
-  gimple_seq body = NULL;
-  enum omp_region_type ort = (enum omp_region_type) (ORT_TARGET
-| ORT_TARGET_OFFLOAD
-| ORT_TARGET_MAP_FORCE);
-
-  gimplify_scan_omp_clauses (OACC_PARALLEL_CLAUSES (expr), pre_p, ort);
-
-  push_gimplify_context ();
-
-  g = gimplify_and_return_first (OACC_PARALLEL_BODY (expr), body);
-  if (gimple_code (g) == GIMPLE_BIND)
-pop_gimplify_context (g);
-  else
-pop_gimplify_context (NULL);
-
-  gimplify_adjust_omp_clauses (OACC_PARALLEL_CLAUSES (expr));
-
-  g = gimple_build_oacc_parallel (body, OACC_PARALLEL_CLAUSES (expr));
-  gimplify_seq_add_stmt (pre_p, g);
-  *expr_p = NULL_TREE;
-}
-
 /* Gimplify the contents of an OMP_PARALLEL statement.  This involves
gimplification of the body, as well as scanning the body for used
variables.  We need to do this scan now, because variable-sized
@@ -7039,6 +7007,11 @@ gimplify_omp_workshare (tree *expr_p, gimple_seq *pre_p)
   ort = (enum omp_region_type) (ORT_TARGET
| ORT_TARGET_MAP_FORCE);
   break;
+case OACC_PARALLEL:
+  ort = (enum omp_region_type) (ORT_TARGET
+   | ORT_TARGET_OFFLOAD
+   | ORT_TARGET_MAP_FORCE);
+  break;
 case OMP_SECTIONS:
 case OMP_SINGLE:
   ort = ORT_WORKSHARE;
@@ -7097,6 +7070,9 @@ gimplify_omp_workshare (tree *expr_p, gimple_seq *pre_p)
   stmt = gimple_build_omp_target (body, GF_OMP_TARGET_KIND_OACC_DATA,
  OACC_DATA_CLAUSES (expr));
   break;
+case OACC_PARALLEL:
+  stmt = gimple_build_oacc_parallel (body, OACC_PARALLEL_CLAUSES (expr));
+  break;
 case OMP_SECTIONS:
   stmt = gimple_build_omp_sections (body, OMP_CLAUSES (expr));
   break;
@@ -8060,11 +8036,6 @@ gimplify_expr (tree *expr_p, gimple_seq *pre_p, 
gimple_seq *post_p,
  ret = GS_ALL_DONE;
  break;
 
-   case OACC_PARALLEL:
- gimplify_oacc_parallel (expr_p, pre_p);
- ret = GS_ALL_DONE;
- break;
-
case OACC_KERNELS:
case OACC_HOST_DATA:
case OACC_DECLARE:
@@ -8095,6 +8066,7 @@ gimplify_expr (tree *expr_p, gimple_seq *pre_p, 
gimple_seq *post_p,
  break;
 
case OACC_DATA:
+   case OACC_PARALLEL:
case OMP_SECTIONS:
case OMP_SINGLE:
case OMP_TARGET:


Grüße,
 Thomas


pgpoBzrHjF4uI.pgp
Description: PGP signature


Re: [PR debug/59992 #2/2] preserve permanent values in a separate table

2014-02-27 Thread Richard Henderson
On 02/26/2014 10:09 PM, Alexandre Oliva wrote:
 Regstrapped on x86_64-linux-gnu and i686-linux-gnu, along with the other
 patch for PR debug/59992.  Ok to install?
 
 
 for  gcc/ChangeLog
 
   PR debug/59992
   * cselib.c (cselib_hasher::equal): Special-case VALUE lookup.
   (cselib_preserved_hash_table): New.
   (preserve_constants_and_equivs): Move preserved vals to it.
   (cselib_find_slot): Look it up first.
   (cselib_init): Initialize it.
   (cselib_finish): Release it.
   (dump_cselib_table): Dump it.

Ok.


r~


[C++ Patch] PR 58610

2014-02-27 Thread Paolo Carlini

Hi,

in this regression, we ICE in explain_non_literal_class in:

  if (DECL_DELETED_FN (fn))
maybe_explain_implicit_delete (fn);

because fn is a TEMPLATE_DECL. Now, if I read the comment which provides 
rationale for LANG_DECL_FN_CHECK:


/* We want to be able to check DECL_CONSTRUCTOR_P and such on a function
   template, not just on a FUNCTION_DECL.  So when looking for things in
   lang_decl_fn, look down through a TEMPLATE_DECL into its result. */
#define LANG_DECL_FN_CHECK(NODE) __extension__\

I think it could make sense to use the macro for DECL_DELETED_FN too, 
not just for DECL_DEFAULTED_FN. Or do we want something more localized?


Tested x86_64-linux.

Thanks,
Paolo.

/
/cp
2014-02-27  Paolo Carlini  paolo.carl...@oracle.com

PR c++/58610
* cp-tree.h (DECL_DELETED_FN): Use LANG_DECL_FN_CHECK.
* call.c (print_z_candidate): Remove STRIP_TEMPLATE use.

/testsuite
2014-02-27  Paolo Carlini  paolo.carl...@oracle.com

PR c++/58610
* g++.dg/cpp0x/constexpr-ice11.C: New.
Index: cp/call.c
===
--- cp/call.c   (revision 208203)
+++ cp/call.c   (working copy)
@@ -3237,7 +3237,7 @@ print_z_candidate (location_t loc, const char *msg
 inform (cloc, %s%T conversion, msg, candidate-fn);
   else if (candidate-viable == -1)
 inform (cloc, %s%#D near match, msg, candidate-fn);
-  else if (DECL_DELETED_FN (STRIP_TEMPLATE (candidate-fn)))
+  else if (DECL_DELETED_FN (candidate-fn))
 inform (cloc, %s%#D deleted, msg, candidate-fn);
   else
 inform (cloc, %s%#D, msg, candidate-fn);
Index: cp/cp-tree.h
===
--- cp/cp-tree.h(revision 208203)
+++ cp/cp-tree.h(working copy)
@@ -3222,7 +3222,7 @@ more_aggr_init_expr_args_p (const aggr_init_expr_a
 
 /* Nonzero if DECL was declared with '= delete'.  */
 #define DECL_DELETED_FN(DECL) \
-  (DECL_LANG_SPECIFIC (FUNCTION_DECL_CHECK 
(DECL))-u.base.threadprivate_or_deleted_p)
+  (LANG_DECL_FN_CHECK (DECL)-min.base.threadprivate_or_deleted_p)
 
 /* Nonzero if DECL was declared with '= default' (maybe implicitly).  */
 #define DECL_DEFAULTED_FN(DECL) \
Index: testsuite/g++.dg/cpp0x/constexpr-ice11.C
===
--- testsuite/g++.dg/cpp0x/constexpr-ice11.C(revision 0)
+++ testsuite/g++.dg/cpp0x/constexpr-ice11.C(working copy)
@@ -0,0 +1,9 @@
+// PR c++/58610
+// { dg-do compile { target c++11 } }
+
+struct A
+{
+  templatetypename A();
+};
+
+constexpr A a;  // { dg-error literal|matching }


Re: [PATCH,GRAPHITE] Fix for P1 bug 58028

2014-02-27 Thread Mircea Namolaru
Thanks for comments - updated the patch (fixed my e-mail address too :-)).

2014-02-26  Tobias Grosser  tob...@grosser.es
Mircea Namolaru  mircea.namol...@inria.fr

 Fix for bug 58028
 * graphite-clast-to-gimple.c (set_cloog_options):
 Don't remove scalar dimensions.

Index: gcc/graphite-clast-to-gimple.c
===
--- gcc/graphite-clast-to-gimple.c  (revision 207298)
+++ gcc/graphite-clast-to-gimple.c  (working copy)
@@ -1522,6 +1522,13 @@
  variables.  */
   options-save_domains = 1;
 
+  /* Do not remove scalar dimensions.  CLooG by default removes scalar 
+ dimensions very early from the input schedule.  However, they are 
+ necessary to correctly derive from the saved domains 
+ (options-save_domains) the relationship between the generated loops 
+ and the schedule dimensions they are generated from.  */ 
+  options-noscalars = 1;
+
   /* Disable optimizations and make cloog generate source code closer to the
  input.  This is useful for debugging,  but later we want the optimized
  code.

- Mail original -
 De: Rainer Orth r...@cebitec.uni-bielefeld.de
 À: Tobias Burnus tobias.bur...@physik.fu-berlin.de
 Cc: Tobias Grosser tob...@grosser.es, Mircea Namolaru 
 mircea.namol...@inria.fr, gcc-patches@gcc.gnu.org
 Envoyé: Jeudi 27 Février 2014 11:14:30
 Objet: Re: [PATCH,GRAPHITE] Fix for P1 bug 58028
 
 Tobias Burnus tobias.bur...@physik.fu-berlin.de writes:
 
  On 02/26/2014 10:09 PM, Mircea Namolaru wrote:
  + /* Do not remove scalar dimensions. Cloog be default removes scalar
  + dimensions very early from the input schedule. However, they are
 
  Small nit: Instead of 'be default' it should be 'by default'
 
 ... and two spaces after the full stops.  While you're at it, please fix
 the CLooG capitalization ;-)
 
   Rainer
 
 --
 -
 Rainer Orth, Center for Biotechnology, Bielefeld University
 


Re: [PATCH] [rtl-optimization/49847] Fix cse to handle cc0 setter and cc0 user in different blocks

2014-02-27 Thread Jeff Law

On 02/27/14 12:50, H.J. Lu wrote:

On Thu, Feb 27, 2014 at 11:02 AM, Jeff Law l...@redhat.com wrote:


As discussed in 49847, a few years ago GCC was changed to add EH edges for
exceptions that might arise from a floating point comparison.  That change
made it possible for a cc0-setter and cc0-user to end up in different
blocks, separated by a NOTE.

After discussing reverting the change, duplicating the cc0-setter and a
couple other ideas, eventually Richi and myself settled on relaxing the long
standing restrictions on the relative locations of the cc0 setter and cc0
user.

We agreed to update the documentation and fault in fixes due to low priority
of cc0 targets (and doubly so since this really only affects cc0 targets
with used with non-call-exceptions).

This patch updates the documentation and fixes the only known failure due to
this change in cse.c with a patch from Mikael.

Tested with a cross compiler.  I'll probably fire off a m68k bootstrap for
good measure, but it'll be several days before that completes.

Installed on the trunk.

Jeff

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 8a78716..a20cee3 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,12 @@
+2014-02-27  Mikael Pettersson  mi...@it.uu.se
+   Jeff Law  l...@redhat.com
+
+   PR rtl-optimization/49847
+   * cse.c (fold_rtx) Handle case where cc0 setter and cc0 user
+   are in different blocks.
+   * doc/tm.texi (Condition Code Status): Update documention for
+   relative locations of cc0-setter and cc0-user.
+


This breaks bootstrap:

You should edit ../../src-trunk/gcc/doc/tm.texi.in rather than
../../src-trunk/gcc/doc/tm.texi .
make[6]: *** [s-tm-texi] Error 1
make[6]: *** Waiting for unfinished jobs

Nuts.  Sorry.  Thanks for fixing it.
jeff


Re: [PATCH,GRAPHITE] Fix for P1 bug 58028

2014-02-27 Thread Rainer Orth
Hi Mircea,

two last nits:

 2014-02-26  Tobias Grosser  tob...@grosser.es
 Mircea Namolaru  mircea.namol...@inria.fr
 
  Fix for bug 58028

Please write this as

   PR tree-optimization/58028

instead.  This way, the commit triggers an update to the bugzilla bug.

  * graphite-clast-to-gimple.c (set_cloog_options):
  Don't remove scalar dimensions.

Don't insert a line break after the colon, but continue on the same
line, letting it wrap at pos. 75 (Emacs' fill-column).

Thanks.
Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [jit] Major API change: blocks rather than labels

2014-02-27 Thread David Malcolm
On Thu, 2014-02-27 at 17:11 -0500, David Malcolm wrote:

[...]

 With this commit, the API changes to using basic blocks instead: blocks
 are created within functions, and statements are added to blocks, rather
 than to functions.

[...]

I've also ported the jittest example to the new API, as of this
commit:
https://github.com/davidmalcolm/jittest/commit/af66efe0386e52a9292b7527174ae402c0af5e43

(though currently it falls foul of type-checking, due to int vs bool
issues in conditionals; upon hacking out the type-checking from
libgccjit it compiles and runs OK).

I haven't yet ported the Python bindings.

Dave



Re: [PATCH GCC]Allow cfgcleanup to remove forwarder loop preheaders and latches

2014-02-27 Thread H.J. Lu
On Mon, Feb 24, 2014 at 9:12 PM, bin.cheng bin.ch...@arm.com wrote:
 Hi,
 This patch is to fix regression reported in PR60280 by removing forward loop
 headers/latches in cfg cleanup if possible.  Several tests are broken by
 this change since cfg cleanup is shared by all optimizers.  Some tests has
 already been fixed by recent patches, I went through and fixed the others.
 One case needs to be clarified is gcc.dg/tree-prof/update-loopch.c.  When
 GCC removing a basic block, it checks profile information by calling
 check_bb_profile after redirecting incoming edges of the bb.  This certainly
 results in warnings about invalid profile information and causes the case to
 fail.  I will send a patch to skip checking profile information for a
 removing basic block in stage 1 if it sounds reasonable.  For now I just
 twisted the case itself.

 Bootstrap and tested on x86_64 and arm_a15.

 Is it OK?


 2014-02-25  Bin Cheng  bin.ch...@arm.com

 PR target/60280
 * tree-cfgcleanup.c (tree_forwarder_block_p): Protect loop
 preheaders and latches only if requested.  Fix latch if it
 is removed.
 * tree-ssa-dom.c (tree_ssa_dominator_optimize): Set
 LOOPS_HAVE_PREHEADERS.


This change:

if (dest-loop_father-header == dest)
-  return false;
+  {
+if (loops_state_satisfies_p (LOOPS_HAVE_PREHEADERS)
+ bb-loop_father-header != dest)
+  return false;
+
+if (loops_state_satisfies_p (LOOPS_HAVE_SIMPLE_LATCHES)
+ bb-loop_father-header == dest)
+  return false;
+  }
 }

miscompiled 435.gromacs in SPEC CPU 2006 on x32 with

-O3 -funroll-loops -ffast-math -fwhole-program -flto=jobserver
-fuse-linker-plugin

This patch changes loops without LOOPS_HAVE_PREHEADERS
nor LOOPS_HAVE_SIMPLE_LATCHES from returning false to returning
true.  I don't have a small testcase.  But this patch:

diff --git a/gcc/tree-cfgcleanup.c b/gcc/tree-cfgcleanup.c
index b5c384b..2ba673c 100644
--- a/gcc/tree-cfgcleanup.c
+++ b/gcc/tree-cfgcleanup.c
@@ -323,6 +323,10 @@ tree_forwarder_block_p (basic_block bb, bool phi_wanted)
 if (loops_state_satisfies_p (LOOPS_HAVE_SIMPLE_LATCHES)
  bb-loop_father-header == dest)
   return false;
+
+if (!loops_state_satisfies_p (LOOPS_HAVE_PREHEADERS)
+ !loops_state_satisfies_p (LOOPS_HAVE_SIMPLE_LATCHES))
+  return false;
   }
 }

fixes the regression.  Does it make any senses?


-- 
H.J.


Re: [Patch, testsuite]: Allow MicroBlaze .weakext pattern in regex match

2014-02-27 Thread Michael Eager

On 02/26/14 16:29, David Holsgrove wrote:

Hi Mike S., Michael E.,


-Original Message-
From: Mike Stump [mailto:mikest...@comcast.net]
Sent: Friday, 21 February 2014 6:17 am
To: David Holsgrove
Cc: gcc-patches@gcc.gnu.org; Michael Eager (ea...@eagerm.com); Vidhumouli
Hunsigida; Nagaraju Mekala; John Williams; Edgar Iglesias
Subject: Re: [Patch, testsuite]: Allow MicroBlaze .weakext pattern in regex 
match

On Feb 16, 2014, at 4:21 PM, David Holsgrove david.holsgr...@xilinx.com
wrote:

I've attached an updated patch


Ok, thanks.


If no one has any objections on this updated patch, would either of you be able 
to approve and apply?


What targets have you tested this patch with?

It isn't sufficient to test it only with MicroBlaze,
since this code is used by all other targets.

--
Michael Eagerea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077


Re: [PATCH GCC]Allow cfgcleanup to remove forwarder loop preheaders and latches

2014-02-27 Thread Hans-Peter Nilsson
On Tue, 25 Feb 2014, bin.cheng wrote:

 Hi,
 This patch is to fix regression reported in PR60280 by removing forward loop
 headers/latches in cfg cleanup if possible.  Several tests are broken by
 this change since cfg cleanup is shared by all optimizers.  Some tests has
 already been fixed by recent patches, I went through and fixed the others.
 One case needs to be clarified is gcc.dg/tree-prof/update-loopch.c.  When
 GCC removing a basic block, it checks profile information by calling
 check_bb_profile after redirecting incoming edges of the bb.  This certainly
 results in warnings about invalid profile information and causes the case to
 fail.  I will send a patch to skip checking profile information for a
 removing basic block in stage 1 if it sounds reasonable.  For now I just
 twisted the case itself.

 Bootstrap and tested on x86_64 and arm_a15.

 Is it OK?


 2014-02-25  Bin Cheng  bin.ch...@arm.com

   PR target/60280
   * tree-cfgcleanup.c (tree_forwarder_block_p): Protect loop
   preheaders and latches only if requested.  Fix latch if it
   is removed.
   * tree-ssa-dom.c (tree_ssa_dominator_optimize): Set
   LOOPS_HAVE_PREHEADERS.

 gcc/testsuite/ChangeLog
 2014-02-25  Bin Cheng  bin.ch...@arm.com

   PR target/60280
   * gnat.dg/renaming5.adb: Change to two expected gotos.
   * gcc.dg/tree-ssa/pr21559.c: Change back to three expected
   jump threads.
   * gcc.dg/tree-prof/update-loopch.c: Check two Invalid sum
   messages for removed basic block.
   * gcc.dg/tree-ssa/ivopt_1.c: Fix unreliable scanning string.
   * gcc.dg/tree-ssa/ivopt_2.c: Ditto.
   * gcc.dg/tree-ssa/ivopt_3.c: Ditto.
   * gcc.dg/tree-ssa/ivopt_4.c: Ditto.

Do you need to also update gcc.dg/tree-ssa/ssa-dom-thread-4.c,
at least for logical_op_short_circuit targets?
(There's a nice analysis comment there by Richard S which may
also have to be updated.)

This caused a regression for logical_op_short_circuit targets, I
entered PR60363 for convenience.

brgds, H-P


Re: [PATCH GCC]Allow cfgcleanup to remove forwarder loop preheaders and latches

2014-02-27 Thread Bin.Cheng
Thanks for reporting this, I will look into it.

Thanks,
bin

On Fri, Feb 28, 2014 at 8:52 AM, H.J. Lu hjl.to...@gmail.com wrote:
 On Mon, Feb 24, 2014 at 9:12 PM, bin.cheng bin.ch...@arm.com wrote:
 Hi,
 This patch is to fix regression reported in PR60280 by removing forward loop
 headers/latches in cfg cleanup if possible.  Several tests are broken by
 this change since cfg cleanup is shared by all optimizers.  Some tests has
 already been fixed by recent patches, I went through and fixed the others.
 One case needs to be clarified is gcc.dg/tree-prof/update-loopch.c.  When
 GCC removing a basic block, it checks profile information by calling
 check_bb_profile after redirecting incoming edges of the bb.  This certainly
 results in warnings about invalid profile information and causes the case to
 fail.  I will send a patch to skip checking profile information for a
 removing basic block in stage 1 if it sounds reasonable.  For now I just
 twisted the case itself.

 Bootstrap and tested on x86_64 and arm_a15.

 Is it OK?


 2014-02-25  Bin Cheng  bin.ch...@arm.com

 PR target/60280
 * tree-cfgcleanup.c (tree_forwarder_block_p): Protect loop
 preheaders and latches only if requested.  Fix latch if it
 is removed.
 * tree-ssa-dom.c (tree_ssa_dominator_optimize): Set
 LOOPS_HAVE_PREHEADERS.


 This change:

 if (dest-loop_father-header == dest)
 -  return false;
 +  {
 +if (loops_state_satisfies_p (LOOPS_HAVE_PREHEADERS)
 + bb-loop_father-header != dest)
 +  return false;
 +
 +if (loops_state_satisfies_p (LOOPS_HAVE_SIMPLE_LATCHES)
 + bb-loop_father-header == dest)
 +  return false;
 +  }
  }

 miscompiled 435.gromacs in SPEC CPU 2006 on x32 with

 -O3 -funroll-loops -ffast-math -fwhole-program -flto=jobserver
 -fuse-linker-plugin

 This patch changes loops without LOOPS_HAVE_PREHEADERS
 nor LOOPS_HAVE_SIMPLE_LATCHES from returning false to returning
 true.  I don't have a small testcase.  But this patch:

 diff --git a/gcc/tree-cfgcleanup.c b/gcc/tree-cfgcleanup.c
 index b5c384b..2ba673c 100644
 --- a/gcc/tree-cfgcleanup.c
 +++ b/gcc/tree-cfgcleanup.c
 @@ -323,6 +323,10 @@ tree_forwarder_block_p (basic_block bb, bool phi_wanted)
  if (loops_state_satisfies_p (LOOPS_HAVE_SIMPLE_LATCHES)
   bb-loop_father-header == dest)
return false;
 +
 +if (!loops_state_satisfies_p (LOOPS_HAVE_PREHEADERS)
 + !loops_state_satisfies_p (LOOPS_HAVE_SIMPLE_LATCHES))
 +  return false;
}
  }

 fixes the regression.  Does it make any senses?


 --
 H.J.



-- 
Best Regards.


Re: [PATCH GCC]Allow cfgcleanup to remove forwarder loop preheaders and latches

2014-02-27 Thread Bin.Cheng
Sorry, I didn't test it against logical_op_short_circuit target.  I
will look into this PR.

Thanks,
bin

On Fri, Feb 28, 2014 at 9:34 AM, Hans-Peter Nilsson h...@bitrange.com wrote:
 On Tue, 25 Feb 2014, bin.cheng wrote:

 Hi,
 This patch is to fix regression reported in PR60280 by removing forward loop
 headers/latches in cfg cleanup if possible.  Several tests are broken by
 this change since cfg cleanup is shared by all optimizers.  Some tests has
 already been fixed by recent patches, I went through and fixed the others.
 One case needs to be clarified is gcc.dg/tree-prof/update-loopch.c.  When
 GCC removing a basic block, it checks profile information by calling
 check_bb_profile after redirecting incoming edges of the bb.  This certainly
 results in warnings about invalid profile information and causes the case to
 fail.  I will send a patch to skip checking profile information for a
 removing basic block in stage 1 if it sounds reasonable.  For now I just
 twisted the case itself.

 Bootstrap and tested on x86_64 and arm_a15.

 Is it OK?


 2014-02-25  Bin Cheng  bin.ch...@arm.com

   PR target/60280
   * tree-cfgcleanup.c (tree_forwarder_block_p): Protect loop
   preheaders and latches only if requested.  Fix latch if it
   is removed.
   * tree-ssa-dom.c (tree_ssa_dominator_optimize): Set
   LOOPS_HAVE_PREHEADERS.

 gcc/testsuite/ChangeLog
 2014-02-25  Bin Cheng  bin.ch...@arm.com

   PR target/60280
   * gnat.dg/renaming5.adb: Change to two expected gotos.
   * gcc.dg/tree-ssa/pr21559.c: Change back to three expected
   jump threads.
   * gcc.dg/tree-prof/update-loopch.c: Check two Invalid sum
   messages for removed basic block.
   * gcc.dg/tree-ssa/ivopt_1.c: Fix unreliable scanning string.
   * gcc.dg/tree-ssa/ivopt_2.c: Ditto.
   * gcc.dg/tree-ssa/ivopt_3.c: Ditto.
   * gcc.dg/tree-ssa/ivopt_4.c: Ditto.

 Do you need to also update gcc.dg/tree-ssa/ssa-dom-thread-4.c,
 at least for logical_op_short_circuit targets?
 (There's a nice analysis comment there by Richard S which may
 also have to be updated.)

 This caused a regression for logical_op_short_circuit targets, I
 entered PR60363 for convenience.

 brgds, H-P



-- 
Best Regards.


RE: [patch] [arm] Fix PR60169 - thumb1 far jump

2014-02-27 Thread Joey Ye
Ping. OK for trunk and 4.8?

 -Original Message-
 From: Joey Ye [mailto:joey...@arm.com]
 Sent: 21 February 2014 19:32
 To: gcc-patches@gcc.gnu.org
 Subject: [patch] [arm] Fix PR60169 - thumb1 far jump
 
 Patch http://gcc.gnu.org/ml/gcc-patches/2012-12/msg01229.html introduced
 this ICE:
 
 1. thumb1 estimate if far_jump is used based on function insn size 2.
During
 reload, after stack layout finalized, it does reload_as_needed. It however
 increases insn size that changes estimation result of far_jump, which in
 return need to save lr and change stack layout again. While there is not
 chance to change, GCC crashes.
 
 Solution:
 Do not change estimation result of far_jump if reload_in_progress or
 reload_completed is true.
 
 Not likely need to fix lra according to Vlad:
 http://gcc.gnu.org/ml/gcc/2014-02/msg00355.html
 
 ChangeLog:
 * config/arm/arm.c (thumb_far_jump_used_p): Don't change
   if reload in progress or completed.
 
 * gcc.target/arm/thumb1-far-jump-3.c: New case.
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index b562986..2cf362c 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -26255,6 +26255,11 @@ thumb_far_jump_used_p (void)
return 0;
 }
 
+  /* We should not change far_jump_used during or after reload, as there is
+ no chance to change stack frame layout.  */
+  if (reload_in_progress || reload_completed)
+return 0;
+
   /* Check to see if the function contains a branch
  insn with the far jump attribute set.  */
   for (insn = get_insns (); insn; insn = NEXT_INSN (insn))
diff --git a/gcc/testsuite/gcc.target/arm/thumb1-far-jump-3.c 
b/gcc/testsuite/gcc.target/arm/thumb1-far-jump-3.c
new file mode 100644
index 000..90559ba
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/thumb1-far-jump-3.c
@@ -0,0 +1,108 @@
+/* Catch reload ICE on target thumb1 with far jump optimization.
+ * It is also a valid case for non-thumb1 target.  */
+
+/* Add -mno-lra option as it is only reproducable with reload.  It will
+   be removed after reload is completely removed.  */
+/* { dg-options -mno-lra -fomit-frame-pointer } */
+/* { dg-do compile } */
+
+#define C  2
+#define A  4
+#define RGB  (C | A)
+#define GRAY (A)
+
+typedef unsigned long uint_32;
+typedef unsigned char byte;
+typedef byte* bytep;
+
+typedef struct ss
+{
+   uint_32 w;
+   uint_32 r;
+   byte c;
+   byte b;
+   byte p;
+} info;
+
+typedef info * infop;
+
+void
+foo(infop info, bytep row)
+{
+   uint_32 iw = info-w;
+   if (info-c == RGB)
+   {
+  if (info-b == 8)
+  {
+ bytep sp = row + info-r;
+ bytep dp = sp;
+ byte save;
+ uint_32 i;
+
+ for (i = 0; i  iw; i++)
+ {
+save = *(--sp);
+*(--dp) = *(--sp);
+*(--dp) = *(--sp);
+*(--dp) = *(--sp);
+*(--dp) = save;
+ }
+  }
+
+  else
+  {
+ bytep sp = row + info-r;
+ bytep dp = sp;
+ byte save[2];
+ uint_32 i;
+
+ for (i = 0; i  iw; i++)
+ {
+save[0] = *(--sp);
+save[1] = *(--sp);
+*(--dp) = *(--sp);
+*(--dp) = *(--sp);
+*(--dp) = *(--sp);
+*(--dp) = *(--sp);
+*(--dp) = *(--sp);
+*(--dp) = *(--sp);
+*(--dp) = save[0];
+*(--dp) = save[1];
+ }
+  }
+   }
+   else if (info-c == GRAY)
+   {
+  if (info-b == 8)
+  {
+ bytep sp = row + info-r;
+ bytep dp = sp;
+ byte save;
+ uint_32 i;
+
+ for (i = 0; i  iw; i++)
+ {
+save = *(--sp);
+*(--dp) = *(--sp);
+*(--dp) = save;
+ }
+  }
+  else
+  {
+ bytep sp = row + info-r;
+ bytep dp = sp;
+ byte save[2];
+ uint_32 i;
+
+ for (i = 0; i  iw; i++)
+ {
+save[0] = *(--sp);
+save[1] = *(--sp);
+*(--dp) = *(--sp);
+*(--dp) = *(--sp);
+*(--dp) = save[0];
+*(--dp) = save[1];
+ }
+  }
+   }
+}


[PATCH] [libgcc,arm] Fix PR 60166 - NAN fraction bits

2014-02-27 Thread Joey Ye
This patch is a mirror copy from approved patch in glibc:
http://sourceware.org/ml/libc-alpha/2014-02/msg00741.html

OK to trunk, 4.8 and 4.7?

ChangeLog.libgcc:

* config/arm/sfp-machine.h (_FP_NANFRAC_H,
  _FP_NANFRAC_S, _FP_NANFRAC_D, _FP_NANFRAC_Q):
  Set to zero.

diff --git a/libgcc/config/arm/sfp-machine.h
b/libgcc/config/arm/sfp-machine.h
index bb34895..8d45320 100644
--- a/libgcc/config/arm/sfp-machine.h
+++ b/libgcc/config/arm/sfp-machine.h
@@ -19,10 +19,12 @@ typedef int __gcc_CMPtype __attribute__ ((mode
(__libgcc_cmp_return__)));
 #define _FP_DIV_MEAT_D(R,X,Y)  _FP_DIV_MEAT_2_udiv(D,R,X,Y)
 #define _FP_DIV_MEAT_Q(R,X,Y)  _FP_DIV_MEAT_4_udiv(Q,R,X,Y)
 
-#define _FP_NANFRAC_H  ((_FP_QNANBIT_H  1) - 1)
-#define _FP_NANFRAC_S  ((_FP_QNANBIT_S  1) - 1)
-#define _FP_NANFRAC_D  ((_FP_QNANBIT_D  1) - 1), -1
-#define _FP_NANFRAC_Q  ((_FP_QNANBIT_Q  1) - 1), -1, -1, -1
+/* According to RTABI, QNAN is only with the most significant bit of the
+   significand set, and all other significand bits zero.  */
+#define _FP_NANFRAC_H  0
+#define _FP_NANFRAC_S  0
+#define _FP_NANFRAC_D  0, 0
+#define _FP_NANFRAC_Q  0, 0, 0, 0
 #define _FP_NANSIGN_H  0
 #define _FP_NANSIGN_S  0
 #define _FP_NANSIGN_D  0





RE: [PATCH v4] PR middle-end/60281

2014-02-27 Thread Bernd Edlinger
Hi,

I see the problem too.

But I think it is not necessary to change the stack alignment
to solve the problem.

It appears to me that the code in asan_emit_stack_protection
is just wrong. It uses SImode when the memory is not aligned
enough for that mode. This would not happen if that code
is rewritten to use get_best_mode, and by the way, even on
x86_64 the emitted code is not optimal, because that target
could work with DImode more efficiently.

So, to fix that, it would be better to concentrate on that function,
and use word_mode instead of SImode, and let get_best_mode
choose the required mode.


Regards
Bernd Edlinger.   

Re: [PATCH v4] PR middle-end/60281

2014-02-27 Thread lin zuojian
Hi Bernd,
I agree you with the mode problem.

And I have not change the stack alignment.What I change is the virtual
register base's alignment.
Realignment must be make in !STRICT_ALIGNMENT machine,or emitting the
efficient code is impossible.
For example 4 set mem:QI X,REG:QI Y will not combine into one set mem:SI
X1,REG:SI Y1,if X is not mentioned as SI mode aligned.
To make sure X is SI mode algined,virtual register base must be realigned.

For this patch,I only intent to make it right.Making it best is next task.
--
Regards
lin zuojian.

于 2014年02月28日 15:47, Bernd Edlinger 写道:
 Hi,

 I see the problem too.

 But I think it is not necessary to change the stack alignment
 to solve the problem.

 It appears to me that the code in asan_emit_stack_protection
 is just wrong. It uses SImode when the memory is not aligned
 enough for that mode. This would not happen if that code
 is rewritten to use get_best_mode, and by the way, even on
 x86_64 the emitted code is not optimal, because that target
 could work with DImode more efficiently.

 So, to fix that, it would be better to concentrate on that function,
 and use word_mode instead of SImode, and let get_best_mode
 choose the required mode.


 Regards
 Bernd Edlinger.