Re: [Patch, Fortran, OOP] PR 63733: [4.8/4.9/5 Regression] wrong resolution for OPERATOR generics

2015-01-12 Thread Paul Richard Thomas
Dear Janus,

Since it is a regression, by all means update the branches. We
usually propose delaying a bit, but I am not convinced that this is
effective for this kind of bug fix; further problems usually take a
long time to emerge. Thus, I would recommend that you get on with it.

Thanks

Paul

On 11 January 2015 at 23:01, Janus Weil  wrote:
>> Well done for sorting that out. OK for trunk.
>
> Thanks, Paul. Committed as r219440.
>
> What about the branches?
>
> Cheers,
> Janus
>
>
>
>> On 11 January 2015 at 14:38, Janus Weil  wrote:
>>> Hi all,
>>>
>>> this patch fixes a wrong-code regression related to operators, by
>>> making sure that we look for typebound operators first, before looking
>>> for non-typebound ones. (Note: Each typebound operator is also added
>>> to the list of non-typebound ones, for reasons of diagnostics.)
>>>
>>> Regtested on x86_64-unknown-linux-gnu. Ok for trunk? 4.9/4.8?
>>>
>>> Cheers,
>>> Janus
>>>
>>>
>>>
>>> 2015-01-11  Janus Weil  
>>>
>>> PR fortran/63733
>>> * interface.c (gfc_extend_expr): Look for type-bound operators before
>>> non-typebound ones.
>>>
>>> 2015-01-11  Janus Weil  
>>>
>>> PR fortran/63733
>>> * gfortran.dg/typebound_operator_20.f90: New.
>>
>>
>>
>> --
>> Outside of a dog, a book is a man's best friend. Inside of a dog it's
>> too dark to read.
>>
>> Groucho Marx



-- 
Outside of a dog, a book is a man's best friend. Inside of a dog it's
too dark to read.

Groucho Marx
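
As a rough illustration of the lookup order described in the quoted
patch submission (type-bound operators first, non-type-bound ones only
as a fallback), here is a small C++ model.  It is a sketch under stated
assumptions: Candidate, find_match and resolve_operator are hypothetical
names and do not correspond to the data structures used by
gfc_extend_expr in gfortran.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of one operator candidate; not gfortran's representation.
struct Candidate {
  std::string name;     // procedure implementing the operator
  std::string operand;  // operand type the procedure accepts
};

// Return the first candidate in LIST that accepts OPERAND, or nullptr.
static const Candidate *
find_match (const std::vector<Candidate> &list, const std::string &operand)
{
  for (const Candidate &c : list)
    if (c.operand == operand)
      return &c;
  return nullptr;
}

// Search the type-bound list first and fall back to the non-type-bound
// list only when nothing matched there.
static const Candidate *
resolve_operator (const std::vector<Candidate> &typebound,
                  const std::vector<Candidate> &non_typebound,
                  const std::string &operand)
{
  if (const Candidate *hit = find_match (typebound, operand))
    return hit;
  return find_match (non_typebound, operand);
}
```

With this ordering, a type that has both kinds of candidate always gets
the type-bound one, regardless of what the non-type-bound list contains.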


Re: [PATCH] fix visium build

2015-01-12 Thread Richard Biener
On Fri, 9 Jan 2015, Prathamesh Kulkarni wrote:

> Hi,
> The tree.h and tree-core.h flattening patch:
> (https://gcc.gnu.org/ml/gcc-patches/2015-01/msg00467.html)
> broke visium build. The attached patch fixes that.
> Built on visium-elf.
> OK to commit ?

Ok.

Thanks,
Richard.


Re: [PATCH] ipa-icf.c: Fix issues generated by original latest commit

2015-01-12 Thread Richard Biener
On Sat, Jan 10, 2015 at 10:03 AM, Chen Gang S  wrote:
> The related commit is "275e275 IPA ICF: target and optimization flags
> comparison.". For sem_function::equals_private(), fix the typo, and
> for target_opts_for_fn(), fix the NULL access.
>
> When cross compiling for h8300, it causes the issue below:
>
>   [root@localhost h8300]# cat fp-bit.i
>   __inline__ static int a (int x)
>   {
> return __builtin_expect (x == 0, 0);
>   }
>
>   __inline__ static int b (int x)
>   {
> return __builtin_expect (x == 1, 0);
>   }
>
>   __attribute__ ((__always_inline__)) int c (int x, int y)
>   {
> if (a (x))
>   return x;
> if (b (x))
>   return x;
> return y;
>   }
>   [root@localhost h8300]# /upstream/build-gcc-h8300/gcc/cc1 -O2 fp-bit.i -o test.s
>a b c
>   Analyzing compilation unit
>
>   fp-bit.i:11:41: warning: always_inline function might not be inlinable [-Wattributes]
>__attribute__ ((__always_inline__)) int c (int x, int y)
>  ^
>   Performing interprocedural optimizations
><*free_lang_data>
> 
>   fp-bit.i:18:1: internal compiler error: Segmentation fault
>}
>^
>   0xa11f0e crash_signal
> ../../gcc/gcc/toplev.c:372
>   0xda33e7 tree_check
> ../../gcc/gcc/tree.h:2769
>   0xda33e7 target_opts_for_fn
> ../../gcc/gcc/tree.h:4643
>   0xda33e7 ipa_icf::sem_function::equals_private(ipa_icf::sem_item*, hash_map&)
> ../../gcc/gcc/ipa-icf.c:438
>   0xda4023 ipa_icf::sem_function::equals(ipa_icf::sem_item*, hash_map&)
> ../../gcc/gcc/ipa-icf.c:393
>   0xda6472 ipa_icf::sem_item_optimizer::subdivide_classes_by_equality(bool)
> ../../gcc/gcc/ipa-icf.c:1900
>   0xdaad3c ipa_icf::sem_item_optimizer::execute()
> ../../gcc/gcc/ipa-icf.c:1719
>   0xdab961 ipa_icf_driver
> ../../gcc/gcc/ipa-icf.c:2448
>   0xdab961 ipa_icf::pass_ipa_icf::execute(function*)
> ../../gcc/gcc/ipa-icf.c:2496
>   Please submit a full bug report,
>   with preprocessed source if appropriate.
>   Please include the complete backtrace with any bug report.
>   See  for instructions.
>
> This issue shows up when cross compiling gcc with "make all-target-libgcc"
> for h8300. With this fix applied, the cross build continues until it
> hits the next build issue for h8300.

Ok.

Thanks,
Richard.

> 2015-01-10  Chen Gang  
>
> * ipa-icf.c (sem_function::equals_private): Use '&&' instead of
> '||' to fix typo issue.
>
> * gcc/tree.h (target_opts_for_fn): Check NULL_TREE since it can
> accept and return NULL.
> ---
>  gcc/ipa-icf.c | 2 +-
>  gcc/tree.h| 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf.c
> index 1b76a1d..4ccaf8c 100644
> --- a/gcc/ipa-icf.c
> +++ b/gcc/ipa-icf.c
> @@ -438,7 +438,7 @@ sem_function::equals_private (sem_item *item,
>cl_target_option *tar1 = target_opts_for_fn (decl);
>cl_target_option *tar2 = target_opts_for_fn (m_compared_func->decl);
>
> -  if (tar1 != NULL || tar2 != NULL)
> +  if (tar1 != NULL && tar2 != NULL)
>  {
>if (!cl_target_option_eq (tar1, tar2))
> {
> diff --git a/gcc/tree.h b/gcc/tree.h
> index fc8c8fe..ac27268 100644
> --- a/gcc/tree.h
> +++ b/gcc/tree.h
> @@ -4640,7 +4640,7 @@ target_opts_for_fn (const_tree fndecl)
>tree fn_opts = DECL_FUNCTION_SPECIFIC_TARGET (fndecl);
>if (fn_opts == NULL_TREE)
>  fn_opts = target_option_default_node;
> -  return TREE_TARGET_OPTION (fn_opts);
> +  return fn_opts == NULL_TREE ? NULL : TREE_TARGET_OPTION (fn_opts);
>  }
>
>  /* opt flag for function FNDECL, e.g. opts_for_fn (fndecl, optimize) is
> --
> 1.9.3
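
The '||' to '&&' change in the quoted patch guards a comparison that
dereferences both pointers.  The following C++ sketch shows the pattern
in isolation.  TargetOptions, options_equal and target_options_differ
are hypothetical stand-ins (not cl_target_option or cl_target_option_eq),
and treating "one side present, one side absent" as different is an
assumption of this sketch, not something stated in the patch.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a per-function target option set.
struct TargetOptions {
  int arch_flags;
};

// Dereferences both pointers, so neither may be NULL here.
static bool
options_equal (const TargetOptions *a, const TargetOptions *b)
{
  return a->arch_flags == b->arch_flags;
}

// Return true when the two option sets are known to differ.
// Writing '||' in the first test would let a single NULL pointer
// reach options_equal and crash there.
static bool
target_options_differ (const TargetOptions *tar1, const TargetOptions *tar2)
{
  if (tar1 != NULL && tar2 != NULL)
    return !options_equal (tar1, tar2);
  // At least one side is NULL: equal only if both are (sketch policy).
  return tar1 != tar2;
}
```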


Re: [testsuite] PATCH: Correct target selector in gcc.target/i386/nop-mcount.c

2015-01-12 Thread Uros Bizjak
On Mon, Jan 12, 2015 at 1:31 AM, H.J. Lu  wrote:
> nonpic in target selector in gcc.target/i386/nop-mcount.c is ignored
> since {} is misplaced.  This patch properly places {} in target selector.
> Tested on Linux/x86.  OK for trunk?
>
> Thanks.
>
> H.J.
> ---
>  gcc/testsuite/gcc.target/i386/nop-mcount.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> 2015-01-11  H.J. Lu  
>
> * gcc.target/i386/nop-mcount.c: Properly place {} in target
> selector.

This counts as obvious, so OK.

Thanks,
Uros.

> diff --git a/gcc/testsuite/gcc.target/i386/nop-mcount.c b/gcc/testsuite/gcc.target/i386/nop-mcount.c
> index 561792f..139fbb0 100644
> --- a/gcc/testsuite/gcc.target/i386/nop-mcount.c
> +++ b/gcc/testsuite/gcc.target/i386/nop-mcount.c
> @@ -1,5 +1,5 @@
>  /* Test -mnop-mcount */
> -/* { dg-do compile { target { *-*-linux* } && { nonpic } } } */
> +/* { dg-do compile { target { { *-*-linux* } && nonpic } } } */
>  /* { dg-options "-pg -mfentry -mrecord-mcount -mnop-mcount" } */
>  /* { dg-final { scan-assembler-not "__fentry__" } } */
>  /* Origin: Andi Kleen */
> --
> 1.9.3
>


Re: [match-and-simplify] Remove printing "for expression"

2015-01-12 Thread Richard Biener
On Sat, 10 Jan 2015, Prathamesh Kulkarni wrote:

> On 8 January 2015 at 17:52, Richard Biener  wrote:
> > On Sun, 21 Dec 2014, Prathamesh Kulkarni wrote:
> >
> >> Hi,
> >> I removed printing "for expression:" from print_matches. I think it
> >> is out of place here, and we call print_matches after lowering.
> >> OK to commit ?
> >
> > Hum, it's now a very simple wrapper around print_operand - why
> > not replace the two callers with its content?
> Indeed. Done the changes in the attached patch.

Doesn't that miss the '\n' putc?

Thanks,
Richard.

> OK to commit to match-and-simplify branch ?
> 
> Thanks,
> Prathamesh
> >
> > Thanks,
> > Richard.
> >
> >> Thanks,
> >> Prathamesh
> >>
> >
> > --
> > Richard Biener 
> > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild,
> > Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild,
Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)


Re: PATCH: PR bootstrap/64561: [5 Regression] HAVE_LD_PIE_COPYRELOC is defined to 1 for broken linker

2015-01-12 Thread Uros Bizjak
On Mon, Jan 12, 2015 at 2:48 AM, H.J. Lu  wrote:
> Hi,
>
> This patch updates Linux/x86-64 linker test for PIE with copy reloc.
> Tested with broken and working linkers on Linux/x86-64.  OK to install?
>
> Thanks.
>
>
> H.J.
> ---
> 2015-01-12  H.J. Lu  
>
> PR bootstrap/64561
> * configure.ac (HAVE_LD_PIE_COPYRELOC): Update Linux/x86-64 linker
> test for PIE with copy reloc.
> * configure: Regenerated.

OK.

Thanks,
Uros.

> diff --git a/gcc/configure b/gcc/configure
> index 8670f73..1bf4358 100755
> --- a/gcc/configure
> +++ b/gcc/configure
> @@ -27052,6 +27052,11 @@ EOF
>  main:
> movl    %eax, a_glob(%rip)
> .size   main, .-main
> +   .globl  ptr
> +   .section        .data.rel,"aw",@progbits
> +   .type   ptr, @object
> +ptr:
> +   .quad   a_glob
>  EOF
>if $gcc_cv_as --64 -o conftest1.o conftest1.s > /dev/null 2>&1 \
>   && $gcc_cv_ld -shared -melf_x86_64 -o conftest1.so conftest1.o > /dev/null 2>&1 \
> diff --git a/gcc/configure.ac b/gcc/configure.ac
> index d010141..102dab9 100644
> --- a/gcc/configure.ac
> +++ b/gcc/configure.ac
> @@ -4719,6 +4719,11 @@ EOF
>  main:
> movl    %eax, a_glob(%rip)
> .size   main, .-main
> +   .globl  ptr
> +   .section        .data.rel,"aw",@progbits
> +   .type   ptr, @object
> +ptr:
> +   .quad   a_glob
>  EOF
>if $gcc_cv_as --64 -o conftest1.o conftest1.s > /dev/null 2>&1 \
>   && $gcc_cv_ld -shared -melf_x86_64 -o conftest1.so conftest1.o > /dev/null 2>&1 \


Re: [PATCH, testsuite] fix ggcplug.c test-case

2015-01-12 Thread Richard Biener
On Sun, 11 Jan 2015, Prathamesh Kulkarni wrote:

> Hi,
> The test-case plugin/ggcplug.c was failing due to flattening of tree.h
> and tree-core.h.
> Test-case was incorrect because it included gcc-plugin.h after tree.h whereas
> gcc-plugin.h should be the first header to be included by plugins.

No, it should be definitely included _after_ config.h, system.h
and coretypes.h.

Ok with moving it after coretypes.h.

Thanks,
Richard.


Re: [PATCH] Enable experimental TSAN support for Ada

2015-01-12 Thread Richard Biener
On Sun, Jan 11, 2015 at 1:39 PM, Bernd Edlinger
 wrote:
> Hi Richard,
>
> On Fri, 9 Jan 2015 17:19:57, Richard Biener wrote:
>>
>>
>> Yes. As said, you generally need to run folding results through 
>> force_gimple_operand.
>>
>> Richard.
>>
>
>
> I have now used force_gimple_operand instead of special casing the
> VIEW_CONVERT_EXPRs.  And I see that all Ada test cases still work with
> -fsanitize=thread.  So this feels like an improvement.
>
> I have checked with a large C++ application to see whether the generated
> code changes.  Although this looked like it should not change the
> resulting code, I found one small difference at -O3 -fsanitize=thread
> while compiling the function xmlSchemaCompareValuesInt in
> xmlschemastypes.c of libxml2.  The generated code size did not change;
> only two blocks of code changed place.  That was the only difference in
> about 16 MB of code.
>
> The reason for this seems to be the following changes in
> xmlschemastypes.c.104t.tsan1:
>
>:
>p1_179 = xmlSchemaDateNormalize (x_7(D), 0.0);
># DEBUG p1 => p1_179
>_180 = _xmlSchemaDateCastYMToDays (p1_179);
> -  _660 = &p1_179->value.date;
> -  _659 = &MEM[(struct xmlSchemaValDate *)_660 + 8B];
> -  __builtin___tsan_read2 (_659);
> +  _660 = &MEM[(struct xmlSchemaValDate *)p1_179 + 24B];
> +  __builtin___tsan_read2 (_660);
>_181 = p1_179->value.date.day;
>_182 = (long int) _181;
>p1d_183 = _180 + _182;
>
> This pattern is repeated everywhere ("-" = before the patch, "+" = with
> the patch).
>
> So it looks as if the generated code quality slightly improves with this
> change.
>
> I have also tried to fold &base + offset + bitpos,  like this:
>
> --- tsan.c.orig  2015-01-10 00:39:06.465210937 +0100
> +++ tsan.c  2015-01-11 09:28:38.109423856 +0100
> @@ -213,7 +213,18 @@ instrument_expr (gimple_stmt_iterator gs
>align = get_object_alignment (expr);
>if (align < BITS_PER_UNIT)
>  return false;
> -  expr_ptr = build_fold_addr_expr (unshare_expr (expr));
> +  expr_ptr = build_fold_addr_expr (unshare_expr (base));
> +  if (bitpos != 0)
> +{
> +  if (offset != NULL)
> +offset = size_binop (PLUS_EXPR, offset,
> + build_int_cst (sizetype,
> +bitpos / BITS_PER_UNIT));
> +  else
> +offset = build_int_cst (sizetype, bitpos / BITS_PER_UNIT);
> +}
> +  if (offset != NULL)
> +expr_ptr = fold_build_pointer_plus (expr_ptr, offset);
>  }
>expr_ptr = force_gimple_operand (expr_ptr, &seq, true, NULL_TREE);
>if ((size & (size - 1)) != 0 || size> 16
>
>
> For simplicity, I did this first only in the simple case without
> DECL_BIT_FIELD_REPRESENTATIVE.  I tried this change with the same large
> C++ application and saw that the code still works, but the binary size
> increases at -O3 by about 1%.
>
> So my conclusion would be that it is better to use force_gimple_operand
> directly on build_fold_addr_expr (unshare_expr (expr)), without using
> offset.

Yeah, it probably needs more investigation.

> Well, I think this still resolves your objections.
>
> Furthermore I used may_be_nonaddressable_p instead of is_gimple_addressable
> and just return if it is found to be not true.  (That did not happen in my
> tests.)
>
> And I reworked the block with the pt_solution_includes.
>
> I found that it can be rewritten, because pt_solution_includes can be
> expanded to
>   (is_global_var (decl)
>    || pt_solution_includes_1 (&cfun->gimple_df->escaped, decl)
>    || pt_solution_includes_1 (&ipa_escaped_pt, decl))
>
> So, by De Morgan's law, you can rewrite that block to
>
>   if (DECL_P (base))
> {
>   if (!is_global_var (base)
>   && !pt_solution_includes_1 (&cfun->gimple_df->escaped, base)
>   && !pt_solution_includes_1 (&ipa_escaped_pt, base))
> return false;
>   if (!is_global_var (base) && !may_be_aliased (base))
> return false;
> }
>
> Therefore I can move the common term !is_global_var (base) out of the
> block.  That's what I did.
> As far as I can tell, none of the other terms here seem to be redundant.
>
>
> Attached patch was boot-strapped and regression-tested on x86_64-linux-gnu.
> OK for trunk?

Ok.  Thanks for these improvements!

Richard.

>
> Thanks
> Bernd.
>
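
The De Morgan rewrite in Bernd's mail can be checked exhaustively.  The
C++ sketch below replaces each predicate with a plain bool and compares
the two-check form quoted in the mail against one possible factored
form with !is_global_var moved out.  The function names are
hypothetical, "true" here means "skip instrumentation" (the mail's
"return false"), and the factored form is only an illustration; the
committed code may be structured differently.

```cpp
#include <cassert>

// Each bool stands for one predicate from the mail:
//   g = is_global_var (base)
//   e = pt_solution_includes_1 (&cfun->gimple_df->escaped, base)
//   i = pt_solution_includes_1 (&ipa_escaped_pt, base)
//   a = may_be_aliased (base)

// Form quoted in the mail: two separate early exits.
static bool
skip_two_checks (bool g, bool e, bool i, bool a)
{
  if (!g && !e && !i)
    return true;
  if (!g && !a)
    return true;
  return false;
}

// Same condition with the common term !g tested once.
static bool
skip_factored (bool g, bool e, bool i, bool a)
{
  if (g)
    return false;
  return (!e && !i) || !a;
}

// Compare both forms over all 16 combinations of inputs.
static bool
forms_agree ()
{
  for (int bits = 0; bits < 16; ++bits)
    {
      bool g = (bits & 1) != 0, e = (bits & 2) != 0;
      bool i = (bits & 4) != 0, a = (bits & 8) != 0;
      if (skip_two_checks (g, e, i, a) != skip_factored (g, e, i, a))
        return false;
    }
  return true;
}
```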


Re: [PATCH] ipa-icf.c: Fix issues generated by original latest commit

2015-01-12 Thread Martin Liška

On 01/12/2015 09:51 AM, Richard Biener wrote:

On Sat, Jan 10, 2015 at 10:03 AM, Chen Gang S  wrote:



Ok.

Thanks,
Richard.


Hello.

I've just installed Chen's patch.

Thanks,
Martin








Re: [PATCH, testsuite] fix ggcplug.c test-case

2015-01-12 Thread Prathamesh Kulkarni
On 12 January 2015 at 14:19, Richard Biener  wrote:
> On Sun, 11 Jan 2015, Prathamesh Kulkarni wrote:
>
>> Hi,
>> The test-case plugin/ggcplug.c was failing due to flattening of tree.h
>> and tree-core.h.
>> Test-case was incorrect because it included gcc-plugin.h after tree.h whereas
>> gcc-plugin.h should be the first header to be included by plugins.
>
> No, it should be definitely included _after_ config.h, system.h
> and coretypes.h.
gcc-plugin.h already includes these files. Shall I remove config.h,
system.h and coretypes.h
from ggcplug.c instead ?
>
> Ok with moving it after coretypes.h.
>
> Thanks,
> Richard.


PR ipa/63470 (zero sized call_stmt)

2015-01-12 Thread Jan Hubicka
Hi,
this ICE is caused by double updating in ipa-prop, which reduces the call stmt
size once when the call becomes speculative and again when it is turned into a
direct call.  Fixed by the following patch, which makes the update happen
during duplication, so ipa-prop only needs to care about the case where it
turns a real indirect call into a direct one.

Bootstrapped/regtested x86_64-linux, will commit it shortly.

Honza

PR ipa/63470
* ipa-inline-analysis.c (inline_edge_duplication_hook): Adjust
cost when edge becomes direct.
* ipa-prop.c (make_edge_direct): Do not adjust when speculation
is resolved or when introducing new speculation.

* testsuite/g++.dg/ipa/pr63470.C: New testcase.

Index: ipa-inline-analysis.c
===
--- ipa-inline-analysis.c   (revision 219430)
+++ ipa-inline-analysis.c   (working copy)
@@ -1312,6 +1312,13 @@
   info->predicate = NULL;
   edge_set_predicate (dst, srcinfo->predicate);
   info->param = srcinfo->param.copy ();
+  if (!dst->indirect_unknown_callee && src->indirect_unknown_callee)
+{
+  info->call_stmt_size -= (eni_size_weights.indirect_call_cost
+  - eni_size_weights.call_cost);
+  info->call_stmt_time -= (eni_time_weights.indirect_call_cost
+  - eni_time_weights.call_cost);
+}
 }
 
 
Index: ipa-prop.c
===
--- ipa-prop.c  (revision 219430)
+++ ipa-prop.c  (working copy)
@@ -2737,7 +2737,20 @@
   ie->caller->name (), callee->name ());
 }
   if (!speculative)
-ie = ie->make_direct (callee);
+{
+  struct cgraph_edge *orig = ie;
+  ie = ie->make_direct (callee);
+  /* If we resolved speculative edge the cost is already up to date
+for direct call (adjusted by inline_edge_duplication_hook).  */
+  if (ie == orig)
+   {
+ es = inline_edge_summary (ie);
+ es->call_stmt_size -= (eni_size_weights.indirect_call_cost
+- eni_size_weights.call_cost);
+ es->call_stmt_time -= (eni_time_weights.indirect_call_cost
+- eni_time_weights.call_cost);
+   }
+}
   else
 {
   if (!callee->can_be_discarded_p ())
@@ -2747,14 +2760,10 @@
  if (alias)
callee = alias;
}
+  /* make_speculative will update ie's cost to direct call cost. */
   ie = ie->make_speculative
 (callee, ie->count * 8 / 10, ie->frequency * 8 / 10);
 }
-  es = inline_edge_summary (ie);
-  es->call_stmt_size -= (eni_size_weights.indirect_call_cost
-- eni_size_weights.call_cost);
-  es->call_stmt_time -= (eni_time_weights.indirect_call_cost
-- eni_time_weights.call_cost);
 
   return ie;
 }
Index: testsuite/g++.dg/ipa/pr63470.C
===
--- testsuite/g++.dg/ipa/pr63470.C  (revision 0)
+++ testsuite/g++.dg/ipa/pr63470.C  (revision 0)
@@ -0,0 +1,54 @@
+/* PR ipa/63470.C */
+/* { dg-do compile } */
+/* { dg-options "-O2 -finline-functions" } */
+
+class A
+{
+public:
+  virtual bool m_fn1 ();
+  virtual const char **m_fn2 (int);
+  virtual int m_fn3 ();
+};
+class FTjackSupport : A
+{
+  ~FTjackSupport ();
+  bool m_fn1 ();
+  bool m_fn4 ();
+  const char **
+  m_fn2 (int)
+  {
+  }
+  int _inited;
+  int *_jackClient;
+  int _activePathCount;
+}
+
+* a;
+void fn1 (...);
+void fn2 (void *);
+int fn3 (int *);
+FTjackSupport::~FTjackSupport () { m_fn4 (); }
+
+bool
+FTjackSupport::m_fn1 ()
+{
+  if (!_jackClient)
+return 0;
+  for (int i=0; _activePathCount; ++i)
+if (m_fn2 (i))
+  fn2 (a);
+  if (m_fn3 ())
+fn2 (a);
+  if (fn3 (_jackClient))
+fn1 (0);
+}
+
+bool
+FTjackSupport::m_fn4 ()
+{
+  if (_inited && _jackClient)
+{
+  m_fn1 ();
+  return 0;
+}
+}
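
The double update described at the top of this mail can be modelled in
a few lines of C++.  This is a sketch under stated assumptions: the
cost constants are illustrative values, and the "discounted" flag is a
simplification that is not in the patch (the patch instead checks
whether make_direct returned the original edge).

```cpp
#include <cassert>

// Illustrative weights; assumptions of this sketch, not read from GCC.
static const int kIndirectCallCost = 3;
static const int kCallCost = 1;

// Unguarded adjustment: every call subtracts the indirect-vs-direct delta.
static int
discount_unguarded (int call_stmt_size)
{
  return call_stmt_size - (kIndirectCallCost - kCallCost);
}

// Hypothetical edge summary that remembers whether the delta was applied.
struct EdgeSummary {
  int call_stmt_size;
  bool discounted;
};

// Guarded adjustment: the delta is applied at most once per edge.
static void
discount_once (EdgeSummary &es)
{
  if (es.discounted)
    return;
  es.call_stmt_size = discount_unguarded (es.call_stmt_size);
  es.discounted = true;
}
```

Applying the unguarded adjustment twice to an indirect call drives the
size below the direct-call cost, which is the kind of inconsistency the
PR title refers to.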


Re: [PATCH] Fix undefined label problem after crossjumping (PR rtl-optimization/64536)

2015-01-12 Thread Richard Biener
On Fri, 9 Jan 2015, Jakub Jelinek wrote:

> On Fri, Jan 09, 2015 at 03:10:16PM +0100, Richard Biener wrote:
> > Well, you have until the end of next week ;)  For GIMPLE this is
> > a switch with all cases going to the same basic-block, right?
> > I think we optimize that in cleanup_control_expr_graph via the
> > single_succ_p case?
> 
> No, it is a switch with cases that all look like:
>   _1 = a; // load
>   _2 = _1 + 1;
>   a = _2; // store
> So, either if tree-ssa-tail-merge could be taught about loads/stores,
> or some other pass would be able to hoist the loads before the switch and
> sink the store after the switch, because every switch case does that.

Ah, ok.  Indeed code-hoisting on GIMPLE wasn't finished (there is a
very old PR with patches still), and sinking has the same issue
in that it only exploits partial dead code elimination opportunities.

I think that tail-merging already handles some of these cases, just
maybe not the one with more than two PHI args or switches.

Richard.
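
For readers unfamiliar with the transformation being discussed, the
shape Jakub describes (every switch case loads a, adds one and stores
it back) looks roughly like the first function below, and the result of
hoisting the load and sinking the store looks like the second.  This is
a hand-written C++ sketch; the variable and function names are
hypothetical and it is not the testcase from the PR.

```cpp
#include <cassert>

static int a;  // hypothetical global that every case updates

// Every case performs the same load, add and store.
static void
bump_in_each_case (int sel)
{
  switch (sel)
    {
    case 0:
      a = a + 1;
      break;
    case 1:
      a = a + 1;
      break;
    default:
      a = a + 1;
      break;
    }
}

// Equivalent code once the common load/add/store is merged:
// the switch no longer has any work left to select between.
static void
bump_merged (int /* sel */)
{
  a = a + 1;
}
```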


Re: [PATCH, testsuite] fix ggcplug.c test-case

2015-01-12 Thread Richard Biener
On Mon, 12 Jan 2015, Prathamesh Kulkarni wrote:

> On 12 January 2015 at 14:19, Richard Biener  wrote:
> > On Sun, 11 Jan 2015, Prathamesh Kulkarni wrote:
> >
> >> Hi,
> >> The test-case plugin/ggcplug.c was failing due to flattening of tree.h
> >> and tree-core.h.
> >> Test-case was incorrect because it included gcc-plugin.h after tree.h 
> >> whereas
> >> gcc-plugin.h should be the first header to be included by plugins.
> >
> > No, it should be definitely included _after_ config.h, system.h
> > and coretypes.h.
> gcc-plugin.h already includes these files. Shall I remove config.h,
> system.h and coretypes.h
> from ggcplug.c instead ?

No, keep the patch simple for now - we are inconsistent in all the
testsuite plugins it seems, and wasn't the idea that plugins _only_
need to include gcc-plugin.h now?  Thus I'd rather clean up all
plugin testcases at once, with a separate patch.

Thanks,
Richard.

> >
> > Ok with moving it after coretypes.h.
> >
> > Thanks,
> > Richard.
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild,
Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)


Re: [PATCH] Fix enum operands exchange in ipa-inline.c

2015-01-12 Thread Richard Biener
On Mon, Jan 12, 2015 at 7:58 AM, Hurugalawadi, Naveen
 wrote:
> Hi,
>
> Sorry, I had forgotten the ChangeLog.

Ok, but please properly wrap the long lines, put '? gimple_...' on a new
one.

Thanks,
Richard.

> ChangeLog
> 2015-01-12  Naveen H.S  
>
> * ipa-inline.c (inline_small_functions): Swap the operands in
> enum.
>
> Thanks,
> Naveen


[PATCH] IPA ICF: handle correctly indirect_calls

2015-01-12 Thread Martin Liška

Hello.

The following patch is needed to pass LTO compilation for chromium. IPA ICF
verifies polymorphic types for functions that have any function call.
I forgot to handle indirect_calls.

The patch bootstraps on x86_64-linux-pc and no new regression is seen.
Ready for trunk?

Thanks,
Martin
From d0f7fc76d5dac5f4c3c57a2e632082485debbd8a Mon Sep 17 00:00:00 2001
From: mliska 
Date: Thu, 8 Jan 2015 13:49:45 +0100
Subject: [PATCH] IPA ICF: handle correctly indirect_calls.

gcc/ChangeLog:

2015-01-08  Martin Liska  

	* ipa-icf.c (sem_function::equals_wpa): Add indirect_calls as indication
	that a function is not leaf.
	(sem_function::compare_polymorphic_p): Likewise.
---
 gcc/ipa-icf.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf.c
index 1b76a1d..ed6d019 100644
--- a/gcc/ipa-icf.c
+++ b/gcc/ipa-icf.c
@@ -340,7 +340,8 @@ sem_function::equals_wpa (sem_item *item,
 	return return_false_with_msg ("NULL argument type");
 
   /* Polymorphic comparison is executed just for non-leaf functions.  */
-  bool is_not_leaf = get_node ()->callees != NULL;
+  bool is_not_leaf = get_node ()->callees != NULL
+			 || get_node ()->indirect_calls != NULL;
 
   if (!func_checker::compatible_types_p (arg_types[i],
 	 m_compared_func->arg_types[i],
@@ -884,7 +885,9 @@ bool
 sem_function::compare_polymorphic_p (void)
 {
   return get_node ()->callees != NULL
-	 || m_compared_func->get_node ()->callees != NULL;
+	 || get_node ()->indirect_calls != NULL
+	 || m_compared_func->get_node ()->callees != NULL
+	 || m_compared_func->get_node ()->indirect_calls != NULL;
 }
 
 /* For a given call graph NODE, the function constructs new
-- 
2.1.2
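
The leaf test changed by the patch can be summarised with a small C++
model.  Edge and Node below are hypothetical stand-ins for cgraph_edge
and cgraph_node, reduced to the two list heads the patch looks at; this
is a sketch of the predicate, not the ipa-icf.c implementation.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a call graph edge.
struct Edge {
  Edge *next;
};

// Hypothetical stand-in for a call graph node.
struct Node {
  Edge *callees;         // direct calls made by the function
  Edge *indirect_calls;  // indirect calls made by the function
};

// A function is a non-leaf if it makes any call, direct or indirect.
static bool
is_not_leaf (const Node &n)
{
  return n.callees != NULL || n.indirect_calls != NULL;
}

// Polymorphic comparison is needed if either function is a non-leaf.
static bool
compare_polymorphic_p (const Node &self, const Node &other)
{
  return is_not_leaf (self) || is_not_leaf (other);
}
```

A function whose only calls are indirect has an empty callees list, so
testing callees alone would classify it as a leaf.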



Simplify badness metrics in inliner, take 2

2015-01-12 Thread Jan Hubicka
Hi,
this is a variant of the earlier patch I committed. It solves issues with
-fprofile-use and various roundoff errors that triggered sanity checks
(partly by disabling them).

Bootstrapped/regtested x86_64-linux.
Honza

PR ipa/63967
PR ipa/64425
* ipa-inline.c (compute_uninlined_call_time,
compute_inlined_call_time): Use counts for extra precision when
needed possible.
(big_speedup_p): Fix formatting.
(RELATIVE_TIME_BENEFIT_RANGE): Remove.
(relative_time_benefit): Remove.
(edge_badness): Turn DECL_DISREGARD_INLINE_LIMITS into hint;
merge guessed and read profile paths.
(inline_small_functions): Count only !optimize_size functions into
initial size; be more lax about sanity check when profile is used;
be sure to update inlined function profile when profile is read.

Index: ipa-inline.c
===
--- ipa-inline.c(revision 219430)
+++ ipa-inline.c(working copy)
@@ -530,12 +530,19 @@ inline sreal
 compute_uninlined_call_time (struct inline_summary *callee_info,
 struct cgraph_edge *edge)
 {
-  sreal uninlined_call_time = (sreal)callee_info->time
- * MAX (edge->frequency, 1)
- * cgraph_freq_base_rec;
-  int caller_time = inline_summaries->get (edge->caller->global.inlined_to
-  ? edge->caller->global.inlined_to
-  : edge->caller)->time;
+  sreal uninlined_call_time = (sreal)callee_info->time;
+  cgraph_node *caller = (edge->caller->global.inlined_to 
+? edge->caller->global.inlined_to
+: edge->caller);
+
+  if (edge->count && caller->count)
+uninlined_call_time *= (sreal)edge->count / caller->count;
+  if (edge->frequency)
+uninlined_call_time *= cgraph_freq_base_rec * edge->frequency;
+  else
+uninlined_call_time = uninlined_call_time >> 11;
+
+  int caller_time = inline_summaries->get (caller)->time;
   return uninlined_call_time + caller_time;
 }
 
@@ -546,13 +553,28 @@ inline sreal
 compute_inlined_call_time (struct cgraph_edge *edge,
   int edge_time)
 {
-  int caller_time = inline_summaries->get (edge->caller->global.inlined_to
-  ? edge->caller->global.inlined_to
-  : edge->caller)->time;
-  sreal time = (sreal)caller_time
-  + ((sreal) (edge_time - inline_edge_summary (edge)->call_stmt_time)
- * MAX (edge->frequency, 1)
- * cgraph_freq_base_rec);
+  cgraph_node *caller = (edge->caller->global.inlined_to 
+? edge->caller->global.inlined_to
+: edge->caller);
+  int caller_time = inline_summaries->get (caller)->time;
+  sreal time = edge_time;
+
+  if (edge->count && caller->count)
+time *= (sreal)edge->count / caller->count;
+  if (edge->frequency)
+time *= cgraph_freq_base_rec * edge->frequency;
+  else
+time = time >> 11;
+
+  /* This calculation should match one in ipa-inline-analysis.
+ FIXME: Once ipa-inline-analysis is converted to sreal this can be
+ simplified.  */
+  time -= (sreal) ((gcov_type) edge->frequency
+  * inline_edge_summary (edge)->call_stmt_time
+  * (INLINE_TIME_SCALE / CGRAPH_FREQ_BASE)) / INLINE_TIME_SCALE;
+  time += caller_time;
+  if (time <= 0)
+time = ((sreal) 1) >> 8;
   gcc_checking_assert (time >= 0);
   return time;
 }
@@ -563,8 +585,10 @@ compute_inlined_call_time (struct cgraph
 static bool
 big_speedup_p (struct cgraph_edge *e)
 {
-  sreal time = compute_uninlined_call_time (inline_summaries->get (e->callee), e);
+  sreal time = compute_uninlined_call_time (inline_summaries->get (e->callee),
+   e);
   sreal inlined_time = compute_inlined_call_time (e, estimate_edge_time (e));
+
   if (time - inlined_time
   > (sreal) time * PARAM_VALUE (PARAM_INLINE_MIN_SPEEDUP)
 * percent_rec)
@@ -862,49 +886,6 @@ want_inline_function_to_all_callers_p (s
   return true;
 }
 
-#define RELATIVE_TIME_BENEFIT_RANGE (INT_MAX / 64)
-
-/* Return relative time improvement for inlining EDGE in range
-   as value NUMERATOR/DENOMINATOR.  */
-
-static inline void
-relative_time_benefit (struct inline_summary *callee_info,
-  struct cgraph_edge *edge,
-  int edge_time,
-  sreal *numerator,
-  sreal *denominator)
-{
-  /* Inlining into extern inline function is not a win.  */
-  if (DECL_EXTERNAL (edge->caller->global.inlined_to
-? edge->caller->global.inlined_to->decl
-: edge->caller->decl))
-{
-  *numerator = (sreal) 1;
-  *denominator = (sreal) 1024;
-  return

[PATCH, autofdo] Some code cleanup

2015-01-12 Thread Yangfei (Felix)
Hi,

  The attached patch does some code cleanup for auto-profile.c: it fixes typos and
removes some unnecessary MAX/MIN checks plus some redundant "else" branches.
  OK for the trunk? 
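
Two of the comments touched by this patch describe a "combined
location": a 32-bit integer whose higher 16 bits store a line offset
and whose lower 16 bits store a discriminator.  A minimal C++ sketch of
that layout follows.  The function names are hypothetical, and masking
out-of-range inputs to 16 bits is an assumption of the sketch rather
than documented behaviour of auto-profile.c.

```cpp
#include <cassert>
#include <cstdint>

// Pack a line offset (upper 16 bits) and a discriminator (lower 16 bits).
// Values that do not fit in 16 bits are truncated (sketch assumption).
static uint32_t
pack_location (uint32_t line_offset, uint32_t discriminator)
{
  return ((line_offset & 0xffffu) << 16) | (discriminator & 0xffffu);
}

// Recover the line offset from a packed value.
static uint32_t
line_offset_of (uint32_t combined)
{
  return combined >> 16;
}

// Recover the discriminator from a packed value.
static uint32_t
discriminator_of (uint32_t combined)
{
  return combined & 0xffffu;
}
```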


Index: gcc/auto-profile.c
===
--- gcc/auto-profile.c  (revision 219297)
+++ gcc/auto-profile.c  (working copy)
@@ -96,7 +96,7 @@ along with GCC; see the file COPYING3.  If not see
  standalone symbol, or a clone of a function that is inlined into another
  function.
 
-   Phase 2: Early inline + valur profile transformation.
+   Phase 2: Early inline + value profile transformation.
  Early inline uses autofdo_source_profile to find if a callsite is:
 * inlined in the profiled binary.
 * callee body is hot in the profiling run.
@@ -361,7 +361,7 @@ get_original_name (const char *name)
 
 /* Return the combined location, which is a 32bit integer in which
higher 16 bits stores the line offset of LOC to the start lineno
-   of DECL, The lower 16 bits stores the discrimnator.  */
+   of DECL, The lower 16 bits stores the discriminator.  */
 
 static unsigned
 get_combined_location (location_t loc, tree decl)
@@ -424,7 +424,7 @@ get_inline_stack (location_t locus, inline_stack *
 
 /* Return STMT's combined location, which is a 32bit integer in which
higher 16 bits stores the line offset of LOC to the start lineno
-   of DECL, The lower 16 bits stores the discrimnator.  */
+   of DECL, The lower 16 bits stores the discriminator.  */
 
 static unsigned
 get_relative_location_for_stmt (gimple stmt)
@@ -481,8 +481,8 @@ string_table::get_index (const char *name) const
   string_index_map::const_iterator iter = map_.find (name);
   if (iter == map_.end ())
 return -1;
-  else
-return iter->second;
+
+  return iter->second;
 }
 
 /* Return the index of a given function DECL. Return -1 if DECL is not 
@@ -502,8 +502,8 @@ string_table::get_index_by_decl (tree decl) const
 return ret;
   if (DECL_ABSTRACT_ORIGIN (decl))
 return get_index_by_decl (DECL_ABSTRACT_ORIGIN (decl));
-  else
-return -1;
+
+  return -1;
 }
 
 /* Return the function name of a given INDEX.  */
@@ -569,8 +569,8 @@ function_instance::get_function_instance_by_decl (
 }
   if (DECL_ABSTRACT_ORIGIN (decl))
 return get_function_instance_by_decl (lineno, DECL_ABSTRACT_ORIGIN (decl));
-  else
-return NULL;
+
+  return NULL;
 }
 
 /* Store the profile info for LOC in INFO. Return TRUE if profile info
@@ -597,7 +597,7 @@ function_instance::mark_annotated (location_t loc)
   iter->second.annotated = true;
 }
 
-/* Read the inlinied indirect call target profile for STMT and store it in
+/* Read the inlined indirect call target profile for STMT and store it in
MAP, return the total count for all inlined indirect calls.  */
 
 gcov_type
@@ -824,8 +824,8 @@ autofdo_source_profile::get_callsite_total_count (
   || afdo_string_table->get_index (IDENTIFIER_POINTER (
  DECL_ASSEMBLER_NAME (edge->callee->decl))) != s->name ())
 return 0;
-  else
-return s->total_count ();
+
+  return s->total_count ();
 }
 
 /* Read AutoFDO profile and returns TRUE on success.  */
@@ -956,9 +956,9 @@ read_profile (void)
histograms for indirect-call optimization.
 
This function is actually served for 2 purposes:
-     * before annotation, we need to mark histogram, promote and inline
-     * after annotation, we just need to mark, and let follow-up logic to
-       decide if it needs to promote and inline.  */
+ * before annotation, we need to mark histogram, promote and inline
+ * after annotation, we just need to mark, and let follow-up logic to
+   decide if it needs to promote and inline.  */
 
 static void
 afdo_indirect_call (gimple_stmt_iterator *gsi, const icall_target_map &map,
@@ -1054,7 +1054,7 @@ set_edge_annotated (edge e, edge_set *annotated)
 }
 
 /* For a given BB, set its execution count. Attach value profile if a stmt
-   is not in PROMOTED, because we only want to promot an indirect call once.
+   is not in PROMOTED, because we only want to promote an indirect call once.
Return TRUE if BB is annotated.  */
 
 static bool
@@ -1138,7 +1138,7 @@ afdo_find_equiv_class (bb_set *annotated_bb)
 bb1->aux = bb;
 if (bb1->count > bb->count && is_bb_annotated (bb1, *annotated_bb))
   {
-bb->count = MAX (bb->count, bb1->count);
+bb->count = bb1->count;
 set_bb_annotated (bb, annotated_bb);
   }
   }
@@ -1150,7 +1150,7 @@ afdo_find_equiv_class (bb_set *annotated_bb)
 bb1->aux = bb;
 if (bb1->count > bb->count && is_bb_annotated (bb1, *annotated_bb))
   {
-bb->count = MAX (bb->count, bb1->count);
+bb->count = bb1->count;
 set_bb_annotated (bb, annotated_bb);
   }
   }
@@ -1455,13 +1455,14 @@ afdo_vpt_for_early_inline (stmt_set *promoted_stmt
   }
   }
   }
+
   if (has_vpt)
 {
 

[match-and-simplify] Merge from trunk

2015-01-12 Thread Richard Biener

Committed.

2015-01-12  Richard Biener  

Merge from trunk r218478 through r219383.



Re: [PATCH] Fix enum operands exchange in ipa-inline.c

2015-01-12 Thread Hurugalawadi, Naveen
Hi Richard,

Thanks for the quick review and comments.

Please find attached the modified patch as per your suggestion.

Thanks,
Naveen

From: Richard Biener 
Sent: Monday, January 12, 2015 2:48 PM
To: Hurugalawadi, Naveen
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] Fix enum operands exchange in ipa-inline.c

On Mon, Jan 12, 2015 at 7:58 AM, Hurugalawadi, Naveen
 wrote:
> Hi,
>
> Sorry, I forgot the ChangeLog.

Ok, but please properly wrap the long lines, put '? gimple_...' on a new
one.

Thanks,
Richard.

> ChangeLog
> 2015-01-12  Naveen H.S  
>
> * ipa-inline.c (inline_small_functions): Swap the operands in
> enum.
>
> Thanks,
> Naveen
--- gcc/ipa-inline.c	2015-01-12 14:55:25.291575873 +0530
+++ gcc/ipa-inline.c	2015-01-12 14:56:01.795575453 +0530
@@ -1730,10 +1730,12 @@ inline_small_functions (void)
 		   " to be inlined into %s/%i in %s:%i\n"
 		   " Estimated badness is %f, frequency %.2f.\n",
 		   edge->caller->name (), edge->caller->order,
-		   edge->call_stmt ? "unknown"
-		   : gimple_filename ((const_gimple) edge->call_stmt),
-		   edge->call_stmt ? -1
-		   : gimple_lineno ((const_gimple) edge->call_stmt),
+		   edge->call_stmt
+		   ? gimple_filename ((const_gimple) edge->call_stmt)
+		   : "unknown",
+		   edge->call_stmt
+		   ? gimple_lineno ((const_gimple) edge->call_stmt)
+		   : -1,
 		   badness.to_double (),
 		   edge->frequency / (double)CGRAPH_FREQ_BASE);
 	  if (edge->count)
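
The effect of the swapped operands can be sketched in isolation. This is a minimal illustration under stated assumptions: struct stmt and both helpers are hypothetical stand-ins, not GCC's gimple API. With the operands the wrong way round, a non-null statement would print "unknown" and a null one would be dereferenced; the corrected order is shown here.

```c
#include <stddef.h>

/* Hypothetical stand-in for a call statement; not GCC's gimple type.  */
struct stmt
{
  const char *file;
  int line;
};

/* Use the statement only when it exists, else fall back to "unknown".  */
static const char *
stmt_file_or_unknown (const struct stmt *s)
{
  return s ? s->file : "unknown";
}

/* Same pattern for the line number, with -1 as the fallback.  */
static int
stmt_line_or_minus_one (const struct stmt *s)
{
  return s ? s->line : -1;
}
```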


[gomp4] Merge trunk r219425 (2015-01-10) into gomp-4_0-branch

2015-01-12 Thread Thomas Schwinge
Hi!

In r219453, I have committed a merge from trunk r219425 (2015-01-10) into
gomp-4_0-branch.


Grüße,
 Thomas




Re: [PATCH] IPA ICF: handle correctly indirect_calls

2015-01-12 Thread Richard Biener
On Mon, Jan 12, 2015 at 10:29 AM, Martin Liška  wrote:
> Hello.
>
> The following patch is needed to pass LTO compilation for Chromium. IPA ICF
> verifies polymorphic types for functions that contain any function call;
> I forgot to handle indirect_calls.
>
> The patch bootstraps on x86_64-linux-pc and no new regression is seen.
> Ready for trunk?

Ok.

Thanks,
Richard.

> Thanks,
> Martin


Re: [PATCH] Fix enum operands exchange in ipa-inline.c

2015-01-12 Thread Richard Biener
On Mon, Jan 12, 2015 at 10:36 AM, Hurugalawadi, Naveen
 wrote:
> Hi Richard,
>
> Thanks for the quick review and comments.
>
> Please find attached the modified patch as per your suggestion.

Ok.

Richard.

> Thanks,
> Naveen
> 
> From: Richard Biener 
> Sent: Monday, January 12, 2015 2:48 PM
> To: Hurugalawadi, Naveen
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] Fix enum operands exchange in ipa-inline.c
>
> On Mon, Jan 12, 2015 at 7:58 AM, Hurugalawadi, Naveen
>  wrote:
>> Hi,
>>
>> Sorry, I forgot the ChangeLog.
>
> Ok, but please properly wrap the long lines, put '? gimple_...' on a new
> one.
>
> Thanks,
> Richard.
>
>> ChangeLog
>> 2015-01-12  Naveen H.S  
>>
>> * ipa-inline.c (inline_small_functions): Swap the operands in
>> enum.
>>
>> Thanks,
>> Naveen


Re: Simplify badness metrics in inliner, take 2

2015-01-12 Thread Markus Trippelsdorf
On 2015.01.12 at 10:30 +0100, Jan Hubicka wrote:
> this is a variant of my earlier patch that I committed. It solves issues with
> -fprofile-use and various roundoff errors that triggered sanity checks
> (partly by disabling them).

The new assert triggers during Firefox LTO build on ppc64:

(final libxul link:)

lto1: internal compiler error: in inline_small_functions, at ipa-inline.c:1664
0x10d0a023 inline_small_functions
../../gcc/gcc/ipa-inline.c:1664
0x10d0a023 ipa_inline
../../gcc/gcc/ipa-inline.c:2163
0x10d0a023 execute
../../gcc/gcc/ipa-inline.c:2536
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.
lto-wrapper: fatal error: ../../../gcc_test/usr/local/bin/c++ returned 1 exit 
status
compilation terminated.
/home/trippels/bin/ld: fatal error: lto-wrapper failed
collect2: error: ld returned 1 exit status
make[5]: *** [libxul.so] Error 1


-- 
Markus


[wwwdocs, patch] Update Fortran part of gcc-5/changes.html

2015-01-12 Thread Tobias Burnus
Hi all, hi Gerald,

this patch syncs the changes from https://gcc.gnu.org/wiki/GFortran/News#GCC5
for the compatibility section added today and Janne's locale addition.

If there are no objections or comments, I will commit it this evening.

Tobias,
who is really behind reading fortran@gcc emails.
Index: changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v
retrieving revision 1.57
diff -p -u -r1.57 changes.html
--- changes.html	8 Jan 2015 16:50:23 -	1.57
+++ changes.html	12 Jan 2015 10:08:29 -
@@ -333,6 +333,24 @@ void operator delete[] (void *, std::siz
 
 Fortran
   
+Compatibility notice:
+  The version of the module files (.mod) has been incremented.
+  For free-form source files,
+https://gcc.gnu.org/onlinedocs/gfortran/Error-and-Warning-Options.html";>-Werror=line-truncation
+is now enabled by default; note that comments exceeding the line length
+are not diagnosed.  (For fixed-form source code, the same warning is
+available but turned off by default, such that excess characters are
+ignored. -ffree-line-length-n and
+-ffixed-line-length-n can be used to modify the default
+line lengths of 132 and 72 columns, respectively.)
+  The -Wtabs option is now more sensible: with
+-Wtabs the compiler warns if it encounters tabs and with
+-Wno-tabs this warning is turned off.  Before,
+-Wno-tabs warned and -Wtabs turned the warning
+off.  As before, the warning is also enabled by -Wall,
+-pedantic and the f95, f2003,
+f2008 and f2008ts options of -std=.
+  
 Incomplete support for colorizing diagnostics emitted by
   gfortran has been added. The
   option https://gcc.gnu.org/onlinedocs/gcc/Language-Independent-Options.html";
@@ -359,6 +377,10 @@ void operator delete[] (void *, std::siz
 The -Wuse-without-only option has been added to warn when a
   USE statement has no ONLY qualifier and, thus,
   implicitly imports all public entities of the used module.
+Formatted READ and WRITE statements now work correctly in locale-aware
+  programs.  For more information and potential caveats, see
+  https://gcc.gnu.org/onlinedocs/gfortran/Thread-safety-of-the-runtime-library.html";>Section
+  5.3 Thread-safety of the runtime library in the manual.
 https://gcc.gnu.org/wiki/Fortran2003Status";>Fortran 2003:
 
   The intrinsic IEEE modules (IEEE_FEATURES,


[PATCH]: Fix for PR ipa/64550

2015-01-12 Thread Martin Liška

Hello.

The following patch is a fix for PR ipa/64550; it bootstraps on x86_64-linux-pc.
The explanation for the patch is given here: [1].

I hope this is the correct fix for such cases?

Thanks,
Martin

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64550
From bef79e6e5e0d7d8e555e9241ffcfb88a92552e12 Mon Sep 17 00:00:00 2001
From: mliska 
Date: Mon, 12 Jan 2015 10:54:36 +0100
Subject: [PATCH] Fix for PR64550.

gcc/ChangeLog:

2015-01-12  Martin Liska  

	* ipa-icf-gimple.c (func_checker::compare_memory_operand): Compare
	volatility for correct operands.

gcc/testsuite/ChangeLog:

2015-01-12  Martin Liska  

	* gcc.dg/ipa/PR64550.c: New test.
---
 gcc/ipa-icf-gimple.c   |  2 +-
 gcc/testsuite/gcc.dg/ipa/PR64550.c | 76 ++
 2 files changed, 77 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/ipa/PR64550.c

diff --git a/gcc/ipa-icf-gimple.c b/gcc/ipa-icf-gimple.c
index 8c3a27b..ed3cdf5 100644
--- a/gcc/ipa-icf-gimple.c
+++ b/gcc/ipa-icf-gimple.c
@@ -267,7 +267,7 @@ func_checker::compare_memory_operand (tree t1, tree t2)
   /* Compare alias sets for memory operands.  */
   if (source_is_memop && target_is_memop)
 {
-  if (TREE_THIS_VOLATILE (b1) != TREE_THIS_VOLATILE (b2))
+  if (TREE_THIS_VOLATILE (t1) != TREE_THIS_VOLATILE (t2))
 	return return_false_with_msg ("different operand volatility");
 
   if (ao_ref_alias_set (&r1) != ao_ref_alias_set (&r2)
diff --git a/gcc/testsuite/gcc.dg/ipa/PR64550.c b/gcc/testsuite/gcc.dg/ipa/PR64550.c
new file mode 100644
index 000..3b439c9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/PR64550.c
@@ -0,0 +1,76 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-ipa-icf-details"  } */
+
+struct __hlist_head
+{
+  struct __hlist_node *first;
+};
+
+struct __hlist_node
+{
+  struct __hlist_node *next, **pprev;
+};
+
+struct __net
+{
+  int ifindex;
+  struct __hlist_head * dev_index_head;
+};
+
+struct __net_device
+{
+  int ifindex;
+  struct __net *nd_net;
+  struct __hlist_node index_hlist;
+};
+
+__attribute__ ((noinline, noclone))
+static struct __hlist_head * __dev_index_hash(struct __net *net,
+int ifindex)
+{
+  return &net->dev_index_head[ifindex & 1];
+}
+
+__attribute__ ((noinline, noclone))
+struct __net_device * __dev_get_by_index(struct __net *net, int ifindex)
+{
+  struct __net_device * dev;
+  struct __hlist_head * head = __dev_index_hash(net, ifindex);
+
+  for (dev = ( { typeof((head)->first) ptr = ((head)->first); ptr ? ( { const typeof(((typeof(*(dev)) *) 0)->index_hlist) * __mptr = (ptr); (typeof(*(dev)) *) ((char *)__mptr - __builtin_offsetof(typeof(*(dev)), index_hlist));}): ((void *) 0);});
+   dev; dev = ( { typeof ((dev)->index_hlist.next) ptr = ((dev)->index_hlist.next); ptr ? ( { const typeof(((typeof(*(dev)) *) 0)->index_hlist) * __mptr = (ptr); (typeof(*(dev)) *) ((char *)__mptr - __builtin_offsetof(typeof(*(dev)), index_hlist));}): ((void *) 0);}))
+if (dev->ifindex == ifindex)
+  return dev;
+
+  return ((void *)0);
+}
+
+__attribute__ ((noinline, noclone))
+struct __net_device * dev_get_by_index_rcu(struct __net *net, int ifindex)
+{
+  struct __net_device * dev;
+  struct __hlist_head * head = __dev_index_hash(net, ifindex);
+
+  for (dev = ( { typeof(( { typeof (* ((*((struct __hlist_node **)(&(head)->first) * _p1 = (typeof(*((*((struct __hlist_node **)(&(head)->first) *) (*(volatile typeof(((*((struct __hlist_node **)(&(head)->first) *)&(((*((struct __hlist_node **)(&(head)->first)); do { } while (0);; do { } while (0); ((typeof(*((*((struct __hlist_node **)(&(head)->first) *) (_p1));})) ptr = (( { typeof (* ((*((struct __hlist_node **)(&(head)->first) * _p1 = (typeof(*((*((struct __hlist_node **)(&(head)->first) *) (*(volatile typeof(((*((struct __hlist_node **)(&(head)->first) *)&(((*((struct __hlist_node **)(&(head)->first)); do { } while (0);; do { } while (0); ((typeof(*((*((struct __hlist_node **)(&(head)->first) *) (_p1));})); ptr ? ( { const typeof(((typeof(*(dev)) *) 0)->index_hlist) * __mptr = (ptr); (typeof(*(dev)) *) ((char *)__mptr - __builtin_offsetof(typeof(*(dev)), index_hlist));}):((void *) 0);});
+	  dev; dev = ( { typeof(( { typeof (* ((*((struct __hlist_node **)(&(&(dev)->index_hlist)->next) * _p1 = (typeof(*((*((struct __hlist_node **)(&(&(dev)->index_hlist)->next) *) (*(volatile typeof(((*((struct __hlist_node **)(&(&(dev)->index_hlist)->next) *)&(((*((struct __hlist_node **)(&(&(dev)->index_hlist)->next)); do { } while (0);; do { } while (0); ((typeof(*((*((struct __hlist_node **)(&(&(dev)->index_hlist)->next) *) (_p1));})) ptr = (( { typeof (* ((*((struct __hlist_node **)(&(&(dev)->index_hlist)->next) * _p1 = (typeof(*((*((struct __hlist_node **)(&(&(dev)->index_hlist)->next) *) (*(volatile typeof(((*((struct __hlist_node **)(&(&(dev)->index_hlist)->next)

Re: [PATCH, testsuite] fix ggcplug.c test-case

2015-01-12 Thread Prathamesh Kulkarni
On 12 January 2015 at 14:36, Richard Biener  wrote:
> On Mon, 12 Jan 2015, Prathamesh Kulkarni wrote:
>
>> On 12 January 2015 at 14:19, Richard Biener  wrote:
>> > On Sun, 11 Jan 2015, Prathamesh Kulkarni wrote:
>> >
>> >> Hi,
>> >> The test-case plugin/ggcplug.c was failing due to flattening of tree.h
>> >> and tree-core.h.
>> >> Test-case was incorrect because it included gcc-plugin.h after tree.h 
>> >> whereas
>> >> gcc-plugin.h should be the first header to be included by plugins.
>> >
>> > No, it should be definitely included _after_ config.h, system.h
>> > and coretypes.h.
>> gcc-plugin.h already includes these files. Shall I remove config.h,
>> system.h and coretypes.h
>> from ggcplug.c instead ?
>
> No, keep the patch simple for now - we are inconsistent in all the
> testsuite plugins it seems and wasn't the idea that plugins _only_
> need to include gcc-plugin.h now?  Thus I'd rather cleanup all
> plugin testcases at once, with a separate patch.
I thought gcc-plugin.h would contain include dependencies of all
headers (to make plugins transparent
to include restructuring) and if a plugin needs a particular header,
it should explicitly include it. Or am I
missing something ?
>
> Thanks,
> Richard.
>
>> >
>> > Ok with moving it after coretypes.h.
Shall I commit the patch after this change since this is the only
plugin test case that's failing ?

Thanks,
Prathamesh
>> >
>> > Thanks,
>> > Richard.
>>
>>
>
> --
> Richard Biener 
> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild,
> Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)


Re: [PATCH]: Fix for PR ipa/64550

2015-01-12 Thread Richard Biener
On Mon, 12 Jan 2015, Martin Liška wrote:

> Hello.
> 
> The following patch is a fix for PR ipa/64550; it bootstraps on
> x86_64-linux-pc.
> The explanation for the patch is given here: [1].
> 
> I hope this is the correct fix for such cases?

Ah, using TREE_THIS_VOLATILE on the result of ao_ref_base
is wrong - you need to use r1.volatile_p != r2.volatile_p.

That's actually equivalent to what your patch does, thus that
is ok.

Thanks,
Richard.

> Thanks,
> Martin
> 
> [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64550
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild,
Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)
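
The distinction discussed above, as I read it, is that volatility is a property of the access expression and not necessarily of the underlying object, which is why comparing it on the base can miss a difference. A minimal C sketch of that situation follows; the function names are hypothetical and this is only an illustration, not the PR testcase.

```c
/* Plain read: neither the object nor the access is volatile.  */
static int
read_plain (int *p)
{
  return *p;
}

/* Volatile read: only the access is volatile-qualified through the
   cast; the pointed-to object itself is an ordinary int.  */
static int
read_volatile_access (int *p)
{
  return *(volatile int *) p;
}
```

Both functions return the same value for the same object, yet they must not be treated as identical, because only one of them performs a volatile access.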

Re: [PATCH, testsuite] fix ggcplug.c test-case

2015-01-12 Thread Richard Biener
On Mon, 12 Jan 2015, Prathamesh Kulkarni wrote:

> On 12 January 2015 at 14:36, Richard Biener  wrote:
> > On Mon, 12 Jan 2015, Prathamesh Kulkarni wrote:
> >
> >> On 12 January 2015 at 14:19, Richard Biener  wrote:
> >> > On Sun, 11 Jan 2015, Prathamesh Kulkarni wrote:
> >> >
> >> >> Hi,
> >> >> The test-case plugin/ggcplug.c was failing due to flattening of tree.h
> >> >> and tree-core.h.
> >> >> Test-case was incorrect because it included gcc-plugin.h after tree.h 
> >> >> whereas
> >> >> gcc-plugin.h should be the first header to be included by plugins.
> >> >
> >> > No, it should be definitely included _after_ config.h, system.h
> >> > and coretypes.h.
> >> gcc-plugin.h already includes these files. Shall I remove config.h,
> >> system.h and coretypes.h
> >> from ggcplug.c instead ?
> >
> > No, keep the patch simple for now - we are inconsistent in all the
> > testsuite plugins it seems and wasn't the idea that plugins _only_
> > need to include gcc-plugin.h now?  Thus I'd rather cleanup all
> > plugin testcases at once, with a separate patch.
> I thought gcc-plugin.h would contain include dependencies of all
> headers (to make plugins transparent
> to include restructuring) and if a plugin needs a particular header,
> it should explicitly include it. Or am I
> missing something ?

No idea - I thought the idea was that plugins only ever need to
include gcc-plugin.h which will include everything (aka the "world")
so plugins are immune to things moving between headers (another
thing that happened a lot for GCC 5).

> >
> > Thanks,
> > Richard.
> >
> >> >
> >> > Ok with moving it after coretypes.h.
> Shall I commit the patch after this change since this is the only
> plugin test case that's failing ?

You should commit a patch moving the gcc-plugin.h include in ggcplug.c
to after the include of coretypes.h.

Thanks,
Richard.


Re: [PATCH, i386] Remove EBX usage from asm code

2015-01-12 Thread Evgeny Stupachenko
"frame_dummy" does not use EBX in allocation now as there are enough
other registers (that we don't need to save/restore). So if we do not
modify "frame_dummy", EBX should stay unchanged.
"frame_dummy" does not initialize the EBX register at the beginning; it
expects that EBX is the PIC register set up by glibc.
"frame_dummy" is called from glibc, and as long as glibc is compiled by a
4.9 or older compiler, EBX should arrive in "frame_dummy" as the PIC register.

I am not sure that it is correct right now, but it will obviously be
potentially buggy when glibc is recompiled with GCC 5.0.

libgcc (frame_dummy):

static void __attribute__((used))
frame_dummy (void)
{
#ifdef USE_EH_FRAME_REGISTRY
  static struct object object;
#ifdef CRT_GET_RFIB_DATA
  void *tbase, *dbase;
  tbase = 0;
  CRT_GET_RFIB_DATA (dbase);
  if (__register_frame_info_bases)
__register_frame_info_bases (__EH_FRAME_BEGIN__, &object, tbase, dbase);
#else
  if (__register_frame_info)
__register_frame_info (__EH_FRAME_BEGIN__, &object);
#endif /* CRT_GET_RFIB_DATA */


On Mon, Jan 5, 2015 at 11:50 PM, Jeff Law  wrote:
> On 12/28/14 09:46, Evgeny Stupachenko wrote:
>>
>> Hi,
>>
>> The patch removes EBX usage from the asm code used in libgcc/crtstuff.c.
>> It is safe now, but potentially buggy when glibc is rebuilt with GCC
>> 5.0, as EBX is not the GOT register any more.
>>
>> x86 bootstrap, make check passed.
>>
>> Is it ok?
>>
>> Evgeny
>>
>> 2014-12-28  Evgeny Stupachenko  
>>
>>  * gnu-user.h (CRT_GET_RFIB_DATA): Remove EBX register usage.
>>  * config/i386/sysv4.h (CRT_GET_RFIB_DATA): Ditto.
>
> I don't understand the glibc reference above.
>
> Ultimately what matters here, AFAICT is the value assigned to the parameter
> to CRT_GET_RFIB_DATA which should be the base of the data relative
> relocations.  So the comment "It is safe now" seems wrong as well.
>
> ISTM this is a critical fix as it would be possible for the PIC pseudo to be
> assigned to something other than %ebx when compiling libgcc/crtstuff.c.  And
> if that happens, we'll pass in a junk value to register_frame_info_bases.
>
> Evgeny, can you clarify why you think things are safe now, but would not be
> safe if glibc were to be built with the current GCC trunk?
>
> Jeff


Re: [PATCH, i386] Remove EBX usage from asm code

2015-01-12 Thread Jakub Jelinek
On Mon, Jan 12, 2015 at 01:36:05PM +0300, Evgeny Stupachenko wrote:
> "frame_dummy" does not use EBX in allocation now as there are enough
> other registers (that we don't need to save/restore). So if we do not
> modify "frame_dummy", EBX should stay unchanged.
> "frame_dummy" does not initialize the EBX register at the beginning; it
> expects that EBX is the PIC register set up by glibc.
> "frame_dummy" is called from glibc, and as long as glibc is compiled by a
> 4.9 or older compiler, EBX should arrive in "frame_dummy" as the PIC register.

I also don't understand how this is related to glibc in any way.
From my understanding, the macro relied on %ebx being set to
_GLOBAL_OFFSET_TABLE_ because the frame_dummy function does access
GOT, so before the i?86 PIC reg changes it was computing %ebx.

Jakub


Re: [PATCH, testsuite] fix ggcplug.c test-case

2015-01-12 Thread Prathamesh Kulkarni
On 12 January 2015 at 15:49, Richard Biener  wrote:
> On Mon, 12 Jan 2015, Prathamesh Kulkarni wrote:
>
>> On 12 January 2015 at 14:36, Richard Biener  wrote:
>> > On Mon, 12 Jan 2015, Prathamesh Kulkarni wrote:
>> >
>> >> On 12 January 2015 at 14:19, Richard Biener  wrote:
>> >> > On Sun, 11 Jan 2015, Prathamesh Kulkarni wrote:
>> >> >
>> >> >> Hi,
>> >> >> The test-case plugin/ggcplug.c was failing due to flattening of tree.h
>> >> >> and tree-core.h.
>> >> >> Test-case was incorrect because it included gcc-plugin.h after tree.h 
>> >> >> whereas
>> >> >> gcc-plugin.h should be the first header to be included by plugins.
>> >> >
>> >> > No, it should be definitely included _after_ config.h, system.h
>> >> > and coretypes.h.
>> >> gcc-plugin.h already includes these files. Shall I remove config.h,
>> >> system.h and coretypes.h
>> >> from ggcplug.c instead ?
>> >
>> > No, keep the patch simple for now - we are inconsistent in all the
>> > testsuite plugins it seems and wasn't the idea that plugins _only_
>> > need to include gcc-plugin.h now?  Thus I'd rather cleanup all
>> > plugin testcases at once, with a separate patch.
>> I thought gcc-plugin.h would contain include dependencies of all
>> headers (to make plugins transparent
>> to include restructuring) and if a plugin needs a particular header,
>> it should explicitly include it. Or am I
>> missing something ?
>
> No idea - I thought the idea was that plugins only ever need to
> include gcc-plugin.h which will include everything (aka the "world")
> so plugins are immune to things moving between headers (another
> thing that happened a lot for GCC 5).
>
>> >
>> > Thanks,
>> > Richard.
>> >
>> >> >
>> >> > Ok with moving it after coretypes.h.
>> Shall I commit the patch after this change since this is the only
>> plugin test case that's failing ?
>
> You should commit a patch moving the gcc-plugin.h include in ggcplug.c
> to after the include of coretypes.h.
Moved gcc-plugin.h include after coretypes.h and committed as r219458.

Thanks,
Prathamesh
>
> Thanks,
> Richard.


Re: [PATCH] Flatten tree.h and tree-core.h (Version 3)

2015-01-12 Thread Andreas Schwab
I'm getting this testsuite regression:

FAIL: gcc.dg/plugin/ggcplug.c compilation

In file included from 
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree.h:23:0,
 from 
/usr/local/gcc/gcc-20150112/gcc/testsuite/gcc.dg/plugin/ggcplug.c:8:
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:705:18: error: 
'hash_set' has not been declared
  void *, hash_set *);
  ^
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:705:26: error: 
expected ',' or '...' before '<' token
  void *, hash_set *);
      ^
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1139:24: error: 
field 'id' has incomplete type 'ht_identifier'
   struct ht_identifier id;
^
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1139:10: note: 
forward declaration of 'struct ht_identifier'
   struct ht_identifier id;
  ^
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1164:3: error: 
'vec' does not name a type
   vec *elts;
   ^
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1206:3: error: 
'location_t' does not name a type
   location_t locus;
   ^
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1253:3: error: 
'location_t' does not name a type
   location_t locus;
   ^
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1258:3: error: 
'location_t' does not name a type
   location_t locus;
   ^
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1285:3: error: 
'location_t' does not name a type
   location_t locus;
   ^
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1286:3: error: 
'location_t' does not name a type
   location_t end_locus;
   ^
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1289:3: error: 
'vec' does not name a type
   vec *nonlocalized_vars;
   ^
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1324:3: error: 
'alias_set_type' does not name a type
   alias_set_type alias_set;
   ^
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1360:3: error: 
'vec' does not name a type
   vec *base_accesses;
   ^
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1366:3: error: 
'vec' does not name a type
   vec base_binfos;
   ^
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1371:3: error: 
'location_t' does not name a type
   location_t locus;
   ^
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1758:3: error: 
'vec' does not name a type
   vec *pending_statics;
   ^
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1799:3: error: 
'vec' does not name a type
   vec *to;
   ^
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1845:16: error: 
'vec' does not name a type
 extern GTY(()) vec *alias_pairs;
    ^
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1858:17: error: 
'vec' does not name a type
 extern GTY (()) vec *all_translation_units;
 ^
In file included from 
/usr/local/gcc/gcc-20150112/gcc/testsuite/gcc.dg/plugin/ggcplug.c:8:0:
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree.h:1073:48: error: 
'location_t' has not been declared
 extern void protected_set_expr_location (tree, location_t);
^
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree.h:2642:8: error: 'vec' 
does not name a type
 extern vec **decl_debug_args_lookup (tree);
^
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree.h:2643:8: error: 'vec' 
does not name a type
 extern vec **decl_debug_args_insert (tree);
^
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree.h:3560:38: error: 
'vec' has not been declared
 extern tree build_nt_call_vec (tree, vec *);
      ^
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree.h:3560:41: error: 
expected ',' or '...' before '<' token
 extern tree build_nt_call_vec (tree, vec *);
 ^
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree.h:3580:18: error: 
'build1_stat_loc' declared as an 'inline' variable
 build1_stat_loc (location_t loc, enum tree_code code, tree type,
  ^
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree.h:3580:18: error: 
'location_t' was not declared in this scope
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree.h:3580:34: error: 
expected primary-expression before 'enum'
 build1_stat_loc (location_t loc, enum tree_code code, tree type,

Re: [PATCH][ARM][cleanup] Use R0_REGNUM and R1_REGNUM instead of 0 and 1 where appropriate

2015-01-12 Thread Ramana Radhakrishnan
On Thu, Dec 11, 2014 at 9:34 AM, Kyrill Tkachov  wrote:
> Hi all,
>
> While looking in this area on other business I noticed we could be using the
> names R0_REGNUM
> and R1_REGNUM when creating those REG rtxs since it's a bit more descriptive
> that just 0 and 1.
>
> Tested arm-none-eabi.
>
> Ok for trunk?

Sorry been on holiday and now catching up on emails.

This is OK, thanks.

Ramana

>
> Thanks,
> Kyrill
>
> 2014-12-11  Kyrylo Tkachov  kyrylo.tkac...@arm.com
>
> * config/arm/arm.c (arm_load_tp): Use R0_REGNUM instead of constant 0
> in gen_rtx_REG.
> (arm_tls_descseq_addr): Likewise.
> (arm_gen_movmemqi): Likewise.
> (arm_expand_epilogue_apcs_frame): Likewise.
> (arm_expand_epilogue): Likewise.
> (arm_expand_prologue): Likewise.  Use R1_REGNUM instead of constant 1
> in gen_rtx_REG.


Re: Simplify badness metrics in inliner, take 2

2015-01-12 Thread Markus Trippelsdorf
On 2015.01.12 at 10:59 +0100, Markus Trippelsdorf wrote:
> On 2015.01.12 at 10:30 +0100, Jan Hubicka wrote:
> > this is a variant of my earlier patch that I committed. It solves issues with
> > -fprofile-use and various roundoff errors that triggered sanity checks
> > (partly by disabling them).
> 
> The new assert triggers during Firefox LTO build on ppc64:
> 
> (final libxul link:)
> 
> lto1: internal compiler error: in inline_small_functions, at ipa-inline.c:1664
> 0x10d0a023 inline_small_functions
> ../../gcc/gcc/ipa-inline.c:1664
> 0x10d0a023 ipa_inline
> ../../gcc/gcc/ipa-inline.c:2163
> 0x10d0a023 execute
> ../../gcc/gcc/ipa-inline.c:2536
> Please submit a full bug report,
> with preprocessed source if appropriate.
> Please include the complete backtrace with any bug report.
> See  for instructions.
> lto-wrapper: fatal error: ../../../gcc_test/usr/local/bin/c++ returned 1 exit 
> status
> compilation terminated.
> /home/trippels/bin/ld: fatal error: lto-wrapper failed
> collect2: error: ld returned 1 exit status
> make[5]: *** [libxul.so] Error 1

See: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64565

-- 
Markus


Re: [PATCH, i386] Remove EBX usage from asm code

2015-01-12 Thread Evgeny Stupachenko
Agreed, I missed the use of the function
"__register_frame_info_bases" (the frame_dummy assembly had only an indirect
call when I omitted "-pie" in compilation).
So there is no dependence on glibc. Sorry for the confusion.
The code is therefore potentially buggy right now.


On Mon, Jan 12, 2015 at 1:50 PM, Jakub Jelinek  wrote:
> On Mon, Jan 12, 2015 at 01:36:05PM +0300, Evgeny Stupachenko wrote:
>> "frame_dummy" does not use EBX in allocation now as there are enough
>> other registers (that we don't need to save/restore). So if we do not
>> modify "frame_dummy", EBX should stay unchanged.
>> "frame_dummy" does not initialize the EBX register at the beginning; it
>> expects that EBX is the PIC register set up by glibc.
>> "frame_dummy" is called from glibc, and as long as glibc is compiled by a
>> 4.9 or older compiler, EBX should arrive in "frame_dummy" as the PIC register.
>
> I also don't understand how this is related to glibc in any way.
> From my understanding, the macro relied on %ebx being set to
> _GLOBAL_OFFSET_TABLE_ because the frame_dummy function does access
> GOT, so before the i?86 PIC reg changes it was computing %ebx.
>
> Jakub


Re: [PATCH 0/2] Offloading from dlopened libraries

2015-01-12 Thread Ilya Verbin
Hi!

How about this patch?  It adds a new symbol to the GOMP_4.0.1 symver, so it
would be nice to include it in the GCC 5 release.

On 14 Nov 02:53, Ilya Verbin wrote:
> This patch fixes offloading from dlopened libraries, part 1 is for libgomp and
> part 2 is for intelmic plugin.
> 
> How it works:
> When a library is loaded it calls GOMP_offload_register as usual.
> At this time some devices may already be initialized, and some may not be.
> Therefore libgomp goes through all devices and, for the initialized devices,
> calls GOMP_OFFLOAD_load_image, then receives the corresponding addresses and
> inserts them into the splay tree.  It also fills the offload_images array for
> lazy initialization.
> 
> When the library is unloaded it calls GOMP_offload_unregister.
> This function also needs to go through all devices and call
> GOMP_OFFLOAD_unload_image for all initialized devices.  It also removes the
> mapped addresses from the corresponding splay trees and the pending images
> from the array.
> 
> Any thoughts on that?
> 
> Thomas, Julian,
> Will this approach work for OpenACC+PTX?  I hope that it is general enough.
> Yeah, I understand that this change will require some efforts on your part to
> rebase the patches, but it would be good to define a common libgomp<->plugin
> interface as early as possible.

  -- Ilya


[PATCH] Fix PR64530

2015-01-12 Thread Richard Biener

This fixes PR64530 by fixing a mistake (oops) in the iteration
over all data-ref pairs in pg_add_dependence_edges.
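
For readers without the source at hand, here is a minimal sketch of this
class of mistake.  It is not the GCC code: Ref, its order field and
visit_pairs are hypothetical stand-ins, and order only plays the role of
dominator order.  The inner loop swaps the outer element with the inner one;
unless the outer element is restored, later inner iterations pair up the
wrong references.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for a data reference; `order` plays the role of
// the position in dominator order.
struct Ref { int id; int order; };

using Pair = std::pair<int, int>;

// Visits every (outer, inner) pair and swaps the two so that the one with
// the smaller `order` comes first.  With restore == false the swap leaks
// into the next inner iteration.
std::vector<Pair> visit_pairs(const std::vector<Ref> &outer,
                              const std::vector<Ref> &inner,
                              bool restore)
{
  std::vector<Pair> seen;
  for (const Ref &o : outer)
    {
      const Ref *r1 = &o;
      for (const Ref &i : inner)
        {
          const Ref *saved_r1 = r1;
          const Ref *r2 = &i;
          if (r2->order < r1->order)
            std::swap(r1, r2);
          seen.push_back({r1->id, r2->id});
          if (restore)
            r1 = saved_r1;   // shuffle "back" the outer element
        }
    }
  return seen;
}
```

With one outer reference (id 1, order 5) and two inner ones (id 2, order 1
and id 3, order 9), the restoring variant examines the pair (1, 3) on the
second inner iteration, while the non-restoring variant examines (2, 3)
instead and never looks at (1, 3).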

Bootstrap and regtest pending on x86_64-unknown-linux-gnu.

Richard.

2015-01-12  Richard Biener  

PR tree-optimization/64530
* tree-loop-distribution.c (pg_add_dependence_edges): Shuffle
back dr1.

* gfortran.dg/pr64530.f90: New testcase.

Index: gcc/tree-loop-distribution.c
===
--- gcc/tree-loop-distribution.c(revision 219446)
+++ gcc/tree-loop-distribution.c(working copy)
@@ -1362,6 +1375,7 @@ pg_add_dependence_edges (struct graph *r
   for (int ii = 0; drs1.iterate (ii, &dr1); ++ii)
 for (int jj = 0; drs2.iterate (jj, &dr2); ++jj)
   {
+   data_reference_p saved_dr1 = dr1;
int this_dir = 1;
ddr_p ddr;
/* Re-shuffle data-refs to be in dominator order.  */
@@ -1407,6 +1421,8 @@ pg_add_dependence_edges (struct graph *r
  dir = this_dir;
else if (dir != this_dir)
  return 2;
+   /* Shuffle "back" dr1.  */
+   dr1 = saved_dr1;
   }
   return dir;
 }
Index: gcc/testsuite/gfortran.dg/pr64530.f90
===
--- gcc/testsuite/gfortran.dg/pr64530.f90   (revision 0)
+++ gcc/testsuite/gfortran.dg/pr64530.f90   (working copy)
@@ -0,0 +1,38 @@
+! { dg-do run }
+
+program bug
+  ! Bug triggered with at least three elements
+  integer, parameter :: asize = 3
+
+  double precision,save :: ave(asize)
+  double precision,save :: old(asize)
+  double precision,save :: tmp(asize)
+
+  ave(:) = 10.d0
+  old(:) = 3.d0
+  tmp(:) = 0.d0
+
+  call buggy(2.d0,asize,ave,old,tmp)
+  if (any (tmp(:) .ne. 3.5)) call abort
+end
+
+subroutine buggy(scale_factor, asize, ave, old, tmp)
+
+  implicit none
+  ! Args
+  double precision scale_factor
+  integer asize
+  double precision ave(asize)
+  double precision old(asize)
+  double precision tmp(asize)
+
+  ! Local 
+  integer i
+
+  do i = 1, asize
+tmp(i) = ave(i) - old(i)
+old(i) = ave(i)
+tmp(i) = tmp(i) / scale_factor
+  end do
+
+end subroutine buggy


Re: [PATCH] Flatten tree.h and tree-core.h (Version 3)

2015-01-12 Thread Prathamesh Kulkarni
On 12 January 2015 at 16:24, Andreas Schwab  wrote:
> I'm getting this testsuite regression:
>
> FAIL: gcc.dg/plugin/ggcplug.c compilation
Fixed with r219458.

Thanks,
Prathamesh
>
> In file included from 
> /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree.h:23:0,
>  from 
> /usr/local/gcc/gcc-20150112/gcc/testsuite/gcc.dg/plugin/ggcplug.c:8:
> /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:705:18: 
> error: 'hash_set' has not been declared
>   void *, hash_set *);
>   ^
> /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:705:26: 
> error: expected ',' or '...' before '<' token
>   void *, hash_set *);
>   ^
> /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1139:24: 
> error: field 'id' has incomplete type 'ht_identifier'
>struct ht_identifier id;
> ^
> /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1139:10: 
> note: forward declaration of 'struct ht_identifier'
>struct ht_identifier id;
>   ^
> /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1164:3: 
> error: 'vec' does not name a type
>vec *elts;
>    ^
> /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1206:3: 
> error: 'location_t' does not name a type
>location_t locus;
>^
> /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1253:3: 
> error: 'location_t' does not name a type
>location_t locus;
>^
> /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1258:3: 
> error: 'location_t' does not name a type
>location_t locus;
>^
> /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1285:3: 
> error: 'location_t' does not name a type
>location_t locus;
>^
> /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1286:3: 
> error: 'location_t' does not name a type
>location_t end_locus;
>^
> /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1289:3: 
> error: 'vec' does not name a type
>vec *nonlocalized_vars;
>^
> /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1324:3: 
> error: 'alias_set_type' does not name a type
>alias_set_type alias_set;
>^
> /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1360:3: 
> error: 'vec' does not name a type
>    vec *base_accesses;
>^
> /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1366:3: 
> error: 'vec' does not name a type
>vec base_binfos;
>^
> /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1371:3: 
> error: 'location_t' does not name a type
>location_t locus;
>^
> /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1758:3: 
> error: 'vec' does not name a type
>vec *pending_statics;
>^
> /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1799:3: 
> error: 'vec' does not name a type
>vec *to;
>^
> /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1845:16: 
> error: 'vec' does not name a type
>  extern GTY(()) vec *alias_pairs;
> ^
> /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1858:17: 
> error: 'vec' does not name a type
>  extern GTY (()) vec *all_translation_units;
>  ^
> In file included from 
> /usr/local/gcc/gcc-20150112/gcc/testsuite/gcc.dg/plugin/ggcplug.c:8:0:
> /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree.h:1073:48: error: 
> 'location_t' has not been declared
>  extern void protected_set_expr_location (tree, location_t);
> ^
> /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree.h:2642:8: error: 
> 'vec' does not name a type
>  extern vec **decl_debug_args_lookup (tree);
> ^
> /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree.h:2643:8: error: 
> 'vec' does not name a type
>  extern vec **decl_debug_args_insert (tree);
> ^
> /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree.h:3560:38: error: 
> 'vec' has not been declared
>  extern tree build_nt_call_vec (tree, vec *);
>   ^
> /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree.h:3560:41: error: 
> expected ',' or '...' before '<' token
>  extern tree build_nt_call_vec (tree, vec *);
>  

[PATCH] Fix PR64357

2015-01-12 Thread Richard Biener

The following patch fixes PR64357 (or papers over some latent issue).
We were not protecting a certain aspect of simple latches properly
(a simple latch should belong to its loop).
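
A rough model of the tightened condition, for illustration only.  Block,
Loop and latch_blocks_merge are hypothetical toy types and names, not the
GCC data structures; the sketch only mirrors the shape of the test in the
patch below.

```cpp
#include <cassert>

struct Loop;

// Toy basic block: only remembers which loop it belongs to.
struct Block { Loop *loop_father; };

// Toy loop: only remembers its header and latch blocks.
struct Loop { Block *header; Block *latch; };

// Returns true when merging block B into block A has to be refused so that
// B stays a simple latch of its own loop.
bool latch_blocks_merge(const Block *a, const Block *b, bool simple_latches)
{
  return simple_latches
         && b->loop_father->latch == b
         && (b->loop_father->header == a          // old condition
             || b->loop_father != a->loop_father); // new: A is in another loop
}
```

In this model a latch may still be merged with a non-header block of its own
loop, but not with the header and not with a block that belongs to a
different loop.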

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Richard.

2015-01-12  Richard Biener  

PR middle-end/64357
* tree-cfg.c (gimple_can_merge_blocks_p): Protect simple
latches properly.

* gcc.dg/torture/pr64357.c: New testcase.

Index: gcc/tree-cfg.c
===
--- gcc/tree-cfg.c  (revision 219446)
+++ gcc/tree-cfg.c  (working copy)
@@ -1723,11 +1727,13 @@ gimple_can_merge_blocks_p (basic_block a
 }
 
   /* Protect simple loop latches.  We only want to avoid merging
- the latch with the loop header in this case.  */
+ the latch with the loop header or with a block in another
+ loop in this case.  */
   if (current_loops
   && b->loop_father->latch == b
   && loops_state_satisfies_p (LOOPS_HAVE_SIMPLE_LATCHES)
-  && b->loop_father->header == a)
+  && (b->loop_father->header == a
+ || b->loop_father != a->loop_father))
 return false;
 
   /* It must be possible to eliminate all phi nodes in B.  If ssa form
Index: gcc/testsuite/gcc.dg/torture/pr64357.c
===
--- gcc/testsuite/gcc.dg/torture/pr64357.c  (revision 0)
+++ gcc/testsuite/gcc.dg/torture/pr64357.c  (working copy)
@@ -0,0 +1,34 @@
+/* { dg-do compile } */
+
+int a, b, c, d, e, f;
+
+long long
+fn1 (int p)
+{
+  return p ? p : 1;
+}
+
+static int
+fn2 ()
+{
+lbl:
+  for (; f;)
+return 0;
+  for (;;)
+{
+  for (b = 0; b; ++b)
+   if (d)
+ goto lbl;
+  c = e;
+}
+}
+
+void
+fn3 ()
+{
+  for (; a; a = fn1 (a))
+{
+  fn2 ();
+  e = 0;
+}
+}


[PATCH] Fix PR64535 - increase emergency EH buffers via a new allocator

2015-01-12 Thread Richard Biener

This "fixes" PR64535 by changing the fixed object size emergency pool
to a variable EH object size (but fixed arena size) allocator.  Via
combining the dependent and non-dependent EH arenas this should allow
around 600 bad_alloc throws in OOM situations on x86_64-linux
compared to the current 64 which should provide some headroom to
the poor souls using EH to communicate OOM in a heavily threaded
environment.
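
To make the idea concrete, here is a simplified sketch of a fixed-arena,
variable-size first-fit allocator.  It is not the code from the patch: the
names arena_pool, ARENA_SIZE and release are made up, the header layout
differs, sizes are rounded to the alignment of std::max_align_t, and the
locking is left out entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Illustrative only: names and sizes are assumptions, and there is no mutex.
constexpr std::size_t ARENA_SIZE = 1024;

class arena_pool
{
  struct free_entry
  {
    std::size_t size;
    free_entry *next;
  };

  static constexpr std::size_t align = alignof(std::max_align_t);
  // Each allocated block starts with a header that stores the block size.
  static constexpr std::size_t header = align;
  static_assert(header >= sizeof(std::size_t), "header must hold a size");
  static_assert(ARENA_SIZE % align == 0, "arena must be a multiple of align");

  alignas(std::max_align_t) unsigned char arena[ARENA_SIZE];
  free_entry *first_free;   // free list, kept sorted by address

  static std::size_t round_up(std::size_t n)
  {
    return (n + align - 1) & ~(align - 1);
  }

public:
  arena_pool() : first_free(new (arena) free_entry{sizeof(arena), nullptr}) {}

  void *allocate(std::size_t size)
  {
    if (size > sizeof(arena))
      return nullptr;
    // Room for the header, and never less than a free-list entry.
    size = round_up(size + header);
    const std::size_t min_block = round_up(sizeof(free_entry));
    if (size < min_block)
      size = min_block;
    // First fit.
    free_entry **e = &first_free;
    while (*e && (*e)->size < size)
      e = &(*e)->next;
    if (!*e)
      return nullptr;
    free_entry *block = *e;
    std::size_t block_size = block->size;
    free_entry *next = block->next;
    if (block_size - size >= min_block)
      {
        // Split: the tail stays on the free list.
        unsigned char *tail = reinterpret_cast<unsigned char *>(block) + size;
        *e = new (tail) free_entry{block_size - size, next};
        block_size = size;
      }
    else
      *e = next;   // hand out the whole block
    new (block) std::size_t(block_size);
    return reinterpret_cast<unsigned char *>(block) + header;
  }

  void release(void *p)
  {
    if (!p)
      return;
    unsigned char *base = static_cast<unsigned char *>(p) - header;
    std::size_t size = *reinterpret_cast<std::size_t *>(base);
    // Find the neighbours in address order.
    free_entry *prev = nullptr, *next = first_free;
    while (next && reinterpret_cast<unsigned char *>(next) < base)
      {
        prev = next;
        next = next->next;
      }
    free_entry *f = new (base) free_entry{size, next};
    // Merge with the following free block if it is adjacent.
    if (next && base + size == reinterpret_cast<unsigned char *>(next))
      {
        f->size += next->size;
        f->next = next->next;
      }
    // Merge with the preceding free block if it is adjacent.
    if (prev && reinterpret_cast<unsigned char *>(prev) + prev->size == base)
      {
        prev->size += f->size;
        prev->next = f->next;
      }
    else if (prev)
      prev->next = f;
    else
      first_free = f;
  }
};
```

The point of such a design is that the number of objects that fit depends on
their actual sizes rather than on a fixed per-object slot, which is why small
exceptions can be served far more often from the same amount of memory.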

Bootstrapped and tested on x86_64-unknown-linux-gnu (with the #if 1
as in the patch below, forcing the use of the allocator).

Comments?  Ok with only the #else path retained?

What about the buffer size - we're now free to choose something that
doesn't depend on the size of INT_MAX (previously required for the
old allocator bitmap)?

At the cost of some more members I can make the allocator more
generic (use a constructor with an arena and an arena size parameter)
and we may move it somewhere public under __gnu_cxx?  But possibly
boost has something like this anyway.

Thanks,
Richard.

2015-01-12  Richard Biener  

PR libstdc++/64535
* libsupc++/eh_alloc.cc: Include new.
(bitmask_type): Remove.
(one_buffer): Likewise.
(emergency_buffer): Likewise.
(emergency_used): Likewise.
(dependents_buffer): Likewise.
(dependents_used): Likewise.
(class pool): New custom fixed-size arena, variable size object
allocator.
(emergency_pool): New global.
(__cxxabiv1::__cxa_allocate_exception): Use new emergency_pool.
(__cxxabiv1::__cxa_free_exception): Likewise.
(__cxxabiv1::__cxa_allocate_dependent_exception): Likewise.
(__cxxabiv1::__cxa_free_dependent_exception): Likewise.

Index: libstdc++-v3/libsupc++/eh_alloc.cc
===
--- libstdc++-v3/libsupc++/eh_alloc.cc  (revision 216303)
+++ libstdc++-v3/libsupc++/eh_alloc.cc  (working copy)
@@ -34,6 +34,7 @@
 #include 
 #include "unwind-cxx.h"
 #include 
+#include <new>
 
 #if _GLIBCXX_HOSTED
 using std::free;
@@ -72,62 +73,176 @@ using namespace __cxxabiv1;
 # define EMERGENCY_OBJ_COUNT   4
 #endif
 
-#if INT_MAX == 32767 || EMERGENCY_OBJ_COUNT <= 32
-typedef unsigned int bitmask_type;
-#else
-#if defined (_GLIBCXX_LLP64)
-typedef unsigned long long bitmask_type;
-#else
-typedef unsigned long bitmask_type;
-#endif
-#endif
-
-
-typedef char one_buffer[EMERGENCY_OBJ_SIZE] __attribute__((aligned));
-static one_buffer emergency_buffer[EMERGENCY_OBJ_COUNT];
-static bitmask_type emergency_used;
-
-static __cxa_dependent_exception dependents_buffer[EMERGENCY_OBJ_COUNT];
-static bitmask_type dependents_used;
 
 namespace
 {
   // A single mutex controlling emergency allocations.
   __gnu_cxx::__mutex emergency_mutex;
-}
 
-extern "C" void *
-__cxxabiv1::__cxa_allocate_exception(std::size_t thrown_size) _GLIBCXX_NOTHROW
-{
-  void *ret;
+  // A fixed-size heap, variable size object allocator
+  class pool
+{
+public:
+  pool();
 
-  thrown_size += sizeof (__cxa_refcounted_exception);
-  ret = malloc (thrown_size);
+  void *allocate (size_t);
+  void free (void *);
+
+  bool in_pool (void *);
+
+private:
+  struct free_entry {
+   size_t size;
+   free_entry *next;
+  };
+  struct allocated_entry {
+   size_t size;
+   char data[];
+  };
+  free_entry *first_free_entry;
+  char arena[EMERGENCY_OBJ_SIZE * EMERGENCY_OBJ_COUNT
++ EMERGENCY_OBJ_COUNT * sizeof (__cxa_dependent_exception)]
+__attribute__((aligned(__alignof__(free_entry))));
+};
 
-  if (! ret)
+  pool::pool()
 {
-  __gnu_cxx::__scoped_lock sentry(emergency_mutex);
+  first_free_entry = reinterpret_cast <free_entry *> (arena);
+  new (first_free_entry) free_entry;
+  first_free_entry->size = EMERGENCY_OBJ_SIZE * EMERGENCY_OBJ_COUNT;
+  first_free_entry->next = NULL;
+}
 
-  bitmask_type used = emergency_used;
-  unsigned int which = 0;
+  void *pool::allocate (size_t size)
+{
+  __gnu_cxx::__scoped_lock sentry(emergency_mutex);
+  /* We need an additional size_t member.  */
+  size += sizeof (size_t);
+  /* And we need to at least hand out objects of the size of
+ a freelist entry.  */
+  if (size < sizeof (free_entry))
+   size = sizeof (free_entry);
+  /* And we need to align objects we hand out to the required
+ alignment of a freelist entry (this really aligns the
+tail which will become a new freelist entry).  */
+  size = ((size + __alignof__(free_entry) - 1)
+ & ~(__alignof__(free_entry) - 1));
+  /* Search for an entry of proper size on the freelist.  */
+  free_entry **e;
+  for (e = &first_free_entry;
+  *e && (*e)->size < size;
+  e = &(*e)->next)
+   ;
+  if (!*e)
+   return NULL;
+  allocated_entry *x;
+  if ((*e)->size - size >= sizeof (free_entry))
+   {
+ /* Split block if it is too large

Re: [arm][patch] fix arm_neon_ok check on !arm_arch7

2015-01-12 Thread Andrew Stubbs

Ping.

On 23/12/14 16:46, Andrew Stubbs wrote:

On 03/12/14 15:03, Andrew Stubbs wrote:

The tools have always allowed us to drop down the arch to
march=armv5te along with using -mfpu=neon. We are now changing command
line behaviour, so an inform in terms of diagnostics to the user would
be useful as it states that we don't really have mfpu=neon generating
neon code any more because of this particular case. If we are to do
this then the original patch is probably not enough as it then doesn't
handle the case of TARGET_VFP3 / TARGET_VFP5 / TARGET_NEON_FP16 /
TARGET_FP16 / TARGET_FPU_ARMV8 etc. etc. etc.


I'll take a look at those shortly.


Or, not so shortly.

It seems that, on ARM, the arch/CPU setting is basically orthogonal to
the FPU setting, and the compiler doesn't even try to match the one to
the other. The assembler does the same. In fact, the testcases that
James refers to, that have hard-coded -march options, really do emit
armv4 code with Neon, say, although most probably don't have
vectorizable code. They only work because they're most likely executed
on Neon hardware.

This means that there's no obvious patch to fix the issue, in the
compiler. It's easy to reject Neon for pre-v7 CPUs, but that has
consequences, as we've seen. We'd have to have a table of fall-back FPUs
or something, and that doesn't seem straight-forward (and anyway, I'm
not sure what values to enter into that table).

So, I've attacked the problem from the other end, and updated the
compiler check.

OK to commit?

Andrew


Re: [arm][patch] fix arm_neon_ok check on !arm_arch7

2015-01-12 Thread Ramana Radhakrishnan
Sorry about the slow response- have been on holiday and still catching 
up on email.


On 12/01/15 13:16, Andrew Stubbs wrote:

Ping.

On 23/12/14 16:46, Andrew Stubbs wrote:

On 03/12/14 15:03, Andrew Stubbs wrote:

The tools have always allowed us to drop down the arch to
march=armv5te along with using -mfpu=neon. We are now changing command
line behaviour, so an inform in terms of diagnostics to the user would
be useful as it states that we don't really have mfpu=neon generating
neon code any more because of this particular case. If we are to do
this then the original patch is probably not enough as it then doesn't
handle the case of TARGET_VFP3 / TARGET_VFP5 / TARGET_NEON_FP16 /
TARGET_FP16 / TARGET_FPU_ARMV8 etc. etc. etc.


I'll take a look at those shortly.


Or, not so shortly.



Sigh.



It seems that, on ARM, the arch/CPU setting is basically orthogonal to
the FPU setting, and the compiler doesn't even try to match the one to
the other. The assembler does the same. In fact, the testcases that
James refers to, that have hard-coded -march options, really do emit
armv4 code with Neon, say, although most probably don't have
vectorizable code. They only work because they're most likely executed
on Neon hardware.


Yes - though I'm surprised as I run an armv5te soft float only test run 
once in a while on my Sheevaplug and don't see these issues. Maybe others do.




This means that there's no obvious patch to fix the issue, in the
compiler. It's easy to reject Neon for pre-v7 CPUs, but that has
consequences, as we've seen. We'd have to have a table of fall-back FPUs
or something, and that doesn't seem straight-forward (and anyway, I'm
not sure what values to enter into that table).

So, I've attacked the problem from the other end, and updated the
compiler check.

OK to commit?


In principle ok, but I'd like a comment in there explaining why we've 
done this. Can you also post which configurations this has been 
tested under?



Ramana



Andrew




C++ PATCH for c++/64547 (constexpr fn returning void)

2015-01-12 Thread Jason Merrill

In C++14 a constexpr function doesn't need to return a value.

Tested x86_64-pc-linux-gnu, applying to trunk.
commit 9675a7bde41b5430197854d8c1822c8f4d95b95e
Author: Jason Merrill 
Date:   Fri Jan 9 01:46:16 2015 -0500

	PR c++/64547
	* constexpr.c (cxx_eval_call_expression): A call to a void
	function doesn't need to return a value.

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index 9a0d518..650250b 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -1386,6 +1386,8 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree t,
 		   value by evaluating *this, but we don't bother; there's
 		   no need to put such a call in the hash table.  */
 		result = lval ? ctx->object : ctx->ctor;
+	  else if (VOID_TYPE_P (TREE_TYPE (res)))
+		result = void_node;
 	  else
 		{
 		  result = *ctx->values->get (slot ? slot : res);
diff --git a/gcc/testsuite/g++.dg/cpp1y/constexpr-void2.C b/gcc/testsuite/g++.dg/cpp1y/constexpr-void2.C
new file mode 100644
index 000..321a35e
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/constexpr-void2.C
@@ -0,0 +1,21 @@
+// PR c++/64547
+// { dg-do compile { target c++14 } }
+
+struct X
+{
+int x;
+constexpr int get() const {return x;}
+constexpr void set(int foo) {x = foo;}
+};
+
+constexpr int bar()
+{
+X x{42};
+x.set(666);
+return x.get();
+}
+
+int main()
+{
+constexpr int foo = bar();
+}


[PATCH][ARM] Fix PR target/64460: Set 'shift' attr properly on some patterns

2015-01-12 Thread Kyrill Tkachov

Hi all,

In this PR we ICE when compiling with -mtune=xscale. The ICE is a
segfault in xscale_sched_adjust_cost.
The root cause is that xscale_sched_adjust_cost uses the value of the
'shift' insn attribute to index the recog operands. In GCC 5 the form
and number of operands in those patterns were updated but the shift
value was not:

Author: rearnsha 
Date:   Thu May 29 09:39:07 2014 +

* arm/iterators.md (shiftable_ops): New code iterator.
(t2_binop0, arith_shift_insn): New code attributes.
* arm/predicates.md (shift_nomul_operator): New predicate.
* arm/arm.md (insn_enabled): Delete.
(enabled): Remove insn_enabled test.
(*arith_shiftsi): Delete.  Replace with ...
(*_multsi): ... new pattern.
(*_shiftsi): ... new pattern.
* config/arm/arm.c (arm_print_operand): Handle operand format 'b'.

This led to an out-of-bounds array access. Only xscale_sched_adjust_cost
uses the shift attribute, so the segfault only happens for xscale tuning.
In the future we might want to use a more general pattern-matching
approach to find the shifted operand in an rtx...
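
As a rough illustration of why a stale attribute value is dangerous, here is
a toy model of "an attribute names the operand that is shifted".  This is
not how xscale_sched_adjust_cost is written; insn_model and shifted_operand
are hypothetical names, and the bounds check is only there to make the
failure mode visible.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical miniature of an instruction whose "shift" attribute names
// the operand being shifted; 0 means "no shifted operand".
struct insn_model
{
  std::vector<int> operands;
  std::size_t shift_attr;
};

// Returns the shifted operand, or nullptr when there is none or when the
// attribute points past the operands the pattern actually has.
const int *shifted_operand(const insn_model &insn)
{
  if (insn.shift_attr == 0 || insn.shift_attr >= insn.operands.size())
    return nullptr;
  return &insn.operands[insn.shift_attr];
}
```

In this model a pattern with operands 0..3 and an attribute value of 4 is
caught by the check, whereas an unchecked lookup would read past the
pattern's operands.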


In any case, this patch fixes the value of 'shift' for the offending
pattern and also updates 'shift' for the *_shiftsi pattern to point to
the correct operand that is being shifted.

Tested arm-none-eabi and bootstrapped with -mtune=xscale in BOOT_CFLAGS.

Ok for trunk?

Thanks,
Kyrill

2015-01-12  Kyrylo Tkachov

PR target/64460
* config/arm/arm.md (*_multsi): Set 'shift' attr
to 2.
(*_shiftsi): Set 'shift' attr to 3.

2015-01-12  Kyrylo Tkachov

PR target/64460
* gcc.target/arm/pr64460_1.c: New test.



Re: [PATCH][ARM] Fix PR target/64460: Set 'shift' attr properly on some patterns

2015-01-12 Thread Kyrill Tkachov

Now with patch attached

Kyrill

On 12/01/15 14:27, Kyrill Tkachov wrote:

Hi all,

In this PR we ICE when compiling with -mtune=xscale. The ICE is a
segfault in xscale_sched_adjust_cost.
The root cause is that xscale_sched_adjust_cost uses the value of the
'shift' insn attribute to index
the recog operands. In GCC 5 the form and number of operands in those
patterns were updated but the
shift value was not:

Author: rearnsha 
Date:   Thu May 29 09:39:07 2014 +

  * arm/iterators.md (shiftable_ops): New code iterator.
  (t2_binop0, arith_shift_insn): New code attributes.
  * arm/predicates.md (shift_nomul_operator): New predicate.
  * arm/arm.md (insn_enabled): Delete.
  (enabled): Remove insn_enabled test.
  (*arith_shiftsi): Delete.  Replace with ...
  (*_multsi): ... new pattern.
  (*_shiftsi): ... new pattern.
  * config/arm/arm.c (arm_print_operand): Handle operand format 'b'.

This led to an out-of-bounds array access. Only xscale_sched_adjust_cost
uses the shift
attribute, so the segfault only happens for xscale tuning. In the future
we might want
to use a more general pattern-matching approach to find the shifted
operand in an rtx...

In any case, this patch fixes the value of 'shift' for the offending
pattern and also
updates 'shift'  for the *_shiftsi pattern to point to
the correct
operand that is being shifted.

Tested arm-none-eabi and bootstrapped with -mtune=xscale in BOOT_CFLAGS.

Ok for trunk?

Thanks,
Kyrill

2015-01-12  Kyrylo Tkachov

  PR target/64460
  * config/arm/arm.md (*_multsi): Set 'shift' attr
  to 2.
  (*_shiftsi): Set 'shift' attr to 3.

2015-01-12  Kyrylo Tkachov

  PR target/64460
  * gcc.target/arm/pr64460_1.c: New test.


commit c89087db2f16eda521d6c938d342570c1d69a7a2
Author: Kyrylo Tkachov 
Date:   Fri Jan 9 16:41:44 2015 +

[ARM] PR target/64460 ICE with -mtune=xscale in shift attr

diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index c61057f..bbefb93 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -8255,36 +8255,36 @@ (define_insn "trap"
 (define_insn "*_multsi"
   [(set (match_operand:SI 0 "s_register_operand" "=r,r")
 	(shiftable_ops:SI
 	 (mult:SI (match_operand:SI 2 "s_register_operand" "r,r")
 		  (match_operand:SI 3 "power_of_two_operand" ""))
 	 (match_operand:SI 1 "s_register_operand" "rk,")))]
   "TARGET_32BIT"
   "%?\\t%0, %1, %2, lsl %b3"
   [(set_attr "predicable" "yes")
(set_attr "predicable_short_it" "no")
-   (set_attr "shift" "4")
+   (set_attr "shift" "2")
(set_attr "arch" "a,t2")
(set_attr "type" "alu_shift_imm")])
 
 (define_insn "*_shiftsi"
   [(set (match_operand:SI 0 "s_register_operand" "=r,r,r")
 	(shiftable_ops:SI
 	 (match_operator:SI 2 "shift_nomul_operator"
 	  [(match_operand:SI 3 "s_register_operand" "r,r,r")
 	   (match_operand:SI 4 "shift_amount_operand" "M,M,r")])
 	 (match_operand:SI 1 "s_register_operand" "rk,,rk")))]
   "TARGET_32BIT && GET_CODE (operands[2]) != MULT"
   "%?\\t%0, %1, %3%S2"
   [(set_attr "predicable" "yes")
(set_attr "predicable_short_it" "no")
-   (set_attr "shift" "4")
+   (set_attr "shift" "3")
(set_attr "arch" "a,t2,a")
(set_attr "type" "alu_shift_imm,alu_shift_imm,alu_shift_reg")])
 
 (define_split
   [(set (match_operand:SI 0 "s_register_operand" "")
 	(match_operator:SI 1 "shiftable_operator"
 	 [(match_operator:SI 2 "shiftable_operator"
 	   [(match_operator:SI 3 "shift_operator"
 	 [(match_operand:SI 4 "s_register_operand" "")
 	  (match_operand:SI 5 "reg_or_int_operand" "")])
diff --git a/gcc/testsuite/gcc.target/arm/pr64460_1.c b/gcc/testsuite/gcc.target/arm/pr64460_1.c
new file mode 100644
index 000..ee6ad4a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pr64460_1.c
@@ -0,0 +1,69 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mtune=xscale" } */
+
+typedef unsigned int size_t;
+typedef short unsigned int __uint16_t;
+typedef long unsigned int __uint32_t;
+typedef unsigned int __uintptr_t;
+typedef __uint16_t uint16_t ;
+typedef __uint32_t uint32_t ;
+typedef __uintptr_t uintptr_t;
+typedef uint32_t Objects_Id;
+typedef uint16_t Objects_Maximum;
+typedef struct { } Objects_Control;
+
+static __inline__ void *_Addresses_Align_up (void *address, size_t alignment)
+{
+	uintptr_t mask = alignment - (uintptr_t)1;
+	return (void*)(((uintptr_t)address + mask) & ~mask);
+}
+
+typedef struct {
+	Objects_Id minimum_id;
+	Objects_Maximum maximum;
+	_Bool
+		auto_extend;
+	Objects_Maximum allocation_size;
+	void **object_blocks;
+} Objects_Information;
+
+extern uint32_t _Objects_Get_index (Objects_Id);
+extern void** _Workspace_Allocate (size_t);
+
+void _Objects_Extend_information (Objects_Information *information)
+{
+	uint32_t block_count;
+	uint32_t minimum_index;
+	uint32_t maximum;
+	size_t block_size;
+	_Bool
+		do_extend =
+		minimum_index = _Objects_Get_index( information->minimum_id );
+	if ( informati

Re: [PATCH][ARM] Implement TARGET_SCHED_MACRO_FUSION_PAIR_P

2015-01-12 Thread Ramana Radhakrishnan
On Thu, Dec 4, 2014 at 9:19 AM, Kyrill Tkachov  wrote:
>
> On 02/12/14 22:58, Ramana Radhakrishnan wrote:
>>
>> On Tue, Nov 11, 2014 at 11:55 AM, Kyrill Tkachov 
>> wrote:
>>>
>>> Hi all,
>>>
>>> This is the arm implementation of the macro fusion hook.
>>> It tries to fuse movw+movt operations together. It also tries to take
>>> lo_sum
>>> RTXs into account since those generate movt instructions as well.
>>>
>>> Bootstrapped and tested on arm-none-linux-gnueabihf.
>>>
>>> Ok for trunk?
>>
>>
>>
>>>   if (current_tune->fuseable_ops & ARM_FUSE_MOVW_MOVT)
>>> +{
>>> +  /* We are trying to fuse
>>> + movw imm / movt imm
>>> + instructions as a group that gets scheduled together.  */
>>> +
>>
>> A comment here about the insn structure would be useful.
>
>
> Done. It's similar to the aarch64 adrp+add case. It does make it easier to
> read, thanks.
>
> 2014-12-04  Kyrylo Tkachov  kyrylo.tkac...@arm.com\
>
>   * config/arm/arm-protos.h (tune_params): Add fuseable_ops field.
>   * config/arm/arm.c (arm_macro_fusion_p): New function.
>   (arm_macro_fusion_pair_p): Likewise.
>   (TARGET_SCHED_MACRO_FUSION_P): Define.
>   (TARGET_SCHED_MACRO_FUSION_PAIR_P): Likewise.
>   (ARM_FUSE_NOTHING): Likewise.
>   (ARM_FUSE_MOVW_MOVT): Likewise.
>   (arm_slowmul_tune, arm_fastmul_tune, arm_strongarm_tune,
>   arm_xscale_tune, arm_9e_tune, arm_v6t2_tune, arm_cortex_tune,
>   arm_cortex_a8_tune, arm_cortex_a7_tune, arm_cortex_a15_tune,
>   arm_cortex_a53_tune, arm_cortex_a57_tune, arm_cortex_a9_tune,
>   arm_cortex_a12_tune, arm_v7m_tune, arm_v6m_tune, arm_fa726te_tune
>   arm_cortex_a5_tune): Specify fuseable_ops value.
>
>>
>>> +  set_dest = SET_DEST (curr_set);
>>> +  if (GET_CODE (set_dest) == ZERO_EXTRACT)
>>> +{
>>> +  if (CONST_INT_P (SET_SRC (curr_set))
>>> +  && CONST_INT_P (SET_SRC (prev_set))
>>> +  && REG_P (XEXP (set_dest, 0))
>>> +  && REG_P (SET_DEST (prev_set))
>>> +  && REGNO (XEXP (set_dest, 0)) == REGNO (SET_DEST (prev_set)))
>>> +return true;
>>> +}
>>> +  else if (GET_CODE (SET_SRC (curr_set)) == LO_SUM
>>> +   && REG_P (SET_DEST (curr_set))
>>> +   && REG_P (SET_DEST (prev_set))
>>> +   && GET_CODE (SET_SRC (prev_set)) == HIGH
>>> +   && REGNO (SET_DEST (curr_set)) == REGNO (SET_DEST
>>> (prev_set)))
>>> +{
>>> +  return true;
>>> +}
>>
>> Can we add a fast path exit to be
>>
>> if (GET_MODE (set_dest) != SImode)
>>return false;
>
>
> Done, but if/when we extend the function to handle more fusion cases it will
> need to be
> refactored, since we will want to just bail out of this MOVW+MOVT case
> rather than the whole function.

Sure -

>
>>
>> I did think whether we wanted to use reg_overlap_mentioned_p as that
>> may simplify the logic a bit but that's  overkill here as we still
>> want to restrict it to the cases above.
>>
>> Otherwise OK.
>
>
> Here's the updated patch. I've tested on arm-none-eabi and made sure that
> the
> fusion still happens on the benchmarks I looked at.
> Ok?

Ok - thanks, sorry about the slow response - been on vacation and
still catching up.

regards
Ramana

>
> Thanks,
> Kyrill
>
>
>>
>> Ramana
>>
>>
>>
>>
>>> +}
>>> +  return false;
>>> Thanks,
>>> Kyrill
>>>
>>> 2014-11-11  Kyrylo Tkachov  
>>>
>>>  * config/arm/arm-protos.h (tune_params): Add fuseable_ops field.
>>>  * config/arm/arm.c (arm_macro_fusion_p): New function.
>>>  (arm_macro_fusion_pair_p): Likewise.
>>>  (TARGET_SCHED_MACRO_FUSION_P): Define.
>>>  (TARGET_SCHED_MACRO_FUSION_PAIR_P): Likewise.
>>>  (ARM_FUSE_NOTHING): Likewise.
>>>  (ARM_FUSE_MOVW_MOVT): Likewise.
>>>  (arm_slowmul_tune, arm_fastmul_tune, arm_strongarm_tune,
>>>  arm_xscale_tune, arm_9e_tune, arm_v6t2_tune, arm_cortex_tune,
>>>  arm_cortex_a8_tune, arm_cortex_a7_tune, arm_cortex_a15_tune,
>>>  arm_cortex_a53_tune, arm_cortex_a57_tune, arm_cortex_a9_tune,
>>>  arm_cortex_a12_tune, arm_v7m_tune, arm_v6m_tune, arm_fa726te_tune
>>>  arm_cortex_a5_tune): Specify fuseable_ops value.


Re: [PATCH 7/10] OpenACC 2.0 support for libgomp - OpenACC runtime, NVidia PTX/CUDA plugin

2015-01-12 Thread Thomas Schwinge
Hi!

On Tue, 23 Sep 2014 19:19:31 +0100, Julian Brown wrote:
> This patch contains the bulk of the OpenACC 2.0 runtime support, [...]

> --- /dev/null
> +++ b/libgomp/libgomp-plugin.c
> @@ -0,0 +1,106 @@

> +/* Exported (non-hidden) functions exposing libgomp interface for plugins.  
> */

> +void
> +gomp_plugin_mutex_init (gomp_mutex_t *mutex)
> +{
> +  gomp_mutex_init (mutex);
> +}
> +
> +void
> +gomp_plugin_mutex_destroy (gomp_mutex_t *mutex)
> +{
> +  gomp_mutex_destroy (mutex);
> +}
> +
> +void
> +gomp_plugin_mutex_lock (gomp_mutex_t *mutex)
> +{
> +  gomp_mutex_lock (mutex);
> +}
> +
> +void
> +gomp_plugin_mutex_unlock (gomp_mutex_t *mutex)
> +{
> +  gomp_mutex_unlock (mutex);
> +}

> --- a/libgomp/libgomp.map
> +++ b/libgomp/libgomp.map

> +PLUGIN_1.0 {
> +  global:

> + gomp_plugin_mutex_init;
> + gomp_plugin_mutex_destroy;
> + gomp_plugin_mutex_lock;
> + gomp_plugin_mutex_unlock;

> +};

> --- /dev/null
> +++ b/libgomp/plugin-nvptx.c
> @@ -0,0 +1,1854 @@
> +/* Plugin for NVPTX execution.

> +#include "libgomp.h"

Plugins in libgomp are not to depend on libgomp internals (libgomp.h),
and given that...

> +struct PTX_device
> +{

> +  /* A lock for use when manipulating the above stream list and array.  */
> +  gomp_mutex_t stream_lock;

> +};

> +static gomp_mutex_t PTX_event_lock;

> +static void
> +init_streams_for_device (struct PTX_device *ptx_dev, int concurrency)
> +{

> +  gomp_plugin_mutex_init (&ptx_dev->stream_lock);

> +}
> +[...]

... it makes much more sense to just use pthread mutexes here.  Committed
to gomp-4_0-branch in r219467:

commit 4de7ea8222739fa60d6eb81284dac61dc2bae7b2
Author: tschwinge 
Date:   Mon Jan 12 14:35:51 2015 +

libgomp: Use pthread mutexes in the nvptx plugin.

... instead of libgomp's internal mutex implementation.  Plugins aren't to
depend on internal libgomp interfaces, and how would you instantiate a
gomp_mutex_t in a plugin without knowing what it is exactly?

libgomp/
* plugin/plugin-nvptx.c (struct ptx_device): Turn stream_lock
member into a pthread_mutex_t.  Adjust all users.
(ptx_event_lock): Likewise.
* libgomp-plugin.c (GOMP_PLUGIN_mutex_init)
(GOMP_PLUGIN_mutex_destroy, GOMP_PLUGIN_mutex_lock)
(GOMP_PLUGIN_mutex_unlock): Remove.
* libgomp-plugin.h (GOMP_PLUGIN_mutex_init)
(GOMP_PLUGIN_mutex_destroy, GOMP_PLUGIN_mutex_lock)
(GOMP_PLUGIN_mutex_unlock): Likewise.
* libgomp.map (GOMP_PLUGIN_1.0): Remove GOMP_PLUGIN_mutex_init,
GOMP_PLUGIN_mutex_destroy, GOMP_PLUGIN_mutex_lock,
GOMP_PLUGIN_mutex_unlock.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@219467 138bc75d-0d04-0410-961f-82ee72b054a4
---
 libgomp/ChangeLog.gomp| 15 +++
 libgomp/libgomp-plugin.c  | 24 
 libgomp/libgomp-plugin.h  |  7 ---
 libgomp/libgomp.map   |  4 
 libgomp/plugin/plugin-nvptx.c | 39 ---
 5 files changed, 35 insertions(+), 54 deletions(-)

diff --git libgomp/ChangeLog.gomp libgomp/ChangeLog.gomp
index 745b836..d955a85 100644
--- libgomp/ChangeLog.gomp
+++ libgomp/ChangeLog.gomp
@@ -1,3 +1,18 @@
+2015-01-12  Thomas Schwinge  
+
+   * plugin/plugin-nvptx.c (struct ptx_device): Turn stream_lock
+   member into a pthread_mutex_t.  Adjust all users.
+   (ptx_event_lock): Likewise.
+   * libgomp-plugin.c (GOMP_PLUGIN_mutex_init)
+   (GOMP_PLUGIN_mutex_destroy, GOMP_PLUGIN_mutex_lock)
+   (GOMP_PLUGIN_mutex_unlock): Remove.
+   * libgomp-plugin.h (GOMP_PLUGIN_mutex_init)
+   (GOMP_PLUGIN_mutex_destroy, GOMP_PLUGIN_mutex_lock)
+   (GOMP_PLUGIN_mutex_unlock): Likewise.
+   * libgomp.map (GOMP_PLUGIN_1.0): Remove GOMP_PLUGIN_mutex_init,
+   GOMP_PLUGIN_mutex_destroy, GOMP_PLUGIN_mutex_lock,
+   GOMP_PLUGIN_mutex_unlock.
+
 2014-12-22  Thomas Schwinge  
 
* libgomp.c (struct gomp_device_descr): Add lock member.
diff --git libgomp/libgomp-plugin.c libgomp/libgomp-plugin.c
index 0026270..77e250e 100644
--- libgomp/libgomp-plugin.c
+++ libgomp/libgomp-plugin.c
@@ -82,27 +82,3 @@ GOMP_PLUGIN_fatal (const char *msg, ...)
   /* Unreachable.  */
   abort ();
 }
-
-void
-GOMP_PLUGIN_mutex_init (gomp_mutex_t *mutex)
-{
-  gomp_mutex_init (mutex);
-}
-
-void
-GOMP_PLUGIN_mutex_destroy (gomp_mutex_t *mutex)
-{
-  gomp_mutex_destroy (mutex);
-}
-
-void
-GOMP_PLUGIN_mutex_lock (gomp_mutex_t *mutex)
-{
-  gomp_mutex_lock (mutex);
-}
-
-void
-GOMP_PLUGIN_mutex_unlock (gomp_mutex_t *mutex)
-{
-  gomp_mutex_unlock (mutex);
-}
diff --git libgomp/libgomp-plugin.h libgomp/libgomp-plugin.h
index 051d4e2..2e2be1f 100644
--- libgomp/libgomp-plugin.h
+++ libgomp/libgomp-plugin.h
@@ -29,8 +29,6 @@
 #ifndef LIBGOMP_PLUGIN_H
 #define LIBGOMP_PLUGIN_H 1
 
-#include "mutex.h"
-
 extern void *GOMP_PLUGIN_malloc (size_t) __attribute__((malloc));
 extern void *GOMP_PLUGIN_malloc_cleared (

[PATCH] Fix PR64404

2015-01-12 Thread Richard Biener

I am testing the following patch to fix a latent bug in the vectorizer
dealing with redundant DRs.

Bootstrap and regtest pending on x86_64-unknown-linux-gnu.

Richard.

2015-01-12  Richard Biener  

PR tree-optimization/64404
* tree-vect-stmts.c (vectorizable_load): Use the proper
vectorized stmts for CSEing loads with the same DR.

* gcc.dg/vect/pr64404.c: New testcase.

Index: gcc/tree-vect-stmts.c
===
--- gcc/tree-vect-stmts.c   (revision 219446)
+++ gcc/tree-vect-stmts.c   (working copy)
@@ -6155,7 +6155,7 @@ vectorizable_load (gimple stmt, gimple_s
 is even wrong code.  See PR56270.  */
  && !slp)
{
- *vec_stmt = STMT_VINFO_VEC_STMT (stmt_info);
+ *vec_stmt = STMT_VINFO_VEC_STMT (vinfo_for_stmt (first_stmt));
  return true;
}
   first_dr = STMT_VINFO_DATA_REF (vinfo_for_stmt (first_stmt));
Index: gcc/testsuite/gcc.dg/vect/pr64404.c
===
--- gcc/testsuite/gcc.dg/vect/pr64404.c (revision 0)
+++ gcc/testsuite/gcc.dg/vect/pr64404.c (working copy)
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-additional-options "--param=sccvn-max-alias-queries-per-access=1" } */
+
+typedef struct
+{
+  float l, h;
+} tFPinterval;
+
+tFPinterval X[1024];
+tFPinterval Y[1024];
+tFPinterval Z[1024];
+
+void
+Compute (void)
+{
+  int d;
+  for (d = 0; d < 1024; d++)
+{
+  Y[d].l = X[d].l + X[d].h;
+  Y[d].h = Y[d].l;
+  Z[d].l = X[d].l;
+  Z[d].h = X[d].h;
+}
+}
+
+/* { dg-final { cleanup-tree-dump "vect" } } */


Re: [PATCH 7/10] OpenACC 2.0 support for libgomp - OpenACC runtime, NVidia PTX/CUDA plugin

2015-01-12 Thread Thomas Schwinge
Hi!

On Tue, 23 Sep 2014 19:19:31 +0100, Julian Brown wrote:
> This patch contains the bulk of the OpenACC 2.0 runtime support, [...]

> --- /dev/null
> +++ b/libgomp/libgomp-plugin.h
> @@ -0,0 +1,57 @@

> +/* An interface to various libgomp-internal functions for use by plugins.  */

..., and in parallel, a libgomp_target.h file came into existence.  In
gomp-4_0-branch's r219468, I have now merged the two into the one with -- in
my opinion -- the more descriptive name:

commit 5024605e60ed2a42fefaa6882ac0ca7493643460
Author: tschwinge 
Date:   Mon Jan 12 14:47:46 2015 +

libgomp: Merge libgomp_target.h into libgomp-plugin.h.

libgomp/
* env.c: Don't include "libgomp_target.h".
* libgomp-plugin.c: Likewise.
* oacc-async.c: Likewise.
* oacc-cuda.c: Likewise.
* oacc-init.c: Likewise.
* oacc-mem.c: Likewise.
* oacc-parallel.c: Likewise.
* oacc-plugin.c: Likewise.
* plugin/plugin-host.c: Likewise.
* plugin/plugin-nvptx.c: Likewise.
* target.c: Likewise.
* libgomp_target.h: Remove file after merging its content into...
* libgomp-plugin.h: ... this file.  Adjust all users.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@219468 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/config/i386/intelmic-mkoffload.c |  2 +-
 libgomp/ChangeLog.gomp   | 14 +++
 libgomp/env.c|  1 -
 libgomp/libgomp-plugin.c |  1 -
 libgomp/libgomp-plugin.h | 37 +
 libgomp/libgomp.h|  2 +-
 libgomp/libgomp_target.h | 53 
 libgomp/oacc-async.c |  1 -
 libgomp/oacc-cuda.c  |  1 -
 libgomp/oacc-init.c  |  1 -
 libgomp/oacc-mem.c   |  1 -
 libgomp/oacc-parallel.c  |  1 -
 libgomp/oacc-plugin.c|  1 -
 libgomp/plugin/plugin-host.c |  1 -
 libgomp/plugin/plugin-nvptx.c|  1 -
 libgomp/target.c |  1 -
 liboffloadmic/plugin/libgomp-plugin-intelmic.cpp |  2 +-
 17 files changed, 54 insertions(+), 67 deletions(-)

diff --git gcc/config/i386/intelmic-mkoffload.c gcc/config/i386/intelmic-mkoffload.c
index 050f2e6..edc3f92 100644
--- gcc/config/i386/intelmic-mkoffload.c
+++ gcc/config/i386/intelmic-mkoffload.c
@@ -22,13 +22,13 @@
 
 #include "config.h"
 #include 
+#include "libgomp-plugin.h"
 #include "system.h"
 #include "coretypes.h"
 #include "obstack.h"
 #include "intl.h"
 #include "diagnostic.h"
 #include "collect-utils.h"
-#include 
 
 const char tool_name[] = "intelmic mkoffload";
 
diff --git libgomp/ChangeLog.gomp libgomp/ChangeLog.gomp
index d955a85..76f21e6 100644
--- libgomp/ChangeLog.gomp
+++ libgomp/ChangeLog.gomp
@@ -1,5 +1,19 @@
 2015-01-12  Thomas Schwinge  
 
+   * env.c: Don't include "libgomp_target.h".
+   * libgomp-plugin.c: Likewise.
+   * oacc-async.c: Likewise.
+   * oacc-cuda.c: Likewise.
+   * oacc-init.c: Likewise.
+   * oacc-mem.c: Likewise.
+   * oacc-parallel.c: Likewise.
+   * oacc-plugin.c: Likewise.
+   * plugin/plugin-host.c: Likewise.
+   * plugin/plugin-nvptx.c: Likewise.
+   * target.c: Likewise.
+   * libgomp_target.h: Remove file after merging its content into...
+   * libgomp-plugin.h: ... this file.  Adjust all users.
+
* plugin/plugin-nvptx.c (struct ptx_device): Turn stream_lock
member into a pthread_mutex_t.  Adjust all users.
(ptx_event_lock): Likewise.
diff --git libgomp/env.c libgomp/env.c
index 81460dc..130c52c 100644
--- libgomp/env.c
+++ libgomp/env.c
@@ -28,7 +28,6 @@
 
 #include "libgomp.h"
 #include "libgomp_f.h"
-#include "libgomp_target.h"
 #include "oacc-int.h"
 #include 
 #include 
diff --git libgomp/libgomp-plugin.c libgomp/libgomp-plugin.c
index 77e250e..1dd33f5 100644
--- libgomp/libgomp-plugin.c
+++ libgomp/libgomp-plugin.c
@@ -30,7 +30,6 @@
 
 #include "libgomp.h"
 #include "libgomp-plugin.h"
-#include "libgomp_target.h"
 
 void *
 GOMP_PLUGIN_malloc (size_t size)
diff --git libgomp/libgomp-plugin.h libgomp/libgomp-plugin.h
index 2e2be1f..c8383e1 100644
--- libgomp/libgomp-plugin.h
+++ libgomp/libgomp-plugin.h
@@ -29,6 +29,39 @@
 #ifndef LIBGOMP_PLUGIN_H
 #define LIBGOMP_PLUGIN_H 1
 
+#include 
+#include 
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/* Capabilities of offloading devices.  */
+#define GOMP_OFFLOAD_CAP_SHARED_MEM    (1 << 0)
+#define GOMP_OFFLOAD_CAP_NATIVE_EXEC   (1 << 1)
+#define GOMP_OFFLOAD_CAP_OPENMP_400    (1 << 2)
+#define GOMP_OFFLOAD_CAP_OPENACC_200   (1 << 3)
+
+/* Type of offload target device.  Keep in sync with include/gomp-constants.h.  */
+enum offload_target_type
+{
+  OFFLOAD_TARGET_TYPE_HOST 

Re: [PATCH 7/10] OpenACC 2.0 support for libgomp - OpenACC runtime, NVidia PTX/CUDA plugin

2015-01-12 Thread Thomas Schwinge
Hi!

On Mon, 12 Jan 2015 15:37:46 +0100, I wrote:
> On Tue, 23 Sep 2014 19:19:31 +0100, Julian Brown wrote:
> > This patch contains the bulk of the OpenACC 2.0 runtime support, [...]

> > --- /dev/null
> > +++ b/libgomp/plugin-nvptx.c
> > @@ -0,0 +1,1854 @@
> > +/* Plugin for NVPTX execution.
> 
> > +#include "libgomp.h"
> 
> Plugins in libgomp are not to depend on libgomp internals (libgomp.h),

> ... it makes much more sense to just use pthread mutexes here.  Committed
> to gomp-4_0-branch in r219467:
> 
> commit 4de7ea8222739fa60d6eb81284dac61dc2bae7b2
> Author: tschwinge 
> Date:   Mon Jan 12 14:35:51 2015 +
> 
> libgomp: Use pthread mutexes in the nvptx plugin.
> 
> ... instead of libgomp's internal mutex implementation.  Plugins aren't to
> depend on internal libgomp interfaces, and how would you instantiate a
> gomp_mutex_t in a plugin without knowing what it is exactly?

Given this, we can then tighten the libgomp plugins' include files;
committed to gomp-4_0-branch in r219469:

commit 7c011e60ec4e056e4c1b054966fd95fb2cb5e44a
Author: tschwinge 
Date:   Mon Jan 12 14:53:53 2015 +

libgomp: Don't use internal libgomp.h for plugins.

..., and explicitly link libgomp plugins against libgomp.

libgomp/
* plugin/plugin-host.c [HOST_NONSHM_PLUGIN]: Don't include "libgomp.h".
* plugin/plugin-nvptx.c: Likewise.  Include .
* plugin/Makefrag.am (libgomp_plugin_nvptx_la_LIBADD)
(libgomp_plugin_host_nonshm_la_LIBADD): Append "libgomp.la".
* Makefile.in: Regenerate.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@219469 138bc75d-0d04-0410-961f-82ee72b054a4
---
 libgomp/ChangeLog.gomp| 6 ++
 libgomp/Makefile.in   | 7 ---
 libgomp/plugin/Makefrag.am| 3 ++-
 libgomp/plugin/plugin-host.c  | 2 +-
 libgomp/plugin/plugin-nvptx.c | 2 +-
 5 files changed, 14 insertions(+), 6 deletions(-)

diff --git libgomp/ChangeLog.gomp libgomp/ChangeLog.gomp
index 76f21e6..c2566cf 100644
--- libgomp/ChangeLog.gomp
+++ libgomp/ChangeLog.gomp
@@ -1,5 +1,11 @@
 2015-01-12  Thomas Schwinge  
 
+   * plugin/plugin-host.c [HOST_NONSHM_PLUGIN]: Don't include "libgomp.h".
+   * plugin/plugin-nvptx.c: Likewise.  Include .
+   * plugin/Makefrag.am (libgomp_plugin_nvptx_la_LIBADD)
+   (libgomp_plugin_host_nonshm_la_LIBADD): Append "libgomp.la".
+   * Makefile.in: Regenerate.
+
* env.c: Don't include "libgomp_target.h".
* libgomp-plugin.c: Likewise.
* oacc-async.c: Likewise.
diff --git libgomp/Makefile.in libgomp/Makefile.in
index ac34b97..8758989 100644
--- libgomp/Makefile.in
+++ libgomp/Makefile.in
@@ -123,7 +123,7 @@ am__installdirs = "$(DESTDIR)$(toolexeclibdir)" "$(DESTDIR)$(infodir)" \
"$(DESTDIR)$(fincludedir)" "$(DESTDIR)$(libsubincludedir)" \
"$(DESTDIR)$(toolexeclibdir)"
 LTLIBRARIES = $(toolexeclib_LTLIBRARIES)
-libgomp_plugin_host_nonshm_la_LIBADD =
+libgomp_plugin_host_nonshm_la_DEPENDENCIES = libgomp.la
 am_libgomp_plugin_host_nonshm_la_OBJECTS =  \
libgomp_plugin_host_nonshm_la-plugin-host.lo
 libgomp_plugin_host_nonshm_la_OBJECTS =  \
@@ -133,7 +133,7 @@ libgomp_plugin_host_nonshm_la_LINK = $(LIBTOOL) --tag=CC \
--mode=link $(CCLD) $(AM_CFLAGS) $(CFLAGS) \
$(libgomp_plugin_host_nonshm_la_LDFLAGS) $(LDFLAGS) -o $@
 am__DEPENDENCIES_1 =
-@PLUGIN_NVPTX_TRUE@libgomp_plugin_nvptx_la_DEPENDENCIES =  \
+@PLUGIN_NVPTX_TRUE@libgomp_plugin_nvptx_la_DEPENDENCIES = libgomp.la \
 @PLUGIN_NVPTX_TRUE@$(am__DEPENDENCIES_1)
 @PLUGIN_NVPTX_TRUE@am_libgomp_plugin_nvptx_la_OBJECTS =  \
 @PLUGIN_NVPTX_TRUE@libgomp_plugin_nvptx_la-plugin-nvptx.lo
@@ -407,7 +407,7 @@ libgomp_la_SOURCES = alloc.c barrier.c critical.c env.c error.c iter.c \
 @PLUGIN_NVPTX_TRUE@libgomp_plugin_nvptx_la_LDFLAGS =  \
 @PLUGIN_NVPTX_TRUE@$(libgomp_plugin_nvptx_version_info) \
 @PLUGIN_NVPTX_TRUE@$(lt_host_flags) $(PLUGIN_NVPTX_LDFLAGS)
-@PLUGIN_NVPTX_TRUE@libgomp_plugin_nvptx_la_LIBADD = $(PLUGIN_NVPTX_LIBS)
+@PLUGIN_NVPTX_TRUE@libgomp_plugin_nvptx_la_LIBADD = libgomp.la $(PLUGIN_NVPTX_LIBS)
 @PLUGIN_NVPTX_TRUE@libgomp_plugin_nvptx_la_LIBTOOLFLAGS = --tag=disable-static
 libgomp_plugin_host_nonshm_version_info = -version-info $(libtool_VERSION)
 libgomp_plugin_host_nonshm_la_SOURCES = plugin/plugin-host.c
@@ -415,6 +415,7 @@ libgomp_plugin_host_nonshm_la_CPPFLAGS = $(AM_CPPFLAGS) -DHOST_NONSHM_PLUGIN
 libgomp_plugin_host_nonshm_la_LDFLAGS = \
$(libgomp_plugin_host_nonshm_version_info) $(lt_host_flags)
 
+libgomp_plugin_host_nonshm_la_LIBADD = libgomp.la
 libgomp_plugin_host_nonshm_la_LIBTOOLFLAGS = --tag=disable-static
 nodist_noinst_HEADERS = libgomp_f.h
 nodist_libsubinclude_HEADERS = omp.h openacc.h
diff --git libgomp/plugin/Makefrag.am libgomp/plugin/Makefrag.am
index d2c5428..167485f 100644
--- libgomp/plugin/Makefrag.am
+++ libgomp/plugin/Makefrag.am
@@ -35,7 +35,7 @@ libgomp_plugin_nvptx_la_CPPFLAGS = $(AM_CP

Re: [x86, PATCH] operand reordering for commutative operations

2015-01-12 Thread Yuri Rumyantsev
Hi All,

Thanks a lot for your comments.
I've re-written reorder_operands as you proposed, but I'd like to know
if we should apply this reordering at -O0?

I will re-send the patch once testing completes.
Thanks.
Yuri.

2015-01-09 13:13 GMT+03:00 Richard Biener :
> On Mon, Jan 5, 2015 at 9:26 PM, Jeff Law  wrote:
>> On 12/29/14 06:30, Yuri Rumyantsev wrote:
>>>
>>> Hi All,
>>>
>>> Here is a patch which fixes several performance degradations after
>>> operand canonicalization (r216728). A very simple approach is used: if
>>> an operation is commutative and its second operand requires more
>>> operations (statements) to compute, swap the operands.
>>> Currently this is done under a special option which is set to true
>>> only for x86 32-bit targets (we have not seen any performance
>>> improvements on 64-bit).
>>>
>>> Is it OK for trunk?
>>>
>>> 2014-12-26  Yuri Rumyantsev  
>>>
>>> * cfgexpand.c (count_num_stmt): New function.
>>> (reorder_operands): Likewise.
>>> (expand_gimple_basic_block): Insert call of reorder_operands.
>>> * common.opt (flag_reorder_operands): Add new flag.
>>> * config/i386/i386.c (ix86_option_override_internal): Add setup of
>>> flag_reorder_operands for 32-bit target only.
>>> * doc/invoke.texi: Add new optimization option -freorder-operands.
>>>
>>> gcc/testsuite/ChangeLog
>>> * gcc.target/i386/swap_opnd.c: New test.
>>
>> I'd do this unconditionally -- I don't think there's a compelling reason to
>> add another flag here.
>
> Indeed.
>
>> Could you use estimate_num_insns rather than rolling your own estimate code
>> here?  All you have to do is setup the weights structure and call the
>> estimation code.  I wouldn't be surprised if ultimately the existing insn
>> estimator is better than the one you're adding.
>
> Just use eni_size_weights.
>
> Your counting is quadratic, that's a no-go.  You'd have to keep a lattice
> of counts for SSA names to avoid this.  There is swap_ssa_operands (),
> in your swapping code you fail to update SSA operands (maybe non-fatal
> because we are just expanding to RTL, but ...).
>
> bb->loop_father is always non-NULL, but doing this everywhere, not only
> in loops looks fine to me.
>
> You can swap comparison operands on GIMPLE_CONDs for all
> codes by also swapping the EDGE_TRUE_VALUE/EDGE_FALSE_VALUE
> flags on the outgoing BB edges.
>
> There are more cases that can be swapped in regular stmts as well,
> but I suppose we don't need to be "complete" here.
>
> So, in reorder_operands I'd do (pseudo-code)
>
>   n = 0;
>   for-all-stmts
> gimple_set_uid (stmt, n++);
>   lattice = XALLOCVEC (unsigned, n);
>
>   i = 0;
>   for-all-stmts
> this_stmt_cost = estimate_num_insns (stmt, &eni_size_weights);
> lattice[i] = this_stmt_cost;
> FOR_EACH_SSA_USE_OPERAND ()
>if (use-in-this-BB)
>  lattice[i] += lattice[gimple_uid (SSA_NAME_DEF_STMT)];
>  i++;
> swap-if-operand-cost says so
>
> Richard.
>
>> Make sure to reference the PR in the ChangeLog.
>>
>> Please update and resubmit.
>>
>> Thanks,
>> Jeff


Re: [x86, PATCH] operand reordering for commutative operations

2015-01-12 Thread Richard Biener
On Mon, Jan 12, 2015 at 4:00 PM, Yuri Rumyantsev  wrote:
> Hi All,
>
> Thanks a lot for your comments.
> I've re-written reorder_operands as you proposed, but I'd like to know
> if we should apply this reordering at -O0?

No, I think we can spare those cycles there.

Richard.

> I will re-send the patch once testing completes.
> Thanks.
> Yuri.
>
> 2015-01-09 13:13 GMT+03:00 Richard Biener :
>> On Mon, Jan 5, 2015 at 9:26 PM, Jeff Law  wrote:
>>> On 12/29/14 06:30, Yuri Rumyantsev wrote:

>>>> Hi All,
>>>>
>>>> Here is a patch which fixes several performance degradations after
>>>> operand canonicalization (r216728). A very simple approach is used: if
>>>> an operation is commutative and its second operand requires more
>>>> operations (statements) to compute, swap the operands.
>>>> Currently this is done under a special option which is set to true
>>>> only for x86 32-bit targets (we have not seen any performance
>>>> improvements on 64-bit).
>>>>
>>>> Is it OK for trunk?
>>>>
>>>> 2014-12-26  Yuri Rumyantsev
>>>>
>>>> * cfgexpand.c (count_num_stmt): New function.
>>>> (reorder_operands): Likewise.
>>>> (expand_gimple_basic_block): Insert call of reorder_operands.
>>>> * common.opt (flag_reorder_operands): Add new flag.
>>>> * config/i386/i386.c (ix86_option_override_internal): Add setup of
>>>> flag_reorder_operands for 32-bit target only.
>>>> * doc/invoke.texi: Add new optimization option -freorder-operands.
>>>>
>>>> gcc/testsuite/ChangeLog
>>>> * gcc.target/i386/swap_opnd.c: New test.
>>>
>>> I'd do this unconditionally -- I don't think there's a compelling reason to
>>> add another flag here.
>>
>> Indeed.
>>
>>> Could you use estimate_num_insns rather than rolling your own estimate code
>>> here?  All you have to do is setup the weights structure and call the
>>> estimation code.  I wouldn't be surprised if ultimately the existing insn
>>> estimator is better than the one you're adding.
>>
>> Just use eni_size_weights.
>>
>> Your counting is quadratic, that's a no-go.  You'd have to keep a lattice
>> of counts for SSA names to avoid this.  There is swap_ssa_operands (),
>> in your swapping code you fail to update SSA operands (maybe non-fatal
>> because we are just expanding to RTL, but ...).
>>
>> bb->loop_father is always non-NULL, but doing this everywhere, not only
>> in loops looks fine to me.
>>
>> You can swap comparison operands on GIMPLE_CONDs for all
>> codes by also swapping the EDGE_TRUE_VALUE/EDGE_FALSE_VALUE
>> flags on the outgoing BB edges.
>>
>> There are more cases that can be swapped in regular stmts as well,
>> but I suppose we don't need to be "complete" here.
>>
>> So, in reorder_operands I'd do (pseudo-code)
>>
>>   n = 0;
>>   for-all-stmts
>> gimple_set_uid (stmt, n++);
>>   lattice = XALLOCVEC (unsigned, n);
>>
>>   i = 0;
>>   for-all-stmts
>> this_stmt_cost = estimate_num_insns (stmt, &eni_size_weights);
>> lattice[i] = this_stmt_cost;
>> FOR_EACH_SSA_USE_OPERAND ()
>>if (use-in-this-BB)
>>  lattice[i] += lattice[gimple_uid (SSA_NAME_DEF_STMT)];
>>  i++;
>> swap-if-operand-cost says so
>>
>> Richard.
>>
>>> Make sure to reference the PR in the ChangeLog.
>>>
>>> Please update and resubmit.
>>>
>>> Thanks,
>>> Jeff


[PATCH][Aarch64] PR64149: Remove -mlra/-mno-lra option for Aarch64.

2015-01-12 Thread Matthew Wahab

Hello,

The LRA register allocator is enabled by default for the AArch64 backend and
-mno-lra should no longer be used. This patch removes the -mlra/-mno-lra
option for AArch64.


Tested aarch64-none-linux-gnu with gcc-check.

Matthew

2015-01-08  Matthew Wahab  

PR target/64149
* config/aarch64/aarch64.opt: Remove lra option and aarch64_lra_flag
variable.
* config/aarch64/aarch64.c (TARGET_LRA_P): Set to hook_bool_void_true.
(aarch64_lra_p): Remove.

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 5100532..fc0bbad 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -146,7 +146,6 @@ enum aarch64_code_model aarch64_cmodel;
 #define TARGET_HAVE_TLS 1
 #endif
 
-static bool aarch64_lra_p (void);
 static bool aarch64_composite_type_p (const_tree, machine_mode);
 static bool aarch64_vfp_is_call_or_return_candidate (machine_mode,
 		 const_tree,
@@ -7732,13 +7731,6 @@ aapcs_vfp_sub_candidate (const_tree type, machine_mode *modep)
   return -1;
 }
 
-/* Return true if we use LRA instead of reload pass.  */
-static bool
-aarch64_lra_p (void)
-{
-  return aarch64_lra_flag;
-}
-
 /* Return TRUE if the type, as described by TYPE and MODE, is a composite
type as described in AAPCS64 \S 4.3.  This includes aggregate, union and
array types.  The C99 floating-point complex types are also considered
@@ -11053,7 +11045,7 @@ aarch64_gen_adjusted_ldpstp (rtx *operands, bool load,
 #define TARGET_LIBGCC_CMP_RETURN_MODE aarch64_libgcc_cmp_return_mode
 
 #undef TARGET_LRA_P
-#define TARGET_LRA_P aarch64_lra_p
+#define TARGET_LRA_P hook_bool_void_true
 
 #undef TARGET_MANGLE_TYPE
 #define TARGET_MANGLE_TYPE aarch64_mangle_type
diff --git a/gcc/config/aarch64/aarch64.opt b/gcc/config/aarch64/aarch64.opt
index 44c6350..f2ef124 100644
--- a/gcc/config/aarch64/aarch64.opt
+++ b/gcc/config/aarch64/aarch64.opt
@@ -107,10 +107,6 @@ mabi=
 Target RejectNegative Joined Enum(aarch64_abi) Var(aarch64_abi) Init(AARCH64_ABI_DEFAULT)
 -mabi=ABI	Generate code that conforms to the specified ABI
 
-mlra
-Target Report Var(aarch64_lra_flag) Init(1) Save
-Use LRA instead of reload (transitional)
-
 Enum
 Name(aarch64_abi) Type(int)
 Known AArch64 ABIs (for use with the -mabi= option):

[PATCH] Fix PR64436: broken logic to process bitwise ORs in bswap pass

2015-01-12 Thread Thomas Preud'homme
Hi all,

To identify whether a set of loads, shifts, casts, masks (bitwise AND) and
bitwise ORs is equivalent to a load or a byteswap, the bswap pass assigns a
number to each byte loaded according to its significance (1 for the lsb, 2 for
the next least significant byte, etc.) and forms a symbolic number such as
0x04030201 for a 32-bit load. When processing a bitwise OR of two such
symbolic numbers, it is necessary to consider the lowest and highest addresses
where a byte was loaded in order to renumber each byte accordingly. For
instance, if both numbers are 0x04030201 and they were loaded from consecutive
words in memory, the result would be 0x0807060504030201, but if they overlap
fully the result would be 0x04030201.

Currently the computation of the byte with the highest address is broken: it
takes the highest-addressed byte of the symbolic number that starts last. That
is, if one number represents an 8-bit load at address 0x14 and another number
represents a 32-bit load at address 0x12, it will compute the end as 0x14
instead of 0x15. This error affects the computation of the size of the load
for all targets, and the computation of the symbolic number that results from
the bitwise OR for big-endian targets. This is what causes PR64436, due to a
change in the gimple generated for that testcase.
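
The difference between the broken and the corrected end computation can be
sketched in isolation as follows.  This is a simplified model, not the code
from tree-ssa-math-opts.c: load_sketch is a hypothetical stand-in that keeps
only the start address and size of a load, and all function names are
invented for the example.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for the pass's symbolic number:
   only the fields needed to reason about the address range.  */
struct load_sketch
{
  int64_t bytepos;  /* address (offset) of the first byte loaded */
  int64_t range;    /* number of bytes loaded */
};

/* Highest address touched by a single load.  */
static int64_t
load_end_sketch (const struct load_sketch *n)
{
  return n->bytepos + (n->range - 1);
}

/* The broken rule described above: use the end of whichever load
   starts last, even if the other load extends further.  */
static int64_t
broken_end_sketch (const struct load_sketch *n1,
                   const struct load_sketch *n2)
{
  const struct load_sketch *starts_last
    = n1->bytepos < n2->bytepos ? n2 : n1;
  return load_end_sketch (starts_last);
}

/* The corrected rule: compare the two ends directly.  */
static int64_t
merged_end_sketch (const struct load_sketch *n1,
                   const struct load_sketch *n2)
{
  int64_t end1 = load_end_sketch (n1);
  int64_t end2 = load_end_sketch (n2);
  return end1 < end2 ? end2 : end1;
}

/* Number of bytes spanned by the union of the two loads.  */
static int64_t
merged_range_sketch (const struct load_sketch *n1,
                     const struct load_sketch *n2)
{
  int64_t start = n1->bytepos < n2->bytepos ? n1->bytepos : n2->bytepos;
  return merged_end_sketch (n1, n2) - start + 1;
}
```

With an 8-bit load at 0x14 and a 32-bit load at 0x12, the broken rule gives
0x14 while the corrected rule gives 0x15, for a merged range of 4 bytes.
Two 32-bit loads from consecutive words span 8 bytes; two fully overlapping
ones span 4.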

ChangeLog entry is as follows:

gcc/ChangeLog

2014-12-30 Thomas Preud'homme thomas.preudho...@arm.com

PR tree-optimization/64436
* tree-ssa-math-opts.c (find_bswap_or_nop_1): Move code performing the
merge of two symbolic numbers for a bitwise OR to ...
(perform_symbolic_merge): This. Also fix computation of the range and
end of the symbolic number corresponding to the result of a bitwise OR.

diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c
index 1ed2838..286183a 100644
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -1816,6 +1816,123 @@ find_bswap_or_nop_load (gimple stmt, tree ref, struct symbolic_number *n)
   return true;
 }
 
+/* Compute the symbolic number N representing the result of a bitwise OR on 2
+   symbolic number N1 and N2 whose source statements are respectively
+   SOURCE_STMT1 and SOURCE_STMT2.  */
+
+static gimple
+perform_symbolic_merge (gimple source_stmt1, struct symbolic_number *n1,
+   gimple source_stmt2, struct symbolic_number *n2,
+   struct symbolic_number *n)
+{
+  int i, size;
+  uint64_t mask;
+  gimple source_stmt;
+  struct symbolic_number *n_start;
+
+  /* Sources are different, cancel bswap if they are not memory location with
+ the same base (array, structure, ...).  */
+  if (gimple_assign_rhs1 (source_stmt1) != gimple_assign_rhs1 (source_stmt2))
+{
+  int64_t inc;
+  HOST_WIDE_INT start_sub, end_sub, end1, end2, end;
+  struct symbolic_number *toinc_n_ptr, *n_end;
+
+  if (!n1->base_addr || !n2->base_addr
+ || !operand_equal_p (n1->base_addr, n2->base_addr, 0))
+   return NULL;
+
+  if (!n1->offset != !n2->offset ||
+  (n1->offset && !operand_equal_p (n1->offset, n2->offset, 0)))
+   return NULL;
+
+  if (n1->bytepos < n2->bytepos)
+   {
+ n_start = n1;
+ start_sub = n2->bytepos - n1->bytepos;
+ source_stmt = source_stmt1;
+   }
+  else
+   {
+ n_start = n2;
+ start_sub = n1->bytepos - n2->bytepos;
+ source_stmt = source_stmt2;
+   }
+
+  /* Find the highest address at which a load is performed and
+compute related info.  */
+  end1 = n1->bytepos + (n1->range - 1);
+  end2 = n2->bytepos + (n2->range - 1);
+  if (end1 < end2)
+   {
+ end = end2;
+ end_sub = end2 - end1;
+   }
+  else
+   {
+ end = end1;
+ end_sub = end1 - end2;
+   }
+  n_end = (end2 > end1) ? n2 : n1;
+
+  /* Find symbolic number whose lsb is the most significant.  */
+  if (BYTES_BIG_ENDIAN)
+   toinc_n_ptr = (n_end == n1) ? n2 : n1;
+  else
+   toinc_n_ptr = (n_start == n1) ? n2 : n1;
+
+  n->range = end - n_start->bytepos + 1;
+
+  /* Check that the range of memory covered can be represented by
+a symbolic number.  */
+  if (n->range > 64 / BITS_PER_MARKER)
+   return NULL;
+
+  /* Reinterpret byte marks in symbolic number holding the value of
+bigger weight according to target endianness.  */
+  inc = BYTES_BIG_ENDIAN ? end_sub : start_sub;
+  size = TYPE_PRECISION (n1->type) / BITS_PER_UNIT;
+  for (i = 0; i < size; i++, inc <<= BITS_PER_MARKER)
+   {
+ unsigned marker =
+   (toinc_n_ptr->n >> (i * BITS_PER_MARKER)) & MARKER_MASK;
+ if (marker && marker != MARKER_BYTE_UNKNOWN)
+   toinc_n_ptr->n += inc;
+   }
+}
+  else
+{
+  n->range = n1->range;
+  n_start = n1;
+  source_stmt = source_stmt1;
+}
+
+  if (!n1->alias_set
+  || alias_ptr_types_compatible_p (n1->alias_set, n2->alias_set))
+n->ali

[PATCH,MIPS] Add support for the R6 LSA and DLSA instructions

2015-01-12 Thread Matthew Fortune
This patch adds support for the R6 [D]LSA instructions.  The support
has been structured to allow MSA (when implemented) to turn on the
same instructions as they are also added by the MSA ASE.
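
For readers unfamiliar with the instructions, the following sketch models
what a scaled add of this kind computes and how a multiplier can be checked
for suitability.  It is an illustration only: the function names are
hypothetical, exact_log2_sketch merely imitates the behaviour of GCC's
exact_log2 (returning -1 for values that are not a power of two), and the
accepted shift range of 1 to 4 is my assumption about the instruction
encoding rather than something stated in this message.

```c
#include <stdint.h>

/* Return log2 of X if X is a power of two, otherwise -1.  */
static int
exact_log2_sketch (uint64_t x)
{
  int n = 0;

  if (x == 0 || (x & (x - 1)) != 0)
    return -1;
  while ((x >>= 1) != 0)
    n++;
  return n;
}

/* Nonzero if multiplying by MULTIPLIER could be folded into a scaled
   add.  The 1..4 shift range is an assumption for this sketch.  */
static int
lsa_candidate_sketch (uint64_t multiplier)
{
  int shift = exact_log2_sketch (multiplier);
  return shift >= 1 && shift <= 4;
}

/* Model of the 32-bit operation: scale INDEX by 2**SHIFT, add BASE.  */
static uint32_t
lsa_model_sketch (uint32_t index, uint32_t base, int shift)
{
  return (index << shift) + base;
}
```

An array access such as base + index * 4 is the typical case: the
multiplier 4 has an exact log2 of 2, so the multiply and the add can be
expressed as one scaled add.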

I have continued to use the idea of 'ghost' options in the testsuite to
indicate what features are required rather than arch revisions.

Thanks,
Matthew

gcc/

* config/mips/mips.c (mips_rtx_costs): Set costs for LSA/DLSA.
(mips_print_operand): Support 'y' to print exact log2 in decimal
of a const_int.
* config/mips/mips.h (ISA_HAS_LSA): New define.
(ISA_HAS_DLSA): Likewise.
* config/mips/mips.md (lsa): New define_insn.
* config/mips/predicates.md (const_immlsa_operand): New predicate.

gcc/testsuite/

* gcc.target/mips/lsa.c: New file.
* gcc.target/mips/mips64-lsa.c: Likewise.
* gcc.target/mips/mulsize-2.c: Require !HAS_LSA.
* gcc.target/mips/mulsize-4.c: Likewise.
* gcc.target/mips/mulsize-5.c: New file.
* gcc.target/mips/mulsize-6.c: Likewise.
* gcc.target/mips/mips.exp (mips_option_groups): Support HAS_LSA
and !HAS_LSA as ghost options.
(mips-dg-options): Require rev 6 for HAS_LSA. Downgrade to rev 5
for !HAS_LSA.
---
 gcc/config/mips/mips.c | 30 ++
 gcc/config/mips/mips.h |  6 ++
 gcc/config/mips/mips.md| 10 ++
 gcc/config/mips/predicates.md  |  4 
 gcc/testsuite/gcc.target/mips/lsa.c| 11 +++
 gcc/testsuite/gcc.target/mips/mips.exp | 16 ++--
 gcc/testsuite/gcc.target/mips/mips64-lsa.c | 11 +++
 gcc/testsuite/gcc.target/mips/mulsize-2.c  |  1 +
 gcc/testsuite/gcc.target/mips/mulsize-4.c  |  1 +
 gcc/testsuite/gcc.target/mips/mulsize-5.c  | 13 +
 gcc/testsuite/gcc.target/mips/mulsize-6.c  | 13 +
 11 files changed, 114 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/mips/lsa.c
 create mode 100644 gcc/testsuite/gcc.target/mips/mips64-lsa.c
 create mode 100644 gcc/testsuite/gcc.target/mips/mulsize-5.c
 create mode 100644 gcc/testsuite/gcc.target/mips/mulsize-6.c

diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
index c2cc76e..a858a84 100644
--- a/gcc/config/mips/mips.c
+++ b/gcc/config/mips/mips.c
@@ -4108,6 +4108,22 @@ mips_rtx_costs (rtx x, int code, int outer_code, int opno ATTRIBUTE_UNUSED,
  return false;
}
 
+  /* If it's an add + mult (which is equivalent to shift left) and
+ its immediate operand satisfies const_immlsa_operand predicate.  */
+  if (((ISA_HAS_LSA && mode == SImode)
+  || (ISA_HAS_DLSA && mode == DImode))
+ && GET_CODE (XEXP (x, 0)) == MULT)
+   {
+ rtx op2 = XEXP (XEXP (x, 0), 1);
+ if (const_immlsa_operand (op2, mode))
+   {
+ *total = (COSTS_N_INSNS (1)
+   + set_src_cost (XEXP (XEXP (x, 0), 0), speed)
+   + set_src_cost (XEXP (x, 1), speed));
+ return true;
+   }
+   }
+
   /* Double-word operations require three single-word operations and
 an SLTU.  The MIPS16 version then needs to move the result of
 the SLTU from $24 to a MIPS16 register.  */
@@ -8413,6 +8429,7 @@ mips_print_operand_punct_valid_p (unsigned char code)
'x' Print the low 16 bits of CONST_INT OP in hexadecimal format.
'd' Print CONST_INT OP in decimal.
'm' Print one less than CONST_INT OP in decimal.
+   'y' Print exact log2 of CONST_INT OP in decimal.
'h' Print the high-part relocation associated with OP, after stripping
  any outermost HIGH.
'R' Print the low-part relocation associated with OP.
@@ -8476,6 +8493,19 @@ mips_print_operand (FILE *file, rtx op, int letter)
output_operand_lossage ("invalid use of '%%%c'", letter);
   break;
 
+case 'y':
+  if (CONST_INT_P (op))
+   {
+ int val = exact_log2 (INTVAL (op));
+ if (val != -1)
+   fprintf (file, "%d", val);
+ else
+   output_operand_lossage ("invalid use of '%%%c'", letter);
+   }
+  else
+   output_operand_lossage ("invalid use of '%%%c'", letter);
+  break;
+
 case 'h':
   if (code == HIGH)
op = XEXP (op, 0);
diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h
index 3d95a58..37d4cb4 100644
--- a/gcc/config/mips/mips.h
+++ b/gcc/config/mips/mips.h
@@ -181,6 +181,12 @@ struct mips_cpu_info {
 #define ISA_HAS_DSP_MULT ISA_HAS_DSPR2
 #endif
 
+/* ISA has LSA available.  */
+#define ISA_HAS_LSA    (mips_isa_rev >= 6)
+
+/* ISA has DLSA available.  */
+#define ISA_HAS_DLSA   (TARGET_64BIT && mips_isa_rev >= 6)
+
 /* The ISA compression flags that are currently in effect.  */
 #define TARGET_COMPRESSION (target_flags & (MASK_MIPS16 | MASK_MICROMIPS))
 
diff --git a/gcc/config/mips/mips.md b/gcc/config/mips/

[PATCH] Fix PR64568

2015-01-12 Thread Richard Biener

The following avoids splitting TARGET_MEM_REFs by wrapping
REAL/IMAGPART_EXPRs around them, which isn't allowed.

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Richard.

2015-01-12  Richard Biener  

PR tree-optimization/64568
* tree-ssa-forwprop.c (pass_forwprop::execute): Properly
release defs of removed stmts, avoid splitting TARGET_MEM_REFs.

* g++.dg/torture/pr64568.C: New testcase.

Index: gcc/tree-ssa-forwprop.c
===
--- gcc/tree-ssa-forwprop.c (revision 219446)
+++ gcc/tree-ssa-forwprop.c (working copy)
@@ -2267,6 +2267,8 @@ pass_forwprop::execute (function *fun)
 
  gsi_insert_before (&gsi, new_stmt, GSI_SAME_STMT);
}
+
+ release_defs (stmt);
  gsi_remove (&gsi, true);
}
  else
@@ -2281,7 +2283,9 @@ pass_forwprop::execute (function *fun)
  if (single_imm_use (lhs, &use_p, &use_stmt)
  && gimple_store_p (use_stmt)
  && !gimple_has_volatile_ops (use_stmt)
- && is_gimple_assign (use_stmt))
+ && is_gimple_assign (use_stmt)
+ && (TREE_CODE (gimple_assign_lhs (use_stmt))
+ != TARGET_MEM_REF))
{
  tree use_lhs = gimple_assign_lhs (use_stmt);
  tree new_lhs = build1 (REALPART_EXPR,
@@ -2302,6 +2306,7 @@ pass_forwprop::execute (function *fun)
  gimple_assign_set_rhs1 (use_stmt, gimple_assign_rhs2 (stmt));
  update_stmt (use_stmt);
 
+ release_defs (stmt);
  gsi_remove (&gsi, true);
}
  else
Index: gcc/testsuite/g++.dg/torture/pr64568.C
===
--- gcc/testsuite/g++.dg/torture/pr64568.C  (revision 0)
+++ gcc/testsuite/g++.dg/torture/pr64568.C  (working copy)
@@ -0,0 +1,111 @@
+// { dg-do compile }
+// { dg-additional-options "-std=c++11" }
+
+namespace std
+{
+typedef long unsigned size_t;
+template  class complex;
+template  complex<_Tp> operator+(complex<_Tp>, complex<_Tp>)
+{
+  complex<_Tp> a = 0;
+  a += 0;
+  return a;
+}
+template <> struct complex
+{
+  complex (int __i) : _M_value{ __i } {}
+  int imag ();
+  void operator+=(complex __z) { _M_value = __z.imag (); }
+  _Complex double _M_value;
+};
+}
+class A
+{
+public:
+  A (int);
+  std::complex &operator[](int i) { return data_[i]; }
+  std::complex *data_;
+};
+struct B
+{
+  static std::complex
+  apply (std::complex t1, std::complex t2)
+  {
+return t1 + t2;
+  }
+};
+template  struct C
+{
+  static void
+  apply (T1 t1, std::complex t2)
+  {
+t1 = t2;
+  }
+};
+template  class D
+{
+public:
+  E operator()();
+};
+class G : public D
+{
+public:
+  typedef std::complex value_type;
+  value_type operator()(int) { return B::apply (0, 0); }
+};
+template  G operator+(D, D);
+template  class F, class V, class E>
+void
+indexing_vector_assign (V v, D e)
+{
+  for (int i;; ++i)
+F::apply (v (i), e ()(0));
+}
+template  class F, class V, class E>
+void
+vector_assign (V v, D e, int)
+{
+  indexing_vector_assign (v, e);
+}
+template  class F, class V, class E>
+void
+vector_assign (V v, D e)
+{
+  vector_assign (v, e, typename V::storage_category ());
+}
+class H : public D
+{
+public:
+  typedef std::complex &reference;
+  typedef int storage_category;
+  H (int);
+  template  H (D ae) : data_ (0)
+  {
+vector_assign (*this, ae);
+  }
+  A
+  data ()
+  {
+return data_;
+  }
+  reference operator()(int i) { return data ()[i]; }
+  A data_;
+};
+template 
+void
+rot (T1, V1 v1, T2, V2 v2)
+{
+  H (v1 + v2);
+}
+template  struct F
+{
+  void test ();
+};
+template struct F;
+template 
+void
+F::test ()
+{
+  V b (0), c (0);
+  rot (0, b, 0, c);
+}


[PATCH,MIPS] Only pass floating-point options to the assembler then

2015-01-12 Thread Matthew Fortune
The new behaviour of the GCC driver passing floating point options
like -msoft-float to the assembler is essential for the new o32 ABI
extensions but is a change in behaviour. In particular GCC 5 used with
binutils 2.24 would require a user to fix any hand-crafted code that
made use of floating-point instructions when building for soft-float.
This patch limits the new behaviour to a combination of GCC and
binutils that both have the new ABI support.

This patch along with parts of several previous patches needs backporting
to GCC 4.9 (and GCC 4.8) to enable use of binutils 2.25 with those
compilers. The GCC 4.9 patch will be posted shortly.

Thanks,
Matthew

gcc/
* config/mips/mips.h (FP_ASM_SPEC): New define.
(ASM_SPEC): Remove floating-point options and use FP_ASM_SPEC
instead.
---
 gcc/config/mips/mips.h | 21 ++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h
index 37d4cb4..ed241fa 100644
--- a/gcc/config/mips/mips.h
+++ b/gcc/config/mips/mips.h
@@ -1243,6 +1243,22 @@ struct mips_cpu_info {
 %{gcoff*:-mdebug} %{!gcoff*:-no-mdebug}"
 #endif
 
+/* FP_ASM_SPEC represents the floating-point options that must be passed
+   to the assembler when FPXX support exists.  Prior to that point the
+   assembler could accept the options but were not required for
+   correctness.  We only add the options when absolutely necessary
+   because passing -msoft-float to the assembler will cause it to reject
+   all hard-float instructions which may require some user code to be
+   updated.  */
+
+#ifdef HAVE_AS_DOT_MODULE
+#define FP_ASM_SPEC "\
+%{mhard-float} %{msoft-float} \
+%{msingle-float} %{mdouble-float}"
+#else
+#define FP_ASM_SPEC
+#endif
+
 /* SUBTARGET_ASM_SPEC is always passed to the assembler.  It may be
overridden by subtargets.  */
 
@@ -1277,9 +1293,8 @@ struct mips_cpu_info {
 %{modd-spreg} %{mno-odd-spreg} \
 %{mshared} %{mno-shared} \
 %{msym32} %{mno-sym32} \
-%{mtune=*} \
-%{mhard-float} %{msoft-float} \
-%{msingle-float} %{mdouble-float} \
+%{mtune=*}" \
+FP_ASM_SPEC "\
 %(subtarget_asm_spec)"
 
 /* Extra switches sometimes passed to the linker.  */
-- 
2.2.1



[PATCH][AArch64] Use target builtin instead of __builtin_sqrt for vsqrt_f64

2015-01-12 Thread Kyrill Tkachov

Hi all,

As raised in https://gcc.gnu.org/ml/gcc-patches/2014-12/msg01237.html 
and discussed in that thread, using __builtin_sqrt for vsqrt_f64 may end 
up in a call to the library sqrt at -O0. To avoid that this patch uses a 
target builtin for sqrt on DF mode and uses that to implement the intrinsic.


With this patch I don't see sqrt calls being created at -O0 on a large 
arm_neon.h testcase where they were generated before.
aarch64-none-elf testing and the intrinsics testsuite in particular are 
clean.

Ok for trunk?

Thanks,
Kyrill

2015-01-12  Kyrylo Tkachov  

* config/aarch64/aarch64-simd-builtins.def (sqrt): Use BUILTIN_VDQF_DF.
* config/aarch64/arm_neon.h (vsqrt_f64): Use __builtin_aarch64_sqrtdf
instead of __builtin_sqrt.

commit 865be1cc8365886904d571e244746815e2317162
Author: Kyrylo Tkachov 
Date:   Fri Jan 9 12:18:59 2015 +

[AArch64] Use target builtin for vsqrt_f64

diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def
index b41d9f6..60cd1d7 100644
--- a/gcc/config/aarch64/aarch64-simd-builtins.def
+++ b/gcc/config/aarch64/aarch64-simd-builtins.def
@@ -41,7 +41,7 @@
 
   BUILTIN_VDC (COMBINE, combine, 0)
   BUILTIN_VB (BINOP, pmul, 0)
-  BUILTIN_VDQF (UNOP, sqrt, 2)
+  BUILTIN_VDQF_DF (UNOP, sqrt, 2)
   BUILTIN_VD_BHSI (BINOP, addp, 0)
   VAR1 (UNOP, addp, 0, di)
   BUILTIN_VDQ_BHSI (UNOP, clrsb, 2)
diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
index c679802..3b151a2 100644
--- a/gcc/config/aarch64/arm_neon.h
+++ b/gcc/config/aarch64/arm_neon.h
@@ -22194,7 +22194,7 @@ vsqrtq_f32 (float32x4_t a)
 __extension__ static __inline float64x1_t __attribute__ ((__always_inline__))
 vsqrt_f64 (float64x1_t a)
 {
-  return (float64x1_t) { __builtin_sqrt (a[0]) };
+  return (float64x1_t) { __builtin_aarch64_sqrtdf (a[0]) };
 }
 
 __extension__ static __inline float64x2_t __attribute__ ((__always_inline__))

[PATCH][test] Gate gcc.dg/aru-2.c test on profiling support

2015-01-12 Thread Kyrill Tkachov

Hi all,

This recently added test adds -pg to its dg-options, but not all targets
support this, and those that don't fail at link time with
"bin/ld: cannot find -lc_p".

Looking around I see that all tests that use -pg also do a 
dg-require-profiling.

This patch adds that.

With this patch the test doesn't FAIL on aarch64-none-elf. It appears as 
UNSUPPORTED.


Ok for trunk?

Thanks,
Kyrill

2015-01-12  Kyrylo Tkachov  

* gcc.dg/aru-2.c: Add dg-require-profiling directive.

commit d4cffed0dede524cf1d0b3487e21ab4f783d06bd
Author: Kyrylo Tkachov 
Date:   Fri Jan 9 12:13:02 2015 +

[testuite] Add check on profiling in aru-2.c test

diff --git a/gcc/testsuite/gcc.dg/aru-2.c b/gcc/testsuite/gcc.dg/aru-2.c
index efd1f01..d36adc1 100644
--- a/gcc/testsuite/gcc.dg/aru-2.c
+++ b/gcc/testsuite/gcc.dg/aru-2.c
@@ -1,4 +1,5 @@
 /* { dg-do run } */
+/* { dg-require-profiling "-pg" } */
 /* { dg-options "-O2 -pg" } */
 
 static int __attribute__((noinline))

[PATCH]: New configure options that make the compiler use -fPIE and -pie as default option

2015-01-12 Thread H.J. Lu
On Fri, Jan 09, 2015 at 01:58:45PM +0100, Richard Biener wrote:
> On Tue, Dec 30, 2014 at 10:23 PM, Magnus Granberg  wrote:
> > On Friday, 14 November 2014 at 23.31.48, Magnus Granberg wrote:
> >> On Monday, 10 November 2014 at 21.26.39, Magnus Granberg wrote:
> >> > >   Rainer
> >> >
> >> > Thanks Rainer for the nits and comments.
> >> > Have updated the patches and Changelogs.
> >> > But I still use PIE_DRIVER_SELF_SPECS; do you have an idea where to move it so
> >> > I don't need to duplicate that stuff, or how to do it?
> >> >
> >> > Magnus G
> >> >
> >> > 2014-11-10  Magnus Granberg  
> >> >
> >> > /gcc
> >> > * config/gnu-user.h (PIE_DRIVER_SELF_SPECS) and
> >> > (GNU_DRIVER_SELF_SPECS): Define.
> >> > * config/i386/gnu-user-common.h (DRIVER_SELF_SPECS): Define
> >> > * configure.ac: Add new option.
> >> > * configure, config.in: Rebuild.
> >> > * Makefile.in (ALL_CFLAGS) and (ALL_CXXFLAGS): Disable PIE.
> >> > * doc/install.texi: New configure option.
> >> > * doc/invoke.texi: Add note to PIE.
> >> > * doc/sourcebuild.texi: New effective target.
> >> > gcc/testsuite
> >> > * gcc/default-pie.c: New test
> >> > * gcc.dg/tree-ssa/ssa-store-ccp-3.c: Skip if default_pie
> >> > * g++.dg/other/anon5.C: Skip if default_pie
> >> > * lib/target-supports.exp (check_effective_target_default_pie):
> >> > New proc.
> >> > /libgcc
> >> > * Makefile.in (CRTSTUFF_CFLAGS): Disable PIE.
> >>
> >> Can this be included for GCC 5 ?
> >>
> >> /Magnus G.
> > One more ping on this. The patches were sent before stage 1 closed but I
> > didn't get any feedback on them.
> > Have updated the patches for the gcc 5.0 20141228 snapshot.
> > Bootstrapped and tested on x86_64-unknown-linux-gnu (Gentoo)
> > /Magnus
> 
> Looking at the actual implementation I wonder why it's not similar
> to how darwin gets at it default (not sure how it does).  Also
> looking at how DRIVER_SELF_SPECS is used I wonder if the
> functionality can be enabled with a simple
> 
> --with-specs="%{pie|fpic|fPIC|fpie|fPIE|fno-pic|fno-PIC|fno-pie|fno-PIE|shared|static|nostdlib|nodefaultlibs|nostartfiles:;:-fPIE
> -pie}"
> 
> at configure time (using CONFIGURE_SPECS).
> 
> I have no idea if the above is really the proper spec to use - why
> do you include static, nostdlib, nodefaultlibs and nostartfiles
> for example?  Similar, if I say
> 
>  gcc -pie -c t.c
> 
> we will end up with a non-PIE object, and linking with -fPIE will
> end up with a DYN_EXEC object.
> 
> I believe you want to treat link and compile arguments separately
> (and adjust the link spec for linking).  I also would have said that
> elfos.h is more appropriate than gnu-user.h, but ...
> 
> That said, the patch looks more like a hack (and see above how
> to achieve the same without a patch(?)), not like a proper implementation
> of a PIE default.
> 
> Joseph may have an idea where the proper place for a spec-wise
> default PIE is.
> 

This is the new implementation of --enable-default-pie.  Tested on
Linux/x86-64.  OK for trunk?

Thanks.


H.J.
---
gcc/

2015-01-12  Magnus Granberg  
H.J. Lu  

* Makefile.in (COMPILER): Add @NO_PIE_CFLAGS@.
(LINKER): Add @NO_PIE_FLAG@.
(libgcc.mvars): Set NO_PIE_CFLAGS to -fno-PIE for
--enable-default-pie.
* common.opt (fPIE): Initialize to -1.
(fpie): Likewise.
(static): Add "RejectNegative Negative(shared)".
(no-pie): New option.
(pie): Replace "Negative(shared)" with "Negative(no-pie)".
* configure.ac: Add --enable-default-pie.
(NO_PIE_CFLAGS): New. Check if -fno-PIE works.  AC_SUBST.
(NO_PIE_FLAG): New. Check if -no-pie works.  AC_SUBST.
* defaults.h (DEFAULT_FLAG_PIE): New.  Default PIE to -fPIE.
* gcc.c (NO_PIE_SPEC): New.
(PIE_SPEC): Likewise.
(LD_PIE_SPEC): Likewise.
(LINK_PIE_SPEC): Handle -no-pie.  Use PIE_SPEC and LD_PIE_SPEC.
* opts.c (DEFAULT_FLAG_PIE): New.  Set to 0 if ENABLE_DEFAULT_PIE
is undefined.
(finish_options): Update opts->x_flag_pie if it is -1.
* config/gnu-user.h (FVTABLE_VERIFY_SPEC): New.
(GNU_USER_TARGET_STARTFILE_SPEC): Use FVTABLE_VERIFY_SPEC.  Use
NO_PIE_SPEC and NO_PIE_SPEC if ENABLE_DEFAULT_PIE is defined.
(GNU_USER_TARGET_STARTFILE_SPEC): Use FVTABLE_VERIFY_SPEC.
* doc/install.texi: Document --enable-default-pie.
* doc/invoke.texi: Document -no-pie.
* config.in: Regenerated.
* configure: Likewise.

gcc/ada/

2015-01-12  H.J. Lu  

* gcc-interface/Makefile.in (TOOLS_LIBS): Add @NO_PIE_FLAG@.

libgcc/

2015-01-12  H.J. Lu  

* Makefile.in (CRTSTUFF_CFLAGS): Add $(NO_PIE_CFLAGS).

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 5f9261f..180751f 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -252,6 +252,12 @@ LINKER = $(CC)
 LINKER_FLAGS = $(CFLAGS)
 endif
 
+# We don't want to compile the compiler with -fPIE, it make PCH f

[PATCH,MIPS] Remove all excess parallel constructs

2015-01-12 Thread Matthew Fortune
I found while checking ToT test status...

define_insn implicitly wraps the pattern in a parallel if there are
multiple instructions. Several MIPS patterns have an explicit parallel
which is mostly handled correctly but the code in 'gen_insn' does not
manage to locate the clobbers inside an explicit parallel and misses
them from the generated add_clobbers function. Until recently this had
gone unnoticed but the MIPS DSP tests now lead to combine creating
one of the DSP patterns and the missing code from add_clobbers causes
an ICE.

While gen_insn could probably be made to cope with this, GCC internals
docs clearly state that there is an implicit parallel so I am fixing the
MIPS patterns instead.

The significant amount of lines touched by this patch are entirely
due to unindenting the instructions, no text is changed except removing
[(parallel and the terminating )]. The testsuites improve their pass
rate with this in place.

For reference, current o32 and n32 test status for hard-float mips32r2
and mips64r2 is below:

o32 is very close to 100% clean:
Compile FAIL gcc.dg/tree-prof/stringop-2.c
3x FAIL gcc.dg/tree-prof/time-profiler-2.c

n64 is more broken:
Exec FAIL gcc.c-torture/execute/20110418-1.c
Exec FAIL gcc.dg/cleanup-10.c
Exec FAIL gcc.dg/cleanup-9.c
Exec FAIL gcc.dg/cleanup-11.c
Exec FAIL gcc.dg/cleanup-8.c
Exec FAIL gcc.dg/localalias.c
Exec FAIL c-c++-common/torture/builtin-arith-overflow-12.c
Compile FAIL gcc.dg/tree-prof/stringop-2.c
3x FAIL gcc.dg/tree-prof/time-profiler-2.c
2x FAIL gcc.dg/tree-ssa/ivopts-lt-2.c
1x FAIL gcc.dg/tree-ssa/ssa-dom-cse-2.c
4x FAIL gcc.target/mips/madd-8.c
4x FAIL gcc.target/mips/msub-8.c

Thanks,
Matthew

gcc/

* config/mips/micromips.md (*swp): Remove explicit parallel.
(jraddiusp, *movep): Likewise.
* config/mips/mips-dsp.md (add3): Likewise.
(mips_add_s_, sub3): Likewise.
(mips_sub_s_, mips_addsc): Likewise.
(mips_addwc, mips_absq_s_): Likewise.
(mips_precrq_rs_ph_w, mips_precrqu_s_qb_ph): Likewise.
(mips_shll_, mips_shll_s_): Likewise.
(mips_muleu_s_ph_qbl, mips_muleu_s_ph_qbr): Likewise.
(mips_mulq_rs_ph, mips_muleq_s_w_phl, mips_muleq_s_w_phr): Likewise.
(mips_dpaq_s_w_ph, mips_dpsq_s_w_ph, mips_mulsaq_s_w_ph): Likewise.
(mips_dpaq_sa_l_w, mips_dpsq_sa_l_w, mips_maq_s_w_phl): Likewise.
(mips_maq_s_w_phr, mips_maq_sa_w_phl, mips_maq_sa_w_phr): Likewise.
(mips_extr_w, mips_extr_r_w, mips_extr_rs_w): Likewise.
(mips_extr_s_h, mips_extp, mips_extpdp, mips_mthlip): Likewise.
(mips_wrdsp): Likewise.
* config/mips/mips-dspr2.md (mips_absq_s_qb): Remove explicit
parallel.
(mips_addu_ph, mips_addu_s_ph, mips_cmpgdu_eq_qb): Likewise.
(mips_cmpgdu_lt_qb, mips_cmpgdu_le_qb, mulv2hi3): Likewise.
(mips_mul_s_ph, mips_mulq_rs_w, mips_mulq_s_ph): Likewise.
(mips_mulq_s_w, mips_subu_ph, mips_subu_s_ph): Likewise.
(mips_dpaqx_s_w_ph, mips_dpaqx_sa_w_ph): Likewise.
(mips_dpsqx_s_w_ph, mips_dpsqx_sa_w_ph): Likewise.
* config/mips/mips-fixed.md (usadd3): Remove explicit parallel.
(ssadd3, ussub3, sssub3, ssmul3): Likewise.
(ssmaddsqdq4, ssmsubsqdq4): Likewise.
---
 gcc/config/mips/micromips.md  |  27 ++-
 gcc/config/mips/mips-dsp.md   | 505 --
 gcc/config/mips/mips-dspr2.md | 253 ++---
 gcc/config/mips/mips-fixed.md |  99 -
 4 files changed, 413 insertions(+), 471 deletions(-)

diff --git a/gcc/config/mips/micromips.md b/gcc/config/mips/micromips.md
index c8262c8..ef9920f 100644
--- a/gcc/config/mips/micromips.md
+++ b/gcc/config/mips/micromips.md
@@ -80,11 +80,10 @@ (define_peephole2
 
 ;; The behavior of the SWP insn is undefined if placed in a delay slot.
 (define_insn "*swp"
-  [(parallel [(set (match_operand:SI 0 "non_volatile_mem_operand")
-  (match_operand:SI 1 "d_operand"))
- (set (match_operand:SI 2 "non_volatile_mem_operand")
-  (match_operand:SI 3 "d_operand"))])]
-
+  [(set (match_operand:SI 0 "non_volatile_mem_operand")
+   (match_operand:SI 1 "d_operand"))
+   (set (match_operand:SI 2 "non_volatile_mem_operand")
+   (match_operand:SI 3 "d_operand"))]
   "TARGET_MICROMIPS
&& umips_load_store_pair_p (false, operands)"
 {
@@ -97,11 +96,11 @@ (define_insn "*swp"
 
 ;; For JRADDIUSP.
 (define_insn "jraddiusp"
-  [(parallel [(return)
- (use (reg:SI 31))
- (set (reg:SI 29)
-  (plus:SI (reg:SI 29)
-   (match_operand 0 "uw5_operand")))])]
+  [(return)
+   (use (reg:SI 31))
+   (set (reg:SI 29)
+   (plus:SI (reg:SI 29)
+(match_operand 0 "uw5_operand")))]
   "TARGET_MICROMIPS"
   "jraddiusp\t%0"
   [(set_attr "type""trap")
@@ -121,10 +120,10 @@ (define_peephole2
 
 ;; The behavior of the MOVEP insn is undefined if placed in a delay slot.
 (define_insn "*movep"

Re: [Patch, Fortran, OOP] PR 63733: [4.8/4.9/5 Regression] wrong resolution for OPERATOR generics

2015-01-12 Thread Janus Weil
Good, I fully agree. Fortunately the patch applies cleanly to the 4.9
branch and regtests without errors. Thus I have applied it as r219475.
Will do 4.8 soon.

Cheers,
Janus



2015-01-12 9:30 GMT+01:00 Paul Richard Thomas :
> Dear Janus,
>
> Since it is a regression, by all means update the branches. We
> usually, propose delaying a bit but I am not convinced that this is
> effective for this kind of bug fix - usually, further problems take a
> long time to emerge. Thus, I would recommend that you get on with it.
>
> Thanks
>
> Paul
>
> On 11 January 2015 at 23:01, Janus Weil  wrote:
>>> Well done for sorting that out. OK for trunk.
>>
>> Thanks, Paul. Committed as r219440.
>>
>> What about the branches?
>>
>> Cheers,
>> Janus
>>
>>
>>
>>> On 11 January 2015 at 14:38, Janus Weil  wrote:
>>>> Hi all,
>>>>
>>>> this patch fixes a wrong-code regression related to operators, by
>>>> making sure that we look for typebound operators first, before looking
>>>> for non-typebound ones. (Note: Each typebound operator is also added
>>>> to the list of non-typebound ones, for reasons of diagnostics.)
>>>>
>>>> Regtested on x86_64-unknown-linux-gnu. Ok for trunk? 4.9/4.8?
>>>>
>>>> Cheers,
>>>> Janus
>>>>
>>>>
>>>>
>>>> 2015-01-11  Janus Weil  
>>>>
>>>> PR fortran/63733
>>>> * interface.c (gfc_extend_expr): Look for type-bound operators before
>>>> non-typebound ones.
>>>>
>>>> 2015-01-11  Janus Weil  
>>>>
>>>> PR fortran/63733
>>>> * gfortran.dg/typebound_operator_20.f90: New.
>>>
>>>
>>>
>>> --
>>> Outside of a dog, a book is a man's best friend. Inside of a dog it's
>>> too dark to read.
>>>
>>> Groucho Marx
>
>
>
> --
> Outside of a dog, a book is a man's best friend. Inside of a dog it's
> too dark to read.
>
> Groucho Marx


[gomp4] Replace enum omp_clause_map_kind with enum gomp_map_kind (was: Including a file from include/ in gcc/*.h)

2015-01-12 Thread Thomas Schwinge
Hi!

On Mon, 22 Dec 2014 16:13:01 +0100, I wrote:
> I'm sending this again with some more people copied -- because I see
> you're working on tree.h/tree-core.h flattening, or know you're familiar
> with GCC plugins.  ;-) Here is a question concerning both of that, where
> I'd appreciate your input.
> 
> (That said, I don't find a "GCC plugins" person listed in the MAINTAINERS
> file, would that be worth adding?)
> 
> Full quote follows:
> 
> On Fri, 19 Dec 2014 18:54:04 +0100, I wrote:
> > On Thu, 18 Dec 2014 19:33:07 +0100, Jakub Jelinek  wrote:
> > > On Thu, Dec 18, 2014 at 07:25:03PM +0100, Thomas Schwinge wrote:
> > > > On Wed, 17 Dec 2014 23:26:53 +0100, I wrote:
> > > > > Committed to gomp-4_0-branch in r218840:
> > > > > 
> > > > > commit febcd8dfdb10fa80edff0880973d1915ca2fef74
> > > > > Author: tschwinge 
> > > > > Date:   Wed Dec 17 22:26:24 2014 +
> > > > > 
> > > > > Use include/gomp-constants.h more actively.
> > > > 
> > > > > diff --git gcc/tree-core.h gcc/tree-core.h
> > > > > index 743bc0d..fc61b88 100644
> > > > > --- gcc/tree-core.h
> > > > > +++ gcc/tree-core.h
> > > > > @@ -32,6 +32,7 @@ along with GCC; see the file COPYING3.  If not see
> > > > >  #include "alias.h"
> > > > >  #include "flags.h"
> > > > >  #include "symtab.h"
> > > > > +#include "gomp-constants.h"
> > > > >  
> > > > >  /* This file contains all the data structures that define the 'tree' 
> > > > > type.
> > > > > There are no accessor macros nor functions in this file. Only the
> > > > 
> > > > Is it actually "OK" to #include "gomp-constants.h" (living in
> > > > [GCC]/include/) from gcc/tree-core.h?  Isn't the tree-core.h file 
> > > > getting
> > > > installed, for later use by plugins -- which then need to be able to 
> > > > find
> > > > gomp-constants, too, but that is not being installed?
> > > 
> > > Generally, it must be possible to include include/ headers from gcc/
> > > headers, even when they are used by plugins.  Otherwise system.h including
> > > libiberty.h and safe-ctype.h etc. wouldn't work.  Perhaps you need to add
> > > gomp-constants.h to some Makefile variable or something, look at how is
> > > safe-ctype.h etc. handled.
> > 
> > Aha, that's how it is done, I guess, in gcc/Makefile.in:
> > 
> > [...]
> > SYSTEM_H = system.h hwint.h $(srcdir)/../include/libiberty.h \
> > $(srcdir)/../include/safe-ctype.h 
> > $(srcdir)/../include/filenames.h
> > [...]
> > PLUGIN_HEADERS = $(TREE_H) $(CONFIG_H) $(SYSTEM_H) coretypes.h [...]
> > [...]
> > # Install the headers needed to build a plugin.
> > install-plugin: installdirs lang.install-plugin s-header-vars 
> > install-gengtype
> > # We keep the directory structure for files in config or c-family and 
> > .def
> > # files. All other files are flattened to a single directory.
> > $(mkinstalldirs) $(DESTDIR)$(plugin_includedir)
> > headers=`echo $(PLUGIN_HEADERS) | tr ' ' '\012' | sort -u`; \
> > [...]
> > [...]
> > 
> > > That said, including gomp-constants.h from tree-core.h is I think very 
> > > much
> > > against all the Andrew's efforts to flatten headers (which is something 
> > > I'm
> > > not very happy with generally, but in this case, I think only the very few
> > > files that really need the constants should include it).
> > 
> > Like this (not yet applied)?
> 
> [Jakub: »I think it is fine.«]
> 
> > Talking about external code (GCC plugins), do we have to take any
> > measures about the removed enum omp_clause_map_kind?  (Would a mere
> > »#define omp_clause_map_kind gomp_map_kind« work?  That'd also mean that
> > we do have to add include/gomp-constants.h to PLUGIN_HEADERS, and get it
> > included automatically, I think?)
> > 
> > commit b1255597c6b069719960e53e385399c479c4be8b
> > Author: Thomas Schwinge 
> > Date:   Fri Dec 19 18:32:25 2014 +0100
> > 
> > Replace enum omp_clause_map_kind with enum gomp_map_kind.
> > 
> > gcc/
> > * tree-core.h: Instead of defining enum omp_clause_map_kind, use
> > include/gomp-constants.h's enum gomp_map_kind.  Update all 
> > users.
> > include/
> > * gomp-constants.h: Populate enum gomp_map_kind.
> > ---
> >  gcc/c/c-parser.c   | 38 ++---
> >  gcc/c/c-typeck.c   |  9 +++
> >  gcc/cp/parser.c| 38 ++---
> >  gcc/cp/semantics.c |  9 +++
> >  gcc/fortran/trans-openmp.c | 47 ++--
> >  gcc/gimplify.c | 18 +++---
> >  gcc/omp-low.c  | 60 
> > ++
> >  gcc/tree-core.h| 43 +++--
> >  gcc/tree-nested.c  |  8 +++
> >  gcc/tree-pretty-print.c| 31 
> >  gcc/tree-streamer-in.c |  2 +-
> >  gcc/tree-streamer-out.c|  2 +-
> >  gcc/tree.h |  4 ++--
> >  include

Re: [PATCH] config/h8300/h8300.c: Regress part of the original commit for fixing issue

2015-01-12 Thread Jeff Law

On 01/11/15 07:02, Chen Gang S wrote:

The related commit "1a1ed14 config/h8300: Use rtx_insn" gives an extra
check for rtx, which will cause building libgcc break, after regress it,
it can still generate the correct assemble code.

The related information is below:

   [root@localhost libgcc]# cat libgcc2.i
   typedef int DItype __attribute__ ((mode (DI)));
   DItype __muldi3 (DItype u, DItype v)
   {
 return u + v;
   }
   [root@localhost libgcc]# /upstream/build-gcc-h8300/gcc/cc1 -ms -O2 libgcc2.i
__muldi3
   Analyzing compilation unit
   Performing interprocedural optimizations
<*free_lang_data>
Assembling functions:
__muldi3
   libgcc2.i: In function '__muldi3':
   libgcc2.i:5:1: internal compiler error: in as_a, at is-a.h:192
}
This indicates a violation of the type safety invariants we're adding to 
GCC.  Simply changing the code to use rtx rather than rtx_insn is 
probably a step in the wrong direction.


Part of the problem here is that RTX_FRAME_RELATED_P is valid on both 
rtx_insn and rtx objects.  That's something we'll have to fix as the 
type safety work moves forward, assuming we continue towards the goal of 
totally separating rtx and rtx_insn objects.


Returning to the code in h8300.c, we have "F" which assumes its argument 
is an rtx_insn.  We should never be calling "F" with anything other than 
an rtx_insn argument.  The calls from "Fpa" are the only violators of 
that invariant.


Given that the vectors inside the PARALLEL will always be rtx objects 
and that we always want to set RTX_FRAME_RELATED on those objects, it 
seems that we could just replace the call to "F" in "Fpa" with


RTX_FRAME_RELATED_P (XVECEXP (par, 0, i)) = 1;


That simplifies the code and removes a bogus as_a cast.

Can you try that and report back to me?

Thanks,

Jeff

ps.  Someone should have chastised DJ for using such poor function names 
as "F" and "Fpa".  If you wanted to clean that up and use more 
descriptive function names, that would be appreciated as a separate patch.
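
The suggestion above can be illustrated with a toy model. This is only a
sketch under loud assumptions: every toy_* name below is hypothetical and
stands in for GCC's real rtx, rtx_insn, as_a and RTX_FRAME_RELATED_P; it is
not the h8300.c code. It shows why a helper that insists on an insn rejects
the elements of a PARALLEL, and why setting the flag directly on each
element sidesteps the checked cast.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for GCC's rtx codes; not GCC's real types.  */
enum toy_code { TOY_SET, TOY_INSN, TOY_PARALLEL };

struct toy_rtx
{
  enum toy_code code;
  int frame_related;     /* models RTX_FRAME_RELATED_P */
  size_t len;            /* element count when code == TOY_PARALLEL */
  struct toy_rtx **vec;  /* models XVECEXP (x, 0, i) */
};

/* Models a checked downcast.  GCC's as_a aborts on a mismatch; this toy
   version returns NULL instead so the failure can be observed.  */
struct toy_rtx *
toy_as_insn (struct toy_rtx *x)
{
  return x->code == TOY_INSN ? x : NULL;
}

/* Models a helper like "F" that is only valid for insns.
   Returns 0 on success, -1 when the invariant is violated.  */
int
toy_mark_insn (struct toy_rtx *x)
{
  struct toy_rtx *insn = toy_as_insn (x);
  if (insn == NULL)
    return -1;
  insn->frame_related = 1;
  return 0;
}

/* Models the suggested "Fpa": set the flag directly on every element of
   the PARALLEL, with no insn cast involved.  */
void
toy_mark_parallel (struct toy_rtx *par)
{
  assert (par->code == TOY_PARALLEL);
  for (size_t i = 0; i < par->len; i++)
    par->vec[i]->frame_related = 1;
}
```

In this model, calling toy_mark_insn on a TOY_SET element fails, which
corresponds to the as_a ICE in the report, while toy_mark_parallel marks
every element regardless of its code.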






Re: [Patch, i386] Support BMI and BMI2 targets in multiversioning

2015-01-12 Thread Uros Bizjak
Hello!

>> On Wed, Dec 31, 2014 at 01:28:47PM +0100, Allan Sandfeld Jensen wrote:
>> > I recently wanted to use multiversioning for BMI2 specific extensions
>> > PDEP/PEXT, and noticed it wasn't there. So I wrote this patch to add it,
>> > and also added AES, F16C and BMI1 for completeness.
>>
>> AES nor F16C doesn't make any sense IMHO for multiversioning, you need
>> special intrinsics for that anyway and when you use them, the function will
>> fail to compile without those features.
>> Multiversioning only makes sense for ISA features the compiler uses for
>> normal C/C++ code without any intrinsics.
>>
> Patch reduced to just adding BMI and BMI2 multiversioning:

+2014-12-29  Allan Sandfeld Jensen  
+
+ * config/i386/i386.c (get_builtin_code_for_version): Add
+ support for BMI and BMI2 multiversion functions.

+2014-12-29  Allan Sandfeld Jensen  
+
+ * gcc.target/i386/funcspec-5.c: Test new multiversion targets.
+ * g++.dg/ext/mv17.C: Test BMI/BMI2 multiversion dispatcher.

+2014-12-29  Allan Sandfeld Jensen  
+
+ * config/i386/cpuinfo.c (enum processor_features): Add FEATURE_BMI and
+ FEATURE_BMI2.
+ (get_available_features): Detect FEATURE_BMI and FEATURE_BMI2.

OK for mainline

(In future, please add ChangeLogs as plaintext as instructed in [1]).

[1] https://gcc.gnu.org/contribute.html#patches

Thanks,
Uros.


PATCH: PR testsuite/64427: gcc.target/i386/pr64291-1.c is invalid

2015-01-12 Thread H.J. Lu
Hi,

gcc.target/i386/pr64291-1.c has 2 issues:

1. Stack variables, n and d, aren't initialized.
2. dnp[dn - 1] |= 1UL<<63; doesn't work with 32-bit long.

I am checking this patch from

https://gcc.gnu.org/bugzilla/attachment.cgi?id=34342

as an obvious fix.
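
For readers wondering about issue 2: shifting 1UL by 63 is undefined when
unsigned long is only 32 bits wide. The patch handles this by limiting the
test to lp64 targets. The sketch below shows a different, width-independent
way to set the top bit of a limb; it is an illustration only, not what the
patch does, and the helper names are hypothetical.

```c
#include <limits.h>

/* Width of unsigned long in bits on this target (assumes no padding
   bits, which holds on the targets GCC supports).  */
#define ULONG_BITS (sizeof (unsigned long) * CHAR_BIT)

/* Set the most significant bit of a limb without hard-coding 63.  */
unsigned long
set_top_bit (unsigned long limb)
{
  return limb | (1UL << (ULONG_BITS - 1));
}

/* Nonzero when the literal shift 1UL << 63 is well defined here,
   i.e. when unsigned long has at least 64 bits.  */
int
shift_by_63_is_defined (void)
{
  return ULONG_BITS > 63;
}
```

On an lp64 target set_top_bit sets bit 63, matching the original test; on
an ilp32 target it sets bit 31 instead of invoking undefined behavior.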


H.J.
2015-01-12  Marc Glisse  

PR testsuite/64427
* gcc.target/i386/pr64291-1.c: Limit to lp64 target.  Avoid
undefined behavior.
* gcc.target/i386/pr64291-2.c: Updated.

diff --git a/gcc/testsuite/gcc.target/i386/pr64291-1.c 
b/gcc/testsuite/gcc.target/i386/pr64291-1.c
index 85253c0..1d3a380 100644
--- a/gcc/testsuite/gcc.target/i386/pr64291-1.c
+++ b/gcc/testsuite/gcc.target/i386/pr64291-1.c
@@ -1,6 +1,6 @@
 /* { dg-options "-O2" } */
 /* { dg-additional-sources pr64291-2.c } */
-/* { dg-do run } */
+/* { dg-do run { target lp64 } } */
 void f(void*,...);
 void g(void*,long,long);
 int nnn=0;
@@ -12,6 +12,7 @@ typedef struct
   unsigned long *_mp_d;
 } __mpz_struct;
 typedef __mpz_struct mpz_t[1];
+void h(mpz_t);
 
 int main ()
 {
@@ -21,7 +22,7 @@ int main ()
   long alloc, itch;
 
   f (n);
-  f (d);
+  h (d);
   qp = (unsigned long*)__builtin_alloca(4099*8) + 1;
   dnp = (unsigned long*)__builtin_alloca (2049*8);
   alloc = 1;
diff --git a/gcc/testsuite/gcc.target/i386/pr64291-2.c 
b/gcc/testsuite/gcc.target/i386/pr64291-2.c
index 2f3f929..7b7e88a 100644
--- a/gcc/testsuite/gcc.target/i386/pr64291-2.c
+++ b/gcc/testsuite/gcc.target/i386/pr64291-2.c
@@ -1,4 +1,14 @@
 /* { dg-do compile } */
-extern void abort (void);
+#include <assert.h>
 void f(void*p,...){}
-void g(void*p,long a,long b){if (a!=8) abort();}
+void g(void*p,long a,long b){assert(a==8);}
+typedef struct
+{
+  int _mp_size;
+  unsigned long *_mp_d;
+} __mpz_struct;
+typedef __mpz_struct mpz_t[1];
+void h(mpz_t x) {
+  x->_mp_d=0;
+  x->_mp_size=0;
+}


[PATCH, committed] jit: API change to gcc_jit_context_new_global

2015-01-12 Thread David Malcolm
This is an API change to one of the libgccjit.h entrypoints.

Although we don't yet guarantee API stability for libgccjit, I'm loath
to break things without strong reasons.
I think that in this case the reasons *are* sufficient (see below),
and hence I feel that it's best to get this change in now before any
official release of gcc 5.

The existing jit API for working with global variables was:

extern gcc_jit_lvalue *
gcc_jit_context_new_global (gcc_jit_context *ctxt,
gcc_jit_location *loc,
gcc_jit_type *type,
const char *name);

This API was misnamed: it didn't create a new global, it instead created
a new lvalue referencing an existing global variable (by name).

In my PyPy libgccjit experiment I needed to be able to create new global
variables, and mistakenly attempted to use the existing API to do this.
Given that I wrote the API, my inability to use it (together with missing
functionality) is a major red flag for the design :)

The following patch adds a new param to the entrypoint:

 extern gcc_jit_lvalue *
 gcc_jit_context_new_global (gcc_jit_context *ctxt,
 gcc_jit_location *loc,
+enum gcc_jit_global_kind kind,
 gcc_jit_type *type,
 const char *name);

with a new enum:

enum gcc_jit_global_kind
{
  /* Global is defined by the client code and visible
 by name outside of this JIT context via
 gcc_jit_result_get_global.  */
  GCC_JIT_GLOBAL_EXPORTED,

  /* Global is defined by the client code, but is invisible
 outside of this JIT context.  Analogous to a "static" global.  */
  GCC_JIT_GLOBAL_INTERNAL,

  /* Global is not defined by the client code; we're merely
 referring to it.  Analogous to using an "extern" global from a
 header file.  */
  GCC_JIT_GLOBAL_IMPORTED
};

This approach is by analogy to the gcc_jit_context_new_function
entrypoint.

For porting existing code, the old behavior is equivalent to passing
in GCC_JIT_GLOBAL_IMPORTED as the new param.

There's also a new entrypoint for accessing globals created via
GCC_JIT_GLOBAL_EXPORTED:

/* Locate a given global within the built machine code.
   It must have been created using GCC_JIT_GLOBAL_EXPORTED.
   This is a ptr to the global, so e.g. for an int this is an int *.  */
extern void *
gcc_jit_result_get_global (gcc_jit_result *result,
   const char *name);

The new test coverage takes jit.sum from 7152 to 7272 passes.

Committed to trunk as r219480.

gcc/jit/ChangeLog:
* docs/cp/topics/expressions.rst (Global variables): Add
enum gcc_jit_global_kind param to gccjit::context::new_global.
* docs/topics/expressions.rst (Global variables): Likewise.
Document the new enum.
* docs/topics/results.rst (Compilation results): Document
globals-handling.
* docs/_build/texinfo/libgccjit.texi: Regenerate.
* dummy-frontend.c (jit_langhook_write_globals): Call into the
playback context's write_global_decls_1 and write_global_decls_2
before and after calling symtab->finalize_compilation_unit ().
* jit-playback.c: Include "debug.h".
(gcc::jit::playback::context::new_global): Add "kind" param and
use it to set TREE_PUBLIC, TREE_STATIC and DECL_EXTERNAL on the
underlying VAR_DECL.  Call varpool_node::get_create on the
VAR_DECL, and add it to m_globals.
(gcc::jit::playback::context::write_global_decls_1): New function.
(gcc::jit::playback::context::write_global_decls_2): New function.
* jit-playback.h (gcc::jit::playback::context::context): Call
create on m_globals.
(gcc::jit::playback::context::new_global): Add "kind" param.
(gcc::jit::playback::context::write_global_decls_1): New function.
(gcc::jit::playback::context::write_global_decls_2): New function.
(gcc::jit::playback::context::m_globals): New field.
* jit-recording.c (gcc::jit::recording::context::context):
Initialize m_globals.
(gcc::jit::recording::context::new_global): Add param "kind".
Add the new global to m_globals.
(gcc::jit::recording::context::dump_to_file): Dump the globals.
(gcc::jit::recording::global::replay_into): Add field m_kind.
(gcc::jit::recording::global::write_to_dump): New override.
* jit-recording.h (gcc::jit::recording::context::new_global): Add
param "kind".
(gcc::jit::recording::context::m_globals): New field.
(gcc::jit::recording::global::global): Add param kind.
(gcc::jit::recording::global::write_to_dump): New override.
(gcc::jit::recording::global::m_kind): New field.
* jit-result.c (gcc::jit::result::get_global): New function.
* jit-result.h (gcc::jit::result::get_global): New function.
* libgccjit++.h (gccjit::context::new_g

Re: [PATCH][AArch64] Use target builtin instead of __builtin_sqrt for vsqrt_f64

2015-01-12 Thread Andrew Pinski
On Mon, Jan 12, 2015 at 7:52 AM, Kyrill Tkachov  wrote:
> Hi all,
>
> As raised in https://gcc.gnu.org/ml/gcc-patches/2014-12/msg01237.html and
> discussed in that thread, using __builtin_sqrt for vsqrt_f64 may end up in a
> call to the library sqrt at -O0. To avoid that this patch uses a target
> builtin for sqrt on DF mode and uses that to implement the intrinsic.
>
> With this patch I don't see sqrt calls being created at -O0 on a large
> arm_neon.h testcase where they were generated before.
> aarch64-none-elf testing and the intrinsics testsuite in particular are
> clean.
> Ok for trunk?

Maybe have a target fold which folds this into sqrt if -fno-math-errno
is supplied.  This might be useful in the -ffast-math case.
Maybe also fold it when a constant is supplied too.

Thanks,
Andrew

>
> Thanks,
> Kyrill
>
> 2015-01-12  Kyrylo Tkachov  
>
> * config/aarch64/aarch64-simd-builtins.def (sqrt): Use BUILTIN_VDQF_DF.
> * config/aarch64/arm_neon.h (vsqrt_f64): Use __builtin_aarch64_sqrtdf
> instead of __builtin_sqrt.


Re: [PATCH] config/h8300/h8300.c: Regress part of the original commit for fixing issue

2015-01-12 Thread Jeff Law

On 01/12/15 10:01, Jeff Law wrote:

This indicates a violation of the type safety invariants we're adding to
GCC.  Simply changing the code to use rtx rather than rtx_insn is
probably a step in the wrong direction.

Part of the problem here is that RTX_FRAME_RELATED_P is valid on both
rtx_insn and rtx objects.  That's something we'll have to fix as the
type safety work moves forward, assuming we continue towards the goal of
totally separating rtx and rtx_insn objects.

Returning to the code in h8300.c, we have "F" which assumes its argument
is an rtx_insn.  We should never be calling "F" with anything other than
an rtx_insn argument.  The calls from "Fpa" are the only violators of
that invariant.

Given that the vectors inside the PARALLEL will always be rtx objects
and that we always want to set RTX_FRAME_RELATED on those objects, it
seems that we could just replace the call to "F" in "Fpa" with

RTX_FRAME_RELATED_P (XVECEXP (par, 0, i)) = 1;


That simplifies the code and removes a bogus as_a cast.

Can you try that and report back to me?
Nevermind.  I went ahead and made this change and verified that libgcc, 
libquadmath and newlib would build for the h8300-elf configuration.


Jeff



Re: [PATCH][test] Gate gcc.dg/aru-2.c test on profiling support

2015-01-12 Thread Jeff Law

On 01/12/15 08:54, Kyrill Tkachov wrote:

Hi all,

This recently added test adds -pg to its dg-options, but not all targets
support this; those that don't fail at link time with
"bin/ld: cannot find -lc_p".

Looking around I see that all tests that use -pg also do a
dg-require-profiling.
This patch adds that.

With this patch the test doesn't FAIL on aarch64-none-elf. It appears as
UNSUPPORTED.

Ok for trunk?

Thanks,
Kyrill

2015-01-12  Kyrylo Tkachov  

 * gcc.dg/aru-2.c: Add dg-require-profiling directive.

OK.
Jeff



Re: [gomp4] Replace enum omp_clause_map_kind with enum gomp_map_kind (was: Including a file from include/ in gcc/*.h)

2015-01-12 Thread Jakub Jelinek
On Mon, Jan 12, 2015 at 05:32:14PM +0100, Thomas Schwinge wrote:
> I have now committed the patch to gomp-4_0-branch in the following form.
> The issues raised above remain to be resolved.
> 
> In spirit against the tree.h header flattening, I had to keep the
> #include "include/gomp-constants.h" in gcc/tree-core.h, because otherwise
> I'd have to add it to a ton of *.c files, just for the enum gomp_map_kind
> definition.
> 
> I found that in the C++ dialect used by GCC, it is not possible to
> declare an enum without giving the list of enumerators.  N2764 (from
> 2008) resolved this by adding appropriate syntax for declaring enums,
> however: "warning: scoped enums only available with -std=c++11 or
> -std=gnu++11".  If it were possible to use this, we could add to
> gcc/tree-core.h:
> 
> enum gomp_map_kind : char;
> 
> ... (or similar), and this way decouple the declaration (gcc/tree-core.h)
> from the actual "population of it" (include/gomp-constants.h).
> Alternatively, in gcc/tree-core.h:struct tree_omp_clause, we could switch
> the map_kind member from enum gomp_map_kind to a char -- but that would
> defeat the usage of an enum (easy pretty-printing of its enumerators in
> GDB, and so on.).

Or just don't do this and duplicate the constants and just assert somewhere
(in omp-low.c) at compile time that all the values match.
Either using char and casting the value only in the OMP_* macros
or duplicating the values sounds preferable to including
include/gomp-constants.h from tree-core.h.
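
A minimal sketch of both suggestions, under stated assumptions: all
DEMO_/demo_ names and the numeric values are hypothetical stand-ins
rather than the real gomp constants, and C11 _Static_assert is used
only for illustration (GCC's own sources would need whatever
compile-time check is available to them).

```c
#include <assert.h>

/* Hypothetical stand-ins for values that would live in
   include/gomp-constants.h; the numbers are made up.  */
#define DEMO_GOMP_MAP_ALLOC 0
#define DEMO_GOMP_MAP_TO 1
#define DEMO_GOMP_MAP_FROM 2

/* Hypothetical duplicate kept on the compiler side.  */
enum demo_map_kind
{
  DEMO_MAP_ALLOC = 0,
  DEMO_MAP_TO = 1,
  DEMO_MAP_FROM = 2
};

/* Option 1: duplicate the values and check at compile time that the
   two sets agree.  */
_Static_assert ((int) DEMO_MAP_ALLOC == DEMO_GOMP_MAP_ALLOC, "ALLOC mismatch");
_Static_assert ((int) DEMO_MAP_TO == DEMO_GOMP_MAP_TO, "TO mismatch");
_Static_assert ((int) DEMO_MAP_FROM == DEMO_GOMP_MAP_FROM, "FROM mismatch");

/* Option 2: store a char in the structure and cast only in the
   accessor macros, so the structure's header does not need the enum.  */
struct demo_clause
{
  unsigned char map_kind;
};

#define DEMO_CLAUSE_MAP_KIND(c) ((enum demo_map_kind) (c)->map_kind)
#define DEMO_CLAUSE_SET_MAP_KIND(c, k) ((c)->map_kind = (unsigned char) (k))
```

Either way the widely included header stays free of the extra include;
the cost is a second list of values or a cast at each access.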

Jakub


Re: [PATCH, i386] Remove EBX usage from asm code

2015-01-12 Thread Jeff Law

On 01/12/15 04:56, Evgeny Stupachenko wrote:

Agree, I've missed the usage of the function
"__register_frame_info_bases" (frame_dummy assembly had only indirect
call when I miss "-pie" in compilation).
There is no reference on glibc that way. Sorry for the confusion.
So that is potentially buggy right now.

So it seems to me this ought to be installed now.

Jeff



Re: [PATCH] add option to emit more array bounds warnigs

2015-01-12 Thread Jeff Law

On 11/11/14 23:13, Martin Uecker wrote:


Hi,

this proposed patch adds an option "-Warray-bounds=" in addition to
"-Warray-bounds". "-Warray-bounds=1" corresponds to "-Warray-bounds".
For higher warning levels more warnings about optional accesses
outside of arrays are emitted. For example, warnings for
arrays accessed through pointers are now emitted:

void foo(int (*a)[3])
{
(*a)[4] = 1;
}

Also warnings for arrays which are the last element of a struct
are emitted, if it is not a flexible array member or does not use
the zero size extensions.

Because there is the risk of false positives, the higher warning
level is not used by default.
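
To make the trailing-array case concrete, here is a minimal sketch
contrasting the two struct layouts.  It assumes, following the
description above, that indexing past the declared bound of
demo_old_style.tail is what the higher level reports, while a C99
flexible array member is exempt.  All demo_ names are hypothetical, and
the code only performs in-bounds accesses.

```c
#include <assert.h>
#include <stdlib.h>

/* Old-style trailing array of declared size 1.  Code that indexes
   tail[1] and beyond relies on over-allocation; this is the pattern
   the higher warning level is described as reporting.  */
struct demo_old_style
{
  size_t len;
  int tail[1];
};

/* C99 flexible array member: the declared type carries no bound.  */
struct demo_flexible
{
  size_t len;
  int tail[];
};

/* Allocate room for LEN trailing elements and fill them with 0..LEN-1.  */
struct demo_flexible *
demo_flexible_new (size_t len)
{
  struct demo_flexible *p = malloc (sizeof *p + len * sizeof p->tail[0]);
  if (p == NULL)
    return NULL;
  p->len = len;
  for (size_t i = 0; i < len; i++)
    p->tail[i] = (int) i;
  return p;
}

/* Sum the trailing elements, staying within the recorded length.  */
int
demo_flexible_sum (const struct demo_flexible *p)
{
  int sum = 0;
  for (size_t i = 0; i < p->len; i++)
    sum += p->tail[i];
  return sum;
}
```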


Martin


* gcc/tree-vrp.c (check_array_ref): Emit more warnings
for warn_array_bounds >= 2.
* gcc/testsuite/gcc.dg/Warray-bounds-11.c: New test-case.
* gcc/c-family/c.opt: New option -Warray-bounds=.
* gcc/common.opt: New option -Warray-bounds=.
* gcc/doc/invoke.texi: Document new option.
Has this patch been bootstrapped and regression tested?  If so, on what
platform?


Given the new warnings (as implemented by the patch) are not enabled by 
default, I'm inclined to approve once Martin verifies things via 
bootstrap and regression test.


jeff



Re: [testsuite] PATCH: Add check_effective_target_pie

2015-01-12 Thread Jeff Law

On 01/11/15 16:58, H.J. Lu wrote:

Hi,

This patch adds check_effective_target_pie to check if the current
multilib generatse PIE by default.  I will submit other patches to use
it.  OK for trunk?

Thanks.

H.J.
---
2015-01-11  H.J. Lu  

* gcc.target/i386/pie.c: New test.

* lib/target-supports.exp (check_profiling_available): Return 0
if PIE is enabled.
(check_effective_target_pie): New.
---
  gcc/testsuite/gcc.target/i386/pie.c   | 12 
  gcc/testsuite/lib/target-supports.exp | 15 +++
  2 files changed, 27 insertions(+)
  create mode 100644 gcc/testsuite/gcc.target/i386/pie.c

diff --git a/gcc/testsuite/gcc.target/i386/pie.c b/gcc/testsuite/gcc.target/i386/pie.c
new file mode 100644
index 000..0a9f5ee
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pie.c
@@ -0,0 +1,12 @@
+/* { dg-do compile { target pie } } */
+/* { dg-options "-O2" } */
+
+int foo (void);
+
+int
+main (void)
+{
+  return foo ();
+}
+
+/* { dg-final { scan-assembler "foo@PLT" } } */
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index f5c6db8..549bcdf 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -475,6 +475,11 @@ proc check_profiling_available { test_what } {
}
  }

+# Profiling don't work with -fPIE -pie.
+if { [check_effective_target_pie] } {
+  return 0
+}
Is this an inherent restriction of -fPIE, or is it merely an 
implementation detail?  If the latter, is that implementation detail a 
target issue?  ie, could we have a target that supports profiling in 
conjunction with -fPIE?  If so, then this test seems too restrictive.



  }

+# Return 1 if the current multilib generatse PIE by default.

s/generatse/generates/

Waiting on answer to PIE vs pg question above prior to approving or 
requesting further refinement.


jeff



Re: [Ping] Port of VTV for Cygwin and MinGW

2015-01-12 Thread Caroline Tice
On Thu, Jan 8, 2015 at 12:33 PM, Patrick Wollgast
 wrote:
> A short recap again:
>
> Latest patch, changelog and a test program (further information about
> the program in the mail):
> https://gcc.gnu.org/ml/gcc-patches/2014-11/msg03368.html
>
>
> Approved:
> * gcc/config/i386/*
> * libgcc/*
> * libstdc++-v3/*
> * libvtv/*  (Some changes made to three of these files.
>  Listed in 'Not approved'.)
>
>
> Not approved:
> For the following two files I added checks, if TARGET_PECOFF is defined
> ( https://gcc.gnu.org/ml/gcc-patches/2014-10/msg00815.html )
> * gcc/cp/vtable-class-hierarchy.c
> * gcc/varasm.c

The changes in gcc/cp/vtable-class-hierarchy.c and gcc/varasm.c both
look good to me.  However I am not authorized to approve stuff in this
part of GCC, so I need someone with global approval rights to look at
these changes and give the final OK.


>
> Reasons for changes in the following files stated in
> https://gcc.gnu.org/ml/gcc-patches/2014-10/msg00815.html and in the mail
> of the latest patch.
>
> Removed implementation of mprotect.
> * libvtv/vtv_malloc.cc
> Added extern "C" to the prototype of mprotect.
> * libvtv/vtv_malloc.h
> Exchanged call to TerminateProcess with call to abort in __fortify_fail.
> * libvtv/vtv_rts.cc

The changes in libvtv all look good to me (approved).

-- Caroline Tice
cmt...@google.com


>
> Has been removed from the most recent patch. Just listed for completeness.
> * libiberty/obstack.c
>
>
> Regards,
> Patrick


Re: [testsuite] PATCH: Support PIE in gcc.dg/tree-ssa/ssa-store-ccp-3.c

2015-01-12 Thread Jeff Law

On 01/11/15 17:25, H.J. Lu wrote:

target nonpic is always false for -fPIE since it defines both __PIC__ and
__PIE__.  This patch changes gcc.dg/tree-ssa/ssa-store-ccp-3.c to make it
pass with -fPIE by excluding PIE when nonpic is true.  OK for trunk?
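
The macro relationship mentioned above can be observed from C with a
minimal sketch like the following.  __PIC__ and __PIE__ are the
compiler's predefined macros; the demo_ function names are
hypothetical.

```c
#include <assert.h>

/* Return 1 if this translation unit was compiled as position
   independent code (-fPIC/-fpic, or -fPIE/-fpie).  */
int
demo_compiled_as_pic (void)
{
#ifdef __PIC__
  return 1;
#else
  return 0;
#endif
}

/* Return 1 if this translation unit was compiled with -fPIE/-fpie.  */
int
demo_compiled_as_pie (void)
{
#ifdef __PIE__
  return 1;
#else
  return 0;
#endif
}
```

Because a PIE build sets both macros, a test keyed only on __PIC__
cannot tell a PIE build from a plain PIC one, which is why the test
needs a separate PIE check.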

Thanks.

H.J.
---
  gcc/testsuite/gcc.dg/tree-ssa/ssa-store-ccp-3.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

2015-01-11  H.J. Lu  

* gcc.dg/tree-ssa/ssa-store-ccp-3.c: Exclude pie when nonpic is
true.

OK if/when prerequisites go in.

jeff


Re: [RFC, PATCH][LRA, MIPS] ICE: in decompose_normal_address, at rtlanal.c:5817

2015-01-12 Thread Jeff Law

On 01/10/15 05:51, Richard Sandiford wrote:


I agree this is the kind of thing we'd need to consider if we were
deciding whether it's valid to connect a (lo_sum (high x+N) x+N) to an
existing (high x).  But this code is handling cases where the connection
has already been made and we're trying to simplify the result.  Would it
be valid RTL to use:
And I probably should have said that if we've already somehow determined 
that the high/losum are paired then it'd be OK.  I hesitated simply 
because I hadn't walked how this code was called or used in any detail.




   (lo_sum (high x) (const (plus x offset)))

to mean anything other than x+offset?
Agreed, when we've made a connection between the two -- for example, if 
we know the output of the high dominates and was originally used in the 
lo_sum.   However, we can't make that assumption for an arbitrary 
high/lo_sum, even if they refer to the same x.




If we do go for the change, I think we should generalise it to handle
(lo_sum (high x+N) x+N') and (lo_sum (high x-N) x) too.  Things like
get_related_value or split_const could help there.

Unless the change is pretty trivial, this might be a gcc-6 kind of thing.

jeff


Re: [PATCH, committed] Fix build of jit (was Re: [PATCH] Flatten tree.h and tree-core.h (Version 3))

2015-01-12 Thread Mike Stump
On Jan 11, 2015, at 2:33 PM, Prathamesh Kulkarni 
 wrote:
> oops, sorry about this. We will build further flattening patches with
> --enable-languages=all,go,jit,ada.
> Shall that cover all the front-ends ?

No, objc++ is non-default:

$ grep build_by_default */config-lang.in
go/config-lang.in:build_by_default="no"
java/config-lang.in:#build_by_default=no
jit/config-lang.in:build_by_default="no"
lto/config-lang.in:build_by_default=no
objcp/config-lang.in:build_by_default="no"

I saw changes to objcp, so, I had assumed you did it, otherwise I was going to 
mention it.

Re: [RFC, PATCH][LRA, MIPS] ICE: in decompose_normal_address, at rtlanal.c:5817

2015-01-12 Thread Jeff Law

On 01/10/15 09:35, Matthew Fortune wrote:


I guess so. I took the phrasing below for (high:m exp) to mean that high
only made sense when used with lo_sum.
True.  But one can use a single high with different lo_sum expressions 
when those lo_sum expressions are related.


So you might have a single high such as

(high (symbol_ref "x"))

That feeds multiple lo_sum expressions like

(lo_sum (reg) (symbol_ref "x"))
(lo_sum (reg) (const (plus (symbol_ref "x") (const_int 4))))
(lo_sum (reg) (const (plus (symbol_ref "x") (const_int 8))))
(lo_sum (reg) (const (plus (symbol_ref "x") (const_int 12)))


IIRC this gets implemented in either the move expander or a 
legitimize_address hook.  You start with a high/lo_sum pair for each 
reference.  However, you rewrite the high part to chop off low bits. 
That makes many of the high expressions become common subexpressions and 
they get removed by CSE in the expected ways.


You have to be careful for overflows and such.  I don't recall the 
precise rules there, but it was the source of problems with interfacing 
with the optimizing PA linker.
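
The overflow concern can be illustrated with a minimal sketch.  It
assumes a MIPS-style split with a signed 16-bit low part, purely for
illustration; the PA rules referred to above differ in detail, and all
demo_ names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* High part: the upper 16 bits, rounded so that adding the signed low
   part below gives back the full address.  */
uint32_t
demo_high (uint32_t addr)
{
  return (addr + 0x8000u) & 0xffff0000u;
}

/* Low part: the low 16 bits interpreted as a signed value.  */
int32_t
demo_low (uint32_t addr)
{
  int32_t low = (int32_t) (addr & 0xffffu);
  return low >= 0x8000 ? low - 0x10000 : low;
}

/* Combine a high part with a low part, as a lo_sum would.  */
uint32_t
demo_lo_sum (uint32_t high, int32_t low)
{
  return high + (uint32_t) low;
}

/* Return 1 if the high part computed for X can also serve X + OFFSET,
   i.e. if adding the offset does not carry into the high part.  */
int
demo_high_is_shareable (uint32_t x, uint32_t offset)
{
  return demo_high (x) == demo_high (x + offset);
}
```

With x = 0x12340000 the offsets 4, 8 and 12 all share one high part,
but with x = 0x12347ff0 an offset of 0x20 crosses the rounding
boundary, so pairing the old high part with the new low part yields the
wrong address.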



Jeff


Re: [PATCH, committed] Fix build of jit (was Re: [PATCH] Flatten tree.h and tree-core.h (Version 3))

2015-01-12 Thread Prathamesh Kulkarni
On 13 January 2015 at 00:01, Mike Stump  wrote:
> On Jan 11, 2015, at 2:33 PM, Prathamesh Kulkarni 
>  wrote:
>> oops, sorry about this. We will build further flattening patches with
>> --enable-languages=all,go,jit,ada.
>> Shall that cover all the front-ends ?
>
> No, objc++ is non-default:
Thanks!
>
> $ grep build_by_default */config-lang.in
> go/config-lang.in:build_by_default="no"
> java/config-lang.in:#build_by_default=no
> jit/config-lang.in:build_by_default="no"
> lto/config-lang.in:build_by_default=no
> objcp/config-lang.in:build_by_default="no"
>
> I saw changes to objcp, so, I had assumed you did it, otherwise I was going 
> to mention it.
Michael had built objcp for tree.h flattening patch and tested it.

Thanks,
Prathamesh


Re: [RFC PATCH] Handle sequence in reg_set_p

2015-01-12 Thread Jeff Law

On 01/11/15 04:40, Oleg Endo wrote:


Any particular reason why the SEQUENCE handling isn't done first, then
the REG_INC and CALL insn handling?  I'd probably explicitly return
false if we had a sequence and none of its elements returned true.
There's no need to check anything on the toplevel SEQUENCE to the best
of my knowledge.


No meaningful reason.  Attached is an updated patch that applies on 4.8
and trunk.  Bootstrapped on trunk i686-pc-linux-gnu.  Tested with make
-k check RUNTESTFLAGS="--target_board=sh-sim
\{-m2/-ml,-m2/-mb,-m2a/-mb,-m4/-ml,-m4/-mb,-m4a/-ml,-m4a/-mb}" on trunk
sh-elf.  In the PR it has been reported, that the patch fixes the
problems when building a native SH GCC.

Although the related SH problem occurs only on 4.8, I'd like to install
this on trunk and 4.9, too, to avoid future surprises.  OK?

Cheers,
Oleg

gcc/ChangeLog:
PR target/64479
* rtlanal.c (reg_set_p): Handle SEQUENCE constructs.

OK.

Jeff



Re: [COMMITTED] Merge libffi with upstream

2015-01-12 Thread Uros Bizjak
Hello!

> Upstream libffi has added support for Go closures (using the static chain),
> and support for complex numbers.  Perhaps less relevant is new support for
> arc, microblaze, moxie, nios, and or1k targets.
>
> Without additional changes for Go, this merge has little effect.  Within the
> gcc tree libffi is primarily used by libjava.
>
> Tested with no regressions on {i686,x86_64,ppc64,s390x,aarch64,alpha}-linux.

This patchset regressed libjava on -m32 x86_64-linux-gnu (Fedora 21):

=== libjava tests ===


Running target unix

=== libjava Summary for unix ===


Running target unix/-m32
FAIL: libjava.jar/TestClosureGC.jar execution - gij test
FAIL: libjava.jar/simple.jar execution - gij test
FAIL: PR15133 execution - gij test
FAIL: PR18116 execution - gij test
FAIL: PR28178 execution - gij test
FAIL: bytebuffer execution - gij test
FAIL: calls execution - gij test
FAIL: cxxtest execution - gij test
FAIL: directbuffer execution - gij test
FAIL: field execution - gij test
FAIL: final_method execution - gij test
FAIL: findclass execution - gij test
FAIL: findclass2 execution - gij test
FAIL: iface execution - gij test
FAIL: init execution - gij test
FAIL: invoke execution - gij test
FAIL: jniutf execution - gij test
FAIL: martin execution - gij test
FAIL: noclass execution - gij test
FAIL: overload execution - gij test
FAIL: pr11951 execution - gij test
FAIL: pr18278 execution - gij test
FAIL: pr23739 execution - gij test
FAIL: register execution - gij test
FAIL: register2 execution - gij test
FAIL: simple_int execution - gij test
FAIL: throwit execution - gij test
FAIL: virtual execution - gij test
FAIL: PR16923 run
FAIL: pr29812 execution - gij test
FAIL: getargssize run
FAIL: getlocalvartable run
FAIL: getstacktrace run
FAIL: ExtraClassLoader execution - source compiled test
FAIL: ExtraClassLoader -findirect-dispatch execution - source compiled test
FAIL: ExtraClassLoader -O3 execution - source compiled test
FAIL: ExtraClassLoader -O3 -findirect-dispatch execution - source compiled test
FAIL: TestEarlyGC execution - source compiled test

=== libjava Summary for unix/-m32 ===


=== libjava Summary ===

# of expected passes            5092
# of unexpected failures        38
# of expected failures          8
# of untested testcases         38

Uros.


Re: [testsuite] PATCH: Add check_effective_target_pie

2015-01-12 Thread H.J. Lu
On Mon, Jan 12, 2015 at 10:09 AM, Jeff Law  wrote:
> On 01/11/15 16:58, H.J. Lu wrote:
>>
>> Hi,
>>
>> This patch adds check_effective_target_pie to check if the current
>> multilib generatse PIE by default.  I will submit other patches to use
>> it.  OK for trunk?
>>
>> Thanks.
>>
>> H.J.
>> ---
>> 2015-01-11  H.J. Lu  
>>
>> * gcc.target/i386/pie.c: New test.
>>
>> * lib/target-supports.exp (check_profiling_available): Return 0
>> if PIE is enabled.
>> (check_effective_target_pie): New.
>> ---
>>   gcc/testsuite/gcc.target/i386/pie.c   | 12 
>>   gcc/testsuite/lib/target-supports.exp | 15 +++
>>   2 files changed, 27 insertions(+)
>>   create mode 100644 gcc/testsuite/gcc.target/i386/pie.c
>>
>> diff --git a/gcc/testsuite/gcc.target/i386/pie.c
>> b/gcc/testsuite/gcc.target/i386/pie.c
>> new file mode 100644
>> index 000..0a9f5ee
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/i386/pie.c
>> @@ -0,0 +1,12 @@
>> +/* { dg-do compile { target pie } } */
>> +/* { dg-options "-O2" } */
>> +
>> +int foo (void);
>> +
>> +int
>> +main (void)
>> +{
>> +  return foo ();
>> +}
>> +
>> +/* { dg-final { scan-assembler "foo@PLT" } } */
>> diff --git a/gcc/testsuite/lib/target-supports.exp
>> b/gcc/testsuite/lib/target-supports.exp
>> index f5c6db8..549bcdf 100644
>> --- a/gcc/testsuite/lib/target-supports.exp
>> +++ b/gcc/testsuite/lib/target-supports.exp
>> @@ -475,6 +475,11 @@ proc check_profiling_available { test_what } {
>> }
>>   }
>>
>> +# Profiling don't work with -fPIE -pie.
>> +if { [check_effective_target_pie] } {
>> +  return 0
>> +}
>
> Is this an inherent restriction of -fPIE, or is it merely an implementation
> detail?  If the latter, is that implementation detail a target issue?  ie,
> could we have a target that supports profiling in conjunction with -fPIE?
> If so, then this test seems too restrictive.

[hjl@gnu-6 tmp]$ gcc -pie -fPIE -pg h.c
/usr/local/bin/ld:
/usr/lib/gcc/x86_64-redhat-linux/4.8.3/../../../../lib64/gcrt1.o:
relocation R_X86_64_32S against `__libc_csu_fini' can not be used when
making a shared object; recompile with -fPIC
/usr/lib/gcc/x86_64-redhat-linux/4.8.3/../../../../lib64/gcrt1.o:
error adding symbols: Bad value
collect2: error: ld returned 1 exit status
[hjl@gnu-6 tmp]$

There is no crt1.o from glibc to support -pg -pie -fPIE.
I don't know if other targets support -pie -fPIE.

>>   }
>>
>> +# Return 1 if the current multilib generatse PIE by default.
>
> s/generatse/generates/

I will fix it.

> Waiting on answer to PIE vs pg question above prior to approving or
> requesting further refinement.

Sure.

-- 
H.J.


Re: [testsuite] PATCH: Add check_effective_target_pie

2015-01-12 Thread Jeff Law

On 01/12/15 12:29, H.J. Lu wrote:


Is this an inherent restriction of -fPIE, or is it merely an implementation
detail?  If the latter, is that implementation detail a target issue?  ie,
could we have a target that supports profiling in conjunction with -fPIE?
If so, then this test seems too restrictive.


[hjl@gnu-6 tmp]$ gcc -pie -fPIE -pg h.c
/usr/local/bin/ld:
/usr/lib/gcc/x86_64-redhat-linux/4.8.3/../../../../lib64/gcrt1.o:
relocation R_X86_64_32S against `__libc_csu_fini' can not be used when
making a shared object; recompile with -fPIC
/usr/lib/gcc/x86_64-redhat-linux/4.8.3/../../../../lib64/gcrt1.o:
error adding symbols: Bad value
collect2: error: ld returned 1 exit status
[hjl@gnu-6 tmp]$

There is no crt1.o from glibc to support -pg -pie -fPIE.
I don't know if other targets support -pie -fPIE.
Can you please investigate the questions.  Showing me the link failure 
doesn't really help much here.  It tells me there's no crt1.o, but it 
says nothing about *why*.


Is there inherently something about PIE/pg that makes them impossible to 
work together or is this an implementation detail?  If the latter, then 
is the implementation detail a target issue or not?


Jeff


[PATCH, committed] jit-playback.c: fix missing fclose

2015-01-12 Thread David Malcolm
Reported by David Binderman within discussion of PR jit/63854.

Before/after jit.sum has 7272 passes.

Committed to trunk as r219487.

gcc/jit/ChangeLog:
* jit-playback.c (gcc::jit::playback::context::read_dump_file):
Add missing fclose on error-handling path.
---
 gcc/jit/jit-playback.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/jit/jit-playback.c b/gcc/jit/jit-playback.c
index 0e45e02..ca4e112 100644
--- a/gcc/jit/jit-playback.c
+++ b/gcc/jit/jit-playback.c
@@ -1947,6 +1947,7 @@ playback::context::read_dump_file (const char *path)
 {
   add_error (NULL, "error reading from %s", path);
   free (result);
+  fclose (f_in);
   return NULL;
 }
 
-- 
1.8.5.3
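
The same class of leak can be avoided structurally by routing every
failure through one cleanup path.  The following is a minimal
standalone sketch of that idea, not the libgccjit code:
demo_read_file is a hypothetical helper that reads a whole file into a
NUL-terminated buffer and closes the stream on every exit.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Read the file at PATH into a newly allocated NUL-terminated buffer.
   Return NULL on any error.  The stream is closed on every path that
   opened it.  */
char *
demo_read_file (const char *path)
{
  char *result = NULL;
  size_t total = 0, alloc = 0, n;
  char buf[4096];
  FILE *f_in = fopen (path, "r");

  if (f_in == NULL)
    return NULL; /* Nothing was opened, so nothing to close.  */

  while ((n = fread (buf, 1, sizeof buf, f_in)) > 0)
    {
      if (total + n + 1 > alloc)
        {
          size_t new_alloc = (total + n + 1) * 2;
          char *tmp = realloc (result, new_alloc);
          if (tmp == NULL)
            goto fail;
          result = tmp;
          alloc = new_alloc;
        }
      memcpy (result + total, buf, n);
      total += n;
    }

  if (ferror (f_in))
    goto fail;

  if (result == NULL) /* Empty file: still return a valid string.  */
    {
      result = malloc (1);
      if (result == NULL)
        goto fail;
    }
  result[total] = '\0';
  fclose (f_in);
  return result;

fail:
  /* Single error exit: release the buffer and the stream together.  */
  free (result);
  fclose (f_in);
  return NULL;
}
```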



[PATCH] Fix REE for vector modes (PR rtl-optimization/64286)

2015-01-12 Thread Jakub Jelinek
Hi!

As mentioned in the PR, giving up for all vector mode extensions
is unnecessary, but unlike scalar integer extensions, where the low part
of the extended value is the original value, for vectors this is not true,
thus the old value is lost.  Which means we can perform REE, but only if
all uses of the definition are the same (code+mode) extension.
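
The scalar/vector difference can be seen with a minimal sketch in plain
C, with no intrinsics.  The demo_ names are hypothetical, and the
"vector" is modelled as an array of four bytes widened element-wise to
16 bits.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Scalar case: the low part of the zero-extended value is the
   original value, so the narrow value survives the extension.  */
int
demo_scalar_lowpart_matches (uint8_t x)
{
  uint16_t wide = x;          /* Zero extension.  */
  return (uint8_t) wide == x; /* Take the low part again.  */
}

/* Vector case: zero-extend each of four bytes to 16 bits, then compare
   the leading four bytes of the widened storage with the original
   four bytes.  */
int
demo_vector_lowpart_matches (const uint8_t in[4])
{
  uint16_t wide[4];
  uint8_t low[4];
  for (int i = 0; i < 4; i++)
    wide[i] = in[i];
  memcpy (low, wide, sizeof low);
  return memcmp (low, in, sizeof low) == 0;
}
```

For the input {1, 2, 3, 4} the widened storage begins with the bytes of
the first two elements only, so the original vector cannot be read back
from it; every use of the narrow value has to be the same extension.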

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2015-01-12  Jakub Jelinek  

PR rtl-optimization/64286
* ree.c (add_removable_extension): Don't add vector mode
extensions if all uses of the source register aren't the same
vector extensions.

* gcc.target/i386/avx2-pr64286.c: New test.

--- gcc/ree.c.jj        2015-01-09 22:00:00.427660442 +0100
+++ gcc/ree.c   2015-01-12 12:15:56.097184674 +0100
@@ -1027,6 +1027,7 @@ add_removable_extension (const_rtx expr,
 different extension.  FIXME: this obviously can be improved.  */
   for (def = defs; def; def = def->next)
if ((idx = def_map[INSN_UID (DF_REF_INSN (def->ref))])
+   && idx != -1U
&& (cand = &(*insn_list)[idx - 1])
&& cand->code != code)
  {
@@ -1038,6 +1039,57 @@ add_removable_extension (const_rtx expr,
  }
return;
  }
+   /* For vector mode extensions, ensure that all uses of the
+  XEXP (src, 0) register are the same extension (both code
+  and to which mode), as unlike integral extensions lowpart
+  subreg of the sign/zero extended register are not equal
+  to the original register, so we have to change all uses or
+  none.  */
+   else if (VECTOR_MODE_P (GET_MODE (XEXP (src, 0))))
+ {
+   if (idx == 0)
+ {
+   struct df_link *ref_chain, *ref_link;
+
+   ref_chain = DF_REF_CHAIN (def->ref);
+   for (ref_link = ref_chain; ref_link; ref_link = ref_link->next)
+ {
+   if (ref_link->ref == NULL
+   || DF_REF_INSN_INFO (ref_link->ref) == NULL)
+ {
+   idx = -1U;
+   break;
+ }
+   rtx_insn *use_insn = DF_REF_INSN (ref_link->ref);
+   const_rtx use_set;
+   if (use_insn == insn || DEBUG_INSN_P (use_insn))
+ continue;
+   if (!(use_set = single_set (use_insn))
+   || !REG_P (SET_DEST (use_set))
+   || GET_MODE (SET_DEST (use_set)) != GET_MODE (dest)
+   || GET_CODE (SET_SRC (use_set)) != code
+   || !rtx_equal_p (XEXP (SET_SRC (use_set), 0),
+XEXP (src, 0)))
+ {
+   idx = -1U;
+   break;
+ }
+ }
+   if (idx == -1U)
+ def_map[INSN_UID (DF_REF_INSN (def->ref))] = idx;
+ }
+   if (idx == -1U)
+ {
+   if (dump_file)
+ {
+   fprintf (dump_file, "Cannot eliminate extension:\n");
+   print_rtl_single (dump_file, insn);
+   fprintf (dump_file,
+" because some vector uses aren't extension\n");
+ }
+   return;
+ }
+ }
 
   /* Then add the candidate to the list and insert the reaching definitions
  into the definition map.  */
--- gcc/testsuite/gcc.target/i386/avx2-pr64286.c.jj 2015-01-12 12:19:54.863031657 +0100
+++ gcc/testsuite/gcc.target/i386/avx2-pr64286.c 2015-01-12 12:19:36.0 +0100
@@ -0,0 +1,37 @@
+/* PR rtl-optimization/64286 */
+/* { dg-do run } */
+/* { dg-options "-O2 -mavx2" } */
+/* { dg-require-effective-target avx2 } */
+
+#include 
+#include 
+#include 
+#include "avx2-check.h"
+
+__m128i v;
+__m256i w;
+
+__attribute__((noinline, noclone)) void
+foo (__m128i *p, __m128i *q)
+{
+  __m128i a = _mm_loadu_si128 (p);
+  __m128i b = _mm_xor_si128 (a, v);
+  w = _mm256_cvtepu8_epi16 (a);
+  *q = b;
+}
+
+static void
+avx2_test (void)
+{
+  v = _mm_set1_epi8 (0x40);
+  __m128i c = _mm_set_epi8 (16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1);
+  __m128i d;
+  foo (&c, &d);
+  __m128i e = _mm_set_epi8 (0x50, 0x4f, 0x4e, 0x4d, 0x4c, 0x4b, 0x4a, 0x49,
+   0x48, 0x47, 0x46, 0x45, 0x44, 0x43, 0x42, 0x41);
+  __m256i f = _mm256_set_epi16 (16, 15, 14, 13, 12, 11, 10, 9,
+   8, 7, 6, 5, 4, 3, 2, 1);
+  if (memcmp (&w, &f, sizeof (w)) != 0
+  || memcmp (&d, &e, sizeof (d)) != 0)
+abort ();
+}

Jakub


[PATCH] Fix VRP ICE with -Wtype-limits (PR tree-optimization/64563)

2015-01-12 Thread Jakub Jelinek
Hi!

On the following testcase we ICE with -Os -Wtype-limits, as
VR_UNDEFINED has NULL vr0->min and vr0->max.  From what the code
does I believe the code only means to handle VR_RANGE and not anything else.
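
The shape of the bug can be shown with a minimal sketch.  It assumes a
simplified range record in which the bounds are only populated for a
proper range; the demo_ types and names are hypothetical and are not
the tree-vrp.c data structures.

```c
#include <assert.h>
#include <stddef.h>

enum demo_range_type
{
  DEMO_VR_UNDEFINED,
  DEMO_VR_RANGE,
  DEMO_VR_ANTI_RANGE,
  DEMO_VR_VARYING
};

struct demo_value_range
{
  enum demo_range_type type;
  const long *min; /* NULL when the range is undefined.  */
  const long *max; /* NULL when the range is undefined.  */
};

/* Return 1 if VR is a range spanning [TYPE_MIN, TYPE_MAX].  Testing
   "type != DEMO_VR_VARYING" instead would let an undefined range
   through and dereference its NULL bounds.  */
int
demo_covers_whole_type (const struct demo_value_range *vr,
                        long type_min, long type_max)
{
  if (vr->type != DEMO_VR_RANGE)
    return 0;
  return *vr->min == type_min && *vr->max == type_max;
}
```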

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2015-01-12  Jakub Jelinek  

PR tree-optimization/64563
* tree-vrp.c (vrp_evaluate_conditional): Check for VR_RANGE
instead of != VR_VARYING.

* gcc.dg/pr64563.c: New test.

--- gcc/tree-vrp.c.jj   2015-01-09 21:59:29.0 +0100
+++ gcc/tree-vrp.c  2015-01-12 11:21:50.363521200 +0100
@@ -7545,7 +7545,7 @@ vrp_evaluate_conditional (enum tree_code
   tree type = TREE_TYPE (op0);
   value_range_t *vr0 = get_value_range (op0);
 
-  if (vr0->type != VR_VARYING
+  if (vr0->type == VR_RANGE
  && INTEGRAL_TYPE_P (type)
  && vrp_val_is_min (vr0->min)
  && vrp_val_is_max (vr0->max)
--- gcc/testsuite/gcc.dg/pr64563.c.jj   2015-01-12 11:24:07.595145836 +0100
+++ gcc/testsuite/gcc.dg/pr64563.c  2015-01-12 11:23:08.0 +0100
@@ -0,0 +1,14 @@
+/* PR tree-optimization/64563 */
+/* { dg-do compile } */
+/* { dg-options "-Os -Wtype-limits" } */
+
+int a, b, c, d, f;
+unsigned int e;
+
+void
+foo (void)
+{
+  d = b = (a != (e | 4294967288UL));
+  if (!d)
+c = f || b;
+}

Jakub


Re: LTO streaming of TARGET_OPTIMIZE_NODE

2015-01-12 Thread Jan Hubicka
> On 09 Jan 12:45, Jakub Jelinek wrote:
> > --- gcc/cgraphunit.c.jj 2015-01-09 12:01:33.0 +0100
> > +++ gcc/cgraphunit.c2015-01-09 12:22:27.742692667 +0100
> > @@ -2108,11 +2108,14 @@ ipa_passes (void)
> >if (g->have_offload)
> > {
> >   section_name_prefix = OFFLOAD_SECTION_NAME_PREFIX;
> > + lto_stream_offload_p = true;
> >   ipa_write_summaries (true);
> > + lto_stream_offload_p = false;
> > }
> >if (flag_lto)
> > {
> >   section_name_prefix = LTO_SECTION_NAME_PREFIX;
> > + lto_stream_offload_p = false;
> >   ipa_write_summaries (false);
> > }
> 
> Now when we have a global flag, there is no longer need to pass a flag to
> ipa_write_summaries and to select_what_to_stream.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, OK for trunk?
OK,
Honza
> 
> 
> gcc/
>   * cgraphunit.c (ipa_passes): Remove argument from ipa_write_summaries.
>   * lto-cgraph.c (select_what_to_stream): Remove argument, use
>   lto_stream_offload_p instead.
>   * lto-streamer.h (select_what_to_stream): Remove argument.
>   * passes.c (ipa_write_summaries): Likewise.
>   * tree-pass.h (ipa_write_summaries): Likewise.
> gcc/lto/
>   * lto-partition.c (lto_promote_cross_file_statics): Remove argument
>   from select_what_to_stream.
> 
> 
> diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
> index 149f447..1ef1b6c 100644
> --- a/gcc/cgraphunit.c
> +++ b/gcc/cgraphunit.c
> @@ -2115,14 +2115,14 @@ ipa_passes (void)
>   {
> section_name_prefix = OFFLOAD_SECTION_NAME_PREFIX;
> lto_stream_offload_p = true;
> -   ipa_write_summaries (true);
> +   ipa_write_summaries ();
> lto_stream_offload_p = false;
>   }
>if (flag_lto)
>   {
> section_name_prefix = LTO_SECTION_NAME_PREFIX;
> lto_stream_offload_p = false;
> -   ipa_write_summaries (false);
> +   ipa_write_summaries ();
>   }
>  }
>  
> diff --git a/gcc/lto-cgraph.c b/gcc/lto-cgraph.c
> index 6c6501a..91be530 100644
> --- a/gcc/lto-cgraph.c
> +++ b/gcc/lto-cgraph.c
> @@ -842,11 +842,11 @@ create_references (lto_symtab_encoder_t encoder, symtab_node *node)
>  /* Select what needs to be streamed out.  In regular lto mode stream everything.
> In offload lto mode stream only nodes marked as offloadable.  */
>  void
> -select_what_to_stream (bool offload_lto_mode)
> +select_what_to_stream (void)
>  {
>struct symtab_node *snode;
>FOR_EACH_SYMBOL (snode)
> -snode->need_lto_streaming = !offload_lto_mode || snode->offloadable;
> +snode->need_lto_streaming = !lto_stream_offload_p || snode->offloadable;
>  }
>  
>  /* Find all symbols we want to stream into given partition and insert them
> diff --git a/gcc/lto-streamer.h b/gcc/lto-streamer.h
> index 853630f..2d9f30c 100644
> --- a/gcc/lto-streamer.h
> +++ b/gcc/lto-streamer.h
> @@ -837,7 +837,7 @@ bool referenced_from_this_partition_p (symtab_node *,
>  bool reachable_from_this_partition_p (struct cgraph_node *,
> lto_symtab_encoder_t);
>  lto_symtab_encoder_t compute_ltrans_boundary (lto_symtab_encoder_t encoder);
> -void select_what_to_stream (bool);
> +void select_what_to_stream (void);
>  
>  /* In options-save.c.  */
>  void cl_target_option_stream_out (struct output_block *, struct bitpack_d *,
> diff --git a/gcc/lto/lto-partition.c b/gcc/lto/lto-partition.c
> index 38809d2..c1179cb 100644
> --- a/gcc/lto/lto-partition.c
> +++ b/gcc/lto/lto-partition.c
> @@ -973,7 +973,8 @@ lto_promote_cross_file_statics (void)
>  
>gcc_assert (flag_wpa);
>  
> -  select_what_to_stream (false);
> +  lto_stream_offload_p = false;
> +  select_what_to_stream ();
>  
>/* First compute boundaries.  */
>n_sets = ltrans_partitions.length ();
> diff --git a/gcc/passes.c b/gcc/passes.c
> index 52dc067..e78a325 100644
> --- a/gcc/passes.c
> +++ b/gcc/passes.c
> @@ -2464,7 +2464,7 @@ ipa_write_summaries_1 (lto_symtab_encoder_t encoder)
>  /* Write out summaries for all the nodes in the callgraph.  */
>  
>  void
> -ipa_write_summaries (bool offload_lto_mode)
> +ipa_write_summaries (void)
>  {
>lto_symtab_encoder_t encoder;
>int i, order_pos;
> @@ -2475,7 +2475,7 @@ ipa_write_summaries (bool offload_lto_mode)
>if ((!flag_generate_lto && !flag_generate_offload) || seen_error ())
>  return;
>  
> -  select_what_to_stream (offload_lto_mode);
> +  select_what_to_stream ();
>  
>encoder = lto_symtab_encoder_new (false);
>  
> diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
> index 398ab83..9ff5bdc 100644
> --- a/gcc/tree-pass.h
> +++ b/gcc/tree-pass.h
> @@ -603,7 +603,7 @@ extern void pass_fini_dump_file (opt_pass *);
>  extern const char *get_current_pass_name (void);
>  extern void print_current_pass (FILE *);
>  extern void debug_pass (void);
> -extern void ipa_write_summaries (bool);
> +extern void ipa_write_summaries (void);
>  extern void ipa_write_optimization_summaries (struct l

[PATCH] Install libgcj.pc as libgcj-5.pc rather than libgcj-5.0.pc (PR libgcj/64219)

2015-01-12 Thread Jakub Jelinek
Hi!

This patch changes the libgcj*.pc installed filename to match the new GCC
versioning scheme.

Bootstrapped/regtested on x86_64-linux and i686-linux, tested make install.

-rw-r--r--. 1 jakub jakub 192 Jan 12 21:02 /tmp/blah/usr/local/lib64/pkgconfig/libgcj-5.pc
-rw-r--r--. 1 jakub jakub 192 Jan 12 21:02 /tmp/blah/usr/local/lib/pkgconfig/libgcj-5.pc

Ok for trunk?

2015-01-12  Jakub Jelinek  

PR libgcj/64219
* Makefile.am (install-data-local): Use just the major version
from GCJVERSION instead of major.minor.
* Makefile.in: Regenerated.

--- libjava/Makefile.am.jj  2014-02-20 21:38:45.0 +0100
+++ libjava/Makefile.am 2015-01-12 12:40:50.453179067 +0100
@@ -779,7 +779,7 @@ install_data_local_split = 50
 install-data-local:
$(PRE_INSTALL)
 ## Install the .pc file.
-   @pc_version=`echo $(GCJVERSION) | sed -e 's/[.][^.]*$$//'`; \
+   @pc_version=`echo $(GCJVERSION) | sed -e 's/[.][^.]*[.][^.]*$$//'`; \
file="libgcj-$${pc_version}.pc"; \
$(mkinstalldirs) $(DESTDIR)$(pkgconfigdir); \
echo "  $(INSTALL_DATA) libgcj.pc $(DESTDIR)$(pkgconfigdir)/$$file"; \
--- libjava/Makefile.in.jj  2014-02-20 21:38:45.0 +0100
+++ libjava/Makefile.in 2015-01-12 12:41:09.376849424 +0100
@@ -12455,7 +12455,7 @@ install-exec-hook: install-binPROGRAMS i
 @BUILD_ECJ1_TRUE@  mv $(DESTDIR)$(libexecsubdir)/`echo ecjx | sed 
's,^.*/,,;$(transform);s/$$/$(EXEEXT)/'` 
$(DESTDIR)$(libexecsubdir)/ecj1$(host_exeext)
 install-data-local:
$(PRE_INSTALL)
-   @pc_version=`echo $(GCJVERSION) | sed -e 's/[.][^.]*$$//'`; \
+   @pc_version=`echo $(GCJVERSION) | sed -e 's/[.][^.]*[.][^.]*$$//'`; \
file="libgcj-$${pc_version}.pc"; \
$(mkinstalldirs) $(DESTDIR)$(pkgconfigdir); \
echo "  $(INSTALL_DATA) libgcj.pc $(DESTDIR)$(pkgconfigdir)/$$file"; \

Jakub
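
The difference between the two sed expressions is how many trailing dot-separated components they remove: the old one drops one (5.0.0 becomes 5.0), the new one drops two (5.0.0 becomes 5). A minimal C sketch of the same string operation follows; the function name is made up for illustration and nothing like it exists in libjava.

```c
#include <stddef.h>
#include <string.h>

/* Truncate S in place at the COUNT-th '.' counted from the end,
   i.e. drop the last COUNT dot-separated components.  If S has fewer
   than COUNT dots it is left unchanged, just as sed leaves its input
   alone when the pattern does not match.  */
void
strip_version_components (char *s, int count)
{
  if (count <= 0)
    return;
  for (size_t i = strlen (s); i > 0; i--)
    if (s[i - 1] == '.' && --count == 0)
      {
        s[i - 1] = '\0';
        return;
      }
}
```

Calling it with a count of 1 corresponds to the old expression and a count of 2 to the new one.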


[PATCH] Fix up some gcc.dg/vect/ testcases with -fpic (PR testsuite/64028)

2015-01-12 Thread Jakub Jelinek
Hi!

Various gcc.dg/vect/ testcases now fail on the trunk with -fpic.
The problem is that they expect the global vars to bind locally so that
the vectorizer can increase their alignment, but with -fpic that does not
work, as the vars can be interposed.

Fixed by adding dg-add-options bind_pic_locally.  Bootstrapped/regtested on
x86_64-linux and i686-linux, ok for trunk?
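
A minimal sketch of the kind of test this affects, not one of the files in the patch: a global array that the vectorizer would like to over-align. The array name, its size, and the function are made up for illustration; the two dg- directives are the ones visible in the diff. My understanding, which should be checked against target-supports.exp, is that bind_pic_locally makes the harness add -fpie or -fPIE when the test is run with -fpic or -fPIC, so that the globals bind locally again.

```c
/* { dg-require-effective-target vect_int } */
/* { dg-add-options bind_pic_locally } */

#define N 16

/* A global: with -fpic and default visibility it may be interposed,
   so the compiler cannot assume it controls the array's alignment.  */
int toy_global[N];

/* Simple loops over the global, of the kind the vectorizer targets.  */
int
toy_fill_and_sum (void)
{
  int sum = 0;
  for (int i = 0; i < N; i++)
    toy_global[i] = i;
  for (int i = 0; i < N; i++)
    sum += toy_global[i];
  return sum;
}
```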

2015-01-12  Jakub Jelinek  

PR testsuite/64028
* gcc.dg/vect/no-section-anchors-vect-31.c: Add dg-add-options
bind_pic_locally.
* gcc.dg/vect/no-section-anchors-vect-34.c: Likewise.
* gcc.dg/vect/no-section-anchors-vect-36.c: Likewise.
* gcc.dg/vect/no-section-anchors-vect-64.c: Likewise.
* gcc.dg/vect/no-section-anchors-vect-65.c: Likewise.
* gcc.dg/vect/no-section-anchors-vect-68.c: Likewise.
* gcc.dg/vect/no-section-anchors-vect-69.c: Likewise.
* gcc.dg/vect/slp-25.c: Likewise.
* gcc.dg/vect/vect-109.c: Likewise.
* gcc.dg/vect/vect-13.c: Likewise.
* gcc.dg/vect/vect-17.c: Likewise.
* gcc.dg/vect/vect-18.c: Likewise.
* gcc.dg/vect/vect-19.c: Likewise.
* gcc.dg/vect/vect-20.c: Likewise.
* gcc.dg/vect/vect-21.c: Likewise.
* gcc.dg/vect/vect-22.c: Likewise.
* gcc.dg/vect/vect-27.c: Likewise.
* gcc.dg/vect/vect-29.c: Likewise.
* gcc.dg/vect/vect-2-big-array.c: Likewise.
* gcc.dg/vect/vect-2.c: Likewise.
* gcc.dg/vect/vect-3.c: Likewise.
* gcc.dg/vect/vect-4.c: Likewise.
* gcc.dg/vect/vect-5.c: Likewise.
* gcc.dg/vect/vect-72.c: Likewise.
* gcc.dg/vect/vect-73-big-array.c: Likewise.
* gcc.dg/vect/vect-73.c: Likewise.
* gcc.dg/vect/vect-77-global.c: Likewise.
* gcc.dg/vect/vect-78-global.c: Likewise.
* gcc.dg/vect/vect-7.c: Likewise.
* gcc.dg/vect/vect-86.c: Likewise.
* gcc.dg/vect/vect-align-1.c: Likewise.
* gcc.dg/vect/vect-align-3.c: Likewise.
* gcc.dg/vect/vect-all-big-array.c: Likewise.
* gcc.dg/vect/vect-all.c: Likewise.
* gcc.dg/vect/vect-multitypes-1.c: Likewise.
* gcc.dg/vect/vect-multitypes-4.c: Likewise.
* gcc.dg/vect/vect-peel-3.c: Likewise.
* gcc.dg/vect/vect-peel-4.c: Likewise.
* gcc.dg/vect/wrapv-vect-7.c: Likewise.

--- gcc/testsuite/gcc.dg/vect/vect-18.c.jj  2008-09-05 12:54:35.0 +0200
+++ gcc/testsuite/gcc.dg/vect/vect-18.c 2015-01-12 13:54:28.201166746 +0100
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_int } */
+/* { dg-add-options bind_pic_locally } */
   
 #include 
 #include "tree-vect.h"
--- gcc/testsuite/gcc.dg/vect/vect-17.c.jj  2010-11-03 16:58:21.0 +0100
+++ gcc/testsuite/gcc.dg/vect/vect-17.c 2015-01-12 13:54:22.394268074 +0100
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_int } */
+/* { dg-add-options bind_pic_locally } */
   
 #include 
 #include "tree-vect.h"
--- gcc/testsuite/gcc.dg/vect/vect-4.c.jj   2008-09-05 12:54:35.0 +0200
+++ gcc/testsuite/gcc.dg/vect/vect-4.c  2015-01-12 13:55:08.322466645 +0100
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_float } */
+/* { dg-add-options bind_pic_locally } */
 
 #include 
 #include "tree-vect.h"
--- gcc/testsuite/gcc.dg/vect/vect-2-big-array.c.jj 2011-12-11 22:02:36.043642629 +0100
+++ gcc/testsuite/gcc.dg/vect/vect-2-big-array.c 2015-01-12 13:54:59.260624770 +0100
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_int } */
+/* { dg-add-options bind_pic_locally } */
 
 #include 
 #include "tree-vect.h"
--- gcc/testsuite/gcc.dg/vect/vect-2.c.jj   2012-03-20 08:51:25.653267483 +0100
+++ gcc/testsuite/gcc.dg/vect/vect-2.c  2015-01-12 13:55:01.959577675 +0100
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_int } */
+/* { dg-add-options bind_pic_locally } */
 
 #include 
 #include "tree-vect.h"
--- gcc/testsuite/gcc.dg/vect/vect-77-global.c.jj   2009-05-13 08:42:37.0 +0200
+++ gcc/testsuite/gcc.dg/vect/vect-77-global.c  2015-01-12 13:55:23.363204190 +0100
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_int } */
+/* { dg-add-options bind_pic_locally } */
 
 #include 
 #include "tree-vect.h"
--- gcc/testsuite/gcc.dg/vect/vect-27.c.jj  2009-11-04 18:36:22.0 +0100
+++ gcc/testsuite/gcc.dg/vect/vect-27.c 2015-01-12 13:54:52.915735486 +0100
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_int } */
+/* { dg-add-options bind_pic_locally } */
 
 #include 
 #include "tree-vect.h"
--- gcc/testsuite/gcc.dg/vect/vect-78-global.c.jj   2009-05-13 08:42:37.0 +0200
+++ gcc/testsuite/gcc.dg/vect/vect-78-global.c  2015-01-12 13:55:25.802161631 +0100
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_int } */
+/* { dg-add-options bind_pic_locally } */
 
 #include 
 #include "tree-vect.h"
--- gcc/testsuite/gcc.dg/vect/vect-5.c.jj   2008-09-05 12:54:36.0 +0200
+++ gcc/testsuite/gcc.dg/vect/vect-5.c  2015-01-12 13:55:12.

[PATCH] Use ldexp instead of scalbln for portability (PR other/64370)

2015-01-12 Thread Jakub Jelinek
Hi!

As mentioned in the PR, HPUX doesn't have scalbln, but does have ldexp and
that function is already used in gcj-dump, so supposedly it is more portable
to use ldexp.  Also in glibc it is defined in libc in addition to libm.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2015-01-12  Jakub Jelinek  

PR other/64370
* sreal.c (sreal::to_double): Use ldexp instead of scalbln.

--- gcc/sreal.c.jj  2015-01-09 12:01:33.0 +0100
+++ gcc/sreal.c 2015-01-12 14:26:44.339128332 +0100
@@ -122,7 +122,7 @@ sreal::to_double () const
 {
   double val = m_sig;
   if (m_exp)
-val = scalbln (val, m_exp);
+val = ldexp (val, m_exp);
   return val;
 }
 

Jakub
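
As a standalone illustration of the replaced call: scalbln scales by FLT_RADIX raised to a long exponent, while ldexp scales by 2 raised to an int exponent, so the two agree wherever FLT_RADIX is 2. The sketch below is a hypothetical free-function version of the conversion; the name and the parameter types are assumptions made for the example, not the actual sreal members.

```c
#include <math.h>

/* Hypothetical free-function version of sreal::to_double: SIG and
   EXP stand in for the m_sig and m_exp members.  */
double
toy_sreal_to_double (long long sig, int exp)
{
  double val = (double) sig;
  if (exp)
    val = ldexp (val, exp);	/* val * 2^exp */
  return val;
}
```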


Re: [testsuite] PATCH: Add check_effective_target_pie

2015-01-12 Thread H.J. Lu
On Mon, Jan 12, 2015 at 12:03 PM, Jeff Law  wrote:
> On 01/12/15 12:59, H.J. Lu wrote:
>>
>> I don't know if -pg will work with PIE on any target.  For Linux/x86
>> the choices of crt1.o are
>>
>> %{!shared: %{pg|p|profile:gcrt1.o%s;pie:Scrt1.o%s;:crt1.o%s}}
>>
>> -shared, -pg and -pie are mutually exclusive. Those crt1 files are the
>> only crt1 files provided by glibc.  You can't even try -pg -pie on
>> Linux without changing glibc.
>
> You're totally missing the point.  What I care about is *why*.
>
> Showing me spec file fragments is totally unhelpful.  What is the technical
> reason why pg and pie are mutually exclusive?

What kind of "technical" reason are you looking for?  glibc doesn't
provide the right crt1 file for GCC to support this combination.  You
can't define GNU_USER_TARGET_STARTFILE_SPEC to support
-pg and -pie.

If you are asking "why" glibc doesn't provide one, my guess is no
one has requested one before.

-- 
H.J.
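
The spec fragment quoted in this thread reads as a priority list, which can be sketched in C as follows. The function is purely illustrative (the GCC driver evaluates spec strings, it does not contain a function like this); only the three file names and the order of the alternatives come from the quoted spec.

```c
#include <stdbool.h>
#include <stddef.h>

/* Sketch of the selection encoded by the quoted spec:
   %{!shared: %{pg|p|profile:gcrt1.o%s;pie:Scrt1.o%s;:crt1.o%s}}
   Returns NULL when -shared is given (no crt1 file is selected).  */
const char *
toy_pick_crt1 (bool shared, bool profiling, bool pie)
{
  if (shared)
    return NULL;
  if (profiling)	/* -pg, -p or -profile: checked first */
    return "gcrt1.o";
  if (pie)
    return "Scrt1.o";
  return "crt1.o";
}
```

With both profiling and pie set, the result is gcrt1.o, so the PIE start file is never chosen for that combination. That is the spec-level face of the incompatibility being discussed; it does not answer the question of whether a PIE-capable profiling start file could exist.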


[PATCH 1/4] Core definition for APM XGene-1 and associated cost-table.

2015-01-12 Thread Philipp Tomsich
To keep this change separately buildable from the pipeline model,
this patch directs the APM XGene-1 to use the generic scheduling
model.
---
 gcc/ChangeLog-2014   |   8 +++
 gcc/config/aarch64/aarch64-cores.def |   1 +
 gcc/config/aarch64/aarch64-tune.md   |   2 +-
 gcc/config/aarch64/aarch64.c |  68 +++
 gcc/config/arm/aarch-cost-tables.h   | 101 +++
 gcc/doc/invoke.texi  |   3 +-
 6 files changed, 181 insertions(+), 2 deletions(-)

diff --git a/gcc/ChangeLog-2014 b/gcc/ChangeLog-2014
index 58091df..dd49d7f 100644
--- a/gcc/ChangeLog-2014
+++ b/gcc/ChangeLog-2014
@@ -5350,6 +5350,14 @@
optimization of ashiftrt of subreg of lshiftrt, check that code
is ASHIFTRT.
 
+2014-11-19  Philipp Tomsich  
+
+   * config/aarch64/aarch64-cores.def (xgene1): Update/add the
+   xgene1 (APM XGene-1) core definition.
+   * gcc/config/aarch64/aarch64.c: Add cost tables for APM XGene-1
+   * config/arm/aarch-cost-tables.h: Add cost tables for APM XGene-1
+   * doc/invoke.texi: Document -mcpu=xgene1.
+
 2014-11-18  Andrew MacLeod  
 
* attribs.c (decl_attributes): Remove always true condition,
diff --git a/gcc/config/aarch64/aarch64-cores.def 
b/gcc/config/aarch64/aarch64-cores.def
index 18f5c48..35a43e6 100644
--- a/gcc/config/aarch64/aarch64-cores.def
+++ b/gcc/config/aarch64/aarch64-cores.def
@@ -37,6 +37,7 @@
 AARCH64_CORE("cortex-a53",  cortexa53, cortexa53, 8,  AARCH64_FL_FOR_ARCH8 | 
AARCH64_FL_CRC, cortexa53)
 AARCH64_CORE("cortex-a57",  cortexa15, cortexa15, 8,  AARCH64_FL_FOR_ARCH8 | 
AARCH64_FL_CRC, cortexa57)
 AARCH64_CORE("thunderx",thunderx,  thunderx, 8,  AARCH64_FL_FOR_ARCH8 | 
AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx)
+AARCH64_CORE("xgene1",  xgene1,xgene1,8,  AARCH64_FL_FOR_ARCH8, 
xgene1)
 
 /* V8 big.LITTLE implementations.  */
 
diff --git a/gcc/config/aarch64/aarch64-tune.md 
b/gcc/config/aarch64/aarch64-tune.md
index c717ea8..6409082 100644
--- a/gcc/config/aarch64/aarch64-tune.md
+++ b/gcc/config/aarch64/aarch64-tune.md
@@ -1,5 +1,5 @@
 ;; -*- buffer-read-only: t -*-
 ;; Generated automatically by gentune.sh from aarch64-cores.def
 (define_attr "tune"
-   "cortexa53,cortexa15,thunderx,cortexa57cortexa53"
+   "cortexa53,cortexa15,thunderx,xgene1,cortexa57cortexa53"
(const (symbol_ref "((enum attr_tune) aarch64_tune)")))
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 5100532..dd43a73 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -233,6 +233,27 @@ static const struct cpu_addrcost_table 
cortexa57_addrcost_table =
 #if HAVE_DESIGNATED_INITIALIZERS && GCC_VERSION >= 2007
 __extension__
 #endif
+static const struct cpu_addrcost_table xgene1_addrcost_table =
+{
+#if HAVE_DESIGNATED_INITIALIZERS
+  .addr_scale_costs =
+#endif
+{
+  NAMED_PARAM (hi, 1),
+  NAMED_PARAM (si, 0),
+  NAMED_PARAM (di, 0),
+  NAMED_PARAM (ti, 1),
+},
+  NAMED_PARAM (pre_modify, 1),
+  NAMED_PARAM (post_modify, 0),
+  NAMED_PARAM (register_offset, 0),
+  NAMED_PARAM (register_extend, 1),
+  NAMED_PARAM (imm_offset, 0),
+};
+
+#if HAVE_DESIGNATED_INITIALIZERS && GCC_VERSION >= 2007
+__extension__
+#endif
 static const struct cpu_regmove_cost generic_regmove_cost =
 {
   NAMED_PARAM (GP2GP, 1),
@@ -271,6 +292,16 @@ static const struct cpu_regmove_cost thunderx_regmove_cost 
=
   NAMED_PARAM (FP2FP, 4)
 };
 
+static const struct cpu_regmove_cost xgene1_regmove_cost =
+{
+  NAMED_PARAM (GP2GP, 1),
+  /* Avoid the use of slow int<->fp moves for spilling by setting
+ their cost higher than memmov_cost.  */
+  NAMED_PARAM (GP2FP, 8),
+  NAMED_PARAM (FP2GP, 8),
+  NAMED_PARAM (FP2FP, 2)
+};
+
 /* Generic costs for vector insn classes.  */
 #if HAVE_DESIGNATED_INITIALIZERS && GCC_VERSION >= 2007
 __extension__
@@ -311,6 +342,26 @@ static const struct cpu_vector_cost cortexa57_vector_cost =
   NAMED_PARAM (cond_not_taken_branch_cost, 1)
 };
 
+/* Generic costs for vector insn classes.  */
+#if HAVE_DESIGNATED_INITIALIZERS && GCC_VERSION >= 2007
+__extension__
+#endif
+static const struct cpu_vector_cost xgene1_vector_cost =
+{
+  NAMED_PARAM (scalar_stmt_cost, 1),
+  NAMED_PARAM (scalar_load_cost, 5),
+  NAMED_PARAM (scalar_store_cost, 1),
+  NAMED_PARAM (vec_stmt_cost, 2),
+  NAMED_PARAM (vec_to_scalar_cost, 4),
+  NAMED_PARAM (scalar_to_vec_cost, 4),
+  NAMED_PARAM (vec_align_load_cost, 10),
+  NAMED_PARAM (vec_unalign_load_cost, 10),
+  NAMED_PARAM (vec_unalign_store_cost, 2),
+  NAMED_PARAM (vec_store_cost, 2),
+  NAMED_PARAM (cond_taken_branch_cost, 2),
+  NAMED_PARAM (cond_not_taken_branch_cost, 1)
+};
+
 #define AARCH64_FUSE_NOTHING   (0)
 #define AARCH64_FUSE_MOV_MOVK  (1 << 0)
 #define AARCH64_FUSE_ADRP_ADD  (1 << 1)
@@ -390,6 +441,23 @@ static const struct tune_params thunderx_tunings =
   1/* vec_reassoc_width.  */
 };
 
+static const struct tune_params xgene1_tunings

[PATCH 3/4] Change the type of the prefetch-instructions to 'prefetch'.

2015-01-12 Thread Philipp Tomsich
---
 gcc/config/aarch64/aarch64.md | 2 +-
 gcc/config/arm/types.md   | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 1f6b1b6..98f4f30 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -391,7 +391,7 @@
 
 return pftype[INTVAL(operands[1])][locality];
   }
-  [(set_attr "type" "load1")]
+  [(set_attr "type" "prefetch")]
 )
 
 (define_insn "trap"
diff --git a/gcc/config/arm/types.md b/gcc/config/arm/types.md
index d368446..088c21a 100644
--- a/gcc/config/arm/types.md
+++ b/gcc/config/arm/types.md
@@ -118,6 +118,7 @@
 ; mvn_shift_reg  inverting move instruction, shifted operand by a register.
 ; no_insnan insn which does not represent an instruction in the
 ;final output, thus having no impact on scheduling.
+; prefetch   a prefetch instruction
 ; rbit   reverse bits.
 ; revreverse bytes.
 ; sdiv   signed division.
@@ -556,6 +557,7 @@
   call,\
   clz,\
   no_insn,\
+  prefetch,\
   csel,\
   crc,\
   extend,\
-- 
1.9.1



[PATCH 2/4] Pipeline model for APM XGene-1.

2015-01-12 Thread Philipp Tomsich
---
 gcc/config/aarch64/aarch64.md |   1 +
 gcc/config/arm/xgene1.md  | 531 ++
 2 files changed, 532 insertions(+)
 create mode 100644 gcc/config/arm/xgene1.md

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 12e1054..1f6b1b6 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -190,6 +190,7 @@
 (include "../arm/cortex-a53.md")
 (include "../arm/cortex-a15.md")
 (include "thunderx.md")
+(include "../arm/xgene1.md")
 
 ;; ---
 ;; Jumps and other miscellaneous insns
diff --git a/gcc/config/arm/xgene1.md b/gcc/config/arm/xgene1.md
new file mode 100644
index 000..e909fd0
--- /dev/null
+++ b/gcc/config/arm/xgene1.md
@@ -0,0 +1,531 @@
+;; Machine description for AppliedMicro xgene1 core.
+;; Copyright (C) 2012-2014 Free Software Foundation, Inc.
+;; Contributed by Theobroma Systems Design und Consulting GmbH.
+;;
+;; This file is part of GCC.
+;;
+;; GCC is free software; you can redistribute it and/or modify it
+;; under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+;;
+;; GCC is distributed in the hope that it will be useful, but
+;; WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+;; General Public License for more details.
+;;
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; .
+
+;; Pipeline description for the xgene1 micro-architecture
+
+(define_automaton "xgene1")
+
+(define_cpu_unit "xgene1_decode_out0" "xgene1")
+(define_cpu_unit "xgene1_decode_out1" "xgene1")
+(define_cpu_unit "xgene1_decode_out2" "xgene1")
+(define_cpu_unit "xgene1_decode_out3" "xgene1")
+
+(define_cpu_unit "xgene1_divide" "xgene1")
+(define_cpu_unit "xgene1_fp_divide" "xgene1")
+(define_cpu_unit "xgene1_fsu" "xgene1")
+(define_cpu_unit "xgene1_fcmp" "xgene1")
+
+(define_reservation "xgene1_decode1op"
+"( xgene1_decode_out0 )
+|( xgene1_decode_out1 )
+|( xgene1_decode_out2 )
+|( xgene1_decode_out3 )"
+)
+(define_reservation "xgene1_decode2op"
+"( xgene1_decode_out0 + xgene1_decode_out1 )
+|( xgene1_decode_out0 + xgene1_decode_out2 )
+|( xgene1_decode_out0 + xgene1_decode_out3 )
+|( xgene1_decode_out1 + xgene1_decode_out2 )
+|( xgene1_decode_out1 + xgene1_decode_out3 )
+|( xgene1_decode_out2 + xgene1_decode_out3 )"
+)
+(define_reservation "xgene1_decodeIsolated"
+"( xgene1_decode_out0 + xgene1_decode_out1 + xgene1_decode_out2 + 
xgene1_decode_out3 )"
+)
+
+(define_insn_reservation "xgene1_branch" 1
+  (and (eq_attr "tune" "xgene1")
+   (eq_attr "type" "branch"))
+  "xgene1_decode1op")
+
+(define_insn_reservation "xgene1_nop" 1
+  (and (eq_attr "tune" "xgene1")
+   (eq_attr "type" "no_insn"))
+  "xgene1_decode1op")
+
+(define_insn_reservation "xgene1_call" 1
+  (and (eq_attr "tune" "xgene1")
+   (eq_attr "type" "call"))
+  "xgene1_decode2op")
+
+(define_insn_reservation "xgene1_f_load" 10
+  (and (eq_attr "tune" "xgene1")
+   (eq_attr "type" "f_loadd,f_loads"))
+  "xgene1_decode2op")
+
+(define_insn_reservation "xgene1_f_store" 4
+  (and (eq_attr "tune" "xgene1")
+   (eq_attr "type" "f_stored,f_stores"))
+  "xgene1_decode2op")
+
+(define_insn_reservation "xgene1_fmov" 2
+  (and (eq_attr "tune" "xgene1")
+   (eq_attr "type" "fmov,fconsts,fconstd"))
+  "xgene1_decode1op")
+
+(define_insn_reservation "xgene1_f_mcr" 10
+  (and (eq_attr "tune" "xgene1")
+   (eq_attr "type" "f_mcr"))
+  "xgene1_decodeIsolated")
+
+(define_insn_reservation "xgene1_f_mrc" 4
+  (and (eq_attr "tune" "xgene1")
+   (eq_attr "type" "f_mrc"))
+  "xgene1_decode2op")
+
+(define_insn_reservation "xgene1_load_pair" 6
+  (and (eq_attr "tune" "xgene1")
+   (eq_attr "type" "load2"))
+  "xgene1_decodeIsolated")
+
+(define_insn_reservation "xgene1_store_pair" 2
+  (and (eq_attr "tune" "xgene1")
+   (eq_attr "type" "store2"))
+  "xgene1_decodeIsolated")
+
+(define_insn_reservation "xgene1_fp_load1" 10
+  (and (eq_attr "tune" "xgene1")
+   (eq_attr "type" "load1")
+   (eq_attr "fp" "yes"))
+  "xgene1_decode1op")
+
+(define_insn_reservation "xgene1_load1" 5
+  (and (eq_attr "tune" "xgene1")
+   (eq_attr "type" "load1"))
+  "xgene1_decode1op")
+
+(define_insn_reservation "xgene1_store1" 2
+  (and (eq_attr "tune" "xgene1")
+   (eq_attr "type" "store1"))
+  "xgene1_decode2op")
+
+(define_insn_reservation "xgene1_move" 1
+  (and (eq_attr "tune" "xgene1")
+   (eq_attr "type" "mov_reg,mov_imm,mrs"))
+  "xgene1_decode1op")
+
+(define_insn_reservation "xgene1_alu" 1
+  (and (eq_attr "tune" "xgene1")
+   (eq_attr "type" "alu_imm,alu_sreg,alu_shift_imm,\
+  

[PATCH 0/4, AArch64, v4] APM X-Gene 1 cost-table and pipeline model

2015-01-12 Thread Philipp Tomsich
Marcus & Ramana,

Attached is the updated---and hopefully final---revision of the changes to
get XGene-1 properly wired up in the AArch64 and AArch32 backends.

On the AArch64 side, we've only removed the URL from the credits of the
xgene1.md file and the remaining content is unchanged (save for changes
from rebasing to the current head).
Note that the AArch32 integration is contained entirely in patch 4/4 and 
requires a gas-change that was merged as ea0d6bb on the binutils tree.

These patches incorporate all earlier comments and have been tested for
AArch64 (aarch64-linux-gnu in LE and BE configurations) and for AArch32
(arm-none-eabi).

I'd be grateful if you could apply at least the AArch64 patches.

Best,
Phil.



Philipp Tomsich (4):
  Core definition for APM XGene-1 and associated cost-table.
  Pipeline model for APM XGene-1.
  Change the type of the prefetch-instructions to 'prefetch'.
  Wire X-Gene 1 up in the ARM (32bit) backend as a AArch32-capable core.

 gcc/ChangeLog-2014   |  18 ++
 gcc/config/aarch64/aarch64-cores.def |   1 +
 gcc/config/aarch64/aarch64-tune.md   |   2 +-
 gcc/config/aarch64/aarch64.c |  68 +
 gcc/config/aarch64/aarch64.md|   3 +-
 gcc/config/arm/aarch-cost-tables.h   | 101 +++
 gcc/config/arm/arm-cores.def |   1 +
 gcc/config/arm/arm-tables.opt|   3 +
 gcc/config/arm/arm-tune.md   |   3 +-
 gcc/config/arm/arm.c |  22 ++
 gcc/config/arm/arm.md|  11 +-
 gcc/config/arm/bpabi.h   |   2 +
 gcc/config/arm/t-arm |   1 +
 gcc/config/arm/types.md  |   2 +
 gcc/config/arm/xgene1.md | 531 +++
 gcc/doc/invoke.texi  |   6 +-
 16 files changed, 768 insertions(+), 7 deletions(-)
 create mode 100644 gcc/config/arm/xgene1.md

-- 
1.9.1



[PATCH 4/4] Wire X-Gene 1 up in the ARM (32bit) backend as a AArch32-capable core.

2015-01-12 Thread Philipp Tomsich
---
 gcc/ChangeLog-2014| 10 ++
 gcc/config/arm/arm-cores.def  |  1 +
 gcc/config/arm/arm-tables.opt |  3 +++
 gcc/config/arm/arm-tune.md|  3 ++-
 gcc/config/arm/arm.c  | 22 ++
 gcc/config/arm/arm.md | 11 +--
 gcc/config/arm/bpabi.h|  2 ++
 gcc/config/arm/t-arm  |  1 +
 gcc/doc/invoke.texi   |  3 ++-
 9 files changed, 52 insertions(+), 4 deletions(-)

diff --git a/gcc/ChangeLog-2014 b/gcc/ChangeLog-2014
index dd49d7f..c3c62db 100644
--- a/gcc/ChangeLog-2014
+++ b/gcc/ChangeLog-2014
@@ -3497,6 +3497,16 @@
63965.
* config/rs6000/rs6000.c: Likewise.
 
+2014-12-23  Philipp Tomsich  
+
+   * config/arm/arm.md (generic_sched): Specify xgene1 in 'no' list.
+   Include xgene1.md.
+   * config/arm/arm.c (arm_issue_rate): Specify 4 for xgene1.
+   * config/arm/arm-cores.def (xgene1): New entry.
+   * config/arm/arm-tables.opt: Regenerate.
+   * config/arm/arm-tune.md: Regenerate.
+   * config/arm/bpabi.h (BE8_LINK_SPEC): Specify mcpu=xgene1.
+
 2014-11-22  Jan Hubicka  
 
PR ipa/63671
diff --git a/gcc/config/arm/arm-cores.def b/gcc/config/arm/arm-cores.def
index be125ac..fa13eb9 100644
--- a/gcc/config/arm/arm-cores.def
+++ b/gcc/config/arm/arm-cores.def
@@ -167,6 +167,7 @@ ARM_CORE("cortex-a17.cortex-a7", cortexa17cortexa7, 
cortexa7,   7A,  FL_LDSCHED |
 /* V8 Architecture Processors */
 ARM_CORE("cortex-a53", cortexa53, cortexa53,   8A, FL_LDSCHED | FL_CRC32, 
cortex_a53)
 ARM_CORE("cortex-a57", cortexa57, cortexa15,   8A, FL_LDSCHED | FL_CRC32, 
cortex_a57)
+ARM_CORE("xgene1",  xgene1,xgene1,  8A, FL_LDSCHED,
xgene1)
 
 /* V8 big.LITTLE implementations */
 ARM_CORE("cortex-a57.cortex-a53", cortexa57cortexa53, cortexa53, 8A,  
FL_LDSCHED | FL_CRC32, cortex_a57)
diff --git a/gcc/config/arm/arm-tables.opt b/gcc/config/arm/arm-tables.opt
index ece9d5e..1392429 100644
--- a/gcc/config/arm/arm-tables.opt
+++ b/gcc/config/arm/arm-tables.opt
@@ -310,6 +310,9 @@ EnumValue
 Enum(processor_type) String(cortex-a57) Value(cortexa57)
 
 EnumValue
+Enum(processor_type) String(xgene1) Value(xgene1)
+
+EnumValue
 Enum(processor_type) String(cortex-a57.cortex-a53) Value(cortexa57cortexa53)
 
 Enum
diff --git a/gcc/config/arm/arm-tune.md b/gcc/config/arm/arm-tune.md
index 452820ab..dcd5054 100644
--- a/gcc/config/arm/arm-tune.md
+++ b/gcc/config/arm/arm-tune.md
@@ -32,5 +32,6 @@
cortexr4f,cortexr5,cortexr7,
cortexm7,cortexm4,cortexm3,
marvell_pj4,cortexa15cortexa7,cortexa17cortexa7,
-   cortexa53,cortexa57,cortexa57cortexa53"
+   cortexa53,cortexa57,xgene1,
+   cortexa57cortexa53"
(const (symbol_ref "((enum attr_tune) arm_tune)")))
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 8ca2dd8..14c8a87 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -1903,6 +1903,25 @@ const struct tune_params arm_cortex_a57_tune =
   ARM_FUSE_MOVW_MOVT   /* Fuseable pairs of 
instructions.  */
 };
 
+const struct tune_params arm_xgene1_tune =
+{
+  arm_9e_rtx_costs,
+  &xgene1_extra_costs,
+  NULL,/* Scheduler cost adjustment.  
*/
+  1,   /* Constant limit.  */
+  2,   /* Max cond insns.  */
+  ARM_PREFETCH_NOT_BENEFICIAL,
+  false,   /* Prefer constant pool.  */
+  arm_default_branch_cost,
+  true,/* Prefer LDRD/STRD.  */
+  {true, true},/* Prefer non short circuit.  */
+  &arm_default_vec_cost,   /* Vectorizer costs.  */
+  false,   /* Prefer Neon for 64-bits 
bitops.  */
+  true, true,  /* Prefer 32-bit encodings.  */
+  false,  /* Prefer Neon for stringops.  */
+  32  /* Maximum insns to inline 
memset.  */
+};
+
 /* Branches can be dual-issued on Cortex-A5, so conditional execution is
less appealing.  Set max_insns_skipped to a low value.  */
 
@@ -27066,6 +27085,9 @@ arm_issue_rate (void)
 {
   switch (arm_tune)
 {
+case xgene1:
+  return 4;
+
 case cortexa15:
 case cortexa57:
   return 3;
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index c61057f..a3cbf3b 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -109,6 +109,11 @@
 ;; given instruction does not shift one of its input operands.
 (define_attr "shift" "" (const_int 0))
 
+;; [For compatibility with AArch64 in pipeline models]
+;; Attribute that specifies whether or not the instruction touches fp
+;; registers.
+(define_attr "fp" "no,yes" (const_string "no"))
+
 ; Floating Point Unit.  If we only have floating point emulation, then there
 ; is no point in scheduling the floating point 

[PATCH] Fix up computed goto on POINTERS_EXTEND_UNSIGNED targets (PR middle-end/63974)

2015-01-12 Thread Jakub Jelinek
Hi!

The 991213-3.c testcase ICEs on aarch64-linux with -mabi=ilp32
since the wide-int merge.  The problem is that
x = convert_memory_address (Pmode, x)
is used twice on a VOIDmode CONST_INT, which is wrong.
For non-VOIDmode rtl the second convert_memory_address
is a NOP, but for VOIDmode the second call treats the CONST_INT
returned by the first call as if it were again in ptr_mode, rather
than Pmode.  On aarch64-linux in particular, the constant is
zero-extended from SImode to DImode in the first call, so it
is no longer a valid SImode CONST_INT.

emit_indirect_jump always calls convert_memory_address (Pmode, ...)
on the operand in optabs.c when handling the EXPAND_ADDRESS case
in maybe_legitimize_operand, so the first convert_memory_address
is both unnecessary and harmful.
Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux
(which do not define POINTERS_EXTEND_UNSIGNED) and tested on the
problematic testcase with aarch64-linux cross.  Can anyone with
easy access to POINTERS_EXTEND_UNSIGNED targets (aarch64-linux ilp32,
x86_64 -mx32, ia64-hpux) please test this?

Ok for trunk if it works there?

2015-01-12  Jakub Jelinek  

PR middle-end/63974
* cfgexpand.c (expand_computed_goto): Don't call
convert_memory_address here.

--- gcc/cfgexpand.c.jj  2015-01-09 21:59:54.0 +0100
+++ gcc/cfgexpand.c 2015-01-12 14:41:35.210705174 +0100
@@ -3060,8 +3060,6 @@ expand_computed_goto (tree exp)
 {
   rtx x = expand_normal (exp);
 
-  x = convert_memory_address (Pmode, x);
-
   do_pending_stack_adjust ();
   emit_indirect_jump (x);
 }

Jakub
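
To make the "not valid SImode CONST_INT any longer" point concrete, here is a small standalone sketch. It assumes the usual convention, as I understand it, that a constant of a narrow mode is kept sign-extended in the wider host integer; the toy_ helpers are invented for the example and are not GCC functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Sign-extend the low 32 bits of V to 64 bits without relying on
   implementation-defined narrowing conversions.  */
static int64_t
toy_sign_extend_32 (int64_t v)
{
  uint32_t low = (uint32_t) v;
  return (low & 0x80000000u) ? (int64_t) low - 4294967296LL : (int64_t) low;
}

/* Zero-extend the low 32 bits of V to 64 bits, as a pointer
   extension on a POINTERS_EXTEND_UNSIGNED target would.  */
int64_t
toy_zero_extend_32 (int64_t v)
{
  return (int64_t) (uint32_t) v;
}

/* Under the assumed convention, V is a well-formed 32-bit constant
   only if sign-extending its low 32 bits reproduces V.  */
bool
toy_is_canonical_si (int64_t v)
{
  return toy_sign_extend_32 (v) == v;
}
```

Take the 32-bit address 0x80000000, held as -2147483648. Zero-extending it once gives 2147483648, which is a fine 64-bit value but fails toy_is_canonical_si, so feeding it to a second conversion that again assumes a 32-bit input starts from a malformed constant.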


[PATCH] Fix PR64461, Incorrect code on coldfire targets

2015-01-12 Thread Jeff Law


As suggested by Andreas in the PR, the simplest fix for this problem is 
to disable the various trunc* patterns for TARGET_COLDFIRE.  That's 
precisely what this patch does.


Built cross compilers with and without the m68k.md hunk.  Verified the 
test failed without the m68k.md hunk and passed with the m68k.md hunk.


For "fun" I've got an m68k bootstrap of the trunk running.  I don't 
expect it to finish for at least a week or so, assuming it runs to 
completion.


Installed on the trunk (in separate commits due to stupidity on my part).

Jeff

PR target/64461
* config/m68k/m68k.md (truncsiqi2): Disable for TARGET_COLDFIRE.
(trunchiqi2, truncsihi2): Similarly.

PR target/64461
* gcc.target/m68k/pr64461.c: New test.

diff --git a/gcc/config/m68k/m68k.md b/gcc/config/m68k/m68k.md
index 2783a8f..2a314c3 100644
--- a/gcc/config/m68k/m68k.md
+++ b/gcc/config/m68k/m68k.md
@@ -1572,7 +1572,7 @@
   [(set (match_operand:QI 0 "nonimmediate_operand" "=dm,d")
(truncate:QI
 (match_operand:SI 1 "general_src_operand" "doJS,i")))]
-  ""
+  "!TARGET_COLDFIRE"
 {
   if (GET_CODE (operands[0]) == REG)
 {
@@ -1590,7 +1590,7 @@
   [(set (match_operand:QI 0 "nonimmediate_operand" "=dm,d")
(truncate:QI
 (match_operand:HI 1 "general_src_operand" "doJS,i")))]
-  ""
+  "!TARGET_COLDFIRE"
 {
   if (GET_CODE (operands[0]) == REG
   && (GET_CODE (operands[1]) == MEM
@@ -1617,7 +1617,7 @@
   [(set (match_operand:HI 0 "nonimmediate_operand" "=dm,d")
(truncate:HI
 (match_operand:SI 1 "general_src_operand" "roJS,i")))]
-  ""
+  "!TARGET_COLDFIRE"
 {
   if (GET_CODE (operands[0]) == REG)
 {

diff --git a/gcc/testsuite/gcc.target/m68k/pr64461.c 
b/gcc/testsuite/gcc.target/m68k/pr64461.c
new file mode 100644
index 000..dd70355
--- /dev/null
+++ b/gcc/testsuite/gcc.target/m68k/pr64461.c
@@ -0,0 +1,16 @@
+/* { dg-do assemble } */
+/* { dg-options "-mcpu=5235 -Os" } */
+
+typedef struct rtems_rfs_block_map_s
+{
+  long unsigned int blocks[(5)];
+} rtems_rfs_block_map;
+
+extern int foo (void);
+
+int
+rtems_rfs_block_map_indirect_alloc (rtems_rfs_block_map *map,
+   unsigned char* buffer, int b)
+{
+  (buffer + b * 4)[3] = (unsigned char) map->blocks[b];
+}


[PATCH] Fix -mstack-arg-probe (PR target/64513)

2015-01-12 Thread Jakub Jelinek
Hi!

For -mstack-arg-probe we push %rax and/or %r10 in the prologue, and
mark that insn as RTX_FRAME_RELATED_P.  But that means the dwarf2 pass
also considers the %rax/%r10 registers, which are call-used, to be
saved in the unwind info, but they are never restored, which makes the
dwarf2 pass ICE on it.  Fixed by letting the dwarf2 pass know only that
those instructions decrease the stack pointer.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2015-01-12  Jakub Jelinek  

PR target/64513
* config/i386/i386.c (ix86_expand_prologue): Add
REG_FRAME_RELATED_EXPR to %rax and %r10 pushes.

* gcc.target/i386/pr64513.c: New test.

--- gcc/config/i386/i386.c.jj   2015-01-09 22:00:02.0 +0100
+++ gcc/config/i386/i386.c  2015-01-12 17:13:21.342463547 +0100
@@ -11559,6 +11559,10 @@ ix86_expand_prologue (void)
  if (sp_is_cfa_reg)
m->fs.cfa_offset += UNITS_PER_WORD;
  RTX_FRAME_RELATED_P (insn) = 1;
+ add_reg_note (insn, REG_FRAME_RELATED_EXPR,
+   gen_rtx_SET (VOIDmode, stack_pointer_rtx,
+plus_constant (Pmode, stack_pointer_rtx,
+   -UNITS_PER_WORD)));
}
}
 
@@ -11572,6 +11576,10 @@ ix86_expand_prologue (void)
  if (sp_is_cfa_reg)
m->fs.cfa_offset += UNITS_PER_WORD;
  RTX_FRAME_RELATED_P (insn) = 1;
+ add_reg_note (insn, REG_FRAME_RELATED_EXPR,
+   gen_rtx_SET (VOIDmode, stack_pointer_rtx,
+plus_constant (Pmode, stack_pointer_rtx,
+   -UNITS_PER_WORD)));
}
}
 
--- gcc/testsuite/gcc.target/i386/pr64513.c.jj  2015-01-12 17:20:12.052330807 +0100
+++ gcc/testsuite/gcc.target/i386/pr64513.c 2015-01-12 17:20:02.0 +0100
@@ -0,0 +1,17 @@
+/* PR target/64513 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mstack-arg-probe" } */
+
+struct A {};
+struct B { struct A y; };
+int foo (struct A);
+
+int
+bar (int x)
+{
+  struct B b;
+  int c;
+  while (x--)
+c = foo (b.y);
+  return c;
+}

Jakub
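
The distinction the fix draws can be shown with a deliberately tiny toy model. Everything below is invented for illustration: the struct, the function names, and the consistency check are not how dwarf2cfi.c represents its state. The model only captures that the same push can be described either as "register saved at this slot" or as "stack pointer moved by one word", and that only the first description leaves a save that something would have to undo.

```c
#include <stdbool.h>

#define TOY_WORD 8	/* stands in for UNITS_PER_WORD on x86_64 */

/* Toy unwind state: the CFA offset plus a bitmask of registers
   currently recorded as saved.  */
struct toy_unwind
{
  long cfa_offset;
  unsigned saved_regs;
};

/* A push interpreted as a register save: the stack grows and REGNO
   is recorded as saved.  */
void
toy_push_as_save (struct toy_unwind *u, int regno)
{
  u->cfa_offset += TOY_WORD;
  u->saved_regs |= 1u << regno;
}

/* The same push described only as "sp = sp - word": the stack grows,
   no register is recorded.  */
void
toy_push_as_sp_adjust (struct toy_unwind *u)
{
  u->cfa_offset += TOY_WORD;
}

/* Hypothetical check: if no insn ever restores a register, nothing
   may still be recorded as saved.  */
bool
toy_consistent_without_restores (const struct toy_unwind *u)
{
  return u->saved_regs == 0;
}
```

Both descriptions move the CFA offset identically; only the second keeps the toy state consistent when no restore follows, which mirrors the intent of the added REG_FRAME_RELATED_EXPR notes.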

