Re: [PATCH] IPA ICF: add comparison for target and optimization nodes

2015-01-09 Thread John David Anglin
On Fri, 09 Jan 2015, Kyrill Tkachov wrote:

>
> On 09/01/15 16:11, Christophe Lyon wrote:
>> On 9 January 2015 at 11:26, Martin Liška  wrote:
>>> On 01/09/2015 06:21 AM, Jeff Law wrote:
 On 01/07/15 04:38, Martin Liška wrote:
> Hello.
>
> Following patch adds support for target and optimization nodes
> comparison, which is
> based on Honza's newly added infrastructure.
>
> Apart from that, there's a small hunk that corrects formatting and
> removes unnecessary
> call to a comparison function.
>
> Hope it can be applied as one patch.
>
> Tested on x86_64-linux-pc without any new regression introduction.
>
> Ready for trunk?
>
> Thank you,
> Martin
>
> 0001-IPA-ICF-target-and-optimization-flags-comparison.patch
>
>
>   From 393eaa47c8aef9a91a1c635016f23ca2f5aa25e4 Mon Sep 17 00:00:00 
> 2001
> From: mliska
> Date: Tue, 6 Jan 2015 15:06:18 +0100
> Subject: [PATCH] IPA ICF: target and optimization flags comparison.
>
> gcc/ChangeLog:
>
> 2015-01-06  Martin Liska
>
>  * cgraphunit.c (cgraph_node::create_wrapper): Fix level of
> indentation.
>  * ipa-icf.c (sem_function::equals_private): Add support for target
> and
>  (sem_item_optimizer::merge_classes): Remove redundant function
>  comparison.
>  optimization flags comparison.
>  * tree.h (target_opts_for_fn): New function.
 Looks like the changelog is a bit goof'd with lines intermixed.

 Patch itself is good for the trunk.  It'd be nice if you could add a
 testcase as well.

 Jeff
>>>
>>> Hi.
>>>
>>> You are right, I forgot to delete a line in Changelog.
>>> Attachment contains final version with a new test case I'm going to 
>>> install.
>>>
>> Hi,
>>
>> It looks like this patch broke GCC builds for ARM and AArch64 targets at 
>> least.
>>
>> I see build failures for pr-support.o and unwind-arm.o:
>>
>> 0x10bb077 tree_check
>>     /tmp/2239141_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/tree.h:2778
>> 0x10bb077 target_opts_for_fn
>>     /tmp/2239141_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/tree.h:4681
>> 0x10bb077 ipa_icf::sem_function::equals_private(ipa_icf::sem_item*, hash_map&)
>>     /tmp/2239141_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/ipa-icf.c:431
>> 0x10bbd27 ipa_icf::sem_function::equals(ipa_icf::sem_item*, hash_map&)
>>     /tmp/2239141_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/ipa-icf.c:386
>> 0x10ba63a ipa_icf::sem_item_optimizer::subdivide_classes_by_equality(bool)
>>     /tmp/2239141_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/ipa-icf.c:1893
>> 0x10bcd86 ipa_icf::sem_item_optimizer::execute()
>>     /tmp/2239141_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/ipa-icf.c:1712
>> 0x10bce11 ipa_icf_driver
>>     /tmp/2239141_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/ipa-icf.c:2441
>>
>> (for target arm-none-eabi)
>
> Yeah, I see these too when trying to bootstrap arm and aarch64

Also seen on hppa-hpux.

>
> Kyrill
>
>>
>>> Thanks,
>>> Martin
>

Dave
-- 
J. David Anglin  dave.ang...@bell.net


[PATCH, moxie] Tabify assembly output

2015-01-09 Thread Anthony Green

I'm committing the following patch, which cleans up the assembly output
by using tabs between opcodes and operands.

Thanks,

AG

 2015-01-09  Anthony Green  
 
* config/moxie/moxie.md: Tabify assembly output.
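
As a rough illustration of the effect on the emitted text, the following C sketch formats an instruction with a single tab between the opcode and its operands. The helper name format_insn is hypothetical and is not part of the moxie port; it only mimics what the output templates below are expected to produce once the escaped tab has been processed.

```c
#include <stdio.h>

/* Hypothetical helper, not part of GCC: write "opcode<TAB>operands"
   into buf and return the number of characters that were needed
   (excluding the terminating NUL), as snprintf does.  */
int
format_insn (char *buf, size_t size, const char *opcode, const char *operands)
{
  return snprintf (buf, size, "%s\t%s", opcode, operands);
}
```

With a tab as the separator, operands line up in the same column regardless of the opcode length, which is the point of the cleanup.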


Index: gcc/config/moxie/moxie.md
===
--- gcc/config/moxie/moxie.md   (revision 219420)
+++ gcc/config/moxie/moxie.md   (working copy)
@@ -48,9 +48,9 @@
   (match_operand:SI 2 "moxie_add_operand" "I,N,r")))]
   ""
   "@
-  inc%0, %2
-  dec   %0, -%2
-  add%0, %2")
+  inc\\t%0, %2
+  dec\\t%0, -%2
+  add\\t%0, %2")
 
 (define_insn "subsi3"
   [(set (match_operand:SI 0 "register_operand" "=r,r")
@@ -59,8 +59,8 @@
   (match_operand:SI 2 "moxie_sub_operand" "I,r")))]
   ""
   "@
-  dec%0, %2
-  sub%0, %2")
+  dec\\t%0, %2
+  sub\\t%0, %2")
 
 (define_insn "mulsi3"
   [(set (match_operand:SI 0 "register_operand" "=r")
@@ -68,7 +68,7 @@
   (match_operand:SI 1 "register_operand" "0")
   (match_operand:SI 2 "register_operand" "r")))]
   ""
-  "mul%0, %2")
+  "mul\\t%0, %2")
 
 (define_code_iterator EXTEND [sign_extend zero_extend])
 (define_code_attr mul [(sign_extend "mul") (zero_extend "umul")])
@@ -105,7 +105,7 @@
   (match_operand:SI 1 "register_operand" "0")
   (match_operand:SI 2 "register_operand" "r")))]
   ""
-  "div%0, %2")
+  "div\\t%0, %2")
 
 (define_insn "udivsi3"
   [(set (match_operand:SI 0 "register_operand" "=r")
@@ -113,7 +113,7 @@
   (match_operand:SI 1 "register_operand" "0")
   (match_operand:SI 2 "register_operand" "r")))]
   ""
-  "udiv   %0, %2")
+  "udiv\\t%0, %2")
 
 (define_insn "modsi3"
   [(set (match_operand:SI 0 "register_operand" "=r")
@@ -121,7 +121,7 @@
   (match_operand:SI 1 "register_operand" "0")
   (match_operand:SI 2 "register_operand" "r")))]
   ""
-  "mod%0, %2")
+  "mod\\t%0, %2")
 
 (define_insn "umodsi3"
   [(set (match_operand:SI 0 "register_operand" "=r")
@@ -129,7 +129,7 @@
   (match_operand:SI 1 "register_operand" "0")
   (match_operand:SI 2 "register_operand" "r")))]
   ""
-  "umod   %0, %2")
+  "umod\\t%0, %2")
 
 ;; -
 ;; Unary arithmetic instructions
@@ -139,13 +139,13 @@
   [(set (match_operand:SI 0 "register_operand" "=r")
  (neg:SI (match_operand:SI 1 "register_operand" "r")))]
   ""
-  "neg%0, %1")
+  "neg\\t%0, %1")
 
 (define_insn "one_cmplsi2"
   [(set (match_operand:SI 0 "register_operand" "=r")
(not:SI (match_operand:SI 1 "register_operand" "r")))]
   ""
-  "not%0, %1")
+  "not\\t%0, %1")
 
 ;; -
 ;; Logical operators
@@ -157,7 +157,7 @@
(match_operand:SI 2 "register_operand" "r")))]
   ""
 {
-  return "and%0, %2";
+  return "and\\t%0, %2";
 })
 
 (define_insn "xorsi3"
@@ -166,7 +166,7 @@
(match_operand:SI 2 "register_operand" "r")))]
   ""
 {
-  return "xor%0, %2";
+  return "xor\\t%0, %2";
 })
 
 (define_insn "iorsi3"
@@ -175,7 +175,7 @@
(match_operand:SI 2 "register_operand" "r")))]
   ""
 {
-  return "or %0, %2";
+  return "or\\t%0, %2";
 })
 
 ;; -
@@ -188,7 +188,7 @@
   (match_operand:SI 2 "register_operand" "r")))]
   ""
 {
-  return "ashl   %0, %2";
+  return "ashl\\t%0, %2";
 })
 
 (define_insn "ashrsi3"
@@ -197,7 +197,7 @@
 (match_operand:SI 2 "register_operand" "r")))]
   ""
 {
-  return "ashr   %0, %2";
+  return "ashr\\t%0, %2";
 })
 
 (define_insn "lshrsi3"
@@ -206,7 +206,7 @@
 (match_operand:SI 2 "register_operand" "r")))]
   ""
 {
-  return "lshr   %0, %2";
+  return "lshr\\t%0, %2";
 })
 
 ;; -
@@ -220,14 +220,14 @@
   [(set (mem:SI (pre_dec:SI (reg:SI 1)))
(match_operand:SI 0 "register_operand" "r"))]
   ""
-  "push   $sp, %0")
+  "push\\t$sp, %0")
 
 ;; Pop a register from the stack
 (define_insn "movsi_pop"
   [(set (match_operand:SI 1 "register_operand" "=r")
(mem:SI (post_inc:SI (match_operand:SI 0 "register_operand" "r"]
   ""
-  "pop%0, %1")
+  "pop\\t%0, %1")
 
 (define_expand "movsi"
[(set (match_operand:SI 0 "general_operand" "")
@@ -257,15 +257,15 @@
   "register_operand (operands[0], SImode)
|| register_operand (operands[1], SImode)"
   "@
-   xor%0, %0
-   mov%0, %1
-   ldi.l  %0, %1
-   st.l   %0, %1
-   sta.l  %0, %1
-   ld.l   %0, %1
-   lda.l  %0, %1
-   sto.l  %0, %1
-   ldo.l  %0, %1"
+   xor\\t%0, %0
+   mov\\t%0, %1
+   ldi.l\\t%0, %1
+   st.l\\t%0, %1
+   sta.l\\t%0, %1
+   ld.l\\t%0, %1
+   lda.l\\t%0, %1
+   sto.l\\t%0, %1
+   ldo.l\\t%0, %1"
   [(set_attr "length"  "2,2,6,2,6,2,6,4,4")])
 
 (define_insn "zero_extendqisi2"
@@ -273,10 +273,10 @@
(

Re: [PING][PATCH][1-3] New configure options that make the compiler use -fPIE and -pie as default option

2015-01-09 Thread H.J. Lu
On Fri, Jan 9, 2015 at 12:12 PM, Magnus Granberg  wrote:
> On Friday, 9 January 2015 at 13:00:14, Daniel Micay wrote:
>> On 09/01/15 12:49 PM, Joseph Myers wrote:
>> > On Fri, 9 Jan 2015, Daniel Micay wrote:
>> >>> --with-specs="%{pie|fpic|fPIC|fpie|fPIE|fno-pic|fno-PIC|fno-pie|fno-PIE|
>> >>> shared|static|nostdlib|nodefaultlibs|nostartfiles:;:-fPIE -pie}"
>> >>>
>> >>> at configure time (using CONFIGURE_SPECS).
> DRIVER_SELF_SPECS is checked before CONFIGURE_SPECS. On mips it will have added
> -mno-shared before it checks CONFIGURE_SPECS. I want to support more targets
> later on. I can move the spec to elfos.h.
>> >>>
>> >>> I have no idea if the above is really the proper spec to use - why
>> >>> do you include static, nostdlib, nodefaultlibs and nostartfiles
>> >>> for example?  Similar, if I say
>> >>
>> >> PIE isn't supported for static executables by binutils, etc. so it
>> >> does need to exclude that. The checks for nostdlib, nodefaultlibs
>> >
>> > Well - that would indicate excluding -pie if one of the link-time options
>> > conflicting with it is used, -fPIE if one of the compile-time options
>> > conflicting with it is used.  That way, "gcc -static file.c" would still
>> > have the same effect as "gcc -c file.c; gcc -static file.o" (building a
>> > PIE object, linking it into a non-PIE static executable), which makes
>> > logical sense to me (although there may be no great benefit either way).
>>
>> Sure, I agree. It should have separate lists of exceptions for both of
>> these.
> I can separate it into compile and link sections and remove the nostdlib,
> nodefaultlibs and nostartfiles. But how do we not pass -pie to the linker when
> we don't pass static or shared and don't link it with -pie? Only the gold
> linker supports -no-pie.
>
> /Magnus G.
>
>

Please try hjl/pie branch:

https://gcc.gnu.org/git/?p=gcc.git;a=summary

and let me know if it works for you.


-- 
H.J.
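
To make the split discussed above concrete, here is a small C model of the idea: -fPIE is suppressed only by conflicting compile-time options, and -pie only by conflicting link-time options. This is a sketch under stated assumptions. The function names and the two option lists are illustrative, taken from the spec quoted earlier in the thread; this is not code from the hjl/pie branch or from the GCC driver, and it does not address the open question about -no-pie.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return true if any argument equals one of the names in list.  */
static bool
any_given (int argc, const char *const *argv,
           const char *const *list, size_t n)
{
  for (int i = 0; i < argc; i++)
    for (size_t j = 0; j < n; j++)
      if (strcmp (argv[i], list[j]) == 0)
        return true;
  return false;
}

/* Illustrative model only: add -fPIE by default unless an explicit
   PIC/PIE code-generation option was given.  */
bool
default_fpie (int argc, const char *const *argv)
{
  static const char *const compile_opts[] = {
    "-fpic", "-fPIC", "-fpie", "-fPIE",
    "-fno-pic", "-fno-PIC", "-fno-pie", "-fno-PIE"
  };
  return !any_given (argc, argv, compile_opts,
                     sizeof compile_opts / sizeof compile_opts[0]);
}

/* Illustrative model only: add -pie by default unless a link-time
   option that conflicts with it was given.  */
bool
default_pie (int argc, const char *const *argv)
{
  static const char *const link_opts[] = { "-pie", "-shared", "-static" };
  return !any_given (argc, argv, link_opts,
                     sizeof link_opts / sizeof link_opts[0]);
}
```

Under this model "gcc -static file.c" still compiles with -fPIE but links without -pie, which matches the behaviour Joseph describes for "gcc -c file.c; gcc -static file.o".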


[PATCH, moxie] Fix CC_REG definition

2015-01-09 Thread Anthony Green

The moxie port had CC_REG referring to a real hard register ($r9) by
mistake instead of the virtual CC register.  This never resulted in
incorrect code, but we would often see $r9 marked as used in a function
when it actually wasn't.  I'm checking this in.

2015-01-09  Anthony Green  

* config/moxie/moxie.md (CC_REG): Correct register definition.


Index: gcc/config/moxie/moxie.md
===
--- gcc/config/moxie/moxie.md   (revision 219418)
+++ gcc/config/moxie/moxie.md   (working copy)
@@ -367,7 +367,7 @@
 ;; -
 
 (define_constants
-  [(CC_REG 11)])
+  [(CC_REG 19)])
 
 (define_expand "cbranchsi4"
   [(set (reg:CC CC_REG)


libgo patch committed: Pass CGO_LDFLAGS to linker for cgo

2015-01-09 Thread Ian Lance Taylor
This patch from Peter Collingbourne backports a patch from the master
repository to pass CGO_LDFLAGS to the linker when using cgo with
gccgo.  Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu.
Committed to mainline.

Ian
diff -r b0a83aacb539 libgo/go/cmd/go/build.go
--- a/libgo/go/cmd/go/build.go  Fri Jan 09 13:17:25 2015 -0800
+++ b/libgo/go/cmd/go/build.go  Fri Jan 09 16:41:10 2015 -0800
@@ -1895,6 +1895,7 @@
}
ldflags = append(ldflags, afiles...)
ldflags = append(ldflags, cgoldflags...)
+   ldflags = append(ldflags, envList("CGO_LDFLAGS", "")...)
ldflags = append(ldflags, p.CgoLDFLAGS...)
if usesCgo && goos == "linux" {
ldflags = append(ldflags, "-Wl,-E")


Re: [PATCH 12/21] PR jit/63854: Add a valgrind suppression file

2015-01-09 Thread Hans-Peter Nilsson
On Wed, 19 Nov 2014, David Malcolm wrote:

> On Wed, 2014-11-19 at 10:09 -0700, Jeff Law wrote:
> > On 11/19/14 04:47, Richard Biener wrote:
> > > On Wed, Nov 19, 2014 at 11:46 AM, David Malcolm  
> > > wrote:
> > >> Valgrind complains about uninitialized data within sparseset_bit_p.
> > >> Provide a suppression file to silence these warnings.
> > >>
> > >> Valgrind requires suppression files for C++ code to use the mangled
> > >> names, so we do that here.
> > >
> > > There is --enable-valgrind-annotations to get the same effect by GCC
> > > telling valgrind about this (and more).
> > Right.  See VALGRIND_DISCARD.  Is that not covering this case?
>
> I simply didn't spot the option, and was running without it.
>
> I'll drop the new file, and document that people running the jit
> testsuite under valgrind need to use that configure option.

IMHO, making --enable-valgrind-annotations the default when
headers are found and when gcc development is in DEV-PHASE =
experimental (i.e. not for releases) would be even better.
Anyone opposed?  I thought it already was the default!

The overhead is IIRC a few weird NOP instructions per
VALGRIND_DISCARD (& Co.) annotation.

brgds, H-P
(PS. I care a little bit since I added them in the first place.)
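
For readers wondering why a sparse set provokes this kind of warning at all: in the classic technique the membership test reads an entry that may never have been written, and then validates it against the dense array. The C sketch below shows that general technique. The type and function names are made up for illustration and this is not GCC's sparseset code. The memset calls only stand in for uninitialized memory so that the example behaves deterministically.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Minimal sparse set over the universe 0..size-1 (illustrative only).  */
struct sset
{
  unsigned *sparse;   /* sparse[e] is a candidate index of e in dense */
  unsigned *dense;    /* dense[0..members-1] holds the actual members */
  unsigned members;
  unsigned size;
};

/* Allocate the arrays.  Returns false on allocation failure.  */
bool
sset_init (struct sset *s, unsigned size)
{
  s->sparse = malloc (size * sizeof *s->sparse);
  s->dense = malloc (size * sizeof *s->dense);
  s->members = 0;
  s->size = size;
  if (!s->sparse || !s->dense)
    {
      free (s->sparse);
      free (s->dense);
      return false;
    }
  /* Stand-in for uninitialized contents: an arbitrary byte pattern.  */
  memset (s->sparse, 0xAB, size * sizeof *s->sparse);
  memset (s->dense, 0xCD, size * sizeof *s->dense);
  return true;
}

/* Membership test; the caller must ensure e < s->size.  The value read
   from sparse[e] may be garbage, but it is only trusted if it points at
   a live dense slot that holds e.  */
bool
sset_contains (const struct sset *s, unsigned e)
{
  unsigned idx = s->sparse[e];
  return idx < s->members && s->dense[idx] == e;
}

/* Add e to the set if it is not already a member.  */
void
sset_add (struct sset *s, unsigned e)
{
  if (sset_contains (s, e))
    return;
  s->sparse[e] = s->members;
  s->dense[s->members++] = e;
}

void
sset_free (struct sset *s)
{
  free (s->sparse);
  free (s->dense);
}
```

The read in sset_contains is harmless by construction, which is why an annotation or suppression is the usual remedy rather than a code change.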


gotools patch committed: Fix for non-bootstrap case

2015-01-09 Thread Ian Lance Taylor
When not bootstrapping, the newly built Go compiler is not passed down
to the Go tools as GOC.  This patch changes the gotools Makefile to
use GOC_FOR_TARGET for a native build.  I also set MOSTLYCLEANFILES.
Bootstrapped on x86_64-unknown-linux-gnu.  Committed to mainline.

Ian


2015-01-09  Ian Lance Taylor  

* Makefile.am (GOCOMPILER): Set to GOC or GOC_FOR_TARGET depending
on whether this is a native build or not.
(GOCOMPILE, GOLINK): Use $(GOCOMPILER) instead of $(GOC).
(MOSTLYCLEANFILES): Define.
* Makefile.in: Rebuild.
Index: Makefile.am
===
--- Makefile.am (revision 219408)
+++ Makefile.am (working copy)
@@ -28,11 +28,18 @@ STAMP = echo timestamp >
 libgodir = ../$(target_noncanonical)/libgo
 LIBGODEP = $(libgodir)/libgo.la
 
+if NATIVE
+# Use the compiler we just built.
+GOCOMPILER = $(GOC_FOR_TARGET)
+else
+GOCOMPILER = $(GOC)
+endif
+
 GOCFLAGS = $(CFLAGS_FOR_TARGET)
-GOCOMPILE = $(GOC) $(GOCFLAGS)
+GOCOMPILE = $(GOCOMPILER) $(GOCFLAGS)
 
 AM_LDFLAGS = -L $(libgodir) -L $(libgodir)/.libs -static-libgo
-GOLINK = $(GOC) $(AM_GOCFLAGS) $(LDFLAGS) $(AM_LDFLAGS) -o $@
+GOLINK = $(GOCOMPILER) $(AM_GOCFLAGS) $(LDFLAGS) $(AM_LDFLAGS) -o $@
 
 cmdsrcdir = $(srcdir)/../libgo/go/cmd
 
@@ -89,6 +96,8 @@ s-zdefaultcc: Makefile
$(SHELL) $(srcdir)/../move-if-change zdefaultcc.go.tmp zdefaultcc.go
$(STAMP) $@ 
 
+MOSTLYCLEANFILES = zdefaultcc.go s-zdefaultcc
+
 if NATIVE
 
 # For a native build we build the programs using the newly built libgo


Re: Housekeeping work in backends.html

2015-01-09 Thread Bernd Schmidt

On 01/07/2015 12:39 AM, Eric Botcazou wrote:

Some ports are missing (lm32, moxie, nios2, nvptx, rl78, rx) so the relevant
maintainers are CCed (see 6.3.9 Anatomy of a Target Back End in the doc).


The page is directly browsable at https://gcc.gnu.org/backends.html

For the moxie, nvptx, rl78 and rx ports, maintainers can send me the string
as Sandra did for the nios2 port and I'll update the document.


For nvptx it's not really clear what to use in some cases, and in others 
I have no idea why we would need to keep track of these "characteristics".


H - you could argue a hardware implementation does not exist as it's a
virtual target, but it can obviously be compiled down to run on
hardware that does exist.
Q - "registers" are typed and you can declare 64 bit registers, so
probably this is true
f - Not even sure what this is about. It only defines a small epilogue
pattern.
a - Port uses neither LRA nor reload. Might be a new characteristic
(along with several others).

The closest string is probably

SQCqfbde


Bernd



[PATCH] fix visium build

2015-01-09 Thread Prathamesh Kulkarni
Hi,
The tree.h and tree-core.h flattening patch:
(https://gcc.gnu.org/ml/gcc-patches/2015-01/msg00467.html)
broke the visium build. The attached patch fixes that.
Built on visium-elf.
OK to commit ?

Thank you,
Prathamesh
2015-01-09 Prathamesh Kulkarni 

* config/visium/visium.c: Add includes hashtab.h, hash-set.h, 
machmode.h,
input.h, statistics.h, vec.h, double-int.h, real.h, fixed-value.h, 
alias.h,
flags.h, symtab.h, tree-core.h, wide-int.h, inchash.h, fold-const.h, 
tree-check.h.
Index: gcc/config/visium/visium.c
===
--- gcc/config/visium/visium.c	(revision 219408)
+++ gcc/config/visium/visium.c	(working copy)
@@ -22,6 +22,23 @@
 #include "system.h"
 #include "coretypes.h"
 #include "tm.h"
+#include "hashtab.h"
+#include "hash-set.h"
+#include "machmode.h"
+#include "input.h"
+#include "statistics.h"
+#include "vec.h"
+#include "double-int.h"
+#include "real.h"
+#include "fixed-value.h"
+#include "alias.h"
+#include "flags.h"
+#include "symtab.h"
+#include "tree-core.h"
+#include "wide-int.h"
+#include "inchash.h"
+#include "fold-const.h"
+#include "tree-check.h"
 #include "tree.h"
 #include "stringpool.h"
 #include "stor-layout.h"
@@ -34,7 +51,6 @@
 #include "conditions.h"
 #include "output.h"
 #include "insn-attr.h"
-#include "flags.h"
 #include "expr.h"
 #include "function.h"
 #include "recog.h"


[PATCH, committed] Simplify jit.dg/test-combination.c

2015-01-09 Thread David Malcolm
jit.dg/test-combination.c was spelling out all of the passing
test cases, twice, when test-threads.c already had this as metadata.

Move the metadata from test-threads.c into all-non-failing-tests.h,
and use it from test-combination.c to avoid this repetition.

Before/after test-combination.c both have 1900 passes.

Before/after jit.sum both have 7152 passes.

Committed to trunk as r219413.
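
The pattern of driving the hooks from a table, rather than spelling out each call, can be sketched in plain C as follows. All names here (demo_testcase, the create_x/verify_x hooks, run_all_create, run_all_verify) are hypothetical stand-ins; the real hooks take gcc_jit_context and gcc_jit_result arguments, which are replaced by a simple counter so the sketch is self-contained.

```c
#include <stddef.h>

/* Hypothetical stand-in for the per-test metadata.  */
struct demo_testcase
{
  const char *name;
  void (*create) (int *counter);
  void (*verify) (int *counter);
};

/* Dummy hooks that just record that they ran.  */
static void create_a (int *c) { *c += 1; }
static void verify_a (int *c) { *c += 1; }
static void create_b (int *c) { *c += 10; }
static void verify_b (int *c) { *c += 10; }

const struct demo_testcase demo_testcases[] = {
  {"a", create_a, verify_a},
  {"b", create_b, verify_b}
};

const int num_demo_testcases
  = sizeof (demo_testcases) / sizeof (demo_testcases[0]);

/* Run every create hook by walking the table.  */
void
run_all_create (int *counter)
{
  for (int i = 0; i < num_demo_testcases; i++)
    demo_testcases[i].create (counter);
}

/* Run every verify hook by walking the table.  */
void
run_all_verify (int *counter)
{
  for (int i = 0; i < num_demo_testcases; i++)
    demo_testcases[i].verify (counter);
}
```

Adding a test then means adding one table entry, and every consumer of the table picks it up without further edits.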

gcc/testsuite/ChangeLog:
* jit.dg/test-threads.c (struct testcase): Move declaration
to jit.dg/all-non-failing-tests.h.
(testcases): Likewise.
* jit.dg/all-non-failing-tests.h (struct testcase): Move
declaration here from jit.dg/test-threads.c.
(testcases): Likewise.
* jit.dg/test-combination.c (create_code): Eliminate spelling
out all of the testcases in favor of looping through the
"testcases" metadata.
(verify_code): Likewise.
---
 gcc/testsuite/jit.dg/all-non-failing-tests.h | 88 
 gcc/testsuite/jit.dg/test-combination.c  | 52 ++--
 gcc/testsuite/jit.dg/test-threads.c  | 86 ---
 3 files changed, 92 insertions(+), 134 deletions(-)

diff --git a/gcc/testsuite/jit.dg/all-non-failing-tests.h 
b/gcc/testsuite/jit.dg/all-non-failing-tests.h
index 14211af..82ce736 100644
--- a/gcc/testsuite/jit.dg/all-non-failing-tests.h
+++ b/gcc/testsuite/jit.dg/all-non-failing-tests.h
@@ -178,3 +178,91 @@
 #include "test-volatile.c"
 #undef create_code
 #undef verify_code
+
+/* Now expose the individual testcases as instances of this struct.  */
+
+struct testcase
+{
+  const char *m_name;
+  void (*m_hook_to_create_code) (gcc_jit_context *ctxt,
+void * user_data);
+  void (*m_hook_to_verify_code) (gcc_jit_context *ctxt,
+gcc_jit_result *result);
+};
+
+const struct testcase testcases[] = {
+  {"accessing_struct",
+   create_code_accessing_struct,
+   verify_code_accessing_struct},
+  {"accessing_union",
+   create_code_accessing_union,
+   verify_code_accessing_union},
+  {"arith_overflow",
+   create_code_arith_overflow,
+   verify_code_arith_overflow},
+  {"array_as_pointer",
+   create_code_array_as_pointer,
+   verify_code_array_as_pointer},
+  {"arrays",
+   create_code_arrays,
+   verify_code_arrays},
+  {"calling_external_function",
+   create_code_calling_external_function,
+   verify_code_calling_external_function},
+  {"calling_function_ptr",
+   create_code_calling_function_ptr,
+   verify_code_calling_function_ptr},
+  {"constants",
+   create_code_constants,
+   verify_code_constants},
+  {"dot_product",
+   create_code_dot_product,
+   verify_code_dot_product},
+  {"expressions",
+   create_code_expressions,
+   verify_code_expressions},
+  {"factorial",
+   create_code_factorial,
+   verify_code_factorial},
+  {"fibonacci",
+   create_code_fibonacci,
+   verify_code_fibonacci},
+  {"functions",
+   create_code_functions,
+   verify_code_functions},
+  {"hello_world",
+   create_code_hello_world,
+   verify_code_hello_world},
+  {"linked_list",
+   create_code_linked_list,
+   verify_code_linked_list},
+  {"long_names",
+   create_code_long_names,
+   verify_code_long_names},
+  {"quadratic",
+   create_code_quadratic,
+   verify_code_quadratic},
+  {"nested_loop",
+   create_code_nested_loop,
+   verify_code_nested_loop},
+  {"reading_struct ",
+   create_code_reading_struct ,
+   verify_code_reading_struct },
+  {"string_literal",
+   create_code_string_literal,
+   verify_code_string_literal},
+  {"sum_of_squares",
+   create_code_sum_of_squares,
+   verify_code_sum_of_squares},
+  {"types",
+   create_code_types,
+   verify_code_types},
+  {"using_global",
+   create_code_using_global,
+   verify_code_using_global},
+  {"volatile",
+   create_code_volatile,
+   verify_code_volatile}
+};
+
+const int num_testcases = (sizeof (testcases) / sizeof (testcases[0]));
diff --git a/gcc/testsuite/jit.dg/test-combination.c 
b/gcc/testsuite/jit.dg/test-combination.c
index 5131613..a9f3347 100644
--- a/gcc/testsuite/jit.dg/test-combination.c
+++ b/gcc/testsuite/jit.dg/test-combination.c
@@ -15,57 +15,13 @@
 void
 create_code (gcc_jit_context *ctxt, void * user_data)
 {
-  create_code_accessing_struct (ctxt, user_data);
-  create_code_accessing_union (ctxt, user_data);
-  create_code_arith_overflow (ctxt, user_data);
-  create_code_array_as_pointer (ctxt, user_data);
-  create_code_arrays (ctxt, user_data);
-  create_code_calling_external_function (ctxt, user_data);
-  create_code_calling_function_ptr (ctxt, user_data);
-  create_code_constants (ctxt, user_data);
-  create_code_dot_product (ctxt, user_data);
-  create_code_expressions (ctxt, user_data);
-  create_code_factorial (ctxt, user_data);
-  create_code_fibonacci (ctxt, user_data);
-  create_code_functions (ctxt, user_data);
-  create_code_hello_world (ctxt, user_data);
-  create_code_linked_list (ctxt, user_data);
-  create_code_long_names (ctxt, 

Re: [Patch/combine] PR64304 wrong bitmask passed to force_to_mode in combine_simplify_rtx

2015-01-09 Thread Jiong Wang
2015-01-09 21:29 GMT+00:00 Andrew Pinski :
> On Fri, Jan 9, 2015 at 12:40 PM, Jeff Law  wrote:
>> On 01/09/15 06:39, Jiong Wang wrote:
>>>
>>> as reported at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64304
>>>
>>> given the following test:
>>>
>>> unsigned char byte = 0;
>>>
>>> void set_bit(unsigned int bit, unsigned char value) {
>>>  unsigned char mask = (unsigned char)(1 << (bit & 7));
>>>  if (!value) {
>>>  byte &= (unsigned char)~mask;
>>>  } else {
>>>  byte |= mask;
>>>  }
>>> }
>>>
>>> we should generate something like:
>>>
>>>set_bit:
>>>  and w0, w0, 7
>>>  mov w2, 1
>>>  lsl w2, w2, w0
>>>
>>> while we are generating
>>>  mov w2, 1
>>>  lsl w2, w2, w0
>>>
>>>
>>> the necessary "and w0, w0, 7" is wrongly deleted.
>>>
>>> that is because
>>>
>>>(insn 2 5 3 2 (set (reg/v:SI 82 [ bit ])
>>>  (reg:SI 0 x0 [ bit ])) bug.c:3 38 {*movsi_aarch64}
>>>   (expr_list:REG_DEAD (reg:SI 0 x0 [ bit ])
>>>  (nil)))
>>>(insn 7 4 8 2 (set (reg:SI 84 [ D.1482 ])
>>>  (and:SI (reg/v:SI 82 [ bit ])
>>>  (const_int 7 [0x7]))) bug.c:4 399 {andsi3}
>>>   (expr_list:REG_DEAD (reg/v:SI 82 [ bit ])
>>>  (nil)))
>>>(insn 9 8 10 2 (set (reg:SI 85 [ D.1482 ])
>>>  (ashift:SI (reg:SI 86)
>>>  (subreg:QI (reg:SI 84 [ D.1482 ]) 0))) bug.c:4 539
>>> {*aarch64_ashl_sisd_or_int_si3}
>>>   (expr_list:REG_DEAD (reg:SI 86)
>>>  (expr_list:REG_DEAD (reg:SI 84 [ D.1482 ])
>>>  (expr_list:REG_EQUAL (ashift:SI (const_int 1 [0x1])
>>>  (subreg:QI (reg:SI 84 [ D.1482 ]) 0))
>>>  (nil)
>>>
>>> are wrongly combined into
>>>
>>>(insn 9 8 10 2 (set (reg:QI 85 [ D.1482 ])
>>>  (ashift:QI (subreg:QI (reg:SI 86) 0)
>>>  (reg:QI 0 x0 [ bit ]))) bug.c:4 556 {*ashlqi3_insn}
>>>   (expr_list:REG_DEAD (reg:SI 0 x0 [ bit ])
>>>  (expr_list:REG_DEAD (reg:SI 86)
>>>  (nil
>>>
>>> thus, the generated assembly lacks the necessary "and w0, w0, 7".
>>>
>>> the root cause is that at one place in the combine pass we are passing
>>> the wrong bitmask to force_to_mode.
>>>
>>> in this particular case, for QI mode, we should pass ((1 << 8) - 1), while
>>> we are passing ((1 << 3) - 1),
>>> thus the combiner thinks we only need the lower 3 bits, and that X & 7 is
>>> unnecessary. While for QI mode, we
>>> want the lower 8 bits. we should remove the exp operator.
>>>
>>> this is probably a historical bug in the combine pass; it is only
>>> triggered for targets
>>> where SHIFT_COUNT_TRUNCATED is true. it has stayed hidden for a long
>>> time, mostly because x86/arm will
>>> not trigger this part of the code.
>>>
>>> bootstrap on x86 and gcc check OK.
>>> bootstrap on aarch64 and bare-metal regression OK.
>>> ok for trunk?
>>>
>>> gcc/
>>>PR64304
>>>* combine.c (combine_simplify_rtx): Correct the bitmask passed to
>>> force_to_mode.
>>> gcc/testsuite/
>>>PR64304
>>>* gcc.target/aarch64/pr64304.c: New testcase.
>>
>> I don't think this is correct.
>>
>> When I put a breakpoint on the code in question I see the following RTL
>> prior to the call to DO_SUBST:
>>
>> (ashift:QI (const_int 1 [0x1])
>> (subreg:QI (and:SI (reg:SI 0 x0 [ bit ])
>> (const_int 7 [0x7])) 0))
>>
>>
>> Note carefully the QImode for the ASHIFT.  That clearly indicates that just
>> the low 8 bits are meaningful and on a SHIFT_COUNT_TRUNCATED target the
>> masking of the count with 0x7 is redundant as the only valid shift counts
>> are 0-7 (again because of the QImode for the ASHIFT).

Thanks for the explanation; I had misunderstood some key points here.

> Jeff is correct here.  SHIFT_COUNT_TRUNCATED cannot be true if the
> aarch64 back-end has shifts patterns for smaller than 32/64bit but the
> aarch64 target only has shifts for 32 and 64bit.

Pinski, thanks for pointing this out.

Agreed. AArch64 defining shift patterns for the SHORT types is likely the problem.

Regards,
Jiong

> The middle-end is doing the correct thing: with SHIFT_COUNT_TRUNCATED
> true and a pattern for a QI shift, the shift count does not need to be
> truncated before use.
>
> Thanks,
> Andrew Pinski
>
>>
>> Jeff
>>
>>
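
The disagreement above can be illustrated with a small C model. This is a sketch under stated assumptions, not GCC code: lsl32 models a 32-bit register shift that uses the count modulo 32 (as the AArch64 W-register LSL does), low_mask builds the two bitmasks mentioned in the report, and the other function names are made up for illustration. It shows that dropping the "& 7" is only safe if the shift really truncates its count to 3 bits, which a 32-bit shift does not.

```c
#include <stdint.h>

/* Mask with the low `bits` bits set, valid for 0 < bits < 32.  */
uint32_t
low_mask (unsigned bits)
{
  return (UINT32_C (1) << bits) - 1;
}

/* Model of a 32-bit shift that takes its count modulo 32.  */
uint32_t
lsl32 (uint32_t x, uint32_t count)
{
  return x << (count & 31);
}

/* The mask from the test case, with the explicit "& 7" kept.  */
unsigned char
mask_with_and (unsigned bit)
{
  return (unsigned char) lsl32 (1, bit & 7);
}

/* The same computation after the "& 7" has been dropped.  */
unsigned char
mask_without_and (unsigned bit)
{
  return (unsigned char) lsl32 (1, bit);
}
```

For bit values 0 to 7 the two versions agree. For bit = 9 the first yields 2 and the second yields 0, because the 32-bit shift moves the bit out of the low byte. That is consistent with the conclusion reached in the thread: the simplification is valid for a genuine QImode shift on a SHIFT_COUNT_TRUNCATED target, and the trouble comes from the port offering sub-word shift patterns backed by 32-bit instructions.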


[PATCH, committed] Fix build of jit (was Re: [PATCH] Flatten tree.h and tree-core.h (Version 3))

2015-01-09 Thread David Malcolm
On Sat, 2015-01-10 at 01:50 +0530, Prathamesh Kulkarni wrote:
> On 9 January 2015 at 16:21, Richard Biener  wrote:
> > On Fri, Jan 9, 2015 at 10:39 AM, Michael Collison
> >  wrote:
> >> This patch flattens tree.h and tree-core.h. This is a revised patch that
> >> does not include tree-core.h as a result of flattening.
> >>
> >> Version 3 of the patch adds the header files removed from tree-core.h to
> >> gcc-plugin.h in order to allow ggc-common.c to compile. This is a recent
> >> issue seen on trunk.
> >>
> >> I removed the includes in tree.h and tree-core.h except for the include of
> >> tree-core.h in tree.h.
> >>
> > >> I modified genattrtab.c, genautomata.c, genemit.c, gengtype.c,
> > >> genopinit.c, genoutput.c,
> > >> genpeep.c, genpreds.c, and optc-save-gen.awk to include the necessary
> > >> include files removed from
> > >> tree.h and tree-core.h when generating their respective files.
> >>
> >> I removed three inline functions from tree.h and relocated them to
> >> fold-const.c and exported them in fold-const.h. The functions are:
> >>
> > >> convert_to_ptrofftype_loc
> >> fold_build_pointer_plus_loc
> >> fold_build_pointer_plus_hwi_loc
> >>
> > >> All other changes include the necessary include files removed from tree.h
> > >> and tree-core.h. Note that the patch modifies all the front ends.
> >>
> >> I bootstrapped on x86 with all languages. I also bootstrapped on all 
> >> targets
> >> listed in contrib/config-list.mk with c and c++ enabled.
> >>
> >> Is this okay for trunk?
> >
> > Ok.
> Committed as r219402 on behalf of Michael.
> 
> Thanks,
> Prathamesh
> >
> > Thanks,
> > Richard.
> >
> >> 2014-12-24  Michael Collison  
> >>
> >> * genattrtab.c (write_header): Include hash-set.h, machmode.h,
> >> vec.h, double-int.h, input.h, alias.h, symtab.h, options.h
> >> fold-const.h, wide-int.h, and inchash.h when generating
> >> insn-attrtab.c.
> > >> * genautomata.c (main): Include hash-set.h, machmode.h,
> >> vec.h, double-int.h, input.h, alias.h, symtab.h, options.h
> >> fold-const.h, wide-int.h, and inchash.h when generating
> >> insn-automata.c.
> >> * genemit.c (main): Include hash-set.h, machmode.h,
> >> vec.h, double-int.h, input.h, alias.h, symtab.h, options.h
> >> fold-const.h, wide-int.h, and inchash.h when generating
> >> insn-emit.c.
> >> * gengtype.c (open_base_files): Include hash-set.h, machmode.h,
> >> vec.h, double-int.h, input.h, alias.h, symtab.h, options.h
> >> fold-const.h, wide-int.h, and inchash.h when generating
> >> gtype-desc.c.
> >> * genopinit.c (main): Include hash-set.h, machmode.h,
> >> vec.h, double-int.h, input.h, alias.h, symtab.h, options.h
> >> fold-const.h, wide-int.h, and inchash.h when generating
> >> insn-opinit.c.
> >> * genoutput.c (output_prologue): Include hash-set.h, machmode.h,
> >> vec.h, double-int.h, input.h, alias.h, symtab.h, options.h
> >> fold-const.h, wide-int.h, and inchash.h when generating
> >> insn-output.c.
> >> * genpeep.c (main): Include hash-set.h, machmode.h,
> >> vec.h, double-int.h, input.h, alias.h, symtab.h, options.h
> >> fold-const.h, wide-int.h, and inchash.h when generating
> >> insn-peep.c.
> >> * genpreds.c (write_insn_preds_c): Include hash-set.h, machmode.h,
> >> vec.h, double-int.h, input.h, alias.h, symtab.h, options.h
> >> fold-const.h, wide-int.h, and inchash.h when generating
> >> insn-preds.c.
> > >> * optc-save-gen.awk: Include hash-set.h, machmode.h,
> >> vec.h, double-int.h, input.h, alias.h, symtab.h, options.h
> >> fold-const.h, wide-int.h, and inchash.h when generating
> >> options-save.c.
> >> * opth-gen.awk: Change include guard from GCC_C_COMMON_H to
> >> GCC_C_COMMON_C
> >> when generating options.h.
> >>
> >> 2014-12-24  Michael Collison  
> >>
> >> * ada/gcc-interface/cuintp.c: Include hash-set.h, machmode.h,
> >> vec.h, double-int.h, input.h, alias.h, symtab.h,
> >> fold-const.h, wide-int.h, and inchash.h due to
> >> flattening of tree.h.
> >> * ada/gcc-interface/decl.c: ditto.
> >> * ada/gcc-interface/misc.c: ditto.
> >> * ada/gcc-interface/targtyps.c: Include hash-set.h, machmode.h,
> >> vec.h, double-int.h, input.h, alias.h, symtab.h, options.h,
> >> fold-const.h, wide-int.h, and inchash.h due to
> >> flattening of tree.h.
> >> * ada/gcc-interface/trans.c: Include hash-set.h, machmode.h,
> >> vec.h, double-int.h, input.h, alias.h, symtab.h, real.h,
> >> fold-const.h, wide-int.h, inchash.h due to
> >> flattening of tree.h.
> >> * ada/gcc-interface/utils.c: Include hash-set.h, machmode.h,
> >> vec.h, double-int.h, input.h, alias.h, symtab.h,
> >> fold-const.h, wide-int.h, and inchash.h due to
> >> flattening of tree.h.
> >> * ada/gcc-interface/utils2.c: ditto.
> >> * alias.c: Include hash-set.h, machmode.h,
> >> vec.h, double-int.h, input.h, alias.h,

Re: [PATCH] Flatten tree.h and tree-core.h (Version 3)

2015-01-09 Thread Prathamesh Kulkarni
On 10 January 2015 at 02:58, Jakub Jelinek  wrote:
> On Sat, Jan 10, 2015 at 01:50:42AM +0530, Prathamesh Kulkarni wrote:
>> >> I bootstrapped on x86 with all languages. I also bootstrapped on all 
>> >> targets
>> >> listed in contrib/config-list.mk with c and c++ enabled.
>> >>
>> >> Is this okay for trunk?
>> >
>> > Ok.
>> Committed as r219402 on behalf of Michael.
>
> Please note that the GCC tree has many different ChangeLog files, so
> it is incorrect to put everything into gcc/ChangeLog.  E.g. fortran/
> entries belong into gcc/fortran/ChangeLog and should not have
> fortran/ prefixes, etc.  I've fixed it up for this time, but please
> take care about it next time.
Oops, sorry about that, I will take care henceforth.
Thanks for fixing it up.

Regards,
Prathamesh
>
> Jakub


Re: [Patch/combine] PR64304 wrong bitmask passed to force_to_mode in combine_simplify_rtx

2015-01-09 Thread Andrew Pinski
On Fri, Jan 9, 2015 at 12:40 PM, Jeff Law  wrote:
> On 01/09/15 06:39, Jiong Wang wrote:
>>
>> as reported at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64304
>>
>> given the following test:
>>
>> unsigned char byte = 0;
>>
>> void set_bit(unsigned int bit, unsigned char value) {
>>  unsigned char mask = (unsigned char)(1 << (bit & 7));
>>  if (!value) {
>>  byte &= (unsigned char)~mask;
>>  } else {
>>  byte |= mask;
>>  }
>> }
>>
>> we should generate something like:
>>
>>set_bit:
>>  and w0, w0, 7
>>  mov w2, 1
>>  lsl w2, w2, w0
>>
>> while we are generating
>>  mov w2, 1
>>  lsl w2, w2, w0
>>
>>
>> the necessary "and w0, w0, 7" is wrongly deleted.
>>
>> that is because
>>
>>(insn 2 5 3 2 (set (reg/v:SI 82 [ bit ])
>>  (reg:SI 0 x0 [ bit ])) bug.c:3 38 {*movsi_aarch64}
>>   (expr_list:REG_DEAD (reg:SI 0 x0 [ bit ])
>>  (nil)))
>>(insn 7 4 8 2 (set (reg:SI 84 [ D.1482 ])
>>  (and:SI (reg/v:SI 82 [ bit ])
>>  (const_int 7 [0x7]))) bug.c:4 399 {andsi3}
>>   (expr_list:REG_DEAD (reg/v:SI 82 [ bit ])
>>  (nil)))
>>(insn 9 8 10 2 (set (reg:SI 85 [ D.1482 ])
>>  (ashift:SI (reg:SI 86)
>>  (subreg:QI (reg:SI 84 [ D.1482 ]) 0))) bug.c:4 539
>> {*aarch64_ashl_sisd_or_int_si3}
>>   (expr_list:REG_DEAD (reg:SI 86)
>>  (expr_list:REG_DEAD (reg:SI 84 [ D.1482 ])
>>  (expr_list:REG_EQUAL (ashift:SI (const_int 1 [0x1])
>>  (subreg:QI (reg:SI 84 [ D.1482 ]) 0))
>>  (nil)))))
>>
>> are wrongly combined into
>>
>>(insn 9 8 10 2 (set (reg:QI 85 [ D.1482 ])
>>  (ashift:QI (subreg:QI (reg:SI 86) 0)
>>  (reg:QI 0 x0 [ bit ]))) bug.c:4 556 {*ashlqi3_insn}
>>   (expr_list:REG_DEAD (reg:SI 0 x0 [ bit ])
>>  (expr_list:REG_DEAD (reg:SI 86)
>>  (nil))))
>>
>> thus, the generated assembly lacks the necessary "and w0, w0, 7".
>>
>> the root cause is that at one place in the combine pass we pass the wrong
>> bitmask to force_to_mode.
>>
>> in this particular case, for QI mode, we should pass ((1 << 8) - 1), while
>> we are passing ((1 << 3) - 1),
>> thus the combiner thinks we only need the lower 3 bits and that X & 7 is
>> unnecessary. For QI mode, however, we
>> want the lower 8 bits. we should remove the exp operator.
>>
>> this looks like a historical bug in the combine pass. It is only
>> triggered for targets
>> where SHIFT_COUNT_TRUNCATED is true. It has stayed hidden for a long time, mostly
>> because x86/arm do
>> not trigger this part of the code.
>>
>> bootstrap on x86 and gcc check OK.
>> bootstrap on aarch64 and bare-metal regression OK.
>> ok for trunk?
>>
>> gcc/
>>PR64304
>>* combine.c (combine_simplify_rtx): Correct the bitmask passed to
>> force_to_mode.
>> gcc/testsuite/
>>PR64304
>>* gcc.target/aarch64/pr64304.c: New testcase.
>
> I don't think this is correct.
>
> When I put a breakpoint on the code in question I see the following RTL
> prior to the call to DO_SUBST:
>
> (ashift:QI (const_int 1 [0x1])
> (subreg:QI (and:SI (reg:SI 0 x0 [ bit ])
> (const_int 7 [0x7])) 0))
>
>
> Note carefully the QImode for the ASHIFT.  That clearly indicates that just
> the low 8 bits are meaningful and on a SHIFT_COUNT_TRUNCATED target the
> masking of the count with 0x7 is redundant as the only valid shift counts
> are 0-7 (again because of the QImode for the ASHIFT).  Thus that's
> equivalent to:
>
>
> (ashift:QI (const_int 1 [0x1])
> (reg:QI 0 x0 [ bit ]))
>
>
> Similarly for the case:
>
>
> (ashift:QI (subreg:QI (reg:SI 85) 0)
> (subreg:QI (and:SI (reg:SI 0 x0 [ bit ])
> (const_int 7 [0x7])) 0))
>
>
> Again, QImode ASHIFT, so the masking of the shift count is redundant
> resulting in:
>
> (ashift:QI (subreg:QI (reg:SI 85) 0)
> (reg:QI 0 x0 [ bit ]))
>
>
> I think you need to do some further analysis.  Is it perhaps the case that
> SHIFT_COUNT_TRUNCATED is nonzero when in fact it should be zero?

Jeff is correct here.  SHIFT_COUNT_TRUNCATED cannot be true if the
aarch64 back-end has shift patterns for modes narrower than 32/64 bits while
the aarch64 target only has shift instructions for 32 and 64 bits.
The middle-end is doing the correct thing: with SHIFT_COUNT_TRUNCATED
true and a pattern for a QImode shift, the shift count does not need to be
truncated before use.
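
To make the mismatch concrete, here is a minimal C++ sketch (not GCC code).
The two helper names are hypothetical, and the second one assumes that a
32-bit register shift reduces its count modulo 32, which is how I understand
"lsl w2, w2, w0" to behave.  The first helper models what a QImode shift
pattern promises when SHIFT_COUNT_TRUNCATED is true; the second models what
the hardware would actually do.

```cpp
#include <cstdint>

// Hypothetical model of the middle-end's view: on a SHIFT_COUNT_TRUNCATED
// target a QImode shift implicitly reduces its count modulo 8, so an
// explicit "& 7" on the count is redundant.
constexpr std::uint8_t shl_qi_truncating(std::uint8_t value, unsigned count) {
    return static_cast<std::uint8_t>(value << (count % 8));
}

// Hypothetical model of the hardware: a 32-bit shift (count assumed to be
// reduced modulo 32) whose result is then narrowed to 8 bits.
constexpr std::uint8_t shl_si_then_narrow(std::uint8_t value, unsigned count) {
    return static_cast<std::uint8_t>(static_cast<std::uint32_t>(value)
                                     << (count % 32));
}

// With bit == 11 the two models disagree unless the count is masked first.
static_assert(shl_qi_truncating(1, 11) == 8, "QImode view: 11 behaves like 3");
static_assert(shl_si_then_narrow(1, 11) == 0, "32-bit shift: bit 11 is lost");
static_assert(shl_si_then_narrow(1, 11 & 7) == 8, "explicit mask restores it");
```

Under these assumptions, dropping the "and" is only safe if the shift
instruction really truncates at the width of the mode in the pattern.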

Thanks,
Andrew Pinski

>
> Jeff
>
>


Re: [PATCH] Flatten tree.h and tree-core.h (Version 3)

2015-01-09 Thread Jakub Jelinek
On Sat, Jan 10, 2015 at 01:50:42AM +0530, Prathamesh Kulkarni wrote:
> >> I bootstrapped on x86 with all languages. I also bootstrapped on all 
> >> targets
> >> listed in contrib/config-list.mk with c and c++ enabled.
> >>
> >> Is this okay for trunk?
> >
> > Ok.
> Committed as r219402 on behalf of Michael.

Please note that the GCC tree has many different ChangeLog files, so
it is incorrect to put everything into gcc/ChangeLog.  E.g. fortran/
entries belong into gcc/fortran/ChangeLog and should not have
fortran/ prefixes, etc.  I've fixed it up for this time, but please
take care about it next time.

Jakub


Re: Patch RFA: Support for building Go tools

2015-01-09 Thread Ian Lance Taylor
Committed initial gotools patch with following ChangeLog entries:

./:

2015-01-09  Ian Lance Taylor  

* configure.ac (host_tools): Add gotools.
* Makefile.def (host_modules): Add gotools.
(dependencies): Add dependency of all-gotools on all-target-libgo.

gcc/go/:

2015-01-09  Ian Lance Taylor  

* config-lang.in (lang_dirs): Define.

gotools/:

2015-01-09  Ian Lance Taylor  

* Initial implementation.


libgo patch committed: Adjust finding gccgo by cmd/go to match upstream

2015-01-09 Thread Ian Lance Taylor
This patch changes the way that cmd/go finds gccgo to match the
upstream sources (which had changed since the 1.3 sources currently in
libgo).  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu.  Committed to mainline.

Ian
diff -r 882d8b02b84b libgo/go/cmd/go/build.go
--- a/libgo/go/cmd/go/build.go  Thu Jan 08 12:31:29 2015 -0800
+++ b/libgo/go/cmd/go/build.go  Fri Jan 09 13:16:57 2015 -0800
@@ -1783,15 +1783,22 @@
 // The Gccgo toolchain.
 type gccgoToolchain struct{}
 
-func (gccgoToolchain) compiler() string {
-   if v := os.Getenv("GOC"); v != "" {
-   return v
+var gccgoName, gccgoBin string
+
+func init() {
+   gccgoName = os.Getenv("GCCGO")
+   if gccgoName == "" {
+   gccgoName = defaultGCCGO
}
-   return defaultGOC
+   gccgoBin, _ = exec.LookPath(gccgoName)
 }
 
-func (tools gccgoToolchain) linker() string {
-   return tools.compiler()
+func (gccgoToolchain) compiler() string {
+   return gccgoBin
+}
+
+func (gccgoToolchain) linker() string {
+   return gccgoBin
 }
 
 func (tools gccgoToolchain) gc(b *builder, p *Package, archive, obj string, 
importArgs []string, gofiles []string) (ofile string, output []byte, err error) 
{


Re: Patch RFA: Support for building Go tools

2015-01-09 Thread Ian Lance Taylor
On Fri, Jan 9, 2015 at 7:40 AM, Paolo Bonzini  wrote:
>
>
> On 09/01/2015 15:24, Ian Lance Taylor wrote:
>>> >
>>> > This should work automatically, the only difference is that you must
>>> > omit $(LIBGODEP) from the dependencies.
>> What will happen if there is no installed gccgo at the right version?
>
> Compilation fails.
>
>> What should happen?
>
> Compilation fails. :)
>
>> I haven't been keeping track--will the build machinery now build a
>> build-x-host Go compiler?
>
> No, but the build machinery should point the Makefile variables to a
> build-x-host tool from the PATH.  If it doesn't, you need to modify the
> toplevel configure.ac and Makefile.tpl.

I forgot: there is another problem with using the build-x-host
compiler to build the tools.  That will build them with the host
version of the go/build package, which will set the default values for
the architecture and OS to be that of the host, not the target.  They
should really be the target.  We can fix this, but it will require
some thought.

I plan to commit the version that does not do anything when building a
cross-compiler in order to proceed.  The situation will not be worse
than it is today.

Ian


Re: [Patch/combine] PR64304 wrong bitmask passed to force_to_mode in combine_simplify_rtx

2015-01-09 Thread Jeff Law

On 01/09/15 06:39, Jiong Wang wrote:

as reported at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64304

given the following test:

unsigned char byte = 0;

void set_bit(unsigned int bit, unsigned char value) {
 unsigned char mask = (unsigned char)(1 << (bit & 7));
 if (!value) {
 byte &= (unsigned char)~mask;
 } else {
 byte |= mask;
 }
}

we should generate something like:

   set_bit:
 and w0, w0, 7
 mov w2, 1
 lsl w2, w2, w0

while we are generating
 mov w2, 1
 lsl w2, w2, w0


the necessary "and w0, w0, 7" is wrongly deleted.

That is because

   (insn 2 5 3 2 (set (reg/v:SI 82 [ bit ])
 (reg:SI 0 x0 [ bit ])) bug.c:3 38 {*movsi_aarch64}
  (expr_list:REG_DEAD (reg:SI 0 x0 [ bit ])
 (nil)))
   (insn 7 4 8 2 (set (reg:SI 84 [ D.1482 ])
 (and:SI (reg/v:SI 82 [ bit ])
 (const_int 7 [0x7]))) bug.c:4 399 {andsi3}
  (expr_list:REG_DEAD (reg/v:SI 82 [ bit ])
 (nil)))
   (insn 9 8 10 2 (set (reg:SI 85 [ D.1482 ])
 (ashift:SI (reg:SI 86)
 (subreg:QI (reg:SI 84 [ D.1482 ]) 0))) bug.c:4 539
{*aarch64_ashl_sisd_or_int_si3}
  (expr_list:REG_DEAD (reg:SI 86)
 (expr_list:REG_DEAD (reg:SI 84 [ D.1482 ])
 (expr_list:REG_EQUAL (ashift:SI (const_int 1 [0x1])
 (subreg:QI (reg:SI 84 [ D.1482 ]) 0))
 (nil)))))

are wrongly combined into

   (insn 9 8 10 2 (set (reg:QI 85 [ D.1482 ])
 (ashift:QI (subreg:QI (reg:SI 86) 0)
 (reg:QI 0 x0 [ bit ]))) bug.c:4 556 {*ashlqi3_insn}
  (expr_list:REG_DEAD (reg:SI 0 x0 [ bit ])
 (expr_list:REG_DEAD (reg:SI 86)
 (nil))))

thus, the generated assembly lacks the necessary "and w0, w0, 7".

the root cause is that at one place in the combine pass we pass the wrong
bitmask to force_to_mode.

in this particular case, for QI mode, we should pass ((1 << 8) - 1), while
we are passing ((1 << 3) - 1),
thus the combiner thinks we only need the lower 3 bits and that X & 7 is
unnecessary. For QI mode, however, we
want the lower 8 bits. we should remove the exp operator.

this looks like a historical bug in the combine pass. It is only
triggered for targets
where SHIFT_COUNT_TRUNCATED is true. It has stayed hidden for a long time, mostly
because x86/arm do
not trigger this part of the code.

bootstrap on x86 and gcc check OK.
bootstrap on aarch64 and bare-metal regression OK.
ok for trunk?

gcc/
   PR64304
   * combine.c (combine_simplify_rtx): Correct the bitmask passed to
force_to_mode.
gcc/testsuite/
   PR64304
   * gcc.target/aarch64/pr64304.c: New testcase.

I don't think this is correct.

When I put a breakpoint on the code in question I see the following RTL 
prior to the call to DO_SUBST:


(ashift:QI (const_int 1 [0x1])
(subreg:QI (and:SI (reg:SI 0 x0 [ bit ])
(const_int 7 [0x7])) 0))


Note carefully the QImode for the ASHIFT.  That clearly indicates that 
just the low 8 bits are meaningful and on a SHIFT_COUNT_TRUNCATED target 
the masking of the count with 0x7 is redundant as the only valid shift 
counts are 0-7 (again because of the QImode for the ASHIFT).  Thus 
that's equivalent to:



(ashift:QI (const_int 1 [0x1])
(reg:QI 0 x0 [ bit ]))


Similarly for the case:


(ashift:QI (subreg:QI (reg:SI 85) 0)
(subreg:QI (and:SI (reg:SI 0 x0 [ bit ])
(const_int 7 [0x7])) 0))


Again, QImode ASHIFT, so the masking of the shift count is redundant 
resulting in:


(ashift:QI (subreg:QI (reg:SI 85) 0)
(reg:QI 0 x0 [ bit ]))


I think you need to do some further analysis.  Is it perhaps the case 
that SHIFT_COUNT_TRUNCATED is nonzero when in fact it should be zero?


Jeff




Re: [PING][PATCH][1-3] New configure options that make the compiler use -fPIE and -pie as default option

2015-01-09 Thread Magnus Granberg
On Friday 09 January 2015 13.00.14, Daniel Micay wrote:
> On 09/01/15 12:49 PM, Joseph Myers wrote:
> > On Fri, 9 Jan 2015, Daniel Micay wrote:
> >>> --with-specs="%{pie|fpic|fPIC|fpie|fPIE|fno-pic|fno-PIC|fno-pie|fno-PIE|
> >>> shared|static|nostdlib|nodefaultlibs|nostartfiles:;:-fPIE -pie}"
> >>> 
> >>> at configure time (using CONFIGURE_SPECS).
DRIVER_SELF_SPECS is checked before CONFIGURE_SPECS. On mips it will have added 
-mno-shared before it checks CONFIGURE_SPECS. I want to support more targets 
later on. I can move the spec to elfos.h.
> >>> 
> >>> I have no idea if the above is really the proper spec to use - why
> >>> do you include static, nostdlib, nodefaultlibs and nostartfiles
> >>> for example?  Similar, if I say
> >> 
> >> PIE isn't supported for static executables by binutils, etc. so it
> >> does need to exclude that. The checks for nostdlib, nodefaultlibs
> > 
> > Well - that would indicate excluding -pie if one of the link-time options
> > conflicting with it is used, -fPIE if one of the compile-time options
> > conflicting with it is used.  That way, "gcc -static file.c" would still
> > have the same effect as "gcc -c file.c; gcc -static file.o" (building a
> > PIE object, linking it into a non-PIE static executable), which makes
> > logical sense to me (although there may be no great benefit either way).
> 
> Sure, I agree. It should have separate lists of exceptions for both of
> these.
I can separate it into compile and link sections and remove the nostdlib, 
nodefaultlibs and nostartfiles. But how do we avoid passing -pie to the linker
when we don't pass static or shared and don't link it with -pie? Only the gold 
linker supports -no-pie.

/Magnus G.




Re: [PATCH] Flatten tree.h and tree-core.h (Version 3)

2015-01-09 Thread Prathamesh Kulkarni
On 9 January 2015 at 16:21, Richard Biener  wrote:
> On Fri, Jan 9, 2015 at 10:39 AM, Michael Collison
>  wrote:
>> This patch flattens tree.h and tree-core.h. This is a revised patch that
>> does not include tree-core.h as a result of flattening.
>>
>> Version 3 of the patch adds the header files removed from tree-core.h to
>> gcc-plugin.h in order to allow ggc-common.c to compile. This is a recent
>> issue seen on trunk.
>>
>> I removed the includes in tree.h and tree-core.h except for the include of
>> tree-core.h in tree.h.
>>
>> I modified genattrtab.c, genautomata.c, genemit.c, gengtype.c,
>> genopinit.c, genoutput.c,
>> genpeep.c, genpreds.c, and optc-save-gen.awk to include the necessary
>> include files removed from
>> tree.h and tree-core.h when generating their respective files.
>>
>> I removed three inline functions from tree.h and relocated them to
>> fold-const.c and exported them in fold-const.h. The functions are:
>>
>> convert_to_ptrofftype_loc
>> fold_build_pointer_plus_loc
>> fold_build_pointer_plus_hwi_loc
>>
>> All other changes include the necessary include files removed from tree.h
>> and tree-core.h. Note the patch modifies all the front-ends.
>>
>> I bootstrapped on x86 with all languages. I also bootstrapped on all targets
>> listed in contrib/config-list.mk with c and c++ enabled.
>>
>> Is this okay for trunk?
>
> Ok.
Committed as r219402 on behalf of Michael.

Thanks,
Prathamesh
>
> Thanks,
> Richard.
>
>> 2014-12-24  Michael Collison  
>>
>> * genattrtab.c (write_header): Include hash-set.h, machmode.h,
>> vec.h, double-int.h, input.h, alias.h, symtab.h, options.h
>> fold-const.h, wide-int.h, and inchash.h when generating
>> insn-attrtab.c.
>> * genautomata.c (main): Include hash-set.h,
>> machmode.h,
>> vec.h, double-int.h, input.h, alias.h, symtab.h, options.h
>> fold-const.h, wide-int.h, and inchash.h when generating
>> insn-automata.c.
>> * genemit.c (main): Include hash-set.h, machmode.h,
>> vec.h, double-int.h, input.h, alias.h, symtab.h, options.h
>> fold-const.h, wide-int.h, and inchash.h when generating
>> insn-emit.c.
>> * gengtype.c (open_base_files): Include hash-set.h, machmode.h,
>> vec.h, double-int.h, input.h, alias.h, symtab.h, options.h
>> fold-const.h, wide-int.h, and inchash.h when generating
>> gtype-desc.c.
>> * genopinit.c (main): Include hash-set.h, machmode.h,
>> vec.h, double-int.h, input.h, alias.h, symtab.h, options.h
>> fold-const.h, wide-int.h, and inchash.h when generating
>> insn-opinit.c.
>> * genoutput.c (output_prologue): Include hash-set.h, machmode.h,
>> vec.h, double-int.h, input.h, alias.h, symtab.h, options.h
>> fold-const.h, wide-int.h, and inchash.h when generating
>> insn-output.c.
>> * genpeep.c (main): Include hash-set.h, machmode.h,
>> vec.h, double-int.h, input.h, alias.h, symtab.h, options.h
>> fold-const.h, wide-int.h, and inchash.h when generating
>> insn-peep.c.
>> * genpreds.c (write_insn_preds_c): Include hash-set.h, machmode.h,
>> vec.h, double-int.h, input.h, alias.h, symtab.h, options.h
>> fold-const.h, wide-int.h, and inchash.h when generating
>> insn-preds.c.
>> * optc-save-gen.awk: Include hash-set.h, machmode.h,
>> vec.h, double-int.h, input.h, alias.h, symtab.h, options.h
>> fold-const.h, wide-int.h, and inchash.h when generating
>> options-save.c.
>> * opth-gen.awk: Change include guard from GCC_C_COMMON_H to
>> GCC_C_COMMON_C
>> when generating options.h.
>>
>> 2014-12-24  Michael Collison  
>>
>> * ada/gcc-interface/cuintp.c: Include hash-set.h, machmode.h,
>> vec.h, double-int.h, input.h, alias.h, symtab.h,
>> fold-const.h, wide-int.h, and inchash.h due to
>> flattening of tree.h.
>> * ada/gcc-interface/decl.c: ditto.
>> * ada/gcc-interface/misc.c: ditto.
>> * ada/gcc-interface/targtyps.c: Include hash-set.h, machmode.h,
>> vec.h, double-int.h, input.h, alias.h, symtab.h, options.h,
>> fold-const.h, wide-int.h, and inchash.h due to
>> flattening of tree.h.
>> * ada/gcc-interface/trans.c: Include hash-set.h, machmode.h,
>> vec.h, double-int.h, input.h, alias.h, symtab.h, real.h,
>> fold-const.h, wide-int.h, inchash.h due to
>> flattening of tree.h.
>> * ada/gcc-interface/utils.c: Include hash-set.h, machmode.h,
>> vec.h, double-int.h, input.h, alias.h, symtab.h,
>> fold-const.h, wide-int.h, and inchash.h due to
>> flattening of tree.h.
>> * ada/gcc-interface/utils2.c: ditto.
>> * alias.c: Include hash-set.h, machmode.h,
>> vec.h, double-int.h, input.h, alias.h, symtab.h, options.h
>> fold-const.h, wide-int.h, and inchash.h due to
>> flattening of tree.h.
>> * asan.c: ditto.
>> * attribs.c: ditto.
>> * auto-inc-dec.c: ditto.
>> * auto-profile.c: ditto
>> * bb-reorder.c: ditto.
>> * bt-load.c: Include symtab.h d

[PATCH, committed] New jit API entrypoint: gcc_jit_context_new_rvalue_from_long

2015-01-09 Thread David Malcolm
Previously it was only possible to create integer constants via
gcc_jit_context_new_rvalue_from_int [1], which takes a host "int" and
a gcc_jit_type representing a target type.

Hence it wasn't possible to create e.g. the constant (long)LONG_MAX
if int != long.

(strictly speaking, one might be able to do it via the double and
void * entrypoints, but doing so would be error-prone).

Introduce new API entrypoint: gcc_jit_context_new_rvalue_from_long
which takes a host "long".

Internally, rework the constant-handling machinery to use templates,
which should make it easier to add further host types for constant
values, e.g. long long and the unsigned variants, if we need to.
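
As an illustration of why templating over the host type helps, here is a
minimal self-contained C++ sketch.  It is not the libgccjit implementation:
the struct constant_memento and this free function new_rvalue_from_const are
hypothetical stand-ins that only borrow the naming idea from the ChangeLog
below.

```cpp
#include <climits>
#include <string>
#include <type_traits>

// Hypothetical stand-in for a recorded constant.  HOST_TYPE is the host C
// type that carries the value until it is converted to a target type.
template <typename HOST_TYPE>
struct constant_memento {
    HOST_TYPE value;
    // Only declared here; each supported host type supplies its own
    // explicit specialization, so an unsupported type fails at link time.
    std::string make_debug_string() const;
};

template <>
inline std::string constant_memento<int>::make_debug_string() const {
    return "(int)" + std::to_string(value);
}

template <>
inline std::string constant_memento<long>::make_debug_string() const {
    return "(long)" + std::to_string(value);
}

// One template replaces per-type entry points; supporting another host type
// means adding one more specialization above.
template <typename HOST_TYPE>
constexpr constant_memento<HOST_TYPE> new_rvalue_from_const(HOST_TYPE value) {
    return constant_memento<HOST_TYPE>{value};
}

// LONG_MAX survives intact because the value is carried as a host "long".
static_assert(new_rvalue_from_const(LONG_MAX).value == LONG_MAX,
              "no narrowing through int");
```

In this sketch, new_rvalue_from_const(42).make_debug_string() would go
through the int specialization, while a LONG_MAX argument selects the long
one.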

This brings jit.sum to:
  # of expected passes   7152

Committed to trunk as r219401.

gcc/jit/ChangeLog:
* docs/cp/topics/expressions.rst (Simple expressions): Use
":c:type:" for C types.  Document new overload of
gcc::jit::context::new_rvalue.
* docs/topics/expressions.rst (Simple expressions): Use
":c:type:" for C types.  Document new entrypoint
gcc_jit_context_new_rvalue_from_long.
* docs/_build/texinfo/libgccjit.texi: Regenerate.
* jit-playback.c: Within namespace gcc::jit::playback...
(context::new_rvalue_from_int): Eliminate in favor of...
(context::new_rvalue_from_const <int>): ...this.
(context::new_rvalue_from_double): Eliminate in favor of...
(context::new_rvalue_from_const <double>): ...this.
(context::new_rvalue_from_const <long>): New.
(context::new_rvalue_from_ptr): Eliminate in favor of...
(context::new_rvalue_from_const <void *>): ...this.
* jit-playback.h: Within namespace gcc::jit::playback...
(context::new_rvalue_from_int): Eliminate in favor of...
(context::new_rvalue_from_const ): ...this.
(context::new_rvalue_from_double): Likewise.
(context::new_rvalue_from_ptr): Likewise.
* jit-recording.c: Within namespace gcc::jit::recording...
(context::new_rvalue_from_int): Eliminate.
(context::new_rvalue_from_double): Likewise.
(context::new_rvalue_from_ptr): Likewise.
(class memento_of_new_rvalue_from_const ):
Add explicit specialization.
(class memento_of_new_rvalue_from_const ):
Likewise.
(class memento_of_new_rvalue_from_const ):
Likewise.
(class memento_of_new_rvalue_from_const ):
Likewise.
(memento_of_new_rvalue_from_int::replay_into):
Generalize into...
(memento_of_new_rvalue_from_const ::replay_into):
...this...
(memento_of_new_rvalue_from_double::replay_into):
...allowing this...
(memento_of_new_rvalue_from_ptr::replay_into):
...and this to be deleted.
(memento_of_new_rvalue_from_int::make_debug_string):
Convert to...
(memento_of_new_rvalue_from_const <int>::make_debug_string):
...this.
(memento_of_new_rvalue_from_double::make_debug_string):
Convert to...
(memento_of_new_rvalue_from_const <double>::make_debug_string):
...this.
(memento_of_new_rvalue_from_ptr::make_debug_string):
Convert to...
(memento_of_new_rvalue_from_const <void *>::make_debug_string):
...this.
(memento_of_new_rvalue_from_const <long>::make_debug_string):
New function.
* jit-recording.h: Within namespace gcc::jit::recording...
(context::new_rvalue_from_int): Eliminate.
(context::new_rvalue_from_double): Likewise.
(context::new_rvalue_from_ptr): Likewise, all in favor of...
(context::new_rvalue_from_const ): New family of
methods.
(class memento_of_new_rvalue_from_int): Eliminate.
(class memento_of_new_rvalue_from_double): Likewise.
(class memento_of_new_rvalue_from_ptr): Likewise.
(class memento_of_new_rvalue_from_const ): New family
of rvalue subclasses.
* libgccjit++.h (gccjit::context::new_rvalue): New overload, for
"long".
* libgccjit.c (gcc_jit_context_new_rvalue_from_int): Update for
rewriting of recording::context::new_rvalue_from_int to
recording::context::new_rvalue_from_const <int>.
(gcc_jit_context_new_rvalue_from_long): New API entrypoint.
(gcc_jit_context_new_rvalue_from_double): Update for
rewriting of recording::context::new_rvalue_from_double to
recording::context::new_rvalue_from_const <double>.
(gcc_jit_context_new_rvalue_from_ptr): Update for
rewriting of recording::context::new_rvalue_from_ptr to
recording::context::new_rvalue_from_const <void *>.
* libgccjit.h (gcc_jit_context_new_rvalue_from_long): New API
entrypoint.
* libgccjit.map (gcc_jit_context_new_rvalue_from_long): Likewise.

gcc/testsuite/ChangeLog:
* jit.dg/all-non-failing-tests.h: Add test-constants.c.
* jit.dg/test-combination.c (create_code): Likewise.
(verify_code): Likewise.
   

Re: [PATCH] Handle CALL_INSN_FUNCTION_USAGE clobbers in regcprop.c

2015-01-09 Thread Tom de Vries

On 09-01-15 11:48, Jakub Jelinek wrote:

On Fri, Jan 09, 2015 at 11:35:41AM +0100, Tom de Vries wrote:

2015-01-09  Tom de Vries  

PR rtl-optimization/64539
* regcprop.c (copyprop_hardreg_forward_1): Handle clobbers in
CALL_INSN_FUNCTION_USAGE.


To avoid the duplication, wouldn't it be better to add

static void
kill_clobbered_values (rtx_insn *insn, struct value_data *vd)
{
   note_stores (PATTERN (insn), kill_clobbered_value, vd);
   if (CALL_P (insn))
 {
   rtx exp;
   for (exp = CALL_INSN_FUNCTION_USAGE (insn); exp; exp = XEXP (exp, 1))
{
  rtx x = XEXP (exp, 0);
  if (GET_CODE (x) == CLOBBER)
kill_value (SET_DEST (x), vd);
}
 }
}
function (with appropriate function comment) and use it in both places?

Otherwise LGTM.


Committed as attached.

Thanks,
- Tom
2015-01-09  Tom de Vries  

	PR rtl-optimization/64539
	* regcprop.c (kill_clobbered_values): Factor out of ...
	(copyprop_hardreg_forward_1): ... here.  Use kill_clobbered_values
	instead of note_stores with kill_clobbered_value.
---
 gcc/regcprop.c | 34 ++
 1 file changed, 22 insertions(+), 12 deletions(-)

diff --git a/gcc/regcprop.c b/gcc/regcprop.c
index 8c4f564..c809e77 100644
--- a/gcc/regcprop.c
+++ b/gcc/regcprop.c
@@ -734,6 +734,26 @@ cprop_find_used_regs (rtx *loc, void *data)
 }
 }
 
+/* Apply clobbers of INSN in PATTERN and C_I_F_U to value_data VD.  */
+
+static void
+kill_clobbered_values (rtx_insn *insn, struct value_data *vd)
+{
+  note_stores (PATTERN (insn), kill_clobbered_value, vd);
+
+  if (CALL_P (insn))
+{
+  rtx exp;
+
+  for (exp = CALL_INSN_FUNCTION_USAGE (insn); exp; exp = XEXP (exp, 1))
+	{
+	  rtx x = XEXP (exp, 0);
+	  if (GET_CODE (x) == CLOBBER)
+	kill_value (SET_DEST (x), vd);
+	}
+}
+}
+
 /* Perform the forward copy propagation on basic block BB.  */
 
 static bool
@@ -800,7 +820,7 @@ copyprop_hardreg_forward_1 (basic_block bb, struct value_data *vd)
   /* Within asms, a clobber cannot overlap inputs or outputs.
 	 I wouldn't think this were true for regular insns, but
 	 scan_rtx treats them like that...  */
-  note_stores (PATTERN (insn), kill_clobbered_value, vd);
+  kill_clobbered_values (insn, vd);
 
   /* Kill all auto-incremented values.  */
   /* ??? REG_INC is useless, since stack pushes aren't done that way.  */
@@ -1035,17 +1055,7 @@ copyprop_hardreg_forward_1 (basic_block bb, struct value_data *vd)
 	 but instead among CLOBBERs on the CALL_INSN, we could wrongly
 	 assume the value in it is still live.  */
 	  if (ksvd.ignore_set_reg)
-	{
-	  note_stores (PATTERN (insn), kill_clobbered_value, vd);
-	  for (exp = CALL_INSN_FUNCTION_USAGE (insn);
-		   exp;
-		   exp = XEXP (exp, 1))
-		{
-		  rtx x = XEXP (exp, 0);
-		  if (GET_CODE (x) == CLOBBER)
-		kill_value (SET_DEST (x), vd);
-		}
-	}
+	kill_clobbered_values (insn, vd);
 	}
 
   bool copy_p = (set
-- 
1.9.1



Re: [PATCH/expand] PR64011 Adjust bitsize when partial overflow happen for big-endian

2015-01-09 Thread Jiong Wang
2015-01-09 13:39 GMT+00:00 Jiong Wang :
>
> the following code in store_bit_field_using_insv hasn't considered the above
> MEM->REG situation,
> it always assumes bitnum + bitsize is within unit, which is wrong.
>if (BITS_BIG_ENDIAN != BYTES_BIG_ENDIAN)
> bitnum = unit - bitsize - bitnum;
>
> while my patch does have a problem, I should restrict the check to
> REG_P/SUBREG_P (op0) only.
> so is the patch OK with one extra check
>
>   if (REG_P (xop0) || (SUBREG_P (xop0) && REG_P (SUBREG_REG (xop0))))

sorry, this extra check should be unnecessary, because the following
check will only
be true if the destination is a register.

+  if (bitsize + bitnum > unit && bitnum < unit)
+bitsize = unit - bitnum;

so I think the original patch is OK?
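
A small worked example of the arithmetic, as a C++ sketch rather than the
actual expmed.c code.  The function names are hypothetical; the two
expressions are the ones quoted above.

```cpp
// The adjustment from the patch: when the field starts inside the unit but
// runs past its end, keep only the part that fits.
constexpr unsigned clamp_bitsize(unsigned bitsize, unsigned bitnum,
                                 unsigned unit) {
    return (bitsize + bitnum > unit && bitnum < unit) ? unit - bitnum
                                                      : bitsize;
}

// The existing big-endian renumbering: bitnum = unit - bitsize - bitnum.
// It is only meaningful when bitnum + bitsize <= unit.
constexpr unsigned big_endian_bitnum(unsigned bitsize, unsigned bitnum,
                                     unsigned unit) {
    return unit - bitsize - bitnum;
}

// unit = 32, field of 16 bits starting at bit 24: only 8 bits fit.
static_assert(clamp_bitsize(16, 24, 32) == 8, "partial overflow is clamped");
// After clamping, the renumbered position is well defined.
static_assert(big_endian_bitnum(clamp_bitsize(16, 24, 32), 24, 32) == 0,
              "32 - 8 - 24");
// A field that already fits is left alone.
static_assert(clamp_bitsize(8, 8, 32) == 8, "no change when within unit");
```

Without the clamp, 32 - 16 - 24 would go below zero, which is the case the
renumbering does not expect.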

thanks.

Regards,
Jiong

>
> Regards,
> Jiong
>
>>
>> jeff
>>
>>
>>
>
>


Re: [PATCH] libobjc: Properly handle classes without instance variables in class_copyIvarList ().

2015-01-09 Thread Mike Stump
On Jan 8, 2015, at 9:35 PM, Jeff Law  wrote:
> Do you want to be a reviewer for libobjc?

I think things are fine as is.  If things were pinged and there were no 
response, or if someone wanted to do a major update on the library to bring it 
up a decade, then we might want to change things.  If someone wanted to do a 
major update, then I’d recommend they step forward.


Re: [patch libstdc++] Fix assignability check in uninitialized_copy

2015-01-09 Thread Eelis

On 2015-01-09 19:03, Jonathan Wakely wrote:

The attached patch should be correct.

Tested x86_64-linux, committed to trunk and 4.9.


Awesome, thanks!



Re: [patch libstdc++] Fix assignability check in uninitialized_copy

2015-01-09 Thread Jonathan Wakely

On 28/12/14 13:47 +0100, Eelis wrote:

On 2014-12-28 00:18, Eelis wrote:

Trivial fix attached.


Please don't commit this patch.

I just noticed that the assignability test is wrong in an additional way: it 
should look at assignability of /output/ elements, not /input/ elements.

As a result, this code is currently rejected (because uninitialized_copy 
doesn't notice in time that chars can't be assigned to Xs):

struct X
{
X() = default;
X(char) {}
X(X const &) = default;
X & operator=(X const&) = default;

X & operator=(char) = delete;
};

#include <memory>

int main()
{
char a[100];
X b[100];

std::uninitialized_copy(a, a+10, b);
}

Updated fix attached. With it, the code is accepted again. Testing as we speak.



Index: stl_uninitialized.h
===
--- stl_uninitialized.h (revision 219070)
+++ stl_uninitialized.h (working copy)
@@ -104,30 +104,30 @@
  */
   template<typename _InputIterator, typename _ForwardIterator>
inline _ForwardIterator
uninitialized_copy(_InputIterator __first, _InputIterator __last,
   _ForwardIterator __result)
{
  typedef typename iterator_traits<_InputIterator>::value_type
_ValueType1;
  typedef typename iterator_traits<_ForwardIterator>::value_type
_ValueType2;
#if __cplusplus < 201103L
  const bool __assignable = true;
#else
  // trivial types can have deleted assignment
-  typedef typename iterator_traits<_InputIterator>::reference _RefType;
-  const bool __assignable = is_assignable<_ValueType1, _RefType>::value;
+  typedef typename iterator_traits<_ForwardIterator>::reference _RefType;
+  const bool __assignable = is_assignable<_RefType, _ValueType1>::value;
#endif


This is still wrong, because it tests for assigning an rvalue to the
result sequence, but dereferencing the _InputIterator doesn't
necessarily produce an rvalue, and the output type might behave
differently for assignment from lvalues and rvalues.

My first attempt to fix it was simply:

#else
  // trivial types can have deleted assignment
  typedef typename iterator_traits<_InputIterator>::reference _RefType;
-  const bool __assignable = is_assignable<_ValueType1, _RefType>::value;
+  const bool __assignable = is_assignable<_ValueType2, _RefType>::value;
#endif


But even that is wrong if the target type has a ref-qualified
assignment from the source type. The attached patch should be correct.
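
To show why the choice of is_assignable arguments matters, here is a small
standalone C++11 sketch.  NoCharAssign is the X from the example above under
a different name; LvalueOnly is a hypothetical type made up for
illustration.  The checks use reference types on both sides, which is what
models "*result = *first".

```cpp
#include <type_traits>

// Same shape as X above: a char converts to it, but direct assignment
// from char is deleted.
struct NoCharAssign {
    NoCharAssign() = default;
    NoCharAssign(char) {}
    NoCharAssign(NoCharAssign const&) = default;
    NoCharAssign& operator=(NoCharAssign const&) = default;
    NoCharAssign& operator=(char) = delete;
};

// Hypothetical type whose assignment from char is lvalue-ref-qualified.
struct LvalueOnly {
    LvalueOnly& operator=(char) & { return *this; }
};

// Output reference on the left, input reference on the right: the deleted
// operator=(char) is selected, so assignment is correctly reported as
// impossible.
static_assert(!std::is_assignable<NoCharAssign&, char&>::value,
              "deleted operator=(char) wins overload resolution");

// Using the value type as the target asks about assigning to an rvalue,
// which the ref-qualified operator rejects...
static_assert(!std::is_assignable<LvalueOnly, char&>::value,
              "value type as target models an rvalue");

// ...while the reference type asks about the lvalue that the algorithm
// really assigns to.
static_assert(std::is_assignable<LvalueOnly&, char&>::value,
              "reference type as target models *result = *first");
```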

Tested x86_64-linux, committed to trunk and 4.9.

commit 8ba322dd71cca830b3641002b4c1bf132f195741
Author: Jonathan Wakely 
Date:   Fri Jan 9 15:51:20 2015 +

	PR libstdc++/64476
	* include/bits/stl_uninitialized.h (uninitialized_copy): Fix
	is_assignable arguments.
	* testsuite/20_util/specialized_algorithms/uninitialized_copy/64476.cc:
	New.

diff --git a/libstdc++-v3/include/bits/stl_uninitialized.h b/libstdc++-v3/include/bits/stl_uninitialized.h
index 00659e9..61a1561 100644
--- a/libstdc++-v3/include/bits/stl_uninitialized.h
+++ b/libstdc++-v3/include/bits/stl_uninitialized.h
@@ -115,8 +115,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   const bool __assignable = true;
 #else
   // trivial types can have deleted assignment
-  typedef typename iterator_traits<_InputIterator>::reference _RefType;
-  const bool __assignable = is_assignable<_ValueType1, _RefType>::value;
+  typedef typename iterator_traits<_InputIterator>::reference _RefType1;
+  typedef typename iterator_traits<_ForwardIterator>::reference _RefType2;
+  const bool __assignable = is_assignable<_RefType2, _RefType1>::value;
 #endif
 
   return std::__uninitialized_copy<__is_trivial(_ValueType1)
diff --git a/libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_copy/64476.cc b/libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_copy/64476.cc
new file mode 100644
index 000..6369b17
--- /dev/null
+++ b/libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_copy/64476.cc
@@ -0,0 +1,65 @@
+// Copyright (C) 2015 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// { dg-options "-std=gnu++11" }
+
+#i

Re: [PING][PATCH][1-3] New configure options that make the compiler use -fPIE and -pie as default option

2015-01-09 Thread Daniel Micay
On 09/01/15 12:49 PM, Joseph Myers wrote:
> On Fri, 9 Jan 2015, Daniel Micay wrote:
> 
>>> --with-specs="%{pie|fpic|fPIC|fpie|fPIE|fno-pic|fno-PIC|fno-pie|fno-PIE|shared|static|nostdlib|nodefaultlibs|nostartfiles:;:-fPIE
>>> -pie}"
>>>
>>> at configure time (using CONFIGURE_SPECS).
>>>
>>> I have no idea if the above is really the proper spec to use - why
>>> do you include static, nostdlib, nodefaultlibs and nostartfiles
>>> for example?  Similar, if I say
>>
>> PIE isn't supported for static executables by binutils, etc. so it
>> does need to exclude that. The checks for nostdlib, nodefaultlibs
> 
> Well - that would indicate excluding -pie if one of the link-time options 
> conflicting with it is used, -fPIE if one of the compile-time options 
> conflicting with it is used.  That way, "gcc -static file.c" would still 
> have the same effect as "gcc -c file.c; gcc -static file.o" (building a 
> PIE object, linking it into a non-PIE static executable), which makes 
> logical sense to me (although there may be no great benefit either way).

Sure, I agree. It should have separate lists of exceptions for both of
these.
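
For illustration, a minimal C sketch of the split discussed above: one exception list is consulted before adding a default -fPIE, and a separate one before adding a default -pie. The function names are hypothetical, and how the options from the quoted spec are divided between the two lists is an assumption; the real change would live in the driver specs, not in C.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical split of the exceptions from the spec quoted above.
   Which option belongs in which list is an assumption.  */
static const char *const compile_exceptions[] = {
  "-fpic", "-fPIC", "-fpie", "-fPIE",
  "-fno-pic", "-fno-PIC", "-fno-pie", "-fno-PIE", NULL
};
static const char *const link_exceptions[] = {
  "-pie", "-shared", "-static",
  "-nostdlib", "-nodefaultlibs", "-nostartfiles", NULL
};

/* Return true if any argument appears in the NULL-terminated LIST.  */
bool
any_listed (int argc, const char *const *argv, const char *const *list)
{
  for (int i = 0; i < argc; i++)
    for (size_t j = 0; list[j] != NULL; j++)
      if (strcmp (argv[i], list[j]) == 0)
        return true;
  return false;
}

/* A default -fPIE is added unless a compile-time exception is present.  */
bool
add_default_fpie (int argc, const char *const *argv)
{
  return !any_listed (argc, argv, compile_exceptions);
}

/* A default -pie is added unless a link-time exception is present.  */
bool
add_default_pie (int argc, const char *const *argv)
{
  return !any_listed (argc, argv, link_exceptions);
}
```

With this split, "-static file.c" still gets -fPIE but no -pie, which is the same outcome as compiling with "gcc -c file.c" and linking with "gcc -static file.o".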





Re: [PATCH] libobjc: Properly handle classes without instance variables in class_copyIvarList ().

2015-01-09 Thread Mike Stump
On Jan 8, 2015, at 9:38 PM, Andrew Pinski  wrote:
>>   2014-12-24  Dimitris Papavasiliou  
>> 
>> PR libobjc/51891
>> * libobjc/ivars.c: Add a check for classes without instance
>>variables, which have a NULL ivar list pointer.
>> * gcc/testsuite/objc.dg/gnu-api-2-class.m: Add a test case
>>for the above change.

> This is ok.

Committed revision 219396.


Re: [PING][PATCH][1-3] New configure options that make the compiler use -fPIE and -pie as default option

2015-01-09 Thread Joseph Myers
On Fri, 9 Jan 2015, Daniel Micay wrote:

> > --with-specs="%{pie|fpic|fPIC|fpie|fPIE|fno-pic|fno-PIC|fno-pie|fno-PIE|shared|static|nostdlib|nodefaultlibs|nostartfiles:;:-fPIE
> > -pie}"
> > 
> > at configure time (using CONFIGURE_SPECS).
> > 
> > I have no idea if the above is really the proper spec to use - why
> > do you include static, nostdlib, nodefaultlibs and nostartfiles
> > for example?  Similar, if I say
> 
> PIE isn't supported for static executables by binutils, etc. so it
> does need to exclude that. The checks for nostdlib, nodefaultlibs

Well - that would indicate excluding -pie if one of the link-time options 
conflicting with it is used, -fPIE if one of the compile-time options 
conflicting with it is used.  That way, "gcc -static file.c" would still 
have the same effect as "gcc -c file.c; gcc -static file.o" (building a 
PIE object, linking it into a non-PIE static executable), which makes 
logical sense to me (although there may be no great benefit either way).

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] IPA ICF: add comparison for target and optimization nodes

2015-01-09 Thread Kyrill Tkachov


On 09/01/15 16:11, Christophe Lyon wrote:
> On 9 January 2015 at 11:26, Martin Liška  wrote:
>> On 01/09/2015 06:21 AM, Jeff Law wrote:
>>> On 01/07/15 04:38, Martin Liška wrote:
>>>> Hello.
>>>>
>>>> Following patch adds support for target and optimization nodes
>>>> comparison, which is
>>>> based on Honza's newly added infrastructure.
>>>>
>>>> Apart from that, there's a small hunk that corrects formatting and
>>>> removes unnecessary
>>>> call to a comparison function.
>>>>
>>>> Hope it can be applied as one patch.
>>>>
>>>> Tested on x86_64-linux-pc without any new regression introduction.
>>>>
>>>> Ready for trunk?
>>>>
>>>> Thank you,
>>>> Martin
>>>>
>>>> 0001-IPA-ICF-target-and-optimization-flags-comparison.patch
>>>>
>>>>
>>>> From 393eaa47c8aef9a91a1c635016f23ca2f5aa25e4 Mon Sep 17 00:00:00 2001
>>>> From: mliska
>>>> Date: Tue, 6 Jan 2015 15:06:18 +0100
>>>> Subject: [PATCH] IPA ICF: target and optimization flags comparison.
>>>>
>>>> gcc/ChangeLog:
>>>>
>>>> 2015-01-06  Martin Liska
>>>>
>>>>  * cgraphunit.c (cgraph_node::create_wrapper): Fix level of
>>>> indentation.
>>>>  * ipa-icf.c (sem_function::equals_private): Add support for target
>>>> and
>>>>  (sem_item_optimizer::merge_classes): Remove redundant function
>>>>  comparison.
>>>>  optimization flags comparison.
>>>>  * tree.h (target_opts_for_fn): New function.
>>>
>>> Looks like the changelog is a bit goof'd with lines intermixed.
>>>
>>> Patch itself is good for the trunk.  It'd be nice if you could add a
>>> testcase as well.
>>>
>>> Jeff
>>
>> Hi.
>>
>> You are right, I forgot to delete a line in Changelog.
>> Attachment contains final version with a new test case I'm going to install.
>
> Hi,
>
> It looks like this patch broke GCC builds for ARM and AArch64 targets at least.
>
> I see failures builds pr-support.o and unwind-arm.o:
>
> 0x10bb077 tree_check
> /tmp/2239141_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/tree.h:2778
> 0x10bb077 target_opts_for_fn
> /tmp/2239141_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/tree.h:4681
> 0x10bb077 ipa_icf::sem_function::equals_private(ipa_icf::sem_item*,
> hash_map&)
> /tmp/2239141_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/ipa-icf.c:431
> 0x10bbd27 ipa_icf::sem_function::equals(ipa_icf::sem_item*,
> hash_map&)
> /tmp/2239141_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/ipa-icf.c:386
> 0x10ba63a ipa_icf::sem_item_optimizer::subdivide_classes_by_equality(bool)
> /tmp/2239141_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/ipa-icf.c:1893
> 0x10bcd86 ipa_icf::sem_item_optimizer::execute()
> /tmp/2239141_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/ipa-icf.c:1712
> 0x10bce11 ipa_icf_driver
> /tmp/2239141_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/ipa-icf.c:2441
>
> (for target arm-none-eabi)

Yeah, I see these too when trying to bootstrap arm and aarch64

Kyrill

>> Thanks,
>> Martin





Re: [PATCH, fortran] Add gfc_define_builtin_with_spec

2015-01-09 Thread Tom de Vries

On 09-01-15 18:11, FX wrote:
> If unused on trunk, why would we commit it there?
> When your branch is merged, you'll merge it along. Otherwise that defeats the
> purpose of working on a branch, unless I misunderstand something...

This patch is not branch-specific.

Thanks,
- Tom

> FX
>
>> On 9 Jan 2015, at 16:37, Tom de Vries wrote:
>>
>> Jakub,
>>
>> For the oacc kernels patch series I need a fortran builtin with fn spec
>> attribute (as mentioned here: https://gcc.gnu.org/ml/gcc/2014-12/msg1.html
>> ).
>>
>> Attached patch adds a function gfc_define_builtin_with_spec that allows me to
>> define such a builtin.
>>
>> At this point there's no user yet in trunk, so I've declared it unused.
>>
>> Bootstrapped and reg-tested on x86_64.
>>
>> OK for stage 3 trunk?
>>
>> Thanks,
>> - Tom
>> <0001-Add-gfc_define_builtin_with_spec.patch>




Re: [RFC, PATCH][LRA, MIPS] ICE: in decompose_normal_address, at rtlanal.c:5817

2015-01-09 Thread Jeff Law

On 01/09/15 04:32, Robert Suchanek wrote:
> Hi Steven/Vladimir,
>
>> It's hard to say what the correct fix should be, but it sounds like
>> the address you get after the substitutions should be simplified
>> (folded).
>
> Coming back to the original testcase and re-analyzing the problem, it appears
> that there is, indeed, a missing case for simplification of LO_SUM/HIGH pair.
> The patch attached resolves the issue.
>
> Although the testcase is not reproducible on the trunk, I think it is still
> worth to include it in case the ICE reoccurred.
>
> The patch has been bootstrapped and regtested on x86_64-unknown-linux-gnu target
> and similarly tested against SVN revision r212763 where it can be reproduced.
>
> Regards,
> Robert
>
> 2015-01-08  Robert Suchanek  
>
> gcc/
> * simplify-rtx.c (simplify_replace_fn_rtx): Simplify (lo_sum (high x)
> (const (plus x offset))) to (const (plus x offset)).

You have to be careful here.  Whether or not this transformation is 
valid depends on the size of the offset and whether or not the port has 
an overlap between its sethi and losum insns and whether or not any 
rounding occurs when applying the relocations for sethi/losum as well as 
potentially other factors.


We don't currently have a way for ports to indicate what offsets would 
make this kind of simplification valid.   In fact, there's at least one 
port (PA) where the validity of this kind of simplification can't be 
determined until after LRA/reload when you know the full context of how 
the result is going to be used.
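
To make the offset and rounding concern concrete, here is a small C sketch. It assumes a MIPS-style 16/16 split in which the high part is rounded by adding 0x8000 and the low part is a signed 16-bit value; that model and the function names are assumptions for illustration, and other ports (PA, as noted above) follow different rules that this does not capture.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed MIPS-style high part: upper 16 bits, rounded so that a
   sign-extended low part can be added back.  */
uint32_t
high_part (uint32_t addr)
{
  return ((addr + 0x8000u) >> 16) & 0xffffu;
}

/* Assumed MIPS-style low part: low 16 bits, sign-extended.  */
int32_t
low_part (uint32_t addr)
{
  int32_t low = (int32_t) (addr & 0xffffu);
  return low >= 0x8000 ? low - 0x10000 : low;
}

/* Value produced by (lo_sum (high x) (x + offset)) under this model.  */
uint32_t
lo_sum_high (uint32_t x, uint32_t offset)
{
  return (high_part (x) << 16) + (uint32_t) low_part (x + offset);
}

/* True if folding the pair to (x + offset) keeps the same value.  */
bool
fold_is_safe (uint32_t x, uint32_t offset)
{
  return lo_sum_high (x, offset) == x + offset;
}
```

Under these assumptions, x = 0x12347ff0 with offset 8 folds safely, while offset 0x20 pushes the low part past 0x7fff and the high part computed for x alone no longer matches.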


Jeff




Re: [PATCH, fortran] Add gfc_define_builtin_with_spec

2015-01-09 Thread FX
If unused on trunk, why would we commit it there?
When your branch is merged, you'll merge it along. Otherwise that defeats the 
purpose of working on a branch, unless I misunderstand something...

FX



> On 9 Jan 2015, at 16:37, Tom de Vries wrote:
> 
> Jakub,
> 
> For the oacc kernels patch series I need a fortran builtin with fn spec 
> attribute (as mentioned here: 
> https://gcc.gnu.org/ml/gcc/2014-12/msg1.html ).
> 
> Attached patch adds a function gfc_define_builtin_with_spec that allows me to 
> define such a builtin.
> 
> At this point there's no user yet in trunk, so I've declared it unused.
> 
> Bootstrapped and reg-tested on x86_64.
> 
> OK for stage 3 trunk?
> 
> Thanks,
> - Tom
> <0001-Add-gfc_define_builtin_with_spec.patch>


[PATCH, committed] PR jit/64206: delay cleanup of tempdir if the user has requested debuginfo

2015-01-09 Thread David Malcolm
Committed to trunk as r219395.

gcc/jit/ChangeLog:
PR jit/64206
* docs/internals/test-hello-world.exe.log.txt: Update, the log now
shows tempdir creation/cleanup.
* docs/_build/texinfo/libgccjit.texi: Regenerate.
* jit-logging.h (class gcc::jit::log_user): Add gcc::jit::tempdir
to the list of subclasses in the comment.
* jit-playback.c (gcc::jit::playback::context::context): Add a
comment clarifying when the tempdir gets cleaned up.
(gcc::jit::playback::context::compile): Pass the context's logger,
if any, to the tempdir.
(gcc::jit::playback::context::dlopen_built_dso): When creating the
gcc::jit::result, if GCC_JIT_BOOL_OPTION_DEBUGINFO is set, hand
over ownership of the tempdir to it.
* jit-result.c: Include "jit-tempdir.h".
(gcc::jit::result::result): Add tempdir param, saving it as
m_tempdir.
(gcc::jit::result::~result): Delete m_tempdir.
* jit-result.h (gcc::jit::result::result): Add tempdir param.
(gcc::jit::result::m_tempdir): New field.
* jit-tempdir.c (gcc::jit::tempdir::tempdir): Add logger param;
add JIT_LOG_SCOPE.
(gcc::jit::tempdir::create): Add JIT_LOG_SCOPE to log entry/exit,
and log m_path_template and m_path_tempdir.
(gcc::jit::tempdir::~tempdir): Add JIT_LOG_SCOPE to log
entry/exit, and log the unlink and rmdir calls.
* jit-tempdir.h: Include "jit-logging.h".
(class gcc::jit::tempdir): Make this be a subclass of log_user.
(gcc::jit::tempdir::tempdir): Add logger param.
* notes.txt: Update to show the two possible places where the
tempdir can be cleaned up.
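
For readers unfamiliar with the change, a minimal C sketch of the ownership handover described in the ChangeLog above. The struct and function names are hypothetical stand-ins, not the libgccjit classes, and the cleanup is modelled as a counter rather than real unlink/rmdir calls.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for gcc::jit::tempdir; counts cleanups instead
   of touching the filesystem.  */
struct tempdir { int cleanups; };

/* Hypothetical stand-in for gcc::jit::result.  */
struct result { struct tempdir *owned_tempdir; };

void
tempdir_cleanup (struct tempdir *t)
{
  t->cleanups++;
}

/* End of compilation: if debuginfo was requested, hand the tempdir to
   the result so its files outlive the compile; otherwise clean up now.  */
void
finish_compile (struct result *res, struct tempdir *t, bool debuginfo)
{
  if (debuginfo)
    res->owned_tempdir = t;
  else
    {
      res->owned_tempdir = NULL;
      tempdir_cleanup (t);
    }
}

/* Releasing the result cleans up the tempdir only if it owns one.  */
void
result_release (struct result *res)
{
  if (res->owned_tempdir != NULL)
    {
      tempdir_cleanup (res->owned_tempdir);
      res->owned_tempdir = NULL;
    }
}
```

Either way the directory is cleaned up exactly once; the only difference is which of the two places does it.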
---
 .../docs/internals/test-hello-world.exe.log.txt| 16 +++-
 gcc/jit/jit-logging.h  |  2 +
 gcc/jit/jit-playback.c | 48 --
 gcc/jit/jit-result.c   | 13 +-
 gcc/jit/jit-result.h   |  3 +-
 gcc/jit/jit-tempdir.c  | 29 ++---
 gcc/jit/jit-tempdir.h  |  6 ++-
 gcc/jit/notes.txt  |  9 +++-
 8 files changed, 109 insertions(+), 17 deletions(-)

diff --git a/gcc/jit/docs/internals/test-hello-world.exe.log.txt b/gcc/jit/docs/internals/test-hello-world.exe.log.txt
index a96d80f..113dc35 100644
--- a/gcc/jit/docs/internals/test-hello-world.exe.log.txt
+++ b/gcc/jit/docs/internals/test-hello-world.exe.log.txt
@@ -46,6 +46,12 @@ JIT:   exiting: void gcc::jit::recording::context::validate()
 JIT:   entering: gcc::jit::playback::context::context(gcc::jit::recording::context*)
 JIT:   exiting: gcc::jit::playback::context::context(gcc::jit::recording::context*)
 JIT:   entering: gcc::jit::result* gcc::jit::playback::context::compile()
+JIT:entering: gcc::jit::tempdir::tempdir(gcc::jit::logger*, int)
+JIT:exiting: gcc::jit::tempdir::tempdir(gcc::jit::logger*, int)
+JIT:entering: bool gcc::jit::tempdir::create()
+JIT: m_path_template: /tmp/libgccjit-XX
+JIT: m_path_tempdir: /tmp/libgccjit-CKq1M9
+JIT:exiting: bool gcc::jit::tempdir::create()
 JIT:entering: void gcc::jit::playback::context::make_fake_args(vec*, const char*, vec*)
 JIT:exiting: void gcc::jit::playback::context::make_fake_args(vec*, const char*, vec*)
 JIT:entering: void gcc::jit::playback::context::acquire_mutex()
@@ -96,8 +102,9 @@ JIT: argv[5]: -fno-use-linker-plugin
 JIT: argv[6]: (null)
 JIT:exiting: void gcc::jit::playback::context::convert_to_dso(const char*)
 JIT:entering: gcc::jit::result* gcc::jit::playback::context::dlopen_built_dso()
-JIT: entering: gcc::jit::result::result(gcc::jit::logger*, void*)
-JIT: exiting: gcc::jit::result::result(gcc::jit::logger*, void*)
+JIT: GCC_JIT_BOOL_OPTION_DEBUGINFO was set: handing over tempdir to jit::result
+JIT: entering: gcc::jit::result::result(gcc::jit::logger*, void*, gcc::jit::tempdir*)
+JIT: exiting: gcc::jit::result::result(gcc::jit::logger*, void*, gcc::jit::tempdir*)
 JIT:exiting: gcc::jit::result* gcc::jit::playback::context::dlopen_built_dso()
 JIT:entering: void gcc::jit::playback::context::release_mutex()
 JIT:exiting: void gcc::jit::playback::context::release_mutex()
@@ -121,6 +128,11 @@ JIT: exiting: gcc_jit_context_release
 JIT: entering: gcc_jit_result_release
 JIT:  deleting result: 0x12f75d0
 JIT:  entering: virtual gcc::jit::result::~result()
+JIT:   entering: gcc::jit::tempdir::~tempdir()
+JIT:unlinking .s file: /tmp/libgccjit-CKq1M9/fake.s
+JIT:unlinking .so file: /tmp/libgccjit-CKq1M9/fake.so
+JIT:removing tempdir: /tmp/libgccjit-CKq1M9
+JIT:   exiting: gcc::jit::tempdir::~tempdir()
 JIT:  exiting: virtual gcc::jit::result::~result()
 JIT: exiting: gcc_jit_result_release
 JIT: gcc::jit::logger::~logger()
diff --git a/gcc/jit/jit-logging.h 

Re: [PATCH] rs6000: Fix TARGET_PROMOTE_FUNCTION_MODE

2015-01-09 Thread David Edelsohn
On Thu, Jan 8, 2015 at 8:10 PM, Segher Boessenkool
 wrote:
> As the existing comment explains, we should always promote function
> arguments and return values.  However, notwithstanding its name,
> default_promote_function_mode_always_promote does not always promote.
> Importantly, it does not for libcalls.  This makes ftrapv-[12].c fail
> with 64-bit ABIs.
>
> This patch introduces an rs6000_promote_function_mode that _does_
> always promote, fixing this.
>
> Tested as usual (c,c++,fortran,ada; -m32,-m32/-mpowerpc64,-m64,-m64/-mlra).
> Is this okay for mainline?
>
>
> Segher
>
>
> 2015-01-08  Segher Boessenkool  
>
> gcc/
> * config/rs6000/rs6000.c (TARGET_PROMOTE_FUNCTION_MODE): Implement
> as rs6000_promote_function_mode.  Move comment to there.
> (rs6000_promote_function_mode): New function.

rs6000_promote_function_mode should use the PROMOTE_MODE macro.  I
think that the macro has a bug for -m32 -mpowerpc64 in its use of
UNITS_PER_WORD.  But I prefer one definition of the behavior to avoid
divergence.

Thanks David


[Patch, AArch64, Testsuite] Check for expected MOVI vectorization.

2015-01-09 Thread Tejas Belagod

Hi,

This change:

2014-12-05  Martin Jambor  mjam...@suse.cz

	PR ipa/64192
	* ipa-prop.c (ipa_compute_jump_functions_for_edge): Convert alignment
	from bits to bytes after checking they are byte-aligned.

causes this regression on AArch64.

FAIL: gcc.target/aarch64/vect-movi.c scan-assembler movi\\tv[0-9]+.4s, 0xab, msl 16
FAIL: gcc.target/aarch64/vect-movi.c scan-assembler movi\\tv[0-9]+.4s, 0xab, msl 8
FAIL: gcc.target/aarch64/vect-movi.c scan-assembler mvni\\tv[0-9]+.4s, 0xab, msl 16
FAIL: gcc.target/aarch64/vect-movi.c scan-assembler mvni\\tv[0-9]+.4s, 0xab, msl 8


It causes the AArch64 vector cost model to vectorize the loops in the test
case with VF = 2 on A53/default and VF = 4 on A57.


A53/default:
movi v0.2s, 0xab, msl 8
str d0, [x0]
str d0, [x0, 8]
str d0, [x0, 16]
str d0, [x0, 24]
str d0, [x0, 32]
str d0, [x0, 40]
str d0, [x0, 48]
str d0, [x0, 56]

vs. A57

movi v0.4s, 0xab, msl 8
str q0, [x0]
str q0, [x0, 16]
str q0, [x0, 32]
str q0, [x0, 48]


But the test case isn't checking for per-core optimized code, just
whether we vectorize using MOVI or not. So this patch relaxes the regexps so
that the test passes for either vectorization factor.
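
As a quick check of the relaxed pattern, here is a small C sketch that uses POSIX extended regular expressions as a stand-in for the Tcl regexps that scan-assembler actually uses; the helper name is hypothetical and the sample lines assume a tab after the mnemonic.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if LINE matches the relaxed pattern, which accepts both
   the .4s (128-bit) and .2s (64-bit) arrangements.  */
bool
matches_movi_msl8 (const char *line)
{
  regex_t re;
  /* POSIX ERE stand-in for the dg-final pattern in the patch.  */
  if (regcomp (&re, "movi\tv[0-9]+\\.[42]s, 0xab, msl 8",
               REG_EXTENDED | REG_NOSUB) != 0)
    return false;
  bool ok = regexec (&re, line, 0, NULL, 0) == 0;
  regfree (&re);
  return ok;
}
```

Both the A53/default form (v0.2s) and the A57 form (v0.4s) match, while an unrelated arrangement such as v0.8h does not.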


OK for trunk?

Thanks,
Tejas.

Changelog:

gcc/testsuite:

* gcc.target/aarch64/vect-movi.c: Check for vectorization for
64-bit and 128-bit.

diff --git a/gcc/testsuite/gcc.target/aarch64/vect-movi.c b/gcc/testsuite/gcc.target/aarch64/vect-movi.c
index 59a0bd5..d28a71d 100644
--- a/gcc/testsuite/gcc.target/aarch64/vect-movi.c
+++ b/gcc/testsuite/gcc.target/aarch64/vect-movi.c
@@ -10,7 +10,7 @@ movi_msl8 (int *__restrict a)
 {
   int i;
 
-  /* { dg-final { scan-assembler "movi\\tv\[0-9\]+\.4s, 0xab, msl 8" } } */
+  /* { dg-final { scan-assembler "movi\\tv\[0-9\]+\.\[42\]s, 0xab, msl 8" } } */
   for (i = 0; i < N; i++)
 a[i] = 0xabff;
 }
@@ -20,7 +20,7 @@ movi_msl16 (int *__restrict a)
 {
   int i;
 
-  /* { dg-final { scan-assembler "movi\\tv\[0-9\]+\.4s, 0xab, msl 16" } } */
+  /* { dg-final { scan-assembler "movi\\tv\[0-9\]+\.\[42\]s, 0xab, msl 16" } } */
   for (i = 0; i < N; i++)
 a[i] = 0xab;
 }
@@ -30,7 +30,7 @@ mvni_msl8 (int *__restrict a)
 {
   int i;
 
-  /* { dg-final { scan-assembler "mvni\\tv\[0-9\]+\.4s, 0xab, msl 8" } } */
+  /* { dg-final { scan-assembler "mvni\\tv\[0-9\]+\.\[42\]s, 0xab, msl 8" } } */
   for (i = 0; i < N; i++)
 a[i] = 0x5400;
 }
@@ -40,7 +40,7 @@ mvni_msl16 (int *__restrict a)
 {
   int i;
 
-  /* { dg-final { scan-assembler "mvni\\tv\[0-9\]+\.4s, 0xab, msl 16" } } */
+  /* { dg-final { scan-assembler "mvni\\tv\[0-9\]+\.\[42\]s, 0xab, msl 16" } } */
   for (i = 0; i < N; i++)
 a[i] = 0xff54;
 }


RE: [PATCH] Enable experimental TSAN support for Ada

2015-01-09 Thread Richard Biener
On January 9, 2015 4:10:44 PM CET, Bernd Edlinger  
wrote:
>On Fri, 9 Jan 2015 14:04:27, Richard Biener wrote:
>>
>>>
>>> FYI: the VIEW_CONVERT_EXPR did not fail in the
>>> gcc_checking_assert (is_gimple_addressable (base))
>>> but much later, somewhere in tree-cfg.c it dropped out.
>>
>> How did it fail there? It doesn't look like &VIEW_CONVERT_EXPR
>> is forbidden.
>
>without that patch the following ICE happened,
>which also happened without the bit-fields patch.
>So this had never been working before:
>
>
>/home/ed/gnu/gcc-build/gcc/xgcc -c -B/home/ed/gnu/gcc-build/gcc/
>-gnatws -O2 -fsanitize=thread -gnato -gnatE check_file.adb
>check_file.adb: In function 'CHECK_FILE':
>check_file.adb:44:1: error: invalid operand in unary operation
>_498 = (character[1:15] *) &MEM[(character[1:D.5362]
>*)S38b.54_315][_321 ...]{lb: 1 sz: 1};
>check_file.adb:44:1: error: invalid operand in unary operation
>_503 = (character[1:15] *) &MEM[(character[1:D.5261]
>*)S75b.39_360][_366 ...]{lb: 1 sz: 1};
>check_file.adb:44:1: error: invalid operand in unary operation
>_510 = (character[1:10] *) &*S112b.17_112[_120 ...]{lb: 1 sz: 1};
>check_file.adb:44:1: error: invalid operand in unary operation
>_508 = (character[1:5] *) &*S112b.17_112[_123 ...]{lb: 1 sz: 1};
>+===========================GNAT BUG DETECTED==============================+
>| 5.0.0 20150108 (experimental) (x86_64-unknown-linux-gnu) GCC error:      |
>| verify_gimple failed                                                     |
>| Error detected around check_file.adb:44:1                                |
>| Please submit a bug report; see http://gcc.gnu.org/bugs.html.            |
>| Use a subject line meaningful to you and us to track the bug.            |
>| Include the entire contents of this bug box in the report.               |
>| Include the exact command that you entered.                              |
>| Also include sources listed below.                                       |
>+==========================================================================+
>
>Please include these source files with error report
>Note that list may not be accurate in some cases,
>so please double check that the problem can still
>be reproduced with the set of files listed.
>Consider also -gnatd.n switch (see debug.adb).
>
>
>so it must be this check in verify_gimple_assign_unary:
>
>  if (!is_gimple_val (rhs1))
>    {
>  error ("invalid operand in unary operation");
>  return true;
>    }

Yes. As said, you generally need to run folding results through 
force_gimple_operand.

Richard.

>
>Bernd.
> 




Re: [PATCH] rs6000: Fix recip tests for -m32 -mpowerpc64

2015-01-09 Thread David Edelsohn
On Fri, Jan 9, 2015 at 1:38 AM, Segher Boessenkool
 wrote:
> This fixes gcc.target/powerpc/recip-[67].c with -m32 -mpowerpc64.
> Tested etc.; okay for mainline?
>
>
> Segher
>
>
>
> 2015-01-08  Segher Boessenkool  
>
> gcc/testsuite/
> * gcc.target/powerpc/recip-test.h (_ARCH_PPC64): Use __LP64__ instead.

Okay.

Thanks, David


Re: [PATCH] IPA ICF: add comparison for target and optimization nodes

2015-01-09 Thread Christophe Lyon
On 9 January 2015 at 11:26, Martin Liška  wrote:
> On 01/09/2015 06:21 AM, Jeff Law wrote:
>>
>> On 01/07/15 04:38, Martin Liška wrote:
>>>
>>> Hello.
>>>
>>> Following patch adds support for target and optimization nodes
>>> comparison, which is
>>> based on Honza's newly added infrastructure.
>>>
>>> Apart from that, there's a small hunk that corrects formatting and
>>> removes unnecessary
>>> call to a comparison function.
>>>
>>> Hope it can be applied as one patch.
>>>
>>> Tested on x86_64-linux-pc without any new regression introduction.
>>>
>>> Ready for trunk?
>>>
>>> Thank you,
>>> Martin
>>>
>>> 0001-IPA-ICF-target-and-optimization-flags-comparison.patch
>>>
>>>
>>>  From 393eaa47c8aef9a91a1c635016f23ca2f5aa25e4 Mon Sep 17 00:00:00 2001
>>> From: mliska
>>> Date: Tue, 6 Jan 2015 15:06:18 +0100
>>> Subject: [PATCH] IPA ICF: target and optimization flags comparison.
>>>
>>> gcc/ChangeLog:
>>>
>>> 2015-01-06  Martin Liska
>>>
>>> * cgraphunit.c (cgraph_node::create_wrapper): Fix level of
>>> indentation.
>>> * ipa-icf.c (sem_function::equals_private): Add support for target
>>> and
>>> (sem_item_optimizer::merge_classes): Remove redundant function
>>> comparison.
>>> optimization flags comparison.
>>> * tree.h (target_opts_for_fn): New function.
>>
>> Looks like the changelog is a bit goof'd with lines intermixed.
>>
>> Patch itself is good for the trunk.  It'd be nice if you could add a
>> testcase as well.
>>
>> Jeff
>
>
> Hi.
>
> You are right, I forgot to delete a line in Changelog.
> Attachment contains final version with a new test case I'm going to install.
>
Hi,

It looks like this patch broke GCC builds for ARM and AArch64 targets at least.

I see failures building pr-support.o and unwind-arm.o:

0x10bb077 tree_check
/tmp/2239141_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/tree.h:2778
0x10bb077 target_opts_for_fn
/tmp/2239141_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/tree.h:4681
0x10bb077 ipa_icf::sem_function::equals_private(ipa_icf::sem_item*,
hash_map&)

/tmp/2239141_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/ipa-icf.c:431
0x10bbd27 ipa_icf::sem_function::equals(ipa_icf::sem_item*,
hash_map&)

/tmp/2239141_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/ipa-icf.c:386
0x10ba63a ipa_icf::sem_item_optimizer::subdivide_classes_by_equality(bool)

/tmp/2239141_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/ipa-icf.c:1893
0x10bcd86 ipa_icf::sem_item_optimizer::execute()

/tmp/2239141_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/ipa-icf.c:1712
0x10bce11 ipa_icf_driver

/tmp/2239141_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/ipa-icf.c:2441

(for target arm-none-eabi)

> Thanks,
> Martin


Re: [doc] fix documentation of -fvtable-verify and related options

2015-01-09 Thread Sandra Loosemore

On 01/08/2015 10:10 PM, Jeff Law wrote:
> On 01/08/15 15:08, Sandra Loosemore wrote:
>> This patch cleans up the documentation of -fvtable-verify, -fvtv-debug,
>> and -fvtv-counts.  The substantive change is to correct the location of
>> the debug log files per discussion here:
>>
>> https://gcc.gnu.org/ml/gcc/2015-01/msg00029.html
>>
>> but I ended up doing a pretty much total rewrite of the text to fix
>> various markup problems, issues with agreement of verb tense and
>> plurals, usage of terms like "runtime", etc.
>>
>> I think this particular patch goes a little beyond an obvious fix, so I
>> have not committed it yet.  But, I don't want it to get lost in the
>> shuffle, so I propose to do so in a few days if I don't hear any
>> objection or request for more time to review it meanwhile.
>>
>> -Sandra
>>
>>
>> 2015-01-08  Sandra Loosemore  
>>
>>  gcc/
>>  * doc/invoke.texi ([-fvtable-verify]): Copy-edit and fix markup.
>>  ([-fvtv-debug], [-fvtv-counts]): Likewise.  Correct location
>>  of log files.
>
> This is fine.  I did note that in some places you use "run time" and
> others "run-time".  Not sure if you want those to be consistent or not.
>
> Ok for the trunk.  If you want to make "run time" vs "run-time"
> consistent one way or the other consider it preapproved.

As I noted, the "runtime" vs "run time" vs "run-time" changes are 
deliberate.  See

https://gcc.gnu.org/codingconventions.html#Spelling

I did wonder, though, if "startup" should get the same treatment 
but currently "startup" is used consistently throughout the document as 
both noun and adjective, so any change to that ought to be handled 
separately.

-Sandra



Re: [PATCH] rs6000: Fix va_start handling for -m32 -mpowerpc64 ABI_V4

2015-01-09 Thread David Edelsohn
On Fri, Jan 9, 2015 at 5:52 AM, Segher Boessenkool
 wrote:
> This fixes 88 testsuite FAILs.
>
> -mpowerpc64 does not change the ABI, but it does change the value of
> UNITS_PER_WORD.  We could use POINTER_SIZE_UNITS instead of 4 here,
> but that does not seem quite right.  This code is for SVR4 only, so
> a literal 4 isn't so bad I think.  Better suggestions welcome though.
>
> Bootstrapped and tested as usual.  Okay for mainline?
>
>
> Segher
>
>
> 2015-01-09  Segher Boessenkool  
>
> gcc/
> * config/rs6000/rs6000.c (rs6000_va_start): Use a literal 4 instead
> of UNITS_PER_WORD to describe the size of stack slots.
>
> ---
>  gcc/config/rs6000/rs6000.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index 66a1399..cc7b2a4 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -11225,7 +11225,7 @@ rs6000_va_start (tree valist, rtx nextarg)
>/* Find the overflow area.  */
>t = make_tree (TREE_TYPE (ovf), virtual_incoming_args_rtx);
>if (words != 0)
> -t = fold_build_pointer_plus_hwi (t, words * UNITS_PER_WORD);
> +t = fold_build_pointer_plus_hwi (t, 4 * words);
>t = build2 (MODIFY_EXPR, TREE_TYPE (ovf), ovf, t);
>TREE_SIDE_EFFECTS (t) = 1;
>expand_expr (t, const0_rtx, VOIDmode, EXPAND_NORMAL);

MIN_UNITS_PER_WORD

Okay with that change.
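
For reference, a tiny C sketch of the arithmetic at stake in the quoted hunk. The constants are illustrative assumptions rather than values read from rs6000.h: the V.4 ABI uses 4-byte stack slots, while -mpowerpc64 makes UNITS_PER_WORD 8. The helper name is hypothetical.

```c
#include <stddef.h>

/* Illustrative values only (assumptions, not taken from rs6000.h).  */
enum { ABI_SLOT_BYTES = 4, UNITS_PER_WORD_POWERPC64 = 8 };

/* Byte offset of the overflow area after WORDS argument words.  */
size_t
overflow_offset (size_t words, size_t bytes_per_slot)
{
  return words * bytes_per_slot;
}
```

With three argument words, the ABI slot size gives an offset of 12 bytes, whereas scaling by the -mpowerpc64 word size would give 24 and point past the real overflow area.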

Thanks, David


Re: [RFC, PATCH][LRA, MIPS] ICE: in decompose_normal_address, at rtlanal.c:5817

2015-01-09 Thread pinskia




> On Jan 9, 2015, at 4:20 AM, Matthew Fortune  
> wrote:
> 
> Robert Suchanek  writes:
> 
>> gcc/
>>* simplify-rtx.c (simplify_replace_fn_rtx): Simplify (lo_sum (high x)
>>(const (plus x offset))) to (const (plus x offset)).
> 
> The fix appears valid to me. Just some comments on the test case.
> 
>> a/gcc/testsuite/gcc.target/mips/20150108.c
>> b/gcc/testsuite/gcc.target/mips/20150108.c
>> new file mode 100644
>> index 000..f18dbe7
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/mips/20150108.c
>> @@ -0,0 +1,25 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-mips32r2" } */
> 
> Please remove this line as there is nothing ISA dependent in the test case.

And since there is no mips specific part to the testcase (except for nomips16), 
we should place it in the generic part of the testsuite. 

Thanks,
Andrew

> 
>> +
>> +long long a[10];
>> +long long b, c, d, k, m, n, o, p, q, r, s, t, u, v, w; int e, f, g, h,
>> +i, j, l, x;
>> +
> 
> nit, no return type specified.
> 
>> +NOMIPS16 fn1() {
> 
> Nit, newline for the brace.
> 
>> +  for (; x; x++)
>> +if (x & 1)
>> +  s = h | g;
>> +else
>> +  s = f | e;
>> +  l = ~0;
>> +  m = 1 | k;
>> +  n = i;
>> +  o = j;
>> +  p = f | e;
>> +  q = h | g;
>> +  w = d | c | a[1];
>> +  t = c;
>> +  v = b | c;
>> +  u = v;
>> +  r = b | a[4];
>> +}
>> --
>> 1.7.9.5
> 
> Thanks,
> Matthew


Re: [PATCH] rs6000: Introducing rs6000_abi_word_mode

2015-01-09 Thread David Edelsohn
On Fri, Jan 9, 2015 at 10:26 AM, Segher Boessenkool
 wrote:
> Some hooks return word_mode by default, which is incorrect for -m32
> -mpowerpc64.  This patch creates a new function rs6000_abi_word_mode
> to implement these hooks, and does so.
>
> This fixes 163 testsuite FAILs.
>
> Tested as usual; okay for mainline?
>
>
> 2015-01-09  Segher Boessenkool  
>
> gcc/
> * config/rs6000/rs6000.c (TARGET_LIBGCC_CMP_RETURN_MODE,
> TARGET_LIBGCC_SHIFT_COUNT_MODE, TARGET_UNWIND_WORD_MODE): Implement
> as ...
> (rs6000_abi_word_mode): New function.

Okay.

> +/* The mode the ABI uses for a word.  This is not the same as word_mode
> +   for -m32 -mpowerpc64.  This is used to implement various target hooks.  */
> +
> +static enum machine_mode
> +rs6000_abi_word_mode (void)
> +{
> +  return TARGET_32BIT ? SImode : DImode;
> +}

But I think that new code does not need "enum".
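
As a side illustration of why the hook is needed, here is a toy C model of the two notions of "word". The enum, struct and function names are hypothetical and only mirror TARGET_32BIT and TARGET_POWERPC64; this is not GCC code.

```c
#include <stdbool.h>

/* Toy stand-ins for GCC's machine modes.  */
enum toy_mode { TOY_SImode, TOY_DImode };

/* Hypothetical flag bundle mirroring TARGET_32BIT and TARGET_POWERPC64.  */
struct toy_target { bool is_32bit; bool powerpc64; };

/* Register word: follows the register width.  */
enum toy_mode
toy_word_mode (struct toy_target t)
{
  return t.powerpc64 ? TOY_DImode : TOY_SImode;
}

/* ABI word: follows the ABI, as rs6000_abi_word_mode does in the patch.  */
enum toy_mode
toy_abi_word_mode (struct toy_target t)
{
  return t.is_32bit ? TOY_SImode : TOY_DImode;
}
```

The two functions agree for plain -m32 and -m64 and differ only for -m32 -mpowerpc64. The `enum` keyword is required in this C sketch; the remark above concerns GCC's own sources, which are compiled as C++.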

Thanks, David


Re: Patch RFA: Support for building Go tools

2015-01-09 Thread Paolo Bonzini


On 09/01/2015 15:24, Ian Lance Taylor wrote:
>> >
>> > This should work automatically, the only difference is that you must
>> > omit $(LIBGODEP) from the dependencies.
> What will happen if there is no installed gccgo at the right version?

Compilation fails.

> What should happen?

Compilation fails. :)

> I haven't been keeping track--will the build machinery now build a
> build-x-host Go compiler?

No, but the build machinery should point the Makefile variables to a
build-x-host tool from the PATH.  If it doesn't, you need to modify the
toplevel configure.ac and Makefile.tpl.

Paolo


Re: [nvptx-tools, committed] Also install [...]/nvptx-none/bin/ar and [...]/nvptx-none/bin/ranlib.

2015-01-09 Thread Bernd Schmidt

On 12/23/2014 07:50 PM, Thomas Schwinge wrote:
> GCC needs this, if nvptx-none-ar and nvptx-none-ranlib aren't found in $PATH.

I've pushed the three patches you sent to my github repository.

Bernd



[PATCH, fortran] Add gfc_define_builtin_with_spec

2015-01-09 Thread Tom de Vries

Jakub,

For the oacc kernels patch series I need a fortran builtin with fn spec 
attribute (as mentioned here: https://gcc.gnu.org/ml/gcc/2014-12/msg1.html ).


Attached patch adds a function gfc_define_builtin_with_spec that allows me to 
define such a builtin.


At this point there's no user yet in trunk, so I've declared it unused.

Bootstrapped and reg-tested on x86_64.

OK for stage 3 trunk?

Thanks,
- Tom
2015-01-09  Tom de Vries  

	* f95-lang.c (gfc_define_builtin_with_spec): New function.
---
 gcc/fortran/f95-lang.c | 21 +
 1 file changed, 21 insertions(+)

diff --git a/gcc/fortran/f95-lang.c b/gcc/fortran/f95-lang.c
index 9103fa9..0c33bb8 100644
--- a/gcc/fortran/f95-lang.c
+++ b/gcc/fortran/f95-lang.c
@@ -29,6 +29,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "coretypes.h"
 #include "gfortran.h"
 #include "tree.h"
+#include "stringpool.h"
 #include "flags.h"
 #include "langhooks.h"
 #include "langhooks-def.h"
@@ -594,6 +595,26 @@ gfc_define_builtin (const char *name, tree type, enum built_in_function code,
   set_builtin_decl (code, decl, true);
 }
 
+/* Like gfc_define_builtin, but with fn spec attribute FNSPEC.  */
+
+static void ATTRIBUTE_UNUSED
+gfc_define_builtin_with_spec (const char *name, tree fntype,
+			  enum built_in_function code,
+			  const char *library_name, int attr,
+			  const char *fnspec)
+{
+  if (fnspec)
+{
+  /* Code copied from build_library_function_decl_1.  */
+  tree attr_args = build_tree_list (NULL_TREE,
+	build_string (strlen (fnspec), fnspec));
+  tree attrs = tree_cons (get_identifier ("fn spec"),
+			  attr_args, TYPE_ATTRIBUTES (fntype));
+  fntype = build_type_attribute_variant (fntype, attrs);
+}
+
+  gfc_define_builtin (name, fntype, code, library_name, attr);
+}
 
 #define DO_DEFINE_MATH_BUILTIN(code, name, argtype, tbase) \
 gfc_define_builtin ("__builtin_" name "l", tbase##longdouble[argtype], \
-- 
1.9.1



Re: [PATCH] Fix sporadic failure in g++.dg/tsan/aligned_vs_unaligned_race.C

2015-01-09 Thread Jakub Jelinek
On Fri, Jan 09, 2015 at 04:32:47PM +0100, Bernd Edlinger wrote:
> Hi,
> 
> On Thu, 8 Jan 2015 22:27:26, Jakub Jelinek wrote:
> >> Any objections to approving it now?
> >
> > LGTM.
> >
> > Jakub
> 
> would it be OK to apply this patch also to the 4.9 testsuite,
> except for c-c++-common/tsan/bitfield_race.c and 
> g++.dg/tsan/aligned_vs_unaligned_race.C of course?

Yes, but please give Dmitry some time to respond.

Jakub


RE: [PATCH] Fix sporadic failure in g++.dg/tsan/aligned_vs_unaligned_race.C

2015-01-09 Thread Bernd Edlinger
Hi,

On Thu, 8 Jan 2015 22:27:26, Jakub Jelinek wrote:
>> Any objections to approving it now?
>
> LGTM.
>
> Jakub

would it be OK to apply this patch also to the 4.9 testsuite,
except for c-c++-common/tsan/bitfield_race.c and 
g++.dg/tsan/aligned_vs_unaligned_race.C of course?


Bernd.
  

[PATCH] rs6000: Introducing rs6000_abi_word_mode

2015-01-09 Thread Segher Boessenkool
Some hooks return word_mode by default, which is incorrect for -m32
-mpowerpc64.  This patch creates a new function rs6000_abi_word_mode
to implement these hooks, and does so.

This fixes 163 testsuite FAILs.

Tested as usual; okay for mainline?


2015-01-09  Segher Boessenkool  

gcc/
* config/rs6000/rs6000.c (TARGET_LIBGCC_CMP_RETURN_MODE,
TARGET_LIBGCC_SHIFT_COUNT_MODE, TARGET_UNWIND_WORD_MODE): Implement
as ...
(rs6000_abi_word_mode): New function.

---
 gcc/config/rs6000/rs6000.c | 17 +
 1 file changed, 17 insertions(+)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index cc7b2a4..958a8b9 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -1663,6 +1663,13 @@ static const struct attribute_spec 
rs6000_attribute_table[] =
 
 #undef TARGET_ATOMIC_ASSIGN_EXPAND_FENV
 #define TARGET_ATOMIC_ASSIGN_EXPAND_FENV rs6000_atomic_assign_expand_fenv
+
+#undef TARGET_LIBGCC_CMP_RETURN_MODE
+#define TARGET_LIBGCC_CMP_RETURN_MODE rs6000_abi_word_mode
+#undef TARGET_LIBGCC_SHIFT_COUNT_MODE
+#define TARGET_LIBGCC_SHIFT_COUNT_MODE rs6000_abi_word_mode
+#undef TARGET_UNWIND_WORD_MODE
+#define TARGET_UNWIND_WORD_MODE rs6000_abi_word_mode
 
 
 /* Processor table.  */
@@ -9293,8 +9300,18 @@ init_cumulative_args (CUMULATIVE_ARGS *cum, tree fntype,
 }
 }
 
+/* The mode the ABI uses for a word.  This is not the same as word_mode
+   for -m32 -mpowerpc64.  This is used to implement various target hooks.  */
+
+static enum machine_mode
+rs6000_abi_word_mode (void)
+{
+  return TARGET_32BIT ? SImode : DImode;
+}
+
 /* On rs6000, function arguments are promoted, as are function return
values.  */
+
 static machine_mode
 rs6000_promote_function_mode (const_tree, machine_mode mode, int *,
  const_tree, int)
-- 
1.8.1.4



RE: [PATCH] Enable experimental TSAN support for Ada

2015-01-09 Thread Bernd Edlinger
On Fri, 9 Jan 2015 14:04:27, Richard Biener wrote:
>
>>
>> FYI: the VIEW_CONVERT_EXPR did not fail in the
>> gcc_checking_assert (is_gimple_addressable (base))
>> but much later, somewhere in tree-cfg.c it dropped out.
>
> How did it fail there? It doesn't look like &VIEW_CONVERT_EXPR
> is forbidden.

without that patch the following ICE happened,
which also happened without the bit-fields patch.
So this had never been working before:


/home/ed/gnu/gcc-build/gcc/xgcc -c -B/home/ed/gnu/gcc-build/gcc/ -gnatws -O2 
-fsanitize=thread -gnato -gnatE check_file.adb
check_file.adb: In function 'CHECK_FILE':
check_file.adb:44:1: error: invalid operand in unary operation
_498 = (character[1:15] *) &MEM[(character[1:D.5362] *)S38b.54_315][_321 
...]{lb: 1 sz: 1};
check_file.adb:44:1: error: invalid operand in unary operation
_503 = (character[1:15] *) &MEM[(character[1:D.5261] *)S75b.39_360][_366 
...]{lb: 1 sz: 1};
check_file.adb:44:1: error: invalid operand in unary operation
_510 = (character[1:10] *) &*S112b.17_112[_120 ...]{lb: 1 sz: 1};
check_file.adb:44:1: error: invalid operand in unary operation
_508 = (character[1:5] *) &*S112b.17_112[_123 ...]{lb: 1 sz: 1};
+===GNAT BUG DETECTED==+
| 5.0.0 20150108 (experimental) (x86_64-unknown-linux-gnu) GCC error:  |
| verify_gimple failed |
| Error detected around check_file.adb:44:1    |
| Please submit a bug report; see http://gcc.gnu.org/bugs.html.    |
| Use a subject line meaningful to you and us to track the bug.    |
| Include the entire contents of this bug box in the report.   |
| Include the exact command that you entered.  |
| Also include sources listed below.   |
+==+

Please include these source files with error report
Note that list may not be accurate in some cases,
so please double check that the problem can still
be reproduced with the set of files listed.
Consider also -gnatd.n switch (see debug.adb).


so it must be this check in verify_gimple_assign_unary:

  if (!is_gimple_val (rhs1))
    {
  error ("invalid operand in unary operation");
  return true;
    }


Bernd.
  

Re: [PATCH][OpenMP] Forbid usage of non-target functions in target regions

2015-01-09 Thread Jakub Jelinek
On Fri, Jan 09, 2015 at 05:57:02PM +0300, Ilya Verbin wrote:
> Hi!
> 
> If one (by mistake) calls a non-target function from the target region, the
> offload compiler crashes in input_overwrite_node.  This is because, during
> streaming-out, compute_ltrans_boundary adds to SET the non-offloadable nodes
> that are called from offloadable nodes.
> It is probably possible to ignore such incorrect nodes (and edges) in
> streaming-out, but such a situation cannot appear in a correct OpenMP 4.0
> program, therefore I've added a check to scan_omp_1_stmt.

Unlike variables, the spec last time I've checked isn't all that clear about
that.

> --- a/gcc/omp-low.c
> +++ b/gcc/omp-low.c
> @@ -2818,6 +2818,19 @@ scan_omp_1_stmt (gimple_stmt_iterator *gsi, bool 
> *handled_ops_p,
> default:
>   break;
> }
> +   else if (!DECL_EXTERNAL (fndecl)
> +&& !cgraph_node::get_create (fndecl)->offloadable)

What if fndecl is defined in the current TU, but as a global symbol that can be
interposed (e.g. it is in a shared library and not hidden there), so that the
local function definition is without the target attribute but the definition
used at runtime is not?

> + {
> +   omp_context *octx;
> +   if (cgraph_node::get (current_function_decl)->offloadable)
> + remove = true;
> +   for (octx = ctx; octx && !remove; octx = octx->outer)
> + if (is_targetreg_ctx (octx))
> +   remove = true;
> +   if (remove)
> + error_at (gimple_location (stmt), "function called from "
> +   "target region, but not marked as 'declare target'");

%<declare target%> ?

Jakub


[PATCH][OpenMP] Forbid usage of non-target functions in target regions

2015-01-09 Thread Ilya Verbin
Hi!

If one (by mistake) calls a non-target function from the target region, the
offload compiler crashes in input_overwrite_node.  This is because, during
streaming-out, compute_ltrans_boundary adds to SET the non-offloadable nodes
that are called from offloadable nodes.
It is probably possible to ignore such incorrect nodes (and edges) in
streaming-out, but such a situation cannot appear in a correct OpenMP 4.0
program, therefore I've added a check to scan_omp_1_stmt.
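
For contrast with the rejected case, a minimal sketch of the form the check
accepts, assuming the callee is simply wrapped in a declare target region.
The names bar_on_target and foo_offloaded are hypothetical and do not come
from the patch or its testcase:

```cpp
// Sketch only: bar_on_target and foo_offloaded are hypothetical names.

#pragma omp declare target
// Marked 'declare target', so a device version is generated and the call
// from the target region below is valid.
int
bar_on_target ()
{
  return 1;
}
#pragma omp end declare target

int
foo_offloaded ()
{
  int x = 0;
  // map(tofrom: x) copies the result back to the host after the region.
  #pragma omp target map(tofrom: x)
  x = bar_on_target ();
  return x;
}
```

Without -fopenmp the pragmas are ignored and the call runs on the host, so
the sketch behaves the same either way.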

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?


gcc/
* omp-low.c (scan_omp_1_stmt): Forbid usage of non-target functions in
target regions.
gcc/testsuite/
* gcc.dg/gomp/target-3.c: New test.


diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 8f88d5e..021f86f 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -2818,6 +2818,19 @@ scan_omp_1_stmt (gimple_stmt_iterator *gsi, bool 
*handled_ops_p,
  default:
break;
  }
+ else if (!DECL_EXTERNAL (fndecl)
+  && !cgraph_node::get_create (fndecl)->offloadable)
+   {
+ omp_context *octx;
+ if (cgraph_node::get (current_function_decl)->offloadable)
+   remove = true;
+ for (octx = ctx; octx && !remove; octx = octx->outer)
+   if (is_targetreg_ctx (octx))
+ remove = true;
+ if (remove)
+   error_at (gimple_location (stmt), "function called from "
+ "target region, but not marked as 'declare target'");
+   }
}
 }
   if (remove)
diff --git a/gcc/testsuite/gcc.dg/gomp/target-3.c 
b/gcc/testsuite/gcc.dg/gomp/target-3.c
new file mode 100644
index 000..7473d08
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/gomp/target-3.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+
+int
+bar ()
+{
+  return 1;
+}
+
+int
+foo ()
+{
+  int x = 0;
+  #pragma omp target
+x = bar ();/* { dg-error "function called from 
target region, but not marked as 'declare target'" } */
+  return x;
+}
+
+#pragma omp declare target
+int
+baz ()
+{
+  return bar ();   /* { dg-error "function called from target 
region, but not marked as 'declare target'" } */
+}
+#pragma omp end declare target


  -- Ilya


[patch] libstdc++/60966 fix packaged_task also

2015-01-09 Thread Jonathan Wakely

The race conditions fixed in PR 60966 can also happen when using
std::packaged_task (on the release branches only, the fix applied to
the trunk covers all cases).

The problem is that the mutex protecting the result in the shared
state is unlocked before the _M_cond.notify_all() call. This leaves a
small window where threads waiting on the future can see the result
(and so return from a waiting function) before the notify_all(). If
the waiting thread destroys the future *and* the
packaged_task as soon as the waiting function returns, then _M_cond
will be destroyed before the notify_all() call on it, resulting in
undefined behaviour (typically this means blocking forever in the
pthread_cond_broadcast() call).

This patch uses the same approach as done for std::promise on the
release branches: increasing the ref-count on the shared state until
the function setting the result has completed. This ensures that the
shared state (and its condition_variable member) will not be destroyed 
until after the _M_cond.notify_all() call.
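
A minimal sketch of the idea, assuming a toy shared state rather than the
real libstdc++ one. toy_state and toy_task are hypothetical names; only the
"copy the shared_ptr before running" step corresponds to the patch:

```cpp
#include <condition_variable>
#include <memory>
#include <mutex>

// Hypothetical stand-in for the library's shared state; not libstdc++ code.
struct toy_state
{
  std::mutex mtx;
  std::condition_variable cond;
  bool ready = false;
  int value = 0;

  void set (int v)
  {
    {
      std::lock_guard<std::mutex> lock (mtx);
      value = v;
      ready = true;
    }
    // The mutex is already released here, so a waiter may have seen
    // 'ready' and dropped its reference before this call runs.
    cond.notify_all ();
  }
};

// Hypothetical stand-in for packaged_task.
struct toy_task
{
  std::shared_ptr<toy_state> state = std::make_shared<toy_state> ();

  void run (int v)
  {
    // Copying the shared_ptr holds an extra reference for the duration of
    // the call, so the state outlives notify_all() even if every other
    // owner releases it in the meantime.
    auto keep_alive = state;
    keep_alive->set (v);
  }
};
```

The extra reference costs one atomic increment and decrement per call, which
is why it is a reasonable fix for the release branches.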


Thanks to Barry Revzin for finding the problem and Michael Karcher for
debugging it.

The original fixes for 60966 solved the problem for std::promise, this
solves it for std::packaged_task. The only other types of asynchronous
result providers are those created by std::async, but for those types
the waiting functions already block in _M_complete_async() so cannot
return before the notify_all() call happens.

Tested x86_64-linux, committed to the 4.9 and 4.8 branches.

(No new testcase, because the hang only shows up occasionally after
thousands of iterations, or when using valgrind.)

commit 053f27a4ffbb7cadd780c0b28507aaff98a38824
Author: Jonathan Wakely 
Date:   Fri Jan 9 13:35:42 2015 +

	PR libstdc++/60966
	* include/std/future (packaged_task::operator()): Increment the
	reference count on the shared state until the function returns.

diff --git a/libstdc++-v3/include/std/future b/libstdc++-v3/include/std/future
index d446b9d..6523cea 100644
--- a/libstdc++-v3/include/std/future
+++ b/libstdc++-v3/include/std/future
@@ -1450,7 +1450,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   operator()(_ArgTypes... __args)
   {
 	__future_base::_State_base::_S_check(_M_state);
-	_M_state->_M_run(std::forward<_ArgTypes>(__args)...);
+	auto __state = _M_state;
+	__state->_M_run(std::forward<_ArgTypes>(__args)...);
   }
 
   void


Re: [PATCH] Fix undefined label problem after crossjumping (PR rtl-optimization/64536)

2015-01-09 Thread Jakub Jelinek
On Fri, Jan 09, 2015 at 03:10:16PM +0100, Richard Biener wrote:
> Well, you have until the end of next week ;)  For GIMPLE this is
> a switch with all cases going to the same basic-block, right?
> I think we optimize that in cleanup_control_expr_graph via the
> single_succ_p case?

No, it is a switch with cases that all look like:
  _1 = a; // load
  _2 = _1 + 1;
  a = _2; // store
So, either tree-ssa-tail-merge could be taught about loads/stores,
or some other pass would need to be able to hoist the loads before the switch
and sink the store after the switch, because every switch case does that.
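
At the source level the difference looks roughly like the sketch below. The
function and variable names are hypothetical, and the second function only
shows the shape a successful merge of the identical case bodies would be
equivalent to, not what any existing pass produces:

```cpp
// Hypothetical globals standing in for 'a' and 'b' from the testcase.
long counter_a, selector_b;

// Shape from the testcase: every case repeats the same load, add and store.
void
bump_per_case ()
{
  switch (selector_b)
    {
    case 0:
    case 2:
      counter_a++;
      break;
    case 3:
      counter_a++;
      break;
    case 1:
      counter_a++;
    }
}

// Equivalent shape once the identical case bodies share one block.
void
bump_merged ()
{
  switch (selector_b)
    {
    case 0:
    case 1:
    case 2:
    case 3:
      counter_a++;
      break;
    }
}
```

Values of selector_b outside 0..3 leave counter_a untouched in both versions,
so no store is introduced on the default path.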

Jakub


Re: Patch RFA: Support for building Go tools

2015-01-09 Thread Ian Lance Taylor
On Fri, Jan 9, 2015 at 12:54 AM, Paolo Bonzini  wrote:
>
>> +
>> +# For a non-native build we have to build the programs using a
>> +# previously built host (or build -> host) Go compiler.  We should
>> +# only do this if such a compiler is available.  Figure this out
>> +# later.
>> +
>> +endif
>
> This should work automatically, the only difference is that you must
> omit $(LIBGODEP) from the dependencies.

What will happen if there is no installed gccgo at the right version?
What should happen?

I haven't been keeping track--will the build machinery now build a
build-x-host Go compiler?

Ian


Re: [PATCH][ARM] FreeBSD arm support, EABI, v2

2015-01-09 Thread Andreas Tobler

On 09.01.15 10:27, Richard Earnshaw wrote:

On 08/01/15 20:28, Andreas Tobler wrote:

On 08.01.15 17:22, Richard Earnshaw wrote:

On 27/11/14 20:56, Andreas Tobler wrote:

Hi all,

this is the second attempt.

I reworked the issues Richard mentioned in the previous review.
I also found one issue which will break build/bootstrap if I pass
--enable-gnu-indirect-function, also fixed.

One thing which came up is the way we generate code for the
armv6*-*-freebsd* triplet versus the arm-*-freebsd* triplet.

I think what causes confusion is the fact that we have only two fixed
triplets with which we build a complete OS. This means the whole OS is built
with the same optimization, not only the kernel or one binary.

For the armv6* we want to benefit from the CPU's functionality by
default. We build all __ARM_ARCH >= 6 with TARGET_CPU_arm1176jzs,
on the other hand all __ARM_ARCH <= 5 will be built with TARGET_CPU_arm9.

Now who becomes arm-*-freebsd* and who becomes armv6*-*-freebsd*?

As explained above, we only know two triplets, so __ARM_ARCH >= 6 becomes
armv6*-*-freebsd* and __ARM_ARCH <= 5 becomes arm-*-freebsd*.

armv8 is not yet in the portfolio and it will become something
different, either arm64 or aarch64, I do not know.

I'd like to keep this since our system compilers, clang and gcc-4.2.1
behave the same.
If we have to change here, we would confuse people quite a lot.

The whole thing is FreeBSD specific and does not touch others.

As usual, bootstrapped, cross compiled, tested.

Ok for trunk?

TIA,
Andreas

toplevel:

* configure.ac: Don't add ${libgcj} for arm*-*-freebsd*.
* configure: Regenerate.
gcc:
* config.gcc (arm*-*-freebsd*): New configuration.
* config/arm/freebsd.h: New file.
* config.host: Add extra components for arm*-*-freebsd*.
* config/arm/arm.h: Introduce MAX_SYNC_LIBFUNC_SIZE.
* config/arm/arm.c (arm_init_libfuncs): Use MAX_SYNC_LIBFUNC_SIZE.

libgcc:

* config.host (arm*-*-freebsd*): Add new configuration for
arm*-*-freebsd*.
* config/arm/freebsd-atomic.c: New file.
* config/arm/t-freebsd: Likewise.
* config/arm/unwind-arm.h: Add __FreeBSD__ to the list of
'PC-relative indirect' OS's.

libatomic:

* configure.tgt: Exclude arm*-*-freebsd* from try_ifunc.

libstdc++-v3:

* configure.host: Add arm*-*-freebsd* port_specific_symbol_files.








Sorry for the delay in responding, I've been OoO quite a bit over the
last month or so and not had much time for patch reviewing.


Thank you for the review. Also being very busy, I know what you're
talking about.


This is OK to go in.  Most of the patch is now in OS specific files and
thus not really relevant to the CPU maintenance.

One item I do think you may need to think about is the definition of
SUBTARGET_EXTRA_LINK_SPEC.  This picks the linker emulation setting.
But it's a build-time choice and seems to ignore a run-time -mbig-endian
or -mlittle-endian choice on the command line.  Is this selection even
necessary?  I don't remember ever having to do it on other OS ports -
the linker just picks the correct emulation based on the first object
file in the link list.


I don't think this is necessary. Certainly not yet. Might become a
future issue.

I also got your second mail.

If you're OK with that, I'll commit the first hunk and send the unwind
update as a separate patch again for review?



Yes, that's fine.



Committed. (r219388) Thanks.

Unfortunately I committed too much, but I reverted the two files (r219391)
which are not reviewed yet.


Sorry for the troubles.

Andreas




Re: [PATCH 3/3] RTEMS: Add e6500 multilibs for PowerPC

2015-01-09 Thread Sebastian Huber

Checked in slightly modified as

https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=219387
https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=219391

--
Sebastian Huber, embedded brains GmbH

Address : Dornierstr. 4, D-82178 Puchheim, Germany
Phone   : +49 89 189 47 41-16
Fax : +49 89 189 47 41-09
E-Mail  : sebastian.hu...@embedded-brains.de
PGP : Public key available on request.

This message is not a business communication within the meaning of the EHUG.



Re: [PATCH 2/3] RTEMS: Fix MPC8540 multilibs for PowerPC

2015-01-09 Thread Sebastian Huber

Checked in as

https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=219385
https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=219390




Re: [PATCH 1/3] RTEMS: Use MULTILIB_REQUIRED for PowerPC

2015-01-09 Thread Sebastian Huber

Checked in as

https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=219384
https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=219389




Re: [PATCH] Fix undefined label problem after crossjumping (PR rtl-optimization/64536)

2015-01-09 Thread Richard Biener
On Fri, 9 Jan 2015, Jakub Jelinek wrote:

> On Fri, Jan 09, 2015 at 11:59:44AM +0100, Richard Biener wrote:
> > > If you want, I can try instead of disabling it for tablejumps
> > > just move the label.
> > 
> > Yeah, I'd prefer that - it can't be too difficult, no?
> 
> So like this (tested just on the testcase, fully bootstrap/regtest
> will follow)?

Yeah.

> > > Still, I think we should be able to optimize it somewhere else too 
> > > (we can remove the tablejumps not just if all jump_table_data 
> > > entries point to next_bb, but even when they point to some 
> > > completely different bb, as long as it is a single_succ_p).  And 
> > > ideally also optimize it at GIMPLE, but guess that is GCC 6 
> > > material.
> > 
> > cfgcleanup material, similar for GIMPLE I guess.
> 
> You mean that cfgcleanup changes are GCC 6 material too?

Well, you have until the end of next week ;)  For GIMPLE this is
a switch with all cases going to the same basic-block, right?
I think we optimize that in cleanup_control_expr_graph via the
single_succ_p case?

Thanks,
Richard.

> 2015-01-09  Jakub Jelinek  
> 
>   PR rtl-optimization/64536
>   * cfgrtl.c (rtl_tidy_fallthru_edge): Handle removal of degenerate
>   tablejumps.
> 
>   * gcc.dg/pr64536.c: New test.
> 
> --- gcc/cfgrtl.c.jj   2015-01-08 18:10:23.616598916 +0100
> +++ gcc/cfgrtl.c  2015-01-09 14:47:26.637855477 +0100
> @@ -1791,6 +1791,24 @@ rtl_tidy_fallthru_edge (edge e)
>&& (any_uncondjump_p (q)
> || single_succ_p (b)))
>  {
> +  rtx label;
> +  rtx_jump_table_data *table;
> +
> +  if (tablejump_p (q, &label, &table))
> + {
> +   /* The label is likely mentioned in some instruction before
> +  the tablejump and might not be DCEd, so turn it into
> +  a note instead and move before the tablejump that is going to
> +  be deleted.  */
> +   const char *name = LABEL_NAME (label);
> +   PUT_CODE (label, NOTE);
> +   NOTE_KIND (label) = NOTE_INSN_DELETED_LABEL;
> +   NOTE_DELETED_LABEL_NAME (label) = name;
> +   rtx_insn *lab = safe_as_a <rtx_insn *> (label);
> +   reorder_insns (lab, lab, PREV_INSN (q));
> +   delete_insn (table);
> + }
> +
>  #ifdef HAVE_cc0
>/* If this was a conditional jump, we need to also delete
>the insn that set cc0.  */
> --- gcc/testsuite/gcc.dg/pr64536.c.jj 2015-01-09 13:55:53.035267213 +0100
> +++ gcc/testsuite/gcc.dg/pr64536.c2015-01-09 13:55:53.035267213 +0100
> @@ -0,0 +1,67 @@
> +/* PR rtl-optimization/64536 */
> +/* { dg-do link } */
> +/* { dg-options "-O2" } */
> +/* { dg-additional-options "-fPIC" { target fpic } } */
> +
> +struct S { long q; } *h;
> +long a, b, g, j, k, *c, *d, *e, *f, *i;
> +long *baz (void)
> +{
> +  asm volatile ("" : : : "memory");
> +  return e;
> +}
> +
> +void
> +bar (int x)
> +{
> +  int y;
> +  for (y = 0; y < x; y++)
> +{
> +  switch (b)
> + {
> + case 0:
> + case 2:
> +   a++;
> +   break;
> + case 3:
> +   a++;
> +   break;
> + case 1:
> +   a++;
> + }
> +  if (d)
> + {
> +   f = baz ();
> +   g = k++;
> +   if (&h->q)
> + {
> +   j = *f;
> +   h->q = *f;
> + }
> +   else
> + i = (long *) (h->q = *f);
> +   *c++ = (long) f;
> +   e += 6;
> + }
> +  else
> + {
> +   f = baz ();
> +   g = k++;
> +   if (&h->q)
> + {
> +   j = *f;
> +   h->q = *f;
> + }
> +   else
> + i = (long *) (h->q = *f);
> +   *c++ = (long) f;
> +   e += 6;
> + }
> +}
> +}
> +
> +int
> +main ()
> +{
> +  return 0;
> +}
> 
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild,
Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)


Re: [PING][PATCH][1-3] New configure options that make the compiler use -fPIE and -pie as default option

2015-01-09 Thread Daniel Micay
On 09/01/15 07:58 AM, Richard Biener wrote:
> 
> Looking at the actual implementation I wonder why it's not similar
> to how darwin gets at it default (not sure how it does).  Also
> looking at how DRIVER_SELF_SPECS is used I wonder if the
> functionality can be enabled with a simple
> 
> --with-specs="%{pie|fpic|fPIC|fpie|fPIE|fno-pic|fno-PIC|fno-pie|fno-PIE|shared|static|nostdlib|nodefaultlibs|nostartfiles:;:-fPIE
> -pie}"
> 
> at configure time (using CONFIGURE_SPECS).
> 
> I have no idea if the above is really the proper spec to use - why
> do you include static, nostdlib, nodefaultlibs and nostartfiles
> for example?  Similar, if I say

PIE isn't supported for static executables by binutils, etc. so it
does need to exclude that. The checks for nostdlib, nodefaultlibs
and nostartfiles do seem unnecessary. I think distributions include
those in the existing wrapper scripts and GCC patches because it
avoids the need for patching build systems for kernel / freestanding
code to include -fno-pie, but it's more correct to leave these out.
 
>  gcc -pie -c t.c
> 
> we will end up with a non-PIE object, and linking with -fPIE will
> end up with a DYN_EXEC object.
> 
> I believe you want to treat link and compile arguments separately
> (and adjust the link spec for linking).  I also would have said that
> elfos.h is more appropriate than gnu-user.h, but ...

Handling it separately is what the existing wrapper scripts for this do:

-fno-PIC|-fno-pic|-fno-PIE|-fno-pie|-static|--static|-shared|--shared)
  force_fPIE=0
  force_pie=0
  ;;
-fPIC|-fpic|-fPIE|-fpie)
  force_fPIE=0
  ;;
-c|-E|-S)
  force_pie=0
  ;;

I think it's appropriate for it to 
 
> That said, the patch looks more like a hack (and see above how
> to achieve the same without a patch(?)), not like a proper implementation
> of a PIE default.

I don't think it can be considered a hack if it's handling all of the cases
correctly, so it might need some changes from the current implementation but
that doesn't make it a dead end. Is it actually done in a significantly
different way for OS X?

If it can be done by passing --with-specs to configure then that could be a
viable alternative for distributions that do not want to add GCC patches or
use wrapper scripts (Arch Linux) but I'm not sure that it will fly.





Re: [PATCH] Fix undefined label problem after crossjumping (PR rtl-optimization/64536)

2015-01-09 Thread Jakub Jelinek
On Fri, Jan 09, 2015 at 11:59:44AM +0100, Richard Biener wrote:
> > If you want, I can try instead of disabling it for tablejumps
> > just move the label.
> 
> Yeah, I'd prefer that - it can't be too difficult, no?

So like this (tested just on the testcase, fully bootstrap/regtest
will follow)?

> > Still, I think we should be able to optimize it somewhere else too
> > (we can remove the tablejumps not just if all jump_table_data entries
> > point to next_bb, but even when they point to some completely different bb,
> > as long as it is a single_succ_p).  And ideally also optimize it at GIMPLE,
> > but guess that is GCC 6 material.
> 
> cfgcleanup material, similar for GIMPLE I guess.

You mean that cfgcleanup changes are GCC 6 material too?

2015-01-09  Jakub Jelinek  

PR rtl-optimization/64536
* cfgrtl.c (rtl_tidy_fallthru_edge): Handle removal of degenerate
tablejumps.

* gcc.dg/pr64536.c: New test.

--- gcc/cfgrtl.c.jj 2015-01-08 18:10:23.616598916 +0100
+++ gcc/cfgrtl.c2015-01-09 14:47:26.637855477 +0100
@@ -1791,6 +1791,24 @@ rtl_tidy_fallthru_edge (edge e)
   && (any_uncondjump_p (q)
  || single_succ_p (b)))
 {
+  rtx label;
+  rtx_jump_table_data *table;
+
+  if (tablejump_p (q, &label, &table))
+   {
+ /* The label is likely mentioned in some instruction before
+the tablejump and might not be DCEd, so turn it into
+a note instead and move before the tablejump that is going to
+be deleted.  */
+ const char *name = LABEL_NAME (label);
+ PUT_CODE (label, NOTE);
+ NOTE_KIND (label) = NOTE_INSN_DELETED_LABEL;
+ NOTE_DELETED_LABEL_NAME (label) = name;
+ rtx_insn *lab = safe_as_a <rtx_insn *> (label);
+ reorder_insns (lab, lab, PREV_INSN (q));
+ delete_insn (table);
+   }
+
 #ifdef HAVE_cc0
   /* If this was a conditional jump, we need to also delete
 the insn that set cc0.  */
--- gcc/testsuite/gcc.dg/pr64536.c.jj   2015-01-09 13:55:53.035267213 +0100
+++ gcc/testsuite/gcc.dg/pr64536.c  2015-01-09 13:55:53.035267213 +0100
@@ -0,0 +1,67 @@
+/* PR rtl-optimization/64536 */
+/* { dg-do link } */
+/* { dg-options "-O2" } */
+/* { dg-additional-options "-fPIC" { target fpic } } */
+
+struct S { long q; } *h;
+long a, b, g, j, k, *c, *d, *e, *f, *i;
+long *baz (void)
+{
+  asm volatile ("" : : : "memory");
+  return e;
+}
+
+void
+bar (int x)
+{
+  int y;
+  for (y = 0; y < x; y++)
+{
+  switch (b)
+   {
+   case 0:
+   case 2:
+ a++;
+ break;
+   case 3:
+ a++;
+ break;
+   case 1:
+ a++;
+   }
+  if (d)
+   {
+ f = baz ();
+ g = k++;
+ if (&h->q)
+   {
+ j = *f;
+ h->q = *f;
+   }
+ else
+   i = (long *) (h->q = *f);
+ *c++ = (long) f;
+ e += 6;
+   }
+  else
+   {
+ f = baz ();
+ g = k++;
+ if (&h->q)
+   {
+ j = *f;
+ h->q = *f;
+   }
+ else
+   i = (long *) (h->q = *f);
+ *c++ = (long) f;
+ e += 6;
+   }
+}
+}
+
+int
+main ()
+{
+  return 0;
+}


Jakub


Re: [PATCH 2/2] RTEMS: Use MULTILIB_REQUIRED for ARM

2015-01-09 Thread Sebastian Huber

Checked in as

https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=219383




Re: [PATCH 1/2] RTEMS: Rename ARM target config files

2015-01-09 Thread Sebastian Huber

Checked in as

https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=219382




[Patch/combine] PR64304 wrong bitmask passed to force_to_mode in combine_simplify_rtx

2015-01-09 Thread Jiong Wang

as reported at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64304

given the following test:

unsigned char byte = 0;

void set_bit(unsigned int bit, unsigned char value) {
unsigned char mask = (unsigned char)(1 << (bit & 7));
if (!value) {
byte &= (unsigned char)~mask;
} else {
byte |= mask;
}
}

we should generate something like:

  set_bit:
and w0, w0, 7
mov w2, 1
lsl w2, w2, w0

while we are generating
mov w2, 1
lsl w2, w2, w0


the necessary "and w0, w0, 7" is wrongly deleted.

that is because
 
  (insn 2 5 3 2 (set (reg/v:SI 82 [ bit ])

(reg:SI 0 x0 [ bit ])) bug.c:3 38 {*movsi_aarch64}
 (expr_list:REG_DEAD (reg:SI 0 x0 [ bit ])
(nil)))
  (insn 7 4 8 2 (set (reg:SI 84 [ D.1482 ])
(and:SI (reg/v:SI 82 [ bit ])
(const_int 7 [0x7]))) bug.c:4 399 {andsi3}
 (expr_list:REG_DEAD (reg/v:SI 82 [ bit ])
(nil)))
  (insn 9 8 10 2 (set (reg:SI 85 [ D.1482 ])
(ashift:SI (reg:SI 86)
(subreg:QI (reg:SI 84 [ D.1482 ]) 0))) bug.c:4 539 
{*aarch64_ashl_sisd_or_int_si3}
 (expr_list:REG_DEAD (reg:SI 86)
(expr_list:REG_DEAD (reg:SI 84 [ D.1482 ])
(expr_list:REG_EQUAL (ashift:SI (const_int 1 [0x1])
(subreg:QI (reg:SI 84 [ D.1482 ]) 0))
(nil)))))

are wrongly combined into

  (insn 9 8 10 2 (set (reg:QI 85 [ D.1482 ])
(ashift:QI (subreg:QI (reg:SI 86) 0)
(reg:QI 0 x0 [ bit ]))) bug.c:4 556 {*ashlqi3_insn}
 (expr_list:REG_DEAD (reg:SI 0 x0 [ bit ])
(expr_list:REG_DEAD (reg:SI 86)
(nil))))

thus, the generated assembly lacks the necessary "and w0, w0, 7".

the root cause is that at one place in the combine pass we are passing the
wrong bitmask to force_to_mode.

in this particular case, for QI mode, we should pass ((1 << 8) - 1), while we
are passing ((1 << 3) - 1), so the combiner thinks we only need the lower 3
bits and that X & 7 is unnecessary.  For QI mode we want the lower 8 bits, so
we should remove the exact_log2 call.
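
To make the two masks concrete, a minimal sketch, assuming a mode width below
64 bits. log2_exact, mask_before_patch and mask_after_patch are hypothetical
helpers that only mirror the arithmetic in the expression being changed; they
are not GCC functions:

```cpp
// Hypothetical helper: log2 of a power of two, like exact_log2 would give.
int
log2_exact (unsigned bits)
{
  int n = 0;
  while ((1u << n) < bits)
    ++n;
  return n;
}

// Mask as currently computed: 1 << log2 (bitsize), minus 1.
// Assumes mode_bits is a power of two below 64.
unsigned long long
mask_before_patch (unsigned mode_bits)
{
  return (1ULL << log2_exact (mode_bits)) - 1;
}

// Mask as proposed: 1 << bitsize, minus 1.
// Assumes mode_bits is below 64, otherwise the shift is undefined.
unsigned long long
mask_after_patch (unsigned mode_bits)
{
  return (1ULL << mode_bits) - 1;
}
```

For QI mode (8 bits) the first form gives 0x7 and the second gives 0xff.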

this looks like a historical bug in the combine pass.  It is only triggered on
targets where SHIFT_COUNT_TRUNCATED is true, and it has stayed hidden for a
long time mostly because x86/arm do not reach this part of the code.

bootstrap on x86 and gcc check OK.
bootstrap on aarch64 and bare-metal regression OK.
ok for trunk?

gcc/
  PR64304
  * combine.c (combine_simplify_rtx): Correct the bitmask passed to 
force_to_mode.
  
gcc/testsuite/

  PR64304
  * gcc.target/aarch64/pr64304.c: New testcase.
diff --git a/gcc/combine.c b/gcc/combine.c
index 1808f97..31a7fd0 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -5922,7 +5922,7 @@ combine_simplify_rtx (rtx x, machine_mode op0_mode, int in_dest,
 	SUBST (XEXP (x, 1),
 	   force_to_mode (XEXP (x, 1), GET_MODE (XEXP (x, 1)),
 			  ((unsigned HOST_WIDE_INT) 1
-			   << exact_log2 (GET_MODE_BITSIZE (GET_MODE (x))))
+			   << GET_MODE_BITSIZE (GET_MODE (x)))
 			  - 1,
 			  0));
   break;
diff --git a/gcc/testsuite/gcc.target/aarch64/pr64304.c b/gcc/testsuite/gcc.target/aarch64/pr64304.c
new file mode 100644
index 000..5423bb3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr64304.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 --save-temps" } */
+
+unsigned char byte = 0;
+
+void
+set_bit (unsigned int bit, unsigned char value)
+{
+  unsigned char mask = (unsigned char) (1 << (bit & 7));
+
+  if (! value)
+byte &= (unsigned char)~mask;
+  else
+byte |= mask;
+  /* { dg-final { scan-assembler "and\tw\[0-9\]+, w\[0-9\]+, 7" } } */
+}
+
+/* { dg-final { cleanup-saved-temps } } */

Re: [PATCH/expand] PR64011 Adjust bitsize when partial overflow happen for big-endian

2015-01-09 Thread Jiong Wang


On 09/01/15 05:46, Jeff Law wrote:

On 12/30/14 03:21, Jiong Wang wrote:

PR64011 is actually a general problem on all targets that support bit
insertion instructions.

we do an overflow check at the start of store_bit_field_1, but that only
checks the situation where the field lies completely outside the register.
There are also situations where the field lies partly in the register, and we
need to adjust bitsize for this partial overflow.  Without this fix,
pr48335-2.c on big-endian is broken on those archs that support bit insert
instructions, like arm and aarch64.

the testcase is just pr48335-2.c; before this patch it will ICE on arm and
generate invalid assembly on AArch64.  after this patch, the problem goes away.

ok for trunk?

bootstrap OK on x86-64 && aarch64.
no regression on x86-64

thanks.

gcc/
 PR64011
 * expmed.c (store_bit_field_using_insv): Adjust bitsize when there
is partial overflow.

Why adjust here the size of the stored field?  Doesn't this end up
storing less information?

If those bits are still within the object, even if the object is by some
means living in a mixture of registers and memory, then don't we need to
set them all?

If those bits are outside the object, then isn't the source simply
broken because it's writing data outside the bounds of the given object?

Am I  missing something here?


Jeff,

  thanks for review and the questions.

the bug testcase is
===

typedef short U __attribute__((may_alias, aligned (1)));

struct S
{
  _Complex float d __attribute__((aligned (8)));
};

void bar(struct S);

void f5 (int x)
{
  struct S s;
  ((U *)((char *) &s.d + 1))[3] = x;
  bar (s);
}

and the final tree is:
==
;; Function f5 (f5, funcdef_no=0, decl_uid=2608, cgraph_uid=0, symbol_order=0)
  
f5 (int x)

{
  struct S s;
  short int _2;
  
  :

  _2 = (short int) x_1(D);
  MEM[(U * {ref-all})&s + 7B] = _2; <-- A
  bar (s);
  s ={v} {CLOBBER};
  return;
  
}


during expand_used_vars, gcc decides that "s" can reside in a register pair
(regLow, regHigh) instead of on the stack.

thus later, MEM[(U * {ref-all})&s + 7B] is expanded into:

  regHigh + 3B which means regHigh[31:24],

so "MEM[(U * {ref-all})&s + 7B] = _2;" is expanded into

  regHigh[31:24] = _2

then store_bit_field_1 gets a to_rtx of regHigh, a bitsize of 16, and a bitnum
of 24, so the 8 bits outside of regHigh are safe to ignore.

I think if store_bit_field_using_insv is called with op0 being a REG_P, and
bitsize + bitnum overflows the register size, then it is caused by a memory
object having been optimized into a register object.

As the outer code decided it's OK to fit the memory object into a register,
all the bits beyond the register size should be safe to ignore.

the following code in store_bit_field_using_insv doesn't consider the above
MEM->REG situation; it always assumes bitnum + bitsize is within unit, which
is wrong.
 
  if (BITS_BIG_ENDIAN != BYTES_BIG_ENDIAN)
    bitnum = unit - bitsize - bitnum;
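
To make the numbers concrete, here is a minimal sketch of the adjustment,
assuming a 32-bit register. The helper names clamp_bitsize and flip_bitnum
are made up for illustration; they are not functions in expmed.c, and the
real patch may be shaped differently.

```c
#include <assert.h>

/* Hypothetical helper: shrink BITSIZE so that a field starting at BITNUM
   ends no later than the register boundary.  UNIT is the register width
   in bits.  */
unsigned clamp_bitsize (unsigned unit, unsigned bitnum, unsigned bitsize)
{
  /* A field lying completely outside the register is assumed to have
     been rejected earlier.  */
  assert (bitnum < unit);
  if (bitnum + bitsize > unit)
    bitsize = unit - bitnum;
  return bitsize;
}

/* Hypothetical helper: the bit position after the endianness flip.
   Only meaningful once the field fits inside the register.  */
unsigned flip_bitnum (unsigned unit, unsigned bitnum, unsigned bitsize)
{
  assert (bitnum + bitsize <= unit);
  return unit - bitsize - bitnum;
}
```

With unit = 32, bitnum = 24 and bitsize = 16, the unclamped flip would
compute 32 - 16 - 24, which is negative. After clamping bitsize to 8 the
flip yields bit position 0.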

My patch does have a problem, though: I should restrict the check to
REG_P/SUBREG_P (op0) only.
So is the patch OK with one extra check?

  if (REG_P (xop0) || (SUBREG_P (xop0) && REG_P (SUBREG_REG (xop0))))

thanks

Regards,
Jiong



jeff

Re: [PATCH] Enable experimental TSAN support for Ada

2015-01-09 Thread Richard Biener
On Fri, Jan 9, 2015 at 1:44 PM, Bernd Edlinger
 wrote:
>
>
> On Fri, 9 Jan 2015 13:30:45, Jakub Jelinek wrote:
>>
>> On Fri, Jan 09, 2015 at 01:12:09PM +0100, Bernd Edlinger wrote:
>>>>> should be equivalent to
>>>>>
>>>>> if (DECL_P (base) && !may_be_aliased (base))
>>>>> return false;
>>>>>
>>>>> is that right?
>>>>
>>>> Yes, well, not exactly, but I wonder if its worth doing the extra check
>>>> if you only check decl accesses anyway and not indirect accesses.
>>>>
>>>
>>>
>>> I think Jakub, you wrote that initially, any comments on that?
>>
>> I think it was still from Dmitry's code. If you can make it work by taking
>> address of base and offsetting it, it works for me. Just note that I think
>> base doesn't have to be always addressable, so you probably should punt if
>> it is not rather than assert it is. If something is not addressable, then
>> it can't be accessed by multiple threads.
>>
>
> Thanks.
>
> FYI: the VIEW_CONVERT_EXPR did not fail in the
> gcc_checking_assert (is_gimple_addressable (base))
> but much later, somewhere in tree-cfg.c it dropped out.

How did it fail there?  It doesn't look like &VIEW_CONVERT_EXPR
is forbidden.

> Maybe that assert does not check exactly what is
> needed for a valid argument of ADDR_EXPR ?

Indeed, it's a gimplifier predicate not to be used elsewhere.
tree-ssa-loop-ivopts.c:may_be_nonaddressable_p is more
sophisticated.

> I mean I can somehow fold an ADDR_EXPR of a bit field member,
> This won't crash at all, but I can't fold ADDR_EXPR(VIEW_CONVERT_EXPR).

well, that's probably because it will return (ptr *)&xxx, thus not
a valid gimple result.  You need to funnel that through force_gimple_operand.

That said, please try removing all the code and using &base + offset
+ bitoffset /  BITS_PER_UNIT as I said.
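
As a rough illustration of that address arithmetic in plain C (not GIMPLE),
here is a minimal sketch. The function name access_address is made up, and
CHAR_BIT stands in for BITS_PER_UNIT; building the equivalent tree inside
the compiler would of course look different.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Byte address of an access described by a base object, a byte OFFSET
   and a BITOFFSET: the base address plus the byte offset plus the whole
   bytes contained in the bit offset.  */
const char *access_address (const void *base, size_t offset, size_t bitoffset)
{
  assert (base != NULL);
  return (const char *) base + offset + bitoffset / CHAR_BIT;
}
```

Any leftover bits (bitoffset % CHAR_BIT) are dropped, so the result is the
address of the byte that contains the first accessed bit.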

Richard.

>
> Bernd.
>
>


Re: [PING][PATCH][1-3] New configure options that make the compiler use -fPIE and -pie as default option

2015-01-09 Thread Richard Biener
On Tue, Dec 30, 2014 at 10:23 PM, Magnus Granberg  wrote:
> On Friday 14 November 2014 23.31.48, Magnus Granberg wrote:
>> On Monday 10 November 2014 21.26.39, Magnus Granberg wrote:
>> > >   Rainer
>> >
>> > Thanks Rainer for the nits and comments.
>> > Have updated the patches and Changelogs.
>> > But I still use PIE_DRIVER_SELF_SPECS; do you have an idea where to move
>> > it so I don't need to duplicate that stuff, or how to do it?
>> >
>> > Magnus G
>> >
>> > 2014-11-10  Magnus Granberg  
>> >
>> > /gcc
>> > * config/gnu-user.h (PIE_DRIVER_SELF_SPECS) and
>> > (GNU_DRIVER_SELF_SPECS): Define.
>> > * config/i386/gnu-user-common.h (DRIVER_SELF_SPECS): Define
>> > * configure.ac: Add new option.
>> > * configure, config.in: Rebuild.
>> > * Makefile.in (ALL_CFLAGS) and (ALL_CXXFLAGS): Disable PIE.
>> > * doc/install.texi: New configure option.
>> > * doc/invoke.texi: Add note to PIE.
>> > * doc/sourcebuild.texi: New effective target.
>> > gcc/testsuite
>> > * gcc/default-pie.c: New test
>> > * gcc.dg/tree-ssa/ssa-store-ccp-3.c: Skip if default_pie
>> > * g++.dg/other/anon5.C: Skip if default_pie
>> > * lib/target-supports.exp (check_effective_target_default_pie):
>> > New proc.
>> > /libgcc
>> > * Makefile.in (CRTSTUFF_CFLAGS): Disable PIE.
>>
>> Can this be included for GCC 5 ?
>>
>> /Magnus G.
> One more ping on this. The patches were sent before stage 1 closed but I
> didn't get any feedback on them.
> Have updated the patches for the gcc 5.0 20141228 snapshot.
> Bootstrapped and tested on x86_64-unknown-linux-gnu (Gentoo)
> /Magnus

Looking at the actual implementation I wonder why it's not similar
to how darwin gets at its default (not sure how it does).  Also
looking at how DRIVER_SELF_SPECS is used I wonder if the
functionality can be enabled with a simple

--with-specs="%{pie|fpic|fPIC|fpie|fPIE|fno-pic|fno-PIC|fno-pie|fno-PIE|shared|static|nostdlib|nodefaultlibs|nostartfiles:;:-fPIE
-pie}"

at configure time (using CONFIGURE_SPECS).

I have no idea if the above is really the proper spec to use - why
do you include static, nostdlib, nodefaultlibs and nostartfiles
for example?  Similar, if I say

 gcc -pie -c t.c

we will end up with a non-PIE object, and linking with -fPIE will
end up with a DYN_EXEC object.

I believe you want to treat link and compile arguments separately
(and adjust the link spec for linking).  I also would have said that
elfos.h is more appropriate than gnu-user.h, but ...

That said, the patch looks more like a hack (and see above how
to achieve the same without a patch(?)), not like a proper implementation
of a PIE default.

Joseph may have an idea where the proper place for a spec-wise
default PIE is.

Thanks,
Richard.

> 2014-12-30  Magnus Granberg  
>
> /gcc
> * config/gnu-user.h (PIE_DRIVER_SELF_SPECS): Define.
> * config/i386/gnu-user-common.h (DRIVER_SELF_SPECS): Define and
> add PIE_DRIVER_SELF_SPECS.
> * configure.ac: Add new option.
> * configure, config.in: Rebuild.
> * Makefile.in (ALL_CFLAGS) and (ALL_CXXFLAGS): Disable PIE.
> * doc/install.texi: New configure option.
> * doc/invoke.texi: Add note to PIE.
> * doc/sourcebuild.texi: New effective target.
> gcc/testsuite
> * gcc/default-pie.c: New test
> * gcc.dg/tree-ssa/ssa-store-ccp-3.c: Skip if default_pie
> * g++.dg/other/anon5.C: Skip if default_pie
> * lib/target-supports.exp (check_effective_target_default_pie):
> New proc.
> /libgcc
> * Makefile.in (CRTSTUFF_CFLAGS): Disable PIE.
>


RE: [PATCH] Enable experimental TSAN support for Ada

2015-01-09 Thread Bernd Edlinger


On Fri, 9 Jan 2015 13:30:45, Jakub Jelinek wrote:
>
> On Fri, Jan 09, 2015 at 01:12:09PM +0100, Bernd Edlinger wrote:
>>>> should be equivalent to
>>>>
>>>> if (DECL_P (base) && !may_be_aliased (base))
>>>> return false;
>>>>
>>>> is that right?
>>>
>>> Yes, well, not exactly, but I wonder if its worth doing the extra check
>>> if you only check decl accesses anyway and not indirect accesses.
>>>
>>
>>
>> I think Jakub, you wrote that initially, any comments on that?
>
> I think it was still from Dmitry's code. If you can make it work by taking
> address of base and offsetting it, it works for me. Just note that I think
> base doesn't have to be always addressable, so you probably should punt if
> it is not rather than assert it is. If something is not addressable, then
> it can't be accessed by multiple threads.
>

Thanks.

FYI: the VIEW_CONVERT_EXPR did not fail in the 
gcc_checking_assert (is_gimple_addressable (base))
but much later, somewhere in tree-cfg.c it dropped out.

Maybe that assert does not check exactly what is
needed for a valid argument of ADDR_EXPR ?

I mean I can somehow fold an ADDR_EXPR of a bit field member,
This won't crash at all, but I can't fold ADDR_EXPR(VIEW_CONVERT_EXPR).


Bernd.

  

Re: [PATCH] Enable experimental TSAN support for Ada

2015-01-09 Thread Jakub Jelinek
On Fri, Jan 09, 2015 at 01:12:09PM +0100, Bernd Edlinger wrote:
> >> should be equivalent to
> >>
> >> if (DECL_P (base) && !may_be_aliased (base))
> >> return false;
> >>
> >> is that right?
> >
> > Yes, well, not exactly, but I wonder if its worth doing the extra check
> > if you only check decl accesses anyway and not indirect accesses.
> >
> 
> 
> I think Jakub, you wrote that initially, any comments on that?

I think it was still from Dmitry's code.  If you can make it work by taking
address of base and offsetting it, it works for me.  Just note that I think
base doesn't have to be always addressable, so you probably should punt if
it is not rather than assert it is.  If something is not addressable, then
it can't be accessed by multiple threads.

Jakub


Re: [PATCH 2/3] Extended if-conversion

2015-01-09 Thread Richard Biener
On Mon, Dec 22, 2014 at 3:39 PM, Yuri Rumyantsev  wrote:
> Richard,
>
> I changed algorithm for bool pattern repair.
> It turned out that ifcvt_local_dce phase is required since for
> test-case I sent you in previous mail vectorization is not performed
> without dead code elimination:
>
> For the loop
> #pragma omp simd safelen(8)
>   for (i=0; i<512; i++)
>   {
> float t = a[i];
> if (t > 0.0f & t < 1.0e+17f)
>   if (c[i] != 0)
> res += 1;
>   }
>
> I've got the following message from vectorizer:
>
> t3.c:10:11: note: ==> examining statement: _ifc__39 = t_5 > 0.0;
>
> t3.c:10:11: note: bit-precision arithmetic not supported.
> t3.c:10:11: note: not vectorized: relevant stmt not supported:
> _ifc__39 = t_5 > 0.0;
>
> It is caused by the following dead predicate computations after
> critical edge splitting:
>
> (after combine blocks):
>
> :
> # res_15 = PHI 
> # i_16 = PHI 
> # ivtmp_14 = PHI 
> t_5 = a[i_16];
> _6 = t_5 > 0.0;
> _7 = t_5 < 9.998430674944e+16;
> _8 = _6 & _7;
> _10 = &c[i_16];
> _ifc__36 = _8 ? 4294967295 : 0;
> _9 = MASK_LOAD (_10, 0B, _ifc__36);
> _28 = _8;
> _29 = _9 != 0;
> _30 = _28 & _29;
> // Statements below are dead!!
> _31 = _8;
> _32 = _9 != 0;
> _33 = ~_32;
> _34 = _31 & _33;
> // End of dead statements.
> _ifc__35 = _30 ? 1 : 0;
> res_1 = res_15 + _ifc__35;
> i_11 = i_16 + 1;
> ivtmp_13 = ivtmp_14 - 1;
> if (ivtmp_13 != 0)
>   goto ;
> else
>   goto ;
>
> But if we delete these statements loop will be vectorized.

Hm, ok.  We insert predicates too early obviously and not only when
needed.  But let's fix that later.

> New patch is attached.

 fold_build_cond_expr (tree type, tree cond, tree rhs, tree lhs)
 {
   tree rhs1, lhs1, cond_expr;
+
+  /* If COND is comparison r != 0 and r has boolean type, convert COND
+ to SSA_NAME to accept by vect bool pattern.  */
+  if (TREE_CODE (cond) == NE_EXPR)
+{
+  tree op0 = TREE_OPERAND (cond, 0);
+  tree op1 = TREE_OPERAND (cond, 1);
+  if (TREE_CODE (op0) == SSA_NAME
+ && TREE_CODE (TREE_TYPE (op0)) == BOOLEAN_TYPE
+ && (integer_zerop (op1)))
+   cond = op0;
+  else if (TREE_CODE (op1) == SSA_NAME
+  && TREE_CODE (TREE_TYPE (op1)) == BOOLEAN_TYPE
+  && (integer_zerop (op0)))
+   cond = op1;

The 2nd form, 0 != SSA_NAME doesn't happen due to operand
canonicalization.  Please remove its handling.

+  if (gimple_phi_num_args (phi) != 2)
+   {
+ if (!aggressive_if_conv)

&& !aggressive_if_conv

+  if (EDGE_COUNT (bb->preds) > 2)
+{
+  if (!aggressive_if_conv)

Likewise.

-  gimple reduc;
+ && (rhs = gimple_phi_arg_def (phi, 0 {

the { goes to the next line

 static void
 predicate_mem_writes (loop_p loop)
 {
-  unsigned int i, orig_loop_num_nodes = loop->num_nodes;
+  unsigned int i, j, orig_loop_num_nodes = loop->num_nodes;
+  tree mask_vec[10];

an upper limit of 10?

+  for (j=0; j<10; j++)

spaces around '<' and '='

+   mask_vec[j] = NULL_TREE;
+

+   gcc_assert (exact_log2 (bitsize) != -1);
+   if ((mask = mask_vec[exact_log2 (bitsize)]) == NULL_TREE)
+ {

this seems to be a completely separate "optimization"?  Note that
there are targets with non-power-of-two bitsize modes (PSImode),
so the assert will likely trigger.  I would prefer if you separate this
part of the patch.
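
To spell out the concern, a minimal sketch of the exact_log2 contract
(log2 for powers of two, -1 otherwise) and of a bounds-checked lookup into
a 10-entry table. The names exact_log2_sketch and mask_slot are made up for
illustration, and this is not GCC's implementation.

```c
#include <assert.h>

#define MASK_SLOTS 10

/* Return log2 of X when X is a power of two, -1 otherwise.  */
int exact_log2_sketch (unsigned long x)
{
  int n = 0;
  if (x == 0 || (x & (x - 1)) != 0)
    return -1;
  while ((x >>= 1) != 0)
    n++;
  return n;
}

/* Return the table slot for BITSIZE, or -1 when the table cannot be
   used: either BITSIZE is not a power of two or it is too large for a
   MASK_SLOTS-entry table.  */
int mask_slot (unsigned long bitsize)
{
  int slot;
  assert (bitsize != 0);  /* a zero-width access would be a caller bug */
  slot = exact_log2_sketch (bitsize);
  if (slot < 0 || slot >= MASK_SLOTS)
    return -1;
  return slot;
}
```

A non-power-of-two size such as 24 gives -1, which would trip the assert
in the patch, and a 10-entry table only covers sizes up to 512 bits.
Returning "no slot" instead of asserting lets the caller fall back to
building the mask without the cache.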

+  if ( gimple_code (stmt) != GIMPLE_ASSIGN)
+   continue;

no space before gimple_code

+  imm_use_iterator imm_iter;
+
+
+  worklist.create (64);

excessive vertical space.

The patch misses the addition of new testcases - please add some,
otherwise the code will be totally untested.

I assume the patch passes bootstrap and regtest (you didn't say so).
Can you also do a bootstrap with aggressive_if_conv forced to
true and --with-build-config=bootstrap-O3 --disable-werror?

Thanks,
Richard.

> Thanks.
> Yuri.
>
> 2014-12-19 14:45 GMT+03:00 Richard Biener :
>> On Thu, Dec 18, 2014 at 2:45 PM, Yuri Rumyantsev  wrote:
>>> Richard,
>>>
>>> I am sending you full patch (~1000 lines) but if you need only patch.1
>>> and patch.2 will let me know and i'll send you reduced patch.
>>>
>>> Below are few comments regarding your remarks for patch.3.
>>>
>>> 1. I deleted sub-phase ifcvt_local_dce since I did not find test-case
>>> when dead code elimination is required to vectorize loop, i.e. dead
>>> statement is marked as relevant.
>>> 2. You wrote:
>>>> The "retry" code also looks odd - why do you walk the BB multiple
>>>> times instead of just doing sth like
>>>>
>>>>  while (!has_single_use (lhs))
>>>>    {
>>>>      gimple copy = ifcvt_split_def_stmt (def_stmt);
>>>>      ifcvt_walk_pattern_tree (copy);
>>>>    }
>>>>
>>>> thus returning the copy you create and re-process it (the copy should
>>>> now have a single-use).
>>>
>>> The problem is that not only top SSA_NAME (lhs) may have multiple uses
>>> but some intermediate variables too. For example, for the following
>>> test-case
>>>
>>> float a[1000]

RE: [RFC, PATCH][LRA, MIPS] ICE: in decompose_normal_address, at rtlanal.c:5817

2015-01-09 Thread Matthew Fortune
Robert Suchanek  writes:

> gcc/
>   * simplify-rtx.c (simplify_replace_fn_rtx): Simplify (lo_sum (high x)
>   (const (plus x offset))) to (const (plus x offset)).

The fix appears valid to me. Just some comments on the test case.

> a/gcc/testsuite/gcc.target/mips/20150108.c
> b/gcc/testsuite/gcc.target/mips/20150108.c
> new file mode 100644
> index 000..f18dbe7
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/mips/20150108.c
> @@ -0,0 +1,25 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mips32r2" } */

Please remove this line as there is nothing ISA dependent in the test case.

> +
> +long long a[10];
> +long long b, c, d, k, m, n, o, p, q, r, s, t, u, v, w; int e, f, g, h,
> +i, j, l, x;
> +

nit, no return type specified.

> +NOMIPS16 fn1() {

Nit, newline for the brace.

> +  for (; x; x++)
> +if (x & 1)
> +  s = h | g;
> +else
> +  s = f | e;
> +  l = ~0;
> +  m = 1 | k;
> +  n = i;
> +  o = j;
> +  p = f | e;
> +  q = h | g;
> +  w = d | c | a[1];
> +  t = c;
> +  v = b | c;
> +  u = v;
> +  r = b | a[4];
> +}
> --
> 1.7.9.5

Thanks,
Matthew


RE: [PATCH] Enable experimental TSAN support for Ada

2015-01-09 Thread Bernd Edlinger
On Fri, 9 Jan 2015 12:04:26, Richard Biener wrote:
>
>>>
>>> There may be multiple VIEW_CONVERT_EXPRs in a reference chain
>>> so simply stripping the outermost only doesn't work (the assert).
>>>
>>
>> Hmm, that did not happen in any of the Ada tests in ada/acats nor in gnat.dg,
>> but with Ada anything may be possible...
>>
>> Would you like it better if I do it this way:
>>
>> align = get_object_alignment (expr);
>> if (align < BITS_PER_UNIT)
>> return false;
>>
>> do
>> {
>> expr = TREE_OPERAND (expr, 0);
>> } while (TREE_CODE (expr) == VIEW_CONVERT_EXPR);
>
> No, I mean the tree might look like
>
> COMPONENT_REF 
> thus VIEW_CONVERT_EXPR doesn't have to be outermost but it
> can appear anywhere in the reference chain.
>

Nice...

Could some one give me a test case for that?


>> should be equivalent to
>>
>> if (DECL_P (base) && !may_be_aliased (base))
>> return false;
>>
>> is that right?
>
> Yes, well, not exactly, but I wonder if its worth doing the extra check
> if you only check decl accesses anyway and not indirect accesses.
>


I think Jakub, you wrote that initially, any comments on that?



Thanks
Bernd.

  

Re: LTO streaming of TARGET_OPTIMIZE_NODE

2015-01-09 Thread Richard Biener
On Fri, 9 Jan 2015, Jakub Jelinek wrote:

> On Fri, Jan 09, 2015 at 12:07:26PM +0100, Thomas Schwinge wrote:
> > On Thu, 8 Jan 2015 15:11:49 +0100, Jakub Jelinek  wrote:
> > > On Thu, Nov 20, 2014 at 01:27:08PM +0100, Bernd Schmidt wrote:
> > > > On 11/13/2014 05:06 AM, Jan Hubicka wrote:
> > > > >this patch adds infrastructure for proper streaming and merging of
> > > > >TREE_TARGET_OPTION.
> > > > 
> > > > This breaks the offloading path via LTO since it introduces an
> > > > incompatibility in LTO format between host and offload machine.
> > > > 
> > > > A very quick patch to fix it is below - the OpenACC testcase I was using
> > > > seems to be working again with this. Thoughts, suggestions?
> > > 
> > > I actually think
> > 
> > Thanks for picking up this issue!
> 
> Richard said on IRC he doesn't like the string comparisons, so here is
> untested modification of the patch.  If it looks good, I'll test it today:

Looks good to me.

Richard.

> 2015-01-09  Bernd Schmidt  
>   Jakub Jelinek  
> 
>   PR middle-end/64412
>   * lto-streamer.h (lto_stream_offload_p): New declaration.
>   * lto-streamer.c (lto_stream_offload_p): New variable.
>   * cgraphunit.c (ipa_passes): Set lto_stream_offload_p
>   at the same time as section_name_prefix.
>   * lto-streamer-out.c (hash_tree): Don't hash TREE_TARGET_OPTION
>   if lto_stream_offload_p.
>   * tree-streamer-out.c (streamer_pack_tree_bitfields): Don't
>   stream TREE_TARGET_OPTION if lto_stream_offload_p.
>   (write_ts_function_decl_tree_pointers): Don't
>   stream DECL_FUNCTION_SPECIFIC_TARGET if lto_stream_offload_p.
>   * tree-streamer-in.c (unpack_value_fields): Don't stream
>   TREE_TARGET_OPTION in if ACCEL_COMPILER.
>   (lto_input_ts_function_decl_tree_pointers): Don't stream
>   DECL_FUNCTION_SPECIFIC_TARGET in if ACCEL_COMPILER.
>   * lto-opts.c (lto_write_options): Use lto_stream_offload_p
>   instead of section_name_prefix string comparisons.
> lto/
>   * lto.c (read_cgraph_and_symbols): Set lto_stream_offload_p
>   if ACCEL_COMPILER.
> 
> --- gcc/lto-streamer.h.jj 2015-01-05 13:07:13.0 +0100
> +++ gcc/lto-streamer.h2015-01-09 12:18:26.199842482 +0100
> @@ -744,6 +744,10 @@ extern void lto_append_block (struct lto
>  
>  
>  /* In lto-streamer.c.  */
> +
> +/* Set when streaming LTO for offloading compiler.  */
> +extern bool lto_stream_offload_p;
> +
>  extern const char *lto_tag_name (enum LTO_tags);
>  extern bitmap lto_bitmap_alloc (void);
>  extern void lto_bitmap_free (bitmap);
> --- gcc/lto-streamer.c.jj 2015-01-05 13:07:13.0 +0100
> +++ gcc/lto-streamer.c2015-01-09 12:16:04.909269917 +0100
> @@ -61,6 +61,8 @@ static bitmap_obstack lto_obstack;
>  static bool lto_obstack_initialized;
>  
>  const char *section_name_prefix = LTO_SECTION_NAME_PREFIX;
> +/* Set when streaming LTO for offloading compiler.  */
> +bool lto_stream_offload_p;
>  
>  /* Return a string representing LTO tag TAG.  */
>  
> --- gcc/cgraphunit.c.jj   2015-01-09 12:01:33.0 +0100
> +++ gcc/cgraphunit.c  2015-01-09 12:22:27.742692667 +0100
> @@ -2108,11 +2108,14 @@ ipa_passes (void)
>if (g->have_offload)
>   {
> section_name_prefix = OFFLOAD_SECTION_NAME_PREFIX;
> +   lto_stream_offload_p = true;
> ipa_write_summaries (true);
> +   lto_stream_offload_p = false;
>   }
>if (flag_lto)
>   {
> section_name_prefix = LTO_SECTION_NAME_PREFIX;
> +   lto_stream_offload_p = false;
> ipa_write_summaries (false);
>   }
>  }
> --- gcc/lto-streamer-out.c.jj 2015-01-08 18:10:23.633598629 +0100
> +++ gcc/lto-streamer-out.c2015-01-09 12:14:41.017711211 +0100
> @@ -944,7 +944,9 @@ hash_tree (struct streamer_tree_cache_d
>  hstate.add (TRANSLATION_UNIT_LANGUAGE (t),
>   strlen (TRANSLATION_UNIT_LANGUAGE (t)));
>  
> -  if (CODE_CONTAINS_STRUCT (code, TS_TARGET_OPTION))
> +  if (CODE_CONTAINS_STRUCT (code, TS_TARGET_OPTION)
> +  /* We don't stream these when passing things to a different target.  */
> +  && !lto_stream_offload_p)
>  hstate.add_wide_int (cl_target_option_hash (TREE_TARGET_OPTION (t)));
>  
>if (CODE_CONTAINS_STRUCT (code, TS_OPTIMIZATION))
> --- gcc/tree-streamer-out.c.jj2015-01-08 18:10:23.631598663 +0100
> +++ gcc/tree-streamer-out.c   2015-01-09 12:14:41.018711194 +0100
> @@ -472,7 +472,9 @@ streamer_pack_tree_bitfields (struct out
>if (CODE_CONTAINS_STRUCT (code, TS_CONSTRUCTOR))
>  bp_pack_var_len_unsigned (bp, CONSTRUCTOR_NELTS (expr));
>  
> -  if (CODE_CONTAINS_STRUCT (code, TS_TARGET_OPTION))
> +  if (CODE_CONTAINS_STRUCT (code, TS_TARGET_OPTION)
> +  /* Don't stream these when passing things to a different target.  */
> +  && !lto_stream_offload_p)
>  cl_target_option_stream_out (ob, bp, TREE_TARGET_OPTION (expr));
>  
>if (code == OMP_CLAUSE)
> @@ -687,7 +689,9 @@ write_ts_functi

Re: LTO streaming of TARGET_OPTIMIZE_NODE

2015-01-09 Thread Jakub Jelinek
On Fri, Jan 09, 2015 at 12:07:26PM +0100, Thomas Schwinge wrote:
> On Thu, 8 Jan 2015 15:11:49 +0100, Jakub Jelinek  wrote:
> > On Thu, Nov 20, 2014 at 01:27:08PM +0100, Bernd Schmidt wrote:
> > > On 11/13/2014 05:06 AM, Jan Hubicka wrote:
> > > >this patch adds infrastructure for proper streaming and merging of
> > > >TREE_TARGET_OPTION.
> > > 
> > > This breaks the offloading path via LTO since it introduces an
> > > incompatibility in LTO format between host and offload machine.
> > > 
> > > A very quick patch to fix it is below - the OpenACC testcase I was using
> > > seems to be working again with this. Thoughts, suggestions?
> > 
> > I actually think
> 
> Thanks for picking up this issue!

Richard said on IRC he doesn't like the string comparisons, so here is
untested modification of the patch.  If it looks good, I'll test it today:

2015-01-09  Bernd Schmidt  
Jakub Jelinek  

PR middle-end/64412
* lto-streamer.h (lto_stream_offload_p): New declaration.
* lto-streamer.c (lto_stream_offload_p): New variable.
* cgraphunit.c (ipa_passes): Set lto_stream_offload_p
at the same time as section_name_prefix.
* lto-streamer-out.c (hash_tree): Don't hash TREE_TARGET_OPTION
if lto_stream_offload_p.
* tree-streamer-out.c (streamer_pack_tree_bitfields): Don't
stream TREE_TARGET_OPTION if lto_stream_offload_p.
(write_ts_function_decl_tree_pointers): Don't
stream DECL_FUNCTION_SPECIFIC_TARGET if lto_stream_offload_p.
* tree-streamer-in.c (unpack_value_fields): Don't stream
TREE_TARGET_OPTION in if ACCEL_COMPILER.
(lto_input_ts_function_decl_tree_pointers): Don't stream
DECL_FUNCTION_SPECIFIC_TARGET in if ACCEL_COMPILER.
* lto-opts.c (lto_write_options): Use lto_stream_offload_p
instead of section_name_prefix string comparisons.
lto/
* lto.c (read_cgraph_and_symbols): Set lto_stream_offload_p
if ACCEL_COMPILER.

--- gcc/lto-streamer.h.jj   2015-01-05 13:07:13.0 +0100
+++ gcc/lto-streamer.h  2015-01-09 12:18:26.199842482 +0100
@@ -744,6 +744,10 @@ extern void lto_append_block (struct lto
 
 
 /* In lto-streamer.c.  */
+
+/* Set when streaming LTO for offloading compiler.  */
+extern bool lto_stream_offload_p;
+
 extern const char *lto_tag_name (enum LTO_tags);
 extern bitmap lto_bitmap_alloc (void);
 extern void lto_bitmap_free (bitmap);
--- gcc/lto-streamer.c.jj   2015-01-05 13:07:13.0 +0100
+++ gcc/lto-streamer.c  2015-01-09 12:16:04.909269917 +0100
@@ -61,6 +61,8 @@ static bitmap_obstack lto_obstack;
 static bool lto_obstack_initialized;
 
 const char *section_name_prefix = LTO_SECTION_NAME_PREFIX;
+/* Set when streaming LTO for offloading compiler.  */
+bool lto_stream_offload_p;
 
 /* Return a string representing LTO tag TAG.  */
 
--- gcc/cgraphunit.c.jj 2015-01-09 12:01:33.0 +0100
+++ gcc/cgraphunit.c2015-01-09 12:22:27.742692667 +0100
@@ -2108,11 +2108,14 @@ ipa_passes (void)
   if (g->have_offload)
{
  section_name_prefix = OFFLOAD_SECTION_NAME_PREFIX;
+ lto_stream_offload_p = true;
  ipa_write_summaries (true);
+ lto_stream_offload_p = false;
}
   if (flag_lto)
{
  section_name_prefix = LTO_SECTION_NAME_PREFIX;
+ lto_stream_offload_p = false;
  ipa_write_summaries (false);
}
 }
--- gcc/lto-streamer-out.c.jj   2015-01-08 18:10:23.633598629 +0100
+++ gcc/lto-streamer-out.c  2015-01-09 12:14:41.017711211 +0100
@@ -944,7 +944,9 @@ hash_tree (struct streamer_tree_cache_d
 hstate.add (TRANSLATION_UNIT_LANGUAGE (t),
strlen (TRANSLATION_UNIT_LANGUAGE (t)));
 
-  if (CODE_CONTAINS_STRUCT (code, TS_TARGET_OPTION))
+  if (CODE_CONTAINS_STRUCT (code, TS_TARGET_OPTION)
+  /* We don't stream these when passing things to a different target.  */
+  && !lto_stream_offload_p)
 hstate.add_wide_int (cl_target_option_hash (TREE_TARGET_OPTION (t)));
 
   if (CODE_CONTAINS_STRUCT (code, TS_OPTIMIZATION))
--- gcc/tree-streamer-out.c.jj  2015-01-08 18:10:23.631598663 +0100
+++ gcc/tree-streamer-out.c 2015-01-09 12:14:41.018711194 +0100
@@ -472,7 +472,9 @@ streamer_pack_tree_bitfields (struct out
   if (CODE_CONTAINS_STRUCT (code, TS_CONSTRUCTOR))
 bp_pack_var_len_unsigned (bp, CONSTRUCTOR_NELTS (expr));
 
-  if (CODE_CONTAINS_STRUCT (code, TS_TARGET_OPTION))
+  if (CODE_CONTAINS_STRUCT (code, TS_TARGET_OPTION)
+  /* Don't stream these when passing things to a different target.  */
+  && !lto_stream_offload_p)
 cl_target_option_stream_out (ob, bp, TREE_TARGET_OPTION (expr));
 
   if (code == OMP_CLAUSE)
@@ -687,7 +689,9 @@ write_ts_function_decl_tree_pointers (st
   stream_write_tree (ob, DECL_VINDEX (expr), ref_p);
   /* DECL_STRUCT_FUNCTION is handled by lto_output_function.  */
   stream_write_tree (ob, DECL_FUNCTION_PERSONALITY (expr), ref_p);
-  stream_write_tree (ob, D

Re: [PATCH][AArch64] Implement vsqrt_f64 intrinsic

2015-01-09 Thread Kyrill Tkachov


On 17/12/14 00:04, Joseph Myers wrote:

On Mon, 15 Dec 2014, James Greenhalgh wrote:


@@ -22792,6 +22792,12 @@ vsqrtq_f32 (float32x4_t a)
return __builtin_aarch64_sqrtv4sf (a);
  }
  
+__extension__ static __inline float64x1_t __attribute__ ((__always_inline__))

+vsqrt_f64 (float64x1_t a)
+{
+  return (float64x1_t) { __builtin_sqrt (a[0]) };
+}

Hi Kyrill,

Does this introduce an implicit need to link against a maths library if
we want arm_neon.h to work correctly? If so, I think we need to take a
different approach.

At O0 I've started to see:

   " undefined reference to `sqrt' "

When checking a large arm_neon.h testcase.

It does seem strange that the mid-end would convert a __builtin_sqrt back
to a library call at O0 when the target has an optab for it, so perhaps
there is a bug there to go hunt?

__builtin_sqrt has the same semantics as the sqrt library function.  This
includes setting errno for negative arguments (other than -0 and -NaN).
The semantics also include that it's always OK to expand to a call to that
library function (generally, __builtin_foo may always expand to a call to
foo, if there is such a library function).


So my first attempt at this patch had created a target builtin 
(__builtin_aarch64_sqrtdf) and used that. Eventually though I went for 
the shorter __builtin_sqrt because I thought we could benefit from the 
tree-level information about the semantics rather than the RTL-level 
expansion that the target-specific builtin would provide.
But if there's a risk for it to expand to a library function call, I 
guess it's better to go with the target builtin. I'll prepare a patch.


Thanks for the explanations,
Kyrill

Re: [PATCH][ARM][cleanup] Use R0_REGNUM and R1_REGNUM instead of 0 and 1 where appropriate

2015-01-09 Thread Kyrill Tkachov

Ping.
https://gcc.gnu.org/ml/gcc-patches/2014-12/msg00989.html

Thanks,
Kyrill

On 11/12/14 09:34, Kyrill Tkachov wrote:

Hi all,

While looking in this area on other business I noticed we could be using
the names R0_REGNUM and R1_REGNUM when creating those REG rtxs, since that
is a bit more descriptive than just 0 and 1.

Tested arm-none-eabi.

Ok for trunk?

Thanks,
Kyrill

2014-12-11  Kyrylo Tkachov  kyrylo.tkac...@arm.com

  * config/arm/arm.c (arm_load_tp): Use R0_REGNUM instead of constant 0
  in gen_rtx_REG.
  (arm_tls_descseq_addr): Likewise.
  (arm_gen_movmemqi): Likewise.
  (arm_expand_epilogue_apcs_frame): Likewise.
  (arm_expand_epilogue): Likewise.
  (arm_expand_prologue): Likewise.  Use R1_REGNUM instead of constant 1
  in gen_rtx_REG.





Re: [PATCH][ARM] Make issue rate part of per-core tuning structs

2015-01-09 Thread Kyrill Tkachov

Ping.

Thanks,
Kyrill
On 12/12/14 13:57, Kyrill Tkachov wrote:

Ping (after the macro fusion patch)...
https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02706.html

Thanks,
Kyrill
On 20/11/14 16:48, Kyrill Tkachov wrote:

I should say that the patch context depends on the macro fusion hook
implementation posted here:
https://gcc.gnu.org/ml/gcc-patches/2014-11/msg00958.html

Kyrill

On 20/11/14 16:43, Kyrill Tkachov wrote:

Hi all,

This patch makes the arm_issue_rate function look up the issue rate of
the processor from the tuning structs.
This makes it look more like the aarch64 mechanism and centralises a
processor-specific construct in the tuning structs, so we no longer have
to remember to update the arm_issue_rate function every time a new core
is added.

A new tuning struct is added for the marvell-pj4 in order to decouple it
from the 9e tuning struct and enable us to set its correct issue rate to 2.
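
The shape of the change can be sketched as below. This is a cut-down
illustration only: the struct and variable names carry a _sketch suffix
because they are hypothetical, the real struct tune_params has many more
fields, and the issue rate of 1 for the 9e entry is a placeholder. Only the
marvell-pj4 rate of 2 comes from this mail.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced tuning record.  */
struct tune_params_sketch
{
  const char *name;
  int issue_rate;   /* instructions issued per cycle */
};

/* Placeholder value for the 9e tuning.  */
const struct tune_params_sketch arm_9e_tune_sketch = { "9e", 1 };

/* Separate record for marvell-pj4 so it can carry its own rate.  */
const struct tune_params_sketch arm_marvell_pj4_tune_sketch
  = { "marvell-pj4", 2 };

/* Stand-in for the pointer to the tuning of the selected core.  */
const struct tune_params_sketch *current_tune_sketch
  = &arm_marvell_pj4_tune_sketch;

/* The issue-rate hook becomes a field read instead of a switch over
   every known core.  */
int issue_rate_sketch (void)
{
  assert (current_tune_sketch != NULL);
  return current_tune_sketch->issue_rate;
}
```

Adding a core then means filling in one more record, with no second place
to update.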

Bootstrapped and tested on arm-none-gnueabihf.

Ok for trunk?

Thanks,
Kyrill

2014-11-19  Kyrylo Tkachov  

* config/arm/arm-protos.h (struct tune_params): Add issue_rate field.
* config/arm/arm.c (arm_slowmul_tune, arm_fastmul_tune,
arm_strongarm_tune, arm_xscale_tune, arm_9e_tune, arm_v6t2_tune,
arm_cortex_tune, arm_cortex_a8_tune, arm_cortex_a7_tune,
arm_cortex_a15_tune, arm_cortex_a53_tune, arm_cortex_a57_tune,
arm_cortex_a9_tune, arm_cortex_a12_tune, arm_v7m_tune, arm_v6m_tune,
arm_fa726te_tune, arm_cortex_a5_tune): Specify issue_rate value.
(arm_issue_rate): Look up issue rate from tuning structs. Remove
large switch statement.
(arm_marvell_pj4_tune): New struct.
* config/arm/arm-cores.def (marvell-pj4): Use arm_marvell_pj4_tune
struct.

Re: [PATCH][AArch64] Fix PR 64263: Do not try to split constants when destination is SIMD reg

2015-01-09 Thread Kyrill Tkachov

Ping.

https://gcc.gnu.org/ml/gcc-patches/2014-12/msg01116.html

Thanks,
Kyrill

On 12/12/14 15:33, Kyrill Tkachov wrote:

Hi all,

Since the movsi_aarch64 and movdi_aarch64 patterns became splitters we
want to make sure that the splitting happens only when we deal with GP
registers.

This patch guards the splitting part by GP_REGNUM_P rather than trying
to complicate aarch64_expand_mov_immediate too much to try and handle
the SIMD registers case.

A testcase is added.
Bootstrap on aarch64-none-linux-gnu and testing on aarch64-none-elf were
successful.

Ok for trunk?

Thanks,
Kyrill

2014-12-11  Kyrylo Tkachov  
  Ramana Radhakrishnan 

  PR target/64263
  * config/aarch64/aarch64.md (*movsi_aarch64): Don't split if the
  destination is not a GP reg.
  (*movdi_aarch64): Likewise.

2014-12-11  Kyrylo Tkachov  

  PR target/64263
  * gcc.target/aarch64/pr64263_1.c: New test.





RE: [RFC, PATCH][LRA, MIPS] ICE: in decompose_normal_address, at rtlanal.c:5817

2015-01-09 Thread Robert Suchanek
Hi Steven/Vladimir,

> It's hard to say what the correct fix should be, but it sounds like
> the address you get after the substitutions should be simplified
> (folded).

Coming back to the original testcase and re-analyzing the problem, it appears
that there is, indeed, a missing case for simplification of LO_SUM/HIGH pair.
The patch attached resolves the issue.
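
For readers who want to see the rule in isolation, here is a minimal
sketch on a toy expression tree. The node type and the functions
node_equal and simplify_lo_sum are hypothetical stand-ins written for this
illustration; they are not GCC's rtx API, and node_equal only plays the
role that rtx_equal_p has in the real code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy expression codes, a tiny stand-in for RTL codes.  */
enum code { SYMBOL, CONST_INT, PLUS, CONST, HIGH, LO_SUM };

struct node
{
  enum code code;
  const char *name;        /* used by SYMBOL */
  long value;              /* used by CONST_INT */
  const struct node *op0;  /* first operand, if any */
  const struct node *op1;  /* second operand, if any */
};

/* Structural equality of two toy expressions.  */
int node_equal (const struct node *a, const struct node *b)
{
  if (a == b)
    return 1;
  if (!a || !b || a->code != b->code)
    return 0;
  switch (a->code)
    {
    case SYMBOL:
      return strcmp (a->name, b->name) == 0;
    case CONST_INT:
      return a->value == b->value;
    default:
      return node_equal (a->op0, b->op0) && node_equal (a->op1, b->op1);
    }
}

/* Return the simplified form of (lo_sum OP0 OP1), or NULL if neither
   rule applies.  */
const struct node *simplify_lo_sum (const struct node *op0,
                                    const struct node *op1)
{
  assert (op0 != NULL && op1 != NULL);
  if (op0->code != HIGH)
    return NULL;
  /* (lo_sum (high x) x) -> x  */
  if (node_equal (op0->op0, op1))
    return op1;
  /* (lo_sum (high x) (const (plus x ofs))) -> (const (plus x ofs))  */
  if (op1->code == CONST
      && op1->op0 != NULL
      && op1->op0->code == PLUS
      && node_equal (op1->op0->op0, op0->op0))
    return op1;
  return NULL;
}
```

The first rule mirrors the check that is already present before the new
hunk; the second is the missing case, and it only fires when the symbol
under HIGH matches the first operand of the PLUS.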

Although the testcase is not reproducible on the trunk, I think it is still
worth including it in case the ICE reoccurs.

The patch has been bootstrapped and regtested on x86_64-unknown-linux-gnu target
and similarly tested against SVN revision r212763 where it can be reproduced.

Regards,
Robert

2015-01-08  Robert Suchanek  

gcc/
* simplify-rtx.c (simplify_replace_fn_rtx): Simplify (lo_sum (high x)
(const (plus x offset))) to (const (plus x offset)).

gcc/testsuite/
* gcc.target/mips/20150108.c: New test.
---
 gcc/simplify-rtx.c   |6 ++
 gcc/testsuite/gcc.target/mips/20150108.c |   25 +
 2 files changed, 31 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/mips/20150108.c

diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
index 04af01e..7621316 100644
--- a/gcc/simplify-rtx.c
+++ b/gcc/simplify-rtx.c
@@ -503,6 +503,12 @@ simplify_replace_fn_rtx (rtx x, const_rtx old_rtx,
  if (GET_CODE (op0) == HIGH && rtx_equal_p (XEXP (op0, 0), op1))
return op1;
 
+ /* (lo_sum (high x) (const (plus x ofs))) -> (const (plus x ofs))  */
+ if (GET_CODE (op0) == HIGH && GET_CODE (op1) == CONST
+ && GET_CODE(XEXP (op1, 0)) == PLUS
+ && rtx_equal_p (XEXP (XEXP (op1, 0), 0), XEXP (op0, 0)))
+   return op1;
+
  if (op0 == XEXP (x, 0) && op1 == XEXP (x, 1))
return x;
  return gen_rtx_LO_SUM (mode, op0, op1);
diff --git a/gcc/testsuite/gcc.target/mips/20150108.c 
b/gcc/testsuite/gcc.target/mips/20150108.c
new file mode 100644
index 000..f18dbe7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/mips/20150108.c
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-options "-mips32r2" } */
+
+long long a[10];
+long long b, c, d, k, m, n, o, p, q, r, s, t, u, v, w;
+int e, f, g, h, i, j, l, x;
+
+NOMIPS16 fn1() {
+  for (; x; x++)
+if (x & 1)
+  s = h | g;
+else
+  s = f | e;
+  l = ~0;
+  m = 1 | k;
+  n = i;
+  o = j;
+  p = f | e;
+  q = h | g;
+  w = d | c | a[1];
+  t = c;
+  v = b | c;
+  u = v;
+  r = b | a[4];
+}
-- 
1.7.9.5


Re: [PATCH][ARM] Implement TARGET_SCHED_MACRO_FUSION_PAIR_P

2015-01-09 Thread Kyrill Tkachov

Ping.

Thanks,
Kyrill

On 18/12/14 15:55, Kyrill Tkachov wrote:

Ping.

Thanks,
Kyrill

On 11/12/14 15:06, Kyrill Tkachov wrote:

Ping.
https://gcc.gnu.org/ml/gcc-patches/2014-12/msg00340.html

Thanks,
Kyrill

On 04/12/14 09:19, Kyrill Tkachov wrote:

On 02/12/14 22:58, Ramana Radhakrishnan wrote:

On Tue, Nov 11, 2014 at 11:55 AM, Kyrill Tkachov  wrote:

Hi all,

This is the arm implementation of the macro fusion hook.
It tries to fuse movw+movt operations together. It also tries to take lo_sum
RTXs into account since those generate movt instructions as well.
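
As a rough illustration of the pairing rule, the sketch below models each instruction as a single SET with a destination register and a coarse source kind. struct toy_set and toy_movw_movt_fusible_p are hypothetical names for this example only; the actual hook works on RTL and is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_src_kind
{
  TOY_SRC_CONST_INT, TOY_SRC_HIGH, TOY_SRC_LO_SUM, TOY_SRC_OTHER
};

/* Toy model of a single SET.  DEST_IS_ZERO_EXTRACT models a write to
   only the upper half of DEST_REGNO, as a movt would do.  */
struct toy_set
{
  int dest_regno;
  bool dest_is_zero_extract;
  enum toy_src_kind src;
};

/* Return true if PREV followed by CURR looks like a movw/movt pair
   writing the same register.  */
static bool
toy_movw_movt_fusible_p (const struct toy_set *prev,
                         const struct toy_set *curr)
{
  assert (prev != NULL && curr != NULL);
  /* The first insn must write a whole register, and both insns must
     target the same one.  */
  if (prev->dest_is_zero_extract || prev->dest_regno != curr->dest_regno)
    return false;
  /* movw rd, #imm ; movt rd, #imm  */
  if (curr->dest_is_zero_extract)
    return prev->src == TOY_SRC_CONST_INT && curr->src == TOY_SRC_CONST_INT;
  /* rd = high (sym) ; rd = lo_sum (rd, sym)  */
  return prev->src == TOY_SRC_HIGH && curr->src == TOY_SRC_LO_SUM;
}
```

Both accepted shapes require the same destination register, which is what lets the scheduler keep the two instructions back to back.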

Bootstrapped and tested on arm-none-linux-gnueabihf.

Ok for trunk?
 if (current_tune->fuseable_ops & ARM_FUSE_MOVW_MOVT)
+{
+  /* We are trying to fuse
+ movw imm / movt imm
+ instructions as a group that gets scheduled together.  */
+

A comment here about the insn structure would be useful.

Done. It's similar to the aarch64 adrp+add case. It does make it easier
to read, thanks.

2014-12-04  Kyrylo Tkachov  kyrylo.tkac...@arm.com

  * config/arm/arm-protos.h (tune_params): Add fuseable_ops field.
  * config/arm/arm.c (arm_macro_fusion_p): New function.
  (arm_macro_fusion_pair_p): Likewise.
  (TARGET_SCHED_MACRO_FUSION_P): Define.
  (TARGET_SCHED_MACRO_FUSION_PAIR_P): Likewise.
  (ARM_FUSE_NOTHING): Likewise.
  (ARM_FUSE_MOVW_MOVT): Likewise.
  (arm_slowmul_tune, arm_fastmul_tune, arm_strongarm_tune,
  arm_xscale_tune, arm_9e_tune, arm_v6t2_tune, arm_cortex_tune,
  arm_cortex_a8_tune, arm_cortex_a7_tune, arm_cortex_a15_tune,
  arm_cortex_a53_tune, arm_cortex_a57_tune, arm_cortex_a9_tune,
  arm_cortex_a12_tune, arm_v7m_tune, arm_v6m_tune, arm_fa726te_tune
  arm_cortex_a5_tune): Specify fuseable_ops value.


+  set_dest = SET_DEST (curr_set);
+  if (GET_CODE (set_dest) == ZERO_EXTRACT)
+{
+  if (CONST_INT_P (SET_SRC (curr_set))
+  && CONST_INT_P (SET_SRC (prev_set))
+  && REG_P (XEXP (set_dest, 0))
+  && REG_P (SET_DEST (prev_set))
+  && REGNO (XEXP (set_dest, 0)) == REGNO (SET_DEST (prev_set)))
+return true;
+}
+  else if (GET_CODE (SET_SRC (curr_set)) == LO_SUM
+   && REG_P (SET_DEST (curr_set))
+   && REG_P (SET_DEST (prev_set))
+   && GET_CODE (SET_SRC (prev_set)) == HIGH
+   && REGNO (SET_DEST (curr_set)) == REGNO (SET_DEST (prev_set)))
+{
+  return true;
+}

Can we add a fast path exit to be

if (GET_MODE (set_dest) != SImode)
  return false;

Done, but if/when we extend the function to handle more fusion cases it
will need to be
refactored, since we will want to just bail out of this MOVW+MOVT case
rather than the whole function.


I did think about whether we wanted to use reg_overlap_mentioned_p, as that
may simplify the logic a bit, but that's overkill here as we still
want to restrict it to the cases above.

Otherwise OK.

Here's the updated patch. I've tested on arm-none-eabi and made sure
that the
fusion still happens on the benchmarks I looked at.
Ok?

Thanks,
Kyrill


Ramana





+}
+  return false;
Thanks,
Kyrill

2014-11-11  Kyrylo Tkachov  

* config/arm/arm-protos.h (tune_params): Add fuseable_ops field.
* config/arm/arm.c (arm_macro_fusion_p): New function.
(arm_macro_fusion_pair_p): Likewise.
(TARGET_SCHED_MACRO_FUSION_P): Define.
(TARGET_SCHED_MACRO_FUSION_PAIR_P): Likewise.
(ARM_FUSE_NOTHING): Likewise.
(ARM_FUSE_MOVW_MOVT): Likewise.
(arm_slowmul_tune, arm_fastmul_tune, arm_strongarm_tune,
arm_xscale_tune, arm_9e_tune, arm_v6t2_tune, arm_cortex_tune,
arm_cortex_a8_tune, arm_cortex_a7_tune, arm_cortex_a15_tune,
arm_cortex_a53_tune, arm_cortex_a57_tune, arm_cortex_a9_tune,
arm_cortex_a12_tune, arm_v7m_tune, arm_v6m_tune, arm_fa726te_tune
arm_cortex_a5_tune): Specify fuseable_ops value.










Re: [patch 1/2][ARM]: New CPU support for Marvell Whitney

2015-01-09 Thread Kyrill Tkachov

Hi Xingxing,

On 19/12/14 11:01, Xingxing Pan wrote:

+/* Return true if vector element size is byte. */
Minor nit: two spaces after full stop and before */ Same in other places 
in the patch.



+bool
+marvell_whitney_vector_element_size_is_byte (rtx insn)
+{
+  if (GET_CODE (PATTERN (insn)) == SET)
+{
+  if ((GET_MODE (SET_DEST (PATTERN (insn))) == V8QImode) ||
+  (GET_MODE (SET_DEST (PATTERN (insn))) == V16QImode))
+   return true;
+}
+
+  return false;
+}


I see this is called from inside marvell-whitney.md. It seems to me that 
this function takes RTX insns. Can the type of this be strengthened to 
rtx_insn * ?
Also, this should be refactored and written a bit more generally by 
checking for VECTOR_MODE_P and then GET_MODE_INNER for QImode, saving 
you the trouble of enumerating the different vector QI modes.
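
The shape of that refactoring could look roughly like the following. This is a toy sketch with a hypothetical mode table (enum toy_mode, toy_mode_table); it only mirrors the idea of asking "is it a vector mode, and is its inner mode QImode" instead of listing V8QImode and V16QImode by hand.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of machine modes, for illustration only.  */
enum toy_mode
{
  TOY_QI, TOY_HI, TOY_SI,
  TOY_V8QI, TOY_V16QI, TOY_V4HI, TOY_V4SI,
  TOY_NUM_MODES
};

struct toy_mode_info
{
  bool is_vector;        /* plays the role of a VECTOR_MODE_P query */
  enum toy_mode inner;   /* plays the role of a GET_MODE_INNER query */
};

static const struct toy_mode_info toy_mode_table[TOY_NUM_MODES] = {
  [TOY_QI]    = { false, TOY_QI },
  [TOY_HI]    = { false, TOY_HI },
  [TOY_SI]    = { false, TOY_SI },
  [TOY_V8QI]  = { true,  TOY_QI },
  [TOY_V16QI] = { true,  TOY_QI },
  [TOY_V4HI]  = { true,  TOY_HI },
  [TOY_V4SI]  = { true,  TOY_SI },
};

/* Return true if MODE is a vector mode whose elements are bytes.  */
static bool
toy_vector_element_size_is_byte (enum toy_mode mode)
{
  assert ((int) mode >= 0 && (int) mode < TOY_NUM_MODES);
  return toy_mode_table[mode].is_vector
         && toy_mode_table[mode].inner == TOY_QI;
}
```

A new byte-element vector mode then only needs a table entry; the predicate itself does not change.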




+
+/* Return true if INSN has shift operation but is not a shift insn. */
+bool
+marvell_whitney_non_shift_with_shift_operand (rtx insn)


Similar comment. Can this be strengthened to rtx_insn * ?

Thanks,
Kyrill

+{
+  rtx pat = PATTERN (insn);
+
+  if (GET_CODE (pat) != SET)
+return false;
+
+  /* Is not a shift insn. */
+  rtx rvalue = SET_SRC (pat);
+  RTX_CODE code = GET_CODE (rvalue);
+  if (code == ASHIFT || code == ASHIFTRT
+  || code == LSHIFTRT || code == ROTATERT)
+return false;
+
+  subrtx_iterator::array_type array;
+  FOR_EACH_SUBRTX (iter, array, rvalue, ALL)
+{
+  /* Has shift operation. */
+  RTX_CODE code = GET_CODE (*iter);
+  if (code == ASHIFT || code == ASHIFTRT
+  || code == LSHIFTRT || code == ROTATERT)
+return true;
+}
+
+  return false;
+}





[PATCH] Fix PR64410, complex vectorization issue

2015-01-09 Thread Richard Biener

This fixes the specific case of complex arithmetic vectorization
in the PR which is caused by loads/stores of complex types which
the vectorizer does not like.

The patch implements two things, first a "late" variant of
gimplify_modify_expr_complex_part in update-address-taken
when we can write the variable into SSA form (which ends up
removing a temporary for the testcase).  Second, splitting
up loads and stores of complex type if the loads all feed
{REAL,IMAG}PART_EXPRs or the store is fed by a single-use
COMPLEX_EXPR.
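
At the source level, the first transformation amounts to turning a store to one part of a complex variable into "load the other part, then rebuild the whole value". The sketch below shows that idea in plain C99; the function names are made up for this example and it assumes finite inputs, since building the value as re + im * I is only exact for those.

```c
#include <assert.h>
#include <complex.h>
#include <math.h>

/* Replace only the real part of Z: read the untouched imaginary part,
   then rebuild the complete value from both components.  */
static double complex
toy_set_real_part (double complex z, double new_real)
{
  double other = cimag (z);
  assert (isfinite (new_real) && isfinite (other));
  return new_real + other * I;
}

/* Replace only the imaginary part of Z, symmetrically.  */
static double complex
toy_set_imag_part (double complex z, double new_imag)
{
  double other = creal (z);
  assert (isfinite (new_imag) && isfinite (other));
  return other + new_imag * I;
}
```

Written this way the variable is only ever assigned as a whole, which is what allows it to be put into SSA form.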

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2015-01-09  Richard Biener  

PR tree-optimization/64410
* tree-ssa.c (non_rewritable_lvalue_p): Allow REALPART/IMAGPART_EXPR
on the LHS.
(execute_update_addresses_taken): Deal with that.
* tree-ssa-forwprop.c (pass_forwprop::execute): Use component-wise
loads/stores for complex variables.

* g++.dg/vect/pr64410.cc: New testcase.

Index: gcc/tree-ssa.c
===
*** gcc/tree-ssa.c.orig 2015-01-08 14:58:47.058032954 +0100
--- gcc/tree-ssa.c  2015-01-08 15:00:46.253028827 +0100
*** non_rewritable_lvalue_p (tree lhs)
*** 1340,1345 
--- 1340,1352 
if (DECL_P (lhs))
  return false;
  
+   /* We can re-write REALPART_EXPR and IMAGPART_EXPR sets in
+  a reasonably efficient manner... */
+   if ((TREE_CODE (lhs) == REALPART_EXPR
+|| TREE_CODE (lhs) == IMAGPART_EXPR)
+   && DECL_P (TREE_OPERAND (lhs, 0)))
+ return false;
+ 
/* A decl that is wrapped inside a MEM-REF that covers
   it full is also rewritable.
   ???  The following could be relaxed allowing component
*** execute_update_addresses_taken (void)
*** 1544,1549 
--- 1551,1585 
tree rhs, *rhsp = gimple_assign_rhs1_ptr (stmt);
tree sym;
  
+   /* Rewrite LHS IMAG/REALPART_EXPR similar to
+  gimplify_modify_expr_complex_part.  */
+   if ((TREE_CODE (lhs) == IMAGPART_EXPR
+|| TREE_CODE (lhs) == REALPART_EXPR)
+   && DECL_P (TREE_OPERAND (lhs, 0))
+   && bitmap_bit_p (suitable_for_renaming,
+DECL_UID (TREE_OPERAND (lhs, 0
+ {
+   tree other = make_ssa_name (TREE_TYPE (lhs));
+   tree lrhs = build1 (TREE_CODE (lhs) == IMAGPART_EXPR
+   ? REALPART_EXPR : IMAGPART_EXPR,
+   TREE_TYPE (other),
+   TREE_OPERAND (lhs, 0));
+   gimple load = gimple_build_assign (other, lrhs);
+   gimple_set_vuse (load, gimple_vuse (stmt));
+   gsi_insert_before (&gsi, load, GSI_SAME_STMT);
+   gimple_assign_set_lhs (stmt, TREE_OPERAND (lhs, 0));
+   gimple_assign_set_rhs_with_ops
+ (&gsi, COMPLEX_EXPR,
+  TREE_CODE (lhs) == IMAGPART_EXPR
+  ? other : gimple_assign_rhs1 (stmt),
+  TREE_CODE (lhs) == IMAGPART_EXPR
+  ? gimple_assign_rhs1 (stmt) : other, NULL_TREE);
+   stmt = gsi_stmt (gsi);
+   unlink_stmt_vdef (stmt);
+   update_stmt (stmt);
+   continue;
+ }
+ 
/* We shouldn't have any fancy wrapping of
   component-refs on the LHS, but look through
   VIEW_CONVERT_EXPRs as that is easy.  */
Index: gcc/tree-ssa-forwprop.c
===
*** gcc/tree-ssa-forwprop.c.orig2015-01-08 13:25:14.892227266 +0100
--- gcc/tree-ssa-forwprop.c 2015-01-09 11:09:50.785517099 +0100
*** pass_forwprop::execute (function *fun)
*** 2210,2215 
--- 2210,2306 
  else
gsi_next (&gsi);
}
+ else if (TREE_CODE (TREE_TYPE (lhs)) == COMPLEX_TYPE
+  && gimple_assign_load_p (stmt)
+  && !gimple_has_volatile_ops (stmt)
+  && !stmt_can_throw_internal (stmt))
+   {
+ /* Rewrite loads used only in real/imagpart extractions to
+component-wise loads.  */
+ use_operand_p use_p;
+ imm_use_iterator iter;
+ bool rewrite = true;
+ FOR_EACH_IMM_USE_FAST (use_p, iter, lhs)
+   {
+ gimple use_stmt = USE_STMT (use_p);
+ if (is_gimple_debug (use_stmt))
+   continue;
+ if (!is_gimple_assign (use_stmt)
+ || (gimple_assign_rhs_code (use_stmt) != REALPART_EXPR
+ && gimple_assign_rhs_code (use_stmt) != 
IMAGPART_EXPR))
+   {
+   

Re: LTO streaming of TARGET_OPTIMIZE_NODE

2015-01-09 Thread Thomas Schwinge
Hi!

On Thu, 8 Jan 2015 15:11:49 +0100, Jakub Jelinek  wrote:
> On Thu, Nov 20, 2014 at 01:27:08PM +0100, Bernd Schmidt wrote:
> > On 11/13/2014 05:06 AM, Jan Hubicka wrote:
> > >this patch adds infrastructure for proper streaming and merging of
> > >TREE_TARGET_OPTION.
> > 
> > This breaks the offloading path via LTO since it introduces an
> > incompatibility in LTO format between host and offload machine.
> > 
> > A very quick patch to fix it is below - the OpenACC testcase I was using
> > seems to be working again with this. Thoughts, suggestions?
> 
> I actually think

Thanks for picking up this issue!

> this patch makes a lot of sense.  Target option nodes
> by definition are target specific, generally there is no mapping between
> host and offloading target features.  So, the host target options
> are not useful to the offloading target.  And, because the amount of bits
> streamed is also target specific, say x86_64 will have different and
> incompatible cl_target_option_stream_{out,in} from nvptx, and even
> for Intel MIC offloading it doesn't make much sense: which CPU a certain
> function is targeting doesn't necessarily have any relation to the Intel MIC
> that will offload it.

> Also note that the patch fixes all the current regressions in Intel MIC
> (emulated) offloading caused by the r218767

(Which has been filed as , by the way.)

I'm confirming that Bernd's patch resolves the intelmic offloading
regressions, but I still do see issues in all nvptx offloading, but
cannot tell yet what's going on.  (Reverting Honza's patch resolves
these; but maybe it's something that is solely an issue with the nvptx
offloading path.)


Grüße,
 Thomas




Re: [PATCH] Fix undefined label problem after crossjumping (PR rtl-optimization/64536)

2015-01-09 Thread Richard Biener
On Fri, 9 Jan 2015, Jakub Jelinek wrote:

> On Fri, Jan 09, 2015 at 11:15:14AM +0100, Richard Biener wrote:
> > On Fri, 9 Jan 2015, Jakub Jelinek wrote:
> > 
> > > On Fri, Jan 09, 2015 at 10:36:09AM +0100, Richard Biener wrote:
> > > > I wonder why post_order_compute calls tidy_fallthru_edges at all - won't
> > > > that break the just computed postorder?
> > > 
> > > > Dunno, but I think it shouldn't break anything, the function doesn't remove
> > > > any blocks, just in the typical case of an unconditional jump to the next bb
> > > > or conditional jump to the next bb (if only successor) removes the jump and
> > > > makes the edge EDGE_FALLTHRU.
> > > 
> > > > > Other than that, why can't the issue show up with non-table-jumps?
> > > 
> > > I think tablejumps are the only case where (at least during jump2)
> > > code_labels live in between the basic blocks, not inside of them.
> > > 
> > > > What does it take to preserve (all) the labels?
> > > 
> > > Then we'd need to remove all the instructions in between the two basic
> > > blocks (as we currently do), but move any code_labels from there first to
> > > the start of the next basic block.  Probably better just call tablejump_p
> > > with non-NULL args and move precisely that code_label that it sets.
> > > 
> > > But, as I said, we'd still not optimize it if tidy_fallthru_edges is not
> > > called, so we'd need to do it at another place too.
> > 
> > Ok, I see.  I still wonder why we call tidy_fallthru_edges from
> > postorder_compute.  If we delete unreachable blocks that means
> > we at most remove incoming edges to a block - that should never
> > change any other edges fallthru status...?
> 
> The call has been added by
> http://gcc.gnu.org/ml/gcc-patches/2006-08/msg00095.html
> and is only done if post_order_compute is called with the special flag,
> supposedly that replaced explicit delete_unreachable_blocks or similar.
> And, if you remove unreachable blocks, if they are in between some bb
> and its single successor, then indeed that is something that should be
> tidied, as we don't have to jump around nothing.

Ah, indeed.

> If you want, I can try instead of disabling it for tablejumps
> just move the label.

Yeah, I'd prefer that - it can't be too difficult, no?

> Still, I think we should be able to optimize it somewhere else too
> (we can remove the tablejumps not just if all jump_table_data entries
> point to next_bb, but even when they point to some completely different bb,
> as long as it is a single_succ_p).  And ideally also optimize it at GIMPLE,
> but guess that is GCC 6 material.

cfgcleanup material, similar for GIMPLE I guess.

Richard.

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild,
Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)


Re: [PATCH] Enable experimental TSAN support for Ada

2015-01-09 Thread Richard Biener
On Fri, Jan 9, 2015 at 11:21 AM, Bernd Edlinger
 wrote:
> Hi,
>
> On Fri, 9 Jan 2015 10:57:14, Richard Biener wrote:
>>
>> On Mon, Jan 5, 2015 at 9:00 PM, Jeff Law  wrote:
>>> On 01/03/15 06:49, Bernd Edlinger wrote:

 Hi,

 I was experimenting with enabling TSAN for Ada recently.
 I think this gives rather interesting results.

 The Instrumentation worked almost out of the box, we just have
 the problem that it is not gimple-OK to fold something like
 "& VIEW_CONVERT_EXPR(x)", and this happens in Ada all the time.

 Boot-Strapped and regression-tested on x86_64-linux-gnu.
 OK for trunk?


 Thanks
 Bernd.



 changelog-tsan-ada.txt


 gcc/ChangeLog:
 2015-01-03 Bernd Edlinger

 Enable experimental TSAN support for Ada.
 * tsan.c (instrument_expr): Handle VIEW_CONVERT_EXPR.
>>>
>>> OK for the trunk with a comment before the new block of code indicating why
>>> we need to handle VIEW_CONVERT_EXPR specially here (specifically we can't
>>> call build_fold_addr_expr on the VIEW_CONVERT_EXPR).
>>
>> There may be multiple VIEW_CONVERT_EXPRs in a reference chain
>> so simply stripping the outermost only doesn't work (the assert).
>>
>
> Hmm, that did not happen in any of the Ada tests in ada/acats nor in gnat.dg,
> but with Ada anything may be possible...
>
> Would you like it better if I do it this way:
>
>   align = get_object_alignment (expr);
>   if (align < BITS_PER_UNIT)
>  return false;
>
>   do
> {
>expr = TREE_OPERAND (expr, 0);
> } while (TREE_CODE (expr) == VIEW_CONVERT_EXPR);

No, I mean the tree might look like

  COMPONENT_REFgcc_checking_assert (is_gimple_addressable (expr));
>   expr_ptr = build_fold_addr_expr (unshare_expr (expr));
>
>
>
>> I wonder why you do all the special-casing when you have already
>> called get_inner_reference on the reference. The address is
>> simply &base + offset + bitpos / BITS_PER_UNIT, the bitfield
>> case is detectable via bitpos % BITS_PER_UNIT != 0.
>>
>
> I tried that first, but for something like S.A[x].B
> offset is something like a+b*x, and while we handle that in expansion,
> it is pretty hard to fold gimple code for &base + offset in this case.

Not really, just use one of the force_gimple_operand* functions.
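
For reference, the address arithmetic from the quoted text (&base + offset + bitpos / BITS_PER_UNIT, with bitfields detected by a non-zero remainder) can be written down as a small stand-alone model. struct toy_ref and TOY_BITS_PER_UNIT are hypothetical names for this sketch; it says nothing about how the gimple for the address would be built.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_BITS_PER_UNIT 8

/* Toy result of decomposing a reference: a base address, a byte
   offset, and a bit position relative to that offset.  */
struct toy_ref
{
  uintptr_t base;
  uintptr_t offset;
  unsigned bitpos;
};

/* A reference is a bitfield access if it does not start on a byte
   boundary.  */
static bool
toy_ref_is_bitfield_p (const struct toy_ref *ref)
{
  assert (ref != NULL);
  return ref->bitpos % TOY_BITS_PER_UNIT != 0;
}

/* Byte address of a non-bitfield reference.  */
static uintptr_t
toy_ref_byte_address (const struct toy_ref *ref)
{
  assert (!toy_ref_is_bitfield_p (ref));
  return ref->base + ref->offset + ref->bitpos / TOY_BITS_PER_UNIT;
}
```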

> But it is relatively easy to fold &S.A[x] and add a constant byte offset.
>
>
>> Sth else I noticed, instead of checking points-to in the weird way
>> you do you simply want if (DECL_P (base) && !may_be_aliased (base)).
>>
>
> you mean,
>
>   /* No need to instrument accesses to decls that don't escape,
>  they can't escape to other threads then.  */
>   if (DECL_P (base))
> {
>   struct pt_solution pt;
>   memset (&pt, 0, sizeof (pt));
>   pt.escaped = 1;
>   pt.ipa_escaped = flag_ipa_pta != 0;
>   pt.nonlocal = 1;
>   if (!pt_solution_includes (&pt, base))
> return false;
>   if (!is_global_var (base) && !may_be_aliased (base))
> return false;
> }
>
> should be equivalent to
>
>if (DECL_P (base) && !may_be_aliased (base))
> return false;
>
> is that right?

Yes, well, not exactly, but I wonder if it's worth doing the extra check
if you only check decl accesses anyway and not indirect accesses.

Richard.

>
> Thanks,
> Bernd
>
>> Richard.
>>
>>> Jeff
>>>
>


Re: [PATCH] Fix undefined label problem after crossjumping (PR rtl-optimization/64536)

2015-01-09 Thread Jakub Jelinek
On Fri, Jan 09, 2015 at 11:15:14AM +0100, Richard Biener wrote:
> On Fri, 9 Jan 2015, Jakub Jelinek wrote:
> 
> > On Fri, Jan 09, 2015 at 10:36:09AM +0100, Richard Biener wrote:
> > > I wonder why post_order_compute calls tidy_fallthru_edges at all - won't
> > > that break the just computed postorder?
> > 
> > Dunno, but I think it shouldn't break anything, the function doesn't remove
> > any blocks, just in the typical case of an unconditional jump to the next bb
> > or conditional jump to the next bb (if only successor) removes the jump and
> > makes the edge EDGE_FALLTHRU.
> > 
> > > > Other than that, why can't the issue show up with non-table-jumps?
> > 
> > I think tablejumps are the only case where (at least during jump2)
> > code_labels live in between the basic blocks, not inside of them.
> > 
> > > What does it take to preserve (all) the labels?
> > 
> > Then we'd need to remove all the instructions in between the two basic
> > blocks (as we currently do), but move any code_labels from there first to
> > the start of the next basic block.  Probably better just call tablejump_p
> > with non-NULL args and move precisely that code_label that it sets.
> > 
> > But, as I said, we'd still not optimize it if tidy_fallthru_edges is not
> > called, so we'd need to do it at another place too.
> 
> Ok, I see.  I still wonder why we call tidy_fallthru_edges from
> postorder_compute.  If we delete unreachable blocks that means
> we at most remove incoming edges to a block - that should never
> change any other edges fallthru status...?

The call has been added by
http://gcc.gnu.org/ml/gcc-patches/2006-08/msg00095.html
and is only done if post_order_compute is called with the special flag,
supposedly that replaced explicit delete_unreachable_blocks or similar.
And, if you remove unreachable blocks, if they are in between some bb
and its single successor, then indeed that is something that should be
tidied, as we don't have to jump around nothing.

If you want, I can try instead of disabling it for tablejumps
just move the label.

Still, I think we should be able to optimize it somewhere else too
(we can remove the tablejumps not just if all jump_table_data entries
point to next_bb, but even when they point to some completely different bb,
as long as it is a single_succ_p).  And ideally also optimize it at GIMPLE,
but guess that is GCC 6 material.
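
The condition in the parenthesis above boils down to "every entry of the jump table names the same block". A minimal stand-alone check for that, assuming blocks are identified by non-negative indices and using the made-up name toy_tablejump_single_target, might look like this; it is not cfgcleanup code.

```c
#include <assert.h>
#include <stddef.h>

/* Return the common destination block index if all N entries of
   TARGETS agree, or -1 if the table has at least two distinct
   destinations.  Block indices are assumed to be non-negative.  */
static int
toy_tablejump_single_target (const int *targets, size_t n)
{
  assert (targets != NULL && n > 0);
  for (size_t i = 1; i < n; i++)
    if (targets[i] != targets[0])
      return -1;
  return targets[0];
}
```

Note that the result need not be the next block; any single destination makes the table redundant.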

Jakub


Re: [PATCH, fortran] PR fortran/60255 Deferred character length

2015-01-09 Thread Andre Vehreschild
Hi all, hi Paul,

I started to implement the changes requested below, but I stumbled over an
oddity:

For a deferred length kind4 char array, the length of the string is stored
without multiplication by 4 in the length variable attached. So when we now
decide to store the length of the string in an unlimited polymorphic entity in
bytes in the component formerly called _len and the size of each character in
_vtype->_size then we have an inconsistency with the way deferred char
lengths are stored. IMHO we should store this consistently, i.e., both
'length'-variables store either the length of the string ('length' = array_len)
or the size of the memory needed ('length' = array_len * char_size). What do
you think?

Furthermore, think about debugging: When looking at an unlimited polymorphic
entity storing a kind-4-char-array of length 7, then having a 'length' component
set to 28 will lead to confusion. I humbly predict, that this will produce many
entries in the bugtracker, because people don't understand that 'length' stores
the product of elem_size times string_len, because all they see is an
assignment of a length-7 char array.
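
The two conventions differ only by a factor of the character size, as this small sketch shows for the kind-4, length-7 example. The helper names are hypothetical and the code is plain C arithmetic, not gfortran's descriptor layout.

```c
#include <assert.h>
#include <stddef.h>

/* Convention "length in bytes": number of characters times the size
   of one character.  */
static size_t
toy_bytes_from_chars (size_t len_chars, size_t char_size)
{
  assert (char_size > 0);
  return len_chars * char_size;
}

/* Recover the character count, as LEN () would have to do when the
   stored length is in bytes.  */
static size_t
toy_chars_from_bytes (size_t len_bytes, size_t char_size)
{
  assert (char_size > 0 && len_bytes % char_size == 0);
  return len_bytes / char_size;
}
```

For a kind-4 string of 7 characters the byte convention stores 28, and every LEN () query pays one division to get back to 7.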

What do we do about it?

Regards,
Andre

On Thu, 8 Jan 2015 20:56:43 +0100
Paul Richard Thomas  wrote:

> Dear Andre,
> 
> Thanks for the patch. As I have said to you, off list, I think that
> the _size field in the vtable should contain the kind information and
> that the _len field should carry the length of the string in bytes. I
> think that it is better to optimise array access this way than to
> avoid the division in evaluating LEN (). I am happy to accept contrary
> opinions from the others.
> 
> I do not believe that the bind_c issue is an issue. Your patch
> correctly deals with it IMHO.
> 
> Subject to the above change in the value of _len, I think that your
> patch is OK for trunk.
> 
> With best regards
> 
> Paul
> 
> On 4 January 2015 at 13:40, Andre Vehreschild  wrote:
> > Hi Janus, hi Paul, hi Tobias,
> >
> > Janus: During code review, I found that I had the code in
> > gfc_get_len_component() duplicated. So I now reintroduced and documented the
> > routine making it more commonly usable and added more documentation. The
> > call sites are now simplify.c (gfc_simplify_len) and trans-expr.c
> > (gfc_trans_pointer_assignment). Attached is the reworked version of the
> > patch.
> >
> > Paul, Tobias: Can one of you have a look at line 253 of the patch? I need
> > some expertise on the bind_c behavior. My patch needs the check for
> > is_bind_c added in trans_expr.c (gfc_conv_expr) to prevent mistyping an
> > associated variable in a select type() during the conv. Background: This
> > code fragment taken from the testcase in the patch:
> >
> > MODULE m
> > contains
> >   subroutine bar (arg, res)
> > class(*) :: arg
> > character(100) :: res
> > select type (w => arg)
> >   type is (character(*))
> > write (res, '(I2)') len(w)
> > end select
> >   end subroutine
> > END MODULE
> >
> > has the conditions required for line trans-expr.c:6630 of gfc_conv_expr when
> > the associate variable w is converted. This transforms the type of the
> > associate variable to something unexpected in the further processing
> > leading to some issues during fortraning. Janus told me, that the f90_type
> > has been abused for some other things (unlimited polymorphic treatment).
> > Although I believe that reading the comments above the if in question, the
> > check I had to enhance is treating bind_c stuff (see the threads content
> > for more). I would feel safer when one of you gfortran gurus can have a
> > look and give an opinion, whether the change is problematic. I couldn't
> > figure why w is resolved to meet the criteria (any ideas). Btw, all regtest
> > are ok reporting no issues at all.
> >
> > Bootstraps and regtests ok on x86_64-linux-gnu
> >
> > Regards,
> > Andre
> >
> >
> > On Sat, 3 Jan 2015 16:45:07 +0100
> > Janus Weil  wrote:
> >
> >> Hi Andre,
> >>
> >> >> >> For the
> >> >> >> second one (in gfc_conv_expr), I don't directly see how it's related
> >> >> >> to deferred char-len. Why is this change needed?
> >> >> >
> >> >> > That change is needed, because in some rare case where an associated
> >> >> > variable in a "select type ()" is used, then the type and f90_type
> >> >> > match the condition while them not really being in a bind_c context.
> >> >> > Therefore I have added the check for bind_c. Btw, I now have removed
> >> >> > the TODO, because that case is covered by the regression tests.
> >> >>
> >> >> I don't understand how f90_type can be BT_VOID without being in a
> >> >> BIND_C context, but I'm not really a ISO_C_BINDING expert. Which test
> >> >> case is the one that triggered this?
> >> >
> >> > This case is triggered by the test-case in the patch, where in the select
> >> > type (w => arg) in module m routine bar the w meets the criteria to make
> >> > the condition become true. The type of w is then "fixed" an

[PATCH] rs6000: Fix va_start handling for -m32 -mpowerpc64 ABI_V4

2015-01-09 Thread Segher Boessenkool
This fixes 88 testsuite FAILs.

-mpowerpc64 does not change the ABI, but it does change the value of
UNITS_PER_WORD.  We could use POINTER_SIZE_UNITS instead of 4 here,
but that does not seem quite right.  This code is for SVR4 only, so
a literal 4 isn't so bad I think.  Better suggestions welcome though.
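
A quick arithmetic sketch of the difference, assuming (this is not stated above) that UNITS_PER_WORD is 4 for plain -m32 and 8 once -mpowerpc64 is added. The function names are made up for this example.

```c
#include <assert.h>

/* Stack slot size of the 32-bit SVR4 ABI in bytes.  Per the text
   above, -mpowerpc64 does not change the ABI.  */
enum { TOY_SVR4_SLOT_BYTES = 4 };

/* Old computation: scales with the word size of the registers.  */
static long
toy_overflow_offset_old (long words, int units_per_word)
{
  assert (words >= 0 && units_per_word > 0);
  return words * units_per_word;
}

/* New computation: always uses the ABI's slot size.  */
static long
toy_overflow_offset_new (long words)
{
  assert (words >= 0);
  return TOY_SVR4_SLOT_BYTES * words;
}
```

With three words already consumed, the old formula yields 12 for a word size of 4 but 24 for a word size of 8, while the new one yields 12 in both configurations.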

Bootstrapped and tested as usual.  Okay for mainline?


Segher


2015-01-09  Segher Boessenkool  

gcc/
* config/rs6000/rs6000.c (rs6000_va_start): Use a literal 4 instead
of UNITS_PER_WORD to describe the size of stack slots.

---
 gcc/config/rs6000/rs6000.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 66a1399..cc7b2a4 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -11225,7 +11225,7 @@ rs6000_va_start (tree valist, rtx nextarg)
   /* Find the overflow area.  */
   t = make_tree (TREE_TYPE (ovf), virtual_incoming_args_rtx);
   if (words != 0)
-t = fold_build_pointer_plus_hwi (t, words * UNITS_PER_WORD);
+t = fold_build_pointer_plus_hwi (t, 4 * words);
   t = build2 (MODIFY_EXPR, TREE_TYPE (ovf), ovf, t);
   TREE_SIDE_EFFECTS (t) = 1;
   expand_expr (t, const0_rtx, VOIDmode, EXPAND_NORMAL);
-- 
1.8.1.4



Re: [PATCH] Flatten tree.h and tree-core.h (Version 3)

2015-01-09 Thread Richard Biener
On Fri, Jan 9, 2015 at 10:39 AM, Michael Collison
 wrote:
> This patch flattens tree.h and tree-core.h. This is a revised patch that
> does not include tree-core.h as a result of flattening.
>
> Version 3 of the patch adds the header files removed from tree-core.h to
> gcc-plugin.h in order to allow ggc-common.c to compile. This is a recent
> issue seen on trunk.
>
> I removed the includes in tree.h and tree-core.h except for the include of
> tree-core.h in tree.h.
>
> I modified genattrtab.c, genautomata.c, genemit.c, gengtype.c,
> genopinit.c, genoutput.c,
> genpeep.c, genpreds.c, and optc-save-gen.awk to include the necessary
> include files removed from
> tree.h and tree-core.h when generating their respective files.
>
> I removed three inline functions from tree.h and relocated them to
> fold-const.c and exported them in fold-const.h. The functions are:
>
> convert_to_ptrofftype_loc
> fold_build_pointer_plus_loc
> fold_build_pointer_plus_hwi_loc
>
> All other changes include the necessary include files removed from tree.h
> and tree-core.h. Note the patch modifies all the front-ends.
>
> I bootstrapped on x86 with all languages. I also bootstrapped on all targets
> listed in contrib/config-list.mk with c and c++ enabled.
>
> Is this okay for trunk?

Ok.

Thanks,
Richard.

> 2014-12-24  Michael Collison  
>
> * genattrtab.c (write_header): Include hash-set.h, machmode.h,
> vec.h, double-int.h, input.h, alias.h, symtab.h, options.h
> fold-const.h, wide-int.h, and inchash.h when generating
> insn-attrtab.c.
> * genautomata.c (main): Include hash-set.h, machmode.h,
> vec.h, double-int.h, input.h, alias.h, symtab.h, options.h
> fold-const.h, wide-int.h, and inchash.h when generating
> insn-automata.c.
> * genemit.c (main): Include hash-set.h, machmode.h,
> vec.h, double-int.h, input.h, alias.h, symtab.h, options.h
> fold-const.h, wide-int.h, and inchash.h when generating
> insn-emit.c.
> * gengtype.c (open_base_files): Include hash-set.h, machmode.h,
> vec.h, double-int.h, input.h, alias.h, symtab.h, options.h
> fold-const.h, wide-int.h, and inchash.h when generating
> gtype-desc.c.
> * genopinit.c (main): Include hash-set.h, machmode.h,
> vec.h, double-int.h, input.h, alias.h, symtab.h, options.h
> fold-const.h, wide-int.h, and inchash.h when generating
> insn-opinit.c.
> * genoutput.c (output_prologue): Include hash-set.h, machmode.h,
> vec.h, double-int.h, input.h, alias.h, symtab.h, options.h
> fold-const.h, wide-int.h, and inchash.h when generating
> insn-output.c.
> * genpeep.c (main): Include hash-set.h, machmode.h,
> vec.h, double-int.h, input.h, alias.h, symtab.h, options.h
> fold-const.h, wide-int.h, and inchash.h when generating
> insn-peep.c.
> * genpreds.c (write_insn_preds_c): Include hash-set.h, machmode.h,
> vec.h, double-int.h, input.h, alias.h, symtab.h, options.h
> fold-const.h, wide-int.h, and inchash.h when generating
> insn-preds.c.
> * optc-save-gen-awk: Include hash-set.h, machmode.h,
> vec.h, double-int.h, input.h, alias.h, symtab.h, options.h
> fold-const.h, wide-int.h, and inchash.h when generating
> options-save.c.
> * opth-gen.awk: Change include guard from GCC_C_COMMON_H to
> GCC_C_COMMON_C
> when generating options.h.
>
> 2014-12-24  Michael Collison  
>
> * ada/gcc-interface/cuintp.c: Include hash-set.h, machmode.h,
> vec.h, double-int.h, input.h, alias.h, symtab.h,
> fold-const.h, wide-int.h, and inchash.h due to
> flattening of tree.h.
> * ada/gcc-interface/decl.c: ditto.
> * ada/gcc-interface/misc.c: ditto.
> * ada/gcc-interface/targtyps.c: Include hash-set.h, machmode.h,
> vec.h, double-int.h, input.h, alias.h, symtab.h, options.h,
> fold-const.h, wide-int.h, and inchash.h due to
> flattening of tree.h.
> * ada/gcc-interface/trans.c: Include hash-set.h, machmode.h,
> vec.h, double-int.h, input.h, alias.h, symtab.h, real.h,
> fold-const.h, wide-int.h, inchash.h due to
> flattening of tree.h.
> * ada/gcc-interface/utils.c: Include hash-set.h, machmode.h,
> vec.h, double-int.h, input.h, alias.h, symtab.h,
> fold-const.h, wide-int.h, and inchash.h due to
> flattening of tree.h.
> * ada/gcc-interface/utils2.c: ditto.
> * alias.c: Include hash-set.h, machmode.h,
> vec.h, double-int.h, input.h, alias.h, symtab.h, options.h
> fold-const.h, wide-int.h, and inchash.h due to
> flattening of tree.h.
> * asan.c: ditto.
> * attribs.c: ditto.
> * auto-inc-dec.c: ditto.
> * auto-profile.c: ditto
> * bb-reorder.c: ditto.
> * bt-load.c: Include symtab.h due to flattening of tree.h.
> * builtins.c: Include hash-set.h, machmode.h,
> vec.h, double-int.h, input.h, alias.h, symtab.h, options.h
> fold-const.h, wide-int.h, and inchash.h due to
> flattening of tree.h.
>

Re: [PATCH] Handle CALL_INSN_FUNCTION_USAGE clobbers in regcprop.c

2015-01-09 Thread Jakub Jelinek
On Fri, Jan 09, 2015 at 11:35:41AM +0100, Tom de Vries wrote:
> 2015-01-09  Tom de Vries  
> 
>   PR rtl-optimization/64539
>   * regcprop.c (copyprop_hardreg_forward_1): Handle clobbers in
>   CALL_INSN_FUNCTION_USAGE.

To avoid the duplication, wouldn't it be better to add

static void
kill_clobbered_values (rtx_insn *insn, struct value_data *vd)
{
  note_stores (PATTERN (insn), kill_clobbered_value, vd);
  if (CALL_P (insn))
    {
      rtx exp;
      for (exp = CALL_INSN_FUNCTION_USAGE (insn); exp; exp = XEXP (exp, 1))
        {
          rtx x = XEXP (exp, 0);
          if (GET_CODE (x) == CLOBBER)
            kill_value (SET_DEST (x), vd);
        }
    }
}
function (with appropriate function comment) and use it in both places?

Otherwise LGTM.

Jakub


Re: [PATCH][1-3] New configure options that make the compiler use -fPIE and -pie as default option

2015-01-09 Thread Marcus Meissner
Hi,

can this be added for GCC 5? 

It would be interesting for SUSE too.

Ciao, Marcus
On Mon, Nov 10, 2014 at 09:26:39PM +0100, Magnus Granberg wrote:
> fredag 01 augusti 2014 10.52.27 skrev  Rainer Orth:
> > Hi Magnus,
> > 
> > a couple of comments, mostly nits.
> > 
> > > 2014-07-31  Magnus Granberg  
> > > 
> > >   /gcc
> > >   * config/gnu-user.h: Define PIE_DRIVER_SELF_SPECS for PIE
> > >   as default and GNU_DRIVER_SELF_SPECS.
> > >   * config/i386/gnu-user-common.h: Define DRIVER_SELF_SPECS
> > >   * configure.ac: Add new option that enable PIE as default.
> > >   * configure, config.in: Rebuild.
> > >   * Makefile.in: Disable PIE when building the compiler.
> > >   * doc/install.texi: Add the new configure option default PIE.
> > >   * doc/invoke.texi: Add note for the new configure option default PIE.
> > 
> > Many of those entries are mis-formatted.  See other examples and the GNU
> > Coding Standards for details.  E.g. the first would be
> > 
> > * config/gnu-user.h (PIE_DRIVER_SELF_SPECS): Define.
> > 
> > In general, you need to mention which macro, variable, manual section
> > you change.  Emacs' add-change-log-entry does the basics for you.
> > Besides, you only state what changed, not why.
> > 
> > Apart from that, I don't think defining PIE_DRIVER_SELF_SPECS in
> > gnu-user.h is a good idea.  This way, every other target supporting the
> > option would have to duplicate that stuff.
> > 
> > * testsuite/gcc/default-pie.c: New test for new configure option
> > --enale-default-pie
> > 
> > gcc/testsuite has its own ChangeLog file.  Typo for --enale-...
> > 
> > * testsuite/gcc.dg/other/anon5.C: Add skip test as it fail to link
> > on effective_target default_pie.
> > 
> > should be
> > 
> > * g++.dg/other/anon5.C: Skip if default_pie.
> > 
> > No explanations in ChangeLog entries; they belong into the code.
> > Besides, you had the first dir component wrong.  Again, Emacs does this
> > for you.
> > 
> > * testsuite/lib/target-supports.exp (check_profiling_available):
> > We can't use profiling on effective target default_pie.
> > (check_effective_target_pie): Add check_effective_target_default_pie.
> > 
> > Wrong: should be
> > 
> > * lib/target-supports.exp (check_effective_target_default_pie):
> > New proc.
> > 
> > The new default_pie effective-target keyword needs to be documented in
> > doc/sourcebuild.texi.
> > 
> > --- a/gcc/testsuite/gcc.dg/default-pie.c 2013-11-09 21:07:16.741479728 +0100
> > +++ b/gcc/testsuite/gcc.dg/default-pie.c 2013-11-09 21:05:07.801479218 +0100
> > @@ -0,0 +1,12 @@
> > +/* { dg-do compile { target *-*-linux* *-*-gnu* } } */
> > +/* { dg-require-effective-target default_pie } */
> > 
> > Why restrict to Linux, GNU?  default_pie should be enough once other
> > targets add this.
> > 
> > --- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-store-ccp-3.c   2012-03-14 17:33:37.0 +0100
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-store-ccp-3.c   2014-07-29 00:55:17.421086416 +0200
> > @@ -2,6 +2,9 @@
> >  /* Skipped on MIPS GNU/Linux target because __PIC__ can be
> >     defined for executables as well as shared libraries.  */
> >  /* { dg-skip-if "" { *-*-darwin* hppa*64*-*-* mips*-*-linux* *-*-mingw* } { "*" } { "" } } */
> > +/* Skipped on default_pie targets because __PIC__ is
> > +   defined for executables.  */
> > +/* { dg-skip-if "" { default_pie } { "*" } { "" } }  */
> > 
> > Omit those default args, they're unnecessary.  Also in g++.dg/other/anon5.C.
> > 
> > --- a/gcc/testsuite/g++.dg/other/anon5.C 2012-11-10 15:34:42.0 +0100
> > +++ b/gcc/testsuite/g++.dg/other/anon5.C 2013-11-09 14:49:52.281390127 +0100
> > @@ -1,5 +1,6 @@
> >  // PR c++/34094
> >  // { dg-do link { target { ! { *-*-darwin* *-*-hpux* *-*-solaris2.* } } } }
> > +// { dg-skip-if "" { default_pie } { "*" } { "" } }
> > 
> > The first arg to dg-skip-if should explain why you're skipping the test.
> > 
> > --- a/gcc/testsuite/lib/target-supports.exp 2013-10-01 11:18:30.0 +0200
> > +++ b/gcc/testsuite/lib/target-supports.exp 2013-10-25 22:01:46.743388469 +0200
> > @@ -474,6 +474,11 @@ proc check_profiling_available { test_wh
> > }
> >  }
> > 
> > +# Profiling don't work with default -fPIE -pie.
> > 
> > Grammar: "doesn't work".
> > 
> > +# Return 1 if -pie, -fPIE are default enable, 0 otherwise.
> > +
> > +proc check_effective_target_default_pie { } {
> > 
> > Hard to understand, perhaps
> > 
> > # Return 1 if -pie -fPIE are enabled by default, 0 otherwise.
> > 
> > --- a/gcc/doc/invoke.texi   2013-10-03 19:13:50.0 +0200
> > +++ b/gcc/doc/invoke.texi   2013-11-17 21:30:02.784220111 +0100
> > @@ -10535,6 +10535,12 @@ For predictable results, you must also s
> >  used for compilation (@option{-fpie}, @option{-fPIE},
> >  or model suboptions) when you specify this linker option.
> > 
> > +NOTE: With configure --enable-default-pie this option is enabled by default
> > 
> > With the @

Re: OMP builtins in offloading (was: [PATCH 1/4] Add mkoffload for Intel MIC)

2015-01-09 Thread Jakub Jelinek
On Fri, Jan 09, 2015 at 11:36:54AM +0100, Richard Biener wrote:
> Maybe pass it through if you specify -Wl,-debug -v -save-temps
> (that also makes sure to disable collect2's error output buffering,
> which is annoying with LTO)

Works for me too (but should be documented somewhere).

Jakub


Re: OMP builtins in offloading (was: [PATCH 1/4] Add mkoffload for Intel MIC)

2015-01-09 Thread Richard Biener
On Thu, Jan 8, 2015 at 5:39 PM, Jakub Jelinek  wrote:
> On Thu, Jan 08, 2015 at 07:32:13PM +0300, Ilya Verbin wrote:
>> On 08 Jan 16:49, Jakub Jelinek wrote:
>> > BTW, today when looking at the TARGET_OPTION_NODE streaming caused
>> > regressions, I've discovered that it is very hard to debug issues in the
>> > offloading compiler.  Would be nice if
>> > -save-temps -v
>> > printed enough information that it is actually possible to reproduce it,
>> > e.g. while mkoffload command is printed, one can't cut and paste it easily,
>> > because some env vars are required and those aren't printed in the -v dump.
>>
>> I agree, this should be improved.  Unfortunately, I didn't have time so far.
>>
>> > Similarly, the lto1 offloading compiler invocation is not printed, and
>>
>> It can be printed by -foffload="-save-temps -v", or should we pass through 
>> these
>> options from host to offload compiler by default?
>
> Certainly not if they weren't passed by the user to the host compiler.
> But if they have been passed, it might be useful, having to add -save-temps -v
> to too many places is annoying.
> And it would be really nice to print the essential env vars mkoffload is
> relying on, like:
> var1=value1
> var2=value2
> ./mkoffload /tmp/@ccABCDEF

Maybe pass it through if you specify -Wl,-debug -v -save-temps
(that also makes sure to disable collect2's error output buffering,
which is annoying with LTO)

Richard.

> Jakub


Re: [PATCH] IPA ICF: add comparison for target and optimization nodes

2015-01-09 Thread Martin Liška

On 01/07/2015 12:38 PM, Martin Liška wrote:

Hello.

Following patch adds support for target and optimization nodes comparison,
which is based on Honza's newly added infrastructure.

Apart from that, there's a small hunk that corrects formatting and removes
unnecessary call to a comparison function.

Hope it can be applied as one patch.

Tested on x86_64-linux-pc without any new regression introduction.

Ready for trunk?

Thank you,
Martin


Hello.

Apart from the previous patch, I would like to install the following patch,
which introduces new dump functions for target and optimization nodes.
These functions dump only the flags that differ.
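
As a rough illustration of the "print only what differs" idea (a
self-contained toy model, not the code generated by optc-save-gen.awk;
struct toy_options, its fields, the PRINT_IF_DIFFERENT macro and the returned
count are all assumptions made for this sketch):

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for a generated options structure.  */
struct toy_options
{
  int x_optimize;
  int x_flag_unroll_loops;
  int x_flag_tree_vectorize;
};

/* Print the field name and both values when FIELD differs between
   PTR1 and PTR2, and count the difference.  */
#define PRINT_IF_DIFFERENT(FIELD)                                      \
  do                                                                   \
    {                                                                  \
      if (ptr1->FIELD != ptr2->FIELD)                                  \
        {                                                              \
          fprintf (file, "%*s%s (%#x/%#x)\n", indent_to, "", #FIELD,   \
                   (unsigned) ptr1->FIELD, (unsigned) ptr2->FIELD);    \
          n_diff++;                                                    \
        }                                                              \
    }                                                                  \
  while (0)

/* Dump to FILE only the fields that differ; return how many differ.
   The return value is an addition for this sketch.  */
int
toy_options_print_diff (FILE *file, int indent_to,
                        const struct toy_options *ptr1,
                        const struct toy_options *ptr2)
{
  int n_diff = 0;

  assert (file != NULL && ptr1 != NULL && ptr2 != NULL);
  PRINT_IF_DIFFERENT (x_optimize);
  PRINT_IF_DIFFERENT (x_flag_unroll_loops);
  PRINT_IF_DIFFERENT (x_flag_tree_vectorize);
  return n_diff;
}

#undef PRINT_IF_DIFFERENT
```

Two structures that agree on every field produce no output at all, which is
the point of a diff-style dump compared with printing both structures in full.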

Patch has been tested on x86_64-linux-pc.

Thanks,
Martin
From dfc2b68a2f81745f3768cc9349076cc56d3efc8f Mon Sep 17 00:00:00 2001
From: mliska 
Date: Thu, 8 Jan 2015 10:35:38 +0100
Subject: [PATCH 2/2] Option diff dump is added for target and optimization
 flags.

gcc/ChangeLog:

2015-01-09  Martin Liska  

	* ipa-icf.c (sem_function::equals_private): Call new functions
	cl_target_option_print_diff and cl_optimization_print_diff.
	* optc-save-gen.awk (cl_target_option_print_diff): New function.
	(cl_optimization_print_diff): Likewise.
	* opth-gen.awk: Likewise.
---
 gcc/ipa-icf.c |  12 ++---
 gcc/optc-save-gen.awk | 133 ++
 gcc/opth-gen.awk  |   6 +++
 3 files changed, 143 insertions(+), 8 deletions(-)

diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf.c
index 28158b3..3d9943e 100644
--- a/gcc/ipa-icf.c
+++ b/gcc/ipa-icf.c
@@ -437,10 +437,8 @@ sem_function::equals_private (sem_item *item,
 	{
 	  if (dump_file && (dump_flags & TDF_DETAILS))
 	{
-	  fprintf (dump_file, "Source target flags\n");
-	  cl_target_option_print (dump_file, 2, tar1);
-	  fprintf (dump_file, "Target target flags\n");
-	  cl_target_option_print (dump_file, 2, tar2);
+	  fprintf (dump_file, "target flags difference");
+	  cl_target_option_print_diff (dump_file, 2, tar1, tar2);
 	}
 
 	  return return_false_with_msg ("Target flags are different");
@@ -458,10 +456,8 @@ sem_function::equals_private (sem_item *item,
 	{
 	  if (dump_file && (dump_flags & TDF_DETAILS))
 	{
-	  fprintf (dump_file, "Source optimization flags\n");
-	  cl_optimization_print (dump_file, 2, opt1);
-	  fprintf (dump_file, "Target optimization flags\n");
-	  cl_optimization_print (dump_file, 2, opt2);
+	  fprintf (dump_file, "optimization flags difference");
+	  cl_optimization_print_diff (dump_file, 2, opt1, opt2);
 	}
 
 	  return return_false_with_msg ("optimization flags are different");
diff --git a/gcc/optc-save-gen.awk b/gcc/optc-save-gen.awk
index ebeb509..4e28261 100644
--- a/gcc/optc-save-gen.awk
+++ b/gcc/optc-save-gen.awk
@@ -234,6 +234,69 @@ for (i = 0; i < n_opt_char; i++) {
 print "}";
 
 print "";
+print "/* Print different optimization variables from structures provided as arguments.  */";
+print "void";
+print "cl_optimization_print_diff (FILE *file,";
+print "int indent_to,";
+print "struct cl_optimization *ptr1,";
+print "struct cl_optimization *ptr2)";
+print "{";
+
+print "  fputs (\"\\n\", file);";
+for (i = 0; i < n_opt_other; i++) {
+	print "  if (ptr1->x_" var_opt_other[i] " != ptr2->x_" var_opt_other[i] ")";
+	print "fprintf (file, \"%*s%s (%#lx/%#lx)\\n\",";
+	print " indent_to, \"\",";
+	print " \"" var_opt_other[i] "\",";
+	print " (unsigned long)ptr1->x_" var_opt_other[i] ",";
+	print " (unsigned long)ptr2->x_" var_opt_other[i] ");";
+	print "";
+}
+
+for (i = 0; i < n_opt_int; i++) {
+	print "  if (ptr1->x_" var_opt_int[i] " != ptr2->x_" var_opt_int[i] ")";
+	print "fprintf (file, \"%*s%s (%#x/%#x)\\n\",";
+	print " indent_to, \"\",";
+	print " \"" var_opt_int[i] "\",";
+	print " ptr1->x_" var_opt_int[i] ",";
+	print " ptr2->x_" var_opt_int[i] ");";
+	print "";
+}
+
+for (i = 0; i < n_opt_enum; i++) {
+	print "  if (ptr1->x_" var_opt_enum[i] " != ptr2->x_" var_opt_enum[i] ")";
+	print "  fprintf (file, \"%*s%s (%#x/%#x)\\n\",";
+	print "   indent_to, \"\",";
+	print "   \"" var_opt_enum[i] "\",";
+	print "   (int) ptr1->x_" var_opt_enum[i] ",";
+	print "   (int) ptr2->x_" var_opt_enum[i] ");";
+	print "";
+}
+
+for (i = 0; i < n_opt_short; i++) {
+	print "  if (ptr1->x_" var_opt_short[i] " != ptr2->x_" var_opt_short[i] ")";
+	print "fprintf (file, \"%*s%s (%#x/%#x)\\n\",";
+	print " indent_to, \"\",";
+	print " \"" var_opt_short[i] "\",";
+	print " ptr1->x_" var_opt_short[i] ",";
+	print " ptr2->x_" var_opt_short[i] ");";
+	print "";
+}
+
+for (i = 0; i < n_opt_char; i++) {
+	print "  if (ptr1->x_" var_opt_char[i] " != ptr2->x_" var_opt_char[i] ")";
+	print "fprintf (file, \"%*s%s (%#x/%#x)\\n\",";
+	print "  

[PATCH] Handle CALL_INSN_FUNCTION_USAGE clobbers in regcprop.c

2015-01-09 Thread Tom de Vries

Jakub,

Attached patch handles CALL_INSN_FUNCTION_USAGE clobbers in 
copyprop_hardreg_forward_1.


Terry reported a cprop_hardreg misbehaviour here
( https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64154#c2 ), in the context of
trying out -fipa-ra for thumb1. The -fipa-ra flag is currently disabled for
thumb1, but AFAIU that is due to an unrelated issue.


The problem is that when cprop_hardreg processes a call_insn with a clobber in 
CALL_INSN_FUNCTION_USAGE, the clobber is not taken into account.


So for this call instruction, the 'clobber (reg:SI 12 ip)' is ignored:
...
(call_insn 141 292 142 13
 (parallel
  [
   (call (mem:SI
  (symbol_ref:SI ("f2") [flags 0x3]  )
  [0 f2 S4 A32])
(const_int 0 [0]))
   (use (const_int 0 [0]))
   (clobber (reg:SI 14 lr))
   ])
 vshift-3.c:119
 770 {*call_insn}
 (expr_list:REG_CALL_DECL (symbol_ref:SI ("f2") [flags 0x3]  )
  (expr_list:REG_EH_REGION (const_int 0 [0])
   (nil)))
 (expr_list (clobber (reg:SI 12 ip))
  (nil)))
...

This results in cprop_hardreg using register ip during the call, which is 
incorrect:
...
(insn (set (reg:SI 4 r4)
   (reg:SI 12 ip)))

(call_insn 141 ... )

(insn (set (reg:SI 0 r0)
   (reg:SI 12 ip)))
...

I have not been able to reproduce the failing code.  But I was able to observe 
that the clobber in CALL_INSN_FUNCTION_USAGE was ignored by 
copyprop_hardreg_forward_1. My understanding is that this is a latent bug in 
cprop_hardreg, uncovered by -fipa-ra.



Actually, there is code in copyprop_hardreg_forward_1 to handle clobbers in 
CALL_INSN_FUNCTION_USAGE, but that is part of the fix for PR57003, where we 
re-apply clobbers:

...
  /* If SET was seen in CALL_INSN_FUNCTION_USAGE, and SET_SRC
     of the SET isn't in regs_invalidated_by_call hard reg set,
     but instead among CLOBBERs on the CALL_INSN, we could wrongly
     assume the value in it is still live.  */
  if (ksvd.ignore_set_reg)
    {
      note_stores (PATTERN (insn), kill_clobbered_value, vd);
      for (exp = CALL_INSN_FUNCTION_USAGE (insn);
           exp;
           exp = XEXP (exp, 1))
        {
          rtx x = XEXP (exp, 0);
          if (GET_CODE (x) == CLOBBER)
            kill_value (SET_DEST (x), vd);
        }
    }
...
[ This code is related to a scenario where the called function returns one of
its arguments and is annotated as such, but that doesn't apply to this example. ]


However, the earlier application of the clobbers just handles the clobbers in 
PATTERN, not those in CALL_INSN_FUNCTION_USAGE:

...
  /* Within asms, a clobber cannot overlap inputs or outputs.
     I wouldn't think this were true for regular insns, but
     scan_rtx treats them like that...  */
  note_stores (PATTERN (insn), kill_clobbered_value, vd);
...

This patch basically adds the same CALL_INSN_FUNCTION_USAGE for loop after the 
earlier 'note_stores (PATTERN (insn), kill_clobbered_value, vd)' call.
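
To make the mechanism concrete, here is a self-contained toy model (not GCC
code; struct usage, struct value_table, kill_reg and kill_usage_clobbers are
names invented for this sketch). It assumes a simplified singly linked usage
list in place of the EXPR_LIST chain and a plain boolean table in place of
struct value_data:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum usage_kind { USAGE_USE, USAGE_CLOBBER };

/* One element of a simplified usage list, standing in for the
   EXPR_LIST chain hanging off CALL_INSN_FUNCTION_USAGE.  */
struct usage
{
  enum usage_kind kind;
  unsigned regno;
  const struct usage *next;
};

#define N_REGS 16

/* valid[r] is true while the pass still trusts the value tracked for r.  */
struct value_table
{
  bool valid[N_REGS];
};

/* Forget whatever is known about register REGNO.  */
static void
kill_reg (struct value_table *vt, unsigned regno)
{
  if (regno < N_REGS)
    vt->valid[regno] = false;
}

/* Walk LIST and forget every register named by a clobber entry.
   Return the number of clobbers seen.  */
unsigned
kill_usage_clobbers (struct value_table *vt, const struct usage *list)
{
  unsigned killed = 0;

  assert (vt != NULL);
  for (const struct usage *u = list; u != NULL; u = u->next)
    if (u->kind == USAGE_CLOBBER)
      {
        kill_reg (vt, u->regno);
        killed++;
      }
  return killed;
}
```

In this model, a call whose usage list contains a clobber of register 12
leaves valid[12] false afterwards, so a later copy from that register can no
longer be treated as holding the pre-call value.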


Terry reported that the patch fixes the problem for him.

Bootstrapped and reg-tested on x86_64, no issues found.

OK for stage3 trunk?

Thanks,
- Tom
2015-01-09  Tom de Vries  

	PR rtl-optimization/64539
	* regcprop.c (copyprop_hardreg_forward_1): Handle clobbers in
	CALL_INSN_FUNCTION_USAGE.
---
 gcc/regcprop.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/gcc/regcprop.c b/gcc/regcprop.c
index 8c4f564..b42a4b7 100644
--- a/gcc/regcprop.c
+++ b/gcc/regcprop.c
@@ -801,6 +801,18 @@ copyprop_hardreg_forward_1 (basic_block bb, struct value_data *vd)
 	 I wouldn't think this were true for regular insns, but
 	 scan_rtx treats them like that...  */
   note_stores (PATTERN (insn), kill_clobbered_value, vd);
+  if (CALL_P (insn))
+	{
+	  rtx exp;
+	  for (exp = CALL_INSN_FUNCTION_USAGE (insn);
+	   exp;
+	   exp = XEXP (exp, 1))
+	{
+	  rtx x = XEXP (exp, 0);
+	  if (GET_CODE (x) == CLOBBER)
+		kill_value (SET_DEST (x), vd);
+	}
+	}
 
   /* Kill all auto-incremented values.  */
   /* ??? REG_INC is useless, since stack pushes aren't done that way.  */
-- 
1.9.1



Re: [PATCH] IPA ICF: add comparison for target and optimization nodes

2015-01-09 Thread Martin Liška

On 01/09/2015 06:21 AM, Jeff Law wrote:

On 01/07/15 04:38, Martin Liška wrote:

Hello.

Following patch adds support for target and optimization nodes
comparison, which is based on Honza's newly added infrastructure.

Apart from that, there's a small hunk that corrects formatting and
removes unnecessary call to a comparison function.

Hope it can be applied as one patch.

Tested on x86_64-linux-pc without any new regression introduction.

Ready for trunk?

Thank you,
Martin

0001-IPA-ICF-target-and-optimization-flags-comparison.patch


 From 393eaa47c8aef9a91a1c635016f23ca2f5aa25e4 Mon Sep 17 00:00:00 2001
From: mliska
Date: Tue, 6 Jan 2015 15:06:18 +0100
Subject: [PATCH] IPA ICF: target and optimization flags comparison.

gcc/ChangeLog:

2015-01-06  Martin Liska

* cgraphunit.c (cgraph_node::create_wrapper): Fix level of indentation.
* ipa-icf.c (sem_function::equals_private): Add support for target and
(sem_item_optimizer::merge_classes): Remove redundant function
comparison.
optimization flags comparison.
* tree.h (target_opts_for_fn): New function.

Looks like the changelog is a bit goof'd with lines intermixed.

Patch itself is good for the trunk.  It'd be nice if you could add a testcase
as well.

Jeff


Hi.

You are right, I forgot to delete a line in Changelog.
Attachment contains final version with a new test case I'm going to install.

Thanks,
Martin
From 76f728dca65a266eb35bb5d96326f28e2147aafa Mon Sep 17 00:00:00 2001
From: mliska 
Date: Tue, 6 Jan 2015 15:06:18 +0100
Subject: [PATCH 1/2] IPA ICF: target and optimization flags comparison.

gcc/ChangeLog:

2015-01-06  Martin Liska  

	* cgraphunit.c (cgraph_node::create_wrapper): Fix level of indentation.
	* ipa-icf.c (sem_function::equals_private): Add support for target and
	optimization flags comparison.
	(sem_item_optimizer::merge_classes): Remove redundant function
	comparison.
	* tree.h (target_opts_for_fn): New function.

gcc/testsuite/ChangeLog:

2015-01-09  Martin Liska  

	* gcc.dg/ipa/ipa-icf-32.c: New test.
---
 gcc/cgraphunit.c  | 52 +--
 gcc/ipa-icf.c | 44 -
 gcc/testsuite/gcc.dg/ipa/ipa-icf-32.c | 24 
 gcc/tree.h| 10 +++
 4 files changed, 103 insertions(+), 27 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/ipa/ipa-icf-32.c

diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
index c8c8562..81246e2 100644
--- a/gcc/cgraphunit.c
+++ b/gcc/cgraphunit.c
@@ -2385,40 +2385,40 @@ cgraphunit_c_finalize (void)
 void
 cgraph_node::create_wrapper (cgraph_node *target)
 {
-/* Preserve DECL_RESULT so we get right by reference flag.  */
-tree decl_result = DECL_RESULT (decl);
+  /* Preserve DECL_RESULT so we get right by reference flag.  */
+  tree decl_result = DECL_RESULT (decl);
 
-/* Remove the function's body but keep arguments to be reused
-   for thunk.  */
-release_body (true);
-reset ();
+  /* Remove the function's body but keep arguments to be reused
+ for thunk.  */
+  release_body (true);
+  reset ();
 
-DECL_RESULT (decl) = decl_result;
-DECL_INITIAL (decl) = NULL;
-allocate_struct_function (decl, false);
-set_cfun (NULL);
+  DECL_RESULT (decl) = decl_result;
+  DECL_INITIAL (decl) = NULL;
+  allocate_struct_function (decl, false);
+  set_cfun (NULL);
 
-/* Turn alias into thunk and expand it into GIMPLE representation.  */
-definition = true;
-thunk.thunk_p = true;
-thunk.this_adjusting = false;
+  /* Turn alias into thunk and expand it into GIMPLE representation.  */
+  definition = true;
+  thunk.thunk_p = true;
+  thunk.this_adjusting = false;
 
-cgraph_edge *e = create_edge (target, NULL, 0, CGRAPH_FREQ_BASE);
+  cgraph_edge *e = create_edge (target, NULL, 0, CGRAPH_FREQ_BASE);
 
-tree arguments = DECL_ARGUMENTS (decl);
+  tree arguments = DECL_ARGUMENTS (decl);
 
-while (arguments)
-  {
-	TREE_ADDRESSABLE (arguments) = false;
-	arguments = TREE_CHAIN (arguments);
-  }
+  while (arguments)
+{
+  TREE_ADDRESSABLE (arguments) = false;
+  arguments = TREE_CHAIN (arguments);
+}
 
-expand_thunk (false, true);
-e->call_stmt_cannot_inline_p = true;
+  expand_thunk (false, true);
+  e->call_stmt_cannot_inline_p = true;
 
-/* Inline summary set-up.  */
-analyze ();
-inline_analyze_function (this);
+  /* Inline summary set-up.  */
+  analyze ();
+  inline_analyze_function (this);
 }
 
 #include "gt-cgraphunit.h"
diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf.c
index c7ba75a..28158b3 100644
--- a/gcc/ipa-icf.c
+++ b/gcc/ipa-icf.c
@@ -427,6 +427,49 @@ sem_function::equals_private (sem_item *item,
   if (!equals_wpa (item, ignored_nodes))
 return false;
 
+  /* Checking function TARGET and OPTIMIZATION flags.  */
+  cl_target_option *tar1 = target_opts_for_fn (decl);
+  cl_target_option *tar2 = target_opts_for_fn (m_compared_func->decl);
+
+  if (tar1 != NULL 

Re: [PATCH] Fix undefined label problem after crossjumping (PR rtl-optimization/64536)

2015-01-09 Thread Richard Biener
On Fri, 9 Jan 2015, Jakub Jelinek wrote:

> On Fri, Jan 09, 2015 at 10:36:09AM +0100, Richard Biener wrote:
> > I wonder why post_order_compute calls tidy_fallthru_edges at all - won't
> > that break the just computed postorder?
> 
> Dunno, but I think it shouldn't break anything, the function doesn't remove
> any blocks, just in the typical case of an unconditional jump to the next bb
> or conditional jump to the next bb (if only successor) removes the jump and
> makes the edge EDGE_FALLTHRU.
> 
> > Other than that, why can't the issue show up with non-table-jumps?
> 
> I think tablejumps are the only case where (at least during jump2)
> code_labels live in between the basic blocks, not inside of them.
> 
> > What does it take to preserve (all) the labels?
> 
> Then we'd need to remove all the instructions in between the two basic
> blocks (as we currently do), but move any code_labels from there first to
> the start of the next basic block.  Probably better just call tablejump_p
> with non-NULL args and move precisely that code_label that it sets.
> 
> But, as I said, we'd still not optimize it if tidy_fallthru_edges is not
> called, so we'd need to do it at another place too.

Ok, I see.  I still wonder why we call tidy_fallthru_edges from
post_order_compute.  If we delete unreachable blocks that means
we at most remove incoming edges to a block - that should never
change any other edge's fallthru status...?

Does just removing that call there work and make the bug latent again?

Richard.

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild,
Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)


RE: [PATCH] Enable experimental TSAN support for Ada

2015-01-09 Thread Bernd Edlinger
Hi,

On Fri, 9 Jan 2015 10:57:14, Richard Biener wrote:
>
> On Mon, Jan 5, 2015 at 9:00 PM, Jeff Law  wrote:
>> On 01/03/15 06:49, Bernd Edlinger wrote:
>>>
>>> Hi,
>>>
>>> I was experimenting with enabling TSAN for Ada recently.
>>> I think this gives rather interesting results.
>>>
>>> The Instrumentation worked almost out of the box, we just have
>>> the problem that it is not gimple-OK to fold something like
>>> "& VIEW_CONVERT_EXPR(x)", and this happens in Ada all the time.
>>>
>>> Boot-Strapped and regression-tested on x86_64-linux-gnu.
>>> OK for trunk?
>>>
>>>
>>> Thanks
>>> Bernd.
>>>
>>>
>>>
>>> changelog-tsan-ada.txt
>>>
>>>
>>> gcc/ChangeLog:
>>> 2015-01-03 Bernd Edlinger
>>>
>>> Enable experimental TSAN support for Ada.
>>> * tsan.c (instrument_expr): Handle VIEW_CONVERT_EXPR.
>>
>> OK for the trunk with a comment before the new block of code indicating why
>> we need to handle VIEW_CONVERT_EXPR specially here (specifically we can't
>> call build_fold_addr_expr on the VIEW_CONVERT_EXPR).
>
> There may be multiple VIEW_CONVERT_EXPRs in a reference chain
> so simply stripping the outermost only doesn't work (the assert).
>

Hmm, that did not happen in any of the Ada tests in ada/acats nor in gnat.dg,
but with Ada anything may be possible...

Would you like it better if I do it this way:

  align = get_object_alignment (expr);
  if (align < BITS_PER_UNIT)
    return false;

  do
    {
      expr = TREE_OPERAND (expr, 0);
    }
  while (TREE_CODE (expr) == VIEW_CONVERT_EXPR);

  gcc_checking_assert (is_gimple_addressable (expr));
  expr_ptr = build_fold_addr_expr (unshare_expr (expr));
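
The stripping loop can be pictured with a self-contained toy model (not GCC
code; struct toy_expr, enum toy_code and strip_view_converts are names made
up for this sketch, with a single operand pointer standing in for
TREE_OPERAND (expr, 0)). Unlike the do/while above, the sketch uses a plain
while loop, so it is also safe when the starting node is not a conversion:

```c
#include <assert.h>
#include <stddef.h>

enum toy_code { TOY_DECL, TOY_COMPONENT_REF, TOY_VIEW_CONVERT };

/* Minimal expression node: a code and one operand.  */
struct toy_expr
{
  enum toy_code code;
  struct toy_expr *op0;
};

/* Peel every leading view-conversion wrapper off E and return the
   first node underneath that is not a view conversion.  Wrappers
   buried deeper in the reference chain are left alone.  */
struct toy_expr *
strip_view_converts (struct toy_expr *e)
{
  assert (e != NULL);
  while (e->code == TOY_VIEW_CONVERT)
    {
      assert (e->op0 != NULL);
      e = e->op0;
    }
  return e;
}
```

As with the loop above, only the outermost run of wrappers is removed, so a
conversion that sits below some other reference node would still be there
afterwards.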



> I wonder why you do all the special-casing when you have already
> called get_inner_reference on the reference. The address is
> simply &base + offset + bitpos / BITS_PER_UNIT, the bitfield
> case is detectable via bitpos % BITS_PER_UNIT != 0.
>

I tried that first, but for something like S.A[x].B
offset is something like a+b*x, and while we handle that in expansion,
it is pretty hard to fold gimple code for &base + offset in this case.
But it is relatively easy to fold &S.A[x] and add a constant byte offset.
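
For reference, the arithmetic in the quoted suggestion looks like this when
all operands are plain integers (a self-contained sketch, not GCC code;
byte_address_of_access is a made-up name, and BITS_PER_UNIT is assumed to be
8 here whereas in GCC it is a target macro). It does not address the folding
difficulty with a variable offset described above:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption for this sketch: 8-bit bytes.  */
#define BITS_PER_UNIT 8

/* Store BASE + OFFSET + BITPOS / BITS_PER_UNIT in *ADDR and return
   true.  Return false, leaving *ADDR untouched, when BITPOS is not a
   whole number of bytes, i.e. in the bitfield case.  */
bool
byte_address_of_access (const unsigned char *base, size_t offset,
                        size_t bitpos, const unsigned char **addr)
{
  assert (base != NULL && addr != NULL);
  if (bitpos % BITS_PER_UNIT != 0)
    return false;
  *addr = base + offset + bitpos / BITS_PER_UNIT;
  return true;
}
```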


> Sth else I noticed, instead of checking points-to in the weird way
> you do you simply want if (DECL_P (base) && !may_be_aliased (base)).
>

you mean,

  /* No need to instrument accesses to decls that don't escape,
     they can't escape to other threads then.  */
  if (DECL_P (base))
    {
      struct pt_solution pt;
      memset (&pt, 0, sizeof (pt));
      pt.escaped = 1;
      pt.ipa_escaped = flag_ipa_pta != 0;
      pt.nonlocal = 1;
      if (!pt_solution_includes (&pt, base))
        return false;
      if (!is_global_var (base) && !may_be_aliased (base))
        return false;
    }

should be equivalent to

  if (DECL_P (base) && !may_be_aliased (base))
    return false;

is that right?


Thanks,
Bernd

> Richard.
>
>> Jeff
>>
  
