Re: [Patch 0/4] PowerPC64 Linux split stack support
On Sat, Jun 13, 2015 at 12:46:18PM +0200, Andreas Schwab wrote:

/usr/bin/mkdir -p .; files=`echo ../../../../libgo/go/errors/errors.go | sed -e 's/[^ ]*\.gox//g'`; /bin/sh ./libtool --tag GO --mode=compile /daten/gcc/gcc-20150613/Build/./gcc/gccgo -B/daten/gcc/gcc-20150613/Build/./gcc/ -B/usr/powerpc64-linux/bin/ -B/usr/powerpc64-linux/lib/ -isystem /usr/powerpc64-linux/include -isystem /usr/powerpc64-linux/sys-include -m32 -O2 -g -I . -c -fgo-pkgpath=`echo errors.lo | sed -e 's/.lo$//' -e 's/-go$//'` -o errors.lo $files
libtool: compile: /daten/gcc/gcc-20150613/Build/./gcc/gccgo -B/daten/gcc/gcc-20150613/Build/./gcc/ -B/usr/powerpc64-linux/bin/ -B/usr/powerpc64-linux/lib/ -isystem /usr/powerpc64-linux/include -isystem /usr/powerpc64-linux/sys-include -m32 -O2 -g -I . -c -fgo-pkgpath=errors ../../../../libgo/go/errors/errors.go
go1: error: ‘-fsplit-stack’ currently only supported on PowerPC64 GNU/Linux with glibc-2.18 or later
go1: error: ‘-fsplit-stack’ is not supported by this compiler configuration
make[2]: *** [errors.lo] Error 1
make[2]: Leaving directory `/daten/gcc/gcc-20150613/Build/powerpc64-linux/32/libgo'
make[1]: *** [all-recursive] Error 1

This untested patch ought to fix the problem, I think. My BE test environment had gold installed but not a sufficiently recent glibc. The LE test environment of course didn't build any 32-bit multilibs. Oops.

	* configure.ac (libgo_cv_c_split_stack_supported): Unset for powerpc.
	* configure: Regenerate.

diff --git a/libgo/configure.ac b/libgo/configure.ac
index 7c403a5..2ddcdfd 100644
--- a/libgo/configure.ac
+++ b/libgo/configure.ac
@@ -366,6 +366,13 @@ esac
 AC_SUBST(OSCFLAGS)
 
 dnl Use -fsplit-stack when compiling C code if available.
+case $target in
+powerpc*-*-*)
+  # Don't use cached value.  Support is available only for 64-bit,
+  # so the result from a 64-bit multilib is not valid for 32-bit.
+  unset libgo_cv_c_split_stack_supported
+  ;;
+esac
 AC_CACHE_CHECK([whether -fsplit-stack is supported],
 [libgo_cv_c_split_stack_supported],
 [CFLAGS_hold=$CFLAGS

--
Alan Modra
Australia Development Lab, IBM
Re: New type-based pool allocator code miscompiled due to aliasing issue?
On Mon, Jun 15, 2015 at 2:09 AM, Martin Liška mli...@suse.cz wrote:
On 06/11/2015 08:19 PM, Richard Biener wrote:
On June 11, 2015 7:50:36 PM GMT+02:00, Jakub Jelinek ja...@redhat.com wrote:
On Fri, Jun 12, 2015 at 12:58:12AM +0800, pins...@gmail.com wrote:

This is just a bug in the older compiler. There was a change to fix in placement new operator. I can't find the reference right now but this is the same issue as that.

I'm not claiming 4.1 is aliasing bug free, there are various known issues in it. But, is that the case here?

  empty_var = onepart_pool (onepart).allocate ();
  empty_var->dv = dv;
  empty_var->refcount = 1;
  empty_var->n_var_parts = 0;

doesn't really seem to use operator new at all, so I'd say the bug is in all the spots that call allocate () method of the pool, but don't really use operator new.

Yeah. BTW, I see the same issue on x86_64 and on ia64 with a gcc 4.1 host compiler. I think allocate itself should use placement new, not just a static pointer conversion.

Richard.

Hi. What do you mean by calling placement new? Currently pool_allocator<T>::allocate calls placement new as a last statement in the function:

  return (T *)(header);

That is only a cast and not a placement new. Try this instead:

  return new(header) T();

Thanks,
Andrew

Martin

Jakub
Re: [PATCH][GSoC] Extend shared_ptr to support arrays
On 14/06/15 23:45 +0800, Fan You wrote:

This is the revised patch. Bootstrapped and Tested on Darwin 10.9.4. with testsuite 20_util/*

Great, it's *very* important that you can run the tests, so we know your changes haven't broken the existing code. You will also need to write new tests (under testsuite/experimental/) to check that your new code works as intended.

Tim also advises doing this:

  __shared_ptr<libfund<_Tp>> : private __shared_ptr<std::remove_extent<_Tp>>

to prevent changing everything once __shared_ptr<_Tp> has changed.

Yes, that probably makes sense. This patch is looking much better now.

I noticed this macro is wrong:

+#define __cpp_lib_experimental_shared_ptr_array 201506

The correct name and value are given in https://rawgit.com/cplusplus/fundamentals-ts/v1/fundamentals-ts.html#general.feature.test

I wonder if all the new code really needs to be in bits/shared_ptr.h or if it can just be in <experimental/memory> instead. There should be no need to declare most of it (except maybe the enable_shared_from_this parts?) when users include <memory>.
[PATCH, RFC] PR middle-end/55299, contract bitnot through ASR and rotations
Hi.

The attached patch adds new match-and-simplify patterns, which fold ~((~a) >> b) into (a >> b) for arithmetic shifts (i.e. when a is signed) and perform similar folds for rotations. It also fixes PR tree-optimization/54579 (because we already fold (-a - 1) into ~a).

A couple of questions:

1. Should we limit folding to this special case or rather introduce some canonical order of bitnot and shifts (when they are commutative)? In the latter case, which order is better: bitnot as shift/rotate operand or vice versa?

2. I noticed that some rotation patterns are folded on tree, while others are folded rather late (during second forward propagation). For example on LP64:

#define INT_BITS (sizeof (int) * 8)

unsigned int
rol(unsigned int a, unsigned int b)
{
  return a << b | a >> (INT_BITS - b);
}

INT_BITS has type unsigned long, so b and (INT_BITS - b) have different types and tree folding fails (if I change int to long, everything is OK). Should this be addressed somehow?

3. Do the new patterns require any special handling of nop-conversions?

--
Regards,
    Mikhail Maltsev

gcc/ChangeLog:

2015-06-15  Mikhail Maltsev  malts...@gmail.com

	* match.pd: (~((~X) >> Y) -> X >> Y): New pattern.
	(~((~X) r<< Y) -> X r<< Y): New pattern.
	(~((~X) r>> Y) -> X r>> Y): New pattern.

gcc/testsuite/ChangeLog:

2015-06-15  Mikhail Maltsev  malts...@gmail.com

	* gcc.dg/fold-notrotate-1.c: New test.
	* gcc.dg/fold-notshift-1.c: New test.
	* gcc.dg/fold-notshift-2.c: New test.

diff --git a/gcc/match.pd b/gcc/match.pd
index 1ab2b1c..487af72 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -696,6 +696,21 @@ along with GCC; see the file COPYING3. If not see
 	wi::eq_p (wi::lshift (@0, cand), @2))
     (cmp @1 { build_int_cst (TREE_TYPE (@1), cand); })
 
+/* ~((~X) >> Y) -> X >> Y (for arithmetic shift).  */
+(simplify
+ (bit_not (rshift (bit_not @0) @1))
+  (if (!TYPE_UNSIGNED (TREE_TYPE (@0)))
+   (rshift @0 @1)))
+
+/* Same as above, but for rotations.  */
+(for rotate (lrotate rrotate)
+ (simplify
+  (bit_not (rotate (bit_not @0) @1))
+   (rotate @0 @1)))
+
+/* TODO: ~((-X + CST) >> Y) -> (X - (CST + 1)) >> Y,
+   if overflow does not trap.  */
+
 /* Simplifications of conversions.  */
 
 /* Basic strip-useless-type-conversions / strip_nops.  */
diff --git a/gcc/testsuite/gcc.dg/fold-notrotate-1.c b/gcc/testsuite/gcc.dg/fold-notrotate-1.c
new file mode 100644
index 000..7fc43d4
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/fold-notrotate-1.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-optimized" } */
+
+#define INT_BITS  (sizeof (int) * __CHAR_BIT__)
+#define ROL(x, y) ((x) << (y) | (x) >> (INT_BITS - (y)))
+#define ROR(x, y) ((x) >> (y) | (x) << (INT_BITS - (y)))
+
+unsigned int
+rol (unsigned int a, unsigned int b)
+{
+  return ~ROL (~a, b);
+}
+
+unsigned int
+ror (unsigned int a, unsigned int b)
+{
+  return ~ROR (~a, b);
+}
+
+#define LONG_BITS  (sizeof (long) * __CHAR_BIT__)
+#define ROLL(x, y) ((x) << (y) | (x) >> (LONG_BITS - (y)))
+#define RORL(x, y) ((x) >> (y) | (x) << (LONG_BITS - (y)))
+
+unsigned long
+roll (unsigned long a, unsigned long b)
+{
+  return ~ROLL (~a, b);
+}
+
+unsigned long
+rorl (unsigned long a, unsigned long b)
+{
+  return ~RORL (~a, b);
+}
+
+/* { dg-final { scan-tree-dump-not "~" "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/fold-notshift-1.c b/gcc/testsuite/gcc.dg/fold-notshift-1.c
new file mode 100644
index 000..32a55a0
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/fold-notshift-1.c
@@ -0,0 +1,44 @@
+/* PR tree-optimization/54579
+   PR middle-end/55299 */
+
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-cddce1" } */
+
+int
+asr1 (int a, int b)
+{
+  return ~((~a) >> b);
+}
+
+long
+asr1l (long a, long b)
+{
+  return ~((~a) >> b);
+}
+
+int
+asr2 (int a, int b)
+{
+  return -((-a - 1) >> b) - 1;
+}
+
+int
+asr3 (int a, int b)
+{
+  return a < 0 ? ~((~a) >> b) : a >> b;
+}
+
+long
+asr3l (long a, int b)
+{
+  return a < 0 ? ~((~a) >> b) : a >> b;
+}
+
+int
+asr4 (int a, int b)
+{
+  return a < 0 ? -(-a - 1 >> b) - 1 : a >> b;
+}
+
+/* { dg-final { scan-tree-dump-times ">>" 6 "cddce1" } } */
+/* { dg-final { scan-tree-dump-not "~" "cddce1" } } */
diff --git a/gcc/testsuite/gcc.dg/fold-notshift-2.c b/gcc/testsuite/gcc.dg/fold-notshift-2.c
new file mode 100644
index 000..5287610
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/fold-notshift-2.c
@@ -0,0 +1,18 @@
+/* PR middle-end/55299 */
+
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-cddce1" } */
+
+unsigned int
+lsr (unsigned int a, unsigned int b)
+{
+  return ~((~a) >> b);
+}
+
+int
+sl (int a, int b)
+{
+  return ~((~a) << b);
+}
+
+/* { dg-final { scan-tree-dump-times "~" 4 "cddce1" } } */
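
The identities behind these patterns can be checked outside the compiler. The following is a minimal C++ sketch, not part of the patch: all helper names are made up for illustration, and it assumes that >> on a negative signed value is an arithmetic shift (implementation-defined before C++20, but what GCC does).

```cpp
#include <cassert>
#include <cstdint>

// Arithmetic shift right; assumes >> sign-extends negative values.
inline std::int32_t asr (std::int32_t a, unsigned b) { return a >> b; }

// Logical shift right on an unsigned value.
inline std::uint32_t lsr (std::uint32_t a, unsigned b) { return a >> b; }

// Rotate left by b bits; masking keeps both shift counts in 0..31.
inline std::uint32_t rol (std::uint32_t a, unsigned b)
{
  b &= 31;
  return (a << b) | (a >> ((32 - b) & 31));
}

// Left-hand sides of the folds: ~((~a) OP b).
inline std::int32_t not_asr_not (std::int32_t a, unsigned b) { return ~asr (~a, b); }
inline std::uint32_t not_lsr_not (std::uint32_t a, unsigned b) { return ~lsr (~a, b); }
inline std::uint32_t not_rol_not (std::uint32_t a, unsigned b) { return ~rol (~a, b); }

// Returns true if the arithmetic-shift and rotate folds hold for every
// sample value and every shift count from 0 to 31.
inline bool folds_hold ()
{
  const std::int32_t samples[]
    = { 0, 1, -1, 5, -5, 0x12345678, INT32_MIN, INT32_MAX };
  for (std::int32_t a : samples)
    for (unsigned b = 0; b < 32; ++b)
      {
        if (not_asr_not (a, b) != asr (a, b))
          return false;
        std::uint32_t u = static_cast<std::uint32_t> (a);
        if (not_rol_not (u, b) != rol (u, b))
          return false;
      }
  return true;
}
```

The logical-shift variant does not satisfy the identity: for a = 0x80000000 and b = 4, not_lsr_not yields 0xF8000000 while lsr yields 0x08000000. That is consistent with the !TYPE_UNSIGNED guard on the rshift pattern and with fold-notshift-2.c expecting the ~ operations to survive.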
Re: [Patch, fortran, PR44672, v10] [F08] ALLOCATE with SOURCE and no array-spec
Hi Thomas, hi all,

I got no objections so far, therefore committed as r224477. Thanks for the review.

Regards,
	Andre

On Thu, 11 Jun 2015 23:59:48 +0200 Thomas Koenig tkoe...@netcologne.de wrote:

Hi Andre,

please find attached an updated version of the patch. This patch simplifies some cases and ensures more straight line code. Furthermore, a bug was removed in the interfacing routine for the _vptr->_copy() routine, where not the third and fourth arguments were translated to be passed by value but the fourth and fifth (cs start counting at zero...). Bootstraps and regtests fine on x86_64-linux-gnu/f21. Ok for trunk?

Following the discussions, and looking through the patch, I would say this patch is in pretty good shape (and quite impressive, too). My vote would be to commit as is, unless something important comes up, and fix smaller problems and possible corner cases afterwards, if any exist.

However, I am not really deep into these aspects of the compiler, and I would still like to leave some time for others to comment if they think this is appropriate.

So, OK to commit in two days unless there are objections. Thanks for the patch!

Thomas

--
Andre Vehreschild * Email: vehre ad gmx dot de

Index: gcc/fortran/ChangeLog
===
--- gcc/fortran/ChangeLog	(Revision 224476)
+++ gcc/fortran/ChangeLog	(Arbeitskopie)
@@ -1,3 +1,35 @@
+2015-06-15  Andre Vehreschild  ve...@gmx.de
+
+	PR fortran/44672
+	PR fortran/45440
+	PR fortran/57307
+	* gfortran.h: Extend gfc_code.ext.alloc to carry a
+	flag indicating that the array specification has to be
+	taken from expr3.
+	* resolve.c (resolve_allocate_expr): Add F2008 notify
+	and flag indicating source driven array spec.
+	(resolve_allocate_deallocate): Check for source driven
+	array spec, when array to allocate has no explicit
+	array spec.
+	* trans-array.c (gfc_array_init_size): Get lower and
+	upper bound from a tree array descriptor, except when
+	the source expression is an array-constructor which is
+	fixed to be one-based.
+	(retrieve_last_ref): Extracted from gfc_array_allocate().
+	(gfc_array_allocate): Enable allocate(array, source=
+	array_expression) as specified by F2008:C633.
+	(gfc_conv_expr_descriptor): Add class tree expression
+	into the saved descriptor for class arrays.
+	* trans-array.h: Add temporary array descriptor to
+	gfc_array_allocate ().
+	* trans-expr.c (gfc_conv_procedure_call): Special handling
+	for _copy() routine translation, that comes without an
+	interface.  Third and fourth argument are now passed by value.
+	* trans-stmt.c (gfc_trans_allocate): Get expr3 array
+	descriptor for temporary arrays to allow allocate(array,
+	source = array_expression) for array without array
+	specification.
+
 2015-06-14  Thomas Koenig  tkoe...@gcc.gnu.org
 
 	* intrinsic.texi: Change \leq to in descrition of imaginary
Index: gcc/fortran/gfortran.h
===
--- gcc/fortran/gfortran.h	(Revision 224476)
+++ gcc/fortran/gfortran.h	(Arbeitskopie)
@@ -2395,6 +2395,9 @@
     {
       gfc_typespec ts;
       gfc_alloc *list;
+      /* Take the array specification from expr3 to allocate arrays
+	 without an explicit array specification.  */
+      unsigned arr_spec_from_expr3:1;
     }
     alloc;
 
Index: gcc/fortran/resolve.c
===
--- gcc/fortran/resolve.c	(Revision 224476)
+++ gcc/fortran/resolve.c	(Arbeitskopie)
@@ -6805,7 +6805,7 @@
    have a trailing array reference that gives the size of the array.  */
 
 static bool
-resolve_allocate_expr (gfc_expr *e, gfc_code *code)
+resolve_allocate_expr (gfc_expr *e, gfc_code *code, bool *array_alloc_wo_spec)
 {
   int i, pointer, allocatable, dimension, is_abstract;
   int codimension;
@@ -7104,13 +7104,24 @@
   if (!ref2 || ref2->type != REF_ARRAY || ref2->u.ar.type == AR_FULL
       || (dimension && ref2->u.ar.dimen == 0))
     {
-      gfc_error ("Array specification required in ALLOCATE statement "
-		 "at %L", &e->where);
-      goto failure;
+      /* F08:C633.  */
+      if (code->expr3)
+	{
+	  if (!gfc_notify_std (GFC_STD_F2008, "Array specification required "
+			       "in ALLOCATE statement at %L", &e->where))
+	    goto failure;
+	  *array_alloc_wo_spec = true;
+	}
+      else
+	{
+	  gfc_error ("Array specification required in ALLOCATE statement "
+		     "at %L", &e->where);
+	  goto failure;
+	}
     }
 
   /* Make sure that the array section reference makes sense in the
-    context of an ALLOCATE specification.  */
+     context of an ALLOCATE specification.  */
 
   ar = &ref2->u.ar;
 
@@ -7125,7 +7136,7 @@
 
   for (i = 0; i < ar->dimen; i++)
     {
-      if (ref2->u.ar.type == AR_ELEMENT)
+      if (ar->type == AR_ELEMENT || ar->type == AR_FULL)
 	goto check_symbols;
 
       switch (ar->dimen_type[i])
@@ -7202,6 +7213,7 @@
   return false;
 }
 
+
 static void
Re: [PATCH, AARCH64] movi type attribute confusion
On 12 June 2015 at 21:43, Jim Wilson jim.wil...@linaro.org wrote:

We have 5 patterns that can emit the movi instruction. These patterns map it to 4 different type attributes. The mov<mode>_aarch64 pattern uses mov_imm. The movdi_aarch64 pattern uses fmov. The movtf_aarch64 pattern uses fconstd. And the two aarch64_simd_mov<mode> patterns for VD and VQ use neon_move.

Bitwise identical instructions should always map to the same attribute type, so we need to change these patterns to agree on the right attribute. movi is an integer simd instruction, so neon_move is the only choice that makes sense.

The following patch corrects the first 3 patterns to use neon_move like the last two. We could optionally create a new type attribute, e.g. neon_move_imm. I can do that if people think it would be better.

This patch was tested with a make bootstrap and make check on an APM box running Ubuntu 14.04.

FYI This patch overlaps with my movtf-zero patch which is still waiting review, but the overlap is trivial to resolve so this should not be a problem.

The movdi* and movtf* hunks look fine.

The mov<mode>_aarch64 pattern calls aarch64_output_scalar_simd_mov_immediate which can emit either mvni, movi, with or without msl and lsl. In the case of the plain movi neon_move looks sensible, the other possible outputs should ideally be represented by logic_immediate and logic_shift_immediate. Using neon_move is a step in the right direction.

OK
/Marcus
Re: [PATCH, AARCH64] improve long double 0.0 support
On 4 June 2015 at 01:35, Jim Wilson jim.wil...@linaro.org wrote:

I noticed that poor code is emitted for a long double 0.0. This testcase

long double sub (void) { return 0.0; }
void sub2 (long double *ld) { *ld = 0.0; }

currently generates

sub:
	ldr	q0, .LC0
	ret
...
sub2:
	ldr	q0, .LC1
	str	q0, [x0]
	ret

where LC0 and LC1 are 16-byte constant table long double zeros. With the attached patch, I get

sub:
	movi	v0.2d, #0
	ret
...
sub2:
	stp	xzr, xzr, [x0]
	ret

The main problem is in aarch64_valid_floating_const, which rejects all constants for TFmode. There is a comment that says we should handle 0, but not until after the movtf pattern is improved. This improvement apparently happened two years ago with this patch

2013-05-09  Sofiane Naci  sofiane.n...@arm.com

	* config/aarch64/aarch64.md: New movtf split.
...

so this comment is no longer relevant, and we should handle 0 now. The patch deletes the out of date comment and moves the 0 check before the TFmode check so that TFmode 0 is accepted.

There are a few other changes needed to make this work well. The movtf expander needs to avoid forcing 0 to a reg for a mem dest, just like the movti pattern already does. The Ump/?rY alternative needs to be split into two, as %H doesn't work for const_double 0, again this is like the movti pattern. The condition needs to allow 0 values in operand 1, as is done in the movti pattern.

I noticed another related problem while making this change. The ldp/stp instructions in the movtf_aarch64 pattern have neon attribute types. However, these are integer instructions with matching 'r' constraints and hence should be using load2/store2 attribute types, just like in the movti pattern.

OK, Thank you
/Marcus
Re: [PATCH, ARM] (commited) attribute target (thumb,arm) [4/6]
On Wed, Jun 10, 2015 at 08:57:37AM +0100, Christian Bruel wrote:

Hi,

Committed [4/6] as attached (r224314)

thanks

Christian

On 06/08/2015 11:26 AM, Ramana Radhakrishnan wrote:
On 08/06/15 09:45, Christian Bruel wrote:

do you have other feedback for the remaining parts? many thanks

This is OK, thanks.

Ramana

Hi Christian,

This patch is causing an ICE for me in my arm-none-linux-gnueabihf testing. It looks like something isn't saving/restoring/initialising data structures needed for IRA.

I've raised https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66541 , the bug looks very similar to pr64047 ( https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64047 ).

Thanks,
James

---

.../gcc/cc1 bug.c -O2 -ftree-parallelize-loops=4 -O2 -flto -fno-use-linker-plugin -ftree-parallelize-loops=4 -o pr52429.s
 foo
Analyzing compilation unit
Performing interprocedural optimizations
 *free_lang_data visibility build_ssa_passes opt_local_passes free-inline-summary whole-program profile_estimate icf devirt cp inline pure-const static-var single-use comdatsAssembling functions:
 foo
.../gcc/testsuite/gcc.dg/torture/pr52429.c: In function ‘foo’:
.../gcc/testsuite/gcc.dg/torture/pr52429.c:11:1: internal compiler error: Segmentation fault
 int i;
 ^
0xafa15f crash_signal
	.../gcc/toplev.c:369
0x97c2b4 record_operand_costs
	.../gcc/ira-costs.c:1305
0x97c7a4 scan_one_insn
	.../gcc/ira-costs.c:1483
0x97c7a4 process_bb_for_costs
	.../gcc/ira-costs.c:1604
0x97d715 find_costs_and_classes
	.../gcc/ira-costs.c:1711
0x97ec3a ira_set_pseudo_classes(bool, _IO_FILE*)
	.../gcc/ira-costs.c:2245
0xffd743 alloc_global_sched_pressure_data
	.../gcc/haifa-sched.c:7119
0xffd743 sched_init()
	.../gcc/haifa-sched.c:7269
0xffebcf haifa_sched_init()
	.../gcc/haifa-sched.c:7281
0xaab8dc schedule_insns
	.../gcc/sched-rgn.c:3411
0xaac0b3 schedule_insns
	.../gcc/sched-rgn.c:3405
0xaac0b3 rest_of_handle_sched
	.../gcc/sched-rgn.c:3624
0xaac0b3 execute
	.../gcc/sched-rgn.c:3732
Please submit a full bug report, with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See http://gcc.gnu.org/bugs.html for instructions.
Re: [PATCH] Move gen_* stubs from defaults.h to genflags
Mikhail Maltsev malts...@gmail.com writes:
On 10.06.2015 10:05, Richard Sandiford wrote:

+/* Structure which holds data, required for generating stub gen_* function. */

No comma after "data"

+/* These instructions require default stub function. Stubs are never called.

"require a default"

[snip]

Seems like this is more naturally a hash_table rather than a hash_map. I think there's also a preference to avoid static constructor-based initialisation.

Fixed.

There again, this is a generator, so those kinds of concerns aren't particularly important. If we do keep the above though, I think we should put the hasher in hash-map-table.h now. Otherwise these FIXMEs are just going to accumulate, and each time makes it less likely that any consolidation will actually happen.

Well, after changing hash_map to hash_table, the hasher class is no longer identical to other hash traits classes. As for fixing other occurrences, I think I'd better leave it for another patch.

There are other hash_table string traits though. E.g. config/i386/winnt.c and java/jcf-io.c. Let's not add any more. FWIW I have some patches to try to clean up the hashing traits. I hope to post them later today.

Also, sorry for the runaround, but it occurred to me later that if we're getting the generators to help us with the default definitions, we might as well go one step further and move the HAVE_foo/gen_foo interface to the target structure, with the structure initialiser being filled in by the generators. I.e. rather than generating a default HAVE_foo and dummy gen_foo, we generate definitions for TARGET_HAVE_FOO and TARGET_GEN_FOO. This should remove the insn-flags.h dependency from most of the target-independent code. There'd be a .def file to list the instructions involved in the HAVE/GEN interface.

I have a patch, but also needed a string hash, so ended up spending rather too long on that instead. Hope to post it when the hashing stuff is done.

Thanks,
Richard
Re: [Patch 0/4] PowerPC64 Linux split stack support
Alan Modra amo...@gmail.com writes:

diff --git a/libgo/configure.ac b/libgo/configure.ac
index 7c403a5..2ddcdfd 100644
--- a/libgo/configure.ac
+++ b/libgo/configure.ac
@@ -366,6 +366,13 @@ esac
 AC_SUBST(OSCFLAGS)
 
 dnl Use -fsplit-stack when compiling C code if available.
+case $target in
+powerpc*-*-*)
+  # Don't use cached value.  Support is available only for 64-bit,
+  # so the result from a 64-bit multilib is not valid for 32-bit.
+  unset libgo_cv_c_split_stack_supported

Where does this cached value come from? There shouldn't be any sharing between multilib builds.

Andreas.

--
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
And now for something completely different.
Re: New type-based pool allocator code miscompiled due to aliasing issue?
On 06/11/2015 08:19 PM, Richard Biener wrote:
On June 11, 2015 7:50:36 PM GMT+02:00, Jakub Jelinek ja...@redhat.com wrote:
On Fri, Jun 12, 2015 at 12:58:12AM +0800, pins...@gmail.com wrote:

This is just a bug in the older compiler. There was a change to fix in placement new operator. I can't find the reference right now but this is the same issue as that.

I'm not claiming 4.1 is aliasing bug free, there are various known issues in it. But, is that the case here?

  empty_var = onepart_pool (onepart).allocate ();
  empty_var->dv = dv;
  empty_var->refcount = 1;
  empty_var->n_var_parts = 0;

doesn't really seem to use operator new at all, so I'd say the bug is in all the spots that call allocate () method of the pool, but don't really use operator new.

Yeah. BTW, I see the same issue on x86_64 and on ia64 with a gcc 4.1 host compiler. I think allocate itself should use placement new, not just a static pointer conversion.

Richard.

Hi. What do you mean by calling placement new? Currently pool_allocator<T>::allocate calls placement new as a last statement in the function:

  return (T *)(header);

Martin

Jakub
Re: [PATCH] Fix PR66509
Hello!

The attached patch revises the tests for the filds and fists mnemonics to use the assembly...

filds mem(%rip); fists mem(%rip)

and the test for the fildq and fistq mnemonics to use the assembly...

fildq mem(%rip); fistpq mem(%rip)

which will assemble for both 64-bit and 32-bit mode.

It won't.

$ as -32 rip.s
rip.s: Assembler messages:
rip.s:1: Error: `mem(%rip)' is not a valid base/index expression
rip.s:1: Error: `mem(%rip)' is not a valid base/index expression

Uros.
Re: aarch64 simd index out of range message not correct on 32 bit host
On 29 May 2015 at 09:32, Shiva Chen shiva0...@gmail.com wrote:

Hi, Andrew

I modified the patch as you suggested and tested it on 32-bit and 64-bit hosts. Thanks for your tips. I really appreciate your help.

Shiva

OK and committed with this ChangeLog:

2015-06-14  Shiva Chen  shiva0...@gmail.com

	* aarch64.c (aarch64_simd_lane_bounds): Change %ld to %wd for HOST_WIDE_INT parameter.

/Marcus
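
For context on why %ld is a problem here: HOST_WIDE_INT can be wider than long on a 32-bit host, so the format and the argument disagree and the message prints the wrong value. GCC's diagnostic routines use %wd for HOST_WIDE_INT arguments. Outside GCC the analogous problem is usually handled with the <cinttypes> macros. The sketch below is such an analogy in plain C++, not the code from the patch, and the name format_lane_index is made up for illustration.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: formats a 64-bit index portably.
// PRId64 expands to the correct length modifier for int64_t on the
// current host, which a hard-coded "%ld" does not guarantee.
inline std::string format_lane_index (std::int64_t lane)
{
  char buf[32];
  std::snprintf (buf, sizeof buf, "%" PRId64, lane);
  return std::string (buf);
}
```

A value that does not fit in 32 bits, such as INT64_C(1099511627776), is formatted correctly on both 32-bit and 64-bit hosts with this approach.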
Re: New type-based pool allocator code miscompiled due to aliasing issue?
On 06/15/2015 11:13 AM, Andrew Pinski wrote:
On Mon, Jun 15, 2015 at 2:09 AM, Martin Liška mli...@suse.cz wrote:
On 06/11/2015 08:19 PM, Richard Biener wrote:
On June 11, 2015 7:50:36 PM GMT+02:00, Jakub Jelinek ja...@redhat.com wrote:
On Fri, Jun 12, 2015 at 12:58:12AM +0800, pins...@gmail.com wrote:

This is just a bug in the older compiler. There was a change to fix in placement new operator. I can't find the reference right now but this is the same issue as that.

I'm not claiming 4.1 is aliasing bug free, there are various known issues in it. But, is that the case here?

  empty_var = onepart_pool (onepart).allocate ();
  empty_var->dv = dv;
  empty_var->refcount = 1;
  empty_var->n_var_parts = 0;

doesn't really seem to use operator new at all, so I'd say the bug is in all the spots that call allocate () method of the pool, but don't really use operator new.

Yeah. BTW, I see the same issue on x86_64 and on ia64 with a gcc 4.1 host compiler. I think allocate itself should use placement new, not just a static pointer conversion.

Richard.

Hi. What do you mean by calling placement new? Currently pool_allocator<T>::allocate calls placement new as a last statement in the function:

  return (T *)(header);

That is only a cast and not a placement new. Try this instead:

  return new(header) T();

Ah, I overlooked that it's not a placement new, but just static casting. Anyway, if I added:

  cselib_val () {}

to struct cselib_val and changed the cast to placement new:

  char *ptr = (char *) header;
  return new (ptr) T ();

I got the following compilation error:

In file included from ../../gcc/alias.c:46:0:
../../gcc/alloc-pool.h: In instantiation of ‘T* pool_allocator<T>::allocate() [with T = cselib_val]’:
../../gcc/cselib.h:51:27:   required from here
../../gcc/alloc-pool.h:416:23: error: no matching function for call to ‘cselib_val::operator new(sizetype, char*)’
   return new (ptr) T ();
                       ^
In file included from ../../gcc/alias.c:47:0:
../../gcc/cselib.h:49:16: note: candidate: static void* cselib_val::operator new(size_t)
   inline void *operator new (size_t)
                ^
../../gcc/cselib.h:49:16: note:   candidate expects 1 argument, 2 provided

I am wondering if I can combine an overridden new operator, which is going to internally use placement new, with a default ctor?

Martin

Thanks,
Andrew

Martin

Jakub
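
A note on the error above: a class-scope operator new hides the global placement form during the lookup done for a plain new-expression, which is why new (ptr) T () only finds cselib_val::operator new(size_t). Writing ::new makes the lookup start at global scope instead. The following is a minimal sketch of that language rule, not taken from the thread or from alloc-pool.h; val and construct_in are hypothetical stand-ins for a pool-managed type and for the allocate () method.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical stand-in for a pool-managed type such as cselib_val.
struct val
{
  int refcount;
  val () : refcount (1) {}

  // Class-scope allocation function.  Its presence hides the global
  // placement form for a plain `new (ptr) val ()` expression.
  static void *operator new (std::size_t size) { return std::malloc (size); }
  static void operator delete (void *p) { std::free (p); }
};

// Hypothetical helper: constructs a T in caller-provided storage.
template <typename T>
T *construct_in (void *storage)
{
  // The leading `::` forces lookup at global scope, so the standard
  // placement operator new (std::size_t, void *) from <new> is used
  // and T's default constructor runs on the given storage.
  return ::new (storage) T ();
}
```

With this, construct_in<val> (buffer) returns a pointer to the start of the buffer and the object is properly constructed there, which is the effect the cast alone does not provide.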
Re: [patch, testsuite] Remove -fopenmp in dg-options in libgomp.c
On Sat, Jun 06, 2015 at 12:10:00AM +0200, Tom de Vries wrote:

this patch removes superfluous -fopenmp settings. In the case of target-8.c, we remove the whole dg-options line which did not have an -On setting, which means the optimization level at which the testcase compiles is changed from -O0 to the default -O2.

Tested with a c build. OK for trunk?

Thanks,
- Tom

Remove -fopenmp in dg-options in libgomp.c

2015-06-05  Tom de Vries  t...@codesourcery.com

	* testsuite/libgomp.c/atomic-18.c: Remove superfluous -fopenmp setting in dg-options.
	* testsuite/libgomp.c/atomic-3.c: Same.
	* testsuite/libgomp.c/debug-1.c: Same.
	* testsuite/libgomp.c/nqueens-1.c: Same.
	* testsuite/libgomp.c/pr26171.c: Same.
	* testsuite/libgomp.c/pr48591.c: Same.
	* testsuite/libgomp.c/pr64824.c: Same.
	* testsuite/libgomp.c/pr64868.c: Same.
	* testsuite/libgomp.c/pr66133.c: Same.
	* testsuite/libgomp.c/pr66199-1.c: Same.
	* testsuite/libgomp.c/pr66199-2.c: Same.
	* testsuite/libgomp.c/target-8.c: Same.

This is ok.

	Jakub
Re: [AArch64] Fix predicate and constraint mismatch in logical atomic operations
On 8 May 2015 at 12:42, Richard Biener richard.guent...@gmail.com wrote:
On Tue, Nov 4, 2014 at 11:44 AM, Marcus Shawcroft marcus.shawcr...@gmail.com wrote:
On 25 September 2014 04:45, Michael Collison michael.colli...@linaro.org wrote:

On certain patterns in atomics.md the constraint 'n' is used in combination with the predicate atomic_op_operand. The constraint is too general and allows constants that are disallowed by the predicate. This causes an ICE in final_scan_insn when the insn cannot be split because the constraint and predicate do not match.

Tested on aarch64-none-elf, aarch64-linux-gnu. Additionally the original reporter of the bug (d...@ubuntu.com) applied the patch and successfully bootstrapped and tested with no regressions.

2014-09-23  Michael Collison  michael.colli...@linaro.org

	* config/aarch64/iterators.md (lconst_atomic): New mode attribute to support constraints for CONST_INT in atomic operations.
	* config/aarch64/atomics.md (atomic_<atomic_optab><mode>): Use lconst_atomic constraint.
	(atomic_nand<mode>): Likewise.
	(atomic_fetch_<atomic_optab><mode>): Likewise.
	(atomic_fetch_nand<mode>): Likewise.
	(atomic_<atomic_optab>_fetch<mode>): Likewise.
	(atomic_nand_fetch<mode>): Likewise.

OK Thanks.
/Marcus

Can you please backport this to all release branches as well?

Hi Richard,

I have tested this backport against 4.8 and 4.9 branches. It applies cleanly in both cases, shows no regression and fixes the ICE. I'm afraid it's too late for committing into the 4.8 branch? Sorry for the delay in handling this.

Christophe.

Thanks,
Richard.
Re: [gomp4.1] Add new versions of GOMP_target{,_data,_update} and GOMP_target_enter_exit_data
On Mon, Jun 15, 2015 at 03:20:37PM +0300, Ilya Verbin wrote: This patch introduces new versions of GOMP_target{,_data,_update} for OpenMP 4.1 with unsigned short for map kinds, but without new async arguments yet. I think I'd prefer (for now) to suffix the functions with _41 instead of 1 (and we'll see if we can come up with better names when async support is added). Do we need to change GOMP_target_update though (at least right now)? I mean, the construct only allows to and from clauses, not the map clause, and those don't really have an always modifier, nor release/delete semantics etc., so at least for now I think using the current GOMP_target_update should be ok. Jakub
[gomp4.1] Add new versions of GOMP_target{,_data,_update} and GOMP_target_enter_exit_data
Hi,

This patch introduces new versions of GOMP_target{,_data,_update} for OpenMP 4.1 with unsigned short for map kinds, but without new async arguments yet. make check-target-libgomp and bootstrap passed, ok for gomp-4_1-branch?

gcc/
	* builtin-types.def (BT_FN_VOID_INT_PTR_SIZE_PTR_PTR_PTR): Remove.
	(BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR): New.
	(BT_FN_VOID_INT_OMPFN_PTR_SIZE_PTR_PTR_PTR): Remove.
	* omp-builtins.def (BUILT_IN_GOMP_TARGET): Replace GOMP_target with GOMP_target1.
	(BUILT_IN_GOMP_TARGET_DATA): Replace GOMP_target_data with GOMP_target_data1.
	(BUILT_IN_GOMP_TARGET_UPDATE): Replace GOMP_target_update with GOMP_target_update1.
	(BUILT_IN_GOMP_TARGET_ENTER_EXIT_DATA): New.
	* omp-low.c (expand_omp_target): Use BUILT_IN_GOMP_TARGET_ENTER_EXIT_DATA for GF_OMP_TARGET_KIND_ENTER_DATA and GF_OMP_TARGET_KIND_EXIT_DATA. Do not pass obsolete pointer to new builtins.
	(lower_omp_target): Always use unsigned short for map kinds.
gcc/fortran/
	* types.def (BT_FN_VOID_INT_PTR_SIZE_PTR_PTR_PTR): Remove.
	(BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR): New.
	(BT_FN_VOID_INT_OMPFN_PTR_SIZE_PTR_PTR_PTR): Remove.
libgomp/
	* libgomp.map (GOMP_4.1): Add GOMP_target1, GOMP_target_data1, GOMP_target_update1, GOMP_target_enter_exit_data.
	* libgomp_g.h: Declare GOMP_target1, GOMP_target_data1, GOMP_target_update1, GOMP_target_enter_exit_data.
	* target.c (resolve_device): Call gomp_init_device here instead of GOMP_target*.
	(get_kind): Rename is_openacc to short_mapkind.
	(gomp_map_vars): Likewise.
	(gomp_unmap_vars): Likewise.
	(gomp_update): Likewise.
	(gomp_target_fallback): New static function.
	(gomp_get_target_fn_addr): New static function.
	(GOMP_target): Move host fallback and fn lookup to the new functions.
	(GOMP_target1): New function.
	(gomp_target_data_fallback): New static function.
	(GOMP_target_data): Move host fallback to the new function.
	(GOMP_target_data1): New function.
	(GOMP_target_update): Do not call gomp_init_device.
	(GOMP_target_update1): New function.
(GOMP_target_enter_exit_data): New function. diff --git a/gcc/builtin-types.def b/gcc/builtin-types.def index 492ca63..3c4b9e3 100644 --- a/gcc/builtin-types.def +++ b/gcc/builtin-types.def @@ -524,8 +524,9 @@ DEF_FUNCTION_TYPE_6 (BT_FN_BOOL_VPTR_PTR_I16_BOOL_INT_INT, BT_INT) DEF_FUNCTION_TYPE_6 (BT_FN_BOOL_SIZE_VPTR_PTR_PTR_INT_INT, BT_BOOL, BT_SIZE, BT_VOLATILE_PTR, BT_PTR, BT_PTR, BT_INT, BT_INT) -DEF_FUNCTION_TYPE_6 (BT_FN_VOID_INT_PTR_SIZE_PTR_PTR_PTR, -BT_VOID, BT_INT, BT_PTR, BT_SIZE, BT_PTR, BT_PTR, BT_PTR) +DEF_FUNCTION_TYPE_6 (BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR, +BT_VOID, BT_INT, BT_PTR_FN_VOID_PTR, BT_SIZE, BT_PTR, +BT_PTR, BT_PTR) DEF_FUNCTION_TYPE_7 (BT_FN_VOID_OMPFN_PTR_UINT_LONG_LONG_LONG_UINT, BT_VOID, BT_PTR_FN_VOID_PTR, BT_PTR, BT_UINT, @@ -534,9 +535,6 @@ DEF_FUNCTION_TYPE_7 (BT_FN_BOOL_BOOL_ULL_ULL_ULL_ULL_ULLPTR_ULLPTR, BT_BOOL, BT_BOOL, BT_ULONGLONG, BT_ULONGLONG, BT_ULONGLONG, BT_ULONGLONG, BT_PTR_ULONGLONG, BT_PTR_ULONGLONG) -DEF_FUNCTION_TYPE_7 (BT_FN_VOID_INT_OMPFN_PTR_SIZE_PTR_PTR_PTR, -BT_VOID, BT_INT, BT_PTR_FN_VOID_PTR, BT_PTR, BT_SIZE, -BT_PTR, BT_PTR, BT_PTR) DEF_FUNCTION_TYPE_8 (BT_FN_VOID_OMPFN_PTR_UINT_LONG_LONG_LONG_LONG_UINT, BT_VOID, BT_PTR_FN_VOID_PTR, BT_PTR, BT_UINT, diff --git a/gcc/fortran/types.def b/gcc/fortran/types.def index c0d3989..18f81e6 100644 --- a/gcc/fortran/types.def +++ b/gcc/fortran/types.def @@ -189,8 +189,9 @@ DEF_FUNCTION_TYPE_6 (BT_FN_BOOL_VPTR_PTR_I16_BOOL_INT_INT, BT_INT) DEF_FUNCTION_TYPE_6 (BT_FN_BOOL_SIZE_VPTR_PTR_PTR_INT_INT, BT_BOOL, BT_SIZE, BT_VOLATILE_PTR, BT_PTR, BT_PTR, BT_INT, BT_INT) -DEF_FUNCTION_TYPE_6 (BT_FN_VOID_INT_PTR_SIZE_PTR_PTR_PTR, -BT_VOID, BT_INT, BT_PTR, BT_SIZE, BT_PTR, BT_PTR, BT_PTR) +DEF_FUNCTION_TYPE_6 (BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR, +BT_VOID, BT_INT, BT_PTR_FN_VOID_PTR, BT_SIZE, BT_PTR, +BT_PTR, BT_PTR) DEF_FUNCTION_TYPE_7 (BT_FN_VOID_OMPFN_PTR_UINT_LONG_LONG_LONG_UINT, BT_VOID, BT_PTR_FN_VOID_PTR, BT_PTR, BT_UINT, @@ -199,9 +200,6 @@ DEF_FUNCTION_TYPE_7 
(BT_FN_BOOL_BOOL_ULL_ULL_ULL_ULL_ULLPTR_ULLPTR, BT_BOOL, BT_BOOL, BT_ULONGLONG, BT_ULONGLONG, BT_ULONGLONG, BT_ULONGLONG, BT_PTR_ULONGLONG, BT_PTR_ULONGLONG) -DEF_FUNCTION_TYPE_7 (BT_FN_VOID_INT_OMPFN_PTR_SIZE_PTR_PTR_PTR, -BT_VOID, BT_INT,
Re: [patch] libstdc++/66030 fix codecvt exports for mingw32
On 08/06/15 16:12 +0100, Jonathan Wakely wrote: The linker script assumes that std::mbstate_t has the name __mbstate_t for linkage purposes, but that's not necessarily true. For mingw32 it's just a typedef for int, so the patterns don't match. This adds a new mingw32-specific pattern for codecvt_byname's constructors and destructors, and relaxes the patterns for codecvt<charNN_t, char, mbstate_t> so they match __mbstate_t or int. Tested x86_64-linux and powerpc64le-linux, committed to trunk. I plan to commit this to trunk and gcc-5-branch soon. I've fully tested this on x86_64-linux and powerpc64le-linux, and also manually verified there is no change to the exported symbols on Fedora 22 for i686, s390, s390x, ppc64, ppc64le, armv7hl and aarch64, so am committing to the gcc-5-branch (approved by Jakub on IRC). commit dffce5e2b48ff19c4ec4de5d7ca934c15225b940 Author: Jonathan Wakely jwak...@redhat.com Date: Mon Jun 1 17:31:46 2015 +0100 PR libstdc++/66030 * config/abi/pre/gnu.ver: Export codecvt_byname and codecvt symbols for mingw32. diff --git a/libstdc++-v3/config/abi/pre/gnu.ver b/libstdc++-v3/config/abi/pre/gnu.ver index 2da04e4..d42cd37 100644 --- a/libstdc++-v3/config/abi/pre/gnu.ver +++ b/libstdc++-v3/config/abi/pre/gnu.ver @@ -542,6 +542,9 @@ GLIBCXX_3.4 { # std::codecvt_byname _ZNSt14codecvt_bynameI[cw]c11__mbstate_tEC[12]EPKc[jmy]; _ZNSt14codecvt_bynameI[cw]c11__mbstate_tED*; +#if defined (_WIN32) && !defined (__CYGWIN__) +_ZNSt14codecvt_bynameI[cw]ciE[CD]*; +#endif # std::collate _ZNSt7collateI[cw]*; @@ -1821,9 +1824,9 @@ GLIBCXX_3.4.21 { _ZNKSt8time_getI[cw]St19istreambuf_iteratorI[cw]St11char_traitsI[cw]EEE6do_getES3_S3_RSt8ios_baseRSt12_Ios_IostateP2tmcc; # codecvt<char16_t, char, mbstate_t>, codecvt<char32_t, char, mbstate_t> -_ZNKSt7codecvtID[is]c11__mbstate_t*; -_ZNSt7codecvtID[is]c11__mbstate_t*; -_ZT[ISV]St7codecvtID[is]c11__mbstate_tE; +_ZNKSt7codecvtID[is]c*; +_ZNSt7codecvtID[is]c*; +_ZT[ISV]St7codecvtID[is]c*E; extern "C++" {
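To make the pattern change concrete, here is a minimal sketch that uses POSIX fnmatch() as a stand-in for the linker's version-script globbing. That equivalence is an assumption (the linker's matcher is similar in spirit but not necessarily identical), and the two mangled names are written by hand for illustration: the destructor of codecvt<char16_t, char, mbstate_t>, once with mbstate_t mangled as 11__mbstate_t and once as i (int), as on mingw32.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stdbool.h>

/* Illustrative mangled names, hand-written rather than taken from a
   real symbol table. */
static const char *const glibc_sym = "_ZNSt7codecvtIDsc11__mbstate_tED1Ev";
static const char *const mingw_sym = "_ZNSt7codecvtIDsciED1Ev";

/* Patterns from the patch: before and after the relaxation. */
static const char *const old_pattern = "_ZNSt7codecvtID[is]c11__mbstate_t*";
static const char *const new_pattern = "_ZNSt7codecvtID[is]c*";

/* True if SYMBOL matches the glob PATTERN; fnmatch() returns 0 on a match. */
bool pattern_exports(const char *pattern, const char *symbol)
{
    return fnmatch(pattern, symbol, 0) == 0;
}
```

Under these assumptions the old pattern accepts only the __mbstate_t spelling, while the relaxed pattern accepts both spellings.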
Re: [patch, testsuite] Remove superfluous -std={c99,gnu99} in libgomp.c
On Sat, Jun 06, 2015 at 12:05:44AM +0200, Tom de Vries wrote: this patch removes a superfluous -std=c99 or -std=gnu99 setting in the libgomp/testsuite/libgomp.c testcases (and a superfluous -fopenmp setting in some cases as well). The setting is superfluous because -std=gnu11 is the new default for C mode, which allows loop initial declaration. I'd prefer to keep the -std=gnu99 or -std=c99, to make it clear there is a C99+ requirement. It will not hurt to have some testsuite coverage for C99 anyway. Feel free to change the dg-options into /* { dg-additional-options "-std=gnu99" } */ that is fine. Jakub
[PATCH] Fix up some omp simd array issues (PR middle-end/66429)
Hi! As Tom has reported, the for-2.c testcase ICEs at -O2 -fopenmp, because it has a noreturn function in the body and so while in omplower we decide to use "omp simd array" arrays, in ompexp there is no loop to attach the simd stuff to and I forgot to set the has_simduid_loops flag in that case (and propagate it to the outlined functions). But there is actually a bigger problem, the cleanup of the simduid internal calls and arrays has been done only in the vectorizer, but the vectorizer is part of the loop pass list that isn't run if there are no loops left, so e.g. in the testcase of noreturn loop where in the end there is no loop the cleanup wasn't performed. This patch (the omp-low.c part I can ack myself) thus clears the cfun->has_simduid_loops in the vectorizer after cleaning them up and adds a new pass gated on cfun->has_simduid_loops that performs just the cleanups. That pass will be invoked only for these pathological cases (when the vectorizer pass has not been run because there were no loops to vectorize, but still with OpenMP / Cilk+ code which has the simd directives). Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/5.2? 2015-06-15 Jakub Jelinek ja...@redhat.com PR middle-end/66429 * omp-low.c (expand_omp_taskreg): Use child_cfun instead of DECL_STRUCT_FUNCTION (child_fn). Or in has_simduid_loops and has_force_vectorize_loops flags from cfun into child_cfun. (expand_omp_simd): For broken loop, set cfun->has_simduid_loops if simduid is non-NULL. * tree-pass.h (make_pass_simduid_cleanup): New prototype. * passes.def (pass_simduid_cleanup): Add new pass after loop passes. * tree-vectorizer.c (adjust_simduid_builtins): Remove one unnecessary indirection from htab argument's type. (shrink_simd_arrays): New function. (vectorize_loops): Use it. Adjust adjust_simduid_builtins caller. Don't call adjust_simduid_builtins if there are no loops. (pass_data_simduid_cleanup, pass_simduid_cleanup): New variables. 
(pass_simduid_cleanup::execute): New method. (make_pass_simduid_cleanup): New function. * c-c++-common/gomp/pr66429.c: New test. --- gcc/omp-low.c.jj2015-06-10 11:06:13.0 +0200 +++ gcc/omp-low.c 2015-06-15 13:36:45.644277964 +0200 @@ -5589,7 +5589,9 @@ expand_omp_taskreg (struct omp_region *r vec_safe_truncate (child_cfun-local_decls, dstidx); /* Inform the callgraph about the new function. */ - DECL_STRUCT_FUNCTION (child_fn)-curr_properties = cfun-curr_properties; + child_cfun-curr_properties = cfun-curr_properties; + child_cfun-has_simduid_loops |= cfun-has_simduid_loops; + child_cfun-has_force_vectorize_loops |= cfun-has_force_vectorize_loops; cgraph_node *node = cgraph_node::get_create (child_fn); node-parallelized_function = 1; cgraph_node::add_new_function (child_fn, true); @@ -7838,6 +7840,8 @@ expand_omp_simd (struct omp_region *regi cfun-has_force_vectorize_loops = true; } } + else if (simduid) +cfun-has_simduid_loops = true; } --- gcc/tree-pass.h.jj 2015-04-17 13:50:55.0 +0200 +++ gcc/tree-pass.h 2015-06-15 14:18:50.299679523 +0200 @@ -372,6 +372,7 @@ extern gimple_opt_pass *make_pass_graphi extern gimple_opt_pass *make_pass_if_conversion (gcc::context *ctxt); extern gimple_opt_pass *make_pass_loop_distribution (gcc::context *ctxt); extern gimple_opt_pass *make_pass_vectorize (gcc::context *ctxt); +extern gimple_opt_pass *make_pass_simduid_cleanup (gcc::context *ctxt); extern gimple_opt_pass *make_pass_slp_vectorize (gcc::context *ctxt); extern gimple_opt_pass *make_pass_complete_unroll (gcc::context *ctxt); extern gimple_opt_pass *make_pass_complete_unrolli (gcc::context *ctxt); --- gcc/passes.def.jj 2015-06-10 08:18:25.0 +0200 +++ gcc/passes.def 2015-06-15 14:21:01.616671365 +0200 @@ -270,6 +270,7 @@ along with GCC; see the file COPYING3. 
PUSH_INSERT_PASSES_WITHIN (pass_tree_no_loop) NEXT_PASS (pass_slp_vectorize); POP_INSERT_PASSES () + NEXT_PASS (pass_simduid_cleanup); NEXT_PASS (pass_lower_vector_ssa); NEXT_PASS (pass_cse_reciprocals); NEXT_PASS (pass_reassoc); --- gcc/tree-vectorizer.c.jj2015-06-10 08:18:29.0 +0200 +++ gcc/tree-vectorizer.c 2015-06-15 14:31:02.548482422 +0200 @@ -171,7 +171,7 @@ simd_array_to_simduid::equal (const simd into their corresponding constants. */ static void -adjust_simduid_builtins (hash_tablesimduid_to_vf **htab) +adjust_simduid_builtins (hash_tablesimduid_to_vf *htab) { basic_block bb; @@ -203,10 +203,12 @@ adjust_simduid_builtins (hash_tablesimd gcc_assert (TREE_CODE (arg) == SSA_NAME); simduid_to_vf *p = NULL, data; data.simduid = DECL_UID (SSA_NAME_VAR (arg)); - if (*htab) - p = (*htab)-find (data); -
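The gating described in this message can be sketched with a toy model. This is not GCC code: only the has_simduid_loops flag name comes from the patch, and the struct, the counters and the function names are hypothetical stand-ins chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for GCC's per-function state.  Only has_simduid_loops
   mirrors a real field name; everything else is hypothetical. */
struct toy_function {
    int num_loops;           /* loops left when the loop passes would run */
    bool has_simduid_loops;  /* set when simduid calls/arrays may exist */
    int pending_simduid;     /* simduid artifacts still to be cleaned up */
};

/* Shared cleanup: remove the artifacts and clear the flag. */
static void cleanup_simduid(struct toy_function *fn)
{
    fn->pending_simduid = 0;
    fn->has_simduid_loops = false;
}

/* The vectorizer lives in the loop pass list, so it does nothing
   when no loops are left. */
void toy_vectorize(struct toy_function *fn)
{
    if (fn->num_loops == 0)
        return;
    cleanup_simduid(fn);
}

/* The new pass is gated on the flag, so it only does work when the
   vectorizer did not already clean up. */
void toy_simduid_cleanup(struct toy_function *fn)
{
    if (!fn->has_simduid_loops)
        return;
    cleanup_simduid(fn);
}

/* Run both passes in pipeline order; returns the leftover artifacts. */
int toy_run_passes(struct toy_function *fn)
{
    toy_vectorize(fn);
    toy_simduid_cleanup(fn);
    return fn->pending_simduid;
}
```

In this model a function with no loops keeps its artifacts after toy_vectorize alone, and loses them once the gated cleanup pass runs afterwards.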
Re: [PATCH] PR ada/66242 Front-end error if exception propagation disabled
Simon, As discussed privately, your patch is interesting but not complete enough to be integrated as is: we want to avoid not only the generation of the initialization/finalization exception handlers, but also the creation of the various variables that keep track of this process, and to remove the now redundant wrapper blocks. We may produce a more complete patch in this area in the future that will address this PR, so if you're not in a hurry, I would suggest you keep your local patch for now. Arno
[Patch, MIPS] Modify sysroot layout for mips-mti-* and mips-img-*
We (Imagination) would like to change the layout of the mips-mti-linux-gnu and mips-img-linux-gnu cross compiler toolchains. This patch, which affects nothing other than those targets, implements that change.

Prior to this patch the mti and img cross compilers used a set of nested directories for the different options. For example, mips-mti-linux-gnu put the mips32r2 big-endian hard-float system libraries directly under the sysroot. Little-endian libraries were in a /el directory, soft-float libraries were in a /sof directory. Libraries that were both soft-float and little-endian were under el/sof. 64-bit libraries were in a subdirectory called /64, mips16 libraries were in a subdirectory called /mips16, etc.

The problem with this layout is that it does not match the library layout of native MIPS linux systems. If you link a program with shared libraries and then try to move it to a native system and run it, it may not work because the libraries would not be where the executable expected them to be.

This patch changes the expected layout of the sysroot libraries and headers to look more like native systems. The basic idea is that there is one level of directories under the sysroot directory for each 'kind' of native MIPS system: mips64r2 and big-endian and soft-float for example, or mips64r6 and little-endian and hard-float. We do not have separate directories for mips32r2 and mips64r2 because under the one 'r2' directory there would be /lib, /lib32, and /lib64 directories for o32, n32, and n64 libraries so they can share one logical sysroot.

The naming convention for the directories under the global sysroot (set with SYSROOT_SUFFIX_SPEC) is:

[micro]mips[el]-r(1|2|6)[-mips16](-soft|-hard)[-nan2008][-uclibc]

[] parts are optional:
  [micro] if built for micromips.
  [el] if little-endian.
  [-mips16] if built in mips16 mode.
  [-nan2008] if built for nan2008 on a platform where that is not the default.
  [-uclibc] if built with the uclibc library (instead of glibc).

() parts are a selection:
  (1|2|6) refers to versions 1, 2, or 6 of the mips32/mips64 architectures.
  (-soft|-hard) refers to a soft-float or hard-float version.

Tested by building both toolchains and inspecting the layout and by running the GCC testsuite with a subset of the various combinations. OK for checkin? Steve Ellcey sell...@imgtec.com 2015-06-15 Steve Ellcey sell...@imgtec.com * config/mips/mti-linux.h (MIPS_SYSVERSION_SPEC): New. (SYSROOT_SUFFIX_SPEC): Update. (SYSROOT_HEADERS_SUFFIX_SPEC): New. (STARTFILE_PREFIX_SPEC): Update. * config/mips/t-mti-linux (MULTILIB_EXCEPTIONS): Remove. (MULTILIB_REQUIRED): New. (MULTILIB_OSDIRNAMES): New. * config/mips/t-img-linux (MULTILIB_EXCEPTIONS): Remove. (MULTILIB_REQUIRED): New. (MULTILIB_OSDIRNAMES): New. diff --git a/gcc/config/mips/mti-linux.h b/gcc/config/mips/mti-linux.h index 80d5925..32b84d1 100644 --- a/gcc/config/mips/mti-linux.h +++ b/gcc/config/mips/mti-linux.h @@ -18,16 +18,20 @@ along with GCC; see the file COPYING3. If not see http://www.gnu.org/licenses/. */ /* This target is a multilib target, specify the sysroot paths. */ +#define MIPS_SYSVERSION_SPEC \ + %{mips32:r1}%{mips64:r1}%{mips32r2:r2}%{mips64r2:r2}%{mips32r6:r6}%{mips64r6:r6}%{mips16:-mips16} + #undef SYSROOT_SUFFIX_SPEC -#if MIPS_ISA_DEFAULT == 33 /* mips32r2 is the default */ -#define SYSROOT_SUFFIX_SPEC \ - %{mips32:/mips32}%{mips64:/mips64}%{mips64r2:/mips64r2}%{mips32r6:/mips32r6}%{mips64r6:/mips64r6}%{mips16:/mips16}%{mmicromips:/micromips}%{mabi=64:/64}%{mel|EL:/el}%{msoft-float:/sof}%{!mips32r6:%{!mips64r6:%{mnan=2008:/nan2008}}} -#elif MIPS_ISA_DEFAULT == 37 /* mips32r6 is the default */ #define SYSROOT_SUFFIX_SPEC \ - %{mips32:/mips32}%{mips64:/mips64}%{mips32r2:/mips32r2}%{mips64r2:/mips64r2}%{mips64r6:/mips64r6}%{mips16:/mips16}%{mmicromips:/micromips}%{mabi=64:/64}%{mel|EL:/el}%{msoft-float:/sof}%{!mips32r6:%{!mips64r6:%{mnan=2008:/nan2008}}} -#else /* Unexpected default ISA. 
*/ -#error No SYSROOT_SUFFIX_SPEC exists for this default ISA -#endif + /%{mmicromips:micro}mips%{mel|EL:el}-MIPS_SYSVERSION_SPEC%{msoft-float:-soft;:-hard}%{!mips32r6:%{!mips64r6:%{mnan=2008:-nan2008}}}%{muclibc:-uclibc} + +#define SYSROOT_HEADERS_SUFFIX_SPEC SYSROOT_SUFFIX_SPEC + +#undef STARTFILE_PREFIX_SPEC +#define STARTFILE_PREFIX_SPEC \ + %{mabi=32: /usr/local/lib/ /lib/ /usr/lib/} \ + %{mabi=n32: /usr/local/lib32/ /lib32/ /usr/lib32/} \ + %{mabi=64: /usr/local/lib64/ /lib64/ /usr/lib64/} #undef DRIVER_SELF_SPECS #define DRIVER_SELF_SPECS \ diff --git a/gcc/config/mips/t-img-linux b/gcc/config/mips/t-img-linux index 86b0a26..93d81920 100644 --- a/gcc/config/mips/t-img-linux +++ b/gcc/config/mips/t-img-linux @@ -23,8 +23,16 @@ MULTILIB_OPTIONS = mips64r6
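As a rough illustration of the naming convention, the sketch below composes a sysroot suffix from a set of option flags, in the same order as the new SYSROOT_SUFFIX_SPEC. The struct and function names are hypothetical; the real work is done by the GCC driver expanding the spec string, not by code like this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical option set; the field names are for illustration only. */
struct mips_opts {
    bool micromips, little_endian, mips16, soft_float, nan2008, uclibc;
    int rev;  /* architecture revision: 1, 2 or 6 */
};

/* Writes the suffix into BUF (SIZE bytes) and returns snprintf's result. */
int sysroot_suffix(const struct mips_opts *o, char *buf, size_t size)
{
    /* nan2008 is only spelled out where it is not the default, i.e. not
       for r6; this mirrors the %{!mips32r6:%{!mips64r6:...}} guard. */
    bool show_nan = o->nan2008 && o->rev != 6;

    return snprintf(buf, size, "/%smips%s-r%d%s%s%s%s",
                    o->micromips ? "micro" : "",
                    o->little_endian ? "el" : "",
                    o->rev,
                    o->mips16 ? "-mips16" : "",
                    o->soft_float ? "-soft" : "-hard",
                    show_nan ? "-nan2008" : "",
                    o->uclibc ? "-uclibc" : "");
}
```

With only rev set to 2 this yields /mips-r2-hard; a little-endian r6 configuration with nan2008 requested yields /mipsel-r6-hard, since the nan2008 part is dropped for r6.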
Re: [C++/58583] ICE instantiating NSDMIs
OK, thanks. Jason
Re: arm memcpy of aligned data
On 15/06/15 15:30, Kyrill Tkachov wrote: On 29/05/15 11:15, Kyrill Tkachov wrote: On 29/05/15 10:08, Kyrill Tkachov wrote: Hi Mike, On 28/05/15 22:15, Mike Stump wrote: So, the arm memcpy code of aligned data isn’t as good as it can be. void *memcpy(void *dest, const void *src, unsigned int n); void foo(char *dst, int i) { memcpy (dst, &i, sizeof (i)); } generates horrible code, but, if we are willing to notice the src or the destination are aligned, we can do much better: $ ./cc1 -fschedule-fusion -fdump-tree-all-all -da -march=armv7ve -mcpu=cortex-m4 -fomit-frame-pointer -quiet -O2 /tmp/t.c -o t.s $ cat t.s [ … ] foo: @ args = 0, pretend = 0, frame = 4 @ frame_needed = 0, uses_anonymous_args = 0 @ link register save eliminated. sub sp, sp, #4 str r1, [r0] @ unaligned add sp, sp, #4 I think there's something to do with cpu tuning here as well. That being said, I do think this is a good idea. I'll give it a test. The patch passes bootstrap and testing ok and I've seen it improve codegen in a few places in SPEC. I've added a testcase all marked up. Mike, I'll commit the attached patch in 24 hours unless somebody objects. Thanks, Kyrill 2015-06-15 Mike Stump mikest...@comcast.net * config/arm/arm.c (arm_block_move_unaligned_straight): Emit normal move instead of unaligned load when source or destination are appropriately aligned. 2015-06-15 Mike Stump mikest...@comcast.net Kyrylo Tkachov kyrylo.tkac...@arm.com * gcc.target/arm/memcpy-aligned-1.c: New test. My only question would be whether this should be pushed down into gen_unaligned_{load|store}si, so that all callers would benefit? R. Kyrill For the code you've given compiled with -O2 -mcpu=cortex-a53 I get: sub sp, sp, #8 mov r2, r0 add r3, sp, #8 str r1, [r3, #-4]! 
ldr r0, [r3]@ unaligned str r0, [r2]@ unaligned add sp, sp, #8 @ sp needed bx lr whereas for -O2 -mcpu=cortex-a57 I get the much better: sub sp, sp, #8 str r1, [r0]@ unaligned add sp, sp, #8 @ sp needed bx lr Kyrill Index: gcc/config/arm/arm.c === --- gcc/config/arm/arm.c(revision 223842) +++ gcc/config/arm/arm.c(working copy) @@ -14376,7 +14376,10 @@ arm_block_move_unaligned_straight (rtx d srcoffset + j * UNITS_PER_WORD - src_autoinc); mem = adjust_automodify_address (srcbase, SImode, addr, srcoffset + j * UNITS_PER_WORD); - emit_insn (gen_unaligned_loadsi (regs[j], mem)); + if (src_aligned) +emit_move_insn (regs[j], mem); + else +emit_insn (gen_unaligned_loadsi (regs[j], mem)); } srcoffset += words * UNITS_PER_WORD; } @@ -14395,7 +14398,10 @@ arm_block_move_unaligned_straight (rtx d dstoffset + j * UNITS_PER_WORD - dst_autoinc); mem = adjust_automodify_address (dstbase, SImode, addr, dstoffset + j * UNITS_PER_WORD); - emit_insn (gen_unaligned_storesi (mem, regs[j])); + if (dst_aligned) +emit_move_insn (mem, regs[j]); + else +emit_insn (gen_unaligned_storesi (mem, regs[j])); } dstoffset += words * UNITS_PER_WORD; } Ok? Can someone spin this through an arm test suite run for me, I was doing this by inspection and cross compile on a system with no arm bits. Bonus points if you can check it in with the test case above marked up as appropriate. 
arm-memcpy-aligned.patch commit 77191f4224c8729d014a9150bd9364f95ff704b0 Author: Kyrylo Tkachov kyrylo.tkac...@arm.com Date: Fri May 29 10:44:21 2015 +0100 [ARM] arm memcpy of aligned data diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 638d659..3a33c26 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -14283,7 +14283,10 @@ arm_block_move_unaligned_straight (rtx dstbase, rtx srcbase, srcoffset + j * UNITS_PER_WORD - src_autoinc); mem = adjust_automodify_address (srcbase, SImode, addr, srcoffset + j * UNITS_PER_WORD); - emit_insn (gen_unaligned_loadsi (regs[j], mem)); + if (src_aligned) + emit_move_insn (regs[j], mem); + else + emit_insn (gen_unaligned_loadsi (regs[j], mem)); } srcoffset += words * UNITS_PER_WORD; } @@ -14302,7 +14305,10 @@ arm_block_move_unaligned_straight (rtx dstbase, rtx srcbase, dstoffset + j * UNITS_PER_WORD - dst_autoinc); mem = adjust_automodify_address (dstbase, SImode, addr,
Re: arm memcpy of aligned data
On 29/05/15 11:15, Kyrill Tkachov wrote: On 29/05/15 10:08, Kyrill Tkachov wrote: Hi Mike, On 28/05/15 22:15, Mike Stump wrote: So, the arm memcpy code of aligned data isn’t as good as it can be. void *memcpy(void *dest, const void *src, unsigned int n); void foo(char *dst, int i) { memcpy (dst, &i, sizeof (i)); } generates horrible code, but, if we are willing to notice the src or the destination are aligned, we can do much better: $ ./cc1 -fschedule-fusion -fdump-tree-all-all -da -march=armv7ve -mcpu=cortex-m4 -fomit-frame-pointer -quiet -O2 /tmp/t.c -o t.s $ cat t.s [ … ] foo: @ args = 0, pretend = 0, frame = 4 @ frame_needed = 0, uses_anonymous_args = 0 @ link register save eliminated. sub sp, sp, #4 str r1, [r0] @ unaligned add sp, sp, #4 I think there's something to do with cpu tuning here as well. That being said, I do think this is a good idea. I'll give it a test. The patch passes bootstrap and testing ok and I've seen it improve codegen in a few places in SPEC. I've added a testcase all marked up. Mike, I'll commit the attached patch in 24 hours unless somebody objects. Thanks, Kyrill 2015-06-15 Mike Stump mikest...@comcast.net * config/arm/arm.c (arm_block_move_unaligned_straight): Emit normal move instead of unaligned load when source or destination are appropriately aligned. 2015-06-15 Mike Stump mikest...@comcast.net Kyrylo Tkachov kyrylo.tkac...@arm.com * gcc.target/arm/memcpy-aligned-1.c: New test. Kyrill For the code you've given compiled with -O2 -mcpu=cortex-a53 I get: sub sp, sp, #8 mov r2, r0 add r3, sp, #8 str r1, [r3, #-4]! 
ldr r0, [r3]@ unaligned str r0, [r2]@ unaligned add sp, sp, #8 @ sp needed bx lr whereas for -O2 -mcpu=cortex-a57 I get the much better: sub sp, sp, #8 str r1, [r0]@ unaligned add sp, sp, #8 @ sp needed bx lr Kyrill Index: gcc/config/arm/arm.c === --- gcc/config/arm/arm.c(revision 223842) +++ gcc/config/arm/arm.c(working copy) @@ -14376,7 +14376,10 @@ arm_block_move_unaligned_straight (rtx d srcoffset + j * UNITS_PER_WORD - src_autoinc); mem = adjust_automodify_address (srcbase, SImode, addr, srcoffset + j * UNITS_PER_WORD); - emit_insn (gen_unaligned_loadsi (regs[j], mem)); + if (src_aligned) + emit_move_insn (regs[j], mem); + else + emit_insn (gen_unaligned_loadsi (regs[j], mem)); } srcoffset += words * UNITS_PER_WORD; } @@ -14395,7 +14398,10 @@ arm_block_move_unaligned_straight (rtx d dstoffset + j * UNITS_PER_WORD - dst_autoinc); mem = adjust_automodify_address (dstbase, SImode, addr, dstoffset + j * UNITS_PER_WORD); - emit_insn (gen_unaligned_storesi (mem, regs[j])); + if (dst_aligned) + emit_move_insn (mem, regs[j]); + else + emit_insn (gen_unaligned_storesi (mem, regs[j])); } dstoffset += words * UNITS_PER_WORD; } Ok? Can someone spin this through an arm test suite run for me, I was doing this by inspection and cross compile on a system with no arm bits. Bonus points if you can check it in with the test case above marked up as appropriate. 
commit 77191f4224c8729d014a9150bd9364f95ff704b0 Author: Kyrylo Tkachov kyrylo.tkac...@arm.com Date: Fri May 29 10:44:21 2015 +0100 [ARM] arm memcpy of aligned data diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 638d659..3a33c26 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -14283,7 +14283,10 @@ arm_block_move_unaligned_straight (rtx dstbase, rtx srcbase, srcoffset + j * UNITS_PER_WORD - src_autoinc); mem = adjust_automodify_address (srcbase, SImode, addr, srcoffset + j * UNITS_PER_WORD); - emit_insn (gen_unaligned_loadsi (regs[j], mem)); + if (src_aligned) + emit_move_insn (regs[j], mem); + else + emit_insn (gen_unaligned_loadsi (regs[j], mem)); } srcoffset += words * UNITS_PER_WORD; } @@ -14302,7 +14305,10 @@ arm_block_move_unaligned_straight (rtx dstbase, rtx srcbase, dstoffset + j * UNITS_PER_WORD - dst_autoinc); mem = adjust_automodify_address (dstbase, SImode, addr, dstoffset + j * UNITS_PER_WORD); - emit_insn (gen_unaligned_storesi (mem, regs[j])); + if (dst_aligned) + emit_move_insn (mem, regs[j]); + else + emit_insn (gen_unaligned_storesi (mem, regs[j])); } dstoffset += words * UNITS_PER_WORD; } diff --git
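The idea behind the patch, choosing a plain move or an unaligned access separately for the load and the store depending on known alignment, can be sketched in portable C. This is only an analogy under stated assumptions: the patch decides at compile time inside the ARM back end, whereas this hypothetical copy_word checks pointers at run time, and the memcpy calls merely stand in for a plain word load or store.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>
#include <string.h>

/* Which sides of the copy took the aligned path (hypothetical names). */
enum { LOAD_ALIGNED = 1, STORE_ALIGNED = 2 };

static int word_aligned(const void *p)
{
    return (uintptr_t)p % alignof(uint32_t) == 0;
}

/* Copies one 32-bit word from SRC to DST and returns a mask saying
   which sides were aligned; load and store are decided independently,
   like the src_aligned and dst_aligned checks in the patch. */
int copy_word(unsigned char *dst, const unsigned char *src)
{
    uint32_t w;
    unsigned char *wb = (unsigned char *)&w;
    int paths = 0;

    if (word_aligned(src)) {
        memcpy(&w, src, sizeof w);   /* stands in for a plain word load */
        paths |= LOAD_ALIGNED;
    } else {
        for (size_t i = 0; i < sizeof w; i++)   /* unaligned load */
            wb[i] = src[i];
    }

    if (word_aligned(dst)) {
        memcpy(dst, &w, sizeof w);   /* stands in for a plain word store */
        paths |= STORE_ALIGNED;
    } else {
        for (size_t i = 0; i < sizeof w; i++)   /* unaligned store */
            dst[i] = wb[i];
    }
    return paths;
}
```

Both paths copy the same bytes; only the access strategy differs, which is why the change is safe whenever the alignment information is correct.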
Re: [PATCH] Adding warning for constexpr's
Hi, thanks for your answer. I tried warning_at this way, instead of what I was doing before: + else if (TREE_CODE(init) == CALL_EXPR) +{ + tree fn = TREE_OPERAND(CALL_EXPR_FN(init), 0); + if (DECL_DECLARED_CONSTEXPR_P(fn) && warn_constexpr) + warning_at (DECL_SOURCE_LINE(decl), OPT_Wconstexpr, "function %q+F cannot be evaluated at compile time", fn); + where decl is the declaration that is being initialized. For some reason this isn't working. Checking with gdb I realised that warning_at in this case is returning false. I don't know if I am missing something. 2015-06-12 18:35 GMT-03:00 Joseph Myers jos...@codesourcery.com: On Fri, 12 Jun 2015, Andres Tiraboschi wrote: Hi, this patch is for adding a warning when a constexpr cannot be evaluated at compile time. This is a single case: type var = fun(args...), with fun declared as a constexpr. All options need documenting in invoke.texi. All diagnostics need testcases added to the testsuite. C++-specific options go in c.opt and should be listed as C++ ObjC++, not Common. All new diagnostics should use warning_at etc. with explicit locations passed, unless there is some strong reason it's hard to get the relevant location when the warning is given. -- Joseph S. Myers jos...@codesourcery.com
Re: [C++17] Implement N3928 - Extending static_assert
On 05/20/2015 11:28 AM, Jason Merrill wrote: On 05/02/2015 04:16 PM, Ed Smith-Rowland wrote: This extends static_assert to not require a message string. I elected to make this work also for C++11 and C++14 and warn only with -pedantic. I think many people just write static_assert(thing, ""); . I took the path of building an empty string in the parser in this case. I wasn't sure if setting message to NULL_TREE would cause sadness later on or not. Hmm. Yes, this technically implements the feature, but my impression of the (non-normative) intent was that they wanted leaving out the string to print the argument expression, in about the same way as #define BOOST_STATIC_ASSERT( B ) static_assert(B, #B) So the patch is OK as is, but you might also look into some libcpp magic to insert a second argument that stringizes the first. Are you planning to check this in? Jason
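The stringizing trick mentioned above can be tried in plain C11, where _Static_assert also takes a message argument. This is a small sketch of the macro approach only, not of the suggested libcpp change; MY_STATIC_ASSERT, EXPR_TEXT and example_message are names made up for the illustration.

```c
#include <assert.h>
#include <string.h>

/* Stringize the tested expression so it can double as the message. */
#define EXPR_TEXT(B) #B

/* C11 analogue of the Boost-style macro quoted above. */
#define MY_STATIC_ASSERT(B) _Static_assert(B, #B)

/* Holds on every conforming implementation, so this compiles quietly. */
MY_STATIC_ASSERT(sizeof(char) == 1);

/* Returns the text a failing MY_STATIC_ASSERT(sizeof(int) >= 2)
   would carry as its message. */
const char *example_message(void)
{
    return EXPR_TEXT(sizeof(int) >= 2);
}
```

The macro form costs nothing at run time; the diagnostic text is simply the source spelling of the condition.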
RE: [Patch MIPS] Enable TARGET_IRA_CHANGE_PSEUDO_ALLOCNO_CLASS hook
Hi Matthew, /* LRA will allocate an FPR for an integer mode pseudo instead of spilling to memory if an FPR is present in the allocno class. It is rare that we actually need to place an integer mode value in an FPR so where possible limit the allocation to GR_REGS. This will slightly pessimize code that involves integer to/from float conversions as these will have to reload into FPRs in LRA. Such reloads are sometimes eliminated and sometimes only partially eliminated. We choose to take this penalty in order to eliminate usage of FPRs in code that does not use floating point data. This change has a similar effect to increasing the cost of FPR-GPR register moves for integer modes so that they are higher than the cost of memory but changing the allocno class is more reliable. This is also similar to forbidding integer mode values in FPRs entirely but this would lead to an inconsistency in the integer to/from float instructions that say integer mode values must be placed in FPRs. */ I'm keen to get the description of this right so please feel free to change it further if it isn't clear (or correct). This description is definitely more accurate. At first glance, I wasn't sure whether to include the bit about partial elimination since it is caused by post-reload passes, but the introduction of reloads should be avoided in the first place. I don't know if this change will lead to classic reload being unusable for MIPS. I'm not worried about that but I think it is probably wise to remove classic reload support for MIPS now; we are dependent on LRA for several features already. Indeed. I think it's the right time to drop the classic reload and resolve LRA issues if they come up. Do you have any details on when we are left with suboptimal code for int-float conversions? I'd like to keep a record of them in this thread or in the comment so we know what is left to fix. 
I opened a bug report https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66204 and just realised that I had never CCed Vlad. Hopefully, this is the last remaining issue related to integers and floating-point registers. + if (INTEGRAL_MODE_P (PSEUDO_REGNO_MODE (regno)) && allocno_class == ALL_REGS) +return GR_REGS; + return allocno_class; +} + Trim the extra trailing newline. OK to commit if you are happy with the comment. I'll update the comment, do a quick regression and commit it. Regards, Robert
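The decision made by the hook is small enough to model outside the compiler. The sketch below is a hypothetical miniature: the enum and function names are made up, and a bool replaces the INTEGRAL_MODE_P (PSEUDO_REGNO_MODE (regno)) query, so it shows only the shape of the rule, not the real hook.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical miniature of the register classes involved. */
enum toy_reg_class { TOY_GR_REGS, TOY_FP_REGS, TOY_ALL_REGS };

/* Narrow ALL_REGS to the general registers for integer-mode pseudos;
   leave every other combination untouched. */
enum toy_reg_class
toy_change_pseudo_allocno_class(bool integral_mode,
                                enum toy_reg_class allocno_class)
{
    if (integral_mode && allocno_class == TOY_ALL_REGS)
        return TOY_GR_REGS;
    return allocno_class;
}
```

Note that a class that is already restricted to FPRs is left alone, which is what keeps the integer to/from float instructions usable.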
Re: [4.8, testsuite] Correct backported fix to gcc.dg/vect/vect-33.c
On June 15, 2015 9:58:33 PM GMT+02:00, Bill Schmidt wschm...@linux.vnet.ibm.com wrote: I just was reading the gcc mailing list and realized that changes to 4.8 now require release manager approval. Adding Richard to the CC list for consideration. Thanks! OK. Richard. Bill On Mon, 2015-06-15 at 14:54 -0500, Bill Schmidt wrote: Hi, When I backported support for unaligned vector load/store operations on POWER8 to GCC 4.8, I fumbled the change for gcc.dg/vect/vect-33.c. One of the original tests was: /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */ which I modified to /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" { target { ! vect_hw_misalign } } } } */ This caused the test to be skipped for architectures other than PowerPC, which was a mistake. The correct test should have been: /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" { target { { ! powerpc*-*-* } || { ! vect_hw_misalign } } } } } */ which leaves things alone for other architectures. Ok for 4.8? Thanks, Bill 2015-06-15 Bill Schmidt wschm...@linux.vnet.ibm.com * gcc.dg/vect/vect-33.c: Don't exclude Vectorizing an unaligned access test for non-PowerPC arches. Index: gcc/testsuite/gcc.dg/vect/vect-33.c === --- gcc/testsuite/gcc.dg/vect/vect-33.c (revision 224490) +++ gcc/testsuite/gcc.dg/vect/vect-33.c (working copy) @@ -38,7 +38,7 @@ int main (void) /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */ -/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" { target { ! vect_hw_misalign } } } } */ +/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" { target { { ! powerpc*-*-* } || { ! vect_hw_misalign } } } } } */ /* { dg-final { scan-tree-dump "Alignment of access forced using peeling" "vect" { target vector_alignment_reachable } } } */ /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { {! 
vector_alignment_reachable} && {! vect_hw_misalign} } } } } */ /* { dg-final { cleanup-tree-dump "vect" } } */
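The difference between the two target selectors is plain boolean logic, which the following sketch spells out. The function names are hypothetical, and the two bool parameters stand in for the powerpc*-*-* triplet match and the vect_hw_misalign effective-target check.

```c
#include <assert.h>
#include <stdbool.h>

/* Selector from the faulty backport: { ! vect_hw_misalign }.
   The target triplet plays no role. */
bool check_runs_backport(bool is_powerpc, bool hw_misalign)
{
    (void)is_powerpc;
    return !hw_misalign;
}

/* Corrected selector: { { ! powerpc*-*-* } || { ! vect_hw_misalign } }. */
bool check_runs_fixed(bool is_powerpc, bool hw_misalign)
{
    return !is_powerpc || !hw_misalign;
}
```

The two selectors differ only for non-PowerPC targets that report vect_hw_misalign: the backport skipped the check there, while the fix runs it again, as 4.8 did before the backport.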
Re: [PATCH] rtx_costs vs. const_int
On Mon, Jun 15, 2015 at 01:37:48PM -0500, Segher Boessenkool wrote: sub-rtx's in rtx_cost are summed in order rather than reverse order so that the mode from an earlier operand can be used for a later operand lacking a mode. This is for ZERO_EXTEND and similar codes where the sub-rtx mode is different to the outer mode. Canonicalization puts const_int operands after other operands. Not always; only for commutative operations, or operations that can be swapped (like comparisons). As a counterexample, see PowerPC subfic (imm - reg). Hmm. Actually, on thinking over this some more I reckon I don't need the change for sub-rtx's at all. Clearly not for ZERO_EXTEND which only has one operand. To take your counter example, (zero_extend (minus (const_int) (reg))) will use the mode of the MINUS for the CONST_INT. Please consider that part of the patch removed. The testing showed some pre-existing bugs.. arc-elf dies on attempting to assemble first libgcc file, due to gas not understanding the options being passed by gcc. Apparently no one cared enough to push gas changes upstream. You need to configure it with --with-cpu. Not that that makes any sense :-) Yes, I wouldn't have got past configure without adding --with-cpu. --with-cpu doesn't solve the problem I'm talking about, which is that you need an assembler built from a branch. Mainline arc binutils does not support mainline arc gcc. -- Alan Modra Australia Development Lab, IBM
Re: [PATCH] Remove dg-options -O2 in libgomp.c
On Mon, Jun 15, 2015 at 06:31:01PM +0200, Tom de Vries wrote: Hi, this patch removes superfluous dg-option -O2 settings in testsuite/libgomp.c. The setting is superfluous, because DEFAULT_CFLAGS is already set to -O2 in c.exp. Tested on x86_64. OK for trunk? Ok, thanks. Jakub
Re: [PATCH] Fix up some omp simd array issues (PR middle-end/66429)
On June 15, 2015 5:46:46 PM GMT+02:00, Jakub Jelinek ja...@redhat.com wrote: Hi! As Tom has reported, the for-2.c testcase ICEs at -O2 -fopenmp, because it has a noreturn function in the body and so while in omplower we decide to use "omp simd array" arrays, in ompexp there is no loop to attach the simd stuff to and I forgot to set the has_simduid_loops flag in that case (and propagate it to the outlined functions). But there is actually a bigger problem, the cleanup of the simduid internal calls and arrays has been done only in the vectorizer, but the vectorizer is part of the loop pass list that isn't run if there are no loops left, so e.g. in the testcase of noreturn loop where in the end there is no loop the cleanup wasn't performed. This patch (the omp-low.c part I can ack myself) thus clears the cfun->has_simduid_loops in the vectorizer after cleaning them up and adds a new pass gated on cfun->has_simduid_loops that performs just the cleanups. That pass will be invoked only for these pathological cases (when the vectorizer pass has not been run because there were no loops to vectorize, but still with OpenMP / Cilk+ code which has the simd directives). Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/5.2? OK. Though I wonder whether this should be a property like the vector and complex lowered state properties we have. Thanks Richard. 2015-06-15 Jakub Jelinek ja...@redhat.com PR middle-end/66429 * omp-low.c (expand_omp_taskreg): Use child_cfun instead of DECL_STRUCT_FUNCTION (child_fn). Or in has_simduid_loops and has_force_vectorize_loops flags from cfun into child_cfun. (expand_omp_simd): For broken loop, set cfun->has_simduid_loops if simduid is non-NULL. * tree-pass.h (make_pass_simduid_cleanup): New prototype. * passes.def (pass_simduid_cleanup): Add new pass after loop passes. * tree-vectorizer.c (adjust_simduid_builtins): Remove one unnecessary indirection from htab argument's type. (shrink_simd_arrays): New function. 
(vectorize_loops): Use it. Adjust adjust_simduid_builtins caller. Don't call adjust_simduid_builtins if there are no loops. (pass_data_simduid_cleanup, pass_simduid_cleanup): New variables. (pass_simduid_cleanup::execute): New method. (make_pass_simduid_cleanup): New function. * c-c++-common/gomp/pr66429.c: New test. --- gcc/omp-low.c.jj 2015-06-10 11:06:13.0 +0200 +++ gcc/omp-low.c 2015-06-15 13:36:45.644277964 +0200 @@ -5589,7 +5589,9 @@ expand_omp_taskreg (struct omp_region *r vec_safe_truncate (child_cfun-local_decls, dstidx); /* Inform the callgraph about the new function. */ - DECL_STRUCT_FUNCTION (child_fn)-curr_properties = cfun-curr_properties; + child_cfun-curr_properties = cfun-curr_properties; + child_cfun-has_simduid_loops |= cfun-has_simduid_loops; + child_cfun-has_force_vectorize_loops |= cfun-has_force_vectorize_loops; cgraph_node *node = cgraph_node::get_create (child_fn); node-parallelized_function = 1; cgraph_node::add_new_function (child_fn, true); @@ -7838,6 +7840,8 @@ expand_omp_simd (struct omp_region *regi cfun-has_force_vectorize_loops = true; } } + else if (simduid) +cfun-has_simduid_loops = true; } --- gcc/tree-pass.h.jj 2015-04-17 13:50:55.0 +0200 +++ gcc/tree-pass.h2015-06-15 14:18:50.299679523 +0200 @@ -372,6 +372,7 @@ extern gimple_opt_pass *make_pass_graphi extern gimple_opt_pass *make_pass_if_conversion (gcc::context *ctxt); extern gimple_opt_pass *make_pass_loop_distribution (gcc::context *ctxt); extern gimple_opt_pass *make_pass_vectorize (gcc::context *ctxt); +extern gimple_opt_pass *make_pass_simduid_cleanup (gcc::context *ctxt); extern gimple_opt_pass *make_pass_slp_vectorize (gcc::context *ctxt); extern gimple_opt_pass *make_pass_complete_unroll (gcc::context *ctxt); extern gimple_opt_pass *make_pass_complete_unrolli (gcc::context *ctxt); --- gcc/passes.def.jj 2015-06-10 08:18:25.0 +0200 +++ gcc/passes.def 2015-06-15 14:21:01.616671365 +0200 @@ -270,6 +270,7 @@ along with GCC; see the file COPYING3. 
PUSH_INSERT_PASSES_WITHIN (pass_tree_no_loop) NEXT_PASS (pass_slp_vectorize); POP_INSERT_PASSES () + NEXT_PASS (pass_simduid_cleanup); NEXT_PASS (pass_lower_vector_ssa); NEXT_PASS (pass_cse_reciprocals); NEXT_PASS (pass_reassoc); --- gcc/tree-vectorizer.c.jj 2015-06-10 08:18:29.0 +0200 +++ gcc/tree-vectorizer.c 2015-06-15 14:31:02.548482422 +0200 @@ -171,7 +171,7 @@ simd_array_to_simduid::equal (const simd into their corresponding constants. */ static void -adjust_simduid_builtins (hash_tablesimduid_to_vf **htab) +adjust_simduid_builtins (hash_tablesimduid_to_vf *htab) { basic_block bb; @@ -203,10 +203,12 @@ adjust_simduid_builtins (hash_tablesimd gcc_assert (TREE_CODE
Re: [Patch 0/4] PowerPC64 Linux split stack support
Alan Modra amo...@gmail.com writes:

> This untested patch ought to fix the problem, I think.

There is no -fsplit-stack in the Makefile, and the configure script has
already determined the correct settings.

$ grep -e -fsplit-stack libgo/Makefile 32/libgo/Makefile
libgo/Makefile:SPLIT_STACK = -fsplit-stack
$ grep split_stack libgo/config.log 32/libgo/config.log
libgo/config.log:libgo_cv_c_linker_supports_split_stack=no
libgo/config.log:libgo_cv_c_split_stack_supported=yes
32/libgo/config.log:libgo_cv_c_linker_supports_split_stack=no
32/libgo/config.log:libgo_cv_c_split_stack_supported=no

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
And now for something completely different.
Go patch committed: Add MERGE file
The master gofrontend repository has changed to git, so I am using a
different system for keeping the files up to date in GCC. I've added
a MERGE file to the GCC repository to track the most recent change
that has been merged over.

Ian

Index: gcc/go/gofrontend/MERGE
===================================================================
--- gcc/go/gofrontend/MERGE (revision 0)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -0,0 +1,4 @@
+8eeba3ad318863eea867669609a1910101c23f00
+
+The first line of this file holds the git revision number of the last
+merge done from the gofrontend repository.
Re: [PATCH] Fix PR c++/30044
On 06/11/2015 09:25 PM, Patrick Palka wrote:
> +      parameter_vec = make_tree_vec
> +        (TREE_VEC_LENGTH (TREE_VALUE (current_template_parms)) + 1);
> +
> +      for (int i = 0; i < TREE_VEC_LENGTH (parameter_vec) - 1; i++)
> +        TREE_VEC_ELT (parameter_vec, i)
> +          = TREE_VEC_ELT (TREE_VALUE (current_template_parms), i);
> +
> +      TREE_VEC_ELT (parameter_vec, TREE_VEC_LENGTH (parameter_vec) - 1)
> +        = tree_last (parameter_list);

Any reason not to use grow_tree_vec?
Re: [C++ Patch] PR 51048
OK. Jason
Re: [PATCH] rtx_costs vs. const_int
Hi Alan,

On Mon, Jun 15, 2015 at 12:03:47PM +0930, Alan Modra wrote:
> This patch changes the targetm.rtx_costs interface to pass a mode
> parameter, and removes a redundant parameter. The reason for the
> change is that powerpc and other backends need the mode that a
> const_int is used in to properly determine the cost. For instance,
> (set (reg) (ior (reg) (const_int))) where const_int is 0xff.. can be
> implemented in one instruction on powerpc if the regs and constant are
> SImode, but not when DImode.

Nice :-)

> Some backends work around this problem by calculating the cost of the
> entire expression under the IOR, which allows the mode of the
> const_int to be inferred.

Yeah, I do this in my series to improve the PowerPC rotate-and-mask
instructions. Some of the time you really have to look at a bigger
part of the pattern, but your patch should help here.

> sub-rtx's in rtx_cost are summed in order rather than reverse order so
> that the mode from an earlier operand can be used for a later operand
> lacking a mode. This is for ZERO_EXTEND and similar codes where the
> sub-rtx mode is different to the outer mode. Canonicalization puts
> const_int operands after other operands.

Not always; only for commutative operations, or operations that can be
swapped (like comparisons). As a counterexample, see PowerPC subfic
(imm - reg).

> The testing showed some pre-existing bugs..
> arc-elf dies on attempting to assemble first libgcc file, due to gas
> not understanding the options being passed by gcc. Apparently no
> one cared enough to push gas changes upstream.

You need to configure it with --with-cpu. Not that that makes any
sense :-)

> ia64-linux, m68k-linux, tilegx-linux and tilepro-linux have
> dependencies on include files, not solved by -Dinhibit_libc, so die
> building libgcc. Can be solved by hand, but annoying.

Very annoying yes. Have patches for all; will send.

> s390x-linux doesn't build as a cross-compiler.
> undefined reference to `s390_host_detect_local_cpu(int, char const**)'

This is recent, maybe already fixed?

Segher
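To make the point about modes concrete, here is a toy sketch. It is not rs6000 code: the names `Mode`, `mode_mask` and `toy_ior_cost`, and the cost rule itself, are assumptions made up for illustration. All it shows is that a cost function cannot classify a constant's bit pattern until it knows the width the constant is used in.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for machine modes; not GCC's machine_mode.
enum class Mode { SI = 32, DI = 64 };

// Mask covering the bits that exist in the given mode.
constexpr std::uint64_t mode_mask(Mode m) {
  return m == Mode::DI ? ~std::uint64_t{0} : (std::uint64_t{1} << 32) - 1;
}

// Toy rule (an assumption, not a real target's rule): an IOR with a
// constant is charged one instruction when the constant fits in 16
// unsigned bits, or when it is all ones in the given mode, because the
// result is then simply all ones.  Anything else is charged two.
constexpr int toy_ior_cost(std::uint64_t bits, Mode m) {
  return ((bits & mode_mask(m)) <= 0xffff
          || (bits & mode_mask(m)) == mode_mask(m))
             ? 1
             : 2;
}
```

With this rule the same bit pattern 0xffffffff is cheap when used in the 32-bit mode and more expensive in the 64-bit one, which is the kind of distinction a mode parameter lets a backend make without inspecting the enclosing expression.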
Re: [Patch 0/4] PowerPC64 Linux split stack support
        * go-lang.c (go_langhook_init_options_struct): Don't set
        x_flag_split_stack.
        (go_langhook_post_options): Set it here instead.
---
 gcc/go/go-lang.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/gcc/go/go-lang.c b/gcc/go/go-lang.c
index ce4dd9b..d952e0f 100644
--- a/gcc/go/go-lang.c
+++ b/gcc/go/go-lang.c
@@ -158,10 +158,6 @@ go_langhook_init_options_struct (struct gcc_options *opts)
   opts->x_flag_errno_math = 0;
   opts->frontend_set_flag_errno_math = true;
 
-  /* We turn on stack splitting if we can.  */
-  if (targetm_common.supports_split_stack (false, opts))
-    opts->x_flag_split_stack = 1;
-
   /* Exceptions are used to handle recovering from panics.  */
   opts->x_flag_exceptions = 1;
   opts->x_flag_non_call_exceptions = 1;
@@ -295,6 +291,11 @@ go_langhook_post_options (const char **pfilename ATTRIBUTE_UNUSED)
       && global_options.x_write_symbols == NO_DEBUG)
     global_options.x_write_symbols = PREFERRED_DEBUGGING_TYPE;
 
+  /* We turn on stack splitting if we can.  */
+  if (!global_options_set.x_flag_split_stack
+      && targetm_common.supports_split_stack (false, &global_options))
+    global_options.x_flag_split_stack = 1;
+
   /* Returning false means that the backend should be used.  */
   return false;
 }
-- 
2.4.3

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
And now for something completely different.
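The shape of this change can be sketched outside GCC. The following is a minimal illustration, assuming a made-up `Options` record and a made-up target query that depends on a setting only known after option parsing; none of these names are GCC's. The default is applied late, and only when the user has not given the flag explicitly.

```cpp
#include <cassert>

// Hypothetical miniature of an options record; not GCC's gcc_options.
struct Options {
  bool split_stack = false;      // current value of the flag
  bool split_stack_set = false;  // true if the user gave the flag explicitly
};

// Stand-in for a target query whose answer depends on other options
// (here a made-up "is_64bit" switch) that are only final after parsing.
bool target_supports_split_stack(bool is_64bit) { return is_64bit; }

// Apply the language default after option parsing, and only when the
// user has not chosen a value themselves.
void post_options(Options& opts, bool is_64bit) {
  if (!opts.split_stack_set && target_supports_split_stack(is_64bit))
    opts.split_stack = true;
}
```

Asking the target at this later point means the answer reflects the fully processed options, and the explicit-set check keeps a user's own choice from being overridden by the default.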
Re: [PATCH] Fix PR c++/30044
On Mon, Jun 15, 2015 at 2:05 PM, Jason Merrill ja...@redhat.com wrote:
> On 06/11/2015 09:25 PM, Patrick Palka wrote:
>> +      parameter_vec = make_tree_vec
>> +        (TREE_VEC_LENGTH (TREE_VALUE (current_template_parms)) + 1);
>> +
>> +      for (int i = 0; i < TREE_VEC_LENGTH (parameter_vec) - 1; i++)
>> +        TREE_VEC_ELT (parameter_vec, i)
>> +          = TREE_VEC_ELT (TREE_VALUE (current_template_parms), i);
>> +
>> +      TREE_VEC_ELT (parameter_vec, TREE_VEC_LENGTH (parameter_vec) - 1)
>> +        = tree_last (parameter_list);
>
> Any reason not to use grow_tree_vec?

Doing so causes a lot of ICEs in the testsuite. I think it's because
grow_tree_vec invalidates the older parameter_vec which some trees may
still be holding a reference to in their DECL_TEMPLATE_PARMS field.
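As a rough analogy only (this uses standard C++ containers, not GCC trees, and `Decl`, `ParmVec` and `append_copy` are hypothetical names), the copy-and-append approach from the quoted hunk can be sketched like this. Anything that still refers to the shorter vector keeps seeing it unchanged, because the longer vector is a separate object.

```cpp
#include <cassert>
#include <memory>
#include <vector>

using ParmVec = std::vector<int>;

// Hypothetical stand-in for a declaration that remembers the parameter
// vector that was current when it was created.
struct Decl {
  std::shared_ptr<const ParmVec> parms;
};

// Build a new vector one element longer than the old one.  The old
// vector is left untouched, so existing holders remain valid.
std::shared_ptr<const ParmVec> append_copy(
    const std::shared_ptr<const ParmVec>& old_vec, int last) {
  auto grown = std::make_shared<ParmVec>(*old_vec);
  grown->push_back(last);
  return grown;
}
```

Growing the original object in place would instead change, or possibly move, the storage that earlier holders point at, which is the hazard suspected above.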
RE: [Patch, MIPS] Modify sysroot layout for mips-mti-* and mips-img-*
Hi Steve, Having worked on the new layout I of course am happy with it. I think it makes the cross compiled sysroots much easier to use for installing on a target as well as making the library paths match for cross compiled and native. A couple of minor things... diff --git a/gcc/config/mips/mti-linux.h b/gcc/config/mips/mti-linux.h index 80d5925..32b84d1 100644 --- a/gcc/config/mips/mti-linux.h +++ b/gcc/config/mips/mti-linux.h @@ -18,16 +18,20 @@ along with GCC; see the file COPYING3. If not see http://www.gnu.org/licenses/. */ /* This target is a multilib target, specify the sysroot paths. */ +#define MIPS_SYSVERSION_SPEC \ + %{mips32:r1}%{mips64:r1}%{mips32r2:r2}%{mips64r2:r2}%{mips32r6:r6}%{mips64r6:r6}%{mips16:-mips16} + I know we had long lines before in this file but can't we split this line safely like: %{mips32:r1}%{mips64:r1}%{mips32r2:r2}%{mips64r2:r2}%{mips32r6:r6} \ %{mips64r6:r6}%{mips16:-mips16} #undef SYSROOT_SUFFIX_SPEC -#if MIPS_ISA_DEFAULT == 33 /* mips32r2 is the default */ -#define SYSROOT_SUFFIX_SPEC \ - %{mips32:/mips32}%{mips64:/mips64}%{mips64r2:/mips64r2}%{mips32r6:/mips32r6}%{mips64r6:/mips64r6}%{mips16:/mips16}%{mmicromips:/micromips}%{mabi=64:/64}%{mel|EL:/el}%{msoft-float:/sof}%{!mips32r6:%{!mips64r6:%{mnan=2008:/nan2008}}} -#elif MIPS_ISA_DEFAULT == 37 /* mips32r6 is the default */ #define SYSROOT_SUFFIX_SPEC \ - %{mips32:/mips32}%{mips64:/mips64}%{mips32r2:/mips32r2}%{mips64r2:/mips64r2}%{mips64r6:/mips64r6}%{mips16:/mips16}%{mmicromips:/micromips}%{mabi=64:/64}%{mel|EL:/el}%{msoft-float:/sof}%{!mips32r6:%{!mips64r6:%{mnan=2008:/nan2008}}} -#else /* Unexpected default ISA. */ -#error No SYSROOT_SUFFIX_SPEC exists for this default ISA -#endif + /%{mmicromips:micro}mips%{mel|EL:el}-MIPS_SYSVERSION_SPEC%{msoft-float:-soft;:-hard}%{!mips32r6:%{!mips64r6:%{mnan=2008:-nan2008}}}%{muclibc:-uclibc} + Similarly here. 
+#define SYSROOT_HEADERS_SUFFIX_SPEC SYSROOT_SUFFIX_SPEC + +#undef STARTFILE_PREFIX_SPEC +#define STARTFILE_PREFIX_SPEC \ + %{mabi=32: /usr/local/lib/ /lib/ /usr/lib/} \ + %{mabi=n32: /usr/local/lib32/ /lib32/ /usr/lib32/} \ + %{mabi=64: /usr/local/lib64/ /lib64/ /usr/lib64/} #undef DRIVER_SELF_SPECS #define DRIVER_SELF_SPECS\ diff --git a/gcc/config/mips/t-img-linux b/gcc/config/mips/t-img-linux index 86b0a26..93d81920 100644 --- a/gcc/config/mips/t-img-linux +++ b/gcc/config/mips/t-img-linux @@ -23,8 +23,16 @@ MULTILIB_OPTIONS = mips64r6 mabi=64 EL MULTILIB_DIRNAMES = mips64r6 64 el MULTILIB_MATCHES = EL=mel EB=meb -# The 64 bit ABI is not supported on the mips32r6 architecture. -# Because mips32r6 is the default we can't use that flag to trigger -# the exception so we check for mabi=64 with no specific mips -# architecture flag instead. -MULTILIB_EXCEPTIONS += mabi=64* +MULTILIB_REQUIRED = +MULTILIB_OSDIRNAMES = .=mips-r6-hard/lib Why no exclamation (!) here? I understand the ! is supposed to prevent the tools from searching any other directory. Is it redundant on the default multilib? 
+MULTILIB_REQUIRED += mips64r6 +MULTILIB_OSDIRNAMES += mips64r6=!mips-r6-hard/lib32 +MULTILIB_REQUIRED += mips64r6/mabi=64 +MULTILIB_OSDIRNAMES += mips64r6/mabi.64=!mips-r6-hard/lib64 + +MULTILIB_REQUIRED += EL +MULTILIB_OSDIRNAMES += EL=!mipsel-r6-hard/lib +MULTILIB_REQUIRED += mips64r6/EL +MULTILIB_OSDIRNAMES += mips64r6/EL=!mipsel-r6-hard/lib32 +MULTILIB_REQUIRED += mips64r6/mabi=64/EL +MULTILIB_OSDIRNAMES += mips64r6/mabi.64/EL=!mipsel-r6-hard/lib64 diff --git a/gcc/config/mips/t-mti-linux b/gcc/config/mips/t-mti-linux index c0dcbf0..2404c4ca 100644 --- a/gcc/config/mips/t-mti-linux +++ b/gcc/config/mips/t-mti-linux @@ -23,26 +23,136 @@ MULTILIB_OPTIONS = mips32/mips64/mips64r2 mips16/mmicromips mabi=64 EL msoft-flo MULTILIB_DIRNAMES = mips32 mips64 mips64r2 mips16 micromips 64 el sof nan2008 MULTILIB_MATCHES = EL=mel EB=meb mips32r2=mips32r3 mips32r2=mips32r5 mips64r2=mips64r3 mips64r2=mips64r5 -# The 64 bit ABI is not supported on the mips32 architecture. -MULTILIB_EXCEPTIONS += *mips32*/*mabi=64* - -# The 64 bit ABI is not supported on the mips32r2 architecture. -# Because mips32r2 is the default we can't use that flag to trigger -# the exception so we check for mabi=64 with no specific mips -# architecture flag instead. -MULTILIB_EXCEPTIONS += mabi=64* - -# We do not want to build mips16 versions of mips64* architectures. -MULTILIB_EXCEPTIONS += *mips64*/*mips16* -MULTILIB_EXCEPTIONS += *mips16/mabi=64* - -# We only want micromips for mips32r2 architecture. -MULTILIB_EXCEPTIONS += *mips32/mmicromips* -MULTILIB_EXCEPTIONS += *mips64*/mmicromips* -MULTILIB_EXCEPTIONS += *mmicromips/mabi=64* - -# We do not want nan2008 libraries for soft-float, -# mips32[r1], or mips64[r1]. -MULTILIB_EXCEPTIONS += *msoft-float*/*mnan=2008* -MULTILIB_EXCEPTIONS +=
[PATCH] PR ada/66205 gnatbind generates invalid code when finalization is enabled in restricted runtime
If the RTS in use is configurable (I believe this is the same in this
context as restricted) and includes finalization, gnatbind generates
binder code that won't compile.

This situation arises, for example, with an embedded RTS that
incorporates the Ada 2012 generalized container iterators.

The attached patch was bootstrapped/regression tested (make check-ada)
against 5.1.0 on x86_64-apple-darwin13 (which confirms that the patch
hasn't broken builds against the standard RTS).

arm-eabi-gnatbind was successful against both an RTS with finalization
and one without.

The patch applies with no offset to the trunk.

gcc/ada/Changelog:

2015-6-15  Simon Wright  si...@pushface.org

        PR ada/66205
        * bindgen.adb (Gen_Adafinal): if Configurable_Run_Time_On_Target is
        true, generate a null body.
        (Gen_Main): if Configurable_Run_Time_On_Target is true, then
        - don't import __gnat_initialize or __gnat_finalize (as Initialize,
        Finalize rsp).
        - don't call Initialize or Finalize.

pr66205.diff
Description: Binary data
Re: New type-based pool allocator code miscompiled due to aliasing issue?
On Mon, 15 Jun 2015, Martin Liška wrote:

> Ah, I overlooked that it's not a placement new, but just static casting.
> Anyway, if I added:
>
> cselib_val () {}
>
> to struct cselib_val and changed the cast to placement new:
>   char *ptr = (char *) header;
>   return new (ptr) T ();
>
> I got following compilation error:
>
> In file included from ../../gcc/alias.c:46:0:
> ../../gcc/alloc-pool.h: In instantiation of ‘T* pool_allocator<T>::allocate() [with T = cselib_val]’:
> ../../gcc/cselib.h:51:27:   required from here
> ../../gcc/alloc-pool.h:416:23: error: no matching function for call to ‘cselib_val::operator new(sizetype, char*)’
>    return new (ptr) T ();
>                        ^
> In file included from ../../gcc/alias.c:47:0:
> ../../gcc/cselib.h:49:16: note: candidate: static void* cselib_val::operator new(size_t)
>    inline void *operator new (size_t)
>                 ^
> ../../gcc/cselib.h:49:16: note:   candidate expects 1 argument, 2 provided

#include <new>

-- 
Marc Glisse
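For reference, a small self-contained sketch of the situation in the error message. `Val` and `construct_at_buffer` are hypothetical names, not the alloc-pool.h code. The global placement form of operator new is declared in `<new>`. In addition, as far as I understand the lookup rules, a class that declares its own `operator new` is searched first for any new-expression of that type, so a class-specific single-argument version can hide the global placement form; writing `::new` asks for the global one explicitly.

```cpp
#include <cassert>
#include <cstddef>
#include <new>  // declares the global placement operator new

// Hypothetical stand-in for a type that, like cselib_val in the error
// message, declares its own single-argument operator new.
struct Val {
  int n;
  Val() : n(7) {}
  static void* operator new(std::size_t size) { return ::operator new(size); }
  static void operator delete(void* p) { ::operator delete(p); }
};

// Construct a T in caller-provided storage.  "::new" starts the lookup
// at global scope, so a class-specific operator new that takes no
// placement argument is not considered.
template <typename T>
T* construct_at_buffer(void* storage) {
  return ::new (storage) T();
}
```

Whether the pool allocator should use `::new` or give the class a matching placement overload is a design choice this sketch does not settle.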
[C++ Patch] PR 51048
Hi,

we are getting bug reports (3 so far) about this issue: in C++11 we
reject the below testcase and we say that the virtual function declared
in A is never defined. Without considering more subtle details, the
error appears meaningless because the function is in fact *pure*
virtual. In any case, the testcase is accepted with -fpermissive and
the generated code is in fact Ok at runtime. All in all, I thought we
could simply check DECL_PURE_VIRTUAL_P before emitting the permerror,
the trivial change passes testing.

Thanks,
Paolo.

//

/cp
2015-06-15  Paolo Carlini  paolo.carl...@oracle.com

        PR c++/51048
        * decl2.c (no_linkage_error): Do not issue a permerror if the DECL
        using a local type is pure virtual.

/testsuite
2015-06-15  Paolo Carlini  paolo.carl...@oracle.com

        PR c++/51048
        * g++.dg/cpp0x/local-type1.C: New.

Index: cp/decl2.c
===================================================================
--- cp/decl2.c (revision 224475)
+++ cp/decl2.c (working copy)
@@ -4221,8 +4221,12 @@ no_linkage_error (tree decl)
                    TYPE_NAME (t));
         }
       else if (cxx_dialect >= cxx11)
-        permerror (DECL_SOURCE_LOCATION (decl), "%q#D, declared using local type "
-                   "%qT, is used but never defined", decl, t);
+        {
+          if (TREE_CODE (decl) == VAR_DECL || !DECL_PURE_VIRTUAL_P (decl))
+            permerror (DECL_SOURCE_LOCATION (decl),
+                       "%q#D, declared using local type "
+                       "%qT, is used but never defined", decl, t);
+        }
       else if (TREE_CODE (decl) == VAR_DECL)
         warning_at (DECL_SOURCE_LOCATION (decl), 0, "type %qT with no linkage "
                     "used to declare variable %q#D with linkage", t, decl);
Index: testsuite/g++.dg/cpp0x/local-type1.C
===================================================================
--- testsuite/g++.dg/cpp0x/local-type1.C (revision 0)
+++ testsuite/g++.dg/cpp0x/local-type1.C (working copy)
@@ -0,0 +1,19 @@
+// PR c++/51048
+// { dg-do compile { target c++11 } }
+
+template<typename X>
+struct A {
+  virtual void DoPush(X const& x) = 0;
+  void Push(X const& x) { DoPush(x); }
+};
+
+template<typename X>
+struct B : A<X> {
+  using A<X>::Push;
+  virtual void DoPush(X const&) { }
+};
+
+int main() {
+  enum S { };
+  B<S>().Push(S());
+}
[PATCH] Remove dg-options -O2 in libgomp.c
Hi, this patch removes superfluous dg-option -O2 settings in testsuite/libgomp.c. The setting is superfluous, because DEFAULT_CFLAGS is already set to -O2 in c.exp. Tested on x86_64. OK for trunk? Thanks, - Tom Remove dg-options -O2 in libgomp.c 2015-06-15 Tom de Vries t...@codesourcery.com * testsuite/libgomp.c/atomic-1.c: Remove dg-options -O2. Use dg-additional-options for any remaining options. * testsuite/libgomp.c/atomic-2.c: Same. * testsuite/libgomp.c/atomic-4.c: Same. * testsuite/libgomp.c/atomic-5.c: Same. * testsuite/libgomp.c/atomic-6.c: Same. * testsuite/libgomp.c/autopar-1.c: Same. * testsuite/libgomp.c/copyin-1.c: Same. * testsuite/libgomp.c/copyin-2.c: Same. * testsuite/libgomp.c/copyin-3.c: Same. * testsuite/libgomp.c/examples-4/e.53.5.c: Same. * testsuite/libgomp.c/nestedfn-5.c: Same. * testsuite/libgomp.c/parloops-exit-first-loop-alt-2.c: Same. * testsuite/libgomp.c/parloops-exit-first-loop-alt-3.c: Same. * testsuite/libgomp.c/parloops-exit-first-loop-alt-4.c: Same. * testsuite/libgomp.c/parloops-exit-first-loop-alt.c: Same. * testsuite/libgomp.c/pr32362-1.c: Same. * testsuite/libgomp.c/pr32362-2.c: Same. * testsuite/libgomp.c/pr32362-3.c: Same. * testsuite/libgomp.c/pr39591-1.c: Same. * testsuite/libgomp.c/pr39591-2.c: Same. * testsuite/libgomp.c/pr39591-3.c: Same. * testsuite/libgomp.c/pr58392.c: Same. * testsuite/libgomp.c/pr58756.c: Same. * testsuite/libgomp.c/simd-1.c: Same. * testsuite/libgomp.c/simd-10.c: Same. * testsuite/libgomp.c/simd-11.c: Same. * testsuite/libgomp.c/simd-12.c: Same. * testsuite/libgomp.c/simd-13.c: Same. * testsuite/libgomp.c/simd-14.c: Same. * testsuite/libgomp.c/simd-15.c: Same. * testsuite/libgomp.c/simd-2.c: Same. * testsuite/libgomp.c/simd-3.c: Same. * testsuite/libgomp.c/simd-4.c: Same. * testsuite/libgomp.c/simd-5.c: Same. * testsuite/libgomp.c/simd-6.c: Same. * testsuite/libgomp.c/simd-7.c: Same. * testsuite/libgomp.c/simd-8.c: Same. * testsuite/libgomp.c/simd-9.c: Same. 
--- libgomp/testsuite/libgomp.c/atomic-1.c | 2 +- libgomp/testsuite/libgomp.c/atomic-2.c | 2 +- libgomp/testsuite/libgomp.c/atomic-4.c | 1 - libgomp/testsuite/libgomp.c/atomic-5.c | 3 +-- libgomp/testsuite/libgomp.c/atomic-6.c | 5 ++--- libgomp/testsuite/libgomp.c/autopar-1.c | 2 +- libgomp/testsuite/libgomp.c/copyin-1.c | 1 - libgomp/testsuite/libgomp.c/copyin-2.c | 1 - libgomp/testsuite/libgomp.c/copyin-3.c | 1 - libgomp/testsuite/libgomp.c/examples-4/e.53.5.c | 1 - libgomp/testsuite/libgomp.c/nestedfn-5.c | 1 - libgomp/testsuite/libgomp.c/parloops-exit-first-loop-alt-2.c | 2 +- libgomp/testsuite/libgomp.c/parloops-exit-first-loop-alt-3.c | 2 +- libgomp/testsuite/libgomp.c/parloops-exit-first-loop-alt-4.c | 2 +- libgomp/testsuite/libgomp.c/parloops-exit-first-loop-alt.c | 2 +- libgomp/testsuite/libgomp.c/pr32362-1.c | 1 - libgomp/testsuite/libgomp.c/pr32362-2.c | 1 - libgomp/testsuite/libgomp.c/pr32362-3.c | 1 - libgomp/testsuite/libgomp.c/pr39591-1.c | 1 - libgomp/testsuite/libgomp.c/pr39591-2.c | 1 - libgomp/testsuite/libgomp.c/pr39591-3.c | 1 - libgomp/testsuite/libgomp.c/pr58392.c| 1 - libgomp/testsuite/libgomp.c/pr58756.c| 1 - libgomp/testsuite/libgomp.c/simd-1.c | 1 - libgomp/testsuite/libgomp.c/simd-10.c| 1 - libgomp/testsuite/libgomp.c/simd-11.c| 1 - libgomp/testsuite/libgomp.c/simd-12.c| 1 - libgomp/testsuite/libgomp.c/simd-13.c| 1 - libgomp/testsuite/libgomp.c/simd-14.c| 1 - libgomp/testsuite/libgomp.c/simd-15.c| 1 - libgomp/testsuite/libgomp.c/simd-2.c | 1 - libgomp/testsuite/libgomp.c/simd-3.c | 1 - libgomp/testsuite/libgomp.c/simd-4.c | 1 - libgomp/testsuite/libgomp.c/simd-5.c | 1 - libgomp/testsuite/libgomp.c/simd-6.c | 1 - libgomp/testsuite/libgomp.c/simd-7.c | 1 - libgomp/testsuite/libgomp.c/simd-8.c | 1 - libgomp/testsuite/libgomp.c/simd-9.c | 1 - 38 files changed, 10 insertions(+), 41 deletions(-) diff --git a/libgomp/testsuite/libgomp.c/atomic-1.c b/libgomp/testsuite/libgomp.c/atomic-1.c index 4725b7d..1cecd09b 100644 --- 
a/libgomp/testsuite/libgomp.c/atomic-1.c +++ b/libgomp/testsuite/libgomp.c/atomic-1.c @@ -1,5 +1,5 @@ /* { dg-do run } */ -/* { dg-options -O2 -march=pentium { target { { i?86-*-* x86_64-*-* } ia32 }
[patch] Run testsuite/libgomp.c++/c++.exp at -O2 by default
Hi, this patch: - sets DEFAULT_CFLAGS to -O2, if not set otherwise (similar to what is done in c.exp) - removes superfluous dg-options -O2 settings. - removes superfluous dg-options -fopenmp settings. - uses dg-additional-options for -std=standard settings Tested on x86_64. OK for trunk? Thanks, - Tom Run testsuite/libgomp.c++/c++.exp at -O2 by default 2015-06-15 Tom de Vries t...@codesourcery.com * testsuite/libgomp.c++/c++.exp: Set DEFAULT_CFLAGS to -O2 if not already set. Use DEFAULT_CFLAGS in dg-runtest. * testsuite/libgomp.c++/atomic-16.C: Remove dg-options -O2 -fopenmp. * testsuite/libgomp.c++/pr64824.C: Same. * testsuite/libgomp.c++/pr64868.C: Same. * testsuite/libgomp.c++/pr66199-1.C: Same. * testsuite/libgomp.c++/pr66199-2.C: Same. * testsuite/libgomp.c++/target-2.C: Same. * testsuite/libgomp.c++/for-7.C: Use dg-additional-options for -std=standard option. * testsuite/libgomp.c++/udr-11.C: Same. * testsuite/libgomp.c++/udr-12.C: Same. * testsuite/libgomp.c++/udr-13.C: Same. * testsuite/libgomp.c++/udr-14.C: Same. * testsuite/libgomp.c++/udr-15.C: Same. * testsuite/libgomp.c++/udr-16.C: Same. * testsuite/libgomp.c++/udr-17.C: Same. * testsuite/libgomp.c++/udr-18.C: Same. * testsuite/libgomp.c++/udr-19.C: Same. * testsuite/libgomp.c++/atomic-1.C: Remove dg-options -O2. * testsuite/libgomp.c++/simd-1.C: Same. * testsuite/libgomp.c++/simd-2.C: Same. * testsuite/libgomp.c++/simd-3.C: Same. * testsuite/libgomp.c++/simd-4.C: Same. * testsuite/libgomp.c++/simd-5.C: Same. * testsuite/libgomp.c++/simd-6.C: Same. * testsuite/libgomp.c++/simd-7.C: Same. * testsuite/libgomp.c++/simd-8.C: Same. * testsuite/libgomp.c++/simd-9.C: Same. * testsuite/libgomp.c++/simd10.C: Same. * testsuite/libgomp.c++/simd11.C: Same. * testsuite/libgomp.c++/simd12.C: Same. * testsuite/libgomp.c++/simd13.C: Same. 
--- libgomp/testsuite/libgomp.c++/atomic-1.C | 1 - libgomp/testsuite/libgomp.c++/atomic-16.C | 1 - libgomp/testsuite/libgomp.c++/c++.exp | 7 ++- libgomp/testsuite/libgomp.c++/for-7.C | 2 +- libgomp/testsuite/libgomp.c++/pr64824.C | 1 - libgomp/testsuite/libgomp.c++/pr64868.C | 1 - libgomp/testsuite/libgomp.c++/pr66199-1.C | 1 - libgomp/testsuite/libgomp.c++/pr66199-2.C | 1 - libgomp/testsuite/libgomp.c++/simd-1.C| 1 - libgomp/testsuite/libgomp.c++/simd-2.C| 1 - libgomp/testsuite/libgomp.c++/simd-3.C| 1 - libgomp/testsuite/libgomp.c++/simd-4.C| 1 - libgomp/testsuite/libgomp.c++/simd-5.C| 1 - libgomp/testsuite/libgomp.c++/simd-6.C| 1 - libgomp/testsuite/libgomp.c++/simd-7.C| 1 - libgomp/testsuite/libgomp.c++/simd-8.C| 1 - libgomp/testsuite/libgomp.c++/simd-9.C| 1 - libgomp/testsuite/libgomp.c++/simd10.C| 1 - libgomp/testsuite/libgomp.c++/simd11.C| 1 - libgomp/testsuite/libgomp.c++/simd12.C| 1 - libgomp/testsuite/libgomp.c++/simd13.C| 1 - libgomp/testsuite/libgomp.c++/target-2.C | 1 - libgomp/testsuite/libgomp.c++/udr-11.C| 2 +- libgomp/testsuite/libgomp.c++/udr-12.C| 2 +- libgomp/testsuite/libgomp.c++/udr-13.C| 2 +- libgomp/testsuite/libgomp.c++/udr-14.C| 2 +- libgomp/testsuite/libgomp.c++/udr-15.C| 2 +- libgomp/testsuite/libgomp.c++/udr-16.C| 2 +- libgomp/testsuite/libgomp.c++/udr-17.C| 2 +- libgomp/testsuite/libgomp.c++/udr-18.C| 2 +- libgomp/testsuite/libgomp.c++/udr-19.C| 2 +- 31 files changed, 16 insertions(+), 31 deletions(-) diff --git a/libgomp/testsuite/libgomp.c++/atomic-1.C b/libgomp/testsuite/libgomp.c++/atomic-1.C index 73f6e7c..9eecfbb 100644 --- a/libgomp/testsuite/libgomp.c++/atomic-1.C +++ b/libgomp/testsuite/libgomp.c++/atomic-1.C @@ -1,6 +1,5 @@ // PR c++/33894 // { dg-do run } -// { dg-options -O2 } extern C void abort (); diff --git a/libgomp/testsuite/libgomp.c++/atomic-16.C b/libgomp/testsuite/libgomp.c++/atomic-16.C index afccd52..432d36d 100644 --- a/libgomp/testsuite/libgomp.c++/atomic-16.C +++ b/libgomp/testsuite/libgomp.c++/atomic-16.C @@ 
-1,5 +1,4 @@ // PR c/64824 // { dg-do run } -// { dg-options -O2 -fopenmp } #include ../libgomp.c/atomic-18.c diff --git a/libgomp/testsuite/libgomp.c++/c++.exp b/libgomp/testsuite/libgomp.c++/c++.exp index da42e62..0454f95 100644 --- a/libgomp/testsuite/libgomp.c++/c++.exp +++ b/libgomp/testsuite/libgomp.c++/c++.exp @@ -11,6 +11,11 @@ if [info exists lang_include_flags] then { unset lang_include_flags } +# If a testcase doesn't have special options, use these. +if ![info exists DEFAULT_CFLAGS] then { +set DEFAULT_CFLAGS -O2 +} + # Initialize dg. dg-init @@ -60,7 +65,7 @@ if { $lang_test_file_found } { } # Main loop. -dg-runtest $tests $libstdcxx_includes +dg-runtest $tests $libstdcxx_includes $DEFAULT_CFLAGS } # All done. diff --git a/libgomp/testsuite/libgomp.c++/for-7.C b/libgomp/testsuite/libgomp.c++/for-7.C index 9d626c0..256a131 100644 --- a/libgomp/testsuite/libgomp.c++/for-7.C +++ b/libgomp/testsuite/libgomp.c++/for-7.C @@
Re: [gomp4.1] Add new versions of GOMP_target{,_data,_update} and GOMP_target_enter_exit_data
On Mon, Jun 15, 2015 at 15:06:09 +0200, Jakub Jelinek wrote:
> On Mon, Jun 15, 2015 at 03:20:37PM +0300, Ilya Verbin wrote:
> > This patch introduces new versions of GOMP_target{,_data,_update} for
> > OpenMP 4.1 with unsigned short for map kinds, but without new async
> > arguments yet.
>
> I think I'd prefer (for now) to suffix the functions with _41 instead of 1
> (and we'll see if we can come up with better names when async support is
> added).

OK.

> Do we need to change GOMP_target_update though (at least right now)?
> I mean, the construct only allows to and from clauses, not the map clause,
> and those don't really have an always modifier, nor release/delete
> semantics etc., so at least for now I think using the current
> GOMP_target_update should be ok.

I thought that it wouldn't look good, since without GOMP_target_update_41
we will need to keep this obsolete parts:

-  switch (start_ix)
-    {
-    case BUILT_IN_GOMP_TARGET_UPDATE:
-      /* This const void * is part of the current ABI, but we're not actually
-         using it.  */
-      args.quick_push (build_zero_cst (ptr_type_node));
-      break;
-    case BUILT_IN_GOMP_TARGET:
-    case BUILT_IN_GOMP_TARGET_DATA:
-    case BUILT_IN_GOACC_DATA_START:
-    case BUILT_IN_GOACC_ENTER_EXIT_DATA:
-    case BUILT_IN_GOACC_PARALLEL:
-    case BUILT_IN_GOACC_UPDATE:
-      break;
-    default:
-      gcc_unreachable ();
-    }

and

-  tree tkind_type;
-  int talign_shift;
-  if (is_gimple_omp_oacc (stmt))
-    {
-      tkind_type = short_unsigned_type_node;
-      talign_shift = 8;
-    }
-  else
-    {
-      tkind_type = unsigned_char_type_node;
-      talign_shift = 3;
-    }
+  tree tkind_type = short_unsigned_type_node;
+  int talign_shift = 8;

  -- Ilya
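To illustrate what the `tkind_type` / `talign_shift` pair in the second hunk is about, here is a small sketch. The layout is an assumption based on the variable names, not something the quoted hunk states: the map kind sits in the low `shift` bits and the log2 of the alignment is stored above it. The helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout: map kind in the low `shift` bits, log2(alignment)
// in the bits above.  Not taken from libgomp; for illustration only.
constexpr std::uint16_t encode_kind(unsigned kind, unsigned align_log2,
                                    unsigned shift) {
  return static_cast<std::uint16_t>(kind | (align_log2 << shift));
}

// Extract the map kind from the low `shift` bits.
constexpr unsigned decode_kind(std::uint16_t packed, unsigned shift) {
  return packed & ((1u << shift) - 1u);
}

// Extract log2(alignment) from the bits above the kind.
constexpr unsigned decode_align_log2(std::uint16_t packed, unsigned shift) {
  return static_cast<unsigned>(packed >> shift);
}
```

Under that assumption a shift of 3 leaves room for only eight distinct kinds, while a shift of 8 in a 16-bit value leaves room for 256, which would fit with the move to unsigned short for map kinds.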
Re: [gomp4.1] Add new versions of GOMP_target{,_data,_update} and GOMP_target_enter_exit_data
On Mon, Jun 15, 2015 at 07:18:27PM +0300, Ilya Verbin wrote:
> On Mon, Jun 15, 2015 at 15:06:09 +0200, Jakub Jelinek wrote:
> > On Mon, Jun 15, 2015 at 03:20:37PM +0300, Ilya Verbin wrote:
> > > This patch introduces new versions of GOMP_target{,_data,_update} for
> > > OpenMP 4.1 with unsigned short for map kinds, but without new async
> > > arguments yet.
> >
> > I think I'd prefer (for now) to suffix the functions with _41 instead
> > of 1 (and we'll see if we can come up with better names when async
> > support is added).
>
> OK.

Thanks.

> > Do we need to change GOMP_target_update though (at least right now)?
> > I mean, the construct only allows to and from clauses, not the map
> > clause, and those don't really have an always modifier, nor
> > release/delete semantics etc., so at least for now I think using the
> > current GOMP_target_update should be ok.
>
> I thought that it wouldn't look good, since without GOMP_target_update_41
> we will need to keep this obsolete parts:

I'd prefer to keep it for now, perhaps later on we'll switch to 16-bit
kinds even for that, but better figure out first what to do with the
async stuff, handle the enter/exit data correctly, change the library
for OpenMP 4.1 to do the fully refcounted model.

        Jakub
[committed] PR debug/66535: guard check into parent's DIE
The problem here is that the cached DIE does not have a parent because we purposely removed it, hoping that decls_for_scope will fill it in: /* If we're a nested function, initially use a parent of NULL; if we're a plain function, this will be fixed up in decls_for_scope. If we're a method, it will be ignored, since we already have a DIE. */ However, for the failing Ada testcase it _is_ decls_for_scope that we're being called from while we're doing the abstract instance dance while generating the containing type: gen_typedef_die(): ... origin = decl_ultimate_origin (decl); if (origin != NULL) add_abstract_origin_attribute (type_die, origin); So...we haven't yet filled in the parent. It doesn't really matter. We'll get it right, and besides, we shouldn't be dereferencing a parent's die field without checking the existence of said parent. Long story short... committing as obvious. Aldy commit ee39c82907a029fe1403cf1d3a364b89e9dee998 Author: Aldy Hernandez al...@redhat.com Date: Mon Jun 15 09:19:45 2015 -0700 PR debug/66535 * dwarf2out.c (gen_subprogram_die): Do not check a parent's tag if there is no parent. 
diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index d2c516a..4fe33f8 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -18790,7 +18790,8 @@ gen_subprogram_die (tree decl, dw_die_ref context_die)
                    end function
                  end module
            */
-           || old_die->die_parent->die_tag == DW_TAG_module
+           || (old_die->die_parent
+               && old_die->die_parent->die_tag == DW_TAG_module)
            || context_die == NULL)
            && (DECL_ARTIFICIAL (decl)
                || (get_AT_file (old_die, DW_AT_decl_file) == file_index
diff --git a/gcc/testsuite/gnat.dg/debug4.adb b/gcc/testsuite/gnat.dg/debug4.adb
new file mode 100644
index 000..1ec37c2
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/debug4.adb
@@ -0,0 +1,10 @@
+-- { dg-compile }
+-- { dg-options "-g" }
+
+with Debug4_Pkg;
+
+procedure Debug4 is
+   package P is new Debug4_Pkg (Natural);
+begin
+   null;
+end;
diff --git a/gcc/testsuite/gnat.dg/debug4_pkg.adb b/gcc/testsuite/gnat.dg/debug4_pkg.adb
new file mode 100644
index 000..18ba0c0
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/debug4_pkg.adb
@@ -0,0 +1,23 @@
+package body Debug4_Pkg is
+
+   type Vertex_To_Vertex_T is array (Vertex_Id range <>) of Vertex_Id;
+
+   function Dominator_Tree_Internal (G : T'Class) return Vertex_To_Vertex_T is
+      subtype V_To_V is Vertex_To_Vertex_T (0 .. G.Vertices.Last_Index);
+      type V_To_VIL is array
+        (Valid_Vertex_Id range 1 .. G.Vertices.Last_Index)
+          of Vertex_Index_List;
+      Bucket : V_To_VIL := (others => VIL.Empty_Vector);
+      Dom : V_To_V := (others => 0);
+   begin
+      return Dom;
+   end;
+
+   function Dominator_Tree (G : T'Class) return T is
+      Dom : constant Vertex_To_Vertex_T := Dominator_Tree_Internal (G);
+      DT : T := (Vertices => VL.Empty_Vector);
+   begin
+      return DT;
+   end;
+
+end Debug4_Pkg;
diff --git a/gcc/testsuite/gnat.dg/debug4_pkg.ads b/gcc/testsuite/gnat.dg/debug4_pkg.ads
new file mode 100644
index 000..bac4953
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/debug4_pkg.ads
@@ -0,0 +1,28 @@
+with Ada.Containers.Vectors;
+
+generic
+   type Vertex_Key is private;
+package Debug4_Pkg is
+
+   type Vertex_Id is new Natural;
+   subtype Valid_Vertex_Id is Vertex_Id range 1 .. Vertex_Id'Last;
+
+   package VIL is new Ada.Containers.Vectors
+     (Index_Type => Positive,
+      Element_Type => Valid_Vertex_Id);
+   use VIL;
+   subtype Vertex_Index_List is VIL.Vector;
+
+   package VL is new Ada.Containers.Vectors
+     (Index_Type => Valid_Vertex_Id,
+      Element_Type => Vertex_Key);
+   use VL;
+   subtype Vertex_List is VL.Vector;
+
+   type T is tagged record
+      Vertices : Vertex_List;
+   end record;
+
+   function Dominator_Tree (G : T'Class) return T;
+
+end Debug4_Pkg;
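
For readers skimming the thread, the essence of the dwarf2out.c change is a short-circuit guard before looking at the parent's tag. The following is only a minimal sketch of that pattern; `struct die`, `enum die_tag` and `parent_is_module` are hypothetical, simplified stand-ins and not GCC's actual types or functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for a DIE and its tag.  */
enum die_tag { TAG_SUBPROGRAM, TAG_MODULE };

struct die
{
  enum die_tag tag;
  struct die *parent;   /* May be NULL if the parent is not filled in yet.  */
};

/* Return true only if DIE has a parent and that parent is a module.
   The left operand of && is evaluated first, so a NULL parent is
   never dereferenced.  */
static bool
parent_is_module (const struct die *die)
{
  assert (die != NULL);
  return die->parent != NULL && die->parent->tag == TAG_MODULE;
}
```

With this shape, a DIE whose parent has not been set yet simply yields false instead of crashing.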
Re: [Patch 0/4] PowerPC64 Linux split stack support
The bug is of course that like DEFAULT_ABI, rs6000_isa_flags hasn't been determined yet. Andreas. -- Andreas Schwab, sch...@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 And now for something completely different.
Go patch committed: Don't crash when dumping AST of empty block
This patch from Chris Manghane fixes the Go frontend so that when using the -fgo-dump-ast option it does not crash when dumping an empty block. This fixes http://golang.org/issue/10420 . Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline.

Ian

Index: gcc/go/gofrontend/ast-dump.cc
===
--- gcc/go/gofrontend/ast-dump.cc (revision 223578)
+++ gcc/go/gofrontend/ast-dump.cc (working copy)
@@ -65,6 +65,12 @@ class Ast_dump_traverse_statements : pub
 int
 Ast_dump_traverse_blocks_and_functions::block(Block * block)
 {
+  if (block == NULL)
+    {
+      this->ast_dump_context_->ostream() << std::endl;
+      return TRAVERSE_EXIT;
+    }
+
   this->ast_dump_context_->print_indent();
   this->ast_dump_context_->ostream() << "{" << std::endl;
   this->ast_dump_context_->indent();
@@ -466,4 +472,4 @@ Ast_dump_context::dump_to_stream(const E
 {
   Ast_dump_context adc(out, false);
   expr->dump_expression(&adc);
-}
\ No newline at end of file
+}
[4.8, testsuite] Correct backported fix to gcc.dg/vect/vect-33.c
Hi,

When I backported support for unaligned vector load/store operations on POWER8 to GCC 4.8, I fumbled the change for gcc.dg/vect/vect-33.c. One of the original tests was:

/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */

which I modified to

/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" { target { ! vect_hw_misalign } } } } */

This caused the test to be skipped for architectures other than PowerPC, which was a mistake. The correct test should have been:

/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" { target { { ! powerpc*-*-* } || { ! vect_hw_misalign } } } } } */

which leaves things alone for other architectures. Ok for 4.8?

Thanks,
Bill

2015-06-15  Bill Schmidt  wschm...@linux.vnet.ibm.com

    * gcc.dg/vect/vect-33.c: Don't exclude "Vectorizing an unaligned
    access" test for non-PowerPC arches.

Index: gcc/testsuite/gcc.dg/vect/vect-33.c
===
--- gcc/testsuite/gcc.dg/vect/vect-33.c (revision 224490)
+++ gcc/testsuite/gcc.dg/vect/vect-33.c (working copy)
@@ -38,7 +38,7 @@ int main (void)
 
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" { target { ! vect_hw_misalign } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" { target { { ! powerpc*-*-* } || { ! vect_hw_misalign } } } } } */
 /* { dg-final { scan-tree-dump "Alignment of access forced using peeling" "vect" { target vector_alignment_reachable } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
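
The difference between the two selectors is easier to see as plain boolean logic. Below is a rough sketch that models each target property as a flag; the function names are made up for illustration, and the real evaluation is of course done by the DejaGnu target-selector machinery, not by C code.

```c
#include <stdbool.h>

/* Selector from the fumbled backport: { ! vect_hw_misalign }.
   IS_POWERPC is ignored, which is exactly the problem.  */
static bool
runs_with_backported_selector (bool is_powerpc, bool vect_hw_misalign)
{
  (void) is_powerpc;
  return !vect_hw_misalign;
}

/* Corrected selector:
   { { ! powerpc*-*-* } || { ! vect_hw_misalign } }.  */
static bool
runs_with_corrected_selector (bool is_powerpc, bool vect_hw_misalign)
{
  return !is_powerpc || !vect_hw_misalign;
}
```

Under this model the two selectors disagree only for a non-PowerPC target that reports vect_hw_misalign: the backported form skips the check there, while the corrected form runs it as before.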
Re: [Patch, MIPS] Enable fp-contract on MIPS and update -mfused-madd
On Thu, 11 Jun 2015, Joseph Myers wrote: loongson and r8000 have the most changes, they no longer generate msub instructions with -mfused-madd because that instruction does not generate the correct NAN in some cases (the sign may be wrong). If HONOR_NANS is not set then they will generate msub instructions. There's no such thing as a correct NaN sign for fused multiply-add (at the C / GIMPLE / RTL level, that is; there may be a correct sign at the level of semantics for processor instructions). IEEE 754 only specifies signs of NaNs for a few operations (copy, negate, abs, copySign). So while you need to avoid negate / abs instructions that don't work properly on NaNs, if signs of NaNs are the only concern then that's not a reason to avoid any other arithmetic operations. You are right about fused multiply-add (FMA) as far as IEEE Std 754-2008 is concerned, however these instructions implement fused multiply-subtract (FMS) which is an operation that hasn't been defined by IEEE Std 754-2008 and neither is expressed by GCC at the RTL level (there's only (fma:M OP1 OP2 OP3); there's no (fms:M OP1 OP2 OP3) or suchlike). Consequently expander patterns like `fmsM4', `fnmaM4', `fnmsM4' and any further ones that might be invented for similar operations, that e.g. reverse the subtraction, are built around FMA with some of its input operands negated. That negation, implemented with the IEEE Std 754-2008 `negate' operation that you referred to, by definition is required to operate on the sign of its operand in a specific way even if the operand is a qNaN. So for example `fmsM4', that is specified at the RTL level as (fma:M OP1 OP2 (neg:M OP3)) will not produce the correct result with the fused version of the MIPS MSUB.fmt instruction in the case where OP1 and OP2 are numeric data patterns and OP3 is a qNaN data pattern that has its sign bit clear. 
As specified by IEEE Std 754-2008 the (neg:M OP3) operation is required to invert the sign bit of the qNaN data pattern in calculating TMP3, and then the (fma:M OP1 OP2 TMP3) operation is required to pass the TMP3 qNaN data pattern unchanged in calculating the final result. One thing I did not put in this patch was to add code to use the mips32r6/mips64r6 msubf instruction. This instruction implements 'c - (a * b)', not '(a * b) - c' and since it not currently used by GCC I decided not to add it to this patch. Fused c - (a * b) is exactly equivalent to fused (-a * b) + c or (a * -b) + c (I don't know which is canonical in RTL). (It's *not* equivalent to fused -((a * b) - c), when the result is an exact zero. And moving negation inside multiplication like that is only valid for a fused operation, not unfused if -frounding-math.) Well as I say, IEEE Std 754-2008 does not define a fused (c - (a * b)) or fused ((-a * b) + c) operation. The only fused operation defined by the standard is ((a * b) + c) and consequently, as I noted above, any additional IEEE Std 754-2008 compliant fused operations have to be implemented in terms of negating one or more of input operands, or the result. That is at least my understanding of IEEE Std 754-2008 -- if you think otherwise, can you prove me wrong? When going beyond IEEE Std 754-2008 we can of course define anything we want, but I think that'd have to be controlled with a separate compilation flag, assuming that we do want to go there (beyond -ffinite-math-only that we already have and that in turn causes HONOR_NANS to return FALSE). One complication however is the C11 language standard is still based around IEEE Std 754-1985 that in turn does not define any fused operation at all. Question might be whether we want to go into the IEEE Std 754-2008 territory before (unless?) the C language standard has reached there. However GCC has already adopted FMA, so I gather that we do. 
I can't speak of other languages GCC supports and their correspondence to IEEE Std 754. I agree with your note about the sign of exact zero results BTW, but it has been taken into account with Steve's patch. The relevant operations use or omit a HONOR_SIGNED_ZEROS check as applicable. If I got anything wrong in the above consideration, then I'll be happy to be corrected. Maciej
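
The remark about exact zero results can be checked with a few lines of C. This is only a sketch under stated assumptions: it uses ordinary arithmetic rather than a fused instruction (for the exactly representable inputs mentioned below the product is exact, so fused and unfused evaluation agree), it assumes the default round-to-nearest mode, and the helper names are invented for the example.

```c
#include <math.h>
#include <stdbool.h>

/* c - (a * b), the form computed by an msubf-style operation.  */
static double
c_minus_ab (double a, double b, double c)
{
  return c - a * b;
}

/* -((a * b) - c): multiply-subtract followed by a negation of the result.  */
static double
negated_ab_minus_c (double a, double b, double c)
{
  return -(a * b - c);
}

/* True if X is a zero with its sign bit set.  */
static bool
is_negative_zero (double x)
{
  return x == 0.0 && signbit (x);
}
```

For a = b = c = 1.0 the first form gives +0 and the second gives -0, while for inputs with a nonzero result (say a = 2, b = 3, c = 10) both give 4. That is why the two forms are interchangeable only when signed zeros need not be honored.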
Re: [PATCH, rs6000, testsuite, PR65456] Changes for unaligned vector load/store support on POWER8
On Fri, 2015-06-12 at 17:36 +0100, Vidya Praveen wrote: On Thu, Apr 30, 2015 at 01:34:18PM +0100, Bill Schmidt wrote: On Thu, 2015-04-30 at 18:26 +0800, Bin.Cheng wrote: On Mon, Apr 27, 2015 at 9:26 PM, Bill Schmidt wschm...@linux.vnet.ibm.com wrote: On Mon, 2015-04-27 at 14:23 +0800, Bin.Cheng wrote: On Mon, Mar 30, 2015 at 1:42 AM, Bill Schmidt wschm...@linux.vnet.ibm.com wrote: Index: gcc/testsuite/gcc.dg/vect/vect-33.c === --- gcc/testsuite/gcc.dg/vect/vect-33.c (revision 221118) +++ gcc/testsuite/gcc.dg/vect/vect-33.c (working copy) @@ -36,9 +36,10 @@ int main (void) return main1 (); } +/* vect_hw_misalign { ! vect64 } */ /* { dg-final { scan-tree-dump-times vectorized 1 loops 1 vect } } */ -/* { dg-final { scan-tree-dump Vectorizing an unaligned access vect { target { vect_hw_misalign { {! vect64} || vect_multiple_sizes } } } } } */ +/* { dg-final { scan-tree-dump Vectorizing an unaligned access vect { target { { { ! powerpc*-*-* } vect_hw_misalign } { { ! vect64 } || vect_multiple_sizes } } } } } */ /* { dg-final { scan-tree-dump Alignment of access forced using peeling vect { target { vector_alignment_reachable { vect64 {! vect_multiple_sizes} } } } } } */ /* { dg-final { scan-tree-dump-times Alignment of access forced using versioning 1 vect { target { { {! vector_alignment_reachable} || {! vect64} } {! vect_hw_misalign} } } } } */ /* { dg-final { cleanup-tree-dump vect } } */ Hi Bill, With this change, the test case is skipped on aarch64 now. Since it passed before, Is it expected to act like this on 64bit platforms? Hi Bin, No, that's a mistake on my part -- thanks for the report! That first added line was not intended to be part of the patch: +/* vect_hw_misalign { ! vect64 } */ Please try removing that line and verify that the patch succeeds again for ARM. Assuming so, I'll prepare a patch to fix this. It looks like this mistake was introduced only in this particular test, but please let me know if you see any other anomalies. 
Hi Bill, I chased the wrong branch. The test disappeared on fsf-48 branch in out build, rather than trunk. I guess it's not your patch's fault. Will follow up and get back to you later. Sorry for the inconvenience. OK, thanks for letting me know! There was still a bad line in this patch, although it was only introduced in 5.1 and trunk, so I guess that wasn't responsible in this case. Thanks for checking! Hi Bill, In 4.8 branch, you have changed: -/* { dg-final { scan-tree-dump-times Vectorizing an unaligned access 0 vect } } */ +/* { dg-final { scan-tree-dump-times Vectorizing an unaligned access 0 vect { target { ! vect_hw_misalign } } } } */ Whereas your comment says: 2015-04-24 Bill Schmidt wschm...@linux.vnet.ibm.com Backport from mainline r222349 2015-04-22 Bill Schmidt wschm...@linux.vnet.ibm.com PR target/65456 [...] * gcc.dg/vect/vect-33.c: Exclude unaligned access test for POWER8. [...] There wasn't an unaligned access test in the first place. But if you wanted to introduce it and exclude it for POWER8 then it should've been: ... { { ! powerpc*-*-* } vect_hw_misalign } ... like you have done for the trunk. At the moment, this change has made the test to be skipped for AArch64. It should've been skipped for x86_64-*-* and i*86-*-* as well. I believe it wasn't intended to be skipped so? Right, wasn't intended to be skipped. This test changed substantially between 4.8 and 4.9, so when I did the backport I tried (and failed) to adjust it properly. Because the sense of the test has been reversed, I believe the correct change is /* { dg-final { scan-tree-dump-times Vectorizing an unaligned access 0 vect { target { { ! powerpc*-*-* } || { ! vect_hw_misalign } } } } } */ I'll give that a quick test. Bill Regards VP. Bill Thanks, bin Thanks very much! 
Bill PASS-NA: gcc.dg/vect/vect-33.c -flto -ffat-lto-objects scan-tree-dump-times vect Vectorizing an unaligned access 0 PASS-NA: gcc.dg/vect/vect-33.c scan-tree-dump-times vect Vectorizing an unaligned access 0 Thanks, bin
Re: [gomp4.1] Add new versions of GOMP_target{,_data,_update} and GOMP_target_enter_exit_data
On Mon, Jun 15, 2015 at 18:25:28 +0200, Jakub Jelinek wrote: On Mon, Jun 15, 2015 at 07:18:27PM +0300, Ilya Verbin wrote: On Mon, Jun 15, 2015 at 15:06:09 +0200, Jakub Jelinek wrote: On Mon, Jun 15, 2015 at 03:20:37PM +0300, Ilya Verbin wrote: This patch introduces new versions of GOMP_target{,_data,_update} for OpenMP 4.1 with unsigned short for map kinds, but without new async arguments yet. I think I'd prefer (for now) to suffix the functions with _41 instead of 1 (and we'll see if we can come up with better names when async support is added). OK. Thanks. Do we need to change GOMP_target_update though (at least right now)? I mean, the construct only allows to and from clauses, not the map clause, and those don't really have an always modifier, nor release/delete semantics etc., so at least for now I think using the current GOMP_target_update should be ok. I thought that it wouldn't look good, since without GOMP_target_update_41 we will need to keep this obsolete parts: I'd prefer to keep it for now, perhaps later on we'll switch to 16-bit kinds even for that, but better figure out first what to do with the async stuff, handle the enter/exit data correctly, change the library for OpenMP 4.1 to do the fully refcounted model. Here is the new patch. OK to commit? gcc/ * builtin-types.def (BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR): New. (BT_FN_VOID_INT_OMPFN_PTR_SIZE_PTR_PTR_PTR): Remove. * omp-builtins.def (BUILT_IN_GOMP_TARGET): Replace GOMP_target with GOMP_target_41. (BUILT_IN_GOMP_TARGET_DATA): Replace GOMP_target_data with GOMP_target_data_41. (BUILT_IN_GOMP_TARGET_ENTER_EXIT_DATA): New. * omp-low.c (expand_omp_target): Use BUILT_IN_GOMP_TARGET_ENTER_EXIT_DATA for GF_OMP_TARGET_KIND_ENTER_DATA and GF_OMP_TARGET_KIND_EXIT_DATA. Do not pass obsolete pointer to new builtins. (lower_omp_target): Use unsigned short for map kinds, except BUILT_IN_GOMP_TARGET_UPDATE. gcc/fortran/ * types.def (BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR): New. 
(BT_FN_VOID_INT_OMPFN_PTR_SIZE_PTR_PTR_PTR): Remove. libgomp/ * libgomp.map (GOMP_4.1): Add GOMP_target_41, GOMP_target_data_41, GOMP_target_enter_exit_data. * libgomp_g.h: Declare GOMP_target_41, GOMP_target_data_41, GOMP_target_enter_exit_data. * target.c (resolve_device): Call gomp_init_device here instead of GOMP_target*. (get_kind): Rename is_openacc to short_mapkind. (gomp_map_vars): Likewise. (gomp_unmap_vars): Likewise. (gomp_update): Likewise. (gomp_target_fallback): New static function. (gomp_get_target_fn_addr): New static function. (GOMP_target): Move host fallback and fn lookup to the new functions. (GOMP_target_41): New function. (gomp_target_data_fallback): New static function. (GOMP_target_data): Move host fallback to the new function. (GOMP_target_data_41): New function. (GOMP_target_update): Do not call gomp_init_device. (GOMP_target_enter_exit_data): New function. diff --git a/gcc/builtin-types.def b/gcc/builtin-types.def index 492ca63..870c957 100644 --- a/gcc/builtin-types.def +++ b/gcc/builtin-types.def @@ -526,6 +526,9 @@ DEF_FUNCTION_TYPE_6 (BT_FN_BOOL_SIZE_VPTR_PTR_PTR_INT_INT, BT_BOOL, BT_SIZE, BT_VOLATILE_PTR, BT_PTR, BT_PTR, BT_INT, BT_INT) DEF_FUNCTION_TYPE_6 (BT_FN_VOID_INT_PTR_SIZE_PTR_PTR_PTR, BT_VOID, BT_INT, BT_PTR, BT_SIZE, BT_PTR, BT_PTR, BT_PTR) +DEF_FUNCTION_TYPE_6 (BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR, +BT_VOID, BT_INT, BT_PTR_FN_VOID_PTR, BT_SIZE, BT_PTR, +BT_PTR, BT_PTR) DEF_FUNCTION_TYPE_7 (BT_FN_VOID_OMPFN_PTR_UINT_LONG_LONG_LONG_UINT, BT_VOID, BT_PTR_FN_VOID_PTR, BT_PTR, BT_UINT, @@ -534,9 +537,6 @@ DEF_FUNCTION_TYPE_7 (BT_FN_BOOL_BOOL_ULL_ULL_ULL_ULL_ULLPTR_ULLPTR, BT_BOOL, BT_BOOL, BT_ULONGLONG, BT_ULONGLONG, BT_ULONGLONG, BT_ULONGLONG, BT_PTR_ULONGLONG, BT_PTR_ULONGLONG) -DEF_FUNCTION_TYPE_7 (BT_FN_VOID_INT_OMPFN_PTR_SIZE_PTR_PTR_PTR, -BT_VOID, BT_INT, BT_PTR_FN_VOID_PTR, BT_PTR, BT_SIZE, -BT_PTR, BT_PTR, BT_PTR) DEF_FUNCTION_TYPE_8 (BT_FN_VOID_OMPFN_PTR_UINT_LONG_LONG_LONG_LONG_UINT, BT_VOID, BT_PTR_FN_VOID_PTR, 
BT_PTR, BT_UINT, diff --git a/gcc/fortran/types.def b/gcc/fortran/types.def index c0d3989..a830235 100644 --- a/gcc/fortran/types.def +++ b/gcc/fortran/types.def @@ -189,6 +189,9 @@ DEF_FUNCTION_TYPE_6 (BT_FN_BOOL_VPTR_PTR_I16_BOOL_INT_INT, BT_INT) DEF_FUNCTION_TYPE_6 (BT_FN_BOOL_SIZE_VPTR_PTR_PTR_INT_INT, BT_BOOL, BT_SIZE, BT_VOLATILE_PTR, BT_PTR, BT_PTR, BT_INT, BT_INT) +DEF_FUNCTION_TYPE_6
[gomp4] initial support for openacc worker state propagation
This patch adds preliminary support for worker state propagation inside acc loops. Besides for the same lack of precise data flow information shared with vector broadcasting, this patch does not attempt to reserve a sufficient amount of .shared memory to spill-and-fill all of the broadcasted variables simultaneously. So instead, all of the variables are broadcasted sequentially for the moment. A follow up patch at a later date will address this issue. This patch has been applied to gomp-4_0-branch. Cesar 2015-06-15 Cesar Philippidis ce...@codesourcery.com gcc/ * omp-low.c (expand_omp_for_static_nochunk): Update entry_bb after calling oacc-boradcast. (expand_omp_for_static_chunk): Likewise. (generate_oacc_broadcast): Insert a barrier after reading from shared memory. (settree live_in): New static set. (populate_loop_live_in): New function. (oacc_populate_live_in_1): New function. (oacc_populate_live_in): New function. (oacc_broadcast_1): (oacc_broadcast): Extract loop data from omp_regions instead of omp_for_data, populate live_in set, broadcast worker variables, and return updated entry_bb. diff --git a/gcc/omp-low.c b/gcc/omp-low.c index 6b261ca..c7451c9 100644 --- a/gcc/omp-low.c +++ b/gcc/omp-low.c @@ -291,8 +291,8 @@ static vecomp_context * taskreg_contexts; static void scan_omp (gimple_seq *, omp_context *); static tree scan_omp_1_op (tree *, int *, void *); -static void oacc_broadcast (basic_block, basic_block, struct omp_region *, - struct omp_for_data *); +static basic_block oacc_broadcast (basic_block, basic_block, + struct omp_region *); #define WALK_SUBSTMTS \ case GIMPLE_BIND: \ @@ -7326,7 +7326,8 @@ expand_omp_for_static_nochunk (struct omp_region *region, exit_bb = region-exit; /* Broadcast variables to OpenACC threads. */ - oacc_broadcast (entry_bb, fin_bb, region, fd); + entry_bb = oacc_broadcast (entry_bb, fin_bb, region); + region-entry = entry_bb; /* Iteration space partitioning goes in ENTRY_BB. 
*/ gsi = gsi_last_bb (entry_bb); @@ -7738,7 +7739,8 @@ expand_omp_for_static_chunk (struct omp_region *region, fin_bb = BRANCH_EDGE (iter_part_bb)-dest; /* Broadcast variables to OpenACC threads. */ - oacc_broadcast (entry_bb, fin_bb, region, fd); + entry_bb = oacc_broadcast (entry_bb, fin_bb, region); + region-entry = entry_bb; gcc_assert (broken_loop || fin_bb == FALLTHRU_EDGE (cont_bb)-dest); @@ -10681,6 +10683,8 @@ generate_oacc_broadcast (omp_region *region, tree dest_var, tree var, gassign *ld = gimple_build_assign (dest_var, build_simple_mem_ref (ptr)); gsi_insert_after (where, ld, GSI_NEW_STMT); + gsi_insert_after (where, build_oacc_threadbarrier (), GSI_NEW_STMT); + return st; } @@ -10946,7 +10950,77 @@ predicate_omp_regions (basic_block head_bb) } /* USE and GET sets for variable broadcasting. */ -static std::settree use, gen; +static std::settree use, gen, live_in; + +/* This is an extremely conservative live in analysis. We only want to + detect is any compiler temporary used inside an acc loop is local to + that loop or not. So record all decl uses in all the basic blocks + post-dominating the acc loop in question. 
*/ +static tree +populate_loop_live_in (tree *tp, int *walk_subtrees, + void *data_ ATTRIBUTE_UNUSED) +{ + struct walk_stmt_info *wi = (struct walk_stmt_info *) data_; + + if (wi wi-is_lhs) +{ + if (VAR_P (*tp)) + live_in.insert (*tp); +} + else if (IS_TYPE_OR_DECL_P (*tp)) +*walk_subtrees = 0; + + return NULL_TREE; +} + +static void +oacc_populate_live_in_1 (basic_block entry_bb, basic_block exit_bb, + basic_block loop_bb) +{ + basic_block son; + gimple_stmt_iterator gsi; + + if (entry_bb == exit_bb) +return; + + if (!dominated_by_p (CDI_DOMINATORS, loop_bb, entry_bb)) +return; + + for (gsi = gsi_start_bb (entry_bb); !gsi_end_p (gsi); gsi_next (gsi)) +{ + struct walk_stmt_info wi; + gimple stmt; + + memset (wi, 0, sizeof (wi)); + stmt = gsi_stmt (gsi); + + walk_gimple_op (stmt, populate_loop_live_in, wi); +} + + /* Continue walking the dominator tree. */ + for (son = first_dom_son (CDI_DOMINATORS, entry_bb); + son; + son = next_dom_son (CDI_DOMINATORS, son)) +oacc_populate_live_in_1 (son, exit_bb, loop_bb); +} + +static void +oacc_populate_live_in (basic_block entry_bb, omp_region *region) +{ + /* Find the innermost OMP_TARGET region. */ + while (region region-type != GIMPLE_OMP_TARGET) +region = region-outer; + + if (!region) +return; + + basic_block son; + + for (son = first_dom_son (CDI_DOMINATORS, region-entry); + son; + son = next_dom_son (CDI_DOMINATORS, son)) +oacc_populate_live_in_1 (son, region-exit, entry_bb); +} static tree populate_loop_use (tree *tp, int *walk_subtrees, void *data_) @@ -11044,44 +8,91 @@ oacc_broadcast_1 (basic_block entry_bb, basic_block exit_bb, bool init, is a latch back to
Re: [4.8, testsuite] Correct backported fix to gcc.dg/vect/vect-33.c
I just was reading the gcc mailing list and realized that changes to 4.8 now require release manager approval. Adding Richard to the CC list for consideration. Thanks! Bill On Mon, 2015-06-15 at 14:54 -0500, Bill Schmidt wrote: Hi, When I backported support for unaligned vector load/store operations on POWER8 to GCC 4.8, I fumbled the change for gcc.dg/vect/vect-33.c. One of the original tests was: /* { dg-final { scan-tree-dump-times Vectorizing an unaligned access 0 vect } } */ which I modified to /* { dg-final { scan-tree-dump-times Vectorizing an unaligned access 0 vect { target { ! vect_hw_misalign } } } } */ This caused the test to be skipped for architectures other than PowerPC, which was a mistake. The correct test should have been: /* { dg-final { scan-tree-dump-times Vectorizing an unaligned access 0 vect { target { { ! powerpc*-*-* } || { ! vect_hw_misalign } } } } } */ which leaves things alone for other architectures. Ok for 4.8? Thanks, Bill 2015-06-15 Bill Schmidt wschm...@linux.vnet.ibm.com * gcc.dg/vect/vect-33.c: Don't exclude Vectorizing an unaligned access test for non-PowerPC arches. Index: gcc/testsuite/gcc.dg/vect/vect-33.c === --- gcc/testsuite/gcc.dg/vect/vect-33.c (revision 224490) +++ gcc/testsuite/gcc.dg/vect/vect-33.c (working copy) @@ -38,7 +38,7 @@ int main (void) /* { dg-final { scan-tree-dump-times vectorized 1 loops 1 vect } } */ -/* { dg-final { scan-tree-dump-times Vectorizing an unaligned access 0 vect { target { ! vect_hw_misalign } } } } */ +/* { dg-final { scan-tree-dump-times Vectorizing an unaligned access 0 vect { target { { ! powerpc*-*-* } || { ! vect_hw_misalign } } } } } */ /* { dg-final { scan-tree-dump Alignment of access forced using peeling vect { target vector_alignment_reachable } } } */ /* { dg-final { scan-tree-dump-times Alignment of access forced using versioning 1 vect { target { {! vector_alignment_reachable} {! vect_hw_misalign} } } } } */ /* { dg-final { cleanup-tree-dump vect } } */
Re: [gomp4.1] Add new versions of GOMP_target{,_data,_update} and GOMP_target_enter_exit_data
On Mon, Jun 15, 2015 at 10:48:50PM +0300, Ilya Verbin wrote: Here is the new patch. OK to commit? gcc/ * builtin-types.def (BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR): New. (BT_FN_VOID_INT_OMPFN_PTR_SIZE_PTR_PTR_PTR): Remove. * omp-builtins.def (BUILT_IN_GOMP_TARGET): Replace GOMP_target with GOMP_target_41. (BUILT_IN_GOMP_TARGET_DATA): Replace GOMP_target_data with GOMP_target_data_41. (BUILT_IN_GOMP_TARGET_ENTER_EXIT_DATA): New. * omp-low.c (expand_omp_target): Use BUILT_IN_GOMP_TARGET_ENTER_EXIT_DATA for GF_OMP_TARGET_KIND_ENTER_DATA and GF_OMP_TARGET_KIND_EXIT_DATA. Do not pass obsolete pointer to new builtins. (lower_omp_target): Use unsigned short for map kinds, except BUILT_IN_GOMP_TARGET_UPDATE. gcc/fortran/ * types.def (BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR): New. (BT_FN_VOID_INT_OMPFN_PTR_SIZE_PTR_PTR_PTR): Remove. libgomp/ * libgomp.map (GOMP_4.1): Add GOMP_target_41, GOMP_target_data_41, GOMP_target_enter_exit_data. * libgomp_g.h: Declare GOMP_target_41, GOMP_target_data_41, GOMP_target_enter_exit_data. * target.c (resolve_device): Call gomp_init_device here instead of GOMP_target*. (get_kind): Rename is_openacc to short_mapkind. (gomp_map_vars): Likewise. (gomp_unmap_vars): Likewise. (gomp_update): Likewise. (gomp_target_fallback): New static function. (gomp_get_target_fn_addr): New static function. (GOMP_target): Move host fallback and fn lookup to the new functions. (GOMP_target_41): New function. (gomp_target_data_fallback): New static function. (GOMP_target_data): Move host fallback to the new function. (GOMP_target_data_41): New function. (GOMP_target_update): Do not call gomp_init_device. (GOMP_target_enter_exit_data): New function. Ok, thanks. Jakub
Re: [PATCH] Adding warning for constexpr's
2015-06-12 18:35 GMT-03:00 Joseph Myers jos...@codesourcery.com:

On Fri, 12 Jun 2015, Andres Tiraboschi wrote:

Hi, this patch is for adding a warning when a constexpr cannot be evaluated at compile time. This is a single case: type var = fun(args...), with fun declared as a constexpr.

All options need documenting in invoke.texi.

All diagnostics need testcases added to the testsuite.

C++-specific options go in c.opt and should be listed as C++ ObjC++, not Common.

All new diagnostics should use warning_at etc. with explicit locations passed, unless there is some strong reason it's hard to get the relevant location when the warning is given.

--
Joseph S. Myers
jos...@codesourcery.com

Hi, do you know where the .exp file for the tests in .../gcc/testsuite/g++.dg/warn is? I can't find it.

Thanks again
Re: [PATCH] toplevel: fixes for in-tree libiconv
This is the first in a series of patches to make a build with an in-tree GNU libiconv work as designed. This patch fixes dependencies for parallel make, and avoids failures with make targets not supported by GNU libiconv. This is OK. Thanks!
RE: [Patch, MIPS] Modify sysroot layout for mips-mti-* and mips-img-*
Apart from wanting to understand the need/lack of ! in various places, this is otherwise OK.

Matthew

Ok, I split the long lines and added the missing '!' characters to all the MULTILIB_OSDIRNAMES except for the default ones. Using the '!' on the default MULTILIB_OSDIRNAMES like:

MULTILIB_OSDIRNAMES = .=mips-r6-hard/lib

in t-img-linux causes the '--print-multi-os-directory' option to include the '!' character in the output when using the default options, and that causes all sorts of problems when building.

Without a '!' on the default case:

% inst*/bin/*-gcc --print-multi-os-directory
mips-r6-hard/lib

With a '!' on the default case:

% inst*/bin/*-gcc --print-multi-os-directory
!mips-r6-hard/lib

This is probably a bug in the scripts that parse the t-* files and set up the various multi-lib lists, but I didn't dig into exactly where or what the problem was; I just left the '!' off the default case. The non-default cases should all have a '!' and I have added it to the ones that were missing it.

I will check this patch in after I have done one more round of testing to make sure that my latest changes didn't break anything.

Steve Ellcey
sell...@imgtec.com
Re: [Patch, MIPS] Enable fp-contract on MIPS and update -mfused-madd
On Mon, 15 Jun 2015, Maciej W. Rozycki wrote:

operands negated. That negation, implemented with the IEEE Std 754-2008 `negate' operation that you referred to, by definition is required to operate on the sign of its operand in a specific way even if the operand is a qNaN. So for example `fmsM4', that is specified at the RTL level as (fma:M OP1 OP2 (neg:M OP3)) will not produce the correct result with the fused version of the MIPS MSUB.fmt instruction in the case where OP1 and OP2 are numeric data patterns and OP3 is a qNaN data pattern that has its sign bit clear. As specified by IEEE Std 754-2008 the (neg:M OP3) operation is required to invert the sign bit of the qNaN data pattern in calculating TMP3, and then the (fma:M OP1 OP2 TMP3) operation is required to pass the TMP3 qNaN data pattern unchanged in calculating the final result.

It is only required (well, recommended) to pass the *payload*. The sign bit is not part of the payload.

"For all other operations, this standard does not specify the sign bit of a NaN result, even when there is only one input NaN, or when the NaN is produced from an invalid operation."

--
Joseph S. Myers
jos...@codesourcery.com
[PATCH] Altivec mulv4si3 and mulv8hi3 cleanup
POWER8 added a multiply instruction that makes mulv4si more efficient. And vmladduhm can be used for mulv8hi3. This patch also changes vmladduhm from a black box UNSPEC to descriptive RTL. Bootstrapped on powerpc64le-linux. * altivec.md: Delete UNSPEC_VMLADDUHM. (mulv4si3_p8): New pattern. (mulv4si3): Use it for POWER8. (mulv8hi3): Use vmladduhm with zero addend. (altivec_vmladduhm): Descriptive RTL. - David * altivec.md: Delete UNSPEC_VMLADDUHM. (mulv4si3_p8): New pattern. (mulv4si3): Use it for POWER8. (mulv8hi3): Use vmladduhm with zero addend. (altivec_vmladduhm): Descriptive RTL. Index: altivec.md === --- altivec.md (revision 224450) +++ altivec.md (working copy) @@ -27,7 +27,6 @@ UNSPEC_VMSUMSHS UNSPEC_VMHADDSHS UNSPEC_VMHRADDSHS - UNSPEC_VMLADDUHM UNSPEC_VADDCUW UNSPEC_VADDU UNSPEC_VADDS @@ -634,13 +633,20 @@ ;; [(set (match_operand:V4SI 0 register_operand =v) ;; (mult:V4SI (match_operand:V4SI 1 register_operand v) ;;(match_operand:V4SI 2 register_operand v)))] +(define_insn mulv4si3_p8 + [(set (match_operand:V4SI 0 register_operand =v) +(mult:V4SI (match_operand:V4SI 1 register_operand v) + (match_operand:V4SI 2 register_operand v)))] + TARGET_P8_VECTOR + vmuluwm %0,%1,%2 + [(set_attr type veccomplex)]) + (define_expand mulv4si3 [(use (match_operand:V4SI 0 register_operand )) (use (match_operand:V4SI 1 register_operand )) (use (match_operand:V4SI 2 register_operand ))] TARGET_ALTIVEC - - { +{ rtx zero; rtx swap; rtx small_swap; @@ -650,6 +656,12 @@ rtx low_product; rtx high_product; + if (TARGET_P8_VECTOR) +{ + emit_insn (gen_mulv4si3_p8 (operands[0], operands[1], operands[2])); + DONE; +} + zero = gen_reg_rtx (V4SImode); emit_insn (gen_altivec_vspltisw (zero, const0_rtx)); @@ -679,7 +691,7 @@ emit_insn (gen_addv4si3 (operands[0], high_product, low_product)); DONE; - }) +}) (define_expand mulv8hi3 [(use (match_operand:V8HI 0 register_operand )) @@ -686,32 +698,14 @@ (use (match_operand:V8HI 1 register_operand )) (use (match_operand:V8HI 2 register_operand 
))] TARGET_ALTIVEC - { - rtx odd = gen_reg_rtx (V4SImode); - rtx even = gen_reg_rtx (V4SImode); - rtx high = gen_reg_rtx (V4SImode); - rtx low = gen_reg_rtx (V4SImode); + rtx zero = gen_reg_rtx (V8HImode); - if (BYTES_BIG_ENDIAN) - { - emit_insn (gen_altivec_vmulesh (even, operands[1], operands[2])); - emit_insn (gen_altivec_vmulosh (odd, operands[1], operands[2])); - emit_insn (gen_altivec_vmrghw_direct (high, even, odd)); - emit_insn (gen_altivec_vmrglw_direct (low, even, odd)); - emit_insn (gen_altivec_vpkuwum_direct (operands[0], high, low)); - } - else - { - emit_insn (gen_altivec_vmulosh (even, operands[1], operands[2])); - emit_insn (gen_altivec_vmulesh (odd, operands[1], operands[2])); - emit_insn (gen_altivec_vmrghw_direct (high, odd, even)); - emit_insn (gen_altivec_vmrglw_direct (low, odd, even)); - emit_insn (gen_altivec_vpkuwum_direct (operands[0], low, high)); - } + emit_insn (gen_altivec_vspltish (zero, const0_rtx)); + emit_insn (gen_altivec_vmladduhm(operands[0], operands[1], operands[2], zero)); DONE; -}) +}) ;; Fused multiply subtract (define_insn *altivec_vnmsubfp @@ -851,10 +845,9 @@ (define_insn altivec_vmladduhm [(set (match_operand:V8HI 0 register_operand =v) -(unspec:V8HI [(match_operand:V8HI 1 register_operand v) - (match_operand:V8HI 2 register_operand v) - (match_operand:V8HI 3 register_operand v)] -UNSPEC_VMLADDUHM))] +(plus:V8HI (mult:V8HI (match_operand:V8HI 1 register_operand v) + (match_operand:V8HI 2 register_operand v)) + (match_operand:V8HI 3 register_operand v)))] TARGET_ALTIVEC vmladduhm %0,%1,%2,%3 [(set_attr type veccomplex)])
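
To make the mulv8hi3 change concrete, here is a scalar sketch of what a modulo multiply-low-add does per 16-bit lane, and why a zero addend turns it into a plain multiply. The helper names and the use of plain arrays are assumptions made for illustration; this models the arithmetic only and is not how the pattern is implemented in altivec.md.

```c
#include <stdint.h>

#define V8HI_LANES 8

/* One lane of a modulo multiply-low-add: (a * b + c) mod 2^16.  */
static uint16_t
mladd_u16 (uint16_t a, uint16_t b, uint16_t c)
{
  /* Widen to 32 bits; the largest possible value,
     65535 * 65535 + 65535, still fits.  */
  uint32_t wide = (uint32_t) a * (uint32_t) b + (uint32_t) c;
  return (uint16_t) wide;   /* Keep only the low 16 bits.  */
}

/* Element-wise 16-bit multiply written as multiply-add with a zero addend.  */
static void
mul_v8hi (uint16_t dst[V8HI_LANES], const uint16_t a[V8HI_LANES],
          const uint16_t b[V8HI_LANES])
{
  for (int i = 0; i < V8HI_LANES; i++)
    dst[i] = mladd_u16 (a[i], b[i], 0);
}
```

Because the low 16 bits of a product are the same whether the inputs are read as signed or unsigned, an unsigned modulo operation of this kind can serve for a signed V8HI multiply as well.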
Re: [Patch, MIPS] Enable fp-contract on MIPS and update -mfused-madd
On Mon, 15 Jun 2015, Joseph Myers wrote:

> > operands negated. That negation, implemented with the IEEE Std 754-2008 `negate' operation that you referred to, by definition is required to operate on the sign of its operand in a specific way even if the operand is a qNaN.
> >
> > So for example `fmsM4', that is specified at the RTL level as (fma:M OP1 OP2 (neg:M OP3)) will not produce the correct result with the fused version of the MIPS MSUB.fmt instruction in the case where OP1 and OP2 are numeric data patterns and OP3 is a qNaN data pattern that has its sign bit clear. As specified by IEEE Std 754-2008 the (neg:M OP3) operation is required to invert the sign bit of the qNaN data pattern in calculating TMP3, and then the (fma:M OP1 OP2 TMP3) operation is required to pass the TMP3 qNaN data pattern unchanged in calculating the final result.
>
> It is only required (well, recommended) to pass the *payload*. The sign bit is not part of the payload. "For all other operations, this standard does not specify the sign bit of a NaN result, even when there is only one input NaN, or when the NaN is produced from an invalid operation."

However elsewhere: "For an operation with quiet NaN inputs, other than maximum and minimum operations, if a floating-point result is to be delivered the result shall be a quiet NaN which should be one of the input NaNs."

I think such a wording makes it clear that the input NaN bit pattern is propagated with no change whatsoever and I can't immediately infer if, for the purpose of standard interpretation, the clause you've quoted extends one that I have or whether one I've quoted narrows one that you have. Maybe this is a mistake/inconsistency in the standard.

Of course you're right these are recommendations only (should vs shall), but I think we might want to have/keep a mode where IEEE Std 754 recommendations are strictly followed (where hardware permits).

Maciej
Re: [Patch, MIPS] Enable fp-contract on MIPS and update -mfused-madd
On Mon, 15 Jun 2015, Maciej W. Rozycki wrote:

> > It is only required (well, recommended) to pass the *payload*. The sign bit is not part of the payload. "For all other operations, this standard does not specify the sign bit of a NaN result, even when there is only one input NaN, or when the NaN is produced from an invalid operation."
>
> However elsewhere: "For an operation with quiet NaN inputs, other than maximum and minimum operations, if a floating-point result is to be delivered the result shall be a quiet NaN which should be one of the input NaNs."

See http://grouper.ieee.org/groups/754/email/msg03893.html: "The intent is that NaNs which differ only in the sign bit are considered equivalent for the purposes of 6.2."

--
Joseph S. Myers
jos...@codesourcery.com
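The distinction between a NaN's sign bit and its payload can be made concrete with a small C sketch. It assumes double is IEEE 754 binary64 with the sign in bit 63; the helpers double_bits, bits_double, flip_sign and same_nan_modulo_sign are hypothetical names used only for illustration. The sign is flipped by bit manipulation rather than with unary minus so that the result does not depend on how a particular compiler or FPU implements negation.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Assumption: double is IEEE 754 binary64, sign in bit 63.  */
#define SIGN_MASK UINT64_C (0x8000000000000000)

/* Return the raw bit pattern of X.  */
static uint64_t
double_bits (double x)
{
  uint64_t u;
  memcpy (&u, &x, sizeof u);
  return u;
}

/* Return the double whose raw bit pattern is U.  */
static double
bits_double (uint64_t u)
{
  double x;
  memcpy (&x, &u, sizeof x);
  return x;
}

/* Invert only the sign bit of X, which is what the IEEE 754 negate
   operation does, even when X is a NaN.  */
static double
flip_sign (double x)
{
  return bits_double (double_bits (x) ^ SIGN_MASK);
}

/* Nonzero if A and B are both NaNs whose bit patterns differ at most
   in the sign bit, i.e. the NaNs the cited message treats as
   equivalent for the purposes of 6.2.  */
static int
same_nan_modulo_sign (double a, double b)
{
  return isnan (a) && isnan (b)
	 && ((double_bits (a) ^ double_bits (b)) & ~SIGN_MASK) == 0;
}
```

Under Joseph's reading, a fused MSUB that returns the qNaN input without the sign inversion still delivers a NaN for which same_nan_modulo_sign holds; under the stricter reading Maciej describes, the two bit patterns would have to match exactly.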
Re: [Patch ARM-AArch64/testsuite Neon intrinsics 00/20] Executable tests
Ping?

On 27 May 2015 at 22:15, Christophe Lyon christophe.l...@linaro.org wrote:
> This patch series is a follow-up to the tests I already contributed, converted from my original testsuite.
>
> This series consists in 20 new patches, which can be committed independently. For vrecpe, I added the setting of the Flush-to-Zero FP flag, to force AArch64 to behave the same as ARM by default.
>
> This is the final batch, except for the vget_lane tests which I will submit later. This should cover the subset of AdvSIMD intrinsics common to ARMv7 and AArch64.
>
> Tested with qemu on arm*linux, aarch64-linux.
>
> 2015-05-27 Christophe Lyon christophe.l...@linaro.org
>
> 	* gcc.target/aarch64/advsimd-intrinsics/arm-neon-ref.h
> 	(_ARM_FPSCR): Add FZ field.
> 	(clean_results): Force FZ=1 on AArch64.
> 	* gcc.target/aarch64/advsimd-intrinsics/vrecpe.c: New file.
> 	* gcc.target/aarch64/advsimd-intrinsics/vrecps.c: Likewise.
> 	* gcc.target/aarch64/advsimd-intrinsics/vreinterpret.c: Likewise.
> 	* gcc.target/aarch64/advsimd-intrinsics/vrev.c: Likewise.
> 	* gcc.target/aarch64/advsimd-intrinsics/vrshl.c: Likewise.
> 	* gcc.target/aarch64/advsimd-intrinsics/vrshr_n.c: Likewise.
> 	* gcc.target/aarch64/advsimd-intrinsics/vrshrn_n.c: Likewise.
> 	* gcc.target/aarch64/advsimd-intrinsics/vrsqrte.c: Likewise.
> 	* gcc.target/aarch64/advsimd-intrinsics/vrsqrts.c: Likewise.
> 	* gcc.target/aarch64/advsimd-intrinsics/vrsra_n.c: Likewise.
> 	* gcc.target/aarch64/advsimd-intrinsics/vset_lane.c: Likewise.
> 	* gcc.target/aarch64/advsimd-intrinsics/vshl_n.c: Likewise.
> 	* gcc.target/aarch64/advsimd-intrinsics/vshll_n.c: Likewise.
> 	* gcc.target/aarch64/advsimd-intrinsics/vshr_n.c: Likewise.
> 	* gcc.target/aarch64/advsimd-intrinsics/vshrn_n.c: Likewise.
> 	* gcc.target/aarch64/advsimd-intrinsics/vsra_n.c: Likewise.
> 	* gcc.target/aarch64/advsimd-intrinsics/vst1_lane.c: Likewise.
> 	* gcc.target/aarch64/advsimd-intrinsics/vstX_lane.c: Likewise.
> 	* gcc.target/aarch64/advsimd-intrinsics/vtbX.c: Likewise.
> 	* gcc.target/aarch64/advsimd-intrinsics/vtst.c: Likewise.
>
> Christophe Lyon (20):
>   Add vrecpe tests.
>   Add vrecps tests.
>   Add vreinterpret tests.
>   Add vrev tests.
>   Add vrshl tests.
>   Add vshr_n tests.
>   Add vrshr_n tests.
>   Add vrshrn_n tests.
>   Add vrsqrte tests.
>   Add vrsqrts tests.
>   Add vrsra_n tests.
>   Add vset_lane tests.
>   Add vshll_n tests.
>   Add vshl_n tests.
>   Add vshrn_n tests.
>   Add vsra_n tests.
>   Add vst1_lane tests.
>   Add vstX_lane tests.
>   Add vtbX tests.
>   Add vtst tests.
>
>  .../aarch64/advsimd-intrinsics/arm-neon-ref.h | 19 +-
>  .../gcc.target/aarch64/advsimd-intrinsics/vrecpe.c | 154 +
>  .../gcc.target/aarch64/advsimd-intrinsics/vrecps.c | 117
>  .../aarch64/advsimd-intrinsics/vreinterpret.c | 741 +
>  .../gcc.target/aarch64/advsimd-intrinsics/vrev.c | 200 ++
>  .../gcc.target/aarch64/advsimd-intrinsics/vrshl.c | 627 +
>  .../aarch64/advsimd-intrinsics/vrshr_n.c | 504 ++
>  .../aarch64/advsimd-intrinsics/vrshrn_n.c | 143
>  .../aarch64/advsimd-intrinsics/vrsqrte.c | 157 +
>  .../aarch64/advsimd-intrinsics/vrsqrts.c | 118
>  .../aarch64/advsimd-intrinsics/vrsra_n.c | 553 +++
>  .../aarch64/advsimd-intrinsics/vset_lane.c | 99 +++
>  .../gcc.target/aarch64/advsimd-intrinsics/vshl_n.c | 96 +++
>  .../aarch64/advsimd-intrinsics/vshll_n.c | 56 ++
>  .../gcc.target/aarch64/advsimd-intrinsics/vshr_n.c | 95 +++
>  .../aarch64/advsimd-intrinsics/vshrn_n.c | 70 ++
>  .../gcc.target/aarch64/advsimd-intrinsics/vsra_n.c | 117
>  .../aarch64/advsimd-intrinsics/vst1_lane.c | 93 +++
>  .../aarch64/advsimd-intrinsics/vstX_lane.c | 578
>  .../gcc.target/aarch64/advsimd-intrinsics/vtbX.c | 289
>  .../gcc.target/aarch64/advsimd-intrinsics/vtst.c | 120
>  21 files changed, 4940 insertions(+), 6 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vrecpe.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vrecps.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vreinterpret.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vrev.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vrshl.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vrshr_n.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vrshrn_n.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vrsqrte.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vrsqrts.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vrsra_n.c
>  create mode
Re: [C++17] Implement N3928 - Extending static_assert
On 06/15/2015 12:05 PM, Jason Merrill wrote:
> On 05/20/2015 11:28 AM, Jason Merrill wrote:
> > On 05/02/2015 04:16 PM, Ed Smith-Rowland wrote:
> > > This extends static assert to not require a message string. I elected to make this work also for C++11 and C++14 and warn only with -pedantic. I think many people just write static_assert(thing, "");.
> > >
> > > I took the path of building an empty string in the parser in this case. I wasn't sure if setting message to NULL_TREE would cause sadness later on or not.
> >
> > Hmm. Yes, this technically implements the feature, but my impression of the (non-normative) intent was that they wanted leaving out the string to print the argument expression, in about the same way as
> >
> > #define BOOST_STATIC_ASSERT( B ) static_assert(B, #B)
> >
> > So the patch is OK as is, but you might also look into some libcpp magic to insert a second argument that stringizes the first.
>
> Are you planning to check this in?
>
> Jason

Jason,

I wanted to fix it up as per your suggestion. If someone wants it now I can retest and commit. Otherwise give me a bit more time.

Also, if you or someone else really has the whole enchilada then by all means just commit that.

Ed
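The stringizing idea Jason mentions can be tried out today with a macro, shown here as a small C11 sketch (the thread itself is about C++, but static_assert from <assert.h> behaves the same way for this purpose). STATIC_ASSERT_EXPR, STRINGIZE_EXPR and message_for_example are hypothetical names modelled on the BOOST_STATIC_ASSERT definition quoted above; nothing here reflects how the committed GCC patch builds its message.

```c
#include <assert.h>

/* Hypothetical helper in the spirit of BOOST_STATIC_ASSERT: the
   stringized condition doubles as the diagnostic message.  */
#define STATIC_ASSERT_EXPR(B) static_assert (B, #B)

/* Both conditions hold on any conforming implementation, so these
   compile silently; a false one would be reported with the
   expression text as the message.  */
STATIC_ASSERT_EXPR (sizeof (char) == 1);
STATIC_ASSERT_EXPR (sizeof (long long) >= 8);

/* The same stringization, exposed as a string so the message text
   can be inspected at run time.  */
#define STRINGIZE_EXPR(B) #B

static const char *
message_for_example (void)
{
  return STRINGIZE_EXPR (sizeof (char) == 1);
}
```

A macro can only stringize at the point of use, which is presumably why the suggestion is to do the equivalent inside libcpp, so that a plain one-argument static_assert gets the same message.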
Go patch committed: Analyze binary expressions in escape analysis
This patch from Chris Manghane analyzes binary expressions in escape analysis. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline.

Ian

Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 0)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -0,0 +1,4 @@
+8eeba3ad318863eea867669609a1910101c23f00
+
+The first line of this file holds the git revision number of the last
+merge done from the gofrontend repository.
Re: [PATCH] Adding warning for constexpr's
On Jun 15, 2015, at 12:55 PM, Andres Tiraboschi andres.tirabos...@tallertechnologies.com wrote:
> Hi, do you know where is the .exp file for the tests in .../gcc/testsuite/g++.dg/warn? I can't find it.

find srcdir -name \*.exp -print

will show you all of them. You’ll discover that a .exp file can run the entire tree under it. In this case, look in the parent directory.
[gomp4] c/c++ cleanups for openacc combined loops
This patch teaches the c and c++ front ends to use a common function to split clauses in combined acc parallel loops and acc kernel loops. There's still a little bit of duplicate code inside c_parser_oacc_loop and cp_parser_oacc_loop with their respective calls to c_finish_omp_clauses and finish_omp_clauses. But those are the only two isolated cases.

I've also added a c_ prefix to oacc_filter_device_types because that is how the other functions in c-omp.c are named.

I've applied this patch to gomp-4_0-branch.

Cesar

2015-06-15 Cesar Philippidis ce...@codesourcery.com

	gcc/c-family/
	* c-common.h (c_oacc_split_loop_clauses): Declare.
	(oacc_extract_device_id): Remove declaration.
	(oacc_filter_device_types): Rename to ...
	(c_oacc_filter_device_types): ... this.
	* c-omp.c (oacc_extract_device_id): Rename to ...
	(c_oacc_extract_device_id): ... this and make static.
	(oacc_filter_device_types): Rename to ...
	(c_oacc_filter_device_types): ... this.
	(c_oacc_split_loop_clauses): New function.

	gcc/c/
	* c-parser.c (c_parser_oacc_all_clauses): Call
	c_oacc_filter_device_types instead of oacc_filter_device_types.
	(oacc_split_loop_clauses): Remove.
	(c_parser_oacc_loop): Call c_oacc_split_loop_clauses instead of
	oacc_split_loop_clauses.

	gcc/cp/
	* parser.c (cp_parser_oacc_all_clauses): Call
	c_oacc_filter_device_types instead of oacc_filter_device_types.
	(oacc_split_loop_clauses): Remove.
	(cp_parser_oacc_loop): Call c_oacc_split_loop_clauses instead of
	oacc_split_loop_clauses.
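As a rough illustration of what splitting a combined clause list involves, the following standalone C sketch partitions a singly linked list into a "loop" list and a "not-loop" list. It does not use GCC's tree API: struct clause, enum clause_kind, is_loop_clause and split_loop_clauses are all hypothetical stand-ins, and the special handling of private and firstprivate clauses in the real function is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical clause kinds; the real code switches on
   OMP_CLAUSE_CODE values.  */
enum clause_kind
{
  CLAUSE_GANG,
  CLAUSE_VECTOR,
  CLAUSE_NUM_GANGS,
  CLAUSE_COPY
};

/* Hypothetical stand-in for a chain of clause trees.  */
struct clause
{
  enum clause_kind kind;
  struct clause *next;
};

/* Nonzero for the kinds this sketch treats as belonging to the loop.  */
static int
is_loop_clause (enum clause_kind kind)
{
  return kind == CLAUSE_GANG || kind == CLAUSE_VECTOR;
}

/* Move every node of CLAUSES onto one of two lists.  Returns the loop
   list and stores the remainder in *NOT_LOOP.  Nodes are pushed onto
   the front of their list, so both lists come out in reverse order.  */
static struct clause *
split_loop_clauses (struct clause *clauses, struct clause **not_loop)
{
  struct clause *loop = NULL, *next;

  *not_loop = NULL;

  for (; clauses; clauses = next)
    {
      /* Save the successor before relinking the node.  */
      next = clauses->next;

      if (is_loop_clause (clauses->kind))
	{
	  clauses->next = loop;
	  loop = clauses;
	}
      else
	{
	  clauses->next = *not_loop;
	  *not_loop = clauses;
	}
    }

  return loop;
}
```

Relinking the existing nodes instead of copying them means the input list must not be used after the call, which matches how a parser would hand the two halves to the loop and to the enclosing construct.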
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index fcaebca..28d3252 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -1249,8 +1249,8 @@ extern void c_omp_split_clauses (location_t, enum tree_code, omp_clause_mask,
 extern tree c_omp_declare_simd_clauses_to_numbers (tree, tree);
 extern void c_omp_declare_simd_clauses_to_decls (tree, tree);
 extern enum omp_clause_default_kind c_omp_predetermined_sharing (tree);
-extern int oacc_extract_device_id (const char *);
-extern tree oacc_filter_device_types (tree);
+extern tree c_oacc_filter_device_types (tree);
+extern tree c_oacc_split_loop_clauses (tree, tree *);

 /* Return next tree in the chain for chain_next walking of tree nodes. */
 static inline tree
diff --git a/gcc/c-family/c-omp.c b/gcc/c-family/c-omp.c
index bcb6ff4..3909ec8 100644
--- a/gcc/c-family/c-omp.c
+++ b/gcc/c-family/c-omp.c
@@ -1093,8 +1093,8 @@ c_omp_predetermined_sharing (tree decl)
    only device_type(nvidia) is supported. All device_type parameters are
    treated as case-insensitive keywords. */

-int
-oacc_extract_device_id (const char *device)
+static int
+c_oacc_extract_device_id (const char *device)
 {
   if (!strcasecmp (device, nvidia))
     return GOMP_DEVICE_NVIDIA_PTX;
@@ -1115,7 +1115,7 @@ struct identifier_hasher : ggc_cache_hashertree
 /* Filter out the list of unsupported OpenACC device_types. */

 tree
-oacc_filter_device_types (tree clauses)
+c_oacc_filter_device_types (tree clauses)
 {
   tree c, prev;
   tree dtype = NULL_TREE;
@@ -1141,7 +1141,7 @@ oacc_filter_device_types (tree clauses)
 	  goto filter_dtype;
 	}

-      int code = oacc_extract_device_id (IDENTIFIER_POINTER (t));
+      int code = c_oacc_extract_device_id (IDENTIFIER_POINTER (t));

       if (code == GOMP_DEVICE_DEFAULT)
 	seen_default = OMP_CLAUSE_DEVICE_TYPE_CLAUSES (c);
@@ -1214,3 +1214,49 @@ oacc_filter_device_types (tree clauses)
   OMP_CLAUSE_CHAIN (prev) = dtype;
   return clauses;
 }
+
+/* Split the 'clauses' into a set of 'loop' clauses and a set of
+   'not-loop' clauses. */
+
+tree
+c_oacc_split_loop_clauses (tree clauses, tree *not_loop_clauses)
+{
+  tree loop_clauses, next, c;
+
+  loop_clauses = *not_loop_clauses = NULL_TREE;
+
+  for (; clauses ; clauses = next)
+    {
+      next = OMP_CLAUSE_CHAIN (clauses);
+
+      switch (OMP_CLAUSE_CODE (clauses))
+	{
+	case OMP_CLAUSE_COLLAPSE:
+	case OMP_CLAUSE_REDUCTION:
+	case OMP_CLAUSE_GANG:
+	case OMP_CLAUSE_VECTOR:
+	case OMP_CLAUSE_WORKER:
+	case OMP_CLAUSE_AUTO:
+	case OMP_CLAUSE_SEQ:
+	  OMP_CLAUSE_CHAIN (clauses) = loop_clauses;
+	  loop_clauses = clauses;
+	  break;
+
+	case OMP_CLAUSE_FIRSTPRIVATE:
+	case OMP_CLAUSE_PRIVATE:
+	  c = build_omp_clause (OMP_CLAUSE_LOCATION (clauses),
+				OMP_CLAUSE_CODE (clauses));
+	  OMP_CLAUSE_DECL (c) = OMP_CLAUSE_DECL (clauses);
+	  OMP_CLAUSE_CHAIN (c) = loop_clauses;
+	  loop_clauses = c;
+	  /* FALL THROUGH */
+
+	default:
+	  OMP_CLAUSE_CHAIN (clauses) = *not_loop_clauses;
+	  *not_loop_clauses = clauses;
+	  break;
+	}
+    }
+
+  return loop_clauses;
+}
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index f37a8f7..e7df751 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -12408,7 +12408,7 @@ c_parser_oacc_all_clauses (c_parser *parser, omp_clause_mask mask,

   if (finish_p)
     {
-      clauses = oacc_filter_device_types (clauses);
+      clauses = c_oacc_filter_device_types (clauses);
       return