Re: [PATCH, PR66444] Handle -fipa-ra in reload_combine
On 08/06/15 17:31, Jakub Jelinek wrote:
On Mon, Jun 08, 2015 at 02:04:12PM +0200, Tom de Vries wrote:

this patch fixes PR66444, a problem with -fipa-ra in reload_combine. The problem is that for the test-case, reload_combine combines these two insns:

Please work out with Vlad whether reload_cse_move2add doesn't need similar fix (and check other spots too).

Filed PR66463 - review uses of call_used_regs and regs_invalidated_by_call.

Thanks,
- Tom
Re: [C++/58583] ICE instantiating NSDMIs
How about using a DECL_LANG_FLAG instead of creating a garbage DEFAULT_ARG? Jason
Re: [PATCH] Refactor -Wmisleading-indentation API and extend coverage
On Mon, Jun 8, 2015 at 2:45 PM, Richard Sandiford rdsandif...@googlemail.com wrote: Patrick Palka patr...@parcs.ath.cx writes: At the same time this patch extends the coverage of the -Wmisleading-indentation implementation to catch misleading indentation involving the semicolon. (These changes are all contained in c-indentation.c.) For example, now we warn on the following code samples: if (flag); foo (); while (flag); { ... } if (flag); { ... } if (flag) ; /* blah */ { ... } if (flag); foo (); while avoiding to warn on code that is poorly indented but not misleadingly so; while (flag); foo (); while (flag) ; foo (); Maybe I've just been doing too much Python recently, but unlike the other two examples, this one does seem a little misleading. What would happen for: while (flag) /* blah */; foo (); where the semicolon is hidden after a comment? Ah yeah, good point. The case when there is a comment in between the guard and the semicolon is slightly tricky but doable. Thanks to David and you for the patches btw -- looks like a really useful feature. Richard if (flag1) ; if (flag) ; else ...
Re: [patch] Adjust gcc-plugin.h
On 06/08/2015 09:32 AM, Richard Biener wrote:
On Mon, Jun 8, 2015 at 2:07 PM, Andrew MacLeod amacl...@redhat.com wrote:

During the original flattening process I decided to use gcc-plugin.h as the kitchen sink for all includes that plugins might need. I think this has worked well for plugins, drastically reducing their dependency on gcc internal header file structure. What I didn't realize was that gcc's internal header (plugin.h) also includes gcc-plugin.h. This means that every file which may need to do something for plugins ends up indirectly including the gcc world again :-P

Easy fix. (ha). This patch leaves all the #includes in gcc-plugin.h making the change transparent to plugins. All the remaining declarations and such are moved into a new file gcc-plugin-common.h. Both gcc-plugin.h and gcc's internal header plugin.h now include this common file. The effect is that gcc's source files no longer get anything but the required plugin info. Great.. Except there were a few files which were apparently picking up some required headers from gcc-plugin.h :-P This patch also adds the required headers to those source files.

Compiles on x86_64-unknown-linux-gnu with no new regressions. Also compiles across all targets in config-list.mk. OK for trunk?

Err - why not simply remove the gcc-plugin.h include from plugin.h and instead include plugin.h from gcc-plugin.h?

The gcc source files need to see the internal bits in plugin.h, as well as the common decls in gcc-plugin.h. So we could change the includes as you suggest, but we'd have to copy all the decls from gcc-plugin.h to plugin.h so the gcc functions can see them. And then the plugins would be exposed to all the internal APIs and decls present in plugin.h. Adding the 3rd file which contains all the common decls between both sides is the only way to isolate both.

If you were OK with exposing the internal parts of plugin.h to plugin clients I could do that. I'm presuming we didn't want to do that and that's why there were 2 files to start with. I hijacked the external interface in the gcc-plugin.h file to provide all the includes when instead the right thing would probably have been to create a new file in the first place.

Andrew
Re: [PATCH] Refactor -Wmisleading-indentation API and extend coverage
Patrick Palka patr...@parcs.ath.cx writes:
On Mon, Jun 8, 2015 at 2:11 PM, David Malcolm dmalc...@redhat.com wrote:

void warn_for_misleading_indentation (const common_token_info &guard_tinfo, const common_token_info &body_tinfo, const common_token_info &next_tinfo);

[Do we allow C++ reference syntax? I'm OK with it, and some of the more C++y parts of codebase use it (templates), but I think Jeff objected last time I tried to submit a patch with it :) ]

I'm not sure. The use of references is not a big deal to me in this case at least. I will just pass three pointers instead.

FWIW: I thought const references were allowed for cases where the object is logically being passed by value. We used that a lot in the wide-int code, where actually passing by value would be too slow and passing by pointer too unnatural. I thought it was out and in-out references that were the problem.

Thanks,
Richard
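To make the "logically passed by value" point concrete, here is a minimal sketch. The names `token_info`, `same_column` and `same_column_ptr` are made up for illustration and are not the structures or functions from the patch; the fields are plain integers rather than GCC's own types.

```cpp
#include <cassert>

// Hypothetical stand-in for a small per-token record; not GCC's type.
struct token_info
{
  int line;
  int column;
  int kind;
};

// The arguments are logically passed by value: the callee only reads
// them.  A const reference avoids the copy and keeps the call site
// free of address-of operators.
inline bool
same_column (const token_info &a, const token_info &b)
{
  return a.column == b.column;
}

// Pointer variant of the same query, for comparison.  Callers must
// write &tok, and the callee has to trust that the pointers are
// non-null.
inline bool
same_column_ptr (const token_info *a, const token_info *b)
{
  return a->column == b->column;
}
```

Both variants answer the same question; the difference is only in how the call site reads (`same_column (guard, next)` versus `same_column_ptr (&guard, &next)`).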
Re: RFA: RL78: With -mes0 put read only data in the .frodata section
Ok. Thanks!
Re: [PATCH] Refactor -Wmisleading-indentation API and extend coverage
On Mon, Jun 8, 2015 at 2:11 PM, David Malcolm dmalc...@redhat.com wrote:
On Sun, 2015-06-07 at 16:06 -0400, Patrick Palka wrote:

This patch refactors the entry point of -Wmisleading-indentation

Thanks for working on this. I was hoping to submit a patch to propose putting -Wmisleading-indentation into -Wall, and have been testing the warning on the linux kernel to try to shake out false positives; hence I'm somewhat nervous about this change, though it mostly seems reasonable.

Don't worry, I did a bit of testing myself :) This patch combined with the patch in PR 66454 is remarkably stable -- I could not find bogus warnings out of the dozen or so C code bases I compiled. It would be great to see this warning enabled with -Wall. Even in its current form it is pretty useful yet pretty simple (all we need is the information of three tokens).

I think there are three categories of message that -Wmisleading-indentation can emit:

(A) bogus messages, where the message is untrue. An example of this was PR c/66220 (I found an example of this when running it on the linux kernel); this is fixed in trunk.

(B) a message where the code is doing what the author/reviewer intended, but the indentation misleadingly suggests a different meaning (or perhaps the author is being underhanded c.f. http://www.underhanded-c.org/ ). I think it would be reasonable to warn about (B) in -Wall: invoke.texi describes -Wall as ...enables all the warnings about constructions that some users consider questionable, and that are easy to avoid (or modify to prevent the warning), even in conjunction with macros hence I believe (B) falls into this category. Each instance is a gotcha lurking in the code for the next maintainer.

(C) a message where the code is doing something differently to what the author intended, and the human has been fooled by the bogus indentation. Clearly you'd want to know about (C), but the compiler has no way of distinguishing it from (B).

That makes sense. I call (A) a false positive but that is probably misleading, bogus sounds better.

[FWIW, in my testing on the linux kernel, I ran into:
* 1 instance of (A) (PR c/66220, now fixed in trunk), and
* 8 that are each either (B) or (C), and I don't have the domain knowledge of the kernel to figure out which is which, so I've started reporting them to the upstream linux community; see e.g.:
  * https://bugzilla.kernel.org/show_bug.cgi?id=98231
  * https://bugzilla.kernel.org/show_bug.cgi?id=98241
  * https://bugzilla.kernel.org/show_bug.cgi?id=98251
  * https://bugzilla.kernel.org/show_bug.cgi?id=98261
  * https://bugzilla.kernel.org/show_bug.cgi?id=98271
  * https://bugzilla.kernel.org/show_bug.cgi?id=98281
* 2 messages that are confirmed as (B), e.g. http://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git/commit/?id=e1395a321eab1a7833d82e952eb8255e0a1f03cb

I did not stumble upon these warnings in my tests most likely because I was compiling with a local config which did not include these source files.

]

I believe that gcc trunk is currently bootstrappable with -Wmisleading-indentation added to -Wall; is this still the case after your patch? Also, is it fair to ask for you to test your patched compiler on some body of code that isn't gcc itself?

gcc trunk is currently not bootstrappable with -Wmisleading-indentation because of this misleadingly-indented code: https://github.com/gcc-mirror/gcc/blob/master/gcc/function.c#L4076

I tested the patch on a number of C code bases: Linux, sqlite, binutils-gdb, emacs, alpine, bash, zsh, git, tmux, nginx, and a couple more. The patch emits a bogus warning in the Linux and sqlite code bases, for which I filed PR 66454 before realizing that the issue was not pre-existing :( But this patch combined with an updated version of the patch attached to PR 66454 makes the feature pretty robust: I could no longer find any instances of bogus (A) warnings among the set of code bases I tested, though I did find a bunch of (B)/(C). And of course our test cases still pass.

More comments inline below, throughout.

from:

void warn_for_misleading_indentation (location_t guard_loc, location_t body_loc, location_t next_stmt_loc, enum cpp_ttype next_tok_type, const char *guard_kind);

to

struct common_token_info { location_t location; cpp_ttype type; rid keyword; };

Bikeshedding a bit, but would, say, token_indent_info or token_location_info be better? (I'm not sure) What do you mean by common in the title?

Sure! The name is a horrible placeholder. I used common to mean that the information we are gathering about the tokens is information that both c_tokens and cp_tokens have. Something like token_info sounds too general though. And
Re: [C++/58583] ICE instantiating NSDMIs
On 06/08/15 13:47, Jason Merrill wrote: How about using a DECL_LANG_FLAG instead of creating a garbage DEFAULT_ARG? good idea, I'll go look for an available one. nathan
Re: [PATCH] Refactor -Wmisleading-indentation API and extend coverage
On Sun, 2015-06-07 at 16:06 -0400, Patrick Palka wrote:

This patch refactors the entry point of -Wmisleading-indentation

Thanks for working on this. I was hoping to submit a patch to propose putting -Wmisleading-indentation into -Wall, and have been testing the warning on the linux kernel to try to shake out false positives; hence I'm somewhat nervous about this change, though it mostly seems reasonable.

I think there are three categories of message that -Wmisleading-indentation can emit:

(A) bogus messages, where the message is untrue. An example of this was PR c/66220 (I found an example of this when running it on the linux kernel); this is fixed in trunk.

(B) a message where the code is doing what the author/reviewer intended, but the indentation misleadingly suggests a different meaning (or perhaps the author is being underhanded c.f. http://www.underhanded-c.org/ ). I think it would be reasonable to warn about (B) in -Wall: invoke.texi describes -Wall as ...enables all the warnings about constructions that some users consider questionable, and that are easy to avoid (or modify to prevent the warning), even in conjunction with macros hence I believe (B) falls into this category. Each instance is a gotcha lurking in the code for the next maintainer.

(C) a message where the code is doing something differently to what the author intended, and the human has been fooled by the bogus indentation. Clearly you'd want to know about (C), but the compiler has no way of distinguishing it from (B).

[FWIW, in my testing on the linux kernel, I ran into:
* 1 instance of (A) (PR c/66220, now fixed in trunk), and
* 8 that are each either (B) or (C), and I don't have the domain knowledge of the kernel to figure out which is which, so I've started reporting them to the upstream linux community; see e.g.:
  * https://bugzilla.kernel.org/show_bug.cgi?id=98231
  * https://bugzilla.kernel.org/show_bug.cgi?id=98241
  * https://bugzilla.kernel.org/show_bug.cgi?id=98251
  * https://bugzilla.kernel.org/show_bug.cgi?id=98261
  * https://bugzilla.kernel.org/show_bug.cgi?id=98271
  * https://bugzilla.kernel.org/show_bug.cgi?id=98281
* 2 messages that are confirmed as (B), e.g. http://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git/commit/?id=e1395a321eab1a7833d82e952eb8255e0a1f03cb
]

I believe that gcc trunk is currently bootstrappable with -Wmisleading-indentation added to -Wall; is this still the case after your patch? Also, is it fair to ask for you to test your patched compiler on some body of code that isn't gcc itself?

More comments inline below, throughout.

from:

void warn_for_misleading_indentation (location_t guard_loc, location_t body_loc, location_t next_stmt_loc, enum cpp_ttype next_tok_type, const char *guard_kind);

to

struct common_token_info { location_t location; cpp_ttype type; rid keyword; };

Bikeshedding a bit, but would, say, token_indent_info or token_location_info be better? (I'm not sure) What do you mean by common in the title?

void warn_for_misleading_indentation (const common_token_info &guard_tinfo, const common_token_info &body_tinfo, const common_token_info &next_tinfo);

[Do we allow C++ reference syntax? I'm OK with it, and some of the more C++y parts of codebase use it (templates), but I think Jeff objected last time I tried to submit a patch with it :) ]

The purpose of this refactoring is to expose more information to the -Wmisleading-indentation implementation to allow for more advanced heuristics and for better coverage.

This change of API of course required changes to the C and C++ frontend. Amidst these minor changes I also made sure in both frontends that warn_for_misleading_indentation() gets called even when the body of the guard in question is a semicolon. This allows us to implement warnings about dubious semicolons. Also I moved the keyword != RID_ELSE checks guarding the call to warn_for_misleading_indentation() to the function itself.

At the same time this patch extends the coverage of the -Wmisleading-indentation implementation to catch misleading indentation involving the semicolon. (These changes are all contained in c-indentation.c.) For example, now we warn on the following code samples: if (flag); foo (); while (flag); { ... } if (flag); { ... } if (flag) ; /* blah */ { ... } if (flag); foo (); while avoiding to warn on code that is poorly indented but not misleadingly so; while (flag); foo (); while (flag) ; foo (); if (flag1) ; if (flag) ; else ...

(nods; I want the warning to warn about *misleading*
Re: [PATCH] Refactor -Wmisleading-indentation API and extend coverage
Patrick Palka patr...@parcs.ath.cx writes: At the same time this patch extends the coverage of the -Wmisleading-indentation implementation to catch misleading indentation involving the semicolon. (These changes are all contained in c-indentation.c.) For example, now we warn on the following code samples: if (flag); foo (); while (flag); { ... } if (flag); { ... } if (flag) ; /* blah */ { ... } if (flag); foo (); while avoiding to warn on code that is poorly indented but not misleadingly so; while (flag); foo (); while (flag) ; foo (); Maybe I've just been doing too much Python recently, but unlike the other two examples, this one does seem a little misleading. What would happen for: while (flag) /* blah */; foo (); where the semicolon is hidden after a comment? Thanks to David and you for the patches btw -- looks like a really useful feature. Richard if (flag1) ; if (flag) ; else ...
Re: [PATCH, PR66444] Handle -fipa-ra in reload_combine
On 08/06/15 17:31, Jakub Jelinek wrote:
On Mon, Jun 08, 2015 at 02:04:12PM +0200, Tom de Vries wrote:

this patch fixes PR66444, a problem with -fipa-ra in reload_combine. The problem is that for the test-case, reload_combine combines these two insns:

Please work out with Vlad whether reload_cse_move2add doesn't need similar fix (and check other spots too).

2015-06-08 Tom de Vries t...@codesourcery.com

PR rtl-optimization/66444
* postreload.c (reload_combine): Use get_call_reg_set_usage instead of call_used_regs.

LGTM.

* gcc.dg/pr66444.c: New test.

+int __attribute__((noinline, noclone))
+baz (void)
+{
+  struct S *x = (struct S *) 0xe000U;

I'm still afraid this will not really work on s390-linux (which has only 31-bit pointers) and will not work on 16-bit int targets either (some have say 24-bit pointers etc., not really familiar with the embedded world). So, I'd suggest use a macro for the address, so you don't need to duplicate it,

Used a macro CONST_PTR.

and define it to say ((struct S *) 0x8000UL), if it reproduces even with that change without your reload_combine fix.

Unfortunately, it didn't. So I used __SIZEOF_POINTER__ to produce a valid constant pointer for the 16-31 bit pointer-size cases, while still triggering the original problem for x86_64 -m64 case using a larger pointer. Furthermore, I used __SIZEOF_POINTER__ to ensure a valid pointer constant suffix.

Ok for trunk and 5.2 with that change.

Committed to trunk and backported to gcc-5-branch as attached.

Thanks,
- Tom

Handle -fipa-ra in reload_combine

2015-06-08 Tom de Vries t...@codesourcery.com

PR rtl-optimization/66444
* postreload.c (reload_combine): Use get_call_reg_set_usage instead of call_used_regs.
* gcc.dg/pr66444.c: New test.

---
 gcc/postreload.c               |  5 ++-
 gcc/testsuite/gcc.dg/pr66444.c | 79 ++
 2 files changed, 83 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr66444.c

diff --git a/gcc/postreload.c b/gcc/postreload.c
index 7ecca15..1cc7b14 100644
--- a/gcc/postreload.c
+++ b/gcc/postreload.c
@@ -1352,9 +1352,12 @@ reload_combine (void)
       if (CALL_P (insn))
         {
           rtx link;
+          HARD_REG_SET used_regs;
+
+          get_call_reg_set_usage (insn, &used_regs, call_used_reg_set);
 
           for (r = 0; r < FIRST_PSEUDO_REGISTER; r++)
-            if (call_used_regs[r])
+            if (TEST_HARD_REG_BIT (used_regs, r))
               {
                 reg_state[r].use_index = RELOAD_COMBINE_MAX_USES;
                 reg_state[r].store_ruid = reload_combine_ruid;
diff --git a/gcc/testsuite/gcc.dg/pr66444.c b/gcc/testsuite/gcc.dg/pr66444.c
new file mode 100644
index 000..3f92a5c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr66444.c
@@ -0,0 +1,79 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -fipa-ra" } */
+
+extern void abort (void);
+
+#if (__SIZEOF_LONG_LONG__ == __SIZEOF_POINTER__)
+#define ADD_SUFFIX(a) a ## ULL
+#elif (__SIZEOF_LONG__ == __SIZEOF_POINTER__)
+#define ADD_SUFFIX(a) a ## UL
+#elif (__SIZEOF_INT__ == __SIZEOF_POINTER__)
+#define ADD_SUFFIX(a) a ## U
+#else
+#error Add target support here
+#endif
+
+#if __SIZEOF_POINTER__ <= 4
+/* Use a 16 bit pointer to have a valid pointer for 16-bit to 31-bit pointer
+   architectures.  Using sizeof, we cannot distinguish between 31-bit and 32-bit
+   pointer types, so we also handle the 32-bit pointer type case here.  */
+#define CONST_PTR ADD_SUFFIX (0x800)
+#else
+/* For x86_64 -m64, the problem reproduces with this 32-bit CONST_PTR, but not
+   with a 2-power below it.  */
+#define CONST_PTR ADD_SUFFIX (0x8000)
+#endif
+
+int __attribute__((noinline, noclone))
+bar (void)
+{
+  return 1;
+}
+
+struct S
+{
+  unsigned long p, q, r;
+  void *v;
+};
+
+struct S *s1;
+struct S *s2;
+
+void __attribute__((noinline, noclone))
+fn2 (struct S *x)
+{
+  s2 = x;
+}
+
+__attribute__((noinline, noclone)) void *
+fn1 (struct S *x)
+{
+  /* Just a statement to make it a non-const function.  */
+  s1 = x;
+
+  return (void *)0;
+}
+
+int __attribute__((noinline, noclone))
+baz (void)
+{
+  struct S *x = (struct S *) CONST_PTR;
+
+  x += bar ();
+
+  fn1 (x);
+  fn2 (x);
+
+  return 0;
+}
+
+int
+main (void)
+{
+  baz ();
+
+  if (s2 != (((struct S *) CONST_PTR) + 1))
+    abort ();
+
+  return 0;
+}
--
1.9.1
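The core of the postreload.c change is replacing a lookup in one global table with a register set computed per call. The following toy model sketches that idea under loud assumptions: `RegSet`, `CallInsn`, `call_clobber_set` and the 8-register machine are all hypothetical, and nothing here is GCC's actual data structure or API. It only mirrors the shape "use the per-call set if one is known, otherwise fall back to the default set".

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Toy model only: 8 hard registers; every name here is hypothetical.
constexpr std::size_t kNumHardRegs = 8;
using RegSet = std::bitset<kNumHardRegs>;

struct CallInsn
{
  // True when an interprocedural analysis recorded what this
  // particular callee clobbers.
  bool has_callee_info;
  // Registers the callee clobbers; meaningful only when
  // has_callee_info is true.
  RegSet clobbered_by_callee;
};

// Return the registers to treat as clobbered by CALL, falling back
// to DEFAULT_SET (the ABI-wide call-clobbered set) when no per-call
// information is available.
inline RegSet
call_clobber_set (const CallInsn &call, const RegSet &default_set)
{
  return call.has_callee_info ? call.clobbered_by_callee : default_set;
}

// Number of registers whose tracked state a scan would reset at CALL.
inline std::size_t
regs_reset_at_call (const CallInsn &call, const RegSet &default_set)
{
  return call_clobber_set (call, default_set).count ();
}
```

In this model a register that is in the default set but not in the per-call set is not reset at that call, which is the distinction the one-size-fits-all table cannot express.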
Re: debug-early branch merged into mainline
On June 8, 2015 7:14:19 PM GMT+02:00, Aldy Hernandez al...@redhat.com wrote:
On 06/08/2015 09:30 AM, Richard Biener wrote:
On Mon, Jun 8, 2015 at 2:05 PM, Aldy Hernandez al...@redhat.com wrote:
On 06/08/2015 04:26 AM, Richard Biener wrote:
On Mon, Jun 8, 2015 at 3:23 AM, Aldy Hernandez al...@redhat.com wrote:
On 06/07/2015 02:33 PM, Richard Biener wrote:
On June 7, 2015 6:00:05 PM GMT+02:00, Aldy Hernandez al...@redhat.com wrote:
On 06/07/2015 11:25 AM, Richard Biener wrote:
On June 7, 2015 5:03:30 PM GMT+02:00, Aldy Hernandez al...@redhat.com wrote:
On 06/06/2015 05:49 AM, Andreas Schwab wrote:

Bootstrap fails on aarch64:

Comparing stages 2 and 3
warning: gcc/cc1objplus-checksum.o differs
warning: gcc/cc1obj-checksum.o differs
warning: gcc/cc1plus-checksum.o differs
warning: gcc/cc1-checksum.o differs
Bootstrap comparison failure!
gcc/ira-costs.o differs
gcc/tree-sra.o differs
gcc/tree-parloops.o differs
gcc/tree-vect-data-refs.o differs
gcc/java/jcf-io.o differs
gcc/ipa-inline-analysis.o differs

The bootstrap comparison failure on ppc64le, aarch64, and possibly others is due to the order of some sections being in a different order with and without debugging. Stage2 is being compiled with no debugging due to -gtoggle, and stage3 is being compiled with debugging. For ira-costs.o on ppc64le we have:

-Disassembly of section .rodata._ZN10hash_tableI19cost_classes_hasher11xcallocatorE6expandEv.str1.8:
+Disassembly of section .rodata._ZN10hash_tableI19cost_classes_hasher11xcallocatorE26find_empty_slot_for_expandEj.str1.8:
...
-Disassembly of section .rodata._ZN10hash_tableI19cost_classes_hasher11xcallocatorE26find_empty_slot_for_expandEj.str1.8:
+Disassembly of section .rodata._ZN10hash_tableI19cost_classes_hasher11xcallocatorE6expandEv.str1.8:

There is no semantic difference between the objects, just the ordering. I assume it's the same problem for the rest of the objects and architectures. I will look into this, unless someone beats me to it, or has an idea right off the bat.

Check whether the symbol table walkers are walking hash tables. I assume the above are emitted via the symbol removal handling for debug stuff?

Ughh, indeed. These sections are being outputted from output_object_blocks which traverses a hash table:

void
output_object_blocks (void)
{
  object_block_htab->traverse<void *, output_object_block_htab> (NULL);
}

Perhaps we should sort them by some deterministic field and then call output_object_block() on each member of the resulting list?

Yes, that would be the usual fix. Maybe sth has an UID already, is the 'object' a decl by chance?

The attached patch fixes the bootstrap failure on ppc64le, and theoretically the aarch64 problem as well, but I haven't checked. Tested on ppc64le linux by bootstrapping, and regtesting C/C++ against pre debug-early merge sources. Also tested by a full bootstrap and regtest on x86-64 Linux. OK for mainline?

Please use FOR_EACH_HASH_TABLE_ELEMENT to put elements on the vector instead of the htab traversal. The compare function looks like we will end up having many equal elements (and thus random ordering on hosts where qsort doesn't behave sane here, like Solaris IIRC). Unless all sections are named (which it looks like)

Some sections are not named. How about we sort the named sections and output them, but call output_object_block() on the rest of the sections on whatever order they were in? This solves the bootstrap problem as well. Attached patch tested on x86-64 and ppc64le Linux. OK?

No, but hash_section suggests to sort after sect->common.flags if the section is not named. Conveniently flags is just an 'int' ...

What about if the comparison routine gets a named section and an unnamed section? How to compare? That's why I was giving priority to one over the other originally, but I didn't know about problematic qsort implementations.

Obviously unnamed and a named section can be sorted like you did in the original patch.

Richard.

Aldy
Re: [C++ Patch] PR 65815
Hi, On 06/08/2015 06:16 PM, Jason Merrill wrote: On 05/22/2015 02:46 PM, Paolo Carlini wrote: take a type, not a decl, as first argument. Why? This complicates calls. Yes, but, on the other hand, it's more consistent with the arguments of the various digest_init_*. Also, we don't really use the DECL per se, *only* its TREE_TYPE in the body. Thus, all in all, I decided to propose that, but sure, I don't have a strong opinion... Could you also check that we do the right thing for mem-initializers? Sure I will. Paolo.
Re: [C++ Patch] PR 65815
Hi again,

On 06/08/2015 10:33 PM, Paolo Carlini wrote:

Could you also check that we do the right thing for mem-initializers?

Sure I will.

I think we have a similar issue in expand_default_init: exactly when reshape_init is in order we fail to call it before digest_init. The below also passes testing.

Thanks!
Paolo.

Index: cp/init.c
===================================================================
--- cp/init.c (revision 224234)
+++ cp/init.c (working copy)
@@ -1617,7 +1617,10 @@ expand_default_init (tree binfo, tree true_exp, tr
       && CP_AGGREGATE_TYPE_P (type))
     /* A brace-enclosed initializer for an aggregate.  In C++0x this can
        happen for direct-initialization, too.  */
-    init = digest_init (type, init, complain);
+    {
+      init = reshape_init (type, init, complain);
+      init = digest_init (type, init, complain);
+    }
 
   /* A CONSTRUCTOR of the target's type is a previously digested
      initializer, whether that happened just above or in
Index: cp/typeck2.c
===================================================================
--- cp/typeck2.c (revision 224234)
+++ cp/typeck2.c (working copy)
@@ -1161,10 +1161,14 @@ digest_nsdmi_init (tree decl, tree init)
 {
   gcc_assert (TREE_CODE (decl) == FIELD_DECL);
 
+  tree type = TREE_TYPE (decl);
   int flags = LOOKUP_IMPLICIT;
   if (DIRECT_LIST_INIT_P (init))
     flags = LOOKUP_NORMAL;
-  init = digest_init_flags (TREE_TYPE (decl), init, flags);
+  if (BRACE_ENCLOSED_INITIALIZER_P (init)
+      && CP_AGGREGATE_TYPE_P (type))
+    init = reshape_init (type, init, tf_warning_or_error);
+  init = digest_init_flags (type, init, flags);
   if (TREE_CODE (init) == TARGET_EXPR)
     /* This represents the whole initialization.  */
     TARGET_EXPR_DIRECT_INIT_P (init) = true;
Index: testsuite/g++.dg/cpp0x/mem-init-aggr1.C
===================================================================
--- testsuite/g++.dg/cpp0x/mem-init-aggr1.C (revision 0)
+++ testsuite/g++.dg/cpp0x/mem-init-aggr1.C (working copy)
@@ -0,0 +1,10 @@
+// PR c++/65815
+// { dg-do compile { target c++11 } }
+
+struct array {
+  int data [2];
+};
+
+struct X : array {
+  X() : array{ 1, 2 } { }
+};
Index: testsuite/g++.dg/cpp0x/nsdmi-aggr1.C
===================================================================
--- testsuite/g++.dg/cpp0x/nsdmi-aggr1.C (revision 0)
+++ testsuite/g++.dg/cpp0x/nsdmi-aggr1.C (working copy)
@@ -0,0 +1,10 @@
+// PR c++/65815
+// { dg-do compile { target c++11 } }
+
+struct array {
+  int data [2];
+};
+
+struct X {
+  array a = { 1, 2 };
+};
Re: debug-early branch merged into mainline
On 06/08/2015 02:59 PM, Richard Biener wrote:
On June 8, 2015 7:14:19 PM GMT+02:00, Aldy Hernandez al...@redhat.com wrote:
On 06/08/2015 09:30 AM, Richard Biener wrote:
On Mon, Jun 8, 2015 at 2:05 PM, Aldy Hernandez al...@redhat.com wrote:
On 06/08/2015 04:26 AM, Richard Biener wrote:
On Mon, Jun 8, 2015 at 3:23 AM, Aldy Hernandez al...@redhat.com

What about if the comparison routine gets a named section and an unnamed section? How to compare? That's why I was giving priority to one over the other originally, but I didn't know about problematic qsort implementations.

Obviously unnamed and a named section can be sorted like you did in the original patch.

Obviously I'm not understanding :). How about this? Tested on x86-64 and ppc64le.

Aldy

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index e1bd305..f6d4bda 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,10 @@
+2015-06-07 Aldy Hernandez al...@redhat.com
+
+ * varasm.c (output_object_block_htab): Remove.
+ (output_object_block_compare): New.
+ (output_object_blocks): Sort named object_blocks before outputting
+ them.
+
 2015-06-06 Jan Hubicka hubi...@ucw.cz
 
  * alias.c (get_alias_set): Be ready for TYPE_CANONICAL
diff --git a/gcc/varasm.c b/gcc/varasm.c
index 18f3eac..d69ba5a 100644
--- a/gcc/varasm.c
+++ b/gcc/varasm.c
@@ -7420,14 +7420,29 @@ output_object_block (struct object_block *block)
     }
 }
 
-/* A htab_traverse callback used to call output_object_block for
-   each member of object_block_htab.  */
+/* A callback for qsort to compare object_blocks.  */
 
-int
-output_object_block_htab (object_block **slot, void *)
+static int
+output_object_block_compare (const void *x, const void *y)
 {
-  output_object_block (*slot);
-  return 1;
+  object_block *p1 = *(object_block * const*)x;
+  object_block *p2 = *(object_block * const*)y;
+
+  if (p1->sect->common.flags & SECTION_NAMED
+      && !(p2->sect->common.flags & SECTION_NAMED))
+    return 1;
+
+  if (!(p1->sect->common.flags & SECTION_NAMED)
+      && p2->sect->common.flags & SECTION_NAMED)
+    return -1;
+
+  if (p1->sect->common.flags & SECTION_NAMED
+      && p2->sect->common.flags & SECTION_NAMED)
+    return strcmp (p1->sect->named.name, p2->sect->named.name);
+
+  unsigned f1 = p1->sect->common.flags;
+  unsigned f2 = p2->sect->common.flags;
+  return f1 < f2 ? -1 : (f1 > f2 ? 1 : 0);
 }
 
 /* Output the definitions of all object_blocks.  */
@@ -7435,7 +7450,20 @@ output_object_block_htab (object_block **slot, void *)
 void
 output_object_blocks (void)
 {
-  object_block_htab->traverse<void *, output_object_block_htab> (NULL);
+  vec<object_block *, va_heap> v = vNULL;
+  object_block *obj;
+  hash_table<object_block_hasher>::iterator hi;
+
+  FOR_EACH_HASH_TABLE_ELEMENT (*object_block_htab, obj, object_block *, hi)
+    v.safe_push (obj);
+
+  /* Sort them in order to output them in a deterministic manner,
+     otherwise we may get .rodata sections in different orders with
+     and without -g.  */
+  v.qsort (output_object_block_compare);
+  unsigned i;
+  FOR_EACH_VEC_ELT (v, i, obj)
+    output_object_block (obj);
 }
 
 /* This function provides a possible implementation of the
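The ordering used by the comparison routine above can be sketched in isolation. This is a standalone illustration with hypothetical names (`Section`, `section_less`, `deterministic_order`), using the standard library rather than GCC's `vec` and `hash_table`; it is not the committed code. It follows the same three rules: unnamed sections come before named ones, named sections are ordered by name, and unnamed sections are ordered by flags.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a section record; not GCC's `section`.
struct Section
{
  bool named;
  std::string name;   // meaningful only when named is true
  unsigned flags;
};

// Strict weak ordering following the rules described above:
// unnamed before named, named by name, unnamed by flags.
inline bool
section_less (const Section &a, const Section &b)
{
  if (a.named != b.named)
    return !a.named;
  if (a.named)
    return a.name < b.name;
  return a.flags < b.flags;
}

// Take the elements collected from an unordered container and
// return them in an order that depends only on their contents.
inline std::vector<Section>
deterministic_order (std::vector<Section> sections)
{
  // stable_sort keeps the input order of elements that compare
  // equal, so the sort algorithm itself adds no extra randomness.
  std::stable_sort (sections.begin (), sections.end (), section_less);
  return sections;
}
```

One caveat carried over from the thread: two unnamed sections with identical flags still compare equal here, so their relative order is whatever the input order was. If that input comes from a hash-table walk, a further tie-breaker on some unique key would be needed for full determinism.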
Re: Fix more of C/fortran canonical type issues
On Mon, 8 Jun 2015, Jan Hubicka wrote:

Hi, this is a variant of patch that globs also the rest of integer types. Note that we will still get false warnings out of lto-symtab when the values are not wrapped up in structures. This is because lto-symtab uses types_compatible_p that in turn uses useless_type_conversion and that one needs to honor signedness. I suppose we need a way to test representation compatibility and TBAA compatibility. I will give it more thought how to reorganize the code.

Basically we need representation compatibility is TYPE_CANONICAL equivalence, TBAA compatibility is get_alias_set equivalence.

Hmm, I still wonder what to use in lto-symtab's warn_type_compatibility_p. Currently we use types_compatible_p which goes to useless conversion and honors TYPE_UNSIGNED, so it will give false positives for Fortran. Comparing TYPE_CANONICAL for equivalence will be conservatively correct for now (I will submit a patch for that and prepare testcases), but as soon as we start computing finer TYPE_CANONICAL for pointers we really want to avoid warning on C_PTR declaration in Fortran and say float * in C. This will have different canonical types (C_PTR is void *) that are TBAA compatible.

Comparing get_alias_set will block warnings about representation incompatibility in some cases, like when one of units is compiled with -fno-strict-aliasing. Even then IMO we ought to warn when fortran declares variable as C_DOUBLE and C declares it as int *. So I think we want to factor out the representation compatibility logic better and make it separate from canonical type machinery.

So you have to be careful when mangling TYPE_CANONICAL according to get_alias_set and make sure to only apply this (signedness for example) for aggregate type components. Can you please split out the string-flag change? It is approved.

This is what I committed. After the discussion I still think the second variant of patch (completely dropping signed/unsigned) makes sense for C+fortran units though it is unnecessarily coarse for C/C++ only units. Given that the current plan is to get things conservatively correct first, I would stick to it.

Bootstrapped/regtested ppc64le-linux, committed.

Honza

* lto.c (hash_canonical_type): Drop hashing of TYPE_STRING_FLAG.
* tree.c (gimple_canonical_types_compatible_p): Drop comparison of TYPE_STRING_FLAG.
* gfortran.dg/lto/bind_c-2b_0.f90: New testcase
* gfortran.dg/lto/bind_c-2b_1.c: New testcase

Index: lto/lto.c
===================================================================
--- lto/lto.c (revision 224250)
+++ lto/lto.c (working copy)
@@ -332,18 +332,16 @@
   if (TREE_CODE (type) == COMPLEX_TYPE)
     hstate.add_int (TYPE_UNSIGNED (type));
 
+  /* Fortran's C_SIGNED_CHAR is !TYPE_STRING_FLAG but needs to be
+     interoperable with signed char.  Unless all frontends are revisited to
+     agree on these types, we must ignore the flag completely.  */
+
   /* Fortran standard define C_PTR type that is compatible with every
      C pointer.  For this reason we need to glob all pointers into one.
      Still pointers in different address spaces are not compatible.  */
   if (POINTER_TYPE_P (type))
-    {
-      hstate.add_int (TYPE_ADDR_SPACE (TREE_TYPE (type)));
-    }
+    hstate.add_int (TYPE_ADDR_SPACE (TREE_TYPE (type)));
 
-  /* For integer types hash only the string flag.  */
-  if (TREE_CODE (type) == INTEGER_TYPE)
-    hstate.add_int (TYPE_STRING_FLAG (type));
-
   /* For array types hash the domain bounds and the string flag.  */
   if (TREE_CODE (type) == ARRAY_TYPE && TYPE_DOMAIN (type))
     {
Index: testsuite/gfortran.dg/lto/bind_c-2b_0.f90
===================================================================
--- testsuite/gfortran.dg/lto/bind_c-2b_0.f90 (revision 0)
+++ testsuite/gfortran.dg/lto/bind_c-2b_0.f90 (working copy)
@@ -0,0 +1,21 @@
+! { dg-lto-do run }
+! { dg-lto-options {{ -O3 -flto }} }
+! This testcase will abort if C_SIGNED_CHAR is not interoperable with signed
+! char
+module lto_type_merge_test
+  use, intrinsic :: iso_c_binding
+  implicit none
+
+  type, bind(c) :: MYFTYPE_1
+    integer(c_signed_char) :: chr
+    integer(c_signed_char) :: chrb
+  end type MYFTYPE_1
+
+  type(myftype_1), bind(c, name="myVar") :: myVar
+
+contains
+  subroutine types_test() bind(c)
+    myVar%chr = myVar%chrb
+  end subroutine types_test
+end module lto_type_merge_test
+
Index: testsuite/gfortran.dg/lto/bind_c-2b_1.c
===================================================================
--- testsuite/gfortran.dg/lto/bind_c-2b_1.c (revision 0)
+++ testsuite/gfortran.dg/lto/bind_c-2b_1.c (working copy)
@@ -0,0 +1,36 @@
+#include <stdlib.h>
+/* interopse with myftype_1 */
+typedef struct {
+  signed char chr;
+  signed char chr2;
+} myctype_t;
+
+
+extern void abort(void);
+void types_test(void);
+/* declared in the fortran module */
+extern myctype_t myVar;
+
+int main(int argc,
Re: Fix LTO streaming of BUILTINS_LOCATION
Yeah, I think streaming all locations up to RESERVED_LOCATION_COUNT literally is more robust. Thus do

  bp_pack_int_in_range (bp, 0, RESERVED_LOCATION_COUNT,
                        loc < RESERVED_LOCATION_COUNT
                        ? loc : RESERVED_LOCATION_COUNT);
  if (loc < RESERVED_LOCATION_COUNT)
    return;

Yep, I did that in the meantime (did not notice we have RESERVED_LOCATION_COUNT). This is what I committed.

2015-06-08  Jan Hubicka  hubi...@ucw.cz

	* lto-streamer-out.c (lto_output_location): Stream reserved locations
	correctly.
	* lto-streamer-in.c (lto_output_location): Likewise.

Index: lto-streamer-out.c
===================================================================
--- lto-streamer-out.c (revision 224248)
+++ lto-streamer-out.c (working copy)
@@ -202,8 +202,10 @@
   expanded_location xloc;
 
   loc = LOCATION_LOCUS (loc);
-  bp_pack_value (bp, loc == UNKNOWN_LOCATION, 1);
-  if (loc == UNKNOWN_LOCATION)
+  bp_pack_int_in_range (bp, 0, RESERVED_LOCATION_COUNT,
+                        loc < RESERVED_LOCATION_COUNT
+                        ? loc : RESERVED_LOCATION_COUNT);
+  if (loc < RESERVED_LOCATION_COUNT)
     return;
 
   xloc = expand_location (loc);
Index: lto-streamer-in.c
===================================================================
--- lto-streamer-in.c (revision 224248)
+++ lto-streamer-in.c (working copy)
@@ -278,13 +278,14 @@
 
   gcc_assert (current_cache == this);
 
-  if (bp_unpack_value (bp, 1))
-    {
-      *loc = UNKNOWN_LOCATION;
-      return;
-    }
-  *loc = BUILTINS_LOCATION + 1;
+  *loc = bp_unpack_int_in_range (bp, "location", 0, RESERVED_LOCATION_COUNT);
+  if (*loc < RESERVED_LOCATION_COUNT)
+    return;
+
+  /* Keep value RESERVED_LOCATION_COUNT in *loc as linemap lookups will
+     ICE on it.  */
+

   file_change = bp_unpack_value (bp, 1);
   line_change = bp_unpack_value (bp, 1);
   column_change = bp_unpack_value (bp, 1);
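
For readers outside the streamer code, the clamping scheme discussed in this message can be sketched in isolation. This is only an illustration under stated assumptions, not the GCC implementation: RESERVED_COUNT, encode_location_tag and is_reserved_tag are hypothetical names, and the value 2 reflects the remark elsewhere in the thread that there are currently two special locations.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for RESERVED_LOCATION_COUNT; the thread notes
   there are currently two special locations.  */
#define RESERVED_COUNT 2u

/* Value to stream in the range [0, RESERVED_COUNT]: reserved locations
   are written literally, every other location collapses to the marker
   RESERVED_COUNT, meaning "file/line/column data follows".  */
uint32_t
encode_location_tag (uint32_t loc)
{
  return loc < RESERVED_COUNT ? loc : RESERVED_COUNT;
}

/* Nonzero if TAG is itself the final location, so the reader can stop
   without unpacking any further fields.  */
int
is_reserved_tag (uint32_t tag)
{
  assert (tag <= RESERVED_COUNT);   /* tags outside the range are invalid */
  return tag < RESERVED_COUNT;
}
```

The point of the design is that the reader and writer agree on one small integer range instead of one flag bit per special location, so adding a third reserved location would not require a format change in this sketch.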
Re: Fix more of C/fortran canonical type issues
On Mon, 8 Jun 2015, Jan Hubicka wrote:

Thank you. It is interesting that the DR exists. We do have comments about possibly completing the types by equality established by the symbol table but I thought it is strictly invalid. Not sure how much that buys us though. As for specific examples. Shall we warn for

a.c: int a;
b.c: unsigned int a;

(this seems perfectly valid by C/Fortran rules)

That's clearly invalid (declarations of the same object with incompatible types). int and unsigned int can alias, but declarations still need to be consistent.

OK, so following the wording of the standard we have

  2 All declarations that refer to the same object or function shall have compatible type; otherwise, the behavior is undefined.

in 6.2.7. int and unsigned int are not considered compatible. However the memory accesses are further discussed in 6.5

  7 An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
    1 a type compatible with the effective type of the object,
    2 a qualified version of a type compatible with the effective type of the object,
    3 a type that is the signed or unsigned type corresponding to the effective type of the object,
    4 a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
    5 an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
    6 a character type

So this is where the basic c_get_alias rules come from, i.e. 1-4 basically boil down to ignoring qualifiers and signedness and 6 is about dropping char into alias set 0, so this gives us some extra rules for TBAA compatibility that, however, do not propagate up when building canonical types during LTO.

So we don't really need to ignore signedness for C/C++ only programs while building canonical types (that correspond to the symmetric and transitive closure of the notion of compatible types). In C/Fortran mixed units however C_SIZE_T (that is a signed type) is interoperable with size_t (that is an unsigned type) and this propagates up by:

  A Fortran derived type is interoperable with a C struct type if and only if the Fortran type has the BIND attribute (4.5.2), the Fortran derived type and the C struct type have the same number of components, and the components of the Fortran derived type would interoperate with corresponding components of the C struct type as described in 15.3.5 and 15.3.6 if the components were variables. A component of a Fortran derived type and a component of a C struct type correspond if they are declared in the same relative position in their respective type definitions.

So clearly Fortran's structure containing C_SIZE_T should interoperate with struct {size_t val;}. So we do need to ignore TYPE_UNSIGNED when processing fields of structures in C/Fortran mixed programs and we should not warn on a variable being declared both as size_t and C_SIZE_T.

Honza
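
The difference between "may alias" and "compatible declaration" drawn above can be illustrated with a small C sketch. The function names are hypothetical and chosen only for this example; the access itself relies on item 3 of the quoted 6.5 paragraph 7 list (the unsigned type corresponding to the effective type).

```c
#include <assert.h>

/* Reads an int object through an unsigned int lvalue.  This access is
   permitted because unsigned int is the unsigned type corresponding to
   the effective type (int) of the object.  */
unsigned int
read_as_unsigned (int *p)
{
  assert (p);
  return *(unsigned int *) p;
}

/* Stores to an int object through an unsigned int lvalue; permitted by
   the same rule.  */
void
store_as_unsigned (int *p, unsigned int v)
{
  assert (p);
  *(unsigned int *) p = v;
}
```

Note that this only concerns accesses. Declaring the same object as `int a;` in one translation unit and `unsigned int a;` in another remains undefined under the 6.2.7 wording quoted above, even though accesses of the kind shown here are allowed.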
Re: [PING][PATCH][PR65443] Add transform_to_exit_first_loop_alt
On 08/06/15 17:55, Thomas Schwinge wrote:

Hi Tom!

On Mon, 8 Jun 2015 12:43:01 +0200, Tom de Vries tom_devr...@mentor.com wrote:

There are two problems in try_transform_to_exit_first_loop_alt:
1. In case the latch is not a singleton bb, the function should return false rather than true.
2. The check for singleton bb should ignore debug-insns.

Attached patch fixes these problems.

Fix try_transform_to_exit_first_loop_alt

	PR tree-optimization/66442
	* gimple-iterator.h (gimple_seq_nondebug_singleton_p): Add function.
	* tree-parloops.c (try_transform_to_exit_first_loop_alt): Return false
	if the loop latch is not a singleton.  Use
	gimple_seq_nondebug_singleton_p instead of gimple_seq_singleton_p.

Per my testing, the backport of this patch that you committed to gomp-4_0-branch, r224219, introduces a number of regressions in your OpenACC kernels test cases, specifically the »scan-tree-dump-times parloops_oacc_kernels (?n)pragma omp target oacc_parallel.*num_gangs\\(32\\) 1« tests. Would you please have a look?

Hi Thomas,

I seem to have committed (to both trunk and gomp-4_0-branch) an older version of the patch, which contained an incorrect version of gimple_seq_nondebug_singleton_p. I'll correct the mistake tomorrow morning.

Thanks,
- Tom

Grüße,
 Thomas

 gcc/gimple-iterator.h | 29 +
 gcc/tree-parloops.c   |  4 ++--
 2 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/gcc/gimple-iterator.h b/gcc/gimple-iterator.h
index 87e943a..76fa456 100644
--- a/gcc/gimple-iterator.h
+++ b/gcc/gimple-iterator.h
@@ -345,4 +345,33 @@ gsi_seq (gimple_stmt_iterator i)
   return *i.seq;
 }
 
+/* Determine whether SEQ is a nondebug singleton.  */
+
+static inline bool
+gimple_seq_nondebug_singleton_p (gimple_seq seq)
+{
+  gimple_stmt_iterator gsi;
+
+  /* Find a nondebug gimple.  */
+  gsi.ptr = gimple_seq_first (seq);
+  gsi.seq = seq;
+  gsi.bb = NULL;
+  while (!gsi_end_p (gsi)
+         && is_gimple_debug (gsi_stmt (gsi)))
+    gsi_next (&gsi);
+
+  /* No nondebug gimple found, not a singleton.  */
+  if (gsi_end_p (gsi))
+    return false;
+
+  /* Find a next nondebug gimple.  */
+  gsi_next (&gsi);
+  while (!gsi_end_p (gsi)
+         && is_gimple_debug (gsi_stmt (gsi)))
+    gsi_next (&gsi);
+
+  /* Only a singleton if there's no next nondebug gimple.  */
+  return gsi_end_p (gsi);
+}
+
 #endif /* GCC_GIMPLE_ITERATOR_H */
diff --git a/gcc/tree-parloops.c b/gcc/tree-parloops.c
index 02f44eb..c4b83fe 100644
--- a/gcc/tree-parloops.c
+++ b/gcc/tree-parloops.c
@@ -1769,8 +1769,8 @@ try_transform_to_exit_first_loop_alt (struct loop *loop,
                                       tree nit)
 {
   /* Check whether the latch contains a single statement.  */
-  if (!gimple_seq_singleton_p (bb_seq (loop->latch)))
-    return true;
+  if (!gimple_seq_nondebug_singleton_p (bb_seq (loop->latch)))
+    return false;
 
   /* Check whether the latch contains the loop iv increment.  */
   edge back = single_succ_edge (loop->latch);
-- 
1.9.1
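
The intent of the predicate ("exactly one statement once debug statements are ignored") can be sketched without any GCC types. This is a hypothetical stand-alone model, not the gimple-iterator.h function: struct stmt, skip_debug and seq_nondebug_singleton_p are names invented for the illustration, with a plain singly linked list standing in for a gimple sequence.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one statement in a sequence.  */
struct stmt
{
  bool is_debug;
  struct stmt *next;
};

/* Skip debug statements starting at S; return the first non-debug
   statement, or NULL if there is none.  */
static const struct stmt *
skip_debug (const struct stmt *s)
{
  while (s && s->is_debug)
    s = s->next;
  return s;
}

/* True if SEQ contains exactly one non-debug statement.  */
bool
seq_nondebug_singleton_p (const struct stmt *seq)
{
  const struct stmt *first = skip_debug (seq);

  /* No non-debug statement at all: not a singleton.  */
  if (!first)
    return false;
  assert (!first->is_debug);

  /* A singleton only if nothing but debug statements follows.  */
  return skip_debug (first->next) == NULL;
}
```

An empty sequence and a sequence made only of debug statements both yield false here, which matches the "not a singleton" comment in the patch.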
Re: Fix more of C/fortran canonical type issues
On Mon, 8 Jun 2015, Jan Hubicka wrote:

Joseph, I may be wrong, but I believe that the cross-compilation-unit representation compatibility (in C standard sense) is however not an equivalence class, so it can't be fully represented by TYPE_CANONICAL

Indeed, type compatibility is not transitive, but the expressed intention of WG14 in response to DR#314 is that in a valid program it should be possible to merge the translation units, unifying structure and union types across translation units (with renaming as needed).

-- Joseph S. Myers jos...@codesourcery.com
Re: Fix more of C/fortran canonical type issues
On Mon, 8 Jun 2015, Jan Hubicka wrote:

Joseph, I may be wrong, but I believe that the cross-compilation-unit representation compatibility (in C standard sense) is however not an equivalence class, so it can't be fully represented by TYPE_CANONICAL

Indeed, type compatibility is not transitive, but the expressed intention of WG14 in response to DR#314 is that in a valid program it should be possible to merge the translation units, unifying structure and union types across translation units (with renaming as needed).

Thank you. It is interesting that the DR exists. We do have comments about possibly completing the types by equality established by the symbol table but I thought it is strictly invalid. Not sure how much that buys us though. As for specific examples. Shall we warn for

a.c: int a;
b.c: unsigned int a;

(this seems perfectly valid by C/Fortran rules) or

a.c: int *a;
b.c: unsigned int *a;

or

a.c: struct a {int a;} a;
b.c: struct a {unsigned int a;} a;

so I assume the following is no longer valid

a.c: struct a {int a;} *a;
b.c: struct a *a; struct a *b;
c.c: struct b {float a;} *a;

as we may figure out that struct a in b.c does not have a unique completion.

-- Joseph S. Myers jos...@codesourcery.com
Re: Fix more of C/fortran canonical type issues
On Mon, 8 Jun 2015, Jan Hubicka wrote:

Thank you. It is interesting that the DR exists. We do have comments about possibly completing the types by equality established by the symbol table but I thought it is strictly invalid. Not sure how much that buys us though. As for specific examples. Shall we warn for

a.c: int a;
b.c: unsigned int a;

(this seems perfectly valid by C/Fortran rules)

That's clearly invalid (declarations of the same object with incompatible types). int and unsigned int can alias, but declarations still need to be consistent.

-- Joseph S. Myers jos...@codesourcery.com
Re: [PATCH] Optimize (CST1 << A) == CST2 (PR tree-optimization/66299)
On Mon, 8 Jun 2015, Marek Polacek wrote:

	PR tree-optimization/66299
	* match.pd ((CST1 << A) == CST2 -> A == ctz (CST2) - ctz (CST1)
	((CST1 << A) != CST2 -> A != ctz (CST2) - ctz (CST1)): New

You are braver than I am, I would have abbreviated ctz (CST2) - ctz (CST1) to CST3 in the ChangeLog ;-)

+/* (CST1 << A) == CST2  ->  A == ctz (CST2) - ctz (CST1)
+   (CST1 << A) != CST2  ->  A != ctz (CST2) - ctz (CST1)
+   if CST2 != 0.  */
+(for cmp (ne eq)
+ (simplify
+  (cmp (lshift INTEGER_CST@0 @1) INTEGER_CST@2)
+  (with {
+   unsigned int cand = wi::ctz (@2) - wi::ctz (@0); }
+  (if (!integer_zerop (@2)

You can probably use directly wi::ne_p (@2, 0) here.

Shouldn't this be indented one space more?

+       && wi::eq_p (wi::lshift (@0, cand), @2))
+   (cmp @1 { build_int_cst (TREE_TYPE (@1), cand); })

Making 'cand' signed, you could return 0 when cand<0, like (2<<x)==1. You could also return 0 when the candidate turns out not to work: (3<<x)==4.

Tweaking it so that (6<<X)==0 becomes X>=31 for TYPE_OVERFLOW_WRAPS and false for TYPE_OVERFLOW_UNDEFINED is probably more controversial.

FWIW, the patch looks good to me, thanks.

-- Marc Glisse
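
The arithmetic behind the transform, including the two "no solution" cases mentioned in the review, can be checked with a small C sketch. This is an illustration on fixed 32-bit wrapping values, not the match.pd pattern; ctz32 and solve_shift_eq are hypothetical helper names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Count trailing zero bits of a nonzero 32-bit value.  */
static int
ctz32 (uint32_t x)
{
  int n = 0;

  assert (x != 0);
  while ((x & 1u) == 0)
    {
      x >>= 1;
      n++;
    }
  return n;
}

/* For nonzero CST2, look for A in [0, 31] with (CST1 << A) == CST2 in
   wrapping 32-bit arithmetic.  On success store the only possible
   solution in *A and return true.  Return false when no such A exists
   or when CST2 is zero, a case the transform excludes.  */
bool
solve_shift_eq (uint32_t cst1, uint32_t cst2, int *a)
{
  if (cst1 == 0 || cst2 == 0)
    return false;

  /* ctz (CST1 << A) == ctz (CST1) + A while the result is nonzero,
     so this is the only candidate.  */
  int cand = ctz32 (cst2) - ctz32 (cst1);

  if (cand < 0)
    return false;		/* e.g. (2 << x) == 1 */
  if ((uint32_t) (cst1 << cand) != cst2)
    return false;		/* e.g. (3 << x) == 4 */

  *a = cand;
  return true;
}
```

For example, with CST1 = 3 and CST2 = 24 the candidate is ctz (24) - ctz (3) = 3 - 0 = 3, and 3 << 3 is indeed 24.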
Re: debug-early branch merged into mainline
On 06/08/2015 09:30 AM, Richard Biener wrote: On Mon, Jun 8, 2015 at 2:05 PM, Aldy Hernandez al...@redhat.com wrote: On 06/08/2015 04:26 AM, Richard Biener wrote: On Mon, Jun 8, 2015 at 3:23 AM, Aldy Hernandez al...@redhat.com wrote: On 06/07/2015 02:33 PM, Richard Biener wrote: On June 7, 2015 6:00:05 PM GMT+02:00, Aldy Hernandez al...@redhat.com wrote: On 06/07/2015 11:25 AM, Richard Biener wrote: On June 7, 2015 5:03:30 PM GMT+02:00, Aldy Hernandez al...@redhat.com wrote: On 06/06/2015 05:49 AM, Andreas Schwab wrote: Bootstrap fails on aarch64: Comparing stages 2 and 3 warning: gcc/cc1objplus-checksum.o differs warning: gcc/cc1obj-checksum.o differs warning: gcc/cc1plus-checksum.o differs warning: gcc/cc1-checksum.o differs Bootstrap comparison failure! gcc/ira-costs.o differs gcc/tree-sra.o differs gcc/tree-parloops.o differs gcc/tree-vect-data-refs.o differs gcc/java/jcf-io.o differs gcc/ipa-inline-analysis.o differs The bootstrap comparison failure on ppc64le, aarch64, and possibly others is due to the order of some sections being in a different order with and without debugging. Stage2 is being compiled with no debugging due to -gtoggle, and stage3 is being compiled with debugging. For ira-costs.o on ppc64le we have: -Disassembly of section .rodata._ZN10hash_tableI19cost_classes_hasher11xcallocatorE6expandEv.str1.8: +Disassembly of section .rodata._ZN10hash_tableI19cost_classes_hasher11xcallocatorE26find_empty_slot_for_expandEj.str1.8: ... -Disassembly of section .rodata._ZN10hash_tableI19cost_classes_hasher11xcallocatorE26find_empty_slot_for_expandEj.str1.8: +Disassembly of section .rodata._ZN10hash_tableI19cost_classes_hasher11xcallocatorE6expandEv.str1.8: There is no semantic difference between the objects, just the ordering. I assume it's the same problem for the rest of the objects and architectures. I will look into this, unless someone beats me to it, or has an idea right off the bat. Check whether the symbol table walkers are walking hash tables. 
I assume the above are emitted via the symbol removal handling for debug stuff?

Ughh, indeed. These sections are being outputted from output_object_blocks which traverses a hash table:

void
output_object_blocks (void)
{
  object_block_htab->traverse<void *, output_object_block_htab> (NULL);
}

Perhaps we should sort them by some deterministic field and then call output_object_block() on each member of the resulting list?

Yes, that would be the usual fix. Maybe sth has an UID already, is the 'object' a decl by chance?

The attached patch fixes the bootstrap failure on ppc64le, and theoretically the aarch64 problem as well, but I haven't checked. Tested on ppc64le linux by bootstrapping, and regtesting C/C++ against pre debug-early merge sources. Also tested by a full bootstrap and regtest on x86-64 Linux. OK for mainline?

Please use FOR_EACH_HASH_TABLE_ELEMENT to put elements on the vector instead of the htab traversal. The compare function looks like we will end up having many equal elements (and thus random ordering on hosts where qsort doesn't behave sane here, like Solaris IIRC). Unless all sections are named (which it looks like)

Some sections are not named. How about we sort the named sections and output them, but call output_object_block() on the rest of the sections on whatever order they were in? This solves the bootstrap problem as well. Attached patch tested on x86-64 and ppc64le Linux. OK?

No, but hash_section suggests to sort after sect->common.flags if the section is not named. Conveniently flags is just an 'int' ...

What about if the comparison routine gets a named section and an unnamed section? How to compare? That's why I was giving priority to one over the other originally, but I didn't know about problematic qsort implementations.

Aldy
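
One possible shape for a comparison routine that handles the mixed named/unnamed case is sketched below. It is not the patch under review and makes assumptions the thread does not settle: struct block is a simplified, hypothetical stand-in for an object block's section, named entries are arbitrarily placed before unnamed ones, and unnamed entries are ordered by flags as suggested above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for an object block's section.  */
struct block
{
  const char *name;		/* NULL for an unnamed section.  */
  int flags;
};

/* Total order for qsort: named blocks first, ordered by name; unnamed
   blocks after them, ordered by flags.  */
int
block_cmp (const void *pa, const void *pb)
{
  const struct block *a = (const struct block *) pa;
  const struct block *b = (const struct block *) pb;

  if (a->name && b->name)
    return strcmp (a->name, b->name);
  if (a->name)
    return -1;
  if (b->name)
    return 1;
  /* Avoid overflow from subtracting two ints.  */
  return (a->flags > b->flags) - (a->flags < b->flags);
}

/* Sort N blocks in V into the order defined by block_cmp.  */
void
sort_blocks (struct block *v, size_t n)
{
  assert (v != NULL || n == 0);
  if (n > 1)
    qsort (v, n, sizeof *v, block_cmp);
}
```

Two entries that compare equal under this key (for instance unnamed sections with identical flags) would still be ordered arbitrarily by qsort, so a real fix would need either a guarantee that such ties cannot occur or a further tie-breaker.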
pr66345.c size_t assumption bug
The testcase for pr 66345 assumes size_t is unsigned long instead of using the real type, which causes failures on some 16-bit targets. Ok?

Also, I note that some tests check for __SIZE_TYPE__ as I do below, and others use it unconditionally as a replacement for size_t. Is there a convention?

	* gcc.dg/torture/pr66345.c: Fix assumption about size_t type.

2015-06-08  Tom de Vries  t...@codesourcery.com

Index: gcc.dg/torture/pr66345.c
===================================================================
--- gcc.dg/torture/pr66345.c (revision 224260)
+++ gcc.dg/torture/pr66345.c (working copy)
@@ -1,9 +1,15 @@
 /* { dg-do compile } */
 
-extern int snprintf (char *, unsigned long, const char *, ...);
+#ifdef __SIZE_TYPE__
+typedef __SIZE_TYPE__ size_t;
+#else
+typedef unsigned int size_t;
+#endif
+
+extern int snprintf (char *, size_t, const char *, ...);
 const char a[] = "";
 int b;
 void
 get_bar ()
 {
   snprintf (0, 0, "%s", &a[b]);
Re: [C++ Patch] PR 65815
.. in case it isn't obvious: this case is already OK:

struct array { int data [2]; };

struct X
{
  X() : a{ 1, 2 } { }
  array a;
};

because perform_member_init calls reshape_init.

Paolo.
Re: [PATCH, AARCH64] make stdarg functions work with +nofp
On Tue, Jun 2, 2015 at 3:45 AM, James Greenhalgh james.greenha...@arm.com wrote: On Tue, Jun 02, 2015 at 11:38:29AM +0100, Kyrill Tkachov wrote: Hi James, Jim, On 02/06/15 10:42, James Greenhalgh wrote: On Sat, May 23, 2015 at 12:24:00AM +0100, Jim Wilson wrote: The compiler currently ICEs when compiling a stdarg function with +nofp, as reported in PR 66258. I'd like approval to add this patch to the gcc-5 release branch. I got two requests for this in the PR as currently grub won't build with gcc-5.1. I tested the patch on the gcc-5-release branch with a default languages bootstrap and make check on an APM box running Ubuntu. I also verified that the patch fixes my testcase. Jim
Re: pr66345.c size_t assumption bug
On Mon, 8 Jun 2015, DJ Delorie wrote: Also, I note that some tests check for __SIZE_TYPE__ as I do below, and others use it unconditionally as a replacement for size_t. Is there a convention? As far as I can tell, __SIZE_TYPE__ is always defined. The tests that check for it probably date from a time when it wasn't? -- Marc Glisse
Re: [Patch, fortran, PR44672, v9] [F08] ALLOCATE with SOURCE and no array-spec
All,

I sincerely hope this patch will hit the trunk soon. There are 9 users on the cc list for this bug so it is clearly of considerable user interest. I was recently informed that the following three-line program does not compile:

$ cat source-allocation.f90
integer, allocatable :: x(:)
allocate(x,source=[1])
end
$ gfortran source-allocation.f90
source-allocation.f90:2:11:

 allocate(x,source=[1])
           1
Error: Array specification required in ALLOCATE statement at (1)
$ gfortran --version
GNU Fortran (GCC) 6.0.0 20150607 (experimental)

I was heartened to find out from the initial bug report that it's a Fortran 2008 feature, which makes the behavior somewhat understandable, but it's a fairly simple use case that I would imagine will be used widely. FYI, the above three-line program compiles and executes cleanly with the NAG, Cray, Intel, and Portland Group compilers.

Damian Rouson, Ph.D., P.E.
Founder & President, Sourcery, Inc.
510-600-2992 (mobile)
http://www.sourceryinstitute.org
http://rouson.youcanbook.me

On Jun 5, 2015, at 4:04 AM, Andre Vehreschild ve...@gmx.de wrote:

Hi all,

attached is the most recent version of the patch. It addresses the standard violation of allocate(foo, source=[bar(something)]), where foo after the allocate was a zero-based array instead of a one-based one. Furthermore, this patch fixes calling _vptr->_copy () routines, which come without an interface specification, leading to all arguments being passed by reference. When copying a deferred length string this is hazardous, because a __copy_character_* () routine's third and fourth arguments are passed by value. This is fixed by simply counting the actual arguments and using pass by value for the third and fourth argument to the _copy routine.

Bootstraps and regtests ok on x86_64-linux-gnu/f21. Ok for trunk?

- Andre
-- Andre Vehreschild * Email: vehre ad gmx dot de

pr44672_9.clog pr44672_9.patch
Re: [PATCH] Add debug msg to dump_file in add_new_function
On Thu, Jun 4, 2015 at 4:42 PM, Tom de Vries tom_devr...@mentor.com wrote: On 04/06/15 15:20, Tom de Vries wrote: Hi, [ posted earlier as part of Don't dump low gimple functions in gimple dump, https://gcc.gnu.org/ml/gcc-patches/2014-05/msg01586.html, currently discussed at: https://gcc.gnu.org/ml/gcc-patches/2015-05/msg02076.html ] This patch adds a debug msg to dump_file in cgraph_node::add_new_function. Added ssa gimple case for ompexpssa, and corresponding test-case. Ok. Thanks, Richard. OK for trunk (after retesting)? Thanks, - Tom
Re: genmatch: guess the type of a?b:c as b instead of a
On Sat, Jun 6, 2015 at 1:34 PM, Marc Glisse marc.gli...@inria.fr wrote:

Hello,

as discussed around https://gcc.gnu.org/ml/gcc-patches/2015-06/msg00041.html we are currently guessing the type of a?b:c incorrectly. This does not affect current simplifications, because the only 'cond' in output patterns are at the outermost level, so their type is forced to 'type' and never guessed. Indeed, the patch does not change the generated *-match.c. It would allow removing an explicit cond:itype in a patch posted by Jeff.

I tested it on a dummy .pd file containing:

(simplify
 (plus @0 (plus @1 @2))
 (negate (cond @0 @1 @2)))

and the generated files differ by:

- res = fold_build3_loc (loc, COND_EXPR, TREE_TYPE (ops1[0]), ops1[0], ops1[1], ops1[2]);
+ res = fold_build3_loc (loc, COND_EXPR, TREE_TYPE (ops1[1]), ops1[0], ops1[1], ops1[2]);

(and something similar for gimple)

I wondered about using something like
VOID_TYPE_P (TREE_TYPE (ops1[1])) ? TREE_TYPE (ops1[2]) : TREE_TYPE (ops1[1])
but I don't think that will be necessary.

Yeah, I think we can't currently match this anyway.

Bootstrap is currently broken on many platforms with comparison failures, but since it went that far and generated the same *-match.c files, that seems sufficient testing.

Ok.

(this is indeed how I test genmatch.c patches - look at differences in generated {generic,gimple}-match.c and play with toy patterns and check their handling)

Thanks,
Richard.

2015-06-08  Marc Glisse  marc.gli...@inria.fr

	* genmatch.c (expr::gen_transform): For conditions, guess the type
	from the second operand.

-- Marc Glisse

Index: gcc/genmatch.c
===================================================================
--- gcc/genmatch.c (revision 224186)
+++ gcc/genmatch.c (working copy)
@@ -1702,20 +1702,27 @@ expr::gen_transform (FILE *f, const char
       type = optype;
     }
   else if (is_a <operator_id *> (operation)
            && !strcmp (as_a <operator_id *> (operation)->tcc, "tcc_comparison"))
     {
       /* comparisons use boolean_type_node (or what gets in), but
          their operands need to figure out the types themselves.  */
       sprintf (optype, "boolean_type_node");
       type = optype;
     }
+  else if (*operation == COND_EXPR
+           || *operation == VEC_COND_EXPR)
+    {
+      /* Conditions are of the same type as their first alternative.  */
+      sprintf (optype, "TREE_TYPE (ops%d[1])", depth);
+      type = optype;
+    }
   else
     {
       /* Other operations are of the same type as their first operand.  */
       sprintf (optype, "TREE_TYPE (ops%d[0])", depth);
       type = optype;
     }
   if (!type)
     fatal ("two conversions in a row");
 
   fprintf (f, "{\n");
Re: [patch] fix _OBJC_Module defined but not used warning
Hi Aldy, On 7 Jun 2015, at 12:37, Aldy Hernandez wrote: On 06/07/2015 06:19 AM, Andreas Schwab wrote: Another fallout: FAIL: obj-c++.dg/try-catch-5.mm -fgnu-runtime (test for excess errors) Excess errors: built-in: warning: '_OBJC_Module' defined but not used [-Wunused-variable] check_global_declarations is called for more symbols now. All the defined but not used errors I've seen in development have been legitimate. For tests, the tests should be fixed. For built-ins such as these, does the attached fix the problem? It is up to the objc maintainers, we can either fix this with the attached patch, The current patch is OK. or setting DECL_IN_SYSTEM_HEADER. This seems a better long-term idea; however, I would prefer to go through all the cases where it would be applicable (including for the NeXT runtime) and apply that change as a coherent patch. At the moment dealing with the NeXT stuff is a bit hampered by pr66448. thanks, Iain
Re: [PR64164] drop copyrename, integrate into expand
On Sat, Jun 6, 2015 at 3:14 AM, Alexandre Oliva aol...@redhat.com wrote: On Apr 27, 2015, Richard Biener richard.guent...@gmail.com wrote: This should also mention that is_gimple_reg vars do not have their address taken. check +static tree +leader_merge (tree cur, tree next) Ick - presumably you can't use sth better than a TREE_LIST here? The list was an experiment that never really worked, and when I tried to make it work after the patch, it proved to be unworkable, so I dropped it, and rewrote leader_merge to choose either of the params, preferring anonymous over ignored over named, so as to reduce the likelihood of misreading of debug dumps, since that's all they're used for. static void -expand_one_stack_var (tree var) +expand_one_stack_var_1 (tree var) { HOST_WIDE_INT size, offset; unsigned byte_align; - size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var))); - byte_align = align_local_variable (SSAVAR (var)); + if (TREE_CODE (var) != SSA_NAME || SSA_NAME_VAR (var)) +{ + size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var))); + byte_align = align_local_variable (SSAVAR (var)); +} + else I'd go here for all TREE_CODE (var) == SSA_NAME Check (and get rid of the SSAVAR macro?) There are remaining uses that don't seem worth dropping it for. +/* Return the promoted mode for name. If it is a named SSA_NAME, it + is the same as promote_decl_mode. Otherwise, it is the promoted + mode of a temp decl of same type as the SSA_NAME, if we had created + one. */ + +machine_mode +promote_ssa_mode (const_tree name, int *punsignedp) +{ + gcc_assert (TREE_CODE (name) == SSA_NAME); + + if (SSA_NAME_VAR (name)) +return promote_decl_mode (SSA_NAME_VAR (name), punsignedp); As above I'd rather not have different paths for anonymous vs. non-anonymous vars (so just delete the above two lines). 
Check @@ -9668,6 +9678,11 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode, pmode = promote_function_mode (type, mode, unsignedp, gimple_call_fntype (g), 2); + else if (!exp) + { + gcc_assert (code == SSA_NAME); promote_ssa_mode should assert this. + pmode = promote_ssa_mode (ssa_name, unsignedp); It does, so... check. @@ -2121,6 +2122,15 @@ aggregate_value_p (const_tree exp, const_tree fntype) bool use_register_for_decl (const_tree decl) { + if (TREE_CODE (decl) == SSA_NAME) +{ + if (!SSA_NAME_VAR (decl)) + return TYPE_MODE (TREE_TYPE (decl)) != BLKmode + !(flag_float_store FLOAT_TYPE_P (TREE_TYPE (decl))); + + decl = SSA_NAME_VAR (decl); See above. Please drop the SSA_NAME_VAR != NULL path. Check, then taken back, after a bootstrap failure and some debugging made me realize this would be wrong. Here are the nearly-added comments that explain why: /* We often try to use the SSA_NAME, instead of its underlying decl, to get type information and guide decisions, to avoid differences of behavior between anonymous and named variables, but in this one case we have to go for the actual variable if there is one. The main reason is that, at least at -O0, we want to place user variables on the stack, but we don't mind using pseudos for anonymous or ignored temps. Should we take the SSA_NAME, we'd conclude all SSA_NAMEs should go in pseudos, whereas their corresponding variables might have to go on the stack. So, disregarding the decl here would negatively impact debug info at -O0, enable coalescing between SSA_NAMEs that ought to get different stack/pseudo assignments, and get the incoming argument processing thoroughly confused by PARM_DECLs expected to live in stack slots but assigned to pseudos. */ +++ b/gcc/gimple-expr.h +/* Defined in tree-ssa-coalesce.c. */ +extern bool gimple_can_coalesce_p (tree, tree); Err, put it to tree-ssa-coalesce.h? Check. Lots of additional headers required to be able to include tree-ssa-coalesce.h, though. 
- gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (var))); + gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (var ? var : name))); The TREE_TYPE of name and its SSA_NAME_VAR are always the same. So just use TREE_TYPE (name) here. Check gcc_assert (!REG_P (dest_rtx) - || dest_mode == promote_decl_mode (var, unsignedp)); + || dest_mode == promote_ssa_mode (name, unsignedp)); if (src_mode != dest_mode) { @@ -714,12 +715,12 @@ static rtx get_temp_reg (tree name) { tree var = TREE_CODE (name) == SSA_NAME ? SSA_NAME_VAR (name) : name; - tree type = TREE_TYPE (var); + tree type = var ? TREE_TYPE (var) : TREE_TYPE (name); See above. Check Here's the revised patch, regstrapped on x86_64-linux-gnu and i686-linux-gnu. The first attempt failed to compile libjava on x86_64, requiring the new
Re: Expand oacc kernels after pass_fre
On Thu, 4 Jun 2015, Tom de Vries wrote:

     {
       gsi_next (&gsi);
       continue;
diff --git gcc/tree-ssa-sccvn.c gcc/tree-ssa-sccvn.c
index e417a15..449a615 100644
--- gcc/tree-ssa-sccvn.c
+++ gcc/tree-ssa-sccvn.c
@@ -85,6 +85,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "ipa-ref.h"
 #include "plugin-api.h"
 #include "cgraph.h"
+#include "omp-low.h"
 
 /* This algorithm is based on the SCC algorithm presented by Keith
    Cooper and L. Taylor Simpson in "SCC-Based Value numbering"
@@ -3542,7 +3543,8 @@ visit_use (tree use)
     {
       if (gimple_code (stmt) == GIMPLE_PHI)
         changed = visit_phi (stmt);
-      else if (gimple_has_volatile_ops (stmt))
+      else if (gimple_has_volatile_ops (stmt)
+               || gimple_stmt_omp_data_i_init_p (stmt))

No.

What is the intent of these changes?

These are changes to handle the kernels region conservatively, in order to not undo the omp-lowering before getting to the oacc-parloops pass.

Still it feels too much like the MPX mistake (maintenance cost and compile-time cost). How can any pass undo omp-lowering?

Richard.
Re: Fix LTO streaming of BUILTINS_LOCATION
+  bp_pack_value (bp, loc == BUILTINS_LOCATION, 1);
+  if (loc == BUILTINS_LOCATION)
+    return;

Hmm, with this and

#define DECL_IS_BUILTIN(DECL) \
  (LOCATION_LOCUS (DECL_SOURCE_LOCATION (DECL)) <= BUILTINS_LOCATION)

shouldn't we rather stream all locations <= BUILTINS_LOCATION literally? That is, instead of two bits stream a [0, BUILTINS_LOCATION+1] 'enum' here? Btw, line-map.h has RESERVED_LOCATION_COUNT for the locations that are special (currently two, so your patch will work in practice).

Yep, I considered that. Because we have precisely two special locations (ATM) and UNKNOWN_LOCATION is quite common, that would waste one extra bit on that case. Probably not a big deal, so if you think streaming all locations up to BUILTINS_LOCATION literally is more robust, I will update the patch.

   xloc = expand_location (loc);
Index: lto-streamer-in.c
===================================================================
--- lto-streamer-in.c (revision 224201)
+++ lto-streamer-in.c (working copy)
@@ -283,6 +283,11 @@
       *loc = UNKNOWN_LOCATION;
       return;
     }
+  if (bp_unpack_value (bp, 1))
+    {
+      *loc = BUILTINS_LOCATION;
+      return;
+    }
   *loc = BUILTINS_LOCATION + 1;

Btw, this assignment to *loc looks odd (I suppose it's to make location caching work).

*loc is set to UNKNOWN_LOCATION/BUILTINS_LOCATION for those locations that are not cached and all others get BUILTINS_LOCATION + 1 which quite safely triggers an ICE in line_map lookup, though I do not recall why. I originally used UNKNOWN_LOCATION for cached values but that did not work as it confused DECL_IS_BUILTIN. We could extend the API by adding INVALID_LOCATION and set it to INT_MAX or something that would also ICE.

Honza

Richard.

   file_change = bp_unpack_value (bp, 1);

-- Richard Biener rguent...@suse.de SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Dilip Upmanyu, Graham Norton, HRB 21284 (AG Nuernberg)
Re: Don't dump low gimple functions in gimple dump
On Thu, Jun 4, 2015 at 5:02 PM, Tom de Vries tom_devr...@mentor.com wrote: On 22/05/15 11:27, Richard Biener wrote: On Thu, May 21, 2015 at 5:36 PM, Thomas Schwinge tho...@codesourcery.com wrote: Hi! It's just been a year. ;-P In early March, I (hopefully correctly) adapted Tom's patch to apply to then-current GCC trunk sources; posting this here. Is the general approach OK? On Tue, 20 May 2014 10:16:45 +0200, Tom de Vries tom_devr...@mentor.com wrote: Honza, Consider this program: ... int main(void) { #pragma omp parallel { extern void foo(void); foo (); } return 0; } ... When compiling this program with -fopenmp, the ompexp pass splits off a new function called main._omp_fn.0 containing the call to foo. The new function is then dumped into the gimple dump by analyze_function. There are two problems with this: - the new function is in low gimple, and is dumped next to high gimple functions - since it's already low, the new function is not lowered, and 'goes missing' in the dumps following the gimple dump, until it reappears again after the last lowering dump. [ http://gcc.gnu.org/ml/gcc/2014-03/msg00312.html ] This patch fixes the problems by ensuring that analyze_function only dumps the new function to the gimple dump after gimplification (specifically, by moving the dump_function call into gimplify_function_tree. That makes the call to dump_function in finalize_size_functions superfluous). That also requires us to add a call to dump_function in finalize_task_copyfn, where we split off a new high gimple function. And in expand_omp_taskreg and expand_omp_target, where we split off a new low gimple function, we now dump the new function into the current (ompexp) dump file, which is the last lowering dump. Finally, we dump an information statement at the start of cgraph_add_new_function to give a better idea when and what kind of function is created. Bootstrapped and reg-tested on x86_64. OK for trunk ? 
Thanks,
- Tom

commit b925b393c3d975a9281789d97aff8a91a8b53be0
Author: Thomas Schwinge tho...@codesourcery.com
Date:   Sun Mar 1 15:05:15 2015 +0100

    Don't dump low gimple functions in gimple dump

    id:537b0f6d.7060...@mentor.com or id:53734dc5.90...@mentor.com

    2014-05-19 Tom de Vries t...@codesourcery.com

    * cgraphunit.c (cgraph_add_new_function): Dump message on new function.
    (analyze_function): Don't dump function to gimple dump file.
    * gimplify.c: Add tree-dump.h include.
    (gimplify_function_tree): Dump function to gimple dump file.
    * omp-low.c: Add tree-dump.h include.
    (finalize_task_copyfn): Dump new function to gimple dump file.
    (expand_omp_taskreg, expand_omp_target): Dump new function to dump file.
    * stor-layout.c (finalize_size_functions): Don't dump function to gimple dump file.

    * gcc.dg/gomp/dump-task.c: New test.
---
 gcc/cgraphunit.c | 15 ++-
 gcc/gimplify.c | 3 +++
 gcc/omp-low.c | 6 ++
 gcc/stor-layout.c | 1 -
 gcc/testsuite/gcc.dg/gomp/dump-task.c | 33 +
 5 files changed, 56 insertions(+), 2 deletions(-)

diff --git gcc/cgraphunit.c gcc/cgraphunit.c
index 8280fc4..0860c86 100644
--- gcc/cgraphunit.c
+++ gcc/cgraphunit.c
@@ -501,6 +501,20 @@ cgraph_node::add_new_function (tree fndecl, bool lowered)
 {
   gcc::pass_manager *passes = g->get_passes ();
   cgraph_node *node;
+
+  if (dump_file)
+    {
+      const char *function_type = ((gimple_has_body_p (fndecl))
+                                   ? (lowered
+                                      ? "low gimple"
+                                      : "high gimple")
+                                   : "to-be-gimplified");
+      fprintf (dump_file,
+               "Added new %s function %s to callgraph\n",
+               function_type,
+               fndecl_name (fndecl));
+    }
+
   switch (symtab->state)
     {
       case PARSING:

Split off this hunk as a separate patch: https://gcc.gnu.org/ml/gcc-patches/2015-06/msg00416.html .

@@ -629,7 +643,6 @@ cgraph_node::analyze (void)
          body.  */
       if (!gimple_has_body_p (decl))
         gimplify_function_tree (decl);
-      dump_function (TDI_generic, decl);

       /* Lower the function.  */
       if (!lowered)
diff --git gcc/gimplify.c gcc/gimplify.c
index 9214648..d6c500d 100644
--- gcc/gimplify.c
+++ gcc/gimplify.c
@@ -87,6 +87,7 @@ along with GCC; see the file COPYING3. If not see
 #include "gimple-low.h"
 #include "cilk.h"
 #include "gomp-constants.h"
+#include "tree-dump.h"
 #include "langhooks-def.h" /* FIXME: for lhd_set_decl_assembler_name */
 #include "tree-pass.h" /* FIXME: only for PROP_gimple_any */
@@ -9435,6 +9436,8 @@
Re: [patch] Implement Ada support for DragonFly, improve it for FreeBSD
Okay, I've attached them. I hope it helps! Thanks. The patch has been installed on the mainline. -- Eric Botcazou
Re: Fix LTO streaming of BUILTINS_LOCATION
On Mon, 8 Jun 2015, Jan Hubicka wrote:

Hi,
currently we stream BUILTINS_LOCATION by expanding it and streaming the resulting filename/line/col triplet. That is nonsense and breaks some logic that special-cases it. This patch fixes it by special-casing it the same way as we do UNKNOWN_LOCATION (we have precisely 2 special location codes, so doing a compound bitpack is not needed).

Bootstrapped/regtested ppc64le-linux, OK?

Honza

    * lto-streamer-out.c (lto_output_location): Correctly stream BUILTINS_LOCATION.
    * lto-streamer-in.c (lto_input_location): Likewise.

Index: lto-streamer-out.c
===
--- lto-streamer-out.c (revision 224201)
+++ lto-streamer-out.c (working copy)
@@ -205,6 +205,9 @@
   bp_pack_value (bp, loc == UNKNOWN_LOCATION, 1);
   if (loc == UNKNOWN_LOCATION)
     return;
+  bp_pack_value (bp, loc == BUILTINS_LOCATION, 1);
+  if (loc == BUILTINS_LOCATION)
+    return;

Hmm, with this and

  #define DECL_IS_BUILTIN(DECL) \
    (LOCATION_LOCUS (DECL_SOURCE_LOCATION (DECL)) <= BUILTINS_LOCATION)

shouldn't we rather stream all locations <= BUILTINS_LOCATION literally? That is, instead of two bits stream a [0, BUILTINS_LOCATION+1] 'enum' here? Btw, line-map.h has RESERVED_LOCATION_COUNT for the locations that are special (currently two, so your patch will work in practice).

   xloc = expand_location (loc);
Index: lto-streamer-in.c
===
--- lto-streamer-in.c (revision 224201)
+++ lto-streamer-in.c (working copy)
@@ -283,6 +283,11 @@
       *loc = UNKNOWN_LOCATION;
       return;
     }
+  if (bp_unpack_value (bp, 1))
+    {
+      *loc = BUILTINS_LOCATION;
+      return;
+    }
   *loc = BUILTINS_LOCATION + 1;

Btw, this assignment to *loc looks odd (I suppose it's to make location caching work).

Richard.

   file_change = bp_unpack_value (bp, 1);

--
Richard Biener rguent...@suse.de
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Dilip Upmanyu, Graham Norton, HRB 21284 (AG Nuernberg)
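The one-flag-bit-per-special-location scheme in the patch above can be sketched outside GCC roughly as follows. This is only an illustration, not the streamer code: `BitPack`, `pack_location` and `unpack_location` are hypothetical names, the constants assume UNKNOWN_LOCATION is 0 and BUILTINS_LOCATION is 1, and the 32-bit payload stands in for the expanded file/line/column data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the two reserved location codes.
constexpr std::uint32_t kUnknownLocation = 0;
constexpr std::uint32_t kBuiltinsLocation = 1;

// Toy bit stream (an assumption; GCC's bitpack_d works differently).
struct BitPack {
  std::vector<bool> bits;
  std::size_t read_pos = 0;

  void pack(std::uint32_t value, unsigned nbits) {
    for (unsigned i = 0; i < nbits; ++i)
      bits.push_back(((value >> i) & 1u) != 0);
  }

  std::uint32_t unpack(unsigned nbits) {
    std::uint32_t value = 0;
    for (unsigned i = 0; i < nbits; ++i) {
      const bool bit = bits.at(read_pos++);
      if (bit)
        value |= 1u << i;
    }
    return value;
  }
};

// One flag bit per special location, then the payload for ordinary ones.
void pack_location(BitPack &bp, std::uint32_t loc) {
  bp.pack(loc == kUnknownLocation, 1);
  if (loc == kUnknownLocation)
    return;
  bp.pack(loc == kBuiltinsLocation, 1);
  if (loc == kBuiltinsLocation)
    return;
  bp.pack(loc, 32);  // placeholder for the real file/line/column stream
}

std::uint32_t unpack_location(BitPack &bp) {
  if (bp.unpack(1))
    return kUnknownLocation;
  if (bp.unpack(1))
    return kBuiltinsLocation;
  return bp.unpack(32);
}
```

In this sketch the common unknown case costs a single bit, which is the property Honza's patch preserves.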
Re: [PR58315] reset inlined debug vars at return-to point
On Wed, Jun 3, 2015 at 11:55 PM, Alexandre Oliva aol...@redhat.com wrote: On Feb 25, 2015, Alexandre Oliva aol...@redhat.com wrote: This patch fixes a problem that has been with us for several years. Variable tracking has no idea about the end of the lifetime of inlined variables, so it keeps on computing locations for them over and over, even though the computed locations make no sense whatsoever because the variable can't even be accessed any more. With this patch, we unbind all inlined variables at the point the inlined function returns to, so that the locations for those variables will not be touched any further. In theory, we could do something similar to non-inlined auto variables, when they go out of scope, but their decls apply to the entire function and I'm told gdb sort-of expects the variables to be accessible throughout the function, so I'm not tackling that in this patch, for I'm happy enough with what this patch gets us: - almost 99% reduction in the output asm for the PR testcase - more than 90% reduction in the peak memory use compiling that testcase - 63% reduction in the compile time for that testcase What's scary is that the testcase is not particularly pathological. Any function that calls a longish sequence of inlined functions, that in turn call other inline functions, and so on, something that's not particularly unusual in C++, will likely observe significant improvement, as we won't see growing sequences of var_location notes after each call or so, as var-tracking computes a new in-stack location for the implicit this argument of each previously-inlined function. Regstrapped on x86_64-linux-gnu and i686-linux-gnu. Ok to install? Ping? Ok for trunk and 5.2 after a while with no issues popping up. Thanks, Richard. for gcc/ChangeLog PR debug/58315 * tree-inline.c (reset_debug_binding): New. (reset_debug_bindings): Likewise. (expand_call_inline): Call it. 
---
 gcc/tree-inline.c | 56 +
 1 file changed, 56 insertions(+)

diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index 71d75d9..c1578e5 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -4346,6 +4346,60 @@ add_local_variables (struct function *callee, struct function *caller,
     }
 }

+/* Add to BINDINGS a debug stmt resetting SRCVAR if inlining might
+   have brought in or introduced any debug stmts for SRCVAR.  */
+
+static inline void
+reset_debug_binding (copy_body_data *id, tree srcvar, gimple_seq *bindings)
+{
+  tree *remappedvarp = id->decl_map->get (srcvar);
+
+  if (!remappedvarp)
+    return;
+
+  if (TREE_CODE (*remappedvarp) != VAR_DECL)
+    return;
+
+  if (*remappedvarp == id->retvar || *remappedvarp == id->retbnd)
+    return;
+
+  tree tvar = target_for_debug_bind (*remappedvarp);
+  if (!tvar)
+    return;
+
+  gdebug *stmt = gimple_build_debug_bind (tvar, NULL_TREE,
+                                          id->call_stmt);
+  gimple_seq_add_stmt (bindings, stmt);
+}
+
+/* For each inlined variable for which we may have debug bind stmts,
+   add before GSI a final debug stmt resetting it, marking the end of
+   its life, so that var-tracking knows it doesn't have to compute
+   further locations for it.  */
+
+static inline void
+reset_debug_bindings (copy_body_data *id, gimple_stmt_iterator gsi)
+{
+  tree var;
+  unsigned ix;
+  gimple_seq bindings = NULL;
+
+  if (!gimple_in_ssa_p (id->src_cfun))
+    return;
+
+  if (!opt_for_fn (id->dst_fn, flag_var_tracking_assignments))
+    return;
+
+  for (var = DECL_ARGUMENTS (id->src_fn);
+       var; var = DECL_CHAIN (var))
+    reset_debug_binding (id, var, &bindings);
+
+  FOR_EACH_LOCAL_DECL (id->src_cfun, ix, var)
+    reset_debug_binding (id, var, &bindings);
+
+  gsi_insert_seq_before_without_update (&gsi, bindings, GSI_SAME_STMT);
+}
+
 /* If STMT is a GIMPLE_CALL, replace it with its inline expansion.  */

 static bool
@@ -4659,6 +4713,8 @@ expand_call_inline (basic_block bb, gimple stmt, copy_body_data *id)
                        GCOV_COMPUTE_SCALE (cg_edge->frequency, CGRAPH_FREQ_BASE),
                        bb, return_block, NULL);

+  reset_debug_bindings (id, stmt_gsi);
+
   /* Reset the escaped solution.  */
   if (cfun->gimple_df)
     pt_solution_reset (&cfun->gimple_df->escaped);

--
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer
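As a rough illustration of the idea in the patch above (emit one "reset" debug bind per inlined variable at the point the inlined call returns to), here is a toy model in plain C++. It does not use GCC's data structures: `DeclMap`, the string-based statements and the function signature are all assumptions made for the sketch, and the real code additionally checks for SSA form, VAR_DECLs and flag_var_tracking_assignments.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy model: inlined (source) variable name -> name of its copy in the caller.
using DeclMap = std::map<std::string, std::string>;

// Build the list of debug-bind resets to insert at the return-to point.
// Variables that were never remapped are skipped, and so is the variable
// holding the return value, since it stays live after the call.
std::vector<std::string> reset_debug_bindings(
    const std::vector<std::string> &inlined_vars,
    const DeclMap &decl_map,
    const std::string &retvar) {
  std::vector<std::string> stmts;
  for (const std::string &src : inlined_vars) {
    const auto it = decl_map.find(src);
    if (it == decl_map.end())
      continue;  // never remapped, so no debug stmts can refer to it
    if (it->second == retvar)
      continue;  // the return value must not be reset
    stmts.push_back("# DEBUG " + it->second + " => NULL");
  }
  return stmts;
}
```

The point of the reset is that later passes have an explicit end-of-life marker for each inlined variable, rather than having to keep computing locations for it.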
Re: [patch#2] PR other/65366: Fix gdbhooks.py for GDB with Python3
On Wed, Jun 3, 2015 at 3:26 PM, Jan Kratochvil jan.kratoch...@redhat.com wrote:
On Wed, 03 Jun 2015 10:25:20 +0200, Richard Biener wrote:

gdb 7.9, python 2.7.6

attaching a fix; I do not know much Python so check it, please. OK for check-in?

Python Exception <type 'exceptions.NameError'> global name 'sys' is not defined:
Python Exception <type 'exceptions.NameError'> global name 'sys' is not defined:
Python Exception <type 'exceptions.NameError'> global name 'sys' is not defined:
Breakpoint 5, fold_binary_loc (loc=17953, code=LT_EXPR, type=, op0=, op1=) at /space/rguenther/tramp3d/trunk/gcc/fold-const.c:9862

Adding an import sys makes it work fine though. Thus, ok with also adding an import sys.

Thanks,
Richard.

I have found it reproducible by:

gdb -ex 'source /home/jkratoch/redhat/gcchead/gcc/gdbhooks.py' -ex 'b *0xec25b0' -ex r -ex 'set python print-stack full' -ex 'p *(stmt_vec_info)$r12' --args /home/jkratoch/redhat/gcchead-build/gcc/cc1plus ~/t/sigtest.C -o /dev/null -Wall -g -O3

0xec25b0=
Breakpoint 1, free_stmt_vec_info (stmt=gimple_debug 0x71635b80) at /home/jkratoch/redhat/gcchead/gcc/tree-vect-stmts.c:7754
7754 free (stmt_info);

Jan

gcc/
2015-06-03 Jan Kratochvil jan.kratoch...@redhat.com

    PR other/65366
    * gdbhooks.py (intptr): New function. Replace int(...) by intptr(...).

diff --git a/gcc/gdbhooks.py b/gcc/gdbhooks.py
index 20842bb..fe83376 100644
--- a/gcc/gdbhooks.py
+++ b/gcc/gdbhooks.py
@@ -149,6 +149,12 @@ tree_code_class_dict = gdb.types.make_enum_dict(gdb.lookup_type('enum tree_code_
 tcc_type = tree_code_class_dict['tcc_type']
 tcc_declaration = tree_code_class_dict['tcc_declaration']

+# Python3 has int() with arbitrary precision (bignum). Python2 int() is 32-bit
+# on 32-bit hosts but remote targets may have 64-bit pointers there; Python2
+# long() is always 64-bit but Python3 no longer has anything named long.
+def intptr(gdbval):
+    return long(gdbval) if sys.version_info.major == 2 else int(gdbval)
+
 class Tree:
     """
     Wrapper around a gdb.Value for a tree, with various methods
@@ -158,7 +164,7 @@ class Tree:
         self.gdbval = gdbval

     def is_nonnull(self):
-        return int(self.gdbval)
+        return intptr(self.gdbval)

     def TREE_CODE(self):
         """
@@ -197,7 +203,7 @@ class TreePrinter:
         # like gcc/print-tree.c:print_node_brief
         # #define TREE_CODE(NODE) ((enum tree_code) (NODE)->base.code)
         # tree_code_name[(int) TREE_CODE (node)])
-        if int(self.gdbval) == 0:
+        if intptr(self.gdbval) == 0:
             return '<tree 0x0>'

         val_TREE_CODE = self.node.TREE_CODE()
@@ -209,17 +215,17 @@ class TreePrinter:
         val_tclass = val_tree_code_type[val_TREE_CODE]

         val_tree_code_name = gdb.parse_and_eval('tree_code_name')
-        val_code_name = val_tree_code_name[int(val_TREE_CODE)]
+        val_code_name = val_tree_code_name[intptr(val_TREE_CODE)]
         #print(val_code_name.string())

-        result = '<%s 0x%x' % (val_code_name.string(), int(self.gdbval))
-        if int(val_tclass) == tcc_declaration:
+        result = '<%s 0x%x' % (val_code_name.string(), intptr(self.gdbval))
+        if intptr(val_tclass) == tcc_declaration:
             tree_DECL_NAME = self.node.DECL_NAME()
             if tree_DECL_NAME.is_nonnull():
                  result += ' %s' % tree_DECL_NAME.IDENTIFIER_POINTER()
             else:
                 pass # TODO: labels etc
-        elif int(val_tclass) == tcc_type:
+        elif intptr(val_tclass) == tcc_type:
             tree_TYPE_NAME = Tree(self.gdbval['type_common']['name'])
             if tree_TYPE_NAME.is_nonnull():
                 if tree_TYPE_NAME.TREE_CODE() == IDENTIFIER_NODE:
@@ -242,8 +248,8 @@ class CGraphNodePrinter:
         self.gdbval = gdbval

     def to_string (self):
-        result = '<cgraph_node* 0x%x' % int(self.gdbval)
-        if int(self.gdbval):
+        result = '<cgraph_node* 0x%x' % intptr(self.gdbval)
+        if intptr(self.gdbval):
             # symtab_node::name calls lang_hooks.decl_printable_name
             # default implementation (lhd_decl_printable_name) is:
             #    return IDENTIFIER_POINTER (DECL_NAME (decl));
@@ -261,12 +267,12 @@ class DWDieRefPrinter:
         self.gdbval = gdbval

     def to_string (self):
-        if int(self.gdbval) == 0:
+        if intptr(self.gdbval) == 0:
             return '<dw_die_ref 0x0>'
-        result = '<dw_die_ref 0x%x' % int(self.gdbval)
+        result = '<dw_die_ref 0x%x' % intptr(self.gdbval)
         result += ' %s' % self.gdbval['die_tag']
-        if int(self.gdbval['die_parent']) != 0:
-            result += ' parent=0x%x %s' % (int(self.gdbval['die_parent']),
+        if intptr(self.gdbval['die_parent']) != 0:
+            result += ' parent=0x%x %s' % (intptr(self.gdbval['die_parent']),
Re: debug-early branch merged into mainline
On Mon, Jun 8, 2015 at 3:23 AM, Aldy Hernandez al...@redhat.com wrote: On 06/07/2015 02:33 PM, Richard Biener wrote: On June 7, 2015 6:00:05 PM GMT+02:00, Aldy Hernandez al...@redhat.com wrote: On 06/07/2015 11:25 AM, Richard Biener wrote: On June 7, 2015 5:03:30 PM GMT+02:00, Aldy Hernandez al...@redhat.com wrote: On 06/06/2015 05:49 AM, Andreas Schwab wrote: Bootstrap fails on aarch64: Comparing stages 2 and 3 warning: gcc/cc1objplus-checksum.o differs warning: gcc/cc1obj-checksum.o differs warning: gcc/cc1plus-checksum.o differs warning: gcc/cc1-checksum.o differs Bootstrap comparison failure! gcc/ira-costs.o differs gcc/tree-sra.o differs gcc/tree-parloops.o differs gcc/tree-vect-data-refs.o differs gcc/java/jcf-io.o differs gcc/ipa-inline-analysis.o differs The bootstrap comparison failure on ppc64le, aarch64, and possibly others is due to the order of some sections being in a different order with and without debugging. Stage2 is being compiled with no debugging due to -gtoggle, and stage3 is being compiled with debugging. For ira-costs.o on ppc64le we have: -Disassembly of section .rodata._ZN10hash_tableI19cost_classes_hasher11xcallocatorE6expandEv.str1.8: +Disassembly of section .rodata._ZN10hash_tableI19cost_classes_hasher11xcallocatorE26find_empty_slot_for_expandEj.str1.8: ... -Disassembly of section .rodata._ZN10hash_tableI19cost_classes_hasher11xcallocatorE26find_empty_slot_for_expandEj.str1.8: +Disassembly of section .rodata._ZN10hash_tableI19cost_classes_hasher11xcallocatorE6expandEv.str1.8: There is no semantic difference between the objects, just the ordering. I assume it's the same problem for the rest of the objects and architectures. I will look into this, unless someone beats me to it, or has an idea right off the bat. Check whether the symbol table walkers are walking hash tables. I assume the above are emitted via the symbol removal handling for debug stuff? Ughh, indeed. 
These sections are being output from output_object_blocks, which traverses a hash table:

void
output_object_blocks (void)
{
  object_block_htab->traverse<void *, output_object_block_htab> (NULL);
}

Perhaps we should sort them by some deterministic field and then call output_object_block() on each member of the resulting list?

Yes, that would be the usual fix. Maybe sth has an UID already, is the 'object' a decl by chance?

The attached patch fixes the bootstrap failure on ppc64le, and theoretically the aarch64 problem as well, but I haven't checked. Tested on ppc64le linux by bootstrapping, and regtesting C/C++ against pre debug-early merge sources. Also tested by a full bootstrap and regtest on x86-64 Linux. OK for mainline?

Please use FOR_EACH_HASH_TABLE_ELEMENT to put elements on the vector instead of the htab traversal. The compare function looks like we will end up having many equal elements (and thus random ordering on hosts where qsort doesn't behave sanely here, like Solaris IIRC). Unless all sections are named (which it looks like) and we have only one object block per section name (which it looks like). Thus can you rewrite the compare function to just

  return strcmp (p1->sect->named.name, p2->sect->named.name);

? (maybe with an assert that SECTION_NAMED is set on both)

Ok with those changes.

Btw, for portability the compare function should be a total ordering, thus return 0 only iff p1 == p2, otherwise it won't fix the bug on hosts where qsort may change the order of equal comparing elements non-deterministically (IIRC Solaris).

Thanks,
Richard.

Aldy
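The fix discussed above (copy the hash table's elements into a vector, sort by section name, then emit in that order) can be sketched in standard C++ as below. This is not GCC code: `ObjectBlock` and `blocks_in_output_order` are hypothetical names, and `std::unordered_map` merely stands in for GCC's hash_table. The sketch assumes, as the review does, that every block has a unique section name, so that the comparison distinguishes every pair of distinct elements.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for an object block attached to a named section.
struct ObjectBlock {
  std::string section_name;
  int size;
};

// Collect the blocks, then order them by section name so that the emitted
// order no longer depends on hash-table iteration order.
std::vector<const ObjectBlock *> blocks_in_output_order(
    const std::unordered_map<std::string, ObjectBlock> &table) {
  std::vector<const ObjectBlock *> ordered;
  ordered.reserve(table.size());
  for (const auto &entry : table)
    ordered.push_back(&entry.second);

  // With unique names, no two distinct elements compare equal, so the
  // result is the same on every host and for every insertion order.
  std::sort(ordered.begin(), ordered.end(),
            [](const ObjectBlock *a, const ObjectBlock *b) {
              return a->section_name < b->section_name;
            });
  return ordered;
}
```

If names could repeat, a secondary key would be needed to keep the order deterministic, which is the total-ordering concern raised at the end of the review.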
Re: Fix LTO streaming of BUILTINS_LOCATION
On Mon, 8 Jun 2015, Jan Hubicka wrote:

+  bp_pack_value (bp, loc == BUILTINS_LOCATION, 1);
+  if (loc == BUILTINS_LOCATION)
+    return;

Hmm, with this and

  #define DECL_IS_BUILTIN(DECL) \
    (LOCATION_LOCUS (DECL_SOURCE_LOCATION (DECL)) <= BUILTINS_LOCATION)

shouldn't we rather stream all locations <= BUILTINS_LOCATION literally? That is, instead of two bits stream a [0, BUILTINS_LOCATION+1] 'enum' here? Btw, line-map.h has RESERVED_LOCATION_COUNT for the locations that are special (currently two, so your patch will work in practice).

Yep, I considered that. Because we have precisely two special locations (ATM) and UNKNOWN_LOCATION is quite common, that would waste one extra bit on that case. Probably not a big deal, so if you think streaming all locations up to BUILTINS_LOCATION literally is more robust, I will update the patch.

Yeah, I think streaming all locations up to RESERVED_LOCATION_COUNT literally is more robust. Thus do

  bp_pack_int_in_range (bp, 0, RESERVED_LOCATION_COUNT,
                        loc < RESERVED_LOCATION_COUNT
                        ? loc : RESERVED_LOCATION_COUNT);
  if (loc < RESERVED_LOCATION_COUNT)
    return;

and on unpacking use the special RESERVED_LOCATION_COUNT value to fall thru to unpacked location handling.

Richard.

   xloc = expand_location (loc);
Index: lto-streamer-in.c
===
--- lto-streamer-in.c (revision 224201)
+++ lto-streamer-in.c (working copy)
@@ -283,6 +283,11 @@
       *loc = UNKNOWN_LOCATION;
       return;
     }
+  if (bp_unpack_value (bp, 1))
+    {
+      *loc = BUILTINS_LOCATION;
+      return;
+    }
   *loc = BUILTINS_LOCATION + 1;

Btw, this assignment to *loc looks odd (I suppose it's to make location caching work).

*loc is set to UNKNOWN_LOCATION/BUILTINS_LOCATION for those locations that are not cached, and all others get BUILTINS_LOCATION + 1, which quite safely triggers an ICE in line_map lookup, though I do not recall why. I originally used UNKNOWN_LOCATION for cached values but that did not work as it confused DECL_IS_BUILTIN. We could extend the API by adding INVALID_LOCATION and setting it to INT_MAX or something that would also ICE.

Honza

Richard.

   file_change = bp_unpack_value (bp, 1);

--
Richard Biener rguent...@suse.de
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Dilip Upmanyu, Graham Norton, HRB 21284 (AG Nuernberg)
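The range-based alternative suggested in this reply can be sketched as follows: reserved codes are written literally in a small fixed-width field, and the value RESERVED_LOCATION_COUNT itself acts as a sentinel meaning "an ordinary location follows". Everything here is illustrative: `BitPack`, `bits_for_range` and the 32-bit payload are assumptions for the sketch, not the LTO streamer's API, and the constant assumes two reserved codes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed count of reserved codes (UNKNOWN_LOCATION and BUILTINS_LOCATION).
constexpr std::uint32_t kReservedLocationCount = 2;

// Toy bit stream (an assumption; GCC's bitpack_d works differently).
struct BitPack {
  std::vector<bool> bits;
  std::size_t read_pos = 0;

  void pack(std::uint32_t value, unsigned nbits) {
    for (unsigned i = 0; i < nbits; ++i)
      bits.push_back(((value >> i) & 1u) != 0);
  }

  std::uint32_t unpack(unsigned nbits) {
    std::uint32_t value = 0;
    for (unsigned i = 0; i < nbits; ++i) {
      const bool bit = bits.at(read_pos++);
      if (bit)
        value |= 1u << i;
    }
    return value;
  }
};

// Number of bits needed to represent every value in [0, max_value].
unsigned bits_for_range(std::uint32_t max_value) {
  unsigned nbits = 0;
  while (max_value != 0) {
    ++nbits;
    max_value >>= 1;
  }
  return nbits;
}

void pack_location(BitPack &bp, std::uint32_t loc) {
  const unsigned nbits = bits_for_range(kReservedLocationCount);
  // Reserved codes go out literally; anything else is tagged with the sentinel.
  bp.pack(loc < kReservedLocationCount ? loc : kReservedLocationCount, nbits);
  if (loc < kReservedLocationCount)
    return;
  bp.pack(loc, 32);  // placeholder for the real file/line/column stream
}

std::uint32_t unpack_location(BitPack &bp) {
  const unsigned nbits = bits_for_range(kReservedLocationCount);
  const std::uint32_t tag = bp.unpack(nbits);
  if (tag < kReservedLocationCount)
    return tag;            // a reserved code, streamed literally
  return bp.unpack(32);    // sentinel: fall through to the ordinary payload
}
```

With two reserved codes the tag needs two bits, so the unknown case costs one bit more than with the flag-bit scheme, which is the trade-off Honza mentions; in exchange, adding a third reserved code would only require changing the constant.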
Re: [PATCH] [AArch64] PR63870 Improve error messages for NEON single lane memory access intrinsics
Thanks for working on this! I'd been fiddling around with a patch with some similar elements to this, but many trials with union types, subregs, etc., all worsened the register allocation and led to more unnecessary shuffling / moves. The only real thing I tried which you don't do here was to introduce a set_dreg expander to clean up some of those macro definitions in arm_neon.h. That could easily follow in a separate patch if desired!

So your patch looks good to me. A couple of style nits:

--- a/gcc/config/aarch64/aarch64-builtins.c
+++ b/gcc/config/aarch64/aarch64-builtins.c
@@ -128,7 +128,9 @@ enum aarch64_type_qualifiers
   /* Polynomial types.  */
   qualifier_poly = 0x100,
   /* Lane indices - must be in range, and flipped for bigendian.  */
-  qualifier_lane_index = 0x200
+  qualifier_lane_index = 0x200,
+  /* Lane indices for single lane structure loads and stores */
+  qualifier_struct_load_store_lane_index = 0x400
 };

should be ...'loads and stores.  */'

Also, the dg-error messages in the testsuite do not need to be on the same line as the statement generating the error, because the trailing 0 tells dg that the position/line number doesn't matter (i.e. dg should allow the error to be reported at any line); so these could be brought under 80 chars.

Thanks, Alan

Charles Baylis wrote:

This is another attempt at fixing this PR63870 for AArch64 (ARM is still to come).

As before, the Q register variants are handled by moving the check for the lane bounds into builtin expansion. The handling of lane numbers is made consistent wrt endianness with other NEON single lane operations - lane numbers in RTL are flipped for big-endian, and flipped back at assembly time.

The D register variants are now handled by adding new builtins for all the 64bit operations. These behave identically to Q register variants, except that the permitted lane bounds are different. The iterators used by the relevant patterns are changed from VQ to VALLDIF so that the correct vector sizes are used in the endian-flip at assembly time.

Finally, a set of machine-generated test cases is added. These do need to be in separate files, because of testsuite limitations.

Regression tested on qemu for aarch64-linux-gnu with no regressions and all new tests pass. OK for trunk?

gcc/ChangeLog:

DATE Charles Baylis charles.bay...@linaro.org

    PR target/63870
    * config/aarch64/aarch64-builtins.c (enum aarch64_type_qualifiers): Add qualifier_struct_load_store_lane_index.
    (aarch64_types_loadstruct_lane_qualifiers): Use qualifier_struct_load_store_lane_index for lane index argument for last argument.
    (aarch64_types_storestruct_lane_qualifiers): Ditto.
    (builtin_simd_arg): Add SIMD_ARG_STRUCT_LOAD_STORE_LANE_INDEX.
    (aarch64_simd_expand_args): Add new argument describing mode of builtin. Check lane bounds for arguments with SIMD_ARG_STRUCT_LOAD_STORE_LANE_INDEX.
    (aarch64_simd_expand_builtin): Emit error for incorrect lane indices if marked with SIMD_ARG_STRUCT_LOAD_STORE_LANE_INDEX.
    (aarch64_simd_expand_builtin): Handle arguments with qualifier_struct_load_store_lane_index. Pass machine mode of builtin to aarch64_simd_expand_args.
    * config/aarch64/aarch64-simd-builtins.def: Declare ld[234]_lane and vst[234]_lane with BUILTIN_VALLDIF.
    * config/aarch64/aarch64-simd.md: (aarch64_vec_load_lanesoi_lane<mode>): Use VALLDIF iterator. Perform endianness reversal on lane index.
    (aarch64_vec_load_lanesci_lane<mode>): Ditto.
    (aarch64_vec_load_lanesxi_lane<mode>): Ditto.
    (vec_store_lanesoi_lane<mode>): Use VALLDIF iterator. Fix typo in attribute.
    (vec_store_lanesci_lane<mode>): Use VALLDIF iterator.
    (vec_store_lanesxi_lane<mode>): Ditto.
    (aarch64_ld2_lane<mode>): Use VALLDIF iterator. Remove endianness reversal of lane index.
    (aarch64_ld3_lane<mode>): Ditto.
    (aarch64_ld4_lane<mode>): Ditto.
    (aarch64_st2_lane<mode>): Ditto.
    (aarch64_st3_lane<mode>): Ditto.
    (aarch64_st4_lane<mode>): Ditto.
* config/aarch64/arm_neon.h (__LD2_LANE_FUNC): Rename mode parameter to qmode. Add new mode parameter. Update uses. (__LD3_LANE_FUNC): Ditto. (__LD4_LANE_FUNC): Ditto. (__ST2_LANE_FUNC): Ditto. (__ST3_LANE_FUNC): Ditto. (__ST4_LANE_FUNC): Ditto. DATE Charles Baylis charles.bay...@linaro.org * gcc.target/aarch64/simd/vld2_lane_f32_indices_1.c: New test. * gcc.target/aarch64/simd/vld2_lane_f64_indices_1.c: New test. * gcc.target/aarch64/simd/vld2_lane_p8_indices_1.c: New test. * gcc.target/aarch64/simd/vld2_lane_s16_indices_1.c: New test. * gcc.target/aarch64/simd/vld2_lane_s32_indices_1.c: New test. * gcc.target/aarch64/simd/vld2_lane_s64_indices_1.c: New test. * gcc.target/aarch64/simd/vld2_lane_s8_indices_1.c: New test. * gcc.target/aarch64/simd/vld2_lane_u16_indices_1.c: New test. * gcc.target/aarch64/simd/vld2_lane_u32_indices_1.c: New test. *
Re: [PING][PATCH][PR65443] Add transform_to_exit_first_loop_alt
On Mon, 8 Jun 2015, Tom de Vries wrote: On 04/06/15 10:28, Tom de Vries wrote: I'm ok with the patch and count on you to fix eventual fallout ;) Great, will do. And here is the fallout: * PR66442 - [6 regression] FAIL: gcc.dg/autopar/pr46885.c (test for excess errors) There are two problems in try_transform_to_exit_first_loop_alt: 1. In case the latch is not a singleton bb, the function should return false rather than true. 2. The check for singleton bb should ignore debug-insns. Attached patch fixes these problems. Bootstrapped and reg-tested on x86_64. Verified by Andreas to fix the problem on m68k. OK for trunk? Ok. Thanks, Richard.
Re: [PATCH] [AArch64] PR63870 Improve error messages for NEON single lane memory access intrinsics
Oh, have you tested bigendian?

--Alan

Charles Baylis wrote:

This is another attempt at fixing this PR63870 for AArch64 (ARM is still to come).

As before, the Q register variants are handled by moving the check for the lane bounds into builtin expansion. The handling of lane numbers is made consistent wrt endianness with other NEON single lane operations - lane numbers in RTL are flipped for big-endian, and flipped back at assembly time.

The D register variants are now handled by adding new builtins for all the 64bit operations. These behave identically to Q register variants, except that the permitted lane bounds are different. The iterators used by the relevant patterns are changed from VQ to VALLDIF so that the correct vector sizes are used in the endian-flip at assembly time.

Finally, a set of machine-generated test cases is added. These do need to be in separate files, because of testsuite limitations.

Regression tested on qemu for aarch64-linux-gnu with no regressions and all new tests pass. OK for trunk?

gcc/ChangeLog:

DATE Charles Baylis charles.bay...@linaro.org

    PR target/63870
    * config/aarch64/aarch64-builtins.c (enum aarch64_type_qualifiers): Add qualifier_struct_load_store_lane_index.
    (aarch64_types_loadstruct_lane_qualifiers): Use qualifier_struct_load_store_lane_index for lane index argument for last argument.
    (aarch64_types_storestruct_lane_qualifiers): Ditto.
    (builtin_simd_arg): Add SIMD_ARG_STRUCT_LOAD_STORE_LANE_INDEX.
    (aarch64_simd_expand_args): Add new argument describing mode of builtin. Check lane bounds for arguments with SIMD_ARG_STRUCT_LOAD_STORE_LANE_INDEX.
    (aarch64_simd_expand_builtin): Emit error for incorrect lane indices if marked with SIMD_ARG_STRUCT_LOAD_STORE_LANE_INDEX.
    (aarch64_simd_expand_builtin): Handle arguments with qualifier_struct_load_store_lane_index. Pass machine mode of builtin to aarch64_simd_expand_args.
    * config/aarch64/aarch64-simd-builtins.def: Declare ld[234]_lane and vst[234]_lane with BUILTIN_VALLDIF.
* config/aarch64/aarch64-simd.md: (aarch64_vec_load_lanesoi_lanemode): Use VALLDIF iterator. Perform endianness reversal on lane index. (aarch64_vec_load_lanesci_lanemode): Ditto. (aarch64_vec_load_lanesxi_lanemode): Ditto. (vec_store_lanesoi_lanemode): Use VALLDIF iterator. Fix typo in attribute. (vec_store_lanesci_lanemode): Use VALLDIF iterator. (vec_store_lanesxi_lanemode): Ditto. (aarch64_ld2_lanemode): Use VALLDIF iterator. Remove endianness reversal of lane index. (aarch64_ld3_lanemode): Ditto. (aarch64_ld4_lanemode): Ditto. (aarch64_st2_lanemode): Ditto. (aarch64_st3_lanemode): Ditto. (aarch64_st4_lanemode): Ditto. * config/aarch64/arm_neon.h (__LD2_LANE_FUNC): Rename mode parameter to qmode. Add new mode parameter. Update uses. (__LD3_LANE_FUNC): Ditto. (__LD4_LANE_FUNC): Ditto. (__ST2_LANE_FUNC): Ditto. (__ST3_LANE_FUNC): Ditto. (__ST4_LANE_FUNC): Ditto. DATE Charles Baylis charles.bay...@linaro.org * gcc.target/aarch64/simd/vld2_lane_f32_indices_1.c: New test. * gcc.target/aarch64/simd/vld2_lane_f64_indices_1.c: New test. * gcc.target/aarch64/simd/vld2_lane_p8_indices_1.c: New test. * gcc.target/aarch64/simd/vld2_lane_s16_indices_1.c: New test. * gcc.target/aarch64/simd/vld2_lane_s32_indices_1.c: New test. * gcc.target/aarch64/simd/vld2_lane_s64_indices_1.c: New test. * gcc.target/aarch64/simd/vld2_lane_s8_indices_1.c: New test. * gcc.target/aarch64/simd/vld2_lane_u16_indices_1.c: New test. * gcc.target/aarch64/simd/vld2_lane_u32_indices_1.c: New test. * gcc.target/aarch64/simd/vld2_lane_u64_indices_1.c: New test. * gcc.target/aarch64/simd/vld2_lane_u8_indices_1.c: New test. * gcc.target/aarch64/simd/vld2q_lane_f32_indices_1.c: New test. * gcc.target/aarch64/simd/vld2q_lane_f64_indices_1.c: New test. * gcc.target/aarch64/simd/vld2q_lane_p8_indices_1.c: New test. * gcc.target/aarch64/simd/vld2q_lane_s16_indices_1.c: New test. * gcc.target/aarch64/simd/vld2q_lane_s32_indices_1.c: New test. * gcc.target/aarch64/simd/vld2q_lane_s64_indices_1.c: New test. 
* gcc.target/aarch64/simd/vld2q_lane_s8_indices_1.c: New test. * gcc.target/aarch64/simd/vld2q_lane_u16_indices_1.c: New test. * gcc.target/aarch64/simd/vld2q_lane_u32_indices_1.c: New test. * gcc.target/aarch64/simd/vld2q_lane_u64_indices_1.c: New test. * gcc.target/aarch64/simd/vld2q_lane_u8_indices_1.c: New test. * gcc.target/aarch64/simd/vld3_lane_f32_indices_1.c: New test. * gcc.target/aarch64/simd/vld3_lane_f64_indices_1.c: New test. * gcc.target/aarch64/simd/vld3_lane_p8_indices_1.c: New test. * gcc.target/aarch64/simd/vld3_lane_s16_indices_1.c: New test. * gcc.target/aarch64/simd/vld3_lane_s32_indices_1.c: New test. * gcc.target/aarch64/simd/vld3_lane_s64_indices_1.c: New test. *
Re: [patch] fix _OBJC_Module defined but not used warning
On 06/08/2015 04:03 AM, Iain Sandoe wrote: Hi Aldy, On 7 Jun 2015, at 12:37, Aldy Hernandez wrote: On 06/07/2015 06:19 AM, Andreas Schwab wrote: Another fallout: FAIL: obj-c++.dg/try-catch-5.mm -fgnu-runtime (test for excess errors) Excess errors: built-in: warning: '_OBJC_Module' defined but not used [-Wunused-variable] check_global_declarations is called for more symbols now. All the defined but not used errors I've seen in development have been legitimate. For tests, the tests should be fixed. For built-ins such as these, does the attached fix the problem? It is up to the objc maintainers, we can either fix this with the attached patch, The current patch is OK. Committed. or setting DECL_IN_SYSTEM_HEADER. This seems a better long-term idea; however, I would prefer to go through all the cases where it would be applicable (including for the NeXT runtime) and apply that change as a coherent patch. At the moment dealing with the NeXT stuff is a bit hampered by pr66448. On my list next. Aldy
[PATCH] Disable -Wunused warning for artificial decls (PR sanitizer/66452)
This PR is about a bogus -Wunused-variable warning about an internal structure (*.Lubsan_data*), exposed by the debug-early merge. I think the "defined but not used" warning should not be allowed for compiler-generated entities.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2015-06-08 Marek Polacek pola...@redhat.com

    PR sanitizer/66452
    * toplev.c (check_global_declaration): Don't warn about artificial decls.

    * g++.dg/ubsan/pr66452.C: New test.

diff --git gcc/testsuite/g++.dg/ubsan/pr66452.C gcc/testsuite/g++.dg/ubsan/pr66452.C
index e69de29..473543c 100644
--- gcc/testsuite/g++.dg/ubsan/pr66452.C
+++ gcc/testsuite/g++.dg/ubsan/pr66452.C
@@ -0,0 +1,16 @@
+// PR sanitizer/66452
+// { dg-do compile }
+// { dg-options "-Wall -fsanitize=undefined" }
+
+class A {
+public:
+  A(int);
+};
+class B {
+  A m_fn1() const;
+};
+A B::m_fn1() const {
+  for (int i = 0;i;)
+    ;
+  return 0;
+}
diff --git gcc/toplev.c gcc/toplev.c
index fd2ac4e..eb8b8ac 100644
--- gcc/toplev.c
+++ gcc/toplev.c
@@ -534,6 +534,7 @@ check_global_declaration (tree decl)
          to handle multiple external decls in different scopes.  */
       && ! (DECL_NAME (decl) && TREE_USED (DECL_NAME (decl)))
       && ! DECL_EXTERNAL (decl)
+      && ! DECL_ARTIFICIAL (decl)
       && ! TREE_PUBLIC (decl)
       /* A volatile variable might be used in some non-obvious way.  */
       && ! TREE_THIS_VOLATILE (decl)

	Marek
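The shape of the condition the patch extends can be shown with a small stand-alone predicate. This is a simplification under stated assumptions: `Decl` and its boolean fields are hypothetical stand-ins for the tree predicates (TREE_USED, DECL_EXTERNAL, DECL_ARTIFICIAL, TREE_PUBLIC, TREE_THIS_VOLATILE), and the real check in toplev.c tests more conditions than are visible in the hunk.

```cpp
#include <cassert>

// Hypothetical flags standing in for the tree predicates in the patch.
struct Decl {
  bool used = false;         // referenced somewhere in the translation unit
  bool external = false;     // declared extern, defined elsewhere
  bool artificial = false;   // compiler-generated, e.g. sanitizer data
  bool is_public = false;    // visible outside the translation unit
  bool is_volatile = false;  // may be used in some non-obvious way
};

// Warn only about user-written, local, non-volatile entities that are
// never used; compiler-generated ones are exempt after the patch.
bool should_warn_unused(const Decl &d) {
  return !d.used && !d.external && !d.artificial && !d.is_public &&
         !d.is_volatile;
}
```

The user cannot do anything about an unused entity the compiler itself created, so exempting artificial decls removes a warning that could never be acted on.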
Re: [PATCH] Disable -Wunused warning for artificial decls (PR sanitizer/66452)
On Mon, 8 Jun 2015, Marek Polacek wrote:

This PR is about a bogus -Wunused-variable warning about an internal structure (*.Lubsan_data*), exposed by the debug-early merge. I think the "defined but not used" warning should not be allowed for compiler-generated entities.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

Ok.

Thanks,
Richard.

2015-06-08 Marek Polacek pola...@redhat.com

    PR sanitizer/66452
    * toplev.c (check_global_declaration): Don't warn about artificial decls.

    * g++.dg/ubsan/pr66452.C: New test.

diff --git gcc/testsuite/g++.dg/ubsan/pr66452.C gcc/testsuite/g++.dg/ubsan/pr66452.C
index e69de29..473543c 100644
--- gcc/testsuite/g++.dg/ubsan/pr66452.C
+++ gcc/testsuite/g++.dg/ubsan/pr66452.C
@@ -0,0 +1,16 @@
+// PR sanitizer/66452
+// { dg-do compile }
+// { dg-options "-Wall -fsanitize=undefined" }
+
+class A {
+public:
+  A(int);
+};
+class B {
+  A m_fn1() const;
+};
+A B::m_fn1() const {
+  for (int i = 0;i;)
+    ;
+  return 0;
+}
diff --git gcc/toplev.c gcc/toplev.c
index fd2ac4e..eb8b8ac 100644
--- gcc/toplev.c
+++ gcc/toplev.c
@@ -534,6 +534,7 @@ check_global_declaration (tree decl)
          to handle multiple external decls in different scopes.  */
       && ! (DECL_NAME (decl) && TREE_USED (DECL_NAME (decl)))
       && ! DECL_EXTERNAL (decl)
+      && ! DECL_ARTIFICIAL (decl)
       && ! TREE_PUBLIC (decl)
       /* A volatile variable might be used in some non-obvious way.  */
       && ! TREE_THIS_VOLATILE (decl)

	Marek

--
Richard Biener rguent...@suse.de
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Dilip Upmanyu, Graham Norton, HRB 21284 (AG Nuernberg)
[patch] libstdc++/66417 fix codecvt_utf16 bigendian
I forgot to swap the byte-order for codepoints that fit in a single UTF-16 character, fixed like so. Tested powerpc64-linux and powerpc64le-linux. Committing to trunk and gcc-5-branch. commit cc2dca496553bba0d09db49b98ccef0d728d9a36 Author: Jonathan Wakely jwak...@redhat.com Date: Mon Jun 8 11:38:04 2015 +0100 PR libstdc++/66417 * src/c++11/codecvt.cc (write_utf16_code_point): Use adjust_byte_order for single UTF-16 units. * testsuite/22_locale/codecvt/codecvt_utf16/66417.cc: New. diff --git a/libstdc++-v3/src/c++11/codecvt.cc b/libstdc++-v3/src/c++11/codecvt.cc index 83ee6e0..2a11ca3 100644 --- a/libstdc++-v3/src/c++11/codecvt.cc +++ b/libstdc++-v3/src/c++11/codecvt.cc @@ -319,7 +319,7 @@ namespace { if (to.size() 0) { - *to.next = codepoint; + *to.next = adjust_byte_order(codepoint, mode); ++to.next; return true; } diff --git a/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_utf16/66417.cc b/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_utf16/66417.cc new file mode 100644 index 000..f9e4291 --- /dev/null +++ b/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_utf16/66417.cc @@ -0,0 +1,76 @@ +// Copyright (C) 2015 Free Software Foundation, Inc. +// +// This file is part of the GNU ISO C++ Library. This library is free +// software; you can redistribute it and/or modify it under the +// terms of the GNU General Public License as published by the +// Free Software Foundation; either version 3, or (at your option) +// any later version. + +// This library is distributed in the hope that it will be useful, +// but WITHOUT ANY WARRANTY; without even the implied warranty of +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +// GNU General Public License for more details. + +// You should have received a copy of the GNU General Public License along +// with this library; see the file COPYING3. If not see +// http://www.gnu.org/licenses/. 
+ +// { dg-options -std=gnu++11 } + +#include codecvt +#include testsuite_hooks.h + +using namespace std; + +void +test01() +{ + constexpr auto mode = generate_header; + codecvt_utf16char32_t, 0x10, mode cvt; + mbstate_t state{}; + const char32_t* from = UABC; + const char32_t* from_next; + char to[100]; + char* to_next; + + cvt.out(state, from, from + 3, from_next, to, to + 100, to_next); + + VERIFY((unsigned char)to[0] == 0xfe); + VERIFY((unsigned char)to[1] == 0xff); + VERIFY(to[2] == 0x00); + VERIFY(to[3] == 0x41); + VERIFY(to[4] == 0x00); + VERIFY(to[5] == 0x42); + VERIFY(to[6] == 0x00); + VERIFY(to[7] == 0x43); +} + +void +test02() +{ + constexpr auto mode = codecvt_mode(generate_header|little_endian); + codecvt_utf16char32_t, 0x10, mode cvt; + mbstate_t state{}; + const char32_t* from = UABC; + const char32_t* from_next; + char to[100]; + char* to_next; + + cvt.out(state, from, from + 3, from_next, to, to + 100, to_next); + + VERIFY((unsigned char)to[0] == 0xff); + VERIFY((unsigned char)to[1] == 0xfe); + VERIFY(to[2] == 0x41); + VERIFY(to[3] == 0x00); + VERIFY(to[4] == 0x42); + VERIFY(to[5] == 0x00); + VERIFY(to[6] == 0x43); + VERIFY(to[7] == 0x00); +} + +int +main() +{ + test01(); + test02(); +}
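For readers following the byte-order fix, here is a minimal sketch of what "adjusting the byte order" of a single UTF-16 code unit means. It is not the libstdc++ implementation: Endian, put_utf16_unit and encode_bmp_utf16 are hypothetical names made up for illustration, and the sketch only handles code points that fit in one UTF-16 unit. The byte sequences it produces for "ABC" are the ones the new test expects.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the byte-order part of codecvt_mode.
enum class Endian { big, little };

// Append one UTF-16 code unit to OUT as two bytes in the requested order.
inline void put_utf16_unit(std::vector<unsigned char>& out, char16_t unit, Endian e)
{
  const unsigned char hi = static_cast<unsigned char>((unit >> 8) & 0xff);
  const unsigned char lo = static_cast<unsigned char>(unit & 0xff);
  if (e == Endian::big) {
    out.push_back(hi);
    out.push_back(lo);
  } else {
    out.push_back(lo);
    out.push_back(hi);
  }
}

// Encode a string of BMP code points (no surrogate pairs needed) as UTF-16
// bytes, preceded by a byte order mark.  The point of the sketch: the BOM and
// every single-unit code point go through the same byte-order adjustment.
inline std::vector<unsigned char> encode_bmp_utf16(const std::u32string& s, Endian e)
{
  std::vector<unsigned char> out;
  put_utf16_unit(out, 0xFEFF, e);
  for (char32_t c : s) {
    // Assumption of this sketch: only non-surrogate BMP code points.
    assert(c < 0x10000 && (c < 0xD800 || c > 0xDFFF));
    put_utf16_unit(out, static_cast<char16_t>(c), e);
  }
  return out;
}
```

Calling encode_bmp_utf16(U"ABC", Endian::big) yields fe ff 00 41 00 42 00 43, and Endian::little yields ff fe 41 00 42 00 43 00.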
Re: [PATCH, ARM] attribute target (thumb,arm) [4/6] respin (5th)
On 08/06/15 09:45, Christian Bruel wrote: Hi Ramana, Ok, I see. The patch looks ok to me modulo the typo nits I pointed out, but I think Ramana should have the final say here as he's already started reviewing it and it adds quite a lot of functionality. Thanks, Kyrill do you have other feedbacks for the remaining parts ? many thanks Christian This is OK, thanks. Ramana
Re: [PING][PATCH][PR65443] Add transform_to_exit_first_loop_alt
On 04/06/15 10:28, Tom de Vries wrote: I'm ok with the patch and count on you to fix eventual fallout ;) Great, will do. And here is the fallout: * PR66442 - [6 regression] FAIL: gcc.dg/autopar/pr46885.c (test for excess errors) There are two problems in try_transform_to_exit_first_loop_alt: 1. In case the latch is not a singleton bb, the function should return false rather than true. 2. The check for singleton bb should ignore debug-insns. Attached patch fixes these problems. Bootstrapped and reg-tested on x86_64. Verified by Andreas to fix the problem on m68k. OK for trunk? Thanks, - Tom Fix try_transform_to_exit_first_loop_alt 2015-06-06 Tom de Vries t...@codesourcery.com PR tree-optimization/66442 * gimple-iterator.h (gimple_seq_nondebug_singleton_p): Add function. * tree-parloops.c (try_transform_to_exit_first_loop_alt): Return false if the loop latch is not a singleton. Use gimple_seq_nondebug_singleton_p instead of gimple_seq_singleton_p. --- gcc/gimple-iterator.h | 29 + gcc/tree-parloops.c | 4 ++-- 2 files changed, 31 insertions(+), 2 deletions(-) diff --git a/gcc/gimple-iterator.h b/gcc/gimple-iterator.h index 87e943a..76fa456 100644 --- a/gcc/gimple-iterator.h +++ b/gcc/gimple-iterator.h @@ -345,4 +345,33 @@ gsi_seq (gimple_stmt_iterator i) return *i.seq; } +/* Determine whether SEQ is a nondebug singleton. */ + +static inline bool +gimple_seq_nondebug_singleton_p (gimple_seq seq) +{ + gimple_stmt_iterator gsi; + + /* Find a nondebug gimple. */ + gsi.ptr = gimple_seq_first (seq); + gsi.seq = seq; + gsi.bb = NULL; + while (!gsi_end_p (gsi) + is_gimple_debug (gsi_stmt (gsi))) +gsi_next (gsi); + + /* No nondebug gimple found, not a singleton. */ + if (gsi_end_p (gsi)) +return false; + + /* Find a next nondebug gimple. */ + gsi_next (gsi); + while (!gsi_end_p (gsi) + is_gimple_debug (gsi_stmt (gsi))) +gsi_next (gsi); + + /* Only a singleton if there's no next nondebug gimple. 
*/ + return gsi_end_p (gsi); +} + #endif /* GCC_GIMPLE_ITERATOR_H */ diff --git a/gcc/tree-parloops.c b/gcc/tree-parloops.c index 02f44eb..c4b83fe 100644 --- a/gcc/tree-parloops.c +++ b/gcc/tree-parloops.c @@ -1769,8 +1769,8 @@ try_transform_to_exit_first_loop_alt (struct loop *loop, tree nit) { /* Check whether the latch contains a single statement. */ - if (!gimple_seq_singleton_p (bb_seq (loop-latch))) -return true; + if (!gimple_seq_nondebug_singleton_p (bb_seq (loop-latch))) +return false; /* Check whether the latch contains the loop iv increment. */ edge back = single_succ_edge (loop-latch); -- 1.9.1
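As an illustration of the "nondebug singleton" check described above, here is a small standalone sketch on a plain vector rather than on GCC's gimple_seq. Stmt and nondebug_singleton_p are hypothetical names used only here; the real function walks a gimple_stmt_iterator, but the logic should be the same: skip debug statements and require exactly one remaining statement.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a gimple statement: we only track whether it is
// a debug statement.
struct Stmt {
  bool is_debug;
};

// True if SEQ contains exactly one non-debug statement.  Debug statements
// are ignored, so they cannot change the answer.
inline bool nondebug_singleton_p(const std::vector<Stmt>& seq)
{
  int nondebug = 0;
  for (const Stmt& s : seq) {
    if (s.is_debug)
      continue;
    if (++nondebug > 1)
      return false;  // a second non-debug statement: not a singleton
  }
  return nondebug == 1;  // an empty or debug-only sequence is not a singleton
}

// Usage: a latch holding one real statement plus debug binds still counts.
inline void nondebug_singleton_usage()
{
  assert(nondebug_singleton_p({{true}, {false}, {true}}));
  assert(!nondebug_singleton_p({{false}, {false}}));
}
```

This is why the answer no longer depends on whether the code was compiled with -g: debug statements in the latch are simply not counted.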
[PATCH] Fix PR66422
The following patch should fix the bogus array-bound warning caused by loop peeling which fails to split blocks after inserted unreachable calls (which is now fatal to optimization after removing the quadraticness in CFG cleanup to scan for noreturn calls). Bootstrap and regtest in progress on x86_64-unknown-linux-gnu. Richard. 2015-06-08 Richard Biener rguent...@suse.de PR tree-optimization/66422 * tree-ssa-loop-ivcanon.c (remove_exits_and_undefined_stmts): Split block after inserted gcc_unreachable. * gcc.dg/Warray-bounds-16.c: New testcase. Index: gcc/tree-ssa-loop-ivcanon.c === --- gcc/tree-ssa-loop-ivcanon.c (revision 224212) +++ gcc/tree-ssa-loop-ivcanon.c (working copy) @@ -520,9 +520,9 @@ remove_exits_and_undefined_stmts (struct gimple_stmt_iterator gsi = gsi_for_stmt (elt-stmt); gcall *stmt = gimple_build_call (builtin_decl_implicit (BUILT_IN_UNREACHABLE), 0); - gimple_set_location (stmt, gimple_location (elt-stmt)); gsi_insert_before (gsi, stmt, GSI_NEW_STMT); + split_block (gimple_bb (stmt), stmt); changed = true; if (dump_file (dump_flags TDF_DETAILS)) { Index: gcc/testsuite/gcc.dg/Warray-bounds-16.c === --- gcc/testsuite/gcc.dg/Warray-bounds-16.c (revision 0) +++ gcc/testsuite/gcc.dg/Warray-bounds-16.c (working copy) @@ -0,0 +1,40 @@ +/* { dg-do compile } */ +/* { dg-options -O3 -Warray-bounds } */ + +typedef struct foo { +unsigned char foo_size; +int buf[4]; +const char* bar; +} foo; + +const foo *get_foo(int index); + +static int foo_loop(const foo *myfoo) { +int i; +if (myfoo-foo_size 3) +return 0; +for (i = 0; i myfoo-foo_size; i++) { +if (myfoo-buf[i] != 1) /* { dg-bogus above array bounds } */ +return 0; +} + +return 1; +} + +static int run_foo(void) { +int i; +for (i = 0; i 1; i++) { +const foo *myfoo = get_foo(i); +if (foo_loop(myfoo)) +return 0; +} +return -1; +} + +typedef struct hack { +int (*func)(void); +} hack; + +hack myhack = { +.func = run_foo, +};
[PATCH] Yet another simple fix to enhance outer-loop vectorization.
Hi All, Here is a simple fix which allows duplication of outer loops to perform peeling for the number of iterations if the outer loop is marked with pragma omp simd. Bootstrap and regression testing did not show any new failures. Is it OK for trunk? ChangeLog: 2015-06-08 Yuri Rumyantsev ysrum...@gmail.com * tree-vect-loop-manip.c (rename_variables_in_bb): Add argument to allow renaming of PHI arguments on edges incoming from outer loop header, add corresponding check before start PHI iterator. (slpeel_tree_duplicate_loop_to_edge_cfg): Introduce new bool variable DUPLICATE_OUTER_LOOP and set it to true for outer loops with true force_vectorize. Set-up dominator for outer loop too. Pass DUPLICATE_OUTER_LOOP as argument to rename_variables_in_bb. (slpeel_can_duplicate_loop_p): Allow duplicate of outer loop if it was marked with force_vectorize and has restricted cfg. * tree-vect-loop.c (vect_analyze_loop_2): Prohibit alignment peeling for outer loops. gcc/testsuite/ChangeLog: * gcc.dg/vect/vect-outer-simd-2.c: New test. patch.1 Description: Binary data
RE: [Patch] [X86_64]: fix operand constraints in sse3_mwait
Hi Uros, Checked the patch in the following branches and trunk after bootstrapping and regression testing them individually. GCC 5 branch https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=224215 GCC 4.9 branch https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=224214 GCC 4.8 branch https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=224147 Trunk https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=224146 regards, Venkat. -Original Message- From: Kumar, Venkataramanan Sent: Thursday, June 04, 2015 8:44 PM To: Uros Bizjak (ubiz...@gmail.com); gcc-patches@gcc.gnu.org Subject: [Patch] [X86_64]: fix operand constraints in sse3_mwait Hi Uros, As discussed here https://gcc.gnu.org/ml/gcc/2015-06/msg00043.html I am going to install the following patch to trunk. GCC bootstrap and regressions tests passed. Regards, Venkat. diff --git a/gcc/ChangeLog b/gcc/ChangeLog index ab5c004..2fa6e96 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,7 @@ +2015-06-04 Venkataramanan Kumar Venkataramanan.kumar + + * config/i386/sse.md (sse3_mwait): Swap the operand constraints. + 2015-06-02 Alan Modra amo...@gmail.com * config/rs6000/vsx.md (vsx_extract_v4sf): Revert accidental diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index 21c6c6c..2685f06 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -13194,10 +13194,12 @@ (set_attr atom_sse_attr fence) (set_attr memory unknown)]) - +;; As per AMD and Intel ISA manuals, the first operand is extensions ;; +and it goes to %ecx. The second operand received is hints and it goes +;; to %eax. (define_insn sse3_mwait - [(unspec_volatile [(match_operand:SI 0 register_operand a) -(match_operand:SI 1 register_operand c)] + [(unspec_volatile [(match_operand:SI 0 register_operand c) +(match_operand:SI 1 register_operand a)] UNSPECV_MWAIT)] TARGET_SSE3 ;; 64bit version is mwait %rax,%rcx. But only lower 32bits are used.
Commit: RX: Do not promote vector types
Hi Guys, I am applying the patch below to the RX's handling of vector functions. The RX ABI specifies that small integer return values should always be promoted to 32-bit values, but the code that performs this promotion was also affecting vector types. This results in internal compiler errors when the promoted type does not match the original vector type. Cheers Nick gcc/ChangeLog 2015-06-08 Nick Clifton ni...@redhat.com * config/rx/rx.c (rx_function_value): Do not promote vector types. (rx_promote_function_mode): Likewise. * config/rx/rx.h (LIBCALL_VALUE): Likewise. Index: gcc/config/rx/rx.c === --- gcc/config/rx/rx.c (revision 224227) +++ gcc/config/rx/rx.c (working copy) @@ -1174,6 +1181,8 @@ if (GET_MODE_SIZE (mode) 0 GET_MODE_SIZE (mode) 4 ! COMPLEX_MODE_P (mode) + ! VECTOR_TYPE_P (ret_type) + ! VECTOR_MODE_P (mode) ) return gen_rtx_REG (SImode, FUNC_RETURN_REGNUM); @@ -1193,6 +1202,8 @@ if (for_return != 1 || GET_MODE_SIZE (mode) = 4 || COMPLEX_MODE_P (mode) + || VECTOR_MODE_P (mode) + || VECTOR_TYPE_P (type) || GET_MODE_SIZE (mode) 1) return mode; Index: gcc/config/rx/rx.h === --- gcc/config/rx/rx.h (revision 224227) +++ gcc/config/rx/rx.h (working copy) @@ -267,6 +267,7 @@ #define LIBCALL_VALUE(MODE)\ gen_rtx_REG (((GET_MODE_CLASS (MODE) != MODE_INT \ || COMPLEX_MODE_P (MODE) \ + || VECTOR_MODE_P (MODE) \ || GET_MODE_SIZE (MODE) = 4) \ ? (MODE)\ : SImode), \
Re: Fix more of C/fortran canonical type issues
On Mon, 8 Jun 2015, Joseph Myers wrote: On Mon, 8 Jun 2015, Richard Biener wrote: I'm not sure the C standard mandates compatibility between struct { int i; } and struct { unsigned i; } for purposes of TBAA. Joseph? I don't think they are necessarily compatible for TBAA. Ok, but as int and unsigned are reading either structs element via a pointer to int or a pointer to unsigned must be supported? (The C FE ensures this via alias-subsets and the get_alias_set langhook returning the same alias sets for int and unsigned) Richard. -- Richard Biener rguent...@suse.de SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Dilip Upmanyu, Graham Norton, HRB 21284 (AG Nuernberg)
Re: [commit#2] [patch#2] PR other/65366: Fix gdbhooks.py for GDB with Python3
On Mon, Jun 8, 2015 at 3:37 PM, Jan Kratochvil jan.kratoch...@redhat.com wrote: On Mon, 08 Jun 2015 09:46:59 +0200, Richard Biener wrote: adding an import sys makes it work fine though. I do not see the sys error with either FSF GDB HEAD or Fedora 22 GDB. I agree it probably should be there. Yeah, I suspect you have other auto-loads that eventually import sys (I suppose the different python modules are not isolated). Thus, ok with also adding an import sys. Done and checked in: r224223 Thanks. Jan
[PATCH] Fix PR66413
We fail to unshare exprs put into debug stmts during inlining, which creates bogus tree sharing. Bootstrap and regtest pending on x86_64-unknown-linux-gnu. Richard. 2015-06-08 Richard Biener rguent...@suse.de PR middle-end/66413 * tree-inline.c (insert_init_debug_bind): Unshare value. * gcc.dg/torture/pr66413.c: New testcase. Index: gcc/tree-inline.c === *** gcc/tree-inline.c (revision 224221) --- gcc/tree-inline.c (working copy) *** insert_init_debug_bind (copy_body_data * *** 3027,3033 base_stmt = gsi_stmt (gsi); } ! note = gimple_build_debug_bind (tracked_var, value, base_stmt); if (bb) { --- 3027,3033 base_stmt = gsi_stmt (gsi); } ! note = gimple_build_debug_bind (tracked_var, unshare_expr (value), base_stmt); if (bb) { Index: gcc/testsuite/gcc.dg/torture/pr66413.c === *** gcc/testsuite/gcc.dg/torture/pr66413.c (revision 0) --- gcc/testsuite/gcc.dg/torture/pr66413.c (working copy) *** *** 0 --- 1,61 + /* { dg-do compile } */ + /* { dg-additional-options -g } */ + + int a, b, c, d, i, j, q, *e, *h, *k, *r, **p = e; + const int *f, **n = f; + static int g; + + void + fn1 (int p1) + { + c = p1; + } + + static int * + fn2 (int *p1, const int *p2) + { + if (g) + n = p2; + *n = p2; + int o[245]; + fn1 (o != p2); + return p1; + } + + static int * + fn3 () + { + int s[54], *t = s[0], u = 0, v = 1; + h = v; + q = 1; + for (; q; q++) + { + int *w[] = { u }; + for (; v;) + return *p; + } + *r = *t + b = 0; + return *p; + } + + static int + fn4 (int *p1) + { + int *l[2], **m[7]; + for (; i 1; i++) + for (; j 1; j++) + m[i * 70] = l[0]; + k = fn3 (); + fn2 (0, p1); + if ((m[0] == 0) a) + for (;;) + ; + return 0; + } + + int + main () + { + fn4 (d); + return 0; + }
Re: [patch] Adjust hash-table.h and it's pre-requisite includes.
On Fri, Jun 5, 2015 at 5:24 PM, Andrew MacLeod amacl...@redhat.com wrote: There is a horrible morass of include dependencies between hash-map.h, mem-stats.h and hash-table.h. There are even includes in both directions (mem-stats.h and hash-map.h include each other, as do hash-map.h and hash-table.h.. blech). Some of those files need parts of the other file to compile, and the whole mess is quite awful. They also manage to include vec.h into their little party 3 times as well, and it also has some icky #ifdefs. So I spent some time sorting out the situation, and reduced it down to a straight dependency list, rooted by hash-table.h. There are no double direction includes, and no header is included more than once. Once sorted out, I moved the root of this tree into coretypes.h since pretty much every file requires everything in the dependency chain. This chain consists of statistics.h, ggc.h, vec.h, hashtab.h, inchash.h, mem-stats-traits.h, hash-map-traits.h, mem-stats.h, hash-map.h and hash-table.h. With hash-table.h at the root of the dependency list, I wondered how many files actually need just that. So I flattened a source tree such that coretypes.h included the other required include files, but each .c file included hash-table.h. Then I tried removing the includes. It turned out that virtually every file needs hash-table.h. Part of that is due to how tightly integrated with mem-stats.h it is (they still need each other), and that is used throughout the compiler. So I think it makes sense to put that in coretypes.h. I also noticed that hash-set.h is included in a lot of places as well. Wondering how much it was actually needed, I performed the same flattening exercise and found that only about 10% of the files in gcc core didn't need it to compile... the rest all needed it due to hash_set<sometype> being in a prototype parameter list or in a structure declared in a commonly used header file (function.h, gimple-walk.h, tree-core.h, tree.h, ...).
It would be a lot of work to remove this dependency (if it's even possible), so I added hash-set.h to coretypes.h as well. rtl.h needed hash-table.h added to the GENERATOR list, but not hash-set.. I guess the generators don't use it much :-) The only other thing of note is the change to vec.h. It had an ugly set of checks to see whether it was being used in a generator file, and if not whether GC was available, then included it if it wasn't or provided 3 prototypes if it wasn't supposed to be included. This allows it to compile when GC isn't available (those routines referencing the GC functions would never be referred to when GC isn't available). With my other changes, most of those checks weren't necessary. I also figured it was best to simply include those 3 prototypes for ggc_free, ggc_round_alloc_size, and ggc_realloc all the time. When there isn't ggc.h, things remain as they are now. If there is a ggc.h included, it will provide static confirmation that those prototypes are up-to-date and in sync with how ggc.h defines them. The first patch contains all of those changes. The second one is fully automated and removes all these headers from every other .c and .h file in the compiler. This also included changes to many of the gen*.c routines. I adjusted the #include list for all the *generated* .c files to also be up to date with this patchset as well as the previous one which moved wide-int and friends into coretypes.h. This bootstraps with all languages enabled on x86_64-unknown-linux-gnu with no new regressions. It also causes no failures for all the targets in config-list.mk. OK for trunk? Ok. Thanks, Richard. Andrew
Re: Fix more of C/fortran canonical type issues
On Mon, 8 Jun 2015, Richard Biener wrote: I'm not sure the C standard mandates compatibility between struct { int i; } and struct { unsigned i; } for purposes of TBAA. Joseph? I don't think they are necessarily compatible for TBAA. -- Joseph S. Myers jos...@codesourcery.com
Re: Fix more of C/fortran canonical type issues
On Mon, 8 Jun 2015, Richard Biener wrote: On Mon, 8 Jun 2015, Joseph Myers wrote: On Mon, 8 Jun 2015, Richard Biener wrote: I'm not sure the C standard mandates compatibility between struct { int i; } and struct { unsigned i; } for purposes of TBAA. Joseph? I don't think they are necessarily compatible for TBAA. Ok, but as int and unsigned are reading either structs element via a pointer to int or a pointer to unsigned must be supported? Yes. The questionable case would be taking an object of one of those structure types, casting a pointer to it to point to the other structure type and then dereferencing. -- Joseph S. Myers jos...@codesourcery.com
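To make the distinction concrete, here is a small sketch of the two cases, written in C++ although the discussion is about C; both languages allow an object to be accessed through the signed or unsigned variant of its type (C11 6.5p7 is the C rule). SI, SU and the helper functions are hypothetical names for illustration only.

```cpp
#include <cassert>

struct SI { int i; };
struct SU { unsigned i; };

// Supported case: read the int member through a pointer to unsigned.
// unsigned is the unsigned type corresponding to int, so this access
// is permitted by the aliasing rules.
inline unsigned read_member_as_unsigned(SI* s)
{
  unsigned* p = reinterpret_cast<unsigned*>(&s->i);
  return *p;
}

// Supported case in the other direction: read the unsigned member via int*.
inline int read_member_as_int(SU* s)
{
  int* p = reinterpret_cast<int*>(&s->i);
  return *p;
}

// The questionable case from the thread is deliberately NOT written as code:
// casting an SI* to SU* and dereferencing the result, i.e. accessing the
// object through the other structure type, which TBAA need not treat as
// compatible.

inline unsigned demo_si() { SI a{42}; return read_member_as_unsigned(&a); }
inline int demo_su() { SU b{7u}; return read_member_as_int(&b); }
```

So the element accesses must keep working, which matches int and unsigned sharing an alias set, while nothing here requires the two structure types themselves to be compatible.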
[gomp4] (NVPTX) thread barriers after OpenACC worker loops
Hi, This patch adds a thread barrier after worker loops for OpenACC, in accordance with OpenACC 2.0a section 2.7.3 (worker loops): "All workers will complete execution of their assigned iterations before any worker proceeds beyond the end of the loop.". (This is quite target-specific: work to alleviate that is still ongoing.) Barriers are special in that they should not be cloned or subject to excessive code motion: to that end, barriers placed after loops have their (outgoing) edge set to EDGE_ABNORMAL. That seems to suffice to keep the barriers in the right places. This passes libgomp testing when applied on gomp4 branch, and fixes the previously-broken worker-partn-5.c and worker-partn-6.c tests, on top of my previous patches: https://gcc.gnu.org/ml/gcc-patches/2015-05/msg02612.html https://gcc.gnu.org/ml/gcc-patches/2015-06/msg00307.html (ping!), but unfortunately (again, with the above patches) appears to interact badly with Cesar's patch for vector state propagation: https://gcc.gnu.org/ml/gcc-patches/2015-06/msg00371.html I haven't yet investigated why (I reverted that patch in my local series in order to test the attached patch). FYI, Julian ChangeLog gcc/ * omp-low.c (build_oacc_threadbarrier): New function. (oacc_loop_needs_threadbarrier_p): New function. (expand_omp_for_static_nochunk, expand_omp_for_static_chunk): Insert threadbarrier after worker loops. (find_omp_for_region_data): Rename to... (find_omp_for_region_gwv): This. Return mask, rather than modifying REGION structure. (build_omp_regions_1): Move modification of REGION structure to here, after calling above function with new name. (generate_oacc_broadcast): Use new build_oacc_threadbarrier function. (make_gimple_omp_edges): Make edges out of OpenACC worker loop exit block abnormal. * tree-ssa-alias.c (ref_maybe_used_by_call_p_1): Add BUILT_IN_GOACC_THREADBARRIER. libgomp/ * testsuite/libgomp.oacc-c-c++-common/worker-partn-5.c: Remove XFAIL.
* testsuite/libgomp.oacc-c-c++-common/worker-partn-6.c: Likewise.commit e46fbc68b7bc7e705417475fcfb8e203056b5a51 Author: Julian Brown jul...@codesourcery.com Date: Fri Jun 5 10:01:01 2015 -0700 Threadbarrier after worker and vector loops. diff --git a/gcc/omp-low.c b/gcc/omp-low.c index 55a2a12..45ff05a 100644 --- a/gcc/omp-low.c +++ b/gcc/omp-low.c @@ -3691,6 +3691,15 @@ build_omp_barrier (tree lhs) return g; } +/* Build a call to GOACC_threadbarrier. */ + +static gcall * +build_oacc_threadbarrier (void) +{ + tree fndecl = builtin_decl_explicit (BUILT_IN_GOACC_THREADBARRIER); + return gimple_build_call (fndecl, 0); +} + /* If a context was created for STMT when it was scanned, return it. */ static omp_context * @@ -7181,6 +7190,20 @@ expand_omp_for_generic (struct omp_region *region, } +/* True if a barrier is needed after a loop partitioned over + gangs/workers/vectors as specified by GWV_BITS. OpenACC semantics specify + that a (conceptual) barrier is needed after worker and vector-partitioned + loops, but not after gang-partitioned loops. Currently we are relying on + warp reconvergence to synchronise threads within a warp after vector loops, + so an explicit barrier is not helpful after those. */ + +static bool +oacc_loop_needs_threadbarrier_p (int gwv_bits) +{ + return (gwv_bits (MASK_GANG | MASK_WORKER)) == MASK_WORKER; +} + + /* A subroutine of expand_omp_for. Generate code for a parallel loop with static schedule and no specified chunk size. 
Given parameters: @@ -7523,7 +7546,11 @@ expand_omp_for_static_nochunk (struct omp_region *region, { t = gimple_omp_return_lhs (gsi_stmt (gsi)); if (gimple_omp_for_kind (fd-for_stmt) == GF_OMP_FOR_KIND_OACC_LOOP) - gcc_checking_assert (t == NULL_TREE); + { + gcc_checking_assert (t == NULL_TREE); + if (oacc_loop_needs_threadbarrier_p (region-gwv_this)) + gsi_insert_after (gsi, build_oacc_threadbarrier (), GSI_SAME_STMT); + } else gsi_insert_after (gsi, build_omp_barrier (t), GSI_SAME_STMT); } @@ -7956,7 +7983,11 @@ expand_omp_for_static_chunk (struct omp_region *region, { t = gimple_omp_return_lhs (gsi_stmt (gsi)); if (gimple_omp_for_kind (fd-for_stmt) == GF_OMP_FOR_KIND_OACC_LOOP) - gcc_checking_assert (t == NULL_TREE); +{ + gcc_checking_assert (t == NULL_TREE); + if (oacc_loop_needs_threadbarrier_p (region-gwv_this)) + gsi_insert_after (gsi, build_oacc_threadbarrier (), GSI_SAME_STMT); + } else gsi_insert_after (gsi, build_omp_barrier (t), GSI_SAME_STMT); } @@ -10270,22 +10301,26 @@ expand_omp (struct omp_region *region) /* Map each basic block to an omp_region. */ static hash_mapbasic_block, omp_region * *bb_region_map; -/* Fill in additional data for a region REGION associated with an +/* Return a mask of GWV bits for region REGION associated with an OMP_FOR STMT. */ -static void -find_omp_for_region_data
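The predicate in oacc_loop_needs_threadbarrier_p appears to have lost its "&" in the archive. A minimal sketch of the intended test, assuming the obvious reconstruction and with made-up bit values (the real MASK_* values in omp-low.c may differ), would be:

```cpp
#include <cassert>

// Hypothetical bit values; only their distinctness matters for the sketch.
enum : int { MASK_GANG = 1, MASK_WORKER = 2, MASK_VECTOR = 4 };

// A barrier is wanted when the loop is partitioned over workers but not over
// gangs.  The vector bit is ignored by this test.
constexpr bool needs_threadbarrier(int gwv_bits)
{
  return (gwv_bits & (MASK_GANG | MASK_WORKER)) == MASK_WORKER;
}

// Usage: worker and worker+vector loops get a barrier, gang+worker does not.
inline void needs_threadbarrier_usage()
{
  assert(needs_threadbarrier(MASK_WORKER));
  assert(needs_threadbarrier(MASK_WORKER | MASK_VECTOR));
  assert(!needs_threadbarrier(MASK_GANG | MASK_WORKER));
}
```

Masking with both bits and comparing against MASK_WORKER alone is what excludes the combined gang+worker case in a single comparison.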
[PATCH, PR66444] Handle -fipa-ra in reload_combine
Hi, this patch fixes PR66444, a problem with -fipa-ra in reload_combine. The problem is that for the test-case, reload_combine combines these two insns: ... (insn 13 12 14 2 (parallel [ (set (reg/v/f:DI 37 r8 [orig:92 xD.1858 ] [92]) (plus:DI (reg:DI 37 r8 [96]) (reg:DI 0 ax [orig:95 D.1884 ] [95]))) (clobber (reg:CC 17 flags)) ]) (expr_list:REG_EQUAL (plus:DI (reg:DI 0 ax [orig:95 D.1884 ] [95]) (const_int 962072674304 [0xe0])) (nil))) (insn 14 13 15 2 (set (reg:DI 5 di) (reg/v/f:DI 37 r8 [orig:92 xD.1858 ] [92])) (nil)) ... into this insn: ... (insn 14 12 15 2 (set (reg:DI 5 di) (plus:DI (reg:DI 37 r8 [96]) (reg:DI 0 ax [orig:95 D.1884 ] [95]))) (nil)) ... That removes the set of r8 by insn 13. And that set of r8 is used by insn 16: (call_insn 15 ...) (insn 16 15 17 2 (set (reg:DI 5 di) (reg/v/f:DI 37 r8 [orig:92 x ] [92])) test.c:33 85 {*movdi_internal} (nil)) ... But reload_combine doesn't acknowledge that use, because it considers that r8 is killed by call_insn 15. The patch fixes the problem by using get_call_reg_set_usage to find out that r8 is actually not killed by the call_insn. Bootstrapped and reg-tested on x86_64 on top of trunk. OK for trunk and gcc-5-branch? Thanks, - Tom Handle -fipa-ra in reload_combine 2015-06-08 Tom de Vries t...@codesourcery.com PR rtl-optimization/66444 * postreload.c (reload_combine): Use get_call_reg_set_usage instead of call_used_regs. * gcc.dg/pr66444.c: New test. 
--- gcc/postreload.c | 5 +++- gcc/testsuite/gcc.dg/pr66444.c | 58 ++ 2 files changed, 62 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/gcc.dg/pr66444.c diff --git a/gcc/postreload.c b/gcc/postreload.c index 7ecca15..1cc7b14 100644 --- a/gcc/postreload.c +++ b/gcc/postreload.c @@ -1352,9 +1352,12 @@ reload_combine (void) if (CALL_P (insn)) { rtx link; + HARD_REG_SET used_regs; + + get_call_reg_set_usage (insn, used_regs, call_used_reg_set); for (r = 0; r FIRST_PSEUDO_REGISTER; r++) - if (call_used_regs[r]) + if (TEST_HARD_REG_BIT (used_regs, r)) { reg_state[r].use_index = RELOAD_COMBINE_MAX_USES; reg_state[r].store_ruid = reload_combine_ruid; diff --git a/gcc/testsuite/gcc.dg/pr66444.c b/gcc/testsuite/gcc.dg/pr66444.c new file mode 100644 index 000..93a7644 --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr66444.c @@ -0,0 +1,58 @@ +/* { dg-do run } */ +/* { dg-options -O2 -fipa-ra } */ + +extern void abort (void); + +int __attribute__((noinline, noclone)) +bar (void) +{ + return 1; +} + +struct S +{ + unsigned long p, q, r; + void *v; +}; + +struct S *s1; +struct S *s2; + +void __attribute__((noinline, noclone)) +fn2 (struct S *x) +{ + s2 = x; +} + +__attribute__((noinline, noclone)) void * +fn1 (struct S *x) +{ + /* Just a statement to make it a non-const function. */ + s1 = x; + + return (void *)0; +} + +int __attribute__((noinline, noclone)) +baz (void) +{ + struct S *x = (struct S *) 0xe000U; + + x += bar (); + + fn1 (x); + fn2 (x); + + return 0; +} + +int +main (void) +{ + baz (); + + if (s2 != (((struct S *) 0xe000U) + 1)) +abort (); + + return 0; +} -- 1.9.1
[PING^2][PATCH][3/3][PR65460] Mark offloaded functions as parallelized
On 17/04/15 12:08, Tom de Vries wrote: On 20-03-15 12:38, Tom de Vries wrote: On 19-03-15 12:05, Tom de Vries wrote: On 18-03-15 18:22, Tom de Vries wrote: Hi, this patch fixes PR65460. The patch marks offloaded functions as parallelized, which means the parloops pass no longer attempts to modify that function. Updated patch to postpone mark_parallelized_function until the corresponding cgraph_node is available, to ensure it works with the updated mark_parallelized_function from patch 2/3. Updated to eliminate mark_parallelized_function. Bootstrapped and reg-tested on x86_64. OK for stage4? ping. ping^2. Original post at https://gcc.gnu.org/ml/gcc-patches/2015-03/msg01063.html . OK for stage1? Thanks, - Tom
Re: [patch] Adjust gcc-plugin.h
On Mon, Jun 8, 2015 at 2:07 PM, Andrew MacLeod amacl...@redhat.com wrote: During the original flattening process I decided to use gcc-plugin.h as the kitchen sink for all includes that plugins might need. I think this has worked well for plugins, drastically reducing their dependency on gcc internal header file structure. What I didn't realize was that gcc's internal header (plugin.h) also includes gcc-plugin.h. This means that every file which may need to do something for plugins ends up indirectly including the gcc world again :-P Easy fix. (ha). This patch leaves all the #includes in gcc-plugin.h making the change transparent to plugins. All the remaining declarations and such are moved into a new file gcc-plugin-common.h. Both gcc-plugin.h and gcc's internal header plugin.h now include this common file. The effect is that gcc's source files no longer get anything but the required plugin info. Great.. Except there were a few files which were apparently picking up some required headers from gcc-plugin.h :-P This patch also adds the required headers to those source files. Compiles on x86_64-unknown-linux-gnu with no new regressions. Also compiles across all targets in config-list.mk. OK for trunk? Err - why not simply remove the gcc-plugin.h include from plugin.h and instead include plugin.h from gcc-plugin.h? Richard. Andrew
Re: Fix more of C/fortran canonical type issues
On Mon, 8 Jun 2015, Jan Hubicka wrote: Hi, this is a variant of patch that globs also the rest of integer types. Note that we will still get false warnings out of lto-symtab when the values are not wrapped up in structures. This is because lto-symtab uses types_compatible_p that in turn uses useless_type_conversion and that one needs to honor signedness. I suppose we need a way to test representation compatibility and TBAA compatibility. I will give more thought to how to reorganize the code. Basically we need representation compatibility is TYPE_CANONICAL equivalence, TBAA compatibility is get_alias_set equivalence. So you have to be careful when mangling TYPE_CANONICAL according to get_alias_set and make sure to only apply this (signedness for example) for aggregate type components. Can you please split out the string-flag change? It is approved. I'm not sure the C standard mandates compatibility between struct { int i; } and struct { unsigned i; } for purposes of TBAA. Joseph? Thanks, Richard. - way to decide if two types have compatible memory representation (to test in lto-symtab and for some cases in ipa-icf (constructors/pure moves)) operands_equal_p/compare_constant/ipa-icf::sem_variable all implement a bit of this. compare_constant seems to be most complete. - way to decide if two types match by TBAA oracle (to test in lto-symtab merging and for ipa-icf memory operations) - way to decide if one type is semantically compatible to other (useless_type_conversion_p) - way to decide if two types are same for canonical type computation (gimple_type_compatible_p). This may be sensitive to the set of languages we are merging and enable/disable various globbing as required. It may make sense to refactor the type walkers and get this more organized. But before playing with this I think we want to get something conservatively correct according to language standards and get a reasonable body of testcases.
This is a variant of patch that removes TYPE_UNSIGNED testing completely. I am fine with both variants. Bootstrapped/regtested ppc64le-linux.

Honza

    * gimple-expr.c (useless_type_conversion_p): Move INTEGER_TYPE handling ahead.
    * tree.c (gimple_canonical_types_compatible_p): Do not compare TYPE_UNSIGNED for size_t and char compatible types; do not hash STRING_FLAG on integer types.
    * lto.c (hash_canonical_type): Do not hash TYPE_UNSIGNED for size_t and char compatible types; do not hash STRING_FLAG on integer types.
    * gfortran.dg/lto/bind_c-2_0.f90: New testcase.
    * gfortran.dg/lto/bind_c-2_1.c: New testcase.
    * gfortran.dg/lto/bind_c-3_0.f90: New testcase.
    * gfortran.dg/lto/bind_c-3_1.c: New testcase.
    * gfortran.dg/lto/bind_c-4_0.f90: New testcase.
    * gfortran.dg/lto/bind_c-4_1.c: New testcase.

Index: gimple-expr.c
===
--- gimple-expr.c (revision 224201)
+++ gimple-expr.c (working copy)
@@ -91,30 +91,14 @@
           || TREE_CODE (TREE_TYPE (inner_type)) == METHOD_TYPE))
         return false;
     }
-
-  /* From now on qualifiers on value types do not matter.  */
-  inner_type = TYPE_MAIN_VARIANT (inner_type);
-  outer_type = TYPE_MAIN_VARIANT (outer_type);
-
-  if (inner_type == outer_type)
-    return true;
-
-  /* If we know the canonical types, compare them.  */
-  if (TYPE_CANONICAL (inner_type)
-      && TYPE_CANONICAL (inner_type) == TYPE_CANONICAL (outer_type))
-    return true;
-
-  /* Changes in machine mode are never useless conversions unless we
-     deal with aggregate types in which case we defer to later checks.  */
-  if (TYPE_MODE (inner_type) != TYPE_MODE (outer_type)
-      && !AGGREGATE_TYPE_P (inner_type))
-    return false;
-
   /* If both the inner and outer types are integral types, then the
      conversion is not necessary if they have the same mode and
-     signedness and precision, and both or neither are boolean.  */
-  if (INTEGRAL_TYPE_P (inner_type)
-      && INTEGRAL_TYPE_P (outer_type))
+     signedness and precision, and both or neither are boolean.
+
+     Do not rely on TYPE_CANONICAL here because LTO puts same canonical
+     type for signed char and unsigned char.  */
+  else if (INTEGRAL_TYPE_P (inner_type)
+           && INTEGRAL_TYPE_P (outer_type))
     {
       /* Preserve changes in signedness or precision.  */
       if (TYPE_UNSIGNED (inner_type) != TYPE_UNSIGNED (outer_type)
@@ -134,6 +118,25 @@
       return true;
     }
 
+
+  /* From now on qualifiers on value types do not matter.  */
+  inner_type = TYPE_MAIN_VARIANT (inner_type);
+  outer_type = TYPE_MAIN_VARIANT (outer_type);
+
+  if (inner_type == outer_type)
+    return true;
+
+  /* If we know the canonical types, compare them.  */
+  if (TYPE_CANONICAL (inner_type)
+      && TYPE_CANONICAL
[patch] Adjust gcc-plugin.h
During the original flattening process I decided to use gcc-plugin.h as the kitchen sink for all includes that plugins might need. I think this has worked well for plugins, drastically reducing their dependency on gcc internal header file structure.

What I didn't realize was that gcc's internal header (plugin.h) also includes gcc-plugin.h. This means that every file which may need to do something for plugins ends up indirectly including the gcc world again :-P

Easy fix. (ha). This patch leaves all the #includes in gcc-plugin.h, making the change transparent to plugins. All the remaining declarations and such are moved into a new file gcc-plugin-common.h. Both gcc-plugin.h and gcc's internal header plugin.h now include this common file. The effect is that gcc's source files no longer get anything but the required plugin info.

Great.. Except there were a few files which were apparently picking up some required headers from gcc-plugin.h :-P This patch also adds the required headers to those source files.

Compiles on x86_64-unknown-linux-gnu with no new regressions. Also compiles across all targets in config-list.mk. OK for trunk?

Andrew

    * gcc-plugin-common.h: New. Relocate decls from gcc-plugin.h.
    * gcc-plugin.h: Move decls to gcc-plugin-common.h.
    * plugin.h: Include gcc-plugin-common.h rather than gcc-plugin.h.
    * ggc-page.c: Include required header files.
    * passes.c: Likewise.
    * cgraphunit.c: Likewise.

Index: gcc-plugin-common.h
===
*** gcc-plugin-common.h (revision 0)
--- gcc-plugin-common.h (working copy)
***
*** 0
--- 1,158
+ /* Header file containing common declarations for gcc internal use and plugins.
+    Copyright (C) 2009-2015 Free Software Foundation, Inc.
+
+ This file is part of GCC.
+
+ GCC is free software; you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation; either version 3, or (at your option)
+ any later version.
+
+ GCC is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with GCC; see the file COPYING3.  If not see
+ <http://www.gnu.org/licenses/>.  */
+
+ #ifndef GCC_PLUGIN_COMMON_H
+ #define GCC_PLUGIN_COMMON_H
+
+ #include "highlev-plugin-common.h"
+
+ /* Event names.  */
+ enum plugin_event
+ {
+ # define DEFEVENT(NAME) NAME,
+ # include "plugin.def"
+ # undef DEFEVENT
+   PLUGIN_EVENT_FIRST_DYNAMIC
+ };
+
+ /* All globals declared here have C linkage to reduce link compatibility
+    issues with implementation language choice and mangling.  */
+ #ifdef __cplusplus
+ extern "C" {
+ #endif
+
+ extern const char **plugin_event_name;
+
+ struct plugin_argument
+ {
+   char *key;    /* key of the argument.  */
+   char *value;  /* value is optional and can be NULL.  */
+ };
+
+ /* Additional information about the plugin. Used by --help and --version. */
+
+ struct plugin_info
+ {
+   const char *version;
+   const char *help;
+ };
+
+ /* Represents the gcc version. Used to avoid using an incompatible plugin. */
+
+ struct plugin_gcc_version
+ {
+   const char *basever;
+   const char *datestamp;
+   const char *devphase;
+   const char *revision;
+   const char *configuration_arguments;
+ };
+
+ /* Object that keeps track of the plugin name and its arguments. */
+ struct plugin_name_args
+ {
+   char *base_name;              /* Short name of the plugin (filename without
+                                    .so suffix). */
+   const char *full_name;        /* Path to the plugin as specified with
+                                    -fplugin=. */
+   int argc;                     /* Number of arguments specified with
+                                    -fplugin-arg-... */
+   struct plugin_argument *argv; /* Array of ARGC key-value pairs. */
+   const char *version;          /* Version string provided by plugin. */
+   const char *help;             /* Help string provided by plugin. */
+ };
+
+ /* The default version check. Compares every field in VERSION. */
+
+ extern bool plugin_default_version_check (struct plugin_gcc_version *,
+                                           struct plugin_gcc_version *);
+
+ /* Function type for the plugin initialization routine. Each plugin module
+    should define this as an externally-visible function with name
+    plugin_init.
+
+    PLUGIN_INFO - plugin invocation information.
+    VERSION - the plugin_gcc_version symbol of GCC.
+
+    Returns 0 if initialization finishes successfully.  */
+
+ typedef int (*plugin_init_func) (struct plugin_name_args *plugin_info,
+                                  struct plugin_gcc_version *version);
+
+ /* Declaration for plugin_init function so that it doesn't need to
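For readers following the header excerpt: the comment on plugin_default_version_check says it compares every field of the version struct. A minimal sketch of what such a check could look like is below. It assumes plain field-by-field string comparison; the names plugin_gcc_version_sketch and version_check_sketch are hypothetical local mirrors so the snippet builds without GCC's plugin headers, and it is not taken from the patch.

```cpp
#include <cassert>
#include <cstring>

// Local mirror of the version struct shown in the header excerpt
// (hypothetical name, so this compiles without gcc-plugin.h).
struct plugin_gcc_version_sketch
{
  const char *basever;
  const char *datestamp;
  const char *devphase;
  const char *revision;
  const char *configuration_arguments;
};

// Hypothetical stand-in for a default version check: returns true only
// when every field of the two versions compares equal as a string.
static bool
version_check_sketch (const plugin_gcc_version_sketch *gcc_version,
                      const plugin_gcc_version_sketch *plugin_version)
{
  if (!gcc_version || !plugin_version)
    return false;
  return std::strcmp (gcc_version->basever, plugin_version->basever) == 0
         && std::strcmp (gcc_version->datestamp,
                         plugin_version->datestamp) == 0
         && std::strcmp (gcc_version->devphase,
                         plugin_version->devphase) == 0
         && std::strcmp (gcc_version->revision,
                         plugin_version->revision) == 0
         && std::strcmp (gcc_version->configuration_arguments,
                         plugin_version->configuration_arguments) == 0;
}
```

A plugin's plugin_init would typically run a check of this kind first and return nonzero on mismatch.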
Re: debug-early branch merged into mainline
On Mon, Jun 8, 2015 at 2:05 PM, Aldy Hernandez al...@redhat.com wrote: On 06/08/2015 04:26 AM, Richard Biener wrote: On Mon, Jun 8, 2015 at 3:23 AM, Aldy Hernandez al...@redhat.com wrote: On 06/07/2015 02:33 PM, Richard Biener wrote: On June 7, 2015 6:00:05 PM GMT+02:00, Aldy Hernandez al...@redhat.com wrote: On 06/07/2015 11:25 AM, Richard Biener wrote: On June 7, 2015 5:03:30 PM GMT+02:00, Aldy Hernandez al...@redhat.com wrote: On 06/06/2015 05:49 AM, Andreas Schwab wrote: Bootstrap fails on aarch64: Comparing stages 2 and 3 warning: gcc/cc1objplus-checksum.o differs warning: gcc/cc1obj-checksum.o differs warning: gcc/cc1plus-checksum.o differs warning: gcc/cc1-checksum.o differs Bootstrap comparison failure! gcc/ira-costs.o differs gcc/tree-sra.o differs gcc/tree-parloops.o differs gcc/tree-vect-data-refs.o differs gcc/java/jcf-io.o differs gcc/ipa-inline-analysis.o differs The bootstrap comparison failure on ppc64le, aarch64, and possibly others is due to the order of some sections being in a different order with and without debugging. Stage2 is being compiled with no debugging due to -gtoggle, and stage3 is being compiled with debugging. For ira-costs.o on ppc64le we have: -Disassembly of section .rodata._ZN10hash_tableI19cost_classes_hasher11xcallocatorE6expandEv.str1.8: +Disassembly of section .rodata._ZN10hash_tableI19cost_classes_hasher11xcallocatorE26find_empty_slot_for_expandEj.str1.8: ... -Disassembly of section .rodata._ZN10hash_tableI19cost_classes_hasher11xcallocatorE26find_empty_slot_for_expandEj.str1.8: +Disassembly of section .rodata._ZN10hash_tableI19cost_classes_hasher11xcallocatorE6expandEv.str1.8: There is no semantic difference between the objects, just the ordering. I assume it's the same problem for the rest of the objects and architectures. I will look into this, unless someone beats me to it, or has an idea right off the bat. Check whether the symbol table walkers are walking hash tables. 
I assume the above are emitted via the symbol removal handling for debug stuff?

Ughh, indeed. These sections are being output from output_object_blocks which traverses a hash table:

void
output_object_blocks (void)
{
  object_block_htab->traverse<void *, output_object_block_htab> (NULL);
}

Perhaps we should sort them by some deterministic field and then call output_object_block() on each member of the resulting list?

Yes, that would be the usual fix. Maybe sth has an UID already, is the 'object' a decl by chance?

The attached patch fixes the bootstrap failure on ppc64le, and theoretically the aarch64 problem as well, but I haven't checked. Tested on ppc64le linux by bootstrapping, and regtesting C/C++ against pre debug-early merge sources. Also tested by a full bootstrap and regtest on x86-64 Linux. OK for mainline?

Please use FOR_EACH_HASH_TABLE_ELEMENT to put elements on the vector instead of the htab traversal. The compare function looks like we will end up having many equal elements (and thus random ordering on hosts where qsort doesn't behave sanely here, like Solaris IIRC). Unless all sections are named (which it looks like)

Some sections are not named. How about we sort the named sections and output them, but call output_object_block() on the rest of the sections in whatever order they were in? This solves the bootstrap problem as well. Attached patch tested on x86-64 and ppc64le Linux. OK?

No, but hash_section suggests to sort after sect->common.flags if the section is not named. Conveniently flags is just an 'int' ... Can you adjust again?

Thanks,
Richard.

Aldy
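To make the suggestion in this exchange concrete, here is a small standalone sketch of the idea: copy the hash-table elements into a vector, then sort with a comparison that uses the section name for named sections and the flags for unnamed ones. The type block_sketch and both function names are hypothetical simplifications, not GCC's object_block or section types, and std::unordered_set merely stands in for the hash table.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical, simplified stand-in for an object block and its section.
struct block_sketch
{
  bool named;          // plays the role of the SECTION_NAMED flag test
  std::string name;    // only meaningful when NAMED is true
  unsigned int flags;  // plays the role of sect->common.flags
};

// Strict weak ordering: named blocks first, ordered by name; unnamed
// blocks afterwards, ordered by flags.
static bool
block_before (const block_sketch *a, const block_sketch *b)
{
  if (a->named != b->named)
    return a->named;
  if (a->named)
    return a->name < b->name;
  return a->flags < b->flags;
}

// Copy the elements of the unordered table into a vector and sort it, so
// the emission order no longer depends on hash-table layout.
static std::vector<const block_sketch *>
deterministic_order (const std::unordered_set<const block_sketch *> &table)
{
  std::vector<const block_sketch *> v (table.begin (), table.end ());
  std::sort (v.begin (), v.end (), block_before);
  return v;
}
```

Elements that compare equal (for instance two unnamed blocks with identical flags) are still ordered arbitrarily by std::sort, which is the same concern raised above about qsort and equal elements.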
Re: Fix more of C/fortran canonical type issues
On Mon, 8 Jun 2015, Jan Hubicka wrote:

Hi, this patch makes Fortran's C_SIGNED_CHAR and C_SIZE_T interoperable with signed char and size_t as the standard requires. There are two issues here. First, Fortran integers are always signed, but the standard insists on the signed integer to match C size_t that is unsigned (if it was ptrdiff_t, we would be better off) and similarly the standard seems to explicitly state that C_SIGNED_CHAR is interoperable with both signed char and unsigned char. I thus globbed all integer types of precision compatible either with char or size_t to be the same regardless of the signedness.

Hmm, actually there is a note:

NOTE 15.8 ISO/IEC 9899:1999 specifies that the representations for nonnegative signed integers are the same as the corresponding values of unsigned integers. Because Fortran does not provide direct support for unsigned kinds of integers, the ISO_C_BINDING module does not make accessible named constants for their kind type parameter values. A user can use the signed kinds of integers to interoperate with the unsigned types and all their qualified versions as well. This has the potentially surprising side effect that the C type unsigned char is interoperable with the type integer with a kind type parameter of C_SIGNED_CHAR.

This seems to imply that other integer types also should be interoperable regardless of the signedness. It is true that representation is same for C, but alias sets are not. Perhaps all of the C BINDING types shall just be dropped to alias set 0? That would also solve the inter-operability of char versus char[1].

Alias-sets of signed and unsigned variants of integer types are the same:

alias_set_type
c_common_get_alias_set (tree t)
{
...
  /* The C standard specifically allows aliasing between signed and
     unsigned variants of the same type.  We treat the signed
     variant as canonical.  */
  if (TREE_CODE (t) == INTEGER_TYPE && TYPE_UNSIGNED (t))
    {
      tree t1 = c_common_signed_type (t);

      /* t1 == t can happen for boolean nodes which are always unsigned.  */
      if (t1 != t)
        return get_alias_set (t1);

yes, this should be moved to alias.c ...

Richard.

I would say that the note is non-normative, so perhaps it can just be ignored, too :)

Honza

-- Richard Biener rguent...@suse.de SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Dilip Upmanyu, Graham Norton, HRB 21284 (AG Nuernberg)
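As an illustration of the aliasing rule quoted in that comment, the small sketch below stores through an unsigned pointer that points at an int object. Because access through the signed or unsigned variant of an object's type is permitted, the compiler has to assume the two pointers may alias and reload the value. The function name is made up for this example.

```cpp
#include <cassert>

// Writes through both pointers and then reads back through the int
// pointer.  If I and U refer to the same object, the second store must be
// visible in the returned value, because int and unsigned int share an
// alias set.
static int
store_through_unsigned (int *i, unsigned *u)
{
  *i = 1;
  *u = 2;
  return *i;
}
```

The same reasoning does not automatically extend to struct { int i; } versus struct { unsigned i; } accessed through pointers to the other struct type, which is the open question in this thread.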
Re: debug-early branch merged into mainline
On 06/08/2015 04:26 AM, Richard Biener wrote: On Mon, Jun 8, 2015 at 3:23 AM, Aldy Hernandez al...@redhat.com wrote: On 06/07/2015 02:33 PM, Richard Biener wrote: On June 7, 2015 6:00:05 PM GMT+02:00, Aldy Hernandez al...@redhat.com wrote: On 06/07/2015 11:25 AM, Richard Biener wrote: On June 7, 2015 5:03:30 PM GMT+02:00, Aldy Hernandez al...@redhat.com wrote: On 06/06/2015 05:49 AM, Andreas Schwab wrote: Bootstrap fails on aarch64: Comparing stages 2 and 3 warning: gcc/cc1objplus-checksum.o differs warning: gcc/cc1obj-checksum.o differs warning: gcc/cc1plus-checksum.o differs warning: gcc/cc1-checksum.o differs Bootstrap comparison failure! gcc/ira-costs.o differs gcc/tree-sra.o differs gcc/tree-parloops.o differs gcc/tree-vect-data-refs.o differs gcc/java/jcf-io.o differs gcc/ipa-inline-analysis.o differs The bootstrap comparison failure on ppc64le, aarch64, and possibly others is due to the order of some sections being in a different order with and without debugging. Stage2 is being compiled with no debugging due to -gtoggle, and stage3 is being compiled with debugging. For ira-costs.o on ppc64le we have: -Disassembly of section .rodata._ZN10hash_tableI19cost_classes_hasher11xcallocatorE6expandEv.str1.8: +Disassembly of section .rodata._ZN10hash_tableI19cost_classes_hasher11xcallocatorE26find_empty_slot_for_expandEj.str1.8: ... -Disassembly of section .rodata._ZN10hash_tableI19cost_classes_hasher11xcallocatorE26find_empty_slot_for_expandEj.str1.8: +Disassembly of section .rodata._ZN10hash_tableI19cost_classes_hasher11xcallocatorE6expandEv.str1.8: There is no semantic difference between the objects, just the ordering. I assume it's the same problem for the rest of the objects and architectures. I will look into this, unless someone beats me to it, or has an idea right off the bat. Check whether the symbol table walkers are walking hash tables. I assume the above are emitted via the symbol removal handling for debug stuff? Ughh, indeed. 
These sections are being outputted from output_object_blocks which traverses a hash table:

void
output_object_blocks (void)
{
  object_block_htab->traverse<void *, output_object_block_htab> (NULL);
}

Perhaps we should sort them by some deterministic field and then call output_object_block() on each member of the resulting list?

Yes, that would be the usual fix. Maybe sth has an UID already, is the 'object' a decl by chance?

The attached patch fixes the bootstrap failure on ppc64le, and theoretically the aarch64 problem as well, but I haven't checked. Tested on ppc64le linux by bootstrapping, and regtesting C/C++ against pre debug-early merge sources. Also tested by a full bootstrap and regtest on x86-64 Linux. OK for mainline?

Please use FOR_EACH_HASH_TABLE_ELEMENT to put elements on the vector instead of the htab traversal. The compare function looks like we will end up having many equal elements (and thus random ordering on hosts where qsort doesn't behave sane here, like Solaris IIRC). Unless all sections are named (which it looks like)

Some sections are not named. How about we sort the named sections and output them, but call output_object_block() on the rest of the sections on whatever order they were in? This solves the bootstrap problem as well. Attached patch tested on x86-64 and ppc64le Linux. OK?

Aldy

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index e1bd305..f6d4bda 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,10 @@
+2015-06-07  Aldy Hernandez  al...@redhat.com
+
+    * varasm.c (output_object_block_htab): Remove.
+    (output_object_block_compare): New.
+    (output_object_blocks): Sort named object_blocks before outputting
+    them.
+
 2015-06-06  Jan Hubicka  hubi...@ucw.cz
 
     * alias.c (get_alias_set): Be ready for TYPE_CANONICAL
diff --git a/gcc/varasm.c b/gcc/varasm.c
index 18f3eac..a765278 100644
--- a/gcc/varasm.c
+++ b/gcc/varasm.c
@@ -7420,14 +7420,18 @@ output_object_block (struct object_block *block)
     }
 }
 
-/* A htab_traverse callback used to call output_object_block for
-   each member of object_block_htab.  */
+/* A callback for qsort to compare object_blocks.  */
 
-int
-output_object_block_htab (object_block **slot, void *)
+static int
+output_object_block_compare (const void *x, const void *y)
 {
-  output_object_block (*slot);
-  return 1;
+  object_block *p1 = *(object_block * const*)x;
+  object_block *p2 = *(object_block * const*)y;
+
+  gcc_assert (p1->sect->common.flags & SECTION_NAMED
+              && p2->sect->common.flags & SECTION_NAMED);
+
+  return strcmp (p1->sect->named.name, p2->sect->named.name);
 }
 
 /* Output the definitions of all object_blocks.  */
@@ -7435,7 +7439,25 @@ output_object_block_htab (object_block **slot, void *)
 void
 output_object_blocks (void)
 {
-  object_block_htab->traverse<void *, output_object_block_htab> (NULL);
+  vec<object_block *, va_heap> v = vNULL;
+  object_block *obj;
+
RFA: RL78: With -mes0 put read only data in the .frodata section
Hi DJ,

The -mes0 option for the RL78 backend should put read only data into a special .frodata section, but unfortunately my recent update to the rl78_select_section function broke this behaviour. Below is a patch to fix this. Tested with no regressions on an rl78-elf toolchain.

OK to apply ?

Cheers
Nick

gcc/ChangeLog
2015-06-08  Nick Clifton  ni...@redhat.com

    * config/rl78/rl78.c (rl78_select_section): With -mes0 put read only data into the .frodata section.

Index: config/rl78/rl78.c
===
--- config/rl78/rl78.c (revision 224218)
+++ config/rl78/rl78.c (working copy)
@@ -4423,7 +4423,7 @@
     }
 
   if (readonly)
-    return readonly_data_section;
+    return TARGET_ES0 ? frodata_section : readonly_data_section;
 
   switch (categorize_decl_for_section (decl, reloc))
     {
@@ -4430,7 +4430,7 @@
     case SECCAT_TEXT:   return text_section;
     case SECCAT_DATA:   return data_section;
     case SECCAT_BSS:    return bss_section;
-    case SECCAT_RODATA: return readonly_data_section;
+    case SECCAT_RODATA: return TARGET_ES0 ? frodata_section : readonly_data_section;
     default:
       return default_select_section (decl, reloc, align);
     }
[patch] libstdc++/66441 Fix wstring_convert when generating BOM
If the codecvt facet generates a BOM on every call to out() then we need to ensure there is enough capacity in the output string for the entire result, otherwise we loop and insert more than one BOM into the result.

Tested powerpc64-linux and powerpc64-linux. Committing to trunk and gcc-5-branch.

commit 5767dd073ce2905f35d817f67df8a3a4d3c995dc
Author: Jonathan Wakely jwak...@redhat.com
Date:   Mon Jun 8 13:15:33 2015 +0100

    PR libstdc++/66441
    * testsuite/22_locale/conversions/string/66441.cc: New.
    * include/bits/locale_conv.h (__do_str_codecvt): Reserve enough space in the output string for BOM and complete result.

diff --git a/libstdc++-v3/include/bits/locale_conv.h b/libstdc++-v3/include/bits/locale_conv.h
index 8b0a77c..61b535c 100644
--- a/libstdc++-v3/include/bits/locale_conv.h
+++ b/libstdc++-v3/include/bits/locale_conv.h
@@ -60,12 +60,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
     {
       size_t __outchars = 0;
       auto __next = __first;
-      const auto __maxlen = __cvt.max_length();
+      const auto __maxlen = __cvt.max_length() + 1;
 
       codecvt_base::result __result;
       do
         {
-          __outstr.resize(__outstr.size() + (__last - __next) + __maxlen);
+          __outstr.resize(__outstr.size() + (__last - __next) * __maxlen);
           auto __outnext = &__outstr.front() + __outchars;
           auto const __outlast = &__outstr.back() + 1;
           __result = (__cvt.*__fn)(__state, __next, __last, __next,
diff --git a/libstdc++-v3/testsuite/22_locale/conversions/string/66441.cc b/libstdc++-v3/testsuite/22_locale/conversions/string/66441.cc
new file mode 100644
index 0000000..b72edc8
--- /dev/null
+++ b/libstdc++-v3/testsuite/22_locale/conversions/string/66441.cc
@@ -0,0 +1,49 @@
+// Copyright (C) 2015 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// { dg-options "-std=gnu++11" }
+
+// libstdc++/66441
+
+#include <locale>
+#include <codecvt>
+#include <testsuite_hooks.h>
+
+void
+test01()
+{
+  // convert from UCS-4 to UTF16BE with BOM.
+  using cvt = std::codecvt_utf16<char32_t, 0x10ffff, std::generate_header>;
+  std::wstring_convert<cvt, char32_t> conv;
+  auto to = conv.to_bytes(U"ab\u00e7");
+
+  VERIFY( to.length() == 8 );
+  VERIFY( (unsigned char)to[0] == 0xfe );
+  VERIFY( (unsigned char)to[1] == 0xff );
+  VERIFY( (unsigned char)to[2] == 0x00 );
+  VERIFY( (unsigned char)to[3] == 0x61 );
+  VERIFY( (unsigned char)to[4] == 0x00 );
+  VERIFY( (unsigned char)to[5] == 0x62 );
+  VERIFY( (unsigned char)to[6] == 0x00 );
+  VERIFY( (unsigned char)to[7] == 0xe7 );
+}
+
+int
+main()
+{
+  test01();
+}
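The failure mode described in this message can be reproduced without the real facet. The sketch below uses a toy, stateless UTF-16BE encoder that writes a BOM at the start of every call and handles only code points below 0x10000; it is not libstdc++ code. The names, and the assumed maximum length of 4, are hypothetical. The driver loop mirrors the shape of the loop in the patch, once with the old additive sizing and once with the new multiplicative sizing.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

enum class result_sketch { ok, partial };

// Toy encoder: emits a big-endian BOM on every call, then two bytes per
// input character, stopping with "partial" when the buffer is exhausted.
static result_sketch
encode_with_bom (const char32_t *first, const char32_t *last,
                 const char32_t *&next,
                 char *out, char *out_end, char *&out_next)
{
  next = first;
  out_next = out;
  if (out_end - out_next < 2)
    return result_sketch::partial;
  *out_next++ = static_cast<char> (0xfe);
  *out_next++ = static_cast<char> (0xff);
  for (; next != last; ++next)
    {
      if (out_end - out_next < 2)
        return result_sketch::partial;
      *out_next++ = static_cast<char> ((*next >> 8) & 0xff);
      *out_next++ = static_cast<char> (*next & 0xff);
    }
  return result_sketch::ok;
}

// Driver loop; MULTIPLY selects the new sizing rule, otherwise the old one.
static std::string
convert_sketch (const std::u32string &in, bool multiply)
{
  const std::size_t maxlen = 4;  // assumed max_length() of the toy facet
  std::string out;
  std::size_t outchars = 0;
  const char32_t *next = in.data ();
  const char32_t *const last = in.data () + in.size ();
  result_sketch r;
  do
    {
      const std::size_t remaining = static_cast<std::size_t> (last - next);
      out.resize (out.size ()
                  + (multiply ? remaining * (maxlen + 1)
                              : remaining + maxlen));
      char *outnext = &out[0] + outchars;
      char *const outlast = &out[0] + out.size ();
      r = encode_with_bom (next, last, next, outnext, outlast, outnext);
      outchars = static_cast<std::size_t> (outnext - &out[0]);
    }
  while (r == result_sketch::partial && next != last);
  out.resize (outchars);
  return out;
}
```

With a three-character input, the additive rule leaves room for only seven bytes on the first pass, so the encoder is called a second time and writes a second BOM; the multiplicative rule reserves enough for one pass.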
Re: [PATCH 01/13] recog: Increased max number of alternatives - v2
On 06/01/2015 10:22 AM, Jakub Jelinek wrote: On Fri, May 22, 2015 at 09:54:00AM +0200, Andreas Krebbel wrote: On Tue, May 19, 2015 at 10:40:26AM +0200, Andreas Krebbel wrote: On 05/18/2015 04:19 PM, Richard Biener wrote: Please use uint64_t instead. Done. Ok with that change? I've applied the following patch. Note that on current trunk cross compiler from x86_64-linux to s390x-linux (admittedly just make cc1 of an older configured tree, but with libcpp (normal and build) rebuilt) fails miserably with genattrtab: invalid alternative specified for pattern number 1015 * recog.h: Increase MAX_RECOG_ALTERNATIVES. Change type of alternative_mask to uint64_t. From quick look at genattrtab.c, there are many further spots which rely on MAX_RECOG_ALTERNATIVES fitting into int bits. With this quick patch make cc1 at least succeeds, but no idea whether I've caught all the spots which work with bitmasks of alternatives. I've regtested your patch on S/390 without seeing any problems. Could you please commit it to mainline? Thanks! Bye, -Andreas-
Re: [BUILDROBOT]
On 06/05/2015 01:04 AM, Jan-Benedict Glaw wrote: Hi Andreas, On Mon, 2015-05-11 15:23:33 +0200, Andreas Krebbel kreb...@linux.vnet.ibm.com wrote: gcc/ * config/s390/constraints.md (j00, jm1, jxx, jyy, v): New constraints. * config/s390/predicates.md (const0_operand, constm1_operand) (constable_operand): Accept vector operands. * config/s390/s390-modes.def: Add supported vector modes. * config/s390/s390-protos.h (s390_cannot_change_mode_class) [...] Starting with this patch, it seems my buildrobot won't be able to build a s390{,x}-linux compiler: g++ -g -O2 -DIN_GCC -DCROSS_DIRECTORY_STRUCTURE -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -fno-common -DHAVE_CONFIG_H -DGENERATOR_FILE -fno-PIE -static-libstdc++ -static-libgcc -o build/genattrtab \ build/genattrtab.o build/rtl.o build/read-rtl.o build/ggc-none.o build/vec.o build/min-insn-modes.o build/gensupport.o build/print-rtl.o build/hash-table.o build/read-md.o build/errors.o ../build-x86_64-unknown-linux-gnu/libiberty/libiberty.a build/genattrtab /home/jbglaw/repos/gcc/gcc/common.md /home/jbglaw/repos/gcc/gcc/config/s390/s390.md insn-conditions.md \ -Atmp-attrtab.c -Dtmp-dfatab.c -Ltmp-latencytab.c genattrtab: invalid alternative specified for pattern number 1015 Makefile:2167: recipe for target 's-attrtab' failed make[1]: *** [s-attrtab] Error 1 make[1]: Leaving directory '/home/jbglaw/build/s390x-linux/build-gcc/gcc' Makefile:4119: recipe for target 'all-gcc' failed make: *** [all-gcc] Error 2 The above error is taken from this build: http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=444690 This has been reported by Jakub already. He also proposed a fix in his email: https://gcc.gnu.org/ml/gcc-patches/2015-06/msg00012.html Bye, -Andreas-
Re: [PATCH 01/13] recog: Increased max number of alternatives - v2
On Mon, Jun 08, 2015 at 03:32:50PM +0200, Andreas Krebbel wrote: On 06/01/2015 10:22 AM, Jakub Jelinek wrote: On Fri, May 22, 2015 at 09:54:00AM +0200, Andreas Krebbel wrote: On Tue, May 19, 2015 at 10:40:26AM +0200, Andreas Krebbel wrote: On 05/18/2015 04:19 PM, Richard Biener wrote: Please use uint64_t instead. Done. Ok with that change? I've applied the following patch. Note that on current trunk cross compiler from x86_64-linux to s390x-linux (admittedly just make cc1 of an older configured tree, but with libcpp (normal and build) rebuilt) fails miserably with genattrtab: invalid alternative specified for pattern number 1015 * recog.h: Increase MAX_RECOG_ALTERNATIVES. Change type of alternative_mask to uint64_t. From quick look at genattrtab.c, there are many further spots which rely on MAX_RECOG_ALTERNATIVES fitting into int bits. With this quick patch make cc1 at least succeeds, but no idea whether I've caught all the spots which work with bitmasks of alternatives. I've regtested your patch on S/390 without seeing any problems. Could you please commit it to mainline? Ok, I will. Have you looked around if these are all the spots that need changing for this in the gen* tools? Perhaps trying -fsanitize=undefined and/or valgrind. I admit I haven't spent too much time on it. Jakub
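For anyone auditing the gen* tools for the issue discussed here, the pattern to look for is a bitmask of alternatives built with an int-typed shift. A small sketch of the safe form follows; the type and function names are illustrative only and are not the identifiers used in recog.h or genattrtab.c.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative 64-bit mask type for sets of alternatives.
typedef std::uint64_t alternative_mask_sketch;

// Bit for alternative number ALT.  The constant is widened before the
// shift; writing `1 << alt` would shift a plain int, which is undefined
// once ALT reaches the width of int.
static alternative_mask_sketch
alternative_bit (int alt)
{
  assert (alt >= 0 && alt < 64);
  return static_cast<alternative_mask_sketch> (1) << alt;
}

// True if alternative ALT is present in MASK.
static bool
alternative_enabled (alternative_mask_sketch mask, int alt)
{
  return (mask & alternative_bit (alt)) != 0;
}
```

Building with -fsanitize=undefined, as suggested above, should flag any remaining int-width shifts at run time when an alternative number of 31 or more is used.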
[commit#2] [patch#2] PR other/65366: Fix gdbhooks.py for GDB with Python3
On Mon, 08 Jun 2015 09:46:59 +0200, Richard Biener wrote:

adding an import sys makes it work fine though.

I do not see the sys error with either FSF GDB HEAD or Fedora 22 GDB. I agree it probably should be there.

Thus, ok with also adding an import sys.

Done and checked in: r224223

Jan
Re: [PATCH] Lift restrictions on SLP permutation for loop vect
On Wed, 3 Jun 2015, Richard Biener wrote: This allows all permutations we can generate (according to the target). Bootstrap and regtest pending on x86_64-unknown-linux-gnu. So this turned up other issues thus the following is what I have committed after bootstrapping and testing on x86_64-unknown-linux-gnu. Richard. 2015-06-08 Richard Biener rguent...@suse.de * tree-vect-stmts.c (vectorizable_load): Compute the pointer adjustment for gaps at the end of a SLP load group properly. * tree-vect-slp.c (vect_supported_load_permutation_p): Allow all permutations we can generate. (vect_transform_slp_perm_load): Use the correct group-size. * gcc.dg/vect/slp-perm-10.c: New testcase. * gcc.dg/vect/slp-23.c: Adjust. * gcc.dg/torture/pr53366-2.c: Also verify cross-iteration vector pointer update. Index: gcc/tree-vect-stmts.c === *** gcc/tree-vect-stmts.c (revision 224077) --- gcc/tree-vect-stmts.c (working copy) *** vectorizable_load (gimple stmt, gimple_s *** 5807,5813 gimple ptr_incr = NULL; int nunits = TYPE_VECTOR_SUBPARTS (vectype); int ncopies; ! int i, j, group_size = -1, group_gap; tree msq = NULL_TREE, lsq; tree offset = NULL_TREE; tree byte_offset = NULL_TREE; --- 5807,5813 gimple ptr_incr = NULL; int nunits = TYPE_VECTOR_SUBPARTS (vectype); int ncopies; ! int i, j, group_size = -1, group_gap_adj; tree msq = NULL_TREE, lsq; tree offset = NULL_TREE; tree byte_offset = NULL_TREE; *** vectorizable_load (gimple stmt, gimple_s *** 6396,6421 } first_dr = STMT_VINFO_DATA_REF (vinfo_for_stmt (first_stmt)); group_size = GROUP_SIZE (vinfo_for_stmt (first_stmt)); /* VEC_NUM is the number of vect stmts to be created for this group. */ if (slp) { grouped_load = false; vec_num = SLP_TREE_NUMBER_OF_VEC_STMTS (slp_node); ! group_gap = GROUP_GAP (vinfo_for_stmt (first_stmt)); } else ! { ! vec_num = group_size; ! group_gap = 0; ! } } else { first_stmt = stmt; first_dr = dr; group_size = vec_num = 1; ! 
group_gap = 0; } alignment_support_scheme = vect_supportable_dr_alignment (first_dr, false); --- 6396,6419 } first_dr = STMT_VINFO_DATA_REF (vinfo_for_stmt (first_stmt)); group_size = GROUP_SIZE (vinfo_for_stmt (first_stmt)); + group_gap_adj = 0; /* VEC_NUM is the number of vect stmts to be created for this group. */ if (slp) { grouped_load = false; vec_num = SLP_TREE_NUMBER_OF_VEC_STMTS (slp_node); ! group_gap_adj = vf * group_size - nunits * vec_num; } else ! vec_num = group_size; } else { first_stmt = stmt; first_dr = dr; group_size = vec_num = 1; ! group_gap_adj = 0; } alignment_support_scheme = vect_supportable_dr_alignment (first_dr, false); *** vectorizable_load (gimple stmt, gimple_s *** 6831,6842 if (slp !slp_perm) SLP_TREE_VEC_STMTS (slp_node).quick_push (new_stmt); } ! /* Bump the vector pointer to account for a gap. */ ! if (slp group_gap != 0) { ! tree bump = size_binop (MULT_EXPR, ! TYPE_SIZE_UNIT (elem_type), ! size_int (group_gap)); dataref_ptr = bump_vector_ptr (dataref_ptr, ptr_incr, gsi, stmt, bump); } --- 6829,6843 if (slp !slp_perm) SLP_TREE_VEC_STMTS (slp_node).quick_push (new_stmt); } ! /* Bump the vector pointer to account for a gap or for excess !elements loaded for a permuted SLP load. */ ! if (group_gap_adj != 0) { ! bool ovf; ! tree bump ! = wide_int_to_tree (sizetype, ! wi::smul (TYPE_SIZE_UNIT (elem_type), ! group_gap_adj, ovf)); dataref_ptr = bump_vector_ptr (dataref_ptr, ptr_incr, gsi, stmt, bump); } Index: gcc/tree-vect-slp.c === *** gcc/tree-vect-slp.c (revision 224077) --- gcc/tree-vect-slp.c (working copy) *** vect_supported_load_permutation_p (slp_i *** 1502,1548 return true; } ! /* FORNOW: the only supported permutation is 0..01..1.. of length equal to ! GROUP_SIZE and where each sequence of same drs is of GROUP_SIZE length as ! well (unless it's reduction). */ ! if
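A worked instance of the new pointer adjustment may help when reading the hunk above. The patch computes group_gap_adj as vf * group_size - nunits * vec_num and scales it by the element size. The helper names below are made up for this sketch, and the numbers are arbitrary examples rather than values from the testcases.

```cpp
#include <cassert>

// Elements the data pointer still has to advance after VEC_NUM vector
// loads of NUNITS elements each, when one vectorized iteration covers VF
// scalar iterations of a group of GROUP_SIZE elements.  The result is
// negative when more elements were loaded than the group spans.
static long
group_gap_adj_elements (long vf, long group_size, long nunits, long vec_num)
{
  return vf * group_size - nunits * vec_num;
}

// The same adjustment expressed in bytes for elements of ELEM_SIZE bytes.
static long
pointer_bump_bytes (long elem_size, long vf, long group_size,
                    long nunits, long vec_num)
{
  return elem_size
         * group_gap_adj_elements (vf, group_size, nunits, vec_num);
}
```

For example, with vf = 4, group_size = 3, nunits = 4 and vec_num = 2, twelve elements are covered but only eight are loaded, so the pointer is bumped by four more elements (16 bytes for 4-byte elements).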
[PING^2][PR65637] Fix ssa-handling code in expand_omp_for_static_chunk
On 18/05/15 14:53, Tom de Vries wrote:
On 15-04-15 15:10, Tom de Vries wrote:

Hi,

This patch series fixes PR65637. Currently, ssa-handling code in expand_omp_for_static_chunk is dead and not exercised by testing. Ssa-handling code in omp-low.c is only triggered by pass_parallelize_loops, and that pass doesn't specify a chunk size on the GIMPLE_OMP_FOR it constructs, so that only exercises the expand_omp_for_static_nochunk path.

Using the attached trigger patch, we exercise the ssa-handling code in expand_omp_for_static_chunk. The following patch series fixes the problems in the ssa-handling code that we encounter.

1. Fix gcc_assert in expand_omp_for_static_chunk
2. Fix inner loop phi in expand_omp_for_static_chunk
3. Handle 2 preds for fin_bb in expand_omp_for_static_chunk

The patch series has been bootstrapped and reg-tested on x86_64 together with attached trigger patch. I'll post the patches from the patch series individually, in response to this email.

Ping for the three patches.

Ping^2. Original posting at https://gcc.gnu.org/ml/gcc-patches/2015-04/msg00757.html .

Thanks,
- Tom
Re: Fix more of C/fortran canonical type issues
On Mon, 8 Jun 2015, Richard Biener wrote:
On Mon, 8 Jun 2015, Joseph Myers wrote:
On Mon, 8 Jun 2015, Richard Biener wrote:

I'm not sure the C standard mandates compatibility between struct { int i; } and struct { unsigned i; } for purposes of TBAA. Joseph?

I don't think they are necessarily compatible for TBAA.

OK, but since int and unsigned may alias, reading either struct's element via a pointer to int or a pointer to unsigned must be supported?

Yes. The questionable case would be taking an object of one of those structure types, casting a pointer to it to point to the other structure type and then dereferencing.

Are struct { int i; } and struct { unsigned i; } compatible when one is defined in one unit and the other in another?

In any case, I suppose if int and unsigned int pointers can be used interchangeably, we want to ignore TYPE_UNSIGNED for purposes of canonical type calculation for LTO. So is the second variant of the patch OK with a comment update that this is also required by C?

Honza

--
Joseph S. Myers jos...@codesourcery.com
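The access pattern discussed in this message can be sketched in C. This is only an illustration, not code from the thread: the names struct s_int, struct s_uns and the two reader functions are made up for the example. It relies on the C rule that an object may be accessed through the signed or unsigned type corresponding to its effective type.

```c
#include <assert.h>

/* Hypothetical stand-ins for the two struct types under discussion.  */
struct s_int { int i; };
struct s_uns { unsigned i; };

/* Supported: the int member is read through a pointer to unsigned.
   C allows access through the signed or unsigned type corresponding
   to the effective type of the object.  */
unsigned
read_int_member_as_unsigned (struct s_int *p)
{
  assert (p != 0);
  unsigned *up = (unsigned *) &p->i;
  return *up;
}

/* Supported for the same reason, in the other direction.  */
int
read_uns_member_as_int (struct s_uns *p)
{
  assert (p != 0);
  int *ip = (int *) &p->i;
  return *ip;
}

/* The questionable case would be ((struct s_uns *) p)->i for a
   struct s_int *p, i.e. casting the struct pointer itself.  That is
   deliberately not done here.  */
```

Both readers only reinterpret the member, never the enclosing struct, which is the distinction drawn in the reply quoted above.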
[gomp4] declare directive [0/5]
Hi! This patch series completes the implementation of the OpenACC declare directive. Patches applied to gomp-4_0-branch Thanks! Jim = gcc/ChangeLog.gomp * builtin-types.def (BT_FN_VOID_PTR_INT_UINT): New type. * gimple-pretty-print.c (dump_gimple_omp_target): Handle declare directive. * gimple.h (gf_mask): Add enum. (is_gimple_omp_oacc): Add declare directive. * gimplify.c (omp_notice_variable): Handle device_resident. (gimplify_omp_target_update): Handle declare directive. (gimplify_expr): Handle declare directive. * omp-builtins.def (BUILT_IN_GOACC_STATIC, BUILT_IN_GOACC_DECLARE): New types. * omp-low.c (expand_omp_target): Handle declare directive. (build_omp_regions_1): Likewise. (lower_omp_target): Likewise. (make_gimple_omp_edges): Likewise. * varpool.c (gomp-constants.h): Add inclusion. (make_offloadable_1, make_offloadable): New functions. (get_create): Add calls to make_offloadable. == gcc/c/ChangeLog.gomp * c-parser.c (tree-iterator.h): Add inclusion. (check_oacc_vars1, check_oacc_vars, find_oacc_return, finish_oacc_declare): New functions. (oacc_return): New structure. (oacc_returns): New variable. (c_parser_declaration_or_fndef): Add call to finish_oacc_declare. (oacc_dcl_idx): New variable. (c_parser_oacc_declare): Rewrite. = gcc/cp/ChangeLog.gomp * decl.c (gomp-constants.h): Add inclusion. (check_oacc_vars1, check_oacc_vsars, find_oacc_return, finish_oacc_declare): New functions. (finish_function): Add call to finish_oacc_declare. * parser.c (tree-iterator.h): Add inclusion. (oacc_dcl_idx): New variable. (OACC_DECLARE_CLAUSE_MASK): New macro. (cp_parser_oacc_declare): New function. (cp_parser_pragma): Handle parsing of declare directive. * pt.c (tsubr_expr): Add handling of declare directive. = gcc/fortran/ChangeLog.gomp * f95-lang.c (gfc_attribute_table): New entry. * gfortran.h (symbol_attribute): New attributes. (gfc_omp_map_op): New enums. (OMP_LIST_LINK): New enum. (gfc_oacc_declare): Add member: module_var. 
(finish_oacc_declare): Add calling parm. * module.c (ab_attribute): Add enums. (attr_bits): Add initialization of new attribute bits. (mio_symbol_attribute): Add handling of new attribute bits. * openmp.c (OMP_CLAUSE_LINK): New defintion. (gfc_match_omp_clauses): Add handling of link clause. (OACC_DECLARE_CLAUSES): Update declare directive clauses. (gfc_match_oacc_declare): Add handling of device_resident and link clauses. (gfc_resolve_oacc_declare): Add handling of link clause. * symbol.c (check_conflict): Add checks for declare clauses in modules. (gfc_add_oacc_declare_create, gfc_add_declare_copyin, gfc_add_oacc_declare_deviceptr, gfc_add_oacc_declare_device_resident): New functions. (gfc_add_target): Add checks for declare attrs. * trans-decl.c (add_attributes_to_decl): Add creation of attribute. (oacc_return): New structure. (oacc_returns, module_oacc_clauses): New variables. (find_oacc_return, add_clause, find_module_oacc_declare_clauses): New functions. (finish_oacc_declare): Rename from insert_oacc_declare and rewrite. (gfc_generate_function_code): Change calling of finish_oacc_declare. * trans-openmp.c (gfc_trans_omp_clauses): Add handling of link and device_resident clauses. (gfc_trans_oacc_declare): Rewrite. * trans-stmt.c (gfc_trans_block_construct): Change calling of finish_oacc_declare. * types.def (BT_FN_VOID_PTR_INT_UINT): New type. = gcc/testsuite/ChangeLog.gomp * c-c++-common/goacc/declare-1.c: Update tests. * c-c++-common/goacc/declare-2.c: Likewise. * gfortran.dg/goacc/declare-1.f95: Update tests. = libgomp/ChangeLog.gomp * libgomp.map: Add GOACC_declare and GOACC_register_static. * oacc-init.c (acc_shutdown_1): Add call to acc_deallocate_static. (acc_init): Add call to acc_allocate_static. * oacc-int.h (goacc_allocate_static, goacc_deallocate_static): New declarations. * oacc-parallel.c (oacc_static): New structure. (oacc_statics): New variable. 
(goacc_allocate_static, goacc_deallocate_static, GOACC_register_static, GOACC_declare): New functions. * testsuite/libgomp.oacc-c++/declare-1.C: New file. * testsuite/libgomp.oacc-c-c++-common/declare-1.c: New file. * testsuite/libgomp.oacc-c-c++-common/declare-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/declare-3.c: Likewise. *
[gomp4] declare directive [4/5]
diff --git a/gcc/builtin-types.def b/gcc/builtin-types.def index 7c3273f..0774da5 100644 --- a/gcc/builtin-types.def +++ b/gcc/builtin-types.def @@ -451,6 +451,7 @@ DEF_FUNCTION_TYPE_3 (BT_FN_BOOL_ULONG_ULONG_ULONGPTR, BT_BOOL, BT_ULONG, DEF_FUNCTION_TYPE_3 (BT_FN_BOOL_ULONGLONG_ULONGLONG_ULONGLONGPTR, BT_BOOL, BT_ULONGLONG, BT_ULONGLONG, BT_PTR_ULONGLONG) DEF_FUNCTION_TYPE_3 (BT_FN_INT_INT_INT_INT, BT_INT, BT_INT, BT_INT, BT_INT) +DEF_FUNCTION_TYPE_3 (BT_FN_VOID_PTR_INT_UINT, BT_VOID, BT_PTR, BT_INT, BT_UINT) DEF_FUNCTION_TYPE_4 (BT_FN_SIZE_CONST_PTR_SIZE_SIZE_FILEPTR, BT_SIZE, BT_CONST_PTR, BT_SIZE, BT_SIZE, BT_FILEPTR) diff --git a/gcc/gimple-pretty-print.c b/gcc/gimple-pretty-print.c index a640a96..f447af6 100644 --- a/gcc/gimple-pretty-print.c +++ b/gcc/gimple-pretty-print.c @@ -1365,6 +1365,9 @@ dump_gimple_omp_target (pretty_printer *buffer, gomp_target *gs, case GF_OMP_TARGET_KIND_OACC_ENTER_EXIT_DATA: kind = oacc_enter_exit_data; break; +case GF_OMP_TARGET_KIND_OACC_DECLARE: + kind = oacc_declare; + break; default: gcc_unreachable (); } diff --git a/gcc/gimple.h b/gcc/gimple.h index bf048e6..bd92c96 100644 --- a/gcc/gimple.h +++ b/gcc/gimple.h @@ -100,7 +100,7 @@ enum gf_mask { GF_OMP_FOR_KIND_CILKSIMD = GF_OMP_FOR_SIMD | 1, GF_OMP_FOR_COMBINED = 1 3, GF_OMP_FOR_COMBINED_INTO = 1 4, -GF_OMP_TARGET_KIND_MASK = (1 3) - 1, +GF_OMP_TARGET_KIND_MASK = (1 4) - 1, GF_OMP_TARGET_KIND_REGION = 0, GF_OMP_TARGET_KIND_DATA = 1, GF_OMP_TARGET_KIND_UPDATE = 2, @@ -109,6 +109,7 @@ enum gf_mask { GF_OMP_TARGET_KIND_OACC_DATA = 5, GF_OMP_TARGET_KIND_OACC_UPDATE = 6, GF_OMP_TARGET_KIND_OACC_ENTER_EXIT_DATA = 7, +GF_OMP_TARGET_KIND_OACC_DECLARE = 8, /* True on an GIMPLE_OMP_RETURN statement if the return does not require a thread synchronization via some sort of barrier. 
The exact barrier @@ -5663,6 +5664,7 @@ is_gimple_omp_oacc (const_gimple stmt) case GF_OMP_TARGET_KIND_OACC_DATA: case GF_OMP_TARGET_KIND_OACC_UPDATE: case GF_OMP_TARGET_KIND_OACC_ENTER_EXIT_DATA: + case GF_OMP_TARGET_KIND_OACC_DECLARE: return true; default: return false; diff --git a/gcc/gimplify.c b/gcc/gimplify.c index c85b424..b1f768f 100644 --- a/gcc/gimplify.c +++ b/gcc/gimplify.c @@ -5819,10 +5819,26 @@ omp_notice_variable (struct gimplify_omp_ctx *ctx, tree decl, bool in_code) splay_tree_node n; unsigned flags = in_code ? GOVD_SEEN : 0; bool ret = false, shared; + bool device_resident = false; if (error_operand_p (decl)) return false; + if (flag_openacc is_global_var (decl)) +{ + tree attr = lookup_attribute (oacc declare, DECL_ATTRIBUTES (decl)); + if (attr) + { + tree t, c; + for (t = TREE_VALUE (attr); t; t = TREE_PURPOSE (t)) + { + c = TREE_VALUE (t); + if (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_DEVICE_RESIDENT) + device_resident = true; + } + } +} + /* Threadprivate variables are predetermined. */ if (is_global_var (decl)) { @@ -5899,7 +5915,9 @@ omp_notice_variable (struct gimplify_omp_ctx *ctx, tree decl, bool in_code) by default are firstprivate (gang-local) in parallel. */ if (!n2 !AGGREGATE_TYPE_P (type)) { - if (ctx-acc_region_kind == ARK_PARALLEL) + if (device_resident) + flags |= GOVD_MAP_TO_ONLY; + else if (ctx-acc_region_kind == ARK_PARALLEL) flags |= (GOVD_GANGLOCAL | GOVD_MAP_TO_ONLY); /* Scalars under kernels are default 'copy'. 
*/ else if (ctx-acc_region_kind == ARK_KERNELS) @@ -7729,6 +7747,10 @@ gimplify_omp_target_update (tree *expr_p, gimple_seq *pre_p) switch (TREE_CODE (expr)) { +case OACC_DECLARE: + kind = GF_OMP_TARGET_KIND_OACC_DECLARE; + ork = ORK_OACC; + break; case OACC_ENTER_DATA: kind = GF_OMP_TARGET_KIND_OACC_ENTER_EXIT_DATA; ork = ORK_OACC; @@ -8707,11 +8729,6 @@ gimplify_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p, ret = gimplify_oacc_host_data (expr_p, pre_p); break; - case OACC_DECLARE: - sorry (directive not yet implemented); - ret = GS_ALL_DONE; - break; - case OACC_KERNELS: case OACC_PARALLEL: case OACC_DATA: @@ -8724,6 +8741,7 @@ gimplify_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p, ret = GS_ALL_DONE; break; + case OACC_DECLARE: case OACC_ENTER_DATA: case OACC_EXIT_DATA: case OACC_UPDATE: diff --git a/gcc/omp-builtins.def b/gcc/omp-builtins.def index 6e70d0b..b31cb2d 100644 --- a/gcc/omp-builtins.def +++ b/gcc/omp-builtins.def @@ -299,3 +299,7 @@ DEF_GOMP_BUILTIN (BUILT_IN_GOMP_TARGET_UPDATE, GOMP_target_update, BT_FN_VOID_INT_PTR_SIZE_PTR_PTR_PTR, ATTR_NOTHROW_LIST) DEF_GOMP_BUILTIN (BUILT_IN_GOMP_TEAMS, GOMP_teams, BT_FN_VOID_UINT_UINT, ATTR_NOTHROW_LIST) +DEF_GOACC_BUILTIN (BUILT_IN_GOACC_STATIC, GOACC_register_static, + BT_FN_VOID_PTR_INT_UINT,
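A note on the gimple.h hunk of this patch: the kind mask grows from 3 to 4 bits because the new kind value 8 does not fit in 3 bits. The sketch below is a standalone illustration under that reading; the enumerator and function names are shortened stand-ins, not the GCC definitions.

```c
#include <assert.h>

/* Shortened stand-ins for a few of the target kinds.  */
enum target_kind
{
  KIND_REGION = 0,
  KIND_OACC_ENTER_EXIT_DATA = 7,
  KIND_OACC_DECLARE = 8
};

enum
{
  OLD_KIND_MASK = (1 << 3) - 1,   /* 3 bits: values 0..7 */
  NEW_KIND_MASK = (1 << 4) - 1    /* 4 bits: values 0..15 */
};

/* Extract the kind stored in the low bits of a flags word.  */
unsigned
extract_kind (unsigned flags, unsigned mask)
{
  assert (mask != 0);
  return flags & mask;
}
```

With the old mask, extracting kind 8 yields 0, which is indistinguishable from the plain region kind; with the new mask the value survives.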
Re: Fix more of C/fortran canonical type issues
On Mon, 8 Jun 2015, Jan Hubicka wrote:

Hi, this is a variant of the patch that also globs the rest of the integer types. Note that we will still get false warnings out of lto-symtab when the values are not wrapped up in structures. This is because lto-symtab uses types_compatible_p, which in turn uses useless_type_conversion, and that one needs to honor signedness. I suppose we need a way to test representation compatibility and TBAA compatibility. I will give more thought to how to reorganize the code.

Basically, representation compatibility is TYPE_CANONICAL equivalence, and TBAA compatibility is get_alias_set equivalence. So you have to be careful when mangling TYPE_CANONICAL according to get_alias_set and make sure to only apply this (signedness for example) for aggregate type components.

Hmm, OK, so you think that

TYPE_CANONICAL (int) != TYPE_CANONICAL (unsigned int)
get_alias_set (TYPE_CANONICAL (int)) == get_alias_set (TYPE_CANONICAL (unsigned int))
TYPE_CANONICAL (struct {int}) == TYPE_CANONICAL (struct {unsigned int})

I suppose that makes sense because the actual fields of a structure are not really used for semantics (it depends on what FIELD_DECL we pull out). I will build a testcase and send a patch that will glob the alias sets of signed and unsigned types in LTO's own get_alias_set hook. I do not think we need to punish aliasing of non-C languages (without LTO) by putting it all the way into alias.c.

As for structures, I have the following plan. I plan to add computation of canonical types of pointers that will use the canonical type of the pointed-to type, the same way as the non-LTO path does. This however cannot be propagated up to structure fields, since a pointer to a complete structure must be compatible with a pointer to an incomplete type. I suppose signed and unsigned values can be handled the same way. We thus need to track in hash_canonical_type and gimple_types_compatible_p whether we are in a subtype of an aggregate/array, and if so, we don't want to use TYPE_CANONICAL of the nested type and instead just recurse with a coarser equivalence defined.

Joseph, I may be wrong, but I believe that cross-compilation-unit representation compatibility (in the C standard sense) is not an equivalence relation, so it can't be fully represented by TYPE_CANONICAL equivalence. We may want to warn about types not being compatible when compiling:

a.c:
struct b {int c;};
struct a {struct b *ptr;} var;

b.c:
struct b {long c;};
struct a {struct b *ptr;} var;

pointers are not required to be compatible with each other. Or in a more ugly way

int a[5];

as a structure field may be compatible with

int a[b];

int a[b] is compatible with

int a[10];

But int a[5] is not compatible with int a[10]; I am not sure if TYPE_CANONICAL of a structure with a fixed size array should match TYPE_CANONICAL of a similar looking structure with a VLA.

Can you please split out the string-flag change? It is approved.

OK, I will split it and commit it with the corresponding parts of the testcase commented out.

Thanks!
Honza
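The array example in this message (a[5], a[b], a[10]) shows why compatibility cannot be an equivalence relation: it is not transitive. A toy model makes this concrete. Everything below is a hypothetical illustration, not a GCC data structure; it assumes the rule that two array types of the same element type are incompatible only when both have constant bounds and the bounds differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of an "array of int" type: either it has a constant
   bound, or it has none (a VLA such as int a[b]).  */
struct array_type
{
  bool has_constant_bound;
  size_t bound;              /* Meaningful only with a constant bound.  */
};

struct array_type
make_bounded (size_t n)
{
  assert (n > 0);            /* Zero-length arrays are not standard C.  */
  struct array_type t = { true, n };
  return t;
}

struct array_type
make_unbounded (void)
{
  struct array_type t = { false, 0 };
  return t;
}

/* Compatible unless both bounds are constant and differ.  */
bool
arrays_compatible (struct array_type a, struct array_type b)
{
  if (a.has_constant_bound && b.has_constant_bound)
    return a.bound == b.bound;
  return true;
}

/* An equivalence relation must be transitive; report whether the
   triple (a, b, c) is a counterexample.  */
bool
breaks_transitivity (struct array_type a, struct array_type b,
                     struct array_type c)
{
  return arrays_compatible (a, b)
         && arrays_compatible (b, c)
         && !arrays_compatible (a, c);
}
```

Calling breaks_transitivity with a 5-element type, an unbounded type and a 10-element type returns true, which is why a single canonical type per equivalence class cannot capture this notion exactly.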
Re: [PATCH] Optimize (CST1 A) == CST2 (PR tree-optimization/66299)
On Thu, May 28, 2015 at 09:48:10PM +0200, Marc Glisse wrote: Side note: if we are looking for extra patterns to simplify, llvm has an almost unlimited supply. Here are a few we don't seem to have (there are more where those came from), of course several need constraining / generalizing, it is just a list of hints I wrote for myself. (A|B) ~(AB) - A^B (A | B) ((~A) ^ B) - (A B) (A (~B)) | (A ^ B) - (A ^ B) ((B | C) A) | B - B | (A C) A | ( A ^ B) - A | B A | (~A ^ B) - A | ~B (A ^ B) ((B ^ C) ^ A) - (A ^ B) ~C (A ^ B) | ((B ^ C) ^ A) - (A ^ B) | C (A B) | (A ^ B) - (A | B) A | ~(A ^ B) - A | ~B (A B) | ((~A) ^ B) - (~A ^ B) ~(~X Y) - (X | ~Y) ~(~X s Y) - (X s Y) (A B)^(A | B) - A ^ B (A | ~B) ^ (~A | B) - A ^ B (A ~B) ^ (~A B) - A ^ B (A ^ C)^(A | B) - ((~A) B) ^ C (A B) ^ (A ^ B) - (A | B) (A ~B) ^ (~A) - ~(A B) (AB)+(A^B) - A|B (AB)+(A|B) - A+B (A|B)-(A^B) - AB ((X | Y) - X) - (~X Y) fmax(x,NaN) - x fmax(a,fmax(a,b)) - fmax(a,b) (X+2) u X - x u 256-2 (1 X) 30 - X = 4 ((X ~7) == 0) - X 8 2 * X 5 - X = 2 ((1 x)8) == 0 - x != 3 ((1 x)7) == 0 - x 2 Y - Z X - Z - Y X 3 * X == 3 * Y - X == Y A 3 == B 3 - (A ^ B) 8 (float)int = 4.4 - int = 4 x unle x - x ord x Thanks for this list. I'll look at implementing (some of) them. On Thu, 28 May 2015, Jakub Jelinek wrote: Is CST2 a multiple of CST1 the best test though? Apparently not ;). I mean say in (0x8001U x) == 0x2U 0x2U isn't a multiple of 0x8001U, yet there is only one valid value of x for which it holds (17), so we could very well optimize that to x == 17. Yeah. 
If popcount of the CST1 is 1, then multiple_of_p is supposedly sufficient (have you checked if CST1 is negative that it still works?), for others supposedly we could have a helper function that would just try in a loop all shift counts from 0 to precision - 1, and note when (CST1 b) == CST2 - if for no b, then it should fold regardless of has_single_use to false or true, if for exactly one shift count, then use a comparison against that shift count, otherwise give up? ctz(CST2)-ctz(CST1) should provide a single candidate without looping. ctz(CST1) is also relevant when CST2==0. That seems to work well so the following patch is an attempt to do it so. If CST2 is non-zero, we compute a candidate and verify whether this candidate works. If so, we know there's exactly one so we should be able to fold the shift into comparison. I've tried even negative numbers and it seems to DTRT, but I'd certainly appreciate if y'all could take a look at this. Thanks. Bootstrapped/regtested on x86_64-linux, ok for trunk? 2015-06-08 Marek Polacek pola...@redhat.com PR tree-optimization/66299 * match.pd ((CST1 A) == CST2 - A == ctz (CST2) - ctz (CST1) ((CST1 A) != CST2 - A != ctz (CST2) - ctz (CST1)): New patterns. * gcc.dg/pr66299-1.c: New test. * gcc.dg/pr66299-2.c: New test. diff --git gcc/match.pd gcc/match.pd index abd7851..32e913c 100644 --- gcc/match.pd +++ gcc/match.pd @@ -676,6 +676,18 @@ along with GCC; see the file COPYING3. If not see (cmp (bit_and (lshift integer_onep @0) integer_onep) integer_zerop) (icmp @0 { build_zero_cst (TREE_TYPE (@0)); }))) +/* (CST1 A) == CST2 - A == ctz (CST2) - ctz (CST1) + (CST1 A) != CST2 - A != ctz (CST2) - ctz (CST1) + if CST2 != 0. */ +(for cmp (ne eq) + (simplify + (cmp (lshift INTEGER_CST@0 @1) INTEGER_CST@2) + (with { + unsigned int cand = wi::ctz (@2) - wi::ctz (@0); } + (if (!integer_zerop (@2) +wi::eq_p (wi::lshift (@0, cand), @2)) + (cmp @1 { build_int_cst (TREE_TYPE (@1), cand); }) + /* Simplifications of conversions. 
*/ /* Basic strip-useless-type-conversions / strip_nops. */ diff --git gcc/testsuite/gcc.dg/pr66299-1.c gcc/testsuite/gcc.dg/pr66299-1.c index e69de29..e7b978d 100644 --- gcc/testsuite/gcc.dg/pr66299-1.c +++ gcc/testsuite/gcc.dg/pr66299-1.c @@ -0,0 +1,92 @@ +/* PR tree-optimization/66299 */ +/* { dg-do run } */ +/* { dg-options -fdump-tree-original } */ + +void +test1 (int x) +{ + if ((0 x) != 0 + || (1 x) != 2 + || (2 x) != 4 + || (3 x) != 6 + || (4 x) != 8 + || (5 x) != 10 + || (6 x) != 12 + || (7 x) != 14 + || (8 x) != 16 + || (9 x) != 18 + || (10 x) != 20) +__builtin_abort (); +} + +void +test2 (int x) +{ + if (!((0 x) == 0 + (1 x) == 4 + (2 x) == 8 + (3 x) == 12 + (4 x) == 16 + (5 x) == 20 + (6 x) == 24 + (7 x) == 28 + (8 x) == 32 + (9 x) == 36 +(10 x) == 40)) +__builtin_abort (); +} + +void +test3 (unsigned int x) +{ + if ((0U x) != 0U + || (1U x) != 16U + || (2U x) != 32U + || (3U x) != 48U + || (4U x) != 64U + || (5U x) != 80U + || (6U x) != 96U + || (7U x) != 112U + || (8U x) != 128U + || (9U x) != 144U + || (10U x)
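The candidate computation proposed in this patch can be tried outside the compiler. The sketch below is not the match.pd code; it is a standalone 32-bit model with made-up function names, and it adds a guard for ctz (CST2) < ctz (CST1) that the sketch needs because it works on plain unsigned values. The idea is that a left shift by x adds x to the trailing-zero count of a nonzero result, so only x = ctz (CST2) - ctz (CST1) can possibly satisfy (CST1 << x) == CST2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Count trailing zero bits of a nonzero 32-bit value.  */
static unsigned
ctz32 (uint32_t v)
{
  unsigned n = 0;
  assert (v != 0);
  while ((v & 1u) == 0)
    {
      v >>= 1;
      n++;
    }
  return n;
}

/* For nonzero CST1 and CST2, find the single shift count X in
   [0, 31] with (CST1 << X) == CST2, if there is one.  On success
   store it in *SHIFT and return true.  Return false if no shift
   count works or if either constant is zero (the CST2 == 0 case is
   excluded here, as in the patch).  */
bool
single_shift_candidate (uint32_t cst1, uint32_t cst2, unsigned *shift)
{
  if (cst1 == 0 || cst2 == 0)
    return false;

  unsigned c1 = ctz32 (cst1);
  unsigned c2 = ctz32 (cst2);
  if (c2 < c1)
    return false;            /* A left shift cannot remove zeros.  */

  unsigned cand = c2 - c1;   /* At most 31, so the shift is defined.  */
  if ((uint32_t) (cst1 << cand) != cst2)
    return false;            /* The only candidate does not verify.  */

  *shift = cand;
  return true;
}
```

For instance, with the constants 0x80000001 and 0x20000 (chosen for this sketch) the function reports the shift count 17, although the second constant is not a multiple of the first; bits shifted out at the top are simply dropped by the 32-bit arithmetic.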
[patch] libstdc++/66030 fix codecvt exports for mingw32
The linker script assumes that std::mbstate_t has the name __mbstate_t for linkage purposes, but that's not necessarily true. For mingw32 it's just a typedef for int, so the patterns don't match.

This adds a new mingw32-specific pattern for codecvt_byname's constructors and destructors, and relaxes the patterns for codecvt<charNN_t, char, mbstate_t> so they match __mbstate_t or int.

Tested x86_64-linux and powerpc64le-linux, committed to trunk. I plan to commit this to trunk and gcc-5-branch soon.

commit dffce5e2b48ff19c4ec4de5d7ca934c15225b940
Author: Jonathan Wakely jwak...@redhat.com
Date: Mon Jun 1 17:31:46 2015 +0100

    PR libstdc++/66030
    * config/abi/pre/gnu.ver: Export codecvt_byname and codecvt symbols
    for mingw32.

diff --git a/libstdc++-v3/config/abi/pre/gnu.ver b/libstdc++-v3/config/abi/pre/gnu.ver
index 2da04e4..d42cd37 100644
--- a/libstdc++-v3/config/abi/pre/gnu.ver
+++ b/libstdc++-v3/config/abi/pre/gnu.ver
@@ -542,6 +542,9 @@ GLIBCXX_3.4 {
     # std::codecvt_byname
     _ZNSt14codecvt_bynameI[cw]c11__mbstate_tEC[12]EPKc[jmy];
     _ZNSt14codecvt_bynameI[cw]c11__mbstate_tED*;
+#if defined (_WIN32) && !defined (__CYGWIN__)
+    _ZNSt14codecvt_bynameI[cw]ciE[CD]*;
+#endif

     # std::collate
     _ZNSt7collateI[cw]*;
@@ -1821,9 +1824,9 @@ GLIBCXX_3.4.21 {
     _ZNKSt8time_getI[cw]St19istreambuf_iteratorI[cw]St11char_traitsI[cw]EEE6do_getES3_S3_RSt8ios_baseRSt12_Ios_IostateP2tmcc;

     # codecvt<char16_t, char, mbstate_t>, codecvt<char32_t, char, mbstate_t>
-    _ZNKSt7codecvtID[is]c11__mbstate_t*;
-    _ZNSt7codecvtID[is]c11__mbstate_t*;
-    _ZT[ISV]St7codecvtID[is]c11__mbstate_tE;
+    _ZNKSt7codecvtID[is]c*;
+    _ZNSt7codecvtID[is]c*;
+    _ZT[ISV]St7codecvtID[is]c*E;

     extern "C++" {
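The effect of relaxing the patterns can be checked with an ordinary glob matcher. The sketch below uses POSIX fnmatch as a stand-in for the linker's version-script matching, which is an assumption made for illustration; the two mangled names are also illustrative (a codecvt<char16_t, char, ...> destructor with the state type spelled as __mbstate_t and as int), not taken from an actual symbol dump.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stdbool.h>

/* Patterns before and after the patch.  */
static const char OLD_PATTERN[] = "_ZNSt7codecvtID[is]c11__mbstate_t*";
static const char NEW_PATTERN[] = "_ZNSt7codecvtID[is]c*";

/* Illustrative symbols: state type mangled as __mbstate_t, and as
   int (the mingw32 case described above).  */
static const char SYM_MBSTATE[] = "_ZNSt7codecvtIDsc11__mbstate_tED1Ev";
static const char SYM_INT[] = "_ZNSt7codecvtIDsciED1Ev";

/* Return true if SYMBOL matches the glob PATTERN.  */
bool
symbol_matches (const char *pattern, const char *symbol)
{
  assert (pattern != 0 && symbol != 0);
  return fnmatch (pattern, symbol, 0) == 0;
}
```

Under these assumptions the old pattern matches only the __mbstate_t spelling, while the relaxed pattern matches both, which is the behaviour the patch description asks for.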
Fortran's C_CHAR type
Hi,

to further add to the topics to discuss, I noticed that the Fortran FE seems to be quite ambivalent about the C_CHAR type:

[jh@gcc2-power8 gcc]$ cat ../b.f90
! This testcase will abort if C_CHAR types are not interoperable
module lto_type_merge_test
  use, intrinsic :: iso_c_binding
  implicit none
contains
  function types_test1(V) bind(c)
    USE, INTRINSIC :: ISO_C_BINDING
    CHARACTER(C_CHAR) :: types_test1
    CHARACTER(C_CHAR), VALUE :: V
    types_test1 = V
  end function types_test1
end module lto_type_merge_test

[jh@gcc2-power8 gcc]$ cat ../a.c
extern unsigned char types_test1 (char v);
int main (void)
{
  if (types_test1 ('a') != 'a')
    __builtin_abort ();
  return 0;
}

As far as my fortran-fu goes, I think this testcase is correct. The Fortran FE however builds types_test1 as a function returning char but taking an array of size 1 as a parameter. I think it is just a coincidence that those are passed the same way on x86-64, and that may not be the case for other targets.

If the testcase seems valid, I would like to commit it to the non-LTO testsuite so we get this tested. With LTO we get a bogus type mismatch warning and types_test1 won't be inlined for this reason.

Honza
[Ping] [C++ Patch] PR 65815
Hi,

gently pinging this...

On 05/22/2015 08:46 PM, Paolo Carlini wrote:

Hi,

surprisingly, for NSDMIs we don't use reshape_init and we end up rejecting simple testcases like the one below. It seems clear to me that we should - consistently with the comment preceding digest_init too - but I'm not 100% sure that digest_nsdmi_init is the best place for that. Anyway, the below passes testing on x86_64-linux.

https://gcc.gnu.org/ml/gcc-patches/2015-05/msg02158.html

Thanks!
Paolo.
Re: [gomp4] declare directive [2/5]
diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c index 261a12d..15da51e 100644 --- a/gcc/cp/decl.c +++ b/gcc/cp/decl.c @@ -78,6 +78,7 @@ along with GCC; see the file COPYING3. If not see #include cilk.h #include wide-int.h #include builtins.h +#include gomp-constants.h /* Possible cases of bad specifiers type used by bad_specifiers. */ enum bad_spec_place { @@ -14113,6 +14114,314 @@ maybe_save_function_definition (tree fun) register_constexpr_fundef (fun, DECL_SAVED_TREE (fun)); } +static tree +check_oacc_vars_1 (tree *tp, int *, void *l) +{ + if (TREE_CODE (*tp) == VAR_DECL TREE_PUBLIC (*tp)) +{ + location_t loc = DECL_SOURCE_LOCATION (*tp); + tree attrs; + attrs = lookup_attribute (oacc declare, DECL_ATTRIBUTES (*tp)); + if (attrs) + { + tree t; + + for (t = TREE_VALUE (attrs); t; t = TREE_CHAIN (t)) + { + loc = EXPR_LOCATION ((tree) l); + + if (OMP_CLAUSE_MAP_KIND (TREE_VALUE (t)) == GOMP_MAP_LINK) + { + error_at (loc, %link% clause cannot be used with %qE, + *tp); + break; + } + } + } + else + error_at (loc, no %#pragma acc declare% for %qE, *tp); +} + return NULL_TREE; +} + +static tree +check_oacc_vars (tree *tp, int *, void *) +{ + if (TREE_CODE (*tp) == STATEMENT_LIST) +{ + tree_stmt_iterator i; + + for (i = tsi_start (*tp); !tsi_end_p (i); tsi_next (i)) + { + tree t = tsi_stmt (i); + walk_tree_without_duplicates (t, check_oacc_vars_1, t); + } +} + + return NULL_TREE; +} + +static struct oacc_return +{ + tree_stmt_iterator iter; + tree stmt; + int op; + struct oacc_return *next; +} *oacc_returns; + +static tree +find_oacc_return (tree *tp, int *, void *) +{ + if (TREE_CODE (*tp) == STATEMENT_LIST) +{ + tree_stmt_iterator i; + + for (i = tsi_start (*tp); !tsi_end_p (i); tsi_next (i)) + { + tree t; + struct oacc_return *r; + + t = tsi_stmt (i); + + if (TREE_CODE (t) == RETURN_EXPR) + { + r = XNEW (struct oacc_return); + r-iter = i; + r-stmt = NULL_TREE; + r-op = 1; + r-next = NULL; + + if (oacc_returns) + r-next = oacc_returns; + + oacc_returns = r; + } + else if 
(TREE_CODE (t) == IF_STMT) + { + bool op1, op2; + tree op; + + op1 = op2 = false; + + op = TREE_OPERAND (t, 1); + op1 = (op TREE_CODE (op) == RETURN_EXPR); + + op = TREE_OPERAND (t, 2); + op2 = (op TREE_CODE (op) == RETURN_EXPR); + + if (op1 || op2) + { + r = XNEW (struct oacc_return); + r-stmt = t; + r-op = op1 ? 1 : 2; + r-next = NULL; + + if (oacc_returns) + r-next = oacc_returns; + + oacc_returns = r; + } + } + } +} + + return NULL_TREE; +} + +static void +finish_oacc_declare (tree fndecl, tree decls) +{ + tree t, stmt, list, c, ret_clauses, clauses; + location_t loc; + tree_stmt_iterator i; + + list = cur_stmt_list; + + if (lookup_attribute (oacc function, DECL_ATTRIBUTES (fndecl))) +{ + if (lookup_attribute (oacc declare, DECL_ATTRIBUTES (fndecl))) + { + location_t loc = DECL_SOURCE_LOCATION (fndecl); + error_at (loc, %#pragma acc declare% not allowed in %qE, fndecl); + } + + walk_tree_without_duplicates (list, check_oacc_vars, NULL); + return; +} + + if (!decls) +return; + + walk_tree_without_duplicates (list, find_oacc_return, NULL); + + clauses = NULL_TREE; + + for (t = decls; t; t = TREE_CHAIN (t)) +{ + c = TREE_VALUE (TREE_VALUE (t)); + + if (clauses) + OMP_CLAUSE_CHAIN (c) = clauses; + else + loc = OMP_CLAUSE_LOCATION (c); + + clauses = c; +} + + ret_clauses = NULL_TREE; + + for (c = clauses; c; c = OMP_CLAUSE_CHAIN (c)) +{ + bool ret = false; + HOST_WIDE_INT kind, new_op; + + kind = OMP_CLAUSE_MAP_KIND (c); + + switch (kind) + { + case GOMP_MAP_ALLOC: + case GOMP_MAP_FORCE_ALLOC: + case GOMP_MAP_FORCE_TO: + new_op = GOMP_MAP_FORCE_DEALLOC; + ret = true; + break; + + case GOMP_MAP_FORCE_FROM: + OMP_CLAUSE_SET_MAP_KIND (c, GOMP_MAP_FORCE_ALLOC); + new_op = GOMP_MAP_FORCE_FROM; + ret = true; + break; + + case GOMP_MAP_FORCE_TOFROM: + OMP_CLAUSE_SET_MAP_KIND (c, GOMP_MAP_FORCE_TO); + new_op = GOMP_MAP_FORCE_FROM; + ret = true; + break; + + case GOMP_MAP_FROM: + OMP_CLAUSE_SET_MAP_KIND (c, GOMP_MAP_FORCE_ALLOC); + new_op = GOMP_MAP_FROM; + ret = true; + 
break; + + case GOMP_MAP_TOFROM: + OMP_CLAUSE_SET_MAP_KIND (c, GOMP_MAP_TO); + new_op = GOMP_MAP_FROM; + ret = true; + break; + + case GOMP_MAP_DEVICE_RESIDENT: + case GOMP_MAP_FORCE_DEVICEPTR: + case GOMP_MAP_FORCE_PRESENT: + case GOMP_MAP_POINTER: + case GOMP_MAP_TO: + break; + + case GOMP_MAP_LINK: + continue; + + default: + gcc_unreachable (); + break; + } + + if (ret) + { + t = copy_node (c); + + OMP_CLAUSE_SET_MAP_KIND (t, new_op); + + if (ret_clauses) + OMP_CLAUSE_CHAIN
[gomp4] declare directive [5/5]
diff --git a/libgomp/libgomp.map b/libgomp/libgomp.map index fe38dc6..663c27c 100644 --- a/libgomp/libgomp.map +++ b/libgomp/libgomp.map @@ -318,6 +318,7 @@ GOACC_2.0 { global: GOACC_data_end; GOACC_data_start; + GOACC_declare; GOACC_enter_exit_data; GOACC_parallel; GOACC_update; @@ -331,6 +332,7 @@ GOACC_2.0.GOMP_4_BRANCH { GOACC_deviceptr; GOACC_get_ganglocal_ptr; GOACC_kernels; + GOACC_register_static; } GOACC_2.0; GOMP_PLUGIN_1.0 { diff --git a/libgomp/oacc-init.c b/libgomp/oacc-init.c index 9f24dc3..e772f48 100644 --- a/libgomp/oacc-init.c +++ b/libgomp/oacc-init.c @@ -205,6 +205,8 @@ acc_shutdown_1 (acc_device_t d) if (!base_dev) gomp_fatal (device %s not supported, name_of_acc_device_t (d)); + goacc_deallocate_static (d); + gomp_mutex_lock (goacc_thread_lock); /* Free target-specific TLS data and close all devices. */ @@ -373,7 +375,9 @@ goacc_attach_host_thread_to_device (int ord) void acc_init (acc_device_t d) { - if (!cached_base_dev) + bool init = !cached_base_dev; + + if (init) gomp_init_targets_once (); gomp_mutex_lock (acc_device_lock); @@ -381,6 +385,9 @@ acc_init (acc_device_t d) cached_base_dev = acc_init_1 (d); gomp_mutex_unlock (acc_device_lock); + + if (init) +goacc_allocate_static (d); goacc_attach_host_thread_to_device (-1); } diff --git a/libgomp/oacc-int.h b/libgomp/oacc-int.h index 0ace737..8f4938e 100644 --- a/libgomp/oacc-int.h +++ b/libgomp/oacc-int.h @@ -98,6 +98,9 @@ void goacc_save_and_set_bind (acc_device_t); void goacc_restore_bind (void); void goacc_lazy_initialize (void); +void goacc_allocate_static (acc_device_t); +void goacc_deallocate_static (acc_device_t); + #ifdef HAVE_ATTRIBUTE_VISIBILITY # pragma GCC visibility pop #endif diff --git a/libgomp/oacc-parallel.c b/libgomp/oacc-parallel.c index 513d0bc..70758bc 100644 --- a/libgomp/oacc-parallel.c +++ b/libgomp/oacc-parallel.c @@ -109,6 +109,68 @@ alloc_ganglocal_addrs (size_t mapnum, void **hostaddrs, size_t *sizes, } } +static struct oacc_static +{ + void *addr; + size_t size; 
+ unsigned short mask; + bool free; + struct oacc_static *next; +} *oacc_statics; + +static bool alloc_done = false; + +void +goacc_allocate_static (acc_device_t d) +{ + struct oacc_static *s; + + if (alloc_done) +assert (0); + + for (s = oacc_statics; s; s = s-next) +{ + void *d; + + switch (s-mask) + { + case GOMP_MAP_FORCE_ALLOC: + break; + + case GOMP_MAP_FORCE_TO: + d = acc_deviceptr (s-addr); + acc_memcpy_to_device (d, s-addr, s-size); + break; + + case GOMP_MAP_FORCE_DEVICEPTR: + case GOMP_MAP_DEVICE_RESIDENT: + case GOMP_MAP_LINK: + break; + + default: + assert (0); + break; + } +} + + alloc_done = true; +} + +void +goacc_deallocate_static (acc_device_t d) +{ + struct oacc_static *s; + unsigned short mask = GOMP_MAP_FORCE_DEALLOC; + + if (!alloc_done) +return; + + for (s = oacc_statics; s; s = s-next) +GOACC_enter_exit_data (d, 1, s-addr, s-size, mask, 0, 0); + + alloc_done = false; +} + static void goacc_wait (int async, int num_waits, va_list ap); void @@ -592,3 +654,82 @@ GOACC_get_thread_num (int gang, int worker, int vector) { return 0; } + +void +GOACC_register_static (void *addr, int size, unsigned int mask) +{ + struct oacc_static *s; + + s = (struct oacc_static *) malloc (sizeof (struct oacc_static)); + s-addr = addr; + s-size = (size_t) size; + s-mask = mask; + s-free = false; + s-next = NULL; + + if (oacc_statics) +s-next = oacc_statics; + + oacc_statics = s; +} + +#include stdio.h + +void +GOACC_declare (int device, size_t mapnum, + void **hostaddrs, size_t *sizes, unsigned short *kinds) +{ + int i; + + for (i = 0; i mapnum; i++) +{ + unsigned char kind = kinds[i] 0xff; + + if (kind == GOMP_MAP_POINTER || kind == GOMP_MAP_TO_PSET) + continue; + + switch (kind) + { + case GOMP_MAP_FORCE_ALLOC: + case GOMP_MAP_FORCE_DEALLOC: + case GOMP_MAP_FORCE_FROM: + case GOMP_MAP_FORCE_TO: + case GOMP_MAP_POINTER: + GOACC_enter_exit_data (device, 1, hostaddrs[i], sizes[i], + kinds[i], 0, 0); + break; + + case GOMP_MAP_FORCE_DEVICEPTR: + break; + + case 
GOMP_MAP_ALLOC: + if (!acc_is_present (hostaddrs[i], sizes[i])) + { + GOACC_enter_exit_data (device, 1, hostaddrs[i], sizes[i], + kinds[i], 0, 0); + } + break; + + case GOMP_MAP_TO: + GOACC_enter_exit_data (device, 1, hostaddrs[i], sizes[i], + kinds[i], 0, 0); + + break; + + case GOMP_MAP_FROM: + kinds[i] = GOMP_MAP_FORCE_FROM; + GOACC_enter_exit_data (device, 1, hostaddrs[i], sizes[i], + kinds[i], 0, 0); + break; + + case GOMP_MAP_FORCE_PRESENT: + if (!acc_is_present (hostaddrs[i], sizes[i])) + gomp_fatal ([%p,%zd] is not mapped, hostaddrs[i], sizes[i]); + break; + + default: + assert (0); + break; + } +} +} diff --git
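The registration scheme used by GOACC_register_static in this patch is a prepend onto a singly linked list that is walked later at init and shutdown time. A reduced sketch of that pattern follows. The struct and function names are hypothetical, the list head is passed explicitly instead of being a file-scope variable, and a malloc failure check is added that the sketch needs in order to be safe on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* One registered static object (reduced to address and size).  */
struct static_entry
{
  void *addr;
  size_t size;
  struct static_entry *next;
};

/* Prepend a new entry and return the new head.  On allocation
   failure the list is returned unchanged.  */
struct static_entry *
register_entry (struct static_entry *head, void *addr, size_t size)
{
  assert (addr != NULL);
  struct static_entry *e = malloc (sizeof *e);
  if (e == NULL)
    return head;
  e->addr = addr;
  e->size = size;
  e->next = head;            /* Works for an empty list too.  */
  return e;
}

/* Walk the list, as the allocate/deallocate routines would.  */
size_t
count_entries (const struct static_entry *head)
{
  size_t n = 0;
  for (; head != NULL; head = head->next)
    n++;
  return n;
}

/* Release all entries.  */
void
free_entries (struct static_entry *head)
{
  while (head != NULL)
    {
      struct static_entry *next = head->next;
      free (head);
      head = next;
    }
}
```

Because entries are prepended, a later walk visits them in reverse registration order; the sketch does not depend on that order.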
Re: [v3 PATCH] Implement N4387 and LWG 2367
* include/std/tuple

In the hunk below, should the sizes_match template parameter be privatized (since it isn't part of the public interface)? Or perhaps even removed if it's not used?

Martin

@@ -457,6 +457,73 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
     }
   };

+
+  // Concept utility functions
+  template<bool sizes_match, typename... _Elements>
+  struct _TC
Re: Fix more of C/fortran canonical type issues
On Mon, 8 Jun 2015, Jan Hubicka wrote:
On Mon, 8 Jun 2015, Richard Biener wrote:
On Mon, 8 Jun 2015, Joseph Myers wrote:
On Mon, 8 Jun 2015, Richard Biener wrote:

I'm not sure the C standard mandates compatibility between struct { int i; } and struct { unsigned i; } for purposes of TBAA. Joseph?

I don't think they are necessarily compatible for TBAA.

OK, but since int and unsigned may alias, reading either struct's element via a pointer to int or a pointer to unsigned must be supported?

Yes. The questionable case would be taking an object of one of those structure types, casting a pointer to it to point to the other structure type and then dereferencing.

Are struct { int i; } and struct { unsigned i; } compatible when one is defined in one unit and the other in another?

In any case, I suppose if int and unsigned int pointers can be used interchangeably, we want to ignore TYPE_UNSIGNED for purposes of canonical type calculation for LTO. So is the second variant of the patch OK with a comment update that this is also required by C?

I think we should instead work towards eliminating the get_alias_set langhook first. The LTO langhook variant contains the same handling, btw, so just inline that into get_alias_set and see what remains?

Richard.

Honza

--
Joseph S. Myers jos...@codesourcery.com

--
Richard Biener rguent...@suse.de
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Dilip Upmanyu, Graham Norton, HRB 21284 (AG Nuernberg)
Re: [gomp4, fortran] Patch to fix continuation checks of OpenACC and OpenMP directives
On 06/07/2015 02:05 PM, Ilmir Usmanov wrote: Fixed fortran mail-list address. Sorry for inconvenience. 08.06.2015, 00:01, Ilmir Usmanov m...@ilmir.us: Hi Cesar! This patch fixes checks of OpenMP and OpenACC continuations in case if someone mixes them (i.e. continues OpenMP directive with !$ACC sentinel or vice versa). OK for gomp branch? Thanks for working on this. Does this fix PR63858 by any chance? two minor nits... 0001-Fix-mix-of-OpenACC-and-OpenMP-sentinels-in-continuat.patch From 5492bf5bc991b6924f5e3b35c11eeaed745df073 Mon Sep 17 00:00:00 2001 From: Ilmir Usmanov i.usma...@samsung.com Date: Sun, 7 Jun 2015 23:55:22 +0300 Subject: [PATCH] Fix mix of OpenACC and OpenMP sentinels in continuation --- gcc/fortran/ChangeLog | 5 + Use ChangeLog.gomp for gomp-4_0-branch. gcc/fortran/scanner.c | 28 gcc/testsuite/ChangeLog | 5 + gcc/testsuite/gfortran.dg/goacc/omp.f95 | 8 4 files changed, 42 insertions(+), 4 deletions(-) diff --git a/gcc/fortran/ChangeLog b/gcc/fortran/ChangeLog index 67f9e09..f61e0e9 100644 --- a/gcc/fortran/ChangeLog +++ b/gcc/fortran/ChangeLog @@ -1,3 +1,8 @@ +2015-06-07 Ilmir Usmanov m...@ilmir.us + + * scanner.c (gfc_next_char_literal): Fix mix of OpenACC and OpenMP + sentinels in continuation. 
+ 2015-05-05 David Malcolm dmalc...@redhat.com * expr.c (check_inquiry): Fix indentation so that it reflects the diff --git a/gcc/fortran/scanner.c b/gcc/fortran/scanner.c index f0e6404..5af4eea 100644 --- a/gcc/fortran/scanner.c +++ b/gcc/fortran/scanner.c @@ -1331,7 +1331,7 @@ restart: continue_line = gfc_linebuf_linenum (gfc_current_locus.lb); if (flag_openmp) - if (prev_openmp_flag != openmp_flag) + if (prev_openmp_flag != openmp_flag && !openacc_flag) { gfc_current_locus = old_loc; openmp_flag = prev_openmp_flag; @@ -1340,7 +1340,7 @@ restart: } if (flag_openacc) - if (prev_openacc_flag != openacc_flag) + if (prev_openacc_flag != openacc_flag && !openmp_flag) { gfc_current_locus = old_loc; openacc_flag = prev_openacc_flag; @@ -1359,7 +1359,7 @@ restart: while (gfc_is_whitespace (c)) c = next_char (); - if (openmp_flag) + if (openmp_flag && !openacc_flag) { for (i = 0; i < 5; i++, c = next_char ()) { @@ -1370,7 +1370,7 @@ restart: while (gfc_is_whitespace (c)) c = next_char (); } - if (openacc_flag) + if (openacc_flag && !openmp_flag) { for (i = 0; i < 5; i++, c = next_char ()) { @@ -1382,6 +1382,26 @@ restart: c = next_char (); } + /* In case we have an OpenMP directive continued by OpenACC + sentinel, or vice versa, we get both openmp_flag and + openacc_flag on. */ + + if (openacc_flag && openmp_flag) + { + int is_openmp = 0; + for (i = 0; i < 5; i++, c = next_char ()) + { + if (gfc_wide_tolower (c) != (unsigned char) "!$acc"[i]) + is_openmp = 1; + if (i == 4) + old_loc = gfc_current_locus; + } + gfc_error ("Wrong %s continuation at %C: expected %s, got %s", + is_openmp ? "OpenACC" : "OpenMP", + is_openmp ? "!$ACC" : "!$OMP", + is_openmp ? "!$OMP" : "!$ACC"); I think it's better for the translation project if you made this a complete string. So maybe change this line into gfc_error (is_openmp ? "Wrong continuation at %C: expected !$ACC, got !$OMP" : "Wrong continuation at %C: expected !$OMP, got !$ACC"); Other than that, it looks fine. Thanks, Cesar
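To make the translation point concrete, here is a small standalone C sketch, not gfortran code. The helper name continuation_error_message is hypothetical; it only shows the shape of the suggestion, where each branch yields one complete sentence instead of a sentence assembled from %s fragments.

```c
#include <assert.h>

/* Hypothetical helper, not part of gfortran: selects one of two complete
   diagnostic strings, so a translator sees whole sentences rather than a
   template with three substituted fragments.  GOT_OPENMP is nonzero when
   an !$OMP sentinel was found where !$ACC was expected.  */
const char *continuation_error_message(int got_openmp)
{
    assert(got_openmp == 0 || got_openmp == 1);
    return got_openmp
        ? "Wrong continuation at %C: expected !$ACC, got !$OMP"
        : "Wrong continuation at %C: expected !$OMP, got !$ACC";
}
```

With the fragment-based form, a translator has to guess how "Wrong %s continuation" reads once a word is substituted; with complete strings, each message can be translated independently.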
[gomp4] declare directive [1/5]
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c index f508b91..83c1432 100644 --- a/gcc/c/c-parser.c +++ b/gcc/c/c-parser.c @@ -82,6 +82,7 @@ along with GCC; see the file COPYING3. If not see #include "omp-low.h" #include "builtins.h" #include "gomp-constants.h" +#include "tree-iterator.h" /* Initialization routine for this file. */ @@ -1472,6 +1473,316 @@ c_parser_external_declaration (c_parser *parser) } } +static tree +check_oacc_vars_1 (tree *tp, int *, void *l) +{ + if (TREE_CODE (*tp) == VAR_DECL && TREE_PUBLIC (*tp)) +{ + location_t loc = DECL_SOURCE_LOCATION (*tp); + tree attrs; + attrs = lookup_attribute ("oacc declare", DECL_ATTRIBUTES (*tp)); + if (attrs) + { + tree t; + + for (t = TREE_VALUE (attrs); t; t = TREE_CHAIN (t)) + { + loc = EXPR_LOCATION ((tree) l); + + if (OMP_CLAUSE_MAP_KIND (TREE_VALUE (t)) == GOMP_MAP_LINK) + { + error_at (loc, "%<link%> clause cannot be used with %qE", + *tp); + break; + } + } + } + else + error_at (loc, "no %<#pragma acc declare%> for %qE", *tp); +} + return NULL_TREE; +} + +static tree +check_oacc_vars (tree *tp, int *, void *) +{ + if (TREE_CODE (*tp) == STATEMENT_LIST) +{ + tree_stmt_iterator i; + + for (i = tsi_start (*tp); !tsi_end_p (i); tsi_next (&i)) + { + tree t = tsi_stmt (i); + walk_tree_without_duplicates (&t, check_oacc_vars_1, t); + } +} + + return NULL_TREE; +} + +static struct oacc_return +{ + tree_stmt_iterator iter; + tree stmt; + int op; + struct oacc_return *next; +} *oacc_returns; + +static tree +find_oacc_return (tree *tp, int *, void *) +{ + if (TREE_CODE (*tp) == STATEMENT_LIST) +{ + tree_stmt_iterator i; + + for (i = tsi_start (*tp); !tsi_end_p (i); tsi_next (&i)) + { + tree t; + struct oacc_return *r; + + t = tsi_stmt (i); + + if (TREE_CODE (t) == RETURN_EXPR) + { + r = XNEW (struct oacc_return); + r->iter = i; + r->stmt = NULL_TREE; + r->op = 1; + r->next = NULL; + + if (oacc_returns) + r->next = oacc_returns; + + oacc_returns = r; + } + else if (TREE_CODE (t) == COND_EXPR) + { + bool op1, op2; + tree op; + + op1 = op2 = 
false; + + op = TREE_OPERAND (t, 1); + op1 = (op && TREE_CODE (op) == RETURN_EXPR); + + op = TREE_OPERAND (t, 2); + op2 = (op && TREE_CODE (op) == RETURN_EXPR); + + if (op1 || op2) + { + r = XNEW (struct oacc_return); + r->stmt = t; + r->op = op1 ? 1 : 2; + r->next = NULL; + + if (oacc_returns) + r->next = oacc_returns; + + oacc_returns = r; + } + } + } +} + + return NULL_TREE; +} + +static void +finish_oacc_declare (tree fnbody, tree decls) +{ + tree t, stmt, body, c, ret_clauses, clauses; + location_t loc; + tree_stmt_iterator i; + tree fndecl = current_function_decl; + + if (lookup_attribute ("oacc function", DECL_ATTRIBUTES (fndecl))) +{ + if (lookup_attribute ("oacc declare", DECL_ATTRIBUTES (fndecl))) + { + location_t loc = DECL_SOURCE_LOCATION (fndecl); + error_at (loc, "%<#pragma acc declare%> not allowed in %qE", fndecl); + } + + walk_tree_without_duplicates (&fnbody, check_oacc_vars, NULL); + return; +} + + if (!decls) +return; + + body = BIND_EXPR_BODY (fnbody); + + if (TREE_CODE (body) != STATEMENT_LIST) +{ + tree list; + + list = alloc_stmt_list (); + append_to_statement_list (body, &list); + BIND_EXPR_BODY (fnbody) = list; + body = list; +} + + walk_tree_without_duplicates (&body, find_oacc_return, NULL); + + clauses = NULL_TREE; + + for (t = decls; t; t = TREE_CHAIN (t)) +{ + c = TREE_VALUE (TREE_VALUE (t)); + + if (clauses) + OMP_CLAUSE_CHAIN (c) = clauses; + else + loc = OMP_CLAUSE_LOCATION (c); + + clauses = c; +} + + ret_clauses = NULL_TREE; + + for (c = clauses; c; c = OMP_CLAUSE_CHAIN (c)) +{ + bool ret = false; + HOST_WIDE_INT kind, new_op; + + kind = OMP_CLAUSE_MAP_KIND (c); + + switch (kind) + { + case GOMP_MAP_ALLOC: + case GOMP_MAP_FORCE_ALLOC: + case GOMP_MAP_FORCE_TO: + new_op = GOMP_MAP_FORCE_DEALLOC; + ret = true; + break; + + case GOMP_MAP_FORCE_FROM: + OMP_CLAUSE_SET_MAP_KIND (c, GOMP_MAP_FORCE_ALLOC); + new_op = GOMP_MAP_FORCE_FROM; + ret = true; + break; + + case GOMP_MAP_FORCE_TOFROM: + OMP_CLAUSE_SET_MAP_KIND (c, GOMP_MAP_FORCE_TO); + new_op = 
GOMP_MAP_FORCE_FROM; + ret = true; + break; + + case GOMP_MAP_FROM: + OMP_CLAUSE_SET_MAP_KIND (c, GOMP_MAP_FORCE_ALLOC); + new_op = GOMP_MAP_FROM; + ret = true; + break; + + case GOMP_MAP_TOFROM: + OMP_CLAUSE_SET_MAP_KIND (c, GOMP_MAP_TO); + new_op = GOMP_MAP_FROM; + ret = true; + break; + + case GOMP_MAP_DEVICE_RESIDENT: + case GOMP_MAP_FORCE_DEVICEPTR: + case GOMP_MAP_FORCE_PRESENT: + case GOMP_MAP_LINK: + case GOMP_MAP_POINTER: + case GOMP_MAP_TO: + break; + +
[gomp4] declare directive [3/5]
diff --git a/gcc/fortran/f95-lang.c b/gcc/fortran/f95-lang.c index 5003581..a889342 100644 --- a/gcc/fortran/f95-lang.c +++ b/gcc/fortran/f95-lang.c @@ -119,6 +119,8 @@ static const struct attribute_spec gfc_attribute_table[] = affects_type_identity } */ { "omp declare target", 0, 0, true, false, false, gfc_handle_omp_declare_target_attribute, false }, + { "oacc declare", 0, 0, true, false, false, +gfc_handle_omp_declare_target_attribute, false }, { "oacc function", 0, 0, true, false, false, gfc_handle_omp_declare_target_attribute, false }, { NULL, 0, 0, false, false, false, NULL, false } diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h index e73c269..a90b0f8 100644 --- a/gcc/fortran/gfortran.h +++ b/gcc/fortran/gfortran.h @@ -861,6 +861,13 @@ typedef struct /* Mentioned in OMP DECLARE TARGET. */ unsigned omp_declare_target:1; + /* Mentioned in OACC DECLARE. */ + unsigned oacc_declare_create:1; + unsigned oacc_declare_copyin:1; + unsigned oacc_declare_deviceptr:1; + unsigned oacc_declare_device_resident:1; + unsigned oacc_declare_link:1; + /* This is an OpenACC acclerator function. 
*/ unsigned oacc_function:1; @@ -1132,6 +1139,8 @@ typedef enum OMP_MAP_FORCE_TOFROM, OMP_MAP_FORCE_PRESENT, OMP_MAP_FORCE_DEVICEPTR, + OMP_MAP_DEVICE_RESIDENT, + OMP_MAP_LINK, OMP_MAP_FORCE_TO_GANGLOCAL } gfc_omp_map_op; @@ -1174,6 +1183,7 @@ enum OMP_LIST_FROM, OMP_LIST_REDUCTION, OMP_LIST_DEVICE_RESIDENT, + OMP_LIST_LINK, OMP_LIST_USE_DEVICE, OMP_LIST_CACHE, OMP_LIST_NUM @@ -1269,6 +1279,7 @@ typedef struct gfc_oacc_declare { struct gfc_oacc_declare *next; locus where; + bool module_var; gfc_omp_clauses *clauses; } gfc_oacc_declare; @@ -3276,6 +3287,6 @@ void gfc_convert_mpz_to_signed (mpz_t, int); /* trans-decl.c */ -void insert_oacc_declare (gfc_namespace *); +void finish_oacc_declare (gfc_namespace *, enum sym_flavor); #endif /* GCC_GFORTRAN_H */ diff --git a/gcc/fortran/module.c b/gcc/fortran/module.c index 1abfc46..c174902 100644 --- a/gcc/fortran/module.c +++ b/gcc/fortran/module.c @@ -1894,7 +1894,9 @@ typedef enum AB_IS_CLASS, AB_PROCEDURE, AB_PROC_POINTER, AB_ASYNCHRONOUS, AB_CODIMENSION, AB_COARRAY_COMP, AB_VTYPE, AB_VTAB, AB_CONTIGUOUS, AB_CLASS_POINTER, AB_IMPLICIT_PURE, AB_ARTIFICIAL, AB_UNLIMITED_POLY, AB_OMP_DECLARE_TARGET, - AB_ARRAY_OUTER_DEPENDENCY + AB_ARRAY_OUTER_DEPENDENCY, AB_OACC_DECLARE_CREATE, AB_OACC_DECLARE_COPYIN, + AB_OACC_DECLARE_DEVICEPTR, AB_OACC_DECLARE_DEVICE_RESIDENT, + AB_OACC_DECLARE_LINK } ab_attribute; @@ -1951,6 +1953,11 @@ static const mstring attr_bits[] = minit ("UNLIMITED_POLY", AB_UNLIMITED_POLY), minit ("OMP_DECLARE_TARGET", AB_OMP_DECLARE_TARGET), minit ("ARRAY_OUTER_DEPENDENCY", AB_ARRAY_OUTER_DEPENDENCY), +minit ("OACC_DECLARE_CREATE", AB_OACC_DECLARE_CREATE), +minit ("OACC_DECLARE_COPYIN", AB_OACC_DECLARE_COPYIN), +minit ("OACC_DECLARE_DEVICEPTR", AB_OACC_DECLARE_DEVICEPTR), +minit ("OACC_DECLARE_DEVICE_RESIDENT", AB_OACC_DECLARE_DEVICE_RESIDENT), +minit ("OACC_DECLARE_LINK", AB_OACC_DECLARE_LINK), minit (NULL, -1) }; @@ -2133,6 +2140,16 @@ mio_symbol_attribute (symbol_attribute *attr) MIO_NAME (ab_attribute) 
(AB_OMP_DECLARE_TARGET, attr_bits); if (attr->array_outer_dependency) MIO_NAME (ab_attribute) (AB_ARRAY_OUTER_DEPENDENCY, attr_bits); + if (attr->oacc_declare_create) + MIO_NAME (ab_attribute) (AB_OACC_DECLARE_CREATE, attr_bits); + if (attr->oacc_declare_copyin) + MIO_NAME (ab_attribute) (AB_OACC_DECLARE_COPYIN, attr_bits); + if (attr->oacc_declare_deviceptr) + MIO_NAME (ab_attribute) (AB_OACC_DECLARE_DEVICEPTR, attr_bits); + if (attr->oacc_declare_device_resident) + MIO_NAME (ab_attribute) (AB_OACC_DECLARE_DEVICE_RESIDENT, attr_bits); + if (attr->oacc_declare_link) + MIO_NAME (ab_attribute) (AB_OACC_DECLARE_LINK, attr_bits); mio_rparen (); @@ -2302,6 +2319,21 @@ mio_symbol_attribute (symbol_attribute *attr) case AB_ARRAY_OUTER_DEPENDENCY: attr->array_outer_dependency =1; break; + case AB_OACC_DECLARE_CREATE: + attr->oacc_declare_create = 1; + break; + case AB_OACC_DECLARE_COPYIN: + attr->oacc_declare_copyin = 1; + break; + case AB_OACC_DECLARE_DEVICEPTR: + attr->oacc_declare_deviceptr = 1; + break; + case AB_OACC_DECLARE_DEVICE_RESIDENT: + attr->oacc_declare_device_resident = 1; + break; + case AB_OACC_DECLARE_LINK: + attr->oacc_declare_link = 1; + break; } } } diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c index fc16d8c..46bf865 100644 --- a/gcc/fortran/openmp.c +++ b/gcc/fortran/openmp.c @@ -475,6 +475,7 @@ match_oacc_clause_gang (gfc_omp_clauses *cp) #define OMP_CLAUSE_BIND ((uint64_t) 1 << 58) #define OMP_CLAUSE_NOHOST ((uint64_t) 1 << 59) #define OMP_CLAUSE_DEVICE_TYPE ((uint64_t) 1 << 60) +#define OMP_CLAUSE_LINK ((uint64_t) 1 << 61)
Re: [PATCH 1/2] Memory statistics enhancement.
On 06/01/2015 06:16 PM, mliska wrote: Hi. The following 2 patches improve the memory statistics infrastructure. The first one ports the pool allocator to the new infrastructure, and the second one aligns the columns properly. Both can bootstrap on x86_64-linux-pc and survive regression tests. Ready for trunk? Thank you, Martin Port pool-allocator memory stats to a new infrastructure. gcc/ChangeLog: 2015-06-02 Martin Liska mli...@suse.cz * alloc-pool.c (allocate_pool_descriptor): Remove. (struct pool_output_info): Likewise. (print_alloc_pool_statistics): Likewise. (dump_alloc_pool_statistics): Likewise. * alloc-pool.h (struct pool_usage): New struct. (pool_allocator::initialize): Change usage of memory statistics to a new interface. (pool_allocator::release): Likewise. (pool_allocator::allocate): Likewise. (pool_allocator::remove): Likewise. * mem-stats-traits.h (enum mem_alloc_origin): Add new enum value for a pool allocator. * mem-stats.h (struct mem_location): Add new ctor. (struct mem_usage): Add counter for number of instances. (mem_alloc_description::register_descriptor): New overload of the function. --- gcc/alloc-pool.c | 60 + gcc/alloc-pool.h | 102 +++-- gcc/mem-stats-traits.h | 3 +- gcc/mem-stats.h| 69 ++--- 4 files changed, 132 insertions(+), 102 deletions(-) diff --git a/gcc/alloc-pool.c b/gcc/alloc-pool.c index e9fdc86..601c2b7 100644 --- a/gcc/alloc-pool.c +++ b/gcc/alloc-pool.c @@ -26,70 +26,14 @@ along with GCC; see the file COPYING3. If not see #include "hash-map.h" ALLOC_POOL_ID_TYPE last_id; - -/* Hashtable mapping alloc_pool names to descriptors. */ -hash_map<const char *, alloc_pool_descriptor> *alloc_pool_hash; - -struct alloc_pool_descriptor * -allocate_pool_descriptor (const char *name) -{ - if (!alloc_pool_hash) -alloc_pool_hash = new hash_map<const char *, alloc_pool_descriptor> (10, - false, - false); - - return &alloc_pool_hash->get_or_insert (name); -} - -/* Output per-alloc_pool statistics. */ - -/* Used to accumulate statistics about alloc_pool sizes. 
*/ -struct pool_output_info -{ - unsigned long total_created; - unsigned long total_allocated; -}; - -/* Called via hash_map.traverse. Output alloc_pool descriptor pointed out by - SLOT and update statistics. */ -bool -print_alloc_pool_statistics (const char *const &name, - const alloc_pool_descriptor &d, - struct pool_output_info *i) -{ - if (d.allocated) -{ - fprintf (stderr, -"%-22s %6d %10lu %10lu(%10lu) %10lu(%10lu) %10lu(%10lu)\n", -name, d.elt_size, d.created, d.allocated, -d.allocated / d.elt_size, d.peak, d.peak / d.elt_size, -d.current, d.current / d.elt_size); - i->total_allocated += d.allocated; - i->total_created += d.created; -} - return 1; -} +mem_alloc_description<pool_usage> pool_allocator_usage; /* Output per-alloc_pool memory usage statistics. */ void dump_alloc_pool_statistics (void) { - struct pool_output_info info; - if (! GATHER_STATISTICS) return; - if (!alloc_pool_hash) -return; - - fprintf (stderr, "\nAlloc-pool Kind Elt size Pools Allocated (elts)Peak (elts)Leak (elts)\n"); - fprintf (stderr, "--\n"); - info.total_created = 0; - info.total_allocated = 0; - alloc_pool_hash->traverse <struct pool_output_info *, - print_alloc_pool_statistics> (&info); - fprintf (stderr, "--\n"); - fprintf (stderr, "%-22s %7lu %10lu\n", -"Total", info.total_created, info.total_allocated); - fprintf (stderr, "--\n"); + pool_allocator_usage.dump (ALLOC_POOL); } diff --git a/gcc/alloc-pool.h b/gcc/alloc-pool.h index 96a1342..a1727ce 100644 --- a/gcc/alloc-pool.h +++ b/gcc/alloc-pool.h @@ -26,6 +26,71 @@ extern void dump_alloc_pool_statistics (void); typedef unsigned long ALLOC_POOL_ID_TYPE; +/* Pool allocator memory usage. */ +struct pool_usage: public mem_usage +{ + /* Default contructor. */ + pool_usage (): m_element_size (0), m_pool_name () {} + /* Constructor. */ + pool_usage (size_t allocated, size_t times, size_t peak, + size_t instances, size_t element_size, +
[PATCH] Fix PR66419
This fixes PR66419. Bootstrap / regtest running on x86_64-unknown-linux-gnu. Richard. 2015-06-08 Richard Biener rguent...@suse.de PR tree-optimization/66419 * tree-vect-slp.c (vect_supported_load_permutation_p): Properly consider GROUP_GAP when detecting a perfect subchain. * gcc.dg/vect/bb-slp-37.c: New testcase. Index: gcc/tree-vect-slp.c === --- gcc/tree-vect-slp.c (revision 224221) +++ gcc/tree-vect-slp.c (working copy) @@ -1444,7 +1459,9 @@ vect_supported_load_permutation_p (slp_i next_load = NULL; FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (node), j, load) { - if (j != 0 && next_load != load) + if (j != 0 + && (next_load != load + || GROUP_GAP (vinfo_for_stmt (load)) != 0)) { subchain_p = false; break; Index: gcc/testsuite/gcc.dg/vect/bb-slp-37.c === --- gcc/testsuite/gcc.dg/vect/bb-slp-37.c (revision 0) +++ gcc/testsuite/gcc.dg/vect/bb-slp-37.c (working copy) @@ -0,0 +1,32 @@ +/* { dg-require-effective-target vect_int } */ + +#include "tree-vect.h" + +extern void abort (void); + +int a[16]; +int b[4]; + +void __attribute__((noinline)) +foo (void) +{ + b[0] = a[0]; + b[1] = a[4]; + b[2] = a[8]; + b[3] = a[12]; +} + +int main() +{ + int i; + check_vect (); + for (i = 0; i < 16; ++i) +{ + a[i] = i; + __asm__ volatile (""); +} + foo (); + if (b[0] != 0 || b[1] != 4 || b[2] != 8 || b[3] != 12) +abort (); + return 0; +}
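As a rough illustration of why the gap matters, the following standalone C sketch uses a simplified model that is an assumption of this example and not GCC's representation: each load is identified only by the array element index it reads, and a "gap" means that elements were skipped between two consecutive loads. The function name is_perfect_subchain is hypothetical, and the exact meaning of GROUP_GAP in the vectorizer is not restated here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified model (an assumption, not GCC's data structures): ELEM[j] is
   the array element index read by the j-th load of an SLP node.  The loads
   form a perfect subchain only if each one reads the element immediately
   after the previous one, i.e. in order and without skipped elements.  */
bool is_perfect_subchain(const size_t *elem, size_t n)
{
    assert(elem != NULL || n == 0);
    for (size_t j = 1; j < n; j++) {
        /* Reject out-of-order loads and any gap between neighbours.  */
        if (elem[j] != elem[j - 1] + 1)
            return false;
    }
    return true;
}
```

In this model the loads in the new testcase (a[0], a[4], a[8], a[12]) follow one another in order but skip elements, so they would not count as a perfect subchain.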
Re: Fix more of C/fortran canonical type issues
I think we should instead work towards eliminating the get_alias_set langhook first. The LTO langhook variant contains the same handling, btw, so just inline that into get_alias_set and see what remains? I see, I completely missed the existence of gimple_get_alias_set. It makes more sense now. Is moving everything to alias.c really a desirable thing? If non-C languages do not have this rule, why would we want to reduce the code quality when compiling those? Honza Richard. Honza -- Joseph S. Myers jos...@codesourcery.com -- Richard Biener rguent...@suse.de SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Dilip Upmanyu, Graham Norton, HRB 21284 (AG Nuernberg)
Re: [PATCH, PR66444] Handle -fipa-ra in reload_combine
On Mon, Jun 08, 2015 at 02:04:12PM +0200, Tom de Vries wrote: this patch fixes PR66444, a problem with -fipa-ra in reload_combine. The problem is that for the test-case, reload_combine combines these two insns: Please work out with Vlad whether reload_cse_move2add doesn't need similar fix (and check other spots too). 2015-06-08 Tom de Vries t...@codesourcery.com PR rtl-optimization/66444 * postreload.c (reload_combine): Use get_call_reg_set_usage instead of call_used_regs. LGTM. * gcc.dg/pr66444.c: New test. +int __attribute__((noinline, noclone)) +baz (void) +{ + struct S *x = (struct S *) 0xe000U; I'm still afraid this will not really work on s390-linux (which has only 31-bit pointers) and will not work on 16-bit int targets either (some have say 24-bit pointers etc., not really familiar with the embedded world). So, I'd suggest use a macro for the address, so you don't need to duplicate it, and define it to say ((struct S *) 0x8000UL), if it reproduces even with that change without your reload_combine fix. Ok for trunk and 5.2 with that change. Jakub
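A minimal sketch of the macro suggestion, assuming a placeholder layout for struct S (the real one is in the testcase and is not reproduced here). The macro name TEST_ADDRESS and the helper is_test_address are invented for illustration; only the value ((struct S *) 0x8000UL) comes from the review comment.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder layout; the fields of the real struct S are not shown in
   the review and are assumed here.  */
struct S { int a; int b; };

/* Hypothetical macro name.  Defining the address once avoids repeating
   the constant at every use.  */
#define TEST_ADDRESS ((struct S *) 0x8000UL)

/* Compares against the fixed address without ever dereferencing it.  */
int is_test_address(const struct S *p)
{
    assert(p != NULL);
    return p == TEST_ADDRESS;
}
```

Whether the original problem still reproduces with the smaller constant has to be checked against the unpatched compiler, as the review asks.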
Re: [patch] libstdc++/66030 fix codecvt exports for mingw32
On 06/08/2015 09:12 AM, Jonathan Wakely wrote: The linker script assumes that std::mbstate_t has the name __mbstate_t for linkage purposes, but that's not necessarily true. For mingw32 it's just a typedef for int, so the patterns don't match. This adds a new mingw32-specific pattern for codecvt_byname's constructors and destructors, and relaxes the patterns for codecvt<charNN_t, char, mbstate_t> so they match __mbstate_t or int. As a data point, in case other targets have a similar issue, mbstate_t is a typedef for char* on AIX, and (based on my old notes) typedef struct mbstate_t on HP-UX. (It is a typedef struct __mbstate_t on Darwin and Solaris.) Martin
Re: [C++ Patch] PR 65815
On 05/22/2015 02:46 PM, Paolo Carlini wrote: take a type, not a decl, as first argument. Why? This complicates calls. Could you also check that we do the right thing for mem-initializers? Jason
Re: [PING][PATCH][PR65443] Add transform_to_exit_first_loop_alt
Hi Tom! On Mon, 8 Jun 2015 12:43:01 +0200, Tom de Vries tom_devr...@mentor.com wrote: There are two problems in try_transform_to_exit_first_loop_alt: 1. In case the latch is not a singleton bb, the function should return false rather than true. 2. The check for singleton bb should ignore debug-insns. Attached patch fixes these problems. Fix try_transform_to_exit_first_loop_alt PR tree-optimization/66442 * gimple-iterator.h (gimple_seq_nondebug_singleton_p): Add function. * tree-parloops.c (try_transform_to_exit_first_loop_alt): Return false if the loop latch is not a singleton. Use gimple_seq_nondebug_singleton_p instead of gimple_seq_singleton_p. Per my testing, the backport of this patch that you committed to gomp-4_0-branch, r224219, introduces a number of regressions in your OpenACC kernels test cases, specifically the »scan-tree-dump-times parloops_oacc_kernels (?n)pragma omp target oacc_parallel.*num_gangs\\(32\\) 1« tests. Would you please have a look? Grüße, Thomas gcc/gimple-iterator.h | 29 + gcc/tree-parloops.c | 4 ++-- 2 files changed, 31 insertions(+), 2 deletions(-) diff --git a/gcc/gimple-iterator.h b/gcc/gimple-iterator.h index 87e943a..76fa456 100644 --- a/gcc/gimple-iterator.h +++ b/gcc/gimple-iterator.h @@ -345,4 +345,33 @@ gsi_seq (gimple_stmt_iterator i) return *i.seq; } +/* Determine whether SEQ is a nondebug singleton. */ + +static inline bool +gimple_seq_nondebug_singleton_p (gimple_seq seq) +{ + gimple_stmt_iterator gsi; + + /* Find a nondebug gimple. */ + gsi.ptr = gimple_seq_first (seq); + gsi.seq = &seq; + gsi.bb = NULL; + while (!gsi_end_p (gsi) + && is_gimple_debug (gsi_stmt (gsi))) +gsi_next (&gsi); + + /* No nondebug gimple found, not a singleton. */ + if (gsi_end_p (gsi)) +return false; + + /* Find a next nondebug gimple. */ + gsi_next (&gsi); + while (!gsi_end_p (gsi) + && is_gimple_debug (gsi_stmt (gsi))) +gsi_next (&gsi); + + /* Only a singleton if there's no next nondebug gimple. 
*/ + return gsi_end_p (gsi); +} + #endif /* GCC_GIMPLE_ITERATOR_H */ diff --git a/gcc/tree-parloops.c b/gcc/tree-parloops.c index 02f44eb..c4b83fe 100644 --- a/gcc/tree-parloops.c +++ b/gcc/tree-parloops.c @@ -1769,8 +1769,8 @@ try_transform_to_exit_first_loop_alt (struct loop *loop, tree nit) { /* Check whether the latch contains a single statement. */ - if (!gimple_seq_singleton_p (bb_seq (loop->latch))) -return true; + if (!gimple_seq_nondebug_singleton_p (bb_seq (loop->latch))) +return false; /* Check whether the latch contains the loop iv increment. */ edge back = single_succ_edge (loop->latch); -- 1.9.1
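The logic of the new predicate can be sketched outside GCC on a plain linked list. This is only an illustration: struct stmt, skip_debug and seq_nondebug_singleton_p are hypothetical stand-ins for the gimple sequence, the iterator loops and gimple_seq_nondebug_singleton_p, and none of them is a GCC interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified statement node; not GCC's gimple.  */
struct stmt {
    bool is_debug;
    struct stmt *next;
};

/* Advances past debug statements; returns NULL at the end of the list.  */
static const struct stmt *skip_debug(const struct stmt *s)
{
    while (s != NULL && s->is_debug)
        s = s->next;
    return s;
}

/* True if SEQ contains exactly one non-debug statement, with any number
   of debug statements before, after or around it.  */
bool seq_nondebug_singleton_p(const struct stmt *seq)
{
    const struct stmt *first = skip_debug(seq);

    /* No non-debug statement at all: not a singleton.  */
    if (first == NULL)
        return false;
    assert(!first->is_debug);

    /* A singleton only if no further non-debug statement follows.  */
    return skip_debug(first->next) == NULL;
}
```

An empty sequence and a sequence made up solely of debug statements both yield false, matching the "No nondebug gimple found, not a singleton" comment in the patch.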