RE: [PATCH] Fix bdverN vector cost of cond_[not_]taken_branch_cost
> I have added a person from AMD to comment on the decision. Otherwise, the
> patch looks OK, but please wait a couple of days for possible comments.

Thank you Uros! I am checking the changes with a few tests and benchmarking them. Please wait for a couple of days.

-Ganesh
Re: [wwwdocs] Update changes.html with libstdc++ changes
On Sat, 6 Dec 2014, Jonathan Wakely wrote:
> This adds recent libstdc++ updates to gcc-5/changes.html

Nice! Just a most minor change to end a list with a full stop instead of a semi-colon.

Applied.

Gerald

Index: changes.html
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v
retrieving revision 1.94
diff -u -r1.94 changes.html
--- changes.html	6 Apr 2015 12:56:40 -0000	1.94
+++ changes.html	8 Apr 2015 06:54:43 -0000
@@ -456,7 +456,7 @@
     <li>New random number distributions <code>logistic_distribution</code> and
     <code>uniform_on_sphere_distribution</code> as extensions.</li>
     <li><a href="https://sourceware.org/gdb/current/onlinedocs/gdb/Xmethods-In-Python.html">GDB
-    Xmethods</a> for containers and <code>std::unique_ptr</code>;</li>
+    Xmethods</a> for containers and <code>std::unique_ptr</code>.</li>
   </ul>
 
 <h3 id="fortran">Fortran</h3>
Re: [patch,avr]: Part2: Fix various problems with specs and specs file generation.
On 04/08/2015 at 10:28 AM, Denis Chertykov wrote:
> 2015-04-07 15:34 GMT+03:00 Georg-Johann Lay <a...@gjlay.de>:
>> On 04/06/2015 at 11:54 AM, Sivanupandi, Pitchumani wrote:
>>> Hi Johann,
>>>
>>> Did you try running g++ tests? It seems xgcc is invoked to get multilibs
>>> (from gcc/testsuite/lib/g++.exp) which failed to find specs file.
>>
>> This is because libgloss.exp:get_multilibs (used from g++_init) runs xgcc
>> ($compiler) without -B, i.e. without any prefix. Without prefix there is no
>> way to determine where the specs files are located.
>>
>> Patching driver_self_specs to read a specs file by means of -specs= is,
>> well, not very common. I don't know any other target which does that.
>>
>> As a work-around you can run the tests against the installed compiler.
>>
>> Denis, what do you think? I could add yet another fixme to avr backend
>> like the following; that way there's no need to change dejagnu:
>>
>> Johann
>>
>> Index: config/avr/driver-avr.c
>> ===================================================================
>> --- config/avr/driver-avr.c     (revision 221602)
>> +++ config/avr/driver-avr.c     (working copy)
>> @@ -80,6 +80,20 @@ avr_devicespecs_file (int argc, const ch
>>        return X_NODEVLIB;
>>  
>>      case 1:
>> +      if (0 == strcmp ("device-specs", argv[0]))
>> +        {
>> +          /* FIXME:  This means "device-specs%s" from avr.h:DRIVER_SELF_SPECS
>> +             has not been resolved to a path.  That case can occur when the
>> +             c++ testsuite is run from the build directory.  DejaGNU's
>> +             libgloss.exp:get_multilibs runs $compiler without -B, i.e. runs
>> +             xgcc without specifying a prefix.  Without any prefix, there is
>> +             no means to find out where the specs files might be located.
>> +             get_multilibs runs xgcc --print-multi-lib, hence we don't
>> +             actually need information from a specs file and may skip it
>> +             altogether.  */
>> +          return X_NODEVLIB;
>> +        }
>> +
>>        mmcu = AVR_MMCU_DEFAULT;
>>        break;
>>  
>
> I'm weak in dejagnu internals and c++ testsuite.
> It looks like an acceptable solution.
>
> Denis.

Pitchumani, does that patch work for you? If so I'd go ahead and apply it.
And what about the spaces problem as mentioned in

http://savannah.nongnu.org/bugs/?44574
http://lists.gnu.org/archive/html/avr-libc-dev/2015-03/msg00010.html

Are there plans to fix that?

Johann
[PATCH, i386] Fix PR target/65676.
Hello,

The patch at the bottom fixes PR65676.

Bootstrapped; reg-testing is in progress. I am going to commit it if testing passes. I am also going to backport the patch to 4.9.x.

gcc/
	PR target/65676
	* config/i386/i386.c (fixup_modeless_constant): New.
	(ix86_expand_args_builtin): Fixup modeless constant operand.
	(ix86_expand_round_builtin): Ditto.
	(ix86_expand_special_args_builtin): Ditto.
	(ix86_expand_builtin): Ditto.

gcc/testsuite/
	PR target/65676
	* gcc.target/i386/sse-25.c: New.

--
Thanks, K

commit bdb50f43ed940261230953a647c6a7197bc60c97
Author: Kirill Yukhin <kirill.yuk...@intel.com>
Date:   Tue Apr 7 17:37:01 2015 +0300

    Fix PR65676.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 02b5103..a02e004 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -35863,6 +35863,15 @@ safe_vector_operand (rtx x, machine_mode mode)
   return x;
 }
 
+/* Fixup modeless constants to fit required mode.  */
+static rtx
+fixup_modeless_constant (rtx x, machine_mode mode)
+{
+  if (GET_MODE (x) == VOIDmode)
+    x = convert_to_mode (mode, x, 1);
+  return x;
+}
+
 /* Subroutine of ix86_expand_builtin to take care of binop insns.  */
 
 static rtx
@@ -37509,6 +37518,8 @@ ix86_expand_args_builtin (const struct builtin_description *d,
           if (memory_operand (op, mode))
             num_memory++;
 
+          op = fixup_modeless_constant (op, mode);
+
           if (GET_MODE (op) == mode || GET_MODE (op) == VOIDmode)
             {
               if (optimize || !match || num_memory > 1)
@@ -37882,6 +37893,8 @@ ix86_expand_round_builtin (const struct builtin_description *d,
           if (VECTOR_MODE_P (mode))
             op = safe_vector_operand (op, mode);
 
+          op = fixup_modeless_constant (op, mode);
+
           if (GET_MODE (op) == mode || GET_MODE (op) == VOIDmode)
             {
               if (optimize || !match)
@@ -38289,6 +38302,8 @@ ix86_expand_special_args_builtin (const struct builtin_description *d,
           if (VECTOR_MODE_P (mode))
             op = safe_vector_operand (op, mode);
 
+          op = fixup_modeless_constant (op, mode);
+
           if (GET_MODE (op) == mode || GET_MODE (op) == VOIDmode)
             op = copy_to_mode_reg (mode, op);
           else
@@ -39852,6 +39867,9 @@ addcarryx:
         op1 = copy_to_mode_reg (Pmode, op1);
       if (!insn_data[icode].operand[3].predicate (op2, mode2))
         op2 = copy_to_mode_reg (mode2, op2);
+
+      op = fixup_modeless_constant (op, mode);
+
       if (GET_MODE (op3) == mode3 || GET_MODE (op3) == VOIDmode)
         {
           if (!insn_data[icode].operand[4].predicate (op3, mode3))
@@ -39995,6 +40013,8 @@ addcarryx:
       if (!insn_data[icode].operand[0].predicate (op0, Pmode))
         op0 = copy_to_mode_reg (Pmode, op0);
 
+      op = fixup_modeless_constant (op, mode);
+
       if (GET_MODE (op1) == mode1 || GET_MODE (op1) == VOIDmode)
         {
           if (!insn_data[icode].operand[1].predicate (op1, mode1))
@@ -40041,6 +40061,8 @@ addcarryx:
       mode3 = insn_data[icode].operand[3].mode;
       mode4 = insn_data[icode].operand[4].mode;
 
+      op = fixup_modeless_constant (op, mode);
+
       if (GET_MODE (op0) == mode0
           || (GET_MODE (op0) == VOIDmode && op0 != constm1_rtx))
         {
diff --git a/gcc/testsuite/gcc.target/i386/sse-25.c b/gcc/testsuite/gcc.target/i386/sse-25.c
new file mode 100644
index 0000000..c4b334c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/sse-25.c
@@ -0,0 +1,6 @@
+/* PR target/65676 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -Werror-implicit-function-declaration -march=k8 -funsigned-char" } */
+/* { dg-add-options bind_pic_locally } */
+
+#include "sse-23.c"
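The idea behind fixup_modeless_constant can be illustrated outside of GCC with a toy model. This is only a sketch and not GCC code: `toy_mode`, `toy_rtx`, `toy_mode_bits` and `toy_fixup_modeless_constant` are hypothetical stand-ins, and masking the value to the operand width is a simplification, not a description of what convert_to_mode does internally.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for machine modes; TOY_VOID plays the role
   of VOIDmode (a constant that has no mode of its own).  */
enum toy_mode { TOY_VOID, TOY_QI, TOY_HI, TOY_SI };

/* Toy "rtx": a value tagged with a mode.  */
struct toy_rtx { enum toy_mode mode; uint64_t value; };

/* Number of value bits in each toy mode.  */
unsigned toy_mode_bits (enum toy_mode mode)
{
  switch (mode)
    {
    case TOY_QI: return 8;
    case TOY_HI: return 16;
    case TOY_SI: return 32;
    default: return 0;
    }
}

/* Give a modeless constant the mode the operand requires, treating it
   as an unsigned value.  Operands that already carry a mode are
   returned unchanged.  */
struct toy_rtx
toy_fixup_modeless_constant (struct toy_rtx x, enum toy_mode mode)
{
  assert (mode != TOY_VOID);
  if (x.mode == TOY_VOID)
    {
      unsigned bits = toy_mode_bits (mode);
      x.value &= (UINT64_C (1) << bits) - 1;
      x.mode = mode;
    }
  return x;
}
```

Calling `toy_fixup_modeless_constant` before an operand check means a later comparison against the required mode no longer has to special-case the modeless constant.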
Re: [wwwdocs] Update changes.html with libstdc++ changes
On 08/04/15 13:06 +0200, Gerald Pfeifer wrote:
> On Sat, 6 Dec 2014, Jonathan Wakely wrote:
>> I'm also noting one old change in the GCC 4.5 page, and
>> removing/changing some links to the C++0x status table. The list of
>> features supported on trunk is fairly irrelevant to someone looking at
>> the 4.4 release notes, so I've linked to the docs for the relevant release
>
> This was a good change, thank you.
>
> The only drawback of this, and some similar cases, is that we now risk
> referring to older versions on a release branch. I do not have a good
> idea how to best handle this, so for now I updated references to online
> documentation to the latest versions (4.8.4 and 4.9.2, respectively).

Yes, I realised that problem when making the change and linking to the versions that were current at the time.

One option would be to add a gcc-4.8 symlink that points to the latest gcc-4.8.x version, but that adds more work for the release managers and only has a small benefit.

Alternatively, since we only tend to have four or five releases from a branch, we could just update them manually when we remember to. That's only necessary until the branch closes, at which point the latest release won't change. It's not a huge problem if the links don't go to the latest docs immediately IMHO.
Re: pr59016
On 07/04/2015 14:25, Mikael Morin wrote:
> On 06/04/2015 20:26, Mikael Morin wrote:
>> Regarding the patch, I don't understand why the existing symbol
>> restoration code doesn't work here (see
>> gfc_restore_last_undo_checkpoint, restore_old_symbol). I have to
>> investigate more.
>
> I think the problem is the usage of gfc_find_symbol in
> gfc_match_decl_type_spec. In opposition to the gfc_get_* family of
> functions, the gfc_find_* functions don't version symbols, so that
> changes made to the symbol are not thrown away when the statement is
> rejected.
> So something like the following should be preferred over Evangelos' patch.

Except that the following ... ahem ... doesn't work.

Mikael

Index: decl.c
===================================================================
--- decl.c	(revision 221654)
+++ decl.c	(working copy)
@@ -2840,7 +2840,7 @@ gfc_match_decl_type_spec (gfc_typespec *ts, int im
       if (ts->kind != -1)
         {
           gfc_get_ha_symbol (name, &sym);
-          if (sym->generic && gfc_find_symbol (dt_name, NULL, 0, &dt_sym))
+          if (sym->generic && gfc_get_symbol (dt_name, NULL, &dt_sym))
             {
               gfc_error ("Type name %qs at %C is ambiguous", name);
               return MATCH_ERROR;
@@ -2850,10 +2850,11 @@ gfc_match_decl_type_spec (gfc_typespec *ts, int im
         }
       else if (ts->kind == -1)
         {
-          int iface = gfc_state_stack->previous->state != COMP_INTERFACE
-                        || gfc_current_ns->has_import_set;
-          gfc_find_symbol (name, NULL, iface, &sym);
-          if (sym && sym->generic && gfc_find_symbol (dt_name, NULL, 1, &dt_sym))
+          gfc_get_ha_symbol (name, &sym);
+          if (sym == NULL || sym->gfc_new)
+            return MATCH_NO;
+
+          if (sym && sym->generic && gfc_get_ha_symbol (dt_name, &dt_sym))
             {
               gfc_error ("Type name %qs at %C is ambiguous", name);
               return MATCH_ERROR;
@@ -2862,8 +2863,6 @@ gfc_match_decl_type_spec (gfc_typespec *ts, int im
             dt_sym = gfc_find_dt_in_generic (sym);
 
           ts->kind = 0;
-          if (sym == NULL)
-            return MATCH_NO;
         }
 
       if ((sym->attr.flavor != FL_UNKNOWN
@@ -2885,12 +2884,13 @@ gfc_match_decl_type_spec (gfc_typespec *ts, int im
           && !gfc_add_function (&sym->attr, sym->name, NULL))
         return MATCH_ERROR;
 
-      if (!dt_sym)
+      if (!dt_sym || dt_sym->gfc_new)
         {
           gfc_interface *intr, *head;
 
           /* Use upper case to save the actual derived-type symbol.  */
-          gfc_get_symbol (dt_name, NULL, &dt_sym);
+          if (!dt_sym)
+            gfc_get_symbol (dt_name, NULL, &dt_sym);
           dt_sym->name = gfc_get_string (sym->name);
           head = sym->generic;
           intr = gfc_get_interface ();
Re: Revert PowerPC shrink-wrap support 3 of 3
On Thu, 10 Nov 2011, Hans-Peter Nilsson wrote:
> I think I need someone with appropriate write privileges to agree
> with that, and to also give 48h for someone to fix the problem.
> Sorry for not forthcoming on the second point.
>
> brgds, H-P
> PS. where is the policy written down, besides the mailing list archives?

https://gcc.gnu.org/develop.html has had the following since 2003 or so:

  Patch Reversion

  If a patch is committed which introduces a regression on any target
  which the Steering Committee considers to be important and if:

    * the problem is reported to the original poster;

    * 48 hours pass without the original poster or any other party
      indicating that a fix will be forthcoming in the very near future;

    * two people with write privileges to the affected area of the
      compiler determine that the best course of action is to revert
      the patch;

  then they may revert the patch.

  (The list of important targets will be revised at the beginning of
  each release cycle, if necessary, and is part of the release criteria.)

  After the patch has been reverted, the poster may appeal the decision
  to the Steering Committee.

  Note that no distinction is made between patches which are themselves
  buggy and patches that expose latent bugs elsewhere in the compiler.

This is there as part of the overall (release) methodology, though svnwrite.html would be at least as natural. Perhaps a reference there?

Thoughts?

Gerald

PS: Yes, I am catching up with some older mails. ;-)
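The three conditions of the quoted policy read naturally as one predicate. The following is just a sketch of that reading, not anything the project uses: `toy_revert_case` and `toy_may_revert` are made-up names, and the 48-hour threshold and the count of two approvers are taken directly from the text above.

```c
#include <assert.h>
#include <stdbool.h>

/* Facts about a regression-introducing patch, as named in the quoted
   policy text.  The struct and its fields are made up for this sketch.  */
struct toy_revert_case
{
  bool target_is_important;    /* Steering Committee considers the target important.  */
  bool reported_to_poster;     /* Problem was reported to the original poster.  */
  unsigned hours_since_report; /* Time elapsed since that report.  */
  bool fix_promised;           /* Somebody indicated a fix is forthcoming.  */
  unsigned approvers;          /* People with write privileges favouring a revert.  */
};

/* True when every condition of the policy holds at the same time.  */
bool toy_may_revert (const struct toy_revert_case *c)
{
  assert (c != 0);
  return c->target_is_important
         && c->reported_to_poster
         && c->hours_since_report >= 48
         && !c->fix_promised
         && c->approvers >= 2;
}
```

Note that the predicate deliberately has no field for whether the patch itself is buggy, matching the last paragraph of the policy.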
Re: [wwwdocs] Update changes.html with libstdc++ changes
On Sat, 6 Dec 2014, Jonathan Wakely wrote:
> I'm also noting one old change in the GCC 4.5 page, and
> removing/changing some links to the C++0x status table. The list of
> features supported on trunk is fairly irrelevant to someone looking at
> the 4.4 release notes, so I've linked to the docs for the relevant release

This was a good change, thank you.

The only drawback of this, and some similar cases, is that we now risk referring to older versions on a release branch. I do not have a good idea how to best handle this, so for now I updated references to online documentation to the latest versions (4.8.4 and 4.9.2, respectively).

Gerald

Index: gcc-4.8/changes.html
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.8/changes.html,v
retrieving revision 1.131
diff -u -r1.131 changes.html
--- gcc-4.8/changes.html	19 Dec 2014 11:38:57 -0000	1.131
+++ gcc-4.8/changes.html	8 Apr 2015 10:30:59 -0000
@@ -292,7 +292,7 @@
 <h4>Runtime Library (libstdc++)</h4>
 
   <ul>
-    <li><a href="https://gcc.gnu.org/onlinedocs/gcc-4.8.3/libstdc++/manual/manual/status.html#status.iso.2011">
+    <li><a href="https://gcc.gnu.org/onlinedocs/gcc-4.8.4/libstdc++/manual/manual/status.html#status.iso.2011">
     Improved experimental support for the new ISO C++ standard, C++11</a>,
     including:
       <ul>
Index: gcc-4.9/changes.html
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.9/changes.html,v
retrieving revision 1.84
diff -u -r1.84 changes.html
--- gcc-4.9/changes.html	31 Dec 2014 10:51:19 -0000	1.84
+++ gcc-4.9/changes.html	8 Apr 2015 10:31:00 -0000
@@ -128,7 +128,7 @@
       and starting with the 4.9.1 release also in the Fortran compiler.
       The new <code>-fopenmp-simd</code> option can be used to enable OpenMP's
       SIMD directives, while ignoring other OpenMP directives.  The new <a
-      href="https://gcc.gnu.org/onlinedocs/gcc-4.9.0/gcc/Optimize-Options.html#index-fsimd-cost-model-908">
+      href="https://gcc.gnu.org/onlinedocs/gcc-4.9.2/gcc/Optimize-Options.html#index-fsimd-cost-model-908">
       <code>-fsimd-cost-model=</code></a> option permits to tune the
       vectorization cost model for loops annotated with OpenMP and Cilk
       Plus <code>simd</code> directives; <code>-Wopenmp-simd</code> warns when
@@ -151,7 +151,7 @@
   <ul>
     <li>Support for colorizing diagnostics emitted by GCC has been added.
     The <code><a
-    href="https://gcc.gnu.org/onlinedocs/gcc-4.9.0/gcc/Language-Independent-Options.html#index-fdiagnostics-color-252">
+    href="https://gcc.gnu.org/onlinedocs/gcc-4.9.2/gcc/Language-Independent-Options.html#index-fdiagnostics-color-252">
     -fdiagnostics-color=auto</a></code> will enable it when outputting to
     terminals, <code>-fdiagnostics-color=always</code> unconditionally.  The
     <code>GCC_COLORS</code> environment variable
@@ -177,7 +177,7 @@
 </pre></li>
 
     <li>With the new <a
-    href="https://gcc.gnu.org/onlinedocs/gcc-4.9.0/gcc/Loop-Specific-Pragmas.html">
+    href="https://gcc.gnu.org/onlinedocs/gcc-4.9.2/gcc/Loop-Specific-Pragmas.html">
     <code>#pragma GCC ivdep</code></a>, the user can assert that there are no
     loop-carried dependencies which would prevent concurrent execution of
     consecutive iterations using SIMD (single instruction multiple data)
@@ -383,7 +383,7 @@
     <li>ABI changes:
       <ul>
         <li>The <a
-        href="https://gcc.gnu.org/onlinedocs/gcc-4.9.0/gfortran/Argument-passing-conventions.html">
+        href="https://gcc.gnu.org/onlinedocs/gcc-4.9.2/gfortran/Argument-passing-conventions.html">
         argument passing ABI</a> has changed for scalar dummy
         arguments of type <code>INTEGER</code>, <code>REAL</code>,
         <code>COMPLEX</code> and <code>LOGICAL</code>, which have
@@ -418,7 +418,7 @@
       controlled by the <code>-Wzerotrip</code> option, which is implied by
       <code>-Wall</code>.</li>
     <li>The new <code>NO_ARG_CHECK</code> attribute of the <a
-      href="https://gcc.gnu.org/onlinedocs/gcc-4.9.0/gfortran/GNU-Fortran-Compiler-Directives.html">
+      href="https://gcc.gnu.org/onlinedocs/gcc-4.9.2/gfortran/GNU-Fortran-Compiler-Directives.html">
       <code>!GCC$</code> directive</a> can be used to disable the
       type-kind-rank (TKR) argument check for a dummy argument. The feature
       is similar to ISO/IEC TS 29133:2012's <code>TYPE(*)</code>, except that
@@ -452,7 +452,7 @@
       the execution and any exception (but inexact) is signaling, a warning is
       printed to <code>ERROR_UNIT</code>, indicating which exceptions are
       signaling.  The <code><a
-      href="https://gcc.gnu.org/onlinedocs/gcc-4.9.0/gfortran/Debugging-Options.html">
+      href="https://gcc.gnu.org/onlinedocs/gcc-4.9.2/gfortran/Debugging-Options.html">
       -ffpe-summary=</a></code> command-line option can be used to fine-tune
       for which exceptions the warning should be shown.</li>
     <li>Rounding on input (<code>READ</code>) is now
Re: [PATCH, bootstrap]: Add bootstrap-lto-noplugin build configuration (PR65537)
On Mon, 6 Apr 2015, Sandra Loosemore wrote:
> s/frontend/front end/ ??
>
> (Since we have "middle end" and "back end" on the previous line.)

Absolutely -- I should have caught this. It also is in line with https://gcc.gnu.org/codingconventions.html.

Here is the updated patch I just committed.

2015-04-08  Gerald Pfeifer  <ger...@pfeifer.com>

	* doc/install.texi (bootstrap-lto-noplugin): Rewrite.

Index: doc/install.texi
===================================================================
--- doc/install.texi	(revision 221916)
+++ doc/install.texi	(working copy)
@@ -2525,10 +2525,10 @@
 
 @item @samp{bootstrap-lto-noplugin}
 This option is similar to @code{bootstrap-lto}, but is intended for
-hosts that do not support the linker plugin.  Please note that static
-libraries are not compiled with link time optimizations without
-linker plugin.  Since GCC middle-end and back-end are in libbackend.a,
-it means that only part of the frontend is actually LTO optimized.
+hosts that do not support the linker plugin.  Without the linker plugin
+static libraries are not compiled with link-time optimizations.  Since
+the GCC middle end and back end are in @file{libbackend.a} this means
+that only the front end is actually LTO optimized.
 
 @item @samp{bootstrap-debug}
 Verifies that the compiler generates the same executable code, whether
[C++ PATCH] Fix alignment handling in build_cplus_array_type/cp_build_qualified_type_real (PR c++/65690)
Hi!

The following patches (included or attached) fix a regression on WebKit compilation. The first hunk basically reverts Honza's patch from December/January, because layout_type when the variants are already linked in doesn't layout just the current type, but also forcefully overwrites all the other variants, which is clearly highly undesirable. The attached patch has an alternate hunk for that, where it doesn't call layout_type at all and just copies over the needed fields from the main variant.

The second hunk (the same in between both patches) fixes a problem that alignof (const T) is incorrect, but it has been that way already in 4.8 at least.

I've been wondering if it would be possible that build_cplus_array_type would see incomplete element type on the main variant and complete qualified element type, but have not succeeded with e.g.

struct S
{
  typedef S T[4][4] __attribute__((aligned (16)));
  static T t;
  static volatile T v;
};
void foo (const S::T);
volatile const S::T w;
S::T S::t;
volatile S::T S::v;

The included (first) patch has been successfully bootstrapped/regtested on x86_64-linux and i686-linux, the attached patch not, but I can bootstrap/regtest it if you prefer it.

2015-04-08  Jakub Jelinek  <ja...@redhat.com>
	    Jan Hubicka  <hubi...@ucw.cz>

	PR c++/65690
	* tree.c (build_cplus_array_type): Layout type before variants are
	set, but copy over TYPE_SIZE and TYPE_SIZE_UNIT from the main
	variant.
	(cp_build_qualified_type_real): Use check_base_type.  Build a
	variant and copy over even TYPE_CONTEXT and
	TYPE_ALIGN/TYPE_USER_ALIGN if any of those are different.

	* c-c++-common/attr-aligned-1.c: New test.

--- gcc/cp/tree.c.jj	2015-04-01 15:29:33.000000000 +0200
+++ gcc/cp/tree.c	2015-04-08 09:09:45.326939354 +0200
@@ -880,12 +880,19 @@ build_cplus_array_type (tree elt_type, t
         {
           t = build_min_array_type (elt_type, index_type);
           set_array_type_canon (t, elt_type, index_type);
+          if (!dependent)
+            {
+              layout_type (t);
+              /* Make sure sizes are shared with the main variant.
+                 layout_type can't be called after setting TYPE_NEXT_VARIANT,
+                 as it will overwrite alignment etc. of all variants.  */
+              TYPE_SIZE (t) = TYPE_SIZE (m);
+              TYPE_SIZE_UNIT (t) = TYPE_SIZE_UNIT (m);
+            }
 
           TYPE_MAIN_VARIANT (t) = m;
           TYPE_NEXT_VARIANT (t) = TYPE_NEXT_VARIANT (m);
           TYPE_NEXT_VARIANT (m) = t;
-          if (!dependent)
-            layout_type (t);
         }
     }
 
@@ -1057,21 +1064,23 @@ cp_build_qualified_type_real (tree type,
          should be equivalent to those in check_qualified_type.  */
       for (t = TYPE_MAIN_VARIANT (type); t; t = TYPE_NEXT_VARIANT (t))
         if (TREE_TYPE (t) == element_type
-            && TYPE_NAME (t) == TYPE_NAME (type)
-            && TYPE_CONTEXT (t) == TYPE_CONTEXT (type)
-            && attribute_list_equal (TYPE_ATTRIBUTES (t),
-                                     TYPE_ATTRIBUTES (type)))
+            && check_base_type (t, type))
           break;
 
       if (!t)
         {
           t = build_cplus_array_type (element_type, TYPE_DOMAIN (type));
 
-          /* Keep the typedef name.  */
-          if (TYPE_NAME (t) != TYPE_NAME (type))
+          /* Keep the typedef name, context and alignment.  */
+          if (TYPE_NAME (t) != TYPE_NAME (type)
+              || TYPE_CONTEXT (t) != TYPE_CONTEXT (type)
+              || TYPE_ALIGN (t) != TYPE_ALIGN (type))
             {
               t = build_variant_type_copy (t);
               TYPE_NAME (t) = TYPE_NAME (type);
+              TYPE_CONTEXT (t) = TYPE_CONTEXT (type);
+              TYPE_ALIGN (t) = TYPE_ALIGN (type);
+              TYPE_USER_ALIGN (t) = TYPE_USER_ALIGN (type);
             }
         }
 
--- gcc/testsuite/c-c++-common/attr-aligned-1.c.jj	2015-04-08 09:22:46.181427189 +0200
+++ gcc/testsuite/c-c++-common/attr-aligned-1.c	2015-04-08 09:26:41.315627195 +0200
@@ -0,0 +1,24 @@
+/* PR c++/65690 */
+/* { dg-do run } */
+
+typedef double T[4][4] __attribute__((aligned (2 * __alignof__ (double))));
+void foo (const T);
+struct S { T s; };
+
+int
+main ()
+{
+  if (__alignof__ (struct S) != 2 * __alignof__ (double)
+      || __alignof__ (T) != 2 * __alignof__ (double)
+      || __alignof__ (const struct S) != 2 * __alignof__ (double)
+      || __alignof__ (const T) != 2 * __alignof__ (double))
+    __builtin_abort ();
+  return 0;
+}
+
+#if defined(__cplusplus) && __cplusplus >= 201103L
+static_assert (alignof (S) == 2 * alignof (double), "alignment of S");
+static_assert (alignof (T) == 2 * alignof (double), "alignment of T");
+static_assert (alignof (const S) == 2 * alignof (double), "alignment of const S");
+static_assert (alignof (const T) == 2 * alignof (double), "alignment of const T");
+#endif

	Jakub

2015-04-08  Jakub Jelinek  <ja...@redhat.com>
	    Jan Hubicka  <hubi...@ucw.cz>
Re: [ping] Re: proper name of i386/x86-64/etc targets
Hi Sandra,

On Mon, 26 Jan 2015, Sandra Loosemore wrote:
> OK, here is a patch that attempts to implement that convention. I'd
> appreciate review from a target maintainer to check that I've correctly
> disambiguated places where i386 was referring to both 32- and 64-bit
> variants vs 32-bit only. I've left alone some instances of i386 where
> it seemed appropriate to name a specific processor -- e.g. there are a
> bunch of examples in the inline asm section that are described as
> i386 code.

Would you mind adding this to https://gcc.gnu.org/codingconventions.html where we have a table for spelling (even though this is more than spelling)? That way we have a reference point in the future.

Thanks,
Gerald
Re: pr59016
On 08/04/2015 12:29, Mikael Morin wrote:
> Except that the following ... ahem ... doesn't work.

And it doesn't work because gfc_get_ha_symbol doesn't version host-associated symbols. So one has to call symbol.c's save_symbol_data by hand. And then, we can as well keep the original gfc_find_symbol calls.

Evangelos, do you want to propose a patch along those lines?

Mikael
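What "versioning" a symbol buys can be shown with a toy undo buffer. This is a sketch under loud assumptions: it is not gfortran code, `toy_symbol`, `toy_save_symbol_data`, `toy_commit` and `toy_undo` are invented names, and a single backed-up field stands in for whatever the real save_symbol_data records.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy symbol with a one-level undo buffer.  All names are made up.  */
struct toy_symbol
{
  int generic;      /* Some attribute a matcher may change.  */
  bool saved;       /* True once a backup has been taken.  */
  int old_generic;  /* Backup of the attribute.  */
};

/* Back up the symbol before its first modification in a statement,
   in the spirit of what the thread calls versioning.  */
void toy_save_symbol_data (struct toy_symbol *sym)
{
  assert (sym != NULL);
  if (!sym->saved)
    {
      sym->old_generic = sym->generic;
      sym->saved = true;
    }
}

/* Statement accepted: keep the changes and drop the backup.  */
void toy_commit (struct toy_symbol *sym)
{
  sym->saved = false;
}

/* Statement rejected: throw the changes away if a backup exists.
   A symbol that was never saved keeps whatever was written to it.  */
void toy_undo (struct toy_symbol *sym)
{
  if (sym->saved)
    {
      sym->generic = sym->old_generic;
      sym->saved = false;
    }
}
```

A lookup that skips the save step corresponds to the second case: the undo has nothing to restore, so a rejected statement leaves its modification behind.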
Re: [PATCH, wwwdoc] Describe the changes of NDS32 port in GCC-5.0.
On Wed, 25 Feb 2015, Chung-Ju Wu wrote:
> Committed as revision 1.82 of htdocs/gcc-5/changes.html with minor
> adjustment.

Thanks for adding those release notes!

I just applied a number of editorial changes on top of the original patch. Let me know if there are further changes you'd like to see (or if I misunderstood some aspect).

Gerald

Index: changes.html
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v
retrieving revision 1.96
diff -u -r1.96 changes.html
--- changes.html	8 Apr 2015 06:56:07 -0000	1.96
+++ changes.html	8 Apr 2015 07:24:57 -0000
@@ -759,15 +759,15 @@
 
 <h3 id="nds32">NDS32</h3>
   <ul>
-    <li>The variadic function ABI implementation is now compatible to the past
-    Andes toolchains where caller uses registers to pass arguments and callee
-    is in charge of pushing them into stack.</li>
+    <li>The variadic function ABI implementation is now compatible with
+    past Andes toolchains where the caller uses registers to pass arguments
+    and the callee is in charge of pushing them on stack.</li>
     <li>The options <code>-mforce-fp-as-gp</code>, <code>-mforbid-fp-as-gp</code>,
-    and <code>-mex9</code> have been removed since they are not available yet in
-    the nds32 port of GNU binutils package.</li>
-    <li>New option <code>-mcmodel=[small|medium|large]</code> is provided to
-    support varied code model on code generation.  The <code>-mgp-direct</code>
-    option now becomes meaningless and can be discarded.</li>
+    and <code>-mex9</code> have been removed since they are not yet available
+    in the nds32 port of GNU binutils.</li>
+    <li>A new option <code>-mcmodel=[small|medium|large]</code> supports
+    varied code models on code generation.  The <code>-mgp-direct</code>
+    option became meaningless and can be discarded.</li>
   </ul>
 
 <h3 id="sh">SH</h3>
Re: [patch,avr]: Part2: Fix various problems with specs and specs file generation.
2015-04-07 15:34 GMT+03:00 Georg-Johann Lay <a...@gjlay.de>:
> On 04/06/2015 at 11:54 AM, Sivanupandi, Pitchumani wrote:
>> Hi Johann,
>>
>> Did you try running g++ tests? It seems xgcc is invoked to get multilibs
>> (from gcc/testsuite/lib/g++.exp) which failed to find specs file.
>
> This is because libgloss.exp:get_multilibs (used from g++_init) runs xgcc
> ($compiler) without -B, i.e. without any prefix. Without prefix there is no
> way to determine where the specs files are located.
>
> Patching driver_self_specs to read a specs file by means of -specs= is,
> well, not very common. I don't know any other target which does that.
>
> As a work-around you can run the tests against the installed compiler.
>
> Denis, what do you think? I could add yet another fixme to avr backend
> like the following; that way there's no need to change dejagnu:
>
> Johann
>
> Index: config/avr/driver-avr.c
> ===================================================================
> --- config/avr/driver-avr.c     (revision 221602)
> +++ config/avr/driver-avr.c     (working copy)
> @@ -80,6 +80,20 @@ avr_devicespecs_file (int argc, const ch
>        return X_NODEVLIB;
>  
>      case 1:
> +      if (0 == strcmp ("device-specs", argv[0]))
> +        {
> +          /* FIXME:  This means "device-specs%s" from avr.h:DRIVER_SELF_SPECS
> +             has not been resolved to a path.  That case can occur when the
> +             c++ testsuite is run from the build directory.  DejaGNU's
> +             libgloss.exp:get_multilibs runs $compiler without -B, i.e. runs
> +             xgcc without specifying a prefix.  Without any prefix, there is
> +             no means to find out where the specs files might be located.
> +             get_multilibs runs xgcc --print-multi-lib, hence we don't
> +             actually need information from a specs file and may skip it
> +             altogether.  */
> +          return X_NODEVLIB;
> +        }
> +
>        mmcu = AVR_MMCU_DEFAULT;
>        break;
>  

I'm weak in dejagnu internals and c++ testsuite.
It looks like an acceptable solution.

Denis.
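The check at the heart of the proposed FIXME can be tried in isolation. The sketch below is not the driver code: `toy_specs_result` and `toy_classify_specs_arg` are hypothetical names, and it assumes, as the patch comment describes, that an unresolved "device-specs%s" arrives as the bare string "device-specs". The path used in the usage note is an invented example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes for this sketch.  */
enum toy_specs_result { TOY_SKIP_SPECS, TOY_USE_SPECS };

/* Decide whether the first spec-function argument is usable as a path.
   If the driver could not resolve "device-specs%s" (for example because
   no -B prefix was given), the argument is still the bare name and
   there is nothing to read, so the specs file is skipped.  */
enum toy_specs_result toy_classify_specs_arg (const char *arg)
{
  assert (arg != NULL);
  if (strcmp ("device-specs", arg) == 0)
    return TOY_SKIP_SPECS;
  return TOY_USE_SPECS;
}
```

For instance, `toy_classify_specs_arg ("device-specs")` yields `TOY_SKIP_SPECS`, while a resolved location such as `"/some/prefix/device-specs"` yields `TOY_USE_SPECS`.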
Re: [libstdc++/65033] Give alignment info to libatomic
On 07/04/15 23:58 -0400, Hans-Peter Nilsson wrote:
> I'd expect
>
> alignof(ai): 4
> .is_lock_free(): 1
>
> No... wait, that's because atomic_base.h doesn't have the
> natural-alignment fix, so it's still broken for
> less-than-natural-alignment targets. But will be fixed?

Yes, with my uncommitted patch to add the alignas specifier to __atomic_base<_ITp> I get your expected output.
[Ada] Fix incorrect call to Pure function returning discriminated type
This disables incorrect optimization (mainly CSE) of calls to Pure functions returning a discriminated record type. These functions allocate their return value on the secondary stack and thus calls to them cannot be CSE'ed because the stack can be reclaimed in between.

Tested on x86_64-suse-linux, applied on the mainline.

2015-04-08  Eric Botcazou  <ebotca...@adacore.com>

	* gcc-interface/decl.c (gnat_to_gnu_entity) <E_Function>: Do not make
	a function returning an unconstrained type 'const' for the middle-end.

2015-04-08  Eric Botcazou  <ebotca...@adacore.com>

	* gnat.dg/opt48.adb: New test.
	* gnat.dg/opt48_pkg1.ad[sb]: New helper.
	* gnat.dg/opt48_pkg2.ad[sb]: Likewise.

-- 
Eric Botcazou

Index: gcc-interface/decl.c
===================================================================
--- gcc-interface/decl.c	(revision 221915)
+++ gcc-interface/decl.c	(working copy)
@@ -4266,8 +4266,9 @@ gnat_to_gnu_entity (Entity_Id gnat_entit
             return_by_direct_ref_p = true;
           }
 
-        /* If we are supposed to return an unconstrained array type, make
-           the actual return type the fat pointer type.  */
+        /* If the return type is an unconstrained array type, the return
+           value will be allocated on the secondary stack so the actual
+           return type is the fat pointer type.  */
         else if (TREE_CODE (gnu_return_type) == UNCONSTRAINED_ARRAY_TYPE)
           {
             gnu_return_type = TREE_TYPE (gnu_return_type);
@@ -4275,8 +4276,8 @@ gnat_to_gnu_entity (Entity_Id gnat_entit
           }
 
         /* Likewise, if the return type requires a transient scope, the
-           return value will be allocated on the secondary stack so the
-           actual return type is the pointer type.  */
+           return value will also be allocated on the secondary stack so
+           the actual return type is the pointer type.  */
         else if (Requires_Transient_Scope (gnat_return_type))
           {
             gnu_return_type = build_pointer_type (gnu_return_type);
@@ -4591,11 +4592,14 @@ gnat_to_gnu_entity (Entity_Id gnat_entit
                                        return_by_direct_ref_p,
                                        return_by_invisi_ref_p);
 
-        /* A subprogram (something that doesn't return anything) shouldn't
-           be considered const since there would be no reason for such a
+        /* A procedure (something that doesn't return anything) shouldn't be
+           considered const since there would be no reason for calling such a
            subprogram.  Note that procedures with Out (or In Out) parameters
-           have already been converted into a function with a return type.  */
-        if (TREE_CODE (gnu_return_type) == VOID_TYPE)
+           have already been converted into a function with a return type.
+           Similarly, if the function returns an unconstrained type, then the
+           function will allocate the return value on the secondary stack and
+           thus calls to it cannot be CSE'ed, lest the stack be reclaimed.  */
+        if (TREE_CODE (gnu_return_type) == VOID_TYPE || return_unconstrained_p)
           const_flag = false;
 
         if (const_flag || volatile_flag)

-- { dg-do run }
-- { dg-options "-O" }

with Opt48_Pkg1; use Opt48_Pkg1;
with Opt48_Pkg2; use Opt48_Pkg2;

procedure Opt48 is
begin
   if Get_Z /= (12, "Hello world!") then
      raise Program_Error;
   end if;
end;

package body Opt48_Pkg1 is

   function G return Rec is
   begin
      return (32, );
   end G;

   X : Rec := F;
   Y : Rec := G;
   Z : Rec := F;

   function Get_Z return Rec is
   begin
      return Z;
   end;

end Opt48_Pkg1;

with Opt48_Pkg2; use Opt48_Pkg2;

package Opt48_Pkg1 is

   function Get_Z return Rec;

end Opt48_Pkg1;

package body Opt48_Pkg2 is

   function F return Rec is
   begin
      return (12, "Hello world!");
   end F;

end Opt48_Pkg2;

package Opt48_Pkg2 is

   pragma Pure;

   type Rec (L : Natural) is record
      S : String (1 .. L);
   end record;

   function F return Rec;

end Opt48_Pkg2;
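Why reusing the result of an earlier call is unsafe here can be mimicked in C with a toy bump allocator. This is a sketch, not a model of the GNAT runtime: `toy_stack`, `toy_mark`, `toy_release`, `toy_make` and `toy_reuse_is_safe` are invented names, and the fixed 64-byte buffer is an arbitrary assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy "secondary stack": a bump allocator with mark/release.  */
static char toy_stack[64];
static size_t toy_top;

size_t toy_mark (void) { return toy_top; }
void toy_release (size_t mark) { toy_top = mark; }

/* Return a string of unconstrained length by allocating it on the toy
   secondary stack; the result is only valid until the next release.  */
const char *toy_make (const char *text)
{
  size_t len = strlen (text) + 1;
  assert (toy_top + len <= sizeof toy_stack);
  char *p = toy_stack + toy_top;
  memcpy (p, text, len);
  toy_top += len;
  return p;
}

/* Returns 1 if reusing the pointer from an earlier call, as CSE would,
   still yields TEXT after the stack was released and reused for OTHER.  */
int toy_reuse_is_safe (const char *text, const char *other)
{
  size_t mark = toy_mark ();
  const char *first = toy_make (text);   /* First call.  */
  toy_release (mark);                    /* Stack reclaimed in between.  */
  toy_make (other);                      /* Storage reused by another call.  */
  int same = strcmp (first, text) == 0;  /* Stand-in for the CSE'ed second call.  */
  toy_release (mark);
  return same;
}
```

The stale pointer only still compares equal when the storage happens to be rewritten with the same contents, which is why the function cannot be treated as 'const' in general.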
RE: [PATCH] Fix bdverN vector cost of cond_[not_]taken_branch_cost
On Wed, 8 Apr 2015, Gopalasubramanian, Ganesh wrote:
>> I have added a person from AMD to comment on the decision. Otherwise,
>> the patch looks OK, but please wait a couple of days for possible
>> comments.
>
> Thank you Uros! I am checking the changes with few tests and
> benchmarking them.
> Please wait for a couple of days.

Note that before the fixes for PR64909 the epilogue/prologue loops had very large costs associated due to a bug in the cost model implementation. After the fix their cost is reasonable but the cost of the extra jumps is way under-accounted for due to the numbers for cond_taken_branch_cost and cond_not_taken_branch_cost. The proposed patch mitigates that somewhat.

How did you arrive at the original cost model?

Thanks,
Richard.

-- 
Richard Biener <rguent...@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild, Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)
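How much the two branch costs can matter is easy to see in a toy estimate. The sketch below is explicitly not GCC's vectorizer cost model: `toy_costs` and the three functions are invented, the formula is a deliberately simple assumption (vector body, scalar leftover iterations, one taken and one not-taken guard branch), and only the two field names echoing cond_taken_branch_cost and cond_not_taken_branch_cost come from the thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Made-up cost parameters for a toy profitability estimate.  */
struct toy_costs
{
  unsigned scalar_iter_cost;            /* One scalar iteration.  */
  unsigned vector_iter_cost;            /* One vector iteration (vf elements).  */
  unsigned vf;                          /* Vectorization factor.  */
  unsigned cond_taken_branch_cost;      /* Extra jump that is taken.  */
  unsigned cond_not_taken_branch_cost;  /* Extra jump that falls through.  */
};

/* Cost of running all iterations in the scalar loop.  */
unsigned toy_scalar_cost (const struct toy_costs *c, unsigned niters)
{
  return niters * c->scalar_iter_cost;
}

/* Vector body, scalar epilogue for the leftover iterations, plus one
   taken and one not-taken guard branch around the epilogue.  */
unsigned toy_vector_cost (const struct toy_costs *c, unsigned niters)
{
  assert (c->vf > 0);
  return (niters / c->vf) * c->vector_iter_cost
         + (niters % c->vf) * c->scalar_iter_cost
         + c->cond_taken_branch_cost
         + c->cond_not_taken_branch_cost;
}

/* Vectorize only when the estimate is strictly cheaper.  */
bool toy_vectorization_profitable (const struct toy_costs *c, unsigned niters)
{
  return toy_vector_cost (c, niters) < toy_scalar_cost (c, niters);
}
```

With a short trip count the branch terms dominate the comparison, so under-accounting for them tips the decision towards vectorizing loops that do not benefit.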
Re: [C++ PATCH] Fix alignment handling in build_cplus_array_type/cp_build_qualified_type_real (PR c++/65690)
> On 04/08/2015 06:02 AM, Jakub Jelinek wrote:
>> (cp_build_qualified_type_real): Use check_base_type. Build a
>> variant and copy over even TYPE_CONTEXT and
>> TYPE_ALIGN/TYPE_USER_ALIGN if any of those are different.
>
> This seems wrong. If there is an array with the same name,
> attributes and element type, it should have the same alignment; if

One of the problems is that cp_build_qualified_type rebuilds the array from scratch and never copies the attribute list around (as opposed to build_qualified_type, which just memcpys the type node).

Honza

> it doesn't, that probably means that one of the types hasn't been
> laid out yet. We don't want to have two variants of the same array
> that are distinguished only by whether they've been laid out,
> especially since later probably both will be laid out and the two
> types will be the same.
>
> Jason
Re: [C++ PATCH] Fix alignment handling in build_cplus_array_type/cp_build_qualified_type_real (PR c++/65690)
On Wed, Apr 08, 2015 at 06:22:10PM +0200, Jan Hubicka wrote: On 04/08/2015 06:02 AM, Jakub Jelinek wrote: (cp_build_qualified_type_real): Use check_base_type. Build a variant and copy over even TYPE_CONTEXT and TYPE_ALIGN/TYPE_USER_ALIGN if any of those are different. This seems wrong. If there is an array with the same name, attributes and element type, it should have the same alignment; if One of the problems is that cp_build_qualified_type rebuilds the array from scratch and never copies the attribute list around (as opposed to build_qualified_type, which just memcpys the type node). As I said earlier, TYPE_ATTRIBUTES is NULL here anyway, because the attributes hang in DECL_ATTRIBUTES of TYPE_DECL. And, except for config/sol2.c (which looks wrong), nothing ever calls lookup_attribute for aligned anyway; the user-aligned stuff is encoded in TYPE_USER_ALIGN and/or DECL_USER_ALIGN and TYPE_ALIGN/DECL_ALIGN. Jakub
Re: [C++ PATCH] Fix alignment handling in build_cplus_array_type/cp_build_qualified_type_real (PR c++/65690)
On Wed, Apr 08, 2015 at 06:22:10PM +0200, Jan Hubicka wrote: On 04/08/2015 06:02 AM, Jakub Jelinek wrote: (cp_build_qualified_type_real): Use check_base_type. Build a variant and copy over even TYPE_CONTEXT and TYPE_ALIGN/TYPE_USER_ALIGN if any of those are different. This seems wrong. If there is an array with the same name, attributes and element type, it should have the same alignment; if One of the problems is that cp_build_qualified_type rebuilds the array from scratch and never copies the attribute list around (as opposed to build_qualified_type, which just memcpys the type node). As I said earlier, TYPE_ATTRIBUTES is NULL here anyway, because the attributes hang in DECL_ATTRIBUTES of TYPE_DECL. And, except for config/sol2.c (which looks wrong), nothing ever calls lookup_attribute for aligned anyway; the user-aligned stuff is encoded in TYPE_USER_ALIGN and/or DECL_USER_ALIGN and TYPE_ALIGN/DECL_ALIGN. This is interesting too. I did know that alignment is lowered into TYPE_USER_ALIGN/TYPE_ALIGN values, but there is a lot of other code (such as nonnull_arg_p in tree-vrp, or alloc_object_size) that looks for type attributes by searching TYPE_ATTRIBUTES, not DECL_ATTRIBUTES of TYPE_DECL. Does it mean that those attributes are ignored for C++ produced types? Honza Jakub
Re: libgomp nvptx plugin: rework initialisation and support the proposed load/unload hooks (was: Merge current set of OpenACC changes from gomp-4_0-branch)
On Tue, 7 Apr 2015 17:26:45 +0200 Jakub Jelinek ja...@redhat.com wrote: On Mon, Apr 06, 2015 at 03:45:57PM +0300, Ilya Verbin wrote: On Wed, Apr 01, 2015 at 15:20:25 +0200, Jakub Jelinek wrote: LGTM with proper ChangeLog entry. I've commited this patch into trunk. Julian, you probably want to update the nvptx plugin. Note that as the number of P1s without posted fixes is now zero, it is likely RC1 will be done this week, so if you want nvptx working in GCC 5, please post a fix as soon as possible. This version is mostly the same as the last posted version but has a tweak in GOACC_parallel to account for the new splay tree arrangement for target functions: - tgt_fn = (void (*)) tgt_fn_key-tgt-tgt_start; + tgt_fn = (void (*)) tgt_fn_key-tgt_offset; Have there been any other changes I might have missed? It passes libgomp testing on NVPTX. OK? Thanks, Juliancommit ac06b5e25e170061bb9855b9ea4b8e5696816bf1 Author: Julian Brown jul...@codesourcery.com Date: Tue Apr 7 09:23:58 2015 -0700 NVPTX load/unload and init-rework patch. diff --git a/gcc/config/nvptx/mkoffload.c b/gcc/config/nvptx/mkoffload.c index 02c44b6..dbc68bc 100644 --- a/gcc/config/nvptx/mkoffload.c +++ b/gcc/config/nvptx/mkoffload.c @@ -839,6 +839,7 @@ process (FILE *in, FILE *out) { const char *input = read_file (in); Token *tok = tokenize (input); + unsigned int nvars = 0, nfuncs = 0; do tok = parse_file (tok); @@ -850,16 +851,17 @@ process (FILE *in, FILE *out) write_stmts (out, rev_stmts (fns)); fprintf (out, ;\n\n); fprintf (out, static const char *var_mappings[] = {\n); - for (id_map *id = var_ids; id; id = id-next) + for (id_map *id = var_ids; id; id = id-next, nvars++) fprintf (out, \t\%s\%s\n, id-ptx_name, id-next ? , : ); fprintf (out, };\n\n); fprintf (out, static const char *func_mappings[] = {\n); - for (id_map *id = func_ids; id; id = id-next) + for (id_map *id = func_ids; id; id = id-next, nfuncs++) fprintf (out, \t\%s\%s\n, id-ptx_name, id-next ? 
, : ); fprintf (out, };\n\n); fprintf (out, static const void *target_data[] = {\n); - fprintf (out, ptx_code, var_mappings, func_mappings\n); + fprintf (out, ptx_code, (void*) %u, var_mappings, (void*) %u, + func_mappings\n, nvars, nfuncs); fprintf (out, };\n\n); fprintf (out, extern void GOMP_offload_register (const void *, int, void *);\n); diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h index a1d42c5..5272f01 100644 --- a/libgomp/libgomp.h +++ b/libgomp/libgomp.h @@ -655,9 +655,6 @@ struct target_mem_desc { /* Corresponding target device descriptor. */ struct gomp_device_descr *device_descr; - /* Memory mapping info for the thread that created this descriptor. */ - struct splay_tree_s *mem_map; - /* List of splay keys to remove (or decrease refcount) at the end of region. */ splay_tree_key list[]; @@ -691,18 +688,6 @@ typedef struct acc_dispatch_t /* This is guarded by the lock in the outer struct gomp_device_descr. */ struct target_mem_desc *data_environ; - /* Extra information required for a device instance by a given target. */ - /* This is guarded by the lock in the outer struct gomp_device_descr. */ - void *target_data; - - /* Open or close a device instance. */ - void *(*open_device_func) (int n); - int (*close_device_func) (void *h); - - /* Set or get the device number. */ - int (*get_device_num_func) (void); - void (*set_device_num_func) (int); - /* Execute. */ void (*exec_func) (void (*) (void *), size_t, void **, void **, size_t *, unsigned short *, int, int, int, int, void *); @@ -720,7 +705,7 @@ typedef struct acc_dispatch_t void (*async_set_async_func) (int); /* Create/destroy TLS data. */ - void *(*create_thread_data_func) (void *); + void *(*create_thread_data_func) (int); void (*destroy_thread_data_func) (void *); /* NVIDIA target specific routines. 
*/ diff --git a/libgomp/oacc-async.c b/libgomp/oacc-async.c index 08b7c5e..1f5827e 100644 --- a/libgomp/oacc-async.c +++ b/libgomp/oacc-async.c @@ -26,7 +26,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see http://www.gnu.org/licenses/. */ - +#include assert.h #include openacc.h #include libgomp.h #include oacc-int.h @@ -37,13 +37,23 @@ acc_async_test (int async) if (async acc_async_sync) gomp_fatal (invalid async argument: %d, async); - return base_dev-openacc.async_test_func (async); + struct goacc_thread *thr = goacc_thread (); + + if (!thr || !thr-dev) +gomp_fatal (no device active); + + return thr-dev-openacc.async_test_func (async); } int acc_async_test_all (void) { - return base_dev-openacc.async_test_all_func (); + struct goacc_thread *thr = goacc_thread (); + + if (!thr || !thr-dev) +gomp_fatal (no device active); + + return thr-dev-openacc.async_test_all_func (); } void @@ -52,19 +62,34 @@ acc_wait (int async) if (async acc_async_sync)
Re: libgomp nvptx plugin: rework initialisation and support the proposed load/unload hooks (was: Merge current set of OpenACC changes from gomp-4_0-branch)
On Wed, Apr 08, 2015 at 15:31:42 +0100, Julian Brown wrote: This version is mostly the same as the last posted version but has a tweak in GOACC_parallel to account for the new splay tree arrangement for target functions: - tgt_fn = (void (*)) tgt_fn_key-tgt-tgt_start; + tgt_fn = (void (*)) tgt_fn_key-tgt_offset; Have there been any other changes I might have missed? No. It passes libgomp testing on NVPTX. OK? Have you tested it with disabled offloading? I see several regressions: FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/acc_on_device-1.c -DACC_DEVICE_TYPE_host_nonshm=1 -DACC_MEM_SHARED=0 execution test FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/if-1.c -DACC_DEVICE_TYPE_host_nonshm=1 -DACC_MEM_SHARED=0 execution test -- Ilya
[PATCH] Update {x86_64,i[34]86,aarch64,s390{,x},powerpc64}-linux baseline_symbols.txt
Hi! Attached patch updates baseline_symbols.txt for a couple of architectures. Don't have 32-bit powerpc-linux around though this time. Ok for trunk? 2015-04-07 Jakub Jelinek ja...@redhat.com * config/abi/post/x86_64-linux-gnu/baseline_symbols.txt: Update. * config/abi/post/x86_64-linux-gnu/32/baseline_symbols.txt: Update. * config/abi/post/i386-linux-gnu/baseline_symbols.txt: Update. * config/abi/post/i486-linux-gnu/baseline_symbols.txt: Update. * config/abi/post/aarch64-linux-gnu/baseline_symbols.txt: Update. * config/abi/post/s390x-linux-gnu/baseline_symbols.txt: Update. * config/abi/post/s390-linux-gnu/baseline_symbols.txt: Update. * config/abi/post/powerpc64-linux-gnu/baseline_symbols.txt: Update. Jakub libstdc++-baseline-update.patch.xz Description: application/xz
[testsuite, i386] Avoid finite in gcc.target/i386/avx512dq-vfpclasspd-2.c etc.
Some AVX512DQ tests are failing on Solaris/x86 with gas: FAIL: gcc.target/i386/avx512dq-vfpclasspd-2.c (test for excess errors) FAIL: gcc.target/i386/avx512dq-vfpclassps-2.c (test for excess errors) FAIL: gcc.target/i386/avx512vl-vfpclasspd-2.c (test for excess errors) FAIL: gcc.target/i386/avx512vl-vfpclassps-2.c (test for excess errors) FAIL: gcc.target/i386/avx512dq-vfpclasspd-2.c (test for excess errors) Excess errors: /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.target/i386/avx512dq-vfpclasspd-2.c:24:20: warning: implicit declaration of function 'finite' [-Wimplicit-function-declaration] finite() is non-standard and requires ieeefp.h on Solaris. Instead of this, I believe it's simplest to just use __builtin_finite instead. The following patch does just that. Tested with the appropriate runtest invocation on i386-pc-solaris2.11 and x86_64-unknown-linux-gnu. Ok for mainline? Rainer 2015-04-08 Rainer Orth r...@cebitec.uni-bielefeld.de * gcc.target/i386/avx512dq-vfpclasspd-2.c (check_fp_class_dp): Use __builtin_finite instead of finite. * gcc.target/i386/avx512dq-vfpclassps-2.c (check_fp_class_sp): Likewise. # HG changeset patch # Parent 4a9cc7114f8129eb1eda0f314ca2ae48494b48be Avoid finite in gcc.target/i386/avx512dq-vfpclasspd-2.c etc. 
diff --git a/gcc/testsuite/gcc.target/i386/avx512dq-vfpclasspd-2.c b/gcc/testsuite/gcc.target/i386/avx512dq-vfpclasspd-2.c --- a/gcc/testsuite/gcc.target/i386/avx512dq-vfpclasspd-2.c +++ b/gcc/testsuite/gcc.target/i386/avx512dq-vfpclasspd-2.c @@ -21,7 +21,7 @@ int check_fp_class_dp (double src, int i int PInf_res = (isinf (src) == 1); int NInf_res = (isinf (src) == -1); int Denorm_res = (fpclassify (src) == FP_SUBNORMAL); - int FinNeg_res = finite (src) (src 0); + int FinNeg_res = __builtin_finite (src) (src 0); int result = (((imm 1) qNaN_res) || (((imm 1) 1) Pzero_res) diff --git a/gcc/testsuite/gcc.target/i386/avx512dq-vfpclassps-2.c b/gcc/testsuite/gcc.target/i386/avx512dq-vfpclassps-2.c --- a/gcc/testsuite/gcc.target/i386/avx512dq-vfpclassps-2.c +++ b/gcc/testsuite/gcc.target/i386/avx512dq-vfpclassps-2.c @@ -21,7 +21,7 @@ int check_fp_class_sp (float src, int im int PInf_res = (isinf (src) == 1); int NInf_res = (isinf (src) == -1); int Denorm_res = (fpclassify (src) == FP_SUBNORMAL); - int FinNeg_res = finite (src) (src 0); + int FinNeg_res = __builtin_finite (src) (src 0); int result = (((imm 1) qNaN_res) || (((imm 1) 1) Pzero_res) -- - Rainer Orth, Center for Biotechnology, Bielefeld University
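For readers without the test file at hand, the idea of the change can be sketched in isolation. This is only an illustration, assuming a GCC-compatible compiler; the helper names is_finite_negative and check_examples are hypothetical and do not appear in the testsuite. The point is that __builtin_finite needs no prototype, so the implicit-declaration warning for finite() cannot occur and no ieeefp.h is needed.

```c
#include <assert.h>

/* Hypothetical helper modelled on the FinNeg_res computation in the
   test: nonzero when SRC is finite and strictly negative.  Because
   __builtin_finite is a compiler builtin, no declaration of finite()
   has to be visible.  */
static int
is_finite_negative (double src)
{
  return __builtin_finite (src) && (src < 0);
}

/* A few spot checks, written the way a testsuite file might do it.  */
static void
check_examples (void)
{
  assert (is_finite_negative (-1.5));
  assert (!is_finite_negative (2.0));
  assert (!is_finite_negative (-__builtin_inf ()));
}
```

Calling check_examples () from a test's main function is enough to exercise the helper; infinities and NaNs are rejected by the finiteness check, positive values and zero by the sign check.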
Re: [libgo] Remove Solaris 11.1+ zone_net_addr_t treatment
Ian Lance Taylor i...@golang.org writes: On Mon, Nov 3, 2014 at 8:59 AM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote: The recent godump changes broke Solaris 11.1+ bootstrap in libgo: before, gen-sysinfo.so had type _zone_net_addr_t struct { zna_family uint16; zna_plen uint16; zna_addru struct { znau_addr6 _in6_addr; }; } which was filtered out by mksysinfo.sh due to the use of _in6_addr. After the change, there's now type _zone_net_addr_t struct { zna_family uint16; zna_plen uint16; zna_addru struct { znau_addr6 [16]byte; Godump_0_align [0]uint32; }; } instead, not filtered, but added a second time by the _zone_net_addr_t code in mksysinfo.sh, which leads to redefinition warnings/errors. Simply removing the old _zone_net_addr_t fragment fixes this and restores bootstrap. Bootstrapped without regressions on i386-pc-solaris2.1[01], ok for mainline? I just got back to this. Committed to mainline. Thanks. Sorry for the late reply, but between the time I submitted the patch and you committing it, something changed and the mksysinfo.sh fragment became necessary again. In fact, without it Solaris 11 bootstrap is broken. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [PATCH] Walk through thunks when propagating comdat group
On 07 Apr 22:37, Jan Hubicka wrote: OK, thanks! Please put the comment about inline clones just before the global.inlined_to test and make the comment about thunks separate. Probably Thunks can not call across section boundary Honza Here is a committed variant. Thanks, Ilya -- gcc/ 2015-04-08 Ilya Enkovich ilya.enkov...@intel.com * ipa-comdats.c (propagate_comdat_group): Walk through thunks. gcc/testsuite/ 2015-04-08 Ilya Enkovich ilya.enkov...@intel.com * gcc.target/i386/mpx/chkp-thunk-comdat-3.c: New. diff --git a/gcc/ipa-comdats.c b/gcc/ipa-comdats.c index f349f9f..4298b9b 100644 --- a/gcc/ipa-comdats.c +++ b/gcc/ipa-comdats.c @@ -142,12 +142,14 @@ propagate_comdat_group (struct symtab_node *symbol, { struct symtab_node *symbol2 = edge-caller; - /* If we see inline clone, its comdat group actually - corresponds to the comdat group of the function it is inlined - to. */ - if (cgraph_node * cn = dyn_cast cgraph_node * (symbol2)) { + /* Thunks can not call across section boundary. */ + if (cn-thunk.thunk_p) + newgroup = propagate_comdat_group (symbol2, newgroup, map); + /* If we see inline clone, its comdat group actually + corresponds to the comdat group of the function it + is inlined to. */ if (cn-global.inlined_to) symbol2 = cn-global.inlined_to; } diff --git a/gcc/testsuite/gcc.target/i386/mpx/chkp-thunk-comdat-3.c b/gcc/testsuite/gcc.target/i386/mpx/chkp-thunk-comdat-3.c new file mode 100644 index 000..dd0057e --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/mpx/chkp-thunk-comdat-3.c @@ -0,0 +1,23 @@ +/* { dg-do compile } */ +/* { dg-options -fcheck-pointer-bounds -mmpx -O -fvisibility=hidden } */ + +int val; + +static int __attribute__((noinline)) +test1 () +{ + return val; +} + +static int __attribute__((bnd_legacy,noinline)) +test2 () +{ + return test1 (); +} + +int +test3 (void) +{ + return test2 (); +} +
Re: [PATCH] add self-tuning to x86 hardware fast path in libitm
Nuno Diegues n...@ist.utl.pt writes: What workloads did you test this on? +static inline float fastLog(float x) +{ + union { float f; uint32_t i; } vx = { x }; + float y = vx.i; + y *= 8.2629582881927490e-8f; + return y - 87.989971088f; +} + +static inline float fastSqrt(float x) +{ + union + { +int i; +float x; + } u; + + u.x = x; + u.i = (129) + (u.i 1) - (122); + return u.x; +} Are you sure you need floating point here? If the program does not use it in any other ways faulting in the floating point state can be quite expensive. I bet fixed point would work for such simple purposes too. + serial_lock.read_unlock(tx); + + // Obtain the delta performance with respect to the last period. + uint64_t current_cycles = rdtsc(); + uint64_t cycles_used = current_cycles - optimizer.last_cycles; It may be worth pointing out that rdtsc does not return cycles. In fact the ratio to real cycles is variable depending on the changing frequency. I hope your algorithms can handle that. + + // Compute gradient descent for the number of retries. + double change_for_better = current_throughput / optimizer.last_throughput; + double change_for_worse = optimizer.last_throughput / current_throughput; + int32_t last_attempts = optimizer.last_attempts; + int32_t current_attempts = optimizer.optimized_attempts; + int32_t new_attempts = current_attempts; + if (unlikely(change_for_worse 1.40)) +{ + optimizer.optimized_attempts = optimizer.best_ever_attempts; + optimizer.last_throughput = current_throughput; + optimizer.last_attempts = current_attempts; + return; +} + + if (unlikely(random() % 100 1)) +{ So where is the seed for that random stored? Could you corrupt some user's random state? Is the state per thread or global? If it's per thread how do you initialize so that they threads do start with different seeds. If it's global what synchronizes it? Overall the algorithm looks very complicated with many mysterious magic numbers. Are there simplifications possible? 
While the retry path is not extremely critical it should be at least somewhat optimized, otherwise it will dominate the cost of short transactions. One problem with so many magic numbers is that they may be good on one system, but bad on another. -Andi -- a...@linux.intel.com -- Speaking for myself only
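To make the fixed-point suggestion above concrete, here is a minimal sketch of how a ratio test such as the comparison of change_for_worse against 1.40 could be done with integers only. It is not taken from the patch: the function name ratio_exceeds and the percent scaling are assumptions, and the comparison operator is lost in the quoted code, so greater-than is assumed here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical integer-only replacement for a test of the form
   "last / current > 1.40".  RATIO_PERCENT is the threshold scaled by
   100, so 140 stands for 1.40.  Cross-multiplying avoids the division
   and never touches floating-point state.  The asserts document the
   range in which the 64-bit products cannot overflow.  */
static int
ratio_exceeds (uint64_t last, uint64_t current, uint32_t ratio_percent)
{
  assert (ratio_percent != 0);
  assert (last <= UINT64_MAX / 100);
  assert (current <= UINT64_MAX / ratio_percent);
  return last * 100 > current * ratio_percent;
}
```

With this shape, ratio_exceeds (last_throughput, current_throughput, 140) would play the role of the 1.40 check, provided the throughput values are kept as integers. Whether the remaining arithmetic in the patch (the logarithm and square root approximations) can be converted the same way is a separate question.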
Re: [libgo] Remove Solaris 11.1+ zone_net_addr_t treatment
On Wed, Apr 8, 2015 at 6:48 AM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote: Ian Lance Taylor i...@golang.org writes: On Mon, Nov 3, 2014 at 8:59 AM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote: The recent godump changes broke Solaris 11.1+ bootstrap in libgo: before, gen-sysinfo.so had type _zone_net_addr_t struct { zna_family uint16; zna_plen uint16; zna_addru struct { znau_addr6 _in6_addr; }; } which was filtered out by mksysinfo.sh due to the use of _in6_addr. After the change, there's now type _zone_net_addr_t struct { zna_family uint16; zna_plen uint16; zna_addru struct { znau_addr6 [16]byte; Godump_0_align [0]uint32; }; } instead, not filtered, but added a second time by the _zone_net_addr_t code in mksysinfo.sh, which leads to redefinition warnings/errors. Simply removing the old _zone_net_addr_t fragment fixes this and restores bootstrap. Bootstrapped without regressions on i386-pc-solaris2.1[01], ok for mainline? I just got back to this. Committed to mainline. Thanks. Sorry for the late reply, but between the time I submitted the patch and you committing it, something changed and the mksysinfo.sh fragment became necessary again. In fact, without it Solaris 11 bootstrap is broken. To avoid any confusion, can you send me the patch I should apply to mainline? Ian
Re: [PATCH] fix building for alpha-dec-vms
[CC ing maintainers] Ping. On 27 March 2015 at 11:24, tbsaunde+...@tbsaunde.org wrote: From: Trevor Saunders tbsaunde+...@tbsaunde.org Hi, Unfortunately when testing r217869 I didn't realize the modified code in alpha.c was only used for some alpha targets. So testing alpha-linux wasn't enough or even really useful :( I tested cc1 for alpha-dec-vms now builds as discussed before make all-gcc is still broken because the vms targets don't support c++ and don't say it shouldn't be built in config.gcc. Is this ok? Trev diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 37258ad..fac42d6 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,9 @@ +2015-03-27 Trevor Saunders tbsau...@tbsaunde.org + + * config/alpha/alpha.c (alpha_use_linkage): Change type of slot to + alpha_links **. + (alpha_write_one_linkage): Correct typo. + 2015-03-27 Marek Polacek pola...@redhat.com PR sanitizer/65583 diff --git a/gcc/config/alpha/alpha.c b/gcc/config/alpha/alpha.c index 554ff09..67c15dc 100644 --- a/gcc/config/alpha/alpha.c +++ b/gcc/config/alpha/alpha.c @@ -9665,7 +9665,7 @@ alpha_use_linkage (rtx func, bool lflag, bool rflag) if (cfun-machine-links) { /* Is this name already defined? */ - alpha_links *slot = cfun-machine-links-get (name); + alpha_links **slot = cfun-machine-links-get (name); if (slot) al = *slot; } @@ -9711,7 +9711,7 @@ alpha_use_linkage (rtx func, bool lflag, bool rflag) } static int -alpha_write_one_linkage (const char *name, alpha_links *link, FILE *steam) +alpha_write_one_linkage (const char *name, alpha_links *link, FILE *stream) { ASM_OUTPUT_INTERNAL_LABEL (stream, XSTR (link-linkage, 0)); if (link-rkind == KIND_CODEADDR) -- 2.1.4
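The first hunk is easier to follow with the lookup convention spelled out: the patched code dereferences the result once (al = *slot), which suggests that get hands back a pointer to the stored slot, and the slot itself holds a pointer. The following is a stand-alone sketch of that convention in plain C; struct link_info, struct table, table_get and lookup are all hypothetical stand-ins, not GCC's hash_map.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for alpha_links; the real structure has more fields.  */
struct link_info { int num; };

struct entry { const char *name; struct link_info *value; };

/* A tiny fixed-size map, only here to demonstrate the types.  */
struct table { struct entry entries[8]; size_t count; };

/* Return a pointer to the slot that holds the value for NAME, or NULL
   if NAME is absent.  The mapped value is itself a pointer, so the
   result is a pointer to a pointer.  */
static struct link_info **
table_get (struct table *t, const char *name)
{
  assert (t != NULL && name != NULL);
  for (size_t i = 0; i < t->count; i++)
    if (strcmp (t->entries[i].name, name) == 0)
      return &t->entries[i].value;
  return NULL;
}

/* Same shape as the patched code: fetch the slot, then dereference it
   once to obtain the mapped pointer.  */
static struct link_info *
lookup (struct table *t, const char *name)
{
  struct link_info *al = NULL;
  struct link_info **slot = table_get (t, name);
  if (slot)
    al = *slot;
  return al;
}
```

Declaring slot with a single level of indirection, as the pre-patch code did, makes the initialization a type mismatch, which is consistent with the build failure described above.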
Re: libgo patch committed: Build libnetgo.a
I see there is some documentation about build constraints in go/go/build/doc.go. There is no mention of netgo on this page now but this seems like an appropriate spot to mention the use of the netgo tag for gc and then how to achieve the equivalent effect with gccgo. On 04/07/2015 01:09 PM, Ian Lance Taylor wrote: PR 63731 points out that when using gccgo there is no way to request a Go program that uses the native Go DNS lookup code rather than using the system libraries. This patch from Lynn Boger at least provides a mechanism for that, by adding a -lnetgo library that can be used to pick up the Go DNS lookup routines. This isn't a complete fix because we still need to document it somewhere. Bootstrapped on x86_64-unknown-linux-gnu. Committed to mainline. Ian
Re: [PATCH] fix building for alpha-dec-vms
On Fri, Mar 27, 2015 at 11:24 AM, tbsaunde+...@tbsaunde.org wrote: From: Trevor Saunders tbsaunde+...@tbsaunde.org Hi, Unfortunately when testing r217869 I didn't realize the modified code in alpha.c was only used for some alpha targets. So testing alpha-linux wasn't enough or even really useful :( I tested cc1 for alpha-dec-vms now builds as discussed before make all-gcc is still broken because the vms targets don't support c++ and don't say it shouldn't be built in config.gcc. Is this ok? Ok. Thanks, Richard. Trev diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 37258ad..fac42d6 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,9 @@ +2015-03-27 Trevor Saunders tbsau...@tbsaunde.org + + * config/alpha/alpha.c (alpha_use_linkage): Change type of slot to + alpha_links **. + (alpha_write_one_linkage): Correct typo. + 2015-03-27 Marek Polacek pola...@redhat.com PR sanitizer/65583 diff --git a/gcc/config/alpha/alpha.c b/gcc/config/alpha/alpha.c index 554ff09..67c15dc 100644 --- a/gcc/config/alpha/alpha.c +++ b/gcc/config/alpha/alpha.c @@ -9665,7 +9665,7 @@ alpha_use_linkage (rtx func, bool lflag, bool rflag) if (cfun-machine-links) { /* Is this name already defined? */ - alpha_links *slot = cfun-machine-links-get (name); + alpha_links **slot = cfun-machine-links-get (name); if (slot) al = *slot; } @@ -9711,7 +9711,7 @@ alpha_use_linkage (rtx func, bool lflag, bool rflag) } static int -alpha_write_one_linkage (const char *name, alpha_links *link, FILE *steam) +alpha_write_one_linkage (const char *name, alpha_links *link, FILE *stream) { ASM_OUTPUT_INTERNAL_LABEL (stream, XSTR (link-linkage, 0)); if (link-rkind == KIND_CODEADDR) -- 2.1.4
[v3] Update Solaris baselines
With the GCC 5 release approaching, it's time to update the Solaris baselines again. This patch does just that and is pretty much straightforward:

* With one exception, all new symbols are in the GLIBCXX_3.4.21 and CXXABI_1.3.9 versions.

* On Solaris/x86 only (obviously), we have

+OBJECT:0:CXXABI_FLOAT128
+OBJECT:16:_ZTIg@@CXXABI_FLOAT128
+OBJECT:2:_ZTSg@@CXXABI_FLOAT128
+OBJECT:32:_ZTIPKg@@CXXABI_FLOAT128
+OBJECT:32:_ZTIPg@@CXXABI_FLOAT128
+OBJECT:3:_ZTSPg@@CXXABI_FLOAT128
+OBJECT:4:_ZTSPKg@@CXXABI_FLOAT128

  For the moment (considering how late we are in the release cycle), I've decided to use the SPARC version for the common baseline, so those symbols show up as added on x86.

* On Solaris 11, there are two additional symbols beyond those in Solaris 10:

+FUNC:std::__cxx11::basic_stringbuf<char, std::char_traits<char>, std::allocator<char> >::~basic_stringbuf()@@GLIBCXX_3.4.21
+FUNC:std::__cxx11::basic_stringbuf<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> >::~basic_stringbuf()@@GLIBCXX_3.4.21

  They are from src/c++98/complex_io.o and src/c++11/sstream-inst.o, respectively. No idea where this difference comes from; it might be related to the fact that Solaris 11 has COMDAT group support while Solaris 10 does not. I've again decided to live with this difference and use the Solaris 10 version of the baselines.

Bootstrapped without regressions on i386-pc-solaris2.1[01] and sparc-sun-solaris2.1[01], abi_check results are clean with the additions noted above. Ok for mainline? Rainer

2015-04-02 Rainer Orth r...@cebitec.uni-bielefeld.de

* config/abi/post/solaris2.10/baseline_symbols.txt: Regenerate.
* config/abi/post/solaris2.10/amd64/baseline_symbols.txt: Likewise.
* config/abi/post/solaris2.10/sparcv9/baseline_symbols.txt: Likewise.

sol2-libstdc++-baseline-gcc50.patch.gz Description: Binary data -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [libgo] Remove Solaris 11.1+ zone_net_addr_t treatment
Ian Lance Taylor i...@golang.org writes: On Wed, Apr 8, 2015 at 6:48 AM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote: Ian Lance Taylor i...@golang.org writes: On Mon, Nov 3, 2014 at 8:59 AM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote: The recent godump changes broke Solaris 11.1+ bootstrap in libgo: before, gen-sysinfo.so had type _zone_net_addr_t struct { zna_family uint16; zna_plen uint16; zna_addru struct { znau_addr6 _in6_addr; }; } which was filtered out by mksysinfo.sh due to the use of _in6_addr. After the change, there's now type _zone_net_addr_t struct { zna_family uint16; zna_plen uint16; zna_addru struct { znau_addr6 [16]byte; Godump_0_align [0]uint32; }; } instead, not filtered, but added a second time by the _zone_net_addr_t code in mksysinfo.sh, which leads to redefinition warnings/errors. Simply removing the old _zone_net_addr_t fragment fixes this and restores bootstrap. Bootstrapped without regressions on i386-pc-solaris2.1[01], ok for mainline? I just got back to this. Committed to mainline. Thanks. Sorry for the late reply, but between the time I submitted the patch and you committing it, something changed and the mksysinfo.sh fragment became necessary again. In fact, without it Solaris 11 bootstrap is broken. To avoid any confusion, can you send me the patch I should apply to mainline? Sure: here's what I have in my tree. Thanks. Rainer # HG changeset patch # Parent fb5daa5b2c139aa02220feb898ac29bbafb1cb00 Handle Solaris 11 Update 1 zone_net_addr_t diff --git a/libgo/mksysinfo.sh b/libgo/mksysinfo.sh --- a/libgo/mksysinfo.sh +++ b/libgo/mksysinfo.sh @@ -1065,4 +1065,9 @@ grep '^type _ipv6_member_t ' gen-sysinfo egrep '^const _(MIB2|EXPER)_' gen-sysinfo.go | \ sed -e 's/^\(const \)_\([^= ]*\)\(.*\)$/\1\2 = _\2/' ${OUT} +# The Solaris 11 Update 1 _zone_net_addr_t struct. +grep '^type _zone_net_addr_t ' gen-sysinfo.go | \ +sed -e 's/_in6_addr/[16]byte/' \ + ${OUT} + exit $? -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [PATCH] fix building for alpha-dec-vms
On Wed, Apr 08, 2015 at 03:57:09PM +0200, Bernhard Reutner-Fischer wrote: [CC ing maintainers] Ping. This is ok. --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,9 @@ +2015-03-27 Trevor Saunders tbsau...@tbsaunde.org + + * config/alpha/alpha.c (alpha_use_linkage): Change type of slot to + alpha_links **. + (alpha_write_one_linkage): Correct typo. + 2015-03-27 Marek Polacek pola...@redhat.com PR sanitizer/65583 diff --git a/gcc/config/alpha/alpha.c b/gcc/config/alpha/alpha.c index 554ff09..67c15dc 100644 --- a/gcc/config/alpha/alpha.c +++ b/gcc/config/alpha/alpha.c @@ -9665,7 +9665,7 @@ alpha_use_linkage (rtx func, bool lflag, bool rflag) if (cfun-machine-links) { /* Is this name already defined? */ - alpha_links *slot = cfun-machine-links-get (name); + alpha_links **slot = cfun-machine-links-get (name); if (slot) al = *slot; } @@ -9711,7 +9711,7 @@ alpha_use_linkage (rtx func, bool lflag, bool rflag) } static int -alpha_write_one_linkage (const char *name, alpha_links *link, FILE *steam) +alpha_write_one_linkage (const char *name, alpha_links *link, FILE *stream) { ASM_OUTPUT_INTERNAL_LABEL (stream, XSTR (link-linkage, 0)); if (link-rkind == KIND_CODEADDR) -- 2.1.4 Jakub
Re: [libgo] Remove Solaris 11.1+ zone_net_addr_t treatment
On Wed, Apr 8, 2015 at 7:10 AM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote: Ian Lance Taylor i...@golang.org writes: On Wed, Apr 8, 2015 at 6:48 AM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote: Ian Lance Taylor i...@golang.org writes: On Mon, Nov 3, 2014 at 8:59 AM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote: The recent godump changes broke Solaris 11.1+ bootstrap in libgo: before, gen-sysinfo.so had type _zone_net_addr_t struct { zna_family uint16; zna_plen uint16; zna_addru struct { znau_addr6 _in6_addr; }; } which was filtered out by mksysinfo.sh due to the use of _in6_addr. After the change, there's now type _zone_net_addr_t struct { zna_family uint16; zna_plen uint16; zna_addru struct { znau_addr6 [16]byte; Godump_0_align [0]uint32; }; } instead, not filtered, but added a second time by the _zone_net_addr_t code in mksysinfo.sh, which leads to redefinition warnings/errors. Simply removing the old _zone_net_addr_t fragment fixes this and restores bootstrap. Bootstrapped without regressions on i386-pc-solaris2.1[01], ok for mainline? I just got back to this. Committed to mainline. Thanks. Sorry for the late reply, but between the time I submitted the patch and you committing it, something changed and the mksysinfo.sh fragment became necessary again. In fact, without it Solaris 11 bootstrap is broken. To avoid any confusion, can you send me the patch I should apply to mainline? Sure: here's what I have in my tree. Committed. Ian
Re: [C++ PATCH] Fix alignment handling in build_cplus_array_type/cp_build_qualified_type_real (PR c++/65690)
On 04/08/2015 06:02 AM, Jakub Jelinek wrote: (cp_build_qualified_type_real): Use check_base_type. Build a variant and copy over even TYPE_CONTEXT and TYPE_ALIGN/TYPE_USER_ALIGN if any of those are different. This seems wrong. If there is an array with the same name, attributes and element type, it should have the same alignment; if it doesn't, that probably means that one of the types hasn't been laid out yet. We don't want to have two variants of the same array that are distinguished only by whether they've been laid out, especially since later probably both will be laid out and the two types will be the same. Jason
Re: [C++ PATCH] Fix alignment handling in build_cplus_array_type/cp_build_qualified_type_real (PR c++/65690)
On Wed, Apr 08, 2015 at 10:47:15AM -0400, Jason Merrill wrote: On 04/08/2015 06:02 AM, Jakub Jelinek wrote: (cp_build_qualified_type_real): Use check_base_type. Build a variant and copy over even TYPE_CONTEXT and TYPE_ALIGN/TYPE_USER_ALIGN if any of those are different. This seems wrong. If there is an array with the same name, attributes and element type, it should have the same alignment; if it doesn't, that probably means that one of the types hasn't been laid out yet. We don't want to have two variants of the same array that are distinguished only by whether they've been laid out, especially since later probably both will be laid out and the two types will be the same. But that is how handle_aligned_attribute works, since forever (checked it back to 3.2). In <= 3.4.x, it used to create it using build_type_copy, since 4.0.0 using build_variant_type_copy, but both those routines behave the same - build a type variant which is linked in the TYPE_NEXT_VARIANT chain, and differs from the other type in there possibly just by TYPE_ALIGN/TYPE_USER_ALIGN. Perhaps it should check TYPE_ALIGN only if at least one of the two types has TYPE_USER_ALIGN set? As for why TYPE_ATTRIBUTES are NULL, the reason for that is that these are attributes on a typedef, so the attributes go into DECL_ATTRIBUTES of the TYPE_DECL instead. Anyway, the P1 regression is just about the first hunk, so if you have issues just with the second hunk and not the first hunk (from either of the patches), I can just comment out the tests for alignof (const T), and open a separate PR for that for later. Jakub
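The user-visible property behind the alignof (const T) tests mentioned above can be illustrated with a small stand-alone C sketch. It is deliberately simpler than the PR: it uses a scalar typedef rather than an array, it is C rather than C++, and the names aligned_int, plain_alignment, const_alignment and check_alignment are hypothetical. It only shows the expectation that a const-qualified variant of an over-aligned typedef keeps the user-specified alignment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical typedef carrying a user-specified alignment.  The
   attribute sits on the typedef, not on the underlying type.  */
typedef int aligned_int __attribute__ ((aligned (16)));

/* Alignment of the plain typedef.  */
static size_t
plain_alignment (void)
{
  return __alignof__ (aligned_int);
}

/* Alignment of its const-qualified variant.  */
static size_t
const_alignment (void)
{
  return __alignof__ (const aligned_int);
}

/* The two variants are expected to agree.  */
static void
check_alignment (void)
{
  assert (plain_alignment () == const_alignment ());
}
```

The C++ case in the PR is harder because, as noted earlier in the thread, the qualified array type is rebuilt from scratch rather than copied, so the alignment has to be carried over explicitly.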
Re: libgomp nvptx plugin: rework initialisation and support the proposed load/unload hooks (was: Merge current set of OpenACC changes from gomp-4_0-branch)
On Wed, Apr 08, 2015 at 03:31:42PM +0100, Julian Brown wrote: It passes libgomp testing on NVPTX. OK? Please write a proper ChangeLog entry for it. Ok with that. Jakub
Re: [PATCH] add self-tuning to x86 hardware fast path in libitm
Thank you for the feedback. Comments inline.

On Wed, Apr 8, 2015 at 3:05 PM, Andi Kleen a...@firstfloor.org wrote:

Nuno Diegues n...@ist.utl.pt writes:

What workloads did you test this on?

On the STAMP suite of benchmarks for transactional memory (described here [1]). I have run an unmodified GCC 5.0.0 against the patched GCC with these modifications and obtained the following speedups in STAMP with 4 threads (on a Haswell with 4 cores, average of 10 runs):

benchmark        speedup
genome           1.32
intruder         1.66
labyrinth        1.00
ssca2            1.02
yada             1.00
kmeans-high      1.13
kmeans-low       1.10
vacation-high    2.27
vacation-low     1.88

[1] Chi Cao Minh, JaeWoong Chung, Christos Kozyrakis, Kunle Olukotun: STAMP: Stanford Transactional Applications for Multi-Processing. IISWC 2008: 35-46

Are you sure you need floating point here? If the program does not use it in any other ways faulting in the floating point state can be quite expensive. I bet fixed point would work for such simple purposes too.

That is a good point. While I have never used fixed point arithmetic, a cursory inspection suggests that it makes sense and is applicable to this case. Are you aware of some place where this is being done already within GCC that I could use as inspiration, or should I craft some macros from scratch for this?

+  serial_lock.read_unlock(tx);
+
+  // Obtain the delta performance with respect to the last period.
+  uint64_t current_cycles = rdtsc();
+  uint64_t cycles_used = current_cycles - optimizer.last_cycles;

It may be worth pointing out that rdtsc does not return cycles. In fact the ratio to real cycles is variable depending on the changing frequency. I hope your algorithms can handle that.

The intent here is to obtain some notion of time passed at a low cost. RDTSC seemed to be the best choice around: it is not critical that changes in the processor frequency alter the ratio between the returned value and actual CPU cycles.
+
+  // Compute gradient descent for the number of retries.
+  double change_for_better = current_throughput / optimizer.last_throughput;
+  double change_for_worse = optimizer.last_throughput / current_throughput;
+  int32_t last_attempts = optimizer.last_attempts;
+  int32_t current_attempts = optimizer.optimized_attempts;
+  int32_t new_attempts = current_attempts;
+  if (unlikely(change_for_worse > 1.40))
+    {
+      optimizer.optimized_attempts = optimizer.best_ever_attempts;
+      optimizer.last_throughput = current_throughput;
+      optimizer.last_attempts = current_attempts;
+      return;
+    }
+
+  if (unlikely(random() % 100 < 1))
+    {

So where is the seed for that random stored? Could you corrupt some user's random state? Is the state per thread or global? If it's per thread, how do you initialize so that the threads start with different seeds? If it's global, what synchronizes it?

As I do not specify any seed, I was under the impression that there would be a default initialization. Furthermore, the POSIX documentation specifies random() to be MT-safe, so I assumed its internal state to be per-thread. Did I misinterpret this?

With regard to the self-tuning state, it is kept within the gtm_global_optimizer optimizer struct, which is in essence multi-reader (any thread running transactions can check the struct to use the parameters optimized in it) and single-writer (notice that the void GTM::gtm_thread::reoptimize_htm_execution() function is called only by one thread, the one at the end of the list of threads, i.e., whose tx->next_thread == NULL).

Overall the algorithm looks very complicated with many mysterious magic numbers. Are there simplifications possible? While the retry path is not extremely critical it should be at least somewhat optimized, otherwise it will dominate the cost of short transactions. One problem with so many magic numbers is that they may be good on one system, but bad on another.
Notice that the retry path is barely changed in the common case: only a designated thread (the last one in the list of threads registered in libitm) will periodically execute the re-optimization. Hence, most of the patch that you can see here is code executed in that (uncommon) case. I understand the concern with magic numbers: we could self-tune them as well, but that would surely increase the complexity of the patch :) In essence, we have the following numbers at the moment:

* how often we re-optimize (every 500 successful transactions for the designated thread)
* the maximum number of attempts we can have in hardware (20)
* how much better or worse the performance must become for the gradient descent to move to a new configuration (5%)
* how much worse the performance must become for the gradient descent to roll back to the best known configuration so far (40%)
* how often the gradient descent can explore randomly to avoid local optima (1%)

Once again, thank you for your time
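
On the fixed-point question raised earlier in this message, here is a minimal sketch of how the 40% and 5% threshold tests might be written without touching the floating point state. It is an illustration only, not code from the patch: FX_SHIFT, FX_ONE, fx_from_percent, fx_ratio, much_worse and noticeably_better are all hypothetical names, the 1.05 comparison is an assumption about how the 5% threshold is applied, and the Q16 format assumes throughput values stay below 2^48 so the left shift cannot overflow.

```cpp
#include <cstdint>

// Hypothetical Q16 fixed-point helpers; none of these names come from libitm.
constexpr unsigned FX_SHIFT = 16;
constexpr uint64_t FX_ONE = uint64_t(1) << FX_SHIFT;

// Convert a percentage such as 140 (meaning 1.40) to Q16, rounding down.
constexpr uint64_t fx_from_percent(uint64_t pct)
{
  return pct * FX_ONE / 100;
}

// Ratio num/den in Q16.  Assumes num < 2^48; a zero denominator saturates.
constexpr uint64_t fx_ratio(uint64_t num, uint64_t den)
{
  return den == 0 ? UINT64_MAX : (num << FX_SHIFT) / den;
}

// True when last/current exceeds 1.40, i.e. the rollback threshold.
constexpr bool much_worse(uint64_t last, uint64_t current)
{
  return fx_ratio(last, current) > fx_from_percent(140);
}

// True when current/last exceeds 1.05 (assumed form of the 5% threshold).
constexpr bool noticeably_better(uint64_t last, uint64_t current)
{
  return fx_ratio(current, last) > fx_from_percent(105);
}
```

With integer throughput counters, much_worse(optimizer.last_throughput, current_throughput) would stand in for the change_for_worse > 1.40 test; whether 16 fractional bits give enough resolution for the tuner is something that would need to be checked against real measurements.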
Re: [PATCH] Update {x86_64,i[34]86,aarch64,s390{,x},powerpc64}-linux baseline_symbols.txt
On 08/04/15 19:40 +0200, Jakub Jelinek wrote: Hi! Attached patch updates baseline_symbols.txt for a couple of architectures. Don't have 32-bit powerpc-linux around though this time. Ok for trunk? OK, thanks very much for doing this.
[doc] extend.texi grammar fix
I ran into this while trying to help a user. Committed. Gerald 2015-04-08 Gerald Pfeifer ger...@pfeifer.com * doc/extend.texi (__sync Builtins): Fix grammar. Index: doc/extend.texi === --- doc/extend.texi (revision 221926) +++ doc/extend.texi (revision 221927) @@ -8224,7 +8224,7 @@ Not all operations are supported by all target processors. If a particular operation cannot be implemented on the target processor, a warning is -generated and a call an external function is generated. The external +generated and a call to an external function is generated. The external function carries the same name as the built-in version, with an additional suffix @samp{_@var{n}} where @var{n} is the size of the data type.
[PATCH 0/2] Commentary typo fixes
Fix two commentary typos and remove two unneeded forward declarations for static functions in tree-tailcall.c. These patches were applied when doing a config-list.mk build and showed no negative effect. I'd usually consider them obvious, but given the current stage... OK for trunk now?

Bernhard Reutner-Fischer (2):
  tree.h: Commentary typo fix
  tree-tailcall: Commentary typo fix, remove fwd declaration

 gcc/tree-tailcall.c | 4 +---
 gcc/tree.h          | 2 +-
 2 files changed, 2 insertions(+), 4 deletions(-)

-- 
2.1.4
[PATCH 1/2] tree.h: Commentary typo fix
gcc/ChangeLog: 2015-04-01 Bernhard Reutner-Fischer al...@gcc.gnu.org * tree.h (CONVERT_EXPR_P): Commentary typo fix. Signed-off-by: Bernhard Reutner-Fischer rep.dot@gmail.com --- gcc/tree.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/tree.h b/gcc/tree.h index 4fcc272..bedf103 100644 --- a/gcc/tree.h +++ b/gcc/tree.h @@ -428,7 +428,7 @@ extern void omp_clause_range_check_failed (const_tree, const char *, int, #define CONVERT_EXPR_CODE_P(CODE) \ ((CODE) == NOP_EXPR || (CODE) == CONVERT_EXPR) -/* Similarly, but accept an expressions instead of a tree code. */ +/* Similarly, but accept an expression instead of a tree code. */ #define CONVERT_EXPR_P(EXP)CONVERT_EXPR_CODE_P (TREE_CODE (EXP)) /* Generate case for NOP_EXPR, CONVERT_EXPR. */ -- 2.1.4
[PATCH 2/2] tree-tailcall: Commentary typo fix, remove fwd declaration
gcc/ChangeLog: 2015-04-01 Bernhard Reutner-Fischer al...@gcc.gnu.org * tree-tailcall.c (suitable_for_tail_opt_p, find_tail_calls): Remove unneeded forward declarations. (suitable_for_tail_call_opt_p): Commentary typo fix. Signed-off-by: Bernhard Reutner-Fischer rep.dot@gmail.com --- gcc/tree-tailcall.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/gcc/tree-tailcall.c b/gcc/tree-tailcall.c index 1d065fb..013972d 100644 --- a/gcc/tree-tailcall.c +++ b/gcc/tree-tailcall.c @@ -165,10 +165,8 @@ struct tailcall accumulator. */ static tree m_acc, a_acc; -static bool suitable_for_tail_opt_p (void); static bool optimize_tail_call (struct tailcall *, bool); static void eliminate_tail_call (struct tailcall *); -static void find_tail_calls (basic_block, struct tailcall **); /* Returns false when the function is not suitable for tail call optimization from some reason (e.g. if it takes variable number of arguments). */ @@ -182,7 +180,7 @@ suitable_for_tail_opt_p (void) return true; } /* Returns false when the function is not suitable for tail call optimization - from some reason (e.g. if it takes variable number of arguments). + for some reason (e.g. if it takes variable number of arguments). This test must pass in addition to suitable_for_tail_opt_p in order to make tail call discovery happen. */ -- 2.1.4
[patch] Update my email address
I'm retiring, and my last day at google is this Friday, April 10. I plan to continue to contribute to GCC and binutils in my retirement. I've updated the MAINTAINERS file to use my personal address, ccout...@gmail.com. -cary 2015-04-08 Cary Coutant ccout...@gmail.com * MAINTAINERS: Update my email address. Index: MAINTAINERS === --- MAINTAINERS (revision 221926) +++ MAINTAINERS (working copy) @@ -195,7 +195,7 @@ caller-save.c Jeff Law l...@redhat.com callgraph Jan Hubicka hubi...@ucw.cz debugging code Jim Wilson wil...@tuliptree.org dwarf debugging code Jason Merrill ja...@redhat.com -dwarf debugging code Cary Coutantccout...@google.com +dwarf debugging code Cary Coutantccout...@gmail.com c++ runtime libs Paolo Carlini paolo.carl...@oracle.com c++ runtime libs Ulrich Drepper drep...@gmail.com c++ runtime libs Benjamin De Kosnik b...@gnu.org @@ -300,7 +300,7 @@ libsanitizer, asan.cDmitry Vyukov dvy loop optimizer Zdenek Dvorak o...@ucw.cz loop optimizer Daniel Berlin dber...@dberlin.org LTORichard Biener rguent...@suse.de -LTO plugin Cary Coutantccout...@google.com +LTO plugin Cary Coutantccout...@gmail.com Plugin Le-Chun Wu l...@google.com register allocationPeter Bergner berg...@vnet.ibm.com register allocationKenneth Zadeck zad...@naturalbridge.com @@ -365,7 +365,7 @@ William Cohen wco...@redhat.com Josh Connerjcon...@apple.com R. Kelley Cook kc...@gcc.gnu.org Christian Cornelssen cc...@cs.tu-berlin.de -Cary Coutant ccout...@google.com +Cary Coutant ccout...@gmail.com Lawrence Crowl cr...@google.com Ian Dall i...@beware.dropbear.id.au David Daneydavid.da...@caviumnetworks.com
Re: [PATCH 1/2] tree.h: Commentary typo fix
On Wed, 8 Apr 2015, Bernhard Reutner-Fischer wrote: 2015-04-01 Bernhard Reutner-Fischer al...@gcc.gnu.org * tree.h (CONVERT_EXPR_P): Commentary typo fix. Go ahead. Pure comment fixes still should be fine at this point in time. Gerald
[PATCH] PR target/55144: bfin: fix opening glibc-c.o: No such file or directory
building all-gcc for bfin-linux-uclibc results in build/genchecksum cp/cp-lang.o c-family/stub-objc.o ... glibc-c.o \ libbackend.a .. cc1plus-checksum.c.tmp opening glibc-c.o: No such file or directory make[2]: *** [cc1-checksum.c] Error 1 Fix this by prepending tmake_file which nowadays consists of t-slibgcc t-linux t-glibc. Remove the already listed tmake_file entries. Fixes all-gcc config-list.mk build for bfin-linux-uclibc. Ok for trunk? gcc/ChangeLog PR target/55144 * config.gcc (bfin*-linux-uclibc*): Prepend tmake_file and remove already contained t-files. Signed-off-by: Bernhard Reutner-Fischer rep.dot@gmail.com Cc: Bernd Schmidt ber...@codesourcery.com Cc: Jie Zhang jzhang...@gmail.com Signed-off-by: Bernhard Reutner-Fischer rep.dot@gmail.com --- gcc/config.gcc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/config.gcc b/gcc/config.gcc index cb08a5c..ddbd57b 100644 --- a/gcc/config.gcc +++ b/gcc/config.gcc @@ -1118,7 +1118,7 @@ bfin*-uclinux*) ;; bfin*-linux-uclibc*) tm_file=${tm_file} dbxelf.h elfos.h bfin/elf.h gnu-user.h linux.h glibc-stdint.h bfin/linux.h ./linux-sysroot-suffix.h - tmake_file=bfin/t-bfin-linux t-slibgcc t-linux + tmake_file=${tmake_file} bfin/t-bfin-linux use_collect2=no ;; bfin*-rtems*) -- 2.1.4
Re: [PATCH 1/2] tree.h: Commentary typo fix
On 8 April 2015 at 20:34, Gerald Pfeifer ger...@pfeifer.com wrote: On Wed, 8 Apr 2015, Bernhard Reutner-Fischer wrote: 2015-04-01 Bernhard Reutner-Fischer al...@gcc.gnu.org * tree.h (CONVERT_EXPR_P): Commentary typo fix. Go ahead. Pure comment fixes still should be fine at this point in time. Ok, thanks. r221930.
Re: MAINTAINERS: resign as testsuite maintainer, update address
On Mon, 2 Feb 2015, Janis Johnson wrote:

I retired from Mentor Graphics 3 weeks ago and have no immediate plans to be active in GCC, so I'm resigning as a testsuite maintainer. I'm leaving myself under Write After Approval with my personal email address so people can find me.

Thanks for your contributions over the years, Janis!

Five years ago while between jobs I got an individual FSF copyright assignment; is that still valid?

I've had a look at the original file maintained by the FSF, and from all I can tell it is, yes. So, one less excuse not to keep contributing. :-)

Gerald
Re: [PATCH 1/4] Docs: extend.texi: Add missing semicolon for consistency
Hi Michael, and big apologies for this falling through a lot of cracks apparently. I just committed your patch with the ChangeLog below. If there are any other patches that have not been committed (nor NACKed yet, I know there were some as well), please let us know and I will look into getting at least documentation patches addressed swiftly going forward. Thank you, and sorry again, Gerald 2015-04-08 Michael Witten mfwit...@gmail.com * doc/extend.texi (Attribute Syntax): Add a trailing semicolon to an example. Index: doc/extend.texi === --- doc/extend.texi (revision 221930) +++ doc/extend.texi (working copy) @@ -4771,7 +4771,7 @@ @smallexample __attribute__((noreturn)) void d0 (void), __attribute__((format(printf, 1, 2))) d1 (const char *, ...), - d2 (void) + d2 (void); @end smallexample @noindent On Wed, 27 Apr 2011, Michael Witten wrote: --- trunk/gcc/doc/extend.texi |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/trunk/gcc/doc/extend.texi b/trunk/gcc/doc/extend.texi index eddff95..c154958 100644 --- a/trunk/gcc/doc/extend.texi +++ b/trunk/gcc/doc/extend.texi @@ -3997,7 +3997,7 @@ @smallexample __attribute__((noreturn)) void d0 (void), __attribute__((format(printf, 1, 2))) d1 (const char *, ...), - d2 (void) + d2 (void); @end smallexample @noindent
Re: [patch] Fix shared_timed_mutex::try_lock_until() et al
On 08/04/15 18:59 +0200, Torvald Riegel wrote: +// set or the maximum number of reader locks is held, then increment the +// reader lock count. +// To release decrement the count, then if the write-entered flag is set +// and the count is zero then signal gate2 to wake a queued writer, +// otherwise if the maximum number of reader locks was held signal gate1 +// to wake a reader. +// +// To take a writer lock block on gate1 while the write-entered flag is +// set, then set the write-entered flag to start queueing, then block on +// gate2 while the number of reader locks is non-zero. +// To release unset the write-entered flag and signal gate1 to wake all +// blocked readers and writers. Perhaps it would also be useful to have a sentence on how writers and readers are prioritized (e.g., writers preferred if readers already hold the lock, all are equal when writer unlocks, ...). How about: // This means that when no reader locks are held readers and writers get // equal priority. When one or more reader locks is held a writer gets // priority and no more reader locks can be taken while the writer is // queued. +static constexpr unsigned _S_n_readers = ~_S_write_entered; Rename this to _S_max_readers or such? Done. templatetypename _Clock, typename _Duration bool try_lock_until(const chrono::time_point_Clock, _Duration __abs_time) { - unique_lock_Mutex __lk(_M_mut, __abs_time); - if (__lk.owns_lock() _M_state == 0) + if (!_M_mut.try_lock_until(__abs_time)) + return false; I think a non-timed acquisition for the internal lock should be fine. The internal lock is supposed to protect critical sections of finite (and short) length, and the condvars also don't do acquisition with time outs. Good point about the condvars re-acquiring the mutex. We can get rid of the _Mutex type then, and just use std::mutex, and that also means we can provide the timed locking functions even when !defined(_GTHREAD_USE_MUTEX_TIMEDLOCK). 
And so maybe we should use this fallback implementation instead of the pthread_rwlock_t one when !defined(_GTHREAD_USE_MUTEX_TIMEDLOCK), so that those targets have a complete std::shared_timed_mutex (this applies to at least Darwin, not sure which other targets). Maybe we should also provide a fallback std::timed_mutex based on a condvar, I'll put that on the TODO list for stage 1.

+  unique_lock<mutex> __lk(_M_mut, adopt_lock);
+  if (!_M_gate1.wait_until(__lk, __abs_time,
+                           [=]{ return !_M_write_entered(); }))
   {
-    _M_state = _S_write_entered;
-    return true;
+    return false;
   }
-  return false;
+  _M_state |= _S_write_entered;
+  if (!_M_gate2.wait_until(__lk, __abs_time,
+                           [=]{ return _M_readers() == 0; }))
+  {

I'd add a comment saying that you have to mimic a full write unlock, so that's why those steps are necessary.

OK, like this:

// Wake all threads blocked while the write-entered flag was set.

+    _M_state ^= _S_write_entered;
+    _M_gate1.notify_all();
+    return false;
+  }
+  return true;
 }
 #endif
@@ -364,7 +399,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 unlock()
 {
   {
-    lock_guard<_Mutex> __lk(_M_mut);
+    lock_guard<mutex> __lk(_M_mut);
     _M_state = 0;
   }
   _M_gate1.notify_all();

The notification must be *inside* of the critical section. Otherwise, you're violating the mutex destruction requirements (see 30.4.1.3.1p3): After you set _M_state to 0 and release _M_mut, another thread can go in, acquire, assume it's the last one to own the mutex if that's what the program ensures, and destroy; then, destruction would be concurrent with the call to notify_all. (Perhaps check other wake-ups / unlocks as well.)

Thanks. I'll add this comment too:

// call notify_all() while mutex is held so that another thread can't
// lock and unlock the mutex then destroy *this before we make the call.
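
The point about notifying inside the critical section can be sketched with a stand-alone simplification of the two-gate scheme discussed in this thread. This is not the libstdc++ code: the class two_gate_rwlock and all of its member names are hypothetical, the timed operations are left out, and reader_count() exists only so the sketch can be exercised.

```cpp
#include <cassert>
#include <climits>
#include <condition_variable>
#include <mutex>

// Hypothetical stand-alone sketch of the two-gate scheme; not libstdc++ code.
class two_gate_rwlock
{
  // High bit: a writer holds the lock or is queued for it.
  static constexpr unsigned write_entered
    = 1U << (sizeof(unsigned) * CHAR_BIT - 1);
  // Remaining bits: number of reader locks currently held.
  static constexpr unsigned max_readers = ~write_entered;

  std::mutex mut;                 // protects state
  std::condition_variable gate1;  // waits for write-entered to clear
  std::condition_variable gate2;  // queued writer waits for readers to drain
  unsigned state = 0;

  bool has_writer() const { return (state & write_entered) != 0; }
  unsigned readers() const { return state & max_readers; }

public:
  void lock()
  {
    std::unique_lock<std::mutex> lk(mut);
    // First become the queued writer, then wait for readers to leave.
    gate1.wait(lk, [this]{ return !has_writer(); });
    state |= write_entered;
    gate2.wait(lk, [this]{ return readers() == 0; });
  }

  void unlock()
  {
    std::lock_guard<std::mutex> lk(mut);
    state = 0;
    // Notify while mut is still held, so no other thread can lock, unlock
    // and destroy *this before notify_all() has been called.
    gate1.notify_all();
  }

  void lock_shared()
  {
    std::unique_lock<std::mutex> lk(mut);
    // False while write-entered is set or the reader count is at maximum.
    gate1.wait(lk, [this]{ return state < max_readers; });
    ++state;
  }

  void unlock_shared()
  {
    std::lock_guard<std::mutex> lk(mut);
    assert(readers() > 0);
    unsigned prev = state--;
    if (has_writer())
      {
        if (readers() == 0)
          gate2.notify_one();   // last reader wakes the queued writer
      }
    else if (prev == max_readers)
      gate1.notify_one();       // a reader slot just became free
  }

  unsigned reader_count()
  {
    std::lock_guard<std::mutex> lk(mut);
    return readers();
  }
};
```

Moving the notify_all() call out of the lock_guard scope in unlock() would reintroduce the destruction race described above, because the condition variable could be destroyed between the release of mut and the notification.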
 void
 unlock_shared()
 {
-  lock_guard<_Mutex> __lk(_M_mut);
-  unsigned __num_readers = (_M_state & _M_n_readers) - 1;
-  _M_state &= ~_M_n_readers;
-  _M_state |= __num_readers;
-  if (_M_state & _S_write_entered)
+  lock_guard<mutex> __lk(_M_mut);
+  auto __prev = _M_state--;
+  if (_M_write_entered())
     {
-      if (__num_readers == 0)
+      if (_M_readers() == 0)
         _M_gate2.notify_one();
     }
   else
     {
-      if (__num_readers == _M_n_readers - 1)
+      if (__prev == _S_n_readers)
         _M_gate1.notify_one();
     }

I think this needs documentation why we can't miss a wake-up in case _M_write_entered is true and we have a reader overflow. I think it's not quite obvious why this works.

If there is a writer we don't notify gate1 in unlock_shared(), so there is
Re: libgomp nvptx plugin: rework initialisation and support the proposed load/unload hooks (was: Merge current set of OpenACC changes from gomp-4_0-branch)
On Wed, 8 Apr 2015 17:58:56 +0300 Ilya Verbin iver...@gmail.com wrote: Have you tested it with disabled offloading? I see several regressions: FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/acc_on_device-1.c -DACC_DEVICE_TYPE_host_nonshm=1 -DACC_MEM_SHARED=0 execution test FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/if-1.c -DACC_DEVICE_TYPE_host_nonshm=1 -DACC_MEM_SHARED=0 execution test No -- thanks for the note. I've committed the patch now, but I'll try to get to looking at these in the next day or two (it's probably something relatively minor, I guess). Julian
Re: [v3] Update Solaris baselines
On 08/04/15 15:41 +0200, Rainer Orth wrote: With the GCC 5 release approaching, it's time to update the Solaris baselines again. This patch does just that and is pretty much straightforward: * With one exception, all new symbols are in the GLIBCXX_3.4.21 and CXXABI_1.3.9 versions. * On Solaris/x86 only (obviously), we have +OBJECT:0:CXXABI_FLOAT128 +OBJECT:16:_ZTIg@@CXXABI_FLOAT128 +OBJECT:2:_ZTSg@@CXXABI_FLOAT128 +OBJECT:32:_ZTIPKg@@CXXABI_FLOAT128 +OBJECT:32:_ZTIPg@@CXXABI_FLOAT128 +OBJECT:3:_ZTSPg@@CXXABI_FLOAT128 +OBJECT:4:_ZTSPKg@@CXXABI_FLOAT128 For the moment (considering how late we are in the release cycle), I've decided to use the SPARC version for the common baseline, so those symbols show up as added on x86. * On Solaris 11, there are two additonal symbols beyond those in Solaris 10: +FUNC:std::__cxx11::basic_stringbufchar, std::char_traitschar, std::allocatorchar ::~basic_stringbuf()@@GLIBCXX_3.4.21 +FUNC:std::__cxx11::basic_stringbufwchar_t, std::char_traitswchar_t, std::allocatorwchar_t ::~basic_stringbuf()@@GLIBCXX_3.4.21 They are from src/c++98/complex_io.o and src/c++11/sstream-inst.o, respectively. No idea where this difference comes from, it might be related to the fact that Solaris 11 has COMDAT group support while Solaris 10 does not. I've again decided to live with this difference and use the Solaris 10 version of the baselines. Bootstrapped without regressions on i386-pc-solaris2.1[01] and sparc-sun-solaris2.1[01], abi_check results are clean with the additions noted above. Ok for mainline? OK, thanks.
Re: [WEB][PATCH] Describe -pg and LTO changes
On Sun, 16 Nov 2014, Andi Kleen wrote: This patch describes some user visible changes that were added to gcc 5. Thanks, Andi! I added some code environments, broke up a long one, changed a LTO build to an LTO build, Index: changes.html === RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v retrieving revision 1.97 diff -u -r1.97 changes.html --- changes.html8 Apr 2015 07:26:23 - 1.97 +++ changes.html8 Apr 2015 16:39:59 - @@ -330,7 +330,8 @@ about qualifiers on pointers being discarded via a new warning option code-Wno-discarded-qualifiers/code./li liThe C front end now generates more precise caret diagnostics./li -liThe -pg option now only affects the current file in a LTO build./li +liThe code-pg/code command-line option now only affects the current +file in an LTO build./li /ul h3 id=cxxC++/h3 @@ -684,12 +685,13 @@ code-mavx512ifma/code and for AVX-512 Vector Bit Manipulation Instructions: code-mavx512vbmi/code./li liThe new code-mrecord-mcount/code option for code-pg/code - generates a Linux kernel style table of pointers to mcount or - __fentry__ calls at the beginning of functions. The new - code-mnop-mcount/code option in addition also generates nops in - place of the __fentry__ or mcount call, so that a call per function - can be later patched in. This can be used for low overhead tracing or - hot code patching./li + generates a Linux kernel style table of pointers to + codemcount/code or code__fentry__/code calls at the beginning + of functions. The new code-mnop-mcount/code option in addition + also generates nops in place of the code__fentry__/code or + codemcount/code call, so that a call per function can be later + patched in. This can be used for low overhead tracing or hot code + patching./li liThe new code-malign-data/code option controls how GCC aligns variables. 
code-malign-data=compat/code uses increased alignment compatible with GCC 4.8 and earlier, @@ -838,8 +840,9 @@ h2Other significant improvements/h2 h3 id=gcc-ar/h3 ul -liThe codegcc-ar, gcc-nm, gcc-ranlib/code wrappers now - understand a code-B/code option to set the compiler to use./li +liThe codegcc-ar/code, codegcc-nm/code, codegcc-ranlib/code + wrappers now understand a code-B/code option to set the compiler + to use./li /ul h3 id=driver/h3 ul
Re: [patch] Fix shared_timed_mutex::try_lock_until() et al
There is an correctness issue related to mutex destruction. The added documentation is a good start, but I'd still add some more for the complicated pieces of reasoning. Details inline below. On Tue, 2015-04-07 at 15:28 +0100, Jonathan Wakely wrote: diff --git a/libstdc++-v3/include/std/shared_mutex b/libstdc ++-v3/include/std/shared_mutex index ab1b45b..7391f11 100644 --- a/libstdc++-v3/include/std/shared_mutex +++ b/libstdc++-v3/include/std/shared_mutex @@ -268,7 +268,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION #else // ! _GLIBCXX_USE_PTHREAD_RWLOCK_T +// Must use the same clock as condition_variable +typedef chrono::system_clock __clock_t; + #if _GTHREAD_USE_MUTEX_TIMEDLOCK +// Can't use std::timed_mutex with std::condition_variable, so define +// a new timed mutex type that derives from std::mutex. struct _Mutex : mutex, __timed_mutex_impl_Mutex { templatetypename _Rep, typename _Period @@ -285,16 +290,44 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION typedef mutex _Mutex; #endif -// Based on Howard Hinnant's reference implementation from N2406 - +// Based on Howard Hinnant's reference implementation from N2406. + +// The high bit of _M_state is the write-entered flag which is set to +// indicate a writer has taken the lock or is queuing to take the lock. +// The remaining bits are the count of reader locks. +// +// To take a reader lock block on gate1 while the write-entered flag is I think this is easier to parse if you add a comma: To take a reader lock, block Same for other sentences below. +// set or the maximum number of reader locks is held, then increment the +// reader lock count. +// To release decrement the count, then if the write-entered flag is set +// and the count is zero then signal gate2 to wake a queued writer, +// otherwise if the maximum number of reader locks was held signal gate1 +// to wake a reader. 
+// +// To take a writer lock block on gate1 while the write-entered flag is +// set, then set the write-entered flag to start queueing, then block on +// gate2 while the number of reader locks is non-zero. +// To release unset the write-entered flag and signal gate1 to wake all +// blocked readers and writers. Perhaps it would also be useful to have a sentence on how writers and readers are prioritized (e.g., writers preferred if readers already hold the lock, all are equal when writer unlocks, ...). + +// Only locked when accessing _M_state or waiting on condition variables. _Mutex _M_mut; +// Used to block while write-entered is set or reader count at maximum. condition_variable _M_gate1; +// Used to block queued writers while reader count is non-zero. condition_variable _M_gate2; +// The write-entered flag and reader count. unsigned _M_state; static constexpr unsigned _S_write_entered = 1U (sizeof(unsigned)*__CHAR_BIT__ - 1); -static constexpr unsigned _M_n_readers = ~_S_write_entered; +static constexpr unsigned _S_n_readers = ~_S_write_entered; Rename this to _S_max_readers or such? + +// Test whether the write-entered flag is set. _M_mut must be locked. +bool _M_write_entered() const { return _M_state _S_write_entered; } + +// The number of reader locks currently held. _M_mut must be locked. +unsigned _M_readers() const { return _M_state _S_n_readers; } public: shared_timed_mutex() : _M_state(0) {} @@ -313,11 +346,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION lock() { unique_lockmutex __lk(_M_mut); - while (_M_state _S_write_entered) - _M_gate1.wait(__lk); Add something like this:? // We first wait until we acquire the writer-part of the lock, and // then wait until no readers hold the lock anymore. 
+ _M_gate1.wait(__lk, [=]{ return !_M_write_entered(); }); _M_state |= _S_write_entered; - while (_M_state _M_n_readers) - _M_gate2.wait(__lk); + _M_gate2.wait(__lk, [=]{ return _M_readers() == 0; }); } bool @@ -337,26 +368,30 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION bool try_lock_for(const chrono::duration_Rep, _Period __rel_time) { - unique_lock_Mutex __lk(_M_mut, __rel_time); - if (__lk.owns_lock() _M_state == 0) - { - _M_state = _S_write_entered; - return true; - } - return false; + return try_lock_until(__clock_t::now() + __rel_time); } templatetypename _Clock, typename _Duration bool try_lock_until(const chrono::time_point_Clock, _Duration __abs_time) { - unique_lock_Mutex __lk(_M_mut, __abs_time); - if (__lk.owns_lock() _M_state == 0) + if (!_M_mut.try_lock_until(__abs_time)) + return false; I think a non-timed acquisition
Re: [C++ PATCH] Fix alignment handling in build_cplus_array_type/cp_build_qualified_type_real (PR c++/65690)
On Wed, Apr 08, 2015 at 06:32:29PM +0200, Jan Hubicka wrote: On Wed, Apr 08, 2015 at 06:22:10PM +0200, Jan Hubicka wrote: On 04/08/2015 06:02 AM, Jakub Jelinek wrote: (cp_build_qualified_type_real): Use check_base_type. Build a variant and copy over even TYPE_CONTEXT and TYPE_ALIGN/TYPE_USER_ALIGN if any of those are different. This seems wrong. If there is an array with the same name, attributes and element type, it should have the same alignment; if One of problems is that cp_build_qualified_type rebuilds the array from scratch and never copies the attribute list around (as oposed to build_qualified_type that just memcpy the type node) As I said earlier, TYPE_ATTRIBUTES is NULL here anyway, because the attributes hang in DECL_ATTRIBUTES of TYPE_DECL. And, except for config/sol2.c (which looks wrong), nothing ever calls lookup_attribute for aligned anyway, the user aligned stuff is encoded in TYPE_USER_ALIGN and/or DECL_USER_ALIGN and TYPE_ALIGN/DECL_ALIGN. This is interesting too. I did know that alignment is lowered into TYPE_USER_ALIGN/TYPE_ALIGN values, but there is a lot of other code that looks for type attributes by searching TYPE_ATTRIBUTES, not DECL_ATTRIBUTES of TYPE_DECL (such as nonnul_arg_p in tree-vrp) or alloc_object_size. Does it mean that those attributes are ignored for C++ produced types? I don't think this has anything to do with C++. c-common.c has an attribute table, for each attribute it has 3 flags, whether a decl is required, type is required and/or fn type is required, and that determines to what the attributes go. These flags have the following combinations (decl/type/fntype # of attributes): TFF 51 (i.e. decl required) FTT 12 (i.e. function type required) FTF 9 (i.e. some type required) FFF 7 (applies to both types and decls) Which means most of the attributes require a decl and thus go into DECL_ATTRIBUTES, then some require function types and go to the TYPE_ATTRIBUTES of the function, then others go solely to TYPE_ATTRIBUTES. 
The last row are attributes that don't really care where they apply to, and that is { packed, 0, 0, false, false, false, { unused, 0, 0, false, false, false, { transparent_union, 0, 0, false, false, false, { aligned,0, 1, false, false, false, { deprecated, 0, 1, false, false, false, { visibility, 1, 1, false, false, false, { warn_unused,0, 0, false, false, false, where the first 6 really don't care about what is stored in {TYPE,DECL}_ATTRIBUTES because the attributes are encoded differently in generic, and the last one sounds like a mistake (perhaps one that can't be undone anymore) where it doesn't require a type, but only stores it on types and warns otherwise. Jakub
Re: [PATCH] add self-tuning to x86 hardware fast path in libitm
On the STAMP suite of benchmarks for transactional memory (described here [1]). I have run an unmodified GCC 5.0.0 against the patched GCC with these modifications and obtained the following speedups in STAMP with 4 threads (on a Haswell with 4 cores, average of 10 runs):

I expect you'll need different tunings on larger systems.

That is a good point. While I have never used fixed point arithmetic, a cursory inspection suggests that it makes sense and is applicable to this case. Are you aware of some place where this is being done already within GCC that I could use as inspiration, or should I craft some macros from scratch for this?

I believe the inliner uses fixed point. Own macros should be fine too.

+  int32_t last_attempts = optimizer.last_attempts;
+  int32_t current_attempts = optimizer.optimized_attempts;
+  int32_t new_attempts = current_attempts;
+  if (unlikely(change_for_worse > 1.40))
+    {
+      optimizer.optimized_attempts = optimizer.best_ever_attempts;
+      optimizer.last_throughput = current_throughput;
+      optimizer.last_attempts = current_attempts;
+      return;
+    }
+
+  if (unlikely(random() % 100 < 1))
+    {

So where is the seed for that random stored? Could you corrupt some user's random state? Is the state per thread or global? If it's per thread, how do you initialize so that the threads start with different seeds? If it's global, what synchronizes it?

As I do not specify any seed, I was under the impression that there would be a default initialization. Furthermore, the POSIX documentation specifies random() to be MT-safe, so I assumed its internal state to be per-thread. Did I misinterpret this?

Yes, that's right. But it's very nasty to change the user's RNG state. A common pattern for repeatable benchmarks is to start with srand(1) and then use the random numbers to run the benchmark, so it always does the same thing. If you nondeterministically (transaction aborts are not deterministic) change the random state, it will make the benchmark not repeatable anymore.
You'll need to use your own RNG state that is independent. It would be good to see if any parts of the algorithm can be simplified. In general, in production software the goal is to have the simplest algorithm that does the job. -Andi -- a...@linux.intel.com -- Speaking for myself only.
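The request for an independent RNG state can be sketched as follows. This is only an illustration, assuming a xorshift32 generator whose state is owned by the caller (for instance one instance per thread); the names priv_rng, priv_rng_seed, priv_rng_next and priv_rng_one_percent are hypothetical and do not come from the patch under review.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical private generator state; one instance per owner (e.g. per
   thread), so the C library's rand()/random() state is never touched.  */
struct priv_rng
{
  uint32_t state;
};

/* Seed the generator.  xorshift32 must not start from zero, so a zero
   seed is mapped to an arbitrary nonzero constant.  */
static void
priv_rng_seed (struct priv_rng *r, uint32_t seed)
{
  r->state = seed != 0 ? seed : 0x9e3779b9u;
}

/* Advance the generator (Marsaglia's xorshift32 with shifts 13/17/5).  */
static uint32_t
priv_rng_next (struct priv_rng *r)
{
  uint32_t x = r->state;
  assert (x != 0);
  x ^= x << 13;
  x ^= x >> 17;
  x ^= x << 5;
  r->state = x;
  return x;
}

/* A 1-in-100 check like the one in the quoted hunk, but drawing from the
   private state instead of the process-wide one.  */
static bool
priv_rng_one_percent (struct priv_rng *r)
{
  return priv_rng_next (r) % 100 < 1;
}
```

A caller would keep one struct priv_rng per thread and seed each with a distinct value (a thread index, say), so a benchmark that calls srand(1) itself still sees the same sequence from the C library on every run.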
Re: Fix increase_alignment
On 2 April 2015 at 01:20, Jan Hubicka hubi...@ucw.cz wrote: Your follow-up patch 88ada5e935d58223ae2d9ce6d0c1c71c372680a8 a.k.a r221269 added this to emit_local(): static bool -emit_local (tree decl ATTRIBUTE_UNUSED, +emit_local (tree decl, const char *name ATTRIBUTE_UNUSED, unsigned HOST_WIDE_INT size ATTRIBUTE_UNUSED, unsigned HOST_WIDE_INT rounded ATTRIBUTE_UNUSED) { + int align = symtab_node::get (decl)->definition_alignment (); #if defined ASM_OUTPUT_ALIGNED_DECL_LOCAL ASM_OUTPUT_ALIGNED_DECL_LOCAL (asm_out_file, decl, name, -size, DECL_ALIGN (decl)); +size, align); return true; #elif defined ASM_OUTPUT_ALIGNED_LOCAL - ASM_OUTPUT_ALIGNED_LOCAL (asm_out_file, name, size, DECL_ALIGN (decl)); + ASM_OUTPUT_ALIGNED_LOCAL (asm_out_file, name, size, align); return true; #else ASM_OUTPUT_LOCAL (asm_out_file, name, size, rounded); return false; #endif } which gives gcc/varasm.c:1936:7: error: unused variable ‘align’ [-Werror=unused-variable] int align = symtab_node::get (decl)->definition_alignment (); ^ on log/alpha64-dec-vms log/alpha-dec-vms log/i686-cygwinOPT-enable-threads=yes log/i686-mingw32crt log/i686-openbsd3.0 log/i686-pc-msdosdjgpp log/m68k-openbsd Maybe just flag it as used or copy-move it? Yep, let's just move it into the ifdefs. Committed revision 221925. Can you please check that the alignment looks right at one of those targets? I am not quite sure who is supposed to do so on targets not defining ASM_OUTPUT_ALIGNED_LOCAL. Perhaps we then want to prevent the vectorizer from updating the alignments. I have no idea what the alignment on those targets is supposed to look like, sorry.
thanks, r221925 | aldot | 2015-04-08 19:56:18 +0200 (Wed, 08 Apr 2015) | 22 lines emit_local(): Fix unused warning Honza's r221269 produced gcc/varasm.c:1936:7: error: unused variable ‘align’ [-Werror=unused-variable] int align = symtab_node::get (decl)->definition_alignment (); ^ on e.g.: log/alpha64-dec-vms log/alpha-dec-vms log/i686-cygwinOPT-enable-threads=yes log/i686-mingw32crt log/i686-openbsd3.0 log/i686-pc-msdosdjgpp log/m68k-openbsd Silence this by moving the variable into the corresponding blocks and adding back the ATTRIBUTE_UNUSED decoration for the decl param. Index: gcc/ChangeLog === --- gcc/ChangeLog (revision 221924) +++ gcc/ChangeLog (revision 221925) @@ -1,3 +1,7 @@ +2015-04-08 Bernhard Reutner-Fischer al...@gcc.gnu.org + + * varasm.c (emit_local): Move definition of align. + 2015-04-08 Julian Brown jul...@codesourcery.com * config/nvptx/mkoffload.c (process): Support variable mapping. Index: gcc/varasm.c === --- gcc/varasm.c (revision 221924) +++ gcc/varasm.c (revision 221925) @@ -1928,17 +1928,18 @@ /* A noswitch_section_callback for lcomm_section. */ static bool -emit_local (tree decl, +emit_local (tree decl ATTRIBUTE_UNUSED, const char *name ATTRIBUTE_UNUSED, unsigned HOST_WIDE_INT size ATTRIBUTE_UNUSED, unsigned HOST_WIDE_INT rounded ATTRIBUTE_UNUSED) { +#if defined ASM_OUTPUT_ALIGNED_DECL_LOCAL int align = symtab_node::get (decl)->definition_alignment (); -#if defined ASM_OUTPUT_ALIGNED_DECL_LOCAL ASM_OUTPUT_ALIGNED_DECL_LOCAL (asm_out_file, decl, name, size, align); return true; #elif defined ASM_OUTPUT_ALIGNED_LOCAL + int align = symtab_node::get (decl)->definition_alignment (); ASM_OUTPUT_ALIGNED_LOCAL (asm_out_file, name, size, align); return true; #else
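The shape of this fix, declaring a variable only inside the preprocessor branches that use it, can be illustrated outside of GCC. The sketch below is a stand-alone toy: SKETCH_HAVE_ALIGNED_LOCAL, sketch_query_alignment and sketch_emit_local are hypothetical names standing in for the target macros and for emit_local, and the emitted strings are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-in for the alignment query done in the real function.  */
int
sketch_query_alignment (void)
{
  return 64;
}

/* Returns 1 if an aligned directive was written, 0 otherwise.  */
int
sketch_emit_local (char *buf, size_t size, const char *name)
{
  assert (buf != NULL && name != NULL);
#if defined SKETCH_HAVE_ALIGNED_LOCAL
  /* Declared only where it is used, so configurations that take the
     #else branch never see an unused variable.  */
  int align = sketch_query_alignment ();
  snprintf (buf, size, ".local %s,%d", name, align);
  return 1;
#else
  snprintf (buf, size, ".local %s", name);
  return 0;
#endif
}
```

Had align been declared above the #if, a build without SKETCH_HAVE_ALIGNED_LOCAL would report an unused variable under -Wunused-variable, and with -Werror that stops the build, which is what the listed targets ran into.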
[PATCH, doc, committed] cfg.texi (GIMPLE statement iterators): Fix typo
Hi, Committed as r221926. 2015-04-08 Bernhard Reutner-Fischer al...@gcc.gnu.org * doc/cfg.texi (GIMPLE statement iterators): Fix typo. Index: gcc/doc/cfg.texi === --- gcc/doc/cfg.texi (revision 221925) +++ gcc/doc/cfg.texi (revision 221926) @@ -543,7 +543,7 @@ representation, @dfn{GIMPLE statement iterators} should be used. These iterators provide an integrated abstraction of the flow graph and the instruction stream. Block statement iterators are constructed using -the @code{gimple_stmt_iterator} data structure and several modifier are +the @code{gimple_stmt_iterator} data structure and several modifiers are available, including the following: @ftable @code Index: gcc/ChangeLog === --- gcc/ChangeLog (revision 221925) +++ gcc/ChangeLog (revision 221926) @@ -1,5 +1,9 @@ 2015-04-08 Bernhard Reutner-Fischer al...@gcc.gnu.org + * doc/cfg.texi (GIMPLE statement iterators): Fix typo. + +2015-04-08 Bernhard Reutner-Fischer al...@gcc.gnu.org + * varasm.c (emit_local): Move definition of align. 2015-04-08 Julian Brown jul...@codesourcery.com
[PATCH] emit_bss(): Remove redundant guard
gcc/ChangeLog: 2015-04-01 Bernhard Reutner-Fischer al...@gcc.gnu.org * varasm.c (emit_bss): Remove redundant guard. Signed-off-by: Bernhard Reutner-Fischer rep.dot@gmail.com --- gcc/varasm.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/gcc/varasm.c b/gcc/varasm.c index 537a64d..2bb5f27 100644 --- a/gcc/varasm.c +++ b/gcc/varasm.c @@ -1951,21 +1951,19 @@ emit_local (tree decl, #if defined ASM_OUTPUT_ALIGNED_BSS static bool emit_bss (tree decl ATTRIBUTE_UNUSED, const char *name ATTRIBUTE_UNUSED, unsigned HOST_WIDE_INT size ATTRIBUTE_UNUSED, unsigned HOST_WIDE_INT rounded ATTRIBUTE_UNUSED) { -#if defined ASM_OUTPUT_ALIGNED_BSS ASM_OUTPUT_ALIGNED_BSS (asm_out_file, decl, name, size, get_variable_align (decl)); return true; -#endif } #endif /* A noswitch_section_callback for comm_section. */ static bool emit_common (tree decl ATTRIBUTE_UNUSED, const char *name ATTRIBUTE_UNUSED, -- 2.1.4
[PATCH] mklog: Fallback to env author name and addr
contrib/ChangeLog: 2015-04-08 Bernhard Reutner-Fischer al...@gcc.gnu.org * mklog ($name, $addr): Fallback to env author settings. --- contrib/mklog | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/contrib/mklog b/contrib/mklog index f7974a7..4cad351 100755 --- a/contrib/mklog +++ b/contrib/mklog @@ -53,9 +53,9 @@ if (-f $conf) { . $dot_mklog_format_msg; } } else { -$name = `git config user.name`; +$name = `git config user.name` || $ENV{GIT_AUTHOR_NAME}; chomp($name); -$addr = `git config user.email`; +$addr = `git config user.email` || $ENV{GIT_AUTHOR_EMAIL}; chomp($addr); if (!($name && $addr)) { -- 2.1.4
Re: [patch] Fix shared_timed_mutex::try_lock_until() et al
On 08/04/15 20:11 +0100, Jonathan Wakely wrote: We can get rid of the _Mutex type then, and just use std::mutex, and that also means we can provide the timed locking functions even when !defined(_GTHREAD_USE_MUTEX_TIMEDLOCK). And so maybe we should use this fallback implementation instead of the pthread_rwlock_t one when !defined(_GTHREAD_USE_MUTEX_TIMEDLOCK), so that they have a complete std::shared_timed_mutex (this applies to at least Darwin, not sure which other targets). Here's a further patch to do that (which really needs to go into 5.0 too, so we don't switch Darwin to the new pthread_rwlock_t version and then have to switch it back again in 6.0). commit 20f08d3eac6ec88c83becb8f0cb2e65c10d3fe20 Author: Jonathan Wakely jwak...@redhat.com Date: Wed Apr 8 20:25:45 2015 +0100 * include/std/shared_mutex (shared_timed_mutex): Only use pthread_rwlock_t when the POSIX Timeouts option is supported. * testsuite/30_threads/shared_lock/cons/5.cc: Remove dg-require-gthreads-timed. * testsuite/30_threads/shared_lock/cons/6.cc: Likewise. * testsuite/30_threads/shared_lock/locking/3.cc: Likewise. * testsuite/30_threads/shared_lock/locking/4.cc: Likewise.
diff --git a/libstdc++-v3/include/std/shared_mutex b/libstdc++-v3/include/std/shared_mutex index 7f26465..351a4f6 100644 --- a/libstdc++-v3/include/std/shared_mutex +++ b/libstdc++-v3/include/std/shared_mutex @@ -57,7 +57,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION /// shared_timed_mutex class shared_timed_mutex { -#ifdef _GLIBCXX_USE_PTHREAD_RWLOCK_T +#if defined(_GLIBCXX_USE_PTHREAD_RWLOCK_T) && _GTHREAD_USE_MUTEX_TIMEDLOCK typedef chrono::system_clock __clock_t; #ifdef PTHREAD_RWLOCK_INITIALIZER @@ -116,7 +116,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION return true; } -#if _GTHREAD_USE_MUTEX_TIMEDLOCK template<typename _Rep, typename _Period> bool try_lock_for(const chrono::duration<_Rep, _Period>& __rel_time) @@ -158,7 +157,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION const auto __s_atime = __s_entry + __delta; return try_lock_until(__s_atime); } -#endif void unlock() @@ -200,7 +198,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION return true; } -#if _GTHREAD_USE_MUTEX_TIMEDLOCK template<typename _Rep, typename _Period> bool try_lock_shared_for(const chrono::duration<_Rep, _Period>& __rel_time) @@ -258,7 +255,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION const auto __s_atime = __s_entry + __delta; return try_lock_shared_until(__s_atime); } -#endif void unlock_shared() @@ -266,7 +262,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION unlock(); } -#else // ! _GLIBCXX_USE_PTHREAD_RWLOCK_T +#else // ! (_GLIBCXX_USE_PTHREAD_RWLOCK_T && _GTHREAD_USE_MUTEX_TIMEDLOCK) // Must use the same clock as condition_variable typedef chrono::system_clock __clock_t; @@ -459,7 +455,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION _M_gate1.notify_one(); } } -#endif // ! _GLIBCXX_USE_PTHREAD_RWLOCK_T +#endif // _GLIBCXX_USE_PTHREAD_RWLOCK_T && _GTHREAD_USE_MUTEX_TIMEDLOCK }; #endif // _GLIBCXX_HAS_GTHREADS
diff --git a/libstdc++-v3/testsuite/30_threads/shared_lock/cons/5.cc b/libstdc++-v3/testsuite/30_threads/shared_lock/cons/5.cc index 63ab514..9ec0498 100644 --- a/libstdc++-v3/testsuite/30_threads/shared_lock/cons/5.cc +++ b/libstdc++-v3/testsuite/30_threads/shared_lock/cons/5.cc @@ -3,7 +3,6 @@ // { dg-options "-std=gnu++14 -pthreads" { target *-*-solaris* } } // { dg-options "-std=gnu++14" { target *-*-cygwin *-*-darwin* } } // { dg-require-cstdint "" } -// { dg-require-gthreads-timed "" } // Copyright (C) 2013-2015 Free Software Foundation, Inc. // diff --git a/libstdc++-v3/testsuite/30_threads/shared_lock/cons/6.cc b/libstdc++-v3/testsuite/30_threads/shared_lock/cons/6.cc index 8b2bcd5..2074bbe 100644 --- a/libstdc++-v3/testsuite/30_threads/shared_lock/cons/6.cc +++ b/libstdc++-v3/testsuite/30_threads/shared_lock/cons/6.cc @@ -3,7 +3,6 @@ // { dg-options "-std=gnu++14 -pthreads" { target *-*-solaris* } } // { dg-options "-std=gnu++14" { target *-*-cygwin *-*-darwin* } } // { dg-require-cstdint "" } -// { dg-require-gthreads-timed "" } // Copyright (C) 2013-2015 Free Software Foundation, Inc. // diff --git a/libstdc++-v3/testsuite/30_threads/shared_lock/locking/3.cc b/libstdc++-v3/testsuite/30_threads/shared_lock/locking/3.cc index b67022a..4b653ea 100644 --- a/libstdc++-v3/testsuite/30_threads/shared_lock/locking/3.cc +++ b/libstdc++-v3/testsuite/30_threads/shared_lock/locking/3.cc @@ -3,7 +3,6 @@ // { dg-options "-std=gnu++14 -pthreads" { target *-*-solaris* } } // { dg-options "-std=gnu++14" { target *-*-cygwin *-*-darwin* } } // { dg-require-cstdint "" } -// { dg-require-gthreads-timed "" } // Copyright (C) 2013-2015 Free Software Foundation, Inc. // diff --git a/libstdc++-v3/testsuite/30_threads/shared_lock/locking/4.cc b/libstdc++-v3/testsuite/30_threads/shared_lock/locking/4.cc index e87d0dd..afeefa2 100644 ---
Re: [PATCH, CHKP] Fix static const bounds creation in LTO
On 07 Apr 22:43, Jan Hubicka wrote: 2015-04-07 Ilya Enkovich ilya.enkov...@intel.com * tree-chkp.c (chkp_find_const_bounds_var): Remove. (chkp_make_static_const_bounds): Search existing symbol by assembler name. Use make_decl_one_only. gcc/testsuite/ 2015-04-07 Ilya Enkovich ilya.enkov...@intel.com * gcc.dg/lto/chkp-static-bounds_0.c: New. OK, thanks! + if ((snode = symtab_node::get_for_asmname (DECL_ASSEMBLER_NAME (var)))) +{ + /* We don't allow this symbol usage for non bounds. */ + gcc_assert (snode->type == SYMTAB_VARIABLE); + gcc_assert (POINTER_BOUNDS_P (snode->decl)); This probably allows users to trigger ICE by declaring function of conflicting name. What about sorry (...) message instead? Here is installed version. Thanks, Ilya -- gcc/ 2015-04-08 Ilya Enkovich ilya.enkov...@intel.com * tree-chkp.c (chkp_find_const_bounds_var): Remove. (chkp_make_static_const_bounds): Search existing symbol by assembler name. Use make_decl_one_only. (chkp_get_zero_bounds_var): Remove node search which is now performed in chkp_make_static_const_bounds. (chkp_get_none_bounds_var): Likewise. gcc/testsuite/ 2015-04-08 Ilya Enkovich ilya.enkov...@intel.com * gcc.dg/lto/chkp-static-bounds_0.c: New.
diff --git a/gcc/testsuite/gcc.dg/lto/chkp-static-bounds_0.c b/gcc/testsuite/gcc.dg/lto/chkp-static-bounds_0.c new file mode 100644 index 000..596e551 --- /dev/null +++ b/gcc/testsuite/gcc.dg/lto/chkp-static-bounds_0.c @@ -0,0 +1,26 @@ +/* { dg-lto-do link } */ +/* { dg-require-effective-target mpx } */ +/* { dg-lto-options { { -flto -flto-partition=max -fcheck-pointer-bounds -mmpx } } } */ + +const char *cc; + +int test1 (const char *c) +{ + c = __builtin___bnd_init_ptr_bounds (c); + cc = c; + return c[0] * 2; +} + +struct S +{ + int (*fnptr) (const char *); +} S; + +struct S s1 = {test1}; +struct S s2 = {test1}; +struct S s3 = {test1}; + +int main (int argc, const char **argv) +{ + return s1.fnptr (argv[0]) + s2.fnptr (argv[1]); +}
diff --git a/gcc/tree-chkp.c b/gcc/tree-chkp.c index 03f75b3..4c8379f 100644 --- a/gcc/tree-chkp.c +++ b/gcc/tree-chkp.c @@ -1873,33 +1873,6 @@ chkp_add_bounds_to_call_stmt (gimple_stmt_iterator *gsi) gimple_call_set_with_bounds (new_call, true); } -/* Return constant static bounds var with specified LB and UB - if such var exists in varpool. Return NULL otherwise. */ -static tree -chkp_find_const_bounds_var (HOST_WIDE_INT lb, - HOST_WIDE_INT ub) -{ - tree val = targetm.chkp_make_bounds_constant (lb, ub); - struct varpool_node *node; - - /* We expect bounds constant is represented as a complex value - of two pointer sized integers. */ - gcc_assert (TREE_CODE (val) == COMPLEX_CST); - - FOR_EACH_VARIABLE (node) -if (POINTER_BOUNDS_P (node->decl) -&& TREE_READONLY (node->decl) -&& DECL_INITIAL (node->decl) -&& TREE_CODE (DECL_INITIAL (node->decl)) == COMPLEX_CST -&& tree_int_cst_equal (TREE_REALPART (DECL_INITIAL (node->decl)), - TREE_REALPART (val)) -&& tree_int_cst_equal (TREE_IMAGPART (DECL_INITIAL (node->decl)), - TREE_IMAGPART (val))) - return node->decl; - - return NULL; -} - /* Return constant static bounds var with specified bounds LB and UB. If such var does not exists then new var is created with specified NAME. */ static tree @@ -1907,37 +1880,43 @@ chkp_make_static_const_bounds (HOST_WIDE_INT lb, HOST_WIDE_INT ub, const char *name) { + tree id = get_identifier (name); tree var; + varpool_node *node; + symtab_node *snode; + + var = build_decl (UNKNOWN_LOCATION, VAR_DECL, id, +pointer_bounds_type_node); + TREE_STATIC (var) = 1; + TREE_PUBLIC (var) = 1; /* With LTO we may have constant bounds already in varpool. Try to find it. */ - var = chkp_find_const_bounds_var (lb, ub); - - if (var) -return var; - - var = build_decl (UNKNOWN_LOCATION, VAR_DECL, -get_identifier (name), pointer_bounds_type_node); + if ((snode = symtab_node::get_for_asmname (DECL_ASSEMBLER_NAME (var)))) +{ + /* We don't allow this symbol usage for non bounds. */ + if (snode->type != SYMTAB_VARIABLE + || !POINTER_BOUNDS_P (snode->decl)) + sorry ("-fcheck-pointer-bounds requires '%s' " + "name for internal usage", + IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (var))); + + return snode->decl; +} - TREE_PUBLIC (var) = 1; TREE_USED (var) = 1; TREE_READONLY (var) = 1; - TREE_STATIC (var) = 1; TREE_ADDRESSABLE (var) = 0; DECL_ARTIFICIAL (var) = 1; DECL_READ_P (var) = 1; + DECL_INITIAL (var) = targetm.chkp_make_bounds_constant (lb, ub); + make_decl_one_only (var, DECL_ASSEMBLER_NAME (var)); /* We may use this symbol during
Re: [PATCH] mklog: Fallback to env author name and addr
On Wed, Apr 8, 2015 at 3:36 PM, Bernhard Reutner-Fischer rep.dot@gmail.com wrote: contrib/ChangeLog: 2015-04-08 Bernhard Reutner-Fischer al...@gcc.gnu.org * mklog ($name, $addr): Fallback to env author settings. This looks fine, but note that I no longer have approval rights for patches. Code in contrib/ has looser requirements, and the patch is clearly harmless. I would just add documentation on the GIT_AUTHOR_* environment variables at the top of the script. Diego.
Re: [PATCH] mklog: Fallback to env author name and addr
On 8 April 2015 at 21:45, Diego Novillo dnovi...@google.com wrote: On Wed, Apr 8, 2015 at 3:36 PM, Bernhard Reutner-Fischer rep.dot@gmail.com wrote: contrib/ChangeLog: 2015-04-08 Bernhard Reutner-Fischer al...@gcc.gnu.org * mklog ($name, $addr): Fallback to env author settings. This looks fine, but note that I no longer have approval rights for patches. Yuck. I remember now, what a pity. Code in contrib/ has looser requirements, and the patch is clearly harmless. I would just add documentation on the GIT_AUTHOR_* environment variables at the top of the script. I'll document them, thanks!
Re: [PATCH, c6x] handle unk_isa in TARGET_CPU_CPP_BUILTINS
ping On 2 April 2015 at 22:49, Bernhard Reutner-Fischer rep.dot@gmail.com wrote: On Wed, Apr 01, 2015 at 11:37:41PM +0200, Bernhard Reutner-Fischer wrote: Bernd, same for c6x for unk_isa, fwiw. Attached. Ok for trunk for the c6x bits? Ok for trunk for the bfin bits? thanks,
Re: [PATCH] combine: Disregard clobbers in another test for two SETs (PR65693)
On 04/08/2015 03:33 PM, Segher Boessenkool wrote: PR65693 exposes a case where combine does a worse job after my patches to split parallels before combining. We start with a parallel of an udiv and an umod, and a clobber; the umod is dead. The instruction is combined with one setting the divisor pseudo to a power-of-two constant, so we end up with a parallel of an lshiftrt, an and (dead), and a clobber. This is not a recognised instruction. Before my patches this was a 2-1 combination, and the combiner throws away the dead set and everyone is happy. After the patches, this now is a 3-1 combination, the combiner does not throw away the dead set but tries to split the parallel into two, which does not work because one of the resulting insns has to end up as i2, which is earlier than the original sets. The combiner gives up. There already is code to throw away dead sets in the 3-1 case, but it only works for a parallel for two sets without any clobbers. This patch fixes it. Tested on powerpc64-linux (-m32,-m32/-mpowerpc64,-m64,-m64/-mlra); no regressions. Tested a cross to x86_64-linux on the PR65693 testcase, and it fixes it. Is this okay for current trunk? Segher 2015-04-08 Segher Boessenkool seg...@kernel.crashing.org * combine.c (is_parallel_of_n_reg_sets): Change first argument from an rtx_insn * to an rtx. (try_combine): Adjust both callers. Use it once more. OK. jeff
[PATCH] combine: Disregard clobbers in another test for two SETs (PR65693)
PR65693 exposes a case where combine does a worse job after my patches to split parallels before combining. We start with a parallel of an udiv and an umod, and a clobber; the umod is dead. The instruction is combined with one setting the divisor pseudo to a power-of-two constant, so we end up with a parallel of an lshiftrt, an and (dead), and a clobber. This is not a recognised instruction. Before my patches this was a 2-1 combination, and the combiner throws away the dead set and everyone is happy. After the patches, this now is a 3-1 combination, the combiner does not throw away the dead set but tries to split the parallel into two, which does not work because one of the resulting insns has to end up as i2, which is earlier than the original sets. The combiner gives up. There already is code to throw away dead sets in the 3-1 case, but it only works for a parallel for two sets without any clobbers. This patch fixes it. Tested on powerpc64-linux (-m32,-m32/-mpowerpc64,-m64,-m64/-mlra); no regressions. Tested a cross to x86_64-linux on the PR65693 testcase, and it fixes it. Is this okay for current trunk? Segher 2015-04-08 Segher Boessenkool seg...@kernel.crashing.org * combine.c (is_parallel_of_n_reg_sets): Change first argument from an rtx_insn * to an rtx. (try_combine): Adjust both callers. Use it once more. --- gcc/combine.c | 15 +-- 1 file changed, 5 insertions(+), 10 deletions(-)
diff --git a/gcc/combine.c b/gcc/combine.c index 14df228..32950383 100644 --- a/gcc/combine.c +++ b/gcc/combine.c @@ -2493,13 +2493,11 @@ update_cfg_for_uncondjump (rtx_insn *insn) } #ifndef HAVE_cc0 -/* Return whether INSN is a PARALLEL of exactly N register SETs followed +/* Return whether PAT is a PARALLEL of exactly N register SETs followed by an arbitrary number of CLOBBERs. */ static bool -is_parallel_of_n_reg_sets (rtx_insn *insn, int n) +is_parallel_of_n_reg_sets (rtx pat, int n) { - rtx pat = PATTERN (insn); - if (GET_CODE (pat) != PARALLEL) return false; @@ -2907,7 +2905,7 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0, decrement insn. */ if (i1 == 0 - && is_parallel_of_n_reg_sets (i2, 2) + && is_parallel_of_n_reg_sets (PATTERN (i2), 2) && (GET_MODE_CLASS (GET_MODE (SET_DEST (XVECEXP (PATTERN (i2), 0, 0)))) == MODE_CC) && GET_CODE (SET_SRC (XVECEXP (PATTERN (i2), 0, 0))) == COMPARE @@ -2939,7 +2937,7 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0, make those two SETs separate I1 and I2 insns, and make an I0 that is the original I1. */ if (i0 == 0 - && is_parallel_of_n_reg_sets (i2, 2) + && is_parallel_of_n_reg_sets (PATTERN (i2), 2) && can_split_parallel_of_n_reg_sets (i2, 2) && !reg_used_between_p (SET_DEST (XVECEXP (PATTERN (i2), 0, 0)), i2, i3) && !reg_used_between_p (SET_DEST (XVECEXP (PATTERN (i2), 0, 1)), i2, i3)) @@ -3460,10 +3458,7 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0, debug info less accurate. */ if (!(added_sets_2 && i1 == 0) - && GET_CODE (newpat) == PARALLEL - && XVECLEN (newpat, 0) == 2 - && GET_CODE (XVECEXP (newpat, 0, 0)) == SET - && GET_CODE (XVECEXP (newpat, 0, 1)) == SET + && is_parallel_of_n_reg_sets (newpat, 2) && asm_noperands (newpat) < 0) { rtx set0 = XVECEXP (newpat, 0, 0); -- 1.8.1.4
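The check named in the comment, "exactly N register SETs followed by an arbitrary number of CLOBBERs", can be sketched on a toy model. The code below does not use GCC's RTL at all: enum toy_code and toy_is_n_sets_then_clobbers are hypothetical, and the real function additionally requires each SET destination to be a register, which is an assumption about code not shown in the hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the element codes inside a PARALLEL.  */
enum toy_code { TOY_SET, TOY_CLOBBER, TOY_OTHER };

/* Return true if ELTS[0..LEN) consists of exactly N TOY_SET entries
   followed only by TOY_CLOBBER entries (possibly none).  */
static bool
toy_is_n_sets_then_clobbers (const enum toy_code *elts, int len, int n)
{
  assert (elts != NULL && len >= 0 && n >= 0);

  if (len < n)
    return false;

  /* The first N elements must all be SETs.  */
  for (int i = 0; i < n; i++)
    if (elts[i] != TOY_SET)
      return false;

  /* Everything after them must be a CLOBBER.  */
  for (int i = n; i < len; i++)
    if (elts[i] != TOY_CLOBBER)
      return false;

  return true;
}
```

In this model the pattern from PR65693, two sets plus a clobber, is accepted with n equal to 2, whereas a test that insists on a length of exactly two rejects it. That is the difference the last hunk of the patch removes.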
[PATCH] Fix ICE with x86_64 alloca (PR target/65693)
Hi! This patch extends the PR65220 patch: 1) there is really no reason to limit the divisor to 32 or 64, we can divide/modulo by pow2 constants up to 2G (0x7fffffff is then still representable in 32-bit signed immediate) 2) on the testcase RTL cprop unfortunately isn't performed, because the function contains only a single basic block, and the combiner, when the constant is propagated into it, simplifies it to the shift and the and; so the patch adds another pattern similar to the previous one to handle this case. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2015-04-08 Jakub Jelinek ja...@redhat.com PR target/65693 * config/i386/i386.md (*udivmod<mode>4_pow2): Allow any pow2 integer in between 2 and 0x80000000U inclusive. (*udivmod<mode>4_pow2_1): New define_insn_and_split. * gcc.target/i386/pr65693.c: New test. --- gcc/config/i386/i386.md.jj 2015-04-03 15:32:30.0 +0200 +++ gcc/config/i386/i386.md 2015-04-08 17:30:10.615369134 +0200 @@ -7340,7 +7340,7 @@ (define_insn_and_split "*udivmod<mode>4_ (set (match_operand:SWI48 1 "register_operand" "=r") (umod:SWI48 (match_dup 2) (match_dup 3))) (clobber (reg:CC FLAGS_REG))] - "UINTVAL (operands[3]) - 2 < <MODE_SIZE> * BITS_PER_UNIT + "IN_RANGE (INTVAL (operands[3]), 2, HOST_WIDE_INT_UC (0x80000000)) && (UINTVAL (operands[3]) & (UINTVAL (operands[3]) - 1)) == 0" "#" "&& 1" @@ -7357,6 +7357,27 @@ (define_insn_and_split "*udivmod<mode>4_ [(set_attr "type" "multi") (set_attr "mode" "<MODE>")]) +;; Similarly, but for the case when the combiner simplifies it.
+(define_insn_and_split "*udivmod<mode>4_pow2_1" + [(set (match_operand:SWI48 0 "register_operand" "=r") + (lshiftrt:SWI48 (match_operand:SWI48 2 "register_operand" "0") + (match_operand:SWI48 3 "const_int_operand" "n"))) + (set (match_operand:SWI48 1 "register_operand" "=r") + (and:SWI48 (match_dup 2) (match_operand:SWI48 4 "const_int_operand" "n"))) + (clobber (reg:CC FLAGS_REG))] + "IN_RANGE (INTVAL (operands[3]), 1, 31) + && UINTVAL (operands[4]) == (HOST_WIDE_INT_1U << INTVAL (operands[3])) - 1" + "#" + "&& 1" + [(set (match_dup 1) (match_dup 2)) + (parallel [(set (match_dup 0) (lshiftrt:<MODE> (match_dup 2) (match_dup 3))) + (clobber (reg:CC FLAGS_REG))]) + (parallel [(set (match_dup 1) (and:<MODE> (match_dup 1) (match_dup 4))) + (clobber (reg:CC FLAGS_REG))])] + "" + [(set_attr "type" "multi") + (set_attr "mode" "<MODE>")]) + (define_insn "*udivmod<mode>4_noext" [(set (match_operand:SWIM248 0 "register_operand" "=a") (udiv:SWIM248 (match_operand:SWIM248 2 "register_operand" "0") --- gcc/testsuite/gcc.target/i386/pr65693.c.jj 2015-04-08 17:42:14.146727788 +0200 +++ gcc/testsuite/gcc.target/i386/pr65693.c 2015-04-08 17:41:22.0 +0200 @@ -0,0 +1,13 @@ +/* PR target/65693 */ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ + +int a; + +void +foo (int (*fn) (int, int, int), unsigned int b) +{ + unsigned long *c = (unsigned long *) __builtin_alloca (b); + a = *c; + register int d asm ("edx") = fn (0, 0, d); +} Jakub
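The arithmetic both patterns rely on is that unsigned division and modulo by a power of two reduce to a shift and a mask. The following stand-alone sketch shows that identity for 32-bit values; is_pow2 and udivmod_pow2 are hypothetical helper names, not anything from i386.md, and unlike the pattern (which starts at 2) the sketch also accepts a divisor of 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if D has exactly one bit set; the same test as the
   (d & (d - 1)) == 0 condition in the insn pattern, plus a zero check.  */
static bool
is_pow2 (uint32_t d)
{
  return d != 0 && (d & (d - 1)) == 0;
}

/* Compute X / D and X % D for a power-of-two D using a right shift and
   an AND with D - 1.  */
static void
udivmod_pow2 (uint32_t x, uint32_t d, uint32_t *quot, uint32_t *rem)
{
  assert (is_pow2 (d));

  /* Find log2 (d); d has a single bit set, so this loop terminates.  */
  unsigned shift = 0;
  while ((d >> shift) != 1u)
    shift++;

  *quot = x >> shift;
  *rem = x & (d - 1);
}
```

For the largest divisor the patch allows, 0x80000000, the mask is 0x7fffffff, which is the value the description notes still fits in a 32-bit signed immediate.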
[PATCH] Be less conservative in process_{output,input}_constraints (PR target/65689)
Hi! Right now, for constraints that are not hardcoded in stmt.c and are not define_{register,address,memory}_constraint, stmt.c just assumes the constraint might allow both reg and mem. Unfortunately, for some constraints which clearly can't allow either of those, this leads to errors at -O0, because the expander doesn't try so hard to expand it as EXPAND_INITIALIZER. The following patch is an attempt to handle at least the easy cases - define_constraint like: (define_constraint "S" "A constraint that matches an absolute symbolic address." (and (match_code "const,symbol_ref,label_ref") (match_test "aarch64_symbolic_address_p (op)"))) where the match_code clearly proves that it never can match any REG/SUBREG, nor MEM, by teaching genpreds.c to emit an extra inline function that stmt.c can in process_{output,input}_constraint use for the unknown constraints. On x86_64/i686 this only detects constraint G as not allowing reg nor mem (it is match_code const_double), and V (plus < and >, but those are hardcoded in stmt.c already) as allowing mem but not reg. On aarch64, in the first category it detects several constraints. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2015-04-08 Jakub Jelinek ja...@redhat.com PR target/65689 * genpreds.c (struct constraint_data): Add maybe_allows_reg and maybe_allows_mem bitfields. (maybe_allows_none_start, maybe_allows_none_end, maybe_allows_reg_start, maybe_allows_reg_end, maybe_allows_mem_start, maybe_allows_mem_end): New variables. (compute_maybe_allows): New function. (add_constraint): Use it to initialize maybe_allows_reg and maybe_allows_mem fields. (choose_enum_order): Sort the non-is_register/is_const_int/is_memory/ is_address constraints such that those that allow neither mem nor reg come first, then those that only allow reg but not mem, then those that only allow mem but not reg, then the rest. (write_allows_reg_mem_function): New function. (write_tm_preds_h): Call it.
* stmt.c (parse_output_constraint, parse_input_constraint): Use the generated insn_extra_constraint_allows_reg_mem function instead of always setting *allows_reg = true; *allows_mem = true; for unknown extra constraints. --- gcc/genpreds.c.jj 2015-01-09 21:59:54.0 +0100 +++ gcc/genpreds.c 2015-04-08 14:09:51.713934240 +0200 @@ -640,12 +640,14 @@ struct constraint_data const char *regclass; /* for register constraints */ rtx exp; /* for other constraints */ unsigned int lineno; /* line of definition */ - unsigned int is_register : 1; - unsigned int is_const_int : 1; - unsigned int is_const_dbl : 1; - unsigned int is_extra : 1; - unsigned int is_memory: 1; - unsigned int is_address : 1; + unsigned int is_register : 1; + unsigned int is_const_int: 1; + unsigned int is_const_dbl: 1; + unsigned int is_extra: 1; + unsigned int is_memory : 1; + unsigned int is_address : 1; + unsigned int maybe_allows_reg : 1; + unsigned int maybe_allows_mem : 1; }; /* Overview of all constraints beginning with a given letter. */ @@ -691,6 +693,9 @@ static unsigned int satisfied_start; static unsigned int const_int_start, const_int_end; static unsigned int memory_start, memory_end; static unsigned int address_start, address_end; +static unsigned int maybe_allows_none_start, maybe_allows_none_end; +static unsigned int maybe_allows_reg_start, maybe_allows_reg_end; +static unsigned int maybe_allows_mem_start, maybe_allows_mem_end; /* Convert NAME, which contains angle brackets and/or underscores, to a string that can be used as part of a C identifier. The string @@ -711,6 +716,46 @@ mangle (const char *name) return XOBFINISH (rtl_obstack, const char *); } +/* Return a bitmask, bit 1 if EXP maybe allows a REG/SUBREG, 2 if EXP + maybe allows a MEM. This should be conservative. */ +static int +compute_maybe_allows (rtx exp) +{ + switch (GET_CODE (exp)) +{ +case IF_THEN_ELSE: + /* Conservative answer is like IOR, of the THEN and ELSE branches. 
*/ + return compute_maybe_allows (XEXP (exp, 1)) +| compute_maybe_allows (XEXP (exp, 2)); +case AND: + return compute_maybe_allows (XEXP (exp, 0)) +& compute_maybe_allows (XEXP (exp, 1)); +case IOR: + return compute_maybe_allows (XEXP (exp, 0)) +| compute_maybe_allows (XEXP (exp, 1)); +case MATCH_CODE: + if (*XSTR (exp, 1) == '\0') + { + const char *code, *codes = XSTR (exp, 0); + int ret = 0; + while ((code = scan_comma_elt (&codes)) != 0) + if (strncmp (code, "reg", 3) == 0 +&& (code[3] == ',' || code[3] == '\0' || code[3] == ' ')) + ret |= 1; + else if (strncmp (code, "subreg", 6) == 0 + && (code[6] == ',' || code[6] == '\0' || code[6]
Re: [PATCH] Be less conservative in process_{output,input}_constraints (PR target/65689)
On Wed, Apr 08, 2015 at 05:16:08PM -0500, Segher Boessenkool wrote: On Wed, Apr 08, 2015 at 05:12:07PM -0500, Segher Boessenkool wrote: On Wed, Apr 08, 2015 at 11:00:59PM +0200, Jakub Jelinek wrote: +case MATCH_CODE: + if (*XSTR (exp, 1) == '\0') + { + const char *code, *codes = XSTR (exp, 0); + int ret = 0; + while ((code = scan_comma_elt (&codes)) != 0) + if (strncmp (code, "reg", 3) == 0 + && (code[3] == ',' || code[3] == '\0' || code[3] == ' ')) This doesn't allow other whitespace. Maybe it's cleaner written as e.g. codes - code == 3 ... and that doesn't handle trailing whitespace. Ugh. Yeah. Guess I should use (code[3] == ',' || code[3] == '\0' || ISSPACE (code[3])) instead then. Jakub
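The token-matching problem discussed here (find a code name in a comma-separated list while tolerating any whitespace around the entries) can be sketched independently of genpreds.c. The function below is a hypothetical stand-alone helper, list_has_code, written against the C library's isspace rather than GCC's ISSPACE and scan_comma_elt; it illustrates the length-plus-terminator check and is not the code that was committed.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return true if the comma-separated list CODES contains an entry equal
   to WANT.  Whitespace before or after an entry is ignored, so
   "reg ,mem" and "mem,\treg" both contain "reg", while "subreg" and
   "regx" do not.  */
static bool
list_has_code (const char *codes, const char *want)
{
  assert (codes != NULL && want != NULL);
  size_t want_len = strlen (want);
  const char *p = codes;

  while (*p != '\0')
    {
      /* Skip leading whitespace of this entry.  */
      while (isspace ((unsigned char) *p))
        p++;

      /* The entry runs up to a comma, whitespace, or the end.  */
      const char *start = p;
      while (*p != '\0' && *p != ',' && !isspace ((unsigned char) *p))
        p++;

      if ((size_t) (p - start) == want_len
          && strncmp (start, want, want_len) == 0)
        return true;

      /* Skip trailing whitespace up to the separator, then step past it.  */
      while (*p != '\0' && *p != ',')
        p++;
      if (*p == ',')
        p++;
    }

  return false;
}
```

Comparing the entry length as well as the prefix is what keeps "subreg" from matching "reg", and treating any isspace character as a terminator covers the trailing-whitespace case raised in the thread.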
breakage with [PATCH] combine: Disregard clobbers in another test for two SETs (PR65693)
On Wed, 8 Apr 2015, Segher Boessenkool wrote: 2015-04-08 Segher Boessenkool seg...@kernel.crashing.org * combine.c (is_parallel_of_n_reg_sets): Change first argument from an rtx_insn * to an rtx. (try_combine): Adjust both callers. Use it once more. That "once more" is outside of #ifndef HAVE_cc0 and is_parallel_of_n_reg_sets is only defined inside of one. Boom. A full test on a cc0 target (such as cris-elf) is advised, and at least make all-gcc would be a minimum after fixing. cutnpaste: g++ -c -g -O2 -DIN_GCC -DCROSS_DIRECTORY_STRUCTURE -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wwrite-strings -Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -fno-common -DHAVE_CONFIG_H -I. -I. -I/tmp/hpautotest-gcc1/gcc/gcc -I/tmp/hpautotest-gcc1/gcc/gcc/. -I/tmp/hpautotest-gcc1/gcc/gcc/../include -I/tmp/hpautotest-gcc1/gcc/gcc/../libcpp/include -I/tmp/hpautotest-gcc1/cris-elf/gccobj/./gmp -I/tmp/hpautotest-gcc1/gcc/gmp -I/tmp/hpautotest-gcc1/cris-elf/gccobj/./mpfr -I/tmp/hpautotest-gcc1/gcc/mpfr -I/tmp/hpautotest-gcc1/gcc/mpc/src -I/tmp/hpautotest-gcc1/gcc/gcc/../libdecnumber -I/tmp/hpautotest-gcc1/gcc/gcc/../libdecnumber/dpd -I../libdecnumber -I/tmp/hpautotest-gcc1/gcc/gcc/../libbacktrace -o combine.o -MT combine.o -MMD -MP -MF ./.deps/combine.TPo /tmp/hpautotest-gcc1/gcc/gcc/combine.c /tmp/hpautotest-gcc1/gcc/gcc/combine.c: In function 'rtx_insn* try_combine(rtx_insn*, rtx_insn*, rtx_insn*, rtx_insn*, int*, rtx_insn*)': /tmp/hpautotest-gcc1/gcc/gcc/combine.c:3461: error: 'is_parallel_of_n_reg_sets' was not declared in this scope make[2]: *** [combine.o] Error 1 brgds, H-P
Re: [PATCH] Be less conservative in process_{output,input}_constraints (PR target/65689)
On Wed, Apr 08, 2015 at 11:00:59PM +0200, Jakub Jelinek wrote:
> +    case MATCH_CODE:
> +      if (*XSTR (exp, 1) == '\0')
> +        {
> +          const char *code, *codes = XSTR (exp, 0);
> +          int ret = 0;
> +          while ((code = scan_comma_elt (&codes)) != 0)
> +            if (strncmp (code, "reg", 3) == 0
> +                && (code[3] == ',' || code[3] == '\0' || code[3] == ' '))

This doesn't allow other whitespace. Maybe it's cleaner written as
e.g. codes - code == 3 ?

Segher
Re: [PATCH] Be less conservative in process_{output,input}_constraints (PR target/65689)
On Wed, Apr 08, 2015 at 05:12:07PM -0500, Segher Boessenkool wrote:
> On Wed, Apr 08, 2015 at 11:00:59PM +0200, Jakub Jelinek wrote:
> > +    case MATCH_CODE:
> > +      if (*XSTR (exp, 1) == '\0')
> > +        {
> > +          const char *code, *codes = XSTR (exp, 0);
> > +          int ret = 0;
> > +          while ((code = scan_comma_elt (&codes)) != 0)
> > +            if (strncmp (code, "reg", 3) == 0
> > +                && (code[3] == ',' || code[3] == '\0' || code[3] == ' '))
>
> This doesn't allow other whitespace. Maybe it's cleaner written as
> e.g. codes - code == 3

... and that doesn't handle trailing whitespace. Ugh.

Segher
Re: [PATCH, bfin] handle BFIN_CPU_UNKNOWN in TARGET_CPU_CPP_BUILTINS
ping

On 1 April 2015 at 23:34, Bernhard Reutner-Fischer rep.dot@gmail.com wrote:
> Hi,
>
> gcc/c-family/c-cppbuiltin.c
> In file included from ./tm.h:21:0,
>                  from ../../../../../../home/me/src/gcc-5.0.mine/gcc/c-family/c-cppbuiltin.c:23:
> ../../../../../../home/me/src/gcc-5.0.mine/gcc/c-family/c-cppbuiltin.c: In function ‘void c_cpp_builtins(cpp_reader*)’:
> ../../../../../../home/me/src/gcc-5.0.mine/gcc/config/bfin/bfin.h:43:14: error: enumeration value ‘BFIN_CPU_UNKNOWN’ not handled in switch [-Werror=switch]
>        switch (bfin_cpu_type) \
>               ^
> ../../../../../../home/me/src/gcc-5.0.mine/gcc/c-family/c-cppbuiltin.c:1243:3: note: in expansion of macro ‘TARGET_CPU_CPP_BUILTINS’
>    TARGET_CPU_CPP_BUILTINS ();
>    ^
> cc1plus: all warnings being treated as errors
> make[2]: *** [c-family/c-cppbuiltin.o] Error 1
>
> gcc/ChangeLog:
>
> 2015-04-01  Bernhard Reutner-Fischer  al...@gcc.gnu.org
>
>   * config/bfin/bfin.h (TARGET_CPU_CPP_BUILTINS): Add BFIN_CPU_UNKNOWN.
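The diagnostic quoted above comes from -Wswitch: a switch over an enumerated type that has no default label must name every enumerator, and -Werror turns the warning into a build failure. A minimal sketch of the pattern follows, assuming the fix takes the form of an explicit case for the "unknown" enumerator (the patch body itself is not shown in this message). `demo_cpu`, its enumerators, and `demo_cpu_macro` are hypothetical stand-ins, not names from bfin.h.

```c
#include <stddef.h>

/* Hypothetical CPU enumeration with an "unknown" member, standing in
   for the real target's CPU type enum.  */
enum demo_cpu
{
  DEMO_CPU_UNKNOWN,
  DEMO_CPU_A,
  DEMO_CPU_B
};

/* Return the name of the builtin macro to define for CPU, or NULL
   when no CPU-specific macro applies.  */
const char *
demo_cpu_macro (enum demo_cpu cpu)
{
  switch (cpu)
    {
    case DEMO_CPU_UNKNOWN:
      /* Listed explicitly so that -Wswitch sees every enumerator
         handled; removing this case brings the warning back.  */
      return NULL;
    case DEMO_CPU_A:
      return "__DEMO_A__";
    case DEMO_CPU_B:
      return "__DEMO_B__";
    }
  return NULL;
}
```

Listing the enumerator explicitly, rather than adding a default label, keeps -Wswitch useful: the compiler will still complain if a new CPU is added to the enum without a matching case.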