RE: [PATCH, Fortan] fix initialization of flag_errno_math and flag_associative_math
Why do you want -fno-math-errno to be the default for gfortran? because it was with rth's commit? This makes sense also because errno is a variable that is defined for C, and not accessible from Fortran. Why would you want -fmath-errno to be the default ? if (flag_associative_math == -1) flag_associative_math = (!flag_trapping_math !flag_signed_zeros); Why this is not working is beyond my understanding, but I am not sure that your fix is the right one. while the details of the option handling are rather opaque to me to, at the point of reaching this statement, flag_associative_math currently equals 0 or 1 even if the user didn't specify the flag. I believe as set by common_handle_option. With my patch, the value of -1 is seen when the user did not specify the flag explicitly. Note that the go frontend uses a similar approach for errno. Joost
Re: [PATCH] Do not build libsanitizer also for powerpc*-*-linux*
Looks like now function does not return anything for ARM case? I'd say we should replace this pc = ... with return like all other cases, the code is just asking for trouble. --- libsanitizer/sanitizer_common/sanitizer_stacktrace.cc (revision 209878) +++ libsanitizer/sanitizer_common/sanitizer_stacktrace.cc (revision 209879) @@ -18,11 +18,13 @@ namespace __sanitizer { uptr StackTrace::GetPreviousInstructionPc(uptr pc) { -#ifdef __arm__ +#if defined(__arm__) // Cancel Thumb bit. pc = pc (~1); -#endif -#if defined(__sparc__) +#elif defined(__powerpc__) || defined(__powerpc64__) + // PCs are always 4 byte aligned. + return pc - 4; +#elif defined(__sparc__) return pc - 8; #else return pc - 1;
[C/C++ PATCH] Add -Wlogical-not-parentheses (PR c/49706)
PR61271 showed that this warning, recently added to clang, is quite useful and detected several reckless cases in the GCC codebase. This patch adds this warning even for GCC. Due to PR61271, it's not a part of -Wall, that would break the bootstrap, but after that is fixed, I think we want this warning be a part of -Wall or -Wextra. It's possible to suppress the warning by enclosing the LHS in parentheses. This proved to be hard to achieve in the C++ FE, so I had to go all the way up to parser... Jason, does the bool parenthesized_not_lhs_warn line looks reasonable? Also for C++, I think we don't want this warning to trigger when the operator (==, !=, , , =, =) is overloaded. But I admit I haven't even tried that, and I don't know how to detect overloaded operators except DECL_OVERLOADED_OPERATOR_P. Regtested/bootstrapped on x86_64-unknown-linux-gnu, ok for trunk? 2014-06-02 Marek Polacek pola...@redhat.com PR c/49706 * doc/invoke.texi: Document -Wlogical-not-parentheses. c-family/ * c-common.c (warn_logical_not_parentheses): New function. * c-common.h (warn_logical_not_parentheses): Declare. * c.opt (Wlogical-not-parentheses): New option. c/ * c-typeck.c (parser_build_binary_op): Warn when logical not is used on the left hand side operand of a comparison. cp/ * parser.c (cp_parser_binary_expression): Warn when logical not is used on the left hand side operand of a comparison. testsuite/ * c-c++-common/pr49706.c: New test. diff --git gcc/c-family/c-common.c gcc/c-family/c-common.c index 6ec14fc..650afaf 100644 --- gcc/c-family/c-common.c +++ gcc/c-family/c-common.c @@ -1722,6 +1722,25 @@ warn_logical_operator (location_t location, enum tree_code code, tree type, } } +/* Warn about logical not used on the left hand side operand of a comparison. + This function assumes that the LHS is inside of TRUTH_NOT_EXPR. + Do not warn if the LHS or RHS is of a boolean or a vector type. */ + +void +warn_logical_not_parentheses (location_t location, enum tree_code code, + tree lhs, tree rhs) +{ + if (TREE_CODE_CLASS (code) != tcc_comparison + || TREE_CODE (TREE_TYPE (lhs)) == BOOLEAN_TYPE + || TREE_CODE (TREE_TYPE (rhs)) == BOOLEAN_TYPE + || VECTOR_TYPE_P (TREE_TYPE (lhs)) + || VECTOR_TYPE_P (TREE_TYPE (rhs))) +return; + + warning_at (location, OPT_Wlogical_not_parentheses, + logical not is only applied to the left hand side of + comparison); +} /* Warn if EXP contains any computations whose results are not used. Return true if a warning is printed; false otherwise. LOCUS is the diff --git gcc/c-family/c-common.h gcc/c-family/c-common.h index 0d34004..83d5dee 100644 --- gcc/c-family/c-common.h +++ gcc/c-family/c-common.h @@ -772,6 +772,8 @@ extern void overflow_warning (location_t, tree); extern bool warn_if_unused_value (const_tree, location_t); extern void warn_logical_operator (location_t, enum tree_code, tree, enum tree_code, tree, enum tree_code, tree); +extern void warn_logical_not_parentheses (location_t, enum tree_code, tree, + tree); extern void check_main_parameter_types (tree decl); extern bool c_determine_visibility (tree); extern bool vector_types_compatible_elements_p (tree, tree); diff --git gcc/c-family/c.opt gcc/c-family/c.opt index c586e65..d51c890 100644 --- gcc/c-family/c.opt +++ gcc/c-family/c.opt @@ -490,6 +490,10 @@ Wlogical-op C ObjC C++ ObjC++ Var(warn_logical_op) Init(0) Warning Warn when a logical operator is suspiciously always evaluating to true or false +Wlogical-not-parentheses +C ObjC C++ ObjC++ Var(warn_logical_not_paren) Warning +Warn when logical not is used on the left hand side operand of a comparison + Wlong-long C ObjC C++ ObjC++ Var(warn_long_long) Init(-1) Warning Do not warn about using \long long\ when -pedantic diff --git gcc/c/c-typeck.c gcc/c/c-typeck.c index 6ca584b..f3b142f 100644 --- gcc/c/c-typeck.c +++ gcc/c/c-typeck.c @@ -3401,6 +3401,11 @@ parser_build_binary_op (location_t location, enum tree_code code, warn_logical_operator (location, code, TREE_TYPE (result.value), code1, arg1.value, code2, arg2.value); + if (warn_logical_not_paren + code1 == TRUTH_NOT_EXPR + TREE_CODE (arg1.value) == EQ_EXPR) +warn_logical_not_parentheses (location, code, arg1.value, arg2.value); + /* Warn about comparisons against string literals, with the exception of testing for equality or inequality of a string literal with NULL. */ if (code == EQ_EXPR || code == NE_EXPR) diff --git gcc/cp/parser.c gcc/cp/parser.c index 2591ae5..1ced05a 100644 --- gcc/cp/parser.c +++ gcc/cp/parser.c @@ -7883,6 +7883,8 @@ cp_parser_binary_expression (cp_parser* parser, bool cast_p, enum tree_code rhs_type; enum cp_parser_prec new_prec, lookahead_prec; tree
Re: [C PATCH] Don't reject valid code with _Alignas (PR c/61053)
Ping. On Sun, May 25, 2014 at 11:30:51AM +0200, Marek Polacek wrote: On Mon, May 05, 2014 at 10:27:03PM +0200, Marek Polacek wrote: In this PR the issue is that we reject (valid) code such as _Alignas (long long) long long foo; with -m32, because we trip this condition: alignas_align = 1U declspecs-align_log; if (alignas_align TYPE_ALIGN_UNIT (type)) { if (name) error_at (loc, %_Alignas% specifiers cannot reduce alignment of %qE, name); and error later on, since alignas_align is 4 (correct, see PR52023 for why), but TYPE_ALIGN_UNIT of long long is 8. I think TYPE_ALIGN_UNIT is wrong here as that won't give us minimal alignment required. In c_sizeof_or_alignof_type we already have the code to compute such minimal alignment so I just moved the code to a separate function and used that instead of TYPE_ALIGN_UNIT. Note that the test is run only on i?86 and x86_64, because we can't (?) easily determine which target requires what alignment. Regtested/bootstrapped on x86_64-unknown-linux-gnu and powerpc64-unknown-linux-gnu, ok for trunk? Can I backport this one to 4.9? Marek
Re: [PATCH][2/n] Always 64bit-HWI cleanups
On Wed, 28 May 2014, Richard Sandiford wrote: Richard Biener rguent...@suse.de writes: The following changes the configury to insist on [u]int64_t being available and removes the very old __int64 case. Autoconf doesn't check for it, support came in via a big merge in Dec 2002, r60174, and it was never used on the libcpp side until I fixed that with the last patch of this series, so we couldn't have relied on it at least since libcpp was introduced. Both libcpp and vmsdbg.h now use [u]int64_t, switching HOST_WIDE_INT to literally use int64_t has to be done with the grand renaming of all users due to us using 'unsigned HOST_WIDE_INT'. Btw, I couldn't find any standard way of writing [u]int64_t literals (substitution for HOST_WIDE_INT_C) nor one for printf formats (substitutions for HOST_WIDE_INT_PRINT and friends). I'll consider doing s/HOST_WIDE_INT/[U]INT64/ there if nobody comes up with a better plan. Not sure whether you meant that to apply to both groups, but as far as the HOST_WIDE_INT_C replacement goes, how about just using int64_t (N) instead of HOST_WIDE_INT_C (N) or INT64_C (N)? int64_t (N) would already be taken, but yes, I considered [U]INT64_C (N). Let's see. Richard.
[PATCH] Fix PR61378
Committed as obvious. Richard. 2014-06-02 Richard Biener rguent...@suse.de PR tree-optimization/61378 * tree-ssa-sccvn.c (vn_reference_lookup_3): Initialize valueized_anything. Index: gcc/tree-ssa-sccvn.c === *** gcc/tree-ssa-sccvn.c(revision 211021) --- gcc/tree-ssa-sccvn.c(working copy) *** vn_reference_lookup_3 (ao_ref *ref, tree *** 1613,1619 conditional calls to free falsely clobbering ref because of imprecise points-to info of the argument. */ tree oldargs[4]; ! bool valueized_anything; for (unsigned i = 0; i gimple_call_num_args (def_stmt); ++i) { oldargs[i] = gimple_call_arg (def_stmt, i); --- 1613,1619 conditional calls to free falsely clobbering ref because of imprecise points-to info of the argument. */ tree oldargs[4]; ! bool valueized_anything = false; for (unsigned i = 0; i gimple_call_num_args (def_stmt); ++i) { oldargs[i] = gimple_call_arg (def_stmt, i);
GCC 4.7.4 Status Report, branch frozen for release (candidate)
Status == The GCC 4.7 branch is now frozen as I am preparing a first release candidate for GCC 4.7.4. All changes from now on require release manager approval. After GCC 4.7.4 is released the branch will be closed. Previous Report === https://gcc.gnu.org/ml/gcc/2013-04/msg00121.html I will send the next status report once GCC 4.7.4 RC1 is ready.
RE: [PATCH][MIPS] Implement O32 FPXX ABI (GCC)
Hi Richard, I've realised that I may need to do 'something' to prevent GCC from loading or storing DFmode/DImode values to/from FPRs using pairs of SWC1/LWC1 when using an unaligned address. Initial tests show that when loading from an unaligned address (4-byte aligned) then GCC loads the two halves of a 64-bit value into GPRs and then moves across to FPRs. This is good but I don't know if it is guaranteed. From what I can tell the backend doesn't specifically deal with loading unaligned data but rather the normal load/store patterns are used by the middle end. As such I'm not sure there is anything to prevent direct loads to FPRs by parts. Do you know one way or the other if unaligned doubles can currently be loaded via pairs of lwc1 (same for store) and if so can you advise on an approach I could try to prevent this for FPXX? I will try to figure this out on my own in the meantime. Regards, Matthew
Re: [PATCH, sched] Cleanup and improve multipass_dfa_lookahead_guard
Regtested on ia64-suse-linux and installed as obvious. Andreas. * config/ia64/ia64.c (ia64_first_cycle_multipass_dfa_lookahead_guard): Check pending_data_specs first. diff --git a/gcc/config/ia64/ia64.c b/gcc/config/ia64/ia64.c index 118e5bf..4c5390b 100644 --- a/gcc/config/ia64/ia64.c +++ b/gcc/config/ia64/ia64.c @@ -7536,7 +7536,7 @@ ia64_first_cycle_multipass_dfa_lookahead_guard (rtx insn, int ready_index) /* Size of ALAT is 32. As far as we perform conservative data speculation, we keep ALAT half-empty. */ - if ((TODO_SPEC (insn) BEGIN_DATA) pending_data_specs = 16) + if (pending_data_specs = 16 (TODO_SPEC (insn) BEGIN_DATA)) return ready_index == 0 ? -1 : 1; if (ready_index == 0) -- 2.0.0 -- Andreas Schwab, SUSE Labs, sch...@suse.de GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7 And now for something completely different.
[PATCH] Testcase for PR61346
Committed. Richard. 2014-06-02 Richard Biener rguent...@suse.de PR tree-optimization/61346 * gcc.dg/torture/pr61346.c: New testcase. Index: gcc/testsuite/gcc.dg/torture/pr61346.c === --- gcc/testsuite/gcc.dg/torture/pr61346.c (revision 0) +++ gcc/testsuite/gcc.dg/torture/pr61346.c (working copy) @@ -0,0 +1,162 @@ +/* { dg-do run } */ + +extern void abort (void); + +typedef int int32_t __attribute__ ((mode (SI))); +typedef int int64_t __attribute__ ((mode (DI))); +typedef __SIZE_TYPE__ size_t; + +struct slice +{ + unsigned char *data; + int64_t len; + int64_t cap; +}; + +void fail (int32_t) __attribute__ ((noinline)); +void +fail (int32_t c) +{ + if (c != 0) +abort (); +} + +struct decode_rune_ret +{ + int32_t r; + int64_t width; +}; + +struct decode_rune_ret decode_rune (struct slice) __attribute__ ((noinline)); +struct decode_rune_ret +decode_rune (struct slice s) +{ + struct decode_rune_ret dr; + dr.r = s.data[0]; + dr.width = 1; + return dr; +} + +_Bool is_space (int32_t) __attribute__ ((noinline)); +_Bool +is_space (int32_t r) +{ + return r == ' '; +} + +struct ret +{ + int64_t advance; + struct slice token; +}; + +struct ret scanwords (struct slice, _Bool) __attribute__ ((noinline)); + +struct ret +scanwords (struct slice data, _Bool ateof) +{ + int64_t advance; + struct slice token; + int64_t start = 0; + { +int64_t width; +for (width = 0; start data.len; start += width) + { + int32_t r = 0; + struct slice s; + if (start data.cap || start 0) + fail (3); + s.data = data.data + (size_t) start; + s.len = data.len - start; + s.cap = data.cap - start; + struct decode_rune_ret dr = decode_rune (s); + r = dr.r; + width = dr.width; + if (!is_space (r)) + break; + } + } + _Bool tmp = ateof; + if (tmp != 0) +goto L1; + else +goto L2; + L1: + tmp = data.len == 0; + L2: + if (tmp != 0) +goto L11; + else +goto L12; + L11: +{ + struct ret r; + advance = 0; + token.data = 0; + token.len = 0; + token.cap = 0; + r.advance = advance; + r.token = token; + return r; +} + L12:; + int64_t width; + int64_t i; + for (width = 0, i = start; i data.len; i += width) +{ + int32_t r; + struct slice s; + if (i data.cap || i 0) + fail (3); + s.data = data.data + i; + s.len = data.len - i; + s.cap = data.cap - i; + struct decode_rune_ret dr = decode_rune (s); + r = dr.r; + width = dr.width; + if (is_space (r)) + { + if (i start || i data.cap || i 0) + fail (3); + if (start data.cap || start 0) + fail (3); + struct ret r; + advance = i + width; + token.data = data.data + (size_t) start; + token.len = i - start; + token.cap = data.cap - start; + r.advance = advance; + r.token = token; + return r; + } +} + { +struct ret r; +advance = 0; +token.data = 0; +token.len = 0; +token.cap = 0; +r.advance = advance; +r.token = token; +return r; + } +} + +int +main () +{ + unsigned char buf[1000]; + struct slice s; + __builtin_memset (buf, 0, sizeof (buf)); + buf[0] = ' '; + buf[1] = 'a'; + buf[2] = ' '; + s.data = buf; + s.len = 3; + s.cap = sizeof (buf); + struct ret r; + r = scanwords (s, 1); + if (r.advance != 3 || r.token.data[0] != 'a' || r.token.len != 1) +abort (); + return 0; +}
Re: [PATCH] Do not build libsanitizer also for powerpc*-*-linux*
On 06/02/2014 11:56 AM, Jakub Jelinek wrote: Looks like now function does not return anything for ARM case? I'd say we should replace this pc = ... with return like all other cases, the code is just asking for trouble. But it should be return (pc ~(uptr)1) - 1; right? Yes, I had something like this in mind. -Y
Re: [PATCH][MIPS] Implement O32 FPXX ABI (GCC)
Matthew Fortune matthew.fort...@imgtec.com writes: I've realised that I may need to do 'something' to prevent GCC from loading or storing DFmode/DImode values to/from FPRs using pairs of SWC1/LWC1 when using an unaligned address. Initial tests show that when loading from an unaligned address (4-byte aligned) then GCC loads the two halves of a 64-bit value into GPRs and then moves across to FPRs. This is good but I don't know if it is guaranteed. From what I can tell the backend doesn't specifically deal with loading unaligned data but rather the normal load/store patterns are used by the middle end. As such I'm not sure there is anything to prevent direct loads to FPRs by parts. Do you know one way or the other if unaligned doubles can currently be loaded via pairs of lwc1 (same for store) and if so can you advise on an approach I could try to prevent this for FPXX? I will try to figure this out on my own in the meantime. The port does handle some widths of unaligned access via the {insv,extv,extzv}misalign patterns. That's how an unaligned DImode value will be handled on 64-bit targets. The MIPS *misalign patterns only handle integer modes, so for other types of mode the target-independent code will fall back to using an integer load followed by a subreg (or a subreg followed by an integer store). IIRC that's how an unaligned DFmode access will be handled on 64-bit targets. For modes that are larger or smaller than *misalign can handle, the target-independent code has to split the access up into smaller pieces and reassemble them. And these pieces have to have integer modes. E.g. on 32-bit targets a 4-byte-misaligned load into (reg:DF x) could be done by loading (subreg:SI (reg:DF x) 0) and (subreg:SI (reg:DF x) 4). The thing that prevents these individual loads from using LWC1 is CANNOT_CHANGE_MODE_CLASS, which (among other things) makes it invalid for any target-independent code to reduce a subreg of an FPR pair to an individual FPR. [FWIW, the reason MIPS doesn't define {insv,extv,extzv}misalign for things like DImode on 32-bit targets is because there's no special architecture feature than can be used. It's just a case of decomposing the access. Since that's a general technique, we want to make the target-independent code do it as well as possible rather than handle it in a port-specific way.] So yeah, the combination of (a) STRICT_ALIGNMENT, (b) the lack of unaligned floating-point load/store patterns and (c) CANNOT_CHANGE_MODE_CLASS should guarantee what you want. Thanks, Richard
[PATCH] Fix PR ipa/61190
Hi, the test case g++.old-deja/g++.mike/p4736b.C is mis-compiled with with all optimization levels, except -O0 and -Og. This probably started with gcc 4.7.x. The constructor Main::Main() is first inlined and then completely optimized away in the dce1 pass. But the thunk is still using the vtable, and therefore crashes. Well, the problem seems to be, that the thunk code is not emitted in the normal way using gimple code, but instead by the i386 back-end with a callback. This makes the thunk invisible to the ipa-pure-const pass, which assumes that all values just flow straight thu the thunk. See ipa-pure-const.c (analyze_function): if (fn-thunk.thunk_p || fn-alias) { /* Thunk gets propagated through, so nothing interesting happens. */ gcc_assert (ipa); return l; } But this is not true for a virtual base class: The thunk itself references the vtable, so if nothing else might need the vtable, the optimizer may remove the initalization of the vtable, which happened in this example. The most simple work around would be the attached patch, which simply falls back to emitting gimple code, if virtual base classes are used. Boot-Strapped and Regression-Tested on x86_64-linux-gnu. Ok for trunk? Thanks Bernd. 2014-06-02 Bernd Edlinger bernd.edlin...@hotmail.de PR ipa/61190 * cgraphunit.c (expand_thunk): Don't try to output the thunk for a virtual base class via targetm.asm_out.output_mi_thunk. testsuite/ChangeLog: 2014-06-02 Bernd Edlinger bernd.edlin...@hotmail.de PR ipa/61190 * g++.old-deja/g++.mike/p4736b.C: Use -O2. patch-pr61190.diff Description: Binary data
Re: [PATCH][2/n] Always 64bit-HWI cleanups
Richard Biener rguent...@suse.de writes: On Wed, 28 May 2014, Richard Sandiford wrote: Richard Biener rguent...@suse.de writes: The following changes the configury to insist on [u]int64_t being available and removes the very old __int64 case. Autoconf doesn't check for it, support came in via a big merge in Dec 2002, r60174, and it was never used on the libcpp side until I fixed that with the last patch of this series, so we couldn't have relied on it at least since libcpp was introduced. Both libcpp and vmsdbg.h now use [u]int64_t, switching HOST_WIDE_INT to literally use int64_t has to be done with the grand renaming of all users due to us using 'unsigned HOST_WIDE_INT'. Btw, I couldn't find any standard way of writing [u]int64_t literals (substitution for HOST_WIDE_INT_C) nor one for printf formats (substitutions for HOST_WIDE_INT_PRINT and friends). I'll consider doing s/HOST_WIDE_INT/[U]INT64/ there if nobody comes up with a better plan. Not sure whether you meant that to apply to both groups, but as far as the HOST_WIDE_INT_C replacement goes, how about just using int64_t (N) instead of HOST_WIDE_INT_C (N) or INT64_C (N)? int64_t (N) would already be taken, but yes, I considered [U]INT64_C (N). Not sure what you mean by taken though. Isn't it just a C++-style cast? The nice thing about [u]int64_t (N) is that it Just Works for both constant and non-constant N. We shouldn't really need a special macro that can only handle constant N. E.g.: #include stdint.h uint64_t x = uint64_t (1) 48; Thanks, Richard
Re: [PATCH][2/n] Always 64bit-HWI cleanups
On Mon, 2 Jun 2014, Richard Sandiford wrote: Richard Biener rguent...@suse.de writes: On Wed, 28 May 2014, Richard Sandiford wrote: Richard Biener rguent...@suse.de writes: The following changes the configury to insist on [u]int64_t being available and removes the very old __int64 case. Autoconf doesn't check for it, support came in via a big merge in Dec 2002, r60174, and it was never used on the libcpp side until I fixed that with the last patch of this series, so we couldn't have relied on it at least since libcpp was introduced. Both libcpp and vmsdbg.h now use [u]int64_t, switching HOST_WIDE_INT to literally use int64_t has to be done with the grand renaming of all users due to us using 'unsigned HOST_WIDE_INT'. Btw, I couldn't find any standard way of writing [u]int64_t literals (substitution for HOST_WIDE_INT_C) nor one for printf formats (substitutions for HOST_WIDE_INT_PRINT and friends). I'll consider doing s/HOST_WIDE_INT/[U]INT64/ there if nobody comes up with a better plan. Not sure whether you meant that to apply to both groups, but as far as the HOST_WIDE_INT_C replacement goes, how about just using int64_t (N) instead of HOST_WIDE_INT_C (N) or INT64_C (N)? int64_t (N) would already be taken, but yes, I considered [U]INT64_C (N). Not sure what you mean by taken though. Isn't it just a C++-style cast? The nice thing about [u]int64_t (N) is that it Just Works for both constant and non-constant N. We shouldn't really need a special macro that can only handle constant N. E.g.: #include stdint.h uint64_t x = uint64_t (1) 48 But HOST_WIDE_INT_C can also be used to write 12443875182037483LL. Not sure if large numbers are parsed as 'long long' (give that C++04 doesn't have long long and integer literals not fitting long invoke undefined behavior). For example config/alpha/alpha.c: GEN_INT (HOST_WIDE_INT_C (0x0ffffff0)), config/alpha/alpha.c: word1 = GEN_INT (HOST_WIDE_INT_C (0xa77b0010a43b0018)); ... config/rs6000/rs6000.c: const HOST_WIDE_INT lower_32bits = HOST_WIDE_INT_C(0x); config/rs6000/rs6000.c: const HOST_WIDE_INT sign_bit = HOST_WIDE_INT_C(0x8000); ... (no significant other uses). Richard.
Re: [PATCH][2/n] Always 64bit-HWI cleanups
Richard Biener rguent...@suse.de writes: On Mon, 2 Jun 2014, Richard Sandiford wrote: Richard Biener rguent...@suse.de writes: On Wed, 28 May 2014, Richard Sandiford wrote: Richard Biener rguent...@suse.de writes: The following changes the configury to insist on [u]int64_t being available and removes the very old __int64 case. Autoconf doesn't check for it, support came in via a big merge in Dec 2002, r60174, and it was never used on the libcpp side until I fixed that with the last patch of this series, so we couldn't have relied on it at least since libcpp was introduced. Both libcpp and vmsdbg.h now use [u]int64_t, switching HOST_WIDE_INT to literally use int64_t has to be done with the grand renaming of all users due to us using 'unsigned HOST_WIDE_INT'. Btw, I couldn't find any standard way of writing [u]int64_t literals (substitution for HOST_WIDE_INT_C) nor one for printf formats (substitutions for HOST_WIDE_INT_PRINT and friends). I'll consider doing s/HOST_WIDE_INT/[U]INT64/ there if nobody comes up with a better plan. Not sure whether you meant that to apply to both groups, but as far as the HOST_WIDE_INT_C replacement goes, how about just using int64_t (N) instead of HOST_WIDE_INT_C (N) or INT64_C (N)? int64_t (N) would already be taken, but yes, I considered [U]INT64_C (N). Not sure what you mean by taken though. Isn't it just a C++-style cast? The nice thing about [u]int64_t (N) is that it Just Works for both constant and non-constant N. We shouldn't really need a special macro that can only handle constant N. E.g.: #include stdint.h uint64_t x = uint64_t (1) 48 But HOST_WIDE_INT_C can also be used to write 12443875182037483LL. Not sure if large numbers are parsed as 'long long' (give that C++04 doesn't have long long and integer literals not fitting long invoke undefined behavior). For example config/alpha/alpha.c: GEN_INT (HOST_WIDE_INT_C (0x0ffffff0)), config/alpha/alpha.c: word1 = GEN_INT (HOST_WIDE_INT_C (0xa77b0010a43b0018)); Ah, yeah, of course. Hadn't realised we actually had constants like that. Sorry for the noise... ... config/rs6000/rs6000.c: const HOST_WIDE_INT lower_32bits = HOST_WIDE_INT_C(0x); config/rs6000/rs6000.c: const HOST_WIDE_INT sign_bit = HOST_WIDE_INT_C(0x8000); ... Looks like these could just use uint64_t FWIW. Thanks, Richard
Re: [PATCH] Do not build libsanitizer also for powerpc*-*-linux*
On Mon, Jun 2, 2014 at 8:56 AM, Jakub Jelinek ja...@redhat.com wrote: On Mon, Jun 02, 2014 at 10:22:11AM +0400, Yury Gribov wrote: Looks like now function does not return anything for ARM case? I'd say we should replace this pc = ... with return like all other cases, the code is just asking for trouble. But it should be return (pc ~(uptr)1) - 1; Why the -1 ? No ARM or Thumb instruction is 1 byte long. Instructions are 4 bytes long if in ARM state and could be 2 or 4 bytes if Thumb state. regards Ramana right? --- libsanitizer/sanitizer_common/sanitizer_stacktrace.cc (revision 209878) +++ libsanitizer/sanitizer_common/sanitizer_stacktrace.cc (revision 209879) @@ -18,11 +18,13 @@ namespace __sanitizer { uptr StackTrace::GetPreviousInstructionPc(uptr pc) { -#ifdef __arm__ +#if defined(__arm__) // Cancel Thumb bit. pc = pc (~1); -#endif -#if defined(__sparc__) +#elif defined(__powerpc__) || defined(__powerpc64__) + // PCs are always 4 byte aligned. + return pc - 4; +#elif defined(__sparc__) return pc - 8; #else return pc - 1; Jakub
[PATCH] Make HOST_WIDE_INT underlying type match int64_t, adjust printing macros again
This computes the underlying type of int64_t in a configure check and uses that for HOST_WIDE_INT. This makes sure we can use the PRI?64 macros from inttypes.h for HOST_WIDE_INT_PRINT_*, which the following patch re-instantiates. The patch also makes sure inttypes.h defines PRId64 and if not, provides replacements in hwint.h (like previously), matching int64_t. Built on i586-linux-gnu and x86_64-linux-gnu (to verify both 'long' and 'long long' choices). A bootstrap and regtest on x86_64-unknown-linux-gnu is pending. Ok? Thanks, Richard. 2014-06-02 Richard Biener rguent...@suse.de * configure.ac: Check whether the underlying type of int64_t is long or long long. * configure: Regenerate. * config.in: Likewise. * hwint.h (HOST_WIDE_INT): Match the underlying type of int64_t. (HOST_WIDE_INT_PRINT_*): Define in terms of PRI*64. Index: gcc/configure.ac === *** gcc/configure.ac(revision 211125) --- gcc/configure.ac(working copy) *** if test x$ac_cv_c_uint64_t = xno -o *** 316,321 --- 316,350 AC_MSG_ERROR([uint64_t or int64_t not found]) fi + # check what underlying integer type int64_t uses + AC_LANG_PUSH(C++) + AC_CACHE_CHECK(for int64_t underlying type, ac_cv_int64_t_type, [ + AC_COMPILE_IFELSE([AC_LANG_PROGRAM([[ + #ifdef HAVE_STDINT_H + #include stdint.h + #endif + template typename T struct X { }; + template + struct Xlong { typedef long t; }; + ]], [[typename Xint64_t::t x;]])],[ac_cv_int64_t_type=long],[ac_cv_int64_t_type=long long])]) + if test x$ac_cv_int64_t_type = xlong; then + AC_DEFINE(INT64_T_IS_LONG, 1, + [Define if int64_t uses long as underlying type.]) + else + AC_COMPILE_IFELSE([AC_LANG_PROGRAM([[ + #ifdef HAVE_STDINT_H + #include stdint.h + #endif + template typename T struct X { }; + template + struct Xlong long { typedef long long t; }; + ]], [[typename Xint64_t::t x;]])],[],[AC_MSG_ERROR([error verifying int64_t uses long long])]) + fi + AC_LANG_POP(C++) + + + + # - # Warnings and checking # - *** LIBS=$save_LIBS *** 1055,1067 AC_SUBST(LDEXP_LIB) # Use inttypes.h only if it exists, ! # doesn't clash with sys/types.h, and declares intmax_t. AC_MSG_CHECKING(for inttypes.h) AC_CACHE_VAL(gcc_cv_header_inttypes_h, [AC_COMPILE_IFELSE([AC_LANG_PROGRAM( ! [[#include sys/types.h #include inttypes.h]], ! [[intmax_t i = -1;]])], [gcc_cv_header_inttypes_h=yes], [gcc_cv_header_inttypes_h=no])]) AC_MSG_RESULT($gcc_cv_header_inttypes_h) --- 1084,1101 AC_SUBST(LDEXP_LIB) # Use inttypes.h only if it exists, ! # doesn't clash with sys/types.h, declares intmax_t and defines ! # PRId64 AC_MSG_CHECKING(for inttypes.h) AC_CACHE_VAL(gcc_cv_header_inttypes_h, [AC_COMPILE_IFELSE([AC_LANG_PROGRAM( ! [[#define __STDC_FORMAT_MACROS ! #include sys/types.h #include inttypes.h]], ! [[intmax_t i = -1; ! #ifndef PRId64 ! choke me ! #endif]])], [gcc_cv_header_inttypes_h=yes], [gcc_cv_header_inttypes_h=no])]) AC_MSG_RESULT($gcc_cv_header_inttypes_h) Index: gcc/hwint.h === *** gcc/hwint.h (revision 211125) --- gcc/hwint.h (working copy) *** extern char sizeof_long_long_must_be_8[s *** 46,58 #endif /* Set HOST_WIDE_INT, this should be always 64 bits. ! !With a sane ABI, 'long' is the largest efficient host integer type. !Thus, we use that unless we have to use 'long long' !because we're on a 32-bit host. */ #define HOST_BITS_PER_WIDE_INT 64 ! #if HOST_BITS_PER_LONG == 64 # define HOST_WIDE_INT long # define HOST_WIDE_INT_C(X) X ## L #else --- 46,56 #endif /* Set HOST_WIDE_INT, this should be always 64 bits. !The underlying type is matched to that of int64_t and assumed !to be either long or long long. */ #define HOST_BITS_PER_WIDE_INT 64 ! #if INT64_T_IS_LONG # define HOST_WIDE_INT long # define HOST_WIDE_INT_C(X) X ## L #else *** extern char sizeof_long_long_must_be_8[s *** 75,122 typedef before using the __asm_fprintf__ format attribute. */ typedef HOST_WIDE_INT __gcc_host_wide_int__; - /* Various printf format strings for HOST_WIDE_INT. */ - - #if HOST_BITS_PER_WIDE_INT == HOST_BITS_PER_LONG - # define HOST_WIDE_INT_PRINT HOST_LONG_FORMAT - # define HOST_WIDE_INT_PRINT_C L - /* HOST_BITS_PER_WIDE_INT is 64 bits. */ - # define HOST_WIDE_INT_PRINT_DOUBLE_HEX \ - 0x% HOST_LONG_FORMAT x%016 HOST_LONG_FORMAT x - # define HOST_WIDE_INT_PRINT_PADDED_HEX \ - %016 HOST_LONG_FORMAT x - #else - # define HOST_WIDE_INT_PRINT HOST_LONG_LONG_FORMAT - # define HOST_WIDE_INT_PRINT_C LL - /* HOST_BITS_PER_WIDE_INT is 64 bits. */ - # define HOST_WIDE_INT_PRINT_DOUBLE_HEX \ - 0x% HOST_LONG_LONG_FORMAT x%016 HOST_LONG_LONG_FORMAT x - # define
Re: [PATCH] Do not build libsanitizer also for powerpc*-*-linux*
On Mon, Jun 02, 2014 at 10:24:32AM +0100, Ramana Radhakrishnan wrote: On Mon, Jun 2, 2014 at 8:56 AM, Jakub Jelinek ja...@redhat.com wrote: On Mon, Jun 02, 2014 at 10:22:11AM +0400, Yury Gribov wrote: Looks like now function does not return anything for ARM case? I'd say we should replace this pc = ... with return like all other cases, the code is just asking for trouble. But it should be return (pc ~(uptr)1) - 1; Why the -1 ? No ARM or Thumb instruction is 1 byte long. Instructions are 4 bytes long if in ARM state and could be 2 or 4 bytes if Thumb state. Well, that is what the code did before this patch. The -1 just points to the middle of previous instruction, so that supposedly it can be looked up in debug info etc. Now, if all insns are at least 2 bytes long, perhaps you could subtract 2. Jakub
Re: [AArch64/ARM 2/3] Detect EXT patterns to vec_perm_const, use for EXT intrinsics
Yes, reproduced. Seems the mid-end doesn't elide no-op masks at -O0 after all... Fix in progress, think it's almost (tho not quite) simply a bad assertion. --Alan Christophe Lyon wrote: Hi Alan This causes g++ to ICE on pr59378 test, for aarch64 targets: http://cbuild.validation.linaro.org/build/cross-validation/gcc/211058/report-build-info.html Can you check? Thanks, Christophe. On 19 May 2014 14:53, Marcus Shawcroft marcus.shawcr...@gmail.com wrote: On 23 April 2014 21:22, Alan Lawrence alan.lawre...@arm.com wrote: 2014-03-27 Alan Lawrence alan.lawre...@arm.com * config/aarch64/aarch64-builtins.c (aarch64_types_binopv_qualifiers, TYPES_BINOPV): New static data. * config/aarch64/aarch64-simd-builtins.def (im_lane_bound): New builtin. * config/aarch64/aarch64-simd.md (aarch64_ext, aarch64_im_lane_boundsi): New patterns. * config/aarch64/aarch64.c (aarch64_expand_vec_perm_const_1): Match patterns for EXT. (aarch64_evpc_ext): New function. * config/aarch64/iterators.md (UNSPEC_EXT): New enum element. * config/aarch64/arm_neon.h (vext_f32, vext_f64, vext_p8, vext_p16, vext_s8, vext_s16, vext_s32, vext_s64, vext_u8, vext_u16, vext_u32, vext_u64, vextq_f32, vextq_f64, vextq_p8, vextq_p16, vextq_s8, vextq_s16, vextq_s32, vextq_s64, vextq_u8, vextq_u16, vextq_u32, vextq_u64): Replace __asm with __builtin_shuffle and im_lane_boundsi. OK /Marcus
Re: [PATCH] Make HOST_WIDE_INT underlying type match int64_t, adjust printing macros again
On Mon, Jun 02, 2014 at 11:24:23AM +0200, Richard Biener wrote: 2014-06-02 Richard Biener rguent...@suse.de * configure.ac: Check whether the underlying type of int64_t is long or long long. * configure: Regenerate. * config.in: Likewise. * hwint.h (HOST_WIDE_INT): Match the underlying type of int64_t. (HOST_WIDE_INT_PRINT_*): Define in terms of PRI*64. LGTM, but give others 24 hours to comment on it too. Jakub
Re: [PATCH] Do not build libsanitizer also for powerpc*-*-linux*
Why the -1 ? No ARM or Thumb instruction is 1 byte long. Instructions are 4 bytes long if in ARM state and could be 2 or 4 bytes if Thumb state. The -1 just points to the middle of previous instruction, so that supposedly it can be looked up in debug info etc. Right, that works quite well with gdb, addr2line, etc. I once tried adding proper handling of ARM/Thumb but this complicated code with no real benefit. -Y
[patch,avr] atmel avr new devices set-1
Hi, Attached patch adds support for Atmel ATA devices (ata6616c, ata6617c, ata664251, ata6612c, ata6613c and ata6614q). Please commit if the patch is OK. I do not have commit access. Regards, Pitchumani gcc/ChangeLog 2014-06-02 Vishnu K S vishnu@atmel.com Pitchumani Sivanupandi pitchuman...@atmel.com * config/avr/avr-mcus.def (ata6616c): Add new avr25 device. (ata6617c, ata664251): Add new avr35 devices. (ata6612c): Add new avr4 device. (ata6613c, ata6614q): Add new avr5 devices. * config/avr/avr-tables.opt: Regenerate. * config/avr/t-multilib: Regenerate. * doc/avr-mmcu.texi: Regenerate. ata-devices-support.patch Description: ata-devices-support.patch
Re: Eliminate write-only variables
On Sat, May 31, 2014 at 8:21 PM, Sandra Loosemore san...@codesourcery.com wrote: On 05/20/2014 04:56 AM, Richard Biener wrote: On Tue, May 20, 2014 at 6:29 AM, Jan Hubicka hubi...@ucw.cz wrote: On 05/18/2014 08:45 PM, Sandra Loosemore wrote: On 05/18/2014 02:59 PM, Jan Hubicka wrote: For cases like local-statics-7 your approach can be saved by adding simple IPA analysis to look for static vars that are used only by one function and keeping your DSE code active for them, so we can still get rid of this special case assignments during late compilation. I am however not quite convinced it is worth the effort - do you have some real world cases where it helps? Um, the well-known benchmark. ;-) I looked at the generated code for this benchmark and see that your patch is indeed not getting rid of all the pointless static variables, while ours is, so this is quite disappointing. I'm thinking now that we will still need to retain our patch at least locally, although we'd really like to get it on trunk if possible. Yours patch can really be improved by adding simple IPA analysis to work out what variables are refernced by one function only instead of using not-longer-that-relevant local static info. As last IPA pass for each static variable with no address taken, look at all references and see if they all come from one function or functions inlined to it. Here is another testcase vaguely based on the same kind of signal-processing algorithm as the benchmark, but coded without reference to it. I have arm-none-eabi builds around for both 4.9.0 with our remove_local_statics patch applied, and trunk with your patch. With -O2, our optimization produces 23 instructions and gets rid of all 3 statics, yours produces 33 and only gets rid of 1 of them. Of course it's lame to use static variables in this way, but OTOH it's lame if GCC can't recognize them and optimize them away, too. -Sandra void fir (int *coeffs, int coeff_length, int *input, int input_length, int *output) { static int *coeffp; static int *inputp; static int *outputp; Here inputp read is rmeoved only at 101.dceloop1 pass. That is really late. coeffp is removed late, too. int i, c, acc; for (i = 0; i input_length; i++) { coeffp = coeffs; inputp = input + i; outputp = output + i; acc = 0; for (c = 0; c coeff_length; c++) { acc += *coeffp * *inputp; coeffp++; inputp--; } It seems like AA can not work out the fact that inputp is unchanged until that late. Richi, any ideas? Well, AA is perfectly fine - it's just that this boils down to a partial redundancy problem. The stores can be removed by DCE even with Index: gcc/tree-ssa-dce.c === --- gcc/tree-ssa-dce.c (revision 210635) +++ gcc/tree-ssa-dce.c (working copy) @@ -278,10 +278,20 @@ mark_stmt_if_obviously_necessary (gimple break; case GIMPLE_ASSIGN: - if (TREE_CODE (gimple_assign_lhs (stmt)) == SSA_NAME - TREE_CLOBBER_P (gimple_assign_rhs1 (stmt))) - return; - break; + { + tree lhs = gimple_assign_lhs (stmt); + if (TREE_CODE (lhs) == SSA_NAME +TREE_CLOBBER_P (gimple_assign_rhs1 (stmt))) + return; + if (TREE_CODE (lhs) == VAR_DECL +!TREE_ADDRESSABLE (lhs) +TREE_STATIC (lhs) +!TREE_PUBLIC (lhs) !DECL_EXTERNAL (lhs) + /* ??? Make sure lhs is only refered to from cfun-decl. */ +decl_function_context (lhs) == cfun-decl) + return; + break; + } default: break; but I don't have a good way of checking the ??? prerequesite (even without taking its address the statics may be refered to by a) inline copies, b) ipa-split function parts, c) nested functions). I'm sure IPA references are not kept up-to-date. The last reads go away with PRE at the expense of two additional induction variables. If we know that a static variable does not have its address taken and is only refered to from a single function we can in theory simply rewrite it into SSA form during early opts (before inlining its body), for example by SRA. That isn't even restricted to function-local statics (for statics used in multiple functions but not having address-taken we can do the same with doing function entry / exit load / store). I think that would be a much better IPA enabled optimization than playing games with looking at individual stmts. FAOD, I'm not currently planning to work on this any more myself. I understand Bernd's patch pretty well and could make some minor changes to clean it up if that's what it takes to get it approved, but it sounds like what's wanted is a completely different approach, again, and I don't have any confidence that I could implement
Re: [PATCH] Fix PR ipa/61190
On Mon, Jun 2, 2014 at 11:00 AM, Bernd Edlinger bernd.edlin...@hotmail.de wrote: Hi, the test case g++.old-deja/g++.mike/p4736b.C is mis-compiled with with all optimization levels, except -O0 and -Og. This probably started with gcc 4.7.x. The constructor Main::Main() is first inlined and then completely optimized away in the dce1 pass. But the thunk is still using the vtable, and therefore crashes. Well, the problem seems to be, that the thunk code is not emitted in the normal way using gimple code, but instead by the i386 back-end with a callback. This makes the thunk invisible to the ipa-pure-const pass, which assumes that all values just flow straight thu the thunk. See ipa-pure-const.c (analyze_function): if (fn-thunk.thunk_p || fn-alias) { /* Thunk gets propagated through, so nothing interesting happens. */ gcc_assert (ipa); return l; } But this is not true for a virtual base class: The thunk itself references the vtable, so if nothing else might need the vtable, the optimizer may remove the initalization of the vtable, which happened in this example. The most simple work around would be the attached patch, which simply falls back to emitting gimple code, if virtual base classes are used. Boot-Strapped and Regression-Tested on x86_64-linux-gnu. Ok for trunk? Honza should know better. I'd rather skip the above for thunks going the output_mi_thunk way. That is, the target hook docs should be updated to reflect which kind of thunks it is supposed to handle. Richard. Thanks Bernd.
RE: [PATCH] Fix PR ipa/61190
Hi, On Mon, 2 Jun 2014 12:06:12, Richard Biener wrote: On Mon, Jun 2, 2014 at 11:00 AM, Bernd Edlinger bernd.edlin...@hotmail.de wrote: Hi, the test case g++.old-deja/g++.mike/p4736b.C is mis-compiled with with all optimization levels, except -O0 and -Og. This probably started with gcc 4.7.x. The constructor Main::Main() is first inlined and then completely optimized away in the dce1 pass. But the thunk is still using the vtable, and therefore crashes. Well, the problem seems to be, that the thunk code is not emitted in the normal way using gimple code, but instead by the i386 back-end with a callback. This makes the thunk invisible to the ipa-pure-const pass, which assumes that all values just flow straight thu the thunk. See ipa-pure-const.c (analyze_function): if (fn-thunk.thunk_p || fn-alias) { /* Thunk gets propagated through, so nothing interesting happens. */ gcc_assert (ipa); return l; } But this is not true for a virtual base class: The thunk itself references the vtable, so if nothing else might need the vtable, the optimizer may remove the initalization of the vtable, which happened in this example. The most simple work around would be the attached patch, which simply falls back to emitting gimple code, if virtual base classes are used. Boot-Strapped and Regression-Tested on x86_64-linux-gnu. Ok for trunk? Honza should know better. I'd rather skip the above for thunks going the output_mi_thunk way. Ahh Yes, that was of course my first attempt... But there is not BB to enumerate in that case. And the loop below just crashed in: FOR_EACH_BB_FN (this_block, cfun) There is also another interesting thing to mention: when the output_mi_thunk outputs the virtual thunk, there is no inlining at all. But with this patch, the thunk inlines the called function, and therefore the generated code looks rather nifty, compared to the state before the patch. Thanks Bernd. That is, the target hook docs should be updated to reflect which kind of thunks it is supposed to handle. Richard. Thanks Bernd.
[PATCH, i386, Pointer Bounds Checker 17/x] Pointer bounds constants support
Hi, This patch adds support for pointer bounds constants to be used as DECL_INITIAL for constant bounds (like zero bounds). Bootstrapped and tested on linux-x86_64. Thanks, Ilya -- gcc/ 2014-05-30 Ilya Enkovich ilya.enkov...@intel.com * emit-rtl.c (immed_double_const): Support MODE_POINTER_BOUNDS. (init_emit_once): Build pointer bounds zero constants. * explow.c (trunc_int_for_mode): Likewise. * varpool.c (ctor_for_folding): Do not fold constant bounds vars. * varasm.c (output_constant_pool_2): Support MODE_POINTER_BOUNDS. * config/i386/i386.c (ix86_legitimate_constant_p): Mark bounds constant as not valid. diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index 6a99b6b..888f5ad 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -12443,8 +12443,12 @@ darwin_local_data_pic (rtx disp) satisfies CONSTANT_P. */ static bool -ix86_legitimate_constant_p (enum machine_mode mode ATTRIBUTE_UNUSED, rtx x) +ix86_legitimate_constant_p (enum machine_mode mode, rtx x) { + /* Pointer bounds constants are not valid. */ + if (POINTER_BOUNDS_MODE_P (mode)) +return false; + switch (GET_CODE (x)) { case CONST: diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c index 4736f8d..f9484d5 100644 --- a/gcc/emit-rtl.c +++ b/gcc/emit-rtl.c @@ -505,7 +505,8 @@ immed_double_const (HOST_WIDE_INT i0, HOST_WIDE_INT i1, enum machine_mode mode) || GET_MODE_CLASS (mode) == MODE_PARTIAL_INT /* We can get a 0 for an error mark. */ || GET_MODE_CLASS (mode) == MODE_VECTOR_INT - || GET_MODE_CLASS (mode) == MODE_VECTOR_FLOAT); + || GET_MODE_CLASS (mode) == MODE_VECTOR_FLOAT + || GET_MODE_CLASS (mode) == MODE_POINTER_BOUNDS); if (GET_MODE_BITSIZE (mode) = HOST_BITS_PER_WIDE_INT) return gen_int_mode (i0, mode); @@ -5875,6 +5876,11 @@ init_emit_once (void) if (STORE_FLAG_VALUE == 1) const_tiny_rtx[1][(int) BImode] = const1_rtx; + for (mode = GET_CLASS_NARROWEST_MODE (MODE_POINTER_BOUNDS); + mode != VOIDmode; + mode = GET_MODE_WIDER_MODE (mode)) +const_tiny_rtx[0][mode] = immed_double_const (0, 0, mode); + pc_rtx = gen_rtx_fmt_ (PC, VOIDmode); ret_rtx = gen_rtx_fmt_ (RETURN, VOIDmode); simple_return_rtx = gen_rtx_fmt_ (SIMPLE_RETURN, VOIDmode); diff --git a/gcc/explow.c b/gcc/explow.c index 48e91a6..9661abb 100644 --- a/gcc/explow.c +++ b/gcc/explow.c @@ -53,7 +53,8 @@ trunc_int_for_mode (HOST_WIDE_INT c, enum machine_mode mode) int width = GET_MODE_PRECISION (mode); /* You want to truncate to a _what_? */ - gcc_assert (SCALAR_INT_MODE_P (mode)); + gcc_assert (SCALAR_INT_MODE_P (mode) + || POINTER_BOUNDS_MODE_P (mode)); /* Canonicalize BImode to 0 and STORE_FLAG_VALUE. */ if (mode == BImode) diff --git a/gcc/varasm.c b/gcc/varasm.c index af7fb4a..fcae2fa 100644 --- a/gcc/varasm.c +++ b/gcc/varasm.c @@ -3799,6 +3799,7 @@ output_constant_pool_2 (enum machine_mode mode, rtx x, unsigned int align) case MODE_UFRACT: case MODE_ACCUM: case MODE_UACCUM: +case MODE_POINTER_BOUNDS: assemble_integer (x, GET_MODE_SIZE (mode), align, 1); break; diff --git a/gcc/varpool.c b/gcc/varpool.c index 0eeb2b6..321165d 100644 --- a/gcc/varpool.c +++ b/gcc/varpool.c @@ -257,6 +257,12 @@ ctor_for_folding (tree decl) TREE_CODE (decl) != CONST_DECL) return error_mark_node; + /* Static constant bounds are created to be + used instead of constants and therefore + do not let folding it. */ + if (POINTER_BOUNDS_P (decl)) +return error_mark_node; + if (TREE_CODE (decl) == CONST_DECL || DECL_IN_CONSTANT_POOL (decl)) return DECL_INITIAL (decl);
Re: [PATCH, i386, Pointer Bounds Checker 11/x] Keep bounds initial values
On 30 May 10:45, Jeff Law wrote: On 05/29/14 04:53, Ilya Enkovich wrote: Hi, This patch tries to keep bounds initial values when it may be needed. Even if initial value is not fully known (e.g. we know only low bound) it still may help to remove some redundant checks. Bootstrapped and tested on linux-x86_64. Thanks, Ilya -- gcc/ 2013-05-29 Ilya Enkovich ilya.enkov...@intel.com * ipa.c (symtab_remove_unreachable_nodes): Kepp initial values for pointer bounds to be used for checks eliminations. * lto-cgraph.c (compute_ltrans_boundary): Likewise. * add_references_to_partition (add_references_to_partition): Add references to pointer bounds vars. Typo in the ChangeLog kepp - keep. OK for the trunk when the rest of the stuff is approved, though I'm curious how, for example, keeping the initial value for an unreachable node in symtab_remove_unreachable_nodes helps eliminate redundant checks. Consider we have some var 'arr' and static bounds '__chkp_bounds_of_arr' holding its bounds. Upper bound may be unknown at compile time and therefore '__chkp_bounds_of_arr' is not a candidate for constant folding. But we always know low bound which is 'arr'. It means we may remove checks of low bound when value of '__chkp_bounds_of_arr' is used in checks. That is why I want to keep initials for all static bounds vars. Here is fixed ChangeLog. Thanks, Ilya -- gcc/ 2014-06-02 Ilya Enkovich ilya.enkov...@intel.com * ipa.c (symtab_remove_unreachable_nodes): Keep initial values for pointer bounds to be used for checks eliminations. * lto-cgraph.c (compute_ltrans_boundary): Likewise. * add_references_to_partition (add_references_to_partition): Add references to pointer bounds vars. Jeff
Re: [RFC] optimize x - y cmp 0 with undefined overflow
On Fri, May 30, 2014 at 10:48 AM, Eric Botcazou ebotca...@adacore.com wrote: I'd say we still handle basic symbolic range ops in extract_range_from_binary_1 but in extract_range_from_binary_expr for a symbolic op0 we try to simplify it with both [op1, op1] and with the value-range of op1 until we get a non-varying range as result. Not sure if it's worth restricting that to the case where op0s value-range refers to op1 or vice versa, and eventually only use op1 symbolically then. Patch along these lines attached. A bit heavy as expected, but it's VRP... It deals with my pet problem, you might want to check it does so with yours. Tested on x86_64-suse-linux with no regressions. Looks mostly ok. Any reason why you are not re-creating MINUS_EXPR in build_symbolic_expr? That is, build inv - t (for non-pointers, of course)? Otherwise if a range becomes -t + inv that will no longer match get_single_symbol for further propagation? Then I'm not sure if + /* Try with VR0 and [-INF, OP1]. */ + set_value_range (new_vr1, VR_RANGE, vrp_val_min (expr_type), op1, NULL); + extract_range_from_binary_expr_1 (vr, code, expr_type, vr0, new_vr1); + if (vr-type != VR_VARYING) + return; + + /* Try with VR0 and [OP1, +INF]. */ + set_value_range (new_vr1, VR_RANGE, op1, vrp_val_max (expr_type), NULL); + extract_range_from_binary_expr_1 (vr, code, expr_type, vr0, new_vr1); + if (vr-type != VR_VARYING) + return; is a safe thing to do. If it does make a difference to try [-INF, OP1], [OP1, +INF] instead of just [OP1, OP1] then at least it's very suspicious ;) (or an easy missed optimization). So - can you fix the negate thing and drop the four cases trying the +-INF based ranges? Thanks, Richard. 2014-05-30 Eric Botcazou ebotca...@adacore.com * tree-vrp.c (get_single_symbol): New function. (build_symbolic_expr): Likewise. (symbolic_range_based_on_p): New predicate. (extract_range_from_binary_expr_1): Deal with single-symbolic ranges for PLUS and MINUS. Do not drop symbolic ranges at the end. (extract_range_from_binary_expr): Try harder for PLUS and MINUS if operand is symbolic and based on the other operand. 2014-05-30 Eric Botcazou ebotca...@adacore.com * gcc.dg/tree-ssa/vrp93.c: New test. * gnat.dg/opt38.adb: Likewise. -- Eric Botcazou
Re: [RFC] optimize x - y cmp 0 with undefined overflow
On Mon, Jun 2, 2014 at 12:36 PM, Richard Biener richard.guent...@gmail.com wrote: On Fri, May 30, 2014 at 10:48 AM, Eric Botcazou ebotca...@adacore.com wrote: I'd say we still handle basic symbolic range ops in extract_range_from_binary_1 but in extract_range_from_binary_expr for a symbolic op0 we try to simplify it with both [op1, op1] and with the value-range of op1 until we get a non-varying range as result. Not sure if it's worth restricting that to the case where op0s value-range refers to op1 or vice versa, and eventually only use op1 symbolically then. Patch along these lines attached. A bit heavy as expected, but it's VRP... It deals with my pet problem, you might want to check it does so with yours. Tested on x86_64-suse-linux with no regressions. Looks mostly ok. Any reason why you are not re-creating MINUS_EXPR in build_symbolic_expr? That is, build inv - t (for non-pointers, of course)? Otherwise if a range becomes -t + inv that will no longer match get_single_symbol for further propagation? Then I'm not sure if + /* Try with VR0 and [-INF, OP1]. */ + set_value_range (new_vr1, VR_RANGE, vrp_val_min (expr_type), op1, NULL); + extract_range_from_binary_expr_1 (vr, code, expr_type, vr0, new_vr1); + if (vr-type != VR_VARYING) + return; + + /* Try with VR0 and [OP1, +INF]. */ + set_value_range (new_vr1, VR_RANGE, op1, vrp_val_max (expr_type), NULL); + extract_range_from_binary_expr_1 (vr, code, expr_type, vr0, new_vr1); + if (vr-type != VR_VARYING) + return; is a safe thing to do. If it does make a difference to try [-INF, OP1], [OP1, +INF] instead of just [OP1, OP1] then at least it's very suspicious ;) (or an easy missed optimization). So - can you fix the negate thing and drop the four cases trying the +-INF based ranges? Btw, the testcases are missing in the patch so I can't have a look myself. Richard. Thanks, Richard. 2014-05-30 Eric Botcazou ebotca...@adacore.com * tree-vrp.c (get_single_symbol): New function. (build_symbolic_expr): Likewise. (symbolic_range_based_on_p): New predicate. (extract_range_from_binary_expr_1): Deal with single-symbolic ranges for PLUS and MINUS. Do not drop symbolic ranges at the end. (extract_range_from_binary_expr): Try harder for PLUS and MINUS if operand is symbolic and based on the other operand. 2014-05-30 Eric Botcazou ebotca...@adacore.com * gcc.dg/tree-ssa/vrp93.c: New test. * gnat.dg/opt38.adb: Likewise. -- Eric Botcazou
Re: [PATCH, i386, Pointer Bounds Checker 12/x] Recognize instrumented special functions
On 30 May 10:46, Jeff Law wrote: On 05/29/14 05:00, Ilya Enkovich wrote: Hi, This patch allows to recognize instrumented call to special function by using the original function name for recognition. Bootstrapped and tested on linux-x86_64. Thanks, Ilya -- gcc/ 2014-05-29 Ilya Enkovich ilya.enkov...@intel.com * calls.c (special_function_p): Use original decl name when analyzing instrumentation clone. OK for the trunk when the rest of the patches are approved. Presumably we have to use the original decl because we twiddle the name in the clones, right? Just want to make sure I understand why we're dong this :-) That's right. We add .chkp prefix for all instrumented functions to keep assembler name unique during compilation. Otherwise it becomes a mess with LTO. Ilya jeff
Re: [PATCH] Make HOST_WIDE_INT underlying type match int64_t, adjust printing macros again
On Mon, 2 Jun 2014, Jakub Jelinek wrote: On Mon, Jun 02, 2014 at 11:24:23AM +0200, Richard Biener wrote: 2014-06-02 Richard Biener rguent...@suse.de * configure.ac: Check whether the underlying type of int64_t is long or long long. * configure: Regenerate. * config.in: Likewise. * hwint.h (HOST_WIDE_INT): Match the underlying type of int64_t. (HOST_WIDE_INT_PRINT_*): Define in terms of PRI*64. LGTM, but give others 24 hours to comment on it too. Testing with older host compiler revealed two issues - typename use outside of template and missing quoting around the cache variable check. Thus I am re-testing the following. Richard. 2014-06-02 Richard Biener rguent...@suse.de * configure.ac: Check whether the underlying type of int64_t is long or long long. * configure: Regenerate. * config.in: Likewise. * hwint.h (HOST_WIDE_INT): Match the underlying type of int64_t. (HOST_WIDE_INT_PRINT_*): Define in terms of PRI*64. Index: gcc/configure.ac === *** gcc/configure.ac(revision 211125) --- gcc/configure.ac(working copy) *** if test x$ac_cv_c_uint64_t = xno -o *** 316,321 --- 316,350 AC_MSG_ERROR([uint64_t or int64_t not found]) fi + # check what underlying integer type int64_t uses + AC_LANG_PUSH(C++) + AC_CACHE_CHECK(for int64_t underlying type, ac_cv_int64_t_type, [ + AC_COMPILE_IFELSE([AC_LANG_PROGRAM([[ + #ifdef HAVE_STDINT_H + #include stdint.h + #endif + template typename T struct X { }; + template + struct Xlong { typedef long t; }; + ]], [[Xint64_t::t x;]])],[ac_cv_int64_t_type=long],[ac_cv_int64_t_type=long long])]) + if test $ac_cv_int64_t_type = long; then + AC_DEFINE(INT64_T_IS_LONG, 1, + [Define if int64_t uses long as underlying type.]) + else + AC_COMPILE_IFELSE([AC_LANG_PROGRAM([[ + #ifdef HAVE_STDINT_H + #include stdint.h + #endif + template typename T struct X { }; + template + struct Xlong long { typedef long long t; }; + ]], [[Xint64_t::t x;]])],[],[AC_MSG_ERROR([error verifying int64_t uses long long])]) + fi + AC_LANG_POP(C++) + + + + # - # Warnings and checking # - *** LIBS=$save_LIBS *** 1055,1067 AC_SUBST(LDEXP_LIB) # Use inttypes.h only if it exists, ! # doesn't clash with sys/types.h, and declares intmax_t. AC_MSG_CHECKING(for inttypes.h) AC_CACHE_VAL(gcc_cv_header_inttypes_h, [AC_COMPILE_IFELSE([AC_LANG_PROGRAM( ! [[#include sys/types.h #include inttypes.h]], ! [[intmax_t i = -1;]])], [gcc_cv_header_inttypes_h=yes], [gcc_cv_header_inttypes_h=no])]) AC_MSG_RESULT($gcc_cv_header_inttypes_h) --- 1084,1101 AC_SUBST(LDEXP_LIB) # Use inttypes.h only if it exists, ! # doesn't clash with sys/types.h, declares intmax_t and defines ! # PRId64 AC_MSG_CHECKING(for inttypes.h) AC_CACHE_VAL(gcc_cv_header_inttypes_h, [AC_COMPILE_IFELSE([AC_LANG_PROGRAM( ! [[#define __STDC_FORMAT_MACROS ! #include sys/types.h #include inttypes.h]], ! [[intmax_t i = -1; ! #ifndef PRId64 ! choke me ! #endif]])], [gcc_cv_header_inttypes_h=yes], [gcc_cv_header_inttypes_h=no])]) AC_MSG_RESULT($gcc_cv_header_inttypes_h) Index: gcc/hwint.h === *** gcc/hwint.h (revision 211125) --- gcc/hwint.h (working copy) *** extern char sizeof_long_long_must_be_8[s *** 46,58 #endif /* Set HOST_WIDE_INT, this should be always 64 bits. ! !With a sane ABI, 'long' is the largest efficient host integer type. !Thus, we use that unless we have to use 'long long' !because we're on a 32-bit host. */ #define HOST_BITS_PER_WIDE_INT 64 ! #if HOST_BITS_PER_LONG == 64 # define HOST_WIDE_INT long # define HOST_WIDE_INT_C(X) X ## L #else --- 46,56 #endif /* Set HOST_WIDE_INT, this should be always 64 bits. !The underlying type is matched to that of int64_t and assumed !to be either long or long long. */ #define HOST_BITS_PER_WIDE_INT 64 ! #if INT64_T_IS_LONG # define HOST_WIDE_INT long # define HOST_WIDE_INT_C(X) X ## L #else *** extern char sizeof_long_long_must_be_8[s *** 75,122 typedef before using the __asm_fprintf__ format attribute. */ typedef HOST_WIDE_INT __gcc_host_wide_int__; - /* Various printf format strings for HOST_WIDE_INT. */ - - #if HOST_BITS_PER_WIDE_INT == HOST_BITS_PER_LONG - # define HOST_WIDE_INT_PRINT HOST_LONG_FORMAT - # define HOST_WIDE_INT_PRINT_C L - /* HOST_BITS_PER_WIDE_INT is 64 bits. */ - # define HOST_WIDE_INT_PRINT_DOUBLE_HEX \ - 0x% HOST_LONG_FORMAT x%016 HOST_LONG_FORMAT x - # define HOST_WIDE_INT_PRINT_PADDED_HEX \ - %016 HOST_LONG_FORMAT x - #else - # define HOST_WIDE_INT_PRINT HOST_LONG_LONG_FORMAT - # define HOST_WIDE_INT_PRINT_C LL - /* HOST_BITS_PER_WIDE_INT is 64 bits. */ - #
Re: [PATCH/AARCH64] Fix PR 61345: rtx_cost ICEing on simple code
On 28 May 2014 23:58, Andrew Pinski andrew.pin...@caviumnetworks.com wrote: Hi, The problem here is aarch64_rtx_costs for IF_THEN_ELSE does not handle the case where the first operand is a non comparison. This happens when the combine is combing a few RTLs and calling set_src_cost to check the costs of the newly created rtl. At the same I noticed the code for rtx_costs was getting more complex so I factored out the code for IF_THEN_ELSE OK? Built and tested on aarch64-elf with no regressions. Thanks, Andrew Pinski ChangeLog: * config/aarch64/aarch64.c (aarch64_if_then_else_costs): New function. (aarch64_rtx_costs): Use aarch64_if_then_else_costs. testsuite/ChangeLog: * gcc.c-torture/compile/20140528-1.c: New testcase. Hi, Can you split this into two patches please, the first a straight refactor, no functional change, the second with the fix and testcase ? + enum rtx_code cmpcode; + if (COMPARISON_P (op0)) Blank line between declarations and statement please. + return false; + +} Drop the superflous blank line after the return please. Thanks /Marcus
Re: [PATCH, Pointer Bounds Checker 13/x] Early versioning
On 30 May 10:59, Jeff Law wrote: On 05/29/14 05:05, Ilya Enkovich wrote: Hi, This patch allows to perform function versioning when some structures are not available yet. It is required to make clones for Pointer Bounds Checker right after SSA build. Bootstrapped and tested on linux-x86_64. Thanks, Ilya -- gcc/ 2014-05-29 Ilya Enkovich ilya.enkov...@intel.com * tree-inline.c (copy_cfg_body): Check loop tree existence before accessing it. (tree_function_versioning): Check DF info existence before accessing it. diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c index 4293241..23fef90 100644 --- a/gcc/tree-inline.c +++ b/gcc/tree-inline.c @@ -2544,7 +2544,8 @@ copy_cfg_body (copy_body_data * id, gcov_type count, int frequency_scale, /* If the loop tree in the source function needed fixup, mark the destination loop tree for fixup, too. */ - if (loops_for_fn (src_cfun)-state LOOPS_NEED_FIXUP) + if (loops_for_fn (src_cfun) + loops_for_fn (src_cfun)-state LOOPS_NEED_FIXUP) loops_state_set (LOOPS_NEED_FIXUP); Hmm, so if I understand things correctly, src_fun has no loop structures attached, thus there's nothing to copy. Presumably at some later point we build loop structures for the copy from scratch? I suppose it is just a simple bug with absent NULL pointer check. Here is original code: /* Duplicate the loop tree, if available and wanted. */ if (loops_for_fn (src_cfun) != NULL current_loops != NULL) { copy_loops (id, entry_block_map-loop_father, get_loop (src_cfun, 0)); /* Defer to cfgcleanup to update loop-father fields of basic-blocks. */ loops_state_set (LOOPS_NEED_FIXUP); } /* If the loop tree in the source function needed fixup, mark the destination loop tree for fixup, too. */ if (loops_for_fn (src_cfun)-state LOOPS_NEED_FIXUP) loops_state_set (LOOPS_NEED_FIXUP); As you may see we have check for absent loops structure in the first if-statement and no check in the second one. I hit segfault and added the check. Similarly for the PTA info, we just build it from scratch in the copy at some point? Here we also have conditional access like /* Reset the escaped solution. */ if (cfun-gimple_df) pt_solution_reset (cfun-gimple_df-escaped); and following unconditional I've fixed. Assuming the answers to both are yes, then this patch is OK for the trunk when the rest of the patches are approved. It's not great, bit it's OK. Thanks! Ilya jeff
Re: [PATCH] Do not build libsanitizer also for powerpc*-*-linux*
Committed as r210012, please don't forget to add llvm-commits to CC next time (I did not see the message) Thanks! On Mon, Jun 2, 2014 at 9:57 AM, Yury Gribov y.gri...@samsung.com wrote: There was some discussion on what to do, but has there been a decision on how to fix this yet? Or is it fixed upstream and we just need to merge that fix too? Yuri needs to send the patch to llvm-commits. I think I already did that: D3911 -Y
Re: [PATCH AArch64 1/2] Correct signedness of builtins, remove casts from arm_neon.h
On 29 May 2014 13:44, Alan Lawrence alan.lawre...@arm.com wrote: This adds three new sets of qualifiers to aarch64-builtins.c, and uses the already-present-but-unused USHIFTIMM. gcc/ChangeLog: * gcc/config/aarch64/aarch64-builtins.c (aarch64_types_binop_uus_qualifiers, aarch64_types_shift_to_unsigned_qualifiers, aarch64_types_unsigned_shiftacc_qualifiers): Define. * gcc/config/aarch64/aarch64-simd-builtins.def (uqshl, uqrshl, uqadd, uqsub, usqadd, usra_n, ursra_n, uqshrn_n, uqrshrn_n, usri_n, usli_n, sqshlu_n, uqshl_n): Update qualifiers. * gcc/config/aarch64/arm_neon.h (vqadd_u8, vqadd_u16, vqadd_u32, vqadd_u64, vqaddq_u8, vqaddq_u16, vqaddq_u32, vqaddq_u64, vqsub_u8, vqsub_u16, vqsub_u32, vqsub_u64, vqsubq_u8, vqsubq_u16, vqsubq_u32, vqsubq_u64, vqaddb_u8, vqaddh_u16, vqadds_u32, vqaddd_u64, vqrshl_u8, vqrshl_u16, vqrshl_u32, vqrshl_u64, vqrshlq_u8, vqrshlq_u16, vqrshlq_u32, vqrshlq_u64, vqrshlb_u8, vqrshlh_u16, vqrshls_u32, vqrshld_u64, vqrshrn_n_u16, vqrshrn_n_u32, vqrshrn_n_u64, vqrshrnh_n_u16, vqrshrns_n_u32, vqrshrnd_n_u64, vqshl_u8, vqshl_u16, vqshl_u32, vqshl_u64, vqshlq_u8, vqshlq_u16, vqshlq_u32, vqshlq_u64, vqshlb_u8, vqshlh_u16, vqshls_u32, vqshld_u64, vqshl_n_u8, vqshl_n_u16, vqshl_n_u32, vqshl_n_u64, vqshlq_n_u8, vqshlq_n_u16, vqshlq_n_u32, vqshlq_n_u64, vqshlb_n_u8, vqshlh_n_u16, vqshls_n_u32, vqshld_n_u64, vqshlu_n_s8, vqshlu_n_s16, vqshlu_n_s32, vqshlu_n_s64, vqshluq_n_s8, vqshluq_n_s16, vqshluq_n_s32, vqshluq_n_s64, vqshlub_n_s8, vqshluh_n_s16, vqshlus_n_s32, vqshlud_n_s64, vqshrn_n_u16, vqshrn_n_u32, vqshrn_n_u64, vqshrnh_n_u16, vqshrns_n_u32, vqshrnd_n_u64, vqsubb_u8, vqsubh_u16, vqsubs_u32, vqsubd_u64, vrsra_n_u8, vrsra_n_u16, vrsra_n_u32, vrsra_n_u64, vrsraq_n_u8, vrsraq_n_u16, vrsraq_n_u32, vrsraq_n_u64, vrsrad_n_u64, vsli_n_u8, vsli_n_u16, vsli_n_u32,vsli_n_u64, vsliq_n_u8, vsliq_n_u16, vsliq_n_u32, vsliq_n_u64, vslid_n_u64, vsqadd_u8, vsqadd_u16, vsqadd_u32, vsqadd_u64, vsqaddq_u8, vsqaddq_u16, vsqaddq_u32, vsqaddq_u64, vsqaddb_u8, vsqaddh_u16, vsqadds_u32, vsqaddd_u64, vsra_n_u8, vsra_n_u16, vsra_n_u32, vsra_n_u64, vsraq_n_u8, vsraq_n_u16, vsraq_n_u32, vsraq_n_u64, vsrad_n_u64, vsri_n_u8, vsri_n_u16, vsri_n_u32, vsri_n_u64, vsriq_n_u8, vsriq_n_u16, vsriq_n_u32, vsriq_n_u64, vsrid_n_u64): Remove casts. vqrshl_u8 (uint8x8_t __a, int8x8_t __b) { - return (uint8x8_t) __builtin_aarch64_uqrshlv8qi ((int8x8_t) __a, __b); + return __builtin_aarch64_uqrshlv8qi_uus ( __a, __b); } Some of these patterns have gained an extra space following the return, OK to commit with the additional space dropped. /Marcus
Re: [PATCH AArch64 2/2] Correct signedness of builtins, remove casts from arm_neon.h
On 29 May 2014 13:47, Alan Lawrence alan.lawre...@arm.com wrote: This adds another set of qualifiers to aarch64-builtins.c, and removes more casts from arm_neon.h, for the suqadd, ushl, urshl, urshr_n, ushll_n, and sshl intrinsics. gcc/ChangeLog: * gcc/config/aarch64/aarch64-builtins.c (aarch64_types_binop_ssu_qualifiers): New static data. (TYPES_BINOP_SSU): Define. * gcc/config/aarch64/aarch64-simd-builtins.def (suqadd, ushl, urshl, urshr_n, ushll_n): Use appropriate unsigned qualifiers. * gcc/config/aarch64/arm_neon.h (vrshl_u8, vrshl_u16, vrshl_u32, vrshl_u64, vrshlq_u8, vrshlq_u16, vrshlq_u32, vrshlq_u64, vrshld_u64, vrshr_n_u8, vrshr_n_u16, vrshr_n_u32, vrshr_n_u64, vrshrq_n_u8, vrshrq_n_u16, vrshrq_n_u32, vrshrq_n_u64, vrshrd_n_u64, vshll_n_u8, vshll_n_u16, vshll_n_u32, vuqadd_s8, vuqadd_s16, vuqadd_s32, vuqadd_s64, vuqaddq_s8, vuqaddq_s16, vuqaddq_s32, vuqaddq_s64, vuqaddb_s8, vuqaddh_s16, vuqadds_s32, vuqaddd_s64): Add signedness suffix to builtin function name, remove cast. (vshl_s8, vshl_s16, vshl_s32, vshl_s64, vshl_u8, vshl_u16, vshl_u32, vshl_u64, vshlq_s8, vshlq_s16, vshlq_s32, vshlq_s64, vshlq_u8, vshlq_u16, vshlq_u32, vshlq_u64, vshld_s64, vshld_u64): Remove cast. Looks OK to me. /Marcus
RE: [PATCH][MIPS] Implement O32 FPXX ABI (GCC)
Richard Sandiford rdsandif...@googlemail.com writes: Matthew Fortune matthew.fort...@imgtec.com writes: I've realised that I may need to do 'something' to prevent GCC from loading or storing DFmode/DImode values to/from FPRs using pairs of SWC1/LWC1 when using an unaligned address. Initial tests show that when loading from an unaligned address (4-byte aligned) then GCC loads the two halves of a 64-bit value into GPRs and then moves across to FPRs. This is good but I don't know if it is guaranteed. From what I can tell the backend doesn't specifically deal with loading unaligned data but rather the normal load/store patterns are used by the middle end. As such I'm not sure there is anything to prevent direct loads to FPRs by parts. Do you know one way or the other if unaligned doubles can currently be loaded via pairs of lwc1 (same for store) and if so can you advise on an approach I could try to prevent this for FPXX? I will try to figure this out on my own in the meantime. The port does handle some widths of unaligned access via the {insv,extv,extzv}misalign patterns. That's how an unaligned DImode value will be handled on 64-bit targets. The MIPS *misalign patterns only handle integer modes, so for other types of mode the target-independent code will fall back to using an integer load followed by a subreg (or a subreg followed by an integer store). IIRC that's how an unaligned DFmode access will be handled on 64-bit targets. For modes that are larger or smaller than *misalign can handle, the target-independent code has to split the access up into smaller pieces and reassemble them. And these pieces have to have integer modes. E.g. on 32-bit targets a 4-byte-misaligned load into (reg:DF x) could be done by loading (subreg:SI (reg:DF x) 0) and (subreg:SI (reg:DF x) 4). The thing that prevents these individual loads from using LWC1 is CANNOT_CHANGE_MODE_CLASS, which (among other things) makes it invalid for any target-independent code to reduce a subreg of an FPR pair to an individual FPR. [FWIW, the reason MIPS doesn't define {insv,extv,extzv}misalign for things like DImode on 32-bit targets is because there's no special architecture feature than can be used. It's just a case of decomposing the access. Since that's a general technique, we want to make the target-independent code do it as well as possible rather than handle it in a port-specific way.] So yeah, the combination of (a) STRICT_ALIGNMENT, (b) the lack of unaligned floating-point load/store patterns and (c) CANNOT_CHANGE_MODE_CLASS should guarantee what you want. Thanks for all the details. I did not know if CANNOT_CHANGE_MODE_CLASS would give this guarantee as I am conscious of MIPS 1 having to do pairs of LWC1/SWC1 for doubles. However, if I understand the code correctly the LWC1/SWC1 pairs are generated by splits for MIPS 1 and not directly from target-independent code so CANNOT_CHANGE_MODE_CLASS does not impact that explicit splitting logic. Is that right? Regards, Matthew
Re: [PATCH][MIPS] Implement O32 FPXX ABI (GCC)
Matthew Fortune matthew.fort...@imgtec.com writes: Richard Sandiford rdsandif...@googlemail.com writes: Matthew Fortune matthew.fort...@imgtec.com writes: I've realised that I may need to do 'something' to prevent GCC from loading or storing DFmode/DImode values to/from FPRs using pairs of SWC1/LWC1 when using an unaligned address. Initial tests show that when loading from an unaligned address (4-byte aligned) then GCC loads the two halves of a 64-bit value into GPRs and then moves across to FPRs. This is good but I don't know if it is guaranteed. From what I can tell the backend doesn't specifically deal with loading unaligned data but rather the normal load/store patterns are used by the middle end. As such I'm not sure there is anything to prevent direct loads to FPRs by parts. Do you know one way or the other if unaligned doubles can currently be loaded via pairs of lwc1 (same for store) and if so can you advise on an approach I could try to prevent this for FPXX? I will try to figure this out on my own in the meantime. The port does handle some widths of unaligned access via the {insv,extv,extzv}misalign patterns. That's how an unaligned DImode value will be handled on 64-bit targets. The MIPS *misalign patterns only handle integer modes, so for other types of mode the target-independent code will fall back to using an integer load followed by a subreg (or a subreg followed by an integer store). IIRC that's how an unaligned DFmode access will be handled on 64-bit targets. For modes that are larger or smaller than *misalign can handle, the target-independent code has to split the access up into smaller pieces and reassemble them. And these pieces have to have integer modes. E.g. on 32-bit targets a 4-byte-misaligned load into (reg:DF x) could be done by loading (subreg:SI (reg:DF x) 0) and (subreg:SI (reg:DF x) 4). The thing that prevents these individual loads from using LWC1 is CANNOT_CHANGE_MODE_CLASS, which (among other things) makes it invalid for any target-independent code to reduce a subreg of an FPR pair to an individual FPR. [FWIW, the reason MIPS doesn't define {insv,extv,extzv}misalign for things like DImode on 32-bit targets is because there's no special architecture feature than can be used. It's just a case of decomposing the access. Since that's a general technique, we want to make the target-independent code do it as well as possible rather than handle it in a port-specific way.] So yeah, the combination of (a) STRICT_ALIGNMENT, (b) the lack of unaligned floating-point load/store patterns and (c) CANNOT_CHANGE_MODE_CLASS should guarantee what you want. Thanks for all the details. I did not know if CANNOT_CHANGE_MODE_CLASS would give this guarantee as I am conscious of MIPS 1 having to do pairs of LWC1/SWC1 for doubles. However, if I understand the code correctly the LWC1/SWC1 pairs are generated by splits for MIPS 1 and not directly from target-independent code so CANNOT_CHANGE_MODE_CLASS does not impact that explicit splitting logic. Is that right? Yeah, that's right. All splits of floating-point values are done in the MIPS code and the MIPS code is free to ignore CANNOT_CHANGE_MODE_CLASS. One of the many reasons for defining CANNOT_CHANGE_MODE_CLASS the way it is now is that the FPRs are always little-endian. If the target-independent code ever did try to access one half of a pair, it would access the wrong one on big-endian targets. So CCCM is by no means just a technicality at the moment. Other targets also rely on CCCM for similarly important reasons. Thanks, Richard
Re: [PATCH, Pointer Bounds Checker 14/x] Pointer Bounds Checker passes
On Fri, May 30, 2014 at 2:25 PM, Ilya Enkovich enkovich@gmail.com wrote: Hi, This patch adds Pointer Bounds Checker passes. Versioning happens before early local passes. Earply local passes are split into 3 stages to have everything instrumented before any optimization applies. That looks artificial to me. If you need to split up early_local_passes then do that - nesting three IPA pass groups inside it looks odd to me. Btw - doing this in three IPA phases makes things possibly slower due to cache effects. It might be worth pursuing to move the early stage completely to the lowering pipeline. Btw, fixup_cfg only needs to run once local_pure_const was run on a callee, thus it shouldn't be neccessary to run it from the first group. void pass_manager::execute_early_local_passes () { - execute_pass_list (pass_early_local_passes_1-sub); + execute_pass_list (pass_early_local_passes_1-sub-sub); + execute_pass_list (pass_early_local_passes_1-sub-next-sub); + execute_pass_list (pass_early_local_passes_1-sub-next-next-next-sub); } is gross - it should be enough to execute the early local pass list (obsolete comment with the suggestion above). Richard. Bootstrapped and tested on linux-x86_64. Thanks, Ilya -- gcc/ 2014-05-29 Ilya Enkovich ilya.enkov...@intel.com * tree-chkp.c: New. * tree-chkp.h: New. * rtl-chkp.c: New. * rtl-chkp.h: New. * Makefile.in (OBJS): Add tree-chkp.o, rtl-chkp.o. (GTFILES): Add tree-chkp.c. * c-family/c.opt (fchkp-check-incomplete-type): New. (fchkp-zero-input-bounds-for-main): New. (fchkp-first-field-has-own-bounds): New. (fchkp-narrow-bounds): New. (fchkp-narrow-to-innermost-array): New. (fchkp-optimize): New. (fchkp-use-fast-string-functions): New. (fchkp-use-nochk-string-functions): New. (fchkp-use-static-bounds): New. (fchkp-use-static-const-bounds): New. (fchkp-treat-zero-dynamic-size-as-infinite): New. (fchkp-check-read): New. (fchkp-check-write): New. (fchkp-store-bounds): New. (fchkp-instrument-calls): New. (fchkp-instrument-marked-only): New. * cppbuiltin.c (define_builtin_macros_for_compilation_flags): Add __CHKP__ macro when Pointer Bounds Checker is on. * passes.def (pass_ipa_chkp_versioning): New. (pass_before_local_optimization_passes): New. (pass_chkp_instrumentation_passes): New. (pass_ipa_chkp_produce_thunks): New. (pass_local_optimization_passes): New. (pass_chkp_opt): New. * toplev.c: include tree-chkp.h. (compile_file): Add chkp_finish_file call. * tree-pass.h (make_pass_ipa_chkp_versioning): New. (make_pass_ipa_chkp_produce_thunks): New. (make_pass_chkp): New. (make_pass_chkp_opt): New. (make_pass_before_local_optimization_passes): New. (make_pass_chkp_instrumentation_passes): New. (make_pass_local_optimization_passes): New. * tree.h (called_as_built_in): New. * builtins.c (called_as_built_in): Not static anymore. * passes.c (pass_manager::execute_early_local_passes): Execute early passes in three steps. (pass_data_before_local_optimization_passes): New. (pass_before_local_optimization_passes): New. (pass_data_chkp_instrumentation_passes): New. (pass_chkp_instrumentation_passes): New. (pass_data_local_optimization_passes): New. (pass_local_optimization_passes): New. (make_pass_before_local_optimization_passes): New. (make_pass_chkp_instrumentation_passes): New. (make_pass_local_optimization_passes): New.
Re: [PATCH, i386, Pointer Bounds Checker 10/x] Partitions
On Fri, May 30, 2014 at 7:10 PM, Jeff Law l...@redhat.com wrote: On 05/28/14 10:06, Ilya Enkovich wrote: Hi, This patch keeps instrumented and original versions together and preserve tranparent alias chain during symbol name privatization. Bootstrapped and tested on linux-x86_64. Thanks, Ilya -- gcc/ 2013-05-28 Ilya Enkovich ilya.enkov...@intel.com * lto/lto-partition.c (add_symbol_to_partition_1): Keep original and instrumented versions together. This part is OK. Note lto/ has its own ChangeLog, so put the ChangeLog entry there and don't use the lto/ prefix in the ChangeLog entry. (privatize_symbol_name): Restore transparent alias chain if required. What exactly are you doing here? The comment in the code doesn't really make it clear what you are doing or why. + /* We could change name which is a target of transparent alias + chain of instrumented function name. Fix alias chain if so .*/ So are you saying that we want to change the name? Or that it could have been changed and we have to adjust something because it was changed? I'm certainly not as familiar with the LTO stuff as I should be -- what is the purpose behing chaining the DECL_ASSEMBLER_NAME nodes? Something gross: /* Nonzero in an IDENTIFIER_NODE if the name is a local alias, whose uses are to be substituted for uses of the TREE_CHAINed identifier. */ #define IDENTIFIER_TRANSPARENT_ALIAS(NODE) \ (IDENTIFIER_NODE_CHECK (NODE)-base.deprecated_flag) this should be all moved to the symbol table level. (and IDENTIFIER_NODE shouldn't have to have tree_common.chain and thus become smaller). Richard. jeff
Re: [PATCH, i386, Pointer Bounds Checker 10/x] Partitions
On 30 May 11:10, Jeff Law wrote: On 05/28/14 10:06, Ilya Enkovich wrote: Hi, This patch keeps instrumented and original versions together and preserve tranparent alias chain during symbol name privatization. Bootstrapped and tested on linux-x86_64. Thanks, Ilya -- gcc/ 2013-05-28 Ilya Enkovich ilya.enkov...@intel.com * lto/lto-partition.c (add_symbol_to_partition_1): Keep original and instrumented versions together. This part is OK. Note lto/ has its own ChangeLog, so put the ChangeLog entry there and don't use the lto/ prefix in the ChangeLog entry. (privatize_symbol_name): Restore transparent alias chain if required. What exactly are you doing here? The comment in the code doesn't really make it clear what you are doing or why. + /* We could change name which is a target of transparent alias + chain of instrumented function name. Fix alias chain if so .*/ So are you saying that we want to change the name? Or that it could have been changed and we have to adjust something because it was changed? I'm certainly not as familiar with the LTO stuff as I should be -- what is the purpose behing chaining the DECL_ASSEMBLER_NAME nodes? I'm just adjusting renaming to support alias chains. LTO renames functions to avoid two static functions with the same assembler name. Instrumented functions have names chained with original names using IDENTIFIER_TRANSPARENT_ALIAS. If name of the original function is privatized, then IDENTIFIER_TRANSPARENT_ALIAS still points to the old name which is wrong. My patch fixes it. jeff Here is fixed ChangeLog. Thanks, Ilya -- gcc/lto 2014-06-02 Ilya Enkovich ilya.enkov...@intel.com * lto-partition.c (add_symbol_to_partition_1): Keep original and instrumented versions together. (privatize_symbol_name): Restore transparent alias chain if required.
Re: [PATCH, Pointer Bounds Checker 13/x] Early versioning
On Mon, Jun 2, 2014 at 12:48 PM, Ilya Enkovich enkovich@gmail.com wrote: On 30 May 10:59, Jeff Law wrote: On 05/29/14 05:05, Ilya Enkovich wrote: Hi, This patch allows to perform function versioning when some structures are not available yet. It is required to make clones for Pointer Bounds Checker right after SSA build. Bootstrapped and tested on linux-x86_64. Thanks, Ilya -- gcc/ 2014-05-29 Ilya Enkovich ilya.enkov...@intel.com * tree-inline.c (copy_cfg_body): Check loop tree existence before accessing it. (tree_function_versioning): Check DF info existence before accessing it. diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c index 4293241..23fef90 100644 --- a/gcc/tree-inline.c +++ b/gcc/tree-inline.c @@ -2544,7 +2544,8 @@ copy_cfg_body (copy_body_data * id, gcov_type count, int frequency_scale, /* If the loop tree in the source function needed fixup, mark the destination loop tree for fixup, too. */ - if (loops_for_fn (src_cfun)-state LOOPS_NEED_FIXUP) + if (loops_for_fn (src_cfun) + loops_for_fn (src_cfun)-state LOOPS_NEED_FIXUP) loops_state_set (LOOPS_NEED_FIXUP); Hmm, so if I understand things correctly, src_fun has no loop structures attached, thus there's nothing to copy. Presumably at some later point we build loop structures for the copy from scratch? I suppose it is just a simple bug with absent NULL pointer check. Here is original code: /* Duplicate the loop tree, if available and wanted. */ if (loops_for_fn (src_cfun) != NULL current_loops != NULL) { copy_loops (id, entry_block_map-loop_father, get_loop (src_cfun, 0)); /* Defer to cfgcleanup to update loop-father fields of basic-blocks. */ loops_state_set (LOOPS_NEED_FIXUP); } /* If the loop tree in the source function needed fixup, mark the destination loop tree for fixup, too. */ if (loops_for_fn (src_cfun)-state LOOPS_NEED_FIXUP) loops_state_set (LOOPS_NEED_FIXUP); As you may see we have check for absent loops structure in the first if-statement and no check in the second one. I hit segfault and added the check. Actually after SSA is built we always have loops (loops are built once we build the CFG which happens earlier). So all the above checks are no longer necessary now. Similarly for the PTA info, we just build it from scratch in the copy at some point? Here we also have conditional access like /* Reset the escaped solution. */ if (cfun-gimple_df) pt_solution_reset (cfun-gimple_df-escaped); and following unconditional I've fixed. Likewise. The init_data_structures pass initializes this (init_tree_ssa). So I'm not sure why you need all this given we are in SSA form? Richard. Assuming the answers to both are yes, then this patch is OK for the trunk when the rest of the patches are approved. It's not great, bit it's OK. Thanks! Ilya jeff
Re: [PATCH, Pointer Bounds Checker 13/x] Early versioning
2014-06-02 15:56 GMT+04:00 Richard Biener richard.guent...@gmail.com: On Mon, Jun 2, 2014 at 12:48 PM, Ilya Enkovich enkovich@gmail.com wrote: On 30 May 10:59, Jeff Law wrote: On 05/29/14 05:05, Ilya Enkovich wrote: Hi, This patch allows to perform function versioning when some structures are not available yet. It is required to make clones for Pointer Bounds Checker right after SSA build. Bootstrapped and tested on linux-x86_64. Thanks, Ilya -- gcc/ 2014-05-29 Ilya Enkovich ilya.enkov...@intel.com * tree-inline.c (copy_cfg_body): Check loop tree existence before accessing it. (tree_function_versioning): Check DF info existence before accessing it. diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c index 4293241..23fef90 100644 --- a/gcc/tree-inline.c +++ b/gcc/tree-inline.c @@ -2544,7 +2544,8 @@ copy_cfg_body (copy_body_data * id, gcov_type count, int frequency_scale, /* If the loop tree in the source function needed fixup, mark the destination loop tree for fixup, too. */ - if (loops_for_fn (src_cfun)-state LOOPS_NEED_FIXUP) + if (loops_for_fn (src_cfun) + loops_for_fn (src_cfun)-state LOOPS_NEED_FIXUP) loops_state_set (LOOPS_NEED_FIXUP); Hmm, so if I understand things correctly, src_fun has no loop structures attached, thus there's nothing to copy. Presumably at some later point we build loop structures for the copy from scratch? I suppose it is just a simple bug with absent NULL pointer check. Here is original code: /* Duplicate the loop tree, if available and wanted. */ if (loops_for_fn (src_cfun) != NULL current_loops != NULL) { copy_loops (id, entry_block_map-loop_father, get_loop (src_cfun, 0)); /* Defer to cfgcleanup to update loop-father fields of basic-blocks. */ loops_state_set (LOOPS_NEED_FIXUP); } /* If the loop tree in the source function needed fixup, mark the destination loop tree for fixup, too. */ if (loops_for_fn (src_cfun)-state LOOPS_NEED_FIXUP) loops_state_set (LOOPS_NEED_FIXUP); As you may see we have check for absent loops structure in the first if-statement and no check in the second one. I hit segfault and added the check. Actually after SSA is built we always have loops (loops are built once we build the CFG which happens earlier). So all the above checks are no longer necessary now. Similarly for the PTA info, we just build it from scratch in the copy at some point? Here we also have conditional access like /* Reset the escaped solution. */ if (cfun-gimple_df) pt_solution_reset (cfun-gimple_df-escaped); and following unconditional I've fixed. Likewise. The init_data_structures pass initializes this (init_tree_ssa). So I'm not sure why you need all this given we are in SSA form? Instrumentation clones are created before we are in SSA form. I do it before early local passes. Ilya Richard. Assuming the answers to both are yes, then this patch is OK for the trunk when the rest of the patches are approved. It's not great, bit it's OK. Thanks! Ilya jeff
Re: [AARCH64, PATCH] Fix ICE in aarch64_float_const_representable_p
On 30 May 2014 09:14, Tom de Vries tom_devr...@mentor.com wrote: Marcus, when building for aarch64-linux-gnu with --enable-checking=yes,rtl, I run into the following error: OK, thanks Tom. /Marcus
Re: [PING*2][PATCH] Extend mode-switching to support toggle (1/2)
Hello, Any feedback for this ? I'd like to commit only when OK for Epiphany. many thanks, Christian On 05/26/2014 05:32 PM, Christian Bruel wrote: On 04/28/2014 10:08 AM, Christian Bruel wrote: Hello, I'd like to ping the following patches [Hookize mode-switching] http://gcc.gnu.org/ml/gcc-patches/2014-04/msg01003.html [Add new hooks to support toggle and SH4A fpchg instruction] http://gcc.gnu.org/ml/gcc-patches/2014-04/msg01005.html Sorry, I only saw the first part and thought I' d need to wait till I see the second part - and I somehow missed that. I think the previous known mode should be passed to the TARGET_MODE_EMIT hook - no need to have extra hooks for toggling, and, as I mentioned earlier, fixating on the toggle is actually an SH artifact - other ports have multi-way modes settings that can benefit from knowing the previous mode. OK I'll change the bool toggle parameter by the previous mode in TARGET_MODE_EMIT and remove the bool XOR hooks in the SH description. Hello, This is the interface for targets that could use the previous mode for switching. TARGET_MODE_EMIT now takes a new prev_mode parameter. If the previous mode cannot be determined, MODE_NONE (value depends on the entity) is used. The implementation is less trivial than just supporting a boolean toggle bit, as the previous modes information have to be carried along the edges. For this I recycle the auxiliary edge field that is made unnecessary by the removal of make_pred_opaque and a change in the implementation to call LCM for every modes from every identity simultaneously. This idea was suggested by Joern in PR29349. Another speed improvement is that we process the modes to no_mode instead of max_num_modes for each entity. Thanks to all this, the only additional data to support prev_mode is that for each BB, the avin/avout lcm computation are cached inside the bb_info mode_in/mode_out fields, the xor toggle bit handling could have been removed. bootstrapped/regtested for x86 and sh4, sh4a, sh4a-single, epiphany build is OK. testsuite not ran. Joern, is this new target macro interface OK with you ? Jeff, (or other RTL maintainer) since this is a new implementation for optimize_mode_switching I suppose your previous approval doesn't held anymore... is this new one OK for trunk as well ? No change for x86/sh4/2a interfaces. Many thanks Christian
Re: [PATCH, Pointer Bounds Checker 14/x] Pointer Bounds Checker passes
2014-06-02 15:35 GMT+04:00 Richard Biener richard.guent...@gmail.com: On Fri, May 30, 2014 at 2:25 PM, Ilya Enkovich enkovich@gmail.com wrote: Hi, This patch adds Pointer Bounds Checker passes. Versioning happens before early local passes. Earply local passes are split into 3 stages to have everything instrumented before any optimization applies. That looks artificial to me. If you need to split up early_local_passes then do that - nesting three IPA pass groups inside it looks odd to me. Btw - doing this in three IPA phases makes things possibly slower due to cache effects. It might be worth pursuing to move the early stage completely to the lowering pipeline. Early local passes is some special case because these passes are executed separately for new functions. I did not want to get three special passes instead and therefore made split inside. If you prefer split pass itself, I suppose pass_early_local_passes may be replaced with something like pass_build_ssa_passes + pass_chkp_instrumentation_passes + pass_ipa_chkp_produce_thunks + pass_local_optimization_passes. execute_early_local_passes would execute gimple passes lists of pass_build_ssa_passes, pass_chkp_instrumentation_passes and pass_local_optimization_passes. I think we cannot have the first stage moved into lowering passes because it should be executed for newly created functions. Btw, fixup_cfg only needs to run once local_pure_const was run on a callee, thus it shouldn't be neccessary to run it from the first group. OK. Will try to remove it from the first group. void pass_manager::execute_early_local_passes () { - execute_pass_list (pass_early_local_passes_1-sub); + execute_pass_list (pass_early_local_passes_1-sub-sub); + execute_pass_list (pass_early_local_passes_1-sub-next-sub); + execute_pass_list (pass_early_local_passes_1-sub-next-next-next-sub); } is gross - it should be enough to execute the early local pass list (obsolete comment with the suggestion above). This function should call only gimple passes for cfun. Therefore we cannot call IPA passes here and has to execute each gimple passes list separately. Ilya Richard. Bootstrapped and tested on linux-x86_64. Thanks, Ilya -- gcc/ 2014-05-29 Ilya Enkovich ilya.enkov...@intel.com * tree-chkp.c: New. * tree-chkp.h: New. * rtl-chkp.c: New. * rtl-chkp.h: New. * Makefile.in (OBJS): Add tree-chkp.o, rtl-chkp.o. (GTFILES): Add tree-chkp.c. * c-family/c.opt (fchkp-check-incomplete-type): New. (fchkp-zero-input-bounds-for-main): New. (fchkp-first-field-has-own-bounds): New. (fchkp-narrow-bounds): New. (fchkp-narrow-to-innermost-array): New. (fchkp-optimize): New. (fchkp-use-fast-string-functions): New. (fchkp-use-nochk-string-functions): New. (fchkp-use-static-bounds): New. (fchkp-use-static-const-bounds): New. (fchkp-treat-zero-dynamic-size-as-infinite): New. (fchkp-check-read): New. (fchkp-check-write): New. (fchkp-store-bounds): New. (fchkp-instrument-calls): New. (fchkp-instrument-marked-only): New. * cppbuiltin.c (define_builtin_macros_for_compilation_flags): Add __CHKP__ macro when Pointer Bounds Checker is on. * passes.def (pass_ipa_chkp_versioning): New. (pass_before_local_optimization_passes): New. (pass_chkp_instrumentation_passes): New. (pass_ipa_chkp_produce_thunks): New. (pass_local_optimization_passes): New. (pass_chkp_opt): New. * toplev.c: include tree-chkp.h. (compile_file): Add chkp_finish_file call. * tree-pass.h (make_pass_ipa_chkp_versioning): New. (make_pass_ipa_chkp_produce_thunks): New. (make_pass_chkp): New. (make_pass_chkp_opt): New. (make_pass_before_local_optimization_passes): New. (make_pass_chkp_instrumentation_passes): New. (make_pass_local_optimization_passes): New. * tree.h (called_as_built_in): New. * builtins.c (called_as_built_in): Not static anymore. * passes.c (pass_manager::execute_early_local_passes): Execute early passes in three steps. (pass_data_before_local_optimization_passes): New. (pass_before_local_optimization_passes): New. (pass_data_chkp_instrumentation_passes): New. (pass_chkp_instrumentation_passes): New. (pass_data_local_optimization_passes): New. (pass_local_optimization_passes): New. (make_pass_before_local_optimization_passes): New. (make_pass_chkp_instrumentation_passes): New. (make_pass_local_optimization_passes): New.
Re: [GSoC][match-and-simplify] add bitwise patterns to match.pd
On Mon, Jun 2, 2014 at 1:33 PM, Prathamesh Kulkarni bilbotheelffri...@gmail.com wrote: Hi, I have tried to add few bitwise patterns from tree-ssa-forwprop.c:simplify_bitwise_binary and the patterns mentioned in Simplifications wiki (https://gcc.gnu.org/wiki/Simplifications). How to write a test-case to match multiple gimple statements ? Example: For the pattern: ~x | ~y - ~(x y) test-case: int f(int x, int y) { int t1 = ~x; int t2 = ~y; return t1 | t2; } fdump-tree-forwprop-details output: gimple_match_and_simplified to _5 = ~_7; f (int x, int y) { int t2; int t1; int _5; int _7; bb 2: t1_2 = ~x_1(D); t2_4 = ~y_3(D); _7 = x_1(D) y_3(D); _5 = ~_7; return _5; } I suppose we want to look for matching both: _7 = x_1(D) y_3(D); _5 = ~_7; so only matching on gimple_match_and_simplified to _5 = ~_7 won't be correct ? Yeah, that's a forwprop debugging dump issue and/or of the API it uses. The gimple_match_and_simplify overload using a gimple_stmt_iterator * inserts the other stmts before it instead of delaying that to the caller (and gives it the chance to dump it). You can do the old-school testing by scanning for the IL itself, not some gimple_match_and_simplified dump. Thus in addition to the above also scan for { dg-final { scan-tree-dump \[xy\]_\\d\+\\(D\\) \[xy\]_\\d\+\\(D\\) } } The patterns for x 0 - 0 and x -1 - x don't get fired from forwprop. I tried: int f1(int x) { int t1 = 0; return x t1; } fdump-tree-forwprop-details shows following: ;; Function f1 (f1, funcdef_no=0, decl_uid=1743, symbol_order=0) f1 (int x) { int t1; bb 2: return 0; } That's because constant propagation (which runs before forwprop) already simplified the statement. I have a patch to make use of match-and-simplify from gimple_fold_stmt_to_constant but I have to clean it up. I'll give writing testcases a thought there (hopefully will post and commit a patch later today). Richard. [gcc] * match.pd: Add few patterns from simplify_bitwise_binary. [gcc/testsuite] * gcc.dg/tree-ssa/match-2.c: Add more test-cases. Thanks and Regards, Prathamesh
[PATCH][match-and-simplify]
This is a patch I worked on last week - it makes gimple_fold_stmt_to_constant[_1] use gimple_match_and_simplify (and avoid building trees we throw away because they are not gimple values). It also adds basic constant folding patterns (and comments on what is missing in match.pd - the actual patch I was using was comparing the result from both simplification methods and errors if they don't agree). The patch below (which has been committed) keeps the old path to not regress due to missing patterns. Committed to the branch. Richard. 2014-06-02 Richard Biener rguent...@suse.de * gimple-fold.c (gimple_fold_stmt_to_constant_1): Rename to (gimple_fold_stmt_to_constant_2): ... this and make static. (gimple_fold_stmt_to_constant_1): New wrapper using gimple_match_and_simplify before falling back to gimple_fold_stmt_to_constant_2. * match.pd: Add basic constant folding patterns. Add two commutations. Index: gcc/gimple-fold.c === --- gcc/gimple-fold.c (revision 211131) +++ gcc/gimple-fold.c (working copy) @@ -2524,8 +2524,8 @@ maybe_fold_or_comparisons (enum tree_cod privatized with the single valueize function used in the various TUs to avoid the indirect function call overhead. */ -tree -gimple_fold_stmt_to_constant_1 (gimple stmt, tree (*valueize) (tree)) +static tree +gimple_fold_stmt_to_constant_2 (gimple stmt, tree (*valueize) (tree)) { location_t loc = gimple_location (stmt); switch (gimple_code (stmt)) @@ -2803,6 +2803,30 @@ gimple_fold_stmt_to_constant_1 (gimple s } } +tree +gimple_fold_stmt_to_constant_1 (gimple stmt, tree (*valueize) (tree)) +{ + tree lhs = gimple_get_lhs (stmt); + if (lhs) +{ + tree res = gimple_match_and_simplify (lhs, NULL, valueize); + if (res) + { + if (dump_file dump_flags TDF_DETAILS) + { + fprintf (dump_file, Match-and-simplified definition of ); + print_generic_expr (dump_file, lhs, 0); + fprintf (dump_file, to ); + print_generic_expr (dump_file, res, 0); + fprintf (dump_file, \n); + } + return res; + } +} + /* ??? For now, to avoid regressions. */ + return gimple_fold_stmt_to_constant_2 (stmt, valueize); +} + /* Fold STMT to a constant using VALUEIZE to valueize SSA names. Returns NULL_TREE if folding to a constant is not possible, otherwise returns a constant according to is_gimple_min_invariant. */ Index: gcc/match.pd === --- gcc/match.pd(revision 211131) +++ gcc/match.pd(working copy) @@ -21,6 +21,95 @@ You should have received a copy of the G along with GCC; see the file COPYING3. If not see http://www.gnu.org/licenses/. */ +/* Simple constant foldings to substitute gimple_fold_stmt_to_constant_2. */ +(match_and_simplify + (plus @0 integer_zerop) + @0) +(match_and_simplify + (pointer_plus @0 integer_zerop) + @0) +(match_and_simplify + (minus @0 integer_zerop) + @0) +(match_and_simplify + (minus @0 @0) + { build_zero_cst (type); }) +(match_and_simplify + (mult @0 integer_zerop@1) + @1) +(match_and_simplify + (mult @0 integer_onep) + @0) +(match_and_simplify + (trunc_div @0 integer_onep) + @0) +/* It's hard to preserve non-folding of / 0 which is done by a + positional check in fold-const.c (to preserve warnings). The + issue here is that we fold too early in frontends. + Also fold happilt folds 0 / x to 0 (even if x turns out to be zero later). */ +(match_and_simplify + (trunc_div integer_zerop@0 @1) + @0) +(match_and_simplify + (trunc_div @0 @0) + { build_one_cst (type); }) +(match_and_simplify + (trunc_mod @0 integer_onep) + { build_zero_cst (type); }) +(match_and_simplify + (trunc_mod integer_zerop@0 @1) + @0) +(match_and_simplify + (trunc_mod @0 @0) + { build_zero_cst (type); }) +(match_and_simplify + (bit_ior @0 integer_zerop) + @0) +(match_and_simplify + (bit_ior @0 integer_all_onesp@1) + @1) +(match_and_simplify + (bit_and @0 integer_all_onesp) + @0) +(match_and_simplify + (bit_and @0 integer_zerop@1) + @1) +(match_and_simplify + (bit_xor @0 integer_zerop) + @0) +(match_and_simplify + (bit_xor @0 @0) + { build_zero_cst (type); }) +/* tree-ssa/ifc-pr44710.c requires a b ? c : d to fold to 1. + ??? probably runs into issue of recursive folding of a b op0. */ +/* tree-ssa/ssa-ccp-16.c wants to fold hello[i_2] to 0 + (fold_const_aggregate_ref_1). */ +/* tree-ssa/ssa-ccp-19.c wants to fold a1_3-i to MEM[(void *)a] + (get_addr_base_and_unit_offset_1). */ +/* tree-ssa/ssa-ccp-22.c wants to fold b_2(D) = t_1 to 1. + We are missing compare constant folding to type boundaries. */ + +/* The following is simplification done by gimple_fold_stmt_to_constant_1 + to aid propagation engines, producing is_gimple_min_invariants from +
[PATCH, i386]: Fix PR 61239, ICE in decompose, at rtl.h when compiling vshuf-v16hi.c using -mavx2
Hello! 2014-06-02 Uros Bizjak ubiz...@gmail.com PR target/61239 * config/i386/i386.c (ix86_expand_vec_perm) [case V32QImode]: Use GEN_INT (-128) instead of GEN_INT (128) to set MSB of QImode constant. Tested on x86_64-pc-linux-gnu with make check-gcc RUNTESTFLAGS='--target_board=unix\{-msse2,-msse4,-mavx,-mavx2\} dg-torture.exp=vshuf*.c' and committed to mainline SVN. Uros. Index: config/i386/i386.c === --- config/i386/i386.c (revision 211125) +++ config/i386/i386.c (working copy) @@ -21541,7 +21541,7 @@ ix86_expand_vec_perm (rtx operands[]) t1 = gen_reg_rtx (V32QImode); t2 = gen_reg_rtx (V32QImode); t3 = gen_reg_rtx (V32QImode); - vt2 = GEN_INT (128); + vt2 = GEN_INT (-128); for (i = 0; i 32; i++) vec[i] = vt2; vt = gen_rtx_CONST_VECTOR (V32QImode, gen_rtvec_v (32, vec));
[PING**2] [PATCH, FORTRAN] Fix PR fortran/60718
Hello, Ping**2... this is patch still pending: see https://gcc.gnu.org/ml/gcc-patches/2014-04/msg00774.html for the latest version. Thanks, Bernd. Date: Wed, 30 Apr 2014 15:17:51 +0200 Ping... Date: Tue, 15 Apr 2014 13:49:37 +0200 Hi Tobias, On Fri, 11 Apr 2014 16:04:51, Tobias Burnus wrote: Hi Tobias, On Fri, Apr 11, 2014 at 02:39:57PM +0200, Bernd Edlinger wrote: On Fri, 11 Apr 2014 13:37:46, Tobias Burnus wrote: Hmm, I was hoping somehow that only that test case is broken, and needs to be fixed. The target attribute is somehow simple, it implies intent(in) and the actual value will in most cases be a pointer, as in the example. I think that passing another nonpointer TARGET to a dummy argument which has a TARGET attribute is at least as common as passing a POINTER to a TARGET. TARGET is roughtly the opposite to the restrict qualifier. By default any nonpointer variable does not alias with something else, unless it has the TARGET attribute; if it has, it (its address) can then be assigned to a pointer. POINTER intrinsically alias and cannot have the TARGET attribute. Pointer - Nonalloc Allocatable - Noalloc Nonallocatable*/Allocatable* - Pointer with intent(in) Well, this approach does not handle intent(inout) at all. Now I have created a test case for the different aliasing issues with may arise with scalar objects. As you pointed out, also conversions of allocatable - nonalloc, allocatable - pointer and nonalloc - pointer turn out to violate the strict aliasing rules. However, conversions of arrays of objects with different attributes seem to be safe. I have not been able to find an example where it would be necessary to write the modified class object back to the original location. But I am not really a Fortran expert. Unfortunately there are also conversions of optional allocatable - optional pointer, which complicate the whole thing quite a lot. I have found these in class_optional_2.f90. Boot-strapped and regression-tested on x86_64-linux-gnu. OK for trunk? Thanks Bernd.
Re: [GSoC][match-and-simplify] add bitwise patterns to match.pd
On Mon, Jun 2, 2014 at 2:53 PM, Richard Biener richard.guent...@gmail.com wrote: On Mon, Jun 2, 2014 at 1:33 PM, Prathamesh Kulkarni bilbotheelffri...@gmail.com wrote: Hi, I have tried to add few bitwise patterns from tree-ssa-forwprop.c:simplify_bitwise_binary and the patterns mentioned in Simplifications wiki (https://gcc.gnu.org/wiki/Simplifications). How to write a test-case to match multiple gimple statements ? Example: For the pattern: ~x | ~y - ~(x y) test-case: int f(int x, int y) { int t1 = ~x; int t2 = ~y; return t1 | t2; } fdump-tree-forwprop-details output: gimple_match_and_simplified to _5 = ~_7; f (int x, int y) { int t2; int t1; int _5; int _7; bb 2: t1_2 = ~x_1(D); t2_4 = ~y_3(D); _7 = x_1(D) y_3(D); _5 = ~_7; return _5; } I suppose we want to look for matching both: _7 = x_1(D) y_3(D); _5 = ~_7; so only matching on gimple_match_and_simplified to _5 = ~_7 won't be correct ? Yeah, that's a forwprop debugging dump issue and/or of the API it uses. The gimple_match_and_simplify overload using a gimple_stmt_iterator * inserts the other stmts before it instead of delaying that to the caller (and gives it the chance to dump it). You can do the old-school testing by scanning for the IL itself, not some gimple_match_and_simplified dump. Thus in addition to the above also scan for { dg-final { scan-tree-dump \[xy\]_\\d\+\\(D\\) \[xy\]_\\d\+\\(D\\) } } The patterns for x 0 - 0 and x -1 - x don't get fired from forwprop. I tried: int f1(int x) { int t1 = 0; return x t1; } fdump-tree-forwprop-details shows following: ;; Function f1 (f1, funcdef_no=0, decl_uid=1743, symbol_order=0) f1 (int x) { int t1; bb 2: return 0; } That's because constant propagation (which runs before forwprop) already simplified the statement. I have a patch to make use of match-and-simplify from gimple_fold_stmt_to_constant but I have to clean it up. I'll give writing testcases a thought there (hopefully will post and commit a patch later today). Btw, /* x 0 - 0 */ (match_and_simplify (bit_and @0 @1) if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) @1 == integer_zero_node) { integer_zero_node; }) /* x -1 - x */ (match_and_simplify (bit_and @0 @1) if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) @1 == integer_minus_one_node) @0) are too restrictive due to comparing @1 against very special tree nodes. The patch I just committed uses integer_zerop and integer_all_onesp instead. I have simplfied some of the if-exprs, operands are guaranteed to match. Also if we settle on the solution of providing 'type' as the type of the outermost expression we can simplify them as bitwise expressions don't change types. Most of the patterns would also apply in commutated form, thus I guess thinking of a good solution for that is next on the list after the decision tree stuff. /* ((a b) ~a) ~b - 0 */ (match_and_simplify (bit_and (bit_and (bit_and @0 @1) (bit_not @0)) (bit_not @1)) if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) INTEGRAL_TYPE_P (TREE_TYPE (@1))) { integer_zero_node; }) isn't this too complex and instead already (a b) ~a is 0? Also that would be still doing two steps in one as we already have a ~a - 0 so the pattern (if wanted) should only do the association (a b) ~a - (a ~a) b? (note that generally this is something for a reassociation pass, not for simple pattern matching) I have applied the patch with these changes. Richard. Richard. [gcc] * match.pd: Add few patterns from simplify_bitwise_binary. [gcc/testsuite] * gcc.dg/tree-ssa/match-2.c: Add more test-cases. Thanks and Regards, Prathamesh
Re: [PATCH, Pointer Bounds Checker 14/x] Pointer Bounds Checker passes
On Mon, Jun 2, 2014 at 2:44 PM, Ilya Enkovich enkovich@gmail.com wrote: 2014-06-02 15:35 GMT+04:00 Richard Biener richard.guent...@gmail.com: On Fri, May 30, 2014 at 2:25 PM, Ilya Enkovich enkovich@gmail.com wrote: Hi, This patch adds Pointer Bounds Checker passes. Versioning happens before early local passes. Earply local passes are split into 3 stages to have everything instrumented before any optimization applies. That looks artificial to me. If you need to split up early_local_passes then do that - nesting three IPA pass groups inside it looks odd to me. Btw - doing this in three IPA phases makes things possibly slower due to cache effects. It might be worth pursuing to move the early stage completely to the lowering pipeline. Early local passes is some special case because these passes are executed separately for new functions. I did not want to get three special passes instead and therefore made split inside. Yeah, but all passes are already executed via execute_early_local_passes, so it would be only an implementation detail. If you prefer split pass itself, I suppose pass_early_local_passes may be replaced with something like pass_build_ssa_passes + pass_chkp_instrumentation_passes + pass_ipa_chkp_produce_thunks + pass_local_optimization_passes. execute_early_local_passes would execute gimple passes lists of pass_build_ssa_passes, pass_chkp_instrumentation_passes and pass_local_optimization_passes. I think we cannot have the first stage moved into lowering passes because it should be executed for newly created functions. Well, let's defer that then. Btw, fixup_cfg only needs to run once local_pure_const was run on a callee, thus it shouldn't be neccessary to run it from the first group. OK. Will try to remove it from the first group. void pass_manager::execute_early_local_passes () { - execute_pass_list (pass_early_local_passes_1-sub); + execute_pass_list (pass_early_local_passes_1-sub-sub); + execute_pass_list (pass_early_local_passes_1-sub-next-sub); + execute_pass_list (pass_early_local_passes_1-sub-next-next-next-sub); } is gross - it should be enough to execute the early local pass list (obsolete comment with the suggestion above). This function should call only gimple passes for cfun. Therefore we cannot call IPA passes here and has to execute each gimple passes list separately. Ok, given a different split this would then become somewhat more sane anyway. Richard. Ilya Richard. Bootstrapped and tested on linux-x86_64. Thanks, Ilya -- gcc/ 2014-05-29 Ilya Enkovich ilya.enkov...@intel.com * tree-chkp.c: New. * tree-chkp.h: New. * rtl-chkp.c: New. * rtl-chkp.h: New. * Makefile.in (OBJS): Add tree-chkp.o, rtl-chkp.o. (GTFILES): Add tree-chkp.c. * c-family/c.opt (fchkp-check-incomplete-type): New. (fchkp-zero-input-bounds-for-main): New. (fchkp-first-field-has-own-bounds): New. (fchkp-narrow-bounds): New. (fchkp-narrow-to-innermost-array): New. (fchkp-optimize): New. (fchkp-use-fast-string-functions): New. (fchkp-use-nochk-string-functions): New. (fchkp-use-static-bounds): New. (fchkp-use-static-const-bounds): New. (fchkp-treat-zero-dynamic-size-as-infinite): New. (fchkp-check-read): New. (fchkp-check-write): New. (fchkp-store-bounds): New. (fchkp-instrument-calls): New. (fchkp-instrument-marked-only): New. * cppbuiltin.c (define_builtin_macros_for_compilation_flags): Add __CHKP__ macro when Pointer Bounds Checker is on. * passes.def (pass_ipa_chkp_versioning): New. (pass_before_local_optimization_passes): New. (pass_chkp_instrumentation_passes): New. (pass_ipa_chkp_produce_thunks): New. (pass_local_optimization_passes): New. (pass_chkp_opt): New. * toplev.c: include tree-chkp.h. (compile_file): Add chkp_finish_file call. * tree-pass.h (make_pass_ipa_chkp_versioning): New. (make_pass_ipa_chkp_produce_thunks): New. (make_pass_chkp): New. (make_pass_chkp_opt): New. (make_pass_before_local_optimization_passes): New. (make_pass_chkp_instrumentation_passes): New. (make_pass_local_optimization_passes): New. * tree.h (called_as_built_in): New. * builtins.c (called_as_built_in): Not static anymore. * passes.c (pass_manager::execute_early_local_passes): Execute early passes in three steps. (pass_data_before_local_optimization_passes): New. (pass_before_local_optimization_passes): New. (pass_data_chkp_instrumentation_passes): New. (pass_chkp_instrumentation_passes): New. (pass_data_local_optimization_passes): New. (pass_local_optimization_passes): New.
Re: [PATCH v2 3/3] mangler/demangler dogfooding
On 05/28/2014 06:31 PM, Pedro Alves wrote: Some failures might be + demangler bugs, others unknown mangler bugs, and others known + mangler bugs fixed with a higher -fabi-version that the + demangler doesn't have a workaround for. You can avoid the latter by skipping demangling if G.need_abi_warning is set. If you do that, are there still any failures? Jason
Re: [PATCHv2/AARCH64 3/3] Support ILP32 multi-lib
On 26 February 2014 02:25, Andrew Pinski apin...@cavium.com wrote: Hi, This is the final patch which adds support for the dynamic linker and multi-lib directories for ILP32. I did not change multi-arch support as I did not know what it should be changed to and internally here at Cavium, we don't use multi-arch. Updated for the new names that were decided on. OK? Build and tested for aarch64-linux-gnu with and without --with-multilib-list=lp64,ilp32. Thanks, Andrew Pinski * config/aarch64/aarch64-linux.h (GLIBC_DYNAMIC_LINKER): /lib/ld-linux-aarch64_ilp32.so.1 is used for ILP32. (LINUX_TARGET_LINK_SPEC): Update linker script for ILP32. file whose name depends on -mabi= and -mbig-endian. * config/aarch64/t-aarch64-linux (MULTILIB_OSDIRNAMES): Handle LP64 better and handle ilp32 too. (MULTILIB_OPTIONS): Delete. (MULTILIB_DIRNAMES): Delete. Hi, This is OK, thanks /Marcus
Re: [PATCH 2/3] PR other/61321 - demangler crash on casts in template parameters
On 05/27/2014 07:57 AM, Pedro Alves wrote: And after the fix for 59195, due to: static void d_print_cast (struct d_print_info *dpi, int options, const struct demangle_component *dc) { ... /* For a cast operator, we need the template parameters from the enclosing template in scope for processing the type. */ if (dpi-current_template != NULL) { dpt.next = dpi-templates; dpi-templates = dpt; dpt.template_decl = dpi-current_template; } when printing the template argument list of A (what should be sizeof (int)), the template parameter 0 (that is, T_, the '**' above) now refers to the first parameter of the the template argument list of the 'A' template (the '*' above), exactly what we were already trying to print. This leads to infinite recursion, and stack exaustion. The template parameter 0 should actually refer to the first parameter of the 'function_temp' template. It seems that the problem here is more general; a template argument list is not in scope within that same template argument list. Can't we fix that without special-casing conversion ops? Jason
[PATCH, testsuite]: Add -mno-avx2 to some i386 XOP tests
Hello! With targets that default to AVX2, these tests vectorize via 256bit paths, where different insns are emitted. 2014-06-02 Uros Bizjak ubiz...@gmail.com * gcc.target/i386/xop-rotate1-vector.c (dg-options): Add -mno-avx2. * gcc.target/i386/xop-rotate2-vector.c (dg-options): Ditto. * gcc.target/i386/xop-rotate3-vector.c (dg-options): Ditto. * gcc.target/i386/xop-imul32widen-vector.c (dg-options): Ditto. * gcc.target/i386/xop-imul64-vector.c (dg-options): Ditto. * gcc.target/i386/xop-shift1-vector.c (dg-options): Ditto. * gcc.target/i386/xop-shift2-vector.c (dg-options): Ditto. * gcc.target/i386/xop-shift3-vector.c (dg-options): Ditto. Tested on x86_64-pc-linux-gnu {,-m32} and committed to mainline SVN. Uros. Index: gcc.target/i386/xop-rotate1-vector.c === --- gcc.target/i386/xop-rotate1-vector.c(revision 211125) +++ gcc.target/i386/xop-rotate1-vector.c(working copy) @@ -2,7 +2,7 @@ into prot on XOP systems. */ /* { dg-do compile { target { ! { ia32 } } } } */ -/* { dg-options -O2 -mxop -ftree-vectorize } */ +/* { dg-options -O2 -mxop -mno-avx2 -ftree-vectorize } */ extern void exit (int); Index: gcc.target/i386/xop-rotate2-vector.c === --- gcc.target/i386/xop-rotate2-vector.c(revision 211125) +++ gcc.target/i386/xop-rotate2-vector.c(working copy) @@ -2,7 +2,7 @@ into prot on XOP systems. */ /* { dg-do compile { target { ! { ia32 } } } } */ -/* { dg-options -O2 -mxop -ftree-vectorize } */ +/* { dg-options -O2 -mxop -mno-avx2 -ftree-vectorize } */ extern void exit (int); Index: gcc.target/i386/xop-imul32widen-vector.c === --- gcc.target/i386/xop-imul32widen-vector.c(revision 211125) +++ gcc.target/i386/xop-imul32widen-vector.c(working copy) @@ -3,7 +3,7 @@ /* { dg-do compile } */ /* { dg-require-effective-target lp64 } */ -/* { dg-options -O2 -mxop -ftree-vectorize } */ +/* { dg-options -O2 -mxop -mno-avx2 -ftree-vectorize } */ extern void exit (int); Index: gcc.target/i386/xop-rotate3-vector.c === --- gcc.target/i386/xop-rotate3-vector.c(revision 211125) +++ gcc.target/i386/xop-rotate3-vector.c(working copy) @@ -2,7 +2,7 @@ into prot on XOP systems. */ /* { dg-do compile { target { ! { ia32 } } } } */ -/* { dg-options -O2 -mxop -ftree-vectorize } */ +/* { dg-options -O2 -mxop -mno-avx2 -ftree-vectorize } */ extern void exit (int); Index: gcc.target/i386/xop-imul64-vector.c === --- gcc.target/i386/xop-imul64-vector.c (revision 211125) +++ gcc.target/i386/xop-imul64-vector.c (working copy) @@ -3,7 +3,7 @@ /* { dg-do compile } */ /* { dg-require-effective-target lp64 } */ -/* { dg-options -O2 -mxop -ftree-vectorize } */ +/* { dg-options -O2 -mxop -mno-avx2 -ftree-vectorize } */ extern void exit (int); Index: gcc.target/i386/xop-shift1-vector.c === --- gcc.target/i386/xop-shift1-vector.c (revision 211125) +++ gcc.target/i386/xop-shift1-vector.c (working copy) @@ -2,7 +2,7 @@ psha/pshl on XOP systems. */ /* { dg-do compile { target { ! { ia32 } } } } */ -/* { dg-options -O2 -mxop -ftree-vectorize } */ +/* { dg-options -O2 -mxop -mno-avx2 -ftree-vectorize } */ extern void exit (int); Index: gcc.target/i386/xop-shift2-vector.c === --- gcc.target/i386/xop-shift2-vector.c (revision 211125) +++ gcc.target/i386/xop-shift2-vector.c (working copy) @@ -2,7 +2,7 @@ psha/pshl on XOP systems. */ /* { dg-do compile { target { ! { ia32 } } } } */ -/* { dg-options -O2 -mxop -ftree-vectorize } */ +/* { dg-options -O2 -mxop -mno-avx2 -ftree-vectorize } */ extern void exit (int); Index: gcc.target/i386/xop-shift3-vector.c === --- gcc.target/i386/xop-shift3-vector.c (revision 211125) +++ gcc.target/i386/xop-shift3-vector.c (working copy) @@ -2,7 +2,7 @@ psha/pshl on XOP systems. */ /* { dg-do compile { target { ! { ia32 } } } } */ -/* { dg-options -O2 -mxop -ftree-vectorize } */ +/* { dg-options -O2 -mxop -mno-avx2 -ftree-vectorize } */ extern void exit (int);
Re: [PATCHv3 2/2] libstdc++: Add std::aligned_union.
On 16/04/14 17:47 +0200, Rüdiger Sonderfeld wrote: Of course I forgot to replace one _M_ instance. This should work now. Sorry about this. -- 8 - 8 -- C++11: [meta.trans.other] * libstdc++-v3/testsuite/20_util/aligned_union/1.cc: New file. * libstdc++-v3/include/std/type_traits (__strictest_alignment): New helper struct. (aligned_union): New struct (C++11). (aligned_union_t): New type alias (C++14). Hi, I've fixed up the patch to meet the coding standards, define the static member aligned_union::alignment_value, check the precondition, update the docs and fix a test failure. Tested x86_64-linux, committed to trunk. Thanks for adding this missing piece! commit 7018ff5142ba1413b140d9d69c7263565ecae000 Author: Jonathan Wakely jwak...@redhat.com Date: Sun Jun 1 23:33:29 2014 +0100 2014-06-02 R??diger Sonderfeld ruedi...@c-plusplus.de Jonathan Wakely jwak...@redhat.com * libstdc++-v3/include/std/type_traits (__strictest_alignment): New helper struct. (aligned_union): New struct (C++11). (aligned_union_t): New type alias (C++14). * doc/xml/manual/status_cxx2011.xml: Update. * libstdc++-v3/testsuite/20_util/aligned_union/1.cc: New file. * testsuite/20_util/declval/requirements/1_neg.cc: Adjust dg-error line number. diff --git a/libstdc++-v3/doc/xml/manual/status_cxx2011.xml b/libstdc++-v3/doc/xml/manual/status_cxx2011.xml index b3c24d8..cad4111 100644 --- a/libstdc++-v3/doc/xml/manual/status_cxx2011.xml +++ b/libstdc++-v3/doc/xml/manual/status_cxx2011.xml @@ -871,11 +871,10 @@ particular release. entry/ /row row - ?dbhtml bgcolor=#B0B0B0 ? entry20.9.7.6/entry entryOther transformations/entry - entryPartial/entry - entryMissing codealigned_union/code./entry + entryY/entry + entry/ /row row entry20.10/entry diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits index 1ff2e62..da8a95f 100644 --- a/libstdc++-v3/include/std/type_traits +++ b/libstdc++-v3/include/std/type_traits @@ -1870,6 +1870,52 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION }; }; + template typename... _Types +struct __strictest_alignment +{ + static const size_t _S_alignment = 0; + static const size_t _S_size = 0; +}; + + template typename _Tp, typename... _Types +struct __strictest_alignment_Tp, _Types... +{ + static const size_t _S_alignment = +alignof(_Tp) __strictest_alignment_Types...::_S_alignment + ? alignof(_Tp) : __strictest_alignment_Types...::_S_alignment; + static const size_t _S_size = +sizeof(_Tp) __strictest_alignment_Types...::_S_size + ? sizeof(_Tp) : __strictest_alignment_Types...::_S_size; +}; + + /** + * @brief Provide aligned storage for types. + * + * [meta.trans.other] + * + * Provides aligned storage for any of the provided types of at + * least size _Len. + * + * @see aligned_storage + */ + template size_t _Len, typename... _Types +struct aligned_union +{ +private: + static_assert(sizeof...(_Types) != 0, At least one type is required); + + using __strictest = __strictest_alignment_Types...; + static const size_t _S_len = _Len __strictest::_S_size + ? _Len : __strictest::_S_size; +public: + /// The value of the strictest alignment of _Types. + static const size_t alignment_value = __strictest::_S_alignment; + /// The storage. + typedef typename aligned_storage_S_len, alignment_value::type type; +}; + + template size_t _Len, typename... _Types +const size_t aligned_union_Len, _Types...::alignment_value; // Decay trait for arrays and functions, used for perfect forwarding // in make_pair, make_tuple, etc. @@ -2206,6 +2252,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION __alignof__(typename __aligned_storage_msa_Len::__type) using aligned_storage_t = typename aligned_storage_Len, _Align::type; + template size_t _Len, typename... _Types +using aligned_union_t = typename aligned_union_Len, _Types...::type; + /// Alias template for decay templatetypename _Tp using decay_t = typename decay_Tp::type; diff --git a/libstdc++-v3/testsuite/20_util/aligned_union/1.cc b/libstdc++-v3/testsuite/20_util/aligned_union/1.cc new file mode 100644 index 000..5285bb0 --- /dev/null +++ b/libstdc++-v3/testsuite/20_util/aligned_union/1.cc @@ -0,0 +1,72 @@ +// { dg-options -std=gnu++11 } +// { dg-do compile } + +// 2014-04-16 R??diger Sonderfeld ruedi...@c-plusplus.de + +// Copyright (C) 2014 Free Software Foundation, Inc. +// +// This file is part of the GNU ISO C++ Library. This library is free +// software; you can redistribute it and/or modify it under the terms +// of the GNU General Public License as published by the Free Software +// Foundation; either version 3, or (at your option) any later +// version. + +// This
Re: [PATCH] Fix PR54733 Optimize endian independent load/store
On 29 May 2014 11:58, Thomas Preud'homme thomas.preudho...@arm.com wrote: From: Andreas Schwab [mailto:sch...@linux-m68k.org] Thomas Preud'homme thomas.preudho...@arm.com writes: By the way, I couldn't understand how you reached the value 0x44434241. Can you explain me? Each byte is composed of the first 7 bits of the original byte. Sorry, it seems I wasn't very awake when I checked that. Makes sense now. Thanks. Does the patch solve the problem you had? What about you Christophe? Best regards, Thomas Hi Thomas, After a quick test, it looks OK to me. Thanks Christophe.
Re: [C/C++ PATCH] Add -Wlogical-not-parentheses (PR c/49706)
On 06/02/2014 02:50 AM, Marek Polacek wrote: + TREE_CODE (arg1.value) == EQ_EXPR) ... + TREE_CODE (current.lhs) == EQ_EXPR It seems like your version only warns about ==, while the clang version warns about all comparisons. + (complain_flags (decltype_p) tf_warning) Let's not call complain_flags multiple times in the function. In fact this will always be true, so you could just drop it. Also for C++, I think we don't want this warning to trigger when the operator (==, !=, , , =, =) is overloaded. But I admit I haven't even tried that, and I don't know how to detect overloaded operators except DECL_OVERLOADED_OPERATOR_P. Overloaded operators have the same precedence, so I think it's appropriate to give the warning whether or not the operators are overloaded. Jason
[PATCH] Fix part of PR61383
In comment #3 it's noted that ifcombine happily hoists possibly trapping stmts ... oops. Fixed like the following, bootstrap and regtest running on x86_64-unknown-linux-gnu. Richard. 2014-06-02 Richard Biener rguent...@suse.de PR tree-optimization/61383 * tree-ssa-ifcombine.c (bb_no_side_effects_p): Make sure stmts can't trap. * gcc.dg/torture/pr61383-1.c: New testcase. Index: gcc/tree-ssa-ifcombine.c === *** gcc/tree-ssa-ifcombine.c(revision 211125) --- gcc/tree-ssa-ifcombine.c(working copy) *** bb_no_side_effects_p (basic_block bb) *** 128,133 --- 128,134 gimple stmt = gsi_stmt (gsi); if (gimple_has_side_effects (stmt) + || gimple_could_trap_p (stmt) || gimple_vuse (stmt)) return false; } Index: gcc/testsuite/gcc.dg/torture/pr61383-1.c === *** gcc/testsuite/gcc.dg/torture/pr61383-1.c(revision 0) --- gcc/testsuite/gcc.dg/torture/pr61383-1.c(working copy) *** *** 0 --- 1,35 + /* { dg-do run } */ + + int a, b = 1, c, d, e, f, g; + + int + fn1 () + { + int h; + for (;;) + { + g = b; + g = g ? 0 : 1 % g; + e = a + 1; + for (; d 1; d = e) + { + if (f == 0) + h = 0; + else + h = 1 % f; + if (f 1) + c = 0; + else if (h) + break; + } + if (b) + return 0; + } + } + + int + main () + { + fn1 (); + return 0; + }
Re: [PATCH 1/4] Make coverage_compute_cfg_checksum callable with an argument
On 05/30/2014 06:28 PM, Jeff Law wrote: On 05/30/14 00:47, Martin Liška wrote: Hello, this is a small patchset that prepares API for new IPA Identical code folding pass. The patch adds an argument for coverage_compute_cfg_checksum. Bootstrapped and tested on x86_64-linux. OK for trunk? Thanks, Martin 2014-05-29 Martin Liska mli...@suse.cz * coverage.h (coverage_compute_cfg_checksum): Argument added. * coverage.c (coverage_compute_cfg_checksum): Likewise. * profile.c (branch_prob): Likewise. The block comment for coverage_compute_cfg_checksum needs to be updated. We're no longer computing the checksum for the current function (cfun), but instead computing the checksum for the argument FN. Hi, thank you for your feedback, I've just fixed the patch and will commit soon. Martin Otherwise OK for the trunk. jeff
Re: [PATCH 2/4] Enhancement of call graph API
On 05/30/2014 06:42 PM, Jeff Law wrote: On 05/30/14 00:47, Martin Liška wrote: Hello, this patch enhances callgraph API to enable more precise control of expand_thunk; another function becomes global. Bootstrapped and tested on x86_64-linux. OK for trunk? Thanks, Martin 2014-05-29 Martin Liska mli...@suse.cz * cgraph.h (expand_thunk): New argument added. (address_taken_from_non_vtable_p): New global function. * ipa-visibility.c (address_taken_from_non_vtable_p): Likewise. * cgraphclones.c (duplicate_thunk_for_node): Argument added to call. * cgraphunit.c (analyze_function): Likewise. (assemble_thunks_and_aliases): Argument added to call. (expand_thunk): New argument forces to produce GIMPLE thunk. Only concern here is the location of the prototype for address_taken_from_non_vtable_p. Though I guess other things form ipa-visibility.c are prototyped in cgraph.h. Can you put the prototype here in cgraph.h: /* In ipa-visibility.c */ bool cgraph_local_node_p (struct cgraph_node *); Otherwise OK. Real curious to see the meat of the optimization now :-) Hello, thanks too. It was really a wrong place for the declaration. Yeah, the optimization will be juicy :) Martin jeff
Re: [PATCH 3/4] New attribute lookup function addition
On 05/30/2014 06:37 PM, Jeff Law wrote: On 05/30/14 00:49, Martin Liška wrote: Hi, this patch introduces a new function lookup_attribute_starting that can find all attributes starting with a specified string. Purpose of the function is to be able to identify e.g. if a function has any 'omp' attribute. Bootstrapped and tested on x86_64-linux. OK for trunk? Thanks, Martin 2014-05-29 Martin Liska mli...@suse.cz * tree.h (private_lookup_attribute_starting): New function. (lookup_attribute_starting): Likewise. * tree.c (private_lookup_attribute_starting): Likewise. private_lookup_attribute_starting needs a block comment. Added. +tree +private_lookup_attribute_starting (const char *attr_name, size_t attr_len, tree list) Long line needs to be wrapped? Please review the patch for lines that need wrapping at 80 columns. Fixed too. So it's really a lookup by prefix, so I'd probably use a name like lookup_attribute_by_prefix. Why private_ in the function name? I used the same construction as for function 'private_is_attribute_p'; I hope the construction is fine? It appears it just returns the first attribute from LIST with the given prefix. Presumably you use it iteratively. +/* Given an attribute name ATTR_NAME and a list of attributes LIST, + return a pointer to the attribute's list element if the attribute + starts with ATTR_NAME. ATTR_NAME must be in the form 'text' (not + '__text__'). */ + +static inline tree +lookup_attribute_starting (const char *attr_name, tree list) +{ + gcc_checking_assert (attr_name[0] != '_'); + /* In most cases, list is NULL_TREE. */ + if (list == NULL_TREE) +return NULL_TREE; + else +return private_lookup_attribute_starting (attr_name, strlen (attr_name), list); +} So again, I prefer prefix rather than starting. Similarly this is meant to be called iteratively since you only get the first attribute with the given prefix, right? I added a comment that it returns just such first argument. Is the reworked patch OK for trunk? Martin OK with the nit fixes mentioned above. Jeff From be3ab469ee70ff3de434f5326c1a2eabf07da3ed Mon Sep 17 00:00:00 2001 Message-Id: be3ab469ee70ff3de434f5326c1a2eabf07da3ed.1401718733.git.mli...@suse.cz In-Reply-To: e245d67afb610a2f210b83382b49f75731ba68b8.1401718733.git.mli...@suse.cz References: e245d67afb610a2f210b83382b49f75731ba68b8.1401718733.git.mli...@suse.cz From: mliska mli...@suse.cz Date: Thu, 29 May 2014 17:18:34 +0200 Subject: [PATCH 3/4] New attribute lookup function addition To: gcc-patches@gcc.gnu.org Hi, this patch introduces a new function lookup_attribute_starting that can find all attributes starting with a specified string. Purpose of the function is to be able to identify e.g. if a function has any 'omp' attribute. Bootstrapped and tested on x86_64-linux. OK for trunk? Thanks, Martin 2014-05-29 Martin Liska mli...@suse.cz * tree.h (private_lookup_attribute_starting): New function. (lookup_attribute_starting): Likewise. * tree.c (private_lookup_attribute_starting): Likewise. diff --git a/gcc/tree.c b/gcc/tree.c index cf7e362..f983408 100644 --- a/gcc/tree.c +++ b/gcc/tree.c @@ -5758,6 +5758,44 @@ private_lookup_attribute (const char *attr_name, size_t attr_len, tree list) return list; } +/* Given an attribute name ATTR_NAME and a list of attributes LIST, + return a pointer to the attribute's list first element if the attribute + starts with ATTR_NAME. ATTR_NAME must be in the form 'text' (not + '__text__'). */ + +tree +private_lookup_attribute_by_prefix (const char *attr_name, size_t attr_len, +tree list) +{ + while (list) +{ + size_t ident_len = IDENTIFIER_LENGTH (get_attribute_name (list)); + + if (attr_len ident_len) + { + list = TREE_CHAIN (list); + continue; + } + + const char *p = IDENTIFIER_POINTER (get_attribute_name (list)); + + if (strncmp (attr_name, p, attr_len) == 0) + break; + + /* TODO: If we made sure that attributes were stored in the + canonical form without '__...__' (ie, as in 'text' as opposed + to '__text__') then we could avoid the following case. */ + if (p[0] == '_' p[1] == '_' + strncmp (attr_name, p + 2, attr_len) == 0) + break; + + list = TREE_CHAIN (list); +} + + return list; +} + + /* A variant of lookup_attribute() that can be used with an identifier as the first argument, and where the identifier can be either 'text' or '__text__'. diff --git a/gcc/tree.h b/gcc/tree.h index 9fe7360..e592280 100644 --- a/gcc/tree.h +++ b/gcc/tree.h @@ -3731,6 +3731,10 @@ extern tree merge_type_attributes (tree, tree); and you should never call it directly. */ extern tree private_lookup_attribute (const char *, size_t, tree); +/* This function is a private implementation detail + of lookup_attribute_by_prefix() and you should never call it directly. */ +extern tree private_lookup_attribute_by_prefix (const char *, size_t, tree); + /*
Re: [C++11, C++14 PATCH 2/3] Support for SD-6: SG10 Feature Test Recommendations - c-family and testsuite
On 05/31/2014 02:30 AM, Marc Glisse wrote: Also, I am pretty sure that gcc doesn't support the latest constexpr, we shouldn't define those macros lightly. That's correct. We should leave __cpp_constexpr at 200704 for now. I think having __has_include for all languages is fine since it is already in the implementation defined namespace. Agreed. These macros should be defined if the feature is supported. Similarly, features of later standards that we implement in earlier conformance modes as extensions (specifically, init-captures and binary literals) should have the appropriate macros defined. Jason
[PATCH, i386, Pointer Bounds Checker 18/x] Expand instrumented builtin function calls
Hi, This patch adds support for normal builtin function calls (target ones are not instrumented). The basic idea of the patch is to make call expr copy with no bounds and expand it instead. If expr is going to be emitted as a function call then original instrumented expr takes place. Handle string function separately because they are of high interest for the checker. Bootstrapped and tested on linux-x86_64. Thanks, Ilya -- gcc/ 2014-06-02 Ilya Enkovich ilya.enkov...@intel.com * builtins.c: Include rtl-chkp.h, tree-chkp.h. (expand_builtin_mempcpy_args): Add orig exp as argument. Support BUILT_IN_CHKP_MEMPCPY_NOBND_NOCHK. (expand_builtin_mempcpy): Adjust expand_builtin_mempcpy_args call. (expand_builtin_stpcpy): Likewise. (expand_builtin_memset_args): Support BUILT_IN_CHKP_MEMSET_NOBND_NOCHK. (std_expand_builtin_va_start): Initialize bounds for va_list. (expand_builtin): Support instrumented calls. diff --git a/gcc/builtins.c b/gcc/builtins.c index 7357124..c0140e1 100644 --- a/gcc/builtins.c +++ b/gcc/builtins.c @@ -59,6 +59,8 @@ along with GCC; see the file COPYING3. If not see #include builtins.h #include ubsan.h #include cilk.h +#include tree-chkp.h +#include rtl-chkp.h static tree do_mpc_arg1 (tree, tree, int (*)(mpc_ptr, mpc_srcptr, mpc_rnd_t)); @@ -124,7 +126,7 @@ static rtx builtin_memcpy_read_str (void *, HOST_WIDE_INT, enum machine_mode); static rtx expand_builtin_memcpy (tree, rtx); static rtx expand_builtin_mempcpy (tree, rtx, enum machine_mode); static rtx expand_builtin_mempcpy_args (tree, tree, tree, rtx, - enum machine_mode, int); + enum machine_mode, int, tree); static rtx expand_builtin_strcpy (tree, rtx); static rtx expand_builtin_strcpy_args (tree, tree, rtx); static rtx expand_builtin_stpcpy (tree, rtx, enum machine_mode); @@ -3284,7 +3286,8 @@ expand_builtin_mempcpy (tree exp, rtx target, enum machine_mode mode) tree src = CALL_EXPR_ARG (exp, 1); tree len = CALL_EXPR_ARG (exp, 2); return expand_builtin_mempcpy_args (dest, src, len, - target, mode, /*endp=*/ 1); + target, mode, /*endp=*/ 1, + exp); } } @@ -3296,10 +3299,23 @@ expand_builtin_mempcpy (tree exp, rtx target, enum machine_mode mode) static rtx expand_builtin_mempcpy_args (tree dest, tree src, tree len, -rtx target, enum machine_mode mode, int endp) +rtx target, enum machine_mode mode, int endp, +tree orig_exp) { + tree fndecl = get_callee_fndecl (orig_exp); + /* If return value is ignored, transform mempcpy into memcpy. */ - if (target == const0_rtx builtin_decl_implicit_p (BUILT_IN_MEMCPY)) + if (target == const0_rtx + DECL_FUNCTION_CODE (fndecl) == BUILT_IN_CHKP_MEMPCPY_NOBND_NOCHK + builtin_decl_implicit_p (BUILT_IN_CHKP_MEMCPY_NOBND_NOCHK)) +{ + tree fn = builtin_decl_implicit (BUILT_IN_CHKP_MEMCPY_NOBND_NOCHK); + tree result = build_call_nofold_loc (UNKNOWN_LOCATION, fn, 3, + dest, src, len); + return expand_expr (result, target, mode, EXPAND_NORMAL); +} + else if (target == const0_rtx + builtin_decl_implicit_p (BUILT_IN_MEMCPY)) { tree fn = builtin_decl_implicit (BUILT_IN_MEMCPY); tree result = build_call_nofold_loc (UNKNOWN_LOCATION, fn, 3, @@ -3484,7 +3500,8 @@ expand_builtin_stpcpy (tree exp, rtx target, enum machine_mode mode) lenp1 = size_binop_loc (loc, PLUS_EXPR, len, ssize_int (1)); ret = expand_builtin_mempcpy_args (dst, src, lenp1, -target, mode, /*endp=*/2); +target, mode, /*endp=*/2, +exp); if (ret) return ret; @@ -3778,7 +3795,8 @@ expand_builtin_memset_args (tree dest, tree val, tree len, do_libcall: fndecl = get_callee_fndecl (orig_exp); fcode = DECL_FUNCTION_CODE (fndecl); - if (fcode == BUILT_IN_MEMSET) + if (fcode == BUILT_IN_MEMSET + || fcode == BUILT_IN_CHKP_MEMSET_NOBND_NOCHK) fn = build_call_nofold_loc (EXPR_LOCATION (orig_exp), fndecl, 3, dest, val, len); else if (fcode == BUILT_IN_BZERO) @@ -4330,6 +4348,13 @@ std_expand_builtin_va_start (tree valist, rtx nextarg) { rtx va_r = expand_expr (valist, NULL_RTX, VOIDmode, EXPAND_WRITE); convert_move (va_r, nextarg, 0); + + /* We do not have any valid bounds for the pointer, so + just store zero bounds for it. */ + if (chkp_function_instrumented_p (current_function_decl)) +chkp_expand_bounds_reset_for_mem (valist, + make_tree (TREE_TYPE (valist), +
[PATCH, Pointer Bounds Checker 19/x] Support bounds in expand
Hi, This patch adds support for input bounds, call bounds args and returned bounds in expand pass. Bootstrapped and tested on linux-x86_64. Thanks, Ilya -- gcc/ 2014-06-02 Ilya Enkovich ilya.enkov...@intel.com * calls.c: Include tree-chkp.h, rtl-chkp.h. (arg_data): Add fields special_slot, pointer_arg and pointer_offset. (store_bounds): New. (emit_call_1): Propagate instrumentation flag for CALL. (initialize_argument_information): Compute pointer_arg, pointer_offset and special_slot for pointer bounds arguments. (finalize_must_preallocate): Preallocate when storing bounds in bounds table. (compute_argument_addresses): Skip pointer bounds. (expand_call): Store bounds into tables separately. Return result joined with resulting bounds. * cfgexpand.c: Include tree-chkp.h, rtl-chkp.h. (expand_call_stmt): Propagate bounds flag for CALL_EXPR. (expand_return): Add returned bounds arg. Handle returned bounds. (expand_gimple_stmt_1): Adjust to new expand_return signature. (gimple_expand_cfg): Reset rtx bounds map. * expr.h (store_expr): Add param for bounds target. * expr.c: Include tree-chkp.h, rtl-chkp.h. (expand_assignment): Handle returned bounds. (store_expr): Add bounds target argument. Handle bounds returned by calls. (store_constructor): Adjust to new store_expr signature. (store_field): Likewise. (expand_expr_real_2): Likewise. (expand_expr_real_1): Likewise. * function.c: Include tree-chkp.h, rtl-chkp.h. (bounds_parm_data): New. (use_register_for_decl): Do not registerize decls used for bounds stores and loads. (assign_parms_augmented_arg_list): Add bounds of the result structure pointer as the second argument. (assign_parm_find_entry_rtl): Mark bounds are never passed on the stack. (assign_parm_is_stack_parm): Likewise. (assign_parm_load_bounds): New. (assign_bounds): New. (assign_parms): Load bounds and determine a location for returned bounds. (diddle_return_value_1): New. (diddle_return_value): Handle returned bounds. * function.h (rtl_data): Add field for returned bounds. * tree-outof-ssa.c (insert_value_copy_on_edge): Adjust to new store_expr signature. diff --git a/gcc/calls.c b/gcc/calls.c index e1dc8eb..140ceb4 100644 --- a/gcc/calls.c +++ b/gcc/calls.c @@ -44,11 +44,14 @@ along with GCC; see the file COPYING3. If not see #include tm_p.h #include timevar.h #include sbitmap.h +#include bitmap.h #include langhooks.h #include target.h #include cgraph.h #include except.h #include dbgcnt.h +#include tree-chkp.h +#include rtl-chkp.h /* Like PREFERRED_STACK_BOUNDARY but in units of bytes, not bits. */ #define STACK_BYTES (PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT) @@ -76,6 +79,15 @@ struct arg_data /* If REG is a PARALLEL, this is a copy of VALUE pulled into the correct form for emit_group_move. */ rtx parallel_value; + /* If value is passed in neither reg nor stack, this field holds a number + of a special slot to be used. */ + rtx special_slot; + /* For pointer bounds hold an index of parm bounds are bound to. -1 if + there is no such pointer. */ + int pointer_arg; + /* If pointer_arg refers a structure, then pointer_offset holds an offset + of a pointer in this structure. */ + int pointer_offset; /* If REG was promoted from the actual mode of the argument expression, indicates whether the promotion is sign- or zero-extended. */ int unsignedp; @@ -133,6 +145,7 @@ static void emit_call_1 (rtx, tree, tree, tree, HOST_WIDE_INT, HOST_WIDE_INT, HOST_WIDE_INT, rtx, rtx, int, rtx, int, cumulative_args_t); static void precompute_register_parameters (int, struct arg_data *, int *); +static void store_bounds (struct arg_data *, struct arg_data *); static int store_one_arg (struct arg_data *, rtx, int, int, int); static void store_unaligned_arguments_into_pseudos (struct arg_data *, int); static int finalize_must_preallocate (int, int, struct arg_data *, @@ -396,6 +409,10 @@ emit_call_1 (rtx funexp, tree fntree ATTRIBUTE_UNUSED, tree fndecl ATTRIBUTE_UNU MEM_EXPR (funmem) != NULL_TREE) set_mem_expr (XEXP (call, 0), MEM_EXPR (funmem)); + /* Mark instrumented calls. */ + if (call fntree) +CALL_EXPR_WITH_BOUNDS_P (call) = CALL_WITH_BOUNDS_P (fntree); + /* Put the register usage information there. */ add_function_usage_to (call_insn, call_fusage); @@ -1141,18 +1158,84 @@ initialize_argument_information (int num_actuals ATTRIBUTE_UNUSED, /* First fill in the actual arguments in the ARGS array, splitting complex arguments if necessary. */ { -int j = i; +int j = i, ptr_arg = -1;
Re: [C++11, C++14 PATCH 2/3] Support for SD-6: SG10 Feature Test Recommendations - c-family and testsuite
On 06/02/2014 10:31 AM, Jason Merrill wrote: On 05/31/2014 02:30 AM, Marc Glisse wrote: Also, I am pretty sure that gcc doesn't support the latest constexpr, we shouldn't define those macros lightly. That's correct. We should leave __cpp_constexpr at 200704 for now. Right... That was a testing thing I left in by accident to make sure supercedance would work. Commented out. ;-) I think having __has_include for all languages is fine since it is already in the implementation defined namespace. Agreed. These macros should be defined if the feature is supported. I now have these for all C/C++ versions. When I implemented these I thought Why the heck hasn't the preprocessor had these for 20 years... Similarly, features of later standards that we implement in earlier conformance modes as extensions (specifically, init-captures and binary literals) should have the appropriate macros defined. Very good idea... I'll research these. unless someone has a little list somewhere...? Jason
Re: [PATCH, Pointer Bounds Checker 14/x] Pointer Bounds Checker passes
2014-06-02 17:37 GMT+04:00 Richard Biener richard.guent...@gmail.com: On Mon, Jun 2, 2014 at 2:44 PM, Ilya Enkovich enkovich@gmail.com wrote: 2014-06-02 15:35 GMT+04:00 Richard Biener richard.guent...@gmail.com: On Fri, May 30, 2014 at 2:25 PM, Ilya Enkovich enkovich@gmail.com wrote: Hi, This patch adds Pointer Bounds Checker passes. Versioning happens before early local passes. Earply local passes are split into 3 stages to have everything instrumented before any optimization applies. That looks artificial to me. If you need to split up early_local_passes then do that - nesting three IPA pass groups inside it looks odd to me. Btw - doing this in three IPA phases makes things possibly slower due to cache effects. It might be worth pursuing to move the early stage completely to the lowering pipeline. Early local passes is some special case because these passes are executed separately for new functions. I did not want to get three special passes instead and therefore made split inside. Yeah, but all passes are already executed via execute_early_local_passes, so it would be only an implementation detail. If you prefer split pass itself, I suppose pass_early_local_passes may be replaced with something like pass_build_ssa_passes + pass_chkp_instrumentation_passes + pass_ipa_chkp_produce_thunks + pass_local_optimization_passes. execute_early_local_passes would execute gimple passes lists of pass_build_ssa_passes, pass_chkp_instrumentation_passes and pass_local_optimization_passes. I think we cannot have the first stage moved into lowering passes because it should be executed for newly created functions. Well, let's defer that then. Btw, fixup_cfg only needs to run once local_pure_const was run on a callee, thus it shouldn't be neccessary to run it from the first group. OK. Will try to remove it from the first group. void pass_manager::execute_early_local_passes () { - execute_pass_list (pass_early_local_passes_1-sub); + execute_pass_list (pass_early_local_passes_1-sub-sub); + execute_pass_list (pass_early_local_passes_1-sub-next-sub); + execute_pass_list (pass_early_local_passes_1-sub-next-next-next-sub); } is gross - it should be enough to execute the early local pass list (obsolete comment with the suggestion above). This function should call only gimple passes for cfun. Therefore we cannot call IPA passes here and has to execute each gimple passes list separately. Ok, given a different split this would then become somewhat more sane anyway. Sorry, didn't catch it. Should I try a different split or defer it? :) Ilya Richard. Ilya Richard. Bootstrapped and tested on linux-x86_64. Thanks, Ilya -- gcc/ 2014-05-29 Ilya Enkovich ilya.enkov...@intel.com * tree-chkp.c: New. * tree-chkp.h: New. * rtl-chkp.c: New. * rtl-chkp.h: New. * Makefile.in (OBJS): Add tree-chkp.o, rtl-chkp.o. (GTFILES): Add tree-chkp.c. * c-family/c.opt (fchkp-check-incomplete-type): New. (fchkp-zero-input-bounds-for-main): New. (fchkp-first-field-has-own-bounds): New. (fchkp-narrow-bounds): New. (fchkp-narrow-to-innermost-array): New. (fchkp-optimize): New. (fchkp-use-fast-string-functions): New. (fchkp-use-nochk-string-functions): New. (fchkp-use-static-bounds): New. (fchkp-use-static-const-bounds): New. (fchkp-treat-zero-dynamic-size-as-infinite): New. (fchkp-check-read): New. (fchkp-check-write): New. (fchkp-store-bounds): New. (fchkp-instrument-calls): New. (fchkp-instrument-marked-only): New. * cppbuiltin.c (define_builtin_macros_for_compilation_flags): Add __CHKP__ macro when Pointer Bounds Checker is on. * passes.def (pass_ipa_chkp_versioning): New. (pass_before_local_optimization_passes): New. (pass_chkp_instrumentation_passes): New. (pass_ipa_chkp_produce_thunks): New. (pass_local_optimization_passes): New. (pass_chkp_opt): New. * toplev.c: include tree-chkp.h. (compile_file): Add chkp_finish_file call. * tree-pass.h (make_pass_ipa_chkp_versioning): New. (make_pass_ipa_chkp_produce_thunks): New. (make_pass_chkp): New. (make_pass_chkp_opt): New. (make_pass_before_local_optimization_passes): New. (make_pass_chkp_instrumentation_passes): New. (make_pass_local_optimization_passes): New. * tree.h (called_as_built_in): New. * builtins.c (called_as_built_in): Not static anymore. * passes.c (pass_manager::execute_early_local_passes): Execute early passes in three steps. (pass_data_before_local_optimization_passes): New. (pass_before_local_optimization_passes): New. (pass_data_chkp_instrumentation_passes): New.
[PATCH, Pointer Bounds Checker 20/x] Follow transparent alias chains
Hi, In the most case we follow transparent alias chains wne assemble names. But in some cases it is not performed. For instrumented functions it is critical and following patch fixes that. It also adds a visibility inheritance for instrtumented functions. Bootstrapped and tested on linux-x86_64. Thanks, Ilya -- gcc/ 2014-06-02 Ilya Enkovich ilya.enkov...@intel.com * varasm.c: Include tree-chkp.h. (ultimate_transparent_alias_target): Move up. (make_decl_rtl): For instrumented function use name of the original decl. (assemble_start_function): Mark function as global in case it is instrumentation clone of the global function. (do_assemble_alias): Follow transparent alias chain for identifier. Check if original alias is public. (maybe_assemble_visibility): Use visibility of the original function for instrumented version. (default_unique_section): Likewise. diff --git a/gcc/varasm.c b/gcc/varasm.c index fcae2fa..d473bc7 100644 --- a/gcc/varasm.c +++ b/gcc/varasm.c @@ -54,6 +54,7 @@ along with GCC; see the file COPYING3. If not see #include pointer-set.h #include asan.h #include basic-block.h +#include tree-chkp.h #ifdef XCOFF_DEBUGGING_INFO #include xcoffout.h /* Needed for external data @@ -1200,6 +1201,30 @@ use_blocks_for_decl_p (tree decl) return targetm.use_blocks_for_decl_p (decl); } +/* Follow the IDENTIFIER_TRANSPARENT_ALIAS chain starting at *ALIAS + until we find an identifier that is not itself a transparent alias. + Modify the alias passed to it by reference (and all aliases on the + way to the ultimate target), such that they do not have to be + followed again, and return the ultimate target of the alias + chain. */ + +static inline tree +ultimate_transparent_alias_target (tree *alias) +{ + tree target = *alias; + + if (IDENTIFIER_TRANSPARENT_ALIAS (target)) +{ + gcc_assert (TREE_CHAIN (target)); + target = ultimate_transparent_alias_target (TREE_CHAIN (target)); + gcc_assert (! IDENTIFIER_TRANSPARENT_ALIAS (target) + ! TREE_CHAIN (target)); + *alias = target; +} + + return target; +} + /* Create the DECL_RTL for a VAR_DECL or FUNCTION_DECL. DECL should have static storage duration. In other words, it should not be an automatic variable, including PARM_DECLs. @@ -1214,6 +1239,7 @@ make_decl_rtl (tree decl) { const char *name = 0; int reg_number; + tree id; rtx x; /* Check that we are not being given an automatic variable. */ @@ -1271,7 +1297,12 @@ make_decl_rtl (tree decl) return; } - name = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl)); + id = DECL_ASSEMBLER_NAME (decl); + if (TREE_CODE (decl) == FUNCTION_DECL + cgraph_get_node (decl) + cgraph_get_node (decl)-instrumentation_clone) +ultimate_transparent_alias_target (id); + name = IDENTIFIER_POINTER (id); if (name[0] != '*' TREE_CODE (decl) != FUNCTION_DECL DECL_REGISTER (decl)) @@ -1699,7 +1730,10 @@ assemble_start_function (tree decl, const char *fnname) /* Make function name accessible from other files, if appropriate. */ - if (TREE_PUBLIC (decl)) + if (TREE_PUBLIC (decl) + || (cgraph_get_node (decl)-instrumentation_clone + cgraph_get_node (decl)-instrumented_version + TREE_PUBLIC (cgraph_get_node (decl)-instrumented_version-decl))) { notice_global_symbol (decl); @@ -2386,30 +2420,6 @@ mark_decl_referenced (tree decl) } -/* Follow the IDENTIFIER_TRANSPARENT_ALIAS chain starting at *ALIAS - until we find an identifier that is not itself a transparent alias. - Modify the alias passed to it by reference (and all aliases on the - way to the ultimate target), such that they do not have to be - followed again, and return the ultimate target of the alias - chain. */ - -static inline tree -ultimate_transparent_alias_target (tree *alias) -{ - tree target = *alias; - - if (IDENTIFIER_TRANSPARENT_ALIAS (target)) -{ - gcc_assert (TREE_CHAIN (target)); - target = ultimate_transparent_alias_target (TREE_CHAIN (target)); - gcc_assert (! IDENTIFIER_TRANSPARENT_ALIAS (target) - ! TREE_CHAIN (target)); - *alias = target; -} - - return target; -} - /* Output to FILE (an assembly file) a reference to NAME. If NAME starts with a *, the rest of NAME is output verbatim. Otherwise NAME is transformed in a target-specific way (usually by the @@ -5544,6 +5554,9 @@ vecalias_pair, va_gc *alias_pairs; void do_assemble_alias (tree decl, tree target) { + tree orig_decl = decl; + tree id; + /* Emulated TLS had better not get this var. */ gcc_assert (!(!targetm.have_tls TREE_CODE (decl) == VAR_DECL @@ -5552,12 +5565,21 @@ do_assemble_alias (tree decl, tree target) if (TREE_ASM_WRITTEN (decl)) return; + if (TREE_CODE (decl) == FUNCTION_DECL +
Re: [C/C++ PATCH] Add -Wlogical-not-parentheses (PR c/49706)
On Mon, Jun 02, 2014 at 10:08:24AM -0400, Jason Merrill wrote: On 06/02/2014 02:50 AM, Marek Polacek wrote: + TREE_CODE (arg1.value) == EQ_EXPR) ... + TREE_CODE (current.lhs) == EQ_EXPR It seems like your version only warns about ==, while the clang version warns about all comparisons. Actually it warns about all tcc_comparison cases; the reason for EQ_EXPR checks above was that we convert the LHS from e.g. !x into x == 0 (at least in the C FE this is done in c_objc_common_truthvalue_conversion + invert_truthvalue), so I checked that. But it's useless, so I dropped those checks. + (complain_flags (decltype_p) tf_warning) Let's not call complain_flags multiple times in the function. In fact this will always be true, so you could just drop it. Good, I dropped that check too. Also for C++, I think we don't want this warning to trigger when the operator (==, !=, , , =, =) is overloaded. But I admit I haven't even tried that, and I don't know how to detect overloaded operators except DECL_OVERLOADED_OPERATOR_P. Overloaded operators have the same precedence, so I think it's appropriate to give the warning whether or not the operators are overloaded. Thanks. 2014-06-02 Marek Polacek pola...@redhat.com PR c/49706 * doc/invoke.texi: Document -Wlogical-not-parentheses. c-family/ * c-common.c (warn_logical_not_parentheses): New function. * c-common.h (warn_logical_not_parentheses): Declare. * c.opt (Wlogical-not-parentheses): New option. c/ * c-typeck.c (parser_build_binary_op): Warn when logical not is used on the left hand side operand of a comparison. cp/ * parser.c (cp_parser_binary_expression): Warn when logical not is used on the left hand side operand of a comparison. testsuite/ * c-c++-common/pr49706.c: New test. diff --git gcc/c-family/c-common.c gcc/c-family/c-common.c index 6ec14fc..650afaf 100644 --- gcc/c-family/c-common.c +++ gcc/c-family/c-common.c @@ -1722,6 +1722,25 @@ warn_logical_operator (location_t location, enum tree_code code, tree type, } } +/* Warn about logical not used on the left hand side operand of a comparison. + This function assumes that the LHS is inside of TRUTH_NOT_EXPR. + Do not warn if the LHS or RHS is of a boolean or a vector type. */ + +void +warn_logical_not_parentheses (location_t location, enum tree_code code, + tree lhs, tree rhs) +{ + if (TREE_CODE_CLASS (code) != tcc_comparison + || TREE_CODE (TREE_TYPE (lhs)) == BOOLEAN_TYPE + || TREE_CODE (TREE_TYPE (rhs)) == BOOLEAN_TYPE + || VECTOR_TYPE_P (TREE_TYPE (lhs)) + || VECTOR_TYPE_P (TREE_TYPE (rhs))) +return; + + warning_at (location, OPT_Wlogical_not_parentheses, + logical not is only applied to the left hand side of + comparison); +} /* Warn if EXP contains any computations whose results are not used. Return true if a warning is printed; false otherwise. LOCUS is the diff --git gcc/c-family/c-common.h gcc/c-family/c-common.h index 0d34004..83d5dee 100644 --- gcc/c-family/c-common.h +++ gcc/c-family/c-common.h @@ -772,6 +772,8 @@ extern void overflow_warning (location_t, tree); extern bool warn_if_unused_value (const_tree, location_t); extern void warn_logical_operator (location_t, enum tree_code, tree, enum tree_code, tree, enum tree_code, tree); +extern void warn_logical_not_parentheses (location_t, enum tree_code, tree, + tree); extern void check_main_parameter_types (tree decl); extern bool c_determine_visibility (tree); extern bool vector_types_compatible_elements_p (tree, tree); diff --git gcc/c-family/c.opt gcc/c-family/c.opt index c586e65..d51c890 100644 --- gcc/c-family/c.opt +++ gcc/c-family/c.opt @@ -490,6 +490,10 @@ Wlogical-op C ObjC C++ ObjC++ Var(warn_logical_op) Init(0) Warning Warn when a logical operator is suspiciously always evaluating to true or false +Wlogical-not-parentheses +C ObjC C++ ObjC++ Var(warn_logical_not_paren) Warning +Warn when logical not is used on the left hand side operand of a comparison + Wlong-long C ObjC C++ ObjC++ Var(warn_long_long) Init(-1) Warning Do not warn about using \long long\ when -pedantic diff --git gcc/c/c-typeck.c gcc/c/c-typeck.c index 6ca584b..e98224e 100644 --- gcc/c/c-typeck.c +++ gcc/c/c-typeck.c @@ -3401,6 +3401,10 @@ parser_build_binary_op (location_t location, enum tree_code code, warn_logical_operator (location, code, TREE_TYPE (result.value), code1, arg1.value, code2, arg2.value); + if (warn_logical_not_paren + code1 == TRUTH_NOT_EXPR) +warn_logical_not_parentheses (location, code, arg1.value, arg2.value); + /* Warn about comparisons against string literals, with the exception of testing for equality or inequality of a string literal with NULL. */ if (code == EQ_EXPR || code ==
[PATCH, Pointer Bounds Checker 21/x] Weakrefs output
Hi, This patch prevents output of both instrumented and not instrumented weakref variants. Thanks, Ilya -- gcc/ 2014-06-02 Ilya Enkovich ilya.enkov...@intel.com * cgraphunit.c (output_weakrefs): If there are both instrumented and original versions, output only one of them. diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c index c5c..ae9e699 100644 --- a/gcc/cgraphunit.c +++ b/gcc/cgraphunit.c @@ -2111,9 +2111,13 @@ static void output_weakrefs (void) { symtab_node *node; + cgraph_node *cnode; FOR_EACH_SYMBOL (node) if (node-alias !TREE_ASM_WRITTEN (node-decl) +(!(cnode = dyn_cast cgraph_node (node)) + || !cnode-instrumented_version + || !TREE_ASM_WRITTEN (cnode-instrumented_version-decl)) node-weakref) { tree target;
Patch for builtins.h restructuring [1/2]
Furthering the include file restructuring, this patch fixes the header files for builtins.h. The prototypes were spread between tree.h, expr.h and fold-const.h. * There were 5 build_* routines in builtins.c that really should be in tree.c. They are used all over the place. The prototypes were already in tree.h. * There were 3 routines in builtins.c that belonged in targhooks.c. The prototypes were already in targhooks.h * fold-const.c uses do_mpc_arg2() which is defined in builtins.c, so the prototype should be in builtins.h. The prototype utilizes type mpc_srcptr from mpc.h which means that every file which includes builtins.h would also require #include mpc.h. That is likely the reason the prototype was placed into realmpfr.h. I added #include mpc.h to builtins.h. Normally we're trying not to do that until we can rebuild a decent module structure, but I think that rule only makes sense for include files within gcc's source base... * Finally, fortran/trans.c was calling fold_builtin_call_array directly. That means it would have needed builtins.h which caused issues since builtins.h defines struct target_builtins and uses FIRST_PSEUDO_REGISTER... which is defined within the target config file... something I dont think we really want to expose to the fortran front end :-P. Anyway, looking aorund, it turns out fold-const.c::fold_build_call_array_loc is really a wrapper for a call to fold_builtin_call_array, with some extra checking code before and after the call protected by ENABLE_FOLD_CHECKING. I'd think that should be OK since its what other front ends call... Bootstraps on x86_64-unknown-linux-gnu with no regressions. I also ran it through a full set of target builds for compilation, with no new failures there. The second set of patches are the updated #includes required for compilation. OK for trunk? Andrew * expr.h: Remove prototypes of functions defined in builtins.c. * tree.h: (build_call_expr_*, build_string_literal): Add prototypes. Remove prototypes of functions defined in builtins.c. * builtins.h: Update prototype list to include all exported functions. * builtins.c: (default_libc_has_function, gnu_libc_has_function, no_c99_libc_has_function): Move to targhooks.c (build_string_literal, build_call_expr_loc_array, build_call_expr_loc_vec, build_call_expr_loc, build_call_expr): Move to tree.c. (expand_builtin_object_size, fold_builtin_object_size): Make static. * targhooks.c (default_libc_has_function, gnu_libc_has_function, no_c99_libc_has_function): Relocate from builtins.c. * tree.c: Include builtins.h. (build_call_expr_loc_array, build_call_expr_loc_vec, build_call_expr_loc, build_call_expr, build_string_literal): Relocate from builtins.c. * fold-const.h (fold_fma): Move prototype to builtins.h. * realmpfr.h (do_mpc_arg2): Move prototype to builtins.h. * fortran/trans.c (trans_runtime_error_vararg): Call fold_build_call_array_loc instead of fold_builtin_call_array. Index: expr.h === *** expr.h (revision 211131) --- expr.h (working copy) *** extern unsigned HOST_WIDE_INT choose_mul *** 252,271 int, unsigned HOST_WIDE_INT *, int *, int *); - /* Functions from builtins.c: */ - extern rtx expand_builtin (tree, rtx, rtx, enum machine_mode, int); - extern tree std_build_builtin_va_list (void); - extern tree std_fn_abi_va_list (tree); - extern tree std_canonical_va_list_type (tree); - - extern void std_expand_builtin_va_start (tree, rtx); - extern rtx default_expand_builtin (tree, rtx, rtx, enum machine_mode, int); - extern void expand_builtin_setjmp_setup (rtx, rtx); - extern void expand_builtin_setjmp_receiver (rtx); - extern rtx expand_builtin_saveregs (void); - extern void expand_builtin_trap (void); - extern rtx builtin_strncpy_read_str (void *, HOST_WIDE_INT, enum machine_mode); - /* Functions from expr.c: */ /* This is run during target initialization to set up which modes can be --- 252,257 Index: tree.h === *** tree.h (revision 211131) --- tree.h (working copy) *** extern tree build_call_valist (tree, tre *** 3624,3629 --- 3624,3634 build_call_array_loc (UNKNOWN_LOCATION, T1, T2, N, T3) extern tree build_call_array_loc (location_t, tree, tree, int, const tree *); extern tree build_call_vec (tree, tree, vectree, va_gc *); + extern tree build_call_expr_loc_array (location_t, tree, int, tree *); + extern tree build_call_expr_loc_vec (location_t, tree, vectree, va_gc *); + extern tree build_call_expr_loc (location_t, tree, int, ...); + extern tree build_call_expr (tree, int, ...); + extern tree build_string_literal (int, const char *); /* Construct various nodes representing data types. */ *** extern tree get_inner_reference (tree, H *** 4745,4790 EXP, an ARRAY_REF or an ARRAY_RANGE_REF.
Patch for builtins.h restructuring [2/2]
This is the second set of patches which updates the #includes required for compilation. I've reduced it to the base required set, except for the target config files... Most seem to require it, so in the interest of not breaking a target, I simply include builtins.h in all target config files that included tree.h or expr.h. GO changes can now be directly checked in, is that right Ian? Andrew * asan.c: Include builtins.h. * cfgexpand.c: Likewise. * convert.c: Likewise. * emit-rtl.c: Likewise. * except.c: Likewise. * expr.c: Likewise. * fold-const.c: Likewise. * gimple-fold.c: Likewise. * gimple-ssa-strength-reduction.c: Likewise. * gimplify.c: Likewise. * ipa-inline.c: Likewise. * ipa-prop.c: Likewise. * lto-streamer-out.c: Likewise. * stmt.c: Likewise. * tree-inline.c: Likewise. * tree-object-size.c: Likewise. * tree-sra.c: Likewise. * tree-ssa-ccp.c: Likewise. * tree-ssa-forwprop.c: Likewise. * tree-ssa-loop-ivcanon.c: Likewise. * tree-ssa-loop-ivopts.c: Likewise. * tree-ssa-math-opts.c: Likewise. * tree-ssa-reassoc.c: Likewise. * tree-ssa-threadedge.c: Likewise. * tree-streamer-in.c: Likewise. * tree-vect-data-refs.c: Likewise. * tree-vect-patterns.c: Likewise. * tree-vect-stmts.c: Likewise. c * c-decl.c: Include builtins.h. * c-parser.c: Likewise. cp * decl.c: Include builtins.h. * semantics.c: Likewise. go * go-gcc.cc: Include builtins.h. lto * lto-symtab.c: Include builtins.h. config * aarch64/aarch64.c: Include builtins.h. * alpha/alpha.c: Likewise. * arc/arc.c: Likewise. * arm/arm.c: Likewise. * avr/avr.c: Likewise. * bfin/bfin.c: Likewise. * c6x/c6x.c: Likewise. * cr16/cr16.c: Likewise. * cris/cris.c: Likewise. * epiphany/epiphany.c: Likewise. * fr30/fr30.c: Likewise. * frv/frv.c: Likewise. * h8300/h8300.c: Likewise. * i386/i386.c: Likewise. * i386/winnt.c: Likewise. * ia64/ia64.c: Likewise. * iq2000/iq2000.c: Likewise. * lm32/lm32.c: Likewise. * m32c/m32c.c: Likewise. * m32r/m32r.c: Likewise. * m68k/m68k.c: Likewise. * mcore/mcore.c: Likewise. * mep/mep.c: Likewise. * microblaze/microblaze.c: Likewise. * mips/mips.c: Likewise. * mmix/mmix.c: Likewise. * mn10300/mn10300.c: Likewise. * moxie/moxie.c: Likewise. * msp430/msp430.c: Likewise. * nds32/nds32.c: Likewise. * pa/pa.c: Likewise. * pdp11/pdp11.c: Likewise. * picochip/picochip.c: Likewise. * rl78/rl78.c: Likewise. * rs6000/rs6000.c: Likewise. * rx/rx.c: Likewise. * s390/s390.c: Likewise. * score/score.c: Likewise. * sh/sh.c: Likewise. * sparc/sparc.c: Likewise. * spu/spu.c: Likewise. * stormy16/stormy16.c: Likewise. * tilegx/tilegx.c: Likewise. * tilepro/tilepro.c: Likewise. * v850/v850.c: Likewise. * vax/vax.c: Likewise. * xtensa/xtensa.c: Likewise. Index: asan.c === *** asan.c (revision 211131) --- asan.c (working copy) *** along with GCC; see the file COPYING3. *** 54,59 --- 54,60 #include ubsan.h #include predict.h #include params.h + #include builtins.h /* AddressSanitizer finds out-of-bounds and use-after-free bugs with 2x slowdown on average. Index: cfgexpand.c === *** cfgexpand.c (revision 211131) --- cfgexpand.c (working copy) *** along with GCC; see the file COPYING3. *** 73,78 --- 73,79 #include tree-ssa-address.h #include recog.h #include output.h + #include builtins.h /* Some systems use __main in a way incompatible with its use in gcc, in these cases use the macros NAME__MAIN to give a quoted symbol and SYMBOL__MAIN to Index: convert.c === *** convert.c (revision 211131) --- convert.c (working copy) *** along with GCC; see the file COPYING3. *** 32,37 --- 32,38 #include diagnostic-core.h #include target.h #include langhooks.h + #include builtins.h #include ubsan.h /* Convert EXPR to some pointer or reference type TYPE. Index: emit-rtl.c === *** emit-rtl.c (revision 211131) --- emit-rtl.c (working copy) *** along with GCC; see the file COPYING3. *** 57,62 --- 57,63 #include df.h #include params.h #include target.h + #include builtins.h struct target_rtl default_target_rtl; #if SWITCHABLE_TARGET Index: except.c === *** except.c (revision 211131) --- except.c (working copy) *** along with GCC; see the file COPYING3. *** 141,146 --- 141,147 #include tree-pass.h #include pointer-set.h #include cfgloop.h + #include builtins.h /* Provide defaults for stuff that may not be defined when using sjlj exceptions. */ Index: expr.c === *** expr.c (revision 211131) --- expr.c (working copy)
[PATCH] Add patch for debugging compiler ICEs
Hi, A years ago there was a discussion (https://gcc.gnu.org/ml/gcc-patches/2004-01/msg02437.html) about debugging compiler ICEs that resulted in a patch from Jakub, which dumps useful information into temporary file, but for some reasons this patch wasn't applied to trunk. This is the resurrected patch with added GCC version information into generated repro file. -Maxim 2014-06-02 Jakub Jelinek ja...@redhat.com Max Ostapenko m.ostape...@partner.samsung.com * diagnostic.c (diagnostic_action_after_output): Exit with ICE_EXIT_CODE instead of FATAL_EXIT_CODE. * gcc.c (execute): Don't free first string early, but at the end of the function. Call retry_ice if compiler exited with ICE_EXIT_CODE. (main): Factor out common code. (print_configuration): New function. (try_fork): Likewise. (redirect_stdout_stderr): Likewise. (files_equal_p): Likewise. (check_repro): Likewise. (run_attempt): Likewise. (generate_preprocessed_code): Likewise. (append_text): Likewise. (try_generate_repro): Likewise. diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c index 0cc7593..67b8c5b 100644 --- a/gcc/diagnostic.c +++ b/gcc/diagnostic.c @@ -492,7 +492,7 @@ diagnostic_action_after_output (diagnostic_context *context, real_abort (); diagnostic_finish (context); fnotice (stderr, compilation terminated.\n); - exit (FATAL_EXIT_CODE); + exit (ICE_EXIT_CODE); default: gcc_unreachable (); diff --git a/gcc/gcc.c b/gcc/gcc.c index 9ac18e6..86dce03 100644 --- a/gcc/gcc.c +++ b/gcc/gcc.c @@ -43,6 +43,13 @@ compilation is specified by a string called a spec. */ #include params.h #include vec.h #include filenames.h +#ifdef HAVE_UNISTD_H +#include unistd.h +#endif + +#if !(defined (__MSDOS__) || defined (OS2) || defined (VMS)) +#define RETRY_ICE_SUPPORTED +#endif /* By default there is no special suffix for target executables. */ /* FIXME: when autoconf is fixed, remove the host check - dj */ @@ -253,6 +260,9 @@ static void init_gcc_specs (struct obstack *, const char *, const char *, static const char *convert_filename (const char *, int, int); #endif +#ifdef RETRY_ICE_SUPPORTED +static void try_generate_repro (const char *prog, const char **argv); +#endif static const char *getenv_spec_function (int, const char **); static const char *if_exists_spec_function (int, const char **); static const char *if_exists_else_spec_function (int, const char **); @@ -2797,7 +2807,7 @@ execute (void) } } - if (string != commands[i].prog) + if (i string != commands[i].prog) free (CONST_CAST (char *, string)); } @@ -2850,6 +2860,16 @@ execute (void) else if (WIFEXITED (status) WEXITSTATUS (status) = MIN_FATAL_STATUS) { +#ifdef RETRY_ICE_SUPPORTED + /* For ICEs in cc1, cc1obj, cc1plus see if it is + reproducible or not. */ + const char *p; + if (WEXITSTATUS (status) == ICE_EXIT_CODE + i == 0 + (p = strrchr (commands[0].argv[0], DIR_SEPARATOR)) + ! strncmp (p + 1, cc1, 3)) + try_generate_repro (commands[0].prog, commands[0].argv); +#endif if (WEXITSTATUS (status) greatest_status) greatest_status = WEXITSTATUS (status); ret_code = -1; @@ -2907,6 +2927,9 @@ execute (void) } } + if (commands[0].argv[0] != commands[0].prog) + free (CONST_CAST (char *, commands[0].argv[0])); + return ret_code; } } @@ -6098,6 +6121,342 @@ give_switch (int switchnum, int omit_first_word) switches[switchnum].validated = true; } +static void +print_configuration (void) +{ + int n; + const char *thrmod; + + fnotice (stderr, Target: %s\n, spec_machine); + fnotice (stderr, Configured with: %s\n, configuration_arguments); + +#ifdef THREAD_MODEL_SPEC + /* We could have defined THREAD_MODEL_SPEC to %* by default, + but there's no point in doing all this processing just to get + thread_model back. */ + obstack_init (obstack); + do_spec_1 (THREAD_MODEL_SPEC, 0, thread_model); + obstack_1grow (obstack, '\0'); + thrmod = XOBFINISH (obstack, const char *); +#else + thrmod = thread_model; +#endif + + fnotice (stderr, Thread model: %s\n, thrmod); + + /* compiler_version is truncated at the first space when initialized + from version string, so truncate version_string at the first space + before comparing. */ + for (n = 0; version_string[n]; n++) +if (version_string[n] == ' ') + break; + + if (! strncmp (version_string, compiler_version, n) + compiler_version[n] == 0) +fnotice (stderr, gcc version %s %s\n\n, version_string, + pkgversion_string); + else +fnotice (stderr, gcc driver version %s %sexecuting gcc version %s\n\n, + version_string, pkgversion_string, compiler_version); +} + +#ifdef RETRY_ICE_SUPPORTED +#define RETRY_ICE_ATTEMPTS 3 + +static int +try_fork () +{ + int pid, retries, sleep_interval; + sleep_interval = 1; + pid = -1; + for (retries = 0; retries 4; retries++) +{ + pid = fork (); + if
Re: [PATCH, Pointer Bounds Checker 19/x] Support bounds in expand
Hi, On Mon, 2 Jun 2014, Ilya Enkovich wrote: This patch adds support for input bounds, call bounds args and returned bounds in expand pass. * expr.h (store_expr): Add param for bounds target. There is exactly one place (except for the self-recursive ones) where you call the new store_expr with a non-null argument for bounds target, and it seems to be only necessary for when some sub-expression of the RHS is a call. Can you somehow arrange to move that handling to the single place in expand_assignment() so that you don't need to change the signature of store_expr? Ciao, Michael.
Re: Patch for builtins.h restructuring [2/2]
On Mon, Jun 2, 2014 at 8:26 AM, Andrew MacLeod amacl...@redhat.com wrote: GO changes can now be directly checked in, is that right Ian? To be pedantically clear, it's always been OK to directly in changes to the files in gcc/go, but not to the files in gcc/go/gofrontend. That is still true. What has changed recently is that the files in gcc/go/gofrontend do not #include any GCC header files, so all general GCC changes now only affect the files in gcc/go but not the ones in gcc/go/gofrontend, and are therefore fine to check in directly. In any case, this patch is fine from a Go frontend point of view. Ian
[PATCH] rebuild frequency after vrp
This patch rebuilds frequency after vrp. Bootstrapped and testing on-going. OK for trunk if test pass? Thanks, Dehao gcc/ChangeLog: 2014-06-02 Dehao Chen de...@google.com PR tree-optimization/61384 * tree-vrp.c (execute_vrp): rebuild frequency after vrp. gcc/testsuite/ChangeLog: 2014-06-02 Dehao Chen de...@google.com PR tree-optimization/61384 * gcc.dg/pr61384.c: New testcase. Index: gcc/testsuite/gcc.dg/pr61384.c === --- gcc/testsuite/gcc.dg/pr61384.c (revision 0) +++ gcc/testsuite/gcc.dg/pr61384.c (revision 0) @@ -0,0 +1,32 @@ +/* PR tree-optimization/61384 */ +/* { dg-do compile } */ +/* { dg-options -O3 } */ + +int a, b, d, e, g, h; +static int c = 1, f[5]; + +int +fn1 (int p1, int p2) +{ + return p1 p2 p2; +} + +void +fn2 () +{ + for (d = 0; d 1; d++) +{ + g = f[0]; + e = 0; + h = fn1 (1, (a c) ^ b); +} + for (; e 5; e++) +f[e] = 1; +} + +int +main () +{ + fn2 (); + return 0; +} Index: gcc/tree-vrp.c === --- gcc/tree-vrp.c (revision 211137) +++ gcc/tree-vrp.c (working copy) @@ -9794,7 +9794,7 @@ execute_vrp (void) scev_finalize (); loop_optimizer_finalize (); - return 0; + return TODO_rebuild_frequencies; } namespace {
Re: Eliminate write-only variables
Well, I'm hesitant to add a new pass just to optimize a (irrelevant in practice) benchmark. I'm ok with strengthening existing infrastructure or enhancing existing passes. The issue with these mini-passes is that they are very placement sensitive and you don't easily get secondary effects. See the comment (and patch) about DCE - the ??? comment can be addressed the same way Bernd addressed it (use cgraph_function_possibly_inlined_p). Would that optimize the testcase? Note that the issue with nested function use prevails (not sure how to solve that - same issue in Bernds patch). I think this part of the patch can be made much cleaner by simply adding flag used by one function only to variables. We can compute it at the end of IPA queue and remove the code playing with local statics, nested functions and inlines. I can easily implement this analysis - perhaps it would be useful for AA or something else, too? Honza Thanks, Richard. -Sandra
Re: [patch] Update catch(...) handlers to deal with __forced_unwind
On 14/05/14 20:37 +0100, Jonathan Wakely wrote: Failing to rethrow a __forced_unwind exception is very bad. This patch ensures we rethrow them in async tasks, and makes the shared state ready with a broken_promise so that waiting threads don't block forever. That seems reasonable to me, does anyone have any better ideas? Tested x86_64-linux, will wait for feedback before committing. Committed to trunk. commit 8d4a49e0391269380b160bd277339f740716de0c Author: Jonathan Wakely jwak...@redhat.com Date: Tue May 13 15:35:29 2014 +0100 * include/std/condition_variable (condition_variable_any::_Unlock): Do not swallow __forced_unwind. * include/std/future (__future_base::_Task_setter): Likewise. (__future_base::_Async_state_impl): Turn __forced_unwind into broken promise and rethrow. * include/std/mutex (try_lock): Likewise. * testsuite/30_threads/async/forced_unwind.cc: New. * testsuite/30_threads/packaged_task/forced_unwind.cc: New. diff --git a/libstdc++-v3/include/std/condition_variable b/libstdc++-v3/include/std/condition_variable index fc111dd..921cb83 100644 --- a/libstdc++-v3/include/std/condition_variable +++ b/libstdc++-v3/include/std/condition_variable @@ -189,7 +189,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION ~_Unlock() noexcept(false) { if (uncaught_exception()) - __try { _M_lock.lock(); } __catch(...) { } + { + __try + { _M_lock.lock(); } + __catch(const __cxxabiv1::__forced_unwind) + { __throw_exception_again; } + __catch(...) + { } + } else _M_lock.lock(); } diff --git a/libstdc++-v3/include/std/future b/libstdc++-v3/include/std/future index 717ce71..8972ecf 100644 --- a/libstdc++-v3/include/std/future +++ b/libstdc++-v3/include/std/future @@ -1231,6 +1231,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION { _M_result-_M_set(_M_fn()); } + __catch(const __cxxabiv1::__forced_unwind) + { + __throw_exception_again; // will cause broken_promise + } __catch(...) { _M_result-_M_error = current_exception(); @@ -1250,6 +1254,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION { _M_fn(); } + __catch(const __cxxabiv1::__forced_unwind) + { + __throw_exception_again; // will cause broken_promise + } __catch(...) { _M_result-_M_error = current_exception(); @@ -1510,7 +1518,17 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION : _M_result(new _Result_Res()), _M_fn(std::move(__fn)) { _M_thread = std::thread{ [this] { - _M_set_result(_S_task_setter(_M_result, _M_fn)); + __try + { + _M_set_result(_S_task_setter(_M_result, _M_fn)); + } + __catch (const __cxxabiv1::__forced_unwind) + { + // make the shared state ready on thread cancellation + if (static_castbool(_M_result)) + this-_M_break_promise(std::move(_M_result)); + __throw_exception_again; + } } }; } diff --git a/libstdc++-v3/include/std/mutex b/libstdc++-v3/include/std/mutex index 3d70754..f6b851c 100644 --- a/libstdc++-v3/include/std/mutex +++ b/libstdc++-v3/include/std/mutex @@ -44,6 +44,7 @@ #include bits/functexcept.h #include bits/gthr.h #include bits/move.h // for std::swap +#include bits/cxxabi_forced.h #ifdef _GLIBCXX_USE_C99_STDINT_TR1 @@ -631,6 +632,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION auto __locks = std::tie(__l1, __l2, __l3...); __try { __try_lock_impl0::__do_try_lock(__locks, __idx); } + __catch(const __cxxabiv1::__forced_unwind) + { __throw_exception_again; } __catch(...) { } return __idx; diff --git a/libstdc++-v3/testsuite/30_threads/async/forced_unwind.cc b/libstdc++-v3/testsuite/30_threads/async/forced_unwind.cc new file mode 100644 index 000..7b0a492 --- /dev/null +++ b/libstdc++-v3/testsuite/30_threads/async/forced_unwind.cc @@ -0,0 +1,45 @@ +// { dg-do run { target *-*-linux* *-*-gnu* } } +// { dg-options -std=gnu++11 -pthread { target *-*-linux* *-*-gnu* } } +// { dg-require-cstdint } +// { dg-require-gthreads } +// { dg-require-atomic-builtins } + +// Copyright (C) 2014 Free Software Foundation, Inc. +// +// This file is part of the GNU ISO C++ Library. This library is free +// software; you can redistribute it and/or modify it under the +// terms of the GNU General Public License as published by the +// Free Software Foundation; either version 3, or (at your option) +// any later version. + +// This library is distributed in the hope that it will be useful, +// but WITHOUT ANY WARRANTY; without even the implied warranty of +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +// GNU General Public License for more
Re: [PATCH, Pointer Bounds Checker 19/x] Support bounds in expand
2014-06-02 19:28 GMT+04:00 Michael Matz m...@suse.de: Hi, On Mon, 2 Jun 2014, Ilya Enkovich wrote: This patch adds support for input bounds, call bounds args and returned bounds in expand pass. * expr.h (store_expr): Add param for bounds target. There is exactly one place (except for the self-recursive ones) where you call the new store_expr with a non-null argument for bounds target, and it seems to be only necessary for when some sub-expression of the RHS is a call. Can you somehow arrange to move that handling to the single place in expand_assignment() so that you don't need to change the signature of store_expr? I see the only nice way to do it - store_expr should return bounds of expanded exp. Currently it always return NULL_RTX. Does it look better than a new argument? Ilya Ciao, Michael.
[PATCH, Pointer Bounds Checker 22/x] Inline
Hi, This patch adds support for inlining instrumented calls. Changes are mostly to support returned bounds. Also generated mem-to-mem assignments are registered to be later instrumented with appropriate bounds copy. Bootstrapped and tested on linux-x86_64. Thanks, Ilya -- gcc/ 2014-06-02 Ilya Enkovich ilya.enkov...@intel.com * ipa-inline.c (early_inliner): Check edge has summary allocated. * tree-inline.c: Include tree-chkp.h. (declare_return_variable): Add arg holding returned bounds slot. Create and initialize returned bounds var. (remap_gimple_stmt): Handle returned bounds. Return sequence of statements instead of a single statement. (insert_init_stmt): Add declaration. (remap_gimple_seq): Adjust to new remap_gimple_stmt signature. (copy_bb): Adjust to changed return type of remap_gimple_stmt. (expand_call_inline): Handle returned bounds. Add bounds copy for generated mem to mem assignments. * tree-inline.h (copy_body_data): Add fields retbnd and assign_stmts. * cgraph.c: Include tree-chkp.h. (cgraph_redirect_edge_call_stmt_to_callee): Support returned bounds. * value-prof.c: Include tree-chkp.h. (gimple_ic): Support returned bounds. diff --git a/gcc/cgraph.c b/gcc/cgraph.c index 1f684c2..4b6996b 100644 --- a/gcc/cgraph.c +++ b/gcc/cgraph.c @@ -63,6 +63,7 @@ along with GCC; see the file COPYING3. If not see #include gimple-pretty-print.h #include expr.h #include tree-dfa.h +#include tree-chkp.h /* FIXME: Only for PROP_loops, but cgraph shouldn't have to know about this. */ #include tree-pass.h @@ -1398,6 +1399,31 @@ cgraph_redirect_edge_call_stmt_to_callee (struct cgraph_edge *e) e-speculative = false; cgraph_set_call_stmt_including_clones (e-caller, e-call_stmt, new_stmt, false); + + /* Fix edges for BUILT_IN_CHKP_BNDRET calls attached to the +processed call stmt. */ + if (gimple_call_with_bounds_p (new_stmt) + gimple_call_lhs (new_stmt) + chkp_retbnd_call_by_val (gimple_call_lhs (e2-call_stmt))) + { + tree dresult = gimple_call_lhs (new_stmt); + tree iresult = gimple_call_lhs (e2-call_stmt); + gimple dbndret = chkp_retbnd_call_by_val (dresult); + gimple ibndret = chkp_retbnd_call_by_val (iresult); + struct cgraph_edge *iedge = cgraph_edge (e2-caller, ibndret); + struct cgraph_edge *dedge; + + if (dbndret) + { + dedge = cgraph_create_edge (iedge-caller, iedge-callee, + dbndret, e-count, + e-frequency); + dedge-frequency = compute_call_stmt_bb_frequency + (dedge-caller-decl, gimple_bb (dedge-call_stmt)); + } + iedge-frequency = compute_call_stmt_bb_frequency + (iedge-caller-decl, gimple_bb (iedge-call_stmt)); + } e-frequency = compute_call_stmt_bb_frequency (e-caller-decl, gimple_bb (e-call_stmt)); e2-frequency = compute_call_stmt_bb_frequency diff --git a/gcc/ipa-inline.c b/gcc/ipa-inline.c index 4051819..a6fc853 100644 --- a/gcc/ipa-inline.c +++ b/gcc/ipa-inline.c @@ -2301,11 +2301,15 @@ early_inliner (void) info that might be cleared out for newly discovered edges. */ for (edge = node-callees; edge; edge = edge-next_callee) { - struct inline_edge_summary *es = inline_edge_summary (edge); - es-call_stmt_size - = estimate_num_insns (edge-call_stmt, eni_size_weights); - es-call_stmt_time - = estimate_num_insns (edge-call_stmt, eni_time_weights); + /* We have no summary for new bound store calls yet. */ + if (inline_edge_summary_vec.length () (unsigned)edge-uid) + { + struct inline_edge_summary *es = inline_edge_summary (edge); + es-call_stmt_size + = estimate_num_insns (edge-call_stmt, eni_size_weights); + es-call_stmt_time + = estimate_num_insns (edge-call_stmt, eni_time_weights); + } if (edge-callee-decl !gimple_check_call_matching_types ( edge-call_stmt, edge-callee-decl, false)) diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c index 23fef90..6557a95 100644 --- a/gcc/tree-inline.c +++ b/gcc/tree-inline.c @@ -67,6 +67,7 @@ along with GCC; see the file COPYING3. If not see #include tree-pass.h #include target.h #include cfgloop.h +#include tree-chkp.h #include rtl.h /* FIXME: For asm_str_count. */ @@ -130,7 +131,8 @@ eni_weights eni_time_weights; /* Prototypes. */
Re: [C/C++ PATCH] Add -Wlogical-not-parentheses (PR c/49706)
On 06/02/2014 11:17 AM, Marek Polacek wrote: + !processing_template_decl Also, why not warn when parsing a template? We don't need to know the type to recognize the problematic syntax. Jason
Re: [PATCH] Fix PR ipa/61190
Hi, On Mon, 2 Jun 2014 12:06:12, Richard Biener wrote: On Mon, Jun 2, 2014 at 11:00 AM, Bernd Edlinger bernd.edlin...@hotmail.de wrote: Hi, the test case g++.old-deja/g++.mike/p4736b.C is mis-compiled with with all optimization levels, except -O0 and -Og. This probably started with gcc 4.7.x. The constructor Main::Main() is first inlined and then completely optimized away in the dce1 pass. But the thunk is still using the vtable, and therefore crashes. Why the ctor is eliminated? Is it because ipa-pure-const identifies the thunk as const? I think we need to update it to notice the read of vtable in those thunks. I will take a look. Honza Well, the problem seems to be, that the thunk code is not emitted in the normal way using gimple code, but instead by the i386 back-end with a callback. This makes the thunk invisible to the ipa-pure-const pass, which assumes that all values just flow straight thu the thunk. See ipa-pure-const.c (analyze_function): if (fn-thunk.thunk_p || fn-alias) { /* Thunk gets propagated through, so nothing interesting happens. */ gcc_assert (ipa); return l; } But this is not true for a virtual base class: The thunk itself references the vtable, so if nothing else might need the vtable, the optimizer may remove the initalization of the vtable, which happened in this example. The most simple work around would be the attached patch, which simply falls back to emitting gimple code, if virtual base classes are used. Boot-Strapped and Regression-Tested on x86_64-linux-gnu. Ok for trunk? Honza should know better. I'd rather skip the above for thunks going the output_mi_thunk way. Ahh Yes, that was of course my first attempt... But there is not BB to enumerate in that case. And the loop below just crashed in: FOR_EACH_BB_FN (this_block, cfun) There is also another interesting thing to mention: when the output_mi_thunk outputs the virtual thunk, there is no inlining at all. But with this patch, the thunk inlines the called function, and therefore the generated code looks rather nifty, compared to the state before the patch. Thanks Bernd. That is, the target hook docs should be updated to reflect which kind of thunks it is supposed to handle. Richard. Thanks Bernd.
Re: [PATCH] rebuild frequency after vrp
This patch rebuilds frequency after vrp. Why do you need to rebuild frequency after VRP? I always tought it may be useful to do VRP as early optimization (modulo to compile time costs), but I do not think we should unconditionally rebuild frequencies like this... Honza Bootstrapped and testing on-going. OK for trunk if test pass? Thanks, Dehao gcc/ChangeLog: 2014-06-02 Dehao Chen de...@google.com PR tree-optimization/61384 * tree-vrp.c (execute_vrp): rebuild frequency after vrp. gcc/testsuite/ChangeLog: 2014-06-02 Dehao Chen de...@google.com PR tree-optimization/61384 * gcc.dg/pr61384.c: New testcase. Index: gcc/testsuite/gcc.dg/pr61384.c === --- gcc/testsuite/gcc.dg/pr61384.c (revision 0) +++ gcc/testsuite/gcc.dg/pr61384.c (revision 0) @@ -0,0 +1,32 @@ +/* PR tree-optimization/61384 */ +/* { dg-do compile } */ +/* { dg-options -O3 } */ + +int a, b, d, e, g, h; +static int c = 1, f[5]; + +int +fn1 (int p1, int p2) +{ + return p1 p2 p2; +} + +void +fn2 () +{ + for (d = 0; d 1; d++) +{ + g = f[0]; + e = 0; + h = fn1 (1, (a c) ^ b); +} + for (; e 5; e++) +f[e] = 1; +} + +int +main () +{ + fn2 (); + return 0; +} Index: gcc/tree-vrp.c === --- gcc/tree-vrp.c (revision 211137) +++ gcc/tree-vrp.c (working copy) @@ -9794,7 +9794,7 @@ execute_vrp (void) scev_finalize (); loop_optimizer_finalize (); - return 0; + return TODO_rebuild_frequencies; } namespace {
Re: [PATCH] rebuild frequency after vrp
We need to rebuild frequency after vrp, otherwise the following code in tree-ssa-threadupdate.c will make the frequency larger than upper-bound. /* Excessive jump threading may make frequencies large enough so the computation overflows. */ if (rd-dup_blocks[0]-frequency BB_FREQ_MAX * 2) rd-dup_blocks[0]-frequency += EDGE_FREQUENCY (e); This is referring to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61384 Thanks, Dehao On Mon, Jun 2, 2014 at 9:13 AM, Jan Hubicka hubi...@ucw.cz wrote: This patch rebuilds frequency after vrp. Why do you need to rebuild frequency after VRP? I always tought it may be useful to do VRP as early optimization (modulo to compile time costs), but I do not think we should unconditionally rebuild frequencies like this... Honza Bootstrapped and testing on-going. OK for trunk if test pass? Thanks, Dehao gcc/ChangeLog: 2014-06-02 Dehao Chen de...@google.com PR tree-optimization/61384 * tree-vrp.c (execute_vrp): rebuild frequency after vrp. gcc/testsuite/ChangeLog: 2014-06-02 Dehao Chen de...@google.com PR tree-optimization/61384 * gcc.dg/pr61384.c: New testcase. Index: gcc/testsuite/gcc.dg/pr61384.c === --- gcc/testsuite/gcc.dg/pr61384.c (revision 0) +++ gcc/testsuite/gcc.dg/pr61384.c (revision 0) @@ -0,0 +1,32 @@ +/* PR tree-optimization/61384 */ +/* { dg-do compile } */ +/* { dg-options -O3 } */ + +int a, b, d, e, g, h; +static int c = 1, f[5]; + +int +fn1 (int p1, int p2) +{ + return p1 p2 p2; +} + +void +fn2 () +{ + for (d = 0; d 1; d++) +{ + g = f[0]; + e = 0; + h = fn1 (1, (a c) ^ b); +} + for (; e 5; e++) +f[e] = 1; +} + +int +main () +{ + fn2 (); + return 0; +} Index: gcc/tree-vrp.c === --- gcc/tree-vrp.c (revision 211137) +++ gcc/tree-vrp.c (working copy) @@ -9794,7 +9794,7 @@ execute_vrp (void) scev_finalize (); loop_optimizer_finalize (); - return 0; + return TODO_rebuild_frequencies; } namespace {
Re: [C/C++ PATCH] Add -Wlogical-not-parentheses (PR c/49706)
On Mon, Jun 02, 2014 at 12:01:22PM -0400, Jason Merrill wrote: On 06/02/2014 11:17 AM, Marek Polacek wrote: + !processing_template_decl Also, why not warn when parsing a template? We don't need to know the type to recognize the problematic syntax. I have no special reason for that check, so dropped it too in this patch. Ran dg.exp with this warning enabled in -Wall; no regressions. 2014-06-02 Marek Polacek pola...@redhat.com PR c/49706 * doc/invoke.texi: Document -Wlogical-not-parentheses. c-family/ * c-common.c (warn_logical_not_parentheses): New function. * c-common.h (warn_logical_not_parentheses): Declare. * c.opt (Wlogical-not-parentheses): New option. c/ * c-typeck.c (parser_build_binary_op): Warn when logical not is used on the left hand side operand of a comparison. cp/ * parser.c (cp_parser_binary_expression): Warn when logical not is used on the left hand side operand of a comparison. testsuite/ * c-c++-common/pr49706.c: New test. diff --git gcc/c-family/c-common.c gcc/c-family/c-common.c index 6ec14fc..650afaf 100644 --- gcc/c-family/c-common.c +++ gcc/c-family/c-common.c @@ -1722,6 +1722,25 @@ warn_logical_operator (location_t location, enum tree_code code, tree type, } } +/* Warn about logical not used on the left hand side operand of a comparison. + This function assumes that the LHS is inside of TRUTH_NOT_EXPR. + Do not warn if the LHS or RHS is of a boolean or a vector type. */ + +void +warn_logical_not_parentheses (location_t location, enum tree_code code, + tree lhs, tree rhs) +{ + if (TREE_CODE_CLASS (code) != tcc_comparison + || TREE_CODE (TREE_TYPE (lhs)) == BOOLEAN_TYPE + || TREE_CODE (TREE_TYPE (rhs)) == BOOLEAN_TYPE + || VECTOR_TYPE_P (TREE_TYPE (lhs)) + || VECTOR_TYPE_P (TREE_TYPE (rhs))) +return; + + warning_at (location, OPT_Wlogical_not_parentheses, + logical not is only applied to the left hand side of + comparison); +} /* Warn if EXP contains any computations whose results are not used. Return true if a warning is printed; false otherwise. LOCUS is the diff --git gcc/c-family/c-common.h gcc/c-family/c-common.h index 0d34004..83d5dee 100644 --- gcc/c-family/c-common.h +++ gcc/c-family/c-common.h @@ -772,6 +772,8 @@ extern void overflow_warning (location_t, tree); extern bool warn_if_unused_value (const_tree, location_t); extern void warn_logical_operator (location_t, enum tree_code, tree, enum tree_code, tree, enum tree_code, tree); +extern void warn_logical_not_parentheses (location_t, enum tree_code, tree, + tree); extern void check_main_parameter_types (tree decl); extern bool c_determine_visibility (tree); extern bool vector_types_compatible_elements_p (tree, tree); diff --git gcc/c-family/c.opt gcc/c-family/c.opt index c586e65..d51c890 100644 --- gcc/c-family/c.opt +++ gcc/c-family/c.opt @@ -490,6 +490,10 @@ Wlogical-op C ObjC C++ ObjC++ Var(warn_logical_op) Init(0) Warning Warn when a logical operator is suspiciously always evaluating to true or false +Wlogical-not-parentheses +C ObjC C++ ObjC++ Var(warn_logical_not_paren) Warning +Warn when logical not is used on the left hand side operand of a comparison + Wlong-long C ObjC C++ ObjC++ Var(warn_long_long) Init(-1) Warning Do not warn about using \long long\ when -pedantic diff --git gcc/c/c-typeck.c gcc/c/c-typeck.c index 6ca584b..e98224e 100644 --- gcc/c/c-typeck.c +++ gcc/c/c-typeck.c @@ -3401,6 +3401,10 @@ parser_build_binary_op (location_t location, enum tree_code code, warn_logical_operator (location, code, TREE_TYPE (result.value), code1, arg1.value, code2, arg2.value); + if (warn_logical_not_paren + code1 == TRUTH_NOT_EXPR) +warn_logical_not_parentheses (location, code, arg1.value, arg2.value); + /* Warn about comparisons against string literals, with the exception of testing for equality or inequality of a string literal with NULL. */ if (code == EQ_EXPR || code == NE_EXPR) diff --git gcc/cp/parser.c gcc/cp/parser.c index 2591ae5..60e6cda 100644 --- gcc/cp/parser.c +++ gcc/cp/parser.c @@ -7883,6 +7883,8 @@ cp_parser_binary_expression (cp_parser* parser, bool cast_p, enum tree_code rhs_type; enum cp_parser_prec new_prec, lookahead_prec; tree overload; + bool parenthesized_not_lhs_warn += cp_lexer_next_token_is (parser-lexer, CPP_NOT); /* Parse the first expression. */ current.lhs = cp_parser_cast_expression (parser, /*address_p=*/false, @@ -7985,6 +7987,11 @@ cp_parser_binary_expression (cp_parser* parser, bool cast_p, else if (current.tree_type == TRUTH_ORIF_EXPR) c_inhibit_evaluation_warnings -= current.lhs == truthvalue_true_node; + if (warn_logical_not_paren + parenthesized_not_lhs_warn) +
Re: [C/C++ PATCH] Add -Wlogical-not-parentheses (PR c/49706)
On 06/02/2014 12:23 PM, Marek Polacek wrote: I have no special reason for that check, so dropped it too in this patch. Thanks. I expect that warn_logical_not_parentheses will crash if one of the operands is type-dependent such that TREE_TYPE is NULL_TREE, so you'll want to handle that case, and test it in the testcase, i.e. something like #ifdef __cplusplus template class T, class U bool f(T t, U u) { return (!t == u); } #endif I think !t should have null TREE_TYPE in this case. Jason
Re: [C/C++ PATCH] Add -Wlogical-not-parentheses (PR c/49706)
On Mon, Jun 02, 2014 at 12:31:32PM -0400, Jason Merrill wrote: On 06/02/2014 12:23 PM, Marek Polacek wrote: I have no special reason for that check, so dropped it too in this patch. Thanks. I expect that warn_logical_not_parentheses will crash if one of the operands is type-dependent such that TREE_TYPE is NULL_TREE, so you'll want to handle that case, and test it in the testcase, i.e. something like #ifdef __cplusplus template class T, class U bool f(T t, U u) { return (!t == u); } #endif I think !t should have null TREE_TYPE in this case. Do we actually want to warn in that case? As the patch doesn't warn if the type is bool or vector, if somebody instantiates the above with say T=bool, U=bool, then we'd warn on something that otherwise will not be warned about when not in template. But to warn about this during instantiation, we'd need to somehow tell tsubst* whether it was !t == u or (!t) == u ... Jakub
Fix target/61336 -- alpha ice on questionable asm
The scheme that alpha uses to split symbolic references into trackable pairs of relocations doesn't really handle asms. Of course, normal asms don't have the problem seen here because they're interested in producing instructions, whereas this case is system tap creating some annotations. The most important part of the patch is the call to output_operand_lossage, which eliminates the ICE. The rest of the patch is somewhat debatable. It certainly works for the given test case, in that the assembly hunk is producing string data. But I doubt that system tap has actually been ported to alpha, so whether it works in the more specific sense of doing the right thing, I have no idea. In order to more properly handle symbolic references in asms, we'd probably have to write a dedicated pass rather than rely on post-reload splitting. Which is more of a viable option now than in olden times. But properly in this particular case would almost certainly *not* work for system tap, since it would result in a weird symbol+reloc string that I'm certain would not be handled properly by system tap. r~ PR target/61336 * config/alpha/alpha.c (print_operand_address): Allow symbolic addresses inside asms. Use output_operand_lossage instead of gcc_unreachable. diff --git a/gcc/config/alpha/alpha.c b/gcc/config/alpha/alpha.c index efef1e9..7663e20 100644 --- a/gcc/config/alpha/alpha.c +++ b/gcc/config/alpha/alpha.c @@ -5450,22 +5450,23 @@ print_operand_address (FILE *file, rtx addr) offset = INTVAL (addr); break; -#if TARGET_ABI_OPEN_VMS case SYMBOL_REF: + gcc_assert(TARGET_ABI_OPEN_VMS || this_is_asm_operands); fprintf (file, %s, XSTR (addr, 0)); return; case CONST: + gcc_assert(TARGET_ABI_OPEN_VMS || this_is_asm_operands); gcc_assert (GET_CODE (XEXP (addr, 0)) == PLUS GET_CODE (XEXP (XEXP (addr, 0), 0)) == SYMBOL_REF); fprintf (file, %s+ HOST_WIDE_INT_PRINT_DEC, XSTR (XEXP (XEXP (addr, 0), 0), 0), INTVAL (XEXP (XEXP (addr, 0), 1))); return; - -#endif + default: - gcc_unreachable (); + output_operand_lossage (invalid operand address); + return; } fprintf (file, HOST_WIDE_INT_PRINT_DEC ($%d), offset, basereg);
Re: [PATCH] rebuild frequency after vrp
On 06/02/14 10:17, Dehao Chen wrote: We need to rebuild frequency after vrp, otherwise the following code in tree-ssa-threadupdate.c will make the frequency larger than upper-bound. /* Excessive jump threading may make frequencies large enough so the computation overflows. */ if (rd-dup_blocks[0]-frequency BB_FREQ_MAX * 2) rd-dup_blocks[0]-frequency += EDGE_FREQUENCY (e); This is referring to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61384 Can you try this with Teresa's revamping of the jump threading frequency updates? It's still in my queue of things to review. Fixing this stuff in the updater would be better than rebuilding the frequencies, IMHO. Jeff
[PATCH] [ARM] Post-indexed addressing for NEON memory access
This patch adds support for post-indexed addressing for NEON structure memory accesses. For example VLD1.8 {d0}, [r0], r1 Bootstrapped and checked on arm-unknown-gnueabihf using Qemu. Ok for trunk? gcc/Changelog: 2014-06-02 Charles Baylis charles.bay...@linaro.org * config/arm/arm.c (neon_vector_mem_operand): Allow register POST_MODIFY for neon loads and stores. (arm_print_operand): Output post-index register for neon loads and stores. From a8e0bdbceab00d5e5b655611965d3975ba74365c Mon Sep 17 00:00:00 2001 From: Charles Baylis charles.bay...@linaro.org Date: Tue, 6 May 2014 15:23:46 +0100 Subject: [PATCH] post-indexed addressing for vld/vst 2014-05-09 Charles Baylis charles.bay...@linaro.org * config/arm/arm.c (neon_vector_mem_operand): Allow register POST_MODIFY for neon loads and stores. (arm_print_operand): Output post-index register for neon loads and stores. --- gcc/config/arm/arm.c | 14 +- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 1117bd4..6ab02ef 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -12786,7 +12786,11 @@ neon_vector_mem_operand (rtx op, int type, bool strict) || (type == 0 GET_CODE (ind) == PRE_DEC)) return arm_address_register_rtx_p (XEXP (ind, 0), 0); - /* FIXME: vld1 allows register post-modify. */ + /* Allow post-increment by register for VLDn */ + if (type == 2 GET_CODE (ind) == POST_MODIFY + GET_CODE (XEXP (ind, 1)) == PLUS + REG_P (XEXP (XEXP (ind, 1), 1))) + return true; /* Match: (plus (reg) @@ -21816,6 +21820,7 @@ arm_print_operand (FILE *stream, rtx x, int code) { rtx addr; bool postinc = FALSE; + rtx postinc_reg = NULL; unsigned align, memsize, align_bits; gcc_assert (MEM_P (x)); @@ -21825,6 +21830,11 @@ arm_print_operand (FILE *stream, rtx x, int code) postinc = 1; addr = XEXP (addr, 0); } + if (GET_CODE (addr) == POST_MODIFY) + { + postinc_reg = XEXP( XEXP (addr, 1), 1); + addr = XEXP (addr, 0); + } asm_fprintf (stream, [%r, REGNO (addr)); /* We know the alignment of this access, so we can emit a hint in the @@ -21850,6 +21860,8 @@ arm_print_operand (FILE *stream, rtx x, int code) if (postinc) fputs(!, stream); + if (postinc_reg) + asm_fprintf (stream, , %r, REGNO (postinc_reg)); } return; -- 1.9.1
Re: [C/C++ PATCH] Add -Wlogical-not-parentheses (PR c/49706)
On Mon, Jun 02, 2014 at 06:36:06PM +0200, Jakub Jelinek wrote: On Mon, Jun 02, 2014 at 12:31:32PM -0400, Jason Merrill wrote: On 06/02/2014 12:23 PM, Marek Polacek wrote: I have no special reason for that check, so dropped it too in this patch. Thanks. I expect that warn_logical_not_parentheses will crash if one of the operands is type-dependent such that TREE_TYPE is NULL_TREE, so you'll want to handle that case, and test it in the testcase, i.e. something like #ifdef __cplusplus template class T, class U bool f(T t, U u) { return (!t == u); } #endif I think !t should have null TREE_TYPE in this case. Hmm, I see no crash; the types seem to be template_type_parm 0x7013d5e8 T type_0 type_6 VOID ... Do we actually want to warn in that case? As the patch doesn't warn if the type is bool or vector, if somebody instantiates the above with say T=bool, U=bool, then we'd warn on something that otherwise will not be warned about when not in template. But to warn about this during instantiation, we'd need to somehow tell tsubst* whether it was !t == u or (!t) == u ... Ah - indeed. So I suggest to stay with !processing_template_decl ;). Marek
Re: [C/C++ PATCH] Add -Wlogical-not-parentheses (PR c/49706)
On Mon, Jun 02, 2014 at 07:04:58PM +0200, Marek Polacek wrote: Do we actually want to warn in that case? As the patch doesn't warn if the type is bool or vector, if somebody instantiates the above with say T=bool, U=bool, then we'd warn on something that otherwise will not be warned about when not in template. But to warn about this during instantiation, we'd need to somehow tell tsubst* whether it was !t == u or (!t) == u ... Ah - indeed. So I suggest to stay with !processing_template_decl ;). Well, if the vars are known not to be boolean, even in template, you can warn about this right away (i.e. if they aren't type_dependent_expression_p). Jakub
Re: [PATCH] rebuild frequency after vrp
Just tried with Teresa's patch, the ICE in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61384 is not resolved. Dehao On Mon, Jun 2, 2014 at 9:45 AM, Jeff Law l...@redhat.com wrote: On 06/02/14 10:17, Dehao Chen wrote: We need to rebuild frequency after vrp, otherwise the following code in tree-ssa-threadupdate.c will make the frequency larger than upper-bound. /* Excessive jump threading may make frequencies large enough so the computation overflows. */ if (rd-dup_blocks[0]-frequency BB_FREQ_MAX * 2) rd-dup_blocks[0]-frequency += EDGE_FREQUENCY (e); This is referring to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61384 Can you try this with Teresa's revamping of the jump threading frequency updates? It's still in my queue of things to review. Fixing this stuff in the updater would be better than rebuilding the frequencies, IMHO. Jeff
Re: [PATCH, Pointer Bounds Checker 13/x] Early versioning
On 06/02/14 04:48, Ilya Enkovich wrote: Hmm, so if I understand things correctly, src_fun has no loop structures attached, thus there's nothing to copy. Presumably at some later point we build loop structures for the copy from scratch? I suppose it is just a simple bug with absent NULL pointer check. Here is original code: /* Duplicate the loop tree, if available and wanted. */ if (loops_for_fn (src_cfun) != NULL current_loops != NULL) { copy_loops (id, entry_block_map-loop_father, get_loop (src_cfun, 0)); /* Defer to cfgcleanup to update loop-father fields of basic-blocks. */ loops_state_set (LOOPS_NEED_FIXUP); } /* If the loop tree in the source function needed fixup, mark the destination loop tree for fixup, too. */ if (loops_for_fn (src_cfun)-state LOOPS_NEED_FIXUP) loops_state_set (LOOPS_NEED_FIXUP); As you may see we have check for absent loops structure in the first if-statement and no check in the second one. I hit segfault and added the check. Downthread you indicated you're not in SSA form which might explain the inconsistency here. If so, then we need to make sure that the loop df structures do get set up properly later. Jeff
Re: Patch for builtins.h restructuring [1/2]
On 06/02/14 09:26, Andrew MacLeod wrote: Furthering the include file restructuring, this patch fixes the header files for builtins.h. The prototypes were spread between tree.h, expr.h and fold-const.h. * There were 5 build_* routines in builtins.c that really should be in tree.c. They are used all over the place. The prototypes were already in tree.h. * There were 3 routines in builtins.c that belonged in targhooks.c. The prototypes were already in targhooks.h * fold-const.c uses do_mpc_arg2() which is defined in builtins.c, so the prototype should be in builtins.h. The prototype utilizes type mpc_srcptr from mpc.h which means that every file which includes builtins.h would also require #include mpc.h. That is likely the reason the prototype was placed into realmpfr.h. I added #include mpc.h to builtins.h. Normally we're trying not to do that until we can rebuild a decent module structure, but I think that rule only makes sense for include files within gcc's source base... Seems reasonable. * Finally, fortran/trans.c was calling fold_builtin_call_array directly. That means it would have needed builtins.h which caused issues since builtins.h defines struct target_builtins and uses FIRST_PSEUDO_REGISTER... which is defined within the target config file... something I dont think we really want to expose to the fortran front end :-P. Anyway, looking aorund, it turns out fold-const.c::fold_build_call_array_loc is really a wrapper for a call to fold_builtin_call_array, with some extra checking code before and after the call protected by ENABLE_FOLD_CHECKING. I'd think that should be OK since its what other front ends call... Ick. Yea, it'd be good if FIRST_PSEUDO_REGISTER doesn't bleed all the way into the front-ends. Bootstraps on x86_64-unknown-linux-gnu with no regressions. I also ran it through a full set of target builds for compilation, with no new failures there. The second set of patches are the updated #includes required for compilation. OK for trunk? OK. jeff
Re: [C/C++ PATCH] Add -Wlogical-not-parentheses (PR c/49706)
On 06/02/2014 01:04 PM, Marek Polacek wrote: #ifdef __cplusplus template class T, class U bool f(T t, U u) { return (!t == u); } #endif I think !t should have null TREE_TYPE in this case. Hmm, I see no crash; the types seem to be template_type_parm 0x7013d5e8 T type_0 type_6 VOID ... Right, because you're pulling out the operand of the not. OK, then you need something a little more complicated like !g(t). Do we actually want to warn in that case? As the patch doesn't warn if the type is bool or vector, if somebody instantiates the above with say T=bool, U=bool, then we'd warn on something that otherwise will not be warned about when not in template. I think we still want to warn. A template that is only correct for one possible template argument shouldn't be a template. Jason
Re: [PATCH] rebuild frequency after vrp
On 06/02/14 10:17, Dehao Chen wrote: We need to rebuild frequency after vrp, otherwise the following code in tree-ssa-threadupdate.c will make the frequency larger than upper-bound. /* Excessive jump threading may make frequencies large enough so the computation overflows. */ if (rd-dup_blocks[0]-frequency BB_FREQ_MAX * 2) rd-dup_blocks[0]-frequency += EDGE_FREQUENCY (e); This is referring to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61384 Can you try this with Teresa's revamping of the jump threading frequency updates? It's still in my queue of things to review. Fixing this stuff in the updater would be better than rebuilding the frequencies, IMHO. Agreed. However one of problems is that the frequencies after VRP may be unfixable. If i.e. you have if (a) prob 20 else prob 80 if (a) prob 50 else prob 50 that can easily happen if the second conditional was more convoluted earlier in the optimization queue. We may track when this change and ask for frequency rebuild, but issue is that I am not completely sure if that is going to give us consistently more reliable profiles than without: the problem is that branch probabilities themselves may be misupdated by earlier passes. It may be interesting to do some analysis how well estimated profile corelate with measured profile thorough the optimization. The problem above would be less troublesome if we jump threaded at least some cases pre-branch prediction. It is one of reasions why I think it would be cool to do jump threading in early opts, too, at least in a lightweidt form. Now when VRP is able to annotate SSA names, VRP results could feed loop estimation analysis for i.e. get upper bound on iterations of: if (a50) for (i=0;ia;i++) that can in turn improve branch prediction. Also we can feed the ranges into ipa-cp and get ipa-cp to propagate ranges rather than constants. This will get us free non-NULL propagation. IPA-inline's predicates can take use of value ranges because they record conditionals and also the results can be fed back to post-IPA VRP (by setting value ranges of default def SSA names of the parmeters) The down side is, of course, that VRP is considered expensive. Honza Jeff
Re: [PATCH] rebuild frequency after vrp
We need to rebuild frequency after vrp, otherwise the following code in tree-ssa-threadupdate.c will make the frequency larger than upper-bound. /* Excessive jump threading may make frequencies large enough so the computation overflows. */ if (rd-dup_blocks[0]-frequency BB_FREQ_MAX * 2) rd-dup_blocks[0]-frequency += EDGE_FREQUENCY (e); This is referring to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61384 I see, I still think C++ conversion gives us chance to turn both counts and frequencies to types that are safe WRT such calculations. Counts can be either software floats as discussed or fixed point 64bit type (I think we only need something like 8-10 bits to get problems Teresa is running into resolved - the cases where expanding statements gets sub 1 counts that needs to be diferent from 0) that does caps instead of overflow. Frequencies can be either the same or 32bit fixed point with capping as we do, sort of, now. I would really welcome if someone worked on this rather than papering around the bugs in current hand written fixed point arithmetics. (I am currently occupied by other projects, so I am not sure I can do it myself really soon) Honza