[RS6000] PR61300 K&R incoming args
One of the nice features of the ELFv2 ABI is that stack frames are smaller compared to ELFv1. We don't allocate a parameter save area unless we actually use it. However, for variable argument lists, we kept the simple va_list type which is a pointer to the memory location of the next parameter. This means calls to variable argument list functions must allocate the parameter save area, and hence calls to unprototyped functions must also do so. The wrinkle with K&R style C functions is that function *definitions* may be unprototyped. So when compiling a function body we can't use !prototype_p() to say we have a parameter save area. A call in some other compilation unit might be prototyped and so not allocate a parameter save area. Another consequence of unprototyped function definitions is that the return type and argument types may not be available on the function type node. Instead you need to look at the return and arguments on the function decl. Now, function.c always passes a decl to REG_PARM_STACK_SPACE, but calls.c sometimes passes a decl and sometimes a type. This latter fact makes it necessary, I think, to define an INCOMING_REG_PARM_STACK_SPACE used by function.c. You can't blindly use a decl from calls.c as that falls foul of C++.. The following implements this. Bootstrapped and regression tested powerpc64le-linux and powerpc64-linux all langs (except Ada since I didn't have gnat installed.) OK to apply? PR target/61300 * doc/tm.texi.in (INCOMING_REG_PARM_STACK_SPACE): Document. * doc/tm.texi: Regenerate. * function.c (INCOMING_REG_PARM_STACK_SPACE): Provide default. Use throughout in place of REG_PARM_STACK_SPACE. * config/rs6000/rs6000.c (rs6000_reg_parm_stack_space): Add "incoming" param. Pass to rs6000_function_parms_need_stack. (rs6000_function_parms_need_stack): Add "incoming" param, ignore prototype_p when incoming. Use function decl when incoming to handle K&R style functions. * config/rs6000/rs6000.h (REG_PARM_STACK_SPACE): Adjust. (INCOMING_REG_PARM_STACK_SPACE): Define. Index: gcc/doc/tm.texi.in === --- gcc/doc/tm.texi.in (revision 210919) +++ gcc/doc/tm.texi.in (working copy) @@ -3499,6 +3499,13 @@ which. @c above is overfull. not sure what to do. --mew 5feb93 did @c something, not sure if it looks good. --mew 10feb93 +@defmac INCOMING_REG_PARM_STACK_SPACE (@var{fndecl}) +Like @code{REG_PARM_STACK_SPACE}, but for incoming register arguments. +Define this macro if space guaranteed when compiling a function body +is different to space required when making a call, a situation that +can arise with unprototyped functions. +@end defmac + @defmac OUTGOING_REG_PARM_STACK_SPACE (@var{fntype}) Define this to a nonzero value if it is the responsibility of the caller to allocate the area reserved for arguments passed in registers Index: gcc/doc/tm.texi === --- gcc/doc/tm.texi (revision 210919) +++ gcc/doc/tm.texi (working copy) @@ -3948,6 +3948,13 @@ which. @c above is overfull. not sure what to do. --mew 5feb93 did @c something, not sure if it looks good. --mew 10feb93 +@defmac INCOMING_REG_PARM_STACK_SPACE (@var{fndecl}) +Like @code{REG_PARM_STACK_SPACE}, but for incoming register arguments. +Define this macro if space guaranteed when compiling a function body +is different to space required when making a call, a situation that +can arise with unprototyped functions. +@end defmac + @defmac OUTGOING_REG_PARM_STACK_SPACE (@var{fntype}) Define this to a nonzero value if it is the responsibility of the caller to allocate the area reserved for arguments passed in registers Index: gcc/function.c === --- gcc/function.c (revision 210919) +++ gcc/function.c (working copy) @@ -1348,9 +1348,13 @@ static int cfa_offset; #define STACK_POINTER_OFFSET 0 #endif +#if defined (REG_PARM_STACK_SPACE) && !defined (INCOMING_REG_PARM_STACK_SPACE) +#define INCOMING_REG_PARM_STACK_SPACE REG_PARM_STACK_SPACE +#endif + /* If not defined, pick an appropriate default for the offset of dynamically allocated memory depending on the value of ACCUMULATE_OUTGOING_ARGS, - REG_PARM_STACK_SPACE, and OUTGOING_REG_PARM_STACK_SPACE. */ + INCOMING_REG_PARM_STACK_SPACE, and OUTGOING_REG_PARM_STACK_SPACE. */ #ifndef STACK_DYNAMIC_OFFSET @@ -1362,12 +1366,12 @@ static int cfa_offset; `crtl->outgoing_args_size'. Nevertheless, we must allow for it when allocating stack dynamic objects. */ -#if defined(REG_PARM_STACK_SPACE) +#ifdef INCOMING_REG_PARM_STACK_SPACE #define STACK_DYNAMIC_OFFSET(FNDECL) \ ((ACCUMULATE_OUTGOING_ARGS \ ? (crtl->outgoing_args_size\ + (OUTGOING_REG_P
Re: PING: PATCH [4/n]: Prepare x32: Permute the conversion and addition if one operand is a constant
On Wed, Jul 13, 2011 at 9:39 AM, H.J. Lu wrote: > On Wed, Jul 13, 2011 at 9:13 AM, Paolo Bonzini wrote: >> On 07/11/2011 05:54 PM, H.J. Lu wrote: >>> >>> The key is the >>> > >>> > XEXP (x, 1) == convert_memory_address_addr_space >>> >(to_mode, XEXP (x, 1), as) >>> > >>> > test. It ensures basically that the constant has 31-bit precision, >>> > because >>> > otherwise the constant would change from e.g. (const_int -0x7ffc) >>> > to >>> > (const_int 0x8004) when zero-extending it from SImode to DImode. >>> > >>> > But I'm not sure it's safe. You have, >>> > >>> >(zero_extend:DI (plus:SI FOO:SI) (const_int Y)) >>> > >>> > and you want to convert it to >>> > >>> >(plus:DI FOO:DI (zero_extend:DI (const_int Y))) >>> > >>> > (where the zero_extend is folded). Ignore that FOO is a SYMBOL_REF >>> > (this >>> > piece of code does not assume anything about its shape); if FOO == >>> > 0xfffc and Y = 8, the result will be respectively 0x4 (valid) and >>> > 0x10004 (invalid). >>> >>> This example contradicts what you said above "It ensures basically that >>> the >>> constant has 31-bit precision". >> >> Why? Certainly Y = 8 has 31-bit (or less) precision. So it has the same >> representation in SImode and DImode, and the test above on XEXP (x, 1) >> succeeds. > > And then we permute conversion and addition, which leads to the issue you > raised above. In another word, the current code permutes conversion > and addition. > It leads to different values in case of symbol (0xfffc) + 8. > Basically the current > test for 31-bit (or less) precision is bogus. The real question is > for a address > computation, A + B, if address wrap-around is supported in > convert_memory_address_addr_space. Unless the code has already reassociated the additions already. Like in the AARCH64 ILP32 case: (plus:SI (plus:SI (mult:SI (reg/v:SI 80 [ b ]) (const_int -4 [0xfffc])) (subreg/s/u:SI (reg/v/f:DI 79 [ a ]) 0)) (const_int -1073742592 [0xbd00])) The Tree level is correct in that it did not reassociate the addition but the RTL level ignores that. So this patch is invalid and incorrect unless you know the non constant part of the addition is a pointer (which is not the case here). Thanks, Andrew Pinski > >>> > What happens if you just return NULL instead of the assertion (good >>> > idea >>> > adding it!)? >>> > >>> > Of course then you need to: >>> > >>> > 1) check the return values of convert_memory_address_addr_space_1, and >>> > propagate NULL up to simplify_unary_operation; >>> > >>> > 2) check in simplify-rtx.c whether the return value of >>> > convert_memory_address_1 is NULL, and only return if the return value >>> > is not >>> > NULL. This is not yet necessary (convert_memory_address is the last >>> > transformation for both SIGN_EXTEND and ZERO_EXTEND) but it is better >>> > to >>> > keep code clean. >>> >>> I will give it a try. >> >> Thanks, did you get any result? There's no "I think" in this code. So even >> if I cannot approve it, I'd really like to see a version that I understand >> and that is clearly conservative, if it works. >> > > I opened a new bug: > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49721 > > My current code looks like: > >case CONST: > temp = convert_memory_address_addr_space_1 (to_mode, XEXP (x, 0), > as, no_emit, > ignore_address_wrap_around); > return temp ? gen_rtx_CONST (to_mode, temp) : temp; > break; > > case PLUS: > case MULT: > /* For addition we can safely permute the conversion and addition > operation if one operand is a constant, address wrap-around > is ignored and we are using a ptr_extend instruction or > zero-extending (POINTERS_EXTEND_UNSIGNED != 0). We can always > safely permute them if we are making the address narrower. */ > if (GET_MODE_SIZE (to_mode) < GET_MODE_SIZE (from_mode) > || (GET_CODE (x) == PLUS > && CONST_INT_P (XEXP (x, 1)) > && (POINTERS_EXTEND_UNSIGNED != 0 > && ignore_address_wrap_around))) > return gen_rtx_fmt_ee (GET_CODE (x), to_mode, >convert_memory_address_addr_space_1 > (to_mode, XEXP (x, 0), as, no_emit, > ignore_address_wrap_around), >XEXP (x, 1)); > break; > > -- > H.J.
Re: [patch i386]: Expand sibling-tail-calls via accumulator register
On 05/28/2014 02:54 PM, Jeff Law wrote: > On 05/28/14 15:52, Jakub Jelinek wrote: >> On Wed, May 28, 2014 at 05:28:31PM -0400, Kai Tietz wrote: >>> Yes, I missed the plus-part. >>> >>> I am just running bootstrap with regression testing for altering predicate >>> to: >>> >>> (define_predicate "sibcall_memory_operand" >>>(match_operand 0 "memory_operand") >>> { >>>op = XEXP (op, 0); >>> >>>if (GET_CODE (op) == CONST) >>> op = XEXP (op, 0); >>>if (GET_CODE (op) == PLUS && CONSTANT_P (XEXP (op, 0))) >>> op = XEXP (op, 1); >> >> Why not get rid of all the above 4 lines and just keep: >> >>>return CONSTANT_P (op); >> >> ? CONST matches CONSTANT_P, and what is inside of CONST should be >> fine, and (plus (symbol_ref) (const_int)) not surrounded by CONST >> ir invalid. > Haven't we recently had problems with being overly accepting of stuff inside > CONST when using the CONST for address expressions. ISTM we should only > accept > what the processor supports here. Recall that it has just satisfied memory_operand, where all the real checks should have been done. I think just the CONSTANT_P check is sufficient. r~
Re: [C++ Patch] PR 57543
On 05/28/2014 10:00 PM, Paolo Carlini wrote: Hi again, On 05/28/2014 07:06 PM, Paolo Carlini wrote: (In case I would have also to double check something weird I was seeing if the injection happens for all method types...) In the meanwhile I investigated the reason of all those regressions I was seeing when injecting in all METHOD_TYPEs: while handling, eg, from cpp0x/access02.C: template struct foo_argument { template static Arg test(Ret (C::*)(Arg)); ... }; Just wanted to add that I spent some time analyzing what we do for the above and things seem in good shape: the first time we call tsubst_function_type on Ret (C::)(Arg) we don't know the type of C and we do not inject. Then we work again on it, when we actually know the types of Ret, C, and Arg, and then we inject, and tsubst_function_type returns a fully substituted fntype, an (int) (base::) (int). Paolo.
libgo patch committed: Work around LLVM split-stack deficiency
This patch from Peter Collingbourne tweaks libgo to work around a deficiency in the LLVM split-stack implementation: it doesn't support varargs functions. This is a step toward making it possible to compile libgo with LLVM. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline. Ian diff -r 445501e362dd libgo/runtime/print.c --- a/libgo/runtime/print.c Wed May 28 16:10:26 2014 -0700 +++ b/libgo/runtime/print.c Wed May 28 16:47:30 2014 -0700 @@ -11,7 +11,9 @@ //static Lock debuglock; -static void go_vprintf(const char*, va_list); +// Clang requires this function to not be inlined (see below). +static void go_vprintf(const char*, va_list) +__attribute__((noinline)); // write to goroutine-local buffer if diverting output, // or else standard error. @@ -61,6 +63,24 @@ gwrite(s, runtime_findnull((const byte*)s)); } +#if defined (__clang__) && (defined (__i386__) || defined (__x86_64__)) +// LLVM's code generator does not currently support split stacks for vararg +// functions, so we disable the feature for this function under Clang. This +// appears to be OK as long as: +// - this function only calls non-inlined, internal-linkage (hence no dynamic +// loader) functions compiled with split stacks (i.e. go_vprintf), which can +// allocate more stack space as required; +// - this function itself does not occupy more than BACKOFF bytes of stack space +// (see libgcc/config/i386/morestack.S). +// These conditions are currently known to be satisfied by Clang on x86-32 and +// x86-64. Note that signal handlers receive slightly less stack space than they +// would normally do if they happen to be called while this function is being +// run. If this turns out to be a problem we could consider increasing BACKOFF. +void +runtime_printf(const char *s, ...) +__attribute__((no_split_stack)); +#endif + void runtime_printf(const char *s, ...) {
Re: [PATCH] demangler, only access valid fields for DEMANGLE_COMPONENT_FIXED_TYPE.
On 28/05/2014 11:56 PM, Pedro Alves wrote: > On 05/28/2014 09:38 PM, Andrew Burgess wrote: >> >> diff --git a/libiberty/testsuite/demangle-expected >> b/libiberty/testsuite/demangle-expected >> index 453f9a3..0e2bb12 100644 >> --- a/libiberty/testsuite/demangle-expected >> +++ b/libiberty/testsuite/demangle-expected >> @@ -4343,3 +4343,9 @@ >> cereal::detail::InputBindingMap::Serializers >> cereal::p >> --format=gnu-v3 >> _ZNSt9_Any_data9_M_accessIPZ4postISt8functionIFvvEEEvOT_EUlvE_EERS5_v >> void post >(std::function> ()>&&)::{lambda()#1}*& std::_Any_data::_M_access> post >(void post >> >(std::function&&)::{lambda()#1}*&&)::{lambda()#1}*>() >> +# The following input symbol was found during random, it caused a fault > > Could you add a single empty # above, to separate the tests? > I find that that makes it much easier to follow the file. Done. >> +# The following input symbol was found during random, it caused a fault > > "during random testing?" > >> +# within the demangler, it's not a symbol we'd expect in the real world. > > Why not? Good point(s), that comment was out of date, I've removed it. The symbol is a perfectly good symbol which we could find in the real world, and should be able to handle. Patch below only has changes to the tests. Thanks, Andrew libiberty/ChangeLog: * cp-demangle.c (d_dump): Only access field from s_fixed part of the union for DEMANGLE_COMPONENT_FIXED_TYPE. (d_count_templates_scopes): Likewise. * testsuite/demangle-expected: New test case. diff --git a/libiberty/cp-demangle.c b/libiberty/cp-demangle.c index 68d8ee1..a31dad4 100644 --- a/libiberty/cp-demangle.c +++ b/libiberty/cp-demangle.c @@ -710,7 +710,9 @@ d_dump (struct demangle_component *dc, int indent) printf ("pointer to member type\n"); break; case DEMANGLE_COMPONENT_FIXED_TYPE: - printf ("fixed-point type\n"); + printf ("fixed-point type, accum? %d, sat? %d\n", + dc->u.s_fixed.accum, dc->u.s_fixed.sat); + d_dump (dc->u.s_fixed.length, indent + 2) break; case DEMANGLE_COMPONENT_ARGLIST: printf ("argument list\n"); @@ -3869,7 +3871,13 @@ d_count_templates_scopes (int *num_templates, int *num_scopes, case DEMANGLE_COMPONENT_FUNCTION_TYPE: case DEMANGLE_COMPONENT_ARRAY_TYPE: case DEMANGLE_COMPONENT_PTRMEM_TYPE: + goto recurse_left_right; + case DEMANGLE_COMPONENT_FIXED_TYPE: + d_count_templates_scopes (num_templates, num_scopes, +dc->u.s_fixed.length); + break; + case DEMANGLE_COMPONENT_VECTOR_TYPE: case DEMANGLE_COMPONENT_ARGLIST: case DEMANGLE_COMPONENT_TEMPLATE_ARGLIST: diff --git a/libiberty/testsuite/demangle-expected b/libiberty/testsuite/demangle-expected index 453f9a3..63f6821 100644 --- a/libiberty/testsuite/demangle-expected +++ b/libiberty/testsuite/demangle-expected @@ -4343,3 +4343,8 @@ cereal::detail::InputBindingMap::Serializers cereal::p --format=gnu-v3 _ZNSt9_Any_data9_M_accessIPZ4postISt8functionIFvvEEEvOT_EUlvE_EERS5_v void post >(std::function&&)::{lambda()#1}*& std::_Any_data::_M_access >(void post >(std::function&&)::{lambda()#1}*&&)::{lambda()#1}*>() +# +--format=auto --no-params +_Z3xxxDFyuVb +xxx(unsigned long long _Fract, bool volatile) +xxx
Re: ipa-visibility TLC 2/n
> Richard Sandiford wrote the original section anchors implementation, > so he would be a good person to comment about the interaction between > aliases and section anchors. Thanks! Richard, does this patch seem sane? AIX gcc now builds for me, so I will test it on x86_64-linux it and commit if it passes to unbreak bootstraps. I think I should incrementally make symtab to believe that all functions/vars in named sections (and perhaps all COMDAT variables with -fdata-sections and COMDAT functions with -ffunction-sections, too) can be removed at linktime and thus disable the creation of the local aliases. This will result with worse code with -fdata-sections but will make it possible to garbage collect tose sections Honza > > Thanks, David
libgo patch committed: Make libgo C code more portable
This patch to libgo, from Peter Collingbourne, changes some of the C code to make it easier to compile libgo with a compiler other than GCC. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline. Ian diff -r d73f07d002ef libgo/runtime/go-cdiv.c --- a/libgo/runtime/go-cdiv.c Tue May 27 15:00:31 2014 -0700 +++ b/libgo/runtime/go-cdiv.c Wed May 28 16:09:25 2014 -0700 @@ -4,6 +4,9 @@ Use of this source code is governed by a BSD-style license that can be found in the LICENSE file. */ +#include +#include + /* Calls to these functions are generated by the Go frontend for division of complex64 or complex128. We use these because Go's complex division expects slightly different results from the GCC @@ -13,33 +16,33 @@ the the whole number is Inf, but an operation involving NaN ought to result in NaN, not Inf. */ -__complex float -__go_complex64_div (__complex float a, __complex float b) +complex float +__go_complex64_div (complex float a, complex float b) { - if (__builtin_expect (b == 0+0i, 0)) + if (__builtin_expect (b == 0, 0)) { - if (!__builtin_isinff (__real__ a) - && !__builtin_isinff (__imag__ a) - && (__builtin_isnanf (__real__ a) || __builtin_isnanf (__imag__ a))) + if (!isinf (crealf (a)) + && !isinf (cimagf (a)) + && (isnan (crealf (a)) || isnan (cimagf (a { /* Pass "1" to nanf to match math/bits.go. */ - return __builtin_nanf("1") + __builtin_nanf("1")*1i; + return nanf("1") + nanf("1")*I; } } return a / b; } -__complex double -__go_complex128_div (__complex double a, __complex double b) +complex double +__go_complex128_div (complex double a, complex double b) { - if (__builtin_expect (b == 0+0i, 0)) + if (__builtin_expect (b == 0, 0)) { - if (!__builtin_isinf (__real__ a) - && !__builtin_isinf (__imag__ a) - && (__builtin_isnan (__real__ a) || __builtin_isnan (__imag__ a))) + if (!isinf (creal (a)) + && !isinf (cimag (a)) + && (isnan (creal (a)) || isnan (cimag (a { /* Pass "1" to nan to match math/bits.go. */ - return __builtin_nan("1") + __builtin_nan("1")*1i; + return nan("1") + nan("1")*I; } } return a / b; diff -r d73f07d002ef libgo/runtime/go-type-complex.c --- a/libgo/runtime/go-type-complex.c Tue May 27 15:00:31 2014 -0700 +++ b/libgo/runtime/go-type-complex.c Wed May 28 16:09:25 2014 -0700 @@ -4,13 +4,13 @@ Use of this source code is governed by a BSD-style license that can be found in the LICENSE file. */ +#include +#include +#include +#include #include "runtime.h" #include "go-type.h" -/* The 64-bit type. */ - -typedef unsigned int DItype __attribute__ ((mode (DI))); - /* Hash function for float types. */ uintptr_t @@ -18,69 +18,67 @@ { if (key_size == 8) { - union - { - unsigned char a[8]; - __complex float cf; - DItype di; - } ucf; - __complex float cf; + const complex float *cfp; + complex float cf; float cfr; float cfi; + uint64_t fi; - __builtin_memcpy (ucf.a, vkey, 8); - cf = ucf.cf; - cfr = __builtin_crealf (cf); - cfi = __builtin_cimagf (cf); - if (__builtin_isinff (cfr) || __builtin_isinff (cfi)) + cfp = (const complex float *) vkey; + cf = *cfp; + + cfr = crealf (cf); + cfi = cimagf (cf); + + if (isinf (cfr) || isinf (cfi)) return 0; /* NaN != NaN, so the hash code of a NaN is irrelevant. Make it random so that not all NaNs wind up in the same place. */ - if (__builtin_isnanf (cfr) || __builtin_isnanf (cfi)) + if (isnan (cfr) || isnan (cfi)) return runtime_fastrand1 (); /* Avoid negative zero. */ if (cfr == 0 && cfi == 0) return 0; else if (cfr == 0) - ucf.cf = cfi * 1.0iF; + cf = cfi * I; else if (cfi == 0) - ucf.cf = cfr; + cf = cfr; - return ucf.di; + memcpy (&fi, &cf, 8); + return (uintptr_t) cfi; } else if (key_size == 16) { - union - { - unsigned char a[16]; - __complex double cd; - DItype adi[2]; - } ucd; - __complex double cd; + const complex double *cdp; + complex double cd; double cdr; double cdi; + uint64_t di[2]; - __builtin_memcpy (ucd.a, vkey, 16); - cd = ucd.cd; - cdr = __builtin_crealf (cd); - cdi = __builtin_cimagf (cd); - if (__builtin_isinf (cdr) || __builtin_isinf (cdi)) + cdp = (const complex double *) vkey; + cd = *cdp; + + cdr = creal (cd); + cdi = cimag (cd); + + if (isinf (cdr) || isinf (cdi)) return 0; - if (__builtin_isnan (cdr) || __builtin_isnan (cdi)) + if (isnan (cdr) || isnan (cdi)) return runtime_fastrand1 (); /* Avoid negative zero. */ if (cdr == 0 && cdi == 0) return 0; else if (cdr == 0) - ucd.cd = cdi * 1.0i; + cd = cdi * I; else if (cdi == 0) - ucd.cd = cdr; + cd = cdr; - return ucd.adi[0] ^ ucd.adi
[PATCH/AARCH64] Fix PR 61345: rtx_cost ICEing on simple code
Hi, The problem here is aarch64_rtx_costs for IF_THEN_ELSE does not handle the case where the first operand is a non comparison. This happens when the combine is combing a few RTLs and calling set_src_cost to check the costs of the newly created rtl. At the same I noticed the code for rtx_costs was getting more complex so I factored out the code for IF_THEN_ELSE OK? Built and tested on aarch64-elf with no regressions. Thanks, Andrew Pinski ChangeLog: * config/aarch64/aarch64.c (aarch64_if_then_else_costs): New function. (aarch64_rtx_costs): Use aarch64_if_then_else_costs. testsuite/ChangeLog: * gcc.c-torture/compile/20140528-1.c: New testcase. commit d2a89a0b21f13e676a863eeb3ac1f9ad927e65ac Author: Andrew Pinski Date: Wed May 28 15:44:05 2014 -0700 Fix kernel. diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index c2f6c4f..32c5fcb 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -4848,6 +4848,80 @@ aarch64_rtx_arith_op_extract_p (rtx x, enum machine_mode mode) return false; } +/* Calculate the cost of calculating (if_then_else (OP0) (OP1) (OP2)), + storing it in *COST. Result is true if the total cost of the operation + has now been calculated. */ +static bool +aarch64_if_then_else_costs (rtx op0, rtx op1, rtx op2, int *cost, bool speed) +{ + rtx inner; + rtx comparator; + enum rtx_code cmpcode; + if (COMPARISON_P (op0)) +{ + inner = XEXP (op0, 0); + comparator = XEXP (op0, 1); + cmpcode = GET_CODE (op0); +} + else +{ + inner = op0; + comparator = const0_rtx; + cmpcode = NE; +} + + if (GET_CODE (op1) == PC || GET_CODE (op2) == PC) +{ + /* Conditional branch. */ + if (GET_MODE_CLASS (GET_MODE (inner)) == MODE_CC) + return true; + else + { + if (cmpcode == NE || cmpcode == EQ) + { + if (comparator == const0_rtx) + { + /* TBZ/TBNZ/CBZ/CBNZ. */ + if (GET_CODE (inner) == ZERO_EXTRACT) + /* TBZ/TBNZ. */ + *cost += rtx_cost (XEXP (inner, 0), ZERO_EXTRACT, + 0, speed); + else + /* CBZ/CBNZ. */ + *cost += rtx_cost (inner, cmpcode, 0, speed); + + return true; + } + } + else if (cmpcode == LT || cmpcode == GE) + { + /* TBZ/TBNZ. */ + if (comparator == const0_rtx) + return true; + } + } +} + else if (GET_MODE_CLASS (GET_MODE (inner)) == MODE_CC) +{ + /* It's a conditional operation based on the status flags, +so it must be some flavor of CSEL. */ + + /* CSNEG, CSINV, and CSINC are handled for free as part of CSEL. */ + if (GET_CODE (op1) == NEG + || GET_CODE (op1) == NOT + || (GET_CODE (op1) == PLUS && XEXP (op1, 1) == const1_rtx)) + op1 = XEXP (op1, 0); + + *cost += rtx_cost (op1, IF_THEN_ELSE, 1, speed); + *cost += rtx_cost (op2, IF_THEN_ELSE, 2, speed); + return true; +} + + /* We don't know what this is, cost all operands. */ + return false; + +} + /* Calculate the cost of calculating X, storing it in *COST. Result is true if the total cost of the operation has now been calculated. */ static bool @@ -5582,67 +5656,9 @@ cost_plus: return false; /* All arguments need to be in registers. */ case IF_THEN_ELSE: - op2 = XEXP (x, 2); - op0 = XEXP (x, 0); - op1 = XEXP (x, 1); - - if (GET_CODE (op1) == PC || GET_CODE (op2) == PC) -{ - /* Conditional branch. */ - if (GET_MODE_CLASS (GET_MODE (XEXP (op0, 0))) == MODE_CC) - return true; - else - { - if (GET_CODE (op0) == NE - || GET_CODE (op0) == EQ) - { - rtx inner = XEXP (op0, 0); - rtx comparator = XEXP (op0, 1); - - if (comparator == const0_rtx) - { - /* TBZ/TBNZ/CBZ/CBNZ. */ - if (GET_CODE (inner) == ZERO_EXTRACT) - /* TBZ/TBNZ. */ - *cost += rtx_cost (XEXP (inner, 0), ZERO_EXTRACT, - 0, speed); - else - /* CBZ/CBNZ. */ - *cost += rtx_cost (inner, GET_CODE (op0), 0, speed); - - return true; - } - } - else if (GET_CODE (op0) == LT - || GET_CODE (op0) == GE) - { - rtx comparator = XEXP (op0, 1); - - /* TBZ/TBNZ. */ - if (comparator == const0_rtx) - return true; - } - } -} - else if (GET_MODE_CLASS (GET_MODE (XEXP (op
Re: [PATCH] demangler, only access valid fields for DEMANGLE_COMPONENT_FIXED_TYPE.
On 05/28/2014 09:38 PM, Andrew Burgess wrote: > In two places when a struct demangle_component is of type > DEMANGLE_COMPONENT_FIXED_TYPE we fall back to accessing the default > s_binary member of the union rather than the s_fixed member. This is > incorrect and can cause the demangler to crash. > > In d_dump I've changed the code to only access the s_fixed member of the > union, and also added printing of the remaining parts of the s_fixed > struct, this felt like the most useful thing to do. > > I've added a new test, this causes a SIGSEGV for me before the patch, and > is fine afterwords, however, this undefined, so might not cause a crash on > all platforms. Hi Andrew, As you know, I'm not a demangler maintainer, but in any case, I took a look and this looks good to me. > diff --git a/libiberty/testsuite/demangle-expected > b/libiberty/testsuite/demangle-expected > index 453f9a3..0e2bb12 100644 > --- a/libiberty/testsuite/demangle-expected > +++ b/libiberty/testsuite/demangle-expected > @@ -4343,3 +4343,9 @@ > cereal::detail::InputBindingMap::Serializers > cereal::p > --format=gnu-v3 > _ZNSt9_Any_data9_M_accessIPZ4postISt8functionIFvvEEEvOT_EUlvE_EERS5_v > void post >(std::function&&)::{lambda()#1}*& > std::_Any_data::_M_access >(void > post >(std::function ()>&&)::{lambda()#1}*&&)::{lambda()#1}*>() > +# The following input symbol was found during random, it caused a fault Could you add a single empty # above, to separate the tests? I find that that makes it much easier to follow the file. I have no idea why we can't have/handle real empty lines though. > +# The following input symbol was found during random, it caused a fault "during random testing?" > +# within the demangler, it's not a symbol we'd expect in the real world. Why not? > +--format=auto --no-params > +_Z3xxxDFyuVb > +xxx(unsigned long long _Fract, bool volatile) > +xxx -- Pedro Alves
Re: ipa-visibility TLC 2/n
On Wed, May 28, 2014 at 6:31 PM, Jan Hubicka wrote: >> Honza, >> >> I'm glad that you're making progress. >> >> > David, this looks like a bug in the AIX target output macros. I get: >> > .set >> > _ZTCSt14basic_ifstreamIcSt11char_traitsIcEE0_Si.localalias.69,_ZTCSt14basic_ifstreamIcSt11char_traitsIcEE0_Si >> >> > (this is correct since localalias is really an alias) >> >> > _ZTCSt14basic_ifstreamIcSt11char_traitsIcEE0_Si.localalias.69: >> > .space 40 >> > _ZTCSt14basic_ifstreamIcSt11char_traitsIcEE0_Si: >> ... >> >> > This is wrong, since we should not try to out the variable at least if I >> > read AIX assembly correctly. >> >> > varpool has explicit test to not output any aliases, so perhaps this is a >> > bug in wrapup_globals >> > and AIX output macros. I will try to track more after my teaching tonight. >> >> The AIX support handles AIX XCOFF assembler syntax and chooses >> appropriate sections, but it would not choose to emit an extra >> definition. If there are multiple definitions, then the varasm macros >> are being invoked multiple times for the same symbol. > > I now understand what goes wrong: AIX uses section anchors and the code to > produce them completely ignore the existence of aliases. It means that we > get a variable and alias on a different offsets within the anchor. > > I think we need to somehow propagate anchros of objects into anchors of > aliases. > The code goes by: > > validize_mem > that gets SYMBOL_REF for the alias > to > use_anchored_address > to > place_block_symbol > > and place_block adds the alias and allocates space for it. > I suppose we want to walk to alias target there: > > Index: varasm.c > === > --- varasm.c(revision 211028) > +++ varasm.c(working copy) > @@ -1083,6 +1083,9 @@ > { >addr_space_t as = ADDR_SPACE_GENERIC; >int reloc; > + symtab_node *snode = symtab_get_node (decl); > + if (snode) > +decl = symtab_alias_ultimate_target (snode)->decl; > >if (TREE_TYPE (decl) != error_mark_node) > as = TYPE_ADDR_SPACE (TREE_TYPE (decl)); > @@ -7084,7 +7087,16 @@ > } >else > { > + struct symtab_node *snode; >decl = SYMBOL_REF_DECL (symbol); > + > + snode = symtab_node (decl); > + if (snode->alias) > + { > + rtx target = DECL_RTL (symtab_alias_ultimate_target (snode)->decl); > + SYMBOL_REF_BLOCK_OFFSET (symbol) = SYMBOL_REF_BLOCK_OFFSET (target); > + return; > + } >alignment = get_variable_align (decl); >size = tree_to_uhwi (DECL_SIZE_UNIT (decl)); >if ((flag_sanitize & SANITIZE_ADDRESS) > > I am testing if that lets me to finish a bootstrap at gcc111 Richard Sandiford wrote the original section anchors implementation, so he would be a good person to comment about the interaction between aliases and section anchors. Thanks, David
Re: -fuse-caller-save - Collect register usage information
Tom, the final version of this patch that you committed breaks bootstrap on powerpc64le-linux-gnu. The problem is that all uses of the variable i are guarded by #ifdef STACK_REGS, but the declaration of i is unconditional. We get an unused variable warning that becomes an error during stage 3. Thanks, Bill On Mon, 2014-05-19 at 16:30 +0200, Tom de Vries wrote: > On 17-05-14 12:51, Eric Botcazou wrote: > >> This is the updated version of the previously approved patch > >> submitted here (http://gcc.gnu.org/ml/gcc-patches/2013-03/msg01320.html ). > >> The changes are: > >> - using a new hook call_fusage_contains_non_callee_clobbers, > >> - incorporating minor review comments from Richard Sandiford > >> ( http://gcc.gnu.org/ml/gcc-patches/2014-04/msg01436.html ). > >> > >> As part of the fuse-caller-save patch series, bootstrapped and reg-tested > >> on > >> x86_64, and build and reg-tested on MIPS. > >> > >> Eric, non-cgraph part OK for trunk? > > > > Eric, > > thanks for the review. > > > I think we should consider creating a new rule: for every target hook added, > > another must be first removed... > > > > So this call_fusage_contains_non_callee_clobbers is essentially only a stop > > gap measure for the ports that haven't been changed yet? > > I think so. > > > If so, please add a > > ??? comment at the beginning of collect_fn_hard_reg_usage: > > > >/* ??? To be removed when all the ports have been fixed. */ > >if (!targetm.call_fusage_contains_non_callee_clobbers) > > > > Done. > > > and invoke collect_fn_hard_reg_usage from rest_of_handle_final only when > > flag_use_caller_save is true. > > > > Done. > > > Why do you need to retest them in get_call_reg_set_usage and > > get_call_fndecl? > > > > The test for flag_use_caller-save in get_call_fndecl was unnecessary, I've > removed it. > > The test in get_call_reg_set_usage for flag_use_caller_save and the hook is > strictly speaking not necessary. But it's the interface function to retrieve > the > collected register usage information, so it seems a good location to do an > early-out. I've left it in for now. > > Bootstrapped and reg-tested on x86_64. > > non-cgraph part OK for trunk? > > Thanks, > - Tom
[PATCH v2 3/3] mangler/demangler dogfooding
On 05/27/2014 12:57 PM, Pedro Alves wrote: > +dmgl_opts = (DMGL_VERBOSE > + | DMGL_ANSI > + | DMGL_GNU_V3 > + | DMGL_RET_POSTFIX > + | DMGL_PARAMS); Hmm, don't know why I had put DMGL_RET_POSTFIX there in the first place. That's unintended... GDB only uses that for java. Without that, the set of symbols we can't demangle when running the testsuite drops from 2600 to 300. Clearly that shows DMGL_RET_POSTFIX is buggy, but that's not something we really about... And related, I've now added type demangling coverage too. > +mangled_str = (const char *) obstack_base (mangle_obstack); > +str = cplus_demangle_v3 (mangled_str, dmgl_opts); Err, I forgot to free the demangled string. Fixed now. > +#if 0 I've now added a #define for this, instead of #if 0. > +/* XXX The above catches potential demangler crashes. And, > + ideally, we'd also abort if demangling fails. However, we > + can't do that because the demangler isn't able to demangle all > + symbols we generate by default. Some failures might be > + demangler bugs, others unknown mangler bugs, and others known > + mangler bugs fixed with a higher -fabi-version, which the > + demangler doesn't have a workaround for. */ > +if ((str != NULL) != (mangled_str[0] == '_' && mangled_str[1] == 'Z')) > + internal_error ("demangling failed for: %s", mangled_str); I've now made this check complete in accordance to cp-demangle.c. Here's v2. Okay to apply, once patch #2 is in? 8< Subject: [PATCH] mangler/demangler dogfooding This patch teaches g++ to try to demangle any symbol it generates/mangles, when checking is enabled in the build, as soon as the symbol is mangled. The idea here is validate the demangling as close as possible to the generator as possible. This allows catching demangler bugs/crashes much earlier in the cycle, when building libstdc++ and friends, when running g++'s testsuite, and even potentially earlier than that, when developers work on new C++11/14-and-beyond features, which influence mangling, validating against some random test that's not in the testsuite yet (and sometimes doesn't make it there either), rather than much later in production when the user is trying to debug the code, or the program tries to generate a backtrace. Both libstdc++ and the testsuite include a good set of tricky symbols to demangle, and the latter naturally includes all sort of random, weird, code full of corner cases. And, I assume g++ maintainers once in a while run WIP g++ through some piles of very complicated C++ code. It seems clear to me that ideally we should be able to do assert (demangle (mangle (symbol)); But, unfortunately, we can't yet. I built g++ and ran the testsuite with a variant of this patch that would print the mangled symbol if the demangling fails, but would otherwise continue without aborting, and I found out that: - we can't demangle ~40 symbols generated while building libstdc++ and friends. E.g.: _ZN9__gnu_cxx13new_allocatorINSt13__future_base13_State_baseV2EE9constructIS2_IEEEvPT_DpOT0_ _ZNSt15_Sp_counted_ptrIDnLN9__gnu_cxx12_Lock_policyE1EEC1ERKS3_ - we can't demangle around ~300 symbols generated while running the g++ testsuite. E.g., _Z1f1SIXadL3FooEEE _ZZ4mainENKUlRT_E_clI1SEEDaS0_ Of course, these all may well be mangler bugs rather than demangler bugs. It's possible that these are already known mangler bugs even, that have been fixed, but require a higher -fabi-version=X. I didn't investigate that. Bootstrapped and regtested on x86_64 Fedora 20. gcc/ 2014-05-28 Pedro Alves * cp/mangle.c: Include demangle.h. (EXTRA_DEMANGLE_CHECKING): Define if not defined. (globals) : New field. (start_mangling): Clear it. (finish_mangling_internal) [ENABLE_CHECKING || EXTRA_DEMANGLE_CHECKING]: Demangle the just mangled symbol. (mangle_decl_string): Set G.type if mangling a type. (mangle_type_string): Set G.type. --- gcc/cp/mangle.c | 61 - 1 file changed, 60 insertions(+), 1 deletion(-) diff --git a/gcc/cp/mangle.c b/gcc/cp/mangle.c index 4205fec..b323c9c 100644 --- a/gcc/cp/mangle.c +++ b/gcc/cp/mangle.c @@ -58,6 +58,7 @@ along with GCC; see the file COPYING3. If not see #include "target.h" #include "cgraph.h" #include "wide-int.h" +#include "demangle.h" /* Debugging support. */ @@ -66,6 +67,14 @@ along with GCC; see the file COPYING3. If not see #define DEBUG_MANGLE 0 #endif +/* If ENABLE_CHECKING, we try to demangle what we mangle in order to + catch nasty demangler crashes early. But by default, we don't + actually make sure the result is sane. Define this to enable such + extra validation. */ +#ifndef EXTRA_DEMANGLE_CHECKING +#define EXTRA_DEMANGLE_CHECKING 1 +#endif + /* Macros for tracing the write_* functions. */ #if DE
Re: OpenMP target update tests
On Thu, May 29, 2014 at 12:14:45AM +0200, Jakub Jelinek wrote: > > +#pragma omp target update from(a, b, b) /* { dg-error "'b' appears more > > than once in motion clauses" } */ > > +#pragma omp target update to(a) to(b, b) /* { dg-error "'b' appears more > > than once in motion clauses" } */ > > +#pragma omp target update from(a, b) from(b) /* { dg-error "'b' appears > > more than once in motion clauses" } */ > > +#pragma omp target update from(a) to(b) from(b) /* { dg-error "'b' appears > > more than once in motion clauses" } */ > > +#pragma omp target update from(a) to(b) to(b) /* { dg-error "'b' appears > > more than once in motion clauses" } */ > > + > > +#pragma omp target update to(a) from(a[1:1]) /* { dg-error "'a' appears > > more than once in motion clauses" "not implemented" { xfail *-*-* } } */ > > Here I guess it really depends if a and a[1:1] is the same list item, I'd > say they are not. I think it should be fine and useful to use at least > the cases where there is the same underlying variable, but non-overlapping > sections like: > #pragma omp target update to(a[10:4]) from(a[0:5]) Also, if you have say: int *a; void foo () { #pragma omp target update to (a) from (a[0:10]) } then there is actually no overlap, this would write the host pointer value to the device's a pointer copy, and copy 40 bytes from the device back to where a points. Jakub
Re: ipa-visibility TLC 2/n
> Honza, > > I'm glad that you're making progress. > > > David, this looks like a bug in the AIX target output macros. I get: > > .set > > _ZTCSt14basic_ifstreamIcSt11char_traitsIcEE0_Si.localalias.69,_ZTCSt14basic_ifstreamIcSt11char_traitsIcEE0_Si > > > (this is correct since localalias is really an alias) > > > _ZTCSt14basic_ifstreamIcSt11char_traitsIcEE0_Si.localalias.69: > > .space 40 > > _ZTCSt14basic_ifstreamIcSt11char_traitsIcEE0_Si: > ... > > > This is wrong, since we should not try to out the variable at least if I > > read AIX assembly correctly. > > > varpool has explicit test to not output any aliases, so perhaps this is a > > bug in wrapup_globals > > and AIX output macros. I will try to track more after my teaching tonight. > > The AIX support handles AIX XCOFF assembler syntax and chooses > appropriate sections, but it would not choose to emit an extra > definition. If there are multiple definitions, then the varasm macros > are being invoked multiple times for the same symbol. I now understand what goes wrong: AIX uses section anchors and the code to produce them completely ignore the existence of aliases. It means that we get a variable and alias on a different offsets within the anchor. I think we need to somehow propagate anchros of objects into anchors of aliases. The code goes by: validize_mem that gets SYMBOL_REF for the alias to use_anchored_address to place_block_symbol and place_block adds the alias and allocates space for it. I suppose we want to walk to alias target there: Index: varasm.c === --- varasm.c(revision 211028) +++ varasm.c(working copy) @@ -1083,6 +1083,9 @@ { addr_space_t as = ADDR_SPACE_GENERIC; int reloc; + symtab_node *snode = symtab_get_node (decl); + if (snode) +decl = symtab_alias_ultimate_target (snode)->decl; if (TREE_TYPE (decl) != error_mark_node) as = TYPE_ADDR_SPACE (TREE_TYPE (decl)); @@ -7084,7 +7087,16 @@ } else { + struct symtab_node *snode; decl = SYMBOL_REF_DECL (symbol); + + snode = symtab_node (decl); + if (snode->alias) + { + rtx target = DECL_RTL (symtab_alias_ultimate_target (snode)->decl); + SYMBOL_REF_BLOCK_OFFSET (symbol) = SYMBOL_REF_BLOCK_OFFSET (target); + return; + } alignment = get_variable_align (decl); size = tree_to_uhwi (DECL_SIZE_UNIT (decl)); if ((flag_sanitize & SANITIZE_ADDRESS) I am testing if that lets me to finish a bootstrap at gcc111
Re: [PATCH 1/3] Fix demangler testsuite crashes with CP_DEMANGLE_DEBUG defined
+gdb-patches On 05/27/2014 02:56 PM, Ian Lance Taylor wrote: > On Tue, May 27, 2014 at 4:57 AM, Pedro Alves wrote: >> >> libiberty/ >> 2014-05-26 Pedro Alves >> >> * cp-demangle.c (d_dump): Handle DEMANGLE_COMPONENT_FUNCTION_PARAM >> and DEMANGLE_COMPONENT_NUMBER. > > This is OK. Thank you. I've committed this to gcc svn, and pushed it to binutils-gdb as well. 8<- From: Pedro Alves Subject: [PATCH] Fix demangler testsuite crashes with CP_DEMANGLE_DEBUG defined Running the demangler's testsuite with CP_DEMANGLE_DEBUG defined crashes, with: Program received signal SIGSEGV, Segmentation fault. 0x0040a8c3 in d_dump (dc=0x1, indent=12) at ../../src/libiberty/cp-demangle.c:567 567 switch (dc->type) (gdb) bt 3 #0 0x0040a8c3 in d_dump (dc=0x1, indent=12) at ../../src/libiberty/cp-demangle.c:567 #1 0x0040ae47 in d_dump (dc=0x7fffd098, indent=10) at ../../src/libiberty/cp-demangle.c:787 #2 0x0040ae47 in d_dump (dc=0x7fffd0c8, indent=8) at ../../src/libiberty/cp-demangle.c:787 Note dc=0x1, which is obviously a bogus pointer. This is the end of d_dump recursing for a component type that that doesn't actually have subtrees: 787 d_dump (d_left (dc), indent + 2); 788 d_dump (d_right (dc), indent + 2); This fixes the two cases the testsuite currently trips on. libiberty/ 2014-05-28 Pedro Alves * cp-demangle.c (d_dump): Handle DEMANGLE_COMPONENT_FUNCTION_PARAM and DEMANGLE_COMPONENT_NUMBER. --- libiberty/ChangeLog | 5 + libiberty/cp-demangle.c | 6 ++ 2 files changed, 11 insertions(+) diff --git a/libiberty/ChangeLog b/libiberty/ChangeLog index 7b25c7e..16eb6f7 100644 --- a/libiberty/ChangeLog +++ b/libiberty/ChangeLog @@ -1,3 +1,8 @@ +2014-05-28 Pedro Alves + + * cp-demangle.c (d_dump): Handle DEMANGLE_COMPONENT_FUNCTION_PARAM + and DEMANGLE_COMPONENT_NUMBER. + 2014-05-22 Thomas Schwinge * testsuite/demangle-expected: Fix last commit. diff --git a/libiberty/cp-demangle.c b/libiberty/cp-demangle.c index 68d8ee1..c0d2ffe 100644 --- a/libiberty/cp-demangle.c +++ b/libiberty/cp-demangle.c @@ -575,6 +575,9 @@ d_dump (struct demangle_component *dc, int indent) case DEMANGLE_COMPONENT_TEMPLATE_PARAM: printf ("template parameter %ld\n", dc->u.s_number.number); return; +case DEMANGLE_COMPONENT_FUNCTION_PARAM: + printf ("function parameter %ld\n", dc->u.s_number.number); + return; case DEMANGLE_COMPONENT_CTOR: printf ("constructor %d\n", (int) dc->u.s_ctor.kind); d_dump (dc->u.s_ctor.name, indent + 2); @@ -760,6 +763,9 @@ d_dump (struct demangle_component *dc, int indent) case DEMANGLE_COMPONENT_CHARACTER: printf ("character '%c'\n", dc->u.s_character.character); return; +case DEMANGLE_COMPONENT_NUMBER: + printf ("number %ld\n", dc->u.s_number.number); + return; case DEMANGLE_COMPONENT_DECLTYPE: printf ("decltype\n"); break; -- 1.9.0
Re: [AArch64 costs 14/18] Cost comparisons, flag setting operators and IF_THEN_ELSE
On Wed, May 28, 2014 at 10:51:10PM +0100, Andrew Pinski wrote: > On Thu, Mar 27, 2014 at 10:33 AM, James Greenhalgh > wrote: > > Hi, > > > > Next, comparisons, flag setting operations and IF_THEN_ELSE. > > > > Tested on aarch64-none-elf. > > > > Ok for stage 1? > > This broke building the Linux kernel. > A simple testcase: > unsigned grab_cache_page_write_begin(unsigned flags, unsigned capabilities) > { > unsigned gfp_mask; > unsigned gfp_notmask = 0; > gfp_mask = flags & ((1 << 25) - 1); > if (!(capabilities & 0x0001)) > gfp_mask |= 0x100u; > return (gfp_mask & ~gfp_notmask); > } > > CUT > The problem is combine creates the following RTL: > (if_then_else:SI (reg:SI 78 [ D.2578 ]) > (const_int 0 [0]) > (const_int 16777216 [0x100])) > > Which the code you added does not handle. I am going to fix this but > I need to re-factor this code. I am going to place the code for > IF_THEN_ELSE in its own function also since it is getting too large > for my taste. Ugh, sorry for that and confirmed locally. Refactoring this is very sensible. Thanks for working on the fix. James > > Thanks, > Andrew Pinski > > > > > Thanks, > > James > > > > --- > > 2014-03-27 James Greenhalgh > > Philipp Tomsich > > > > * config/aarch64/aarch64.c (aarch64_rtx_costs): Cost comparison > > operators. >
Re: Build problems on arm-linux-gnueabihf
> > Thanks ! > > Unfortunately this is not the only bug at the time. > > see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61331 > > which is even more painful... > > I can currently only build anything if I do this: > > svn diff -r210963:210965 | patch -p0 -R thanks! I will regtest and go ahead with this change. I debugged furhter the AIX issue and it is unrelated - AIX use section anchors and it seems that their implementation completely ignore existence of aliases :( Honza
Re: [PATCH] cplus-demangler, free resource after a failed call to gnu_special.
On 05/22/2014 12:57 PM, Thomas Schwinge wrote: > On Wed, 14 May 2014 15:20:16 +0100, Gary Benson wrote: >> Andrew Burgess wrote: >>> On 14/05/2014 10:01 AM, Gary Benson wrote: Ian Lance Taylor wrote: > This patch is OK. Andrew, would you like me to commit this? >>> >>> Yes please. >> >> Done: >> https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=210425 > ... > Fix test in libiberty/testsuite/demangle-expected. > > libiberty/ > * testsuite/demangle-expected: Fix last commit. Thanks. I've merged this and Andrew's patch to the binutils-gdb git repo. -- Pedro Alves
Re: [patch i386]: Expand sibling-tail-calls via accumulator register
- Original Message - > On Wed, May 28, 2014 at 03:54:58PM -0600, Jeff Law wrote: > > >Why not get rid of all the above 4 lines and just keep: > > > > > >> return CONSTANT_P (op); > > > > > >? CONST matches CONSTANT_P, and what is inside of CONST should be > > >fine, and (plus (symbol_ref) (const_int)) not surrounded by CONST > > >ir invalid. > > Haven't we recently had problems with being overly accepting of > > stuff inside CONST when using the CONST for address expressions. > > ISTM we should only accept what the processor supports here. > > The only recent problem I remember was that we forgot to put CONST > around (plus (symbol_ref) (const_int)), but I see no problem not > accepting such invalid RTL. > The processor shouldn't care, for the instructions a CONST is just > any kind of immediate, what exactly it is matters solely to the > assembler/linker and dynamic linker > (if there are relocations for it, if the expression can be expressed in > the assembly, etc.). But that is common to all CONST operands, there is > nothing special particularly about sibcalls. > > Jakub > Well, actually we want to prevent to accept anything plus/mult within memory-addresses, which hasn't a symbol-ref, or a constant-value as arguments. Is it for sure that there are within a CONST-rtx no registers? If so, we could check intitally for CONSTANT_P. Kai
Re: OpenMP target update tests
On Wed, May 28, 2014 at 10:38:55PM +0200, Thomas Schwinge wrote: > --- gcc/c/c-parser.c > +++ gcc/c/c-parser.c > @@ -13530,7 +13530,7 @@ c_parser_omp_target_update (location_t loc, c_parser > *parser, >&& find_omp_clause (clauses, OMP_CLAUSE_FROM) == NULL_TREE) > { >error_at (loc, > - "%<#pragma omp target update must contain at least one " > + "%<#pragma omp target update%> must contain at least one " > "% or % clauses"); >return false; > } This is fine, ditto the cp/parser.c change. > --- gcc/c/c-typeck.c > +++ gcc/c/c-typeck.c > @@ -12152,6 +12152,8 @@ c_finish_omp_clauses (tree clauses) > remove = true; > } > } > + /* Currently, we're not doing any further checking for array > + sections. */ > break; > } > if (t == error_mark_node) I don't know what more checking would be desirable, in fact, reading the latest standard I actually see no restriction that the same list item can't appear in multiple map (or multiple from or multiple to) clauses on the same construct, there is only a restriction that the same list item should not appear in both to and from clauses on the same construct. Will need to go through my omp-lang related mails to see if this has been discussed post 4.0. But, as soon as array sections are involved, it is hard (especially if variable bounds are involved) to actually analyze this restriction. > +#pragma omp target update from(a, a, b) /* { dg-error "'a' appears more than > once in motion clauses" } */ > +#pragma omp target update to(a, a) to(b) /* { dg-error "'a' appears more > than once in motion clauses" } */ > +#pragma omp target update from(a) from(a, b) /* { dg-error "'a' appears more > than once in motion clauses" } */ See above, I believe this is not invalid. > +#pragma omp target update from(a) to(a) from(b) /* { dg-error "'a' appears > more than once in motion clauses" } */ > +#pragma omp target update to(a) to(a) from(b) /* { dg-error "'a' appears > more than once in motion clauses" } */ The above one too. > +#pragma omp target update from(a, b, b) /* { dg-error "'b' appears more than > once in motion clauses" } */ > +#pragma omp target update to(a) to(b, b) /* { dg-error "'b' appears more > than once in motion clauses" } */ > +#pragma omp target update from(a, b) from(b) /* { dg-error "'b' appears more > than once in motion clauses" } */ > +#pragma omp target update from(a) to(b) from(b) /* { dg-error "'b' appears > more than once in motion clauses" } */ > +#pragma omp target update from(a) to(b) to(b) /* { dg-error "'b' appears > more than once in motion clauses" } */ > + > +#pragma omp target update to(a) from(a[1:1]) /* { dg-error "'a' appears more > than once in motion clauses" "not implemented" { xfail *-*-* } } */ Here I guess it really depends if a and a[1:1] is the same list item, I'd say they are not. I think it should be fine and useful to use at least the cases where there is the same underlying variable, but non-overlapping sections like: #pragma omp target update to(a[10:4]) from(a[0:5]) Jakub
RE: Build problems on arm-linux-gnueabihf
Hi, On Wed, 28 May 2014 22:36:17, Jan Hubicka wrote: > > On the other hand, the alias created ought to inherit properties form its > target, so yes, we probably want to copy flags that matters, including > DECL_THREAD_LOCAL_P. We however should not copy DECL_INITIAL - we never have > constructors for aliases and there is ctor_for_folding function that gets you > from decl to corresponding constructor that knows how to walk aliases. > So if varasm looks into DECL_INITIAL of an alias, I think it is a bug. > > We probably want to copy other flags. Does the patch works if you copy > TREE_SIDE_EFFECTS > alone? But why that particular variable has side effects at first place? > No, that was my first attempt, but unfortunately this did not make a difference, What seems to be important is if DECL_INITIAL(decl) == NULL that means it is in BSS segment, and -fzero-initialized-in-bss (the default) it looks also at the initial data if they are zero. I but for RELRO, it seems also to be important, if the TREE_CONSTANT flag is set on the DECL_INITIAL(decl). Other properties of the DECL_INITIAL are not important. except, if the TREE_CODE of the DECL_INITIAL is STRING_CST, then the section may be also different And look at compute_reloc_for_constant, this inspects the initial value if it binds to locals or externals, or not. That tells us if it has relocations or not. Hmm, I am not sure, if it is possible to leave the DECL_INITIAL away... > /* In any expression, decl, or constant, nonzero means it has side effects or > reevaluation of the whole expression could produce a different value. > This is set if any subexpression is a function call, a side effect or a > reference to a volatile variable. In a ..._DECL, this is set only if the > declaration said `volatile'. This will never be set for a constant. */ > > I do not see how vtable may end up being volatile. > > I posted following patch: > Index: varasm.c > === > --- varasm.c (revision 211028) > +++ varasm.c (working copy) > @@ -1083,6 +1083,9 @@ > { > addr_space_t as = ADDR_SPACE_GENERIC; > int reloc; > + symtab_node *snode = symtab_get_node (decl); > + if (snode) > + decl = symtab_alias_ultimate_target (snode)->decl; > > if (TREE_TYPE (decl) != error_mark_node) > as = TYPE_ADDR_SPACE (TREE_TYPE (decl)); > > which should make the variable sections to go according to alias target, I > suppose > it doesn't help to your build? > I just tried it, and ... Yes, it worked, somehow ... > Sorry for slow reaction - I teach now monday-wednesday a class in linear > algebra > (as part of my post-doc) and it is time consuming (since the semmester is only > 6 weeks). I can do serious GCC hacking only thursday/friday/weekend thus. > I will try to fix this today however - I realize it is painful bug. > Thanks ! Unfortunately this is not the only bug at the time. see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61331 which is even more painful... I can currently only build anything if I do this: svn diff -r210963:210965 | patch -p0 -R Bernd. > Honza > >> >> maybe also on CONST_CAST_TREE (decl) in case of asan >> and various target hooks. >> >> >> now, since the error message names a ".localalias." >> >> I see your checkin from monday at the same area: >> >> r210919 | hubicka | 2014-05-26 02:50:24 +0200 (Mo, 26. Mai 2014) | 2 Zeilen >> >> * symtab.c (symtab_nonoverwritable_alias): Copy READONLY flag for >> variables. >> >> and I wonder if there are more flags missing: >> >> Index: symtab.c >> === >> --- symtab.c(Revision 211028) >> +++ symtab.c(Arbeitskopie) >> @@ -1165,6 +1165,8 @@ symtab_nonoverwritable_alias (symtab_node *node) >>else >> { >>TREE_READONLY (new_decl) = TREE_READONLY (node->decl); >> + TREE_SIDE_EFFECTS (new_decl) = TREE_SIDE_EFFECTS (node->decl); >> + DECL_INITIAL (new_decl) = DECL_INITIAL (node->decl); >>new_node = varpool_create_variable_alias (new_decl, node->decl); >> } >>symtab_resolve_alias (new_node, node); >> >> >> This makes the build succeed for me, but I am not really sure, if that is >> the right direction to fix this, and if maybe even more flags may be missing, >> for instance: DECL_THREAD_LOCAL_P or CONST_CAST_TREE? >> >> What do you think? >> >> >> >> Bernd. >>
[committed] Add myself to MAINTAINERS.
2014-05-28 Pedro Alves * MAINTAINERS (Write After Approval): Add myself. --- MAINTAINERS | 1 + 1 file changed, 1 insertion(+) diff --git a/MAINTAINERS b/MAINTAINERS index 0c0062b..a710ccb 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -312,6 +312,7 @@ from other maintainers or reviewers. Write After Approval(last name alphabetical order) Mark G. Adams mark.g.ad...@sympatico.ca +Pedro Alvespal...@redhat.com Raksit Ashok rak...@google.com Matt Austern aust...@google.com David Ayersay...@fsfe.org -- 1.9.0
Re: [patch i386]: Expand sibling-tail-calls via accumulator register
On Wed, May 28, 2014 at 03:54:58PM -0600, Jeff Law wrote: > >Why not get rid of all the above 4 lines and just keep: > > > >> return CONSTANT_P (op); > > > >? CONST matches CONSTANT_P, and what is inside of CONST should be > >fine, and (plus (symbol_ref) (const_int)) not surrounded by CONST > >ir invalid. > Haven't we recently had problems with being overly accepting of > stuff inside CONST when using the CONST for address expressions. > ISTM we should only accept what the processor supports here. The only recent problem I remember was that we forgot to put CONST around (plus (symbol_ref) (const_int)), but I see no problem not accepting such invalid RTL. The processor shouldn't care, for the instructions a CONST is just any kind of immediate, what exactly it is matters solely to the assembler/linker and dynamic linker (if there are relocations for it, if the expression can be expressed in the assembly, etc.). But that is common to all CONST operands, there is nothing special particularly about sibcalls. Jakub
Re: ipa-visibility TLC 2/n
> Any update? > > I've managed to generate a simple test case from > libstdc++-v3/src/c++98/strstream.cc which reproduces the issue on > ARM that Ramana has reported previously: > > > template struct char_traits; > > template > class basic_ios > { > }; > > template > > class basic_istream : virtual public basic_ios<_CharT, _Traits> > { > protected: > int _M_gcount; > virtual ~basic_istream() > { } > }; > > class istrstream : public basic_istream > { > virtual ~istrstream(); > }; > > istrstream::~istrstream() > { > } > > -- CUT -- > > With an arm-none-linux-gnueabi gcc configured as: > > ./gcc/configure --target=arm-none-linux-gnueabi > --enable-gnu-indirect-function --enable-shared --with-arch=armv7-a > --with-fpu=vfpv3-d16 --with-float=softfp --with-arch=armv7-a > (irrelevant parts omitted) > > With the following command line options: > > -fdata-sections-O2 -fPIC -S ./test.cpp > > We'll see > > ./test.cpp:17:7: error: istrstream::_ZTV10istrstream.localalias.0 > causes a section type conflict with istrstream::_ZTV10istrstream > class istrstream : public basic_istream >^ > ./test.cpp:17:7: note: 'istrstream::_ZTV10istrstream' was declared here This seems to be same cause as on AIX - we do section for decl rather than original. The following patch seems to fix it. Does it allows bootstrap for you? (it doesn't for AIX. but that seems bug in output machinery) Index: varasm.c === --- varasm.c(revision 210914) +++ varasm.c(working copy) @@ -1083,6 +1083,9 @@ { addr_space_t as = ADDR_SPACE_GENERIC; int reloc; + symtab_node *snode = symtab_get_node (decl); + if (snode) +decl = symtab_alias_ultimate_target (snode)->decl; if (TREE_TYPE (decl) != error_mark_node) as = TYPE_ADDR_SPACE (TREE_TYPE (decl)); > > > Yufeng
Re: [PATCH 7/7] Plug ipa-prop escape analysis into gimple_call_arg_flags
> Hi! > > On Mon, 26 May 2014 02:16:35 -0700, Andrew Pinski wrote: > > On Mon, May 26, 2014 at 1:59 AM, Dominique Dhumieres > > wrote: > > > r210901 breaks bootstrap on targets not supporting strnlen, e.g., > > > darwin10. > > > > > > ../../_clean/gcc/lto-cgraph.c:976:68: error: 'strnlen' was not declared > > > in this scope > > I'm seeing the same on MinGW, which also doesn't have strnlen (which is a > GNU extension). > > > strnlen should be declared in include/libiberty.h if there is no > > declaration as libiberty support is already there. That should be a > > simple fix. > > Like this? This looks good to me (thoguh strnlen is posix). I can not approve the patch but I would preffer it over just hand implementing strnlen there (that is easy, too) Honza > > --- gcc/config.in > +++ gcc/config.in > [Regenerate.] > --- gcc/configure > +++ gcc/configure > [Regenerate.] > --- gcc/configure.ac > +++ gcc/configure.ac > @@ -1136,7 +1136,7 @@ CFLAGS="$CFLAGS -I${srcdir} -I${srcdir}/../include > $GMPINC" > saved_CXXFLAGS="$CXXFLAGS" > CXXFLAGS="$CXXFLAGS -I${srcdir} -I${srcdir}/../include $GMPINC" > gcc_AC_CHECK_DECLS(getenv atol asprintf sbrk abort atof getcwd getwd \ > - strsignal strstr stpcpy strverscmp \ > + stpcpy strnlen strsignal strstr strverscmp \ > errno snprintf vsnprintf vasprintf malloc realloc calloc \ > free basename getopt clock getpagesize ffs gcc_UNLOCKED_FUNCS, , ,[ > #include "ansidecl.h" > diff --git include/libiberty.h include/libiberty.h > index 7fd0703..56b8b43 100644 > --- include/libiberty.h > +++ include/libiberty.h > @@ -636,6 +636,10 @@ extern int snprintf (char *, size_t, const char *, ...) > ATTRIBUTE_PRINTF_3; > extern int vsnprintf (char *, size_t, const char *, va_list) > ATTRIBUTE_PRINTF(3,0); > #endif > > +#if defined (HAVE_DECL_STRNLEN) && !HAVE_DECL_STRNLEN > +extern size_t strnlen (const char *, size_t); > +#endif > + > #if defined(HAVE_DECL_STRVERSCMP) && !HAVE_DECL_STRVERSCMP > /* Compare version strings. */ > extern int strverscmp (const char *, const char *); > > > Grüße, > Thomas
Re: [patch i386]: Expand sibling-tail-calls via accumulator register
On 05/28/14 15:52, Jakub Jelinek wrote: On Wed, May 28, 2014 at 05:28:31PM -0400, Kai Tietz wrote: Yes, I missed the plus-part. I am just running bootstrap with regression testing for altering predicate to: (define_predicate "sibcall_memory_operand" (match_operand 0 "memory_operand") { op = XEXP (op, 0); if (GET_CODE (op) == CONST) op = XEXP (op, 0); if (GET_CODE (op) == PLUS && CONSTANT_P (XEXP (op, 0))) op = XEXP (op, 1); Why not get rid of all the above 4 lines and just keep: return CONSTANT_P (op); ? CONST matches CONSTANT_P, and what is inside of CONST should be fine, and (plus (symbol_ref) (const_int)) not surrounded by CONST ir invalid. Haven't we recently had problems with being overly accepting of stuff inside CONST when using the CONST for address expressions. ISTM we should only accept what the processor supports here. jeff
Re: [patch i386]: Expand sibling-tail-calls via accumulator register
On Wed, May 28, 2014 at 05:28:31PM -0400, Kai Tietz wrote: > Yes, I missed the plus-part. > > I am just running bootstrap with regression testing for altering predicate to: > > (define_predicate "sibcall_memory_operand" > (match_operand 0 "memory_operand") > { > op = XEXP (op, 0); > > if (GET_CODE (op) == CONST) > op = XEXP (op, 0); > if (GET_CODE (op) == PLUS && CONSTANT_P (XEXP (op, 0))) > op = XEXP (op, 1); Why not get rid of all the above 4 lines and just keep: > return CONSTANT_P (op); ? CONST matches CONSTANT_P, and what is inside of CONST should be fine, and (plus (symbol_ref) (const_int)) not surrounded by CONST ir invalid. Jakub
Re: [AArch64 costs 14/18] Cost comparisons, flag setting operators and IF_THEN_ELSE
On Thu, Mar 27, 2014 at 10:33 AM, James Greenhalgh wrote: > Hi, > > Next, comparisons, flag setting operations and IF_THEN_ELSE. > > Tested on aarch64-none-elf. > > Ok for stage 1? This broke building the Linux kernel. A simple testcase: unsigned grab_cache_page_write_begin(unsigned flags, unsigned capabilities) { unsigned gfp_mask; unsigned gfp_notmask = 0; gfp_mask = flags & ((1 << 25) - 1); if (!(capabilities & 0x0001)) gfp_mask |= 0x100u; return (gfp_mask & ~gfp_notmask); } CUT The problem is combine creates the following RTL: (if_then_else:SI (reg:SI 78 [ D.2578 ]) (const_int 0 [0]) (const_int 16777216 [0x100])) Which the code you added does not handle. I am going to fix this but I need to re-factor this code. I am going to place the code for IF_THEN_ELSE in its own function also since it is getting too large for my taste. Thanks, Andrew Pinski > > Thanks, > James > > --- > 2014-03-27 James Greenhalgh > Philipp Tomsich > > * config/aarch64/aarch64.c (aarch64_rtx_costs): Cost comparison > operators.
Re: [PATCH][2/n] Always 64bit-HWI cleanups
Richard Biener writes: > The following changes the configury to insist on [u]int64_t being > available and removes the very old __int64 case. Autoconf doesn't > check for it, support came in via a big merge in Dec 2002, r60174, > and it was never used on the libcpp side until I fixed that with > the last patch of this series, so we couldn't have relied on it > at least since libcpp was introduced. > > Both libcpp and vmsdbg.h now use [u]int64_t, switching HOST_WIDE_INT > to literally use int64_t has to be done with the grand renaming of > all users due to us using 'unsigned HOST_WIDE_INT'. > > Btw, I couldn't find any "standard" way of writing > [u]int64_t literals (substitution for HOST_WIDE_INT_C) nor > one for printf formats (substitutions for HOST_WIDE_INT_PRINT > and friends). I'll consider doing s/HOST_WIDE_INT/[U]INT64/ > there if nobody comes up with a better plan. Not sure whether you meant that to apply to both groups, but as far as the HOST_WIDE_INT_C replacement goes, how about just using int64_t (N) instead of HOST_WIDE_INT_C (N) or INT64_C (N)? Thanks, Richard
Re: ipa-visibility TLC 2/n
Honza, I'm glad that you're making progress. > David, this looks like a bug in the AIX target output macros. I get: > .set > _ZTCSt14basic_ifstreamIcSt11char_traitsIcEE0_Si.localalias.69,_ZTCSt14basic_ifstreamIcSt11char_traitsIcEE0_Si > (this is correct since localalias is really an alias) > _ZTCSt14basic_ifstreamIcSt11char_traitsIcEE0_Si.localalias.69: > .space 40 > _ZTCSt14basic_ifstreamIcSt11char_traitsIcEE0_Si: ... > This is wrong, since we should not try to out the variable at least if I read > AIX assembly correctly. > varpool has explicit test to not output any aliases, so perhaps this is a bug > in wrapup_globals > and AIX output macros. I will try to track more after my teaching tonight. The AIX support handles AIX XCOFF assembler syntax and chooses appropriate sections, but it would not choose to emit an extra definition. If there are multiple definitions, then the varasm macros are being invoked multiple times for the same symbol. Thanks, David
Re: [patch i386]: Expand sibling-tail-calls via accumulator register
- Original Message - > On 05/28/14 11:22, Richard Henderson wrote: > > On 05/28/2014 01:43 AM, Kai Tietz wrote: > >> + if (GET_CODE (op) == CONST) > >> +op = XEXP (op, 0); > >> + return (GET_CODE (op) == SYMBOL_REF || CONSTANT_P (op)); > > > > Surely all this boils down to just CONSTANT_P (op), > > without having to look through the CONST at all. > Something certainly seems odd there. > > Kai, is your intention to detect symbol_ref + const_int? Isn't the > canonical form for that: > > (const (plus (symbol_ref) (const_int))? > > It appears to me the code above looks for > > (const (symbol_ref)) > > or > > (const (const_int)) > > ?!? > > Jeff > Yes, I missed the plus-part. I am just running bootstrap with regression testing for altering predicate to: (define_predicate "sibcall_memory_operand" (match_operand 0 "memory_operand") { op = XEXP (op, 0); if (GET_CODE (op) == CONST) op = XEXP (op, 0); if (GET_CODE (op) == PLUS && CONSTANT_P (XEXP (op, 0))) op = XEXP (op, 1); return CONSTANT_P (op); }) Kai
OpenMP target update tests
Hi! Does that look appropriate (for trunk)? gcc/c/c-parser.c | 2 +- gcc/c/c-typeck.c | 2 ++ gcc/cp/parser.c | 2 +- gcc/cp/semantics.c| 2 ++ gcc/testsuite/c-c++-common/gomp/target-update-1.c | 41 +++ 5 files changed, 47 insertions(+), 2 deletions(-) diff --git gcc/c/c-parser.c gcc/c/c-parser.c index ae13b0a..d9c91f0 100644 --- gcc/c/c-parser.c +++ gcc/c/c-parser.c @@ -13530,7 +13530,7 @@ c_parser_omp_target_update (location_t loc, c_parser *parser, && find_omp_clause (clauses, OMP_CLAUSE_FROM) == NULL_TREE) { error_at (loc, - "%<#pragma omp target update must contain at least one " + "%<#pragma omp target update%> must contain at least one " "% or % clauses"); return false; } diff --git gcc/c/c-typeck.c gcc/c/c-typeck.c index b6db627..ab78f6d 100644 --- gcc/c/c-typeck.c +++ gcc/c/c-typeck.c @@ -12152,6 +12152,8 @@ c_finish_omp_clauses (tree clauses) remove = true; } } + /* Currently, we're not doing any further checking for array +sections. */ break; } if (t == error_mark_node) diff --git gcc/cp/parser.c gcc/cp/parser.c index e52edbb..6c88fe4 100644 --- gcc/cp/parser.c +++ gcc/cp/parser.c @@ -30288,7 +30288,7 @@ cp_parser_omp_target_update (cp_parser *parser, cp_token *pragma_tok, && find_omp_clause (clauses, OMP_CLAUSE_FROM) == NULL_TREE) { error_at (pragma_tok->location, - "%<#pragma omp target update must contain at least one " + "%<#pragma omp target update%> must contain at least one " "% or % clauses"); return false; } diff --git gcc/cp/semantics.c gcc/cp/semantics.c index 7f06f3f..5f29268 100644 --- gcc/cp/semantics.c +++ gcc/cp/semantics.c @@ -5648,6 +5648,8 @@ finish_omp_clauses (tree clauses) remove = true; } } + /* Currently, we're not doing any further checking for array +sections. */ break; } if (t == error_mark_node) diff --git gcc/testsuite/c-c++-common/gomp/target-update-1.c gcc/testsuite/c-c++-common/gomp/target-update-1.c new file mode 100644 index 000..8476314 --- /dev/null +++ gcc/testsuite/c-c++-common/gomp/target-update-1.c @@ -0,0 +1,41 @@ +void +f (int a[10], float b) +{ +#pragma omp target update from(a) +#pragma omp target update to(a) +#pragma omp target update from(b) +#pragma omp target update to(b) + +#pragma omp target update from(a, b) +#pragma omp target update from(a) to(b) +#pragma omp target update to(a, b) + +#pragma omp target update from(a, a, b) /* { dg-error "'a' appears more than once in motion clauses" } */ +#pragma omp target update to(a, a) to(b) /* { dg-error "'a' appears more than once in motion clauses" } */ +#pragma omp target update from(a) from(a, b) /* { dg-error "'a' appears more than once in motion clauses" } */ +#pragma omp target update from(a) to(a) from(b) /* { dg-error "'a' appears more than once in motion clauses" } */ +#pragma omp target update to(a) to(a) from(b) /* { dg-error "'a' appears more than once in motion clauses" } */ + +#pragma omp target update from(a, b, b) /* { dg-error "'b' appears more than once in motion clauses" } */ +#pragma omp target update to(a) to(b, b) /* { dg-error "'b' appears more than once in motion clauses" } */ +#pragma omp target update from(a, b) from(b) /* { dg-error "'b' appears more than once in motion clauses" } */ +#pragma omp target update from(a) to(b) from(b) /* { dg-error "'b' appears more than once in motion clauses" } */ +#pragma omp target update from(a) to(b) to(b) /* { dg-error "'b' appears more than once in motion clauses" } */ + +#pragma omp target update to(a) from(a[1:1]) /* { dg-error "'a' appears more than once in motion clauses" "not implemented" { xfail *-*-* } } */ +#pragma omp target update to(a[1:1]) from(a[1:1]) /* { dg-error "'a' appears more than once in motion clauses" "not implemented" { xfail *-*-* } } */ +#pragma omp target update to(a[1:1]) to(a[1:1]) /* { dg-error "'a' appears more than once in motion clauses" "not implemented" { xfail *-*-* } } */ +#pragma omp target update to(a[1:3]) from(a[2:1]) /* { dg-error "'a' appears more than once in motion clauses" "not implemented" { xfail *-*-* } } */ + +#pragma omp target update from(a) if(0) + +#pragma omp target update from(a) if(0) if(0) /* { dg-error "too many 'if' clauses" } */ + +#pragma omp target update from(a) device(0) + +#pragma omp target update from(a) device(0) device(0) /* { dg-error "too many 'device' clauses" } */ + +#pragma omp target update if(0) /* { dg-error "'#pragma omp target update' must contain at least one 'from' or 'to' clauses" } */
[PATCH] demangler, only access valid fields for DEMANGLE_COMPONENT_FIXED_TYPE.
In two places when a struct demangle_component is of type DEMANGLE_COMPONENT_FIXED_TYPE we fall back to accessing the default s_binary member of the union rather than the s_fixed member. This is incorrect and can cause the demangler to crash. In d_dump I've changed the code to only access the s_fixed member of the union, and also added printing of the remaining parts of the s_fixed struct, this felt like the most useful thing to do. I've added a new test, this causes a SIGSEGV for me before the patch, and is fine afterwords, however, this undefined, so might not cause a crash on all platforms. If this is approved then please could someone commit it for me, I don't have gcc write access. Thanks, Andrew libiberty/ChangeLog: * cp-demangle.c (d_dump): Only access field from s_fixed part of the union for DEMANGLE_COMPONENT_FIXED_TYPE. (d_count_templates_scopes): Likewise. * testsuite/demangle-expected: New test case. --- libiberty/cp-demangle.c | 10 +- libiberty/testsuite/demangle-expected | 6 ++ 2 files changed, 15 insertions(+), 1 deletion(-) diff --git a/libiberty/cp-demangle.c b/libiberty/cp-demangle.c index 68d8ee1..a31dad4 100644 --- a/libiberty/cp-demangle.c +++ b/libiberty/cp-demangle.c @@ -710,7 +710,9 @@ d_dump (struct demangle_component *dc, int indent) printf ("pointer to member type\n"); break; case DEMANGLE_COMPONENT_FIXED_TYPE: - printf ("fixed-point type\n"); + printf ("fixed-point type, accum? %d, sat? %d\n", + dc->u.s_fixed.accum, dc->u.s_fixed.sat); + d_dump (dc->u.s_fixed.length, indent + 2) break; case DEMANGLE_COMPONENT_ARGLIST: printf ("argument list\n"); @@ -3869,7 +3871,13 @@ d_count_templates_scopes (int *num_templates, int *num_scopes, case DEMANGLE_COMPONENT_FUNCTION_TYPE: case DEMANGLE_COMPONENT_ARRAY_TYPE: case DEMANGLE_COMPONENT_PTRMEM_TYPE: + goto recurse_left_right; + case DEMANGLE_COMPONENT_FIXED_TYPE: + d_count_templates_scopes (num_templates, num_scopes, +dc->u.s_fixed.length); + break; + case DEMANGLE_COMPONENT_VECTOR_TYPE: case DEMANGLE_COMPONENT_ARGLIST: case DEMANGLE_COMPONENT_TEMPLATE_ARGLIST: diff --git a/libiberty/testsuite/demangle-expected b/libiberty/testsuite/demangle-expected index 453f9a3..0e2bb12 100644 --- a/libiberty/testsuite/demangle-expected +++ b/libiberty/testsuite/demangle-expected @@ -4343,3 +4343,9 @@ cereal::detail::InputBindingMap::Serializers cereal::p --format=gnu-v3 _ZNSt9_Any_data9_M_accessIPZ4postISt8functionIFvvEEEvOT_EUlvE_EERS5_v void post >(std::function&&)::{lambda()#1}*& std::_Any_data::_M_access >(void post >(std::function&&)::{lambda()#1}*&&)::{lambda()#1}*>() +# The following input symbol was found during random, it caused a fault +# within the demangler, it's not a symbol we'd expect in the real world. +--format=auto --no-params +_Z3xxxDFyuVb +xxx(unsigned long long _Fract, bool volatile) +xxx -- 1.8.1.3
Re: Build problems on arm-linux-gnueabihf
> > When I debug and set a breakpoint in varasm.c line 314 it looks as if > the flag SECTION_RELRO is different. This flag is set in I see, it is the same message, but different reason from AIX. > categorize_decl_for_section where it depends on > TREE_CODE(decl), and if it is a VAR_DECL also on > TREE_READONLY (decl) > TREE_SIDE_EFFECTS (decl) > DECL_INITIAL (decl) > TREE_CONSTANT (DECL_INITIAL (decl)) > DECL_THREAD_LOCAL_P (decl) the new variable is alias. As i wrote to ML and to Richard Henderson (who is guru for this area), I think it is wrong to determine section of alias from alias decl itself and rather we should use the same section as alias target (you can't place alias to different section than taret). So I think it is semi-latent varasm bug that I triggered. I think it should walk to alias tarets where it matters (see patch bellow) On the other hand, the alias created ought to inherit properties form its target, so yes, we probably want to copy flags that matters, including DECL_THREAD_LOCAL_P. We however should not copy DECL_INITIAL - we never have constructors for aliases and there is ctor_for_folding function that gets you from decl to corresponding constructor that knows how to walk aliases. So if varasm looks into DECL_INITIAL of an alias, I think it is a bug. We probably want to copy other flags. Does the patch works if you copy TREE_SIDE_EFFECTS alone? But why that particular variable has side effects at first place? /* In any expression, decl, or constant, nonzero means it has side effects or reevaluation of the whole expression could produce a different value. This is set if any subexpression is a function call, a side effect or a reference to a volatile variable. In a ..._DECL, this is set only if the declaration said `volatile'. This will never be set for a constant. */ I do not see how vtable may end up being volatile. I posted following patch: Index: varasm.c === --- varasm.c(revision 211028) +++ varasm.c(working copy) @@ -1083,6 +1083,9 @@ { addr_space_t as = ADDR_SPACE_GENERIC; int reloc; + symtab_node *snode = symtab_get_node (decl); + if (snode) +decl = symtab_alias_ultimate_target (snode)->decl; if (TREE_TYPE (decl) != error_mark_node) as = TYPE_ADDR_SPACE (TREE_TYPE (decl)); which should make the variable sections to go according to alias target, I suppose it doesn't help to your build? Sorry for slow reaction - I teach now monday-wednesday a class in linear algebra (as part of my post-doc) and it is time consuming (since the semmester is only 6 weeks). I can do serious GCC hacking only thursday/friday/weekend thus. I will try to fix this today however - I realize it is painful bug. Honza > > maybe also on CONST_CAST_TREE (decl) in case of asan > and various target hooks. > > > now, since the error message names a ".localalias." > > I see your checkin from monday at the same area: > > r210919 | hubicka | 2014-05-26 02:50:24 +0200 (Mo, 26. Mai 2014) | 2 Zeilen > > * symtab.c (symtab_nonoverwritable_alias): Copy READONLY flag for > variables. > > and I wonder if there are more flags missing: > > Index: symtab.c > === > --- symtab.c (Revision 211028) > +++ symtab.c (Arbeitskopie) > @@ -1165,6 +1165,8 @@ symtab_nonoverwritable_alias (symtab_node *node) > else > { > TREE_READONLY (new_decl) = TREE_READONLY (node->decl); > + TREE_SIDE_EFFECTS (new_decl) = TREE_SIDE_EFFECTS (node->decl); > + DECL_INITIAL (new_decl) = DECL_INITIAL (node->decl); > new_node = varpool_create_variable_alias (new_decl, node->decl); > } > symtab_resolve_alias (new_node, node); > > > This makes the build succeed for me, but I am not really sure, if that is > the right direction to fix this, and if maybe even more flags may be missing, > for instance: DECL_THREAD_LOCAL_P or CONST_CAST_TREE? > > What do you think? > > > > Bernd. >
RFA: A couple of ira_get_dup_out_num fixes
While working on patches to speed up the handling of constraints, I hit some behaviour in ira_get_dup_out_num that looked unintentional: - the check for output operands was part of the !ignored_p condition so would be skipped if the first alternative is disabled/excluded. - the first disabled/excluded alternative stops all following alternatives from being processed, since we get "stuck" in the first part of the "if" statement and never increment curr_alt. This seems to have some effect on the testsuite. E.g. at -O2 gcc.c-torture/compile/20071117-1.c has changes like: .LCFI2: movq%rsp, %rbx subq%rax, %rsp - leaq15(%rsp), %rax - andq$-16, %rax - movq%rax, %rdi + leaq15(%rsp), %rdi + andq$-16, %rdi callbar xorl%esi, %esi movq%rbx, %rsp There are also some cases where the change introduces a move though. E.g. gcc.c-torture/compat/struct-ic.c has: movabsq $4294967296, %rdx addq$8, %rsp .LCFI4: - andq%rdi, %rax + andq%rax, %rdi + movq%rdi, %rax orq %rdx, %rax ret .L9: But AFAICT the patch is what was originally intended. Tested on x86_64-linux-gnu. OK to install? Thanks, Richard gcc/ * ira.c (ira_get_dup_out_num): Check for output operands at the start of the loop. Handle cases where an included alternative follows an excluded one. Index: gcc/ira.c === --- gcc/ira.c 2014-05-28 20:48:25.747321188 +0100 +++ gcc/ira.c 2014-05-28 21:31:22.769217059 +0100 @@ -1979,33 +1979,33 @@ ira_get_dup_out_num (int op_num, HARD_RE if (op_num < 0 || recog_data.n_alternatives == 0) return -1; - use_commut_op_p = false; + /* We should find duplications only for input operands. */ + if (recog_data.operand_type[op_num] != OP_IN) +return -1; str = recog_data.constraints[op_num]; + use_commut_op_p = false; for (;;) { #ifdef EXTRA_CONSTRAINT_STR op = recog_data.operand[op_num]; #endif - for (ignore_p = false, original = -1, curr_alt = 0;;) + for (curr_alt = 0, ignore_p = !TEST_HARD_REG_BIT (alts, curr_alt), + original = -1;;) { c = *str; if (c == '\0') break; - if (c == '#' || !TEST_HARD_REG_BIT (alts, curr_alt)) + if (c == '#') ignore_p = true; else if (c == ',') { curr_alt++; - ignore_p = false; + ignore_p = !TEST_HARD_REG_BIT (alts, curr_alt); } else if (! ignore_p) switch (c) { - /* We should find duplications only for input operands. */ - case '=': - case '+': - goto fail; case 'X': case 'p': case 'g':
Build problems on arm-linux-gnueabihf
Honza, I try to build this configuration: ../gcc-trunk/configure --prefix=/home/ed/gnu/arm-linux-gnueabihf-linux64 --target=arm-linux-gnueabihf --enable-languages=c,c++,fortran,ada --with-arch=armv7-a --with-tune=cortex-a9 --with-fpu=vfpv3-d16 --with-float=hard but make fails: libtool: compile: /home/ed/gnu/gcc-build-arm-linux-gnueabihf-linux64/./gcc/xgcc -shared-libgcc -B/home/ed/gnu/gcc-build-arm-linux-gnueabihf-linux64/./gcc -nostdinc++ -L/home/ed/gnu/gcc-build-arm-linux-gnueabihf-linux64/arm-linux-gnueabihf/libstdc++-v3/src -L/home/ed/gnu/gcc-build-arm-linux-gnueabihf-linux64/arm-linux-gnueabihf/libstdc++-v3/src/.libs -L/home/ed/gnu/gcc-build-arm-linux-gnueabihf-linux64/arm-linux-gnueabihf/libstdc++-v3/libsupc++/.libs -B/home/ed/gnu/arm-linux-gnueabihf-linux64/arm-linux-gnueabihf/bin/ -B/home/ed/gnu/arm-linux-gnueabihf-linux64/arm-linux-gnueabihf/lib/ -isystem /home/ed/gnu/arm-linux-gnueabihf-linux64/arm-linux-gnueabihf/include -isystem /home/ed/gnu/arm-linux-gnueabihf-linux64/arm-linux-gnueabihf/sys-include -I/home/ed/gnu/gcc-trunk/libstdc++-v3/../libgcc -I/home/ed/gnu/gcc-build-arm-linux-gnueabihf-linux64/arm-linux-gnueabihf/libstdc++-v3/include/arm-linux-gnueabihf -I/home/ed/gnu/gcc-build-arm-linux-gnueabihf-linux64/arm-linux-gnueabihf/libstdc++-v3/include -I/home/ed/gnu/gcc-trunk/libstdc++-v3/libsupc++ -D_GLIBCXX_SHARED -fno-implicit-templates -Wall -Wextra -Wwrite-strings -Wcast-qual -Wabi -fdiagnostics-show-location=once -ffunction-sections -fdata-sections -frandom-seed=strstream.lo -g -O2 -D_GNU_SOURCE -I/home/ed/gnu/gcc-build-arm-linux-gnueabihf-linux64/arm-linux-gnueabihf/libstdc++-v3/include/backward -Wno-deprecated -c ../../../../../gcc-trunk/libstdc++-v3/src/c++98/strstream.cc -fPIC -DPIC -D_GLIBCXX_SHARED -o strstream.o In file included from ../../../../../gcc-trunk/libstdc++-v3/src/c++98/strstream.cc:44:0: /home/ed/gnu/gcc-build-arm-linux-gnueabihf-linux64/arm-linux-gnueabihf/libstdc++-v3/include/backward/strstream:126:9: error: std::istrstream::_ZTVSt10istrstream.localalias.0 causes a section type conflict with std::istrstream::_ZTVSt10istrstream class istrstream : public basic_istream ^ /home/ed/gnu/gcc-build-arm-linux-gnueabihf-linux64/arm-linux-gnueabihf/libstdc++-v3/include/backward/strstream:126:9: note: ‘std::istrstream::_ZTVSt10istrstream’ was declared here /home/ed/gnu/gcc-build-arm-linux-gnueabihf-linux64/arm-linux-gnueabihf/libstdc++-v3/include/backward/strstream:143:9: error: std::ostrstream::_ZTVSt10ostrstream.localalias.1 causes a section type conflict with std::ostrstream::_ZTVSt10ostrstream class ostrstream : public basic_ostream ^ /home/ed/gnu/gcc-build-arm-linux-gnueabihf-linux64/arm-linux-gnueabihf/libstdc++-v3/include/backward/strstream:143:9: note: ‘std::ostrstream::_ZTVSt10ostrstream’ was declared here /home/ed/gnu/gcc-build-arm-linux-gnueabihf-linux64/arm-linux-gnueabihf/libstdc++-v3/include/backward/strstream:160:9: error: std::strstream::_ZTVSt9strstream.localalias.2 causes a section type conflict with std::strstream::_ZTVSt9strstream class strstream : public basic_iostream ^ /home/ed/gnu/gcc-build-arm-linux-gnueabihf-linux64/arm-linux-gnueabihf/libstdc++-v3/include/backward/strstream:160:9: note: ‘std::strstream::_ZTVSt9strstream’ was declared here make[5]: *** [strstream.lo] Fehler 1 make[5]: Verlasse Verzeichnis '/home/ed/gnu/gcc-build-arm-linux-gnueabihf-linux64/arm-linux-gnueabihf/libstdc++-v3/src/c++98' make[4]: *** [all-recursive] Fehler 1 make[4]: Verlasse Verzeichnis '/home/ed/gnu/gcc-build-arm-linux-gnueabihf-linux64/arm-linux-gnueabihf/libstdc++-v3/src' make[3]: *** [all-recursive] Fehler 1 make[3]: Verlasse Verzeichnis '/home/ed/gnu/gcc-build-arm-linux-gnueabihf-linux64/arm-linux-gnueabihf/libstdc++-v3' make[2]: *** [all] Fehler 2 make[2]: Verlasse Verzeichnis '/home/ed/gnu/gcc-build-arm-linux-gnueabihf-linux64/arm-linux-gnueabihf/libstdc++-v3' make[1]: *** [all-target-libstdc++-v3] Fehler 2 make[1]: Verlasse Verzeichnis '/home/ed/gnu/gcc-build-arm-linux-gnueabihf-linux64' make: *** [all] Fehler 2 When I debug and set a breakpoint in varasm.c line 314 it looks as if the flag SECTION_RELRO is different. This flag is set in categorize_decl_for_section where it depends on TREE_CODE(decl), and if it is a VAR_DECL also on TREE_READONLY (decl) TREE_SIDE_EFFECTS (decl) DECL_INITIAL (decl) TREE_CONSTANT (DECL_INITIAL (decl)) DECL_THREAD_LOCAL_P (decl) maybe also on CONST_CAST_TREE (decl) in case of asan and various target hooks. now, since the error message names a ".localalias." I see your checkin from monday at the same area: r210919 | hubicka | 2014-05-26 02:50:24 +0200 (Mo, 26. Mai 2014) | 2 Zeilen * symtab.c (symtab_nonoverwritable_alias): Copy READONLY flag for variables. and I wonder if there are more flags missing: Index: symtab.c === --- symtab.c (Revision 211028) +++ symtab.c (Arb
Re: [C++ Patch] PR 57543
Hi again, On 05/28/2014 07:06 PM, Paolo Carlini wrote: (In case I would have also to double check something weird I was seeing if the injection happens for all method types...) In the meanwhile I investigated the reason of all those regressions I was seeing when injecting in all METHOD_TYPEs: while handling, eg, from cpp0x/access02.C: template struct foo_argument { template static Arg test(Ret (C::*)(Arg)); ... }; we may still have, *after* substituting the argument types, that the the TREE_TYPE of the 0-th argument is still dependent, eg, a TEMPLATE_TYPE_PARM. In such cases it seems to me clear that it doesn't make sense to inject it, we don't really know the type. Indeed, if we try, we ICE in type_of_this_parm. Thus the below, which passes testing. Thanks! Paolo. Index: cp/pt.c === --- cp/pt.c (revision 211024) +++ cp/pt.c (working copy) @@ -11322,8 +11322,32 @@ tsubst_function_type (tree t, /* The TYPE_CONTEXT is not used for function/method types. */ gcc_assert (TYPE_CONTEXT (t) == NULL_TREE); + /* Substitute the argument types. */ + arg_types = tsubst_arg_types (TYPE_ARG_TYPES (t), args, NULL_TREE, + complain, in_decl); + if (arg_types == error_mark_node) +return error_mark_node; + /* Substitute the return type. */ + tree save_ccp = current_class_ptr; + tree save_ccr = current_class_ref; + tree this_type = (TREE_CODE (t) == METHOD_TYPE + ? TREE_TYPE (TREE_VALUE (arg_types)) : NULL_TREE); + bool do_inject = this_type && !dependent_type_p (this_type); + if (do_inject) +{ + /* DR 1207: 'this' is in scope in the trailing return type. */ + inject_this_parameter (this_type, cp_type_quals (this_type)); +} + return_type = tsubst (TREE_TYPE (t), args, complain, in_decl); + + if (do_inject) +{ + current_class_ptr = save_ccp; + current_class_ref = save_ccr; +} + if (return_type == error_mark_node) return error_mark_node; /* DR 486 clarifies that creation of a function type with an @@ -11344,12 +11368,6 @@ tsubst_function_type (tree t, if (abstract_virtuals_error_sfinae (ACU_RETURN, return_type, complain)) return error_mark_node; - /* Substitute the argument types. */ - arg_types = tsubst_arg_types (TYPE_ARG_TYPES (t), args, NULL_TREE, - complain, in_decl); - if (arg_types == error_mark_node) -return error_mark_node; - /* Construct a new type node and return it. */ if (TREE_CODE (t) == FUNCTION_TYPE) { Index: testsuite/g++.dg/cpp0x/decltype28.C === --- testsuite/g++.dg/cpp0x/decltype28.C (revision 211024) +++ testsuite/g++.dg/cpp0x/decltype28.C (working copy) @@ -8,9 +8,9 @@ template void ft (F f, typename enable_if::type) {} template< class F, int N > -decltype(ft (F(), 0)) // { dg-error "depth" } +decltype(ft (F(), 0)) ft (F f, typename enable_if::type) {} int main() { - ft (0, 0); // { dg-message "from here" } + ft (0, 0); } Index: testsuite/g++.dg/cpp0x/decltype59.C === --- testsuite/g++.dg/cpp0x/decltype59.C (revision 0) +++ testsuite/g++.dg/cpp0x/decltype59.C (working copy) @@ -0,0 +1,41 @@ +// PR c++/57543 +// { dg-do compile { target c++11 } } + +template< typename > struct X +{ + void foo(); + auto bar() -> decltype( X::foo() ); +}; + +template< typename > struct Y +{ + void foo(); + template< typename > + auto bar() -> decltype( Y::foo() ); +}; + +template< typename > struct Z +{ + void foo(); + template< typename T > + auto bar() -> decltype( T::foo() ); +}; + +template< typename > struct K +{ + void foo(); + template< typename T > + auto bar() -> decltype( T::foo() ); +}; + +template<> +template<> +auto K::bar>() -> decltype( K::foo() ); + +int main() +{ + X().bar(); + Y().bar(); + Z().bar>(); + K().bar>(); +}
Re: ipa-visibility TLC 2/n
> On 05/27/14 23:20, Jan Hubicka wrote: > >> > >>here we have decl and its local alias: > >>(gdb) p debug_tree (sect->named.decl) > >> >> type >> type >> 700f57e0> > >> unsigned SI > >> size > >> unit size > >> align 32 symtab 45 alias set 3 canonical type 700f5840 > >> pointer_to_this> > >> BLK > >> size > >> unit size > >> align 32 symtab 0 alias set 3 canonical type 70dd8840 > >> domain > >> type_6 SI size unit size >> 700064b0 4> > >> align 32 symtab 0 alias set -1 canonical type 701a78a0 > >> precision 32 min max> > >> pointer_to_this> > >> readonly addressable used public static tree_1 tree_5 tree_6 ignored > >> weak in_system_header virtual decl_5 SI file > >> /home/jh/trunk/c/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include/fstream line > >> 444 col 11 size unit size >> 40> > >> user align 32 context > >> initial > >> > >> (mem/u/c:SI (symbol_ref/i:SI > >> ("_ZTCSt14basic_ifstreamIcSt11char_traitsIcEE0_Si") [flags 0x82] >> 70f7d060 _ZTCSt14basic_ifstreamIcSt11char_traitsIcEE0_Si>) [3 > >> _ZTCSt14basic_ifstreamIcSt11char_traitsIcEE0_Si+0 S40 A32])> > >>$6 = 10 > >>(gdb) p debug_tree (decl) > >> >> _ZTCSt14basic_ifstreamIcSt11char_traitsIcEE0_Si.localalias.69 > >> type >> type >> 700f57e0> > >> unsigned SI > >> size > >> unit size > >> align 32 symtab 45 alias set 3 canonical type 700f5840 > >> pointer_to_this> > >> BLK > >> size > >> unit size > >> align 32 symtab 0 alias set 3 canonical type 70dd8840 > >> domain > >> type_6 SI size unit size >> 700064b0 4> > >> align 32 symtab 0 alias set -1 canonical type 701a78a0 > >> precision 32 min max> > >> pointer_to_this> > >> readonly addressable used static tree_1 tree_5 tree_6 ignored > >> in_system_header decl_5 SI file > >> /home/jh/trunk/c/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include/fstream line > >> 444 col 11 size unit size >> 40> > >> user align 32 context > >>> > >>$7 = 10 > >> > >> > >>Perhaps get_vairable_section should look for alias target, since that is the > >>decl really deciding on the section? Richard? > > > >This is patch that makes get_variable_section to look through the aliases > > > >Index: varasm.c > >=== > >--- varasm.c (revision 210914) > >+++ varasm.c (working copy) > >@@ -1083,6 +1083,9 @@ > > { > >addr_space_t as = ADDR_SPACE_GENERIC; > >int reloc; > >+ symtab_node *snode = symtab_get_node (decl); > >+ if (snode) > >+decl = symtab_alias_ultimate_target (snode)->decl; > > > >if (TREE_TYPE (decl) != error_mark_node) > > as = TYPE_ADDR_SPACE (TREE_TYPE (decl)); > > > >For AIX it makes the bug go away and I eventually get: > > > >/tmp//ccyAATFr.s: line 30042: 1252-001 > >_ZTCSt14basic_ifstreamIcSt11char_traitsIcEE0_Si.localalias.69 is defined > >already. > >/tmp//ccyAATFr.s: line 30125: 1252-001 > >_ZTCSt14basic_ofstreamIcSt11char_traitsIcEE0_So.localalias.71 is defined > >already. > >/tmp//ccyAATFr.s: line 30164: 1252-001 > >_ZTCSt13basic_fstreamIcSt11char_traitsIcEE8_So.localalias.75 is defined > >already. > >/tmp//ccyAATFr.s: line 30223: 1252-001 > >_ZTCSt13basic_fstreamIcSt11char_traitsIcEE0_Si.localalias.74 is defined > >already. > >/tmp//ccyAATFr.s: line 30263: 1252-001 > >_ZTCSt14basic_ofstreamIwSt11char_traitsIwEE0_St13basic_ostreamIwS1_E.localalias.80 > > is defined already. > >/tmp//ccyAATFr.s: line 30323: 1252-001 > >_ZTCSt13basic_fstreamIwSt11char_traitsIwEE8_St13basic_ostreamIwS1_E.localalias.84 > > is defined already. > >/tmp//ccyAATFr.s: line 30388: 1252-001 > >_ZTCSt14basic_ifstreamIwSt11char_traitsIwEE0_St13basic_istreamIwS1_E.localalias.78 > > is defined already. > >/tmp//ccyAATFr.s: line 30436: 1252-001 > >_ZTCSt13basic_fstreamIwSt11char_traitsIwEE0_St13basic_istreamIwS1_E.localalias.83 > > is defined already. > > > >David, this looks like a bug in the AIX target output macros. I get: > > .set > > _ZTCSt14basic_ifstreamIcSt11char_traitsIcEE0_Si.localalias.69,_ZTCSt14basic_ifstreamIcSt11char_traitsIcEE0_Si > > > >(this is correct since localalias is really an alias) > > > >_ZTCSt14basic_ifstreamIcSt11char_traitsIcEE0_Si.localalias.69: > > .space 40 > >_ZTCSt14basic_ifstreamIcSt11char_traitsIcEE0_Si: > >... > > > >This is wrong, since we should not try to out the variable at least if I > >read AIX assembly correctly. > > > >varpool has explicit test to not output any aliases, so perhaps this is a > >bug in wrapup_globals > >and AIX output macros. I will try to track more after my teaching tonight. > > Any update? > > I've managed to generate a simple test case from > libstdc++-v3/src/c++98/strstream.cc which reproduces the issue on > ARM that Ramana has reported previousl
Re: Require '%' to be at the beginning of a constraint string
Thanks for the review. Jeff Law writes: > On 05/26/14 13:21, Richard Sandiford wrote: >>> If we're going to change it, then clearly the docs need to change and >>> ideally we'd statically check the port's constraints during the build >>> process to ensure they meet the tighter definition. >> >> OK, how does this look? I built a cc1 for one target per config/ >> directory to (try to) check that there were no remaining cases. >> >> This means that we will silently ignore '%'s that are embedded in the >> middle of an asm constraint string, but in a way that's already true for >> most places that check for commutativity. An error seems a bit extreme >> when '%' is only a hint. If we want a warning, what should the option >> be called? And should it be under -Wall, -Wextra, or on by default? >> >> Tested on x86_64-linux-gnu. OK to install? > OK. My initial thought on adding a warning was to weed out bad > constraints. You've already done that for the in-tree ports. I'm a lot > less inclined to do much more here to help the out-of-tree ports, so > upon further review, let's not worry about the warning unless you've > already got it ready to go :-) Well, the new genoutput.c error should catch problems in out-of-tree ports. It's just asms where the non-initial '%'s would be silently accepted and have no effect. David W suggested off-list that I add "Only input operands can use @samp{%}." to the documention as well. That seemed like it was obviously an improvement so I applied the patch with that change (below). Thanks, Richard gcc/ * doc/md.texi: Document that the % constraint character must be at the beginning of the string. * genoutput.c (validate_insn_alternatives): Check that '=', '+' and '%' only appear at the beginning of a constraint. * ira.c (commutative_constraint_p): Delete. (ira_get_dup_out_num): Expect the '%' commutativity marker to be at the start of the string. * config/alpha/alpha.md (*movmemdi_1, *clrmemdi_1): Remove duplicate '='s. * config/arm/neon.md (bicdi3_neon): Likewise. * config/iq2000/iq2000.md (addsi3_internal, subsi3_internal, sgt_si) (slt_si, sltu_si): Likewise. * config/vax/vax.md (sbcdi3): Likewise. * config/h8300/h8300.md (*cmpstz): Remove duplicate '+'. * config/arc/arc.md (mulsi_600, mulsidi_600, umulsidi_600) (mul64): Move '%' to beginning of constraint. * config/arm/arm.md (*xordi3_insn): Likewise. * config/nds32/nds32.md (add3, mulsi3, andsi3, iorsi3) (xorsi3): Likewise. Index: gcc/doc/md.texi === --- gcc/doc/md.texi 2014-05-27 11:23:48.676099753 +0100 +++ gcc/doc/md.texi 2014-05-27 11:23:53.289138410 +0100 @@ -1589,7 +1589,10 @@ See, for example, the @samp{mulsi3} insn Declares the instruction to be commutative for this operand and the following operand. This means that the compiler may interchange the two operands if that is the cheapest way to make all operands fit the -constraints. +constraints. @samp{%} applies to all alternatives and must appear as +the first character in the constraint. Only input operands can use +@samp{%}. + @ifset INTERNALS This is often used in patterns for addition instructions that really have only two operands: the result must go in one of the Index: gcc/genoutput.c === --- gcc/genoutput.c 2014-05-27 11:23:48.661099627 +0100 +++ gcc/genoutput.c 2014-05-27 11:23:53.284138368 +0100 @@ -781,6 +781,11 @@ validate_insn_alternatives (struct data for (p = d->operand[start].constraint; (c = *p); p += len) { + if ((c == '%' || c == '=' || c == '+') + && p != d->operand[start].constraint) + error_with_line (d->lineno, + "character '%c' can only be used at the" + " beginning of a constraint string", c); #ifdef USE_MD_CONSTRAINTS if (ISSPACE (c) || strchr (indep_constraints, c)) len = 1; Index: gcc/ira.c === --- gcc/ira.c 2014-05-27 11:23:48.679099778 +0100 +++ gcc/ira.c 2014-05-27 11:24:23.746394430 +0100 @@ -1770,34 +1770,6 @@ setup_prohibited_mode_move_regs (void) -/* Return TRUE if the operand constraint STR is commutative. */ -static bool -commutative_constraint_p (const char *str) -{ - int c; - - alternative_mask enabled = recog_data.enabled_alternatives; - for (;;) -{ - c = *str; - if (c == '\0') - break; - str += CONSTRAINT_LEN (c, str); - if (c == '#') - enabled &= ~ALTERNATIVE_BIT (0); - else if (c == ',') - enabled >>= 1; - else if (enabled & 1) - { - /* Usually `%' is the first constraint character but the -documentation does not requi
Re: [patch i386]: Expand sibling-tail-calls via accumulator register
On 05/28/14 11:22, Richard Henderson wrote: On 05/28/2014 01:43 AM, Kai Tietz wrote: + if (GET_CODE (op) == CONST) +op = XEXP (op, 0); + return (GET_CODE (op) == SYMBOL_REF || CONSTANT_P (op)); Surely all this boils down to just CONSTANT_P (op), without having to look through the CONST at all. Something certainly seems odd there. Kai, is your intention to detect symbol_ref + const_int? Isn't the canonical form for that: (const (plus (symbol_ref) (const_int))? It appears to me the code above looks for (const (symbol_ref)) or (const (const_int)) ?!? Jeff
Re: [patch, mips, tree] align microMIPS functions to 16 bits with -Os
Sandra Loosemore writes: > On 05/19/2014 01:38 PM, Sandra Loosemore wrote: >> >> 2014-05-19 Iain Sandoe >> Catherine Moore >> Sandra Loosemore >> >> gcc/ >> * config/mips/mips.c (mips_set_current_function): Choose >> function alignment once the current mode is known. >> >> gcc/testsuite/ >> * gcc.target/mips/umips-align-1.c: New. >> * gcc.target/mips/umips-align-2.c: New. >> * gcc.target/mips/umips-align-3.c: New. >> * gcc.target/mips/mips.exp: Add interlink-compressed to >> -mfoo/-mno-foo options. > > Ping? > > https://gcc.gnu.org/ml/gcc-patches/2014-05/msg01536.html > > -Sandra FAOD, I wasn't commenting because I still think it's the wrong place but still don't have a specific counter-suggestion. mips_set_current_function is potentially called many times for the same function but setting the alignment seems like something that should only happen once. I think it could potentially mean that alignment tests against the function address could be optimised away based on the FUNCTION_BOUNDARY before mips_set_current_function is called. As a strawman, maybe adding a new target hook to cgraph_create_node would work? Hopefully that'll prompt someone to say how stupid that idea is and say what the right way of doing it would be. Thanks, Richard
Re: [PATCH] Inline asm asan instrumentation
On Wed, May 28, 2014 at 5:33 PM, Marat Zakirov wrote: > Hi all, > > Here's a patch for optional Asan instrumentation of inline assembly. > > This version scans gimple for GIMPLE_ASMs and performs usual instrumentation > of arguments with memory constraints ("m", "o", etc.) with fixed size. > > Instrumentation is turned off by default. > > This was successfully bootstrapped and regtested on x64. I have also > instrumented and ran ffmpeg regression testsuite (it seems to have quite > some inline asm). > > --Marat
Re: [MIPS] Add sbasic supoert ffor MSA (SIMD)
On May 28, 2014, at 7:27 AM, Richard Earnshaw wrote: > > Speed of implementation. We're gradually replacing these with proper > builtins, but that takes a lot more work. As an owner of a port with more builtins that yours, I can offer a technological solution to reduce the cost of builtins to: (define_builtin “my_stop" [ (define_outputs [(void_operand 0)]) (define_rtl_pattern “my_stop" []) ] ) (define_insn “my_stop" [(unspec_volatile [(const_int 0)] UNSPECV_STOP)] "" “stop”) for example. This creates the builtins, allows overloading, allows input/output parameters, can reorder operands, allows for complex types, allows memory reference parameters, allows pure markings, does vectors, conditional availability, generates documentation, creates test suites and more. If you wire up a speaker it even sings. Someone would have have to step forward with a need and some time to port their port over to the new scheme and help with the reason for why the technology should go in. It is mostly contained in 5600 lines of self contained python code, and is built to solve the problem generally. It adds about 800 lines to builtins.c. It has a macro system that is more powerful than the macro system .md files use, so one gets to share and collapse builtins rather nicely. It is known to work for C and C++. Other languages may need extending; C for example cost is around 250 lines to support. One promise, you will never have to create an argument list, or a type, for example here is a two output, type input functional instruction with some doc content: (define_mode_iterator MYTYPE [V8QI V4HI V2SI DI ...]) (define_builtin “my_foo” "my_foo2_" [ (define_desc“Doc string for operation") (define_outputs [(var_operand:T_MYTYPE 0) (var_operand:T_MYTYPE 1)]) (define_inputs [(var_operand:T_MYTYPE 2) (var_operand:T_MYTYPE 3)]) (define_rtl_pattern “my_foo2_" [0 2 1 3]) (attributes [pure]) ] ) I stripped it so you can’t know what the instruction was, but you get a flavor of multiple outputs, doc bits, pure, overloading, arguments and argument rearranging. Let me know if you’re interested.
Re: ipa-visibility TLC 2/n
On 05/27/14 23:20, Jan Hubicka wrote: here we have decl and its local alias: (gdb) p debug_tree (sect->named.decl) unsigned SI size unit size align 32 symtab 45 alias set 3 canonical type 700f5840 pointer_to_this> BLK size unit size align 32 symtab 0 alias set 3 canonical type 70dd8840 domain type_6 SI size unit size align 32 symtab 0 alias set -1 canonical type 701a78a0 precision 32 min max> pointer_to_this> readonly addressable used public static tree_1 tree_5 tree_6 ignored weak in_system_header virtual decl_5 SI file /home/jh/trunk/c/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include/fstream line 444 col 11 size unit size user align 32 context initial (mem/u/c:SI (symbol_ref/i:SI ("_ZTCSt14basic_ifstreamIcSt11char_traitsIcEE0_Si") [flags 0x82]) [3 _ZTCSt14basic_ifstreamIcSt11char_traitsIcEE0_Si+0 S40 A32])> $6 = 10 (gdb) p debug_tree (decl) unsigned SI size unit size align 32 symtab 45 alias set 3 canonical type 700f5840 pointer_to_this> BLK size unit size align 32 symtab 0 alias set 3 canonical type 70dd8840 domain type_6 SI size unit size align 32 symtab 0 alias set -1 canonical type 701a78a0 precision 32 min max> pointer_to_this> readonly addressable used static tree_1 tree_5 tree_6 ignored in_system_header decl_5 SI file /home/jh/trunk/c/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include/fstream line 444 col 11 size unit size user align 32 context > $7 = 10 Perhaps get_vairable_section should look for alias target, since that is the decl really deciding on the section? Richard? This is patch that makes get_variable_section to look through the aliases Index: varasm.c === --- varasm.c(revision 210914) +++ varasm.c(working copy) @@ -1083,6 +1083,9 @@ { addr_space_t as = ADDR_SPACE_GENERIC; int reloc; + symtab_node *snode = symtab_get_node (decl); + if (snode) +decl = symtab_alias_ultimate_target (snode)->decl; if (TREE_TYPE (decl) != error_mark_node) as = TYPE_ADDR_SPACE (TREE_TYPE (decl)); For AIX it makes the bug go away and I eventually get: /tmp//ccyAATFr.s: line 30042: 1252-001 _ZTCSt14basic_ifstreamIcSt11char_traitsIcEE0_Si.localalias.69 is defined already. /tmp//ccyAATFr.s: line 30125: 1252-001 _ZTCSt14basic_ofstreamIcSt11char_traitsIcEE0_So.localalias.71 is defined already. /tmp//ccyAATFr.s: line 30164: 1252-001 _ZTCSt13basic_fstreamIcSt11char_traitsIcEE8_So.localalias.75 is defined already. /tmp//ccyAATFr.s: line 30223: 1252-001 _ZTCSt13basic_fstreamIcSt11char_traitsIcEE0_Si.localalias.74 is defined already. /tmp//ccyAATFr.s: line 30263: 1252-001 _ZTCSt14basic_ofstreamIwSt11char_traitsIwEE0_St13basic_ostreamIwS1_E.localalias.80 is defined already. /tmp//ccyAATFr.s: line 30323: 1252-001 _ZTCSt13basic_fstreamIwSt11char_traitsIwEE8_St13basic_ostreamIwS1_E.localalias.84 is defined already. /tmp//ccyAATFr.s: line 30388: 1252-001 _ZTCSt14basic_ifstreamIwSt11char_traitsIwEE0_St13basic_istreamIwS1_E.localalias.78 is defined already. /tmp//ccyAATFr.s: line 30436: 1252-001 _ZTCSt13basic_fstreamIwSt11char_traitsIwEE0_St13basic_istreamIwS1_E.localalias.83 is defined already. David, this looks like a bug in the AIX target output macros. I get: .set _ZTCSt14basic_ifstreamIcSt11char_traitsIcEE0_Si.localalias.69,_ZTCSt14basic_ifstreamIcSt11char_traitsIcEE0_Si (this is correct since localalias is really an alias) _ZTCSt14basic_ifstreamIcSt11char_traitsIcEE0_Si.localalias.69: .space 40 _ZTCSt14basic_ifstreamIcSt11char_traitsIcEE0_Si: ... This is wrong, since we should not try to out the variable at least if I read AIX assembly correctly. varpool has explicit test to not output any aliases, so perhaps this is a bug in wrapup_globals and AIX output macros. I will try to track more after my teaching tonight. Any update? I've managed to generate a simple test case from libstdc++-v3/src/c++98/strstream.cc which reproduces the issue on ARM that Ramana has reported previously: template struct char_traits; template class basic_ios { }; template > class basic_istream : virtual public basic_ios<_CharT, _Traits> { protected: int _M_gcount; virtual ~basic_istream() { } }; class istrstream : public basic_istream { virtual ~istrstream(); }; istrstream::~istrstream() { } -- CUT -- With an arm-none-linux-gnueabi gcc configured as: ./gcc/configure --target=arm-none-linux-gnueabi --enable-gnu-indirect-function --enable-shared --with-arch=armv7-a --with-fpu=vfpv3-d16 --with-float=softfp --with-arch=armv7-a (irrelevant parts omitted) With the following command line options: -fdata-sections-O2
Re: [patch i386]: Expand sibling-tail-calls via accumulator register
On 05/28/2014 01:43 AM, Kai Tietz wrote: > + if (GET_CODE (op) == CONST) > +op = XEXP (op, 0); > + return (GET_CODE (op) == SYMBOL_REF || CONSTANT_P (op)); Surely all this boils down to just CONSTANT_P (op), without having to look through the CONST at all. Otherwise this looks ok. r~
Re: [C++ Patch] PR 57543
Hi, On 05/28/2014 06:33 PM, Jason Merrill wrote: On 05/28/2014 11:59 AM, Paolo Carlini wrote: I see. Even not considering this issue, there are many regression if I inject for all method types. I'm afraid the issue turns out to be much more tricky than I hoped, I guess I'm going to unassign myself, for now, and work on some other pending issues in my todo. You are of course more than welcome to take it and include my additional tests in your work! OK. Please add a link to this thread in the PR, if you haven't already. Now, I got this "insane" idea: would it make sense to simply invert the substitutions (args and return) unconditionally? I'm asking because the below appears to pass the testsuite modulo decltype28.C which we would end up accepting, but likewise do current clang, icc, and Solaris Studio!?! (In case I would have also to double check something weird I was seeing if the injection happens for all method types...) Thanks! Paolo. /// Index: cp/pt.c === --- cp/pt.c (revision 211024) +++ cp/pt.c (working copy) @@ -11322,8 +11322,32 @@ tsubst_function_type (tree t, /* The TYPE_CONTEXT is not used for function/method types. */ gcc_assert (TYPE_CONTEXT (t) == NULL_TREE); + /* Substitute the argument types. */ + arg_types = tsubst_arg_types (TYPE_ARG_TYPES (t), args, NULL_TREE, + complain, in_decl); + if (arg_types == error_mark_node) +return error_mark_node; + /* Substitute the return type. */ + tree save_ccp = current_class_ptr; + tree save_ccr = current_class_ref; + bool do_inject = (TREE_CODE (t) == METHOD_TYPE + && TREE_CODE (TREE_TYPE (t)) == DECLTYPE_TYPE); + if (do_inject) +{ + /* DR 1207: 'this' is in scope in the trailing return type. */ + tree this_type = TREE_TYPE (TREE_VALUE (arg_types)); + inject_this_parameter (this_type, cp_type_quals (this_type)); +} + return_type = tsubst (TREE_TYPE (t), args, complain, in_decl); + + if (do_inject) +{ + current_class_ptr = save_ccp; + current_class_ref = save_ccr; +} + if (return_type == error_mark_node) return error_mark_node; /* DR 486 clarifies that creation of a function type with an @@ -11344,12 +11368,6 @@ tsubst_function_type (tree t, if (abstract_virtuals_error_sfinae (ACU_RETURN, return_type, complain)) return error_mark_node; - /* Substitute the argument types. */ - arg_types = tsubst_arg_types (TYPE_ARG_TYPES (t), args, NULL_TREE, - complain, in_decl); - if (arg_types == error_mark_node) -return error_mark_node; - /* Construct a new type node and return it. */ if (TREE_CODE (t) == FUNCTION_TYPE) { Index: testsuite/g++.dg/cpp0x/decltype28.C === --- testsuite/g++.dg/cpp0x/decltype28.C (revision 211024) +++ testsuite/g++.dg/cpp0x/decltype28.C (working copy) @@ -8,9 +8,9 @@ template void ft (F f, typename enable_if::type) {} template< class F, int N > -decltype(ft (F(), 0)) // { dg-error "depth" } +decltype(ft (F(), 0)) ft (F f, typename enable_if::type) {} int main() { - ft (0, 0); // { dg-message "from here" } + ft (0, 0); }
Re: [C++ Patch] PR 57543
On 05/28/2014 11:59 AM, Paolo Carlini wrote: I see. Even not considering this issue, there are many regression if I inject for all method types. I'm afraid the issue turns out to be much more tricky than I hoped, I guess I'm going to unassign myself, for now, and work on some other pending issues in my todo. You are of course more than welcome to take it and include my additional tests in your work! OK. Please add a link to this thread in the PR, if you haven't already. Jason
C++ PATCH to catch array of array of unknown bound in template
Daveed from EDG pointed out this bug to me. Tested x86_64-pc-linux-gnu, applying to trunk. commit 912eae71d30adf6c07dbdf1b4741fd7ffd5a05ff Author: Jason Merrill Date: Wed May 28 11:50:50 2014 -0400 * pt.c (tsubst) [ARRAY_TYPE]: Check for array of array of unknown bound. diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c index 2ebe015..0b3cd7f 100644 --- a/gcc/cp/pt.c +++ b/gcc/cp/pt.c @@ -12084,7 +12084,7 @@ tsubst (tree t, tree args, tsubst_flags_t complain, tree in_decl) if (type == TREE_TYPE (t) && domain == TYPE_DOMAIN (t)) return t; - /* These checks should match the ones in grokdeclarator. + /* These checks should match the ones in create_array_type_for_decl. [temp.deduct] @@ -12095,6 +12095,8 @@ tsubst (tree t, tree args, tsubst_flags_t complain, tree in_decl) an abstract class type. */ if (VOID_TYPE_P (type) || TREE_CODE (type) == FUNCTION_TYPE + || (TREE_CODE (type) == ARRAY_TYPE + && TYPE_DOMAIN (type) == NULL_TREE) || TREE_CODE (type) == REFERENCE_TYPE) { if (complain & tf_error) diff --git a/gcc/testsuite/g++.dg/template/array28.C b/gcc/testsuite/g++.dg/template/array28.C new file mode 100644 index 000..18b629d --- /dev/null +++ b/gcc/testsuite/g++.dg/template/array28.C @@ -0,0 +1,7 @@ +typedef int (A)[]; + +template void f(T (*)[1]); // { dg-error "array" } + +int main() { + f(0); // { dg-error "no match" } +}
Re: [patch, mips, tree] align microMIPS functions to 16 bits with -Os
On 05/19/2014 01:38 PM, Sandra Loosemore wrote: 2014-05-19 Iain Sandoe Catherine Moore Sandra Loosemore gcc/ * config/mips/mips.c (mips_set_current_function): Choose function alignment once the current mode is known. gcc/testsuite/ * gcc.target/mips/umips-align-1.c: New. * gcc.target/mips/umips-align-2.c: New. * gcc.target/mips/umips-align-3.c: New. * gcc.target/mips/mips.exp: Add interlink-compressed to -mfoo/-mno-foo options. Ping? https://gcc.gnu.org/ml/gcc-patches/2014-05/msg01536.html -Sandra
Re: [C++ Patch] PR 57543
Hi, On 05/28/2014 05:49 PM, Jason Merrill wrote: On 05/28/2014 11:38 AM, Paolo Carlini wrote: But unconditionally doesn't work, without doing something more. For example for the first test, as reported: when current_class_type is set and equal to "struct X", class_of_this_parm is "struct X< >". Right, it needs to be the substituted this parameter. Let's remember when we had a trailing return type (probably by setting a flag in splice_late_return_type and then preserving it when we rebuild METHOD/FUNCTION_TYPEs) and if it's set, tsubst the return type after the argument types rather than before so that SFINAE consistently works in lexical order. I see. Even not considering this issue, there are many regression if I inject for all method types. I'm afraid the issue turns out to be much more tricky than I hoped, I guess I'm going to unassign myself, for now, and work on some other pending issues in my todo. You are of course more than welcome to take it and include my additional tests in your work! Thanks, Paolo.
[PATCH, i386, Pointer Bounds Checker 10/x] Partitions
Hi, This patch keeps instrumented and original versions together and preserve tranparent alias chain during symbol name privatization. Bootstrapped and tested on linux-x86_64. Thanks, Ilya -- gcc/ 2013-05-28 Ilya Enkovich * lto/lto-partition.c (add_symbol_to_partition_1): Keep original and instrumented versions together. (privatize_symbol_name): Restore transparent alias chain if required. diff --git a/gcc/lto/lto-partition.c b/gcc/lto/lto-partition.c index 1ee5fbb..2967d73 100644 --- a/gcc/lto/lto-partition.c +++ b/gcc/lto/lto-partition.c @@ -163,6 +163,11 @@ add_symbol_to_partition_1 (ltrans_partition part, symtab_node *node) for (e = cnode->callers; e; e = e->next_caller) if (e->caller->thunk.thunk_p) add_symbol_to_partition_1 (part, e->caller); + + /* Instrumented version is actually the same function. +Therefore put it into the same partition. */ + if (cnode->instrumented_version) + add_symbol_to_partition_1 (part, cnode->instrumented_version); } add_references_to_partition (part, node); @@ -745,6 +750,7 @@ privatize_symbol_name (symtab_node *node) { tree decl = node->decl; const char *name = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl)); + cgraph_node *cnode; /* Our renaming machinery do not handle more than one change of assembler name. We should not need more than one anyway. */ @@ -774,6 +780,18 @@ privatize_symbol_name (symtab_node *node) lto_record_renamed_decl (node->lto_file_data, name, IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl))); + /* We could change name which is a target of transparent alias + chain of instrumented function name. Fix alias chain if so .*/ + if ((cnode = dyn_cast (node)) + && !cnode->instrumentation_clone + && cnode->instrumented_version + && cnode->instrumented_version->orig_decl == decl) +{ + tree iname = DECL_ASSEMBLER_NAME (cnode->instrumented_version->decl); + + gcc_assert (IDENTIFIER_TRANSPARENT_ALIAS (iname)); + TREE_CHAIN (iname) = DECL_ASSEMBLER_NAME (decl); +} if (cgraph_dump_file) fprintf (cgraph_dump_file, "Privatizing symbol name: %s -> %s\n",
C++ PATCH for c++/61242 (nested list-init)
We were getting confused by applying LOOKUP_NO_TEMP_BIND to the nested list-initializations, which is wrong; in aggregate initialization the elements are copy-initialized without regard to the enclosing context. Tested x86_64-pc-linux-gnu, applying to trunk. commit 916172d3d6e115b94f1a902aebca444a61ca04ae Author: Jason Merrill Date: Tue May 27 16:35:07 2014 -0400 PR c++/61242 * call.c (build_aggr_conv): Ignore passed in flags. (build_array_conv, build_complex_conv): Likewise. diff --git a/gcc/cp/call.c b/gcc/cp/call.c index 20af0e3..77aa8ca 100644 --- a/gcc/cp/call.c +++ b/gcc/cp/call.c @@ -890,7 +890,9 @@ build_aggr_conv (tree type, tree ctor, int flags, tsubst_flags_t complain) if (ctor == error_mark_node) return NULL; - flags |= LOOKUP_NO_NARROWING; + /* The conversions within the init-list aren't affected by the enclosing + context; they're always simple copy-initialization. */ + flags = LOOKUP_IMPLICIT|LOOKUP_NO_NARROWING; for (; field; field = next_initializable_field (DECL_CHAIN (field))) { @@ -963,6 +965,8 @@ build_array_conv (tree type, tree ctor, int flags, tsubst_flags_t complain) return NULL; } + flags = LOOKUP_IMPLICIT|LOOKUP_NO_NARROWING; + FOR_EACH_CONSTRUCTOR_VALUE (CONSTRUCTOR_ELTS (ctor), i, val) { conversion *sub @@ -1007,6 +1011,8 @@ build_complex_conv (tree type, tree ctor, int flags, if (len != 2) return NULL; + flags = LOOKUP_IMPLICIT|LOOKUP_NO_NARROWING; + FOR_EACH_CONSTRUCTOR_VALUE (CONSTRUCTOR_ELTS (ctor), i, val) { conversion *sub diff --git a/gcc/testsuite/g++.dg/cpp0x/initlist84.C b/gcc/testsuite/g++.dg/cpp0x/initlist84.C new file mode 100644 index 000..4d46746 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp0x/initlist84.C @@ -0,0 +1,17 @@ +// PR c++/61242 +// { dg-do compile { target c++11 } } + +struct Foo +{ + struct A + { +const int &container; +const int &args; + }; + static void Create (const A &); +}; + +int main () +{ + Foo::Create ({{}, {}}); +}
Re: [patch i386]: Expand sibling-tail-calls via accumulator register
On Wed, May 28, 2014 at 1:43 AM, Kai Tietz wrote: > Hi, > > I modified prior patch so that it uses the new predicate > sibcall_memory_operand to extend sibcall_insn_operand. > Just one change in i386.c remains about x86_output_mi_thunk. Later one isn't > pretty much essential. Nevertheless it makes > code equivalent to none-memory-case for potential tail-sibcalls. > > ChangeLog gcc > > 2014-05-28 Kai Tietz > > * config/i386/i386.c (x86_output_mi_thunk): Add memory case > for sibling-tail-calls. > * config/i386/i386.md (sibcall_insn_operand): Add memory-constrain > to its use. > * config/i386/predicates.md (sibcall_memory_operand): New predicate. > (sibcall_insn_operand): Add check for sibcall_memory_operand. > > ChangeLog testsuite > > 2014-05-28 Kai Tietz > > PR target/60104 > * gcc.target/i386/sibcall-1.c: New test. > * gcc.target/i386/sibcall-2.c: New test. > * gcc.target/i386/sibcall-3.c: New test. > > Regression tested x86_64-unknown-linux-gnu multilib, x86_64-w64-mingw32, and > i686-pc-cygwin. Ok for apply? > > Regards, > Kai > > > Index: gcc/config/i386/i386.c > === > --- gcc/config/i386/i386.c (revision 210985) > +++ gcc/config/i386/i386.c (working copy) > @@ -38893,7 +38893,16 @@ x86_output_mi_thunk (FILE *file, > For our purposes here, we can get away with (ab)using a jump pattern, > because we're going to do no optimization. */ >if (MEM_P (fnaddr)) > -emit_jump_insn (gen_indirect_jump (fnaddr)); > +{ > + if (sibcall_insn_operand (fnaddr, word_mode)) > + { > + tmp = gen_rtx_CALL (VOIDmode, fnaddr, const0_rtx); > + tmp = emit_call_insn (tmp); > + SIBLING_CALL_P (tmp) = 1; > + } > + else > + emit_jump_insn (gen_indirect_jump (fnaddr)); > +} >else > { >if (ix86_cmodel == CM_LARGE_PIC && SYMBOLIC_CONST (fnaddr)) > Index: gcc/config/i386/i386.md > === > --- gcc/config/i386/i386.md (revision 210985) > +++ gcc/config/i386/i386.md (working copy) > @@ -11376,7 +11376,7 @@ >[(set_attr "type" "call")]) > > (define_insn "*sibcall" > - [(call (mem:QI (match_operand:W 0 "sibcall_insn_operand" "Uz")) > + [(call (mem:QI (match_operand:W 0 "sibcall_insn_operand" "Uzm")) > (match_operand 1))] >"SIBLING_CALL_P (insn)" >"* return ix86_output_call_insn (insn, operands[0]);" > @@ -11406,7 +11406,7 @@ >[(set_attr "type" "call")]) > > (define_insn "*sibcall_pop" > - [(call (mem:QI (match_operand:SI 0 "sibcall_insn_operand" "Uz")) > + [(call (mem:QI (match_operand:SI 0 "sibcall_insn_operand" "Uzm")) > (match_operand 1)) > (set (reg:SI SP_REG) > (plus:SI (reg:SI SP_REG) > @@ -11451,7 +11451,7 @@ > > (define_insn "*sibcall_value" >[(set (match_operand 0) > - (call (mem:QI (match_operand:W 1 "sibcall_insn_operand" "Uz")) > + (call (mem:QI (match_operand:W 1 "sibcall_insn_operand" "Uzm")) > (match_operand 2)))] >"SIBLING_CALL_P (insn)" >"* return ix86_output_call_insn (insn, operands[1]);" > @@ -11494,7 +11494,7 @@ > > (define_insn "*sibcall_value_pop" >[(set (match_operand 0) > - (call (mem:QI (match_operand:SI 1 "sibcall_insn_operand" "Uz")) > + (call (mem:QI (match_operand:SI 1 "sibcall_insn_operand" "Uzm")) > (match_operand 2))) > (set (reg:SI SP_REG) > (plus:SI (reg:SI SP_REG) > Index: gcc/config/i386/predicates.md > === > --- gcc/config/i386/predicates.md (revision 210985) > +++ gcc/config/i386/predicates.md (working copy) > @@ -71,6 +71,16 @@ >return ANY_QI_REG_P (op); > }) > > +(define_predicate "sibcall_memory_operand" > + (match_operand 0 "memory_operand") > +{ > + op = XEXP (op, 0); > + > + if (GET_CODE (op) == CONST) > +op = XEXP (op, 0); > + return (GET_CODE (op) == SYMBOL_REF || CONSTANT_P (op)); > +}) > + > ;; Match an SI or HImode register for a zero_extract. > (define_special_predicate "ext_register_operand" >(match_operand 0 "register_operand") > @@ -600,7 +610,9 @@ > (define_special_predicate "sibcall_insn_operand" >(ior (match_test "constant_call_address_operand > (op, mode == VOIDmode ? mode : Pmode)") > - (match_operand 0 "register_no_elim_operand"))) > + (match_operand 0 "register_no_elim_operand") > + (and (not (match_test "TARGET_X32")) > + (match_operand 0 "sibcall_memory_operand" > > ;; Return true if OP is a call from MS ABI to SYSV ABI function. > (define_predicate "call_rex64_ms_sysv_operation" > Index: gcc/testsuite/gcc.target/i386/sibcall-1.c > === > --- gcc/testsuite/gcc.target/i386/sibcall-1.c (revision 0) > ++
Re: [C++ Patch] PR 57543
On 05/28/2014 11:38 AM, Paolo Carlini wrote: But unconditionally doesn't work, without doing something more. For example for the first test, as reported: when current_class_type is set and equal to "struct X", class_of_this_parm is "struct X< >". Right, it needs to be the substituted this parameter. Let's remember when we had a trailing return type (probably by setting a flag in splice_late_return_type and then preserving it when we rebuild METHOD/FUNCTION_TYPEs) and if it's set, tsubst the return type after the argument types rather than before so that SFINAE consistently works in lexical order. Jason
Re: [C++ Patch] PR 57543
Hi, On 05/28/2014 05:14 PM, Jason Merrill wrote: On 05/28/2014 06:15 AM, Paolo Carlini wrote: + bool do_inject = (!current_class_ref +&& TREE_CODE (t) == METHOD_TYPE +&& TREE_CODE (TREE_TYPE (t)) == DECLTYPE_TYPE); Let's do this for any METHOD_TYPE; the decltype could be nested as a template argument. And current_class_ref might be for the wrong class. Ok. + /* DR 1207: 'this' is in scope in the trailing return type. */ + tree this_type = (current_class_type +? current_class_type +: TREE_TYPE (TREE_VALUE (TYPE_ARG_TYPES (t; And here let's use class_of_this_parm unconditionally. But unconditionally doesn't work, without doing something more. For example for the first test, as reported: when current_class_type is set and equal to "struct X", class_of_this_parm is "struct X< >". I'm trying to paste below the trees: (gdb) p debug_tree(class_of_this_parm(t)) size bitsizetype> constant 0> unit size 0x766d70a8 sizetype> constant 0> align 8 symtab 0 alias set -1 canonical type 0x76819a80 fields type size unit size align 8 symtab 0 alias set -1 canonical type 0x76819a80 fields context 0x766e0170 D.1> full-name "struct X< >" n_parents=0 use_template=0 interface-unknown chain > used nonlocal decl_4 VOID file 57543_1.C line 5 col 1 align 1 context result > context full-name "struct X< >" n_parents=0 use_template=0 interface-unknown pointer_to_this chain 0x76813a10 X>> $3 = void (gdb) p current_class_type $4 = (tree) 0x7682a2a0 (gdb) p debug_tree(current_class_type) align 8 symtab 0 alias set -1 canonical type 0x7682a2a0 context full-name "struct X" n_parents=0 use_template=1 interface-unknown chain > used external nonlocal suppress-debug decl_4 VOID file 57543_1.C line 5 col 1 align 8 context result > context full-name "struct X" n_parents=0 use_template=1 interface-unknown pointer_to_this chain 0x76813f18 X>>
Re: libsanitizer merge from upstream r208536
On Wed, May 28, 2014 at 8:36 AM, Konstantin Serebryany wrote: > Dmitry, > You've introduced atomic_uint64_t stats_[AllocatorStatCount]; in > http://llvm.org/viewvc/llvm-project?view=revision&revision=173332 > Do you mind to change it to atomic_uintptr_t? > There is of course a chance of overflow on 32-bit arch (the number of > mallocs in a process may easily go over 2^32 in a long run), > but this is just stats, I think we can tolerate it. I removed 64-bit atomics from stats in llvm r209744. > On Wed, May 28, 2014 at 2:41 AM, Ramana Radhakrishnan > wrote: >> On Tue, May 27, 2014 at 11:00 PM, Jakub Jelinek wrote: >>> On Tue, May 27, 2014 at 11:50:33PM +0200, Jakub Jelinek wrote: On Tue, May 27, 2014 at 04:02:08PM -0500, Peter Bergner wrote: > It's being called form basically two files: > > [bergner@makalu-lp1 gcc-fsf-mainline-asan-debug]$ find . -name '*.o' | > xargs nm -AC | grep sync_fetch_and_add_8 > ./powerpc64-linux/32/libsanitizer/sanitizer_common/.libs/sanitizer_allocator.o: > U __sync_fetch_and_add_8 > ./powerpc64-linux/32/libsanitizer/sanitizer_common/sanitizer_allocator.o: > U __sync_fetch_and_add_8 > ./powerpc64-linux/32/libsanitizer/asan/.libs/asan_allocator2.o: > U __sync_fetch_and_add_8 > ./powerpc64-linux/32/libsanitizer/asan/asan_allocator2.o: U > __sync_fetch_and_add_8 Does ppc32 have any atomic 64-bit loads/stores (in the sense that the aligned 64 bits are written as one memory transaction, not each 32-bit word separately)? In any case, atomic_uint64_t there seems to be used only for some statistic counter and not really atomic anyway, as additions aren't performed using atomic instructions, but just atomic load, followed by normal arithmetics, followed by atomic store. Can't 32-bit counters be used instead on 32-bit arches? I see there is another spot with atomic_uint64_t in sanitizer_lfstack.h, but for some reason it isn't used now at all (there it would want to use 64-bit compare and exchange). >>> >>> On ARM, while it supposedly links, because __sync_compare_and_exchange_8 >>> is defined in libgcc.a, it will only work with post-2011 kernels and is >>> going to be very slow (because you do a separate compare and exchange >> >> FTR, this call down to the library function should only be for legacy >> architectures. >> >> On ARM we have a 64 bit atomic compare and exchange which can be done >> with the ldrexd / strexd instructions at the right architecture level >> (v6k and above IIRC). >> >> Ramana
Re: Darwin bootstrap failure following wide int merge
> After lengthy IRC discussions, what Richard and I can live with is > && !defined(__clang__) in this particular case that uses longlong.h > in GCC sources, with a comment why. > If we get too many of these workarounds, we should reconsider. Committed as revision 211023, after bootstrap on x86_64-apple-darwin13. Thanks! FX patch Description: application/applefile CL Description: Binary data
Re: [C++ Patch] PR 57543
On 05/28/2014 06:15 AM, Paolo Carlini wrote: + bool do_inject = (!current_class_ref + && TREE_CODE (t) == METHOD_TYPE + && TREE_CODE (TREE_TYPE (t)) == DECLTYPE_TYPE); Let's do this for any METHOD_TYPE; the decltype could be nested as a template argument. And current_class_ref might be for the wrong class. + /* DR 1207: 'this' is in scope in the trailing return type. */ + tree this_type = (current_class_type + ? current_class_type + : TREE_TYPE (TREE_VALUE (TYPE_ARG_TYPES (t; And here let's use class_of_this_parm unconditionally. Jason
Re: RFA: cache enabled attribute by insn code
On 27/05/14 17:31, Richard Sandiford wrote: > Richard Earnshaw writes: >> On 27/05/14 17:09, Richard Sandiford wrote: >>> Richard Earnshaw writes: On 27/05/14 16:27, Jakub Jelinek wrote: > On Tue, May 27, 2014 at 04:15:47PM +0100, Richard Earnshaw wrote: >> On 27/05/14 15:08, Richard Sandiford wrote: >>> Hmm, is this because of "insn_enabled"? If so, how did that work before >>> the patch? LRA already assumed that the "enabled" attribute didn't >>> depend >>> on the operands. >> >> Huh! "enabled" can be applied to each alternative. Alternatives are >> selected based on the operands. If LRA can't cope with that we have a >> serious problem. In fact, a pattern that matches but has no enabled >> alternatives is meaningless and guaranteed to cause problems during >> register allocation. > > This is not LRA fault, but the backend misusing the "enabled" attribute > for something it wasn't designed for, and IMHO against the documentation > of the attribute too. > Just look at the original submission why it has been added. > > Jakub > The @code{enabled} insn attribute may be used to disable certain insn alternatives for machine-specific reasons. The rest of the text just says what happens when this is done and then gives an example usage. It doesn't any time, either explicitly or implicitly, say that this must be a static choice determined once-off at run time. >>> >>> OK, how about the doc patch below? >>> That being said, I agree that this particular use case is pushing the boundaries -- but that's always a risk when the boundaries aren't clearly defined. A better solution here would be to get rid of all those match_operator patterns and replace them with iterators; but that's a lot of work, particularly for all the conditinal operation patterns we have. It would probably also bloat the number of patterns quite alarmingly. >>> >>> Yeah, which is why I was just going for the one place where it mattered. >>> I think match_operator still has a place in cases where the insn pattern >>> is entirely regular, which seems to be the case for most other uses >>> of shiftable_operator. It's just that in this case there really were >>> two separate cases per operator (plus vs. non-plus and mult vs. true shift). >>> >>> Thanks, >>> Richard >>> >>> >>> gcc/ >>> * doc/md.texi: Document the restrictions on the "enabled" attribute. >>> >>> Index: gcc/doc/md.texi >>> === >>> --- gcc/doc/md.texi (revision 210972) >>> +++ gcc/doc/md.texi (working copy) >>> @@ -4094,11 +4094,11 @@ >>> @subsection Disable insn alternatives using the @code{enabled} attribute >>> @cindex enabled >>> >>> -The @code{enabled} insn attribute may be used to disable certain insn >>> -alternatives for machine-specific reasons. This is useful when adding >>> -new instructions to an existing pattern which are only available for >>> -certain cpu architecture levels as specified with the @code{-march=} >>> -option. >>> +The @code{enabled} insn attribute may be used to disable insn >>> +alternatives that are not available for the current subtarget. >>> +This is useful when adding new instructions to an existing pattern >>> +which are only available for certain cpu architecture levels as >>> +specified with the @code{-march=} option. >>> >>> If an insn alternative is disabled, then it will never be used. The >>> compiler treats the constraints for the disabled alternative as >>> @@ -4112,6 +4112,9 @@ >>> A definition of the @code{enabled} insn attribute. The attribute is >>> defined as usual using the @code{define_attr} command. This >>> definition should be based on other insn attributes and/or target flags. >>> +It must not depend directly or indirectly on the current operands, >>> +since the attribute is expected to be a static property of the subtarget. >>> + >> >> I'd reverse the two components of that sentence. Something like: >> >> The attribute must be a static property of the subtarget; that is, it >> must not depend on the current operands or any other dynamic context >> (for example, location of the insn within the body of a loop). > > OK, how about the attached? > >> It would be useful if we could precisely define 'static property' >> somewhere, so that it encompases per-function changing of the ISA or >> optimization variant via __attribute__ annotations. That does need to >> work, since it could be used for switching between ARM and Thumb. > > Yeah, the cache depends on the current target for SWITCHABLE_TARGETs, > is invalidated by target_reinit for other targets, and is also invalidated > by changes to the register classes. > > Thanks, > Richard > > > gcc/ > * doc/md.texi: Document the restrictions on the "enabled" attribute. > LGTM. R. > Index: gcc/doc/md.texi >
Re: [AArch64 05/14] Add AArch64 'prefetch'-pattern.
On 28 May 2014, at 16:25 , Gopalasubramanian, Ganesh wrote: > Hi Philipp, > >> These changes look good to me. >> We'll try them out on the benchmarks that caused us to add prefetching in >> the first place. > > If you are OK, I would like to get these changes upstreamed. Sorry for the delay, I was pre-occupied with other issues in the runtime system. Yes, please go ahead with applying these changes. Best, Philipp.
Re: [MIPS] Add sbasic supoert ffor MSA (SIMD)
On 28/05/14 09:03, Matthew Fortune wrote: >> You shouldn't need to declare __builtin_* functions anyway. And if a >> function can be represented directly with GNU C vector extensions, it's >> preferred to implement it that way inline in the header rather than having >> built-in functions duplicating existing GNU C functionality. (Look at >> what AArch64 arm_neon.h does where possible, and what ARM arm_neon.h has >> been moved towards lately. I don't now what the msa.h functions do, so I >> don't know if this actually applies to any of them - but it's something to >> consider, so that built-in functions are only defined where actually >> needed.) > > In the aarch64 arm_neon.h header there are a decent number of inline asm > implementations too instead of builtins. It is not immediately obvious to me > as to what the deciding factor is between adding a builtin and using inline > asm when vector extensions do not support the operation. Do you happen to > know why inline asm is used in places? > Speed of implementation. We're gradually replacing these with proper builtins, but that takes a lot more work. > This looks like a reasonable idea to use GNU extensions where available. The > down-side to this approach is that it may be necessary to write quite > dis-similar headers for LLVM vs GCC which I think is part of the reason why > the header is written as it is. I don't know if that is a good reason to > require builtins or not though. > I regard these headers as part of the compiler. In an ideal world the contents of arm_neon.h should be replacable with #pragma GCC neon_intrinsics which would make parsing pretty much instantaneous. It's never that simple, though; sadly :-( R.
RE: [AArch64 05/14] Add AArch64 'prefetch'-pattern.
Hi Philipp, > These changes look good to me. > We'll try them out on the benchmarks that caused us to add prefetching in the > first place. If you are OK, I would like to get these changes upstreamed. -Ganesh -Original Message- From: Dr. Philipp Tomsich [mailto:philipp.toms...@theobroma-systems.com] Sent: Friday, February 28, 2014 2:58 PM To: Gopalasubramanian, Ganesh Cc: gcc-patches@gcc.gnu.org; pins...@gmail.com Subject: Re: [AArch64 05/14] Add AArch64 'prefetch'-pattern. Ganesh, On 28 Feb 2014, at 10:13 , Gopalasubramanian, Ganesh wrote: > I also have attached a patch that implements the following. > * Prefetch with immediate offset in the range 0 to 32760 (multiple of 8). > Added a predicate for this. > * Prefetch with immediate offset - in the range -256 to 255 (Gets > generated only when we have a negative offset. Generates prfum instruction). > Added a predicate for this. > * Prefetch with register offset. (modified for printing the locality) These changes look good to me. We'll try them out on the benchmarks that caused us to add prefetching in the first place. Best, Philipp.
Re: Darwin bootstrap failure following wide int merge
> I suppose casting the result of CWI_ELT () to uint64_t fixes this. Do > similar errors happen elsewhere? I don’t think you can cast to uint64_t, as host wide int might be some other type, no? There are others: ../../trunk/gcc/print-rtl.c: In function ‘void print_rtx(const_rtx)’: ../../trunk/gcc/print-rtl.c:385:62: error: format ‘%lld’ expects argument of type ‘long long int’, but argument 3 has type ‘long int’ [-Werror=format=] fprintf (outfile, HOST_WIDE_INT_PRINT_DEC, XWINT (in_rtx, i)); ^ ../../trunk/gcc/print-rtl.c:385:62: error: format ‘%lld’ expects argument of type ‘long long int’, but argument 3 has type ‘long int’ [-Werror=format=] ../../trunk/gcc/print-rtl.c:388:48: error: format ‘%llx’ expects argument of type ‘long long unsigned int’, but argument 3 has type ‘long unsigned int’ [-Werror=format=] (unsigned HOST_WIDE_INT) XWINT (in_rtx, i)); ^ ../../trunk/gcc/print-rtl.c:388:48: error: format ‘%llx’ expects argument of type ‘long long unsigned int’, but argument 3 has type ‘long unsigned int’ [-Werror=format=] ../../trunk/gcc/genemit.c: In function ‘void gen_exp(rtx, rtx_code, char*)’: ../../trunk/gcc/genemit.c:200:49: error: format ‘%lld’ expects argument of type ‘long long int’, but argument 2 has type ‘long int’ [-Werror=format=] printf (HOST_WIDE_INT_PRINT_DEC_C, INTVAL (x)); ^ ../../trunk/gcc/genemit.c:200:49: error: format ‘%lld’ expects argument of type ‘long long int’, but argument 2 has type ‘long int’ [-Werror=format=] ../../trunk/gcc/genattrtab.c: In function ‘unsigned int write_test_expr(FILE*, rtx, unsigned int, int)’: ../../trunk/gcc/genattrtab.c:3758:61: error: format ‘%lld’ expects argument of type ‘long long int’, but argument 3 has type ‘long int’ [-Werror=format=] fprintf (outf, HOST_WIDE_INT_PRINT_DEC, XWINT (exp, 0)); ^ ../../trunk/gcc/genattrtab.c:3758:61: error: format ‘%lld’ expects argument of type ‘long long int’, but argument 3 has type ‘long int’ [-Werror=format=] ../../trunk/gcc/genattrtab.c: In function ‘void write_attr_value(FILE*, attr_desc*, rtx)’: ../../trunk/gcc/genattrtab.c:4350:61: error: format ‘%lld’ expects argument of type ‘long long int’, but argument 3 has type ‘long int’ [-Werror=format=] fprintf (outf, HOST_WIDE_INT_PRINT_DEC, INTVAL (value)); ^ ../../trunk/gcc/genattrtab.c:4350:61: error: format ‘%lld’ expects argument of type ‘long long int’, but argument 3 has type ‘long int’ [-Werror=format=]
Re: [build, doc, testsuite] Centralise clearing hardware capabilities with Sun ld
On Tue, 27 May 2014, Mike Stump wrote: > So, I read the doc bits, and they look fine. I’m not a doc reviewer, > but, the changes are usual and customary for a port, and trivial. Yes, and I'd like to emphasize this point: Just because a file matches *.texi doesn't mean that ports, middle, front, whatever end maintainers cannot and should not just make changes there without requiring explicit doc approval. Gerald
[PATCH] Revert HWI printing change
I have reverted the following for now. Richard. 2014-05-28 Richard Biener Revert 2014-05-28 Richard Biener * hwint.h (HOST_WIDE_INT_PRINT_*): Define in terms of PRI*64. Index: gcc/hwint.h === *** gcc/hwint.h.orig2014-05-26 10:49:23.009339524 +0200 --- gcc/hwint.h 2014-05-26 11:09:43.597255488 +0200 *** extern char sizeof_long_long_must_be_8[s *** 109,156 typedef before using the __asm_fprintf__ format attribute. */ typedef HOST_WIDE_INT __gcc_host_wide_int__; - /* Various printf format strings for HOST_WIDE_INT. */ - - #if HOST_BITS_PER_WIDE_INT == HOST_BITS_PER_LONG - # define HOST_WIDE_INT_PRINT HOST_LONG_FORMAT - # define HOST_WIDE_INT_PRINT_C "L" - /* HOST_BITS_PER_WIDE_INT is 64 bits. */ - # define HOST_WIDE_INT_PRINT_DOUBLE_HEX \ - "0x%" HOST_LONG_FORMAT "x%016" HOST_LONG_FORMAT "x" - # define HOST_WIDE_INT_PRINT_PADDED_HEX \ - "%016" HOST_LONG_FORMAT "x" - #else - # define HOST_WIDE_INT_PRINT HOST_LONG_LONG_FORMAT - # define HOST_WIDE_INT_PRINT_C "LL" - /* HOST_BITS_PER_WIDE_INT is 64 bits. */ - # define HOST_WIDE_INT_PRINT_DOUBLE_HEX \ - "0x%" HOST_LONG_LONG_FORMAT "x%016" HOST_LONG_LONG_FORMAT "x" - # define HOST_WIDE_INT_PRINT_PADDED_HEX \ - "%016" HOST_LONG_LONG_FORMAT "x" - #endif /* HOST_BITS_PER_WIDE_INT == HOST_BITS_PER_LONG */ - - #define HOST_WIDE_INT_PRINT_DEC "%" HOST_WIDE_INT_PRINT "d" - #define HOST_WIDE_INT_PRINT_DEC_C HOST_WIDE_INT_PRINT_DEC HOST_WIDE_INT_PRINT_C - #define HOST_WIDE_INT_PRINT_UNSIGNED "%" HOST_WIDE_INT_PRINT "u" - #define HOST_WIDE_INT_PRINT_HEX "%#" HOST_WIDE_INT_PRINT "x" - #define HOST_WIDE_INT_PRINT_HEX_PURE "%" HOST_WIDE_INT_PRINT "x" - /* Provide C99 style format definitions for 64bits. */ #ifndef HAVE_INTTYPES_H #undef PRId64 ! #define PRId64 HOST_WIDE_INT_PRINT "d" #undef PRIi64 ! #define PRIi64 HOST_WIDE_INT_PRINT "i" #undef PRIo64 ! #define PRIo64 HOST_WIDE_INT_PRINT "o" #undef PRIu64 ! #define PRIu64 HOST_WIDE_INT_PRINT "u" #undef PRIx64 ! #define PRIx64 HOST_WIDE_INT_PRINT "x" #undef PRIX64 ! #define PRIX64 HOST_WIDE_INT_PRINT "X" #endif /* Define HOST_WIDEST_FAST_INT to the widest integer type supported efficiently in hardware. (That is, the widest integer type that fits in a hardware register.) Normally this is "long" but on some hosts it --- 75,119 typedef before using the __asm_fprintf__ format attribute. */ typedef HOST_WIDE_INT __gcc_host_wide_int__; /* Provide C99 style format definitions for 64bits. */ #ifndef HAVE_INTTYPES_H + #if HOST_BITS_PER_LONG == 64 + # define GCC_PRI64 HOST_LONG_FORMAT + #else + # define GCC_PRI64 HOST_LONG_LONG_FORMAT + #endif #undef PRId64 ! #define PRId64 GCC_PRI64 "d" #undef PRIi64 ! #define PRIi64 GCC_PRI64 "i" #undef PRIo64 ! #define PRIo64 GCC_PRI64 "o" #undef PRIu64 ! #define PRIu64 GCC_PRI64 "u" #undef PRIx64 ! #define PRIx64 GCC_PRI64 "x" #undef PRIX64 ! #define PRIX64 GCC_PRI64 "X" ! #endif ! ! /* Various printf format strings for HOST_WIDE_INT. */ ! ! #if HOST_BITS_PER_LONG == 64 ! # define HOST_WIDE_INT_PRINT HOST_LONG_FORMAT ! # define HOST_WIDE_INT_PRINT_C "L" ! #else ! # define HOST_WIDE_INT_PRINT HOST_LONG_LONG_FORMAT ! # define HOST_WIDE_INT_PRINT_C "LL" #endif + #define HOST_WIDE_INT_PRINT_DEC "%" PRId64 + #define HOST_WIDE_INT_PRINT_DEC_C "%" PRId64 HOST_WIDE_INT_PRINT_C + #define HOST_WIDE_INT_PRINT_UNSIGNED "%" PRIu64 + #define HOST_WIDE_INT_PRINT_HEX "%#" PRIx64 + #define HOST_WIDE_INT_PRINT_HEX_PURE "%" PRIx64 + #define HOST_WIDE_INT_PRINT_DOUBLE_HEX "0x%" PRIx64 "%016" PRIx64 + #define HOST_WIDE_INT_PRINT_PADDED_HEX "%016" PRIx64 + /* Define HOST_WIDEST_FAST_INT to the widest integer type supported efficiently in hardware. (That is, the widest integer type that fits in a hardware register.) Normally this is "long" but on some hosts it
Re: Darwin bootstrap failure following wide int merge
On Wed, May 28, 2014 at 04:00:32PM +0200, Jakub Jelinek wrote: > Defining HOST_WIDE_INT to long long or long based on whether long is 64-bit > or not, but using PRIx64 etc. unconditionally is just wrong, either > HOST_WIDE_INT should be uint64_t and then you should use PRI*64, or it is > not, and then you should keep using what has been used in the past, > different macros depending on how HOST_WIDE_INT is defined. And, note that even the fallback PRI*64 definitions if not defined in system headers already probably need to depend on autoconf test finding out whether uint64_t is unsigned long or unsigned long long (and failing otherwise, unless PRI*64 is defined in the headers). Guess the test can easily try to compile some proglet with a template to find out what uint64_t is. Jakub
Re: RFA: cache enabled attribute by insn code
The patch also fixes the arm-none-eabi build failures I've seen. Thanks, Yufeng On 05/27/14 16:07, Richard Sandiford wrote: Richard Sandiford writes: Richard Sandiford writes: Does the following patch help? Bah, it won't of course: %i1 needs to be the operator. Here's v2. I tested that it worked for simple tests like: int f1 (int x, int y) { return x + (y<< 4); } int f2 (int x, int y) { return x - (y<< 4); } int f3 (int x, int y) { return x& (y<< 4); } int f4 (int x, int y) { return x | (y<< 4); } int f5 (int x, int y) { return x ^ (y<< 4); } int f6 (int x, int y) { return (y<< 4) - x; } int g1 (int x, int y, int z) { return x + (y<< z); } int g2 (int x, int y, int z) { return x - (y<< z); } int g3 (int x, int y, int z) { return x& (y<< z); } int g4 (int x, int y, int z) { return x | (y<< z); } int g5 (int x, int y, int z) { return x ^ (y<< z); } int g6 (int x, int y, int z) { return (y<< z) - x; } as well as the testcase. Thanks, Richard gcc/ * config/arm/iterators.md (shiftable_ops): New code iterator. (t2_binop0, arith_shift_insn): New code attributes. * config/arm/arm.md (insn_enabled): Delete. (enabled): Remove insn_enabled test. (*arith_shiftsi): Split out... (*arith_multsi): ...this pattern and remove insn_enabled attribute. Index: gcc/config/arm/arm.md === --- gcc/config/arm/arm.md (revision 210972) +++ gcc/config/arm/arm.md (working copy) @@ -200,17 +200,9 @@ (const_string "yes")] (const_string "no"))) -; Allows an insn to disable certain alternatives for reasons other than -; arch support. -(define_attr "insn_enabled" "no,yes" - (const_string "yes")) - ; Enable all alternatives that are both arch_enabled and insn_enabled. (define_attr "enabled" "no,yes" - (cond [(eq_attr "insn_enabled" "no") - (const_string "no") - - (and (eq_attr "predicable_short_it" "no") + (cond [(and (eq_attr "predicable_short_it" "no") (and (eq_attr "predicated" "yes") (match_test "arm_restrict_it"))) (const_string "no") @@ -9876,39 +9868,38 @@ ;; Patterns to allow combination of arithmetic, cond code and shifts -(define_insn "*arith_shiftsi" - [(set (match_operand:SI 0 "s_register_operand" "=r,r,r,r") -(match_operator:SI 1 "shiftable_operator" - [(match_operator:SI 3 "shift_operator" - [(match_operand:SI 4 "s_register_operand" "r,r,r,r") - (match_operand:SI 5 "shift_amount_operand" "M,M,M,r")]) - (match_operand:SI 2 "s_register_operand" "rk,rk,r,rk")]))] +;; Separate out the (mult ...) case, since it doesn't allow register operands. +(define_insn "*arith_multsi" + [(set (match_operand:SI 0 "s_register_operand" "=r,r") +(shiftable_ops:SI + (match_operator:SI 2 "mult_operator" + [(match_operand:SI 3 "s_register_operand" "r,r") +(match_operand:SI 4 "power_of_two_operand")]) + (match_operand:SI 1 "s_register_operand" "rk,")))] "TARGET_32BIT" - "%i1%?\\t%0, %2, %4%S3" + "%?\\t%0, %1, %3%S2" [(set_attr "predicable" "yes") (set_attr "predicable_short_it" "no") (set_attr "shift" "4") - (set_attr "arch" "a,t2,t2,a") - ;; Thumb2 doesn't allow the stack pointer to be used for - ;; operand1 for all operations other than add and sub. In this case - ;; the minus operation is a candidate for an rsub and hence needs - ;; to be disabled. - ;; We have to make sure to disable the fourth alternative if - ;; the shift_operator is MULT, since otherwise the insn will - ;; also match a multiply_accumulate pattern and validate_change - ;; will allow a replacement of the constant with a register - ;; despite the checks done in shift_operator. - (set_attr_alternative "insn_enabled" -[(const_string "yes") - (if_then_else - (match_operand:SI 1 "add_operator" "") - (const_string "yes") (const_string "no")) - (const_string "yes") - (if_then_else - (match_operand:SI 3 "mult_operator" "") - (const_string "no") (const_string "yes"))]) - (set_attr "type" "alu_shift_imm,alu_shift_imm,alu_shift_imm,alu_shift_reg")]) + (set_attr "arch" "a,t2") + (set_attr "type" "alu_shift_imm,alu_shift_imm")]) +;; Handles cases other than the above. +(define_insn "*arith_shiftsi" + [(set (match_operand:SI 0 "s_register_operand" "=r,r,r") +(shiftable_ops:SI + (match_operator:SI 2 "shift_operator" +[(match_operand:SI 3 "s_register_operand" "r,r,r") + (match_operand:SI 4 "shift_amount_operand" "M,M,r")]) + (match_operand:SI 1 "s_register_operand" "rk,,rk")))] + "TARGET_32BIT&& GET_CODE (operands[3]) != MULT" + "%?\\t%0, %1, %3%S2" + [(
Re: Darwin bootstrap failure following wide int merge
On Wed, May 28, 2014 at 03:47:52PM +0200, Richard Biener wrote: > On Wed, May 28, 2014 at 3:15 PM, FX wrote: > >> After lengthy IRC discussions, what Richard and I can live with is > >> && !defined(__clang__) in this particular case that uses longlong.h > >> in GCC sources, with a comment why. > > > > I’ll test this patch and commit if there is no problem. But right now, > > current trunk doesn’t build on x86_64-apple-darwin due to error below. > > Richard, could this be due to your revision 211013? > > Hum, yeah. But why does it even warn if sizeof (long) == sizeof (long long)? Because using the right format strings is important for portability. > I suppose casting the result of CWI_ELT () to uint64_t fixes this. Do > similar errors happen elsewhere? > > (the hex printfs expect unsigned types but CWI_ELT returns a signed > HWI) I think the sign shouldn't be a problem, but perhaps there is a mismatch between HOST_WIDE_INT definition and HOST_WIDE_INT_PRINT_HEX? Defining HOST_WIDE_INT to long long or long based on whether long is 64-bit or not, but using PRIx64 etc. unconditionally is just wrong, either HOST_WIDE_INT should be uint64_t and then you should use PRI*64, or it is not, and then you should keep using what has been used in the past, different macros depending on how HOST_WIDE_INT is defined. Jakub
Re: Darwin bootstrap failure following wide int merge
On Wed, May 28, 2014 at 3:15 PM, FX wrote: >> After lengthy IRC discussions, what Richard and I can live with is >> && !defined(__clang__) in this particular case that uses longlong.h >> in GCC sources, with a comment why. > > I’ll test this patch and commit if there is no problem. But right now, > current trunk doesn’t build on x86_64-apple-darwin due to error below. > Richard, could this be due to your revision 211013? Hum, yeah. But why does it even warn if sizeof (long) == sizeof (long long)? I suppose casting the result of CWI_ELT () to uint64_t fixes this. Do similar errors happen elsewhere? (the hex printfs expect unsigned types but CWI_ELT returns a signed HWI) Richard. > FX > > > > ../../trunk/gcc/rtl.c: In function ‘void cwi_output_hex(FILE*, const_rtx)’: > ../../trunk/gcc/rtl.c:239:62: error: format ‘%llx’ expects argument of type > ‘long long unsigned int’, but argument 3 has type ‘long int’ [-Werror=format=] >fprintf (outfile, HOST_WIDE_INT_PRINT_HEX, CWI_ELT (x, --i)); > ^ > ../../trunk/gcc/rtl.c:239:62: error: format ‘%llx’ expects argument of type > ‘long long unsigned int’, but argument 3 has type ‘long int’ [-Werror=format=] > ../../trunk/gcc/rtl.c:241:69: error: format ‘%llx’ expects argument of type > ‘long long unsigned int’, but argument 3 has type ‘long int’ [-Werror=format=] > fprintf (outfile, HOST_WIDE_INT_PRINT_PADDED_HEX, CWI_ELT (x, i)); > ^ > ../../trunk/gcc/rtl.c:241:69: error: format ‘%llx’ expects argument of type > ‘long long unsigned int’, but argument 3 has type ‘long int’ [-Werror=format=] > cc1plus: all warnings being treated as errors > make[3]: *** [build/rtl.o] Error 1 > make[2]: *** [all-stage2-gcc] Error 2 > make[1]: *** [stage2-bubble] Error 2 > make: *** [all] Error 2 > > >
[PATCH] Inline asm asan instrumentation
Hi all, Here's a patch for optional Asan instrumentation of inline assembly. This version scans gimple for GIMPLE_ASMs and performs usual instrumentation of arguments with memory constraints ("m", "o", etc.) with fixed size. Instrumentation is turned off by default. This was successfully bootstrapped and regtested on x64. I have also instrumented and ran ffmpeg regression testsuite (it seems to have quite some inline asm). --Marat InAsmAsan.diff Description: Binary data
Re: [PATCH2/2, PR52252] Vectorization for load/store groups of size 3.
Ping. Test is modified according to the fix in the test for loads. diff --git a/gcc/testsuite/gcc.dg/vect/pr52252-st.c b/gcc/testsuite/gcc.dg/vect/pr52252-st.c new file mode 100644 index 000..e7161f7 --- /dev/null +++ b/gcc/testsuite/gcc.dg/vect/pr52252-st.c @@ -0,0 +1,21 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-mssse3" { target { i?86-*-* x86_64-*-* } } } */ + +#define byte unsigned char + +void +matrix_mul (byte *in, byte *out, int size) +{ + int i; + for (i = 0; i < size; i++) +{ + out[0] = in[0] + in[1] + in[3]; + out[1] = in[0] + in[2] + in[4]; + out[2] = in[1] + in[2] + in[4]; + in += 4; + out += 3; +} +} + +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { i?86-*-* x86_64-*-* } } } } */ +/* { dg-final { cleanup-tree-dump "vect" } } */ On Tue, May 6, 2014 at 6:39 PM, Evgeny Stupachenko wrote: > 2nd part of patch is on stores group. > Bootstrap and make check passed on x86. > > Is it ok? > > 2014-05-06 Evgeny Stupachenko > > * tree-vect-data-refs.c (vect_grouped_store_supported): New > check for storess group of length 3. > (vect_permute_store_chain): New permutations for storess group of > length 3. > * tree-vect-stmts.c (vect_model_store_cost): Change cost > of vec_perm_shuffle for the new permutations. > > ChangeLog for testsuite: > > 2014-05-06 Evgeny Stupachenko > >PR tree-optimization/52252 >* gcc.dg/vect/pr52252-st.c: Test on stores group of size 3. > > diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c > index ef710cf..fb0e30d 100644 > --- a/gcc/tree-vect-data-refs.c > +++ b/gcc/tree-vect-data-refs.c > @@ -4365,13 +4365,14 @@ vect_grouped_store_supported (tree vectype, > unsigned HOST_WIDE_INT count) > { >enum machine_mode mode = TYPE_MODE (vectype); > > - /* vect_permute_store_chain requires the group size to be a power of two. > */ > - if (exact_log2 (count) == -1) > + /* vect_permute_store_chain requires the group size to be equal to 3 or > + be a power of two. */ > + if (count != 3 && exact_log2 (count) == -1) > { >if (dump_enabled_p ()) > dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, > - "the size of the group of accesses" > - " is not a power of 2\n"); > +"the size of the group of accesses" > +" is not a power of 2 or not eqaul to 3\n"); >return false; > } > > @@ -4380,23 +4381,76 @@ vect_grouped_store_supported (tree vectype, > unsigned HOST_WIDE_INT count) > { >unsigned int i, nelt = GET_MODE_NUNITS (mode); >unsigned char *sel = XALLOCAVEC (unsigned char, nelt); > - for (i = 0; i < nelt / 2; i++) > + > + if (count == 3) > { > - sel[i * 2] = i; > - sel[i * 2 + 1] = i + nelt; > + unsigned int j0 = 0, j1 = 0, j2 = 0; > + unsigned int i, j; > + > + for (j = 0; j < 3; j++) > + { > + int nelt0 = ((3 - j) * nelt) % 3; > + int nelt1 = ((3 - j) * nelt + 1) % 3; > + int nelt2 = ((3 - j) * nelt + 2) % 3; > + for (i = 0; i < nelt; i++) > + { > + if (3 * i + nelt0 < nelt) > + sel[3 * i + nelt0] = j0++; > + if (3 * i + nelt1 < nelt) > + sel[3 * i + nelt1] = nelt + j1++; > + if (3 * i + nelt2 < nelt) > + sel[3 * i + nelt2] = 0; > + } > + if (!can_vec_perm_p (mode, false, sel)) > + { > + if (dump_enabled_p ()) > + dump_printf (MSG_MISSED_OPTIMIZATION, > +"permutaion op not supported by target.\n"); > + return false; > + } > + > + for (i = 0; i < nelt; i++) > + { > + if (3 * i + nelt0 < nelt) > + sel[3 * i + nelt0] = 3 * i + nelt0; > + if (3 * i + nelt1 < nelt) > + sel[3 * i + nelt1] = 3 * i + nelt1; > + if (3 * i + nelt2 < nelt) > + sel[3 * i + nelt2] = nelt + j2++; > + } > + if (!can_vec_perm_p (mode, false, sel)) > + { > + if (dump_enabled_p ()) > + dump_printf (MSG_MISSED_OPTIMIZATION, > +"permutaion op not supported by target.\n"); > + return false; > + } > + } > + return true; > } > - if (can_vec_perm_p (mode, false, sel)) > + else > { > - for (i = 0; i < nelt; i++) > - sel[i] += nelt / 2; > - if (can_vec_perm_p (mode, false, sel)) > - return true; > + /* If length is not equal to 3 then only power of 2 is supported. > */ > + gcc_asser
Re: Darwin bootstrap failure following wide int merge
> After lengthy IRC discussions, what Richard and I can live with is > && !defined(__clang__) in this particular case that uses longlong.h > in GCC sources, with a comment why. I’ll test this patch and commit if there is no problem. But right now, current trunk doesn’t build on x86_64-apple-darwin due to error below. Richard, could this be due to your revision 211013? FX ../../trunk/gcc/rtl.c: In function ‘void cwi_output_hex(FILE*, const_rtx)’: ../../trunk/gcc/rtl.c:239:62: error: format ‘%llx’ expects argument of type ‘long long unsigned int’, but argument 3 has type ‘long int’ [-Werror=format=] fprintf (outfile, HOST_WIDE_INT_PRINT_HEX, CWI_ELT (x, --i)); ^ ../../trunk/gcc/rtl.c:239:62: error: format ‘%llx’ expects argument of type ‘long long unsigned int’, but argument 3 has type ‘long int’ [-Werror=format=] ../../trunk/gcc/rtl.c:241:69: error: format ‘%llx’ expects argument of type ‘long long unsigned int’, but argument 3 has type ‘long int’ [-Werror=format=] fprintf (outfile, HOST_WIDE_INT_PRINT_PADDED_HEX, CWI_ELT (x, i)); ^ ../../trunk/gcc/rtl.c:241:69: error: format ‘%llx’ expects argument of type ‘long long unsigned int’, but argument 3 has type ‘long int’ [-Werror=format=] cc1plus: all warnings being treated as errors make[3]: *** [build/rtl.o] Error 1 make[2]: *** [all-stage2-gcc] Error 2 make[1]: *** [stage2-bubble] Error 2 make: *** [all] Error 2
[PATCH, PR52252] Alternative way of vectorization for load groups of size 2 and 3.
Hi, The patch introduces alternative way of permutations for load groups of size 2 and 3 which should be faster on architectures with low parallelism. The patch gives 2 times gain on Silvermont to the test from PR52252 (in addition to already committed 3 times gain). Patch passes bootstrap on x86. Make check is in progress. ChangeLog: 2014-05-28 Evgeny Stupachenko * config/i386/i386.c (ix86_have_vector_parallel_execution): New. (TARGET_VECTORIZE_HAVE_VECTOR_PARALLEL_EXECUTION): New. * config/i386/i386.h (TARGET_VECTOR_PARALLEL_EXECUTION): New. * config/i386/x86-tune.def (X86_TUNE_VECTOR_PARALLEL_EXECUTION): New. * target.def (have_vector_parallel_execution): New. * doc/tm.texi.in (have_vector_parallel_execution)): New. * doc/tm.texi: Regenerate. * targhooks.c (default_have_vector_parallel_execution): New. * tree-vect-data-refs.c (vect_shift_permute_load_chain): New. Introduces alternative way of loads group permutaions. (vect_transform_grouped_load): Try alternative way of permutaions. Evgeny vect_groups.patch Description: Binary data
[PATCH] Less noisy VRP dump, improve copy handling
The following patch makes the VRP dump less vertical space noisy. It also makes handling of the two forms of copies, a_1 = b_2; and a_1 = PHI behave more similar by also copying VR_UNDEFINED ranges in the first case and by creating a [b_2, b_2] symbolic range in the second case (if b_2 has a VR_VARYING range). Bootstrapped on x86_64-unknown-linux-gnu, testing in progress. Richard. 2014-05-28 Richard Biener * tree-ssa-propagate.c (add_control_edge): Print less vertical space. * tree-vrp.c (extract_range_from_ssa_name): Also copy VR_UNDEFINED. (vrp_visit_assignment_or_call): Print less vertical space. (vrp_visit_stmt): Likewise. (vrp_visit_phi_node): Likewise. For a PHI argument with VR_VARYING range consider recording it as copy. Index: gcc/tree-ssa-propagate.c === *** gcc/tree-ssa-propagate.c(revision 211015) --- gcc/tree-ssa-propagate.c(working copy) *** add_control_edge (edge e) *** 301,307 cfg_blocks_add (bb); if (dump_file && (dump_flags & TDF_DETAILS)) ! fprintf (dump_file, "Adding Destination of edge (%d -> %d) to worklist\n\n", e->src->index, e->dest->index); } --- 301,307 cfg_blocks_add (bb); if (dump_file && (dump_flags & TDF_DETAILS)) ! fprintf (dump_file, "\nAdding Destination of edge (%d -> %d) to worklist\n", e->src->index, e->dest->index); } Index: gcc/tree-vrp.c === *** gcc/tree-vrp.c (revision 211015) --- gcc/tree-vrp.c (working copy) *** extract_range_from_ssa_name (value_range *** 1810,1816 { value_range_t *var_vr = get_value_range (var); ! if (var_vr->type != VR_UNDEFINED && var_vr->type != VR_VARYING) copy_value_range (vr, var_vr); else set_value_range (vr, VR_RANGE, var, var, NULL); --- 1810,1816 { value_range_t *var_vr = get_value_range (var); ! if (var_vr->type != VR_VARYING) copy_value_range (vr, var_vr); else set_value_range (vr, VR_RANGE, var, var, NULL); *** vrp_visit_assignment_or_call (gimple stm *** 6679,6685 print_generic_expr (dump_file, lhs, 0); fprintf (dump_file, ": "); dump_value_range (dump_file, &new_vr); ! fprintf (dump_file, "\n\n"); } if (new_vr.type == VR_VARYING) --- 6679,6685 print_generic_expr (dump_file, lhs, 0); fprintf (dump_file, ": "); dump_value_range (dump_file, &new_vr); ! fprintf (dump_file, "\n"); } if (new_vr.type == VR_VARYING) *** vrp_visit_stmt (gimple stmt, edge *taken *** 7473,7479 { fprintf (dump_file, "\nVisiting statement:\n"); print_gimple_stmt (dump_file, stmt, 0, dump_flags); - fprintf (dump_file, "\n"); } if (!stmt_interesting_for_vrp (stmt)) --- 7473,7478 *** vrp_visit_phi_node (gimple phi) *** 8242,8248 if (dump_file && (dump_flags & TDF_DETAILS)) { fprintf (dump_file, ! "\nArgument #%d (%d -> %d %sexecutable)\n", (int) i, e->src->index, e->dest->index, (e->flags & EDGE_EXECUTABLE) ? "" : "not "); } --- 8241,8247 if (dump_file && (dump_flags & TDF_DETAILS)) { fprintf (dump_file, ! "Argument #%d (%d -> %d %sexecutable)\n", (int) i, e->src->index, e->dest->index, (e->flags & EDGE_EXECUTABLE) ? "" : "not "); } *** vrp_visit_phi_node (gimple phi) *** 8260,8275 /* Do not allow equivalences or symbolic ranges to leak in from backedges. That creates invalid equivalencies. See PR53465 and PR54767. */ ! if (e->flags & EDGE_DFS_BACK ! && (vr_arg.type == VR_RANGE ! || vr_arg.type == VR_ANTI_RANGE)) { ! vr_arg.equiv = NULL; ! if (symbolic_range_p (&vr_arg)) { ! vr_arg.type = VR_VARYING; ! vr_arg.min = NULL_TREE; ! vr_arg.max = NULL_TREE; } } } --- 8259,8288 /* Do not allow equivalences or symbolic ranges to leak in from backedges. That creates invalid equivalencies. See PR53465 and PR54767. */ ! if (e->flags & EDGE_DFS_BACK) { ! if (vr_arg.type == VR_RANGE ! || vr_arg.type == VR_ANTI_RANGE) { ! vr_arg.equiv = NULL; ! if (symbolic_range_p (&vr_arg)) ! { ! vr_arg.type =
Re: [build, doc, testsuite] Centralise clearing hardware capabilities with Sun ld
Mike Stump writes: > On May 27, 2014, at 1:04 AM, Rainer Orth > wrote: >> It's been a week since I've submitted the patch, so far having received >> approval for the testsuite parts only. > > So, I read the doc bits, and they look fine. I’m not a doc reviewer, but, > the changes are usual and customary for a port, and trivial. Thanks. Given that the build changes in gcc and libitm only affect Solaris and have also been bootstrapped without regressions on Darwin/x86_64 and Linux/x86_64, I've committed the patch on mainline and 4.9 branch. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
[PATCH] Fix PR61045
The following fixes bogus folding (introducing signed overflow) for the X +- C1 CMP Y +- C2 to X CMP Y +- C2 +- C1 transform where we let through a sign-change of the remaining constant operand. Fixed as follows, bootstrap and regtest running on x86_64-unknown-linux-gnu. Richard. 2014-05-28 Richard Biener PR middle-end/61045 * fold-const.c (fold_comparison): When folding X +- C1 CMP Y +- C2 to X CMP Y +- C2 +- C1 also ensure the sign of the remaining constant operand stays the same. * gcc.dg/pr61045.c: New testcase. Index: gcc/fold-const.c === --- gcc/fold-const.c(revision 211011) +++ gcc/fold-const.c(working copy) @@ -9239,7 +9239,7 @@ fold_comparison (location_t loc, enum tr /* Transform comparisons of the form X +- C1 CMP Y +- C2 to X CMP Y +- C2 +- C1 for signed X, Y. This is valid if the resulting offset is smaller in absolute value than the - original one. */ + original one and has the same sign. */ if (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (arg0)) && (TREE_CODE (arg0) == PLUS_EXPR || TREE_CODE (arg0) == MINUS_EXPR) && (TREE_CODE (TREE_OPERAND (arg0, 1)) == INTEGER_CST @@ -9258,32 +9258,35 @@ fold_comparison (location_t loc, enum tr "a comparison"); /* Put the constant on the side where it doesn't overflow and is -of lower absolute value than before. */ +of lower absolute value and of same sign than before. */ cst = int_const_binop (TREE_CODE (arg0) == TREE_CODE (arg1) ? MINUS_EXPR : PLUS_EXPR, const2, const1); if (!TREE_OVERFLOW (cst) - && tree_int_cst_compare (const2, cst) == tree_int_cst_sgn (const2)) + && tree_int_cst_compare (const2, cst) == tree_int_cst_sgn (const2) + && tree_int_cst_sgn (cst) == tree_int_cst_sgn (const2)) { fold_overflow_warning (warnmsg, WARN_STRICT_OVERFLOW_COMPARISON); return fold_build2_loc (loc, code, type, - variable1, - fold_build2_loc (loc, - TREE_CODE (arg1), TREE_TYPE (arg1), - variable2, cst)); + variable1, + fold_build2_loc (loc, TREE_CODE (arg1), + TREE_TYPE (arg1), + variable2, cst)); } cst = int_const_binop (TREE_CODE (arg0) == TREE_CODE (arg1) ? MINUS_EXPR : PLUS_EXPR, const1, const2); if (!TREE_OVERFLOW (cst) - && tree_int_cst_compare (const1, cst) == tree_int_cst_sgn (const1)) + && tree_int_cst_compare (const1, cst) == tree_int_cst_sgn (const1) + && tree_int_cst_sgn (cst) == tree_int_cst_sgn (const1)) { fold_overflow_warning (warnmsg, WARN_STRICT_OVERFLOW_COMPARISON); return fold_build2_loc (loc, code, type, - fold_build2_loc (loc, TREE_CODE (arg0), TREE_TYPE (arg0), - variable1, cst), - variable2); + fold_build2_loc (loc, TREE_CODE (arg0), + TREE_TYPE (arg0), + variable1, cst), + variable2); } } Index: gcc/testsuite/gcc.dg/pr61045.c === --- gcc/testsuite/gcc.dg/pr61045.c (revision 0) +++ gcc/testsuite/gcc.dg/pr61045.c (working copy) @@ -0,0 +1,12 @@ +/* { dg-do run } */ +/* { dg-options "-fstrict-overflow" } */ + +int main () +{ + int a = 0; + int b = __INT_MAX__; + int t = (a - 2) > (b - 1); + if (t != 0) +__builtin_abort(); + return 0; +}
Re: [PATCH, PR52252] Vectorization for load/store groups of size 3.
On Wed, May 28, 2014 at 03:33:15PM +0400, Evgeny Stupachenko wrote: > Ok. Fixed. Test still passes on x86: Ok. > --- a/gcc/testsuite/gcc.dg/vect/pr52252-ld.c > +++ b/gcc/testsuite/gcc.dg/vect/pr52252-ld.c > @@ -1,6 +1,5 @@ > /* { dg-do compile } */ > -/* { dg-options "-O2 -g -ftree-vectorize -mssse3 > -fdump-tree-vect-details" { target { i?86-*-* x86_64-*-* } } } */ > - > +/* { dg-additional-options "-mssse3" { target { i?86-*-* x86_64-*-* } } } */ > #define byte unsigned char > > void > @@ -26,5 +25,5 @@ matrix_mul (byte *in, byte *out, int size) > } > } > > -/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */ > +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { > target { i?86-*-* x86_64-*-* } } } } */ > /* { dg-final { cleanup-tree-dump "vect" } } */ > Jakub
Re: [PATCH, PR52252] Vectorization for load/store groups of size 3.
Ok. Fixed. Test still passes on x86: diff --git a/gcc/testsuite/gcc.dg/vect/pr52252-ld.c b/gcc/testsuite/gcc.dg/vect/pr52252-ld.c index 6e3cb52..e37b177 100644 --- a/gcc/testsuite/gcc.dg/vect/pr52252-ld.c +++ b/gcc/testsuite/gcc.dg/vect/pr52252-ld.c @@ -1,6 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -g -ftree-vectorize -mssse3 -fdump-tree-vect-details" { target { i?86-*-* x86_64-*-* } } } */ - +/* { dg-additional-options "-mssse3" { target { i?86-*-* x86_64-*-* } } } */ #define byte unsigned char void @@ -26,5 +25,5 @@ matrix_mul (byte *in, byte *out, int size) } } -/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */ +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { i?86-*-* x86_64-*-* } } } } */ /* { dg-final { cleanup-tree-dump "vect" } } */ On Wed, May 28, 2014 at 2:55 PM, Jakub Jelinek wrote: > On Wed, May 28, 2014 at 02:51:57PM +0400, Evgeny Stupachenko wrote: >> Does the following fix ok? >> >> 2014-05-28 Evgeny Stupachenko >> >>* gcc.dg/vect/pr52252-ld.c: Fix target and options for the test. >> >> diff --git a/gcc/testsuite/gcc.dg/vect/pr52252-ld.c >> b/gcc/testsuite/gcc.dg/vect/pr52252-ld.c >> index 6e3cb52..57e8468 100644 >> --- a/gcc/testsuite/gcc.dg/vect/pr52252-ld.c >> +++ b/gcc/testsuite/gcc.dg/vect/pr52252-ld.c >> @@ -1,6 +1,6 @@ >> /* { dg-do compile } */ >> -/* { dg-options "-O2 -g -ftree-vectorize -mssse3 >> -fdump-tree-vect-details" { target { i?86-*-* x86_64-* >> - >> +/* { dg-options "-O2 -g -ftree-vectorize -fdump-tree-vect-details" } */ > > dg-options should not be used in g*.dg/vect/* at all. > Not sure about -g, but the other options are provided by default already, > and shouldn't be overriden. > > Jakub
Re: Darwin bootstrap failure following wide int merge
On Wed, May 28, 2014 at 12:18:13PM +0200, Richard Biener wrote: > > ATM we get the testing coverage for i686 and ppc32 hosts. So TBH I'd > > prefer to keep it simple and just bump the version number. > > Works for me (though see Jakubs idea on the other thread, so please > wait until we settled on a solution). After lengthy IRC discussions, what Richard and I can live with is && !defined(__clang__) in this particular case that uses longlong.h in GCC sources, with a comment why. If we get too many of these workarounds, we should reconsider. Jakub
Re: AIX build broken by IPA changes
On 27/05/14 04:03, Jan Hubicka wrote: Jan, The IPA patch broke bootstrap on AIX with multiple failures. The tail of the build log is attached. Thanks, I will give it a try at gcc111, good to have reproducible testcase. FWIW, I'm seeing the same error when building arm-none-linux-gnueabihf Kyrill Honza - David make "AR_FLAGS=" "CC_FOR_BUILD=" "CC_FOR_TARGET=" "CFLAGS=-g -O2" "CXXFLAGS=-g -O2" "CFLAGS_FOR_BUILD=" "CFLAGS_FOR_TARGET=" "INSTALL=/nasfarm/edelsohn/src/src/install-sh -c" "INSTALL_DATA=/nasfarm/edelsohn/src/src/install-sh -c -m 644" "INSTALL_PROGRAM=/nasfarm/edelsohn/src/src/install-sh -c" "INSTALL_SCRIPT=/nasfarm/edelsohn/src/src/install-sh -c" "LDFLAGS=" "LIBCFLAGS=" "LIBCFLAGS_FOR_TARGET=" "MAKE=make" "MAKEINFO=/usr/gnu/bin/bash /nasfarm/edelsohn/src/src/libstdc++-v3/../missing makeinfo " "SHELL=/usr/gnu/bin/bash" "RUNTESTFLAGS=" "exec_prefix=/gsa/yktgsa/home/e/d/edelsohn/install/powerpc-ibm-aix7.1.0.0-20140525" "infodir=/gsa/yktgsa/home/e/d/edelsohn/install/powerpc-ibm-aix7.1.0.0-20140525/share/info" "libdir=/gsa/yktgsa/home/e/d/edelsohn/install/powerpc-ibm-aix7.1.0.0-20140525/lib" "includedir=/gsa/yktgsa/home/e/d/edelsohn/install/powerpc-ibm-aix7.1.0.0-20140525/include" "prefix=/gsa/yktgsa/home/e/d/edelsohn/install/powerpc-ibm-aix7.1.0.0-20140525" "tooldir=" "gxx_include_dir=/gsa/yktgsa/home/e/d/edelsohn/install/powerpc-ibm-aix7.1.0.0-20140525/include/c++/4.10.0" "AR=ar -X32_64" "AS=/tmp/20140525/./gcc/as" "LD=/tmp/20140525/./gcc/collect-ld" "RANLIB=ranlib" "NM=/tmp/20140525/./gcc/nm -B -X32_64" "NM_FOR_BUILD=" "NM_FOR_TARGET=" "DESTDIR=" "WERROR=" all-recursive make[1]: Entering directory `/tmp/20140525/powerpc-ibm-aix7.1.0.0/libstdc++-v3' Making all in include make[2]: Entering directory `/tmp/20140525/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include' make[2]: Nothing to be done for `all'. make[2]: Leaving directory `/tmp/20140525/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include' Making all in libsupc++ make[2]: Entering directory `/tmp/20140525/powerpc-ibm-aix7.1.0.0/libstdc++-v3/libsupc++' make[2]: Nothing to be done for `all'. make[2]: Leaving directory `/tmp/20140525/powerpc-ibm-aix7.1.0.0/libstdc++-v3/libsupc++' Making all in src make[2]: Entering directory `/tmp/20140525/powerpc-ibm-aix7.1.0.0/libstdc++-v3/src' Making all in c++98 make[3]: Entering directory `/tmp/20140525/powerpc-ibm-aix7.1.0.0/libstdc++-v3/src/c++98' /usr/gnu/bin/bash ../../libtool --tag CXX --tag disable-shared --mode=compile /tmp/20140525/./gcc/xgcc -shared-libgcc -B/tmp/20140525/./gcc -nostdinc++ -L/tmp/20140525/powerpc-ibm-aix7.1.0.0/libstdc++-v3/src -L/tmp/20140525/powerpc-ibm-aix7.1.0.0/libstdc++-v3/src/.libs -L/tmp/20140525/powerpc-ibm-aix7.1.0.0/libstdc++-v3/libsupc++/.libs -B/gsa/yktgsa/home/e/d/edelsohn/install/powerpc-ibm-aix7.1.0.0-20140525/powerpc-ibm-aix7.1.0.0/bin/ -B/gsa/yktgsa/home/e/d/edelsohn/install/powerpc-ibm-aix7.1.0.0-20140525/powerpc-ibm-aix7.1.0.0/lib/ -isystem /gsa/yktgsa/home/e/d/edelsohn/install/powerpc-ibm-aix7.1.0.0-20140525/powerpc-ibm-aix7.1.0.0/include -isystem /gsa/yktgsa/home/e/d/edelsohn/install/powerpc-ibm-aix7.1.0.0-20140525/powerpc-ibm-aix7.1.0.0/sys-include -I/nasfarm/edelsohn/src/src/libstdc++-v3/../libgcc -I/tmp/20140525/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include/powerpc-ibm-aix7.1.0.0 -I/tmp/20140525/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include -I/nasfarm/edelsohn/src/src/libstdc++-v3/libsupc++ -I/gsa/yktgsa/home/e/d/edelsohn/install/include -prefer-pic -D_GLIBCXX_SHARED -fno-implicit-templates -Wall -Wextra -Wwrite-strings -Wcast-qual -Wabi -fdiagnostics-show-location=once -ffunction-sections -fdata-sections -frandom-seed=strstream.lo -g -O2 -I/tmp/20140525/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include/backward -Wno-deprecated -c /nasfarm/edelsohn/src/src/libstdc++-v3/src/c++98/strstream.cc libtool: compile: /tmp/20140525/./gcc/xgcc -shared-libgcc -B/tmp/20140525/./gcc -nostdinc++ -L/tmp/20140525/powerpc-ibm-aix7.1.0.0/libstdc++-v3/src -L/tmp/20140525/powerpc-ibm-aix7.1.0.0/libstdc++-v3/src/.libs -L/tmp/20140525/powerpc-ibm-aix7.1.0.0/libstdc++-v3/libsupc++/.libs -B/gsa/yktgsa/home/e/d/edelsohn/install/powerpc-ibm-aix7.1.0.0-20140525/powerpc-ibm-aix7.1.0.0/bin/ -B/gsa/yktgsa/home/e/d/edelsohn/install/powerpc-ibm-aix7.1.0.0-20140525/powerpc-ibm-aix7.1.0.0/lib/ -isystem /gsa/yktgsa/home/e/d/edelsohn/install/powerpc-ibm-aix7.1.0.0-20140525/powerpc-ibm-aix7.1.0.0/include -isystem /gsa/yktgsa/home/e/d/edelsohn/install/powerpc-ibm-aix7.1.0.0-20140525/powerpc-ibm-aix7.1.0.0/sys-include -I/nasfarm/edelsohn/src/src/libstdc++-v3/../libgcc -I/tmp/20140525/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include/powerpc-ibm-aix7.1.0.0 -I/tmp/20140525/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include -I/nasfarm/edelsohn/src/src/libstdc++-v3/libsupc++ -I/gsa/yktgsa/home/e/d/edelsohn/install/include -D_GLIBCXX_SHARED -fno-implicit-templates -Wall -Wextra -Wwrite-strings -Wcast-qual -Wabi -fdiagnostics-show-location=once -ffunction-sections -fda
Re: [PATCH, PR52252] Vectorization for load/store groups of size 3.
missed some line tails. Correct patch below: diff --git a/gcc/testsuite/gcc.dg/vect/pr52252-ld.c b/gcc/testsuite/gcc.dg/vect/pr52252-ld.c index 6e3cb52..57e8468 100644 --- a/gcc/testsuite/gcc.dg/vect/pr52252-ld.c +++ b/gcc/testsuite/gcc.dg/vect/pr52252-ld.c @@ -1,6 +1,6 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -g -ftree-vectorize -mssse3 -fdump-tree-vect-details" { target { i?86-*-* x86_64-*-* } } } */ - +/* { dg-options "-O2 -g -ftree-vectorize -fdump-tree-vect-details" } */ +/* { dg-additional-options "-mssse3" { target { i?86-*-* x86_64-*-* } } } */ #define byte unsigned char void @@ -26,5 +26,5 @@ matrix_mul (byte *in, byte *out, int size) } } -/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */ +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { i?86-*-* x86_64-*-* } } } } */ /* { dg-final { cleanup-tree-dump "vect" } } */ On Wed, May 28, 2014 at 2:51 PM, Evgeny Stupachenko wrote: > Does the following fix ok? > > 2014-05-28 Evgeny Stupachenko > >* gcc.dg/vect/pr52252-ld.c: Fix target and options for the test. > > diff --git a/gcc/testsuite/gcc.dg/vect/pr52252-ld.c > b/gcc/testsuite/gcc.dg/vect/pr52252-ld.c > index 6e3cb52..57e8468 100644 > --- a/gcc/testsuite/gcc.dg/vect/pr52252-ld.c > +++ b/gcc/testsuite/gcc.dg/vect/pr52252-ld.c > @@ -1,6 +1,6 @@ > /* { dg-do compile } */ > -/* { dg-options "-O2 -g -ftree-vectorize -mssse3 > -fdump-tree-vect-details" { target { i?86-*-* x86_64-* > - > +/* { dg-options "-O2 -g -ftree-vectorize -fdump-tree-vect-details" } */ > +/* { dg-additional-options "-mssse3" { target { i?86-*-* x86_64-*-* } } } */ > #define byte unsigned char > > void > @@ -26,5 +26,5 @@ matrix_mul (byte *in, byte *out, int size) > } > } > > -/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */ > +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { > target { i?86-*-* x86_64-*-* } } } > /* { dg-final { cleanup-tree-dump "vect" } } */ > > On Fri, May 16, 2014 at 5:21 PM, Jakub Jelinek wrote: >> On Fri, May 16, 2014 at 03:11:05PM +0200, Rainer Orth wrote: >>> Hi Evgeny, >>> >>> > Does the following fix ok? >>> > >>> > 2014-05-16 Evgeny Stupachenko >>> > >>> >* gcc.dg/vect/pr52252-ld.c: Fix target for the test. >>> > >>> > >>> > diff --git a/gcc/testsuite/gcc.dg/vect/pr52252-ld.c >>> > b/gcc/testsuite/gcc.dg/vect/pr52252-ld.c >>> > index 6e3cb52..301433b 100644 >>> > --- a/gcc/testsuite/gcc.dg/vect/pr52252-ld.c >>> > +++ b/gcc/testsuite/gcc.dg/vect/pr52252-ld.c >>> > @@ -1,5 +1,6 @@ >>> > /* { dg-do compile } */ >>> > -/* { dg-options "-O2 -g -ftree-vectorize -mssse3 >>> > -fdump-tree-vect-details" { target { i?86-*-* x86_64-*-* } } } */ >>> > +/* { dg-options "-O2 -g -ftree-vectorize -mssse3 >>> > -fdump-tree-vect-details" } */ >>> > +/* { dg-skip-if "why" { ! { x86_64-*-* i?86-*-* } } } */ >>> >>> If the test is really x86 specific, move it to gcc.target/i386 and >>> remove the dg-skip-if. Otherwise, add an explanation for skipping the >>> test on other targets to the first arg of dg-skip-if. This is supposed >>> to be a comment stating why the test is skipped, not "why" literally. >> >> Well, I don't see anything i?86/x86_64 specific on the test. What >> is specific is the -mssse3, which supposedly should be added through >> /* { dg-additional-options "-mssse3" { target { i?86-*-* x86_64-*-* } } } */ >> and then perhaps the test might not necessarily be vectorized (so the >> dg-final line may need target guard as well. >> But, I see no reason not to try to compile this on other targets. >> >> Jakub
Re: [PATCH, PR52252] Vectorization for load/store groups of size 3.
On Wed, May 28, 2014 at 02:51:57PM +0400, Evgeny Stupachenko wrote: > Does the following fix ok? > > 2014-05-28 Evgeny Stupachenko > >* gcc.dg/vect/pr52252-ld.c: Fix target and options for the test. > > diff --git a/gcc/testsuite/gcc.dg/vect/pr52252-ld.c > b/gcc/testsuite/gcc.dg/vect/pr52252-ld.c > index 6e3cb52..57e8468 100644 > --- a/gcc/testsuite/gcc.dg/vect/pr52252-ld.c > +++ b/gcc/testsuite/gcc.dg/vect/pr52252-ld.c > @@ -1,6 +1,6 @@ > /* { dg-do compile } */ > -/* { dg-options "-O2 -g -ftree-vectorize -mssse3 > -fdump-tree-vect-details" { target { i?86-*-* x86_64-* > - > +/* { dg-options "-O2 -g -ftree-vectorize -fdump-tree-vect-details" } */ dg-options should not be used in g*.dg/vect/* at all. Not sure about -g, but the other options are provided by default already, and shouldn't be overriden. Jakub
Re: [PATCH, PR52252] Vectorization for load/store groups of size 3.
Does the following fix ok? 2014-05-28 Evgeny Stupachenko * gcc.dg/vect/pr52252-ld.c: Fix target and options for the test. diff --git a/gcc/testsuite/gcc.dg/vect/pr52252-ld.c b/gcc/testsuite/gcc.dg/vect/pr52252-ld.c index 6e3cb52..57e8468 100644 --- a/gcc/testsuite/gcc.dg/vect/pr52252-ld.c +++ b/gcc/testsuite/gcc.dg/vect/pr52252-ld.c @@ -1,6 +1,6 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -g -ftree-vectorize -mssse3 -fdump-tree-vect-details" { target { i?86-*-* x86_64-* - +/* { dg-options "-O2 -g -ftree-vectorize -fdump-tree-vect-details" } */ +/* { dg-additional-options "-mssse3" { target { i?86-*-* x86_64-*-* } } } */ #define byte unsigned char void @@ -26,5 +26,5 @@ matrix_mul (byte *in, byte *out, int size) } } -/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */ +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { i?86-*-* x86_64-*-* } } } /* { dg-final { cleanup-tree-dump "vect" } } */ On Fri, May 16, 2014 at 5:21 PM, Jakub Jelinek wrote: > On Fri, May 16, 2014 at 03:11:05PM +0200, Rainer Orth wrote: >> Hi Evgeny, >> >> > Does the following fix ok? >> > >> > 2014-05-16 Evgeny Stupachenko >> > >> >* gcc.dg/vect/pr52252-ld.c: Fix target for the test. >> > >> > >> > diff --git a/gcc/testsuite/gcc.dg/vect/pr52252-ld.c >> > b/gcc/testsuite/gcc.dg/vect/pr52252-ld.c >> > index 6e3cb52..301433b 100644 >> > --- a/gcc/testsuite/gcc.dg/vect/pr52252-ld.c >> > +++ b/gcc/testsuite/gcc.dg/vect/pr52252-ld.c >> > @@ -1,5 +1,6 @@ >> > /* { dg-do compile } */ >> > -/* { dg-options "-O2 -g -ftree-vectorize -mssse3 >> > -fdump-tree-vect-details" { target { i?86-*-* x86_64-*-* } } } */ >> > +/* { dg-options "-O2 -g -ftree-vectorize -mssse3 >> > -fdump-tree-vect-details" } */ >> > +/* { dg-skip-if "why" { ! { x86_64-*-* i?86-*-* } } } */ >> >> If the test is really x86 specific, move it to gcc.target/i386 and >> remove the dg-skip-if. Otherwise, add an explanation for skipping the >> test on other targets to the first arg of dg-skip-if. This is supposed >> to be a comment stating why the test is skipped, not "why" literally. > > Well, I don't see anything i?86/x86_64 specific on the test. What > is specific is the -mssse3, which supposedly should be added through > /* { dg-additional-options "-mssse3" { target { i?86-*-* x86_64-*-* } } } */ > and then perhaps the test might not necessarily be vectorized (so the > dg-final line may need target guard as well. > But, I see no reason not to try to compile this on other targets. > > Jakub
Re: Fix old bug in div_and_round_double
On Wed, May 28, 2014 at 12:23 PM, Eric Botcazou wrote: >> Well, a wrong-code bug plus a very obvious fix certainly qualifies. > > Fine with me, onto which branch(es) do you want me to put it? Where you have tested it already, no need to spend extra cycles. Richard. > -- > Eric Botcazou
Re: Fix old bug in div_and_round_double
> Well, a wrong-code bug plus a very obvious fix certainly qualifies. Fine with me, onto which branch(es) do you want me to put it? -- Eric Botcazou
Re: Darwin bootstrap failure following wide int merge (was: we are starting the wide int merge)
On Wed, May 28, 2014 at 12:03:39PM +0200, Richard Biener wrote: > > Any other compilers that define __GNUC__? > > Every one I guess. ICC 9 has it defined to 4, __GNUC_MINOR__ to 0. For ICC it seems to depend on the ICC version, newer ICC versions report newer __GNUC_MINOR__. That said, ICC doesn't error out on the LHS cast for -m32, just silently drops the cast, so we really can't say it is compatible in this regard. unsigned long i; void foo (void) { asm volatile ("# %0 %1" : "=r" ((unsigned long long) i) : "0" ((unsigned long long) 0)); } Jakub
Re: Fix old bug in div_and_round_double
On Wed, May 28, 2014 at 12:17 PM, Eric Botcazou wrote: >> I suppose you also install on branches? > > No plan to do so since this isn't a regression, unless you insist. :-) Well, a wrong-code bug plus a very obvious fix certainly qualifies. Richard. > -- > Eric Botcazou
Re: Mark more constants readonly
On Wed, May 28, 2014 at 12:00 PM, Bernd Schmidt wrote: > On 05/27/2014 04:57 PM, Richard Biener wrote: >> >> On Tue, May 27, 2014 at 3:13 PM, Bernd Schmidt >> wrote: >>> >>> I noticed that string constants built by the Fortran frontend don't set >>> TREE_READONLY for STRING_CST (and subsequently noticed that nothing seems >>> to >>> set it for COMPLEX_CST). That was confusing the ptx backend I'm working >>> on. >>> The -fwritable-strings option for C was removed a while ago, so I tested >>> the >>> following patch to just set the flag unconditionally, and passed >>> bootstrap >>> and test on x86_64-linux. Ok? >> >> >> Hmm? Not so obvious. TREE_READONLY has no purpose on tcc_constant >> nodes if I read documentation correctly (tree.h). In fact >> base.readonly_flag is unused for tcc_constant (and would be redundant >> with TREE_CONSTANT). > > > Well, the C frontend still sets it for strings (in fix_string_type), and > gcc.dg/lvalue-5.c fails if that is removed. So things are a little > inconsistent between frontends. Then fix that instead. The ptx backend shouldn't use TREE_READONLY on tcc_constant, so you can resolve the issue completely within the ptx backend anyway. Richard. > > Bernd >
Re: [C++ Patch] PR 57543
... turns out, I can avoid fiddling with in_decl (which, I realized, is meant to be used for diagnostics). The below version also passes testing. Thanks, Paolo. / Index: cp/pt.c === --- cp/pt.c (revision 211003) +++ cp/pt.c (working copy) @@ -11323,7 +11323,28 @@ tsubst_function_type (tree t, gcc_assert (TYPE_CONTEXT (t) == NULL_TREE); /* Substitute the return type. */ + tree save_ccp = current_class_ptr; + tree save_ccr = current_class_ref; + bool do_inject = (!current_class_ref + && TREE_CODE (t) == METHOD_TYPE + && TREE_CODE (TREE_TYPE (t)) == DECLTYPE_TYPE); + if (do_inject) +{ + /* DR 1207: 'this' is in scope in the trailing return type. */ + tree this_type = (current_class_type + ? current_class_type + : TREE_TYPE (TREE_VALUE (TYPE_ARG_TYPES (t; + inject_this_parameter (this_type, type_memfn_quals (t)); +} + return_type = tsubst (TREE_TYPE (t), args, complain, in_decl); + + if (do_inject) +{ + current_class_ptr = save_ccp; + current_class_ref = save_ccr; +} + if (return_type == error_mark_node) return error_mark_node; /* DR 486 clarifies that creation of a function type with an Index: testsuite/g++.dg/cpp0x/decltype59.C === --- testsuite/g++.dg/cpp0x/decltype59.C (revision 0) +++ testsuite/g++.dg/cpp0x/decltype59.C (working copy) @@ -0,0 +1,41 @@ +// PR c++/57543 +// { dg-do compile { target c++11 } } + +template< typename > struct X +{ + void foo(); + auto bar() -> decltype( X::foo() ); +}; + +template< typename > struct Y +{ + void foo(); + template< typename > + auto bar() -> decltype( Y::foo() ); +}; + +template< typename > struct Z +{ + void foo(); + template< typename T > + auto bar() -> decltype( T::foo() ); +}; + +template< typename > struct K +{ + void foo(); + template< typename T > + auto bar() -> decltype( T::foo() ); +}; + +template<> +template<> +auto K::bar>() -> decltype( K::foo() ); + +int main() +{ + X().bar(); + Y().bar(); + Z().bar>(); + K().bar>(); +}
Re: Darwin bootstrap failure following wide int merge
On Wed, May 28, 2014 at 11:49 AM, Richard Sandiford wrote: > Richard Biener writes: >> On Wed, May 28, 2014 at 11:40 AM, Richard Biener >> wrote: >>> On Wed, May 28, 2014 at 10:24 AM, Richard Sandiford >>> wrote: Richard Biener writes: > On Wed, May 28, 2014 at 8:50 AM, Jakub Jelinek wrote: >> On Mon, May 26, 2014 at 08:36:31AM -0700, Mike Stump wrote: >>> On May 26, 2014, at 2:22 AM, FX wrote: >>> >> This causes GCC bootstrap to fail on Darwin systems (whose system >>> > compiler is clang-based). Since PR 61146 was resolved as INVALID >>> > (but I’m not sure it’s the right call, see below), I’ve filed a >>> > separate report for the bootstrap issue >>> > (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61315). >>> > >>> > Since my PR has been closed twice by Andrew Pinski (“it’s clang’s >>> > fault, bouh ouh”), I’d ask the maintainers to step in. Can we >>> > please provide a GCC that works for the default darwin setup? Or at >>> > least drop darwin as secondary target and document the failure? >>> >>> The best coarse of action, post a patch, have it reviewed and put in. >>> Current action, a patch has been posted, the review is outstanding, I’d >>> like to see it put in; though, I am curious why the casts were there in >>> the first place. >> >> Note, haven't added them there, but from what I can test, the casts there >> can serve as a compile time check that the right type is used, e.g. >> unsigned long i; >> >> void >> foo (void) >> { >> asm volatile ("# %0 %1" : "=r" ((unsigned long long) i) : "0" >> ((unsigned long long) 0)); >> } > > Ah, interesting. A not-so-hineous extension then. In that case, how about just protecting the include with: #if GCC_VERSION >= 4300 && (W_TYPE_SIZE == 32 || defined (__SIZEOF_INT128__)) rather than: #if GCC_VERSION >= 3000 && (W_TYPE_SIZE == 32 || defined (__SIZEOF_INT128__)) so that clang will fail the version check? At the end of the day we only really care what happens during stage 2 and 3. Cross-compilers built with recentish gccs will still benefit. >>> >>> Works for me (thus, pre-approved with a comment explaining the version >>> choice). >> >> Btw, testing coverage would ask for && defined (__OPTIMIZE__), too, disabling >> that path in stage1 unconditionally even for new GCC. > > Ah, but surely the day is near when we use -O or -Og for stage1? > I've been using -O for a good few years now and it definitely makes > things faster (and without affecting debugging too much -- in the > rare cases where it does affect debugging, a recompile with -g > is very quick). > > ATM we get the testing coverage for i686 and ppc32 hosts. So TBH I'd > prefer to keep it simple and just bump the version number. Works for me (though see Jakubs idea on the other thread, so please wait until we settled on a solution). Richard. > Thanks, > Richard >
Re: Fix old bug in div_and_round_double
> I suppose you also install on branches? No plan to do so since this isn't a regression, unless you insist. :-) -- Eric Botcazou
Re: Darwin bootstrap failure following wide int merge (was: we are starting the wide int merge)
On Wed, May 28, 2014 at 12:07:41PM +0200, Richard Biener wrote: > >> #ifndef GCC_VERSION > >> +/* Some compilers pretend to be GCC, even when they are not. */ > >> +#if defined(__clang__) || defined(__INTEL_COMPILER) > >> +#define GCC_VERSION 0 > >> +#else > >> #define GCC_VERSION (__GNUC__ * 1000 + __GNUC_MINOR__) > >> +#endif > >> #endif /* GCC_VERSION */ > >> > >> so that we really can trust the GCC_VERSION macro, casts in lhs of inline > >> asm isn't the only incompatibility clang has. > >> > >> Any other compilers that define __GNUC__? > > > > Every one I guess. ICC 9 has it defined to 4, __GNUC_MINOR__ to 0. ICC is included in the above check, that is the __INTEL_COMPILER check. > So if we want to go down that route I'd rather change the configury that > checks whether we are using GCC to be more pedantic and for example > parse $CC --version output (not for the actual version but for whether > $CC is GCC). include/ is shared among many GCC/binutils/GDB uses, using autoconf macros there would be nightmare, since you'd need to replicate those tests everywhere you'd use the header, plus lots of people customize the gcc --version output and it has changed over the years. Jakub
Re: Darwin bootstrap failure following wide int merge (was: we are starting the wide int merge)
On Wed, May 28, 2014 at 12:03 PM, Richard Biener wrote: > On Wed, May 28, 2014 at 11:48 AM, Jakub Jelinek wrote: >> On Wed, May 28, 2014 at 11:38:55AM +0200, Richard Biener wrote: >>> On Wed, May 28, 2014 at 10:24 AM, FX wrote: >>> >> Yeah, a portable (C and C++) static assert would be nice. And also >>> >> pushing >>> >> this to gmp then. >>> >> >>> >> In the meantime I see nothing wrong in "merging" from GMP. >>> > >>> > One question, one comment: >>> > >>> > 1. can I count your “I see nothing wrong” as an approval, as in “global >>> > reviewers can approve changes to any part of the compiler or associated >>> > libraries”? >>> >>> Well, kind of. But Jakub is as well, so I don't want to override him. So >>> please wait for an ack from Jakub. I agree with him that the casts >>> served a purpose and that, if removed, they need to be replaced with >>> an appropriate assertion measure. >> >> I think my preference would be to change include/ansidecl.h to: >> >> /* This macro simplifies testing whether we are using gcc, and if it >> is of a particular minimum version. (Both major & minor numbers are >> significant.) This macro will evaluate to 0 if we are not using >> gcc at all.*/ >> #ifndef GCC_VERSION >> +/* Some compilers pretend to be GCC, even when they are not. */ >> +#if defined(__clang__) || defined(__INTEL_COMPILER) >> +#define GCC_VERSION 0 >> +#else >> #define GCC_VERSION (__GNUC__ * 1000 + __GNUC_MINOR__) >> +#endif >> #endif /* GCC_VERSION */ >> >> so that we really can trust the GCC_VERSION macro, casts in lhs of inline >> asm isn't the only incompatibility clang has. >> >> Any other compilers that define __GNUC__? > > Every one I guess. ICC 9 has it defined to 4, __GNUC_MINOR__ to 0. So if we want to go down that route I'd rather change the configury that checks whether we are using GCC to be more pedantic and for example parse $CC --version output (not for the actual version but for whether $CC is GCC). Richard. > Richard. > >> Jakub
Re: Darwin bootstrap failure following wide int merge (was: we are starting the wide int merge)
On Wed, May 28, 2014 at 11:48 AM, Jakub Jelinek wrote: > On Wed, May 28, 2014 at 11:38:55AM +0200, Richard Biener wrote: >> On Wed, May 28, 2014 at 10:24 AM, FX wrote: >> >> Yeah, a portable (C and C++) static assert would be nice. And also >> >> pushing >> >> this to gmp then. >> >> >> >> In the meantime I see nothing wrong in "merging" from GMP. >> > >> > One question, one comment: >> > >> > 1. can I count your “I see nothing wrong” as an approval, as in “global >> > reviewers can approve changes to any part of the compiler or associated >> > libraries”? >> >> Well, kind of. But Jakub is as well, so I don't want to override him. So >> please wait for an ack from Jakub. I agree with him that the casts >> served a purpose and that, if removed, they need to be replaced with >> an appropriate assertion measure. > > I think my preference would be to change include/ansidecl.h to: > > /* This macro simplifies testing whether we are using gcc, and if it > is of a particular minimum version. (Both major & minor numbers are > significant.) This macro will evaluate to 0 if we are not using > gcc at all.*/ > #ifndef GCC_VERSION > +/* Some compilers pretend to be GCC, even when they are not. */ > +#if defined(__clang__) || defined(__INTEL_COMPILER) > +#define GCC_VERSION 0 > +#else > #define GCC_VERSION (__GNUC__ * 1000 + __GNUC_MINOR__) > +#endif > #endif /* GCC_VERSION */ > > so that we really can trust the GCC_VERSION macro, casts in lhs of inline > asm isn't the only incompatibility clang has. > > Any other compilers that define __GNUC__? Every one I guess. ICC 9 has it defined to 4, __GNUC_MINOR__ to 0. Richard. > Jakub
Re: Mark more constants readonly
On 05/27/2014 04:57 PM, Richard Biener wrote: On Tue, May 27, 2014 at 3:13 PM, Bernd Schmidt wrote: I noticed that string constants built by the Fortran frontend don't set TREE_READONLY for STRING_CST (and subsequently noticed that nothing seems to set it for COMPLEX_CST). That was confusing the ptx backend I'm working on. The -fwritable-strings option for C was removed a while ago, so I tested the following patch to just set the flag unconditionally, and passed bootstrap and test on x86_64-linux. Ok? Hmm? Not so obvious. TREE_READONLY has no purpose on tcc_constant nodes if I read documentation correctly (tree.h). In fact base.readonly_flag is unused for tcc_constant (and would be redundant with TREE_CONSTANT). Well, the C frontend still sets it for strings (in fix_string_type), and gcc.dg/lvalue-5.c fails if that is removed. So things are a little inconsistent between frontends. Bernd
Re: [patch] VxWorks configuration adjustments for powerpc 8548 (e500v2)
On May 27, 2014, at 5:27 PM, Nathan Sidwell wrote: > ok Great! Committed as rev 211011. Thanks Nathan :-) Olivier
Re: Darwin bootstrap failure following wide int merge (was: we are starting the wide int merge)
On Wed, May 28, 2014 at 11:38:55AM +0200, Richard Biener wrote: > On Wed, May 28, 2014 at 10:24 AM, FX wrote: > >> Yeah, a portable (C and C++) static assert would be nice. And also pushing > >> this to gmp then. > >> > >> In the meantime I see nothing wrong in "merging" from GMP. > > > > One question, one comment: > > > > 1. can I count your “I see nothing wrong” as an approval, as in “global > > reviewers can approve changes to any part of the compiler or associated > > libraries”? > > Well, kind of. But Jakub is as well, so I don't want to override him. So > please wait for an ack from Jakub. I agree with him that the casts > served a purpose and that, if removed, they need to be replaced with > an appropriate assertion measure. I think my preference would be to change include/ansidecl.h to: /* This macro simplifies testing whether we are using gcc, and if it is of a particular minimum version. (Both major & minor numbers are significant.) This macro will evaluate to 0 if we are not using gcc at all.*/ #ifndef GCC_VERSION +/* Some compilers pretend to be GCC, even when they are not. */ +#if defined(__clang__) || defined(__INTEL_COMPILER) +#define GCC_VERSION 0 +#else #define GCC_VERSION (__GNUC__ * 1000 + __GNUC_MINOR__) +#endif #endif /* GCC_VERSION */ so that we really can trust the GCC_VERSION macro, casts in lhs of inline asm isn't the only incompatibility clang has. Any other compilers that define __GNUC__? Jakub
Re: Darwin bootstrap failure following wide int merge
Richard Biener writes: > On Wed, May 28, 2014 at 11:40 AM, Richard Biener > wrote: >> On Wed, May 28, 2014 at 10:24 AM, Richard Sandiford >> wrote: >>> Richard Biener writes: On Wed, May 28, 2014 at 8:50 AM, Jakub Jelinek wrote: > On Mon, May 26, 2014 at 08:36:31AM -0700, Mike Stump wrote: >> On May 26, 2014, at 2:22 AM, FX wrote: >> >> This causes GCC bootstrap to fail on Darwin systems (whose system >> > compiler is clang-based). Since PR 61146 was resolved as INVALID >> > (but I’m not sure it’s the right call, see below), I’ve filed a >> > separate report for the bootstrap issue >> > (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61315). >> > >> > Since my PR has been closed twice by Andrew Pinski (“it’s clang’s >> > fault, bouh ouh”), I’d ask the maintainers to step in. Can we >> > please provide a GCC that works for the default darwin setup? Or at >> > least drop darwin as secondary target and document the failure? >> >> The best coarse of action, post a patch, have it reviewed and put in. >> Current action, a patch has been posted, the review is outstanding, I’d >> like to see it put in; though, I am curious why the casts were there in >> the first place. > > Note, haven't added them there, but from what I can test, the casts there > can serve as a compile time check that the right type is used, e.g. > unsigned long i; > > void > foo (void) > { > asm volatile ("# %0 %1" : "=r" ((unsigned long long) i) : "0" > ((unsigned long long) 0)); > } Ah, interesting. A not-so-hineous extension then. >>> >>> In that case, how about just protecting the include with: >>> >>> #if GCC_VERSION >= 4300 && (W_TYPE_SIZE == 32 || defined >>> (__SIZEOF_INT128__)) >>> >>> rather than: >>> >>> #if GCC_VERSION >= 3000 && (W_TYPE_SIZE == 32 || defined >>> (__SIZEOF_INT128__)) >>> >>> so that clang will fail the version check? At the end of the day we >>> only really care what happens during stage 2 and 3. Cross-compilers >>> built with recentish gccs will still benefit. >> >> Works for me (thus, pre-approved with a comment explaining the version >> choice). > > Btw, testing coverage would ask for && defined (__OPTIMIZE__), too, disabling > that path in stage1 unconditionally even for new GCC. Ah, but surely the day is near when we use -O or -Og for stage1? I've been using -O for a good few years now and it definitely makes things faster (and without affecting debugging too much -- in the rare cases where it does affect debugging, a recompile with -g is very quick). ATM we get the testing coverage for i686 and ppc32 hosts. So TBH I'd prefer to keep it simple and just bump the version number. Thanks, Richard
Re: [PATCH, V2] Fix an AARCH64/ARM performance regression
On Wed, May 28, 2014 at 11:02 AM, Bernd Edlinger wrote: > Hi, > > >> But the coment previously read >> >> /* A constant address in TO_RTX can have VOIDmode, we must not try >> to call force_reg for that case. Avoid that case. */ >> >> and you are removing that check. So I guess you want to retain >> >> && GET_MODE (XEXP (to_rtx, 0)) != VOIDmode >> >> or investigate why we don't need to avoid this case anymore. In fact, >> that's probably the only change necessary, just drop the == BLKmode >> check? Your lengthy description doesn't reveal if you investigated that. >> It seems to me it tries to avoid ajdusting MEM (symbol-ref) for example? >> Maybe also for optimization reasons? > > I have now found out, what kind of MEM it is, which has a VOIDmode address... > > Just a few lines above, there is this: > > if (!MEM_P (to_rtx)) > { > /* We can get constant negative offsets into arrays with broken > user code. Translate this to a trap instead of ICEing. */ > gcc_assert (TREE_CODE (offset) == INTEGER_CST); > expand_builtin_trap (); > to_rtx = gen_rtx_MEM (BLKmode, const0_rtx); > } > > now, I have never ever seen this block executed. > Even the examples from PR 23714 do not execute this block any more. > > But const0_rtx has VOIDmode. This would satisfy the condition. > > So if I try: > > to_rtx = gen_rtx_MEM (BLKmode, const0_rtx); > force_reg(mode1, to_rtx); > > Voila: assertion in emit_move_insn. > > Somehow this matches literally to what Jeff wrote in the comment: > "A constant address in TO_RTX can have VOIDmode, we must not try > to call force_reg for that case." > > But nothing happens with: > > to_rtx = gen_rtx_MEM (BLKmode, const0_rtx); > to_rtx = adjust_address (to_rtx, mode1, bitpos / BITS_PER_UNIT) > > also the rest of the code generation works, and of course this > expands a null pointer access. > > >> Adding a new comment before the block reflecting your analysis above >> would be nice as well. > > done, see the attached patch v2. > > I have also removed the check for MEM_P (to_rtx) in the first if-statement, > because this is always satisfied, after the if (!MEM_P(to_rtx))-block above. > Now both places have exactly the same logic. > > Boot-strapped & regression-tested in x86_64-linux-gnu. > > OK for trunk? Ok. Thanks, Richard. > > Thanks > Bernd. >