Re: [PATCH] Another non-prototype builtin issue (PR middle-end/55890)
Jakub Jelinek ja...@redhat.com wrote:

Hi!

evaluate_stmt can be crashed similarly to what you've fixed today.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.

Thanks,
Richard.

2013-01-07  Jakub Jelinek  ja...@redhat.com

    PR middle-end/55890
    * tree-ssa-ccp.c (evaluate_stmt): Use gimple_call_builtin_p.

    * gcc.dg/torture/pr55890-3.c: New test.

--- gcc/tree-ssa-ccp.c.jj 2012-11-19 14:41:17.0 +0100
+++ gcc/tree-ssa-ccp.c 2013-01-07 18:53:39.952854381 +0100
@@ -1546,7 +1546,6 @@ evaluate_stmt (gimple stmt)
       && !is_constant)
     {
       enum gimple_code code = gimple_code (stmt);
-      tree fndecl;
       val.lattice_val = VARYING;
       val.value = NULL_TREE;
       val.mask = double_int_minus_one;
@@ -1593,10 +1592,9 @@ evaluate_stmt (gimple stmt)
                   || POINTER_TYPE_P (TREE_TYPE (rhs1)))
                 val = bit_value_binop (code, TREE_TYPE (rhs1), rhs1, rhs2);
             }
-      else if (code == GIMPLE_CALL
-               && (fndecl = gimple_call_fndecl (stmt))
-               && DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_NORMAL)
+      else if (gimple_call_builtin_p (stmt, BUILT_IN_NORMAL))
         {
+          tree fndecl = gimple_call_fndecl (stmt);
           switch (DECL_FUNCTION_CODE (fndecl))
             {
             case BUILT_IN_MALLOC:
--- gcc/testsuite/gcc.dg/torture/pr55890-3.c.jj 2013-01-07 18:51:27.415614201 +0100
+++ gcc/testsuite/gcc.dg/torture/pr55890-3.c 2013-01-07 18:51:22.864640151 +0100
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+
+void *memmove ();
+
+void *
+bar ()
+{
+  return memmove ();
+}

    Jakub
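
The failure mode behind this patch can be illustrated outside GCC. What follows is only a minimal sketch, not GCC code: `struct call`, `enum builtin_code`, `expected_nargs` and `call_is_usable_builtin` are hypothetical names, and it assumes (as the thread title suggests) that the problem is a builtin such as memmove declared without a prototype and called with arguments the builtin-specific handler does not expect. The idea is that recognising the callee as a builtin is not enough; the call site also has to be checked before the handler reads its arguments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for compiler data structures; not GCC types.  */
enum builtin_code { NOT_BUILTIN, BUILTIN_MEMMOVE };

struct call
{
  enum builtin_code code;  /* which builtin the callee resolves to */
  size_t nargs;            /* number of arguments actually passed */
};

/* Number of arguments the handler for CODE expects to read.
   Only BUILTIN_MEMMOVE is modelled in this sketch.  */
static size_t
expected_nargs (enum builtin_code code)
{
  assert (code == BUILTIN_MEMMOVE);
  return 3;  /* memmove (dest, src, n) */
}

/* Return true only if the callee is a builtin and the call site
   passes as many arguments as the handler will inspect.  */
bool
call_is_usable_builtin (const struct call *c)
{
  if (c->code == NOT_BUILTIN)
    return false;
  return c->nargs == expected_nargs (c->code);
}
```

In this model the call from the new test case, `memmove ()` with no arguments, is rejected, so a handler that would look at the first argument is never reached.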
[PING^1] [Patch] Ignore Invalid Memory operands in constraint 'X'
Hi,

Please consider this a reminder to review the patch posted at the following link:
http://gcc.gnu.org/ml/gcc-patches/2012-12/msg00765.html

Please review the patch and let me know whether it is okay.

Thanks & Regards,
Naveen.H.S
[Patch, AARCH64] Remove REGISTER_PREFIX
Hi,

AARCH64 back-end defines REGISTER_PREFIX to empty string. Hence, it can be removed as the default value of REGISTER_PREFIX is "".

Please find attached the patch that removes REGISTER_PREFIX and all its occurrence in aarch64 back-end.

Please review the patch and let me know if there should be any modifications?

Build and tested on aarch64-thunder-elf (using Cavium's internal simulator).

Thanks,
Naveen.H.S

2013-01-08  Naveen H.S  naveen.hurugalaw...@caviumnetworks.com

    * config/aarch64/aarch64-elf.h (REGISTER_PREFIX): Remove.
    * config/aarch64/aarch64.c (aarch64_print_operand): Remove all
    occurrence of REGISTER_PREFIX as its empty string.

--- gcc/config/aarch64/aarch64.c 2013-01-07 17:04:51.295005008 +0530
+++ gcc/config/aarch64/aarch64.c 2013-01-08 09:50:46.717051922 +0530
@@ -3348,7 +3348,7 @@ aarch64_print_operand (FILE *f, rtx x, c
           output_operand_lossage ("incompatible floating point / vector register operand for '%%%c'", code);
           return;
         }
-      asm_fprintf (f, "%s%c%d", REGISTER_PREFIX, code, REGNO (x) - V0_REGNUM);
+      asm_fprintf (f, "%c%d", code, REGNO (x) - V0_REGNUM);
       break;

     case 'S':
@@ -3361,8 +3361,7 @@ aarch64_print_operand (FILE *f, rtx x, c
           output_operand_lossage ("incompatible floating point / vector register operand for '%%%c'", code);
           return;
         }
-      asm_fprintf (f, "%sv%d", REGISTER_PREFIX,
-                   REGNO (x) - V0_REGNUM + (code - 'S'));
+      asm_fprintf (f, "v%d", REGNO (x) - V0_REGNUM + (code - 'S'));
       break;

     case 'w':
@@ -3371,20 +3370,19 @@ aarch64_print_operand (FILE *f, rtx x, c
          64-bit).  */
       if (x == const0_rtx)
         {
-          asm_fprintf (f, "%s%czr", REGISTER_PREFIX, code);
+          asm_fprintf (f, "%czr", code);
           break;
         }

       if (REG_P (x) && GP_REGNUM_P (REGNO (x)))
         {
-          asm_fprintf (f, "%s%c%d", REGISTER_PREFIX, code,
-                       REGNO (x) - R0_REGNUM);
+          asm_fprintf (f, "%c%d", code, REGNO (x) - R0_REGNUM);
           break;
         }

       if (REG_P (x) && REGNO (x) == SP_REGNUM)
         {
-          asm_fprintf (f, "%s%ssp", REGISTER_PREFIX, code == 'w' ? "w" : "");
+          asm_fprintf (f, "%ssp", code == 'w' ? "w" : "");
           break;
         }
--- gcc/config/aarch64/aarch64-elf.h 2012-12-14 09:53:42.265262998 +0530
+++ gcc/config/aarch64/aarch64-elf.h 2013-01-08 09:47:17.525044826 +0530
@@ -106,7 +106,6 @@
 #define ASM_COMMENT_START "//"

-#define REGISTER_PREFIX ""
 #define LOCAL_LABEL_PREFIX "."
 #define USER_LABEL_PREFIX ""
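
The reason the removal is behaviour-preserving can be shown with a small sketch. This is not GCC code: it uses the standard `snprintf` rather than `asm_fprintf`, and `EXAMPLE_REGISTER_PREFIX` and `prefix_is_redundant` are hypothetical names introduced only for illustration. It assumes nothing beyond what the message states, namely that the prefix is the empty string.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a register prefix defined as "".  */
#define EXAMPLE_REGISTER_PREFIX ""

/* Format a register name such as "v3" once with the (empty) prefix
   argument and once without it, and report whether the two strings
   are identical.  */
int
prefix_is_redundant (char code, int regno)
{
  char with_prefix[32];
  char without_prefix[32];

  assert (regno >= 0);
  snprintf (with_prefix, sizeof with_prefix, "%s%c%d",
            EXAMPLE_REGISTER_PREFIX, code, regno);
  snprintf (without_prefix, sizeof without_prefix, "%c%d", code, regno);
  return strcmp (with_prefix, without_prefix) == 0;
}
```

With an empty prefix, the `%s` conversion contributes no characters, so dropping both the conversion and its argument leaves the emitted register names unchanged.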
[PATCH] Enhance ifcvt conditional execution to handle IF-THEN-ELSE
Hi,

max (MAX_CONDITIONAL_EXECUTE) is doubled for IF-THEN-ELSE since there is one more JUMP for the taken branch to go to the successor of IF-THEN-ELSE. But when the last JUMP is a RETURN, the additional JUMP can be optimized away.

Here is the assembly code generated for the attached test case (-O2 -mthumb) on ARM, before and after the patch.

IT block (current result):

        cmp     r0, #0
        itete   lt
        rsblt   r0, r0, #2
        movwge  r3, #21846
        addlt   r0, r0, r0, lsl #1
        movtge  r3, 21845
        iteee   lt
        lsllt   r0, r0, #1
        subge   r0, r0, #2
        smullge r2, r3, r3, r0
        subge   r0, r3, r0, asr #31
        bx      lr

Branch (after patch):

        cmp     r0, #0
        blt     .L5
        movw    r3, #21846
        subs    r0, r0, #2
        movt    r3, 21845
        smull   r2, r3, r3, r0
        sub     r0, r3, r0, asr #31
        bx      lr
.L5:
        rsb     r0, r0, #2
        add     r0, r0, r0, lsl #1
        lsls    r0, r0, #1
        bx      lr

In general, I think the latter one will have better performance.

The patch checks whether there is a RETURN. If there is, max is not doubled.

No make check regression and no performance regression for SPECINT2000 for ARM.

Is it OK?

Thanks!
-Zhenqiang

ChangeLog:
2013-01-08  Zhenqiang Chen  zhenqiang.c...@linaro.org

    * ifcvt.c (cond_exec_process_if_block): Do not increase max when
    there is a RETURN in IF-THEN-ELSE.

testsuite/ChangeLog:

    * gcc.target/arm/if-then-else-return.c: New.

diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index b518b0e..70cb53a 100644
--- a/gcc/ifcvt.c
+++ b/gcc/ifcvt.c
@@ -443,6 +443,8 @@ cond_exec_process_if_block (ce_if_block_t * ce_info,
   rtx else_first_tail = NULL_RTX; /* First match at the tail of ELSE */
   int then_n_insns, else_n_insns, n_insns;
   enum rtx_code false_code;
+  bool has_return_p = FALSE;
+  rtx insn;

   /* If test is comprised of && or || elements, and we've failed at handling
      all of them together, just use the last test if it is the special case of
@@ -478,11 +480,19 @@ cond_exec_process_if_block (ce_if_block_t * ce_info,
   n_insns = then_n_insns;
   max = MAX_CONDITIONAL_EXECUTE;

+  insn = BB_END (then_bb);
+  if (JUMP_P (insn) && ANY_RETURN_P (PATTERN (insn)))
+    has_return_p = TRUE;
+
   if (else_bb)
     {
       int n_matching;

+      insn = BB_END (else_bb);
+      if (!has_return_p && JUMP_P (insn) && ANY_RETURN_P (PATTERN (insn)))
+        has_return_p = TRUE;
-      max *= 2;
+      if (!has_return_p)
+        max *= 2;
       else_start = first_active_insn (else_bb);
       else_end = last_active_insn (else_bb, TRUE);
       else_n_insns = ce_info->num_else_insns = count_bb_insns (else_bb);

/* For this case, we do not expect it generates IT block.  */
/* { dg-do compile } */
/* { dg-options "-O2 -mthumb" } */

int
test (int a)
{
  int r = a;
  if (r < 0)
    {
      r = -r;
      r += 2;
      r *= 6;
    }
  else
    {
      r -= 2;
      r /= 3;
    }
  return r;
}
/* { dg-final { scan-assembler "\tblt" } } */
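
The budget rule the patch introduces can be restated as a small stand-alone function. This is a sketch under stated assumptions, not the ifcvt.c code: `cond_exec_budget` and its parameters are hypothetical names, `base` stands in for MAX_CONDITIONAL_EXECUTE, and the two `*_returns` flags stand in for the `JUMP_P (insn) && ANY_RETURN_P (PATTERN (insn))` checks on the last insn of each block.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper modelling the instruction budget for
   conditional execution.  BASE models MAX_CONDITIONAL_EXECUTE.
   The budget is doubled only for an IF-THEN-ELSE in which neither
   arm ends in a return.  */
int
cond_exec_budget (int base, bool has_else, bool then_returns,
                  bool else_returns)
{
  assert (base > 0);
  bool has_return = then_returns || (has_else && else_returns);
  if (has_else && !has_return)
    return base * 2;
  return base;
}
```

For example, with a base of 4, a plain IF-THEN-ELSE gets a budget of 8, while one whose THEN or ELSE arm ends in a return stays at 4, which makes the branch form more likely to be kept.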
Re: [AARCH64] Remove %r from asm_printf
On 8 January 2013 05:04, Hurugalawadi, Naveen naveen.hurugalaw...@caviumnetworks.com wrote:

Hi,

ASM_FPRINTF_EXTENSIONS was copied from the arm back-end and is not needed. Please find attached the patch that replaces %r in asm_fprintf with reg_names. The definition of ASM_FPRINTF_EXTENSIONS is removed as it is no longer needed.

It appears that ARM and AARCH64 are the only users of this target macro.

This patch is OK. It would be much appreciated if you would handle the back port to ARM/aarch64-4.7-branch

Thanks
/Marcus

Please review the patch and let me know if there should be any modifications?

Build and tested on aarch64-thunder-elf (using Cavium's internal simulator).

Thanks,
Naveen.H.S

2013-01-08  Naveen H.S  naveen.hurugalaw...@caviumnetworks.com

    * config/aarch64/aarch64.c (aarch64_print_operand): Replace %r
    in asm_fprintf with reg_names.
    (aarch64_print_operand_address): Likewise.
    (aarch64_return_addr): Likewise.
    * config/aarch64/aarch64.h (ASM_FPRINTF_EXTENSIONS): Remove.
Re: [Patch, AARCH64] Define REVERSIBLE_CC_MODE
On 8 January 2013 05:09, Hurugalawadi, Naveen naveen.hurugalaw...@caviumnetworks.com wrote:

Hi,

AARCH64 back-end defines REVERSE_CONDITION but not REVERSIBLE_CC_MODE. REVERSE_CONDITION is used only in case REVERSIBLE_CC_MODE (mode) is nonzero.

Please find attached the patch that defines REVERSIBLE_CC_MODE for aarch64 back-end.

Please review the patch and let me know if there should be any modifications?

Build and tested on aarch64-thunder-elf (using Cavium's internal simulator).

Thanks,
Naveen.H.S

2013-01-08  Naveen H.S  naveen.hurugalaw...@caviumnetworks.com

    * config/aarch64/aarch64.h (REVERSIBLE_CC_MODE): Define.

+/* A C expression whose value is one if it is always safe to reverse a
+   comparison whose mode is MODE.  If `SELECT_CC_MODE' can ever return
+   MODE for a floating-point inequality comparison, then
+   `REVERSIBLE_CC_MODE (MODE)' must be zero.
+   You need not define this macro if it would always returns zero or if the
+   floating-point format is anything other than `IEEE_FLOAT_FORMAT'.  */
+#define REVERSIBLE_CC_MODE(MODE) 1
+

Given the comment about SELECT_CC_MODE and the implementation of aarch64_select_cc_mode this definition of REVERSIBLE_CC_MODE looks broken to me.

/Marcus
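
The concern about floating-point inequality comparisons can be made concrete with plain C, independent of GCC internals. The sketch below uses hypothetical helper names (`not_less`, `greater_equal`, `reversal_is_safe`, `example_checks`) and relies only on standard IEEE 754 behaviour: every ordered comparison involving a NaN is false, so "not less than" and "greater than or equal" are not interchangeable.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Logical negation of a < b.  */
bool
not_less (double a, double b)
{
  return !(a < b);
}

/* The comparison that a naive "reverse the condition" would use
   in place of !(a < b).  */
bool
greater_equal (double a, double b)
{
  return a >= b;
}

/* True when the two formulations agree for these operands.  */
bool
reversal_is_safe (double a, double b)
{
  return not_less (a, b) == greater_equal (a, b);
}

/* Ordered operands agree; an unordered (NaN) operand does not.  */
void
example_checks (void)
{
  assert (reversal_is_safe (1.0, 2.0));
  assert (!reversal_is_safe (NAN, 1.0));
}
```

This is why a condition-code mode that can carry a floating-point inequality cannot be declared unconditionally reversible, which appears to be the point of the objection above.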
Re: AVX generic mode tuning discussion.
On Mon, Jan 7, 2013 at 7:21 PM, Jagasia, Harsha harsha.jaga...@amd.com wrote:

We would like to propose changing AVX generic mode tuning to generate 128-bit AVX instead of 256-bit AVX.

You indicate a 3% reduction on bulldozer with avx256. How does avx128 compare to -mno-avx -msse4.2? Will the next AMD generation have a useable avx256? I'm not keen on the idea of generic mode being tuned for a single processor revision that maybe shouldn't actually be using avx at all.

Btw, it looks like the data is massively skewed by 436.cactusADM. What are the overall numbers if you disregard cactus? It's also for sure the case that the vectorizer cost model has not been touched for avx256 vs. avx128 vs. sse, so a more sensible approach would be to look at differentiating things there to improve the cactus numbers.

Harsha, did you investigate why avx256 is such a loss for cactus or why it is so much of a win for SB?

I know this thread did not get closed from our end for a while now, but we (AMD) would really like to re-open this discussion. So here goes.

We did investigate why cactus is slower in avx-256 mode than avx-128 mode on AMD processors.

Using -Ofast flag (with appropriate flags to generate avx-128 code or avx-256 code) and running with the reference data set, we observe the following runtimes on Bulldozer.

                              Runtime   %Diff AVX-256 versus AVX-128
AVX128                        616s      38%
AVX256 with store splitting   853s

Scheduling and predictive commoning are turned off in the compiler for both cases, so that the code generated by the compiler for the avx-128 and avx-256 cases is mostly equivalent, i.e. only avx-128 instructions on one side are being replaced by avx-256 instructions on the other side.

Looking at the cactus source and oprofile reports, the hottest loop nest is a triple nested loop. The innermost loop of this nest has ~400 lines of Fortran code and takes up 99% of the run time of the benchmark.

Gcc vectorizes the innermost loop for both the 128 and 256 bit cases. In order to vectorize the innermost loop, gcc generates a SIMD scalar prologue loop to align the relevant vectors, followed by a SIMD packed avx loop, followed by a SIMD scalar epilogue loop to handle what's left after a whole multiple of vector factor is taken care of.

Here are the oprofile samples seen in the AVX-128 and AVX-256 case for the innermost Fortran loop's 3 components.

Oprofile Samples          AVX 128   AVX-256-ss   Gap in samples   Gap as % of total runtime
Total                     153408    214448       61040            38%
SIMD Vector loop          135653    183074       47421            30%
SIMD Scalar Prolog loop     3817     10434        6617             4%
SIMD Scalar Epilog loop     3471     10072        6601             4%

The avx-256 code is spending 30% more time in the SIMD vector loop than the avx-128 code. The code gen appears to be equivalent for this vector loop in the 128b and 256b cases, i.e. only avx-128 instructions on one side are being replaced by avx-256 instructions on the other side. The instruction mix and scheduling are the same, except for the spilling and loading of one variable.

We know this gap is because there are fewer physical registers available for renaming to the avx-256 code, since our processor loses the upper halves of the FP registers for renaming. Our entire SIMD pipeline in the processor is 128-bit and we don't have native true 256-bit, even for foreseeable future generations, unlike Sandybridge/Ivybridge.

The avx-256 code is spending 8% more time in the SIMD scalar prologue and epilogue than the avx-128 code. The code gen is exactly the same for these scalar loops in the 128b and 256b case, i.e. exact same instruction mix and scheduling. The reason for the gap is actually the number of iterations that gcc executes in these loops for the 2 cases.

This is because gcc is following Sandybridge's recommendation and aligning avx-256 vectors to a 32-byte boundary instead of a 16-byte boundary, even on Bulldozer. The Sandybridge Software Optimization Guide mentions that the optimal memory alignment of an AVX 256-bit vector, stored in memory, is 32 bytes. The Bulldozer Software Optimization Guide says "Align all packed floating-point data on 16-byte boundaries".

In case of cactus, the relevant double vector has 118 elements
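
The effect of the alignment target on the scalar prologue can be sketched with a little arithmetic. This is an illustrative model only, not the vectorizer's actual peeling logic: `peel_iterations` is a hypothetical helper, and it assumes a contiguous array of `elem_size`-byte elements processed one element per scalar iteration until the address reaches the requested boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of scalar iterations needed before ADDR, advancing
   ELEM_SIZE bytes per iteration, reaches an ALIGN-byte boundary.
   ALIGN must be a multiple of ELEM_SIZE and ADDR must itself be
   ELEM_SIZE-aligned.  */
size_t
peel_iterations (uintptr_t addr, size_t elem_size, size_t align)
{
  assert (elem_size > 0 && align % elem_size == 0
          && addr % elem_size == 0);
  size_t misalign = addr % align;
  return misalign == 0 ? 0 : (align - misalign) / elem_size;
}
```

Under these assumptions, an array of doubles starting 8 bytes past a 32-byte boundary needs one scalar iteration to reach 16-byte alignment but three to reach 32-byte alignment, so a stricter alignment target can lengthen the prologue without changing the generated scalar code.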
[C++ Patch] PR 55801
Hi,

a simple ICE on invalid: the problem manifests itself when, during error recovery, var_defined_without_dynamic_init does TYPE_HAS_NONTRIVIAL_DESTRUCTOR (TREE_TYPE (var)) for a VAR_DECL which has error_mark_node as TREE_TYPE. It seems to me that a good place to avoid this is finish_id_expression (a bit earlier), where we already use error_operand_p, by not calling at all get_tls_wrapper_fn on such a decl.

Tested x86_64-linux.

Thanks,
Paolo.

/cp
2013-01-08  Paolo Carlini  paolo.carl...@oracle.com

    PR c++/55801
    * semantics.c (finish_id_expression): Don't call get_tls_wrapper_fn
    on a decl with error_mark_node as TREE_TYPE.

/testsuite
2013-01-08  Paolo Carlini  paolo.carl...@oracle.com

    PR c++/55801
    * g++.dg/tls/thread_local-ice.C: New.

Index: cp/semantics.c
===================================================================
--- cp/semantics.c (revision 195008)
+++ cp/semantics.c (working copy)
@@ -3,7 +3,7 @@
    building RTL.  These routines are used both during actual parsing
    and during the instantiation of template functions.

-   Copyright (C) 1998-2012 Free Software Foundation, Inc.
+   Copyright (C) 1998-2013 Free Software Foundation, Inc.
    Written by Mark Mitchell (mmitch...@usa.net) based on code found
    formerly in parse.y and pt.c.

@@ -3290,8 +3290,9 @@ finish_id_expression (tree id_expression,
     }

   tree wrap;
-  if (TREE_CODE (decl) == VAR_DECL
-      && !cp_unevaluated_operand
+  if (! error_operand_p (decl)
+      && TREE_CODE (decl) == VAR_DECL
+      && ! cp_unevaluated_operand
       && DECL_THREAD_LOCAL_P (decl)
       && (wrap = get_tls_wrapper_fn (decl)))
     {
Index: testsuite/g++.dg/tls/thread_local-ice.C
===================================================================
--- testsuite/g++.dg/tls/thread_local-ice.C (revision 0)
+++ testsuite/g++.dg/tls/thread_local-ice.C (working copy)
@@ -0,0 +1,6 @@
+// PR c++/55801
+// { dg-options "-std=c++11" }
+// { dg-require-effective-target tls }
+
+class C;
+thread_local C O, O2 = O; // { dg-error "incomplete" }
Re: Control dependence vs. builtin_unreachable
On Mon, Jan 7, 2013 at 8:45 PM, Jeff Law l...@redhat.com wrote:

On 01/05/2013 01:10 PM, Steven Bosscher wrote:

Presumably BB7 was created in response to the builtin_unreachable?

Yes. The block only contains the BB_UNREACHABLE call. It is cleaned up at the end of the GIMPLE passes pipeline, in the fold-all-builtins pass (most __builtin_unreachable calls are, but not all).

I think if you eliminate the block and clean up the CFG appropriately, the right thing will just happen.

The problem with this is that __builtin_unreachable actually exposes optimization opportunities: more const/copy props of implicit sets in the predicate guarding the __builtin_unreachable call, more optimistic value numbering, etc. It also helps improve maybe-unused warnings accuracy. So simply removing these dead ends in the CFG is probably not a good idea.

?!? By removing the empty unreachable block that's precisely what you enable. The block itself goes away and the branch leading to the block is simplified appropriately. That in turn will create larger basic blocks, enabling the const/copy propagations and similar optimizations.

Finally, removing unreachable paths was inspired by the desire to eliminate false positives from maybe-{unused,uninitialized} and similar warnings. I'd be very curious to see the conditions under which removing the empty unreachable block and appropriate cleanup of the CFG (including the underlying control statement) results in fewer optimizations and less precision in our may-warnings.

The issue is VRP - when you remove unreachable blocks you lose the conditional statement as it is no longer necessary and thus the predicate you can derive value-ranges from. Maybe there are more examples. I can imagine false positive may-be warnings then become must-be false positives.

There are clearly (as seen here) missed optimizations because we do keep the unreachable blocks around.

Note that the whole point of the unreachable () exercise was to be able to implement an assert () that even with -DNDEBUG leaves the compiler with the same optimization opportunities (as of value-range analysis) as with -UNDEBUG.

Richard.
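
The kind of assert described in the closing note can be sketched as a macro. This is only an illustration of the idea, not something taken from GCC's sources: `EXAMPLE_ASSERT` and `scale_index` are hypothetical names, the code assumes a GCC-compatible compiler that provides `__builtin_unreachable`, and whether the compiler actually exploits the implied value range depends on the optimization passes discussed in this thread.

```c
#include <assert.h>

/* With NDEBUG defined, keep the condition visible to the optimizer
   by marking the failing path as unreachable; otherwise fall back
   to the ordinary run-time check.  */
#ifdef NDEBUG
# define EXAMPLE_ASSERT(cond) \
  do { if (!(cond)) __builtin_unreachable (); } while (0)
#else
# define EXAMPLE_ASSERT(cond) assert (cond)
#endif

/* The guard tells the compiler that I lies in [0, 15] on the path
   that continues, whether or not NDEBUG is defined.  */
int
scale_index (int i)
{
  EXAMPLE_ASSERT (i >= 0 && i < 16);
  return i * 4;
}
```

Note that with NDEBUG the condition is still evaluated, so it should be free of side effects; calling `scale_index` with an out-of-range value in that configuration is undefined behaviour.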
Re: Control dependence vs. builtin_unreachable
On Sat, Jan 5, 2013 at 9:10 PM, Steven Bosscher stevenb@gmail.com wrote: On Thu, Jan 3, 2013 at 9:51 PM, Jeff Law wrote: On 01/03/2013 12:01 PM, Steven Bosscher wrote: Hello, Consider the following test case: void bar (void); int foo (int b, int c, int d) { int r = 0; if (b) res = b * 2 + 4; if (c) { if (d) r = res; else __builtin_unreachable (); } return r; } This is typical for code in GCC itself in places where gcc_unreachable() is used. The corresponding CFG looks like this: +-+ | bb0 | +-+ | | v +-+ | bb2 | -+ +-+ | | | | | v | +-+ | | bb3 | | +-+ | | | | | v | +-+ +-+ | | bb8 | -- | bb4 | + +-+ +-+ | | | | | v | +-+ +-+ | | bb5 | -- | bb7 | | +-+ +-+ | | | | | v | +-+ | | bb6 | | +-+ | | | | | v | +-+ +--- | bb9 | +-+ | | v +-+ | bb1 | +-+ Presumably BB7 was created in response to the builtin_unreachable? Yes. The block only contains the BB_UNREACHABLE call. It is cleaned up at the end of the GIMPLE passes pipeline, in the fold-all-builtins pass (most __builtin_unreachable calls are, but not all). One could argue that an empty dead-end basic block should just be removed and the CFG appropriately simplified. The problem with this, is that __builtin_unreachable actually exposes optimization opportunities: more const/copy props of implicit sets in the predicate guarding the __builtin_unreachable call, more optimistic value numbering, etc. It also helps improve maybe-unused warnings accuracy. So simply removing these dead ends in the CFG is probably not a good idea. You might want to look at a discussion from Oct/Nov 2011 New pass to delete unexecutable paths in the CFG which touches on some of this stuff. That's a really interesting discussion! I must have missed it at the time :-) It's not 100% the same, but the concept of eliminating edges from the CFG which we can never traverse in a conforming program applies to both your example and the stuff I was playing with. 
I think there is one important difference: In the thread you referred to, you're removing paths in the CFG that are implicitly not executable (for some languages anyway), whereas a __builtin_unreachable call is an explicit marker for this can never happen. I think this difference is important: * The explicit marker may have been put there on purpose (e.g. to get rid of a false-positive warning). The compiler should respect that. An implicit unreachable path can be optimized away without regard for the user's intent. * The explicit marker should not inhibit optimizations. For an implicit unreachable path the compiler should be conservative. But for a __builtin_unreachable call that is the only statement in a basic block, the compiler should be allowed to work as if the block really is never reached. The attached patch implements these ideas. During a tree-CFG cleanup, basic blocks containing only a __builtin_unreachable call are marked with a new flag BB_NEVER_REACHED. The flag is used in post-dominance: A marked block is not considered in the computations of the immediate post-dominator of the predecessor blocks. The result is a more optimistic post-dominance tree: Without the patch all predecessors of these BB_NEVER_REACHED blocks have the EXIT block as their post-dominator, but with the patch this non-executable path in the CFG is ignored and the post-dominators are those that would result if the BB_NEVER_REACHED blocks are not there at all (the BB_NEVER_REACHED blocks themselves are only post-dominated by the EXIT block). I've also added a control dependence calculation function. It's not currently used, but it shows how the BB_NEVER_REACHED flag is used in this case to avoid the false control dependences that I showed in the graphs in http://gcc.gnu.org/ml/gcc/2013-01/msg00021.html. Bootstrappedtested on powerpc64-unknown-linux-gnu. What do you think of this approach? Does it handle side-effects on the builtin-unreachable path correctly? 
int b;
int a;
extern void foo ();
int main()
{
  if (!a)
    {
      if (!b)
        foo ();
      __builtin_unreachable ();
    }
}

---

void foo ()
{
  puts("Hello");
  exit(0);
}

? Users are not _required_ to annotate noreturn
Re: [RFA] statement before variable declaration in cp_parser_initializer_list.
Hi Richard,

Hmm? We compile with a C++ compiler where this is perfectly valid ...

I was compiling with GCC 4.7 where it gave me a warning... I don't know much about C++ anymore, so I didn't know. Oh well!

Ah, for the 4.7 branch yes.

Eric Botcazou asked that we have the same code for both 4.7 and HEAD. Would it be OK to apply it to both? It's not really strictly necessary for the HEAD, but I don't see it as being harmful either.

Thank you!
--
Joel
Re: [RFA] statement before variable declaration in cp_parser_initializer_list.
On Tue, Jan 8, 2013 at 12:37 PM, Joel Brobecker brobec...@adacore.com wrote:

Hi Richard,

Hmm? We compile with a C++ compiler where this is perfectly valid ...

I was compiling with GCC 4.7 where it gave me a warning... I don't know much about C++ anymore, so I didn't know. Oh well!

Ah, for the 4.7 branch yes.

Eric Botcazou asked that we have the same code for both 4.7 and HEAD. Would it be OK to apply it to both? It's not really strictly necessary for the HEAD, but I don't see it as being harmful either.

Sure.

Richard.

Thank you!
--
Joel
Re: [PATCH, PR 55579] Make SRA keep candidated with only debug replacements
On Fri, Jan 04, 2013 at 02:00:39PM +0100, Martin Jambor wrote:

2013-01-03  Martin Jambor  mjam...@suse.cz

    PR debug/55579
    * tree-sra.c (analyze_access_subtree): Return true also after
    potentially creating a debug-only replacement.

testsuite/
    * gcc.dg/tree-ssa/pr55579.c: New test.

Looks good, thanks.

    Jakub
[AArch64] Add support for floating-point vcond.
Hi, The AArch64 port has support for integer vcond, but did not have support for floating-point vcond. This patch adds this support and enables check_effective_target_vect_cond in the testsuite. Regression tested on aarch64-none-elf with no regressions. Is this OK to commit? Thanks, James Greenhalgh --- gcc/ 2013-01-08 James Greenhalgh james.greenha...@arm.com * config/aarch64/aarch64-simd.md (aarch64_simd_bslmode_internal): Add floating-point modes. (aarch64_simd_bsl): Likewise. (aarch64_vcond_internalmode): Likewise. (vcondmodemode): Likewise. (aarch64_cmcmpmode): Fix constraints, add new modes. * config/aarch64/iterators.md (V_cmp_result): Add V2DF. gcc/testsuite/ 2013-01-08 James Greenhalgh james.greenha...@arm.com * gcc/testsuite/gcc.target/aarch64/vect-fcm-eq-d.c: New. * gcc/testsuite/gcc.target/aarch64/vect-fcm-eq-f.c: Likewise. * gcc/testsuite/gcc.target/aarch64/vect-fcm-ge-d.c: Likewise. * gcc/testsuite/gcc.target/aarch64/vect-fcm-ge-f.c: Likewise. * gcc/testsuite/gcc.target/aarch64/vect-fcm-gt-d.c: Likewise. * gcc/testsuite/gcc.target/aarch64/vect-fcm-gt-f.c: Likewise. * gcc/testsuite/gcc.target/aarch64/vect-fcm.x: Likewise. * gcc/testsuite/lib/target-supports.exp (check_effective_target_vect_cond): Enable for AArch64. diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md index d4b52c3..e6655e8 100644 --- a/gcc/config/aarch64/aarch64-simd.md +++ b/gcc/config/aarch64/aarch64-simd.md @@ -1463,7 +1463,7 @@ (set_attr simd_mode V2SI)] ) -;; vbsl_* intrinsics may compile to any of vbsl/vbif/vbit depending on register +;; vbsl_* intrinsics may compile to any of bsl/bif/bit depending on register ;; allocation. 
For an intrinsic of form: ;; vD = bsl_* (vS, vN, vM) ;; We can use any of: @@ -1472,11 +1472,12 @@ ;; bif vD, vM, vS (if D = N, so 0-bits in vS choose bits from vM, else vN) (define_insn aarch64_simd_bslmode_internal - [(set (match_operand:VDQ 0 register_operand =w,w,w) - (unspec:VDQ [(match_operand:VDQ 1 register_operand 0,w,w) - (match_operand:VDQ 2 register_operand w,w,0) - (match_operand:VDQ 3 register_operand w,0,w)] - UNSPEC_BSL))] + [(set (match_operand:VALL 0 register_operand =w,w,w) + (unspec:VALL + [(match_operand:V_cmp_result 1 register_operand 0,w,w) + (match_operand:VALL 2 register_operand w,w,0) + (match_operand:VALL 3 register_operand w,0,w)] + UNSPEC_BSL))] TARGET_SIMD @ bsl\\t%0.Vbtype, %2.Vbtype, %3.Vbtype @@ -1485,15 +1486,15 @@ ) (define_expand aarch64_simd_bslmode - [(set (match_operand:VDQ 0 register_operand) - (unspec:VDQ [(match_operand:V_cmp_result 1 register_operand) - (match_operand:VDQ 2 register_operand) - (match_operand:VDQ 3 register_operand)] - UNSPEC_BSL))] + [(set (match_operand:VALL 0 register_operand) + (unspec:VALL [(match_operand:V_cmp_result 1 register_operand) + (match_operand:VALL 2 register_operand) + (match_operand:VALL 3 register_operand)] + UNSPEC_BSL))] TARGET_SIMD { /* We can't alias operands together if they have different modes. 
*/ - operands[1] = gen_lowpart (MODEmode, operands[1]); + operands[1] = gen_lowpart (V_cmp_resultmode, operands[1]); }) (define_expand aarch64_vcond_internalmode @@ -1574,14 +1575,64 @@ DONE; }) +(define_expand aarch64_vcond_internalmode + [(set (match_operand:VDQF 0 register_operand) + (if_then_else:VDQF + (match_operator 3 comparison_operator + [(match_operand:VDQF 4 register_operand) + (match_operand:VDQF 5 nonmemory_operand)]) + (match_operand:VDQF 1 register_operand) + (match_operand:VDQF 2 register_operand)))] + TARGET_SIMD +{ + int inverse = 0; + rtx mask = gen_reg_rtx (V_cmp_resultmode); + + if (!REG_P (operands[5]) + (operands[5] != CONST0_RTX (MODEmode))) +operands[5] = force_reg (MODEmode, operands[5]); + + switch (GET_CODE (operands[3])) +{ +case LT: + inverse = 1; + /* Fall through. */ +case GE: + emit_insn (gen_aarch64_cmgemode (mask, operands[4], operands[5])); + break; +case LE: + inverse = 1; + /* Fall through. */ +case GT: + emit_insn (gen_aarch64_cmgtmode (mask, operands[4], operands[5])); + break; +case NE: + inverse = 1; + /* Fall through. */ +case EQ: + emit_insn (gen_aarch64_cmeqmode (mask, operands[4], operands[5])); + break; +default: + gcc_unreachable (); +} + + if (inverse) +emit_insn (gen_aarch64_simd_bslmode (operands[0], mask, operands[2], +operands[1])); + else +emit_insn (gen_aarch64_simd_bslmode (operands[0], mask, operands[1], +operands[2])); + DONE; +}) + (define_expand vcondmodemode - [(set (match_operand:VDQ 0 register_operand) - (if_then_else:VDQ + [(set (match_operand:VALL 0 register_operand) + (if_then_else:VALL
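
The bitwise-select operation that the vcond expansion relies on can be modelled on scalars. The sketch below is not the machine description and not an intrinsic: `lane_mask` and `bitwise_select` are hypothetical helpers operating on a single 64-bit lane, following the description in the patch comment that 1-bits in the mask choose bits from the first value operand and 0-bits choose bits from the second.

```c
#include <assert.h>
#include <stdint.h>

/* Turn a comparison result (0 or 1) into an all-zeros or all-ones
   lane mask, which is the form a vector compare produces.  */
uint64_t
lane_mask (int cond)
{
  assert (cond == 0 || cond == 1);
  return cond ? UINT64_MAX : 0;
}

/* For each bit position, take the bit from N where MASK has a 1
   and from M where MASK has a 0.  */
uint64_t
bitwise_select (uint64_t mask, uint64_t n, uint64_t m)
{
  return (mask & n) | (~mask & m);
}
```

In this model, selecting on an inverted condition amounts to swapping the two value operands, which mirrors how the expander above passes `operands[2], operands[1]` instead of `operands[1], operands[2]` when `inverse` is set.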
GCC 4.8.0 Status Report (2013-01-08), Stage 3 is over, regression fixes and docs only in effect immediately
Status
======

Stage3 has now officially ended after lasting a bit more than two months. This means that GCC trunk is now in release branch mode, thus only regression fixes and documentation changes are allowed now.

Upcoming GCC 4.8 has stabilized itself over the holidays and I am aware of whole-distro rebuilds having taken place (x86_64, i586, s390x and ppc).

There are a high number of release critical bugs around, several for less common (but primary or secondary) targets. Target maintainers - please help confirming and analyzing bugs for your targets.

GCC trunk will remain in release branch mode until the release managers consider it fit for creating a first release candidate at which point it will branch and development will open for the next stage1. History tells us this will happen not earlier than two months from now.

Quality Data
============

Priority          #   Change from Last Report
--------        ---   -----------------------
P1               25   +  2
P2               76   -  1
P3               18   - 73
--------        ---   -----------------------
Total           119   - 72

Previous Report
===============

http://gcc.gnu.org/ml/gcc/2012-11/msg00095.html

The next report will be sent by Joseph.
[testsuite] Fix g++.dg/debug/dwarf2/pr54508.C on Solaris 9/x86
g++.dg/debug/dwarf2/pr54508.C is FAILing on Solaris 9/x86 (both with as and gas): FAIL: g++.dg/debug/dwarf2/pr54508.C -std=gnu++98 scan-assembler c0[ \\t]+[#;/!|@]+ DW_AT_name[\\r\\n]+[^\\r\\n]+[\\r\\n]+[^\\r\\n]+[\\r\\n]+[^#;/!|@]+[#;/!|@]+ DW_AT_decl_line[\\r\\n]+[^#;/!|@]+[#;/!|@]+ DW_AT_declaration FAIL: g++.dg/debug/dwarf2/pr54508.C -std=gnu++98 scan-assembler s0[ \\t]+[#;/!|@]+ DW_AT_name FAIL: g++.dg/debug/dwarf2/pr54508.C -std=gnu++98 scan-assembler f10[ \\t]+[#;/!|@]+ DW_AT_name FAIL: g++.dg/debug/dwarf2/pr54508.C -std=gnu++98 scan-assembler u0[ \\t]+[#;/!|@]+ DW_AT_name[\\r\\n]+[^\\r\\n]+[\\r\\n]+[^\\r\\n]+[\\r\\n]+[^#;/!|@]+[#;/!|@]+ DW_AT_decl_line[\\r\\n]+[^#;/!|@]+[#;/!|@]+ DW_AT_declaration This turns out to be due to whitespace differences in assembler comments between Solaris 9 and 10: - .byte 0x3 / uleb128 0x3; (DW_AT_name) + .byte 0x3 / uleb128 0x3; (DW_AT_name) I've traced this to different values of ASM_COMMENT_START: config/i386/unix.h:#define ASM_COMMENT_START / config/i386/sol2-bi.h:#define ASM_COMMENT_START / The following patch fixes this; tested with the appropriate runtest invocations on i386-pc-solaris2.9, i386-pc-solaris2.10, and x86_64-unknown-linux-gnu. Ok for mainline? Rainer 2013-01-08 Rainer Orth r...@cebitec.uni-bielefeld.de * g++.dg/debug/dwarf2/pr54508.C: Allow for more whitespace after asm comments. 
# HG changeset patch # Parent 6f6b53ac13329dd89e81296140f3c39de724e7a7 Fix g++.dg/debug/dwarf2/pr54508.C on Solaris 9/x86 diff --git a/gcc/testsuite/g++.dg/debug/dwarf2/pr54508.C b/gcc/testsuite/g++.dg/debug/dwarf2/pr54508.C --- a/gcc/testsuite/g++.dg/debug/dwarf2/pr54508.C +++ b/gcc/testsuite/g++.dg/debug/dwarf2/pr54508.C @@ -2,17 +2,17 @@ // { dg-do compile } // { dg-options -g2 -dA -fno-merge-debug-strings } -// { dg-final { scan-assembler-not \cbase0\\[ \t\]+\[#;/!|@\]+ DW_AT_name } } -// { dg-final { scan-assembler \c0\\[ \t\]+\[#;/!|@\]+ DW_AT_name\[\r\n\]+\[\^\r\n\]+\[\r\n\]+\[\^\r\n\]+\[\r\n\]+\[\^#;/!|@\]+\[#;/!|@\]+ DW_AT_decl_line\[\r\n\]+\[\^#;/!|@\]+\[#;/!|@\]+ DW_AT_declaration } } -// { dg-final { scan-assembler-not \OPCODE0\\[ \t\]+\[#;/!|@\]+ DW_AT_name } } -// { dg-final { scan-assembler-not \bi0\\[ \t\]+\[#;/!|@\]+ DW_AT_name } } -// { dg-final { scan-assembler-not \si0\\[ \t\]+\[#;/!|@\]+ DW_AT_name } } -// { dg-final { scan-assembler \s0\\[ \t\]+\[#;/!|@\]+ DW_AT_name } } -// { dg-final { scan-assembler-not \s0\\[^#;/!|@\]+\[#;/!|@\]+ DW_AT_name\[\r\n\]+\[\^\r\n\]+\[\r\n\]+\[\^\r\n\]+\[\r\n\]+\[\^#;/!|@\]+\[#;/!|@\]+ DW_AT_decl_line\[\r\n\]+\[ \t\]+\[#;/!|@\]+ DW_AT_declaration } } -// { dg-final { scan-assembler \f10\\[ \t\]+\[#;/!|@\]+ DW_AT_name } } -// { dg-final { scan-assembler \u0\\[ \t\]+\[#;/!|@\]+ DW_AT_name\[\r\n\]+\[\^\r\n\]+\[\r\n\]+\[\^\r\n\]+\[\r\n\]+\[\^#;/!|@\]+\[#;/!|@\]+ DW_AT_decl_line\[\r\n\]+\[^#;/!|@\]+\[#;/!|@\]+ DW_AT_declaration } } -// { dg-final { scan-assembler-not \f20\\[ \t\]+\[#;/!|@\]+ DW_AT_name } } -// { dg-final { scan-assembler-not \nc0\\[ \t\]+\# DW_AT_name } } +// { dg-final { scan-assembler-not \cbase0\\[ \t\]+\[#;/!|@\]+ +DW_AT_name } } +// { dg-final { scan-assembler \c0\\[ \t\]+\[#;/!|@\]+ +DW_AT_name\[\r\n\]+\[\^\r\n\]+\[\r\n\]+\[\^\r\n\]+\[\r\n\]+\[\^#;/!|@\]+\[#;/!|@\]+ +DW_AT_decl_line\[\r\n\]+\[\^#;/!|@\]+\[#;/!|@\]+ +DW_AT_declaration } } +// { dg-final { scan-assembler-not \OPCODE0\\[ 
\t\]+\[#;/!|@\]+ +DW_AT_name } } +// { dg-final { scan-assembler-not \bi0\\[ \t\]+\[#;/!|@\]+ +DW_AT_name } } +// { dg-final { scan-assembler-not \si0\\[ \t\]+\[#;/!|@\]+ +DW_AT_name } } +// { dg-final { scan-assembler \s0\\[ \t\]+\[#;/!|@\]+ +DW_AT_name } } +// { dg-final { scan-assembler-not \s0\\[^#;/!|@\]+\[#;/!|@\]+ +DW_AT_name\[\r\n\]+\[\^\r\n\]+\[\r\n\]+\[\^\r\n\]+\[\r\n\]+\[\^#;/!|@\]+\[#;/!|@\]+ +DW_AT_decl_line\[\r\n\]+\[ \t\]+\[#;/!|@\]+ +DW_AT_declaration } } +// { dg-final { scan-assembler \f10\\[ \t\]+\[#;/!|@\]+ +DW_AT_name } } +// { dg-final { scan-assembler \u0\\[ \t\]+\[#;/!|@\]+ +DW_AT_name\[\r\n\]+\[\^\r\n\]+\[\r\n\]+\[\^\r\n\]+\[\r\n\]+\[\^#;/!|@\]+\[#;/!|@\]+ +DW_AT_decl_line\[\r\n\]+\[^#;/!|@\]+\[#;/!|@\]+ +DW_AT_declaration } } +// { dg-final { scan-assembler-not \f20\\[ \t\]+\[#;/!|@\]+ +DW_AT_name } } +// { dg-final { scan-assembler-not \nc0\\[ \t\]+\# +DW_AT_name } } class cbase -- - Rainer Orth, Center for Biotechnology, Bielefeld University
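
The effect of adding `+` after the space in these patterns can be demonstrated in isolation. The sketch below is an approximation: the testsuite uses Tcl regular expressions via scan-assembler, whereas this example uses POSIX extended regular expressions from `<regex.h>` as a stand-in, and `pattern_matches` is a hypothetical helper. The sample strings assume that the two targets differ only in how many spaces follow the assembler comment character.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Return 1 if the POSIX extended regular expression PATTERN matches
   somewhere in TEXT, 0 otherwise.  */
int
pattern_matches (const char *pattern, const char *text)
{
  regex_t re;
  int rc = regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB);
  assert (rc == 0);  /* the example patterns are expected to compile */
  rc = regexec (&re, text, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}
```

With the pattern `[#;/!|@]+ DW_AT_name`, a comment written as `/ DW_AT_name` matches but one with two spaces after the slash does not; with `[#;/!|@]+ +DW_AT_name` both match, which is the tolerance the patch adds.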
Re: [testsuite] Fix g++.dg/debug/dwarf2/pr54508.C on Solaris 9/x86
On Tue, Jan 08, 2013 at 01:11:24PM +0100, Rainer Orth wrote:
> Ok for mainline?

Yes, thanks.

> 2013-01-08  Rainer Orth  r...@cebitec.uni-bielefeld.de
>
>         * g++.dg/debug/dwarf2/pr54508.C: Allow for more whitespace after asm
>         comments.

        Jakub
[RFA/dwarf] Add DW_AT_use_GNAT_descriptive_type flag for Ada units.
Hello,

I just noticed that part of the proposal we made for...

    http://gcc.gnu.org/wiki/DW_AT_GNAT_descriptive_type

... got missed in the patch that got checked in:

    http://gcc.gnu.org/ml/gcc-patches/2011-04/msg00099.html

In particular, we're missing the second part, where we are expected to
generate a DW_AT_use_GNAT_descriptive_type flag in the CU in order to
tell the consumers that we are providing the descriptive type.

gcc/ChangeLog:

        * dwarf2out.c (gen_compile_unit_die): Add
        DW_AT_use_GNAT_descriptive_type attribute for Ada units.

Tested on x86_64-linux, no regression. I also tested against the GDB
testsuite, before and after, and no regression. This is expected,
since GDB currently does not look for this attribute (but I would like
it to).

OK to apply for HEAD and 4.7?

Thanks,
-- 
Joel

---
 gcc/dwarf2out.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index a865250..8117ce9 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -18940,6 +18940,10 @@ gen_compile_unit_die (const char *filename)
       /* The default DW_ID_case_sensitive doesn't need to be specified.  */
       break;
     }
+
+  if (language == DW_LANG_Ada95)
+    add_AT_flag (die, DW_AT_use_GNAT_descriptive_type, 1);
+
   return die;
 }
 
-- 
1.7.0.4
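As an illustration of what the CU-level flag buys a consumer, here is a minimal sketch in C. It is not GCC or GDB code: `struct toy_die`, `struct toy_attr` and every `toy_*` helper are hypothetical names invented for this example, and the numeric constants are the values I believe the DWARF specification and GCC's dwarf2 headers use, so treat them as assumptions. The producer half mirrors the two-line check in the patch; the consumer half shows a debugger deciding once per compilation unit whether descriptive types are worth looking up.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed values: DW_LANG_Ada95 and DW_AT_language from the DWARF
   specification, DW_AT_use_GNAT_descriptive_type from GCC's GNU
   extension range.  */
enum { DW_LANG_C89 = 0x0001, DW_LANG_Ada95 = 0x000d };
enum { DW_AT_language = 0x13, DW_AT_use_GNAT_descriptive_type = 0x2301 };

/* Hypothetical, heavily simplified model of a DIE and its attributes.  */
struct toy_attr { unsigned name; unsigned value; };
struct toy_die { struct toy_attr attrs[8]; size_t n_attrs; };

/* Append an attribute; returns false when the toy DIE is full.  */
static bool
toy_add_attr (struct toy_die *d, unsigned name, unsigned value)
{
  if (d->n_attrs >= sizeof d->attrs / sizeof d->attrs[0])
    return false;
  d->attrs[d->n_attrs].name = name;
  d->attrs[d->n_attrs].value = value;
  d->n_attrs++;
  return true;
}

/* Return the attribute called NAME, or NULL when the DIE lacks it.  */
static const struct toy_attr *
toy_find_attr (const struct toy_die *d, unsigned name)
{
  for (size_t i = 0; i < d->n_attrs; i++)
    if (d->attrs[i].name == name)
      return &d->attrs[i];
  return NULL;
}

/* Producer side: same shape as the patched gen_compile_unit_die,
   which adds the flag only for Ada units.  */
static void
toy_gen_compile_unit_die (struct toy_die *d, unsigned language)
{
  d->n_attrs = 0;
  toy_add_attr (d, DW_AT_language, language);
  if (language == DW_LANG_Ada95)
    toy_add_attr (d, DW_AT_use_GNAT_descriptive_type, 1);
}

/* Consumer side: one check per CU tells the debugger whether it
   should expect descriptive-type attributes on the types inside.  */
static bool
toy_cu_uses_descriptive_types (const struct toy_die *cu)
{
  const struct toy_attr *a
    = toy_find_attr (cu, DW_AT_use_GNAT_descriptive_type);
  return a != NULL && a->value != 0;
}
```

With this shape, a consumer can skip the descriptive-type lookup entirely for units that do not carry the flag, which is the point of advertising it on the CU rather than discovering it type by type.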
Re: [RFA] statement before variable declaration in cp_parser_initializer_list.
> > Eric Botcazou asked that we have the same code for both 4.7 and HEAD.
> > Would it be OK to apply it to both? It's not really strictly necessary
> > for the HEAD, but I don't see it as being harmful either.
>
> Sure.

Thank you! Now checked in.

-- 
Joel
Re: PR tree-optimization/55823 (ipa-inline-transform ICE)
Hi,

On Mon, Jan 07, 2013 at 01:26:23AM +0100, Jan Hubicka wrote:
> Hi,
> as discussed in the PR log, there seems to be an ordering issue in
> update_indirect_edges_after_inlining, which first updates the info in the
> call edge to correspond to the situation after inlining and then tries to
> devirtualize, which looks up the info from prior to inlining.
>
> Bootstrapped/regtested x86_64-linux
> Martin, does it look sane?

Yes, this is exactly what needs to be done.  I'm quite surprised I had
not already added a testcase for this.

> Can you also, please, look into why ipa-cp is not handling both calls?

I will.

Thanks a lot for the patch,

Martin

> Honza
>
>         PR tree-optimization/55823
>         * g++.dg/ipa/devirt-10.C: New testcase.
>         * ipa-prop.c (update_indirect_edges_after_inlining): Fix ordering issue.
>
> Index: testsuite/g++.dg/ipa/devirt-10.C
> ===================================================================
> *** testsuite/g++.dg/ipa/devirt-10.C (revision 0)
> --- testsuite/g++.dg/ipa/devirt-10.C (revision 0)
> ***************
> *** 0 ****
> --- 1,34 ----
> + /* { dg-do compile } */
> + /* { dg-options "-O3 -fdump-ipa-inline -fdump-ipa-cp" } */
> + class wxPaintEvent { };
> + struct wxDCBase
> + {
> +   wxDCBase ();
> +   virtual int GetLayoutDirection() const{}
> +   virtual void SetLayoutDirection(int){}
> + };
> + struct wxWindowDC : public wxDCBase {};
> + struct wxBufferedDC : public wxDCBase
> + {
> +   void Init(wxDCBase*dc) {
> +     InitCommon(dc);
> +   }
> +   void InitCommon(wxDCBase*dc) {
> +     if (dc)
> +       SetLayoutDirection(dc->GetLayoutDirection());
> +   }
> + };
> + struct wxBufferedPaintDC : public wxBufferedDC {
> +   wxBufferedPaintDC() {
> +     Init(&m_paintdc);
> +   }
> +   wxWindowDC m_paintdc;
> + };
> + void OnPaint(wxPaintEvent event) {
> +   wxBufferedPaintDC dc;
> + }
> + /* IPA-CP should really discover both cases, but for time being the second is handled by inliner.  */
> + /* { dg-final { scan-ipa-dump-times "Discovered a virtual call to a known target" 1 "inline" } } */
> + /* { dg-final { scan-ipa-dump-times "Discovered a virtual call to a known target" 1 "cp" } } */
> + /* { dg-final { cleanup-ipa-dump "inline" } } */
> + /* { dg-final { cleanup-ipa-dump "cp" } } */
> Index: ipa-prop.c
> ===================================================================
> *** ipa-prop.c (revision 194918)
> --- ipa-prop.c (working copy)
> *************** update_indirect_edges_after_inlining (st
> *** 2264,2303 ****
>         param_index = ici->param_index;
>         jfunc = ipa_get_ith_jump_func (top, param_index);
> -       if (jfunc->type == IPA_JF_PASS_THROUGH
> -           && ipa_get_jf_pass_through_operation (jfunc) == NOP_EXPR)
> -         {
> -           if (ici->agg_contents
> -               && !ipa_get_jf_pass_through_agg_preserved (jfunc))
> -             ici->param_index = -1;
> -           else
> -             ici->param_index = ipa_get_jf_pass_through_formal_id (jfunc);
> -         }
> -       else if (jfunc->type == IPA_JF_ANCESTOR)
> -         {
> -           if (ici->agg_contents
> -               && !ipa_get_jf_ancestor_agg_preserved (jfunc))
> -             ici->param_index = -1;
> -           else
> -             {
> -               ici->param_index = ipa_get_jf_ancestor_formal_id (jfunc);
> -               ici->offset += ipa_get_jf_ancestor_offset (jfunc);
> -             }
> -         }
> -       else
> -         /* Either we can find a destination for this edge now or never. */
> -         ici->param_index = -1;
>   
>         if (!flag_indirect_inlining)
> !         continue;
> ! 
> !       if (ici->polymorphic)
>           new_direct_edge = try_make_edge_direct_virtual_call (ie, jfunc,
>                                                                new_root_info);
>         else
>           new_direct_edge = try_make_edge_direct_simple_call (ie, jfunc,
>                                                               new_root_info);
> - 
>         if (new_direct_edge)
>           {
>             new_direct_edge->indirect_inlining_edge = 1;
> --- 2264,2278 ----
>         param_index = ici->param_index;
>         jfunc = ipa_get_ith_jump_func (top, param_index);
>   
>         if (!flag_indirect_inlining)
> !         new_direct_edge = NULL;
> !       else if (ici->polymorphic)
>           new_direct_edge = try_make_edge_direct_virtual_call (ie, jfunc,
>                                                                new_root_info);
>         else
>           new_direct_edge = try_make_edge_direct_simple_call (ie, jfunc,
>                                                               new_root_info);
>         if (new_direct_edge)
>           {
>             new_direct_edge->indirect_inlining_edge = 1;
> *************** update_indirect_edges_after_inlining (st
> *** 2312,2317 ****
> --- 2287,2315 ----
>                 res = true;
>               }
>           }
> +       else if (jfunc->type == IPA_JF_PASS_THROUGH
> +                && ipa_get_jf_pass_through_operation (jfunc) == NOP_EXPR)
> +         {
> +           if (ici->agg_contents
> +               && !ipa_get_jf_pass_through_agg_preserved (jfunc))
> +             ici->param_index = -1;
> +           else
> +             ici->param_index = ipa_get_jf_pass_through_formal_id (jfunc);
> +
Re: PR tree-optimization/55823 (ipa-inline-transform ICE)
On Tue, Jan 8, 2013 at 2:29 PM, Martin Jambor mjam...@suse.cz wrote:
> Hi,
>
> On Mon, Jan 07, 2013 at 01:26:23AM +0100, Jan Hubicka wrote:
>> Hi,
>> as discussed in the PR log, there seems to be an ordering issue in
>> update_indirect_edges_after_inlining, which first updates the info in the
>> call edge to correspond to the situation after inlining and then tries to
>> devirtualize, which looks up the info from prior to inlining.
>>
>> Bootstrapped/regtested x86_64-linux
>> Martin, does it look sane?
>
> Yes, this is exactly what needs to be done.  I'm quite surprised I had
> not already added a testcase for this.

Is this maybe related to PR55264?  The patch is also not yet applied btw ...

Richard.

>> Can you also, please, look into why ipa-cp is not handling both calls?
>
> I will.
>
> Thanks a lot for the patch,
>
> Martin
>
>> Honza
>>
>>         PR tree-optimization/55823
>>         * g++.dg/ipa/devirt-10.C: New testcase.
>>         * ipa-prop.c (update_indirect_edges_after_inlining): Fix ordering issue.
>>
>> [...]
[AARCH64] Fix support for vectorization over sqrt (), sqrtf ().
Hi,

This patch fixes the support for vectorizing sqrt and sqrtf function
calls on AArch64. While the standard name was implemented previously,
the glue in aarch64_builtin_vectorized_function was not.

We also fix the initialisation of vectors in the
gcc.target/aarch64/vsqrt.c testcase such that they are more
big-endian-safe.

This patch regresses clean on aarch64-none-elf.

Applying this patch gives:

FAIL->PASS: gcc.dg/vect/no-math-errno-slp-32.c scan-tree-dump vect pattern recognized
FAIL->PASS: gcc.dg/vect/no-math-errno-vect-pow-1.c scan-tree-dump vect pattern recognized

Is this OK to commit?

Thanks,
James Greenhalgh

---
gcc/

2013-01-08  James Greenhalgh  james.greenha...@arm.com

        * config/aarch64/aarch64-builtins.c
        (aarch64_builtin_vectorized_function): Handle sqrt, sqrtf.

gcc/testsuite/

2013-01-08  James Greenhalgh  james.greenha...@arm.com

        * gcc.target/aarch64/vsqrt.c (test_square_root_v2sf): Use
        endian-safe float pool loading.
        (test_square_root_v4sf): Likewise.
        (test_square_root_v2df): Likewise.
        * lib/target-supports.exp
        (check_effective_target_vect_call_sqrtf): Add AArch64.

diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c
index a683afd..f54e9f4 100644
--- a/gcc/config/aarch64/aarch64-builtins.c
+++ b/gcc/config/aarch64/aarch64-builtins.c
@@ -1271,6 +1271,9 @@ aarch64_builtin_vectorized_function (tree fndecl, tree type_out, tree type_in)
         case BUILT_IN_NEARBYINT:
         case BUILT_IN_NEARBYINTF:
           return AARCH64_FIND_FRINT_VARIANT (frinti);
+        case BUILT_IN_SQRT:
+        case BUILT_IN_SQRTF:
+          return AARCH64_FIND_FRINT_VARIANT (sqrt);
 #undef AARCH64_CHECK_BUILTIN_MODE
 #define AARCH64_CHECK_BUILTIN_MODE(C, N) \
   (out_mode == N##Imode && out_n == C \
diff --git a/gcc/testsuite/gcc.target/aarch64/vsqrt.c b/gcc/testsuite/gcc.target/aarch64/vsqrt.c
index b59535a..5b777b2 100644
--- a/gcc/testsuite/gcc.target/aarch64/vsqrt.c
+++ b/gcc/testsuite/gcc.target/aarch64/vsqrt.c
@@ -11,9 +11,11 @@ extern void abort (void);
 void
 test_square_root_v2sf ()
 {
-  float32x2_t val = {4.0f, 9.0f};
+  const float32_t pool[] = {4.0f, 9.0f};
+  float32x2_t val;
   float32x2_t res;
 
+  val = vld1_f32 (pool);
   res = vsqrt_f32 (val);
 
   if (vget_lane_f32 (res, 0) != 2.0f)
@@ -25,9 +27,11 @@ test_square_root_v2sf ()
 void
 test_square_root_v4sf ()
 {
-  float32x4_t val = {4.0f, 9.0f, 16.0f, 25.0f};
+  const float32_t pool[] = {4.0f, 9.0f, 16.0f, 25.0f};
+  float32x4_t val;
   float32x4_t res;
 
+  val = vld1q_f32 (pool);
   res = vsqrtq_f32 (val);
 
   if (vgetq_lane_f32 (res, 0) != 2.0f)
@@ -43,9 +47,11 @@ test_square_root_v4sf ()
 void
 test_square_root_v2df ()
 {
-  float64x2_t val = {4.0, 9.0};
+  const float64_t pool[] = {4.0, 9.0};
+  float64x2_t val;
   float64x2_t res;
 
+  val = vld1q_f64 (pool);
   res = vsqrtq_f64 (val);
 
   if (vgetq_lane_f64 (res, 0) != 2.0)
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index a3828cb..a6d766e 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -3952,7 +3952,8 @@ proc check_effective_target_vect_call_sqrtf { } {
         verbose "check_effective_target_vect_call_sqrtf: using cached result" 2
     } else {
         set et_vect_call_sqrtf_saved 0
-        if { [istarget i?86-*-*]
+        if { [istarget aarch64*-*-*]
+             || [istarget i?86-*-*]
              || [istarget x86_64-*-*]
              || ([istarget powerpc*-*-*] && [check_vsx_hw_available]) } {
             set et_vect_call_sqrtf_saved 1
[PATCH] PR c++/55663 - constexpr function templ instantiation considered non-const as alias templ arg
Hello,

Consider the example of the problem report

     1  template <typename>
     2  constexpr bool the_truth () { return true; }
     3
     4  template <bool>
     5    struct Takes_bool { };
     6
     7  template<bool B>
     8    using Alias = Takes_bool<B>;
     9
    10  template<typename T>
    11    struct test { using type = Alias<the_truth<T>()>; };
    12
    13  int main () {
    14    test<int> a;
    15
    16    return 0;
    17  }

that yields the error:

    test.cc: In substitution of ‘template<bool B> using Alias = Takes_bool<B> [with bool B = the_truth<int>()]’:
    test.cc:11:51:   required from ‘struct test<int>’
    test.cc:14:13:   required from here
    test.cc:11:51: error: integral expression ‘the_truth<int>()’ is not constant
       struct test { using type = Alias<the_truth<T>()>; };

I think the issue happens in the course of instantiating test<int> at
line 14, when we look into instantiating Alias<the_truth<T>()> (at
line 11), with T = int.

There, when we check the argument 'the_truth<int>()' to see if it
actually is a constant expression, in check_instantiated_arg, we fail
to recognize its constexpr-ness b/c we just look at its TREE_CONSTANT.

Would the patch below be ok-ish in testing for the const-ness of that
argument in a general enough way that takes into account its
constexpr-ness?

Bootstrapped and tested on x86_64-unknown-linux-gnu against trunk.

gcc/cp/

        PR c++/55663
        * cp-tree.h (cxx_is_constant_expression): Declare ...
        * semantics.c (cxx_is_constant_expression): ... new function.
        * pt.c (check_instantiated_arg): Use the new
        cxx_is_constant_expression in lieu of TREE_CONSTANT.

gcc/testsuite/

        PR c++/55663
        * g++.dg/cpp0x/alias-decl-31.C: New test.
---
 gcc/cp/cp-tree.h                           |  1 +
 gcc/cp/pt.c                                |  2 +-
 gcc/cp/semantics.c                         |  9 +++++++++
 gcc/testsuite/g++.dg/cpp0x/alias-decl-31.C | 20 ++++++++++++++++++++
 4 files changed, 31 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/alias-decl-31.C

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 810df7d..9d52ba7 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -5608,6 +5608,7 @@ extern bool potential_rvalue_constant_expression (tree);
 extern bool require_potential_constant_expression (tree);
 extern bool require_potential_rvalue_constant_expression (tree);
 extern tree cxx_constant_value (tree);
+extern bool cxx_is_constant_expression (tree);
 extern tree maybe_constant_value (tree);
 extern tree maybe_constant_init (tree);
 extern bool is_sub_constant_expr (tree);
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 30bafa0..74ccfbf 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -14426,7 +14426,7 @@ check_instantiated_arg (tree tmpl, tree t, tsubst_flags_t complain)
      constant.  */
   else if (TREE_TYPE (t)
            && INTEGRAL_OR_ENUMERATION_TYPE_P (TREE_TYPE (t))
-           && !TREE_CONSTANT (t))
+           && !cxx_is_constant_expression (t))
     {
       if (complain & tf_error)
         error ("integral expression %qE is not constant", t);
diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index 2e02295..e40d48f 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -8077,6 +8077,15 @@ cxx_constant_value (tree t)
   return cxx_eval_outermost_constant_expr (t, false);
 }
 
+/* Return TRUE iff E is a constant expression.  */
+
+bool
+cxx_is_constant_expression (tree e)
+{
+  tree t = cxx_constant_value (e);
+  return (t != error_mark_node && t != NULL_TREE);
+}
+
 /* If T is a constant expression, returns its reduced value.
    Otherwise, if T does not have TREE_CONSTANT set, returns T.
    Otherwise, returns a version of T without TREE_CONSTANT.  */
diff --git a/gcc/testsuite/g++.dg/cpp0x/alias-decl-31.C b/gcc/testsuite/g++.dg/cpp0x/alias-decl-31.C
new file mode 100644
index 000..83eea47
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/alias-decl-31.C
@@ -0,0 +1,20 @@
+// Origin: PR c++/55663
+// { dg-do compile { target c++11 } }
+
+template <typename>
+constexpr bool the_truth () { return true; }
+
+template <bool>
+  struct Takes_bool { };
+
+template<bool B>
+  using Alias = Takes_bool<B>;
+
+template<typename T>
+  struct test { using type = Alias<the_truth<T>()>; };
+
+int main () {
+  test<int> a;
+
+  return 0;
+}
-- 
                Dodji
Re: Relax limits of early inliner for the forwarder functions
On Mon, Nov 5, 2012 at 12:23 PM, Jan Hubicka hubi...@ucw.cz wrote:
> Hi,
> in the 4.6 timeframe I limited early inliner growth to apply only to leaf
> functions.  This does not work really well, because with less propagation
> of address expressions we are really not 100% successful at detecting C++
> forwarders and predicting them to be zero cost.  This patch simply makes
> the cost be divided by the number of callees, similarly to LLVM.
>
> Bootstrapped/regtested x86_64-linux, benchmarked and committed.  The patch
> seems a consistent win in all benchmarks, most noticeably in tramp3d.
>
>         * ipa-inline.c (leaf_node_p): Rename to ...
>         (num_calls): ... this one.
>         (want_early_inline_function_p): Allow small growth on non-leafs.
>
> Index: ipa-inline.c
> ===================================================================
> --- ipa-inline.c (revision 193134)
> +++ ipa-inline.c (working copy)
> @@ -380,17 +380,18 @@ can_early_inline_edge_p (struct cgraph_e
>  }
>
>
> -/* Return true when N is leaf function.  Accept cheap builtins
> -   in leaf functions.  */
> +/* Return number of calls in N.  Ignore cheap builtins.  */
>
> -static bool
> -leaf_node_p (struct cgraph_node *n)
> +static int
> +num_calls (struct cgraph_node *n)
>  {
>    struct cgraph_edge *e;
> +  int num = 0;
> +
>    for (e = n->callees; e; e = e->next_callee)
>      if (!is_inexpensive_builtin (e->callee->symbol.decl))
> -      return false;
> -  return true;
> +      num++;
> +  return num;
>  }

This counts all calls in 'n'

> @@ -414,6 +415,8 @@ want_early_inline_function_p (struct cgr
>    else
>      {
>        int growth = estimate_edge_growth (e);
> +      int n;
> +
>        if (growth <= 0)
>          ;
>        else if (!cgraph_maybe_hot_edge_p (e)
> @@ -427,22 +430,23 @@ want_early_inline_function_p (struct cgr
>                       growth);
>            want_inline = false;
>          }
> -      else if (!leaf_node_p (callee)
> -               && growth > 0)
> +      else if (growth > PARAM_VALUE (PARAM_EARLY_INLINING_INSNS))
>          {
>            if (dump_file)
>              fprintf (dump_file, "  will not early inline: %s/%i->%s/%i, "
> -                     "callee is not leaf and code would grow by %i\n",
> +                     "growth %i exceeds --param early-inlining-insns\n",
>                       xstrdup (cgraph_node_name (e->caller)), e->caller->uid,
>                       xstrdup (cgraph_node_name (callee)), callee->uid,
>                       growth);
>            want_inline = false;
>          }
> -      else if (growth > PARAM_VALUE (PARAM_EARLY_INLINING_INSNS))
> +      else if ((n = num_calls (callee)) != 0
> +               && growth * (n + 1) > PARAM_VALUE (PARAM_EARLY_INLINING_INSNS))

So this counts all calls in the function we want to inline (!?).  That's
completely backward to me.  In fact for forwarder functions you still
only allow half of the early-inlining-insns growth.  Previously for
non-leafs we didn't allow any growth (hm, why?).

Now with relaxing that and allowing functions with calls to be inlined
more frequently we run into PR55797, which shows that we cannot limit
recursive inlining anymore if it is indirect one level.  By means of
early inlining iteration we blow up completely (8 iterations at most?!).
Also because we do not compute overall function growth (because we rely
on early inlining only shrinking code size ...).

I believe we at least need to track recursive inlining during early
inliner iteration by means of some ->aux marking or so.

Honza - please have a look at the ICE in PR55797 and the issues with this
patch enabling more inlining.

Thanks,
Richard.

>          {
>            if (dump_file)
>              fprintf (dump_file, "  will not early inline: %s/%i->%s/%i, "
> -                     "growth %i exceeds --param early-inlining-insns\n",
> +                     "growth %i exceeds --param early-inlining-insns "
> +                     "divided by number of calls\n",
>                       xstrdup (cgraph_node_name (e->caller)), e->caller->uid,
>                       xstrdup (cgraph_node_name (callee)), callee->uid,
>                       growth);
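The arithmetic being debated above can be sketched as a small standalone C function. This is only an illustration under stated assumptions: `toy_want_early_inline` is a hypothetical name, the limit is passed in as a plain integer standing in for `--param early-inlining-insns`, and the hot/cold edge check from the real `want_early_inline_function_p` is left out on purpose.

```c
#include <stdbool.h>

/* Toy model of the size test in the patched early inliner.
   GROWTH  - estimated code growth from inlining the callee;
   N_CALLS - number of non-cheap-builtin calls inside the callee;
   LIMIT   - stand-in for --param early-inlining-insns.
   The cold-edge test of the real function is deliberately omitted.  */
static bool
toy_want_early_inline (int growth, int n_calls, int limit)
{
  if (growth <= 0)
    return true;                /* Inlining does not grow the code.  */
  if (growth > limit)
    return false;               /* Exceeds the parameter outright.  */
  if (n_calls != 0 && growth * (n_calls + 1) > limit)
    return false;               /* Non-leaf: budget shrinks per call.  */
  return true;
}
```

With a limit of 10, a leaf callee may grow the caller by up to 10, while a forwarder containing a single call is rejected as soon as its growth exceeds 5. That is the "half of the early-inlining-insns growth" observation made in the review.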
[Patch, AArch64-4.7]: Fix format of <su>mull<q> instruction.
Hi,

Attached is a patch that fixes the instruction and operand separator for
<su>mull2 instructions from a space to a tab.

Regressed for aarch64-none-elf on aarch64-4.7-branch.

OK to commit?

Thanks,
Tejas Belagod
ARM.

2013-01-08  Tejas Belagod  tejas.bela...@arm.com

gcc/
        * config/aarch64/aarch64-simd.md (aarch64_simd_vec_<su>mult_lo_<mode>,
        aarch64_simd_vec_<su>mult_hi_<mode>): Separate instruction and operand
        with tab instead of space.

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index febf71d..01348ec 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -1100,7 +1100,7 @@
                            (match_operand:VQW 2 "register_operand" "w")
                            (match_dup 3)]
   "TARGET_SIMD"
-  "<su>mull %0.<Vwtype>, %1.<Vhalftype>, %2.<Vhalftype>"
+  "<su>mull\\t%0.<Vwtype>, %1.<Vhalftype>, %2.<Vhalftype>"
   [(set_attr "simd_type" "simd_mull")
    (set_attr "simd_mode" "<MODE>")]
 )
@@ -1128,7 +1128,7 @@
                            (match_operand:VQW 2 "register_operand" "w")
                            (match_dup 3)]
   "TARGET_SIMD"
-  "<su>mull2 %0.<Vwtype>, %1.<Vtype>, %2.<Vtype>"
+  "<su>mull2\\t%0.<Vwtype>, %1.<Vtype>, %2.<Vtype>"
   [(set_attr "simd_type" "simd_mull")
    (set_attr "simd_mode" "<MODE>")]
 )
[Patch, AArch64]: Fix format of <su>mull<q> instruction.
Hi,

Attached is a patch that fixes the instruction and operand separator for
<su>mull2 instructions from a space to a tab.

Regressed for aarch64-none-elf on trunk.

OK to commit?

Thanks,
Tejas Belagod
ARM.

2013-01-08  Tejas Belagod  tejas.bela...@arm.com

gcc/
        * config/aarch64/aarch64-simd.md (aarch64_simd_vec_<su>mult_lo_<mode>,
        aarch64_simd_vec_<su>mult_hi_<mode>): Separate instruction and operand
        with tab instead of space.

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index febf71d..01348ec 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -1100,7 +1100,7 @@
                            (match_operand:VQW 2 "register_operand" "w")
                            (match_dup 3)]
   "TARGET_SIMD"
-  "<su>mull %0.<Vwtype>, %1.<Vhalftype>, %2.<Vhalftype>"
+  "<su>mull\\t%0.<Vwtype>, %1.<Vhalftype>, %2.<Vhalftype>"
   [(set_attr "simd_type" "simd_mull")
    (set_attr "simd_mode" "<MODE>")]
 )
@@ -1128,7 +1128,7 @@
                            (match_operand:VQW 2 "register_operand" "w")
                            (match_dup 3)]
   "TARGET_SIMD"
-  "<su>mull2 %0.<Vwtype>, %1.<Vtype>, %2.<Vtype>"
+  "<su>mull2\\t%0.<Vwtype>, %1.<Vtype>, %2.<Vtype>"
   [(set_attr "simd_type" "simd_mull")
    (set_attr "simd_mode" "<MODE>")]
 )
[Patch, AArch64] Make <su>mull<q> target tests more robust.
Hi,

Attached is a patch that makes the test case for <su>mull<q> more robust by
looking for {scan-assembler smull v[0-9]+\.<type>} where <type> = {8h, 4s, 2d}
and {scan-assembler smull2 v[0-9]+\.<type>} where <type> = {8h, 4s, 2d}
instead of looking for a specific number of occurrences of "smull v", which
can vary with the vectorization factor.

Tested on aarch64-4.7-branch and trunk for aarch64-none-elf.

OK to commit on aarch64-4.7-branch and trunk?

Thanks,
Tejas Belagod
ARM.

2013-01-08  Tejas Belagod  tejas.bela...@arm.com

testsuite/
        * gcc.target/aarch64/vect-mull-compile.c: Explicitly scan for
        instructions generated instead of number of occurrences.

diff --git a/gcc/testsuite/gcc.target/aarch64/vect-mull-compile.c b/gcc/testsuite/gcc.target/aarch64/vect-mull-compile.c
index e51eaee..e90c97f 100644
--- a/gcc/testsuite/gcc.target/aarch64/vect-mull-compile.c
+++ b/gcc/testsuite/gcc.target/aarch64/vect-mull-compile.c
@@ -10,7 +10,15 @@ DEF_MULL2 (DEF_MULLB)
 DEF_MULL2 (DEF_MULLH)
 DEF_MULL2 (DEF_MULLS)
 
-/* { dg-final { scan-assembler-times "smull v" 3 } } */
-/* { dg-final { scan-assembler-times "smull2 v" 3 } } */
-/* { dg-final { scan-assembler-times "umull v" 3 } } */
-/* { dg-final { scan-assembler-times "umull2 v" 3 } } */
+/* { dg-final { scan-assembler "smull\\tv\[0-9\]+\.8h"} } */
+/* { dg-final { scan-assembler "smull\\tv\[0-9\]+\.4s"} } */
+/* { dg-final { scan-assembler "smull\\tv\[0-9\]+\.2d"} } */
+/* { dg-final { scan-assembler "umull\\tv\[0-9\]+\.8h"} } */
+/* { dg-final { scan-assembler "umull\\tv\[0-9\]+\.4s"} } */
+/* { dg-final { scan-assembler "umull\\tv\[0-9\]+\.2d"} } */
+/* { dg-final { scan-assembler "smull2\\tv\[0-9\]+\.8h"} } */
+/* { dg-final { scan-assembler "smull2\\tv\[0-9\]+\.4s"} } */
+/* { dg-final { scan-assembler "smull2\\tv\[0-9\]+\.2d"} } */
+/* { dg-final { scan-assembler "umull2\\tv\[0-9\]+\.8h"} } */
+/* { dg-final { scan-assembler "umull2\\tv\[0-9\]+\.4s"} } */
+/* { dg-final { scan-assembler "umull2\\tv\[0-9\]+\.2d"} } */
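To make the idea behind the new directives concrete, here is a small sketch in C using POSIX `regex.h`. It is not how DejaGnu implements `scan-assembler` (that is Tcl); `asm_has_insn` is a hypothetical helper, and the sample assembler text in the usage note is made up. It only shows that asking "does a given mnemonic with a given arrangement appear at all" is independent of how many such instructions the vectorizer happens to emit.

```c
#include <regex.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: return true when ASM_TEXT contains MNEMONIC,
   a tab, and then a vector register whose arrangement specifier is
   ARRANGEMENT (for example "smull", "8h").  The pattern has the same
   shape as the new dg-final directives: mnemonic, tab, v[0-9]+, dot,
   arrangement.  */
static bool
asm_has_insn (const char *asm_text, const char *mnemonic,
              const char *arrangement)
{
  char pattern[128];
  regex_t re;
  bool found;

  /* A literal tab is an ordinary character in a POSIX ERE.  */
  snprintf (pattern, sizeof pattern, "%s\tv[0-9]+\\.%s",
            mnemonic, arrangement);
  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return false;
  found = regexec (&re, asm_text, 0, NULL, 0) == 0;
  regfree (&re);
  return found;
}
```

For a made-up output such as `"\tsmull\tv0.8h, v1.8b, v2.8b\n"`, the call `asm_has_insn (text, "smull", "8h")` succeeds whether the instruction appears once or several times, which is exactly why the per-arrangement scan is less fragile than `scan-assembler-times ... 3`.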
Re: [AARCH64] Fix support for vectorization over sqrt (), sqrtf ().
On 8 January 2013 13:55, James Greenhalgh james.greenha...@arm.com wrote:
> Hi,
>
> This patch fixes the support for vectorizing sqrt and sqrtf function
> calls on AArch64. While the standard name was implemented previously,
> the glue in aarch64_builtin_vectorized_function was not.
>
> We also fix the initialisation of vectors in the
> gcc.target/aarch64/vsqrt.c testcase such that they are more
> big-endian-safe.
>
> This patch regresses clean on aarch64-none-elf.
>
> Applying this patch gives:
>
> FAIL->PASS: gcc.dg/vect/no-math-errno-slp-32.c scan-tree-dump vect pattern recognized
> FAIL->PASS: gcc.dg/vect/no-math-errno-vect-pow-1.c scan-tree-dump vect pattern recognized
>
> Is this OK to commit?

OK
/Marcus
Re: [AArch64] Add support for floating-point vcond.
On 8 January 2013 12:04, James Greenhalgh james.greenha...@arm.com wrote:
> Hi,
>
> The AArch64 port has support for integer vcond, but did not have
> support for floating-point vcond.
>
> This patch adds this support and enables check_effective_target_vect_cond
> in the testsuite.
>
> Regression tested on aarch64-none-elf with no regressions.
>
> Is this OK to commit?

OK
/Marcus
Re: [Patch, AArch64] Make <su>mull<q> target tests more robust.
OK
/Marcus

On 8 January 2013 14:27, Tejas Belagod tbela...@arm.com wrote:
> Hi,
>
> Attached is a patch that makes the test case for <su>mull<q> more robust by
> looking for {scan-assembler smull v[0-9]+\.<type>} where <type> = {8h, 4s, 2d}
> and {scan-assembler smull2 v[0-9]+\.<type>} where <type> = {8h, 4s, 2d}
> instead of looking for a specific number of occurrences of "smull v", which
> can vary with the vectorization factor.
>
> Tested on aarch64-4.7-branch and trunk for aarch64-none-elf.
>
> OK to commit on aarch64-4.7-branch and trunk?
>
> Thanks,
> Tejas Belagod
> ARM.
>
> 2013-01-08  Tejas Belagod  tejas.bela...@arm.com
>
> testsuite/
>         * gcc.target/aarch64/vect-mull-compile.c: Explicitly scan for
>         instructions generated instead of number of occurrences.
>
> [...]
Re: [Patch, AArch64]: Fix format of <su>mull<q> instruction.
On 8 January 2013 14:25, Tejas Belagod tbela...@arm.com wrote:
> 2013-01-08  Tejas Belagod  tejas.bela...@arm.com
>
> gcc/
>         * config/aarch64/aarch64-simd.md (aarch64_simd_vec_<su>mult_lo_<mode>,
>         aarch64_simd_vec_<su>mult_hi_<mode>): Separate instruction and operand
>         with tab instead of space.

OK
/Marcus
Re: Relax limits of early inliner for the forwarder functions
> So this counts all calls in the function we want to inline (!?).  That's
> completely backward to me.  In fact for forwarder functions you still
> only allow half of the early-inlining-insns growth.  Previously for
> non-leafs we didn't allow any growth (hm, why?).

Well, the idea is that inlining leaf functions is almost always a good idea
(i.e. you can assume that the function's body will optimize well with the
surrounding code, and eliminating a call is a good thing).  Inlining
functions that have calls in them is less cool.

I introduced the non-leaf/leaf logic in about 4.6 time, after late inlining
became more informed about anticipated optimizations, but it really caused
quite some trouble on C++ abstraction, so relaxing this logic somewhat
seemed like a reasonable idea.

> Now with relaxing that and allowing functions with calls to be inlined
> more frequently we run into PR55797, which shows that we cannot limit
> recursive inlining anymore if it is indirect one level.  By means of
> early inlining iteration we blow up completely (8 iterations at most?!).
> Also because we do not compute overall function growth (because we rely
> on early inlining only shrinking code size ...).

Well, we compute function growth, but for each iteration separately.

> I believe we at least need to track recursive inlining during early
> inliner iteration by means of some ->aux marking or so.

Hmm, I guess we want to disable recursive inlining in the early inliner
completely.  I will take a look.

Honza

> Honza - please have a look at the ICE in PR55797 and the issues with this
> patch enabling more inlining.
>
> Thanks,
> Richard.
>
> >          {
> >            if (dump_file)
> >              fprintf (dump_file, "  will not early inline: %s/%i->%s/%i, "
> > -                     "growth %i exceeds --param early-inlining-insns\n",
> > +                     "growth %i exceeds --param early-inlining-insns "
> > +                     "divided by number of calls\n",
> >                       xstrdup (cgraph_node_name (e->caller)), e->caller->uid,
> >                       xstrdup (cgraph_node_name (callee)), callee->uid,
> >                       growth);
[testsuite] Fix g++.dg/tls/thread_local* on Solaris 9
I'm seeing various failures in the g++.dg/tls/thread_local* tests on
Solaris 9:

With Sun as, I have

XPASS: g++.dg/tls/thread_local-cse.C execution test
XPASS: g++.dg/tls/thread_local2.C execution test
XPASS: g++.dg/tls/thread_local2g.C execution test
XPASS: g++.dg/tls/thread_local6.C execution test

on both SPARC and x86.  The Solaris 9 with as configuration uses emutls.
These errors are due to Eric's patch

2012-11-19  Eric Botcazou  ebotca...@adacore.com

        * g++.dg/tls/thread_local-order1.C: Add dg-require-cxa-atexit.
        * g++.dg/tls/thread_local3g.C: Likewise.
        * g++.dg/tls/thread_local4g.C: Likewise.
        * g++.dg/tls/thread_local5g.C: Likewise.
        * g++.dg/tls/thread_local6g.C: Likewise.
        * g++.dg/tls/thread_local-cse.C: XFAIL on Solaris 9.
        * g++.dg/tls/thread_local2.C: Likewise.
        * g++.dg/tls/thread_local2g.C: Likewise.
        * g++.dg/tls/thread_local6.C: Likewise.

The XFAILs are wrong, instead one just needs to use dg-add-options tls,
which the patch below does.

Solaris 9/x86 with as also has

FAIL: g++.dg/tls/thread_local-wrap4.C scan-assembler-not _ZTW1i@PLT

The difference between the as and gas outputs is

-.LCFI2:
-       call    .LPR3
-       addl    $_GLOBAL_OFFSET_TABLE_, %ebx
-       call    _ZTW1i@PLT
+       call    _ZTW1i

No idea how best to handle this.

On Solaris 9, 10, and 11/x86 with gas and gld, I find

FAIL: g++.dg/tls/thread_local-order1.C execution test
FAIL: g++.dg/tls/thread_local3g.C -std=gnu++11 execution test
FAIL: g++.dg/tls/thread_local4g.C -std=gnu++11 execution test
FAIL: g++.dg/tls/thread_local5g.C -std=gnu++11 execution test
FAIL: g++.dg/tls/thread_local6g.C execution test

This happens (as in two other testcases, cf. PR c++/51923) since the
dg-require-cxa-atexit test passes on Solaris with gld though the
platform doesn't have __cxa_atexit and I'm ignoring it for now.

The patch below adds the missing dg-add-options tls in all run tests,
not just the immediately affected ones.

Tested with the appropriate runtest invocations on
{i386-pc,sparc-sun}-solaris2.{9,10} and x86_64-unknown-linux-gnu.

Ok for mainline?

        Rainer

2013-01-08  Rainer Orth  r...@cebitec.uni-bielefeld.de

        * g++.dg/tls/thread_local-cse.C: Don't xfail on *-*-solaris2.9.
        Add tls options.
        * g++.dg/tls/thread_local2.C: Likewise.
        * g++.dg/tls/thread_local2g.C: Likewise.
        * g++.dg/tls/thread_local6.C: Likewise.
        * g++.dg/tls/thread_local-order1.C: Add tls options.
        * g++.dg/tls/thread_local-order2.C: Likewise.
        * g++.dg/tls/thread_local3.C: Likewise.
        * g++.dg/tls/thread_local3g.C: Likewise.
        * g++.dg/tls/thread_local4.C: Likewise.
        * g++.dg/tls/thread_local4g.C: Likewise.
        * g++.dg/tls/thread_local5.C: Likewise.
        * g++.dg/tls/thread_local5g.C: Likewise.
        * g++.dg/tls/thread_local6g.C: Likewise.

# HG changeset patch
# Parent 7eebbf02cdbac34362d53d0bfa39b375afefb2e3
Fix g++.dg/tls/thread_local* on Solaris 9

diff --git a/gcc/testsuite/g++.dg/tls/thread_local-cse.C b/gcc/testsuite/g++.dg/tls/thread_local-cse.C
--- a/gcc/testsuite/g++.dg/tls/thread_local-cse.C
+++ b/gcc/testsuite/g++.dg/tls/thread_local-cse.C
@@ -1,7 +1,8 @@
 // Test for CSE of the wrapper function: we should only call it once
 // for the two references to ir.
-// { dg-do run { xfail *-*-solaris2.9 } }
+// { dg-do run }
 // { dg-options "-std=c++11 -O -fno-inline -save-temps" }
+// { dg-add-options tls }
 // { dg-require-effective-target tls_runtime }
 // { dg-require-alias }
 // { dg-final { scan-assembler-times "call *_ZTW2ir" 1 { xfail *-*-* } } }
diff --git a/gcc/testsuite/g++.dg/tls/thread_local-order1.C b/gcc/testsuite/g++.dg/tls/thread_local-order1.C
--- a/gcc/testsuite/g++.dg/tls/thread_local-order1.C
+++ b/gcc/testsuite/g++.dg/tls/thread_local-order1.C
@@ -1,5 +1,6 @@
 // { dg-do run }
 // { dg-options "-std=c++11" }
+// { dg-add-options tls }
 // { dg-require-effective-target tls_runtime }
 // { dg-require-alias }
 // { dg-require-cxa-atexit }
diff --git a/gcc/testsuite/g++.dg/tls/thread_local-order2.C b/gcc/testsuite/g++.dg/tls/thread_local-order2.C
--- a/gcc/testsuite/g++.dg/tls/thread_local-order2.C
+++ b/gcc/testsuite/g++.dg/tls/thread_local-order2.C
@@ -4,6 +4,7 @@
 
 // { dg-do run { xfail *-*-* } }
 // { dg-options "-std=c++11" }
+// { dg-add-options tls }
 // { dg-require-effective-target tls_runtime }
 // { dg-require-alias }
 
diff --git a/gcc/testsuite/g++.dg/tls/thread_local2.C b/gcc/testsuite/g++.dg/tls/thread_local2.C
--- a/gcc/testsuite/g++.dg/tls/thread_local2.C
+++ b/gcc/testsuite/g++.dg/tls/thread_local2.C
@@ -1,5 +1,6 @@
-// { dg-do run { xfail *-*-solaris2.9 } }
+// { dg-do run }
 // { dg-options "-std=c++11" }
+// { dg-add-options tls }
 // { dg-require-effective-target tls_runtime }
 
 extern "C" void abort();
diff --git a/gcc/testsuite/g++.dg/tls/thread_local2g.C b/gcc/testsuite/g++.dg/tls/thread_local2g.C
--- a/gcc/testsuite/g++.dg/tls/thread_local2g.C
Re: Control dependence vs. builtin_unreachable
On 01/08/2013 04:26 AM, Richard Biener wrote:
The issue is VRP - when you remove unreachable blocks you lose the conditional statement as it is no longer necessary and thus the predicate you can derive value-ranges from.

Understood. Perhaps we could eliminate them after the first VRP pass, but before the second. That ought to give us the majority of the benefit of seeing the conditional and propagating based on it, while also letting us eliminate the branch and generate straight-line code.

Clearly it needs more investigation, but I think it's worth exploring.

Jeff
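To make the trade-off concrete, here is a minimal C sketch, not taken from the thread. The table and the names pick and pick_demo are hypothetical; __builtin_unreachable is the GCC builtin under discussion. As long as the conditional is still in the IL, a value-range pass can assume idx is in [0, 3]; once the unreachable block and its branch are removed, that predicate is gone. Whether a given GCC version actually drops the mask depends on pass ordering, so no particular code generation is claimed.

```c
#include <assert.h>

/* Hypothetical lookup table, used only for illustration. */
static const int table[4] = { 10, 20, 30, 40 };

/* The caller promises 0 <= idx < 4.  The guarded builtin states that
   promise to the optimizer; calling pick with any other value is
   undefined behaviour.  */
static int pick(int idx)
{
    if (idx < 0 || idx >= 4)
        __builtin_unreachable();

    /* While the condition above is visible, a range pass may conclude
       that idx is in [0, 3], which makes this mask a no-op.  */
    return table[idx & 3];
}

/* Usage: only in-range indices are passed.  */
static void pick_demo(void)
{
    assert(pick(0) == 10);
    assert(pick(2) == 30);
}
```

Only in-range arguments are used in pick_demo, because anything else would reach the builtin.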
Re: [ARM] Turning off 64bits ops in Neon and gfortran/modulo-scheduling problem
Ping? http://gcc.gnu.org/ml/gcc-patches/2012-12/msg01197.html

Thanks, Christophe

On 19 December 2012 16:59, Christophe Lyon christophe.l...@linaro.org wrote:
On 17 December 2012 16:12, Richard Earnshaw rearn...@arm.com wrote:
On 29/11/12 17:16, Christophe Lyon wrote:
On trunk I have noticed a regression in gfortran when using modulo scheduling: sms-1.f90 now fails, but I suspect it's not because of this patch since forcing compilation for armv5t makes the same test fail with and without my patch.

Hmm, that's worrying. Could you please make sure this is recorded in bugzilla. If this is a regression, please mark it as such.

I was about to do so, but after bisecting it turns out that the problem was introduced by http://gcc.gnu.org/viewcvs?root=gccview=revrev=192969 and is very likely to be another instance of PR55562, which has just been fixed by http://gcc.gnu.org/ml/gcc-patches/2012-12/msg01137.html.

Now that this optimization is disabled by default, the onlya8 code is completely redundant and should be purged, along with the insn alternatives that used it. R.

Here is a new version of my patch, with the cleanup you requested.

2012-12-18 Christophe Lyon christophe.l...@linaro.org

gcc/
* config/arm/arm-protos.h (tune_params): Add prefer_neon_for_64bits field.
* config/arm/arm.c (prefer_neon_for_64bits): New variable.
(arm_slowmul_tune): Default prefer_neon_for_64bits to false.
(arm_fastmul_tune, arm_strongarm_tune, arm_xscale_tune): Ditto.
(arm_9e_tune, arm_v6t2_tune, arm_cortex_tune): Ditto.
(arm_cortex_a5_tune, arm_cortex_a15_tune): Ditto.
(arm_cortex_a9_tune, arm_fa726te_tune): Ditto.
(arm_option_override): Handle -mneon-for-64bits new option.
* config/arm/arm.h (TARGET_PREFER_NEON_64BITS): New macro.
(prefer_neon_for_64bits): Declare new variable.
* config/arm/arm.md (arch): Rename neon_onlya8 and neon_nota8 to avoid_neon_for_64bits and neon_for_64bits. Remove onlya8 and nota8.
(arch_enabled): Handle new arch types. Remove support for onlya8 and nota8.
(one_cmpldi2): Use new arch names.
* config/arm/arm.opt (mneon-for-64bits): Add option.
* config/arm/neon.md (adddi3_neon, subdi3_neon, iordi3_neon)
(anddi3_neon, xordi3_neon, ashldi3_neon, shiftdi3_neon): Use neon_for_64bits instead of nota8 and avoid_neon_for_64bits instead of onlya8.
* doc/invoke.texi (-mneon-for-64bits): Document.

gcc/testsuite/
* gcc.target/arm/neon-for-64bits-1.c: New tests.
* gcc.target/arm/neon-for-64bits-2.c: Likewise.
Re: [patch, mips, stabs] Do any MIPS targets use stabs?
On Mon, 2013-01-07 at 12:54 -0800, Steve Ellcey wrote: On Mon, 2013-01-07 at 20:38 +, Richard Sandiford wrote: Ah, sorry, this reminded me of a patch I'd written a year ago and never got around to sanity-checking. Does it fix the problem for you? Richard It fixes my small test case, I will try it on the full testsuite overnight. Steve The full testsuite worked fine too, so I think this should be checked in. Given this fix I will probably leave the stab support alone for now, though I would have no objection if someone wanted to remove it for MIPS. Steve Ellcey sell...@mips.com
PR 55875 (IV wrapping issue)
Hi,
My patch to add bounds that do not dominate the loop latch caused a problem in scev_probably_wraps_p, which tries to prove that a given IV at a given STMT does not wrap, based on loop bounds. When I was extending loop bounds to contain not only statements that dominate the loop latch, I checked that the users remain valid after the change. In this case, however, that is not true. What I missed is that it does two things:
1) it tries to prove that STMT is bounded by the given bound, based on the fact that the bound's STMT dominates the statement
2) it tries to prove the bound based on the number of iterations of the loop that it derives from the bounds
I saw the dominance test in there and was happy. Obviously, however, 2) needs to be updated. It seems best to me to simply drop 2), which anyway just determines a bound on the number of iterations, and use the bound recorded in the structure.

While doing that I also noticed an older problem in the postdominance test: we need to verify that the loop is not terminated by a call or other side effect, as demonstrated by the new C testcase I constructed. This testcase probably fails in 4.7 and earlier releases.

Bootstrapped/regtested x86_64-linux

Honza

* gcc.c-torture/execute/pr55875.c: New testcase.
* g++.dg/torture/pr55875.C: New testcase.
* tree-ssa-loop-niter.c (n_of_executions_at_most): Simplify to only test for cases where statement is dominated by the particular bound; handle correctly the postdominance test.
(scev_probably_wraps_p): Use max loop iterations info as a global bound first.
Index: testsuite/gcc.c-torture/execute/pr55875.c === --- testsuite/gcc.c-torture/execute/pr55875.c (revision 0) +++ testsuite/gcc.c-torture/execute/pr55875.c (revision 0) @@ -0,0 +1,17 @@ +int a[250]; +__attribute__ ((noinline)) +t(int i) +{ + if (i==0) +exit(0); + if (i255) +abort (); +} +main() +{ + unsigned int i; + for (i=0;;i++) +{ + a[i]=t((unsigned char)(i+5)); +} +} Index: testsuite/g++.dg/torture/pr55875.C === --- testsuite/g++.dg/torture/pr55875.C (revision 0) +++ testsuite/g++.dg/torture/pr55875.C (revision 0) @@ -0,0 +1,53 @@ +struct A +{ + short int a1; + unsigned char a2; + unsigned int a3; +}; + +struct B +{ + unsigned short b1; + const A *b2; +}; + +B b; + +__attribute__((noinline, noclone)) +int foo (unsigned x) +{ + __asm volatile ( : +r (x) : : memory); + return x; +} + +inline void +bar (const int ) +{ +} + +__attribute__((noinline)) void +baz () +{ + const A *a = b.b2; + unsigned int i; + unsigned short n = b.b1; + for (i = 0; i n; ++i) +if (a[i].a1 == 11) + { +if (i 0 (a[i - 1].a2 1)) + continue; +bar (foo (2)); +return; + } +} + +int +main () +{ + A a[4] = { { 10, 0, 0 }, { 11, 1, 0 }, { 11, 1, 0 }, { 11, 1, 0 } }; + b.b1 = 4; + b.b2 = a; + baz (); + return 0; +} + Index: tree-ssa-loop-niter.c === --- tree-ssa-loop-niter.c (revision 194918) +++ tree-ssa-loop-niter.c (working copy) @@ -3549,8 +3549,15 @@ stmt_dominates_stmt_p (gimple s1, gimple /* Returns true when we can prove that the number of executions of STMT in the loop is at most NITER, according to the bound on the number of executions of the statement NITER_BOUND-stmt recorded in - NITER_BOUND. If STMT is NULL, we must prove this bound for all - statements in the loop. */ + NITER_BOUND and fact that NITER_BOUND-stmt dominate STMT. + + ??? 
This code can become quite a CPU hog - we can have many bounds, + and large basic block forcing stmt_dominates_stmt_p to be queried + many times on a large basic blocks, so the whole thing is O(n^2) + for scev_probably_wraps_p invocation (that can be done n times). + + It would make more sense (and give better answers) to remember BB + bounds computed by discover_iteration_bound_by_body_walk. */ static bool n_of_executions_at_most (gimple stmt, @@ -3571,32 +3578,37 @@ n_of_executions_at_most (gimple stmt, /* We know that NITER_BOUND-stmt is executed at most NITER_BOUND-bound + 1 times. This means that: - -- if NITER_BOUND-is_exit is true, then everything before -NITER_BOUND-stmt is executed at most NITER_BOUND-bound + 1 - times, and everything after it at most NITER_BOUND-bound times. + -- if NITER_BOUND-is_exit is true, then everything after + it at most NITER_BOUND-bound times. -- If NITER_BOUND-is_exit is false, then if we can prove that when STMT is executed, then NITER_BOUND-stmt is executed as well in the same - iteration (we conclude that if both statements belong to the same - basic block, or if STMT is after NITER_BOUND-stmt), then STMT - is executed at most NITER_BOUND-bound + 1 times. Otherwise STMT is - executed at
Re: [asan] Never use memset for clearing of shadow mem in epilogues (PR fortran/55341)
Jakub Jelinek ja...@redhat.com writes: 2012-12-19 Jakub Jelinek ja...@redhat.com PR fortran/55341 * asan.c (asan_clear_shadow): New function. (asan_emit_stack_protection): Use it. This looks OK to me. Thanks. -- Dodji
Re: [PATCH] PR c++/55663 - constexpr function templ instantiation considered non-const as alias templ arg
On Tue, Jan 8, 2013 at 7:58 AM, Dodji Seketeli do...@redhat.com wrote:
Hello, Consider the example of the problem report

 1 template <typename>
 2 constexpr bool the_truth () { return true; }
 3
 4 template <bool>
 5   struct Takes_bool { };
 6
 7 template<bool B>
 8   using Alias = Takes_bool<B>;
 9
10 template<typename T>
11   struct test { using type = Alias<the_truth<T>()>; };
12
13 int main () {
14   test<int> a;
15
16   return 0;
17 }

that yields the error:

test.cc: In substitution of ‘template<bool B> using Alias = Takes_bool<B> [with bool B = the_truth<int>()]’:
test.cc:11:51: required from ‘struct test<int>’
test.cc:14:13: required from here
test.cc:11:51: error: integral expression ‘the_truth<int>()’ is not constant
  struct test { using type = Alias<the_truth<T>()>; };

I think the issue happens in the course of instantiating test<int> at line 14, when we look into instantiating Alias<the_truth<T>()> (at line 11), with T = int. There, when we check the argument 'the_truth<int>()' to see if it actually is a constant expression, in check_instantiated_arg, we fail to recognize its constexpr-ness b/c we just look at its TREE_CONSTANT.

Thanks for the detective work! We already have various predicates to test for constant expressions so I am uneasy to add yet another one. What we do need -- which I already suggested -- is a predicate to test valid non-type template arguments. For example, we already have the predicates verify_constant, reduced_constant_expression_p and require_potential_constant_expression. I think reduced_constant_expression_p is what you want.

-- Gaby
Re: PR 55875 (IV wrapping issue)
Hi,
I actually attached a not quite final version of the patch. This one has the extra loop looking for side effects that cures the first testcase. Sorry for the confusion.

Honza

PR tree-optimize/55875
* gcc.c-torture/execute/pr55875.c: New testcase.
* g++.dg/torture/pr55875.C: New testcase.
* tree-ssa-loop-niter.c (n_of_executions_at_most): Simplify to only test for cases where statement is dominated by the particular bound; handle correctly the postdominance test.
(scev_probably_wraps_p): Use max loop iterations info as a global bound first.

Index: testsuite/gcc.c-torture/execute/pr55875.c === --- testsuite/gcc.c-torture/execute/pr55875.c (revision 0) +++ testsuite/gcc.c-torture/execute/pr55875.c (revision 0) @@ -0,0 +1,17 @@ +int a[250]; +__attribute__ ((noinline)) +t(int i) +{ + if (i==0) +exit(0); + if (i255) +abort (); +} +main() +{ + unsigned int i; + for (i=0;;i++) +{ + a[i]=t((unsigned char)(i+5)); +} +} Index: testsuite/g++.dg/torture/pr55875.C === --- testsuite/g++.dg/torture/pr55875.C (revision 0) +++ testsuite/g++.dg/torture/pr55875.C (revision 0) @@ -0,0 +1,53 @@ +struct A +{ + short int a1; + unsigned char a2; + unsigned int a3; +}; + +struct B +{ + unsigned short b1; + const A *b2; +}; + +B b; + +__attribute__((noinline, noclone)) +int foo (unsigned x) +{ + __asm volatile ( : +r (x) : : memory); + return x; +} + +inline void +bar (const int ) +{ +} + +__attribute__((noinline)) void +baz () +{ + const A *a = b.b2; + unsigned int i; + unsigned short n = b.b1; + for (i = 0; i n; ++i) +if (a[i].a1 == 11) + { +if (i 0 (a[i - 1].a2 1)) + continue; +bar (foo (2)); +return; + } +} + +int +main () +{ + A a[4] = { { 10, 0, 0 }, { 11, 1, 0 }, { 11, 1, 0 }, { 11, 1, 0 } }; + b.b1 = 4; + b.b2 = a; + baz (); + return 0; +} + Index: tree-ssa-loop-niter.c === --- tree-ssa-loop-niter.c (revision 194918) +++ tree-ssa-loop-niter.c (working copy) @@ -3549,8 +3549,15 @@ stmt_dominates_stmt_p (gimple s1, gimple /* Returns true when we can prove that the number of 
executions of STMT in the loop is at most NITER, according to the bound on the number of executions of the statement NITER_BOUND-stmt recorded in - NITER_BOUND. If STMT is NULL, we must prove this bound for all - statements in the loop. */ + NITER_BOUND and fact that NITER_BOUND-stmt dominate STMT. + + ??? This code can become quite a CPU hog - we can have many bounds, + and large basic block forcing stmt_dominates_stmt_p to be queried + many times on a large basic blocks, so the whole thing is O(n^2) + for scev_probably_wraps_p invocation (that can be done n times). + + It would make more sense (and give better answers) to remember BB + bounds computed by discover_iteration_bound_by_body_walk. */ static bool n_of_executions_at_most (gimple stmt, @@ -3571,32 +3578,43 @@ n_of_executions_at_most (gimple stmt, /* We know that NITER_BOUND-stmt is executed at most NITER_BOUND-bound + 1 times. This means that: - -- if NITER_BOUND-is_exit is true, then everything before -NITER_BOUND-stmt is executed at most NITER_BOUND-bound + 1 - times, and everything after it at most NITER_BOUND-bound times. + -- if NITER_BOUND-is_exit is true, then everything after + it at most NITER_BOUND-bound times. -- If NITER_BOUND-is_exit is false, then if we can prove that when STMT is executed, then NITER_BOUND-stmt is executed as well in the same - iteration (we conclude that if both statements belong to the same - basic block, or if STMT is after NITER_BOUND-stmt), then STMT - is executed at most NITER_BOUND-bound + 1 times. Otherwise STMT is - executed at most NITER_BOUND-bound + 2 times. */ + iteration then STMT is executed at most NITER_BOUND-bound + 1 times. + + If we can determine that NITER_BOUND-stmt is always executed + after STMT, then STMT is executed at most NITER_BOUND-bound + 2 times. + We conclude that if both statements belong to the same + basic block and STMT is before NITER_BOUND-stmt and there are no + statements with side effects in between. 
*/ if (niter_bound-is_exit) { - if (stmt - stmt != niter_bound-stmt - stmt_dominates_stmt_p (niter_bound-stmt, stmt)) - cmp = GE_EXPR; - else - cmp = GT_EXPR; + if (stmt == niter_bound-stmt + || !stmt_dominates_stmt_p (niter_bound-stmt, stmt)) + return false; + cmp = GE_EXPR; } else { - if (!stmt - || (gimple_bb (stmt) != gimple_bb (niter_bound-stmt) - !stmt_dominates_stmt_p (niter_bound-stmt, stmt)))
Re: PR 55875 (IV wrapping issue)
On Tue, Jan 08, 2013 at 05:49:28PM +0100, Jan Hubicka wrote: --- testsuite/gcc.c-torture/execute/pr55875.c (revision 0) +++ testsuite/gcc.c-torture/execute/pr55875.c (revision 0) @@ -0,0 +1,17 @@ Please add /* PR tree-optimization/55875 */ here. +int a[250]; +__attribute__ ((noinline)) +t(int i) +{ + if (i==0) +exit(0); + if (i255) +abort (); +} +main() +{ + unsigned int i; + for (i=0;;i++) +{ + a[i]=t((unsigned char)(i+5)); +} +} Index: testsuite/g++.dg/torture/pr55875.C === --- testsuite/g++.dg/torture/pr55875.C(revision 0) +++ testsuite/g++.dg/torture/pr55875.C(revision 0) And // PR tree-optimization/55875 // { dg-do run } here, dg-do compile is the default. @@ -0,0 +1,53 @@ +struct A +{ + short int a1; + unsigned char a2; + unsigned int a3; +}; + +struct B +{ + unsigned short b1; + const A *b2; +}; + +B b; + +__attribute__((noinline, noclone)) +int foo (unsigned x) +{ + __asm volatile ( : +r (x) : : memory); + return x; +} + +inline void +bar (const int ) +{ +} + +__attribute__((noinline)) void +baz () +{ + const A *a = b.b2; + unsigned int i; + unsigned short n = b.b1; + for (i = 0; i n; ++i) +if (a[i].a1 == 11) + { +if (i 0 (a[i - 1].a2 1)) + continue; +bar (foo (2)); +return; There should be tabs on the 4 above lines, just bugzilla ate it. Jakub
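For readers following the C testcase, the arithmetic it relies on can be sketched on its own. This is only an illustration, assuming an 8-bit unsigned char; the names iterations_until_wrap and wrap_demo are made up and are not part of the patch. The point is that the narrowed value (unsigned char)(i + 5) wraps to zero while the loop has no exit condition of its own, so the only way out is from inside the body (a call to exit in the testcase, a return here).

```c
#include <assert.h>

/* Counts the iterations until the 8-bit view of i + 5 wraps to zero.
   The loop header has no exit test; the only exit is inside the body,
   which mirrors how the testcase leaves its loop via a call.  */
static unsigned int iterations_until_wrap(void)
{
    unsigned int i;

    for (i = 0;; i++) {
        unsigned char narrow = (unsigned char)(i + 5);

        if (narrow == 0)
            return i;
    }
}

/* Usage: 251 + 5 == 256, which truncates to 0 in 8 bits.  */
static void wrap_demo(void)
{
    assert(iterations_until_wrap() == 251);
}
```

Unlike the testcase, the sketch does not store into an array, so it stays well defined regardless of how the compiler bounds the loop.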
Re: Support for MIPS r5900
On Tue, 8 Jan 2013, Richard Sandiford wrote:
I disabled 64 bit FPU instructions by -msoft-float. This works, but using -msingle-float fails. This would be the better configuration. There are still 64 bit FPU instructions used (e.g. dmfc1 $2,$f0 when using long double multiplication). So -msingle-float doesn't seem to work on generic mips64-linux-gnu.

Right. That combination hasn't really been defined. What happens for plain doubles? Do you pass those in FPRs or GPRs?

IIUC the R5900 has an FPU that is functionally the same as that of the R4640/R4650. If that is the case, then there is no way to pass doubles in FPRs -- there is no room to store the upper halves.

My point was that you could pass them in consecutive FPRs, like n32 does for long double. There's no architectural support for long double either, but the decision was still to pass them in FPRs rather than GPRs.

You mean using a pair of FPRs (e.g. $f0/$f2) as a sum of two values of different exponents for extra precision? That would make sense, but would not match the way the double type has been defined in the ISO C standard for IEEE-754 targets -- please note that contrariwise the standard provides more freedom as to how the long double type can be implemented on IEEE-754 targets. Otherwise it would make no sense IMO, the contents would have to be moved back to GPRs for any use anyway.

I'm not saying that that's a sensible precedent to copy. I was just using it as one example of why an ABI has to be defined.

Not necessarily, the double type may simply be banned or alias to the single type. Especially the latter -- such an arrangement is allowed by ISO C as long as the target does not claim IEEE-754 compliance (we'd have a problem with the Java frontend though) and I think such a compilation mode might be permitted as long as it is useful to someone.

Maciej
Re: Control dependence vs. builtin_unreachable
On Tue, Jan 8, 2013 at 12:32 PM, Richard Biener wrote:
Does it handle side-effects on the builtin-unreachable path correctly?

  int b;
  int a;
  extern void foo ();
  int main()
  {
    if (!a)
      {
        if (!b)
          foo ();
        __builtin_unreachable ();
      }
  }
  ---
  void foo () { puts("Hello"); exit(0); }

? Users are not _required_ to annotate noreturn functions. The BB with __builtin_unreachable () is still empty otherwise.

Not sure what you're expecting in this case, so guessing... There is a warning for reaching end of non-void blah, but that's correct on the a!=0 path. With that if(!a) commented out, I don't have that warning of course. The code is identical with and without patch. It's another example of how __builtin_unreachable() helps hide unnecessary warnings.

Ciao! Steven
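The concern in the quoted example can be made observable with a small C sketch. Everything here is an assumption for illustration: longjmp stands in for the exit call so that the effect can be checked from the same program, and the names escape, side_effects, guarded, run and unreachable_demo are hypothetical. It shows that a call to a function that is not annotated noreturn, but in fact never returns, has to survive even though the block it leads to ends in __builtin_unreachable ().

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf escape;     /* stand-in for leaving the program via exit () */
static int side_effects;   /* counts the calls that really happened */

/* Not annotated noreturn, yet it never returns to its caller.  */
static void foo(void)
{
    side_effects++;
    longjmp(escape, 1);
}

static int guarded(int a, int b)
{
    if (!a) {
        if (!b)
            foo();
        /* Reached only if foo returned, which it never does.  */
        __builtin_unreachable();
    }
    return 1;
}

/* Returns 1 on the ordinary path and 2 when foo escaped.  */
static int run(int a, int b)
{
    side_effects = 0;
    if (setjmp(escape) == 0)
        return guarded(a, b);
    return 2;
}

/* Usage: the call to foo must still be executed when a == 0, b == 0.  */
static void unreachable_demo(void)
{
    assert(run(1, 0) == 1 && side_effects == 0);
    assert(run(0, 0) == 2 && side_effects == 1);
}
```

Calling run (0, 1) would actually reach the builtin and is undefined, so the sketch deliberately avoids it.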
Re: [PATCH, dataflow]: Fix PR55845, 454.calculix miscompares on x86 AVX due to movement of vzeroupper
On 01/07/2013 03:26 PM, Jakub Jelinek wrote: 2012-01-08 Jakub Jelinek ja...@redhat.com Uros Bizjak ubiz...@gmail.com PR rtl-optimization/55845 * df-problems.c (can_move_insns_across): Stop scanning at volatile_insn_p source instruction or give up if across_from .. across_to range contains any volatile_insn_p instructions. Ok. r~
Re: Support for MIPS r5900
Maciej W. Rozycki ma...@codesourcery.com writes:
On Tue, 8 Jan 2013, Richard Sandiford wrote:
I disabled 64 bit FPU instructions by -msoft-float. This works, but using -msingle-float fails. This would be the better configuration. There are still 64 bit FPU instructions used (e.g. dmfc1 $2,$f0 when using long double multiplication). So -msingle-float doesn't seem to work on generic mips64-linux-gnu.

Right. That combination hasn't really been defined. What happens for plain doubles? Do you pass those in FPRs or GPRs?

IIUC the R5900 has an FPU that is functionally the same as that of the R4640/R4650. If that is the case, then there is no way to pass doubles in FPRs -- there is no room to store the upper halves.

My point was that you could pass them in consecutive FPRs, like n32 does for long double. There's no architectural support for long double either, but the decision was still to pass them in FPRs rather than GPRs.

You mean using a pair of FPRs (e.g. $f0/$f2) as a sum of two values of different exponents for extra precision? That would make sense, but would not match the way the double type has been defined in the ISO C standard for IEEE-754 targets -- please note that contrariwise the standard provides more freedom as to how the long double type can be implemented on IEEE-754 targets.

No, I mean passing the two 32-bit halves in two FPRs, like we pass the two 64-bit halves of long doubles in two FPRs. Like I say...

I'm not saying that that's a sensible precedent to copy. I was just using it as one example of why an ABI has to be defined.

Not necessarily, the double type may simply be banned or alias to the single type. Especially the latter -- such an arrangement is allowed by ISO C as long as the target does not claim IEEE-754 compliance (we'd have a problem with the Java frontend though) and I think such a compilation mode might be permitted as long as it is useful to someone.

But that's the point: we have to define what the rules are. The definition includes what isn't allowed.

Richard
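As an illustration of "passing the two 32-bit halves in two FPRs", the following C sketch splits a double into two 32-bit words and joins them again. It is not a description of any MIPS ABI: struct fpr_pair, split_double, join_double and split_demo are hypothetical names, which half would go into which register is left open, and the code assumes double is IEEE-754 binary64.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a pair of 32-bit floating-point registers.  */
struct fpr_pair {
    uint32_t lo;   /* low 32 bits of the double's bit pattern */
    uint32_t hi;   /* high 32 bits of the double's bit pattern */
};

/* Split the 64-bit pattern of D into two 32-bit halves.  */
static struct fpr_pair split_double(double d)
{
    uint64_t bits;
    struct fpr_pair p;

    memcpy(&bits, &d, sizeof bits);
    p.lo = (uint32_t)bits;
    p.hi = (uint32_t)(bits >> 32);
    return p;
}

/* Reassemble the double from its two halves.  */
static double join_double(struct fpr_pair p)
{
    uint64_t bits = ((uint64_t)p.hi << 32) | p.lo;
    double d;

    memcpy(&d, &bits, sizeof d);
    return d;
}

/* Usage: assumes binary64, where 1.0 is 0x3FF0000000000000.  */
static void split_demo(void)
{
    struct fpr_pair p = split_double(1.0);

    assert(sizeof(double) == sizeof(uint64_t));
    assert(p.hi == 0x3FF00000u && p.lo == 0u);
    assert(join_double(p) == 1.0);
}
```

The halves are plain bit patterns, not two floating-point values that sum to the original, which is the distinction drawn in the reply above.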
Re: [PATCH] Fix PR54659, include gmp.h from system.h
On Fri, Dec 21, 2012 at 5:14 AM, Richard Biener rguent...@suse.de wrote: After the Ada fix including gmp.h from system.h is trivial (by means of double-int.h it was included from almost everywhere already) Bootstrapped on x86_64-unknown-linux-gnu, testing in progress. Ok for trunk? (Go frontend unfixed) Thanks, Richard. 2012-12-21 Richard Biener rguent...@suse.de PR bootstrap/54659 * system.h: Include gmp.h. * tree-ssa-loop-niter.c: Do not include gmp.h here. * double-int.h: Likewise. * realmpfr.h: Likewise. fortran/ * gfortran.h: Do not include gmp.h here. This breaks bootstrap with in-tree gmp since system.h is included for host feature test by gcc/configure. Before gmp is built, gmp.h isn't available. We need to build gmp first before configuring gcc. Balaji, can you try this patch? -- H.J. -- 2013-01-08 H.J. Lu hongjiu...@intel.com * Makefile.def (configure-gcc): Depend on all-gmp. (all-gcc): Remove dependency on all-gmp. * Makefile.in: Regenerated. gcc-gmp.patch Description: Binary data
Re: [PATCH] PR c++/55663 - constexpr function templ instantiation considered non-const as alias templ arg
On 01/08/2013 08:58 AM, Dodji Seketeli wrote:
There, when we check the argument 'the_truth<int>()' to see if it actually is a constant expression, in check_instantiated_arg, we fail to recognize its constexpr-ness b/c we just look at its TREE_CONSTANT.

The problem is that by the time we get to check_instantiated_arg, we should have folded the expression into something TREE_CONSTANT. convert_template_argument should have done that; don't we ever call that function for this template argument?

Jason
[C/C++ PATCH] shift with negative or too big count warning (PR c/48418)
Hi! As discussed in the PR, on the following testcase we've regressed with the introduction of c_fully_fold, when the C FE normally warns the argument isn't folded yet. Fixed by also warning in c_fully_fold_internal, if before that function the shift count wasn't INTEGER_CST and after it it is. The testcase also revealed a regression on the C++ FE side, caused by SIZEOF_EXPR folding deferral. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2013-01-08 Jakub Jelinek ja...@redhat.com PR c/48418 * c-common.c (c_fully_fold_internal): Warn for LSHIFT_EXPR and RSHIFT_EXPR, if orig_op1 isn't INTEGER_CST, op1 is INTEGER_CST and is either negative or bigger or equal to type precision of the first operand. * typeck.c (cp_build_binary_op): For LSHIFT_EXPR and RSHIFT_EXPR, call maybe_constant_value for the negative or too big shift count warnings. * c-c++-common/pr48418.c: New test. --- gcc/c-family/c-common.c.jj 2012-12-31 15:05:45.0 +0100 +++ gcc/c-family/c-common.c 2013-01-08 15:15:47.019347593 +0100 @@ -1269,6 +1269,25 @@ c_fully_fold_internal (tree expr, bool i !TREE_OVERFLOW_P (op0) !TREE_OVERFLOW_P (op1)) overflow_warning (EXPR_LOCATION (expr), ret); + if ((code == LSHIFT_EXPR || code == RSHIFT_EXPR) + TREE_CODE (orig_op1) != INTEGER_CST + TREE_CODE (op1) == INTEGER_CST + (TREE_CODE (TREE_TYPE (orig_op0)) == INTEGER_TYPE + || TREE_CODE (TREE_TYPE (orig_op0)) == FIXED_POINT_TYPE) + TREE_CODE (TREE_TYPE (orig_op1)) == INTEGER_TYPE + c_inhibit_evaluation_warnings == 0) + { + if (tree_int_cst_sgn (op1) 0) + warning_at (loc, 0, (code == LSHIFT_EXPR +? left shift count is negative +: right shift count is negative)); + else if (compare_tree_int (op1, +TYPE_PRECISION (TREE_TYPE (orig_op0))) + = 0) + warning_at (loc, 0, (code == LSHIFT_EXPR +? 
left shift count = width of type +: right shift count = width of type)); + } goto out; case INDIRECT_REF: --- gcc/cp/typeck.c.jj 2013-01-07 14:14:44.0 +0100 +++ gcc/cp/typeck.c 2013-01-08 15:30:20.202388635 +0100 @@ -4095,10 +4095,13 @@ cp_build_binary_op (location_t location, } else if (code0 == INTEGER_TYPE code1 == INTEGER_TYPE) { + tree const_op1 = maybe_constant_value (op1); + if (TREE_CODE (const_op1) != INTEGER_CST) + const_op1 = op1; result_type = type0; - if (TREE_CODE (op1) == INTEGER_CST) + if (TREE_CODE (const_op1) == INTEGER_CST) { - if (tree_int_cst_lt (op1, integer_zero_node)) + if (tree_int_cst_lt (const_op1, integer_zero_node)) { if ((complain tf_warning) c_inhibit_evaluation_warnings == 0) @@ -4106,7 +4109,7 @@ cp_build_binary_op (location_t location, } else { - if (compare_tree_int (op1, TYPE_PRECISION (type0)) = 0 + if (compare_tree_int (const_op1, TYPE_PRECISION (type0)) = 0 (complain tf_warning) c_inhibit_evaluation_warnings == 0) warning (0, right shift count = width of type); @@ -4138,16 +4141,20 @@ cp_build_binary_op (location_t location, } else if (code0 == INTEGER_TYPE code1 == INTEGER_TYPE) { + tree const_op1 = maybe_constant_value (op1); + if (TREE_CODE (const_op1) != INTEGER_CST) + const_op1 = op1; result_type = type0; - if (TREE_CODE (op1) == INTEGER_CST) + if (TREE_CODE (const_op1) == INTEGER_CST) { - if (tree_int_cst_lt (op1, integer_zero_node)) + if (tree_int_cst_lt (const_op1, integer_zero_node)) { if ((complain tf_warning) c_inhibit_evaluation_warnings == 0) warning (0, left shift count is negative); } - else if (compare_tree_int (op1, TYPE_PRECISION (type0)) = 0) + else if (compare_tree_int (const_op1, +TYPE_PRECISION (type0)) = 0) { if ((complain tf_warning) c_inhibit_evaluation_warnings == 0) --- gcc/testsuite/c-c++-common/pr48418.c.jj 2013-01-08 15:25:36.501003969 +0100 +++ gcc/testsuite/c-c++-common/pr48418.c2013-01-08 15:21:54.0 +0100 @@ -0,0 +1,20 @@ +/* PR c/48418 */ +/* { dg-do compile } */ +/* { dg-options -Wall -O2 } 
*/ + +int +foo (int x) +{ + const int a = sizeof (int) * __CHAR_BIT__; + const int b = -7; + int c = 0; +
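The two conditions the patch warns about (a negative shift count, and a count greater than or equal to the width of the first operand's type) can be written down as a small C sketch. This is not the front-end code; enum shift_diag, classify_shift_count and shift_demo are hypothetical names, and the values in shift_demo follow the a and b constants from the new testcase.

```c
#include <assert.h>
#include <limits.h>

enum shift_diag { SHIFT_OK, SHIFT_NEGATIVE, SHIFT_TOO_WIDE };

/* Classify a constant shift count against the bit width of the
   shifted operand, using the same two checks the warnings describe.  */
static enum shift_diag classify_shift_count(long count, unsigned width)
{
    if (count < 0)
        return SHIFT_NEGATIVE;
    if ((unsigned long)count >= width)
        return SHIFT_TOO_WIDE;
    return SHIFT_OK;
}

/* Usage: a is the width of int in bits and b is -7, as in the testcase.  */
static void shift_demo(void)
{
    const int a = sizeof(int) * CHAR_BIT;
    const int b = -7;
    unsigned width = sizeof(int) * CHAR_BIT;

    assert(classify_shift_count(a, width) == SHIFT_TOO_WIDE);
    assert(classify_shift_count(b, width) == SHIFT_NEGATIVE);
    assert(classify_shift_count(3, width) == SHIFT_OK);
}
```

In the testcase the counts are const variables rather than literals, which is why they only become INTEGER_CSTs after folding.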
[PATCH] predict_loop fix (PR tree-optimization/48189)
Hi! This is Steven's patch from the PR plus mine hunk, bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2013-01-08 Steven Bosscher ste...@gcc.gnu.org Jakub Jelinek ja...@redhat.com PR tree-optimization/48189 * predict.c (predict_loops): If max is 0, don't call compare_tree_int. If nitercst is 0, don't predict the exit edge. * gcc.dg/pr48189.c: New test. --- gcc/predict.c.jj2013-01-04 13:44:35.0 +0100 +++ gcc/predict.c 2013-01-08 15:58:15.552884491 +0100 @@ -1434,7 +1434,8 @@ predict_loops (void) if (TREE_CODE (niter) == INTEGER_CST) { if (host_integerp (niter, 1) - compare_tree_int (niter, max-1) == -1) + max + compare_tree_int (niter, max - 1) == -1) nitercst = tree_low_cst (niter, 1) + 1; else nitercst = max; @@ -1456,6 +1457,11 @@ predict_loops (void) else continue; + /* If the prediction for number of iterations is zero, do not +predict the exit edges. */ + if (nitercst == 0) + continue; + probability = ((REG_BR_PROB_BASE + nitercst / 2) / nitercst); predict_edge (ex, predictor, probability); } --- gcc/testsuite/gcc.dg/pr48189.c.jj 2013-01-08 15:59:38.319433008 +0100 +++ gcc/testsuite/gcc.dg/pr48189.c 2013-01-08 15:52:31.0 +0100 @@ -0,0 +1,13 @@ +/* PR tree-optimization/48189 */ +/* { dg-do compile } */ +/* { dg-options -O --param max-predicted-iterations=0 } */ + +struct S { int s[8]; }; + +void +foo (int *x, struct S *y) +{ + int i; + for (i = 0; y[i].s[i]; i++) +*x++ = y[i].s[i]; +} Jakub
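The reason for the new nitercst == 0 check is the division in the probability computation. A minimal C sketch of that formula, with the guard in place, could look as follows. PROB_BASE is an assumption standing in for GCC's REG_BR_PROB_BASE (taken to be 10000 here), and exit_edge_probability and probability_demo are hypothetical names; the patch itself simply skips the prediction with continue, whereas the sketch returns -1.

```c
#include <assert.h>

/* Assumption: stands in for GCC's REG_BR_PROB_BASE.  */
#define PROB_BASE 10000

/* Rounded PROB_BASE / nitercst, or -1 when no prediction can be made.
   Without the guard, nitercst == 0 would divide by zero.  */
static int exit_edge_probability(long nitercst)
{
    if (nitercst <= 0)
        return -1;
    return (int)((PROB_BASE + nitercst / 2) / nitercst);
}

/* Usage: a zero iteration estimate yields no prediction.  */
static void probability_demo(void)
{
    assert(exit_edge_probability(0) == -1);
    assert(exit_edge_probability(1) == 10000);
    assert(exit_edge_probability(4) == 2500);
}
```

The testcase reaches the zero case by passing --param max-predicted-iterations=0.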
Re: [Patch, Fortran] PR55758 - Non-C_Bool handling with BIND(C)
On Sun, Jan 6, 2013 at 6:52 PM, Tobias Burnus bur...@net-b.de wrote:
** ping **
Attached is a small variation, which in addition handles the case that a non-BOOL_C LOGICAL, Bind(C) dummy argument (or result variable) is used in a procedure call. In that case, the variable is now converted to a TYPE_PRECISION == 1 variable. -- The updated patch was built and regtested successfully.

Nice, this should fix a pitfall with the previous patch. I still worry about these almost-but-not-quite logicals causing weird and very hard to track down bugs. A slightly safer variant of the approach you describe above would be to convert the variable directly after the bind(c) procedure call; that should make it pretty fool-proof, AFAICS? (in some cases that would be a bit of extra useless work, but I doubt it would matter performance-wise).

As written before, I believe that the patch avoids some pitfalls with C interoperability of logical variables: On one hand, it improves cross-compiler portability by rejecting non C_BOOL ones with -std=f2003/f2008/f2008ts;

This part is certainly ok.

on the other hand, it makes wrong-code issues due to using non-0/1 integers from C much less likely. In both cases, the type-precision==1 handling for non-BIND(C) Fortran LOGICALs or for Bind(C) LOGICAL(kind=C_BOOL) remains the same; hence, no optimization issue is caused. OK for the trunk? Tobias

PS: If there is consensus that this patch is a bad idea, I propose to reject non-C_BOOL LOGICALs unconditionally as dummy argument or result variable of BIND(C) procedures. Or do you have a better suggestion?

On December 30, 2012, Tobias Burnus wrote:
Janne Blomqvist wrote:
On Fri, Dec 28, 2012 at 12:31 AM, Tobias Burnus bur...@net-b.de wrote:
a) The Fortran standard only defines LOGICAL(kind=C_Bool) as being interoperable with C - no other LOGICAL type. That matches GCC: With gcc (the C compiler) only _Bool is a BOOLEAN_TYPE with TYPE_PRECISION == 1.
Hence, this patch rejects other logical kinds as dummy argument/result variable in BIND(C) procedures if -std=f2003/f2008/f2008ts is specified (using -pedantic, one gets a warning). Sorry, I don't understand, what is the -pedantic warning about if it's already rejected? Or do you mean std=gnu -pedantic? The latter. Actually, I use gfc_notify_std(GFC_STD_GNU, ... and just observed the -pedantic result. (I have to admit that I never quite understood - and still don't - what -pedantic exactly does.) b) As GNU extension, other logical kinds are accepted in BIND(C) procedures; however, as the main use of LOGICAL(kind=4) (for BIND(C) procedures) is to handle logical expressions which use C's int, one has to deal with all integer values and not only 0 and 1. Hence, a normal integer type is used internally in that case. That has been done to avoid surprises of users and hard to trace bugs. Does this actually work robustly? I think it does in the sense that it mitigates the problems related to LOGICAL(kind=4) and BIND(C) procedures. No, if one thinks of it as full cure for the problem. The only way to ensure this is to turn all of gfortran's LOGICALs into integers - and even that won't prevent issues related to interoperability with C's _Bool as that one expects only 0 and 1. Thus, either C-Fortran or Fortran - Fortran logical(kind=C_Bool) could still lead to problems. E.g. if you have a logical but really integer under the covers, what happens if you equivalence it with a normal logical variable. Well, equivalencing of actual arguments / result variables is not allowed (I think, not checked). Besides, if you have equivalenced two variables, if you have set one, you may not access the other, e.g.: logical :: A integer :: B equivalence (A,B) A = .true. B = 1 if (A) ... is invalid as A is not defined, even if A = .true. and B = 1 have exactly the same storage size and bit patterns and, hence, in practice A would be a well defined .true. 
Or pass it as an argument to a procedure expecting a normal logical etc. If the value is only 1 or 0, there shouldn't be any problems. Only if one in turn uses .not. dummy might there be one. The idea of the patch was only to mitigate the problems - a full cure is not possible (cf. above). I think the most likely problematic code is if (.not. c_function()) which is fixed by the patch. And the hope is that fold-converting to a type-precision=1, Boolean-type logical fixes most of the remaining issues. I think the current solution, which only affects non-C_BOOL-kind actual arguments and result variables of BIND(C) procedures, is a good compromise. * * * But if others do not like this approach, one could turn the gfc_notify_std into a gfc_error and completely reject logicals with kinds /= C_Bool for dummy arguments/result variables in BIND(C) procedures. Would you prefer that approach? (Doing so will break user code (e.g. Open MPI) and make users unhappy but it will
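A minimal sketch of the if (.not. c_function()) pitfall discussed above, written in C++ for illustration. It assumes that negating a 0/1 logical is done by flipping the lowest bit; that is an assumption made for the example, not a statement about gfortran's actual code generation. The names c_function, not_assuming_01 and not_after_normalizing are hypothetical.

```cpp
// Hypothetical stand-in for a C function that reports "true" as 2
// rather than 1, which is perfectly legal for a C int.
int c_function() { return 2; }

// Negation that is only correct when the input is exactly 0 or 1:
// it flips the lowest bit and leaves all other bits alone.
int not_assuming_01(int logical) { return logical ^ 1; }

// Negation after first normalizing the input to 0 or 1, which is the
// effect of converting to a one-bit Boolean before using the value.
int not_after_normalizing(int logical) { return (logical != 0) ^ 1; }
```

With the input 2, the first variant yields 3, which still tests as true, so a branch guarded by the negation is taken even though the C side meant "true". The second variant yields 0 as expected.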
Re: [C++ Patch] PR 55801
I think I'd rather handle this by returning false from var_needs_tls_wrapper. Jason
[PATCH] Allow x - x, 1 in *vec_concatv2df (PR rtl-optimization/55829)
Hi! No matter whether LRA (if it is a bug in there) is fixed or not, *vec_concatv2df could handle for !avx sse3 x - x, 1 alternative the same as it handles x - m, 1 alternative (using movddup). Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2013-01-08 Jakub Jelinek ja...@redhat.com PR rtl-optimization/55829 * config/i386/sse.md (*vec_concatv2df): Add x - x, 1 alternative for sse3 but not avx. * gcc.target/i386/pr55829.c: New test. --- gcc/config/i386/sse.md.jj 2012-11-26 10:14:26.0 +0100 +++ gcc/config/i386/sse.md 2013-01-08 10:28:42.496819712 +0100 @@ -5183,10 +5183,10 @@ (define_insn vec_dupv2df (set_attr mode V2DF)]) (define_insn *vec_concatv2df - [(set (match_operand:V2DF 0 register_operand =x,x,x,x,x,x,x,x) + [(set (match_operand:V2DF 0 register_operand =x,x,x, x,x,x,x,x) (vec_concat:V2DF - (match_operand:DF 1 nonimmediate_operand 0,x,m,0,x,m,0,0) - (match_operand:DF 2 vector_move_operand x,x,1,m,m,C,x,m)))] + (match_operand:DF 1 nonimmediate_operand 0,x,xm,0,x,m,0,0) + (match_operand:DF 2 vector_move_operand x,x,1, m,m,C,x,m)))] TARGET_SSE @ unpcklpd\t{%2, %0|%0, %2} --- gcc/testsuite/gcc.target/i386/pr55829.c.jj 2013-01-08 10:34:41.490778873 +0100 +++ gcc/testsuite/gcc.target/i386/pr55829.c 2013-01-08 10:35:24.591532010 +0100 @@ -0,0 +1,23 @@ +/* PR rtl-optimization/55829 */ +/* { dg-do compile } */ +/* { dg-options -O2 -msse3 -fno-expensive-optimizations } */ + +typedef double V __attribute__ ((__vector_size__ (16))); +extern double v[], w[]; +int foo (void); + +int +bar (void) +{ + int i, f = 0; + V t1 = (V) { *v, 0 }; + V t2 = __builtin_ia32_shufpd (t1, t1, 0); + double p10 = v[0]; + for (i = 0; i 80; i++) +{ + w[0] = p10; + __builtin_ia32_storeupd (v, t2); + f += foo (); +} + return f; +} Jakub
[1/2] fixincludes copyright
Here's the patch that Joseph asked for to make mkheaders.in print a single copyright year. Tested on x86_64-linux-gnu. OK to install? Richard fixincludes/ * mkheaders.in: Print 2013 as the copyright year. Index: fixincludes/mkheaders.in === --- fixincludes/mkheaders.in2013-01-07 20:22:36.0 + +++ fixincludes/mkheaders.in2013-01-07 20:24:22.868016259 + @@ -41,7 +41,7 @@ fi if [ x$1 = x--version ] ; then echo mkheaders (GCC) version $version -echo Copyright 2002, 2007, 2009 Free Software Foundation, Inc. +echo Copyright (C) 2013 Free Software Foundation, Inc. echo This program is free software; you may redistribute it under the echo terms of the GNU General Public License. This program has echo absolutely no warranty.
[2/2] fixincludes copyright
Here's the result of running the copyright update script on fixincludes/. Tested on x86_64-linux-gnu. OK to install? Hope we don't need to list every file in the ChangeLog for this kind of thing :-) Richard fixincludes/ Update copyright. Index: gcc/fixincludes/fixfixes.c === --- gcc.orig/fixincludes/fixfixes.c +++ gcc/fixincludes/fixfixes.c @@ -3,8 +3,7 @@ Test to see if a particular fix should be applied to a header file. - Copyright (C) 1997, 1998, 1999, 2000, 2001, 2003, 2004, 2009 - Free Software Foundation, Inc. + Copyright (C) 1997-2013 Free Software Foundation, Inc. = = = = = = = = = = = = = = = = = = = = = = = = = Index: gcc/fixincludes/fixinc.in === --- gcc.orig/fixincludes/fixinc.in +++ gcc/fixincludes/fixinc.in @@ -6,8 +6,7 @@ # # See README-fixinc for more information. # -# fixincludes copyright (c) 1998, 1999, 2000, 2002, 2009 -# The Free Software Foundation, Inc. +# fixincludes copyright (C) 1998-2013 Free Software Foundation, Inc. # # fixincludes is free software. # Index: gcc/fixincludes/fixincl.c === --- gcc.orig/fixincludes/fixincl.c +++ gcc/fixincludes/fixincl.c @@ -2,8 +2,7 @@ files which are fixed to work correctly with ANSI C and placed in a directory that GCC will search. - Copyright (C) 1997, 1998, 1999, 2000, 2004, 2009, 2012 - Free Software Foundation, Inc. + Copyright (C) 1997-2013 Free Software Foundation, Inc. This file is part of GCC. Index: gcc/fixincludes/fixincl.tpl === --- gcc.orig/fixincludes/fixincl.tpl +++ gcc/fixincludes/fixincl.tpl @@ -18,9 +18,7 @@ x=fixincl.x =] * * See README for more information. * - * inclhack copyright (c) 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, - * 2006, 2007, 2008 - * The Free Software Foundation, Inc. + * inclhack copyright (C) 1998-2013 Free Software Foundation, Inc. 
* * inclhack is free software: you can redistribute it and/or modify it * under the terms of the GNU General Public License as published by the Index: gcc/fixincludes/fixincl.x === --- gcc.orig/fixincludes/fixincl.x +++ gcc/fixincludes/fixincl.x @@ -19,9 +19,7 @@ * * See README for more information. * - * inclhack copyright (c) 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, - * 2006, 2007, 2008 - * The Free Software Foundation, Inc. + * inclhack copyright (C) 1998-2013 Free Software Foundation, Inc. * * inclhack is free software: you can redistribute it and/or modify it * under the terms of the GNU General Public License as published by the Index: gcc/fixincludes/fixlib.c === --- gcc.orig/fixincludes/fixlib.c +++ gcc/fixincludes/fixlib.c @@ -3,7 +3,7 @@ files which are fixed to work correctly with ANSI C and placed in a directory that GCC will search. - Copyright (C) 1999, 2000, 2001, 2004, 2009 Free Software Foundation, Inc. + Copyright (C) 1999-2013 Free Software Foundation, Inc. This file is part of GCC. Index: gcc/fixincludes/fixlib.h === --- gcc.orig/fixincludes/fixlib.h +++ gcc/fixincludes/fixlib.h @@ -3,8 +3,7 @@ files which are fixed to work correctly with ANSI C and placed in a directory that GCC will search. - Copyright (C) 1997, 1998, 1999, 2000, 2001, 2002, 2004, 2009, 2012 - Free Software Foundation, Inc. + Copyright (C) 1997-2013 Free Software Foundation, Inc. This file is part of GCC. Index: gcc/fixincludes/fixopts.c === --- gcc.orig/fixincludes/fixopts.c +++ gcc/fixincludes/fixopts.c @@ -1,6 +1,6 @@ /* Handle options that are passed from environment variables. - Copyright (C) 2004, 2009, 2012 Free Software Foundation, Inc. + Copyright (C) 2004-2013 Free Software Foundation, Inc. This file is part of GCC. Index: gcc/fixincludes/fixtests.c === --- gcc.orig/fixincludes/fixtests.c +++ gcc/fixincludes/fixtests.c @@ -3,8 +3,7 @@ Test to see if a particular fix should be applied to a header file. 
- Copyright (C) 1997, 1998, 1999, 2000, 2009, 2012 - Free Software Foundation, Inc. + Copyright (C) 1997-2013 Free Software Foundation, Inc. = = = = = = = = = = = = = = = = = = = = = = = = = Index: gcc/fixincludes/genfixes === --- gcc.orig/fixincludes/genfixes +++ gcc/fixincludes/genfixes @@ -1,6 +1,6 @@ #! /bin/sh -# Copyright (C) 1999, 2000, 2001, 2009, 2012 Free Software Foundation, Inc. +# Copyright (C) 1999-2013 Free
Re: PR 55875 (IV wrapping issue)
Hi, here is an even more updated patch, this time really fixing the testcase, I hope ;) It turns out there is one extra problem in tree-ssa-loop-niter.c triggered by that code. Some bounds, like ones based on an inequality test or on wrapping IVs, are bounds only when they are executed every iteration. Bootstrapped/regtested x86_64-linux. Honza PR tree-optimization/55875 * gcc.c-torture/execute/pr55875.c: New testcase. * g++.dg/torture/pr55875.C: New testcase. * tree-ssa-loop-niter.c (number_of_iterations_cond): Add EVERY_ITERATION parameter. (number_of_iterations_exit): Check if exit is executed every iteration. (idx_infer_loop_bounds): Similarly here. (n_of_executions_at_most): Simplify to only test for cases where statement is dominated by the particular bound; handle correctly the postdominance test. (scev_probably_wraps_p): Use max loop iterations info as a global bound first. Index: tree-ssa-loop-niter.c === *** tree-ssa-loop-niter.c (revision 194918) --- tree-ssa-loop-niter.c (working copy) *** dump_affine_iv (FILE *file, affine_iv *i *** 1208,1213 --- 1208,1215 -- in this case we can use the information whether the control induction variables can overflow or not in a more efficient way. +if EVERY_ITERATION is true, we know the test is executed on every iteration. + The results (number of iterations and assumptions as described in comments at struct tree_niter_desc in tree-flow.h) are stored to NITER. Returns false if it fails to determine number of iterations, true if it *** static bool *** 1217,1227 number_of_iterations_cond (struct loop *loop, tree type, affine_iv *iv0, enum tree_code code, affine_iv *iv1, struct tree_niter_desc *niter, !
bool only_exit) { bool exit_must_be_taken = false, ret; bounds bnds; /* The meaning of these assumptions is this: if !assumptions then the rest of information does not have to be valid --- 1219,1239 number_of_iterations_cond (struct loop *loop, tree type, affine_iv *iv0, enum tree_code code, affine_iv *iv1, struct tree_niter_desc *niter, ! bool only_exit, bool every_iteration) { bool exit_must_be_taken = false, ret; bounds bnds; + /* If the test is not executed every iteration, wrapping may make the test + to pass again. + TODO: the overflow case can be still used as unreliable estimate of upper + bound. But we have no API to pass it down to number of iterations code + and, at present, it will not use it anyway. */ + if (!every_iteration +(!iv0-no_overflow || !iv1-no_overflow + || code == NE_EXPR || code == EQ_EXPR)) + return false; + /* The meaning of these assumptions is this: if !assumptions then the rest of information does not have to be valid *** number_of_iterations_exit (struct loop * *** 1807,1815 tree op0, op1; enum tree_code code; affine_iv iv0, iv1; ! if (every_iteration !!dominated_by_p (CDI_DOMINATORS, loop-latch, exit-src)) return false; niter-assumptions = boolean_false_node; --- 1819,1829 tree op0, op1; enum tree_code code; affine_iv iv0, iv1; + bool safe; ! safe = dominated_by_p (CDI_DOMINATORS, loop-latch, exit-src); ! ! if (every_iteration !safe) return false; niter-assumptions = boolean_false_node; *** number_of_iterations_exit (struct loop * *** 1855,1861 iv0.base = expand_simple_operations (iv0.base); iv1.base = expand_simple_operations (iv1.base); if (!number_of_iterations_cond (loop, type, iv0, code, iv1, niter, ! loop_only_exit_p (loop, exit))) { fold_undefer_and_ignore_overflow_warnings (); return false; --- 1869,1875 iv0.base = expand_simple_operations (iv0.base); iv1.base = expand_simple_operations (iv1.base); if (!number_of_iterations_cond (loop, type, iv0, code, iv1, niter, ! 
loop_only_exit_p (loop, exit), safe)) { fold_undefer_and_ignore_overflow_warnings (); return false; *** idx_infer_loop_bounds (tree base, tree * *** 2657,2662 --- 2671,2677 tree low, high, type, next; bool sign, upper = true, at_end = false; struct loop *loop = data-loop; + bool reliable = true; if (TREE_CODE (base) != ARRAY_REF) return true; *** idx_infer_loop_bounds (tree base, tree * *** 2728,2734 tree_int_cst_compare (next, high) = 0) return true; ! record_nonwrapping_iv (loop, init, step,
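A small sketch of why an exit test that is not executed on every iteration cannot be used as a reliable bound when the induction variable wraps. This is an illustration under stated assumptions, not the testcase from the PR: the function name iterations_until_exit and the skip parameter are hypothetical, and an 8-bit counter is used so that the wrap happens quickly.

```cpp
#include <cstdint>

// Counts loop iterations. The exit test `iv == 100` is only evaluated
// once more than `skip` iterations have run, so it is not executed on
// every iteration. `iv` is an 8-bit unsigned counter and wraps from
// 255 back to 0.
unsigned iterations_until_exit(unsigned skip) {
  unsigned iterations = 0;
  std::uint8_t iv = 0;
  for (;;) {
    ++iterations;
    if (iterations > skip && iv == 100)
      break;  // exit test, guarded by a condition
    ++iv;     // wraps modulo 256
  }
  return iterations;
}
```

If the test runs every iteration (skip of 0), the loop exits after 101 iterations, which is what an analysis of `iv == 100` alone would predict. With a skip of 150 the first match is missed, the counter wraps, and the loop only exits on iteration 357, so the bound of 101 would be wrong.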
Re: [PATCH] Adding target rdos to GCC
After some new suggestions from Uros, I've made a new diff. Change log: * config/gthr.m4: Added rdos thread header. * gcc/config/i386/i386.c: Provided a way to define a default setting for medium memory model and PIC using TARGET_RDOS define. * gcc/config/i386/i386.h: Defined TARGET_RDOS macro. Defined default value for large-data-threshold. * gcc/config/i386/i386.md: Added r14 and r15 register names. * gcc/config/i386/i386.opt: Changed initial value for large-data-threshold to DEFAULT_LARGE_SECTION_THRESHOLD. * gcc/config/i386/rdos.h: Common definitions for target rdos * gcc/config/i386/rdos64.h: Specific definitions for 64-bit rdos target. * gcc/config.gcc: Added rdos targets Tested on target rdos and rdos32. Leif Ekblad - Original Message - From: Leif Ekblad l...@rdos.net To: Uros Bizjak ubiz...@gmail.com; gcc-patches@gcc.gnu.org Cc: H.J. Lu hjl.to...@gmail.com Sent: Monday, January 07, 2013 11:13 PM Subject: Re: [PATCH] Adding target rdos to GCC Hello Uros, I tried to use your suggestion and use DEFAULT_SECTION_THRESHOLD in i386.opt, however this doesn't work for me. It seems the default is always 65536 regardless of what I define in rdos.h. Therefore, I kept the design as it was before. Changing name to TARGET_RDOS seems reasonable, and this has been changed. The location of the defines are also changed to the positions you proposed. I added another include-file (rdos32.h) for the 32-bit target. Currently, the 32-bit target is supported with OpenWatcom. While I want to make sure the header-files compile for both 32- and 64-bits, 32-bit support is not my primary target right now. I just need a GCC that can build the 32-bit version. New change log: * config/gthr.m4: Added rdos thread header. * gcc/config/i386/i386.c: Provided a way to define a default setting for medium memory model and PIC using TARGET_RDOS define. Provided a way to define a default value for large-data-threshold. * gcc/config/i386/i386.h: Defined TARGET_RDOS macro. 
Defined default value for large-data-threshold. * gcc/config/i386/i386.md: Added r14 and r15 register names. * gcc/config/i386/i386.opt: Changed initial value for large-data-threshold to 0 to allow detection of modification. * gcc/config/i386/rdos.h: Added new file for 64-bit rdos target. * gcc/config/i386/rdos32.h: Added new file for 32-bit rdos target. * gcc/config.gcc: Added rdos targets Leif Ekblad - Original Message - From: Uros Bizjak ubiz...@gmail.com To: gcc-patches@gcc.gnu.org Cc: Leif Ekblad l...@rdos.net; H.J. Lu hjl.to...@gmail.com Sent: Sunday, January 06, 2013 9:32 PM Subject: Re: [PATCH] Adding target rdos to GCC Hello! Updated patches with the suggestions below, except that the initial value is 0 (which is not meaningful). I also added a setting in rdos target file to use r15 for PIC instead of rbx. *** gcc-4.8-20121230/gcc/config/i386/i386.c Thu Dec 27 02:58:06 2012 --- gcc-work/gcc/config/i386/i386.c Sun Jan 6 14:10:34 2013 + if (ix86_section_threshold == 0) + ix86_section_threshold = DEFAULT_SECTION_THRESHOLD; Please see attached patch on how to initialize an option. *** gcc-4.8-20121230/gcc/config/i386/i386.h Wed Dec 19 17:04:12 2012 --- gcc-work/gcc/config/i386/i386.h Sun Jan 6 13:24:26 2013 + #define TARGET_MEDIUM_PIC 0 You should just use TARGET_RDOS here (also, please see attached patch). + /* Default threshold for putting data in large sections with x86-64 +medium memory model */ + #define DEFAULT_SECTION_THRESHOLD 65536 Do not put this define in the middle of i386.h section that deals with alignment... *** gcc-4.8-20121230/gcc/config/i386/rdos.h Thu Jan 1 01:00:00 1970 --- gcc-work/gcc/config/i386/rdos.h Sun Jan 6 13:20:12 2013 + #undef REAL_PIC_OFFSET_TABLE_REGNUM + #define REAL_PIC_OFFSET_TABLE_REGNUM R15_REG Is this header also used for 32bit target? You should not use REX registers for 32bit targets. + #undef TARGET_MEDIUM_PIC + #define TARGET_MEDIUM_PIC 1 TARGET_RDOS + #define DEFAULT_SECTION_THRESHOLD0x10 No hex numbers here. 
Also, you will need to #undef this first to override the default in i386.h. *** gcc-4.8-20121230/gcc/config.gcc Thu Nov 22 00:33:40 2012 --- gcc-work/gcc/config.gcc Fri Jan 4 21:08:46 2013 + i[34567]86-*-rdos*) + tm_file=${tm_file} i386/unix.h i386/att.h dbxelf.h elfos.h newlib-stdint.h + ;; You forgot to include i386/rdos.h here (other headers too?). It is needed at least for TARGET_EXECUTABLE_SUFFIX define. Uros. gcc-base.diff Description: Binary data
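The two review points about the threshold default (override with #undef first, and treat 0 as "not set") can be sketched as follows. This is a standalone illustration and not the i386 back-end code: the value 16 and the helper effective_threshold are hypothetical, and only the macro name is taken from the patch under review.

```cpp
// Generic default, as a common header might provide it.
#define DEFAULT_SECTION_THRESHOLD 65536

// A target header that wants a different default has to remove the
// earlier definition first; redefining a macro with a different value
// without #undef is ill-formed.
#undef DEFAULT_SECTION_THRESHOLD
#define DEFAULT_SECTION_THRESHOLD 16  // hypothetical target-specific value

// Hypothetical helper: an option value of 0 is taken to mean "the user
// did not pass the option", so the target default is substituted.
int effective_threshold(int option_value) {
  return option_value == 0 ? DEFAULT_SECTION_THRESHOLD : option_value;
}
```

The sentinel approach cannot tell an explicit 0 on the command line from an unset option, which is one reason initializing the option directly from the macro may be preferable.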
[patch] fix libstdc++/55908
This fixes a 4.8 regression in std::mem_fn. PR libstdc++/55908 * include/std/functional (_Mem_fn::operator()): Fix constraints to avoid ambiguity. * testsuite/20_util/function_objects/mem_fn/55908.cc: New. * testsuite/20_util/bind/ref_neg.cc: Adjust dg-error line numbers. Tested x86_64-linux, committed to trunk. commit bb2e3a9896c0c3be894e5b926ea55ee12a8a2297 Author: Jonathan Wakely jwakely@gmail.com Date: Tue Jan 8 20:27:06 2013 + PR libstdc++/55908 * include/std/functional (_Mem_fn::operator()): Fix constraints to avoid ambiguity. * testsuite/20_util/function_objects/mem_fn/55908.cc: New. * testsuite/20_util/bind/ref_neg.cc: Adjust dg-error line numbers. diff --git a/libstdc++-v3/include/std/functional b/libstdc++-v3/include/std/functional index 0b5d475..714b3ed 100644 --- a/libstdc++-v3/include/std/functional +++ b/libstdc++-v3/include/std/functional @@ -648,7 +648,7 @@ _GLIBCXX_HAS_NESTED_TYPE(result_type) templatetypename _Tp, typename... _Args using _RequireValidArgs2 - = _Require_NotSame_Class, _Tp, _NotSame_Class*, _Tp, + = _Require_NotSame_Class, _Tp, _NotSameconst _Class*, _Tp, _AllConvertible_Pack_Args..., _Pack_ArgTypes...; templatetypename _Tp, typename... _Args @@ -727,7 +727,7 @@ _GLIBCXX_HAS_NESTED_TYPE(result_type) templatetypename _Tp, typename... _Args using _RequireValidArgs2 - = _Require_NotSame_Class, _Tp, _NotSame_Class*, _Tp, + = _Require_NotSame_Class, _Tp, _NotSamevolatile _Class*, _Tp, _AllConvertible_Pack_Args..., _Pack_ArgTypes...; templatetypename _Tp, typename... _Args @@ -807,7 +807,8 @@ _GLIBCXX_HAS_NESTED_TYPE(result_type) templatetypename _Tp, typename... _Args using _RequireValidArgs2 - = _Require_NotSame_Class, _Tp, _NotSame_Class*, _Tp, + = _Require_NotSame_Class, _Tp, +_NotSameconst volatile _Class*, _Tp, _AllConvertible_Pack_Args..., _Pack_ArgTypes...; templatetypename _Tp, typename... 
_Args diff --git a/libstdc++-v3/testsuite/20_util/bind/ref_neg.cc b/libstdc++-v3/testsuite/20_util/bind/ref_neg.cc index 9854176..f65baa0 100644 --- a/libstdc++-v3/testsuite/20_util/bind/ref_neg.cc +++ b/libstdc++-v3/testsuite/20_util/bind/ref_neg.cc @@ -30,10 +30,10 @@ void test01() { const int dummy = 0; std::bind(inc, _1)(0); // { dg-error no match } - // { dg-error rvalue|const { target *-*-* } 1346 } - // { dg-error rvalue|const { target *-*-* } 1360 } - // { dg-error rvalue|const { target *-*-* } 1374 } - // { dg-error rvalue|const { target *-*-* } 1388 } + // { dg-error rvalue|const { target *-*-* } 1347 } + // { dg-error rvalue|const { target *-*-* } 1361 } + // { dg-error rvalue|const { target *-*-* } 1375 } + // { dg-error rvalue|const { target *-*-* } 1389 } std::bind(inc, std::ref(dummy))(); // { dg-error no match } } diff --git a/libstdc++-v3/testsuite/20_util/function_objects/mem_fn/55908.cc b/libstdc++-v3/testsuite/20_util/function_objects/mem_fn/55908.cc new file mode 100644 index 000..7bbdb34 --- /dev/null +++ b/libstdc++-v3/testsuite/20_util/function_objects/mem_fn/55908.cc @@ -0,0 +1,34 @@ +// { dg-options -std=gnu++11 } +// { dg-do compile } + +// Copyright (C) 2013 Free Software Foundation, Inc. +// +// This file is part of the GNU ISO C++ Library. This library is free +// software; you can redistribute it and/or modify it under the +// terms of the GNU General Public License as published by the +// Free Software Foundation; either version 3, or (at your option) +// any later version. + +// This library is distributed in the hope that it will be useful, +// but WITHOUT ANY WARRANTY; without even the implied warranty of +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +// GNU General Public License for more details. + +// You should have received a copy of the GNU General Public License along +// with this library; see the file COPYING3. If not see +// http://www.gnu.org/licenses/. 
+ +// PR libstdc++/55908 calling const member function via const pointer + +#include functional + +struct foo +{ + void f(int) const { } + + void g() const + { +auto mf = std::mem_fn(foo::f); +mf(this, 1); + } +};
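For reference, a self-contained sketch of the usage the new test exercises: calling a const member function through std::mem_fn with a pointer to const. The type Counter and the function call_through_const_pointer are hypothetical names chosen for this example.

```cpp
#include <functional>

struct Counter {
  int value = 0;

  int get(int offset) const { return value + offset; }

  // Inside a const member function, `this` has type `const Counter*`,
  // which is the argument type that triggered the ambiguity.
  int via_mem_fn(int offset) const {
    auto mf = std::mem_fn(&Counter::get);
    return mf(this, offset);
  }
};

// The same call with an explicit pointer-to-const argument.
int call_through_const_pointer(const Counter* c, int offset) {
  return std::mem_fn(&Counter::get)(c, offset);
}
```

Both calls should select the pointer overload of the call wrapper without ambiguity on a conforming library.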
[Patch, fortran] PR47203 Use of module with same name as subroutine
Hello,

a small, unexciting bug. For the case:

  subroutine m()
    use m
  end subroutine m

the USE statement is rejected, but it is not if the subroutine is contained. In the latter case, the namespace of the symbol of the subroutine is the parent namespace, which confuses the error condition in check_for_ambiguous.

Regression tested on x86_64-unknown-linux-gnu. OK for trunk?

Mikael

2013-01-08  Mikael Morin  mik...@gcc.gnu.org

	PR fortran/47203
	* module.c (check_for_ambiguous): Get the current program unit using gfc_current_ns.

diff --git a/module.c b/module.c
index f3b3caa..604acbb 100644
--- a/module.c
+++ b/module.c
@@ -4493,7 +4493,7 @@ check_for_ambiguous (gfc_symbol *st_sym, pointer_info *info)
   module_locus locus;
   symbol_attribute attr;
 
-  if (st_sym->ns->proc_name && st_sym->name == st_sym->ns->proc_name->name)
+  if (st_sym->name == gfc_current_ns->proc_name->name)
     {
       gfc_error ("'%s' of module '%s', imported at %C, is also the name of the "
 		 "current program unit", st_sym->name, module_name);

2013-01-08  Mikael Morin  mik...@gcc.gnu.org

	PR fortran/47203
	* gfortran.dg/use_28.f90: New test.

! { dg-do compile }
!
! PR fortran/47203
! The USE statement of a module was not rejected in a procedure with the same
! name if the procedure was contained.
!
! Contributed by Tobias Burnus bur...@net-b.de

module m
end module m

call m
contains
  subroutine m()
    use m    ! { dg-error "is also the name of the current program unit" }
  end subroutine m
end
Re: Support for MIPS r5900
Hello Richard, cvt.w.s is replaced by trunc.w.s. This seems to work. Probably showing my ignorance, but I couldn't see this in the patch. trunc.w.s is enabled by ISA_HAS_TRUNC_W_S. This automatically disables cvt.w.s, because trunc.w.s is preferred. I disabled 64 bit FPU instructions by -msoft-float. This works, but using -msingle-float fails. This would be the better configuration. There are still 64 bit FPU instructions used (e.g. dmfc1 $2,$f0 when using long double multiplication). So -msingle-float doesn't seem to work on generic mips64-linux-gnu. Right. That combination hasn't really been defined. What happens for plain doubles? This seems to work. There are no unsupported instructions generated. Do you pass those in FPRs or GPRs? I used -mhard-float together with -msingle-float, so it is using FPRs. The n32 TImode support you mention uses pairs of GPRs, whereas I imagine you'd eventually want to use a single 128-bit GPR. Is that right? Most old PS2 code will work when supporting this: typedef unsigned int __u128 __attribute__((mode(TI))); This is currently working with n32 ABI without any change, but not in o32 ABI. Support for 128-bit GPR would be better, so we have full compatibility to old PS2 code. For the record, I think all those patches would be too invasive this late into the 4.8 cycle so would have to wait for 4.9. OK. I also want to go step by step here. Please use ISA_HAS_* for consistency with other macros. I think it'd be better to drop the '(integer)'. OK. I thought ISA_HAS_* was related to MIPS ISA. My stuff is only related to one CPU here. Several other cases later on, I won't bore you with them all :-) I rework it and try to get it stable and tested. Best regards Jürgen
[patch] update libstdc++ C++11 status table
I was going to fix this missing feature, but as Richard just announced Stage 3 has ended I'll just update the docs and wait for trunk to reopen. * doc/xml/manual/status_cxx2011.xml: Document that N3189 is missing. Committed to trunk and 4.7 commit d5a292b9ce2ebbe48ab2a9b6823554c86d2bfa06 Author: Jonathan Wakely jwakely@gmail.com Date: Tue Jan 8 20:58:53 2013 + * doc/xml/manual/status_cxx2011.xml: Document that N3189 is missing. diff --git a/libstdc++-v3/doc/xml/manual/status_cxx2011.xml b/libstdc++-v3/doc/xml/manual/status_cxx2011.xml index c0780de..5693bfd 100644 --- a/libstdc++-v3/doc/xml/manual/status_cxx2011.xml +++ b/libstdc++-v3/doc/xml/manual/status_cxx2011.xml @@ -154,10 +154,13 @@ particular release. entryC library dependency for quick_exit, at_quick_exit/entry /row row + ?dbhtml bgcolor=#B0B0B0 ? entry18.6/entry entryDynamic memory management/entry - entryY/entry - entry/ + entryPartial/entry + entryMissing codeget_new_handler/code. + codeset_new_handler/code is not thread-safe. + /entry /row row entry18.7/entry @@ -202,10 +205,13 @@ particular release. entry/ /row row + ?dbhtml bgcolor=#B0B0B0 ? entry18.8.3/entry entryAbnormal termination/entry - entryY/entry - entry/ + entryPartial/entry + entryMissing codeget_terminate/code. + codeset_terminate/code is not thread-safe. + /entry /row row entry18.8.4/entry @@ -2610,10 +2616,13 @@ particular release. entry/ /row row + ?dbhtml bgcolor=#B0B0B0 ? entryD.11/entry entryViolating exception-specifications/entry - entry/ - entry/ + entryPartial/entry + entryMissing codeget_unexpected/code. + codeset_unexpected/code is not thread-safe. + /entry /row /tbody
Re: [Patch, fortran] PR47203 Use of module with same name as subroutine
On Tue, Jan 08, 2013 at 10:06:17PM +0100, Mikael Morin wrote: a small, unexciting bug. For the case: subroutine m() use m end subroutine m the USE statement is rejected, but it is not if the subroutine is contained. In the latter case, the namespace of the symbol of the subroutine is the parent namespace, which confuses the error condition in check_for_ambiguous. Regression tested on x86_64-unknown-linux-gnu. OK for trunk? OK. Note, RM sent a status report stating that trunk was closed except for regression and doc fixes. Other changes (may) need RM approval. -- Steve
Re: [C/C++ PATCH] shift with negative or too big count warning (PR c/48418)
On Tue, 8 Jan 2013, Jakub Jelinek wrote: Hi! As discussed in the PR, on the following testcase we've regressed with the introduction of c_fully_fold, when the C FE normally warns the argument isn't folded yet. Fixed by also warning in c_fully_fold_internal, if before that function the shift count wasn't INTEGER_CST and after it it is. The testcase also revealed a regression on the C++ FE side, caused by SIZEOF_EXPR folding deferral. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? The C front-end changes are OK. Properly diagnostic messages inside ? : conditionals should be marked up with G_() to ensure that both cases are extracted for translation, though in this case it doesn't matter much given that the message wording should stay identical to other copies of the same messages. -- Joseph S. Myers jos...@codesourcery.com
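The warning under discussion concerns shift counts that are negative or not smaller than the width of the shifted type, including counts that only become constants after folding (for example, ones built from sizeof). A minimal sketch of the same check done by hand at run time, with hypothetical helper names valid_shift_count and checked_shl:

```cpp
#include <climits>

// True when `count` is a valid shift count for a value of type T,
// i.e. non-negative and strictly less than the width of T in bits.
template <typename T>
constexpr bool valid_shift_count(long long count) {
  return count >= 0 && count < static_cast<long long>(sizeof(T) * CHAR_BIT);
}

// Left shift that refuses counts for which the behavior is undefined.
// Returns false and leaves `result` untouched when the count is rejected.
bool checked_shl(unsigned value, long long count, unsigned& result) {
  if (!valid_shift_count<unsigned>(count))
    return false;
  result = value << count;
  return true;
}
```

A count such as sizeof(unsigned) * CHAR_BIT is rejected here because it equals the width of the type, which is the "too big" case the diagnostic reports.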
Re: Support for MIPS r5900
Hello Maciej, ll, sc, dmult, ddiv, cvt.w.s, 64 bit FPU instructions. ll and sc is disabled with -mno-llsc and works. cvt.w.s is replaced by trunc.w.s. This seems to work. Probably showing my ignorance, but I couldn't see this in the patch. This has raised my attention -- AFAICS the binutils change recently approved correctly disables DMULT, DDIV, CVT.W.S, etc. for -march=r5900, but does not do that for LL or SC. I think that should be fixed. And I gather LLD and SCD should then be disabled as well. The glibc can only be compiled with support for ll and sc. The Linux kernel successfully emulates these instructions. When compiling GCC for mips*r5900*-elf (i.e. not Linux), the instructions ll/sc and lld/scd are disabled by my patch. Things would get more complicated if one wanted to run a real OS such as Linux on the R5900 and let the kernel FP emulator handle the missing double FP automagically -- this is a little bit out of scope here as regular -mdouble-float would then just do, but makes me wonder whether -mfp32 should really be enforced (as opposed to just defaulted) for the R5900, hmm... I tried to emulate the 64-bit FPU when the real 32-bit FPU was enabled in Linux. There are 2 problems with this: 1. When the program starts, I don't know if it needs 64-bit or 32-bit FPRs. So registers are initialized for 32 bit. When dmfc0 or dmtc0 appears, I need to emulate them using the 32-bit FPU, because some 32-bit programs use these instructions with a 32-bit FPU (e.g. Linux 2.6.35 kernel and Debian 5.0). When a 64-bit calculation instruction appears, I need to switch from 32-bit FPRs to 64-bit FPRs. When the program used 32-bit instructions with the odd FPRs, there is no way to reconstruct the overwritten part of the 64-bit FPRs. 2. Some undefined instructions (e.g. c.eq.d) don't lead to an exception on an r5900, but have undefined behavior. So there is no emulation possible. It just calculates random stuff. 
So the FPU needs to be disabled and completely emulated by the kernel, because then all FPU instructions lead to an exception. This is working with Linux 2.6 on PS2. There are even more problems when running unchanged code from official Fedora 12 on PS2, because of some different opcode encoding. The users of my PS2 Linux 2.6 complain about low speed, because many instructions are emulated. I need some fast implementation, even if the size of the floating point data types is smaller. So 32 bit FPU must be default for r5900. Best regards Jürgen
Re: [Patch, Fortran] PR55758 - Non-C_Bool handling with BIND(C)
Janne Blomqvist wrote: On Sun, Jan 6, 2013 at 6:52 PM, Tobias Burnus bur...@net-b.de wrote: Attached is a small variation, which in addition handles the case that a non-C_BOOL LOGICAL, Bind(C) dummy argument (or result variable) is used in a procedure call. In that case, the variable is now converted to a TYPE_PRECISION == 1 variable. -- The updated patch was built and regtested successfully. Nice, this should fix a pitfall with the previous patch. I still worry about these almost-but-not-quite logicals causing weird and very hard to track down bugs. Though, it should be much less severe than with the current trunk. A slightly safer variant of the approach you describe above would be to convert the variable directly after the bind(c) procedure call; that should make it pretty fool-proof, AFAICS? (in some cases that would be a bit of extra useless work, but I doubt it would matter performance-wise). Well, that's not as trivial as it sounds. In particular for a Fortran-written procedure, which gets the input from C. If the variable is INTENT(IN) or if it is not modified in the procedure, it may not be touched. In order to do this, one has to implement support for a shadow variable, which has to set the real one at the end of the procedure. I don't think that this shadow-var handling is really that trivial. For actual arguments, doing the conversion back is simpler. Function results might also be a bit tricky, but that's mostly handled by the current patch, I hope. Tobias
[PATCH] Merge Makefile.{def,in} from Binutils
Hi! I merged this patch from Binutils, which defines a dependency on gas for various other Binutils components: 2013-01-07 Jan-Benedict Glaw jbg...@lug-owl.de * Makefile.def: Merge from binutils. * Makefile.in: Ditto. Index: Makefile.in === --- Makefile.in (revision 195037) +++ Makefile.in (working copy) @@ -45910,6 +45910,14 @@ all-stage4-binutils: maybe-all-stage4-intl all-stageprofile-binutils: maybe-all-stageprofile-intl all-stagefeedback-binutils: maybe-all-stagefeedback-intl +all-binutils: maybe-all-gas + +all-stage1-binutils: maybe-all-stage1-gas +all-stage2-binutils: maybe-all-stage2-gas +all-stage3-binutils: maybe-all-stage3-gas +all-stage4-binutils: maybe-all-stage4-gas +all-stageprofile-binutils: maybe-all-stageprofile-gas +all-stagefeedback-binutils: maybe-all-stagefeedback-gas install-binutils: maybe-install-opcodes install-strip-binutils: maybe-install-strip-opcodes install-opcodes: maybe-install-bfd @@ -45959,6 +45967,7 @@ all-gprof: maybe-all-bfd all-gprof: maybe-all-opcodes all-gprof: maybe-all-intl +all-gprof: maybe-all-gas configure-ld: maybe-configure-intl configure-stage1-ld: maybe-configure-stage1-intl @@ -46015,6 +46024,14 @@ all-stage4-ld: maybe-all-stage4-intl all-stageprofile-ld: maybe-all-stageprofile-intl all-stagefeedback-ld: maybe-all-stagefeedback-intl +all-ld: maybe-all-gas + +all-stage1-ld: maybe-all-stage1-gas +all-stage2-ld: maybe-all-stage2-gas +all-stage3-ld: maybe-all-stage3-gas +all-stage4-ld: maybe-all-stage4-gas +all-stageprofile-ld: maybe-all-stageprofile-gas +all-stagefeedback-ld: maybe-all-stagefeedback-gas install-ld: maybe-install-gold install-strip-ld: maybe-install-strip-gold configure-gold: maybe-configure-intl @@ -46057,6 +46074,14 @@ all-stage4-gold: maybe-all-build-bison all-stageprofile-gold: maybe-all-build-bison all-stagefeedback-gold: maybe-all-build-bison +all-gold: maybe-all-gas + +all-stage1-gold: maybe-all-stage1-gas +all-stage2-gold: maybe-all-stage2-gas +all-stage3-gold: maybe-all-stage3-gas +all-stage4-gold: 
maybe-all-stage4-gas +all-stageprofile-gold: maybe-all-stageprofile-gas +all-stagefeedback-gold: maybe-all-stagefeedback-gas check-gold: maybe-all-binutils check-stage1-gold: maybe-all-stage1-binutils Index: Makefile.def === --- Makefile.def(revision 195037) +++ Makefile.def(working copy) @@ -373,6 +373,9 @@ dependencies = { module=configure-opcodes; on=configure-libiberty; hard=true; }; dependencies = { module=all-opcodes; on=all-libiberty; }; +// We must build gas before binutils, gprof, ld and gold to avoid race +// condition in the prev-gcc/as script during bootstrap of combined tree +// with GCC and binutils. See PR gas/14899 for details. dependencies = { module=configure-binutils; on=configure-intl; }; dependencies = { module=all-binutils; on=all-libiberty; }; dependencies = { module=all-binutils; on=all-opcodes; }; @@ -380,6 +383,7 @@ dependencies = { module=all-binutils; on=all-build-flex; }; dependencies = { module=all-binutils; on=all-build-bison; }; dependencies = { module=all-binutils; on=all-intl; }; +dependencies = { module=all-binutils; on=all-gas; }; // We put install-opcodes before install-binutils because the installed // binutils might be on PATH, and they might need the shared opcodes @@ -401,6 +405,7 @@ dependencies = { module=all-gprof; on=all-bfd; }; dependencies = { module=all-gprof; on=all-opcodes; }; dependencies = { module=all-gprof; on=all-intl; }; +dependencies = { module=all-gprof; on=all-gas; }; dependencies = { module=configure-ld; on=configure-intl; }; dependencies = { module=all-ld; on=all-libiberty; }; dependencies = { module=all-ld; on=all-bfd; }; @@ -408,6 +413,7 @@ dependencies = { module=all-ld; on=all-build-bison; }; dependencies = { module=all-ld; on=all-build-flex; }; dependencies = { module=all-ld; on=all-intl; }; +dependencies = { module=all-ld; on=all-gas; }; dependencies = { module=install-ld; on=install-gold; }; dependencies = { module=install-strip-ld; on=install-strip-gold; }; dependencies = { module=configure-gold; 
on=configure-intl; }; @@ -415,6 +421,7 @@ dependencies = { module=all-gold; on=all-intl; }; dependencies = { module=all-gold; on=all-bfd; }; dependencies = { module=all-gold; on=all-build-bison; }; +dependencies = { module=all-gold; on=all-gas; }; dependencies = { module=check-gold; on=all-binutils; }; dependencies = { module=check-gold; on=all-gas; }; Kind regards, JBG -- Jan-Benedict Glaw jbg...@lug-owl.de +49-172-7608481 Signature of the second: Everything should be made as simple as possible. But not simpler. (Einstein)
[PATCH] Merge config.{guess,sub} from config repo
Hi! I merged this patch: 2013-01-08 Jan-Benedict Glaw jbg...@lug-owl.de * config.sub: Merge from config repo. * config.guess: Ditto. Index: config.sub === --- config.sub (revision 195037) +++ config.sub (working copy) @@ -4,7 +4,7 @@ # 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, # 2011, 2012, 2013 Free Software Foundation, Inc. -timestamp='2012-12-29' +timestamp='2013-01-04' # This file is free software; you can redistribute it and/or modify it # under the terms of the GNU General Public License as published by @@ -1354,7 +1354,7 @@ -gnu* | -bsd* | -mach* | -minix* | -genix* | -ultrix* | -irix* \ | -*vms* | -sco* | -esix* | -isc* | -aix* | -cnk* | -sunos | -sunos[34]*\ | -hpux* | -unos* | -osf* | -luna* | -dgux* | -auroraux* | -solaris* \ - | -sym* | -kopensolaris* \ + | -sym* | -kopensolaris* | -plan9* \ | -amigaos* | -amigados* | -msdos* | -newsos* | -unicos* | -aof* \ | -aos* | -aros* \ | -nindy* | -vxsim* | -vxworks* | -ebmon* | -hms* | -mvs* \ @@ -1500,9 +1500,6 @@ -aros*) os=-aros ;; - -kaos*) - os=-kaos - ;; -zvmoe) os=-zvmoe ;; Index: config.guess === --- config.guess(revision 195037) +++ config.guess(working copy) @@ -4,7 +4,7 @@ # 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, # 2011, 2012, 2013 Free Software Foundation, Inc. -timestamp='2012-12-29' +timestamp='2012-12-30' # This file is free software; you can redistribute it and/or modify it # under the terms of the GNU General Public License as published by @@ -26,7 +26,7 @@ # program. This Exception is an additional permission under section 7 # of the GNU General Public License, version 3 (GPLv3). # -# Originally written by Per Bothner. +# Originally written by Per Bothner. 
# # You can get the latest version of this script from: # http://git.savannah.gnu.org/gitweb/?p=config.git;a=blob_plain;f=config.guess;hb=HEAD Kind regards, JBG -- Jan-Benedict Glaw jbg...@lug-owl.de +49-172-7608481 Signature of the second: 17:44 @uschebit An evangelist is just a salesperson for unsellable products, isn't he? (#korsett, 20120821)
Re: Support for MIPS r5900
IIRC we defined doubles as 32 bits wide in our old port. We simply didn't support 64-bit-wide doubles. I don't remember what mechanism we used to make this happen. Ah, yeah. I think limiting wide doubles would be good. I tried to disable dmult and ddiv (see mips.md). Disabling worked, but now muldi3 calls itself in libgcc2. I thought this should work, because I got this working with GCC 4.3, but the latest GCC version is a problem. multi3 calls muldi3, so muldi3 should be able to use mulsi3, because it is the same C code in libgcc2. Can someone give me some hints or comments? How can this be debugged? Not sure, sorry. IIRC I simply disabled muldi3_internal2 and I think we defined away everything related to TImode except register-register moves. @Jeff: I think you know the stringent copyright rules for GCC. I want to use the code from the original patch, but I don't know how many people were involved. So I can't use it without copyright problems. Can you please tell me which code I can use without encountering copyright problems? I plan to submit the code for fixing the r5900 short loop bug in GCC. AIUI the problem that Jürgen's hitting is that _muldi3.o in libgcc actually contains __multi3 on 64-bit targets, because LIBGCC2_UNITS_PER_WORD == 8. Presumably _mulsi3.o would then contain __muldi3 where needed, but that file doesn't exist. So he was trying to add it. Yes, this is exactly what happened. Best regards Jürgen
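As a rough illustration of the layering discussed above, where a double-word multiply is expected to fall back on the single-word one, here is a minimal sketch. It is not the libgcc2 source: `widening_mul32` and `muldi3_sketch` are hypothetical names, and the 32-bit word size is an assumption chosen only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the single-word widening multiply
// (32 x 32 -> 64 bits) that the next layer down would provide.
static inline std::uint64_t widening_mul32(std::uint32_t a, std::uint32_t b) {
    return static_cast<std::uint64_t>(a) * b;
}

// Hypothetical double-word multiply: returns the low 64 bits of a * b,
// built only from single-word multiplies, additions and shifts.
std::uint64_t muldi3_sketch(std::uint64_t a, std::uint64_t b) {
    const std::uint32_t a_lo = static_cast<std::uint32_t>(a);
    const std::uint32_t a_hi = static_cast<std::uint32_t>(a >> 32);
    const std::uint32_t b_lo = static_cast<std::uint32_t>(b);
    const std::uint32_t b_hi = static_cast<std::uint32_t>(b >> 32);

    // The low words contribute a full 64-bit partial product.
    const std::uint64_t low = widening_mul32(a_lo, b_lo);

    // The cross terms only affect the upper word; anything that would land
    // at or above bit 64 is discarded by the shift below.
    const std::uint64_t cross =
        widening_mul32(a_lo, b_hi) + widening_mul32(a_hi, b_lo);

    return low + (cross << 32);
}
```

The point of the sketch is the dependency direction: if the helper one level down is missing, or resolves back to the same routine, the double-word multiply ends up calling itself, which matches the symptom described in the thread.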
Define accessors for handler functions (Was [patch] update libstdc++ C++11 status table)
On 8 January 2013 21:33, Jonathan Wakely wrote: I was going to fix this missing feature, but as Richard just announced Stage 3 has ended I'll just update the docs and wait for trunk to reopen. * doc/xml/manual/status_cxx2011.xml: Document that N3189 is missing. Committed to trunk and 4.7. Here's a patch to implement N3189, in case I get hit by a bus before Stage 1. This adds get_new_handler, get_unexpected and get_terminate, and makes all stores and loads of the handlers atomic. I think this is safe and backwards compatible, because all stores and loads of the handlers happen in the library, but I'll wait for Stage 1. commit d9a509e2ea86b9689e1029f9d648edc5ada8196b Author: Jonathan Wakely jwakely@gmail.com Date: Tue Jan 8 19:50:13 2013 + * libsupc++/exception (get_terminate(), get_unexpected()): Declare. * libsupc++/eh_terminate.cc (get_terminate(), get_unexpected()): Define. (set_terminate(terminate_handler)): Set atomically. (set_unexpected(unexpected_handler)): Likewise. * libsupc++/new (get_new_handler()): Declare. * libsupc++/new_handler.cc (get_new_handler()): Define. (set_new_handler(new_handler)): Set atomically. (__new_handler): Use internal linkage. * libsupc++/new_op.cc (operator new): Use get_new_handler(). * libsupc++/new_opnt.cc (operator new): Likewise. * libsupc++/Makefile.am: Compile above files with -std=gnu++11. * libsupc++/Makefile.in: Regenerate. * config/abi/pre/gnu.ver: Add new exports. * testsuite/18_support/headers/exception/synopsis.cc: Check accessors for handlers. * testsuite/18_support/headers/new/synopsis.cc: Likewise. * testsuite/18_support/new_handler.cc: New. * testsuite/18_support/terminate_handler.cc: New. * testsuite/18_support/unexpected_handler.cc: New. 
diff --git a/libstdc++-v3/config/abi/pre/gnu.ver b/libstdc++-v3/config/abi/pre/gnu.ver index 35b4c44..9c13e1a 100644 --- a/libstdc++-v3/config/abi/pre/gnu.ver +++ b/libstdc++-v3/config/abi/pre/gnu.ver @@ -1345,6 +1345,15 @@ GLIBCXX_3.4.18 { # std::bad_function_call::what() _ZNKSt17bad_function_call4whatEv; +# std::get_new_handler() +_ZSt15get_new_handlerv; + +# std::get_terminate() +_ZSt13get_terminatev; + +# std::get_unexpected() +_ZSt14get_unexpectedv; + } GLIBCXX_3.4.17; # Symbols in the support library (libsupc++) have their own tag. diff --git a/libstdc++-v3/libsupc++/Makefile.am b/libstdc++-v3/libsupc++/Makefile.am index a019bd8..0207746 100644 --- a/libstdc++-v3/libsupc++/Makefile.am +++ b/libstdc++-v3/libsupc++/Makefile.am @@ -108,31 +108,51 @@ cp-demangle.o: cp-demangle.c $(C_COMPILE) -DIN_GLIBCPP_V3 -Wno-error -c $< -# Use special rules for the C++0x sources so that the proper flags are passed. +# Use special rules for the C++11 sources so that the proper flags are passed. eh_ptr.lo: eh_ptr.cc - $(LTCXXCOMPILE) -std=gnu++0x -c $< + $(LTCXXCOMPILE) -std=gnu++11 -c $< eh_ptr.o: eh_ptr.cc - $(CXXCOMPILE) -std=gnu++0x -c $< + $(CXXCOMPILE) -std=gnu++11 -c $< + +eh_terminate.lo: eh_terminate.cc + $(LTCXXCOMPILE) -std=gnu++11 -c $< +eh_terminate.o: eh_terminate.cc + $(CXXCOMPILE) -std=gnu++11 -c $< eh_throw.lo: eh_throw.cc - $(LTCXXCOMPILE) -std=gnu++0x -c $< + $(LTCXXCOMPILE) -std=gnu++11 -c $< eh_throw.o: eh_throw.cc - $(CXXCOMPILE) -std=gnu++0x -c $< + $(CXXCOMPILE) -std=gnu++11 -c $< guard.lo: guard.cc - $(LTCXXCOMPILE) -std=gnu++0x -c $< + $(LTCXXCOMPILE) -std=gnu++11 -c $< guard.o: guard.cc - $(CXXCOMPILE) -std=gnu++0x -c $< + $(CXXCOMPILE) -std=gnu++11 -c $< atexit_thread.lo: atexit_thread.cc - $(LTCXXCOMPILE) -std=gnu++0x -c $< + $(LTCXXCOMPILE) -std=gnu++11 -c $< atexit_thread.o: atexit_thread.cc - $(CXXCOMPILE) -std=gnu++0x -c $< + $(CXXCOMPILE) -std=gnu++11 -c $< nested_exception.lo: nested_exception.cc - $(LTCXXCOMPILE) -std=gnu++0x -c $< + $(LTCXXCOMPILE) -std=gnu++11 
-c $< nested_exception.o: nested_exception.cc - $(CXXCOMPILE) -std=gnu++0x -c $< + $(CXXCOMPILE) -std=gnu++11 -c $< + +new_handler.lo: new_handler.cc + $(LTCXXCOMPILE) -std=gnu++11 -c $< +new_handler.o: new_handler.cc + $(CXXCOMPILE) -std=gnu++11 -c $< + +new_op.lo: new_op.cc + $(LTCXXCOMPILE) -std=gnu++11 -c $< +new_op.o: new_op.cc + $(CXXCOMPILE) -std=gnu++11 -c $< + +new_opnt.lo: new_opnt.cc + $(LTCXXCOMPILE) -std=gnu++11 -c $< +new_opnt.o: new_opnt.cc + $(CXXCOMPILE) -std=gnu++11 -c $< # AM_CXXFLAGS needs to be in each subdirectory so that it can be # modified in a per-library or per-sub-library way. Need to manually diff --git a/libstdc++-v3/libsupc++/eh_terminate.cc b/libstdc++-v3/libsupc++/eh_terminate.cc index 87359d0..35e43c2 100644 --- a/libstdc++-v3/libsupc++/eh_terminate.cc +++ b/libstdc++-v3/libsupc++/eh_terminate.cc @@ -1,7
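For readers who have not used the N3189 accessors, the following is a minimal usage sketch written against the standard C++11 interface (`std::get_new_handler`, `std::get_terminate`). It is not code from the patch or its testsuite; the handler and helper names are made up for illustration. `std::get_unexpected` is left out on purpose because later C++ standards removed it.

```cpp
#include <cassert>
#include <cstdlib>
#include <exception>
#include <new>

namespace {
// Hypothetical handlers, present only so there is something to install.
void sketch_new_handler() { throw std::bad_alloc(); }
void sketch_terminate_handler() { std::abort(); }
}  // namespace

// Installs a new-handler, checks that the accessor reports it, then
// restores whatever was installed before.
bool new_handler_round_trips() {
    const std::new_handler previous = std::set_new_handler(sketch_new_handler);
    const bool installed = (std::get_new_handler() == &sketch_new_handler);
    std::set_new_handler(previous);
    return installed && std::get_new_handler() == previous;
}

// Same round trip for the terminate handler.
bool terminate_handler_round_trips() {
    const std::terminate_handler previous =
        std::set_terminate(sketch_terminate_handler);
    const bool installed = (std::get_terminate() == &sketch_terminate_handler);
    std::set_terminate(previous);
    return installed && std::get_terminate() == previous;
}
```

Before these accessors existed, the only portable way to read the current handler was to call the setter with a dummy value and immediately set it back, which leaves a window in which another thread can observe the wrong handler. That is presumably why the patch pairs the new accessors with atomic loads and stores.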
*ping* [patch, Fortran] PR 55806 - Inefficient ANY with array constructors
Ping**2? This was submitted before the review, so I think it should still be OK. Ping? http://gcc.gnu.org/ml/fortran/2013-01/msg0.html Hello world, the attached patch replaces ANY(a, b, c) with a .or. b .or. c, leading to reduced execution time. It also handles ALL, PRODUCT and SUM. This fixes a bug noted by Michael Metcalf. Regression-tested. OK for trunk?
Re: [C++ Patch] PR 55801
Hi, On 01/08/2013 09:00 PM, Jason Merrill wrote: I think I'd rather handle this by returning false from var_needs_tls_wrapper. Ah Ok. Then what about the below? Thanks, Paolo. // /cp 2013-01-09 Paolo Carlini paolo.carl...@oracle.com PR c++/55801 * decl2.c (var_needs_tls_wrapper): Return false when error_operand_p of the argument is true. /testsuite 2013-01-09 Paolo Carlini paolo.carl...@oracle.com PR c++/55801 * g++.dg/tls/thread_local-ice.C: New. Index: cp/decl2.c === --- cp/decl2.c (revision 195043) +++ cp/decl2.c (working copy) @@ -1,7 +1,7 @@ /* Process declarations and variables for C++ compiler. Copyright (C) 1988, 1992, 1993, 1994, 1995, 1996, 1997, 1998, - 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2007, 2008, 2009, 2010, - 2011, 2012 Free Software Foundation, Inc. + 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2007-2013 + Free Software Foundation, Inc. Hacked by Michael Tiemann (tiem...@cygnus.com) This file is part of GCC. @@ -2807,7 +2807,8 @@ var_defined_without_dynamic_init (tree var) static bool var_needs_tls_wrapper (tree var) { - return (DECL_THREAD_LOCAL_P (var) + return (!error_operand_p (var) + && DECL_THREAD_LOCAL_P (var) && !DECL_GNU_TLS_P (var) && !DECL_FUNCTION_SCOPE_P (var) && !var_defined_without_dynamic_init (var)); Index: testsuite/g++.dg/tls/thread_local-ice.C === --- testsuite/g++.dg/tls/thread_local-ice.C (revision 0) +++ testsuite/g++.dg/tls/thread_local-ice.C (working copy) @@ -0,0 +1,6 @@ +// PR c++/55801 +// { dg-options "-std=c++11" } +// { dg-require-effective-target tls } + +class C; +thread_local C O, O2 = O; // { dg-error "incomplete" }
RE: [PATCH] Fix PR54659, include gmp.h from system.h
-Original Message- From: H.J. Lu [mailto:hjl.to...@gmail.com] Sent: Tuesday, January 08, 2013 1:50 PM To: Richard Biener; Iyer, Balaji V Cc: gcc-patches@gcc.gnu.org; Diego Novillo; i...@google.com Subject: Re: [PATCH] Fix PR54659, include gmp.h from system.h On Fri, Dec 21, 2012 at 5:14 AM, Richard Biener rguent...@suse.de wrote: After the Ada fix, including gmp.h from system.h is trivial (by means of double-int.h it was included from almost everywhere already) Bootstrapped on x86_64-unknown-linux-gnu, testing in progress. Ok for trunk? (Go frontend unfixed) Thanks, Richard. 2012-12-21 Richard Biener rguent...@suse.de PR bootstrap/54659 * system.h: Include gmp.h. * tree-ssa-loop-niter.c: Do not include gmp.h here. * double-int.h: Likewise. * realmpfr.h: Likewise. fortran/ * gfortran.h: Do not include gmp.h here. This breaks bootstrap with in-tree gmp since system.h is included for host feature test by gcc/configure. Before gmp is built, gmp.h isn't available. We need to build gmp first before configuring gcc. Balaji, can you try this patch? Yes, this seems to do the trick in my initial runs! Is this already committed to the trunk? If so, I will just merge my branch (cilkplus) with the trunk. Otherwise I have to manually apply it. Thanks, Balaji V. Iyer. -- H.J. -- 2013-01-08 H.J. Lu hongjiu...@intel.com * Makefile.def (configure-gcc): Depend on all-gmp. (all-gcc): Remove dependency on all-gmp. * Makefile.in: Regenerated.
[committed] Update extract and insert patterns on PA
The attached change updates the extract and insert patterns on HP PA. It removes the deprecated extzv, extv and insv expanders, replacing them with SI and DI mode variants. Operand predicates are tightened in these and related insn patterns. Predicates are also tightened in some shift patterns. Two new shift patterns are introduced which I plan to use to support misaligned loads. Tested on hppa2.0w-hp-hpux11.11 and hppa64-hp-hpux11.11 with no observed regressions. Committed to trunk. Dave -- J. David Anglin dave.ang...@nrc-cnrc.gc.ca National Research Council of Canada (613) 990-0752 (FAX: 952-6602) 2013-01-08 John David Anglin dave.ang...@nrc-cnrc.gc.ca * config/pa/pa.h (VAL_U6_BITS_P): Define. (INT_U6_BITS): Likewise. * config/pa/predicates.md (uint6_operand): New predicate. (shift5_operand, shift6_operand): Likewise. * config/pa/pa.md (lshrsi3, rotrsi3): Use shift5_operand instead of arith32_operand. (lshrdi3): Use shift6_operand. (shrpsi4, shrpdi4): New insn patterns. (extzv): Delete expander. (extzvsi, extzvdi): New expanders. Use uint5_operand and uint6_operand predicates in unnamed zero extract patterns. Tighten common constraint. (extv): Delete expander. (extvsi, extvdi): New expanders. Use uint5_operand and uint6_operand predicates in unnamed sign extract patterns. Tighten common constraint. (insv): Delete expander. (insvsi, insvdi): New expanders. Use uint5_operand and uint6_operand predicates in unnamed insert patterns. Tighten common constraint. Change uint32_operand predicate to uint6_operand predicate in unnamed DImode pattern to insert constant values of type 1...1. 
Index: config/pa/pa.h === --- config/pa/pa.h (revision 195003) +++ config/pa/pa.h (working copy) @@ -956,6 +956,9 @@ #define VAL_U5_BITS_P(X) ((unsigned HOST_WIDE_INT)(X) < 0x20) #define INT_U5_BITS(X) VAL_U5_BITS_P (INTVAL (X)) +#define VAL_U6_BITS_P(X) ((unsigned HOST_WIDE_INT)(X) < 0x40) +#define INT_U6_BITS(X) VAL_U6_BITS_P (INTVAL (X)) + #define VAL_11_BITS_P(X) ((unsigned HOST_WIDE_INT)(X) + 0x400 < 0x800) #define INT_11_BITS(X) VAL_11_BITS_P (INTVAL (X)) Index: config/pa/predicates.md === --- config/pa/predicates.md (revision 195003) +++ config/pa/predicates.md (working copy) @@ -32,6 +32,13 @@ (match_test "INT_U5_BITS (op)"))) ;; Return truth value of whether OP is an integer which fits the range +;; constraining 6-bit unsigned immediate operands in three-address insns. + +(define_predicate "uint6_operand" + (and (match_code "const_int") + (match_test "INT_U6_BITS (op)"))) + +;; Return truth value of whether OP is an integer which fits the range ;; constraining 11-bit signed immediate operands in three-address insns. (define_predicate "int11_operand" @@ -125,6 +132,20 @@ (ior (match_operand 0 "register_operand") (match_code "const_int"))) +;; Return truth value of whether OP can be used as a shift operand in +;; a shift insn that accepts registers of mode MODE or 5-bit shift amounts. + +(define_predicate "shift5_operand" + (ior (match_operand 0 "register_operand") + (match_operand 0 "uint5_operand"))) + +;; Return truth value of whether OP can be used as a shift operand in +;; a shift insn that accepts registers of mode MODE or 6-bit shift amounts. + +(define_predicate "shift6_operand" + (ior (match_operand 0 "register_operand") + (match_operand 0 "uint6_operand"))) + ;; True iff OP can be used as an operand in an adddi3 insn. 
(define_predicate adddi3_operand Index: config/pa/pa.md === --- config/pa/pa.md (revision 195003) +++ config/pa/pa.md (working copy) @@ -6584,7 +6584,7 @@ (define_insn lshrsi3 [(set (match_operand:SI 0 register_operand =r,r) (lshiftrt:SI (match_operand:SI 1 register_operand r,r) -(match_operand:SI 2 arith32_operand q,n)))] +(match_operand:SI 2 shift5_operand q,n)))] @ {vshd %%r0,%1,%0|shrpw %%r0,%1,%%sar,%0} @@ -6595,7 +6595,7 @@ (define_insn lshrdi3 [(set (match_operand:DI 0 register_operand =r,r) (lshiftrt:DI (match_operand:DI 1 register_operand r,r) -(match_operand:DI 2 arith32_operand q,n)))] +(match_operand:DI 2 shift6_operand q,n)))] TARGET_64BIT @ shrpd %%r0,%1,%%sar,%0 @@ -6603,10 +6603,40 @@ [(set_attr type shift) (set_attr length 4)]) +; Shift right pair word 0 to 31 bits. +(define_insn shrpsi4 + [(set (match_operand:SI 0 register_operand =r,r) + (ior:SI (ashift:SI (match_operand:SI 1 register_operand r,r) + (minus:SI (const_int 32) +
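The range-check macros in the pa.h hunk rely on a single unsigned comparison to test both bounds at once. The sketch below illustrates that idiom. It assumes the comparison operators lost from the archived text are `<`, and it uses `std::int64_t` as a stand-in for `HOST_WIDE_INT`; the function names are hypothetical and do not appear in the patch.

```cpp
#include <cassert>
#include <cstdint>

// True when x lies in [0, 2^bits). A negative x becomes a huge value once
// converted to unsigned, so one comparison covers both bounds.
// Assumes 1 <= bits <= 63.
constexpr bool fits_unsigned_bits(std::int64_t x, unsigned bits) {
    return static_cast<std::uint64_t>(x) < (std::uint64_t(1) << bits);
}

// True when x lies in [-2^(bits-1), 2^(bits-1)). Adding the bias 2^(bits-1)
// maps that signed range onto [0, 2^bits), after which the same unsigned
// comparison applies. Assumes 1 <= bits <= 63.
constexpr bool fits_signed_bits(std::int64_t x, unsigned bits) {
    return static_cast<std::uint64_t>(x) + (std::uint64_t(1) << (bits - 1))
           < (std::uint64_t(1) << bits);
}
```

With `bits == 6` the first function corresponds to the 6-bit unsigned check (values 0 to 63), and with `bits == 11` the second corresponds to the biased `+ 0x400` versus `0x800` check (values -1024 to 1023).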
RE: [Patch, AARCH64] Define REVERSIBLE_CC_MODE
Hi Marcus, Thanks for reviewing the patch. Given the comment about SELECT_CC_MODE and the implementation of aarch64_select_cc_mode this definition of REVERSIBLE_CC_MODE looks broken to me. Please find attached the modified patch that does not allow floating point inequality comparisons in REVERSIBLE_CC_MODE. Please review the patch and let me know if it's OK. Built and tested on aarch64-thunder-elf (using Cavium's internal simulator). No new regressions. Thanks, Naveen.H.S 2013-01-09 Naveen H.S naveen.hurugalaw...@caviumnetworks.com * config/aarch64/aarch64.h (REVERSIBLE_CC_MODE): Define. --- gcc/config/aarch64/aarch64.h 2013-01-09 09:55:56.315992619 +0530 +++ gcc/config/aarch64/aarch64.h 2013-01-09 09:55:29.175991698 +0530 @@ -709,6 +709,14 @@ do { \ #define SELECT_CC_MODE(OP, X, Y) aarch64_select_cc_mode (OP, X, Y) +/* A C expression whose value is one if it is always safe to reverse a + comparison whose mode is MODE. If `SELECT_CC_MODE' can ever return + MODE for a floating-point inequality comparison, then + `REVERSIBLE_CC_MODE (MODE)' must be zero. + You need not define this macro if it would always returns zero or if the + floating-point format is anything other than `IEEE_FLOAT_FORMAT'. */ +#define REVERSIBLE_CC_MODE(MODE) ((MODE) != CCFPEmode) + #define REVERSE_CONDITION(CODE, MODE) \ (((MODE) == CCFPmode || (MODE) == CCFPEmode) \ ? reverse_condition_maybe_unordered (CODE) \
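The reason floating-point inequality comparisons cannot simply be reversed is that IEEE NaNs compare as unordered. The following sketch demonstrates the effect in plain C++. It is only an illustration of the arithmetic, not of GCC internals, all function names are made up, and it assumes IEEE semantics (in particular, no `-ffast-math`).

```cpp
#include <cassert>
#include <limits>

// The original comparison.
bool less_than(double a, double b) { return a < b; }

// Naive reversal: treats "not less than" as "greater than or equal",
// which silently ignores the unordered case.
bool reversed_ignoring_unordered(double a, double b) { return a >= b; }

// Correct reversal: "unordered, or greater than or equal".
bool reversed_maybe_unordered(double a, double b) { return !(a < b); }

// Convenience helper returning a quiet NaN.
double quiet_nan() { return std::numeric_limits<double>::quiet_NaN(); }
```

For ordinary operands the two reversed forms agree. When either operand is a NaN, both `a < b` and `a >= b` are false, so the naive reversal is no longer the logical negation of the original test; only the form that accounts for the unordered case is.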
[Patch, i386]: Set avx128 preferred vector mode for btver2
Hi Maintainers, Please find below a patch that sets the preferred vectorization mode for btver2 to avx128. We tested some benchmarks and found that enabling 128-bit AVX instruction generation by the auto vectorizer performs better than the default 256-bit AVX. Completed bootstrap and make -k check passes. Ok for trunk? Gcc/ChangeLog --- 2013-01-09 Venkataramanan Kumar venkataramanan.ku...@amd.com * config/i386/i386.c (initial_ix86_tune_features): Set up X86_TUNE_AVX128_OPTIMAL for m_BTVER2. Index: i386.c === --- i386.c (revision 194833) +++ i386.c (working copy) @@ -2014,7 +2014,7 @@ /* X86_TUNE_AVX128_OPTIMAL: Enable 128-bit AVX instruction generation for the auto-vectorizer. */ - m_BDVER, + m_BDVER | m_BTVER2, /* X86_TUNE_REASSOC_INT_TO_PARALLEL: Try to produce parallel computations during reassociation of integer computation. */ Regards, Venkat.
Re: Support for MIPS r5900
On 01/08/2013 03:49 PM, Jürgen Urban wrote: @Jeff: I think you know the stringent copyright rules for GCC. I want to use the code from the original patch, but I don't know how many people were involved. So I can't use it without copyright problems. Can you please tell me which code I can use without encountering copyright problems? I plan to submit the code for fixing the r5900 short loop bug in GCC. If you're using something from the Cygnus port, then it would be covered by the blanket copyright assignment Cygnus had in place with the FSF. If there are any doubts about the origin of a hunk of GCC code I could probably pull out the old sources to determine if it came from Cygnus's code base or not. Jeff
Re: [Patch, i386]: Set avx128 preferred vector mode for btver2
On Wed, Jan 9, 2013 at 6:05 AM, Kumar, Venkataramanan venkataramanan.ku...@amd.com wrote: Please find below a patch that sets the preferred vectorization mode for btver2 to avx128. We tested some benchmarks and found that enabling 128-bit AVX instruction generation by the auto vectorizer performs better than the default 256-bit AVX. Completed bootstrap and make -k check passes. Ok for trunk? Gcc/ChangeLog --- 2013-01-09 Venkataramanan Kumar venkataramanan.ku...@amd.com * config/i386/i386.c (initial_ix86_tune_features): Set up X86_TUNE_AVX128_OPTIMAL for m_BTVER2. OK. Thanks, Uros.
Re: [PATCH] Adding target rdos to GCC
On Tue, Jan 8, 2013 at 9:32 PM, Leif Ekblad l...@rdos.net wrote: After some new suggestions from Uros, I've made a new diff. Change log: * config/gthr.m4: Added rdos thread header. * gcc/config/i386/i386.c: Provided a way to define a default setting for medium memory model and PIC using TARGET_RDOS define. * gcc/config/i386/i386.h: Defined TARGET_RDOS macro. Defined default value for large-data-threshold. * gcc/config/i386/i386.md: Added r14 and r15 register names. * gcc/config/i386/i386.opt: Changed initial value for large-data-threshold to DEFAULT_LARGE_SECTION_THRESHOLD. * gcc/config/i386/rdos.h: Common definitions for target rdos * gcc/config/i386/rdos64.h: Specific definitions for 64-bit rdos target. * gcc/config.gcc: Added rdos targets Please use present tense in ChangeLog (see many other entries). Tested on target rdos and rdos32. This is OK for mainline, but the patch needs explicit approval from Release Managers at this stage. Thanks, Uros.