Ping: Make 128 bits the default vector size for NEON
http://gcc.gnu.org/ml/gcc-patches/2011-03/msg02172.html The last version: ChangeLog: * doc/invoke.texi (preferred-vector-size): Document. * params.h (PREFERRED_VECTOR_SIZE): Define. * config/arm/arm.c (arm_preferred_simd_mode): Use param PREFERRED_VECTOR_SIZE instead of TARGET_NEON_VECTORIZE_QUAD. Make 128 bits the default. (arm_autovectorize_vector_sizes): Likewise. * config/arm/arm.opt (NEON_VECTORIZE_QUAD): Add RejectNegative. * params.def (PARAM_PREFERRED_VECTOR_SIZE): Define. testsuite/ChangeLog: * lib/target-supports.exp (check_effective_target_vect_multiple_sizes): New procedure. (add_options_for_quad_vectors): Replace with ... (add_options_for_double_vectors): ... this. * gfortran.dg/vect/pr19049.f90: Expect more printings on targets that support multiple vector sizes since the vectorizer attempts to vectorize with both vector sizes. * gcc.dg/vect/slp-reduc-6.c, gcc.dg/vect/no-vfa-vect-79.c, gcc.dg/vect/no-vfa-vect-102a.c, gcc.dg/vect/vect-outer-1a.c, gcc.dg/vect/vect-outer-1b.c, gcc.dg/vect/vect-outer-2b.c, gcc.dg/vect/vect-outer-3a.c, gcc.dg/vect/no-vfa-vect-37.c, gcc.dg/vect/vect-outer-3b.c, gcc.dg/vect/no-vfa-vect-101.c, gcc.dg/vect/no-vfa-vect-102.c, gcc.dg/vect/vect-reduc-dot-s8b.c, gcc.dg/vect/vect-outer-1.c, gcc.dg/vect/vect-104.c: Likewise. * gcc.dg/vect/vect-16.c: Rename to... * gcc.dg/vect/no-fast-math-vect-16.c: ... this to ensure that it runs without -ffast-math. * gcc.dg/vect/vect-42.c: Run with 64 bit vectors if applicable. * gcc.dg/vect/vect-multitypes-6.c, gcc.dg/vect/vect-52.c, gcc.dg/vect/vect-54.c, gcc.dg/vect/vect-46.c, gcc.dg/vect/vect-48.c, gcc.dg/vect/vect-96.c, gcc.dg/vect/vect-multitypes-3.c, gcc.dg/vect/vect-40.c: Likewise. * gcc.dg/vect/vect-outer-5.c: Remove quad-vectors option as redundant. * gcc.dg/vect/vect-109.c, gcc.dg/vect/vect-peel-1.c, gcc.dg/vect/vect-peel-2.c, gcc.dg/vect/slp-25.c, gcc.dg/vect/vect-multitypes-1.c, gcc.dg/vect/slp-3.c, gcc.dg/vect/no-vfa-pr29145.c, gcc.dg/vect/vect-multitypes-4.c: Likewise. * gcc.dg/vect/vect.exp: Run no-fast-math-vect*.c tests with -fno-fast-math. Thanks, Ira Index: doc/invoke.texi === --- doc/invoke.texi (revision 171723) +++ doc/invoke.texi (working copy) @@ -8874,6 +8874,10 @@ The maximum number of conditional stores paires th if either vectorization (@option{-ftree-vectorize}) or if-conversion (@option{-ftree-loop-if-convert}) is disabled. The default is 2. +@item preferred-vector-size +Preferred vector size in bits for targets that support multiple vector sizes. +Invalid values are ignored. The default is 128. + @end table @end table Index: params.h === --- params.h(revision 171723) +++ params.h(working copy) @@ -204,6 +204,8 @@ extern void init_param_values (int *params); PARAM_VALUE (PARAM_PREFETCH_MIN_INSN_TO_MEM_RATIO) #define MIN_NONDEBUG_INSN_UID \ PARAM_VALUE (PARAM_MIN_NONDEBUG_INSN_UID) +#define PREFERRED_VECTOR_SIZE \ + PARAM_VALUE (PARAM_PREFERRED_VECTOR_SIZE) #define MAX_STORES_TO_SINK \ PARAM_VALUE (PARAM_MAX_STORES_TO_SINK) #endif /* ! GCC_PARAMS_H */ Index: testsuite/lib/target-supports.exp === --- testsuite/lib/target-supports.exp (revision 171723) +++ testsuite/lib/target-supports.exp (working copy) @@ -3203,6 +3203,24 @@ proc check_effective_target_vect_strided_wide { } return $et_vect_strided_wide_saved } +# Return 1 if the target supports multiple vector sizes + +proc check_effective_target_vect_multiple_sizes { } { +global et_vect_multiple_sizes + +if [info exists et_vect_multiple_sizes_saved] { +verbose check_effective_target_vect_multiple_sizes: using cached result 2 +} else { +set et_vect_multiple_sizes_saved 0 +if { ([istarget arm*-*-*] [check_effective_target_arm_neon]) } { + set et_vect_multiple_sizes_saved 1 +} +} + +verbose check_effective_target_vect_multiple_sizes: returning $et_vect_multiple_sizes_saved 2 +return $et_vect_multiple_sizes_saved +} + # Return 1 if the target supports section-anchors proc check_effective_target_section_anchors { } { @@ -3585,9 +3603,9 @@ proc add_options_for_bind_pic_locally { flags } { # Add to FLAGS the flags needed to enable 128-bit vectors. -proc add_options_for_quad_vectors { flags } { +proc add_options_for_double_vectors { flags } { if [is-effective-target arm_neon_ok] { - return $flags -mvectorize-with-neon-quad + return $flags --param preferred-vector-size=64 } return $flags Index: testsuite/gfortran.dg/vect/pr19049.f90 === ---
Re: [PATCH] use build_function_type_list in the avr backend
2011/4/20 Nathan Froyd froy...@codesourcery.com: As $SUBJECT suggests. Tested with cross to avr-elf. OK to commit? -Nathan * config/avr/avr.c (avr_init_builtins): Call build_function_type_list instead of build_function_type. Please, commit. Denis.
[Patch, Fortran, committed] PR 18918 - Fix max-rank check for coarrays
Committed as obvious (Rev. 172812). Tobias Index: gcc/testsuite/gfortran.dg/coarray_18.f90 === --- gcc/testsuite/gfortran.dg/coarray_18.f90 (revision 0) +++ gcc/testsuite/gfortran.dg/coarray_18.f90 (revision 0) @@ -0,0 +1,39 @@ +! { dg-do compile } +! { dg-options -fcoarray=single } +! +! Prevent ICE when exceeding the maximal number of allowed +! dimensions (normal + codimensions). +! +! Fortran 2008 allows (co)arrays with 15 ranks +! Currently, gfortran only supports 7, cf. PR 37577 +! Thus, the program is valid Fortran 2008 ... +! +! See also general coarray PR 18918 +! +! Test case taken from Leibniz-Rechenzentrum (LRZ)'s +! fortran_tests with thanks to Reinhold Bader. +! + +program ar + implicit none + integer :: ic(2)[*] + integer :: id(2,2)[2,*] + integer :: ie(2,2,2)[2,2,*] + integer :: ig(2,2,2,2)[2,2,2,*] ! { dg-error has more than 7 dimensions } + integer :: ih(2,2,2,2,2)[2,2,2,2,*] ! { dg-error has more than 7 dimensions } + integer :: ij(2,2,2,2,2,2)[2,2,2,2,2,*] ! { dg-error has more than 7 dimensions } + integer :: ik(2,2,2,2,2,2,2)[2,2,2,2,2,2,*] ! { dg-error has more than 7 dimensions } + integer :: il[2,2,2,2,2,2,2,*] ! { dg-error has more than 7 dimensions } + integer :: im[2,2,2,2,2,2,2,2,*] ! { dg-error has more than 7 dimensions } + integer :: in[2,2,2,2,2,2,2,2,2,*] ! { dg-error has more than 7 dimensions } + integer :: io[2,2,2,2,2,2,2,2,2,2,*] ! { dg-error has more than 7 dimensions } + real :: x2(2,2,4)[2,*] + complex :: c2(4,2)[2,*] + double precision :: d2(1,5,9)[2,*] + character(len=1) :: ch2(2)[2,*] + character(len=2) :: ch22(-5:4)[2,*] + logical :: l2(17)[2,*] + if (this_image() == 1) then + write(*,*) 'OK' + end if +end program Index: gcc/testsuite/ChangeLog === --- gcc/testsuite/ChangeLog (revision 172811) +++ gcc/testsuite/ChangeLog (working copy) @@ -1,3 +1,8 @@ +2011-04-21 Tobias Burnus bur...@net-b.de + + PR fortran/18918 + * gfortran.dg/coarray_18.f90: New. + 2011-04-20 Jason Merrill ja...@redhat.com * g++.dg/cpp0x/initlist47.C: New. Index: gcc/fortran/array.c === --- gcc/fortran/array.c (revision 172811) +++ gcc/fortran/array.c (working copy) @@ -576,6 +576,13 @@ goto cleanup; } + if (as-rank = GFC_MAX_DIMENSIONS) +{ + gfc_error (Array specification at %C has more than %d + dimensions, GFC_MAX_DIMENSIONS); + goto cleanup; +} + for (;;) { as-corank++; @@ -644,7 +651,7 @@ goto cleanup; } - if (as-corank = GFC_MAX_DIMENSIONS) + if (as-rank + as-corank = GFC_MAX_DIMENSIONS) { gfc_error (Array specification at %C has more than %d dimensions, GFC_MAX_DIMENSIONS); Index: gcc/fortran/ChangeLog === --- gcc/fortran/ChangeLog (revision 172811) +++ gcc/fortran/ChangeLog (working copy) @@ -1,3 +1,8 @@ +2011-04-21 Tobias Burnus bur...@net-b.de + + PR fortran/18918 + * array.c (gfc_match_array_spec): Fix maximal rank(+corank) check. + 2011-04-20 Jim Meyering meyer...@redhat.com * expr.c (free_expr0): Remove useless if-before-free.
Re: [PATCH] use build_function_type_list in the s390 backend
On Wed, Apr 20, 2011 at 9:43 PM, Nathan Froyd froy...@codesourcery.com wrote: As $SUBJECT suggests. Tested with cross to s390-linux-gnu. OK to commit? Those kind of patches are pretty obvious, so if $target maintainers do not comment within 48h consider them approved. Thanks, Richard. -Nathan * config/s390/s390.c (s390_init_builtins): Call build_function_type_list instead of build_function_type. diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c index caee077..adacfa3 100644 --- a/gcc/config/s390/s390.c +++ b/gcc/config/s390/s390.c @@ -9172,7 +9172,7 @@ s390_init_builtins (void) { tree ftype; - ftype = build_function_type (ptr_type_node, void_list_node); + ftype = build_function_type_list (ptr_type_node, NULL_TREE); add_builtin_function (__builtin_thread_pointer, ftype, S390_BUILTIN_THREAD_POINTER, BUILT_IN_MD, NULL, NULL_TREE);
Re: Ping^2 Re: Target header etc. cleanup patch
On Wed, Apr 20, 2011 at 11:09 PM, Joseph S. Myers jos...@codesourcery.com wrote: Ping^2. This patch http://gcc.gnu.org/ml/gcc-patches/2011-04/msg00130.html is still pending review. This version applies cleanly to current trunk. The s/struct rtx_def */rtx/ changes are all ok (in fact I'd say they are obvious). Richard. 2011-04-20 Joseph Myers jos...@codesourcery.com * config/alpha/alpha.c (struct machine_function): Use rtx, not struct rtx_def *. * config/bfin/bfin-protos.h (Mmode): Don't define. Expand definition where used. * config/bfin/bfin.h (bfin_cc_rtx, bfin_rets_rtx): Use rtx, not struct rtx_def *. * config/cris/cris-protos.h (STDIO_INCLUDED): Don't define. * config/fr30/fr30-protos.h (Mmode): Don't define. * config/fr30/fr30.h (inhibit_libc): Don't define. * config/h8300/h8300.h (struct cum_arg): Use rtx, not struct rtx_def *. * config/i386/cygming.h (union tree_node, TREE): Don't define or undefine. (FILE): Don't undefine. * config/iq2000/iq2000.h (struct iq2000_args): Use rtx, not struct rtx_def *. * config/m32c/m32c-protos.h (MM, UINT): Don't define. Expand definitions where used. * config/m32r/m32r-protos.h (Mmode): Don't define. Expand definition where used. * config/microblaze/microblaze.h (struct microblaze_args): Use rtx, not struct rtx_def *. * config/mn10300/mn10300-protos.h (Mmode, Cstar, Rclas): Don't define. Expand definitions where used. * config/pa/pa-protos.h (return_addr_rtx): Use rtx, not struct rtx_def *. * config/pa/pa.h (hppa_pic_save_rtx): Use rtx, not struct rtx_def *. * config/pdp11/pdp11.h (cc0_reg_rtx): Use rtx, not struct rtx_def *. * config/rx/rx-protos.h (Mmode, Fargs, Rcode): Don't define. Expand definitions where used. * config/rx/rx.c (rx_is_legitimate_address, rx_function_arg_size, rx_function_arg, rx_function_arg_advance, rx_function_arg_boundary): Expand definitions of those macros. * config/sh/sh-protos.h (sfunc_uses_reg, get_fpscr_rtx): Use rtx, not struct rtx_def *. * config/sh/sh.h (sh_compare_op0, sh_compare_op1): Use rtx, not struct rtx_def *. * config/spu/spu-protos.h (spu_float_const): Use rtx, not struct rtx_def *. * config/spu/spu.c (spu_float_const): Use rtx, not struct rtx_def *. * config/v850/v850-protos.h (Mmode): Don't define. Expand definition where used. * config/v850/v850.h (GHS_default_section_names, GHS_current_section_names): Use tree, not union tree_node *. Index: gcc/config/alpha/alpha.c === --- gcc/config/alpha/alpha.c (revision 172767) +++ gcc/config/alpha/alpha.c (working copy) @@ -4606,7 +4606,7 @@ struct GTY(()) machine_function const char *some_ld_name; /* For TARGET_LD_BUGGY_LDGP. */ - struct rtx_def *gp_save_rtx; + rtx gp_save_rtx; /* For VMS condition handlers. */ bool uses_condition_handler; Index: gcc/config/m32c/m32c-protos.h === --- gcc/config/m32c/m32c-protos.h (revision 172767) +++ gcc/config/m32c/m32c-protos.h (working copy) @@ -1,5 +1,5 @@ /* Target Prototypes for R8C/M16C/M32C - Copyright (C) 2005, 2007, 2008, 2010 + Copyright (C) 2005, 2007, 2008, 2010, 2011 Free Software Foundation, Inc. Contributed by Red Hat. @@ -19,12 +19,9 @@ along with GCC; see the file COPYING3. If not see http://www.gnu.org/licenses/. */ -#define MM enum machine_mode -#define UINT unsigned int - void m32c_conditional_register_usage (void); int m32c_const_ok_for_constraint_p (HOST_WIDE_INT, char, const char *); -UINT m32c_dwarf_frame_regnum (int); +unsigned int m32c_dwarf_frame_regnum (int); int m32c_eh_return_data_regno (int); void m32c_emit_epilogue (void); void m32c_emit_prologue (void); @@ -47,8 +44,8 @@ int m32c_trampoline_size (void); #ifdef RTX_CODE -int m32c_cannot_change_mode_class (MM, MM, int); -int m32c_class_max_nregs (int, MM); +int m32c_cannot_change_mode_class (enum machine_mode, enum machine_mode, int); +int m32c_class_max_nregs (int, enum machine_mode); rtx m32c_eh_return_stackadj_rtx (void); void m32c_emit_eh_epilogue (rtx); int m32c_expand_cmpstr (rtx *); @@ -60,20 +57,20 @@ void m32c_expand_neg_mulpsi3 (rtx *); int m32c_expand_setmemhi (rtx *); int m32c_extra_constraint_p (rtx, char, const char *); int m32c_extra_constraint_p2 (rtx, char, const char *); -int m32c_hard_regno_nregs (int, MM); -int m32c_hard_regno_ok (int, MM); +int m32c_hard_regno_nregs (int, enum machine_mode); +int m32c_hard_regno_ok (int, enum machine_mode); bool m32c_illegal_subreg_p (rtx); -bool
Re: [PATCH] use build_function_type_list in the spu backend
Nathan Froyd wrote: for (parm = 1; d-parm[parm] != SPU_BTI_END_OF_PARAMS; parm++) ; - p = void_list_node; + gcc_assert (parm = (SPU_MAX_ARGS_TO_BUILTIN + 1)); + + for (i = 0; i ARRAY_SIZE (args); i++) + args[i] = NULL_TREE; + while (parm 1) - p = tree_cons (NULL_TREE, spu_builtin_types[d-parm[--parm]], p); + { + tree arg = spu_builtin_types[d-parm[--parm]]; + args[parm-1] = arg; + } - p = build_function_type (spu_builtin_types[d-parm[0]], p); + ftype = build_function_type_list (spu_builtin_types[d-parm[0]], + args[0], args[1], args[2], NULL_TREE); This looks really odd now. The reason for running through parms twice, first forwards and then backwards, is just that you need build up the linked tree using tree_cons from the end. With the new scheme, all of that is now unnecessary. Why not simply something along the lines of: for (i = 0; i ARRAY_SIZE (args) d-parm[i] != SPU_BTI_END_OF_PARAMS; i++) args[i] = spu_builtin_types[d-parm[i]]; for (; i ARRAY_SIZE (args); i++) args[i] = NULL; ftype = build_function_type_list (args[0], args[1], args[2], args[3], NULL_TREE); (As an aside, the SPU_MAX_ARGS_TO_BUILTIN define appears to imply a generality that is not really there: the call to build_function_type in itself implies an upper bound on the number of supported arguments.) Bye, Ulrich -- Dr. Ulrich Weigand GNU Toolchain for Linux on System z and Cell BE ulrich.weig...@de.ibm.com
Re: [PATCH] use build_function_type_list in the s390 backend
Nathan Froyd wrote: * config/s390/s390.c (s390_init_builtins): Call build_function_type_list instead of build_function_type. This is OK. Bye, Ulrich -- Dr. Ulrich Weigand GNU Toolchain for Linux on System z and Cell BE ulrich.weig...@de.ibm.com
GCC 4.5.3 Status Report (2011-04-21), branch now frozen
Status == A first release candidate for GCC 4.5.3 is beeing made. The branch is now frozen until after the final 4.5.3 release. All changes require explicit release manager approval. Quality Data Priority # Change from Last Report --- --- P10 P2 112 - 7 P30 - 9 --- --- Total 112 - 5 Previous Report === http://gcc.gnu.org/ml/gcc/2010-12/msg00389.html The next report will be sent after GCC 4.5.3 has been released.
Re: Add an array_mode_supported_p target hook
To get back to this... Richard Sandiford richard.sandif...@linaro.org writes: Richard Guenther richard.guent...@gmail.com writes: On Thu, Mar 31, 2011 at 3:32 PM, Richard Sandiford richard.sandif...@linaro.org wrote: This patch adds an array_mode_supported_p hook, which says whether MAX_FIXED_MODE_SIZE should be ignored for a given type of array. It follows on from the discussion here: http://gcc.gnu.org/ml/gcc/2011-03/msg00342.html The intended use of the hook is to allow small arrays of vectors to have a non-BLK mode, and hence to be stored in rtl registers. These arrays are used both in the ARM arm_neon.h API and in the optabs proposed in: http://gcc.gnu.org/ml/gcc/2011-03/msg00322.html The tail end of the thread was about the definition of TYPE_MODE: #define TYPE_MODE(NODE) \ (TREE_CODE (TYPE_CHECK (NODE)) == VECTOR_TYPE \ ? vector_type_mode (NODE) : (NODE)-type.mode) with this outcome: http://gcc.gnu.org/ml/gcc/2011-03/msg00470.html To summarise my take on it: - The current definition of TYPE_MODE isn't sufficient even for vector modes and vector_mode_supported_p, because non-vector types can have vector modes. - We should no longer treat types as having one mode everywhere. We should instead replace TYPE_MODE with a function that takes a context. Tests of things like vector_mode_supported_p would move from layout_type to this new function. I think this patch fits within that scheme. array_mode_supported_p would be treated in the same way as vector_mode_supported_p. I realise the ideal would be to get rid of TYPE_MODE first. But that's going to be a longer-term thing. Now that there's at least a plan, I'd like to press ahead with the array stuff on the basis that (a) although the new hook won't work with the target attribute, our current mode handling doesn't work in just the same way. (b) the new hook doesn't interfere with the plan. (c) getting good code from the intrinsics (and support for these instructions in the vectoriser) is going to be much more important to most ARM users than the ability to turn Neon on and off for individual functions in a TU. To give an example of the difference, the Neon code posted here: http://hilbert-space.de/?p=22 produces this inner loop before the patch (but with http://gcc.gnu.org/ml/gcc-patches/2011-03/msg01996.html applied): .L3: vld3.8 {d16-d18}, [r1]! vstmia ip, {d16-d18} fldd d19, [sp, #24] adr r5, .L6 ldmia r5, {r4-r5} fldd d16, [sp, #32] vmov d18, r4, r5 @ v8qi vmull.u8 q9, d19, d18 adr r5, .L6+8 ldmia r5, {r4-r5} vmov d17, r4, r5 @ v8qi vstmia sp, {d18-d19} vmlal.u8 q9, d16, d17 fldd d16, [sp, #40] adr r5, .L6+16 ldmia r5, {r4-r5} vmov d17, r4, r5 @ v8qi vmlal.u8 q9, d16, d17 add r3, r3, #1 vshrn.i16 d16, q9, #8 cmp r3, r2 vst1.8 {d16}, [r0]! bne .L3 With both patches applied, the inner loop is: .L3: vld3.8 {d18-d20}, [r1]! vmull.u8 q8, d18, d21 vmlal.u8 q8, d19, d22 vmlal.u8 q8, d20, d23 add r3, r3, #1 vshrn.i16 d16, q8, #8 cmp r3, r2 vst1.8 {d16}, [r0]! bne .L3 Tested on arm-linux-gnueabi. OK to install? It looks reasonable given the past discussion, but - can you move forward with the Neon stuff a bit to see if it really fits? Or is this all that is needed for the load/store lane support as well (apart from vectorizer changes of course). Yeah, I have a prototype that hacks up some C support for generating the (otherwise internal-only) load/store built-in functions that the vectoriser is suppsoed to generate. This patch is all that seems to be needed for the types and optabs generation to work in the natural way. I'm happy to leave it until the vectoriser stuff is in a more submittable state though. The vectorisation stuff has now been approved and uses this hook to detect whether interleaved loads stores are supported. Also... Especially given: Can you check the code generated by for example float foo(char *p) { float a[2]; int i; ((char *)a)[0] = p[0]; ((char *)a)[1] = p[1]; ((char *)a)[2] = p[2]; ((char *)a)[3] = p[3]; ((char *)a)[4] = p[4]; ((char *)a)[5] = p[5]; ((char *)a)[6] = p[6]; ((char *)a)[7] = p[7]; return a[0] + a[1]; } for an array a that would get such a larger mode? Thus, check what happens with partial defs of different types (just to avoid ICEs like the ones Jakub was fixing yesterday). OK, I tried: #include arm_neon.h uint32x2_t foo(char *p) { uint32x2_t a[2]; int i; ((char *)a)[0] = p[0]; ((char *)a)[1] = p[1]; ((char *)a)[2] = p[2]; ((char *)a)[3] = p[3]; ((char
Re: GCC 4.5.3 Status Report (2011-04-21), branch now frozen
On Thursday 21 April 2011 11:30:49 Richard Guenther wrote: Status == A first release candidate for GCC 4.5.3 is beeing made. The branch is now frozen until after the final 4.5.3 release. All changes require explicit release manager approval. Quality Data Priority # Change from Last Report --- --- P10 P2 112 - 7 P30 - 9 --- --- Total 112 - 5 Total 112 - 16 Do you think nobody reads your reports? :-p
Re: Add an array_mode_supported_p target hook
On Thu, Apr 21, 2011 at 11:50 AM, Richard Sandiford richard.sandif...@linaro.org wrote: To get back to this... Richard Sandiford richard.sandif...@linaro.org writes: Richard Guenther richard.guent...@gmail.com writes: On Thu, Mar 31, 2011 at 3:32 PM, Richard Sandiford richard.sandif...@linaro.org wrote: This patch adds an array_mode_supported_p hook, which says whether MAX_FIXED_MODE_SIZE should be ignored for a given type of array. It follows on from the discussion here: http://gcc.gnu.org/ml/gcc/2011-03/msg00342.html The intended use of the hook is to allow small arrays of vectors to have a non-BLK mode, and hence to be stored in rtl registers. These arrays are used both in the ARM arm_neon.h API and in the optabs proposed in: http://gcc.gnu.org/ml/gcc/2011-03/msg00322.html The tail end of the thread was about the definition of TYPE_MODE: #define TYPE_MODE(NODE) \ (TREE_CODE (TYPE_CHECK (NODE)) == VECTOR_TYPE \ ? vector_type_mode (NODE) : (NODE)-type.mode) with this outcome: http://gcc.gnu.org/ml/gcc/2011-03/msg00470.html To summarise my take on it: - The current definition of TYPE_MODE isn't sufficient even for vector modes and vector_mode_supported_p, because non-vector types can have vector modes. - We should no longer treat types as having one mode everywhere. We should instead replace TYPE_MODE with a function that takes a context. Tests of things like vector_mode_supported_p would move from layout_type to this new function. I think this patch fits within that scheme. array_mode_supported_p would be treated in the same way as vector_mode_supported_p. I realise the ideal would be to get rid of TYPE_MODE first. But that's going to be a longer-term thing. Now that there's at least a plan, I'd like to press ahead with the array stuff on the basis that (a) although the new hook won't work with the target attribute, our current mode handling doesn't work in just the same way. (b) the new hook doesn't interfere with the plan. (c) getting good code from the intrinsics (and support for these instructions in the vectoriser) is going to be much more important to most ARM users than the ability to turn Neon on and off for individual functions in a TU. To give an example of the difference, the Neon code posted here: http://hilbert-space.de/?p=22 produces this inner loop before the patch (but with http://gcc.gnu.org/ml/gcc-patches/2011-03/msg01996.html applied): .L3: vld3.8 {d16-d18}, [r1]! vstmia ip, {d16-d18} fldd d19, [sp, #24] adr r5, .L6 ldmia r5, {r4-r5} fldd d16, [sp, #32] vmov d18, r4, r5 @ v8qi vmull.u8 q9, d19, d18 adr r5, .L6+8 ldmia r5, {r4-r5} vmov d17, r4, r5 @ v8qi vstmia sp, {d18-d19} vmlal.u8 q9, d16, d17 fldd d16, [sp, #40] adr r5, .L6+16 ldmia r5, {r4-r5} vmov d17, r4, r5 @ v8qi vmlal.u8 q9, d16, d17 add r3, r3, #1 vshrn.i16 d16, q9, #8 cmp r3, r2 vst1.8 {d16}, [r0]! bne .L3 With both patches applied, the inner loop is: .L3: vld3.8 {d18-d20}, [r1]! vmull.u8 q8, d18, d21 vmlal.u8 q8, d19, d22 vmlal.u8 q8, d20, d23 add r3, r3, #1 vshrn.i16 d16, q8, #8 cmp r3, r2 vst1.8 {d16}, [r0]! bne .L3 Tested on arm-linux-gnueabi. OK to install? It looks reasonable given the past discussion, but - can you move forward with the Neon stuff a bit to see if it really fits? Or is this all that is needed for the load/store lane support as well (apart from vectorizer changes of course). Yeah, I have a prototype that hacks up some C support for generating the (otherwise internal-only) load/store built-in functions that the vectoriser is suppsoed to generate. This patch is all that seems to be needed for the types and optabs generation to work in the natural way. I'm happy to leave it until the vectoriser stuff is in a more submittable state though. The vectorisation stuff has now been approved and uses this hook to detect whether interleaved loads stores are supported. Also... Especially given: Can you check the code generated by for example float foo(char *p) { float a[2]; int i; ((char *)a)[0] = p[0]; ((char *)a)[1] = p[1]; ((char *)a)[2] = p[2]; ((char *)a)[3] = p[3]; ((char *)a)[4] = p[4]; ((char *)a)[5] = p[5]; ((char *)a)[6] = p[6]; ((char *)a)[7] = p[7]; return a[0] + a[1]; } for an array a that would get such a larger mode? Thus, check what happens with partial defs of different types (just to avoid ICEs like the ones Jakub was fixing yesterday). OK, I tried: #include arm_neon.h uint32x2_t foo(char *p) { uint32x2_t a[2]; int i; ((char *)a)[0] =
Re: Ping^2 Re: Target header etc. cleanup patch
On Thu, 21 Apr 2011, Richard Guenther wrote: On Wed, Apr 20, 2011 at 11:09 PM, Joseph S. Myers jos...@codesourcery.com wrote: Ping^2. This patch http://gcc.gnu.org/ml/gcc-patches/2011-04/msg00130.html is still pending review. This version applies cleanly to current trunk. The s/struct rtx_def */rtx/ changes are all ok (in fact I'd say they are obvious). I have committed the approved parts (those, plus the cris and m32c changes) as r172818. The following remains pending review (target maintainers CC:ed). 2011-04-21 Joseph Myers jos...@codesourcery.com * config/bfin/bfin-protos.h (Mmode): Don't define. Expand definition where used. * config/fr30/fr30-protos.h (Mmode): Don't define. * config/fr30/fr30.h (inhibit_libc): Don't define. * config/i386/cygming.h (union tree_node, TREE): Don't define or undefine. (FILE): Don't undefine. * config/m32r/m32r-protos.h (Mmode): Don't define. Expand definition where used. * config/mn10300/mn10300-protos.h (Mmode, Cstar, Rclas): Don't define. Expand definitions where used. * config/rx/rx-protos.h (Mmode, Fargs, Rcode): Don't define. Expand definitions where used. * config/rx/rx.c (rx_is_legitimate_address, rx_function_arg_size, rx_function_arg, rx_function_arg_advance, rx_function_arg_boundary): Expand definitions of those macros. * config/v850/v850-protos.h (Mmode): Don't define. Expand definition where used. * config/v850/v850.h (GHS_default_section_names, GHS_current_section_names): Use tree, not union tree_node *. Index: gcc/config/m32r/m32r-protos.h === --- gcc/config/m32r/m32r-protos.h (revision 172818) +++ gcc/config/m32r/m32r-protos.h (working copy) @@ -1,5 +1,6 @@ /* Prototypes for m32r.c functions used in the md file elsewhere. - Copyright (C) 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2007, 2009, 2010 + Copyright (C) 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2007, 2009, 2010, + 2011 Free Software Foundation, Inc. This file is part of GCC. @@ -18,9 +19,8 @@ along with GCC; see the file COPYING3. If not see http://www.gnu.org/licenses/. */ -/* Function prototypes that cannot exist in v850.h due to dependency +/* Function prototypes that cannot exist in m32r.h due to dependency complications. */ -#define Mmode enum machine_mode extern void m32r_init (void); extern void m32r_init_expanders (void); @@ -51,16 +51,14 @@ extern rtxm32r_return_addr (int); extern rtxm32r_function_symbol (const char *); #ifdef HAVE_MACHINE_MODES -extern intcall_operand (rtx, Mmode); -extern intsmall_data_operand (rtx, Mmode); -extern intaddr24_operand (rtx, Mmode); -extern intaddr32_operand (rtx, Mmode); -extern intcall26_operand (rtx, Mmode); -extern intmemreg_operand (rtx, Mmode); -extern intsmall_insn_p (rtx, Mmode); +extern intcall_operand (rtx, enum machine_mode); +extern intsmall_data_operand (rtx, enum machine_mode); +extern intaddr24_operand (rtx, enum machine_mode); +extern intaddr32_operand (rtx, enum machine_mode); +extern intcall26_operand (rtx, enum machine_mode); +extern intmemreg_operand (rtx, enum machine_mode); +extern intsmall_insn_p (rtx, enum machine_mode); #endif /* HAVE_MACHINE_MODES */ #endif /* RTX_CODE */ - -#undef Mmode Index: gcc/config/i386/cygming.h === --- gcc/config/i386/cygming.h (revision 172818) +++ gcc/config/i386/cygming.h (working copy) @@ -1,7 +1,7 @@ /* Operating system specific defines to be used when targeting GCC for hosting on Windows32, using a Unix style C library and tools. Copyright (C) 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, - 2004, 2005, 2007, 2008, 2009, 2010 + 2004, 2005, 2007, 2008, 2009, 2010, 2011 Free Software Foundation, Inc. This file is part of GCC. @@ -176,9 +176,6 @@ along with GCC; see the file COPYING3. #undef LONG_TYPE_SIZE #define LONG_TYPE_SIZE 32 -union tree_node; -#define TREE union tree_node * - #define drectve_section() \ (fprintf (asm_out_file, \t.section .drectve\n), \ in_section = NULL) @@ -472,9 +469,3 @@ do {\ /* Static stack checking is supported by means of probes. */ #define STACK_CHECK_STATIC_BUILTIN 1 - -#undef TREE - -#ifndef BUFSIZ -# undef FILE -#endif Index: gcc/config/rx/rx-protos.h === --- gcc/config/rx/rx-protos.h (revision 172818) +++ gcc/config/rx/rx-protos.h (working copy) @@ -21,11 +21,6 @@ #ifndef GCC_RX_PROTOS_H #define GCC_RX_PROTOS_H -/* A few abbreviations to make the prototypes shorter. */ -#define Mmode enum machine_mode -#define Fargs CUMULATIVE_ARGS -#define Rcode enum rtx_code
Re: Ping^2 Re: Target header etc. cleanup patch
Joseph, mingw part is ok, too. Thanks, Kai
[PATCH][ARM] Thumb2 replicated constants
This patch is a repost of the one I previously posted here: http://gcc.gnu.org/ml/gcc-patches/2010-12/msg00652.html As requested, I've broken out the other parts of the original patch, and those have already been reposted yesterday (and one committed also). This (final) part is support for using Thumb2's replicated constants and addw/subw instructions as part of split constant loads. Previously the compiler could use these constants, but only where they would be loaded in a single instruction. This patch must be applied on top of the addw/subw patch I posted yesterday. The patch also optimizes the use of inverted or negated constants as a short-cut to the final value. The previous code did this in some cases, but could not be easily adapted to replicated constants. The previous code also had a bug that prevented optimal use of shifted constants in Thumb code by imposing the same restrictions as ARM code. This has been fixed. Example 1: addw as part of a split constant load a + 0xf Before: movwr3, #65535 ; 0x0 movtr3, 15 ; 0xf addsr3, r0, r3 After: add r0, r0, #1044480 ; 0xff000 addwr0, r0, #4095; 0x00fff Example 2: arbitrary shifts bug fix a - 0xfff1 Before: sub r0, r0, #65024 ; 0xfe00 sub r0, r0, #496 ; 0x01f0 sub r0, r0, #1 ; 0x0001 After: sub r0, r0, #65280 ; 0xff00 sub r0, r0, #241 ; 0x00f1 Example 3: 16-bit replicated patterns a + 0x44004401 Before: movwr3, #17409 ; 0x4401 movtr3, 17408 ; 0x4400 addsr3, r0, r3 After: add r0, r0, #1140868096 ; 0x44004400 addsr0, r0, #1 ; 0x0001 Example 4: 32-bit replicated patterns a 0xaa00 Before: mov r3, #43520 ; 0xaa00 movtr3, 43690; 0x and r3, r0, r3 After: and r0, r0, #-1431655766 ; 0x bic r0, r0, #170 ; 0x00aa The constant splitting code was duplicated in two places, and I would have needed to modify both quite heavily, so I have taken the opportunity to unify the two, and hopefully reduce the future maintenance burden. Let me respond to a point Richard Earnshaw raised following the original posting: A final note is that you may have missed some cases. Now that we have movw, reg ~(16-bit const) can now be done in at most 2 insns: movw t1, #16-bit const bic Rd, reg, t1 Actually, I think we can do better than that for a 16-bit constant. Given: a ~(0xabcd) Before my changes, GCC gave: bic r0, r0, #43520 bic r0, r0, #460 bic r0, r0, #1 and after applying my patch: bic r0, r0, #43776 bic r0, r0, #205 Two instructions and no temporary register. On thumb-2 you can also use ORN that way as well. It turns out that my previous patch was broken for ORN. I traced the problem to some confusing code already in arm.c that set can_invert for IOR, but then explicitly ignored it later (I had removed the second part, but not the first). I posted, and committed a patch to fix this yesterday. In fact ORN is only of limited use for this kind of thing. Like AND, you can't use multiple ORNs to build a constant. The compiler already does use ORN in some circumstances, and this patch has not changed that. Is the patch OK? Andrew 2011-04-21 Andrew Stubbs a...@codesourcery.com gcc/ * config/arm/arm.c: (count_insns_for_constant): Delete function. (find_best_start): Delete function. (optimal_immediate_sequence): New function. (optimal_immediate_sequence_1): New function. (arm_gen_constant): Move constant splitting code to optimal_immediate_sequence. Rewrite constant negation/invertion code. gcc/testsuite/ * gcc.target/arm/thumb2-replicated-constant1.c: New file. * gcc.target/arm/thumb2-replicated-constant2.c: New file. * gcc.target/arm/thumb2-replicated-constant3.c: New file. * gcc.target/arm/thumb2-replicated-constant4.c: New file. --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -129,7 +129,12 @@ static void thumb1_output_function_prologue (FILE *, HOST_WIDE_INT); static int arm_comp_type_attributes (const_tree, const_tree); static void arm_set_default_type_attributes (tree); static int arm_adjust_cost (rtx, rtx, rtx, int); -static int count_insns_for_constant (HOST_WIDE_INT, int); +static int optimal_immediate_sequence (enum rtx_code code, + unsigned HOST_WIDE_INT val, + int return_sequence[]); +static int optimal_immediate_sequence_1 (enum rtx_code code, + unsigned HOST_WIDE_INT val, + int return_sequence[], int i); static int arm_get_strip_length (int); static bool arm_function_ok_for_sibcall (tree, tree); static enum machine_mode arm_promote_function_mode
[PATCH] Handle returns in ref_maybe_used_by_stmt_p
With all returns now having virtual operands we can trivially arrive at them during alias walks. Instead of always returning true as we did sofar this patch makes us more precise, also handle the fact that a function return implicitly is a use for all values that escape (now, hopefully I will get to the point to compute and store separate points-to sets for variables that escape through function returns ...). This makes PR48702 be also exposed on trunk. Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk. Richard. 2011-04-20 Richard Guenther rguent...@suse.de * tree-ssa-alias.c (ref_maybe_used_by_stmt_p): Handle return statements. Index: gcc/tree-ssa-alias.c === *** gcc/tree-ssa-alias.c(revision 172773) --- gcc/tree-ssa-alias.c(working copy) *** ref_maybe_used_by_stmt_p (gimple stmt, t *** 1364,1369 --- 1364,1389 } else if (is_gimple_call (stmt)) return ref_maybe_used_by_call_p (stmt, ref); + else if (gimple_code (stmt) == GIMPLE_RETURN) + { + tree retval = gimple_return_retval (stmt); + tree base; + if (retval + TREE_CODE (retval) != SSA_NAME + !is_gimple_min_invariant (retval) + refs_may_alias_p (retval, ref)) + return true; + /* If ref escapes the function then the return acts as a use. */ + base = get_base_address (ref); + if (!base) + ; + else if (DECL_P (base)) + return is_global_var (base); + else if (TREE_CODE (base) == MEM_REF + || TREE_CODE (base) == TARGET_MEM_REF) + return ptr_deref_may_alias_global_p (TREE_OPERAND (base, 0)); + return false; + } return true; }
[PATCH] Fix gcc_release to not build GO tarballs on old branches
This changes maybe_build_tarfile to require that the first directory in the list actually exists. This avoids building a GO tarball on old branches where neither gcc/go nor libgo exist but libffi does. I have built the 4.5.3 release candidate with this patch applied. Looks ok? Thanks, Richard. 2011-04-21 Richard Guenther rguent...@suse.de * gcc_release (maybe_build_tarfile): Require existance of the first directory. Index: maintainer-scripts/gcc_release === --- maintainer-scripts/gcc_release (revision 172819) +++ maintainer-scripts/gcc_release (working copy) @@ -258,6 +258,7 @@ maybe_build_tarfile() { if [ -d $maybe_dir ]; then dir_exists=1 fi +break; done if [ $dir_exists = 1 ]; then build_tarfile $dest $@
Re: [PATCH] Fix gcc_release to not build GO tarballs on old branches
On Thu, Apr 21, 2011 at 01:36:53PM +0200, Richard Guenther wrote: This changes maybe_build_tarfile to require that the first directory in the list actually exists. This avoids building a GO tarball on old branches where neither gcc/go nor libgo exist but libffi does. I have built the 4.5.3 release candidate with this patch applied. If we want to just test the first directory (and I agree with that, the first directory is the language directory and in case of go libffi is duplicated), having a loop and extra var is IMHO way overkill. What about following instead? 2011-04-21 Jakub Jelinek ja...@redhat.com * gcc_release (maybe_build_tarfile): Don't build a tarfile if the first directory doesn't exist. --- maintainer-scripts/gcc_release.jj 2010-12-09 11:12:50.0 +0100 +++ maintainer-scripts/gcc_release 2011-04-21 13:51:52.471180071 +0200 @@ -253,13 +253,7 @@ build_tarfile() { maybe_build_tarfile() { dest=$1 shift - dir_exists=0 - for maybe_dir in $@; do -if [ -d $maybe_dir ]; then - dir_exists=1 -fi - done - if [ $dir_exists = 1 ]; then + if [ $# != 0 -a -d $1 ]; then build_tarfile $dest $@ else echo Not building $dest tarfile Jakub
Re: [PATCH] Fix gcc_release to not build GO tarballs on old branches
On Thu, 21 Apr 2011, Jakub Jelinek wrote: On Thu, Apr 21, 2011 at 01:36:53PM +0200, Richard Guenther wrote: This changes maybe_build_tarfile to require that the first directory in the list actually exists. This avoids building a GO tarball on old branches where neither gcc/go nor libgo exist but libffi does. I have built the 4.5.3 release candidate with this patch applied. If we want to just test the first directory (and I agree with that, the first directory is the language directory and in case of go libffi is duplicated), having a loop and extra var is IMHO way overkill. What about following instead? Works for me if it works (my shell fu is quite limited). Richard. 2011-04-21 Jakub Jelinek ja...@redhat.com * gcc_release (maybe_build_tarfile): Don't build a tarfile if the first directory doesn't exist. --- maintainer-scripts/gcc_release.jj 2010-12-09 11:12:50.0 +0100 +++ maintainer-scripts/gcc_release2011-04-21 13:51:52.471180071 +0200 @@ -253,13 +253,7 @@ build_tarfile() { maybe_build_tarfile() { dest=$1 shift - dir_exists=0 - for maybe_dir in $@; do -if [ -d $maybe_dir ]; then - dir_exists=1 -fi - done - if [ $dir_exists = 1 ]; then + if [ $# != 0 -a -d $1 ]; then build_tarfile $dest $@ else echo Not building $dest tarfile Jakub -- Richard Guenther rguent...@suse.de Novell / SUSE Labs SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746 - GF: Markus Rex
Re: [Patch,AVR]: Solve PR42210
Georg-Johann Lay schrieb: This solves some missed optimization that can be seen when moving around bits. There are 4 combiner patterns that operate on regs and one that uses them as intermediate patterns and works on I/O. Even if just an intermediate pattern matches it's still an improvement because avoid of shift. Tested on some home-brew example. Ok if I see no regressions? The original patch has a thinko. The source bit is always 0. Besides that, the (set (zero_extract is now allowed to handle constant sources, too. The new patch fixes that and passes testsuite. Johann 2011-04-20 Georg-Johann Lay a...@gjlay.de PR target/42210 * config/avr/avr.md (*movbitqi.1-6.a, *movbitqi.1-6.b, *movbitqi.0, *movbitqi.7, *movbitqi.io): New insns. Index: config/avr/avr.md === --- config/avr/avr.md (Revision 172770) +++ config/avr/avr.md (Arbeitskopie) @@ -3388,3 +3388,84 @@ (define_insn fmulsu clr __zero_reg__ [(set_attr length 3) (set_attr cc clobber)]) + + +;; Some combiner patterns dealing with bits. +;; See PR42210 + +;; Move bit $3.$4 into bit $0.$4 +(define_insn *movbitqi.1-6.a + [(set (match_operand:QI 0 register_operand=r) +(ior:QI (and:QI (match_operand:QI 1 register_operand 0) +(match_operand:QI 2 single_zero_operand n)) +(and:QI (ashift:QI (match_operand:QI 3 register_operand r) + (match_operand:QI 4 const_int_operand n)) +(match_operand:QI 5 single_one_operand n] + optimize +INTVAL(operands[4]) == exact_log2 (INTVAL(operands[5]) GET_MODE_MASK (QImode)) +INTVAL(operands[4]) == exact_log2 (~INTVAL(operands[2]) GET_MODE_MASK (QImode)) + bst %3,0\;bld %0,%4 + [(set_attr length 2) + (set_attr cc none)]) + +;; Move bit $3.0 into bit $0.$4 +;; Variation of above. Unfortunately, there is no canonicalized representation +;; of moving around bits. So what we see here depends on how user writes down +;; bit manipulations. +(define_insn *movbitqi.1-6.b + [(set (match_operand:QI 0 register_operand=r) +(ior:QI (and:QI (match_operand:QI 1 register_operand 0) +(match_operand:QI 2 single_zero_operand n)) +(ashift:QI (and:QI (match_operand:QI 3 register_operand r) + (const_int 1)) + (match_operand:QI 4 const_int_operand n] + optimize +INTVAL(operands[4]) == exact_log2 (~INTVAL(operands[2]) GET_MODE_MASK (QImode)) + bst %3,0\;bld %0,%4 + [(set_attr length 2) + (set_attr cc none)]) + +;; Move bit $3.0 into bit $0.0. +;; For bit 0, combiner generates slightly different pattern. +(define_insn *movbitqi.0 + [(set (match_operand:QI 0 register_operand =r) +(ior:QI (and:QI (match_operand:QI 1 register_operand 0) +(match_operand:QI 2 single_zero_operand n)) +(and:QI (match_operand:QI 3 register_operand r) +(const_int 1] + optimize +0 == exact_log2 (~INTVAL(operands[2]) GET_MODE_MASK (QImode)) + bst %3,0\;bld %0,0 + [(set_attr length 2) + (set_attr cc none)]) + +;; Move bit $2.0 into bit $0.7. +;; For bit 7, combiner generates slightly different pattern +(define_insn *movbitqi.7 + [(set (match_operand:QI 0 register_operand =r) +(ior:QI (and:QI (match_operand:QI 1 register_operand 0) +(const_int 127)) +(ashift:QI (match_operand:QI 2 register_operandr) + (const_int 7] + optimize + bst %2,0\;bld %0,7 + [(set_attr length 2) + (set_attr cc none)]) + +;; Combiner transforms above four pattern into ZERO_EXTRACT if it sees MEM +;; and input/output match. We provide a special pattern for this, because +;; in contrast to a IN/BST/BLD/OUT sequence we need less registers and the +;; operation on I/O is atomic. +(define_insn *movbitqi.io + [(set (zero_extract:QI (mem:QI (match_operand 0 low_io_address_operand n,n,n)) + (const_int 1) + (match_operand 1 const_int_operand n,n,n)) +(match_operand:QI 2 nonmemory_operandL,P,r))] + optimize +IN_RANGE (INTVAL(operands[1]), 0, 7) + @ + cbi %m0-0x20,%1 + sbi %m0-0x20,%1 + sbrc %2,0\;sbi %m0-0x20,%1\;sbrs %2,0\;cbi %m0-0x20,%1 + [(set_attr length 1,1,4) + (set_attr cc none)])
Re: [PATCH] Optimize (x * 8) | 5 and (x 3) ^ 3 to use lea (PR target/48688)
On Wed, Apr 20, 2011 at 10:27:36AM -0700, Richard Henderson wrote: On 04/20/2011 09:09 AM, Jakub Jelinek wrote: Hi! This splitter allows us to optimize (x {* {2,4,8}, {1,2,3}}) {|,^} y for constant integer y = {1ULL,3ULL,7ULL} using lea{l,q} (| or ^ in that case, when the low bits are known to be all 0, is like plus). Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2011-04-20 Jakub Jelinek ja...@redhat.com PR target/48688 * config/i386/i386.md (*lea_general_4): New define_insn_and_split. Any chance you could do this in combine instead? Shift-and-add patterns are a fairly common architectural feature... I've tried to do it in simplify-rtx.c, unfortunately combine.c does exactly opposite canonicalization and thus it results in endless recursion: /* If we are adding two things that have no bits in common, convert the addition into an IOR. This will often be further simplified, for example in cases like ((a 1) + (a 2)), which can become a 3. */ if (GET_MODE_BITSIZE (mode) = HOST_BITS_PER_WIDE_INT (nonzero_bits (XEXP (x, 0), mode) nonzero_bits (XEXP (x, 1), mode)) == 0) { /* Try to simplify the expression further. */ rtx tor = simplify_gen_binary (IOR, mode, XEXP (x, 0), XEXP (x, 1)); temp = combine_simplify_rtx (tor, mode, in_dest, 0); /* If we could, great. If not, do not go ahead with the IOR replacement, since PLUS appears in many special purpose address arithmetic instructions. */ if (GET_CODE (temp) != CLOBBER temp != tor) return temp; } So at least it can't be done in simplify_binary_operation_1. Jakub
[committed]: vms/ia64: avoid a crash if -mdebug-main is used without -g
Hi, this simple patch adds a guard to avoid a crash. As -mdebug-main directly calls dwarf2out, debug_info_level should be checked before. Committed on trunk. Tristan. 2011-04-21 Tristan Gingold ging...@adacore.com * config/ia64/ia64.c (ia64_start_function): Add a guard. --- gcc/config/ia64/ia64.c (revision 172821) +++ gcc/config/ia64/ia64.c (working copy) @@ -3542,6 +3542,7 @@ { #if VMS_DEBUGGING_INFO if (vms_debug_main + debug_info_level DINFO_LEVEL_NONE strncmp (vms_debug_main, fnname, strlen (vms_debug_main)) == 0) { targetm.asm_out.globalize_label (asm_out_file, VMS_DEBUG_MAIN_POINTER);
Re: [lto, testsuite] Don't use visibility on targets that don't support it (PR lto/47334)
Mike, [Could you please configure your mail client to break lines? It's hard to reply to messages all on a single line. Thanks.] On Apr 19, 2011, at 11:12 AM, Rainer Orth wrote: I've had a closer look now and think it's possible (and desirable) to define HAVE_GAS_HIDDEN for Darwin, too. But, they don't have the same thing, therefore, either, you loose out on the meaning, or, you must have yet another test that means the rest. In your email, you don't state that you want to loose out, nor do you state that you want an additional test, so, I am left wondering what you want. I'd say, forge ahead, the worst you can do is break everything... You'll either be careful and not break anything, or you will and someone will pipe up with the bit you broke. I'd like to think we have enough tests in the testsuite to ensure you can't silently break the major pieces. True, Darwin only supports a subset of the whole set of visibilites supported by ELF targets, and even amoung ELF targets there are differences. The testsuite already deals with that alright, and the vast majority of uses inside the compiler proper only use VISIBILITY_(DEFAULT|HIDDEN|INTERNAL) that are supported by Darwin, too. I plan both to check if there are problematic cases that assume more from the target, and of course will rely on a bootstrap and regtest to detect problems. It's probably best to wait for the tested patch and judge from there. If all else fails, one could sprinkle TARGET_MACHO over the affected places, but that can only be a very last resort IMO. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
[vms/committed]: avoid wrong output by vmsdbgout.c for long path
Hi, vmsdbgout could generate buggy records for long path. This happened while cross-compiling. Committed on trunk. Tristan. 2011-04-21 Tristan Gingold ging...@adacore.com * vmsdbgout.c (write_srccorr): Compute file length from the string. (dst_file_info_struct): Remove flen field. (lookup_filename): Remove code that set flen field. Index: gcc/vmsdbgout.c === --- gcc/vmsdbgout.c (revision 172821) +++ gcc/vmsdbgout.c (working copy) @@ -71,7 +71,6 @@ long ebk; short ffb; char rfo; - char flen; } dst_file_info_entry; @@ -932,7 +931,7 @@ int src_command_size; int linesleft = file_info_entry.max_line; int linestart = file_info_entry.listing_line_start; - int flen = file_info_entry.flen; + int flen = strlen (file_info_entry.file_name); int linestodo = 0; DST_SOURCE_CORR src_header; DST_SRC_COMMAND src_command; @@ -977,7 +976,7 @@ src_command.dst_a_src_cmd_fields.dst_a_src_decl_src.dst_b_src_df_rms_rfo = file_info_entry.rfo; src_command.dst_a_src_cmd_fields.dst_a_src_decl_src.dst_b_src_df_filename -= file_info_entry.flen; += flen; src_header.dst_a_source_corr_header.dst__header_length.dst_w_length = DST_K_SOURCE_CORR_HEADER_SIZE + src_command_size - 1; @@ -1332,7 +1331,6 @@ register char *fn; register unsigned i; const char *fnam; - char flen; long long cdt = 0; long ebk = 0; short ffb = 0; @@ -1341,7 +1339,6 @@ int ver = 0; fnam = full_name (file_name); - flen = strlen (fnam); /* Check to see if the file name that was searched on the previous call matches this file name. If so, return the index. */ @@ -1386,7 +1383,6 @@ file_info_table[file_info_table_in_use].ebk = ebk; file_info_table[file_info_table_in_use].ffb = ffb; file_info_table[file_info_table_in_use].rfo = rfo; - file_info_table[file_info_table_in_use].flen = flen; last_file_lookup_index = file_info_table_in_use++; return last_file_lookup_index;
[PATCH][1/n] Alias housekeeping
Remove INDIRECT_REF times code, don't use the strange SSA_VAR_P predicate. Simple cleanups, catched a case where we still build an INDIRECT_REF. Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk. Richard. 2011-04-21 Richard Guenther rguent...@suse.de * tree-ssa-alias.c (ptr_deref_may_alias_decl_p): Handle MEM_REF and TARGET_MEM_REF, do not care about INDIRECT_REFs. Use DECL_P, not SSA_VAR_P. (ptr_derefs_may_alias_p): Likewise. (ptr_deref_may_alias_ref_p_1): Likewise. (decl_refs_may_alias_p): Likewise. (refs_may_alias_p_1): Likewise. (ref_maybe_used_by_call_p_1): Likewise. (call_may_clobber_ref_p_1): Likewise. (indirect_ref_may_alias_decl_p): Assume indirect refrences are either MEM_REF or TARGET_MEM_REF. (indirect_refs_may_alias_p): Likewise. * calls.c (emit_call_1): Build a MEM_REF instead of an INDIRECT_REF for MEM_EXPR of indirect calls. Index: gcc/tree-ssa-alias.c === *** gcc/tree-ssa-alias.c(revision 172820) --- gcc/tree-ssa-alias.c(working copy) *** ptr_deref_may_alias_decl_p (tree ptr, tr *** 196,206 { tree base = get_base_address (TREE_OPERAND (ptr, 0)); if (base ! (INDIRECT_REF_P (base) ! || TREE_CODE (base) == MEM_REF)) ptr = TREE_OPERAND (base, 0); else if (base ! SSA_VAR_P (base)) return base == decl; else if (base CONSTANT_CLASS_P (base)) --- 196,206 { tree base = get_base_address (TREE_OPERAND (ptr, 0)); if (base ! (TREE_CODE (base) == MEM_REF ! || TREE_CODE (base) == TARGET_MEM_REF)) ptr = TREE_OPERAND (base, 0); else if (base ! DECL_P (base)) return base == decl; else if (base CONSTANT_CLASS_P (base)) *** ptr_derefs_may_alias_p (tree ptr1, tree *** 281,291 { tree base = get_base_address (TREE_OPERAND (ptr1, 0)); if (base ! (INDIRECT_REF_P (base) ! || TREE_CODE (base) == MEM_REF)) ptr1 = TREE_OPERAND (base, 0); else if (base ! SSA_VAR_P (base)) return ptr_deref_may_alias_decl_p (ptr2, base); else return true; --- 281,291 { tree base = get_base_address (TREE_OPERAND (ptr1, 0)); if (base ! (TREE_CODE (base) == MEM_REF ! || TREE_CODE (base) == TARGET_MEM_REF)) ptr1 = TREE_OPERAND (base, 0); else if (base ! DECL_P (base)) return ptr_deref_may_alias_decl_p (ptr2, base); else return true; *** ptr_derefs_may_alias_p (tree ptr1, tree *** 294,304 { tree base = get_base_address (TREE_OPERAND (ptr2, 0)); if (base ! (INDIRECT_REF_P (base) ! || TREE_CODE (base) == MEM_REF)) ptr2 = TREE_OPERAND (base, 0); else if (base ! SSA_VAR_P (base)) return ptr_deref_may_alias_decl_p (ptr1, base); else return true; --- 294,304 { tree base = get_base_address (TREE_OPERAND (ptr2, 0)); if (base ! (TREE_CODE (base) == MEM_REF ! || TREE_CODE (base) == TARGET_MEM_REF)) ptr2 = TREE_OPERAND (base, 0); else if (base ! DECL_P (base)) return ptr_deref_may_alias_decl_p (ptr1, base); else return true; *** ptr_deref_may_alias_ref_p_1 (tree ptr, a *** 338,347 { tree base = ao_ref_base (ref); ! if (INDIRECT_REF_P (base) ! || TREE_CODE (base) == MEM_REF) return ptr_derefs_may_alias_p (ptr, TREE_OPERAND (base, 0)); ! else if (SSA_VAR_P (base)) return ptr_deref_may_alias_decl_p (ptr, base); return true; --- 338,347 { tree base = ao_ref_base (ref); ! if (TREE_CODE (base) == MEM_REF ! || TREE_CODE (base) == TARGET_MEM_REF) return ptr_derefs_may_alias_p (ptr, TREE_OPERAND (base, 0)); ! else if (DECL_P (base)) return ptr_deref_may_alias_decl_p (ptr, base); return true; *** decl_refs_may_alias_p (tree base1, *** 688,694 tree base2, HOST_WIDE_INT offset2, HOST_WIDE_INT max_size2) { ! gcc_assert (SSA_VAR_P (base1) SSA_VAR_P (base2)); /* If both references are based on different variables, they cannot alias. */ if (base1 != base2) --- 688,694 tree base2, HOST_WIDE_INT offset2, HOST_WIDE_INT max_size2) { ! gcc_checking_assert (DECL_P (base1) DECL_P (base2)); /* If both references are based on different variables, they cannot alias. */ if (base1 != base2) *** indirect_ref_may_alias_decl_p (tree ref1 *** 720,743
Re: [PATCH] Don't use ./tmp0 for site.exp generation
Richard Guenther rguent...@suse.de writes: Perhaps it would be better to use site.tmp to match current git automake? I can certainly use any other temporary name, just the tmp0 use is annoying ;) It's a matter of a single search-and-replace. Although this isn't exactly part of the testsuite itself, given that site.tmp is used in all automake-generated Makefile.in's, I'd say go ahead. gcc/Makefile.in seems to be the only (hand-written) Makefile.in that uses something else. Thanks. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [RFC] OpenMP 3.1 atomics
On Thu, Mar 03, 2011 at 05:54:32PM +0100, Jakub Jelinek wrote: This patch is a WIP patch for OpenMP 3.1 atomics, so far for C FE only. It handles parsing (I created 3 new tree codes for atomic read, atomic capture of the old and of the new value), gimplification and omp expansion thereof, but currently both atomic read and atomic write are implemented always pretty expensively using __sync_val_compare_and_swap. I think we probably want to implement them as volatile read or write (or perhaps just non-volatile read with some barriers around?), but probably need some target hook to tell us what loads/stores aren't actually atomic (say on old alpha char/short/int loads and stores aren't atomic). I've now committed the patch as is to gomp-3_1-branch, so that I can continue working on C++ and Fortran 3.1 atomics parsing. Comments/suggestions about the omp-low.c implementation of atomic reads and writes are welcome. Here is ChangeLog for the committed bits. 2011-04-21 Jakub Jelinek ja...@redhat.com * c-parser.c (c_parser_omp_atomic): Handle parsing OpenMP 3.1 atomics. Adjust c_finish_omp_atomic caller. * tree.def (OMP_ATOMIC_READ, OMP_ATOMIC_CAPTURE_OLD, OMP_ATOMIC_CAPTURE_NEW): New. * gimple.h (GF_OMP_ATOMIC_NEED_VALUE): New. (gimple_omp_atomic_need_value_p, gimple_omp_atomic_set_need_value): New inlines. * gimplify.c (gimplify_omp_atomic, gimplify_expr): Handle OMP_ATOMIC_READ, OMP_ATOMIC_CAPTURE_OLD and OMP_ATOMIC_CAPTURE_NEW. * omp-low.c (expand_omp_atomic_load, expand_omp_atomic_store): New functions. (expand_omp_atomic_fetch_op): Handle cases where old or new value is needed afterwards. (expand_omp_atomic): Call expand_omp_atomic_load resp. expand_omp_atomic_store. * tree-pretty-print.c (dump_generic_node): Handle OMP_ATOMIC_READ, OMP_ATOMIC_CAPTURE_OLD and OMP_ATOMIC_CAPTURE_NEW. * c-common.h (c_finish_omp_atomic): Adjust prototype. * c-omp.c (c_finish_omp_atomic): Add OPCODE, V and LHS1 arguments. Handle OMP_ATOMIC_READ, OMP_ATOMIC_CAPTURE_OLD and OMP_ATOMIC_CAPTURE_NEW in addition to OMP_ATOMIC. * semantics.c (finish_omp_atomic): Adjust c_finish_omp_atomic caller. * testsuite/libgomp.c/atomic-11.c: New test. * testsuite/libgomp.c/atomic-12.c: New test. Jakub
Re: [PATCH] Don't use ./tmp0 for site.exp generation
On Thu, 21 Apr 2011, Rainer Orth wrote: Richard Guenther rguent...@suse.de writes: Perhaps it would be better to use site.tmp to match current git automake? I can certainly use any other temporary name, just the tmp0 use is annoying ;) It's a matter of a single search-and-replace. Although this isn't exactly part of the testsuite itself, given that site.tmp is used in all automake-generated Makefile.in's, I'd say go ahead. gcc/Makefile.in seems to be the only (hand-written) Makefile.in that uses something else. Committed using site.tmp. Richard.
Re: better wpa [1/n]: merge types during read-in
Hi, On Wed, 20 Apr 2011, Michael Matz wrote: It would have been nice to have the top-level tree merging as a separate patch, as I am not convinced it is correct, but see below ... I'll split it out. Like so (also including the other remarks). Regstrapping on x86_64-linux in progress. Ciao, Michael. * lto-streamer.c (lto_streamer_cache_insert_1): Accept to override other trees that just builtins. (lto_record_common_node): Don't leave NULL TYPE_CANONICAL. lto/ * lto.c (toplevel): Include tree-flow.h. (lto_read_in_decl_state): Don't merge types here. (tree_with_vars): New static hash table. (remember_with_vars): New static functions. (LTO_FIXUP_TYPE): New macro. (lto_ft_common, lto_ft_decl_minimal, lto_ft_decl_common, lto_ft_decl_with_vis, lto_ft_decl_non_common, lto_ft_function, lto_ft_field_decl, lto_ft_type, lto_ft_binfo, lto_ft_constructor, lto_ft_expr, lto_fixup_types, uniquify_nodes): New static functions. (lto_read_decls): Uniquify while reading in trees. (lto_fixup_data_t, LTO_FIXUP_SUBTREE, LTO_REGISTER_TYPE_AND_FIXUP_SUBTREE, no_fixup_p, lto_fixup_common, lto_fixup_decl_minimal, lto_fixup_decl_common, lto_fixup_decl_with_vis, lto_fixup_decl_non_common, lto_fixup_function, lto_fixup_field_decl, lto_fixup_type, lto_fixup_binfo, lto_fixup_constructor, lto_fixup_tree): Remove. (lto_fixup_state): Remove data argument. Use lto_symtab_prevailing_decl. (LTO_SET_PREVAIL, LTO_NO_PREVAIL): New macros. (lto_fixup_prevailing_decls): New function. (lto_fixup_state_aux): Argument aux is unused. (lto_fixup_decls): Don't allocate pointer sets, don't use lto_fixup_tree, use lto_fixup_prevailing_decls. (read_cgraph_and_symbols): Allocate and remove tree_with_vars. * Make-lang.in (lto/lto.o): Depend on $(TREE_FLOW_H). Index: lto-streamer.c === *** lto-streamer.c (revision 172769) --- lto-streamer.c (working copy) *** lto_streamer_cache_insert_1 (struct lto_ *** 383,401 { /* If the caller wants to insert T at a specific slot location, and ENTRY-TO does not match *IX_P, add T to !the requested location slot. This situation arises when !streaming builtin functions. ! !For instance, on the writer side we could have two !FUNCTION_DECLS T1 and T2 that are represented by the same !builtin function. The reader will only instantiate the !canonical builtin, but since T1 and T2 had been !originally stored in different cache slots (S1 and S2), !the reader must be able to find the canonical builtin !function at slots S1 and S2. */ ! gcc_assert (lto_stream_as_builtin_p (t)); ix = *ix_p; - lto_streamer_cache_add_to_node_array (cache, ix, t); } --- 383,390 { /* If the caller wants to insert T at a specific slot location, and ENTRY-TO does not match *IX_P, add T to !the requested location slot. */ ix = *ix_p; lto_streamer_cache_add_to_node_array (cache, ix, t); } *** lto_record_common_node (tree *nodep, VEC *** 513,518 --- 502,509 TYPE_CANONICAL (node) = NULL_TREE; node = gimple_register_type (node); TYPE_CANONICAL (node) = gimple_register_canonical_type (node); + if (in_lto_p) + TYPE_CANONICAL (*nodep) = TYPE_CANONICAL (node); *nodep = node; } Index: lto/lto.c === *** lto/lto.c (revision 172769) --- lto/lto.c (working copy) *** along with GCC; see the file COPYING3. *** 24,29 --- 24,30 #include opts.h #include toplev.h #include tree.h + #include tree-flow.h #include diagnostic-core.h #include tm.h #include cgraph.h *** lto_read_in_decl_state (struct data_in * *** 215,228 tree *decls = ggc_alloc_vec_tree (size); for (j = 0; j size; j++) ! { ! decls[j] = lto_streamer_cache_get (data_in-reader_cache, data[j]); ! ! /* Register every type in the global type table. If the !type existed already, use the existing type. */ ! if (TYPE_P (decls[j])) ! decls[j] = gimple_register_type (decls[j]); ! } state-streams[i].size = size; state-streams[i].trees = decls; --- 216,222 tree *decls = ggc_alloc_vec_tree (size); for (j = 0; j size; j++) ! decls[j] = lto_streamer_cache_get (data_in-reader_cache, data[j]); state-streams[i].size = size; state-streams[i].trees = decls; *** lto_read_in_decl_state (struct data_in *
Re: [patch, fortran] PR 48405 - Front end expressions in DO loops
Am 19.04.2011 20:35, schrieb Thomas Koenig: Hello world, this patch fixes the enhancement PR, plus probably a few regressions. The basic problem was that the code walker got confused when *c, the pointer to the current gfc_code statement, was changed by inserting additional code. Currently regression-testing. OK for trunk if the tests pass? Regression-testing passed. Ping ** 0.25? Thomas
Re: [PATCH] use build_function_type_list in the frv backend
Hi Nathan, * config/frv/frv.c (frv_init_builtins): Delete `endlink' variable. Call builtin_function_type_list instead of builtin_function_type. (UNARY, BINARY, TRINARY, QUAD): Likewise. Approved - please apply. Cheers Nick
Re: [PATCH] use build_function_type_list in the iq2000 backend
Hi Nathan, * config/iq2000/i2000.c (iq2000_init_builtins): Call build_function_type_list instead of build_function_type. Delete `endlink' variable. Approved - please apply. Cheers Nick
Re: [PATCH] use build_function_type_list in the stormy16 backend
Hi Nathan, * config/stormy16/stormy16 (xstormy16_init_builtins): Call build_function_type_list instead of build_function_type. Rearrange initialization of `args' to do so. Approved - please apply. Cheers Nick
Re: [patch, fortran] PR 48405 - Front end expressions in DO loops
On Thursday 21 April 2011 16:02:18 Thomas Koenig wrote: Am 19.04.2011 20:35, schrieb Thomas Koenig: Hello world, this patch fixes the enhancement PR, plus probably a few regressions. The basic problem was that the code walker got confused when *c, the pointer to the current gfc_code statement, was changed by inserting additional code. Currently regression-testing. OK for trunk if the tests pass? Regression-testing passed. Ping ** 0.25? Thomas OK. Thanks Mikael
Fix ICE during nested functions lowering
The attached testcase triggers an ICE during nested functions lowering, a regression present on the mainline and 4.6 branch. The middle-end is trying to create an object which must be created by the front-end: /* If the type is of variable size or a type which must be created by the frontend, something is wrong. Note that we explicitly allow incomplete types here, since we create them ourselves here. */ gcc_assert (!TREE_ADDRESSABLE (type)); This is a fallout of tuplification. It turns out that the real issue was fixed back in July by Richard: 2010-07-07 Richard Guenther rguent...@suse.de * tree-ssa-propagate.h (valid_gimple_call_p): Remove. * tree-ssa-propagate.c (valid_gimple_call_p): Make static. Fix. * gimple.h (is_gimple_operand): Remove. * gimple.c (is_gimple_operand): Likewise. (walk_gimple_op): Fix wi-val_only setting for calls. * tree-cfg.c (verify_gimple_call): Fix argument validation. * tree-profile.c (tree_gen_ic_func_profiler): Do not create invalid gimple calls. but the walk_gimple_op change contains a couple of oversights. Bootstrapped/regtested on x86_64-suse-linux, applied on the mainline and 4.6 branch as obvious. 2011-04-21 Eric Botcazou ebotca...@adacore.com * gimple.c (walk_gimple_op) GIMPLE_CALL: Fix couple of oversights. 2011-04-21 Eric Botcazou ebotca...@adacore.com * gnat.dg/volatile5.adb: New test. * gnat.dg/volatile5_pkg.ads: New helper. -- Eric Botcazou Index: gimple.c === --- gimple.c (revision 172811) +++ gimple.c (working copy) @@ -1464,7 +1464,8 @@ walk_gimple_op (gimple stmt, walk_tree_f for (i = 0; i gimple_call_num_args (stmt); i++) { if (wi) - wi-val_only = is_gimple_reg_type (gimple_call_arg (stmt, i)); + wi-val_only + = is_gimple_reg_type (TREE_TYPE (gimple_call_arg (stmt, i))); ret = walk_tree (gimple_call_arg_ptr (stmt, i), callback_op, wi, pset); if (ret) @@ -1476,7 +1477,8 @@ walk_gimple_op (gimple stmt, walk_tree_f if (wi) { wi-is_lhs = true; - wi-val_only = is_gimple_reg_type (gimple_call_lhs (stmt)); + wi-val_only + = is_gimple_reg_type (TREE_TYPE (gimple_call_lhs (stmt))); } ret = walk_tree (gimple_call_lhs_ptr (stmt), callback_op, wi, pset); -- { dg-do compile } with Volatile5_Pkg; use Volatile5_Pkg; procedure Volatile5 is A : Rec; procedure Proc is begin A := F; end; begin Proc; end; package Volatile5_Pkg is type Rec is record I : Integer; end record; pragma Volatile(Rec); function F return Rec; end Volatile5_Pkg; --
[PATCH] Fix PR48703
This fixes the fallout of not re-setting the set_decl_assembler_name langhook with -flto. Bootstrapped and tested on x86_64-unknown-linux-gnu, SPEC2k6 build tested, installed. Richard. 2011-04-21 Richard Guenther rguent...@suse.de PR lto/48703 * tree.c (free_lang_data_in_decl): Do not zero TREE_TYPE of DECL_NAME. * g++.dg/lto/pr48207-2_0.C: New testcase. * g++.dg/lto/pr48207-3_0.C: Likewise. Index: gcc/tree.c === *** gcc/tree.c (revision 172820) --- gcc/tree.c (working copy) *** free_lang_data_in_decl (tree decl) *** 4561,4570 TREE_LANG_FLAG_5 (decl) = 0; TREE_LANG_FLAG_6 (decl) = 0; - /* Identifiers need not have a type. */ - if (DECL_NAME (decl)) - TREE_TYPE (DECL_NAME (decl)) = NULL_TREE; - free_lang_data_in_one_sizepos (DECL_SIZE (decl)); free_lang_data_in_one_sizepos (DECL_SIZE_UNIT (decl)); if (TREE_CODE (decl) == FIELD_DECL) --- 4561,4566 Index: gcc/testsuite/g++.dg/lto/pr48207-2_0.C === *** gcc/testsuite/g++.dg/lto/pr48207-2_0.C (revision 0) --- gcc/testsuite/g++.dg/lto/pr48207-2_0.C (revision 0) *** *** 0 --- 1,10 + // { dg-lto-do link } + // { dg-lto-options { { -flto -g } } } + + namespace { + typedef struct { + int x; + } Foo; + } + + int main () {} Index: gcc/testsuite/g++.dg/lto/pr48207-3_0.C === *** gcc/testsuite/g++.dg/lto/pr48207-3_0.C (revision 0) --- gcc/testsuite/g++.dg/lto/pr48207-3_0.C (revision 0) *** *** 0 --- 1,12 + // { dg-lto-do link } + // { dg-lto-options { { -flto -g } } } + + void bar(void) {} + + void foo(void) + { + typedef enum { ABC } DEF; + bar(); + } + + int main () {}
Re: [PATCH] use build_function_type_list in the picochip backend
Looks good to me Nathan. Please go ahead and commit. Thanks Hari On 20/04/11 20:51, Nathan Froyd wrote: As $SUBJECT suggests. Tested with cross to picochip-elf. OK to commit? -Nathan * config/picochip/picochip.c (picochip_init_builtins): Call build_function_type_list instead of build_function_type. Delete `endlink' variable. diff --git a/gcc/config/picochip/picochip.c b/gcc/config/picochip/picochip.c index 1ca95b4..4442d1e 100644 --- a/gcc/config/picochip/picochip.c +++ b/gcc/config/picochip/picochip.c @@ -4216,18 +4216,6 @@ void picochip_init_builtins (void) { tree noreturn; - tree endlink = void_list_node; - tree int_endlink = tree_cons (NULL_TREE, integer_type_node, endlink); - tree unsigned_endlink = tree_cons (NULL_TREE, unsigned_type_node, endlink); - tree long_endlink = tree_cons (NULL_TREE, long_integer_type_node, endlink); - tree int_int_endlink = -tree_cons (NULL_TREE, integer_type_node, int_endlink); - tree int_int_int_endlink = -tree_cons (NULL_TREE, integer_type_node, int_int_endlink); - tree int_long_endlink = -tree_cons (NULL_TREE, integer_type_node, long_endlink); - tree long_int_int_int_endlink = -tree_cons (NULL_TREE, long_integer_type_node, int_int_int_endlink); tree int_ftype_int, int_ftype_int_int; tree long_ftype_int, long_ftype_int_int_int; @@ -4236,36 +4224,51 @@ picochip_init_builtins (void) tree void_ftype_void, unsigned_ftype_unsigned; /* void func (void) */ - void_ftype_void = build_function_type (void_type_node, endlink); + void_ftype_void = build_function_type_list (void_type_node, NULL_TREE); /* int func (int) */ - int_ftype_int = build_function_type (integer_type_node, int_endlink); + int_ftype_int = build_function_type_list (integer_type_node, + integer_type_node, NULL_TREE); /* unsigned int func (unsigned int) */ - unsigned_ftype_unsigned = build_function_type (unsigned_type_node, unsigned_endlink); + unsigned_ftype_unsigned += build_function_type_list (unsigned_type_node, + unsigned_type_node, NULL_TREE); /* int func(int, int) */ int_ftype_int_int -= build_function_type (integer_type_node, int_int_endlink); += build_function_type_list (integer_type_node, + integer_type_node, integer_type_node, + NULL_TREE); /* long func(int) */ - long_ftype_int = build_function_type (long_integer_type_node, int_endlink); + long_ftype_int = build_function_type_list (long_integer_type_node, +integer_type_node, NULL_TREE); /* long func(int, int, int) */ long_ftype_int_int_int -= build_function_type (long_integer_type_node, int_int_int_endlink); += build_function_type_list (long_integer_type_node, + integer_type_node, integer_type_node, + integer_type_node, NULL_TREE); /* int func(int, int, int) */ int_ftype_int_int_int -= build_function_type (integer_type_node, int_int_int_endlink); += build_function_type_list (integer_type_node, + integer_type_node, integer_type_node, + integer_type_node, NULL_TREE); /* void func(int, long) */ void_ftype_int_long -= build_function_type (void_type_node, int_long_endlink); += build_function_type_list (void_type_node, + integer_type_node, long_integer_type_node, + NULL_TREE); /* void func(long, int, int, int) */ void_ftype_long_int_int_int -= build_function_type (void_type_node, long_int_int_int_endlink); += build_function_type_list (void_type_node, + long_integer_type_node, integer_type_node, + integer_type_node, integer_type_node, + NULL_TREE); /* Initialise the sign-bit-count function. */ add_builtin_function (__builtin_sbc, int_ftype_int,
Re: better wpa [1/n]: merge types during read-in
On Thu, Apr 21, 2011 at 3:46 PM, Michael Matz m...@suse.de wrote: Hi, On Wed, 20 Apr 2011, Michael Matz wrote: It would have been nice to have the top-level tree merging as a separate patch, as I am not convinced it is correct, but see below ... I'll split it out. Like so (also including the other remarks). Regstrapping on x86_64-linux in progress. Ok if it passed. Thanks, Richard. Ciao, Michael. * lto-streamer.c (lto_streamer_cache_insert_1): Accept to override other trees that just builtins. (lto_record_common_node): Don't leave NULL TYPE_CANONICAL. lto/ * lto.c (toplevel): Include tree-flow.h. (lto_read_in_decl_state): Don't merge types here. (tree_with_vars): New static hash table. (remember_with_vars): New static functions. (LTO_FIXUP_TYPE): New macro. (lto_ft_common, lto_ft_decl_minimal, lto_ft_decl_common, lto_ft_decl_with_vis, lto_ft_decl_non_common, lto_ft_function, lto_ft_field_decl, lto_ft_type, lto_ft_binfo, lto_ft_constructor, lto_ft_expr, lto_fixup_types, uniquify_nodes): New static functions. (lto_read_decls): Uniquify while reading in trees. (lto_fixup_data_t, LTO_FIXUP_SUBTREE, LTO_REGISTER_TYPE_AND_FIXUP_SUBTREE, no_fixup_p, lto_fixup_common, lto_fixup_decl_minimal, lto_fixup_decl_common, lto_fixup_decl_with_vis, lto_fixup_decl_non_common, lto_fixup_function, lto_fixup_field_decl, lto_fixup_type, lto_fixup_binfo, lto_fixup_constructor, lto_fixup_tree): Remove. (lto_fixup_state): Remove data argument. Use lto_symtab_prevailing_decl. (LTO_SET_PREVAIL, LTO_NO_PREVAIL): New macros. (lto_fixup_prevailing_decls): New function. (lto_fixup_state_aux): Argument aux is unused. (lto_fixup_decls): Don't allocate pointer sets, don't use lto_fixup_tree, use lto_fixup_prevailing_decls. (read_cgraph_and_symbols): Allocate and remove tree_with_vars. * Make-lang.in (lto/lto.o): Depend on $(TREE_FLOW_H). Index: lto-streamer.c === *** lto-streamer.c (revision 172769) --- lto-streamer.c (working copy) *** lto_streamer_cache_insert_1 (struct lto_ *** 383,401 { /* If the caller wants to insert T at a specific slot location, and ENTRY-TO does not match *IX_P, add T to ! the requested location slot. This situation arises when ! streaming builtin functions. ! ! For instance, on the writer side we could have two ! FUNCTION_DECLS T1 and T2 that are represented by the same ! builtin function. The reader will only instantiate the ! canonical builtin, but since T1 and T2 had been ! originally stored in different cache slots (S1 and S2), ! the reader must be able to find the canonical builtin ! function at slots S1 and S2. */ ! gcc_assert (lto_stream_as_builtin_p (t)); ix = *ix_p; - lto_streamer_cache_add_to_node_array (cache, ix, t); } --- 383,390 { /* If the caller wants to insert T at a specific slot location, and ENTRY-TO does not match *IX_P, add T to ! the requested location slot. */ ix = *ix_p; lto_streamer_cache_add_to_node_array (cache, ix, t); } *** lto_record_common_node (tree *nodep, VEC *** 513,518 --- 502,509 TYPE_CANONICAL (node) = NULL_TREE; node = gimple_register_type (node); TYPE_CANONICAL (node) = gimple_register_canonical_type (node); + if (in_lto_p) + TYPE_CANONICAL (*nodep) = TYPE_CANONICAL (node); *nodep = node; } Index: lto/lto.c === *** lto/lto.c (revision 172769) --- lto/lto.c (working copy) *** along with GCC; see the file COPYING3. *** 24,29 --- 24,30 #include opts.h #include toplev.h #include tree.h + #include tree-flow.h #include diagnostic-core.h #include tm.h #include cgraph.h *** lto_read_in_decl_state (struct data_in * *** 215,228 tree *decls = ggc_alloc_vec_tree (size); for (j = 0; j size; j++) ! { ! decls[j] = lto_streamer_cache_get (data_in-reader_cache, data[j]); ! ! /* Register every type in the global type table. If the ! type existed already, use the existing type. */ ! if (TYPE_P (decls[j])) ! decls[j] = gimple_register_type (decls[j]); ! } state-streams[i].size = size; state-streams[i].trees = decls; --- 216,222 tree *decls = ggc_alloc_vec_tree (size); for (j = 0; j size; j++) ! decls[j] = lto_streamer_cache_get (data_in-reader_cache, data[j]);
[PATCH] centralize builtin function type building
This patch does two things: - centralizes some infrastructure for defining builtin function types for frontends by providing a common function that DEF_FUNCTION_TYPE_FOO macros can call; and - in order to do that well, it also introduces build{,_varargs}_function_type_array for cases when build_function_type_list's interface doesn't work so well. build_function_type_list could have been used instead, but it would lose the error_mark_node handling provided in the C/C++/Ada/LTO frontends. This new interface will be necessary for eliminating other uses of build_function_type anyway. It would have been easier to move all of the builtin-types stuff into the middle-end, but Fortran doesn't use builtin-types.def. Even if it did, I suppose it's possible that some new front-end could have its own set of builtin types, so I'm leaving things as they are. The new functions can eliminate some of the games that were played with the recent backend changes to use build_function_type_list; if this patch is approved, I'll make the (currently uncommitted) patches that could use build_function_type_array do so. Bootstrap and testing in progress on x86_64-unknown-linux-gnu. OK to commit if successful? -Nathan gcc/ada/ * gcc-interface/utils.c (def_fn_type): Delete. (DEF_FUNCTION_TYPE_0, DEF_FUNCTION_TYPE_1): Change to use define_builtin_function_type. (DEF_FUNCTION_TYPE_2, DEF_FUNCTION_TYPE_3, DEF_FUNCTION_TYPE_4): (DEF_FUNCTION_TYPE_5, DEF_FUNCTION_TYPE_6, DEF_FUNCTION_TYPE_7): (DEF_FUNCTION_TYPE_VAR_0, DEF_FUNCTION_TYPE_VAR_1): (DEF_FUNCTION_TYPE_VAR_2, DEF_FUNCTION_TYPE_VAR_3): (DEF_FUNCTION_TYPE_VAR_4, DEF_FUNCTION_TYPE_VAR_5): Likewise. gcc/c-family/: * c-common.c (def_fn_type): Delete. (DEF_FUNCTION_TYPE_0, DEF_FUNCTION_TYPE_1): Change to use define_builtin_function_type. (DEF_FUNCTION_TYPE_2, DEF_FUNCTION_TYPE_3, DEF_FUNCTION_TYPE_4): (DEF_FUNCTION_TYPE_5, DEF_FUNCTION_TYPE_6, DEF_FUNCTION_TYPE_7): (DEF_FUNCTION_TYPE_VAR_0, DEF_FUNCTION_TYPE_VAR_1): (DEF_FUNCTION_TYPE_VAR_2, DEF_FUNCTION_TYPE_VAR_3): (DEF_FUNCTION_TYPE_VAR_4, DEF_FUNCTION_TYPE_VAR_5): Likewise. gcc/ * tree.h (build_function_type_array): Declare. (build_varargs_function_type_array, define_builtin_function_type): Declare. * tree.c (build_function_type_array_1): Define. (build_function_type_array, build_varargs_function_type_array): Define. (define_builtin_function_type): Define. gcc/fortran/ * f95-lang.c (DEF_FUNCTION_TYPE_0, DEF_FUNCTION_TYPE_1): Change to use define_builtin_function_type. (DEF_FUNCTION_TYPE_2, DEF_FUNCTION_TYPE_3, DEF_FUNCTION_TYPE_4): (DEF_FUNCTION_TYPE_5, DEF_FUNCTION_TYPE_6, DEF_FUNCTION_TYPE_7): (DEF_FUNCTION_TYPE_VAR_0): Likewise. gcc/lto/ * lto-lang.c (def_fn_type): Delete. (DEF_FUNCTION_TYPE_0, DEF_FUNCTION_TYPE_1): Change to use define_builtin_function_type. (DEF_FUNCTION_TYPE_2, DEF_FUNCTION_TYPE_3, DEF_FUNCTION_TYPE_4): (DEF_FUNCTION_TYPE_5, DEF_FUNCTION_TYPE_6, DEF_FUNCTION_TYPE_7): (DEF_FUNCTION_TYPE_VAR_0, DEF_FUNCTION_TYPE_VAR_1): (DEF_FUNCTION_TYPE_VAR_2, DEF_FUNCTION_TYPE_VAR_3): (DEF_FUNCTION_TYPE_VAR_4, DEF_FUNCTION_TYPE_VAR_5): Likewise. diff --git a/gcc/ada/gcc-interface/utils.c b/gcc/ada/gcc-interface/utils.c index 1031ee9..6eb136d 100644 --- a/gcc/ada/gcc-interface/utils.c +++ b/gcc/ada/gcc-interface/utils.c @@ -4952,47 +4952,6 @@ typedef enum c_builtin_type builtin_type; /* A temporary array used in communication with def_fn_type. */ static GTY(()) tree builtin_types[(int) BT_LAST + 1]; -/* A helper function for install_builtin_types. Build function type - for DEF with return type RET and N arguments. If VAR is true, then the - function should be variadic after those N arguments. - - Takes special care not to ICE if any of the types involved are - error_mark_node, which indicates that said type is not in fact available - (see builtin_type_for_size). In which case the function type as a whole - should be error_mark_node. */ - -static void -def_fn_type (builtin_type def, builtin_type ret, bool var, int n, ...) -{ - tree args = NULL, t; - va_list list; - int i; - - va_start (list, n); - for (i = 0; i n; ++i) -{ - builtin_type a = (builtin_type) va_arg (list, int); - t = builtin_types[a]; - if (t == error_mark_node) - goto egress; - args = tree_cons (NULL_TREE, t, args); -} - va_end (list); - - args = nreverse (args); - if (!var) -args = chainon (args, void_list_node); - - t = builtin_types[ret]; - if (t == error_mark_node) -goto egress; - t = build_function_type (t, args); - - egress: - builtin_types[def] = t; - va_end (list); -} - /* Build the builtin function types and install them in the builtin_types array for later use in builtin
Re: fix memory leak in gengtype
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 04/20/11 17:35, Dimitrios Apostolou wrote: Plus a whole page which is preallocated by the obstack, if I understand correctly. As a result, for each word in the text file we consume 4KB, which are never freed. Plausible. Though I always thought obstacks would release all the unused chunks as part of the obstack_free call and the obstack structure itself was always supposed to be small. Regardless of what exactly was leaked, the two locations you identified were leaking a ton of memory. It turns out there's a similar leak in gengtype.c which is fixed in the same way. Nice, thanks for looking deeper into this, I just stopped when memory utilisation seemed ok. It's been so long since I had to think about obstacks and I wasn't sure the leak was really going to be *that* bad, so I decided to run gengtype under valgrind to verify leak behavior. FWIW, the remaining 300K are almost all leaks through the hash tables. It wouldn't take terribly long to verify the death of the hash tables and insert a suitable htab_delete call. It probably doesn't matter much, but I hate leaving the leak, so I'll probably sit down and learn a little more about gengtype* so I can safely plug the leak hash table leaks. If by PA you mean PA-RISC, I remember when I had access to a Visualize C200 with gentoo on. I loved the machine, but it had an important issue: it was absolutely random if it would power up, when pressing the power button. But as long as we never turned it off, it worked ok :-) I've got a 715, C240 R390 in the basement. The R390 overheats. The other two worked when I fired them up several years ago. Anyway, attached is the patch I ultimately checked in. Same as yours with the addition of fixing a similar obstack leak in gengtype.c. jeff -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/ iQEcBAEBAgAGBQJNsEmrAAoJEBRtltQi2kC7EykH/2nbMtFrAcorqinQViI9Nvtt xO7H764lFjTQa4gsKP/gq+RFRM2s8omyI8cgaFABM7kfkFp64ZUspXHXQS/U1PX/ PHlLCxnjmnw/w56VGDRjF8z9MZ30Cc3dU7xfJRUAbRuooYzYrPw5fMivCeQo4axy +hPEhCHohIOzSjC5+yKnafklfgQdVE1pTc9Cp5LKCTDYAlzvh2vi6FOHf8FF2/1C Dmqgv5qF8Bpd3tyjXj6+/raTU6RfsGsDcQiQo5fADosjPj+h4iWdoVir3xm4TGoU mCwsAoywawk2tPBvlbHAMEg5fiia3qjQ73Pmt2Z62WcK/C+BaZJnj4Or/iZ6Xao= =2JFI -END PGP SIGNATURE- * gengtype-state.c (read_a_state_token): Fix argument to obstack_free. * gengtype.c (matching_file_name_substitute): Likewise. Index: gengtype-state.c === *** gengtype-state.c(revision 172644) --- gengtype-state.c(working copy) *** read_a_state_token (void) *** 303,309 obstack_1grow (id_obstack, (char) 0); ids = XOBFINISH (id_obstack, char *); sid = state_ident_by_name (ids, INSERT); ! obstack_free (id_obstack, ids); ids = NULL; tk = XCNEW (struct state_token_st); tk-stok_kind = STOK_NAME; --- 303,309 obstack_1grow (id_obstack, (char) 0); ids = XOBFINISH (id_obstack, char *); sid = state_ident_by_name (ids, INSERT); ! obstack_free (id_obstack, NULL); ids = NULL; tk = XCNEW (struct state_token_st); tk-stok_kind = STOK_NAME; *** read_a_state_token (void) *** 408,414 tk-stok_file = state_path; tk-stok_next = NULL; strcpy (tk-stok_un.stok_string, cstr); ! obstack_free (bstring_obstack, cstr); return tk; } --- 408,414 tk-stok_file = state_path; tk-stok_next = NULL; strcpy (tk-stok_un.stok_string, cstr); ! obstack_free (bstring_obstack, NULL); return tk; } Index: gengtype.c === *** gengtype.c (revision 172644) --- gengtype.c (working copy) *** matching_file_name_substitute (const cha *** 1943,1949 obstack_1grow (str_obstack, '\0'); rawstr = XOBFINISH (str_obstack, char *); str = xstrdup (rawstr); ! obstack_free (str_obstack, rawstr); DBGPRINTF (matched replacement %s, str); rawstr = NULL; return str; --- 1943,1949 obstack_1grow (str_obstack, '\0'); rawstr = XOBFINISH (str_obstack, char *); str = xstrdup (rawstr); ! obstack_free (str_obstack, NULL); DBGPRINTF (matched replacement %s, str); rawstr = NULL; return str;
Re: Improve stack layout heuristic.
Hi, On Wed, 20 Apr 2011, Easwaran Raman wrote: But you're right - not adding that conflict doesn't actually reduce the size of bit maps. Reverting back to what was there originally. Thanks, I have no more issues with the patch. You'll need to find someone who can formally approve it, though. Ciao, Michael.
Re: [RFA] [PowerPC]
On 04/20/2011 07:52 PM, Segher Boessenkool wrote: The test and-1.c has wrong logic. In the formula: y ~(y -y) The part (y -y) is always a mask with one bit set, which corresponds to the least significant 1 bit in y. The final result is that bit, is set to zero (y ~mask) There is no boolean simplification possible, and the compiler always produces a nand instruction. The formula is equal to y (y-1) , maybe the testcase is testing that? Segher Ah, yes A neg/nand/and should be optimized into a sub -1/and. I will check why this is not happening. Thanks Edmar
Re: [PATCH] centralize builtin function type building
On Thu, Apr 21, 2011 at 5:04 PM, Nathan Froyd froy...@codesourcery.com wrote: This patch does two things: - centralizes some infrastructure for defining builtin function types for frontends by providing a common function that DEF_FUNCTION_TYPE_FOO macros can call; and - in order to do that well, it also introduces build{,_varargs}_function_type_array for cases when build_function_type_list's interface doesn't work so well. build_function_type_list could have been used instead, but it would lose the error_mark_node handling provided in the C/C++/Ada/LTO frontends. This new interface will be necessary for eliminating other uses of build_function_type anyway. It would have been easier to move all of the builtin-types stuff into the middle-end, but Fortran doesn't use builtin-types.def. Even if it did, I suppose it's possible that some new front-end could have its own set of builtin types, so I'm leaving things as they are. But this is what should be done, at least for all builtins in the BUILT_IN_NORMAL category. ISTR Fortran was once running into the issue of assigning different DECL_FUNCTION_CODE numbers to those builtins than other languages, breaking LTO. So, it would be indeed nice to have a central middle-end place to instantiate those builtins and their required types. I'm not sure how far we are from that and am too lazy to look right now ... Richard. The new functions can eliminate some of the games that were played with the recent backend changes to use build_function_type_list; if this patch is approved, I'll make the (currently uncommitted) patches that could use build_function_type_array do so. Bootstrap and testing in progress on x86_64-unknown-linux-gnu. OK to commit if successful? -Nathan gcc/ada/ * gcc-interface/utils.c (def_fn_type): Delete. (DEF_FUNCTION_TYPE_0, DEF_FUNCTION_TYPE_1): Change to use define_builtin_function_type. (DEF_FUNCTION_TYPE_2, DEF_FUNCTION_TYPE_3, DEF_FUNCTION_TYPE_4): (DEF_FUNCTION_TYPE_5, DEF_FUNCTION_TYPE_6, DEF_FUNCTION_TYPE_7): (DEF_FUNCTION_TYPE_VAR_0, DEF_FUNCTION_TYPE_VAR_1): (DEF_FUNCTION_TYPE_VAR_2, DEF_FUNCTION_TYPE_VAR_3): (DEF_FUNCTION_TYPE_VAR_4, DEF_FUNCTION_TYPE_VAR_5): Likewise. gcc/c-family/: * c-common.c (def_fn_type): Delete. (DEF_FUNCTION_TYPE_0, DEF_FUNCTION_TYPE_1): Change to use define_builtin_function_type. (DEF_FUNCTION_TYPE_2, DEF_FUNCTION_TYPE_3, DEF_FUNCTION_TYPE_4): (DEF_FUNCTION_TYPE_5, DEF_FUNCTION_TYPE_6, DEF_FUNCTION_TYPE_7): (DEF_FUNCTION_TYPE_VAR_0, DEF_FUNCTION_TYPE_VAR_1): (DEF_FUNCTION_TYPE_VAR_2, DEF_FUNCTION_TYPE_VAR_3): (DEF_FUNCTION_TYPE_VAR_4, DEF_FUNCTION_TYPE_VAR_5): Likewise. gcc/ * tree.h (build_function_type_array): Declare. (build_varargs_function_type_array, define_builtin_function_type): Declare. * tree.c (build_function_type_array_1): Define. (build_function_type_array, build_varargs_function_type_array): Define. (define_builtin_function_type): Define. gcc/fortran/ * f95-lang.c (DEF_FUNCTION_TYPE_0, DEF_FUNCTION_TYPE_1): Change to use define_builtin_function_type. (DEF_FUNCTION_TYPE_2, DEF_FUNCTION_TYPE_3, DEF_FUNCTION_TYPE_4): (DEF_FUNCTION_TYPE_5, DEF_FUNCTION_TYPE_6, DEF_FUNCTION_TYPE_7): (DEF_FUNCTION_TYPE_VAR_0): Likewise. gcc/lto/ * lto-lang.c (def_fn_type): Delete. (DEF_FUNCTION_TYPE_0, DEF_FUNCTION_TYPE_1): Change to use define_builtin_function_type. (DEF_FUNCTION_TYPE_2, DEF_FUNCTION_TYPE_3, DEF_FUNCTION_TYPE_4): (DEF_FUNCTION_TYPE_5, DEF_FUNCTION_TYPE_6, DEF_FUNCTION_TYPE_7): (DEF_FUNCTION_TYPE_VAR_0, DEF_FUNCTION_TYPE_VAR_1): (DEF_FUNCTION_TYPE_VAR_2, DEF_FUNCTION_TYPE_VAR_3): (DEF_FUNCTION_TYPE_VAR_4, DEF_FUNCTION_TYPE_VAR_5): Likewise. diff --git a/gcc/ada/gcc-interface/utils.c b/gcc/ada/gcc-interface/utils.c index 1031ee9..6eb136d 100644 --- a/gcc/ada/gcc-interface/utils.c +++ b/gcc/ada/gcc-interface/utils.c @@ -4952,47 +4952,6 @@ typedef enum c_builtin_type builtin_type; /* A temporary array used in communication with def_fn_type. */ static GTY(()) tree builtin_types[(int) BT_LAST + 1]; -/* A helper function for install_builtin_types. Build function type - for DEF with return type RET and N arguments. If VAR is true, then the - function should be variadic after those N arguments. - - Takes special care not to ICE if any of the types involved are - error_mark_node, which indicates that said type is not in fact available - (see builtin_type_for_size). In which case the function type as a whole - should be error_mark_node. */ - -static void -def_fn_type (builtin_type def, builtin_type ret, bool var, int n, ...) -{ - tree args = NULL, t; - va_list list; - int i; - - va_start (list, n); - for (i = 0; i n; ++i) - { -
Re: Improve stack layout heuristic.
On Thu, Apr 21, 2011 at 5:22 PM, Michael Matz m...@suse.de wrote: Hi, On Wed, 20 Apr 2011, Easwaran Raman wrote: But you're right - not adding that conflict doesn't actually reduce the size of bit maps. Reverting back to what was there originally. Thanks, I have no more issues with the patch. You'll need to find someone who can formally approve it, though. Ok with a proper changelog entry. Richard. Ciao, Michael.
RFA: Improve jump threading #2 of N
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 For some dumb reason I thought handling threading through a SWITCH_EXPR was hard in VRP; that's definitely not the case, it's no more difficult than handling a COND_EXPR. This patch allows tree-vrp.c to thread through a SWITCH_EXPR when we know the switch index via a particular path. Most of these would have eventually been caught by DOM, but we get to pick up more secondary effects if we can catch them in VRP. Bootstrapped and regression tested on x86_64-unknown-linux-gnu. OK for trunk? -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/ iQEcBAEBAgAGBQJNsFGKAAoJEBRtltQi2kC7wP4IAJZ6e/jF5WAnFdG5/G6lWdEp D1mBlaFvrSgAXp/Qx9y8uLQfnpzSvcgb59naTH2p+rDILjAs9Bw1eXZN0MCwTN4H tylFm3w4a9BMzHbzgiwGzDBTkUyzIQyQ1XzqSHbOTWtzEuFGbAmRzSgDoFPbZEPD cWXsh7yRLr/aqOFLCoEbN/Wp87aBDSXi+I7d5rU+la97En3FWBSNku84dZf1F+8C 9qW4wCW7lX1LVnWy3LmeSxXf0k9JmJGkUvdmSr2TYtZVPvPLqHg97NOWG9CAfUVO cTZW5QOFuyEGg/vbYOFmfWRI7uw+YjggV/v0LzUYjRV00fj+rygjgUfgA5f3rGA= =NPtm -END PGP SIGNATURE- * tree-vrp.c (identify_jump_threads): Handle GIMPLE_SWITCH too. Index: tree-vrp.c === *** tree-vrp.c (revision 172644) --- tree-vrp.c (working copy) *** identify_jump_threads (void) *** 7555,7579 may be some value in handling SWITCH_EXPR here, I doubt it's terribly important. */ last = gsi_stmt (gsi_last_bb (bb)); - if (gimple_code (last) != GIMPLE_COND) - continue; ! /* We're basically looking for any kind of conditional with integral or pointer type arguments. Note the type of the second argument will be the same as the first argument, so no need to check it explicitly. */ ! if (TREE_CODE (gimple_cond_lhs (last)) == SSA_NAME ! (INTEGRAL_TYPE_P (TREE_TYPE (gimple_cond_lhs (last))) ! || POINTER_TYPE_P (TREE_TYPE (gimple_cond_lhs (last ! (TREE_CODE (gimple_cond_rhs (last)) == SSA_NAME ! || is_gimple_min_invariant (gimple_cond_rhs (last { edge_iterator ei; /* We've got a block with multiple predecessors and multiple !successors which also ends in a suitable conditional. For !each predecessor, see if we can thread it to a specific !successor. */ FOR_EACH_EDGE (e, ei, bb-preds) { /* Do not thread across back edges or abnormal edges --- 7555,7579 may be some value in handling SWITCH_EXPR here, I doubt it's terribly important. */ last = gsi_stmt (gsi_last_bb (bb)); ! /* We're basically looking for a switch or any kind of conditional with integral or pointer type arguments. Note the type of the second argument will be the same as the first argument, so no need to check it explicitly. */ ! if (gimple_code (last) == GIMPLE_SWITCH ! || (gimple_code (last) == GIMPLE_COND ! TREE_CODE (gimple_cond_lhs (last)) == SSA_NAME ! (INTEGRAL_TYPE_P (TREE_TYPE (gimple_cond_lhs (last))) ! || POINTER_TYPE_P (TREE_TYPE (gimple_cond_lhs (last ! (TREE_CODE (gimple_cond_rhs (last)) == SSA_NAME ! || is_gimple_min_invariant (gimple_cond_rhs (last) { edge_iterator ei; /* We've got a block with multiple predecessors and multiple !successors which also ends in a suitable conditional or !switch statement. For each predecessor, see if we can thread !it to a specific successor. */ FOR_EACH_EDGE (e, ei, bb-preds) { /* Do not thread across back edges or abnormal edges
Re: [PATCH PING] c++-specific bits of tree-slimming patches
On Thu, 14 Apr 2011, Nathan Froyd wrote: On Fri, Apr 08, 2011 at 01:50:24PM -0400, Jason Merrill wrote: On 03/24/2011 09:15 AM, Nathan Froyd wrote: + tree t = make_node (CASE_LABEL_EXPR); + + TREE_TYPE (t) = void_type_node; + SET_EXPR_LOCATION (t, input_location); As jsm and richi said, using input_location like this is a regression. Can we use DECL_SOURCE_LOCATION (label_decl) instead? Sure. Joseph, Richi, are you happy with that change? It would fix the C/C++ regression, as c_add_case_label does: /* Create the LABEL_DECL itself. */ label = create_artificial_label (loc); ... /* Add a CASE_LABEL to the statement-tree. */ case_label = add_stmt (build_case_label (loc, low_value, high_value, label)); so the DECL_SOURCE_LOCATION would be the same as the location_t we were passing in anyway. For the other languages, I think it would be neutral or an improvement (they all use input_location or UNKNOWN_LOCATION for the CASE_LABEL anyway). Seems fine with me. -- Joseph S. Myers jos...@codesourcery.com
Re: [PATCH] make LABEL_DECL has its own rtx field for its associated CODE_LABEL
Hi, On Wed, 20 Apr 2011, Richard Guenther wrote: I had occasion to try this today; this inheritance structure doesn't work. The truncated inheritance tree looks like: * decl_common * field_decl * const_decl * decl_with_rtl * label_decl * result_decl * parm_decl * decl_with_vis... In particular, FIELD_DECLs have a size, but they have no RTL associated with them. And LABEL_DECLs have RTL, but no size. Blaeh. So far about nice clean ideas :) One hacky idea: change my proposal to this: decl_common {} # no size, no rtl, no align, no pt_uid decl_with_rtl_or_size : decl_common { # add align, pt_uid union { rtx rtl; tree size; } u; } decl_with_size : decl_with_rtl_or_size { # add size, size_unit } Use the rtl_or_size struct primarily for the current _with_rtl things that don't naturally have a size, but use it also for FIELD_DECLs via the union. Alternatively I could also envision making a new tree_ struct for just field_decls, that would contain the size and other fields that currently are in decl_common just for fields (in particular the off_align) member. The various accessors like DECL_SIZE would then need to dispatch based on tree code. Also doesn't sound terribly sexy. FWIW I'm usually against on the side mappings A-B if most As will most of the time be associated with a B. A size _is_ associated with the entity always (for entities where it makes sense to talk about sizes), so that's exactly where I would find on the side tables strange. For DECL_RTL it's less clear. Ciao, Michael.
Re: [PATCH][ARM] RTABI half-precision conversion functions
On Thu, 14 Apr 2011, Andrew Stubbs wrote: On 14/02/11 18:20, Joseph S. Myers wrote: Is there a reason you didn't add these functions to the shared libgcc (adjust t-bpabi and t-symbian accordingly, add them to libgcc-bpabi.ver at version GCC_4.6.0)? The GCC-specific names were deliberately made static-only in the expectation that they would be obsoleted by standard AEABI names and temporary names shouldn't be a permanent part of the shared libgcc interface; now we have the permanent names, I'd have thought they should go in shared libgcc as well as static libgcc (while the GCC-specific names would continue to be exported from static libgcc only, with the symbol versioning ensuring they don't get exported from shared libgcc). No, there was no reason - I just didn't realise it needed doing. Is this patch better? You need to add %inherit GCC_4.7.0 GCC_4.6.0 GCC_4.7.0 { } to libgcc-std.ver so that the symbol versions are properly related to each other (empty versions there that only have contents for some targets are fine; GCC_4.1.0 is another other example of such a symbol version). Otherwise the symbol version handling seems right to me, although I can't approve the patch. -- Joseph S. Myers jos...@codesourcery.com
Re: [PATCH][C] Do not expand array-refs via pointer arithmetic
On Thu, 14 Apr 2011, Richard Guenther wrote: 2011-04-14 Richard Guenther rguent...@suse.de * c-typeck.c (build_unary_op): Do not expand array-refs via pointer arithmetic. Only adjust qualifiers for function types. OK. -- Joseph S. Myers jos...@codesourcery.com
Re: [PATCH] make LABEL_DECL has its own rtx field for its associated CODE_LABEL
On Thu, Apr 21, 2011 at 05:54:28PM +0200, Michael Matz wrote: In particular, FIELD_DECLs have a size, but they have no RTL associated with them. And LABEL_DECLs have RTL, but no size. Blaeh. So far about nice clean ideas :) One hacky idea: change my proposal to this: decl_common {} # no size, no rtl, no align, no pt_uid decl_with_rtl_or_size : decl_common { # add align, pt_uid union { rtx rtl; tree size; } u; } decl_with_size : decl_with_rtl_or_size { # add size, size_unit } Use the rtl_or_size struct primarily for the current _with_rtl things that don't naturally have a size, but use it also for FIELD_DECLs via the union. I'm not sure I follow this wrt FIELD_DECLs. Before you have: ...lots of decl fields.. tree size; tree size_unit; After you have: ...lots of decl fields... union { rtx rtl; tree size; } u; tree size; tree size_unit; Or did you mean something like: /* As above. */ decl_with_rtl_or_size /* Add a size_unit field. */ decl_with_size_unit : decl_with_rtl_or_size /* FIELD_DECL */ /* Add a size field for DECLs that do have RTL. */ decl_with_rtl_and_size : decl_with_size_unit /* VAR_DECL, PARM_DECL, etc. */ which looks just awful. :) Alternatively I could also envision making a new tree_ struct for just field_decls, that would contain the size and other fields that currently are in decl_common just for fields (in particular the off_align) member. The various accessors like DECL_SIZE would then need to dispatch based on tree code. You could also do something like: struct tree_decl_size { tree size; tree size_unit; ... }; struct tree_field_decl { ... struct tree_decl_size size; }; struct tree_var_decl { ... struct tree_decl_size size; }; static inline tree * decl_size_1 (tree node) { switch (TREE_CODE (node)) { case FIELD_DECL: return node-field_decl.size.size; case VAR_DECL: return node-var_decl.size.size; ... default: gcc_unreachable (); } } /* Might also need a HAS_DECL_SIZE_P predicate or similar. */ #define DECL_SIZE(NODE) (*decl_size_1 (NODE)) ... which would make it somewhat more obvious when things have sizes, as well as letting you remove DECL_SIZE{,_UNIT} from FUNCTION_DECL. Slimming CONST_DECL and LABEL_DECL like this is useful mostly for code cleanliness, but eliminating such fields from FUNCTION_DECL would have a much more noticeable memory size impact. -Nathan
Re: Convert legacy m68k options to .opt aliases
On Thu, 14 Apr 2011, Gunther Nikl wrote: However, the link spec seems to be harder. %{m68020-*|m68040|m68060:-fl libm020} Do I have to replace every m680x0 option with a matching mcpu= (maybe even together with a march=) option? Whatever options you want to match that spec should be replaced by their current versions. That means m68020-*|mcpu=68040|mcpu=68060 (the -m68020-* options haven't been changed into aliases, since they correspond to multiple options rather than being equivalent to a single other option). -- Joseph S. Myers jos...@codesourcery.com
[Ada] Adjust rules for selected files in Makefiles
A few selected files must be compiled with special options in Ada, both for the compiler and the runtime. The attached patch cleans things up a bit in this area and also fixes various minor related issues. Bootstrapped/regtested on i586-suse-linux, applied on the mainline. 2011-04-21 Eric Botcazou ebotca...@adacore.com * gcc-interface/Makefile.in (NO_SIBLING_ADAFLAGS): Always define. (NO_REORDER_ADAFLAGS): New variable. (EXTRA_GNATTOOLS): Always define. (../stamp-gnatlib1-$(RTSDIR): Copy tsystem.h. Clean up and adjust list of files compiled with special options. * gcc-interface/Make-lang.in: Likewise. (ada/decl.o): Cosmetical change. (ada/misc.o): Remove dependency on $(PLUGIN_H). -- Eric Botcazou Index: gcc-interface/Makefile.in === --- gcc-interface/Makefile.in (revision 172811) +++ gcc-interface/Makefile.in (working copy) @@ -107,6 +107,8 @@ ADA_CFLAGS = ADAFLAGS = -W -Wall -gnatpg -gnata SOME_ADAFLAGS =-gnata FORCE_DEBUG_ADAFLAGS = -g +NO_SIBLING_ADAFLAGS=-fno-optimize-sibling-calls +NO_REORDER_ADAFLAGS=-fno-toplevel-reorder GNATLIBFLAGS = -gnatpg -nostdinc GNATLIBCFLAGS = -g -O2 # Pretend that _Unwind_GetIPInfo is available for the target by default. This @@ -180,6 +182,9 @@ SYSLIBS = @GNAT_LIBEXC@ # List of extra object files linked in with various programs. EXTRA_GNATTOOLS_OBJS = ../../prefix.o ../../version.o +# List extra gnattools +EXTRA_GNATTOOLS = + # List of target dependent sources, overridden below as necessary TARGET_ADA_SRCS = @@ -243,6 +248,21 @@ LIBDEPS = $(LIBINTL_DEP) $(LIBIBERTY) TGT_LIB = TOOLS_LIBS = $(EXTRA_GNATTOOLS_OBJS) targext.o link.o $(LIBGNAT) ../../../libiberty/libiberty.a $(SYSLIBS) $(TGT_LIB) +# Convert the target variable into a space separated list of architecture, +# manufacturer, and operating system and assign each of those to its own +# variable. + +host:=$(subst -, ,$(host_canonical)) +targ:=$(subst -, ,$(target)) +arch:=$(word 1,$(targ)) +ifeq ($(words $(targ)),2) + manu:= + osys:=$(word 2,$(targ)) +else + manu:=$(word 2,$(targ)) + osys:=$(word 3,$(targ)) +endif + # Specify the directories to be searched for header files. # Both . and srcdir are used, in that order, # so that tm.h and config.h will be found in the compilation @@ -312,21 +332,6 @@ GNATMAKE_OBJS = a-except.o ali.o ali-uti uintp.o uname.o urealp.o usage.o widechar.o scil_ll.o \ $(EXTRA_GNATMAKE_OBJS) -# Convert the target variable into a space separated list of architecture, -# manufacturer, and operating system and assign each of those to its own -# variable. - -host:=$(subst -, ,$(host_canonical)) -targ:=$(subst -, ,$(target)) -arch:=$(word 1,$(targ)) -ifeq ($(words $(targ)),2) - manu:= - osys:=$(word 2,$(targ)) -else - manu:=$(word 2,$(targ)) - osys:=$(word 3,$(targ)) -endif - # Make arch match the current multilib so that the RTS selection code # picks up the right files. For a given target this must be coherent # with MULTILIB_DIRNAMES defined in gcc/config/target/t-*. @@ -1634,7 +1639,6 @@ ifeq ($(strip $(filter-out cygwin32% min s-osprim.adbs-osprim-mingw.adb \ s-taprop.adbs-taprop-mingw.adb -EH_MECHANISM=-gcc ifeq ($(strip $(filter-out x86_64%,$(arch))),) ifeq ($(strip $(MULTISUBDIR)),/32) LIBGNAT_TARGET_PAIRS += \ @@ -1666,6 +1670,8 @@ ifeq ($(strip $(filter-out cygwin32% min # ??? This will be replaced by gnatlib-shared-dual-win32 when GNAT # auto-import support for array/record will be done. GNATLIB_SHARED = gnatlib-shared-win32 + +EH_MECHANISM=-gcc endif TOOLS_TARGET_PAIRS= \ @@ -2444,6 +2450,8 @@ install-gnatlib: ../stamp-gnatlib-$(RTSD $(foreach PAIR,$(LIBGNAT_TARGET_PAIRS), \ $(LN_S) $(fsrcpfx)ada/$(word 2,$(subst , ,$(PAIR))) \ $(RTSDIR)/$(word 1,$(subst , ,$(PAIR)));) +# Copy tsystem.h + $(CP) $(srcdir)/tsystem.h rts # Copy generated target dependent sources $(RM) $(RTSDIR)/s-oscons.ads (cd $(RTSDIR); $(LN_S) ../s-oscons.ads s-oscons.ads) @@ -2668,7 +2676,7 @@ gnatlib-sjlj: gnatlib-zcx: $(MAKE) $(FLAGS_TO_PASS) EH_MECHANISM=-gcc \ - THREAD_KIND=$(THREAD_KIND) ../stamp-gnatlib1-$(RTSDIR) + THREAD_KIND=$(THREAD_KIND) ../stamp-gnatlib1-$(RTSDIR) sed -e 's/ZCX_By_Default.*/ZCX_By_Default: constant Boolean := True;/' $(RTSDIR)/system.ads $(RTSDIR)/s.ads $(MV) $(RTSDIR)/s.ads $(RTSDIR)/system.ads $(MAKE) $(FLAGS_TO_PASS) \ @@ -2709,22 +2717,15 @@ b_gnatm.o : b_gnatm.adb ADA_INCLUDE_DIR = $(libsubdir)/adainclude ADA_RTL_OBJ_DIR = $(libsubdir)/adalib +# Special flags + # force no sibling call optimization on s-traceb.o so the number of stack # frames to be skipped when computing a call chain is not modified by -# optimization. However we can do that only when building the runtime -# (not the compiler) because the -fno-optimize-sibling-calls option exists -# only in GCC 3 and above. +#
Re: [doc] Remove references to mips-tfile on MIPS
Not strictly related to this patch, but there are other cleanups possible because of the only-used-on-Tru64 nature of mips-tdump/mips-tfile. In particular, there are seven target macros (all undocumented) used by those programs and nowhere else in GCC: ALIGN_SYMTABLE_OFFSET CODE_MASK MIPS_IS_STAB MIPS_MARK_STAB MIPS_UNMARK_STAB SHASH_SIZE THASH_SIZE. Since those are only defined in alpha.h and only used on those programs, it would be better to hardcode the definitions inside mips-{tdump,tfile}.c, remove them from alpha.h and so eliminate seven target macros (about 1% of the total). If you also hardcode the right definition of MIPS_DEBUGGING_INFO (used both in mips-tfile and elsewhere) then these programs should no longer depend on target macros and their tm.h includes can be removed. (There are many other instances of #if conditionals in those programs, all of which are suspect since the programs are native-only for a single target and so shouldn't need conditional compilation at all. But those other conditionals aren't relevant to target macro elimination.) -- Joseph S. Myers jos...@codesourcery.com
[cxx-mem-model] contiguous bitfields race implementation
To refresh everyone's memory, here is the problem: struct { unsigned int a : 4; unsigned char b; unsigned int c: 6; } var; void seta(){ var.a = 12; } Stores into a cannot touch b, so we can't store with anything wider (e.g. a 32 bit store) that will touch b. This problem can be seen on strictly aligned targets such as Arm, where we store the above sequence with a 32-bit store. Or on x86-64 with a being volatile (PR48124). This patch fixes both problems, but only for the C++ memory model. This is NOT a generic fix PR48124, only a fix when using --param allow-store-data-races=0. The gist of this patch is in max_field_size(), where we calculate the maximum number of bits we can store into. In doing this calculation I assume we can store into the padding without causing any races. So, padding between fields and at the end of the structure are included. Tested on x86-64 with --param allow-store-data-races=0. How does this look? I'd like a review before committing to the branch. Aldy * Makefile.in (expr.o): Depend on PARAMS_H. * machmode.h (get_best_mode): Add argument. * fold-const.c (optimize_bit_field_compare): Add argument to get_best_mode. (fold_truthop): Same. * ifcvt.c (noce_emit_move_insn): Add argument to store_bit_field. * expr.c (emit_group_store): Same. (copy_blkmode_from_reg): Same. (write_complex_part): Same. (optimize_bitfield_assignment_op): Add argument. Add argument to get_best_mode. (max_field_size): New. (expand_assignment): Calculate maxbits and pass it down accordingly. (store_field): New argument. (expand_expr_real_2): New argument to store_field. Include params.h. * expr.h (store_bit_field): New argument. * stor-layout.c (get_best_mode): Restrict mode expansion by taking into account maxbits. * calls.c (store_unaligned_arguments_into_pseudos): New argument to store_bit_field. * expmed.c (store_bit_field_1): New argument. Use it. (store_bit_field): Same. (store_fixed_bit_field): Same. (store_split_bit_field): Same. (extract_bit_field_1): Pass new argument to get_best_mode. (extract_bit_field): Same. * stmt.c (store_bit_field): Pass new argument to store_bit_field. * tree.h (DECL_THREAD_VISIBLE_P): New. Index: machmode.h === --- machmode.h (revision 170745) +++ machmode.h (working copy) @@ -248,7 +248,9 @@ extern enum machine_mode mode_for_vector /* Find the best mode to use to access a bit field. */ -extern enum machine_mode get_best_mode (int, int, unsigned int, +extern enum machine_mode get_best_mode (int, int, + unsigned HOST_WIDE_INT, + unsigned int, enum machine_mode, int); /* Determine alignment, 1=result=BIGGEST_ALIGNMENT. */ Index: tree.h === --- tree.h (revision 170745) +++ tree.h (working copy) @@ -3102,6 +3102,10 @@ struct GTY(()) tree_parm_decl { #define DECL_THREAD_LOCAL_P(NODE) \ (VAR_DECL_CHECK (NODE)-decl_with_vis.tls_model = TLS_MODEL_REAL) +/* Return true if a VAR_DECL is visible from another thread. */ +#define DECL_THREAD_VISIBLE_P(NODE) \ + (TREE_STATIC (NODE) !DECL_THREAD_LOCAL_P (NODE)) + /* In a non-local VAR_DECL with static storage duration, true if the variable has an initialization priority. If false, the variable will be initialized at the DEFAULT_INIT_PRIORITY. */ Index: fold-const.c === --- fold-const.c(revision 170745) +++ fold-const.c(working copy) @@ -3426,7 +3426,7 @@ optimize_bit_field_compare (location_t l flag_strict_volatile_bitfields 0) nmode = lmode; else -nmode = get_best_mode (lbitsize, lbitpos, +nmode = get_best_mode (lbitsize, lbitpos, 0, const_p ? TYPE_ALIGN (TREE_TYPE (linner)) : MIN (TYPE_ALIGN (TREE_TYPE (linner)), TYPE_ALIGN (TREE_TYPE (rinner))), @@ -5254,7 +5254,7 @@ fold_truthop (location_t loc, enum tree_ to be relative to a field of that size. */ first_bit = MIN (ll_bitpos, rl_bitpos); end_bit = MAX (ll_bitpos + ll_bitsize, rl_bitpos + rl_bitsize); - lnmode = get_best_mode (end_bit - first_bit, first_bit, + lnmode = get_best_mode (end_bit - first_bit, first_bit, 0, TYPE_ALIGN (TREE_TYPE (ll_inner)), word_mode, volatilep); if (lnmode == VOIDmode) @@ -5319,7 +5319,7 @@ fold_truthop (location_t loc, enum tree_ first_bit = MIN (lr_bitpos, rr_bitpos); end_bit = MAX (lr_bitpos + lr_bitsize, rr_bitpos +
[PATCH] PR c/36750: Suppress missing field initializer warning for '= {0}'
Hello, This patch suppresses the missing field initializer warning when a structure is initialized with ` = { 0 }' in C. Even though the PR author asks specifically to suppress (at least) only when a trailing comma is included, results from Google code search suggest that spelling without a comma is more common, so the patch does not distinguish these variants. Behavior of C++ front-end is unchanged. Bootstrapped and regtested on x86_64-linux, OK for trunk? 2011-04-21 Alexander Monakov amona...@ispras.ru PR c/36750 * c-typeck.c (pop_init_level): Do not warn about initializing with ` = {0}'. testsuite: * gcc.dg/missing-field-init-2.c: Update testcase. diff --git a/gcc/c-typeck.c b/gcc/c-typeck.c index 15b7755..d8609d2 100644 --- a/gcc/c-typeck.c +++ b/gcc/c-typeck.c @@ -6934,15 +6934,23 @@ pop_init_level (int implicit, struct obstack * braced_init_obstack) TREE_CODE (constructor_type) == RECORD_TYPE constructor_unfilled_fields) { + bool constructor_zeroinit = +(VEC_length (constructor_elt, constructor_elements) == 1 + integer_zerop + (VEC_index (constructor_elt, constructor_elements, 0)-value)); + /* Do not warn for flexible array members or zero-length arrays. */ while (constructor_unfilled_fields (!DECL_SIZE (constructor_unfilled_fields) || integer_zerop (DECL_SIZE (constructor_unfilled_fields constructor_unfilled_fields = DECL_CHAIN (constructor_unfilled_fields); - /* Do not warn if this level of the initializer uses member - designators; it is likely to be deliberate. */ - if (constructor_unfilled_fields !constructor_designated) + if (constructor_unfilled_fields + /* Do not warn if this level of the initializer uses member + designators; it is likely to be deliberate. */ +!constructor_designated + /* Do not warn about initializing with ` = {0}'. */ +!constructor_zeroinit) { push_member_name (constructor_unfilled_fields); warning_init (OPT_Wmissing_field_initializers, diff --git a/gcc/testsuite/gcc.dg/missing-field-init-2.c b/gcc/testsuite/gcc.dg/missing-field-init-2.c index 581eb30..c5a3f49 100644 --- a/gcc/testsuite/gcc.dg/missing-field-init-2.c +++ b/gcc/testsuite/gcc.dg/missing-field-init-2.c @@ -9,3 +9,6 @@ struct s s4[] = { 1, 2, 3, 4, 5 }; /* { dg-warning (missing initializer)|(near struct s s5[] = { 1, 2, 3, 4, 5, 6 }; /* Designated initializers produce no warning. */ struct s s6 = { .a = 1 }; /* { dg-bogus missing initializer } */ +/* Allow zero-initializing with = { 0 }. */ +struct s s7 = { 0 }; /* { dg-bogus missing initializer } */ +struct s s8 = { 1 }; /* { dg-warning (missing initializer)|(near initialization) } */
Re: new option -Wno-maybe-uninitialized
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 04/07/11 18:24, Xinliang David Li wrote: Hi, the following patch implements the option to fine control the emitted warnings -- 1) allow suppressing warnings for use of values that may be uninitialized. Definitely uninitialized values that may be used warning is not affected 2) allow fine grain control on promotion of warnings to errors: -Wno-error=maybe-uninitialized This is useful for users who only care about definite uninitialized variable warnings. I'm definitely in favor of having the ability for the user to be able to fine tune the warnings they want. However, I have some concerns. First, I'm not sure how reconcile the may vs must inconsistency. We can have an object which is used in a real statement in the IL, which we currently would classify as must-be-uninitialized, even if the path leading to the statement is unexecutable.I guess I want to find better ways to describe this kind of stuff in our documentation and switch names Second, I'm concerned about changing the default behaviour of - -Wuninitialized. I don't think we ever reached a consensus on that issue last time we tried to hash through this stuff. Maybe I missed something, but it appears to me your patch makes -Wuninitialized only warn about real uses and ignore uses in PHIs. Given the lack of consensus (and I believe there will never be a consensus), I believe we should keep -Wuninitialized behavior as-is and use -Wno-maybe-uninitialized to override -Wuninitialized. If we can come up with better switch name than -Wno-maybe-uninitialized, that would be a good step as well. However, I'm not offhand sure what name to use given the issues surrounding may vs must be used uninitialized. jeff -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/ iQEcBAEBAgAGBQJNsGecAAoJEBRtltQi2kC7BugH/01X2ppdVUONq2fBzztxF+YR Qn85yccwI45iizNG1dlmyZlYpATYcfTGwDkBYTjiVllhOJa64Ri6bIP3ErySPllk OrSAbwFT96TK9/h1eGfVj45mZS3MO0Pa+sxp93noGnxIMWWOt+7LiyrfJEHHLaUK rPhREXodzLnN0KpXTD8+RS2uzTV+YODAuiWQDkiyQ0XpjCw5w2ccc7dWnrzgvEn5 6x/baTqQxunV8t3/ezUlHiMcUNyMigXccFgIeRRsvMengY/reLxy7eg1i+vdeets DyWR/Hg3HewLAQidL3SAJLLGuqTrNUZ0wswgqghqh9VllAFoUdKn0xk8GFt0wbk= =4aA4 -END PGP SIGNATURE-
Re: RFA: Improve jump threading #2 of N
Bootstrapped and regression tested on x86_64-unknown-linux-gnu. OK for trunk? Would this also fix PR18046? Ciao! Steven
Re: FDO usability patch -- cfg and lineno checksum
On Tue, Apr 19, 2011 at 15:47, Xinliang David Li davi...@google.com wrote: The attached is the revised patch with a warning suggested for cases when CFG matches, but source locations change. Ok for trunk? The tree.c changes are OK. Diego.
Re: FDO usage: -Wcoverage-mismatch should not ignore -Wno-error
Ping .. David On Wed, Apr 20, 2011 at 9:34 AM, Xinliang David Li davi...@google.com wrote: This would work if there is a way to set Werror=coverage-mismatch without having to explicitly set the option classification as DK_ERROR. Does this mechanism exist? Thanks, David On Tue, Apr 19, 2011 at 12:52 AM, Richard Guenther richard.guent...@gmail.com wrote: On Tue, Apr 19, 2011 at 9:13 AM, Xinliang David Li davi...@google.com wrote: -Wcoverage-mismatch is enabled by default, and the warning is promoted to error by default. However in the current implementation -Wno-error can not demote the error back to warning. The patch was ported from one contributed by Neil. OK for trunk after regression testing? I am sure there is a better way to achieve this, like making Werror=coverage-mismatch the default. Joseph? Richard. 2011-04-18 Neil Vachharajani nvach...@gmail.com * flags.c: New flag variable. * opts.c (common_handle_options): Set flag_werror_set. * opts-global.c (decode_options): Delay Werror decision for Wcoverage-mismatch util after options are parsed. The following test case can be added, but the test harness does not like the extra warnings -- how can they be pruned? Thanks, David /* { dg-options -O2 -Wcoverage-mismatch -Wno-error } */ int __attribute__((noinline)) bar (void) { } #ifdef _PROFILE_USE int foo (int i) { if (i) bar (); else bar (); return 0; } #else int foo (int i) { if (i) bar (); return 0; } #endif int main(int argc, char **argv) { foo (argc); return 0; }
Re: RFA: Improve jump threading #2 of N
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 04/21/11 11:26, Steven Bosscher wrote: Bootstrapped and regression tested on x86_64-unknown-linux-gnu. OK for trunk? Would this also fix PR18046? Not right now. If we look at VRP2 (and this only affects VRP's jump threading) we have: SSA form after inserting ASSERT_EXPRs bar () { int prephitmp.5; int pretmp.4; int i.0; # BLOCK 2 freq:1 # PRED: ENTRY [100.0%] (fallthru,exec) # VUSE .MEM_5(D) i.0_1 = i; switch (i.0_1) default: L2, case 0: L0 # SUCC: 4 [71.0%] (exec) 3 [29.0%] (exec) # BLOCK 3 freq:2900 # PRED: 2 [29.0%] (exec) L0: i.0_2 = ASSERT_EXPR i.0_1, i.0_1 == 0; # .MEM_6 = VDEF .MEM_5(D) foo (); # VUSE .MEM_6 pretmp.4_8 = i; # SUCC: 4 [100.0%] (fallthru,exec) # BLOCK 4 freq:1 # PRED: 2 [71.0%] (exec) 3 [100.0%] (fallthru,exec) # .MEM_3 = PHI .MEM_5(D)(2), .MEM_6(3) # prephitmp.5_9 = PHI i.0_1(2), pretmp.4_8(3) L2: switch (prephitmp.5_9) default: L6, case 0: L4 # SUCC: 6 [61.0%] (exec) 5 [39.0%] (exec) # BLOCK 5 freq:3898 # PRED: 4 [39.0%] (exec) L4: # .MEM_7 = VDEF .MEM_3 foo (); # SUCC: 6 [100.0%] (fallthru,exec) # BLOCK 6 freq:1 # PRED: 4 [61.0%] (exec) 5 [100.0%] (fallthru,exec) # .MEM_4 = PHI .MEM_3(4), .MEM_7(5) L6: # VUSE .MEM_4 return; # SUCC: EXIT [100.0%] } Note the lack of range information for i.0_1 for the default case of the first switch. That's going to be a prerequisite for threading through the second switch which its reached via the default path from the first switch. jeff -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/ iQEcBAEBAgAGBQJNsGviAAoJEBRtltQi2kC79/AH/3jrM+zArf6l9tUlyO8dFF4T IBfJW8oq94DQfwyahEh6yk1Qeh6YkV3e5GsmIpI3GAzimhekoXEKchdqbXvYfSvC JNk5FmTlv5rc4SUL+rPLpOeVNxgj46LXjlgUh3d3Ino5PXW8uhal4qidMEPxhonA HHgbwuvYdhLWrJYJ35mEP5HPQGLRTVQtCdpDz+8CXl9D8Cr87y93W+cOEXckhOGI yJeUhTRYftVCOwDdVfzEgqM+3OkGq6PE0TEMh/OUA2zDfeTVfQ8/vv9T521X/DJW v9a7JN0gQM+6Hp5MmpMnSpKCfmeKKXUpvisT/Cr0pHeXKthmbm1nwRFoHalYp5g= =6WXP -END PGP SIGNATURE-
Re: new option -Wno-maybe-uninitialized
On Thu, Apr 21, 2011 at 10:21 AM, Jeff Law l...@redhat.com wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 04/07/11 18:24, Xinliang David Li wrote: Hi, the following patch implements the option to fine control the emitted warnings -- 1) allow suppressing warnings for use of values that may be uninitialized. Definitely uninitialized values that may be used warning is not affected 2) allow fine grain control on promotion of warnings to errors: -Wno-error=maybe-uninitialized This is useful for users who only care about definite uninitialized variable warnings. I'm definitely in favor of having the ability for the user to be able to fine tune the warnings they want. However, I have some concerns. First, I'm not sure how reconcile the may vs must inconsistency. We can have an object which is used in a real statement in the IL, which we currently would classify as must-be-uninitialized, even if the path leading to the statement is unexecutable. I guess I want to find better ways to describe this kind of stuff in our documentation and switch names There are three different kind of uninit warnings: 1) definitely uninitialized, and the use the variable dominates the exit -- this is the must be used uninitialized case 2) definitely uninitialized, but the use may not be executed at runtime; 3) Only initialized in some paths from entry to the use (the use may or may not be executed). Currently 2) and 3) uses the same warning message, which may be misleading. The proposed new option only controls 3). Second, I'm concerned about changing the default behaviour of - -Wuninitialized. I don't think we ever reached a consensus on that issue last time we tried to hash through this stuff. Maybe I missed something, but it appears to me your patch makes -Wuninitialized only warn about real uses and ignore uses in PHIs. No, the patch does not change the default behavior of -Wuninitialized wihch will turn on -Wmaybe-uninitialized as well. The only useful scenarios for the new option are: 1) -Wuninitialized -Wno-maybe-uninitialized 2) -Wuninitialized -Werror=uninitialized -Wno-error=maybe-uninitialized Given the lack of consensus (and I believe there will never be a consensus), I believe we should keep -Wuninitialized behavior as-is and use -Wno-maybe-uninitialized to override -Wuninitialized. Yes, that is this patch is supposed to do -- the attached test case is testing it. If we can come up with better switch name than -Wno-maybe-uninitialized, that would be a good step as well. However, I'm not offhand sure what name to use given the issues surrounding may vs must be used uninitialized. Maybe-uninitialized matches case 3) -- different from case 2 which is actually 'maybe-used-uninitialized'. Thanks, David jeff -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/ iQEcBAEBAgAGBQJNsGecAAoJEBRtltQi2kC7BugH/01X2ppdVUONq2fBzztxF+YR Qn85yccwI45iizNG1dlmyZlYpATYcfTGwDkBYTjiVllhOJa64Ri6bIP3ErySPllk OrSAbwFT96TK9/h1eGfVj45mZS3MO0Pa+sxp93noGnxIMWWOt+7LiyrfJEHHLaUK rPhREXodzLnN0KpXTD8+RS2uzTV+YODAuiWQDkiyQ0XpjCw5w2ccc7dWnrzgvEn5 6x/baTqQxunV8t3/ezUlHiMcUNyMigXccFgIeRRsvMengY/reLxy7eg1i+vdeets DyWR/Hg3HewLAQidL3SAJLLGuqTrNUZ0wswgqghqh9VllAFoUdKn0xk8GFt0wbk= =4aA4 -END PGP SIGNATURE-
Re: fix memory leak in gengtype
On Thu, 21 Apr 2011, Laurynas Biveinis wrote: :( Why don't you get yourself a compile farm account? http://gcc.gnu.org/wiki/CompileFarm Thanks Laurynas, I am absolutely thrilled to see such a variety of hardware! I'll try applying, but I'm not sure I'm eligible, my contributions to OSS are just a few and mostly simple. Thanks, Dimitris
Re: [Patch,AVR]: Solve PR42210
On 04/21/2011 05:31 AM, Georg-Johann Lay wrote: +;; Some combiner patterns dealing with bits. +;; See PR42210 + +;; Move bit $3.$4 into bit $0.$4 +(define_insn *movbitqi.1-6.a ... +(define_insn *movbitqi.1-6.b ... +(define_insn *movbitqi.0 ... +(define_insn *movbitqi.7 This looks to be way more complicated than necessary. Consider implementing the extzv and insv named patterns, and using zero_extract properly. E.g. the following, lightly tested. Looking again at the comment on your last pattern, it looks like that one would still need to be included. r~ diff --git a/gcc/config/avr/avr.md b/gcc/config/avr/avr.md index 1ab3033..e557fcd 100644 --- a/gcc/config/avr/avr.md +++ b/gcc/config/avr/avr.md @@ -3388,3 +3388,74 @@ clr __zero_reg__ [(set_attr length 3) (set_attr cc clobber)]) + +(define_expand extzv + [(set (match_operand:QI 0 register_operand ) + (zero_extract:QI + (match_operand:QI 1 register_operand ) + (match_operand:QI 2 const1_operand ) + (match_operand:QI 3 const_0_to_7_operand )))] + + ) + +(define_expand insv + [(set (zero_extract:QI + (match_operand:QI 0 register_operand ) + (match_operand:QI 1 const1_operand ) + (match_operand:QI 2 const_0_to_7_operand )) + (match_operand:QI 3 nonmemory_operand ))] + + ) + +(define_insn *movbitqi_reg_ext + [(set (match_operand:QI 0 register_operand =r) + (zero_extract:QI + (match_operand:QI 1 register_operand r) + (const_int 1) + (match_operand 2 const_0_to_7_operand )))] + + bst %1,%2\;bld %0,0 + [(set_attr length 2) + (set_attr cc none)]) + +(define_insn *movbitqi_ext_ext + [(set (zero_extract:QI + (match_operand:QI 0 register_operand +r) + (const_int 1) + (match_operand 1 const_0_to_7_operand )) + (zero_extract:QI + (match_operand:QI 2 register_operand r) + (const_int 1) + (match_operand 3 const_0_to_7_operand )))] + + bst %2,%3\;bld %0,%1 + [(set_attr length 2) + (set_attr cc none)]) + +(define_insn *movbitqi_ext_reg + [(set (zero_extract:QI + (match_operand:QI 0 register_operand +*d,*d,r,r,r) + (const_int 1) + (match_operand 1 const_0_to_7_operand )) + (match_operand:QI 2 nonmemory_operand L,P,L,P,r))] + +{ + switch (which_alternative) +{ +case 0: + operands[2] = gen_int_mode (~(1 INTVAL (operands[1])), QImode); + return andi %0,%2; +case 1: + operands[2] = gen_int_mode (1 INTVAL (operands[1]), QImode); + return ori %0,%2; +case 2: + return clt\;bld %0,%1; +case 3: + return set\;bld %0,%1; +case 4: + return bst %2,0\;bld %0,%1; +} + gcc_unreachable (); +} + [(set_attr length 1,1,2,2,2) + (set_attr cc set_zn,set_zn,none,none,none)]) diff --git a/gcc/config/avr/predicates.md b/gcc/config/avr/predicates.md index 9a3473b..e0142d1 100755 --- a/gcc/config/avr/predicates.md +++ b/gcc/config/avr/predicates.md @@ -62,6 +62,11 @@ (and (match_code const_int,const_double) (match_test op == CONST0_RTX (mode +;; Return 1 if OP is the one constant for MODE. +(define_predicate const1_operand + (and (match_code const_int) + (match_test op == CONST1_RTX (mode + ;; Returns true if OP is either the constant zero or a register. (define_predicate reg_or_0_operand (ior (match_operand 0 register_operand) @@ -106,6 +111,11 @@ (and (match_code const_int) (match_test exact_log2(~INTVAL (op) GET_MODE_MASK (mode)) = 0))) +;; Return true if OP is a constant between 0 and 7. +(define_predicate const_0_to_7_operand + (and (match_code const_int) + (match_test IN_RANGE (INTVAL (op), 0, 7 + ;; (define_predicate avr_sp_immediate_operand (and (match_code const_int)
[PATCH] PR 48175, Make CASE_VALUES_THRESHOLD settable via --param
In looking at some improvements to the powerpc, we wanted to change the default for when a table jump is generated vs. a series of if statements. Now, we could just add a powerpc specific TARGET_CASE_VALUES_THRESHOLD, but I tend to think that these should be settable on all/most ports with --param. At present, there are only two ports (avr and mn10300) that define their own TARGET_CASE_VALUES_THRESHOLD hook. My first patch does not remove the target hook and modify the avr/mn10300 ports to use maybe_set_param_value, but that can be done if desired. The patch adds two --param values, one for when the port is using the casesi insn, and the other when it uses the more primitive tablejump insn. I have bootstrapped the compiler with this patch and run the test suite with no regressions. Is it ok to apply as is? Should I modify the avr and mn10300 ports to use the parameters and do away with the target hook? Or should I do this just as a powerpc target hook? [gcc] 2011-04-21 Michael Meissner meiss...@linux.vnet.ibm.com PR rtl-optimization/48715 * params.def (PARAM_CASE_VALUES_THRESHOD_CASESI): New parameter. (PARAM_CASE_VALUES_THRESHOLD_TABLEJUMP): Ditto. * params.h (CASE_VALUES_THRESHOLD_CASESI): Define. (CASE_VALUES_THRESHOLD_TABLEJUMP): Ditto. * targhooks.c (toplevel): Include params.h. (default_case_values_threshold): Use CASE_VALUES_THRESHOLD_CASESI and CASE_VALUES_THRESHOLD_TABLEJUMP. * Makefile.in (targhooks.o): Add $(PARAMS_H) dependency. * doc/invoke.texi (case-values-threshold-casesi): Document. (case-values-threshold-tablejump): Ditto. [gcc/testsuite] 2011-04-21 Michael Meissner meiss...@linux.vnet.ibm.com PR rtl-optimization/48715 * gcc.target/powerpc/case-threshold1.c: New file to test --param case-values-threshold-tablejump. * gcc.target/powerpc/case-threshold2.c: Ditto. -- Michael Meissner, IBM 5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA meiss...@linux.vnet.ibm.com fax +1 (978) 399-6899 Index: gcc/params.def === --- gcc/params.def (revision 172832) +++ gcc/params.def (working copy) @@ -885,6 +885,22 @@ DEFPARAM (PARAM_MAX_STORES_TO_SINK, 2, 0, 0) +/* Threshold for using jump table vs. if statements if the machine supports a + CASESI instruction. */ +DEFPARAM (PARAM_CASE_VALUES_THRESHOLD_CASESI, + case-values-threshold-casesi, + Minimum number of case elements to use a jump table instead of + if statements if the machine supports a CASESI instruction, + 4, 2, 0) + +/* Threshold for using jump table vs. if statements if the machine does not + support a CASESI instruction. */ +DEFPARAM (PARAM_CASE_VALUES_THRESHOLD_TABLEJUMP, + case-values-threshold-tablejump, + Minimum number of case elements to use a jump table instead of + if statements if the machine does not support a CASESI instruction, + 5, 2, 0) + /* Local variables: mode:c Index: gcc/params.h === --- gcc/params.h(revision 172832) +++ gcc/params.h(working copy) @@ -206,4 +206,8 @@ extern void init_param_values (int *para PARAM_VALUE (PARAM_MIN_NONDEBUG_INSN_UID) #define MAX_STORES_TO_SINK \ PARAM_VALUE (PARAM_MAX_STORES_TO_SINK) +#define CASE_VALUES_THRESHOLD_CASESI \ + PARAM_VALUE (PARAM_CASE_VALUES_THRESHOLD_CASESI) +#define CASE_VALUES_THRESHOLD_TABLEJUMP \ + PARAM_VALUE (PARAM_CASE_VALUES_THRESHOLD_TABLEJUMP) #endif /* ! GCC_PARAMS_H */ Index: gcc/targhooks.c === --- gcc/targhooks.c (revision 172832) +++ gcc/targhooks.c (working copy) @@ -71,7 +71,7 @@ along with GCC; see the file COPYING3. #include opts.h #include tree-flow.h #include tree-ssa-alias.h - +#include params.h bool default_legitimate_address_p (enum machine_mode mode ATTRIBUTE_UNUSED, @@ -1209,7 +1209,9 @@ default_target_can_inline_p (tree caller unsigned int default_case_values_threshold (void) { - return (HAVE_casesi ? 4 : 5); + return (HAVE_casesi + ? CASE_VALUES_THRESHOLD_CASESI + : CASE_VALUES_THRESHOLD_TABLEJUMP); } bool Index: gcc/Makefile.in === --- gcc/Makefile.in (revision 172832) +++ gcc/Makefile.in (working copy) @@ -2807,7 +2807,7 @@ opts-common.o : opts-common.c $(OPTS_H) targhooks.o : targhooks.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TREE_H) \ $(EXPR_H) $(TM_H) $(RTL_H) $(TM_P_H) $(FUNCTION_H) output.h $(DIAGNOSTIC_CORE_H) \ $(MACHMODE_H) $(TARGET_DEF_H) $(TARGET_H) $(GGC_H) gt-targhooks.h \ - $(OPTABS_H) $(RECOG_H) reload.h hard-reg-set.h intl.h $(OPTS_H) \ + $(OPTABS_H) $(RECOG_H) reload.h hard-reg-set.h intl.h $(OPTS_H) $(PARAMS_H) \ tree-ssa-alias.h
Re: [PATCH] centralize builtin function type building
On Thu, Apr 21, 2011 at 05:36:42PM +0200, Richard Guenther wrote: On Thu, Apr 21, 2011 at 5:04 PM, Nathan Froyd froy...@codesourcery.com wrote: This patch does two things: - centralizes some infrastructure for defining builtin function types for frontends by providing a common function that DEF_FUNCTION_TYPE_FOO macros can call; and - in order to do that well, it also introduces build{,_varargs}_function_type_array for cases when build_function_type_list's interface doesn't work so well. build_function_type_list could have been used instead, but it would lose the error_mark_node handling provided in the C/C++/Ada/LTO frontends. This new interface will be necessary for eliminating other uses of build_function_type anyway. It would have been easier to move all of the builtin-types stuff into the middle-end, but Fortran doesn't use builtin-types.def. Even if it did, I suppose it's possible that some new front-end could have its own set of builtin types, so I'm leaving things as they are. But this is what should be done, at least for all builtins in the BUILT_IN_NORMAL category. ISTR Fortran was once running into the issue of assigning different DECL_FUNCTION_CODE numbers to those builtins than other languages, breaking LTO. So, it would be indeed nice to have a central middle-end place to instantiate those builtins and their required types. I'm not sure how far we are from that and am too lazy to look right now ... I agree that it is desirable that the backend builtin index not overlap with the standard builtin index (and front end builtin index). I was starting to look at this, when I learned Kenneth Zadeck was working on a more comprehensive solution for backend builtin types. Obviously all 3 of our efforts should be merged into one goal. -- Michael Meissner, IBM 5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA meiss...@linux.vnet.ibm.com fax +1 (978) 399-6899
Re: Improve stack layout heuristic.
On Thu, Apr 21, 2011 at 8:43 AM, Richard Guenther richard.guent...@gmail.com wrote: On Thu, Apr 21, 2011 at 5:22 PM, Michael Matz m...@suse.de wrote: Hi, On Wed, 20 Apr 2011, Easwaran Raman wrote: But you're right - not adding that conflict doesn't actually reduce the size of bit maps. Reverting back to what was there originally. Thanks, I have no more issues with the patch. You'll need to find someone who can formally approve it, though. Ok with a proper changelog entry. Richard. Ciao, Michael. Committed with the following Changelog entry: 2011-04-21 Easwaran Raman era...@google.com * gcc/cfgexpand.c (stack_var): Remove OFFSET... (add_stack_var): ...and its reference here... (expand_stack_vars): ...and here. (stack_var_cmp): Sort by descending order of size. (partition_stack_vars): Change heuristic. (union_stack_vars): Fix to reflect changes in partition_stack_vars. (dump_stack_var_partition): Add newline after each partition. testsuite/Changelog: 2011-04-21 Easwaran Raman era...@google.com * gcc.dg/stack-layout-2.c: New test. Thanks, Easwaran
Re: [patch, fortran] PR 48405 - Front end expressions in DO loops
Hi Mikael, Regression-testing passed. Ping ** 0.25? Thomas OK. Thanks Waiting for Emacs... Sende fortran/ChangeLog Sende fortran/frontend-passes.c Sende testsuite/ChangeLog Hinzufügen testsuite/gfortran.dg/function_optimize_6.f90 Übertrage Daten Revision 172838 übertragen. Thanks for the review! Thomas
Re: [PATCH] Avoid regressing with the PR48248 change
On Mon, 18 Apr 2011, Richard Guenther wrote: This avoids changing -P output with the PR48248 fix which appearantly breaks Chrome (with has bogus assumptions on gcc -E -P output). For 4.7 I think we should instead go with the 2nd patch and make -P output smaller (which is the whole reason for this code path). Thus, first patch is for the 4.6 and 4.5 branches and the second for the trunk. Sofar I bootstrapped and tested the first on the 4.6 branch. Ok for both variants after testing completed? OK. -- Joseph S. Myers jos...@codesourcery.com
Re: RFA: Improve jump threading #2 of N
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 04/21/11 11:26, Steven Bosscher wrote: Bootstrapped and regression tested on x86_64-unknown-linux-gnu. OK for trunk? Would this also fix PR18046? Also note that the assertion machinery doesn't really have the concept of anti-ranges, much less the ability to build up an anti-range by repeatedly excluding part of a range (say as we iterate over the labels that go somewhere other than the default). So while it wouldn't be terribly hard to deal with this testcase, it's not going to be very effective on real world code. To be effective on real world code we need to: 1. Lower trivial switches 2. build up anti-ranges as we encounter case labels that go elsewhere 3. track distinct values in a range at meet points If you look at something like Coverity it clearly does this kind of analysis and represents a clear case where Coverity can do better localized code analysis. jeff -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/ iQEcBAEBAgAGBQJNsIZVAAoJEBRtltQi2kC7004H/jjGdzr1+WM5z+XMSGzLlLhF crz7ALIi/Ul+icUdlre+F1gUofAp+8g210uYESbeEJvPTtX7lBS0dQe+TX6r1Xvr 7kjKe2/iP1fl7lQzkNnbOqXygbmEFKG1ySwKIg0XkD7he58BDSAOaC1OgArpJAvI ppYZAO1Tkkqy/38+Jdj2emFbiayFqbHmPid0QaoMywDkxl3a5ZElQo1+h2jpUEir is8fb4tULoiswD4xL9PgNw5xFgDfqUUjXYVDEBWgQDI0DRaXJd/utO9C+Rg2to2S jxFnrZ8ljSZ++KapVHSjDrRSiTm+tDFaGxRHz1+8VtBn9siBHs4yFoZ1VjgbrZ4= =bNEt -END PGP SIGNATURE-
Re: Improve stack layout heuristic.
2011-04-21 Easwaran Raman era...@google.com * gcc/cfgexpand.c (stack_var): Remove OFFSET... (add_stack_var): ...and its reference here... (expand_stack_vars): ...and here. (stack_var_cmp): Sort by descending order of size. (partition_stack_vars): Change heuristic. (union_stack_vars): Fix to reflect changes in partition_stack_vars. (dump_stack_var_partition): Add newline after each partition. No gcc/ prefix. Paths are relative to the ChangeLog file. -- Eric Botcazou
Re: FDO usage: -Wcoverage-mismatch should not ignore -Wno-error
On Tue, 19 Apr 2011, Xinliang David Li wrote: 2011-04-18 Neil Vachharajani nvach...@gmail.com * flags.c: New flag variable. * opts.c (common_handle_options): Set flag_werror_set. * opts-global.c (decode_options): Delay Werror decision for Wcoverage-mismatch util after options are parsed. This patch is certainly wrong, since common_handle_option must not set any global state, only state pointed to by its arguments. That said, setting -Werror=coverage-mismatch in decode_options at all is bad because decode_options is called when optimize attributes are processed; as-is, a -Wno-error=coverage-mismatch option will be overridden if such attributes are used. So the right place to set this is probably later, in process_options. And this can check global_options_set.x_warnings_are_errors to see if an explicit -Werror/-Wno-error option was passed. -- Joseph S. Myers jos...@codesourcery.com
Re: [patch] Split Parse Timevar (issue4378056)
On Wed, Apr 20, 2011 at 18:00, Jason Merrill ja...@redhat.com wrote: On 04/12/2011 11:49 AM, Lawrence Crowl wrote: This patch is available for review at http://codereview.appspot.com/4378056 I tried to comment there, but it didn't seem to be working; looking at the side-by-side diffs didn't show any changes, and double-clicking on a line in the patch form didn't let me add a comment. If you are logged in and click on the file (not the side-by-side), you can double-click on the patch hunk you want to comment on (you'll need to cut-n-paste some context due to http://code.google.com/p/rietveld/issues/detail?id=291can=4). Now, for some reason I never could quite grasp, some of Lawrence's patches were uploaded in such a way that both the base file and the patched file are the same. That's why the side-by-side view is not working. Subsequent patches from Lawrence did not have that problem. So, whatever it was, it got fixed (but I'm not sure what fixed it). Diego.
Re: [google]Pass --save-temps to the assembler (issue4436049)
On Tue, Apr 19, 2011 at 12:02:08PM -0700, Easwaran Raman wrote: This makes the gcc driver pass the --save-temps option to the assembler or assembler wrapper so that post-assembly tools like MAO can be integrated. Bootstraps on x86_64. Ok for google/main? 2011-04-19 Easwaran Raman era...@google.com * gcc/gcc.c (static const char *asm_options): Pass --save-temps to the assembler. Index: gcc/gcc.c === --- gcc/gcc.c (revision 172727) +++ gcc/gcc.c (working copy) @@ -780,7 +780,7 @@ static const char *asm_options = #if HAVE_GNU_AS /* If GNU AS is used, then convert -w (no warnings), -I, and -v to the assembler equivalents. */ -%{v} %{w:-W} %{I*} +%{v} %{w:-W} %{I*} %{save-temps:--save-temps} #endif %a %Y %{c:%W{o*}%{!o*:-o %w%b%O}}%{!c:-o %d%w%u%O}; This patch should also support -save-temps=obj. -- Michael Meissner, IBM 5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA meiss...@linux.vnet.ibm.com fax +1 (978) 399-6899
Re: [PATCH] PR c/36750: Suppress missing field initializer warning for '= {0}'
On Thu, 21 Apr 2011, Alexander Monakov wrote: Hello, This patch suppresses the missing field initializer warning when a structure is initialized with ` = { 0 }' in C. Even though the PR author asks specifically to suppress (at least) only when a trailing comma is included, results from Google code search suggest that spelling without a comma is more common, so the patch does not distinguish these variants. Behavior of C++ front-end is unchanged. Bootstrapped and regtested on x86_64-linux, OK for trunk? OK. -- Joseph S. Myers jos...@codesourcery.com
Re: FDO usability: pid handling
@@ -730,6 +726,8 @@ void cgraph_clone_inlined_nodes (struct void compute_inline_parameters (struct cgraph_node *); cgraph_inline_failed_t cgraph_edge_inlinable_p (struct cgraph_edge *); +void cgraph_init_node_map (void); +void cgraph_del_node_map (void); Given that you don't even export API for using it, I would go for init_node_map/del_node_map in profile.h. It is nothing generic that needs to be included into half of compiler. -static struct cgraph_node** pid_map = NULL; +typedef struct +{ + struct cgraph_node *n; +} cgraph_node_ptr_t; -/* Initialize map of pids (pid - cgraph node) */ +DEF_VEC_O (cgraph_node_ptr_t); +DEF_VEC_ALLOC_O (cgraph_node_ptr_t, heap); You don't need wrapping struct. In cgraph.h you already have: DEF_VEC_P(varpool_node_ptr); DEF_VEC_ALLOC_P(varpool_node_ptr,heap); DEF_VEC_ALLOC_P(varpool_node_ptr,gc); so you can use vector of cgraph_node_ptr With those changes the patch is OK.
Re: Improve stack layout heuristic.
On Thu, Apr 21, 2011 at 12:30 PM, Eric Botcazou ebotca...@adacore.com wrote: 2011-04-21 Easwaran Raman era...@google.com * gcc/cfgexpand.c (stack_var): Remove OFFSET... (add_stack_var): ...and its reference here... (expand_stack_vars): ...and here. (stack_var_cmp): Sort by descending order of size. (partition_stack_vars): Change heuristic. (union_stack_vars): Fix to reflect changes in partition_stack_vars. (dump_stack_var_partition): Add newline after each partition. No gcc/ prefix. Paths are relative to the ChangeLog file. -- Eric Botcazou Sorry. Fixed that entry and committed. Thanks, Easwaran
Re: FDO usability: pid handling
On Thu, Apr 21, 2011 at 1:36 PM, Jan Hubicka hubi...@ucw.cz wrote: @@ -730,6 +726,8 @@ void cgraph_clone_inlined_nodes (struct void compute_inline_parameters (struct cgraph_node *); cgraph_inline_failed_t cgraph_edge_inlinable_p (struct cgraph_edge *); +void cgraph_init_node_map (void); +void cgraph_del_node_map (void); Given that you don't even export API for using it, I would go for init_node_map/del_node_map in profile.h. It is nothing generic that needs to be included into half of compiler. Ok. -static struct cgraph_node** pid_map = NULL; +typedef struct +{ + struct cgraph_node *n; +} cgraph_node_ptr_t; -/* Initialize map of pids (pid - cgraph node) */ +DEF_VEC_O (cgraph_node_ptr_t); +DEF_VEC_ALLOC_O (cgraph_node_ptr_t, heap); You don't need wrapping struct. In cgraph.h you already have: DEF_VEC_P(varpool_node_ptr); DEF_VEC_ALLOC_P(varpool_node_ptr,heap); DEF_VEC_ALLOC_P(varpool_node_ptr,gc); so you can use vector of cgraph_node_ptr But ptr vector does allow me to put VEC_index as the LHS of the assignment -- or have I missed something? Thanks, David With those changes the patch is OK.
Re: FDO usability patch -- cfg and lineno checksum
I don't really follow the logic here. buffer is allocated to be size of block+4 and it is expected that gcov_write_words is not executed on size greater than 4. Since gcov_write_string now seems to be expected to handle strings of bigger size, I think you acually need to make write_string to write in chunks when you reach block boundary? gcov_write_words is used to reserve words*4 bytes in buffer for data write later. The the old logic is wrong -- if 'words' is large enough, it will lead to out of bound access. The old logic assumes that writes are not bigger than 4 and it is documented somewhere in gcov-io.h if I remember right (it is not my code). I however still think your change won't solve it in general, since you might want to write more than block size, so you need write_string update. What gets into libgcov is very problematic busyness for embedded targets, where you really want libgcov to be small. Why do you need to store strings now? It is needed to store the assembler name for functions. It allows lookup of the profile data using assembler name as key instead of using function ids. For gcc, the total size of gcda files is about 59k bytes without storing the names, and about 65k with names -- about 10% increase. For eon, the size changes from 27k to 35k bytes. I can split the patches into two parts -- one with cfg checksum and one with the name string. Well, lets make it incremental change then at time we will have use for it once we agree it makes sense. So you want to do symbol name resolving at GCOV runtime? I duno about embedded targets, 10% increase is not too bad, but I don't know. libgcov has quite inflexible coding style just to be small and fit there, so I don't know what are the space constraints that are considered acceptable in that world. We also need to bump gcov version, right? Yes -- but the version is currently derived from gcc version number and phase number --- this is wrong -- different version of gcc may have compatible coverage data format. Any suggestions to change this? --- probably just hard code the GCOV_VERSION string? Hmm, I am pretty sure we used to have minor/major numbering on the file format. Perhaps it went away with Nathan's changes. For -fprofile-use the compatibility across versions is pointless, at least now, but for gcov utility it makes sense. So I would go for explicit versioning again. Honza
Re: FDO usability: pid handling
On Thu, Apr 21, 2011 at 1:36 PM, Jan Hubicka hubi...@ucw.cz wrote: @@ -730,6 +726,8 @@ void cgraph_clone_inlined_nodes (struct void compute_inline_parameters (struct cgraph_node *); cgraph_inline_failed_t cgraph_edge_inlinable_p (struct cgraph_edge *); +void cgraph_init_node_map (void); +void cgraph_del_node_map (void); Given that you don't even export API for using it, I would go for init_node_map/del_node_map in profile.h. It is nothing generic that needs to be included into half of compiler. Ok. -static struct cgraph_node** pid_map = NULL; +typedef struct +{ + struct cgraph_node *n; +} cgraph_node_ptr_t; -/* Initialize map of pids (pid - cgraph node) */ +DEF_VEC_O (cgraph_node_ptr_t); +DEF_VEC_ALLOC_O (cgraph_node_ptr_t, heap); You don't need wrapping struct. In cgraph.h you already have: DEF_VEC_P(varpool_node_ptr); DEF_VEC_ALLOC_P(varpool_node_ptr,heap); DEF_VEC_ALLOC_P(varpool_node_ptr,gc); so you can use vector of cgraph_node_ptr But ptr vector does allow me to put VEC_index as the LHS of the assignment -- or have I missed something? If you want to set value, then use VEC_replace Honza
Re: FDO usability patch -- cfg and lineno checksum
On Thu, Apr 21, 2011 at 1:43 PM, Jan Hubicka hubi...@ucw.cz wrote: I don't really follow the logic here. buffer is allocated to be size of block+4 and it is expected that gcov_write_words is not executed on size greater than 4. Since gcov_write_string now seems to be expected to handle strings of bigger size, I think you acually need to make write_string to write in chunks when you reach block boundary? gcov_write_words is used to reserve words*4 bytes in buffer for data write later. The the old logic is wrong -- if 'words' is large enough, it will lead to out of bound access. The old logic assumes that writes are not bigger than 4 and it is documented somewhere in gcov-io.h if I remember right (it is not my code). I however still think your change won't solve it in general, since you might want to write more than block size, so you need write_string update. Right, 'words; can be large for string. -- the memcpy code following does not look right either. What gets into libgcov is very problematic busyness for embedded targets, where you really want libgcov to be small. Why do you need to store strings now? It is needed to store the assembler name for functions. It allows lookup of the profile data using assembler name as key instead of using function ids. For gcc, the total size of gcda files is about 59k bytes without storing the names, and about 65k with names -- about 10% increase. For eon, the size changes from 27k to 35k bytes. I can split the patches into two parts -- one with cfg checksum and one with the name string. Well, lets make it incremental change then at time we will have use for it once we agree it makes sense. Ok. So you want to do symbol name resolving at GCOV runtime? For now, it is just passed through as a tag for the profile data for each function. I duno about embedded targets, 10% increase is not too bad, but I don't know. libgcov has quite inflexible coding style just to be small and fit there, so I don't know what are the space constraints that are considered acceptable in that world. I will delay the string writing patch for now. We also need to bump gcov version, right? Yes -- but the version is currently derived from gcc version number and phase number --- this is wrong -- different version of gcc may have compatible coverage data format. Any suggestions to change this? --- probably just hard code the GCOV_VERSION string? Hmm, I am pretty sure we used to have minor/major numbering on the file format. Perhaps it went away with Nathan's changes. For -fprofile-use the compatibility across versions is pointless, Right, but also equally unnecessary to force version mismatch between compiler releases. at least now, but for gcov utility it makes sense. Yes. So I would go for explicit versioning again. Will dig through the revision history .. David Honza
Re: [google]Pass --save-temps to the assembler (issue4436049)
On Thu, Apr 21, 2011 at 12:07 PM, Michael Meissner meiss...@linux.vnet.ibm.com wrote: On Tue, Apr 19, 2011 at 12:02:08PM -0700, Easwaran Raman wrote: This makes the gcc driver pass the --save-temps option to the assembler or assembler wrapper so that post-assembly tools like MAO can be integrated. Bootstraps on x86_64. Ok for google/main? 2011-04-19 Easwaran Raman era...@google.com * gcc/gcc.c (static const char *asm_options): Pass --save-temps to the assembler. Index: gcc/gcc.c === --- gcc/gcc.c (revision 172727) +++ gcc/gcc.c (working copy) @@ -780,7 +780,7 @@ static const char *asm_options = #if HAVE_GNU_AS /* If GNU AS is used, then convert -w (no warnings), -I, and -v to the assembler equivalents. */ -%{v} %{w:-W} %{I*} +%{v} %{w:-W} %{I*} %{save-temps:--save-temps} #endif %a %Y %{c:%W{o*}%{!o*:-o %w%b%O}}%{!c:-o %d%w%u%O}; This patch should also support -save-temps=obj. When using gcc with a post-assembly tool, we use a wrapper that invokes the tool and needs to know if the tool's output (another assembly file) needs to be saved. The wrapper doesn't generate any additional .o files and hence doesn't need to know whether the obj file is to be saved (which is done by the driver). Why do we need to pass -save-temps=obj in that case? -Easwaran -- Michael Meissner, IBM 5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA meiss...@linux.vnet.ibm.com fax +1 (978) 399-6899
Re: [google]Pass --save-temps to the assembler (issue4436049)
On Thu, Apr 21, 2011 at 3:18 PM, Easwaran Raman era...@google.com wrote: When using gcc with a post-assembly tool, we use a wrapper that invokes the tool and needs to know if the tool's output (another assembly file) needs to be saved. The wrapper doesn't generate any additional .o files and hence doesn't need to know whether the obj file is to be saved (which is done by the driver). Why do we need to pass -save-temps=obj in that case? -save-temps=obj means place the save temps in the same directory as the object file would be put rather than the current directory. Thanks, Andrew Pinski
Re: FDO usage: -Wcoverage-mismatch should not ignore -Wno-error
On Thu, Apr 21, 2011 at 12:54 PM, Joseph S. Myers jos...@codesourcery.com wrote: On Tue, 19 Apr 2011, Xinliang David Li wrote: 2011-04-18 Neil Vachharajani nvach...@gmail.com * flags.c: New flag variable. * opts.c (common_handle_options): Set flag_werror_set. * opts-global.c (decode_options): Delay Werror decision for Wcoverage-mismatch util after options are parsed. This patch is certainly wrong, since common_handle_option must not set any global state, only state pointed to by its arguments. That said, setting -Werror=coverage-mismatch in decode_options at all is bad because decode_options is called when optimize attributes are processed; as-is, a -Wno-error=coverage-mismatch option will be overridden if such attributes are used. Not sure if I understand the comment on the 'option be overriden' -- this is not happening with the patch. As long as the the diagnostic_classification for coverage-mismatch is explicitly specified, it will be honored. So the right place to set this is probably later, in process_options. And this can check global_options_set.x_warnings_are_errors to see if an explicit -Werror/-Wno-error option was passed. The problem is that when warning_as_error_requested is 0, there is no way to tell if it is the default or it is user has specified -Wno-error. Maybe we should not make -Wcoverage-mismatch warnings to be promoted to errors by default and relies on it to be turned on explicitly via -Werror, or -Werror=coverage-mismatch. There are probably not many people depending on the old behavior. Honza, what is your opinion on this? Thanks, David -- Joseph S. Myers jos...@codesourcery.com
Re: FDO usage: -Wcoverage-mismatch should not ignore -Wno-error
On Thu, 21 Apr 2011, Xinliang David Li wrote: On Thu, Apr 21, 2011 at 12:54 PM, Joseph S. Myers jos...@codesourcery.com wrote: On Tue, 19 Apr 2011, Xinliang David Li wrote: 2011-04-18 Neil Vachharajani nvach...@gmail.com * flags.c: New flag variable. * opts.c (common_handle_options): Set flag_werror_set. * opts-global.c (decode_options): Delay Werror decision for Wcoverage-mismatch util after options are parsed. This patch is certainly wrong, since common_handle_option must not set any global state, only state pointed to by its arguments. That said, setting -Werror=coverage-mismatch in decode_options at all is bad because decode_options is called when optimize attributes are processed; as-is, a -Wno-error=coverage-mismatch option will be overridden if such attributes are used. Not sure if I understand the comment on the 'option be overriden' -- this is not happening with the patch. As long as the the I am referring to the state before the patch. But in general decode_options is the wrong place for any once-only initialization. So the right place to set this is probably later, in process_options. And this can check global_options_set.x_warnings_are_errors to see if an explicit -Werror/-Wno-error option was passed. The problem is that when warning_as_error_requested is 0, there is no way to tell if it is the default or it is user has specified -Wno-error. Maybe we should not make -Wcoverage-mismatch warnings to I referred to global_options_set.x_warnings_are_errors, not warning_as_error_requested. -- Joseph S. Myers jos...@codesourcery.com
Re: [RFC PATCH, go]: Port to ALPHA arch - sysinfo.go fixup
Rainer Orth r...@cebitec.uni-bielefeld.de writes: Here's the error I run into: /vol/gcc/src/hg/trunk/irix/libgo/go/os/file.go:432:12: error: incompatible types in assignment (implicit assignment of 'syscall.Timeval' hidden field '_f0') /vol/gcc/src/hg/trunk/irix/libgo/go/os/file.go:433:12: error: incompatible types in assignment (implicit assignment of 'syscall.Timeval' hidden field '_f0') /vol/gcc/src/hg/trunk/irix/libgo/go/os/file.go:434:37: error: argument 2 has incompatible type (implicit assignment of 'syscall.Timeval' hidden field '_f0') make[8]: *** [os/os.lo] Error 1 What does the line for timeval look like in gen-sysinfo.go? I get type Timeval struct { _f0 int32; Sec Timeval_sec_t; Usec Timeval_usec_t; } Thanks. I fixed this problem with this patch. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline. Ian 2011-04-21 Ian Lance Taylor i...@google.com * godump.c (go_format_type): Use exported Go name for anonymous field name. Index: gcc/godump.c === --- gcc/godump.c (revision 172331) +++ gcc/godump.c (working copy) @@ -675,7 +675,7 @@ go_format_type (struct godump_container { char buf[100]; - obstack_grow (ob, _f, 2); + obstack_grow (ob, Godump_, 2); snprintf (buf, sizeof buf, %d, i); obstack_grow (ob, buf, strlen (buf)); i++;
Re: [patch] Split Parse Timevar (issue4378056)
On 04/21/2011 07:17 PM, Lawrence Crowl wrote: @@ -1911,7 +1911,7 @@ ggc_collect (void) - timevar_push (TV_GC); + timevar_start (TV_GC); Why this change? GC time shouldn't be counted against whatever we happen to be parsing when it happens. If not, then code that generates lots of garbage does not get charged for the cost to collect it. I thought it best to separate these issues. Sure, but the problem is that the collection doesn't always happen in the same place that generated most of the garbage. +DEFTIMEVAR (TV_PHASE_C_WRAPUP_CHECK , phase C wrapup check) +DEFTIMEVAR (TV_PHASE_CP_DEFERRED , phase C++ deferred) Why do these need to be different timevars? The are measuring different things. They are less different now than they were during earlier development. We can make them the same if you want. I think we could describe both as language-specific finalization. +DEFTIMEVAR (TV_PARSE_INMETH , parser inl. meth. body) Is it really important to distinguish this from other functions? This distinction is here to help evaluate potential speedup due to lazy parsing. It might make some sense to separate functions and inline functions, which also wouldn't have to be parsed immediately. That makes sense. Inlines in the class aren't significantly different from inlines outside the class, but inlines are significantly different from non-inlines for our purposes. -DEFTIMEVAR (TV_TEMPLATE_INSTANTIATION, template instantiation) +DEFTIMEVAR (TV_INSTANTIATE_TEMPLATE , instantiate template) Why these changes? Just to shorten the names. I'd prefer to keep it in the noun form. -DEFTIMEVAR (TV_NAME_LOOKUP , name lookup) -DEFTIMEVAR (TV_OVERLOAD , overload resolution) +DEFTIMEVAR (TV_NAME_LOOKUP , |name lookup) +DEFTIMEVAR (TV_RESOLVE_OVERLOAD , |overload resolution) And here you significantly lengthened one. :) The | (also in TV_GC) indicates that these vars are collecting time concurrently with the other non-phase variables. It is intended to remind readers not to add those times into totals. Hmm, I guess that makes sense, but it should be documented. And perhaps move these timevars to the beginning or end so that they don't look like subsets of template instantiation. @@ -564,6 +564,8 @@ compile_file (void) + timevar_start (TV_PHASE_PARSING); Why does this happen before... + timevar_push (TV_PARSE_GLOBAL); ...this? I would think the bits in there should be part of _SETUP. We could do that, though it would involve splitting the start/stop calls into different functions. That seemed hard to manage. As it stands, TV_PHASE_SETUP is entirely before compile_file() and TV_PHASE_FINALIZE is entirely after. Thoughts? The code is cleaner the way you have it, but not as correct, as there's some initialization being charged to parsing. Jason
Re: [PATCH PING] c++-specific bits of tree-slimming patches
On Fri, Apr 08, 2011 at 01:50:24PM -0400, Jason Merrill wrote: On 03/24/2011 09:15 AM, Nathan Froyd wrote: + tree t = make_node (CASE_LABEL_EXPR); + + TREE_TYPE (t) = void_type_node; + SET_EXPR_LOCATION (t, input_location); As jsm and richi said, using input_location like this is a regression. Can we use DECL_SOURCE_LOCATION (label_decl) instead? I went off and tried that; some callers provide a NULL label_decl. What's the right thing to do in that case--use UNKNOWN_LOCATION or input_location? I'm leaning towards the former. -Nathan
fix up hot/cold partitioning on ports that don't have long conditional branches
This patch fixes up hot/cold partitioning on ports that don't have long conditional branches. I'll note that the entire file has lots of other jump optimizations that are suspect. Ok? 2011-04-21 Mike Stump mikest...@comcast.net * reorg.c (relax_delay_slots): Don't delete a jump that crosses a section boundary. (follow_jumps): Don't replace a short conditional jump with a long conditional jump when the port doesn't have long conditional jumps. (fill_slots_from_thread): Pass insn to follow_jumps. (relax_delay_slots): Likewise. Index: reorg.c === --- reorg.c (revision 1301) +++ reorg.c (working copy) @@ -2509,7 +2509,7 @@ since that tells caller to avoid changing the insn. */ static rtx -follow_jumps (rtx label) +follow_jumps (rtx label, rtx jump) { rtx insn; rtx next; @@ -2529,6 +2529,16 @@ { rtx tem; + /* If a label crosses a section boundary and we're thinking +about changing a conditional jump to be a conditional jump +across that boundary, don't do it if the port doesn't have +long conditional branches. We can however jump to the label +just before we cross such a boundary. */ + if (!HAS_LONG_COND_BRANCH + find_reg_note (insn, REG_CROSSING_JUMP, NULL_RTX) + (condjump_p (jump) || condjump_in_parallel_p (jump))) + return value; + /* If we have found a cycle, make the insn jump to itself. */ if (JUMP_LABEL (insn) == label) return label; @@ -2991,7 +3001,7 @@ redirect_with_delay_list_safe_p (insn, JUMP_LABEL (new_thread), delay_list)) - new_thread = follow_jumps (JUMP_LABEL (new_thread)); + new_thread = follow_jumps (JUMP_LABEL (new_thread), new_thread); if (new_thread == 0) label = find_end_label (); @@ -3342,12 +3352,13 @@ (condjump_p (insn) || condjump_in_parallel_p (insn)) (target_label = JUMP_LABEL (insn)) != 0) { - target_label = skip_consecutive_labels (follow_jumps (target_label)); + target_label = skip_consecutive_labels (follow_jumps (target_label, insn)); if (target_label == 0) target_label = find_end_label (); if (target_label next_active_insn (target_label) == next - ! condjump_in_parallel_p (insn)) + ! condjump_in_parallel_p (insn) + find_reg_note (insn, REG_CROSSING_JUMP, NULL_RTX) == NULL_RTX) { delete_jump (insn); continue; @@ -3492,7 +3503,7 @@ { /* If this jump goes to another unconditional jump, thread it, but don't convert a jump into a RETURN here. */ - trial = skip_consecutive_labels (follow_jumps (target_label)); + trial = skip_consecutive_labels (follow_jumps (target_label, delay_insn)); if (trial == 0) trial = find_end_label ();
Re: Fix 20020425-1.c
Ping? On Apr 12, 2011, at 4:36 PM, Mike Stump wrote: This fixes 20020425-1.c when the compiler under test is built with -O0 and we're on a machine with an 8 meg stack. Ok? 2011-04-12 Mike Stump mikest...@comcast.net * c-typeck.c (c_finish_if_stmt): Fold result. * fold-const.c (fold_ternary_loc): Handle an empty else. fold.diffs.txt
Update my email address
Hi, I have committed this patch to update my email address. Jie 2011-04-21 Jie Zhang jzhang...@gmail.com * MAINTAINERS: Update my email address. Index: MAINTAINERS === --- MAINTAINERS (revision 172853) +++ MAINTAINERS (working copy) @@ -49,7 +49,7 @@ avr port Anatoly Sokolov ae...@post.ru avr port Eric Weddington eric.wedding...@atmel.com bfin port Bernd Schmidt ber...@codesourcery.com -bfin port Jie Zhang j...@codesourcery.com +bfin port Jie Zhang jzhang...@gmail.com cris port Hans-Peter Nilsson h...@axis.com fr30 port Nick Clifton ni...@redhat.com frv port Nick Clifton ni...@redhat.com
Re: [PATCH PING] c++-specific bits of tree-slimming patches
On 04/21/2011 08:50 PM, Nathan Froyd wrote: On Fri, Apr 08, 2011 at 01:50:24PM -0400, Jason Merrill wrote: As jsm and richi said, using input_location like this is a regression. Can we use DECL_SOURCE_LOCATION (label_decl) instead? I went off and tried that; some callers provide a NULL label_decl. Hunh. How does that work? They fill in CASE_LABEL later? Can that be changed? Jason
Re: [PATCH] use build_function_type_list in the bfin backend
On 04/20/2011 03:24 PM, Nathan Froyd wrote: As $SUBJECT suggests. Tested with cross to bfin-elf. OK to commit? OK. Thanks! Jie -Nathan * config/bfin/bfin.c (bfin_init_builtins): Call build_function_type_list instead of build_function_type. diff --git a/gcc/config/bfin/bfin.c b/gcc/config/bfin/bfin.c index 5d08437..03a833d 100644 --- a/gcc/config/bfin/bfin.c +++ b/gcc/config/bfin/bfin.c @@ -5967,7 +5967,7 @@ bfin_init_builtins (void) { tree V2HI_type_node = build_vector_type_for_mode (intHI_type_node, V2HImode); tree void_ftype_void -= build_function_type (void_type_node, void_list_node); += build_function_type_list (void_type_node, NULL_TREE); tree short_ftype_short = build_function_type_list (short_integer_type_node, short_integer_type_node, NULL_TREE);
Re: [PATCH PING] c++-specific bits of tree-slimming patches
On Thu, Apr 21, 2011 at 10:49:05PM -0400, Jason Merrill wrote: On 04/21/2011 08:50 PM, Nathan Froyd wrote: On Fri, Apr 08, 2011 at 01:50:24PM -0400, Jason Merrill wrote: As jsm and richi said, using input_location like this is a regression. Can we use DECL_SOURCE_LOCATION (label_decl) instead? I went off and tried that; some callers provide a NULL label_decl. Hunh. How does that work? They fill in CASE_LABEL later? Can that be changed? Yeah, tree-eh.c:lower_try_finally_switch. I don't know how easy it is to fix; it certainly looks non-trivial. -Nathan
Re: Allow more PowerPC sibling calls
On Mon, Apr 18, 2011 at 10:29:17AM -0700, Nathan Froyd wrote: Could I request that you use FOREACH_FUNCTION_ARGS in these two cases? The conversion is trivial, and avoiding more exposed TYPE_ARG_TYPES calls is a good thing. Committed revision 172855. -- Alan Modra Australia Development Lab, IBM
Re: Allow more PowerPC sibling calls
On Fri, Apr 22, 2011 at 01:29:14PM +0930, Alan Modra wrote: On Mon, Apr 18, 2011 at 10:29:17AM -0700, Nathan Froyd wrote: Could I request that you use FOREACH_FUNCTION_ARGS in these two cases? The conversion is trivial, and avoiding more exposed TYPE_ARG_TYPES calls is a good thing. Committed revision 172855. Oops, I missed the following hunk, which I talked about here I also fixed a minor problem with CALL_LIBCALL in the call cookie, which could cause various call insns to match an n constraint rather than a 0 constraint and so give the wrong insn length. and mentioned in the changelog: * config/rs6000/rs6000.c (rs6000_function_arg): Remove CALL_LIBCALL when returning call_cookie. but failed to include in my post. Committed as obvious, revision 172856. Index: gcc/config/rs6000/rs6000.c === --- gcc/config/rs6000/rs6000.c (revision 172855) +++ gcc/config/rs6000/rs6000.c (working copy) @@ -9007,7 +9007,7 @@ rs6000_function_arg (CUMULATIVE_ARGS *cu : CALL_V4_CLEAR_FP_ARGS)); } - return GEN_INT (cum-call_cookie); + return GEN_INT (cum-call_cookie ~CALL_LIBCALL); } if (TARGET_MACHO rs6000_darwin64_struct_check_p (mode, type)) -- Alan Modra Australia Development Lab, IBM
Re: [ARM] [3/3] Implement TARGET_BUILTIN_DECL
Thank you for review, update and commit this patch set! Jie On 04/18/2011 10:04 AM, Richard Earnshaw wrote: On Mon, 2010-10-11 at 15:44 +0800, Jie Zhang wrote: This patch implements TARGET_BUILTIN_DECL for ARM. With the changes of the previous two patches, this one is straightforward. Is it OK? Sorry for the long time reviewing this set of patches. I've just tweaked it to bring it up to the current code base and committed it. It's largely unchanged from your submission apart from: 1) Updates to incorporate latest changes made by Richard Sandiford. 2) Minor tweak to simplyfy the iWMMXT builtins initialization. R. 2011-04-18 Jie Zhangj...@codesourcery.com Richard Earnshawrearn...@arm.com * arm.c (neon_builtin_type_bits): Remove. (typedef enum neon_builtin_mode): New. (T_MAX): Don't define. (typedef enum neon_builtin_datum): Remove bits, codes[], num_vars and base_fcode. Add mode, code and fcode. (VAR1, VAR2, VAR3, VAR4, VAR5, VAR6, VAR7, VAR8, VAR9 VAR10): Change accordingly. (neon_builtin_data[]): Change accordingly (arm_init_neon_builtins): Change accordingly. (neon_builtin_compare): Remove. (locate_neon_builtin_icode): Remove. (arm_expand_neon_builtin): Change accordingly. * arm.h (enum arm_builtins): Move to ... * arm.c (enum arm_builtins): ... here; and rearrange builtin code. * arm.c (arm_builtin_decl): Declare. (TARGET_BUILTIN_DECL): Define. (enum arm_builtins): Correct ARM_BUILTIN_MAX. (arm_builtin_decls[]): New. (arm_init_neon_builtins): Store builtin declarations in arm_builtin_decls[]. (arm_init_tls_builtins): Likewise. (arm_init_iwmmxt_builtins): Likewise. Refactor initialization code. (arm_builtin_decl): New.
Re: FDO usage: -Wcoverage-mismatch should not ignore -Wno-error
Please review the new patch. Thanks, David On Thu, Apr 21, 2011 at 3:59 PM, Joseph S. Myers jos...@codesourcery.com wrote: On Thu, 21 Apr 2011, Xinliang David Li wrote: On Thu, Apr 21, 2011 at 12:54 PM, Joseph S. Myers jos...@codesourcery.com wrote: On Tue, 19 Apr 2011, Xinliang David Li wrote: 2011-04-18 Neil Vachharajani nvach...@gmail.com * flags.c: New flag variable. * opts.c (common_handle_options): Set flag_werror_set. * opts-global.c (decode_options): Delay Werror decision for Wcoverage-mismatch util after options are parsed. This patch is certainly wrong, since common_handle_option must not set any global state, only state pointed to by its arguments. That said, setting -Werror=coverage-mismatch in decode_options at all is bad because decode_options is called when optimize attributes are processed; as-is, a -Wno-error=coverage-mismatch option will be overridden if such attributes are used. Not sure if I understand the comment on the 'option be overriden' -- this is not happening with the patch. As long as the the I am referring to the state before the patch. But in general decode_options is the wrong place for any once-only initialization. So the right place to set this is probably later, in process_options. And this can check global_options_set.x_warnings_are_errors to see if an explicit -Werror/-Wno-error option was passed. The problem is that when warning_as_error_requested is 0, there is no way to tell if it is the default or it is user has specified -Wno-error. Maybe we should not make -Wcoverage-mismatch warnings to I referred to global_options_set.x_warnings_are_errors, not warning_as_error_requested. -- Joseph S. Myers jos...@codesourcery.com wc2.p Description: Binary data