Re: Interaction between first stage build with g++ and $PATH
On Wed, Aug 15, 2012 at 10:17:18PM -0700, Gary Funck wrote: > I can file a bug reported if necessary, but am wondering > if it is a known requirement not to have "." on $PATH > or to explicitly set CC and CXX? Having . in $PATH is a serious bug (especially from security POV). Just never do that. Jakub
Re: Interaction between first stage build with g++ and $PATH
On 08/15/12 22:26:08, Ian Lance Taylor wrote: > it's a bug. OK, thanks. I submitted a bug report. http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54279 - Gary
Re: Interaction between first stage build with g++ and $PATH
On Wed, Aug 15, 2012 at 10:17 PM, Gary Funck wrote: > > 1. I have "." on $PATH. > > 2. In one build of the latest GCC trunk, I specify >CC=/usr/bin/gcc and CXX=/usr/bin/g++ and everything >works. > > 3. In another build, I don't specify CC or CXX. >Therefore they default to 'gcc' and 'g++'. >This fails: >g++: error trying to exec 'cc1plus': execvp: No such file or directory > > If I remove "." from $PATH then the configuration in 3 will build. > > The problem is that there is a g++ executable under the > built gcc directory, but cc1plus and other g++ component > parts haven't been built yet. > > I can file a bug reported if necessary, but am wondering > if it is a known requirement not to have "." on $PATH > or to explicitly set CC and CXX? it's a bug. Ian
Interaction between first stage build with g++ and $PATH
1. I have "." on $PATH. 2. In one build of the latest GCC trunk, I specify CC=/usr/bin/gcc and CXX=/usr/bin/g++ and everything works. 3. In another build, I don't specify CC or CXX. Therefore they default to 'gcc' and 'g++'. This fails: g++: error trying to exec 'cc1plus': execvp: No such file or directory If I remove "." from $PATH then the configuration in 3 will build. The problem is that there is a g++ executable under the built gcc directory, but cc1plus and other g++ component parts haven't been built yet. I can file a bug reported if necessary, but am wondering if it is a known requirement not to have "." on $PATH or to explicitly set CC and CXX? thanks, - Gary
Re: [PATCH 5/7] rs6000: Get rid of old-mnemonics
This patch is okay, but the longlong.h patch is incomplete. It no longer should test for _ARCH_COM or _ARCH_PWR, because those will not work. Committed the following as pre-approved off-list: 2012-08-15 Segher Boessenkool * longlong.h: (powerpc): Delete _ARCH_PWR and _ARCH_COM handling. --- trunk/libgcc/longlong.h 2012/08/16 01:36:47 190433 +++ trunk/libgcc/longlong.h 2012/08/16 01:49:57 190434 @@ -850,8 +850,6 @@ FIXME: What's needed for gcc PowerPC VxWorks? __vxworks__ is not good enough, since that hits ARM and m68k too. */ #if (defined (_ARCH_PPC) /* AIX */ \ - || defined (_ARCH_PWR)/* AIX */ \ - || defined (_ARCH_COM)/* AIX */ \ || defined (__powerpc__) /* gcc */ \ || defined (__POWERPC__) /* BEOS */ \ || defined (__ppc__) /* Darwin */\ @@ -914,14 +912,6 @@ } while (0) #define SMUL_TIME 14 #define UDIV_TIME 120 -#elif defined (_ARCH_PWR) -#define UMUL_TIME 8 -#define smul_ppmm(xh, xl, m0, m1) \ - __asm__ ("mul %0,%2,%3" : "=r" (xh), "=q" (xl) : "r" (m0), "r" (m1)) -#define SMUL_TIME 4 -#define sdiv_qrnnd(q, r, nh, nl, d) \ - __asm__ ("div %0,%2,%4" : "=r" (q), "=q" (r) : "r" (nh), "1" (nl), "r" (d)) -#define UDIV_TIME 100 #endif #endif /* 32-bit POWER architecture variants. */
Re: [PATCH 7/7] rs6000: Old AIX specs: Use %(asm_default) instead of -mppc.
On Wed, Aug 15, 2012 at 6:29 PM, Segher Boessenkool wrote: > 2012-08-15 Segher Boessenkool > > gcc/ > * config/rs6000/aix43.h (ASM_CPU_SPEC): Use %(asm_default) > instead of -mppc. > * config/rs6000/aix51.h (ASM_CPU_SPEC): Ditto. This patch is okay. Thanks, David
Re: [PATCH 5/7] rs6000: Get rid of old-mnemonics
On Wed, 15 Aug 2012, David Edelsohn wrote: > Does GCC "own" longlong.h, or is that part of GMP or some other project? It is right now in sync between glibc and GCC; changes should be applied in both places. It hasn't been in sync with GMP's version for many years. -- Joseph S. Myers jos...@codesourcery.com
Re: Scheduler: Allow breaking dependencies by modifying patterns
On 4/08/2012, at 12:05 AM, Bernd Schmidt wrote: > This patch allows us to change > > rn++ > rm=[rn] > > into > > rm=[rn + 4] > rn++ The patch is OK with the following nitpicks. [BTW, if anyone else wants to review this patch, it helps to read it from the end.] > > Opportunities to do this are discovered by a mini-pass over the > instructions after generating dependencies and before scheduling a > block. At that point we have all the information required to ensure that > a candidate dep between two instructions is only used to show the > register dependence, and to ensure that every insn with a memory > reference is only subject to at most one dep causing a pattern change. > > The dep_t structure is extended to hold an optional pointer to a > "replacement description", which holds information about what to change > when a dependency is broken. The time when this replacement is applied > differs depending on whether the changed insn is the DEP_CON (in which > case the pattern is changed whenever the broken dependency becomes the > last one), or the DEP_PRO, in which case we make the change when the > corresponding DEP_CON has been scheduled. This ensures that the ready > list always contains insns with the correct pattern. > > A few additional bits are needed in the dep structure: one to hold > information about whether a dependency occurs multiple times, and one to > distinguish dependencies that are purely for register values from those > with other meanings (e.g. memory references). > > Also, sched-rgn was changed to use a new bit, DEP_POSTPONED, rather than > HARD_DEP to indicate that we don't want to schedule an insn in the > current block. > > A possible future extension would be to also allow autoinc addressing > modes as the increment insn. Would you please copy the above to a comment describing how the new optimization fits within the scheduler? In sched-deps.c just before definition of struct mem_inc_info seems like a good spot for an overview comment. > > Bootstrapped and tested on x86_64-linux, and also tested on c6x-elf > (quite a number of changes were necessary to make it work there). It was > originally written for a mips target and tested there in the context of > a 4.6 tree. I've also run spec2000 on x86_64, with no change that looked > like anything other than noise. Ok? > > + We also perform pattern replacements for predication, and for broken > + replacement dependencies. The latter is only done if FOR_BACKTRACK is > + false. */ > We also perform pattern replacement for ?speculation? ... >> +void >> +find_modifiable_mems (rtx head, rtx tail) >> +{ >> + rtx insn; >> + int success_in_block = 0; Success_in_block is not really used. Maybe add a debug printout for it? > + /* Dependency major type. This field is superseded by STATUS below. > + Though, it is still in place because some targets use it. */ > + ENUM_BITFIELD(reg_note) type:6; > ... is superseded by STATUS *above*. -- Maxim Kuvyrkov CodeSourcery / Mentor Graphics
Re: [PATCH 5/7] rs6000: Get rid of old-mnemonics
On Wed, Aug 15, 2012 at 6:29 PM, Segher Boessenkool wrote: > 2012-08-15 Segher Boessenkool > > gcc/ > * config/rs6000/aix43.h (TARGET_DEFAULT): Delete MASK_NEW_MNEMONICS. > (RS6000_CALL_GLUE): Adjust for single assembler syntax. > * config/rs6000/aix51.h (TARGET_DEFAULT, RS6000_CALL_GLUE): > Ditto. > * config/rs6000/aix52.h (TARGET_DEFAULT, RS6000_CALL_GLUE): > Ditto. > * config/rs6000/aix53.h (TARGET_DEFAULT, RS6000_CALL_GLUE): > Ditto. > * config/rs6000/aix61.h (TARGET_DEFAULT, RS6000_CALL_GLUE): > Ditto. > * config/rs6000/darwin.h (TARGET_DEFAULT): Ditto. > * config/rs6000/darwin.md (whole file): Adjust to single > assembler syntax. > * config/rs6000/darwin64.h (TARGET_DEFAULT): Delete > MASK_NEW_MNEMONICS. > * config/rs6000/default64.h (TARGET_DEFAULT): Ditto. > * config/rs6000/dfp.md: (whole file): Adjust to single > assembler syntax. > * config/rs6000/eabi.h (TARGET_DEFAULT): Delete > MASK_NEW_MNEMONICS. > * config/rs6000/eabialtivec.h (TARGET_DEFAULT): Ditto. > * config/rs6000/eabispe.h (TARGET_DEFAULT): Ditto. > * config/rs6000/linuxaltivec.h (TARGET_DEFAULT): Ditto. > * config/rs6000/linuxspe.h (TARGET_DEFAULT): Ditto. > * config/rs6000/rs6000-cpus.def (whole file): Delete > POWERPC_BASE_MASK. > * config/rs6000/rs6000-tables.opt: Regenerate. > * config/rs6000/rs6000.c (POWERPC_BASE_MASK): Delete. > (num_insns_constant_wide): Adjust comments. > (whole file): Adjust to single assembler syntax. > (output_cbranch): Adjust comment. > * config/rs6000/rs6000.h (ASSEMBLER_DIALECT): Delete. > * config/rs6000/rs6000.md: (whole file): Adjust to single > assembler syntax. > * config/rs6000/rs6000.opt (mnew-mnemonics): Delete. > (mold-mnemonics): Delete. > * config/rs6000/spe.md: (whole file): Adjust to single > assembler syntax. > * config/rs6000/sync.md: (whole file): Adjust to single > assembler syntax. > * config/rs6000/sysv4.h (TARGET_DEFAULT): Delete > MASK_NEW_MNEMONICS. > (ASM_OUTPUT_REG_PUSH): Adjust. > (ASM_OUTPUT_REG_POP): ADjust. > * config/rs6000/sysv4le.h (TARGET_DEFAULT): Delete > MASK_NEW_MNEMONICS. > * config/rs6000/vsx.md: (whole file): Adjust to single > assembler syntax. > * config/rs6000/vxworks.h (TARGET_DEFAULT): Delete > MASK_NEW_MNEMONICS. > * doc/invoke.texi: Adjust documentation to reflect the > removal of -mnew-mnemonics and -mold-mnemonics. > > libgcc/ > * longlong.h: (whole file): Adjust to single assembler syntax. > --- > gcc/config/rs6000/aix43.h |4 +- > gcc/config/rs6000/aix51.h |4 +- > gcc/config/rs6000/aix52.h |4 +- > gcc/config/rs6000/aix53.h |4 +- > gcc/config/rs6000/aix61.h |4 +- > gcc/config/rs6000/darwin.h |2 +- > gcc/config/rs6000/darwin.md | 34 +- > gcc/config/rs6000/darwin64.h|2 +- > gcc/config/rs6000/default64.h |3 +- > gcc/config/rs6000/dfp.md| 24 +- > gcc/config/rs6000/eabi.h|2 +- > gcc/config/rs6000/eabialtivec.h |2 +- > gcc/config/rs6000/eabispe.h |2 +- > gcc/config/rs6000/linuxaltivec.h|2 +- > gcc/config/rs6000/linuxspe.h|2 +- > gcc/config/rs6000/rs6000-cpus.def | 121 +++ > gcc/config/rs6000/rs6000-tables.opt | 33 +- > gcc/config/rs6000/rs6000.c | 62 ++-- > gcc/config/rs6000/rs6000.h |4 - > gcc/config/rs6000/rs6000.md | 606 +- > gcc/config/rs6000/rs6000.opt|8 - > gcc/config/rs6000/spe.md| 26 +- > gcc/config/rs6000/sync.md |4 +- > gcc/config/rs6000/sysv4.h |6 +- > gcc/config/rs6000/sysv4le.h |2 +- > gcc/config/rs6000/vsx.md|8 +- > gcc/config/rs6000/vxworks.h |2 +- > gcc/doc/invoke.texi | 28 +-- > libgcc/longlong.h | 34 +- This patch is okay, but the longlong.h patch is incomplete. It no longer should test for _ARCH_COM or _ARCH_PWR, because those will not work. Does GCC "own" longlong.h, or is that part of GMP or some other project? Thanks, David
Re: [PATCH 4/7] rs6000: Remove TARGET_POWERPC
On Wed, Aug 15, 2012 at 6:29 PM, Segher Boessenkool wrote: > It's always on now. Also remove -mcpu=common. > > 2012-08-15 Segher Boessenkool > > gcc/ > * common/config/rs6000/rs6000-common.c (rs6000_handle_option): > Delete handling for -mno-powerpc and -mpowerpc. > * config/rs6000/aix43.h (ASM_CPU_SPEC): Similar. > (ASM_DEFAULT_SPEC): Use -mppc instead of -mcom. > * config/rs6000/aix51.h (ASM_CPU_SPEC, ASM_DEFAULT_SPEC): Ditto. > * config/rs6000/aix52.h (TARGET_DEFAULT): Delete MASK_POWERPC. > * config/rs6000/aix53.h (TARGET_DEFAULT): Ditto. > * config/rs6000/aix61.h (TARGET_DEFAULT): Ditto. > * config/rs6000/darwin.h (TARGET_DEFAULT): Ditto. > * config/rs6000/darwin64.h (TARGET_DEFAULT): Ditto. > * config/rs6000/default64.h (TARGET_DEFAULT): Ditto. > * config/rs6000/driver-rs6000.c (asm_names): Delete handling > for -mcpu=common and -mpowerpc. > * config/rs6000/eabi.h (TARGET_DEFAULT): Delete MASK_POWERPC. > * config/rs6000/eabialtivec.h (TARGET_DEFAULT): Ditto. > * config/rs6000/eabispe.h (TARGET_DEFAULT): Ditto. > * config/rs6000/linuxaltivec.h (TARGET_DEFAULT): Ditto. > * config/rs6000/linuxspe.h (TARGET_DEFAULT): Ditto. > * config/rs6000/rs6000-builtin.def (RS6000_BUILTIN_CFSTRING): > Use RS6000_BTM_ALWAYS instead of RS6000_BTM_POWERPC. > * config/rs6000/rs6000-c.c (rs6000_target_modify_macros): > Adjust. > (rs6000_cpu_cpp_builtins): Adjust. > * config/rs6000/rs6000.c (POWERPC_BASE_MASK): Delete MASK_POWERPC. > (rs6000_builtin_mask_calculate): Adjust. > (rs6000_emit_move): Delete code for ! TARGET_POWERPC. > (rs6000_init_libfuncs): Ditto. > (rs6000_output_function_prologue): Ditto. > (rs6000_opt_masks): Delete MASK_POWERPC. > (rs6000_builtin_mask_names): Delete RS6000_BTM_POWERPC. > * config/rs6000/rs6000.h (ASM_CPU_SPEC): Delete handling for > -mpowerpc. > (RS6000_BTM_POWERPC): Delete. > (RS6000_BTM_COMMON): Delete RS6000_BTM_POWERPC. > * config/rs6000/rs6000.md (extendqisi2 patterns): Adjust for > TARGET_POWERPC always on. > (extendqihi2 patterns): Similar. > (various unnamed subtract patterns): Similar. > (bswaphi2 patterns): Similar. > (divmodsi4): Similar. > (udiv3): Similar. > (div3 patterns): Similar. > (udivmodsi4): Similar. > (mulhcall): Delete. > (mullcall): Delete. > (divss_call): Delete. > (divus_call): Delete. > (quoss_call): Delete. > (quous_call): Delete. > (insvsi patterns): Adjust. > (addsf3 patterns): Adjust. > (subsf3 patterns): Adjust. > (mulsf3 patterns): Adjust. > (divsf3 patterns): Adjust. > (*fmasf4_fpr): Adjust. > (*fmssf4_fpr): Adjust. > (*nfmasf4_fpr): Adjust. > (*nfmssf4_fpr): Adjust. > (*floatunssidf2_internal): Adjust. > (fix_truncsi2_internal): Adjust. > (fctiwz_): Adjust. > (mulsidi3 patterns): Adjust. > (smulsi3_highpart patterns): Adjust. > (umulsi3_highpart patterns): Adjust. > (fix_trunctfsi2 patterns): Adjust. > (prefetch): Adjust. > * config/rs6000/rs6000.opt (mpowerpc): Replace by stub option. > (mno-powerpc): Delete. > * config/rs6000/sync.md (load_locked): Adjust. > (store_conditional): Adjust. > (atomic_compare_and_swap): Adjust. > (atomic_exchange): Adjust. > (atomic_): Adjust. > (atomic_nand): Adjust. > (atomic_fetch_): Adjust. > (atomic_fetch_nand): Adjust. > (atomic__fetch): Adjust. > (atomic_nand_fetch): Adjust. > * config/rs6000/sysv4.h (TARGET_DEFAULT): Delete MASK_POWERPC. > * config/rs6000/sysv4le.h (TARGET_DEFAULT): Ditto. > * config/rs6000/vxworks.h (TARGET_DEFAULT): Ditto. > * doc/invoke.texi: Adjust documentation. This patch is okay. Thanks, David
Re: [PATCH, MIPS] DSP ALU scheduling
On 08/04/2012 07:55 AM, Richard Sandiford wrote: > Sandra Loosemore writes: >> This is another patch that has been present in our local source base for some >> years now. It originally came from MIPS; I've verified that we have legal >> permission to contribute it to the FSF. >> >> The 74k.md parts of this patch depend on the not-yet-reviewed "74k madd >> scheduler >> tweaks" patch I posted the other day: >> >> http://gcc.gnu.org/ml/gcc-patches/2012-08/msg00062.html >> >> Assuming that one gets approved, is this patch OK for mainline? > > OK with: > >> +/* DSP ALU can bypass data with no delays for the following pairs. */ >> +enum insn_code dspalu_bypass_table[][2] = >> +{ >> + {CODE_FOR_mips_addsc, CODE_FOR_mips_addwc}, >> + {CODE_FOR_mips_cmpu_eq_qb, CODE_FOR_mips_pick_qb}, >> + {CODE_FOR_mips_cmpu_lt_qb, CODE_FOR_mips_pick_qb}, >> + {CODE_FOR_mips_cmpu_le_qb, CODE_FOR_mips_pick_qb}, >> + {CODE_FOR_mips_cmp_eq_ph, CODE_FOR_mips_pick_ph}, >> + {CODE_FOR_mips_cmp_lt_ph, CODE_FOR_mips_pick_ph}, >> + {CODE_FOR_mips_cmp_le_ph, CODE_FOR_mips_pick_ph}, >> + {CODE_FOR_mips_wrdsp, CODE_FOR_mips_insv} >> +}; >> + >> +int >> +mips_dspalu_bypass_p (rtx out_insn, rtx in_insn) >> +{ >> + int i; >> + int num_bypass = (sizeof (dspalu_bypass_table) >> +/ (2 * sizeof (enum insn_code))); > > this changed to ARRAY_SIZE (dspalu_bypass_table); Here's the version I've checked in. In addition to the change above, I implemented the suggestion here http://gcc.gnu.org/ml/gcc-patches/2012-08/msg00463.html to remove the uses of mips_mult_madd_chain_bypass_p. -Sandra 2012-08-15 Sandra Loosemore Maxim Kuvyrkov Julian Brown MIPS Technologies, Inc. gcc/ * config/mips/mips.md (dspmac, dspmacsat, accext, accmod, dspalu) (dspalusat): Add insn types. * config/mips/mips-dsp.md (add3) (mips_add_s_) (sub3, mips_sub_s_, mips_addsc) (mips_addwc, mips_modsub, mips_raddu_w_qb, mips_absq_s_) (mips_precrq_qb_ph, mips_precrq_ph_w, mips_precrq_rs_ph_w) (mips_precrqu_s_qb_ph, mips_preceq_w_phl, mips_preceq_w_phr) (mips_precequ_ph_qbl, mips_precequ_ph_qbr, mips_precequ_ph_qbla) (mips_precequ_ph_qbra, mips_preceu_ph_qbl, mips_preceu_ph_qbr) (mips_preceu_ph_qbla, mips_preceu_ph_qbra, mips_shll_) (mips_shll_s_, mips_shll_s_, mips_shrl_qb) (mips_shra_ph, mips_shra_r_, mips_bitrev, mips_insv) (mips_repl_qb, mips_repl_ph) (mips_cmp_eq_) (mips_cmp_lt_) (mips_cmp_le_, mips_cmpgu_eq_qb) (mips_cmpgu_lt_qb, mips_cmpgu_le_qb, mips_pick_) (mips_packrl_ph, mips_wrdsp, mips_rddsp): Change type to dspalu. (mips_dpau_h_qbl, mips_dpau_h_qbr, mips_dpsu_h_qbl, mips_dpsu_h_qbr) (mips_dpaq_s_w_ph, mips_dpsq_s_w_ph, mips_mulsaq_s_w_ph) (mips_maq_s_w_phl, mips_maq_s_w_phr, mips_maq_sa_w_phr): Set type to dspmac. (mips_dpaq_sa_l_w, mips_dpsq_sa_l_w, mips_maq_sa_w_phl): Set type to dspmacsat. (mips_extr_w, mips_extr_r_w, mips_extr_rs_w, mips_extp, mips_extpdp): Set type to accext. (mips_shilo, mips_mthlip): Set type to accmod. * config/mips/mips-dspr2.md (mips_absq_s_qb, mips_addu_s_ph) (mips_adduh_r_qb): Set type to dspalusat. (mips_addu_ph, mips_adduh_qb, mips_append, mips_balign) (mips_cmpgdu_eq_qb, mips_cmpgdu_lt_qb, mips_cmpgdu_le_qb) (mips_precr_qb_ph, mips_precr_sra_ph_w, mips_precr_sra_r_ph_w) (mips_prepend, mips_shra_qb, mips_shra_r_qb, mips_shrl_ph) (mips_subu_ph, mips_subuh_qb, mips_subuh_r_qb, mips_addqh_ph) (mips_addqh_r_ph, mips_addqh_w, mips_addqh_r_w, mips_subqh_ph) (mips_subqh_r_ph, mips_subqh_w, mips_subqh_r_w): Set type to dspalu. (mips_dpa_w_ph, mips_dps_w_ph, mips_mulsa_w_ph, mips_dpax_w_ph) (mips_dpsx_w_ph, mips_dpaqx_s_w_ph, mips_dpsqx_s_w_ph): Set type to dspmac. Set accum_in attribute. (mips_subu_s_ph): Set type to dspalusat. (mips_dpaqx_sa_w_ph, mips_dpsqx_sa_w_ph): Set type to dspmacsat. Set accum_in attribute. * config/mips/mips-protos.h (mips_dspalu_bypass_p): Add prototype. * config/mips/mips.c (dspalu_bypass_table): New. (mips_dspalu_bypass_p): New. * config/mips/24k.md (r24k_dsp_alu, r24k_dsp_mac, r24k_dsp_mac_sat) (r24k_dsp_acc_ext, r24k_dsp_acc_mod): New insn reservations. (r24k_int_mult, r24k_int_mthilo, r24k_dsp_mac, r24k_dsp_mac_sat) (r24k_dsp_acc_ext, r24k_dsp_acc_mod, r24k_dsp_alu): New bypasses. * config/mips/74k.md (r74k_dsp_alu, r74k_dsp_alu_sat, r74k_dsp_mac) (r74k_dsp_mac_sat, r74k_dsp_acc_ext, r74k_dsp_acc_mod): New insn reservations. (r74k_dsp_mac, r74k_dsp_mac_sat, r74k_int_mult, r74k_int_mul3) (r74k_dsp_mac, r74k_dsp_mac_sat): New bypasses. Index: gcc/config/mips/mips.md
Re: [PATCH 3/7] rs6000: Add RS6000_BTM_ALWAYS
On Wed, Aug 15, 2012 at 6:29 PM, Segher Boessenkool wrote: > This adds a builtin flag for "always enabled". The value 0 works > right now as far as I can see, but that is too tricky and should > be fixed some day. > > 2012-08-15 Segher Boessenkool > > gcc/ > * config/rs6000/rs6000.h (RS6000_BTM_ALWAYS): New. This patch is okay. Thanks, David
Re: [PATCH 2/7] rs6000: Fix typo mpower64 -> mpowerpc64 in various spec strings.
On Wed, Aug 15, 2012 at 6:29 PM, Segher Boessenkool wrote: > 2012-08-15 Segher Boessenkool > > gcc/ > * config/rs6000/aix52.h (ASM_CPU_SPEC): Fix typo. > * config/rs6000/aix53.h (ASM_CPU_SPEC): Ditto. > * config/rs6000/aix61.h (ASM_CPU_SPEC): Ditto. > * config/rs6000/driver-rs6000.c (asm_names): Ditto. This patch is okay. Thanks, David
Re: [PATCH 1/7] rs6000: Fix PR54142
On Wed, Aug 15, 2012 at 6:29 PM, Segher Boessenkool wrote: > This fixes PR54142, a problem I exposed when I made -mno-power the > default. > > 2012-08-15 Segher Boessenkool > > gcc/ > PR54142 > * config/rs6000/driver-rs6000.c (asm_names): Use %(asm_default) > instead of -mcom. > * config/rs6000/rs6000.h (ASM_CPU_SPEC): Ditto. This patch is okay. Thanks, David
PATCH: Replace target MEMBER_TYPE_FORCES_BLK macro with a target hook
Hi, This patch replaces MEMBER_TYPE_FORCES_BLK with a target hook. I also pass the type to the target hook in addition to field, which will be used by i386 backend for http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20020 This patch doesn't change code generation. Tested on Linux/x86-64. I also tested cross compilers to ia64-hpux, powerpc-linux and xtensa-linux up to cc1. OK to install? Thanks. H.J. --- 2012-08-15 H.J. Lu * stor-layout.c (compute_record_mode): Replace MEMBER_TYPE_FORCES_BLK with targetm.member_type_forces_blk. (layout_type): Likewise. * system.h: Poison MEMBER_TYPE_FORCES_BLK. * target.def (member_type_forces_blk): New target hook. * targhooks.c (default_member_type_forces_blk): New. * targhooks.h (default_member_type_forces_blk): Likewise. * doc/tm.texi.in (MEMBER_TYPE_FORCES_BLK): Renamed to ... (TARGET_MEMBER_TYPE_FORCES_BLK): This. * doc/tm.texi: Regenerated. * config/ia64/hpux.h (MEMBER_TYPE_FORCES_BLK): Removed. * config/ia64/ia64.c (ia64_member_type_forces_blk): New function. (TARGET_MEMBER_TYPE_FORCES_BLK): New macro. * config/rs6000/rs6000.c (TARGET_MEMBER_TYPE_FORCES_BLK): New macro. (rs6000_member_type_forces_blk): New function. * config/rs6000/rs6000.h (MEMBER_TYPE_FORCES_BLK): Removed. * config/xtensa/xtensa.c (xtensa_member_type_forces_blk): New function. (TARGET_MEMBER_TYPE_FORCES_BLK): New macro. * config/xtensa/xtensa.h (MEMBER_TYPE_FORCES_BLK): Removed. diff --git a/gcc/config/ia64/hpux.h b/gcc/config/ia64/hpux.h index ad106b4..d9ae109 100644 --- a/gcc/config/ia64/hpux.h +++ b/gcc/config/ia64/hpux.h @@ -115,9 +115,6 @@ do { \ #define TARGET_DEFAULT \ (MASK_DWARF2_ASM | MASK_BIG_ENDIAN | MASK_ILP32) -/* ??? Might not be needed anymore. */ -#define MEMBER_TYPE_FORCES_BLK(FIELD, MODE) ((MODE) == TFmode) - /* ASM_OUTPUT_EXTERNAL_LIBCALL defaults to just a globalize_label call, but that doesn't put out the @function type information which causes shared library problems. */ diff --git a/gcc/config/ia64/ia64.c b/gcc/config/ia64/ia64.c index c7fb559..f3585b1 100644 --- a/gcc/config/ia64/ia64.c +++ b/gcc/config/ia64/ia64.c @@ -319,6 +319,8 @@ static const char *ia64_invalid_binary_op (int, const_tree, const_tree); static enum machine_mode ia64_c_mode_for_suffix (char); static void ia64_trampoline_init (rtx, tree, rtx); static void ia64_override_options_after_change (void); +static bool ia64_member_type_forces_blk (const_tree, const_tree, +enum machine_mode); static tree ia64_builtin_decl (unsigned, bool); @@ -570,6 +572,9 @@ static const struct attribute_spec ia64_attribute_table[] = #undef TARGET_GET_RAW_ARG_MODE #define TARGET_GET_RAW_ARG_MODE ia64_get_reg_raw_mode +#undef TARGET_MEMBER_TYPE_FORCES_BLK +#define TARGET_MEMBER_TYPE_FORCES_BLK ia64_member_type_forces_blk + #undef TARGET_GIMPLIFY_VA_ARG_EXPR #define TARGET_GIMPLIFY_VA_ARG_EXPR ia64_gimplify_va_arg @@ -11153,6 +11158,17 @@ ia64_get_reg_raw_mode (int regno) return default_get_reg_raw_mode(regno); } +/* Implement TARGET_MEMBER_TYPE_FORCES_BLK. ??? Might not be needed + anymore. */ + +bool +ia64_member_type_forces_blk (const_tree type ATTRIBUTE_UNUSED, +const_tree field ATTRIBUTE_UNUSED, +enum machine_mode mode) +{ + return TARGET_HPUX && mode == TFmode; +} + /* Always default to .text section until HP-UX linker is fixed. */ ATTRIBUTE_UNUSED static section * diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index 34948fb..5b2d9f0 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -1302,6 +1302,9 @@ static const struct attribute_spec rs6000_attribute_table[] = #undef TARGET_INIT_DWARF_REG_SIZES_EXTRA #define TARGET_INIT_DWARF_REG_SIZES_EXTRA rs6000_init_dwarf_reg_sizes_extra +#undef TARGET_MEMBER_TYPE_FORCES_BLK +#define TARGET_MEMBER_TYPE_FORCES_BLK rs6000_member_type_forces_blk + /* On rs6000, function arguments are promoted, as are function return values. */ #undef TARGET_PROMOTE_FUNCTION_MODE @@ -7287,6 +7290,27 @@ rs6000_emit_move (rtx dest, rtx source, enum machine_mode mode) emit_set: emit_insn (gen_rtx_SET (VOIDmode, operands[0], operands[1])); } + +/* Return true if TYPE containing FIELD should be accessed using + `BLKMODE'. + + For the SPE, simd types are V2SI, and gcc can be tempted to put the + entire thing in a DI and use subregs to access the internals. + store_bit_field() will force (subreg:DI (reg:V2SI x))'s to the + back-end. Because a single GPR can hold a V2SI, but not a DI, the + best thing to do is set structs to BLKmode and avoid Severe Tire + Damage. + + On e500 v2, DF and DI modes suffer from the same anomaly. DF can + fit in
Re: [patch] speed up ifcvt:cond_move_convert_if_block
On Mon, Aug 6, 2012 at 1:27 PM, Paolo Bonzini wrote: >> 2. sparseset has the same problem of memory clearing (for valgrind, >> see sparseset_alloc). > > ... only the sparse array needs this clearing, but currently we do it > for both. And according to the fat comment before the xcalloc, it's not even really needed. Why can't we just tell valgrind to treat the memory as defined, like in this patchlet: Index: sparseset.c === --- sparseset.c (revision 190418) +++ sparseset.c (working copy) @@ -30,12 +30,14 @@ sparseset_alloc (SPARSESET_ELT_TYPE n_elms) unsigned int n_bytes = sizeof (struct sparseset_def) + ((n_elms - 1) * 2 * sizeof (SPARSESET_ELT_TYPE)); - /* We use xcalloc rather than xmalloc to silence some valgrind uninitialized + sparseset set = XNEWVAR(sparseset, n_bytes); + + /* Mark the sparseset as defined to silence some valgrind uninitialized read errors when accessing set->sparse[n] when "n" is not, and never has been, in the set. These uninitialized reads are expected, by design and - harmless. If this turns into a performance problem due to some future - additional users of sparseset, we can revisit this decision. */ - sparseset set = (sparseset) xcalloc (1, n_bytes); + harmless. */ + VALGRIND_DISCARD (VALGRIND_MAKE_MEM_DEFINED (set, n_bytes)); + set->dense = &(set->elms[0]); set->sparse = &(set->elms[n_elms]); set->size = n_elms; I have never built GCC with valgrind checking, so I don't know if this is correct. But there has to be a better solution than calling xcalloc... Ciao! Steven
[PATCH 6/7] rs6000: Remove -mabi=ieeelongdouble.
There are some problems with it: - On at least 4.6 and later, it crashes the compiler together with -m64; - On older versions, it generates incorrect code together with -m64; - Supposedly it doesn't actually work on 32-bit either, on the glibc side; - It isn't listed in --target-help, because the option file says "undocumented", but the manual does in fact list it; - The Darwin header claims it is for POWER. In the spirit of the rest of this patch series, I solve these problems by ripping it all out. 2012-08-15 Segher Boessenkool gcc/ * config/rs6000/aix.h (TARGET_IEEEQUAD): Delete. * config/rs6000/darwin.h (TARGET_IEEEQUAD): Delete. * config/rs6000/rs6000.c (rs6000_option_override_internal): Delete rs6000_ieeequad handling. Adjust for removal of TARGET_IEEEQUAD. (rs6000_emit_move): Adjust. (rs6000_return_in_memory): Adjust. (rs6000_function_arg_advance_1): Adjust. (rs6000_function_arg): Adjust. (rs6000_pass_by_reference): Adjust. (rs6000_init_libfuncs): Adjust. (rs6000_cannot_change_mode_class): Adjust. (rs6000_generate_compare): Adjust. (rs6000_mangle_type): Adjust. * config/rs6000/rs6000.h: Delete TARGET_IEEEQUAD. * config/rs6000/rs6000.md (whole file): Adjust for removal of TARGET_IEEEQUAD. * config/rs6000/rs6000.opt (mabi=ieeelongdouble): Delete. (mabi=ibmlongdouble): Replace by stub. * config/rs6000/spe.md (whole file): Adjust for removal of TARGET_IEEEQUAD. * doc/invoke.texi: Adjust documentation. --- gcc/config/rs6000/aix.h |3 - gcc/config/rs6000/darwin.h |4 - gcc/config/rs6000/rs6000.c | 127 +- gcc/config/rs6000/rs6000.h |1 - gcc/config/rs6000/rs6000.md | 78 +- gcc/config/rs6000/rs6000.opt |5 +- gcc/config/rs6000/spe.md | 42 +- gcc/doc/invoke.texi | 13 + 8 files changed, 82 insertions(+), 191 deletions(-) diff --git a/gcc/config/rs6000/aix.h b/gcc/config/rs6000/aix.h index 41421a0..dc6dc55 100644 --- a/gcc/config/rs6000/aix.h +++ b/gcc/config/rs6000/aix.h @@ -41,9 +41,6 @@ #undef STACK_BOUNDARY #define STACK_BOUNDARY 128 -#undef TARGET_IEEEQUAD -#define TARGET_IEEEQUAD 0 - /* The AIX linker will discard static constructors in object files before collect has a chance to see them, so scan the object files directly. */ #define COLLECT_EXPORT_LIST diff --git a/gcc/config/rs6000/darwin.h b/gcc/config/rs6000/darwin.h index 17ff675..2e67215 100644 --- a/gcc/config/rs6000/darwin.h +++ b/gcc/config/rs6000/darwin.h @@ -282,10 +282,6 @@ extern int darwin_emit_branch_islands; #undef TARGET_DEFAULT #define TARGET_DEFAULT (MASK_MULTIPLE | MASK_PPC_GFXOPT) -/* Darwin only runs on PowerPC, so short-circuit POWER patterns. */ -#undef TARGET_IEEEQUAD -#define TARGET_IEEEQUAD 0 - /* Since Darwin doesn't do TOCs, stub this out. */ #define ASM_OUTPUT_SPECIAL_POOL_ENTRY_P(X, MODE) ((void)X, (void)MODE, 0) diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index 4571c6f..15105c6 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -2686,11 +2686,6 @@ rs6000_option_override_internal (bool global_init_p) rs6000_long_double_type_size = RS6000_DEFAULT_LONG_DOUBLE_SIZE; } -#if !defined (POWERPC_LINUX) && !defined (POWERPC_FREEBSD) - if (!global_options_set.x_rs6000_ieeequad) -rs6000_ieeequad = 1; -#endif - /* Disable VSX and Altivec silently if the user switched cpus to power7 in a target attribute or pragma which automatically enables both options, unless the altivec ABI was set. This is set by default for 64-bit, but @@ -2893,7 +2888,7 @@ rs6000_option_override_internal (bool global_init_p) flag_signed_bitfields = 0; #endif - if (TARGET_LONG_DOUBLE_128 && !TARGET_IEEEQUAD) + if (TARGET_LONG_DOUBLE_128) REAL_MODE_FORMAT (TFmode) = &ibm_extended_format; if (TARGET_TOC) @@ -6986,8 +6981,8 @@ rs6000_emit_move (rtx dest, rtx source, enum machine_mode mode) /* 128-bit constant floating-point values on Darwin should really be loaded as two parts. */ - if (!TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128 - && mode == TFmode && GET_CODE (operands[1]) == CONST_DOUBLE) + if (TARGET_LONG_DOUBLE_128 && mode == TFmode + && GET_CODE (operands[1]) == CONST_DOUBLE) { rs6000_emit_move (simplify_gen_subreg (DFmode, operands[0], mode, 0), simplify_gen_subreg (DFmode, operands[1], mode, 0), @@ -7350,9 +7345,6 @@ rs6000_return_in_memory (const_tree type, const_tree fntype ATTRIBUTE_UNUSED) return true; } - if (DEFAULT_ABI == ABI_V4 && TARGET_IEEEQUAD && TYPE_MODE (type) == TFmode) -return true; - return false; } @@ -7908,8 +7900,7 @@ rs6000_function_arg_advance_1 (CUMULATIVE_ARGS *cum, enum machine_mode mode,
[PATCH 7/7] rs6000: Old AIX specs: Use %(asm_default) instead of -mppc.
2012-08-15 Segher Boessenkool gcc/ * config/rs6000/aix43.h (ASM_CPU_SPEC): Use %(asm_default) instead of -mppc. * config/rs6000/aix51.h (ASM_CPU_SPEC): Ditto. --- gcc/config/rs6000/aix43.h |2 +- gcc/config/rs6000/aix51.h |2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/gcc/config/rs6000/aix43.h b/gcc/config/rs6000/aix43.h index 8465c20..c1a69dc 100644 --- a/gcc/config/rs6000/aix43.h +++ b/gcc/config/rs6000/aix43.h @@ -49,7 +49,7 @@ do { \ #undef ASM_CPU_SPEC #define ASM_CPU_SPEC \ "%{!mcpu*: %{!maix64: \ - %{!mpowerpc64: -mppc} \ + %{!mpowerpc64: %(asm_default)} \ %{mpowerpc64: -mppc64}}} \ %{mcpu=power3: -m620} \ %{mcpu=power4: -m620} \ diff --git a/gcc/config/rs6000/aix51.h b/gcc/config/rs6000/aix51.h index a140e12..6ea30c1 100644 --- a/gcc/config/rs6000/aix51.h +++ b/gcc/config/rs6000/aix51.h @@ -43,7 +43,7 @@ do { \ #undef ASM_CPU_SPEC #define ASM_CPU_SPEC \ "%{!mcpu*: %{!maix64: \ - %{!mpowerpc64: -mppc} \ + %{!mpowerpc64: %(asm_default)} \ %{mpowerpc64: -mppc64}}} \ %{mcpu=power3: -m620} \ %{mcpu=power4: -m620} \ -- 1.7.7.6
[PATCH 0/7] rs6000: POWER removal, phase 2.
Here are some more patches to remove old, unused features from the rs6000 port. The first two and last patches fix up some spec strings so that asm_default is used when it should. The third adds a "bitmask" RS6000_BTM_ALWAYS for use by the fourth patch; the one builtin that uses it probably should not be always on (it is specific to Darwin as far as I can tell), but I'm not going to change behaviour with these patches. The fourth patch removes "common mode", makes -mpowerpc always on effectively. The fifth patch removes the old assembler syntax everywhere. The sixth patch removes -mabi=ieeelongdouble, which is plagued by problems; a comment in darwin.h made me believe this option is for old POWER, but it's not (it's for 32-bit SVR4 instead). I'm not totally sure we can remove this support at this time, but I made the patch already, so I'm sending it anyway. Bootstrapped and tested on powerpc64-linux, no regressions, --enable-languages=c,c++,fortran . Okay for mainline? Segher Segher Boessenkool (7): rs6000: Fix PR54142 rs6000: Fix typo mpower64 -> mpowerpc64 in various spec strings. rs6000: Add RS6000_BTM_ALWAYS rs6000: Remove TARGET_POWERPC rs6000: Get rid of old-mnemonics rs6000: Remove -mabi=ieeelongdouble. rs6000: Old AIX specs: Use %(asm_default) instead of -mppc. gcc/common/config/rs6000/rs6000-common.c | 15 +- gcc/config/rs6000/aix.h |3 - gcc/config/rs6000/aix43.h| 12 +- gcc/config/rs6000/aix51.h| 12 +- gcc/config/rs6000/aix52.h|6 +- gcc/config/rs6000/aix53.h|6 +- gcc/config/rs6000/aix61.h|6 +- gcc/config/rs6000/darwin.h |7 +- gcc/config/rs6000/darwin.md | 34 +- gcc/config/rs6000/darwin64.h |4 +- gcc/config/rs6000/default64.h|4 +- gcc/config/rs6000/dfp.md | 24 +- gcc/config/rs6000/driver-rs6000.c|6 +- gcc/config/rs6000/eabi.h |2 +- gcc/config/rs6000/eabialtivec.h |2 +- gcc/config/rs6000/eabispe.h |3 +- gcc/config/rs6000/linuxaltivec.h |2 +- gcc/config/rs6000/linuxspe.h |3 +- gcc/config/rs6000/rs6000-builtin.def |2 +- gcc/config/rs6000/rs6000-c.c |7 +- gcc/config/rs6000/rs6000-cpus.def| 121 ++-- gcc/config/rs6000/rs6000-tables.opt | 33 +- gcc/config/rs6000/rs6000.c | 235 ++- gcc/config/rs6000/rs6000.h | 12 +- gcc/config/rs6000/rs6000.md | 1137 ++ gcc/config/rs6000/rs6000.opt | 21 +- gcc/config/rs6000/spe.md | 68 +-- gcc/config/rs6000/sync.md| 24 +- gcc/config/rs6000/sysv4.h|6 +- gcc/config/rs6000/sysv4le.h |2 +- gcc/config/rs6000/vsx.md |8 +- gcc/config/rs6000/vxworks.h |3 +- gcc/doc/invoke.texi | 84 +-- libgcc/longlong.h| 34 +- 34 files changed, 661 insertions(+), 1287 deletions(-) -- 1.7.7.6
[PATCH 3/7] rs6000: Add RS6000_BTM_ALWAYS
This adds a builtin flag for "always enabled". The value 0 works right now as far as I can see, but that is too tricky and should be fixed some day. 2012-08-15 Segher Boessenkool gcc/ * config/rs6000/rs6000.h (RS6000_BTM_ALWAYS): New. --- gcc/config/rs6000/rs6000.h |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h index 5644435..5c53f85 100644 --- a/gcc/config/rs6000/rs6000.h +++ b/gcc/config/rs6000/rs6000.h @@ -2302,6 +2302,7 @@ extern int frame_pointer_needed; /* Builtin targets. For now, we reuse the masks for those options that are in target flags, and pick two random bits for SPE and paired which aren't in target_flags. */ +#define RS6000_BTM_ALWAYS 0 /* Always enabled. */ #define RS6000_BTM_ALTIVEC MASK_ALTIVEC/* VMX/altivec vectors. */ #define RS6000_BTM_VSX MASK_VSX/* VSX (vector/scalar). */ #define RS6000_BTM_SPE MASK_STRING /* E500 */ -- 1.7.7.6
[PATCH 2/7] rs6000: Fix typo mpower64 -> mpowerpc64 in various spec strings.
2012-08-15 Segher Boessenkool gcc/ * config/rs6000/aix52.h (ASM_CPU_SPEC): Fix typo. * config/rs6000/aix53.h (ASM_CPU_SPEC): Ditto. * config/rs6000/aix61.h (ASM_CPU_SPEC): Ditto. * config/rs6000/driver-rs6000.c (asm_names): Ditto. --- gcc/config/rs6000/aix52.h |2 +- gcc/config/rs6000/aix53.h |2 +- gcc/config/rs6000/aix61.h |2 +- gcc/config/rs6000/driver-rs6000.c |2 +- 4 files changed, 4 insertions(+), 4 deletions(-) diff --git a/gcc/config/rs6000/aix52.h b/gcc/config/rs6000/aix52.h index 206b9fe..eeeb820 100644 --- a/gcc/config/rs6000/aix52.h +++ b/gcc/config/rs6000/aix52.h @@ -50,7 +50,7 @@ do { \ #define ASM_CPU_SPEC \ "%{!mcpu*: %{!maix64: \ %{mpowerpc64: -mppc64} \ - %{!mpower64: %(asm_default)}}} \ + %{!mpowerpc64: %(asm_default)}}} \ %{mcpu=power3: -m620} \ %{mcpu=power4: -m620} \ %{mcpu=power5: -m620} \ diff --git a/gcc/config/rs6000/aix53.h b/gcc/config/rs6000/aix53.h index 44ad5cf..dac0ce0 100644 --- a/gcc/config/rs6000/aix53.h +++ b/gcc/config/rs6000/aix53.h @@ -53,7 +53,7 @@ do { \ "%{!mcpu*: %{!maix64: \ %{mpowerpc64: -mppc64} \ %{maltivec: -m970} \ - %{!maltivec: %{!mpower64: %(asm_default) \ + %{!maltivec: %{!mpowerpc64: %(asm_default) \ %{mcpu=native: %(asm_cpu_native)} \ %{mcpu=power3: -m620} \ %{mcpu=power4: -mpwr4} \ diff --git a/gcc/config/rs6000/aix61.h b/gcc/config/rs6000/aix61.h index 1000e11..0518d44 100644 --- a/gcc/config/rs6000/aix61.h +++ b/gcc/config/rs6000/aix61.h @@ -53,7 +53,7 @@ do { \ "%{!mcpu*: %{!maix64: \ %{mpowerpc64: -mppc64} \ %{maltivec: -m970} \ - %{!maltivec: %{!mpower64: %(asm_default) \ + %{!maltivec: %{!mpowerpc64: %(asm_default) \ %{mcpu=native: %(asm_cpu_native)} \ %{mcpu=power3: -m620} \ %{mcpu=power4: -mpwr4} \ diff --git a/gcc/config/rs6000/driver-rs6000.c b/gcc/config/rs6000/driver-rs6000.c index e6c7da1..36fc27d 100644 --- a/gcc/config/rs6000/driver-rs6000.c +++ b/gcc/config/rs6000/driver-rs6000.c @@ -368,7 +368,7 @@ static const struct asm_name asm_names[] = { %{!maix64: \ %{mpowerpc64: -mppc64} \ %{maltivec: -m970} \ -%{!maltivec: %{!mpower64: %(asm_default)}}}" }, +%{!maltivec: %{!mpowerpc64: %(asm_default)}}}" }, #else { "common", "-mcom" }, -- 1.7.7.6
[PATCH 1/7] rs6000: Fix PR54142
This fixes PR54142, a problem I exposed when I made -mno-power the default. 2012-08-15 Segher Boessenkool gcc/ PR54142 * config/rs6000/driver-rs6000.c (asm_names): Use %(asm_default) instead of -mcom. * config/rs6000/rs6000.h (ASM_CPU_SPEC): Ditto. --- gcc/config/rs6000/driver-rs6000.c |2 +- gcc/config/rs6000/rs6000.h|2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/gcc/config/rs6000/driver-rs6000.c b/gcc/config/rs6000/driver-rs6000.c index 68b5257..e6c7da1 100644 --- a/gcc/config/rs6000/driver-rs6000.c +++ b/gcc/config/rs6000/driver-rs6000.c @@ -420,7 +420,7 @@ static const struct asm_name asm_names[] = { { NULL, "\ %{mpowerpc64*: -mppc64} \ %{!mpowerpc64*: %{mpowerpc*: -mppc}} \ -%{!mpowerpc*: -mcom}" }, +%{!mpowerpc*: %(asm_default)}" }, #endif }; diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h index ec62fc6..5644435 100644 --- a/gcc/config/rs6000/rs6000.h +++ b/gcc/config/rs6000/rs6000.h @@ -103,7 +103,7 @@ "%{!mcpu*: \ %{mpowerpc64*: -mppc64} \ %{!mpowerpc64*: %{mpowerpc*: -mppc}} \ - %{!mpowerpc*: -mcom}} \ + %{!mpowerpc*: %(asm_default)}} \ %{mcpu=native: %(asm_cpu_native)} \ %{mcpu=common: -mcom} \ %{mcpu=cell: -mcell} \ -- 1.7.7.6
Re: Merge C++ conversion into trunk (4/6 - hash table rewrite)
On Wed, Aug 15, 2012 at 2:58 PM, Lawrence Crowl wrote: > > I do not much like _t names either. Also, names ending in _t are reserved by POSIX if you #include . Though that may only apply to global names, not to types defined in classes or namespaces. Ian
Re: [rfc] Fix SPU build (Re: [PATCH] Remove basic_block->loop_depth)
Richard Guenther wrote: > On Wed, 15 Aug 2012, Ulrich Weigand wrote: > > It seems flow_loops_find by itself is not quite enough, but everything > > necessary to use the loop structures seems to be encapsulated in > > loop_optimizer_init / loop_optimizer_finalize, which are already called > > by a number of optimization passes. When I do the same in the SPU > > md-reorg pass, loop_father seems to be set up OK. > > > > Does this look reasonable? > > Yes. You can use bb_loop_depth instead of loop_depth (bb->loop_father). Ah, good point. Below the version I committed. Tested on spu-elf. Thanks, Ulrich ChangeLog: * config/spu/spu.c: Include "cfgloop.h". (spu_machine_dependent_reorg): Call loop_optimizer_init and loop_optimizer_finalize. Use bb_loop_depth instead of loop_depth. Directly compare loop_father values where appropriate. * config/spu/t-spu-elf (spu.o): Update dependencies. Index: gcc/config/spu/spu.c === *** gcc/config/spu/spu.c(revision 190390) --- gcc/config/spu/spu.c(working copy) *** *** 53,58 --- 53,59 #include "timevar.h" #include "df.h" #include "dumpfile.h" + #include "cfgloop.h" /* Builtin types, data and prototypes. */ *** spu_machine_dependent_reorg (void) *** 2458,2463 --- 2459,2468 in_spu_reorg = 1; compute_bb_for_insn (); + /* (Re-)discover loops so that bb->loop_father can be used + in the analysis below. */ + loop_optimizer_init (AVOID_CFG_MODIFICATIONS); + compact_blocks (); spu_bb_info = *** spu_machine_dependent_reorg (void) *** 2562,2575 fallthru block. This catches the cases when it is a simple loop or when there is an initial branch into the loop. */ if (prev && (loop_exit || simple_loop) ! && prev->loop_depth <= bb->loop_depth) prop = prev; /* If there is only one adjacent predecessor. Don't propagate !outside this loop. This loop_depth test isn't perfect, but !I'm not sure the loop_father member is valid at this point. */ else if (prev && single_pred_p (bb) ! && prev->loop_depth == bb->loop_depth) prop = prev; /* If this is the JOIN block of a simple IF-THEN then --- 2567,2579 fallthru block. This catches the cases when it is a simple loop or when there is an initial branch into the loop. */ if (prev && (loop_exit || simple_loop) ! && bb_loop_depth (prev) <= bb_loop_depth (bb)) prop = prev; /* If there is only one adjacent predecessor. Don't propagate !outside this loop. */ else if (prev && single_pred_p (bb) ! && prev->loop_father == bb->loop_father) prop = prev; /* If this is the JOIN block of a simple IF-THEN then *** spu_machine_dependent_reorg (void) *** 2578,2584 && EDGE_COUNT (bb->preds) == 2 && EDGE_COUNT (prev->preds) == 1 && EDGE_PRED (prev, 0)->src == prev2 ! && prev2->loop_depth == bb->loop_depth && GET_CODE (branch_target) != REG) prop = prev; --- 2582,2588 && EDGE_COUNT (bb->preds) == 2 && EDGE_COUNT (prev->preds) == 1 && EDGE_PRED (prev, 0)->src == prev2 ! && prev2->loop_father == bb->loop_father && GET_CODE (branch_target) != REG) prop = prev; *** spu_machine_dependent_reorg (void) *** 2600,2606 if (dump_file) fprintf (dump_file, "propagate from %i to %i (loop depth %i) " "for %i (loop_exit %i simple_loop %i dist %i)\n", !bb->index, prop->index, bb->loop_depth, INSN_UID (branch), loop_exit, simple_loop, branch_addr - INSN_ADDRESSES (INSN_UID (bbend))); --- 2604,2610 if (dump_file) fprintf (dump_file, "propagate from %i to %i (loop depth %i) " "for %i (loop_exit %i simple_loop %i dist %i)\n", !bb->index, prop->index, bb_loop_depth (bb), INSN_UID (branch), loop_exit, simple_loop, branch_addr - INSN_ADDRESSES (INSN_UID (bbend))); *** spu_machine_dependent_reorg (void) *** 2657,2662 --- 2661,2668 spu_var_tracking (); + loop_optimizer_finalize (); + free_bb_for_insn (); in_spu_reorg = 0; Index: gcc/config/spu/t-spu-elf === *** gcc/config/spu/t-spu-elf(revision 190390) --- gcc/config/spu/t-spu-elf(working copy) *
[Patch, Fortran, committed] PR 54243 & 54244
Hi all, I have just committed as obvious a small patch fixing two ICE-on-invalid PR involving CLASS declarations: http://gcc.gnu.org/viewcvs?view=revision&revision=190420 Cheers, Janus
Re: [google/gcc-4_7] Fix regression - SUBTARGET_EXTRA_SPECS overridden by LINUX_GRTE_EXTRA_SPECS
Hi Jing, ping? On Mon, Aug 13, 2012 at 10:58 AM, Han Shen(沈涵) wrote: > Hi, the google/gcc-4_7 fails to linking anything (on x86-generic), by > looking into specs file, it seems that 'link_emulation' section is > missing in specs. > > The problem is in config/i386/linux.h, SUBTARGET_EXTRA_SPECS (which is > not empty for chrome x86-generic) is overridden by > "LINUX_GRTE_EXTRA_SPECS". > > My fix is to prepend LINUX_GRTE_EXTRA_SPECS to SUBTARGET_EXTRA_SPECS in > linux.h > > Jing, could you take a look at this? > > -- > Han Shen > > 2012-08-13 Han Shen > * gcc/config/i386/gnu-user.h (SUBTARGET_EXTRA_SPECS): Compute > new value of LINUX_GRTE_EXTRA_SPECS by pre-pending LINUX_GRTE_EXTRA_SPECS > to its origin value. > * gcc/config/i386/gnu-user.h (SUBTARGET_EXTRA_SPECS_STR): Add > new MACRO to hold value of SUBTARET_EXTRA_SPECS so that > SUBTARET_EXTRA_SPECS could be replaced later in gnu-user.h > > --- a/gcc/config/i386/gnu-user.h > +++ b/gcc/config/i386/gnu-user.h > @@ -92,11 +92,14 @@ along with GCC; see the file COPYING3. If not see > #define ASM_SPEC \ >"--32 %{!mno-sse2avx:%{mavx:-msse2avx}} %{msse2avx:%{!mavx:-msse2avx}}" > > -#undef SUBTARGET_EXTRA_SPECS > -#define SUBTARGET_EXTRA_SPECS \ > +#undef SUBTARGET_EXTRA_SPECS_STR > +#define SUBTARGET_EXTRA_SPECS_STR \ >{ "link_emulation", GNU_USER_LINK_EMULATION },\ >{ "dynamic_linker", GNU_USER_DYNAMIC_LINKER } > > +#undef SUBTARGET_EXTRA_SPECS > +#define SUBTARGET_EXTRA_SPECS SUBTARGET_EXTRA_SPECS_STR > + > #undef LINK_SPEC > #define LINK_SPEC "-m %(link_emulation) %{shared:-shared} \ >%{!shared: \ > --- a/gcc/config/i386/linux.h > +++ b/gcc/config/i386/linux.h > @@ -32,5 +32,11 @@ along with GCC; see the file COPYING3. If not see > #endif > > #undef SUBTARGET_EXTRA_SPECS > +#ifndef SUBTARGET_EXTRA_SPECS_STR > #define SUBTARGET_EXTRA_SPECS \ >LINUX_GRTE_EXTRA_SPECS > +#else > +#define SUBTARGET_EXTRA_SPECS \ > + LINUX_GRTE_EXTRA_SPECS \ > + SUBTARGET_EXTRA_SPECS_STR > +#endif -- Han Shen | Software Engineer | shen...@google.com | +1-650-440-3330
Re: Merge C++ conversion into trunk (4/6 - hash table rewrite)
On 8/15/12, Richard Henderson wrote: > On 2012-08-15 07:29, Richard Guenther wrote: >> + typedef typename Element::Element_t Element_t; > > Can we use something less ugly than Element_t? > Such as > > typedef typename Element::T T; > > ? Given that this name is scoped anyway... I do not much like _t names either. -- Lawrence Crowl
Re: Merge C++ conversion into trunk (4/6 - hash table rewrite)
On 8/15/12, Richard Guenther wrote: > On Wed, 15 Aug 2012, Michael Matz wrote: > > On Wed, 15 Aug 2012, Richard Guenther wrote: > > > Like the following, only the coverage.c use is converted. > > > I've never seen template function arguments anywhere and > > > having to repeat them all > > > > > > over the place is really really ugly (yes, even if only in > > > the implementation). > > > > > > This goes with static member functions and not wrapping any > > > data. It goes away with the requirement of having externally > > > visible global functions for the hash/compare functions as well > > > (which was ugly anyway). > > > > Well, it looks nicer than what's there currently. As the > > element functions now are scoped and normal member functions, > > they should be named with lower case characters of course. I do > > like that the hash table would then only have one real template > > argument; the Allocator, well, no better idea comes to my mind. > > Yeah. Updated patch below, all users converted. I probably > missed to revert some of the globalizations of hash/compare fns. Your conversion is a better abstraction, and something I'd wanted to get to eventually, so I support your conversion. BTW, the conding conventions say to put member function definitions out of line. -- Lawrence Crowl
Re: Merge C++ conversion into trunk (4/6 - hash table rewrite)
On 8/15/12, Richard Guenther wrote: > On Sun, 12 Aug 2012, Diego Novillo wrote: > > This implements a new C++ hash table. > > > > See http://gcc.gnu.org/ml/gcc-patches/2012-08/msg00711.html for > > details. > > Now as we see the result I'd have prefered a more C++-way instead > of making the conversion simple ... > > Like > > template > class hash_table > { > ... > }; > > template > class pointer_hash > { > hashval_t hash (); > int equal (const Element *); > ~Element (); > Element e; > }; > > and > > /* Hash table with all same_succ entries. */ > > static hash_table > same_succ_htab; > > The existing way is simply too ugly ... so, why did you not invent > a "nice" C++ way? (please consider reverting the hashtable patch) We are trying to balance several factors. Sometimes we're going to pick multiple steps rather than a single step. In some cases, as here, the intent was to make the client code changes minimal and unsurprising while still getting the type safety and efficiency. Other times, it may be a minor technical issue. In particular, I prefer to avoid steps that might cause very poor matching of lines in the diff. Others may choose differently. Together we will learn where the tradeoffs lie. More in a later response. -- Lawrence Crowl
Re: PATCH [x86_64] PR20020 - 128 bit structs not targeted to TImode
On Tue, Aug 14, 2012 at 2:34 PM, Gary Funck wrote: > Attached, is an updated patch (with change logs). > > The test cases are now in gcc.target/i386 and the > target selection is "dg-require-effective-target int128" only. > > Verified that the tests correctly detect the presence/lack > of TImode support. > Here is a patch, which passed tests on Linux/x86-64/ia32 with --enable-languages=c,c++,fortran,java,lto,objc,ada,obj-c++,go on Linux/x32 with --enable-languages=c,c++,fortran,java,lto,objc,obj-c++,go I skipped Ada on x32 since Ada run-time library assumes clock_t/time_t are long. I added a target hook, type_blkmode_p, so that a backend can force a type to BLKmode. There is a macro, MEMBER_TYPE_FORCES_BLK. But it will also set struct { long double x; }; to BLKmode instead of XFmode. I think we can replace MEMBER_TYPE_FORCES_BLK with the new type_blkmode_p target hook. -- H.J. --- gcc/ 2012-08-15 H.J. Lu Gary Funck PR target/20020 * stor-layout.c (compute_record_mode): Use BLKmode if targetm.type_blkmode_p returns true. * target.def (type_blkmode_p): New target hook. * doc/tm.texi.in: Regenerated. * doc/tm.texi.in: Likewise. * config/i386/i386.c (ix86_type_blkmode_p): New function. (TARGET_TYPE_BLKMODE_P): New macro. * config/i386/i386.h (MAX_FIXED_MODE_SIZE): New macro. gcc/testsuite/ 2012-08-15 H.J. Lu Gary Funck PR target/20020 * gcc.target/i386/pr20020-1.c: New test. * gcc.target/i386/pr20020-2.c: Likewise. * gcc.target/i386/pr20020-3.c: Likewise. gcc-pr20020.patch Description: Binary data
Re: Merge C++ conversion into trunk (0/6 - Overview)
On 08/15/2012 06:00 PM, Diego Novillo wrote: > On the switch to C++ as the build language for GCC ... Here are my results: 0:30 UTC - using C as the initial build language: http://gcc.gnu.org/ml/gcc-testresults/2012-08/msg01329.html and: 18:40 UTC - using C++ as the initial build language: http://gcc.gnu.org/ml/gcc-testresults/2012-08/msg01408.html both for x86_64-unknown-linux-gnu native (note: --with-build-config=bootstrap-lto). As far as I can, little difference. Congratulations, Diego and all the others who worked on this transition. Kind regards, -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
[google] Modification of gcov pmu format to reduce gcda size bloat (issue6427063)
This patch has been updated to reflect changes in patch r190247, which removed pfmon support. The patch should be applied to google/main Tested with crosstools. 2012-08-14 Chris Manghane * libgcc/pmu-profile.c (gcov_write_load_latency_infos): Removed unused function. (gcov_write_branch_mispredict_infos): Ditto. (destroy_load_latency_infos): Removed static keyword. (init_pmu_branch_mispredict): Ditto. (gcov_write_ll_line): Ditto, plus replaced filename field with filetag. (gcov_write_branch_mispredict_line): Ditto. (gcov_write_string_table_entry): New function. (gcov_write_tool_header): Removed static keyword. * gcc/gcov.c (release_structures): Removed filename field from PMU structures. (filter_pmu_data_lines): Added PMU string table support. (process_pmu_profile): Ditto. * gcc/gcov-io.c (gcov_read_pmu_load_latency_info): Replaced filename field with filetag. (gcov_read_pmu_branch_mispredict_info): Ditto. (gcov_read_pmu_string_table_entry): New Function. (print_load_latency_line): Replaced filename field with filetag. (print_branch_mispredict_line): Ditto. (print_string_table_entry): New function. * gcc/gcov-io.h (GCOV_TAG_PMU_LOAD_LATENCY_LENGTH): Replaced filename field with filetag (GCOV_TAG_PMU_BRANCH_MISPREDICT_LENGTH): Ditto. (GCOV_TAG_PMU_STRING_TABLE_ENTRY): Added new tag. (GCOV_TAG_PMU_STRING_TABLE_ENTRY_LENGTH): Ditto. (gcov_pmu_load_latency_info): Replaced filename field with filetag. (gcov_pmu_branch_mispredict_info): Ditto. (gcov_pmu_string_table_entry): New struct. (gcov_pmu_string_table): New struct. * gcc/gcov-dump.c (tag_pmu_load_latency_info): Removed PMU filename field. (tag_pmu_branch_mispredict_info): Ditto. (tag_pmu_string_table_entry): New function. Index: libgcc/pmu-profile.c === --- libgcc/pmu-profile.c(revision 190362) +++ libgcc/pmu-profile.c(working copy) @@ -1,3 +1,4 @@ + /* Performance monitoring unit (PMU) profiler. If available, use an external tool to collect hardware performance counter data and write it in the .gcda files. @@ -74,13 +75,13 @@ static void destroy_load_latency_infos (void *info static void destroy_branch_mispredict_infos (void *info); static gcov_unsigned_t gcov_tag_pmu_tool_header_length (gcov_pmu_tool_header_t *header); -static void gcov_write_tool_header (gcov_pmu_tool_header_t *header); -static void gcov_write_load_latency_infos (void *info); -static void gcov_write_branch_mispredict_infos (void *info); -static void gcov_write_ll_line (const gcov_pmu_ll_info_t *ll_info); -static void gcov_write_branch_mispredict_line (const gcov_pmu_brm_info_t +void gcov_write_tool_header (gcov_pmu_tool_header_t *header); +void gcov_write_ll_line (const gcov_pmu_ll_info_t *ll_info); +void gcov_write_branch_mispredict_line (const gcov_pmu_brm_info_t *brm_info); +void gcov_write_string_table_entry (const gcov_pmu_st_entry_t *st_entry); + /* Convert a fractional PCT to an unsigned integer after muliplying by 100. */ @@ -156,11 +157,11 @@ init_pmu_branch_mispredict (void) /* Write the load latency information LL_INFO into the gcda file. */ -static void +void gcov_write_ll_line (const gcov_pmu_ll_info_t *ll_info) { - gcov_unsigned_t len = GCOV_TAG_PMU_LOAD_LATENCY_LENGTH (ll_info->filename); - gcov_write_tag_length (GCOV_TAG_PMU_LOAD_LATENCY_INFO, len); + gcov_write_tag_length (GCOV_TAG_PMU_LOAD_LATENCY_INFO, + GCOV_TAG_PMU_LOAD_LATENCY_LENGTH()); gcov_write_unsigned (ll_info->counts); gcov_write_unsigned (ll_info->self); gcov_write_unsigned (ll_info->cum); @@ -174,27 +175,38 @@ gcov_write_ll_line (const gcov_pmu_ll_info_t *ll_i gcov_write_counter (ll_info->code_addr); gcov_write_unsigned (ll_info->line); gcov_write_unsigned (ll_info->discriminator); - gcov_write_string (ll_info->filename); + gcov_write_unsigned (ll_info->filetag); } /* Write the branch mispredict information BRM_INFO into the gcda file. */ -static void +void gcov_write_branch_mispredict_line (const gcov_pmu_brm_info_t *brm_info) { - gcov_unsigned_t len = GCOV_TAG_PMU_BRANCH_MISPREDICT_LENGTH ( - brm_info->filename); - gcov_write_tag_length (GCOV_TAG_PMU_BRANCH_MISPREDICT_INFO, len); + gcov_write_tag_length (GCOV_TAG_PMU_BRANCH_MISPREDICT_INFO, + GCOV_TAG_PMU_BRANCH_MISPREDICT_LENGTH()); gcov_write_unsigned (brm_info->counts); gcov_write_unsigned (brm_info->self); gcov_write_unsigned (brm_info->cum); gcov_write_counter (brm_info->code_addr); gcov_write_unsigned (brm_info->line); gcov_write_unsigned (brm_info->discriminator
Re: RFC: fix std::unique_ptr pretty-printer
> "Jonathan" == Jonathan Wakely writes: Jonathan> I like it, please go ahead and check that in it you're happy Jonathan> with it. I did. Thanks. Tom
Re: Merge C++ conversion into trunk (0/6 - Overview)
On Wed, Aug 15, 2012 at 1:21 PM, Tom Tromey wrote: >> "Gaby" == Gabriel Dos Reis writes: > > Tom> I asked Keith to resurrect his patch for this. > > Gaby> Since people are concerned about typing rules, would it > Gaby> be an option for GDB to allow people to input pointer > Gaby> literals with the "p" suffix (or "0p" prefix instead of "0x")? > > I think on the whole I'd rather have one extension instead of two. That is a fair point :-) > > It seems to me that the above would still require extensions in the > overloading code, to deal with conversions from void*; or perhaps some > magic pointer type. > > What I think Keith is going to do is take his patch, change it so that > int->pointer conversions (if the int != 0) are given a different badness > from other conversions (meaning that, in theory, this should only be > chosen as a last resort, and shouldn't interfere with ordinary uses), > and finally add a parameter so that the feature can be disabled. > > I hope this will be acceptable all around. OK, that sounds reasonable. Thanks! -- Gaby
Re: Merge C++ conversion into trunk (0/6 - Overview)
> "Gaby" == Gabriel Dos Reis writes: Tom> I asked Keith to resurrect his patch for this. Gaby> Since people are concerned about typing rules, would it Gaby> be an option for GDB to allow people to input pointer Gaby> literals with the "p" suffix (or "0p" prefix instead of "0x")? I think on the whole I'd rather have one extension instead of two. It seems to me that the above would still require extensions in the overloading code, to deal with conversions from void*; or perhaps some magic pointer type. What I think Keith is going to do is take his patch, change it so that int->pointer conversions (if the int != 0) are given a different badness from other conversions (meaning that, in theory, this should only be chosen as a last resort, and shouldn't interfere with ordinary uses), and finally add a parameter so that the feature can be disabled. I hope this will be acceptable all around. Tom
Re: Merge C++ conversion into trunk (0/6 - Overview)
On Wed, Aug 15, 2012 at 12:53 PM, Tom Tromey wrote: >> "Diego" == Diego Novillo writes: > > Diego> GDB folks, would it be hard to figure out that there is a single > Diego> variant of the called function and trust the user that they are > Diego> passing the right pointer value? > > I asked Keith to resurrect his patch for this. Since people are concerned about typing rules, would it be an option for GDB to allow people to input pointer literals with the "p" suffix (or "0p" prefix instead of "0x")? -- Gaby
Re: Merge C++ conversion into trunk (0/6 - Overview)
> "Diego" == Diego Novillo writes: Diego> GDB folks, would it be hard to figure out that there is a single Diego> variant of the called function and trust the user that they are Diego> passing the right pointer value? I asked Keith to resurrect his patch for this. Tom
Re: [PATCH] PR target/53633; disable return value warnings for naked functions
On Wed, Jul 25, 2012 at 11:11 AM, Sandra Loosemore wrote: > On 07/25/2012 09:57 AM, Richard Henderson wrote: >> >> I'll echo Nick's comments about arm asm in a common test. >> There's no need to have anything but __asm__(""); there. >> >> Ok with that change. > > Thanks! Here's the version I committed. > > -Sandra > > > 2012-07-25 Sandra Loosemore > Paul Brook > > PR target/53633 > > gcc/ > * target.def (warn_func_return): New hook. > * doc/tm.texi.in (TARGET_WARN_FUNC_RETURN): New hook. > * doc/tm.texi: Regenerate. > * doc/sourcebuild.texi (Effective-Target Keywords): Document > naked_functions. > * ipa-pure-const.c (warn_function_noreturn): Check > targetm.warn_func_return. > * tree-cfg.c (execute_warn_function_return): Likewise. You didn't mention that you included "target.h" nor add "target.h" dependency on tree-cfg.c. I am checking in this as an obvious fix. -- H.J. -- diff --git a/gcc/Makefile.in b/gcc/Makefile.in index 65a49d7..f2366ad 100644 --- a/gcc/Makefile.in +++ b/gcc/Makefile.in @@ -2363,7 +2363,7 @@ tree-vrp.o : tree-vrp.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(TREE_H) \ $(CFGLOOP_H) $(SCEV_H) intl.h \ $(GIMPLE_PRETTY_PRINT_H) gimple-fold.h $(OPTABS_H) $(EXPR_H) tree-cfg.o : tree-cfg.c $(TREE_FLOW_H) $(CONFIG_H) $(SYSTEM_H) \ - $(TREE_H) $(TM_P_H) $(GGC_H) $(FLAGS_H) \ + $(TREE_H) $(TM_P_H) $(GGC_H) $(FLAGS_H) $(TARGET_H) \ $(DIAGNOSTIC_CORE_H) $(FUNCTION_H) $(TM_H) coretypes.h \ $(TREE_DUMP_H) $(EXCEPT_H) $(CFGLOOP_H) $(TREE_PASS_H) \ $(BASIC_BLOCK_H) \
Re: Merge C++ conversion into trunk (0/6 - Overview)
On Wed, Aug 15, 2012 at 05:49:34PM +0200, Jan Kratochvil wrote: > On Wed, 15 Aug 2012 17:44:32 +0200, Michael Matz wrote: > > On Wed, 15 Aug 2012, Jan Kratochvil wrote: > > It's not needless as the examples here show. gdb is about helping people > > debug their stuff, not about language lawyering. > > In such case there should be a GDB setting for it as at least from my opinion > it complicates the debugging. Then there may be different opinions what > should be the default. That would be fine for gcc development purposes. We could switch that mode on at the end of gcc .gdbinit . Jakub
Re: [RFC PATCH] -Wsizeof-pointer-memaccess warning
On Wed, 15 Aug 2012, Jakub Jelinek wrote: > I was mainly interested in whether such an approach is acceptable, or > whether I need to stop evaluating sizeof right away, create SIZEOF_EXPR > and only fold it during fully_fold*. I've briefly looked at that today, The approach is fine. Delaying evaluating sizeof is hard simply because of the expectation that integer constant expressions in general are evaluated early. -- Joseph S. Myers jos...@codesourcery.com
[patch] PR54146 - rewrite rewrite_into_loop_closed_ssa
Hello, This patch rewrites the rewriting part of rewrite_into_loop_closed_ssa. This function took ~300s on the simplified test case for PR54146, walking around the many thousands of loops in the >400,000 basic blocks in the CFG, in compute_global_livein. For rewriting into LC-SSA, we can do better than that if we use the knowledge that we're actually computing livein for an SSA name "DEF", therefore its uses must all be dominated by the basic block where DEF is assigned a value. But walking up the dominator tree doesn't work here because we want to traverse and mark the edges in the CFG where DEF is live, and we may not see those edges if we walk up the dominator tree from a USE to DEF. We do know, however, that when we walk up the CFG then: 1. if we enter any loop nested in the same loop as the basic block containing DEF (let's call it DEF_LOOP) then we can stop walking because the only way in is through one of the edges we're interested in. 2. if we enter a loop is not in the nesting stack of DEF_LOOP, then we can skin straight to the header of the loop and continue looking up from there. 3. if we reach a basic block X as a predecessor of Y and Y dominates X then we don't have to look at X or any of its predecessors. Putting this all together, you end up with what looks like a poor man's form of structural analysis where the control tree only contains loop nodes, non-loop basic blocks, and irreducible regions. (Honza: maybe re-surrect http://gcc.gnu.org/ml/gcc-patches/2003-06/msg01669.html? Now that we can maintain the loop tree, perhaps it's also feasible to maintain the complete control tree and use it in places where we want to analyze only a sub-CFG?) The effect is that the time spent in rewrite_into_loop_closed_ssa drops to less than one second. Bootstrapped&tested on powerpc64-unknown-linux-gnu. OK for trunk? Ciao! Steven PR54146_lcssa.diff Description: Binary data
Re: Merge C++ conversion into trunk (4/6 - hash table rewrite)
On 2012-08-15 07:29, Richard Guenther wrote: > + typedef typename Element::Element_t Element_t; Can we use something less ugly than Element_t? Such as typedef typename Element::T T; ? Given that this name is scoped anyway... r~
Re: Merge C++ conversion into trunk (0/6 - Overview)
On 12-08-15 11:44 , Michael Matz wrote: Hi, On Wed, 15 Aug 2012, Jan Kratochvil wrote: It is a needless violation of C++ resolving rules. It's not needless as the examples here show. gdb is about helping people debug their stuff, not about language lawyering. Agreed. If I'm passing a numeric pointer constant, I'm already past the bleeding edge. I don't want gdb to get in the way. If the inferior call crashes, I get to keep both pieces. Thanks. Diego.
Re: [RFC PATCH] -Wsizeof-pointer-memaccess warning
On Wed, Aug 15, 2012 at 03:39:29PM +, Joseph S. Myers wrote: > On Wed, 18 Jul 2012, Jakub Jelinek wrote: > > > + if (warn_sizeof_pointer_memaccess > > + && sizeof_arg != NULL_TREE) > > + sizeof_pointer_memaccess_warning (c_last_sizeof_arg_loc, > > + expr.value, exprlist, > > + sizeof_arg, > > + sizeof_ptr_memacc_comptypes); > > Why do you pass a local variable sizeof_arg but a global > c_last_sizeof_arg_loc here? I'd have expected the same approach to apply > for both arguments (e.g. passing down a pointer to a location alongside > &sizeof_arg). I guess I could pass it the same way as sizeof_arg or stick the location_t into sizeof_arg (surround the value into NOP_EXPR with the location_t). I was mainly interested in whether such an approach is acceptable, or whether I need to stop evaluating sizeof right away, create SIZEOF_EXPR and only fold it during fully_fold*. I've briefly looked at that today, and see several problems with that though: 1) fully_fold* is in c-family/, so can't use e.g. c_vla_type_p (at least not easily, it could use a langhook) 2) c_expr_sizeof_{expr,type} currently calls pop_maybe_used, I don't see how that could be easily deferred till later when the expression is fully folded. So, we might need to do what c_expr_sizeof_{expr,type} does right now (including fully folding), to call pop_maybe_used with the right argument, and then either create SIZEOF_EXPR with the original argument and fully fold it again during fully folding of the SIZEOF_EXPR, or add an argument to SIZEOF_EXPR which would contain the folded expression (i.e. have two arguments, the original expression and the value of the sizeof). Jakub
Re: Merge C++ conversion into trunk (0/6 - Overview)
Hi, On Wed, 15 Aug 2012 17:44:32 +0200, Michael Matz wrote: > On Wed, 15 Aug 2012, Jan Kratochvil wrote: > It's not needless as the examples here show. gdb is about helping people > debug their stuff, not about language lawyering. In such case there should be a GDB setting for it as at least from my opinion it complicates the debugging. Then there may be different opinions what should be the default. Thanks, Jan
Re: Merge C++ conversion into trunk (0/6 - Overview)
Hi, On Wed, 15 Aug 2012, Jan Kratochvil wrote: > It is a needless violation of C++ resolving rules. It's not needless as the examples here show. gdb is about helping people debug their stuff, not about language lawyering. > There are various easy way how to get it working (in .gdbinit or > cc1-gdb.gdb define GDB function, define macro in GDB, use GDB python > pretty printer instead (possibly even calling GCC inferior function) We should define gdb macros for every not-overloaded function (which are _all_ GCC functions currently)? Doesn't scale. Ciao, Michael.
Re: [RFC PATCH] -Wsizeof-pointer-memaccess warning
On Wed, 18 Jul 2012, Jakub Jelinek wrote: > + if (warn_sizeof_pointer_memaccess > + && sizeof_arg != NULL_TREE) > + sizeof_pointer_memaccess_warning (c_last_sizeof_arg_loc, > + expr.value, exprlist, > + sizeof_arg, > + sizeof_ptr_memacc_comptypes); Why do you pass a local variable sizeof_arg but a global c_last_sizeof_arg_loc here? I'd have expected the same approach to apply for both arguments (e.g. passing down a pointer to a location alongside &sizeof_arg). -- Joseph S. Myers jos...@codesourcery.com
Re: [PATCH 2/3] Incorporate aggregate jump functions into inlining analysis
Hi, On Fri, Aug 10, 2012 at 05:12:31AM +0200, Jan Hubicka wrote: > > Do you have any data on memory usage? I was originally concerned > about memory use of the whole predicate thingy on WPA level. > Eventually we could add simple inheritance on conditions and sort > them into mutiple vectors if needed. But I assume it is OK or we > will work out on Mozilla builds soonish. > When I have completely disabled computation of jump functions, the number of expression trees reported by detailed -fmem-report in the WPA stage dropped from 2253872 to 2124732, i.e. by 5.7%. Martin
Re: [PATCH] Set current_function_decl in {push,pop}_cfun and push_struct_function
Hi, On Fri, Aug 10, 2012 at 04:57:41PM +0200, Eric Botcazou wrote: > > - ada/gcc-interface/utils.c:rest_of_subprog_body_compilation calls > > dump_function which in turns calls dump_function_to_file which calls > > push_cfun. But Ada front end has its idea of the > > current_function_decl and there is no cfun which is an inconsistency > > which makes push_cfun assert fail. I "solved" it by temporarily > > setting current_function_decl to NULL_TREE. It's just dumping and I > > thought that dump_function should be considered middle-end and thus > > middle-end invariants should apply. > > If you think that calling dump_function from rest_of_subprog_body_compilation > is a layering violation, I don't have a problem with replacing it with a more > "manual" scheme like the one in c-family/c-gimplify.c:c_genericize, provided > that this yields roughly the same output. Richi suggested on IRC that I remove the push/pop_cfun calls from dump_function_to_file. The only problem seems to be dump_histograms_for_stmt (buried in a dump_bb call) which needs a function to get the value histograms. Richi suggested to replace the per function hash with one global one. I tried that and bumped into two problems: First, I did not really know when to deallocate the global hash and all the histograms. More importantly, the current gimple verifier (verify_gimple_in_cfg) checks that there are no "dead histograms" in the hash corresponding to statements no longer in the function. With one global hash, it would be very cumbersome to do this check (add cfun to each histogram? only when checking is enabled?). Hash tables are now in flux anyway so I postponed that work until the interface settles down. Nevertheless, I guess I need a confirmation from maintainers that I can remove the checking if I continue this way. When I looked at what would need to be changed if I added a function parameter to dump_bb, the work list grew very large very quickly. It is probably doable, but the change will be quite big, so if anyone thinks it is a bad idea, please email me to stop. If I manage to change the dumping in one of the two ways, the call will not be a layering problem any more as it probably should not be. At the moment, it probably is. We'll see. Thanks, Martin
Re: Merge C++ conversion into trunk (0/6 - Overview)
On Wed, 15 Aug 2012 14:23:37 +0200, Diego Novillo wrote: > GDB folks, would it be hard to figure out that there is a single > variant of the called function and trust the user that they are > passing the right pointer value? It is a needless violation of C++ resolving rules. There are various easy way how to get it working (in .gdbinit or cc1-gdb.gdb define GDB function, define macro in GDB, use GDB python pretty printer instead (possibly even calling GCC inferior function) etc.). While I did not post such patch yet I would prefer to even forbid by default expressions like (gdb) print *0x1234567 and enforce (gdb) print *(int *)0x1234567 instead as the former syntax confuses people (as commonly seen on IRC), the same applies to calling functions without full debuginfo (they should require an explicit cast in GDB) etc. Regards, Jan
Re: C++ PR 54197: lifetime of reference not properly extended
(Adding other C++ maintainers in case someone else wants to have a stab.) Ping? Ollie On Mon, Aug 13, 2012 at 4:01 PM, Ollie Wild wrote: > > On Mon, Aug 13, 2012 at 3:50 PM, Jakub Jelinek wrote: > > > > The formatting doesn't match GCC coding conventions in several ways. > > You don't have spaces before (, and ( shouldn't be at the end of line if > > possible. > > Updated patch attached. > > Ollie
Re: Merge C++ conversion into trunk (2/6 - VEC rewrite)
On 12-08-15 10:44 , Eric Botcazou wrote: This implements the VEC re-write. See http://gcc.gnu.org/ml/gcc-patches/2012-08/msg00711.html for details. You didn't update the head comment in vec.h though, is that on purpose? Yes. I am updating it now that I'm *really* changing the interface. Diego.
Re: RFC: fix std::unique_ptr pretty-printer
On 14 August 2012 15:44, Tom Tromey wrote: >> "Jonathan" == Jonathan Wakely writes: > > Jonathan> I prefer it as unique_ptr but I'm probably not your typical > Jonathan> user of the pretty printers, so if anyone else has an opinion please > Jonathan> share it. > > I prefer it too. Here's the updated patch. Let me know what you think. I like it, please go ahead and check that in it you're happy with it.
Re: Merge C++ conversion into trunk (2/6 - VEC rewrite)
> This implements the VEC re-write. > > See http://gcc.gnu.org/ml/gcc-patches/2012-08/msg00711.html for > details. You didn't update the head comment in vec.h though, is that on purpose? -- Eric Botcazou
Re: Merge C++ conversion into trunk (4/6 - hash table rewrite)
On Wed, 15 Aug 2012, Michael Matz wrote: > Hi, > > On Wed, 15 Aug 2012, Richard Guenther wrote: > > > Like the following, only the coverage.c use is converted. I've never > > seen template function arguments anywhere and having to repeat them all > > over the place is really really ugly (yes, even if only in the > > implementation). > > > > This goes with static member functions and not wrapping any data. It > > goes away with the requirement of having externally visible global > > functions for the hash/compare functions as well (which was ugly > > anyway). > > > > Comments? > > Well, it looks nicer than what's there currently. As the element > functions now are scoped and normal member functions, they should be named > with lower case characters of course. I do like that the hash table would > then only have one real template argument; the Allocator, well, no better > idea comes to my mind. Yeah. Updated patch below, all users converted. I probably missed to revert some of the globalizations of hash/compare fns. Comments? Richard. Index: gcc/hash-table.h === *** gcc/hash-table.h.orig 2012-08-15 15:39:34.0 +0200 --- gcc/hash-table.h2012-08-15 16:17:06.613628039 +0200 *** xcallocator ::data_free (Type *mem *** 83,127 } - /* A common function for hashing a CANDIDATE typed pointer. */ - template ! inline hashval_t ! typed_pointer_hash (const Element *candidate) { ! /* This is a really poor hash function, but it is what the current code uses, ! so I am reusing it to avoid an additional axis in testing. */ ! return (hashval_t) ((intptr_t)candidate >> 3); ! } - /* A common function for comparing an EXISTING and CANDIDATE typed pointers -for equality. */ template ! inline int ! typed_pointer_equal (const Element *existing, const Element * candidate) { ! return existing == candidate; ! } ! /* A common function for doing nothing on removing a RETIRED slot. */ template ! inline void ! typed_null_remove (Element *retired ATTRIBUTE_UNUSED) { } - - /* A common function for using free on removing a RETIRED slot. */ - template ! inline void ! typed_free_remove (Element *retired) { ! free (retired); } --- 83,132 } template ! class typed_free_remove { ! public: ! static inline void remove (Element *p) { free (p); } ! }; + template + class typed_noop_remove + { + public: + static inline void remove (Element *) {} + }; + /* Pointer hash. */ template ! class pointer_hash : public typed_noop_remove { ! public: ! typedef Element Element_t; + static inline hashval_t + hash (const Element_t *); ! static inline int ! equal (const Element_t *existing, const Element_t * candidate); ! }; template ! inline hashval_t ! pointer_hash::hash (const Element_t *candidate) { + /* This is a really poor hash function, but it is what the current code uses, + so I am reusing it to avoid an additional axis in testing. */ + return (hashval_t) ((intptr_t)candidate >> 3); } template ! inline int ! pointer_hash::equal (const Element_t *existing, ! const Element_t *candidate) { ! return existing == candidate; } *** struct hash_table_control *** 180,194 The table stores elements of type Element. !It hashes elements with the Hash function. The table currently works with relatively weak hash functions. Use typed_pointer_hash when hashing pointers instead of objects. !It compares elements with the Equal function. Two elements with the same hash may not be equal. Use typed_pointer_equal when hashing pointers instead of objects. !It removes elements with the Remove function. This feature is useful for freeing memory. Use typed_null_remove when not freeing objects. Use typed_free_remove when doing a simple object free. --- 185,199 The table stores elements of type Element. !It hashes elements with the hash function. The table currently works with relatively weak hash functions. Use typed_pointer_hash when hashing pointers instead of objects. !It compares elements with the equal function. Two elements with the same hash may not be equal. Use typed_pointer_equal when hashing pointers instead of objects. !It removes elements with the remove function. This feature is useful for freeing memory. Use typed_null_remove when not freeing objects. Use typed_free_remove when doing a simple object free. *** struct hash_table_control *** 199,243 */ template class Allocator = xcallocator> class hash_table { private: ! hash_table_control *htab; ! ! Element **find_empty_slot_for_expand (hashval_t hash); void expand ();
Re: Merge C++ conversion into trunk (0/6 - Overview)
On Aug 15, 2012, at 4:59 AM, Richard Guenther wrote: > and debugging becomes a nightmare (hello gdb people!) > (gdb) call debug_tree (0x767fa5e8) > Cannot resolve function debug_tree to any overloaded instance Inquiring minds want to know if: macro define debug_tree(A) ((tree)A) makes the above work if you put it into gdbinit.in? If yes, then, I think for now we should add such lines for every non-overloaded function, I too like using hex constants from dumps. Oh, or, maybe we just add a debug_tree(long) overload just to satisfy gdb.
Re: Merge C++ conversion into trunk (4/6 - hash table rewrite)
Hi, On Wed, 15 Aug 2012, Richard Guenther wrote: > Like the following, only the coverage.c use is converted. I've never > seen template function arguments anywhere and having to repeat them all > over the place is really really ugly (yes, even if only in the > implementation). > > This goes with static member functions and not wrapping any data. It > goes away with the requirement of having externally visible global > functions for the hash/compare functions as well (which was ugly > anyway). > > Comments? Well, it looks nicer than what's there currently. As the element functions now are scoped and normal member functions, they should be named with lower case characters of course. I do like that the hash table would then only have one real template argument; the Allocator, well, no better idea comes to my mind. Ciao, Michael.
Re: Merge C++ conversion into trunk (4/6 - hash table rewrite)
On Wed, 15 Aug 2012, Richard Guenther wrote: > On Sun, 12 Aug 2012, Diego Novillo wrote: > > > This implements a new C++ hash table. > > > > See http://gcc.gnu.org/ml/gcc-patches/2012-08/msg00711.html for > > details. > > > > Diego. > > Now as we see the result I'd have prefered a more C++-way instead > of making the conversion simple ... > > Like > > template > class hash_table > { > ... > }; > > template > class pointer_hash > { > hashval_t hash (); > int equal (const Element *); > ~Element (); > Element e; > }; > > and > > /* Hash table with all same_succ entries. */ > > static hash_table > same_succ_htab; > > The existing way is simply too ugly ... so, why did you not invent > a "nice" C++ way? Like the following, only the coverage.c use is converted. I've never seen template function arguments anywhere and having to repeat them all over the place is really really ugly (yes, even if only in the implementation). This goes with static member functions and not wrapping any data. It goes away with the requirement of having externally visible global functions for the hash/compare functions as well (which was ugly anyway). Comments? Thanks, Richard. > diffstat ~/p coverage.c | 43 +- hash-table.h | 234 +-- 2 files changed, 108 insertions(+), 169 deletions(-) Index: gcc/hash-table.h === --- gcc/hash-table.h(revision 190410) +++ gcc/hash-table.h(working copy) @@ -199,45 +199,42 @@ struct hash_table_control */ template class Allocator = xcallocator> class hash_table { +public: + typedef typename Element::Element_t Element_t; private: + hash_table_control *htab; - hash_table_control *htab; - - Element **find_empty_slot_for_expand (hashval_t hash); + Element_t **find_empty_slot_for_expand (hashval_t hash); void expand (); public: - hash_table (); void create (size_t initial_slots); bool is_created (); void dispose (); - Element *find (Element *comparable); - Element *find_with_hash (Element *comparable, hashval_t hash); - Element **find_slot (Element *comparable, enum insert_option insert); - Element **find_slot_with_hash (Element *comparable, hashval_t hash, -enum insert_option insert); + Element_t *find (Element_t *comparable); + Element_t *find_with_hash (Element_t *comparable, hashval_t hash); + Element_t **find_slot (Element_t *comparable, enum insert_option insert); + Element_t **find_slot_with_hash (Element_t *comparable, hashval_t hash, + enum insert_option insert); void empty (); - void clear_slot (Element **slot); - void remove_elt (Element *comparable); - void remove_elt_with_hash (Element *comparable, hashval_t hash); + void clear_slot (Element_t **slot); + void remove_elt (Element_t *comparable); + void remove_elt_with_hash (Element_t *comparable, hashval_t hash); size_t size(); size_t elements(); double collisions(); template + int (*Callback) (Element_t **slot, Argument argument)> void traverse_noresize (Argument argument); template + int (*Callback) (Element_t **slot, Argument argument)> void traverse (Argument argument); }; @@ -245,12 +242,9 @@ public: /* Construct the hash table. The only useful operation next is create. */ template class Allocator> inline -hash_table ::hash_table () +hash_table ::hash_table () : htab (NULL) { } @@ -259,12 +253,9 @@ hash_table class Allocator> inline bool -hash_table ::is_created () +hash_table ::is_created () { return htab != NULL; } @@ -273,56 +264,44 @@ hash_table class Allocator> -inline Element * -hash_table ::find (Element *comparable) +inline typename Element::Element_t * +hash_table ::find (Element_t *comparable) { - return find_with_hash (comparable, Hash (comparable)); + return find_with_hash (comparable, Element::Hash (comparable)); } /* Like find_slot_with_hash, but compute the hash value from the element. */ template class Allocator> -inline Element ** -hash_table -::find_slot (Element *comparable, enum insert_option insert) + template class Allocator> +inline typename Element::Element_t ** +hash_table +::find_slot (Element_t *comparable, enum insert_option insert) { - return find_slot_with_hash (comparable, Hash (comparable), insert); + return find_slot_with_hash (comparable, Element::Hash (comparable), insert); } /* Like remove_elt_with_hash, but compute the hash value from the element. */ template class Allocator> inline void -hash_table -::remove_elt (Element *comparable) +hash_table +::remove_elt (Element_t *comparable) { - remove_elt_with_hash (comparable, Hash (comparable)); + remove_elt_with_hash (comparable, Element::Hash (comparable)); } /* Return the current size of this hash table. */ template class Allocator> inline size_t
Re: combine permutations in gimple
On Wed, Aug 15, 2012 at 2:53 PM, Jakub Jelinek wrote: > On Wed, Aug 15, 2012 at 02:36:54PM +0200, Richard Guenther wrote: >> On Wed, Aug 15, 2012 at 2:29 PM, Jakub Jelinek wrote: >> > On Wed, Aug 15, 2012 at 02:15:03PM +0200, Richard Guenther wrote: >> >> Ok. That would still leave us with the issue Ramana brought up - the >> >> target hook returning true unconditionally if a generic permute is >> >> implemented. >> >> We just avoid generic expansion by tree-vect-generic.c that way. >> > >> > Yeah, if there is a generic permute, can_vec_perm_p will return true always >> > for the modes for which it is available. Which is why I've been suggesting >> > also the target hook which should return false if only generic permute is >> > going to be used. >> >> Well. What about returning a cost instead? We don't want to transform >> two insns to four either. > > That could work, it would just be a lot of work to change it (and more > costly). E.g. on i386 for 16-byte vectors with certain ISAs we return true > right away, when returning cost we'd need to go the slow path always > even when the caller is just interested in a boolean flag whether it is > possible or not. > > CCing rth as the author of the hooks. We'd want more accurate costs for the vectorizer cost model anyways. Richard. > Jakub
Re: combine permutations in gimple
On Wed, Aug 15, 2012 at 02:36:54PM +0200, Richard Guenther wrote: > On Wed, Aug 15, 2012 at 2:29 PM, Jakub Jelinek wrote: > > On Wed, Aug 15, 2012 at 02:15:03PM +0200, Richard Guenther wrote: > >> Ok. That would still leave us with the issue Ramana brought up - the > >> target hook returning true unconditionally if a generic permute is > >> implemented. > >> We just avoid generic expansion by tree-vect-generic.c that way. > > > > Yeah, if there is a generic permute, can_vec_perm_p will return true always > > for the modes for which it is available. Which is why I've been suggesting > > also the target hook which should return false if only generic permute is > > going to be used. > > Well. What about returning a cost instead? We don't want to transform > two insns to four either. That could work, it would just be a lot of work to change it (and more costly). E.g. on i386 for 16-byte vectors with certain ISAs we return true right away, when returning cost we'd need to go the slow path always even when the caller is just interested in a boolean flag whether it is possible or not. CCing rth as the author of the hooks. Jakub
Re: combine permutations in gimple
On Wed, Aug 15, 2012 at 2:29 PM, Jakub Jelinek wrote: > On Wed, Aug 15, 2012 at 02:15:03PM +0200, Richard Guenther wrote: >> Ok. That would still leave us with the issue Ramana brought up - the >> target hook returning true unconditionally if a generic permute is >> implemented. >> We just avoid generic expansion by tree-vect-generic.c that way. > > Yeah, if there is a generic permute, can_vec_perm_p will return true always > for the modes for which it is available. Which is why I've been suggesting > also the target hook which should return false if only generic permute is > going to be used. Well. What about returning a cost instead? We don't want to transform two insns to four either. Richard. > Jakub
Re: combine permutations in gimple
On Wed, Aug 15, 2012 at 02:15:03PM +0200, Richard Guenther wrote: > Ok. That would still leave us with the issue Ramana brought up - the > target hook returning true unconditionally if a generic permute is > implemented. > We just avoid generic expansion by tree-vect-generic.c that way. Yeah, if there is a generic permute, can_vec_perm_p will return true always for the modes for which it is available. Which is why I've been suggesting also the target hook which should return false if only generic permute is going to be used. Jakub
Re: Merge C++ conversion into trunk (0/6 - Overview)
On 12-08-15 08:18 , Richard Guenther wrote: 0 is fixed if you have recent enough gdb. (gdb) call debug_tree (0) as 0 is a null pointer constant. Oh, cool. Progress. GDB folks, would it be hard to figure out that there is a single variant of the called function and trust the user that they are passing the right pointer value? Diego.
Re: combine permutations in gimple
On 15 August 2012 13:07, Jakub Jelinek wrote: > On Wed, Aug 15, 2012 at 01:46:05PM +0200, Richard Guenther wrote: >> Well, we're waiting for someone to break the tie ... I'd go with the original >> patch, improving the backends where necessary. > > E.g. i?86/x86_64 with just plain -msse2 has only very small subset of > constant shuffles (and no variable shuffle), so by doing the transformation > you could end up with a scalarized shuffle instead of two constant vector > shuffles. Expecting the backend to figure it out and doing the two constant > vector shuffles in every case is not realistic, i386/x86_64 has way too many > different shuffling instructions (and worse different ISA levels have > different subsets of them) and while a lot of effort has been spent on > it already by Richard, me, Marc and others, we are nowhere close to having > optimal sequence in many cases for various modes and ISA levels. Doing a > brute force might be too expensive. > Neon has atleast 7-8 such forms. Brute force computing this would be expensive. I don't know what happens on other vector targets - surely other SIMD targets in GCC will also have the same problem. Ramana > Jakub
Re: Merge C++ conversion into trunk (0/6 - Overview)
On Wed, 15 Aug 2012, Diego Novillo wrote: > On 12-08-15 07:59 , Richard Guenther wrote: > > > (gdb) call debug_tree (*expr_p) > > > > constant 9> > > (gdb) call debug_tree (0x767fa5e8) > > Cannot resolve function debug_tree to any overloaded instance > > Yeah, in the absence of overloads this is annoying. Happens with 0, too. 0 is fixed if you have recent enough gdb. (gdb) call debug_tree (0) as 0 is a null pointer constant. Richard.
Re: Merge C++ conversion into trunk (0/6 - Overview)
On 12-08-15 07:59 , Richard Guenther wrote: (gdb) call debug_tree (*expr_p) constant 9> (gdb) call debug_tree (0x767fa5e8) Cannot resolve function debug_tree to any overloaded instance Yeah, in the absence of overloads this is annoying. Happens with 0, too. Diego.
Re: combine permutations in gimple
On Wed, Aug 15, 2012 at 2:07 PM, Jakub Jelinek wrote: > On Wed, Aug 15, 2012 at 01:46:05PM +0200, Richard Guenther wrote: >> Well, we're waiting for someone to break the tie ... I'd go with the original >> patch, improving the backends where necessary. > > E.g. i?86/x86_64 with just plain -msse2 has only very small subset of > constant shuffles (and no variable shuffle), so by doing the transformation > you could end up with a scalarized shuffle instead of two constant vector > shuffles. Expecting the backend to figure it out and doing the two constant > vector shuffles in every case is not realistic, i386/x86_64 has way too many > different shuffling instructions (and worse different ISA levels have > different subsets of them) and while a lot of effort has been spent on > it already by Richard, me, Marc and others, we are nowhere close to having > optimal sequence in many cases for various modes and ISA levels. Doing a > brute force might be too expensive. Ok. That would still leave us with the issue Ramana brought up - the target hook returning true unconditionally if a generic permute is implemented. We just avoid generic expansion by tree-vect-generic.c that way. (I wonder if there exists some automated machinery that finds a path using existing instructions to shuffle from {0, 1, ..., n } to the mask and if that would be a viable way to handle shuffle expansions, or pre-compute them all) Richard. > Jakub
Re: combine permutations in gimple
On Wed, Aug 15, 2012 at 01:46:05PM +0200, Richard Guenther wrote: > Well, we're waiting for someone to break the tie ... I'd go with the original > patch, improving the backends where necessary. E.g. i?86/x86_64 with just plain -msse2 has only very small subset of constant shuffles (and no variable shuffle), so by doing the transformation you could end up with a scalarized shuffle instead of two constant vector shuffles. Expecting the backend to figure it out and doing the two constant vector shuffles in every case is not realistic, i386/x86_64 has way too many different shuffling instructions (and worse different ISA levels have different subsets of them) and while a lot of effort has been spent on it already by Richard, me, Marc and others, we are nowhere close to having optimal sequence in many cases for various modes and ISA levels. Doing a brute force might be too expensive. Jakub
Re: [rfc] Fix SPU build (Re: [PATCH] Remove basic_block->loop_depth)
On Wed, 15 Aug 2012, Ulrich Weigand wrote: > Richard Guenther wrote: > > On Tue, 14 Aug 2012, Ulrich Weigand wrote: > > > Looks like this broke SPU build, since spu_machine_dependent_reorg > > > accesses ->loop_depth. According to comments in the code, this > > > was done because of concerns that loop_father may no longer be set up > > > this late in compilation, so I'm wondering whether just replacing > > > this by loop_depth (bb->loop_father) would work here ... > > > If SPU md reorg would like to look at loop structures it should > > compute them. Simply call flow_loops_find, which hopefully works > > in CFG RTL mode (which I think is the mode available from md reorg?). > > It seems flow_loops_find by itself is not quite enough, but everything > necessary to use the loop structures seems to be encapsulated in > loop_optimizer_init / loop_optimizer_finalize, which are already called > by a number of optimization passes. When I do the same in the SPU > md-reorg pass, loop_father seems to be set up OK. > > Does this look reasonable? Yes. You can use bb_loop_depth instead of loop_depth (bb->loop_father). Richard. > Thanks, > Ulrich > > > ChangeLog: > > * config/spu/spu.c: Include "cfgloop.h". > (spu_machine_dependent_reorg): Call loop_optimizer_init and > loop_optimizer_finalize. Use loop_father instead of loop_depth. > * config/spu/t-spu-elf (spu.o): Update dependencies. > > Index: gcc/config/spu/spu.c > === > *** gcc/config/spu/spu.c (revision 190390) > --- gcc/config/spu/spu.c (working copy) > *** > *** 53,58 > --- 53,59 > #include "timevar.h" > #include "df.h" > #include "dumpfile.h" > + #include "cfgloop.h" > > /* Builtin types, data and prototypes. */ > > *** spu_machine_dependent_reorg (void) > *** 2458,2463 > --- 2459,2468 > in_spu_reorg = 1; > compute_bb_for_insn (); > > + /* (Re-)discover loops so that bb->loop_father can be used > + in the analysis below. */ > + loop_optimizer_init (AVOID_CFG_MODIFICATIONS); > + > compact_blocks (); > > spu_bb_info = > *** spu_machine_dependent_reorg (void) > *** 2562,2575 >fallthru block. This catches the cases when it is a simple >loop or when there is an initial branch into the loop. */ > if (prev && (loop_exit || simple_loop) > ! && prev->loop_depth <= bb->loop_depth) > prop = prev; > > /* If there is only one adjacent predecessor. Don't propagate > ! outside this loop. This loop_depth test isn't perfect, but > ! I'm not sure the loop_father member is valid at this point. */ > else if (prev && single_pred_p (bb) > !&& prev->loop_depth == bb->loop_depth) > prop = prev; > > /* If this is the JOIN block of a simple IF-THEN then > --- 2567,2580 >fallthru block. This catches the cases when it is a simple >loop or when there is an initial branch into the loop. */ > if (prev && (loop_exit || simple_loop) > ! && (loop_depth (prev->loop_father) > ! <= loop_depth (bb->loop_father))) > prop = prev; > > /* If there is only one adjacent predecessor. Don't propagate > ! outside this loop. */ > else if (prev && single_pred_p (bb) > !&& prev->loop_father == bb->loop_father) > prop = prev; > > /* If this is the JOIN block of a simple IF-THEN then > *** spu_machine_dependent_reorg (void) > *** 2578,2584 > && EDGE_COUNT (bb->preds) == 2 > && EDGE_COUNT (prev->preds) == 1 > && EDGE_PRED (prev, 0)->src == prev2 > !&& prev2->loop_depth == bb->loop_depth > && GET_CODE (branch_target) != REG) > prop = prev; > > --- 2583,2589 > && EDGE_COUNT (bb->preds) == 2 > && EDGE_COUNT (prev->preds) == 1 > && EDGE_PRED (prev, 0)->src == prev2 > !&& prev2->loop_father == bb->loop_father > && GET_CODE (branch_target) != REG) > prop = prev; > > *** spu_machine_dependent_reorg (void) > *** 2600,2606 > if (dump_file) > fprintf (dump_file, "propagate from %i to %i (loop depth %i) " >"for %i (loop_exit %i simple_loop %i dist %i)\n", > ! bb->index, prop->index, bb->loop_depth, >INSN_UID (branch), loop_exit, simple_loop, >branch_addr - INSN_ADDRESSES (INSN_UID (bbend))); > > --- 2605,2611 > if (dump_file) > fprintf (dump_file, "propagate from %i to %i (loop depth %i) " >"for %i (loop_exit %i simple_loop %i dist %i)\n", > !
Re: combine permutations in gimple
On Wed, Aug 15, 2012 at 1:56 PM, Ramana Radhakrishnan wrote: > [It looks like I missed hitting the send button on this response] > >> >> Seems to be one instruction shorter at least ;-) Yes, there can be much >> worse regressions than that because of the patch (like 40 instructions >> instead of 4, in the x86 backend). > > If this is replacing 4 instructions with 40 in x86 backend maybe > someone will notice :) > > Not a win in this particular testcase because the compiler replaces 2 > constant permutes ( that's about 4 cycles) with a load from the > constant pool , a generic permute and in addition are polluting the > icache with guff in the constant pool . If you go to 3 -4 permutes > into a single one then it might be a win but not till that point. > > >> with a-b without first asking the backend whether it might be more >> efficient. One permutation is better than 2. >> It just happens that the range >> of possible permutations is too large (and the basic instructions are too >> strange) for backends to do a good job on them, and thus keeping toplevel >> input as a hint is important. > > Of-course, the problem here is this change of semantics with the hook > TARGET_VEC_PERM_CONST_OK. Targets were expanding to generic permutes > with constants in the *absence* of being able to deal with them with > the specialized permutes. fwprop will now leave us at a point where > each target has to now grow more knowledge with respect to how best to > expand a generic constant permute with a sequence of permutes rather > than just using the generic permute. > > Generating a sequence of permutes from a single constant permute will > be a harder problem than (say) dealing with immediate expansions so > you are pushing more complexity into the target but in the short term > target maintainers should definitely have a heads up that folks using > vector permute intrinsics could actually have performance regressions > on their targets. It's of course the same with the user input containing such a non-optimal handled constant permute. So I'm less convinced that it's too much burden on the target side. OTOH if there is a generic kind of shuffles that targets do not implement directly but can be simulated by two that are directly implemented pushing the logic to the expander (and adjusting the target hook semantic) would be ok. Richard. > Thanks, > Ramana
[rfc] Fix SPU build (Re: [PATCH] Remove basic_block->loop_depth)
Richard Guenther wrote: > On Tue, 14 Aug 2012, Ulrich Weigand wrote: > > Looks like this broke SPU build, since spu_machine_dependent_reorg > > accesses ->loop_depth. According to comments in the code, this > > was done because of concerns that loop_father may no longer be set up > > this late in compilation, so I'm wondering whether just replacing > > this by loop_depth (bb->loop_father) would work here ... > If SPU md reorg would like to look at loop structures it should > compute them. Simply call flow_loops_find, which hopefully works > in CFG RTL mode (which I think is the mode available from md reorg?). It seems flow_loops_find by itself is not quite enough, but everything necessary to use the loop structures seems to be encapsulated in loop_optimizer_init / loop_optimizer_finalize, which are already called by a number of optimization passes. When I do the same in the SPU md-reorg pass, loop_father seems to be set up OK. Does this look reasonable? Thanks, Ulrich ChangeLog: * config/spu/spu.c: Include "cfgloop.h". (spu_machine_dependent_reorg): Call loop_optimizer_init and loop_optimizer_finalize. Use loop_father instead of loop_depth. * config/spu/t-spu-elf (spu.o): Update dependencies. Index: gcc/config/spu/spu.c === *** gcc/config/spu/spu.c(revision 190390) --- gcc/config/spu/spu.c(working copy) *** *** 53,58 --- 53,59 #include "timevar.h" #include "df.h" #include "dumpfile.h" + #include "cfgloop.h" /* Builtin types, data and prototypes. */ *** spu_machine_dependent_reorg (void) *** 2458,2463 --- 2459,2468 in_spu_reorg = 1; compute_bb_for_insn (); + /* (Re-)discover loops so that bb->loop_father can be used + in the analysis below. */ + loop_optimizer_init (AVOID_CFG_MODIFICATIONS); + compact_blocks (); spu_bb_info = *** spu_machine_dependent_reorg (void) *** 2562,2575 fallthru block. This catches the cases when it is a simple loop or when there is an initial branch into the loop. */ if (prev && (loop_exit || simple_loop) ! && prev->loop_depth <= bb->loop_depth) prop = prev; /* If there is only one adjacent predecessor. Don't propagate !outside this loop. This loop_depth test isn't perfect, but !I'm not sure the loop_father member is valid at this point. */ else if (prev && single_pred_p (bb) ! && prev->loop_depth == bb->loop_depth) prop = prev; /* If this is the JOIN block of a simple IF-THEN then --- 2567,2580 fallthru block. This catches the cases when it is a simple loop or when there is an initial branch into the loop. */ if (prev && (loop_exit || simple_loop) ! && (loop_depth (prev->loop_father) ! <= loop_depth (bb->loop_father))) prop = prev; /* If there is only one adjacent predecessor. Don't propagate !outside this loop. */ else if (prev && single_pred_p (bb) ! && prev->loop_father == bb->loop_father) prop = prev; /* If this is the JOIN block of a simple IF-THEN then *** spu_machine_dependent_reorg (void) *** 2578,2584 && EDGE_COUNT (bb->preds) == 2 && EDGE_COUNT (prev->preds) == 1 && EDGE_PRED (prev, 0)->src == prev2 ! && prev2->loop_depth == bb->loop_depth && GET_CODE (branch_target) != REG) prop = prev; --- 2583,2589 && EDGE_COUNT (bb->preds) == 2 && EDGE_COUNT (prev->preds) == 1 && EDGE_PRED (prev, 0)->src == prev2 ! && prev2->loop_father == bb->loop_father && GET_CODE (branch_target) != REG) prop = prev; *** spu_machine_dependent_reorg (void) *** 2600,2606 if (dump_file) fprintf (dump_file, "propagate from %i to %i (loop depth %i) " "for %i (loop_exit %i simple_loop %i dist %i)\n", !bb->index, prop->index, bb->loop_depth, INSN_UID (branch), loop_exit, simple_loop, branch_addr - INSN_ADDRESSES (INSN_UID (bbend))); --- 2605,2611 if (dump_file) fprintf (dump_file, "propagate from %i to %i (loop depth %i) " "for %i (loop_exit %i simple_loop %i dist %i)\n", !bb->index, prop->index, loop_depth (bb->loop_father), INSN_UID (branch), loop_exit, simple_loop, branch_addr - INSN_ADDRESSES (INSN_UID (bbend))); *** spu_machine_dependent_re
Re: Merge C++ conversion into trunk (0/6 - Overview)
On Sun, 12 Aug 2012, Diego Novillo wrote: > I will be sending 6 patches that implement all the changes we > have been making on the cxx-conversion branch. As described in > http://gcc.gnu.org/ml/gcc/2012-08/msg00015.html, these patches > change the default bootstrap process so that stage 1 always > builds with a C++ compiler. > > Other than the bootstrap change, the patches make no functional > changes to the compiler. Everything should build as it does now > in trunk. ... and debugging becomes a nightmare (hello gdb people!) (gdb) call debug_tree (*expr_p) constant 9> (gdb) call debug_tree (0x767fa5e8) Cannot resolve function debug_tree to any overloaded instance filing bugs does not help (as usual). http://sourceware.org/bugzilla/show_bug.cgi?id=13356 It's a year old, to prep for this change. Now we're here. Debugging sucks. Great. Richard.
Re: combine permutations in gimple
[It looks like I missed hitting the send button on this response] > > Seems to be one instruction shorter at least ;-) Yes, there can be much > worse regressions than that because of the patch (like 40 instructions > instead of 4, in the x86 backend). If this is replacing 4 instructions with 40 in x86 backend maybe someone will notice :) Not a win in this particular testcase because the compiler replaces 2 constant permutes ( that's about 4 cycles) with a load from the constant pool , a generic permute and in addition are polluting the icache with guff in the constant pool . If you go to 3 -4 permutes into a single one then it might be a win but not till that point. > with a-b without first asking the backend whether it might be more > efficient. One permutation is better than 2. > It just happens that the range > of possible permutations is too large (and the basic instructions are too > strange) for backends to do a good job on them, and thus keeping toplevel > input as a hint is important. Of-course, the problem here is this change of semantics with the hook TARGET_VEC_PERM_CONST_OK. Targets were expanding to generic permutes with constants in the *absence* of being able to deal with them with the specialized permutes. fwprop will now leave us at a point where each target has to now grow more knowledge with respect to how best to expand a generic constant permute with a sequence of permutes rather than just using the generic permute. Generating a sequence of permutes from a single constant permute will be a harder problem than (say) dealing with immediate expansions so you are pushing more complexity into the target but in the short term target maintainers should definitely have a heads up that folks using vector permute intrinsics could actually have performance regressions on their targets. Thanks, Ramana
Re: combine permutations in gimple
On Wed, Aug 15, 2012 at 1:22 PM, Marc Glisse wrote: > On Mon, 13 Aug 2012, Marc Glisse wrote: > >> I'll give it a few more days for the conversation to settle, so I know >> what I should do between: >> - the barely modified patch you accepted, >> - the check asked by Jakub, >> - the restriction to identity that prevents any regression (well...), >> - something else? > > > I didn't realize the conversation would go quiet immediatly... > > Is someone willing to take the responsibility of telling me which approach > is right? I can add Jakub's checks (though I am not sure how I would test > them), but I would rather not do it if the conclusion is that they are > either unnecessary (original patch is ok) or insufficient (don't avoid > Ramana's regressions). I can do the very safe option 3 (only combine > permutations when the result is the identity permutation), but I first want > to make sure the more general thing is bad. > > (sorry, it's my first patch that gets conflicting answers...) Well, we're waiting for someone to break the tie ... I'd go with the original patch, improving the backends where necessary. Richard. > -- > Marc Glisse
Re: combine permutations in gimple
On Mon, 13 Aug 2012, Marc Glisse wrote: I'll give it a few more days for the conversation to settle, so I know what I should do between: - the barely modified patch you accepted, - the check asked by Jakub, - the restriction to identity that prevents any regression (well...), - something else? I didn't realize the conversation would go quiet immediatly... Is someone willing to take the responsibility of telling me which approach is right? I can add Jakub's checks (though I am not sure how I would test them), but I would rather not do it if the conclusion is that they are either unnecessary (original patch is ok) or insufficient (don't avoid Ramana's regressions). I can do the very safe option 3 (only combine permutations when the result is the identity permutation), but I first want to make sure the more general thing is bad. (sorry, it's my first patch that gets conflicting answers...) -- Marc Glisse
Re: LEA-splitting improvement patch.
Hi Uros, I send you new patch with fixed space/tab alignments. About your comment. It is more optimal to put adding of constant before adding of the register only for case when 3 instructions must be generated to split lea. In all other cases it does not matter and I left code unchangeable. Best regards. Yuri. 2012/8/14 Uros Bizjak : > On Tue, Aug 14, 2012 at 3:35 PM, Yuri Rumyantsev wrote: >> Uros, >> >> Let me try to explain you why I used such code duplication: >> >> Here we have a common case of LEA with 3 different registers - r0 >> (target), r1(base), r2(index) and possible offset. >> To get the better scheduling we first try to determine what register >> is prefirable for inititial setting - r1 or r2 through >> find_nearest_reg_def. And then we generate the following sequence of >> instructions: >>r0 = r_best; >>r0 = $const, r0 >>r0 = r_worse, r0 >> that can save 2 cycles for Atom since first 2 instructions can be hoisted up. >> I could not find better way for coding it. > > If it is important to put adding of const before adding of the > register, then you can emit similar sequence for other cases, too. > Something like following: > > --cut here-- > ... > { > if (regno0 == regno1) > tmp = parts.index; > else if (regno0 == regno2) > tmp = parts.base; > else > { > rtx tmp1; > > /* regno1: base, regno2: index */ > if (find_nearest_reg_def (insn, regno1, regno2)) > tmp1 = parts.index, tmp = parts.base; > else > tmp1 = parts.base, tmp = parts.index; > > emit_insn (gen_rtx_SET (VOIDmode, target, tmp1)); > } > > if (parts.disp && parts.disp != const0_rtx) > ix86_emit_binop (PLUS, mode, target, parts.disp); > ix86_emit_binop (PLUS, mode, target, tmp); > return; > } > --cut here-- > I prepared new patch and ChangeLog. Testing of x32 is in progress. > > You didn't fix vertical spaces and tab issues in new patch. > > Uros. lea_split_improve.diff Description: Binary data
Re: Merge C++ conversion into trunk (5/6 - double_int rewrite)
On Wed, 15 Aug 2012, Richard Guenther wrote: > On Wed, Aug 15, 2012 at 12:31 PM, Jakub Jelinek wrote: > > On Wed, Aug 15, 2012 at 12:28:58PM +0200, Richard Guenther wrote: > >> the function names make no sense - they should be talking about > >> host-wide-ints, because that is what they are about. Thus, > >> > >> /* Conversion functions. */ > >> > >> HOST_WIDE_INT to_signed_hwi () const; > >> unsigned HOST_WIDE_INT to_unsigned_hwi () const; > >> > >> /* Conversion query functions. */ > >> > >> bool fits_unsigned_hwi () const; > >> bool fits_signed_hwi () const; > >> bool fits_hwi (bool uns) const; > > > > Wouldn't uhwi and shwi be better? Both are already widely used within > > gcc... > > True, I'll change it that way. Like so, testing in progress. Richard. 2012-08-15 Richard Guenther * double-int.h (from_unsigned): Rename to ... (from_uhwi): ... this. (from_signed): Rename to ... (from_shwi): ... this. (to_signed): Rename to ... (to_shwi): ... this. (to_unsigned): Rename to ... (to_uhwi): ... this. (fits_unsigned): Rename to ... (fits_uhwi): ... this. (fits_signed): Rename to ... (fits_shwi): ... this. (fits): Rename to ... (fits_hwi): ... this. * double-int.c: Likewise. Index: gcc/double-int.c === *** gcc/double-int.c(revision 190407) --- gcc/double-int.c(working copy) *** double_int::sext (unsigned prec) const *** 716,722 /* Returns true if CST fits in signed HOST_WIDE_INT. */ bool ! double_int::fits_signed () const { const double_int &cst = *this; if (cst.high == 0) --- 716,722 /* Returns true if CST fits in signed HOST_WIDE_INT. */ bool ! double_int::fits_shwi () const { const double_int &cst = *this; if (cst.high == 0) *** double_int::fits_signed () const *** 731,742 unsigned HOST_WIDE_INT if UNS is true. */ bool ! double_int::fits (bool uns) const { if (uns) ! return this->fits_unsigned (); else ! return this->fits_signed (); } /* Returns A * B. */ --- 731,742 unsigned HOST_WIDE_INT if UNS is true. */ bool ! double_int::fits_hwi (bool uns) const { if (uns) ! return this->fits_uhwi (); else ! return this->fits_shwi (); } /* Returns A * B. */ Index: gcc/double-int.h === *** gcc/double-int.h(revision 190407) --- gcc/double-int.h(working copy) *** public: *** 60,67 Second, the GCC conding conventions prefer explicit conversion, and explicit conversion operators are not available until C++11. */ ! static double_int from_unsigned (unsigned HOST_WIDE_INT cst); ! static double_int from_signed (HOST_WIDE_INT cst); /* No copy assignment operator or destructor to keep the type a POD. */ --- 60,67 Second, the GCC conding conventions prefer explicit conversion, and explicit conversion operators are not available until C++11. */ ! static double_int from_uhwi (unsigned HOST_WIDE_INT cst); ! static double_int from_shwi (HOST_WIDE_INT cst); /* No copy assignment operator or destructor to keep the type a POD. */ *** public: *** 83,96 /* Conversion functions. */ ! HOST_WIDE_INT to_signed () const; ! unsigned HOST_WIDE_INT to_unsigned () const; /* Conversion query functions. */ ! bool fits_unsigned () const; ! bool fits_signed () const; ! bool fits (bool uns) const; /* Attribute query functions. */ --- 83,96 /* Conversion functions. */ ! HOST_WIDE_INT to_shwi () const; ! unsigned HOST_WIDE_INT to_uhwi () const; /* Conversion query functions. */ ! bool fits_uhwi () const; ! bool fits_shwi () const; ! bool fits_hwi (bool uns) const; /* Attribute query functions. */ *** public: *** 186,192 HOST_WIDE_INT are filled with the sign bit. */ inline ! double_int double_int::from_signed (HOST_WIDE_INT cst) { double_int r; r.low = (unsigned HOST_WIDE_INT) cst; --- 186,192 HOST_WIDE_INT are filled with the sign bit. */ inline ! double_int double_int::from_shwi (HOST_WIDE_INT cst) { double_int r; r.low = (unsigned HOST_WIDE_INT) cst; *** double_int double_int::from_signed (HOST *** 198,204 static inline double_int shwi_to_double_int (HOST_WIDE_INT cst) { ! return double_int::from_signed (cst); } /* Some useful constants. */ --- 198,204 static inline double_int shwi_to_double_int (HOST_WIDE_INT cst) { ! return double_int::from_shwi (cst); } /* Some useful constants. */ *** shwi_to_double_int (HOST_WIDE_INT cst) *** 206,222 The problem is that a named constant would
Re: Merge C++ conversion into trunk (5/6 - double_int rewrite)
On Wed, Aug 15, 2012 at 12:31 PM, Jakub Jelinek wrote: > On Wed, Aug 15, 2012 at 12:28:58PM +0200, Richard Guenther wrote: >> the function names make no sense - they should be talking about >> host-wide-ints, because that is what they are about. Thus, >> >> /* Conversion functions. */ >> >> HOST_WIDE_INT to_signed_hwi () const; >> unsigned HOST_WIDE_INT to_unsigned_hwi () const; >> >> /* Conversion query functions. */ >> >> bool fits_unsigned_hwi () const; >> bool fits_signed_hwi () const; >> bool fits_hwi (bool uns) const; > > Wouldn't uhwi and shwi be better? Both are already widely used within > gcc... True, I'll change it that way. Richard. >> Likewise for >> >> static double_int from_unsigned (unsigned HOST_WIDE_INT cst); >> static double_int from_signed (HOST_WIDE_INT cst); >> >> I'm going to install a patch, after testing, that adjusts the names >> accordingly. > > Jakub
Re: Merge C++ conversion into trunk (5/6 - double_int rewrite)
On Wed, Aug 15, 2012 at 12:28:58PM +0200, Richard Guenther wrote: > the function names make no sense - they should be talking about > host-wide-ints, because that is what they are about. Thus, > > /* Conversion functions. */ > > HOST_WIDE_INT to_signed_hwi () const; > unsigned HOST_WIDE_INT to_unsigned_hwi () const; > > /* Conversion query functions. */ > > bool fits_unsigned_hwi () const; > bool fits_signed_hwi () const; > bool fits_hwi (bool uns) const; Wouldn't uhwi and shwi be better? Both are already widely used within gcc... > Likewise for > > static double_int from_unsigned (unsigned HOST_WIDE_INT cst); > static double_int from_signed (HOST_WIDE_INT cst); > > I'm going to install a patch, after testing, that adjusts the names > accordingly. Jakub
Re: Merge C++ conversion into trunk (5/6 - double_int rewrite)
On Tue, Aug 14, 2012 at 10:04 AM, Richard Guenther wrote: > On Mon, 13 Aug 2012, Lawrence Crowl wrote: > >> On 8/13/12, Richard Guenther wrote: >> > Increment/decrement operations did not exist, please do not add >> > them at this point. >> >> Note that I have also added +=, -= and *= operations. Having them >> has three advantages. First, it matches expectations on what >> numeric types allow. Second, it results in more concise code. >> Third, it results in potentially faster code. I think we should >> be able to use those operators. >> >> When I run through changing call sites, I really want to change >> the sites to the final form, not do two passes. > > Ok. Btw, I noticed we now have /* Conversion functions. */ HOST_WIDE_INT to_signed () const; unsigned HOST_WIDE_INT to_unsigned () const; /* Conversion query functions. */ bool fits_unsigned () const; bool fits_signed () const; bool fits (bool uns) const; the function names make no sense - they should be talking about host-wide-ints, because that is what they are about. Thus, /* Conversion functions. */ HOST_WIDE_INT to_signed_hwi () const; unsigned HOST_WIDE_INT to_unsigned_hwi () const; /* Conversion query functions. */ bool fits_unsigned_hwi () const; bool fits_signed_hwi () const; bool fits_hwi (bool uns) const; Likewise for static double_int from_unsigned (unsigned HOST_WIDE_INT cst); static double_int from_signed (HOST_WIDE_INT cst); I'm going to install a patch, after testing, that adjusts the names accordingly. Richard. > Thanks, > Richard.
RFC patch for http://gcc.gnu.org/install/download.html / gcc/doc/install.texi
Dear all, when looking at http://gcc.gnu.org/install/download.html, I wondered whether some parts should be updated. In particular - Mentioning the GIT mirror at http://gcc.gnu.org/wiki/GitMirror - Mentioning CLOOG/ISL for the in-tree build Those changes I did in the patch below. However, I am also wondering about the "It is possible to download a full distribution or specific components." I think GCC stopped doing so at some 4.x release - but install.html also applies to GCC 3.x.y where it still existed. I was thus a bit reluctant to remove that part. Furthermore, I was wondering whether one should link to ftp://gcc.gnu.org/pub/gcc/infrastructure/ (either directly or by mentioning its existence on the mirrors). What do you think? Tobias diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi index dfdd624..bbb8535 100644 --- a/gcc/doc/install.texi +++ b/gcc/doc/install.texi @@ -524,6 +524,7 @@ the executables named @command{cantlr}, @command{runantlr} or -GCC is distributed via @uref{http://gcc.gnu.org/svn.html,,SVN} and FTP -tarballs compressed with @command{gzip} or -@command{bzip2}. It is possible to download a full distribution or specific -components. +GCC is distributed via @uref{http://gcc.gnu.org/svn.html,,SVN}, +@uref{http://gcc.gnu.org/wiki/GitMirror,,GIT}, and on +@uref{http://gcc.gnu.org/mirrors.html,,FTP} as tarballs compressed with +@command{gzip} or @command{bzip2}. It is possible to download a full +distribution or specific components. @@ -555,7 +556,8 @@ components of the binutils you intend to build alongside the compiler -Likewise the GMP, MPFR and MPC libraries can be automatically built -together with GCC. Unpack the GMP, MPFR and/or MPC source -distributions in the directory containing the GCC sources and rename -their directories to @file{gmp}, @file{mpfr} and @file{mpc}, -respectively (or use symbolic links with the same name). +Likewise the GMP, MPFR and MPC libraries and the optional CLOOG and ISL +libraries can be automatically built together with GCC. Unpack the GMP, +MPFR, MPC, CLOOG and/or ISL source distributions in the directory +containing the GCC sources and rename their directories to @file{gmp}, +@file{mpfr}, @file{mpc}, @file{cloog} and @file{isl}, respectively (or +use symbolic links with the same name).
Re: [patch] timevar TLC
On Tue, Aug 14, 2012 at 10:46 PM, Lawrence Crowl wrote: > You can check the error statically. Something like > > % cat limitstring.c > #define LIMIT 32 > > struct def { > int x; > char name[LIMIT+1]; > }; > > struct def var[] = { > { 3, "hello" }, > { 4, "name is much too too long for a reasonable name" }, > }; > > % gcc -c limitstring.c -Werror > cc1: warnings being treated as errors > limitstring.c:10: error: initializer-string for array of chars is too long > limitstring.c:10: error: (near initialization for 'timevars[1].name') > > But of course the variable definition would look more like > > #define DEFTIMEVAR(identifier__, name__) \ > { , name__, ... }, > struct def var[] = { > #include "timevar.def" > }; That looks much better than my hack! I'll see if I can find some time to implement this idea. In the mean time I've committed my patch without the timevar.c bits. Ciao! Steven
Re: [PATCH] Fix PR54245
On Tue, 14 Aug 2012, William J. Schmidt wrote: > Currently we can insert an initializer that performs a multiply in too > small of a type for correctness. For now, detect the problem and avoid > the optimization when this would happen. Eventually I will fix this up > to cause the multiply to be performed in a sufficiently wide type. > > Bootstrapped and tested on powerpc64-unknown-linux-gnu with no new > regressions. Ok for trunk? Ok. Thanks, Richard. > Thanks, > Bill > > > gcc: > > 2012-08-14 Bill Schmidt > > PR tree-optimization/54245 > * gimple-ssa-strength-reduction.c (legal_cast_p_1): New function. > (legal_cast_p): Split out logic to legal_cast_p_1. > (analyze_increments): Avoid introducing multiplies in smaller types. > > > gcc/testsuite: > > 2012-08-14 Bill Schmidt > > PR tree-optimization/54245 > * gcc.dg/tree-ssa/pr54245.c: New test. > > > Index: gcc/testsuite/gcc.dg/tree-ssa/pr54245.c > === > --- gcc/testsuite/gcc.dg/tree-ssa/pr54245.c (revision 0) > +++ gcc/testsuite/gcc.dg/tree-ssa/pr54245.c (revision 0) > @@ -0,0 +1,49 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O1 -fdump-tree-slsr-details" } */ > + > +#include > + > +#define W1 22725 > +#define W2 21407 > +#define W3 19266 > +#define W6 8867 > + > +void idct_row(short *row, int *dst) > +{ > +int a0, a1, b0, b1; > + > +a0 = W1 * row[0]; > +a1 = a0; > + > +a0 += W2 * row[2]; > +a1 += W6 * row[2]; > + > +b0 = W1 * row[1]; > +b1 = W3 * row[1]; > + > +dst[0] = a0 + b0; > +dst[1] = a0 - b0; > +dst[2] = a1 + b1; > +dst[3] = a1 - b1; > +} > + > +static short block[8] = { 1, 2, 3, 4 }; > + > +int main(void) > +{ > +int out[4]; > +int i; > + > +idct_row(block, out); > + > +for (i = 0; i < 4; i++) > +printf("%d\n", out[i]); > + > +return !(out[2] == 87858 && out[3] == 10794); > +} > + > +/* For now, disable inserting an initializer when the multiplication will > + take place in a smaller type than originally. This test may be deleted > + in future when this case is handled more precisely. */ > +/* { dg-final { scan-tree-dump-times "Inserting initializer" 0 "slsr" } } */ > +/* { dg-final { cleanup-tree-dump "slsr" } } */ > Index: gcc/gimple-ssa-strength-reduction.c > === > --- gcc/gimple-ssa-strength-reduction.c (revision 190305) > +++ gcc/gimple-ssa-strength-reduction.c (working copy) > @@ -1089,6 +1089,32 @@ slsr_process_neg (gimple gs, tree rhs1, bool speed >add_cand_for_stmt (gs, c); > } > > +/* Help function for legal_cast_p, operating on two trees. Checks > + whether it's allowable to cast from RHS to LHS. See legal_cast_p > + for more details. */ > + > +static bool > +legal_cast_p_1 (tree lhs, tree rhs) > +{ > + tree lhs_type, rhs_type; > + unsigned lhs_size, rhs_size; > + bool lhs_wraps, rhs_wraps; > + > + lhs_type = TREE_TYPE (lhs); > + rhs_type = TREE_TYPE (rhs); > + lhs_size = TYPE_PRECISION (lhs_type); > + rhs_size = TYPE_PRECISION (rhs_type); > + lhs_wraps = TYPE_OVERFLOW_WRAPS (lhs_type); > + rhs_wraps = TYPE_OVERFLOW_WRAPS (rhs_type); > + > + if (lhs_size < rhs_size > + || (rhs_wraps && !lhs_wraps) > + || (rhs_wraps && lhs_wraps && rhs_size != lhs_size)) > +return false; > + > + return true; > +} > + > /* Return TRUE if GS is a statement that defines an SSA name from > a conversion and is legal for us to combine with an add and multiply > in the candidate table. For example, suppose we have: > @@ -1129,28 +1155,11 @@ slsr_process_neg (gimple gs, tree rhs1, bool speed > static bool > legal_cast_p (gimple gs, tree rhs) > { > - tree lhs, lhs_type, rhs_type; > - unsigned lhs_size, rhs_size; > - bool lhs_wraps, rhs_wraps; > - >if (!is_gimple_assign (gs) >|| !CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (gs))) > return false; > > - lhs = gimple_assign_lhs (gs); > - lhs_type = TREE_TYPE (lhs); > - rhs_type = TREE_TYPE (rhs); > - lhs_size = TYPE_PRECISION (lhs_type); > - rhs_size = TYPE_PRECISION (rhs_type); > - lhs_wraps = TYPE_OVERFLOW_WRAPS (lhs_type); > - rhs_wraps = TYPE_OVERFLOW_WRAPS (rhs_type); > - > - if (lhs_size < rhs_size > - || (rhs_wraps && !lhs_wraps) > - || (rhs_wraps && lhs_wraps && rhs_size != lhs_size)) > -return false; > - > - return true; > + return legal_cast_p_1 (gimple_assign_lhs (gs), rhs); > } > > /* Given GS which is a cast to a scalar integer type, determine whether > @@ -1996,6 +2005,31 @@ analyze_increments (slsr_cand_t first_dep, enum ma > != POINTER_PLUS_EXPR))) > incr_vec[i].cost = COST_NEUTRAL; > > + /* FORNOW: If we need to add an initializer, give up if a cast from > + the candidate's type to its stride's type can lose precision. > + This could eventually be handled better b
Re: [PATCH] Fix PR54240
On Tue, 14 Aug 2012, William J. Schmidt wrote: > Replace the once vacuously true, and now vacuously false, test for > existence of a conditional move instruction for a given mode, with one > that actually checks what it's supposed to. Add a test case so we don't > miss such things in future. > > The test is powerpc-specific. It would be good to have an i386 version > of the test as well, if someone can help with that. > > Bootstrapped and tested on powerpc64-unknown-linux-gnu with no new > regressions. Ok for trunk? Ok if you also drop in the testcase from Andrew. Thanks, Richard. > Thanks, > Bill > > > gcc: > > 2012-08-13 Bill Schmidt > > PR tree-optimization/54240 > * tree-ssa-phiopt.c (hoist_adjacent_loads): Correct test for > existence of conditional move with given mode. > > > gcc/testsuite: > > 2012-08-13 Bill Schmidt > > PR tree-optimization/54240 > * gcc.target/powerpc/pr54240.c: New test. > > > Index: gcc/testsuite/gcc.target/powerpc/pr54240.c > === > --- gcc/testsuite/gcc.target/powerpc/pr54240.c(revision 0) > +++ gcc/testsuite/gcc.target/powerpc/pr54240.c(revision 0) > @@ -0,0 +1,27 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -misel -fdump-tree-phiopt-details" } */ > + > +typedef struct s { > + int v; > + int b; > + struct s *l; > + struct s *r; > +} S; > + > + > +int foo(S *s) > +{ > + S *this; > + S *next; > + > + this = s; > + if (this->b) > +next = this->l; > + else > +next = this->r; > + > + return next->v; > +} > + > +/* { dg-final { scan-tree-dump "Hoisting adjacent loads" "phiopt1" } } */ > +/* { dg-final { cleanup-tree-dump "phiopt1" } } */ > Index: gcc/tree-ssa-phiopt.c > === > --- gcc/tree-ssa-phiopt.c (revision 190305) > +++ gcc/tree-ssa-phiopt.c (working copy) > @@ -1843,7 +1843,8 @@ hoist_adjacent_loads (basic_block bb0, basic_block > >/* Check the mode of the arguments to be sure a conditional move >can be generated for it. */ > - if (!optab_handler (cmov_optab, TYPE_MODE (TREE_TYPE (arg1 > + if (optab_handler (movcc_optab, TYPE_MODE (TREE_TYPE (arg1))) > + == CODE_FOR_nothing) > continue; > >/* Both statements must be assignments whose RHS is a COMPONENT_REF. > */ > > > -- Richard Guenther SUSE / SUSE Labs SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746 GF: Jeff Hawn, Jennifer Guild, Felix Imend
Re: [PATCH] Intrinsics for ADCX
Hi, There's white paper [1] available, which explains usage of MULX/ADCX/ADOX [1] - http://download.intel.com/embedded/processor/whitepaper/327831.pdf Thanks, K