Re: Interaction between first stage build with g++ and $PATH

2012-08-15 Thread Jakub Jelinek
On Wed, Aug 15, 2012 at 10:17:18PM -0700, Gary Funck wrote:
> I can file a bug reported if necessary, but am wondering
> if it is a known requirement not to have "." on $PATH
> or to explicitly set CC and CXX?

Having . in $PATH is a serious bug (especially from security POV).
Just never do that.

Jakub


Re: Interaction between first stage build with g++ and $PATH

2012-08-15 Thread Gary Funck
On 08/15/12 22:26:08, Ian Lance Taylor wrote:
> it's a bug.

OK, thanks.  I submitted a bug report.
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54279

- Gary


Re: Interaction between first stage build with g++ and $PATH

2012-08-15 Thread Ian Lance Taylor
On Wed, Aug 15, 2012 at 10:17 PM, Gary Funck  wrote:
>
> 1. I have "." on $PATH.
>
> 2. In one build of the latest GCC trunk, I specify
>CC=/usr/bin/gcc and CXX=/usr/bin/g++ and everything
>works.
>
> 3. In another build, I don't specify CC or CXX.
>Therefore they default to 'gcc' and 'g++'.
>This fails:
>g++: error trying to exec 'cc1plus': execvp: No such file or directory
>
> If I remove "." from $PATH then the configuration in 3 will build.
>
> The problem is that there is a g++ executable under the
> built gcc directory, but cc1plus and other g++ component
> parts haven't been built yet.
>
> I can file a bug reported if necessary, but am wondering
> if it is a known requirement not to have "." on $PATH
> or to explicitly set CC and CXX?

it's a bug.

Ian


Interaction between first stage build with g++ and $PATH

2012-08-15 Thread Gary Funck

1. I have "." on $PATH.

2. In one build of the latest GCC trunk, I specify
   CC=/usr/bin/gcc and CXX=/usr/bin/g++ and everything
   works.

3. In another build, I don't specify CC or CXX.
   Therefore they default to 'gcc' and 'g++'.
   This fails:
   g++: error trying to exec 'cc1plus': execvp: No such file or directory

If I remove "." from $PATH then the configuration in 3 will build.

The problem is that there is a g++ executable under the
built gcc directory, but cc1plus and other g++ component
parts haven't been built yet.

I can file a bug reported if necessary, but am wondering
if it is a known requirement not to have "." on $PATH
or to explicitly set CC and CXX?

thanks,
- Gary


Re: [PATCH 5/7] rs6000: Get rid of old-mnemonics

2012-08-15 Thread Segher Boessenkool

This patch is okay, but the longlong.h patch is incomplete. It no
longer should test for _ARCH_COM or _ARCH_PWR, because those will not
work.


Committed the following as pre-approved off-list:

2012-08-15 Segher Boessenkool 

* longlong.h: (powerpc): Delete _ARCH_PWR and _ARCH_COM handling.


--- trunk/libgcc/longlong.h 2012/08/16 01:36:47 190433
+++ trunk/libgcc/longlong.h 2012/08/16 01:49:57 190434
@@ -850,8 +850,6 @@
FIXME: What's needed for gcc PowerPC VxWorks?  __vxworks__ is  
not good

enough, since that hits ARM and m68k too.  */
 #if (defined (_ARCH_PPC)   /* AIX  
*/   \
- || defined (_ARCH_PWR)/* AIX  
*/   \
- || defined (_ARCH_COM)/* AIX  
*/   \
  || defined (__powerpc__)  /* gcc  
*/   \
  || defined (__POWERPC__)  /* BEOS  
*/  \
  || defined (__ppc__)  /* Darwin  
*/\

@@ -914,14 +912,6 @@
   } while (0)
 #define SMUL_TIME 14
 #define UDIV_TIME 120
-#elif defined (_ARCH_PWR)
-#define UMUL_TIME 8
-#define smul_ppmm(xh, xl, m0, m1) \
-  __asm__ ("mul %0,%2,%3" : "=r" (xh), "=q" (xl) : "r" (m0), "r" (m1))
-#define SMUL_TIME 4
-#define sdiv_qrnnd(q, r, nh, nl, d) \
-  __asm__ ("div %0,%2,%4" : "=r" (q), "=q" (r) : "r" (nh), "1" (nl),  
"r" (d))

-#define UDIV_TIME 100
 #endif
 #endif /* 32-bit POWER architecture variants.  */




Re: [PATCH 7/7] rs6000: Old AIX specs: Use %(asm_default) instead of -mppc.

2012-08-15 Thread David Edelsohn
On Wed, Aug 15, 2012 at 6:29 PM, Segher Boessenkool
 wrote:
> 2012-08-15  Segher Boessenkool  
>
> gcc/
> * config/rs6000/aix43.h (ASM_CPU_SPEC): Use %(asm_default)
> instead of -mppc.
> * config/rs6000/aix51.h (ASM_CPU_SPEC): Ditto.

This patch is okay.

Thanks, David


Re: [PATCH 5/7] rs6000: Get rid of old-mnemonics

2012-08-15 Thread Joseph S. Myers
On Wed, 15 Aug 2012, David Edelsohn wrote:

> Does GCC "own" longlong.h, or is that part of GMP or some other project?

It is right now in sync between glibc and GCC; changes should be applied 
in both places.  It hasn't been in sync with GMP's version for many years.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: Scheduler: Allow breaking dependencies by modifying patterns

2012-08-15 Thread Maxim Kuvyrkov
On 4/08/2012, at 12:05 AM, Bernd Schmidt wrote:

> This patch allows us to change
> 
> rn++
> rm=[rn]
> 
> into
> 
> rm=[rn + 4]
> rn++

The patch is OK with the following nitpicks.

[BTW, if anyone else wants to review this patch, it helps to read it from the 
end.]

> 
> Opportunities to do this are discovered by a mini-pass over the
> instructions after generating dependencies and before scheduling a
> block. At that point we have all the information required to ensure that
> a candidate dep between two instructions is only used to show the
> register dependence, and to ensure that every insn with a memory
> reference is only subject to at most one dep causing a pattern change.
> 
> The dep_t structure is extended to hold an optional pointer to a
> "replacement description", which holds information about what to change
> when a dependency is broken. The time when this replacement is applied
> differs depending on whether the changed insn is the DEP_CON (in which
> case the pattern is changed whenever the broken dependency becomes the
> last one), or the DEP_PRO, in which case we make the change when the
> corresponding DEP_CON has been scheduled. This ensures that the ready
> list always contains insns with the correct pattern.
> 
> A few additional bits are needed in the dep structure: one to hold
> information about whether a dependency occurs multiple times, and one to
> distinguish dependencies that are purely for register values from those
> with other meanings (e.g. memory references).
> 
> Also, sched-rgn was changed to use a new bit, DEP_POSTPONED, rather than
> HARD_DEP to indicate that we don't want to schedule an insn in the
> current block.
> 
> A possible future extension would be to also allow autoinc addressing
> modes as the increment insn.

Would you please copy the above to a comment describing how the new 
optimization fits within the scheduler?  In sched-deps.c just before definition 
of struct mem_inc_info seems like a good spot for an overview comment.

> 
> Bootstrapped and tested on x86_64-linux, and also tested on c6x-elf
> (quite a number of changes were necessary to make it work there). It was
> originally written for a mips target and tested there in the context of
> a 4.6 tree. I've also run spec2000 on x86_64, with no change that looked
> like anything other than noise. Ok?
> 

> +   We also perform pattern replacements for predication, and for broken
> +   replacement dependencies.  The latter is only done if FOR_BACKTRACK is
> +   false.  */
> 

We also perform pattern replacement for ?speculation? ...


>> +void
>> +find_modifiable_mems (rtx head, rtx tail)
>> +{
>> +  rtx insn;
>> +  int success_in_block = 0;

Success_in_block is not really used.  Maybe add a debug printout for it?


> +  /* Dependency major type.  This field is superseded by STATUS below.
> + Though, it is still in place because some targets use it.  */
> +  ENUM_BITFIELD(reg_note) type:6;
> 

... is superseded by STATUS *above*.


--
Maxim Kuvyrkov
CodeSourcery / Mentor Graphics




Re: [PATCH 5/7] rs6000: Get rid of old-mnemonics

2012-08-15 Thread David Edelsohn
On Wed, Aug 15, 2012 at 6:29 PM, Segher Boessenkool
 wrote:
> 2012-08-15  Segher Boessenkool 
>
> gcc/
> * config/rs6000/aix43.h (TARGET_DEFAULT): Delete MASK_NEW_MNEMONICS.
> (RS6000_CALL_GLUE): Adjust for single assembler syntax.
> * config/rs6000/aix51.h (TARGET_DEFAULT, RS6000_CALL_GLUE):
> Ditto.
> * config/rs6000/aix52.h (TARGET_DEFAULT, RS6000_CALL_GLUE):
> Ditto.
> * config/rs6000/aix53.h (TARGET_DEFAULT, RS6000_CALL_GLUE):
> Ditto.
> * config/rs6000/aix61.h (TARGET_DEFAULT, RS6000_CALL_GLUE):
> Ditto.
> * config/rs6000/darwin.h (TARGET_DEFAULT): Ditto.
> * config/rs6000/darwin.md (whole file): Adjust to single
> assembler syntax.
> * config/rs6000/darwin64.h (TARGET_DEFAULT): Delete
> MASK_NEW_MNEMONICS.
> * config/rs6000/default64.h (TARGET_DEFAULT): Ditto.
> * config/rs6000/dfp.md: (whole file): Adjust to single
> assembler syntax.
> * config/rs6000/eabi.h (TARGET_DEFAULT): Delete
> MASK_NEW_MNEMONICS.
> * config/rs6000/eabialtivec.h (TARGET_DEFAULT): Ditto.
> * config/rs6000/eabispe.h (TARGET_DEFAULT): Ditto.
> * config/rs6000/linuxaltivec.h (TARGET_DEFAULT): Ditto.
> * config/rs6000/linuxspe.h (TARGET_DEFAULT): Ditto.
> * config/rs6000/rs6000-cpus.def (whole file): Delete
> POWERPC_BASE_MASK.
> * config/rs6000/rs6000-tables.opt: Regenerate.
> * config/rs6000/rs6000.c (POWERPC_BASE_MASK): Delete.
> (num_insns_constant_wide): Adjust comments.
> (whole file): Adjust to single assembler syntax.
> (output_cbranch): Adjust comment.
> * config/rs6000/rs6000.h (ASSEMBLER_DIALECT): Delete.
> * config/rs6000/rs6000.md: (whole file): Adjust to single
> assembler syntax.
> * config/rs6000/rs6000.opt (mnew-mnemonics): Delete.
> (mold-mnemonics): Delete.
> * config/rs6000/spe.md: (whole file): Adjust to single
> assembler syntax.
> * config/rs6000/sync.md: (whole file): Adjust to single
> assembler syntax.
> * config/rs6000/sysv4.h (TARGET_DEFAULT): Delete
> MASK_NEW_MNEMONICS.
> (ASM_OUTPUT_REG_PUSH): Adjust.
> (ASM_OUTPUT_REG_POP): ADjust.
> * config/rs6000/sysv4le.h (TARGET_DEFAULT): Delete
> MASK_NEW_MNEMONICS.
> * config/rs6000/vsx.md: (whole file): Adjust to single
> assembler syntax.
> * config/rs6000/vxworks.h (TARGET_DEFAULT): Delete
> MASK_NEW_MNEMONICS.
> * doc/invoke.texi: Adjust documentation to reflect the
> removal of -mnew-mnemonics and -mold-mnemonics.
>
> libgcc/
> * longlong.h: (whole file): Adjust to single assembler syntax.
> ---
>  gcc/config/rs6000/aix43.h   |4 +-
>  gcc/config/rs6000/aix51.h   |4 +-
>  gcc/config/rs6000/aix52.h   |4 +-
>  gcc/config/rs6000/aix53.h   |4 +-
>  gcc/config/rs6000/aix61.h   |4 +-
>  gcc/config/rs6000/darwin.h  |2 +-
>  gcc/config/rs6000/darwin.md |   34 +-
>  gcc/config/rs6000/darwin64.h|2 +-
>  gcc/config/rs6000/default64.h   |3 +-
>  gcc/config/rs6000/dfp.md|   24 +-
>  gcc/config/rs6000/eabi.h|2 +-
>  gcc/config/rs6000/eabialtivec.h |2 +-
>  gcc/config/rs6000/eabispe.h |2 +-
>  gcc/config/rs6000/linuxaltivec.h|2 +-
>  gcc/config/rs6000/linuxspe.h|2 +-
>  gcc/config/rs6000/rs6000-cpus.def   |  121 +++
>  gcc/config/rs6000/rs6000-tables.opt |   33 +-
>  gcc/config/rs6000/rs6000.c  |   62 ++--
>  gcc/config/rs6000/rs6000.h  |4 -
>  gcc/config/rs6000/rs6000.md |  606 +-
>  gcc/config/rs6000/rs6000.opt|8 -
>  gcc/config/rs6000/spe.md|   26 +-
>  gcc/config/rs6000/sync.md   |4 +-
>  gcc/config/rs6000/sysv4.h   |6 +-
>  gcc/config/rs6000/sysv4le.h |2 +-
>  gcc/config/rs6000/vsx.md|8 +-
>  gcc/config/rs6000/vxworks.h |2 +-
>  gcc/doc/invoke.texi |   28 +--
>  libgcc/longlong.h   |   34 +-

This patch is okay, but the longlong.h patch is incomplete. It no
longer should test for _ARCH_COM or _ARCH_PWR, because those will not
work.

Does GCC "own" longlong.h, or is that part of GMP or some other project?

Thanks, David


Re: [PATCH 4/7] rs6000: Remove TARGET_POWERPC

2012-08-15 Thread David Edelsohn
On Wed, Aug 15, 2012 at 6:29 PM, Segher Boessenkool
 wrote:
> It's always on now.  Also remove -mcpu=common.
>
> 2012-08-15  Segher Boessenkool  
>
> gcc/
> * common/config/rs6000/rs6000-common.c (rs6000_handle_option):
> Delete handling for -mno-powerpc and -mpowerpc.
> * config/rs6000/aix43.h (ASM_CPU_SPEC): Similar.
> (ASM_DEFAULT_SPEC): Use -mppc instead of -mcom.
> * config/rs6000/aix51.h (ASM_CPU_SPEC, ASM_DEFAULT_SPEC): Ditto.
> * config/rs6000/aix52.h (TARGET_DEFAULT): Delete MASK_POWERPC.
> * config/rs6000/aix53.h (TARGET_DEFAULT): Ditto.
> * config/rs6000/aix61.h (TARGET_DEFAULT): Ditto.
> * config/rs6000/darwin.h (TARGET_DEFAULT): Ditto.
> * config/rs6000/darwin64.h (TARGET_DEFAULT): Ditto.
> * config/rs6000/default64.h (TARGET_DEFAULT): Ditto.
> * config/rs6000/driver-rs6000.c (asm_names): Delete handling
> for -mcpu=common and -mpowerpc.
> * config/rs6000/eabi.h (TARGET_DEFAULT): Delete MASK_POWERPC.
> * config/rs6000/eabialtivec.h (TARGET_DEFAULT): Ditto.
> * config/rs6000/eabispe.h (TARGET_DEFAULT): Ditto.
> * config/rs6000/linuxaltivec.h (TARGET_DEFAULT): Ditto.
> * config/rs6000/linuxspe.h (TARGET_DEFAULT): Ditto.
> * config/rs6000/rs6000-builtin.def (RS6000_BUILTIN_CFSTRING):
> Use RS6000_BTM_ALWAYS instead of RS6000_BTM_POWERPC.
> * config/rs6000/rs6000-c.c (rs6000_target_modify_macros):
> Adjust.
> (rs6000_cpu_cpp_builtins): Adjust.
> * config/rs6000/rs6000.c (POWERPC_BASE_MASK): Delete MASK_POWERPC.
> (rs6000_builtin_mask_calculate): Adjust.
> (rs6000_emit_move): Delete code for ! TARGET_POWERPC.
> (rs6000_init_libfuncs): Ditto.
> (rs6000_output_function_prologue): Ditto.
> (rs6000_opt_masks): Delete MASK_POWERPC.
> (rs6000_builtin_mask_names): Delete RS6000_BTM_POWERPC.
> * config/rs6000/rs6000.h (ASM_CPU_SPEC): Delete handling for
> -mpowerpc.
> (RS6000_BTM_POWERPC): Delete.
> (RS6000_BTM_COMMON): Delete RS6000_BTM_POWERPC.
> * config/rs6000/rs6000.md (extendqisi2 patterns): Adjust for
> TARGET_POWERPC always on.
> (extendqihi2 patterns): Similar.
> (various unnamed subtract patterns): Similar.
> (bswaphi2 patterns): Similar.
> (divmodsi4): Similar.
> (udiv3): Similar.
> (div3 patterns): Similar.
> (udivmodsi4): Similar.
> (mulhcall): Delete.
> (mullcall): Delete.
> (divss_call): Delete.
> (divus_call): Delete.
> (quoss_call): Delete.
> (quous_call): Delete.
> (insvsi patterns): Adjust.
> (addsf3 patterns): Adjust.
> (subsf3 patterns): Adjust.
> (mulsf3 patterns): Adjust.
> (divsf3 patterns): Adjust.
> (*fmasf4_fpr): Adjust.
> (*fmssf4_fpr): Adjust.
> (*nfmasf4_fpr): Adjust.
> (*nfmssf4_fpr): Adjust.
> (*floatunssidf2_internal): Adjust.
> (fix_truncsi2_internal): Adjust.
> (fctiwz_): Adjust.
> (mulsidi3 patterns): Adjust.
> (smulsi3_highpart patterns): Adjust.
> (umulsi3_highpart patterns): Adjust.
> (fix_trunctfsi2 patterns): Adjust.
> (prefetch): Adjust.
> * config/rs6000/rs6000.opt (mpowerpc): Replace by stub option.
> (mno-powerpc): Delete.
> * config/rs6000/sync.md (load_locked): Adjust.
> (store_conditional): Adjust.
> (atomic_compare_and_swap): Adjust.
> (atomic_exchange): Adjust.
> (atomic_): Adjust.
> (atomic_nand): Adjust.
> (atomic_fetch_): Adjust.
> (atomic_fetch_nand): Adjust.
> (atomic__fetch): Adjust.
> (atomic_nand_fetch): Adjust.
> * config/rs6000/sysv4.h (TARGET_DEFAULT): Delete MASK_POWERPC.
> * config/rs6000/sysv4le.h (TARGET_DEFAULT): Ditto.
> * config/rs6000/vxworks.h (TARGET_DEFAULT): Ditto.
> * doc/invoke.texi: Adjust documentation.

This patch is okay.

Thanks, David


Re: [PATCH, MIPS] DSP ALU scheduling

2012-08-15 Thread Sandra Loosemore
On 08/04/2012 07:55 AM, Richard Sandiford wrote:
> Sandra Loosemore  writes:
>> This is another patch that has been present in our local source base for some
>> years now.  It originally came from MIPS; I've verified that we have legal
>> permission to contribute it to the FSF.
>>
>> The 74k.md parts of this patch depend on the not-yet-reviewed "74k madd 
>> scheduler
>> tweaks" patch I posted the other day:
>>
>> http://gcc.gnu.org/ml/gcc-patches/2012-08/msg00062.html
>>
>> Assuming that one gets approved, is this patch OK for mainline?
> 
> OK with:
> 
>> +/* DSP ALU can bypass data with no delays for the following pairs. */
>> +enum insn_code dspalu_bypass_table[][2] =
>> +{
>> +  {CODE_FOR_mips_addsc, CODE_FOR_mips_addwc},
>> +  {CODE_FOR_mips_cmpu_eq_qb, CODE_FOR_mips_pick_qb},
>> +  {CODE_FOR_mips_cmpu_lt_qb, CODE_FOR_mips_pick_qb},
>> +  {CODE_FOR_mips_cmpu_le_qb, CODE_FOR_mips_pick_qb},
>> +  {CODE_FOR_mips_cmp_eq_ph, CODE_FOR_mips_pick_ph},
>> +  {CODE_FOR_mips_cmp_lt_ph, CODE_FOR_mips_pick_ph},
>> +  {CODE_FOR_mips_cmp_le_ph, CODE_FOR_mips_pick_ph},
>> +  {CODE_FOR_mips_wrdsp, CODE_FOR_mips_insv}
>> +};
>> +
>> +int
>> +mips_dspalu_bypass_p (rtx out_insn, rtx in_insn)
>> +{
>> +  int i;
>> +  int num_bypass = (sizeof (dspalu_bypass_table)
>> +/ (2 * sizeof (enum insn_code)));
> 
> this changed to ARRAY_SIZE (dspalu_bypass_table);

Here's the version I've checked in.  In addition to the change above, I 
implemented the suggestion here

http://gcc.gnu.org/ml/gcc-patches/2012-08/msg00463.html

to remove the uses of mips_mult_madd_chain_bypass_p.

-Sandra


2012-08-15  Sandra Loosemore  
Maxim Kuvyrkov  
Julian Brown  
MIPS Technologies, Inc.

gcc/
* config/mips/mips.md (dspmac, dspmacsat, accext, accmod, dspalu)
(dspalusat): Add insn types.
* config/mips/mips-dsp.md (add3)
(mips_add_s_)
(sub3, mips_sub_s_, mips_addsc)
(mips_addwc, mips_modsub, mips_raddu_w_qb, mips_absq_s_)
(mips_precrq_qb_ph, mips_precrq_ph_w, mips_precrq_rs_ph_w)
(mips_precrqu_s_qb_ph, mips_preceq_w_phl, mips_preceq_w_phr)
(mips_precequ_ph_qbl, mips_precequ_ph_qbr, mips_precequ_ph_qbla)
(mips_precequ_ph_qbra, mips_preceu_ph_qbl, mips_preceu_ph_qbr)
(mips_preceu_ph_qbla, mips_preceu_ph_qbra, mips_shll_)
(mips_shll_s_, mips_shll_s_, mips_shrl_qb)
(mips_shra_ph, mips_shra_r_, mips_bitrev, mips_insv)
(mips_repl_qb, mips_repl_ph)
(mips_cmp_eq_)
(mips_cmp_lt_)
(mips_cmp_le_, mips_cmpgu_eq_qb)
(mips_cmpgu_lt_qb, mips_cmpgu_le_qb, mips_pick_)
(mips_packrl_ph, mips_wrdsp, mips_rddsp): Change type to dspalu.
(mips_dpau_h_qbl, mips_dpau_h_qbr, mips_dpsu_h_qbl, mips_dpsu_h_qbr)
(mips_dpaq_s_w_ph, mips_dpsq_s_w_ph, mips_mulsaq_s_w_ph)
(mips_maq_s_w_phl, mips_maq_s_w_phr, mips_maq_sa_w_phr): Set type to
dspmac.
(mips_dpaq_sa_l_w, mips_dpsq_sa_l_w, mips_maq_sa_w_phl): Set type to
dspmacsat.
(mips_extr_w, mips_extr_r_w, mips_extr_rs_w, mips_extp, mips_extpdp):
Set type to accext.
(mips_shilo, mips_mthlip): Set type to accmod.
* config/mips/mips-dspr2.md (mips_absq_s_qb, mips_addu_s_ph)
(mips_adduh_r_qb): Set type to dspalusat.
(mips_addu_ph, mips_adduh_qb, mips_append, mips_balign)
(mips_cmpgdu_eq_qb, mips_cmpgdu_lt_qb, mips_cmpgdu_le_qb)
(mips_precr_qb_ph, mips_precr_sra_ph_w, mips_precr_sra_r_ph_w)
(mips_prepend, mips_shra_qb, mips_shra_r_qb, mips_shrl_ph)
(mips_subu_ph, mips_subuh_qb, mips_subuh_r_qb, mips_addqh_ph)
(mips_addqh_r_ph, mips_addqh_w, mips_addqh_r_w, mips_subqh_ph)
(mips_subqh_r_ph, mips_subqh_w, mips_subqh_r_w): Set type to dspalu.
(mips_dpa_w_ph, mips_dps_w_ph, mips_mulsa_w_ph, mips_dpax_w_ph)
(mips_dpsx_w_ph, mips_dpaqx_s_w_ph, mips_dpsqx_s_w_ph): Set type to
dspmac.  Set accum_in attribute.
(mips_subu_s_ph): Set type to dspalusat. 
(mips_dpaqx_sa_w_ph, mips_dpsqx_sa_w_ph): Set type to dspmacsat.
Set accum_in attribute.
* config/mips/mips-protos.h (mips_dspalu_bypass_p): Add prototype.
* config/mips/mips.c (dspalu_bypass_table): New.
(mips_dspalu_bypass_p): New.
* config/mips/24k.md (r24k_dsp_alu, r24k_dsp_mac, r24k_dsp_mac_sat)
(r24k_dsp_acc_ext, r24k_dsp_acc_mod): New insn reservations.
(r24k_int_mult, r24k_int_mthilo, r24k_dsp_mac, r24k_dsp_mac_sat)
(r24k_dsp_acc_ext, r24k_dsp_acc_mod, r24k_dsp_alu): New bypasses.
* config/mips/74k.md (r74k_dsp_alu, r74k_dsp_alu_sat, r74k_dsp_mac)
(r74k_dsp_mac_sat, r74k_dsp_acc_ext, r74k_dsp_acc_mod): New insn
reservations.
(r74k_dsp_mac, r74k_dsp_mac_sat, r74k_int_mult, r74k_int_mul3)
(r74k_dsp_mac, r74k_dsp_mac_sat): New bypasses.

Index: gcc/config/mips/mips.md

Re: [PATCH 3/7] rs6000: Add RS6000_BTM_ALWAYS

2012-08-15 Thread David Edelsohn
On Wed, Aug 15, 2012 at 6:29 PM, Segher Boessenkool
 wrote:
> This adds a builtin flag for "always enabled".  The value 0 works
> right now as far as I can see, but that is too tricky and should
> be fixed some day.
>
> 2012-08-15  Segher Boessenkool 
>
> gcc/
> * config/rs6000/rs6000.h (RS6000_BTM_ALWAYS): New.

This patch is okay.

Thanks, David


Re: [PATCH 2/7] rs6000: Fix typo mpower64 -> mpowerpc64 in various spec strings.

2012-08-15 Thread David Edelsohn
On Wed, Aug 15, 2012 at 6:29 PM, Segher Boessenkool
 wrote:
> 2012-08-15  Segher Boessenkool  
>
> gcc/
> * config/rs6000/aix52.h (ASM_CPU_SPEC): Fix typo.
> * config/rs6000/aix53.h (ASM_CPU_SPEC): Ditto.
> * config/rs6000/aix61.h (ASM_CPU_SPEC): Ditto.
> * config/rs6000/driver-rs6000.c (asm_names): Ditto.

This patch is okay.

Thanks, David


Re: [PATCH 1/7] rs6000: Fix PR54142

2012-08-15 Thread David Edelsohn
On Wed, Aug 15, 2012 at 6:29 PM, Segher Boessenkool
 wrote:
> This fixes PR54142, a problem I exposed when I made -mno-power the
> default.
>
> 2012-08-15  Segher Boessenkool  
>
> gcc/
> PR54142
> * config/rs6000/driver-rs6000.c (asm_names): Use %(asm_default)
> instead of -mcom.
> * config/rs6000/rs6000.h (ASM_CPU_SPEC): Ditto.

This patch is okay.

Thanks, David


PATCH: Replace target MEMBER_TYPE_FORCES_BLK macro with a target hook

2012-08-15 Thread H.J. Lu
Hi,

This patch replaces MEMBER_TYPE_FORCES_BLK with a target hook.  I
also pass the type to the target hook in addition to field, which will
be used by i386 backend for

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20020

This patch doesn't change code generation.  Tested on Linux/x86-64. I
also tested cross compilers to ia64-hpux, powerpc-linux and xtensa-linux
up to cc1.  OK to install?

Thanks.


H.J.
---
2012-08-15  H.J. Lu  

* stor-layout.c (compute_record_mode): Replace
MEMBER_TYPE_FORCES_BLK with targetm.member_type_forces_blk.
(layout_type): Likewise.

* system.h: Poison MEMBER_TYPE_FORCES_BLK.

* target.def (member_type_forces_blk): New target hook.

* targhooks.c (default_member_type_forces_blk): New.
* targhooks.h (default_member_type_forces_blk): Likewise.

* doc/tm.texi.in (MEMBER_TYPE_FORCES_BLK): Renamed to ...
(TARGET_MEMBER_TYPE_FORCES_BLK): This.
* doc/tm.texi: Regenerated.

* config/ia64/hpux.h (MEMBER_TYPE_FORCES_BLK): Removed.

* config/ia64/ia64.c (ia64_member_type_forces_blk): New
function.
(TARGET_MEMBER_TYPE_FORCES_BLK): New macro.

* config/rs6000/rs6000.c (TARGET_MEMBER_TYPE_FORCES_BLK): New
macro.
(rs6000_member_type_forces_blk): New function.

* config/rs6000/rs6000.h (MEMBER_TYPE_FORCES_BLK): Removed.

* config/xtensa/xtensa.c (xtensa_member_type_forces_blk): New
function.
(TARGET_MEMBER_TYPE_FORCES_BLK): New macro.

* config/xtensa/xtensa.h (MEMBER_TYPE_FORCES_BLK): Removed.

diff --git a/gcc/config/ia64/hpux.h b/gcc/config/ia64/hpux.h
index ad106b4..d9ae109 100644
--- a/gcc/config/ia64/hpux.h
+++ b/gcc/config/ia64/hpux.h
@@ -115,9 +115,6 @@ do {
\
 #define TARGET_DEFAULT \
   (MASK_DWARF2_ASM | MASK_BIG_ENDIAN | MASK_ILP32)
 
-/* ??? Might not be needed anymore.  */
-#define MEMBER_TYPE_FORCES_BLK(FIELD, MODE) ((MODE) == TFmode)
-
 /* ASM_OUTPUT_EXTERNAL_LIBCALL defaults to just a globalize_label call,
but that doesn't put out the @function type information which causes
shared library problems.  */
diff --git a/gcc/config/ia64/ia64.c b/gcc/config/ia64/ia64.c
index c7fb559..f3585b1 100644
--- a/gcc/config/ia64/ia64.c
+++ b/gcc/config/ia64/ia64.c
@@ -319,6 +319,8 @@ static const char *ia64_invalid_binary_op (int, const_tree, 
const_tree);
 static enum machine_mode ia64_c_mode_for_suffix (char);
 static void ia64_trampoline_init (rtx, tree, rtx);
 static void ia64_override_options_after_change (void);
+static bool ia64_member_type_forces_blk (const_tree, const_tree,
+enum machine_mode);
 
 static tree ia64_builtin_decl (unsigned, bool);
 
@@ -570,6 +572,9 @@ static const struct attribute_spec ia64_attribute_table[] =
 #undef TARGET_GET_RAW_ARG_MODE
 #define TARGET_GET_RAW_ARG_MODE ia64_get_reg_raw_mode
 
+#undef TARGET_MEMBER_TYPE_FORCES_BLK
+#define TARGET_MEMBER_TYPE_FORCES_BLK ia64_member_type_forces_blk
+
 #undef TARGET_GIMPLIFY_VA_ARG_EXPR
 #define TARGET_GIMPLIFY_VA_ARG_EXPR ia64_gimplify_va_arg
 
@@ -11153,6 +11158,17 @@ ia64_get_reg_raw_mode (int regno)
   return default_get_reg_raw_mode(regno);
 }
 
+/* Implement TARGET_MEMBER_TYPE_FORCES_BLK.  ??? Might not be needed
+   anymore.  */
+
+bool
+ia64_member_type_forces_blk (const_tree type ATTRIBUTE_UNUSED,
+const_tree field ATTRIBUTE_UNUSED,
+enum machine_mode mode)
+{
+  return TARGET_HPUX && mode == TFmode;
+}
+
 /* Always default to .text section until HP-UX linker is fixed.  */
 
 ATTRIBUTE_UNUSED static section *
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 34948fb..5b2d9f0 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -1302,6 +1302,9 @@ static const struct attribute_spec 
rs6000_attribute_table[] =
 #undef TARGET_INIT_DWARF_REG_SIZES_EXTRA
 #define TARGET_INIT_DWARF_REG_SIZES_EXTRA rs6000_init_dwarf_reg_sizes_extra
 
+#undef TARGET_MEMBER_TYPE_FORCES_BLK
+#define TARGET_MEMBER_TYPE_FORCES_BLK rs6000_member_type_forces_blk
+
 /* On rs6000, function arguments are promoted, as are function return
values.  */
 #undef TARGET_PROMOTE_FUNCTION_MODE
@@ -7287,6 +7290,27 @@ rs6000_emit_move (rtx dest, rtx source, enum 
machine_mode mode)
  emit_set:
   emit_insn (gen_rtx_SET (VOIDmode, operands[0], operands[1]));
 }
+
+/* Return true if TYPE containing FIELD should be accessed using
+   `BLKMODE'.
+
+   For the SPE, simd types are V2SI, and gcc can be tempted to put the
+   entire thing in a DI and use subregs to access the internals.
+   store_bit_field() will force (subreg:DI (reg:V2SI x))'s to the
+   back-end.  Because a single GPR can hold a V2SI, but not a DI, the
+   best thing to do is set structs to BLKmode and avoid Severe Tire
+   Damage.
+
+   On e500 v2, DF and DI modes suffer from the same anomaly.  DF can
+   fit in

Re: [patch] speed up ifcvt:cond_move_convert_if_block

2012-08-15 Thread Steven Bosscher
On Mon, Aug 6, 2012 at 1:27 PM, Paolo Bonzini  wrote:
>> 2. sparseset has the same problem of memory clearing (for valgrind,
>> see sparseset_alloc).
>
> ... only the sparse array needs this clearing, but currently we do it
> for both.

And according to the fat comment before the xcalloc, it's not even
really needed. Why can't we just tell valgrind to treat the memory as
defined, like in this patchlet:

Index: sparseset.c
===
--- sparseset.c (revision 190418)
+++ sparseset.c (working copy)
@@ -30,12 +30,14 @@ sparseset_alloc (SPARSESET_ELT_TYPE n_elms)
   unsigned int n_bytes = sizeof (struct sparseset_def)
 + ((n_elms - 1) * 2 * sizeof (SPARSESET_ELT_TYPE));

-  /* We use xcalloc rather than xmalloc to silence some valgrind uninitialized
+  sparseset set = XNEWVAR(sparseset, n_bytes);
+
+  /* Mark the sparseset as defined to silence some valgrind uninitialized
  read errors when accessing set->sparse[n] when "n" is not, and never has
  been, in the set.  These uninitialized reads are expected, by design and
- harmless.  If this turns into a performance problem due to some future
- additional users of sparseset, we can revisit this decision.  */
-  sparseset set = (sparseset) xcalloc (1, n_bytes);
+ harmless.  */
+  VALGRIND_DISCARD (VALGRIND_MAKE_MEM_DEFINED (set, n_bytes));
+
   set->dense = &(set->elms[0]);
   set->sparse = &(set->elms[n_elms]);
   set->size = n_elms;


I have never built GCC with valgrind checking, so I don't know if this
is correct. But there has to be a better solution than calling
xcalloc...

Ciao!
Steven


[PATCH 6/7] rs6000: Remove -mabi=ieeelongdouble.

2012-08-15 Thread Segher Boessenkool
There are some problems with it:
- On at least 4.6 and later, it crashes the compiler together with -m64;
- On older versions, it generates incorrect code together with -m64;
- Supposedly it doesn't actually work on 32-bit either, on the glibc side;
- It isn't listed in --target-help, because the option file says
  "undocumented", but the manual does in fact list it;
- The Darwin header claims it is for POWER.

In the spirit of the rest of this patch series, I solve these problems
by ripping it all out.

2012-08-15  Segher Boessenkool  

gcc/
* config/rs6000/aix.h (TARGET_IEEEQUAD): Delete.
* config/rs6000/darwin.h (TARGET_IEEEQUAD): Delete.
* config/rs6000/rs6000.c (rs6000_option_override_internal):
Delete rs6000_ieeequad handling.  Adjust for removal of
TARGET_IEEEQUAD.
(rs6000_emit_move): Adjust.
(rs6000_return_in_memory): Adjust.
(rs6000_function_arg_advance_1): Adjust.
(rs6000_function_arg): Adjust.
(rs6000_pass_by_reference): Adjust.
(rs6000_init_libfuncs): Adjust.
(rs6000_cannot_change_mode_class): Adjust.
(rs6000_generate_compare): Adjust.
(rs6000_mangle_type): Adjust.
* config/rs6000/rs6000.h: Delete TARGET_IEEEQUAD.
* config/rs6000/rs6000.md (whole file): Adjust for removal
of TARGET_IEEEQUAD.
* config/rs6000/rs6000.opt (mabi=ieeelongdouble): Delete.
(mabi=ibmlongdouble): Replace by stub.
* config/rs6000/spe.md (whole file): Adjust for removal
of TARGET_IEEEQUAD.
* doc/invoke.texi: Adjust documentation.
---
 gcc/config/rs6000/aix.h  |3 -
 gcc/config/rs6000/darwin.h   |4 -
 gcc/config/rs6000/rs6000.c   |  127 +-
 gcc/config/rs6000/rs6000.h   |1 -
 gcc/config/rs6000/rs6000.md  |   78 +-
 gcc/config/rs6000/rs6000.opt |5 +-
 gcc/config/rs6000/spe.md |   42 +-
 gcc/doc/invoke.texi  |   13 +
 8 files changed, 82 insertions(+), 191 deletions(-)

diff --git a/gcc/config/rs6000/aix.h b/gcc/config/rs6000/aix.h
index 41421a0..dc6dc55 100644
--- a/gcc/config/rs6000/aix.h
+++ b/gcc/config/rs6000/aix.h
@@ -41,9 +41,6 @@
 #undef  STACK_BOUNDARY
 #define STACK_BOUNDARY 128
 
-#undef  TARGET_IEEEQUAD
-#define TARGET_IEEEQUAD 0
-
 /* The AIX linker will discard static constructors in object files before
collect has a chance to see them, so scan the object files directly.  */
 #define COLLECT_EXPORT_LIST
diff --git a/gcc/config/rs6000/darwin.h b/gcc/config/rs6000/darwin.h
index 17ff675..2e67215 100644
--- a/gcc/config/rs6000/darwin.h
+++ b/gcc/config/rs6000/darwin.h
@@ -282,10 +282,6 @@ extern int darwin_emit_branch_islands;
 #undef  TARGET_DEFAULT
 #define TARGET_DEFAULT (MASK_MULTIPLE | MASK_PPC_GFXOPT)
 
-/* Darwin only runs on PowerPC, so short-circuit POWER patterns.  */
-#undef  TARGET_IEEEQUAD
-#define TARGET_IEEEQUAD 0
-
 /* Since Darwin doesn't do TOCs, stub this out.  */
 
 #define ASM_OUTPUT_SPECIAL_POOL_ENTRY_P(X, MODE)  ((void)X, (void)MODE, 0)
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 4571c6f..15105c6 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -2686,11 +2686,6 @@ rs6000_option_override_internal (bool global_init_p)
rs6000_long_double_type_size = RS6000_DEFAULT_LONG_DOUBLE_SIZE;
 }
 
-#if !defined (POWERPC_LINUX) && !defined (POWERPC_FREEBSD)
-  if (!global_options_set.x_rs6000_ieeequad)
-rs6000_ieeequad = 1;
-#endif
-
   /* Disable VSX and Altivec silently if the user switched cpus to power7 in a
  target attribute or pragma which automatically enables both options,
  unless the altivec ABI was set.  This is set by default for 64-bit, but
@@ -2893,7 +2888,7 @@ rs6000_option_override_internal (bool global_init_p)
flag_signed_bitfields = 0;
 #endif
 
-  if (TARGET_LONG_DOUBLE_128 && !TARGET_IEEEQUAD)
+  if (TARGET_LONG_DOUBLE_128)
REAL_MODE_FORMAT (TFmode) = &ibm_extended_format;
 
   if (TARGET_TOC)
@@ -6986,8 +6981,8 @@ rs6000_emit_move (rtx dest, rtx source, enum machine_mode 
mode)
 
   /* 128-bit constant floating-point values on Darwin should really be
  loaded as two parts.  */
-  if (!TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128
-  && mode == TFmode && GET_CODE (operands[1]) == CONST_DOUBLE)
+  if (TARGET_LONG_DOUBLE_128 && mode == TFmode
+  && GET_CODE (operands[1]) == CONST_DOUBLE)
 {
   rs6000_emit_move (simplify_gen_subreg (DFmode, operands[0], mode, 0),
simplify_gen_subreg (DFmode, operands[1], mode, 0),
@@ -7350,9 +7345,6 @@ rs6000_return_in_memory (const_tree type, const_tree 
fntype ATTRIBUTE_UNUSED)
   return true;
 }
 
-  if (DEFAULT_ABI == ABI_V4 && TARGET_IEEEQUAD && TYPE_MODE (type) == TFmode)
-return true;
-
   return false;
 }
 
@@ -7908,8 +7900,7 @@ rs6000_function_arg_advance_1 (CUMULATIVE_ARGS *cum, enum 
machine_mode mode,

[PATCH 7/7] rs6000: Old AIX specs: Use %(asm_default) instead of -mppc.

2012-08-15 Thread Segher Boessenkool
2012-08-15  Segher Boessenkool  

gcc/
* config/rs6000/aix43.h (ASM_CPU_SPEC): Use %(asm_default)
instead of -mppc.
* config/rs6000/aix51.h (ASM_CPU_SPEC): Ditto.
---
 gcc/config/rs6000/aix43.h |2 +-
 gcc/config/rs6000/aix51.h |2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/config/rs6000/aix43.h b/gcc/config/rs6000/aix43.h
index 8465c20..c1a69dc 100644
--- a/gcc/config/rs6000/aix43.h
+++ b/gcc/config/rs6000/aix43.h
@@ -49,7 +49,7 @@ do {  
\
 #undef ASM_CPU_SPEC
 #define ASM_CPU_SPEC \
 "%{!mcpu*: %{!maix64: \
-  %{!mpowerpc64: -mppc} \
+  %{!mpowerpc64: %(asm_default)} \
   %{mpowerpc64: -mppc64}}} \
 %{mcpu=power3: -m620} \
 %{mcpu=power4: -m620} \
diff --git a/gcc/config/rs6000/aix51.h b/gcc/config/rs6000/aix51.h
index a140e12..6ea30c1 100644
--- a/gcc/config/rs6000/aix51.h
+++ b/gcc/config/rs6000/aix51.h
@@ -43,7 +43,7 @@ do {  
\
 #undef ASM_CPU_SPEC
 #define ASM_CPU_SPEC \
 "%{!mcpu*: %{!maix64: \
-  %{!mpowerpc64: -mppc} \
+  %{!mpowerpc64: %(asm_default)} \
   %{mpowerpc64: -mppc64}}} \
 %{mcpu=power3: -m620} \
 %{mcpu=power4: -m620} \
-- 
1.7.7.6



[PATCH 0/7] rs6000: POWER removal, phase 2.

2012-08-15 Thread Segher Boessenkool
Here are some more patches to remove old, unused features from
the rs6000 port.

The first two and last patches fix up some spec strings so that
asm_default is used when it should.  The third adds a "bitmask"
RS6000_BTM_ALWAYS for use by the fourth patch; the one builtin
that uses it probably should not be always on (it is specific
to Darwin as far as I can tell), but I'm not going to change
behaviour with these patches.

The fourth patch removes "common mode", makes -mpowerpc always
on effectively.

The fifth patch removes the old assembler syntax everywhere.

The sixth patch removes -mabi=ieeelongdouble, which is plagued
by problems; a comment in darwin.h made me believe this option
is for old POWER, but it's not (it's for 32-bit SVR4 instead).
I'm not totally sure we can remove this support at this time,
but I made the patch already, so I'm sending it anyway.

Bootstrapped and tested on powerpc64-linux, no regressions,
--enable-languages=c,c++,fortran .

Okay for mainline?


Segher


Segher Boessenkool (7):
  rs6000: Fix PR54142
  rs6000: Fix typo mpower64 -> mpowerpc64 in various spec strings.
  rs6000: Add RS6000_BTM_ALWAYS
  rs6000: Remove TARGET_POWERPC
  rs6000: Get rid of old-mnemonics
  rs6000: Remove -mabi=ieeelongdouble.
  rs6000: Old AIX specs: Use %(asm_default) instead of -mppc.

 gcc/common/config/rs6000/rs6000-common.c |   15 +-
 gcc/config/rs6000/aix.h  |3 -
 gcc/config/rs6000/aix43.h|   12 +-
 gcc/config/rs6000/aix51.h|   12 +-
 gcc/config/rs6000/aix52.h|6 +-
 gcc/config/rs6000/aix53.h|6 +-
 gcc/config/rs6000/aix61.h|6 +-
 gcc/config/rs6000/darwin.h   |7 +-
 gcc/config/rs6000/darwin.md  |   34 +-
 gcc/config/rs6000/darwin64.h |4 +-
 gcc/config/rs6000/default64.h|4 +-
 gcc/config/rs6000/dfp.md |   24 +-
 gcc/config/rs6000/driver-rs6000.c|6 +-
 gcc/config/rs6000/eabi.h |2 +-
 gcc/config/rs6000/eabialtivec.h  |2 +-
 gcc/config/rs6000/eabispe.h  |3 +-
 gcc/config/rs6000/linuxaltivec.h |2 +-
 gcc/config/rs6000/linuxspe.h |3 +-
 gcc/config/rs6000/rs6000-builtin.def |2 +-
 gcc/config/rs6000/rs6000-c.c |7 +-
 gcc/config/rs6000/rs6000-cpus.def|  121 ++--
 gcc/config/rs6000/rs6000-tables.opt  |   33 +-
 gcc/config/rs6000/rs6000.c   |  235 ++-
 gcc/config/rs6000/rs6000.h   |   12 +-
 gcc/config/rs6000/rs6000.md  | 1137 ++
 gcc/config/rs6000/rs6000.opt |   21 +-
 gcc/config/rs6000/spe.md |   68 +--
 gcc/config/rs6000/sync.md|   24 +-
 gcc/config/rs6000/sysv4.h|6 +-
 gcc/config/rs6000/sysv4le.h  |2 +-
 gcc/config/rs6000/vsx.md |8 +-
 gcc/config/rs6000/vxworks.h  |3 +-
 gcc/doc/invoke.texi  |   84 +--
 libgcc/longlong.h|   34 +-
 34 files changed, 661 insertions(+), 1287 deletions(-)

-- 
1.7.7.6



[PATCH 3/7] rs6000: Add RS6000_BTM_ALWAYS

2012-08-15 Thread Segher Boessenkool
This adds a builtin flag for "always enabled".  The value 0 works
right now as far as I can see, but that is too tricky and should
be fixed some day.

2012-08-15  Segher Boessenkool 

gcc/
* config/rs6000/rs6000.h (RS6000_BTM_ALWAYS): New.
---
 gcc/config/rs6000/rs6000.h |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index 5644435..5c53f85 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -2302,6 +2302,7 @@ extern int frame_pointer_needed;
 /* Builtin targets.  For now, we reuse the masks for those options that are in
target flags, and pick two random bits for SPE and paired which aren't in
target_flags.  */
+#define RS6000_BTM_ALWAYS  0   /* Always enabled.  */
 #define RS6000_BTM_ALTIVEC MASK_ALTIVEC/* VMX/altivec vectors.  */
 #define RS6000_BTM_VSX MASK_VSX/* VSX (vector/scalar).  */
 #define RS6000_BTM_SPE MASK_STRING /* E500 */
-- 
1.7.7.6



[PATCH 2/7] rs6000: Fix typo mpower64 -> mpowerpc64 in various spec strings.

2012-08-15 Thread Segher Boessenkool
2012-08-15  Segher Boessenkool  

gcc/
* config/rs6000/aix52.h (ASM_CPU_SPEC): Fix typo.
* config/rs6000/aix53.h (ASM_CPU_SPEC): Ditto.
* config/rs6000/aix61.h (ASM_CPU_SPEC): Ditto.
* config/rs6000/driver-rs6000.c (asm_names): Ditto.
---
 gcc/config/rs6000/aix52.h |2 +-
 gcc/config/rs6000/aix53.h |2 +-
 gcc/config/rs6000/aix61.h |2 +-
 gcc/config/rs6000/driver-rs6000.c |2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/gcc/config/rs6000/aix52.h b/gcc/config/rs6000/aix52.h
index 206b9fe..eeeb820 100644
--- a/gcc/config/rs6000/aix52.h
+++ b/gcc/config/rs6000/aix52.h
@@ -50,7 +50,7 @@ do {  
\
 #define ASM_CPU_SPEC \
 "%{!mcpu*: %{!maix64: \
   %{mpowerpc64: -mppc64} \
-  %{!mpower64: %(asm_default)}}} \
+  %{!mpowerpc64: %(asm_default)}}} \
 %{mcpu=power3: -m620} \
 %{mcpu=power4: -m620} \
 %{mcpu=power5: -m620} \
diff --git a/gcc/config/rs6000/aix53.h b/gcc/config/rs6000/aix53.h
index 44ad5cf..dac0ce0 100644
--- a/gcc/config/rs6000/aix53.h
+++ b/gcc/config/rs6000/aix53.h
@@ -53,7 +53,7 @@ do {  
\
 "%{!mcpu*: %{!maix64: \
   %{mpowerpc64: -mppc64} \
   %{maltivec: -m970} \
-  %{!maltivec: %{!mpower64: %(asm_default) \
+  %{!maltivec: %{!mpowerpc64: %(asm_default) \
 %{mcpu=native: %(asm_cpu_native)} \
 %{mcpu=power3: -m620} \
 %{mcpu=power4: -mpwr4} \
diff --git a/gcc/config/rs6000/aix61.h b/gcc/config/rs6000/aix61.h
index 1000e11..0518d44 100644
--- a/gcc/config/rs6000/aix61.h
+++ b/gcc/config/rs6000/aix61.h
@@ -53,7 +53,7 @@ do {  
\
 "%{!mcpu*: %{!maix64: \
   %{mpowerpc64: -mppc64} \
   %{maltivec: -m970} \
-  %{!maltivec: %{!mpower64: %(asm_default) \
+  %{!maltivec: %{!mpowerpc64: %(asm_default) \
 %{mcpu=native: %(asm_cpu_native)} \
 %{mcpu=power3: -m620} \
 %{mcpu=power4: -mpwr4} \
diff --git a/gcc/config/rs6000/driver-rs6000.c 
b/gcc/config/rs6000/driver-rs6000.c
index e6c7da1..36fc27d 100644
--- a/gcc/config/rs6000/driver-rs6000.c
+++ b/gcc/config/rs6000/driver-rs6000.c
@@ -368,7 +368,7 @@ static const struct asm_name asm_names[] = {
 %{!maix64: \
 %{mpowerpc64: -mppc64} \
 %{maltivec: -m970} \
-%{!maltivec: %{!mpower64: %(asm_default)}}}" },
+%{!maltivec: %{!mpowerpc64: %(asm_default)}}}" },
 
 #else
   { "common",  "-mcom" },
-- 
1.7.7.6



[PATCH 1/7] rs6000: Fix PR54142

2012-08-15 Thread Segher Boessenkool
This fixes PR54142, a problem I exposed when I made -mno-power the
default.

2012-08-15  Segher Boessenkool  

gcc/
PR54142
* config/rs6000/driver-rs6000.c (asm_names): Use %(asm_default)
instead of -mcom.
* config/rs6000/rs6000.h (ASM_CPU_SPEC): Ditto.
---
 gcc/config/rs6000/driver-rs6000.c |2 +-
 gcc/config/rs6000/rs6000.h|2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/config/rs6000/driver-rs6000.c 
b/gcc/config/rs6000/driver-rs6000.c
index 68b5257..e6c7da1 100644
--- a/gcc/config/rs6000/driver-rs6000.c
+++ b/gcc/config/rs6000/driver-rs6000.c
@@ -420,7 +420,7 @@ static const struct asm_name asm_names[] = {
   { NULL,  "\
 %{mpowerpc64*: -mppc64} \
 %{!mpowerpc64*: %{mpowerpc*: -mppc}} \
-%{!mpowerpc*: -mcom}" },
+%{!mpowerpc*: %(asm_default)}" },
 #endif
 };
 
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index ec62fc6..5644435 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -103,7 +103,7 @@
 "%{!mcpu*: \
   %{mpowerpc64*: -mppc64} \
   %{!mpowerpc64*: %{mpowerpc*: -mppc}} \
-  %{!mpowerpc*: -mcom}} \
+  %{!mpowerpc*: %(asm_default)}} \
 %{mcpu=native: %(asm_cpu_native)} \
 %{mcpu=common: -mcom} \
 %{mcpu=cell: -mcell} \
-- 
1.7.7.6



Re: Merge C++ conversion into trunk (4/6 - hash table rewrite)

2012-08-15 Thread Ian Lance Taylor
On Wed, Aug 15, 2012 at 2:58 PM, Lawrence Crowl  wrote:
>
> I do not much like _t names either.

Also, names ending in _t are reserved by POSIX if you #include
.  Though that may only apply to global names, not to
types defined in classes or namespaces.

Ian


Re: [rfc] Fix SPU build (Re: [PATCH] Remove basic_block->loop_depth)

2012-08-15 Thread Ulrich Weigand
Richard Guenther wrote:
> On Wed, 15 Aug 2012, Ulrich Weigand wrote:
> > It seems flow_loops_find by itself is not quite enough, but everything
> > necessary to use the loop structures seems to be encapsulated in
> > loop_optimizer_init / loop_optimizer_finalize, which are already called
> > by a number of optimization passes.  When I do the same in the SPU
> > md-reorg pass, loop_father seems to be set up OK.
> > 
> > Does this look reasonable?
> 
> Yes.  You can use bb_loop_depth instead of loop_depth (bb->loop_father).

Ah, good point.  Below the version I committed.  Tested on spu-elf.

Thanks,
Ulrich

ChangeLog:

* config/spu/spu.c: Include "cfgloop.h".
(spu_machine_dependent_reorg): Call loop_optimizer_init and
loop_optimizer_finalize.  Use bb_loop_depth instead of loop_depth.
Directly compare loop_father values where appropriate.
* config/spu/t-spu-elf (spu.o): Update dependencies.

Index: gcc/config/spu/spu.c
===
*** gcc/config/spu/spu.c(revision 190390)
--- gcc/config/spu/spu.c(working copy)
***
*** 53,58 
--- 53,59 
  #include "timevar.h"
  #include "df.h"
  #include "dumpfile.h"
+ #include "cfgloop.h"
  
  /* Builtin types, data and prototypes. */
  
*** spu_machine_dependent_reorg (void)
*** 2458,2463 
--- 2459,2468 
in_spu_reorg = 1;
compute_bb_for_insn ();
  
+   /* (Re-)discover loops so that bb->loop_father can be used
+  in the analysis below.  */
+   loop_optimizer_init (AVOID_CFG_MODIFICATIONS);
+ 
compact_blocks ();
  
spu_bb_info =
*** spu_machine_dependent_reorg (void)
*** 2562,2575 
 fallthru block. This catches the cases when it is a simple
 loop or when there is an initial branch into the loop. */
  if (prev && (loop_exit || simple_loop)
! && prev->loop_depth <= bb->loop_depth)
prop = prev;
  
  /* If there is only one adjacent predecessor.  Don't propagate
!outside this loop.  This loop_depth test isn't perfect, but
!I'm not sure the loop_father member is valid at this point.  */
  else if (prev && single_pred_p (bb)
!  && prev->loop_depth == bb->loop_depth)
prop = prev;
  
  /* If this is the JOIN block of a simple IF-THEN then
--- 2567,2579 
 fallthru block. This catches the cases when it is a simple
 loop or when there is an initial branch into the loop. */
  if (prev && (loop_exit || simple_loop)
! && bb_loop_depth (prev) <= bb_loop_depth (bb))
prop = prev;
  
  /* If there is only one adjacent predecessor.  Don't propagate
!outside this loop.  */
  else if (prev && single_pred_p (bb)
!  && prev->loop_father == bb->loop_father)
prop = prev;
  
  /* If this is the JOIN block of a simple IF-THEN then
*** spu_machine_dependent_reorg (void)
*** 2578,2584 
   && EDGE_COUNT (bb->preds) == 2
   && EDGE_COUNT (prev->preds) == 1
   && EDGE_PRED (prev, 0)->src == prev2
!  && prev2->loop_depth == bb->loop_depth
   && GET_CODE (branch_target) != REG)
prop = prev;
  
--- 2582,2588 
   && EDGE_COUNT (bb->preds) == 2
   && EDGE_COUNT (prev->preds) == 1
   && EDGE_PRED (prev, 0)->src == prev2
!  && prev2->loop_father == bb->loop_father
   && GET_CODE (branch_target) != REG)
prop = prev;
  
*** spu_machine_dependent_reorg (void)
*** 2600,2606 
  if (dump_file)
fprintf (dump_file, "propagate from %i to %i (loop depth %i) "
 "for %i (loop_exit %i simple_loop %i dist %i)\n",
!bb->index, prop->index, bb->loop_depth,
 INSN_UID (branch), loop_exit, simple_loop,
 branch_addr - INSN_ADDRESSES (INSN_UID (bbend)));
  
--- 2604,2610 
  if (dump_file)
fprintf (dump_file, "propagate from %i to %i (loop depth %i) "
 "for %i (loop_exit %i simple_loop %i dist %i)\n",
!bb->index, prop->index, bb_loop_depth (bb),
 INSN_UID (branch), loop_exit, simple_loop,
 branch_addr - INSN_ADDRESSES (INSN_UID (bbend)));
  
*** spu_machine_dependent_reorg (void)
*** 2657,2662 
--- 2661,2668 
  
spu_var_tracking ();
  
+   loop_optimizer_finalize ();
+ 
free_bb_for_insn ();
  
in_spu_reorg = 0;
Index: gcc/config/spu/t-spu-elf
===
*** gcc/config/spu/t-spu-elf(revision 190390)
--- gcc/config/spu/t-spu-elf(working copy)
*

[Patch, Fortran, committed] PR 54243 & 54244

2012-08-15 Thread Janus Weil
Hi all,

I have just committed as obvious a small patch fixing two
ICE-on-invalid PR involving CLASS declarations:

http://gcc.gnu.org/viewcvs?view=revision&revision=190420

Cheers,
Janus


Re: [google/gcc-4_7] Fix regression - SUBTARGET_EXTRA_SPECS overridden by LINUX_GRTE_EXTRA_SPECS

2012-08-15 Thread 沈涵
Hi Jing, ping?

On Mon, Aug 13, 2012 at 10:58 AM, Han Shen(沈涵)  wrote:
> Hi, the google/gcc-4_7 fails to linking anything (on x86-generic), by
> looking into specs file, it seems that 'link_emulation' section is
> missing in specs.
>
> The problem is in config/i386/linux.h, SUBTARGET_EXTRA_SPECS (which is
> not empty for chrome x86-generic) is overridden by
> "LINUX_GRTE_EXTRA_SPECS".
>
> My fix is to prepend LINUX_GRTE_EXTRA_SPECS to SUBTARGET_EXTRA_SPECS in 
> linux.h
>
> Jing, could you take a look at this?
>
> --
> Han Shen
>
> 2012-08-13 Han Shen  
> * gcc/config/i386/gnu-user.h (SUBTARGET_EXTRA_SPECS): Compute
> new value of LINUX_GRTE_EXTRA_SPECS by pre-pending LINUX_GRTE_EXTRA_SPECS
> to its origin value.
> * gcc/config/i386/gnu-user.h (SUBTARGET_EXTRA_SPECS_STR): Add
> new MACRO to hold value of SUBTARET_EXTRA_SPECS so that
> SUBTARET_EXTRA_SPECS could be replaced later in gnu-user.h
>
> --- a/gcc/config/i386/gnu-user.h
> +++ b/gcc/config/i386/gnu-user.h
> @@ -92,11 +92,14 @@ along with GCC; see the file COPYING3.  If not see
>  #define ASM_SPEC \
>"--32 %{!mno-sse2avx:%{mavx:-msse2avx}} %{msse2avx:%{!mavx:-msse2avx}}"
>
> -#undef  SUBTARGET_EXTRA_SPECS
> -#define SUBTARGET_EXTRA_SPECS \
> +#undef  SUBTARGET_EXTRA_SPECS_STR
> +#define SUBTARGET_EXTRA_SPECS_STR \
>{ "link_emulation", GNU_USER_LINK_EMULATION },\
>{ "dynamic_linker", GNU_USER_DYNAMIC_LINKER }
>
> +#undef  SUBTARGET_EXTRA_SPECS
> +#define SUBTARGET_EXTRA_SPECS SUBTARGET_EXTRA_SPECS_STR
> +
>  #undef LINK_SPEC
>  #define LINK_SPEC "-m %(link_emulation) %{shared:-shared} \
>%{!shared: \
> --- a/gcc/config/i386/linux.h
> +++ b/gcc/config/i386/linux.h
> @@ -32,5 +32,11 @@ along with GCC; see the file COPYING3.  If not see
>  #endif
>
>  #undef  SUBTARGET_EXTRA_SPECS
> +#ifndef SUBTARGET_EXTRA_SPECS_STR
>  #define SUBTARGET_EXTRA_SPECS \
>LINUX_GRTE_EXTRA_SPECS
> +#else
> +#define SUBTARGET_EXTRA_SPECS \
> +  LINUX_GRTE_EXTRA_SPECS \
> +  SUBTARGET_EXTRA_SPECS_STR
> +#endif



-- 
Han Shen |  Software Engineer |  shen...@google.com |  +1-650-440-3330


Re: Merge C++ conversion into trunk (4/6 - hash table rewrite)

2012-08-15 Thread Lawrence Crowl
On 8/15/12, Richard Henderson  wrote:
> On 2012-08-15 07:29, Richard Guenther wrote:
>> +   typedef typename Element::Element_t Element_t;
>
> Can we use something less ugly than Element_t?
> Such as
>
>   typedef typename Element::T T;
>
> ?  Given that this name is scoped anyway...

I do not much like _t names either.

-- 
Lawrence Crowl


Re: Merge C++ conversion into trunk (4/6 - hash table rewrite)

2012-08-15 Thread Lawrence Crowl
On 8/15/12, Richard Guenther  wrote:
> On Wed, 15 Aug 2012, Michael Matz wrote:
> > On Wed, 15 Aug 2012, Richard Guenther wrote:
> > > Like the following, only the coverage.c use is converted.
> > > I've never seen template function arguments anywhere and
> > > having to repeat them all
> > >
> > > over the place is really really ugly (yes, even if only in
> > > the implementation).
> > >
> > > This goes with static member functions and not wrapping any
> > > data. It goes away with the requirement of having externally
> > > visible global functions for the hash/compare functions as well
> > > (which was ugly anyway).
> >
> > Well, it looks nicer than what's there currently.  As the
> > element functions now are scoped and normal member functions,
> > they should be named with lower case characters of course.  I do
> > like that the hash table would then only have one real template
> > argument; the Allocator, well, no better idea comes to my mind.
>
> Yeah.  Updated patch below, all users converted.  I probably
> missed to revert some of the globalizations of hash/compare fns.

Your conversion is a better abstraction, and something I'd wanted
to get to eventually, so I support your conversion.

BTW, the conding conventions say to put member function definitions
out of line.

-- 
Lawrence Crowl


Re: Merge C++ conversion into trunk (4/6 - hash table rewrite)

2012-08-15 Thread Lawrence Crowl
On 8/15/12, Richard Guenther  wrote:
> On Sun, 12 Aug 2012, Diego Novillo wrote:
> > This implements a new C++ hash table.
> >
> > See http://gcc.gnu.org/ml/gcc-patches/2012-08/msg00711.html for
> > details.
>
> Now as we see the result I'd have prefered a more C++-way instead
> of making the conversion simple ...
>
> Like
>
> template 
> class hash_table
> {
> ...
> };
>
> template 
> class pointer_hash
> {
>   hashval_t hash ();
>   int equal (const Element *);
>   ~Element ();
>   Element e;
> };
>
> and
>
> /* Hash table with all same_succ entries.  */
>
> static hash_table  > same_succ_htab;
>
> The existing way is simply too ugly ... so, why did you not invent
> a "nice" C++ way?  (please consider reverting the hashtable patch)

We are trying to balance several factors.  Sometimes we're going
to pick multiple steps rather than a single step.  In some cases,
as here, the intent was to make the client code changes minimal and
unsurprising while still getting the type safety and efficiency.
Other times, it may be a minor technical issue.  In particular,
I prefer to avoid steps that might cause very poor matching of
lines in the diff.  Others may choose differently.  Together we
will learn where the tradeoffs lie.

More in a later response.

-- 
Lawrence Crowl


Re: PATCH [x86_64] PR20020 - 128 bit structs not targeted to TImode

2012-08-15 Thread H.J. Lu
On Tue, Aug 14, 2012 at 2:34 PM, Gary Funck  wrote:
> Attached, is an updated patch (with change logs).
>
> The test cases are now in gcc.target/i386 and the
> target selection is "dg-require-effective-target int128" only.
>
> Verified that the tests correctly detect the presence/lack
> of TImode support.
>

Here is a patch, which passed tests on Linux/x86-64/ia32 with

--enable-languages=c,c++,fortran,java,lto,objc,ada,obj-c++,go

on Linux/x32 with

--enable-languages=c,c++,fortran,java,lto,objc,obj-c++,go

I skipped Ada on x32 since Ada run-time library assumes
clock_t/time_t are long.

I added a target hook, type_blkmode_p, so that a backend
can force a type to BLKmode.  There is a macro,
MEMBER_TYPE_FORCES_BLK.  But it will also set

struct
{
  long double x;
};

to BLKmode instead of XFmode.  I think we can replace
MEMBER_TYPE_FORCES_BLK with the new type_blkmode_p
target hook.


-- 
H.J.
---
gcc/

2012-08-15  H.J. Lu  
Gary Funck 

PR target/20020
* stor-layout.c (compute_record_mode): Use BLKmode if
targetm.type_blkmode_p returns true.

* target.def (type_blkmode_p): New target hook.
* doc/tm.texi.in: Regenerated.
* doc/tm.texi.in: Likewise.

* config/i386/i386.c (ix86_type_blkmode_p): New function.
(TARGET_TYPE_BLKMODE_P): New macro.

* config/i386/i386.h (MAX_FIXED_MODE_SIZE): New macro.

gcc/testsuite/

2012-08-15  H.J. Lu  
Gary Funck 

PR target/20020
* gcc.target/i386/pr20020-1.c: New test.
* gcc.target/i386/pr20020-2.c: Likewise.
* gcc.target/i386/pr20020-3.c: Likewise.


gcc-pr20020.patch
Description: Binary data


Re: Merge C++ conversion into trunk (0/6 - Overview)

2012-08-15 Thread Toon Moene

On 08/15/2012 06:00 PM, Diego Novillo wrote:

> On the switch to C++ as the build language for GCC ...

Here are my results:

0:30 UTC - using C as the initial build language:

http://gcc.gnu.org/ml/gcc-testresults/2012-08/msg01329.html

and:

18:40 UTC - using C++ as the initial build language:

http://gcc.gnu.org/ml/gcc-testresults/2012-08/msg01408.html

both for x86_64-unknown-linux-gnu native (note: 
--with-build-config=bootstrap-lto).


As far as I can, little difference.

Congratulations, Diego and all the others who worked on this transition.

Kind regards,

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news


[google] Modification of gcov pmu format to reduce gcda size bloat (issue6427063)

2012-08-15 Thread Chris Manghane
This patch has been updated to reflect changes in patch r190247, which removed
pfmon support.

The patch should be applied to google/main

Tested with crosstools.

2012-08-14  Chris Manghane  

* libgcc/pmu-profile.c 
(gcov_write_load_latency_infos): Removed unused function.
(gcov_write_branch_mispredict_infos): Ditto.
(destroy_load_latency_infos): Removed static keyword.
(init_pmu_branch_mispredict): Ditto.
(gcov_write_ll_line): Ditto, plus replaced filename field with filetag.
(gcov_write_branch_mispredict_line): Ditto.
(gcov_write_string_table_entry): New function.
(gcov_write_tool_header): Removed static keyword.
* gcc/gcov.c
(release_structures): Removed filename field from PMU structures.
(filter_pmu_data_lines): Added PMU string table support.
(process_pmu_profile): Ditto.
* gcc/gcov-io.c 
(gcov_read_pmu_load_latency_info): Replaced filename field with filetag.
(gcov_read_pmu_branch_mispredict_info): Ditto.
(gcov_read_pmu_string_table_entry): New Function.
(print_load_latency_line): Replaced filename field with filetag.
(print_branch_mispredict_line): Ditto.
(print_string_table_entry): New function.
* gcc/gcov-io.h 
(GCOV_TAG_PMU_LOAD_LATENCY_LENGTH): Replaced filename field with filetag
(GCOV_TAG_PMU_BRANCH_MISPREDICT_LENGTH): Ditto.
(GCOV_TAG_PMU_STRING_TABLE_ENTRY): Added new tag.
(GCOV_TAG_PMU_STRING_TABLE_ENTRY_LENGTH): Ditto.
(gcov_pmu_load_latency_info): Replaced filename field with filetag.
(gcov_pmu_branch_mispredict_info): Ditto.
(gcov_pmu_string_table_entry): New struct.
(gcov_pmu_string_table): New struct.
* gcc/gcov-dump.c
(tag_pmu_load_latency_info): Removed PMU filename field.
(tag_pmu_branch_mispredict_info): Ditto.
(tag_pmu_string_table_entry): New function.

Index: libgcc/pmu-profile.c
===
--- libgcc/pmu-profile.c(revision 190362)
+++ libgcc/pmu-profile.c(working copy)
@@ -1,3 +1,4 @@
+
 /* Performance monitoring unit (PMU) profiler. If available, use an
external tool to collect hardware performance counter data and
write it in the .gcda files.
@@ -74,13 +75,13 @@ static void destroy_load_latency_infos (void *info
 static void destroy_branch_mispredict_infos (void *info);
 static gcov_unsigned_t gcov_tag_pmu_tool_header_length (gcov_pmu_tool_header_t
 *header);
-static void gcov_write_tool_header (gcov_pmu_tool_header_t *header);
-static void gcov_write_load_latency_infos (void *info);
-static void gcov_write_branch_mispredict_infos (void *info);
-static void gcov_write_ll_line (const gcov_pmu_ll_info_t *ll_info);
-static void gcov_write_branch_mispredict_line (const gcov_pmu_brm_info_t
+void gcov_write_tool_header (gcov_pmu_tool_header_t *header);
+void gcov_write_ll_line (const gcov_pmu_ll_info_t *ll_info);
+void gcov_write_branch_mispredict_line (const gcov_pmu_brm_info_t
*brm_info);
+void gcov_write_string_table_entry (const gcov_pmu_st_entry_t *st_entry);
 
+
 /* Convert a fractional PCT to an unsigned integer after
muliplying by 100.  */
 
@@ -156,11 +157,11 @@ init_pmu_branch_mispredict (void)
 
 /* Write the load latency information LL_INFO into the gcda file.  */
 
-static void
+void
 gcov_write_ll_line (const gcov_pmu_ll_info_t *ll_info)
 {
-  gcov_unsigned_t len = GCOV_TAG_PMU_LOAD_LATENCY_LENGTH (ll_info->filename);
-  gcov_write_tag_length (GCOV_TAG_PMU_LOAD_LATENCY_INFO, len);
+  gcov_write_tag_length (GCOV_TAG_PMU_LOAD_LATENCY_INFO,
+ GCOV_TAG_PMU_LOAD_LATENCY_LENGTH());
   gcov_write_unsigned (ll_info->counts);
   gcov_write_unsigned (ll_info->self);
   gcov_write_unsigned (ll_info->cum);
@@ -174,27 +175,38 @@ gcov_write_ll_line (const gcov_pmu_ll_info_t *ll_i
   gcov_write_counter (ll_info->code_addr);
   gcov_write_unsigned (ll_info->line);
   gcov_write_unsigned (ll_info->discriminator);
-  gcov_write_string (ll_info->filename);
+  gcov_write_unsigned (ll_info->filetag);
 }
 
 
 /* Write the branch mispredict information BRM_INFO into the gcda file.  */
 
-static void
+void
 gcov_write_branch_mispredict_line (const gcov_pmu_brm_info_t *brm_info)
 {
-  gcov_unsigned_t len = GCOV_TAG_PMU_BRANCH_MISPREDICT_LENGTH (
-  brm_info->filename);
-  gcov_write_tag_length (GCOV_TAG_PMU_BRANCH_MISPREDICT_INFO, len);
+  gcov_write_tag_length (GCOV_TAG_PMU_BRANCH_MISPREDICT_INFO,
+ GCOV_TAG_PMU_BRANCH_MISPREDICT_LENGTH());
   gcov_write_unsigned (brm_info->counts);
   gcov_write_unsigned (brm_info->self);
   gcov_write_unsigned (brm_info->cum);
   gcov_write_counter (brm_info->code_addr);
   gcov_write_unsigned (brm_info->line);
   gcov_write_unsigned (brm_info->discriminator

Re: RFC: fix std::unique_ptr pretty-printer

2012-08-15 Thread Tom Tromey
> "Jonathan" == Jonathan Wakely  writes:

Jonathan> I like it, please go ahead and check that in it you're happy
Jonathan> with it.

I did.  Thanks.

Tom


Re: Merge C++ conversion into trunk (0/6 - Overview)

2012-08-15 Thread Gabriel Dos Reis
On Wed, Aug 15, 2012 at 1:21 PM, Tom Tromey  wrote:
>> "Gaby" == Gabriel Dos Reis  writes:
>
> Tom> I asked Keith to resurrect his patch for this.
>
> Gaby> Since people are concerned about typing rules, would it
> Gaby> be an option for GDB to allow people to input pointer
> Gaby> literals with the "p" suffix (or "0p" prefix instead of "0x")?
>
> I think on the whole I'd rather have one extension instead of two.

That is a fair point :-)

>
> It seems to me that the above would still require extensions in the
> overloading code, to deal with conversions from void*; or perhaps some
> magic pointer type.
>
> What I think Keith is going to do is take his patch, change it so that
> int->pointer conversions (if the int != 0) are given a different badness
> from other conversions (meaning that, in theory, this should only be
> chosen as a last resort, and shouldn't interfere with ordinary uses),
> and finally add a parameter so that the feature can be disabled.
>
> I hope this will be acceptable all around.

OK, that sounds reasonable.

Thanks!

-- Gaby


Re: Merge C++ conversion into trunk (0/6 - Overview)

2012-08-15 Thread Tom Tromey
> "Gaby" == Gabriel Dos Reis  writes:

Tom> I asked Keith to resurrect his patch for this.

Gaby> Since people are concerned about typing rules, would it
Gaby> be an option for GDB to allow people to input pointer
Gaby> literals with the "p" suffix (or "0p" prefix instead of "0x")?

I think on the whole I'd rather have one extension instead of two.

It seems to me that the above would still require extensions in the
overloading code, to deal with conversions from void*; or perhaps some
magic pointer type.

What I think Keith is going to do is take his patch, change it so that
int->pointer conversions (if the int != 0) are given a different badness
from other conversions (meaning that, in theory, this should only be
chosen as a last resort, and shouldn't interfere with ordinary uses),
and finally add a parameter so that the feature can be disabled.

I hope this will be acceptable all around.

Tom


Re: Merge C++ conversion into trunk (0/6 - Overview)

2012-08-15 Thread Gabriel Dos Reis
On Wed, Aug 15, 2012 at 12:53 PM, Tom Tromey  wrote:
>> "Diego" == Diego Novillo  writes:
>
> Diego> GDB folks, would it be hard to figure out that there is a single
> Diego> variant of the called function and trust the user that they are
> Diego> passing the right pointer value?
>
> I asked Keith to resurrect his patch for this.

Since people are concerned about typing rules, would it
be an option for GDB to allow people to input pointer
literals with the "p" suffix (or "0p" prefix instead of "0x")?

-- Gaby


Re: Merge C++ conversion into trunk (0/6 - Overview)

2012-08-15 Thread Tom Tromey
> "Diego" == Diego Novillo  writes:

Diego> GDB folks, would it be hard to figure out that there is a single
Diego> variant of the called function and trust the user that they are
Diego> passing the right pointer value?

I asked Keith to resurrect his patch for this.

Tom


Re: [PATCH] PR target/53633; disable return value warnings for naked functions

2012-08-15 Thread H.J. Lu
On Wed, Jul 25, 2012 at 11:11 AM, Sandra Loosemore
 wrote:
> On 07/25/2012 09:57 AM, Richard Henderson wrote:
>>
>> I'll echo Nick's comments about arm asm in a common test.
>> There's no need to have anything but __asm__(""); there.
>>
>> Ok with that change.
>
> Thanks!  Here's the version I committed.
>
> -Sandra
>
>
> 2012-07-25  Sandra Loosemore  
> Paul Brook  
>
> PR target/53633
>
> gcc/
> * target.def (warn_func_return): New hook.
> * doc/tm.texi.in (TARGET_WARN_FUNC_RETURN): New hook.
> * doc/tm.texi: Regenerate.
> * doc/sourcebuild.texi (Effective-Target Keywords): Document
> naked_functions.
> * ipa-pure-const.c (warn_function_noreturn): Check
> targetm.warn_func_return.
> * tree-cfg.c (execute_warn_function_return): Likewise.

You didn't mention that you included "target.h" nor add "target.h"
dependency on tree-cfg.c.  I am checking in this as an obvious fix.

-- 
H.J.
--
diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 65a49d7..f2366ad 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -2363,7 +2363,7 @@ tree-vrp.o : tree-vrp.c $(CONFIG_H) $(SYSTEM_H)
coretypes.h $(TM_H) $(TREE_H) \
$(CFGLOOP_H) $(SCEV_H) intl.h \
$(GIMPLE_PRETTY_PRINT_H) gimple-fold.h $(OPTABS_H) $(EXPR_H)
 tree-cfg.o : tree-cfg.c $(TREE_FLOW_H) $(CONFIG_H) $(SYSTEM_H) \
-   $(TREE_H) $(TM_P_H) $(GGC_H) $(FLAGS_H) \
+   $(TREE_H) $(TM_P_H) $(GGC_H) $(FLAGS_H) $(TARGET_H) \
$(DIAGNOSTIC_CORE_H) $(FUNCTION_H) $(TM_H) coretypes.h \
$(TREE_DUMP_H) $(EXCEPT_H) $(CFGLOOP_H) $(TREE_PASS_H) \
$(BASIC_BLOCK_H) \


Re: Merge C++ conversion into trunk (0/6 - Overview)

2012-08-15 Thread Jakub Jelinek
On Wed, Aug 15, 2012 at 05:49:34PM +0200, Jan Kratochvil wrote:
> On Wed, 15 Aug 2012 17:44:32 +0200, Michael Matz wrote:
> > On Wed, 15 Aug 2012, Jan Kratochvil wrote:
> > It's not needless as the examples here show.  gdb is about helping people 
> > debug their stuff, not about language lawyering.
> 
> In such case there should be a GDB setting for it as at least from my opinion
> it complicates the debugging.  Then there may be different opinions what
> should be the default.

That would be fine for gcc development purposes.  We could switch that mode
on at the end of gcc .gdbinit .

Jakub


Re: [RFC PATCH] -Wsizeof-pointer-memaccess warning

2012-08-15 Thread Joseph S. Myers
On Wed, 15 Aug 2012, Jakub Jelinek wrote:

> I was mainly interested in whether such an approach is acceptable, or
> whether I need to stop evaluating sizeof right away, create SIZEOF_EXPR
> and only fold it during fully_fold*.  I've briefly looked at that today,

The approach is fine.  Delaying evaluating sizeof is hard simply because 
of the expectation that integer constant expressions in general are 
evaluated early.

-- 
Joseph S. Myers
jos...@codesourcery.com


[patch] PR54146 - rewrite rewrite_into_loop_closed_ssa

2012-08-15 Thread Steven Bosscher
Hello,

This patch rewrites the rewriting part of
rewrite_into_loop_closed_ssa. This function took ~300s on the
simplified test case for PR54146, walking around the many thousands of
loops in the >400,000 basic blocks in the CFG, in
compute_global_livein.

For rewriting into LC-SSA, we can do better than that if we use the
knowledge that we're actually computing livein for an SSA name "DEF",
therefore its uses must all be dominated by the basic block where DEF
is assigned a value. But walking up the dominator tree doesn't work
here because we want to traverse and mark the edges in the CFG where
DEF is live, and we may not see those edges if we walk up the
dominator tree from a USE to DEF. We do know, however, that when we
walk up the CFG then:

1. if we enter any loop nested in the same loop as the basic block
containing DEF (let's call it DEF_LOOP) then we can stop walking
because the only way in is through one of the edges we're interested
in.

2. if we enter a loop is not in the nesting stack of DEF_LOOP, then we
can skin straight to the header of the loop and continue looking up
from there.

3. if we reach a basic block X as a predecessor of Y and Y dominates X
then we don't have to look at X or any of its predecessors.

Putting this all together, you end up with what looks like a poor
man's form of structural analysis where the control tree only contains
loop nodes, non-loop basic blocks, and irreducible regions. (Honza:
maybe re-surrect
http://gcc.gnu.org/ml/gcc-patches/2003-06/msg01669.html? Now that we
can maintain the loop tree, perhaps it's also feasible to maintain the
complete control tree and use it in places where we want to analyze
only a sub-CFG?)

The effect is that the time spent in rewrite_into_loop_closed_ssa
drops to less than one second.

Bootstrapped&tested on powerpc64-unknown-linux-gnu. OK for trunk?

Ciao!
Steven


PR54146_lcssa.diff
Description: Binary data


Re: Merge C++ conversion into trunk (4/6 - hash table rewrite)

2012-08-15 Thread Richard Henderson
On 2012-08-15 07:29, Richard Guenther wrote:
> +   typedef typename Element::Element_t Element_t;

Can we use something less ugly than Element_t?
Such as

  typedef typename Element::T T;

?  Given that this name is scoped anyway...

r~


Re: Merge C++ conversion into trunk (0/6 - Overview)

2012-08-15 Thread Diego Novillo

On 12-08-15 11:44 , Michael Matz wrote:

Hi,

On Wed, 15 Aug 2012, Jan Kratochvil wrote:


It is a needless violation of C++ resolving rules.


It's not needless as the examples here show.  gdb is about helping people
debug their stuff, not about language lawyering.


Agreed.  If I'm passing a numeric pointer constant, I'm already past the 
bleeding edge.  I don't want gdb to get in the way.  If the inferior 
call crashes, I get to keep both pieces.



Thanks.  Diego.



Re: [RFC PATCH] -Wsizeof-pointer-memaccess warning

2012-08-15 Thread Jakub Jelinek
On Wed, Aug 15, 2012 at 03:39:29PM +, Joseph S. Myers wrote:
> On Wed, 18 Jul 2012, Jakub Jelinek wrote:
> 
> > + if (warn_sizeof_pointer_memaccess
> > + && sizeof_arg != NULL_TREE)
> > +   sizeof_pointer_memaccess_warning (c_last_sizeof_arg_loc,
> > + expr.value, exprlist,
> > + sizeof_arg,
> > + sizeof_ptr_memacc_comptypes);
> 
> Why do you pass a local variable sizeof_arg but a global 
> c_last_sizeof_arg_loc here?  I'd have expected the same approach to apply 
> for both arguments (e.g. passing down a pointer to a location alongside 
> &sizeof_arg).

I guess I could pass it the same way as sizeof_arg or stick the location_t
into sizeof_arg (surround the value into NOP_EXPR with the location_t).

I was mainly interested in whether such an approach is acceptable, or
whether I need to stop evaluating sizeof right away, create SIZEOF_EXPR
and only fold it during fully_fold*.  I've briefly looked at that today,
and see several problems with that though: 1) fully_fold* is in c-family/,
so can't use e.g. c_vla_type_p (at least not easily, it could use a
langhook) 2) c_expr_sizeof_{expr,type} currently calls pop_maybe_used,
I don't see how that could be easily deferred till later when the expression
is fully folded.  So, we might need to do what c_expr_sizeof_{expr,type}
does right now (including fully folding), to call pop_maybe_used with the
right argument, and then either create SIZEOF_EXPR with the original
argument and fully fold it again during fully folding of the SIZEOF_EXPR,
or add an argument to SIZEOF_EXPR which would contain the folded expression
(i.e. have two arguments, the original expression and the value of the
sizeof).

Jakub


Re: Merge C++ conversion into trunk (0/6 - Overview)

2012-08-15 Thread Jan Kratochvil
Hi,

On Wed, 15 Aug 2012 17:44:32 +0200, Michael Matz wrote:
> On Wed, 15 Aug 2012, Jan Kratochvil wrote:
> It's not needless as the examples here show.  gdb is about helping people 
> debug their stuff, not about language lawyering.

In such case there should be a GDB setting for it as at least from my opinion
it complicates the debugging.  Then there may be different opinions what
should be the default.


Thanks,
Jan


Re: Merge C++ conversion into trunk (0/6 - Overview)

2012-08-15 Thread Michael Matz
Hi,

On Wed, 15 Aug 2012, Jan Kratochvil wrote:

> It is a needless violation of C++ resolving rules.

It's not needless as the examples here show.  gdb is about helping people 
debug their stuff, not about language lawyering.

> There are various easy way how to get it working (in .gdbinit or 
> cc1-gdb.gdb define GDB function, define macro in GDB, use GDB python 
> pretty printer instead (possibly even calling GCC inferior function) 

We should define gdb macros for every not-overloaded function (which are 
_all_ GCC functions currently)?  Doesn't scale.


Ciao,
Michael.


Re: [RFC PATCH] -Wsizeof-pointer-memaccess warning

2012-08-15 Thread Joseph S. Myers
On Wed, 18 Jul 2012, Jakub Jelinek wrote:

> +   if (warn_sizeof_pointer_memaccess
> +   && sizeof_arg != NULL_TREE)
> + sizeof_pointer_memaccess_warning (c_last_sizeof_arg_loc,
> +   expr.value, exprlist,
> +   sizeof_arg,
> +   sizeof_ptr_memacc_comptypes);

Why do you pass a local variable sizeof_arg but a global 
c_last_sizeof_arg_loc here?  I'd have expected the same approach to apply 
for both arguments (e.g. passing down a pointer to a location alongside 
&sizeof_arg).

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH 2/3] Incorporate aggregate jump functions into inlining analysis

2012-08-15 Thread Martin Jambor
Hi,

On Fri, Aug 10, 2012 at 05:12:31AM +0200, Jan Hubicka wrote:
> 
> Do you have any data on memory usage?  I was originally concerned
> about memory use of the whole predicate thingy on WPA level.
> Eventually we could add simple inheritance on conditions and sort
> them into mutiple vectors if needed. But I assume it is OK or we
> will work out on Mozilla builds soonish.
> 

When I have completely disabled computation of jump functions, the
number of expression trees reported by detailed -fmem-report in the
WPA stage dropped from 2253872 to 2124732, i.e. by 5.7%.

Martin



Re: [PATCH] Set current_function_decl in {push,pop}_cfun and push_struct_function

2012-08-15 Thread Martin Jambor
Hi,

On Fri, Aug 10, 2012 at 04:57:41PM +0200, Eric Botcazou wrote:
> > - ada/gcc-interface/utils.c:rest_of_subprog_body_compilation calls
> >   dump_function which in turns calls dump_function_to_file which calls
> >   push_cfun.  But Ada front end has its idea of the
> >   current_function_decl and there is no cfun which is an inconsistency
> >   which makes push_cfun assert fail.  I "solved" it by temporarily
> >   setting current_function_decl to NULL_TREE.  It's just dumping and I
> >   thought that dump_function should be considered middle-end and thus
> >   middle-end invariants should apply.
> 
> If you think that calling dump_function from rest_of_subprog_body_compilation 
> is a layering violation, I don't have a problem with replacing it with a more 
> "manual" scheme like the one in c-family/c-gimplify.c:c_genericize, provided 
> that this yields roughly the same output.

Richi suggested on IRC that I remove the push/pop_cfun calls from
dump_function_to_file.  The only problem seems to be
dump_histograms_for_stmt (buried in a dump_bb call) which needs a
function to get the value histograms.  Richi suggested to replace the
per function hash with one global one.  I tried that and bumped into
two problems: First, I did not really know when to deallocate the
global hash and all the histograms.  More importantly, the current
gimple verifier (verify_gimple_in_cfg) checks that there are no "dead
histograms" in the hash corresponding to statements no longer in the
function.  With one global hash, it would be very cumbersome to do
this check (add cfun to each histogram?  only when checking is
enabled?).  Hash tables are now in flux anyway so I postponed that
work until the interface settles down.  Nevertheless, I guess I need a
confirmation from maintainers that I can remove the checking if I
continue this way.

When I looked at what would need to be changed if I added a function
parameter to dump_bb, the work list grew very large very quickly.  It
is probably doable, but the change will be quite big, so if anyone
thinks it is a bad idea, please email me to stop.

If I manage to change the dumping in one of the two ways, the call
will not be a layering problem any more as it probably should not be.
At the moment, it probably is.  We'll see.

Thanks,

Martin


Re: Merge C++ conversion into trunk (0/6 - Overview)

2012-08-15 Thread Jan Kratochvil
On Wed, 15 Aug 2012 14:23:37 +0200, Diego Novillo wrote:
> GDB folks, would it be hard to figure out that there is a single
> variant of the called function and trust the user that they are
> passing the right pointer value?

It is a needless violation of C++ resolving rules.  There are various easy way
how to get it working (in .gdbinit or cc1-gdb.gdb define GDB function, define
macro in GDB, use GDB python pretty printer instead (possibly even calling GCC
inferior function) etc.).

While I did not post such patch yet I would prefer to even forbid by default
expressions like
(gdb) print *0x1234567
and enforce
(gdb) print *(int *)0x1234567
instead as the former syntax confuses people (as commonly seen on IRC), the
same applies to calling functions without full debuginfo (they should require
an explicit cast in GDB) etc.


Regards,
Jan


Re: C++ PR 54197: lifetime of reference not properly extended

2012-08-15 Thread Ollie Wild
(Adding other C++ maintainers in case someone else wants to have a stab.)

Ping?

Ollie


On Mon, Aug 13, 2012 at 4:01 PM, Ollie Wild  wrote:
>
> On Mon, Aug 13, 2012 at 3:50 PM, Jakub Jelinek  wrote:
> >
> > The formatting doesn't match GCC coding conventions in several ways.
> > You don't have spaces before (, and ( shouldn't be at the end of line if
> > possible.
>
> Updated patch attached.
>
> Ollie


Re: Merge C++ conversion into trunk (2/6 - VEC rewrite)

2012-08-15 Thread Diego Novillo

On 12-08-15 10:44 , Eric Botcazou wrote:

This implements the VEC re-write.

See http://gcc.gnu.org/ml/gcc-patches/2012-08/msg00711.html for
details.


You didn't update the head comment in vec.h though, is that on purpose?


Yes.  I am updating it now that I'm *really* changing the interface.


Diego.



Re: RFC: fix std::unique_ptr pretty-printer

2012-08-15 Thread Jonathan Wakely
On 14 August 2012 15:44, Tom Tromey wrote:
>> "Jonathan" == Jonathan Wakely  writes:
>
> Jonathan> I prefer it as unique_ptr but I'm probably not your typical
> Jonathan> user of the pretty printers, so if anyone else has an opinion please
> Jonathan> share it.
>
> I prefer it too.  Here's the updated patch.  Let me know what you think.

I like it, please go ahead and check that in it you're happy with it.


Re: Merge C++ conversion into trunk (2/6 - VEC rewrite)

2012-08-15 Thread Eric Botcazou
> This implements the VEC re-write.
> 
> See http://gcc.gnu.org/ml/gcc-patches/2012-08/msg00711.html for
> details.

You didn't update the head comment in vec.h though, is that on purpose?

-- 
Eric Botcazou


Re: Merge C++ conversion into trunk (4/6 - hash table rewrite)

2012-08-15 Thread Richard Guenther
On Wed, 15 Aug 2012, Michael Matz wrote:

> Hi,
> 
> On Wed, 15 Aug 2012, Richard Guenther wrote:
> 
> > Like the following, only the coverage.c use is converted.  I've never 
> > seen template function arguments anywhere and having to repeat them all 
> > over the place is really really ugly (yes, even if only in the 
> > implementation).
> > 
> > This goes with static member functions and not wrapping any data. It 
> > goes away with the requirement of having externally visible global 
> > functions for the hash/compare functions as well (which was ugly 
> > anyway).
> > 
> > Comments?
> 
> Well, it looks nicer than what's there currently.  As the element 
> functions now are scoped and normal member functions, they should be named 
> with lower case characters of course.  I do like that the hash table would 
> then only have one real template argument; the Allocator, well, no better 
> idea comes to my mind.

Yeah.  Updated patch below, all users converted.  I probably missed
to revert some of the globalizations of hash/compare fns.

Comments?
Richard.

Index: gcc/hash-table.h
===
*** gcc/hash-table.h.orig   2012-08-15 15:39:34.0 +0200
--- gcc/hash-table.h2012-08-15 16:17:06.613628039 +0200
*** xcallocator ::data_free (Type *mem
*** 83,127 
  }
  
  
- /* A common function for hashing a CANDIDATE typed pointer.  */
- 
  template 
! inline hashval_t
! typed_pointer_hash (const Element *candidate)
  {
!   /* This is a really poor hash function, but it is what the current code 
uses,
!  so I am reusing it to avoid an additional axis in testing.  */
!   return (hashval_t) ((intptr_t)candidate >> 3);
! }
  
  
- /* A common function for comparing an EXISTING and CANDIDATE typed pointers
-for equality. */
  
  template 
! inline int
! typed_pointer_equal (const Element *existing, const Element * candidate)
  {
!   return existing == candidate;
! }
  
  
! /* A common function for doing nothing on removing a RETIRED slot.  */
  
  template 
! inline void
! typed_null_remove (Element *retired ATTRIBUTE_UNUSED)
  {
  }
  
- 
- /* A common function for using free on removing a RETIRED slot.  */
- 
  template 
! inline void
! typed_free_remove (Element *retired)
  {
!   free (retired);
  }
  
  
--- 83,132 
  }
  
  
  template 
! class typed_free_remove
  {
! public:
!   static inline void remove (Element *p) { free (p); }
! };
  
+ template 
+ class typed_noop_remove
+ {
+ public:
+   static inline void remove (Element *) {}
+ };
  
  
+ /* Pointer hash.  */
  template 
! class pointer_hash : public typed_noop_remove 
  {
! public:
!   typedef Element Element_t;
  
+   static inline hashval_t
+   hash (const Element_t *);
  
!   static inline int
!   equal (const Element_t *existing, const Element_t * candidate);
! };
  
  template 
! inline hashval_t
! pointer_hash::hash (const Element_t *candidate)
  {
+   /* This is a really poor hash function, but it is what the current code 
uses,
+  so I am reusing it to avoid an additional axis in testing.  */
+   return (hashval_t) ((intptr_t)candidate >> 3);
  }
  
  template 
! inline int
! pointer_hash::equal (const Element_t *existing,
! const Element_t *candidate)
  {
!   return existing == candidate;
  }
  
  
*** struct hash_table_control
*** 180,194 
  
 The table stores elements of type Element.
  
!It hashes elements with the Hash function.
   The table currently works with relatively weak hash functions.
   Use typed_pointer_hash  when hashing pointers instead of 
objects.
  
!It compares elements with the Equal function.
   Two elements with the same hash may not be equal.
   Use typed_pointer_equal  when hashing pointers instead of 
objects.
  
!It removes elements with the Remove function.
   This feature is useful for freeing memory.
   Use typed_null_remove  when not freeing objects.
   Use typed_free_remove  when doing a simple object free.
--- 185,199 
  
 The table stores elements of type Element.
  
!It hashes elements with the hash function.
   The table currently works with relatively weak hash functions.
   Use typed_pointer_hash  when hashing pointers instead of 
objects.
  
!It compares elements with the equal function.
   Two elements with the same hash may not be equal.
   Use typed_pointer_equal  when hashing pointers instead of 
objects.
  
!It removes elements with the remove function.
   This feature is useful for freeing memory.
   Use typed_null_remove  when not freeing objects.
   Use typed_free_remove  when doing a simple object free.
*** struct hash_table_control
*** 199,243 
  */
  
  template  class Allocator = xcallocator>
  class hash_table
  {
  
  private:
  
!   hash_table_control  *htab;
! 
!   Element **find_empty_slot_for_expand (hashval_t hash);
void expand ();

Re: Merge C++ conversion into trunk (0/6 - Overview)

2012-08-15 Thread Mike Stump
On Aug 15, 2012, at 4:59 AM, Richard Guenther wrote:
> and debugging becomes a nightmare (hello gdb people!)

> (gdb) call debug_tree (0x767fa5e8)
> Cannot resolve function debug_tree to any overloaded instance

Inquiring minds want to know if:

macro define debug_tree(A) ((tree)A)

makes the above work if you put it into gdbinit.in?  If yes, then, I think for 
now we should add such lines for every non-overloaded function, I too like 
using hex constants from dumps.  Oh, or, maybe we just add a debug_tree(long) 
overload just to satisfy gdb.


Re: Merge C++ conversion into trunk (4/6 - hash table rewrite)

2012-08-15 Thread Michael Matz
Hi,

On Wed, 15 Aug 2012, Richard Guenther wrote:

> Like the following, only the coverage.c use is converted.  I've never 
> seen template function arguments anywhere and having to repeat them all 
> over the place is really really ugly (yes, even if only in the 
> implementation).
> 
> This goes with static member functions and not wrapping any data. It 
> goes away with the requirement of having externally visible global 
> functions for the hash/compare functions as well (which was ugly 
> anyway).
> 
> Comments?

Well, it looks nicer than what's there currently.  As the element 
functions now are scoped and normal member functions, they should be named 
with lower case characters of course.  I do like that the hash table would 
then only have one real template argument; the Allocator, well, no better 
idea comes to my mind.


Ciao,
Michael.


Re: Merge C++ conversion into trunk (4/6 - hash table rewrite)

2012-08-15 Thread Richard Guenther
On Wed, 15 Aug 2012, Richard Guenther wrote:

> On Sun, 12 Aug 2012, Diego Novillo wrote:
> 
> > This implements a new C++ hash table.
> > 
> > See http://gcc.gnu.org/ml/gcc-patches/2012-08/msg00711.html for
> > details.
> > 
> > Diego.
> 
> Now as we see the result I'd have prefered a more C++-way instead
> of making the conversion simple ...
> 
> Like
> 
> template 
> class hash_table
> {
> ...
> };
> 
> template 
> class pointer_hash
> {
>   hashval_t hash ();
>   int equal (const Element *);
>   ~Element ();
>   Element e;
> };
> 
> and
> 
> /* Hash table with all same_succ entries.  */
> 
> static hash_table  > same_succ_htab;
> 
> The existing way is simply too ugly ... so, why did you not invent
> a "nice" C++ way?

Like the following, only the coverage.c use is converted.  I've
never seen template function arguments anywhere and having to repeat
them all over the place is really really ugly (yes, even if only in
the implementation).

This goes with static member functions and not wrapping any data.
It goes away with the requirement of having externally visible
global functions for the hash/compare functions as well (which was
ugly anyway).

Comments?

Thanks,
Richard.

> diffstat ~/p
 coverage.c   |   43 +-
 hash-table.h |  234 
+--
 2 files changed, 108 insertions(+), 169 deletions(-)


Index: gcc/hash-table.h
===
--- gcc/hash-table.h(revision 190410)
+++ gcc/hash-table.h(working copy)
@@ -199,45 +199,42 @@ struct hash_table_control
 */
 
 template  class Allocator = xcallocator>
 class hash_table
 {
+public:
+  typedef typename Element::Element_t Element_t;
 
 private:
+  hash_table_control  *htab;
 
-  hash_table_control  *htab;
-
-  Element **find_empty_slot_for_expand (hashval_t hash);
+  Element_t **find_empty_slot_for_expand (hashval_t hash);
   void expand ();
 
 public:
-
   hash_table ();
   void create (size_t initial_slots);
   bool is_created ();
   void dispose ();
-  Element *find (Element *comparable);
-  Element *find_with_hash (Element *comparable, hashval_t hash);
-  Element **find_slot (Element *comparable, enum insert_option insert);
-  Element **find_slot_with_hash (Element *comparable, hashval_t hash,
-enum insert_option insert);
+  Element_t *find (Element_t *comparable);
+  Element_t *find_with_hash (Element_t *comparable, hashval_t hash);
+  Element_t **find_slot (Element_t *comparable, enum insert_option insert);
+  Element_t **find_slot_with_hash (Element_t *comparable, hashval_t hash,
+  enum insert_option insert);
   void empty ();
-  void clear_slot (Element **slot);
-  void remove_elt (Element *comparable);
-  void remove_elt_with_hash (Element *comparable, hashval_t hash);
+  void clear_slot (Element_t **slot);
+  void remove_elt (Element_t *comparable);
+  void remove_elt_with_hash (Element_t *comparable, hashval_t hash);
   size_t size();
   size_t elements();
   double collisions();
 
   template 
+   int (*Callback) (Element_t **slot, Argument argument)>
   void traverse_noresize (Argument argument);
 
   template 
+   int (*Callback) (Element_t **slot, Argument argument)>
   void traverse (Argument argument);
 };
 
@@ -245,12 +242,9 @@ public:
 /* Construct the hash table.  The only useful operation next is create.  */
 
 template  class Allocator>
 inline
-hash_table ::hash_table ()
+hash_table ::hash_table ()
 : htab (NULL)
 {
 }
@@ -259,12 +253,9 @@ hash_table  class Allocator>
 inline bool
-hash_table ::is_created ()
+hash_table ::is_created ()
 {
   return htab != NULL;
 }
@@ -273,56 +264,44 @@ hash_table  class Allocator>
-inline Element *
-hash_table ::find (Element 
*comparable)
+inline typename Element::Element_t *
+hash_table ::find (Element_t *comparable)
 {
-  return find_with_hash (comparable, Hash (comparable));
+  return find_with_hash (comparable, Element::Hash (comparable));
 }
 
 
 /* Like find_slot_with_hash, but compute the hash value from the element.  */
 
 template  class Allocator>
-inline Element **
-hash_table 
-::find_slot (Element *comparable, enum insert_option insert)
+ template  class Allocator>
+inline typename Element::Element_t **
+hash_table 
+::find_slot (Element_t *comparable, enum insert_option insert)
 {
-  return find_slot_with_hash (comparable, Hash (comparable), insert);
+  return find_slot_with_hash (comparable, Element::Hash (comparable), insert);
 }
 
 
 /* Like remove_elt_with_hash, but compute the hash value from the element.  */
 
 template  class Allocator>
 inline void
-hash_table 
-::remove_elt (Element *comparable)
+hash_table 
+::remove_elt (Element_t *comparable)
 {
-  remove_elt_with_hash (comparable, Hash (comparable));
+  remove_elt_with_hash (comparable, Element::Hash (comparable));
 }
 
 
 /* Return the current size of this hash table.  */
 
 template  class Allocator>
 inline size_t

Re: combine permutations in gimple

2012-08-15 Thread Richard Guenther
On Wed, Aug 15, 2012 at 2:53 PM, Jakub Jelinek  wrote:
> On Wed, Aug 15, 2012 at 02:36:54PM +0200, Richard Guenther wrote:
>> On Wed, Aug 15, 2012 at 2:29 PM, Jakub Jelinek  wrote:
>> > On Wed, Aug 15, 2012 at 02:15:03PM +0200, Richard Guenther wrote:
>> >> Ok.  That would still leave us with the issue Ramana brought up - the
>> >> target hook returning true unconditionally if a generic permute is 
>> >> implemented.
>> >> We just avoid generic expansion by tree-vect-generic.c that way.
>> >
>> > Yeah, if there is a generic permute, can_vec_perm_p will return true always
>> > for the modes for which it is available.  Which is why I've been suggesting
>> > also the target hook which should return false if only generic permute is
>> > going to be used.
>>
>> Well.  What about returning a cost instead?  We don't want to transform
>> two insns to four either.
>
> That could work, it would just be a lot of work to change it (and more
> costly).  E.g. on i386 for 16-byte vectors with certain ISAs we return true
> right away, when returning cost we'd need to go the slow path always
> even when the caller is just interested in a boolean flag whether it is
> possible or not.
>
> CCing rth as the author of the hooks.

We'd want more accurate costs for the vectorizer cost model anyways.

Richard.

> Jakub


Re: combine permutations in gimple

2012-08-15 Thread Jakub Jelinek
On Wed, Aug 15, 2012 at 02:36:54PM +0200, Richard Guenther wrote:
> On Wed, Aug 15, 2012 at 2:29 PM, Jakub Jelinek  wrote:
> > On Wed, Aug 15, 2012 at 02:15:03PM +0200, Richard Guenther wrote:
> >> Ok.  That would still leave us with the issue Ramana brought up - the
> >> target hook returning true unconditionally if a generic permute is 
> >> implemented.
> >> We just avoid generic expansion by tree-vect-generic.c that way.
> >
> > Yeah, if there is a generic permute, can_vec_perm_p will return true always
> > for the modes for which it is available.  Which is why I've been suggesting
> > also the target hook which should return false if only generic permute is
> > going to be used.
> 
> Well.  What about returning a cost instead?  We don't want to transform
> two insns to four either.

That could work, it would just be a lot of work to change it (and more
costly).  E.g. on i386 for 16-byte vectors with certain ISAs we return true
right away, when returning cost we'd need to go the slow path always
even when the caller is just interested in a boolean flag whether it is
possible or not.

CCing rth as the author of the hooks.

Jakub


Re: combine permutations in gimple

2012-08-15 Thread Richard Guenther
On Wed, Aug 15, 2012 at 2:29 PM, Jakub Jelinek  wrote:
> On Wed, Aug 15, 2012 at 02:15:03PM +0200, Richard Guenther wrote:
>> Ok.  That would still leave us with the issue Ramana brought up - the
>> target hook returning true unconditionally if a generic permute is 
>> implemented.
>> We just avoid generic expansion by tree-vect-generic.c that way.
>
> Yeah, if there is a generic permute, can_vec_perm_p will return true always
> for the modes for which it is available.  Which is why I've been suggesting
> also the target hook which should return false if only generic permute is
> going to be used.

Well.  What about returning a cost instead?  We don't want to transform
two insns to four either.

Richard.

> Jakub


Re: combine permutations in gimple

2012-08-15 Thread Jakub Jelinek
On Wed, Aug 15, 2012 at 02:15:03PM +0200, Richard Guenther wrote:
> Ok.  That would still leave us with the issue Ramana brought up - the
> target hook returning true unconditionally if a generic permute is 
> implemented.
> We just avoid generic expansion by tree-vect-generic.c that way.

Yeah, if there is a generic permute, can_vec_perm_p will return true always
for the modes for which it is available.  Which is why I've been suggesting
also the target hook which should return false if only generic permute is
going to be used.

Jakub


Re: Merge C++ conversion into trunk (0/6 - Overview)

2012-08-15 Thread Diego Novillo

On 12-08-15 08:18 , Richard Guenther wrote:


0 is fixed if you have recent enough gdb.

(gdb) call debug_tree (0)

as 0 is a null pointer constant.


Oh, cool.  Progress.

GDB folks, would it be hard to figure out that there is a single variant 
of the called function and trust the user that they are passing the 
right pointer value?



Diego.


Re: combine permutations in gimple

2012-08-15 Thread Ramana Radhakrishnan
On 15 August 2012 13:07, Jakub Jelinek  wrote:
> On Wed, Aug 15, 2012 at 01:46:05PM +0200, Richard Guenther wrote:
>> Well, we're waiting for someone to break the tie ... I'd go with the original
>> patch, improving the backends where necessary.
>
> E.g. i?86/x86_64 with just plain -msse2 has only very small subset of
> constant shuffles (and no variable shuffle), so by doing the transformation
> you could end up with a scalarized shuffle instead of two constant vector
> shuffles.  Expecting the backend to figure it out and doing the two constant
> vector shuffles in every case is not realistic, i386/x86_64 has way too many
> different shuffling instructions (and worse different ISA levels have
> different subsets of them) and while a lot of effort has been spent on
> it already by Richard, me, Marc and others, we are nowhere close to having
> optimal sequence in many cases for various modes and ISA levels.  Doing a
> brute force might be too expensive.
>

Neon has atleast 7-8 such forms. Brute force computing this would be expensive.

I don't know what happens on other vector targets - surely other SIMD
targets in GCC will also have the same problem.

Ramana

> Jakub


Re: Merge C++ conversion into trunk (0/6 - Overview)

2012-08-15 Thread Richard Guenther
On Wed, 15 Aug 2012, Diego Novillo wrote:

> On 12-08-15 07:59 , Richard Guenther wrote:
> 
> > (gdb) call debug_tree (*expr_p)
> >   
> > constant 9>
> > (gdb) call debug_tree (0x767fa5e8)
> > Cannot resolve function debug_tree to any overloaded instance
> 
> Yeah, in the absence of overloads this is annoying.  Happens with 0, too.

0 is fixed if you have recent enough gdb.

(gdb) call debug_tree (0)

as 0 is a null pointer constant.

Richard.


Re: Merge C++ conversion into trunk (0/6 - Overview)

2012-08-15 Thread Diego Novillo

On 12-08-15 07:59 , Richard Guenther wrote:


(gdb) call debug_tree (*expr_p)
  
constant 9>
(gdb) call debug_tree (0x767fa5e8)
Cannot resolve function debug_tree to any overloaded instance


Yeah, in the absence of overloads this is annoying.  Happens with 0, too.


Diego.


Re: combine permutations in gimple

2012-08-15 Thread Richard Guenther
On Wed, Aug 15, 2012 at 2:07 PM, Jakub Jelinek  wrote:
> On Wed, Aug 15, 2012 at 01:46:05PM +0200, Richard Guenther wrote:
>> Well, we're waiting for someone to break the tie ... I'd go with the original
>> patch, improving the backends where necessary.
>
> E.g. i?86/x86_64 with just plain -msse2 has only very small subset of
> constant shuffles (and no variable shuffle), so by doing the transformation
> you could end up with a scalarized shuffle instead of two constant vector
> shuffles.  Expecting the backend to figure it out and doing the two constant
> vector shuffles in every case is not realistic, i386/x86_64 has way too many
> different shuffling instructions (and worse different ISA levels have
> different subsets of them) and while a lot of effort has been spent on
> it already by Richard, me, Marc and others, we are nowhere close to having
> optimal sequence in many cases for various modes and ISA levels.  Doing a
> brute force might be too expensive.

Ok.  That would still leave us with the issue Ramana brought up - the
target hook returning true unconditionally if a generic permute is implemented.
We just avoid generic expansion by tree-vect-generic.c that way.

(I wonder if there exists some automated machinery that finds a path using
existing instructions to shuffle from {0, 1, ..., n } to the mask and if that
would be a viable way to handle shuffle expansions, or pre-compute them all)

Richard.

> Jakub


Re: combine permutations in gimple

2012-08-15 Thread Jakub Jelinek
On Wed, Aug 15, 2012 at 01:46:05PM +0200, Richard Guenther wrote:
> Well, we're waiting for someone to break the tie ... I'd go with the original
> patch, improving the backends where necessary.

E.g. i?86/x86_64 with just plain -msse2 has only very small subset of
constant shuffles (and no variable shuffle), so by doing the transformation
you could end up with a scalarized shuffle instead of two constant vector
shuffles.  Expecting the backend to figure it out and doing the two constant
vector shuffles in every case is not realistic, i386/x86_64 has way too many
different shuffling instructions (and worse different ISA levels have
different subsets of them) and while a lot of effort has been spent on
it already by Richard, me, Marc and others, we are nowhere close to having
optimal sequence in many cases for various modes and ISA levels.  Doing a
brute force might be too expensive.

Jakub


Re: [rfc] Fix SPU build (Re: [PATCH] Remove basic_block->loop_depth)

2012-08-15 Thread Richard Guenther
On Wed, 15 Aug 2012, Ulrich Weigand wrote:

> Richard Guenther wrote:
> > On Tue, 14 Aug 2012, Ulrich Weigand wrote:
> > > Looks like this broke SPU build, since spu_machine_dependent_reorg
> > > accesses ->loop_depth.  According to comments in the code, this
> > > was done because of concerns that loop_father may no longer be set up
> > > this late in compilation, so I'm wondering whether just replacing
> > > this by loop_depth (bb->loop_father) would work here ...
> 
> > If SPU md reorg would like to look at loop structures it should
> > compute them.  Simply call flow_loops_find, which hopefully works
> > in CFG RTL mode (which I think is the mode available from md reorg?).
> 
> It seems flow_loops_find by itself is not quite enough, but everything
> necessary to use the loop structures seems to be encapsulated in
> loop_optimizer_init / loop_optimizer_finalize, which are already called
> by a number of optimization passes.  When I do the same in the SPU
> md-reorg pass, loop_father seems to be set up OK.
> 
> Does this look reasonable?

Yes.  You can use bb_loop_depth instead of loop_depth (bb->loop_father).

Richard.

> Thanks,
> Ulrich
> 
> 
> ChangeLog:
> 
>   * config/spu/spu.c: Include "cfgloop.h".
>   (spu_machine_dependent_reorg): Call loop_optimizer_init and
>   loop_optimizer_finalize.  Use loop_father instead of loop_depth.
>   * config/spu/t-spu-elf (spu.o): Update dependencies.
> 
> Index: gcc/config/spu/spu.c
> ===
> *** gcc/config/spu/spu.c  (revision 190390)
> --- gcc/config/spu/spu.c  (working copy)
> ***
> *** 53,58 
> --- 53,59 
>   #include "timevar.h"
>   #include "df.h"
>   #include "dumpfile.h"
> + #include "cfgloop.h"
>   
>   /* Builtin types, data and prototypes. */
>   
> *** spu_machine_dependent_reorg (void)
> *** 2458,2463 
> --- 2459,2468 
> in_spu_reorg = 1;
> compute_bb_for_insn ();
>   
> +   /* (Re-)discover loops so that bb->loop_father can be used
> +  in the analysis below.  */
> +   loop_optimizer_init (AVOID_CFG_MODIFICATIONS);
> + 
> compact_blocks ();
>   
> spu_bb_info =
> *** spu_machine_dependent_reorg (void)
> *** 2562,2575 
>fallthru block. This catches the cases when it is a simple
>loop or when there is an initial branch into the loop. */
> if (prev && (loop_exit || simple_loop)
> !   && prev->loop_depth <= bb->loop_depth)
>   prop = prev;
>   
> /* If there is only one adjacent predecessor.  Don't propagate
> !  outside this loop.  This loop_depth test isn't perfect, but
> !  I'm not sure the loop_father member is valid at this point.  */
> else if (prev && single_pred_p (bb)
> !&& prev->loop_depth == bb->loop_depth)
>   prop = prev;
>   
> /* If this is the JOIN block of a simple IF-THEN then
> --- 2567,2580 
>fallthru block. This catches the cases when it is a simple
>loop or when there is an initial branch into the loop. */
> if (prev && (loop_exit || simple_loop)
> !   && (loop_depth (prev->loop_father)
> !   <= loop_depth (bb->loop_father)))
>   prop = prev;
>   
> /* If there is only one adjacent predecessor.  Don't propagate
> !  outside this loop.  */
> else if (prev && single_pred_p (bb)
> !&& prev->loop_father == bb->loop_father)
>   prop = prev;
>   
> /* If this is the JOIN block of a simple IF-THEN then
> *** spu_machine_dependent_reorg (void)
> *** 2578,2584 
>  && EDGE_COUNT (bb->preds) == 2
>  && EDGE_COUNT (prev->preds) == 1
>  && EDGE_PRED (prev, 0)->src == prev2
> !&& prev2->loop_depth == bb->loop_depth
>  && GET_CODE (branch_target) != REG)
>   prop = prev;
>   
> --- 2583,2589 
>  && EDGE_COUNT (bb->preds) == 2
>  && EDGE_COUNT (prev->preds) == 1
>  && EDGE_PRED (prev, 0)->src == prev2
> !&& prev2->loop_father == bb->loop_father
>  && GET_CODE (branch_target) != REG)
>   prop = prev;
>   
> *** spu_machine_dependent_reorg (void)
> *** 2600,2606 
> if (dump_file)
>   fprintf (dump_file, "propagate from %i to %i (loop depth %i) "
>"for %i (loop_exit %i simple_loop %i dist %i)\n",
> !  bb->index, prop->index, bb->loop_depth,
>INSN_UID (branch), loop_exit, simple_loop,
>branch_addr - INSN_ADDRESSES (INSN_UID (bbend)));
>   
> --- 2605,2611 
> if (dump_file)
>   fprintf (dump_file, "propagate from %i to %i (loop depth %i) "
>"for %i (loop_exit %i simple_loop %i dist %i)\n",
> !

Re: combine permutations in gimple

2012-08-15 Thread Richard Guenther
On Wed, Aug 15, 2012 at 1:56 PM, Ramana Radhakrishnan
 wrote:
> [It looks like I missed hitting the send button on this response]
>
>>
>> Seems to be one instruction shorter at least ;-) Yes, there can be much
>> worse regressions than that because of the patch (like 40 instructions
>> instead of 4, in the x86 backend).
>
> If this is replacing 4 instructions with 40 in x86 backend maybe
> someone will notice :)
>
> Not a win in this particular testcase  because the compiler replaces 2
> constant permutes ( that's about 4 cycles) with a load from the
> constant pool , a generic permute and in addition are polluting the
> icache with guff in the constant pool .  If you go to 3 -4 permutes
> into a single one then it might be a win but not till that point.
>
>
>> with a-b without first asking the backend whether it might be more
>> efficient. One permutation is better than 2.
>>  It just happens that the range
>> of possible permutations is too large (and the basic instructions are too
>> strange) for backends to do a good job on them, and thus keeping toplevel
>> input as a hint is important.
>
> Of-course, the problem here is this change of semantics with the hook
> TARGET_VEC_PERM_CONST_OK. Targets were expanding to generic permutes
> with constants in the *absence* of being able to deal with them with
> the specialized permutes.  fwprop will now leave us at a point where
> each target has to now grow more knowledge with respect to how best to
> expand a generic constant permute with a sequence of permutes rather
> than just using the generic permute.
>
> Generating a sequence of permutes from a single constant permute will
> be a  harder problem than (say) dealing with immediate expansions so
> you are pushing more complexity into the target but in the short term
> target maintainers should definitely have a heads up that folks using
> vector permute intrinsics could actually have performance regressions
> on their targets.

It's of course the same with the user input containing such a non-optimal
handled constant permute.  So I'm less convinced that it's too much burden
on the target side.  OTOH if there is a generic kind of shuffles that targets
do not implement directly but can be simulated by two that are directly
implemented pushing the logic to the expander (and adjusting the target
hook semantic) would be ok.

Richard.

> Thanks,
> Ramana


[rfc] Fix SPU build (Re: [PATCH] Remove basic_block->loop_depth)

2012-08-15 Thread Ulrich Weigand
Richard Guenther wrote:
> On Tue, 14 Aug 2012, Ulrich Weigand wrote:
> > Looks like this broke SPU build, since spu_machine_dependent_reorg
> > accesses ->loop_depth.  According to comments in the code, this
> > was done because of concerns that loop_father may no longer be set up
> > this late in compilation, so I'm wondering whether just replacing
> > this by loop_depth (bb->loop_father) would work here ...

> If SPU md reorg would like to look at loop structures it should
> compute them.  Simply call flow_loops_find, which hopefully works
> in CFG RTL mode (which I think is the mode available from md reorg?).

It seems flow_loops_find by itself is not quite enough, but everything
necessary to use the loop structures seems to be encapsulated in
loop_optimizer_init / loop_optimizer_finalize, which are already called
by a number of optimization passes.  When I do the same in the SPU
md-reorg pass, loop_father seems to be set up OK.

Does this look reasonable?

Thanks,
Ulrich


ChangeLog:

* config/spu/spu.c: Include "cfgloop.h".
(spu_machine_dependent_reorg): Call loop_optimizer_init and
loop_optimizer_finalize.  Use loop_father instead of loop_depth.
* config/spu/t-spu-elf (spu.o): Update dependencies.

Index: gcc/config/spu/spu.c
===
*** gcc/config/spu/spu.c(revision 190390)
--- gcc/config/spu/spu.c(working copy)
***
*** 53,58 
--- 53,59 
  #include "timevar.h"
  #include "df.h"
  #include "dumpfile.h"
+ #include "cfgloop.h"
  
  /* Builtin types, data and prototypes. */
  
*** spu_machine_dependent_reorg (void)
*** 2458,2463 
--- 2459,2468 
in_spu_reorg = 1;
compute_bb_for_insn ();
  
+   /* (Re-)discover loops so that bb->loop_father can be used
+  in the analysis below.  */
+   loop_optimizer_init (AVOID_CFG_MODIFICATIONS);
+ 
compact_blocks ();
  
spu_bb_info =
*** spu_machine_dependent_reorg (void)
*** 2562,2575 
 fallthru block. This catches the cases when it is a simple
 loop or when there is an initial branch into the loop. */
  if (prev && (loop_exit || simple_loop)
! && prev->loop_depth <= bb->loop_depth)
prop = prev;
  
  /* If there is only one adjacent predecessor.  Don't propagate
!outside this loop.  This loop_depth test isn't perfect, but
!I'm not sure the loop_father member is valid at this point.  */
  else if (prev && single_pred_p (bb)
!  && prev->loop_depth == bb->loop_depth)
prop = prev;
  
  /* If this is the JOIN block of a simple IF-THEN then
--- 2567,2580 
 fallthru block. This catches the cases when it is a simple
 loop or when there is an initial branch into the loop. */
  if (prev && (loop_exit || simple_loop)
! && (loop_depth (prev->loop_father)
! <= loop_depth (bb->loop_father)))
prop = prev;
  
  /* If there is only one adjacent predecessor.  Don't propagate
!outside this loop.  */
  else if (prev && single_pred_p (bb)
!  && prev->loop_father == bb->loop_father)
prop = prev;
  
  /* If this is the JOIN block of a simple IF-THEN then
*** spu_machine_dependent_reorg (void)
*** 2578,2584 
   && EDGE_COUNT (bb->preds) == 2
   && EDGE_COUNT (prev->preds) == 1
   && EDGE_PRED (prev, 0)->src == prev2
!  && prev2->loop_depth == bb->loop_depth
   && GET_CODE (branch_target) != REG)
prop = prev;
  
--- 2583,2589 
   && EDGE_COUNT (bb->preds) == 2
   && EDGE_COUNT (prev->preds) == 1
   && EDGE_PRED (prev, 0)->src == prev2
!  && prev2->loop_father == bb->loop_father
   && GET_CODE (branch_target) != REG)
prop = prev;
  
*** spu_machine_dependent_reorg (void)
*** 2600,2606 
  if (dump_file)
fprintf (dump_file, "propagate from %i to %i (loop depth %i) "
 "for %i (loop_exit %i simple_loop %i dist %i)\n",
!bb->index, prop->index, bb->loop_depth,
 INSN_UID (branch), loop_exit, simple_loop,
 branch_addr - INSN_ADDRESSES (INSN_UID (bbend)));
  
--- 2605,2611 
  if (dump_file)
fprintf (dump_file, "propagate from %i to %i (loop depth %i) "
 "for %i (loop_exit %i simple_loop %i dist %i)\n",
!bb->index, prop->index, loop_depth (bb->loop_father),
 INSN_UID (branch), loop_exit, simple_loop,
 branch_addr - INSN_ADDRESSES (INSN_UID (bbend)));
  
*** spu_machine_dependent_re

Re: Merge C++ conversion into trunk (0/6 - Overview)

2012-08-15 Thread Richard Guenther
On Sun, 12 Aug 2012, Diego Novillo wrote:

> I will be sending 6 patches that implement all the changes we
> have been making on the cxx-conversion branch.  As described in 
> http://gcc.gnu.org/ml/gcc/2012-08/msg00015.html, these patches
> change the default bootstrap process so that stage 1 always
> builds with a C++ compiler.
> 
> Other than the bootstrap change, the patches make no functional
> changes to the compiler.  Everything should build as it does now
> in trunk.

...

and debugging becomes a nightmare (hello gdb people!)

(gdb) call debug_tree (*expr_p)
  
constant 9>
(gdb) call debug_tree (0x767fa5e8)
Cannot resolve function debug_tree to any overloaded instance

filing bugs does not help (as usual).  
http://sourceware.org/bugzilla/show_bug.cgi?id=13356

It's a year old, to prep for this change.  Now we're here.  Debugging
sucks.

Great.
Richard.


Re: combine permutations in gimple

2012-08-15 Thread Ramana Radhakrishnan
[It looks like I missed hitting the send button on this response]

>
> Seems to be one instruction shorter at least ;-) Yes, there can be much
> worse regressions than that because of the patch (like 40 instructions
> instead of 4, in the x86 backend).

If this is replacing 4 instructions with 40 in x86 backend maybe
someone will notice :)

Not a win in this particular testcase  because the compiler replaces 2
constant permutes ( that's about 4 cycles) with a load from the
constant pool , a generic permute and in addition are polluting the
icache with guff in the constant pool .  If you go to 3 -4 permutes
into a single one then it might be a win but not till that point.


> with a-b without first asking the backend whether it might be more
> efficient. One permutation is better than 2.
>  It just happens that the range
> of possible permutations is too large (and the basic instructions are too
> strange) for backends to do a good job on them, and thus keeping toplevel
> input as a hint is important.

Of-course, the problem here is this change of semantics with the hook
TARGET_VEC_PERM_CONST_OK. Targets were expanding to generic permutes
with constants in the *absence* of being able to deal with them with
the specialized permutes.  fwprop will now leave us at a point where
each target has to now grow more knowledge with respect to how best to
expand a generic constant permute with a sequence of permutes rather
than just using the generic permute.

Generating a sequence of permutes from a single constant permute will
be a  harder problem than (say) dealing with immediate expansions so
you are pushing more complexity into the target but in the short term
target maintainers should definitely have a heads up that folks using
vector permute intrinsics could actually have performance regressions
on their targets.

Thanks,
Ramana


Re: combine permutations in gimple

2012-08-15 Thread Richard Guenther
On Wed, Aug 15, 2012 at 1:22 PM, Marc Glisse  wrote:
> On Mon, 13 Aug 2012, Marc Glisse wrote:
>
>> I'll give it a few more days for the conversation to settle, so I know
>> what I should do between:
>> - the barely modified patch you accepted,
>> - the check asked by Jakub,
>> - the restriction to identity that prevents any regression (well...),
>> - something else?
>
>
> I didn't realize the conversation would go quiet immediatly...
>
> Is someone willing to take the responsibility of telling me which approach
> is right? I can add Jakub's checks (though I am not sure how I would test
> them), but I would rather not do it if the conclusion is that they are
> either unnecessary (original patch is ok) or insufficient (don't avoid
> Ramana's regressions). I can do the very safe option 3 (only combine
> permutations when the result is the identity permutation), but I first want
> to make sure the more general thing is bad.
>
> (sorry, it's my first patch that gets conflicting answers...)

Well, we're waiting for someone to break the tie ... I'd go with the original
patch, improving the backends where necessary.

Richard.

> --
> Marc Glisse


Re: combine permutations in gimple

2012-08-15 Thread Marc Glisse

On Mon, 13 Aug 2012, Marc Glisse wrote:

I'll give it a few more days for the conversation to settle, so I know what I 
should do between:

- the barely modified patch you accepted,
- the check asked by Jakub,
- the restriction to identity that prevents any regression (well...),
- something else?


I didn't realize the conversation would go quiet immediatly...

Is someone willing to take the responsibility of telling me which approach 
is right? I can add Jakub's checks (though I am not sure how I would test 
them), but I would rather not do it if the conclusion is that they are 
either unnecessary (original patch is ok) or insufficient (don't avoid 
Ramana's regressions). I can do the very safe option 3 (only combine 
permutations when the result is the identity permutation), but I first 
want to make sure the more general thing is bad.


(sorry, it's my first patch that gets conflicting answers...)

--
Marc Glisse


Re: LEA-splitting improvement patch.

2012-08-15 Thread Yuri Rumyantsev
Hi Uros,

I send you new patch with fixed space/tab alignments.

About your comment.

It is more optimal to put adding of constant before adding of the
register only for case when 3 instructions must be generated to split
lea. In all other cases it does not matter and I left code
unchangeable.

Best regards.
Yuri.

2012/8/14 Uros Bizjak :
> On Tue, Aug 14, 2012 at 3:35 PM, Yuri Rumyantsev  wrote:
>> Uros,
>>
>> Let me try to explain you why I used such code duplication:
>>
>> Here we have a common case of LEA with 3 different registers - r0
>> (target), r1(base), r2(index) and possible offset.
>> To get the better scheduling we first try to determine what register
>> is prefirable for inititial setting - r1 or r2 through
>> find_nearest_reg_def. And then we generate the following sequence of
>> instructions:
>>r0 = r_best;
>>r0 = $const, r0
>>r0 = r_worse, r0
>> that can save 2 cycles for Atom since first 2 instructions can be hoisted up.
>> I could not find better way for coding it.
>
> If it is important to put adding of const before adding of the
> register, then you can emit similar sequence for other cases, too.
> Something like following:
>
> --cut here--
> ...
> {
>   if (regno0 == regno1)
> tmp = parts.index;
>   else if (regno0 == regno2)
> tmp = parts.base;
>   else
> {
>   rtx tmp1;
>
>   /* regno1: base, regno2: index  */
>   if (find_nearest_reg_def (insn, regno1, regno2))
> tmp1 = parts.index, tmp = parts.base;
>   else
> tmp1 = parts.base, tmp = parts.index;
>
>   emit_insn (gen_rtx_SET (VOIDmode, target, tmp1));
> }
>
>   if (parts.disp && parts.disp != const0_rtx)
> ix86_emit_binop (PLUS, mode, target, parts.disp);
>   ix86_emit_binop (PLUS, mode, target, tmp);
>   return;
> }
> --cut here--
>
 I prepared new patch and ChangeLog. Testing of x32 is in progress.
>
> You didn't fix vertical spaces and tab issues in new patch.
>
> Uros.


lea_split_improve.diff
Description: Binary data


Re: Merge C++ conversion into trunk (5/6 - double_int rewrite)

2012-08-15 Thread Richard Guenther
On Wed, 15 Aug 2012, Richard Guenther wrote:

> On Wed, Aug 15, 2012 at 12:31 PM, Jakub Jelinek  wrote:
> > On Wed, Aug 15, 2012 at 12:28:58PM +0200, Richard Guenther wrote:
> >> the function names make no sense - they should be talking about
> >> host-wide-ints, because that is what they are about.  Thus,
> >>
> >>   /* Conversion functions.  */
> >>
> >>   HOST_WIDE_INT to_signed_hwi () const;
> >>   unsigned HOST_WIDE_INT to_unsigned_hwi () const;
> >>
> >>   /* Conversion query functions.  */
> >>
> >>   bool fits_unsigned_hwi () const;
> >>   bool fits_signed_hwi () const;
> >>   bool fits_hwi (bool uns) const;
> >
> > Wouldn't uhwi and shwi be better?  Both are already widely used within
> > gcc...
> 
> True, I'll change it that way.

Like so, testing in progress.

Richard.

2012-08-15  Richard Guenther  

* double-int.h (from_unsigned): Rename to ...
(from_uhwi): ... this.
(from_signed): Rename to ...
(from_shwi): ... this.
(to_signed): Rename to ...
(to_shwi): ... this.
(to_unsigned): Rename to ...
(to_uhwi): ... this.
(fits_unsigned): Rename to ...
(fits_uhwi): ... this.
(fits_signed): Rename to ...
(fits_shwi): ... this.
(fits): Rename to ...
(fits_hwi): ... this.
* double-int.c: Likewise.

Index: gcc/double-int.c
===
*** gcc/double-int.c(revision 190407)
--- gcc/double-int.c(working copy)
*** double_int::sext (unsigned prec) const
*** 716,722 
  /* Returns true if CST fits in signed HOST_WIDE_INT.  */
  
  bool
! double_int::fits_signed () const
  {
const double_int &cst = *this;
if (cst.high == 0)
--- 716,722 
  /* Returns true if CST fits in signed HOST_WIDE_INT.  */
  
  bool
! double_int::fits_shwi () const
  {
const double_int &cst = *this;
if (cst.high == 0)
*** double_int::fits_signed () const
*** 731,742 
 unsigned HOST_WIDE_INT if UNS is true.  */
  
  bool
! double_int::fits (bool uns) const
  {
if (uns)
! return this->fits_unsigned ();
else
! return this->fits_signed ();
  }
  
  /* Returns A * B.  */
--- 731,742 
 unsigned HOST_WIDE_INT if UNS is true.  */
  
  bool
! double_int::fits_hwi (bool uns) const
  {
if (uns)
! return this->fits_uhwi ();
else
! return this->fits_shwi ();
  }
  
  /* Returns A * B.  */
Index: gcc/double-int.h
===
*** gcc/double-int.h(revision 190407)
--- gcc/double-int.h(working copy)
*** public:
*** 60,67 
   Second, the GCC conding conventions prefer explicit conversion,
   and explicit conversion operators are not available until C++11.  */
  
!   static double_int from_unsigned (unsigned HOST_WIDE_INT cst);
!   static double_int from_signed (HOST_WIDE_INT cst);
  
/* No copy assignment operator or destructor to keep the type a POD.  */
  
--- 60,67 
   Second, the GCC conding conventions prefer explicit conversion,
   and explicit conversion operators are not available until C++11.  */
  
!   static double_int from_uhwi (unsigned HOST_WIDE_INT cst);
!   static double_int from_shwi (HOST_WIDE_INT cst);
  
/* No copy assignment operator or destructor to keep the type a POD.  */
  
*** public:
*** 83,96 
  
/* Conversion functions.  */
  
!   HOST_WIDE_INT to_signed () const;
!   unsigned HOST_WIDE_INT to_unsigned () const;
  
/* Conversion query functions.  */
  
!   bool fits_unsigned () const;
!   bool fits_signed () const;
!   bool fits (bool uns) const;
  
/* Attribute query functions.  */
  
--- 83,96 
  
/* Conversion functions.  */
  
!   HOST_WIDE_INT to_shwi () const;
!   unsigned HOST_WIDE_INT to_uhwi () const;
  
/* Conversion query functions.  */
  
!   bool fits_uhwi () const;
!   bool fits_shwi () const;
!   bool fits_hwi (bool uns) const;
  
/* Attribute query functions.  */
  
*** public:
*** 186,192 
 HOST_WIDE_INT are filled with the sign bit.  */
  
  inline
! double_int double_int::from_signed (HOST_WIDE_INT cst)
  {
double_int r;
r.low = (unsigned HOST_WIDE_INT) cst;
--- 186,192 
 HOST_WIDE_INT are filled with the sign bit.  */
  
  inline
! double_int double_int::from_shwi (HOST_WIDE_INT cst)
  {
double_int r;
r.low = (unsigned HOST_WIDE_INT) cst;
*** double_int double_int::from_signed (HOST
*** 198,204 
  static inline double_int
  shwi_to_double_int (HOST_WIDE_INT cst)
  {
!   return double_int::from_signed (cst);
  }
  
  /* Some useful constants.  */
--- 198,204 
  static inline double_int
  shwi_to_double_int (HOST_WIDE_INT cst)
  {
!   return double_int::from_shwi (cst);
  }
  
  /* Some useful constants.  */
*** shwi_to_double_int (HOST_WIDE_INT cst)
*** 206,222 
 The problem is that a named constant would 

Re: Merge C++ conversion into trunk (5/6 - double_int rewrite)

2012-08-15 Thread Richard Guenther
On Wed, Aug 15, 2012 at 12:31 PM, Jakub Jelinek  wrote:
> On Wed, Aug 15, 2012 at 12:28:58PM +0200, Richard Guenther wrote:
>> the function names make no sense - they should be talking about
>> host-wide-ints, because that is what they are about.  Thus,
>>
>>   /* Conversion functions.  */
>>
>>   HOST_WIDE_INT to_signed_hwi () const;
>>   unsigned HOST_WIDE_INT to_unsigned_hwi () const;
>>
>>   /* Conversion query functions.  */
>>
>>   bool fits_unsigned_hwi () const;
>>   bool fits_signed_hwi () const;
>>   bool fits_hwi (bool uns) const;
>
> Wouldn't uhwi and shwi be better?  Both are already widely used within
> gcc...

True, I'll change it that way.

Richard.

>> Likewise for
>>
>>   static double_int from_unsigned (unsigned HOST_WIDE_INT cst);
>>   static double_int from_signed (HOST_WIDE_INT cst);
>>
>> I'm going to install a patch, after testing, that adjusts the names 
>> accordingly.
>
> Jakub


Re: Merge C++ conversion into trunk (5/6 - double_int rewrite)

2012-08-15 Thread Jakub Jelinek
On Wed, Aug 15, 2012 at 12:28:58PM +0200, Richard Guenther wrote:
> the function names make no sense - they should be talking about
> host-wide-ints, because that is what they are about.  Thus,
> 
>   /* Conversion functions.  */
> 
>   HOST_WIDE_INT to_signed_hwi () const;
>   unsigned HOST_WIDE_INT to_unsigned_hwi () const;
> 
>   /* Conversion query functions.  */
> 
>   bool fits_unsigned_hwi () const;
>   bool fits_signed_hwi () const;
>   bool fits_hwi (bool uns) const;

Wouldn't uhwi and shwi be better?  Both are already widely used within
gcc...

> Likewise for
> 
>   static double_int from_unsigned (unsigned HOST_WIDE_INT cst);
>   static double_int from_signed (HOST_WIDE_INT cst);
> 
> I'm going to install a patch, after testing, that adjusts the names 
> accordingly.

Jakub


Re: Merge C++ conversion into trunk (5/6 - double_int rewrite)

2012-08-15 Thread Richard Guenther
On Tue, Aug 14, 2012 at 10:04 AM, Richard Guenther  wrote:
> On Mon, 13 Aug 2012, Lawrence Crowl wrote:
>
>> On 8/13/12, Richard Guenther  wrote:
>> > Increment/decrement operations did not exist, please do not add
>> > them at this point.
>>
>> Note that I have also added +=, -= and *= operations.  Having them
>> has three advantages.  First, it matches expectations on what
>> numeric types allow.  Second, it results in more concise code.
>> Third, it results in potentially faster code.  I think we should
>> be able to use those operators.
>>
>> When I run through changing call sites, I really want to change
>> the sites to the final form, not do two passes.
>
> Ok.

Btw, I noticed we now have

  /* Conversion functions.  */

  HOST_WIDE_INT to_signed () const;
  unsigned HOST_WIDE_INT to_unsigned () const;

  /* Conversion query functions.  */

  bool fits_unsigned () const;
  bool fits_signed () const;
  bool fits (bool uns) const;

the function names make no sense - they should be talking about
host-wide-ints, because that is what they are about.  Thus,

  /* Conversion functions.  */

  HOST_WIDE_INT to_signed_hwi () const;
  unsigned HOST_WIDE_INT to_unsigned_hwi () const;

  /* Conversion query functions.  */

  bool fits_unsigned_hwi () const;
  bool fits_signed_hwi () const;
  bool fits_hwi (bool uns) const;

Likewise for

  static double_int from_unsigned (unsigned HOST_WIDE_INT cst);
  static double_int from_signed (HOST_WIDE_INT cst);

I'm going to install a patch, after testing, that adjusts the names accordingly.

Richard.

> Thanks,
> Richard.


RFC patch for http://gcc.gnu.org/install/download.html / gcc/doc/install.texi

2012-08-15 Thread Tobias Burnus

Dear all,

when looking at http://gcc.gnu.org/install/download.html, I wondered 
whether some parts should be updated.


In particular
- Mentioning the GIT mirror at http://gcc.gnu.org/wiki/GitMirror
- Mentioning CLOOG/ISL for the in-tree build

Those changes I did in the patch below.

However, I am also wondering about the
"It is possible to download a full distribution or specific components."

I think GCC stopped doing so at some 4.x release - but install.html also 
applies to GCC 3.x.y where it still existed. I was thus a bit reluctant 
to remove that part.


Furthermore, I was wondering whether one should link to 
ftp://gcc.gnu.org/pub/gcc/infrastructure/ (either directly or by 
mentioning its existence on the mirrors).


What do you think?

Tobias

diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index dfdd624..bbb8535 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -524,6 +524,7 @@ the executables named @command{cantlr}, @command{runantlr} 
or

-GCC is distributed via @uref{http://gcc.gnu.org/svn.html,,SVN} and FTP
-tarballs compressed with @command{gzip} or
-@command{bzip2}.  It is possible to download a full distribution or specific
-components.
+GCC is distributed via @uref{http://gcc.gnu.org/svn.html,,SVN},
+@uref{http://gcc.gnu.org/wiki/GitMirror,,GIT}, and on
+@uref{http://gcc.gnu.org/mirrors.html,,FTP} as tarballs compressed with
+@command{gzip} or @command{bzip2}.  It is possible to download a full
+distribution or specific components.

@@ -555,7 +556,8 @@ components of the binutils you intend to build alongside 
the compiler

-Likewise the GMP, MPFR and MPC libraries can be automatically built
-together with GCC.  Unpack the GMP, MPFR and/or MPC source
-distributions in the directory containing the GCC sources and rename
-their directories to @file{gmp}, @file{mpfr} and @file{mpc},
-respectively (or use symbolic links with the same name).
+Likewise the GMP, MPFR and MPC libraries and the optional CLOOG and ISL
+libraries can be automatically built together with GCC.  Unpack the GMP,
+MPFR, MPC, CLOOG and/or ISL source distributions in the directory
+containing the GCC sources and rename their directories to @file{gmp},
+@file{mpfr}, @file{mpc}, @file{cloog} and @file{isl}, respectively (or
+use symbolic links with the same name).




Re: [patch] timevar TLC

2012-08-15 Thread Steven Bosscher
On Tue, Aug 14, 2012 at 10:46 PM, Lawrence Crowl  wrote:
> You can check the error statically.  Something like
>
> % cat limitstring.c
> #define LIMIT 32
>
> struct def {
>   int x;
>   char name[LIMIT+1];
> };
>
> struct def var[] = {
>   { 3, "hello" },
>   { 4, "name is much too too long for a reasonable name" },
> };
>
> % gcc -c limitstring.c -Werror
> cc1: warnings being treated as errors
> limitstring.c:10: error: initializer-string for array of chars is too long
> limitstring.c:10: error: (near initialization for 'timevars[1].name')
>
> But of course the variable definition would look more like
>
> #define DEFTIMEVAR(identifier__, name__) \
> { , name__, ... },
> struct def var[] = {
> #include "timevar.def"
> };

That looks much better than my hack! I'll see if I can find some time
to implement this idea.

In the mean time I've committed my patch without the timevar.c bits.

Ciao!
Steven


Re: [PATCH] Fix PR54245

2012-08-15 Thread Richard Guenther
On Tue, 14 Aug 2012, William J. Schmidt wrote:

> Currently we can insert an initializer that performs a multiply in too
> small of a type for correctness.  For now, detect the problem and avoid
> the optimization when this would happen.  Eventually I will fix this up
> to cause the multiply to be performed in a sufficiently wide type.
> 
> Bootstrapped and tested on powerpc64-unknown-linux-gnu with no new
> regressions.  Ok for trunk?

Ok.

Thanks,
Richard.

> Thanks,
> Bill
> 
> 
> gcc:
> 
> 2012-08-14  Bill Schmidt  
> 
>   PR tree-optimization/54245
>   * gimple-ssa-strength-reduction.c (legal_cast_p_1): New function.
>   (legal_cast_p): Split out logic to legal_cast_p_1.
>   (analyze_increments): Avoid introducing multiplies in smaller types.
> 
> 
> gcc/testsuite:
> 
> 2012-08-14  Bill Schmidt  
> 
>   PR tree-optimization/54245
>   * gcc.dg/tree-ssa/pr54245.c: New test.
> 
> 
> Index: gcc/testsuite/gcc.dg/tree-ssa/pr54245.c
> ===
> --- gcc/testsuite/gcc.dg/tree-ssa/pr54245.c   (revision 0)
> +++ gcc/testsuite/gcc.dg/tree-ssa/pr54245.c   (revision 0)
> @@ -0,0 +1,49 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O1 -fdump-tree-slsr-details" } */
> +
> +#include 
> +
> +#define W1  22725
> +#define W2  21407
> +#define W3  19266
> +#define W6  8867
> +
> +void idct_row(short *row, int *dst)
> +{
> +int a0, a1, b0, b1;
> +
> +a0 = W1 * row[0];
> +a1 = a0;
> +
> +a0 += W2 * row[2];
> +a1 += W6 * row[2];
> +
> +b0 = W1 * row[1];
> +b1 = W3 * row[1];
> +
> +dst[0] = a0 + b0;
> +dst[1] = a0 - b0;
> +dst[2] = a1 + b1;
> +dst[3] = a1 - b1;
> +}
> +
> +static short block[8] = { 1, 2, 3, 4 };
> +
> +int main(void)
> +{
> +int out[4];
> +int i;
> +
> +idct_row(block, out);
> +
> +for (i = 0; i < 4; i++)
> +printf("%d\n", out[i]);
> +
> +return !(out[2] == 87858 && out[3] == 10794);
> +}
> +
> +/* For now, disable inserting an initializer when the multiplication will
> +   take place in a smaller type than originally.  This test may be deleted
> +   in future when this case is handled more precisely.  */
> +/* { dg-final { scan-tree-dump-times "Inserting initializer" 0 "slsr" } } */
> +/* { dg-final { cleanup-tree-dump "slsr" } } */
> Index: gcc/gimple-ssa-strength-reduction.c
> ===
> --- gcc/gimple-ssa-strength-reduction.c   (revision 190305)
> +++ gcc/gimple-ssa-strength-reduction.c   (working copy)
> @@ -1089,6 +1089,32 @@ slsr_process_neg (gimple gs, tree rhs1, bool speed
>add_cand_for_stmt (gs, c);
>  }
>  
> +/* Help function for legal_cast_p, operating on two trees.  Checks
> +   whether it's allowable to cast from RHS to LHS.  See legal_cast_p
> +   for more details.  */
> +
> +static bool
> +legal_cast_p_1 (tree lhs, tree rhs)
> +{
> +  tree lhs_type, rhs_type;
> +  unsigned lhs_size, rhs_size;
> +  bool lhs_wraps, rhs_wraps;
> +
> +  lhs_type = TREE_TYPE (lhs);
> +  rhs_type = TREE_TYPE (rhs);
> +  lhs_size = TYPE_PRECISION (lhs_type);
> +  rhs_size = TYPE_PRECISION (rhs_type);
> +  lhs_wraps = TYPE_OVERFLOW_WRAPS (lhs_type);
> +  rhs_wraps = TYPE_OVERFLOW_WRAPS (rhs_type);
> +
> +  if (lhs_size < rhs_size
> +  || (rhs_wraps && !lhs_wraps)
> +  || (rhs_wraps && lhs_wraps && rhs_size != lhs_size))
> +return false;
> +
> +  return true;
> +}
> +
>  /* Return TRUE if GS is a statement that defines an SSA name from
> a conversion and is legal for us to combine with an add and multiply
> in the candidate table.  For example, suppose we have:
> @@ -1129,28 +1155,11 @@ slsr_process_neg (gimple gs, tree rhs1, bool speed
>  static bool
>  legal_cast_p (gimple gs, tree rhs)
>  {
> -  tree lhs, lhs_type, rhs_type;
> -  unsigned lhs_size, rhs_size;
> -  bool lhs_wraps, rhs_wraps;
> -
>if (!is_gimple_assign (gs)
>|| !CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (gs)))
>  return false;
>  
> -  lhs = gimple_assign_lhs (gs);
> -  lhs_type = TREE_TYPE (lhs);
> -  rhs_type = TREE_TYPE (rhs);
> -  lhs_size = TYPE_PRECISION (lhs_type);
> -  rhs_size = TYPE_PRECISION (rhs_type);
> -  lhs_wraps = TYPE_OVERFLOW_WRAPS (lhs_type);
> -  rhs_wraps = TYPE_OVERFLOW_WRAPS (rhs_type);
> -
> -  if (lhs_size < rhs_size
> -  || (rhs_wraps && !lhs_wraps)
> -  || (rhs_wraps && lhs_wraps && rhs_size != lhs_size))
> -return false;
> -
> -  return true;
> +  return legal_cast_p_1 (gimple_assign_lhs (gs), rhs);
>  }
>  
>  /* Given GS which is a cast to a scalar integer type, determine whether
> @@ -1996,6 +2005,31 @@ analyze_increments (slsr_cand_t first_dep, enum ma
>  != POINTER_PLUS_EXPR)))
>   incr_vec[i].cost = COST_NEUTRAL;
>
> +  /* FORNOW: If we need to add an initializer, give up if a cast from
> +  the candidate's type to its stride's type can lose precision.
> +  This could eventually be handled better b

Re: [PATCH] Fix PR54240

2012-08-15 Thread Richard Guenther
On Tue, 14 Aug 2012, William J. Schmidt wrote:

> Replace the once vacuously true, and now vacuously false, test for
> existence of a conditional move instruction for a given mode, with one
> that actually checks what it's supposed to.  Add a test case so we don't
> miss such things in future.
> 
> The test is powerpc-specific.  It would be good to have an i386 version
> of the test as well, if someone can help with that.
> 
> Bootstrapped and tested on powerpc64-unknown-linux-gnu with no new
> regressions.  Ok for trunk?

Ok if you also drop in the testcase from Andrew.

Thanks,
Richard.

> Thanks,
> Bill
> 
> 
> gcc:
> 
> 2012-08-13  Bill Schmidt  
> 
>   PR tree-optimization/54240
>   * tree-ssa-phiopt.c (hoist_adjacent_loads): Correct test for
>   existence of conditional move with given mode.
> 
> 
> gcc/testsuite:
> 
> 2012-08-13  Bill Schmidt  
> 
>   PR tree-optimization/54240
>   * gcc.target/powerpc/pr54240.c: New test.
> 
> 
> Index: gcc/testsuite/gcc.target/powerpc/pr54240.c
> ===
> --- gcc/testsuite/gcc.target/powerpc/pr54240.c(revision 0)
> +++ gcc/testsuite/gcc.target/powerpc/pr54240.c(revision 0)
> @@ -0,0 +1,27 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -misel -fdump-tree-phiopt-details" } */
> +
> +typedef struct s {
> +  int v;
> +  int b;
> +  struct s *l;
> +  struct s *r;
> +} S;
> +
> +
> +int foo(S *s)
> +{
> +  S *this;
> +  S *next;
> +
> +  this = s;
> +  if (this->b)
> +next = this->l;
> +  else
> +next = this->r;
> +
> +  return next->v;
> +}
> +
> +/* { dg-final { scan-tree-dump "Hoisting adjacent loads" "phiopt1" } } */
> +/* { dg-final { cleanup-tree-dump "phiopt1" } } */
> Index: gcc/tree-ssa-phiopt.c
> ===
> --- gcc/tree-ssa-phiopt.c (revision 190305)
> +++ gcc/tree-ssa-phiopt.c (working copy)
> @@ -1843,7 +1843,8 @@ hoist_adjacent_loads (basic_block bb0, basic_block
>  
>/* Check the mode of the arguments to be sure a conditional move
>can be generated for it.  */
> -  if (!optab_handler (cmov_optab, TYPE_MODE (TREE_TYPE (arg1
> +  if (optab_handler (movcc_optab, TYPE_MODE (TREE_TYPE (arg1)))
> +   == CODE_FOR_nothing)
>   continue;
>  
>/* Both statements must be assignments whose RHS is a COMPONENT_REF.  
> */
> 
> 
> 

-- 
Richard Guenther 
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imend


Re: [PATCH] Intrinsics for ADCX

2012-08-15 Thread Kirill Yukhin
Hi,
There's white paper [1] available, which explains usage of  MULX/ADCX/ADOX

[1] - http://download.intel.com/embedded/processor/whitepaper/327831.pdf

Thanks, K