Re: [PATCH, PR43864] Gimple level duplicate block cleanup.
On Wed, Aug 24, 2011 at 9:00 PM, Ian Lance Taylor i...@google.com wrote:

Tom de Vries vr...@codesourcery.com writes:

Do you have a moment to give a second look to a gimple CFG optimization? The optimization removes duplicate basic blocks and reduces code size by 1-2%. The latest patch is posted at http://gcc.gnu.org/ml/gcc-patches/2011-08/msg01602.html.

I'm not really the best person to look at this patch, since it applies to areas of the compiler with which I am less familiar. However, since you ask, I did read through the patch, and it looks OK to me. Since Richi OK'ed it, this patch is OK with the following changes.

+typedef struct same_succ *same_succ_t;
+typedef const struct same_succ *const_same_succ_t;

Don't name new types ending with _t. POSIX reserves names ending with _t when sys/types.h is #included. Name these something else.

+typedef struct bb_cluster *bb_cluster_t;
+typedef const struct bb_cluster *const_bb_cluster_t;

Same here.

+@item -ftree-tail-merge
+Merges identical blocks with same successors. This flag is enabled by default
+at @option{-O2} and higher. The run time of this pass can be limited using
+@option{max-tail-merge-comparisons} parameter.

I think this text can be improved to be more meaningful to compiler users. I suggest something like:

Look for identical code sequences. When found, replace one with a jump to the other. This optimization is known as tail merging or cross jumping. This flag is enabled [now same as above]

Can you also add a --param for the maximum number of iterations you perform (16 sounds quite high for GCC bootstrap), I'd default it to 2 which seems to catch 99% of all cases. If you already committed the patch just do it as a followup please.

Thanks,
Richard.

Thanks.

Ian
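To make the suggested documentation wording concrete, here is a hand-written sketch of what tail merging amounts to at the source level. It is an illustration only: the pass under review works on GIMPLE basic blocks, not on C source, and the names scale_unmerged, scale_merged and check_equivalent are invented for this example.

```c
#include <assert.h>

/* Before merging: both branches end with the same two statements,
   so the compiler would emit the same block twice.  */
static int scale_unmerged(int flag, int x)
{
    int r;
    if (flag) {
        x += 1;
        r = x * 3;   /* duplicated tail */
        r -= 2;
    } else {
        x -= 1;
        r = x * 3;   /* duplicated tail */
        r -= 2;
    }
    return r;
}

/* After merging: the common tail exists once and both paths
   jump to it.  This is what "replace one with a jump to the
   other" means in the suggested text.  */
static int scale_merged(int flag, int x)
{
    int r;
    if (flag)
        x += 1;
    else
        x -= 1;
    r = x * 3;       /* single shared tail */
    r -= 2;
    return r;
}

/* Both versions must agree; merging only changes code size.  */
static void check_equivalent(int flag, int x)
{
    assert(scale_unmerged(flag, x) == scale_merged(flag, x));
}
```

Calling check_equivalent with a few small inputs shows that the transformation preserves behaviour; the saving is purely in the number of emitted instructions.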
Re: Vector Comparison patch
On Thu, Aug 25, 2011 at 8:20 AM, Artem Shinkarov artyom.shinkar...@gmail.com wrote:

Here is a cleaned-up patch without the hook. Mostly it works in a way we discussed. So I think it is a right time to do something about vcond patterns, which would allow me to get rid of conversions that I need to put all over the code. Also at the moment the patch breaks lto frontend with a simple example:

#define vector(elcount, type) \
__attribute__((vector_size((elcount)*sizeof(type)))) type

int main (int argc, char *argv[])
{
  vector (4, float) f0;
  vector (4, float) f1;
  f0 = f1 != f0 ? (vector (4, float)){-1,-1,-1,-1} : (vector (4, float)){0,0,0,0};
  return (int)f0[argc];
}

test-lto.c:8:14: internal compiler error: in convert, at lto/lto-lang.c:1244

I looked into the file, the conversion function is defined as gcc_unreachable (). I am not very familiar with lto, so I don't really know what is the right way to treat the conversions.

convert cannot be called from the middle-end, instead use fold_convert.

And I seriously need help with backend patterns.

I'll look at the patch in detail later today.

Richard.

Thanks, Artem.
Re: Vector Comparison patch
On Thu, Aug 25, 2011 at 8:34 AM, Richard Guenther richard.guent...@gmail.com wrote:

On Thu, Aug 25, 2011 at 8:20 AM, Artem Shinkarov artyom.shinkar...@gmail.com wrote:

Here is a cleaned-up patch without the hook. Mostly it works in a way we discussed. So I think it is a right time to do something about vcond patterns, which would allow me to get rid of conversions that I need to put all over the code. Also at the moment the patch breaks lto frontend with a simple example:

#define vector(elcount, type) \
__attribute__((vector_size((elcount)*sizeof(type)))) type

int main (int argc, char *argv[])
{
  vector (4, float) f0;
  vector (4, float) f1;
  f0 = f1 != f0 ? (vector (4, float)){-1,-1,-1,-1} : (vector (4, float)){0,0,0,0};
  return (int)f0[argc];
}

test-lto.c:8:14: internal compiler error: in convert, at lto/lto-lang.c:1244

I looked into the file, the conversion function is defined as gcc_unreachable (). I am not very familiar with lto, so I don't really know what is the right way to treat the conversions.

convert cannot be called from the middle-end, instead use fold_convert.

Thanks, great. I didn't know that. Using fold_convert solves my problem and makes all my tests pass.

And I seriously need help with backend patterns.

I'll look at the patch in detail later today.

Thanks, Artem.

Richard.

Thanks, Artem.
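For readers unfamiliar with the extension being discussed, a minimal sketch of GNU C vector comparison follows. It assumes a compiler that implements the comparison operators on vector_size types, which is the feature this patch series adds to GCC. The typedef and function names (v4sf, v4si, lanes_differ, select_lanes, lane_is_set) are made up for the example, and the select is written with bitwise operations rather than ?: so that it does not depend on vector conditional support in C.

```c
#include <assert.h>

/* Four-lane vectors, 16 bytes each (GNU C extension).  */
typedef float v4sf __attribute__((vector_size(16)));
typedef int   v4si __attribute__((vector_size(16)));

/* Lane-wise a != b.  Each result lane is -1 (all bits set) where
   the lanes differ and 0 where they are equal; the element type
   is a signed integer of the same size as the operands' elements.  */
static v4si lanes_differ(v4sf a, v4sf b)
{
    return a != b;
}

/* Pick if_true where the mask lane is -1 and if_false where it
   is 0.  Works because the mask lanes are all-ones or all-zeros.  */
static v4si select_lanes(v4si mask, v4si if_true, v4si if_false)
{
    return (mask & if_true) | (~mask & if_false);
}

/* Convenience accessor: nonzero if lane I of MASK is set.  */
static int lane_is_set(v4si mask, int i)
{
    assert(i >= 0 && i < 4);
    return mask[i] != 0;
}
```

The mask produced by lanes_differ can be fed straight into select_lanes, which is the same data flow as the f1 != f0 ? ... : ... expression in the test case, just spelled out by hand.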
Re: [PATCH][RFC] Fix sizetype sign checks
On Wed, 24 Aug 2011, Richard Guenther wrote: On Fri, 19 Aug 2011, Richard Guenther wrote: On Fri, 19 Aug 2011, Eric Botcazou wrote: Looking at the Ada case I believe this happens because Ada has negative DECL_FIELD_OFFSET values (but that's again in sizetype, not ssizetype)? Other host_integerp uses in Ada operate on sizes where I hope those are never negative ;) Yes, the Ada compiler uses negative offsets for some peculiar constructs. Nothing to do with the language per se, but with mechanisms implemented in gigi to support some features of the language. Eric, any better way of fixing this or would you be fine with this patch? Hard to say without seeing the complete patch and playing a little with it. This is the complete patch I am playing with currently, Ada bootstrap still fails for me unfortunately. Bootstrap for all other languages succeeds, but there are some regressions, mostly warning-related. Any help with pinpointing the Ada problem is welcome. Another patch that makes the gnat.dg and acats testsuites clean on x86_64-linux apart from diagnostic regressions is Index: varasm.c === --- varasm.c(revision 178035) +++ varasm.c(working copy) @@ -4740,8 +4743,15 @@ output_constructor_regular_field (oc_loc if (local-index != NULL_TREE) { - double_int idx = double_int_sub (tree_to_double_int (local-index), - tree_to_double_int (local-min_index)); + /* ??? Ada has negative DECL_FIELD_OFFSETs but we are using an + unsigned sizetype so make sure to sign-extend the indices before +subtracting them. 
*/ + unsigned prec = TYPE_PRECISION (sizetype); + double_int idx + = double_int_sub (double_int_sext (tree_to_double_int (local-index), + prec), + double_int_sext (tree_to_double_int +(local-min_index), prec)); gcc_assert (double_int_fits_in_shwi_p (idx)); fieldpos = (tree_low_cst (TYPE_SIZE_UNIT (TREE_TYPE (local-val)), 1) * idx.low); but I still fail to build Ada without bootstrapping as the RTS is still miscompiled which results in gnatmake segfaulting like Starting program: /home/abuild/rguenther/obj/gcc/gnatmake -c -b -I../rts -I. -I/space/rguenther/src/svn/trunk2/gcc/ada --GNATBIND=../../gnatbind --GCC=../../xgcc\ -B../../\ -g\ -O2\ \ -gnatpg\ -gnata gnatchop gnatcmd gnatkr gnatls gnatprep gnatxref gnatfind gnatname gnatclean -bargs -I../rts -I. -I/space/rguenther/src/svn/trunk2/gcc/ada -static -x Program received signal SIGSEGV, Segmentation fault. system.secondary_stack.ss_mark () at ../rts/s-secsta.adb:465 465 return (Sstk = Sstk, Sptr = To_Stack_Ptr (Sstk).Top); (gdb) p Sstk $1 = (system.address) 0x0 Are there any other places where GIGI would expect sign-extended sizetype values? Any hint where to look for the above failure? system.soft_links.get_sec_stack_addr_nt doesn't seem to get called at all. Where is the machinery to eventually set it up located? Grepping for stack or secondary doesn't reveal too much useful information. Updated patch below. Thanks, Richard. 2011-06-16 Richard Guenther rguent...@suse.de * fold-const.c (div_if_zero_remainder): sizetypes no longer sign-extend. * stor-layout.c (initialize_sizetypes): Likewise. * tree-ssa-ccp.c (bit_value_unop_1): Likewise. (bit_value_binop_1): Likewise. * tree.c (double_int_to_tree): Likewise. (double_int_fits_to_tree_p): Likewise. (force_fit_type_double): Likewise. (host_integerp): Likewise. (int_fits_type_p): Likewise. * tree-cfg.c (verify_types_in_gimple_reference): Do not compare sizes by pointer. 
Index: trunk/gcc/fold-const.c === *** trunk.orig/gcc/fold-const.c 2011-08-24 15:34:13.0 +0200 --- trunk/gcc/fold-const.c 2011-08-24 15:36:22.0 +0200 *** div_if_zero_remainder (enum tree_code co *** 194,202 does the correct thing for POINTER_PLUS_EXPR where we want a signed division. */ uns = TYPE_UNSIGNED (TREE_TYPE (arg2)); - if (TREE_CODE (TREE_TYPE (arg2)) == INTEGER_TYPE -TYPE_IS_SIZETYPE (TREE_TYPE (arg2))) - uns = false; quo = double_int_divmod (tree_to_double_int (arg1), tree_to_double_int (arg2), --- 194,199 *** int_binop_types_match_p (enum tree_code *** 938,945 to produce a new constant. Return NULL_TREE if we don't know how to evaluate CODE at compile-time. */ ! tree ! int_const_binop (enum tree_code code, const_tree arg1, const_tree arg2) { double_int op1, op2, res, tmp; tree t; --- 935,943 to produce a
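The idea behind the varasm.c hunk earlier in this message, sign-extending both indices to the precision of the type before subtracting them, can be sketched in plain C. This is an analogy and not GCC code: sext and index_delta are hypothetical helpers standing in for double_int_sext and double_int_sub, and 64-bit host integers with a 32-bit precision stand in for double_int and TYPE_PRECISION (sizetype).

```c
#include <assert.h>
#include <stdint.h>

/* Interpret the low PREC bits of VALUE as a two's complement
   number and return it as a signed 64-bit value.  */
static int64_t sext(uint64_t value, unsigned prec)
{
    assert(prec >= 1 && prec <= 64);
    if (prec < 64) {
        uint64_t sign = UINT64_C(1) << (prec - 1);
        uint64_t mask = (sign << 1) - 1;
        value &= mask;             /* keep only the low PREC bits */
        if (value & sign)
            value |= ~mask;        /* replicate the sign bit upwards */
    }
    /* Convert the bit pattern to int64_t without relying on an
       implementation-defined unsigned-to-signed conversion.  */
    if (value > (uint64_t) INT64_MAX)
        return -(int64_t) (~value) - 1;
    return (int64_t) value;
}

/* Distance between two indices stored in an unsigned type of
   PREC bits that may really hold negative values.  */
static int64_t index_delta(uint64_t index, uint64_t min_index, unsigned prec)
{
    return sext(index, prec) - sext(min_index, prec);
}
```

With a 32-bit precision, a minimum index stored as 0xFFFFFFFE really means -2, so the distance from it to index 2 is 4. Subtracting the raw unsigned values in a wider type would give a large negative number instead, which is the kind of result the gcc_assert in the hunk guards against.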
Re: [PATCH, i386, testsuite] FMA intrinsics
On Thu, Aug 25, 2011 at 10:18 AM, Ilya Tocar tocarip.in...@gmail.com wrote: Changelog: 2011-08-25 Ilya Tocar ilya.to...@intel.com * config/i386/fmaintrin.h: New. * config.gcc: Add fmaintrin.h. * config/i386/i386.c * ix86_builtins (IX86_BUILTIN_VFMADDSS3): New. (IX86_BUILTIN_VFMADDSD3): Likewise. (enum ix86_builtins) IX86_...: New. IX86_...: Likewise. * config/i386/sse.md (fmai_vmfmadd_mode): New. (*fmai_fmadd_mode): Likewise. (*fmai_fmsub_mode): Likewise. (*fmai_fnmadd_mode): Likewise. (*fmai_fnmsub_mode): Likewise. * config/i386/x86intrin.h: Add fmaintrin.h. And Changelog for testsuite: 2011-08-25 Ilya Tocar ilya.to...@intel.com * gcc.target/i386/fma-check.h: New. * gcc.target/i386/fma-256-fmaddXX.c: New testcase. * gcc.target/i386/fma-256-fmaddsubXX.c: Likewise. * gcc.target/i386/fma-256-fmsubXX.c: Likewise. * gcc.target/i386/fma-256-fmsubaddXX.c: Likewise. * gcc.target/i386/fma-256-fnmaddXX.c: Likewise. * gcc.target/i386/fma-256-fnmsubXX.c: Likewise. * gcc.target/i386/fma-fmaddXX.c: Likewise. * gcc.target/i386/fma-fmaddsubXX.c: Likewise. * gcc.target/i386/fma-fmsubXX.c: Likewise. * gcc.target/i386/fma-fmsubaddXX.c: Likewise. * gcc.target/i386/fma-fnmaddXX.c: Likewise. * gcc.target/i386/fma-fnmsubXX.c: Likewise. * gcc.target/i386/fma-compile.c: Likewise. * gcc.target/i386/i386.exp (check_effective_target_fma): New. * gcc.target/i386/sse-12.c: Add -mfma. * gcc.target/i386/sse-13.c: Likewise. * gcc.target/i386/sse-14.c: Likewise. * gcc.target/i386/sse-22.c: Likewise. * gcc.target/i386/sse-23.c: Likewise. * gcc.target/i386/sse-13.c: Likewise. Duplicate. * g++.dg/other/i386-2.c: Likewise. *g++.dg/other/i386-2.C * g++.dg/other/i386-2.c: Likewise. * g++.dg/other/i386-3.C Uros.
[PATCH, MELT] Build system inconsistency on flavors
Hello,

Following is a patch that fixes usage of the now invalid 'static' flavor and replaces its usage with 'quicklybuilt'.

From 546c86cd45114470da4cb7811c71f1fcde48714b Mon Sep 17 00:00:00 2001
From: Alexandre Lissy ali...@mandriva.com
Date: Thu, 25 Aug 2011 11:26:54 +0200
Subject: [PATCH] [MELT] Fix bad flavor static

Recent changes made the name 'static' invalid as a flavor however there were pieces of its existence that made the build as a plugin failing because of a flavor named static while the shared library was named 'quicklybuilt', and hence unable to be loaded. This commit changes those last pieces of 'static' to 'quicklybuilt'.
---
contrib/MELT-Plugin-Makefile |4 ++--
gcc/ChangeLog.MELT |6 ++
gcc/melt-build.tpl | 24
3 files changed, 20 insertions(+), 14 deletions(-)

diff --git a/contrib/MELT-Plugin-Makefile b/contrib/MELT-Plugin-Makefile
index 476935f..5e5baae 100644
--- a/contrib/MELT-Plugin-Makefile
+++ b/contrib/MELT-Plugin-Makefile
@@ -149,8 +149,8 @@ melt_installed_module_makefile=$(MELTGCC_PLUGIN_DIR)/melt-module.mk
 ## should be 1
 melt_is_plugin=1

-# we force the stage0 to be static i.e. with constant field offsets.
-MELT_STAGE_ZERO = melt-stage0-static
+# we force the stage0 to be quicklybuilt i.e. with constant field offsets.
+MELT_STAGE_ZERO = melt-stage0-quicklybuilt

 ## Tell GNU make to not build goals in parallel, that is to ignore any
 ## -j flag to make.
diff --git a/gcc/ChangeLog.MELT b/gcc/ChangeLog.MELT
index 4282321..a341a49 100644
--- a/gcc/ChangeLog.MELT
+++ b/gcc/ChangeLog.MELT
@@ -1,3 +1,9 @@
+2011-08-25 Alexandre Lissy ali...@mandriva.com
+ * gcc/melt/melt-build.tpl: Use -quicklybuilt instead of -static for
+ consistency with the whole build system
+ * contrib/MELT-Plugin-Makefile: Using -quicklybuilt instead of
+ -static.
+
 2011-08-23 Alexandre Lissy ali...@mandriva.com
 * melt-runtime.c (melt_load_module_index): Correct handling of invalid dlopen handle.
diff --git a/gcc/melt-build.tpl b/gcc/melt-build.tpl index 705fe67..e3e8282 100644 --- a/gcc/melt-build.tpl +++ b/gcc/melt-build.tpl @@ -135,7 +135,7 @@ melt-workdir: ## from the MELT descriptor C file ## using static object fields offsets for [+base+] -melt-stage0-static/[+base+].$(MELT_GENERATED_[+mkvarsuf+]_CUMULMD5).quicklybuilt.so: $(MELT_GENERATED_[+mkvarsuf+]_C_FILES) \ +melt-stage0-quicklybuilt/[+base+].$(MELT_GENERATED_[+mkvarsuf+]_CUMULMD5).quicklybuilt.so: $(MELT_GENERATED_[+mkvarsuf+]_C_FILES) \ melt-run.h melt-runtime.h melt-runtime.c \ melt-predef.h $(melt_make_cc1_dependency) @echo stage0static [+base+] MELT_GENERATED_[+mkvarsuf+]_CUMULMD5= $(MELT_GENERATED_[+mkvarsuf+]_CUMULMD5) @@ -145,9 +145,9 @@ melt-stage0-static/[+base+].$(MELT_GENERATED_[+mkvarsuf+]_CUMULMD5).quicklybuilt GCCMELT_CFLAGS=$(melt_cflags) \ GCCMELT_MODULE_SOURCEBASE=$(melt_make_source_dir)/generated/[+base+] \ GCCMELT_CUMULATED_MD5=$(MELT_GENERATED_[+mkvarsuf+]_CUMULMD5) \ - GCCMELT_MODULE_BINARYBASE=melt-stage0-static/[+base+] + GCCMELT_MODULE_BINARYBASE=melt-stage0-quicklybuilt/[+base+] -melt-stage0-static/[+base+].so: melt-stage0-static/[+base+].$(MELT_GENERATED_[+mkvarsuf+]_CUMULMD5).quicklybuilt.so +melt-stage0-quicklybuilt/[+base+].so: melt-stage0-quicklybuilt/[+base+].$(MELT_GENERATED_[+mkvarsuf+]_CUMULMD5).quicklybuilt.so cd $(dir $@) ; rm -f $(notdir $@); $(LN_S) $(notdir $) $(notdir $@) ## using dynamic object fields offsets for [+base+] @@ -176,14 +176,14 @@ melt-stage0-dynamic/[+base+].quicklybuilt.so: melt-stage0-dynamic/[+base+].$(MEL [+ENDFOR melt_translator_file+] -melt-stage0-static.stamp: melt-stage0-static melt-run.h $(wildcard $(patsubst %,$(melt_make_source_dir)/generated/%*.c,$(MELT_TRANSLATOR_BASE))) | melt-stage0-static/warmelt.modlis +melt-stage0-quicklybuilt.stamp: melt-stage0-quicklybuilt melt-run.h $(wildcard $(patsubst %,$(melt_make_source_dir)/generated/%*.c,$(MELT_TRANSLATOR_BASE))) | melt-stage0-quicklybuilt/warmelt.modlis date +#$@ generated %F 
$@-tmp [+FOR melt_translator_file \n+] md5sum melt-run.h $(MELT_GENERATED_[+mkvarsuf+]_C_FILES) $@-tmp[+ENDFOR melt_translator_file+] echo # end $@ $@-tmp $(melt_make_move) $@-tmp $@ - rm -f $(patsubst %,melt-stage0-static/%*.c,$(MELT_TRANSLATOR_BASE)) - $(LN_S) $(realpath $(sort $(wildcard $(patsubst %,$(realpath $(melt_make_source_dir))/generated/%*.c,$(MELT_TRANSLATOR_BASE) melt-stage0-static/ - @echo STAMPstage0static after $@ ; ls -l melt-stage0-static/* + rm -f $(patsubst %,melt-stage0-quicklybuilt/%*.c,$(MELT_TRANSLATOR_BASE)) + $(LN_S) $(realpath $(sort $(wildcard $(patsubst %,$(realpath $(melt_make_source_dir))/generated/%*.c,$(MELT_TRANSLATOR_BASE) melt-stage0-quicklybuilt/ + @echo STAMPstage0static after $@ ; ls -l melt-stage0-quicklybuilt/* melt-stage0-dynamic.stamp: melt-stage0-dynamic melt-run.h $(wildcard $(patsubst
Re: patch to solve recent SPEC2000 degradation
Vladimir Makarov vmaka...@redhat.com writes:

Instead of using explicitly necessary number of registers, I used contains_reg_of_mode which also checks the number of necessary registers but also it checks that the register class can hold value of given mode. This resulted in different register pressure classes (before the patch, they were GENERAL_REGS and FLOAT_REGS for x86. They became only INT_FLOAT_REGS) because it became not costly to hold integer mode value in FLOAT_REGS. The new register pressure class in own turn resulted in low register pressure and one region allocation in most cases instead of multiple region RA. As a consequence, we got a big degradation on Intel 32 bit targets.

Sorry, I know I should be able to work this out, but could you explain in a bit more detail why contains_reg_of_mode (CL1, MODE) was wrong? The loop is calculating costs for moving values of mode MODE into and out of CL1, and I wouldn't have expected those costs to have any meaning if CL1 can't in fact store anything of mode MODE.

It just looked at first glance as though:

 /* Some subclasses are to small to have enough registers to hold
    a value of MODE. Just ignore them. */
- if (! contains_reg_of_mode[cl1][mode])
+ if (ira_reg_class_max_nregs[cl1][mode] > ira_available_class_regs[cl1])
   continue;
 COPY_HARD_REG_SET (temp_hard_regset, reg_class_contents[cl1]);
 AND_COMPL_HARD_REG_SET (temp_hard_regset, no_unit_alloc_regs);

expects CLASS_MAX_NREGS (CL1, MODE) to have a certain meaning even if CL1 can't store values of mode MODE, whereas I'd assumed it was undefined in that case.

Richard
Re: Vector Comparison patch
On Thu, Aug 25, 2011 at 8:20 AM, Artem Shinkarov artyom.shinkar...@gmail.com wrote:

Here is a cleaned-up patch without the hook. Mostly it works in a way we discussed. So I think it is a right time to do something about vcond patterns, which would allow me to get rid of conversions that I need to put all over the code. Also at the moment the patch breaks lto frontend with a simple example:

#define vector(elcount, type) \
__attribute__((vector_size((elcount)*sizeof(type)))) type

int main (int argc, char *argv[])
{
  vector (4, float) f0;
  vector (4, float) f1;
  f0 = f1 != f0 ? (vector (4, float)){-1,-1,-1,-1} : (vector (4, float)){0,0,0,0};
  return (int)f0[argc];
}

test-lto.c:8:14: internal compiler error: in convert, at lto/lto-lang.c:1244

I looked into the file, the conversion function is defined as gcc_unreachable (). I am not very familiar with lto, so I don't really know what is the right way to treat the conversions.

And I seriously need help with backend patterns.

On the patch. The documentation needs review by a native english speaker, but here are some factual comments:

+In C vector comparison is supported within standard comparison operators:

it should read 'In GNU C' here and everywhere else as this is a GNU extension.

The result of the
+comparison is a signed integer-type vector where the size of each
+element must be the same as the size of compared vectors element.

The result type of the comparison is determined by the C frontend, it isn't under control of the user. What you are implying here is restrictions on vector assignments, which are documented elsewhere. I'd just say 'The result of the comparison is a vector of the same width and number of elements as the comparison operands with a signed integral element type.'

+In addition to the vector comparison C supports conditional expressions

See above.

+For the convenience condition in the vector conditional can be just a
+vector of signed integer type.

'of integer type.'
I don't see a reason to disallow unsigned integers, they can be equally well compared against zero. Index: gcc/targhooks.h === --- gcc/targhooks.h (revision 177665) +++ gcc/targhooks.h (working copy) @@ -86,6 +86,7 @@ extern int default_builtin_vectorization extern tree default_builtin_reciprocal (unsigned int, bool, bool); extern bool default_builtin_vector_alignment_reachable (const_tree, bool); + extern bool default_builtin_support_vector_misalignment (enum machine_mode mode, const_tree, spurious whitespace change. Index: gcc/optabs.c === --- gcc/optabs.c(revision 177665) +++ gcc/optabs.c(working copy) @@ -6572,16 +6572,36 @@ expand_vec_cond_expr (tree vec_cond_type ... + else +{ + rtx rtx_op0; + rtx vec; + + rtx_op0 = expand_normal (op0); + comparison = gen_rtx_NE (mode, NULL_RTX, NULL_RTX); + vec = CONST0_RTX (mode); + + create_output_operand (ops[0], target, mode); + create_input_operand (ops[1], rtx_op1, mode); + create_input_operand (ops[2], rtx_op2, mode); + create_input_operand (ops[3], comparison, mode); + create_input_operand (ops[4], rtx_op0, mode); + create_input_operand (ops[5], vec, mode); this still builds the fake(?) != comparison, but as you said you need help with the .md part if we want to use a machine specific pattern for this case (which we eventually want, for the sake of using XOP vcond). Index: gcc/target.h === --- gcc/target.h(revision 177665) +++ gcc/target.h(working copy) @@ -51,6 +51,7 @@ #define GCC_TARGET_H #include insn-modes.h +#include gimple.h #ifdef ENABLE_CHECKING spurious change. @@ -9073,26 +9082,28 @@ fold_comparison (location_t loc, enum tr floating-point, we can only do some of these simplifications.) */ if (operand_equal_p (arg0, arg1, 0)) { + tree arg0_type = TREE_TYPE (arg0); + switch (code) { case EQ_EXPR: - if (! FLOAT_TYPE_P (TREE_TYPE (arg0)) - || ! HONOR_NANS (TYPE_MODE (TREE_TYPE (arg0 + if (! FLOAT_TYPE_P (arg0_type) + || ! HONOR_NANS (TYPE_MODE (arg0_type))) ... Likewise. 
@@ -8440,6 +8440,37 @@ expand_expr_real_2 (sepops ops, rtx targ case UNGE_EXPR: case UNEQ_EXPR: case LTGT_EXPR: + if (TREE_CODE (ops-type) == VECTOR_TYPE) + { + enum tree_code code = ops-code; + tree arg0 = ops-op0; + tree arg1 = ops-op1; move this code to do_store_flag (we really store a flag value). It should also simply do what expand_vec_cond_expr does, probably simply calling that with the {-1,...} {0,...} extra args should work. As for the still required conversions, you should be able to delay those
[PATCH] [MELT] Fix meltgendoc.texi target
A previous commit introduced a change to the melt output argument in the meltgendoc.texi target, going from $@ to $(basename $@). This is bad since we are losing the extension of the filename, hence the build process is stopped because of the missing meltgendoc.texi file.
---
gcc/ChangeLog.MELT |4
gcc/melt-build.tpl |2 +-
2 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/gcc/ChangeLog.MELT b/gcc/ChangeLog.MELT
index a341a49..f32e960 100644
--- a/gcc/ChangeLog.MELT
+++ b/gcc/ChangeLog.MELT
@@ -1,4 +1,8 @@
 2011-08-25 Alexandre Lissy ali...@mandriva.com
+ * gcc/melt/melt-build.tpl: Fix meltgendoc.texi target (missing .texi
+ extension for generated file)
+
+2011-08-25 Alexandre Lissy ali...@mandriva.com
 * gcc/melt/melt-build.tpl: Use -quicklybuilt instead of -static for
 consistency with the whole build system
 * contrib/MELT-Plugin-Makefile: Using -quicklybuilt instead of
diff --git a/gcc/melt-build.tpl b/gcc/melt-build.tpl
index e3e8282..87f920b 100644
--- a/gcc/melt-build.tpl
+++ b/gcc/melt-build.tpl
@@ -611,7 +611,7 @@ meltgendoc.texi: $(melt_default_modules_list).modlis \
 $(meltarg_init)=@$(melt_default_modules_list) \
 $(meltarg_module_path)=$(realpath melt-modules):. \
 $(meltarg_source_path)=$(realpath melt-sources):. \
- $(meltarg_output)=$(basename $@) \
+ $(meltarg_output)=$(basename $@).texi \
 $(meltarg_arglist)=[+FOR melt_translator_file+][+base+].melt,[+ENDFOR melt_translator_file+]\
 [+FOR melt_application_file , +][+base+].melt[+ENDFOR melt_application_file+] \
 empty-file-for-melt.c $(notdir $(basename $@)).args-tmp
[PATCH, MELT] Fixing documentation generation
Hello,

A previous commit introduced a typo in the output argument used when generating documentation; the attached commit fixes the issue.
fix for segmentation violation in dump_generic_node
Jakub,

This patch fixes a segmentation violation, which occurs when printing a MEM_REF or COMPONENT_REF containing a released ssa name. This can happen when we print basic blocks upon removal, enabled by -fdump-tree-*-details (see remove_bb:tree-cfg.c).

Bootstrapped and reg-tested on x86_64.

OK for trunk?

Thanks,
- Tom

2011-08-25 Tom de Vries t...@codesourcery.com

* tree-pretty-print.c (dump_generic_node): Test for NULL_TREE before accessing TREE_TYPE.

Index: gcc/tree-pretty-print.c
===
--- gcc/tree-pretty-print.c (revision 176920)
+++ gcc/tree-pretty-print.c (working copy)
@@ -811,6 +811,8 @@ dump_generic_node (pretty_printer *buffe
 && TREE_CODE (TREE_OPERAND (node, 0)) != INTEGER_CST
 /* Same pointer types, but ignoring POINTER_TYPE vs.
    REFERENCE_TYPE. */
+ && TREE_TYPE (TREE_OPERAND (node, 0)) != NULL_TREE
+ && TREE_TYPE (TREE_OPERAND (node, 1)) != NULL_TREE
 && (TREE_TYPE (TREE_TYPE (TREE_OPERAND (node, 0)))
     == TREE_TYPE (TREE_TYPE (TREE_OPERAND (node, 1))))
 && (TYPE_MODE (TREE_TYPE (TREE_OPERAND (node, 0)))
@@ -1177,6 +1179,8 @@ dump_generic_node (pretty_printer *buffe
 && TREE_CODE (TREE_OPERAND (op0, 0)) != INTEGER_CST
 /* Same pointer types, but ignoring POINTER_TYPE vs.
    REFERENCE_TYPE. */
+ && TREE_TYPE (TREE_OPERAND (op0, 0)) != NULL_TREE
+ && TREE_TYPE (TREE_OPERAND (op0, 1)) != NULL_TREE
 && (TREE_TYPE (TREE_TYPE (TREE_OPERAND (op0, 0)))
     == TREE_TYPE (TREE_TYPE (TREE_OPERAND (op0, 1))))
 && (TYPE_MODE (TREE_TYPE (TREE_OPERAND (op0, 0)))
Re: fix for segmentation violation in dump_generic_node
On Thu, Aug 25, 2011 at 12:32 PM, Tom de Vries vr...@codesourcery.com wrote:

Jakub,

This patch fixes a segmentation violation, which occurs when printing a MEM_REF or COMPONENT_REF containing a released ssa name. This can happen when we print basic blocks upon removal, enabled by -fdump-tree-*-details (see remove_bb:tree-cfg.c).

Where do we dump stmts there?

Bootstrapped and reg-tested on x86_64.

OK for trunk?

At least TREE_TYPE (TREE_OPERAND (node, 1)) != NULL_TREE is always true. The comment before the new lines is now in the wrong place and this check at least needs a comment as well.

But - it's broken to dump freed stuff, why and where do we do this?

Richard.

Thanks,
- Tom

2011-08-25 Tom de Vries t...@codesourcery.com

* tree-pretty-print.c (dump_generic_node): Test for NULL_TREE before accessing TREE_TYPE.
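The guard pattern at issue, checking that a type pointer is non-null before looking through it, can be sketched outside GCC as follows. This is a stand-alone illustration under stated assumptions: struct node and same_pointee_type are hypothetical stand-ins for tree and the condition in dump_generic_node, not GCC definitions. It shows only why the null tests have to come before the dereference in the && chain; it does not address the objection that freed operands should not be dumped in the first place.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a tree node: TYPE plays the role of
   TREE_TYPE and is NULL for a released node.  */
struct node {
    struct node *type;
    int mode;
};

/* Return nonzero if A and B are pointer-typed operands whose
   pointee types are identical.  The null tests come first, so
   the short-circuiting && never evaluates a->type->type when
   a->type is NULL.  */
static int same_pointee_type(const struct node *a, const struct node *b)
{
    return a != NULL && b != NULL
        /* Released operands have no type; bail out before the
           dereference below.  */
        && a->type != NULL && b->type != NULL
        && a->type->type == b->type->type;
}
```

Because && evaluates left to right and stops at the first false operand, placing the tests after the comparison would not protect anything, which is why the position of the new lines relative to the existing comment and condition matters.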
Re: [PATCH, i386, testsuite] FMA intrinsics
Sorry. Like this? Changelog: 2011-08-25 Ilya Tocar ilya.to...@intel.com * config/i386/fmaintrin.h: New. * config.gcc: Add fmaintrin.h. * config/i386/i386.c (enum ix86_builtins) IX86_BUILTIN_VFMADDSS3: New. IX86_BUILTIN_VFMADDSD3: Likewise. * config/i386/sse.md (fmai_vmfmadd_mode): New. (*fmai_fmadd_mode): Likewise. (*fmai_fmsub_mode): Likewise. (*fmai_fnmadd_mode): Likewise. (*fmai_fnmsub_mode): Likewise. * config/i386/x86intrin.h: Add fmaintrin.h. And Changelog for testsuite: 2011-08-25 Ilya Tocar ilya.to...@intel.com * gcc.target/i386/fma-check.h: New. * gcc.target/i386/fma-256-fmaddXX.c: New testcase. * gcc.target/i386/fma-256-fmaddsubXX.c: Likewise. * gcc.target/i386/fma-256-fmsubXX.c: Likewise. * gcc.target/i386/fma-256-fmsubaddXX.c: Likewise. * gcc.target/i386/fma-256-fnmaddXX.c: Likewise. * gcc.target/i386/fma-256-fnmsubXX.c: Likewise. * gcc.target/i386/fma-fmaddXX.c: Likewise. * gcc.target/i386/fma-fmaddsubXX.c: Likewise. * gcc.target/i386/fma-fmsubXX.c: Likewise. * gcc.target/i386/fma-fmsubaddXX.c: Likewise. * gcc.target/i386/fma-fnmaddXX.c: Likewise. * gcc.target/i386/fma-fnmsubXX.c: Likewise. * gcc.target/i386/fma-compile.c: Likewise. * gcc.target/i386/i386.exp (check_effective_target_fma): New. * gcc.target/i386/sse-12.c: Add -mfma. * gcc.target/i386/sse-13.c: Likewise. * gcc.target/i386/sse-14.c: Likewise. * gcc.target/i386/sse-22.c: Likewise. * gcc.target/i386/sse-23.c: Likewise. * gcc.target/i386/sse-13.c: Likewise. * g++.dg/other/i386-2.c: Likewise. * g++.dg/other/i386-3.c: Likewise. 2011/8/25 Uros Bizjak ubiz...@gmail.com: On Thu, Aug 25, 2011 at 10:18 AM, Ilya Tocar tocarip.in...@gmail.com wrote: Changelog: 2011-08-25 Ilya Tocar ilya.to...@intel.com * config/i386/fmaintrin.h: New. * config.gcc: Add fmaintrin.h. * config/i386/i386.c * ix86_builtins (IX86_BUILTIN_VFMADDSS3): New. (IX86_BUILTIN_VFMADDSD3): Likewise. (enum ix86_builtins) IX86_...: New. IX86_...: Likewise. * config/i386/sse.md (fmai_vmfmadd_mode): New. (*fmai_fmadd_mode): Likewise. 
(*fmai_fmsub_mode): Likewise. (*fmai_fnmadd_mode): Likewise. (*fmai_fnmsub_mode): Likewise. * config/i386/x86intrin.h: Add fmaintrin.h. And Changelog for testsuite: 2011-08-25 Ilya Tocar ilya.to...@intel.com * gcc.target/i386/fma-check.h: New. * gcc.target/i386/fma-256-fmaddXX.c: New testcase. * gcc.target/i386/fma-256-fmaddsubXX.c: Likewise. * gcc.target/i386/fma-256-fmsubXX.c: Likewise. * gcc.target/i386/fma-256-fmsubaddXX.c: Likewise. * gcc.target/i386/fma-256-fnmaddXX.c: Likewise. * gcc.target/i386/fma-256-fnmsubXX.c: Likewise. * gcc.target/i386/fma-fmaddXX.c: Likewise. * gcc.target/i386/fma-fmaddsubXX.c: Likewise. * gcc.target/i386/fma-fmsubXX.c: Likewise. * gcc.target/i386/fma-fmsubaddXX.c: Likewise. * gcc.target/i386/fma-fnmaddXX.c: Likewise. * gcc.target/i386/fma-fnmsubXX.c: Likewise. * gcc.target/i386/fma-compile.c: Likewise. * gcc.target/i386/i386.exp (check_effective_target_fma): New. * gcc.target/i386/sse-12.c: Add -mfma. * gcc.target/i386/sse-13.c: Likewise. * gcc.target/i386/sse-14.c: Likewise. * gcc.target/i386/sse-22.c: Likewise. * gcc.target/i386/sse-23.c: Likewise. * gcc.target/i386/sse-13.c: Likewise. Duplicate. * g++.dg/other/i386-2.c: Likewise. *g++.dg/other/i386-2.C * g++.dg/other/i386-2.c: Likewise. * g++.dg/other/i386-3.C Uros.
Re: [PATCH, i386, testsuite] FMA intrinsics
On Thu, Aug 25, 2011 at 02:47:51PM +0400, Ilya Tocar wrote:

Sorry. Like this?

No.

* gcc.target/i386/sse-12.c: Add -mfma.
* gcc.target/i386/sse-13.c: Likewise.
* gcc.target/i386/sse-14.c: Likewise.
* gcc.target/i386/sse-22.c: Likewise.
* gcc.target/i386/sse-23.c: Likewise.
* gcc.target/i386/sse-13.c: Likewise.

The above line still needs to be removed.

* g++.dg/other/i386-2.c: Likewise.
* g++.dg/other/i386-3.c: Likewise.

And there is missing s/\.c/.C/ in the above two lines.

Jakub
Re: [PATCH, i386, testsuite] FMA intrinsics
Fixed.

Changelog:

2011-08-25 Ilya Tocar ilya.to...@intel.com

* config/i386/fmaintrin.h: New.
* config.gcc: Add fmaintrin.h.
* config/i386/i386.c (enum ix86_builtins) IX86_BUILTIN_VFMADDSS3: New.
IX86_BUILTIN_VFMADDSD3: Likewise.
* config/i386/sse.md (fmai_vmfmadd_mode): New.
(*fmai_fmadd_mode): Likewise.
(*fmai_fmsub_mode): Likewise.
(*fmai_fnmadd_mode): Likewise.
(*fmai_fnmsub_mode): Likewise.
* config/i386/x86intrin.h: Add fmaintrin.h.

And Changelog for testsuite:

2011-08-25 Ilya Tocar ilya.to...@intel.com

* gcc.target/i386/fma-check.h: New.
* gcc.target/i386/fma-256-fmaddXX.c: New testcase.
* gcc.target/i386/fma-256-fmaddsubXX.c: Likewise.
* gcc.target/i386/fma-256-fmsubXX.c: Likewise.
* gcc.target/i386/fma-256-fmsubaddXX.c: Likewise.
* gcc.target/i386/fma-256-fnmaddXX.c: Likewise.
* gcc.target/i386/fma-256-fnmsubXX.c: Likewise.
* gcc.target/i386/fma-fmaddXX.c: Likewise.
* gcc.target/i386/fma-fmaddsubXX.c: Likewise.
* gcc.target/i386/fma-fmsubXX.c: Likewise.
* gcc.target/i386/fma-fmsubaddXX.c: Likewise.
* gcc.target/i386/fma-fnmaddXX.c: Likewise.
* gcc.target/i386/fma-fnmsubXX.c: Likewise.
* gcc.target/i386/fma-compile.c: Likewise.
* gcc.target/i386/i386.exp (check_effective_target_fma): New.
* gcc.target/i386/sse-12.c: Add -mfma.
* gcc.target/i386/sse-13.c: Likewise.
* gcc.target/i386/sse-14.c: Likewise.
* gcc.target/i386/sse-22.c: Likewise.
* gcc.target/i386/sse-23.c: Likewise.
* g++.dg/other/i386-2.C: Likewise.
* g++.dg/other/i386-3.C: Likewise.
Re: Vector Comparison patch
On Thu, Aug 25, 2011 at 11:09 AM, Richard Guenther richard.guent...@gmail.com wrote:

On Thu, Aug 25, 2011 at 8:20 AM, Artem Shinkarov artyom.shinkar...@gmail.com wrote:

Here is a cleaned-up patch without the hook. Mostly it works in a way we discussed. So I think it is a right time to do something about vcond patterns, which would allow me to get rid of conversions that I need to put all over the code. Also at the moment the patch breaks lto frontend with a simple example:

#define vector(elcount, type) \
__attribute__((vector_size((elcount)*sizeof(type)))) type

int main (int argc, char *argv[])
{
  vector (4, float) f0;
  vector (4, float) f1;
  f0 = f1 != f0 ? (vector (4, float)){-1,-1,-1,-1} : (vector (4, float)){0,0,0,0};
  return (int)f0[argc];
}

test-lto.c:8:14: internal compiler error: in convert, at lto/lto-lang.c:1244

I looked into the file, the conversion function is defined as gcc_unreachable (). I am not very familiar with lto, so I don't really know what is the right way to treat the conversions.

And I seriously need help with backend patterns.

On the patch. The documentation needs review by a native english speaker, but here are some factual comments:

+In C vector comparison is supported within standard comparison operators:

it should read 'In GNU C' here and everywhere else as this is a GNU extension.

The result of the
+comparison is a signed integer-type vector where the size of each
+element must be the same as the size of compared vectors element.

The result type of the comparison is determined by the C frontend, it isn't under control of the user. What you are implying here is restrictions on vector assignments, which are documented elsewhere. I'd just say 'The result of the comparison is a vector of the same width and number of elements as the comparison operands with a signed integral element type.'

+In addition to the vector comparison C supports conditional expressions

See above.
+For the convenience condition in the vector conditional can be just a +vector of signed integer type. 'of integer type.' I don't see a reason to disallow unsigned integers, they can be equally well compared against zero. I'll have a final go on the documentation, it is untouched from the old patches. Index: gcc/targhooks.h === --- gcc/targhooks.h (revision 177665) +++ gcc/targhooks.h (working copy) @@ -86,6 +86,7 @@ extern int default_builtin_vectorization extern tree default_builtin_reciprocal (unsigned int, bool, bool); extern bool default_builtin_vector_alignment_reachable (const_tree, bool); + extern bool default_builtin_support_vector_misalignment (enum machine_mode mode, const_tree, spurious whitespace change. Yes, thanks. Index: gcc/optabs.c === --- gcc/optabs.c (revision 177665) +++ gcc/optabs.c (working copy) @@ -6572,16 +6572,36 @@ expand_vec_cond_expr (tree vec_cond_type ... + else + { + rtx rtx_op0; + rtx vec; + + rtx_op0 = expand_normal (op0); + comparison = gen_rtx_NE (mode, NULL_RTX, NULL_RTX); + vec = CONST0_RTX (mode); + + create_output_operand (ops[0], target, mode); + create_input_operand (ops[1], rtx_op1, mode); + create_input_operand (ops[2], rtx_op2, mode); + create_input_operand (ops[3], comparison, mode); + create_input_operand (ops[4], rtx_op0, mode); + create_input_operand (ops[5], vec, mode); this still builds the fake(?) != comparison, but as you said you need help with the .md part if we want to use a machine specific pattern for this case (which we eventually want, for the sake of using XOP vcond). Yes, I am waiting for it. This is the only way at the moment to make sure that in m = a b; r = m ? c : d; m in the vcond is not transformed into the m != 0. Index: gcc/target.h === --- gcc/target.h (revision 177665) +++ gcc/target.h (working copy) @@ -51,6 +51,7 @@ #define GCC_TARGET_H #include insn-modes.h +#include gimple.h #ifdef ENABLE_CHECKING spurious change. Old stuff, fixed. 
@@ -9073,26 +9082,28 @@ fold_comparison (location_t loc, enum tr floating-point, we can only do some of these simplifications.) */ if (operand_equal_p (arg0, arg1, 0)) { + tree arg0_type = TREE_TYPE (arg0); + switch (code) { case EQ_EXPR: - if (! FLOAT_TYPE_P (TREE_TYPE (arg0)) - || ! HONOR_NANS (TYPE_MODE (TREE_TYPE (arg0 + if (! FLOAT_TYPE_P (arg0_type) + || ! HONOR_NANS (TYPE_MODE (arg0_type))) ... Ok. Likewise. @@ -8440,6 +8440,37 @@ expand_expr_real_2 (sepops ops, rtx targ case UNGE_EXPR: case UNEQ_EXPR: case
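For readers following the thread, here is a minimal sketch of the semantics under review: comparing two vectors yields a signed integer vector whose elements are -1 where the comparison holds and 0 where it does not. It assumes a compiler that accepts GNU C vector comparisons and vector subscripting. The names vec_ne, vec_select and select_lane are made up for this illustration and do not come from the patch. The select is written with bitwise operations so that it does not depend on vector ?: being accepted by the C front end.

```c
#include <assert.h>

/* GNU C vector types: four floats and four 32-bit ints, 16 bytes each.  */
typedef float v4sf __attribute__ ((vector_size (16)));
typedef int v4si __attribute__ ((vector_size (16)));

/* Element-wise f1 != f0.  Each result element is -1 (all bits set)
   where the comparison holds and 0 where it does not.  */
v4si
vec_ne (v4sf f1, v4sf f0)
{
  return f1 != f0;
}

/* Element-wise select for a mask whose elements are -1 or 0:
   result[i] = mask[i] ? a[i] : b[i].  */
v4si
vec_select (v4si mask, v4si a, v4si b)
{
  return (mask & a) | (~mask & b);
}

/* Hypothetical helper: build a small example and read one lane of it.  */
int
select_lane (int lane)
{
  assert (lane >= 0 && lane < 4);
  v4sf f0 = { 1.0f, 2.0f, 3.0f, 4.0f };
  v4sf f1 = { 1.0f, 0.0f, 3.0f, 0.0f };
  v4si yes = { 10, 11, 12, 13 };
  v4si no = { 20, 21, 22, 23 };
  v4si r = vec_select (vec_ne (f1, f0), yes, no);
  return r[lane];
}
```

Lanes 0 and 2 compare equal, so they take their value from the second operand; lanes 1 and 3 differ, so they take it from the first.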
[PATCH, i386]: Remove Y2, Y3 and Y4 register constraints
Hello!

Modernize i386 md files by using the enabled attribute instead of the Y2, Y3 and Y4 conditional register constraints. I will investigate other conditional register constraints as well.

2011-08-25  Uros Bizjak  ubiz...@gmail.com

        * config/i386/i386.md (isa): Add sse2, sse2_noavx, sse3, sse4 and sse4_noavx.
        (enabled): Handle sse2, sse2_noavx, sse3, sse4 and sse4_noavx.
        (*pushdf_rex64): Change Y2 register constraint to x.
        (*movdf_internal_rex64): Ditto.
        (*zero_extendsidi2_rex64): Ditto.
        (*movdi_internal): Change Y2 register constraint to x and add isa attribute.
        (*pushdf): Ditto.
        (*movdf internal): Ditto.
        (zero_extendsidi2_1): Ditto.
        (*truncdfdf_mixed): Ditto.
        (*truncxfdf2_mixed): Ditto.
        * config/i386/mmx.md (*mov<mode>_internal_rex64): Change Y2 register constraint to x.
        (*movv2sf_internal_rex64): Ditto.
        (*mov<mode>_internal): Change Y2 register constraint to x and add isa attribute.
        (*movv2sf_internal): Ditto.
        (*vec_extractv2si_1): Ditto.
        * config/i386/sse.md (vec_set<mode>_0): Change Y2 and Y4 register constraints to x and update isa attribute.
        (*vec_interleave_highv2df): Change Y3 register constraint to x and update isa attribute.
        (*vec_interleave_lowv2df): Ditto.
        (*vec_concatv2df): Change Y2 register constraint to x and update isa attribute.
        (sse2_loadld): Ditto.
        (*vec_extractv2di_1): Ditto.
        (*vec_dupv4si): Ditto.
        (*vec_dupv2di): Ditto.
        (*vec_concatv4si): Ditto.
        (vec_concatv2di): Ditto.
        * config/i386/constraints.md (Y2): Remove.
        (Y3): Ditto.
        (Y4): Ditto.

Tested on x86_64-pc-linux-gnu {,-m32}. I will wait for any comments before committing the patch to mainline SVN.

Uros.

Index: i386.md
===
--- i386.md (revision 178053)
+++ i386.md (working copy)
@@ -711,11 +711,17 @@
 (define_attr movu 0,1 (const_string 0))

 ;; Used to control the enabled attribute on a per-instruction basis.
-(define_attr isa base,noavx,avx,bmi2 +(define_attr isa base,sse2,sse2_noavx,sse3,sse4,sse4_noavx,noavx,avx,bmi2 (const_string base)) (define_attr enabled - (cond [(eq_attr isa noavx) (symbol_ref !TARGET_AVX) + (cond [(eq_attr isa sse2) (symbol_ref TARGET_SSE2) +(eq_attr isa sse2_noavx) + (symbol_ref TARGET_SSE2 !TARGET_AVX) +(eq_attr isa sse3) (symbol_ref TARGET_SSE3) +(eq_attr isa sse4) (symbol_ref TARGET_SSE4_1) +(eq_attr isa sse4_noavx) + (symbol_ref TARGET_SSE4_1 !TARGET_AVX) (eq_attr isa avx) (symbol_ref TARGET_AVX) (eq_attr isa bmi2) (symbol_ref TARGET_BMI2) ] @@ -2153,9 +2159,9 @@ (define_insn *movdi_internal [(set (match_operand:DI 0 nonimmediate_operand - =r ,o ,*y,m*y,*y,*Y2,m ,*Y2,*Y2,*x,m ,*x,*x,?*Y2,?*Ym) + =r ,o ,*y,m*y,*y,*x,m ,*x,*x,*x,m ,*x,*x,?*x,?*Ym) (match_operand:DI 1 general_operand - riFo,riF,C ,*y ,m ,C ,*Y2,*Y2,m ,C ,*x,*x,m ,*Ym ,*Y2))] + riFo,riF,C ,*y ,m ,C ,*x,*x,m ,C ,*x,*x,m ,*Ym,*x))] !TARGET_64BIT !(MEM_P (operands[0]) MEM_P (operands[1])) { switch (get_attr_type (insn)) @@ -2198,9 +2204,12 @@ } } [(set (attr isa) - (if_then_else (eq_attr alternative 9,10,11,12) - (const_string noavx) - (const_string *))) + (cond [(eq_attr alternative 5,6,7,8,13,14) + (const_string sse2) + (eq_attr alternative 9,10,11,12) + (const_string noavx) + ] + (const_string *))) (set (attr type) (cond [(eq_attr alternative 0,1) (const_string multi) @@ -2770,7 +2779,7 @@ (define_insn *pushdf_rex64 [(set (match_operand:DF 0 push_operand =,,) - (match_operand:DF 1 general_no_elim_operand f,Yd*rFm,Y2))] + (match_operand:DF 1 general_no_elim_operand f,Yd*rFm,x))] TARGET_64BIT { /* This insn should be already split before reg-stack. */ @@ -2786,13 +2795,14 @@ (define_insn *pushdf [(set (match_operand:DF 0 push_operand =,,) - (match_operand:DF 1 general_no_elim_operand f,Yd*rFo,Y2))] + (match_operand:DF 1 general_no_elim_operand f,Yd*rFo,x))] !TARGET_64BIT { /* This insn should be already split before reg-stack. 
*/ gcc_unreachable (); } - [(set_attr type multi) + [(set_attr isa *,*,sse2) + (set_attr type multi) (set_attr unit i387,*,*) (set_attr mode DF,DI,DF)]) @@ -2976,9 +2986,9 @@ (define_insn *movdf_internal_rex64 [(set (match_operand:DF 0 nonimmediate_operand - =f,m,f,?r,?m,?r,!o,Y2*x,Y2*x,Y2*x,m ,Yi,r ) + =f,m,f,?r,?m,?r,!o,x,x,x,m,Yi,r ) (match_operand:DF 1 general_operand - fm,f,G,rm,r ,F ,F ,C ,Y2*x,m ,Y2*x,r ,Yi))] + fm,f,G,rm,r ,F ,F ,C,x,m,x,r ,Yi))] TARGET_64BIT
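The mapping that the new isa values give to the enabled attribute can be modelled in plain C, as a reading aid for the i386.md hunk above. This is only a sketch: struct target_flags, enum isa and alternative_enabled are hypothetical stand-ins for the TARGET_* macros and the md attribute, and the "enabled by default" case for base is an assumption, since the default arm of the cond is not visible in the quoted diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for TARGET_SSE2, TARGET_SSE3, TARGET_SSE4_1,
   TARGET_AVX and TARGET_BMI2.  */
struct target_flags
{
  bool sse2, sse3, sse4_1, avx, bmi2;
};

/* The values of the isa attribute after the patch.  */
enum isa
{
  ISA_BASE, ISA_SSE2, ISA_SSE2_NOAVX, ISA_SSE3, ISA_SSE4,
  ISA_SSE4_NOAVX, ISA_NOAVX, ISA_AVX, ISA_BMI2
};

/* Model of the enabled attribute: is an alternative tagged with ISA
   usable on a target described by T?  */
bool
alternative_enabled (enum isa isa, const struct target_flags *t)
{
  assert (t != NULL);
  switch (isa)
    {
    case ISA_SSE2:       return t->sse2;
    case ISA_SSE2_NOAVX: return t->sse2 && !t->avx;
    case ISA_SSE3:       return t->sse3;
    case ISA_SSE4:       return t->sse4_1;
    case ISA_SSE4_NOAVX: return t->sse4_1 && !t->avx;
    case ISA_NOAVX:      return !t->avx;
    case ISA_AVX:        return t->avx;
    case ISA_BMI2:       return t->bmi2;
    default:             return true;  /* Assumed: base is always enabled.  */
    }
}
```

Under this model, an alternative that used to carry a Y2 constraint becomes a plain x alternative tagged sse2, and is simply switched off when SSE2 is not available.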
Re: Vector Comparison patch
On Thu, Aug 25, 2011 at 1:07 PM, Artem Shinkarov artyom.shinkar...@gmail.com wrote: On Thu, Aug 25, 2011 at 11:09 AM, Richard Guenther richard.guent...@gmail.com wrote: On Thu, Aug 25, 2011 at 8:20 AM, Artem Shinkarov artyom.shinkar...@gmail.com wrote: Here is a cleaned-up patch without the hook. Mostly it works in a way we discussed. So I think it is a right time to do something about vcond patterns, which would allow me to get rid of conversions that I need to put all over the code. Also at the moment the patch breaks lto frontend with a simple example: #define vector(elcount, type) \ __attribute__((vector_size((elcount)*sizeof(type type int main (int argc, char *argv[]) { vector (4, float) f0; vector (4, float) f1; f0 = f1 != f0 ? (vector (4, float)){-1,-1,-1,-1} : (vector (4, float)){0,0,0,0}; return (int)f0[argc]; } test-lto.c:8:14: internal compiler error: in convert, at lto/lto-lang.c:1244 I looked into the file, the conversion function is defined as gcc_unreachable (). I am not very familiar with lto, so I don't really know what is the right way to treat the conversions. And I seriously need help with backend patterns. On the patch. The documentation needs review by a native english speaker, but here are some factual comments: +In C vector comparison is supported within standard comparison operators: it should read 'In GNU C' here and everywhere else as this is a GNU extension. The result of the +comparison is a signed integer-type vector where the size of each +element must be the same as the size of compared vectors element. The result type of the comparison is determined by the C frontend, it isn't under control of the user. What you are implying here is restrictions on vector assignments, which are documented elsewhere. I'd just say 'The result of the comparison is a vector of the same width and number of elements as the comparison operands with a signed integral element type.' 
+In addition to the vector comparison C supports conditional expressions See above. +For the convenience condition in the vector conditional can be just a +vector of signed integer type. 'of integer type.' I don't see a reason to disallow unsigned integers, they can be equally well compared against zero. I'll have a final go on the documentation, it is untouched from the old patches. Index: gcc/targhooks.h === --- gcc/targhooks.h (revision 177665) +++ gcc/targhooks.h (working copy) @@ -86,6 +86,7 @@ extern int default_builtin_vectorization extern tree default_builtin_reciprocal (unsigned int, bool, bool); extern bool default_builtin_vector_alignment_reachable (const_tree, bool); + extern bool default_builtin_support_vector_misalignment (enum machine_mode mode, const_tree, spurious whitespace change. Yes, thanks. Index: gcc/optabs.c === --- gcc/optabs.c (revision 177665) +++ gcc/optabs.c (working copy) @@ -6572,16 +6572,36 @@ expand_vec_cond_expr (tree vec_cond_type ... + else + { + rtx rtx_op0; + rtx vec; + + rtx_op0 = expand_normal (op0); + comparison = gen_rtx_NE (mode, NULL_RTX, NULL_RTX); + vec = CONST0_RTX (mode); + + create_output_operand (ops[0], target, mode); + create_input_operand (ops[1], rtx_op1, mode); + create_input_operand (ops[2], rtx_op2, mode); + create_input_operand (ops[3], comparison, mode); + create_input_operand (ops[4], rtx_op0, mode); + create_input_operand (ops[5], vec, mode); this still builds the fake(?) != comparison, but as you said you need help with the .md part if we want to use a machine specific pattern for this case (which we eventually want, for the sake of using XOP vcond). Yes, I am waiting for it. This is the only way at the moment to make sure that in m = a b; r = m ? c : d; m in the vcond is not transformed into the m != 0. 
Index: gcc/target.h === --- gcc/target.h (revision 177665) +++ gcc/target.h (working copy) @@ -51,6 +51,7 @@ #define GCC_TARGET_H #include insn-modes.h +#include gimple.h #ifdef ENABLE_CHECKING spurious change. Old stuff, fixed. @@ -9073,26 +9082,28 @@ fold_comparison (location_t loc, enum tr floating-point, we can only do some of these simplifications.) */ if (operand_equal_p (arg0, arg1, 0)) { + tree arg0_type = TREE_TYPE (arg0); + switch (code) { case EQ_EXPR: - if (! FLOAT_TYPE_P (TREE_TYPE (arg0)) - || ! HONOR_NANS (TYPE_MODE (TREE_TYPE (arg0 + if (! FLOAT_TYPE_P (arg0_type) + || ! HONOR_NANS (TYPE_MODE (arg0_type))) ... Ok. Likewise. @@ -8440,6 +8440,37 @@
Re: ivopts improvement
Hi Zdenek,

here's the updated version of the patch. The goal is to remove the 'i' iterator from the following example, by replacing 'i < n' with 'p < base + n'.

    void
    f (char *base, unsigned long int i, unsigned long int n)
    {
      char *p = base + i;
      do
        {
          *p = '\0';
          p++;
          i++;
        }
      while (i < n);
    }

Bootstrapped and reg-tested on x86_64, and built and reg-tested on MIPS. I will send a test-case in a separate email.

OK for trunk?

Thanks,
- Tom

2011-08-25  Zdenek Dvorak  o...@ucw.cz
            Tom de Vries  t...@codesourcery.com

        * tree-ssa-loop-ivopts.c (struct cost_pair): Add comp field.
        (struct ivopts_data): Add loop_single_exit_p field.
        (niter_for_exit): Change parameter desc_p into return value. Return desc if desc->may_be_zero. Free desc if unused.
        (niter_for_single_dom_exit): Change return type.
        (find_induction_variables): Handle changed return type of niter_for_single_dom_exit. Dump may_be_zero.
        (add_candidate_1): Keep original base and step type for IP_ORIGINAL.
        (set_use_iv_cost): Add and handle comp parameter.
        (determine_use_iv_cost_generic, determine_use_iv_cost_address): Add comp argument to set_use_iv_cost.
        (strip_wrap_conserving_type_conversions, expr_equal_p)
        (difference_cannot_overflow_p, iv_elimination_compare_lt): New function.
        (may_eliminate_iv): Add comp parameter. Handle new return type of niter_for_exit. Use loop_single_exit_p. Use iv_elimination_compare_lt.
        (determine_use_iv_cost_condition): Add comp argument to set_use_iv_cost and may_eliminate_iv.
        (rewrite_use_compare): Move call to iv_elimination_compare to ...
        (may_eliminate_iv): Here.
        (tree_ssa_iv_optimize_loop): Initialize loop_single_exit_p.

Index: gcc/tree-ssa-loop-ivopts.c
===
--- gcc/tree-ssa-loop-ivopts.c (revision 176554)
+++ gcc/tree-ssa-loop-ivopts.c (working copy)
@@ -176,6 +176,7 @@ struct cost_pair
 tree value; /* For final value elimination, the expression for the final value of the iv. For iv elimination, the new bound to compare with. */
+ enum tree_code comp; /* For iv elimination, the comparison.
*/ int inv_expr_id; /* Loop invariant expression id. */ }; @@ -297,6 +298,9 @@ struct ivopts_data /* Whether the loop body includes any function calls. */ bool body_includes_call; + + /* Whether the loop body can only be exited via single exit. */ + bool loop_single_exit_p; }; /* An assignment of iv candidates to uses. */ @@ -770,15 +774,13 @@ contains_abnormal_ssa_name_p (tree expr) return false; } -/* Returns tree describing number of iterations determined from +/* Returns the structure describing number of iterations determined from EXIT of DATA-current_loop, or NULL if something goes wrong. */ -static tree -niter_for_exit (struct ivopts_data *data, edge exit, -struct tree_niter_desc **desc_p) +static struct tree_niter_desc * +niter_for_exit (struct ivopts_data *data, edge exit) { - struct tree_niter_desc* desc = NULL; - tree niter; + struct tree_niter_desc *desc; void **slot; if (!data-niters) @@ -791,37 +793,31 @@ niter_for_exit (struct ivopts_data *data if (!slot) { - /* Try to determine number of iterations. We must know it - unconditionally (i.e., without possibility of # of iterations - being zero). Also, we cannot safely work with ssa names that - appear in phi nodes on abnormal edges, so that we do not create - overlapping life ranges for them (PR 27283). */ + /* Try to determine number of iterations. We cannot safely work with ssa + names that appear in phi nodes on abnormal edges, so that we do not + create overlapping life ranges for them (PR 27283). 
*/ desc = XNEW (struct tree_niter_desc); - if (number_of_iterations_exit (data-current_loop, - exit, desc, true) - integer_zerop (desc-may_be_zero) - !contains_abnormal_ssa_name_p (desc-niter)) - niter = desc-niter; - else - niter = NULL_TREE; - - desc-niter = niter; + if (!number_of_iterations_exit (data-current_loop, + exit, desc, true) + || contains_abnormal_ssa_name_p (desc-niter)) + { + XDELETE (desc); + desc = NULL; + } slot = pointer_map_insert (data-niters, exit); *slot = desc; } else -niter = ((struct tree_niter_desc *) *slot)-niter; +desc = (struct tree_niter_desc *) *slot; - if (desc_p) -*desc_p = (struct tree_niter_desc *) *slot; - return niter; + return desc; } -/* Returns tree describing number of iterations determined from +/* Returns the structure describing number of iterations determined from single dominating exit of DATA-current_loop, or NULL if something goes wrong. */ -static tree +static struct tree_niter_desc *
[PATCH, testsuite] Avoid architecture options conflict for case pr42894.c
Hello,

I think it is useful to run this case for newer arm targets, so the patch prunes the architecture-conflict warning from the test output. Is it ok to commit to trunk?

BR,
Terry

gcc/testsuite/ChangeLog:

2011-08-25  Terry Guo  terry@arm.com

        * gcc.dg/tls/pr42894.c: Add dg-prune-output to skip architecture conflict.

diff --git a/gcc/testsuite/gcc.dg/tls/pr42894.c b/gcc/testsuite/gcc.dg/tls/pr42894.c
index c3bd76c..cda6719 100644
--- a/gcc/testsuite/gcc.dg/tls/pr42894.c
+++ b/gcc/testsuite/gcc.dg/tls/pr42894.c
@@ -2,6 +2,7 @@
 /* { dg-do compile } */
 /* { dg-options "-march=armv5te -mthumb" { target arm*-*-* } } */
 /* { dg-require-effective-target tls } */
+/* { dg-prune-output "switch .* conflicts with" } */
 extern __thread int t;
Re: ivopts improvement
On 08/25/2011 01:42 PM, Tom de Vries wrote:

Hi Zdenek, here's the updated version of the patch. The goal is to remove the 'i' iterator from the following example, by replacing 'i < n' with 'p < base + n'.

    void
    f (char *base, unsigned long int i, unsigned long int n)
    {
      char *p = base + i;
      do
        {
          *p = '\0';
          p++;
          i++;
        }
      while (i < n);
    }

Bootstrapped and reg-tested on x86_64, and built and reg-tested on MIPS. I will send a test-case in a separate email. OK for trunk?

2011-08-25  Tom de Vries  t...@codesourcery.com

        * gcc.dg/tree-ssa/ivopts-lt.c: New test.

Index: gcc/testsuite/gcc.dg/tree-ssa/ivopts-lt.c
===
--- /dev/null (new file)
+++ gcc/testsuite/gcc.dg/tree-ssa/ivopts-lt.c (revision 0)
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ivopts" } */
+
+void
+f1 (char *p, unsigned long int i, unsigned long int n)
+{
+  p += i;
+  do
+    {
+      *p = '\0';
+      p += 1;
+      i++;
+    }
+  while (i < n);
+}
+
+/* { dg-final { scan-tree-dump-times "PHI" 1 "ivopts"} } */
+/* { dg-final { scan-tree-dump-times "PHI <p_" 1 "ivopts"} } */
+/* { dg-final { scan-tree-dump-times "p_\[0-9\]* <" 1 "ivopts"} } */
+/* { dg-final { cleanup-tree-dump "ivopts" } } */
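To make the intended transformation concrete, the two loop shapes can be written side by side by hand. This is only an illustration of the source-level effect described in the message above, not output of the patch. The function names are invented here, and the hand-written second form assumes i < n and that base + n does not overflow; when the compiler may actually perform the rewrite is decided by the checks in the patch itself.

```c
#include <assert.h>

/* Original shape: both the pointer p and the counter i are updated
   on every iteration, and the exit test uses i.  */
void
zero_with_counter (char *base, unsigned long i, unsigned long n)
{
  char *p = base + i;
  do
    {
      *p = '\0';
      p++;
      i++;
    }
  while (i < n);
}

/* Shape after eliminating i: the exit test compares p against the
   loop-invariant bound base + n, so i is no longer needed in the loop.  */
void
zero_with_pointer_bound (char *base, unsigned long i, unsigned long n)
{
  assert (i < n);               /* Assumption made by this sketch.  */
  char *p = base + i;
  char *end = base + n;
  do
    {
      *p = '\0';
      p++;
    }
  while (p < end);
}
```

Both functions clear the bytes base[i] through base[n - 1]; the second one keeps a single induction variable in the loop.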
[PATCH, MELT] meltgendoc.texi failure
Hello, The meltgendoc.texi target can fail, returning a value of 1 even though the file has been written correctly. The attached patch works around this.
[PATCH, MELT] Fixing texi file for MELT Plugin API
Hello, A couple of nodes in the texi file for the MELT Plugin API were not present in the menu, hence makeinfo failed. Fixing with help from Patrice Dumas pertu...@free.fr.
[PATCH] [MELT] Fix nodes in MELT plugin API texi
Several nodes were not present in the menu, fixing. Thanks to Patrice Dumas pertu...@free.fr. --- contrib/meltpluginapi.texi |3 +++ gcc/ChangeLog.MELT |4 2 files changed, 7 insertions(+), 0 deletions(-) diff --git a/contrib/meltpluginapi.texi b/contrib/meltpluginapi.texi index b2d843f..b71ec38 100644 --- a/contrib/meltpluginapi.texi +++ b/contrib/meltpluginapi.texi @@ -84,6 +84,9 @@ Additional tutorial information for GCC is linked to from @menu * MELT Programming Reference::the MELT API. +* GNU Project:: +* Copying:: the GPLv3 license. +* MELT API Index:: @end menu @c the meltgendoc.texi is generated from various *.melt source files diff --git a/gcc/ChangeLog.MELT b/gcc/ChangeLog.MELT index c801737..ceb0d04 100644 --- a/gcc/ChangeLog.MELT +++ b/gcc/ChangeLog.MELT @@ -1,4 +1,8 @@ 2011-08-25 Alexandre Lissy ali...@mandriva.com + * contrib/meltpluginapi.texi: Fix nodes (thanks to Patrice Dumas + pertu...@free.fr + +2011-08-25 Alexandre Lissy ali...@mandriva.com * gcc/melt-build.tpl: Allow build to fail ; meltgentdoc.texi target might return 1 while the generation was successful
RE: [PATCH][RFC] Fix PR49957 - build array index differently
This patch causes regression in one of the Spec2000 benchmarks on arm-none-linux-gnueabi cortex-a9. The benchmark 173.applu from CFP2000 dropped performance by about 8% between revisions 177367 and 177368. Other benchmarks are not affected. In the assembly generated by the new version, the two main subroutines of applu (blts and buts) seem to use the stack a lot more. I couldn't localize the problem to a single loop. Perhaps the array index optimization increases register pressure? Thanks, Greta -Original Message- From: Richard Guenther [mailto:rguent...@suse.de] Sent: 03 August 2011 14:48 To: gcc-patches@gcc.gnu.org Cc: fort...@gcc.gnu.org Subject: [PATCH][RFC] Fix PR49957 - build array index differently This fixes PR49957 by keeping the array index into a multi-dimensional array in optimal associated form which is ((off + outermost) + ...) + innermost) + constant) so that dependence analysis can properly handle it. It doesn't work right now because we build the expression in reverse order, fold thinks it can do some fancy and the expression is of signed type and thus we know it doesn't overflow but we also won't re-associate it to a more optimal form. I tried reversing the loop in gfc_conv_array_ref but that doesn't work (for example aliasing_dummy_4.f90 ICEs), thus the funny way of chaining the pluses. I also don't know if there is maybe another place we build similar expressions that should be adjusted, too - this one is where we build the expression for the testcase I looked at. The patch doesn't make 410.bwaves measurably faster, but at least it also doesn't get slower. Bootstrap and regtest is currently running on x86_64-unknown-linux-gnu, the reversed loop one was ok (well, apart from those 2-3 fails). Comments? Any idea why reversing the loop would break? 
The ICE I got is /space/rguenther/src/svn/trunk/gcc/testsuite/gfortran.dg/aliasing_dummy _4.f90: In function 'test_f90': /space/rguenther/src/svn/trunk/gcc/testsuite/gfortran.dg/aliasing_dummy _4.f90:21:0: internal compiler error: in gfc_conv_constant, at fortran/trans-const.c:387 Please submit a full bug report, with preprocessed source if appropriate. See http://gcc.gnu.org/bugs.html for instructions. Thanks, Richard. 2011-08-03 Richard Guenther rguent...@suse.de PR fortran/49957 * trans-array.c (gfc_conv_array_ref): Build the array index expression in optimally associated order. * gfortran.dg/vect/O3-pr49957.f: New testcase. Index: gcc/fortran/trans-array.c === --- gcc/fortran/trans-array.c (revision 177094) +++ gcc/fortran/trans-array.c (working copy) @@ -2634,7 +2634,7 @@ gfc_conv_array_ref (gfc_se * se, gfc_arr locus * where) { int n; - tree index; + tree index, offset, *indexp; tree tmp; tree stride; gfc_se indexse; @@ -2669,9 +2669,16 @@ gfc_conv_array_ref (gfc_se * se, gfc_arr return; } - index = gfc_index_zero_node; + offset = gfc_conv_array_offset (se-expr); + if (TREE_CODE (offset) != INTEGER_CST) +index = offset; + else +index = gfc_index_zero_node; - /* Calculate the offsets from all the dimensions. */ + indexp = index; + + /* Calculate the offsets from all the dimensions. Make sure to associate + the final offset so that we form a chain of loop invariant summands. */ for (n = 0; n ar-dimen; n++) { /* Calculate the index for this dimension. */ @@ -2740,15 +2747,38 @@ gfc_conv_array_ref (gfc_se * se, gfc_arr tmp = fold_build2_loc (input_location, MULT_EXPR, gfc_array_index_type, indexse.expr, stride); - /* And add it to the total. */ - index = fold_build2_loc (input_location, PLUS_EXPR, -gfc_array_index_type, index, tmp); + /* And add it to the total. Avoid folding as that re-associates + in a non-optimal way. We want to have constant offsets as + the outermost addition and the rest of the additions in order + of the loop depth. 
*/ + if (!integer_zerop (index)) + { + if (TREE_CODE (tmp) == INTEGER_CST) + { + bool reset = indexp == index; + index = fold_build2_loc (input_location, PLUS_EXPR, +gfc_array_index_type, index, tmp); + if (reset) + indexp = index; + } + else + { + *indexp + = build2_loc (input_location, PLUS_EXPR, + gfc_array_index_type, *indexp, tmp); + indexp = TREE_OPERAND (*indexp, 0); + } + } + else + { + index = tmp; + indexp = index; + } } - tmp = gfc_conv_array_offset (se-expr); - if (!integer_zerop (tmp)) + if (TREE_CODE (offset) == INTEGER_CST) index =
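The association order that the PR49957 message asks for, with the offset first, then one summand per dimension from the outermost loop to the innermost, and the constant last, can be spelled out as ordinary C arithmetic. This is a sketch of the shape of the expression only: flat_index and its parameters are invented for the illustration, the inputs are assumed to be ordered from outermost to innermost loop, and a C compiler remains free to reassociate integer additions as it sees fit.

```c
#include <assert.h>

/* Build (((off + idx[0]*stride[0]) + idx[1]*stride[1]) + ...) + cst,
   adding one summand per dimension in the order given.  */
long
flat_index (long off, const long *idx, const long *stride, int rank,
            long cst)
{
  assert (rank > 0);
  long sum = off;
  for (int n = 0; n < rank; n++)
    sum = sum + idx[n] * stride[n];   /* Chain grows to the right.  */
  return sum + cst;                   /* Constant is the outermost add.  */
}
```

With the chain built this way, every prefix of the sum only involves the offset and the outer dimensions, which is what makes it a chain of loop-invariant summands from the point of view of an inner loop.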
Re: [pph] Independent pre-loaded cache for common nodes (issue 4956041)
OK with a couple of nits below. Diego. http://codereview.appspot.com/4956041/diff/1/gcc/cp/pph-streamer-in.c File gcc/cp/pph-streamer-in.c (right): http://codereview.appspot.com/4956041/diff/1/gcc/cp/pph-streamer-in.c#newcode155 gcc/cp/pph-streamer-in.c:155: || marker == PPH_RECORD_PREF; 154 return marker == PPH_RECORD_IREF || marker == PPH_RECORD_XREF 155 || marker == PPH_RECORD_PREF; Align these vertically. http://codereview.appspot.com/4956041/diff/1/gcc/cp/pph-streamer-in.c#newcode197 gcc/cp/pph-streamer-in.c:197: || marker == PPH_RECORD_PREF) 154 return marker == PPH_RECORD_IREF || marker == PPH_RECORD_XREF 155 || marker == PPH_RECORD_PREF; Likewise. http://codereview.appspot.com/4956041/diff/1/gcc/cp/pph-streamer.c File gcc/cp/pph-streamer.c (right): http://codereview.appspot.com/4956041/diff/1/gcc/cp/pph-streamer.c#newcode530 gcc/cp/pph-streamer.c:530: enum pph_record_marker marker) 529 pph_cache_get (pph_pickle_cache *stream_cache, unsigned include_ix, unsigned ix, 530enum pph_record_marker marker) Do we really need the MARKER argument? If STREAM_CACHE is NULL, we use INCLUDE_IX to determine if they want an external reference or the preloaded cache. Though, I guess adding MARKER does not hurt and makes the code easier to read. Never mind. http://codereview.appspot.com/4956041/
Re: -Wshadow warning
On Wed, Aug 24, 2011 at 9:32 PM, Alan Modra amo...@gmail.com wrote:

Wouldn't -Wshadow be more useful if it obeyed -Wno-system-headers? For code like

    #include <stdlib.h>
    int foo (int atof);
    int foo (int atof) { return atof; }

we currently do not warn on the prototype, but do on the function definition, leading to reports such as http://sourceware.org/bugzilla/show_bug.cgi?id=13121

The following has been bootstrapped and regression tested on powerpc-linux. OK to apply?

OK.

        * c-decl.c (warn_if_shadowing): Don't warn if shadowed identifier is from system header.

Index: gcc/c-decl.c
===
--- gcc/c-decl.c (revision 178035)
+++ gcc/c-decl.c (working copy)
@@ -2516,7 +2516,10 @@ warn_if_shadowing (tree new_decl)
   /* Is anything being shadowed? Invisible decls do not count. */
   for (b = I_SYMBOL_BINDING (DECL_NAME (new_decl)); b; b = b->shadowed)
-    if (b->decl && b->decl != new_decl && !b->invisible)
+    if (b->decl && b->decl != new_decl && !b->invisible
+        && (b->decl == error_mark_node
+            || diagnostic_report_warnings_p (global_dc,
+                                             DECL_SOURCE_LOCATION (b->decl))))
       {
         tree old_decl = b->decl;

--
Alan Modra
Australia Development Lab, IBM
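A slightly extended version of the example from the message above shows why the shadowing is harmless: the parameter named atof hides the library function only inside foo. The helper parse_then_foo is made up for this illustration. Whether -Wshadow reports the definition of foo depends on the compiler version and on whether the change discussed here is applied, so no expected diagnostics are given.

```c
#include <assert.h>
#include <stdlib.h>

/* The parameter name shadows atof, which <stdlib.h> declares.  */
int foo (int atof);

int
foo (int atof)
{
  /* Inside this function, atof is the int parameter.  */
  return atof;
}

/* Outside foo, the name atof still refers to the library function.  */
int
parse_then_foo (const char *text)
{
  assert (text != NULL);
  return foo ((int) atof (text));
}
```

The program is valid C either way; the question in the thread is only whether a shadowed declaration that lives in a system header should be worth a warning.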
Re: Vector Comparison patch
On Thu, Aug 25, 2011 at 2:45 PM, Artem Shinkarov artyom.shinkar...@gmail.com wrote: On Thu, Aug 25, 2011 at 12:39 PM, Richard Guenther richard.guent...@gmail.com wrote: On Thu, Aug 25, 2011 at 1:07 PM, Artem Shinkarov artyom.shinkar...@gmail.com wrote: On Thu, Aug 25, 2011 at 11:09 AM, Richard Guenther richard.guent...@gmail.com wrote: On Thu, Aug 25, 2011 at 8:20 AM, Artem Shinkarov artyom.shinkar...@gmail.com wrote: Here is a cleaned-up patch without the hook. Mostly it works in a way we discussed. So I think it is a right time to do something about vcond patterns, which would allow me to get rid of conversions that I need to put all over the code. Also at the moment the patch breaks lto frontend with a simple example: #define vector(elcount, type) \ __attribute__((vector_size((elcount)*sizeof(type type int main (int argc, char *argv[]) { vector (4, float) f0; vector (4, float) f1; f0 = f1 != f0 ? (vector (4, float)){-1,-1,-1,-1} : (vector (4, float)){0,0,0,0}; return (int)f0[argc]; } test-lto.c:8:14: internal compiler error: in convert, at lto/lto-lang.c:1244 I looked into the file, the conversion function is defined as gcc_unreachable (). I am not very familiar with lto, so I don't really know what is the right way to treat the conversions. And I seriously need help with backend patterns. On the patch. The documentation needs review by a native english speaker, but here are some factual comments: +In C vector comparison is supported within standard comparison operators: it should read 'In GNU C' here and everywhere else as this is a GNU extension. The result of the +comparison is a signed integer-type vector where the size of each +element must be the same as the size of compared vectors element. The result type of the comparison is determined by the C frontend, it isn't under control of the user. What you are implying here is restrictions on vector assignments, which are documented elsewhere. 
I'd just say 'The result of the comparison is a vector of the same width and number of elements as the comparison operands with a signed integral element type.' +In addition to the vector comparison C supports conditional expressions See above. +For the convenience condition in the vector conditional can be just a +vector of signed integer type. 'of integer type.' I don't see a reason to disallow unsigned integers, they can be equally well compared against zero. I'll have a final go on the documentation, it is untouched from the old patches. Index: gcc/targhooks.h === --- gcc/targhooks.h (revision 177665) +++ gcc/targhooks.h (working copy) @@ -86,6 +86,7 @@ extern int default_builtin_vectorization extern tree default_builtin_reciprocal (unsigned int, bool, bool); extern bool default_builtin_vector_alignment_reachable (const_tree, bool); + extern bool default_builtin_support_vector_misalignment (enum machine_mode mode, const_tree, spurious whitespace change. Yes, thanks. Index: gcc/optabs.c === --- gcc/optabs.c (revision 177665) +++ gcc/optabs.c (working copy) @@ -6572,16 +6572,36 @@ expand_vec_cond_expr (tree vec_cond_type ... + else + { + rtx rtx_op0; + rtx vec; + + rtx_op0 = expand_normal (op0); + comparison = gen_rtx_NE (mode, NULL_RTX, NULL_RTX); + vec = CONST0_RTX (mode); + + create_output_operand (ops[0], target, mode); + create_input_operand (ops[1], rtx_op1, mode); + create_input_operand (ops[2], rtx_op2, mode); + create_input_operand (ops[3], comparison, mode); + create_input_operand (ops[4], rtx_op0, mode); + create_input_operand (ops[5], vec, mode); this still builds the fake(?) != comparison, but as you said you need help with the .md part if we want to use a machine specific pattern for this case (which we eventually want, for the sake of using XOP vcond). Yes, I am waiting for it. This is the only way at the moment to make sure that in m = a b; r = m ? c : d; m in the vcond is not transformed into the m != 0. 
Index: gcc/target.h === --- gcc/target.h (revision 177665) +++ gcc/target.h (working copy) @@ -51,6 +51,7 @@ #define GCC_TARGET_H #include insn-modes.h +#include gimple.h #ifdef ENABLE_CHECKING spurious change. Old stuff, fixed. @@ -9073,26 +9082,28 @@ fold_comparison (location_t loc, enum tr floating-point, we can only do some of these simplifications.) */ if (operand_equal_p (arg0, arg1, 0)) { + tree arg0_type = TREE_TYPE (arg0); + switch (code) { case EQ_EXPR: - if (! FLOAT_TYPE_P (TREE_TYPE (arg0)) - || ! HONOR_NANS
Re: Vector Comparison patch
On Thu, Aug 25, 2011 at 2:00 PM, Richard Guenther richard.guent...@gmail.com wrote:
On Thu, Aug 25, 2011 at 2:45 PM, Artem Shinkarov artyom.shinkar...@gmail.com wrote:
On Thu, Aug 25, 2011 at 12:39 PM, Richard Guenther richard.guent...@gmail.com wrote:
On Thu, Aug 25, 2011 at 1:07 PM, Artem Shinkarov artyom.shinkar...@gmail.com wrote:
On Thu, Aug 25, 2011 at 11:09 AM, Richard Guenther richard.guent...@gmail.com wrote:
On Thu, Aug 25, 2011 at 8:20 AM, Artem Shinkarov artyom.shinkar...@gmail.com wrote:

Here is a cleaned-up patch without the hook. Mostly it works in a way we discussed. So I think it is a right time to do something about vcond patterns, which would allow me to get rid of conversions that I need to put all over the code. Also at the moment the patch breaks lto frontend with a simple example:

#define vector(elcount, type) \
__attribute__((vector_size((elcount)*sizeof(type)))) type

int main (int argc, char *argv[])
{
  vector (4, float) f0;
  vector (4, float) f1;

  f0 = f1 != f0 ? (vector (4, float)){-1,-1,-1,-1} : (vector (4, float)){0,0,0,0};

  return (int)f0[argc];
}

test-lto.c:8:14: internal compiler error: in convert, at lto/lto-lang.c:1244

I looked into the file, the conversion function is defined as gcc_unreachable (). I am not very familiar with lto, so I don't really know what is the right way to treat the conversions.

And I seriously need help with backend patterns.

On the patch. The documentation needs review by a native english speaker, but here are some factual comments:

+In C vector comparison is supported within standard comparison operators:

it should read 'In GNU C' here and everywhere else as this is a GNU extension.

The result of the
+comparison is a signed integer-type vector where the size of each
+element must be the same as the size of compared vectors element.

The result type of the comparison is determined by the C frontend, it isn't under control of the user. What you are implying here is restrictions on vector assignments, which are documented elsewhere. I'd just say 'The result of the comparison is a vector of the same width and number of elements as the comparison operands with a signed integral element type.'

+In addition to the vector comparison C supports conditional expressions

See above.

+For the convenience condition in the vector conditional can be just a
+vector of signed integer type.

'of integer type.' I don't see a reason to disallow unsigned integers, they can be equally well compared against zero.

I'll have a final go on the documentation, it is untouched from the old patches.

Index: gcc/targhooks.h
===
--- gcc/targhooks.h (revision 177665)
+++ gcc/targhooks.h (working copy)
@@ -86,6 +86,7 @@ extern int default_builtin_vectorization
 extern tree default_builtin_reciprocal (unsigned int, bool, bool);
 extern bool default_builtin_vector_alignment_reachable (const_tree, bool);
+
 extern bool
 default_builtin_support_vector_misalignment (enum machine_mode mode,
                                              const_tree,

spurious whitespace change.

Yes, thanks.

Index: gcc/optabs.c
===
--- gcc/optabs.c (revision 177665)
+++ gcc/optabs.c (working copy)
@@ -6572,16 +6572,36 @@ expand_vec_cond_expr (tree vec_cond_type
...
+  else
+    {
+      rtx rtx_op0;
+      rtx vec;
+
+      rtx_op0 = expand_normal (op0);
+      comparison = gen_rtx_NE (mode, NULL_RTX, NULL_RTX);
+      vec = CONST0_RTX (mode);
+
+      create_output_operand (&ops[0], target, mode);
+      create_input_operand (&ops[1], rtx_op1, mode);
+      create_input_operand (&ops[2], rtx_op2, mode);
+      create_input_operand (&ops[3], comparison, mode);
+      create_input_operand (&ops[4], rtx_op0, mode);
+      create_input_operand (&ops[5], vec, mode);

this still builds the fake(?) != comparison, but as you said you need help with the .md part if we want to use a machine specific pattern for this case (which we eventually want, for the sake of using XOP vcond).

Yes, I am waiting for it. This is the only way at the moment to make sure that in m = a b; r = m ? c : d; m in the vcond is not transformed into the m != 0.

Index: gcc/target.h
===
--- gcc/target.h (revision 177665)
+++ gcc/target.h (working copy)
@@ -51,6 +51,7 @@
 #define GCC_TARGET_H
 #include "insn-modes.h"
+#include "gimple.h"
 #ifdef ENABLE_CHECKING

spurious change.

Old stuff, fixed.

@@ -9073,26 +9082,28 @@ fold_comparison (location_t loc, enum tr
    floating-point, we can only do some of these simplifications.)  */
   if (operand_equal_p (arg0, arg1, 0))
     {
+      tree arg0_type = TREE_TYPE (arg0);
+
       switch (code)
         {
         case
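As a hedged illustration of the semantics discussed above (this is not code from the patch): in GNU C, comparing two vectors yields a vector with the same width and number of elements and a signed integer element type, where each lane is -1 (all bits set) if the comparison holds and 0 otherwise. The sketch assumes a compiler that already implements vector comparison for the vector_size extension. The names v4sf, v4si, lanes_not_equal, select_by_mask and vector_compare_demo are invented for the example, and the select is written with bitwise operations instead of a vector ?: so that it does not depend on conditional-expression support.

```c
#include <assert.h>

/* Four-lane vector types; the typedef names are illustrative only.  */
typedef float v4sf __attribute__ ((vector_size (16)));
typedef int v4si __attribute__ ((vector_size (16)));

/* Lane-wise A != B.  Each result lane is -1 where the lanes differ
   and 0 where they are equal.  */
v4si
lanes_not_equal (v4sf a, v4sf b)
{
  return a != b;
}

/* Take the lane from C where MASK is all ones and from D where MASK
   is zero.  This is what "mask ? c : d" means for a -1/0 mask.  */
v4si
select_by_mask (v4si mask, v4si c, v4si d)
{
  return (c & mask) | (d & ~mask);
}

/* Small self-check that exercises both helpers.  */
void
vector_compare_demo (void)
{
  v4sf a = { 1.0f, 2.0f, 3.0f, 4.0f };
  v4sf b = { 1.0f, 0.0f, 3.0f, 0.0f };
  v4si c = { 10, 20, 30, 40 };
  v4si d = { -1, -2, -3, -4 };

  v4si m = lanes_not_equal (a, b);
  v4si r = select_by_mask (m, c, d);

  /* Lanes 1 and 3 differ, so the mask is { 0, -1, 0, -1 }.  */
  assert (m[0] == 0 && m[1] == -1 && m[2] == 0 && m[3] == -1);
  /* Differing lanes come from C, equal lanes from D.  */
  assert (r[0] == -1 && r[1] == 20 && r[2] == -3 && r[3] == 40);
}
```

Because the mask lanes are either all ones or all zeros, an unsigned mask type would work just as well for the select, which matches the remark above that any integer vector can be compared against zero.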
[PATCH] Restore gcc.dg/Wshadow-3.c
This patch restores gcc.dg/Wshadow-3.c which was overwritten by r148442. I noticed this by the strange test name for a test checking -Warray-bounds warnings ... Committed. (reversed patch below) Richard. 2011-08-25 Richard Guenther rguent...@suse.de * gcc.dg/Wshadow-3.c: Restore original content destroyed by r148442. Index: testsuite/gcc.dg/Wshadow-3.c === --- testsuite/gcc.dg/Wshadow-3.c(revision 148441) +++ testsuite/gcc.dg/Wshadow-3.c(revision 148442) @@ -1,21 +1,61 @@ -/* Test warnings for shadowing in function prototype scope: generally - useless but of use if the parameter is used within the scope. Bug - 529. */ -/* Origin: Joseph Myers jos...@codesourcery.com */ +/* PR middle-end/36902 Array bound warning with dead code after optimization */ /* { dg-do compile } */ -/* { dg-options -std=gnu89 -Wshadow } */ +/* { dg-options -O2 -Warray-bounds -Wall -Wextra } */ +typedef unsigned char __u8; +typedef unsigned short __u16; -int v; /* { dg-warning shadowed declaration } */ -int f1(int v); -int f2(int v, int x[v]); /* { dg-warning declaration of 'v' shadows a global declaration } */ -int f3(int v, int y[sizeof(v)]); /* { dg-warning declaration of 'v' shadows a global declaration } */ -int f4(int v) { return 0; } /* { dg-warning declaration of 'v' shadows a global declaration } */ -int f5(int v, int x[v]) { return 0; } /* { dg-warning declaration of 'v' shadows a global declaration } */ -int f6(int x) { return 0; } -int f7(v) int v; { return 0; } /* { dg-warning declaration of 'v' shadows a global declaration } */ -int f8(v, w) int v; int w[v]; { return 0; } /* { dg-warning declaration of 'v' shadows a global declaration } */ -int f9(x) int x; { return 0; } -int f10(v) { return 0; } /* { dg-warning declaration of 'v' shadows a global declaration } */ -int f11(int a, int b(int a)); -int f12(int a, int b(int a, int x[a])); /* { dg-warning declaration of 'a' shadows a parameter } */ -/* { dg-warning shadowed declaration outer parm { target *-*-* } 20 } */ +static 
inline unsigned char * +foo(unsigned char * to, const unsigned char * from, int n) +{ + switch ( n ) +{ +case 3: + *to = *from; + break; +case 5: + to[4] = from [4]; + break; +} + return to; +} + +struct { + intsize_of_select; + unsigned char pcr_select[4]; +} sel; + +int bar(void) +{ + static unsigned char buf[64]; + + sel.size_of_select = 3; + foo(buf, sel.pcr_select, sel.size_of_select); + + return 1; +} + + +static inline unsigned char * +foo2(unsigned char * to, const unsigned char * from, int n) +{ + switch ( n ) +{ +case 3: + *to = *from; + break; +case 5: + to[63] = from [111]; /* { dg-warning array subscript is above array bounds } */ + break; +} + return to; +} + +int baz(void) +{ + static unsigned char buf[64]; + + sel.size_of_select = 5; + foo2(buf, sel.pcr_select, sel.size_of_select); + + return 1; +}
[Patch,AVR]: Bit of Cleanup
These are three small patches to clean up the avr BE a bit:

#1: Use custom macro to test if a string starts with a given prefix
#2: Let avr_regno_reg_class return smallest register class
#3: Replace/remove superfluous byte_immediate_operand and some protos.

All patches tested without regression.

Ok to install them?

Johann
[PATCH, MELT] MELT modules installation fix
Hello,

Recent changes to the MELT modules broke the install-melt-modules target of MELT's Makefile in the plugin case. Attached is a patch that fixes this.
[PATCH] [MELT] Fix installation of MELT modules
Path computation for installation in GCC's plugin/melt-modules/ path was
broken (in fact not updated to the latest changes). Present commit fixes
this by reading the link targets and installing them.
---
 contrib/ChangeLog.MELT       |    2 ++
 contrib/MELT-Plugin-Makefile |    8 +++-
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/contrib/ChangeLog.MELT b/contrib/ChangeLog.MELT
index 9b8123c..42f2aca 100644
--- a/contrib/ChangeLog.MELT
+++ b/contrib/ChangeLog.MELT
@@ -1,3 +1,5 @@
+2011-08-25  Alexandre Lissy  ali...@mandriva.com
+	* MELT-Plugin-Makefile (install-melt-modules): Fix installation paths
 2011-07-18  Basile Starynkevitch  bas...@starynkevitch.net
 
 	* MELT-Plugin-Makefile (melt_make_move): Use move-if-change.
diff --git a/contrib/MELT-Plugin-Makefile b/contrib/MELT-Plugin-Makefile
index 5e5baae..7ab032a 100644
--- a/contrib/MELT-Plugin-Makefile
+++ b/contrib/MELT-Plugin-Makefile
@@ -288,13 +288,11 @@ install-melt-sources: melt-sources melt-all-sources
 ### phony makefile target from melt-build.mk
 install-melt-modules: melt-modules melt-all-modules
 	$(mkinstalldirs) $(DESTDIR)/$(melt_module_dir)
-	for d in $(wildcard melt-modules/*); do \
-	  $(mkinstalldirs) $(DESTDIR)/$(melt_module_dir)/`basename $$d` ; \
-	  for f in $$d/*.so ; do \
-	    $(INSTALL_PROGRAM) $$f $(DESTDIR)/$(melt_module_dir)/`basename $$d`/`basename $$f ` ; \
-	  done; \
+	for l in $(wildcard melt-modules/*); do \
+	  $(INSTALL_PROGRAM) `readlink $$l` $(DESTDIR)/$(melt_module_dir)/$$(basename `readlink $$l`) ; \
 	done
 
+
 ## install the makefile for MELT modules
 install-melt-mk: melt-module.mk
 	$(mkinstalldirs) $(DESTDIR)/$(libexecsubdir)
Re: [Patch,AVR]: Bit of Cleanup [1/3]: Test string for prefix
Georg-Johann Lay wrote:
> These are three small patches to clean up the avr BE a bit:
>
> #1: Use custom macro to test if a string starts with a given prefix
> #2: Let avr_regno_reg_class return smallest register class
> #3: Replace/remove superfluous byte_immediate_operand and some protos.
>
> All patches tested without regression.
>
> Ok to install them?
>
> Johann

	* config/avr/avr.c (STR_PREFIX_P): New Define.
	(avr_asm_declare_function_name): Use it.
	(avr_asm_named_section): Use it.
	(avr_section_type_flags): Use it.

Index: config/avr/avr.c
===
--- config/avr/avr.c (revision 178035)
+++ config/avr/avr.c (working copy)
@@ -51,6 +51,9 @@
 /* Maximal allowed offset for an address in the LD command */
 #define MAX_LD_OFFSET(MODE) (64 - (signed)GET_MODE_SIZE (MODE))
 
+/* Return true if STR starts with PREFIX and false, otherwise.  */
+#define STR_PREFIX_P(STR,PREFIX) (0 == strncmp (STR, PREFIX, strlen (PREFIX)))
+
 static void avr_option_override (void);
 static int avr_naked_function_p (tree);
 static int interrupt_function_p (tree);
@@ -4852,7 +4855,7 @@ avr_asm_declare_function_name (FILE *fil
 
   if (cfun->machine->is_interrupt)
     {
-      if (strncmp (name, "__vector", strlen ("__vector")) != 0)
+      if (!STR_PREFIX_P (name, "__vector"))
         {
           warning_at (DECL_SOURCE_LOCATION (decl), 0,
                       "%qs appears to be a misspelled interrupt handler",
@@ -4861,7 +4864,7 @@ avr_asm_declare_function_name (FILE *fil
     }
   else if (cfun->machine->is_signal)
     {
-      if (strncmp (name, "__vector", strlen ("__vector")) != 0)
+      if (!STR_PREFIX_P (name, "__vector"))
         {
           warning_at (DECL_SOURCE_LOCATION (decl), 0,
                       "%qs appears to be a misspelled signal handler",
@@ -5116,12 +5119,12 @@ static void
 avr_asm_named_section (const char *name, unsigned int flags, tree decl)
 {
   if (!avr_need_copy_data_p)
-    avr_need_copy_data_p = (0 == strncmp (name, ".data", 5)
-                            || 0 == strncmp (name, ".rodata", 7)
-                            || 0 == strncmp (name, ".gnu.linkonce.d", 15));
+    avr_need_copy_data_p = (STR_PREFIX_P (name, ".data")
+                            || STR_PREFIX_P (name, ".rodata")
+                            || STR_PREFIX_P (name, ".gnu.linkonce.d"));
 
   if (!avr_need_clear_bss_p)
-    avr_need_clear_bss_p = (0 == strncmp (name, ".bss", 4));
+    avr_need_clear_bss_p = STR_PREFIX_P (name, ".bss");
 
   default_elf_asm_named_section (name, flags, decl);
 }
@@ -5131,7 +5134,7 @@ avr_section_type_flags (tree decl, const
 {
   unsigned int flags = default_section_type_flags (decl, name, reloc);
 
-  if (strncmp (name, ".noinit", 7) == 0)
+  if (STR_PREFIX_P (name, ".noinit"))
     {
       if (decl && TREE_CODE (decl) == VAR_DECL
           && DECL_INITIAL (decl) == NULL_TREE)
@@ -5141,7 +5144,7 @@ avr_section_type_flags (tree decl, const
                  ".noinit section");
     }
 
-  if (0 == strncmp (name, ".progmem.data", strlen (".progmem.data")))
+  if (STR_PREFIX_P (name, ".progmem.data"))
     flags &= ~SECTION_WRITE;
 
   return flags;
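For readers who want to try the prefix test outside of GCC, here is a minimal standalone sketch. The macro body is the one from the patch; the helper needs_copy_data is a hypothetical stand-in for the section-name checks in avr_asm_named_section and is not part of the backend.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same definition as in the patch: nonzero if STR starts with PREFIX.
   PREFIX is evaluated twice, so it should be a side-effect-free
   expression, typically a string literal.  */
#define STR_PREFIX_P(STR, PREFIX) \
  (0 == strncmp (STR, PREFIX, strlen (PREFIX)))

/* Hypothetical helper mirroring the checks in avr_asm_named_section:
   nonzero if section NAME holds initialized data.  */
int
needs_copy_data (const char *name)
{
  assert (name != NULL);
  return (STR_PREFIX_P (name, ".data")
          || STR_PREFIX_P (name, ".rodata")
          || STR_PREFIX_P (name, ".gnu.linkonce.d"));
}
```

The point of the macro is that the length is derived from the prefix itself, so hand-counted constants such as 5, 7 and 15 cannot drift out of sync with the strings. As with the original strncmp calls, this is a pure prefix match: a section named .database would also match ".data".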
Re: [Patch,AVR]: Bit of Cleanup [2/3]: Let avr_regno_reg_class return smallest class
Georg-Johann Lay wrote:
> These are three small patches to clean up the avr BE a bit:
>
> #1: Use custom macro to test if a string starts with a given prefix
> #2: Let avr_regno_reg_class return smallest register class
> #3: Replace/remove superfluous byte_immediate_operand and some protos.
>
> All patches tested without regression.
>
> Ok to install them?
>
> Johann

	* config/avr/avr.c (reg_class_tab): Make local to avr_regno_reg_class.
	Return smallest register class available.

Index: config/avr/avr.c
===
--- config/avr/avr.c (revision 178035)
+++ config/avr/avr.c (working copy)
@@ -278,22 +278,6 @@ avr_option_override (void)
   init_machine_status = avr_init_machine_status;
 }
 
-/* return register class from register number.  */
-
-static const enum reg_class reg_class_tab[]={
-  GENERAL_REGS,GENERAL_REGS,GENERAL_REGS,GENERAL_REGS,GENERAL_REGS,
-  GENERAL_REGS,GENERAL_REGS,GENERAL_REGS,GENERAL_REGS,GENERAL_REGS,
-  GENERAL_REGS,GENERAL_REGS,GENERAL_REGS,GENERAL_REGS,GENERAL_REGS,
-  GENERAL_REGS, /* r0 - r15 */
-  LD_REGS,LD_REGS,LD_REGS,LD_REGS,LD_REGS,LD_REGS,LD_REGS,
-  LD_REGS, /* r16 - 23 */
-  ADDW_REGS,ADDW_REGS, /* r24,r25 */
-  POINTER_X_REGS,POINTER_X_REGS, /* r26,27 */
-  POINTER_Y_REGS,POINTER_Y_REGS, /* r28,r29 */
-  POINTER_Z_REGS,POINTER_Z_REGS, /* r30,r31 */
-  STACK_REG,STACK_REG /* SPL,SPH */
-};
-
 /* Function to set up the backend function structure.  */
 
 static struct machine_function *
@@ -307,8 +291,32 @@ avr_init_machine_status (void)
 enum reg_class
 avr_regno_reg_class (int r)
 {
+  static const enum reg_class reg_class_tab[] =
+    {
+      R0_REG,
+      /* r1 - r15 */
+      NO_LD_REGS, NO_LD_REGS, NO_LD_REGS,
+      NO_LD_REGS, NO_LD_REGS, NO_LD_REGS, NO_LD_REGS,
+      NO_LD_REGS, NO_LD_REGS, NO_LD_REGS, NO_LD_REGS,
+      NO_LD_REGS, NO_LD_REGS, NO_LD_REGS, NO_LD_REGS,
+      /* r16 - r23 */
+      SIMPLE_LD_REGS, SIMPLE_LD_REGS, SIMPLE_LD_REGS, SIMPLE_LD_REGS,
+      SIMPLE_LD_REGS, SIMPLE_LD_REGS, SIMPLE_LD_REGS, SIMPLE_LD_REGS,
+      /* r24, r25 */
+      ADDW_REGS, ADDW_REGS,
+      /* X: r26, 27 */
+      POINTER_X_REGS, POINTER_X_REGS,
+      /* Y: r28, r29 */
+      POINTER_Y_REGS, POINTER_Y_REGS,
+      /* Z: r30, r31 */
+      POINTER_Z_REGS, POINTER_Z_REGS,
+      /* SP: SPL, SPH */
+      STACK_REG, STACK_REG
+    };
+
   if (r <= 33)
     return reg_class_tab[r];
+
   return ALL_REGS;
 }
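The lookup in this patch can be tried in isolation with the following sketch. It is a simplified model under stated assumptions: the enum sketch_reg_class only borrows the class names from the patch and is not the backend's real enum reg_class, and the function name regno_reg_class_sketch is invented. It also differs from the posted code in two ways that are my own choice: the upper bound is derived from the table size instead of the literal 33, and negative register numbers fall through to ALL_REGS.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the AVR register classes named in the patch.  */
enum sketch_reg_class
{
  R0_REG, NO_LD_REGS, SIMPLE_LD_REGS, ADDW_REGS,
  POINTER_X_REGS, POINTER_Y_REGS, POINTER_Z_REGS,
  STACK_REG, ALL_REGS
};

/* Return the smallest class containing hard register R, or ALL_REGS
   if R is outside the table.  */
enum sketch_reg_class
regno_reg_class_sketch (int r)
{
  static const enum sketch_reg_class tab[] =
    {
      R0_REG,
      /* r1 - r15 */
      NO_LD_REGS, NO_LD_REGS, NO_LD_REGS,
      NO_LD_REGS, NO_LD_REGS, NO_LD_REGS, NO_LD_REGS,
      NO_LD_REGS, NO_LD_REGS, NO_LD_REGS, NO_LD_REGS,
      NO_LD_REGS, NO_LD_REGS, NO_LD_REGS, NO_LD_REGS,
      /* r16 - r23 */
      SIMPLE_LD_REGS, SIMPLE_LD_REGS, SIMPLE_LD_REGS, SIMPLE_LD_REGS,
      SIMPLE_LD_REGS, SIMPLE_LD_REGS, SIMPLE_LD_REGS, SIMPLE_LD_REGS,
      /* r24, r25 */
      ADDW_REGS, ADDW_REGS,
      /* X: r26, r27 */
      POINTER_X_REGS, POINTER_X_REGS,
      /* Y: r28, r29 */
      POINTER_Y_REGS, POINTER_Y_REGS,
      /* Z: r30, r31 */
      POINTER_Z_REGS, POINTER_Z_REGS,
      /* SP: SPL, SPH */
      STACK_REG, STACK_REG
    };
  const size_t n = sizeof tab / sizeof tab[0];

  /* r0..r31 plus SPL and SPH give 34 entries, hence "r <= 33" in
     the patch.  */
  assert (n == 34);

  if (r >= 0 && (size_t) r < n)
    return tab[r];

  return ALL_REGS;
}
```

Making the table a function-local static, as the patch does, keeps it out of file scope while still initializing it only once.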
Re: [Patch,AVR]: Bit of Cleanup [3/3]: Remove byte_immediate_operand
Georg-Johann Lay wrote:
> These are three small patches to clean up the avr BE a bit:
>
> #1: Use custom macro to test if a string starts with a given prefix
> #2: Let avr_regno_reg_class return smallest register class
> #3: Replace/remove superfluous byte_immediate_operand and some protos.
>
> All patches tested without regression.
>
> Ok to install them?
>
> Johann

	* config/avr-protos.h (byte_immediate_operand): Remove Prototype.
	(secondary_input_reload_class): Remove Prototype.
	* config/avr/avr.c (byte_immediate_operand): Remove Function.
	* config/avr/avr.md (setmemhi): Use u8_operand.
	(strlenhi): Use const0_rtx for comparison.
	* config/avr/avr.h (avr_reg_order): Remove Declaration.

Index: config/avr/avr.md
===
--- config/avr/avr.md (revision 178058)
+++ config/avr/avr.md (working copy)
@@ -591,18 +591,16 @@ (define_expand "setmemhi"
 {
   rtx addr0;
-  int cnt8;
   enum machine_mode mode;
 
   /* If value to set is not zero, use the library routine.  */
   if (operands[2] != const0_rtx)
     FAIL;
 
-  if (GET_CODE (operands[1]) != CONST_INT)
+  if (!CONST_INT_P (operands[1]))
     FAIL;
 
-  cnt8 = byte_immediate_operand (operands[1], GET_MODE (operands[1]));
-  mode = cnt8 ? QImode : HImode;
+  mode = u8_operand (operands[1], VOIDmode) ? QImode : HImode;
   operands[5] = gen_rtx_SCRATCH (mode);
   operands[1] = copy_to_mode_reg (mode,
                                   gen_int_mode (INTVAL (operands[1]), mode));
@@ -660,7 +658,7 @@ (define_expand "strlenhi"
 {
   rtx addr;
-  if (! (GET_CODE (operands[2]) == CONST_INT && INTVAL (operands[2]) == 0))
+  if (operands[2] != const0_rtx)
     FAIL;
   addr = copy_to_mode_reg (Pmode, XEXP (operands[1],0));
   operands[1] = gen_rtx_MEM (BLKmode, addr);
Index: config/avr/avr-protos.h
===
--- config/avr/avr-protos.h (revision 178035)
+++ config/avr/avr-protos.h (working copy)
@@ -86,16 +86,12 @@ extern int extra_constraint_Q (rtx x);
 extern int adjust_insn_length (rtx insn, int len);
 extern const char *output_reload_inhi (rtx insn, rtx *operands, int *len);
 extern const char *output_reload_insisf (rtx insn, rtx *operands, rtx clobber, int *len);
-extern enum reg_class secondary_input_reload_class (enum reg_class,
-                                                    enum machine_mode,
-                                                    rtx);
 extern void notice_update_cc (rtx body, rtx insn);
 extern void print_operand (FILE *file, rtx x, int code);
 extern void print_operand_address (FILE *file, rtx addr);
 extern int reg_unused_after (rtx insn, rtx reg);
 extern int _reg_unused_after (rtx insn, rtx reg);
 extern int avr_jump_mode (rtx x, rtx insn);
-extern int byte_immediate_operand (rtx op, enum machine_mode mode);
 extern int test_hard_reg_class (enum reg_class rclass, rtx x);
 extern int jump_over_one_insn_p (rtx insn, rtx dest);
 
Index: config/avr/avr.h
===
--- config/avr/avr.h (revision 178035)
+++ config/avr/avr.h (working copy)
@@ -376,8 +376,6 @@ typedef struct avr_args {
 
 #define FUNCTION_ARG_REGNO_P(r) function_arg_regno_p(r)
 
-extern int avr_reg_order[];
-
 #define DEFAULT_PCC_STRUCT_RETURN 0
 
 #define EPILOGUE_USES(REGNO) avr_epilogue_uses(REGNO)
RE: [Patch,AVR]: Bit of Cleanup [1/3]: Test string for prefix
-----Original Message-----
From: Georg-Johann Lay [mailto:a...@gjlay.de]
Sent: Thursday, August 25, 2011 7:28 AM
To: gcc-patches@gcc.gnu.org
Cc: Denis Chertykov; Weddington, Eric
Subject: Re: [Patch,AVR]: Bit of Cleanup [1/3]: Test string for prefix

Georg-Johann Lay wrote:
These are three small patches to clean up the avr BE a bit:

#1: Use custom macro to test if a string starts with a given prefix
#2: Let avr_regno_reg_class return smallest register class
#3: Replace/remove superfluous byte_immediate_operand and some protos.

All patches tested without regression.

Ok to install them?

Ok for #1.
RE: [Patch,AVR]: Bit of Cleanup [2/3]: Let avr_regno_reg_class return smallest class
-----Original Message-----
From: Georg-Johann Lay [mailto:a...@gjlay.de]
Sent: Thursday, August 25, 2011 7:30 AM
To: gcc-patches@gcc.gnu.org
Cc: Denis Chertykov; Weddington, Eric
Subject: Re: [Patch,AVR]: Bit of Cleanup [2/3]: Let avr_regno_reg_class return smallest class

Georg-Johann Lay wrote:
These are three small patches to clean up the avr BE a bit:

#1: Use custom macro to test if a string starts with a given prefix
#2: Let avr_regno_reg_class return smallest register class
#3: Replace/remove superfluous byte_immediate_operand and some protos.

All patches tested without regression.

Ok to install them?

Johann

	* config/avr/avr.c (reg_class_tab): Make local to avr_regno_reg_class.
	Return smallest register class available.

Ok for #2.
RE: [Patch,AVR]: Bit of Cleanup [3/3]: Remove byte_immediate_operand
-----Original Message-----
From: Georg-Johann Lay [mailto:a...@gjlay.de]
Sent: Thursday, August 25, 2011 7:31 AM
To: gcc-patches@gcc.gnu.org
Cc: Denis Chertykov; Weddington, Eric
Subject: Re: [Patch,AVR]: Bit of Cleanup [3/3]: Remove byte_immediate_operand

Georg-Johann Lay wrote:
These are three small patches to clean up the avr BE a bit:

#1: Use custom macro to test if a string starts with a given prefix
#2: Let avr_regno_reg_class return smallest register class
#3: Replace/remove superfluous byte_immediate_operand and some protos.

All patches tested without regression.

Ok to install them?

Johann

	* config/avr-protos.h (byte_immediate_operand): Remove Prototype.
	(secondary_input_reload_class): Remove Prototype.
	* config/avr/avr.c (byte_immediate_operand): Remove Function.
	* config/avr/avr.md (setmemhi): Use u8_operand.
	(strlenhi): Use const0_rtx for comparison.
	* config/avr/avr.h (avr_reg_order): Remove Declaration.

Ok for #3.

Thanks for doing this cleanup.

Eric
Re: [PATCH] [MELT] Fix installation of MELT modules
* Alexandre Lissy wrote on Thu, Aug 25, 2011 at 03:26:46PM CEST:
> Path computation for installation in GCC's plugin/melt-modules/ path was
> broken (in fact not updated to the latest changes). Present commit fixes
> this by reading the link targets and installing them.

> --- a/contrib/MELT-Plugin-Makefile
> +++ b/contrib/MELT-Plugin-Makefile
> @@ -288,13 +288,11 @@ install-melt-sources: melt-sources melt-all-sources
>  ### phony makefile target from melt-build.mk
>  install-melt-modules: melt-modules melt-all-modules
>  	$(mkinstalldirs) $(DESTDIR)/$(melt_module_dir)
> -	for d in $(wildcard melt-modules/*); do \
> -	  $(mkinstalldirs) $(DESTDIR)/$(melt_module_dir)/`basename $$d` ; \
> -	  for f in $$d/*.so ; do \
> -	    $(INSTALL_PROGRAM) $$f $(DESTDIR)/$(melt_module_dir)/`basename $$d`/`basename $$f ` ; \
> -	  done; \
> +	for l in $(wildcard melt-modules/*); do \

If there are no matches for this wildcard, this will generate a shell syntax error, unlike the old code. Perusing the MELT-Plugin-Makefile, it wasn't obvious whether this can ever happen in practice or not, but it might be prudent to guard against it nonetheless. One common way is to add a ':' in the for list, and test for != : inside.

> +	  $(INSTALL_PROGRAM) `readlink $$l` $(DESTDIR)/$(melt_module_dir)/$$(basename `readlink $$l`) ; \

No slash after $(DESTDIR), please.

>  	done
>
> +
>  ## install the makefile for MELT modules
>  install-melt-mk: melt-module.mk
>  	$(mkinstalldirs) $(DESTDIR)/$(libexecsubdir)
Re: [google] Increase hotness count fraction
On 24 August 2011 22:43, Mark Heffernan meh...@google.com wrote:
> This patch bumps up the parameter 'hot-bb-count-fraction' from 1 to 4. This results in about a 0.5% geomean performance improvement across internal benchmarks for x86-64 LIPO. The parameter change effectively increases the number of functions/callsites which are considered hot. The performance improvement is likely due to increased inlining (more callsites are considered hot and available for inlining).
>
> Bootstrapped and reg-tested on x86-64. OK for google/gcc-4_6?

I know this is intended for the google branches, but shouldn't such a change be in the x86_64 backend rather than a general change to params.def?

My 2 cents.

cheers
Ramana
Re: Rename across basic block boundaries
On 08/24/11 13:12, Richard Sandiford wrote:
> Sorry, I'm finding this a bit tough to review. Could you provide some overview comments somewhere to say what the new algorithm is?

Will resubmit. To make the patch smaller next time, I've committed the following as obvious (BSRT i686-linux).

Bernd

Index: gcc/ChangeLog
===
--- gcc/ChangeLog (revision 178062)
+++ gcc/ChangeLog (working copy)
@@ -1,5 +1,12 @@
+2011-08-25  Bernd Schmidt  ber...@codesourcery.com
+
+	* regrename.c (scan_rtx_reg, scan_rtx_address, build_def_use,
+	dump_def_use_chain): Don't declare.
+	(mark_conflict, create_new_chain): Move before users.
+	(regrename_optimize): Move to near end of file.
+
 2011-08-25  Georg-Johann Lay  a...@gjlay.de
-
+
 	* config/avr/avr.c (STR_PREFIX_P): New Define.
 	(avr_asm_declare_function_name): Use it.
 	(avr_asm_named_section): Use it.
Index: gcc/regrename.c
===
--- gcc/regrename.c (revision 178057)
+++ gcc/regrename.c (working copy)
@@ -138,14 +138,8 @@ static int this_tick = 0;
 static struct obstack rename_obstack;
 
 static void do_replace (struct du_head *, int);
-static void scan_rtx_reg (rtx, rtx *, enum reg_class,
-                          enum scan_actions, enum op_type);
-static void scan_rtx_address (rtx, rtx *, enum reg_class,
-                              enum scan_actions, enum machine_mode);
 static void scan_rtx (rtx, rtx *, enum reg_class,
                       enum scan_actions, enum op_type);
-static struct du_head *build_def_use (basic_block);
-static void dump_def_use_chain (struct du_head *);
 
 typedef struct du_head *du_head_p;
 DEF_VEC_P (du_head_p);
@@ -204,6 +198,84 @@ free_chain_data (void)
   VEC_free (du_head_p, heap, id_to_chain);
 }
 
+/* Walk all chains starting with CHAINS and record that they conflict with
+   another chain whose id is ID.  */
+
+static void
+mark_conflict (struct du_head *chains, unsigned id)
+{
+  while (chains)
+    {
+      bitmap_set_bit (&chains->conflicts, id);
+      chains = chains->next_chain;
+    }
+}
+
+/* Create a new chain for THIS_NREGS registers starting at THIS_REGNO,
+   and record its occurrence in *LOC, which is being written to in INSN.
+   This access requires a register of class CL.  */
+
+static void
+create_new_chain (unsigned this_regno, unsigned this_nregs, rtx *loc,
+                  rtx insn, enum reg_class cl)
+{
+  struct du_head *head = XOBNEW (&rename_obstack, struct du_head);
+  struct du_chain *this_du;
+  int nregs;
+
+  head->next_chain = open_chains;
+  open_chains = head;
+  head->regno = this_regno;
+  head->nregs = this_nregs;
+  head->need_caller_save_reg = 0;
+  head->cannot_rename = 0;
+
+  VEC_safe_push (du_head_p, heap, id_to_chain, head);
+  head->id = current_id++;
+
+  bitmap_initialize (&head->conflicts, &bitmap_default_obstack);
+  bitmap_copy (&head->conflicts, &open_chains_set);
+  mark_conflict (open_chains, head->id);
+
+  /* Since we're tracking this as a chain now, remove it from the
+     list of conflicting live hard registers and track it in
+     live_in_chains instead.  */
+  nregs = head->nregs;
+  while (nregs-- > 0)
+    {
+      SET_HARD_REG_BIT (live_in_chains, head->regno + nregs);
+      CLEAR_HARD_REG_BIT (live_hard_regs, head->regno + nregs);
+    }
+
+  COPY_HARD_REG_SET (head->hard_conflicts, live_hard_regs);
+  bitmap_set_bit (&open_chains_set, head->id);
+
+  open_chains = head;
+
+  if (dump_file)
+    {
+      fprintf (dump_file, "Creating chain %s (%d)",
+               reg_names[head->regno], head->id);
+      if (insn != NULL_RTX)
+        fprintf (dump_file, " at insn %d", INSN_UID (insn));
+      fprintf (dump_file, "\n");
+    }
+
+  if (insn == NULL_RTX)
+    {
+      head->first = head->last = NULL;
+      return;
+    }
+
+  this_du = XOBNEW (&rename_obstack, struct du_chain);
+  head->first = head->last = this_du;
+
+  this_du->next_use = 0;
+  this_du->loc = loc;
+  this_du->insn = insn;
+  this_du->cl = cl;
+}
+
 /* For a def-use chain HEAD, find which registers overlap its lifetime and
    set the corresponding bits in *PSET.  */
 
@@ -416,52 +488,6 @@ rename_chains (du_head_p all_chains)
     }
 }
 
-/* Perform register renaming on the current function.  */
-
-static unsigned int
-regrename_optimize (void)
-{
-  basic_block bb;
-  char *first_obj;
-
-  df_set_flags (DF_LR_RUN_DCE);
-  df_note_add_problem ();
-  df_analyze ();
-  df_set_flags (DF_DEFER_INSN_RESCAN);
-
-  memset (tick, 0, sizeof tick);
-
-  gcc_obstack_init (&rename_obstack);
-  first_obj = XOBNEWVAR (&rename_obstack, char, 0);
-
-  FOR_EACH_BB (bb)
-    {
-      struct du_head *all_chains = 0;
-
-      id_to_chain = VEC_alloc (du_head_p, heap, 0);
-
-      if (dump_file)
-        fprintf (dump_file, "\nBasic block %d:\n", bb->index);
-
-      all_chains = build_def_use (bb);
-
-      if (dump_file)
-        dump_def_use_chain (all_chains);
-
-
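The conflict bookkeeping that mark_conflict and create_new_chain perform can be modelled in a few lines of standalone C. This is only a sketch under explicit assumptions: a 64-bit mask stands in for GCC's bitmap type, the struct and function names (chain_sketch, mark_conflict_sketch, open_chain_sketch) are invented, and hard-register tracking is left out entirely. Unlike the moved GCC code, the sketch links the new chain into the open list only after marking, so a chain never records a conflict with itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified chain head.  CONFLICTS has bit N set if this chain
   conflicts with the chain whose id is N.  */
struct chain_sketch
{
  struct chain_sketch *next_chain;
  unsigned id;
  uint64_t conflicts;
};

/* Walk all chains starting with CHAINS and record that they conflict
   with another chain whose id is ID.  */
void
mark_conflict_sketch (struct chain_sketch *chains, unsigned id)
{
  assert (id < 64);
  while (chains)
    {
      chains->conflicts |= (uint64_t) 1 << id;
      chains = chains->next_chain;
    }
}

/* Open chain HEAD with the given ID.  It conflicts with every chain
   that is currently open (*OPEN_SET), and each of those chains
   conflicts with it.  */
void
open_chain_sketch (struct chain_sketch *head, unsigned id,
                   struct chain_sketch **open_chains, uint64_t *open_set)
{
  assert (id < 64);
  head->id = id;

  /* Start from the set of chains that are open right now.  */
  head->conflicts = *open_set;

  /* Tell the open chains about the newcomer.  */
  mark_conflict_sketch (*open_chains, id);

  /* Link the new chain in and add it to the open set.  */
  head->next_chain = *open_chains;
  *open_chains = head;
  *open_set |= (uint64_t) 1 << id;
}
```

Opening chains 0, 1 and 2 in that order leaves every pair marked as conflicting in both directions, which is the invariant the renaming pass relies on when it later looks for a free register.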
Re: Vector Comparison patch
On Thu, Aug 25, 2011 at 3:15 PM, Artem Shinkarov artyom.shinkar...@gmail.com wrote:
On Thu, Aug 25, 2011 at 2:00 PM, Richard Guenther richard.guent...@gmail.com wrote:
On Thu, Aug 25, 2011 at 2:45 PM, Artem Shinkarov artyom.shinkar...@gmail.com wrote:
On Thu, Aug 25, 2011 at 12:39 PM, Richard Guenther richard.guent...@gmail.com wrote:
On Thu, Aug 25, 2011 at 1:07 PM, Artem Shinkarov artyom.shinkar...@gmail.com wrote:
On Thu, Aug 25, 2011 at 11:09 AM, Richard Guenther richard.guent...@gmail.com wrote:
On Thu, Aug 25, 2011 at 8:20 AM, Artem Shinkarov artyom.shinkar...@gmail.com wrote:

Here is a cleaned-up patch without the hook. Mostly it works in a way we discussed. So I think it is a right time to do something about vcond patterns, which would allow me to get rid of conversions that I need to put all over the code. Also at the moment the patch breaks lto frontend with a simple example:

#define vector(elcount, type) \
__attribute__((vector_size((elcount)*sizeof(type)))) type

int main (int argc, char *argv[])
{
  vector (4, float) f0;
  vector (4, float) f1;

  f0 = f1 != f0 ? (vector (4, float)){-1,-1,-1,-1} : (vector (4, float)){0,0,0,0};

  return (int)f0[argc];
}

test-lto.c:8:14: internal compiler error: in convert, at lto/lto-lang.c:1244

I looked into the file, the conversion function is defined as gcc_unreachable (). I am not very familiar with lto, so I don't really know what is the right way to treat the conversions.

And I seriously need help with backend patterns.

On the patch. The documentation needs review by a native english speaker, but here are some factual comments:

+In C vector comparison is supported within standard comparison operators:

it should read 'In GNU C' here and everywhere else as this is a GNU extension.

The result of the
+comparison is a signed integer-type vector where the size of each
+element must be the same as the size of compared vectors element.

The result type of the comparison is determined by the C frontend, it isn't under control of the user. What you are implying here is restrictions on vector assignments, which are documented elsewhere. I'd just say 'The result of the comparison is a vector of the same width and number of elements as the comparison operands with a signed integral element type.'

+In addition to the vector comparison C supports conditional expressions

See above.

+For the convenience condition in the vector conditional can be just a
+vector of signed integer type.

'of integer type.' I don't see a reason to disallow unsigned integers, they can be equally well compared against zero.

I'll have a final go on the documentation, it is untouched from the old patches.

Index: gcc/targhooks.h
===
--- gcc/targhooks.h (revision 177665)
+++ gcc/targhooks.h (working copy)
@@ -86,6 +86,7 @@ extern int default_builtin_vectorization
 extern tree default_builtin_reciprocal (unsigned int, bool, bool);
 extern bool default_builtin_vector_alignment_reachable (const_tree, bool);
+
 extern bool
 default_builtin_support_vector_misalignment (enum machine_mode mode,
                                              const_tree,

spurious whitespace change.

Yes, thanks.

Index: gcc/optabs.c
===
--- gcc/optabs.c (revision 177665)
+++ gcc/optabs.c (working copy)
@@ -6572,16 +6572,36 @@ expand_vec_cond_expr (tree vec_cond_type
...
+  else
+    {
+      rtx rtx_op0;
+      rtx vec;
+
+      rtx_op0 = expand_normal (op0);
+      comparison = gen_rtx_NE (mode, NULL_RTX, NULL_RTX);
+      vec = CONST0_RTX (mode);
+
+      create_output_operand (&ops[0], target, mode);
+      create_input_operand (&ops[1], rtx_op1, mode);
+      create_input_operand (&ops[2], rtx_op2, mode);
+      create_input_operand (&ops[3], comparison, mode);
+      create_input_operand (&ops[4], rtx_op0, mode);
+      create_input_operand (&ops[5], vec, mode);

this still builds the fake(?) != comparison, but as you said you need help with the .md part if we want to use a machine specific pattern for this case (which we eventually want, for the sake of using XOP vcond).

Yes, I am waiting for it. This is the only way at the moment to make sure that in m = a b; r = m ? c : d; m in the vcond is not transformed into the m != 0.

Index: gcc/target.h
===
--- gcc/target.h (revision 177665)
+++ gcc/target.h (working copy)
@@ -51,6 +51,7 @@
 #define GCC_TARGET_H
 #include "insn-modes.h"
+#include "gimple.h"
 #ifdef ENABLE_CHECKING

spurious change.

Old stuff, fixed.

@@ -9073,26 +9082,28 @@ fold_comparison (location_t loc, enum tr
    floating-point, we can only do some of these simplifications.)  */
   if (operand_equal_p (arg0, arg1, 0))
     {
+
Re: [PATCH] [MELT] Fix installation of MELT modules
Hello,

Thanks for your suggestions. I made the suggested changes; I hope they match your requirements :).
[PATCH] [MELT] Fix installation of MELT modules
Path computation for installation in GCC's plugin/melt-modules/ path was
broken (in fact not updated to the latest changes). Present commit fixes
this by reading the link targets and installing them.
---
 contrib/ChangeLog.MELT       |    2 ++
 contrib/MELT-Plugin-Makefile |   10 +-
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/contrib/ChangeLog.MELT b/contrib/ChangeLog.MELT
index 9b8123c..42f2aca 100644
--- a/contrib/ChangeLog.MELT
+++ b/contrib/ChangeLog.MELT
@@ -1,3 +1,5 @@
+2011-08-25  Alexandre Lissy  ali...@mandriva.com
+	* MELT-Plugin-Makefile (install-melt-modules): Fix installation paths
 2011-07-18  Basile Starynkevitch  bas...@starynkevitch.net
 
 	* MELT-Plugin-Makefile (melt_make_move): Use move-if-change.
diff --git a/contrib/MELT-Plugin-Makefile b/contrib/MELT-Plugin-Makefile
index 5e5baae..a4be1f0 100644
--- a/contrib/MELT-Plugin-Makefile
+++ b/contrib/MELT-Plugin-Makefile
@@ -288,13 +288,13 @@ install-melt-sources: melt-sources melt-all-sources
 ### phony makefile target from melt-build.mk
 install-melt-modules: melt-modules melt-all-modules
 	$(mkinstalldirs) $(DESTDIR)/$(melt_module_dir)
-	for d in $(wildcard melt-modules/*); do \
-	  $(mkinstalldirs) $(DESTDIR)/$(melt_module_dir)/`basename $$d` ; \
-	  for f in $$d/*.so ; do \
-	    $(INSTALL_PROGRAM) $$f $(DESTDIR)/$(melt_module_dir)/`basename $$d`/`basename $$f ` ; \
-	  done; \
+	for l in $(wildcard melt-modules/*) : ; do \
+	  if [ $$l != : ]; then \
+	    $(INSTALL_PROGRAM) `readlink $$l` $(DESTDIR)$(melt_module_dir)/$$(basename `readlink $$l`) ; \
+	  fi; \
 	done
 
+
 ## install the makefile for MELT modules
 install-melt-mk: melt-module.mk
 	$(mkinstalldirs) $(DESTDIR)/$(libexecsubdir)
[PATCH, MELT] Plugin documentation generation
Hello,

Documentation for the MELT plugin was only produced as a .info file; the PDF and HTML outputs were missing because no make targets existed for them. The following patch fixes this by defining two pattern targets, %.html and %.pdf, which simply call $(TEXI2HTML) and $(TEXI2PDF) respectively.
[PATCH] [MELT] HTML and PDF targets for plugin doc
Plugin documentation was being built as a .info file thanks to make's
default .info target, but no targets were defined for HTML and PDF. The
present commit adds the missing targets.
---
 contrib/ChangeLog.MELT       |    3 +++
 contrib/MELT-Plugin-Makefile |    9 +
 2 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/contrib/ChangeLog.MELT b/contrib/ChangeLog.MELT
index 42f2aca..01aaf4c 100644
--- a/contrib/ChangeLog.MELT
+++ b/contrib/ChangeLog.MELT
@@ -1,4 +1,7 @@
 2011-08-25  Alexandre Lissy  ali...@mandriva.com
+	* MELT-Plugin-Makefile: Adding target to build .html and .pdf
+
+2011-08-25  Alexandre Lissy  ali...@mandriva.com
 	* MELT-Plugin-Makefile (install-melt-modules): Fix installation paths
 2011-07-18  Basile Starynkevitch  bas...@starynkevitch.net
 
diff --git a/contrib/MELT-Plugin-Makefile b/contrib/MELT-Plugin-Makefile
index a4be1f0..8625c09 100644
--- a/contrib/MELT-Plugin-Makefile
+++ b/contrib/MELT-Plugin-Makefile
@@ -66,6 +66,9 @@ MAKEINFO?= makeinfo
 ## the GNU texi2pdf utility from makeinfo
 TEXI2PDF?= texi2pdf
 
+## the GNU texi2html utility from makeinfo
+TEXI2HTML?= texi2html
+
 ## an install driver, which could be sudo or echo, or stay empty
 INSTALL_DRIVER?=
 
@@ -246,6 +249,12 @@ MELTDOCPDF= $(patsubst %.texi,%.pdf,$(MELTDOCSRC))
 MELTDOCINFO= $(patsubst %.texi,%.info,$(MELTDOCSRC))
 MELTDOCHTML= $(patsubst %.texi,%.html,$(MELTDOCSRC))
 
+%.html: %.texi
+	$(TEXI2HTML) $(TEXI2HTML_FLAGS) $< -o $@
+
+%.pdf: %.texi
+	$(TEXI2PDF) $(TEXI2PDF_FLAGS) $< -o $@
+
 doc: meltgendoc.texi meltplugin.texi meltpluginapi.texi doc-pdf doc-info doc-html
 
 doc-pdf: $(MELTDOCPDF)
Re: RFC: add a testsuite for libstdc++ pretty-printers
Tom> Any comments on this? I'd like to get it in; Phil found a bug in the std::tuple printer, and it would be nice to put in a test case along with the fix.

Benjamin> Hey Tom (and Phil!). Sorry for the delay: this looks fine. Please put it in on trunk and enjoy your vacation!

Thanks, I checked it in. Let me know if there are problems.

Tom
[Patch, Fortran] Fix allocatable scalar coarray components
Scalar coarray components didn't use the array descriptor, which caused all kinds of ICEs. Fixed by this relatively simple patch. OK for the trunk?

* * *

Pending coarray patch for -fcoarray=lib: Add support for assumed-shape coarray dummies (passing offset and token); see http://gcc.gnu.org/ml/fortran/2011-08/msg00182.html

Regarding -fcoarray=single: I would claim gfortran has full coarray support, except for polymorphic coarrays (which depend on polymorphic array support) and another issue with allocatable coarray components and assignment, which will be fixed soon.

Tobias

2011-08-25 Tobias Burnus bur...@net-b.de

	* trans-array.c (structure_alloc_comps): Fix for allocatable scalar coarray components.
	* trans-expr.c (gfc_conv_component_ref): Ditto.
	* trans-types.c (gfc_get_derived_type): Ditto.

2011-08-25 Tobias Burnus bur...@net-b.de

	* gfortran.dg/coarray/alloc_comp_1.f90: New.

diff --git a/gcc/fortran/trans-array.c b/gcc/fortran/trans-array.c
index 3a75658..6dc1e17 100644
--- a/gcc/fortran/trans-array.c
+++ b/gcc/fortran/trans-array.c
@@ -6798,7 +6799,8 @@ structure_alloc_comps (gfc_symbol * der_type, tree decl,
               gfc_add_expr_to_block (&fnblock, tmp);
             }
 
-          if (c->attr.allocatable && c->attr.dimension)
+          if (c->attr.allocatable
+              && (c->attr.dimension || c->attr.codimension))
             {
               comp = fold_build3_loc (input_location, COMPONENT_REF, ctype,
                                       decl, cdecl, NULL_TREE);
@@ -6845,7 +6847,8 @@ structure_alloc_comps (gfc_symbol * der_type, tree decl,
         case NULLIFY_ALLOC_COMP:
           if (c->attr.pointer)
             continue;
-          else if (c->attr.allocatable && c->attr.dimension)
+          else if (c->attr.allocatable
+                   && (c->attr.dimension|| c->attr.codimension))
             {
               comp = fold_build3_loc (input_location, COMPONENT_REF, ctype,
                                       decl, cdecl, NULL_TREE);
diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c
index 39a83ce..6f93d6f 100644
--- a/gcc/fortran/trans-expr.c
+++ b/gcc/fortran/trans-expr.c
@@ -564,7 +564,8 @@ gfc_conv_component_ref (gfc_se * se, gfc_ref * ref)
       se->string_length = tmp;
     }
 
-  if (((c->attr.pointer || c->attr.allocatable) && c->attr.dimension == 0
+  if (((c->attr.pointer || c->attr.allocatable)
+       && (!c->attr.dimension && !c->attr.codimension)
        && c->ts.type != BT_CHARACTER)
       || c->attr.proc_pointer)
     se->expr = build_fold_indirect_ref_loc (input_location,
diff --git a/gcc/fortran/trans-types.c b/gcc/fortran/trans-types.c
index bec2a11..f66878a 100644
--- a/gcc/fortran/trans-types.c
+++ b/gcc/fortran/trans-types.c
@@ -2395,7 +2400,7 @@ gfc_get_derived_type (gfc_symbol * derived)
       /* This returns an array descriptor type.  Initialization may be
          required.  */
-      if (c->attr.dimension && !c->attr.proc_pointer)
+      if ((c->attr.dimension || c->attr.codimension) && !c->attr.proc_pointer )
         {
           if (c->attr.pointer || c->attr.allocatable)
             {
--- /dev/null 2011-08-24 07:52:14.631885245 +0200
+++ gcc/gcc/testsuite/gfortran.dg/coarray/alloc_comp_1.f90 2011-08-25 15:50:07.0 +0200
@@ -0,0 +1,16 @@
+! { dg-do run }
+!
+! Allocatable scalar corrays were mishandled (ICE)
+!
+type t
+  integer, allocatable :: caf[:]
+end type t
+type(t) :: a
+allocate (a%caf[3:*])
+a%caf = 7
+!print *, a%caf
+if (a%caf /= 7) call abort ()
+if (any (lcobound (a%caf) /= [ 3 ]) &
+    .or. ucobound (a%caf, dim=1) /= this_image ()+2) &
+  call abort ()
+end
[PATCH, MELT] Fixing PDF documentation generation
Hello, The texi file defining the MELT Plugin API was missing the versionsubtitle macro definition (which is however present in meltplugin.texi), and this resulted in PDF documentation generation failing. Following patch fixes this by adding the macro to meltpluginapi.texi.
[PATCH] [MELT] Missing macro versionsubtitle
The MELT Plugin API texi source misses the macro versionsubtitle which is also defined in meltplugin.texi. This commit simply steals the definition from the last one. --- contrib/ChangeLog.MELT |3 +++ contrib/meltpluginapi.texi | 10 ++ 2 files changed, 13 insertions(+), 0 deletions(-) diff --git a/contrib/ChangeLog.MELT b/contrib/ChangeLog.MELT index 01aaf4c..953fb65 100644 --- a/contrib/ChangeLog.MELT +++ b/contrib/ChangeLog.MELT @@ -1,4 +1,7 @@ 2011-08-25 Alexandre Lissy ali...@mandriva.com + * meltpluginapi.texi: Adding missing versionsubtitle macro + +2011-08-25 Alexandre Lissy ali...@mandriva.com * MELT-Plugin-Makefile: Adding target to build .html and .pdf 2011-08-25 Alexandre Lissy ali...@mandriva.com diff --git a/contrib/meltpluginapi.texi b/contrib/meltpluginapi.texi index b71ec38..75413c5 100644 --- a/contrib/meltpluginapi.texi +++ b/contrib/meltpluginapi.texi @@ -2,6 +2,16 @@ @c %**start of header @setfilename meltpluginapi.info @c don't need @include gcc-common.texi +@c Macro to generate a For the N.N.N version subtitle on the title +@c page of TeX documentation. This macro should be used in the +@c titlepage environment after the title and any other subtitles have +@c been placed, and before any authors are placed. +@macro versionsubtitle +@subtitle For @sc{MELT} plugin of @sc{gcc} +@c Even if there are no authors, the second titlepage line should be +@c forced to the bottom of the page. +@vskip 0pt plus 1filll +@end macro @settitle MELT plugin API (generated documentation)
RE: Announcing the Port of Intel(r) Cilk (TM) Plus into GCC
Hello Richard and Mike, Thank you for your interest in the cilkplus branch to GCC 4.7. While the full Intel Cilk Plus Specification (available at http://software.intel.com/en-us/articles/intel-cilk-plus-specification/) includes array notations, they are not implemented in the current release. As mentioned in the announcement, the current release is a subset of the language extension that includes the Intel Cilk Plus keywords, reducers, and the SIMD pragmas, as well as the Intel Cilk Plus runtime for Linux on the x86 and x86-64 architectures. The release includes 3 Changelogs: - gcc/gcc/Changelog.cilk - gcc/cp/Changelog.cilk - gcc/c-family/Changelog.cilk I've attached copies for your convenience. Could I have done something to make these more obvious? Thanking You, Yours Sincerely, Balaji V. Iyer. -Original Message- From: Richard Guenther [mailto:richard.guent...@gmail.com] Sent: Saturday, August 20, 2011 3:33 AM To: Mike Stump Cc: Iyer, Balaji V; gcc-patches@gcc.gnu.org Subject: Re: Announcing the Port of Intel(r) Cilk (TM) Plus into GCC On Sat, Aug 20, 2011 at 8:12 AM, Mike Stump mikest...@comcast.net wrote: On Aug 15, 2011, at 1:30 PM, Iyer, Balaji V wrote: This letter describes the recently created GCC branch called cilkplus that ports the Intel(R) Cilk(TM) Plus language extensions to the C and C++ front-ends of gcc-4.7. We are looking for collaborators and advice as we proceed Enhance the gcc plugin infrastructure to permit the extension to be a pure plugin. :-) I'm thinking about doing this for the Objective-C and Objective-C++ languages, as a fun, get the feet wet project. We can rely upon -flto to improve performance, should performance be a concern. The actual goal however, is to provide a way for people to play around and add extensions, like say for example, the Apple Blocks extension, but without rebuilding gcc, only using the standard plugin interface. I think longer term, this can enhance the design and layout of gcc itself. 
I of course like the notion of having data-parallel array statements in C just like in Fortran. If only because that makes developing middle-end arrays easier and a cross-frontend thing ;) I suppose the present implementation scalarizes those in the C frontend, but I didn't yet look at the branch (and seriously, a short overview of the code changes, like posting a ChangeLog, would be nice). Thanks, Richard. c-family-ChangeLog.cilk Description: c-family-ChangeLog.cilk cp-ChangeLog.cilk Description: cp-ChangeLog.cilk gcc-ChangeLog.cilk Description: gcc-ChangeLog.cilk
Re: [Patch, Fortran] Fix allocatable scalar coarray components
On 25.08.2011 16:39, Tobias Burnus wrote:
> Scalar coarray components didn't use the array descriptor, which caused all kinds of ICEs. Fixed by this relatively simple patch. OK for the trunk?

OK (bordering on obvious, although I'm not sure which side of the border :-) Thanks for the patch!

Thomas
Re: [PATCH, ARM] Unaligned accesses for packed structures [1/2]
On Mon, 4 Jul 2011 13:43:31 +0200 (CEST) Ulrich Weigand uweig...@de.ibm.com wrote:
> Julian Brown wrote:
> > The most awkward change in the patch is to generic code (expmed.c, {store,extract}_bit_field_1): in big-endian mode, the existing behaviour (when inserting/extracting a bitfield to a memory location) is definitely bogus: unit is set to BITS_PER_UNIT for memory locations, and if bitsize (the size of the field to insert/extract) is greater than BITS_PER_UNIT (which isn't unusual at all), xbitpos becomes negative. That can't possibly be intentional; I can only assume that this code path is not exercised for machines which have memory alternatives for bitfield insert/extract, and BITS_BIG_ENDIAN of 0 in BYTES_BIG_ENDIAN mode.
>
> I agree that the current code cannot possibly be correct. However, just disabling the BITS_BIG_ENDIAN != BYTES_BIG_ENDIAN renumbering *completely* seems wrong to me as well. According to the docs, the meaning of the bit position passed to the extv/insv expanders is determined by BITS_BIG_ENDIAN, both in the cases of register and memory operands. Therefore, if BITS_BIG_ENDIAN differs from BYTES_BIG_ENDIAN, we should need a correction for memory operands as well. However, this correction needs to be relative to the size of the access (i.e. the operand to the extv/insv), not just BITS_PER_UNIT.
>
> Note that with that change, the new code your patch introduces to the ARM back-end will also need to change. You currently handle bitpos like this:
>
>   base_addr = adjust_address (operands[1], HImode, bitpos / BITS_PER_UNIT);
>
> This implicitly assumes that bitpos counts according to BYTES_BIG_ENDIAN, not BITS_BIG_ENDIAN -- which exactly cancels out the common code behaviour introduced by your patch ...

I've updated the patch to work with current mainline, and implemented your suggestion along with the change of the interpretation of bitpos in the insv/extv/extzv expanders in arm.md.

It seems to work fine (testing still in progress), but I'm a bit concerned that the semantics of bit-positioning for memory operands when BYTES_BIG_ENDIAN && !BITS_BIG_ENDIAN are now strange to the point of perversity. The problem is, if we're using little-endian bit numbering for memory locations in big-endian-bytes mode, we need to define an origin from which to count backwards. For the current implementation, this will now be (I believe) one word beyond the base address of the access in question, which is IMO slightly bizarre, to say the least. But, I can't think of any other easy ways forward than either this patch, or the previous one which disabled bit-endian switching entirely for memory operands in this case.

So, OK to apply this version, assuming testing comes out OK? (And the followup patch [2/2], which remains unchanged?)

Thanks,
Julian

ChangeLog

gcc/
	* config/arm/arm.c (arm_override_options): Add unaligned_access support.
	(arm_file_start): Emit attribute for unaligned access as appropriate.
	* config/arm/arm.md (UNSPEC_UNALIGNED_LOAD)
	(UNSPEC_UNALIGNED_STORE): Add constants for unspecs.
	(insv, extzv): Add unaligned-access support.
	(extv): Change to expander. Likewise.
	(extzv_t1, extv_regsi): Add helpers.
	(unaligned_loadsi, unaligned_loadhis, unaligned_loadhiu)
	(unaligned_storesi, unaligned_storehi): New.
	(*extv_reg): New (previous extv implementation).
	* config/arm/arm.opt (munaligned_access): Add option.
	* config/arm/constraints.md (Uw): New constraint.
	* expmed.c (store_bit_field_1): Adjust bitfield numbering according to size of access, not size of unit, when BITS_BIG_ENDIAN != BYTES_BIG_ENDIAN.
	(extract_bit_field_1): Likewise.
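To make the bit-numbering concern above concrete, here is a minimal sketch of the renumbering arithmetic, assuming the correction has the form unit - bitsize - bitpos as the discussion of xbitpos implies. The function names are hypothetical and not taken from expmed.c; only the arithmetic is the point.

```c
#include <assert.h>

/* Convert a bit offset counted from one end of a UNIT-bit container
   into the offset of the same BITSIZE-bit field counted from the
   other end.  Hypothetical helper, not the GCC function.  */
int
renumber_bitpos (int unit, int bitsize, int bitpos)
{
  assert (unit > 0 && bitsize > 0 && bitpos >= 0);
  return unit - bitsize - bitpos;
}

/* Nonzero if the renumbered field still lies inside the container.  */
int
renumbered_bitpos_is_valid (int unit, int bitsize, int bitpos)
{
  int x = renumber_bitpos (unit, bitsize, bitpos);
  return x >= 0 && x + bitsize <= unit;
}
```

With unit fixed at 8 (BITS_PER_UNIT on typical targets) a 16-bit field at offset 0 renumbers to -8, which is the negative xbitpos described above. Taking unit to be the size of the access, for example 32, gives 16 instead, and applying the renumbering twice returns the original offset.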
commit 7bf9c1c0806ad1ae75e96635cda55fff4c40e7ae
Author: Julian Brown jul...@henry8.codesourcery.com
Date: Tue Aug 23 05:46:22 2011 -0700

    Unaligned support for packed structs

diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 2353704..dda2718 100644
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 3162b30..cc1eb80 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -1905,6 +1905,28 @@ arm_option_override (void)
       fix_cm3_ldrd = 0;
     }
 
+  /* Enable -munaligned-access by default for
+     - all ARMv6 architecture-based processors
+     - ARMv7-A, ARMv7-R, and ARMv7-M architecture-based processors.
+
+     Disable -munaligned-access by default for
+     - all pre-ARMv6 architecture-based processors
+     - ARMv6-M architecture-based processors.  */
+
+  if (unaligned_access == 2)
+    {
+      if (arm_arch6 && (arm_arch_notm || arm_arch7))
+        unaligned_access = 1;
+      else
+        unaligned_access = 0;
+    }
+  else if (unaligned_access == 1
+           && !(arm_arch6 && (arm_arch_notm || arm_arch7)))
+    {
+      warning (0, "target CPU does not support unaligned accesses");
+      unaligned_access = 0;
+    }
+
   if (TARGET_THUMB1 && flag_schedule_insns)
     {
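The option handling in the hunk above is a tri-state default: 2 stands for "not given on the command line", and an explicit request is downgraded with a warning when the architecture cannot honour it. The following is a standalone sketch of that decision only, assuming the three architecture flags can be modelled as plain booleans; resolve_unaligned_access and its parameters are illustrative names, not GCC's.

```c
#include <assert.h>
#include <stdbool.h>

/* Resolve a tri-state request (0 = off, 1 = on, 2 = unset) against
   what the target supports.  *WARN is set when an explicit "on" had
   to be turned off.  Hypothetical helper mirroring the hunk above.  */
int
resolve_unaligned_access (int requested, bool arch6, bool notm,
                          bool arch7, bool *warn)
{
  bool supported = arch6 && (notm || arch7);

  assert (requested >= 0 && requested <= 2);
  *warn = false;

  if (requested == 2)
    return supported ? 1 : 0;

  if (requested == 1 && !supported)
    {
      *warn = true;
      return 0;
    }

  return requested;
}
```

Under these assumptions an ARMv7-M target (arch6 set, notm clear, arch7 set) defaults to on, while ARMv6-M (arch6 set, notm and arch7 clear) defaults to off and warns if unaligned access is requested explicitly.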
Re: fix for segmentation violation in dump_generic_node
Hi Richard, thanks for the review.

On 08/25/2011 12:45 PM, Richard Guenther wrote:
> On Thu, Aug 25, 2011 at 12:32 PM, Tom de Vries vr...@codesourcery.com wrote:
> > Jakub,
> > This patch fixes a segmentation violation, which occurs when printing a MEM_REF or COMPONENT_REF containing a released ssa name. This can happen when we print basic blocks upon removal, enabled by -fdump-tree-*-details (see remove_bb:tree-cfg.c).
>
> Where do we dump stmts there?

In dump_bb:

static void
remove_bb (basic_block bb)
{
  gimple_stmt_iterator i;

  if (dump_file)
    {
      fprintf (dump_file, "Removing basic block %d\n", bb->index);
      if (dump_flags & TDF_DETAILS)
        {
          dump_bb (bb, dump_file, 0);
          fprintf (dump_file, "\n");
        }
    }

> > Bootstrapped and reg-tested on x86_64. OK for trunk?
>
> At least TREE_TYPE (TREE_OPERAND (node, 1)) != NULL_TREE is always true.

Right.

> The comment before the new lines is now in the wrong place and this check at least needs a comment as well.

Ok, fixed that.

> But - it's broken to dump freed stuff, why and where do we do this?

Sorry, I did not realize that. The scenario is as follows: fnsplit splits a function, and as todo cleanup_tree_cfg is called and unreachable blocks are removed, among which blocks 12 and 13.

Block 12 contains a use of 45:

  # BLOCK 12 freq:9100
  # PRED: 13
  D.13888_46 = *sD.13886_45;

Block 13 contains a def of 45:

  # BLOCK 13
  # PRED: 11 12
  ...
  # sD.13886_45 = PHI <sD.13886_44(11), sD.13886_49(12)>
  ...
  if (sizeD.8479_2 iD.13887_50) goto <bb 12>; else goto <bb 14>;
  # SUCC: 12 14

First block 13 is removed, and remove_phi_nodes_and_edges_for_unreachable_block in remove_bb removes the phi def and releases version 45. Then block 12 is removed, and before removal it is dumped by dump_bb in remove_bb, triggering the segv.

The order of removal is determined by the 2nd loop in delete_unreachable_blocks, which is chosen because there is no dominator info present:

  for (b = EXIT_BLOCK_PTR->prev_bb; b != ENTRY_BLOCK_PTR; b = prev_bb)
    {
      prev_bb = b->prev_bb;

      if (!(b->flags & BB_REACHABLE))
        {
          delete_basic_block (b);
          changed = true;
        }
    }

I'm not sure how to fix this.

Another occurrence of the same segv is in remove_dead_inserted_code:

  EXECUTE_IF_SET_IN_BITMAP (inserted_exprs, 0, i, bi)
    {
      t = SSA_NAME_DEF_STMT (ssa_name (i));
      if (!gimple_plf (t, NECESSARY))
        {
          gimple_stmt_iterator gsi;

          if (dump_file && (dump_flags & TDF_DETAILS))
            {
              fprintf (dump_file, "Removing unnecessary insertion:");
              print_gimple_stmt (dump_file, t, 0, 0);
            }

          gsi = gsi_for_stmt (t);
          if (gimple_code (t) == GIMPLE_PHI)
            remove_phi_node (&gsi, true);
          else
            {
              gsi_remove (&gsi, true);
              release_defs (t);
            }
        }
    }

Here a version is released, while it's used in the defining statement of version+1, which is subsequently printed. This is easy to fix by splitting the loop; I'll make a patch for this.

There might be other occurrences (I triggered these 2 doing a gcc build), but I cannot trigger others until delete_unreachable_blocks no longer triggers.

> Richard.

Updated untested patch attached, I'll test this patch together with the remove_dead_inserted_code patch.

Thanks,
- Tom

2011-08-25 Tom de Vries t...@codesourcery.com

	* tree-pretty-print.c (dump_generic_node): Test for NULL_TREE before accessing TREE_TYPE.

Index: gcc/tree-pretty-print.c
===
--- gcc/tree-pretty-print.c (revision 176554)
+++ gcc/tree-pretty-print.c (working copy)
@@ -809,6 +809,8 @@ dump_generic_node (pretty_printer *buffe
               infer them and MEM_ATTR caching will share MEM_REFs
               with differently-typed op0s.  */
            && TREE_CODE (TREE_OPERAND (node, 0)) != INTEGER_CST
+           /* Released SSA_NAMES have no TREE_TYPE.  */
+           && TREE_TYPE (TREE_OPERAND (node, 0)) != NULL_TREE
            /* Same pointer types, but ignoring POINTER_TYPE vs.
               REFERENCE_TYPE.  */
            && (TREE_TYPE (TREE_TYPE (TREE_OPERAND (node, 0)))
@@ -1175,6 +1177,8 @@ dump_generic_node (pretty_printer *buffe
                  can't infer them and MEM_ATTR caching will share
                  MEM_REFs with differently-typed op0s.  */
               && TREE_CODE (TREE_OPERAND (op0, 0)) != INTEGER_CST
+              /* Released SSA_NAMES have no TREE_TYPE.  */
+              && TREE_TYPE (TREE_OPERAND (op0, 0)) != NULL_TREE
               /* Same pointer types, but ignoring POINTER_TYPE vs.
                  REFERENCE_TYPE.  */
               && (TREE_TYPE (TREE_TYPE (TREE_OPERAND (op0, 0)))
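The "split the loop" idea mentioned for remove_dead_inserted_code can be illustrated with a small model. This is only a sketch under the assumption that the split means one pass that dumps everything followed by a second pass that releases; the message does not show the eventual patch, and struct name and both functions are invented for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an SSA name: OPERAND is the index of another name
   used by this name's defining statement, or -1 if there is none.  */
struct name
{
  bool released;
  int operand;
};

/* Would dumping the defining statement of name I touch a released name?  */
static bool
operand_is_live (const struct name *names, int i)
{
  int op = names[i].operand;
  return op < 0 || !names[op].released;
}

/* Dump and release in the same pass; returns the number of dumps that
   would have read a released operand.  */
int
dump_and_release_interleaved (struct name *names, int n)
{
  int bad = 0;

  assert (names != NULL);
  for (int i = 0; i < n; i++)
    {
      if (!operand_is_live (names, i))
        bad++;
      names[i].released = true;
    }
  return bad;
}

/* Dump everything first, then release everything.  */
int
dump_then_release_split (struct name *names, int n)
{
  int bad = 0;

  assert (names != NULL);
  for (int i = 0; i < n; i++)
    if (!operand_is_live (names, i))
      bad++;
  for (int i = 0; i < n; i++)
    names[i].released = true;
  return bad;
}
```

In the scenario described above, name 0 is used by the defining statement of name 1. The interleaved variant reports one bad dump for that pair, while the split variant reports none and still releases both names.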
Re: [google] Increase hotness count fraction
On Thu, Aug 25, 2011 at 6:49 AM, Ramana Radhakrishnan ramana.radhakrish...@linaro.org wrote:
> I know this is intended for the google branches but shouldn't such a change be in the x86_64 backend rather than such a general change to params.def.

I wouldn't consider this a backend-specific change. The parameter generally affects performance-space tradeoff with FDO. A higher value means more code will be optimized for performance rather than size (perhaps most significantly in the inliner). The value is too conservative (too much optimizing for size) and leaves performance on the table, at least for our benchmarks. Though I've only tested x86-64, I'd imagine other arches would benefit especially those with relatively large I-caches where a larger code footprint can be tolerated. Of course YMMV...

Mark

> My 2 cents.
>
> cheers
> Ramana
Re: [Patch, Fortran] PR 50163 - ICE with nonconst expr in init expr
On Wednesday 24 August 2011 15:31:17 Tobias Burnus wrote: Aren't there some rules about backporting? The way we do it now, it looks completely arbitrary. I think it *is* arbitrary - and unavoidably so. The main idea behind regression fixing is to make sure that what once worked should continue to work. But what always had been broken can remain broken and will be only fixed on the trunk. Reason: If you fix more, the behaviour on the branch changes and you may introduce regressions. If things are known to be broken, you can simply work around them. Additional ingredients are: How serious is the problem? A wrong-code issue occurring in a potentially often used part has a different priority than an accepts-invalid or ice-on-invalid-code issue. Also, a patch which is huge is less suited than a small trivial patch. Regressions which existed for a long time are typically also less important - otherwise they would have been fixed or found before. But there are also other items such as: Which is the last maintained version in GCC, which versions are still being used (such that it makes sense to backport), and which patches (Linux) distributions want to see. Additionally, as backporting takes time (bootstrap, regtesting, and maybe even adapting the patch slightly): How much time does the developer want to spend on backporting. OK, it's a complex problem, and that's the very reason for my remark. My impression is that gfortran is currently doing too much non-regression backporting, which should be left to serious ICE-on-valid code and wrong-code issues. Especially as older versions do not see as much testing as the trunk. I didn't have that impression; a matter of style probably. Yes we could try to be more careful in the future. [...] But at the end it is a question of style. [...] Well, I was asking whether we could decide on our own style. [...] I didn't really answer your question, did I? You exposed your point of view clearly, which is certainly a step forward.
There are some basic rules for backports, on which everybody agrees; but in the end the same question is raised over and over again, and nobody seems to know really: how far should we backport? Currently the GCC rules are (basically): not a regression - no backport (unless serious bug/trivial fix); regression - backport. On the other hand, we have three open branches (trunk apart) and it is not clear to me whether we should apply the same rules to all of them or shade the seriousness and trivialness trigger levels into the 3 levels of backport we have. Furthermore we have to take into account our (lack of) resources and (amount of) interest for doing the backport. So I was proposing to include version numbers into the rules, and be more specific about them, like for example: - Serious (wrong-code, ice-on-valid) non-regression bugs with a simple fix are backported to N-1 only. [N is trunk] - Non-serious regressions are not backported beyond N-2. - ... Of course in the end, what is simple, what is serious, are arbitrary. Maybe you are right, it's unavoidable. Mikael
Re: [Patch, Fortran] Coarray assumed-shape token and offset handling
On Monday 22 August 2011 23:22:08 Tobias Burnus wrote:
> Dear all, this patch adds token/offset support for assumed-shape coarray dummies (with -fcoarray=lib). Built and regtested. OK for the trunk?

OK, thanks.

Mikael
Re: patch to solve recent SPEC2000 degradation
On 08/25/2011 05:57 AM, Richard Sandiford wrote:
> Vladimir Makarov vmaka...@redhat.com writes:
> > Instead of using explicitly necessary number of registers, I used contains_reg_of_mode which also checks the number of necessary registers but also it checks that the register class can hold value of given mode. This resulted in different register pressure classes (before the patch, they were GENERAL_REGS and FLOAT_REGS for x86. They became only FLOAT_INT_REGS) because it became not costly to hold integer mode value in FLOAT_REGS. The new register pressure class in turn resulted in low register pressure and one region allocation in most cases instead of multiple region RA. As a consequence, we got a big degradation on Intel 32 bit targets.
>
> Sorry, I know I should be able to work this out, but could you explain in a bit more detail why contains_reg_of_mode (CL1, MODE) was wrong? The loop is calculating costs for moving values of mode MODE into and out of CL1, and I wouldn't have expected those costs to have any meaning if CL1 can't in fact store anything of mode MODE.

Here is an x86 example. For an integer mode it excludes FLOAT_REGS from updating max register move cost for two FLOAT_INT_REGS and the integer mode in the loop. At the end of the function, in another loop where ira_register_move_cost is defined more accurately from ira_max_register_move_cost, it results in a smaller ira_register_move_cost for two FLOAT_INT_REGS and integer modes than ira_memory_move_cost for FLOAT_INT_REGS and integer mode. And the last results in one pressure class FLOAT_INT_REGS instead of GENERAL_REGS and FLOAT_REGS as it should be.

It is a very complicated area. In previous versions of IRA, I tried to calculate cover classes. It never worked because the right cover classes were too critical for good code and therefore I introduced a new macro for definition of cover classes. After introducing IRA without cover classes, I decided to calculate pressure classes because their accuracy was not critical (especially for targets with moderate or large size) and because the algorithm worked as I wanted (I checked a lot of targets but not all). If we have more troubles with pressure classes calculation, I think we could reconsider the approach and define pressure classes through a macro/hook.

> It just looked at first glance as though:
>
>        /* Some subclasses are to small to have enough registers to hold
>           a value of MODE.  Just ignore them.  */
> -      if (! contains_reg_of_mode[cl1][mode])
> +      if (ira_reg_class_max_nregs[cl1][mode] > ira_available_class_regs[cl1])
>          continue;
>        COPY_HARD_REG_SET (temp_hard_regset, reg_class_contents[cl1]);
>        AND_COMPL_HARD_REG_SET (temp_hard_regset, no_unit_alloc_regs);
>
> expects CLASS_MAX_NREGS (CL1, MODE) to have a certain meaning even if CL1 can't store values of mode MODE, whereas I'd assumed it was undefined in that case.

It is a somewhat ambiguous area. The documentation said that it must be nonzero even if the register cannot hold the value. The usual (and recommended) definition is based on the ratio of the number of bits in a value of the given mode to the number of bits in a register.
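As a rough illustration of the "usual definition" of CLASS_MAX_NREGS mentioned above, and of the size check in the quoted hunk, here is a sketch that assumes the definition is a ceiling division of mode width by register width and that the comparison in the hunk is "greater than". Both function names are made up for the example; they are not the IRA arrays.

```c
#include <assert.h>
#include <stdbool.h>

/* Number of registers of REG_BITS bits needed for a value of
   MODE_BITS bits.  Never zero for a positive mode width, whether or
   not the class could really hold the mode.  */
int
class_max_nregs (int mode_bits, int reg_bits)
{
  assert (mode_bits > 0 && reg_bits > 0);
  return (mode_bits + reg_bits - 1) / reg_bits;
}

/* True if a class with AVAILABLE_REGS allocatable registers is too
   small to hold a value of the mode at all.  */
bool
class_too_small (int mode_bits, int reg_bits, int available_regs)
{
  return class_max_nregs (mode_bits, reg_bits) > available_regs;
}
```

For instance, a 64-bit value in 32-bit registers needs two registers, so a class with only one available register would be skipped, while a class with two would not. Note that this check only counts registers; unlike contains_reg_of_mode it says nothing about whether the class can actually hold the mode, which is the distinction the thread is about.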
[PATCH, MELT] Multiple slashes in paths
Hello, There are some points in MELT build where several slashes are used and present in paths, mainly when calling GCC. This is bad not only for cosmetic reasons, but it also makes rpm not happy when extracting debug symbols. The following patch fixes this.
[PATCH] [MELT] Fix slashes in paths
Removing useless slashes in path to avoid issues when RPM extracts debug informations. --- contrib/ChangeLog.MELT |3 +++ contrib/MELT-Plugin-Makefile | 30 +++--- gcc/ChangeLog.MELT |3 +++ gcc/melt-module.mk |8 4 files changed, 25 insertions(+), 19 deletions(-) diff --git a/contrib/ChangeLog.MELT b/contrib/ChangeLog.MELT index 953fb65..0b63775 100644 --- a/contrib/ChangeLog.MELT +++ b/contrib/ChangeLog.MELT @@ -1,4 +1,7 @@ 2011-08-25 Alexandre Lissy ali...@mandriva.com + * MELT-Plugin-Makefile: Remove useless '/' after $(DESTDIR) + +2011-08-25 Alexandre Lissy ali...@mandriva.com * meltpluginapi.texi: Adding missing versionsubtitle macro 2011-08-25 Alexandre Lissy ali...@mandriva.com diff --git a/contrib/MELT-Plugin-Makefile b/contrib/MELT-Plugin-Makefile index 8625c09..0609cd5 100644 --- a/contrib/MELT-Plugin-Makefile +++ b/contrib/MELT-Plugin-Makefile @@ -275,28 +275,28 @@ install: all melt.so warmelt \ install-melt-mk install-melt-default-modules-list install-melt-so install-melt-includes: melt-runtime.h melt-predef.h melt-run.h melt-run-md5.h melt/generated/meltrunsup.h - $(mkinstalldirs) $(DESTDIR)/$(MELTGCC_PLUGIN_DIR)/include/ + $(mkinstalldirs) $(DESTDIR)$(MELTGCC_PLUGIN_DIR)/include/ for f in $^; do \ - $(INSTALL_DATA) $$f $(DESTDIR)/$(MELTGCC_PLUGIN_DIR)/include/ ; \ + $(INSTALL_DATA) $$f $(DESTDIR)$(MELTGCC_PLUGIN_DIR)/include/ ; \ done install-melt-so: melt.so - $(INSTALL_PROGRAM) $ $(DESTDIR)/$(MELTGCC_PLUGIN_DIR)/melt.so + $(INSTALL_PROGRAM) $ $(DESTDIR)$(MELTGCC_PLUGIN_DIR)/melt.so ### notice that melt-sources is a directory, but melt-all-sources is a ### phony makefile target from melt-build.mk install-melt-sources: melt-sources melt-all-sources - $(mkinstalldirs) $(DESTDIR)/$(melt_source_dir) + $(mkinstalldirs) $(DESTDIR)$(melt_source_dir) for f in melt-sources/*.c melt-sources/*.melt ; do \ - $(INSTALL_DATA) $$f $(DESTDIR)/$(melt_source_dir) ; \ + $(INSTALL_DATA) $$f $(DESTDIR)$(melt_source_dir) ; \ done ### notice that melt-modules is a directory, 
but melt-all-modules is a ### phony makefile target from melt-build.mk install-melt-modules: melt-modules melt-all-modules - $(mkinstalldirs) $(DESTDIR)/$(melt_module_dir) + $(mkinstalldirs) $(DESTDIR)$(melt_module_dir) for l in $(wildcard melt-modules/*) : ; do \ if [ $$l != : ]; then \ $(INSTALL_PROGRAM) `readlink $$l` $(DESTDIR)$(melt_module_dir)/$$(basename `readlink $$l`) ; \ @@ -306,18 +306,18 @@ install-melt-modules: melt-modules melt-all-modules ## install the makefile for MELT modules install-melt-mk: melt-module.mk - $(mkinstalldirs) $(DESTDIR)/$(libexecsubdir) - $(INSTALL_DATA) $ $(DESTDIR)/$(melt_installed_module_makefile) + $(mkinstalldirs) $(DESTDIR)$(libexecsubdir) + $(INSTALL_DATA) $ $(DESTDIR)$(melt_installed_module_makefile) ## install the default modules list install-melt-default-modules-list: $(melt_default_modules_list).modlis - $(INSTALL_DATA) $ $(DESTDIR)/$(melt_module_dir) + $(INSTALL_DATA) $ $(DESTDIR)$(melt_module_dir) ### install the MELT documentation files install-melt-doc: doc doc-info doc-pdf doc-html - $(mkinstalldirs) $(DESTDIR)/$(MELTGCC_DOC_INFO_DIR) - $(INSTALL_DATA) *.info *.info-*[0-9] $(DESTDIR)/$(MELTGCC_DOC_INFO_DIR) - $(mkinstalldirs) $(DESTDIR)/$(MELTGCC_DOC_HTML_DIR) - $(INSTALL_DATA) *.html $(DESTDIR)/$(MELTGCC_DOC_HTML_DIR) - $(mkinstalldirs) $(DESTDIR)/$(MELTGCC_DOC_PDF_DIR) - $(INSTALL_DATA) *.pdf $(DESTDIR)/$(MELTGCC_DOC_PDF_DIR) + $(mkinstalldirs) $(DESTDIR)$(MELTGCC_DOC_INFO_DIR) + $(INSTALL_DATA) *.info *.info-*[0-9] $(DESTDIR)$(MELTGCC_DOC_INFO_DIR) + $(mkinstalldirs) $(DESTDIR)$(MELTGCC_DOC_HTML_DIR) + $(INSTALL_DATA) *.html $(DESTDIR)$(MELTGCC_DOC_HTML_DIR) + $(mkinstalldirs) $(DESTDIR)$(MELTGCC_DOC_PDF_DIR) + $(INSTALL_DATA) *.pdf $(DESTDIR)$(MELTGCC_DOC_PDF_DIR) diff --git a/gcc/ChangeLog.MELT b/gcc/ChangeLog.MELT index ceb0d04..3379208 100644 --- a/gcc/ChangeLog.MELT +++ b/gcc/ChangeLog.MELT @@ -1,4 +1,7 @@ 2011-08-25 Alexandre Lissy ali...@mandriva.com + * melt-module.mk: Remove double slashes (makes RPM 
unhappy) + +2011-08-25 Alexandre Lissy ali...@mandriva.com * contrib/meltpluginapi.texi: Fix nodes (thanks to Patrice Dumas pertu...@free.fr diff --git a/gcc/melt-module.mk b/gcc/melt-module.mk index ef2d07f..b0426f5 100644 --- a/gcc/melt-module.mk +++ b/gcc/melt-module.mk @@ -125,7 +125,7 @@ $(GCCMELT_MODULE_WORKSPACE)/%.optimized.pic.o: echo optimized base3name at $(basename $(basename $(basename $@))) echo optimized base4name at $(basename $(basename $(basename $(basename $@ $(GCCMELT_CC) -DMELTGCC_MODULE_OPTIMIZED -DMELT_HAVE_DEBUG=0 $(GCCMELT_OPTIMIZED_FLAGS) $(GCCMELT_CFLAGS) \ - -fPIC -c -o $@ $(patsubst %, $(GCCMELT_SOURCEDIR)/%.c, $(basename $(basename $(basename $(basename $(notdir $@)) + -fPIC -c -o $@ $(patsubst %, $(GCCMELT_SOURCEDIR)%.c, $(basename $(basename $(basename
Re: [PATCH, ARM] Unaligned accesses for packed structures [1/2]
On Thu, 25 Aug 2011 16:46:50 +0100 Julian Brown jul...@codesourcery.com wrote:
> So, OK to apply this version, assuming testing comes out OK? (And the followup patch [2/2], which remains unchanged?)

FWIW, all tests pass, apart from gcc.target/arm/volatile-bitfields-3.c, which regresses. The output contains:

	ldrh	r0, [r3, #2]	@ unaligned

I believe that, to conform to the ARM EABI, GCC must use an (aligned) ldr in this case. Is that correct? If so, it looks like the middle-end bitfield code does not take the setting of -fstrict-volatile-bitfields into account.

Julian
[pph] Fix x1key* tests (issue4969041)
This patch fixes the x1key* tests by changing the way we register functions with the middle end in the reader. We were calling expand_or_defer_fn, but that was re-computing tree attributes that we had already computed and, worse, it was assuming parser context in scope_chain that the reader did not have.

The fix involves saving the callgraph node of the expanded function, so that the reader can rematerialize it and register it with the callgraph on the way in. I adjusted symbol table entries to capture more context, which we save with the entry so the reader knows what to do on the way in. This includes the two arguments TOP_LEVEL and AT_END that are needed for rest_of_decl_compilation.

In fixing this, I found a buglet in the saving of decl chains. When we use filters, we turn the chain into a VEC and save that VEC, but the reader does not know that the chain was saved as a VEC, so it calls pph_in_chain, which assumes that the length of the chain is a HOST_WIDE_INT. However, we were saving the length as an unsigned int.

Lawrence, this patch almost fixes x4keyno.cc. The only reason we still fail is due to LFB/LFE asm labels having different numbers. This test case includes two different pph images, which declare exactly the same structures and decls. What's happening is that the compiler ends up using the symbols coming from the last pph image, which have different function numbers, so the code is identical but the labels are different. I documented the issue in the comments for g++.dg/pph/x4keyno.cc. I think the merging process we talked about recently will fix this problem.

Tested on x86_64. Committed to branch.

* Make-lang.in (cp/semantics.o): Add dependency on $(CXX_PPH_STREAMER_H).
* decl.c (start_preparsed_function): Remove call to pph_add_decl_to_symtab.
* pph-streamer-in.c (pph_in_ld_min): Call pph_in_hwi instead of pph_in_uint.
(pph_in_struct_function): Do not return the allocated struct function. Add argument DECL. Update all callers. Allow references to a cached function.
(pph_in_cgraph_node): New.
(pph_in_symtab): Call it for PPH_SYMTAB_EXPAND records. Call cgraph_finalize_function instead of expand_or_defer_fn.
(pph_in_tcc_declaration): Adjust DECL_EXTERNAL for DECL_NOT_REALLY_EXTERN functions. Document why.
* pph-streamer-out.c (pph_out): Assert that the call to fwrite worked.
(pph_out_tree_vec): Output the length as a HOST_WIDE_INT. Document why.
(pph_out_chained_tree): Remove unused function.
(pph_out_cgraph_node): New.
(pph_out_symtab): Call it for PPH_SYMTAB_EXPAND records.
(pph_add_decl_to_symtab): Add arguments ACTION, TOP_LEVEL and AT_END. Update all callers.
* pph-streamer.h (enum pph_symtab_action): Rename from pph_symtab_marker. Change values to PPH_SYMTAB_DECLARE and PPH_SYMTAB_EXPAND. Update all users.
(struct pph_symtab_entry): Declare.
(struct pph_symtab): Remove field 'm'. Change field 'v' to be VEC(pph_symtab_entry,heap). Update all users.
(pph_out_uhwi): New.
(pph_out_hwi): New.
(pph_in_uhwi): New.
(pph_in_hwi): New.
* semantics.c: Include pph-streamer.h.
(expand_or_defer_fn_1): Call pph_add_decl_to_symtab.

testsuite/ChangeLog.pph:

* g++.dg/pph/x1keyed.cc: Mark fixed.
* g++.dg/pph/x1keyno.cc: Mark fixed.
* g++.dg/pph/x4keyno.cc: Change asm diff signature. Document source of difference.
diff --git a/gcc/cp/Make-lang.in b/gcc/cp/Make-lang.in index 3c47a4e..a83d5a2 100644 --- a/gcc/cp/Make-lang.in +++ b/gcc/cp/Make-lang.in @@ -332,7 +332,7 @@ cp/repo.o: cp/repo.c $(CXX_TREE_H) $(TM_H) toplev.h $(DIAGNOSTIC_CORE_H) \ cp/semantics.o: cp/semantics.c $(CXX_TREE_H) $(TM_H) toplev.h \ $(FLAGS_H) output.h $(RTL_H) $(TIMEVAR_H) \ $(TREE_INLINE_H) $(CGRAPH_H) $(TARGET_H) $(C_COMMON_H) $(GIMPLE_H) \ - bitmap.h gt-cp-semantics.h c-family/c-objc.h + bitmap.h gt-cp-semantics.h c-family/c-objc.h $(CXX_PPH_STREAMER_H) cp/dump.o: cp/dump.c $(CXX_TREE_H) $(TM_H) $(TREE_DUMP_H) cp/optimize.o: cp/optimize.c $(CXX_TREE_H) $(TM_H) \ input.h $(PARAMS_H) debug.h $(TREE_INLINE_H) $(GIMPLE_H) \ diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c index 31d0e43..5dd9980 100644 --- a/gcc/cp/decl.c +++ b/gcc/cp/decl.c @@ -5900,7 +5900,7 @@ cp_rest_of_decl_compilation (tree decl, int top_level, int at_end) /* If we are generating a PPH image, add DECL to its symbol table. */ if (pph_out_file) -pph_add_decl_to_symtab (decl); +pph_add_decl_to_symtab (decl, PPH_SYMTAB_DECLARE, top_level, at_end); } @@ -12834,10 +12834,6 @@ start_preparsed_function (tree decl1, tree attrs, int flags) start_fname_decls (); store_parm_decls (current_function_parms); - - /* If we are generating a PPH image, add DECL1 to its symbol table. */ - if
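The chain-length buglet described in this message comes down to the writer and the reader disagreeing on the width of one field. A minimal C sketch of that failure mode follows, assuming a flat byte buffer as the stream. The names struct stream, out_uint, out_hwi, in_uint, in_hwi and the hwi typedef are hypothetical stand-ins for illustration, not the real pph streamer API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for HOST_WIDE_INT; assumed wider than unsigned int.  */
typedef long long hwi;

/* Hypothetical in-memory stream; not the real pph stream type.  */
struct stream
{
  unsigned char buf[64];
  size_t wr;  /* next write offset */
  size_t rd;  /* next read offset */
};

/* Append an unsigned int to the stream.  */
void
out_uint (struct stream *s, unsigned int v)
{
  assert (s->wr + sizeof v <= sizeof s->buf);
  memcpy (s->buf + s->wr, &v, sizeof v);
  s->wr += sizeof v;
}

/* Append a wide integer to the stream.  */
void
out_hwi (struct stream *s, hwi v)
{
  assert (s->wr + sizeof v <= sizeof s->buf);
  memcpy (s->buf + s->wr, &v, sizeof v);
  s->wr += sizeof v;
}

/* Read back an unsigned int.  */
unsigned int
in_uint (struct stream *s)
{
  unsigned int v;
  assert (s->rd + sizeof v <= sizeof s->buf);
  memcpy (&v, s->buf + s->rd, sizeof v);
  s->rd += sizeof v;
  return v;
}

/* Read back a wide integer.  */
hwi
in_hwi (struct stream *s)
{
  hwi v;
  assert (s->rd + sizeof v <= sizeof s->buf);
  memcpy (&v, s->buf + s->rd, sizeof v);
  s->rd += sizeof v;
  return v;
}
```

If the length is written with out_uint but read with in_hwi, the reader consumes bytes that belong to the next record and every later read is misaligned. Writing the length with out_hwi, as the patch does in pph_out_tree_vec, keeps both sides in step.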
Re: [pph] Independent pre-loaded cache for common nodes (issue 4956041)
On Thu, Aug 25, 2011 at 13:17, Gabriel Charette gch...@google.com wrote: What do you mean? The || is aligned with the 'marker' entry above it. Do you want the || operators to be aligned, i.e. marker == PPH_RECORD_IREF || marker == PPH_RECORD_XREF || marker == PPH_RECORD_PREF? Yeah, this way. Every operand of the predicate on its own line. Yeah, I had the same thought and the same resolution. Relying on include_ix == -1U was already sketchy imo, and adding another trigger on stream == NULL was just too many assumptions about, and too much dependence on, the rest of the implementation, I thought. Sounds good. Diego.
RFA: PR 50061/50113: BLOCK_REG_PADDING for libcalls
Julian Brown jul...@codesourcery.com writes: On Wed, 24 Aug 2011 17:04:55 +0100 Julian Brown jul...@codesourcery.com wrote: On Sun, 07 Aug 2011 18:47:57 +0100 Richard Sandiford rdsandif...@googlemail.com wrote: This patch caused several regressions on big-endian 64-bit MIPS targets, which now try to shift single-precision floating-point arguments to the top of an FPR. [...] Sorry for the breakage! The patch below borrows the padding code from the main call routines. It fixes the MIPS problems for me (tested on mips64-linux-gnu), but I'm not set up for big-endian ARM testing. From what I can tell, other targets' BLOCK_REG_PADDING definitions already handle null types. I tested your patch very lightly on ARM, by running arm.exp and fixed-point.exp in both big- and little-endian mode, and it looks fine. I'll set off a more-complete test run also, in case that's helpful... The patch looks fine for big/little endian, gcc/g++/libstdc++, cross to ARM EABI, btw. Great! Thanks for testing. Maintainers: is the patch: http://gcc.gnu.org/ml/gcc-patches/2011-08/msg00735.html OK to install? Tested by Julian on ARM BE and LE, and by me on mips64-linux-gnu. Thanks, Richard
Re: [PATCH, i386]: Remove Y2, Y3 and Y4 register constraints
@@ -3445,7 +3463,7 @@ }) (define_insn *zero_extendsidi2_rex64 - [(set (match_operand:DI 0 nonimmediate_operand =r,o,?*Ym,?*y,?*Yi,*Y2) + [(set (match_operand:DI 0 nonimmediate_operand =r,o,?*Ym,?*y,?*Yi,*x) (zero_extend:DI (match_operand:SI 1 nonimmediate_operand rm,0,r ,m ,r ,m)))] TARGET_64BIT @@ -3470,7 +3488,7 @@ Missing ISA attr? Although perhaps it doesn't matter; this is 64-bit, and if -mno-sse we disable the register bank. Other than this I didn't see any errors. r~
Re: patch to solve recent SPEC2000 degradation
Vladimir Makarov vmaka...@redhat.com writes: On 08/25/2011 05:57 AM, Richard Sandiford wrote: Vladimir Makarov vmaka...@redhat.com writes: Instead of explicitly using the necessary number of registers, I used contains_reg_of_mode, which also checks the number of necessary registers but additionally checks that the register class can hold a value of the given mode. This resulted in different register pressure classes (before the patch, they were GENERAL_REGS and FLOAT_REGS for x86. They became only INT_FLOAT_REGS) because it became not costly to hold an integer mode value in FLOAT_REGS. The new register pressure class in turn resulted in low register pressure and one-region allocation in most cases instead of multiple-region RA. As a consequence, we got a big degradation on Intel 32-bit targets. Sorry, I know I should be able to work this out, but could you explain in a bit more detail why contains_reg_of_mode (CL1, MODE) was wrong? The loop is calculating costs for moving values of mode MODE into and out of CL1, and I wouldn't have expected those costs to have any meaning if CL1 can't in fact store anything of mode MODE. Here is an x86 example. For an integer mode it excludes FLOAT_REGS from updating the max register move cost for two FLOAT_INT_REGS and the integer mode in the loop. At the end of the function, in another loop where ira_register_move_cost is defined more accurately from ira_max_register_move_cost, it results in a smaller ira_register_move_cost for two FLOAT_INT_REGS and integer modes than ira_memory_move_cost for FLOAT_INT_REGS and integer mode. But isn't that correct though? If FLOAT_REGS can't store integer modes, and if FLOAT_INT_REGS is the union of FLOAT_REGS and INT_REGS, then doesn't it follow that the move cost for FLOAT_INT_REGS should be the same as for INT_REGS? To be clear, I'm not disputing that the pressure class changes were undesirable. I'm just not sure why changing the move costs in this way is the right way to get back to the desired pressure classes. 
If FLOAT_REGS can't store integer modes, then it doesn't seem like we should be taking the reported move cost for FLOAT_REGS and integer modes into account. I'm just not sure that those costs (or CLASS_MAX_NREGS) are meaningful in this case. Richard
[Patch,AVR]: Cleanup progmem_section handling.
progmem_section is a section to put jump tables in. This patch puts jump tables in individual sections if -ffunction-sections is on and does some more cleanup around that, i.e. it implements the TARGET_ASM_FUNCTION_RODATA_SECTION hook. progmem_section is renamed to progmem_swtable_section and is local to avr.c now. What I don't understand is the old restriction that ASM_OUTPUT_ALIGN only printed .p2align for powers > 1; I changed that to >= 1 so that jump tables get aligned properly. Passed without regressions. Ok to commit? Johann * config/avr/avr.h (progmem_section): Remove Declaration. (ASM_OUTPUT_ALIGN): Output .p2align for powers >= 1. * config/avr/avr.c (progmem_section): Make static and rename to progmem_swtable_section. (avr_output_addr_vec_elt): No need to switch sections. (avr_asm_init_sections): Use output_section_asm_op as section callback for progmem_swtable_section. (avr_output_progmem_section_asm_op): Remove Function. (TARGET_ASM_FUNCTION_RODATA_SECTION): New Define. (avr_asm_function_rodata_section): New static Function. * config/avr/elf.h (ASM_OUTPUT_BEFORE_CASE_LABEL): Define to output alignment 2**1 for jump tables. Index: config/avr/elf.h === --- config/avr/elf.h (revision 178062) +++ config/avr/elf.h (working copy) @@ -37,9 +37,10 @@ #define ASM_DECLARE_FUNCTION_NAME(FILE, NAME, DECL) \ avr_asm_declare_function_name ((FILE), (NAME), (DECL)) +/* Output alignment 2**1 for jump tables. */ #undef ASM_OUTPUT_BEFORE_CASE_LABEL #define ASM_OUTPUT_BEFORE_CASE_LABEL(FILE, PREFIX, NUM, TABLE) \ - switch_to_section (progmem_section); + ASM_OUTPUT_ALIGN ((FILE), 1); /* Be conservative in crtstuff.c. 
*/ #undef INIT_SECTION_ASM_OP Index: config/avr/avr.c === --- config/avr/avr.c (revision 178067) +++ config/avr/avr.c (working copy) @@ -113,6 +113,7 @@ static void avr_function_arg_advance (cu static bool avr_function_ok_for_sibcall (tree, tree); static void avr_asm_named_section (const char *name, unsigned int flags, tree decl); static void avr_encode_section_info (tree, rtx, int); +static section* avr_asm_function_rodata_section (tree); /* Allocate registers from r25 to r8 for parameters for function calls. */ #define FIRST_CUM_REG 26 @@ -135,7 +136,8 @@ const struct base_arch_s *avr_current_ar /* Current device. */ const struct mcu_type_s *avr_current_device; -section *progmem_section; +/* Section to put switch tables in. */ +static GTY(()) section *progmem_swtable_section; /* To track if code will use .bss and/or .data. */ bool avr_need_clear_bss_p = false; @@ -263,6 +265,8 @@ static const struct attribute_spec avr_a #undef TARGET_EXPAND_BUILTIN #define TARGET_EXPAND_BUILTIN avr_expand_builtin +#undef TARGET_ASM_FUNCTION_RODATA_SECTION +#define TARGET_ASM_FUNCTION_RODATA_SECTION avr_asm_function_rodata_section struct gcc_target targetm = TARGET_INITIALIZER; @@ -5036,18 +5040,6 @@ avr_insert_attributes (tree node, tree * } } -/* A get_unnamed_section callback for switching to progmem_section. */ - -static void -avr_output_progmem_section_asm_op (const void *arg ATTRIBUTE_UNUSED) -{ - fprintf (asm_out_file, - \t.section .progmem.gcc_sw_table, \%s\, @progbits\n, - AVR_HAVE_JMP_CALL ? a : ax); - /* Should already be aligned, this is just to be safe if it isn't. */ - fprintf (asm_out_file, \t.p2align 1\n); -} - /* Implement `ASM_OUTPUT_ALIGNED_DECL_LOCAL'. */ /* Implement `ASM_OUTPUT_ALIGNED_DECL_COMMON'. */ @@ -5098,9 +5090,23 @@ avr_output_bss_section_asm_op (const voi static void avr_asm_init_sections (void) { - progmem_section = get_unnamed_section (AVR_HAVE_JMP_CALL ? 
0 : SECTION_CODE, - avr_output_progmem_section_asm_op, - NULL); + /* Set up a section for jump tables. Alignment is handled by + ASM_OUTPUT_BEFORE_CASE_LABEL. */ + + if (AVR_HAVE_JMP_CALL) +{ + progmem_swtable_section += get_unnamed_section (0, output_section_asm_op, + \t.section\t.progmem.gcc_sw_table + ,\a\,@progbits); +} + else +{ + progmem_swtable_section += get_unnamed_section (SECTION_CODE, output_section_asm_op, + \t.section\t.progmem.gcc_sw_table + ,\ax\,@progbits); +} /* Override section callbacks to keep track of `avr_need_clear_bss_p' resp. `avr_need_copy_data_p'. */ @@ -5111,6 +5117,36 @@ avr_asm_init_sections (void) } +/* Implement `TARGET_ASM_FUNCTION_RODATA_SECTION'. */ + +static section* +avr_asm_function_rodata_section (tree decl) +{ + /* If a function is unused and optimized out by -ffunction-sections and + --gc-sections, ensure that the same will happen for its jump tables + by putting them into individual sections.
C++ PATCH for c++/50157 (hard error in SFINAE situation)
We weren't protecting all errors in convert_like_real with a complain check. So this patch adds one at the top of the file to handle all bad_p cases. The second patch then removes various checks in the rest of the function that are now redundant. Tested x86_64-pc-linux-gnu, applying to trunk. Also applying the first patch to 4.6. commit 1ad44870bd7ed3478efa5c0feb0c14e19a4060da Author: Jason Merrill ja...@redhat.com Date: Wed Aug 24 06:15:19 2011 -0400 PR c++/50157 * call.c (convert_like_real): Exit early if bad and !tf_error. diff --git a/gcc/cp/call.c b/gcc/cp/call.c index e5f65b3..d911b3a 100644 --- a/gcc/cp/call.c +++ b/gcc/cp/call.c @@ -5642,6 +5642,9 @@ convert_like_real (conversion *convs, tree expr, tree fn, int argnum, diagnostic_t diag_kind; int flags; + if (convs-bad_p !(complain tf_error)) +return error_mark_node; + if (convs-bad_p convs-kind != ck_user convs-kind != ck_list @@ -5688,15 +5691,12 @@ convert_like_real (conversion *convs, tree expr, tree fn, int argnum, else if (t-kind == ck_identity) break; } - if (complain tf_error) - { - permerror (input_location, invalid conversion from %qT to %qT, TREE_TYPE (expr), totype); - if (fn) - permerror (DECL_SOURCE_LOCATION (fn), - initializing argument %P of %qD, argnum, fn); - } - else - return error_mark_node; + + permerror (input_location, invalid conversion from %qT to %qT, + TREE_TYPE (expr), totype); + if (fn) + permerror (DECL_SOURCE_LOCATION (fn), + initializing argument %P of %qD, argnum, fn); return cp_convert (totype, expr); } diff --git a/gcc/testsuite/g++.dg/cpp0x/sfinae27.C b/gcc/testsuite/g++.dg/cpp0x/sfinae27.C new file mode 100644 index 000..93327ba --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp0x/sfinae27.C @@ -0,0 +1,20 @@ +// PR c++/50157 +// { dg-options -std=c++0x } + +templateclass T +T val(); + +templateclass T, class Arg, class = + decltype(::new T(valArg())) + +auto test(int) - char; + +templateclass, class +auto test(...) 
- char ()[2]; + +struct P { + explicit operator bool(); // (#13) +}; + +typedef decltype(testbool, P(0)) type; // OK +typedef decltype(testfloat, P(0)) type2; // Error (#17) commit abc6432b98ce7b149d52c00c4e9285f88da8e96c Author: Jason Merrill ja...@redhat.com Date: Wed Aug 24 12:15:46 2011 -0400 * call.c (convert_like_real): Remove redundant complain checks. diff --git a/gcc/cp/call.c b/gcc/cp/call.c index d911b3a..dc35824 100644 --- a/gcc/cp/call.c +++ b/gcc/cp/call.c @@ -5733,11 +5733,8 @@ convert_like_real (conversion *convs, tree expr, tree fn, int argnum, empty list, since that is handled separately in 8.5.4. */ cand-num_convs 0) { - if (complain tf_error) - error (converting to %qT from initializer list would use - explicit constructor %qD, totype, convfn); - else - return error_mark_node; + error (converting to %qT from initializer list would use + explicit constructor %qD, totype, convfn); } /* Set user_conv_p on the argument conversions, so rvalue/base @@ -5789,6 +5786,9 @@ convert_like_real (conversion *convs, tree expr, tree fn, int argnum, } return expr; case ck_ambig: + /* We leave bad_p off ck_ambig because overload resolution considers + it valid, it just fails when we try to perform it. So we need to + check complain here, too. */ if (complain tf_error) { /* Call build_user_type_conversion again for the error. */ @@ -5899,14 +5899,9 @@ convert_like_real (conversion *convs, tree expr, tree fn, int argnum, /* Copy-list-initialization doesn't actually involve a copy. 
*/ return expr; expr = build_temp (expr, totype, flags, diag_kind, complain); - if (diag_kind fn) - { - if ((complain tf_error)) - emit_diagnostic (diag_kind, DECL_SOURCE_LOCATION (fn), 0, - initializing argument %P of %qD, argnum, fn); - else if (diag_kind == DK_ERROR) - return error_mark_node; - } + if (diag_kind fn complain) + emit_diagnostic (diag_kind, DECL_SOURCE_LOCATION (fn), 0, + initializing argument %P of %qD, argnum, fn); return build_cplus_new (totype, expr, complain); case ck_ref_bind: @@ -5916,13 +5911,10 @@ convert_like_real (conversion *convs, tree expr, tree fn, int argnum, if (convs-bad_p TYPE_REF_IS_RVALUE (ref_type) real_lvalue_p (expr)) { - if (complain tf_error) - { - error (cannot bind %qT lvalue to %qT, - TREE_TYPE (expr), totype); - if (fn) - error ( initializing argument %P of %q+D, argnum, fn); - } + error (cannot bind %qT lvalue to %qT, + TREE_TYPE (expr), totype); + if (fn) + error ( initializing argument %P of %q+D, argnum, fn); return error_mark_node; } @@ -5948,19 +5940,16 @@ convert_like_real (conversion *convs, tree expr, tree fn, int argnum, if (!CP_TYPE_CONST_NON_VOLATILE_P (type)
Re: Rename across basic block boundaries
On 08/24/11 13:12, Richard Sandiford wrote: Sorry, I'm finding this a bit tough to review. Could you provide some overview comments somewhere to say what the new algorithm is? The comment at the head of regrename.c still describes the current bb-local algorithm. New patch below, with extra comments. Let me know if more are needed. One thing though: Bernd Schmidt ber...@codesourcery.com writes: @@ -215,8 +306,9 @@ merge_overlapping_regs (HARD_REG_SET *ps IOR_HARD_REG_SET (*pset, head->hard_conflicts); EXECUTE_IF_SET_IN_BITMAP (head->conflicts, 0, i, bi) { - du_head_p other = VEC_index (du_head_p, id_to_chain, i); + du_head_p other = chain_from_id (i); unsigned j = other->nregs; + gcc_assert (other != head); while (j-- > 0) SET_HARD_REG_BIT (*pset, other->regno + j); } Is this effectively cubic in the number of chains? There are O(chains) calls to merge_overlapping_regs, O(chains) nodes in the conflicts bitmap, and chain_from_id chases O(chains) merges. I've made chain_from_id store its final result in the original chain, so it'll take O(chains) only once per chain. Bootstrapped and tested on i686-linux (with rr enabled at O2). I've also redone performance tests with a popular embedded benchmark on C6X; some fluctuations around +/-5%, and 20% improvement on one benchmark. Bernd * regrename.c (struct du_head): Make nregs signed. (scan_rtx_reg, scan_rtx_address, dump_def_use_chain): Remove declarations. (closed_chains): Remove. (chain_from_id): New static function. (dump_def_use_chain): Change argument to be an int, indicating the first ID to print. All callers changed. (mark_conflict, create_new_chain): Move upwards in the file. (merge_overlapping_regs): Use chain_from_id. Assert that chains don't conflict with themselves. (rename_chains): Take no argument. Iterate over id_to_chain rather to find chains to rename. Clear tick before the main loop. (struct incoming_reg_info): New struct. (struct bb_rename_info): New struct. 
(init_rename_info, set_incoming_from_chain, merge_chains): New static functions. (regrename_analyze): New static function, broken out of regrename_optimize. Record and make use of open chain information at basic block boundaries, and merge chains where possible. (scan_rtx_reg): Make this_nregs signed. Don't update closed_chains. (build_def_use): Return a bool to indicate success. All callers changed. Don't initialize global data here. (regrename_optimize): Move most code out of here into regrename_analyze. * regs.h (add_range_to_hard_reg_set, remove_range_from_hard_reg_set, range_overlaps_hard_reg_set_p, range_in_hard_reg_set_p): New static inline functions. Index: regrename.c === --- regrename.c (revision 178065) +++ regrename.c (working copy) @@ -47,18 +47,24 @@ 1. Local def/use chains are built: within each basic block, chains are opened and closed; if a chain isn't closed at the end of the block, - it is dropped. + it is dropped. We pre-open chains if we have already examined a + predecessor block and found chains live at the end which match + live registers at the start of the new block. - 2. For each chain, the set of possible renaming registers is computed. + 2. We try combine the local chains across basic block boundaries by +comparing chains that were open at the start or end of a block to + those in successor/predecessor blocks. + + 3. For each chain, the set of possible renaming registers is computed. This takes into account the renaming of previously processed chains. Optionally, a preferred class is computed for the renaming register. - 3. The best renaming register is computed for the chain in the above set, + 4. The best renaming register is computed for the chain in the above set, using a round-robin allocation. If a preferred class exists, then the round-robin allocation is done within the class first, if possible. The round-robin allocation of renaming registers itself is global. - 4. 
If a renaming register has been found, it is substituted in the chain. + 5. If a renaming register has been found, it is substituted in the chain. Targets can parameterize the pass by specifying a preferred class for the renaming register for a given (super)class of registers to be renamed. */ @@ -75,8 +81,9 @@ struct du_head struct du_head *next_chain; /* The first and last elements of this chain. */ struct du_chain *first, *last; - /* Describes the register being tracked. */ - unsigned regno, nregs; + /* Describe the register being tracked, register number and count.
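The caching that Bernd describes for chain_from_id (follow the merge links to the surviving chain, then store that result in the chain you started from) can be sketched in a few lines of C. This is only an illustration under simplifying assumptions: chains are plain integer ids in a fixed-size array, and merged_into, chains_init and chain_merge are hypothetical names, not the data structures used in regrename.c.

```c
#include <assert.h>

#define MAX_CHAINS 16

/* merged_into[id] is the id of the chain that chain ID was merged into,
   or ID itself if the chain has not been merged.  Hypothetical layout.  */
int merged_into[MAX_CHAINS];

/* Make every chain its own representative.  */
void
chains_init (void)
{
  int i;
  for (i = 0; i < MAX_CHAINS; i++)
    merged_into[i] = i;
}

/* Return the surviving chain for ID, chasing merge links, and cache the
   answer in the original entry so later lookups take a single step.  */
int
chain_from_id (int id)
{
  int root = id;

  assert (id >= 0 && id < MAX_CHAINS);
  while (merged_into[root] != root)
    root = merged_into[root];
  merged_into[id] = root;
  return root;
}

/* Record that chain FROM has been merged into chain INTO.  */
void
chain_merge (int from, int into)
{
  int a = chain_from_id (from);
  int b = chain_from_id (into);
  merged_into[a] = b;
}
```

With this caching, the first lookup on a chain may walk a long list of merges, but each later lookup on the same chain is a single step as long as the surviving chain is not merged again. That is the "O(chains) only once per chain" behaviour mentioned in the message.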
Re: [PATCH, i386]: Remove Y2, Y3 and Y4 register constraints
On Thu, Aug 25, 2011 at 8:00 PM, Richard Henderson r...@redhat.com wrote: @@ -3445,7 +3463,7 @@ }) (define_insn *zero_extendsidi2_rex64 - [(set (match_operand:DI 0 nonimmediate_operand =r,o,?*Ym,?*y,?*Yi,*Y2) + [(set (match_operand:DI 0 nonimmediate_operand =r,o,?*Ym,?*y,?*Yi,*x) (zero_extend:DI (match_operand:SI 1 nonimmediate_operand rm,0,r ,m ,r ,m)))] TARGET_64BIT @@ -3470,7 +3488,7 @@ Missing ISA attr? Although perhaps it doesn't matter; this is 64-bit, and if -mno-sse we disable the register bank. Other than this I didn't see any errors. No, this was done on purpose. We assume that SSE2 (so, Y2) is always enabled on x86_64 target. -mno-sse is handled in ix86_conditional_register_usage by marking all SSE regs fixed. Please note that all *_rex64 patterns were changed in the same way, as described in the ChangeLog entry. Uros.
Re: ivopts improvement
Hi, here's the updated version of the patch. The goal is to remove the 'i' iterator from the following example, by replacing 'i < n' with 'p < base + n'. void f (char *base, unsigned long int i, unsigned long int n) { char *p = base + i; do { *p = '\0'; p++; i++; } while (i < n); } Bootstrapped and reg-tested on x86_64, and built and reg-tested on MIPS. I will send a test case in a separate email. OK for trunk? OK, Zdenek
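To make the intended effect concrete, here is the example loop next to a hand-written version in which the exit test 'i < n' has been replaced by 'p < base + n', so that 'i' is no longer needed inside the loop. This is a sketch of the result one would hope ivopts produces, not output from the patch; the names f_original, f_rewritten and check_same_effect are made up for illustration, and the equivalence assumes that base + n does not wrap and that both pointers stay within the same object.

```c
#include <assert.h>

/* The loop from the message: both i and p are updated on every iteration.  */
void
f_original (char *base, unsigned long int i, unsigned long int n)
{
  char *p = base + i;
  do
    {
      *p = '\0';
      p++;
      i++;
    }
  while (i < n);
}

/* Hand-written equivalent: the exit test compares p against a
   loop-invariant bound, so i is dead after computing the start of p.  */
void
f_rewritten (char *base, unsigned long int i, unsigned long int n)
{
  char *p = base + i;
  char *end = base + n;
  do
    {
      *p = '\0';
      p++;
    }
  while (p < end);
}

/* Check that both versions clear the same bytes of an 8-byte buffer.
   Requires i < 8 and n <= 8.  */
void
check_same_effect (unsigned long int i, unsigned long int n)
{
  char a[8] = "xxxxxxx";
  char b[8] = "xxxxxxx";
  int k;

  f_original (a, i, n);
  f_rewritten (b, i, n);
  for (k = 0; k < 8; k++)
    assert (a[k] == b[k]);
}
```

Because this is a do-while loop, the body runs once even when i >= n. Both versions agree in that case too, since p is already past base + n after the first increment.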
[pph] Make x7rtti.cc executable (issue4970041)
* g++.dg/pph/x7rtti.cc: Make it executable. diff --git a/gcc/testsuite/g++.dg/pph/x7rtti.cc b/gcc/testsuite/g++.dg/pph/x7rtti.cc index 297ebe2..0da2f97 100644 --- a/gcc/testsuite/g++.dg/pph/x7rtti.cc +++ b/gcc/testsuite/g++.dg/pph/x7rtti.cc @@ -1,3 +1,4 @@ +// { dg-do run } // { dg-xfail-if BOGUS { *-*-* } { -fpph-map=pph.map } } // { dg-bogus x7rtti.cc:21:0: warning: .__STDC_IEC_559_COMPLEX__. redefined .enabled by default. { xfail *-*-* } 0 } // { dg-bogus x7rtti.cc:21:0: warning: .__STDC_ISO_10646__. redefined .enabled by default. { xfail *-*-* } 0 } @@ -13,16 +14,20 @@ // { dg-bogus x7rtti.cc:28:1: error: redefinition of .const char _ZTS15non_polymorphic ... { xfail *-*-* } 0 } // { dg-bogus x7rtti.cc:28:1: error: redefinition of .const char _ZTS11polymorphic ... { xfail *-*-* } 0 } - -//FIXME We should make this a run test. - #include x5rtti1.h #include x5rtti2.h int main() { -returnpoly1() == poly2() nonp1() == nonp2() -hpol1() == hpol2() hnpl1() == hnpl2() -poly1() != nonp1() hpol1() == hnpl1() -poly2() != nonp2() hpol2() == hnpl2(); +if (poly1() == poly2() +nonp1() == nonp2() + hpol1() == hpol2() +hnpl1() == hnpl2() + poly1() != nonp1() +hpol1() == hnpl1() + poly2() != nonp2() +hpol2() == hnpl2()) + return 0; +else + return 1; } -- This patch is available for review at http://codereview.appspot.com/4970041
[PATCH, middle-end]: Fix PR50083: All 32-bit fortran tests fail on 32-bit Solaris
Hello! As noted in the PR, we also have to protect the conversion round -> lround for non-TARGET_C99_FUNCTIONS targets. Otherwise, gcc chokes in fold_fixed_mathfn, trying to canonicalize iround to the (non-existent) lround. It looks to me that we can trigger the same problem trying to convert (long long) round -> llround -> lround on non-TARGET_C99_FUNCTIONS LP64 targets, so this fix probably applies to other release branches as well. 2011-08-25 Uros Bizjak ubiz...@gmail.com PR middle-end/50083 * convert.c (convert_to_integer) BUILT_IN_ROUND{,F,L}: Convert only when TARGET_C99_FUNCTIONS. BUILT_IN_NEARBYINT{,F,L}: Ditto. BUILT_IN_RINT{,F,L}: Ditto. Bootstrapped on x86_64-pc-linux-gnu, regtesting in progress. OK for SVN and 4.6? Uros. Index: convert.c === --- convert.c (revision 178071) +++ convert.c (working copy) @@ -469,6 +469,9 @@ convert_to_integer (tree type, tree expr) break; CASE_FLT_FN (BUILT_IN_ROUND): + /* Only convert in ISO C99 mode. */ + if (!TARGET_C99_FUNCTIONS) + break; if (outprec TYPE_PRECISION (integer_type_node) || (outprec == TYPE_PRECISION (integer_type_node) !TYPE_UNSIGNED (type))) @@ -487,11 +490,14 @@ convert_to_integer (tree type, tree expr) break; /* ... Fall through ... */ CASE_FLT_FN (BUILT_IN_RINT): + /* Only convert in ISO C99 mode. */ + if (!TARGET_C99_FUNCTIONS) + break; if (outprec TYPE_PRECISION (integer_type_node) || (outprec == TYPE_PRECISION (integer_type_node) !TYPE_UNSIGNED (type))) fn = mathfn_built_in (s_intype, BUILT_IN_IRINT); - else if (outprec TYPE_PRECISION (long_integer_type_node) + else if (outprec == TYPE_PRECISION (long_integer_type_node) !TYPE_UNSIGNED (type)) fn = mathfn_built_in (s_intype, BUILT_IN_LRINT); else if (outprec == TYPE_PRECISION (long_long_integer_type_node)
Fix pr 50132 and 49864
These are both REG_ARGS_SIZE mismatch problems. The 49864 problem is caused by cross-jumping doing the wrong thing. The 50132 problem is caused by a think-o in fixup_args_size_notes, where we failed to properly handle pops. Technically I should have split these changes apart, but I tested them together and the splitting would have just been make-work. Tested on x86_64-linux. Committed. r~ PR 50132 PR 49864 * cfgcleanup.c (old_insns_match_p): Don't allow cross-jump for non-constant stack adjustment. * expr.c (find_args_size_adjust): Break out from ... (fixup_args_size_notes): ... here. * rtl.h (find_args_size_adjust): Declare. diff --git a/gcc/cfgcleanup.c b/gcc/cfgcleanup.c index 7173013..396057c 100644 --- a/gcc/cfgcleanup.c +++ b/gcc/cfgcleanup.c @@ -1081,11 +1081,20 @@ old_insns_match_p (int mode ATTRIBUTE_UNUSED, rtx i1, rtx i2) /* ??? Do not allow cross-jumping between different stack levels. */ p1 = find_reg_note (i1, REG_ARGS_SIZE, NULL); p2 = find_reg_note (i2, REG_ARGS_SIZE, NULL); - if (p1) -p1 = XEXP (p1, 0); - if (p2) -p2 = XEXP (p2, 0); - if (!rtx_equal_p (p1, p2)) + if (p1 p2) +{ + p1 = XEXP (p1, 0); + p2 = XEXP (p2, 0); + if (!rtx_equal_p (p1, p2)) +return dir_none; + + /* ??? Worse, this adjustment had better be constant lest we + have differing incoming stack levels. */ + if (!frame_pointer_needed + find_args_size_adjust (i1) == HOST_WIDE_INT_MIN) + return dir_none; +} + else if (p1 || p2) return dir_none; p1 = PATTERN (i1); diff --git a/gcc/expr.c b/gcc/expr.c index ee16b6a..a6746d1 100644 --- a/gcc/expr.c +++ b/gcc/expr.c @@ -3548,131 +3548,151 @@ mem_autoinc_base (rtx mem) verified, via immediate operand or auto-inc. If the adjustment cannot be trivially extracted, the return value is INT_MIN. 
*/ -int -fixup_args_size_notes (rtx prev, rtx last, int end_args_size) +HOST_WIDE_INT +find_args_size_adjust (rtx insn) { - int args_size = end_args_size; - bool saw_unknown = false; - rtx insn; + rtx dest, set, pat; + int i; - for (insn = last; insn != prev; insn = PREV_INSN (insn)) -{ - rtx dest, set, pat; - HOST_WIDE_INT this_delta = 0; - int i; + pat = PATTERN (insn); + set = NULL; - if (!NONDEBUG_INSN_P (insn)) - continue; - pat = PATTERN (insn); - set = NULL; + /* Look for a call_pop pattern. */ + if (CALL_P (insn)) +{ + /* We have to allow non-call_pop patterns for the case +of emit_single_push_insn of a TLS address. */ + if (GET_CODE (pat) != PARALLEL) + return 0; - /* Look for a call_pop pattern. */ - if (CALL_P (insn)) + /* All call_pop have a stack pointer adjust in the parallel. +The call itself is always first, and the stack adjust is +usually last, so search from the end. */ + for (i = XVECLEN (pat, 0) - 1; i 0; --i) { - /* We have to allow non-call_pop patterns for the case -of emit_single_push_insn of a TLS address. */ - if (GET_CODE (pat) != PARALLEL) - continue; - - /* All call_pop have a stack pointer adjust in the parallel. -The call itself is always first, and the stack adjust is -usually last, so search from the end. */ - for (i = XVECLEN (pat, 0) - 1; i 0; --i) - { - set = XVECEXP (pat, 0, i); - if (GET_CODE (set) != SET) - continue; - dest = SET_DEST (set); - if (dest == stack_pointer_rtx) - break; - } - /* We'd better have found the stack pointer adjust. */ - if (i == 0) + set = XVECEXP (pat, 0, i); + if (GET_CODE (set) != SET) continue; - /* Fall through to process the extracted SET and DEST -as if it was a standalone insn. */ + dest = SET_DEST (set); + if (dest == stack_pointer_rtx) + break; } - else if (GET_CODE (pat) == SET) - set = pat; - else if ((set = single_set (insn)) != NULL) - ; - else if (GET_CODE (pat) == PARALLEL) + /* We'd better have found the stack pointer adjust. 
*/ + if (i == 0) + return 0; + /* Fall through to process the extracted SET and DEST +as if it was a standalone insn. */ +} + else if (GET_CODE (pat) == SET) +set = pat; + else if ((set = single_set (insn)) != NULL) +; + else if (GET_CODE (pat) == PARALLEL) +{ + /* ??? Some older ports use a parallel with a stack adjust +and a store for a PUSH_ROUNDING pattern, rather than a +PRE/POST_MODIFY rtx. Don't force them to update yet... */ + /* ??? See h8300 and m68k, pushqi1. */ + for (i = XVECLEN (pat, 0) - 1; i = 0; --i) { - /* ??? Some older ports use a parallel with a stack adjust -
[lra] a small patch for LRA speedup.
Recently I sent a patch which was mostly a rewrite of the caller-saves subpass. It resulted in removing a call to df_analyze because the new subpass, like all subsequent subpasses, does not use the DF infrastructure. Before the caller-saves subpass there is a small subpass which replaces scratches with pseudos for simpler and better allocation, and which called df_insn_rescan after each change. With the new caller-saves subpass we don't need to do that any more. The patch was successfully bootstrapped on x86-64 and ia64. 2011-08-25 Vladimir Makarov vmaka...@redhat.com * lra.c (remove_scratches): Don't rescan insn. Index: lra.c === --- lra.c (revision 178008) +++ lra.c (working copy) @@ -1791,7 +1791,6 @@ static void remove_scratches (void) { int i; - bool insn_change_p; basic_block bb; rtx insn, reg; loc_t loc; @@ -1807,12 +1806,10 @@ remove_scratches (void) { id = lra_get_insn_recog_data (insn); static_id = id->insn_static_data; - insn_change_p = false; for (i = 0; i < static_id->n_operands; i++) if (GET_CODE (*id->operand_loc[i]) == SCRATCH && GET_MODE (*id->operand_loc[i]) != VOIDmode) { - insn_change_p = true; *id->operand_loc[i] = reg = lra_create_new_reg (static_id->operand[i].mode, *id->operand_loc[i], ALL_REGS, NULL); @@ -1829,8 +1826,6 @@ remove_scratches (void) fprintf (lra_dump_file, "Removing SCRATCH in insn #%u (nop %d)\n", INSN_UID (insn), i); } - if (insn_change_p) - df_insn_rescan (insn); } }
Re: [PATCH, ARM] Unaligned accesses for packed structures [1/2]
On 25 August 2011 18:31, Julian Brown jul...@codesourcery.com wrote: On Thu, 25 Aug 2011 16:46:50 +0100 Julian Brown jul...@codesourcery.com wrote: So, OK to apply this version, assuming testing comes out OK? (And the followup patch [2/2], which remains unchanged?) FWIW, all tests pass, apart from gcc.target/arm/volatile-bitfields-3.c, which regresses. The output contains: ldrh r0, [r3, #2] @ unaligned I believe that, to conform to the ARM EABI, GCC must use an (aligned) ldr in this case. Is that correct? That is correct by my reading of the ABI spec. The relevant section is 7.1.7.5, where it states: When a volatile bitfield is read its container must be read exactly once using the access width appropriate to the type of the container. Here the type of the container is a long, and hence the access should be with an ldr instruction followed by a shift, as it is today, irrespective of whether we support unaligned accesses in this case. cheers Ramana
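The rule Ramana quotes (read the container exactly once at its natural width, then shift) can be written out by hand in C. This is only a sketch of the access pattern, assuming a 32-bit container; read_field and its shift and width parameters are hypothetical names for illustration, and real bit-field layout is decided by the compiler and the ABI rather than passed in like this.

```c
#include <assert.h>
#include <stdint.h>

/* Extract WIDTH bits starting at bit SHIFT from a 32-bit volatile
   container.  The container is loaded exactly once with a full-width
   access; the field is then isolated with a shift and a mask.  */
uint32_t
read_field (const volatile uint32_t *container, unsigned int shift,
            unsigned int width)
{
  uint32_t word;
  uint32_t mask;

  assert (shift < 32 && width >= 1 && shift + width <= 32);
  word = *container;  /* the only access to the volatile object */
  mask = width == 32 ? UINT32_MAX : (UINT32_C (1) << width) - 1;
  return (word >> shift) & mask;
}
```

A narrower load of just the bytes holding the field (the ldrh in the regressing test) returns the same bits but performs a different memory access, which is why it does not satisfy the rule for volatile bit-fields.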
Re: [PATCH, ARM] Unaligned accesses for packed structures [1/2]
On Thu, 25 Aug 2011, Ramana Radhakrishnan wrote:
> On 25 August 2011 18:31, Julian Brown jul...@codesourcery.com wrote:
>> On Thu, 25 Aug 2011 16:46:50 +0100 Julian Brown jul...@codesourcery.com wrote:
>>> So, OK to apply this version, assuming testing comes out OK? (And the followup patch [2/2], which remains unchanged?)
>>
>> FWIW, all tests pass, apart from gcc.target/arm/volatile-bitfields-3.c, which regresses. The output contains:
>>
>>   ldrh r0, [r3, #2] @ unaligned
>>
>> I believe that, to conform to the ARM EABI, GCC must use an (aligned) ldr in this case. Is that correct?
>
> That is correct by my reading of the ABI spec. The relevant section is 7.1.7.5, which states:
>
>   When a volatile bitfield is read, its container must be read exactly once using the access width appropriate to the type of the container.
>
> Here the type of the container is a long, and hence the access should be an ldr instruction followed by a shift, as it is today, irrespective of whether we support unaligned accesses in this case.

Except for packed structures, anyway (and there aren't any packed structures involved in this particular testcase). The ABI doesn't cover packed structures, and I maintain that there the target-independent GNU C semantics take precedence, meaning that if the compiler doesn't know that a single read with the relevant access width is going to be safe, it must use an instruction sequence it does know to be safe.

--
Joseph S. Myers
jos...@codesourcery.com
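The ABI rule quoted in this thread (read the container exactly once, using the container's access width, then shift) can be illustrated with a small sketch in plain C. This only models the access pattern; it is not what GCC emits, and the helper name read_field_via_container, the 32-bit container and the field layout are assumptions made for the illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a GCC or ABI name): extract a field of WIDTH
   bits starting at bit OFFSET from a 32-bit container.  The container is
   loaded exactly once with a single 32-bit access, which would correspond
   to one aligned ldr on ARM, and the field is then isolated with a shift
   and a mask.  */
static uint32_t
read_field_via_container (const volatile uint32_t *container,
                          unsigned offset, unsigned width)
{
  assert (width >= 1 && width <= 32 && offset <= 32 - width);

  uint32_t word = *container;   /* The only volatile access.  */
  uint32_t mask = width == 32 ? UINT32_MAX : (UINT32_C (1) << width) - 1;

  return (word >> offset) & mask;
}
```

For a 16-bit field in the upper half of the container, the sketch performs one word load plus a shift, where a narrower halfword load at byte offset 2 (as in the regressing output) would touch only part of the container.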
Re: GIMPLE and intent of variables
Mateusz Grabowski wrote:
> If a function calls another, the intent of variables should be passed to the first one. But what if the callee is in the other compilation unit?

Does anyone have experience with using LTO mode? At the moment I have many compilation units. In my case the main program is written in Fortran and it calls functions written in C; I'm sure that this works correctly. My plugin runs after the cplxlower0 pass. I use the following flags to compile and link each Fortran and C file:

  -flto -O0 -flto-partition=none -fwhole-program -fuse-linker-plugin

I use them in order to see the whole program as a single compilation unit. The object files are also compiled with -flto. The ld.gold linker (from binutils) links the object files (.o); thanks to that, my main .cplxlower0 file contains all the needed functions in GIMPLE. My plugin visits every function. My problem is that I can't get the definition of another function while I'm in the current one:

  if (is_gimple_call (stmt))
    {
      tree fndecl = gimple_call_fndecl (stmt);
      struct function *new_cfun = DECL_STRUCT_FUNCTION (fndecl);
      (...)
    }

In this case new_cfun == NULL, or equivalently gimple_has_body_p (fndecl) is false, even if the function is in the same Fortran or C file.

--
View this message in context: http://old.nabble.com/GIMPLE-and-intent-of-variables-tp32275433p32337494.html
Sent from the gcc - patches mailing list archive at Nabble.com.
Re: [PATCH] Add infrastructure to merge standard builtin enums with backend builtins
On Wed, Aug 24, 2011 at 11:06:55AM +0200, Richard Guenther wrote:
> This basically would make DECL_BUILT_IN_CLASS no longer necessary if all targets were converted, right? (We don't currently have any BUILT_IN_FRONTEND builtins). That would sound appealing if this patch weren't a partial transition ;)

Or we could reduce it to 1 bit if we aren't going to change all of the backends.

> Now for the possible downsides. How can we reliably distinguish middle-end from target builtins for the purpose of lazy initialization? Doesn't this complicate the idea of pluggable targets, thus something like a hybrid ppc / spu compiler? In this light merging middle-end and target builtin enums and arrays sounds like a step backward.

If we are willing to pay the storage costs, we could have 1 or 2 bytes for the builtin owner and 2 bytes for the builtin index, and then reserve 0 for standard builtins and 1 for machine dependent builtins. However, then you still have the potential problem that sooner or later somebody else will omit the checks. We could reserve a fixed range for plugin builtins if you think that is desirable.

> What I _do_ like is having common machinery for defining builtins. Though instead of continuing the .def file way, with all the current warts in the ways of adding attributes etc. to builtins, I would have preferred a genbuiltins.c program that can parse standard C declarations and generate whatever is necessary to set up the builtin decls. Thus, instead of
>
>   DEF_GCC_BUILTIN (BUILT_IN_CLZ, "clz", BT_FN_INT_UINT, ATTR_CONST_NOTHROW_LEAF_LIST)
>
> have simply
>
>   int __builtin_clz (unsigned int) __attribute__((const,nothrow,leaf));
>
> in a header file which genbuiltins.c would parse. My first idea when discussing this was a -fgenbuiltins flag to the C frontend (because that already can do all the parsing ...), but Micha suggested a parser that can deal with the above is easy enough to re-implement.

Yes, that is certainly do-able. My main intention is to see what kind of infrastructure people wanted before changing all of the ppc builtins.

> Hm, I guess this pushes back a bit on your patch. Sorry for that. If you're not excited to try the above idea, can you split out the pieces that do the .def file thing for rs6000, keeping the separation of md and middle-end builtin arrays and enums?

I have several goals for the 4.7 time frame:

1) Make target attribute and pragma enable appropriate machine dependent builtins;

2) Make it less likely we will again be bitten by code that blindly references built_in_decl without checking if it is MD or standard;

3) Make at least the MD builtins created on demand. It would be nice to do the standard builtins as well, but that may be somewhat more problematic.

I do think all references to built_in_decl and implicit_built_in_decl should be moved to a macro wrapper.

If we restrict the types and attributes for a C-like header file, it shouldn't be that hard (famous last words). I would think of adding #ifdef also, so:

  #ifdef __ALTIVEC__
  extern vector float __builtin_altivec_vaddfp (vector float, vector float) __attribute__ ((...));
  #endif

The backend would need to specify a list of valid #ifdef's and the mapping to TARGET_xxx, and valid extra types with a mapping to the internal type node.

The alternative is something like what Kenney and Mike are doing in their private port, where they have new syntax in the MD file for builtins.

--
Michael Meissner, IBM
5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA
meiss...@linux.vnet.ibm.com fax +1 (978) 399-6899
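The owner/index split floated in this message (an owner field, a 2-byte index, 0 reserved for standard builtins and 1 for machine dependent ones) could look roughly like the sketch below. It assumes a 2-byte owner packed with the index into one 32-bit code; every name here (builtin_code, make_builtin_code, BUILTIN_OWNER_FIRST_PLUGIN, and so on) is hypothetical and none of it is existing GCC API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical owner values.  Only 0 (standard) and 1 (machine
   dependent) come from the message; the plugin range is an assumption.  */
enum builtin_owner
{
  BUILTIN_OWNER_STANDARD = 0,
  BUILTIN_OWNER_MD = 1,
  BUILTIN_OWNER_FIRST_PLUGIN = 2
};

/* Owner in the upper 16 bits, index in the lower 16 bits.  */
typedef uint32_t builtin_code;

static builtin_code
make_builtin_code (unsigned owner, unsigned index)
{
  assert (owner <= 0xffff && index <= 0xffff);
  return ((builtin_code) owner << 16) | index;
}

static unsigned
builtin_code_owner (builtin_code code)
{
  return code >> 16;
}

static unsigned
builtin_code_index (builtin_code code)
{
  return code & 0xffff;
}

/* The check that callers must not forget: only standard builtins may be
   looked up in the middle-end table.  */
static int
builtin_code_is_standard (builtin_code code)
{
  return builtin_code_owner (code) == BUILTIN_OWNER_STANDARD;
}
```

With such an encoding a standard builtin keeps its plain index as its code, while an MD builtin with the same index compares unequal, which is the property the "blindly references built_in_decl" goal is after.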
Re: GIMPLE and intent of variables
While determining the intent of a variable I ran into problems with PHI nodes. The problematic GIMPLE code looks like this:

  # BLOCK 196
  # PRED: 194 (false)
  (...)
  ndycD.8665_1099 = 1;

  # BLOCK 197
  # PRED: 196 (true) 207 (false)
  # ndycD.8665_4 = PHI <ndycD.8665_1099(196), ndycD.8665_1431(207)>
  (...)

  # BLOCK 207
  # PRED: 197 (false) 206 (true)
  ndycD.8665_1431 = ndycD.8665_4 + 1;

I don't understand why the gimple_phi statement comes before the assignment of its argument (ndycD.8665_1431). What is more, this argument is defined in terms of the PHI node's own result (ndycD.8665_4)!

So if I want to find the true origin of the ndycD.8665_4 variable, based on the PHI node, I go from BLOCK 197 to BLOCK 196 (ndycD.8665_1099) and BLOCK 207 (ndycD.8665_1431), and that is OK. But in BLOCK 207, where I look for the origin of ndycD.8665_1431, I must return to BLOCK 197, where ndycD.8665_4 is defined. This is infinite jumping. Is there any way to avoid this situation?

--
View this message in context: http://old.nabble.com/GIMPLE-and-intent-of-variables-tp32275433p32337811.html
Sent from the gcc - patches mailing list archive at Nabble.com.
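The dump above is a loop: block 207 is a predecessor of block 197, so the PHI merges the initial value with the value coming round the back edge, and a definition chain that passes through the PHI can lead back to it. One common way to stop the "infinite jumping" is to remember which names have already been visited. The sketch below shows that on a toy graph; struct toy_def and collect_origins are invented for the illustration and are not GCC types or functions. Inside a plugin the visited set would more likely be keyed on the SSA name version, which is an assumption here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_OPERANDS 2

/* Toy stand-in for an SSA definition: a name is defined from zero or
   more other names.  A PHI and an addition look the same at this level. */
struct toy_def
{
  int num_operands;
  int operands[TOY_MAX_OPERANDS];   /* Indices of the names used.  */
};

/* Append to ORIGINS every name reachable from NAME whose definition uses
   no other name, and return the new element count.  VISITED must have
   one entry per name, all false on the initial call.  */
static size_t
collect_origins (const struct toy_def *defs, int name, bool *visited,
                 int *origins, size_t count)
{
  if (visited[name])
    return count;               /* Seen before: this breaks the cycle.  */
  visited[name] = true;

  if (defs[name].num_operands == 0)
    {
      origins[count++] = name;  /* A leaf such as "x_1099 = 1".  */
      return count;
    }

  for (int i = 0; i < defs[name].num_operands; i++)
    count = collect_origins (defs, defs[name].operands[i], visited,
                             origins, count);
  return count;
}
```

Applied to the three names in the dump, the walk starting at the PHI result reaches the constant assignment once, visits the increment in block 207, sees that its operand is already marked, and stops.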
Re: [PATCH] Add infrastructure to merge standard builtin enums with backend builtins
On Aug 25, 2011, at 1:35 PM, Michael Meissner wrote:
> The alternative is something like what Kenney and Mike are doing in their private port, where they have new syntax in the MD file for builtins.

I think the issue is actually largely orthogonal. In our code, we generate which code is used from a description of the built-in. Picking a simple one:

  (define_builtin port_add port_add_type
    [
      (define_outputs [(var_operand:T_ALL_DI 0)])
      (define_inputs [(var_operand:T_ALL_DI 1) (var_operand:T_ALL_DI 2) (var_operand:T_ALL_DI 3)])
      (define_rtl_pattern port_add_m_mode [0 1 2 3])
      (attributes [pure])
    ]
  )

From this, we generate everything needed. Way under the hood, there is a set of enum values, but you wouldn't ever see them. There isn't a one-to-one correspondence, as we permit overloading. We start at 0 and increase, but we could just as easily start at LAST_MI_BUILTIN+1.

Things like T_ALL_DI are iterators which describe the front-end type to use and how it relates to modes in rtl-land. T_DI might be long, and T_UDI might be unsigned long; the mode for both is DImode.
[pph] Use REAL_IDENTIFIER_TYPE_VALUE instead of TREE_TYPE for IDENTIFIER_NODE (issue4965046)
This was the last thing remaining on my cleanup list.

As suggested by Steven and Jason in issue4550121, we should use REAL_IDENTIFIER_TYPE_VALUE for IDENTIFIER_NODEs instead of TREE_TYPE (although the former resolves to the latter in its macro definition, this is more robust to potential later changes to TREE_TYPE in trunk). This patch makes that change.

As mentioned by Steven in the same issue, we were accessing some fields directly instead of using their corresponding accessor macros. This patch uses the correct accessor macros in pph_read/write_tree_body.

There is no implementation change in this patch: every macro used resolves to what it replaces. If anything, this will make the pph code slightly more robust to trunk merges.

Tested with bootstrap and pph regression testing on x64.

Cheers,
Gab

2011-08-25  Gabriel Charette  gch...@google.com

        * pph-streamer-in.c (pph_read_tree_body): Use accessor macros for all fields. Use REAL_IDENTIFIER_TYPE_VALUE instead of TREE_TYPE for the IDENTIFIER_NODE case.
        * pph-streamer-out.c (pph_write_tree_body): Likewise.
diff --git a/gcc/cp/pph-streamer-in.c b/gcc/cp/pph-streamer-in.c
index f37feaf..2fcb436 100644
--- a/gcc/cp/pph-streamer-in.c
+++ b/gcc/cp/pph-streamer-in.c
@@ -1826,14 +1826,11 @@ pph_read_tree_body (pph_stream *stream, tree expr)
       break;
 
     case IDENTIFIER_NODE:
-      {
-        struct lang_identifier *id = LANG_IDENTIFIER_CAST (expr);
-        id->namespace_bindings = pph_in_cxx_binding (stream);
-        id->bindings = pph_in_cxx_binding (stream);
-        id->class_template_info = pph_in_tree (stream);
-        id->label_value = pph_in_tree (stream);
-        TREE_TYPE (expr) = pph_in_tree (stream);
-      }
+      IDENTIFIER_NAMESPACE_BINDINGS (expr) = pph_in_cxx_binding (stream);
+      IDENTIFIER_BINDING (expr) = pph_in_cxx_binding (stream);
+      IDENTIFIER_TEMPLATE (expr) = pph_in_tree (stream);
+      IDENTIFIER_LABEL_VALUE (expr) = pph_in_tree (stream);
+      REAL_IDENTIFIER_TYPE_VALUE (expr) = pph_in_tree (stream);
       break;
 
     case BASELINK:
@@ -1876,17 +1873,13 @@ pph_read_tree_body (pph_stream *stream, tree expr)
       break;
 
     case LAMBDA_EXPR:
-      {
-        struct tree_lambda_expr *e
-          = (struct tree_lambda_expr *)LAMBDA_EXPR_CHECK (expr);
-        pph_in_tree_common (stream, expr);
-        e->locus = pph_in_location (stream);
-        e->capture_list = pph_in_tree (stream);
-        e->this_capture = pph_in_tree (stream);
-        e->return_type = pph_in_tree (stream);
-        e->extra_scope = pph_in_tree (stream);
-        e->discriminator = pph_in_uint (stream);
-      }
+      pph_in_tree_common (stream, expr);
+      LAMBDA_EXPR_LOCATION (expr) = pph_in_location (stream);
+      LAMBDA_EXPR_CAPTURE_LIST (expr) = pph_in_tree (stream);
+      LAMBDA_EXPR_THIS_CAPTURE (expr) = pph_in_tree (stream);
+      LAMBDA_EXPR_RETURN_TYPE (expr) = pph_in_tree (stream);
+      LAMBDA_EXPR_EXTRA_SCOPE (expr) = pph_in_tree (stream);
+      LAMBDA_EXPR_DISCRIMINATOR (expr) = pph_in_uint (stream);
       break;
 
     case TREE_VEC:
@@ -1899,15 +1892,12 @@ pph_read_tree_body (pph_stream *stream, tree expr)
       break;
 
     case TEMPLATE_PARM_INDEX:
-      {
-        template_parm_index *p = TEMPLATE_PARM_INDEX_CAST (expr);
-        pph_in_tree_common (stream, expr);
-        p->index = pph_in_uint (stream);
-        p->level = pph_in_uint (stream);
-        p->orig_level = pph_in_uint (stream);
-        p->num_siblings = pph_in_uint (stream);
-        p->decl = pph_in_tree (stream);
-      }
+      pph_in_tree_common (stream, expr);
+      TEMPLATE_PARM_IDX (expr) = pph_in_uint (stream);
+      TEMPLATE_PARM_LEVEL (expr) = pph_in_uint (stream);
+      TEMPLATE_PARM_ORIG_LEVEL (expr) = pph_in_uint (stream);
+      TEMPLATE_PARM_NUM_SIBLINGS (expr) = pph_in_uint (stream);
+      TEMPLATE_PARM_DECL (expr) = pph_in_tree (stream);
       break;
 
     case DEFERRED_NOEXCEPT:
diff --git a/gcc/cp/pph-streamer-out.c b/gcc/cp/pph-streamer-out.c
index 3f7ac0c..27495e7 100644
--- a/gcc/cp/pph-streamer-out.c
+++ b/gcc/cp/pph-streamer-out.c
@@ -1612,14 +1612,11 @@ pph_write_tree_body (pph_stream *stream, tree expr)
       break;
 
     case IDENTIFIER_NODE:
-      {
-        struct lang_identifier *id = LANG_IDENTIFIER_CAST (expr);
-        pph_out_cxx_binding (stream, id->namespace_bindings);
-        pph_out_cxx_binding (stream, id->bindings);
-        pph_out_tree_1 (stream, id->class_template_info, 3);
-        pph_out_tree_1 (stream, id->label_value, 3);
-        pph_out_tree_1 (stream, TREE_TYPE (expr), 3);
-      }
+      pph_out_cxx_binding (stream, IDENTIFIER_NAMESPACE_BINDINGS (expr));
+      pph_out_cxx_binding (stream, IDENTIFIER_BINDING (expr));
+      pph_out_tree_1 (stream, IDENTIFIER_TEMPLATE (expr), 3);
+      pph_out_tree_1 (stream, IDENTIFIER_LABEL_VALUE (expr), 3);
+      pph_out_tree_1 (stream, REAL_IDENTIFIER_TYPE_VALUE (expr), 3);
       break;
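The point of the cleanup is that an accessor macro expanding to an lvalue can be used on both the reading and the writing side, so the streamer never names a struct field directly. A toy version of that pattern, outside of GCC, might look as follows. The struct, the TOY_* macros and the int-array "stream" are all invented for the illustration; they only mimic the shape of the real macros.

```c
#include <assert.h>

/* Toy node with two fields that callers should not touch directly.  */
struct toy_identifier
{
  int label_value;
  int type_value;
};

/* Accessor macros expand to lvalues, so the same spelling works on the
   left and on the right of an assignment.  If the storage for a field
   moves, only the macro changes.  */
#define TOY_IDENTIFIER_LABEL_VALUE(NODE) ((NODE)->label_value)
#define TOY_REAL_IDENTIFIER_TYPE_VALUE(NODE) ((NODE)->type_value)

/* Reader: fill the node from a two-element stream.  */
static void
toy_read_body (struct toy_identifier *id, const int *stream)
{
  TOY_IDENTIFIER_LABEL_VALUE (id) = stream[0];
  TOY_REAL_IDENTIFIER_TYPE_VALUE (id) = stream[1];
}

/* Writer: emit the fields in the same order the reader expects.  */
static void
toy_write_body (const struct toy_identifier *id, int *stream)
{
  stream[0] = TOY_IDENTIFIER_LABEL_VALUE (id);
  stream[1] = TOY_REAL_IDENTIFIER_TYPE_VALUE (id);
}
```

Keeping reader and writer symmetric through the same macros is what makes a later change to the underlying field layout a one-place edit.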
Re: [pph] Use REAL_IDENTIFIER_TYPE_VALUE instead of TREE_TYPE for IDENTIFIER_NODE (issue4965046)
On Thu, Aug 25, 2011 at 18:14, Gabriel Charette gch...@google.com wrote:
> This was the last thing remaining on my cleanup list.
>
> As suggested by Steven and Jason in issue4550121, we should use REAL_IDENTIFIER_TYPE_VALUE for IDENTIFIER_NODEs instead of TREE_TYPE (although the former resolves to the latter in its macro definition, this is more robust to potential later changes to TREE_TYPE in trunk). This patch makes that change.
>
> As mentioned by Steven in the same issue, we were accessing some fields directly instead of using their corresponding accessor macros. This patch uses the correct accessor macros in pph_read/write_tree_body.

Nice! Thanks for this cleanup.

> 2011-08-25  Gabriel Charette  gch...@google.com
>
>        * pph-streamer-in.c (pph_read_tree_body): Use accessor macros for all fields. Use REAL_IDENTIFIER_TYPE_VALUE instead of TREE_TYPE for the IDENTIFIER_NODE case.
>        * pph-streamer-out.c (pph_write_tree_body): Likewise.

OK.

Diego.
Re: New automaton_option collapse-ndfa
On 07/18/11 18:47, Vladimir Makarov wrote:
> But I guess comb-vector is popular for a reason. We could tolerate slow compression time because it is done once but worse compression and slower access would have a really bad impact on the compiler time.

With some fixes that I need to make to the C6X machine description, comb vector generation time is no longer tolerable. Ok to apply the following patch? (Bootstrapped and tested on i686-linux).

Bernd

        * genautomata.c (NO_COMB_OPTION): New macro.
        (no_comb_flag): New static variable.
        (gen_automata_option): Handle NO_COMB_OPTION.
        (comb_vect_p): False if no_comb_flag.
        (add_vect): Move computation of min/max values. Return early if no_comb_flag.
        * doc/md.texi (automata_option): Document no-comb-vect.

Index: gcc/genautomata.c
===================================================================
--- gcc/genautomata.c (revision 332057)
+++ gcc/genautomata.c (working copy)
@@ -252,6 +252,7 @@ static arc_t next_out_arc (
 #define W_OPTION "-w"
 #define NDFA_OPTION "-ndfa"
 #define COLLAPSE_OPTION "-collapse-ndfa"
+#define NO_COMB_OPTION "-no-comb-vect"
 #define PROGRESS_OPTION "-progress"
 
 /* The following flags are set up by function `initiate_automaton_gen'.  */
@@ -267,6 +268,9 @@ static int collapse_flag;
 /* Do not make minimization of DFA (`-no-minimization').  */
 static int no_minimization_flag;
 
+/* Do not try to generate a comb vector (`-no-comb-vect').  */
+static int no_comb_flag;
+
 /* Value of this variable is number of automata being generated.  The
    actual number of automata may be less this value if there is not
    sufficient number of units.  This value is defined by argument of
@@ -1538,6 +1542,8 @@ gen_automata_option (rtx def)
     ndfa_flag = 1;
   else if (strcmp (XSTR (def, 0), COLLAPSE_OPTION + 1) == 0)
     collapse_flag = 1;
+  else if (strcmp (XSTR (def, 0), NO_COMB_OPTION + 1) == 0)
+    no_comb_flag = 1;
   else if (strcmp (XSTR (def, 0), PROGRESS_OPTION + 1) == 0)
     progress_flag = 1;
   else
@@ -7190,6 +7196,8 @@ static int undefined_vect_el_value;
 static int
 comb_vect_p (state_ainsn_table_t tab)
 {
+  if (no_comb_flag)
+    return false;
   return (2 * VEC_length (vect_el_t, tab->full_vect)
           > 5 * VEC_length (vect_el_t, tab->comb_vect));
 }
@@ -7308,6 +7316,22 @@ add_vect (state_ainsn_table_t tab, int v
         VEC_replace (vect_el_t, tab->full_vect, full_base + i,
                      VEC_index (vect_el_t, vect, i));
     }
+
+  /* The comb_vect min/max values are also used for the full vector, so
+     compute them now.  */
+  for (vect_index = 0; vect_index < vect_length; vect_index++)
+    if (VEC_index (vect_el_t, vect, vect_index) != undefined_vect_el_value)
+      {
+        vect_el_t x = VEC_index (vect_el_t, vect, vect_index);
+        gcc_assert (x >= 0);
+        if (tab->max_comb_vect_el_value < x)
+          tab->max_comb_vect_el_value = x;
+        if (tab->min_comb_vect_el_value > x)
+          tab->min_comb_vect_el_value = x;
+      }
+  if (no_comb_flag)
+    return;
+
   /* Form comb vector in the table: */
   gcc_assert (VEC_length (vect_el_t, tab->comb_vect)
               == VEC_length (vect_el_t, tab->check_vect));
@@ -7417,10 +7441,6 @@ add_vect (state_ainsn_table_t tab, int v
                              comb_vect_index + vect_index)
                   == undefined_vect_el_value);
       gcc_assert (x >= 0);
-      if (tab->max_comb_vect_el_value < x)
-        tab->max_comb_vect_el_value = x;
-      if (tab->min_comb_vect_el_value > x)
-        tab->min_comb_vect_el_value = x;
       VEC_replace (vect_el_t, tab->comb_vect,
                    comb_vect_index + vect_index, x);
       VEC_replace (vect_el_t, tab->check_vect,
Index: gcc/doc/md.texi
===================================================================
--- gcc/doc/md.texi (revision 332057)
+++ gcc/doc/md.texi (working copy)
@@ -7795,6 +7795,13 @@ verification and debugging.
 non-critical errors.
 
 @item
+@dfn{no-comb-vect} prevents the automaton generator from generating
+two data structures and comparing them for space efficiency.  Using
+a comb vector to represent transitions may be better, but it can be
+very expensive to construct.  This option is useful if the build
+process spends an unacceptably long time in genautomata.
+
+@item
 @dfn{ndfa} makes nondeterministic finite state automata.  This affects
 the treatment of operator @samp{|} in the regular expressions.  The
 usual treatment of the operator is to try the first alternative and,
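For readers unfamiliar with the term, a comb vector stores sparse table rows overlaid in one array, with a parallel check vector recording which row owns each slot. The sketch below is a generic first-fit version of that idea, written to show why construction can be slow (each row searches for a displacement that avoids all earlier rows). It is not the algorithm in genautomata.c, and the names (struct comb_table, comb_add_row, comb_lookup) and fixed capacities are assumptions.

```c
#include <assert.h>
#include <stdbool.h>

#define COMB_CAPACITY 64
#define COMB_MAX_ROWS 8
#define COMB_UNDEFINED (-1)

struct comb_table
{
  int comb[COMB_CAPACITY];    /* Overlaid row contents.  */
  int check[COMB_CAPACITY];   /* Owning row of each slot, or undefined.  */
  int base[COMB_MAX_ROWS];    /* Displacement of each row in COMB.  */
  int used;                   /* One past the highest slot in use.  */
};

static void
comb_init (struct comb_table *t)
{
  for (int i = 0; i < COMB_CAPACITY; i++)
    t->comb[i] = t->check[i] = COMB_UNDEFINED;
  for (int i = 0; i < COMB_MAX_ROWS; i++)
    t->base[i] = 0;
  t->used = 0;
}

/* Place ROW (LEN entries, COMB_UNDEFINED marking holes) at the first
   displacement where none of its defined entries hits an occupied slot.
   The linear search over displacements is the expensive part.  */
static bool
comb_add_row (struct comb_table *t, int row_id, const int *row, int len)
{
  assert (row_id >= 0 && row_id < COMB_MAX_ROWS);

  for (int disp = 0; disp + len <= COMB_CAPACITY; disp++)
    {
      bool fits = true;
      for (int i = 0; i < len && fits; i++)
        if (row[i] != COMB_UNDEFINED && t->check[disp + i] != COMB_UNDEFINED)
          fits = false;
      if (!fits)
        continue;

      for (int i = 0; i < len; i++)
        if (row[i] != COMB_UNDEFINED)
          {
            t->comb[disp + i] = row[i];
            t->check[disp + i] = row_id;
          }
      t->base[row_id] = disp;
      if (disp + len > t->used)
        t->used = disp + len;
      return true;
    }
  return false;                 /* No room left in the comb.  */
}

/* Look up (ROW_ID, COL); the check vector tells us whether the slot
   really belongs to this row.  */
static int
comb_lookup (const struct comb_table *t, int row_id, int col)
{
  assert (row_id >= 0 && row_id < COMB_MAX_ROWS && col >= 0);
  int slot = t->base[row_id] + col;
  assert (slot < COMB_CAPACITY);
  return t->check[slot] == row_id ? t->comb[slot] : COMB_UNDEFINED;
}
```

Three sparse rows of four entries fit into six slots in this scheme instead of twelve, at the cost of the placement search and one extra comparison per lookup, which is the trade-off the no-comb-vect option lets a port opt out of.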
[pph] Detect #include outside the global context (issue4958045)
To prevent PPH images from depending on the context in which they were generated, we require that the header file be compiled in isolation and that, when included, it only be included in the global context. That is, things like

  namespace Foo {
  #include "bar.h"
  ...
  };

should prevent bar.h from becoming a PPH image, since all the symbols defined in it would belong to Foo, but when bar.pph was generated, they belonged to ::.

This patch detects the use of PPH images inside nested scopes like that. It is a bit crude, but it works. During lexing, it keeps track of open and close braces (all kinds, not just { }), so when the #include command is found, it rejects the image if the nesting level is positive.

Jason, I added a field to scope_chain. That seemed the cleaner approach to keep track of the nesting level. Is that a good place to keep track of it? This is the kind of bookkeeping that scope_chain seems to be used for, but I can put it elsewhere if you want.

This error flagged the usage of sys/types.pph. This file is included inside 'extern "C" { }', so strictly speaking it should be rejected. We can relax these restrictions later.

Tested on x86_64. Applied to branch.

        * cp-tree.h (struct saved_scope): Add field x_brace_nesting.
        * parser.c (cp_lexer_token_is_open_brace): New.
        (cp_lexer_token_is_close_brace): New.
        (cp_lexer_get_preprocessor_token): Call them. Increase scope_chain->x_brace_nesting for open braces, decrease for closing braces.
        * pph.c (pph_is_valid_here): New. Return false if scope_chain->x_brace_nesting is greater than 0.
        (pph_include_handler): Call it.

testsuite/ChangeLog.pph
        * g++.dg/pph/pph.exp: Do not create a PPH image for sys/types.h.
        * g++.dg/pph/y8inc-nmspc.cc: Mark fixed.
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 8fb2ccc..e0b67c6 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -973,6 +973,10 @@ struct GTY(()) saved_scope {
 
   cp_binding_level *bindings;
   struct saved_scope *prev;
+
+  /* Used during lexing to validate where PPH images are included, it
+     keeps track of nested bracing.  */
+  unsigned x_brace_nesting;
 };
 
 /* The current open namespace.  */
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 009922e..919671c 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -748,6 +748,31 @@ cp_lexer_saving_tokens (const cp_lexer* lexer)
   return VEC_length (cp_token_position, lexer->saved_tokens) != 0;
 }
 
+
+/* Return true if TOKEN is one of CPP_OPEN_SQUARE, CPP_OPEN_BRACE or
+   CPP_OPEN_PAREN.  */
+
+static inline bool
+cp_lexer_token_is_open_brace (cp_token *token)
+{
+  return token->type == CPP_OPEN_SQUARE
+         || token->type == CPP_OPEN_BRACE
+         || token->type == CPP_OPEN_PAREN;
+}
+
+
+/* Return true if TOKEN is one of CPP_CLOSE_SQUARE, CPP_CLOSE_BRACE or
+   CPP_CLOSE_PAREN.  */
+
+static inline bool
+cp_lexer_token_is_close_brace (cp_token *token)
+{
+  return token->type == CPP_CLOSE_SQUARE
+         || token->type == CPP_CLOSE_BRACE
+         || token->type == CPP_CLOSE_PAREN;
+}
+
+
 /* Store the next token from the preprocessor in *TOKEN.  Return true
    if we reach EOF.  If LEXER is NULL, assume we are handling an
    initial #pragma pch_preprocess, and thus want the lexer to return
@@ -835,8 +860,13 @@ cp_lexer_get_preprocessor_token (cp_lexer *lexer, cp_token *token)
                                      TREE_INT_CST_LOW (token->u.value));
       token->u.value = NULL_TREE;
     }
+  else if (cp_lexer_token_is_open_brace (token))
+    scope_chain->x_brace_nesting++;
+  else if (cp_lexer_token_is_close_brace (token))
+    scope_chain->x_brace_nesting--;
 }
 
+
 /* Update the globals input_location and the input file stack from TOKEN.  */
 static inline void
 cp_lexer_set_source_position_from_token (cp_token *token)
diff --git a/gcc/cp/pph.c b/gcc/cp/pph.c
index 9b1c63c..cbd9c24 100644
--- a/gcc/cp/pph.c
+++ b/gcc/cp/pph.c
@@ -94,11 +94,30 @@ pph_dump_namespace (FILE *file, tree ns)
 }
 
 
+/* Return true if PPH image NAME can be used at the point of inclusion
+   (given by LOC).  */
+
+static bool
+pph_is_valid_here (const char *name, location_t loc)
+{
+  /* If we are inside a scope, reject the image.  We could be inside a
+     namespace or a structure which changes the parsing context for
+     the original text file.  */
+  if (scope_chain->x_brace_nesting > 0)
+    {
+      error_at (loc, "PPH file %s not included at global scope", name);
+      return false;
+    }
+
+  return true;
+}
+
+
 /* Record a #include or #include_next for PPH.  */
 
 static bool
 pph_include_handler (cpp_reader *reader,
-                     location_t loc ATTRIBUTE_UNUSED,
+                     location_t loc,
                      const unsigned char *dname,
                      const char *name,
                      int angle_brackets,
@@ -117,7 +136,9 @@ pph_include_handler (cpp_reader *reader,
   read_text_file_p = true;
   pph_file =
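The brace-counting check this patch describes can be sketched outside of GCC as follows. This is a toy model only: the token enum, struct toy_scope and the toy_* functions are invented names, whereas the patch itself tests cpplib token types and keeps the counter in scope_chain. The underflow guard on close tokens is an addition of this sketch and is not in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical token kinds standing in for the CPP_OPEN_* and
   CPP_CLOSE_* token types.  */
enum toy_token
{
  TOK_OPEN_BRACE, TOK_CLOSE_BRACE,
  TOK_OPEN_PAREN, TOK_CLOSE_PAREN,
  TOK_OPEN_SQUARE, TOK_CLOSE_SQUARE,
  TOK_OTHER
};

struct toy_scope
{
  unsigned brace_nesting;   /* Number of currently open brackets.  */
};

static bool
toy_token_is_open (enum toy_token t)
{
  return t == TOK_OPEN_BRACE || t == TOK_OPEN_PAREN || t == TOK_OPEN_SQUARE;
}

static bool
toy_token_is_close (enum toy_token t)
{
  return t == TOK_CLOSE_BRACE || t == TOK_CLOSE_PAREN
         || t == TOK_CLOSE_SQUARE;
}

/* Called for every token the lexer produces.  */
static void
toy_note_token (struct toy_scope *scope, enum toy_token t)
{
  if (toy_token_is_open (t))
    scope->brace_nesting++;
  else if (toy_token_is_close (t) && scope->brace_nesting > 0)
    scope->brace_nesting--;   /* Guarded so stray closers cannot wrap.  */
}

/* Called when an #include is seen: an image is only usable when no
   bracket of any kind is open.  */
static bool
toy_include_is_valid_here (const struct toy_scope *scope)
{
  return scope->brace_nesting == 0;
}
```

Counting all three bracket kinds, as the patch does, is cruder than tracking real scopes but needs no parser state, which is why it can live in the lexer.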
Re: [pph] Detect #include outside the global context (issue4958045)
I'm getting the following pph test failure output after this patch. (I'll commit my patch on top of it anyway, as I get the same errors with a clean build and with my patch, which itself had clean test output before the pull that preceded my commit.)

XPASS: g++.dg/pph/x7rtti.cc -fno-dwarf2-cfi-asm -fpph-map=pph.map -I. (test for bogus messages, line )
XPASS: g++.dg/pph/x7rtti.cc -fno-dwarf2-cfi-asm -fpph-map=pph.map -I. (test for bogus messages, line )
XPASS: g++.dg/pph/x7rtti.cc -fno-dwarf2-cfi-asm -fpph-map=pph.map -I. (test for bogus messages, line )
XPASS: g++.dg/pph/x7rtti.cc -fno-dwarf2-cfi-asm -fpph-map=pph.map -I. (test for bogus messages, line )
XPASS: g++.dg/pph/x7rtti.cc -fno-dwarf2-cfi-asm -fpph-map=pph.map -I. (test for bogus messages, line )
XPASS: g++.dg/pph/x7rtti.cc -fno-dwarf2-cfi-asm -fpph-map=pph.map -I. (test for bogus messages, line )
XPASS: g++.dg/pph/x7rtti.cc -fno-dwarf2-cfi-asm -fpph-map=pph.map -I. (test for bogus messages, line )
XPASS: g++.dg/pph/x7rtti.cc -fno-dwarf2-cfi-asm -fpph-map=pph.map -I. (test for bogus messages, line )
XPASS: g++.dg/pph/x7rtti.cc -fno-dwarf2-cfi-asm -fpph-map=pph.map -I. (test for bogus messages, line )
XPASS: g++.dg/pph/x7rtti.cc -fno-dwarf2-cfi-asm -fpph-map=pph.map -I. (test for bogus messages, line )
XPASS: g++.dg/pph/x7rtti.cc -fno-dwarf2-cfi-asm -fpph-map=pph.map -I. (test for bogus messages, line )
XPASS: g++.dg/pph/x7rtti.cc -fno-dwarf2-cfi-asm -fpph-map=pph.map -I. (test for bogus messages, line )
XPASS: g++.dg/pph/x7rtti.cc -fno-dwarf2-cfi-asm -fpph-map=pph.map -I. (test for bogus messages, line )

# of expected passes            336
# of unexpected successes       35
# of expected failures          45
# of unresolved testcases       1

Cheers,
Gab

On Thu, Aug 25, 2011 at 3:30 PM, Diego Novillo dnovi...@google.com wrote:
> To prevent PPH images from depending on the context in which they were generated, we require that the header file be compiled in isolation and that, when included, it only be included in the global context.
PATCH: Support BMI, BMI2 and LZCNT in immintrin.h
Hi,

immintrin.h should support all Intel intrinsics. This patch adds BMI, BMI2 and LZCNT support to immintrin.h. OK for trunk?

Thanks.

H.J.
---
2011-08-25  H.J. Lu  hongjiu...@intel.com

        * config/i386/bmi2intrin.h: Allow in immintrin.h.
        * config/i386/bmiintrin.h: Likewise.
        * config/i386/lzcntintrin.h: Likewise.
        * config/i386/immintrin.h: Include lzcntintrin.h, bmiintrin.h and bmi2intrin.h.

diff --git a/gcc/config/i386/bmi2intrin.h b/gcc/config/i386/bmi2intrin.h
index f3ffa52..a72c9a9 100644
--- a/gcc/config/i386/bmi2intrin.h
+++ b/gcc/config/i386/bmi2intrin.h
@@ -1,4 +1,4 @@
-/* Copyright (C) 2010, 2011 Free Software Foundation, Inc.
+/* Copyright (C) 2011 Free Software Foundation, Inc.
 
    This file is part of GCC.
 
@@ -21,7 +21,7 @@
    see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
    <http://www.gnu.org/licenses/>.  */
 
-#ifndef _X86INTRIN_H_INCLUDED
+#if !defined _X86INTRIN_H_INCLUDED && !defined _IMMINTRIN_H_INCLUDED
 # error "Never use <bmi2intrin.h> directly; include <x86intrin.h> instead."
 #endif
 
diff --git a/gcc/config/i386/bmiintrin.h b/gcc/config/i386/bmiintrin.h
index 1699c61..af5d9dc 100644
--- a/gcc/config/i386/bmiintrin.h
+++ b/gcc/config/i386/bmiintrin.h
@@ -21,7 +21,7 @@
    see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
    <http://www.gnu.org/licenses/>.  */
 
-#ifndef _X86INTRIN_H_INCLUDED
+#if !defined _X86INTRIN_H_INCLUDED && !defined _IMMINTRIN_H_INCLUDED
 # error "Never use <bmiintrin.h> directly; include <x86intrin.h> instead."
 #endif
 
diff --git a/gcc/config/i386/immintrin.h b/gcc/config/i386/immintrin.h
index 3704df7..d2e715f 100644
--- a/gcc/config/i386/immintrin.h
+++ b/gcc/config/i386/immintrin.h
@@ -60,6 +60,18 @@
 #include <avx2intrin.h>
 #endif
 
+#ifdef __LZCNT__
+#include <lzcntintrin.h>
+#endif
+
+#ifdef __BMI__
+#include <bmiintrin.h>
+#endif
+
+#ifdef __BMI2__
+#include <bmi2intrin.h>
+#endif
+
 #ifdef __RDRND__
 extern __inline int
 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
diff --git a/gcc/config/i386/lzcntintrin.h b/gcc/config/i386/lzcntintrin.h
index 8df01d2..31db7dc 100644
--- a/gcc/config/i386/lzcntintrin.h
+++ b/gcc/config/i386/lzcntintrin.h
@@ -21,7 +21,7 @@
    see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
    <http://www.gnu.org/licenses/>.  */
 
-#ifndef _X86INTRIN_H_INCLUDED
+#if !defined _X86INTRIN_H_INCLUDED && !defined _IMMINTRIN_H_INCLUDED
 # error "Never use <lzcntintrin.h> directly; include <x86intrin.h> instead."
 #endif
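From the user's side, the effect of this patch is that code which includes only immintrin.h can reach the LZCNT intrinsics when the feature macro is defined. A minimal sketch of such user code follows. It assumes a compiler whose immintrin.h behaves as the patch describes and which provides the Intel-documented _lzcnt_u32 intrinsic; the wrapper name count_leading_zeros_32 and the portable fallback are additions for illustration.

```c
#include <assert.h>
#include <stdint.h>

#if defined (__LZCNT__)
# include <immintrin.h>   /* Expected to pull in lzcntintrin.h.  */
#endif

/* Hypothetical wrapper: count leading zero bits of a 32-bit value,
   returning 32 for zero, which matches the LZCNT instruction.  */
static unsigned
count_leading_zeros_32 (uint32_t x)
{
#if defined (__LZCNT__)
  return _lzcnt_u32 (x);
#else
  /* Portable fallback for builds without -mlzcnt.  */
  unsigned n = 0;
  if (x == 0)
    return 32;
  while ((x & UINT32_C (0x80000000)) == 0)
    {
      x <<= 1;
      n++;
    }
  return n;
#endif
}
```

Guarding the include on __LZCNT__ keeps the file buildable on targets and option sets where the header or the instruction is unavailable.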
Go patch committed: Simplify Bound_method_expression
Bound_method_expression in the Go frontend IR represents a method bound to an object, as in v.m. The IR was permitting the method to be any expression. However, Go does not support method pointers, so m was always a specific method (Go supports a comparable but different approach, method expressions associated with the type, as in T.m where T is a type). This patch to the Go frontend changes the IR so that a Bound_method_expression just refers directly to a method, rather than always using a Func_expression which refers to the method.

Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline.

Ian

diff -r a505fb42665b go/expressions.cc
--- a/go/expressions.cc Wed Aug 24 11:21:25 2011 -0700
+++ b/go/expressions.cc Thu Aug 25 15:47:46 2011 -0700
@@ -6798,9 +6798,7 @@
 int
 Bound_method_expression::do_traverse(Traverse* traverse)
 {
-  if (Expression::traverse(&this->expr_, traverse) == TRAVERSE_EXIT)
-    return TRAVERSE_EXIT;
-  return Expression::traverse(&this->method_, traverse);
+  return Expression::traverse(&this->expr_, traverse);
 }
 
 // Return the type of a bound method expression.  The type of this
@@ -6811,7 +6809,12 @@
 Type*
 Bound_method_expression::do_type()
 {
-  return this->method_->type();
+  if (this->method_->is_function())
+    return this->method_->func_value()->type();
+  else if (this->method_->is_function_declaration())
+    return this->method_->func_declaration_value()->type();
+  else
+    return Type::make_error_type();
 }
 
 // Determine the types of a method expression.
@@ -6819,9 +6822,7 @@
 void
 Bound_method_expression::do_determine_type(const Type_context*)
 {
-  this->method_->determine_type_no_context();
-  Type* mtype = this->method_->type();
-  Function_type* fntype = mtype == NULL ? NULL : mtype->function_type();
+  Function_type* fntype = this->type()->function_type();
   if (fntype == NULL || !fntype->is_method())
     this->expr_->determine_type_no_context();
   else
@@ -6836,14 +6837,12 @@
 void
 Bound_method_expression::do_check_types(Gogo*)
 {
-  Type* type = this->method_->type()->deref();
-  if (type == NULL
-      || type->function_type() == NULL
-      || !type->function_type()->is_method())
+  if (!this->method_->is_function()
+      && !this->method_->is_function_declaration())
     this->report_error(_("object is not a method"));
   else
     {
-      Type* rtype = type->function_type()->receiver()->type()->deref();
+      Type* rtype = this->type()->function_type()->receiver()->type()->deref();
       Type* etype = (this->expr_type_ != NULL
                      ? this->expr_type_
                      : this->expr_->type());
@@ -6881,14 +6880,13 @@
       ast_dump_context->ostream() << ")";
     }
 
-  ast_dump_context->ostream() << ".";
-  ast_dump_context->dump_expression(method_);
+  ast_dump_context->ostream() << "." << this->method_->name();
 }
 
 // Make a method expression.
 
 Bound_method_expression*
-Expression::make_bound_method(Expression* expr, Expression* method,
+Expression::make_bound_method(Expression* expr, Named_object* method,
                               source_location location)
 {
   return new Bound_method_expression(expr, method, location);
@@ -9257,6 +9255,9 @@
                                  Bound_method_expression* bound_method,
                                  tree* first_arg_ptr)
 {
+  Gogo* gogo = context->gogo();
+  source_location loc = this->location();
+
   Expression* first_argument = bound_method->first_argument();
   tree first_arg = first_argument->get_tree(context);
   if (first_arg == error_mark_node)
@@ -9272,7 +9273,7 @@
           || TREE_CODE(first_arg) == INDIRECT_REF
           || TREE_CODE(first_arg) == COMPONENT_REF)
         {
-          first_arg = build_fold_addr_expr(first_arg);
+          first_arg = build_fold_addr_expr_loc(loc, first_arg);
           if (DECL_P(first_arg))
             TREE_ADDRESSABLE(first_arg) = 1;
         }
@@ -9282,9 +9283,10 @@
                                 get_name(first_arg));
           DECL_IGNORED_P(tmp) = 0;
           DECL_INITIAL(tmp) = first_arg;
-          first_arg = build2(COMPOUND_EXPR, pointer_to_arg_type,
-                             build1(DECL_EXPR, void_type_node, tmp),
-                             build_fold_addr_expr(tmp));
+          first_arg = build2_loc(loc, COMPOUND_EXPR, pointer_to_arg_type,
+                                 build1_loc(loc, DECL_EXPR, void_type_node,
+                                            tmp),
+                                 build_fold_addr_expr_loc(loc, tmp));
           TREE_ADDRESSABLE(tmp) = 1;
         }
       if (first_arg == error_mark_node)
@@ -9296,8 +9298,8 @@
     {
       if (fatype->points_to() == NULL)
         fatype = Type::make_pointer_type(fatype);
-      Btype* bfatype = fatype->get_backend(context->gogo());
-      first_arg = fold_convert(type_to_tree(bfatype), first_arg);
+      Btype* bfatype = fatype->get_backend(gogo);
+      first_arg = fold_convert_loc(loc, type_to_tree(bfatype), first_arg);
       if (first_arg == error_mark_node
           || TREE_TYPE(first_arg) == error_mark_node)
         return error_mark_node;
@@ -9305,7 +9307,21 @@
 
   *first_arg_ptr = first_arg;
 
-  return bound_method->method()->get_tree(context);
+  Named_object* method = bound_method->method();
+  tree id = method->get_id(gogo);
+  if (id == error_mark_node)
+    return error_mark_node;
+
+  tree fndecl;
+  if
[PATCH][config]Add missing crt*.o in start and end file specs for linux-android.h
Hi,

We received this from Intel and would like to check it in to trunk. Could the maintainers of gcc/config take a look?

Thanks.

-Doug

2011-08-25  Mark D Horn  mark.d.h...@intel.com

	* config/linux-android.h (ANDROID_STARTFILE_SPEC, ANDROID_ENDFILE_SPEC):
	Add missing crt*.o objects for building a shared library.

Index: gcc/config/linux-android.h
===================================================================
--- gcc/config/linux-android.h (revision 178053)
+++ gcc/config/linux-android.h (working copy)
@@ -54,7 +54,7 @@
 
 #define ANDROID_STARTFILE_SPEC \
   "%{!shared:" \
-  "  %{static: crtbegin_static%O%s;: crtbegin_dynamic%O%s}}"
+  "  %{static: crtbegin_static%O%s;: crtbegin_dynamic%O%s};: crtbegin_so%O%s}"
 
 #define ANDROID_ENDFILE_SPEC \
-  "%{!shared: crtend_android%O%s}"
+  "%{!shared: crtend_android%O%s;: crtend_so%O%s}"
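As a reading aid for the patched specs, the selection they encode can be sketched as ordinary C. This is only an illustration of the `%{!shared: ...;: ...}` branching: the function names are hypothetical, and the sketch assumes `%O` expands to a `.o` object suffix and ignores the file search that `%s` requests.

```c
#include <stdbool.h>

/* Hypothetical model of the patched ANDROID_STARTFILE_SPEC:
   "%{!shared: %{static: crtbegin_static%O%s;: crtbegin_dynamic%O%s};: crtbegin_so%O%s}"  */
static const char *
android_startfile (bool shared, bool is_static)
{
  if (!shared)
    return is_static ? "crtbegin_static.o" : "crtbegin_dynamic.o";
  /* New in the patch: -shared links now get a start file too.  */
  return "crtbegin_so.o";
}

/* Hypothetical model of the patched ANDROID_ENDFILE_SPEC:
   "%{!shared: crtend_android%O%s;: crtend_so%O%s}"  */
static const char *
android_endfile (bool shared)
{
  return shared ? "crtend_so.o" : "crtend_android.o";
}
```

Before the patch, both functions would have returned nothing for the `shared` case, which is the gap the ChangeLog entry describes.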
Re: [4.7][google]Support for getting CPU type and feature information at run-time. (issue4893046)
On Thu, Aug 25, 2011 at 5:37 PM, Sriraman Tallam tmsri...@google.com wrote:

Hi, Thanks for all the comments. I am attaching a new patch incorporating all of the changes mentioned, mainly:

1) Make __cpu_indicator_init a constructor in libgcc and guard to call it only once.

This is unreliable and you don't need 3 symbols from libgcc. You can use

static struct cpu_indicator
{
  feature
  model
  status
} cpu_indicator;

struct cpu_indicator *
__get_cpu_indicator ()
{
  if cpu_indicator is uninitialized; then
    initialize cpu_indicator;
  return &cpu_indicator;
}

You can simply call __get_cpu_indicator to get a pointer to cpu_indicator.

--
H.J.
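The pseudocode above could be fleshed out roughly as follows. This is a sketch under assumptions, not the libgcc implementation: the field types, the `init_cpu_indicator` stub, the `init_calls` counter, and the unprefixed name `get_cpu_indicator` are all hypothetical, and the real detection code (CPUID queries) is deliberately left out.

```c
#include <stdbool.h>

/* Field names follow the pseudocode; their types are assumed here.  */
struct cpu_indicator
{
  unsigned int feature;
  unsigned int model;
  bool status;  /* true once the structure has been filled in */
};

static struct cpu_indicator cpu_indicator;

/* Demonstration-only counter showing how often initialization ran.  */
static unsigned int init_calls;

/* Hypothetical stand-in for the real CPU detection.  */
static void
init_cpu_indicator (struct cpu_indicator *ci)
{
  ci->feature = 0;
  ci->model = 0;
  ci->status = true;
  init_calls++;
}

/* Single accessor: initializes on first use, then returns the same
   pointer on every call.  Not thread-safe as written.  */
struct cpu_indicator *
get_cpu_indicator (void)
{
  if (!cpu_indicator.status)
    init_cpu_indicator (&cpu_indicator);
  return &cpu_indicator;
}
```

The point of this shape is that callers never depend on constructor ordering: whoever calls the accessor first triggers initialization, and only one symbol needs to be exported.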
Re: [pph] Detect #include outside the global context (issue4958045)
On 11-08-25 18:56 , Gabriel Charette wrote:

XPASS: g++.dg/pph/x7rtti.cc -fno-dwarf2-cfi-asm -fpph-map=pph.map -I. (test for bogus messages, line )
[ ... ]

Oops, my fault. When I changed x7rtti.cc into an executable test, I altered line numbers and forgot to update the dg markers. Fixed.

Diego.