Re: [PATCH][RFC] Re-write LTO type merging again, do tree merging
> > > > > > Ok, not streaming and comparing TREE_USED gets it improved to > > > > I will try to gather better data tomorrow. My mozilla build died on disk > > space, > > but according to stats we are now at about 7GB of GGC memory after merging. > > I was playing with the following patch that implements testing whether types > > are same in my (probably naive and wrong) understanding of ODR rule in C++ > > So i can confirm that we now need 3GB of TMP space instead of 8GB with earlier > version of patch. I will compare to mainline tomorrow, but I think it is > about the same. > phase opt and generate : 96.39 ( 9%) usr 40.45 (45%) sys 136.91 (12%) > wall 271042 kB ( 7%) ggc > phase stream in : 457.87 (43%) usr 8.38 ( 9%) sys 466.44 (40%) > wall 3798844 kB (93%) ggc > phase stream out: 509.39 (48%) usr 40.82 (46%) sys 550.88 (48%) > wall7149 kB ( 0%) ggc > ipa cp : 13.62 ( 1%) usr 5.00 ( 6%) sys 18.61 ( 2%) > wall 425204 kB (10%) ggc > ipa inlining heuristics : 60.52 ( 6%) usr 36.15 (40%) sys 96.71 ( 8%) > wall 1353370 kB (33%) ggc > ipa lto decl in : 346.94 (33%) usr 5.49 ( 6%) sys 352.60 (31%) > wall7042 kB ( 0%) ggc > ipa lto decl out: 481.19 (45%) usr 23.28 (26%) sys 504.68 (44%) > wall 0 kB ( 0%) ggc > TOTAL :1063.6789.65 1154.26 > 4078436 kB > > So we are still bound by streaming. I am running -flto-report overnight. [WPA] read 43363300 SCCs of average size 2.264113 [WPA] 98179403 tree bodies read in total [WPA] tree SCC table: size 16777213, 6422251 elements, collision ratio: 0.811639 [WPA] tree SCC max chain length 88 (size 1) [WPA] Compared 16544560 SCCs, 275298 collisions (0.016640) [WPA] Merged 16458553 SCCs [WPA] Merged 46453870 tree bodies [WPA] Merged 9535385 types [WPA] 6771259 types prevailed (21348860 associated trees) [WPA] Old merging code merges an additional 1759918 types of which 379059 are in the same SCC with their prevailing variant (19696849 and 15301625 associated trees) [WPA] GIMPLE canonical type table: size 131071, 77875 elements, 6771394 searches, 1528380 collisions (ratio: 0.225711) [WPA] GIMPLE canonical type hash table: size 16777213, 6771339 elements, 23174504 searches, 21075518 collisions (ratio: 0.909427) [LTRANS] read 228296 SCCs of average size 11.882460 [LTRANS] 2712718 tree bodies read in total [LTRANS] GIMPLE canonical type table: size 16381, 7025 elements, 704670 searches, 24040 collisions (ratio: 0.034115) [LTRANS] GIMPLE canonical type hash table: size 1048573, 704613 elements, 2269381 searches, 2021919 collisions (ratio: 0.890956) We manage to get stuck in one of ltranses on LRA LRA hard reg assignment : 476.07 (44%) usr 0.03 ( 0%) sys 476.08 (44%) wall 0 kB ( 0%) ggc 2860712.1151 lto1 alloc_page(unsigned int) 3564 1.5094 lto1 record_reg_classes(int, int, rtx_def**, machine_mode*, char const**, rtx_def*, reg_class*) 3235 1.3700 libc-2.11.1.so _int_malloc 3056 1.2942 lto1 ggc_set_mark(void const*) 2646 1.1206 lto1 gt_ggc_mx_lang_tree_node(void*) 2539 1.0753 lto1 bitmap_set_bit(bitmap_head_def*, int) 2333 0.9880 opreport /usr/bin/opreport 2210 0.9359 lto1 for_each_rtx_1(rtx_def*, int, int (*)(rtx_def**, void*), void*) 2133 0.9033 lto1 constrain_operands(int) 2128 0.9012 lto1 lookup_page_table_entry(void const*) 1586 0.6717 lto1 preprocess_constraints() While GGC memory is now under 7GB after type streaming and we GGC just once in WPA, the TOP usage still goes to about 12GB. With the ODR patch there are 424 devirtualizations happening during WPA and some extra (do not have stats for) during ltrans. Honza
Re: Aw: Re: [PATCH] Basic support for MIPS r5900
Richard Sandiford writes: > I can't approve the Makefile.in bits. I've cc'ed Ian, who's the libgcc > maintainer. Ian: the problem is that "_muldi3.o" on 64-bit targets > is actually an implementation of __multi3. Jürgen wants to have a > __muldi3 too, with the same implementation as on 32-bit targets. My assumption is that target maintainers can approve target-specific changes to libgcc, including Makefile changes. That said, it seems that this particular patch ought to mostly be in libgcc/config/mips/t-mips, using LIB2FUNCS_EXCLUDE and LIB2ADD. It's not clear to me that libgcc/Makefile.in needs any changes, and in case it should not be necessary for it to have anything like MIPSTYPE. That kind of thing belongs in config/mips. Ian
Re: GCC does not support *mmintrin.h with function specific opts
On Thu, Jun 13, 2013 at 12:40 PM, Jan Hubicka wrote: >> * tree-inline.c (expand_call_inline): Allow the error to be flagged >> in early inline pass. >> * ipa-inline.c (inline_always_inline_functions): Pretend always_inline >> functions are inlined during failures to flag an error. >> * gcc.target/i386/inline_error.c: New test. > This patch is OK if it passes testing. Two tests gcc.c-torture/compile/pr43791.c and pr44043.c are failing because of always_inline functions being present that cannot be inlined and the compiler is now generating error messages. I will fix them and resend the patch. Thanks Sri > Thanks for your patience! > > Honza
target.h uses insn-codes.h
When doing a -j16 build of top of the trunk from a little while ago, with our wide-int patches in it, I hit: g++ -c -g -DIN_GCC -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wwrite-strings -Wcast-qual -Wmissing-format-attribute -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -fno-common -DHAVE_CONFIG_H -I. -Ilto -I../../gcc/gcc -I../../gcc/gcc/lto -I../../gcc/gcc/../include -I../../gcc/gcc/../libcpp/include -I../../gcc/gcc/../libdecnumber -I../../gcc/gcc/../libdecnumber/bid -I../libdecnumber -I../../gcc/gcc/../libbacktrace ../../gcc/gcc/lto/lto-partition.c -o lto/lto-partition.o In file included from ../../gcc/gcc/lto-streamer.h:30:0, from ../../gcc/gcc/lto/lto-object.c:27: ../../gcc/gcc/target.h:52:24: fatal error: insn-codes.h: No such file or directory compilation terminated. make[3]: *** [lto/lto-object.o] Error 1 make[3]: *** Waiting for unfinished jobs In file included from ../../gcc/gcc/lto-streamer.h:30:0, from ../../gcc/gcc/lto/lto-partition.c:27: ../../gcc/gcc/target.h:52:24: fatal error: insn-codes.h: No such file or directory compilation terminated. make[3]: *** [lto/lto-partition.o] Error 1 to my untrained eye, it seems we are missing: diff --git a/gcc/Makefile.in b/gcc/Makefile.in index a098040..a9f4dee 100644 --- a/gcc/Makefile.in +++ b/gcc/Makefile.in @@ -841,7 +841,7 @@ EXCEPT_H = except.h $(HASHTAB_H) TARGET_DEF = target.def target-hooks-macros.h C_TARGET_DEF = c-family/c-target.def target-hooks-macros.h COMMON_TARGET_DEF = common/common-target.def target-hooks-macros.h -TARGET_H = $(TM_H) target.h $(TARGET_DEF) insn-modes.h +TARGET_H = $(TM_H) target.h $(TARGET_DEF) insn-modes.h insn-codes.h C_TARGET_H = c-family/c-target.h $(C_TARGET_DEF) COMMON_TARGET_H = common/common-target.h $(INPUT_H) $(COMMON_TARGET_DEF) MACHMODE_H = machmode.h mode-classes.def insn-modes.h I think insn-codes.h falls in to the same or easier class (no stamp file for it) than insn-modes.h, so, I think this is ok; but would be nice to have someone review it. I checked trunk, and I don't see any thing that would ensure insn-codes.h is done before target.h is used. Ok?
[PATCH] Remove old x86 builtins from documentation
Hey guys, I'm a first time patch submitter, please be gentle. It probably goes without saying, but I do not have svn write access. The patch attached is related to PR38836. This patch removes the following x86 builtins from the documentation: __builtin_ia32_cmpnlts * __builtin_ia32_loadaps __builtin_ia32_loadddup __builtin_ia32_loadsss ** __builtin_ia32_movddup __builtin_ia32_pextrw __builtin_ia32_pfrsqrtit1 __builtin_ia32_pinsrw __builtin_ia32_storeaps __builtin_ia32_storess * Corrected the spelling of __builtin_ia32_cmpnlts. The suffix should be "ss", not "s". ** Corrected the spelling of __builtin_ia32_loadss. The suffix should be "ss", not "sss". I have run both 'make info' and 'make dvi' also. Although, 'make dvi' appears to be failing with an unrelated issue in doc/libffi.dvi. I am assuming that this is a known failure. -Cameron Index: gcc/doc/extend.texi === --- gcc/doc/extend.texi (revision 200078) +++ gcc/doc/extend.texi (working copy) @@ -9916,8 +9916,6 @@ v4hi __builtin_ia32_pmaxsw (v4hi, v4hi) v8qi __builtin_ia32_pminub (v8qi, v8qi) v4hi __builtin_ia32_pminsw (v4hi, v4hi) -int __builtin_ia32_pextrw (v4hi, int) -v4hi __builtin_ia32_pinsrw (v4hi, int, int) int __builtin_ia32_pmovmskb (v8qi) void __builtin_ia32_maskmovq (v8qi, v8qi, char *) void __builtin_ia32_movntq (di *, di) @@ -9965,7 +9963,7 @@ v4si __builtin_ia32_cmpless (v4sf, v4sf) v4si __builtin_ia32_cmpunordss (v4sf, v4sf) v4si __builtin_ia32_cmpneqss (v4sf, v4sf) -v4si __builtin_ia32_cmpnlts (v4sf, v4sf) +v4si __builtin_ia32_cmpnltss (v4sf, v4sf) v4si __builtin_ia32_cmpnless (v4sf, v4sf) v4si __builtin_ia32_cmpordss (v4sf, v4sf) v4sf __builtin_ia32_maxps (v4sf, v4sf) @@ -10001,18 +,12 @@ The following built-in functions are available when @option{-msse} is used. @table @code -@item v4sf __builtin_ia32_loadaps (float *) -Generates the @code{movaps} machine instruction as a load from memory. -@item void __builtin_ia32_storeaps (float *, v4sf) -Generates the @code{movaps} machine instruction as a store to memory. @item v4sf __builtin_ia32_loadups (float *) Generates the @code{movups} machine instruction as a load from memory. @item void __builtin_ia32_storeups (float *, v4sf) Generates the @code{movups} machine instruction as a store to memory. -@item v4sf __builtin_ia32_loadsss (float *) +@item v4sf __builtin_ia32_loadss (float *) Generates the @code{movss} machine instruction as a load from memory. -@item void __builtin_ia32_storess (float *, v4sf) -Generates the @code{movss} machine instruction as a store to memory. @item v4sf __builtin_ia32_loadhps (v4sf, const v2sf *) Generates the @code{movhps} machine instruction as a load from memory. @item v4sf __builtin_ia32_loadlps (v4sf, const v2sf *) @@ -10196,19 +10188,11 @@ v4sf __builtin_ia32_hsubps (v4sf, v4sf) v16qi __builtin_ia32_lddqu (char const *) void __builtin_ia32_monitor (void *, unsigned int, unsigned int) -v2df __builtin_ia32_movddup (v2df) v4sf __builtin_ia32_movshdup (v4sf) v4sf __builtin_ia32_movsldup (v4sf) void __builtin_ia32_mwait (unsigned int, unsigned int) @end smallexample -The following built-in functions are available when @option{-msse3} is used. - -@table @code -@item v2df __builtin_ia32_loadddup (double const *) -Generates the @code{movddup} machine instruction as a load from memory. -@end table - The following built-in functions are available when @option{-mssse3} is used. All of them generate the machine instruction that is part of the name with MMX registers. @@ -10991,7 +10975,6 @@ v2sf __builtin_ia32_pfrcpit1 (v2sf, v2sf) v2sf __builtin_ia32_pfrcpit2 (v2sf, v2sf) v2sf __builtin_ia32_pfrsqrt (v2sf) -v2sf __builtin_ia32_pfrsqrtit1 (v2sf, v2sf) v2sf __builtin_ia32_pfsub (v2sf, v2sf) v2sf __builtin_ia32_pfsubr (v2sf, v2sf) v2sf __builtin_ia32_pi2fd (v2si)
Re: [C++ Patch] PR 57599
OK. Jason
[v3] LWG 2101, LWG 2196, and more
Hi, I applied this nice batch of improvements coming from Daniel. Thanks, Paolo. // 2013-06-13 Daniel Krugler * include/std/type_traits (is_function): Support ref-qualified functions. (is_copy_constructible, is_move_constructible, is_copy_assignable, is_move_assignable, is_nothrow_copy_constructible, is_nothrow_move_constructible, is_nothrow_copy_assignable, is_nothrow_move_assignable): Implement LWG 2196. (add_lvalue_reference, add_rvalue_reference, add_pointer): Implement LWG 2101. (__strip_reference_wrapper<>): Remove, unused. * testsuite/20_util/add_lvalue_reference/value.cc: Extend. * testsuite/20_util/add_rvalue_reference/value.cc: Likewise. * testsuite/20_util/decay/requirements/typedefs.cc: Likewise. * testsuite/20_util/is_assignable/value.cc: Likewise. * testsuite/20_util/is_constructible/value-2.cc: Likewise. * testsuite/20_util/is_copy_assignable/value.cc: Likewise. * testsuite/20_util/is_copy_constructible/value.cc: Likewise. * testsuite/20_util/is_function/value.cc: Likewise. * testsuite/20_util/is_move_assignable/value.cc: Likewise. * testsuite/20_util/is_move_constructible/value.cc: Likewise. * testsuite/20_util/is_nothrow_copy_assignable/value.cc: Likewise. * testsuite/20_util/is_nothrow_copy_constructible/value.cc: Likewise. * testsuite/20_util/is_nothrow_move_assignable/value.cc: Likewise. * testsuite/20_util/is_nothrow_move_constructible/value.cc: Likewise. * testsuite/20_util/declval/requirements/1_neg.cc: Adjust dg-error line number. * testsuite/20_util/make_signed/requirements/typedefs_neg.cc: Likewise. * testsuite/20_util/make_unsigned/requirements/typedefs_neg.cc: Likewise. Index: include/std/type_traits === --- include/std/type_traits (revision 200079) +++ include/std/type_traits (working copy) @@ -377,33 +377,97 @@ : public true_type { }; template +struct is_function<_Res(_ArgTypes...) &> +: public true_type { }; + + template +struct is_function<_Res(_ArgTypes...) &&> +: public true_type { }; + + template struct is_function<_Res(_ArgTypes..)> : public true_type { }; template +struct is_function<_Res(_ArgTypes..) &> +: public true_type { }; + + template +struct is_function<_Res(_ArgTypes..) &&> +: public true_type { }; + + template struct is_function<_Res(_ArgTypes...) const> : public true_type { }; template +struct is_function<_Res(_ArgTypes...) const &> +: public true_type { }; + + template +struct is_function<_Res(_ArgTypes...) const &&> +: public true_type { }; + + template struct is_function<_Res(_ArgTypes..) const> : public true_type { }; template +struct is_function<_Res(_ArgTypes..) const &> +: public true_type { }; + + template +struct is_function<_Res(_ArgTypes..) const &&> +: public true_type { }; + + template struct is_function<_Res(_ArgTypes...) volatile> : public true_type { }; template +struct is_function<_Res(_ArgTypes...) volatile &> +: public true_type { }; + + template +struct is_function<_Res(_ArgTypes...) volatile &&> +: public true_type { }; + + template struct is_function<_Res(_ArgTypes..) volatile> : public true_type { }; template +struct is_function<_Res(_ArgTypes..) volatile &> +: public true_type { }; + + template +struct is_function<_Res(_ArgTypes..) volatile &&> +: public true_type { }; + + template struct is_function<_Res(_ArgTypes...) const volatile> : public true_type { }; template +struct is_function<_Res(_ArgTypes...) const volatile &> +: public true_type { }; + + template +struct is_function<_Res(_ArgTypes...) const volatile &&> +: public true_type { }; + + template struct is_function<_Res(_ArgTypes..) const volatile> : public true_type { }; + template +struct is_function<_Res(_ArgTypes..) const volatile &> +: public true_type { }; + + template +struct is_function<_Res(_ArgTypes..) const volatile &&> +: public true_type { }; + template struct __is_null_pointer_helper : public false_type { }; @@ -482,6 +546,23 @@ : public __is_member_pointer_helper::type>::type { }; + // Utility to detect referenceable types ([defns.referenceable]). + + template +struct __is_referenceable +: public __or_, is_reference<_Tp>>::type +{ }; + + template +struct __is_referenceable<_Res(_Args...)> +: public true_type +{ }; + + template +struct __is_referenceable<_Res(_Args..)> +: public true_type +{ }; + // Type properties. /// is_const @@ -947,15 +1028,15 @@
Re: [Patch tree-ssa] RFC: Enable path threading for control variables (PR tree-optimization/54742).
On Thu, Jun 13, 2013 at 08:29:08PM +0100, Steve Ellcey wrote: > On Fri, 2013-06-07 at 16:14 +0100, James Greenhalgh wrote: > > > Beyond that, the path search code is modified from Steve's patch > > to only perform one depth first search, and the path copy code > > is modified to purge the region entry and exit edges for those > > which traditional jump-threading may consider. > > > > Opinions? > > > > Thanks, > > James Greenhalgh > > Hi James, > > I tried out your patch and I am not getting the speed up in > coremark that I got with my patch. I wonder if restricting it > to one depth first search is the reason. With my patch I was > able to completely get rid of the switch statement, but with > this patch it still exists. Maybe we need an option to do > multiple searches? > > Steve Ellcey > sell...@mips.com > Hi Steve, Thanks for having a look at the patch, though you appear to get very different results to my local build. Comparing a bootstrapped native x86_64 compiler with my patch and these flags: /work/gcc-build-jg-threading/build-x86/install/bin/gcc -S -O3 -Ilinux64 core_state.c And a bootstrapped native x86_64 compiler with your patch and these flags: /work/gcc-build-sje-threading/build-x86/install/bin/gcc -S -O3 -Ilinux64 -ftree-switch-shortcut core_state.c I see only minor cosmetic differences (Placement of labels, numbering of labels) between the two generated assembly files. Perhaps you could share the flags you were using and I can try to figure out which paths I seem to be missing. If I can't find them, I'll happily revert to using your search strategy. Regards, James Greenhalgh
Re: [C++ Patch] PR 57599
Hi again, On 06/13/2013 03:51 PM, Jason Merrill wrote: Perhaps we should hand off to build_static_cast in the case described by the comment, rather than duplicate the logic. Thus I have regression tested successfully the below. Thanks, Paolo. /cp 2013-06-13 Paolo Carlini PR c++/57599 * rtti.c (build_dynamic_cast_1): In case of cast to an unambiguous accessible base simply forward to build_static_cast. /testsuite 2013-06-13 Paolo Carlini PR c++/57599 * g++.dg/rtti/dyncast6.C: New. * g++.dg/cpp0x/dyncast1.C: Likewise. Index: cp/rtti.c === --- cp/rtti.c (revision 200079) +++ cp/rtti.c (working copy) @@ -622,19 +622,10 @@ build_dynamic_cast_1 (tree type, tree expr, tsubst /* If *type is an unambiguous accessible base class of *exprtype, convert statically. */ { -tree binfo; - -binfo = lookup_base (TREE_TYPE (exprtype), TREE_TYPE (type), -ba_check, NULL, complain); - +tree binfo = lookup_base (TREE_TYPE (exprtype), TREE_TYPE (type), + ba_check, NULL, complain); if (binfo) - { - expr = build_base_path (PLUS_EXPR, convert_from_reference (expr), - binfo, 0, complain); - if (TYPE_PTR_P (exprtype)) - expr = rvalue (expr); - return expr; - } + return build_static_cast (type, expr, complain); } /* Otherwise *exprtype must be a polymorphic class (have a vtbl). */ Index: testsuite/g++.dg/cpp0x/dyncast1.C === --- testsuite/g++.dg/cpp0x/dyncast1.C (revision 0) +++ testsuite/g++.dg/cpp0x/dyncast1.C (working copy) @@ -0,0 +1,31 @@ +// PR c++/57599 +// { dg-do compile { target c++11 } } + +struct A { }; +struct B : public A { }; + +template +struct is_same { static constexpr bool value = false; }; + +template +struct is_same { static constexpr bool value = true; }; + +template +T val(); + +static_assert(is_same(val())), + A*>::value, "Ouch"); +static_assert(is_same(val())), + A&>::value, "Ouch"); +static_assert(is_same(val())), + const A*>::value, "Ouch"); +static_assert(is_same(val())), + const A&>::value, "Ouch"); +static_assert(is_same(val())), + volatile A*>::value, "Ouch"); +static_assert(is_same(val())), + volatile A&>::value, "Ouch"); +static_assert(is_same(val())), + const volatile A*>::value, "Ouch"); +static_assert(is_same(val())), + const volatile A&>::value, "Ouch"); Index: testsuite/g++.dg/rtti/dyncast6.C === --- testsuite/g++.dg/rtti/dyncast6.C(revision 0) +++ testsuite/g++.dg/rtti/dyncast6.C(working copy) @@ -0,0 +1,59 @@ +// PR c++/57599 + +class A { }; + +class B : public A { }; + +void p() +{ + B* b; + + A* a1; + a1 = dynamic_cast(b); + a1 = dynamic_cast(b); // { dg-error "invalid" } + a1 = dynamic_cast(b); // { dg-error "invalid" } + a1 = dynamic_cast(b); // { dg-error "invalid" } + + const A* a2; + a2 = dynamic_cast(b); + a2 = dynamic_cast(b); + a2 = dynamic_cast(b); // { dg-error "invalid" } + a2 = dynamic_cast(b); // { dg-error "invalid" } + + volatile A* a3; + a3 = dynamic_cast(b); + a3 = dynamic_cast(b); // { dg-error "invalid" } + a3 = dynamic_cast(b); + a3 = dynamic_cast(b); // { dg-error "invalid" } + + const volatile A* a4; + a4 = dynamic_cast(b); + a4 = dynamic_cast(b); + a4 = dynamic_cast(b); + a4 = dynamic_cast(b); +} + +void r() +{ + B b; + + A& a1 = dynamic_cast(b); + A& a2 = dynamic_cast(b);// { dg-error "invalid" } + A& a3 = dynamic_cast(b); // { dg-error "invalid" } + A& a4 = dynamic_cast(b); // { dg-error "invalid" } + + const A& ca1 = dynamic_cast(b); + const A& ca2 = dynamic_cast(b); + const A& ca3 = dynamic_cast(b); // { dg-error "invalid" } + const A& ca4 = dynamic_cast(b); // { dg-error "invalid" } + + volatile A& va1 = dynamic_cast(b); + volatile A& va2 = dynamic_cast(b); // { dg-error "invalid" } + volatile A& va3 = dynamic_cast(b); + volatile A& va4 = dynamic_cast(b);// { dg-error "invalid" } + + const volatile A& cva1 = dynamic_cast(b); + const volatile A& cva2 = dynamic_cast(b); + const volatile A& cva3 = dynamic_cast(b); + const volatile A& cva4 = dynamic_cast(b); +}
Re: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)
On Thu, Jun 13, 2013 at 05:42:17PM +0200, Jakub Jelinek wrote: > On Fri, Jun 14, 2013 at 01:07:01AM +0930, Alan Modra wrote: > > @@ -5774,10 +5818,11 @@ offsettable_ok_by_alignment (rtx op, HOST_WIDE_INT > >type = TREE_TYPE (decl); > > > >dalign = TYPE_ALIGN (type); > > + dalign = DATA_ABI_ALIGNMENT (type, dalign); > >if (CONSTANT_CLASS_P (decl)) > > dalign = CONSTANT_ALIGNMENT (decl, dalign); > >else > > - dalign = DATA_ALIGNMENT (decl, dalign); > > + dalign = DATA_ALIGNMENT (type, dalign); > > > >if (dsize == 0) > > { > > What is this code trying to do? Shouldn't it just use DECL_ALIGN > which should be set to the right value from get_variable_alignment? > I mean, if !decl_binds_to_current_def_p (decl), then using DATA_ALIGNMENT > or CONSTANT_ALIGNMENT (for anything but actually emitting the var into > object, or just as an optimization hint that very likely the decl will be > aligned enough, but not guaranteed), which are optimization, is wrong > (an ABI problem). It is handling !DECL_P trees, which must be local. I know I saw STRING_CST here when I wrote offsettable_ok_by_alignment, hence the use of CONSTANT_ALIGNMENT. I'm not so sure about the need for DATA_ALIGNMENT now, but if it was correct before then we ought to be using both DATA_ABI_ALIGNMENT and DATA_ALIGNMENT after your changes. -- Alan Modra Australia Development Lab, IBM
Re: [PATCH][RFC] Re-write LTO type merging again, do tree merging
> > > > Ok, not streaming and comparing TREE_USED gets it improved to > > I will try to gather better data tomorrow. My mozilla build died on disk > space, > but according to stats we are now at about 7GB of GGC memory after merging. > I was playing with the following patch that implements testing whether types > are same in my (probably naive and wrong) understanding of ODR rule in C++ So i can confirm that we now need 3GB of TMP space instead of 8GB with earlier version of patch. I will compare to mainline tomorrow, but I think it is about the same. phase opt and generate : 96.39 ( 9%) usr 40.45 (45%) sys 136.91 (12%) wall 271042 kB ( 7%) ggc phase stream in : 457.87 (43%) usr 8.38 ( 9%) sys 466.44 (40%) wall 3798844 kB (93%) ggc phase stream out: 509.39 (48%) usr 40.82 (46%) sys 550.88 (48%) wall 7149 kB ( 0%) ggc ipa cp : 13.62 ( 1%) usr 5.00 ( 6%) sys 18.61 ( 2%) wall 425204 kB (10%) ggc ipa inlining heuristics : 60.52 ( 6%) usr 36.15 (40%) sys 96.71 ( 8%) wall 1353370 kB (33%) ggc ipa lto decl in : 346.94 (33%) usr 5.49 ( 6%) sys 352.60 (31%) wall 7042 kB ( 0%) ggc ipa lto decl out: 481.19 (45%) usr 23.28 (26%) sys 504.68 (44%) wall 0 kB ( 0%) ggc TOTAL :1063.6789.65 1154.26 4078436 kB So we are still bound by streaming. I am running -flto-report overnight. My ODR patch finds 36377 matches and also weird looking mismatches of type: constant 1024> unit size constant 128> align 64 symtab 0 alias set -1 canonical type 0x7fbd30f0bc78 fields unit size align 16 symtab 0 alias set -1 canonical type 0x7fbd41566540 precision 16 min max > unsigned nonlocal HI file /usr/include/bits/socket.h line 189 col 0 size unit size align 16 offset_align 128 offset bit offset context chain unsigned nonlocal DI file /usr/include/bits/socket.h line 190 col 0 size unit size align 64 offset_align 128 offset bit offset context chain >> context chain > constant 1024> unit size constant 128> align 64 symtab 0 alias set -1 canonical type 0x7fbd30f0bc78 fields unit size align 16 symtab 0 alias set -1 canonical type 0x7fbd41566540 precision 16 min max > unsigned HI file /usr/include/bits/socket.h line 189 col 0 size unit size align 16 offset_align 128 offset bit offset context chain unsigned DI file /usr/include/bits/socket.h line 190 col 0 size unit size align 64 offset_align 128 offset bit offset context chain >> context pointer_to_this chain > that mismatch because we run into following difference: unit size align 64 symtab 0 alias set -1 canonical type 0x7fbd30f0bc78 fields unsigned nonlocal HI file /usr/include/bits/socket.h line 189 col 0 size unit size align 16 offset_align 128 offset bit offset context chain > context chain > public VOID file /usr/include/bits/socket.h line 187 col 0 align 8 context > I am not sure what means that one type has more TYPE_DECLs stacked than the other. Honza
Re: [PATCH][RFC] Re-write LTO type merging again, do tree merging
> > Ok, not streaming and comparing TREE_USED gets it improved to I will try to gather better data tomorrow. My mozilla build died on disk space, but according to stats we are now at about 7GB of GGC memory after merging. I was playing with the following patch that implements testing whether types are same in my (probably naive and wrong) understanding of ODR rule in C++ It prints type pairs that seems same and then it verifies that they are having same names and they are in same namespaces and records. On Javascript there are 5000 types found same by devirtualization code this way that are not having the same MAIN VARIANT. I gess those trees may be good starting point for you to look why they are not merged. I suppose that once we have maintenable code base we can get into more aggressive merging in special cases. Requiring trees to be exactly same is a good default behaviour. We however may take advantage of extra knowledge. FE may tag types/decls that are subject to ODR rule and for those we can reduce the hash to be based only on name+context and we can even output sane diagnostic on mismatches. Simiarly I think it would help a lot if we proactively merged !can_prevail_p decls with matching types into those that can prevail by hashing PUBLIC decls only by their assembler name. Merging those should subsequently allow collapsing the types that are otherwise kept separate just because associated vtables are having differences in EXTERNAL and PUBLIC flags on the methods and such. Index: tree.c === --- tree.c (revision 200064) +++ tree.c (working copy) @@ -11618,6 +11711,91 @@ lhd_gcc_personality (void) return gcc_eh_personality_decl; } +/* For languages with One Definition Rule, work out if + decls are actually the same even if the tree representation + differs. This handles only decls appearing in TYPE_NAME + and TYPE_CONTEXT. That is NAMESPACE_DECL, TYPE_DECL, + RECORD_TYPE and IDENTIFIER_NODE. */ + +static bool +decls_same_for_odr (tree decl1, tree decl2) +{ + if (decl1 == decl2) +return true; + if (!decl1 || !decl2) +{ + fprintf (stderr, "Nesting mismatch\n"); + debug_tree (decl1); + debug_tree (decl2); + return false; +} + if (TREE_CODE (decl1) != TREE_CODE (decl2)) +{ + fprintf (stderr, "Code mismatch\n"); + debug_tree (decl1); + debug_tree (decl2); + return false; +} + if (TREE_CODE (decl1) == TRANSLATION_UNIT_DECL) +return true; + if (TREE_CODE (decl1) != NAMESPACE_DECL + && TREE_CODE (decl1) != RECORD_TYPE + && TREE_CODE (decl1) != TYPE_DECL) +{ + fprintf (stderr, "Decl type mismatch\n"); + debug_tree (decl1); + return false; +} + if (!DECL_NAME (decl1)) +{ + fprintf (stderr, "Anonymous; name mysmatch\n"); + debug_tree (decl1); + return false; +} + if (!decls_same_for_odr (DECL_NAME (decl1), DECL_NAME (decl2))) +return false; + return decls_same_for_odr (DECL_CONTEXT (decl1), +DECL_CONTEXT (decl2)); +} + +/* For languages with One Definition Rule, work out if + types are same even if the tree representation differs. + This is non-trivial for LTO where minnor differences in + the type representation may have prevented type merging + to merge two copies of otherwise equivalent type. */ + +static bool +types_same_for_odr (tree type1, tree type2) +{ + type1 = TYPE_MAIN_VARIANT (type1); + type2 = TYPE_MAIN_VARIANT (type2); + if (type1 == type2) +return true; + if (!type1 || !type2) +return false; + + /* If types are not structuraly same, do not bother to contnue. + Match in the remainder of code would mean ODR violation. */ + if (!types_compatible_p (type1, type2)) +return false; + + debug_tree (type1); + debug_tree (type2); + if (!TYPE_NAME (type1)) +{ + fprintf (stderr, "Anonymous; name mysmatch\n"); + return false; +} + if (!decls_same_for_odr (TYPE_NAME (type1), TYPE_NAME (type2))) +return false; + if (!decls_same_for_odr (TYPE_CONTEXT (type1), TYPE_CONTEXT (type2))) +return false; + fprintf (stderr, "type match!\n"); + gcc_assert (in_lto_p); + + return true; +} + /* Try to find a base info of BINFO that would have its field decl at offset OFFSET within the BINFO type and which is of EXPECTED_TYPE. If it can be found, return, otherwise return NULL_TREE. */ @@ -11633,8 +11811,8 @@ get_binfo_at_offset (tree binfo, HOST_WI tree fld; int i; - if (TYPE_MAIN_VARIANT (type) == TYPE_MAIN_VARIANT (expected_type)) - return binfo; + if (types_same_for_odr (type, expected_type)) +return binfo; if (offset < 0) return NULL_TREE; @@ -11663,7 +11841,7 @@ get_binfo_at_offset (tree binfo, HOST_WI { tree base_binfo, found_binfo = NULL_TREE; for (i = 0; BINFO_BASE_ITERATE (binfo, i, base_binfo)
Re: [Patch, Fortran] PR57596 - Fix OPTIONAL handling of deferred-length strings
Le 13/06/2013 11:13, Tobias Burnus a écrit : > A rather simple patch. I wonder why we didn't get in trouble before - > the "*dummy = NULL;" part should affect also other optional allocatable > dummy arguments. > > Build and regtested on x86-64-gnu-linux. > OK for the trunk? > OK; thanks. Mikael
Re: [gomp4] Some progress on #pragma omp simd
On 06/12/13 16:36, Jakub Jelinek wrote: On Wed, Jun 12, 2013 at 10:38:00AM -0700, Richard Henderson wrote: On 06/12/2013 10:30 AM, Jakub Jelinek wrote: So the built-ins would take address of this decl, something else? Perhaps address, perhaps just referenced uninitialized? True, assuming no pass would actually want to change that SSA_NAME of the magic decl just because it is undefined (coalesce with some other undefined SSA_NAME or something similar). I hope nothing does that, it would be problematic for the uninitialized warning pass too I bet. Boo hiss! I've seen uninitialized variables cause all sorts of grief when cleaning up SSA.
Re: [gomp4] Some progress on #pragma omp simd
On Thu, Jun 13, 2013 at 03:15:45PM -0500, Aldy Hernandez wrote: > > >it. Also, not sure what to do for lastprivate, probably use the magic > >arrays and just in the epilogue of the loop compute which of the array items > >belonged to the last iteration somehow. > > Can't you do (for lastprivate(abc) something like: > > if (i == 1024) { > abc = magic_abc[__builtin_GOMP.simd_lane (1)]; > } Well, if you do that inside of the loop, you make it probably not vectorizable. So you need something like: abc = magic_abc[(count - 1) & (__builtin_GOMP.simd_vf (1) - 1)]; or so. > >#pragma omp declare simd > >__attribute__((noinline, noclone)) void > >bar (int &x, int &y) > >{ > > x += 4; > > y += 4; > >} > > Does bar() have anything to do with this example, or was this an oversight? It was there just to make the stuff addressable during gimplification, and possibly no longer addressable afterwards. > >using the magic arrays and so is reduction. While the vectorizer can > >recognize some reductions, e.g. without -ffast-math it will not vectorize > >any floating point ones because that means changing the order of > >computations, while when they are mandated to be one copy per simd lane, > >the order of computations is clear and thus can be vectorized. > > Let me see if I understand (all things floating point confuse me). > You're saying that the vectorizer, in its present state will refuse > to vectorize reductions with floats because it may possibly change > the order of computations, but we should override that behavior for > OMP simd loops? No, I'm saying that in simd loops the order of computations is different (and depending on the vectorization factor), as each SIMD lane is supposed to have its own private variable and at the end everything is reduced together. > > D.2717[D.2714].s = D.2702; > > D.2703 = b[i]; > > a.0 = a; > > D.2705 = a.0 + x; > > D.2701 = D.2717[D.2714].s; > > Is there some subtlety in which we have to dereference D.2717 twice > here, or can we reuse D.2702? Usually it is FRE/PRE that optimizes at least the loads, and DSE stores, but FRE/PRE isn't run after vectorization I think. Jakub
Re: [gomp4] Some progress on #pragma omp simd
it. Also, not sure what to do for lastprivate, probably use the magic arrays and just in the epilogue of the loop compute which of the array items belonged to the last iteration somehow. Can't you do (for lastprivate(abc) something like: if (i == 1024) { abc = magic_abc[__builtin_GOMP.simd_lane (1)]; } #pragma omp declare simd __attribute__((noinline, noclone)) void bar (int &x, int &y) { x += 4; y += 4; } Does bar() have anything to do with this example, or was this an oversight? using the magic arrays and so is reduction. While the vectorizer can recognize some reductions, e.g. without -ffast-math it will not vectorize any floating point ones because that means changing the order of computations, while when they are mandated to be one copy per simd lane, the order of computations is clear and thus can be vectorized. Let me see if I understand (all things floating point confuse me). You're saying that the vectorizer, in its present state will refuse to vectorize reductions with floats because it may possibly change the order of computations, but we should override that behavior for OMP simd loops? D.2717[D.2714].s = D.2702; D.2703 = b[i]; a.0 = a; D.2705 = a.0 + x; D.2701 = D.2717[D.2714].s; Is there some subtlety in which we have to dereference D.2717 twice here, or can we reuse D.2702? Aldy
Re: GCC does not support *mmintrin.h with function specific opts
yes, what you said makes sense. thanks, David On Thu, Jun 13, 2013 at 12:45 PM, Jan Hubicka wrote: >> If you want to flag errors for all possible wrongly used always_inline >> attribute, should this change be done in can_inline_edge_p? Or keep >> your current change, but also add a warning (something like 'always >> inline function is ignored etc') in inline_always_inline_functions >> when inline transformation can not meaningfully give a useful error. > > We have the CIF error codes to be able to give useful diagnostic at > transformation time. I think it is better to have all the diagnostic output > at > one place unless we have really good reasons to fork it. We are not losing > any > precision here, right? > > Sure, other option would be to move all alwaysinline diagnostic into > inline_always_inline_functions and remove the code path in tree-inline. The > warnings however are quite meaningfully places in tree-inline, because we want > to warn only after all the inlining algorithms has finished and inlining > really > did not happen. They can be moved out there if we walk the edges at end and > output diagnostic, but I do not see anything really wrong with their current > location. > > Honza
[PATCH, committed] Fix thinko in power8 switches
I had a thinko in processing the switches for quad memory, in that it cleared quad memory on 32-bit, and then OR'ed in the switches from the -mcpu, when it should have cleared it after the OR. I have committed this patch as obvious (after checking it out to make sure it fixes the problem and builds): 2013-06-13 Michael Meissner * config/rs6000/rs6000.c (rs6000_option_override_internal): Move test for clearing quad memory on 32-bit later. -- Michael Meissner, IBM IBM, M/S 2506R, 550 King Street, Littleton, MA 01460, USA email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797 Index: gcc/config/rs6000/rs6000.c === --- gcc/config/rs6000/rs6000.c (revision 200044) +++ gcc/config/rs6000/rs6000.c (working copy) @@ -2979,16 +2979,6 @@ rs6000_option_override_internal (bool gl } } - /* The quad memory instructions only works in 64-bit mode. In 32-bit mode, - silently turn off quad memory mode. */ - if (TARGET_QUAD_MEMORY && !TARGET_POWERPC64) -{ - if ((rs6000_isa_flags_explicit & OPTION_MASK_QUAD_MEMORY) != 0) - warning (0, N_("-mquad-memory requires 64-bit mode")); - - rs6000_isa_flags &= ~OPTION_MASK_QUAD_MEMORY; -} - if (TARGET_DEBUG_REG || TARGET_DEBUG_TARGET) rs6000_print_isa_options (stderr, 0, "before defaults", rs6000_isa_flags); @@ -3046,6 +3036,16 @@ rs6000_option_override_internal (bool gl rs6000_isa_flags &= ~OPTION_MASK_VSX_TIMODE; } + /* The quad memory instructions only works in 64-bit mode. In 32-bit mode, + silently turn off quad memory mode. */ + if (TARGET_QUAD_MEMORY && !TARGET_POWERPC64) +{ + if ((rs6000_isa_flags_explicit & OPTION_MASK_QUAD_MEMORY) != 0) + warning (0, N_("-mquad-memory requires 64-bit mode")); + + rs6000_isa_flags &= ~OPTION_MASK_QUAD_MEMORY; +} + if (TARGET_DEBUG_REG || TARGET_DEBUG_TARGET) rs6000_print_isa_options (stderr, 0, "after defaults", rs6000_isa_flags);
Re: GCC does not support *mmintrin.h with function specific opts
> If you want to flag errors for all possible wrongly used always_inline > attribute, should this change be done in can_inline_edge_p? Or keep > your current change, but also add a warning (something like 'always > inline function is ignored etc') in inline_always_inline_functions > when inline transformation can not meaningfully give a useful error. We have the CIF error codes to be able to give useful diagnostic at transformation time. I think it is better to have all the diagnostic output at one place unless we have really good reasons to fork it. We are not losing any precision here, right? Sure, other option would be to move all alwaysinline diagnostic into inline_always_inline_functions and remove the code path in tree-inline. The warnings however are quite meaningfully places in tree-inline, because we want to warn only after all the inlining algorithms has finished and inlining really did not happen. They can be moved out there if we walk the edges at end and output diagnostic, but I do not see anything really wrong with their current location. Honza
Re: GCC does not support *mmintrin.h with function specific opts
> * tree-inline.c (expand_call_inline): Allow the error to be flagged > in early inline pass. > * ipa-inline.c (inline_always_inline_functions): Pretend always_inline > functions are inlined during failures to flag an error. > * gcc.target/i386/inline_error.c: New test. This patch is OK if it passes testing. Thanks for your patience! Honza
Re: [Patch tree-ssa] RFC: Enable path threading for control variables (PR tree-optimization/54742).
On Fri, 2013-06-07 at 16:14 +0100, James Greenhalgh wrote: > Beyond that, the path search code is modified from Steve's patch > to only perform one depth first search, and the path copy code > is modified to purge the region entry and exit edges for those > which traditional jump-threading may consider. > > Opinions? > > Thanks, > James Greenhalgh Hi James, I tried out your patch and I am not getting the speed up in coremark that I got with my patch. I wonder if restricting it to one depth first search is the reason. With my patch I was able to completely get rid of the switch statement, but with this patch it still exists. Maybe we need an option to do multiple searches? Steve Ellcey sell...@mips.com
Re: GCC does not support *mmintrin.h with function specific opts
If you want to flag errors for all possible wrongly used always_inline attribute, should this change be done in can_inline_edge_p? Or keep your current change, but also add a warning (something like 'always inline function is ignored etc') in inline_always_inline_functions when inline transformation can not meaningfully give a useful error. David On Thu, Jun 13, 2013 at 11:33 AM, Sriraman Tallam wrote: > On Thu, Jun 13, 2013 at 10:19 AM, Jan Hubicka wrote: >>> On Thu, Jun 13, 2013 at 10:07 AM, Jan Hubicka wrote: >>> >> Can you create a helper function to flag the error and perhaps also >>> >> put that check inside can_inline_edge_p ? >>> >> >>> >> David >>> >> >>> >> >>> >> On Wed, Jun 12, 2013 at 6:00 PM, Sriraman Tallam >>> >> wrote: >>> >> > Hi Honza, >>> >> > >>> >> >I have isolated the ipa-inline.c part into a separate patch with a >>> >> > test and attached it here. The patch is simple. Could you please take >>> >> > a look? >>> >> > >>> >> > * ipa-inline.c (can_early_inline_edge_p): Flag an error when >>> >> > the function that cannot be inlined is target specific. >>> >> > * gcc.target/i386/inline_error.c: New test. >>> > >>> > Sorry for taking ages to look at the patch, I was too hooked into other >>> > problems. >>> > I also think can_early_inline_edge_p should not produce diagnostic - it >>> > is supposed >>> > to be predicate. >>> > >>> > So your problem is that the hard worker in tree-inline is not called at >>> > -O0 >>> > and thus errors are not output? I would suggest arranging >>> > inline_always_inline_functions >>> > to return true even if inlining failed and thus making inline_calls to be >>> > called. >>> >>> Thanks Honza! Yes, that is the problem. Should I just make it return >>> true for these special conditions (TARGET_MISMATCH + gnu_inline + >>> always_inline + ...)? >> >> Can't it just return true if there is any alwaysinline call? I think if >> inline fails, >> we always want to error, right? > > Ok, patch attached that does this. Please let me know what you think. > > Thanks > Sri > >> >> Honza
Re: More vector folding
On 06/13/13 12:06, Marc Glisse wrote: Hello, an incredibly suprising patch: I am adapting yet more fold-const transformations to vectors... (it varies, the last patch was in forwprop) I was quite conservative with respect to complex: I didn't want to create a BIT_NOT_EXPR of a complex. As an aside, while writing this patch, I noticed the following: double f(double x){ return -2*x; } double g(double x){ return 2*-x; } generates mulsd for f and xorpd+addsd for g, I should probably file a PR about that, the running time may be similar but it is better to canonicalize. 2013-06-14 Marc Glisse gcc/ * fold-const.c (negate_expr_p): Handle VECTOR_CST. (fold_negate_expr): Likewise. (fold_real_zero_addition_p): Handle vectors. (fold_binary_loc) : Likewise. gcc/testsuite/ * gcc.dg/fold-minus-1.c: New testcase. This is good. Please install. Thanks, Jeff
Re: GCC does not support *mmintrin.h with function specific opts
On Thu, Jun 13, 2013 at 10:19 AM, Jan Hubicka wrote: >> On Thu, Jun 13, 2013 at 10:07 AM, Jan Hubicka wrote: >> >> Can you create a helper function to flag the error and perhaps also >> >> put that check inside can_inline_edge_p ? >> >> >> >> David >> >> >> >> >> >> On Wed, Jun 12, 2013 at 6:00 PM, Sriraman Tallam >> >> wrote: >> >> > Hi Honza, >> >> > >> >> >I have isolated the ipa-inline.c part into a separate patch with a >> >> > test and attached it here. The patch is simple. Could you please take >> >> > a look? >> >> > >> >> > * ipa-inline.c (can_early_inline_edge_p): Flag an error when >> >> > the function that cannot be inlined is target specific. >> >> > * gcc.target/i386/inline_error.c: New test. >> > >> > Sorry for taking ages to look at the patch, I was too hooked into other >> > problems. >> > I also think can_early_inline_edge_p should not produce diagnostic - it is >> > supposed >> > to be predicate. >> > >> > So your problem is that the hard worker in tree-inline is not called at -O0 >> > and thus errors are not output? I would suggest arranging >> > inline_always_inline_functions >> > to return true even if inlining failed and thus making inline_calls to be >> > called. >> >> Thanks Honza! Yes, that is the problem. Should I just make it return >> true for these special conditions (TARGET_MISMATCH + gnu_inline + >> always_inline + ...)? > > Can't it just return true if there is any alwaysinline call? I think if > inline fails, > we always want to error, right? Ok, patch attached that does this. Please let me know what you think. Thanks Sri > > Honza * tree-inline.c (expand_call_inline): Allow the error to be flagged in early inline pass. * ipa-inline.c (inline_always_inline_functions): Pretend always_inline functions are inlined during failures to flag an error. * gcc.target/i386/inline_error.c: New test. Index: tree-inline.c === --- tree-inline.c (revision 200034) +++ tree-inline.c (working copy) @@ -3905,8 +3905,6 @@ expand_call_inline (basic_block bb, gimple stmt, c for inlining, but we can't do that because frontends overwrite the body. */ && !cg_edge->callee->local.redefined_extern_inline - /* Avoid warnings during early inline pass. */ - && cgraph_global_info_ready /* PR 20090218-1_0.c. Body can be provided by another module. */ && (reason != CIF_BODY_NOT_AVAILABLE || !flag_generate_lto)) { Index: ipa-inline.c === --- ipa-inline.c(revision 200034) +++ ipa-inline.c(working copy) @@ -1911,7 +1911,15 @@ inline_always_inline_functions (struct cgraph_node } if (!can_early_inline_edge_p (e)) - continue; + { + /* Set inlined to true if the callee is marked "always_inline" but +is not inlinable. This will allow flagging an error later in +expand_call_inline in tree-inline.c. */ + if (lookup_attribute ("always_inline", +DECL_ATTRIBUTES (callee->symbol.decl)) != NULL) + inlined = true; + continue; + } if (dump_file) fprintf (dump_file, " Inlining %s into %s (always_inline).\n", Index: testsuite/gcc.target/i386/inline_error.c === --- testsuite/gcc.target/i386/inline_error.c(revision 0) +++ testsuite/gcc.target/i386/inline_error.c(revision 0) @@ -0,0 +1,13 @@ +/* { dg-do compile } */ +/* { dg-options "-O0 -mno-popcnt" } */ + +inline int __attribute__ ((__gnu_inline__, __always_inline__, target("popcnt"))) +foo () /* { dg-error "inlining failed in call to always_inline .* target specific option mismatch" } */ +{ + return 0; +} + +int bar() +{ + return foo (); /* { dg-error "called from here" } */ +}
More vector folding
Hello, an incredibly suprising patch: I am adapting yet more fold-const transformations to vectors... (it varies, the last patch was in forwprop) I was quite conservative with respect to complex: I didn't want to create a BIT_NOT_EXPR of a complex. As an aside, while writing this patch, I noticed the following: double f(double x){ return -2*x; } double g(double x){ return 2*-x; } generates mulsd for f and xorpd+addsd for g, I should probably file a PR about that, the running time may be similar but it is better to canonicalize. 2013-06-14 Marc Glisse gcc/ * fold-const.c (negate_expr_p): Handle VECTOR_CST. (fold_negate_expr): Likewise. (fold_real_zero_addition_p): Handle vectors. (fold_binary_loc) : Likewise. gcc/testsuite/ * gcc.dg/fold-minus-1.c: New testcase. Bootstrap+testsuite on x86_64-unknown-linux-gnu. -- Marc GlisseIndex: testsuite/gcc.dg/fold-minus-1.c === --- testsuite/gcc.dg/fold-minus-1.c (revision 0) +++ testsuite/gcc.dg/fold-minus-1.c (revision 0) @@ -0,0 +1,20 @@ +/* { dg-do compile } */ +/* { dg-options "-O -fdump-tree-gimple" } */ + +typedef int vec __attribute__((vector_size(2*sizeof(int; + +void f(vec*x,vec*y){ + *x -= *x / *y * *y; +} +void g(vec*x,vec*y,vec*z){ + *x = -1 - *x; + *y = -*y - 1; + *z = -*z - 13; +} + +/* { dg-final { scan-tree-dump-times "%" 1 "gimple"} } */ +/* { dg-final { scan-tree-dump-times "~" 2 "gimple"} } */ +/* { dg-final { scan-tree-dump-not "/" "gimple"} } */ +/* { dg-final { scan-tree-dump-not "\\\+" "gimple"} } */ +/* { dg-final { scan-tree-dump "{ -13, -13 }" "gimple"} } */ +/* { dg-final { cleanup-tree-dump "gimple" } } */ Property changes on: testsuite/gcc.dg/fold-minus-1.c ___ Added: svn:keywords + Author Date Id Revision URL Added: svn:eol-style + native Index: fold-const.c === --- fold-const.c(revision 200071) +++ fold-const.c(working copy) @@ -414,20 +414,34 @@ negate_expr_p (tree t) case REAL_CST: /* We want to canonicalize to positive real constants. Pretend that only negative ones can be easily negated. */ return REAL_VALUE_NEGATIVE (TREE_REAL_CST (t)); case COMPLEX_CST: return negate_expr_p (TREE_REALPART (t)) && negate_expr_p (TREE_IMAGPART (t)); +case VECTOR_CST: + { + if (FLOAT_TYPE_P (TREE_TYPE (type)) || TYPE_OVERFLOW_WRAPS (type)) + return true; + + int count = TYPE_VECTOR_SUBPARTS (type), i; + + for (i = 0; i < count; i++) + if (!negate_expr_p (VECTOR_CST_ELT (t, i))) + return false; + + return true; + } + case COMPLEX_EXPR: return negate_expr_p (TREE_OPERAND (t, 0)) && negate_expr_p (TREE_OPERAND (t, 1)); case CONJ_EXPR: return negate_expr_p (TREE_OPERAND (t, 0)); case PLUS_EXPR: if (HONOR_SIGN_DEPENDENT_ROUNDING (TYPE_MODE (type)) || HONOR_SIGNED_ZEROS (TYPE_MODE (type))) @@ -553,20 +567,35 @@ fold_negate_expr (location_t loc, tree t tree ipart = negate_expr (TREE_IMAGPART (t)); if ((TREE_CODE (rpart) == REAL_CST && TREE_CODE (ipart) == REAL_CST) || (TREE_CODE (rpart) == INTEGER_CST && TREE_CODE (ipart) == INTEGER_CST)) return build_complex (type, rpart, ipart); } break; +case VECTOR_CST: + { + int count = TYPE_VECTOR_SUBPARTS (type), i; + tree *elts = XALLOCAVEC (tree, count); + + for (i = 0; i < count; i++) + { + elts[i] = fold_negate_expr (loc, VECTOR_CST_ELT (t, i)); + if (elts[i] == NULL_TREE) + return NULL_TREE; + } + + return build_vector (type, elts); + } + case COMPLEX_EXPR: if (negate_expr_p (t)) return fold_build2_loc (loc, COMPLEX_EXPR, type, fold_negate_expr (loc, TREE_OPERAND (t, 0)), fold_negate_expr (loc, TREE_OPERAND (t, 1))); break; case CONJ_EXPR: if (negate_expr_p (t)) return fold_build1_loc (loc, CONJ_EXPR, type, @@ -6161,23 +6190,26 @@ fold_real_zero_addition_p (const_tree ty return false; /* Don't allow the fold with -fsignaling-nans. */ if (HONOR_SNANS (TYPE_MODE (type))) return false; /* Allow the fold if zeros aren't signed, or their sign isn't important. */ if (!HONOR_SIGNED_ZEROS (TYPE_MODE (type))) return true; + /* In a vector or complex, we would need to check the sign of all zeros. */ + if (TREE_CODE (addend) != REAL_CST) +return false; + /* Treat x + -0 as x - 0 and x - -0 as x + 0. */ - if (TREE_CODE (addend) == REAL_CST - && REAL_VALUE_MINUS_ZERO (TREE_REAL_CST (addend))) + if (REAL_VALUE_MINUS_ZERO (TREE_RE
Re: [Patch, Fortran] PR57508 - Fix ICE/Reject-valid issue with get_temp_from_expr (intrinsic assignment with defined assignment)
Mikael Morin wrote: This fixes the problem, but shouldn't the fix be in gfc_expr_attr instead? I tried it - but it does not work: In many case, one actually needs a function, e.g. for procedure pointers or for C_FUNLOC. Thus, I had to add an additional flag to tell whether the function or the function result it needed. But instead of adding a Boolean flag to 55 calls, which can be false in 54 case and true in 1, I think that the original patch is better. It's the only case where not an attribute it checked - but where attributes are copied. Thus, is the original patch okay? Or do you have a better proposal?http://gcc.gnu.org/ml/fortran/2013-06/msg00027.html Tobias PS: Other pending patches: * Unreviewed: Print exception status at STOP, http://gcc.gnu.org/ml/fortran/2013-06/msg00077.html * PR57596 - Fix OPTIONAL handling of deferred-length strings, http://gcc.gnu.org/ml/fortran/2013-06/msg00082.html
Re: GCC does not support *mmintrin.h with function specific opts
> On Thu, Jun 13, 2013 at 10:07 AM, Jan Hubicka wrote: > >> Can you create a helper function to flag the error and perhaps also > >> put that check inside can_inline_edge_p ? > >> > >> David > >> > >> > >> On Wed, Jun 12, 2013 at 6:00 PM, Sriraman Tallam > >> wrote: > >> > Hi Honza, > >> > > >> >I have isolated the ipa-inline.c part into a separate patch with a > >> > test and attached it here. The patch is simple. Could you please take > >> > a look? > >> > > >> > * ipa-inline.c (can_early_inline_edge_p): Flag an error when > >> > the function that cannot be inlined is target specific. > >> > * gcc.target/i386/inline_error.c: New test. > > > > Sorry for taking ages to look at the patch, I was too hooked into other > > problems. > > I also think can_early_inline_edge_p should not produce diagnostic - it is > > supposed > > to be predicate. > > > > So your problem is that the hard worker in tree-inline is not called at -O0 > > and thus errors are not output? I would suggest arranging > > inline_always_inline_functions > > to return true even if inlining failed and thus making inline_calls to be > > called. > > Thanks Honza! Yes, that is the problem. Should I just make it return > true for these special conditions (TARGET_MISMATCH + gnu_inline + > always_inline + ...)? Can't it just return true if there is any alwaysinline call? I think if inline fails, we always want to error, right? Honza
Re: GCC does not support *mmintrin.h with function specific opts
On Thu, Jun 13, 2013 at 10:07 AM, Jan Hubicka wrote: >> Can you create a helper function to flag the error and perhaps also >> put that check inside can_inline_edge_p ? >> >> David >> >> >> On Wed, Jun 12, 2013 at 6:00 PM, Sriraman Tallam wrote: >> > Hi Honza, >> > >> >I have isolated the ipa-inline.c part into a separate patch with a >> > test and attached it here. The patch is simple. Could you please take >> > a look? >> > >> > * ipa-inline.c (can_early_inline_edge_p): Flag an error when >> > the function that cannot be inlined is target specific. >> > * gcc.target/i386/inline_error.c: New test. > > Sorry for taking ages to look at the patch, I was too hooked into other > problems. > I also think can_early_inline_edge_p should not produce diagnostic - it is > supposed > to be predicate. > > So your problem is that the hard worker in tree-inline is not called at -O0 > and thus errors are not output? I would suggest arranging > inline_always_inline_functions > to return true even if inlining failed and thus making inline_calls to be > called. Thanks Honza! Yes, that is the problem. Should I just make it return true for these special conditions (TARGET_MISMATCH + gnu_inline + always_inline + ...)? Thanks, Sri > > Honza
Re: GCC does not support *mmintrin.h with function specific opts
> Can you create a helper function to flag the error and perhaps also > put that check inside can_inline_edge_p ? > > David > > > On Wed, Jun 12, 2013 at 6:00 PM, Sriraman Tallam wrote: > > Hi Honza, > > > >I have isolated the ipa-inline.c part into a separate patch with a > > test and attached it here. The patch is simple. Could you please take > > a look? > > > > * ipa-inline.c (can_early_inline_edge_p): Flag an error when > > the function that cannot be inlined is target specific. > > * gcc.target/i386/inline_error.c: New test. Sorry for taking ages to look at the patch, I was too hooked into other problems. I also think can_early_inline_edge_p should not produce diagnostic - it is supposed to be predicate. So your problem is that the hard worker in tree-inline is not called at -O0 and thus errors are not output? I would suggest arranging inline_always_inline_functions to return true even if inlining failed and thus making inline_calls to be called. Honza
Re: [PATCH 4/4] Fix leading spaces.
On Thu, 13 Jun 2013, Richard Biener wrote: > Btw, rather than these kind of patches I'd appreciate if someone would look > at a simple pre(post?)-commit hook that enforces those whitespace rules. In the cpp testsuite we definitely want tests with bad whitespace (e.g. gcc.dg/cpp/backslash*.c) and generally I think it's wrong to be changing whitespace in the testsuite without an understanding of what the test is testing and how the whitespace is irrelevant to that (more generally, cleanups of compiler tests are suspect without such an understanding of what is or is not significant in a particular test). And so you need to allow addition of otherwise bad whitespace there. It's not obvious whether there might be other cases needing such whitespace as well. > Either by adjusting the committed content or by rejecting the commit(?) I don't think hooks adjusting committed content are likely to work at all. -- Joseph S. Myers jos...@codesourcery.com
Re: GCC does not support *mmintrin.h with function specific opts
Can you create a helper function to flag the error and perhaps also put that check inside can_inline_edge_p ? David On Wed, Jun 12, 2013 at 6:00 PM, Sriraman Tallam wrote: > Hi Honza, > >I have isolated the ipa-inline.c part into a separate patch with a > test and attached it here. The patch is simple. Could you please take > a look? > > * ipa-inline.c (can_early_inline_edge_p): Flag an error when > the function that cannot be inlined is target specific. > * gcc.target/i386/inline_error.c: New test. > > > > Thanks > Sri > > On Mon, Jun 10, 2013 at 4:10 PM, Sriraman Tallam wrote: >> Ping. >> >> On Tue, Jun 4, 2013 at 2:41 PM, Sriraman Tallam wrote: >>> Ping. >>> >>> On Thu, May 23, 2013 at 2:41 PM, Sriraman Tallam >>> wrote: Ping, for review of ipa-inline.c change. Sri On Mon, May 20, 2013 at 11:04 AM, Sriraman Tallam wrote: > On Fri, May 17, 2013 at 11:21 PM, Jakub Jelinek wrote: >> On Fri, May 17, 2013 at 09:00:21PM -0700, Sriraman Tallam wrote: >>> --- ipa-inline.c (revision 198950) >>> +++ ipa-inline.c (working copy) >>> @@ -374,7 +374,33 @@ can_early_inline_edge_p (struct cgraph_edge *e) >>>return false; >>> } >>>if (!can_inline_edge_p (e, true)) >>> -return false; >>> +{ >>> + enum availability avail; >>> + struct cgraph_node *callee >>> += cgraph_function_or_thunk_node (e->callee, &avail); >>> + /* Flag an error when the inlining cannot happen because of >>> target option >>> + mismatch but the callee is marked as "always_inline". In -O0 >>> mode >>> + this will go undetected because the error flagged in >>> + "expand_call_inline" in tree-inline.c might not execute and the >>> + inlining will not happen. Then, the linker could complain about >>> a >>> + missing body for the callee if it turned out that the callee was >>> + also marked "gnu_inline" with extern inline keyword as bodies of >>> such >>> + functions are not generated. */ >>> + if ((!optimize >>> +|| flag_no_inline) >> >> This should be if ((!optimize || flag_no_inline) on one line. >> >> I'd prefer also the testcase for the ICEs, something like: >> >> /* Test case to check if AVX intrinsics and function specific target >>optimizations work together. Check by including x86intrin.h */ >> >> /* { dg-do compile } */ >> /* { dg-options "-O2 -mno-sse -mno-avx" } */ >> >> #include >> >> __m256 a, b, c; >> void __attribute__((target ("avx"))) >> foo (void) >> { >> a = _mm256_and_ps (b, c); >> } >> >> and another testcase that does: >> >> /* { dg-do compile } */ >> #pragma GCC target ("mavx") /* { dg-error "whatever" } */ >> >> Otherwise it looks good to me, but I'd prefer the i?86 maintainers to >> review >> it too (and Honza for ipa-inline.c?). > > Honza, could you please take a look at the ipa-inline.c fix? I will > split the patches and submit after Honza's review. I will also make > the changes mentioned. > > Thanks > Sri > > >> >> Jakub
Re: [PATCH] Cilk Plus Array Notation for C++
On 06/13/2013 09:11 AM, Aldy Hernandez wrote: > The whole slew of these cases have a lot of duplicated code. For instance, > BUILT_IN_CILKPLUS_SEC_REDUCE_MIN is the same as > BUILT_IN_CILKPLUS_SEC_REDUCE_MAX, the only difference being GT_EXPR vs > LT_EXPR. Surely you could do something like: > > if (an_type == BUILT_IN_CILKPLUS_SEC_REDUCE_MIN > || an_type == BUILT_IN_CILKPLUS_SEC_REDUCE_MAX) { >enum tree_code code; >if (an_type == BUILT_IN_CILKPLUS_SEC_REDUCE_MIN) > code = GT_EXPR; >else > code = LT_EXPR; >// stuff > } > > The compiler should be able to optimize the above, but even if it couldn't, I > am willing to compare twice and save lines and lines of code. > > Similarly for SEC_REDUCE_ANY_ZERO/SEC_REDUCE_ANY_NONZERO, > SEC_REDUCE_ALL_ZERO/SEC_REDUCE_ALL_NONZERO, SEC_REDUCE_ADD/SEC_REDUCE_MUL, etc > etc etc. Yep. It's at this point that I normally start using the idiom switch (an_type) { case BUILT_IN_CILKPLUS_SEC_REDUCE_MIN: code = GT_EXPR; goto do_min_max; case BUILT_IN_CILKPLUS_SEC_REDUCE_MAX: code = LT_EXPR; goto do_min_max; do_min_max: // stuff So much the better if you can include the other SEC_* cases in that switch too, doing one compiler-controlled dispatch. It also occurs to me to wonder why you're building your own COND_EXPR here, with the comparison, rather than using MIN/MAX_EXPR... r~
Re: [PATCH] Cilk Plus Array Notation for C++
It looks like a NULL in INIT_INDEX is a specially handled case. Perhaps you should document that INIT_INDEX can be null and what it means. Also, you don't need to document what internal variable name you are using as a return value (VALUE_TREE). Perhaps instead of "The return value..." you could write "This function returns the ARRAY_NOTATION_REF node." or something like it. It is documented inside the function, right before checking for !init_index. Is that enough? Please put it in the function comment. As it stands, users would have to look through the body to find that init_index==NULL is a special case. Changes to existing tests should be submitted as a separate patch, since this doesn't seem to be C++ specific. And BTW, this particular test change can be committed as obvious. OK will do. What about adding {target c} and {target c++} for test cases. Can that be submitted with the patch? Yes. + else if (an_type == BUILT_IN_CILKPLUS_SEC_REDUCE_MAX) +{ + /* If the TYPE_MIN_VALUE is available for the new_var_type, then +set that as the initial value. */ + if (TYPE_MIN_VALUE (new_var_type)) + new_var_init = build_x_modify_expr (location, *new_var, NOP_EXPR, + TYPE_MIN_VALUE (new_var_type), 1); + else + /* ... otherwise set initial value as the first element of array. */ + new_var_init = build_x_modify_expr (location, *new_var, NOP_EXPR, + func_parm, 1); + new_no_expr = build_x_modify_expr (location, *new_var, NOP_EXPR, + *new_var, 1); + new_yes_expr = build_x_modify_expr (location, *new_var, NOP_EXPR, + func_parm, 1); + new_cond_expr = build_x_binary_op (location, LT_EXPR, *new_var, +TREE_CODE (*new_var), func_parm, +TREE_CODE (func_parm), NULL, +tf_warning_or_error); + new_expr = build_x_conditional_expr (location, new_cond_expr, + new_yes_expr, new_no_expr, + tf_warning_or_error); +} + else if (an_type == BUILT_IN_CILKPLUS_SEC_REDUCE_MIN) +{ + /* If the TYPE_MAX_VALUE is available for the new_var_type, then +set that as the initial value. */ + if (TYPE_MAX_VALUE (new_var_type)) + new_var_init = build_x_modify_expr (location, *new_var, NOP_EXPR, + TYPE_MAX_VALUE (new_var_type), 1); + else + /* ... otherwise set initial value as the first element of array. */ + new_var_init = build_x_modify_expr (location, *new_var, NOP_EXPR, + func_parm, 1); + new_no_expr = build_x_modify_expr (location, *new_var, NOP_EXPR, + *new_var, 1); + new_yes_expr = build_x_modify_expr (location, *new_var, NOP_EXPR, + func_parm, 1); + new_cond_expr = build_x_binary_op (location, GT_EXPR, *new_var, +TREE_CODE (*new_var), func_parm, +TREE_CODE (func_parm), NULL, +tf_warning_or_error); + new_expr = build_x_conditional_expr (location, new_cond_expr, + new_yes_expr, new_no_expr, + tf_warning_or_error); +} The whole slew of these cases have a lot of duplicated code. For instance, BUILT_IN_CILKPLUS_SEC_REDUCE_MIN is the same as BUILT_IN_CILKPLUS_SEC_REDUCE_MAX, the only difference being GT_EXPR vs LT_EXPR. Surely you could do something like: if (an_type == BUILT_IN_CILKPLUS_SEC_REDUCE_MIN || an_type == BUILT_IN_CILKPLUS_SEC_REDUCE_MAX) { enum tree_code code; if (an_type == BUILT_IN_CILKPLUS_SEC_REDUCE_MIN) code = GT_EXPR; else code = LT_EXPR; // stuff } The compiler should be able to optimize the above, but even if it couldn't, I am willing to compare twice and save lines and lines of code. Similarly for SEC_REDUCE_ANY_ZERO/SEC_REDUCE_ANY_NONZERO, SEC_REDUCE_ALL_ZERO/SEC_REDUCE_ALL_NONZERO, SEC_REDUCE_ADD/SEC_REDUCE_MUL, etc etc etc. + if (location == UNKNOWN_LOCATION) +{ + if (EXPR_LOCATION (lhs) != UNKNOWN_LOCATION) + location = EXPR_LOCATION (lhs); + else if (EXPR_LOCATION (rhs) != UNKNOWN_LOCATION) + location = EXPR_LOCATION (rhs); +} + + + /* We need this when we have a scatter issue. */ Extra whitespace. + if (lhs_rank != 0 && rhs_rank != 0 && lhs_rank != rhs_rank) +{ + tree lhs_base = lhs; + tree rhs_base = rhs; + + for (ii = 0; ii < lhs_rank; ii++) + lhs_base = ARRAY_NOTATION_ARRAY (lhs_base); + + while (rhs_base && TREE_CODE (rhs_base)
Re: [GOOGLE] More strict checking for call args
On Thu, Jun 13, 2013 at 1:43 AM, Richard Biener wrote: > On Sat, Jun 8, 2013 at 2:26 AM, Xinliang David Li wrote: >> On Fri, Jun 7, 2013 at 6:47 AM, Richard Biener >> wrote: >>> On Fri, Jun 7, 2013 at 3:30 PM, Xinliang David Li >>> wrote: On Fri, Jun 7, 2013 at 2:05 AM, Richard Biener wrote: > On Thu, Jun 6, 2013 at 5:10 PM, Dehao Chen wrote: >> Hi, Martin, >> >> Yes, your patch can fix my case. Thanks a lot for the fix. >> >> With the fix, value profiling will still promote the wrong indirect >> call target. Though it will not be inlining, but it results in an >> additional check. How about in check_ic_target, after calling >> gimple_check_call_matching_types, we also check if number of args >> match number of params in target->symbol.decl? > > I wonder what's the point in the gimple_check_call_matching_types check > in the profiling case. It's at least no longer > > /* Perform sanity check on the indirect call target. Due to race > conditions, >false function target may be attributed to an indirect call site. If > the >call expression type mismatches with the target function's type, > expand_call >may ICE. > > because since the introduction of gimple_call_fntype we will _not_ ICE. > > Thus I argue that check_ic_target should be even removed instead of > enhancing it! > Another reason is what Dehao had mentioned -- wrong target leads to useless transformation. >>> >>> Sure, but a not wrong in the sense of the predicate does not guarantee >>> a useful transformation either. >> >> The case in reality is very rare -- most of the cases, the >> transformation is good. >> >>> > How does IC profiling determine the called target? That is, what does it > do when the target is not always the same? (because the checking code > talks about race conditions for example) The race condition is the happening at instrumentation time -- the indirect call counters are not thread local. We have seen this a lot in the past that a totally bogus target is attributed to a indirect callsite. >>> >>> So it simply uses whatever function was called last? Instead of >>> using the function that was called most of the time? >> >> It uses the most frequent target -- but the target id recorded for the >> most frequent target might be corrupted and got mapped to a false >> target during profile-use. > > But that's not due to "race conditions" but rather AutoFDO which isn't even > in trunk, right? not yet. Dehao is working on it. David > > Anyway, the discussion is probably moot - the patch is ok with me > and my argument would be we should use the function in less places. > > Thanks, > Richard. > >> David >> >>> >>> Richard. >>> thanks, David > > Richard. > > >> Thanks, >> Dehao >> >> >> On Thu, Jun 6, 2013 at 7:11 AM, Martin Jambor wrote: >>> >>> Hi, >>> >>> On Tue, Jun 04, 2013 at 05:19:02PM -0700, Dehao Chen wrote: >>> > attached is a testcase that would cause problem when source has >>> > changed: >>> > >>> > $ g++ test.cc -O2 -fprofile-generate -DOLD >>> > $ ./a.out >>> > $ g++ test.cc -O2 -fprofile-use >>> > test.cc:34:1: internal compiler error: in operator[], at vec.h:815 >>> > } >>> > ^ >>> > 0x512740 vec::operator[](unsigned int) >>> > ../../gcc/vec.h:815 >>> > 0x512740 vec::operator[](unsigned int) >>> > ../../gcc/vec.h:1244 >>> > 0xf24464 vec::operator[](unsigned int) >>> > ../../gcc/vec.h:815 >>> > 0xf24464 vec::operator[](unsigned int) >>> > ../../gcc/vec.h:1244 >>> > 0xf24464 ipa_get_indirect_edge_target_1 >>> > ../../gcc/ipa-cp.c:1535 >>> > 0x971b9a estimate_edge_devirt_benefit >>> > ../../gcc/ipa-inline-analysis.c:2757 >>> >>> Hm, this seems rather like an omission in >>> ipa_get_indirect_edge_target_1. >>> Since it is called also from inlining, we can have parameter count >>> mismatches... and in fact in non-virtual paths of that function we do >>> check that we don't. Because all callers have to pass known_vals >>> describing all formal parameters of the inline tree root, we should >>> apply the fix below (I've only just started running a bootstrap and >>> testsuite on x86_64, though). >>> >>> OTOH, while I understand that FDO can change inlining sufficiently so >>> that this error occurs, IMHO this should not be caused by outdated >>> profiles but there is somewhere a parameter mismatch in the source. >>> >>> Dehao, can you please check that this patch helps? >>> >>> Richi, if it does and the patch passes bootstrap and tests, is it OK >>> for trunk and 4.8 branch? >>> >>> Thanks and sorry for the trouble, >>> >>> Martin >>> >>> >>> 2013-06-06 Martin Jambor >>>
Re: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)
On Fri, Jun 14, 2013 at 01:07:01AM +0930, Alan Modra wrote: > @@ -5774,10 +5818,11 @@ offsettable_ok_by_alignment (rtx op, HOST_WIDE_INT >type = TREE_TYPE (decl); > >dalign = TYPE_ALIGN (type); > + dalign = DATA_ABI_ALIGNMENT (type, dalign); >if (CONSTANT_CLASS_P (decl)) > dalign = CONSTANT_ALIGNMENT (decl, dalign); >else > - dalign = DATA_ALIGNMENT (decl, dalign); > + dalign = DATA_ALIGNMENT (type, dalign); > >if (dsize == 0) > { What is this code trying to do? Shouldn't it just use DECL_ALIGN which should be set to the right value from get_variable_alignment? I mean, if !decl_binds_to_current_def_p (decl), then using DATA_ALIGNMENT or CONSTANT_ALIGNMENT (for anything but actually emitting the var into object, or just as an optimization hint that very likely the decl will be aligned enough, but not guaranteed), which are optimization, is wrong (an ABI problem). Jakub
Re: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)
On Thu, Jun 13, 2013 at 05:10:51PM +0930, Alan Modra wrote: > On Wed, Jun 12, 2013 at 12:52:03PM -0500, Edmar Wienskoski wrote: > > The e500v2 (SPE) hardware is such that if the address of vector (double > > world > > load / stores) are not double world aligned the instruction will trap. > > > > So this alignment is not optional. > > Vector type alignment is also specified by the ppc64 abi. I think we > want the following. Note that DATA_ALIGNMENT has been broken for > vectors right from the initial vector support (and the error was > copied for e500 double). For example > > typedef int vec_align __attribute__ ((vector_size(16), aligned(32))); > vec_align x = { 0, 0, 0, 0 }; > > currently loses the extra alignment. Fixed by never decreasing > alignment in DATA_ABI_ALIGNMENT. Testing in progress. OK to > apply assuming bootstrap is good? (I think I need a change in > offsettable_ok_by_alignment too. I'll do that in a separate patch.) Revised patch with offsettable_ok_by_alignment change, avoiding dumb idea of using statement expressions. This one actually bootstraps and passes regression testing. * config/rs6000/rs6000.h (enum data_align): New. (LOCAL_ALIGNMENT, DATA_ALIGNMENT): Use rs6000_data_alignment. (DATA_ABI_ALIGNMENT): Define. (CONSTANT_ALIGNMENT): Correct comment. * config/rs6000/rs6000-protos.h (rs6000_data_alignment): Declare. * config/rs6000/rs6000.c (rs6000_data_alignment): New function. (offsettable_ok_by_alignment): Align by DATA_ABI_ALIGNMENT. Pass "type" not "decl" to DATA_ALIGNMENT. Index: gcc/config/rs6000/rs6000.h === --- gcc/config/rs6000/rs6000.h (revision 200055) +++ gcc/config/rs6000/rs6000.h (working copy) @@ -813,12 +813,6 @@ extern unsigned rs6000_pointer_size; /* No data type wants to be aligned rounder than this. */ #define BIGGEST_ALIGNMENT 128 -/* A C expression to compute the alignment for a variables in the - local store. TYPE is the data type, and ALIGN is the alignment - that the object would ordinarily have. */ -#define LOCAL_ALIGNMENT(TYPE, ALIGN) \ - DATA_ALIGNMENT (TYPE, ALIGN) - /* Alignment of field after `int : 0' in a structure. */ #define EMPTY_FIELD_BOUNDARY 32 @@ -828,8 +822,15 @@ extern unsigned rs6000_pointer_size; /* A bit-field declared as `int' forces `int' alignment for the struct. */ #define PCC_BITFIELD_TYPE_MATTERS 1 -/* Make strings word-aligned so strcpy from constants will be faster. - Make vector constants quadword aligned. */ +enum data_align { align_abi, align_opt, align_both }; + +/* A C expression to compute the alignment for a variables in the + local store. TYPE is the data type, and ALIGN is the alignment + that the object would ordinarily have. */ +#define LOCAL_ALIGNMENT(TYPE, ALIGN) \ + rs6000_data_alignment (TYPE, ALIGN, align_both) + +/* Make strings word-aligned so strcpy from constants will be faster. */ #define CONSTANT_ALIGNMENT(EXP, ALIGN) \ (TREE_CODE (EXP) == STRING_CST\ && (STRICT_ALIGNMENT || !optimize_size) \ @@ -837,21 +838,14 @@ extern unsigned rs6000_pointer_size; ? BITS_PER_WORD \ : (ALIGN)) -/* Make arrays of chars word-aligned for the same reasons. - Align vectors to 128 bits. Align SPE vectors and E500 v2 doubles to +/* Make arrays of chars word-aligned for the same reasons. */ +#define DATA_ALIGNMENT(TYPE, ALIGN) \ + rs6000_data_alignment (TYPE, ALIGN, align_opt) + +/* Align vectors to 128 bits. Align SPE vectors and E500 v2 doubles to 64 bits. */ -#define DATA_ALIGNMENT(TYPE, ALIGN)\ - (TREE_CODE (TYPE) == VECTOR_TYPE \ - ? (((TARGET_SPE && SPE_VECTOR_MODE (TYPE_MODE (TYPE))) \ - || (TARGET_PAIRED_FLOAT && PAIRED_VECTOR_MODE (TYPE_MODE (TYPE \ - ? 64 : 128) \ - : ((TARGET_E500_DOUBLE \ - && TREE_CODE (TYPE) == REAL_TYPE \ - && TYPE_MODE (TYPE) == DFmode) \ - ? 64 \ - : (TREE_CODE (TYPE) == ARRAY_TYPE \ -&& TYPE_MODE (TREE_TYPE (TYPE)) == QImode \ -&& (ALIGN) < BITS_PER_WORD) ? BITS_PER_WORD : (ALIGN))) +#define DATA_ABI_ALIGNMENT(TYPE, ALIGN) \ + rs6000_data_alignment (TYPE, ALIGN, align_abi) /* Nonzero if move instructions will actually fail to work when given unaligned data. */ Index: gcc/config/rs6000/rs6000-protos.h === --- gcc/config/rs
Re: force_const_mem VOIDmode
On Thu, Jun 13, 2013 at 04:29:44PM +0200, Andreas Krebbel wrote: > On 07/06/13 04:00, Alan Modra wrote: > > force_const_mem() isn't supposed to handle VOIDmode or BLKmode, so the > > check for VOIDmode when aligning is needless. If we ever did get one > > of these modes in a constant pool, this > > > > pool->offset += GET_MODE_SIZE (mode); > > Hi Alan, > > the assertion is triggered on S/390 during bootstrap. Reverting the patch seems the simplest thing to do, so I've done that. Apologies for the breakage. -- Alan Modra Australia Development Lab, IBM
Re: [Bug libstdc++/56430] In __airy: return-statement with a value, in function returning 'void'.
Hi, for now only a few words about your first two points (goes without saying that further contributions from you about the special functions TR or anything C++11 and C++14 are more than welcome!) >On 06/13/2013 04:30 AM, Paolo Carlini wrote: >> On 06/13/2013 02:38 AM, Paolo Carlini wrote: >>> If we really have to add a testcase - I'm not sure - please double >>> check that it passes testing with -Wall, no unused vars. >> Patch as went in had still the testcase wrong, triggering at least 3 >> warnings with -Wall. All in all, I decided to also remove the >> additional functions: it doesn't make sense to add *now* functions to > >> tr1, which otherwise is deeply in regression fixes only mode. And >> certainly not under a completely unrelated PR. >> >> Paolo. >> >1. Fine. I get fixing just the PR and not conflating things. > >2. How do you test with Wall? Do you just test just the library with >Wall or the whole build? >I've tried several 'make check-libstdc++ RUNTESTFLAGS="-Wall"', etc. >and >no dice. There are many ways of course. Some more elegant than others. Normally I just do, in the library build directory: make check CXXFLAGS="-02 -g -Wall -Wno-unused-local-typedefs". It should be clean. I strive to keep it like that. Actually it used to be clean without the latter -Wno-* too , patches welcome, as usual ;) More generally if you stay in the build directory everything should work rather predictably and naturally (in terms of *nix common sense): make clean works; make + CXXFLAGS works for building .so and .a, etc. Paolo
Re: powerpc64le abi check
In order to check powerpc64le abi symbols. Bootstrapped etc. powerpc64-linux. OK to apply? * configure.host (abi_baseline_pair): Match powerpc64*. I think that I can approve this. - David
Re: force_const_mem VOIDmode
On 07/06/13 04:00, Alan Modra wrote: > force_const_mem() isn't supposed to handle VOIDmode or BLKmode, so the > check for VOIDmode when aligning is needless. If we ever did get one > of these modes in a constant pool, this > > pool->offset += GET_MODE_SIZE (mode); Hi Alan, the assertion is triggered on S/390 during bootstrap. The S/390 movmem_short expander emits the following insn: (insn 793 3147 795 93 (parallel [ (set (mem:BLK (reg/f:SI 509 [ gi_filename ]) [0 A8]) (mem:BLK (reg/f:SI 510 [ gcov_prefix ]) [0 A8])) (use (reg:SI 511 [ prefix_length ])) (use (const:BLK (unspec:BLK [ (const_int 0 [0]) ] UNSPEC_INSN))) (clobber (reg:SI 1288)) ]) For machines without larl (load address relative long) the RTX: (const:BLK (unspec:BLK [ (const_int 0 [0]) ] UNSPEC_INSN)) is supposed to be pushed into the literal pool. The expression represents the target of the execute instruction. It doesn't matter that the literal pool offset calculation doesn't work with this RTX the backend does this in machine dependent reorg anyway on its own. Picking a different mode wouldn't be correct for our purposes since the size depends on the pattern containing the expression. So from a S/390 perspective I think the !BLKmode assertion should be removed. Bye, -Andreas- > > won't add to the pool size, and output_constant_pool_2() will hit a > gcc_unreachable(). Bootstrapped etc. powerpc64-linux. > > * varasm.c (force_const_mem): Assert mode is not VOID or BLK. > > Index: gcc/varasm.c > === > --- gcc/varasm.c (revision 199718) > +++ gcc/varasm.c (working copy) > @@ -3567,7 +3575,8 @@ >*slot = desc; > >/* Align the location counter as required by EXP's data type. */ > - align = GET_MODE_ALIGNMENT (mode == VOIDmode ? word_mode : mode); > + gcc_checking_assert (mode != VOIDmode && mode != BLKmode); > + align = GET_MODE_ALIGNMENT (mode); > #ifdef CONSTANT_ALIGNMENT >{ > tree type = lang_hooks.types.type_for_mode (mode, 0); >
Re: [Bug libstdc++/56430] In __airy: return-statement with a value, in function returning 'void'.
On 06/13/2013 04:30 AM, Paolo Carlini wrote: On 06/13/2013 02:38 AM, Paolo Carlini wrote: If we really have to add a testcase - I'm not sure - please double check that it passes testing with -Wall, no unused vars. Patch as went in had still the testcase wrong, triggering at least 3 warnings with -Wall. All in all, I decided to also remove the additional functions: it doesn't make sense to add *now* functions to tr1, which otherwise is deeply in regression fixes only mode. And certainly not under a completely unrelated PR. Paolo. 1. Fine. I get fixing just the PR and not conflating things. 2. How do you test with Wall? Do you just test just the library with Wall or the whole build? I've tried several 'make check-libstdc++ RUNTESTFLAGS="-Wall"', etc. and no dice. Obviously I should have hit the test with wall. Sorry. 3. I would like to implement TR29123 which adds the TR1 math functions to std. I obviously would like to make cleanups to the algorithms there. We could either copy the math libs to bits and maintain separate files for the same functions or we could keep the math headers in one place. If we don't add anything to the math it seems like splitting the implementation would be a waste. There would be a lot of overlap. OTOH, cleaning up TR29123 in, say, bits and totally leaving tr1 alone would encourage people to switch up (especially with guidance to that effect). 4. I would like some way to add experimental support for new math functions that appear in std proposal papers (airy_ai, etc.). I suppose tr2 or ext. Any ideas which you'd prefer?
Re: [C++ Patch] PR 57599
Hi, Jason Merrill ha scritto: >On 06/12/2013 08:49 PM, Paolo Carlini wrote: >> +/* Add any qualifier conversions. */ >> +return build_nop (type, expr); > >For a cast to reference type, this will produce an rvalue rather than >an lvalue. Ah, thanks a lot for the clarification. Admittedly, I still find the whole area of nops rather obscure. >Perhaps we should hand off to build_static_cast in the case described >by >the comment, rather than duplicate the logic. Thanks for hint, I'll try to come up with another try over the next day or so. Thanks again, Paolo
Re: [C++ Patch] PR 57599
On 06/12/2013 08:49 PM, Paolo Carlini wrote: + /* Add any qualifier conversions. */ + return build_nop (type, expr); For a cast to reference type, this will produce an rvalue rather than an lvalue. Perhaps we should hand off to build_static_cast in the case described by the comment, rather than duplicate the logic. Jason
Re: [patch] [python libstdc++ printers] Fix gdb/15195
> "Phil" == Phil Muldoon writes: Phil> Attached is an updated patch correcting the issues that you pointed Phil> out. The patch itself looks fine to me, but I don't think I can approve it. Tom
Fix LTO support for compound literals
Hi, compount literals are constructed COMDAT but not PUBLIC by C frontned. A while ago I discussed with with Jason? (I believe) and the reason for that is that they allow sharing even if it is not done by linker. This however breaks our partitioning code that assumes that COMDAT is always exported. This patch sives it and fixes one of two problems seen while linking Linux kernel. Bootstrapped/regtested x86_64-linux, comitted. Honza * ipa.c (cgraph_externally_visible_p, varpool_externally_visible_p): Local comdats are not externally visible. * symtab.c (dump_symtab_base): Dump externally visible. (verify_symtab_base): Verify back links in the symtab hash. Index: ipa.c === --- ipa.c (revision 200016) +++ ipa.c (working copy) @@ -606,9 +606,8 @@ cgraph_externally_visible_p (struct cgra { if (!node->symbol.definition) return false; - if (!DECL_COMDAT (node->symbol.decl) - && (!TREE_PUBLIC (node->symbol.decl) - || DECL_EXTERNAL (node->symbol.decl))) + if (!TREE_PUBLIC (node->symbol.decl) + || DECL_EXTERNAL (node->symbol.decl)) return false; /* Do not try to localize built-in functions yet. One of problems is that we @@ -667,7 +666,7 @@ varpool_externally_visible_p (struct var if (DECL_EXTERNAL (vnode->symbol.decl)) return true; - if (!DECL_COMDAT (vnode->symbol.decl) && !TREE_PUBLIC (vnode->symbol.decl)) + if (!TREE_PUBLIC (vnode->symbol.decl)) return false; /* If linker counts on us, we must preserve the function. */ Index: symtab.c === --- symtab.c(revision 200018) +++ symtab.c(working copy) @@ -508,6 +508,8 @@ dump_symtab_base (FILE *f, symtab_node n fprintf (f, " force_output"); if (node->symbol.forced_by_abi) fprintf (f, " forced_by_abi"); + if (node->symbol.externally_visible) +fprintf (f, " externally_visible"); if (node->symbol.resolution != LDPR_UNKNOWN) fprintf (f, " %s", ld_plugin_symbol_resolution_names[(int)node->symbol.resolution]); @@ -655,6 +657,15 @@ verify_symtab_base (symtab_node node) error ("node not found in symtab decl hashtable"); error_found = true; } + if (hashed_node != node + && (!is_a (node) + || !dyn_cast (node)->clone_of + || dyn_cast (node)->clone_of->symbol.decl +!= node->symbol.decl)) + { + error ("node differs from symtab decl hashtable"); + error_found = true; + } } if (assembler_name_hash) {
Re: [PATCH][RFC] Re-write LTO type merging again, do tree merging
On Wed, 12 Jun 2013, Richard Biener wrote: > > The following patch re-writes LTO type merging completely with the > goal to move as much work as possible to the compile step, away > from WPA time. At the same time the merging itself gets very > conservative but also more general - it now merges arbitrary trees, > not only types, but only if they are bit-identical and have the > same outgoing tree pointers. > > Especially the latter means that we now have to merge SCCs of trees > together and either take the whole SCC as prevailing or throw it > away. Moving work to the compile step means that we compute > SCCs and their hashes there, re-organizing streaming to stream > tree bodies as SCC blocks together with the computed hash. > > When we ask the streamer to output a tree T then it now has > to DFS walk all tree pointers, collecting SCCs of not yet > streamed trees and output them like the following: > > { LTO_tree_scc, N, hash, entry_len, >{ header1, header2, ... headerN }, >{ bits1, refs1, bits2, refs2, ... bitsN, refsN } } > { LTO_tree_scc, 1, hash, header, bits, refs } > { LTO_tree_scc, M, hash, entry_len, >{ header1, header2, ... headerM }, >{ bits1, refs1, bits2, refs2, ... bitsM, refsM } } > LTO_tree_pickle_reference to T > > with tree references in refsN always being LTO_tree_pickle_references > instead of starting a new tree inline. That results in at most > N extra LTO_tree_pickle_references for N streamed trees, together > with the LTO_tree_scc wrapping overhead this causes a slight > increase in LTO object size (around 10% last time I measured, which > was before some additional optimization went in). > > The overhead also happens on the LTRANS file producing side > which now has to do the DFS walk and stream the extra data. > It doesn't do the hashing though as on the LTRANS consumer > side no merging is performed. > > The patch preserves the core of the old merging code to compare > with the new code and output some statistics. That means that > if you build with -flto-report[-wpa] you get an additional > compile-time and memory overhead. > > For reference here are the stats when LTO bootstrapping for > stage2 cc1: > > WPA statistics > [WPA] read 2494507 SCCs of average size 2.380067 > [WPA] 5937095 tree bodies read in total > [WPA] tree SCC table: size 524287, 286280 elements, collision ratio: > 0.806376 > [WPA] tree SCC max chain length 11 (size 1) > [WPA] Compared 403361 SCCs, 6226 collisions (0.015435) > [WPA] Merged 399980 SCCs > [WPA] Merged 2438250 tree bodies > [WPA] Merged 192475 types > [WPA] 195422 types prevailed > [WPA] Old merging code merges an additional 54582 types of which 21083 are > in the same SCC with their prevailing variant > > this says that we've streamed in 5937095 tree bodies in > 2494507 SCCs (so the average SCC size is small), of those > we were able to immediately ggc_free 399980 SCCs because they > already existed in identical form (16% of the SCCs, 41% of the trees > and 49% of the types). The old merging code forced the merge > of an additional 54582 types (but 21083 of them it merged with > a type that is in the same SCC, that is, it changed the shape > of the SCC and collapsed parts of it - something that is > suspicious). > > The patch was LTO bootstrapped (testing currently running) on > x86_64-unknown-linux-gnu and I've built SPEC2k6 with -Ofast -g -flto > and did a test run of the binaries which shows that > currently 471.omnetpp, 483.xalancbmk and 447.dealII fail > (471.omnetpp segfaults in __cxxabiv1::__dynamic_cast) - these > fails were introduced quite recently likely due to the improved > FUNCTION_DECL and VAR_DECL merging and the cgraph fixup Honza did. The following incremental patch fixes that. Index: trunk/gcc/lto-symtab.c === --- trunk.orig/gcc/lto-symtab.c 2013-06-12 16:47:38.0 +0200 +++ trunk/gcc/lto-symtab.c 2013-06-12 17:00:12.664126423 +0200 @@ -96,9 +96,6 @@ lto_varpool_replace_node (struct varpool ipa_clone_referring ((symtab_node)prevailing_node, &vnode->symbol.ref_list); - /* Be sure we can garbage collect the initializer. */ - if (DECL_INITIAL (vnode->symbol.decl)) -DECL_INITIAL (vnode->symbol.decl) = error_mark_node; /* Finally remove the replaced node. */ varpool_remove_node (vnode); } Index: trunk/gcc/varpool.c === --- trunk.orig/gcc/varpool.c2013-06-12 13:13:06.0 +0200 +++ trunk/gcc/varpool.c 2013-06-12 17:01:46.088248807 +0200 @@ -77,15 +77,8 @@ varpool_remove_node (struct varpool_node /* Renove node initializer when it is no longer needed. */ void -varpool_remove_initializer (struct varpool_node *node) +varpool_remove_initializer (struct varpool_node *) { - if (DECL_INITIAL (node->symbol.decl) - && !DECL_IN_CONSTANT_POOL (node->symbol.decl) - /* Keep vtables for BINFO folding. */ - && !DECL_
Re: [patch] set MULTIARCH_DIRNAME for multilib architectures
"Bernhard Reutner-Fischer" writes: > On 12 June 2013 20:20:50 Richard Sandiford wrote: >> Matthias Klose writes: >> > Index: config/mips/t-linux64 >> > === >> > --- config/mips/t-linux64 (revision 200012) >> > +++ config/mips/t-linux64 (working copy) >> > @@ -24,3 +24,13 @@ >> >../lib32$(call >> if_multiarch,:mips64$(MIPS_EL)-linux-gnuabin32$(MIPS_SOFT)) \ >> >../lib$(call if_multiarch,:mips$(MIPS_EL)-linux-gnu$(MIPS_SOFT)) \ >> >../lib64$(call >> > if_multiarch,:mips64$(MIPS_EL)-linux-gnuabi64$(MIPS_SOFT)) >> > + >> > +ifneq (,$(findstring abin32,$(target))) >> > +MULTIARCH_DIRNAME = $(call >> if_multiarch,mips64$(MIPS_EL)-linux-gnuabin32$(MIPS_SOFT)) >> > +else >> > +ifneq (,$(findstring abi64,$(target))) >> > +MULTIARCH_DIRNAME = $(call >> if_multiarch,mips64$(MIPS_EL)-linux-gnuabi64$(MIPS_SOFT)) >> > +else >> > +MULTIARCH_DIRNAME = $(call >> if_multiarch,mips$(MIPS_EL)-linux-gnu$(MIPS_SOFT)) >> > +endif >> > +endif >> >> findstring seems a bit fragile for a full triple. I think it would >> be better to have something similar to the current MIPS_SOFT definition: >> >> MIPS_SOFT = $(if $(strip $(filter MASK_SOFT_FLOAT_ABI, >> $(target_cpu_default)) $(filter soft, $(with_float))),soft) >> >> but for ABIs. It could then also take with_abi into account. >> Maybe something like: >> >> MIPS_ABI = $(or $(with_abi), \ >> $(if $(filter MIPS_ABI_DEFAULT=ABI_N32, \ >>$(target_cpu_default)), n32), \ >> o32) >> >> (completely untested). > > Bikeshedding: > Doko would know, but ISTR that $(or) did not exist in make-3.80 which is > currently the minimum prerequisite, fwiw. Gah, that's a pity. Thanks for the catch though. Maybe firstword would be OK instead. I see I also fell into the usual ABI trap. It wouldn't have affected the use in this patch, but the default ought to be "32" rather than "o32". Thanks, Richard
Re: [C++ Patch] PR 57599
... in any case, I propose to also add to /cpp0x this testcase suggested by Daniel. Thanks, Paolo. 7 Index: g++.dg/cpp0x/dyncast1.C === --- g++.dg/cpp0x/dyncast1.C (revision 0) +++ g++.dg/cpp0x/dyncast1.C (working copy) @@ -0,0 +1,27 @@ +// PR c++/57599 +// { dg-do compile { target c++11 } } + +struct A { }; +struct B : public A { }; + +template +struct is_same { static constexpr bool value = false; }; + +template +struct is_same { static constexpr bool value = true; }; + +template +T val(); + +static_assert(is_same(val())), + const A*>::value, "Ouch"); +static_assert(is_same(val())), + const A&>::value, "Ouch"); +static_assert(is_same(val())), + volatile A*>::value, "Ouch"); +static_assert(is_same(val())), + volatile A&>::value, "Ouch"); +static_assert(is_same(val())), + const volatile A*>::value, "Ouch"); +static_assert(is_same(val())), + const volatile A&>::value, "Ouch");
[Patch, Fortran] PR57596 - Fix OPTIONAL handling of deferred-length strings
A rather simple patch. I wonder why we didn't get in trouble before - the "*dummy = NULL;" part should affect also other optional allocatable dummy arguments. Build and regtested on x86-64-gnu-linux. OK for the trunk? Tobias PS: Pending patches: * Unreviewed: Print exception status at STOP, http://gcc.gnu.org/ml/fortran/2013-06/msg00077.html * Uncommitted: Mikael's CLASS+function patch, http://gcc.gnu.org/ml/fortran/2013-06/msg00079.html PPS: The old dump (GCC 4.8, 4.9 w/o patch should be the same) produced: get (character(kind=1)[1:(integer(kind=4)) _c_val] * * c_val, integer(kind=4) * _c_val) { *c_val = 0B; ... finally { *_c_val = .c_val; } } and with intent(inout): .c_val = *_c_val; 2013-06-13 Tobias Burnus PR fortran/57596 * trans-decl.c (gfc_trans_deferred_vars): Honor OPTIONAL for nullify and deferred-strings' length variable. 2013-06-13 Tobias Burnus PR fortran/57596 * gfortran.dg/deferred_type_param_9.f90: New. diff --git a/gcc/fortran/trans-decl.c b/gcc/fortran/trans-decl.c index 87652ba..300175f 100644 --- a/gcc/fortran/trans-decl.c +++ b/gcc/fortran/trans-decl.c @@ -3855,12 +3857,21 @@ gfc_trans_deferred_vars (gfc_symbol * proc_sym, gfc_wrapped_block * block) if (!sym->attr.dummy || sym->attr.intent == INTENT_OUT) { /* Nullify when entering the scope. */ - gfc_add_modify (&init, se.expr, - fold_convert (TREE_TYPE (se.expr), - null_pointer_node)); + tmp = fold_build2_loc (input_location, MODIFY_EXPR, + TREE_TYPE (se.expr), se.expr, + fold_convert (TREE_TYPE (se.expr), + null_pointer_node)); + if (sym->attr.optional) + { + tree present = gfc_conv_expr_present (sym); + tmp = build3_loc (input_location, COND_EXPR, + void_type_node, present, tmp, + build_empty_stmt (input_location)); + } + gfc_add_expr_to_block (&init, tmp); } - if ((sym->attr.dummy ||sym->attr.result) + if ((sym->attr.dummy || sym->attr.result) && sym->ts.type == BT_CHARACTER && sym->ts.deferred) { @@ -3874,15 +3885,38 @@ gfc_trans_deferred_vars (gfc_symbol * proc_sym, gfc_wrapped_block * block) gfc_add_modify (&init, sym->ts.u.cl->backend_decl, build_int_cst (gfc_charlen_type_node, 0)); else - gfc_add_modify (&init, sym->ts.u.cl->backend_decl, tmp); + { + tree tmp2; + + tmp2 = fold_build2_loc (input_location, MODIFY_EXPR, + gfc_charlen_type_node, + sym->ts.u.cl->backend_decl, tmp); + if (sym->attr.optional) + { + tree present = gfc_conv_expr_present (sym); + tmp2 = build3_loc (input_location, COND_EXPR, + void_type_node, present, tmp2, + build_empty_stmt (input_location)); + } + gfc_add_expr_to_block (&init, tmp2); + } gfc_restore_backend_locus (&loc); /* Pass the final character length back. */ if (sym->attr.intent != INTENT_IN) - tmp = fold_build2_loc (input_location, MODIFY_EXPR, - gfc_charlen_type_node, tmp, - sym->ts.u.cl->backend_decl); + { + tmp = fold_build2_loc (input_location, MODIFY_EXPR, + gfc_charlen_type_node, tmp, + sym->ts.u.cl->backend_decl); + if (sym->attr.optional) + { + tree present = gfc_conv_expr_present (sym); + tmp = build3_loc (input_location, COND_EXPR, + void_type_node, present, tmp, + build_empty_stmt (input_location)); + } + } else tmp = NULL_TREE; } --- /dev/null 2013-06-13 09:10:45.615178715 +0200 +++ gcc/gcc/testsuite/gfortran.dg/deferred_type_param_9.f90 2013-06-13 10:55:51.506836678 +0200 @@ -0,0 +1,22 @@ +! { dg-do run } +! +! PR fortran/57596 +! +! Contributed by Valery Weber +! +PROGRAM main + IMPLICIT NONE + call get () + call get2 () +contains + SUBROUTINE get (c_val) +CHARACTER( : ), INTENT( INOUT ), ALLOCATABLE, OPTIONAL :: c_val +CHARACTER( 10 ) :: c_val_tmp +if(present(c_val)) call abort() + END SUBROUTINE get + SUBROUTINE get2 (c_val) +CHARACTER( : ), INTENT( OUT ), ALLOCATABLE, OPTIONAL :: c_val +CHARACTER( 10 ) :: c_val_tmp +if(present(c_val)) call abort() + END SUBROUTINE get2 +END PROGRAM main
Re: [PATCH 4/4] Fix leading spaces.
On Thu, Jun 13, 2013 at 10:08:23AM +0200, Richard Biener wrote: > On Wed, Jun 12, 2013 at 10:08 PM, Ondřej Bílka wrote: > > A followup to previous patch is more general pass that changes leading > > spaces to tabs followed by at most 8 spaces. > > > > http://kam.mff.cuni.cz/~ondra/0004-Formatted-by-leading_space.patch > > Btw, rather than these kind of patches I'd appreciate if someone would look > at a simple pre(post?)-commit hook that enforces those whitespace rules. > > Either by adjusting the committed content or by rejecting the commit(?) > It was my next logical step to propose this. For hooks you need a clean source tree or they will pick unrelated code that happens to be in same file. I had hook ready so I copied it to http://kam.mff.cuni.cz/~ondra/stylepp.tar.bz2 To use it, add DIRECTORY/space_pre_commit to your pre-commit hook. I have other checkes that are bit more controversional (for example space after comma which could trigger reformating of subsequent lines.) For it I could add post commit hook that prints: Stylepp found several problems with formating. See git diff (stylepp.patch file if you prefer) Ondra
RE: [PATCH GCC]Consider NOP_EXPR and CONVERT_EXPR as equal nodes in operand_equal_p
> -Original Message- > From: Richard Biener [mailto:richard.guent...@gmail.com] > Sent: Thursday, June 13, 2013 3:51 PM > To: Bin Cheng > Cc: gcc-patches@gcc.gnu.org > Subject: Re: [PATCH GCC]Consider NOP_EXPR and CONVERT_EXPR as equal nodes in > operand_equal_p > > On Thu, Jun 13, 2013 at 3:27 AM, Bin Cheng wrote: > > Hi, > > This is a case of NOP_EXPR and CONVERT_EXPR not compared equal in > > operand_equal_p, resulting in below two nodes are considered different: > > > > NODE 0: > > > type > size > > unit size > > align 16 symtab 0 alias set 4 canonical type 0xb74602a0 > > precision 16 min max > 0xb744e78c 32767> context > > pointer_to_this > > > > > arg 0 > type > size > > unit size > > align 32 symtab 0 alias set 5 canonical type 0xb7460420 > > precision 32 min max > 0xb744e8a4 2147483647> context > D.6120> > > pointer_to_this > > > visiteddef_stmt _23 = *_22; > > > > version 23>> > > > > NODE 1: > > > type > size > > unit size > > align 16 symtab 0 alias set 4 canonical type 0xb74602a0 > > precision 16 min max > 0xb744e78c 32767> context > > pointer_to_this > > > > > arg 0 > type > size > > unit size > > align 32 symtab 0 alias set 5 canonical type 0xb7460420 > > precision 32 min max > 0xb744e8a4 2147483647> context > D.6120> > > pointer_to_this > > > visiteddef_stmt _23 = *_22; > > > > version 23>> > > > > > > This patch fixes the problem. Please refer to > > http://gcc.gnu.org/ml/gcc/2013-05/msg00199.html for more information. > > > > Bootstrap and test on x86 and cortex-a15. Is it OK? > > Ok. Applied as r200062. Thanks. bin
RE: [PATCH GCC]Check the code to be executed for COND_EXEC in noop_move_p
> -Original Message- > From: Eric Botcazou [mailto:ebotca...@adacore.com] > Sent: Thursday, June 13, 2013 3:36 PM > To: Bin Cheng > Cc: gcc-patches@gcc.gnu.org > Subject: Re: [PATCH GCC]Check the code to be executed for COND_EXEC in > noop_move_p > > > 2013-06-13 Bin Cheng > > > > * rtlanal.c (noop_move_p): Check the code to be executed for > > COND_EXEC. > > OK if you use COND_EXEC_CODE instead of EXP and remove the useless assertion. > Hi Eric, Attached patch is applied as r200061, modified according to your comments. Thanks. bin Index: gcc/rtlanal.c === --- gcc/rtlanal.c (revision 199949) +++ gcc/rtlanal.c (working copy) @@ -1199,6 +1199,10 @@ noop_move_p (const_rtx insn) if (find_reg_note (insn, REG_EQUAL, NULL_RTX)) return 0; + /* Check the code to be executed for COND_EXEC. */ + if (GET_CODE (pat) == COND_EXEC) +pat = COND_EXEC_CODE (pat); + if (GET_CODE (pat) == SET && set_noop_p (pat)) return 1;
Re: expand_expr tweaks to fix PR57134
On Wed, Jun 12, 2013 at 4:48 AM, Alan Modra wrote: > The following patch fixes PR57134 by > a) excluding bitfield expansion when EXPAND_MEMORY, and > b) passing down the EXPAND_MEMORY modifier in a couple of places where > this does not currently happen on recursive calls to expand_expr(). > > (a) has precedent elsewhere in expr.c, eg. see expand_expr_real_1 > , (b) is a little difficult to justify except to claim > there's no logical reason why it should be excluded nowadays. Hand > waving argument follows: > > Of the seven expand_expr() modifiers, recursive calls made to handle > inner references of aggregate types currently exclude EXPAND_SUM, > EXPAND_WRITE, and EXPAND_MEMORY from the modifier passed. I suppose > EXPAND_SUM is excluded so that the inner expand_expr() won't return a > PLUS or MULT when we might be adding a further offset, but I can't see > why the other two modifiers are not passed. Digging through the > history, it looks like EXPAND_WRITE may have been missed by accident, > and EXPAND_MEMORY used to be an entirely different animal. kenner was > responsible for the original recursive call that passed down > EXPAND_NORMAL, EXPAND_CONST_ADDRESS and EXPAND_INITIALIZER when expr.h > looked like: > > 14612 kennerEXPAND_MEMORY_USE_* are explained below. */ >406 kenner enum expand_modifier {EXPAND_NORMAL, EXPAND_SUM, > 14612 kenner EXPAND_CONST_ADDRESS, > EXPAND_INITIALIZER, > 14612 kenner EXPAND_MEMORY_USE_WO, > EXPAND_MEMORY_USE_RW, > 14612 kenner EXPAND_MEMORY_USE_BAD, > EXPAND_MEMORY_USE_DONT}; >406 kenner > 14612 kenner /* Argument for chkr_* functions. > 14612 kennerMEMORY_USE_RO: the pointer reads memory. > 14612 kennerMEMORY_USE_WO: the pointer writes to memory. > 14612 kennerMEMORY_USE_RW: the pointer modifies memory (ie it reads > and writes). An > 14612 kenner example is (*ptr)++ > 14612 kennerMEMORY_USE_BAD: use this if you don't know the behavior > of the pointer, or > 14612 kennerif you know there are no pointers. > Using an INDIRECT_REF > 14612 kennerwith MEMORY_USE_BAD will abort. > 14612 kennerMEMORY_USE_TW: just test for writing, without update. > Special. > 14612 kennerMEMORY_USE_DONT: the memory is neither read nor written. > This is used by > 14612 kenner '->' and '.'. */ > > So I think passing EXPAND_MEMORY and EXPAND_WRITE down ought to be > good. The VIEW_CONVERT_EXPR case really doesn't need to exclude > EXPAND_SUM either, since it doesn't allow an offset, but I made it the > same as the inner ref case so that when/if someone attends to > /* ??? We should work harder and deal with non-zero offsets. */ > it doesn't cause a surprise. Really, a lot of this code needs > attention, for example, I think SSA made EXPAND_STACK_PARM and all of > the related code in calls.c dealing with libcalls when expanding args, > unnecessary. > > Bootstrapped and regression tested powerpc64-linux. OK to apply? I suppose it also fixes PR57586 which looks similar? Can you add a testcase please? Ok with a testcase and mentioning PR57586 in the changelog. Thanks, Richard. > PR middle-end/57134 > * expr.c (expand_expr_real_1 ): Pass > EXPAND_MEMORY and EXPAND_WRITE to recursive call. Don't use > bitfield expansion when EXPAND_MEMORY. > (expand_expr_real_1 ): Pass modifier likewise. > > > Index: gcc/expr.c > === > --- gcc/expr.c (revision 199940) > +++ gcc/expr.c (working copy) > @@ -9918,12 +9919,8 @@ expand_expr_real_1 (tree exp, rtx target, enum mac > && modifier != EXPAND_STACK_PARM > ? target : NULL_RTX), > VOIDmode, > -(modifier == EXPAND_INITIALIZER > - || modifier == EXPAND_CONST_ADDRESS > - || modifier == EXPAND_STACK_PARM) > -? modifier : EXPAND_NORMAL); > +modifier == EXPAND_SUM ? EXPAND_NORMAL : modifier); > > - > /* If the bitfield is volatile, we want to access it in the >field's mode, not the computed mode. >If a MEM has VOIDmode (external with incomplete type), > @@ -10081,6 +10078,7 @@ expand_expr_real_1 (tree exp, rtx target, enum mac > || (MEM_P (op0) > && (MEM_ALIGN (op0) < GET_MODE_ALIGNMENT (mode1) > || (bitpos % GET_MODE_ALIGNMENT (mode1) != 0 > +&& modifier != EXPAND_MEMORY > && ((modifier == EXPAND_CONST_ADDRESS > || modifier == EXPAND_INITIALIZER) > ? STRICT_ALIGNMENT > @@ -10279,10 +102
Re: [GOOGLE] More strict checking for call args
On Sat, Jun 8, 2013 at 2:26 AM, Xinliang David Li wrote: > On Fri, Jun 7, 2013 at 6:47 AM, Richard Biener > wrote: >> On Fri, Jun 7, 2013 at 3:30 PM, Xinliang David Li wrote: >>> On Fri, Jun 7, 2013 at 2:05 AM, Richard Biener >>> wrote: On Thu, Jun 6, 2013 at 5:10 PM, Dehao Chen wrote: > Hi, Martin, > > Yes, your patch can fix my case. Thanks a lot for the fix. > > With the fix, value profiling will still promote the wrong indirect > call target. Though it will not be inlining, but it results in an > additional check. How about in check_ic_target, after calling > gimple_check_call_matching_types, we also check if number of args > match number of params in target->symbol.decl? I wonder what's the point in the gimple_check_call_matching_types check in the profiling case. It's at least no longer /* Perform sanity check on the indirect call target. Due to race conditions, false function target may be attributed to an indirect call site. If the call expression type mismatches with the target function's type, expand_call may ICE. because since the introduction of gimple_call_fntype we will _not_ ICE. Thus I argue that check_ic_target should be even removed instead of enhancing it! >>> >>> Another reason is what Dehao had mentioned -- wrong target leads to >>> useless transformation. >> >> Sure, but a not wrong in the sense of the predicate does not guarantee >> a useful transformation either. > > The case in reality is very rare -- most of the cases, the > transformation is good. > >> How does IC profiling determine the called target? That is, what does it do when the target is not always the same? (because the checking code talks about race conditions for example) >>> >>> >>> The race condition is the happening at instrumentation time -- the >>> indirect call counters are not thread local. We have seen this a lot >>> in the past that a totally bogus target is attributed to a indirect >>> callsite. >> >> So it simply uses whatever function was called last? Instead of >> using the function that was called most of the time? > > It uses the most frequent target -- but the target id recorded for the > most frequent target might be corrupted and got mapped to a false > target during profile-use. But that's not due to "race conditions" but rather AutoFDO which isn't even in trunk, right? Anyway, the discussion is probably moot - the patch is ok with me and my argument would be we should use the function in less places. Thanks, Richard. > David > >> >> Richard. >> >>> thanks, >>> >>> David Richard. > Thanks, > Dehao > > > On Thu, Jun 6, 2013 at 7:11 AM, Martin Jambor wrote: >> >> Hi, >> >> On Tue, Jun 04, 2013 at 05:19:02PM -0700, Dehao Chen wrote: >> > attached is a testcase that would cause problem when source has >> > changed: >> > >> > $ g++ test.cc -O2 -fprofile-generate -DOLD >> > $ ./a.out >> > $ g++ test.cc -O2 -fprofile-use >> > test.cc:34:1: internal compiler error: in operator[], at vec.h:815 >> > } >> > ^ >> > 0x512740 vec::operator[](unsigned int) >> > ../../gcc/vec.h:815 >> > 0x512740 vec::operator[](unsigned int) >> > ../../gcc/vec.h:1244 >> > 0xf24464 vec::operator[](unsigned int) >> > ../../gcc/vec.h:815 >> > 0xf24464 vec::operator[](unsigned int) >> > ../../gcc/vec.h:1244 >> > 0xf24464 ipa_get_indirect_edge_target_1 >> > ../../gcc/ipa-cp.c:1535 >> > 0x971b9a estimate_edge_devirt_benefit >> > ../../gcc/ipa-inline-analysis.c:2757 >> >> Hm, this seems rather like an omission in ipa_get_indirect_edge_target_1. >> Since it is called also from inlining, we can have parameter count >> mismatches... and in fact in non-virtual paths of that function we do >> check that we don't. Because all callers have to pass known_vals >> describing all formal parameters of the inline tree root, we should >> apply the fix below (I've only just started running a bootstrap and >> testsuite on x86_64, though). >> >> OTOH, while I understand that FDO can change inlining sufficiently so >> that this error occurs, IMHO this should not be caused by outdated >> profiles but there is somewhere a parameter mismatch in the source. >> >> Dehao, can you please check that this patch helps? >> >> Richi, if it does and the patch passes bootstrap and tests, is it OK >> for trunk and 4.8 branch? >> >> Thanks and sorry for the trouble, >> >> Martin >> >> >> 2013-06-06 Martin Jambor >> >> * ipa-cp.c (ipa_get_indirect_edge_target_1): Check that >> param_index is >> within bounds at the beginning of the function. >> >> Index: src/gcc/ipa-cp.c >>
Re: [PATCH 0/2] Proof-of-concept towards removal of the "cfun" global
On Fri, May 31, 2013 at 4:12 PM, David Malcolm wrote: > On Tue, 2013-05-28 at 12:30 -0600, Jeff Law wrote: >> On 05/28/2013 11:00 AM, David Malcolm wrote: >> > On Tue, 2013-05-28 at 06:39 -0600, Jeff Law wrote: >> >> On 05/25/2013 07:02 AM, David Malcolm wrote: >> >>> I can think of three approaches to "cfun": >> >>> (a) status quo: a global variable, with macros to prevent direct >> >>> assignment, and an API for changing cfun. >> >>> (b) have a global "context" or "universe" object, and put cfun in >> >>> there (perhaps with tricks to be able to make this a singleton in a >> >>> non-library build, optimizing away the context lookups somehow >> >>> - see [2] for discussion on this) >> >>> (c) go through all of the places where cfun is used, and somehow ensure >> >>> that they're passed in the data they need. Often it's not the >> >>> function that's used, but its cfg. >> >> I'd think B or C is going to be the way to go here. B may also be an >> >> intermediate step towards C. >> >> >> >>> >> >>> One part of the puzzle is that various header files in the build define >> >>> macros that reference the "cfun" global, e.g.: >> >>> >> >>> #define n_basic_blocks (cfun->cfg->x_n_basic_blocks) >> >>> >> >>> This one isn't in block caps, which might mislead a new contributor into >> >>> thinking it's a variable, rather than a macro, so there may be virtue in >> >>> removing these macros for that reason alone. (I know that these confused >> >>> me for a while when I first started writing my plugin) [3] >> >> There's a few of these that have crept in over the years. >> >> n_basic_blocks used to be a global variable. At some point it was >> >> stuffed into cfun, but it was decided not to go back and fix all the >> >> references -- possibly due to not wanting to fix the overly long lines >> >> after the mechanical change. >> > >> > If a mechanical change could fix the overly-long lines as it went along, >> > would such a change be acceptable now? >> Probably. However, I would advise against trying to pack too much into >> a single patch. >> >> So one possibility to move forward would be a single patch which removes >> the n_basic_blocks macro and fixes all the uses along with their >> overly-long lines. It's simple, obvious and very self-contained. > > Thanks. I'm attaching such a patch. > > Successful 3-stage build against r199533; "make check" shows the same > results as an unpatched 3-stage build of that same revision (both on > x86_64-unknown-linux-gnu). > > OK for trunk? > > Should I continue with individual patches to remove each macro, in turn? Sorry for taking so long to look at this. I'd prefer instead of hunks like @@ -2080,7 +2080,7 @@ reorder_basic_blocks (void) gcc_assert (current_ir_type () == IR_RTL_CFGLAYOUT); - if (n_basic_blocks <= NUM_FIXED_BLOCKS + 1) + if (cfun->cfg->n_basic_blocks <= NUM_FIXED_BLOCKS + 1) return; set_edge_can_fallthru_flag (); to use if (n_basic_blocks_for_function (cfun) <= NUM_FIXED_BLOCKS + 1) so we have a single way over the compiler to access n_basic_blocks. Ok with that change, and ok with following up with patches for the other individual macros, replacing them with existing _for_function variants. Btw, I'm also ok with shortening these macros to use _for_fn instead at the same time, adjusting the very few existing invocations of them. Thanks, Richard. > Dave
Re: [PATCH] Enable non-complex math builtins from C99 for Bionic
On Mon, May 27, 2013 at 11:15 AM, Alexander Ivchenko wrote: > Hi, > > While discussing the issue with the command line option for sincos here: > http://gcc.1065356.n5.nabble.com/PATCH-bionic-Add-foptimize-sincos-tp940918.html > > Richard wrote: >> I'd rather think about a way to specify, for all known builtins, whether GCC >> should generate calls to such function where they are not in the source >> program. That is, similar to how we have -f[no-]builtin-FOO introduce >> -f[no-]libc-FOO (with a better name for 'libc'?). That way there would be >> a way to specify that -floop-distribute-patterns should not produce calls >> to memset () for example. > > This patch is related to that idea: in the target hook > libc_has_function we can check whether > the compiler should generate calls to a particular function or not. > The command line support for > each function could be the next step. > > Could you please take a look? I don't see how a target hook is required for the command-line idea. Targets already have a perfectly working way of changing the default of a command-line option. Richard. > thanks > Alexander > > 2013/4/23 Alexander Ivchenko : >> *ping* >> >> thanks >> Alexander >> >> 2013/3/28 Alexander Ivchenko : >>> Hi, >>> >>> 4.8 is now branched, lets come back to the discussion that we had >>> before. I updated the patch a little >>> bit since we now have linux-protos.h and linux-android.c files. >>> >>> I tried to preserve the avaiability of c99 for all targets, but it's >>> pretty difficult, because we are changing >>> the defaults. Passing an empty string as second argument doesn't look >>> very good, but on the other hand >>> the user has one clear way for checking the presence of a certain >>> function. But of course we can create >>> another function, that will call targetm.libc_has_function >>> (function_class, "") within itself. >>> >>> best regards, >>> Alexander >>> >>> 2013/1/7 Joseph S. Myers : On Fri, 21 Dec 2012, Alexander Ivchenko wrote: > Hi, > > Thank you very much for your input! Please, take a look at the updated > version: > I fixed coding style, moved documentation for TARGET_LIBC_HAS_FUNCTION > to target.def. > Removed TARGET_C99_FUNCTIONS and TARGET_HAS_SINCOS and all their > influence and moved the implementation of linux_libc_has_function to > host-linux.c. > I changed the defaults: now it is assumed that we have C99 runtime, > but no sincos. I updated all needed gcc/config/*.h. But 'm not sure in > this part, > cause I don't have the opportunity to test it properly... This patch seems mostly plausible, though there are various places that call targetm.libc_has_function with and empty string as second argument, that should be naming the specific function instead. I haven't reviewed the details, and at this development stage I think it will need to wait until after 4.8 branches. -- Joseph S. Myers jos...@codesourcery.com
Re: [Bug libstdc++/56430] In __airy: return-statement with a value, in function returning 'void'.
On 06/13/2013 02:38 AM, Paolo Carlini wrote: If we really have to add a testcase - I'm not sure - please double check that it passes testing with -Wall, no unused vars. Patch as went in had still the testcase wrong, triggering at least 3 warnings with -Wall. All in all, I decided to also remove the additional functions: it doesn't make sense to add *now* functions to tr1, which otherwise is deeply in regression fixes only mode. And certainly not under a completely unrelated PR. Paolo.
Re: [PATCH 1/4] Fix trailing whitespaces
On Thu, Jun 13, 2013 at 09:45:23AM +0200, Marek Polacek wrote: > On Wed, Jun 12, 2013 at 10:18:29PM +, Joseph S. Myers wrote: > > In general I think no formatting fixes should be made to GCC testcases, > > including removal of trailing whitespace; it's good if they cover a range > > of coding styles and oddities as that reflects how GCC is used in > > practice. > > And second problem I see with these cleanups is that they are not very > good for e.g. git blame. > If only these tools had switch to ignore whitespaces. Then you could write git blame -w and these pesky cleanups would not show at all.
Re: [PATCH 4/4] Fix leading spaces.
On Wed, Jun 12, 2013 at 10:08 PM, Ondřej Bílka wrote: > A followup to previous patch is more general pass that changes leading > spaces to tabs followed by at most 8 spaces. > > http://kam.mff.cuni.cz/~ondra/0004-Formatted-by-leading_space.patch Btw, rather than these kind of patches I'd appreciate if someone would look at a simple pre(post?)-commit hook that enforces those whitespace rules. Either by adjusting the committed content or by rejecting the commit(?) Richard.
Re: More forwprop for vectors
On Thu, Jun 13, 2013 at 8:44 AM, Marc Glisse wrote: > On Wed, 12 Jun 2013, Jeff Law wrote: > >>> 2013-06-13 Marc Glisse >>> >>> * tree-ssa-forwprop.c (simplify_bitwise_binary, >>> associate_plusminus): >>> Generalize to complex and vector. >>> * tree.c (build_all_ones_cst): New function. >>> * tree.h (build_all_ones_cst): Declare it. >> >> This is OK. >> >> Extra credit if you create some testcases. > > > While writing the testcase, I fixed a bug in the current code (mix up > between rhs1 and rhs2 for ~A + 1 -> -A which prevents it from ever > triggering) and generalized the pattern (A + CST) +- CST -> A + CST to also > allow '-' (it isn't clear to me why it wasn't handled, did I miss a > reason?). It passes the bootstrap+testsuite on x86_64-unknown-linux-gnu. > > Are you ok with the (A +- CST) +- CST -> A +- CST part? Yes, that looks fine. Please also update the overall comment before the cases: /* Second match patterns that allow contracting a plus-minus pair irrespective of overflow issues. (A +- B) - A -> +- B (A +- B) -+ B -> A (CST +- A) +- CST -> CST +- A (A + CST) +- CST -> A + CST ~A + A -> -1 ~A + 1 -> -A A - (A +- B) -> -+ B A +- (B +- A) -> +- B CST +- (CST +- A) -> CST +- A CST +- (A +- CST) -> CST +- A A + ~A -> -1 you can see that we do handle CST +- (A +- CST) -> CST +- A so I cannot think of a reason to not handle the commutated variant. Thus, ok with the above adjusted. Thanks, Richard. > Extra piece of ChangeLog: > > gcc/testsuite/ > * gcc.dg/tree-ssa/forwprop-27.c: New testcase. > > -- > Marc Glisse > Index: tree-ssa-forwprop.c > === > --- tree-ssa-forwprop.c (revision 200044) > +++ tree-ssa-forwprop.c (working copy) > @@ -1971,22 +1971,22 @@ simplify_bitwise_binary (gimple_stmt_ite > gimple_assign_set_rhs2 (stmt, b); > gimple_assign_set_rhs_code (stmt, def1_code); > update_stmt (stmt); > return true; > } > } > >/* (a | CST1) & CST2 -> (a & CST2) | (CST1 & CST2). */ >if (code == BIT_AND_EXPR >&& def1_code == BIT_IOR_EXPR > - && TREE_CODE (arg2) == INTEGER_CST > - && TREE_CODE (def1_arg2) == INTEGER_CST) > + && CONSTANT_CLASS_P (arg2) > + && CONSTANT_CLASS_P (def1_arg2)) > { >tree cst = fold_build2 (BIT_AND_EXPR, TREE_TYPE (arg2), > arg2, def1_arg2); >tree tem; >gimple newop; >if (integer_zerop (cst)) > { > gimple_assign_set_rhs1 (stmt, def1_arg1); > update_stmt (stmt); > return true; > @@ -2002,34 +2002,33 @@ simplify_bitwise_binary (gimple_stmt_ite >gimple_assign_set_rhs_code (stmt, BIT_IOR_EXPR); >update_stmt (stmt); >return true; > } > >/* Combine successive equal operations with constants. */ >if ((code == BIT_AND_EXPR > || code == BIT_IOR_EXPR > || code == BIT_XOR_EXPR) >&& def1_code == code > - && TREE_CODE (arg2) == INTEGER_CST > - && TREE_CODE (def1_arg2) == INTEGER_CST) > + && CONSTANT_CLASS_P (arg2) > + && CONSTANT_CLASS_P (def1_arg2)) > { >tree cst = fold_build2 (code, TREE_TYPE (arg2), > arg2, def1_arg2); >gimple_assign_set_rhs1 (stmt, def1_arg1); >gimple_assign_set_rhs2 (stmt, cst); >update_stmt (stmt); >return true; > } > >/* Canonicalize X ^ ~0 to ~X. */ >if (code == BIT_XOR_EXPR > - && TREE_CODE (arg2) == INTEGER_CST >&& integer_all_onesp (arg2)) > { >gimple_assign_set_rhs_with_ops (gsi, BIT_NOT_EXPR, arg1, NULL_TREE); >gcc_assert (gsi_stmt (*gsi) == stmt); >update_stmt (stmt); >return true; > } > >/* Try simple folding for X op !X, and X op X. */ >res = simplify_bitwise_binary_1 (code, TREE_TYPE (arg1), arg1, arg2); > @@ -2472,73 +2471,75 @@ associate_plusminus (gimple_stmt_iterato >&& code != def_code) > { > /* (A +- B) -+ B -> A. */ > code = TREE_CODE (def_rhs1); > rhs1 = def_rhs1; > rhs2 = NULL_TREE; > gimple_assign_set_rhs_with_ops (gsi, code, rhs1, > NULL_TREE); > gcc_assert (gsi_stmt (*gsi) == stmt); > gimple_set_modified (stmt, true); > } > - else if (TREE_CODE (rhs2) == INTEGER_CST > - && TREE_CODE (def_rhs1) == INTEGER_CST) > + else if (CONSTANT_CLASS_P (rhs2) > + && CONSTANT_CLASS_P (def_rhs1)) > { > /* (CST +- A) +- CST -> CST +- A. */ > tree cst = fold_binary (code,
[Ada] Remove recent trick used for fat pointer types
Fat pointer types are pointer types to (slices of) arrays and they are said "fat" because they contain a pointer to the array and a pointer to the bounds. We were using a trick on platforms which pass them by reference to improve the debug info at -O0, but this pessimizes when optimization is enabled. Tested on x86_64-suse-linux, applied on the mainline and 4.8 branch. 2013-06-13 Eric Botcazou * gcc-interface/ada-tree.h (DECL_BY_DOUBLE_REF_P): Delete. * gcc-interface/gigi.h (annotate_object): Adjust prototype. (convert_vms_descriptor): Likewise. * gcc-interface/decl.c (gnat_to_gnu_param): Do not pass fat pointer types by double dereference. (annotate_object): Remove BY_DOUBLE_REF parameter and adjust. (gnat_to_gnu_entity): Adjust calls to annotate_object. * gcc-interface/trans.c (Identifier_to_gnu): Do not deal with double dereference. (Call_to_gnu): Likewise. (build_function_stub): Adjust call to convert_vms_descriptor. (Subprogram_Body_to_gnu): Adjust call to annotate_object. * gcc-interface/utils.c (convert_vms_descriptor): Remove BY_REF parameter and adjust. -- Eric BotcazouIndex: gcc-interface/utils.c === --- gcc-interface/utils.c (revision 200056) +++ gcc-interface/utils.c (working copy) @@ -4096,33 +4096,25 @@ convert_vms_descriptor32 (tree gnu_type, /* Convert GNU_EXPR, a pointer to a VMS descriptor, to GNU_TYPE, a regular pointer or fat pointer type. GNU_EXPR_ALT_TYPE is the alternate (32-bit) - pointer type of GNU_EXPR. BY_REF is true if the result is to be used by - reference. GNAT_SUBPROG is the subprogram to which the VMS descriptor is - passed. */ + pointer type of GNU_EXPR. GNAT_SUBPROG is the subprogram to which the + descriptor is passed. */ tree convert_vms_descriptor (tree gnu_type, tree gnu_expr, tree gnu_expr_alt_type, - bool by_ref, Entity_Id gnat_subprog) + Entity_Id gnat_subprog) { tree desc_type = TREE_TYPE (TREE_TYPE (gnu_expr)); tree desc = build1 (INDIRECT_REF, desc_type, gnu_expr); tree mbo = TYPE_FIELDS (desc_type); const char *mbostr = IDENTIFIER_POINTER (DECL_NAME (mbo)); tree mbmo = DECL_CHAIN (DECL_CHAIN (DECL_CHAIN (mbo))); - tree real_type, is64bit, gnu_expr32, gnu_expr64; - - if (by_ref) -real_type = TREE_TYPE (gnu_type); - else -real_type = gnu_type; + tree is64bit, gnu_expr32, gnu_expr64; /* If the field name is not MBO, it must be 32-bit and no alternate. Otherwise primary must be 64-bit and alternate 32-bit. */ if (strcmp (mbostr, "MBO") != 0) { - tree ret = convert_vms_descriptor32 (real_type, gnu_expr, gnat_subprog); - if (by_ref) - ret = build_unary_op (ADDR_EXPR, gnu_type, ret); + tree ret = convert_vms_descriptor32 (gnu_type, gnu_expr, gnat_subprog); return ret; } @@ -4139,14 +4131,9 @@ convert_vms_descriptor (tree gnu_type, t integer_minus_one_node)); /* Build the 2 possible end results. */ - gnu_expr64 = convert_vms_descriptor64 (real_type, gnu_expr, gnat_subprog); - if (by_ref) -gnu_expr64 = build_unary_op (ADDR_EXPR, gnu_type, gnu_expr64); + gnu_expr64 = convert_vms_descriptor64 (gnu_type, gnu_expr, gnat_subprog); gnu_expr = fold_convert (gnu_expr_alt_type, gnu_expr); - gnu_expr32 = convert_vms_descriptor32 (real_type, gnu_expr, gnat_subprog); - if (by_ref) -gnu_expr32 = build_unary_op (ADDR_EXPR, gnu_type, gnu_expr32); - + gnu_expr32 = convert_vms_descriptor32 (gnu_type, gnu_expr, gnat_subprog); return build3 (COND_EXPR, gnu_type, is64bit, gnu_expr64, gnu_expr32); } Index: gcc-interface/decl.c === --- gcc-interface/decl.c (revision 200056) +++ gcc-interface/decl.c (working copy) @@ -1025,7 +1025,7 @@ gnat_to_gnu_entity (Entity_Id gnat_entit save_gnu_tree (gnat_entity, gnu_decl, true); saved = true; annotate_object (gnat_entity, gnu_type, NULL_TREE, - false, false); + false); /* This assertion will fail if the renamed object isn't aligned enough as to make it possible to honor the alignment set on the renaming. */ @@ -1604,7 +1604,7 @@ gnat_to_gnu_entity (Entity_Id gnat_entit type of the object and not on the object directly, and makes it possible to support all confirming representation clauses. */ annotate_object (gnat_entity, TREE_TYPE (gnu_decl), gnu_object_size, - used_by_ref, false); + used_by_ref); } break; @@ -5650,7 +5650,7 @@ gnat_to_gnu_param (Entity_Id gnat_param, /* The parameter can be indirectly modified if its address is taken. */ bool ro_param = in_param && !Address_Taken (gnat_param); bool by_return = false, by_component_ptr = false; - bool by_ref = false, by_double_ref = false; + bool by_ref = false; tree gnu_param; /* Copy-return is used only for the fi
Re: [PATCH GCC]Consider NOP_EXPR and CONVERT_EXPR as equal nodes in operand_equal_p
On Thu, Jun 13, 2013 at 3:27 AM, Bin Cheng wrote: > Hi, > This is a case of NOP_EXPR and CONVERT_EXPR not compared equal in > operand_equal_p, resulting in below two nodes are considered different: > > NODE 0: > type size > unit size > align 16 symtab 0 alias set 4 canonical type 0xb74602a0 > precision 16 min max 0xb744e78c 32767> context > pointer_to_this > > > arg 0 type size > unit size > align 32 symtab 0 alias set 5 canonical type 0xb7460420 > precision 32 min max 0xb744e8a4 2147483647> context D.6120> > pointer_to_this > > visiteddef_stmt _23 = *_22; > > version 23>> > > NODE 1: > type size > unit size > align 16 symtab 0 alias set 4 canonical type 0xb74602a0 > precision 16 min max 0xb744e78c 32767> context > pointer_to_this > > > arg 0 type size > unit size > align 32 symtab 0 alias set 5 canonical type 0xb7460420 > precision 32 min max 0xb744e8a4 2147483647> context D.6120> > pointer_to_this > > visiteddef_stmt _23 = *_22; > > version 23>> > > > This patch fixes the problem. Please refer to > http://gcc.gnu.org/ml/gcc/2013-05/msg00199.html for more information. > > Bootstrap and test on x86 and cortex-a15. Is it OK? Ok. Thanks, Richard. > Thanks. > bin > > 2013-06-13 Bin Cheng > > * fold-const.c (operand_equal_p): Consider NOP_EXPR and CONVERT_EXPR > as equal nodes.
Re: [PATCH 1/4] Fix trailing whitespaces
On Wed, Jun 12, 2013 at 10:18:29PM +, Joseph S. Myers wrote: > In general I think no formatting fixes should be made to GCC testcases, > including removal of trailing whitespace; it's good if they cover a range > of coding styles and oddities as that reflects how GCC is used in > practice. And second problem I see with these cleanups is that they are not very good for e.g. git blame. Marek
Re: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)
On Wed, Jun 12, 2013 at 12:52:03PM -0500, Edmar Wienskoski wrote: > The e500v2 (SPE) hardware is such that if the address of vector (double world > load / stores) are not double world aligned the instruction will trap. > > So this alignment is not optional. Vector type alignment is also specified by the ppc64 abi. I think we want the following. Note that DATA_ALIGNMENT has been broken for vectors right from the initial vector support (and the error was copied for e500 double). For example typedef int vec_align __attribute__ ((vector_size(16), aligned(32))); vec_align x = { 0, 0, 0, 0 }; currently loses the extra alignment. Fixed by never decreasing alignment in DATA_ABI_ALIGNMENT. Testing in progress. OK to apply assuming bootstrap is good? (I think I need a change in offsettable_ok_by_alignment too. I'll do that in a separate patch.) * config/rs6000/rs6000.h (DATA_ABI_ALIGNMENT): Define. (DATA_ALIGNMENT): Remove alignment already covered by above. (LOCAL_ALIGNMENT): Use both DATA_ABI_ALIGNMENT and DATA_ALIGNMENT. Index: gcc/config/rs6000/rs6000.h === --- gcc/config/rs6000/rs6000.h (revision 200055) +++ gcc/config/rs6000/rs6000.h (working copy) @@ -817,7 +817,8 @@ extern unsigned rs6000_pointer_size; local store. TYPE is the data type, and ALIGN is the alignment that the object would ordinarily have. */ #define LOCAL_ALIGNMENT(TYPE, ALIGN) \ - DATA_ALIGNMENT (TYPE, ALIGN) + ({unsigned int _align = DATA_ABI_ALIGNMENT (TYPE, ALIGN);\ +DATA_ALIGNMENT (TYPE, _align); }) /* Alignment of field after `int : 0' in a structure. */ #define EMPTY_FIELD_BOUNDARY 32 @@ -837,21 +838,26 @@ extern unsigned rs6000_pointer_size; ? BITS_PER_WORD \ : (ALIGN)) -/* Make arrays of chars word-aligned for the same reasons. - Align vectors to 128 bits. Align SPE vectors and E500 v2 doubles to +/* Make arrays of chars word-aligned for the same reasons. */ +#define DATA_ALIGNMENT(TYPE, ALIGN)\ + ((TREE_CODE (TYPE) == ARRAY_TYPE \ +&& TYPE_MODE (TREE_TYPE (TYPE)) == QImode \ +&& (ALIGN) < BITS_PER_WORD) ? BITS_PER_WORD : (ALIGN)) + +/* Align vectors to 128 bits. Align SPE vectors and E500 v2 doubles to 64 bits. */ -#define DATA_ALIGNMENT(TYPE, ALIGN)\ +#define DATA_ABI_ALIGNMENT(TYPE, ALIGN) \ (TREE_CODE (TYPE) == VECTOR_TYPE \ ? (((TARGET_SPE && SPE_VECTOR_MODE (TYPE_MODE (TYPE))) \ || (TARGET_PAIRED_FLOAT && PAIRED_VECTOR_MODE (TYPE_MODE (TYPE \ - ? 64 : 128) \ - : ((TARGET_E500_DOUBLE \ - && TREE_CODE (TYPE) == REAL_TYPE \ - && TYPE_MODE (TYPE) == DFmode) \ - ? 64 \ - : (TREE_CODE (TYPE) == ARRAY_TYPE \ -&& TYPE_MODE (TREE_TYPE (TYPE)) == QImode \ -&& (ALIGN) < BITS_PER_WORD) ? BITS_PER_WORD : (ALIGN))) + ? ((ALIGN) < 64 ? 64 : (ALIGN)) \ + : ((ALIGN) < 128 ? 128 : (ALIGN))) \ + : (TARGET_E500_DOUBLE \ + && TREE_CODE (TYPE) == REAL_TYPE \ + && TYPE_MODE (TYPE) == DFmode\ + && (ALIGN) < 64) \ + ? 64 \ + : (ALIGN)) /* Nonzero if move instructions will actually fail to work when given unaligned data. */ -- Alan Modra Australia Development Lab, IBM
Re: [PATCH GCC]Check the code to be executed for COND_EXEC in noop_move_p
> 2013-06-13 Bin Cheng > > * rtlanal.c (noop_move_p): Check the code to be executed for > COND_EXEC. OK if you use COND_EXEC_CODE instead of EXP and remove the useless assertion. -- Eric Botcazou