Re: PR c++/30195
Fabien Chêne fabien.ch...@gmail.com wrote:

> Index: gcc/dbxout.c
> ===
> --- gcc/dbxout.c	(revision 178088)
> +++ gcc/dbxout.c	(working copy)
> @@ -1518,6 +1518,8 @@ dbxout_type_fields (tree type)
>        if (TREE_CODE (tem) == TYPE_DECL
>  	  /* Omit here the nameless fields that are used to skip bits.  */
>  	  || DECL_IGNORED_P (tem)
> +	  /* Omit USING_DECL */
> +	  || TREE_CODE (tem) >= LAST_AND_UNUSED_TREE_CODE
>  	  /* Omit fields whose position or size are variable or too large to
>  	     represent.  */
>  	  || (TREE_CODE (tem) == FIELD_DECL

As this dbxout backend code already ignores DECLs marked DECL_IGNORED_P, maybe it would be best to have the front-end mark the USING_DECL as DECL_IGNORED_P; possibly in finish_member_declaration?  You'd then avoid the above change.

-- 
Dodji
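The suggestion above can be sketched with a toy model. This is not GCC code: `toy_decl`, `toy_finish_member` and the other `toy_*` names are hypothetical stand-ins, with the `ignored` field playing the role of DECL_IGNORED_P. The point is only that once the front end sets the flag, the back-end test needs no knowledge of front-end-specific tree codes.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for tree codes; not the real GCC definitions.  */
enum toy_code { TOY_FIELD_DECL, TOY_TYPE_DECL, TOY_USING_DECL };

struct toy_decl
{
  enum toy_code code;
  bool ignored;			/* plays the role of DECL_IGNORED_P */
};

/* Front-end side: flag using-declarations once, when the member is
   added to the class.  */
void
toy_finish_member (struct toy_decl *d)
{
  if (d->code == TOY_USING_DECL)
    d->ignored = true;
}

/* Back-end side: the existing "ignored" test is sufficient; no check
   against language-specific codes is required.  */
bool
toy_backend_skips (const struct toy_decl *d)
{
  return d->code == TOY_TYPE_DECL || d->ignored;
}

/* Count the members a debug back end would emit for a class.  */
size_t
toy_count_emitted (struct toy_decl *members, size_t n)
{
  size_t emitted = 0;

  for (size_t i = 0; i < n; i++)
    {
      toy_finish_member (&members[i]);
      if (!toy_backend_skips (&members[i]))
	emitted++;
    }
  return emitted;
}
```

With one field, one using-declaration and one type declaration, only the field would be emitted.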
Re: [Patch] Move Objective-C runtime flags to modern options system.
On Wed, 9 Nov 2011, Mike Stump wrote:

> On Nov 9, 2011, at 10:12 AM, Iain Sandoe wrote:
> > This puts flag_next_runtime into the global options structure
> >
> > I needed to deal with '-fobjc-sjlj-exceptions' and elected to remove it - this is because there is only one valid exception model for each permutation of runtime and ABI - thus the User flag is just clutter.
> >
> > It is now ignored as a User flag - and the relevant selection actions are all local to Objective C.
> > (yay! got rid of one exceptions-related flag :-))
>
> Yeah, that sounds like a good idea.
>
> > +ObjC ObjC++ Ignore Warn(switch %qs has been removed and is set automaticaly where required)
>
> Spelling, automatically.
>
> > +targetting Darwin.  However, the flag overrides have not be called yet.  */
>
> Spelling, targeting.
>
> > +  if (flag_objc_exceptions)
> > +    /* ??? Should we warn that this is incompatible, if the user has set it.
> > +       For now, just force it it off.  */
> > +    flag_exceptions = 0;
>
> Where was this in the previous code?  In ObjC++, exceptions can be on for C++ and should not be turned off.  Does this code ever turn off C++ exceptions?

flag_exceptions also triggers middle-end behavior - without it no statement can possibly throw.  Thus, resetting it can't be ok.

Richard.
Re: [patch tree-optimization 1/2]: Branch-cost optimizations
On Wed, Nov 9, 2011 at 7:34 PM, Jeff Law l...@redhat.com wrote:

> On 11/07/11 15:36, Richard Guenther wrote:
> > Yes.  tree-affine does this for a sum of expressions of the form a + b * c.  It collects such sum, optimizes it (and you can add/subtract or scale these things) and re-emit the new simplified form.
>
> Kai, what were the concerns with this kind of approach?  Richard's suggestion seems sound to me.  From a maintenance standpoint reusing/extending the tree-affine code seems like a better route.
>
> jeff

I'd rather write a new (but similar) infrastructure for predicates.  They are substantially different enough so that putting them together doesn't make too much sense.

Richard.
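As a rough illustration of the kind of side representation described above (collect a sum of scaled terms, add or scale it, then read back the simplified form), here is a minimal sketch. It is not GCC's tree-affine interface: `toy_affine` and all `toy_aff_*` functions are hypothetical, variables are small integer ids, and the term limit is arbitrary.

```c
#include <stddef.h>

#define TOY_MAX_TERMS 8

/* Represents  offset + sum (coef[i] * var[i]).  */
struct toy_affine
{
  long offset;
  size_t n;
  int var[TOY_MAX_TERMS];
  long coef[TOY_MAX_TERMS];
};

void
toy_aff_init (struct toy_affine *a, long offset)
{
  a->offset = offset;
  a->n = 0;
}

/* Add COEF * VAR.  A term for the same variable is merged, and a term
   whose coefficient becomes zero is dropped.  Returns 0 on success and
   -1 if the combination is full.  */
int
toy_aff_add_term (struct toy_affine *a, int var, long coef)
{
  for (size_t i = 0; i < a->n; i++)
    if (a->var[i] == var)
      {
	a->coef[i] += coef;
	if (a->coef[i] == 0)
	  {
	    /* Remove the cancelled term by moving the last one here.  */
	    a->n--;
	    a->var[i] = a->var[a->n];
	    a->coef[i] = a->coef[a->n];
	  }
	return 0;
      }

  if (coef == 0)
    return 0;
  if (a->n == TOY_MAX_TERMS)
    return -1;
  a->var[a->n] = var;
  a->coef[a->n] = coef;
  a->n++;
  return 0;
}

/* Multiply the whole combination by K.  */
void
toy_aff_scale (struct toy_affine *a, long k)
{
  if (k == 0)
    {
      a->offset = 0;
      a->n = 0;
      return;
    }
  a->offset *= k;
  for (size_t i = 0; i < a->n; i++)
    a->coef[i] *= k;
}

/* Evaluate the combination; VALUES is indexed by variable id.  */
long
toy_aff_eval (const struct toy_affine *a, const long *values)
{
  long result = a->offset;

  for (size_t i = 0; i < a->n; i++)
    result += a->coef[i] * values[a->var[i]];
  return result;
}
```

Building 3 + 2*x + 5*y and then adding -2*x leaves the single term 5*y, which is the "simplify on the side" behaviour the message refers to.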
Re: [patch tree-optimization 1/2]: Branch-cost optimizations
On Wed, Nov 9, 2011 at 10:09 PM, Kai Tietz ktiet...@googlemail.com wrote:

> 2011/11/9 Jeff Law l...@redhat.com:
> > On 11/07/11 15:36, Richard Guenther wrote:
> > > Yes.  tree-affine does this for a sum of expressions of the form a + b * c.  It collects such sum, optimizes it (and you can add/subtract or scale these things) and re-emit the new simplified form.
> >
> > Kai, what were the concerns with this kind of approach?  Richard's suggestion seems sound to me.  From a maintenance standpoint reusing/extending the tree-affine code seems like a better route.
> >
> > jeff
>
> Well, such a comparison-logic-folder helper (like affine-tree for add/subtract/scale) is for sure something good for inner gimple passes building up new logic-truth expressions, but such a pass doesn't meet the requirements needed to fold for BC optimization AFAICS.

tree-affine is not an affine folder either.  It is an on-the-side representation of a sum of affine components that you can operate on (for example, you can simplify it).

> The difference is that for BC we don't want to fold at all.  Also it isn't necessarily a simplified statement we want.  For example 'if ((a | b) == 0 ...) ...'.  If the costs of such a pattern '(a | b) == 0' are too high, we want to represent it instead as 'if (a == 0) if (b == 0) ...'.

The affine tree of (a | b) == 0 is

AND
[0] ~a
[1] ~b

> So for BC optimization we need to have a fully compare-expanded sequence of bitwise-operations including for sub-sequences.  For normal folding we don't need sub-sequence expansion in most cases.

the predicate infrastructure isn't meant to be for folding - it is meant to be a data structure that is well suited for operating on predicate expressions in all ways (well, including folding).

> The cause why we need for BC a fully compare-expanded sequence I try to demonstrate on the following example.
>
> We have a condition 'if ((A | B) == 0 && C == 0) ...' where the joining of A == 0 and C == 0 would be profitable by BC-optimization, but the join of A != 0 and B != 0 isn't.
> So we do - as my patch does - first an expand to element comparison-sequence view.

AND
[0] ~A
[1] ~B
[2] ~C

> So we get for it the transformed form 'if (A == 0 && B == 0 && C == 0)'.
> Now we can begin to search for valid patterns in the condition for joining by searching from left-to-right for a profitable pattern.  So we end-up with final statement 'if ((A | C) == 0 && C)'
>
> So as conclusion to answer your question about tree-affine implementation.

It's exactly what you implemented (well, sort-of).  You just did not properly abstract it away.

> Sure we can move the BC-optimization into a separate pass.  As you and Richard already have explained, this would be indeed in some cases better, as there is more freedom in operating on gimple-statements.  This makes for sure sense.
>
> But the logic itself we need in normal sequence-representation for folding seems not to be that what we need for BC.

Sure, it's exactly the same data structure we can use.

> For general gimple passes we want to have compact form on linear-sequence without sub-sequences and we want compact final-representation in output.  In BC we have slightly different requirements.  We need a comparison-expanded form and of course with sub-sequences to do split-up right dependent on the actual branch-costs.

You can trivially split up a predicate combination into multiple ones.

Richard.
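The rewriting discussed in this message relies on the joined, expanded and regrouped conditions being logically equivalent, so that only branch cost decides which one to emit. A small sketch that checks this for unsigned operands follows. The function names are hypothetical, and the regrouped form `(a | c) == 0 && b == 0` is an assumption about what the final statement quoted above is meant to express.

```c
#include <stdbool.h>

/* Compact input form: one OR and one compare cover A and B.  */
bool
cond_joined (unsigned a, unsigned b, unsigned c)
{
  return (a | b) == 0 && c == 0;
}

/* Fully compare-expanded form: one comparison per element.  */
bool
cond_expanded (unsigned a, unsigned b, unsigned c)
{
  return a == 0 && b == 0 && c == 0;
}

/* Regrouped form, joining A and C instead of A and B (assumed intent).  */
bool
cond_regrouped (unsigned a, unsigned b, unsigned c)
{
  return (a | c) == 0 && b == 0;
}

/* Exhaustively compare the three forms for all operands in [0, LIMIT].  */
bool
forms_agree (unsigned limit)
{
  for (unsigned a = 0; a <= limit; a++)
    for (unsigned b = 0; b <= limit; b++)
      for (unsigned c = 0; c <= limit; c++)
	{
	  bool expected = cond_joined (a, b, c);

	  if (cond_expanded (a, b, c) != expected
	      || cond_regrouped (a, b, c) != expected)
	    return false;
	}
  return true;
}
```

Since all three forms agree, the choice between them is purely a cost decision, which is why the expanded view is a convenient intermediate step before regrouping.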
[patch] Fix PR tree-optimization/51058
Hi,

This patch handles CALL_EXPRs in constant/invariant operand creation in SLP.

Bootstrapped and tested on powerpc64-suse-linux.
Committed.

Ira

ChangeLog:

	PR tree-optimization/51058
	* tree-vect-slp.c (vect_get_constant_vectors): Handle CALL_EXPR.

testsuite/ChangeLog:

	PR tree-optimization/51058
	* gfortran.dg/vect/pr51058.f90: New test.

Index: tree-vect-slp.c
===
--- tree-vect-slp.c	(revision 181250)
+++ tree-vect-slp.c	(working copy)
@@ -2191,7 +2191,7 @@ vect_get_constant_vectors (tree op, slp_tree slp_n
   VEC (tree, heap) *voprnds = VEC_alloc (tree, heap, number_of_vectors);
   bool constant_p, is_store;
   tree neutral_op = NULL;
-  enum tree_code code = gimple_assign_rhs_code (stmt);
+  enum tree_code code = gimple_expr_code (stmt);
   gimple def_stmt;
   struct loop *loop;
 
@@ -2287,22 +2287,32 @@ vect_get_constant_vectors (tree op, slp_tree slp_n
             {
               if (is_store)
                 op = gimple_assign_rhs1 (stmt);
-              else if (gimple_assign_rhs_code (stmt) != COND_EXPR)
-                op = gimple_op (stmt, op_num + 1);
-              else
+              else
                 {
-                  if (op_num == 0 || op_num == 1)
+                  switch (code)
                     {
-                      tree cond = gimple_assign_rhs1 (stmt);
-                      op = TREE_OPERAND (cond, op_num);
+                      case COND_EXPR:
+                        if (op_num == 0 || op_num == 1)
+                          {
+                            tree cond = gimple_assign_rhs1 (stmt);
+                            op = TREE_OPERAND (cond, op_num);
+                          }
+                        else
+                          {
+                            if (op_num == 2)
+                              op = gimple_assign_rhs2 (stmt);
+                            else
+                              op = gimple_assign_rhs3 (stmt);
+                          }
+                        break;
+
+                      case CALL_EXPR:
+                        op = gimple_call_arg (stmt, op_num);
+                        break;
+
+                      default:
+                        op = gimple_op (stmt, op_num + 1);
                     }
-                  else
-                    {
-                      if (op_num == 2)
-                        op = gimple_assign_rhs2 (stmt);
-                      else
-                        op = gimple_assign_rhs3 (stmt);
-                    }
                 }
 
               if (reduc_index != -1)
Index: testsuite/gfortran.dg/vect/pr51058.f90
===
--- testsuite/gfortran.dg/vect/pr51058.f90	(revision 0)
+++ testsuite/gfortran.dg/vect/pr51058.f90	(revision 0)
@@ -0,0 +1,19 @@
+! { dg-do compile }
+
+      SUBROUTINE MLIST(MOLsp,PBCx,PBCy,PBCz, X0)
+      IMPLICIT NONE
+      INTEGER, PARAMETER :: NM=16384
+      INTEGER :: MOLsp, i
+      REAL :: PBCx, PBCy, PBCz, boxjmp, HALf=1./2.
+      REAL :: X0(2,-2:NM)
+
+         DO i = 1 , MOLsp
+          boxjmp = PBCx*INT(X0(1,i)+SIGN(HALf,X0(1,i)))
+          X0(1,i) = X0(1,i) - boxjmp
+          boxjmp = PBCy*INT(X0(2,i)+SIGN(HALf,X0(2,i)))
+          X0(2,i) = X0(2,i) - boxjmp
+         ENDDO
+      END
+
+! { dg-final { cleanup-tree-dump "vect" } }
+
Revert PowerPC shrink-wrap support 3 of 3
> From: Hans-Peter Nilsson h...@axis.com
> Date: Wed, 9 Nov 2011 09:55:59 +0100

> > From: Alan Modra amo...@gmail.com
> > Date: Tue, 1 Nov 2011 16:33:40 +0100

> > On Tue, Nov 01, 2011 at 12:57:22AM +1030, Alan Modra wrote:
> > >   * function.c (bb_active_p): Delete.
> > >   (dup_block_and_redirect, active_insn_between): New functions.
> > >   (convert_jumps_to_returns, emit_return_for_exit): New functions, split out from..
> > >   (thread_prologue_and_epilogue_insns): ..here.  Delete shadowing variables.  Don't do prologue register clobber tests when shrink wrapping already failed.  Delete all last_bb_active code.  Instead compute tail block candidates for duplicating exit path.  Remove these from antic set.  Duplicate tails when reached from both blocks needing a prologue/epilogue and blocks not needing such.
> > >   * ifcvt.c (dead_or_predicable): Test both flag_shrink_wrap and HAVE_simple_return.
> > >   * bb-reorder.c (get_uncond_jump_length): Make global.
> > >   * bb-reorder.h (get_uncond_jump_length): Declare.
> > >   * cfgrtl.c (rtl_create_basic_block): Comment typo fix.
> > >   (rtl_split_edge): Likewise.  Warning fix.
> > >   (rtl_duplicate_bb): New function.
> > >   (rtl_cfg_hooks): Enable can_duplicate_block_p and duplicate_block.

> This (a revision in the range 181187:181189) broke build for cris-elf like so:

See PR51051.

Given that this also broke arm-linux-gnueabi, a primary platform, and Alan being absent until the 15th according to a message on IRC, I move to revert r181188.

I think I need someone with appropriate write privileges to agree with that, and to also give 48h for someone to fix the problem.  Sorry for not forthcoming on the second point.

brgds, H-P
PS. where is the policy written down, besides the mailing list archives?
Re: [PATCH] Handle -msse -mno-sse2 in expand_vec_perm_interleave2 (PR target/50911)
> 2011-11-09  Jakub Jelinek  ja...@redhat.com
>
> 	PR target/50911
> 	* config/i386/i386.c (expand_vec_perm_interleave2): If d->vmode is V4SImode, !TARGET_SSE2 and punpck[lh]* is needed, change dremap.vmode to V4SFmode.

Thanks for fixing this.  I've installed the Ada testcase.

2011-11-10  Eric Botcazou  ebotca...@adacore.com

	* gnat.dg/loop_optimization9.ad[sb]: New test.

-- 
Eric Botcazou

-- { dg-do compile }
-- { dg-options "-gnatws -O3" }
-- { dg-options "-gnatws -O3 -msse" { target i?86-*-* x86_64-*-* } }

with System.Soft_Links;

package body Loop_Optimization9 is

   package SSL renames System.Soft_Links;

   First_Temp_File_Name : constant String := "GNAT-TEMP-00.TMP";

   Current_Temp_File_Name : String := First_Temp_File_Name;

   Temp_File_Name_Last_Digit : constant Positive :=
                                 First_Temp_File_Name'Last - 4;

   function Argument_String_To_List
     (Arg_String : String) return Argument_List_Access
   is
      Max_Args : constant Integer := Arg_String'Length;
      New_Argv : Argument_List (1 .. Max_Args);
      New_Argc : Natural := 0;
      Idx      : Integer;

   begin
      Idx := Arg_String'First;

      loop
         exit when Idx > Arg_String'Last;

         declare
            Quoted  : Boolean := False;
            Backqd  : Boolean := False;
            Old_Idx : Integer;

         begin
            Old_Idx := Idx;

            loop
               --  An unquoted space is the end of an argument

               if not (Backqd or Quoted)
                 and then Arg_String (Idx) = ' '
               then
                  exit;

               --  Start of a quoted string

               elsif not (Backqd or Quoted)
                 and then Arg_String (Idx) = '"'
               then
                  Quoted := True;

               --  End of a quoted string and end of an argument

               elsif (Quoted and not Backqd)
                 and then Arg_String (Idx) = '"'
               then
                  Idx := Idx + 1;
                  exit;

               --  Following character is backquoted

               elsif Arg_String (Idx) = '\' then
                  Backqd := True;

               --  Turn off backquoting after advancing one character

               elsif Backqd then
                  Backqd := False;

               end if;

               Idx := Idx + 1;
               exit when Idx > Arg_String'Last;
            end loop;

            --  Found an argument

            New_Argc := New_Argc + 1;
            New_Argv (New_Argc) :=
              new String'(Arg_String (Old_Idx .. Idx - 1));
         end;
      end loop;

      return new Argument_List'(New_Argv (1 .. New_Argc));
   end Argument_String_To_List;

   procedure Create_Temp_File_Internal
     (FD   : out File_Descriptor;
      Name : out String_Access)
   is
      Pos : Positive;
   begin
      File_Loop : loop
         Locked : begin
            Pos := Temp_File_Name_Last_Digit;

            Digit_Loop :
            loop
               case Current_Temp_File_Name (Pos) is
                  when '0' .. '8' =>
                     Current_Temp_File_Name (Pos) :=
                       Character'Succ (Current_Temp_File_Name (Pos));
                     exit Digit_Loop;

                  when '9' =>
                     Current_Temp_File_Name (Pos) := '0';
                     Pos := Pos - 1;

                  when others =>
                     SSL.Unlock_Task.all;
                     FD := 0;
                     Name := null;
                     exit File_Loop;
               end case;
            end loop Digit_Loop;
         end Locked;
      end loop File_Loop;
   end Create_Temp_File_Internal;

end Loop_Optimization9;

with GNAT.Strings; use GNAT.Strings;

package Loop_Optimization9 is

   type File_Descriptor is new Integer;

   procedure Create_Temp_File_Internal
     (FD   : out File_Descriptor;
      Name : out String_Access);

   subtype Argument_List is String_List;
   subtype Argument_List_Access is String_List_Access;

   function Argument_String_To_List
     (Arg_String : String) return Argument_List_Access;

end Loop_Optimization9;
Re: [patch tree-optimization 1/2]: Branch-cost optimizations
2011/11/10 Richard Guenther richard.guent...@gmail.com:
> On Wed, Nov 9, 2011 at 10:09 PM, Kai Tietz ktiet...@googlemail.com wrote:
> > 2011/11/9 Jeff Law l...@redhat.com:
> > > On 11/07/11 15:36, Richard Guenther wrote:
> > > > Yes.  tree-affine does this for a sum of expressions of the form a + b * c.  It collects such sum, optimizes it (and you can add/subtract or scale these things) and re-emit the new simplified form.
> > >
> > > Kai, what were the concerns with this kind of approach?  Richard's suggestion seems sound to me.  From a maintenance standpoint reusing/extending the tree-affine code seems like a better route.
> > >
> > > jeff
> >
> > Well, such a comparison-logic-folder helper (like affine-tree for add/subtract/scale) is for sure something good for inner gimple passes building up new logic-truth expressions, but such a pass doesn't meet the requirements needed to fold for BC optimization AFAICS.
>
> tree-affine is not an affine folder either.  It is an on-the-side representation of a sum of affine components that you can operate on (for example, you can simplify it).
>
> > The difference is that for BC we don't want to fold at all.  Also it isn't necessarily a simplified statement we want.  For example 'if ((a | b) == 0 ...) ...'.  If the costs of such a pattern '(a | b) == 0' are too high, we want to represent it instead as 'if (a == 0) if (b == 0) ...'.
>
> The affine tree of (a | b) == 0 is
>
> AND
> [0] ~a
> [1] ~b

Well, this is just true, if a and b are boolean-typed.  But we need to handle also elements with different types, which are always comparisons.  I have chosen exactly this sample, as it works on any integral-type.  So using predicate not is not that what we need here in general.  We have indeed to make up element-chains via comparison to operate on.

> > So for BC optimization we need to have a fully compare-expanded sequence of bitwise-operations including for sub-sequences.  For normal folding we don't need sub-sequence expansion in most cases.
>
> the predicate infrastructure isn't meant to be for folding - it is meant to be a data structure that is well suited for operating on predicate expressions in all ways (well, including folding).
>
> > The cause why we need for BC a fully compare-expanded sequence I try to demonstrate on the following example.
> >
> > We have a condition 'if ((A | B) == 0 && C == 0) ...' where the joining of A == 0 and C == 0 would be profitable by BC-optimization, but the join of A != 0 and B != 0 isn't.
> > So we do - as my patch does - first an expand to element comparison-sequence view.
>
> AND
> [0] ~A
> [1] ~B
> [2] ~C

Again, not true, as ~ works only on boolean-typed invariant elements A, B, and C.  The attribute logical not is only of interest within a final operand of an invariant argument of an element.  It is no predicate for the bitwise-binary-chain itself.  If we have a bitwise-chain with different type as boolean, an inverse might be a candidate for predicates, but we try to operate here on conditions.

> > So we get for it the transformed form 'if (A == 0 && B == 0 && C == 0)'.
> > Now we can begin to search for valid patterns in the condition for joining by searching from left-to-right for a profitable pattern.  So we end-up with final statement 'if ((A | C) == 0 && C)'
> >
> > So as conclusion to answer your question about tree-affine implementation.
>
> It's exactly what you implemented (well, sort-of).  You just did not properly abstract it away.

I know, but this can be done.

Kai
Re: Revert PowerPC shrink-wrap support 3 of 3
On Thu, Nov 10, 2011 at 11:38 AM, Hans-Peter Nilsson hans-peter.nils...@axis.com wrote:
> > From: Hans-Peter Nilsson h...@axis.com
> > Date: Wed, 9 Nov 2011 09:55:59 +0100
>
> > > From: Alan Modra amo...@gmail.com
> > > Date: Tue, 1 Nov 2011 16:33:40 +0100
>
> > > On Tue, Nov 01, 2011 at 12:57:22AM +1030, Alan Modra wrote:
> > > >   * function.c (bb_active_p): Delete.
> > > >   (dup_block_and_redirect, active_insn_between): New functions.
> > > >   (convert_jumps_to_returns, emit_return_for_exit): New functions, split out from..
> > > >   (thread_prologue_and_epilogue_insns): ..here.  Delete shadowing variables.  Don't do prologue register clobber tests when shrink wrapping already failed.  Delete all last_bb_active code.  Instead compute tail block candidates for duplicating exit path.  Remove these from antic set.  Duplicate tails when reached from both blocks needing a prologue/epilogue and blocks not needing such.
> > > >   * ifcvt.c (dead_or_predicable): Test both flag_shrink_wrap and HAVE_simple_return.
> > > >   * bb-reorder.c (get_uncond_jump_length): Make global.
> > > >   * bb-reorder.h (get_uncond_jump_length): Declare.
> > > >   * cfgrtl.c (rtl_create_basic_block): Comment typo fix.
> > > >   (rtl_split_edge): Likewise.  Warning fix.
> > > >   (rtl_duplicate_bb): New function.
> > > >   (rtl_cfg_hooks): Enable can_duplicate_block_p and duplicate_block.
>
> > This (a revision in the range 181187:181189) broke build for cris-elf like so:
>
> See PR51051.
>
> Given that this also broke arm-linux-gnueabi, a primary platform, and Alan being absent until the 15th according to a message on IRC, I move to revert r181188.

Is there a PR for the arm issue?

> I think I need someone with appropriate write privileges to agree with that, and to also give 48h for someone to fix the problem.  Sorry for not forthcoming on the second point.

Did you or somebody else try to look into the problem?  To decide whether it's the best course of action it would be nice to know if it's a simple error in the patch that is easy to fix.

> brgds, H-P
> PS. where is the policy written down, besides the mailing list archives?

http://gcc.gnu.org/develop.html
Re: [patch tree-optimization 1/2]: Branch-cost optimizations
On Thu, Nov 10, 2011 at 11:49 AM, Kai Tietz ktiet...@googlemail.com wrote:
> 2011/11/10 Richard Guenther richard.guent...@gmail.com:
> > On Wed, Nov 9, 2011 at 10:09 PM, Kai Tietz ktiet...@googlemail.com wrote:
> > > 2011/11/9 Jeff Law l...@redhat.com:
> > > > On 11/07/11 15:36, Richard Guenther wrote:
> > > > > Yes.  tree-affine does this for a sum of expressions of the form a + b * c.  It collects such sum, optimizes it (and you can add/subtract or scale these things) and re-emit the new simplified form.
> > > >
> > > > Kai, what were the concerns with this kind of approach?  Richard's suggestion seems sound to me.  From a maintenance standpoint reusing/extending the tree-affine code seems like a better route.
> > > >
> > > > jeff
> > >
> > > Well, such a comparison-logic-folder helper (like affine-tree for add/subtract/scale) is for sure something good for inner gimple passes building up new logic-truth expressions, but such a pass doesn't meet the requirements needed to fold for BC optimization AFAICS.
> >
> > tree-affine is not an affine folder either.  It is an on-the-side representation of a sum of affine components that you can operate on (for example, you can simplify it).
> >
> > > The difference is that for BC we don't want to fold at all.  Also it isn't necessarily a simplified statement we want.  For example 'if ((a | b) == 0 ...) ...'.  If the costs of such a pattern '(a | b) == 0' are too high, we want to represent it instead as 'if (a == 0) if (b == 0) ...'.
> >
> > The affine tree of (a | b) == 0 is
> >
> > AND
> > [0] ~a
> > [1] ~b
>
> Well, this is just true, if a and b are boolean-typed.  But we need to handle also elements with different types, which are always comparisons.  I have chosen exactly this sample, as it works on any integral-type.  So using predicate not is not that what we need here in general.  We have indeed to make up element-chains via comparison to operate on.

AND
[0] ~ (a != 0)
[1] ~ (b != 0)

just because I chose to draw simple pictures does not mean a more complex one would be not supported (we'd want general comparisons anyway).  Nowhere did I specify that it should only work on boolean variables.

> > > So for BC optimization we need to have a fully compare-expanded sequence of bitwise-operations including for sub-sequences.  For normal folding we don't need sub-sequence expansion in most cases.
> >
> > the predicate infrastructure isn't meant to be for folding - it is meant to be a data structure that is well suited for operating on predicate expressions in all ways (well, including folding).
> >
> > > The cause why we need for BC a fully compare-expanded sequence I try to demonstrate on the following example.
> > >
> > > We have a condition 'if ((A | B) == 0 && C == 0) ...' where the joining of A == 0 and C == 0 would be profitable by BC-optimization, but the join of A != 0 and B != 0 isn't.
> > > So we do - as my patch does - first an expand to element comparison-sequence view.
> >
> > AND
> > [0] ~A
> > [1] ~B
> > [2] ~C
>
> Again, not true, as ~ works only on boolean-typed invariant elements A, B, and C.  The attribute logical not is only of interest within a final operand of an invariant argument of an element.  It is no predicate for the bitwise-binary-chain itself.  If we have a bitwise-chain with different type as boolean, an inverse might be a candidate for predicates, but we try to operate here on conditions.

See above.

> > > So we get for it the transformed form 'if (A == 0 && B == 0 && C == 0)'.
> > > Now we can begin to search for valid patterns in the condition for joining by searching from left-to-right for a profitable pattern.  So we end-up with final statement 'if ((A | C) == 0 && C)'
> > >
> > > So as conclusion to answer your question about tree-affine implementation.
> >
> > It's exactly what you implemented (well, sort-of).  You just did not properly abstract it away.
>
> I know, but this can be done.
>
> Kai
Re: Revert PowerPC shrink-wrap support 3 of 3
> From: Richard Guenther richard.guent...@gmail.com
> Date: Thu, 10 Nov 2011 12:22:56 +0100

> On Thu, Nov 10, 2011 at 11:38 AM, Hans-Peter Nilsson hans-peter.nils...@axis.com wrote:
> > > From: Hans-Peter Nilsson h...@axis.com
> > > Date: Wed, 9 Nov 2011 09:55:59 +0100
> >
> > > > From: Alan Modra amo...@gmail.com
> > > > Date: Tue, 1 Nov 2011 16:33:40 +0100
> >
> > > > On Tue, Nov 01, 2011 at 12:57:22AM +1030, Alan Modra wrote:
> > > > >   * function.c (bb_active_p): Delete.
> > > > >   (dup_block_and_redirect, active_insn_between): New functions.
> > > > >   (convert_jumps_to_returns, emit_return_for_exit): New functions, split out from..
> > > > >   (thread_prologue_and_epilogue_insns): ..here.  Delete shadowing variables.  Don't do prologue register clobber tests when shrink wrapping already failed.  Delete all last_bb_active code.  Instead compute tail block candidates for duplicating exit path.  Remove these from antic set.  Duplicate tails when reached from both blocks needing a prologue/epilogue and blocks not needing such.
> > > > >   * ifcvt.c (dead_or_predicable): Test both flag_shrink_wrap and HAVE_simple_return.
> > > > >   * bb-reorder.c (get_uncond_jump_length): Make global.
> > > > >   * bb-reorder.h (get_uncond_jump_length): Declare.
> > > > >   * cfgrtl.c (rtl_create_basic_block): Comment typo fix.
> > > > >   (rtl_split_edge): Likewise.  Warning fix.
> > > > >   (rtl_duplicate_bb): New function.
> > > > >   (rtl_cfg_hooks): Enable can_duplicate_block_p and duplicate_block.
> >
> > > This (a revision in the range 181187:181189) broke build for cris-elf like so:
> >
> > See PR51051.
> >
> > Given that this also broke arm-linux-gnueabi, a primary platform, and Alan being absent until the 15th according to a message on IRC, I move to revert r181188.
>
> Is there a PR for the arm issue?

It's covered by the same PR, see comment #1.  I've now updated the target field.

> > I think I need someone with appropriate write privileges to agree with that, and to also give 48h for someone to fix the problem.  Sorry for not forthcoming on the second point.
>
> Did you or somebody else try to look into the problem?  To decide whether it's the best course of action it would be nice to know if it's a simple error in the patch that is easy to fix.

Nope, not really.  Wouldn't FWIW, de jure matter, me not having write privileges to the affected area.

Though, I had a quick look at the patch and nothing stood out except its intrusiveness, and it seems the patch wasn't tested on a !simple_return target (just powerpc-linux according to the replied-to message).

brgds, H-P
[PATCH] Fix PR51071
When using gimple_has_side_effects on a GIMPLE_LABEL with a LABEL_DECL with DECL_FORCED_LABEL set we ICE.  That is because gimple_has_side_effects uses TREE_SIDE_EFFECTS on the LABEL_DECL which isn't valid.

Fixed by (finally) cleaning up this predicate, removing all code that can only be executed if we eventually ICE and has no other side-effects.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2011-11-10  Richard Guenther  rguent...@suse.de

	PR middle-end/51071
	* gimple.c (gimple_has_side_effects): Remove checking code that doesn't belong here.

	* gcc.dg/torture/pr51071.c: New testcase.

Index: gcc/gimple.c
===
*** gcc/gimple.c	(revision 181206)
--- gcc/gimple.c	(working copy)
*************** gimple_set_modified (gimple s, bool modi
*** 2457,2464 ****
  bool
  gimple_has_side_effects (const_gimple s)
  {
-   unsigned i;
- 
    if (is_gimple_debug (s))
      return false;
  
--- 2457,2462 ----
*************** gimple_has_side_effects (const_gimple s)
*** 2474,2520 ****
  
    if (is_gimple_call (s))
      {
!       unsigned nargs = gimple_call_num_args (s);
!       tree fn;
  
!       if (!(gimple_call_flags (s) & (ECF_CONST | ECF_PURE)))
!         return true;
!       else if (gimple_call_flags (s) & ECF_LOOPING_CONST_OR_PURE)
! 	/* An infinite loop is considered a side effect.  */
  	return true;
  
-       if (gimple_call_lhs (s)
-           && TREE_SIDE_EFFECTS (gimple_call_lhs (s)))
- 	{
- 	  gcc_checking_assert (gimple_has_volatile_ops (s));
- 	  return true;
- 	}
- 
-       fn = gimple_call_fn (s);
-       if (fn && TREE_SIDE_EFFECTS (fn))
-         return true;
- 
-       for (i = 0; i < nargs; i++)
-         if (TREE_SIDE_EFFECTS (gimple_call_arg (s, i)))
- 	  {
- 	    gcc_checking_assert (gimple_has_volatile_ops (s));
- 	    return true;
- 	  }
- 
        return false;
      }
-   else
-     {
-       for (i = 0; i < gimple_num_ops (s); i++)
- 	{
- 	  tree op = gimple_op (s, i);
- 	  if (op && TREE_SIDE_EFFECTS (op))
- 	    {
- 	      gcc_checking_assert (gimple_has_volatile_ops (s));
- 	      return true;
- 	    }
- 	}
-     }
  
    return false;
  }
--- 2472,2486 ----
  
    if (is_gimple_call (s))
      {
!       int flags = gimple_call_flags (s);
  
!       /* An infinite loop is considered a side effect.  */
!       if (!(flags & (ECF_CONST | ECF_PURE))
! 	  || (flags & ECF_LOOPING_CONST_OR_PURE))
  	return true;
  
        return false;
      }
  
    return false;
  }
Index: gcc/testsuite/gcc.dg/torture/pr51071.c
===
*** gcc/testsuite/gcc.dg/torture/pr51071.c	(revision 0)
--- gcc/testsuite/gcc.dg/torture/pr51071.c	(revision 0)
***************
*** 0 ****
--- 1,33 ----
+ /* { dg-do compile } */
+ 
+ void foo (void);
+ void bar (void *);
+ extern int t;
+ 
+ static void kmalloc_large (int size, int flags)
+ {
+   (void) size;
+   (void) flags;
+   foo ();
+   bar (({__here:&&__here;}));
+ }
+ 
+ static void kmalloc (int size, int flags)
+ {
+   if (size)
+     {
+       if ((unsigned long) size 0x1000)
+ 	kmalloc_large (size, flags);
+ 
+       if (flags)
+ 	bar (({__here:&&__here;}));
+     }
+ }
+ 
+ void compress_file_range (int i, int j, int k)
+ {
+   int nr_pages = ({j k;});
+ 
+   if (i || t)
+     kmalloc (0x1000UL * nr_pages, 0x40UL);
+ }
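The decision the cleaned-up predicate makes for calls can be sketched in isolation. This is a toy model, not GCC code: the `TOY_ECF_*` values are hypothetical (GCC defines its own ECF_* constants) and `toy_call_has_side_effects` only mirrors the two flag tests described in the patch.

```c
#include <stdbool.h>

/* Hypothetical flag bits standing in for GCC's ECF_* constants.  */
enum
{
  TOY_ECF_CONST = 1 << 0,
  TOY_ECF_PURE = 1 << 1,
  TOY_ECF_LOOPING_CONST_OR_PURE = 1 << 2
};

/* A call has side effects unless it is const or pure, and even then it
   does if it may loop forever (an infinite loop counts as a side
   effect).  */
bool
toy_call_has_side_effects (int flags)
{
  if (!(flags & (TOY_ECF_CONST | TOY_ECF_PURE)))
    return true;
  if (flags & TOY_ECF_LOOPING_CONST_OR_PURE)
    return true;
  return false;
}
```

Note that the operands of the call are never inspected, which is what removes the invalid TREE_SIDE_EFFECTS query on a LABEL_DECL.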
[PATCH] Fix PR51070, loop distribution and memset generation
This fixes PR51070 where we replace the partition

<bb 5>:
  # g_224.0_23 = PHI <g_224.1_10(7), g_224.0_22(4)>
  D.2957_3 = g_92[g_224.0_23];
  g_92[g_224.0_23] = 0;
  g_224.1_10 = g_224.0_23 + 1;
  if (g_224.1_10 != 0)
    goto <bb 7>;
  else
    goto <bb 6>;

<bb 6>:
  # g_95_I_lsm.15_29 = PHI <D.2957_3(5)>
  g_95[0] = g_95_I_lsm.15_29;
  g_224 = 0;
  goto <bb 3>;

<bb 7>:
  goto <bb 5>;

with a memset, throwing away the statement that loads D.2957_3 and thus causing an SSA name on the freelist to remain in the PHI node in BB6.  We obviously cannot create a memset for such a partition.

Bootstrap and regtest pending on x86_64-unknown-linux-gnu.

Richard.

2011-11-10  Richard Guenther  rguent...@suse.de

	PR tree-optimization/51070
	* tree-loop-distribution.c (generate_builtin): Do not replace the loop with a builtin if the partition contains statements which results are used outside of the loop.
	(pass_loop_distribution): Verify and collect.

	* gcc.dg/torture/pr51070.c: New testcase.

Index: gcc/tree-loop-distribution.c
===
*** gcc/tree-loop-distribution.c	(revision 181252)
--- gcc/tree-loop-distribution.c	(working copy)
*************** static bitmap remaining_stmts;
*** 63,68 ****
--- 63,110 ----
     predecessor a node that writes to memory.  */
  static bitmap upstream_mem_writes;
  
+ /* Returns true when DEF is an SSA_NAME defined in LOOP and used after
+    the LOOP.  */
+ 
+ static bool
+ ssa_name_has_uses_outside_loop_p (tree def, loop_p loop)
+ {
+   imm_use_iterator imm_iter;
+   use_operand_p use_p;
+ 
+   FOR_EACH_IMM_USE_FAST (use_p, imm_iter, def)
+     if (loop != loop_containing_stmt (USE_STMT (use_p)))
+       return true;
+ 
+   return false;
+ }
+ 
+ /* Returns true when STMT defines a scalar variable used after the
+    loop.  */
+ 
+ static bool
+ stmt_has_scalar_dependences_outside_loop (gimple stmt)
+ {
+   tree name;
+ 
+   switch (gimple_code (stmt))
+     {
+     case GIMPLE_ASSIGN:
+       name = gimple_assign_lhs (stmt);
+       break;
+ 
+     case GIMPLE_PHI:
+       name = gimple_phi_result (stmt);
+       break;
+ 
+     default:
+       return false;
+     }
+ 
+   return TREE_CODE (name) == SSA_NAME
+     && ssa_name_has_uses_outside_loop_p (name, loop_containing_stmt (stmt));
+ }
+ 
  /* Update the PHI nodes of NEW_LOOP.  NEW_LOOP is a duplicate of
     ORIG_LOOP.  */
  
*************** generate_builtin (struct loop *loop, bit
*** 330,339 ****
  	{
  	  gimple stmt = gsi_stmt (bsi);
  
! 	  if (gimple_code (stmt) != GIMPLE_LABEL
! 	      && !is_gimple_debug (stmt)
! 	      && bitmap_bit_p (partition, x++)
! 	      && is_gimple_assign (stmt)
  	      && !is_gimple_reg (gimple_assign_lhs (stmt)))
  	    {
  	      /* Don't generate the builtins when there are more than
--- 372,389 ----
  	{
  	  gimple stmt = gsi_stmt (bsi);
  
! 	  if (gimple_code (stmt) == GIMPLE_LABEL
! 	      || is_gimple_debug (stmt))
! 	    continue;
! 
! 	  if (!bitmap_bit_p (partition, x++))
! 	    continue;
! 
! 	  /* If the stmt has uses outside of the loop fail.  */
! 	  if (stmt_has_scalar_dependences_outside_loop (stmt))
! 	    goto end;
! 
! 	  if (is_gimple_assign (stmt)
  	      && !is_gimple_reg (gimple_assign_lhs (stmt)))
  	    {
  	      /* Don't generate the builtins when there are more than
*************** fuse_partitions_with_similar_memory_acce
*** 824,871 ****
      }
  }
  
- /* Returns true when DEF is an SSA_NAME defined in LOOP and used after
-    the LOOP.  */
- 
- static bool
- ssa_name_has_uses_outside_loop_p (tree def, loop_p loop)
- {
-   imm_use_iterator imm_iter;
-   use_operand_p use_p;
- 
-   FOR_EACH_IMM_USE_FAST (use_p, imm_iter, def)
-     if (loop != loop_containing_stmt (USE_STMT (use_p)))
-       return true;
- 
-   return false;
- }
- 
- /* Returns true when STMT defines a scalar variable used after the
-    loop.  */
- 
- static bool
- stmt_has_scalar_dependences_outside_loop (gimple stmt)
- {
-   tree name;
- 
-   switch (gimple_code (stmt))
-     {
-     case GIMPLE_ASSIGN:
-       name = gimple_assign_lhs (stmt);
-       break;
- 
-     case GIMPLE_PHI:
-       name = gimple_phi_result (stmt);
-       break;
- 
-     default:
-       return false;
-     }
- 
-   return TREE_CODE (name) == SSA_NAME
-     && ssa_name_has_uses_outside_loop_p (name, loop_containing_stmt (stmt));
- }
- 
  /* Returns true when STMT will be code generated in a partition of RDG
     different than PART and that will not be code generated as a
     builtin.  */
--- 874,879 ----
*************** struct gimple_opt_pass pass_loop_distrib
*** 1311,1316 ****
    0,					/* properties_provided */
    0,					/* properties_destroyed */
    0,					/* todo_flags_start */
!   0					/* todo_flags_finish
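A source-level sketch of the situation described in this message may help: a loop that clears an array but also loads each element first, with the last loaded value used after the loop. The function names are hypothetical and the second function deliberately models the invalid transformation, where only the stores survive as a memset and nothing computes the escaping value any more.

```c
#include <stddef.h>
#include <string.h>

/* Original loop: clears BUF, and the value loaded before each store
   escapes the loop through LAST.  */
int
clear_and_return_last (int *buf, size_t n)
{
  int last = 0;

  for (size_t i = 0; i < n; i++)
    {
      last = buf[i];		/* load whose result is used after the loop */
      buf[i] = 0;
    }
  return last;
}

/* What a memset-only replacement amounts to: the stores are preserved,
   but the load is gone, so the escaping value is lost (modelled here by
   returning 0).  */
int
clear_with_memset_only (int *buf, size_t n)
{
  memset (buf, 0, n * sizeof *buf);
  return 0;
}
```

Both functions leave the array zeroed, but only the first returns the last original element, which is why a partition with uses outside the loop must not be turned into a builtin.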
Re: Revert PowerPC shrink-wrap support 3 of 3
On Thu, Nov 10, 2011 at 12:43 PM, Hans-Peter Nilsson hans-peter.nils...@axis.com wrote: From: Richard Guenther richard.guent...@gmail.com Date: Thu, 10 Nov 2011 12:22:56 +0100 On Thu, Nov 10, 2011 at 11:38 AM, Hans-Peter Nilsson hans-peter.nils...@axis.com wrote: From: Hans-Peter Nilsson h...@axis.com Date: Wed, 9 Nov 2011 09:55:59 +0100 From: Alan Modra amo...@gmail.com Date: Tue, 1 Nov 2011 16:33:40 +0100 On Tue, Nov 01, 2011 at 12:57:22AM +1030, Alan Modra wrote: * function.c (bb_active_p): Delete. (dup_block_and_redirect, active_insn_between): New functions. (convert_jumps_to_returns, emit_return_for_exit): New functions, split out from.. (thread_prologue_and_epilogue_insns): ..here. Delete shadowing variables. Don't do prologue register clobber tests when shrink wrapping already failed. Delete all last_bb_active code. Instead compute tail block candidates for duplicating exit path. Remove these from antic set. Duplicate tails when reached from both blocks needing a prologue/epilogue and blocks not needing such. * ifcvt.c (dead_or_predicable): Test both flag_shrink_wrap and HAVE_simple_return. * bb-reorder.c (get_uncond_jump_length): Make global. * bb-reorder.h (get_uncond_jump_length): Declare. * cfgrtl.c (rtl_create_basic_block): Comment typo fix. (rtl_split_edge): Likewise. Warning fix. (rtl_duplicate_bb): New function. (rtl_cfg_hooks): Enable can_duplicate_block_p and duplicate_block. This (a revision in the range 181187:181189) broke build for cris-elf like so: See PR51051. Given that this also broke arm-linux-gnueabi, a primary platform, and Alan being absent until the 15th according to a message on IRC, I move to revert r181188. Is there a PR for the arm issue? It's covered by the same PR, see comment #1. I've now updated the target field. I think I need someone with appropriate write privileges to agree with that, and to also give 48h for someone to fix the problem. Sorry for not forthcoming on the second point. 
Did you or somebody else try to look into the problem? To decide whether it's the best course of action it would be nice to know if it's a simple error in the patch that is easy to fix. Nope, not really. Wouldn't FWIW, de jure matter, me not having write privileges to the affected area. Though, I had a quick look at the patch and nothing stood out except its intrusiveness, and it seems the patch wasn't tested on a !simple_return target (just powerpc-linux according to the replied-to message). Fair enough. You can count me as one then, and I'll defer to Bernd to either provide a fix or ack the revert. Thanks, Richard. brgds, H-P
Re: Revert PowerPC shrink-wrap support 3 of 3
On 11/10/11 13:14, Richard Guenther wrote:
Fair enough. You can count me as one then, and I'll defer to Bernd to either provide a fix or ack the revert.

I'm trying to track it down. In 189r.outof_cfglayout, we have

(insn 31 33 35 3 (use (reg/i:SI 0 r0)) ../../../../baseline-trunk/libstdc++-v3/libsupc++/new_opv.cc:34 -1 (nil))
;; Successors: EXIT [100.0%] (fallthru)
;; lr out 0 [r0] 11 [fp] 13 [sp] 14 [lr] 25 [sfp] 26 [afp]
;; live out 0 [r0] 11 [fp] 13 [sp] 25 [sfp] 26 [afp]

followed by a number of other basic blocks, so that looks wrong to me. outof_cfglayout seems to assume that fallthrough edges to the exit block are OK and don't need fixing up, and changing that seems nontrivial at first glance.

The situation is first created during cfgcleanup in into_cfglayout. The following patch makes the testcase compile by stopping the compiler from moving the exit fallthru block around, but I've not checked whether it has a negative effect on code quality. HP, can you run full tests?

Bernd

Index: ../baseline-trunk/gcc/cfgrtl.c
===
--- ../baseline-trunk/gcc/cfgrtl.c (revision 181252)
+++ ../baseline-trunk/gcc/cfgrtl.c (working copy)
@@ -2735,6 +2735,16 @@ cfg_layout_can_merge_blocks_p (basic_blo
   if (BB_PARTITION (a) != BB_PARTITION (b))
     return false;
 
+  /* If we would end up moving B's instructions, make sure it doesn't fall
+     through into the exit block, since we cannot recover from a fallthrough
+     edge into the exit block occurring in the middle of a function.  */
+  if (NEXT_INSN (BB_END (a)) != BB_HEAD (b))
+    {
+      edge e = find_fallthru_edge (b->succs);
+      if (e && e->dest == EXIT_BLOCK_PTR)
+        return false;
+    }
+
   /* There must be exactly one edge in between the blocks.  */
   return (single_succ_p (a) && single_succ (a) == b
[PATCH][rs6000] Fix warning building libgcc
Currently when building unwind-dw2.c for powerpc64 I see

In file included from ../../../libgcc/unwind-dw2.c:376:0:
./md-unwind-support.h: In function 'frob_update_context':
./md-unwind-support.h:371:8: warning: passing argument 3 of '_Unwind_SetGRPtr' makes pointer from integer without a cast [enabled by default]
../../../libgcc/unwind-dw2.c:281:1: note: expected 'void *' but argument is of type '_Unwind_Word'

looking at other places we cast such arguments to void *.

Ok?

Thanks,
Richard.

2011-11-10 Richard Guenther rguent...@suse.de

* config/rs6000/linux-unwind.h (frob_update_context): Properly cast the third argument of _Unwind_SetGRPtr to void *.

Index: libgcc/config/rs6000/linux-unwind.h
===
--- libgcc/config/rs6000/linux-unwind.h (revision 181252)
+++ libgcc/config/rs6000/linux-unwind.h (working copy)
@@ -368,7 +368,7 @@ frob_update_context (struct _Unwind_Cont
 	     before the bctrl so this is the first and only place we need to use the stored R2.  */
 	  _Unwind_Word sp = _Unwind_GetGR (context, 1);
-	  _Unwind_SetGRPtr (context, 2, sp + 40);
+	  _Unwind_SetGRPtr (context, 2, (void *)(sp + 40));
 	}
     }
 }
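
For readers who want to see the shape of the fix in isolation, here is a minimal sketch of the same integer-to-pointer cast outside libgcc. The names word_t, set_gr_ptr and record_toc_slot are hypothetical stand-ins for illustration only; they are not the declarations of _Unwind_Word or _Unwind_SetGRPtr, and the sketch assumes a flat address space where uintptr_t round-trips through void *.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an integer "word" type such as _Unwind_Word.  */
typedef uintptr_t word_t;

/* Hypothetical register-pointer table; index 2 plays the role of R2.  */
static void *saved_slot[4];

/* Hypothetical stand-in for a setter that expects a pointer argument.  */
static void
set_gr_ptr (int index, void *p)
{
  saved_slot[index] = p;
}

/* Compute an address from an integer stack pointer value and store it.
   Passing `sp + 40` directly would trigger the "makes pointer from
   integer without a cast" diagnostic, so the conversion is explicit.  */
static void *
record_toc_slot (word_t sp)
{
  set_gr_ptr (2, (void *) (sp + 40));
  return saved_slot[2];
}
```

The explicit cast does not change the generated address; it only states the integer-to-pointer conversion that the warning asks to be spelled out.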
Re: [RFA/ARM][Patch 01/02]: Thumb2 epilogue in RTL
On 28/09/11 17:15, Sameera Deshpande wrote: Hi! This patch generates Thumb2 epilogues in RTL form. The work involves defining new functions, predicates and patterns along with few changes in existing code: * The load_multiple_operation predicate was found to be too restrictive for integer loads as it required consecutive destination regs, so this restriction was lifted. * Variations of load_multiple_operation were required to handle cases - where SP must be the base register - where FP values were being loaded (which do require consecutive destination registers) - where PC can be in register-list (which requires return pattern along with register loads). Hence, the common code was factored out into a new function in arm.c and parameterised to show - whether consecutive destination regs are needed - the data type being loaded - whether the base register has to be SP - whether PC is in register-list The patch is tested with arm-eabi with no regressions. ChangeLog: 2011-09-28 Ian Bolton ian.bol...@arm.com Sameera Deshpande sameera.deshpa...@arm.com * config/arm/arm-protos.h (load_multiple_operation_p): New declaration. (thumb2_expand_epilogue): Likewise. (thumb2_output_return): Likewise (thumb2_expand_return): Likewise. (thumb_unexpanded_epilogue): Rename to... (thumb1_unexpanded_epilogue): ...this * config/arm/arm.c (load_multiple_operation_p): New function. (thumb2_emit_multi_reg_pop): Likewise. (thumb2_emit_vfp_multi_reg_pop): Likewise. (thumb2_expand_return): Likewise. (thumb2_expand_epilogue): Likewise. (thumb2_output_return): Likewise (thumb_unexpanded_epilogue): Rename to... ( thumb1_unexpanded_epilogue): ...this * config/arm/arm.md (pop_multiple_with_stack_update): New pattern. (pop_multiple_with_stack_update_and_return): Likewise. (thumb2_ldr_with_return): Likewise. (floating_point_pop_multiple_with_stack_update): Likewise. (return): Update condition and code for pattern. (arm_return): Likewise. (epilogue_insns): Likewise. 
* config/arm/predicates.md (load_multiple_operation): Update predicate. (load_multiple_operation_stack_and_return): New predicate. (load_multiple_operation_stack): Likewise. (load_multiple_operation_stack_fp): Likewise. * config/arm/thumb2.md (thumb2_return): Remove. (thumb2_rtl_epilogue_return): New pattern. - Thanks and regards, Sameera D. thumb2_rtl_epilogue_complete-27Sept.patch + if (GET_CODE (SET_SRC (elt = XVECEXP (op, 0, offset_adj))) == PLUS) It's generally best not to use assignments within conditionals unless there is a strong reason otherwise (that normally implies something like being deep within a condition test where you only want to update the variable if some pre-conditions are true and that can't be easily factored out). + != (unsigned int) (first_dest_regno + regs_per_val * (i - base Line length (split the line just before the '+' operator. + /* now show EVERY reg that will be restored, using a SET for each. */ Capital letter at start of sentence. Why is EVERY in caps? + saved_regs_mask = offsets-saved_regs_mask; + for (i = 0, num_regs = 0; i = LAST_ARM_REGNUM; i++) blank line before the for loop. + /* It's illegal to do a pop for only one reg, so generate an ldr. */ GCC coding standards suggest avoiding the use of 'illegal'. Suggest changing that to 'Pop can only be used for more than one reg; so...' +reg_names[REGNO (XEXP (XVECEXP (operands[0], 0, 2), 0))]); + +/* Skip over the first two elements and the one we just generated. */ +for (i = 3; i (num_saves); i++) + { +strcat (pattern, \, %|\); +strcat (pattern, +reg_names[REGNO (XEXP (XVECEXP (operands[0], 0, i), 0))]); + } + +strcat (pattern, \}\); +output_asm_insn (pattern, operands); + +return \\; + } + + [(set_attr type load4)] There's a lot of trailing white space here. Please remove. 
+(define_insn *thumb2_ldr_with_return + [(return) + (set (reg:SI PC_REGNUM) +(mem:SI (post_inc:SI (match_operand:SI 0 s_register_operand k] + TARGET_THUMB2 + ldr%?\t%|pc, [%0], #4 + [(set_attr type load1) + (set_attr predicable yes)] +) + This pattern doesn't seem to be used. What's its purpose? +static const struct { const char *const name; } table[] + = { {\d0\}, {\d1\}, {\d2\}, {\d3\}, I'm not keen on having this table. Generally the register names should be configurable depending on the assembler flavour and this patch defeats that. Is there any way to rewrite this code so that it can use the standard operand methods for generating register names? In summary, this is
Re: [RFA/ARM][Patch 01/02]: Thumb2 epilogue in RTL
Hi Richard, thanks for your comments. -- + if (GET_CODE (SET_SRC (elt = XVECEXP (op, 0, offset_adj))) == PLUS) It's generally best not to use assignments within conditionals unless there is a strong reason otherwise (that normally implies something like being deep within a condition test where you only want to update the variable if some pre-conditions are true and that can't be easily factored out). + != (unsigned int) (first_dest_regno + regs_per_val * (i - base Line length (split the line just before the '+' operator. + /* now show EVERY reg that will be restored, using a SET for each. */ Capital letter at start of sentence. Why is EVERY in caps? + saved_regs_mask = offsets-saved_regs_mask; + for (i = 0, num_regs = 0; i = LAST_ARM_REGNUM; i++) blank line before the for loop. + /* It's illegal to do a pop for only one reg, so generate an ldr. */ GCC coding standards suggest avoiding the use of 'illegal'. Suggest changing that to 'Pop can only be used for more than one reg; so...' +reg_names[REGNO (XEXP (XVECEXP (operands[0], 0, 2), 0))]); + +/* Skip over the first two elements and the one we just generated. */ +for (i = 3; i (num_saves); i++) + { +strcat (pattern, \, %|\); +strcat (pattern, +reg_names[REGNO (XEXP (XVECEXP (operands[0], 0, i), 0))]); + } + +strcat (pattern, \}\); +output_asm_insn (pattern, operands); + +return \\; + } + + [(set_attr type load4)] There's a lot of trailing white space here. Please remove. Removed white spaces in reworked patch http://gcc.gnu.org/ml/gcc-patches/2011-11/msg01009.html +(define_insn *thumb2_ldr_with_return + [(return) + (set (reg:SI PC_REGNUM) +(mem:SI (post_inc:SI (match_operand:SI 0 s_register_operand k] + TARGET_THUMB2 + ldr%?\t%|pc, [%0], #4 + [(set_attr type load1) + (set_attr predicable yes)] +) + This pattern doesn't seem to be used. What's its purpose? 
This pattern is generated from thumb2_expand_return in + if (num_regs == 1) +{ + rtx par = gen_rtx_PARALLEL (VOIDmode, rtvec_alloc (2)); + rtx reg = gen_rtx_REG (SImode, PC_REGNUM); + rtx addr = gen_rtx_MEM (SImode, + gen_rtx_POST_INC (SImode, + stack_pointer_rtx)); + set_mem_alias_set (addr, get_frame_alias_set ()); + XVECEXP (par, 0, 0) = ret_rtx; + XVECEXP (par, 0, 1) = gen_rtx_SET (SImode, reg, addr); + RTX_FRAME_RELATED_P (par) = 1; + emit_jump_insn (par); +} +static const struct { const char *const name; } table[] + = { {\d0\}, {\d1\}, {\d2\}, {\d3\}, I'm not keen on having this table. Generally the register names should be configurable depending on the assembler flavour and this patch defeats that. Is there any way to rewrite this code so that it can use the standard operand methods for generating register names? The updated patch was resent after comments from Ramana and Paul which eliminates this table. http://gcc.gnu.org/ml/gcc-patches/2011-11/msg01009.html I will take care of other formatting issues and will resend the patch. In summary, this is mostly OK, apart from the last two items. R. - Thanks and regards, Sameera D.
Re: [PATCH] Fix combine's simplify_comparison (PR rtl-optimization/51023, take 3)
On 11/09/11 18:12, Jakub Jelinek wrote: So here is hopefully last iteration of that. Negative constants that trunc_int_for_mode to the same value are IMHO just fine too, similarly for ZERO_EXTEND 0x for HImode should be fine too. On the other side, if mode is DImode and outer mode of ZERO_EXTEND is TImode, if const_op had the highest bit set, it would mean it is considered to be 65 1s followed by 63 other bits. It is hard to construct testcases for these (except for the failure), because apparently the FE is already narrowing the comparisons if the constant is in range. Ok for trunk if this bootstraps/regtests? Ok. Can you please look at the simplify_set hunk? Thanks. That could safely just use the same kind of change, couldn't it? Preapproved as well. Bernd
Re: [Patch 001] [x86 backend] Define march/mtune for upcoming AMD Bulldozer procesor.
Hello! This patch defines -march=bdver1 and -mtune=bdver1 flag for the upcoming AMD Bulldozer processor. Hi, it seems that bdver/btver is not mentioned in invoke.texi nor changes.html. Could you please add documentation? Honza
[PATCH] Fix PR51042
This fixes PR51042. Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk. Richard. 2011-11-10 Richard Guenther rguent...@suse.de PR tree-optimization/51042 * tree-ssa-pre.c (phi_translate_1): Avoid recursing on self-referential expressions. Refactor code to avoid duplication. * gcc.dg/torture/pr51042.c: New testcase. Index: gcc/tree-ssa-pre.c === *** gcc/tree-ssa-pre.c (revision 181252) --- gcc/tree-ssa-pre.c (working copy) *** phi_translate_1 (pre_expr expr, bitmap_s *** 1527,1533 tree newvuse = vuse; VEC (vn_reference_op_s, heap) *newoperands = NULL; bool changed = false, same_valid = true; ! unsigned int i, j; vn_reference_op_t operand; vn_reference_t newref; --- 1527,1533 tree newvuse = vuse; VEC (vn_reference_op_s, heap) *newoperands = NULL; bool changed = false, same_valid = true; ! unsigned int i, j, n; vn_reference_op_t operand; vn_reference_t newref; *** phi_translate_1 (pre_expr expr, bitmap_s *** 1536,1635 { pre_expr opresult; pre_expr leader; ! tree oldop0 = operand-op0; ! tree oldop1 = operand-op1; ! tree oldop2 = operand-op2; ! tree op0 = oldop0; ! tree op1 = oldop1; ! tree op2 = oldop2; tree type = operand-type; vn_reference_op_s newop = *operand; ! ! if (op0 TREE_CODE (op0) == SSA_NAME) { ! unsigned int op_val_id = VN_INFO (op0)-value_id; ! leader = find_leader_in_sets (op_val_id, set1, set2); ! opresult = phi_translate (leader, set1, set2, pred, phiblock); ! if (opresult opresult != leader) { ! tree name = get_representative_for (opresult); ! if (!name) break; ! op0 = name; } ! else if (!opresult) ! break; ! } ! changed |= op0 != oldop0; ! ! if (op1 TREE_CODE (op1) == SSA_NAME) ! { ! unsigned int op_val_id = VN_INFO (op1)-value_id; leader = find_leader_in_sets (op_val_id, set1, set2); ! opresult = phi_translate (leader, set1, set2, pred, phiblock); ! if (opresult opresult != leader) { ! tree name = get_representative_for (opresult); ! if (!name) break; ! op1 = name; } - else if (!opresult) - break; } ! 
/* We can't possibly insert these. */ ! else if (op1 !is_gimple_min_invariant (op1)) ! break; ! changed |= op1 != oldop1; ! if (op2 TREE_CODE (op2) == SSA_NAME) { ! unsigned int op_val_id = VN_INFO (op2)-value_id; ! leader = find_leader_in_sets (op_val_id, set1, set2); ! opresult = phi_translate (leader, set1, set2, pred, phiblock); ! if (opresult opresult != leader) ! { ! tree name = get_representative_for (opresult); ! if (!name) ! break; ! op2 = name; ! } ! else if (!opresult) ! break; } - /* We can't possibly insert these. */ - else if (op2 !is_gimple_min_invariant (op2)) - break; - changed |= op2 != oldop2; - if (!newoperands) newoperands = VEC_copy (vn_reference_op_s, heap, operands); /* We may have changed from an SSA_NAME to a constant */ ! if (newop.opcode == SSA_NAME TREE_CODE (op0) != SSA_NAME) ! newop.opcode = TREE_CODE (op0); newop.type = type; ! newop.op0 = op0; ! newop.op1 = op1; ! newop.op2 = op2; /* If it transforms a non-constant ARRAY_REF into a constant one, adjust the constant offset. */ if (newop.opcode == ARRAY_REF newop.off == -1 !TREE_CODE (op0) == INTEGER_CST !TREE_CODE (op1) == INTEGER_CST !TREE_CODE (op2) == INTEGER_CST) { ! double_int off = tree_to_double_int (op0); off = double_int_add (off, double_int_neg ! (tree_to_double_int (op1))); ! off = double_int_mul (off, tree_to_double_int (op2)); if (double_int_fits_in_shwi_p (off)) newop.off = off.low; } VEC_replace (vn_reference_op_s, newoperands, j,
Re: RFT: Fix PR middle/end-40154
No, it isn't. Expanders call other expanders to do fancy stuff. When everything is done, they tag a REG_EQUAL note on the last insn. One of the purposes of set_unique_reg_note is to lubricate this process: the layered expanders can add multiple REG_EQUAL notes. We only want the most high-level note, so we discard the previous one from the next lower level. Well, we realize now that we only want the most high-level note that makes sense, actually. We either have to make it make sense, or abandon adding it.

This would have been a valid design 20 years ago when notes were implemented. But, for the past couple of decades, set_unique_reg_note hasn't touched the datum and all bugs over the years in this area have been fixed in the caller. So this one should be fixed the same way, I see no reason to special-case it.

If every expander has to analyze the instructions that have been issued to figure out if the new note needs to be modified, or cannot be applied at all, you end up with umpteen duplications of the checks I added to set_unique_reg_note prepended to its call sites.

Yes, expanders must know what they're doing, but this isn't new.

-- Eric Botcazou
Current Solaris/x86 libitm results
With your last patches, I get decent test results for libitm on Solaris 11/x86, both with Sun as/ld and gas/Sun ld: === libitm tests === Running target unix FAIL: libitm.c/clone-1.c execution test FAIL: libitm.c/memcpy-1.c execution test FAIL: libitm.c/memset-1.c execution test WARNING: libitm.c++/static_ctor.C compilation failed to produce executable === libitm Summary for unix === # of expected passes20 # of unexpected failures3 # of expected failures 5 Running target unix/-m64 FAIL: libitm.c/clone-1.c execution test FAIL: libitm.c++/eh-1.C execution test WARNING: libitm.c++/static_ctor.C compilation failed to produce executable === libitm Summary for unix/-m64 === # of expected passes21 # of unexpected failures2 # of expected failures 5 === libitm Summary === # of expected passes41 # of unexpected failures5 # of expected failures 10 With the exception of the 64-bit libitm.c++/eh-1.C failure, they match what I see on x86_64-unknown-linux-gnu. This also means the current lack of CFI support in Sun as doesn't make a difference. To check the worst possible case, I've also tried Solaris 8/x86, also with Sun as and gas, but ran into a couple of problems with Sun as: * It doesn't understand .hidden (and older versions of ld may not, either, cf. gcc/configure.ac (gcc_cv_as_hidden). One would probably have to move the test to a common place to avoid duplicating it. For the moment, I've just #if 0'ed the .hidden. * Later, libitm.so fails to link: Text relocation remains referenced against symbol offset in file GTM_begin_transaction 0x20.libs/sjlj.o ld: fatal: relocations remain against allocatable but non-writable sections which is true, but doesn't happen on S8 with gas or S11 with either as or gas. I've cheated, removed the -z text from libtool, and linked with -mimpure-text. With those hacks, I can link libitm.so, but all execution tests fail to link due to PR middle-end/50598. 
With gas and Sun ld, on the other hand, testsuite results are good: === libitm tests === Running target unix FAIL: libitm.c/clone-1.c execution test FAIL: libitm.c/memcpy-1.c execution test FAIL: libitm.c/memset-1.c execution test FAIL: libitm.c++/eh-1.C execution test WARNING: libitm.c++/static_ctor.C compilation failed to produce executable === libitm Summary === # of expected passes19 # of unexpected failures4 # of expected failures 5 Only the failure of 32-bit libitm.c++/eh-1.C differs from the Solaris 11 results above. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [Patch] Move Objective-C runtime flags to modern options system.
On Nov 10, 2011, at 1:35 AM, Richard Guenther rguent...@suse.de wrote: flag_exceptions also triggers middle-end behavior - without it no statement can possibly throw Actually, one version of exception handling for objective c++ doesn't require flag_exceptions... One can indeed @throw without it, they just can't throw without it.
Re: [PATCH] PR target/50038 fix: redundant zero extensions removal
Initial aim of the pass was to remove zero extensions redundant due to implicit zero extension in x64. But implementation actually uses generic approach and seems like a mini-combiner. Pass may combine two zero extends or combine zero extend with a constant as a special case but in other cases we just try to merge two instructions and then check we have corresponding template. It can be easily adapted to remove all redundant extensions. So, byte add in the example will be merged with zero extend only if we have explicit template for it in machine model.

OK.

In this particular test case combiner may also help because we have byte memory load and extend on combiner pass. But due to some reason it does not merge them. In combiner dump I see

(insn 39 38 40 4 (set (reg/v:QI 81 [ xr ])
        (mem:QI (reg/v/f:DI 111 [ ImageInPtr ]) [0 MEM[base: ImageInPtr_29, offset: 0B]+0 S1 A8])) 1.c:9 66 {*movqi_internal}
     (nil))
(insn 43 42 44 4 (parallel [
            (set (reg:SI 116 [ xr ])
                (zero_extend:SI (reg/v:QI 81 [ xr ])))
            (clobber (reg:CC 17 flags))
        ]) 1.c:11 121 {*zero_extendqisi2_movzbl_and}
     (expr_list:REG_DEAD (reg/v:QI 81 [ xr ])
        (expr_list:REG_UNUSED (reg:CC 17 flags)
            (nil))))

The pseudo-register (reg/v/f:DI 111) is changed between insn 39 and insn 43 so can_combine_p returns 0.

-- Eric Botcazou
Re: [PATCH] PR target/50038 fix: redundant zero extensions removal
So, what about the patch? I think since we already have zee patch it would be great to use it as more general optimization. I tested it on EEMBC 2.0 on Atom and it showed 1% performance gain in geomean on 32 bit which is really good for such simple optimization. For OOO archs patch is not so critical but still makes code cleaner.

The patch cannot be accepted as-is since it doesn't update a single bit of the documentation present in implicit-zee.c. The authors have made the effort of thoroughly documenting their code so it shouldn't be wasted. Therefore, at a minimum, the documentation must be overhauled the same way the code will be.

I agree that the numbers are encouraging. Moreover, the narrow specialization of the pass was criticized when it was added so a generalization will probably be welcome. So, unless other developers object, let's do it, but correctly, that is to say, let's rename the pass, eliminate all the hardcoded references to implicit zero-extensions in the code and turn it into a generic elimination of redundant extensions pass.

-- Eric Botcazou
Re: [Bug rtl-optimization/51040] atomic_fetch_nand issue
On 11/09/2011 02:15 PM, Andrew MacLeod wrote:
NAND patchup arithmetic was missing the 2 stage AND then NOT operation. Instead it was falling into the same sequence as every other operation and trying to perform a binary operation on a NOT. I managed to modify an existing testcase to trigger the bug without requiring a configuration with RTL checking enabled. Bootstrapped on x86_64-unknown-linux-gnu with no new regressions (pending completion of test run) OK for mainline?

ok, here's an adjusted patch to use positive tests for the NOT condition

Andrew

PR rtl-optimization/51040
* optabs.c (expand_atomic_fetch_op): Patchup code for NAND should be AND followed by NOT.
* builtins.c (expand_builtin_atomic_fetch_op): Patchup code for NAND should be AND followed by NOT.
* testsuite/gcc.dg/atomic-noinline[-aux].c: Test no-inline NAND and patchup code.

Index: optabs.c === *** optabs.c(revision 181206) --- optabs.c(working copy) *** expand_atomic_fetch_op (rtx target, rtx *** 7875,7882 Fetch_before == after REVERSE_OP val. */ if (!after) code = optab.reverse_code; ! result = expand_simple_binop (mode, code, result, val, target, true, ! OPTAB_LIB_WIDEN); return result; } } --- 7875,7889 Fetch_before == after REVERSE_OP val. */ if (!after) code = optab.reverse_code; ! if (code == NOT) ! { ! result = expand_simple_binop (mode, AND, result, val, NULL_RTX, ! true, OPTAB_LIB_WIDEN); ! result = expand_simple_unop (mode, NOT, result, target, true); ! } ! else ! result = expand_simple_binop (mode, code, result, val, target, ! true, OPTAB_LIB_WIDEN); return result; } } Index: builtins.c === *** builtins.c (revision 181206) --- builtins.c (working copy) *** expand_builtin_atomic_fetch_op (enum mac *** 5460,5467 /* Then issue the arithmetic correction to return the right result. */ if (!ignore) ! ret = expand_simple_binop (mode, code, ret, val, NULL_RTX, true, ! OPTAB_LIB_WIDEN); return ret; } --- 5460,5476 /* Then issue the arithmetic correction to return the right result.
*/ if (!ignore) ! { ! if (code == NOT) ! { ! ret = expand_simple_binop (mode, AND, ret, val, NULL_RTX, true, !OPTAB_LIB_WIDEN); ! ret = expand_simple_unop (mode, NOT, ret, target, true); ! } ! else ! ret = expand_simple_binop (mode, code, ret, val, target, true, ! OPTAB_LIB_WIDEN); ! } return ret; } Index: testsuite/gcc.dg/atomic-noinline.c === *** testsuite/gcc.dg/atomic-noinline.c (revision 181206) --- testsuite/gcc.dg/atomic-noinline.c (working copy) *** main () *** 49,54 --- 49,61 if (__atomic_is_lock_free (4, 0) != 10) abort (); + /* PR 51040 was caused by arithmetic code not patching up nand_fetch properly + when used an an external function. Look for proper return value here. */ + ac = 0x3C; + bc = __atomic_nand_fetch (&ac, 0x0f, __ATOMIC_RELAXED); + if (bc != ac) + abort (); + return 0; } Index: testsuite/gcc.dg/atomic-noinline-aux.c === *** testsuite/gcc.dg/atomic-noinline-aux.c (revision 181206) --- testsuite/gcc.dg/atomic-noinline-aux.c (working copy) *** char __atomic_fetch_add_1 (char *p, char *** 40,50 *p = 1; } ! short __atomic_fetch_add_2 (short *p, short v, short i) { *p = 1; } int __atomic_is_lock_free (int i, void *p) { return 10; --- 40,63 *p = 1; } ! short __atomic_fetch_add_2 (short *p, short v, int i) { *p = 1; } + /* Really perform a NAND. PR51040 showed incorrect calculation of a +non-inlined fetch_nand. */ + unsigned char + __atomic_fetch_nand_1 (unsigned char *p, unsigned char v, int i) + { + unsigned char ret; + + ret = *p; + *p = ~(*p & v); + + return ret; + } + int __atomic_is_lock_free (int i, void *p) { return 10;
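
The two-stage patch-up described in this message can be illustrated in plain C, independent of the RTL expanders. This is only a sketch: fetch_nand_u8 and nand_fetch_u8 are hypothetical helper names that model a library-provided fetch_nand and the arithmetic correction applied to its result; they are not GCC internals and they are not atomic.

```c
/* Hypothetical model of an out-of-line fetch_nand: store ~(*p & v) and
   return the value that was in *p beforehand.  Not atomic; it only
   models the arithmetic.  */
static unsigned char
fetch_nand_u8 (unsigned char *p, unsigned char v)
{
  unsigned char before = *p;

  *p = (unsigned char) ~(before & v);
  return before;
}

/* Derive the nand_fetch result (the value after the operation) from
   the fetch_nand result (the value before).  NAND is not a single
   binary operation on the old value: it takes an AND followed by a
   NOT, which is the two-stage correction the patch adds.  */
static unsigned char
nand_fetch_u8 (unsigned char *p, unsigned char v)
{
  unsigned char before = fetch_nand_u8 (p, v);
  unsigned char anded = (unsigned char) (before & v);   /* stage 1: AND */

  return (unsigned char) ~anded;                        /* stage 2: NOT */
}
```

With the values used in the new testcase (0x3C and 0x0f), the AND gives 0x0C and the NOT gives 0xF3 in an unsigned char, so the value returned by the patched-up nand_fetch equals the value left in memory.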
Re: [Bug rtl-optimization/51040] atomic_fetch_nand issue
On 11/10/2011 08:29 AM, Andrew MacLeod wrote: PR rtl-optimization/51040 * optabs.c (expand_atomic_fetch_op): Patchup code for NAND should be AND followed by NOT. * builtins.c (expand_builtin_atomic_fetch_op): Patchup code for NAND should be AND followed by NOT. * testsuite/gcc.dg/atomic-noinline[-aux].c: Test no-inline NAND and patchup code. Ok. r~
[PATCH] pr51038 atomic_flag on targets with no atomic support.
The issue here is no atomic support whatsoever. The standard now *requires* that atomic_flag be implementable in a lock free manner for compliance. That means they must resolve to something, and not an external library call.

In order to support atomic_flag in a lock free manner on a target, we need at a minimum the legacy __sync_lock_test_and_set and __sync_lock_release to be implemented. Previous to this release, if atomic_flag couldn't be implemented lock free, it was implemented with locks. libstdc++-v3 no longer supports any locked implementations.

We'll have the same problem with C1x next release as well, so I bit the bullet and added __atomic_test_and_set and __atomic_clear as built-ins. These routines will first try to use lock free exchange and store. Failing that, legacy __sync_lock_test_and_set and __sync_lock_release are tried. If those fail as well, then we simply default to performing a load and/or store as is required.

Currently I don't issue any warnings because we don't have a good way of saying we are running in a single threaded model. When we add the -fmemory-model=single flag (probably next release) I think we should issue a warning that atomics with no support are being used in a non-single threaded environment.

This bootstraps with no new regressions on x86_64-unknown-linux-gnu. I also built a cris-elf compiler and looked at the output from atomic-flag.c, and there were no external calls in it, so hopefully it resolves the issue there... care to give it a try and verify?

Andrew

libstdc++-v3
* include/bits/atomic_base.h (atomic_thread_fence): Call built-in.
(atomic_signal_fence): Call built-in.
(test_and_set, clear): Call new atomic built-ins.

gcc
* builtins.c (expand_builtin_atomic_clear): New. Expand atomic_clear.
(expand_builtin_atomic_test_and_set): New. Expand atomic test_and_set.
(expand_builtin): Add cases for test_and_set and clear.
* sync-builtins.def (BUILT_IN_ATOMIC_TEST_AND_SET): New.
(BUILT_IN_ATOMIC_CLEAR): New.
testsuite * gcc.dg/atomic-invalid.c: Add test for invalid __atomic_clear models. * gcc.dg/atomic-flag.c: New. Test __atomic_test_and_set and __atomic_clear. Index: libstdc++-v3/include/bits/atomic_base.h === *** libstdc++-v3/include/bits/atomic_base.h (revision 181119) --- libstdc++-v3/include/bits/atomic_base.h (working copy) *** _GLIBCXX_BEGIN_NAMESPACE_VERSION *** 68,78 return __mo2; } ! void ! atomic_thread_fence(memory_order __m) noexcept; ! void ! atomic_signal_fence(memory_order __m) noexcept; /// kill_dependency templatetypename _Tp --- 68,84 return __mo2; } ! inline void ! atomic_thread_fence(memory_order __m) noexcept ! { ! __atomic_thread_fence (__m); ! } ! inline void ! atomic_signal_fence(memory_order __m) noexcept ! { ! __atomic_thread_fence (__m); ! } /// kill_dependency templatetypename _Tp *** _GLIBCXX_BEGIN_NAMESPACE_VERSION *** 261,295 bool test_and_set(memory_order __m = memory_order_seq_cst) noexcept { ! /* The standard *requires* this to be lock free. If exchange is not !always lock free, the resort to the old test_and_set. */ ! if (__atomic_always_lock_free (sizeof (_M_i), 0)) ! return __atomic_exchange_n(_M_i, 1, __m); ! else ! { ! /* Sync test and set is only guaranteed to be acquire. */ ! if (__m == memory_order_seq_cst || __m == memory_order_release ! || __m == memory_order_acq_rel) ! atomic_thread_fence (__m); ! return __sync_lock_test_and_set (_M_i, 1); ! } } bool test_and_set(memory_order __m = memory_order_seq_cst) volatile noexcept { ! /* The standard *requires* this to be lock free. If exchange is not !always lock free, the resort to the old test_and_set. */ ! if (__atomic_always_lock_free (sizeof (_M_i), 0)) ! return __atomic_exchange_n(_M_i, 1, __m); ! else ! { ! /* Sync test and set is only guaranteed to be acquire. */ ! if (__m == memory_order_seq_cst || __m == memory_order_release ! || __m == memory_order_acq_rel) ! atomic_thread_fence (__m); ! return __sync_lock_test_and_set (_M_i, 1); ! 
} } void --- 267,279 bool test_and_set(memory_order __m = memory_order_seq_cst) noexcept { ! return __atomic_test_and_set (_M_i, __m); } bool test_and_set(memory_order __m = memory_order_seq_cst) volatile noexcept { ! return __atomic_test_and_set (_M_i, __m); } void *** _GLIBCXX_BEGIN_NAMESPACE_VERSION *** 299,315 __glibcxx_assert(__m !=
Status of libitm on Tru64 UNIX
Just for fun, I gave libitm a try on alpha-dec-osf5.1b, too. Here's what I found:

* config/alpha/sjlj.S needs trivial changes for the non-ELF/non-Linux platform.

* Initially, all C tests were failing like this:

  333619:./simple-1.exe: /sbin/loader: Error: libitm.so.0: symbol _ZnamRKSt9nothrow_t unresolved
  333619:./simple-1.exe: /sbin/loader: Fatal Error: Load of ./simple-1.exe failed: Unresolved symbol name

  This happens because the platform doesn't support weak definitions (i.e. an extern symbol declared weak working as if defined with a NULL value); you need to provide a dummy implementation instead. To check if it works, I'm testing __osf__ for now, but a test like the following can be used to construct a configure test:

    void weakdef (void) __attribute__ ((weak));

    int
    main (void)
    {
      if (weakdef != 0)
        weakdef ();
      return 0;
    }

  It fails to link on osf. libgfortran/acinclude.m4 (LIBGFOR_GTHREAD_WEAK) has another test, but mostly uses a blacklist approach instead.

With those changes, libitm.so built, but with loads of problems in the testsuite:

* Most tests fail with an ICE:

  FAIL: libitm.c/cancel.c (internal compiler error)
  FAIL: libitm.c/cancel.c (test for excess errors)
  Excess errors:
  /vol/gcc/src/hg/trunk/local/libitm/testsuite/libitm.c/cancel.c:55:1: internal compiler error: in default_no_named_section, at varasm.c:6293
  WARNING: libitm.c/cancel.c compilation failed to produce executable

  I couldn't determine where tm tries to use named sections (which are missing in ECOFF), since even gdb 7.3.1 SEGVs on cc1. This may be due to being compiled as C++.

* All tests using pthread.h need an explicit -pthread:

  FAIL: libitm.c/notx.c (test for excess errors)
  Excess errors:
  /var/gcc/regression/trunk/5.1b-gcc/build/gcc/include-fixed/pthread.h:1427:4: error: #error Please compile the module including pthread.h with -pthread
  WARNING: libitm.c/notx.c compilation failed to produce executable

  This patch adds this unconditionally.

Even with those changes, I run into PR middle-end/50598, just as on Solaris 8. Rainer 2011-11-09 Rainer Orth r...@cebitec.uni-bielefeld.de libitm: * config/alpha/sjlj.S (_ITM_beginTransaction) [!__ELF__]: Don't use .hidden. (.note.GNU-stack): Only use if __linux__. * alloc_cpp.cc [!HAVE_WEAKDEF] (_ZnaXRKSt9nothrow_t): Dummy function. * testsuite/libitm.c/notx.c: Use dg-options -pthread. * testsuite/libitm.c/reentrant.c: Likewise. * testsuite/libitm.c/simple-2.c: Likewise. * testsuite/libitm.c/txrelease.c: Likewise. * testsuite/libitm.c++/static_ctor.C: Likewise. diff --git a/libitm/alloc_cpp.cc b/libitm/alloc_cpp.cc --- a/libitm/alloc_cpp.cc +++ b/libitm/alloc_cpp.cc @@ -60,6 +60,14 @@ extern void _ZdlPvRKSt9nothrow_t (void * extern void *_ZnaXRKSt9nothrow_t (size_t, c_nothrow_p) __attribute__((weak)); extern void _ZdaPvRKSt9nothrow_t (void *, c_nothrow_p) __attribute__((weak)); +#ifdef __osf__ /* Really: !HAVE_WEAKDEF */ +void * +_ZnaXRKSt9nothrow_t (size_t, c_nothrow_p) +{ + return NULL; +} +#endif /* __osf__ */ + /* Wrap the delete nothrow symbols for usage with a single argument. Perhaps should have a configure type check for this, because the std::nothrow_t reference argument is unused (empty class), and most diff --git a/libitm/config/alpha/sjlj.S b/libitm/config/alpha/sjlj.S --- a/libitm/config/alpha/sjlj.S +++ b/libitm/config/alpha/sjlj.S @@ -74,7 +74,9 @@ _ITM_beginTransaction: .align 4 .globl GTM_longjmp +#ifdef __ELF__ .hidden GTM_longjmp +#endif .ent GTM_longjmp GTM_longjmp: @@ -105,4 +107,6 @@ GTM_longjmp: ret .end GTM_longjmp +#ifdef __linux__ .section .note.GNU-stack, , @progbits +#endif diff --git a/libitm/testsuite/libitm.c++/static_ctor.C b/libitm/testsuite/libitm.c++/static_ctor.C --- a/libitm/testsuite/libitm.c++/static_ctor.C +++ b/libitm/testsuite/libitm.c++/static_ctor.C @@ -1,4 +1,5 @@ /* { dg-do run } */ +/* { dg-options -pthread } */ /* { dg-xfail-if { *-*-* } { * } { } } */ /* Tests static constructors inside of transactional code. 
*/ diff --git a/libitm/testsuite/libitm.c/notx.c b/libitm/testsuite/libitm.c/notx.c --- a/libitm/testsuite/libitm.c/notx.c +++ b/libitm/testsuite/libitm.c/notx.c @@ -1,5 +1,8 @@ /* These tests all check whether initialization happens properly even if no transaction has been used in the current thread yet. */ + +/* { dg-options -pthread } */ + #include stdlib.h #include pthread.h #include libitm.h diff --git a/libitm/testsuite/libitm.c/reentrant.c b/libitm/testsuite/libitm.c/reentrant.c --- a/libitm/testsuite/libitm.c/reentrant.c +++ b/libitm/testsuite/libitm.c/reentrant.c @@ -1,4 +1,5 @@ /* { dg-do run { xfail *-*-* } } +/* { dg-options -pthread } */ /* Tests that new transactions can be started from both transaction_pure and transaction_unsafe code. This also requires proper handling of reentrant diff --git
Re: [PATCH] pr51038 atomic_flag on targets with no atomic support.
On 11/10/2011 08:35 AM, Andrew MacLeod wrote:
> Currently I don't issue any warnings
...
> +      /* Otherwise issue the store and a warning.  */
> +      warning_at (loc, 0,
> +                  "__atomic_clear used on target with no atomic support");

> +  __atomic_clear (a, __ATOMIC_RELAXED); /* { dg-warning "__atomic_clear used on target with no atomic support" { target cris-*-elf } } */

What are those then? And, obviously the cris test should be an effective target test.

r~
Re: [PATCH] pr51038 atomic_flag on targets with no atomic support.
On 11/10/2011 11:47 AM, Richard Henderson wrote:
> On 11/10/2011 08:35 AM, Andrew MacLeod wrote:
>> Currently I don't issue any warnings
> ...
> What are those then? And, obviously the cris test should be an effective target test.

Oh, those are gone; I must not have re-svn'd. Just a minute.

Andrew
Re: Status of libitm on Tru64 UNIX
On 11/10/2011 08:42 AM, Rainer Orth wrote:
> libitm:
> 	* config/alpha/sjlj.S (_ITM_beginTransaction) [!__ELF__]: Don't use .hidden.
> 	(.note.GNU-stack): Only use if __linux__.
> 	* alloc_cpp.cc [!HAVE_WEAKDEF] (_ZnaXRKSt9nothrow_t): Dummy function.
> 	* testsuite/libitm.c/notx.c: Use dg-options -pthread.
> 	* testsuite/libitm.c/reentrant.c: Likewise.
> 	* testsuite/libitm.c/simple-2.c: Likewise.
> 	* testsuite/libitm.c/txrelease.c: Likewise.
> 	* testsuite/libitm.c++/static_ctor.C: Likewise.

Ok. I'll try to do something about the weakdef thing.

r~
Re: [PATCH] pr51038 atomic_flag on targets with no atomic support.
On 11/10/2011 11:48 AM, Andrew MacLeod wrote: On 11/10/2011 11:47 AM, Richard Henderson wrote: On 11/10/2011 08:35 AM, Andrew MacLeod wrote: Currently I don't issue any warnings ... What are those then? And, obviously the cris test should be an effective target test. Oh, those are gone, I must not have re-svn'd Justa minute Andrew doh. sorry about that Andrew libstdc++-v3 * include/bits/atomic_base.h (atomic_thread_fence): Call built-in. (atomic_signal_fence): Call built-in. (test_and_set, clear): Call new atomic built-ins. gcc * builtins.c (expand_builtin_atomic_clear): New. Expand atomic_clear. (expand_builtin_atomic_test_and_set): New. Expand atomic test_and_set. (expand_builtin): Add cases for test_and_set and clear. * sync-builtins.def (BUILT_IN_ATOMIC_TEST_AND_SET): New. (BUILT_IN_ATOMIC_CLEAR): New. testsuite * gcc.dg/atomic-invalid.c: Add test for invalid __atomic_clear models. * gcc.dg/atomic-flag.c: New. Test __atomic_test_and_set and __atomic_clear. Index: libstdc++-v3/include/bits/atomic_base.h === *** libstdc++-v3/include/bits/atomic_base.h (revision 181119) --- libstdc++-v3/include/bits/atomic_base.h (working copy) *** _GLIBCXX_BEGIN_NAMESPACE_VERSION *** 68,78 return __mo2; } ! void ! atomic_thread_fence(memory_order __m) noexcept; ! void ! atomic_signal_fence(memory_order __m) noexcept; /// kill_dependency templatetypename _Tp --- 68,84 return __mo2; } ! inline void ! atomic_thread_fence(memory_order __m) noexcept ! { ! __atomic_thread_fence (__m); ! } ! inline void ! atomic_signal_fence(memory_order __m) noexcept ! { ! __atomic_thread_fence (__m); ! } /// kill_dependency templatetypename _Tp *** _GLIBCXX_BEGIN_NAMESPACE_VERSION *** 261,295 bool test_and_set(memory_order __m = memory_order_seq_cst) noexcept { ! /* The standard *requires* this to be lock free. If exchange is not !always lock free, the resort to the old test_and_set. */ ! if (__atomic_always_lock_free (sizeof (_M_i), 0)) ! return __atomic_exchange_n(_M_i, 1, __m); ! else ! { ! 
/* Sync test and set is only guaranteed to be acquire. */ ! if (__m == memory_order_seq_cst || __m == memory_order_release ! || __m == memory_order_acq_rel) ! atomic_thread_fence (__m); ! return __sync_lock_test_and_set (_M_i, 1); ! } } bool test_and_set(memory_order __m = memory_order_seq_cst) volatile noexcept { ! /* The standard *requires* this to be lock free. If exchange is not !always lock free, the resort to the old test_and_set. */ ! if (__atomic_always_lock_free (sizeof (_M_i), 0)) ! return __atomic_exchange_n(_M_i, 1, __m); ! else ! { ! /* Sync test and set is only guaranteed to be acquire. */ ! if (__m == memory_order_seq_cst || __m == memory_order_release ! || __m == memory_order_acq_rel) ! atomic_thread_fence (__m); ! return __sync_lock_test_and_set (_M_i, 1); ! } } void --- 267,279 bool test_and_set(memory_order __m = memory_order_seq_cst) noexcept { ! return __atomic_test_and_set (_M_i, __m); } bool test_and_set(memory_order __m = memory_order_seq_cst) volatile noexcept { ! return __atomic_test_and_set (_M_i, __m); } void *** _GLIBCXX_BEGIN_NAMESPACE_VERSION *** 299,315 __glibcxx_assert(__m != memory_order_acquire); __glibcxx_assert(__m != memory_order_acq_rel); ! /* The standard *requires* this to be lock free. If store is not always !lock free, the resort to the old style __sync_lock_release. */ ! if (__atomic_always_lock_free (sizeof (_M_i), 0)) ! __atomic_store_n(_M_i, 0, __m); ! else ! { ! __sync_lock_release (_M_i, 0); ! /* __sync_lock_release is only guaranteed to be a release barrier. */ ! if (__m == memory_order_seq_cst) ! atomic_thread_fence (__m); ! } } void --- 283,289 __glibcxx_assert(__m != memory_order_acquire); __glibcxx_assert(__m != memory_order_acq_rel); ! __atomic_clear (_M_i, __m); } void *** _GLIBCXX_BEGIN_NAMESPACE_VERSION *** 319,335 __glibcxx_assert(__m != memory_order_acquire); __glibcxx_assert(__m != memory_order_acq_rel); ! /* The standard *requires* this to be lock free. 
If store is not always !lock free, the resort to the old style __sync_lock_release. */ ! if (__atomic_always_lock_free (sizeof (_M_i), 0)) !
Re: [PATCH] [Annotalysis] Support trylock attributes on virtual methods.
On 11-11-08 13:11, Delesley Hutchins wrote:
> This patch fixes a bug wherein the trylock attribute would not work if it was attached to a virtual method.
>
> Bootstrapped and passed gcc regression testsuite on x86_64-unknown-linux-gnu. Okay for google/gcc-4_6?
>
> -DeLesley
>
> Changelog.google-4_6:
> 2011-11-08  DeLesley Hutchins  <deles...@google.com>
>    * tree-threadsafe-analyze.c: factors out code to get function decl.

Fix formatting. Blank line after date. Specify the name of the modified function in '()'.

> testsuite/Changelog.google-4_6:
> 2011-11-08  DeLesley Hutchins  <deles...@google.com>
>    * g++.dg/thread-ann/thread_annot_lock-85.C: New regression test

Blank line after date.

> Index: testsuite/g++.dg/thread-ann/thread_annot_lock-85.C
> ===================================================================
> --- testsuite/g++.dg/thread-ann/thread_annot_lock-85.C (revision 0)
> +++ testsuite/g++.dg/thread-ann/thread_annot_lock-85.C (revision 0)
> @@ -0,0 +1,22 @@
> +// Regression test, handle trylock on virtual method.
> +// { dg-do compile }
> +// { dg-options "-Wthread-safety" }
> +
> +#include "thread_annot_common.h"
> +
> +class LOCKABLE Lock {
> + public:
> +  virtual ~Lock() {}
> +  virtual bool TryLock() EXCLUSIVE_TRYLOCK_FUNCTION(true) { return true; }
> +  void Unlock() UNLOCK_FUNCTION() {}
> +};
> +
> +
> +void foo() {
> +  Lock x;
> +  Lock *p = &x;
> +  if (p->TryLock()) {
> +    p->Unlock();
> +  }
> +}
> +
> Index: tree-threadsafe-analyze.c
> ===================================================================
> --- tree-threadsafe-analyze.c (revision 180984)
> +++ tree-threadsafe-analyze.c (working copy)
> @@ -2508,26 +2508,14 @@
>    return;
>  }
>
> -/* The main routine that handles gimple call statements.  */
> -static void
> -handle_call_gs (gimple call, struct bb_threadsafe_info *current_bb_info)
> -{
> +/* Get the function declaration from a gimple call stmt.  This handles both
> +   ordinary function calls and virtual methods. */
> +static tree
> +get_fdecl_from_gimple_stmt (gimple call) {

Opening brace on line below. Reference 'call' argument in capitals.

>    tree fdecl = gimple_call_fndecl (call);
> -  int num_args = gimple_call_num_args (call);
> -  int arg_index = 0;
> -  tree arg_type = NULL_TREE;
> -  tree arg;
> -  tree lhs;
> -  location_t locus;
> -
> -  if (!gimple_has_location (call))
> -    locus = input_location;
> -  else
> -    locus = gimple_location (call);
> -
>    /* If the callee fndecl is NULL, check if it is a virtual function,
> -     and if so, try to get its decl through the reference object.  */
> +     and if so, try to get its decl through the reference object.  */
>    if (!fdecl)
>      {
>        tree callee = gimple_call_fn (call);
> @@ -2546,7 +2534,28 @@
>            fdecl = lang_hooks.get_virtual_function_decl (callee, objtype);
>          }
>      }
> +  return fdecl;
> +}
> +
> +/* The main routine that handles gimple call statements.  */
> +
> +static void
> +handle_call_gs (gimple call, struct bb_threadsafe_info *current_bb_info)

Since you are modifying this code, could you add documentation for CALL and CURRENT_BB_INFO?

> +{
> +  tree fdecl = get_fdecl_from_gimple_stmt (call);
> +  int num_args = gimple_call_num_args (call);
> +  int arg_index = 0;
> +  tree arg_type = NULL_TREE;
> +  tree arg;
> +  tree lhs;
> +  location_t locus;
> +
> +  if (!gimple_has_location (call))
> +    locus = input_location;
> +  else
> +    locus = gimple_location (call);
> +
>    /* The callee fndecl could be NULL, e.g., when the function is passed in
>       as an argument.  */
>    if (fdecl)
> @@ -2839,7 +2848,8 @@
>      }
>    else if (is_gimple_call (gs))
>      {
> -      tree fdecl = gimple_call_fndecl (gs);
> +      tree fdecl = get_fdecl_from_gimple_stmt (gs);
> +

What about the other spots where we call gimple_call_fndecl? Shouldn't we call get_fdecl_from_gimple_stmt there instead?

Diego.
Re: [PATCH] pr51038 atomic_flag on targets with no atomic support.
On 11/10/2011 08:52 AM, Andrew MacLeod wrote: libstdc++-v3 * include/bits/atomic_base.h (atomic_thread_fence): Call built-in. (atomic_signal_fence): Call built-in. (test_and_set, clear): Call new atomic built-ins. gcc * builtins.c (expand_builtin_atomic_clear): New. Expand atomic_clear. (expand_builtin_atomic_test_and_set): New. Expand atomic test_and_set. (expand_builtin): Add cases for test_and_set and clear. * sync-builtins.def (BUILT_IN_ATOMIC_TEST_AND_SET): New. (BUILT_IN_ATOMIC_CLEAR): New. testsuite * gcc.dg/atomic-invalid.c: Add test for invalid __atomic_clear models. * gcc.dg/atomic-flag.c: New. Test __atomic_test_and_set and __atomic_clear. Ok. r~
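For readers who have not used the two built-ins this thread adds, here is a minimal sketch of how they are typically called from C. It is an illustration only, not code from the patch: demo_flag, demo_try_acquire, demo_release and demo_acquire are hypothetical names, and the memory orders are just one reasonable choice for a lock-like flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag used only for this illustration.  */
static bool demo_flag;

/* Try to take the flag.  __atomic_test_and_set returns the previous
   state, so a false result means this caller set it first.  */
bool
demo_try_acquire (void)
{
  return !__atomic_test_and_set (&demo_flag, __ATOMIC_ACQUIRE);
}

/* Give the flag back.  __atomic_clear accepts relaxed, release and
   seq_cst orders only.  */
void
demo_release (void)
{
  __atomic_clear (&demo_flag, __ATOMIC_RELEASE);
}

/* Spin until the flag is ours (assumes another thread eventually
   calls demo_release).  */
void
demo_acquire (void)
{
  while (!demo_try_acquire ())
    ;
}
```

On a target with lock-free exchange the compiler is expected to expand these inline; what happens on a target with no atomic support at all is exactly what the patch above is about.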
Re: [PATCH] [Annotalysis] Add support for arrays in lock expressions
On 11-11-03 14:20, Delesley Hutchins wrote:
> This patch adds support for array indexing (i.e. operator []) in lock expressions. The current version of gcc seems to emit these as expressions involving pointer arithmetic, so we update get_canonical_lock_expr() to handle such expressions.
>
> Bootstrapped and passed gcc regression testsuite on x86_64-unknown-linux-gnu. Okay for google/gcc-4_6?
>
> -DeLesley
>
> Changelog.google-4_6:
> 2011-11-03  DeLesley Hutchins  <deles...@google.com>
>    * tree-threadsafe-analyze.c (get_canonical_lock_expr): Add support for
>      pointer arithmetic operations

Blank line after date. End entry with '.'. Align second line with '*'.

> Index: tree-threadsafe-analyze.c
> ===================================================================
> --- tree-threadsafe-analyze.c (revision 180716)
> +++ tree-threadsafe-analyze.c (working copy)
> @@ -79,6 +79,7 @@ along with GCC; see the file COPYING3.
>  #include "coretypes.h"
>  #include "tm.h"
>  #include "tree.h"
> +#include "gimple.h"

Well, we're getting gimple.h from somewhere else, but it's fine to include. Since you did, make sure you add it as a dependency for tree-threadsafe-analyze.o in Makefile.in.

>  #include "c-family/c-common.h"
>  #include "toplev.h"
>  #include "input.h"
> @@ -804,12 +805,27 @@ get_canonical_lock_expr (tree lock, tree
>            && !gimple_nop_p (SSA_NAME_DEF_STMT (lock)))
>          {
>            gimple def_stmt = SSA_NAME_DEF_STMT (lock);
> -          if (is_gimple_assign (def_stmt)
> -              && (get_gimple_rhs_class (gimple_assign_rhs_code (def_stmt))
> -                  == GIMPLE_SINGLE_RHS))
> -            return get_canonical_lock_expr (gimple_assign_rhs1 (def_stmt),
> -                                            base_obj, is_temp_expr,
> -                                            NULL_TREE);
> +          if (is_gimple_assign (def_stmt))
> +            {
> +              enum gimple_rhs_class gcls =
> +                get_gimple_rhs_class (gimple_assign_rhs_code (def_stmt));
> +              tree rhs = 0;

s/0/NULL_TREE/

> +
> +              if (gcls == GIMPLE_SINGLE_RHS)
> +                rhs = gimple_assign_rhs1 (def_stmt);
> +              else if (gcls == GIMPLE_UNARY_RHS)
> +                rhs = build1 (gimple_assign_rhs_code (def_stmt),
> +                              TREE_TYPE (gimple_assign_lhs (def_stmt)),
> +                              gimple_assign_rhs1 (def_stmt));
> +              else if (gcls == GIMPLE_BINARY_RHS)
> +                rhs = build2 (gimple_assign_rhs_code (def_stmt),
> +                              TREE_TYPE (gimple_assign_lhs (def_stmt)),
> +                              gimple_assign_rhs1 (def_stmt),
> +                              gimple_assign_rhs2 (def_stmt));
> +              if (rhs)
> +                return get_canonical_lock_expr (rhs, base_obj,
> +                                                is_temp_expr, NULL_TREE);
> +            }
>            else if (is_gimple_call (def_stmt))
>              {
>                tree fdecl = gimple_call_fndecl (def_stmt);
> @@ -981,6 +997,24 @@ get_canonical_lock_expr (tree lock, tree
>                             TREE_TYPE (TREE_TYPE (canon_base)), canon_base);
>          break;
>        }
> +    case PLUS_EXPR:
> +    case POINTER_PLUS_EXPR:
> +    case MULT_EXPR:

Why PLUS_EXPR and MULT_EXPR? Pointer arithmetic should use POINTER_PLUS_EXPR exclusively. I don't think you should be seeing PLUS_EXPRs here. Does the MULT_EXPR show up in scaling expressions?

> +      {
> +        tree left = TREE_OPERAND (lock, 0);
> +        tree canon_left = get_canonical_lock_expr (left, base_obj,
> +                                                   true /* is_temp_expr */,
> +                                                   NULL_TREE);
> +
> +        tree right = TREE_OPERAND (lock, 1);
> +        tree canon_right = get_canonical_lock_expr (right, base_obj,
> +                                                    true /* is_temp_expr */,
> +                                                    NULL_TREE);
> +        if (left != canon_left || right != canon_right)
> +          lock = build2 (TREE_CODE(lock), TREE_TYPE(lock),

Space before '('.

> +                         canon_left, canon_right);
> +        break;
> +      }
>      default:
>        break;
>      }

Diego.
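As background for the question about scaling, the following plain C sketch shows the source-level identity involved: indexing an array of locks names the same address as the base pointer plus the index multiplied by the element size. This is ordinary C, not GIMPLE, and it makes no claim about which tree codes GCC actually produces; struct demo_lock, demo_locks and both helper functions are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical lock type and array, for illustration only.  */
struct demo_lock { int held; long pad; };
static struct demo_lock demo_locks[8];

/* Address of element I, spelled with the indexing operator.  */
struct demo_lock *
lock_by_index (size_t i)
{
  return &demo_locks[i];
}

/* The same address spelled as explicit pointer arithmetic: the base
   address plus the index scaled by the element size.  */
struct demo_lock *
lock_by_arith (size_t i)
{
  return (struct demo_lock *) ((char *) demo_locks
                               + i * sizeof (struct demo_lock));
}
```

A canonicalizer that only understands the first spelling would treat the second as a different lock, which is roughly the situation the patch description refers to.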
Re: [libitm] Work around missing AVX support
On 11/10/2011 12:16 AM, Jakub Jelinek wrote: On Wed, Nov 09, 2011 at 04:32:58PM -0800, Richard Henderson wrote: Not pretty at all. But given the corresponding irritation in writing assembler wrapper functions, it seems like it's about a wash. Tested with and without HAVE_AS_AVX on x86_64-linux. Shouldn't -mavx be also not passed in that case? Then you wouldn't need to undef __AVX__ and we wouldn't risk gcc doesn't decide to optimize memcpy or something similar using AVX instructions... You are correct. Thanks for noticing this; I was a bit frazzled after fighting with autofoo for so long yesterday. Tested on x86_64-linux, with avx and with avx forcibly disabled. r~ commit b8190fde2cd04078f8448576fb021060526b51d5 Author: rth rth@138bc75d-0d04-0410-961f-82ee72b054a4 Date: Thu Nov 10 17:09:04 2011 + libitm: Don't add -mavx if the assembler doesn't support avx. * config/x86/x86_avx.cc: Remove #undef __AVX__ hack. Tidy comments. * Makefile.am (x86_avx.lo): Only add -mavx if ARCH_X86_AVX. * configure.ac (ARCH_X86_AVX): New conditional. * Makefile.in, configure: Rebuild. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@181261 138bc75d-0d04-0410-961f-82ee72b054a4 diff --git a/libitm/ChangeLog b/libitm/ChangeLog index 8aeb589..4fb699e 100644 --- a/libitm/ChangeLog +++ b/libitm/ChangeLog @@ -1,3 +1,10 @@ +2011-11-10 Richard Henderson r...@redhat.com + + * config/x86/x86_avx.cc: Remove #undef __AVX__ hack. Tidy comments. + * Makefile.am (x86_avx.lo): Only add -mavx if ARCH_X86_AVX. + * configure.ac (ARCH_X86_AVX): New conditional. + * Makefile.in, configure: Rebuild. + 2011-11-09 Richard Henderson r...@redhat.com * acinclude.m4 (LIBITM_CHECK_AS_AVX): New. 
diff --git a/libitm/Makefile.am b/libitm/Makefile.am index 4578986..b4674a5 100644 --- a/libitm/Makefile.am +++ b/libitm/Makefile.am @@ -48,6 +48,8 @@ libitm_la_SOURCES = \ if ARCH_X86 libitm_la_SOURCES += x86_sse.cc x86_avx.cc x86_sse.lo : XCFLAGS += -msse +endif +if ARCH_X86_AVX x86_avx.lo : XCFLAGS += -mavx endif diff --git a/libitm/Makefile.in b/libitm/Makefile.in index 8816580..7426146 100644 --- a/libitm/Makefile.in +++ b/libitm/Makefile.in @@ -1259,7 +1259,7 @@ uninstall-am: uninstall-dvi-am uninstall-html-am uninstall-info-am \ vpath % $(strip $(search_path)) @ARCH_X86_TRUE@x86_sse.lo : XCFLAGS += -msse -@ARCH_X86_TRUE@x86_avx.lo : XCFLAGS += -mavx +@ARCH_X86_AVX_TRUE@x86_avx.lo : XCFLAGS += -mavx all-local: $(STAMP_GENINSRC) diff --git a/libitm/config/x86/x86_avx.cc b/libitm/config/x86/x86_avx.cc index cd20fe2..6a5e297 100644 --- a/libitm/config/x86/x86_avx.cc +++ b/libitm/config/x86/x86_avx.cc @@ -24,24 +24,20 @@ #include config.h -// ??? This is pretty gross, but we're going to frob types of the functions. -// Is this better or worse than just admitting we need to do this in pure -// assembly? - -#ifndef HAVE_AS_AVX -#undef __AVX__ -#endif - #include libitm_i.h #include dispatch.h extern C { #ifndef HAVE_AS_AVX +// If we don't have an AVX capable assembler, we didn't set -mavx on the +// command-line either, which means that libitm.h defined neither this type +// nor the functions in this file. Define the type and unconditionally +// wrap the file in extern C to make up for the lack of pre-declaration. typedef float _ITM_TYPE_M256 __attribute__((vector_size(32), may_alias)); #endif -// ??? Re-define the memcpy implementations so that we can frob the +// Re-define the memcpy implementations so that we can frob the // interface to deal with possibly missing AVX instruction set support. #ifdef HAVE_AS_AVX @@ -52,10 +48,10 @@ typedef float _ITM_TYPE_M256 __attribute__((vector_size(32), may_alias)); #else /* Emit vmovaps (%rax),%ymm0. 
*/ #define RETURN(X) \ - asm volatile(.byte 0xc5,0xfc,0x28,0x00 : =m(X) : a(X)); + asm volatile(.byte 0xc5,0xfc,0x28,0x00 : =m(X) : a(X)) /* Emit vmovaps %ymm0,(%rax); vzeroupper. */ #define STORE(X,Y) \ - asm volatile(.byte 0xc5,0xfc,0x29,0x00,0xc5,0xf8,0x77 : =m(X) : a(X)); + asm volatile(.byte 0xc5,0xfc,0x29,0x00,0xc5,0xf8,0x77 : =m(X) : a(X)) #define OUTPUT(T) void #define INPUT(T,X) #endif @@ -92,4 +88,4 @@ _ITM_LM256 (const _ITM_TYPE_M256 *ptr) GTM::GTM_LB (ptr, sizeof (*ptr)); } -} +} // extern C diff --git a/libitm/configure b/libitm/configure index b30ced1..c0317cc 100644 --- a/libitm/configure +++ b/libitm/configure @@ -603,6 +603,8 @@ LTLIBOBJS LIBOBJS ARCH_FUTEX_FALSE ARCH_FUTEX_TRUE +ARCH_X86_AVX_FALSE +ARCH_X86_AVX_TRUE ARCH_X86_FALSE ARCH_X86_TRUE link_itm @@ -11714,7 +11716,7 @@ else lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2 lt_status=$lt_dlunknown cat conftest.$ac_ext _LT_EOF -#line 11717 configure +#line 11719 configure #include confdefs.h #if HAVE_DLFCN_H @@ -11820,7 +11822,7 @@ else lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2 lt_status=$lt_dlunknown cat
[Patch, libfortran] Reduce inlining
Hi,

the inlining heuristics are nowadays decent. In particular, at -O2 the compiler does the obvious inlinings:

- If the function body is very small (for some measure of small, see -finline-small-functions)

- static functions called once (-finline-functions-called-once)

Where GCC may need help is for functions called with constant arguments, where inlining would allow the deletion of untaken branches. Otherwise, in the absence of profiling data suggesting otherwise, keeping the code size smaller by avoiding inlining is probably the smart thing to do.

The attached patch does this for libgfortran, that is, it removes the inline attribute for static functions. The patch reduces the size of the following object files as follows:

Before:

   text    data     bss     dec     hex filename
    781       0       0     781     30d ../../trunk/objdir-git/x86_64-unknown-linux-gnu/libgfortran/cpu_time.o
   text    data     bss     dec     hex filename
    679       0       0     679     2a7 ../../trunk/objdir-git/x86_64-unknown-linux-gnu/libgfortran/system_clock.o

After:

   text    data     bss     dec     hex filename
    660       0       0     660     294 ../../trunk/objdir-git/x86_64-unknown-linux-gnu/libgfortran/cpu_time.o
   text    data     bss     dec     hex filename
    631       0       0     631     277 ../../trunk/objdir-git/x86_64-unknown-linux-gnu/libgfortran/system_clock.o

For the other affected object files there is no change, suggesting that while the inline attributes did no harm, they did no good either.

A system_clock benchmark program showed no change due to the un-inlining of gf_gettime_mono. CPU_TIME results in a proper syscall (as opposed to SYSTEM_CLOCK/clock_gettime, which is available as a VDSO on my system), so the overhead of that would overshadow whatever difference inlining might make; I didn't test that.

There was also an inline function (memset4) which was copy-pasted in both transfer.c and write.c; I moved it to io.h after first verifying that the compiler still inlines it with the inline attribute removed.

Committed as obvious to trunk.

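To make the "constant arguments" case concrete, here is a small sketch. It is not taken from libgfortran; scale_by_unit and scale_to_thousands are hypothetical names. The helper branches on its second argument, and its only caller passes a constant, so if the compiler inlines the call it can drop the branches that are never taken. Whether it actually does so depends on the optimization level and heuristics discussed above.

```c
#include <assert.h>

/* Hypothetical helper: the result depends on a mode argument that
   callers typically pass as a compile-time constant.  */
static int
scale_by_unit (int value, int unit)
{
  if (unit == 0)
    return value;
  if (unit == 1)
    return value * 1000;
  return value * 1000000;
}

/* The constant 1 makes the unit == 0 and fallthrough branches dead
   once the call is inlined, leaving just the multiplication.  */
int
scale_to_thousands (int value)
{
  return scale_by_unit (value, 1);
}
```

Comparing the assembly with and without an inline hint (for example via -S at -O2) is one way to check whether the hint changes anything for a given function.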
2011-11-10 Janne Blomqvist j...@gcc.gnu.org * intrinsics/cpu_time.c (__cpu_time_1): Don't force inlining. * intrinsics/random.c (rnumber_4): Remove inline attribute. (rnumber_8, rnumber_10, rnumber_16): Likewise. * intrinsics/system_clock.c (gf_gettime_mono): Likewise. * intrinsics/time_1.h (ATTRIBUTE_ALWAYS_INLINE): Remove macro. (gf_cputime): Add inline attribute for MingW version. * io/format.c (format_hash): Remove inline attribute. * io/io.h (memset4): Inline function from transfer.c and write.c moved here. * io/transfer.c (min_off): Remove inline attribute. (memset4): Move to io.h. * io/write.c (memset4): Likewise. (memcpy4): Remove inline attribute. * io/write_float.def (calculate_exp): Likewise. -- Janne Blomqvist diff --git a/libgfortran/intrinsics/cpu_time.c b/libgfortran/intrinsics/cpu_time.c index 619f8d2..94636c4 100644 --- a/libgfortran/intrinsics/cpu_time.c +++ b/libgfortran/intrinsics/cpu_time.c @@ -26,9 +26,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see #include time_1.h -static inline void __cpu_time_1 (long *, long *) ATTRIBUTE_ALWAYS_INLINE; - -static inline void +static void __cpu_time_1 (long *sec, long *usec) { long user_sec, user_usec, system_sec, system_usec; diff --git a/libgfortran/intrinsics/random.c b/libgfortran/intrinsics/random.c index 8c16b85..35576b8 100644 --- a/libgfortran/intrinsics/random.c +++ b/libgfortran/intrinsics/random.c @@ -74,7 +74,7 @@ static __gthread_mutex_t random_lock; correct offset. 
*/ -static inline void +static void rnumber_4 (GFC_REAL_4 *f, GFC_UINTEGER_4 v) { GFC_UINTEGER_4 mask; @@ -89,7 +89,7 @@ rnumber_4 (GFC_REAL_4 *f, GFC_UINTEGER_4 v) *f = (GFC_REAL_4) v * GFC_REAL_4_LITERAL(0x1.p-32); } -static inline void +static void rnumber_8 (GFC_REAL_8 *f, GFC_UINTEGER_8 v) { GFC_UINTEGER_8 mask; @@ -106,7 +106,7 @@ rnumber_8 (GFC_REAL_8 *f, GFC_UINTEGER_8 v) #ifdef HAVE_GFC_REAL_10 -static inline void +static void rnumber_10 (GFC_REAL_10 *f, GFC_UINTEGER_8 v) { GFC_UINTEGER_8 mask; @@ -126,7 +126,7 @@ rnumber_10 (GFC_REAL_10 *f, GFC_UINTEGER_8 v) /* For REAL(KIND=16), we only need to mask off the lower bits. */ -static inline void +static void rnumber_16 (GFC_REAL_16 *f, GFC_UINTEGER_8 v1, GFC_UINTEGER_8 v2) { GFC_UINTEGER_8 mask; diff --git a/libgfortran/intrinsics/system_clock.c b/libgfortran/intrinsics/system_clock.c index f4bac07..6385c4f 100644 --- a/libgfortran/intrinsics/system_clock.c +++ b/libgfortran/intrinsics/system_clock.c @@ -75,7 +75,7 @@ static int weak_gettime (clockid_t, struct timespec *) Return value: 0 for success, -1 for error. In case of error, errno is set. */ -static inline int +static int gf_gettime_mono (time_t * secs, long * nanosecs) {
Re: [Patch] Move Objective-C runtime flags to modern options system.
Hi Mike,

just want to state my understanding to allow you to comment if I'm off.

On 10 Nov 2011, at 16:12, Mike Stump wrote:
> On Nov 10, 2011, at 1:35 AM, Richard Guenther <rguent...@suse.de> wrote:
>> flag_exceptions also triggers middle-end behavior - without it no statement can possibly throw
>
> Actually, one version of exception handling for objective c++ doesn't require flag_exceptions... One can indeed @throw without it, they just can't throw without it.

Thanks for catching that --- brainstorm on my part ... the code under discussion should have been guarded by #ifndef OBJCPLUS.

Unfortunately, that particular blunder wasn't caught by the test-suite (which passes with the code I posted).

-=-

Specifically:

NeXT m32 (ABI=0 or 1) uses SjLj exceptions for @throw etc. (and the @throw is done from a library routine). Moreover, there is no personality routine in m32 NeXT libobjc, so if one tries to engage the zero-cost exceptions, one gets a link error (and generates a load of unused eh data). I can work around that if there is still reason to have -fexceptions on.

When C++ exceptions are operated in parallel with the ObjC ones, the personality routine is pointed at the libstdc++ one.

===

m64 NeXT is a different beast altogether, and the @throw and throw exceptions work together using the same unwinder.

===

With Joseph's suggestion I don't have a problem with the early use of flag_next_runtime, which means I can split the patch up, as I know Mike would prefer.

... will try and post a new version later.

Iain
Re: Revert PowerPC shrink-wrap support 3 of 3
> From: Hans-Peter Nilsson <h...@axis.com>
> Date: Thu, 10 Nov 2011 15:12:54 +0100

> > From: Bernd Schmidt <ber...@codesourcery.com>
> > Date: Thu, 10 Nov 2011 14:29:04 +0100

> > HP, can you run full tests?

> Cross-test to cris-elf in progress. Thanks!

Works, no regressions compared to before the breakage (r181187). Thanks! According to http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51051#c3 it fixes building for arm-unknown-linux-gnueabi too.

brgds, H-P
Re: vector garbaged collected while still in use
Looks like it is fixed already, so there is no need for this patch.

David

On Wed, Nov 9, 2011 at 12:36 AM, Richard Guenther <richard.guent...@gmail.com> wrote:
> On Tue, Nov 8, 2011 at 6:10 PM, Xinliang David Li <davi...@google.com> wrote:
>> Here is the revised patch. Bootstrap and regression tested on linux/x86-64.
>
> Honza, can you comment on the implication of this change? Jason also seems to have touched this again, so maybe it's already fixed?
>
>> thanks,
>>
>> David
>>
>> On Mon, Nov 7, 2011 at 2:09 PM, Richard Guenther <richard.guent...@gmail.com> wrote:
>>> On Mon, Nov 7, 2011 at 5:41 PM, Xinliang David Li <davi...@google.com> wrote:
>>>> Here is the stack trace when the watch point is hit (the watch point is on address &cleanups->base.prefix.num)
>>>>
>>>> David
>>>>
>>>> #0 memset () at ../sysdeps/x86_64/memset.S:336
>>>> #1 0x00d1528d in poison_pages () at /usr/local/google/davidxl/dev/gcc/tot/gcc/ggc-page.c:1983
>>>> #2 0x00d15424 in ggc_collect () at /usr/local/google/davidxl/dev/gcc/tot/gcc/ggc-page.c:2076
>>>> #3 0x01028d7f in cgraph_finalize_function (decl=0x7577d600, nested=0 '\000') at /usr/local/google/davidxl/dev/gcc/tot/gcc/cgraphunit.c:376
>>>
>>> Hm. We already conditionally arrange for cgraph_finalize_function to not call ggc_collect - so it seems that doing so is even less safe than originally thought. Which means I think we should push calling ggc_collect to the callers, which for the C++ frontend means ...
>>>
>>>> #4 0x00988010 in expand_or_defer_fn (fn=0x7577d600) at /usr/local/google/davidxl/dev/gcc/tot/gcc/cp/semantics.c:3797
>>>> #5 0x00a678a7 in maybe_clone_body (fn=0x75770700) at /usr/local/google/davidxl/dev/gcc/tot/gcc/cp/optimize.c:426
>>>> #6 0x00987aa3 in expand_or_defer_fn_1 (fn=0x75770700) at /usr/local/google/davidxl/dev/gcc/tot/gcc/cp/semantics.c:3722
>>>> #7 0x00987fe0 in expand_or_defer_fn (fn=0x75770700) at /usr/local/google/davidxl/dev/gcc/tot/gcc/cp/semantics.c:3792
>>>> #8 0x0091c5f5 in synthesize_method (fndecl=0x75770700) at /usr/local/google/davidxl/dev/gcc/tot/gcc/cp/method.c:773
>>>> #9 0x00551fa0 in cp_finish_decl (decl=0x75770700, init=0x76d8f898, init_const_expr_p=0 '\000', asmspec_tree=0x0, flags=11) at /usr/local/google/davidxl/dev/gcc/tot/gcc/cp/decl.c:6286
>>>
>>> ... probably here. Though I'd also approve a patch that simply removes the ggc_collect call (and the nested parameter).
>>>
>>> Honza - you probably added the ggc_collect - what's the reason to do it in this lowlevel place?
>>>
>>> Thanks,
>>> Richard.
Re: [patch tree-optimization 1/2]: Branch-cost optimizations
On 11/09/11 14:09, Kai Tietz wrote:
> Well, such a comparison-logic-folder helper (like affine-tree for add/subtract/scale) is for sure something good for inner gimple passes building up new logic-truth expressions, but such a pass doesn't meet the requirements needed to fold for BC optimization AFAICS.

Perhaps. I think the thing to do would be to see what additional needs we have and evaluate if they make sense in that kind of framework.

> The difference is that for BC we don't want to fold at all. Also it isn't necessarily a simplified statement we want. For example 'if ((a | b) == 0 ...) ...'. If the costs of such a pattern '(a | b) == 0' are too high, we want to represent it instead as 'if (a == 0) if (b == 0) ...'.

We don't have to fold. Think of this as an easy to use on-the-side representation of some operation(s).

What I would roughly expect to see is the set of operations feeding the comparison shoved into this structure. With the full set of ops exposed into this structure we could look at the cost, canonicalize/simplify if appropriate, then select the best codegen strategy, modifying the on-the-side structure as needed. We then reflect the final result back into the IL.

> We have a condition 'if ((A | B) == 0 && C == 0) ...' where the joining of A == 0 and C == 0 would be profitable by BC-optimization, but the join of A != 0 and B != 0 isn't. So we do - as my patch does - first an expand to element comparison-sequence view. So we get for it the transformed form 'if (A == 0 && B == 0 && C == 0)'.

Right, so one of the first steps would be to canonicalize the (A|B) == 0 && C == 0 form into something like you've shown above within this on-the-side structure.

> Now we can begin to search for valid patterns in the condition for joining by searching from left-to-right for a profitable pattern. So we end-up with final statement 'if ((A | C) == 0 && C)'

Which would be fairly straightforward using the on-the-side structure.
I'm not sure what I'm missing since the description you've given fits very nicely with the overall approach Richi is suggesting.

jeff
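To make the two forms under discussion concrete, here is a minimal sketch in C. The function names are made up for illustration and are not part of the patch. The only point is that the joined test and the split test yield the same truth value, so choosing between them is purely a cost decision.

```c
#include <assert.h>

/* Joined form: one OR followed by a single compare.  */
int
both_zero_joined (int a, int b)
{
  return (a | b) == 0;
}

/* Split form: two compares, typically two conditional branches.  */
int
both_zero_split (int a, int b)
{
  if (a == 0)
    if (b == 0)
      return 1;
  return 0;
}

/* Hypothetical self-check over a few sample values.  */
void
check_forms_agree (void)
{
  static const int samples[] = { 0, 1, -1, 42 };
  unsigned int i, j;

  for (i = 0; i < 4; i++)
    for (j = 0; j < 4; j++)
      assert (both_zero_joined (samples[i], samples[j])
              == both_zero_split (samples[i], samples[j]));
}
```

Which form is cheaper depends on the target's branch cost, which is what the BC optimization in this thread is trying to decide.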
Re: [PATCH] [Annotalysis] Add support for arrays in lock expressions
> Why PLUS_EXPR and MULT_EXPR? Pointer arithmetic should use POINTER_PLUS_EXPR exclusively. I don't think you should be seeing PLUS_EXPRs here. Does the MULT_EXPR show up in scaling expressions?

MULT_EXPR shows up in array indexing, since the index is multiplied by the size of the element; gcc converts everything to bytes before lowering. I added PLUS_EXPR for completeness, since I'm sure someone will write an expression like array[i+1] at some point. :-)

-DeLesley
Re: [PATCH] [Annotalysis] Add support for arrays in lock expressions
On 11-11-10 13:05, Delesley Hutchins wrote:
>> Why PLUS_EXPR and MULT_EXPR? Pointer arithmetic should use POINTER_PLUS_EXPR exclusively. I don't think you should be seeing PLUS_EXPRs here. The MULT_EXPR show up in scaling expressions?
>
> MULT_EXPR shows up in array indexing, since the index is multiplied by the size of the element; gcc converts everything to bytes before lowering. I added PLUS_EXPR for completeness, since I'm sure someone will write an expression like array[i+1] at some point. :-)

But you should not see such an expression in gimple. The array index is always a gimple_val.

Diego.
Re: [PATCH] pr51038 atomic_flag on targets with no atomic support.
From: Andrew MacLeod amacl...@redhat.com Date: Thu, 10 Nov 2011 17:52:44 +0100 On 11/10/2011 11:48 AM, Andrew MacLeod wrote: Justa minute Andrew doh. sorry about that Test cross to cris-elf in progress for your second take (at r181254 + Bernd's patch to unbreak the tree for arm-linux-gnueabi and cris-elf). Thanks! brgds, H-P
Re: [RFA/ARM][Patch 01/02]: Thumb2 epilogue in RTL
On Thu, 2011-11-10 at 13:44 +, Richard Earnshaw wrote: On 28/09/11 17:15, Sameera Deshpande wrote: Hi! This patch generates Thumb2 epilogues in RTL form. The work involves defining new functions, predicates and patterns along with few changes in existing code: * The load_multiple_operation predicate was found to be too restrictive for integer loads as it required consecutive destination regs, so this restriction was lifted. * Variations of load_multiple_operation were required to handle cases - where SP must be the base register - where FP values were being loaded (which do require consecutive destination registers) - where PC can be in register-list (which requires return pattern along with register loads). Hence, the common code was factored out into a new function in arm.c and parameterised to show - whether consecutive destination regs are needed - the data type being loaded - whether the base register has to be SP - whether PC is in register-list The patch is tested with arm-eabi with no regressions. ChangeLog: 2011-09-28 Ian Bolton ian.bol...@arm.com Sameera Deshpande sameera.deshpa...@arm.com * config/arm/arm-protos.h (load_multiple_operation_p): New declaration. (thumb2_expand_epilogue): Likewise. (thumb2_output_return): Likewise (thumb2_expand_return): Likewise. (thumb_unexpanded_epilogue): Rename to... (thumb1_unexpanded_epilogue): ...this * config/arm/arm.c (load_multiple_operation_p): New function. (thumb2_emit_multi_reg_pop): Likewise. (thumb2_emit_vfp_multi_reg_pop): Likewise. (thumb2_expand_return): Likewise. (thumb2_expand_epilogue): Likewise. (thumb2_output_return): Likewise (thumb_unexpanded_epilogue): Rename to... ( thumb1_unexpanded_epilogue): ...this * config/arm/arm.md (pop_multiple_with_stack_update): New pattern. (pop_multiple_with_stack_update_and_return): Likewise. (thumb2_ldr_with_return): Likewise. (floating_point_pop_multiple_with_stack_update): Likewise. (return): Update condition and code for pattern. (arm_return): Likewise. 
(epilogue_insns): Likewise. * config/arm/predicates.md (load_multiple_operation): Update predicate. (load_multiple_operation_stack_and_return): New predicate. (load_multiple_operation_stack): Likewise. (load_multiple_operation_stack_fp): Likewise. * config/arm/thumb2.md (thumb2_return): Remove. (thumb2_rtl_epilogue_return): New pattern. - Thanks and regards, Sameera D. thumb2_rtl_epilogue_complete-27Sept.patch + if (GET_CODE (SET_SRC (elt = XVECEXP (op, 0, offset_adj))) == PLUS) It's generally best not to use assignments within conditionals unless there is a strong reason otherwise (that normally implies something like being deep within a condition test where you only want to update the variable if some pre-conditions are true and that can't be easily factored out). + != (unsigned int) (first_dest_regno + regs_per_val * (i - base Line length (split the line just before the '+' operator. + /* now show EVERY reg that will be restored, using a SET for each. */ Capital letter at start of sentence. Why is EVERY in caps? + saved_regs_mask = offsets-saved_regs_mask; + for (i = 0, num_regs = 0; i = LAST_ARM_REGNUM; i++) blank line before the for loop. + /* It's illegal to do a pop for only one reg, so generate an ldr. */ GCC coding standards suggest avoiding the use of 'illegal'. Suggest changing that to 'Pop can only be used for more than one reg; so...' +reg_names[REGNO (XEXP (XVECEXP (operands[0], 0, 2), 0))]); + +/* Skip over the first two elements and the one we just generated. */ +for (i = 3; i (num_saves); i++) + { +strcat (pattern, \, %|\); +strcat (pattern, +reg_names[REGNO (XEXP (XVECEXP (operands[0], 0, i), 0))]); + } + +strcat (pattern, \}\); +output_asm_insn (pattern, operands); + +return \\; + } + + [(set_attr type load4)] There's a lot of trailing white space here. Please remove. 
+(define_insn *thumb2_ldr_with_return + [(return) + (set (reg:SI PC_REGNUM) +(mem:SI (post_inc:SI (match_operand:SI 0 s_register_operand k] + TARGET_THUMB2 + ldr%?\t%|pc, [%0], #4 + [(set_attr type load1) + (set_attr predicable yes)] +) + This pattern doesn't seem to be used. What's its purpose? +static const struct { const char *const name; } table[] + = { {\d0\}, {\d1\}, {\d2\}, {\d3\}, I'm not keen on having this table. Generally the register names should be configurable
Re: [PATCH] [Annotalysis] Add support for arrays in lock expressions
> But you should not see such an expression in gimple. The array index is always a gimple_val.

I'm not sure what you mean. The expression array[i+1] compiles to the following (courtesy of dump-tree-ssa):

  D.2095_4 = (long unsigned int) i_1;
  D.2096_5 = D.2095_4 + 1;
  D.2097_6 = D.2096_5 * 4;
  D.2098_8 = array_7(D) + D.2097_6;

Annotalysis canonicalizes this gimple code into a tree that includes a PLUS_EXPR, a MULT_EXPR, and a POINTER_PLUS_EXPR.

-DeLesley
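The scaling in that dump can be reproduced by hand. The sketch below uses hypothetical helper names and computes the address of array[i + 1] the way the GIMPLE does: widen the index, add one, multiply by the element size, add the result to the base pointer. It is only an illustration, and it uses sizeof (int) where the dump shows the constant 4.

```c
#include <assert.h>

/* Address of array[i + 1] via ordinary indexing.  */
int *
elem_addr_indexed (int *array, int i)
{
  return &array[i + 1];
}

/* The same address computed step by step, mirroring the dump.  */
int *
elem_addr_lowered (int *array, int i)
{
  unsigned long idx = (unsigned long) i;       /* D.2095 = (long unsigned int) i */
  unsigned long idx1 = idx + 1;                /* D.2096: PLUS_EXPR */
  unsigned long bytes = idx1 * sizeof (int);   /* D.2097: MULT_EXPR */
  return (int *) ((char *) array + bytes);     /* D.2098: POINTER_PLUS_EXPR */
}

/* Hypothetical self-check on a small local array.  */
void
check_lowering (void)
{
  int a[8] = { 0 };
  int i;

  for (i = 0; i < 7; i++)
    assert (elem_addr_indexed (a, i) == elem_addr_lowered (a, i));
}
```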
Re: [PATCH] [Annotalysis] Add support for arrays in lock expressions
On 11-11-10 13:25, Delesley Hutchins wrote:
>> But you should not see such an expression in gimple. The array index is always a gimple_val.
>
> I'm not sure what you mean. The expression array[i+1] compiles to the following (courtesy of dump-tree-ssa):
>
>   D.2095_4 = (long unsigned int) i_1;
>   D.2096_5 = D.2095_4 + 1;
>   D.2097_6 = D.2096_5 * 4;
>   D.2098_8 = array_7(D) + D.2097_6;
>
> Annotalysis canonicalizes this gimple code into a tree that includes a PLUS_EXPR, a MULT_EXPR, and a POINTER_PLUS_EXPR.

Ah, I see what you mean now. Sorry. Yes, the patch is fine with the other fixes.

Diego.
Re: RFA: New pass to delete unexecutable paths in the CFG
On 11/09/11 15:10, Paolo Bonzini wrote:
> On 11/09/2011 06:53 PM, Jeff Law wrote:
>> My patch totally ignores the other code on the unexecutable path. So we can miss externally visible side effects, if we were to somehow get on the unexecutable path. But that's the whole point, in a conforming program we can't ever get on the unexecutable path.
>
> But if a subroutine call never returns, we wouldn't get to the undefined behavior in the first place.

Yea, I'd been pondering this aspect as well. The cases that most concern me would be aborts and infinite loops. Stuff like EH is represented in the CFG and the control dependence stuff would ensure we do the right thing.

I think there are enough unanswered questions that we should defer this until after 4.7 branches. Or at the least not have the option on by default for 4.7, even if the issues raised in the threads are addressed.

Jeff
[committed] Fix -D_FORTIFY_SOURCE infinite recursion (PR middle-end/51077)
Hi!

The PR50700 change caused infinite recursion - we shouldn't call compute_builtin_object_size on SSA_NAMEs more than two times, otherwise we risk calling it endlessly. But TREE_CODE (pt_var) in this code is known to be MEM_REF, so always != SSA_NAME.

Fixed thusly, committed as obvious to trunk/4.6 after bootstrap/regtest on x86_64-linux and i686-linux.

2011-11-10 Jakub Jelinek ja...@redhat.com

PR middle-end/51077
* tree-object-size.c (addr_object_size): Check TREE_CODE of MEM_REF's operand rather than code of the MEM_REF itself.

* gcc.c-torture/compile/pr51077.c: New test.

--- gcc/tree-object-size.c.jj 2011-10-12 20:28:20.0 +0200
+++ gcc/tree-object-size.c 2011-11-10 11:53:37.106777916 +0100
@@ -175,7 +175,7 @@ addr_object_size (struct object_size_inf
   unsigned HOST_WIDE_INT sz;
 
   if (!osi || (object_size_type & 1) != 0
-      || TREE_CODE (pt_var) != SSA_NAME)
+      || TREE_CODE (TREE_OPERAND (pt_var, 0)) != SSA_NAME)
     {
       sz = compute_builtin_object_size (TREE_OPERAND (pt_var, 0),
                                         object_size_type & ~1);
--- gcc/testsuite/gcc.c-torture/compile/pr51077.c.jj 2011-11-10 12:05:24.797638658 +0100
+++ gcc/testsuite/gcc.c-torture/compile/pr51077.c 2011-11-10 12:04:59.0 +0100
@@ -0,0 +1,15 @@
+/* PR middle-end/51077 */
+
+struct S { unsigned char s, t[256]; };
+
+void
+foo (const struct S *x, struct S *y, int z)
+{
+  int i;
+  for (i = 0; i < 8; i++)
+    {
+      const struct S *a = &x[i];
+      __builtin___memcpy_chk (y->t, a->t, z, __builtin_object_size (y->t, 0));
+      y = (struct S *) &y->t[z];
+    }
+}

Jakub
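For readers unfamiliar with the builtins in the testcase, the following sketch shows the idea behind a checked copy. checked_copy is a made-up stand-in and not the implementation of __builtin___memcpy_chk. It only illustrates that the copy is refused when the requested length exceeds the destination size, the value a call site would obtain from __builtin_object_size. That GCC builtin yields (size_t) -1 in modes 0 and 1 when the size is unknown.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a fortified memcpy: copy LEN bytes only if
   they fit into a destination of DST_SIZE bytes.  Returns 0 on success
   and -1 if the copy would overflow.  (size_t) -1, the "unknown" value,
   is the largest size_t, so the test never fires for it.  */
int
checked_copy (void *dst, const void *src, size_t len, size_t dst_size)
{
  if (len > dst_size)
    return -1;
  memcpy (dst, src, len);
  return 0;
}

/* Call site: let the compiler supply the destination size.  */
int
copy_into_local (const char *src, size_t len)
{
  char buf[16];

  /* Keep the demo itself in bounds even if the size is unknown.  */
  assert (len <= sizeof buf);
  return checked_copy (buf, src, len, __builtin_object_size (buf, 0));
}
```

Unlike this sketch, which reports the overflow through its return value, the real fortified functions abort the program.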
[PATCH] Fold VEC_PERM_EXPR/VEC_INTERLEAVE*EXPR/VEC_EXTRACT*EXPR with VECTOR_CST/CONSTRUCTOR arguments (PR tree-optimization/51074)
Hi! This patch adds folding of the new VEC_PERM_EXPR as well as the older more specialized permutation exprs. For VEC_PERM_EXPR e.g. __builtin_shuffle may be used with constant arguments, for the other one the vectorizer sometimes creates it (though, admittedly, it should try harder to figure it out). Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2011-11-10 Jakub Jelinek ja...@redhat.com PR tree-optimization/51074 * fold-const.c (fold_binary_loc): Handle VEC_EXTRACT_EVEN_EXPR, VEC_EXTRACT_ODD_EXPR, VEC_INTERLEAVE_HIGH_EXPR and VEC_INTERLEAVE_LOW_EXPR with VECTOR_CST or CONSTRUCTOR operands. (fold_ternary_loc): Handle VEC_PERM_EXPR with VECTOR_CST or CONSTRUCTOR operands. --- gcc/fold-const.c.jj 2011-10-24 12:21:14.0 +0200 +++ gcc/fold-const.c2011-11-10 14:21:56.671487697 +0100 @@ -13381,6 +13381,102 @@ fold_binary_loc (location_t loc, /* An ASSERT_EXPR should never be passed to fold_binary. */ gcc_unreachable (); +case VEC_EXTRACT_EVEN_EXPR: +case VEC_EXTRACT_ODD_EXPR: +case VEC_INTERLEAVE_HIGH_EXPR: +case VEC_INTERLEAVE_LOW_EXPR: + if ((TREE_CODE (arg0) == VECTOR_CST + || TREE_CODE (arg0) == CONSTRUCTOR) + (TREE_CODE (arg1) == VECTOR_CST + || TREE_CODE (arg1) == CONSTRUCTOR) + TREE_TYPE (TREE_TYPE (arg0)) == TREE_TYPE (type) + TREE_TYPE (TREE_TYPE (arg1)) == TREE_TYPE (type)) + { + unsigned int nelements = TYPE_VECTOR_SUBPARTS (type), i; + tree *elements = XALLOCAVEC (tree, nelements * 3), t; + constructor_elt *elt; + bool need_ctor = false; + + if (TREE_CODE (arg0) == VECTOR_CST) + { + for (i = 0, t = TREE_VECTOR_CST_ELTS (arg0); + i nelements t; i++, t = TREE_CHAIN (t)) + elements[i] = TREE_VALUE (t); + if (t) + return NULL_TREE; + } + else + FOR_EACH_VEC_ELT (constructor_elt, CONSTRUCTOR_ELTS (arg0), i, elt) + if (i = nelements) + return NULL_TREE; + else + elements[i] = elt-value; + if (i nelements) + return NULL_TREE; + + if (TREE_CODE (arg0) == VECTOR_CST) + { + for (i = 0, t = TREE_VECTOR_CST_ELTS (arg1); + i nelements t; i++, t = 
TREE_CHAIN (t)) + elements[i + nelements] = TREE_VALUE (t); + if (t) + return NULL_TREE; + } + else + FOR_EACH_VEC_ELT (constructor_elt, CONSTRUCTOR_ELTS (arg1), i, elt) + if (i = nelements) + return NULL_TREE; + else + elements[i + nelements] = elt-value; + if (i nelements) + return NULL_TREE; + + for (i = 0; i nelements; i++) + { + unsigned int idx; + switch (code) + { + case VEC_EXTRACT_EVEN_EXPR: + idx = i * 2; + break; + case VEC_EXTRACT_ODD_EXPR: + idx = i * 2 + 1; + break; + case VEC_INTERLEAVE_HIGH_EXPR: + idx = (i + nelements) / 2 + ((i 1) ? nelements : 0); + break; + case VEC_INTERLEAVE_LOW_EXPR: + idx = i / 2 + ((i 1) ? nelements : 0); + break; + default: + gcc_unreachable (); + } + + if (!CONSTANT_CLASS_P (elements[idx])) + need_ctor = true; + elements[i + 2 * nelements] = elements[idx]; + } + + if (need_ctor) + { + VEC(constructor_elt,gc) *v + = VEC_alloc (constructor_elt, gc, nelements); + for (i = 0; i nelements; i++) + CONSTRUCTOR_APPEND_ELT (v, NULL_TREE, + elements[2 * nelements + i]); + return build_constructor (type, v); + } + else + { + tree vals = NULL_TREE; + for (i = 0; i nelements; i++) + vals = tree_cons (NULL_TREE, + elements[3 * nelements - i - 1], vals); + return build_vector (type, vals); + } + } + return NULL_TREE; + default: return NULL_TREE; } /* switch (code) */ @@ -13767,6 +13863,90 @@ fold_ternary_loc (location_t loc, enum t return fold_fma (loc, type, arg0, arg1, arg2); +case VEC_PERM_EXPR: + if ((TREE_CODE (arg0) == VECTOR_CST + || TREE_CODE (arg0) == CONSTRUCTOR) + (TREE_CODE (arg1) == VECTOR_CST + || TREE_CODE (arg1) == CONSTRUCTOR) + TREE_CODE (arg2) == VECTOR_CST + TREE_TYPE (TREE_TYPE (arg0)) == TREE_TYPE (type) + TREE_TYPE (TREE_TYPE (arg1)) == TREE_TYPE (type)) + { +
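The index arithmetic in the fold_binary_loc hunk is easier to follow on plain arrays. The sketch below is not GCC code: the enum and function names are hypothetical, ints stand in for tree elements, and the `i & 1` tests are an assumption about what the mangled `(i 1)` in the diff originally read. As in the patch, both inputs are viewed as one concatenated array of 2 * n elements.

```c
#include <assert.h>

enum perm_kind { EXTRACT_EVEN, EXTRACT_ODD, INTERLEAVE_HIGH, INTERLEAVE_LOW };

/* Fill OUT[0..n-1] by selecting from the concatenation of A and B
   (N elements each).  N is limited to 16 in this sketch.  */
void
fold_perm (enum perm_kind kind, const int *a, const int *b, int *out,
           unsigned int n)
{
  int elements[32];
  unsigned int i;

  assert (n <= 16);
  for (i = 0; i < n; i++)
    {
      elements[i] = a[i];
      elements[i + n] = b[i];
    }

  for (i = 0; i < n; i++)
    {
      unsigned int idx;

      switch (kind)
        {
        case EXTRACT_EVEN:
          idx = i * 2;
          break;
        case EXTRACT_ODD:
          idx = i * 2 + 1;
          break;
        case INTERLEAVE_HIGH:
          idx = (i + n) / 2 + ((i & 1) ? n : 0);
          break;
        case INTERLEAVE_LOW:
          idx = i / 2 + ((i & 1) ? n : 0);
          break;
        default:
          idx = 0;
          break;
        }
      out[i] = elements[idx];
    }
}
```

With a = {0, 1, 2, 3} and b = {10, 11, 12, 13}, these formulas select {0, 2, 10, 12} for the even extract and {2, 12, 3, 13} for the high interleave.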
gcov internal changes
I've committed this patch to reorganize the internal data structures of gcov in preparation for some future features. The main change is that the sources list becomes an array. This changes references to a source_info object to be an index into the array, rather than a pointer. In making this change I noticed a bug introduced by my comdat support. We were adding a function into the list for a source file before determining whether that instance of the function was in the executable at all. built tested on i686-pc-linux-gnu. nathan 2011-11-10 Nathan Sidwell nat...@acm.org * gcov.c (struct function_info): Make src an index, not a pointer. (struct source_info): Remove index and next source fields. (fn_end): New static var. (sources_index): Remove. (sources): Now a pointer to an array, not a list. (n_sources, a_sources): New. (process_file): Adjust for changes to read_graph_file. Insert functions into source lists and check line numbers here. (generate_results): Only allocate lines for sources with contents. Adjust for source array. (release_structures): Likewise. (find_source): Return source index, adjust for source array. (read_graph_file): Return function list. Don't insert into source lists here. (read_count_file): Take list of functions. (solve_flow_graph): Reverse the arc lists here. (add_line_counts): Adjust for source array. Index: gcov.c === --- gcov.c (revision 181105) +++ gcov.c (working copy) @@ -181,9 +181,9 @@ typedef struct function_info gcov_type *counts; unsigned num_counts; - /* First line number. */ + /* First line number file. */ unsigned line; - struct source_info *src; + unsigned src; /* Next function in same source file. */ struct function_info *line_next; @@ -233,7 +233,6 @@ typedef struct source_info { /* Name of source file. */ char *name; - unsigned index; time_t file_time; /* Array of line information. */ @@ -245,23 +244,16 @@ typedef struct source_info /* Functions in this source file. These are in ascending line number order. 
*/ function_t *functions; - - /* Next source file. */ - struct source_info *next; } source_t; /* Holds a list of function basic block graphs. */ static function_t *functions; +static function_t **fn_end = functions; -/* This points to the head of the sourcefile structure list. New elements - are always prepended. */ - -static source_t *sources; - -/* Next index for a source file. */ - -static unsigned source_index; +static source_t *sources; /* Array of source files */ +static unsigned n_sources; /* Number of sources */ +static unsigned a_sources; /* Allocated sources */ /* This holds data summary information. */ @@ -349,9 +341,9 @@ static void print_version (void) ATTRIBU static void process_file (const char *); static void generate_results (const char *); static void create_file_names (const char *); -static source_t *find_source (const char *); -static int read_graph_file (void); -static int read_count_file (void); +static unsigned find_source (const char *); +static function_t *read_graph_file (void); +static int read_count_file (function_t *); static void solve_flow_graph (function_t *); static void add_branch_counts (coverage_t *, const arc_t *); static void add_line_counts (coverage_t *, function_t *); @@ -537,57 +529,85 @@ process_args (int argc, char **argv) static void process_file (const char *file_name) { - function_t *fn; - function_t **fn_p; - function_t *old_functions; - - /* Save and clear the list of current functions. They will be appended - later. 
*/ - old_functions = functions; - functions = NULL; + function_t *fns; create_file_names (file_name); - if (read_graph_file ()) + fns = read_graph_file (); + if (!fns) return; - - if (!functions) + + read_count_file (fns); + while (fns) { - fnotice (stderr, %s:no functions found\n, bbg_file_name); - return; -} - - if (read_count_file ()) -return; + function_t *fn = fns; - fn_p = functions; - while ((fn = *fn_p) != NULL) -{ + fns = fn-next; + fn-next = NULL; if (fn-counts) { + unsigned src = fn-src; + unsigned line = fn-line; + unsigned block_no; + function_t *probe, **prev; + + /* Now insert it into the source file's list of +functions. Normally functions will be encountered in +ascending order, so a simple scan is quick. Note we're +building this list in reverse order. */ + for (prev = sources[src].functions; + (probe = *prev); prev = probe-line_next) + if (probe-line = line) + break; + fn-line_next = probe; + *prev = fn; + + /*
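The move from a linked list of sources to an array explains why function_info now stores an index: a growable array may be reallocated, which would invalidate stored pointers but not indices. The sketch below illustrates that idea only. The names find_source, sources, n_sources and a_sources come from the ChangeLog above, while the body, the reduced source_t and the doubling growth policy are assumptions, not the committed code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct source_info
{
  char *name;   /* Name of source file.  */
} source_t;

static source_t *sources;    /* Array of source files.  */
static unsigned n_sources;   /* Number of sources in use.  */
static unsigned a_sources;   /* Number of sources allocated.  */

/* Return the index of FILE_NAME in SOURCES, adding it if needed.  */
unsigned
find_source (const char *file_name)
{
  unsigned ix;
  char *copy;

  for (ix = 0; ix < n_sources; ix++)
    if (strcmp (sources[ix].name, file_name) == 0)
      return ix;

  if (n_sources == a_sources)
    {
      /* Grow the array; existing indices stay valid, pointers would not.  */
      unsigned new_size = a_sources ? a_sources * 2 : 4;
      source_t *grown = realloc (sources, new_size * sizeof (source_t));

      if (!grown)
        abort ();
      sources = grown;
      a_sources = new_size;
    }
  assert (n_sources < a_sources);

  copy = malloc (strlen (file_name) + 1);
  if (!copy)
    abort ();
  strcpy (copy, file_name);
  sources[n_sources].name = copy;
  return n_sources++;
}
```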
[PATCH, i386]: Fix PR 50762, [4.7 Regression] ICE: in extract_insn, at recog.c:2137 (unrecognizable insn)
Hello!

Attached patch fixes a corner case with reload, where reload propagates constant zero into a zero_extended LEA instruction, creating invalid RTX:

(insn 4 15 52 2 (set (reg/v:SI 59 [ p_60 ]) (const_int 0 [0])) tt.c:24 64 {*movsi_internal} (nil))
...
(insn 29 28 30 3 (set (reg:DI 78) (zero_extend:DI (plus:SI (reg/v:SI 59 [ p_60 ]) (const_int 1 [0x1] tt.c:35 250 {*lea_4_zext} (expr_list:REG_DEAD (reg/v:SI 59 [ p_60 ]) (nil)))

to:

tt.c: In function ‘func_59’:
tt.c:48:1: error: unrecognizable insn:
(insn 29 28 30 3 (set (reg:DI 0 ax [78]) (zero_extend:DI (const_int 1 [0x1]))) tt.c:35 -1 (expr_list:REG_DEAD (reg/v:SI 59 [ p_60 ]) (nil)))

To prevent this, we introduce a new address constraint, so a register will be used instead of const_int.

The fix uncovered a problem with lea_address_operand, which was converted from a special predicate to a normal one a while ago. For a brief moment, when fixing an operand with an address constraint, reload requires that the pattern accepts (const_int 1). However, contrary to what the documentation says, normal predicates don't provide any bypass for const_int operands, so mode-less const_int operands are rejected by the GET_MODE (op) == mode check. To fix this, the attached patch converts lea_address_operand back to a special_predicate. It was actually changed by me a couple of months ago to a normal predicate, while Reading The (... ehm ...) Fine Manual. A followup patch will axe out the wrong define_predicate blurb.

For additional joy, I was not able to fix the testcase - some of its variables have to be left uninitialized to trigger this corner case. OTOH, the testcase is too ugly to live and could cause some psychological trauma to innocent readers. I suspect that this problem triggers more often on x32, which depends on zero_extended addresses for sane code.

2011-11-10 Uros Bizjak ubiz...@gmail.com

PR target/50762
* config/i386/constraints.md (j): New address constraint.
* config/i386/predicates.md (lea_address_operand): Redefine as special predicate.
* config/i386/i386.md (*lea_3_zext): Use j constraint for operand 1. (*lea_4_zext): Ditto. Patch was tested on x86_64-pc-linux-gnu {,-m32}. I will wait for a day for possible comments before committing the patch to SVN mainline. (Thanks go to Ulrich and Bernd for their help in heroic battle against reload). Uros. Index: i386.md === --- i386.md (revision 181258) +++ i386.md (working copy) @@ -5551,7 +5551,7 @@ (define_insn *lea_3_zext [(set (match_operand:DI 0 register_operand =r) (zero_extend:DI - (subreg:SI (match_operand:DI 1 lea_address_operand p) 0)))] + (subreg:SI (match_operand:DI 1 lea_address_operand j) 0)))] TARGET_64BIT lea{l}\t{%a1, %k0|%k0, %a1} [(set_attr type lea) @@ -5560,7 +5560,7 @@ (define_insn *lea_4_zext [(set (match_operand:DI 0 register_operand =r) (zero_extend:DI - (match_operand:SI 1 lea_address_operand p)))] + (match_operand:SI 1 lea_address_operand j)))] TARGET_64BIT lea{l}\t{%a1, %k0|%k0, %a1} [(set_attr type lea) Index: constraints.md === --- constraints.md (revision 181258) +++ constraints.md (working copy) @@ -19,7 +19,7 @@ ;;; Unused letters: ;;; B H T W -;;; h jk v +;;; h k v ;; Integer register constraints. ;; It is not necessary to define 'r' here. @@ -127,6 +127,11 @@ (and (not (match_test TARGET_X32)) (match_operand 0 memory_operand))) +(define_address_constraint j + @internal Address operand that can be zero extended in LEA instruction. + (and (not (match_code const_int)) + (match_operand 0 address_operand))) + ;; Integer constant constraints. (define_constraint I Integer constant in the range 0 @dots{} 31, for 32-bit shifts. Index: predicates.md === --- predicates.md (revision 181258) +++ predicates.md (working copy) @@ -808,8 +808,9 @@ (match_operand 0 const0_operand))) ;; Return true if op if a valid address for LEA, and does not contain -;; a segment override. -(define_predicate lea_address_operand +;; a segment override. Defined as a special predicate to allow +;; mode-less const_int operands passed to address_operand. 
+(define_special_predicate lea_address_operand (match_operand 0 address_operand) { struct ix86_address parts;
Options handling and reload memory leak fixes
Hi!

Running valgrind even on simple testcases shows a bunch of memory leaks (definitely lost). This patch cures some of them. There are a few further leaks in the options handling.

The first hunk covers the case where this function has already called concat to set opt_text, and then doesn't store opt_text anywhere, only the concat of it and something else (which mallocs new memory and doesn't free the old one).

The second hunk is needed because register_pass_name either stores a copy of the string or does nothing (another alternative would be to store the passed in name or free it in register_pass_name, but that would be quite a weird API for the function).

The first reload mem leak is because we allocate the array and then immediately call init_eliminable_invariants (..., true), which allocates it again. The second set of leaks is because init_eliminable_invariants allocates those two arrays when it is called with do_subregs either true or false. In the former case they are freed by free_reg_equiv called at the end of reload (), but in the latter case nothing freed them.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2011-11-10 Jakub Jelinek ja...@redhat.com

* opts-common.c (generate_canonical_option): Free opt_text if it has been allocated here and not stored anywhere.
* passes.c (register_one_dump_file): Free full_name.
* reload1.c (reload): Don't allocate reg_max_ref_width here. (calculate_elim_costs_all_insns): Free offsets_at and offsets_known_at at the end and clear the pointers.
--- gcc/opts-common.c.jj2011-11-04 07:49:45.0 +0100 +++ gcc/opts-common.c 2011-11-10 15:35:27.917116296 +0100 @@ -304,6 +304,8 @@ generate_canonical_option (size_t opt_in decoded-canonical_option[0] = concat (opt_text, arg, NULL); decoded-canonical_option[1] = NULL; decoded-canonical_option_num_elements = 1; + if (opt_text != option-opt_text) + free (CONST_CAST (char *, opt_text)); } } else --- gcc/passes.c.jj 2011-11-08 23:35:12.0 +0100 +++ gcc/passes.c2011-11-10 15:13:34.789021796 +0100 @@ -409,6 +409,7 @@ register_one_dump_file (struct opt_pass set_pass_for_id (id, pass); full_name = concat (prefix, pass-name, num, NULL); register_pass_name (pass, full_name); + free (CONST_CAST (char *, full_name)); } /* Recursive worker function for register_dump_files. */ --- gcc/reload1.c.jj2011-11-08 23:35:12.0 +0100 +++ gcc/reload1.c 2011-11-10 15:31:16.100601192 +0100 @@ -768,7 +768,6 @@ reload (rtx first, int global) be substituted eventually by altering the REG-rtx's. */ grow_reg_equivs (); - reg_max_ref_width = XCNEWVEC (unsigned int, max_regno); reg_old_renumber = XCNEWVEC (short, max_regno); memcpy (reg_old_renumber, reg_renumber, max_regno * sizeof (short)); pseudo_forbidden_regs = XNEWVEC (HARD_REG_SET, max_regno); @@ -1688,6 +1687,10 @@ calculate_elim_costs_all_insns (void) } free (reg_equiv_init_cost); + free (offsets_known_at); + free (offsets_at); + offsets_at = NULL; + offsets_known_at = NULL; } /* Comparison function for qsort to decide which of two reloads Jakub
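The opts-common.c leak follows a common pattern: a string built by concatenation is used only as input to a second concatenation and is then dropped. A minimal sketch of the pattern and its fix follows. concat2 and canonical_option are hypothetical names standing in for libiberty's variadic concat and the GCC function, and the argument layout is an assumption.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical two-argument helper.  Like libiberty's concat, it returns
   freshly allocated memory that the caller owns.  */
char *
concat2 (const char *a, const char *b)
{
  size_t la = strlen (a), lb = strlen (b);
  char *r = malloc (la + lb + 1);

  if (!r)
    abort ();
  memcpy (r, a, la);
  memcpy (r + la, b, lb + 1);   /* Includes the terminating NUL.  */
  return r;
}

/* Build "<opt><sep><arg>" without leaking the intermediate string.  */
char *
canonical_option (const char *opt, const char *sep, const char *arg)
{
  char *opt_text, *result;

  assert (opt && sep && arg);
  opt_text = concat2 (opt, sep);      /* Temporary.  */
  result = concat2 (opt_text, arg);   /* This is what gets stored.  */
  free (opt_text);                    /* Not stored anywhere, so free it.  */
  return result;
}
```

The patch additionally guards the free with opt_text != option->opt_text, because in GCC the pointer may still refer to the option table's own string rather than to allocated memory.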
[PATCH] Free memory leaks in tree-vect-slp.c
Hi! This patch fixes some compiler memory leaks in SLP. For vect_free_oprnd_info I've removed the FREE_DEF_STMTS argument and am freeing the defs always, but set them to NULL when moving the vectors over elsewhere, because otherwise if vect_create_new_slp_node or vect_build_slp_tree fails after succeeding for a couple of iterations, we'd leak the rest or double free them. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2011-11-10 Jakub Jelinek ja...@redhat.com * tree-vect-slp.c (vect_free_slp_tree): Also free SLP_TREE_CHILDREN vector. (vect_create_new_slp_node): Don't allocate node before checking stmt type. (vect_free_oprnd_info): Remove FREE_DEF_STMTS argument, always free def_stmts vectors and additionally free oprnd_info. (vect_build_slp_tree): Adjust callers. Call it even if stop_recursion. If vect_create_new_slp_node or vect_build_slp_tree fails, properly handle freeing memory. If it succeeded, clear def_stmts in oprnd_info. --- gcc/tree-vect-slp.c.jj 2011-11-08 23:35:12.0 +0100 +++ gcc/tree-vect-slp.c 2011-11-10 16:17:33.583105311 +0100 @@ -75,8 +75,9 @@ vect_free_slp_tree (slp_tree node) return; FOR_EACH_VEC_ELT (slp_void_p, SLP_TREE_CHILDREN (node), i, child) -vect_free_slp_tree ((slp_tree)child); +vect_free_slp_tree ((slp_tree) child); + VEC_free (slp_void_p, heap, SLP_TREE_CHILDREN (node)); VEC_free (gimple, heap, SLP_TREE_SCALAR_STMTS (node)); if (SLP_TREE_VEC_STMTS (node)) @@ -102,7 +103,7 @@ vect_free_slp_instance (slp_instance ins static slp_tree vect_create_new_slp_node (VEC (gimple, heap) *scalar_stmts) { - slp_tree node = XNEW (struct _slp_tree); + slp_tree node; gimple stmt = VEC_index (gimple, scalar_stmts, 0); unsigned int nops; @@ -117,6 +118,7 @@ vect_create_new_slp_node (VEC (gimple, h else return NULL; + node = XNEW (struct _slp_tree); SLP_TREE_SCALAR_STMTS (node) = scalar_stmts; SLP_TREE_VEC_STMTS (node) = NULL; SLP_TREE_CHILDREN (node) = VEC_alloc (slp_void_p, heap, nops); @@ -152,21 +154,19 @@ vect_create_oprnd_info 
(int nops, int gr } -/* Free operands info. Free def-stmts in FREE_DEF_STMTS is true. - (FREE_DEF_STMTS is true when the SLP analysis fails, and false when it - succeds. In the later case we don't need the operands info that we used to - check isomorphism of the stmts, but we still need the def-stmts - they are - used as scalar stmts in SLP nodes. */ +/* Free operands info. */ + static void -vect_free_oprnd_info (VEC (slp_oprnd_info, heap) **oprnds_info, - bool free_def_stmts) +vect_free_oprnd_info (VEC (slp_oprnd_info, heap) **oprnds_info) { int i; slp_oprnd_info oprnd_info; - if (free_def_stmts) -FOR_EACH_VEC_ELT (slp_oprnd_info, *oprnds_info, i, oprnd_info) + FOR_EACH_VEC_ELT (slp_oprnd_info, *oprnds_info, i, oprnd_info) +{ VEC_free (gimple, heap, oprnd_info-def_stmts); + XDELETE (oprnd_info); +} VEC_free (slp_oprnd_info, heap, *oprnds_info); } @@ -502,7 +502,7 @@ vect_build_slp_tree (loop_vec_info loop_ print_gimple_stmt (vect_dump, stmt, 0, TDF_SLIM); } - vect_free_oprnd_info (oprnds_info, true); + vect_free_oprnd_info (oprnds_info); return false; } @@ -516,7 +516,7 @@ vect_build_slp_tree (loop_vec_info loop_ print_gimple_stmt (vect_dump, stmt, 0, TDF_SLIM); } - vect_free_oprnd_info (oprnds_info, true); + vect_free_oprnd_info (oprnds_info); return false; } @@ -532,7 +532,7 @@ vect_build_slp_tree (loop_vec_info loop_ print_gimple_stmt (vect_dump, stmt, 0, TDF_SLIM); } - vect_free_oprnd_info (oprnds_info, true); + vect_free_oprnd_info (oprnds_info); return false; } @@ -546,7 +546,7 @@ vect_build_slp_tree (loop_vec_info loop_ print_generic_expr (vect_dump, scalar_type, TDF_SLIM); } - vect_free_oprnd_info (oprnds_info, true); + vect_free_oprnd_info (oprnds_info); return false; } @@ -576,7 +576,7 @@ vect_build_slp_tree (loop_vec_info loop_ print_gimple_stmt (vect_dump, stmt, 0, TDF_SLIM); } - vect_free_oprnd_info (oprnds_info, true); + vect_free_oprnd_info (oprnds_info); return false; } } @@ -611,7 +611,7 @@ vect_build_slp_tree (loop_vec_info loop_ { if 
(vect_print_dump_info (REPORT_SLP)) fprintf (vect_dump, Build SLP failed: no optab.); - vect_free_oprnd_info (oprnds_info, true); + vect_free_oprnd_info (oprnds_info); return false; } icode = (int) optab_handler (optab, vec_mode); @@
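The ownership rule described in this patch (always free the def_stmts vectors, but clear the pointer once a vector has been handed to someone else) can be illustrated outside GCC. The following is only a sketch under stated assumptions: OprndInfo, make_info, take_def_stmts, free_info and live_buffers are hypothetical stand-ins, not GCC's VEC API or the actual vect_free_oprnd_info code, and malloc failure handling is omitted for brevity.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical stand-in for slp_oprnd_info: a record that owns a buffer.
struct OprndInfo {
  int *def_stmts;  // owned here unless ownership has been transferred
};

// Counts buffers that have been allocated and not yet freed by free_info.
static int live_buffers = 0;

// Allocate a record together with its buffer (no error checks in this sketch).
static OprndInfo *make_info() {
  OprndInfo *info = static_cast<OprndInfo *>(std::malloc(sizeof(OprndInfo)));
  info->def_stmts = static_cast<int *>(std::malloc(4 * sizeof(int)));
  ++live_buffers;
  return info;
}

// Hand the buffer to a new owner and clear the old pointer, so that a
// later unconditional cleanup cannot free it a second time.
static int *take_def_stmts(OprndInfo *info) {
  int *moved = info->def_stmts;
  info->def_stmts = nullptr;
  return moved;
}

// Unconditional cleanup: frees the buffer only if it is still owned here,
// then frees the record itself.
static void free_info(OprndInfo *info) {
  if (info->def_stmts != nullptr) {
    std::free(info->def_stmts);
    --live_buffers;
  }
  std::free(info);
}
```

With this shape, a failure path can call the cleanup on every record regardless of how far the loop got: records whose buffer was already moved are skipped, the rest are released.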
[libffi] Use GCC_AS_CFI_PSEUDO_OP
Previously, I split out this exact configure fragment to config/asmcfi.m4 for use in libitm. This just tidies the original use in libffi so that we don't have duplicates. Tested on x86_64-linux and committed. r~ commit 022a1701c4517308af026c64c707883358b37f26 Author: rth rth@138bc75d-0d04-0410-961f-82ee72b054a4 Date: Thu Nov 10 19:34:57 2011 + * configure.ac (GCC_AS_CFI_PSEUDO_OP): Use it instead of inline check. * configure, aclocal.m4: Rebuild. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@181266 138bc75d-0d04-0410-961f-82ee72b054a4 diff --git a/libffi/ChangeLog b/libffi/ChangeLog index a9d240a..2c34801 100644 --- a/libffi/ChangeLog +++ b/libffi/ChangeLog @@ -1,3 +1,8 @@ +2011-11-10 Richard Henderson r...@redhat.com + + * configure.ac (GCC_AS_CFI_PSEUDO_OP): Use it instead of inline check. + * configure, aclocal.m4: Rebuild. + 2011-09-04 Iain Sandoe ia...@gcc.gnu.org PR libffi/49594 diff --git a/libffi/aclocal.m4 b/libffi/aclocal.m4 index f7ef2f8..9d6a669 100644 --- a/libffi/aclocal.m4 +++ b/libffi/aclocal.m4 @@ -1025,6 +1025,7 @@ AC_SUBST([am__tar]) AC_SUBST([am__untar]) ]) # _AM_PROG_TAR +m4_include([../config/asmcfi.m4]) m4_include([../config/depstand.m4]) m4_include([../config/lead-dot.m4]) m4_include([../config/multi.m4]) diff --git a/libffi/configure b/libffi/configure index 6478747..57ccc55 100755 --- a/libffi/configure +++ b/libffi/configure @@ -12282,11 +12282,11 @@ $as_echo #define AC_APPLE_UNIVERSAL_BUILD 1 confdefs.h { $as_echo $as_me:${as_lineno-$LINENO}: checking assembler .cfi pseudo-op support 5 $as_echo_n checking assembler .cfi pseudo-op support... 6; } -if test ${libffi_cv_as_cfi_pseudo_op+set} = set; then : +if test ${gcc_cv_as_cfi_pseudo_op+set} = set; then : $as_echo_n (cached) 6 else -libffi_cv_as_cfi_pseudo_op=unknown +gcc_cv_as_cfi_pseudo_op=unknown cat confdefs.h - _ACEOF conftest.$ac_ext /* end confdefs.h. 
*/ asm (.cfi_startproc\n\t.cfi_endproc); @@ -12299,20 +12299,21 @@ main () } _ACEOF if ac_fn_c_try_compile $LINENO; then : - libffi_cv_as_cfi_pseudo_op=yes + gcc_cv_as_cfi_pseudo_op=yes else - libffi_cv_as_cfi_pseudo_op=no + gcc_cv_as_cfi_pseudo_op=no fi rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext fi -{ $as_echo $as_me:${as_lineno-$LINENO}: result: $libffi_cv_as_cfi_pseudo_op 5 -$as_echo $libffi_cv_as_cfi_pseudo_op 6; } -if test x$libffi_cv_as_cfi_pseudo_op = xyes; then +{ $as_echo $as_me:${as_lineno-$LINENO}: result: $gcc_cv_as_cfi_pseudo_op 5 +$as_echo $gcc_cv_as_cfi_pseudo_op 6; } + if test x$gcc_cv_as_cfi_pseudo_op = xyes; then $as_echo #define HAVE_AS_CFI_PSEUDO_OP 1 confdefs.h -fi + fi + if test x$TARGET = xSPARC; then { $as_echo $as_me:${as_lineno-$LINENO}: checking assembler and linker support unaligned pc related relocs 5 diff --git a/libffi/configure.ac b/libffi/configure.ac index d16155a..2c67335 100644 --- a/libffi/configure.ac +++ b/libffi/configure.ac @@ -228,17 +228,7 @@ AC_SUBST(HAVE_LONG_DOUBLE) AC_C_BIGENDIAN -AC_CACHE_CHECK([assembler .cfi pseudo-op support], -libffi_cv_as_cfi_pseudo_op, [ -libffi_cv_as_cfi_pseudo_op=unknown -AC_TRY_COMPILE([asm (.cfi_startproc\n\t.cfi_endproc);],, - [libffi_cv_as_cfi_pseudo_op=yes], - [libffi_cv_as_cfi_pseudo_op=no]) -]) -if test x$libffi_cv_as_cfi_pseudo_op = xyes; then -AC_DEFINE(HAVE_AS_CFI_PSEUDO_OP, 1, - [Define if your assembler supports .cfi_* directives.]) -fi +GCC_AS_CFI_PSEUDO_OP if test x$TARGET = xSPARC; then AC_CACHE_CHECK([assembler and linker support unaligned pc related relocs],
Re: [patch] c++/2972 warn when ctor-initializer leaves uninitialized data
On 7 November 2011 21:47, Jason Merrill wrote: On 11/07/2011 04:43 PM, Jonathan Wakely wrote: Unfortunately this doesn't work very well in C++11 mode, as defaulted constructors don't cause warnings when they should do e.g. Maybe check this in defaulted_late_check? I tried that (attached) and it does cause warnings in defaulted constructed, but even for members with an NSDMI, which are not uninitialized. Index: c-family/c.opt === --- c-family/c.opt (revision 181173) +++ c-family/c.opt (working copy) @@ -461,6 +461,10 @@ Wmain C ObjC C++ ObjC++ Var(warn_main) Init(-1) Warning Warn about suspicious declarations of \main\ +Wmeminit +C++ Var(warn_meminit) Warning +Warn about POD members which are not initialized in a constructor initialization list + Wmissing-braces C ObjC C++ ObjC++ Var(warn_missing_braces) Warning Warn about possibly missing braces around initializers Index: c-family/c-opts.c === --- c-family/c-opts.c (revision 181173) +++ c-family/c-opts.c (working copy) @@ -550,7 +550,14 @@ c_common_handle_option (size_t scode, co case OPT_Weffc__: warn_ecpp = value; if (value) -warn_nonvdtor = true; +{ + /* Effective C++ rule 12 says to prefer using a mem-initializer + to assignment. */ + warn_meminit = true; + /* Effective C++ rule 14 says to declare destructors virtual + in polymorphic classes. */ + warn_nonvdtor = true; +} break; case OPT_ansi: Index: cp/init.c === --- cp/init.c (revision 181173) +++ cp/init.c (working copy) @@ -485,6 +485,42 @@ build_value_init_noctor (tree type, tsub return build_zero_init (type, NULL_TREE, /*static_storage_p=*/false); } +/* Warn if default initialization of MEMBER of type TYPE in constructor + * CONS will leave some parts uninitialized. 
*/ + +static void +warn_meminit_leaves_uninitialized (tree member, tree type, tree cons) +{ + tree field = default_init_uninitialized_part (type); + if (!field) +return; + + if (DECL_P (field)) +warning_at (DECL_SOURCE_LOCATION (cons), OPT_Wmeminit, +no member initializer for %qD so %q+#D is uninitialized, +member, field); + else +warning_at (DECL_SOURCE_LOCATION (cons), OPT_Wmeminit, +no member initializer for %qD so it is uninitialized, +member); +} + +/* Warn if defaulted constructor CONS for TYPE with no mem-initializer-list + will leave uninitialized data. */ + +void +warn_missing_meminits (tree type, tree cons) +{ + tree mem_inits = sort_mem_initializers (type, NULL_TREE); + while (mem_inits) +{ + tree member = TREE_PURPOSE (mem_inits); + /* TODO do not warn if brace-or-equal-initializer */ + warn_meminit_leaves_uninitialized (member, TREE_TYPE (member), cons); + mem_inits = TREE_CHAIN (mem_inits); +} +} + /* Initialize MEMBER, a FIELD_DECL, with INIT, a TREE_LIST of arguments. If TREE_LIST is void_type_node, an empty initializer list was given; if NULL_TREE no initializer was given. */ @@ -518,12 +554,10 @@ perform_member_init (tree member, tree i } } - /* Effective C++ rule 12 requires that all data members be - initialized. */ - if (warn_ecpp init == NULL_TREE TREE_CODE (type) != ARRAY_TYPE) -warning_at (DECL_SOURCE_LOCATION (current_function_decl), OPT_Weffc__, - %qD should be initialized in the member initialization list, - member); + /* Warn if there is no initializer for a member which will be left + uninitialized. */ + if (warn_meminit init == NULL_TREE) + warn_meminit_leaves_uninitialized (member, type, current_function_decl); /* Get an lvalue for the data member. 
*/ decl = build_class_member_access_expr (current_class_ref, member, Index: cp/method.c === --- cp/method.c (revision 181173) +++ cp/method.c (working copy) @@ -1655,6 +1655,10 @@ defaulted_late_check (tree fn) if (DECL_DELETED_FN (implicit_fn)) DECL_DELETED_FN (fn) = 1; + + if (warn_meminit (kind == sfk_constructor || kind == sfk_copy_constructor +|| kind == sfk_move_constructor)) +warn_missing_meminits (current_class_type, fn); } /* Returns true iff FN can be explicitly defaulted, and gives any Index: cp/cp-tree.h === --- cp/cp-tree.h (revision 181173) +++ cp/cp-tree.h (working copy) @@ -4915,6 +4915,7 @@ extern bool user_provided_p (tree); extern bool type_has_user_provided_constructor (tree); extern bool type_has_user_provided_default_constructor (tree); extern tree default_init_uninitialized_part (tree); +extern void warn_missing_meminits (tree, tree); extern bool trivial_default_constructor_is_constexpr
Re: [PATCH] Fold VEC_PERM_EXPR/VEC_INTERLEAVE*EXPR/VEC_EXTRACT*EXPR with VECTOR_CST/CONSTRUCTOR arguments (PR tree-optimization/51074)
On 11/10/2011 11:09 AM, Jakub Jelinek wrote:
> +  if (TREE_CODE (arg0) == VECTOR_CST)
> +    {
> +      for (i = 0, t = TREE_VECTOR_CST_ELTS (arg0);
> +           i < nelements && t; i++, t = TREE_CHAIN (t))
> +        elements[i] = TREE_VALUE (t);
> +      if (t)
> +        return NULL_TREE;
> +    }
> +  else
> +    FOR_EACH_VEC_ELT (constructor_elt, CONSTRUCTOR_ELTS (arg0), i, elt)
> +      if (i >= nelements)
> +        return NULL_TREE;
> +      else
> +        elements[i] = elt->value;
> +  if (i < nelements)
> +    return NULL_TREE;

Subroutine.

> +  if (TREE_CODE (arg0) == VECTOR_CST)
> +    {
> +      for (i = 0, t = TREE_VECTOR_CST_ELTS (arg1);

Typo in test -- arg1. Not that you'll make that mistake after reusing the above subroutine. ;-)

> +  for (i = 0; i < nelements; i++)
> +    {
> +      unsigned int idx;
> +      switch (code)
> +        {
> +        case VEC_EXTRACT_EVEN_EXPR:
> +          idx = i * 2;
> +          break;
> +        case VEC_EXTRACT_ODD_EXPR:
> +          idx = i * 2 + 1;
> +          break;
> +        case VEC_INTERLEAVE_HIGH_EXPR:
> +          idx = (i + nelements) / 2 + ((i & 1) ? nelements : 0);
> +          break;
> +        case VEC_INTERLEAVE_LOW_EXPR:
> +          idx = i / 2 + ((i & 1) ? nelements : 0);
> +          break;
> +        default:
> +          gcc_unreachable ();
> +        }
> +
> +      if (!CONSTANT_CLASS_P (elements[idx]))
> +        need_ctor = true;
> +      elements[i + 2 * nelements] = elements[idx];
> +    }
> +
> +  if (need_ctor)
> +    {
> +      VEC(constructor_elt,gc) *v
> +        = VEC_alloc (constructor_elt, gc, nelements);
> +      for (i = 0; i < nelements; i++)
> +        CONSTRUCTOR_APPEND_ELT (v, NULL_TREE,
> +                                elements[2 * nelements + i]);
> +      return build_constructor (type, v);
> +    }
> +  else
> +    {
> +      tree vals = NULL_TREE;
> +      for (i = 0; i < nelements; i++)
> +        vals = tree_cons (NULL_TREE,
> +                          elements[3 * nelements - i - 1], vals);
> +      return build_vector (type, vals);
> +    }

From need_ctor on, definitely a subroutine.

It's tempting to suggest that you build an integral array of the indices so that this whole block can be shared with vec_perm.
> +  for (i = 0, t = TREE_VECTOR_CST_ELTS (arg2);
> +       i < nelements && t; i++, t = TREE_CHAIN (t))
> +    {
> +      unsigned HOST_WIDE_INT idx;
> +      if (!host_integerp (TREE_VALUE (t), 1))
> +        return NULL_TREE;
> +      idx = tree_low_cst (TREE_VALUE (t), 1);
> +      if (idx >= nelements * 2)
> +        return NULL_TREE;

VEC_PERM_EXPR is explicitly modulo. Don't fail, mask.

r~
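The review suggestion to build an integral array of indices, so that the extract/interleave folding can share code with the permute case, might look roughly like the sketch below. It reuses the index formulas from the quoted patch, but everything else is an assumption made for illustration: VecOp, selector_for and permute are hypothetical names, and plain std::vector<int> stands in for GCC's tree vectors.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the four tree codes handled in the patch.
enum class VecOp { ExtractEven, ExtractOdd, InterleaveHigh, InterleaveLow };

// For each of the nelts result lanes, compute the index into the
// concatenation {arg0, arg1} (2 * nelts lanes), using the patch's formulas.
std::vector<unsigned> selector_for(VecOp op, unsigned nelts) {
  std::vector<unsigned> sel(nelts);
  for (unsigned i = 0; i < nelts; ++i) {
    switch (op) {
      case VecOp::ExtractEven:
        sel[i] = i * 2;
        break;
      case VecOp::ExtractOdd:
        sel[i] = i * 2 + 1;
        break;
      case VecOp::InterleaveHigh:
        sel[i] = (i + nelts) / 2 + ((i & 1) ? nelts : 0);
        break;
      case VecOp::InterleaveLow:
        sel[i] = i / 2 + ((i & 1) ? nelts : 0);
        break;
    }
  }
  return sel;
}

// Shared "permute" step: pick each result lane from arg0 or arg1 according
// to the selector. Every selector entry must be below 2 * arg0.size().
std::vector<int> permute(const std::vector<int> &arg0,
                         const std::vector<int> &arg1,
                         const std::vector<unsigned> &sel) {
  const std::size_t n = arg0.size();
  std::vector<int> out(sel.size());
  for (std::size_t i = 0; i < sel.size(); ++i)
    out[i] = sel[i] < n ? arg0[sel[i]] : arg1[sel[i] - n];
  return out;
}
```

Once the selector is an ordinary integer array, the constant-folding step no longer needs to know which of the four operations produced it.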
Re: [PATCH] Fold VEC_PERM_EXPR/VEC_INTERLEAVE*EXPR/VEC_EXTRACT*EXPR with VECTOR_CST/CONSTRUCTOR arguments (PR tree-optimization/51074)
On 11/10/2011 12:00 PM, Richard Henderson wrote:
> VEC_PERM_EXPR is explicitly modulo. Don't fail, mask.

It does occur to me that we could usefully fold a constant selector with out-of-range elements to a new selector with in-range elements, even if the other operands are non-constant.

This might result in more masks being shared, should they get spilled to memory.

r~
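As a rough illustration of that idea, a constant selector can be normalized on its own by reducing every element modulo twice the vector length, independent of the data operands. This is a sketch only: normalize_selector is a hypothetical helper, not a GCC function, and it uses the % operator so that it does not depend on the element count being a power of two (where a bit mask would do the same job).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reduce each selector element into the range [0, 2 * nelts), where nelts
// is the number of lanes. Two selectors that differ only by out-of-range
// values then become identical constants.
std::vector<unsigned long> normalize_selector(
    const std::vector<unsigned long> &sel) {
  const unsigned long span = 2 * static_cast<unsigned long>(sel.size());
  std::vector<unsigned long> out(sel.size());
  for (std::size_t i = 0; i < sel.size(); ++i)
    out[i] = sel[i] % span;
  return out;
}
```

For a four-lane selector the span is 8, so {0, 9, 17, 6} and {0, 1, 1, 6} normalize to the same constant, which is what would allow such masks to be shared.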
Re: [patch] c++/2972 warn when ctor-initializer leaves uninitialized data
On 11/10/2011 02:48 PM, Jonathan Wakely wrote:
> +warn_missing_meminits (tree type, tree cons)
> +{
> +  tree mem_inits = sort_mem_initializers (type, NULL_TREE);
> +  while (mem_inits)
> +    {
> +      tree member = TREE_PURPOSE (mem_inits);
> +      /* TODO do not warn if brace-or-equal-initializer */
> +      warn_meminit_leaves_uninitialized (member, TREE_TYPE (member), cons);
> +      mem_inits = TREE_CHAIN (mem_inits);
> +    }
> +}

Check DECL_INITIAL (member) to tell if it has an NSDMI.

Jason
Re: [patch] c++/2972 warn when ctor-initializer leaves uninitialized data
On 11/10/2011 03:10 PM, Jason Merrill wrote:
> On 11/10/2011 02:48 PM, Jonathan Wakely wrote:
>> +warn_missing_meminits (tree type, tree cons)
>> +{
>> +  tree mem_inits = sort_mem_initializers (type, NULL_TREE);
>> +  while (mem_inits)
>> +    {
>> +      tree member = TREE_PURPOSE (mem_inits);
>> +      /* TODO do not warn if brace-or-equal-initializer */
>> +      warn_meminit_leaves_uninitialized (member, TREE_TYPE (member), cons);
>> +      mem_inits = TREE_CHAIN (mem_inits);
>> +    }
>> +}
>
> Check DECL_INITIAL (member) to tell if it has an NSDMI.

Actually, why not just use default_init_uninitialized_part (type)?

>> +  if (warn_meminit && (kind == sfk_constructor || kind == sfk_copy_constructor
>> +                       || kind == sfk_move_constructor))
>> +    warn_missing_meminits (current_class_type, fn);

We only want to do this for sfk_constructor; the others initialize all fields.

Jason
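The three cases discussed in this subthread can be seen in a small C++11 example. This is an illustrative sketch, not part of the patch or its testsuite; the type names are made up, and the comments describe what the proposed -Wmeminit warning is intended to do according to the discussion, not the behaviour of any released compiler.

```cpp
#include <cassert>

// Defaulted default constructor, member without an NSDMI: after
// "NoInit n;" the value of n.x is indeterminate. This is the case the
// proposed warning is meant to catch.
struct NoInit {
  int x;
  NoInit() = default;
};

// Same shape, but the member has an NSDMI (brace-or-equal-initializer),
// so the defaulted constructor initializes it and no warning is wanted.
struct WithNsdmi {
  int x = 7;
  WithNsdmi() = default;
};

// Defaulted copy constructor: every member is copied from the source
// object, so nothing is left uninitialized by the copy itself.
struct Copyable {
  int x;
  explicit Copyable(int v) : x(v) {}
  Copyable(const Copyable &) = default;
};
```

The example deliberately never reads NoInit::x after default-initialization, since doing so would be undefined behaviour.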
[DOC PATCH]: Remove wrong explanation w.r.t. const handling for predicates, defined with define_predicate
Hello!

Predicates defined with define_predicate do not handle CONST_INT and CONST_DOUBLE operands at all, let alone provide any sort of special bypass for them. Just remove the wrong text to save some poor soul from falling into this trap in the future.

2011-11-10  Uros Bizjak  ubiz...@gmail.com

	* doc/md.texi (Defining Machine-Specific Predicates): Remove wrong
	explanation that predicates written with define_predicate provide
	special handling of CONST_INT and CONST_DOUBLE operands.

Tested by make doc in gcc directory. OK for mainline SVN?

Uros.

Index: doc/md.texi
===
--- doc/md.texi (revision 181258)
+++ doc/md.texi (working copy)
@@ -1001,16 +1001,7 @@
 Predicates written with @code{define_predicate} automatically include a
 test that @var{mode} is @code{VOIDmode}, or @var{op} has the same
-mode as @var{mode}, or @var{op} is a @code{CONST_INT} or
-@code{CONST_DOUBLE}. They do @emph{not} check specifically for
-integer @code{CONST_DOUBLE}, nor do they test that the value of either
-kind of constant fits in the requested mode. This is because
-target-specific predicates that take constants usually have to do more
-stringent value checks anyway. If you need the exact same treatment
-of @code{CONST_INT} or @code{CONST_DOUBLE} that the generic predicates
-provide, use a @code{MATCH_OPERAND} subexpression to call
-@code{const_int_operand}, @code{const_double_operand}, or
-@code{immediate_operand}.
+mode as @var{mode}.
 
 Predicates written with @code{define_special_predicate} do not get any
 automatic mode checks, and are treated as having special mode handling
Re: [PATCH] PR debug/50983
Thanks. I reformatted the patch a bit and added a testcase; here's what I'm checking in.

commit 91eed4ebec24bbb2993c1ca8a5407f4fdeff48ec
Author: Jason Merrill ja...@redhat.com
Date:   Thu Nov 10 00:11:13 2011 -0500

    PR debug/50983
    * dwarf2out.c (set_cur_line_info_table): Restore the last is_stmt
    value in the current line table.

diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index 39be9a1..7b5930e 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -20371,6 +20371,10 @@ set_cur_line_info_table (section *sec)
       VEC_safe_push (dw_line_info_table_p, gc, separate_line_info, table);
     }
 
+  if (DWARF2_ASM_LINE_DEBUG_INFO)
+    table->is_stmt = (cur_line_info_table
+                      ? cur_line_info_table->is_stmt
+                      : DWARF_LINE_DEFAULT_IS_STMT_START);
   cur_line_info_table = table;
 }
 
diff --git a/gcc/testsuite/gcc.dg/debug/dwarf2/asm-line1.c b/gcc/testsuite/gcc.dg/debug/dwarf2/asm-line1.c
new file mode 100644
index 000..1d2e148
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/debug/dwarf2/asm-line1.c
@@ -0,0 +1,20 @@
+/* PR debug/50983 */
+/* { dg-do compile { target *-*-linux-gnu } } */
+/* { dg-options "-O0 -gdwarf-2" } */
+/* { dg-final { scan-assembler "is_stmt 1" } } */
+
+int i;
+void f() __attribute ((section ("foo")));
+void f() { if (i) ++i; else --i; }
+
+void fun()
+{
+  return;
+}
+
+int main()
+{
+  f();
+  fun();
+  return 0;
+}
Re: [libitm] Work around missing AVX support
On 10 Nov 2011, at 17:12, Richard Henderson wrote: On 11/10/2011 12:16 AM, Jakub Jelinek wrote: On Wed, Nov 09, 2011 at 04:32:58PM -0800, Richard Henderson wrote: Not pretty at all. But given the corresponding irritation in writing assembler wrapper functions, it seems like it's about a wash. Tested with and without HAVE_AS_AVX on x86_64-linux. Shouldn't -mavx be also not passed in that case? Then you wouldn't need to undef __AVX__ and we wouldn't risk gcc doesn't decide to optimize memcpy or something similar using AVX instructions... You are correct. Thanks for noticing this; I was a bit frazzled after fighting with autofoo for so long yesterday. Tested on x86_64-linux, with avx and with avx forcibly disabled. As of r181262 things are looking much better; all the files build ... ... we now have the following failure linking the library on i686- darwin9 and x86-64-darwin10: libtool: link: /GCC/gcc-4-7-trunk-build/./gcc/xgcc -B/GCC/gcc-4-7- trunk-build/./gcc/ -B/GCC/gcc-4-7-install/i686-apple-darwin9/bin/ -B/ GCC/gcc-4-7-install/i686-apple-darwin9/lib/ -isystem /GCC/gcc-4-7- install/i686-apple-darwin9/include -isystem /GCC/gcc-4-7-install/i686- apple-darwin9/sys-include -m64 -dynamiclib -Wl,-undefined - Wl,dynamic_lookup -o .libs/libitm.0.dylib .libs/aatree.o .libs/ alloc.o .libs/alloc_c.o .libs/alloc_cpp.o .libs/barrier.o .libs/ beginend.o .libs/clone.o .libs/eh_cpp.o .libs/local.o .libs/ query.o .libs/retry.o .libs/rwlock.o .libs/useraction.o .libs/ util.o .libs/sjlj.o .libs/tls.o .libs/method-serial.o .libs/method- gl.o .libs/x86_sse.o .libs/x86_avx.o-m64 -pthread -pthread -m64 - m64 -pthread -install_name /GCC/gcc-4-7-install/lib/gcc/i686-apple- darwin9/4.7.0/x86_64/libitm.0.dylib -compatibility_version 1 - current_version 1.0 -Wl,-single_module ld: codegen problem, can't use rel32 to external symbol in __ITM_malloc from .libs/alloc_c.o collect2: error: ld returned 1 exit status (I think the symbol in question is an __emutls var) ... 
the objects appear to be correctly x86-64 and the link line has - m64 ... so there's a codegen issue somewhere, will try an investigate tomorrow. -- The m32 version builds OK on i686-darwin9 (haven't been able to try on Darwin10 .. machine is busy) Iain
C++ PATCH for c++/51079 (DR 495, checking second conv before template)
DR 495 changed the order of these rules. Tested x86_64-pc-linux-gnu, applying to trunk. commit dc49a72a22b10b39edc054414537bda44ce82546 Author: Jason Merrill ja...@redhat.com Date: Thu Nov 10 14:59:15 2011 -0500 PR c++/51079, DR 495 * call.c (joust): Check the second conversion sequence before checking templates. diff --git a/gcc/cp/call.c b/gcc/cp/call.c index 578905e..e81950c 100644 --- a/gcc/cp/call.c +++ b/gcc/cp/call.c @@ -8109,6 +8109,22 @@ joust (struct z_candidate *cand1, struct z_candidate *cand2, bool warn) if (winner) return winner; + /* DR 495 moved this tiebreaker above the template ones. */ + /* or, if not that, + the context is an initialization by user-defined conversion (see + _dcl.init_ and _over.match.user_) and the standard conversion + sequence from the return type of F1 to the destination type (i.e., + the type of the entity being initialized) is a better conversion + sequence than the standard conversion sequence from the return type + of F2 to the destination type. */ + + if (cand1-second_conv) +{ + winner = compare_ics (cand1-second_conv, cand2-second_conv); + if (winner) + return winner; +} + /* or, if not that, F1 is a non-template function and F2 is a template function specialization. */ @@ -8137,21 +8153,6 @@ joust (struct z_candidate *cand1, struct z_candidate *cand2, bool warn) return winner; } - /* or, if not that, - the context is an initialization by user-defined conversion (see - _dcl.init_ and _over.match.user_) and the standard conversion - sequence from the return type of F1 to the destination type (i.e., - the type of the entity being initialized) is a better conversion - sequence than the standard conversion sequence from the return type - of F2 to the destination type. 
*/ - - if (cand1-second_conv) -{ - winner = compare_ics (cand1-second_conv, cand2-second_conv); - if (winner) - return winner; -} - /* Check whether we can discard a builtin candidate, either because we have two identical ones or matching builtin and non-builtin candidates. diff --git a/gcc/testsuite/g++.dg/template/conv12.C b/gcc/testsuite/g++.dg/template/conv12.C new file mode 100644 index 000..e6af054 --- /dev/null +++ b/gcc/testsuite/g++.dg/template/conv12.C @@ -0,0 +1,25 @@ +// PR c++/51079 + +#if __cplusplus 199711L +struct C1 +{ + template class T + operator T() = delete; // { dg-message declared here { target c++11 } } + operator bool() { return false; } +} c1; + +int ic1 = c1; // { dg-error deleted { target c++11 } } +int ac1 = c1 + c1; // { dg-error deleted { target c++11 } } +#endif + +struct C2 +{ +private: + template class T + operator T(); // { dg-error private } +public: + operator bool() { return false; } +} c2; + +int ic2 = c2; // { dg-error } +int ac2 = c2 + c2; // { dg-error }
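The effect of the reordered tiebreakers can be seen in a reduced form of the new testcase. The sketch below is not the committed test: struct C and convert_to_int are made-up names, and the conversion functions are given bodies so the choice becomes observable at run time. It assumes a compiler that implements DR 495, where the second standard conversion sequence is compared before the non-template-over-template rule.

```cpp
#include <cassert>

struct C {
  // Template conversion function: for a conversion to int, T is deduced
  // as int, so the second standard conversion is an identity conversion.
  template <class T>
  operator T() { return T(42); }

  // Non-template conversion function: reaching int from here needs an
  // additional bool -> int promotion after the call.
  operator bool() { return false; }
};

// Copy-initialization of an int from a C. With the DR 495 ordering the
// better second conversion (identity) decides before the rule that
// prefers non-templates is consulted, so the template is selected.
int convert_to_int() {
  C c;
  int i = c;
  return i;
}
```

Under the pre-DR 495 ordering the non-template operator bool would have been preferred instead, which matches the testcase's expectation that the deleted (or private) template is now the one diagnosed.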
Re: [libitm] Work around missing AVX support
On 11/10/2011 03:25 PM, Iain Sandoe wrote: libtool: link: /GCC/gcc-4-7-trunk-build/./gcc/xgcc -B/GCC/gcc-4-7-trunk-build/./gcc/ -B/GCC/gcc-4-7-install/i686-apple-darwin9/bin/ -B/GCC/gcc-4-7-install/i686-apple-darwin9/lib/ -isystem /GCC/gcc-4-7-install/i686-apple-darwin9/include -isystem /GCC/gcc-4-7-install/i686-apple-darwin9/sys-include -m64 -dynamiclib -Wl,-undefined -Wl,dynamic_lookup -o .libs/libitm.0.dylib .libs/aatree.o .libs/alloc.o .libs/alloc_c.o .libs/alloc_cpp.o .libs/barrier.o .libs/beginend.o .libs/clone.o .libs/eh_cpp.o .libs/local.o .libs/query.o .libs/retry.o .libs/rwlock.o .libs/useraction.o .libs/util.o .libs/sjlj.o .libs/tls.o .libs/method-serial.o .libs/method-gl.o .libs/x86_sse.o .libs/x86_avx.o -m64 -pthread -pthread -m64 -m64 -pthread -install_name /GCC/gcc-4-7-install/lib/gcc/i686-apple-darwin9/4.7.0/x86_64/libitm.0.dylib -compatibility_version 1 -current_version 1.0 -Wl,-single_module ld: codegen problem, can't use rel32 to external symbol in __ITM_malloc from .libs/alloc_c.o collect2: error: ld returned 1 exit status (I think the symbol in question is an __emutls var) The symbol _ITM_malloc is in libitm. Maybe the problem is an extra _ before the _ITM_malloc? Patrick.
Re: New port^2: Renesas RL78
> +# non-PIC targets always get an array-bounds error in thread_prologue_and_epilogue_insns
> +function.o-warn = -Wno-error

Didn't we find another way to fix this? In any case this is not present in your changelog.

Otherwise the port is looking ok.

r~
Re: [libitm] Work around missing AVX support
On 10 Nov 2011, at 20:33, Patrick Marlier wrote: On 11/10/2011 03:25 PM, Iain Sandoe wrote: libtool: link: /GCC/gcc-4-7-trunk-build/./gcc/xgcc -B/GCC/gcc-4-7-trunk-build/./gcc/ -B/GCC/gcc-4-7-install/i686-apple-darwin9/bin/ -B/GCC/gcc-4-7-install/i686-apple-darwin9/lib/ -isystem /GCC/gcc-4-7-install/i686-apple-darwin9/include -isystem /GCC/gcc-4-7-install/i686-apple-darwin9/sys-include -m64 -dynamiclib -Wl,-undefined -Wl,dynamic_lookup -o .libs/libitm.0.dylib .libs/ aatree.o .libs/alloc.o .libs/alloc_c.o .libs/alloc_cpp.o .libs/barrier.o .libs/beginend.o .libs/clone.o .libs/eh_cpp.o .libs/local.o .libs/query.o .libs/retry.o .libs/rwlock.o .libs/useraction.o .libs/util.o .libs/sjlj.o .libs/tls.o .libs/method-serial.o .libs/method-gl.o .libs/x86_sse.o .libs/x86_avx.o -m64 -pthread - pthread -m64 -m64 -pthread -install_name /GCC/gcc-4-7-install/lib/gcc/i686-apple-darwin9/4.7.0/x86_64/libitm. 0.dylib -compatibility_version 1 -current_version 1.0 -Wl,-single_module ld: codegen problem, can't use rel32 to external symbol in __ITM_malloc from .libs/alloc_c.o collect2: error: ld returned 1 exit status (I think the symbol in question is an __emutls var) The symbol _ITM_malloc is in libitm. Maybe the problem is an extra _ before the _ITM_malloc? Actually, I think the missing symbol is ___emutls_v._ZN3GTM12_gtm_thr_tlsE and (although the m32 lib builds OK - the symbol is also missing there). The m64 build fails because of the -Wl,-undefined -Wl,dynamic_lookup in combination with the missing var. the m32 build succeed - but none of the testsuite runs, because the emutls var is missing (not resolved at load). ... I strongly suspect it might be another manifestation of: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50598 Has anyone succeeded in building libitm on an emutls target? Iain
Re: New port^2: Renesas RL78
> Didn't we find another way to fix this? In any case this is not present in your changelog.

Yes, please ignore that. I do svn diff and then have to cut out all the bits that aren't part of the base port itself.
C++ PATCH for c++/50973 (ICE with defaulted virtual destructor)
Here the problem was that we were calling use_thunk before we knew what the right linkage for the function it's thunking to was. Fixed by deferring synthesis of virtual dtors until EOF. Tested x86_64-pc-linux-gnu, applying to trunk. commit 566d5469261e63f8359998386b3b7c60ecd5e2ba Author: Jason Merrill ja...@redhat.com Date: Wed Nov 9 15:53:04 2011 -0500 PR c++/50973 * decl2.c (mark_used): Defer synthesis of virtual functions. * method.c (use_thunk): Make sure the target function has DECL_INTERFACE_KNOWN. diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c index 4e24755..05f4b42 100644 --- a/gcc/cp/decl2.c +++ b/gcc/cp/decl2.c @@ -4347,6 +4347,14 @@ mark_used (tree decl) !DECL_DEFAULTED_OUTSIDE_CLASS_P (decl) ! DECL_INITIAL (decl)) { + /* Defer virtual destructors so that thunks get the right + linkage. */ + if (DECL_VIRTUAL_P (decl) !at_eof) + { + note_vague_linkage_fn (decl); + return true; + } + /* Remember the current location for a function we will end up synthesizing. Then we can inform the user where it was required in the case of error. */ @@ -4358,7 +4366,7 @@ mark_used (tree decl) on the stack (such as overload resolution candidates). We could just let cp_write_global_declarations handle synthesizing - this function, since we just added it to deferred_fns, but doing + this function by adding it to deferred_fns, but doing it at the use site produces better error messages. */ ++function_depth; synthesize_method (decl); diff --git a/gcc/cp/method.c b/gcc/cp/method.c index bb58312..8101f8a 100644 --- a/gcc/cp/method.c +++ b/gcc/cp/method.c @@ -339,6 +339,7 @@ use_thunk (tree thunk_fndecl, bool emit_p) DECL_EXTERNAL (thunk_fndecl) = 0; /* The linkage of the function may have changed. FIXME in linkage rewrite. 
*/ + gcc_assert (DECL_INTERFACE_KNOWN (function)); TREE_PUBLIC (thunk_fndecl) = TREE_PUBLIC (function); DECL_VISIBILITY (thunk_fndecl) = DECL_VISIBILITY (function); DECL_VISIBILITY_SPECIFIED (thunk_fndecl) diff --git a/gcc/testsuite/g++.dg/cpp0x/defaulted33.C b/gcc/testsuite/g++.dg/cpp0x/defaulted33.C new file mode 100644 index 000..2f11c13 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp0x/defaulted33.C @@ -0,0 +1,32 @@ +// PR c++/50973 +// { dg-do compile { target c++11 } } + +class HD +{ + public: + virtual ~HD() {}; +}; +class InputHD : public virtual HD +{ +}; +class OutputHD : public virtual HD +{ +}; +class IOHD : public InputHD, public OutputHD +{ +}; +template typename T, unsigned int N +class ArrayNHD : public IOHD +{ + public: + ~ArrayNHD() = default; +}; +class TLText +{ + ~TLText(); + ArrayNHDint, 1* m_argsHD; +}; +TLText::~TLText() +{ + delete m_argsHD; +}
Re: [PATCH] pr51038 atomic_flag on targets with no atomic support.
From: Hans-Peter Nilsson h...@axis.com
Date: Thu, 10 Nov 2011 19:06:26 +0100

From: Andrew MacLeod amacl...@redhat.com
Date: Thu, 10 Nov 2011 17:52:44 +0100

On 11/10/2011 11:48 AM, Andrew MacLeod wrote:
Just a minute Andrew

doh. sorry about that

Test cross to cris-elf in progress for your second take (at r181254 + Bernd's patch to unbreak the tree

And it works without regressions. Thanks!

brgds, H-P
Re: [PATCH] [Annotalysis] Support trylock attributes on virtual methods.
On 11-11-10 17:23, Delesley Hutchins wrote:
> +{
> +  tree callee = gimple_call_fn (call);
> +  if (TREE_CODE (callee) == OBJ_TYPE_REF)
> +    {
> +      tree objtype = TREE_TYPE (TREE_TYPE (OBJ_TYPE_REF_OBJECT (callee)));
> +      /* Check to make sure objtype is a valid type.
> +         OBJ_TYPE_REF_OBJECT does not always return the correct static type of the callee.
> +         For example: Given foo(void* ptr) { ((Foo*) ptr)->doSomething(); }
> +         objtype will be void, not Foo. Whether or not this happens depends on the details
> +         of how a particular call is lowered to GIMPLE, and there is no easy fix that works
> +         in all cases. For now, we simply rely on gcc's type information; if that information
> +         is not accurate, then the analysis will be less precise.

Re-format for 80 cols.

OK with that change.

Diego.
Re: [PATCH] pr51038 atomic_flag on targets with no atomic support.
On 11/10/2011 02:28 PM, Andrew MacLeod wrote: * doc/extend.texi: Document __atomic_test_and_set and __atomic_clear. ok. r~
Re: [wwwdocs] Add info about IPA optimization and LTO improvments
On Sat, 8 Oct 2011, Andi Kleen wrote:
>> On Wed, 28 Sep 2011, Andi Kleen wrote:
>>> <li>ld -r is now supported with LTO. When using assembler files or non LTOed objects inside ld -r objects together with LTO then the Linux binutils 2.21.51.0.3 or later are needed.</li>
>>
>> I think this should be GNU/Linux, if anything, but then I also think we should not refer to this $notsurewhatthepropertermis variant of our official binutils project as part of GCC release notes.
>
> Okay. The users will stay mystified then. Hopefully they are good at googling.

Are you saying no official version of binutils is able to address this, and H.J.'s is required? If so, please go ahead and apply the patch (just avoid the naked instance of Linux).

Thanks,
Gerald
Re: [patch] c++/2972 warn when ctor-initializer leaves uninitialized data
On 10 November 2011 20:17, Jason Merrill wrote:
> On 11/10/2011 03:10 PM, Jason Merrill wrote:
>> On 11/10/2011 02:48 PM, Jonathan Wakely wrote:
>>> +warn_missing_meminits (tree type, tree cons)
>>> +{
>>> +  tree mem_inits = sort_mem_initializers (type, NULL_TREE);
>>> +  while (mem_inits)
>>> +    {
>>> +      tree member = TREE_PURPOSE (mem_inits);
>>> +      /* TODO do not warn if brace-or-equal-initializer */
>>> +      warn_meminit_leaves_uninitialized (member, TREE_TYPE (member), cons);
>>> +      mem_inits = TREE_CHAIN (mem_inits);
>>> +    }
>>> +}
>>
>> Check DECL_INITIAL (member) to tell if it has an NSDMI.
>
> Actually, why not just use default_init_uninitialized_part (type)?
>
>>> +  if (warn_meminit && (kind == sfk_constructor || kind == sfk_copy_constructor
>>> +                       || kind == sfk_move_constructor))
>>> +    warn_missing_meminits (current_class_type, fn);
>
> We only want to do this for sfk_constructor; the others initialize all fields.

Doh, of course.

Thanks for the pointers, I'll have another stab at it. I really want to get this warning implemented eventually.
Re: [libitm] Work around missing AVX support
On 10 Nov 2011, at 20:43, Iain Sandoe wrote: The symbol _ITM_malloc is in libitm. Maybe the problem is an extra _ before the _ITM_malloc? Actually, I think the missing symbol is ___emutls_v._ZN3GTM12_gtm_thr_tlsE and (although the m32 lib builds OK - the symbol is also missing there). The m64 build fails because of the -Wl,-undefined -Wl,dynamic_lookup FAOD, Is there some reason that this library needs to resolve symbols from some external source at load time? in combination with the missing var. the m32 build succeed - but none of the testsuite runs, because the emutls var is missing (not resolved at load). ... I strongly suspect it might be another manifestation of: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50598 This is confirmed - if I hack around that bug, the library builds. There are two other issues I can see so far: 1/ the symbols generated in sjlj.S are not getting their extra _ (I patched that up temporarily manually) ... which allows some of the testsuite to pass. 2/ The section .tm_clone_table doesn't exist for Darwin leading to assembler errors because .tm_clone_table is not a complete section spec for Darwin (that's not too hard to fix - but too late for tonight). Iain
Re: Revert PowerPC shrink-wrap support 3 of 3
On Thu, Nov 10, 2011 at 02:29:04PM +0100, Bernd Schmidt wrote:
> On 11/10/11 13:14, Richard Guenther wrote:
>> Fair enough. You can count me as one then, and I'll defer to Bernd to either provide a fix or ack the revert.
>
> I'm trying to track it down. In 189r.outof_cfglayout, we have
>
> (insn 31 33 35 3 (use (reg/i:SI 0 r0)) ../../../../baseline-trunk/libstdc++-v3/libsupc++/new_opv.cc:34 -1
>      (nil))
> ;; Successors: EXIT [100.0%] (fallthru)
> ;; lr out 0 [r0] 11 [fp] 13 [sp] 14 [lr] 25 [sfp] 26 [afp]
> ;; live out 0 [r0] 11 [fp] 13 [sp] 25 [sfp] 26 [afp]
>
> followed by a number of other basic blocks, so that looks wrong to me. outof_cfglayout seems to assume that fallthrough edges to the exit block are OK and don't need fixing up, and changing that seems nontrivial at first glance.
>
> The situation is first created during cfgcleanup in into_cfglayout. The following patch makes the testcase compile by stopping the compiler from moving the exit fallthru block around, but I've not checked whether it has a negative effect on code quality. HP, can you run full tests?

FWIW, I did bootstrap and make check with/without this patch, and it introduces no regressions on PowerPC, but I haven't looked at the generated code.

-- 
Michael Meissner, IBM
5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA
meiss...@linux.vnet.ibm.com fax +1 (978) 399-6899
Re: [google] ThreadSanitizer instrumentation pass (issue 5303083)
http://codereview.appspot.com/5303083/diff/28001/gcc/tree-tsan.c File gcc/tree-tsan.c (right): http://codereview.appspot.com/5303083/diff/28001/gcc/tree-tsan.c#newcode227 gcc/tree-tsan.c:227: var = varpool_node_for_asm (id); Use cgraph_node_for_asm instead. http://codereview.appspot.com/5303083/
Re: [google] ThreadSanitizer instrumentation pass (issue 5303083)
Have you run through SPEC, and SPEC06 with this change? What is the instrumentation overhead using gcc? David http://codereview.appspot.com/5303083/
Re: [libitm] Work around missing AVX support
On Thu, Nov 10, 2011 at 11:29:35PM +, Iain Sandoe wrote: On 10 Nov 2011, at 20:43, Iain Sandoe wrote: The symbol _ITM_malloc is in libitm. Maybe the problem is an extra _ before the _ITM_malloc? Actually, I think the missing symbol is ___emutls_v._ZN3GTM12_gtm_thr_tlsE and (although the m32 lib builds OK - the symbol is also missing there). The m64 build fails because of the -Wl,-undefined -Wl,dynamic_lookup FAOD, Is there some reason that this library needs to resolve symbols from some external source at load time? in combination with the missing var. the m32 build succeed - but none of the testsuite runs, because the emutls var is missing (not resolved at load). ... I strongly suspect it might be another manifestation of: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50598 This is confirmed - if I hack around that bug, the library builds. Iain, I can confirm on x86_64-apple-darwin11 that if I revert r179429... * cgraphunit.c (ipa_passes): Remove unrechable nodes. * lto-streamer-out.c (produce_symtab): Skip unused extern declarations. * ipa.c (cgraph_remove_unreachable_nodes): Do not assume that external functions are reachable when address is taken. * ipa-inline-analysis.c (reset_inline_edge_summary): New * gcc.dg/ipa/ctor-empty-1.c: Update dump file. the linker crash is eliminated when libitm.dylib is linked. Jack There are two other issues I can see so far: 1/ the symbols generated in sjlj.S are not getting their extra _ (I patched that up temporarily manually) ... which allows some of the testsuite to pass. 2/ The section .tm_clone_table doesn't exist for Darwin leading to assembler errors because .tm_clone_table is not a complete section spec for Darwin (that's not too hard to fix - but too late for tonight). Iain
Re: [libitm] Work around missing AVX support
On 11/10/2011 03:29 PM, Iain Sandoe wrote: The m64 build fails because of the -Wl,-undefined -Wl,dynamic_lookup FAOD, Is there some reason that this library needs to resolve symbols from some external source at load time? Not that I know of. I think that's generic libtool giving you that. r~
Re: [google] ThreadSanitizer instrumentation pass (issue 5303083)
On Thu, Nov 10, 2011 at 4:24 PM, Kostya Serebryany k...@google.com wrote: On Thu, Nov 10, 2011 at 4:00 PM, davi...@google.com wrote: Have you run through SPEC, and SPEC06 with this change? What is the instrumentation overhead using gcc? I don't think any of us has ever run SPEC with tsan, mostly because it will always use the fast path of the tsan analysis (SPEC is single-threaded). I suggested it because it is good for correctness testing and for measuring instrumentation-only overhead. David --kcc David http://codereview.appspot.com/5303083/
Re: [Patch] Move Objective-C runtime flags to modern options system.
On Nov 10, 2011, at 9:40 AM, Iain Sandoe wrote: Thanks for catching that --- brainstorm on my part ... the code under discussion should have been #ifndef OBJCPLUS

There is no prohibition against C having exceptions, so it doesn't matter if you turn C++ off; you can still throw through C code, so turning on exceptions is reasonable.

Moreover, there is no personality routine in m32 NeXT libobjc, so if one tries to engage the zero-cost exceptions, one gets a link error (and generates a load of unused eh data). I can work around that if there is still reason to have -fexceptions on.

No, this must be wrong:
$ cat t.c
void bar() { }
void foo() { bar(); }
int main() { return 0; }
$ gcc -fexceptions t.c
$ gcc -m32 -fexceptions t.c
$
Like I said, it does work, one can count on it working and it is useful, you can't break it. And next week, they'll add catching and throwing to C, and when they do, it still has to just work. :-)
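The point about throwing through C code can be sketched in a single C++ file. This is only an illustration, not code from the patch: c_style_apply, reject_negative and throws_through_c_frame are hypothetical names, and in a real project the C-linkage function would live in a .c file built with -fexceptions so that its frame carries unwind information.

```cpp
#include <stdexcept>

// Callback type declared with C++ linkage so the sketch stays portable.
typedef void (*callback_t)(int);

// Stand-in for a function that would normally be plain C compiled with
// -fexceptions; here it only has C linkage so everything fits in one file.
extern "C" void c_style_apply(callback_t callback, int value) {
    callback(value);  // an exception thrown by the callback unwinds through this frame
}

// C++ callback that throws for negative input.
void reject_negative(int value) {
    if (value < 0) throw std::invalid_argument("negative value");
}

// Returns true when an exception crossed the C-linkage frame and was caught here.
bool throws_through_c_frame(int value) {
    try {
        c_style_apply(reject_negative, value);
    } catch (const std::invalid_argument &) {
        return true;
    }
    return false;
}
```

If the intermediate frame had no unwind tables, the throw could not propagate to the handler, which is why switching exceptions off behind the user's back is risky.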
Re: Options handling and reload memory leak fixes
On Thu, 10 Nov 2011, Jakub Jelinek wrote: Hi! Running valgrind even on simple testcases shows a bunch of memory leaks (definitely lost). This patch cures some of them. There are a few further leaks in the options handling. The first hunk is when this function already called concat to set opt_text, and then doesn't write opt_text anywhere, but concat of that and something else (which malloces a new memory and doesn't free the old one). The option-handling change is OK. But I suspect eliminating memory leaks completely from option handling will require defining various fields, that may at present sometimes hold malloced memory and sometimes hold pointers into constant data or the original command line, always to hold malloced memory (and so require various new allocations that aren't required at present) - as without defining that there are probably cases where you won't know whether to free the previous value of a field. (I've generally presumed that memory usage that is O(n) in the size of the command line isn't significant.) -- Joseph S. Myers jos...@codesourcery.com
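The leak pattern described for the first hunk (a concat result that is only used as input to a second concat and never freed) can be sketched as below. This is an assumption-laden illustration, not the GCC code: concat2 and build_option_text are hypothetical helpers standing in for libiberty's concat and the option-handling function.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical helper: returns a freshly malloc'ed concatenation of a and b.
char *concat2(const char *a, const char *b) {
    std::size_t la = std::strlen(a), lb = std::strlen(b);
    char *out = static_cast<char *>(std::malloc(la + lb + 1));
    if (!out) std::abort();
    std::memcpy(out, a, la);
    std::memcpy(out + la, b, lb + 1);  // copies the terminating NUL as well
    return out;
}

// Builds "<prefix><name>=<value>" without leaking the intermediate strings.
char *build_option_text(const char *prefix, const char *name, const char *value) {
    char *opt_text = concat2(prefix, name);   // first allocation
    char *with_eq = concat2(opt_text, "=");   // second allocation copies the first
    std::free(opt_text);                      // omitting this free is the leak
    char *full = concat2(with_eq, value);
    std::free(with_eq);
    return full;                              // the caller owns the result
}
```

The harder case raised in the reply is a field that sometimes points at malloc'ed memory and sometimes at constant data; there the free above is only safe once the field is defined to always own its storage.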
C++ PATCH for c++/50372 (C++11 allows static functions as template arguments)
DR 1155 allows variables and functions with internal linkage to be used as template arguments. Tested x86_64-pc-linux-gnu, applying to trunk. commit bd15c5ecefbb9f8a3a44d2547c3a6a9881a47f31 Author: Jason Merrill ja...@redhat.com Date: Thu Nov 10 21:59:59 2011 -0500 PR c++/50372 * pt.c (convert_nontype_argument_function): Allow functions with internal linkage in C++11. diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c index da5497e..9ce7854 100644 --- a/gcc/cp/pt.c +++ b/gcc/cp/pt.c @@ -5324,6 +5324,7 @@ convert_nontype_argument_function (tree type, tree expr) { tree fns = expr; tree fn, fn_no_ptr; + linkage_kind linkage; fn = instantiate_type (type, fns, tf_none); if (fn == error_mark_node) @@ -5340,12 +5341,19 @@ convert_nontype_argument_function (tree type, tree expr) A template-argument for a non-type, non-template template-parameter shall be one of: [...] - -- the address of an object or function with external linkage. */ - if (!DECL_EXTERNAL_LINKAGE_P (fn_no_ptr)) + -- the address of an object or function with external [C++11: or +internal] linkage. */ + linkage = decl_linkage (fn_no_ptr); + if (cxx_dialect >= cxx0x ? 
linkage == lk_none : linkage != lk_external) { - error ("%qE is not a valid template argument for type %qT " - "because function %qD has not external linkage", - expr, type, fn_no_ptr); + if (cxx_dialect >= cxx0x) + error ("%qE is not a valid template argument for type %qT " + "because function %qD has no linkage", + expr, type, fn_no_ptr); + else + error ("%qE is not a valid template argument for type %qT " + "because function %qD doesn't have external linkage", + expr, type, fn_no_ptr); return NULL_TREE; } diff --git a/gcc/testsuite/g++.dg/ext/visibility/anon8.C b/gcc/testsuite/g++.dg/ext/visibility/anon8.C index 8ef8d68..5e58b55 100644 --- a/gcc/testsuite/g++.dg/ext/visibility/anon8.C +++ b/gcc/testsuite/g++.dg/ext/visibility/anon8.C @@ -26,10 +26,8 @@ int main () static void fn2 () {} }; call<&B1::fn1> (); - call<&B2::fn2> (); // { dg-error "not external linkage|no matching" } - // { dg-message "candidate" "candidate note" { target *-*-* } 29 } + call<&B2::fn2> (); // { dg-error "linkage|no matching" } call<&fn3> (); call<&B1::fn4> (); - call<&fn5> (); // { dg-error "not external linkage|no matching" } - // { dg-message "candidate" "candidate note" { target *-*-* } 33 } + call<&fn5> (); // { dg-error "linkage|no matching" "" { target c++98 } } } diff --git a/gcc/testsuite/g++.dg/template/linkage1.C b/gcc/testsuite/g++.dg/template/linkage1.C new file mode 100644 index 000..02aa967 --- /dev/null +++ b/gcc/testsuite/g++.dg/template/linkage1.C @@ -0,0 +1,18 @@ +// PR c++/50372 +// Test that a template instantiation has the same linkage as its argument. 
+// { dg-final { scan-assembler "(weak|glob)\[^\n\]*_Z3fooIXadL_Z13external_funcvEEEvv" } } +// { dg-final { scan-assembler-not "(weak|glob)\[^\n\]*_Z3fooIXadL_ZL11static_funcvEEEvv" } } + +template<void (*fptr)(void)> +void foo() { } + +static void static_func() {} +void external_func() { } + +void test() +{ +#if __cplusplus > 199711L + foo<&static_func>(); +#endif + foo<&external_func>(); +} diff --git a/gcc/testsuite/g++.old-deja/g++.other/linkage4.C b/gcc/testsuite/g++.old-deja/g++.other/linkage4.C index 7531f45..450733f 100644 --- a/gcc/testsuite/g++.old-deja/g++.other/linkage4.C +++ b/gcc/testsuite/g++.old-deja/g++.other/linkage4.C @@ -8,4 +8,4 @@ void f () {} // Check that the strlen declaration here is given internal linkage by // using it as a non-type template argument, and expecting an error. -template void f<strlen>(); // { dg-error "" } no matching template +template void f<strlen>(); // { dg-error "" "" { target c++98 } } no matching template
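What the change permits from the user's side can be shown with a small standalone sketch. It is not part of the patch or its testsuite; call_with, bump_internal, bump_external and run_both are hypothetical names, and the file is assumed to be compiled as C++11 or later.

```cpp
// Function template taking a pointer to function as a non-type argument.
template <void (*Fn)(int &)>
void call_with(int &counter) { Fn(counter); }

static void bump_internal(int &counter) { counter += 1; }  // internal linkage
void bump_external(int &counter) { counter += 10; }        // external linkage

int run_both() {
    int counter = 0;
    call_with<&bump_internal>(counter);  // valid since C++11 (DR 1155), an error in C++98
    call_with<&bump_external>(counter);  // valid in every standard
    return counter;
}
```

Because the instantiation for bump_internal depends on an entity private to the translation unit, it should not be emitted as a weak or global symbol, which is what the scan-assembler-not directive in the new test checks.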
Re: [PATCH] Fix PR51030, handle p ? &p->base : 0 in phiopt
On 11/08/11 06:45, Richard Guenther wrote: This should optimize VEC_BASE that Jakub was patching by teaching phiopt to handle some one-statement intermediate basic-blocks. Bootstrapped and tested on x86_64-unknown-linux-gnu, any comments? Thanks, Richard. 2011-11-08 Richard Guenther rguent...@suse.de PR tree-optimization/51030 * tree-ssa-phiopt.c (jump_function_from_stmt): New function. (value_replacement): Use it to handle trivial non-empty intermediate blocks. * gcc.dg/tree-ssa/phi-opt-6.c: New testcase. Seems like a reasonable extension of the existing value_replacement capability. We might want to tweak the comment near the top of the file to indicate the additional case we handle. It's pretty specific to the p->base idiom, but that's probably OK. You didn't peek to see how often the optimization triggered, by chance, did you? jeff
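The source-level idiom the pass targets can be sketched as follows. This is an illustration under assumptions, not the phiopt implementation: vec_header, int_vec, base_guarded and base_direct are hypothetical names, and the equivalence only holds because the member sits at offset 0.

```cpp
#include <cstddef>

struct vec_header { unsigned num; unsigned alloc; };
struct int_vec { vec_header base; int data[4]; };

static_assert(offsetof(int_vec, base) == 0, "base must be the first member");

// The idiom as written in source: a null test guarding a member address.
vec_header *base_guarded(int_vec *p) { return p ? &p->base : nullptr; }

// What the guarded form reduces to when the member is at offset 0:
// a null p already yields a null result, so the branch is redundant.
vec_header *base_direct(int_vec *p) { return reinterpret_cast<vec_header *>(p); }
```

Both functions return the same pointer for every input, so an optimizer that can look through the single address computation in the intermediate block may drop the conditional.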
Re: [PATCH] Improve VEC_BASE
On 11/07/11 14:25, Jakub Jelinek wrote: Hi! This patch attempts to optimize VEC_BASE if we know that offsetof of base is 0 (unless the compiler is doing something strange, it is true). It doesn't have a clear code size effect, some .text sections grew, supposedly because of more inlining, some .text sections shrunk. Bootstrapped/regtested on x86_64-linux and i686-linux. 2011-11-07 Jakub Jelinek ja...@redhat.com * vec.h (VEC_BASE): If base is at offset 0 in the structure, use &(P)->base even if P is NULL. Presumably this becomes redundant if we went with Richi's patch? jeff
Re: [PATCH] Free memory leaks in tree-vect-slp.c
On 10 November 2011 21:31, Jakub Jelinek ja...@redhat.com wrote: Hi! This patch fixes some compiler memory leaks in SLP. For vect_free_oprnd_info I've removed the FREE_DEF_STMTS argument and am freeing the defs always, but set them to NULL when moving the vectors over elsewhere, because otherwise if vect_create_new_slp_node or vect_build_slp_tree fails after succeeding for a couple of iterations, we'd leak the rest or double free them. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? OK. Thanks, Ira 2011-11-10 Jakub Jelinek ja...@redhat.com * tree-vect-slp.c (vect_free_slp_tree): Also free SLP_TREE_CHILDREN vector. (vect_create_new_slp_node): Don't allocate node before checking stmt type. (vect_free_oprnd_info): Remove FREE_DEF_STMTS argument, always free def_stmts vectors and additionally free oprnd_info. (vect_build_slp_tree): Adjust callers. Call it even if stop_recursion. If vect_create_new_slp_node or vect_build_slp_tree fails, properly handle freeing memory. If it succeeded, clear def_stmts in oprnd_info. 
--- gcc/tree-vect-slp.c.jj 2011-11-08 23:35:12.0 +0100 +++ gcc/tree-vect-slp.c 2011-11-10 16:17:33.583105311 +0100 @@ -75,8 +75,9 @@ vect_free_slp_tree (slp_tree node) return; FOR_EACH_VEC_ELT (slp_void_p, SLP_TREE_CHILDREN (node), i, child) - vect_free_slp_tree ((slp_tree)child); + vect_free_slp_tree ((slp_tree) child); + VEC_free (slp_void_p, heap, SLP_TREE_CHILDREN (node)); VEC_free (gimple, heap, SLP_TREE_SCALAR_STMTS (node)); if (SLP_TREE_VEC_STMTS (node)) @@ -102,7 +103,7 @@ vect_free_slp_instance (slp_instance ins static slp_tree vect_create_new_slp_node (VEC (gimple, heap) *scalar_stmts) { - slp_tree node = XNEW (struct _slp_tree); + slp_tree node; gimple stmt = VEC_index (gimple, scalar_stmts, 0); unsigned int nops; @@ -117,6 +118,7 @@ vect_create_new_slp_node (VEC (gimple, h else return NULL; + node = XNEW (struct _slp_tree); SLP_TREE_SCALAR_STMTS (node) = scalar_stmts; SLP_TREE_VEC_STMTS (node) = NULL; SLP_TREE_CHILDREN (node) = VEC_alloc (slp_void_p, heap, nops); @@ -152,21 +154,19 @@ vect_create_oprnd_info (int nops, int gr } -/* Free operands info. Free def-stmts in FREE_DEF_STMTS is true. - (FREE_DEF_STMTS is true when the SLP analysis fails, and false when it - succeds. In the later case we don't need the operands info that we used to - check isomorphism of the stmts, but we still need the def-stmts - they are - used as scalar stmts in SLP nodes. */ +/* Free operands info. 
*/ + static void -vect_free_oprnd_info (VEC (slp_oprnd_info, heap) **oprnds_info, - bool free_def_stmts) +vect_free_oprnd_info (VEC (slp_oprnd_info, heap) **oprnds_info) { int i; slp_oprnd_info oprnd_info; - if (free_def_stmts) - FOR_EACH_VEC_ELT (slp_oprnd_info, *oprnds_info, i, oprnd_info) + FOR_EACH_VEC_ELT (slp_oprnd_info, *oprnds_info, i, oprnd_info) + { VEC_free (gimple, heap, oprnd_info->def_stmts); + XDELETE (oprnd_info); + } VEC_free (slp_oprnd_info, heap, *oprnds_info); } @@ -502,7 +502,7 @@ vect_build_slp_tree (loop_vec_info loop_ print_gimple_stmt (vect_dump, stmt, 0, TDF_SLIM); } - vect_free_oprnd_info (&oprnds_info, true); + vect_free_oprnd_info (&oprnds_info); return false; } @@ -516,7 +516,7 @@ vect_build_slp_tree (loop_vec_info loop_ print_gimple_stmt (vect_dump, stmt, 0, TDF_SLIM); } - vect_free_oprnd_info (&oprnds_info, true); + vect_free_oprnd_info (&oprnds_info); return false; } @@ -532,7 +532,7 @@ vect_build_slp_tree (loop_vec_info loop_ print_gimple_stmt (vect_dump, stmt, 0, TDF_SLIM); } - vect_free_oprnd_info (&oprnds_info, true); + vect_free_oprnd_info (&oprnds_info); return false; } @@ -546,7 +546,7 @@ vect_build_slp_tree (loop_vec_info loop_ print_generic_expr (vect_dump, scalar_type, TDF_SLIM); } - vect_free_oprnd_info (&oprnds_info, true); + vect_free_oprnd_info (&oprnds_info); return false; } @@ -576,7 +576,7 @@ vect_build_slp_tree (loop_vec_info loop_ print_gimple_stmt (vect_dump, stmt, 0, TDF_SLIM); } - vect_free_oprnd_info (&oprnds_info, true); + vect_free_oprnd_info (&oprnds_info); return false; } } @@ -611,7 +611,7 @@ vect_build_slp_tree (loop_vec_info loop_ { if (vect_print_dump_info (REPORT_SLP)) fprintf (vect_dump, "Build SLP failed: no optab."); - vect_free_oprnd_info (&oprnds_info, true); + vect_free_oprnd_info (&oprnds_info);
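The ownership rule the patch adopts (always free the def-stmt vectors, but clear the pointer whenever a vector has been handed over to a node) can be sketched outside GCC as below. Everything here is a hypothetical stand-in: OperandInfo, Node, make_node and simulate are not GCC names, and std::vector replaces the VEC macros.

```cpp
#include <vector>

// Hypothetical stand-ins for slp_oprnd_info and slp_tree.
struct OperandInfo { std::vector<int> *def_stmts; };
struct Node { std::vector<int> *stmts; };

static int live_vectors = 0;  // counts vectors that have not been freed yet

std::vector<int> *new_stmts() { ++live_vectors; return new std::vector<int>(); }

void free_stmts(std::vector<int> *&v) {
    if (v) { delete v; --live_vectors; v = nullptr; }
}

// Moves the vector into a new node and clears the source pointer on success.
Node *make_node(OperandInfo &info, bool fail) {
    if (fail) return nullptr;       // info keeps ownership of def_stmts
    Node *n = new Node{info.def_stmts};
    info.def_stmts = nullptr;       // ownership moved, so cleanup will skip it
    return n;
}

// Unconditional cleanup: frees whatever the infos still own.
void free_operand_infos(std::vector<OperandInfo> &infos) {
    for (auto &info : infos) free_stmts(info.def_stmts);
    infos.clear();
}

void free_node(Node *n) { if (n) { free_stmts(n->stmts); delete n; } }

// Builds nodes for `count` operands, failing at index `fail_at` (never if
// negative), then cleans up; returns how many vectors are still live.
int simulate(int count, int fail_at) {
    std::vector<OperandInfo> infos;
    for (int i = 0; i < count; ++i) infos.push_back(OperandInfo{new_stmts()});
    std::vector<Node *> nodes;
    for (int i = 0; i < count; ++i) {
        Node *n = make_node(infos[i], i == fail_at);
        if (!n) break;
        nodes.push_back(n);
    }
    free_operand_infos(infos);
    for (Node *n : nodes) free_node(n);
    return live_vectors;
}
```

With the pointer cleared at the hand-over, a failure partway through neither leaks the remaining vectors nor frees the moved ones twice, which is the situation the patch description calls out.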
Re: [PATCH] PR target/50038 fix: redundant zero extensions removal
Hello Eric, Thanks for the review! 2011/11/10 Eric Botcazou ebotca...@adacore.com: So, what about the patch? I think since we already have the zee patch it would be great to use it as a more general optimization. I tested it on EEMBC 2.0 on Atom and it showed a 1% performance gain in geomean on 32 bit, which is really good for such a simple optimization. For OOO archs the patch is not so critical but still makes the code cleaner. The patch cannot be accepted as-is since it doesn't update a single bit of the documentation present in implicit-zee.c. The authors have made the effort of thoroughly documenting their code so it shouldn't be wasted. Therefore, at a minimum, the documentation must be overhauled the same way the code will be. I agree that the numbers are encouraging. Moreover, the narrow specialization of the pass was criticized when it was added so a generalization will probably be welcome. So, unless other developers object, let's do it, but correctly, that is to say, let's rename the pass, eliminate all the hardcoded references to implicit zero-extensions in the code and turn it into a generic elimination of redundant extensions pass. Great! I'll be back with a patch covering all non-functional changes. Will it be OK to have everything in one patch (including the current functional changes) or should I split it? -- Eric Botcazou Thanks, Ilya
Re: C++ PATCH for c++/50372 (C++11 allows static functions as template arguments)
On Thu, Nov 10, 2011 at 10:14 PM, Jason Merrill ja...@redhat.com wrote: DR 1155 allows variables and functions with internal linkage to be used as template arguments. Yay!
Re: [PATCH] Improve VEC_BASE
On Thu, Nov 10, 2011 at 10:58:36PM -0700, Jeff Law wrote: This patch attempts to optimize VEC_BASE if we know that offsetof of base is 0 (unless the compiler is doing something strange, it is true). It doesn't have a clear code size effect, some .text sections grew, supposedly because of more inlining, some .text sections shrunk. Bootstrapped/regtested on x86_64-linux and i686-linux. 2011-11-07 Jakub Jelinek ja...@redhat.com * vec.h (VEC_BASE): If base is at offset 0 in the structure, use &(P)->base even if P is NULL. Presumably this becomes redundant if we went with Richi's patch? I actually committed it yesterday after discussion with Richi on IRC. While his patch optimizes it, it doesn't do so for -O0 and isn't performed during the early passes. For -O0 the offsetof == 0 should be folded to 1 and the test should be eliminated, and for -O1+ it may affect inlining decisions. Jakub
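The idea of keying the null test on a compile-time offsetof check can be sketched in standard C++ as follows. This is not the vec.h macro: header_t, front_vec, back_vec and vec_base_of are hypothetical names, and a reinterpret_cast is used where the macro would take the member address directly.

```cpp
#include <cstddef>

struct header_t { unsigned num; unsigned alloc; };
struct front_vec { header_t base; int data[2]; };  // base at offset 0
struct back_vec { int tag; header_t base; };       // base at a nonzero offset

// Returns the address of p->base, or null when p is null.
template <typename V>
header_t *vec_base_of(V *p) {
    // The condition is a constant for each V, so the compiler can fold it
    // even without optimization and drop the null test for front_vec.
    if (offsetof(V, base) == 0)
        return reinterpret_cast<header_t *>(p);
    return p ? &p->base : nullptr;
}
```

For a type like front_vec the function body collapses to a plain pointer conversion, which is smaller and may change inlining decisions; for back_vec the guarded form is kept because a null input must not produce a nonzero address.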