Re: [PATCH, stage1] Make parloops gate more strict
On Fri, Jun 19, 2015 at 9:47 AM, Tom de Vries tom_devr...@mentor.com wrote: On 16/06/15 13:18, Richard Biener wrote: On Mon, Jun 15, 2015 at 12:38 AM, Tom de Vries tom_devr...@mentor.com wrote: On 14/06/15 23:49, Bernhard Reutner-Fischer wrote: On June 14, 2015 10:55:59 AM GMT+02:00, Tom de Vries tom_devr...@mentor.com wrote: On 13/03/15 11:36, Richard Biener wrote: On Fri, Mar 13, 2015 at 11:32 AM, Tom de Vries tom_devr...@mentor.com wrote: Hi, this patch moves a bunch of early-out tests from the parloops pass to the gate function. The only effect is for functions that we don't consider at all for parallelization in the parloops pass. We no longer dump those in the parloops dump file. Bootstrapped and reg-tested on x86_64. OK for stage1 trunk? Does it work with -fdump-passes? Hi, with -fdump-passes now fixed to work on a dummy function (r222129), I'm resubmitting this patch, split up in two patches. The first patch moves two trivial early-exit tests to the parloops gate. The second patch moves the number_of_loops test to the parloops gate, and adds a dummy loops structure in the dummy function for -fdump-passes. Bootstrapped and reg-tested on x86_64. Both patches OK for trunk? diff --git a/gcc/tree-parloops.c b/gcc/tree-parloops.c index 02f44eb..a1659a3 100644 --- a/gcc/tree-parloops.c +++ b/gcc/tree-parloops.c @@ -2535,10 +2535,6 @@ parallelize_loops (void) source_location loop_loc; /* Do not parallelize loops in the functions created by parallelization. */ - if (parallelized_function_p (cfun-decl)) -return false; - if (cfun-has_nonlocal_label) -return false; gcc_obstack_init (parloop_obstack); reduction_info_table_type reduction_list (10); Now stray comment? Stopped reading here. Fixed in updated patch. Also: - made sure cfun is not used in the gate function - added missing update of function header comment for init_loops_structure - improved comment in pass_manager::dump_passes. OK for trunk? For -fdump-passes this doesn't make much sense but I suppose you are after not getting the untouched functions dumped. Note that generally this is undesired for debugging (in my opinion) as diffs from the pass dump before parloops to parloops will contain a lot of spurious changes (and if the preceeding pass is changed similarly the function we run parloops on is maybe not even dumped there!). So I question the general idea of the change. I suppose there are two competing principles: 1. ensure the state before the pass is in the previous dump 2. only dump if changed Indeed in general we use the first principle, although it doesn't hold for f.i. pass_tree_loop and a function without loops. The problems I'm trying to solve with this patch are the following: - even if you're compiling a single function, part of the function occurs twice (once in the original, once in the split-off function), making formulating scan-tree-dumps more complicated for parloops. - the intuitive thing is to look in the parloops tree-dump and look at the original function and the split-off function, and think that the split-off function is the immediate result of the split-off that happened in the pass, which is incorrect (since it jumps back in the pass list and traverses other passes before arriving back at parloops). Adding something like a flag -fno-dump-new-function would solve the first problem, but not the second. Yeah, any pass introducing new functions has this issue. We could also dump new functions in a different dump file src.c.new.123t.pass, and this would solve both problems. But it will cause the 'where did that function go' confusion, certainly initially. Any other ideas? No, not really. But it sounds like a very special thing you try to solve. You could also dump the split off function twice - once next to the function it was split off and once when it arrives at parloops again... Richard. Thanks, - Tom
Re: [PATCH, stage1] Make parloops gate more strict
On 16/06/15 13:18, Richard Biener wrote: On Mon, Jun 15, 2015 at 12:38 AM, Tom de Vries tom_devr...@mentor.com wrote: On 14/06/15 23:49, Bernhard Reutner-Fischer wrote: On June 14, 2015 10:55:59 AM GMT+02:00, Tom de Vries tom_devr...@mentor.com wrote: On 13/03/15 11:36, Richard Biener wrote: On Fri, Mar 13, 2015 at 11:32 AM, Tom de Vries tom_devr...@mentor.com wrote: Hi, this patch moves a bunch of early-out tests from the parloops pass to the gate function. The only effect is for functions that we don't consider at all for parallelization in the parloops pass. We no longer dump those in the parloops dump file. Bootstrapped and reg-tested on x86_64. OK for stage1 trunk? Does it work with -fdump-passes? Hi, with -fdump-passes now fixed to work on a dummy function (r222129), I'm resubmitting this patch, split up in two patches. The first patch moves two trivial early-exit tests to the parloops gate. The second patch moves the number_of_loops test to the parloops gate, and adds a dummy loops structure in the dummy function for -fdump-passes. Bootstrapped and reg-tested on x86_64. Both patches OK for trunk? diff --git a/gcc/tree-parloops.c b/gcc/tree-parloops.c index 02f44eb..a1659a3 100644 --- a/gcc/tree-parloops.c +++ b/gcc/tree-parloops.c @@ -2535,10 +2535,6 @@ parallelize_loops (void) source_location loop_loc; /* Do not parallelize loops in the functions created by parallelization. */ - if (parallelized_function_p (cfun-decl)) -return false; - if (cfun-has_nonlocal_label) -return false; gcc_obstack_init (parloop_obstack); reduction_info_table_type reduction_list (10); Now stray comment? Stopped reading here. Fixed in updated patch. Also: - made sure cfun is not used in the gate function - added missing update of function header comment for init_loops_structure - improved comment in pass_manager::dump_passes. OK for trunk? For -fdump-passes this doesn't make much sense but I suppose you are after not getting the untouched functions dumped. Note that generally this is undesired for debugging (in my opinion) as diffs from the pass dump before parloops to parloops will contain a lot of spurious changes (and if the preceeding pass is changed similarly the function we run parloops on is maybe not even dumped there!). So I question the general idea of the change. I suppose there are two competing principles: 1. ensure the state before the pass is in the previous dump 2. only dump if changed Indeed in general we use the first principle, although it doesn't hold for f.i. pass_tree_loop and a function without loops. The problems I'm trying to solve with this patch are the following: - even if you're compiling a single function, part of the function occurs twice (once in the original, once in the split-off function), making formulating scan-tree-dumps more complicated for parloops. - the intuitive thing is to look in the parloops tree-dump and look at the original function and the split-off function, and think that the split-off function is the immediate result of the split-off that happened in the pass, which is incorrect (since it jumps back in the pass list and traverses other passes before arriving back at parloops). Adding something like a flag -fno-dump-new-function would solve the first problem, but not the second. We could also dump new functions in a different dump file src.c.new.123t.pass, and this would solve both problems. But it will cause the 'where did that function go' confusion, certainly initially. Any other ideas? Thanks, - Tom
Re: [PATCH, stage1] Make parloops gate more strict
On Mon, Jun 15, 2015 at 12:38 AM, Tom de Vries tom_devr...@mentor.com wrote: On 14/06/15 23:49, Bernhard Reutner-Fischer wrote: On June 14, 2015 10:55:59 AM GMT+02:00, Tom de Vries tom_devr...@mentor.com wrote: On 13/03/15 11:36, Richard Biener wrote: On Fri, Mar 13, 2015 at 11:32 AM, Tom de Vries tom_devr...@mentor.com wrote: Hi, this patch moves a bunch of early-out tests from the parloops pass to the gate function. The only effect is for functions that we don't consider at all for parallelization in the parloops pass. We no longer dump those in the parloops dump file. Bootstrapped and reg-tested on x86_64. OK for stage1 trunk? Does it work with -fdump-passes? Hi, with -fdump-passes now fixed to work on a dummy function (r222129), I'm resubmitting this patch, split up in two patches. The first patch moves two trivial early-exit tests to the parloops gate. The second patch moves the number_of_loops test to the parloops gate, and adds a dummy loops structure in the dummy function for -fdump-passes. Bootstrapped and reg-tested on x86_64. Both patches OK for trunk? diff --git a/gcc/tree-parloops.c b/gcc/tree-parloops.c index 02f44eb..a1659a3 100644 --- a/gcc/tree-parloops.c +++ b/gcc/tree-parloops.c @@ -2535,10 +2535,6 @@ parallelize_loops (void) source_location loop_loc; /* Do not parallelize loops in the functions created by parallelization. */ - if (parallelized_function_p (cfun-decl)) -return false; - if (cfun-has_nonlocal_label) -return false; gcc_obstack_init (parloop_obstack); reduction_info_table_type reduction_list (10); Now stray comment? Stopped reading here. Fixed in updated patch. Also: - made sure cfun is not used in the gate function - added missing update of function header comment for init_loops_structure - improved comment in pass_manager::dump_passes. OK for trunk? For -fdump-passes this doesn't make much sense but I suppose you are after not getting the untouched functions dumped. Note that generally this is undesired for debugging (in my opinion) as diffs from the pass dump before parloops to parloops will contain a lot of spurious changes (and if the preceeding pass is changed similarly the function we run parloops on is maybe not even dumped there!). So I question the general idea of the change. Richard. Thanks, - Tom
Re: [PATCH, stage1] Make parloops gate more strict
On 13/03/15 11:36, Richard Biener wrote: On Fri, Mar 13, 2015 at 11:32 AM, Tom de Vries tom_devr...@mentor.com wrote: Hi, this patch moves a bunch of early-out tests from the parloops pass to the gate function. The only effect is for functions that we don't consider at all for parallelization in the parloops pass. We no longer dump those in the parloops dump file. Bootstrapped and reg-tested on x86_64. OK for stage1 trunk? Does it work with -fdump-passes? Hi, with -fdump-passes now fixed to work on a dummy function (r222129), I'm resubmitting this patch, split up in two patches. The first patch moves two trivial early-exit tests to the parloops gate. The second patch moves the number_of_loops test to the parloops gate, and adds a dummy loops structure in the dummy function for -fdump-passes. Bootstrapped and reg-tested on x86_64. Both patches OK for trunk? Thanks, - Tom Move parallelize_loops tests to parloops gate 2015-06-08 Tom de Vries t...@codesourcery.com * tree-parloops.c (parallelize_loops): Move early-exit tests to ... (pass_parallelize_loops::gate): ... here. --- gcc/tree-parloops.c | 11 ++- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/gcc/tree-parloops.c b/gcc/tree-parloops.c index 02f44eb..a1659a3 100644 --- a/gcc/tree-parloops.c +++ b/gcc/tree-parloops.c @@ -2535,10 +2535,6 @@ parallelize_loops (void) source_location loop_loc; /* Do not parallelize loops in the functions created by parallelization. */ - if (parallelized_function_p (cfun-decl)) -return false; - if (cfun-has_nonlocal_label) -return false; gcc_obstack_init (parloop_obstack); reduction_info_table_type reduction_list (10); @@ -2657,7 +2653,12 @@ public: {} /* opt_pass methods: */ - virtual bool gate (function *) { return flag_tree_parallelize_loops 1; } + virtual bool gate (function *fun) +{ + return (flag_tree_parallelize_loops 1 + !parallelized_function_p (fun-decl) + !cfun-has_nonlocal_label); +} virtual unsigned int execute (function *); }; // class pass_parallelize_loops -- 1.9.1 Move parloops::execute test to parloops gate 2015-06-11 Tom de Vries t...@codesourcery.com * cfgloop.c (init_loops_structure): Add and handle dummy_p parameter. (flow_loops_find): Add extra argument to call to init_loops_structure. * cfgloop.h (init_loops_structure): Add bool parameter. * cgraphunit.c (init_lowered_empty_function): Add extra argument to call to init_loops_structure. * lto-streamer-in.c (input_cfg): Same. * tree-cfg.c (move_sese_region_to_fn): Same. * passes.c (pass_manager::dump_passes): Add dummy loops structure to dummy function. * tree-parloops.c (pass_parallelize_loops::execute): Move early-exit test to .. (pass_parallelize_loops::gate): ... here. --- gcc/cfgloop.c | 19 +++ gcc/cfgloop.h | 2 +- gcc/cgraphunit.c | 2 +- gcc/lto-streamer-in.c | 2 +- gcc/passes.c | 4 gcc/tree-cfg.c| 2 +- gcc/tree-parloops.c | 6 ++ 7 files changed, 21 insertions(+), 16 deletions(-) diff --git a/gcc/cfgloop.c b/gcc/cfgloop.c index a279046..2b17585 100644 --- a/gcc/cfgloop.c +++ b/gcc/cfgloop.c @@ -356,8 +356,8 @@ alloc_loop (void) (including the root of the loop tree). */ void -init_loops_structure (struct function *fn, - struct loops *loops, unsigned num_loops) +init_loops_structure (struct function *fn, struct loops *loops, + unsigned num_loops, bool dummy_p) { struct loop *root; @@ -366,11 +366,14 @@ init_loops_structure (struct function *fn, /* Dummy loop containing whole function. */ root = alloc_loop (); - root-num_nodes = n_basic_blocks_for_fn (fn); - root-latch = EXIT_BLOCK_PTR_FOR_FN (fn); - root-header = ENTRY_BLOCK_PTR_FOR_FN (fn); - ENTRY_BLOCK_PTR_FOR_FN (fn)-loop_father = root; - EXIT_BLOCK_PTR_FOR_FN (fn)-loop_father = root; + if (!dummy_p) +{ + root-num_nodes = n_basic_blocks_for_fn (fn); + root-latch = EXIT_BLOCK_PTR_FOR_FN (fn); + root-header = ENTRY_BLOCK_PTR_FOR_FN (fn); + ENTRY_BLOCK_PTR_FOR_FN (fn)-loop_father = root; + EXIT_BLOCK_PTR_FOR_FN (fn)-loop_father = root; +} loops-larray-quick_push (root); loops-tree_root = root; @@ -427,7 +430,7 @@ flow_loops_find (struct loops *loops) if (!loops) { loops = ggc_cleared_allocstruct loops (); - init_loops_structure (cfun, loops, 1); + init_loops_structure (cfun, loops, 1, false); } /* Ensure that loop exits were released. */ diff --git a/gcc/cfgloop.h b/gcc/cfgloop.h index d811c56..e680941 100644 --- a/gcc/cfgloop.h +++ b/gcc/cfgloop.h @@ -260,7 +260,7 @@ struct GTY (()) loops { /* Loop recognition. */ bool bb_loop_header_p (basic_block); -void init_loops_structure (struct function *, struct loops *, unsigned); +void init_loops_structure (struct function *, struct loops *, unsigned, bool); extern struct loops *flow_loops_find (struct loops *);
Re: [PATCH, stage1] Make parloops gate more strict
On June 14, 2015 10:55:59 AM GMT+02:00, Tom de Vries tom_devr...@mentor.com wrote: On 13/03/15 11:36, Richard Biener wrote: On Fri, Mar 13, 2015 at 11:32 AM, Tom de Vries tom_devr...@mentor.com wrote: Hi, this patch moves a bunch of early-out tests from the parloops pass to the gate function. The only effect is for functions that we don't consider at all for parallelization in the parloops pass. We no longer dump those in the parloops dump file. Bootstrapped and reg-tested on x86_64. OK for stage1 trunk? Does it work with -fdump-passes? Hi, with -fdump-passes now fixed to work on a dummy function (r222129), I'm resubmitting this patch, split up in two patches. The first patch moves two trivial early-exit tests to the parloops gate. The second patch moves the number_of_loops test to the parloops gate, and adds a dummy loops structure in the dummy function for -fdump-passes. Bootstrapped and reg-tested on x86_64. Both patches OK for trunk? diff --git a/gcc/tree-parloops.c b/gcc/tree-parloops.c index 02f44eb..a1659a3 100644 --- a/gcc/tree-parloops.c +++ b/gcc/tree-parloops.c @@ -2535,10 +2535,6 @@ parallelize_loops (void) source_location loop_loc; /* Do not parallelize loops in the functions created by parallelization. */ - if (parallelized_function_p (cfun-decl)) -return false; - if (cfun-has_nonlocal_label) -return false; gcc_obstack_init (parloop_obstack); reduction_info_table_type reduction_list (10); Now stray comment? Stopped reading here. Thanks,
Re: [PATCH, stage1] Make parloops gate more strict
On 14/06/15 23:49, Bernhard Reutner-Fischer wrote: On June 14, 2015 10:55:59 AM GMT+02:00, Tom de Vries tom_devr...@mentor.com wrote: On 13/03/15 11:36, Richard Biener wrote: On Fri, Mar 13, 2015 at 11:32 AM, Tom de Vries tom_devr...@mentor.com wrote: Hi, this patch moves a bunch of early-out tests from the parloops pass to the gate function. The only effect is for functions that we don't consider at all for parallelization in the parloops pass. We no longer dump those in the parloops dump file. Bootstrapped and reg-tested on x86_64. OK for stage1 trunk? Does it work with -fdump-passes? Hi, with -fdump-passes now fixed to work on a dummy function (r222129), I'm resubmitting this patch, split up in two patches. The first patch moves two trivial early-exit tests to the parloops gate. The second patch moves the number_of_loops test to the parloops gate, and adds a dummy loops structure in the dummy function for -fdump-passes. Bootstrapped and reg-tested on x86_64. Both patches OK for trunk? diff --git a/gcc/tree-parloops.c b/gcc/tree-parloops.c index 02f44eb..a1659a3 100644 --- a/gcc/tree-parloops.c +++ b/gcc/tree-parloops.c @@ -2535,10 +2535,6 @@ parallelize_loops (void) source_location loop_loc; /* Do not parallelize loops in the functions created by parallelization. */ - if (parallelized_function_p (cfun-decl)) -return false; - if (cfun-has_nonlocal_label) -return false; gcc_obstack_init (parloop_obstack); reduction_info_table_type reduction_list (10); Now stray comment? Stopped reading here. Fixed in updated patch. Also: - made sure cfun is not used in the gate function - added missing update of function header comment for init_loops_structure - improved comment in pass_manager::dump_passes. OK for trunk? Thanks, - Tom [PATCH 1/2] Move parallelize_loops tests to parloops gate 2015-06-08 Tom de Vries t...@codesourcery.com * tree-parloops.c (parallelize_loops): Move early-exit tests to ... (pass_parallelize_loops::gate): ... here. --- gcc/tree-parloops.c | 15 --- 1 file changed, 8 insertions(+), 7 deletions(-) diff --git a/gcc/tree-parloops.c b/gcc/tree-parloops.c index 3495ac1..50b8d75 100644 --- a/gcc/tree-parloops.c +++ b/gcc/tree-parloops.c @@ -2531,12 +2531,6 @@ parallelize_loops (void) HOST_WIDE_INT estimated; source_location loop_loc; - /* Do not parallelize loops in the functions created by parallelization. */ - if (parallelized_function_p (cfun-decl)) -return false; - if (cfun-has_nonlocal_label) -return false; - gcc_obstack_init (parloop_obstack); reduction_info_table_type reduction_list (10); init_stmt_vec_info_vec (); @@ -2654,7 +2648,14 @@ public: {} /* opt_pass methods: */ - virtual bool gate (function *) { return flag_tree_parallelize_loops 1; } + virtual bool gate (function *fun) +{ + return (flag_tree_parallelize_loops 1 + /* Do not parallelize loops in the functions created by + parallelization. */ + !parallelized_function_p (fun-decl) + !fun-has_nonlocal_label); +} virtual unsigned int execute (function *); }; // class pass_parallelize_loops -- 1.9.1 [PATCH 2/2] Move parloops::execute test to parloops gate 2015-06-11 Tom de Vries t...@codesourcery.com * cfgloop.c (init_loops_structure): Add and handle dummy_p parameter. (flow_loops_find): Add extra argument to call to init_loops_structure. * cfgloop.h (init_loops_structure): Add bool parameter. * cgraphunit.c (init_lowered_empty_function): Add extra argument to call to init_loops_structure. * lto-streamer-in.c (input_cfg): Same. * tree-cfg.c (move_sese_region_to_fn): Same. * passes.c (pass_manager::dump_passes): Add dummy loops structure to dummy function. * tree-parloops.c (pass_parallelize_loops::execute): Move early-exit test to .. (pass_parallelize_loops::gate): ... here. --- gcc/cfgloop.c | 22 +- gcc/cfgloop.h | 2 +- gcc/cgraphunit.c | 2 +- gcc/lto-streamer-in.c | 2 +- gcc/passes.c | 4 gcc/tree-cfg.c| 2 +- gcc/tree-parloops.c | 6 ++ 7 files changed, 23 insertions(+), 17 deletions(-) diff --git a/gcc/cfgloop.c b/gcc/cfgloop.c index 20b81b3..2335ecb 100644 --- a/gcc/cfgloop.c +++ b/gcc/cfgloop.c @@ -349,11 +349,12 @@ alloc_loop (void) } /* Initializes loops structure LOOPS, reserving place for NUM_LOOPS loops - (including the root of the loop tree). */ + (including the root of the loop tree). If DUMMY_P, don't use information + from FN. */ void -init_loops_structure (struct function *fn, - struct loops *loops, unsigned num_loops) +init_loops_structure (struct function *fn, struct loops *loops, + unsigned num_loops, bool dummy_p) { struct loop *root; @@ -362,11 +363,14 @@ init_loops_structure (struct function *fn, /* Dummy loop containing whole function. */ root = alloc_loop (); - root-num_nodes = n_basic_blocks_for_fn
Re: [PATCH, stage1] Make parloops gate more strict
On Wed, Mar 18, 2015 at 6:02 PM, Tom de Vries tom_devr...@mentor.com wrote: On 18-03-15 12:18, Richard Biener wrote: On Wed, Mar 18, 2015 at 12:03 PM, Tom de Vries tom_devr...@mentor.com wrote: On 18-03-15 11:16, Richard Biener wrote: On Fri, Mar 13, 2015 at 4:28 PM, Tom de Vries tom_devr...@mentor.com wrote: On 13-03-15 13:36, Richard Biener wrote: On Fri, Mar 13, 2015 at 1:07 PM, Jakub Jelinek ja...@redhat.com wrote: On Fri, Mar 13, 2015 at 01:04:57PM +0100, Richard Biener wrote: Not really (I don't like -fdump-passes ...), but we need to make sure that -fdump-passes doesn't crash (because it runs very early and with cfun == NULL I think) If it runs with cfun == NULL, then supposedly the gates that are dependent on current function should for -fdump-passes purposes also return true if cfun == NULL (well, of course do all the unconditional checks). Though of course, with optimize/target attributes this is harder, as different functions can use different options. Yes, one reason why I think -fdump-passes is just broken implementation-wise. Atm fdump-passes doesn't run with cfun == NULL. From pass_manager::dump_passes: ... FOR_EACH_FUNCTION (n) if (DECL_STRUCT_FUNCTION (n-decl)) { node = n; break; } if (!node) return; push_cfun (DECL_STRUCT_FUNCTION (node-decl)); Um - this now picks a random function which may be one with an optimize or target attribute associated to it. Indeed. Attached patch removes that code, and runs the gates with cfun == NULL for -fdump-passes. It at least builds, and allows us to compile src/gcc/testsuite/gcc.dg/dump-pass.c with -O2 -fdump-passes. Should I bootstrap and reg-test, or do you see a problem with this approach? Yeah - it makes the -fdump-passes hack more pervasive throughout the compiler. I suppose it should instead build push a dummy sturct function. Like this? Looks good to me. Richard. Well, or simply don't care for it's brokeness. I'm afraid leaving it broken just means we'll keep coming back to it. So I'd prefer either fixing or removing. Thanks, - Tom
Re: [PATCH, stage1] Make parloops gate more strict
On Wed, Mar 18, 2015 at 12:03 PM, Tom de Vries tom_devr...@mentor.com wrote: On 18-03-15 11:16, Richard Biener wrote: On Fri, Mar 13, 2015 at 4:28 PM, Tom de Vries tom_devr...@mentor.com wrote: On 13-03-15 13:36, Richard Biener wrote: On Fri, Mar 13, 2015 at 1:07 PM, Jakub Jelinek ja...@redhat.com wrote: On Fri, Mar 13, 2015 at 01:04:57PM +0100, Richard Biener wrote: Not really (I don't like -fdump-passes ...), but we need to make sure that -fdump-passes doesn't crash (because it runs very early and with cfun == NULL I think) If it runs with cfun == NULL, then supposedly the gates that are dependent on current function should for -fdump-passes purposes also return true if cfun == NULL (well, of course do all the unconditional checks). Though of course, with optimize/target attributes this is harder, as different functions can use different options. Yes, one reason why I think -fdump-passes is just broken implementation-wise. Atm fdump-passes doesn't run with cfun == NULL. From pass_manager::dump_passes: ... FOR_EACH_FUNCTION (n) if (DECL_STRUCT_FUNCTION (n-decl)) { node = n; break; } if (!node) return; push_cfun (DECL_STRUCT_FUNCTION (node-decl)); Um - this now picks a random function which may be one with an optimize or target attribute associated to it. Indeed. Attached patch removes that code, and runs the gates with cfun == NULL for -fdump-passes. It at least builds, and allows us to compile src/gcc/testsuite/gcc.dg/dump-pass.c with -O2 -fdump-passes. Should I bootstrap and reg-test, or do you see a problem with this approach? Yeah - it makes the -fdump-passes hack more pervasive throughout the compiler. I suppose it should instead build push a dummy sturct function. Well, or simply don't care for it's brokeness. Richard. Thanks, - Tom
Re: [PATCH, stage1] Make parloops gate more strict
On Fri, Mar 13, 2015 at 4:28 PM, Tom de Vries tom_devr...@mentor.com wrote: On 13-03-15 13:36, Richard Biener wrote: On Fri, Mar 13, 2015 at 1:07 PM, Jakub Jelinek ja...@redhat.com wrote: On Fri, Mar 13, 2015 at 01:04:57PM +0100, Richard Biener wrote: Not really (I don't like -fdump-passes ...), but we need to make sure that -fdump-passes doesn't crash (because it runs very early and with cfun == NULL I think) If it runs with cfun == NULL, then supposedly the gates that are dependent on current function should for -fdump-passes purposes also return true if cfun == NULL (well, of course do all the unconditional checks). Though of course, with optimize/target attributes this is harder, as different functions can use different options. Yes, one reason why I think -fdump-passes is just broken implementation-wise. Atm fdump-passes doesn't run with cfun == NULL. From pass_manager::dump_passes: ... FOR_EACH_FUNCTION (n) if (DECL_STRUCT_FUNCTION (n-decl)) { node = n; break; } if (!node) return; push_cfun (DECL_STRUCT_FUNCTION (node-decl)); Um - this now picks a random function which may be one with an optimize or target attribute associated to it. Richard. ... This was discussed here: https://gcc.gnu.org/ml/gcc-patches/2011-06/msg00856.html Thanks, - Tom
Re: [PATCH, stage1] Make parloops gate more strict
On 18-03-15 11:16, Richard Biener wrote: On Fri, Mar 13, 2015 at 4:28 PM, Tom de Vries tom_devr...@mentor.com wrote: On 13-03-15 13:36, Richard Biener wrote: On Fri, Mar 13, 2015 at 1:07 PM, Jakub Jelinek ja...@redhat.com wrote: On Fri, Mar 13, 2015 at 01:04:57PM +0100, Richard Biener wrote: Not really (I don't like -fdump-passes ...), but we need to make sure that -fdump-passes doesn't crash (because it runs very early and with cfun == NULL I think) If it runs with cfun == NULL, then supposedly the gates that are dependent on current function should for -fdump-passes purposes also return true if cfun == NULL (well, of course do all the unconditional checks). Though of course, with optimize/target attributes this is harder, as different functions can use different options. Yes, one reason why I think -fdump-passes is just broken implementation-wise. Atm fdump-passes doesn't run with cfun == NULL. From pass_manager::dump_passes: ... FOR_EACH_FUNCTION (n) if (DECL_STRUCT_FUNCTION (n-decl)) { node = n; break; } if (!node) return; push_cfun (DECL_STRUCT_FUNCTION (node-decl)); Um - this now picks a random function which may be one with an optimize or target attribute associated to it. Indeed. Attached patch removes that code, and runs the gates with cfun == NULL for -fdump-passes. It at least builds, and allows us to compile src/gcc/testsuite/gcc.dg/dump-pass.c with -O2 -fdump-passes. Should I bootstrap and reg-test, or do you see a problem with this approach? Thanks, - Tom Fix -fdump-passes --- gcc/bb-reorder.c| 5 +++-- gcc/cprop.c | 8 +--- gcc/except.c| 10 +- gcc/gcse.c | 28 gcc/loop-init.c | 11 +++ gcc/omp-low.c | 3 ++- gcc/passes.c| 16 +--- gcc/store-motion.c | 10 ++ gcc/tree-chkp-opt.c | 14 -- gcc/tree-chkp.c | 11 ++- gcc/tree-complex.c | 3 ++- gcc/tree-eh.c | 10 -- gcc/tree-if-conv.c | 6 +- gcc/tree-into-ssa.c | 3 ++- gcc/tree-ssa-loop.c | 16 +--- gcc/tree-ssa.c | 3 ++- gcc/tree-stdarg.c | 3 ++- gcc/tree-vect-generic.c | 3 ++- 18 files changed, 95 insertions(+), 68 deletions(-) diff --git a/gcc/bb-reorder.c b/gcc/bb-reorder.c index c2a3be3..b8f2a4b 100644 --- a/gcc/bb-reorder.c +++ b/gcc/bb-reorder.c @@ -2709,8 +2709,9 @@ pass_partition_blocks::gate (function *fun) /* See gate_handle_reorder_blocks. We should not partition if we are going to omit the reordering. */ optimize_function_for_speed_p (fun) - !DECL_COMDAT_GROUP (current_function_decl) - !user_defined_section_attribute); + !user_defined_section_attribute + (fun == NULL + || !DECL_COMDAT_GROUP (current_function_decl))); } unsigned diff --git a/gcc/cprop.c b/gcc/cprop.c index c9fb2fc..bbea008 100644 --- a/gcc/cprop.c +++ b/gcc/cprop.c @@ -1961,9 +1961,11 @@ public: opt_pass * clone () { return new pass_rtl_cprop (m_ctxt); } virtual bool gate (function *fun) { - return optimize 0 flag_gcse - !fun-calls_setjmp - dbg_cnt (cprop); + return (optimize 0 + flag_gcse + (fun == NULL + || (!fun-calls_setjmp + dbg_cnt (cprop; } virtual unsigned int execute (function *) { return execute_rtl_cprop (); } diff --git a/gcc/except.c b/gcc/except.c index 833ec21..f81ade6 100644 --- a/gcc/except.c +++ b/gcc/except.c @@ -2677,14 +2677,14 @@ public: }; // class pass_convert_to_eh_region_ranges bool -pass_convert_to_eh_region_ranges::gate (function *) +pass_convert_to_eh_region_ranges::gate (function *fun) { - /* Nothing to do for SJLJ exceptions or if no regions created. */ - if (cfun-eh-region_tree == NULL) -return false; if (targetm_common.except_unwind_info (global_options) == UI_SJLJ) return false; - return true; + + /* Nothing to do for SJLJ exceptions or if no regions created. */ + return (fun == NULL + || fun-eh-region_tree != NULL); } } // anon namespace diff --git a/gcc/gcse.c b/gcc/gcse.c index e03b36c..80bdd5f 100644 --- a/gcc/gcse.c +++ b/gcc/gcse.c @@ -4252,10 +4252,12 @@ public: bool pass_rtl_pre::gate (function *fun) { - return optimize 0 flag_gcse - !fun-calls_setjmp - optimize_function_for_speed_p (fun) - dbg_cnt (pre); + return (optimize 0 + flag_gcse + optimize_function_for_speed_p (fun) + (fun == NULL + || (!fun-calls_setjmp + dbg_cnt (pre; } } // anon namespace @@ -4295,15 +4297,17 @@ public: }; // class pass_rtl_hoist bool -pass_rtl_hoist::gate (function *) +pass_rtl_hoist::gate (function *fun) { - return optimize 0 flag_gcse - !cfun-calls_setjmp -/* It does not make sense to run code hoisting unless we are optimizing - for code size -- it rarely makes programs faster, and can make then -
Re: [PATCH, stage1] Make parloops gate more strict
On 18-03-15 12:18, Richard Biener wrote: On Wed, Mar 18, 2015 at 12:03 PM, Tom de Vries tom_devr...@mentor.com wrote: On 18-03-15 11:16, Richard Biener wrote: On Fri, Mar 13, 2015 at 4:28 PM, Tom de Vries tom_devr...@mentor.com wrote: On 13-03-15 13:36, Richard Biener wrote: On Fri, Mar 13, 2015 at 1:07 PM, Jakub Jelinek ja...@redhat.com wrote: On Fri, Mar 13, 2015 at 01:04:57PM +0100, Richard Biener wrote: Not really (I don't like -fdump-passes ...), but we need to make sure that -fdump-passes doesn't crash (because it runs very early and with cfun == NULL I think) If it runs with cfun == NULL, then supposedly the gates that are dependent on current function should for -fdump-passes purposes also return true if cfun == NULL (well, of course do all the unconditional checks). Though of course, with optimize/target attributes this is harder, as different functions can use different options. Yes, one reason why I think -fdump-passes is just broken implementation-wise. Atm fdump-passes doesn't run with cfun == NULL. From pass_manager::dump_passes: ... FOR_EACH_FUNCTION (n) if (DECL_STRUCT_FUNCTION (n-decl)) { node = n; break; } if (!node) return; push_cfun (DECL_STRUCT_FUNCTION (node-decl)); Um - this now picks a random function which may be one with an optimize or target attribute associated to it. Indeed. Attached patch removes that code, and runs the gates with cfun == NULL for -fdump-passes. It at least builds, and allows us to compile src/gcc/testsuite/gcc.dg/dump-pass.c with -O2 -fdump-passes. Should I bootstrap and reg-test, or do you see a problem with this approach? Yeah - it makes the -fdump-passes hack more pervasive throughout the compiler. I suppose it should instead build push a dummy sturct function. Like this? Well, or simply don't care for it's brokeness. I'm afraid leaving it broken just means we'll keep coming back to it. So I'd prefer either fixing or removing. Thanks, - Tom Fix fdump-passes --- gcc/function.c | 37 - gcc/function.h | 2 ++ gcc/passes.c| 17 ++--- gcc/tree-chkp.c | 6 -- 4 files changed, 40 insertions(+), 22 deletions(-) diff --git a/gcc/function.c b/gcc/function.c index 2c3d142..9ddbad8 100644 --- a/gcc/function.c +++ b/gcc/function.c @@ -4862,6 +4862,29 @@ prepare_function_start (void) frame_pointer_needed = 0; } +void +push_dummy_function (bool with_decl) +{ + tree fn_decl, fn_type, fn_result_decl; + + gcc_assert (!in_dummy_function); + in_dummy_function = true; + + if (with_decl) +{ + fn_type = build_function_type_list (void_type_node, NULL_TREE); + fn_decl = build_decl (UNKNOWN_LOCATION, FUNCTION_DECL, NULL_TREE, + fn_type); + fn_result_decl = build_decl (UNKNOWN_LOCATION, RESULT_DECL, + NULL_TREE, void_type_node); + DECL_RESULT (fn_decl) = fn_result_decl; +} + else +fn_decl = NULL_TREE; + + push_struct_function (fn_decl); +} + /* Initialize the rtl expansion mechanism so that we can do simple things like generate sequences. This is used to provide a context during global initialization of some passes. You must call expand_dummy_function_end @@ -4870,9 +4893,7 @@ prepare_function_start (void) void init_dummy_function_start (void) { - gcc_assert (!in_dummy_function); - in_dummy_function = true; - push_struct_function (NULL_TREE); + push_dummy_function (false); prepare_function_start (); } @@ -5144,6 +5165,13 @@ expand_function_start (tree subr) stack_check_probe_note = emit_note (NOTE_INSN_DELETED); } +void +pop_dummy_function (void) +{ + pop_cfun (); + in_dummy_function = false; +} + /* Undo the effects of init_dummy_function_start. */ void expand_dummy_function_end (void) @@ -5159,8 +5187,7 @@ expand_dummy_function_end (void) free_after_parsing (cfun); free_after_compilation (cfun); - pop_cfun (); - in_dummy_function = false; + pop_dummy_function (); } /* Helper for diddle_return_value. */ diff --git a/gcc/function.h b/gcc/function.h index b89c5ae..349f80c 100644 --- a/gcc/function.h +++ b/gcc/function.h @@ -899,6 +899,8 @@ extern int get_next_funcdef_no (void); extern int get_last_funcdef_no (void); extern void allocate_struct_function (tree, bool); extern void push_struct_function (tree fndecl); +extern void push_dummy_function (bool); +extern void pop_dummy_function (void); extern void init_dummy_function_start (void); extern void init_function_start (tree); extern void stack_protect_epilogue (void); diff --git a/gcc/passes.c b/gcc/passes.c index 23a90d9..bca8dbb 100644 --- a/gcc/passes.c +++ b/gcc/passes.c @@ -944,32 +944,19 @@ dump_passes (void) void pass_manager::dump_passes () const { - struct cgraph_node *n, *node = NULL; + push_dummy_function (true); create_pass_tab (); - FOR_EACH_FUNCTION (n) -if (DECL_STRUCT_FUNCTION (n-decl)) - { - node = n; -
Re: [PATCH, stage1] Make parloops gate more strict
On Fri, Mar 13, 2015 at 11:32 AM, Tom de Vries tom_devr...@mentor.com wrote: Hi, this patch moves a bunch of early-out tests from the parloops pass to the gate function. The only effect is for functions that we don't consider at all for parallelization in the parloops pass. We no longer dump those in the parloops dump file. Bootstrapped and reg-tested on x86_64. OK for stage1 trunk? Does it work with -fdump-passes? Richard. Thanks, - Tom
Re: [PATCH, stage1] Make parloops gate more strict
On Fri, Mar 13, 2015 at 01:04:57PM +0100, Richard Biener wrote: Not really (I don't like -fdump-passes ...), but we need to make sure that -fdump-passes doesn't crash (because it runs very early and with cfun == NULL I think) If it runs with cfun == NULL, then supposedly the gates that are dependent on current function should for -fdump-passes purposes also return true if cfun == NULL (well, of course do all the unconditional checks). Though of course, with optimize/target attributes this is harder, as different functions can use different options. Jakub
Re: [PATCH, stage1] Make parloops gate more strict
On 13-03-15 11:36, Richard Biener wrote: On Fri, Mar 13, 2015 at 11:32 AM, Tom de Vries tom_devr...@mentor.com wrote: Hi, this patch moves a bunch of early-out tests from the parloops pass to the gate function. The only effect is for functions that we don't consider at all for parallelization in the parloops pass. We no longer dump those in the parloops dump file. Bootstrapped and reg-tested on x86_64. OK for stage1 trunk? Does it work with -fdump-passes? Hmm, I never heard of the option, interesting, thanks. Let's see what it is supposed to do: ... @item -fdump-passes @opindex fdump-passes Dump the list of optimization passes that are turned on and off by the current command-line options. ... fdump-passes seems to be using the gate function to find out the effect of the command line options on the pass. But then f.i. in pass_stdarg, using fun-stdarg in the gate function violates that AFAIU. Does this mean the gate function of pass_stdarg should be fixed by moving the fun-stdarg test from the gate to the execute function? Thanks, - Tom
Re: [PATCH, stage1] Make parloops gate more strict
On Fri, Mar 13, 2015 at 1:07 PM, Jakub Jelinek ja...@redhat.com wrote: On Fri, Mar 13, 2015 at 01:04:57PM +0100, Richard Biener wrote: Not really (I don't like -fdump-passes ...), but we need to make sure that -fdump-passes doesn't crash (because it runs very early and with cfun == NULL I think) If it runs with cfun == NULL, then supposedly the gates that are dependent on current function should for -fdump-passes purposes also return true if cfun == NULL (well, of course do all the unconditional checks). Though of course, with optimize/target attributes this is harder, as different functions can use different options. Yes, one reason why I think -fdump-passes is just broken implementation-wise. Richard. Jakub
Re: [PATCH, stage1] Make parloops gate more strict
On Fri, Mar 13, 2015 at 12:12 PM, Tom de Vries tom_devr...@mentor.com wrote: On 13-03-15 11:36, Richard Biener wrote: On Fri, Mar 13, 2015 at 11:32 AM, Tom de Vries tom_devr...@mentor.com wrote: Hi, this patch moves a bunch of early-out tests from the parloops pass to the gate function. The only effect is for functions that we don't consider at all for parallelization in the parloops pass. We no longer dump those in the parloops dump file. Bootstrapped and reg-tested on x86_64. OK for stage1 trunk? Does it work with -fdump-passes? Hmm, I never heard of the option, interesting, thanks. Let's see what it is supposed to do: ... @item -fdump-passes @opindex fdump-passes Dump the list of optimization passes that are turned on and off by the current command-line options. ... fdump-passes seems to be using the gate function to find out the effect of the command line options on the pass. But then f.i. in pass_stdarg, using fun-stdarg in the gate function violates that AFAIU. Does this mean the gate function of pass_stdarg should be fixed by moving the fun-stdarg test from the gate to the execute function? Not really (I don't like -fdump-passes ...), but we need to make sure that -fdump-passes doesn't crash (because it runs very early and with cfun == NULL I think) Richard. Thanks, - Tom
Re: [PATCH, stage1] Make parloops gate more strict
On 13-03-15 13:36, Richard Biener wrote: On Fri, Mar 13, 2015 at 1:07 PM, Jakub Jelinek ja...@redhat.com wrote: On Fri, Mar 13, 2015 at 01:04:57PM +0100, Richard Biener wrote: Not really (I don't like -fdump-passes ...), but we need to make sure that -fdump-passes doesn't crash (because it runs very early and with cfun == NULL I think) If it runs with cfun == NULL, then supposedly the gates that are dependent on current function should for -fdump-passes purposes also return true if cfun == NULL (well, of course do all the unconditional checks). Though of course, with optimize/target attributes this is harder, as different functions can use different options. Yes, one reason why I think -fdump-passes is just broken implementation-wise. Atm fdump-passes doesn't run with cfun == NULL. From pass_manager::dump_passes: ... FOR_EACH_FUNCTION (n) if (DECL_STRUCT_FUNCTION (n-decl)) { node = n; break; } if (!node) return; push_cfun (DECL_STRUCT_FUNCTION (node-decl)); ... This was discussed here: https://gcc.gnu.org/ml/gcc-patches/2011-06/msg00856.html Thanks, - Tom