Re: [PATCH] New attribute to create target clones
Some tests in the patch already updated (ifunc require condition added) by Uros commit: ac39b078992c27934ea53cb580dbd79f75b6c727 I'll ask to commit attached patch. x86 bootstrap and make check passed. 2015-11-03 Evgeny Stupachenko gcc/ * multiple_target.c (create_dispatcher_calls): Add target check on ifunc. (create_target_clone): Change assembler name for versioned declarations. gcc/testsuite * g++.dg/ext/mvc4.C: Add dg-require-ifunc condition. * gcc.target/i386/mvc5.c: Ditto. * gcc.target/i386/mvc7.c: Add dg-require-ifunc condition and checks on resolver. Thanks, Evgeny On Mon, Nov 2, 2015 at 8:02 PM, Jeff Law wrote: > On 11/02/2015 07:50 AM, Evgeny Stupachenko wrote: >> >> Yes, that is exactly what should fix the tests. >> Unfortunately I don't have access to darwin machine right now. >> Can you please test if the patch (attached) fixes the tests? >> >> gcc/ >> * multiple_target.c (create_dispatcher_calls): Add target check >> on ifunc. >> (create_target_clone): Change assembler name for versioned >> declarations. >> >> gcc/testsuite >> * gcc.dg/mvc1.c: Add dg-require-ifunc condition. >> * gcc.dg/mvc4.c: Ditto. >> * gcc.dg/mvc5.c: Ditto. >> * g++.dg/ext/mvc1.C: Ditto. >> * g++.dg/ext/mvc4.C: Ditto. >> * gcc.dg/mvc7.c: Add dg-require-ifunc condition and checks on >> resolver. >> > OK. > > jeff > target_clones_tests_fix.patch Description: Binary data
Re: [PATCH] New attribute to create target clones
On 11/02/2015 07:50 AM, Evgeny Stupachenko wrote: Yes, that is exactly what should fix the tests. Unfortunately I don't have access to darwin machine right now. Can you please test if the patch (attached) fixes the tests? gcc/ * multiple_target.c (create_dispatcher_calls): Add target check on ifunc. (create_target_clone): Change assembler name for versioned declarations. gcc/testsuite * gcc.dg/mvc1.c: Add dg-require-ifunc condition. * gcc.dg/mvc4.c: Ditto. * gcc.dg/mvc5.c: Ditto. * g++.dg/ext/mvc1.C: Ditto. * g++.dg/ext/mvc4.C: Ditto. * gcc.dg/mvc7.c: Add dg-require-ifunc condition and checks on resolver. OK. jeff
Re: [PATCH] New attribute to create target clones
Evgeny, I have already checked that the addition of +/* { dg-require-ifunc "" } */ fixes the failures. I’ll test your full patch later today (currently chasing regression with gfortran). Thanks, Dominique > Le 2 nov. 2015 à 15:50, Evgeny Stupachenko a écrit : > > Yes, that is exactly what should fix the tests. > Unfortunately I don't have access to darwin machine right now. > Can you please test if the patch (attached) fixes the tests? > > gcc/ >* multiple_target.c (create_dispatcher_calls): Add target check >on ifunc. >(create_target_clone): Change assembler name for versioned > declarations. > > gcc/testsuite >* gcc.dg/mvc1.c: Add dg-require-ifunc condition. >* gcc.dg/mvc4.c: Ditto. >* gcc.dg/mvc5.c: Ditto. >* g++.dg/ext/mvc1.C: Ditto. >* g++.dg/ext/mvc4.C: Ditto. >* gcc.dg/mvc7.c: Add dg-require-ifunc condition and checks on resolver. > > Thanks, > Evgeny >
Re: [PATCH] New attribute to create target clones
Yes, that is exactly what should fix the tests. Unfortunately I don't have access to darwin machine right now. Can you please test if the patch (attached) fixes the tests? gcc/ * multiple_target.c (create_dispatcher_calls): Add target check on ifunc. (create_target_clone): Change assembler name for versioned declarations. gcc/testsuite * gcc.dg/mvc1.c: Add dg-require-ifunc condition. * gcc.dg/mvc4.c: Ditto. * gcc.dg/mvc5.c: Ditto. * g++.dg/ext/mvc1.C: Ditto. * g++.dg/ext/mvc4.C: Ditto. * gcc.dg/mvc7.c: Add dg-require-ifunc condition and checks on resolver. Thanks, Evgeny On Sat, Oct 31, 2015 at 1:40 PM, Dominique d'Humières wrote: > Evgeny, > On darwin I see the following failures > FAIL: g++.dg/ext/mvc1.C -std=c++11 (test for excess errors) > UNRESOLVED: g++.dg/ext/mvc1.C -std=c++11 compilation failed to produce > executable > FAIL: g++.dg/ext/mvc1.C -std=c++14 (test for excess errors) > UNRESOLVED: g++.dg/ext/mvc1.C -std=c++14 compilation failed to produce > executable > FAIL: g++.dg/ext/mvc1.C -std=c++98 (test for excess errors) > UNRESOLVED: g++.dg/ext/mvc1.C -std=c++98 compilation failed to produce > executable > FAIL: g++.dg/ext/mvc4.C -std=gnu++11 (internal compiler error) > FAIL: g++.dg/ext/mvc4.C -std=gnu++11 (test for excess errors) > FAIL: g++.dg/ext/mvc4.C -std=gnu++14 (internal compiler error) > FAIL: g++.dg/ext/mvc4.C -std=gnu++14 (test for excess errors) > FAIL: g++.dg/ext/mvc4.C -std=gnu++98 (internal compiler error) > FAIL: g++.dg/ext/mvc4.C -std=gnu++98 (test for excess errors) > > FAIL: gcc.dg/mvc1.c (internal compiler error) > FAIL: gcc.dg/mvc1.c (test for excess errors) > UNRESOLVED: gcc.dg/mvc1.c compilation failed to produce executable > FAIL: gcc.dg/mvc4.c (internal compiler error) > FAIL: gcc.dg/mvc4.c (test for excess errors) > UNRESOLVED: gcc.dg/mvc4.c compilation failed to produce executable > FAIL: gcc.dg/mvc5.c (internal compiler error) > FAIL: gcc.dg/mvc5.c (test for excess errors) > FAIL: gcc.dg/mvc5.c scan-assembler-times foo.ifunc 6 > FAIL: gcc.dg/mvc7.c (internal compiler error) > FAIL: gcc.dg/mvc7.c (test for excess errors) > FAIL: gcc.dg/mvc7.c scan-assembler-times foo.ifunc 4 > > The errors are of the kind > > /opt/gcc/_clean/gcc/testsuite/gcc.dg/mvc7.C:5:5: error: multiversioning needs > ifunc which is not supported on this target > > These tests probably require something such as > > /* { dg-require-ifunc "" } */ > > I have no opinion if this should also used in the tests for warnings or > errors. > > The ICEs are of the kind > > /opt/gcc/_clean/gcc/testsuite/gcc.dg/mvc7.C:10:1: internal compiler error: > Segmentation fault: 11 > } > ^ > > TIA > > Dominique > target_clones_tests_fix.patch Description: Binary data
Re: [PATCH] New attribute to create target clones
Evgeny, On darwin I see the following failures FAIL: g++.dg/ext/mvc1.C -std=c++11 (test for excess errors) UNRESOLVED: g++.dg/ext/mvc1.C -std=c++11 compilation failed to produce executable FAIL: g++.dg/ext/mvc1.C -std=c++14 (test for excess errors) UNRESOLVED: g++.dg/ext/mvc1.C -std=c++14 compilation failed to produce executable FAIL: g++.dg/ext/mvc1.C -std=c++98 (test for excess errors) UNRESOLVED: g++.dg/ext/mvc1.C -std=c++98 compilation failed to produce executable FAIL: g++.dg/ext/mvc4.C -std=gnu++11 (internal compiler error) FAIL: g++.dg/ext/mvc4.C -std=gnu++11 (test for excess errors) FAIL: g++.dg/ext/mvc4.C -std=gnu++14 (internal compiler error) FAIL: g++.dg/ext/mvc4.C -std=gnu++14 (test for excess errors) FAIL: g++.dg/ext/mvc4.C -std=gnu++98 (internal compiler error) FAIL: g++.dg/ext/mvc4.C -std=gnu++98 (test for excess errors) FAIL: gcc.dg/mvc1.c (internal compiler error) FAIL: gcc.dg/mvc1.c (test for excess errors) UNRESOLVED: gcc.dg/mvc1.c compilation failed to produce executable FAIL: gcc.dg/mvc4.c (internal compiler error) FAIL: gcc.dg/mvc4.c (test for excess errors) UNRESOLVED: gcc.dg/mvc4.c compilation failed to produce executable FAIL: gcc.dg/mvc5.c (internal compiler error) FAIL: gcc.dg/mvc5.c (test for excess errors) FAIL: gcc.dg/mvc5.c scan-assembler-times foo.ifunc 6 FAIL: gcc.dg/mvc7.c (internal compiler error) FAIL: gcc.dg/mvc7.c (test for excess errors) FAIL: gcc.dg/mvc7.c scan-assembler-times foo.ifunc 4 The errors are of the kind /opt/gcc/_clean/gcc/testsuite/gcc.dg/mvc7.C:5:5: error: multiversioning needs ifunc which is not supported on this target These tests probably require something such as /* { dg-require-ifunc "" } */ I have no opinion if this should also used in the tests for warnings or errors. The ICEs are of the kind /opt/gcc/_clean/gcc/testsuite/gcc.dg/mvc7.C:10:1: internal compiler error: Segmentation fault: 11 } ^ TIA Dominique
Re: [PATCH] New attribute to create target clones
I've fixed the misprint and vertical spaces. I'll ask to commit the patch when x86 bootstrap and make check finished. Thanks, Evgeny Updated ChangeLog: 2015-10-30 Evgeny Stupachenko gcc/ * Makefile.in (OBJS): Add multiple_target.o. * attrib.c (make_attribute): Moved from config/i386/i386.c * config/i386/i386.c (make_attribute): Deleted. * multiple_target.c (create_dispatcher_calls): New. (get_attr_len): Ditto. (get_attr_str): Ditto. (separate_attrs): Ditto. (is_valid_asm_symbol): Ditto. (create_new_asm_name): Ditto. (create_target_clone): Ditto. (expand_target_clones): Ditto. (ipa_target_clone): Ditto. (ipa_dispatcher_calls): Ditto. * passes.def (pass_target_clone): Two new ipa passes. * tree-pass.h (make_pass_target_clone): Ditto. gcc/c-family * c-common.c (handle_target_clones_attribute): New. * (c_common_attribute_table): Add handle_target_clones_attribute. * (handle_always_inline_attribute): Add check on target_clones attribute. * (handle_target_attribute): Ditto. gcc/testsuite * gcc.dg/mvc1.c: New test for multiple targets cloning. * gcc.dg/mvc2.c: Ditto. * gcc.dg/mvc3.c: Ditto. * gcc.dg/mvc4.c: Ditto. * gcc.dg/mvc5.c: Ditto. * gcc.dg/mvc6.c: Ditto. * gcc.dg/mvc7.c: Ditto. * g++.dg/ext/mvc1.C: Ditto. * g++.dg/ext/mvc2.C: Ditto. * g++.dg/ext/mvc3.C: Ditto. * g++.dg/ext/mvc4.C: Ditto. gcc/doc * doc/extend.texi (target_clones): New attribute description. On Fri, Oct 30, 2015 at 8:27 AM, Jeff Law wrote: > On 10/29/2015 12:13 PM, Evgeny Stupachenko wrote: >> >> On Thu, Oct 29, 2015 at 8:02 PM, Jan Hubicka wrote: >>Yes. This is not necessary. However that way we'll have the following >>code in dispatcher: >> cmpl$6, __cpu_model+4(%rip) >> sete%al >> movzbl %al, %eax >> testl %eax, %eax >> jle .L16 >> movl$foo.target_clone.1, %eax >>I think it is very hard to read and debug such... >> >>While now we have: >> >> cmpl$6, __cpu_model+4(%rip) >> sete%al >> movzbl %al, %eax >> testl %eax, %eax >> jle .L16 >> movl$foo.arch_slm, %eax >> >>and it is clear that we are jumping to SLM code here. >>I'd like to keep target in names. >>> >>> > >>> >I am not against more informative names, but why you don't pass the info >>> > here: >>> > >>> >+create_target_clone (cgraph_node *node, bool definition) >>> >+{ >>> >+ cgraph_node *new_node; >>> >+ if (definition) >>> >+{ >>> >+ new_node = node->create_version_clone_with_body (vNULL, NULL, >>> >+ NULL, false, >>> >+ NULL, NULL, >>> >+ "target_clone"); >>> >+ new_node->force_output = true; >>> >+} >>> >+ else >>> >+{ >>> >+ tree new_decl = copy_node (node->decl); >>> >+ new_node = cgraph_node::get_create (new_decl); >>> >+} >>> >+ return new_node; >>> >+} >>> > >>> >passing "arch_slm" instead of target_clone will get you the name you >>> > want >>> >(plus the extra index that may be needed anyway to disambiguate). >>> > >>> >Note that in general those .suffixes should be machine parseable, so >>> > cp-demangle.c >>> >can expand them correctly. We may want to have some consistent grammar >>> > for them here >>> >and update cp-demangle.c to output nice info like "target clone for..." >> >> Ok. I've modified the patch correspondingly. > > You'll need updated ChangeLog entries. Don't forget to drop the omp-low > spurious whitespace change. > > You should also fix the formatting nits Jan pointed out. > > With those changes, this patch is OK for the trunk. I'll run the header > file reordering & cleanup tool after the patch is committed to the trunk. > > jeff > >
Re: [PATCH] New attribute to create target clones
On 10/29/2015 12:13 PM, Evgeny Stupachenko wrote: On Thu, Oct 29, 2015 at 8:02 PM, Jan Hubicka wrote: >>Yes. This is not necessary. However that way we'll have the following >>code in dispatcher: >> cmpl$6, __cpu_model+4(%rip) >> sete%al >> movzbl %al, %eax >> testl %eax, %eax >> jle .L16 >> movl$foo.target_clone.1, %eax >>I think it is very hard to read and debug such... >> >>While now we have: >> >> cmpl$6, __cpu_model+4(%rip) >> sete%al >> movzbl %al, %eax >> testl %eax, %eax >> jle .L16 >> movl$foo.arch_slm, %eax >> >>and it is clear that we are jumping to SLM code here. >>I'd like to keep target in names. > >I am not against more informative names, but why you don't pass the info here: > >+create_target_clone (cgraph_node *node, bool definition) >+{ >+ cgraph_node *new_node; >+ if (definition) >+{ >+ new_node = node->create_version_clone_with_body (vNULL, NULL, >+ NULL, false, >+ NULL, NULL, >+ "target_clone"); >+ new_node->force_output = true; >+} >+ else >+{ >+ tree new_decl = copy_node (node->decl); >+ new_node = cgraph_node::get_create (new_decl); >+} >+ return new_node; >+} > >passing "arch_slm" instead of target_clone will get you the name you want >(plus the extra index that may be needed anyway to disambiguate). > >Note that in general those .suffixes should be machine parseable, so cp-demangle.c >can expand them correctly. We may want to have some consistent grammar for them here >and update cp-demangle.c to output nice info like "target clone for..." Ok. I've modified the patch correspondingly. You'll need updated ChangeLog entries. Don't forget to drop the omp-low spurious whitespace change. You should also fix the formatting nits Jan pointed out. With those changes, this patch is OK for the trunk. I'll run the header file reordering & cleanup tool after the patch is committed to the trunk. jeff
Re: [PATCH] New attribute to create target clones
> Ok. I've modified the patch correspondingly. Thanks, the IPA/callgarph parts of the patch looks fine to me except for a typo: +/* Create sting with attributes separated by comma. + Return number of attributes. */ and a fact that sometimes you skip vertical whitespace after variable declarations. I will leave it Jeff for the final ack. Honza > > > > > I will look at the rest of the patch at evening. > > Honza > >> > >> Thanks, > >> Evgeny > >> > > >> > > >> > Jeff > > > >
Re: [PATCH] New attribute to create target clones
On Thu, Oct 29, 2015 at 8:02 PM, Jan Hubicka wrote: >> Yes. This is not necessary. However that way we'll have the following >> code in dispatcher: >> cmpl$6, __cpu_model+4(%rip) >> sete%al >> movzbl %al, %eax >> testl %eax, %eax >> jle .L16 >> movl$foo.target_clone.1, %eax >> I think it is very hard to read and debug such... >> >> While now we have: >> >> cmpl$6, __cpu_model+4(%rip) >> sete%al >> movzbl %al, %eax >> testl %eax, %eax >> jle .L16 >> movl$foo.arch_slm, %eax >> >> and it is clear that we are jumping to SLM code here. >> I'd like to keep target in names. > > I am not against more informative names, but why you don't pass the info here: > > +create_target_clone (cgraph_node *node, bool definition) > +{ > + cgraph_node *new_node; > + if (definition) > +{ > + new_node = node->create_version_clone_with_body (vNULL, NULL, > + NULL, false, > + NULL, NULL, > + "target_clone"); > + new_node->force_output = true; > +} > + else > +{ > + tree new_decl = copy_node (node->decl); > + new_node = cgraph_node::get_create (new_decl); > +} > + return new_node; > +} > > passing "arch_slm" instead of target_clone will get you the name you want > (plus the extra index that may be needed anyway to disambiguate). > > Note that in general those .suffixes should be machine parseable, so > cp-demangle.c > can expand them correctly. We may want to have some consistent grammar for > them here > and update cp-demangle.c to output nice info like "target clone for..." Ok. I've modified the patch correspondingly. > > I will look at the rest of the patch at evening. > Honza >> >> Thanks, >> Evgeny >> > >> > >> > Jeff > > target_clones5.patch Description: Binary data
Re: [PATCH] New attribute to create target clones
> Yes. This is not necessary. However that way we'll have the following > code in dispatcher: > cmpl$6, __cpu_model+4(%rip) > sete%al > movzbl %al, %eax > testl %eax, %eax > jle .L16 > movl$foo.target_clone.1, %eax > I think it is very hard to read and debug such... > > While now we have: > > cmpl$6, __cpu_model+4(%rip) > sete%al > movzbl %al, %eax > testl %eax, %eax > jle .L16 > movl$foo.arch_slm, %eax > > and it is clear that we are jumping to SLM code here. > I'd like to keep target in names. I am not against more informative names, but why you don't pass the info here: +create_target_clone (cgraph_node *node, bool definition) +{ + cgraph_node *new_node; + if (definition) +{ + new_node = node->create_version_clone_with_body (vNULL, NULL, + NULL, false, + NULL, NULL, + "target_clone"); + new_node->force_output = true; +} + else +{ + tree new_decl = copy_node (node->decl); + new_node = cgraph_node::get_create (new_decl); +} + return new_node; +} passing "arch_slm" instead of target_clone will get you the name you want (plus the extra index that may be needed anyway to disambiguate). Note that in general those .suffixes should be machine parseable, so cp-demangle.c can expand them correctly. We may want to have some consistent grammar for them here and update cp-demangle.c to output nice info like "target clone for..." I will look at the rest of the patch at evening. Honza > > Thanks, > Evgeny > > > > > > Jeff
Re: [PATCH] New attribute to create target clones
On Mon, Oct 26, 2015 at 6:50 PM, Jeff Law wrote: > On 10/12/2015 05:35 PM, Evgeny Stupachenko wrote: >> >> Hi All, >> >> Here is a new version of patch (attached). >> Bootstrap and make check are in progress (all new tests passed). >> >> New test case g++.dg/ext/mvc4.C fails with ICE, when options lower >> than "-mavx" are passed. >> However it has the same behavior if "target_clones" attribute is >> replaced by 2 corresponding "target" attributes. >> I've filed PR67946 on this: >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67946 >> >> Thanks, >> Evgeny >> >> ChangeLog: >> >> 2015-10-13 Evgeny Stupachenko >> gcc/ >> * Makefile.in (OBJS): Add multiple_target.o. >> * attrib.c (make_attribute): Moved from config/i386/i386.c >> * config/i386/i386.c (make_attribute): Deleted. >> * multiple_target.c (make_attribute): New. >> (create_dispatcher_calls): Ditto. >> (get_attr_len): Ditto. >> (get_attr_str): Ditto. >> (is_valid_asm_symbol): Ditto. >> (create_new_asm_name): Ditto. >> (create_target_clone): Ditto. >> (expand_target_clones): Ditto. >> (ipa_target_clone): Ditto. >> (ipa_dispatcher_calls): Ditto. >> * passes.def (pass_target_clone): Two new ipa passes. >> * tree-pass.h (make_pass_target_clone): Ditto. >> >> gcc/c-family >> * c-common.c (handle_target_clones_attribute): New. >> * (c_common_attribute_table): Add handle_target_clones_attribute. >> * (handle_always_inline_attribute): Add check on target_clones >> attribute. >> * (handle_target_attribute): Ditto. >> >> gcc/testsuite >> * gcc.dg/mvc1.c: New test for multiple targets cloning. >> * gcc.dg/mvc2.c: Ditto. >> * gcc.dg/mvc3.c: Ditto. >> * gcc.dg/mvc4.c: Ditto. >> * gcc.dg/mvc5.c: Ditto. >> * gcc.dg/mvc6.c: Ditto. >> * gcc.dg/mvc7.c: Ditto. >> * g++.dg/ext/mvc1.C: Ditto. >> * g++.dg/ext/mvc2.C: Ditto. >> * g++.dg/ext/mvc3.C: Ditto. >> * g++.dg/ext/mvc4.C: Ditto. >> >> gcc/doc >> * doc/extend.texi (target_clones): New attribute description. >> > >> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi >> index 23e6a76..f9d28d1 100644 >> --- a/gcc/doc/extend.texi >> +++ b/gcc/doc/extend.texi >> @@ -3066,6 +3066,19 @@ This function attribute make a stack protection of >> the function if >> flags @option{fstack-protector} or @option{fstack-protector-strong} >> or @option{fstack-protector-explicit} are set. >> >> +@item target_clones (@var{options}) >> +@cindex @code{target_clones} function attribute >> +The @code{target_clones} attribute is used to specify that a function is >> to >> +be cloned into multiple versions compiled with different target options >> +than specified on the command line. The supported options and >> restrictions >> +are the same as for @code{target}. > > "as for @code{target}" -> "as for the @code{target} attribute." > > I think that makes it a tiny bit clearer. > > > > >> + >> +/* Creates target clone of NODE. */ >> + >> +static cgraph_node * >> +create_target_clone (cgraph_node *node, bool definition) >> +{ >> + cgraph_node *new_node; >> + if (definition) >> +{ >> + new_node = node->create_version_clone_with_body (vNULL, NULL, >> + NULL, false, >> + NULL, NULL, >> + "target_clone"); >> + new_node->externally_visible = node->externally_visible; >> + new_node->address_taken = node->address_taken; >> + new_node->thunk = node->thunk; >> + new_node->alias = node->alias; >> + new_node->weakref = node->weakref; >> + new_node->cpp_implicit_alias = node->cpp_implicit_alias; >> + new_node->local.local = node->local.local; > > So do we need to explicitly clear TREE_PUBLIC here? It also seems like > copying externally_visible, address_taken and possibly some of those other > fields is wrong. The clone is going to be local to the CU, it doesn't > inherit those properties from the original -- only the dispatcher needs to > inherit those properties, right? Right. That was just the way to keep the node, that I didn't clean up. Replaced with "new_node->force_output = true;" > >> + >> + >> + for (i = 0; i < attrnum; i++) >> +{ >> + char *attr = attrs[i]; >> + cgraph_node *new_node = create_target_clone (node, defenition); >> + char *new_asm_name = >> + XNEWVEC (char, strlen (old_asm_name) + strlen (attr) + 2); >> + create_new_asm_name (old_asm_name, attr, new_asm_name); > > I thought after discussions with Jan that this wasn't going to be necessary > as cloning should create a suitable name for us? Yes. This is not necessary. However that way we'll have the following code in dispatcher: cmpl$6, __cpu_model
Re: [PATCH] New attribute to create target clones
On 10/12/2015 05:35 PM, Evgeny Stupachenko wrote: Hi All, Here is a new version of patch (attached). Bootstrap and make check are in progress (all new tests passed). New test case g++.dg/ext/mvc4.C fails with ICE, when options lower than "-mavx" are passed. However it has the same behavior if "target_clones" attribute is replaced by 2 corresponding "target" attributes. I've filed PR67946 on this: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67946 Thanks, Evgeny ChangeLog: 2015-10-13 Evgeny Stupachenko gcc/ * Makefile.in (OBJS): Add multiple_target.o. * attrib.c (make_attribute): Moved from config/i386/i386.c * config/i386/i386.c (make_attribute): Deleted. * multiple_target.c (make_attribute): New. (create_dispatcher_calls): Ditto. (get_attr_len): Ditto. (get_attr_str): Ditto. (is_valid_asm_symbol): Ditto. (create_new_asm_name): Ditto. (create_target_clone): Ditto. (expand_target_clones): Ditto. (ipa_target_clone): Ditto. (ipa_dispatcher_calls): Ditto. * passes.def (pass_target_clone): Two new ipa passes. * tree-pass.h (make_pass_target_clone): Ditto. gcc/c-family * c-common.c (handle_target_clones_attribute): New. * (c_common_attribute_table): Add handle_target_clones_attribute. * (handle_always_inline_attribute): Add check on target_clones attribute. * (handle_target_attribute): Ditto. gcc/testsuite * gcc.dg/mvc1.c: New test for multiple targets cloning. * gcc.dg/mvc2.c: Ditto. * gcc.dg/mvc3.c: Ditto. * gcc.dg/mvc4.c: Ditto. * gcc.dg/mvc5.c: Ditto. * gcc.dg/mvc6.c: Ditto. * gcc.dg/mvc7.c: Ditto. * g++.dg/ext/mvc1.C: Ditto. * g++.dg/ext/mvc2.C: Ditto. * g++.dg/ext/mvc3.C: Ditto. * g++.dg/ext/mvc4.C: Ditto. gcc/doc * doc/extend.texi (target_clones): New attribute description. diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi index 23e6a76..f9d28d1 100644 --- a/gcc/doc/extend.texi +++ b/gcc/doc/extend.texi @@ -3066,6 +3066,19 @@ This function attribute make a stack protection of the function if flags @option{fstack-protector} or @option{fstack-protector-strong} or @option{fstack-protector-explicit} are set. +@item target_clones (@var{options}) +@cindex @code{target_clones} function attribute +The @code{target_clones} attribute is used to specify that a function is to +be cloned into multiple versions compiled with different target options +than specified on the command line. The supported options and restrictions +are the same as for @code{target}. "as for @code{target}" -> "as for the @code{target} attribute." I think that makes it a tiny bit clearer. + +/* Creates target clone of NODE. */ + +static cgraph_node * +create_target_clone (cgraph_node *node, bool definition) +{ + cgraph_node *new_node; + if (definition) +{ + new_node = node->create_version_clone_with_body (vNULL, NULL, + NULL, false, + NULL, NULL, + "target_clone"); + new_node->externally_visible = node->externally_visible; + new_node->address_taken = node->address_taken; + new_node->thunk = node->thunk; + new_node->alias = node->alias; + new_node->weakref = node->weakref; + new_node->cpp_implicit_alias = node->cpp_implicit_alias; + new_node->local.local = node->local.local; So do we need to explicitly clear TREE_PUBLIC here? It also seems like copying externally_visible, address_taken and possibly some of those other fields is wrong. The clone is going to be local to the CU, it doesn't inherit those properties from the original -- only the dispatcher needs to inherit those properties, right? + + + for (i = 0; i < attrnum; i++) +{ + char *attr = attrs[i]; + cgraph_node *new_node = create_target_clone (node, defenition); + char *new_asm_name = + XNEWVEC (char, strlen (old_asm_name) + strlen (attr) + 2); + create_new_asm_name (old_asm_name, attr, new_asm_name); I thought after discussions with Jan that this wasn't going to be necessary as cloning should create a suitable name for us? Jeff
Re: [PATCH] New attribute to create target clones
PING. On Thu, Oct 15, 2015 at 12:32 AM, Evgeny Stupachenko wrote: > Bootstrap and make check for x86 passed. No new fails. > Please ignore an empty line added to omp-low.c in the patch, the > misprint will be removed prior to a commit. > > Thanks, > Evgeny > > On Tue, Oct 13, 2015 at 2:35 AM, Evgeny Stupachenko > wrote: >> Hi All, >> >> Here is a new version of patch (attached). >> Bootstrap and make check are in progress (all new tests passed). >> >> New test case g++.dg/ext/mvc4.C fails with ICE, when options lower >> than "-mavx" are passed. >> However it has the same behavior if "target_clones" attribute is >> replaced by 2 corresponding "target" attributes. >> I've filed PR67946 on this: >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67946 >> >> Thanks, >> Evgeny >> >> ChangeLog: >> >> 2015-10-13 Evgeny Stupachenko >> gcc/ >> * Makefile.in (OBJS): Add multiple_target.o. >> * attrib.c (make_attribute): Moved from config/i386/i386.c >> * config/i386/i386.c (make_attribute): Deleted. >> * multiple_target.c (make_attribute): New. >> (create_dispatcher_calls): Ditto. >> (get_attr_len): Ditto. >> (get_attr_str): Ditto. >> (is_valid_asm_symbol): Ditto. >> (create_new_asm_name): Ditto. >> (create_target_clone): Ditto. >> (expand_target_clones): Ditto. >> (ipa_target_clone): Ditto. >> (ipa_dispatcher_calls): Ditto. >> * passes.def (pass_target_clone): Two new ipa passes. >> * tree-pass.h (make_pass_target_clone): Ditto. >> >> gcc/c-family >> * c-common.c (handle_target_clones_attribute): New. >> * (c_common_attribute_table): Add handle_target_clones_attribute. >> * (handle_always_inline_attribute): Add check on target_clones >> attribute. >> * (handle_target_attribute): Ditto. >> >> gcc/testsuite >> * gcc.dg/mvc1.c: New test for multiple targets cloning. >> * gcc.dg/mvc2.c: Ditto. >> * gcc.dg/mvc3.c: Ditto. >> * gcc.dg/mvc4.c: Ditto. >> * gcc.dg/mvc5.c: Ditto. >> * gcc.dg/mvc6.c: Ditto. >> * gcc.dg/mvc7.c: Ditto. >> * g++.dg/ext/mvc1.C: Ditto. >> * g++.dg/ext/mvc2.C: Ditto. >> * g++.dg/ext/mvc3.C: Ditto. >> * g++.dg/ext/mvc4.C: Ditto. >> >> gcc/doc >> * doc/extend.texi (target_clones): New attribute description. >> >> On Sat, Oct 10, 2015 at 12:44 AM, Evgeny Stupachenko >> wrote: >>> On Fri, Oct 9, 2015 at 11:04 PM, Jan Hubicka wrote: > On Fri, Oct 9, 2015 at 9:27 PM, Jan Hubicka wrote: > >> >Of course it also depends what you inline into function. You can have > >> > > >> >bar() target(-mavx) {fancy avx code} > >> >foobar() { .. if (avx) bar();} > >> >foo() ctarget(-mavx,-mno-avx) {foobar();} > > "no-" targets are not supported Why not? I suppose I can use -march=x86_64 in a file compiled with -march=core-avx2 or something like that, too. >>> Sure, you can. target(arch=x86-64) is ok. I mean exactly target(no-avx) >>> returns: >>> >>> aaa.cpp: In function '': >>> aaa.cpp:7:5: error: No dispatcher found for no-avx >>> int bar() >>> ^ >>> > > >> > > >> >Now if you compile with -mavx and because ctarget takes effect only > >> >after inlining, > >> >at inlining time the target attributes will match and we can edn up > >> >inline bar->foobar->foo. > >> >After that we multiversion foo and drop AVX flag we will likely get > >> >ICE at expansion > >> >time. > >> But isn't that avoided by fixing up the call graph so that all calls > >> to the affected function are going through the dispatcher? Or is > >> that happening too late? > > > > There is dispatcher only for foo that is the root of the callgarph tree. > > When inlining we compare target attributes for match (in > > can_inline_edge_p). > > We do not compare ctarget attributes. Expanding ctarget to target > > early would > > avoid need for ctarget handling. > Currently inlining is disabled for functions with target_clone attribute: Do you also disable inlining into functions with target_clone? What I am concerned about is early inliner inlining (say) AVX code into ctarget function because at early inlining time the target is not applied, yet. >>> Right. Now I've got your point and ICE on the test. >>> Yes the solution is to disable inline into target_clones function. >>> Or to move the pass creating clones before inline (as you suggested) >>> and leave dispatcher creator after inline. >>> >>> I like you suggestion. It fixes the ICE. >>> I'll fix the patch and retest. >>> >>> Thank you for the review, >>> Evgeny. >>> >>> Honza
Re: [PATCH] New attribute to create target clones
Bootstrap and make check for x86 passed. No new fails. Please ignore an empty line added to omp-low.c in the patch, the misprint will be removed prior to a commit. Thanks, Evgeny On Tue, Oct 13, 2015 at 2:35 AM, Evgeny Stupachenko wrote: > Hi All, > > Here is a new version of patch (attached). > Bootstrap and make check are in progress (all new tests passed). > > New test case g++.dg/ext/mvc4.C fails with ICE, when options lower > than "-mavx" are passed. > However it has the same behavior if "target_clones" attribute is > replaced by 2 corresponding "target" attributes. > I've filed PR67946 on this: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67946 > > Thanks, > Evgeny > > ChangeLog: > > 2015-10-13 Evgeny Stupachenko > gcc/ > * Makefile.in (OBJS): Add multiple_target.o. > * attrib.c (make_attribute): Moved from config/i386/i386.c > * config/i386/i386.c (make_attribute): Deleted. > * multiple_target.c (make_attribute): New. > (create_dispatcher_calls): Ditto. > (get_attr_len): Ditto. > (get_attr_str): Ditto. > (is_valid_asm_symbol): Ditto. > (create_new_asm_name): Ditto. > (create_target_clone): Ditto. > (expand_target_clones): Ditto. > (ipa_target_clone): Ditto. > (ipa_dispatcher_calls): Ditto. > * passes.def (pass_target_clone): Two new ipa passes. > * tree-pass.h (make_pass_target_clone): Ditto. > > gcc/c-family > * c-common.c (handle_target_clones_attribute): New. > * (c_common_attribute_table): Add handle_target_clones_attribute. > * (handle_always_inline_attribute): Add check on target_clones > attribute. > * (handle_target_attribute): Ditto. > > gcc/testsuite > * gcc.dg/mvc1.c: New test for multiple targets cloning. > * gcc.dg/mvc2.c: Ditto. > * gcc.dg/mvc3.c: Ditto. > * gcc.dg/mvc4.c: Ditto. > * gcc.dg/mvc5.c: Ditto. > * gcc.dg/mvc6.c: Ditto. > * gcc.dg/mvc7.c: Ditto. > * g++.dg/ext/mvc1.C: Ditto. > * g++.dg/ext/mvc2.C: Ditto. > * g++.dg/ext/mvc3.C: Ditto. > * g++.dg/ext/mvc4.C: Ditto. > > gcc/doc > * doc/extend.texi (target_clones): New attribute description. > > On Sat, Oct 10, 2015 at 12:44 AM, Evgeny Stupachenko > wrote: >> On Fri, Oct 9, 2015 at 11:04 PM, Jan Hubicka wrote: On Fri, Oct 9, 2015 at 9:27 PM, Jan Hubicka wrote: >> >Of course it also depends what you inline into function. You can have >> > >> >bar() target(-mavx) {fancy avx code} >> >foobar() { .. if (avx) bar();} >> >foo() ctarget(-mavx,-mno-avx) {foobar();} "no-" targets are not supported >>> >>> Why not? I suppose I can use -march=x86_64 in a file compiled with >>> -march=core-avx2 or something like that, too. >> Sure, you can. target(arch=x86-64) is ok. I mean exactly target(no-avx) >> returns: >> >> aaa.cpp: In function '': >> aaa.cpp:7:5: error: No dispatcher found for no-avx >> int bar() >> ^ >> >> > >> >Now if you compile with -mavx and because ctarget takes effect only >> >after inlining, >> >at inlining time the target attributes will match and we can edn up >> >inline bar->foobar->foo. >> >After that we multiversion foo and drop AVX flag we will likely get >> >ICE at expansion >> >time. >> But isn't that avoided by fixing up the call graph so that all calls >> to the affected function are going through the dispatcher? Or is >> that happening too late? > > There is dispatcher only for foo that is the root of the callgarph tree. > When inlining we compare target attributes for match (in > can_inline_edge_p). > We do not compare ctarget attributes. Expanding ctarget to target early > would > avoid need for ctarget handling. Currently inlining is disabled for functions with target_clone attribute: >>> >>> Do you also disable inlining into functions with target_clone? >>> What I am concerned about is early inliner inlining (say) AVX code into >>> ctarget >>> function because at early inlining time the target is not applied, yet. >> Right. Now I've got your point and ICE on the test. >> Yes the solution is to disable inline into target_clones function. >> Or to move the pass creating clones before inline (as you suggested) >> and leave dispatcher creator after inline. >> >> I like you suggestion. It fixes the ICE. >> I'll fix the patch and retest. >> >> Thank you for the review, >> Evgeny. >> >> >>> >>> Honza
Re: [PATCH] New attribute to create target clones
Hi All, Here is a new version of patch (attached). Bootstrap and make check are in progress (all new tests passed). New test case g++.dg/ext/mvc4.C fails with ICE, when options lower than "-mavx" are passed. However it has the same behavior if "target_clones" attribute is replaced by 2 corresponding "target" attributes. I've filed PR67946 on this: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67946 Thanks, Evgeny ChangeLog: 2015-10-13 Evgeny Stupachenko gcc/ * Makefile.in (OBJS): Add multiple_target.o. * attrib.c (make_attribute): Moved from config/i386/i386.c * config/i386/i386.c (make_attribute): Deleted. * multiple_target.c (make_attribute): New. (create_dispatcher_calls): Ditto. (get_attr_len): Ditto. (get_attr_str): Ditto. (is_valid_asm_symbol): Ditto. (create_new_asm_name): Ditto. (create_target_clone): Ditto. (expand_target_clones): Ditto. (ipa_target_clone): Ditto. (ipa_dispatcher_calls): Ditto. * passes.def (pass_target_clone): Two new ipa passes. * tree-pass.h (make_pass_target_clone): Ditto. gcc/c-family * c-common.c (handle_target_clones_attribute): New. * (c_common_attribute_table): Add handle_target_clones_attribute. * (handle_always_inline_attribute): Add check on target_clones attribute. * (handle_target_attribute): Ditto. gcc/testsuite * gcc.dg/mvc1.c: New test for multiple targets cloning. * gcc.dg/mvc2.c: Ditto. * gcc.dg/mvc3.c: Ditto. * gcc.dg/mvc4.c: Ditto. * gcc.dg/mvc5.c: Ditto. * gcc.dg/mvc6.c: Ditto. * gcc.dg/mvc7.c: Ditto. * g++.dg/ext/mvc1.C: Ditto. * g++.dg/ext/mvc2.C: Ditto. * g++.dg/ext/mvc3.C: Ditto. * g++.dg/ext/mvc4.C: Ditto. gcc/doc * doc/extend.texi (target_clones): New attribute description. On Sat, Oct 10, 2015 at 12:44 AM, Evgeny Stupachenko wrote: > On Fri, Oct 9, 2015 at 11:04 PM, Jan Hubicka wrote: >>> On Fri, Oct 9, 2015 at 9:27 PM, Jan Hubicka wrote: >>> >> >Of course it also depends what you inline into function. You can have >>> >> > >>> >> >bar() target(-mavx) {fancy avx code} >>> >> >foobar() { .. if (avx) bar();} >>> >> >foo() ctarget(-mavx,-mno-avx) {foobar();} >>> >>> "no-" targets are not supported >> >> Why not? I suppose I can use -march=x86_64 in a file compiled with >> -march=core-avx2 or something like that, too. > Sure, you can. target(arch=x86-64) is ok. I mean exactly target(no-avx) > returns: > > aaa.cpp: In function '': > aaa.cpp:7:5: error: No dispatcher found for no-avx > int bar() > ^ > >>> >>> >> > >>> >> >Now if you compile with -mavx and because ctarget takes effect only >>> >> >after inlining, >>> >> >at inlining time the target attributes will match and we can edn up >>> >> >inline bar->foobar->foo. >>> >> >After that we multiversion foo and drop AVX flag we will likely get ICE >>> >> >at expansion >>> >> >time. >>> >> But isn't that avoided by fixing up the call graph so that all calls >>> >> to the affected function are going through the dispatcher? Or is >>> >> that happening too late? >>> > >>> > There is dispatcher only for foo that is the root of the callgarph tree. >>> > When inlining we compare target attributes for match (in >>> > can_inline_edge_p). >>> > We do not compare ctarget attributes. Expanding ctarget to target early >>> > would >>> > avoid need for ctarget handling. >>> Currently inlining is disabled for functions with target_clone attribute: >> >> Do you also disable inlining into functions with target_clone? >> What I am concerned about is early inliner inlining (say) AVX code into >> ctarget >> function because at early inlining time the target is not applied, yet. > Right. Now I've got your point and ICE on the test. > Yes the solution is to disable inline into target_clones function. > Or to move the pass creating clones before inline (as you suggested) > and leave dispatcher creator after inline. > > I like you suggestion. It fixes the ICE. > I'll fix the patch and retest. > > Thank you for the review, > Evgeny. > > >> >> Honza target_clones.patch Description: Binary data
Re: [PATCH] New attribute to create target clones
On Fri, Oct 9, 2015 at 11:04 PM, Jan Hubicka wrote: >> On Fri, Oct 9, 2015 at 9:27 PM, Jan Hubicka wrote: >> >> >Of course it also depends what you inline into function. You can have >> >> > >> >> >bar() target(-mavx) {fancy avx code} >> >> >foobar() { .. if (avx) bar();} >> >> >foo() ctarget(-mavx,-mno-avx) {foobar();} >> >> "no-" targets are not supported > > Why not? I suppose I can use -march=x86_64 in a file compiled with > -march=core-avx2 or something like that, too. Sure, you can. target(arch=x86-64) is ok. I mean exactly target(no-avx) returns: aaa.cpp: In function '': aaa.cpp:7:5: error: No dispatcher found for no-avx int bar() ^ >> >> >> > >> >> >Now if you compile with -mavx and because ctarget takes effect only >> >> >after inlining, >> >> >at inlining time the target attributes will match and we can edn up >> >> >inline bar->foobar->foo. >> >> >After that we multiversion foo and drop AVX flag we will likely get ICE >> >> >at expansion >> >> >time. >> >> But isn't that avoided by fixing up the call graph so that all calls >> >> to the affected function are going through the dispatcher? Or is >> >> that happening too late? >> > >> > There is dispatcher only for foo that is the root of the callgarph tree. >> > When inlining we compare target attributes for match (in >> > can_inline_edge_p). >> > We do not compare ctarget attributes. Expanding ctarget to target early >> > would >> > avoid need for ctarget handling. >> Currently inlining is disabled for functions with target_clone attribute: > > Do you also disable inlining into functions with target_clone? > What I am concerned about is early inliner inlining (say) AVX code into > ctarget > function because at early inlining time the target is not applied, yet. Right. Now I've got your point and ICE on the test. Yes the solution is to disable inline into target_clones function. Or to move the pass creating clones before inline (as you suggested) and leave dispatcher creator after inline. I like you suggestion. It fixes the ICE. I'll fix the patch and retest. Thank you for the review, Evgeny. > > Honza
Re: [PATCH] New attribute to create target clones
> On Fri, Oct 9, 2015 at 9:27 PM, Jan Hubicka wrote: > >> >Of course it also depends what you inline into function. You can have > >> > > >> >bar() target(-mavx) {fancy avx code} > >> >foobar() { .. if (avx) bar();} > >> >foo() ctarget(-mavx,-mno-avx) {foobar();} > > "no-" targets are not supported Why not? I suppose I can use -march=x86_64 in a file compiled with -march=core-avx2 or something like that, too. > > >> > > >> >Now if you compile with -mavx and because ctarget takes effect only after > >> >inlining, > >> >at inlining time the target attributes will match and we can edn up > >> >inline bar->foobar->foo. > >> >After that we multiversion foo and drop AVX flag we will likely get ICE > >> >at expansion > >> >time. > >> But isn't that avoided by fixing up the call graph so that all calls > >> to the affected function are going through the dispatcher? Or is > >> that happening too late? > > > > There is dispatcher only for foo that is the root of the callgarph tree. > > When inlining we compare target attributes for match (in can_inline_edge_p). > > We do not compare ctarget attributes. Expanding ctarget to target early > > would > > avoid need for ctarget handling. > Currently inlining is disabled for functions with target_clone attribute: Do you also disable inlining into functions with target_clone? What I am concerned about is early inliner inlining (say) AVX code into ctarget function because at early inlining time the target is not applied, yet. Honza
Re: [PATCH] New attribute to create target clones
On Fri, Oct 9, 2015 at 9:27 PM, Jan Hubicka wrote: >> >Of course it also depends what you inline into function. You can have >> > >> >bar() target(-mavx) {fancy avx code} >> >foobar() { .. if (avx) bar();} >> >foo() ctarget(-mavx,-mno-avx) {foobar();} "no-" targets are not supported >> > >> >Now if you compile with -mavx and because ctarget takes effect only after >> >inlining, >> >at inlining time the target attributes will match and we can edn up inline >> >bar->foobar->foo. >> >After that we multiversion foo and drop AVX flag we will likely get ICE at >> >expansion >> >time. >> But isn't that avoided by fixing up the call graph so that all calls >> to the affected function are going through the dispatcher? Or is >> that happening too late? > > There is dispatcher only for foo that is the root of the callgarph tree. > When inlining we compare target attributes for match (in can_inline_edge_p). > We do not compare ctarget attributes. Expanding ctarget to target early would > avoid need for ctarget handling. Currently inlining is disabled for functions with target_clone attribute: if call from versioned function is call of multiversion function: foo.avx2() { ... bar(); ... } bar dispatcher is better than inlining: foo.avx2() { ... bar.ifunc(); ... } That is better from my point of view as: 1. multiversioning is going to be used mostly for target specific optimizations like vectorization, scheduling... For these cases inlininng not giving much. One of good examples is moving hot vectorizable loop into the function and add target_clones attribute. 2. Inlining algorithm when we have not equal chains of architectures in target_clones attribute are complex: __attribute__((target_clones("avx","arch=slm","arch=core-avx2","default"))) __attribute__((target_clones("ssse3","arch=atom","arch=slm","default"))) 3. If for one version inline occurs, but for another not we can get unexpected (from user point) performance difference Anyway I'll add the test you've mentioned to the patch. The other big question is how all this deal with OpenMP clauses. For now both target and target_clones attributes disable #pragma omp effect inserting scalar ifunc call into loop. Thanks, Evgeny > >> >> > >> >> >> >> >> >>Since we're going through a dispatcher I don't think we're allowed >> >>to change the ABI across the clones. >> > >> >If the dispatcher was something like >> > >> >switch(value) >> >{ >> > case 1: foo_ver1(param); break; >> > case 2: foo_ver2(param); break; >> >} >> >then it would be possible to change ABI if we can prove that param=0 in >> >ipa-cp. >> >If the dispatcher uses indirect calls, then we probably want to consider >> >foo_ver* as having address taken and thus not local. >> Yea, I guess that could work. For some reason I kept thinking about >> indirect calls, but that shouldn't be the case here. >> >> That argues further that setting TREE_PUBLIC on the target specific >> implementations is wrong. > > Yep, if the function doesn't need to be public then the whole block of code > tampering with visibility and other flags can go. We probably also don't need > to care about giving clone some fixed assembler name as clonning will invent > .target_clone.XYZ name itself. > > Honza >> >> >> Jeff
Re: [PATCH] New attribute to create target clones
> >Of course it also depends what you inline into function. You can have > > > >bar() target(-mavx) {fancy avx code} > >foobar() { .. if (avx) bar();} > >foo() ctarget(-mavx,-mno-avx) {foobar();} > > > >Now if you compile with -mavx and because ctarget takes effect only after > >inlining, > >at inlining time the target attributes will match and we can edn up inline > >bar->foobar->foo. > >After that we multiversion foo and drop AVX flag we will likely get ICE at > >expansion > >time. > But isn't that avoided by fixing up the call graph so that all calls > to the affected function are going through the dispatcher? Or is > that happening too late? There is dispatcher only for foo that is the root of the callgarph tree. When inlining we compare target attributes for match (in can_inline_edge_p). We do not compare ctarget attributes. Expanding ctarget to target early would avoid need for ctarget handling. > > > > >> > >> > >>Since we're going through a dispatcher I don't think we're allowed > >>to change the ABI across the clones. > > > >If the dispatcher was something like > > > >switch(value) > >{ > > case 1: foo_ver1(param); break; > > case 2: foo_ver2(param); break; > >} > >then it would be possible to change ABI if we can prove that param=0 in > >ipa-cp. > >If the dispatcher uses indirect calls, then we probably want to consider > >foo_ver* as having address taken and thus not local. > Yea, I guess that could work. For some reason I kept thinking about > indirect calls, but that shouldn't be the case here. > > That argues further that setting TREE_PUBLIC on the target specific > implementations is wrong. Yep, if the function doesn't need to be public then the whole block of code tampering with visibility and other flags can go. We probably also don't need to care about giving clone some fixed assembler name as clonning will invent .target_clone.XYZ name itself. Honza > > > Jeff
Re: [PATCH] New attribute to create target clones
On 10/08/2015 03:36 PM, Jan Hubicka wrote: Yes, here you have different names for different variants of the function body. Basically this pass takes ctarget attribute and creates bunch of verisons of the functions and assigns them the proper target attributes, right? Right. Given a single function in the source tree with the new attribute, we'll create clones and compile each clone with a different set of options. It's got a lot of similarities to the multi-versioning code. The key difference is with multi-versioning, you actually have a different source level implementation for each target while Evgeny's stuff has a single source implementation. One thing I am confused about is why this does not happen early? What happens to inlines of functions with specific taret requirements? I.e. if I compile with AVX enabled have function body with AVX code but then, at late compilation, force a clone with -mno-avx? I would expect cloning to happen early, perhaps even before early optimizations... Switching random target flags mid optimization queue seems dangerous. These shouldn't be inlined to the best of my knowledge. We go back and direct all callers to a dispatcher. Inlining them would be a mistake since the goal here is to specialize the clones around target capabilities. Thus if something got inlined, then we lose that ability. OK, I assume that every multiversioned function will end up in something like this: foo_ver1() target(...) { } foo_ver2() target(...) { } foo() { dispatch either to foo_ver1() or foo_ver2() } Yes. I wonder why foo_ver1/foo_ver2 needs ever become public? If there is only way to call them via dispatcher, then the code setting TREE_PUBLIC seems wrong. If there is direct way to call them, then inlinng is possible. I can't offhand think of a good reason why they should be public. In fact, it would defeat the purpose of this feature to begin with if those symbols were directly reachable. So, yes, setting TREE_PUBLIC seems wrong. Of course it also depends what you inline into function. You can have bar() target(-mavx) {fancy avx code} foobar() { .. if (avx) bar();} foo() ctarget(-mavx,-mno-avx) {foobar();} Now if you compile with -mavx and because ctarget takes effect only after inlining, at inlining time the target attributes will match and we can edn up inline bar->foobar->foo. After that we multiversion foo and drop AVX flag we will likely get ICE at expansion time. But isn't that avoided by fixing up the call graph so that all calls to the affected function are going through the dispatcher? Or is that happening too late? Since we're going through a dispatcher I don't think we're allowed to change the ABI across the clones. If the dispatcher was something like switch(value) { case 1: foo_ver1(param); break; case 2: foo_ver2(param); break; } then it would be possible to change ABI if we can prove that param=0 in ipa-cp. If the dispatcher uses indirect calls, then we probably want to consider foo_ver* as having address taken and thus not local. Yea, I guess that could work. For some reason I kept thinking about indirect calls, but that shouldn't be the case here. That argues further that setting TREE_PUBLIC on the target specific implementations is wrong. Jeff
Re: [PATCH] New attribute to create target clones
On 10/08/2015 02:01 PM, Evgeny Stupachenko wrote: On Thu, Oct 8, 2015 at 10:00 PM, Jeff Law wrote: On 09/24/2015 04:28 PM, Evgeny Stupachenko wrote: I've fixed ICE and review issues. x86 make check and bootstrap passed. Thanks, Evgeny ChangeLog 2015-09-25 Evgeny Stupachenko gcc/ * Makefile.in (OBJS): Add multiple_target.o. * multiple_target.c (make_attribute): New. (create_dispatcher_calls): Ditto. (expand_target_clones): Ditto. (ipa_target_clone): Ditto. * passes.def (pass_target_clone): New ipa pass. * tree-pass.h (make_pass_target_clone): Ditto. gcc/c-family * c-common.c (handle_target_clones_attribute): New. * (c_common_attribute_table): Add handle_target_clones_attribute. * (handle_always_inline_attribute): Add check on target_clones attribute. * (handle_target_attribute): Ditto. gcc/testsuite * gcc.dg/mvc1.c: New test for multiple targets cloning. * gcc.dg/mvc2.c: Ditto. * gcc.dg/mvc3.c: Ditto. * gcc.dg/mvc4.c: Ditto. * gcc.dg/mvc5.c: Ditto. * gcc.dg/mvc6.c: Ditto. * gcc.dg/mvc7.c: Ditto. * g++.dg/ext/mvc1.C: Ditto. * g++.dg/ext/mvc2.C: Ditto. * g++.dg/ext/mvc3.C: Ditto. gcc/doc * doc/extend.texi (target_clones): New attribute description. target_clones.patch Sorry this has taken so long to come back to... As I mentioned a couple months ago, I'd hoped Jan would chime in on the IPA/symtab requirements. But that didn't happen. SO I went back and reviewed the discussion between Jan, Ilya & myself WRT some of the rules around aliases, clones, etc. I think the key question for this patch is whether or not the clones have the same assembler name or not. From looking at expand_target_clones, I'm confident the answer is the clones have different assembler names. In fact, the assembler names are munged with the options used for that specific clone of the original function. +/* Makes a function attribute of the form NAME(ARG_NAME) and chains + it to CHAIN. */ + +static tree +make_attribute (const char *name, const char *arg_name, tree chain) +{ + tree attr_name; + tree attr_arg_name; + tree attr_args; + tree attr; + + attr_name = get_identifier (name); + attr_arg_name = build_string (strlen (arg_name), arg_name); + attr_args = tree_cons (NULL_TREE, attr_arg_name, NULL_TREE); + attr = tree_cons (attr_name, attr_args, chain); + return attr; +} This seems generic enough that I'd prefer it in attribs.c. I was rather surprised when I looked and didn't find an existing routine to do this. + +/* If the call in NODE has multiple target attribute with multiple fields, + replace it with dispatcher call and create dispatcher (once). */ + +static void +create_dispatcher_calls (struct cgraph_node *node) +{ + cgraph_edge *e; + cgraph_edge *e_next; + for (e = node->callers; e ;e = (e == NULL) ? e_next : e->next_caller) That's a rather strange way to write the loop increment. If I follow the loop logic correctly, it seems that we always end up using e->next_caller, it's just obscured. For the test if we're calling a versioned function, we just "continue". e will be non-null and thus we use e->next_caller to set e for the next iteration. If the test for calling a versioned function falls, we set e_next to e->next_caller, then later set e to NULL. That results in using e_next to set e for the next iteration. But e_next was initialized to e->next_caller. So why not just write the loop increment as e = e->next-caller? Because of this: e->redirect_callee (inode); It modifies next_caller field. The other way is to remember all what we want to redirect and create one more loop through the modifications to apply them. Seems like a comment would be wise. I'm slightly concerned with using the pretty printer to build up the new name. Is there precedent for this anywhere else in GCC? I don't remember where it exactly came from. However it's not a big deal to simplify this to std functions. Thanks. Not sure why, but I'd appreciate that change. When creating the munged name, don't you also have to make sure that other symbols that aren't supported for names don't sneak through? I see that you replace = and -, but you'd need to replace any symbol that could possibly be used in an option, but which isn't allowed in a function name at the assembler level. I'd be worried about anything that might possibly be seen as an operator by the assembler, '.', and possibly others. This restriction comes from "targetm.target_option.valid_attribute_p" and it is the same for current implementation of function multiversioning. It exits with error: "attribute(target("...")) is unknown". It looks reasonable to put the check before symtab changes. Right, but there's nothing inherently that says that a option couldn't have other operators such as
Re: [PATCH] New attribute to create target clones
> > > >Yes, here you have different names for different variants of the function > >body. Basically this pass takes ctarget attribute and creates bunch of > >verisons > >of the functions and assigns them the proper target attributes, right? > Right. Given a single function in the source tree with the new > attribute, we'll create clones and compile each clone with a > different set of options. It's got a lot of similarities to the > multi-versioning code. The key difference is with multi-versioning, > you actually have a different source level implementation for each > target while Evgeny's stuff has a single source implementation. > > > > >One thing I am confused about is why this does not happen early? > >What happens to inlines of functions with specific taret requirements? I.e. > >if I compile with AVX enabled have function body with AVX code but then, at > >late compilation, force a clone with -mno-avx? > > > >I would expect cloning to happen early, perhaps even before early > >optimizations... > >Switching random target flags mid optimization queue seems dangerous. > These shouldn't be inlined to the best of my knowledge. We go back > and direct all callers to a dispatcher. Inlining them would be a > mistake since the goal here is to specialize the clones around > target capabilities. Thus if something got inlined, then we lose > that ability. OK, I assume that every multiversioned function will end up in something like this: foo_ver1() target(...) { } foo_ver2() target(...) { } foo() { dispatch either to foo_ver1() or foo_ver2() } I wonder why foo_ver1/foo_ver2 needs ever become public? If there is only way to call them via dispatcher, then the code setting TREE_PUBLIC seems wrong. If there is direct way to call them, then inlinng is possible. Of course it also depends what you inline into function. You can have bar() target(-mavx) {fancy avx code} foobar() { .. if (avx) bar();} foo() ctarget(-mavx,-mno-avx) {foobar();} Now if you compile with -mavx and because ctarget takes effect only after inlining, at inlining time the target attributes will match and we can edn up inline bar->foobar->foo. After that we multiversion foo and drop AVX flag we will likely get ICE at expansion time. > > > Since we're going through a dispatcher I don't think we're allowed > to change the ABI across the clones. If the dispatcher was something like switch(value) { case 1: foo_ver1(param); break; case 2: foo_ver2(param); break; } then it would be possible to change ABI if we can prove that param=0 in ipa-cp. If the dispatcher uses indirect calls, then we probably want to consider foo_ver* as having address taken and thus not local. Honza > > > > > >Also when you are copying a function, you probably want to copy the > >associated > >thunks and version them, too? > Dunno on that. > > Jeff
Re: [PATCH] New attribute to create target clones
On Thu, Oct 8, 2015 at 10:00 PM, Jeff Law wrote: > On 09/24/2015 04:28 PM, Evgeny Stupachenko wrote: >> >> I've fixed ICE and review issues. >> x86 make check and bootstrap passed. >> >> Thanks, >> Evgeny >> >> ChangeLog >> >> 2015-09-25 Evgeny Stupachenko >> >> gcc/ >> * Makefile.in (OBJS): Add multiple_target.o. >> * multiple_target.c (make_attribute): New. >> (create_dispatcher_calls): Ditto. >> (expand_target_clones): Ditto. >> (ipa_target_clone): Ditto. >> * passes.def (pass_target_clone): New ipa pass. >> * tree-pass.h (make_pass_target_clone): Ditto. >> >> gcc/c-family >> * c-common.c (handle_target_clones_attribute): New. >> * (c_common_attribute_table): Add handle_target_clones_attribute. >> * (handle_always_inline_attribute): Add check on target_clones >> attribute. >> * (handle_target_attribute): Ditto. >> >> gcc/testsuite >> * gcc.dg/mvc1.c: New test for multiple targets cloning. >> * gcc.dg/mvc2.c: Ditto. >> * gcc.dg/mvc3.c: Ditto. >> * gcc.dg/mvc4.c: Ditto. >> * gcc.dg/mvc5.c: Ditto. >> * gcc.dg/mvc6.c: Ditto. >> * gcc.dg/mvc7.c: Ditto. >> * g++.dg/ext/mvc1.C: Ditto. >> * g++.dg/ext/mvc2.C: Ditto. >> * g++.dg/ext/mvc3.C: Ditto. >> >> gcc/doc >> * doc/extend.texi (target_clones): New attribute description. >> >> > >> >> target_clones.patch > > Sorry this has taken so long to come back to... As I mentioned a couple > months ago, I'd hoped Jan would chime in on the IPA/symtab requirements. > But that didn't happen. > > > SO I went back and reviewed the discussion between Jan, Ilya & myself WRT > some of the rules around aliases, clones, etc. I think the key question for > this patch is whether or not the clones have the same assembler name or not. > From looking at expand_target_clones, I'm confident the answer is the clones > have different assembler names. In fact, the assembler names are munged > with the options used for that specific clone of the original function. > > >> >> +/* Makes a function attribute of the form NAME(ARG_NAME) and chains >> + it to CHAIN. */ >> + >> +static tree >> +make_attribute (const char *name, const char *arg_name, tree chain) >> +{ >> + tree attr_name; >> + tree attr_arg_name; >> + tree attr_args; >> + tree attr; >> + >> + attr_name = get_identifier (name); >> + attr_arg_name = build_string (strlen (arg_name), arg_name); >> + attr_args = tree_cons (NULL_TREE, attr_arg_name, NULL_TREE); >> + attr = tree_cons (attr_name, attr_args, chain); >> + return attr; >> +} > > This seems generic enough that I'd prefer it in attribs.c. I was rather > surprised when I looked and didn't find an existing routine to do this. > > >> + >> +/* If the call in NODE has multiple target attribute with multiple >> fields, >> + replace it with dispatcher call and create dispatcher (once). */ >> + >> +static void >> +create_dispatcher_calls (struct cgraph_node *node) >> +{ >> + cgraph_edge *e; >> + cgraph_edge *e_next; >> + for (e = node->callers; e ;e = (e == NULL) ? e_next : e->next_caller) > > That's a rather strange way to write the loop increment. If I follow the > loop logic correctly, it seems that we always end up using e->next_caller, > it's just obscured. > > For the test if we're calling a versioned function, we just "continue". e > will be non-null and thus we use e->next_caller to set e for the next > iteration. > > If the test for calling a versioned function falls, we set e_next to > e->next_caller, then later set e to NULL. That results in using e_next to > set e for the next iteration. But e_next was initialized to e->next_caller. > > So why not just write the loop increment as e = e->next-caller? Because of this: e->redirect_callee (inode); It modifies next_caller field. The other way is to remember all what we want to redirect and create one more loop through the modifications to apply them. > > > >> +{ >> + tree resolver_decl; >> + tree idecl; >> + tree decl; >> + gimple *call = e->call_stmt; >> + struct cgraph_node *inode; >> + >> + /* Checking if call of function is call of versioned function. >> +Versioned function are not inlined, so there is no need to >> +check for inline. */ > > This comment doesn't parse well. Perhaps: > > /* Check if this is a call to a versioned function. Verisoned >fucntions are not inlined, so there is no need to check for that. */ > > >> + >> +/* If the function in NODE has multiple target attribute with multiple >> fields, >> + create the appropriate clone for each field. */ >> + >> +static bool >> +expand_target_clones (struct cgraph_node *node) > > So this is probably getting a little large. Can we look to refactor it a > little? It's not a huge deal and there's certainly code in GCC that is far > worse, but it just feels like there's e
Re: [PATCH] New attribute to create target clones
On 10/08/2015 01:23 PM, Jan Hubicka wrote: Sorry this has taken so long to come back to... As I mentioned a couple months ago, I'd hoped Jan would chime in on the IPA/symtab requirements. But that didn't happen. Sorry for that. I had bit too many real life things this summer and I am still trying to catch up. It happens to us all :-) No worries. Ilya's code seems different from what this patch does. Ilya simply needs multiple declarations for one physical assembler name (this is not an alias). This is not currently supported by symtab (support for that was removed long time ago as part of the one decl project) and I have some perliminary patches to push out, but since they add basic sanity checking that the different declarations of the same thing looks compatible I get too many positives I need to walk through. Those seems real bugs in glibc (which uses duplicated decls for checking) and the pointer bounds code. Right. I was mostly concerned because I goof'd letting those bits from Ilya in and didn't want to repeat the mistake again. It was pretty clear once I found the thread between the tree of us that Evgeny's patches were doing something rather different and didn't violate the assumptions of the symtab code. Yes, here you have different names for different variants of the function body. Basically this pass takes ctarget attribute and creates bunch of verisons of the functions and assigns them the proper target attributes, right? Right. Given a single function in the source tree with the new attribute, we'll create clones and compile each clone with a different set of options. It's got a lot of similarities to the multi-versioning code. The key difference is with multi-versioning, you actually have a different source level implementation for each target while Evgeny's stuff has a single source implementation. One thing I am confused about is why this does not happen early? What happens to inlines of functions with specific taret requirements? I.e. if I compile with AVX enabled have function body with AVX code but then, at late compilation, force a clone with -mno-avx? I would expect cloning to happen early, perhaps even before early optimizations... Switching random target flags mid optimization queue seems dangerous. These shouldn't be inlined to the best of my knowledge. We go back and direct all callers to a dispatcher. Inlining them would be a mistake since the goal here is to specialize the clones around target capabilities. Thus if something got inlined, then we lose that ability. As for the patch itself: + if (node->definition) + { + if (!node->has_gimple_body_p ()) + return false; + node->get_body (); + + /* Replacing initial function with clone only for 1st ctarget. */ + new_node = node->create_version_clone_with_body (vNULL, NULL, + NULL, false, + NULL, NULL, + "target_clone"); + new_node->externally_visible = node->externally_visible; + new_node->address_taken = node->address_taken; + new_node->thunk = node->thunk; + new_node->alias = node->alias; + new_node->weakref = node->weakref; + new_node->cpp_implicit_alias = node->cpp_implicit_alias; + new_node->local.local = node->local.local; + TREE_PUBLIC (new_node->decl) = TREE_PUBLIC (node->decl); + } Since you test it has gimple body, then you don't need to worry about alias/thunk/weakrefs/implicit_alias properties. Those will be never set. How does the dispatcher look like? Can the function be considered local in a sense that one can change calling conventions of one clone but not another? Easiest to think of them as having the same abilities as multi-versioning. Since we're going through a dispatcher I don't think we're allowed to change the ABI across the clones. Also when you are copying a function, you probably want to copy the associated thunks and version them, too? Dunno on that. Jeff
Re: [PATCH] New attribute to create target clones
> Sorry this has taken so long to come back to... As I mentioned a > couple months ago, I'd hoped Jan would chime in on the IPA/symtab > requirements. But that didn't happen. Sorry for that. I had bit too many real life things this summer and I am still trying to catch up. > > > SO I went back and reviewed the discussion between Jan, Ilya & > myself WRT some of the rules around aliases, clones, etc. I think > the key question for this patch is whether or not the clones have > the same assembler name or not. From looking at Ilya's code seems different from what this patch does. Ilya simply needs multiple declarations for one physical assembler name (this is not an alias). This is not currently supported by symtab (support for that was removed long time ago as part of the one decl project) and I have some perliminary patches to push out, but since they add basic sanity checking that the different declarations of the same thing looks compatible I get too many positives I need to walk through. Those seems real bugs in glibc (which uses duplicated decls for checking) and the pointer bounds code. > expand_target_clones, I'm confident the answer is the clones have > different assembler names. In fact, the assembler names are munged Yes, here you have different names for different variants of the function body. Basically this pass takes ctarget attribute and creates bunch of verisons of the functions and assigns them the proper target attributes, right? One thing I am confused about is why this does not happen early? What happens to inlines of functions with specific taret requirements? I.e. if I compile with AVX enabled have function body with AVX code but then, at late compilation, force a clone with -mno-avx? I would expect cloning to happen early, perhaps even before early optimizations... Switching random target flags mid optimization queue seems dangerous. As for the patch itself: + if (node->definition) + { + if (!node->has_gimple_body_p ()) + return false; + node->get_body (); + + /* Replacing initial function with clone only for 1st ctarget. */ + new_node = node->create_version_clone_with_body (vNULL, NULL, + NULL, false, + NULL, NULL, + "target_clone"); + new_node->externally_visible = node->externally_visible; + new_node->address_taken = node->address_taken; + new_node->thunk = node->thunk; + new_node->alias = node->alias; + new_node->weakref = node->weakref; + new_node->cpp_implicit_alias = node->cpp_implicit_alias; + new_node->local.local = node->local.local; + TREE_PUBLIC (new_node->decl) = TREE_PUBLIC (node->decl); + } Since you test it has gimple body, then you don't need to worry about alias/thunk/weakrefs/implicit_alias properties. Those will be never set. How does the dispatcher look like? Can the function be considered local in a sense that one can change calling conventions of one clone but not another? On the other hand, node->create_version_clone_with_body creates a function that is local, if we want to create globally visible clones, I think we should add a parameter there and avoid the localization followed by unlocalization. (I know we already have too many parameters here, perhaps we could add a flags parameter that will also handle the existing skip_return flag.) For exmaple, I think you want to have all functions with same WEAK attributes or in the same COMDAT group. Also when you are copying a function, you probably want to copy the associated thunks and version them, too? + id = get_identifier (new_asm_name); + symtab->change_decl_assembler_name (new_node->decl, id); here I think you can just pass new_asm_name as clone_name. I.e. replace the current use of "target_clone" string. + targetm.target_option.valid_attribute_p (new_node->decl, NULL, + TREE_VALUE (attributes), 0); looks like return value should not be ignored. If attribute is not valid, we need to error and do something sane. Honza
Re: [PATCH] New attribute to create target clones
On 09/24/2015 04:28 PM, Evgeny Stupachenko wrote: I've fixed ICE and review issues. x86 make check and bootstrap passed. Thanks, Evgeny ChangeLog 2015-09-25 Evgeny Stupachenko gcc/ * Makefile.in (OBJS): Add multiple_target.o. * multiple_target.c (make_attribute): New. (create_dispatcher_calls): Ditto. (expand_target_clones): Ditto. (ipa_target_clone): Ditto. * passes.def (pass_target_clone): New ipa pass. * tree-pass.h (make_pass_target_clone): Ditto. gcc/c-family * c-common.c (handle_target_clones_attribute): New. * (c_common_attribute_table): Add handle_target_clones_attribute. * (handle_always_inline_attribute): Add check on target_clones attribute. * (handle_target_attribute): Ditto. gcc/testsuite * gcc.dg/mvc1.c: New test for multiple targets cloning. * gcc.dg/mvc2.c: Ditto. * gcc.dg/mvc3.c: Ditto. * gcc.dg/mvc4.c: Ditto. * gcc.dg/mvc5.c: Ditto. * gcc.dg/mvc6.c: Ditto. * gcc.dg/mvc7.c: Ditto. * g++.dg/ext/mvc1.C: Ditto. * g++.dg/ext/mvc2.C: Ditto. * g++.dg/ext/mvc3.C: Ditto. gcc/doc * doc/extend.texi (target_clones): New attribute description. target_clones.patch Sorry this has taken so long to come back to... As I mentioned a couple months ago, I'd hoped Jan would chime in on the IPA/symtab requirements. But that didn't happen. SO I went back and reviewed the discussion between Jan, Ilya & myself WRT some of the rules around aliases, clones, etc. I think the key question for this patch is whether or not the clones have the same assembler name or not. From looking at expand_target_clones, I'm confident the answer is the clones have different assembler names. In fact, the assembler names are munged with the options used for that specific clone of the original function. +/* Makes a function attribute of the form NAME(ARG_NAME) and chains + it to CHAIN. */ + +static tree +make_attribute (const char *name, const char *arg_name, tree chain) +{ + tree attr_name; + tree attr_arg_name; + tree attr_args; + tree attr; + + attr_name = get_identifier (name); + attr_arg_name = build_string (strlen (arg_name), arg_name); + attr_args = tree_cons (NULL_TREE, attr_arg_name, NULL_TREE); + attr = tree_cons (attr_name, attr_args, chain); + return attr; +} This seems generic enough that I'd prefer it in attribs.c. I was rather surprised when I looked and didn't find an existing routine to do this. + +/* If the call in NODE has multiple target attribute with multiple fields, + replace it with dispatcher call and create dispatcher (once). */ + +static void +create_dispatcher_calls (struct cgraph_node *node) +{ + cgraph_edge *e; + cgraph_edge *e_next; + for (e = node->callers; e ;e = (e == NULL) ? e_next : e->next_caller) That's a rather strange way to write the loop increment. If I follow the loop logic correctly, it seems that we always end up using e->next_caller, it's just obscured. For the test if we're calling a versioned function, we just "continue". e will be non-null and thus we use e->next_caller to set e for the next iteration. If the test for calling a versioned function falls, we set e_next to e->next_caller, then later set e to NULL. That results in using e_next to set e for the next iteration. But e_next was initialized to e->next_caller. So why not just write the loop increment as e = e->next-caller? +{ + tree resolver_decl; + tree idecl; + tree decl; + gimple *call = e->call_stmt; + struct cgraph_node *inode; + + /* Checking if call of function is call of versioned function. +Versioned function are not inlined, so there is no need to +check for inline. */ This comment doesn't parse well. Perhaps: /* Check if this is a call to a versioned function. Verisoned fucntions are not inlined, so there is no need to check for that. */ + +/* If the function in NODE has multiple target attribute with multiple fields, + create the appropriate clone for each field. */ + +static bool +expand_target_clones (struct cgraph_node *node) So this is probably getting a little large. Can we look to refactor it a little? It's not a huge deal and there's certainly code in GCC that is far worse, but it just feels like there's enough going on in this code that there ought to be 2-3 helpers for the larger chunks of work going on inside this code. The first is the attribute parsing. The second is creation of the nodes, and the final would be munging the name and attributes on the clones. I'm slightly concerned with using the pretty printer to build up the new name. Is there precedent for this anywhere else in GCC? When creating the munged name, don't you also have to make sure that other symbols that aren't supported for names don't sneak through? I see that you replace =
Re: [PATCH] New attribute to create target clones
On 09/22/2015 03:09 PM, Bernd Schmidt wrote: On 09/22/2015 09:41 PM, Jeff Law wrote: Essentially it allows us to more easily support per-microarchitecture-optimized versions of functions. You list just have to list the microarchitectures and the compiler handles the rest. Very simple, very easy. I'd think it'd be particularly helpful for vectorization. You could emulate this with compiling the same source multiple times with different flags/defines and wire up on ifunc by hand. But Evgeny's approach is vastly simpler. As far as I can tell the ifunc is generated automatically (and the functionality is documented as such), so the new target_clone doesn't buy much. But one thing I didn't was that the existing support is only available in C++, while Evgeny's patch works for C. That is probably an argument that could be made for its inclusion. In multi-versioning, you have a distinct source implementation for each function. ie __attribute__ ((target ("default"))) int foo () { // The default version of foo. return 0; } __attribute__ ((target ("sse4.2"))) int foo () { // foo version for SSE4.2 return 1; } And so-on for each processor version you want to support. In Evgeny's patch we'd have a single source implementation of foo which gets compiled multiple times, once for each target specified by the attribute. I wasn't aware that multi-versioning was only implemented for C++, that seems fairly lame. I hope I didn't approve that :-) Jeff
Re: [PATCH] New attribute to create target clones
PING. On Fri, Sep 25, 2015 at 1:28 AM, Evgeny Stupachenko wrote: > I've fixed ICE and review issues. > x86 make check and bootstrap passed. > > Thanks, > Evgeny > > ChangeLog > > 2015-09-25 Evgeny Stupachenko > > gcc/ > * Makefile.in (OBJS): Add multiple_target.o. > * multiple_target.c (make_attribute): New. > (create_dispatcher_calls): Ditto. > (expand_target_clones): Ditto. > (ipa_target_clone): Ditto. > * passes.def (pass_target_clone): New ipa pass. > * tree-pass.h (make_pass_target_clone): Ditto. > > gcc/c-family > * c-common.c (handle_target_clones_attribute): New. > * (c_common_attribute_table): Add handle_target_clones_attribute. > * (handle_always_inline_attribute): Add check on target_clones > attribute. > * (handle_target_attribute): Ditto. > > gcc/testsuite > * gcc.dg/mvc1.c: New test for multiple targets cloning. > * gcc.dg/mvc2.c: Ditto. > * gcc.dg/mvc3.c: Ditto. > * gcc.dg/mvc4.c: Ditto. > * gcc.dg/mvc5.c: Ditto. > * gcc.dg/mvc6.c: Ditto. > * gcc.dg/mvc7.c: Ditto. > * g++.dg/ext/mvc1.C: Ditto. > * g++.dg/ext/mvc2.C: Ditto. > * g++.dg/ext/mvc3.C: Ditto. > > gcc/doc > * doc/extend.texi (target_clones): New attribute description. > > > On Wed, Sep 23, 2015 at 1:49 AM, Evgeny Stupachenko > wrote: >> Thank you for the review. >> The patch still works with gcc 5, but the fail reproduced on trunk >> (looks like it appeared while patch was at review). I'll debug it and >> fix. >> As a workaround to test the feature... >> Removing >> "gimple_call_set_fndecl (call, idecl);" from multiple_target.c >> should resolve the ICE >> >> I'll fix the patch for trunk and send an update. >> >> Thanks, >> Evgeny >> >> >> On Wed, Sep 23, 2015 at 12:09 AM, Bernd Schmidt wrote: >>> On 09/22/2015 09:41 PM, Jeff Law wrote: Essentially it allows us to more easily support per-microarchitecture-optimized versions of functions. You list just have to list the microarchitectures and the compiler handles the rest. Very simple, very easy. I'd think it'd be particularly helpful for vectorization. You could emulate this with compiling the same source multiple times with different flags/defines and wire up on ifunc by hand. But Evgeny's approach is vastly simpler. >>> >>> >>> As far as I can tell the ifunc is generated automatically (and the >>> functionality is documented as such), so the new target_clone doesn't buy >>> much. But one thing I didn't was that the existing support is only available >>> in C++, while Evgeny's patch works for C. That is probably an argument that >>> could be made for its inclusion. >>> >>> Or at least, it's supposed to work. As I said, I get verify_ssa failures on >>> the included testcases, and for a simpler one I just tried I get the clones >>> of the function, but not the resolver that ought to be generated. >>> >>> >>> Bernd
Re: [PATCH] New attribute to create target clones
I've fixed ICE and review issues. x86 make check and bootstrap passed. Thanks, Evgeny ChangeLog 2015-09-25 Evgeny Stupachenko gcc/ * Makefile.in (OBJS): Add multiple_target.o. * multiple_target.c (make_attribute): New. (create_dispatcher_calls): Ditto. (expand_target_clones): Ditto. (ipa_target_clone): Ditto. * passes.def (pass_target_clone): New ipa pass. * tree-pass.h (make_pass_target_clone): Ditto. gcc/c-family * c-common.c (handle_target_clones_attribute): New. * (c_common_attribute_table): Add handle_target_clones_attribute. * (handle_always_inline_attribute): Add check on target_clones attribute. * (handle_target_attribute): Ditto. gcc/testsuite * gcc.dg/mvc1.c: New test for multiple targets cloning. * gcc.dg/mvc2.c: Ditto. * gcc.dg/mvc3.c: Ditto. * gcc.dg/mvc4.c: Ditto. * gcc.dg/mvc5.c: Ditto. * gcc.dg/mvc6.c: Ditto. * gcc.dg/mvc7.c: Ditto. * g++.dg/ext/mvc1.C: Ditto. * g++.dg/ext/mvc2.C: Ditto. * g++.dg/ext/mvc3.C: Ditto. gcc/doc * doc/extend.texi (target_clones): New attribute description. On Wed, Sep 23, 2015 at 1:49 AM, Evgeny Stupachenko wrote: > Thank you for the review. > The patch still works with gcc 5, but the fail reproduced on trunk > (looks like it appeared while patch was at review). I'll debug it and > fix. > As a workaround to test the feature... > Removing > "gimple_call_set_fndecl (call, idecl);" from multiple_target.c > should resolve the ICE > > I'll fix the patch for trunk and send an update. > > Thanks, > Evgeny > > > On Wed, Sep 23, 2015 at 12:09 AM, Bernd Schmidt wrote: >> On 09/22/2015 09:41 PM, Jeff Law wrote: >>> >>> Essentially it allows us to more easily support >>> per-microarchitecture-optimized versions of functions. You list just >>> have to list the microarchitectures and the compiler handles the rest. >>> Very simple, very easy. I'd think it'd be particularly helpful for >>> vectorization. >>> >>> You could emulate this with compiling the same source multiple times >>> with different flags/defines and wire up on ifunc by hand. But Evgeny's >>> approach is vastly simpler. >> >> >> As far as I can tell the ifunc is generated automatically (and the >> functionality is documented as such), so the new target_clone doesn't buy >> much. But one thing I didn't was that the existing support is only available >> in C++, while Evgeny's patch works for C. That is probably an argument that >> could be made for its inclusion. >> >> Or at least, it's supposed to work. As I said, I get verify_ssa failures on >> the included testcases, and for a simpler one I just tried I get the clones >> of the function, but not the resolver that ought to be generated. >> >> >> Bernd target_clones.patch Description: Binary data
Re: [PATCH] New attribute to create target clones
Thank you for the review. The patch still works with gcc 5, but the fail reproduced on trunk (looks like it appeared while patch was at review). I'll debug it and fix. As a workaround to test the feature... Removing "gimple_call_set_fndecl (call, idecl);" from multiple_target.c should resolve the ICE I'll fix the patch for trunk and send an update. Thanks, Evgeny On Wed, Sep 23, 2015 at 12:09 AM, Bernd Schmidt wrote: > On 09/22/2015 09:41 PM, Jeff Law wrote: >> >> Essentially it allows us to more easily support >> per-microarchitecture-optimized versions of functions. You list just >> have to list the microarchitectures and the compiler handles the rest. >> Very simple, very easy. I'd think it'd be particularly helpful for >> vectorization. >> >> You could emulate this with compiling the same source multiple times >> with different flags/defines and wire up on ifunc by hand. But Evgeny's >> approach is vastly simpler. > > > As far as I can tell the ifunc is generated automatically (and the > functionality is documented as such), so the new target_clone doesn't buy > much. But one thing I didn't was that the existing support is only available > in C++, while Evgeny's patch works for C. That is probably an argument that > could be made for its inclusion. > > Or at least, it's supposed to work. As I said, I get verify_ssa failures on > the included testcases, and for a simpler one I just tried I get the clones > of the function, but not the resolver that ought to be generated. > > > Bernd
Re: [PATCH] New attribute to create target clones
On 09/22/2015 09:41 PM, Jeff Law wrote: Essentially it allows us to more easily support per-microarchitecture-optimized versions of functions. You list just have to list the microarchitectures and the compiler handles the rest. Very simple, very easy. I'd think it'd be particularly helpful for vectorization. You could emulate this with compiling the same source multiple times with different flags/defines and wire up on ifunc by hand. But Evgeny's approach is vastly simpler. As far as I can tell the ifunc is generated automatically (and the functionality is documented as such), so the new target_clone doesn't buy much. But one thing I didn't was that the existing support is only available in C++, while Evgeny's patch works for C. That is probably an argument that could be made for its inclusion. Or at least, it's supposed to work. As I said, I get verify_ssa failures on the included testcases, and for a simpler one I just tried I get the clones of the function, but not the resolver that ought to be generated. Bernd
Re: [PATCH] New attribute to create target clones
On 09/21/2015 07:25 AM, Bernd Schmidt wrote: On 08/27/2015 01:18 PM, Evgeny Stupachenko wrote: Based on RFC: https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01322.html The patch implement an extension to Function Multiversioning that allows to clone a function for multiple targets. __attribute__((target_clones("avx","arch=slm","default"))) int foo () ... Will create 3 clones of foo(). One optimized with -mavx, one optimized with -march=slm, and one with default optimizations. And will create ifunc resolver that calls appropriate clone (same as in Function Multiversioning). The general question is - do we want this, given that it seems to introduce no functionality that can't be had with the existing multiversioning? You could always compile the same source file to multiple objects with different defines for the target optimization, or include a file containing the multiversioned function multiple times with changing #defines. Essentially it allows us to more easily support per-microarchitecture-optimized versions of functions. You list just have to list the microarchitectures and the compiler handles the rest. Very simple, very easy. I'd think it'd be particularly helpful for vectorization. You could emulate this with compiling the same source multiple times with different flags/defines and wire up on ifunc by hand. But Evgeny's approach is vastly simpler. We are still waiting on Jan to chime in as to whether or not any of this breaks the rules WRT nodes in the symbol table. Given I mucked up the MPX code by not knowing those rules, I'm hesitant to approve this patch without input from Jan. Jeff
Re: [PATCH] New attribute to create target clones
On 08/27/2015 01:18 PM, Evgeny Stupachenko wrote: Based on RFC: https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01322.html The patch implement an extension to Function Multiversioning that allows to clone a function for multiple targets. __attribute__((target_clones("avx","arch=slm","default"))) int foo () ... Will create 3 clones of foo(). One optimized with -mavx, one optimized with -march=slm, and one with default optimizations. And will create ifunc resolver that calls appropriate clone (same as in Function Multiversioning). The general question is - do we want this, given that it seems to introduce no functionality that can't be had with the existing multiversioning? You could always compile the same source file to multiple objects with different defines for the target optimization, or include a file containing the multiversioned function multiple times with changing #defines. Some comments on the patch itself: > diff --git a/gcc/testsuite/gcc.dg/mvc1.c b/gcc/testsuite/gcc.dg/mvc1.c > new file mode 100644 > index 000..ab1e0f5 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/mvc1.c > @@ -0,0 +1,27 @@ > +/* { dg-do run { target i?86-*-* x86_64-*-* } } */ > +/* { dg-options "-O2" } */ Target-specific tests should go into gcc.target/i386. + else if (lookup_attribute ("target_clones", DECL_ATTRIBUTES (*node))) + { + warning (OPT_Wattributes, + "%qE attribute ignored as it conflict with target_clones", + name); + *no_add_attrs = true; + } Bad grammar ("conflicts"), and apparently you're supposed to use %qs even for known attribute names. Look at the code immediately before this and copy it. Similar issues throughout. + for (e = node->callers; e ;e = (e == NULL) ? e_next : e->next_caller) It looks like this could be simplified if you immediately assign e_next = e->next_caller and then always use e_next. Trying to test it, I get (compiling with -O2 on x86_64-linux): mvc1.c: In function ‘main’: mvc1.c:18:1: error: virtual definition of statement not up-to-date main () ^ _2 = foo.ifunc (); mvc1.c:18:1: internal compiler error: verify_ssa failed Bernd
Re: [PATCH] New attribute to create target clones
PING 2. On Tue, Sep 8, 2015 at 2:27 PM, Evgeny Stupachenko wrote: > Ping. > > On Thu, Aug 27, 2015 at 2:18 PM, Evgeny Stupachenko > wrote: >> Hi All, >> >> Based on RFC: >> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01322.html >> >> The patch implement an extension to Function Multiversioning that >> allows to clone a function for multiple targets. >> __attribute__((target_clones("avx","arch=slm","default"))) >> int foo () >> ... >> >> Will create 3 clones of foo(). One optimized with -mavx, one optimized >> with -march=slm, and one with default optimizations. >> And will create ifunc resolver that calls appropriate clone (same as >> in Function Multiversioning). >> >> Bootstrap and make check for x86 passed. >> >> Is it ok for trunk? >> >> 2015-08-27 Evgeny Stupachenko >> >> gcc/ >> * Makefile.in (OBJS): Add multiple_target.o. >> * multiple_target.c (make_attribute): New. >> (create_dispatcher_calls): Ditto. >> (expand_target_clones): Ditto. >> (ipa_target_clone): Ditto. >> * passes.def (pass_target_clone): New ipa pass. >> * tree-pass.h (make_pass_target_clone): Ditto. >> >> gcc/c-family >> * c-common.c (handle_target_clones_attribute): New. >> * (c_common_attribute_table): Add handle_target_clones_attribute. >> * (handle_always_inline_attribute): Add check on target_clones >> attribute. >> * (handle_target_attribute): Ditto. >> >> gcc/testsuite >> * gcc.dg/mvc1.c: New test for multiple targets cloning. >> * gcc.dg/mvc2.c: Ditto. >> * gcc.dg/mvc3.c: Ditto. >> * gcc.dg/mvc4.c: Ditto. >> * gcc.dg/mvc5.c: Ditto. >> * gcc.dg/mvc6.c: Ditto. >> * gcc.dg/mvc7.c: Ditto. >> * g++.dg/ext/mvc1.C: Ditto. >> * g++.dg/ext/mvc2.C: Ditto. >> * g++.dg/ext/mvc3.C: Ditto. >> >> gcc/doc >> * doc/extend.texi (target_clones): New attribute description.
Re: [PATCH] New attribute to create target clones
Ping. On Thu, Aug 27, 2015 at 2:18 PM, Evgeny Stupachenko wrote: > Hi All, > > Based on RFC: > https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01322.html > > The patch implement an extension to Function Multiversioning that > allows to clone a function for multiple targets. > __attribute__((target_clones("avx","arch=slm","default"))) > int foo () > ... > > Will create 3 clones of foo(). One optimized with -mavx, one optimized > with -march=slm, and one with default optimizations. > And will create ifunc resolver that calls appropriate clone (same as > in Function Multiversioning). > > Bootstrap and make check for x86 passed. > > Is it ok for trunk? > > 2015-08-27 Evgeny Stupachenko > > gcc/ > * Makefile.in (OBJS): Add multiple_target.o. > * multiple_target.c (make_attribute): New. > (create_dispatcher_calls): Ditto. > (expand_target_clones): Ditto. > (ipa_target_clone): Ditto. > * passes.def (pass_target_clone): New ipa pass. > * tree-pass.h (make_pass_target_clone): Ditto. > > gcc/c-family > * c-common.c (handle_target_clones_attribute): New. > * (c_common_attribute_table): Add handle_target_clones_attribute. > * (handle_always_inline_attribute): Add check on target_clones > attribute. > * (handle_target_attribute): Ditto. > > gcc/testsuite > * gcc.dg/mvc1.c: New test for multiple targets cloning. > * gcc.dg/mvc2.c: Ditto. > * gcc.dg/mvc3.c: Ditto. > * gcc.dg/mvc4.c: Ditto. > * gcc.dg/mvc5.c: Ditto. > * gcc.dg/mvc6.c: Ditto. > * gcc.dg/mvc7.c: Ditto. > * g++.dg/ext/mvc1.C: Ditto. > * g++.dg/ext/mvc2.C: Ditto. > * g++.dg/ext/mvc3.C: Ditto. > > gcc/doc > * doc/extend.texi (target_clones): New attribute description.
[PATCH] New attribute to create target clones
Hi All, Based on RFC: https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01322.html The patch implement an extension to Function Multiversioning that allows to clone a function for multiple targets. __attribute__((target_clones("avx","arch=slm","default"))) int foo () ... Will create 3 clones of foo(). One optimized with -mavx, one optimized with -march=slm, and one with default optimizations. And will create ifunc resolver that calls appropriate clone (same as in Function Multiversioning). Bootstrap and make check for x86 passed. Is it ok for trunk? 2015-08-27 Evgeny Stupachenko gcc/ * Makefile.in (OBJS): Add multiple_target.o. * multiple_target.c (make_attribute): New. (create_dispatcher_calls): Ditto. (expand_target_clones): Ditto. (ipa_target_clone): Ditto. * passes.def (pass_target_clone): New ipa pass. * tree-pass.h (make_pass_target_clone): Ditto. gcc/c-family * c-common.c (handle_target_clones_attribute): New. * (c_common_attribute_table): Add handle_target_clones_attribute. * (handle_always_inline_attribute): Add check on target_clones attribute. * (handle_target_attribute): Ditto. gcc/testsuite * gcc.dg/mvc1.c: New test for multiple targets cloning. * gcc.dg/mvc2.c: Ditto. * gcc.dg/mvc3.c: Ditto. * gcc.dg/mvc4.c: Ditto. * gcc.dg/mvc5.c: Ditto. * gcc.dg/mvc6.c: Ditto. * gcc.dg/mvc7.c: Ditto. * g++.dg/ext/mvc1.C: Ditto. * g++.dg/ext/mvc2.C: Ditto. * g++.dg/ext/mvc3.C: Ditto. gcc/doc * doc/extend.texi (target_clones): New attribute description. target_clones.patch Description: Binary data
Re: [RFC, patch] New attribute to create target clones
On 08/20/2015 05:38 PM, Evgeny Stupachenko wrote: On Mon, Aug 3, 2015 at 9:43 PM, Jeff Law wrote: On 07/30/2015 04:19 PM, Evgeny Stupachenko wrote: Hi All, The patch enables new attribute 'ctarget', The attribute force compiler to create clones of a function with the attribute. For example: __attribute__((ctarget("avx","arch=slm","arch=core-avx2","default"))) So presumably we're allowing both changing the ISA and the tuning options? In fact, it looks like we're able to change any -m option, right? What about something like -mregparm? I think the docs need to disallow clones with different ABI requirements. -mregparm is not allowed now. The targetm.target_option.valid_attribute_p hook specify which -m option is allowed for architecture. Here patch reuses Function Multiversioning methods. Ah, OK. That clarifies things. I could make an argument for either. Do we have anything to guide us from other compilers such as ICC that may have a similar capability? Not sure. However ICC has similar to Function Multiversioning: __declcpec(cpu_specific(... where "default" is ""generic". I think for "default" we should do the same as Function Multiversioning - keep compiler options. That way users will be able to create target specific minimum by passing corresponding options to command line. That works for me. So I think we really need Jan to chime in here. Jeff
Re: [RFC, patch] New attribute to create target clones
On Mon, Aug 3, 2015 at 9:43 PM, Jeff Law wrote: > On 07/30/2015 04:19 PM, Evgeny Stupachenko wrote: >> >> Hi All, >> >> The patch enables new attribute 'ctarget', >> The attribute force compiler to create clones of a function with the >> attribute. >> >> For example: >> __attribute__((ctarget("avx","arch=slm","arch=core-avx2","default"))) > > So presumably we're allowing both changing the ISA and the tuning options? > In fact, it looks like we're able to change any -m option, right? > > What about something like -mregparm? I think the docs need to disallow > clones with different ABI requirements. -mregparm is not allowed now. The targetm.target_option.valid_attribute_p hook specify which -m option is allowed for architecture. Here patch reuses Function Multiversioning methods. > >> Currently default ctarget means that foo() will be optimized with >> current compiler options and target. >> The other option is to switch target to target specific minimum (like >> target x86-64). Is it better? > > I could make an argument for either. Do we have anything to guide us from > other compilers such as ICC that may have a similar capability? > Not sure. However ICC has similar to Function Multiversioning: __declcpec(cpu_specific(... where "default" is ""generic". I think for "default" we should do the same as Function Multiversioning - keep compiler options. That way users will be able to create target specific minimum by passing corresponding options to command line. > >> >> What do you think about attribute name? 'ctarget' is short but not >> informative. Other variants are 'target_clones', 'targets'... > > target_clones seems good. > > For multiple_target.c: Can you please trim down the #include set. I can't > believe you need all that stuff.If you really need backend stuff (tm.h), > then use "backend.h" instead. > > It generally looks reasonable, but Jan knows the IPA code much better than I > do and I'd like him to chime in. You might also ask Ilya to review the > cloning and rules around clones since he head to deal with a lot of that > stuff in his MPX work. > > jeff
Re: [RFC, patch] New attribute to create target clones
On 07/30/2015 04:19 PM, Evgeny Stupachenko wrote: Hi All, The patch enables new attribute 'ctarget', The attribute force compiler to create clones of a function with the attribute. For example: __attribute__((ctarget("avx","arch=slm","arch=core-avx2","default"))) So presumably we're allowing both changing the ISA and the tuning options? In fact, it looks like we're able to change any -m option, right? What about something like -mregparm? I think the docs need to disallow clones with different ABI requirements. Currently default ctarget means that foo() will be optimized with current compiler options and target. The other option is to switch target to target specific minimum (like target x86-64). Is it better? I could make an argument for either. Do we have anything to guide us from other compilers such as ICC that may have a similar capability? What do you think about attribute name? 'ctarget' is short but not informative. Other variants are 'target_clones', 'targets'... target_clones seems good. For multiple_target.c: Can you please trim down the #include set. I can't believe you need all that stuff.If you really need backend stuff (tm.h), then use "backend.h" instead. It generally looks reasonable, but Jan knows the IPA code much better than I do and I'd like him to chime in. You might also ask Ilya to review the cloning and rules around clones since he head to deal with a lot of that stuff in his MPX work. jeff
Re: [RFC, patch] New attribute to create target clones
On 07/31/2015 05:50 AM, Evgeny wrote: New ctarget attribute automatically clone function and optimize for corresponding target. For target attribute if you specify "avx,sse4" you will get only 1 function optimized for the target with higher priority, while ctarget will generate 2 functions: one optimized for sse4 and one for avx. Ah, thanks for the explanation. That helps a lot. jeff
Re: [RFC, patch] New attribute to create target clones
On 07/30/2015 04:19 PM, Evgeny Stupachenko wrote: Hi All, The patch enables new attribute 'ctarget', The attribute force compiler to create clones of a function with the attribute. For example: __attribute__((ctarget("avx","arch=slm","arch=core-avx2","default"))) int foo () will create 3 clones of foo() optimized for corresponding targets and replace all foo() calls with ifunc resolver. The patch include new tests for the feature. Currently default ctarget means that foo() will be optimized with current compiler options and target. The other option is to switch target to target specific minimum (like target x86-64). Is it better? What do you think about attribute name? 'ctarget' is short but not informative. Other variants are 'target_clones', 'targets'... Below is ChangeLog. Attached patch passed make check on x86. Thanks, Evgeny 2015-07-31 Evgeny Stupachenko gcc/ * Makefile.in (OBJS): Add multiple_target.o. * multiple_target.c (make_attribute): New. (create_dispatcher_calls): Ditto. (expand_target_clones): Ditto. (ipa_target_clone): Ditto. * passes.def (pass_target_clone): New ipa pass. * tree-pass.h (make_pass_target_clone): Ditto. gcc/c-family * c-common.c (handle_ctarget_attribute): New. * (c_common_attribute_table): Add handle_ctarget_attribute. * (handle_always_inline_attribute): Add check on ctarget attribute. * (handle_target_attribute): Ditto. gcc/testsuite * gcc.dg/mvc1.c: New test for multiple targets cloning. * gcc.dg/mvc2.c: Ditto. * gcc.dg/mvc3.c: Ditto. * gcc.dg/mvc4.c: Ditto. * gcc.dg/mvc5.c: Ditto. * gcc.dg/mvc6.c: Ditto. * gcc.dg/mvc7.c: Ditto. * g++.dg/ext/mvc1.C: Ditto. * g++.dg/ext/mvc2.C: Ditto. * g++.dg/ext/mvc3.C: Ditto. I have the same question as Andrew. How is this fundamentally different from the function multi-versioning we currently support? jeff
Re: [RFC, patch] New attribute to create target clones
On Fri, 31 Jul 2015, Evgeny Stupachenko wrote: > The patch enables new attribute 'ctarget', Attributes should be documented in extend.texi. -- Joseph S. Myers jos...@codesourcery.com
Re: [RFC, patch] New attribute to create target clones
You can't apply both 'ctarget' and 'target' attributes to a function. The patch uses same target specific methods to create dispatcher and target clones. On Fri, Jul 31, 2015 at 1:30 AM, Andrew Pinski wrote: > On Thu, Jul 30, 2015 at 3:19 PM, Evgeny Stupachenko > wrote: >> Hi All, >> >> The patch enables new attribute 'ctarget', >> The attribute force compiler to create clones of a function with the >> attribute. >> >> For example: >> __attribute__((ctarget("avx","arch=slm","arch=core-avx2","default"))) >> int >> foo () >> >> will create 3 clones of foo() optimized for corresponding targets and >> replace all foo() calls with ifunc resolver. >> The patch include new tests for the feature. >> >> Currently default ctarget means that foo() will be optimized with >> current compiler options and target. >> The other option is to switch target to target specific minimum (like >> target x86-64). Is it better? >> >> What do you think about attribute name? 'ctarget' is short but not >> informative. Other variants are 'target_clones', 'targets'... > > How does this interacts with Function Multiversioning > (https://gcc.gnu.org/onlinedocs/gcc-5.2.0/gcc/Function-Multiversioning.html)? > > Thanks, > Andrew Pinski > >> >> Below is ChangeLog. Attached patch passed make check on x86. >> >> Thanks, >> Evgeny >> >> 2015-07-31 Evgeny Stupachenko >> >> gcc/ >> * Makefile.in (OBJS): Add multiple_target.o. >> * multiple_target.c (make_attribute): New. >> (create_dispatcher_calls): Ditto. >> (expand_target_clones): Ditto. >> (ipa_target_clone): Ditto. >> * passes.def (pass_target_clone): New ipa pass. >> * tree-pass.h (make_pass_target_clone): Ditto. >> >> gcc/c-family >> * c-common.c (handle_ctarget_attribute): New. >> * (c_common_attribute_table): Add handle_ctarget_attribute. >> * (handle_always_inline_attribute): Add check on ctarget attribute. >> * (handle_target_attribute): Ditto. >> >> gcc/testsuite >> * gcc.dg/mvc1.c: New test for multiple targets cloning. >> * gcc.dg/mvc2.c: Ditto. >> * gcc.dg/mvc3.c: Ditto. >> * gcc.dg/mvc4.c: Ditto. >> * gcc.dg/mvc5.c: Ditto. >> * gcc.dg/mvc6.c: Ditto. >> * gcc.dg/mvc7.c: Ditto. >> * g++.dg/ext/mvc1.C: Ditto. >> * g++.dg/ext/mvc2.C: Ditto. >> * g++.dg/ext/mvc3.C: Ditto.
Re: [RFC, patch] New attribute to create target clones
On Thu, Jul 30, 2015 at 3:19 PM, Evgeny Stupachenko wrote: > Hi All, > > The patch enables new attribute 'ctarget', > The attribute force compiler to create clones of a function with the > attribute. > > For example: > __attribute__((ctarget("avx","arch=slm","arch=core-avx2","default"))) > int > foo () > > will create 3 clones of foo() optimized for corresponding targets and > replace all foo() calls with ifunc resolver. > The patch include new tests for the feature. > > Currently default ctarget means that foo() will be optimized with > current compiler options and target. > The other option is to switch target to target specific minimum (like > target x86-64). Is it better? > > What do you think about attribute name? 'ctarget' is short but not > informative. Other variants are 'target_clones', 'targets'... How does this interacts with Function Multiversioning (https://gcc.gnu.org/onlinedocs/gcc-5.2.0/gcc/Function-Multiversioning.html)? Thanks, Andrew Pinski > > Below is ChangeLog. Attached patch passed make check on x86. > > Thanks, > Evgeny > > 2015-07-31 Evgeny Stupachenko > > gcc/ > * Makefile.in (OBJS): Add multiple_target.o. > * multiple_target.c (make_attribute): New. > (create_dispatcher_calls): Ditto. > (expand_target_clones): Ditto. > (ipa_target_clone): Ditto. > * passes.def (pass_target_clone): New ipa pass. > * tree-pass.h (make_pass_target_clone): Ditto. > > gcc/c-family > * c-common.c (handle_ctarget_attribute): New. > * (c_common_attribute_table): Add handle_ctarget_attribute. > * (handle_always_inline_attribute): Add check on ctarget attribute. > * (handle_target_attribute): Ditto. > > gcc/testsuite > * gcc.dg/mvc1.c: New test for multiple targets cloning. > * gcc.dg/mvc2.c: Ditto. > * gcc.dg/mvc3.c: Ditto. > * gcc.dg/mvc4.c: Ditto. > * gcc.dg/mvc5.c: Ditto. > * gcc.dg/mvc6.c: Ditto. > * gcc.dg/mvc7.c: Ditto. > * g++.dg/ext/mvc1.C: Ditto. > * g++.dg/ext/mvc2.C: Ditto. > * g++.dg/ext/mvc3.C: Ditto.
Re: [RFC, patch] New attribute to create target clones
Forget C++ tests. On Fri, Jul 31, 2015 at 1:19 AM, Evgeny Stupachenko wrote: > Hi All, > > The patch enables new attribute 'ctarget', > The attribute force compiler to create clones of a function with the > attribute. > > For example: > __attribute__((ctarget("avx","arch=slm","arch=core-avx2","default"))) > int > foo () > > will create 3 clones of foo() optimized for corresponding targets and > replace all foo() calls with ifunc resolver. > The patch include new tests for the feature. > > Currently default ctarget means that foo() will be optimized with > current compiler options and target. > The other option is to switch target to target specific minimum (like > target x86-64). Is it better? > > What do you think about attribute name? 'ctarget' is short but not > informative. Other variants are 'target_clones', 'targets'... > > Below is ChangeLog. Attached patch passed make check on x86. > > Thanks, > Evgeny > > 2015-07-31 Evgeny Stupachenko > > gcc/ > * Makefile.in (OBJS): Add multiple_target.o. > * multiple_target.c (make_attribute): New. > (create_dispatcher_calls): Ditto. > (expand_target_clones): Ditto. > (ipa_target_clone): Ditto. > * passes.def (pass_target_clone): New ipa pass. > * tree-pass.h (make_pass_target_clone): Ditto. > > gcc/c-family > * c-common.c (handle_ctarget_attribute): New. > * (c_common_attribute_table): Add handle_ctarget_attribute. > * (handle_always_inline_attribute): Add check on ctarget attribute. > * (handle_target_attribute): Ditto. > > gcc/testsuite > * gcc.dg/mvc1.c: New test for multiple targets cloning. > * gcc.dg/mvc2.c: Ditto. > * gcc.dg/mvc3.c: Ditto. > * gcc.dg/mvc4.c: Ditto. > * gcc.dg/mvc5.c: Ditto. > * gcc.dg/mvc6.c: Ditto. > * gcc.dg/mvc7.c: Ditto. > * g++.dg/ext/mvc1.C: Ditto. > * g++.dg/ext/mvc2.C: Ditto. > * g++.dg/ext/mvc3.C: Ditto. ctarget_attribute_C++tests.patch Description: Binary data
[RFC, patch] New attribute to create target clones
Hi All, The patch enables new attribute 'ctarget', The attribute force compiler to create clones of a function with the attribute. For example: __attribute__((ctarget("avx","arch=slm","arch=core-avx2","default"))) int foo () will create 3 clones of foo() optimized for corresponding targets and replace all foo() calls with ifunc resolver. The patch include new tests for the feature. Currently default ctarget means that foo() will be optimized with current compiler options and target. The other option is to switch target to target specific minimum (like target x86-64). Is it better? What do you think about attribute name? 'ctarget' is short but not informative. Other variants are 'target_clones', 'targets'... Below is ChangeLog. Attached patch passed make check on x86. Thanks, Evgeny 2015-07-31 Evgeny Stupachenko gcc/ * Makefile.in (OBJS): Add multiple_target.o. * multiple_target.c (make_attribute): New. (create_dispatcher_calls): Ditto. (expand_target_clones): Ditto. (ipa_target_clone): Ditto. * passes.def (pass_target_clone): New ipa pass. * tree-pass.h (make_pass_target_clone): Ditto. gcc/c-family * c-common.c (handle_ctarget_attribute): New. * (c_common_attribute_table): Add handle_ctarget_attribute. * (handle_always_inline_attribute): Add check on ctarget attribute. * (handle_target_attribute): Ditto. gcc/testsuite * gcc.dg/mvc1.c: New test for multiple targets cloning. * gcc.dg/mvc2.c: Ditto. * gcc.dg/mvc3.c: Ditto. * gcc.dg/mvc4.c: Ditto. * gcc.dg/mvc5.c: Ditto. * gcc.dg/mvc6.c: Ditto. * gcc.dg/mvc7.c: Ditto. * g++.dg/ext/mvc1.C: Ditto. * g++.dg/ext/mvc2.C: Ditto. * g++.dg/ext/mvc3.C: Ditto. ctarget_attribute.patch Description: Binary data