Re: [PATCH] New attribute to create target clones

2015-11-03 Thread Evgeny Stupachenko
Some tests in the patch already updated (ifunc require condition
added) by Uros commit: ac39b078992c27934ea53cb580dbd79f75b6c727

I'll ask to commit attached patch.
x86 bootstrap and make check passed.

2015-11-03  Evgeny Stupachenko  

gcc/
* multiple_target.c (create_dispatcher_calls): Add target check
on ifunc.
(create_target_clone): Change assembler name for versioned declarations.

gcc/testsuite
* g++.dg/ext/mvc4.C: Add dg-require-ifunc condition.
* gcc.target/i386/mvc5.c: Ditto.
* gcc.target/i386/mvc7.c: Add dg-require-ifunc condition and checks on
resolver.

Thanks,
Evgeny

On Mon, Nov 2, 2015 at 8:02 PM, Jeff Law  wrote:
> On 11/02/2015 07:50 AM, Evgeny Stupachenko wrote:
>>
>> Yes, that is exactly what should fix the tests.
>> Unfortunately I don't have access to darwin machine right now.
>> Can you please test if the patch (attached) fixes the tests?
>>
>> gcc/
>>  * multiple_target.c (create_dispatcher_calls): Add target check
>>  on ifunc.
>>  (create_target_clone): Change assembler name for versioned
>> declarations.
>>
>> gcc/testsuite
>>  * gcc.dg/mvc1.c: Add dg-require-ifunc condition.
>>  * gcc.dg/mvc4.c: Ditto.
>>  * gcc.dg/mvc5.c: Ditto.
>>  * g++.dg/ext/mvc1.C: Ditto.
>>  * g++.dg/ext/mvc4.C: Ditto.
>>  * gcc.dg/mvc7.c: Add dg-require-ifunc condition and checks on
>> resolver.
>>
> OK.
>
> jeff
>


target_clones_tests_fix.patch
Description: Binary data


Re: [PATCH] New attribute to create target clones

2015-11-02 Thread Jeff Law

On 11/02/2015 07:50 AM, Evgeny Stupachenko wrote:

Yes, that is exactly what should fix the tests.
Unfortunately I don't have access to darwin machine right now.
Can you please test if the patch (attached) fixes the tests?

gcc/
 * multiple_target.c (create_dispatcher_calls): Add target check
 on ifunc.
 (create_target_clone): Change assembler name for versioned 
declarations.

gcc/testsuite
 * gcc.dg/mvc1.c: Add dg-require-ifunc condition.
 * gcc.dg/mvc4.c: Ditto.
 * gcc.dg/mvc5.c: Ditto.
 * g++.dg/ext/mvc1.C: Ditto.
 * g++.dg/ext/mvc4.C: Ditto.
 * gcc.dg/mvc7.c: Add dg-require-ifunc condition and checks on resolver.


OK.

jeff



Re: [PATCH] New attribute to create target clones

2015-11-02 Thread Dominique d'Humières
Evgeny,

I have already checked that the addition of

+/* { dg-require-ifunc "" } */

fixes the failures. I’ll test your full patch later today (currently chasing 
regression with gfortran).

Thanks,

Dominique

> Le 2 nov. 2015 à 15:50, Evgeny Stupachenko  a écrit :
> 
> Yes, that is exactly what should fix the tests.
> Unfortunately I don't have access to darwin machine right now.
> Can you please test if the patch (attached) fixes the tests?
> 
> gcc/
>* multiple_target.c (create_dispatcher_calls): Add target check
>on ifunc.
>(create_target_clone): Change assembler name for versioned 
> declarations.
> 
> gcc/testsuite
>* gcc.dg/mvc1.c: Add dg-require-ifunc condition.
>* gcc.dg/mvc4.c: Ditto.
>* gcc.dg/mvc5.c: Ditto.
>* g++.dg/ext/mvc1.C: Ditto.
>* g++.dg/ext/mvc4.C: Ditto.
>* gcc.dg/mvc7.c: Add dg-require-ifunc condition and checks on resolver.
> 
> Thanks,
> Evgeny
> 



Re: [PATCH] New attribute to create target clones

2015-11-02 Thread Evgeny Stupachenko
Yes, that is exactly what should fix the tests.
Unfortunately I don't have access to darwin machine right now.
Can you please test if the patch (attached) fixes the tests?

gcc/
* multiple_target.c (create_dispatcher_calls): Add target check
on ifunc.
(create_target_clone): Change assembler name for versioned declarations.

gcc/testsuite
* gcc.dg/mvc1.c: Add dg-require-ifunc condition.
* gcc.dg/mvc4.c: Ditto.
* gcc.dg/mvc5.c: Ditto.
* g++.dg/ext/mvc1.C: Ditto.
* g++.dg/ext/mvc4.C: Ditto.
* gcc.dg/mvc7.c: Add dg-require-ifunc condition and checks on resolver.

Thanks,
Evgeny

On Sat, Oct 31, 2015 at 1:40 PM, Dominique d'Humières
 wrote:
> Evgeny,
> On darwin I see the following failures
> FAIL: g++.dg/ext/mvc1.C  -std=c++11 (test for excess errors)
> UNRESOLVED: g++.dg/ext/mvc1.C  -std=c++11 compilation failed to produce 
> executable
> FAIL: g++.dg/ext/mvc1.C  -std=c++14 (test for excess errors)
> UNRESOLVED: g++.dg/ext/mvc1.C  -std=c++14 compilation failed to produce 
> executable
> FAIL: g++.dg/ext/mvc1.C  -std=c++98 (test for excess errors)
> UNRESOLVED: g++.dg/ext/mvc1.C  -std=c++98 compilation failed to produce 
> executable
> FAIL: g++.dg/ext/mvc4.C  -std=gnu++11 (internal compiler error)
> FAIL: g++.dg/ext/mvc4.C  -std=gnu++11 (test for excess errors)
> FAIL: g++.dg/ext/mvc4.C  -std=gnu++14 (internal compiler error)
> FAIL: g++.dg/ext/mvc4.C  -std=gnu++14 (test for excess errors)
> FAIL: g++.dg/ext/mvc4.C  -std=gnu++98 (internal compiler error)
> FAIL: g++.dg/ext/mvc4.C  -std=gnu++98 (test for excess errors)
>
> FAIL: gcc.dg/mvc1.c (internal compiler error)
> FAIL: gcc.dg/mvc1.c (test for excess errors)
> UNRESOLVED: gcc.dg/mvc1.c compilation failed to produce executable
> FAIL: gcc.dg/mvc4.c (internal compiler error)
> FAIL: gcc.dg/mvc4.c (test for excess errors)
> UNRESOLVED: gcc.dg/mvc4.c compilation failed to produce executable
> FAIL: gcc.dg/mvc5.c (internal compiler error)
> FAIL: gcc.dg/mvc5.c (test for excess errors)
> FAIL: gcc.dg/mvc5.c scan-assembler-times foo.ifunc 6
> FAIL: gcc.dg/mvc7.c (internal compiler error)
> FAIL: gcc.dg/mvc7.c (test for excess errors)
> FAIL: gcc.dg/mvc7.c scan-assembler-times foo.ifunc 4
>
> The errors are of the kind
>
> /opt/gcc/_clean/gcc/testsuite/gcc.dg/mvc7.C:5:5: error: multiversioning needs 
> ifunc which is not supported on this target
>
> These tests probably require something such as
>
> /* { dg-require-ifunc "" } */
>
> I have no opinion if this should also used in the tests for warnings or 
> errors.
>
> The ICEs are of the kind
>
> /opt/gcc/_clean/gcc/testsuite/gcc.dg/mvc7.C:10:1: internal compiler error: 
> Segmentation fault: 11
>  }
>  ^
>
> TIA
>
> Dominique
>


target_clones_tests_fix.patch
Description: Binary data


Re: [PATCH] New attribute to create target clones

2015-10-31 Thread Dominique d'Humières
Evgeny,
On darwin I see the following failures
FAIL: g++.dg/ext/mvc1.C  -std=c++11 (test for excess errors)
UNRESOLVED: g++.dg/ext/mvc1.C  -std=c++11 compilation failed to produce 
executable
FAIL: g++.dg/ext/mvc1.C  -std=c++14 (test for excess errors)
UNRESOLVED: g++.dg/ext/mvc1.C  -std=c++14 compilation failed to produce 
executable
FAIL: g++.dg/ext/mvc1.C  -std=c++98 (test for excess errors)
UNRESOLVED: g++.dg/ext/mvc1.C  -std=c++98 compilation failed to produce 
executable
FAIL: g++.dg/ext/mvc4.C  -std=gnu++11 (internal compiler error)
FAIL: g++.dg/ext/mvc4.C  -std=gnu++11 (test for excess errors)
FAIL: g++.dg/ext/mvc4.C  -std=gnu++14 (internal compiler error)
FAIL: g++.dg/ext/mvc4.C  -std=gnu++14 (test for excess errors)
FAIL: g++.dg/ext/mvc4.C  -std=gnu++98 (internal compiler error)
FAIL: g++.dg/ext/mvc4.C  -std=gnu++98 (test for excess errors)

FAIL: gcc.dg/mvc1.c (internal compiler error)
FAIL: gcc.dg/mvc1.c (test for excess errors)
UNRESOLVED: gcc.dg/mvc1.c compilation failed to produce executable
FAIL: gcc.dg/mvc4.c (internal compiler error)
FAIL: gcc.dg/mvc4.c (test for excess errors)
UNRESOLVED: gcc.dg/mvc4.c compilation failed to produce executable
FAIL: gcc.dg/mvc5.c (internal compiler error)
FAIL: gcc.dg/mvc5.c (test for excess errors)
FAIL: gcc.dg/mvc5.c scan-assembler-times foo.ifunc 6
FAIL: gcc.dg/mvc7.c (internal compiler error)
FAIL: gcc.dg/mvc7.c (test for excess errors)
FAIL: gcc.dg/mvc7.c scan-assembler-times foo.ifunc 4

The errors are of the kind

/opt/gcc/_clean/gcc/testsuite/gcc.dg/mvc7.C:5:5: error: multiversioning needs 
ifunc which is not supported on this target

These tests probably require something such as

/* { dg-require-ifunc "" } */

I have no opinion if this should also used in the tests for warnings or errors.

The ICEs are of the kind

/opt/gcc/_clean/gcc/testsuite/gcc.dg/mvc7.C:10:1: internal compiler error: 
Segmentation fault: 11
 }
 ^

TIA

Dominique



Re: [PATCH] New attribute to create target clones

2015-10-30 Thread Evgeny Stupachenko
I've fixed the misprint and vertical spaces.
I'll ask to commit the patch when x86 bootstrap and make check finished.

Thanks,
Evgeny

Updated ChangeLog:

2015-10-30  Evgeny Stupachenko  

gcc/
* Makefile.in (OBJS): Add multiple_target.o.
* attrib.c (make_attribute): Moved from config/i386/i386.c
* config/i386/i386.c (make_attribute): Deleted.
* multiple_target.c (create_dispatcher_calls): New.
(get_attr_len): Ditto.
(get_attr_str): Ditto.
(separate_attrs): Ditto.
(is_valid_asm_symbol): Ditto.
(create_new_asm_name): Ditto.
(create_target_clone): Ditto.
(expand_target_clones): Ditto.
(ipa_target_clone): Ditto.
(ipa_dispatcher_calls): Ditto.
* passes.def (pass_target_clone): Two new ipa passes.
* tree-pass.h (make_pass_target_clone): Ditto.

gcc/c-family
* c-common.c (handle_target_clones_attribute): New.
* (c_common_attribute_table): Add handle_target_clones_attribute.
* (handle_always_inline_attribute): Add check on target_clones
attribute.
* (handle_target_attribute): Ditto.

gcc/testsuite
* gcc.dg/mvc1.c: New test for multiple targets cloning.
* gcc.dg/mvc2.c: Ditto.
* gcc.dg/mvc3.c: Ditto.
* gcc.dg/mvc4.c: Ditto.
* gcc.dg/mvc5.c: Ditto.
* gcc.dg/mvc6.c: Ditto.
* gcc.dg/mvc7.c: Ditto.
* g++.dg/ext/mvc1.C: Ditto.
* g++.dg/ext/mvc2.C: Ditto.
* g++.dg/ext/mvc3.C: Ditto.
* g++.dg/ext/mvc4.C: Ditto.

gcc/doc
* doc/extend.texi (target_clones): New attribute description.

On Fri, Oct 30, 2015 at 8:27 AM, Jeff Law  wrote:
> On 10/29/2015 12:13 PM, Evgeny Stupachenko wrote:
>>
>> On Thu, Oct 29, 2015 at 8:02 PM, Jan Hubicka  wrote:

 >>Yes. This is not necessary. However that way we'll have the following
 >>code in dispatcher:
 >> cmpl$6, __cpu_model+4(%rip)
 >> sete%al
 >> movzbl  %al, %eax
 >> testl   %eax, %eax
 >> jle .L16
 >> movl$foo.target_clone.1, %eax
 >>I think it is very hard to read and debug such...
 >>
 >>While now we have:
 >>
 >> cmpl$6, __cpu_model+4(%rip)
 >> sete%al
 >> movzbl  %al, %eax
 >> testl   %eax, %eax
 >> jle .L16
 >> movl$foo.arch_slm, %eax
 >>
 >>and it is clear that we are jumping to SLM code here.
 >>I'd like to keep target in names.
>>>
>>> >
>>> >I am not against more informative names, but why you don't pass the info
>>> > here:
>>> >
>>> >+create_target_clone (cgraph_node *node, bool definition)
>>> >+{
>>> >+  cgraph_node *new_node;
>>> >+  if (definition)
>>> >+{
>>> >+  new_node = node->create_version_clone_with_body (vNULL, NULL,
>>> >+  NULL, false,
>>> >+  NULL, NULL,
>>> >+  "target_clone");
>>> >+  new_node->force_output = true;
>>> >+}
>>> >+  else
>>> >+{
>>> >+  tree new_decl = copy_node (node->decl);
>>> >+  new_node = cgraph_node::get_create (new_decl);
>>> >+}
>>> >+  return new_node;
>>> >+}
>>> >
>>> >passing "arch_slm" instead of target_clone will get you the name you
>>> > want
>>> >(plus the extra index that may be needed anyway to disambiguate).
>>> >
>>> >Note that in general those .suffixes should be machine parseable, so
>>> > cp-demangle.c
>>> >can expand them correctly.  We may want to have some consistent grammar
>>> > for them here
>>> >and update cp-demangle.c to output nice info like "target clone for..."
>>
>> Ok. I've modified the patch correspondingly.
>
> You'll need updated ChangeLog entries.  Don't forget to drop the omp-low
> spurious whitespace change.
>
> You should also fix the formatting nits Jan pointed out.
>
> With those changes, this patch is OK for the trunk.  I'll run the header
> file reordering & cleanup tool after the patch is committed to the trunk.
>
> jeff
>
>


Re: [PATCH] New attribute to create target clones

2015-10-29 Thread Jeff Law

On 10/29/2015 12:13 PM, Evgeny Stupachenko wrote:

On Thu, Oct 29, 2015 at 8:02 PM, Jan Hubicka  wrote:

>>Yes. This is not necessary. However that way we'll have the following
>>code in dispatcher:
>> cmpl$6, __cpu_model+4(%rip)
>> sete%al
>> movzbl  %al, %eax
>> testl   %eax, %eax
>> jle .L16
>> movl$foo.target_clone.1, %eax
>>I think it is very hard to read and debug such...
>>
>>While now we have:
>>
>> cmpl$6, __cpu_model+4(%rip)
>> sete%al
>> movzbl  %al, %eax
>> testl   %eax, %eax
>> jle .L16
>> movl$foo.arch_slm, %eax
>>
>>and it is clear that we are jumping to SLM code here.
>>I'd like to keep target in names.

>
>I am not against more informative names, but why you don't pass the info here:
>
>+create_target_clone (cgraph_node *node, bool definition)
>+{
>+  cgraph_node *new_node;
>+  if (definition)
>+{
>+  new_node = node->create_version_clone_with_body (vNULL, NULL,
>+  NULL, false,
>+  NULL, NULL,
>+  "target_clone");
>+  new_node->force_output = true;
>+}
>+  else
>+{
>+  tree new_decl = copy_node (node->decl);
>+  new_node = cgraph_node::get_create (new_decl);
>+}
>+  return new_node;
>+}
>
>passing "arch_slm" instead of target_clone will get you the name you want
>(plus the extra index that may be needed anyway to disambiguate).
>
>Note that in general those .suffixes should be machine parseable, so 
cp-demangle.c
>can expand them correctly.  We may want to have some consistent grammar for 
them here
>and update cp-demangle.c to output nice info like "target clone for..."

Ok. I've modified the patch correspondingly.
You'll need updated ChangeLog entries.  Don't forget to drop the omp-low 
spurious whitespace change.


You should also fix the formatting nits Jan pointed out.

With those changes, this patch is OK for the trunk.  I'll run the header 
file reordering & cleanup tool after the patch is committed to the trunk.


jeff




Re: [PATCH] New attribute to create target clones

2015-10-29 Thread Jan Hubicka
> Ok. I've modified the patch correspondingly.
Thanks,
the IPA/callgarph parts of the patch looks fine to me except for a typo:
+/* Create sting with attributes separated by comma.
+   Return number of attributes.  */

and a fact that sometimes you skip vertical whitespace after variable 
declarations.
I will leave it Jeff for the final ack.

Honza

> 
> >
> > I will look at the rest of the patch at evening.
> > Honza
> >>
> >> Thanks,
> >> Evgeny
> >> >
> >> >
> >> > Jeff
> >
> >




Re: [PATCH] New attribute to create target clones

2015-10-29 Thread Evgeny Stupachenko
On Thu, Oct 29, 2015 at 8:02 PM, Jan Hubicka  wrote:
>> Yes. This is not necessary. However that way we'll have the following
>> code in dispatcher:
>> cmpl$6, __cpu_model+4(%rip)
>> sete%al
>> movzbl  %al, %eax
>> testl   %eax, %eax
>> jle .L16
>> movl$foo.target_clone.1, %eax
>> I think it is very hard to read and debug such...
>>
>> While now we have:
>>
>> cmpl$6, __cpu_model+4(%rip)
>> sete%al
>> movzbl  %al, %eax
>> testl   %eax, %eax
>> jle .L16
>> movl$foo.arch_slm, %eax
>>
>> and it is clear that we are jumping to SLM code here.
>> I'd like to keep target in names.
>
> I am not against more informative names, but why you don't pass the info here:
>
> +create_target_clone (cgraph_node *node, bool definition)
> +{
> +  cgraph_node *new_node;
> +  if (definition)
> +{
> +  new_node = node->create_version_clone_with_body (vNULL, NULL,
> +  NULL, false,
> +  NULL, NULL,
> +  "target_clone");
> +  new_node->force_output = true;
> +}
> +  else
> +{
> +  tree new_decl = copy_node (node->decl);
> +  new_node = cgraph_node::get_create (new_decl);
> +}
> +  return new_node;
> +}
>
> passing "arch_slm" instead of target_clone will get you the name you want
> (plus the extra index that may be needed anyway to disambiguate).
>
> Note that in general those .suffixes should be machine parseable, so 
> cp-demangle.c
> can expand them correctly.  We may want to have some consistent grammar for 
> them here
> and update cp-demangle.c to output nice info like "target clone for..."
Ok. I've modified the patch correspondingly.

>
> I will look at the rest of the patch at evening.
> Honza
>>
>> Thanks,
>> Evgeny
>> >
>> >
>> > Jeff
>
>


target_clones5.patch
Description: Binary data


Re: [PATCH] New attribute to create target clones

2015-10-29 Thread Jan Hubicka
> Yes. This is not necessary. However that way we'll have the following
> code in dispatcher:
> cmpl$6, __cpu_model+4(%rip)
> sete%al
> movzbl  %al, %eax
> testl   %eax, %eax
> jle .L16
> movl$foo.target_clone.1, %eax
> I think it is very hard to read and debug such...
> 
> While now we have:
> 
> cmpl$6, __cpu_model+4(%rip)
> sete%al
> movzbl  %al, %eax
> testl   %eax, %eax
> jle .L16
> movl$foo.arch_slm, %eax
> 
> and it is clear that we are jumping to SLM code here.
> I'd like to keep target in names.

I am not against more informative names, but why you don't pass the info here:

+create_target_clone (cgraph_node *node, bool definition)
+{
+  cgraph_node *new_node;
+  if (definition)
+{
+  new_node = node->create_version_clone_with_body (vNULL, NULL,
+  NULL, false,
+  NULL, NULL,
+  "target_clone");
+  new_node->force_output = true;
+}
+  else
+{
+  tree new_decl = copy_node (node->decl);
+  new_node = cgraph_node::get_create (new_decl);
+}
+  return new_node;
+}

passing "arch_slm" instead of target_clone will get you the name you want
(plus the extra index that may be needed anyway to disambiguate).

Note that in general those .suffixes should be machine parseable, so 
cp-demangle.c
can expand them correctly.  We may want to have some consistent grammar for 
them here
and update cp-demangle.c to output nice info like "target clone for..."

I will look at the rest of the patch at evening.
Honza
> 
> Thanks,
> Evgeny
> >
> >
> > Jeff




Re: [PATCH] New attribute to create target clones

2015-10-29 Thread Evgeny Stupachenko
On Mon, Oct 26, 2015 at 6:50 PM, Jeff Law  wrote:
> On 10/12/2015 05:35 PM, Evgeny Stupachenko wrote:
>>
>> Hi All,
>>
>> Here is a new version of patch (attached).
>> Bootstrap and make check are in progress (all new tests passed).
>>
>> New test case g++.dg/ext/mvc4.C fails with ICE, when options lower
>> than "-mavx" are passed.
>> However it has the same behavior if "target_clones" attribute is
>> replaced by 2 corresponding "target" attributes.
>> I've filed PR67946 on this:
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67946
>>
>> Thanks,
>> Evgeny
>>
>> ChangeLog:
>>
>> 2015-10-13  Evgeny Stupachenko
>> gcc/
>>  * Makefile.in (OBJS): Add multiple_target.o.
>>  * attrib.c (make_attribute): Moved from config/i386/i386.c
>>  * config/i386/i386.c (make_attribute): Deleted.
>>  * multiple_target.c (make_attribute): New.
>>  (create_dispatcher_calls): Ditto.
>>  (get_attr_len): Ditto.
>>  (get_attr_str): Ditto.
>>  (is_valid_asm_symbol): Ditto.
>>  (create_new_asm_name): Ditto.
>>  (create_target_clone): Ditto.
>>  (expand_target_clones): Ditto.
>>  (ipa_target_clone): Ditto.
>>  (ipa_dispatcher_calls): Ditto.
>>  * passes.def (pass_target_clone): Two new ipa passes.
>>  * tree-pass.h (make_pass_target_clone): Ditto.
>>
>> gcc/c-family
>>  * c-common.c (handle_target_clones_attribute): New.
>>  * (c_common_attribute_table): Add handle_target_clones_attribute.
>>  * (handle_always_inline_attribute): Add check on target_clones
>>  attribute.
>>  * (handle_target_attribute): Ditto.
>>
>> gcc/testsuite
>>  * gcc.dg/mvc1.c: New test for multiple targets cloning.
>>  * gcc.dg/mvc2.c: Ditto.
>>  * gcc.dg/mvc3.c: Ditto.
>>  * gcc.dg/mvc4.c: Ditto.
>>  * gcc.dg/mvc5.c: Ditto.
>>  * gcc.dg/mvc6.c: Ditto.
>>  * gcc.dg/mvc7.c: Ditto.
>>  * g++.dg/ext/mvc1.C: Ditto.
>>  * g++.dg/ext/mvc2.C: Ditto.
>>  * g++.dg/ext/mvc3.C: Ditto.
>>  * g++.dg/ext/mvc4.C: Ditto.
>>
>> gcc/doc
>>  * doc/extend.texi (target_clones): New attribute description.
>>
>
>> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
>> index 23e6a76..f9d28d1 100644
>> --- a/gcc/doc/extend.texi
>> +++ b/gcc/doc/extend.texi
>> @@ -3066,6 +3066,19 @@ This function attribute make a stack protection of
>> the function if
>>   flags @option{fstack-protector} or @option{fstack-protector-strong}
>>   or @option{fstack-protector-explicit} are set.
>>
>> +@item target_clones (@var{options})
>> +@cindex @code{target_clones} function attribute
>> +The @code{target_clones} attribute is used to specify that a function is
>> to
>> +be cloned into multiple versions compiled with different target options
>> +than specified on the command line.  The supported options and
>> restrictions
>> +are the same as for @code{target}.
>
> "as for @code{target}" -> "as for the @code{target} attribute."
>
> I think that makes it a tiny bit clearer.
>
>
>
>
>> +
>> +/*  Creates target clone of NODE.  */
>> +
>> +static cgraph_node *
>> +create_target_clone (cgraph_node *node, bool definition)
>> +{
>> +  cgraph_node *new_node;
>> +  if (definition)
>> +{
>> +  new_node = node->create_version_clone_with_body (vNULL, NULL,
>> +  NULL, false,
>> +  NULL, NULL,
>> +  "target_clone");
>> +  new_node->externally_visible = node->externally_visible;
>> +  new_node->address_taken = node->address_taken;
>> +  new_node->thunk = node->thunk;
>> +  new_node->alias = node->alias;
>> +  new_node->weakref = node->weakref;
>> +  new_node->cpp_implicit_alias = node->cpp_implicit_alias;
>> +  new_node->local.local = node->local.local;
>
> So do we need to explicitly clear TREE_PUBLIC here?  It also seems like
> copying externally_visible, address_taken and possibly some of those other
> fields is wrong.  The clone is going to be local to the CU, it doesn't
> inherit those properties from the original -- only the dispatcher needs to
> inherit those properties, right?
Right. That was just the way to keep the node, that I didn't clean up.
Replaced with "new_node->force_output = true;"

>
>> +
>> +
>> +  for (i = 0; i < attrnum; i++)
>> +{
>> +  char *attr = attrs[i];
>> +  cgraph_node *new_node = create_target_clone (node, defenition);
>> +  char *new_asm_name =
>> + XNEWVEC (char, strlen (old_asm_name) + strlen (attr) + 2);
>> +  create_new_asm_name (old_asm_name, attr, new_asm_name);
>
> I thought after discussions with Jan that this wasn't going to be necessary
> as cloning should create a suitable name for us?
Yes. This is not necessary. However that way we'll have the following
code in dispatcher:
cmpl$6, __cpu_model

Re: [PATCH] New attribute to create target clones

2015-10-26 Thread Jeff Law

On 10/12/2015 05:35 PM, Evgeny Stupachenko wrote:

Hi All,

Here is a new version of patch (attached).
Bootstrap and make check are in progress (all new tests passed).

New test case g++.dg/ext/mvc4.C fails with ICE, when options lower
than "-mavx" are passed.
However it has the same behavior if "target_clones" attribute is
replaced by 2 corresponding "target" attributes.
I've filed PR67946 on this:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67946

Thanks,
Evgeny

ChangeLog:

2015-10-13  Evgeny Stupachenko
gcc/
 * Makefile.in (OBJS): Add multiple_target.o.
 * attrib.c (make_attribute): Moved from config/i386/i386.c
 * config/i386/i386.c (make_attribute): Deleted.
 * multiple_target.c (make_attribute): New.
 (create_dispatcher_calls): Ditto.
 (get_attr_len): Ditto.
 (get_attr_str): Ditto.
 (is_valid_asm_symbol): Ditto.
 (create_new_asm_name): Ditto.
 (create_target_clone): Ditto.
 (expand_target_clones): Ditto.
 (ipa_target_clone): Ditto.
 (ipa_dispatcher_calls): Ditto.
 * passes.def (pass_target_clone): Two new ipa passes.
 * tree-pass.h (make_pass_target_clone): Ditto.

gcc/c-family
 * c-common.c (handle_target_clones_attribute): New.
 * (c_common_attribute_table): Add handle_target_clones_attribute.
 * (handle_always_inline_attribute): Add check on target_clones
 attribute.
 * (handle_target_attribute): Ditto.

gcc/testsuite
 * gcc.dg/mvc1.c: New test for multiple targets cloning.
 * gcc.dg/mvc2.c: Ditto.
 * gcc.dg/mvc3.c: Ditto.
 * gcc.dg/mvc4.c: Ditto.
 * gcc.dg/mvc5.c: Ditto.
 * gcc.dg/mvc6.c: Ditto.
 * gcc.dg/mvc7.c: Ditto.
 * g++.dg/ext/mvc1.C: Ditto.
 * g++.dg/ext/mvc2.C: Ditto.
 * g++.dg/ext/mvc3.C: Ditto.
 * g++.dg/ext/mvc4.C: Ditto.

gcc/doc
 * doc/extend.texi (target_clones): New attribute description.




diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 23e6a76..f9d28d1 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -3066,6 +3066,19 @@ This function attribute make a stack protection of the 
function if
  flags @option{fstack-protector} or @option{fstack-protector-strong}
  or @option{fstack-protector-explicit} are set.

+@item target_clones (@var{options})
+@cindex @code{target_clones} function attribute
+The @code{target_clones} attribute is used to specify that a function is to
+be cloned into multiple versions compiled with different target options
+than specified on the command line.  The supported options and restrictions
+are the same as for @code{target}.

"as for @code{target}" -> "as for the @code{target} attribute."

I think that makes it a tiny bit clearer.





+
+/*  Creates target clone of NODE.  */
+
+static cgraph_node *
+create_target_clone (cgraph_node *node, bool definition)
+{
+  cgraph_node *new_node;
+  if (definition)
+{
+  new_node = node->create_version_clone_with_body (vNULL, NULL,
+  NULL, false,
+  NULL, NULL,
+  "target_clone");
+  new_node->externally_visible = node->externally_visible;
+  new_node->address_taken = node->address_taken;
+  new_node->thunk = node->thunk;
+  new_node->alias = node->alias;
+  new_node->weakref = node->weakref;
+  new_node->cpp_implicit_alias = node->cpp_implicit_alias;
+  new_node->local.local = node->local.local;
So do we need to explicitly clear TREE_PUBLIC here?  It also seems like 
copying externally_visible, address_taken and possibly some of those 
other fields is wrong.  The clone is going to be local to the CU, it 
doesn't inherit those properties from the original -- only the 
dispatcher needs to inherit those properties, right?



+
+
+  for (i = 0; i < attrnum; i++)
+{
+  char *attr = attrs[i];
+  cgraph_node *new_node = create_target_clone (node, defenition);
+  char *new_asm_name =
+ XNEWVEC (char, strlen (old_asm_name) + strlen (attr) + 2);
+  create_new_asm_name (old_asm_name, attr, new_asm_name);
I thought after discussions with Jan that this wasn't going to be 
necessary as cloning should create a suitable name for us?



Jeff


Re: [PATCH] New attribute to create target clones

2015-10-22 Thread Evgeny Stupachenko
PING.

On Thu, Oct 15, 2015 at 12:32 AM, Evgeny Stupachenko  wrote:
> Bootstrap and make check for x86 passed. No new fails.
> Please ignore an empty line added to omp-low.c in the patch, the
> misprint will be removed prior to a commit.
>
> Thanks,
> Evgeny
>
> On Tue, Oct 13, 2015 at 2:35 AM, Evgeny Stupachenko  
> wrote:
>> Hi All,
>>
>> Here is a new version of patch (attached).
>> Bootstrap and make check are in progress (all new tests passed).
>>
>> New test case g++.dg/ext/mvc4.C fails with ICE, when options lower
>> than "-mavx" are passed.
>> However it has the same behavior if "target_clones" attribute is
>> replaced by 2 corresponding "target" attributes.
>> I've filed PR67946 on this:
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67946
>>
>> Thanks,
>> Evgeny
>>
>> ChangeLog:
>>
>> 2015-10-13  Evgeny Stupachenko  
>> gcc/
>> * Makefile.in (OBJS): Add multiple_target.o.
>> * attrib.c (make_attribute): Moved from config/i386/i386.c
>> * config/i386/i386.c (make_attribute): Deleted.
>> * multiple_target.c (make_attribute): New.
>> (create_dispatcher_calls): Ditto.
>> (get_attr_len): Ditto.
>> (get_attr_str): Ditto.
>> (is_valid_asm_symbol): Ditto.
>> (create_new_asm_name): Ditto.
>> (create_target_clone): Ditto.
>> (expand_target_clones): Ditto.
>> (ipa_target_clone): Ditto.
>> (ipa_dispatcher_calls): Ditto.
>> * passes.def (pass_target_clone): Two new ipa passes.
>> * tree-pass.h (make_pass_target_clone): Ditto.
>>
>> gcc/c-family
>> * c-common.c (handle_target_clones_attribute): New.
>> * (c_common_attribute_table): Add handle_target_clones_attribute.
>> * (handle_always_inline_attribute): Add check on target_clones
>> attribute.
>> * (handle_target_attribute): Ditto.
>>
>> gcc/testsuite
>> * gcc.dg/mvc1.c: New test for multiple targets cloning.
>> * gcc.dg/mvc2.c: Ditto.
>> * gcc.dg/mvc3.c: Ditto.
>> * gcc.dg/mvc4.c: Ditto.
>> * gcc.dg/mvc5.c: Ditto.
>> * gcc.dg/mvc6.c: Ditto.
>> * gcc.dg/mvc7.c: Ditto.
>> * g++.dg/ext/mvc1.C: Ditto.
>> * g++.dg/ext/mvc2.C: Ditto.
>> * g++.dg/ext/mvc3.C: Ditto.
>> * g++.dg/ext/mvc4.C: Ditto.
>>
>> gcc/doc
>> * doc/extend.texi (target_clones): New attribute description.
>>
>> On Sat, Oct 10, 2015 at 12:44 AM, Evgeny Stupachenko  
>> wrote:
>>> On Fri, Oct 9, 2015 at 11:04 PM, Jan Hubicka  wrote:
> On Fri, Oct 9, 2015 at 9:27 PM, Jan Hubicka  wrote:
> >> >Of course it also depends what you inline into function. You can have
> >> >
> >> >bar() target(-mavx) {fancy avx code}
> >> >foobar() { .. if (avx) bar();}
> >> >foo() ctarget(-mavx,-mno-avx) {foobar();}
>
> "no-" targets are not supported

 Why not? I suppose I can use -march=x86_64 in a file compiled with 
 -march=core-avx2 or something like that, too.
>>> Sure, you can. target(arch=x86-64) is ok. I mean exactly target(no-avx) 
>>> returns:
>>>
>>> aaa.cpp: In function '':
>>> aaa.cpp:7:5: error: No dispatcher found for no-avx
>>>  int bar()
>>>  ^
>>>
>
> >> >
> >> >Now if you compile with -mavx and because ctarget takes effect only 
> >> >after inlining,
> >> >at inlining time the target attributes will match and we can edn up 
> >> >inline bar->foobar->foo.
> >> >After that we multiversion foo and drop AVX flag we will likely get 
> >> >ICE at expansion
> >> >time.
> >> But isn't that avoided by fixing up the call graph so that all calls
> >> to the affected function are going through the dispatcher?  Or is
> >> that happening too late?
> >
> > There is dispatcher only for foo that is the root of the callgarph tree.
> > When inlining we compare target attributes for match (in 
> > can_inline_edge_p).
> > We do not compare ctarget attributes.  Expanding ctarget to target 
> > early would
> > avoid need for ctarget handling.
> Currently inlining is disabled for functions with target_clone attribute:

 Do you also disable inlining into functions with target_clone?
 What I am concerned about is early inliner inlining (say) AVX code into 
 ctarget
 function because at early inlining time the target is not applied, yet.
>>> Right. Now I've got your point and ICE on the test.
>>> Yes the solution is to disable inline into target_clones function.
>>> Or to move the pass creating clones before inline (as you suggested)
>>> and leave dispatcher creator after inline.
>>>
>>> I like you suggestion. It fixes the ICE.
>>> I'll fix the patch and retest.
>>>
>>> Thank you for the review,
>>> Evgeny.
>>>
>>>

 Honza


Re: [PATCH] New attribute to create target clones

2015-10-14 Thread Evgeny Stupachenko
Bootstrap and make check for x86 passed. No new fails.
Please ignore an empty line added to omp-low.c in the patch, the
misprint will be removed prior to a commit.

Thanks,
Evgeny

On Tue, Oct 13, 2015 at 2:35 AM, Evgeny Stupachenko  wrote:
> Hi All,
>
> Here is a new version of patch (attached).
> Bootstrap and make check are in progress (all new tests passed).
>
> New test case g++.dg/ext/mvc4.C fails with ICE, when options lower
> than "-mavx" are passed.
> However it has the same behavior if "target_clones" attribute is
> replaced by 2 corresponding "target" attributes.
> I've filed PR67946 on this:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67946
>
> Thanks,
> Evgeny
>
> ChangeLog:
>
> 2015-10-13  Evgeny Stupachenko  
> gcc/
> * Makefile.in (OBJS): Add multiple_target.o.
> * attrib.c (make_attribute): Moved from config/i386/i386.c
> * config/i386/i386.c (make_attribute): Deleted.
> * multiple_target.c (make_attribute): New.
> (create_dispatcher_calls): Ditto.
> (get_attr_len): Ditto.
> (get_attr_str): Ditto.
> (is_valid_asm_symbol): Ditto.
> (create_new_asm_name): Ditto.
> (create_target_clone): Ditto.
> (expand_target_clones): Ditto.
> (ipa_target_clone): Ditto.
> (ipa_dispatcher_calls): Ditto.
> * passes.def (pass_target_clone): Two new ipa passes.
> * tree-pass.h (make_pass_target_clone): Ditto.
>
> gcc/c-family
> * c-common.c (handle_target_clones_attribute): New.
> * (c_common_attribute_table): Add handle_target_clones_attribute.
> * (handle_always_inline_attribute): Add check on target_clones
> attribute.
> * (handle_target_attribute): Ditto.
>
> gcc/testsuite
> * gcc.dg/mvc1.c: New test for multiple targets cloning.
> * gcc.dg/mvc2.c: Ditto.
> * gcc.dg/mvc3.c: Ditto.
> * gcc.dg/mvc4.c: Ditto.
> * gcc.dg/mvc5.c: Ditto.
> * gcc.dg/mvc6.c: Ditto.
> * gcc.dg/mvc7.c: Ditto.
> * g++.dg/ext/mvc1.C: Ditto.
> * g++.dg/ext/mvc2.C: Ditto.
> * g++.dg/ext/mvc3.C: Ditto.
> * g++.dg/ext/mvc4.C: Ditto.
>
> gcc/doc
> * doc/extend.texi (target_clones): New attribute description.
>
> On Sat, Oct 10, 2015 at 12:44 AM, Evgeny Stupachenko  
> wrote:
>> On Fri, Oct 9, 2015 at 11:04 PM, Jan Hubicka  wrote:
 On Fri, Oct 9, 2015 at 9:27 PM, Jan Hubicka  wrote:
 >> >Of course it also depends what you inline into function. You can have
 >> >
 >> >bar() target(-mavx) {fancy avx code}
 >> >foobar() { .. if (avx) bar();}
 >> >foo() ctarget(-mavx,-mno-avx) {foobar();}

 "no-" targets are not supported
>>>
>>> Why not? I suppose I can use -march=x86_64 in a file compiled with 
>>> -march=core-avx2 or something like that, too.
>> Sure, you can. target(arch=x86-64) is ok. I mean exactly target(no-avx) 
>> returns:
>>
>> aaa.cpp: In function '':
>> aaa.cpp:7:5: error: No dispatcher found for no-avx
>>  int bar()
>>  ^
>>

 >> >
 >> >Now if you compile with -mavx and because ctarget takes effect only 
 >> >after inlining,
 >> >at inlining time the target attributes will match and we can edn up 
 >> >inline bar->foobar->foo.
 >> >After that we multiversion foo and drop AVX flag we will likely get 
 >> >ICE at expansion
 >> >time.
 >> But isn't that avoided by fixing up the call graph so that all calls
 >> to the affected function are going through the dispatcher?  Or is
 >> that happening too late?
 >
 > There is dispatcher only for foo that is the root of the callgarph tree.
 > When inlining we compare target attributes for match (in 
 > can_inline_edge_p).
 > We do not compare ctarget attributes.  Expanding ctarget to target early 
 > would
 > avoid need for ctarget handling.
 Currently inlining is disabled for functions with target_clone attribute:
>>>
>>> Do you also disable inlining into functions with target_clone?
>>> What I am concerned about is early inliner inlining (say) AVX code into 
>>> ctarget
>>> function because at early inlining time the target is not applied, yet.
>> Right. Now I've got your point and ICE on the test.
>> Yes the solution is to disable inline into target_clones function.
>> Or to move the pass creating clones before inline (as you suggested)
>> and leave dispatcher creator after inline.
>>
>> I like you suggestion. It fixes the ICE.
>> I'll fix the patch and retest.
>>
>> Thank you for the review,
>> Evgeny.
>>
>>
>>>
>>> Honza


Re: [PATCH] New attribute to create target clones

2015-10-12 Thread Evgeny Stupachenko
Hi All,

Here is a new version of patch (attached).
Bootstrap and make check are in progress (all new tests passed).

New test case g++.dg/ext/mvc4.C fails with ICE, when options lower
than "-mavx" are passed.
However it has the same behavior if "target_clones" attribute is
replaced by 2 corresponding "target" attributes.
I've filed PR67946 on this:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67946

Thanks,
Evgeny

ChangeLog:

2015-10-13  Evgeny Stupachenko  
gcc/
* Makefile.in (OBJS): Add multiple_target.o.
* attrib.c (make_attribute): Moved from config/i386/i386.c
* config/i386/i386.c (make_attribute): Deleted.
* multiple_target.c (make_attribute): New.
(create_dispatcher_calls): Ditto.
(get_attr_len): Ditto.
(get_attr_str): Ditto.
(is_valid_asm_symbol): Ditto.
(create_new_asm_name): Ditto.
(create_target_clone): Ditto.
(expand_target_clones): Ditto.
(ipa_target_clone): Ditto.
(ipa_dispatcher_calls): Ditto.
* passes.def (pass_target_clone): Two new ipa passes.
* tree-pass.h (make_pass_target_clone): Ditto.

gcc/c-family
* c-common.c (handle_target_clones_attribute): New.
* (c_common_attribute_table): Add handle_target_clones_attribute.
* (handle_always_inline_attribute): Add check on target_clones
attribute.
* (handle_target_attribute): Ditto.

gcc/testsuite
* gcc.dg/mvc1.c: New test for multiple targets cloning.
* gcc.dg/mvc2.c: Ditto.
* gcc.dg/mvc3.c: Ditto.
* gcc.dg/mvc4.c: Ditto.
* gcc.dg/mvc5.c: Ditto.
* gcc.dg/mvc6.c: Ditto.
* gcc.dg/mvc7.c: Ditto.
* g++.dg/ext/mvc1.C: Ditto.
* g++.dg/ext/mvc2.C: Ditto.
* g++.dg/ext/mvc3.C: Ditto.
* g++.dg/ext/mvc4.C: Ditto.

gcc/doc
* doc/extend.texi (target_clones): New attribute description.

On Sat, Oct 10, 2015 at 12:44 AM, Evgeny Stupachenko  wrote:
> On Fri, Oct 9, 2015 at 11:04 PM, Jan Hubicka  wrote:
>>> On Fri, Oct 9, 2015 at 9:27 PM, Jan Hubicka  wrote:
>>> >> >Of course it also depends what you inline into function. You can have
>>> >> >
>>> >> >bar() target(-mavx) {fancy avx code}
>>> >> >foobar() { .. if (avx) bar();}
>>> >> >foo() ctarget(-mavx,-mno-avx) {foobar();}
>>>
>>> "no-" targets are not supported
>>
>> Why not? I suppose I can use -march=x86_64 in a file compiled with 
>> -march=core-avx2 or something like that, too.
> Sure, you can. target(arch=x86-64) is ok. I mean exactly target(no-avx) 
> returns:
>
> aaa.cpp: In function '':
> aaa.cpp:7:5: error: No dispatcher found for no-avx
>  int bar()
>  ^
>
>>>
>>> >> >
>>> >> >Now if you compile with -mavx and because ctarget takes effect only 
>>> >> >after inlining,
>>> >> >at inlining time the target attributes will match and we can edn up 
>>> >> >inline bar->foobar->foo.
>>> >> >After that we multiversion foo and drop AVX flag we will likely get ICE 
>>> >> >at expansion
>>> >> >time.
>>> >> But isn't that avoided by fixing up the call graph so that all calls
>>> >> to the affected function are going through the dispatcher?  Or is
>>> >> that happening too late?
>>> >
>>> > There is dispatcher only for foo that is the root of the callgarph tree.
>>> > When inlining we compare target attributes for match (in 
>>> > can_inline_edge_p).
>>> > We do not compare ctarget attributes.  Expanding ctarget to target early 
>>> > would
>>> > avoid need for ctarget handling.
>>> Currently inlining is disabled for functions with target_clone attribute:
>>
>> Do you also disable inlining into functions with target_clone?
>> What I am concerned about is early inliner inlining (say) AVX code into 
>> ctarget
>> function because at early inlining time the target is not applied, yet.
> Right. Now I've got your point and ICE on the test.
> Yes the solution is to disable inline into target_clones function.
> Or to move the pass creating clones before inline (as you suggested)
> and leave dispatcher creator after inline.
>
> I like you suggestion. It fixes the ICE.
> I'll fix the patch and retest.
>
> Thank you for the review,
> Evgeny.
>
>
>>
>> Honza


target_clones.patch
Description: Binary data


Re: [PATCH] New attribute to create target clones

2015-10-09 Thread Evgeny Stupachenko
On Fri, Oct 9, 2015 at 11:04 PM, Jan Hubicka  wrote:
>> On Fri, Oct 9, 2015 at 9:27 PM, Jan Hubicka  wrote:
>> >> >Of course it also depends what you inline into function. You can have
>> >> >
>> >> >bar() target(-mavx) {fancy avx code}
>> >> >foobar() { .. if (avx) bar();}
>> >> >foo() ctarget(-mavx,-mno-avx) {foobar();}
>>
>> "no-" targets are not supported
>
> Why not? I suppose I can use -march=x86_64 in a file compiled with 
> -march=core-avx2 or something like that, too.
Sure, you can. target(arch=x86-64) is ok. I mean exactly target(no-avx) returns:

aaa.cpp: In function '':
aaa.cpp:7:5: error: No dispatcher found for no-avx
 int bar()
 ^

>>
>> >> >
>> >> >Now if you compile with -mavx and because ctarget takes effect only 
>> >> >after inlining,
>> >> >at inlining time the target attributes will match and we can edn up 
>> >> >inline bar->foobar->foo.
>> >> >After that we multiversion foo and drop AVX flag we will likely get ICE 
>> >> >at expansion
>> >> >time.
>> >> But isn't that avoided by fixing up the call graph so that all calls
>> >> to the affected function are going through the dispatcher?  Or is
>> >> that happening too late?
>> >
>> > There is dispatcher only for foo that is the root of the callgarph tree.
>> > When inlining we compare target attributes for match (in 
>> > can_inline_edge_p).
>> > We do not compare ctarget attributes.  Expanding ctarget to target early 
>> > would
>> > avoid need for ctarget handling.
>> Currently inlining is disabled for functions with target_clone attribute:
>
> Do you also disable inlining into functions with target_clone?
> What I am concerned about is early inliner inlining (say) AVX code into 
> ctarget
> function because at early inlining time the target is not applied, yet.
Right. Now I've got your point and ICE on the test.
Yes the solution is to disable inline into target_clones function.
Or to move the pass creating clones before inline (as you suggested)
and leave dispatcher creator after inline.

I like you suggestion. It fixes the ICE.
I'll fix the patch and retest.

Thank you for the review,
Evgeny.


>
> Honza


Re: [PATCH] New attribute to create target clones

2015-10-09 Thread Jan Hubicka
> On Fri, Oct 9, 2015 at 9:27 PM, Jan Hubicka  wrote:
> >> >Of course it also depends what you inline into function. You can have
> >> >
> >> >bar() target(-mavx) {fancy avx code}
> >> >foobar() { .. if (avx) bar();}
> >> >foo() ctarget(-mavx,-mno-avx) {foobar();}
> 
> "no-" targets are not supported

Why not? I suppose I can use -march=x86_64 in a file compiled with 
-march=core-avx2 or something like that, too.
> 
> >> >
> >> >Now if you compile with -mavx and because ctarget takes effect only after 
> >> >inlining,
> >> >at inlining time the target attributes will match and we can edn up 
> >> >inline bar->foobar->foo.
> >> >After that we multiversion foo and drop AVX flag we will likely get ICE 
> >> >at expansion
> >> >time.
> >> But isn't that avoided by fixing up the call graph so that all calls
> >> to the affected function are going through the dispatcher?  Or is
> >> that happening too late?
> >
> > There is dispatcher only for foo that is the root of the callgarph tree.
> > When inlining we compare target attributes for match (in can_inline_edge_p).
> > We do not compare ctarget attributes.  Expanding ctarget to target early 
> > would
> > avoid need for ctarget handling.
> Currently inlining is disabled for functions with target_clone attribute:

Do you also disable inlining into functions with target_clone?  
What I am concerned about is early inliner inlining (say) AVX code into ctarget
function because at early inlining time the target is not applied, yet.

Honza


Re: [PATCH] New attribute to create target clones

2015-10-09 Thread Evgeny Stupachenko
On Fri, Oct 9, 2015 at 9:27 PM, Jan Hubicka  wrote:
>> >Of course it also depends what you inline into function. You can have
>> >
>> >bar() target(-mavx) {fancy avx code}
>> >foobar() { .. if (avx) bar();}
>> >foo() ctarget(-mavx,-mno-avx) {foobar();}

"no-" targets are not supported

>> >
>> >Now if you compile with -mavx and because ctarget takes effect only after 
>> >inlining,
>> >at inlining time the target attributes will match and we can edn up inline 
>> >bar->foobar->foo.
>> >After that we multiversion foo and drop AVX flag we will likely get ICE at 
>> >expansion
>> >time.
>> But isn't that avoided by fixing up the call graph so that all calls
>> to the affected function are going through the dispatcher?  Or is
>> that happening too late?
>
> There is dispatcher only for foo that is the root of the callgarph tree.
> When inlining we compare target attributes for match (in can_inline_edge_p).
> We do not compare ctarget attributes.  Expanding ctarget to target early would
> avoid need for ctarget handling.
Currently inlining is disabled for functions with target_clone attribute:

if call from versioned function is call of multiversion function:

foo.avx2()
{
  ...
  bar();
  ...
}

bar dispatcher is better than inlining:

foo.avx2()
{
  ...
  bar.ifunc();
  ...
}

That is better from my point of view as:
1. multiversioning is going to be used mostly for target specific
optimizations like vectorization, scheduling... For these cases
inlininng not giving much. One of good examples is moving hot
vectorizable loop into the function and add target_clones attribute.
2. Inlining algorithm when we have not equal chains of architectures
in target_clones attribute are complex:
__attribute__((target_clones("avx","arch=slm","arch=core-avx2","default")))
__attribute__((target_clones("ssse3","arch=atom","arch=slm","default")))
3. If for one version inline occurs, but for another not we can get
unexpected (from user point) performance difference

Anyway I'll add the test you've mentioned to the patch.

The other big question is how all this deal with OpenMP clauses.
For now both target and target_clones attributes disable #pragma omp
effect inserting scalar ifunc call into loop.

Thanks,
Evgeny

>
>>
>> >
>> >>
>> >>
>> >>Since we're going through a dispatcher I don't think we're allowed
>> >>to change the ABI across the clones.
>> >
>> >If the dispatcher was something like
>> >
>> >switch(value)
>> >{
>> >   case 1: foo_ver1(param); break;
>> >   case 2: foo_ver2(param); break;
>> >}
>> >then it would be possible to change ABI if we can prove that param=0 in 
>> >ipa-cp.
>> >If the dispatcher uses indirect calls, then we probably want to consider
>> >foo_ver* as having address taken and thus not local.
>> Yea, I guess that could work.  For some reason I kept thinking about
>> indirect calls, but that shouldn't be the case here.
>>
>> That argues further that setting TREE_PUBLIC on the target specific
>> implementations is wrong.
>
> Yep, if the function doesn't need to be public then the whole block of code
> tampering with visibility and other flags can go.  We probably also don't need
> to care about giving clone some fixed assembler name as clonning will invent
> .target_clone.XYZ name itself.
>
> Honza
>>
>>
>> Jeff


Re: [PATCH] New attribute to create target clones

2015-10-09 Thread Jan Hubicka
> >Of course it also depends what you inline into function. You can have
> >
> >bar() target(-mavx) {fancy avx code}
> >foobar() { .. if (avx) bar();}
> >foo() ctarget(-mavx,-mno-avx) {foobar();}
> >
> >Now if you compile with -mavx and because ctarget takes effect only after 
> >inlining,
> >at inlining time the target attributes will match and we can edn up inline 
> >bar->foobar->foo.
> >After that we multiversion foo and drop AVX flag we will likely get ICE at 
> >expansion
> >time.
> But isn't that avoided by fixing up the call graph so that all calls
> to the affected function are going through the dispatcher?  Or is
> that happening too late?

There is dispatcher only for foo that is the root of the callgarph tree.
When inlining we compare target attributes for match (in can_inline_edge_p).
We do not compare ctarget attributes.  Expanding ctarget to target early would
avoid need for ctarget handling.

> 
> >
> >>
> >>
> >>Since we're going through a dispatcher I don't think we're allowed
> >>to change the ABI across the clones.
> >
> >If the dispatcher was something like
> >
> >switch(value)
> >{
> >   case 1: foo_ver1(param); break;
> >   case 2: foo_ver2(param); break;
> >}
> >then it would be possible to change ABI if we can prove that param=0 in 
> >ipa-cp.
> >If the dispatcher uses indirect calls, then we probably want to consider
> >foo_ver* as having address taken and thus not local.
> Yea, I guess that could work.  For some reason I kept thinking about
> indirect calls, but that shouldn't be the case here.
> 
> That argues further that setting TREE_PUBLIC on the target specific
> implementations is wrong.

Yep, if the function doesn't need to be public then the whole block of code
tampering with visibility and other flags can go.  We probably also don't need
to care about giving clone some fixed assembler name as clonning will invent
.target_clone.XYZ name itself.

Honza
> 
> 
> Jeff


Re: [PATCH] New attribute to create target clones

2015-10-09 Thread Jeff Law

On 10/08/2015 03:36 PM, Jan Hubicka wrote:


Yes, here you have different names for different variants of the function
body. Basically this pass takes ctarget attribute and creates bunch of verisons
of the functions and assigns them the proper target attributes, right?

Right.  Given a single function in the source tree with the new
attribute, we'll create clones and compile each clone with a
different set of options.  It's got a lot of similarities to the
multi-versioning code.  The key difference is with multi-versioning,
you actually have a different source level implementation for each
target while Evgeny's stuff has a single source implementation.



One thing I am confused about is why this does not happen early?
What happens to inlines of functions with specific taret requirements? I.e.
if I compile with AVX enabled have function body with AVX code but then, at
late compilation, force a clone with -mno-avx?

I would expect cloning to happen early, perhaps even before early 
optimizations...
Switching random target flags mid optimization queue seems dangerous.

These shouldn't be inlined to the best of my knowledge.  We go back
and direct all callers to a dispatcher.  Inlining them would be a
mistake since the goal here is to specialize the clones around
target capabilities.  Thus if something got inlined, then we lose
that ability.


OK, I assume that every multiversioned function will end up in something
like this:

foo_ver1() target(...)
{
}
foo_ver2() target(...)
{
}
foo()
{
   dispatch either to foo_ver1() or foo_ver2()
}

Yes.



I wonder why foo_ver1/foo_ver2 needs ever become public?  If there is only way
to call them via dispatcher, then the code setting TREE_PUBLIC seems wrong.  If
there is direct way to call them, then inlinng is possible.
I can't offhand think of a good reason why they should be public.  In 
fact, it would defeat the purpose of this feature to begin with if those 
symbols were directly reachable.  So, yes, setting TREE_PUBLIC seems wrong.





Of course it also depends what you inline into function. You can have

bar() target(-mavx) {fancy avx code}
foobar() { .. if (avx) bar();}
foo() ctarget(-mavx,-mno-avx) {foobar();}

Now if you compile with -mavx and because ctarget takes effect only after 
inlining,
at inlining time the target attributes will match and we can edn up inline 
bar->foobar->foo.
After that we multiversion foo and drop AVX flag we will likely get ICE at 
expansion
time.
But isn't that avoided by fixing up the call graph so that all calls to 
the affected function are going through the dispatcher?  Or is that 
happening too late?







Since we're going through a dispatcher I don't think we're allowed
to change the ABI across the clones.


If the dispatcher was something like

switch(value)
{
   case 1: foo_ver1(param); break;
   case 2: foo_ver2(param); break;
}
then it would be possible to change ABI if we can prove that param=0 in ipa-cp.
If the dispatcher uses indirect calls, then we probably want to consider
foo_ver* as having address taken and thus not local.
Yea, I guess that could work.  For some reason I kept thinking about 
indirect calls, but that shouldn't be the case here.


That argues further that setting TREE_PUBLIC on the target specific 
implementations is wrong.



Jeff


Re: [PATCH] New attribute to create target clones

2015-10-08 Thread Jeff Law

On 10/08/2015 02:01 PM, Evgeny Stupachenko wrote:

On Thu, Oct 8, 2015 at 10:00 PM, Jeff Law  wrote:

On 09/24/2015 04:28 PM, Evgeny Stupachenko wrote:


I've fixed ICE and review issues.
x86 make check and bootstrap passed.

Thanks,
Evgeny

ChangeLog

2015-09-25  Evgeny Stupachenko

gcc/
  * Makefile.in (OBJS): Add multiple_target.o.
  * multiple_target.c (make_attribute): New.
  (create_dispatcher_calls): Ditto.
  (expand_target_clones): Ditto.
  (ipa_target_clone): Ditto.
  * passes.def (pass_target_clone): New ipa pass.
  * tree-pass.h (make_pass_target_clone): Ditto.

gcc/c-family
  * c-common.c (handle_target_clones_attribute): New.
  * (c_common_attribute_table): Add handle_target_clones_attribute.
  * (handle_always_inline_attribute): Add check on target_clones
  attribute.
  * (handle_target_attribute): Ditto.

gcc/testsuite
  * gcc.dg/mvc1.c: New test for multiple targets cloning.
  * gcc.dg/mvc2.c: Ditto.
  * gcc.dg/mvc3.c: Ditto.
  * gcc.dg/mvc4.c: Ditto.
  * gcc.dg/mvc5.c: Ditto.
  * gcc.dg/mvc6.c: Ditto.
  * gcc.dg/mvc7.c: Ditto.
  * g++.dg/ext/mvc1.C: Ditto.
  * g++.dg/ext/mvc2.C: Ditto.
  * g++.dg/ext/mvc3.C: Ditto.

gcc/doc
  * doc/extend.texi (target_clones): New attribute description.






target_clones.patch


Sorry this has taken so long to come back to...  As I mentioned a couple
months ago, I'd hoped Jan would chime in on the IPA/symtab requirements.
But that didn't happen.


SO I went back and reviewed the discussion between Jan, Ilya & myself WRT
some of the rules around aliases, clones, etc.  I think the key question for
this patch is whether or not the clones have the same assembler name or not.
 From looking at expand_target_clones, I'm confident the answer is the clones
have different assembler names.  In fact, the assembler names are munged
with the options used for that specific clone of the original function.




+/* Makes a function attribute of the form NAME(ARG_NAME) and chains
+   it to CHAIN.  */
+
+static tree
+make_attribute (const char *name, const char *arg_name, tree chain)
+{
+  tree attr_name;
+  tree attr_arg_name;
+  tree attr_args;
+  tree attr;
+
+  attr_name = get_identifier (name);
+  attr_arg_name = build_string (strlen (arg_name), arg_name);
+  attr_args = tree_cons (NULL_TREE, attr_arg_name, NULL_TREE);
+  attr = tree_cons (attr_name, attr_args, chain);
+  return attr;
+}


This seems generic enough that I'd prefer it in attribs.c.  I was rather
surprised when I looked and didn't find an existing routine to do this.



+
+/* If the call in NODE has multiple target attribute with multiple
fields,
+   replace it with dispatcher call and create dispatcher (once).  */
+
+static void
+create_dispatcher_calls (struct cgraph_node *node)
+{
+  cgraph_edge *e;
+  cgraph_edge *e_next;
+  for (e = node->callers; e ;e = (e == NULL) ? e_next : e->next_caller)


That's a rather strange way to write the loop increment.  If I follow the
loop logic correctly, it seems that we always end up using e->next_caller,
it's just obscured.

For the test if we're calling a versioned function, we just "continue".  e
will be non-null and thus we use e->next_caller to set e for the next
iteration.

If the test for calling a versioned function falls, we set e_next to
e->next_caller, then later set e to NULL.  That results in using e_next to
set e for the next iteration.  But e_next was initialized to e->next_caller.

So why not just write the loop increment as e = e->next-caller?


Because of this:
   e->redirect_callee (inode);
It modifies next_caller field.
The other way is to remember all what we want to redirect and create
one more loop through the modifications to apply them.

Seems like a comment would be wise.




I'm slightly concerned with using the pretty printer to build up the new
name.  Is there precedent for this anywhere else in GCC?

I don't remember where it exactly came from. However it's not a big
deal to simplify this to std functions.

Thanks.  Not sure why, but I'd appreciate that change.




When creating the munged name, don't you also have to make sure that other
symbols that aren't supported for names don't sneak through?  I see that you
replace = and -, but you'd need to replace any symbol that could possibly be
used in an option, but which isn't allowed in a function name at the
assembler level.  I'd be worried about anything that might possibly be seen
as an operator by the assembler, '.', and possibly others.

This restriction comes from "targetm.target_option.valid_attribute_p"
and it is the same for current implementation of function
multiversioning.
It exits with error: "attribute(target("...")) is unknown".
It looks reasonable to put the check before symtab changes.
Right, but there's nothing inherently that says that a option couldn't 
have other operators such as

Re: [PATCH] New attribute to create target clones

2015-10-08 Thread Jan Hubicka
> >
> >Yes, here you have different names for different variants of the function
> >body. Basically this pass takes ctarget attribute and creates bunch of 
> >verisons
> >of the functions and assigns them the proper target attributes, right?
> Right.  Given a single function in the source tree with the new
> attribute, we'll create clones and compile each clone with a
> different set of options.  It's got a lot of similarities to the
> multi-versioning code.  The key difference is with multi-versioning,
> you actually have a different source level implementation for each
> target while Evgeny's stuff has a single source implementation.
> 
> >
> >One thing I am confused about is why this does not happen early?
> >What happens to inlines of functions with specific taret requirements? I.e.
> >if I compile with AVX enabled have function body with AVX code but then, at
> >late compilation, force a clone with -mno-avx?
> >
> >I would expect cloning to happen early, perhaps even before early 
> >optimizations...
> >Switching random target flags mid optimization queue seems dangerous.
> These shouldn't be inlined to the best of my knowledge.  We go back
> and direct all callers to a dispatcher.  Inlining them would be a
> mistake since the goal here is to specialize the clones around
> target capabilities.  Thus if something got inlined, then we lose
> that ability.

OK, I assume that every multiversioned function will end up in something
like this:

foo_ver1() target(...)
{
}
foo_ver2() target(...)
{
}
foo()
{
  dispatch either to foo_ver1() or foo_ver2()
}

I wonder why foo_ver1/foo_ver2 needs ever become public?  If there is only way
to call them via dispatcher, then the code setting TREE_PUBLIC seems wrong.  If
there is direct way to call them, then inlinng is possible.

Of course it also depends what you inline into function. You can have

bar() target(-mavx) {fancy avx code}
foobar() { .. if (avx) bar();}
foo() ctarget(-mavx,-mno-avx) {foobar();}

Now if you compile with -mavx and because ctarget takes effect only after 
inlining,
at inlining time the target attributes will match and we can edn up inline 
bar->foobar->foo.
After that we multiversion foo and drop AVX flag we will likely get ICE at 
expansion
time.

> 
> 
> Since we're going through a dispatcher I don't think we're allowed
> to change the ABI across the clones.

If the dispatcher was something like

switch(value)
{
  case 1: foo_ver1(param); break;
  case 2: foo_ver2(param); break;
}
then it would be possible to change ABI if we can prove that param=0 in ipa-cp.
If the dispatcher uses indirect calls, then we probably want to consider
foo_ver* as having address taken and thus not local.

Honza
> 
> 
> >
> >Also when you are copying a function, you probably want to copy the 
> >associated
> >thunks and version them, too?
> Dunno on that.
> 
> Jeff


Re: [PATCH] New attribute to create target clones

2015-10-08 Thread Evgeny Stupachenko
On Thu, Oct 8, 2015 at 10:00 PM, Jeff Law  wrote:
> On 09/24/2015 04:28 PM, Evgeny Stupachenko wrote:
>>
>> I've fixed ICE and review issues.
>> x86 make check and bootstrap passed.
>>
>> Thanks,
>> Evgeny
>>
>> ChangeLog
>>
>> 2015-09-25  Evgeny Stupachenko
>>
>> gcc/
>>  * Makefile.in (OBJS): Add multiple_target.o.
>>  * multiple_target.c (make_attribute): New.
>>  (create_dispatcher_calls): Ditto.
>>  (expand_target_clones): Ditto.
>>  (ipa_target_clone): Ditto.
>>  * passes.def (pass_target_clone): New ipa pass.
>>  * tree-pass.h (make_pass_target_clone): Ditto.
>>
>> gcc/c-family
>>  * c-common.c (handle_target_clones_attribute): New.
>>  * (c_common_attribute_table): Add handle_target_clones_attribute.
>>  * (handle_always_inline_attribute): Add check on target_clones
>>  attribute.
>>  * (handle_target_attribute): Ditto.
>>
>> gcc/testsuite
>>  * gcc.dg/mvc1.c: New test for multiple targets cloning.
>>  * gcc.dg/mvc2.c: Ditto.
>>  * gcc.dg/mvc3.c: Ditto.
>>  * gcc.dg/mvc4.c: Ditto.
>>  * gcc.dg/mvc5.c: Ditto.
>>  * gcc.dg/mvc6.c: Ditto.
>>  * gcc.dg/mvc7.c: Ditto.
>>  * g++.dg/ext/mvc1.C: Ditto.
>>  * g++.dg/ext/mvc2.C: Ditto.
>>  * g++.dg/ext/mvc3.C: Ditto.
>>
>> gcc/doc
>>  * doc/extend.texi (target_clones): New attribute description.
>>
>>
>
>>
>> target_clones.patch
>
> Sorry this has taken so long to come back to...  As I mentioned a couple
> months ago, I'd hoped Jan would chime in on the IPA/symtab requirements.
> But that didn't happen.
>
>
> SO I went back and reviewed the discussion between Jan, Ilya & myself WRT
> some of the rules around aliases, clones, etc.  I think the key question for
> this patch is whether or not the clones have the same assembler name or not.
> From looking at expand_target_clones, I'm confident the answer is the clones
> have different assembler names.  In fact, the assembler names are munged
> with the options used for that specific clone of the original function.
>
>
>>
>> +/* Makes a function attribute of the form NAME(ARG_NAME) and chains
>> +   it to CHAIN.  */
>> +
>> +static tree
>> +make_attribute (const char *name, const char *arg_name, tree chain)
>> +{
>> +  tree attr_name;
>> +  tree attr_arg_name;
>> +  tree attr_args;
>> +  tree attr;
>> +
>> +  attr_name = get_identifier (name);
>> +  attr_arg_name = build_string (strlen (arg_name), arg_name);
>> +  attr_args = tree_cons (NULL_TREE, attr_arg_name, NULL_TREE);
>> +  attr = tree_cons (attr_name, attr_args, chain);
>> +  return attr;
>> +}
>
> This seems generic enough that I'd prefer it in attribs.c.  I was rather
> surprised when I looked and didn't find an existing routine to do this.
>
>
>> +
>> +/* If the call in NODE has multiple target attribute with multiple
>> fields,
>> +   replace it with dispatcher call and create dispatcher (once).  */
>> +
>> +static void
>> +create_dispatcher_calls (struct cgraph_node *node)
>> +{
>> +  cgraph_edge *e;
>> +  cgraph_edge *e_next;
>> +  for (e = node->callers; e ;e = (e == NULL) ? e_next : e->next_caller)
>
> That's a rather strange way to write the loop increment.  If I follow the
> loop logic correctly, it seems that we always end up using e->next_caller,
> it's just obscured.
>
> For the test if we're calling a versioned function, we just "continue".  e
> will be non-null and thus we use e->next_caller to set e for the next
> iteration.
>
> If the test for calling a versioned function falls, we set e_next to
> e->next_caller, then later set e to NULL.  That results in using e_next to
> set e for the next iteration.  But e_next was initialized to e->next_caller.
>
> So why not just write the loop increment as e = e->next-caller?

Because of this:
  e->redirect_callee (inode);
It modifies next_caller field.
The other way is to remember all what we want to redirect and create
one more loop through the modifications to apply them.

>
>
>
>> +{
>> +  tree resolver_decl;
>> +  tree idecl;
>> +  tree decl;
>> +  gimple *call = e->call_stmt;
>> +  struct cgraph_node *inode;
>> +
>> +  /* Checking if call of function is call of versioned function.
>> +Versioned function are not inlined, so there is no need to
>> +check for inline.  */
>
> This comment doesn't parse well.  Perhaps:
>
> /* Check if this is a call to a versioned function.  Verisoned
>fucntions are not inlined, so there is no need to check for that.  */
>
>
>> +
>> +/* If the function in NODE has multiple target attribute with multiple
>> fields,
>> +   create the appropriate clone for each field.  */
>> +
>> +static bool
>> +expand_target_clones (struct cgraph_node *node)
>
> So this is probably getting a little large.  Can we look to refactor it a
> little?  It's not a huge deal and there's certainly code in GCC that is far
> worse, but it just feels like there's e

Re: [PATCH] New attribute to create target clones

2015-10-08 Thread Jeff Law

On 10/08/2015 01:23 PM, Jan Hubicka wrote:

Sorry this has taken so long to come back to...  As I mentioned a
couple months ago, I'd hoped Jan would chime in on the IPA/symtab
requirements.  But that didn't happen.


Sorry for that.  I had bit too many real life things this summer
and I am still trying to catch up.

It happens to us all :-)  No worries.



Ilya's code seems different from what this patch does. Ilya simply needs
multiple declarations for one physical assembler name (this is not an alias).
This is not currently supported by symtab (support for that was removed long
time ago as part of the one decl project) and I have some perliminary patches
to push out, but since they add basic sanity checking that the different
declarations of the same thing looks compatible I get too many positives I need
to walk through.  Those seems real bugs in glibc (which uses duplicated decls
for checking) and the pointer bounds code.
Right.  I was mostly concerned because I goof'd letting those bits from 
Ilya in and didn't want to repeat the mistake again.  It was pretty 
clear once I found the thread between the tree of us that Evgeny's 
patches were doing something rather different and didn't violate the 
assumptions of the symtab code.




Yes, here you have different names for different variants of the function
body. Basically this pass takes ctarget attribute and creates bunch of verisons
of the functions and assigns them the proper target attributes, right?
Right.  Given a single function in the source tree with the new 
attribute, we'll create clones and compile each clone with a different 
set of options.  It's got a lot of similarities to the multi-versioning 
code.  The key difference is with multi-versioning, you actually have a 
different source level implementation for each target while Evgeny's 
stuff has a single source implementation.




One thing I am confused about is why this does not happen early?
What happens to inlines of functions with specific taret requirements? I.e.
if I compile with AVX enabled have function body with AVX code but then, at
late compilation, force a clone with -mno-avx?

I would expect cloning to happen early, perhaps even before early 
optimizations...
Switching random target flags mid optimization queue seems dangerous.
These shouldn't be inlined to the best of my knowledge.  We go back and 
direct all callers to a dispatcher.  Inlining them would be a mistake 
since the goal here is to specialize the clones around target 
capabilities.  Thus if something got inlined, then we lose that ability.





As for the patch itself:
+  if (node->definition)
+   {
+ if (!node->has_gimple_body_p ())
+   return false;
+ node->get_body ();
+
+ /* Replacing initial function with clone only for 1st ctarget.  */
+ new_node = node->create_version_clone_with_body (vNULL, NULL,
+  NULL, false,
+  NULL, NULL,
+  "target_clone");
+ new_node->externally_visible = node->externally_visible;
+ new_node->address_taken = node->address_taken;
+ new_node->thunk = node->thunk;
+ new_node->alias = node->alias;
+ new_node->weakref = node->weakref;
+ new_node->cpp_implicit_alias = node->cpp_implicit_alias;
+ new_node->local.local = node->local.local;
+ TREE_PUBLIC (new_node->decl) = TREE_PUBLIC (node->decl);
+   }
Since you test it has gimple body, then you don't need to worry about
alias/thunk/weakrefs/implicit_alias properties. Those will be never set.
How does the dispatcher look like?  Can the function be considered local
in a sense that one can change calling conventions of one clone but not another?

Easiest to think of them as having the same abilities as multi-versioning.


Since we're going through a dispatcher I don't think we're allowed to 
change the ABI across the clones.





Also when you are copying a function, you probably want to copy the associated
thunks and version them, too?

Dunno on that.

Jeff


Re: [PATCH] New attribute to create target clones

2015-10-08 Thread Jan Hubicka
> Sorry this has taken so long to come back to...  As I mentioned a
> couple months ago, I'd hoped Jan would chime in on the IPA/symtab
> requirements.  But that didn't happen.

Sorry for that.  I had bit too many real life things this summer
and I am still trying to catch up.
> 
> 
> SO I went back and reviewed the discussion between Jan, Ilya &
> myself WRT some of the rules around aliases, clones, etc.  I think
> the key question for this patch is whether or not the clones have
> the same assembler name or not.  From looking at

Ilya's code seems different from what this patch does. Ilya simply needs
multiple declarations for one physical assembler name (this is not an alias).
This is not currently supported by symtab (support for that was removed long
time ago as part of the one decl project) and I have some perliminary patches
to push out, but since they add basic sanity checking that the different
declarations of the same thing looks compatible I get too many positives I need
to walk through.  Those seems real bugs in glibc (which uses duplicated decls
for checking) and the pointer bounds code.

> expand_target_clones, I'm confident the answer is the clones have
> different assembler names.  In fact, the assembler names are munged

Yes, here you have different names for different variants of the function
body. Basically this pass takes ctarget attribute and creates bunch of verisons
of the functions and assigns them the proper target attributes, right?

One thing I am confused about is why this does not happen early?
What happens to inlines of functions with specific taret requirements? I.e.
if I compile with AVX enabled have function body with AVX code but then, at
late compilation, force a clone with -mno-avx?

I would expect cloning to happen early, perhaps even before early 
optimizations...
Switching random target flags mid optimization queue seems dangerous.

As for the patch itself:
+  if (node->definition)
+   {
+ if (!node->has_gimple_body_p ())
+   return false;
+ node->get_body ();
+
+ /* Replacing initial function with clone only for 1st ctarget.  */
+ new_node = node->create_version_clone_with_body (vNULL, NULL,
+  NULL, false,
+  NULL, NULL,
+  "target_clone");
+ new_node->externally_visible = node->externally_visible;
+ new_node->address_taken = node->address_taken;
+ new_node->thunk = node->thunk;
+ new_node->alias = node->alias;
+ new_node->weakref = node->weakref;
+ new_node->cpp_implicit_alias = node->cpp_implicit_alias;
+ new_node->local.local = node->local.local;
+ TREE_PUBLIC (new_node->decl) = TREE_PUBLIC (node->decl);
+   }
Since you test it has gimple body, then you don't need to worry about
alias/thunk/weakrefs/implicit_alias properties. Those will be never set.
How does the dispatcher look like?  Can the function be considered local
in a sense that one can change calling conventions of one clone but not another?

On the other hand, node->create_version_clone_with_body creates a function
that is local, if we want to create globally visible clones, I think we should
add a parameter there and avoid the localization followed by unlocalization.
(I know we already have too many parameters here, perhaps we could add a flags
parameter that will also handle the existing skip_return flag.)
For exmaple, I think you want to have all functions with same WEAK attributes
or in the same COMDAT group.

Also when you are copying a function, you probably want to copy the associated
thunks and version them, too?

+  id = get_identifier (new_asm_name);
+  symtab->change_decl_assembler_name (new_node->decl, id);

here I think you can just pass new_asm_name as clone_name. I.e. replace
the current use of "target_clone" string.

+  targetm.target_option.valid_attribute_p (new_node->decl, NULL,
+  TREE_VALUE (attributes), 0);

looks like return value should not be ignored. If attribute is not valid,
we need to error and do something sane.

Honza


Re: [PATCH] New attribute to create target clones

2015-10-08 Thread Jeff Law

On 09/24/2015 04:28 PM, Evgeny Stupachenko wrote:

I've fixed ICE and review issues.
x86 make check and bootstrap passed.

Thanks,
Evgeny

ChangeLog

2015-09-25  Evgeny Stupachenko

gcc/
 * Makefile.in (OBJS): Add multiple_target.o.
 * multiple_target.c (make_attribute): New.
 (create_dispatcher_calls): Ditto.
 (expand_target_clones): Ditto.
 (ipa_target_clone): Ditto.
 * passes.def (pass_target_clone): New ipa pass.
 * tree-pass.h (make_pass_target_clone): Ditto.

gcc/c-family
 * c-common.c (handle_target_clones_attribute): New.
 * (c_common_attribute_table): Add handle_target_clones_attribute.
 * (handle_always_inline_attribute): Add check on target_clones
 attribute.
 * (handle_target_attribute): Ditto.

gcc/testsuite
 * gcc.dg/mvc1.c: New test for multiple targets cloning.
 * gcc.dg/mvc2.c: Ditto.
 * gcc.dg/mvc3.c: Ditto.
 * gcc.dg/mvc4.c: Ditto.
 * gcc.dg/mvc5.c: Ditto.
 * gcc.dg/mvc6.c: Ditto.
 * gcc.dg/mvc7.c: Ditto.
 * g++.dg/ext/mvc1.C: Ditto.
 * g++.dg/ext/mvc2.C: Ditto.
 * g++.dg/ext/mvc3.C: Ditto.

gcc/doc
 * doc/extend.texi (target_clones): New attribute description.






target_clones.patch
Sorry this has taken so long to come back to...  As I mentioned a couple 
months ago, I'd hoped Jan would chime in on the IPA/symtab requirements. 
 But that didn't happen.



SO I went back and reviewed the discussion between Jan, Ilya & myself 
WRT some of the rules around aliases, clones, etc.  I think the key 
question for this patch is whether or not the clones have the same 
assembler name or not.  From looking at expand_target_clones, I'm 
confident the answer is the clones have different assembler names.  In 
fact, the assembler names are munged with the options used for that 
specific clone of the original function.





+/* Makes a function attribute of the form NAME(ARG_NAME) and chains
+   it to CHAIN.  */
+
+static tree
+make_attribute (const char *name, const char *arg_name, tree chain)
+{
+  tree attr_name;
+  tree attr_arg_name;
+  tree attr_args;
+  tree attr;
+
+  attr_name = get_identifier (name);
+  attr_arg_name = build_string (strlen (arg_name), arg_name);
+  attr_args = tree_cons (NULL_TREE, attr_arg_name, NULL_TREE);
+  attr = tree_cons (attr_name, attr_args, chain);
+  return attr;
+}
This seems generic enough that I'd prefer it in attribs.c.  I was rather 
surprised when I looked and didn't find an existing routine to do this.




+
+/* If the call in NODE has multiple target attribute with multiple fields,
+   replace it with dispatcher call and create dispatcher (once).  */
+
+static void
+create_dispatcher_calls (struct cgraph_node *node)
+{
+  cgraph_edge *e;
+  cgraph_edge *e_next;
+  for (e = node->callers; e ;e = (e == NULL) ? e_next : e->next_caller)
That's a rather strange way to write the loop increment.  If I follow 
the loop logic correctly, it seems that we always end up using 
e->next_caller, it's just obscured.


For the test if we're calling a versioned function, we just "continue". 
 e will be non-null and thus we use e->next_caller to set e for the 
next iteration.


If the test for calling a versioned function falls, we set e_next to 
e->next_caller, then later set e to NULL.  That results in using e_next 
to set e for the next iteration.  But e_next was initialized to 
e->next_caller.


So why not just write the loop increment as e = e->next-caller?




+{
+  tree resolver_decl;
+  tree idecl;
+  tree decl;
+  gimple *call = e->call_stmt;
+  struct cgraph_node *inode;
+
+  /* Checking if call of function is call of versioned function.
+Versioned function are not inlined, so there is no need to
+check for inline.  */

This comment doesn't parse well.  Perhaps:

/* Check if this is a call to a versioned function.  Verisoned
   fucntions are not inlined, so there is no need to check for that.  */



+
+/* If the function in NODE has multiple target attribute with multiple fields,
+   create the appropriate clone for each field.  */
+
+static bool
+expand_target_clones (struct cgraph_node *node)
So this is probably getting a little large.  Can we look to refactor it 
a little?  It's not a huge deal and there's certainly code in GCC that 
is far worse, but it just feels like there's enough going on in this 
code that there ought to be 2-3 helpers for the larger chunks of work 
going on inside this code.


The first is the attribute parsing.  The second is creation of the 
nodes, and the final would be munging the name and attributes on the clones.


I'm slightly concerned with using the pretty printer to build up the new 
name.  Is there precedent for this anywhere else in GCC?


When creating the munged name, don't you also have to make sure that 
other symbols that aren't supported for names don't sneak through?  I 
see that you replace =

Re: [PATCH] New attribute to create target clones

2015-10-08 Thread Jeff Law

On 09/22/2015 03:09 PM, Bernd Schmidt wrote:

On 09/22/2015 09:41 PM, Jeff Law wrote:

Essentially it allows us to more easily support
per-microarchitecture-optimized versions of functions.   You list just
have to list the microarchitectures and the compiler handles the rest.
Very simple, very easy.  I'd think it'd be particularly helpful for
vectorization.

You could emulate this with compiling the same source multiple times
with different flags/defines and wire up on ifunc by hand.  But Evgeny's
approach is vastly simpler.


As far as I can tell the ifunc is generated automatically (and the
functionality is documented as such), so the new target_clone doesn't
buy much. But one thing I didn't was that the existing support is only
available in C++, while Evgeny's patch works for C. That is probably an
argument that could be made for its inclusion.
In multi-versioning, you have a distinct source implementation for each 
function.  ie


__attribute__ ((target ("default")))
int foo ()
{
  // The default version of foo.
  return 0;
}

__attribute__ ((target ("sse4.2")))
int foo ()
{
  // foo version for SSE4.2
  return 1;
}

And so-on for each processor version you want to support.

In Evgeny's patch we'd have a single source implementation of foo which 
gets compiled multiple times, once for each target specified by the 
attribute.


I wasn't aware that multi-versioning was only implemented for C++, that 
seems fairly lame.  I hope I didn't approve that :-)


Jeff



Re: [PATCH] New attribute to create target clones

2015-10-02 Thread Evgeny Stupachenko
PING.

On Fri, Sep 25, 2015 at 1:28 AM, Evgeny Stupachenko  wrote:
> I've fixed ICE and review issues.
> x86 make check and bootstrap passed.
>
> Thanks,
> Evgeny
>
> ChangeLog
>
> 2015-09-25  Evgeny Stupachenko  
>
> gcc/
> * Makefile.in (OBJS): Add multiple_target.o.
> * multiple_target.c (make_attribute): New.
> (create_dispatcher_calls): Ditto.
> (expand_target_clones): Ditto.
> (ipa_target_clone): Ditto.
> * passes.def (pass_target_clone): New ipa pass.
> * tree-pass.h (make_pass_target_clone): Ditto.
>
> gcc/c-family
> * c-common.c (handle_target_clones_attribute): New.
> * (c_common_attribute_table): Add handle_target_clones_attribute.
> * (handle_always_inline_attribute): Add check on target_clones
> attribute.
> * (handle_target_attribute): Ditto.
>
> gcc/testsuite
> * gcc.dg/mvc1.c: New test for multiple targets cloning.
> * gcc.dg/mvc2.c: Ditto.
> * gcc.dg/mvc3.c: Ditto.
> * gcc.dg/mvc4.c: Ditto.
> * gcc.dg/mvc5.c: Ditto.
> * gcc.dg/mvc6.c: Ditto.
> * gcc.dg/mvc7.c: Ditto.
> * g++.dg/ext/mvc1.C: Ditto.
> * g++.dg/ext/mvc2.C: Ditto.
> * g++.dg/ext/mvc3.C: Ditto.
>
> gcc/doc
> * doc/extend.texi (target_clones): New attribute description.
>
>
> On Wed, Sep 23, 2015 at 1:49 AM, Evgeny Stupachenko  
> wrote:
>> Thank you for the review.
>> The patch still works with gcc 5, but the fail reproduced on trunk
>> (looks like it appeared while patch was at review). I'll debug it and
>> fix.
>> As a workaround to test the feature...
>> Removing
>> "gimple_call_set_fndecl (call, idecl);" from multiple_target.c
>> should resolve the ICE
>>
>> I'll fix the patch for trunk and send an update.
>>
>> Thanks,
>> Evgeny
>>
>>
>> On Wed, Sep 23, 2015 at 12:09 AM, Bernd Schmidt  wrote:
>>> On 09/22/2015 09:41 PM, Jeff Law wrote:

 Essentially it allows us to more easily support
 per-microarchitecture-optimized versions of functions.   You list just
 have to list the microarchitectures and the compiler handles the rest.
 Very simple, very easy.  I'd think it'd be particularly helpful for
 vectorization.

 You could emulate this with compiling the same source multiple times
 with different flags/defines and wire up on ifunc by hand.  But Evgeny's
 approach is vastly simpler.
>>>
>>>
>>> As far as I can tell the ifunc is generated automatically (and the
>>> functionality is documented as such), so the new target_clone doesn't buy
>>> much. But one thing I didn't was that the existing support is only available
>>> in C++, while Evgeny's patch works for C. That is probably an argument that
>>> could be made for its inclusion.
>>>
>>> Or at least, it's supposed to work. As I said, I get verify_ssa failures on
>>> the included testcases, and for a simpler one I just tried I get the clones
>>> of the function, but not the resolver that ought to be generated.
>>>
>>>
>>> Bernd


Re: [PATCH] New attribute to create target clones

2015-09-24 Thread Evgeny Stupachenko
I've fixed ICE and review issues.
x86 make check and bootstrap passed.

Thanks,
Evgeny

ChangeLog

2015-09-25  Evgeny Stupachenko  

gcc/
* Makefile.in (OBJS): Add multiple_target.o.
* multiple_target.c (make_attribute): New.
(create_dispatcher_calls): Ditto.
(expand_target_clones): Ditto.
(ipa_target_clone): Ditto.
* passes.def (pass_target_clone): New ipa pass.
* tree-pass.h (make_pass_target_clone): Ditto.

gcc/c-family
* c-common.c (handle_target_clones_attribute): New.
* (c_common_attribute_table): Add handle_target_clones_attribute.
* (handle_always_inline_attribute): Add check on target_clones
attribute.
* (handle_target_attribute): Ditto.

gcc/testsuite
* gcc.dg/mvc1.c: New test for multiple targets cloning.
* gcc.dg/mvc2.c: Ditto.
* gcc.dg/mvc3.c: Ditto.
* gcc.dg/mvc4.c: Ditto.
* gcc.dg/mvc5.c: Ditto.
* gcc.dg/mvc6.c: Ditto.
* gcc.dg/mvc7.c: Ditto.
* g++.dg/ext/mvc1.C: Ditto.
* g++.dg/ext/mvc2.C: Ditto.
* g++.dg/ext/mvc3.C: Ditto.

gcc/doc
* doc/extend.texi (target_clones): New attribute description.


On Wed, Sep 23, 2015 at 1:49 AM, Evgeny Stupachenko  wrote:
> Thank you for the review.
> The patch still works with gcc 5, but the fail reproduced on trunk
> (looks like it appeared while patch was at review). I'll debug it and
> fix.
> As a workaround to test the feature...
> Removing
> "gimple_call_set_fndecl (call, idecl);" from multiple_target.c
> should resolve the ICE
>
> I'll fix the patch for trunk and send an update.
>
> Thanks,
> Evgeny
>
>
> On Wed, Sep 23, 2015 at 12:09 AM, Bernd Schmidt  wrote:
>> On 09/22/2015 09:41 PM, Jeff Law wrote:
>>>
>>> Essentially it allows us to more easily support
>>> per-microarchitecture-optimized versions of functions.   You list just
>>> have to list the microarchitectures and the compiler handles the rest.
>>> Very simple, very easy.  I'd think it'd be particularly helpful for
>>> vectorization.
>>>
>>> You could emulate this with compiling the same source multiple times
>>> with different flags/defines and wire up on ifunc by hand.  But Evgeny's
>>> approach is vastly simpler.
>>
>>
>> As far as I can tell the ifunc is generated automatically (and the
>> functionality is documented as such), so the new target_clone doesn't buy
>> much. But one thing I didn't was that the existing support is only available
>> in C++, while Evgeny's patch works for C. That is probably an argument that
>> could be made for its inclusion.
>>
>> Or at least, it's supposed to work. As I said, I get verify_ssa failures on
>> the included testcases, and for a simpler one I just tried I get the clones
>> of the function, but not the resolver that ought to be generated.
>>
>>
>> Bernd


target_clones.patch
Description: Binary data


Re: [PATCH] New attribute to create target clones

2015-09-22 Thread Evgeny Stupachenko
Thank you for the review.
The patch still works with gcc 5, but the fail reproduced on trunk
(looks like it appeared while patch was at review). I'll debug it and
fix.
As a workaround to test the feature...
Removing
"gimple_call_set_fndecl (call, idecl);" from multiple_target.c
should resolve the ICE

I'll fix the patch for trunk and send an update.

Thanks,
Evgeny


On Wed, Sep 23, 2015 at 12:09 AM, Bernd Schmidt  wrote:
> On 09/22/2015 09:41 PM, Jeff Law wrote:
>>
>> Essentially it allows us to more easily support
>> per-microarchitecture-optimized versions of functions.   You list just
>> have to list the microarchitectures and the compiler handles the rest.
>> Very simple, very easy.  I'd think it'd be particularly helpful for
>> vectorization.
>>
>> You could emulate this with compiling the same source multiple times
>> with different flags/defines and wire up on ifunc by hand.  But Evgeny's
>> approach is vastly simpler.
>
>
> As far as I can tell the ifunc is generated automatically (and the
> functionality is documented as such), so the new target_clone doesn't buy
> much. But one thing I didn't was that the existing support is only available
> in C++, while Evgeny's patch works for C. That is probably an argument that
> could be made for its inclusion.
>
> Or at least, it's supposed to work. As I said, I get verify_ssa failures on
> the included testcases, and for a simpler one I just tried I get the clones
> of the function, but not the resolver that ought to be generated.
>
>
> Bernd


Re: [PATCH] New attribute to create target clones

2015-09-22 Thread Bernd Schmidt

On 09/22/2015 09:41 PM, Jeff Law wrote:

Essentially it allows us to more easily support
per-microarchitecture-optimized versions of functions.   You list just
have to list the microarchitectures and the compiler handles the rest.
Very simple, very easy.  I'd think it'd be particularly helpful for
vectorization.

You could emulate this with compiling the same source multiple times
with different flags/defines and wire up on ifunc by hand.  But Evgeny's
approach is vastly simpler.


As far as I can tell the ifunc is generated automatically (and the 
functionality is documented as such), so the new target_clone doesn't 
buy much. But one thing I didn't was that the existing support is only 
available in C++, while Evgeny's patch works for C. That is probably an 
argument that could be made for its inclusion.


Or at least, it's supposed to work. As I said, I get verify_ssa failures 
on the included testcases, and for a simpler one I just tried I get the 
clones of the function, but not the resolver that ought to be generated.



Bernd


Re: [PATCH] New attribute to create target clones

2015-09-22 Thread Jeff Law

On 09/21/2015 07:25 AM, Bernd Schmidt wrote:

On 08/27/2015 01:18 PM, Evgeny Stupachenko wrote:

Based on RFC:
https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01322.html

The patch implement an extension to Function Multiversioning that
allows to clone a function for multiple targets.
__attribute__((target_clones("avx","arch=slm","default")))
int foo ()
...

Will create 3 clones of foo(). One optimized with -mavx, one optimized
with -march=slm, and one with default optimizations.
And will create ifunc resolver that calls appropriate clone (same as
in Function Multiversioning).


The general question is - do we want this, given that it seems to
introduce no functionality that can't be had with the existing
multiversioning? You could always compile the same source file to
multiple objects with different defines for the target optimization, or
include a file containing the multiversioned function multiple times
with changing #defines.
Essentially it allows us to more easily support 
per-microarchitecture-optimized versions of functions.   You list just 
have to list the microarchitectures and the compiler handles the rest. 
Very simple, very easy.  I'd think it'd be particularly helpful for 
vectorization.


You could emulate this with compiling the same source multiple times 
with different flags/defines and wire up on ifunc by hand.  But Evgeny's 
approach is vastly simpler.


We are still waiting on Jan to chime in as to whether or not any of this 
breaks the rules WRT nodes in the symbol table.  Given I mucked up the 
MPX code by not knowing those rules, I'm hesitant to approve this patch 
without input from Jan.


Jeff



Re: [PATCH] New attribute to create target clones

2015-09-21 Thread Bernd Schmidt

On 08/27/2015 01:18 PM, Evgeny Stupachenko wrote:

Based on RFC:
https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01322.html

The patch implement an extension to Function Multiversioning that
allows to clone a function for multiple targets.
__attribute__((target_clones("avx","arch=slm","default")))
int foo ()
...

Will create 3 clones of foo(). One optimized with -mavx, one optimized
with -march=slm, and one with default optimizations.
And will create ifunc resolver that calls appropriate clone (same as
in Function Multiversioning).


The general question is - do we want this, given that it seems to 
introduce no functionality that can't be had with the existing 
multiversioning? You could always compile the same source file to 
multiple objects with different defines for the target optimization, or 
include a file containing the multiversioned function multiple times 
with changing #defines.


Some comments on the patch itself:

> diff --git a/gcc/testsuite/gcc.dg/mvc1.c b/gcc/testsuite/gcc.dg/mvc1.c
> new file mode 100644
> index 000..ab1e0f5
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/mvc1.c
> @@ -0,0 +1,27 @@
> +/* { dg-do run { target i?86-*-* x86_64-*-* } } */
> +/* { dg-options "-O2" } */

Target-specific tests should go into gcc.target/i386.


+  else if (lookup_attribute ("target_clones", DECL_ATTRIBUTES (*node)))
+   {
+ warning (OPT_Wattributes,
+  "%qE attribute ignored as it conflict with target_clones",
+  name);
+ *no_add_attrs = true;
+   }


Bad grammar ("conflicts"), and apparently you're supposed to use %qs 
even for known attribute names. Look at the code immediately before this 
and copy it. Similar issues throughout.



+  for (e = node->callers; e ;e = (e == NULL) ? e_next : e->next_caller)


It looks like this could be simplified if you immediately assign
  e_next = e->next_caller
and then always use e_next.

Trying to test it, I get (compiling with -O2 on x86_64-linux):

mvc1.c: In function ‘main’:
mvc1.c:18:1: error: virtual definition of statement not up-to-date
 main ()
 ^
_2 = foo.ifunc ();
mvc1.c:18:1: internal compiler error: verify_ssa failed


Bernd



Re: [PATCH] New attribute to create target clones

2015-09-16 Thread Evgeny Stupachenko
PING 2.

On Tue, Sep 8, 2015 at 2:27 PM, Evgeny Stupachenko  wrote:
> Ping.
>
> On Thu, Aug 27, 2015 at 2:18 PM, Evgeny Stupachenko  
> wrote:
>> Hi All,
>>
>> Based on RFC:
>> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01322.html
>>
>> The patch implement an extension to Function Multiversioning that
>> allows to clone a function for multiple targets.
>> __attribute__((target_clones("avx","arch=slm","default")))
>> int foo ()
>> ...
>>
>> Will create 3 clones of foo(). One optimized with -mavx, one optimized
>> with -march=slm, and one with default optimizations.
>> And will create ifunc resolver that calls appropriate clone (same as
>> in Function Multiversioning).
>>
>> Bootstrap and make check for x86 passed.
>>
>> Is it ok for trunk?
>>
>> 2015-08-27  Evgeny Stupachenko  
>>
>> gcc/
>> * Makefile.in (OBJS): Add multiple_target.o.
>> * multiple_target.c (make_attribute): New.
>> (create_dispatcher_calls): Ditto.
>> (expand_target_clones): Ditto.
>> (ipa_target_clone): Ditto.
>> * passes.def (pass_target_clone): New ipa pass.
>> * tree-pass.h (make_pass_target_clone): Ditto.
>>
>> gcc/c-family
>> * c-common.c (handle_target_clones_attribute): New.
>> * (c_common_attribute_table): Add handle_target_clones_attribute.
>> * (handle_always_inline_attribute): Add check on target_clones
>> attribute.
>> * (handle_target_attribute): Ditto.
>>
>> gcc/testsuite
>> * gcc.dg/mvc1.c: New test for multiple targets cloning.
>> * gcc.dg/mvc2.c: Ditto.
>> * gcc.dg/mvc3.c: Ditto.
>> * gcc.dg/mvc4.c: Ditto.
>> * gcc.dg/mvc5.c: Ditto.
>> * gcc.dg/mvc6.c: Ditto.
>> * gcc.dg/mvc7.c: Ditto.
>> * g++.dg/ext/mvc1.C: Ditto.
>> * g++.dg/ext/mvc2.C: Ditto.
>> * g++.dg/ext/mvc3.C: Ditto.
>>
>> gcc/doc
>> * doc/extend.texi (target_clones): New attribute description.


Re: [PATCH] New attribute to create target clones

2015-09-08 Thread Evgeny Stupachenko
Ping.

On Thu, Aug 27, 2015 at 2:18 PM, Evgeny Stupachenko  wrote:
> Hi All,
>
> Based on RFC:
> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01322.html
>
> The patch implement an extension to Function Multiversioning that
> allows to clone a function for multiple targets.
> __attribute__((target_clones("avx","arch=slm","default")))
> int foo ()
> ...
>
> Will create 3 clones of foo(). One optimized with -mavx, one optimized
> with -march=slm, and one with default optimizations.
> And will create ifunc resolver that calls appropriate clone (same as
> in Function Multiversioning).
>
> Bootstrap and make check for x86 passed.
>
> Is it ok for trunk?
>
> 2015-08-27  Evgeny Stupachenko  
>
> gcc/
> * Makefile.in (OBJS): Add multiple_target.o.
> * multiple_target.c (make_attribute): New.
> (create_dispatcher_calls): Ditto.
> (expand_target_clones): Ditto.
> (ipa_target_clone): Ditto.
> * passes.def (pass_target_clone): New ipa pass.
> * tree-pass.h (make_pass_target_clone): Ditto.
>
> gcc/c-family
> * c-common.c (handle_target_clones_attribute): New.
> * (c_common_attribute_table): Add handle_target_clones_attribute.
> * (handle_always_inline_attribute): Add check on target_clones
> attribute.
> * (handle_target_attribute): Ditto.
>
> gcc/testsuite
> * gcc.dg/mvc1.c: New test for multiple targets cloning.
> * gcc.dg/mvc2.c: Ditto.
> * gcc.dg/mvc3.c: Ditto.
> * gcc.dg/mvc4.c: Ditto.
> * gcc.dg/mvc5.c: Ditto.
> * gcc.dg/mvc6.c: Ditto.
> * gcc.dg/mvc7.c: Ditto.
> * g++.dg/ext/mvc1.C: Ditto.
> * g++.dg/ext/mvc2.C: Ditto.
> * g++.dg/ext/mvc3.C: Ditto.
>
> gcc/doc
> * doc/extend.texi (target_clones): New attribute description.


[PATCH] New attribute to create target clones

2015-08-27 Thread Evgeny Stupachenko
Hi All,

Based on RFC:
https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01322.html

The patch implement an extension to Function Multiversioning that
allows to clone a function for multiple targets.
__attribute__((target_clones("avx","arch=slm","default")))
int foo ()
...

Will create 3 clones of foo(). One optimized with -mavx, one optimized
with -march=slm, and one with default optimizations.
And will create ifunc resolver that calls appropriate clone (same as
in Function Multiversioning).

Bootstrap and make check for x86 passed.

Is it ok for trunk?

2015-08-27  Evgeny Stupachenko  

gcc/
* Makefile.in (OBJS): Add multiple_target.o.
* multiple_target.c (make_attribute): New.
(create_dispatcher_calls): Ditto.
(expand_target_clones): Ditto.
(ipa_target_clone): Ditto.
* passes.def (pass_target_clone): New ipa pass.
* tree-pass.h (make_pass_target_clone): Ditto.

gcc/c-family
* c-common.c (handle_target_clones_attribute): New.
* (c_common_attribute_table): Add handle_target_clones_attribute.
* (handle_always_inline_attribute): Add check on target_clones
attribute.
* (handle_target_attribute): Ditto.

gcc/testsuite
* gcc.dg/mvc1.c: New test for multiple targets cloning.
* gcc.dg/mvc2.c: Ditto.
* gcc.dg/mvc3.c: Ditto.
* gcc.dg/mvc4.c: Ditto.
* gcc.dg/mvc5.c: Ditto.
* gcc.dg/mvc6.c: Ditto.
* gcc.dg/mvc7.c: Ditto.
* g++.dg/ext/mvc1.C: Ditto.
* g++.dg/ext/mvc2.C: Ditto.
* g++.dg/ext/mvc3.C: Ditto.

gcc/doc
* doc/extend.texi (target_clones): New attribute description.


target_clones.patch
Description: Binary data


Re: [RFC, patch] New attribute to create target clones

2015-08-21 Thread Jeff Law

On 08/20/2015 05:38 PM, Evgeny Stupachenko wrote:

On Mon, Aug 3, 2015 at 9:43 PM, Jeff Law  wrote:

On 07/30/2015 04:19 PM, Evgeny Stupachenko wrote:


Hi All,

The patch enables new attribute 'ctarget',
The attribute force compiler to create clones of a function with the
attribute.

For example:
__attribute__((ctarget("avx","arch=slm","arch=core-avx2","default")))


So presumably we're allowing both changing the ISA and the tuning options?
In fact, it looks like we're able to change any -m option, right?

What about something like -mregparm?  I think the docs need to disallow
clones with different ABI requirements.


-mregparm is not allowed now. The
targetm.target_option.valid_attribute_p hook specify which -m option
is allowed for architecture. Here patch reuses Function
Multiversioning methods.

Ah, OK.  That clarifies things.



I could make an argument for either.  Do we have anything to guide us from
other compilers such as ICC that may have a similar capability?



Not sure. However ICC has similar to Function Multiversioning:
__declcpec(cpu_specific(... where "default" is ""generic". I think for
"default" we should do the same as Function Multiversioning - keep
compiler options. That way users will be able to create target
specific minimum by passing corresponding options to command line.

That works for me.

So I think we really need Jan to chime in here.

Jeff


Re: [RFC, patch] New attribute to create target clones

2015-08-20 Thread Evgeny Stupachenko
On Mon, Aug 3, 2015 at 9:43 PM, Jeff Law  wrote:
> On 07/30/2015 04:19 PM, Evgeny Stupachenko wrote:
>>
>> Hi All,
>>
>> The patch enables new attribute 'ctarget',
>> The attribute force compiler to create clones of a function with the
>> attribute.
>>
>> For example:
>> __attribute__((ctarget("avx","arch=slm","arch=core-avx2","default")))
>
> So presumably we're allowing both changing the ISA and the tuning options?
> In fact, it looks like we're able to change any -m option, right?
>
> What about something like -mregparm?  I think the docs need to disallow
> clones with different ABI requirements.

-mregparm is not allowed now. The
targetm.target_option.valid_attribute_p hook specify which -m option
is allowed for architecture. Here patch reuses Function
Multiversioning methods.

>
>> Currently default ctarget means that foo() will be optimized with
>> current compiler options and target.
>> The other option is to switch target to target specific minimum (like
>> target x86-64). Is it better?
>
> I could make an argument for either.  Do we have anything to guide us from
> other compilers such as ICC that may have a similar capability?
>

Not sure. However ICC has similar to Function Multiversioning:
__declcpec(cpu_specific(... where "default" is ""generic". I think for
"default" we should do the same as Function Multiversioning - keep
compiler options. That way users will be able to create target
specific minimum by passing corresponding options to command line.

>
>>
>> What do you think about attribute name? 'ctarget' is short but not
>> informative. Other variants are 'target_clones', 'targets'...
>
> target_clones seems good.
>
> For multiple_target.c: Can you please trim down the #include set.  I can't
> believe you need all that stuff.If you really need backend stuff (tm.h),
> then use "backend.h" instead.
>
> It generally looks reasonable, but Jan knows the IPA code much better than I
> do and I'd like him to chime in.  You might also ask Ilya to review the
> cloning and rules around clones since he head to deal with a lot of that
> stuff in his MPX work.
>
> jeff


Re: [RFC, patch] New attribute to create target clones

2015-08-03 Thread Jeff Law

On 07/30/2015 04:19 PM, Evgeny Stupachenko wrote:

Hi All,

The patch enables new attribute 'ctarget',
The attribute force compiler to create clones of a function with the attribute.

For example:
__attribute__((ctarget("avx","arch=slm","arch=core-avx2","default")))
So presumably we're allowing both changing the ISA and the tuning 
options?   In fact, it looks like we're able to change any -m option, right?


What about something like -mregparm?  I think the docs need to disallow 
clones with different ABI requirements.



Currently default ctarget means that foo() will be optimized with
current compiler options and target.
The other option is to switch target to target specific minimum (like
target x86-64). Is it better?
I could make an argument for either.  Do we have anything to guide us 
from other compilers such as ICC that may have a similar capability?





What do you think about attribute name? 'ctarget' is short but not
informative. Other variants are 'target_clones', 'targets'...

target_clones seems good.

For multiple_target.c: Can you please trim down the #include set.  I 
can't believe you need all that stuff.If you really need backend 
stuff (tm.h), then use "backend.h" instead.


It generally looks reasonable, but Jan knows the IPA code much better 
than I do and I'd like him to chime in.  You might also ask Ilya to 
review the cloning and rules around clones since he head to deal with a 
lot of that stuff in his MPX work.


jeff


Re: [RFC, patch] New attribute to create target clones

2015-07-31 Thread Jeff Law

On 07/31/2015 05:50 AM, Evgeny wrote:

New ctarget attribute automatically clone function and optimize for
corresponding target. For target attribute if you specify "avx,sse4" you
will get only 1 function optimized for the target with higher priority,
while ctarget will generate 2 functions: one optimized for sse4 and one
for avx.

Ah, thanks for the explanation.  That helps a lot.

jeff


Re: [RFC, patch] New attribute to create target clones

2015-07-30 Thread Jeff Law

On 07/30/2015 04:19 PM, Evgeny Stupachenko wrote:

Hi All,

The patch enables new attribute 'ctarget',
The attribute force compiler to create clones of a function with the attribute.

For example:
__attribute__((ctarget("avx","arch=slm","arch=core-avx2","default")))
int
foo ()

will create 3 clones of foo() optimized for corresponding targets and
replace all foo() calls with ifunc resolver.
The patch include new tests for the feature.

Currently default ctarget means that foo() will be optimized with
current compiler options and target.
The other option is to switch target to target specific minimum (like
target x86-64). Is it better?

What do you think about attribute name? 'ctarget' is short but not
informative. Other variants are 'target_clones', 'targets'...

Below is ChangeLog. Attached patch passed make check on x86.

Thanks,
Evgeny

2015-07-31  Evgeny Stupachenko  

gcc/
 * Makefile.in (OBJS): Add multiple_target.o.
 * multiple_target.c (make_attribute): New.
 (create_dispatcher_calls): Ditto.
 (expand_target_clones): Ditto.
 (ipa_target_clone): Ditto.
 * passes.def (pass_target_clone): New ipa pass.
 * tree-pass.h (make_pass_target_clone): Ditto.

gcc/c-family
 * c-common.c (handle_ctarget_attribute): New.
 * (c_common_attribute_table): Add handle_ctarget_attribute.
 * (handle_always_inline_attribute): Add check on ctarget attribute.
 * (handle_target_attribute): Ditto.

gcc/testsuite
 * gcc.dg/mvc1.c: New test for multiple targets cloning.
 * gcc.dg/mvc2.c: Ditto.
 * gcc.dg/mvc3.c: Ditto.
 * gcc.dg/mvc4.c: Ditto.
 * gcc.dg/mvc5.c: Ditto.
 * gcc.dg/mvc6.c: Ditto.
 * gcc.dg/mvc7.c: Ditto.
 * g++.dg/ext/mvc1.C: Ditto.
 * g++.dg/ext/mvc2.C: Ditto.
 * g++.dg/ext/mvc3.C: Ditto.
I have the same question as Andrew.  How is this fundamentally different 
from the function multi-versioning we currently support?


jeff






Re: [RFC, patch] New attribute to create target clones

2015-07-30 Thread Joseph Myers
On Fri, 31 Jul 2015, Evgeny Stupachenko wrote:

> The patch enables new attribute 'ctarget',

Attributes should be documented in extend.texi.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [RFC, patch] New attribute to create target clones

2015-07-30 Thread Evgeny Stupachenko
You can't apply both 'ctarget' and 'target' attributes to a function.
The patch uses same target specific methods to create dispatcher and
target clones.

On Fri, Jul 31, 2015 at 1:30 AM, Andrew Pinski  wrote:
> On Thu, Jul 30, 2015 at 3:19 PM, Evgeny Stupachenko  
> wrote:
>> Hi All,
>>
>> The patch enables new attribute 'ctarget',
>> The attribute force compiler to create clones of a function with the 
>> attribute.
>>
>> For example:
>> __attribute__((ctarget("avx","arch=slm","arch=core-avx2","default")))
>> int
>> foo ()
>>
>> will create 3 clones of foo() optimized for corresponding targets and
>> replace all foo() calls with ifunc resolver.
>> The patch include new tests for the feature.
>>
>> Currently default ctarget means that foo() will be optimized with
>> current compiler options and target.
>> The other option is to switch target to target specific minimum (like
>> target x86-64). Is it better?
>>
>> What do you think about attribute name? 'ctarget' is short but not
>> informative. Other variants are 'target_clones', 'targets'...
>
> How does this interacts with Function Multiversioning
> (https://gcc.gnu.org/onlinedocs/gcc-5.2.0/gcc/Function-Multiversioning.html)?
>
> Thanks,
> Andrew Pinski
>
>>
>> Below is ChangeLog. Attached patch passed make check on x86.
>>
>> Thanks,
>> Evgeny
>>
>> 2015-07-31  Evgeny Stupachenko  
>>
>> gcc/
>> * Makefile.in (OBJS): Add multiple_target.o.
>> * multiple_target.c (make_attribute): New.
>> (create_dispatcher_calls): Ditto.
>> (expand_target_clones): Ditto.
>> (ipa_target_clone): Ditto.
>> * passes.def (pass_target_clone): New ipa pass.
>> * tree-pass.h (make_pass_target_clone): Ditto.
>>
>> gcc/c-family
>> * c-common.c (handle_ctarget_attribute): New.
>> * (c_common_attribute_table): Add handle_ctarget_attribute.
>> * (handle_always_inline_attribute): Add check on ctarget attribute.
>> * (handle_target_attribute): Ditto.
>>
>> gcc/testsuite
>> * gcc.dg/mvc1.c: New test for multiple targets cloning.
>> * gcc.dg/mvc2.c: Ditto.
>> * gcc.dg/mvc3.c: Ditto.
>> * gcc.dg/mvc4.c: Ditto.
>> * gcc.dg/mvc5.c: Ditto.
>> * gcc.dg/mvc6.c: Ditto.
>> * gcc.dg/mvc7.c: Ditto.
>> * g++.dg/ext/mvc1.C: Ditto.
>> * g++.dg/ext/mvc2.C: Ditto.
>> * g++.dg/ext/mvc3.C: Ditto.


Re: [RFC, patch] New attribute to create target clones

2015-07-30 Thread Andrew Pinski
On Thu, Jul 30, 2015 at 3:19 PM, Evgeny Stupachenko  wrote:
> Hi All,
>
> The patch enables new attribute 'ctarget',
> The attribute force compiler to create clones of a function with the 
> attribute.
>
> For example:
> __attribute__((ctarget("avx","arch=slm","arch=core-avx2","default")))
> int
> foo ()
>
> will create 3 clones of foo() optimized for corresponding targets and
> replace all foo() calls with ifunc resolver.
> The patch include new tests for the feature.
>
> Currently default ctarget means that foo() will be optimized with
> current compiler options and target.
> The other option is to switch target to target specific minimum (like
> target x86-64). Is it better?
>
> What do you think about attribute name? 'ctarget' is short but not
> informative. Other variants are 'target_clones', 'targets'...

How does this interacts with Function Multiversioning
(https://gcc.gnu.org/onlinedocs/gcc-5.2.0/gcc/Function-Multiversioning.html)?

Thanks,
Andrew Pinski

>
> Below is ChangeLog. Attached patch passed make check on x86.
>
> Thanks,
> Evgeny
>
> 2015-07-31  Evgeny Stupachenko  
>
> gcc/
> * Makefile.in (OBJS): Add multiple_target.o.
> * multiple_target.c (make_attribute): New.
> (create_dispatcher_calls): Ditto.
> (expand_target_clones): Ditto.
> (ipa_target_clone): Ditto.
> * passes.def (pass_target_clone): New ipa pass.
> * tree-pass.h (make_pass_target_clone): Ditto.
>
> gcc/c-family
> * c-common.c (handle_ctarget_attribute): New.
> * (c_common_attribute_table): Add handle_ctarget_attribute.
> * (handle_always_inline_attribute): Add check on ctarget attribute.
> * (handle_target_attribute): Ditto.
>
> gcc/testsuite
> * gcc.dg/mvc1.c: New test for multiple targets cloning.
> * gcc.dg/mvc2.c: Ditto.
> * gcc.dg/mvc3.c: Ditto.
> * gcc.dg/mvc4.c: Ditto.
> * gcc.dg/mvc5.c: Ditto.
> * gcc.dg/mvc6.c: Ditto.
> * gcc.dg/mvc7.c: Ditto.
> * g++.dg/ext/mvc1.C: Ditto.
> * g++.dg/ext/mvc2.C: Ditto.
> * g++.dg/ext/mvc3.C: Ditto.


Re: [RFC, patch] New attribute to create target clones

2015-07-30 Thread Evgeny Stupachenko
Forget C++ tests.

On Fri, Jul 31, 2015 at 1:19 AM, Evgeny Stupachenko  wrote:
> Hi All,
>
> The patch enables new attribute 'ctarget',
> The attribute force compiler to create clones of a function with the 
> attribute.
>
> For example:
> __attribute__((ctarget("avx","arch=slm","arch=core-avx2","default")))
> int
> foo ()
>
> will create 3 clones of foo() optimized for corresponding targets and
> replace all foo() calls with ifunc resolver.
> The patch include new tests for the feature.
>
> Currently default ctarget means that foo() will be optimized with
> current compiler options and target.
> The other option is to switch target to target specific minimum (like
> target x86-64). Is it better?
>
> What do you think about attribute name? 'ctarget' is short but not
> informative. Other variants are 'target_clones', 'targets'...
>
> Below is ChangeLog. Attached patch passed make check on x86.
>
> Thanks,
> Evgeny
>
> 2015-07-31  Evgeny Stupachenko  
>
> gcc/
> * Makefile.in (OBJS): Add multiple_target.o.
> * multiple_target.c (make_attribute): New.
> (create_dispatcher_calls): Ditto.
> (expand_target_clones): Ditto.
> (ipa_target_clone): Ditto.
> * passes.def (pass_target_clone): New ipa pass.
> * tree-pass.h (make_pass_target_clone): Ditto.
>
> gcc/c-family
> * c-common.c (handle_ctarget_attribute): New.
> * (c_common_attribute_table): Add handle_ctarget_attribute.
> * (handle_always_inline_attribute): Add check on ctarget attribute.
> * (handle_target_attribute): Ditto.
>
> gcc/testsuite
> * gcc.dg/mvc1.c: New test for multiple targets cloning.
> * gcc.dg/mvc2.c: Ditto.
> * gcc.dg/mvc3.c: Ditto.
> * gcc.dg/mvc4.c: Ditto.
> * gcc.dg/mvc5.c: Ditto.
> * gcc.dg/mvc6.c: Ditto.
> * gcc.dg/mvc7.c: Ditto.
> * g++.dg/ext/mvc1.C: Ditto.
> * g++.dg/ext/mvc2.C: Ditto.
> * g++.dg/ext/mvc3.C: Ditto.


ctarget_attribute_C++tests.patch
Description: Binary data


[RFC, patch] New attribute to create target clones

2015-07-30 Thread Evgeny Stupachenko
Hi All,

The patch enables new attribute 'ctarget',
The attribute force compiler to create clones of a function with the attribute.

For example:
__attribute__((ctarget("avx","arch=slm","arch=core-avx2","default")))
int
foo ()

will create 3 clones of foo() optimized for corresponding targets and
replace all foo() calls with ifunc resolver.
The patch include new tests for the feature.

Currently default ctarget means that foo() will be optimized with
current compiler options and target.
The other option is to switch target to target specific minimum (like
target x86-64). Is it better?

What do you think about attribute name? 'ctarget' is short but not
informative. Other variants are 'target_clones', 'targets'...

Below is ChangeLog. Attached patch passed make check on x86.

Thanks,
Evgeny

2015-07-31  Evgeny Stupachenko  

gcc/
* Makefile.in (OBJS): Add multiple_target.o.
* multiple_target.c (make_attribute): New.
(create_dispatcher_calls): Ditto.
(expand_target_clones): Ditto.
(ipa_target_clone): Ditto.
* passes.def (pass_target_clone): New ipa pass.
* tree-pass.h (make_pass_target_clone): Ditto.

gcc/c-family
* c-common.c (handle_ctarget_attribute): New.
* (c_common_attribute_table): Add handle_ctarget_attribute.
* (handle_always_inline_attribute): Add check on ctarget attribute.
* (handle_target_attribute): Ditto.

gcc/testsuite
* gcc.dg/mvc1.c: New test for multiple targets cloning.
* gcc.dg/mvc2.c: Ditto.
* gcc.dg/mvc3.c: Ditto.
* gcc.dg/mvc4.c: Ditto.
* gcc.dg/mvc5.c: Ditto.
* gcc.dg/mvc6.c: Ditto.
* gcc.dg/mvc7.c: Ditto.
* g++.dg/ext/mvc1.C: Ditto.
* g++.dg/ext/mvc2.C: Ditto.
* g++.dg/ext/mvc3.C: Ditto.


ctarget_attribute.patch
Description: Binary data