On Fri, Dec 6, 2013 at 8:33 PM, Sriraman Tallam <tmsri...@google.com> wrote:
> Patch updated with two more tests to check if the vfmadd insn is being
> produced when possible.
>
> Thanks
> Sri
>
> On Fri, Dec 6, 2013 at 11:12 AM, Sriraman Tallam <tmsri...@google.com> wrote:
>> Hi,
>>
>>     I have attached a patch to fix
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59390. Please review.
>>
>> Here is the problem. GCC adds target-specific builtins on demand. The
>> FMA target-specific builtin __builtin_ia32_vfmaddpd gets added via
>> this declaration:
>>
>> void fun() __attribute__((target("fma")));
>>
>> Specifically, the builtin __builtin_ia32_vfmaddpd gets added when
>> ix86_add_new_builtins is called from ix86_valid_target_attribute_tree
>> when processing this target attribute.
>>
>> Now, when the vectorizer is processing the builtin "__builtin_fma" in
>> function other_fn(), it checks to see if this function is vectorizable
>> and calls ix86_builtin_vectorized_function in i386.c. That returns the
>> builtin stored here:
>>
>>
>> case BUILT_IN_FMA:
>> if (out_mode == DFmode && in_mode == DFmode)
>> {
>>  if (out_n == 2 && in_n == 2)
>>    return ix86_builtins[IX86_BUILTIN_VFMADDPD];
>>           ....
>>
>> ix86_builtins[IX86_BUILTIN_VFMADDPD] would have contained NULL_TREE
>> had the builtin not been added by the previous target attribute. That
>> is why the code works if we remove the previous declaration.
>>
>> The fix is to not just return the builtin but to also check if the
>> current function's isa allows the use of the builtin. For instance,
>> this patch would solve the problem:
>>
>> @@ -33977,7 +33977,13 @@ ix86_builtin_vectorized_function (tree fndecl, tre
>>        if (out_mode == DFmode && in_mode == DFmode)
>>   {
>>    if (out_n == 2 && in_n == 2)
>> -    return ix86_builtins[IX86_BUILTIN_VFMADDPD];
>> +    {
>> +      if (ix86_builtins_isa[IX86_BUILTIN_VFMADDPD].isa
>> +  & global_options.x_ix86_isa_flags)
>> +        return ix86_builtins[IX86_BUILTIN_VFMADDPD];
>> +      else
>> + return NULL_TREE;
>> +    }
>>
>>
>> but there are many instances of this usage in
>> ix86_builtin_vectorized_function. This patch covers all the cases.

>     PR target/59390
>     * gcc.target/i386/pr59390.c: New test.
>     * gcc.target/i386/pr59390_1.c: New test.
>     * gcc.target/i386/pr59390_2.c: New test.
>     * config/i386/i386.c (get_builtin): New function.
>     (ix86_builtin_vectorized_function): Replace all instances of
>     ix86_builtins[...] with get_builtin(...).
>     (ix86_builtin_reciprocal): Ditto.

OK, with a couple of nits:

+static tree get_builtin (enum ix86_builtins code)

Please name this function ix86_get_builtin.

+  if (current_function_decl)
+    target_tree = DECL_FUNCTION_SPECIFIC_TARGET (current_function_decl);
+  if (target_tree)
+    opts = TREE_TARGET_OPTION (target_tree);
+  else
+    opts = TREE_TARGET_OPTION (target_option_default_node);
+

opts = TREE_TARGET_OPTION (target_tree ? target_tree :
target_option_default_node);

Thanks,
Uros.

Reply via email to