On Fri, Dec 6, 2013 at 8:33 PM, Sriraman Tallam <tmsri...@google.com> wrote: > Patch updated with two more tests to check if the vfmadd insn is being > produced when possible. > > Thanks > Sri > > On Fri, Dec 6, 2013 at 11:12 AM, Sriraman Tallam <tmsri...@google.com> wrote: >> Hi, >> >> I have attached a patch to fix >> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59390. Please review. >> >> Here is the problem. GCC adds target-specific builtins on demand. The >> FMA target-specific builtin __builtin_ia32_vfmaddpd gets added via >> this declaration: >> >> void fun() __attribute__((target("fma"))); >> >> Specifically, the builtin __builtin_ia32_vfmaddpd gets added when >> ix86_add_new_builtins is called from ix86_valid_target_attribute_tree >> when processing this target attribute. >> >> Now, when the vectorizer is processing the builtin "__builtin_fma" in >> function other_fn(), it checks to see if this function is vectorizable >> and calls ix86_builtin_vectorized_function in i386.c. That returns the >> builtin stored here: >> >> >> case BUILT_IN_FMA: >> if (out_mode == DFmode && in_mode == DFmode) >> { >> if (out_n == 2 && in_n == 2) >> return ix86_builtins[IX86_BUILTIN_VFMADDPD]; >> .... >> >> ix86_builtins[IX86_BUILTIN_VFMADDPD] would have contained NULL_TREE >> had the builtin not been added by the previous target attribute. That >> is why the code works if we remove the previous declaration. >> >> The fix is to not just return the builtin but to also check if the >> current function's isa allows the use of the builtin. For instance, >> this patch would solve the problem: >> >> @@ -33977,7 +33977,13 @@ ix86_builtin_vectorized_function (tree fndecl, tre >> if (out_mode == DFmode && in_mode == DFmode) >> { >> if (out_n == 2 && in_n == 2) >> - return ix86_builtins[IX86_BUILTIN_VFMADDPD]; >> + { >> + if (ix86_builtins_isa[IX86_BUILTIN_VFMADDPD].isa >> + & global_options.x_ix86_isa_flags) >> + return ix86_builtins[IX86_BUILTIN_VFMADDPD]; >> + else >> + return NULL_TREE; >> + } >> >> >> but there are many instances of this usage in >> ix86_builtin_vectorized_function. This patch covers all the cases.
> PR target/59390 > * gcc.target/i386/pr59390.c: New test. > * gcc.target/i386/pr59390_1.c: New test. > * gcc.target/i386/pr59390_2.c: New test. > * config/i386/i386.c (get_builtin): New function. > (ix86_builtin_vectorized_function): Replace all instances of > ix86_builtins[...] with get_builtin(...). > (ix86_builtin_reciprocal): Ditto. OK, with a couple of nits: +static tree get_builtin (enum ix86_builtins code) Please name this function ix86_get_builtin. + if (current_function_decl) + target_tree = DECL_FUNCTION_SPECIFIC_TARGET (current_function_decl); + if (target_tree) + opts = TREE_TARGET_OPTION (target_tree); + else + opts = TREE_TARGET_OPTION (target_option_default_node); + opts = TREE_TARGET_OPTION (target_tree ? target_tree : target_option_default_node); Thanks, Uros.