On 10/27/2012 09:16 PM, Sriraman Tallam wrote:
+ /* See if there's a match. For functions that are multi-versioned,
+ all the versions match. */
if (same_type_p (target_fn_type, static_fn_type (fn)))
- matches = tree_cons (fn, NULL_TREE, matches);
+ {
+ matches = tree_cons (fn, NULL_TREE, matches);
+ /*If versioned, push all possible versions into a vector. */
+ if (DECL_FUNCTION_VERSIONED (fn))
+ {
+ if (fn_ver_vec == NULL)
+ fn_ver_vec = VEC_alloc (tree, heap, 2);
+ VEC_safe_push (tree, heap, fn_ver_vec, fn);
+ }
+ }
Why do we need to keep both a list and vector of the matches?
+ Call decls_match to make sure they are different because they are
+ versioned. */
+ if (DECL_FUNCTION_VERSIONED (fn))
+ {
+ for (match = TREE_CHAIN (matches); match; match = TREE_CHAIN (match))
+ if (decls_match (fn, TREE_PURPOSE (match)))
+ break;
+ }
What if you have multiple matches that aren't all versions of the same
function?
Why would it be a problem to have two separate declarations of the same
function?
+ dispatcher_decl = targetm.get_function_versions_dispatcher (fn_ver_vec);
Is the idea here that if you have some versions declared, then a call,
then more versions declared, then another call, you will call two
different dispatchers, where the first one will only dispatch to the
versions declared before the first call? If not, why do we care about
the set of declarations at this point?
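To make the scenario in this question concrete, here is a minimal toy model, not GCC code. `VersionedFunction`, `Dispatcher`, and `dispatcher_at_call` are hypothetical names invented for illustration. It assumes the dispatcher simply snapshots whatever versions have been declared when a call is seen, which may or may not be what the patch does.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model: a "dispatcher" is just the list of version names it
// can choose between. None of these names come from GCC.
struct Dispatcher {
    std::vector<std::string> versions;
};

// Versions of one function declared so far in the translation unit.
struct VersionedFunction {
    std::vector<std::string> declared;

    void declare(const std::string& target) { declared.push_back(target); }

    // Snapshot approach (an assumption): the dispatcher captures only the
    // versions that have been declared when the call is seen.
    Dispatcher dispatcher_at_call() const { return Dispatcher{declared}; }
};

// Replays the sequence from the question: two declarations, a call,
// one more declaration, then another call.
inline bool calls_see_same_versions() {
    VersionedFunction foo;
    foo.declare("default");
    foo.declare("sse4.2");
    Dispatcher first = foo.dispatcher_at_call();   // first call site
    foo.declare("avx");
    Dispatcher second = foo.dispatcher_at_call();  // second call site
    assert(first.versions.size() == 2);
    assert(second.versions.size() == 3);
    return first.versions == second.versions;
}
```

Under that assumption the two call sites end up with different dispatchers, and the first one can never reach the version declared after it.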
+ /* Mark this functio to be output. */
+ node->local.finalized = 1;
Missing 'n' in "function".
@@ -14227,7 +14260,11 @@ cxx_comdat_group (tree decl)
else
break;
}
- name = DECL_ASSEMBLER_NAME (decl);
+ if (TREE_CODE (decl) == FUNCTION_DECL
+ && DECL_FUNCTION_VERSIONED (decl))
+ name = DECL_NAME (decl);
This would mean that f in the global namespace and f in namespace foo
would end up in the same comdat group. Why do we need special handling
here at all?
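The collision can be sketched with a toy model. `Decl`, `bare_name`, and `mangled_name` are hypothetical stand-ins for a FUNCTION_DECL, DECL_NAME, and DECL_ASSEMBLER_NAME. The mangling below is a simplified Itanium-style scheme that only handles functions with no parameters, so it is an illustration rather than what the compiler actually runs.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a FUNCTION_DECL: the enclosing namespaces
// (outermost first) plus the bare identifier.
struct Decl {
    std::vector<std::string> namespaces;
    std::string name;
};

// Analogue of DECL_NAME: the bare identifier, with no scope information.
inline std::string bare_name(const Decl& d) { return d.name; }

// Simplified Itanium-style mangling, valid only for functions that take no
// parameters. Substitutions and special cases are deliberately ignored.
inline std::string mangled_name(const Decl& d) {
    assert(!d.name.empty());
    std::string id = std::to_string(d.name.size()) + d.name;
    if (d.namespaces.empty())
        return "_Z" + id + "v";               // e.g. ::f()
    std::string out = "_ZN";
    for (const std::string& ns : d.namespaces)
        out += std::to_string(ns.size()) + ns;
    return out + id + "Ev";                   // e.g. foo::f()
}
```

With `Decl{{}, "f"}` and `Decl{{"foo"}, "f"}`, `bare_name` returns the same key for both, while `mangled_name` keeps them apart. That is why a comdat group keyed on the bare name would merge two unrelated functions.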
dump_function_name (tree t, int flags)
{
- tree name = DECL_NAME (t);
+ tree name;
+ /* For function versions, use the assembler name as the decl name is
+ the same for all versions. */
+ if (TREE_CODE (t) == FUNCTION_DECL
+ && DECL_FUNCTION_VERSIONED (t))
+ name = DECL_ASSEMBLER_NAME (t);
This shouldn't be necessary; we should print the target attribute when
printing the function declaration.
+ Also, mark this function as needed if it is marked inline but
+ is a multi-versioned function. */
+ if (((flag_keep_inline_functions
+ || DECL_FUNCTION_VERSIONED (fn))
This should be marked as needed by the code that builds the dispatcher.
+ /* For calls to a multi-versioned function, overload resolution
+ returns the function with the highest target priority, that is,
+ the version that will checked for dispatching first. If this
+ version is inlinable, a direct call to this version can be made
+ otherwise the call should go through the dispatcher. */
I'm a bit confused why people would want both dispatched calls and
non-dispatched inlining; I would expect that if a function can be
compiled differently enough on newer hardware to make versioning
worthwhile, that would be a larger difference than the call overhead.
+ if (DECL_FUNCTION_VERSIONED (fn)
+ && !targetm.target_option.can_inline_p (current_function_decl, fn))
+ {
+ struct cgraph_node *dispatcher_node = NULL;
+ fn = get_function_version_dispatcher (fn);
+ if (fn == NULL)
+ return NULL;
+ dispatcher_node = cgraph_get_create_node (fn);
+ gcc_assert (dispatcher_node != NULL);
+ /* Mark this function to be output. */
+ dispatcher_node->local.finalized = 1;
+ }
Why do you need to mark this here? If you generate a call to the
dispatcher, cgraph should mark it to be output automatically.
+ /* For candidates of a multi-versioned function, make the version with
+ the highest priority win. This version will be checked for dispatching
+ first. If this version can be inlined into the caller, the front-end
+ will simply make a direct call to this function. */
This is still too high in joust. I believe I said before that this code
should come just above this comment:
/* If the two function declarations represent the same function
(this can happen with declarations in multiple scopes and
arg-dependent lookup), arbitrarily choose one. But first make
sure the default args we're using match. */
+ /* For multiversioned functions, aggregate all the versions here for
+ generating the dispatcher body later if necessary. Check to see if
+ the dispatcher is already generated to avoid doing this more than
+ once. */
This caching seems to assume that you'll always be considering the same
group of declarations, which goes back to my earlier question.
Jason