----------  Forwarded Message  ----------

Subject: [CPUblog] Gnu support for CPU dispatching - sort of...
Date: Friday 08 July 2011, 14:34:24
From: "Agner's CPU blog" <cpub...@agner.org>
To: cpub...@agner.org

This message was sent from messageboard Agner's CPU blog
Subject: Gnu support for CPU dispatching - sort of...
Author: Agner
Go to message <http://agner.org/optimize/blog/read.php?i=167>

The Gnu tools have added a new feature for automatic CPU dispatching. 
This means that you can have multiple versions of the same function, 
each optimized for a different CPU or a different instruction set. For 
example, you may want to have three different versions of an important 
library function: one that is compatible with any old CPU, a better one 
for CPUs with SSE2, and a still better one for CPUs with the AVX 
instruction set.
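
As a rough illustration of the selection step for the three-version example, here is a minimal sketch. The names IsaLevel, select_level and detect_level are hypothetical and not part of any library. It assumes a GCC or Clang compiler on an x86 target, where the builtins __builtin_cpu_init and __builtin_cpu_supports are available, and falls back to the generic version elsewhere.

```cpp
// Instruction-set levels, ordered from least to most capable.
// (Hypothetical names, used only for this sketch.)
enum IsaLevel { ISA_GENERIC = 0, ISA_SSE2 = 1, ISA_AVX = 2 };

// Pure selection logic: pick the best level among those reported as available.
IsaLevel select_level(bool has_sse2, bool has_avx) {
    if (has_avx)  return ISA_AVX;
    if (has_sse2) return ISA_SSE2;
    return ISA_GENERIC;
}

// Query the running CPU. The builtins below are GCC/Clang extensions for
// x86 targets; on other compilers or targets this sketch simply reports
// the generic level.
IsaLevel detect_level() {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return select_level(__builtin_cpu_supports("sse2") != 0,
                        __builtin_cpu_supports("avx") != 0);
#else
    return ISA_GENERIC;
#endif
}
```

A dispatcher would call detect_level() once and then hand out the matching function version. Keeping select_level free of CPU queries makes the choice easy to test on any machine.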

This feature, called "Gnu indirect function" was introduced two years 
ago (link <http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40528>). Since 
then, I have waited impatiently for an implementation that works. Now I 
have discovered that this feature is actually used for a few functions 
in the standard function library (glibc v. 2.13). The official 
documentation (link 
<http://gcc.gnu.org/onlinedocs/gcc/Function-Attributes.html>) says that 
you can use |__attribute__ ((ifunc("name_of_dispatch_function")))|, but 
this doesn't work.

After some experimentation, I found that the method shown below actually works:

// Example of Gnu indirect function
#include <stdio.h>
#include <time.h>

// Define different versions of my function
int myfunc1() {
    return 1;
}

int myfunc2() {
    return 2;
}

// Prototype for the common entry point
extern "C" int myfunc();
__asm__ (".type myfunc, @gnu_indirect_function");

// Make the dispatcher function. This returns a pointer to the desired
// function version
typeof(myfunc) * myfunc_dispatch (void) __asm__ ("myfunc");
typeof(myfunc) * myfunc_dispatch (void)  {

    if (time(0) & 1) {
       // If time is odd at first call, use version 1
       return &myfunc1;
    }
    else {
       // else use version 2
       return &myfunc2;
    }
}

int main() {
    // Test the call to myfunc
    printf("\nCalled function number %i\n", myfunc());
    return 0;
}
The function call is resolved via the normal procedure linkage table 
(PLT). The PLT entry is changed to point to the desired version of the 
function, either at load time or at the first call. The PLT initially 
points to |myfunc_dispatch|. This function is called only once, and the 
return value from |myfunc_dispatch| replaces its own entry in the PLT.
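
The call-once-then-patch behavior described above can be imitated in portable C++ with an ordinary function pointer. The sketch below is only an analogy, not the PLT mechanism itself: the dispatcher patches the pointer and forwards the first call on its own, where an indirect function leaves the patching to the loader. The variables use_version2 and dispatch_calls are hypothetical, added only so the effect can be observed, and the sketch is not thread-safe.

```cpp
// Two hypothetical versions of the same function.
static int myfunc1() { return 1; }
static int myfunc2() { return 2; }

// Hypothetical selection criterion; a real dispatcher would test CPU features.
static bool use_version2 = false;

// Counts how often the dispatcher runs (for illustration only).
static int dispatch_calls = 0;

static int myfunc_dispatch();

// Common entry point. It starts out pointing at the dispatcher,
// much like the initial PLT entry.
static int (*myfunc)() = myfunc_dispatch;

// Runs on the first call only: it overwrites the entry point with the
// chosen version and then forwards the call to that version.
static int myfunc_dispatch() {
    ++dispatch_calls;
    myfunc = use_version2 ? myfunc2 : myfunc1;
    return myfunc();
}
```

After the first call through myfunc, every later call goes straight to the selected version at the cost of one indirect call, and the dispatcher is never entered again.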

The "Gnu indirect function" feature requires support in the assembler, 
linker and loader, which is found in binutils version 2.20 and later.

The Gnu standard library glibc uses this feature to implement multiple 
versions of a few memory and string functions, including |memmove, 
memset, memcmp, strcmp, strstr|, but - strangely - not the most 
important one: |memcpy|.

This feature can be useful for anybody who wants to make a highly 
optimized function library for Linux. It is not possible in Windows, but 
it may be implemented in BSD and Mac systems. See my manual Optimizing 
software in C++ <http://agner.org/optimize/#manuals> for a method that 
works on all platforms.


You received this message because you are subscribed to the Google Groups 
"mpir-devel" group.
To post to this group, send email to mpir-devel@googlegroups.com.