https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82803

Yann Droneaud <yann at droneaud dot fr> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |yann at droneaud dot fr

--- Comment #8 from Yann Droneaud <yann at droneaud dot fr> ---
Created attachment 46903
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46903&action=edit
An artificial test case for gcc to emit 17 calls to __tls_get_addr()

Using Thread Local Storage (TLS) is a pain: the issue reported here still applies
to the latest GCC.

I have code such as:

  static struct state *state(void) __attribute__((pure));
  static struct state *state(void)
  {
      static __thread struct state s;

      return &s;
  }

  int do_work(void)   /* renamed: "do" is a reserved keyword in C */
  {
      struct state * const s = state();
      int res;

      /* do something */

      return res;
  }

Once compiled, the code for my real function contains 6 calls to
__tls_get_addr(), which is far more than expected and far more than necessary.
Clang compiles the same code and emits a single call to __tls_get_addr(). Both
were built on Linux amd64 with -O3 -fPIC.

The attached testcase is an example which is designed to trigger 17 calls to
__tls_get_addr(). As you will see, there's about one per conditional + function
call pair.
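
For readers without the attachment at hand, here is a minimal sketch of the
"conditional + function call" pattern described above. It is not the attached
testcase: the struct layout and the names step() and run() are hypothetical,
and whether a given GCC version emits one __tls_get_addr() call per pair for
this exact code has not been verified here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout: the report does not show struct state. */
struct state {
    int counter;
};

/* Accessor for the per-thread state, declared pure as in the report. */
static struct state *state(void) __attribute__((pure));
static struct state *state(void)
{
    static __thread struct state s;

    return &s;
}

/* Hypothetical helper, kept out of line so that every use below
 * stays a real function call. */
__attribute__((noinline)) int step(struct state *s, int v)
{
    assert(s != NULL);
    s->counter += v;
    return s->counter;
}

/* The TLS address is taken once, then reused after each conditional.
 * Each "if + call" pair is a point where the compiler may choose to
 * recompute the address instead of keeping s in a register. */
int run(int a, int b, int c)
{
    struct state * const s = state();
    int res = 0;

    if (a)
        res += step(s, a);
    if (b)
        res += step(s, b);
    if (c)
        res += step(s, c);

    return res;
}
```

Compiling this with -O3 -fPIC and counting the __tls_get_addr references in
the assembly output for run() should show whether the address is computed
once or once per pair.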

Once again, clang is able to emit code with a single call to __tls_get_addr().

You can check for yourself: https://godbolt.org/z/QVGjka
