https://gcc.gnu.org/bugzilla/show_bug.cgi?id=125736

--- Comment #4 from LIU Hao <lh_mouse at 126 dot com> ---
I realize that there's a corner case: [eax*8+0x1480] truncates the result to
32 bits if the result overflows a 32-bit address, while [rax*8+0x1480] doesn't.
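
To make the difference concrete, here is a minimal sketch in plain C that models
the two effective-address computations (before the gs base is added). The helper
names ea_addr32 and ea_addr64 are hypothetical and exist only for illustration;
the sketch assumes the index register holds a zero-extended 32-bit value.

```c
#include <stdint.h>

/* Hypothetical model of [eax*8+0x1480]: with a 32-bit address size the
   index is the low 32 bits of the register and the sum wraps modulo 2^32
   before being zero-extended. */
static uint64_t ea_addr32(uint64_t index)
{
    uint32_t low = (uint32_t) index;
    return (uint32_t) (low * 8u + 0x1480u);
}

/* Hypothetical model of [rax*8+0x1480]: the full 64-bit sum is kept. */
static uint64_t ea_addr64(uint64_t index)
{
    return index * 8u + 0x1480u;
}
```

For a small index such as 1087, both helpers return 0x3678. For an index of
0x20000000, ea_addr32 wraps to 0x1480 while ea_addr64 yields 0x100001480, so the
two addressing forms are only interchangeable when the index is known to be
small enough.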

Here is a proper testcase:
(https://gcc.godbolt.org/z/8zn75r5Eh)
```
typedef __UINT32_TYPE__ uint32_t;
typedef __UINTPTR_TYPE__ uintptr_t;

inline
uintptr_t
read_gs_ptr(uint32_t offset)
  {
    uintptr_t value;
#ifdef __clang__
    value = *(uintptr_t __seg_gs*)(uintptr_t) offset;
#else
    __asm__ ("gs mov %k0, %a1" 
             : "=r"(value) : "Ts"((uintptr_t) offset) 
             : "memory");
#endif
    return value;
  }

extern uint32_t tls_index;

uintptr_t
read_tls(void)
  {
    if(tls_index >= 1088) 
      __builtin_unreachable();

    return read_gs_ptr(0x1480 + tls_index * sizeof(uintptr_t));
  }
```

The __builtin_unreachable() call asserts that the address can't overflow 32
bits, and now Clang generates

```
read_tls:
 mov    eax,DWORD PTR [rip+0x0]        # 6 <read_tls+0x6>
    R_X86_64_PC32 tls_index-0x4
 mov    rax,QWORD PTR gs:[rax*8+0x1480]
 ret
```

but GCC still generates an address-size override prefix:

```
read_tls:
 mov    eax,DWORD PTR [rip+0x0]        # 6 <read_tls+0x6>
    R_X86_64_PC32 tls_index-0x4
 mov    eax,DWORD PTR gs:[eax*8+0x1480]
 ret
```
