https://gcc.gnu.org/bugzilla/show_bug.cgi?id=125736
--- Comment #4 from LIU Hao <lh_mouse at 126 dot com> ---
I realize that there's a corner case: [eax*8+0x1480] truncates the effective
address to 32 bits if the computation overflows a 32-bit address, while
[rax*8+0x1480] doesn't.

Here is a proper testcase: (https://gcc.godbolt.org/z/8zn75r5Eh)

```
typedef __UINT32_TYPE__ uint32_t;
typedef __UINTPTR_TYPE__ uintptr_t;

inline uintptr_t read_gs_ptr(uint32_t offset)
{
  uintptr_t value;
#ifdef __clang__
  value = *(uintptr_t __seg_gs*)(uintptr_t) offset;
#else
  __asm__ ("gs mov %k0, %a1" : "=r"(value) : "Ts"((uintptr_t) offset) : "memory");
#endif
  return value;
}

extern uint32_t tls_index;

uintptr_t read_tls(void)
{
  if(tls_index >= 1088)
    __builtin_unreachable();

  return read_gs_ptr(0x1480 + tls_index * sizeof(uintptr_t));
}
```

This asserts that the address can't overflow, and now Clang generates

```
read_tls:
  mov eax,DWORD PTR [rip+0x0]        # 6 <read_tls+0x6>
        R_X86_64_PC32 tls_index-0x4
  mov rax,QWORD PTR gs:[rax*8+0x1480]
  ret
```

but GCC still generates an address-size override prefix:

```
read_tls:
  mov eax,DWORD PTR [rip+0x0]        # 6 <read_tls+0x6>
        R_X86_64_PC32 tls_index-0x4
  mov eax,DWORD PTR gs:[eax*8+0x1480]
  ret
```
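
To illustrate the corner case described above, here is a minimal sketch in plain C that models the two effective-address computations. The helper names `ea32` and `ea64` are hypothetical and are not part of the testcase; they only mimic the wrap-around arithmetic and assume the index register was zero-extended by a preceding 32-bit load.

```c
#include <stdint.h>

/* Hypothetical model of [eax*8+0x1480]: with 32-bit address size the
   effective address is computed modulo 2^32. */
static uint64_t ea32(uint32_t index)
{
  return (uint32_t) (index * 8u + 0x1480u);
}

/* Hypothetical model of [rax*8+0x1480]: rax holds the zero-extended
   index and the effective address is computed in 64 bits, so there is
   no wrap-around at 2^32. */
static uint64_t ea64(uint32_t index)
{
  return (uint64_t) index * 8u + 0x1480u;
}
```

For any index below 1088 the two forms agree (for example, index 1087 gives 0x3678 in both), so the 64-bit form is a valid replacement once the range is known. For an index such as 0x20000000 they differ: `ea32` wraps to 0x1480 while `ea64` yields 0x100001480.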
