MaskRay requested changes to this revision. MaskRay added a comment. This revision now requires changes to proceed.
In D108421#2977107 <https://reviews.llvm.org/D108421#2977107>, @kamleshbhalui wrote: > In D108421#2958848 <https://reviews.llvm.org/D108421#2958848>, @MaskRay wrote: > >> If you read the comment in TargetMachine::shouldAssumeDSOLocal: this is the >> wrong direction. dso_local is assumed to be marked by the frontend. >> >> Direct accesses and GOT accesses are trade-offs. Direct accesses are not >> always preferred. The MachO special case is an unfortunate case for their >> xnu kernel, not a good example here. > > @MaskRay I would like to know more about these trade-offs details, any > supporting documents will do. > Considering below testcase, can you shed some light how code generated by > llc-12 is better than llc-11 for given testcase? > https://godbolt.org/z/x9xojWb58 This is a very minor issue. First, global variable access is rarely a performance bottleneck. Second, if the symbol turns out to be non-preemptible (which implies that it is defined in the component), the R_X86_64_REX_GOTPCRELX GOT indirection can be optimized out. The only minor issue is slightly longer code sequence. > And FYI this testcase does not work when build as Linux Kernel Module. LKM > loader throw error("Unknown rela relocation: 42")? This is a kernel issue. Please mention the justification (is it related to OpenMP?) in the first place. The kernel can be compiled in -fpie mode. In this mode, taking the address of a default-visibility undefined symbol uses R_X86_64_REX_GOTPCRELX. So the kernel should support R_X86_64_REX_GOTPCRELX anyway. --- If we think the optimization is meaningful: Depending on the property of `.gomp_critical_user_.atomic_reduction.var` different DSOLocal strategies should be used. If it is TU-local, make it local linkage. If it is linked image local, make it hidden visibility. If it may be defined in a shared object and shared with other shared objects or the main executable, (not so sure because such symbol interposition does not work in other binary formats), use dso_preemptable as is. I believe the current forced dso_local is definitely incorrect because it may break `-fpic -shared` links. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D108421/new/ https://reviews.llvm.org/D108421 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits