leezu edited a comment on issue #17045: Relocation truncation issues
URL: https://github.com/apache/incubator-mxnet/issues/17045#issuecomment-564558015

Looking at the `cmake -GNinja -DUSE_SIGNAL_HANDLER=ON -DUSE_CUDA=ON -DUSE_CUDNN=ON -DUSE_TVM_OP=ON -DPython3_EXECUTABLE=/usr/bin/python3 -DUSE_MKL_IF_AVAILABLE=OFF -DUSE_MKLDNN=OFF -DUSE_DIST_KVSTORE=ON -DCMAKE_BUILD_TYPE=Release -DCUDA_ARCH_NAME=Manual -DUSE_INT64_TENSOR_SIZE=ON ..` build with #17031, I make the following observations. (That PR does not yet support `-DCUDA_ARCH_BIN=52,70`, so the build covers all common archs, the binary is even larger, and we get a few more errors below. They still serve to illustrate the problem.)

"By default" it fails like

```
libmxnet.a(utils.cc.o): In function `mxnet::common::ExecuteMonInputCallback(nnvm::IndexedGraph const&, std::vector<mxnet::NDArray*, std::allocator<mxnet::NDArray*> > const&, unsigned long, std::function<void (char const*, char const*, void*)> const&)':
utils.cc:(.text+0xa5d): relocation truncated to fit: R_X86_64_PC32 against `.bss'
utils.cc:(.text+0xa6c): relocation truncated to fit: R_X86_64_PC32 against `.bss'
utils.cc:(.text+0xb48): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against symbol `vsnprintf@@GLIBC_2.2.5' defined in .text section in /lib/x86_64-linux-gnu/libc.so.6
utils.cc:(.text+0xd86): relocation truncated to fit: R_X86_64_GOTPCREL against symbol `__pthread_key_create@@GLIBC_2.2.5' defined in .text section in /lib/x86_64-linux-gnu/libpthread.so.0
utils.cc:(.text+0xeab): relocation truncated to fit: R_X86_64_GOTPCREL against symbol `__pthread_key_create@@GLIBC_2.2.5' defined in .text section in /lib/x86_64-linux-gnu/libpthread.so.0
utils.cc:(.text+0x1665): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against symbol `vtable for std::basic_ios<char, std::char_traits<char> >@@GLIBCXX_3.4' defined in .data.rel.ro section in /usr/lib/gcc/x86_64-linux-gnu/7/libstdc++.so
utils.cc:(.text+0x169d): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against symbol `VTT for std::__cxx11::basic_ostringstream<char, std::char_traits<char>, std::allocator<char> >@@GLIBCXX_3.4.21' defined in .data.rel.ro section in /usr/lib/gcc/x86_64-linux-gnu/7/libstdc++.so
utils.cc:(.text+0x16e0): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against symbol `vtable for std::__cxx11::basic_ostringstream<char, std::char_traits<char>, std::allocator<char> >@@GLIBCXX_3.4.21' defined in .data.rel.ro section in /usr/lib/gcc/x86_64-linux-gnu/7/libstdc++.so
utils.cc:(.text+0x1724): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against symbol `vtable for std::basic_streambuf<char, std::char_traits<char> >@@GLIBCXX_3.4' defined in .data.rel.ro section in /usr/lib/gcc/x86_64-linux-gnu/7/libstdc++.so
utils.cc:(.text+0x1742): additional relocation overflows omitted from the output
/usr/bin/ld: failed to convert GOTPCREL relocation; relink with --no-relax
```

With `-mcmodel=large` enabled to use 64-bit relocations, the failure moves to the `.eh_frame` sections of `libmxnet.a`:

```
libmxnet.a(utils.cc.o):(.eh_frame+0x6c): relocation truncated to fit: R_X86_64_PC32 against `.text'
libmxnet.a(utils.cc.o):(.eh_frame+0xb8): relocation truncated to fit: R_X86_64_PC32 against `.text._ZN5mxnet2op8mxnet_op6KernelINS_6common16csr_indptr_checkEN7mshadow3cpuEE6LaunchIJPfPlllEEEbPNS5_6StreamIS6_EEmDpT_._omp_fn.1'
libmxnet.a(utils.cc.o):(.eh_frame+0xe8): relocation truncated to fit: R_X86_64_PC32 against `.text._ZN5mxnet2op8mxnet_op6KernelINS_6common13csr_idx_checkEN7mshadow3cpuEE6LaunchIJPfPlSA_lEEEbPNS5_6StreamIS6_EEmDpT_._omp_fn.2'
libmxnet.a(utils.cc.o):(.eh_frame+0x118): relocation truncated to fit: R_X86_64_PC32 against `.text'
libmxnet.a(utils.cc.o):(.eh_frame+0x164): relocation truncated to fit: R_X86_64_PC32 against `.text._ZN5mxnet2op8mxnet_op6KernelINS_6common16csr_indptr_checkEN7mshadow3cpuEE6LaunchIJPdPlllEEEbPNS5_6StreamIS6_EEmDpT_._omp_fn.4'
libmxnet.a(utils.cc.o):(.eh_frame+0x194): relocation truncated to fit: R_X86_64_PC32 against `.text._ZN5mxnet2op8mxnet_op6KernelINS_6common13csr_idx_checkEN7mshadow3cpuEE6LaunchIJPdPlSA_lEEEbPNS5_6StreamIS6_EEmDpT_._omp_fn.5'
libmxnet.a(utils.cc.o):(.eh_frame+0x1e4): relocation truncated to fit: R_X86_64_PC32 against `.text'
libmxnet.a(utils.cc.o):(.eh_frame+0x21c): relocation truncated to fit: R_X86_64_PC32 against `.text._ZN5mxnet2op8mxnet_op6KernelINS_6common16csr_indptr_checkEN7mshadow3cpuEE6LaunchIJPNS5_4half6half_tEPlllEEEbPNS5_6StreamIS6_EEmDpT_._omp_fn.7'
libmxnet.a(utils.cc.o):(.eh_frame+0x24c): relocation truncated to fit: R_X86_64_PC32 against `.text._ZN5mxnet2op8mxnet_op6KernelINS_6common13csr_idx_checkEN7mshadow3cpuEE6LaunchIJPNS5_4half6half_tEPlSC_lEEEbPNS5_6StreamIS6_EEmDpT_._omp_fn.8'
libmxnet.a(utils.cc.o):(.eh_frame+0x27c): additional relocation overflows omitted from the output
/usr/bin/ld: failed to convert GOTPCREL relocation; relink with --no-relax
```

Setting `-Wl,--no-relax` as well brings us back to the state reported by CI at http://jenkins.mxnet-ci.amazon-ml.com/blue/rest/organizations/jenkins/pipelines/mxnet-validation/pipelines/unix-gpu/branches/PR-17031/runs/6/nodes/52/steps/84/log/?start=0 (which builds with clang, unlike my build here with gcc).

```
/usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/Scrt1.o: In function `_start':
(.text+0x12): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against symbol `__libc_csu_fini' defined in .text section in /usr/lib/x86_64-linux-gnu/libc_nonshared.a(elf-init.oS)
(.text+0x19): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against symbol `__libc_csu_init' defined in .text section in /usr/lib/x86_64-linux-gnu/libc_nonshared.a(elf-init.oS)
(.text+0x20): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against symbol `main' defined in .text.startup section in tests/CMakeFiles/mxnet_unit_tests.dir/cpp/test_main.cc.o
(.text+0x26): relocation truncated to fit: R_X86_64_GOTPCRELX against symbol `__libc_start_main@@GLIBC_2.2.5' defined in .text section in /lib/x86_64-linux-gnu/libc.so.6
/usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/Scrt1.o:(.eh_frame+0x20): relocation truncated to fit: R_X86_64_PC32 against `.text'
/usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/crti.o: In function `_init':
(.init+0x7): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against undefined symbol `__gmon_start__'
tests/CMakeFiles/mxnet_unit_tests.dir/cpp/misc/base.cc.o:(.eh_frame+0x20): relocation truncated to fit: R_X86_64_PC32 against `.text._ZNKSt5ctypeIcE8do_widenEc'
tests/CMakeFiles/mxnet_unit_tests.dir/cpp/misc/base.cc.o:(.eh_frame+0x48): relocation truncated to fit: R_X86_64_PC32 against `.text._ZN7testing8internal15TestFactoryImplI38ContextHashTest_ContextHashUnique_TestED2Ev'
tests/CMakeFiles/mxnet_unit_tests.dir/cpp/misc/base.cc.o:(.eh_frame+0x5c): relocation truncated to fit: R_X86_64_PC32 against `.text._ZN7testing8internal15TestFactoryImplI38ContextHashTest_ContextHashUnique_TestED0Ev'
tests/CMakeFiles/mxnet_unit_tests.dir/cpp/misc/base.cc.o:(.eh_frame+0xc0): relocation truncated to fit: R_X86_64_PC32 against `.text._ZN38ContextHashTest_ContextHashUnique_TestD2Ev'
tests/CMakeFiles/mxnet_unit_tests.dir/cpp/misc/base.cc.o:(.eh_frame+0xdc): additional relocation overflows omitted from the output
tests/mxnet_unit_tests: PC-relative offset overflow in PLT entry for `cudnnBatchNormalizationForwardInference@@libcudnn.so.7'
```
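
To compare the three failure modes above, it can help to tally which relocation types overflow in which section. The following is a minimal sketch and not part of MXNet or binutils: the function name `summarize_truncations` and the regular expression are assumptions, written against the GNU ld message format as it appears in the logs above, and they may need adjusting for other linkers such as lld or gold.

```python
import re
from collections import Counter

# Assumed message shape (taken from the logs in this comment), e.g.
#   utils.cc:(.text+0xa5d): relocation truncated to fit: R_X86_64_PC32 against `.bss'
# The pattern captures the section holding the relocation and the relocation type.
TRUNCATED = re.compile(
    r"\((?P<section>\.[\w.]+)\+0x[0-9a-fA-F]+\):\s+"
    r"relocation truncated to fit:\s+(?P<reloc>R_X86_64_\w+)"
)


def summarize_truncations(log_text):
    """Count truncated relocations per (section, relocation type) pair."""
    return Counter(
        (match.group("section"), match.group("reloc"))
        for match in TRUNCATED.finditer(log_text)
    )


if __name__ == "__main__":
    # Three lines copied from the logs above, used only as sample input.
    sample = (
        "utils.cc:(.text+0xa5d): relocation truncated to fit: "
        "R_X86_64_PC32 against `.bss'\n"
        "utils.cc:(.text+0xb48): relocation truncated to fit: "
        "R_X86_64_REX_GOTPCRELX against symbol `vsnprintf@@GLIBC_2.2.5'\n"
        "libmxnet.a(utils.cc.o):(.eh_frame+0x6c): relocation truncated to fit: "
        "R_X86_64_PC32 against `.text'\n"
    )
    for (section, reloc), count in sorted(summarize_truncations(sample).items()):
        print(f"{section:12} {reloc:26} {count}")
```

Note that ld stops after a handful of messages ("additional relocation overflows omitted from the output"), so the counts only describe the reported subset, not every overflowing relocation in the link.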