Your message dated Sat, 15 Apr 2023 23:42:59 -0600
with message-id <[email protected]>
and subject line Segfault during compilation is probably hardware failure
has caused the Debian Bug report #1034045,
regarding clang-15: Segfault during compilation of rocthrust/testing/shuffle.cu
to be marked as done.
This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.
(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact [email protected]
immediately.)
--
1034045: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1034045
Debian Bug Tracking System
Contact [email protected] with problems
--- Begin Message ---
Package: clang-15
Version: 1:15.0.7-4
Severity: normal
X-Debbugs-Cc: [email protected], [email protected]
Dear Maintainer,
I'm having trouble reproducing this problem, but I figured I should
report it anyway in case other users had seen similar behaviour.
While compiling the rocthrust test suite, I encountered a segfault in
the compiler. To move forward with my work, I decided to disable
building the test that triggered the segfault, with the intention of
returning to prepare a simple, self-contained example of the problem
later. Unfortunately, when I returned to prepare this bug report, I was
unable to reproduce the crash. I'm unsure whether I have somehow
failed to reproduce the original conditions that caused the crash, or
whether the crash is non-deterministic.
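One way to tell those two cases apart is to rerun the exact failing command many times and count the outcomes. The following is only a minimal sketch of that idea, not something that was run for this report: the helper name tally_exit_codes and the placeholder command are assumptions made for illustration, and the real hipcc or clang invocation would have to be substituted.

```python
import subprocess
import sys
from collections import Counter


def tally_exit_codes(cmd, runs):
    """Run cmd `runs` times and count how often each exit code occurs.

    tally_exit_codes is a hypothetical helper, not part of any tool
    mentioned in this report.
    """
    codes = Counter()
    for _ in range(runs):
        result = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        # On POSIX, a child killed by a signal has a negative return code
        # (for example -11 for SIGSEGV).
        codes[result.returncode] += 1
    return codes


if __name__ == "__main__":
    # Placeholder command; pass the failing compiler invocation as arguments.
    cmd = sys.argv[1:] or [sys.executable, "-c", "pass"]
    for code, count in sorted(tally_exit_codes(cmd, runs=5).items()):
        print(f"exit code {code}: {count} run(s)")
```

If identical inputs produce a mix of exit codes, the failure is intermittent, which points away from a plain deterministic compiler bug. A clean result over a handful of runs proves little, since a rare crash may need many more attempts.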
I originally thought this crash was a regression from 15.0.6, but I no
longer have any reason to believe that. It may just have been luck that
I never encountered this crash before.
The only information that I saved from the failure is this log snippet:
[ 90%] Building CXX object
testing/CMakeFiles/test_thrust_zip_iterator_reduce_by_key.dir/zip_iterator_reduce_by_key.cu.o
cd /root/rocthrust/rocthrust/obj-x86_64-linux-gnu/testing && /usr/bin/hipcc
-D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1
-I/root/rocthrust/rocthrust/obj-x86_64-linux-gnu/thrust/include
-I/root/rocthrust/rocthrust/thrust/.. -isystem
/root/rocthrust/rocthrust/testing -Wno-deprecated-builtins -g -O2
-ffile-prefix-map=/root/rocthrust/rocthrust=. -Xarch_host
-fstack-protector-strong -Wformat -Werror=format-security -Wdate-time
-D_FORTIFY_SOURCE=2 -Wno-unused-command-line-argument -Wall -Wextra -std=c++14
-O3 -DNDEBUG -x hip --offload-arch=gfx803 --offload-arch=gfx900
--offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a
--offload-arch=gfx1010 --offload-arch=gfx1011 --offload-arch=gfx1030
-DGTEST_HAS_PTHREAD=1 -std=c++14 -MD -MT
testing/CMakeFiles/test_thrust_zip_iterator_reduce_by_key.dir/zip_iterator_reduce_by_key.cu.o
-MF
CMakeFiles/test_thrust_zip_iterator_reduce_by_key.dir/zip_iterator_reduce_by_key.cu.o.d
-o
CMakeFiles/test_thrust_zip_iterator_reduce_by_key.dir/zip_iterator_reduce_by_key.cu.o
-c /root/rocthrust/rocthrust/testing/zip_iterator_reduce_by_key.cu
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and
include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0. Program arguments: /usr/lib/llvm-15/bin/clang -cc1 -triple
amdgcn-amd-amdhsa -aux-triple x86_64-pc-linux-gnu -emit-obj
--mrelax-relocations -disable-free -clear-ast-before-backend
-disable-llvm-verifier -discard-value-names -main-file-name shuffle.cu
-mrelocation-model pic -pic-level 1 -fhalf-no-semantic-interposition
-mframe-pointer=none -fno-rounding-math -mconstructor-aliases -aux-target-cpu
x86-64 -fcuda-is-device -mllvm -amdgpu-internalize-symbols
-fcuda-allow-variadic-functions -fvisibility hidden
-fapply-global-visibility-to-externs -mlink-builtin-bitcode
/usr/lib/x86_64-linux-gnu/amdgcn/bitcode/hip.bc -mlink-builtin-bitcode
/usr/lib/x86_64-linux-gnu/amdgcn/bitcode/ocml.bc -mlink-builtin-bitcode
/usr/lib/x86_64-linux-gnu/amdgcn/bitcode/ockl.bc -mlink-builtin-bitcode
/usr/lib/x86_64-linux-gnu/amdgcn/bitcode/oclc_daz_opt_off.bc
-mlink-builtin-bitcode
/usr/lib/x86_64-linux-gnu/amdgcn/bitcode/oclc_unsafe_math_off.bc
-mlink-builtin-bitcode
/usr/lib/x86_64-linux-gnu/amdgcn/bitcode/oclc_finite_only_off.bc
-mlink-builtin-bitcode
/usr/lib/x86_64-linux-gnu/amdgcn/bitcode/oclc_correctly_rounded_sqrt_on.bc
-mlink-builtin-bitcode
/usr/lib/x86_64-linux-gnu/amdgcn/bitcode/oclc_wavefrontsize64_off.bc
-mlink-builtin-bitcode
/usr/lib/x86_64-linux-gnu/amdgcn/bitcode/oclc_isa_version_1030.bc
-mlink-builtin-bitcode
/usr/lib/x86_64-linux-gnu/amdgcn/bitcode/oclc_abi_version_400.bc -target-cpu
gfx1030 -mllvm -treat-scalable-fixed-error-as-warning
-debug-info-kind=constructor -dwarf-version=5 -debugger-tuning=gdb
-resource-dir /usr/lib/llvm-15/lib/clang/15.0.7 -dependency-file
CMakeFiles/test_thrust_shuffle.dir/shuffle.cu.o.d -MT
testing/CMakeFiles/test_thrust_shuffle.dir/shuffle.cu.o -sys-header-deps
-internal-isystem /usr/lib/llvm-15/lib/clang/15.0.7/include/cuda_wrappers
-idirafter /usr/include -include __clang_hip_runtime_wrapper.h -isystem
/usr/lib/llvm-15/lib/clang/15.0.7/include/.. -isystem /usr/hsa/include -isystem
/root/rocthrust/rocthrust/testing -D __HIP_PLATFORM_AMD__=1 -D
__HIP_PLATFORM_HCC__=1 -I
/root/rocthrust/rocthrust/obj-x86_64-linux-gnu/thrust/include -I
/root/rocthrust/rocthrust/thrust/.. -D _FORTIFY_SOURCE=2 -D NDEBUG -D
GTEST_HAS_PTHREAD=1 -internal-isystem
/usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12
-internal-isystem
/usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/x86_64-linux-gnu/c++/12
-internal-isystem
/usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/backward
-internal-isystem
/usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12
-internal-isystem
/usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/x86_64-linux-gnu/c++/12
-internal-isystem
/usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/backward
-internal-isystem /usr/lib/llvm-15/lib/clang/15.0.7/include -internal-isystem
/usr/local/include -internal-isystem
/usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../x86_64-linux-gnu/include
-internal-externc-isystem /usr/include/x86_64-linux-gnu
-internal-externc-isystem /include -internal-externc-isystem /usr/include
-internal-isystem /usr/lib/llvm-15/lib/clang/15.0.7/include -internal-isystem
/usr/local/include -internal-isystem
/usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../x86_64-linux-gnu/include
-internal-externc-isystem /usr/include/x86_64-linux-gnu
-internal-externc-isystem /include -internal-externc-isystem /usr/include
-fmacro-prefix-map=/root/rocthrust/rocthrust=.
-fcoverage-prefix-map=/root/rocthrust/rocthrust=. -O3 -Wno-deprecated-builtins
-Wformat -Werror=format-security -Wdate-time -Wno-unused-command-line-argument
-Wall -Wextra -std=c++14 -fdeprecated-macro -fno-autolink
-fdebug-compilation-dir=/root/rocthrust/rocthrust/obj-x86_64-linux-gnu/testing
-fdebug-prefix-map=/root/rocthrust/rocthrust=. -ferror-limit 19
-fhip-new-launch-api -fgnuc-version=4.2.1 -fcxx-exceptions -fexceptions
-vectorize-loops -vectorize-slp -mllvm -amdgpu-early-inline-all=true -mllvm
-amdgpu-function-calls=false -cuid=e8f1cf112b6efa9c
-fcuda-allow-variadic-functions -faddrsig -D__GCC_HAVE_DWARF2_CFI_ASM=1 -o
/tmp/shuffle-a62af3/shuffle-gfx1030.o -x hip
/root/rocthrust/rocthrust/testing/shuffle.cu
1. <eof> parser at end of file
2. Code generation
3. Running pass 'CallGraph Pass Manager' on module
'/root/rocthrust/rocthrust/testing/shuffle.cu'.
4. Running pass 'Debug Variable Analysis' on function
'@_ZN7rocprim6detail18sort_single_kernelILj256ELj10ELb0EN6thrust6detail15normal_iteratorINS2_10device_ptrIaEEEES7_PNS_10empty_typeES9_EEvT2_T3_T4_T5_jjj'
#0 0x00007f360e0f51b1 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int)
(/lib/x86_64-linux-gnu/libLLVM-15.so.1+0xf5c1b1)
#1 0x00007f360e0f2ece llvm::sys::RunSignalHandlers()
(/lib/x86_64-linux-gnu/libLLVM-15.so.1+0xf59ece)
#2 0x00007f360e0f56d6 (/lib/x86_64-linux-gnu/libLLVM-15.so.1+0xf5c6d6)
#3 0x00007f360ccd8f90 (/lib/x86_64-linux-gnu/libc.so.6+0x3bf90)
#4 0x00007f360e48d3e4 llvm::MachineInstr::getMF() const
(/lib/x86_64-linux-gnu/libLLVM-15.so.1+0x12f43e4)
#5 0x00007f360e42135f
llvm::ilist_traits<llvm::MachineInstr>::removeNodeFromList(llvm::MachineInstr*)
(/lib/x86_64-linux-gnu/libLLVM-15.so.1+0x128835f)
#6 0x00007f360e3d1360 (/lib/x86_64-linux-gnu/libLLVM-15.so.1+0x1238360)
#7 0x00007f360e486cbc
llvm::MachineFunctionPass::runOnFunction(llvm::Function&)
(/lib/x86_64-linux-gnu/libLLVM-15.so.1+0x12edcbc)
#8 0x00007f360e231fd2 llvm::FPPassManager::runOnFunction(llvm::Function&)
(/lib/x86_64-linux-gnu/libLLVM-15.so.1+0x1098fd2)
#9 0x00007f360f46d7d2 (/lib/x86_64-linux-gnu/libLLVM-15.so.1+0x22d47d2)
#10 0x00007f360e232b76 llvm::legacy::PassManagerImpl::run(llvm::Module&)
(/lib/x86_64-linux-gnu/libLLVM-15.so.1+0x1099b76)
#11 0x00007f3615b51991 clang::EmitBackendOutput(clang::DiagnosticsEngine&,
clang::HeaderSearchOptions const&, clang::CodeGenOptions const&,
clang::TargetOptions const&, clang::LangOptions const&, llvm::StringRef,
llvm::Module*, clang::BackendAction, std::unique_ptr<llvm::raw_pwrite_stream,
std::default_delete<llvm::raw_pwrite_stream>>)
(/usr/lib/llvm-15/bin/../lib/libclang-cpp.so.15+0x1954991)
#12 0x00007f3615eb19e1
(/usr/lib/llvm-15/bin/../lib/libclang-cpp.so.15+0x1cb49e1)
#13 0x00007f3614cbab1b clang::ParseAST(clang::Sema&, bool, bool)
(/usr/lib/llvm-15/bin/../lib/libclang-cpp.so.15+0xabdb1b)
#14 0x00007f3615eada05 clang::CodeGenAction::ExecuteAction()
(/usr/lib/llvm-15/bin/../lib/libclang-cpp.so.15+0x1cb0a05)
#15 0x00007f36168ebaf7 clang::FrontendAction::Execute()
(/usr/lib/llvm-15/bin/../lib/libclang-cpp.so.15+0x26eeaf7)
#16 0x00007f361685ccf6
clang::CompilerInstance::ExecuteAction(clang::FrontendAction&)
(/usr/lib/llvm-15/bin/../lib/libclang-cpp.so.15+0x265fcf6)
#17 0x00007f361696a0aa
clang::ExecuteCompilerInvocation(clang::CompilerInstance*)
(/usr/lib/llvm-15/bin/../lib/libclang-cpp.so.15+0x276d0aa)
#18 0x000055bd219e1178 cc1_main(llvm::ArrayRef<char const*>, char const*,
void*) (/usr/lib/llvm-15/bin/clang+0x14178)
#19 0x000055bd219df27b (/usr/lib/llvm-15/bin/clang+0x1227b)
#20 0x000055bd219df0d2 clang_main(int, char**)
(/usr/lib/llvm-15/bin/clang+0x120d2)
#21 0x00007f360ccc418a (/lib/x86_64-linux-gnu/libc.so.6+0x2718a)
#22 0x00007f360ccc4245 __libc_start_main
(/lib/x86_64-linux-gnu/libc.so.6+0x27245)
#23 0x000055bd219dbec1 _start (/usr/lib/llvm-15/bin/clang+0xeec1)
clang: error: unable to execute command: Segmentation fault (core dumped)
clang: error: clang frontend command failed due to signal (use -v to see
invocation)
Debian clang version 15.0.7
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
clang: note: diagnostic msg: Error generating preprocessed source(s).
make[3]: *** [testing/CMakeFiles/test_thrust_shuffle.dir/build.make:79:
testing/CMakeFiles/test_thrust_shuffle.dir/shuffle.cu.o] Error 1
make[3]: Leaving directory '/root/rocthrust/rocthrust/obj-x86_64-linux-gnu'
make[2]: *** [CMakeFiles/Makefile2:3773:
testing/CMakeFiles/test_thrust_shuffle.dir/all] Error 2
make[2]: *** Waiting for unfinished jobs....
I really must apologise for not collecting all the data required to
diagnose this bug. I was expecting to be able to easily reproduce the
crash and capture the necessary data the second time around, but this
problem defied my expectations.
-- System Information:
Debian Release: 12.0
APT prefers unstable
APT policy: (500, 'unstable')
Architecture: amd64 (x86_64)
Kernel: Linux 6.1.0-7-amd64 (SMP w/32 CPU threads; PREEMPT)
Locale: LANG=C, LC_CTYPE=C.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: unable to detect
Versions of packages clang-15 depends on:
ii binutils 2.40-2
ii libc6 2.36-8
ii libc6-dev 2.36-8
ii libclang-common-15-dev 1:15.0.7-4
ii libclang-cpp15 1:15.0.7-4
ii libclang1-15 1:15.0.7-4
ii libgcc-12-dev 12.2.0-14
ii libgcc-s1 12.2.0-14
ii libllvm15 1:15.0.7-4
ii libobjc-12-dev 12.2.0-14
ii libstdc++-12-dev 12.2.0-14
ii libstdc++6 12.2.0-14
ii llvm-15-linker-tools 1:15.0.7-4
Versions of packages clang-15 recommends:
pn llvm-15-dev <none>
ii python3 3.11.2-1
Versions of packages clang-15 suggests:
pn clang-15-doc <none>
pn wasi-libc <none>
-- no debconf information
--- End Message ---
--- Begin Message ---
I am closing Bug #1034045 (clang-15: Segfault during compilation of
rocthrust/testing/shuffle.cu), because I now believe it to have been
caused by a hardware failure. I haven't pinned down the exact cause, but
I'm seeing occasional crashes in a wide variety of programs. I suspect
that one core of my CPU is unstable under heavy load.
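A suspicion like this can be probed by pinning the same deterministic workload to each core in turn and comparing the results. The sketch below is an illustration under stated assumptions, not the method used to diagnose this machine: it is Linux-only (it relies on os.sched_setaffinity), and the function names burn, check_cores and suspect_cores are hypothetical.

```python
import hashlib
import os


def burn(rounds):
    """Deterministic CPU-bound work: repeatedly hash a fixed seed."""
    digest = b"seed"
    for _ in range(rounds):
        digest = hashlib.sha256(digest).digest()
    return digest.hex()


def check_cores(rounds=200_000):
    """Run the same workload pinned to each allowed core.

    Returns a dict mapping core number to the resulting digest.
    """
    original = os.sched_getaffinity(0)  # Linux-only API
    results = {}
    try:
        for core in sorted(original):
            os.sched_setaffinity(0, {core})
            results[core] = burn(rounds)
    finally:
        # Restore the original affinity mask even if a run fails.
        os.sched_setaffinity(0, original)
    return results


def suspect_cores(results):
    """Return the cores whose digest differs from the most common one."""
    digests = list(results.values())
    majority = max(set(digests), key=digests.count)
    return [core for core, d in results.items() if d != majority]


if __name__ == "__main__":
    print("cores with mismatching results:", suspect_cores(check_cores()))
```

Every healthy core should produce the same digest, so a mismatch, or the script itself crashing while pinned to one particular core, would single that core out. A short single-threaded run like this puts far less load on the CPU than a 32-thread build, so an empty result does not show that the hardware is sound.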
Apologies for wasting your time on a wild goose chase.
Sincerely,
Cory Bloor
--- End Message ---