Public bug reported:
When trying to run a model from Hugging Face, llama-cli segfaults on startup, after the model download completes and while the model is loading. I have only tested on gfx1201 for the moment; I will test the other hardware I have access to shortly.
## CLI Output
# llama-cli -st -hf ggml-org/gemma-3-1b-it-GGUF -p "If this is working, please output the exact phrase 'this appears to be working'"
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 32624 MiB):
  Device 0: AMD Radeon AI PRO R9700, gfx1201 (0x1201), VMM: no, Wave Size: 32, VRAM: 32624 MiB
load_backend: loaded ROCm backend from /usr/lib/x86_64-linux-gnu/ggml/backends0/libggml-hip.so
load_backend: loaded CPU backend from /usr/lib/x86_64-linux-gnu/ggml/backends0/libggml-cpu-haswell.so
common_download_file_single_online: no previous model file found /root/.cache/llama.cpp/ggml-org_gemma-3-1b-it-GGUF_preset.ini
common_download_file_single_online: HEAD failed, status: 404
no remote preset found, skipping
common_download_file_single_online: no previous model file found /root/.cache/llama.cpp/ggml-org_gemma-3-1b-it-GGUF_gemma-3-1b-it-Q4_K_M.gguf
common_download_file_single_online: downloading from https://huggingface.co/ggml-org/gemma-3-1b-it-GGUF/resolve/main/gemma-3-1b-it-Q4_K_M.gguf to /root/.cache/llama.cpp/ggml-org_gemma-3-1b-it-GGUF_gemma-3-1b-it-Q4_K_M.gguf.downloadInProgress (etag:"107078f2011b8db626bee8040bb2bf82aa23ff7f5a81c786f3cf58dbcd75db2e")...
[==================================================] 100% (768 MB / 768 MB)
Loading model... -Segmentation fault (core dumped) llama-cli -st -hf ggml-org/gemma-3-1b-it-GGUF -p "If this is working, please output the exact phrase 'this appears to be working'"
## GDB Backtrace
Thread 1 "llama-cli" received signal SIGSEGV, Segmentation fault.
0x00007fffac71ffcb in hip::ihipLaunchKernel_validate (f=f@entry=0x7ffff6ddfa68,
launch_params=..., kernelParams=kernelParams@entry=0x7fffffff38e0,
extra=extra@entry=0x0,
deviceId=deviceId@entry=0, params=params@entry=0) at
/usr/src/rocm-hipamd-7.1.0-0ubuntu2/rocclr/platform/kernel.hpp:85
warning: 85 /usr/src/rocm-hipamd-7.1.0-0ubuntu2/rocclr/platform/kernel.hpp:
No such file or directory
(gdb) bt
#0 0x00007fffac71ffcb in hip::ihipLaunchKernel_validate
(f=f@entry=0x7ffff6ddfa68, launch_params=...,
kernelParams=kernelParams@entry=0x7fffffff38e0, extra=extra@entry=0x0,
deviceId=deviceId@entry=0, params=params@entry=0) at
/usr/src/rocm-hipamd-7.1.0-0ubuntu2/rocclr/platform/kernel.hpp:85
#1 0x00007fffac72048d in hip::ihipModuleLaunchKernel (f=0x7ffff6ddfa68,
launch_params=..., hStream=0x5555562716b0,
kernelParams=kernelParams@entry=0x7fffffff38e0, extra=extra@entry=0x0,
startEvent=startEvent@entry=0x0, stopEvent=0x0, flags=0, params=0,
gridId=0, numGrids=0, prevGridSum=0, allGridSum=0, firstDevice=0)
at /usr/src/rocm-hipamd-7.1.0-0ubuntu2/hipamd/src/hip_module.cpp:467
#2 0x00007fffac77deb4 in hip::ihipLaunchKernel (hostFunction=0x7ffff6ddfa68,
gridDim=..., blockDim=..., args=0x7fffffff38e0, sharedMemBytes=0,
stream=<optimized out>, startEvent=0x0,
stopEvent=0x0, flags=0) at
/usr/src/rocm-hipamd-7.1.0-0ubuntu2/hipamd/src/hip_platform.cpp:677
#3 0x00007fffac71fe31 in hip::hipLaunchKernel_common (hostFunction=<optimized
out>, hostFunction@entry=0x7ffff6ddfa68, gridDim=..., blockDim=...,
args=<optimized out>,
args@entry=0x7fffffff38e0, sharedMemBytes=<optimized out>,
stream=<optimized out>) at
/usr/src/rocm-hipamd-7.1.0-0ubuntu2/hipamd/src/hip_module.cpp:819
#4 0x00007fffac73bae2 in hip::hipLaunchKernel (hostFunction=<optimized out>,
gridDim=..., blockDim=..., args=<optimized out>, sharedMemBytes=<optimized
out>, stream=<optimized out>)
at /usr/src/rocm-hipamd-7.1.0-0ubuntu2/hipamd/src/hip_module.cpp:826
#5 0x00007fffb7e215e1 in ggml_cuda_op_scale(ggml_backend_cuda_context&,
ggml_tensor*) () from /usr/lib/x86_64-linux-gnu/ggml/backends0/libggml-hip.so
#6 0x00007fffb7d43709 in ?? () from
/usr/lib/x86_64-linux-gnu/ggml/backends0/libggml-hip.so
#7 0x00007fffb7d41f41 in ?? () from
/usr/lib/x86_64-linux-gnu/ggml/backends0/libggml-hip.so
#8 0x00007ffff75161d3 in ggml_backend_sched_compute_splits
(sched=0x555556258d30) at /usr/src/ggml-0.9.8-3/src/ggml-backend.cpp:1582
#9 ggml_backend_sched_graph_compute_async (sched=0x555556258d30,
graph=<optimized out>) at /usr/src/ggml-0.9.8-3/src/ggml-backend.cpp:1805
#10 0x00007ffff7601781 in llama_context::graph_compute
(this=this@entry=0x5555561c1cf0, gf=0x5555566243f0, batched=<optimized out>) at
/usr/include/c++/15/bits/unique_ptr.h:192
#11 0x00007ffff76029b3 in llama_context::process_ubatch
(this=this@entry=0x5555561c1cf0, ubatch=...,
gtype=gtype@entry=LLM_GRAPH_TYPE_DECODER, mctx=mctx@entry=0x5555562e6b40,
ret=@0x7fffffff974c: 892221235) at
/usr/src/llama.cpp-8064+dfsg-1ubuntu1/src/llama-context.cpp:1162
#12 0x00007ffff7606d99 in llama_context::decode (this=0x5555561c1cf0,
batch_inp=...) at
/usr/src/llama.cpp-8064+dfsg-1ubuntu1/src/llama-context.cpp:1620
#13 0x00007ffff76125f5 in llama_decode (ctx=<optimized out>, batch=...) at
/usr/src/llama.cpp-8064+dfsg-1ubuntu1/src/llama-context.cpp:3466
#14 0x000055555571f7f4 in common_init_from_params (params=...) at
/usr/src/llama.cpp-8064+dfsg-1ubuntu1/common/common.cpp:1297
#15 0x000055555565b797 in server_context_impl::load_model (this=0x5555562df560,
params=...) at
/usr/src/llama.cpp-8064+dfsg-1ubuntu1/tools/server/server-context.cpp:625
#16 0x00005555555ea326 in server_context::load_model (this=0x7fffffffc320,
params=...) at /usr/include/c++/15/bits/unique_ptr.h:192
#17 main (argc=<optimized out>, argv=<optimized out>) at
/usr/src/llama.cpp-8064+dfsg-1ubuntu1/tools/cli/cli.cpp:243
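## Minimal HIP launch test (untested sketch)
Frame #5 shows the crash inside HIP's launch validation (rocclr kernel.hpp:85) while ggml_cuda_op_scale launches its kernel, reached from the decode in common_init_from_params. My working assumption (not confirmed) is that the kernel handle passed to hipLaunchKernel was never registered for this device, e.g. if the packaged libggml-hip.so lacks a gfx1201 code object. The standalone program below exercises the same hipLaunchKernel path with a stand-in kernel (scale_kernel here is hypothetical, not llama.cpp's actual kernel), which may help separate a ROCm runtime fault from a ggml packaging fault:

// scale_launch_test.hip
// build (assumed toolchain): hipcc --offload-arch=gfx1201 scale_launch_test.hip -o scale_launch_test
#include <hip/hip_runtime.h>
#include <cstdio>

// Stand-in for the scale op; any trivial kernel goes through the same
// hipLaunchKernel -> ihipLaunchKernel_validate path seen in the backtrace.
__global__ void scale_kernel(float *x, float s, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= s;
}

// Report a readable error instead of crashing on a failed HIP call.
#define HIP_CHECK(expr) do { \
        hipError_t err_ = (expr); \
        if (err_ != hipSuccess) { \
            fprintf(stderr, "%s failed: %s\n", #expr, hipGetErrorString(err_)); \
            return 1; \
        } \
    } while (0)

int main() {
    const int n = 1024;
    float *d = nullptr;
    HIP_CHECK(hipMalloc(&d, n * sizeof(float)));
    HIP_CHECK(hipMemset(d, 0, n * sizeof(float)));
    // If no code object for the active gfx arch is embedded in the binary,
    // this launch is where the failure should surface.
    scale_kernel<<<(n + 255) / 256, 256>>>(d, 2.0f, n);
    HIP_CHECK(hipGetLastError());
    HIP_CHECK(hipDeviceSynchronize());
    HIP_CHECK(hipFree(d));
    printf("kernel launch OK on this device\n");
    return 0;
}

If this passes on the same machine, the runtime launch path works on gfx1201 and suspicion shifts to how libggml-hip.so 0.9.8-3 was built (offload arch list); if it also faults, that points at rocm-hipamd 7.1.0-0ubuntu2 itself.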
## Immediately relevant installed packages
libggml0-backend-hip_0.9.8-3.amd64
llama.cpp_8064+dfsg-1ubuntu1.amd64
** Affects: llama.cpp (Ubuntu)
Importance: Undecided
Status: New