Public bug reported:

Any single `UBLK_U_CMD_ADD_DEV` control command Oopses the kernel on
`linux-azure` `6.17.0-1015.15~24.04.1`:

```text
BUG: kernel NULL pointer dereference, address: 0000000000000000
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
Oops: 0000 [#1] SMP NOPTI
RIP: 0010:ublk_init_queues+0x4e/0x1e0 [ublk_drv]
Call Trace:
 ublk_ctrl_add_dev+0x31a/0x5e0 [ublk_drv]
 ublk_ctrl_uring_cmd+0x7f/0x754 [ublk_drv]
 io_uring_cmd+0xa3/0x140
 __io_issue_sqe+0x43/0x1b0
 io_issue_sqe+0x3f/0x350
 io_wq_submit_work+0xcc/0x330
 io_worker_handle_work+0x169/0x540
 io_wq_worker+0xfd/0x380
note: iou-wrk-* exited with irqs disabled
```

The io-uring worker dies with IRQs disabled; the command never completes and
the device is never created. Any ublk server (ublksrv, libublk-rs, SPDK ublk,
qemu-storage-daemon ublk, etc.) triggers the Oops on its first device add. This
breaks CI on GitHub-hosted `ubuntu-24.04` runners, which ship this kernel.

- `ubuntu-22.04` / `6.8.0-1052-azure`: works.
- `ubuntu-24.04` image `20260525.161.1` / `6.17.0-1015-azure #15~24.04.1-Ubuntu`
  (`linux-modules-extra-6.17.0-1015-azure 6.17.0-1015.15~24.04.1`): Oops.

The NULL faulting address (`0000000000000000`) is the unallocated `mq_map`
base read by the NUMA helper — see root cause below. Full log captured at
<https://github.com/e2b-dev/ublk-adddev-repro/actions/runs/26619162938>
(attached as `oops.txt`).

See also the runner-images issue
<https://github.com/actions/runner-images/issues/14175>
(a maintainer confirmed it is a `linux-azure` kernel bug to be fixed upstream
of the image).


The crashing kernel carries this cherry-pick:

```
c34f3aeec82a5eb19384b1bd3911329adbbf8a1a  (linux-azure noble)
ublk: implement NUMA-aware memory allocation
[ Upstream commit 529d4d6327880e5c60f4e0def39b3faaa7954e54 ]
BugLink: https://bugs.launchpad.net/bugs/2146193
```

`c34f3aee` adds `ublk_get_queue_numa_node()`, inlined into the
`ublk_init_queues()` path, which reads:

```c
ub->tag_set.map[HCTX_TYPE_DEFAULT].mq_map[cpu]
```

`mq_map[]` is only allocated by `blk_mq_alloc_tag_set()`, called from
`ublk_add_tag_set()`. Upstream is safe **only because** its immediate parent
commit `011af85ccd87` reordered `ublk_ctrl_add_dev()` to call
`ublk_add_tag_set()` **before** `ublk_init_queues()`.

The `linux-azure-6.17` backport pulled the child (`c34f3aee`, NUMA) but **not**
its parent (`011af85c`, reorder). The azure tree still has the v6.17 order —
`ublk_init_queues()` first — so the NUMA helper dereferences a NULL `mq_map`
on a freshly `kzalloc()`'d `ub`. Not-present page read → Oops at
`ublk_init_queues+0x4e`.
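
The ordering dependency can be modelled in user space. The sketch below is a toy, not the driver code: `struct toy_dev`, `toy_add_tag_set()`, `toy_queue_for_cpu()` and `TOY_MAP_MISSING` are hypothetical names, and the round-robin CPU-to-queue mapping is an assumption made only so the example has something to read. The toy helper checks for a NULL map and returns a sentinel at the point where, per the analysis above, the backported helper dereferences the pointer.

```c
#include <assert.h>
#include <stdlib.h>

/* Sentinel returned where the kernel helper would fault (hypothetical). */
#define TOY_MAP_MISSING (-1)

/* Toy stand-in for the ublk device; the real structure is much larger. */
struct toy_dev {
    unsigned int *mq_map;      /* CPU -> hw queue; NULL until the tag set exists */
    unsigned int nr_cpus;
    unsigned int nr_hw_queues;
};

/* Models the tag-set step: the only place mq_map gets allocated. */
static int toy_add_tag_set(struct toy_dev *ub, unsigned int nr_cpus)
{
    assert(ub != NULL && ub->nr_hw_queues > 0 && nr_cpus > 0);

    ub->mq_map = calloc(nr_cpus, sizeof(*ub->mq_map));
    if (ub->mq_map == NULL)
        return -1;

    ub->nr_cpus = nr_cpus;
    /* Assumed round-robin mapping, for illustration only. */
    for (unsigned int cpu = 0; cpu < nr_cpus; cpu++)
        ub->mq_map[cpu] = cpu % ub->nr_hw_queues;
    return 0;
}

/* Models the mq_map[cpu] read done while initializing queues. */
static int toy_queue_for_cpu(const struct toy_dev *ub, unsigned int cpu)
{
    assert(ub != NULL);

    if (ub->mq_map == NULL)
        return TOY_MAP_MISSING;   /* the toy reports; it does not fault */

    assert(cpu < ub->nr_cpus);
    return (int)ub->mq_map[cpu];
}
```

On a zero-initialized `struct toy_dev`, `toy_queue_for_cpu()` returns `TOY_MAP_MISSING` until `toy_add_tag_set()` has run. That is the same precondition the reorder in `011af85c` establishes for the real helper.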

Call order in `ublk_ctrl_add_dev`, verified across trees:

| tree | order | NUMA mq_map read | result |
|---|---|---|---|
| v6.17 release | init_queues → add_tag_set | no | safe (order irrelevant) |
| upstream `011af85c` (`529d4d63~1`) | add_tag_set → init_queues | no | reorder lands |
| upstream `529d4d63` (NUMA) | add_tag_set → init_queues | yes | **safe** |
| stable `linux-6.17.y` | init_queues → add_tag_set | no | not affected |
| **azure `6.17.0-1015.15`** | init_queues → add_tag_set | **yes** | **Oops** |
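
The table reduces to one condition: a tree Oopses only if it both reads `mq_map` during queue init and still runs `ublk_init_queues()` before `ublk_add_tag_set()`. A minimal sketch of that rule follows, with the rows transcribed from the table; `struct tree_state`, `tree_oopses()` and `count_affected()` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One row of the table above. */
struct tree_state {
    const char *name;
    bool tag_set_before_queues;  /* add_tag_set runs before init_queues */
    bool reads_mq_map;           /* NUMA helper present */
};

/* Rows in the same order as the table. */
static const struct tree_state trees[] = {
    { "v6.17 release",        false, false },
    { "upstream 011af85c",    true,  false },
    { "upstream 529d4d63",    true,  true  },
    { "stable linux-6.17.y",  false, false },
    { "azure 6.17.0-1015.15", false, true  },
};

/* A tree is affected only when the read happens before the allocation. */
static bool tree_oopses(const struct tree_state *t)
{
    assert(t != NULL);
    return t->reads_mq_map && !t->tag_set_before_queues;
}

/* Counts the affected rows in the table. */
static size_t count_affected(void)
{
    size_t n = 0;

    for (size_t i = 0; i < sizeof(trees) / sizeof(trees[0]); i++)
        if (tree_oopses(&trees[i]))
            n++;
    return n;
}
```

Only the azure row satisfies the condition, which matches the table's result column.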

Mainline is **not** affected because it carries the reorder (`011af85c`) ahead
of the NUMA commit (`529d4d63`), and upstream stable `6.17.y` is **not**
affected because it carries neither. This is therefore not a
`linux-block`/`stable@vger` issue; it is a `linux-azure` incomplete-backport
regression introduced under lp#2146193.


Suggested fix:

Cherry-pick mainline commit:

```
011af85ccd871526df36988c7ff20ca375fb804d
ublk: reorder tag_set initialization before queue allocation
```

It applies cleanly to the `linux-azure-6.17` tree (the `ublk_ctrl_add_dev`
context is identical to the upstream pre-image). Patch attached:
`0001-ublk-reorder-tag_set-initialization-before-queue-all.patch`.


The full details of the reproduction oops:

Captured from GitHub Actions run: e2b-dev/ublk-adddev-repro
Run: https://github.com/e2b-dev/ublk-adddev-repro/actions/runs/26619162938
Date: 2026-05-29

Environment (ubuntu-24.04 hosted runner):
  Runner image:   ubuntu-24.04  20260525.161.1
  Azure region:   eastus2
  uname:          Linux 6.17.0-1015-azure #15~24.04.1-Ubuntu SMP Wed May  6 22:37:49 UTC 2026 x86_64
  Built with:     x86_64-linux-gnu-gcc-13 (Ubuntu 13.3.0-6ubuntu2~24.04.1), GNU ld 2.42
  Module:         /lib/modules/6.17.0-1015-azure/kernel/drivers/block/ublk_drv.ko.zst
  Package:        linux-modules-extra-6.17.0-1015-azure  6.17.0-1015.15~24.04.1
  /dev/ublk-control present (crw------- 10, 263)

Trigger: a single UBLK_U_CMD_ADD_DEV (nr_hw_queues=1, queue_depth=32,
max_io_buf_bytes=65536) via raw io_uring IORING_OP_URING_CMD (SQE128).
The command never completes; the io-uring worker dies with IRQs disabled.

Comparison runs:
  ubuntu-22.04 / 6.8.0-1052-azure : PASS (ADD_DEV ok, DEL_DEV ok)
  ubuntu-24.04 / 6.17.0-1015-azure: Oops (below)

dmesg:

[   61.497360] BUG: kernel NULL pointer dereference, address: 0000000000000000
[   61.499381] #PF: supervisor read access in kernel mode
[   61.500939] #PF: error_code(0x0000) - not-present page
[   61.503357] Oops: Oops: 0000 [#1] SMP NOPTI
[   61.510970] RIP: 0010:ublk_init_queues+0x4e/0x1e0 [ublk_drv]
[   61.534877] Call Trace:
[   61.536210]  ublk_ctrl_add_dev+0x31a/0x5e0 [ublk_drv]
[   61.537736]  ublk_ctrl_uring_cmd+0x7f/0x754 [ublk_drv]
[   61.539247]  io_uring_cmd+0xa3/0x140
[   61.556762] RIP: 0033:0x0
[   61.571589] Modules linked in: ublk_drv xt_MASQUERADE bridge xt_set ip_set 
nft_chain_nat nf_nat xt_addrtype xfrm_user xfrm_algo overlay 8021q garp mrp stp 
llc xt_owner xt_tcpudp xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 
nft_compat nf_tables binfmt_misc nls_iso8859_1 mlx5_ib macsec ib_uverbs ib_core 
kvm_amd ccp mlx5_core kvm mlxfw irqbypass joydev tls psample hid_generic 
serio_raw polyval_clmulni hid_hyperv ghash_clmulni_intel aesni_intel hyperv_drm 
hv_netvsc hid hyperv_keyboard sch_fq_codel dm_multipath msr nvme_fabrics 
efi_pstore nfnetlink ip_tables x_tables autofs4
[   61.679985] RIP: 0010:ublk_init_queues+0x4e/0x1e0 [ublk_drv]
[   61.706539] note: iou-wrk-7260[7261] exited with irqs disabled

** Affects: linux-azure (Ubuntu)
     Importance: Undecided
         Status: New


** Tags: kernel-bug regression-update ublk

** Patch added: "0001-ublk-reorder-tag_set-initialization-before-queue-all.patch"
   https://bugs.launchpad.net/bugs/2154635/+attachment/5974520/+files/0001-ublk-reorder-tag_set-initialization-before-queue-all.patch


Title:
  ublk ADD_DEV Oops in ublk_init_queues on 6.17.0-1015-azure
