Hi Julian,

Please find attached the "consolidated" patch as "final-patch.patch" (I had
3-4 patches, which I merged into one), in case it is useful. As I mentioned,
I used Claude Code to support this process, and iterated several times until
a working patch was produced.  I also did the same using Gemini, but Gemini's
patch seems a little messier, although it also "works" (i.e. the module
compiles).  I can share the patch produced with Gemini as well for reference
when I'm back home, although the one from Claude seemed the cleanest.

Thanks,
Jonas.


Jonás Andradas

GPG Fingerprint:  678F 7BD0 83C3 28CE 9E8F
                           3F7F 4D87 9996 E0C6 9372

On Sun, May 10, 2026 at 8:17 AM Julian Gilbey <[email protected]> wrote:

> Dear Jonas,
>
> Thanks for this tantalising message!
>
> Until the maintainer is able to provide a "proper" patch, would you be
> able to share your patch?  The maintainer might well then be able to
> check it and either approve or improve on it.
>
> Best wishes,
>
>    Julian
>
> On Fri, May 08, 2026 at 08:50:33AM +0200, Jonas Andradas wrote:
> > Package: nvidia-kernel-dkms
> > Version: 550.163.01-5
> > Followup-For: Bug #1135362
> > X-Debbugs-Cc: [email protected], [email protected]
> > User: [email protected]
> > Usertags: amd64
> > Control: tags -1 ftbfs
> >
> > Dear Maintainer,
> >
> > I am experiencing this as well in my Debian sid system. I had a similar
> > issue a couple of weeks ago in a Proxmox host (running Debian trixie), as
> > Proxmox provides its own kernel and it was bumped to version 7.  Leveraging
> > AI (Gemini and Claude) I managed to obtain patches for the
> > nvidia-kernel-dkms package in trixie, so that it would compile for
> > Proxmox's kernel 7. These compile and I get the functionality, so for me it
> > was "good enough" for my homelab, but I would not be so bold as to presume
> > they are the best patch possible. If these patches can be useful to see the
> > changes that were needed to make this work, such as the ones below, I could
> > upload the file under debian/patches that, applied to the trixie package,
> > allowed me to build it for Proxmox's kernel 7:
> >
> >      - conftest.sh: detect NV_VM_AREA_STRUCT_HAS___VM_FLAGS, absence of
> >        dma_map_ops.map_resource, void return of dma_fence_signal, and
> >        drm_mode_config_funcs.fb_create format_info argument
> >      - nv-mm.h: handle vma->vm_flags cast for kernels without __vm_flags
> >      - nv-mmap.c: wrap VMA_LOCK_OFFSET and __is_vma_write_locked for
> >        1-arg form
> >      - nv-time.h: compat shim for removed in_irq() macro
> >      - nv-dma.c: guard dma_map_ops.map_resource access
> >      - header-presence-tests.mk: add drm/drm_print.h detection
> >      - nvidia-drm-priv.h: include drm_print.h for
> >        DRM_ERROR/DRM_INFO/DRM_DEBUG
> >      - nvidia-dma-fence-helper.h: handle void dma_fence_signal return
> >        type
> >      - nvidia-drm-helper.h: use for_each_new_*_in_state iterators
> >      - nvidia-drm-drv.c, nvidia-drm-fb.c: handle fb_create format_info
> >        arg
> >
> >
> > Thanks,
> > Jonas.
>
From: Jonas Andradas <[email protected]>
Date: Fri, 08 May 2026 00:00:00 +0000
Subject: add LINUXINCLUDE to NV_CONFTEST_CFLAGS for split kernel header support

Debian (and other distributions) split kernel headers into an
architecture-specific package and a common package.  The common headers
(including linux/kconfig.h and linux/stdarg.h) live in a separate directory
that is only reachable via LINUXINCLUDE (which already contains the correct
-I flags for both trees).

Without LINUXINCLUDE in NV_CONFTEST_CFLAGS the conftest compilation probes
cannot find kconfig.h, so IS_ENABLED() calls in kernel headers cause every
conftest to fail, and linux/stdarg.h is also undetected, breaking the
nv_stdarg.h include path.

Add $(LINUXINCLUDE) to NV_CONFTEST_CFLAGS so that conftest probes use the
same include search path as the kernel build itself.

---
 Kbuild | 1 +
 1 file changed, 1 insertion(+)

--- a/Kbuild
+++ b/Kbuild
@@ -186,6 +186,7 @@ NV_CFLAGS_FROM_CONFTEST := $(shell $(NV_
 NV_CONFTEST_CFLAGS =
 NV_CONFTEST_CFLAGS += $(NV_CFLAGS_FROM_KBUILD)
+NV_CONFTEST_CFLAGS += $(LINUXINCLUDE)
 NV_CONFTEST_CFLAGS += $(NV_CFLAGS_FROM_CONFTEST) $(ccflags-y) -fno-pie
 NV_CONFTEST_CFLAGS += $(call cc-disable-warning,pointer-sign)
 NV_CONFTEST_CFLAGS += $(call cc-option,-fshort-wchar,)

--
From: Jonas Andradas <[email protected]>
Date: Fri, 08 May 2026 00:00:00 +0000
Subject: backport VMA lock API changes from Linux 6.15

Linux 6.15 (and thus Linux 7.0) renamed VMA_LOCK_OFFSET to
VM_REFCNT_EXCLUDE_READERS_FLAG and changed __is_vma_write_locked()
from a two-argument form (vma, &seq) to a one-argument form (vma),
with __vma_raw_mm_seqnum(vma) providing the sequence number separately.

Add a backward-compatible alias and conditional compilation so the
manual VMA write-lock implementation in nv_vma_start_write() builds
against both the old and new APIs.

---
 nvidia/nv-mmap.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

--- a/nvidia/nv-mmap.c
+++ b/nvidia/nv-mmap.c
@@ -841,6 +841,15 @@ void NV_API_CALL nv_set_safe_to_mmap_lo

 #if !NV_CAN_CALL_VMA_START_WRITE
+
+/*
+ * VMA_LOCK_OFFSET was renamed to VM_REFCNT_EXCLUDE_READERS_FLAG in Linux 6.15.
+ * __is_vma_write_locked() dropped its second argument in the same release;
+ * use __vma_raw_mm_seqnum() to obtain the sequence number separately.
+ */
+#ifndef VMA_LOCK_OFFSET
+#define VMA_LOCK_OFFSET VM_REFCNT_EXCLUDE_READERS_FLAG
+#define NV_VMA_LOCK_API_NEW
+#endif
+
 static NvBool nv_vma_enter_locked(struct vm_area_struct *vma, NvBool detaching)
 {
     NvU32 tgt_refcnt = VMA_LOCK_OFFSET;
@@ -889,7 +898,12 @@ void nv_vma_start_write(struct vm_area_s
 {
     NvU32 mm_lock_seq;
     NvBool locked;
-    if (__is_vma_write_locked(vma, &mm_lock_seq))
+#ifdef NV_VMA_LOCK_API_NEW
+    mm_lock_seq = __vma_raw_mm_seqnum(vma);
+    if (__is_vma_write_locked(vma))
+#else
+    if (__is_vma_write_locked(vma, &mm_lock_seq))
+#endif
         return;

     locked = nv_vma_enter_locked(vma, NV_FALSE);

--
From: Jonas Andradas <[email protected]>
Date: Fri, 08 May 2026 00:00:00 +0000
Subject: fix dma_fence_signal void return type in Linux 7.0

dma_fence_signal() and dma_fence_signal_locked() changed their return
type from int to void in Linux 7.0.  The nv_dma_fence_signal* wrappers
tried to return the (now void) value, causing a build error:

  error: void value not ignored as it ought to be

Since no callers of these wrappers inspect the return value, change both
wrappers to void.

---
 nvidia-drm/nvidia-dma-fence-helper.h | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

--- a/nvidia-drm/nvidia-dma-fence-helper.h
+++ b/nvidia-drm/nvidia-dma-fence-helper.h
@@ -93,17 +93,17 @@ static inline signed long nv_dma_fence_w
 #endif
 }

-static inline int nv_dma_fence_signal(nv_dma_fence_t *fence) {
+static inline void nv_dma_fence_signal(nv_dma_fence_t *fence) {
 #if defined(NV_LINUX_FENCE_H_PRESENT)
-    return fence_signal(fence);
+    fence_signal(fence);
 #else
-    return dma_fence_signal(fence);
+    dma_fence_signal(fence);
 #endif
 }

-static inline int nv_dma_fence_signal_locked(nv_dma_fence_t *fence) {
+static inline void nv_dma_fence_signal_locked(nv_dma_fence_t *fence) {
 #if defined(NV_LINUX_FENCE_H_PRESENT)
-    return fence_signal_locked(fence);
+    fence_signal_locked(fence);
 #else
-    return dma_fence_signal_locked(fence);
+    dma_fence_signal_locked(fence);
 #endif
 }

--
From: Jonas Andradas <[email protected]>
Date: Fri, 08 May 2026 00:00:00 +0000
Subject: fix PAHOLE shell quoting for Linux 6.15+ gen-btf.sh

Linux 6.15 introduced scripts/gen-btf.sh which calls ${PAHOLE} directly
without eval.  The NVIDIA Makefile sets PAHOLE to an awk one-liner that
contains embedded single quotes when scripts/pahole-flags.sh is absent.
Bash's unquoted ${PAHOLE} expansion word-splits this value but does NOT
process the shell quoting, so awk receives "'BEGIN" (with the literal
single quote) as its program text and fails with:

  awk: cmd. line:1: 'BEGIN
  awk: cmd. line:1: ^ invalid char ''' in expression
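
The difference between the two expansions can be reproduced without awk or
pahole.  The sketch below is only an illustration: show_args and CMD are
hypothetical stand-ins for awk and for the PAHOLE value, not names taken from
the NVIDIA Makefile or from the kernel scripts.

```shell
#!/bin/sh
# Print each argument on its own line so the word boundaries are visible.
show_args() {
    for a in "$@"; do
        printf 'arg: %s\n' "$a"
    done
}

# Hypothetical stand-in for a PAHOLE value of the form "awk '<program>'".
CMD="show_args 'BEGIN { print }'"

# Unquoted expansion: the value is split on whitespace only, and the single
# quotes stay literal, so the program text arrives as four separate words.
${CMD}

# eval re-parses the value as shell input, so the single quotes group the
# program text into one argument.
eval "${CMD}"
```

With the unquoted expansion the first argument is 'BEGIN including the
literal quote, which matches the token awk complains about above; with eval
the whole program text arrives as a single argument.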

Add nvidia-pahole-wrapper.sh which replicates the awk one-liner's
behavior as a proper executable: it appends ,c++,c++11 to any
--lang_exclude= argument and ignores the exit status (workaround for
DW_TAG_rvalue_reference_type errors in nvidia-modeset.ko).

Update Makefile to use this wrapper instead of the awk one-liner when
gen-btf.sh is present.
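
As a rough sketch of the argument rewriting only, the function below appends
the suffix to any --lang_exclude= argument while keeping every argument as a
separate word.  rewrite_args is a hypothetical helper for illustration, and
the arguments in the example call are made-up sample values; it prints the
resulting list instead of invoking pahole.

```shell
#!/bin/sh
# Hypothetical helper: rebuild the positional parameters one at a time and
# print the result, one argument per line, instead of running pahole.
rewrite_args() {
    for arg in "$@"; do
        shift                      # drop the argument being processed
        case "$arg" in
            --lang_exclude=*) set -- "$@" "${arg},c++,c++11" ;;
            *) set -- "$@" "$arg" ;;
        esac
    done
    printf '%s\n' "$@"
}

# Sample call; the file name contains a space on purpose.
rewrite_args --lang_exclude=rust -J "my module.ko"
```

Rebuilding "$@" with set -- instead of concatenating into a string keeps an
argument that contains whitespace intact, at the cost of a slightly less
obvious loop.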

---
 Makefile                   | 7 +++++--
 nvidia-pahole-wrapper.sh   | 17 +++++++++++++++++
 2 files changed, 22 insertions(+), 2 deletions(-)

--- a/Makefile
+++ b/Makefile
@@ -95,7 +95,10 @@ else
   # If scripts/pahole-flags.sh is not present in the kernel tree, add PAHOLE and
   # PAHOLE_AWK_PROGRAM assignments to PAHOLE_VARIABLES; otherwise assign the
   # empty string to PAHOLE_VARIABLES.
-  PAHOLE_VARIABLES=$(if $(wildcard $(KERNEL_SOURCES)/scripts/pahole-flags.sh),,"PAHOLE=$(AWK) '$(PAHOLE_AWK_PROGRAM)'")
+  # If scripts/gen-btf.sh exists (Linux 6.15+), use a wrapper script: gen-btf.sh
+  # calls ${PAHOLE} directly without eval, so PAHOLE cannot be a shell one-liner
+  # with embedded quoting.
+  PAHOLE_VARIABLES=$(if $(wildcard $(KERNEL_SOURCES)/scripts/pahole-flags.sh),,$(if $(wildcard $(KERNEL_SOURCES)/scripts/gen-btf.sh),"PAHOLE=sh $(CURDIR)/nvidia-pahole-wrapper.sh","PAHOLE=$(AWK) '$(PAHOLE_AWK_PROGRAM)'"))

--- /dev/null
+++ b/nvidia-pahole-wrapper.sh
@@ -0,0 +1,17 @@
+#!/bin/sh
+# NVIDIA pahole wrapper for Linux 6.15+ gen-btf.sh compatibility.
+#
+# gen-btf.sh calls ${PAHOLE} directly without eval, so PAHOLE cannot
+# be a shell one-liner with embedded quoting.  This script replicates
+# the original awk one-liner's behavior:
+#   - appends ,c++,c++11 to any --lang_exclude= argument
+#   - returns success regardless of pahole exit status (workaround
+#     for DW_TAG_rvalue_reference_type errors in nvidia-modeset.ko)
+
+# Rebuild "$@" in place so arguments containing whitespace stay intact.
+for arg in "$@"; do
+    shift
+    case "$arg" in
+        --lang_exclude=*) set -- "$@" "${arg},c++,c++11" ;;
+        *) set -- "$@" "$arg" ;;
+    esac
+done
+pahole "$@" || true
