[Kernel-packages] [Bug 1960256] Re: compilation errors due to "peermem" module
** Summary changed:

- compilation due to "peermem" module
+ compilation errors due to "peermem" module

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to nvidia-graphics-drivers-470 in Ubuntu.
https://bugs.launchpad.net/bugs/1960256

Title:
  compilation errors due to "peermem" module

Status in nvidia-graphics-drivers-470 package in Ubuntu:
  New

Bug description:
  Installed with the following:

  apt-get install --no-install-recommends nvidia-driver-470 nvidia-modprobe libnvidia-cfg1-470 libnvidia-common-470 libnvidia-compute-470 libnvidia-decode-470 libnvidia-encode-470 libnvidia-extra-470 libnvidia-fbc1-470 libnvidia-gl-470 libnvidia-ifr1-470 nvidia-compute-utils-470 nvidia-dkms-470 nvidia-driver-470 nvidia-kernel-common-470 nvidia-kernel-source-470 nvidia-utils-470 xserver-xorg-video-nvidia-470

  Seems to generate /var/crash/nvidia-dkms-470.0.crash (tail -20):

  /usr/bin/ld.bfd -m elf_x86_64 -z max-page-size=0x20-r -o /var/lib/dkms/nvidia/470.103.01/build/nvidia-drm.o /var/lib/dkms/nvidia/470.103.01/build/nvidia-drm/nvidia-drm.o /var/lib/dkms/nvidia/470.103.01/build/nvidia-drm/nvidia-drm-drv.o […]
  { echo /var/lib/dkms/nvidia/470.103.01/build/nvidia-drm/nvidia-drm.o […] /var/lib/dkms/nvidia/470.103.01/build/nvidia-drm/nvidia-drm-format.o; echo; } > /var/lib/dkms/nvidia/470.103.01/build/nvidia-drm.mod
  make -f ./scripts/Makefile.modpost
  sed 's/ko$/o/' /var/lib/dkms/nvidia/470.103.01/build/modules.order | scripts/mod/modpost -m -a -i ./Module.symvers -I /var/lib/dkms/nvidia/470.103.01/build/Module.symvers -e /usr/src/ofa_kernel/default/Module.symvers -o /var/lib/dkms/nvidia/470.103.01/build/Module.symvers -s -T -
  FATAL: parse error in symbol dump file
  scripts/Makefile.modpost:93: recipe for target '__modpost' failed
  make[2]: *** [__modpost] Error 1
  Makefile:1675: recipe for target 'modules' failed
  make[1]: *** [modules] Error 2
  make[1]: Leaving directory '/usr/src/linux-headers-5.4.0-97-generic'
  Makefile:80: recipe for target 'modules' failed
  make: *** [modules] Error 2
  DKMSKernelVersion: 5.4.0-97-generic
  Date: Mon Feb 7 11:32:36 2022
  Package: nvidia-dkms-470 470.103.01-0ubuntu0.18.04.1
  PackageVersion: 470.103.01-0ubuntu0.18.04.1
  SourcePackage: nvidia-graphics-drivers-470
  Title: nvidia-dkms-470 470.103.01-0ubuntu0.18.04.1: nvidia kernel module failed to build

  It is added:

  # dkms status -k `uname -r`
  iser, 4.7: added
  kernel-mft-dkms, 4.13.0, 5.4.0-97-generic, x86_64: installed
  knem, 1.1.3.90mlnx1: added
  mlnx-ofed-kernel, 4.7: added
  nvidia, 470.103.01: added
  rshim, 1.8, 5.4.0-97-generic, x86_64: installed
  srp, 4.7: added

  However, doing a build gets:

  dkms build nvidia/470.103.01

  Kernel preparation unnecessary for this kernel. Skipping...
  applying patch disable_fstack-clash-protection_fcf-protection.patch...patching file Kbuild
  Hunk #1 succeeded at 82 (offset 11 lines).

  Building module:
  cleaning build area...
  unset ARCH; [ ! -h /usr/bin/cc ] && export CC=/usr/bin/gcc; env NV_VERBOSE=1 'make' -j16 NV_EXCLUDE_BUILD_MODULES='' KERNEL_UNAME=5.4.0-97-generic IGNORE_XEN_PRESENCE=1 IGNORE_CC_MISMATCH=1 SYSSRC=/lib/modules/5.4.0-97-generic/build LD=/usr/bin/ld.bfd modules.(bad exit status: 2)
  ERROR: Cannot create report: [Errno 17] File exists: '/var/crash/nvidia-dkms-470.0.crash'
  Error! Bad return status for module build on kernel: 5.4.0-97-generic (x86_64)
  Consult /var/lib/dkms/nvidia/470.103.01/build/make.log for more information.

  Which I suspect is related to the "peermem" module:

  { echo /var/lib/dkms/nvidia/470.103.01/build/nvidia-drm/nvidia-drm.o /var/lib/dkms/nvidia/470.103.01/build/nvidia-drm/nvidia-drm-drv.o [...] /var/lib/dkms/nvidia/470.103.01/build/nvidia-drm/nvidia-drm-format.o; echo; } > /var/lib/dkms/nvidia/470.103.01/build/nvidia-drm.mod
  /usr/bin/ld.bfd -m elf_x86_64 -z max-page-size=0x20-r -o /var/lib/dkms/nvidia/470.103.01/build/nvidia-peermem.o /var/lib/dkms/nvidia/470.103.01/build/nvidia-peermem/nvidia-peermem.o
  { echo /var/lib/dkms/nvidia/470.103.01/build/nvidia-peermem/nvidia-peermem.o; echo; } > /var/lib/dkms/nvidia/470.103.01/build/nvidia-peermem.mod
  make -f ./scripts/Makefile.modpost
  sed 's/ko$/o/' /var/lib/dkms/nvidia/470.103.01/build/modules.order | scripts/mod/modpost -m -a -i ./Module.symvers -I /var/lib/dkms/nvidia/470.103.01/build/Module.symvers -e /usr/src/ofa_kernel/default/Module.symvers -o /var/lib/dkms/nvidia/470.103.01/build/Module.symvers -s -T -
  FATAL: parse error in symbol dump file
  scripts/Makefile.modpost:93: recipe for target '__modpost' failed
  make[2]: *** [__modpost] Error 1
  Makefile:1675: recipe for target 'modules' failed
  make[1]: *** [modules] Error 2
  make[1]: Leaving directory '/usr/src/linux-headers-5.4.0-97-generic'
  Makefile:80: recipe for target 'modules' failed
  make: *** [modules]
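
The "FATAL: parse error in symbol dump file" message does not say which of the symbol dump files passed to modpost failed to parse. One way to narrow it down might be to compare the column layout of the files named in the modpost command above. The following is only a diagnostic sketch: symvers_field_counts is a hypothetical helper name, not part of dkms or kbuild, and the idea that the two files differ in layout is an assumption, not something the log confirms.

```shell
#!/bin/sh
# Hypothetical helper (not part of dkms or kbuild): summarize how many
# tab-separated fields each line of a Module.symvers file contains.
# A file whose lines do not all share one field count, or whose count
# differs from the kernel's own Module.symvers, is a candidate for the
# modpost parse error (assumption, not confirmed by the log).
symvers_field_counts() {
    # $1: path to a Module.symvers file
    awk -F '\t' '
        { counts[NF]++ }
        END {
            for (n in counts)
                printf "%d fields: %d lines\n", n, counts[n]
        }
    ' "$1" | sort -n
}

# Example invocations, using the paths that appear in the log above:
#   symvers_field_counts /usr/src/linux-headers-5.4.0-97-generic/Module.symvers
#   symvers_field_counts /usr/src/ofa_kernel/default/Module.symvers
```

If the two invocations report different field counts, that would point at the file passed with -e rather than at the nvidia-peermem sources themselves.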
[Kernel-packages] [Bug 1960256] [NEW] compilation due to "peermem" module
Public bug reported:

Installed with the following:

apt-get install --no-install-recommends nvidia-driver-470 nvidia-modprobe libnvidia-cfg1-470 libnvidia-common-470 libnvidia-compute-470 libnvidia-decode-470 libnvidia-encode-470 libnvidia-extra-470 libnvidia-fbc1-470 libnvidia-gl-470 libnvidia-ifr1-470 nvidia-compute-utils-470 nvidia-dkms-470 nvidia-driver-470 nvidia-kernel-common-470 nvidia-kernel-source-470 nvidia-utils-470 xserver-xorg-video-nvidia-470

Seems to generate /var/crash/nvidia-dkms-470.0.crash (tail -20):

/usr/bin/ld.bfd -m elf_x86_64 -z max-page-size=0x20-r -o /var/lib/dkms/nvidia/470.103.01/build/nvidia-drm.o /var/lib/dkms/nvidia/470.103.01/build/nvidia-drm/nvidia-drm.o /var/lib/dkms/nvidia/470.103.01/build/nvidia-drm/nvidia-drm-drv.o […]
{ echo /var/lib/dkms/nvidia/470.103.01/build/nvidia-drm/nvidia-drm.o […] /var/lib/dkms/nvidia/470.103.01/build/nvidia-drm/nvidia-drm-format.o; echo; } > /var/lib/dkms/nvidia/470.103.01/build/nvidia-drm.mod
make -f ./scripts/Makefile.modpost
sed 's/ko$/o/' /var/lib/dkms/nvidia/470.103.01/build/modules.order | scripts/mod/modpost -m -a -i ./Module.symvers -I /var/lib/dkms/nvidia/470.103.01/build/Module.symvers -e /usr/src/ofa_kernel/default/Module.symvers -o /var/lib/dkms/nvidia/470.103.01/build/Module.symvers -s -T -
FATAL: parse error in symbol dump file
scripts/Makefile.modpost:93: recipe for target '__modpost' failed
make[2]: *** [__modpost] Error 1
Makefile:1675: recipe for target 'modules' failed
make[1]: *** [modules] Error 2
make[1]: Leaving directory '/usr/src/linux-headers-5.4.0-97-generic'
Makefile:80: recipe for target 'modules' failed
make: *** [modules] Error 2
DKMSKernelVersion: 5.4.0-97-generic
Date: Mon Feb 7 11:32:36 2022
Package: nvidia-dkms-470 470.103.01-0ubuntu0.18.04.1
PackageVersion: 470.103.01-0ubuntu0.18.04.1
SourcePackage: nvidia-graphics-drivers-470
Title: nvidia-dkms-470 470.103.01-0ubuntu0.18.04.1: nvidia kernel module failed to build

It is added:

# dkms status -k `uname -r`
iser, 4.7: added
kernel-mft-dkms, 4.13.0, 5.4.0-97-generic, x86_64: installed
knem, 1.1.3.90mlnx1: added
mlnx-ofed-kernel, 4.7: added
nvidia, 470.103.01: added
rshim, 1.8, 5.4.0-97-generic, x86_64: installed
srp, 4.7: added

However, doing a build gets:

dkms build nvidia/470.103.01

Kernel preparation unnecessary for this kernel. Skipping...
applying patch disable_fstack-clash-protection_fcf-protection.patch...patching file Kbuild
Hunk #1 succeeded at 82 (offset 11 lines).

Building module:
cleaning build area...
unset ARCH; [ ! -h /usr/bin/cc ] && export CC=/usr/bin/gcc; env NV_VERBOSE=1 'make' -j16 NV_EXCLUDE_BUILD_MODULES='' KERNEL_UNAME=5.4.0-97-generic IGNORE_XEN_PRESENCE=1 IGNORE_CC_MISMATCH=1 SYSSRC=/lib/modules/5.4.0-97-generic/build LD=/usr/bin/ld.bfd modules.(bad exit status: 2)
ERROR: Cannot create report: [Errno 17] File exists: '/var/crash/nvidia-dkms-470.0.crash'
Error! Bad return status for module build on kernel: 5.4.0-97-generic (x86_64)
Consult /var/lib/dkms/nvidia/470.103.01/build/make.log for more information.

Which I suspect is related to the "peermem" module:

{ echo /var/lib/dkms/nvidia/470.103.01/build/nvidia-drm/nvidia-drm.o /var/lib/dkms/nvidia/470.103.01/build/nvidia-drm/nvidia-drm-drv.o [...] /var/lib/dkms/nvidia/470.103.01/build/nvidia-drm/nvidia-drm-format.o; echo; } > /var/lib/dkms/nvidia/470.103.01/build/nvidia-drm.mod
/usr/bin/ld.bfd -m elf_x86_64 -z max-page-size=0x20-r -o /var/lib/dkms/nvidia/470.103.01/build/nvidia-peermem.o /var/lib/dkms/nvidia/470.103.01/build/nvidia-peermem/nvidia-peermem.o
{ echo /var/lib/dkms/nvidia/470.103.01/build/nvidia-peermem/nvidia-peermem.o; echo; } > /var/lib/dkms/nvidia/470.103.01/build/nvidia-peermem.mod
make -f ./scripts/Makefile.modpost
sed 's/ko$/o/' /var/lib/dkms/nvidia/470.103.01/build/modules.order | scripts/mod/modpost -m -a -i ./Module.symvers -I /var/lib/dkms/nvidia/470.103.01/build/Module.symvers -e /usr/src/ofa_kernel/default/Module.symvers -o /var/lib/dkms/nvidia/470.103.01/build/Module.symvers -s -T -
FATAL: parse error in symbol dump file
scripts/Makefile.modpost:93: recipe for target '__modpost' failed
make[2]: *** [__modpost] Error 1
Makefile:1675: recipe for target 'modules' failed
make[1]: *** [modules] Error 2
make[1]: Leaving directory '/usr/src/linux-headers-5.4.0-97-generic'
Makefile:80: recipe for target 'modules' failed
make: *** [modules] Error 2

This does not seem to be supported on (unpatched?) 5.4 kernels, per this thread I found:
https://forums.linuxmint.com/viewtopic.php?p=2106512#p2106512

Downloading and running "NVIDIA-Linux-x86_64-470.103.01.run" directly also fails, UNLESS the following option is used:

  --no-peermem
      Do not install the nvidia-peermem kernel module. This kernel module provides support for peer-to-peer memory sharing with Mellanox HCAs (Host Channel Adapters) via GPUDirect RDMA (Remote Direct Memory Access).

BUT