Bug#1064194: 535.86.05 breaks Xorg
Hi,

On 03/03/2024 18.01, Harald Dunkel wrote:
>> Are you sure that you really need nvidia-vulkan-icd (i.e. nvidia-vulkan-common) and libnvidia-glvkspirv alone isn't enough?
>
> Yes, installing libnvidia-glvkspirv is sufficient as a fix. If I remove this package the problem is back.

Thanks for confirming my assumption. I've moved the libnvidia-glvkspirv dependency to libnvidia-(e)glcore, which seem to make more use of it in experimental. Furthermore, I've promoted nvidia-vulkan-icd to Recommends.

Andreas
Bug#1064194: 535.86.05 breaks Xorg
Hi Andreas,

On 2024-03-01 11:36:43, Andreas Beckmann wrote:
> Hi Harald,
>
> On 24/02/2024 11.31, Harald Dunkel wrote:
>> [INSTALL, DEPENDENCIES] libnvidia-glvkspirv:amd64 535.86.10-1
>> [INSTALL, DEPENDENCIES] nvidia-vulkan-common:amd64 535.86.10-1
>> [INSTALL] nvidia-vulkan-icd:amd64 535.86.10-1
>
> Are you sure that you really need nvidia-vulkan-icd (i.e. nvidia-vulkan-common) and libnvidia-glvkspirv alone isn't enough?

Yes, installing libnvidia-glvkspirv is sufficient as a fix. If I remove this package the problem is back.

Regards
Harri
Bug#1064194: 535.86.05 breaks Xorg
Hi Harald,

On 24/02/2024 11.31, Harald Dunkel wrote:
> [INSTALL, DEPENDENCIES] libnvidia-glvkspirv:amd64 535.86.10-1
> [INSTALL, DEPENDENCIES] nvidia-vulkan-common:amd64 535.86.10-1
> [INSTALL] nvidia-vulkan-icd:amd64 535.86.10-1

Are you sure that you really need nvidia-vulkan-icd (i.e. nvidia-vulkan-common) and libnvidia-glvkspirv alone isn't enough?

I've found the following 'suspicious' string in libnvidia-{glcore,eglcore}.so.*:

    The NVIDIA driver was unable to open 'libnvidia-glvkspirv.so.535.86.05'. This library is required at run time.

which seems to indicate the missing dependency. While the string was already present since the 418 series (which introduced libnvidia-glvkspirv.so.*), the missing library only recently seems to have started causing real trouble.

Andreas
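The check described above (looking for a telltale string inside the driver libraries) can be sketched in Python, roughly in the spirit of `strings | grep`. This is only an illustration and not necessarily how the string was found in this thread; the function name `find_strings` and the minimum run length of 4 are assumptions made for the example.

```python
import re
from pathlib import Path


def find_strings(path, needle, min_len=4):
    """Return printable-ASCII runs in a binary file that contain `needle`.

    Hypothetical helper for illustration; it reads the whole file into memory.
    """
    data = Path(path).read_bytes()
    # A "string" here is a run of at least min_len printable ASCII bytes.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    wanted = needle.encode("ascii")
    return [
        match.group().decode("ascii")
        for match in pattern.finditer(data)
        if wanted in match.group()
    ]


# Possible use, with the library path taken from the backtrace in this report:
#   find_strings("/lib/x86_64-linux-gnu/libnvidia-glcore.so.535.86.05",
#                "libnvidia-glvkspirv")
```

A hit only shows that the library name is embedded in the binary; it does not prove that the library is actually loaded at run time.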
Bug#1064194: 535.86.05 breaks Xorg
I went through the list of packages added by nvidia-driver-full, one after the other. Of course it was the very last one making xorg work again:

```
Aptitude 0.8.13: log report
Sat, Feb 24 2024 11:23:56 +0100

IMPORTANT: this log only lists intended actions; actions which fail due to dpkg problems may not be completed.

Will install 3 packages, and remove 0 packages.
41.3 MB of disk space will be used
[INSTALL, DEPENDENCIES] libnvidia-glvkspirv:amd64 535.86.10-1
[INSTALL, DEPENDENCIES] nvidia-vulkan-common:amd64 535.86.10-1
[INSTALL] nvidia-vulkan-icd:amd64 535.86.10-1

Log complete.
===
```

I have verified by removing all packages installed during testing again, and installing only the nvidia-vulkan-icd package: Bingo.

Hope this helps.

BTW. Nvidia released the 550 series for Linux.

Regards
Harri
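The elimination procedure described above (add candidates one by one until X starts, then verify with only the last one) can be written down as a small sketch. The callback `xorg_starts` is hypothetical: in practice it stands for installing the given packages and manually checking whether Xorg comes up. The sketch also ignores dependency resolution, which in the real test pulled in further packages.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def find_fixing_package(
    candidates: Iterable[str],
    xorg_starts: Callable[[List[str]], bool],
) -> Tuple[Optional[str], bool]:
    """Add candidates cumulatively until the check passes.

    Returns (package, verified), where `verified` tells whether that
    package alone, without the earlier candidates, also passes the check.
    """
    installed: List[str] = []
    for pkg in candidates:
        installed.append(pkg)
        if xorg_starts(list(installed)):
            # Verification step: retry with only this one package.
            return pkg, xorg_starts([pkg])
    return None, False


# Example with a stub check (hypothetical package names):
#   find_fixing_package(["pkg-a", "pkg-b", "pkg-c"], lambda p: "pkg-c" in p)
```

If `verified` comes back false, the fix depends on a combination of packages, and a finer-grained search would be needed.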
Bug#1064194: 535.86.05 breaks Xorg
On 24/02/2024 10.26, Harald Dunkel via pkg-nvidia-devel wrote:
> I found something: Apparently I have to install the *nvidia-driver-full* package to avoid the crash at start time.

Very good finding. Something I hadn't considered ...

> I would guess there is a missing dependency in the nvidia-driver package.

probably in xserver-xorg-video-nvidia ... and probably a nonobvious dependency on something that gets dlopen()ed ... and the dlopen() error silently got ignored, but the symbol used anyway.

Can you take a look at

    lsof -p | grep lib

to see which nvidia libraries are loaded into Xorg?

If you identify one (or more) of newly installed library packages, can you uninstall nvidia-driver-full (but keep the dependencies) and the library in question to see if the segfault returns? Maybe needs some iterations ...

And thereafter for verification, remove everything added by nvidia-driver-full and just add that one library that seems to cause/solve the segfault.

Thanks
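As an alternative view of the same information that `lsof` reports, the libraries mapped into a process can be read from `/proc/<pid>/maps` on Linux. The following is a minimal sketch under that assumption; the function name `mapped_libraries` is made up for the example, and reading the maps file of the Xorg process normally requires root.

```python
def mapped_libraries(maps_text, substring="nvidia"):
    """Return the sorted, de-duplicated paths in /proc/<pid>/maps content
    whose pathname contains `substring`."""
    libs = set()
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, dev, inode, [pathname].
        fields = line.split(None, 5)
        if len(fields) == 6 and substring in fields[5]:
            libs.add(fields[5].strip())
    return sorted(libs)


# Possible use (the PID of the running Xorg process has to be filled in):
#   from pathlib import Path
#   print(mapped_libraries(Path(f"/proc/{pid}/maps").read_text()))
```

Comparing the output between a working and a crashing setup would show which libraries are only present in the working one.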
Bug#1064194: 535.86.05 breaks Xorg
I found something: Apparently I have to install the *nvidia-driver-full* package to avoid the crash at start time.

I would guess there is a missing dependency in the nvidia-driver package. These packages were added by nvidia-driver-full:

```
Aptitude 0.8.13: log report
Sat, Feb 24 2024 08:57:19 +

IMPORTANT: this log only lists intended actions; actions which fail due to dpkg problems may not be completed.

Will install 25 packages, and remove 0 packages.
497 MB of disk space will be used
[INSTALL, DEPENDENCIES] libcudadebugger1:amd64 535.86.10-1
[INSTALL, DEPENDENCIES] libgles-nvidia1:amd64 535.86.10-1
[INSTALL, DEPENDENCIES] libgles-nvidia2:amd64 535.86.10-1
[INSTALL, DEPENDENCIES] libgles1:amd64 1.7.0-1
[INSTALL, DEPENDENCIES] libnvcuvid1:amd64 535.86.10-1
[INSTALL, DEPENDENCIES] libnvidia-allocator1:amd64 535.86.10-1
[INSTALL, DEPENDENCIES] libnvidia-api1:amd64 535.86.10-1
[INSTALL, DEPENDENCIES] libnvidia-egl-gbm1:amd64 1.1.1-1
[INSTALL, DEPENDENCIES] libnvidia-encode1:amd64 535.86.10-1
[INSTALL, DEPENDENCIES] libnvidia-fbc1:amd64 535.86.10-1
[INSTALL, DEPENDENCIES] libnvidia-glvkspirv:amd64 535.86.10-1
[INSTALL, DEPENDENCIES] libnvidia-ngx1:amd64 535.86.10-1
[INSTALL, DEPENDENCIES] libnvidia-nvvm4:amd64 535.86.10-1
[INSTALL, DEPENDENCIES] libnvidia-opticalflow1:amd64 535.86.10-1
[INSTALL, DEPENDENCIES] libnvidia-rtcore:amd64 535.86.10-1
[INSTALL, DEPENDENCIES] libnvoptix1:amd64 535.86.10-1
[INSTALL, DEPENDENCIES] nvidia-cuda-mps:amd64 535.86.10-1
[INSTALL, DEPENDENCIES] nvidia-opencl-common:amd64 535.86.10-1
[INSTALL, DEPENDENCIES] nvidia-opencl-icd:amd64 535.86.10-1
[INSTALL, DEPENDENCIES] nvidia-powerd:amd64 535.86.10-1
[INSTALL, DEPENDENCIES] nvidia-settings:amd64 525.147.05-1
[INSTALL, DEPENDENCIES] nvidia-suspend-common:amd64 535.86.10-1
[INSTALL, DEPENDENCIES] nvidia-vulkan-common:amd64 535.86.10-1
[INSTALL, DEPENDENCIES] nvidia-vulkan-icd:amd64 535.86.10-1
[INSTALL] nvidia-driver-full:amd64 535.86.10-1

Log complete.
===
```

On the first, broken upgrade from 525 to 535 (using the "nvidia-driver" package) the libnvidia-egl-wayland1 package was removed. Maybe this is a hint.

Regards
Harri
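To compare package sets between such runs, the `[INSTALL ...]` entries of an aptitude log can be extracted mechanically. The sketch below assumes the entry format shown in the log above (`name:arch version`); the function name `installed_packages` is hypothetical, and other aptitude actions (removals, upgrades) are deliberately not handled.

```python
import re

# Matches "[INSTALL] name:arch version" and
# "[INSTALL, DEPENDENCIES] name:arch version".
INSTALL_RE = re.compile(r"\[INSTALL(?:, DEPENDENCIES)?\]\s+(\S+?):(\S+)\s+(\S+)")


def installed_packages(log_text):
    """Map package name -> version for every INSTALL entry in the log text."""
    return {m.group(1): m.group(3) for m in INSTALL_RE.finditer(log_text)}


# Possible use: packages pulled in by one run but not by another.
#   extra = set(installed_packages(full_log)) - set(installed_packages(small_log))
```

Because the pattern is searched anywhere in the text, it works whether the log has one entry per line or has been flattened into a single line.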
Bug#1064194: Info received (Bug#1064194: 535.86.05 breaks Xorg)
For verification I have installed Bookworm on an external SSD (using debootstrap), added Gnome and nvidia-graphics-drivers 525 and verified, upgraded to Testing and verified, and upgraded to Unstable and verified (still 525.147.05-7~deb12u1). X worked.

Next I cherry-picked nvidia 535.86.10-1 from experimental and tried again: Now it's broken.

Please note there is no xorg.conf.

Regards
Harri
Bug#1064194: 535.86.05 breaks Xorg
On 2024-02-19 00:29:15, Andreas Beckmann wrote:
> I didn't spot anything obvious in the upstream changelog for the newer releases so I'll just upload the next (not latest) upstream release (because that's the one I've ready right now) in the hope it improves things.

It's still broken. I tried Debian's linux-image-6.6.15-amd64 kernel and upstream's kernel version 6.7.5.

Upstream provides version 535.154.05. Is there some script to create an nvidia-graphics-drivers source package from it?

Regards
Harri
Bug#1064194: 535.86.05 breaks Xorg
On 18/02/2024 09.13, Harald Dunkel via pkg-nvidia-devel wrote:
> Rebuilding the package using the installed libc6 and tools did not help. log file is attached.

Rebuilding the package won't change anything since we are only repacking upstream provided binaries.

I didn't spot anything obvious in the upstream changelog for the newer releases so I'll just upload the next (not latest) upstream release (because that's the one I've ready right now) in the hope it improves things.

Andreas
Bug#1064194: 535.86.05 breaks Xorg
Package: libnvidia-glcore
Version: 535.86.05-1

After upgrading to version 535.86.05-1 Xorg dies with SIGSEGV:

:
[33.382] (II) Initializing extension NV-GLX
[33.382] (II) Initializing extension NV-CONTROL
[33.382] (II) Initializing extension XINERAMA
[33.386] (EE)
[33.386] (EE) Backtrace:
[33.386] (EE) 0: /usr/lib/xorg/Xorg (OsLookupColor+0x14d) [0x5616c2fc5f9d]
[33.386] (EE) 1: /lib/x86_64-linux-gnu/libc.so.6 (__sigaction+0x40) [0x7fa9bfadf510]
[33.386] (EE) 2: /lib/x86_64-linux-gnu/libnvidia-glcore.so.535.86.05 (_nv012glcore+0x93ac) [0x7fa9bbd5850c]
[33.387] (EE) 3: /lib/x86_64-linux-gnu/libnvidia-glcore.so.535.86.05 (_nv012glcore+0x95c8) [0x7fa9bbd58728]
[33.387] (EE) 4: /lib/x86_64-linux-gnu/libnvidia-glcore.so.535.86.05 (_nv012glcore+0x22717) [0x7fa9bbd71877]
[33.387] (EE) unw_get_proc_name failed: no unwind info found [-10]
[33.387] (EE) 5: /usr/lib/xorg/modules/extensions/libglxserver_nvidia.so (?+0x0) [0x7fa9bdeca5c8]
[33.387] (EE)
[33.387] (EE) Segmentation fault at address 0x88
[33.387] (EE) Fatal server error:
[33.387] (EE) Caught signal 11 (Segmentation fault). Server aborting
[33.387] (EE)
[33.387] (EE)
:

Rebuilding the package using the installed libc6 and tools did not help. Log file is attached.

Regards
Harri

[32.595] X.Org X Server 1.21.1.11
X Protocol Version 11, Revision 0
[32.595] Current Operating System: Linux cecil.afaics.de 6.7.5-raw #1 SMP PREEMPT_DYNAMIC Fri Feb 16 20:37:14 CET 2024 x86_64
[32.595] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-6.7.5-raw root=UUID=d6b6d2f3-8213-4221-9a69-df7dc69acc45 ro net.ifnames=0 mitigations=off video=vesafb:ywrap,mtrr:3
[32.595] xorg-server 2:21.1.11-2 (https://www.debian.org/support)
[32.595] Current version of pixman: 0.42.2
[32.595] Before reporting problems, check http://wiki.x.org to make sure that you have the latest version.
[32.595] Markers: (--) probed, (**) from config file, (==) default setting, (++) from command line, (!!) notice, (II) informational, (WW) warning, (EE) error, (NI) not implemented, (??) unknown.
[32.595] (==) Log file: "/var/log/Xorg.4.log", Time: Sun Feb 18 08:32:08 2024
[32.595] (==) Using config file: "/etc/X11/xorg.conf"
[32.595] (==) Using config directory: "/etc/X11/xorg.conf.d"
[32.595] (==) Using system config directory "/usr/share/X11/xorg.conf.d"
[32.595] (==) ServerLayout "Layout0"
[32.595] (**) |-->Screen "Screen0" (0)
[32.595] (**) | |-->Monitor "Monitor0"
[32.595] (**) | |-->Device "Device0"
[32.595] (**) |-->Input Device "Keyboard0"
[32.595] (**) |-->Input Device "Mouse0"
[32.595] (**) Option "Xinerama" "0"
[32.595] (==) Automatically adding devices
[32.595] (==) Automatically enabling devices
[32.595] (==) Automatically adding GPU devices
[32.595] (==) Automatically binding GPU devices
[32.595] (==) Max clients allowed: 256, resource mask: 0x1f
[32.595] (WW) The directory "/usr/share/fonts/X11/cyrillic" does not exist.
[32.595] Entry deleted from font path.
[32.595] (==) FontPath set to: /usr/share/fonts/X11/misc, /usr/share/fonts/X11/100dpi/:unscaled, /usr/share/fonts/X11/75dpi/:unscaled, /usr/share/fonts/X11/Type1, /usr/share/fonts/X11/100dpi, /usr/share/fonts/X11/75dpi, built-ins
[32.595] (==) ModulePath set to "/usr/lib/xorg/modules"
[32.595] (WW) Hotplugging is on, devices using drivers 'kbd', 'mouse' or 'vmmouse' will be disabled.
[32.595] (WW) Disabling Keyboard0
[32.595] (WW) Disabling Mouse0
[32.595] (II) Loader magic: 0x5616c3052f00
[32.595] (II) Module ABI versions:
[32.595] X.Org ANSI C Emulation: 0.4
[32.595] X.Org Video Driver: 25.2
[32.595] X.Org XInput driver : 24.4
[32.595] X.Org Server Extension : 10.0
[32.596] (--) using VT number 2
[32.596] (II) systemd-logind: logind integration requires -keeptty and -keeptty was not provided, disabling logind integration
[32.596] (II) xfree86: Adding drm device (/dev/dri/card0)
[32.596] (II) Platform probe for /sys/devices/pci:00/:00:01.1/:02:00.0/drm/card0
[32.597] (--) PCI:*(2@0:0:0) 10de:1f82:19da:1546 rev 161, Mem @ 0xa200/16777216, 0x9000/268435456, 0xa000/33554432, I/O @ 0x3000/128, BIOS @ 0x/131072
[32.597] (II) "glx" will be loaded. This was enabled by default and also specified in the config file.
[32.597] (II) LoadModule: "dbe"
[32.597] (II) Module "dbe" already built-in
[32.597] (II) LoadModule: "extmod"
[32.597] (II) Module "extmod" already built-in
[32.597] (II) LoadModule: "glx"
[32.597] (II) Loading /usr/lib/xorg/modules/extensions/libglx.so
[32.597] (II) Module glx: vendor="X.Org Foundation"
[32.597] compiled for 1.21.1.11, module version = 1.0.0
[32.597] ABI class: X.Org Server Extension,