Bug#1064194: 535.86.05 breaks Xorg

2024-03-03 Thread Andreas Beckmann

Hi,

On 03/03/2024 18.01, Harald Dunkel wrote:

Are you sure that you really need nvidia-vulkan-icd (i.e.
nvidia-vulkan-common) and libnvidia-glvkspirv alone isn't enough?



Yes, installing libnvidia-glvkspirv is sufficient as a fix. If I remove
this package the problem is back.


Thanks for confirming my assumption.

In experimental, I've moved the libnvidia-glvkspirv dependency to
libnvidia-(e)glcore, which seem to make more use of it.

Furthermore I've promoted nvidia-vulkan-icd to Recommends.


Andreas



Bug#1064194: 535.86.05 breaks Xorg

2024-03-03 Thread Harald Dunkel

Hi Andreas,

On 2024-03-01 11:36:43, Andreas Beckmann wrote:

Hi Harald,

On 24/02/2024 11.31, Harald Dunkel wrote:

[INSTALL, DEPENDENCIES] libnvidia-glvkspirv:amd64 535.86.10-1
[INSTALL, DEPENDENCIES] nvidia-vulkan-common:amd64 535.86.10-1
[INSTALL] nvidia-vulkan-icd:amd64 535.86.10-1


Are you sure that you really need nvidia-vulkan-icd (i.e.
nvidia-vulkan-common) and libnvidia-glvkspirv alone isn't enough?



Yes, installing libnvidia-glvkspirv is sufficient as a fix. If I remove
this package the problem is back.


Regards

Harri



Bug#1064194: 535.86.05 breaks Xorg

2024-03-01 Thread Andreas Beckmann

Hi Harald,

On 24/02/2024 11.31, Harald Dunkel wrote:

[INSTALL, DEPENDENCIES] libnvidia-glvkspirv:amd64 535.86.10-1
[INSTALL, DEPENDENCIES] nvidia-vulkan-common:amd64 535.86.10-1
[INSTALL] nvidia-vulkan-icd:amd64 535.86.10-1


Are you sure that you really need nvidia-vulkan-icd (i.e. 
nvidia-vulkan-common) and libnvidia-glvkspirv alone isn't enough?


I've found the following 'suspicious' string in
  libnvidia-{glcore,eglcore}.so.*:

The NVIDIA driver was unable to open 
'libnvidia-glvkspirv.so.535.86.05'.  This library is required at run time.


which seems to indicate the missing dependency.
While the string has been present since the 418 series (which
introduced libnvidia-glvkspirv.so.*), the missing library seems to have
started causing real trouble only recently.
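
For anyone who wants to repeat this check on another driver version, here is a minimal sketch of the same idea as running `strings` on a binary and filtering the output. The function name `find_library_strings` and its parameters are made up for illustration; it only assumes that the library name is embedded as plain ASCII, as in the message quoted above.

```python
import re


def find_library_strings(data: bytes, needle: bytes = b"libnvidia-", min_len: int = 6) -> list[str]:
    """Return printable ASCII runs in `data` that mention `needle`.

    Roughly what `strings FILE | grep libnvidia-` would report.
    """
    # A "string" here is a run of at least `min_len` printable ASCII bytes.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [
        match.group().decode("ascii")
        for match in pattern.finditer(data)
        if needle in match.group()
    ]
```

Feeding it the bytes of libnvidia-glcore.so.* (read with `open(path, "rb").read()`) should list every embedded message that names another libnvidia-* library, which gives a list of candidates for undeclared run-time dependencies.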



Andreas



Bug#1064194: 535.86.05 breaks Xorg

2024-02-24 Thread Harald Dunkel

I went through the list of packages added by nvidia-driver-full,
one after the other. Of course it was the very last one that made
Xorg work again:

```
Aptitude 0.8.13: log report
Sat, Feb 24 2024 11:23:56 +0100

  IMPORTANT: this log only lists intended actions; actions which fail
  due to dpkg problems may not be completed.

Will install 3 packages, and remove 0 packages.
41.3 MB of disk space will be used

[INSTALL, DEPENDENCIES] libnvidia-glvkspirv:amd64 535.86.10-1
[INSTALL, DEPENDENCIES] nvidia-vulkan-common:amd64 535.86.10-1
[INSTALL] nvidia-vulkan-icd:amd64 535.86.10-1


Log complete.

===
```
I have verified this by removing all packages installed during testing
and then installing only the nvidia-vulkan-icd package: Bingo.
Hope this helps.

BTW, Nvidia has released the 550 series for Linux.


Regards
Harri



Bug#1064194: 535.86.05 breaks Xorg

2024-02-24 Thread Andreas Beckmann

On 24/02/2024 10.26, Harald Dunkel via pkg-nvidia-devel wrote:

I found something: Apparently I have to install the *nvidia-driver-full*
package to avoid the crash at start time.


+

Very good finding. Something I hadn't considered ...


I would guess there is a missing
dependency in the nvidia-driver package.


probably in xserver-xorg-video-nvidia ... and probably a nonobvious
dependency on something that gets dlopen()ed ... and the dlopen() error
silently got ignored, but the symbol got used anyway.
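
To illustrate the failure mode guessed at here (this is not the driver's actual code, which is closed), the sketch below shows the defensive pattern: check the result of the load before resolving any symbol from it. It uses Python's `ctypes`, whose `CDLL` constructor raises `OSError` when the library cannot be loaded. The helper names `try_load` and `lookup` are hypothetical.

```python
import ctypes


def try_load(soname: str):
    """Try to load `soname`; return (handle, None) or (None, error message)."""
    try:
        return ctypes.CDLL(soname), None
    except OSError as exc:
        # The equivalent of dlopen() returning NULL: report it, do not ignore it.
        return None, str(exc)


def lookup(soname: str, symbol: str):
    """Resolve `symbol` from `soname`, failing loudly if the library is missing."""
    handle, error = try_load(soname)
    if handle is None:
        raise RuntimeError(f"cannot load {soname}: {error}")
    # Raises AttributeError if the library loaded but lacks the symbol.
    return getattr(handle, symbol)
```

Code that skips the `handle is None` check and goes on to use the symbol would be expected to crash on a near-null address, which would be consistent with the "Segmentation fault at address 0x88" in the report, although that link is only a guess.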


Can you take a look at
  lsof -p  | grep lib
to see which nvidia libraries are loaded into Xorg?
If you identify one (or more) of the newly installed library packages,
can you uninstall nvidia-driver-full (but keep the dependencies) and the
library in question to see if the segfault returns? This may need a few
iterations ...


And thereafter for verification, remove everything added by 
nvidia-driver-full and just add that one library that seems to 
cause/solve the segfault.
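
As an alternative to lsof, the loaded libraries can also be read from the process's memory map. The following is a minimal sketch that assumes the Linux `/proc/<pid>/maps` format (address, permissions, offset, device, inode, pathname); the function name `nvidia_libraries` and the `keyword` parameter are made up for this example.

```python
def nvidia_libraries(maps_text: str, keyword: str = "nvidia") -> list[str]:
    """Return the sorted, de-duplicated shared libraries whose file name contains `keyword`."""
    libs = set()
    for line in maps_text.splitlines():
        # Fields: address perms offset dev inode pathname (pathname may be absent).
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue
        path = fields[5].strip()
        basename = path.rsplit("/", 1)[-1]
        if path.startswith("/") and ".so" in basename and keyword in basename:
            libs.add(path)
    return sorted(libs)
```

Running it on the contents of the maps file of the Xorg process, once with and once without nvidia-driver-full installed, and comparing the two lists should show which additional libraries get pulled in.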



Thanks



Bug#1064194: 535.86.05 breaks Xorg

2024-02-24 Thread Harald Dunkel

I found something: Apparently I have to install the *nvidia-driver-full*
package to avoid the crash at start time. I would guess there is a missing
dependency in the nvidia-driver package.

These packages were added by nvidia-driver-full:

```
Aptitude 0.8.13: log report
Sat, Feb 24 2024 08:57:19 +

  IMPORTANT: this log only lists intended actions; actions which fail
  due to dpkg problems may not be completed.

Will install 25 packages, and remove 0 packages.
497 MB of disk space will be used

[INSTALL, DEPENDENCIES] libcudadebugger1:amd64 535.86.10-1
[INSTALL, DEPENDENCIES] libgles-nvidia1:amd64 535.86.10-1
[INSTALL, DEPENDENCIES] libgles-nvidia2:amd64 535.86.10-1
[INSTALL, DEPENDENCIES] libgles1:amd64 1.7.0-1
[INSTALL, DEPENDENCIES] libnvcuvid1:amd64 535.86.10-1
[INSTALL, DEPENDENCIES] libnvidia-allocator1:amd64 535.86.10-1
[INSTALL, DEPENDENCIES] libnvidia-api1:amd64 535.86.10-1
[INSTALL, DEPENDENCIES] libnvidia-egl-gbm1:amd64 1.1.1-1
[INSTALL, DEPENDENCIES] libnvidia-encode1:amd64 535.86.10-1
[INSTALL, DEPENDENCIES] libnvidia-fbc1:amd64 535.86.10-1
[INSTALL, DEPENDENCIES] libnvidia-glvkspirv:amd64 535.86.10-1
[INSTALL, DEPENDENCIES] libnvidia-ngx1:amd64 535.86.10-1
[INSTALL, DEPENDENCIES] libnvidia-nvvm4:amd64 535.86.10-1
[INSTALL, DEPENDENCIES] libnvidia-opticalflow1:amd64 535.86.10-1
[INSTALL, DEPENDENCIES] libnvidia-rtcore:amd64 535.86.10-1
[INSTALL, DEPENDENCIES] libnvoptix1:amd64 535.86.10-1
[INSTALL, DEPENDENCIES] nvidia-cuda-mps:amd64 535.86.10-1
[INSTALL, DEPENDENCIES] nvidia-opencl-common:amd64 535.86.10-1
[INSTALL, DEPENDENCIES] nvidia-opencl-icd:amd64 535.86.10-1
[INSTALL, DEPENDENCIES] nvidia-powerd:amd64 535.86.10-1
[INSTALL, DEPENDENCIES] nvidia-settings:amd64 525.147.05-1
[INSTALL, DEPENDENCIES] nvidia-suspend-common:amd64 535.86.10-1
[INSTALL, DEPENDENCIES] nvidia-vulkan-common:amd64 535.86.10-1
[INSTALL, DEPENDENCIES] nvidia-vulkan-icd:amd64 535.86.10-1
[INSTALL] nvidia-driver-full:amd64 535.86.10-1


Log complete.

===
```
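
To turn a log like the one above into a list of candidates for one-by-one testing, a small parser is enough. This is a sketch that assumes the `[INSTALL] name:arch version` and `[INSTALL, DEPENDENCIES] name:arch version` line shapes seen in this log; the names `Action` and `parse_installs` are made up for the example, and other aptitude actions (removals, upgrades) are simply ignored.

```python
import re
from typing import NamedTuple


class Action(NamedTuple):
    package: str
    arch: str
    version: str
    automatic: bool  # True when the line is tagged ", DEPENDENCIES"


# Matches e.g. "[INSTALL, DEPENDENCIES] libnvidia-glvkspirv:amd64 535.86.10-1"
_INSTALL_LINE = re.compile(
    r"^\[INSTALL(?P<dep>, DEPENDENCIES)?\] (?P<pkg>[^:\s]+):(?P<arch>\S+) (?P<ver>\S+)$"
)


def parse_installs(log_text: str) -> list[Action]:
    """Extract the install actions from an aptitude log report, in log order."""
    actions = []
    for line in log_text.splitlines():
        match = _INSTALL_LINE.match(line.strip())
        if match:
            actions.append(
                Action(match["pkg"], match["arch"], match["ver"], match["dep"] is not None)
            )
    return actions
```

Filtering the result on `automatic` gives the packages that were pulled in only as dependencies, which are the ones worth removing and re-adding individually.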

On the first, broken upgrade from 525 to 535 (using the "nvidia-driver"
package), the libnvidia-egl-wayland1 package was removed. Maybe this is a hint.

Regards
Harri



Bug#1064194: Info received (Bug#1064194: 535.86.05 breaks Xorg)

2024-02-23 Thread Harald Dunkel

For verification I have installed Bookworm on an external SSD (using
debootstrap), added Gnome and nvidia-graphics-drivers 525 and verified,
upgraded to Testing and verified, and upgraded to Unstable and verified
(still 525.147.05-7~deb12u1). X worked.

Next I cherry-picked nvidia 535.86.10-1 from experimental and tried
again: Now it's broken.

Please note there is no xorg.conf.

Regards
Harri



Bug#1064194: 535.86.05 breaks Xorg

2024-02-19 Thread Harald Dunkel

On 2024-02-19 00:29:15, Andreas Beckmann wrote:


I didn't spot anything obvious in the upstream changelog for the newer
releases so I'll just upload the next (not latest) upstream release
(because that's the one I've ready right now) in the hope it improves
things.



It's still broken. I tried Debian's linux-image-6.6.15-amd64 kernel and
upstream's kernel version 6.7.5.

Upstream provides version 535.154.05. Is there some script to create an
nvidia-graphics-drivers source package from it?


Regards

Harri



Bug#1064194: 535.86.05 breaks Xorg

2024-02-18 Thread Andreas Beckmann

On 18/02/2024 09.13, Harald Dunkel via pkg-nvidia-devel wrote:

Rebuilding the package using the installed libc6 and tools
did not help. The log file is attached.


Rebuilding the package won't change anything since we are only repacking 
upstream provided binaries.


I didn't spot anything obvious in the upstream changelog for the newer 
releases so I'll just upload the next (not latest) upstream release 
(because that's the one I've ready right now) in the hope it improves 
things.


Andreas



Bug#1064194: 535.86.05 breaks Xorg

2024-02-18 Thread Harald Dunkel

Package: libnvidia-glcore
Version: 535.86.05-1

After upgrading to version 535.86.05-1 Xorg dies with SIGSEGV:

:
[33.382] (II) Initializing extension NV-GLX
[33.382] (II) Initializing extension NV-CONTROL
[33.382] (II) Initializing extension XINERAMA
[33.386] (EE)
[33.386] (EE) Backtrace:
[33.386] (EE) 0: /usr/lib/xorg/Xorg (OsLookupColor+0x14d) [0x5616c2fc5f9d]
[33.386] (EE) 1: /lib/x86_64-linux-gnu/libc.so.6 (__sigaction+0x40) [0x7fa9bfadf510]
[33.386] (EE) 2: /lib/x86_64-linux-gnu/libnvidia-glcore.so.535.86.05 (_nv012glcore+0x93ac) [0x7fa9bbd5850c]
[33.387] (EE) 3: /lib/x86_64-linux-gnu/libnvidia-glcore.so.535.86.05 (_nv012glcore+0x95c8) [0x7fa9bbd58728]
[33.387] (EE) 4: /lib/x86_64-linux-gnu/libnvidia-glcore.so.535.86.05 (_nv012glcore+0x22717) [0x7fa9bbd71877]
[33.387] (EE) unw_get_proc_name failed: no unwind info found [-10]
[33.387] (EE) 5: /usr/lib/xorg/modules/extensions/libglxserver_nvidia.so (?+0x0) [0x7fa9bdeca5c8]
[33.387] (EE)
[33.387] (EE) Segmentation fault at address 0x88
[33.387] (EE)
Fatal server error:
[33.387] (EE) Caught signal 11 (Segmentation fault). Server aborting
[33.387] (EE)
[33.387] (EE)
:
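
The backtrace above can be summarised mechanically to see which module the crashing frames belong to. The sketch below assumes the `(EE) N: /path/to/module (symbol+offset) [address]` frame shape shown in this log; the function names `backtrace_frames` and `module_counts` are made up for illustration.

```python
import re
from collections import Counter

# A frame looks like "(EE) 2: /lib/.../libnvidia-glcore.so.535.86.05 (...) [0x...]".
_FRAME = re.compile(r"\(EE\)\s+(\d+):\s+(/\S+)")


def backtrace_frames(log_text: str) -> list[tuple[int, str]]:
    """Return (frame number, module path) pairs in the order they appear."""
    return [(int(number), path) for number, path in _FRAME.findall(log_text)]


def module_counts(log_text: str) -> Counter:
    """Count how many backtrace frames fall into each module."""
    return Counter(path for _, path in backtrace_frames(log_text))
```

For this report the count is dominated by libnvidia-glcore.so.535.86.05 (frames 2 to 4), with libglxserver_nvidia.so as its caller. Frames 0 and 1 appear to be the X server's own signal handling rather than part of the crashing call chain.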

Rebuilding the package using the installed libc6 and tools
did not help. The log file is attached.

Regards
Harri

[32.595] 
X.Org X Server 1.21.1.11
X Protocol Version 11, Revision 0
[32.595] Current Operating System: Linux cecil.afaics.de 6.7.5-raw #1 SMP PREEMPT_DYNAMIC Fri Feb 16 20:37:14 CET 2024 x86_64
[32.595] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-6.7.5-raw root=UUID=d6b6d2f3-8213-4221-9a69-df7dc69acc45 ro net.ifnames=0 mitigations=off video=vesafb:ywrap,mtrr:3
[32.595] xorg-server 2:21.1.11-2 (https://www.debian.org/support)
[32.595] Current version of pixman: 0.42.2
[32.595] Before reporting problems, check http://wiki.x.org
to make sure that you have the latest version.
[32.595] Markers: (--) probed, (**) from config file, (==) default setting,
(++) from command line, (!!) notice, (II) informational,
(WW) warning, (EE) error, (NI) not implemented, (??) unknown.
[32.595] (==) Log file: "/var/log/Xorg.4.log", Time: Sun Feb 18 08:32:08 2024
[32.595] (==) Using config file: "/etc/X11/xorg.conf"
[32.595] (==) Using config directory: "/etc/X11/xorg.conf.d"
[32.595] (==) Using system config directory "/usr/share/X11/xorg.conf.d"
[32.595] (==) ServerLayout "Layout0"
[32.595] (**) |-->Screen "Screen0" (0)
[32.595] (**) |   |-->Monitor "Monitor0"
[32.595] (**) |   |-->Device "Device0"
[32.595] (**) |-->Input Device "Keyboard0"
[32.595] (**) |-->Input Device "Mouse0"
[32.595] (**) Option "Xinerama" "0"
[32.595] (==) Automatically adding devices
[32.595] (==) Automatically enabling devices
[32.595] (==) Automatically adding GPU devices
[32.595] (==) Automatically binding GPU devices
[32.595] (==) Max clients allowed: 256, resource mask: 0x1f
[32.595] (WW) The directory "/usr/share/fonts/X11/cyrillic" does not exist.
[32.595] Entry deleted from font path.
[32.595] (==) FontPath set to:
/usr/share/fonts/X11/misc,
/usr/share/fonts/X11/100dpi/:unscaled,
/usr/share/fonts/X11/75dpi/:unscaled,
/usr/share/fonts/X11/Type1,
/usr/share/fonts/X11/100dpi,
/usr/share/fonts/X11/75dpi,
built-ins
[32.595] (==) ModulePath set to "/usr/lib/xorg/modules"
[32.595] (WW) Hotplugging is on, devices using drivers 'kbd', 'mouse' or 'vmmouse' will be disabled.
[32.595] (WW) Disabling Keyboard0
[32.595] (WW) Disabling Mouse0
[32.595] (II) Loader magic: 0x5616c3052f00
[32.595] (II) Module ABI versions:
[32.595] X.Org ANSI C Emulation: 0.4
[32.595] X.Org Video Driver: 25.2
[32.595] X.Org XInput driver : 24.4
[32.595] X.Org Server Extension : 10.0
[32.596] (--) using VT number 2

[32.596] (II) systemd-logind: logind integration requires -keeptty and -keeptty was not provided, disabling logind integration
[32.596] (II) xfree86: Adding drm device (/dev/dri/card0)
[32.596] (II) Platform probe for /sys/devices/pci:00/:00:01.1/:02:00.0/drm/card0
[32.597] (--) PCI:*(2@0:0:0) 10de:1f82:19da:1546 rev 161, Mem @ 0xa200/16777216, 0x9000/268435456, 0xa000/33554432, I/O @ 0x3000/128, BIOS @ 0x/131072
[32.597] (II) "glx" will be loaded. This was enabled by default and also specified in the config file.
[32.597] (II) LoadModule: "dbe"
[32.597] (II) Module "dbe" already built-in
[32.597] (II) LoadModule: "extmod"
[32.597] (II) Module "extmod" already built-in
[32.597] (II) LoadModule: "glx"
[32.597] (II) Loading /usr/lib/xorg/modules/extensions/libglx.so
[32.597] (II) Module glx: vendor="X.Org Foundation"
[32.597] compiled for 1.21.1.11, module version = 1.0.0
[32.597] ABI class: X.Org Server Extension,