> Date: Mon, 20 Feb 2023 23:15:35 -0800
> From: Jeff Frasca <that...@jeff-frasca.name>
>
> I have two machines with AMD graphics hardware: a laptop with a raven
> ridge APU (GCN 5) and a desktop with a kaveri APU (GCN 2).
> [...]
> After upgrading my system in place from tarballs, and compiling a
> custom kernel with the AMDGPU drivers instead of the radeon drivers, I
> was pleasantly surprised to find that the frame buffer worked with the
> newer, less tested drivers!

Cool!  By the way, you may be able to just add `load amdgpu' to
boot.cfg instead of compiling a custom kernel with amdgpu.

Loading amdgpu as a module also has the advantage that it doesn't
break dtrace (due to annoying technical restrictions in CTF which the
kernel violates when amdgpu is statically linked because it is so
large).  You can rebuild just the module with:

	cd src/sys/modules/amdgpu
	$TOOLDIR/bin/nbmake-$ARCH -j4 dependall
	$TOOLDIR/bin/nbmake-$ARCH -j4 install

> The problem with the doorbell code is that the Linux code uses
> adev->doorbell.ptr + index to get the address to write to. ptr is
> ultimately a pointer to a 32 bit wide value (rather than the 64 bit
> wide value it actually is :-/ ), so the compiler's pointer math
> multiplies index by 4 instead of 8, as the NetBSD dev who wrote the
> code would have expected.

Amazing!  I must have stared at that code for hours trying to track
down the ring test failures, without realizing that the pointer was
typed 32-bit instead of 64-bit.

...I don't suppose you have another trick up your sleeve for the
radeon driver, do you?  We've also been seeing intermittent ring test
failures at boot, but it doesn't use any 64-bit doorbells, so this
trick doesn't work, alas.

> (The driver blows up spectacularly shortly thereafter by causing a
> floating point exception in kernel mode. I don't have a full fix for
> that yet. The thing I did try that seems to get further causes the
> screen to go blank. I have a plan for debugging this, but I haven't
> gotten there yet.)

If you have a stack trace or crash dump I might be able to help.  The
amdgpu driver apparently uses FP/SIMD instructions in the kernel, and
I wired it up to NetBSD's mechanism for allowing it to do that, but I
don't know if I've ever seen those parts of the code get hit and
perhaps I missed something.

> I've attached patches. Should I open a bug? Send these to the kernel
> mailing list?

Patches applied, thanks!
I tweaked them a little bit, including fixing an arithmetic overflow
bug that you had copied & pasted from one Taylor R Campbell,
riastr...@netbsd.org, in kern_ksyms.c...oops.  (Fix also applied in
kern_ksyms.c now.)

Feel free to file PRs with patches and/or cc me and tech-kern -- I
don't always follow current-users.