[Bug 216625] [regression] GPU lockup on Radeon R7 Kaveri

2023-03-24 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=216625

--- Comment #11 from Pierre Ossman (pierre-bugzi...@ossman.eu) ---
(In reply to Pierre Ossman from comment #9)
> 
> It now hangs more arbitrarily, not just when trying to play a video. Having
> done a suspend/resume cycle is still a requirement though.
> 

I tried disabling video acceleration, and the hangs are now gone. So it does
seem to be the culprit after all.

Could this help you pinpoint things somehow?

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 216625] [regression] GPU lockup on Radeon R7 Kaveri

2023-03-08 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=216625

--- Comment #10 from Pierre Ossman (pierre-bugzi...@ossman.eu) ---
I finally got that old version of mesa to build. Unfortunately, the hangs still
happen even with that. :/

> Mar 09 07:18:30 kernel: radeon :00:01.0: ring 3 stalled for more than
> 10028msec
> Mar 09 07:18:30 kernel: radeon :00:01.0: GPU lockup (current fence id
> 0xfa91 last fence id 0xfabc on ring 3)
> Mar 09 07:18:31 kernel: radeon :00:01.0: ring 5 stalled for more than
> 10077msec
> Mar 09 07:18:31 kernel: radeon :00:01.0: GPU lockup (current fence id
> 0x18fb last fence id 0x18fe on ring 5)
> Mar 09 07:18:31 kernel: radeon :00:01.0: ring 0 stalled for more than
> 10202msec
> ...
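
For anyone comparing several of these hangs, here is a minimal Python sketch that summarizes the lines above per ring. The sample text is the log quoted above with the mail-wrapped lines rejoined. The function name `summarize` and the `pending` field are made up for illustration, and reading "last fence id minus current fence id" as the number of outstanding submissions is an assumption, not something the driver output states.

```python
import re

# Sample lines copied from the log above, with mail-wrapped lines rejoined.
LOG = """\
Mar 09 07:18:30 kernel: radeon :00:01.0: ring 3 stalled for more than 10028msec
Mar 09 07:18:30 kernel: radeon :00:01.0: GPU lockup (current fence id 0xfa91 last fence id 0xfabc on ring 3)
Mar 09 07:18:31 kernel: radeon :00:01.0: ring 5 stalled for more than 10077msec
Mar 09 07:18:31 kernel: radeon :00:01.0: GPU lockup (current fence id 0x18fb last fence id 0x18fe on ring 5)
Mar 09 07:18:31 kernel: radeon :00:01.0: ring 0 stalled for more than 10202msec
"""

STALL_RE = re.compile(r"ring (\d+) stalled for more than (\d+)msec")
LOCKUP_RE = re.compile(
    r"GPU lockup \(current fence id (0x[0-9a-fA-F]+) "
    r"last fence id (0x[0-9a-fA-F]+) on ring (\d+)\)"
)


def summarize(log):
    """Return {ring: {"stall_ms": int, "pending": int}} for each ring seen.

    "pending" is last fence id minus current fence id (hypothetical name;
    treating it as the count of outstanding submissions is an assumption).
    """
    rings = {}
    for line in log.splitlines():
        m = STALL_RE.search(line)
        if m:
            rings.setdefault(int(m.group(1)), {})["stall_ms"] = int(m.group(2))
            continue
        m = LOCKUP_RE.search(line)
        if m:
            current, last = int(m.group(1), 16), int(m.group(2), 16)
            rings.setdefault(int(m.group(3)), {})["pending"] = last - current
    return rings


if __name__ == "__main__":
    for ring, info in sorted(summarize(LOG).items()):
        print(ring, info)
```

Running the same function over logs from several hangs would show whether the same rings stall every time.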

What can we do next to pinpoint this?

It seems to fail rather reliably after a suspend/resume. Is there some test
suite I can run to provoke things?

[Bug 216625] [regression] GPU lockup on Radeon R7 Kaveri

2023-03-05 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=216625

--- Comment #9 from Pierre Ossman (pierre-bugzi...@ossman.eu) ---
FYI, it seems to have gotten worse since upgrading from
kernel-6.1.8-100.fc36.x86_64 to kernel-6.1.13-100.fc36.x86_64.

It now hangs more arbitrarily, not just when trying to play a video. Having
done a suspend/resume cycle is still a requirement though.

I'm struggling to build the old version of mesa that still worked. It isn't
very compatible with newer LLVM, and there is something wrong with Fedora's
packaging of LLVM 12 (that seems to be the matching LLVM version for that old
mesa). I'll need some more effort to get that test up and running.

[Bug 216625] [regression] GPU lockup on Radeon R7 Kaveri

2022-12-20 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=216625

--- Comment #8 from Alex Deucher (alexdeuc...@gmail.com) ---
(In reply to Pierre Ossman from comment #7)
> 
> Is that also handled by mesa, or some other component?

Yes, mesa handles video APIs (VAAPI, OpenMAX, VDPAU) as well as 3D (OpenGL,
Vulkan).

[Bug 216625] [regression] GPU lockup on Radeon R7 Kaveri

2022-12-19 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=216625

--- Comment #7 from Pierre Ossman (pierre-bugzi...@ossman.eu) ---
Sorry, I haven't had time to look at downgrading Mesa yet. But FYI, it does
still happen with mesa 22.1.7 and kernel 6.0.10.

I am now almost 100% certain that it is videos that are triggering this. And
possibly not all videos. So I'm thinking, perhaps the video acceleration?

Is that also handled by mesa, or some other component?

[Bug 216625] [regression] GPU lockup on Radeon R7 Kaveri

2022-11-11 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=216625

--- Comment #6 from Alex Deucher (alexdeuc...@gmail.com) ---
(In reply to Pierre Ossman from comment #5)
> 
> Could the issue be with the firmware? Has that changed recently for these
> devices?
> 
> The last good firmware should be:
> 
>   linux-firmware-20220509-132.fc34.noarch
> 
> And the first bad firmware should be:
> 
>   linux-firmware-20220708-136.fc35.noarch

Not likely. The firmware for this chip has not changed in years.

[Bug 216625] [regression] GPU lockup on Radeon R7 Kaveri

2022-11-10 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=216625

--- Comment #5 from Pierre Ossman (pierre-bugzi...@ossman.eu) ---
The lockup happens on 5.17.2 as well, so it seems the kernel is not the most
likely suspect.

I'll see if I can try an older mesa next.

Could the issue be with the firmware? Has that changed recently for these
devices?

The last good firmware should be:

  linux-firmware-20220509-132.fc34.noarch

And the first bad firmware should be:

  linux-firmware-20220708-136.fc35.noarch

[Bug 216625] [regression] GPU lockup on Radeon R7 Kaveri

2022-10-27 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=216625

--- Comment #4 from Pierre Ossman (pierre-bugzi...@ossman.eu) ---
I just got a GPU lockup on 5.18.4. So it's either not the kernel, or a bug that
appeared in the 5.18 series. I'll go back to the known good kernel now and see
if I can get the bug there.


One thought though, even if it is mesa that happens to issue a bad sequence of
commands, shouldn't the kernel driver be able to reset the GPU? It certainly
indicates that it is trying.

[Bug 216625] [regression] GPU lockup on Radeon R7 Kaveri

2022-10-25 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=216625

--- Comment #3 from Pierre Ossman (pierre-bugzi...@ossman.eu) ---
This is wrong, I checked the wrong lines in dnf's history:

> Last working system:
> 
>   kernel-5.13.8-100.fc33.x86_64

The last working kernel is actually 5.17.12-100.fc34.x86_64. So if it's the
kernel it's likely 5.18 or 5.19 that regressed. I'll give 5.18.1 a spin.

[Bug 216625] [regression] GPU lockup on Radeon R7 Kaveri

2022-10-25 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=216625

--- Comment #2 from Pierre Ossman (pierre-bugzi...@ossman.eu) ---
A bisect will be difficult, given that I can't reproduce it. :/

Any clues from the dmesg that could tell how to provoke it? Or some settings
that could provide more information?

I can try a few versions and see if I'm able to narrow it down somewhat. It's
difficult to know when to assume it's a good version as in some cases it has
gone weeks without a lockup...

[Bug 216625] [regression] GPU lockup on Radeon R7 Kaveri

2022-10-25 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=216625

Alex Deucher (alexdeuc...@gmail.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |alexdeuc...@gmail.com

--- Comment #1 from Alex Deucher (alexdeuc...@gmail.com) ---
Any chance you could bisect?  There have been very few changes to the radeon
kernel driver over the last few years.  It could also be a mesa regression.
Does upgrading or downgrading mesa help?

[Bug 216625] [regression] GPU lockup on Radeon R7 Kaveri

2022-10-25 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=216625

Pierre Ossman (pierre-bugzi...@ossman.eu) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
               Tree|Mainline                    |Fedora
