[Bug 216917] hibernation regression since 6.0.18 (Ryzen-5650U incl. Radeon GPU)
https://bugzilla.kernel.org/show_bug.cgi?id=216917

--- Comment #27 from Rainer Fiebig (j...@mailbox.org) ---
(In reply to Alex Deucher from comment #26)
> (In reply to Rainer Fiebig from comment #25)
> > (In reply to Alex Deucher from comment #23)
> > > I'll just revert it. It is more important for kernels with the
> > > drm_buddy changes.
> >
> > Would the following be equivalent to what you intended with your commit?
> > Looks a bit awkward but hibernate/resume work with it for 6.0.19 (and a
> > Ryzen 5600G):
> >
> > uint32_t amdgpu_bo_get_preferred_domain(struct amdgpu_device *adev,
> >                                         uint32_t domain)
> > {
> >     if (domain == (AMDGPU_GEM_DOMAIN_VRAM | AMDGPU_GEM_DOMAIN_GTT)) {
> >         domain = AMDGPU_GEM_DOMAIN_VRAM;
> >         if ((adev->asic_type == CHIP_CARRIZO) ||
> >             (adev->asic_type == CHIP_STONEY)) {
> >             if (adev->gmc.real_vram_size <= AMDGPU_SG_THRESHOLD)
> >                 domain = AMDGPU_GEM_DOMAIN_GTT;
> >         }
> >     }
> >     return domain;
> > }
> >
> > Let me know whether this is worth pursuing. I could then test it with
> > 5.15.88 and 6.1.6.
>
> Nope. What my patch does is allow display buffers to be in either system
> memory (GTT) or carve out (VRAM) depending on what is available. Without
> the patch, the driver picks either VRAM or GTT depending on how much VRAM
> is available on the system. This can lead to memory exhaustion in some
> cases with multiple large resolution monitors depending on memory
> fragmentation.
>
> What your patch does is just always use VRAM unless the chip is Carrizo or
> Stoney. So it is effectively just reverting the commit (depending on how
> much VRAM your system has).

I see. Thanks a lot for the explanation!

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.
[Bug 216917] hibernation regression since 6.0.18 (Ryzen-5650U incl. Radeon GPU)
https://bugzilla.kernel.org/show_bug.cgi?id=216917

--- Comment #26 from Alex Deucher (alexdeuc...@gmail.com) ---
(In reply to Rainer Fiebig from comment #25)
> (In reply to Alex Deucher from comment #23)
> > I'll just revert it. It is more important for kernels with the
> > drm_buddy changes.
>
> Would the following be equivalent to what you intended with your commit?
> Looks a bit awkward but hibernate/resume work with it for 6.0.19 (and a
> Ryzen 5600G):
>
> uint32_t amdgpu_bo_get_preferred_domain(struct amdgpu_device *adev,
>                                         uint32_t domain)
> {
>     if (domain == (AMDGPU_GEM_DOMAIN_VRAM | AMDGPU_GEM_DOMAIN_GTT)) {
>         domain = AMDGPU_GEM_DOMAIN_VRAM;
>         if ((adev->asic_type == CHIP_CARRIZO) ||
>             (adev->asic_type == CHIP_STONEY)) {
>             if (adev->gmc.real_vram_size <= AMDGPU_SG_THRESHOLD)
>                 domain = AMDGPU_GEM_DOMAIN_GTT;
>         }
>     }
>     return domain;
> }
>
> Let me know whether this is worth pursuing. I could then test it with
> 5.15.88 and 6.1.6.

Nope. What my patch does is allow display buffers to be in either system
memory (GTT) or carve out (VRAM) depending on what is available. Without
the patch, the driver picks either VRAM or GTT depending on how much VRAM
is available on the system. This can lead to memory exhaustion in some
cases with multiple large resolution monitors depending on memory
fragmentation.

What your patch does is just always use VRAM unless the chip is Carrizo or
Stoney. So it is effectively just reverting the commit (depending on how
much VRAM your system has).
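The difference described in comment #26 can be sketched outside the kernel. The block below is only an illustration: `struct fake_device`, the `FAKE_CHIP_*` values, the `DOMAIN_*` bit values and `SG_THRESHOLD` are hypothetical stand-ins rather than the real amdgpu definitions, and `preferred_domain_flexible()` is one possible reading of "allow display buffers to be in either GTT or VRAM", not the actual upstream code. `preferred_domain_comment25()` mirrors the function posted in comment #25.

```c
#include <stdint.h>

/* Hypothetical stand-ins; the real values live in the amdgpu headers. */
#define DOMAIN_GTT   0x2u
#define DOMAIN_VRAM  0x4u
#define SG_THRESHOLD (256ull << 20)   /* assumed cut-off, for illustration */

enum fake_chip { FAKE_CHIP_CARRIZO, FAKE_CHIP_STONEY, FAKE_CHIP_OTHER };

struct fake_device {
    enum fake_chip asic_type;
    uint64_t real_vram_size;
};

/*
 * Mirrors the variant from comment #25: a VRAM|GTT request is always
 * narrowed to a single domain. Only Carrizo/Stoney with little VRAM get
 * GTT; every other chip gets VRAM only.
 */
static uint32_t preferred_domain_comment25(const struct fake_device *dev,
                                           uint32_t domain)
{
    if (domain == (DOMAIN_VRAM | DOMAIN_GTT)) {
        domain = DOMAIN_VRAM;
        if (dev->asic_type == FAKE_CHIP_CARRIZO ||
            dev->asic_type == FAKE_CHIP_STONEY) {
            if (dev->real_vram_size <= SG_THRESHOLD)
                domain = DOMAIN_GTT;
        }
    }
    return domain;
}

/*
 * Assumed "flexible" behaviour: only Carrizo/Stoney are narrowed to one
 * domain; other chips keep both bits so either placement stays allowed.
 */
static uint32_t preferred_domain_flexible(const struct fake_device *dev,
                                          uint32_t domain)
{
    if (domain == (DOMAIN_VRAM | DOMAIN_GTT) &&
        (dev->asic_type == FAKE_CHIP_CARRIZO ||
         dev->asic_type == FAKE_CHIP_STONEY)) {
        domain = (dev->real_vram_size <= SG_THRESHOLD) ? DOMAIN_GTT
                                                       : DOMAIN_VRAM;
    }
    return domain;
}
```

For a device that is neither Carrizo nor Stoney, the comment #25 variant turns a VRAM|GTT request into VRAM only, while the flexible sketch leaves both bits set. That is the sense in which the comment #25 variant behaves like a revert on such hardware.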
[Bug 216917] hibernation regression since 6.0.18 (Ryzen-5650U incl. Radeon GPU)
https://bugzilla.kernel.org/show_bug.cgi?id=216917

--- Comment #25 from Rainer Fiebig (j...@mailbox.org) ---
(In reply to Alex Deucher from comment #23)
> I'll just revert it. It is more important for kernels with the
> drm_buddy changes.

Would the following be equivalent to what you intended with your commit?
Looks a bit awkward but hibernate/resume work with it for 6.0.19 (and a
Ryzen 5600G):

uint32_t amdgpu_bo_get_preferred_domain(struct amdgpu_device *adev,
                                        uint32_t domain)
{
    if (domain == (AMDGPU_GEM_DOMAIN_VRAM | AMDGPU_GEM_DOMAIN_GTT)) {
        domain = AMDGPU_GEM_DOMAIN_VRAM;
        if ((adev->asic_type == CHIP_CARRIZO) ||
            (adev->asic_type == CHIP_STONEY)) {
            if (adev->gmc.real_vram_size <= AMDGPU_SG_THRESHOLD)
                domain = AMDGPU_GEM_DOMAIN_GTT;
        }
    }
    return domain;
}

Let me know whether this is worth pursuing. I could then test it with
5.15.88 and 6.1.6.
[Bug 216917] hibernation regression since 6.0.18 (Ryzen-5650U incl. Radeon GPU)
https://bugzilla.kernel.org/show_bug.cgi?id=216917

--- Comment #24 from Rainer Fiebig (j...@mailbox.org) ---
(In reply to Alex Deucher from comment #23)
> I'll just revert it. It is more important for kernels with the
> drm_buddy changes.

Right thing to do for now, I guess.

If I can find a way to identify the commit(s) between 6.0.19 and 6.1 that
fix the problem, I'll report it here.

Thanks.

Rainer
[Bug 216917] hibernation regression since 6.0.18 (Ryzen-5650U incl. Radeon GPU)
https://bugzilla.kernel.org/show_bug.cgi?id=216917

--- Comment #23 from Alex Deucher (alexdeuc...@gmail.com) ---
I'll just revert it. It is more important for kernels with the drm_buddy
changes.
[Bug 216917] hibernation regression since 6.0.18 (Ryzen-5650U incl. Radeon GPU)
https://bugzilla.kernel.org/show_bug.cgi?id=216917

--- Comment #22 from Mario Limonciello (AMD) (mario.limoncie...@amd.com) ---
Thanks for trying.

Another feasible way to identify it would be a proper bisect between v6.0
and v6.1, manually applying
'306df163069e78160e7a534b892c5cd6fefdd537 ("drm/amdgpu: make display
pinning more flexible (v2)")' at each test point.
[Bug 216917] hibernation regression since 6.0.18 (Ryzen-5650U incl. Radeon GPU)
https://bugzilla.kernel.org/show_bug.cgi?id=216917

--- Comment #21 from Rainer Fiebig (j...@mailbox.org) ---
(In reply to Mario Limonciello (AMD) from comment #19)
> Assuming it's within amdgpu and not DRM helpers it's still ~800 commits to
> sift through. Even though 6.0.y is EOL now, I think it would be easier to
> check the missing commit(s) from there to backport. We can worry about
> 5.15.y after that.
>
> Can you see if this series from 6.1 on top of 6.0.19 helps?
>
> https://patchwork.freedesktop.org/series/106027/

No, those patches didn't help. Hibernate was always fine but resume always
failed in the same way as described in my original mail to "stable".

Note that I'm not going to test 800 commits in this manner. ;)

So long!
[Bug 216917] hibernation regression since 6.0.18 (Ryzen-5650U incl. Radeon GPU)
https://bugzilla.kernel.org/show_bug.cgi?id=216917

--- Comment #20 from Rainer Fiebig (j...@mailbox.org) ---
(In reply to Mario Limonciello (AMD) from comment #19)
> Assuming it's within amdgpu and not DRM helpers it's still ~800 commits to
> sift through. Even though 6.0.y is EOL now, I think it would be easier to
> check the missing commit(s) from there to backport. We can worry about
> 5.15.y after that.
>
> Can you see if this series from 6.1 on top of 6.0.19 helps?
>
> https://patchwork.freedesktop.org/series/106027/

Yes, but may take a while.
[Bug 216917] hibernation regression since 6.0.18 (Ryzen-5650U incl. Radeon GPU)
https://bugzilla.kernel.org/show_bug.cgi?id=216917

--- Comment #19 from Mario Limonciello (AMD) (mario.limoncie...@amd.com) ---
Assuming it's within amdgpu and not DRM helpers it's still ~800 commits to
sift through. Even though 6.0.y is EOL now, I think it would be easier to
check the missing commit(s) from there to backport. We can worry about
5.15.y after that.

Can you see if this series from 6.1 on top of 6.0.19 helps?

https://patchwork.freedesktop.org/series/106027/
[Bug 216917] hibernation regression since 6.0.18 (Ryzen-5650U incl. Radeon GPU)
https://bugzilla.kernel.org/show_bug.cgi?id=216917

--- Comment #18 from Rainer Fiebig (j...@mailbox.org) ---
(In reply to Alex Deucher from comment #16)
> (In reply to Rainer Fiebig from comment #15)
> > (In reply to Mario Limonciello (AMD) from comment #13)
> > > Can we please confirm it's actually broken in 5.15.y before going
> > > through that effort?
> >
> > I have tested this with 5.15.87/88. Error messages and symptoms were the
> > same as with 6.0.18. Spared me the bisecting this time, though.
>
> Can you verify that reverting the change in 5.15.y fixes it?

Alright, I do confirm that reverting commit
306df163069e78160e7a534b892c5cd6fefdd537 ("drm/amdgpu: make display pinning
more flexible (v2)") solves the problem with hibernate and resume in
5.15.88.

To me it seems that this patch cannot be backported in an isolated fashion.
[Bug 216917] hibernation regression since 6.0.18 (Ryzen-5650U incl. Radeon GPU)
https://bugzilla.kernel.org/show_bug.cgi?id=216917

--- Comment #17 from Rainer Fiebig (j...@mailbox.org) ---
(In reply to Alex Deucher from comment #16)
> (In reply to Rainer Fiebig from comment #15)
> > (In reply to Mario Limonciello (AMD) from comment #13)
> > > Can we please confirm it's actually broken in 5.15.y before going
> > > through that effort?
> >
> > I have tested this with 5.15.87/88. Error messages and symptoms were the
> > same as with 6.0.18. Spared me the bisecting this time, though.
>
> Can you verify that reverting the change in 5.15.y fixes it?

Will do it.
[Bug 216917] hibernation regression since 6.0.18 (Ryzen-5650U incl. Radeon GPU)
https://bugzilla.kernel.org/show_bug.cgi?id=216917

--- Comment #16 from Alex Deucher (alexdeuc...@gmail.com) ---
(In reply to Rainer Fiebig from comment #15)
> (In reply to Mario Limonciello (AMD) from comment #13)
> > Can we please confirm it's actually broken in 5.15.y before going
> > through that effort?
>
> I have tested this with 5.15.87/88. Error messages and symptoms were the
> same as with 6.0.18. Spared me the bisecting this time, though.

Can you verify that reverting the change in 5.15.y fixes it?
[Bug 216917] hibernation regression since 6.0.18 (Ryzen-5650U incl. Radeon GPU)
https://bugzilla.kernel.org/show_bug.cgi?id=216917

--- Comment #15 from Rainer Fiebig (j...@mailbox.org) ---
(In reply to Mario Limonciello (AMD) from comment #13)
> Can we please confirm it's actually broken in 5.15.y before going through
> that effort?

I have tested this with 5.15.87/88. Error messages and symptoms were the
same as with 6.0.18. Spared me the bisecting this time, though.
[Bug 216917] hibernation regression since 6.0.18 (Ryzen-5650U incl. Radeon GPU)
https://bugzilla.kernel.org/show_bug.cgi?id=216917

Mario Limonciello (AMD) (mario.limoncie...@amd.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |NEEDINFO

--- Comment #14 from Mario Limonciello (AMD) (mario.limoncie...@amd.com) ---
> and a hard reset (as in my case) may result

Sorry - specifically that reverting the backported commit fixes your case.
If so, yeah then we should see if there is anything else obvious to
backport to help it.
[Bug 216917] hibernation regression since 6.0.18 (Ryzen-5650U incl. Radeon GPU)
https://bugzilla.kernel.org/show_bug.cgi?id=216917

Mario Limonciello (AMD) (mario.limoncie...@amd.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|WILL_NOT_FIX                |---
[Bug 216917] hibernation regression since 6.0.18 (Ryzen-5650U incl. Radeon GPU)
https://bugzilla.kernel.org/show_bug.cgi?id=216917

--- Comment #13 from Mario Limonciello (AMD) (mario.limoncie...@amd.com) ---
Can we please confirm it's actually broken in 5.15.y before going through
that effort?
[Bug 216917] hibernation regression since 6.0.18 (Ryzen-5650U incl. Radeon GPU)
https://bugzilla.kernel.org/show_bug.cgi?id=216917

Rainer Fiebig (j...@mailbox.org) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |j...@mailbox.org

--- Comment #12 from Rainer Fiebig (j...@mailbox.org) ---
(In reply to The Linux kernel's regression tracker (Thorsten Leemhuis) from
comment #9)
> FWIW, I just wanted to add this to the regression tracking, but 6.0.y is
> EOL now; and it seems 6.1.y works. Greg might do another fixup release,
> but maybe investigating this further is not worth it.

I beg to differ. Longterm kernels 5.15.87/88 and probably all other LTS
kernels to which commit 306df163069e78160e7a534b892c5cd6fefdd537 has been
backported are also affected.

As "hibernate" is a basic, reliable feature and a hard reset (as in my
case) may result in data loss, I only see two possibilities: either revert
the commit in the longterm kernels or try to find out quickly what makes it
work for them.

The diff between 6.0.18 and 6.1.4 (where it was introduced) shows that only
86 files in /drivers/gpu/drm/amd/amdgpu have been modified. Probably only a
few of them are relevant in this matter. So for the experts it should not
be too hard to figure out a solution.
[Bug 216917] hibernation regression since 6.0.18 (Ryzen-5650U incl. Radeon GPU)
https://bugzilla.kernel.org/show_bug.cgi?id=216917

--- Comment #11 from The Linux kernel's regression tracker (Thorsten
Leemhuis) (regressi...@leemhuis.info) ---
Just for the record, if someone cares or lands here some time in the
future: There is another report about hibernation problems with Ryzen CPUs
in 6.0.18 here:

https://lore.kernel.org/all/2d59ed2b-ba8f-6695-9764-fd3b109ac...@mailbox.org/

Bisection result included (drm/amdgpu: make display pinning more flexible
(v2)).
[Bug 216917] hibernation regression since 6.0.18 (Ryzen-5650U incl. Radeon GPU)
https://bugzilla.kernel.org/show_bug.cgi?id=216917

Mario Limonciello (AMD) (mario.limoncie...@amd.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |WILL_NOT_FIX
[Bug 216917] hibernation regression since 6.0.18 (Ryzen-5650U incl. Radeon GPU)
https://bugzilla.kernel.org/show_bug.cgi?id=216917

--- Comment #10 from kolAflash (kolafl...@kolahilft.de) ---
Looks like the display issue with linux-6.1.y is well on its way to being
fixed. Hibernation still works fine with the latest revert-commit by Mario
& Wayne, which I tested here.
https://gitlab.freedesktop.org/drm/amd/-/issues/2171#note_1720281

So from my point of view this bug isn't relevant anymore. At least as long
as it doesn't appear on newer kernels again.
[Bug 216917] hibernation regression since 6.0.18 (Ryzen-5650U incl. Radeon GPU)
https://bugzilla.kernel.org/show_bug.cgi?id=216917

The Linux kernel's regression tracker (Thorsten Leemhuis)
(regressi...@leemhuis.info) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |regressi...@leemhuis.info

--- Comment #9 from The Linux kernel's regression tracker (Thorsten
Leemhuis) (regressi...@leemhuis.info) ---
FWIW, I just wanted to add this to the regression tracking, but 6.0.y is
EOL now; and it seems 6.1.y works. Greg might do another fixup release, but
maybe investigating this further is not worth it.
[Bug 216917] hibernation regression since 6.0.18 (Ryzen-5650U incl. Radeon GPU)
https://bugzilla.kernel.org/show_bug.cgi?id=216917

--- Comment #8 from kolAflash (kolafl...@kolahilft.de) ---
(In reply to Alex Deucher from comment #7)
> do you still have the problem with:
> CONFIG_DRM_FBDEV_EMULATION=n
> in your .config?

The problem unfortunately still exists with CONFIG_DRM_FBDEV_EMULATION=n
(and I get a black screen on the virtual console)

> Does reverting a6250bdb6c4677ee77d699b338e077b900f94c0c fix it?

No. That also doesn't help.

I'm sorry. Anything else I can try?
[Bug 216917] hibernation regression since 6.0.18 (Ryzen-5650U incl. Radeon GPU)
https://bugzilla.kernel.org/show_bug.cgi?id=216917

--- Comment #7 from Alex Deucher (alexdeuc...@gmail.com) ---
do you still have the problem with:
CONFIG_DRM_FBDEV_EMULATION=n
in your .config?

Does reverting a6250bdb6c4677ee77d699b338e077b900f94c0c fix it?
[Bug 216917] hibernation regression since 6.0.18 (Ryzen-5650U incl. Radeon GPU)
https://bugzilla.kernel.org/show_bug.cgi?id=216917

--- Comment #6 from kolAflash (kolafl...@kolahilft.de) ---
Created attachment 303585
  --> https://bugzilla.kernel.org/attachment.cgi?id=303585&action=edit
6.1.4 dmesg after hibernation

(In reply to Mario Limonciello (AMD) from comment #4)
> [...]
> Is it 100% failure rate on 6.0.y?

Yes.

> Since you mentioned that you couldn't effectively use 6.1.y because of the
> MST issue, are you only finding it on 6.0.y when connected to a dock or
> anything else unique?

No. Happens with dock, with simple USB-C power (no dock) and on battery.

> > Sadly I don't know how to provide helpful logs. After reboot there's
> > nothing helpful in /var/log/messages
>
> Can you check /var/lib/systemd/pstore? Perhaps there was a kernel crash
> that got saved into NVRAM and restored by systemd on the next boot.

Sadly that file doesn't exist.
There are some files in /sys/fs/pstore/. But nothing from today.

(In reply to Alex Deucher from comment #5)
> Can you attach your dmesg output?

I don't know how to get logs (including dmesg) when hibernation has failed.
As said, after reboot there's nothing new in /var/log/messages

Instead I attached dmesg after hibernation with v6.1.4. Is that helpful?

Another thing:
Is it important that my SWAP is a file /swap on an ext4 partition inside a
LUKS partition?
[Bug 216917] hibernation regression since 6.0.18 (Ryzen-5650U incl. Radeon GPU)
https://bugzilla.kernel.org/show_bug.cgi?id=216917

--- Comment #5 from Alex Deucher (alexdeuc...@gmail.com) ---
Can you attach your dmesg output?
[Bug 216917] hibernation regression since 6.0.18 (Ryzen-5650U incl. Radeon GPU)
https://bugzilla.kernel.org/show_bug.cgi?id=216917

--- Comment #4 from Mario Limonciello (AMD) (mario.limoncie...@amd.com) ---
> Perfect guess!

OK.. so we need to find out why this works in 6.1.y and not in 6.0.y.
There are some fairly severe bugs it fixed.

Is it 100% failure rate on 6.0.y?

Since you mentioned that you couldn't effectively use 6.1.y because of the
MST issue, are you only finding it on 6.0.y when connected to a dock or
anything else unique?

> Sadly I don't know how to provide helpful logs. After reboot there's
> nothing helpful in /var/log/messages

Can you check /var/lib/systemd/pstore? Perhaps there was a kernel crash
that got saved into NVRAM and restored by systemd on the next boot.

> Just wanted to say THANK YOU for all your help in the last couple of
> months!!!

:)
[Bug 216917] hibernation regression since 6.0.18 (Ryzen-5650U incl. Radeon GPU)
https://bugzilla.kernel.org/show_bug.cgi?id=216917

Mario Limonciello (AMD) (mario.limoncie...@amd.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
          Component|Platform_x86                |Video(DRI - non Intel)
           Assignee|drivers_platform_x86@kernel |drivers_video-dri@kernel-bu
                   |-bugs.osdl.org              |gs.osdl.org