On Wed, Oct 15, 2025 at 10:31 AM Mario Limonciello
<[email protected]> wrote:
>
> On 10/15/25 9:19 AM, Harry Wentland wrote:
> >
> >
> > On 2025-10-14 17:38, Mario Limonciello wrote:
> >>
> >>
> >> On 10/14/2025 4:27 PM, Alex Deucher wrote:
> >>> On Tue, Oct 14, 2025 at 3:46 PM Mario Limonciello
> >>> <[email protected]> wrote:
> >>>>
> >>>> [Why]
> >>>> If userspace hasn't frozen user processes (like systemd does with
> >>>> user.slice) then an aborted hibernate could give control back to
> >>>> userspace before display hardware is resumed.  IoW an atomic commit could
> >>>> be done while the hardware is in D3, which could hang a system.
> >>>
> >>> Is there any way to prevent this altogether?
> >>
> >> The obvious way is that userspace should be freezing user processes. 
> >> That's what systemd does.
> >>
> >>> This seems like a recipe
> >>> for trouble for any driver.
> >>
> >> If we want to uplevel this kind of check I think we would need some helper 
> >> to report hardware status into drm and drm would have to call that.
> >>
> >> Most distros use systemd, and this only happened in an aborted hibernate.  
> >> I guess I'd like to see how much this warning actually comes up before 
> >> deciding if all that plumbing is worth it.
> >
> > I was briefly thinking about a generic solution as well and don't
> > think it's worth it at this point. This is mostly about internal
> > driver/HW state.
> >
> > Harry
>
> FWIW IGT testing seems to have thrown up some problems with this which
> didn't show up in my unit testing.
>
> [  471.261018]  ? drm_atomic_helper_resume+0x2b/0x150 [drm_kms_helper]
> [  471.261031]  ? dm_resume+0x459/0x8f0 [amdgpu]
> [  471.261396]  ? amdgpu_ip_block_resume+0x27/0x70 [amdgpu]
> [  471.261635]  ? dpm_run_callback+0x9c/0x200
> [  471.261640]  ? device_resume+0x1b4/0x2b0
> [  471.261645]  drm_atomic_check_only+0x5d9/0x9e0 [drm]
> [  471.261672]  drm_atomic_commit+0x6d/0xe0 [drm]
> [  471.261697]  ? __pfx___drm_printfn_info+0x10/0x10 [drm]
> [  471.261729]  drm_atomic_helper_commit_duplicated_state+0xcd/0xe0
> [drm_kms_helper]
> [  471.261739]  drm_atomic_helper_resume+0x93/0x150 [drm_kms_helper]
> [  471.261751]  dm_resume+0x459/0x8f0 [amdgpu]
> [  471.262119]  ? preempt_count_add+0x7b/0xc0
> [  471.262125]  amdgpu_ip_block_resume+0x27/0x70 [amdgpu]
> [  471.262363]  amdgpu_device_ip_resume_phase3+0x54/0x80 [amdgpu]
> [  471.262598]  amdgpu_device_resume+0x188/0x3b0 [amdgpu]
> [  471.262842]  amdgpu_pmops_resume+0x4c/0x90 [amdgpu]
> [  471.263078]  pci_pm_resume+0x6b/0x100
>
> The issue appears to me to be that ip_block->status.hw isn't set again
> until the end of amdgpu_ip_block_resume().
>
> I am tempted to move it before the call to
> ip_block->version->funcs->resume().
>
> Thoughts?

Then it won't actually reflect the hw state.

Alex


Alex

>
> >
> >>>
> >>> Alex
> >>>
> >>>>
> >>>> [How]
> >>>> Add a check whether the IP block hardware is ready to the atomic check
> >>>> handler and return a failure. Userspace shouldn't do an atomic commit if
> >>>> the atomic check fails.
> >>>>
> >>>> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4627
> >>>> Signed-off-by: Mario Limonciello <[email protected]>
> >>>> ---
> >>>> Cc: Harry Wentland <[email protected]>
> >>>> v2:
> >>>>    * Return -EBUSY instead (Harry)
> >>>> ---
> >>>>    drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 5 +++++
> >>>>    1 file changed, 5 insertions(+)
> >>>>
> >>>> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> >>>> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> >>>> index 6446ec6c21d4..f5cd9982af99 100644
> >>>> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> >>>> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> >>>> @@ -12010,6 +12010,11 @@ static int amdgpu_dm_atomic_check(struct 
> >>>> drm_device *dev,
> >>>>
> >>>>           trace_amdgpu_dm_atomic_check_begin(state);
> >>>>
> >>>> +       if (WARN_ON(unlikely(!amdgpu_device_ip_is_hw(adev, 
> >>>> AMD_IP_BLOCK_TYPE_DCE)))) {
> >>>> +               ret = -EBUSY;
> >>>> +               goto fail;
> >>>> +       }
> >>>> +
> >>>>           ret = drm_atomic_helper_check_modeset(dev, state);
> >>>>           if (ret) {
> >>>>                   drm_dbg_atomic(dev, "drm_atomic_helper_check_modeset() 
> >>>> failed\n");
> >>>> --
> >>>> 2.49.0
> >>>>
> >>
> >
>

Reply via email to