On Fri, 30 May 2025 10:08:38 +1000
Jonathan Gray <[email protected]> wrote:
> On Fri, May 30, 2025 at 08:01:31AM +0900, YASUOKA Masahiko wrote:
>> 
>> >Synopsis:   inteldrm stop working after {hibernate,suspend}/resume
>> >Category:   kernel
>> >Environment:
>>      System      : OpenBSD 7.7
>>      Details     : OpenBSD 7.7-current (GENERIC.MP) #117: Thu May 29 21:18:15 JST 2025
>>                       yasuoka@xxx:/home/yasuoka/src/sys/arch/amd64/compile/GENERIC.MP
>> 
>>      Architecture: OpenBSD.amd64
>>      Machine     : amd64
>> >Description:
>>      After hibernate and resume, X11 stops working.  Keyboard and
>>      mouse don't work, but Ctrl-Alt-F1 or Ctrl-Alt-Backspace works.
>> 
>>      errors in dmesg:
>>      ****
>>      drm:pid97650:__uc_init_hw *ERROR* [drm] *ERROR* GT0: GuC initialization failed 0xfffffffffffffffae
>>      drm:pid97650:intel_gt_init_hw *ERROR* [drm] *ERROR* GT0: Enabling uc failed (-5)
>>      drm:pid97650:intel_gt_resume *ERROR* [drm] *ERROR* GT0: Failed to initialize GPU, declaring it wedged!
>>      ****
>> 
>>      This happens because guc_wait_ucode() in i915/gt/uc/intel_guc_fw.c
>>      fails.
>> 
>>      That function waits for the GuC to start up by calling the inline
>>      function guc_load_done(), which checks two fields of the GUC_STATUS
>>      register.
>> 
>>      static inline bool guc_load_done(struct intel_uncore *uncore, u32 *status, bool *success)
>>      {
>>              u32 val = intel_uncore_read(uncore, GUC_STATUS);
>>              u32 uk_val = REG_FIELD_GET(GS_UKERNEL_MASK, val);
>>              u32 br_val = REG_FIELD_GET(GS_BOOTROM_MASK, val);
>> 
>>              *status = val;
>>              switch (uk_val) {
>>              case INTEL_GUC_LOAD_STATUS_READY:
>>                      *success = true;
>>                      return true;
>> 
>>              case INTEL_GUC_LOAD_STATUS_ERROR_DEVID_BUILD_MISMATCH:
>>              case INTEL_GUC_LOAD_STATUS_GUC_PREPROD_BUILD_MISMATCH:
>>              case INTEL_GUC_LOAD_STATUS_ERROR_DEVID_INVALID_GUCTYPE:
>> 
>>      In my test, the function fails with these field values:
>> 
>>        ukernel = INTEL_GUC_LOAD_STATUS_INIT_DATA_INVALID(0x71)
>>        bootrom = INTEL_BOOTROM_STATUS_JUMP_PASSED(0x76)
>> 
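For readers following along, here is a minimal sketch of how the two values above come out of a single GUC_STATUS read. The shifts and masks are an assumption based on the upstream Linux i915 headers (bootrom status in bits 7:1, ukernel status in bits 15:8), and the helper names are made up for illustration. This is standalone userland C, not the driver code.

```c
#include <stdint.h>

/*
 * Assumed layout of the GUC_STATUS register (not taken from this thread):
 *   bits 7:1  = bootrom status
 *   bits 15:8 = ukernel (firmware) load status
 */
#define GS_BOOTROM_SHIFT	1
#define GS_BOOTROM_MASK		(0x7Fu << GS_BOOTROM_SHIFT)
#define GS_UKERNEL_SHIFT	8
#define GS_UKERNEL_MASK		(0xFFu << GS_UKERNEL_SHIFT)

/* Hypothetical helper: extract the bootrom field from a raw status word. */
static uint32_t
guc_status_bootrom(uint32_t status)
{
	return (status & GS_BOOTROM_MASK) >> GS_BOOTROM_SHIFT;
}

/* Hypothetical helper: extract the ukernel field from a raw status word. */
static uint32_t
guc_status_ukernel(uint32_t status)
{
	return (status & GS_UKERNEL_MASK) >> GS_UKERNEL_SHIFT;
}

/* Hypothetical helper: build a raw status word from the two fields. */
static uint32_t
guc_status_pack(uint32_t bootrom, uint32_t ukernel)
{
	return ((bootrom << GS_BOOTROM_SHIFT) & GS_BOOTROM_MASK) |
	    ((ukernel << GS_UKERNEL_SHIFT) & GS_UKERNEL_MASK);
}
```

Under that assumed layout, guc_status_pack(0x76, 0x71) yields 0x71ec, and the two extractors return 0x76 and 0x71 again, matching the bootrom and ukernel values reported above.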
>>      When I was using 7.6, I didn't see this problem.
>> 
>> >How-To-Repeat:
>>      1. hibernate or suspend
>>      2. resume
>> 
>>      The problem happens every time (~10 attempts).
>> 
>>      With the workaround diff applied, it did not happen (~3 attempts).
>>      
>> >Fix:
>>      The diff attached at the end works around the problem.
>> 
>>      The diff partially backs out the change from Feb 7 and adds a printf().
>> 
>>      I don't understand why, but if the printf() is removed, the
>>      problem starts happening again.
> 
> Thank you for the report.
> 
> Does this smaller diff still work around the problem?

The smaller diff doesn't fix the problem.  I tried 2 times.

> If not, I suspect the krealloc() conversion in __mmio_reg_add().
> __mmio_reg_add() is used by GUC_MMIO_REG_ADD().

This seems like good news to me.
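
To illustrate why a realloc-style conversion is a reasonable place to look, here is a minimal userland sketch of a register list that grows on demand. It assumes a layout where one pointer marks the start of the allocation and a second pointer marks a sub-list inside it. All struct and function names are hypothetical, realloc() stands in for krealloc(), and nothing here claims to show where the actual bug is.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical register entry, for illustration only. */
struct reg_entry {
	uint32_t offset;
	uint32_t value;
	uint32_t flags;
};

/* Hypothetical growable list with a second pointer into the same buffer. */
struct regset {
	struct reg_entry *storage;	/* start of the allocation */
	struct reg_entry *registers;	/* current sub-list inside storage */
	size_t used;			/* entries filled in */
	size_t max;			/* entries allocated */
};

/* Append one entry, growing the buffer when full.  Returns 0 or -1. */
static int
regset_add(struct regset *rs, struct reg_entry e)
{
	if (rs->used >= rs->max) {
		size_t new_max = rs->max ? rs->max * 2 : 4;
		/* Remember where the sub-list starts before the buffer moves. */
		size_t off = rs->storage ?
		    (size_t)(rs->registers - rs->storage) : 0;
		struct reg_entry *p;

		p = realloc(rs->storage, new_max * sizeof(*p));
		if (p == NULL)
			return -1;	/* old buffer is still valid */

		/*
		 * The replacement must keep the old contents and every
		 * pointer into the old buffer must be re-based.
		 */
		rs->storage = p;
		rs->registers = p + off;
		rs->max = new_max;
	}
	rs->storage[rs->used++] = e;
	return 0;
}
```

The two points that matter in any such conversion are that existing entries survive the move and that pointers derived from the old buffer are recomputed afterwards.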

Thanks,

> Index: sys/dev/pci/drm/i915/gt/uc/intel_guc_ads.c
> ===================================================================
> RCS file: /cvs/src/sys/dev/pci/drm/i915/gt/uc/intel_guc_ads.c,v
> diff -u -p -r1.9 intel_guc_ads.c
> --- sys/dev/pci/drm/i915/gt/uc/intel_guc_ads.c        7 Feb 2025 03:03:30 -0000       1.9
> +++ sys/dev/pci/drm/i915/gt/uc/intel_guc_ads.c        29 May 2025 23:43:11 -0000
> @@ -849,6 +849,7 @@ static void guc_waklv_init(struct intel_
>  {
>       struct intel_gt *gt = guc_to_gt(guc);
>       u32 offset, addr_ggtt, remain, size;
> +return;
>  
>       if (!intel_uc_uses_guc_submission(&gt->uc))
>               return;
> Index: sys/dev/pci/drm/i915/gt/uc/intel_guc_fw.c
> ===================================================================
> RCS file: /cvs/src/sys/dev/pci/drm/i915/gt/uc/intel_guc_fw.c,v
> diff -u -p -r1.8 intel_guc_fw.c
> --- sys/dev/pci/drm/i915/gt/uc/intel_guc_fw.c 7 Feb 2025 03:03:30 -0000       1.8
> +++ sys/dev/pci/drm/i915/gt/uc/intel_guc_fw.c 29 May 2025 23:43:45 -0000
> @@ -197,6 +197,7 @@ static int guc_wait_ucode(struct intel_g
>                       REG_FIELD_GET(GS_BOOTROM_MASK, status),
>                       REG_FIELD_GET(GS_UKERNEL_MASK, status));
>       }
> +     printf("%s: count = %d, ret = %d\n", __func__, count, ret);
>       after = ktime_get();
>       delta = ktime_sub(after, before);
>       delta_ms = ktime_to_ms(delta);
> 
> 
