Public bug reported:

[Impact]
On Dell systems (CID: 202602-38397) with Intel XeLPDP/Arrow Lake-S display
(8086:7d67), the i915 driver produces a kernel WARNING during cold boot and
warm boot when no monitor is connected. The BIOS can leave TypeC HPD live
status bits set on an XeLPDP TC port indicating a connected device, while
TCSS (TypeC Subsystem) power has not been enabled. During driver probe,
this causes drm_WARN_ON to fire in xelpdp_tc_phy_get_hw_state() even
though the driver correctly handles this state afterwards.
The warning triggers failures in automated cold-boot/warm-boot stress test
plans (cold-boot-loop-test, warm-boot-loop-test).
Error log:
i915 0000:00:02.0: [drm] drm_WARN_ON((tc->mode == TC_PORT_DP_ALT || tc->mode == TC_PORT_LEGACY) && !xelpdp_tc_phy_tcss_power_is_enabled(tc))
WARNING: CPU: 7 PID: 591 at drivers/gpu/drm/i915/display/intel_tc.c:1177 xelpdp_tc_phy_get_hw_state+0x145/0x150 [i915]
Call Trace:
 intel_tc_port_init_mode+0x73/0x270 [i915]
 intel_tc_port_init+0x1e0/0x2a0 [i915]
 intel_ddi_init+0xa9a/0x1220 [i915]
 intel_setup_outputs+0x1f4/0xc20 [i915]
 intel_display_driver_probe_nogem+0x16f/0x270 [i915]
 i915_driver_probe+0x251/0x670 [i915]

[Fix]
Convert the drm_WARN_ON in xelpdp_tc_phy_get_hw_state() to a drm_dbg_kms()
message. The stale HPD bits left by the BIOS are a known handoff condition
that the driver already handles correctly via intel_tc_port_update_mode(),
so the WARN_ON is a false positive and does not warrant a kernel warning.
This is consistent with the approach taken for the analogous AUX power check
in commit 5830231e6547 ("drm/i915/icl+/tc: Convert AUX powered WARN to a
debug message"), which is already in the tree.

https://lore.kernel.org/lkml/[email protected]/T/#u

[Test Plan]
Boot the system without a monitor connected and check dmesg:
$ sudo dmesg | grep -i "warn_on.*xelpdp\|WARNING.*intel_tc"
Without the patch: WARNING at intel_tc.c:1177 xelpdp_tc_phy_get_hw_state
is present in dmesg on affected hardware.
With the patch: No WARNING or drm_WARN_ON in dmesg related to
xelpdp_tc_phy_get_hw_state.
To stress test, run multiple cold/warm boot cycles:
$ sudo reboot  # repeat 10+ times, check dmesg each boot
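
The per-boot check above could be wrapped in a small helper so that each
cycle gives a clear pass/fail result. This is only a sketch: the function
name check_tc_warn and the PASS/FAIL strings are hypothetical, and it assumes
the dmesg output has first been saved to a file. It uses the same pattern as
the grep above, written for grep -E.

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch or any test suite):
# scan a saved dmesg capture for the TC PHY warning.
# Usage: check_tc_warn FILE   (FILE holds the output of `dmesg`)
check_tc_warn() {
    log=$1
    # Same pattern as the grep in the test plan, in extended-regex form.
    if grep -Eqi 'warn_on.*xelpdp|WARNING.*intel_tc' "$log"; then
        echo "FAIL: TC PHY warning present"
        return 1
    fi
    echo "PASS: no TC PHY warning"
    return 0
}
```

For example, after each boot: save the log with "sudo dmesg > /tmp/dmesg.txt"
and then run "check_tc_warn /tmp/dmesg.txt". The non-zero return code on
failure lets a boot-loop script stop at the first affected cycle.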

[Where problems could occur]
The change touches Intel XeLPDP TC port state detection in the i915 driver
on MTL/ARL platforms.
The WARN_ON being converted was asserting that TCSS power must be enabled
when the port mode is DP-alt or Legacy. By downgrading it to a debug
message, a genuinely unexpected hardware state (not just a BIOS handoff
artifact) could go unnoticed and silently result in incorrect TC port
mode initialisation. This could manifest as a TC/USB4 port failing to
enumerate connected devices or DisplayPort-alt mode not working on
affected TC ports after boot.
The risk is limited to XeLPDP platforms (MTL, ARL), and the downgraded check
fires only in the specific BIOS handoff scenario where the HPD bits are stale.
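
Because the condition would then be reported only through drm_dbg_kms(), it
is visible in dmesg only when the KMS category (bit 0x04) is enabled in the
drm.debug mask. As a rough sketch, the hypothetical helper below (the name
kms_debug_enabled is an assumption, not an existing tool) tests whether a
given mask has that bit set; the commented usage assumes the mask is read
from /sys/module/drm/parameters/debug, which requires root.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the KMS category bit (0x04) is set in a
# drm.debug mask given as decimal or 0x-prefixed hex.
kms_debug_enabled() {
    [ $(( $1 & 0x04 )) -ne 0 ]
}

# Possible usage on a test machine (note: writing 0x04 replaces the
# current mask rather than adding to it):
#   mask=$(sudo cat /sys/module/drm/parameters/debug)
#   kms_debug_enabled "$mask" || echo 0x04 | sudo tee /sys/module/drm/parameters/debug
```

Booting with drm.debug=0x04 on the kernel command line is an alternative
when the message has to be captured during driver probe.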

** Affects: hwe-next
     Importance: Undecided
         Status: New

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: New

** Affects: linux-oem-6.17 (Ubuntu)
     Importance: Undecided
         Status: Invalid

** Affects: linux (Ubuntu Noble)
     Importance: Undecided
         Status: New

** Affects: linux-oem-6.17 (Ubuntu Noble)
     Importance: Undecided
         Status: In Progress

** Affects: linux (Ubuntu Questing)
     Importance: Undecided
         Status: New

** Affects: linux-oem-6.17 (Ubuntu Questing)
     Importance: Undecided
         Status: Invalid

** Affects: linux (Ubuntu Resolute)
     Importance: Undecided
         Status: New

** Affects: linux-oem-6.17 (Ubuntu Resolute)
     Importance: Undecided
         Status: Invalid


** Tags: jira-somerville-4243 oem-priority somerville

** Also affects: linux (Ubuntu)
   Importance: Undecided
       Status: New

** Also affects: linux (Ubuntu Noble)
   Importance: Undecided
       Status: New

** Also affects: linux-oem-6.17 (Ubuntu Noble)
   Importance: Undecided
       Status: New

** Also affects: linux (Ubuntu Questing)
   Importance: Undecided
       Status: New

** Also affects: linux-oem-6.17 (Ubuntu Questing)
   Importance: Undecided
       Status: New

** Also affects: linux (Ubuntu Resolute)
   Importance: Undecided
       Status: New

** Also affects: linux-oem-6.17 (Ubuntu Resolute)
   Importance: Undecided
       Status: New

** Changed in: linux-oem-6.17 (Ubuntu Noble)
       Status: New => In Progress

** Changed in: linux-oem-6.17 (Ubuntu Questing)
       Status: New => Invalid

** Changed in: linux-oem-6.17 (Ubuntu Resolute)
       Status: New => Invalid

** Tags added: jira-somerville-4243 oem-priority somerville

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2144537

Title:
  i915 WARN_ON call trace during CB/WB on MTL/ARL platforms

To manage notifications about this bug go to:
https://bugs.launchpad.net/hwe-next/+bug/2144537/+subscriptions

