https://bugs.kde.org/show_bug.cgi?id=521142

            Bug ID: 521142
           Summary: [Regression 6.6.5] Direct scanout broken for fully
                    discrete dual-GPU desktop (NVIDIA render + Intel Arc
                    display) after dmabuf import mode removal
    Classification: Plasma
           Product: kwin
Version First Reported In: 6.6.5
          Platform: Arch Linux
                OS: Linux
            Status: REPORTED
          Severity: major
          Priority: NOR
         Component: compositing
          Assignee: [email protected]
          Reporter: [email protected]
  Target Milestone: ---

SUMMARY

Direct scanout stopped working after upgrading from KWin 6.6.4 to 6.6.5.
Games rendered on a discrete NVIDIA GPU now cause the Intel Arc B580
(display/primary GPU) to show 10-30% compositor load instead of idling at
0% during fullscreen gameplay. Downgrading to 6.6.4 immediately restores
correct behavior. The regression is likely caused by commits 4729b0bb and
64d608907, both shipped in 6.6.5.

HARDWARE TOPOLOGY

This is a fully discrete dual-GPU desktop system (not a laptop, not
Optimus, no shared memory path between GPUs):

- GPU0 (primary): Intel Arc B580 12GB (PCIe Gen 3 x8, slot 1)
  Driver: mesa-git (latest), display output connected via this GPU
  KWin compositor runs on this GPU

- GPU1 (render offload): NVIDIA RTX 2060 Super 8GB (PCIe Gen 3 x4, slot 2)
  Driver: nvidia-open-dkms (610 series, also tested 595 — same result)
  No display output connected, render-only

- CPU: Intel i9-9900K
- OS: CachyOS (Arch-based, x86-64-v3)
- Kernel: 7.0.11-1-cachyos
- KWin: 6.6.5 (broken) / 6.6.4 (working — confirmed by downgrade test)
- No custom GPU-related environment variables set system-wide

The B580 is the system primary GPU. KWin composites on it. The 2060S is
used exclusively for render offload of selected applications. No
KWIN_DRM_DEVICES or similar overrides are set.
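For anyone trying to confirm the same topology on their own machine, the following is a minimal sketch (not part of the original test procedure) that lists each DRM card with the kernel driver bound to it, and shows whether KWIN_DRM_DEVICES is set. The helper name list_drm_cards is hypothetical; it only reads standard sysfs entries, and the boot_vga file may be absent for some devices, in which case "?" is printed.

```shell
#!/bin/sh
# Hypothetical helper: print one line per DRM card with its bound kernel
# driver and the PCI boot_vga flag (1 usually marks the firmware-primary GPU).
# The optional first argument overrides the sysfs root (useful for testing).
list_drm_cards() {
    root="${1:-/sys/class/drm}"
    for card in "$root"/card[0-9]*; do
        # Skip connector entries such as card0-DP-1.
        case "${card##*/}" in *-*) continue ;; esac
        # Skip anything without a bound driver (also covers an unmatched glob).
        [ -e "$card/device/driver" ] || continue
        driver="$(basename "$(readlink -f "$card/device/driver")")"
        boot_vga="$(cat "$card/device/boot_vga" 2>/dev/null || echo '?')"
        printf '%s driver=%s boot_vga=%s\n' "${card##*/}" "$driver" "$boot_vga"
    done
}

list_drm_cards

# Show whether a device override is in effect for the current environment.
echo "KWIN_DRM_DEVICES=${KWIN_DRM_DEVICES:-<unset>}"
```

On a setup like the one described here, one card would be expected to report an Intel driver and the other the nvidia driver, with KWIN_DRM_DEVICES unset.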

RENDER OFFLOAD METHOD

The affected application (RimWorld, a native Linux/Vulkan title) is
launched with NVIDIA's standard proprietary offload variables:

  __NV_PRIME_RENDER_OFFLOAD=1
  __GLX_VENDOR_LIBRARY_NAME=nvidia
  __VK_LAYER_NV_optimus=NVIDIA_only

This configuration has worked correctly for over a year across multiple
KWin, driver, and kernel versions.
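
As an illustration of how these variables are applied per application rather than system-wide, here is a minimal sketch of a wrapper. The function name nv_offload is hypothetical; the three variables are exactly the ones listed above, and the final line only prints them back as a child process would see them.

```shell
#!/bin/sh
# Hypothetical wrapper: run the given command with the three NVIDIA offload
# variables from this report set only for that command.
nv_offload() {
    env \
        __NV_PRIME_RENDER_OFFLOAD=1 \
        __GLX_VENDOR_LIBRARY_NAME=nvidia \
        __VK_LAYER_NV_optimus=NVIDIA_only \
        "$@"
}

# Sanity check: show the offload variables as seen by a child process.
nv_offload env | grep -E '^__(NV_PRIME_RENDER_OFFLOAD|GLX_VENDOR_LIBRARY_NAME|VK_LAYER_NV_optimus)='
```

The game itself would then be started as `nv_offload <game command>`, leaving the rest of the session (including KWin) untouched.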

STEPS TO REPRODUCE

1. Configure a desktop system with two fully discrete PCIe GPUs:
   - Intel Arc (or similar) as primary/display GPU running mesa-git
   - NVIDIA discrete GPU in secondary slot, render-only
2. Launch a fullscreen application targeting the NVIDIA GPU via the
   offload variables listed above
3. Monitor primary GPU (Intel Arc) utilization

OBSERVED RESULT (KWin 6.6.5)

Intel Arc B580 shows 10-30% GPU utilization during fullscreen gameplay.
KWin is compositing through the B580 rather than allowing direct scanout.
The NVIDIA GPU's rendered frames are being routed through KWin's multi-GPU
copy path. For fully discrete PCIe cards with no shared memory, this copy
path is not a viable fallback: it defeats the entire purpose of the
render offload configuration.

EXPECTED RESULT

Intel Arc B580 GPU utilization should be ~0% during fullscreen gameplay on
the NVIDIA GPU. Direct scanout should bypass KWin composition entirely, as
it did correctly in KWin 6.6.4 and all prior versions.

CONFIRMED REGRESSION

Downgrading KWin from 6.6.5 to 6.6.4 immediately and completely restores
correct behavior. B580 returns to 0% utilization during fullscreen
gameplay. Screenshot attached showing confirmed 0% B580 utilization with
KWin 6.6.4 while RimWorld is actively running on the 2060S at ~25% load.
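
To make it easier for others to tell which side of the regression they are on, here is a minimal sketch that compares the installed KWin version against 6.6.5, the first release reported broken here. It assumes GNU sort (for -V) and assumes that `kwin_wayland --version` prints the version as its last field; the helper name version_ge is hypothetical. It says nothing about whether later releases are still affected.

```shell
#!/bin/sh
# First release in which the regression was observed (from this report).
first_bad="6.6.5"

# Hypothetical helper: succeeds if version $1 >= version $2.
# Relies on GNU sort's version ordering (-V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Assumption: the last whitespace-separated field of the first output line
# of `kwin_wayland --version` is the version number.
if command -v kwin_wayland >/dev/null 2>&1; then
    installed="$(kwin_wayland --version 2>/dev/null | awk '{print $NF; exit}')"
    if [ -z "$installed" ]; then
        echo "Could not determine the installed KWin version"
    elif version_ge "$installed" "$first_bad"; then
        echo "KWin $installed is at or after the first reported-bad release ($first_bad)"
    else
        echo "KWin $installed predates $first_bad"
    fi
fi
```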

ROOT CAUSE (LIKELY)

The regression was likely introduced by these two commits shipped in 6.6.5
as a fix for bug #517987:

  4729b0bb8b51af77181df7aae6ce2ae81a128784
  "backends/drm: drop dmabuf import modes"

  64d608907fc427b0791d7531959010c5093eb69e
  "backends/drm: don't attempt multi GPU copies with unsupported formats"

These commits removed the dmabuf import modes that were enabling direct
scanout for fully discrete multi-GPU desktop configurations. The original
bug #517987 affected Optimus laptops (AMD/NVIDIA shared-memory topology).
The fix, however, also eliminated the code path that a qualitatively
different topology (two fully discrete PCIe cards with no shared memory)
was relying on for direct scanout eligibility.

The commit message for 4729b0bb explicitly states: "If we later find some
hardware where it's proven to be beneficial, we can add this path back
with checks specific to that hardware." The configuration in this report
appears to be such hardware.

This is not a complaint about multi-GPU copy performance. The issue is
that direct scanout eligibility is lost entirely for this topology after
6.6.5, and the multi-GPU copy path is not an acceptable fallback: it is
functionally broken for this specific configuration (fully discrete
cross-vendor PCIe cards: Intel Arc + NVIDIA). It is unclear whether the
breakage is specific to the cross-vendor combination or affects any fully
discrete dual-GPU desktop setup without shared memory.

Either way, downgrading to KWin 6.6.4 immediately restores direct scanout
functionality.
