[Bug 2143032] Re: Add CXL Type-2 device support, RAS error handling, reset, state save/restore, and interleaving support

2026-05-29 Thread Ubuntu Kernel Bot
This bug is awaiting verification that the
linux-azure-nvidia-6.17/6.17.0-1010.11 kernel in -proposed solves the
problem. Please test the kernel and update this bug with the results.
If the problem is solved, change the tag
'verification-needed-noble-linux-azure-nvidia-6.17' to
'verification-done-noble-linux-azure-nvidia-6.17'. If the problem still
exists, change the tag
'verification-needed-noble-linux-azure-nvidia-6.17' to
'verification-failed-noble-linux-azure-nvidia-6.17'.

If verification is not done within 5 working days from today, this fix
will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on
how to enable and use -proposed. Thank you!


** Tags added: kernel-spammed-noble-linux-azure-nvidia-6.17-v2 
verification-needed-noble-linux-azure-nvidia-6.17
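
The tag change requested above follows a regular naming pattern in this
thread: 'verification-<state>-<series>-<source package>'. A minimal
sketch of that pattern, assuming it holds for other series and packages
as well; the function names verification_tags and retag are hypothetical
helpers for illustration, and the code only rewrites a local list of
tags, it does not talk to Launchpad:

```python
def verification_tags(series, source_package):
    """Build the needed/done/failed tag names for one series/package pair."""
    suffix = f"{series}-{source_package}"
    return {
        state: f"verification-{state}-{suffix}"
        for state in ("needed", "done", "failed")
    }


def retag(tags, series, source_package, passed):
    """Return a copy of `tags` with the 'needed' tag swapped for done/failed."""
    names = verification_tags(series, source_package)
    replacement = names["done"] if passed else names["failed"]
    # Tags other than the 'needed' one (e.g. kernel-spammed-*) are left alone.
    return [replacement if tag == names["needed"] else tag for tag in tags]


if __name__ == "__main__":
    current = [
        "kernel-spammed-noble-linux-azure-nvidia-6.17-v2",
        "verification-needed-noble-linux-azure-nvidia-6.17",
    ]
    print(retag(current, "noble", "linux-azure-nvidia-6.17", passed=True))
```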

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2143032

Title:
  Add CXL Type-2 device support, RAS error handling, reset, state
  save/restore, and interleaving support

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-nvidia-6.17/+bug/2143032/+subscriptions


-- 
ubuntu-bugs mailing list
[email protected]
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 2143032] Re: Add CXL Type-2 device support, RAS error handling, reset, state save/restore, and interleaving support

2026-05-28 Thread Ubuntu Kernel Bot
This bug is awaiting verification that the
linux-nvidia-bos-7.0/7.0.0-2008.8~24.04.1 kernel in -proposed solves the
problem. Please test the kernel and update this bug with the results.
If the problem is solved, change the tag
'verification-needed-noble-linux-nvidia-bos-7.0' to
'verification-done-noble-linux-nvidia-bos-7.0'. If the problem still
exists, change the tag
'verification-needed-noble-linux-nvidia-bos-7.0' to
'verification-failed-noble-linux-nvidia-bos-7.0'.

If verification is not done within 5 working days from today, this fix
will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on
how to enable and use -proposed. Thank you!


** Tags added: kernel-spammed-noble-linux-nvidia-bos-7.0-v2 
verification-needed-noble-linux-nvidia-bos-7.0


[Bug 2143032] Re: Add CXL Type-2 device support, RAS error handling, reset, state save/restore, and interleaving support

2026-05-28 Thread Ubuntu Kernel Bot
This bug is awaiting verification that the linux-nvidia-bos/7.0.0-2007.7
kernel in -proposed solves the problem. Please test the kernel and
update this bug with the results. If the problem is solved, change the
tag 'verification-needed-resolute-linux-nvidia-bos' to
'verification-done-resolute-linux-nvidia-bos'. If the problem still
exists, change the tag 'verification-needed-resolute-linux-nvidia-bos'
to 'verification-failed-resolute-linux-nvidia-bos'.

If verification is not done within 5 working days from today, this fix
will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on
how to enable and use -proposed. Thank you!


** Tags added: kernel-spammed-resolute-linux-nvidia-bos-v2 
verification-needed-resolute-linux-nvidia-bos


[Bug 2143032] Re: Add CXL Type-2 device support, RAS error handling, reset, state save/restore, and interleaving support

2026-05-21 Thread Ubuntu Kernel Bot
** Tags added: kernel-daily-bug


[Bug 2143032] Re: Add CXL Type-2 device support, RAS error handling, reset, state save/restore, and interleaving support

2026-05-21 Thread Jacob Martin
** Also affects: linux-nvidia-bos (Ubuntu)
   Importance: Undecided
   Status: New

** Changed in: linux-nvidia-bos (Ubuntu)
   Status: New => Invalid

** Also affects: linux-nvidia-6.17 (Ubuntu Resolute)
   Importance: Undecided
   Status: New

** Also affects: linux-nvidia-bos (Ubuntu Resolute)
   Importance: Undecided
   Status: New

** Changed in: linux-nvidia-bos (Ubuntu Noble)
   Status: New => Invalid

** Changed in: linux-nvidia-6.17 (Ubuntu Resolute)
   Status: New => Invalid

** Changed in: linux-nvidia-bos (Ubuntu Resolute)
   Status: New => Fix Committed


[Bug 2143032] Re: Add CXL Type-2 device support, RAS error handling, reset, state save/restore, and interleaving support

2026-04-22 Thread Ubuntu Kernel Bot
This bug is awaiting verification that the
linux-nvidia-6.17/6.17.0-1017.17 kernel in -proposed solves the problem.
Please test the kernel and update this bug with the results. If the
problem is solved, change the tag
'verification-needed-noble-linux-nvidia-6.17' to
'verification-done-noble-linux-nvidia-6.17'. If the problem still
exists, change the tag 'verification-needed-noble-linux-nvidia-6.17' to
'verification-failed-noble-linux-nvidia-6.17'.

If verification is not done within 5 working days from today, this fix
will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on
how to enable and use -proposed. Thank you!


** Tags added: kernel-spammed-noble-linux-nvidia-6.17-v2 
verification-needed-noble-linux-nvidia-6.17


[Bug 2143032] Re: Add CXL Type-2 device support, RAS error handling, reset, state save/restore, and interleaving support

2026-04-20 Thread Jacob Martin
** Also affects: linux-nvidia-6.17 (Ubuntu Noble)
   Importance: Undecided
   Status: New

** Changed in: linux-nvidia-6.17 (Ubuntu)
   Status: New => Invalid

** Changed in: linux-nvidia-6.17 (Ubuntu Noble)
   Status: New => Fix Committed


[Bug 2143032] Re: Add CXL Type-2 device support, RAS error handling, reset, state save/restore, and interleaving support

2026-04-15 Thread Jiandi An
** Description changed:

  This patch series adds comprehensive CXL (Compute Express Link) support to the
  nvidia-6.17 kernel, including:
  
  1. CXL Type-2 device support - Enables accelerator devices (like GPUs and
     SmartNICs) to use CXL for coherent memory access via firmware-provisioned
     regions
  2. CXL RAS (Reliability, Availability, Serviceability) error handling -
     Implements PCIe Port Protocol error handling and logging for CXL Root Ports,
     Downstream Switch Ports, and Upstream Switch Ports
  3. CXL DVSEC and HDM state save/restore - Preserves CXL DVSEC control/range
     registers and HDM decoder programming across PCI resets and link transitions,
     enabling device re-initialization after reset for firmware-provisioned
     configurations
  4. CXL Reset support - Implements the CXL Reset method (CXL Spec v3.2,
     Sections 8.1.3, 9.6, 9.7) via a sysfs interface for Type-2 devices,
     including memory offlining, cache flushing, multi-function sibling
     coordination, and DVSEC reset sequencing
  5. Multi-level interleaving fix - Supports firmware-configured CXL
     interleaving where lower levels use smaller granularities than parent ports
     (reverse HPA bit ordering)
  6. Prerequisite CXL and PCI driver updates - Cherry-picked commits from
     upstream torvalds/master covering the range from v6.17.9 to the merge
     point of Terry Bowman's v14 series into v7.0
  7. CXL DAX support - Enables direct memory access to CXL RAM regions and
     mapping CXL DAX devices as System-RAM
  
  Key Features Added:
  
    - CXL Type-2 accelerator device registration and memory management
    - CXL region creation by Type-2 drivers
    - DPA (Device Physical Address) allocation interface for accelerators
    - HPA (Host Physical Address) free space enumeration
    - Multi-level CXL address translation (SPA↔HPA↔DPA)
    - CXL protocol error detection, forwarding, and recovery
    - CXL RAS error handling for Endpoints, RCH, and Switch Ports
      (replacing the old PCIEAER_CXL symbol with the new CXL_RAS def_bool)
    - CXL extended linear cache region support
    - CXL DVSEC and HDM decoder state save/restore across PCI resets
    - CXL Reset sysfs interface (/sys/bus/pci/devices/.../cxl_reset) for
      Type-2 devices with Reset Capable bit set
    - Multi-function sibling coordination during CXL reset via Non-CXL
      Function Map DVSEC
    - CPU cache flush using cpu_cache_invalidate_memregion() during reset
    - Multi-level interleaving with smaller granularities for lower decoder
      levels (firmware-provisioned configurations)
    - CXL DAX device access (DEV_DAX_CXL) and System-RAM mapping
      (DEV_DAX_KMEM)
    - CXL protocol error injection via APEI EINJ (ACPI_APEI_EINJ_CXL)
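
When verifying a kernel from -proposed, one quick sanity check is
whether the Kconfig symbols named in the list above are enabled in the
installed kernel's config. A minimal sketch, assuming the config is
available as plain text (Ubuntu normally ships it as
/boot/config-<kernel release>); the helper name read_options is
hypothetical, and a symbol being set says nothing about whether the
feature works on a given machine:

```python
import platform
import re

# Kconfig symbols named in the feature list above.
OPTIONS = ("CXL_RAS", "DEV_DAX_CXL", "DEV_DAX_KMEM", "ACPI_APEI_EINJ_CXL")


def read_options(config_text, names=OPTIONS):
    """Map each symbol to 'y', 'm', or None (unset or absent)."""
    found = {name: None for name in names}
    for line in config_text.splitlines():
        # Lines such as "# CONFIG_FOO is not set" do not match and stay None.
        match = re.match(r"CONFIG_(\w+)=([ym])$", line.strip())
        if match and match.group(1) in found:
            found[match.group(1)] = match.group(2)
    return found


if __name__ == "__main__":
    # Assumed location of the running kernel's config on Ubuntu.
    path = f"/boot/config-{platform.release()}"
    with open(path, encoding="utf-8") as handle:
        for name, value in read_options(handle.read()).items():
            print(f"CONFIG_{name}: {value or 'not set'}")
```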
  
  Justification
  
  CXL Type-2 device support is critical for next-generation NVIDIA accelerators
  and data center workloads:
  
    - Enables coherent memory sharing between CPUs and accelerators
    - Supports firmware-provisioned CXL regions for accelerator memory
    - Provides proper error handling and reporting for CXL fabric errors
    - Enables device reset and state recovery for CXL Type-2 devices
    - Preserves firmware-programmed DVSEC and HDM decoder state across resets
    - Required for upcoming NVIDIA hardware with CXL capabilities
  
  Source
  Patch Breakdown (139 patches + 1 revert):
  #  Category                        Count  Source
  1  Revert old CXL reset (f198764)  1      OOT (cleanup)
  2  Upstream CXL/PCI prerequisite   103    Upstream torvalds/master (v6.17.9
     cherry-picks                           → merge of Terry Bowman v14 into v7.0)
  3  Smita Koralahalli's CXL EINJ    1      LKML (v6, not yet merged)
     series v6 patch 3/9
  4  Alejandro Lucero's CXL Type-2   22     LKML (v23, not yet merged)
     series v23
  5  Robert Richter's multi-level    1      LKML (v1, not yet merged)
     interleaving fix
  6  Srirangan Madhavan's CXL state  5      LKML (v1, not yet merged)
     save/restore series
  7  Srirangan Madhavan's CXL reset  7      LKML (v5, not yet merged)
     series
+ 8  Upstream fixes for ported       14     13 Upstream merged fixes + 1
+    commits                                prerequisite
  8  
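
The per-row counts in the breakdown above can be checked against the
"139 patches + 1 revert" header with a few lines of arithmetic. A small
sketch using only the counts quoted in this message; the short labels
are abbreviations of the table's category names, and row 8 is kept
separate because it is the row added by this description change:

```python
# (label, count) for rows 1-7 of the patch breakdown quoted above.
ROWS_1_TO_7 = [
    ("revert old CXL reset", 1),
    ("upstream CXL/PCI prerequisite cherry-picks", 103),
    ("CXL EINJ series v6 patch 3/9", 1),
    ("CXL Type-2 series v23", 22),
    ("multi-level interleaving fix", 1),
    ("CXL state save/restore series", 5),
    ("CXL reset series", 7),
]
ROW_8_UPSTREAM_FIXES = 14  # 13 upstream merged fixes + 1 prerequisite

total_1_to_7 = sum(count for _, count in ROWS_1_TO_7)
patches_without_revert = total_1_to_7 - 1

print(total_1_to_7)                          # rows 1-7, revert included
print(patches_without_revert)                # rows 1-7, revert excluded
print(total_1_to_7 + ROW_8_UPSTREAM_FIXES)   # rows 1-8 together
```

Rows 1-7 add up to 140, which matches the header (139 patches plus the
revert). Counting the 14 commits of row 8 on top gives 154; the header
line as quoted here does not include them.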

[Bug 2143032] Re: Add CXL Type-2 device support, RAS error handling, reset, state save/restore, and interleaving support

2026-03-30 Thread Jiandi An
** Description changed:

- This patch series adds comprehensive CXL (Compute Express Link) support
- to the nvidia-6.17 kernel, including:
+ This patch series adds comprehensive CXL (Compute Express Link) support to the
+ nvidia-6.17 kernel, including:
  
- 1. CXL Type-2 device support - Enables accelerator devices (like GPUs
- and SmartNICs) to use CXL for coherent memory access
- 
+ 1. CXL Type-2 device support - Enables accelerator devices (like GPUs and
+    SmartNICs) to use CXL for coherent memory access via firmware-provisioned
+    regions
  2. CXL RAS (Reliability, Availability, Serviceability) error handling -
- Implements PCIe Port Protocol error handling and logging for CXL devices
- 
- 3. Prerequisite CXL driver updates - Cherry-picked commits from Linux
- v6.18 that are required dependencies
- 
+    Implements PCIe Port Protocol error handling and logging for CXL Root Ports,
+    Downstream Switch Ports, and Upstream Switch Ports
+ 3. CXL DVSEC and HDM state save/restore - Preserves CXL DVSEC control/range
+    registers and HDM decoder programming across PCI resets and link transitions,
+    enabling device re-initialization after reset for firmware-provisioned
+    configurations
+ 4. CXL Reset support - Implements the CXL Reset method (CXL Spec v3.2,
+    Sections 8.1.3, 9.6, 9.7) via a sysfs interface for Type-2 devices,
+    including memory offlining, cache flushing, multi-function sibling
+    coordination, and DVSEC reset sequencing
+ 5. Multi-level interleaving fix - Supports firmware-configured CXL
+    interleaving where lower levels use smaller granularities than parent ports
+    (reverse HPA bit ordering)
+ 6. Prerequisite CXL and PCI driver updates - Cherry-picked commits from
+    upstream torvalds/master covering the range from v6.17.9 to the merge
+    point of Terry Bowman's v14 series into v7.0
+ 7. CXL DAX support - Enables direct memory access to CXL RAM regions and
+    mapping CXL DAX devices as System-RAM
  
  Key Features Added:
  
- CXL Type-2 accelerator device registration and memory management
- CXL region creation by Type-2 drivers
- DPA (Device Physical Address) allocation interface for accelerators
- HPA (Host Physical Address) free space enumeration
- CXL protocol error detection, forwarding, and recovery
- RAS register mapping for CXL Endpoints and Switch Ports
+   - CXL Type-2 accelerator device registration and memory management
+   - CXL region creation by Type-2 drivers
+   - DPA (Device Physical Address) allocation interface for accelerators
+   - HPA (Host Physical Address) free space enumeration
+   - Multi-level CXL address translation (SPA↔HPA↔DPA)
+   - CXL protocol error detection, forwarding, and recovery
+   - CXL RAS error handling for Endpoints, RCH, and Switch Ports
+     (replacing the old PCIEAER_CXL symbol with the new CXL_RAS def_bool)
+   - CXL extended linear cache region support
+   - CXL DVSEC and HDM decoder state save/restore across PCI resets
+   - CXL Reset sysfs interface (/sys/bus/pci/devices/.../cxl_reset) for
+     Type-2 devices with Reset Capable bit set
+   - Multi-function sibling coordination during CXL reset via Non-CXL
+     Function Map DVSEC
+   - CPU cache flush using cpu_cache_invalidate_memregion() during reset
+   - Multi-level interleaving with smaller granularities for lower decoder
+     levels (firmware-provisioned configurations)
+   - CXL DAX device access (DEV_DAX_CXL) and System-RAM mapping
+     (DEV_DAX_KMEM)
+   - CXL protocol error injection via APEI EINJ (ACPI_APEI_EINJ_CXL)
  
  Justification
  
- CXL Type-2 device support is critical for next-generation NVIDIA
- accelerators and data center workloads:
+ CXL Type-2 device support is critical for next-generation NVIDIA accelerators
+ and data center workloads:
+ 
+   - Enables coherent memory sharing between CPUs and accelerators
+   - Supports firmware-provisioned CXL regions for accelerator memory
+   - Provides proper error handling and reporting for CXL fabric errors
+   - Enables device reset and state recovery for CXL Type-2 devices
+   - Preserves firmware-programmed DVSEC and HDM decoder state across resets
+   - Required for upcoming NVIDIA hardware with CXL capabilities
+ 
+ Source
+ Patch Breakdown (139 patches + 1 revert):
+ #  Category                        Count  Source
+ 1  Revert old CXL reset (f198764)  1      OOT (cleanup)
+ 2  Upstream CXL/PCI prerequisite   103    Upstream torvalds/master (v6.17.9
+    cherry-picks                           → merge of Terry Bowman v14 into v7.0)
+ 3  Smita Koralahalli's CXL EINJ    1      LKML (v6, not yet merged)
+    series v6 patch 3/9
+ 
---

[Bug 2143032] Re: Add CXL Type-2 device support, RAS error handling, reset, state save/restore, and interleaving support

2026-03-30 Thread Jiandi An
** Summary changed:

- Add CXL Type-2 device support and CXL RAS error handling
+ Add CXL Type-2 device support, RAS error handling, reset, state save/restore, and interleaving support
