On 7/23/2025 7:38 PM, Raag Jadav wrote:
On Tue, Jul 15, 2025 at 04:17:25PM +0530, Riana Tauro wrote:
Certain runtime firmware errors can cause the device to be in an unusable
state requiring a firmware flash to restore normal operation.
Runtime Survivability Mode indicates firmware flash is necessary by
wedging the device and exposing survivability mode sysfs.

The sysfs file below indicates that the device is in survivability mode:

/sys/bus/pci/devices/<device>/survivability_mode
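
As a rough illustration only (not part of the patch), a userspace tool could test for this attribute roughly as follows. The helper names, the caller-supplied sysfs root, and the example PCI address are all hypothetical, and the check assumes that the mere presence of the file is the indication described above.

```c
#include <stdio.h>
#include <unistd.h>

/*
 * Build "<root>/<bdf>/survivability_mode" into buf.
 * Both helper names are hypothetical; root is normally
 * "/sys/bus/pci/devices" and bdf a PCI address such as "0000:03:00.0".
 * Returns 0 on success, -1 if the path does not fit in buf.
 */
static int build_survivability_path(char *buf, size_t len,
                                    const char *root, const char *bdf)
{
        int n = snprintf(buf, len, "%s/%s/survivability_mode", root, bdf);

        if (n < 0 || (size_t)n >= len)
                return -1;
        return 0;
}

/*
 * Return 1 if the survivability_mode attribute exists for the device,
 * 0 if it does not (or the path could not be built).
 * Assumption: existence of the file alone signals survivability mode.
 */
static int survivability_mode_present(const char *root, const char *bdf)
{
        char path[512];

        if (build_survivability_path(path, sizeof(path), root, bdf))
                return 0;
        return access(path, F_OK) == 0;
}
```

A caller would pass "/sys/bus/pci/devices" as the root and the address of the device in question; taking the root as a parameter just makes the check easy to exercise against a scratch directory.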

...

+int xe_survivability_mode_runtime_enable(struct xe_device *xe)
+{
+       struct xe_survivability *survivability = &xe->survivability;
+       struct pci_dev *pdev = to_pci_dev(xe->drm.dev);
+       int ret;
+
+       if (!IS_DGFX(xe) || IS_SRIOV_VF(xe) || xe->info.platform < XE_BATTLEMAGE) {
+               dev_err(&pdev->dev, "Runtime Survivability Mode not supported\n");
+               return -EINVAL;
+       }
+
+       ret = init_survivability_mode(xe);
+       if (ret)
+               return ret;
+
+       ret = create_survivability_sysfs(pdev);
+       if (ret)
+               dev_err(&pdev->dev, "Failed to create survivability mode sysfs\n");
+
+       survivability->type = XE_SURVIVABILITY_TYPE_RUNTIME;
+       dev_err(&pdev->dev, "Runtime Survivability mode enabled\n");
+
+       xe_device_set_wedged_method(xe, DRM_WEDGE_RECOVERY_VENDOR);
+       xe_device_declare_wedged(xe);
+       dev_err(&pdev->dev, "Firmware update required, Refer the userspace documentation for more details!\n");

Do we have it? Or did I miss it somewhere? :D

fwupd currently implements it, and they have generic documentation as well as
Intel-specific documentation at
https://github.com/fwupd/fwupd/blob/main/plugins/intel-gsc/README.md
Once the patches are good to merge, the dmesg and sysfs details will be added
in the same location by Frank.

I have mentioned "userspace" because other tools might use this in the future.
There has to be a message indicating that a firmware update is required.

Thanks
Riana


Raag

