On 2025/10/20 21:58, Lukas Wunner wrote:
On Mon, Oct 20, 2025 at 09:09:41PM +0800, Shuai Xue wrote:
On 2025/10/20 18:17, Lukas Wunner wrote:
On Wed, Oct 15, 2025 at 10:41:58AM +0800, Shuai Xue wrote:
Replace the manual checks for native AER control with the
pcie_aer_is_native() helper, which provides a more robust way
to determine if we have native control of AER.
Why is it more robust?
IMHO, the pcie_aer_is_native() helper is more robust because it includes
additional safety checks that the manual approach lacks:
[...]
Specifically, it performs a sanity check for dev->aer_cap before
evaluating native AER control.
I'm under the impression that aer_cap must be set, otherwise the
error wouldn't have been reported and we wouldn't be in this code path?
If we can end up in this code path without aer_cap set, your patch
would regress devices which are not AER-capable because it would
now skip clearing of errors in the Device Status register via
pcie_clear_device_status().
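To make the concern concrete, here is a minimal userspace sketch (not kernel code) that contrasts the open-coded check with a model of the helper. The names model_host, model_dev, pcie_ports_native_model, old_check() and helper_check() are hypothetical stand-ins introduced for illustration. The only assumption about the real pcie_aer_is_native() is the one stated in this thread: it tests dev->aer_cap before evaluating native AER control.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct model_host {
	bool native_aer;		/* OS was granted native AER control */
};

struct model_dev {
	unsigned int aer_cap;		/* 0 means no AER capability found */
	struct model_host *host;
};

/* Models the pcie_ports=native override; false by default. */
static bool pcie_ports_native_model;

/* The open-coded condition: ignores whether the device has aer_cap. */
static bool old_check(const struct model_dev *dev)
{
	return dev->host->native_aer || pcie_ports_native_model;
}

/* Model of the helper: bails out first when aer_cap is not set. */
static bool helper_check(const struct model_dev *dev)
{
	if (!dev->aer_cap)
		return false;

	return pcie_ports_native_model || dev->host->native_aer;
}
```

With native_aer set and aer_cap equal to 0, old_check() returns true while helper_check() returns false, so a caller guarded by the helper would skip pcie_clear_device_status() for that device. For any device with aer_cap set, the two checks agree.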
Hi Lukas,
You raise an excellent point about the potential regression.
The original code is:
    if (host->native_aer || pcie_ports_native) {
        pcie_clear_device_status(bridge);
        pci_aer_clear_nonfatal_status(bridge);
    }
This code clears both the PCIe Device Status register and AER status
registers when in native AER mode.
pcie_clear_device_status() was renamed from
pci_aer_clear_device_status(). Is it intended to clear only the AER error
status bits of the Device Status register?
- BIT 0: Correctable Error Detected
- BIT 1: Non-Fatal Error Detected
- BIT 2: Fatal Error Detected
- BIT 3: Unsupported Request Detected
According to the PCIe spec, BIT 0-2 are logged for Functions supporting
Advanced Error Handling.
I am not sure whether we should also clear BIT 3 and BIT 6 (Emergency Power
Reduction Detected) in the case of an AER error.
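The difference between the two options can be sketched in userspace as below. This is a model, not kernel code: the DEVSTA_* names, rw1c_write(), clear_all_set_bits() and clear_error_bits_only() are hypothetical, and the sketch assumes that bits 0-3 and bit 6 of Device Status are write-1-to-clear while bits 4-5 are read-only. As far as I can tell, the current pcie_clear_device_status() reads the register and writes the same value back, which corresponds to clear_all_set_bits() here.

```c
#include <stdint.h>

/* Device Status bits discussed above (hypothetical names). */
#define DEVSTA_CED	0x0001	/* BIT 0: Correctable Error Detected */
#define DEVSTA_NFED	0x0002	/* BIT 1: Non-Fatal Error Detected */
#define DEVSTA_FED	0x0004	/* BIT 2: Fatal Error Detected */
#define DEVSTA_URD	0x0008	/* BIT 3: Unsupported Request Detected */
#define DEVSTA_EPRD	0x0040	/* BIT 6: Emergency Power Reduction Detected */

#define DEVSTA_ERR_BITS	(DEVSTA_CED | DEVSTA_NFED | DEVSTA_FED)
#define DEVSTA_RW1C	(DEVSTA_ERR_BITS | DEVSTA_URD | DEVSTA_EPRD)

/* Model a register write: only RW1C bits written as 1 get cleared. */
static uint16_t rw1c_write(uint16_t reg, uint16_t val)
{
	return reg & (uint16_t)~(val & DEVSTA_RW1C);
}

/* Read the status and write it back: clears every RW1C bit that is set. */
static uint16_t clear_all_set_bits(uint16_t sta)
{
	return rw1c_write(sta, sta);
}

/* Alternative: write back only BIT 0-2, leaving BIT 3 and BIT 6 alone. */
static uint16_t clear_error_bits_only(uint16_t sta)
{
	return rw1c_write(sta, sta & DEVSTA_ERR_BITS);
}
```

Starting from a status value with BIT 0, 1, 3 and 6 set (0x004b), clear_all_set_bits() leaves nothing set, whereas clear_error_bits_only() leaves BIT 3 and BIT 6 (0x0048) for whoever else is responsible for them.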
Thanks,
Lukas
Thanks.
Shuai