>-----Original Message----- >From: Jonathan Cameron <jonathan.came...@huawei.com> >Subject: Re: [PATCH v3 1/3] PCI/AER: Store UNCOR_STATUS bits that might >be ANFE in aer_err_info > >On Sun, 28 Apr 2024 03:31:11 +0000 >"Duan, Zhenzhong" <zhenzhong.d...@intel.com> wrote: > >> Hi Jonathan, >> >> >-----Original Message----- >> >From: Jonathan Cameron <jonathan.came...@huawei.com> >> >Subject: Re: [PATCH v3 1/3] PCI/AER: Store UNCOR_STATUS bits that >might >> >be ANFE in aer_err_info >> > >> >On Tue, 23 Apr 2024 02:25:05 +0000 >> >"Duan, Zhenzhong" <zhenzhong.d...@intel.com> wrote: >> > >> >> >-----Original Message----- >> >> >From: Jonathan Cameron <jonathan.came...@huawei.com> >> >> >Subject: Re: [PATCH v3 1/3] PCI/AER: Store UNCOR_STATUS bits that >> >might >> >> >be ANFE in aer_err_info >> >> > >> >> >On Wed, 17 Apr 2024 14:14:05 +0800 >> >> >Zhenzhong Duan <zhenzhong.d...@intel.com> wrote: >> >> > >> >> >> In some cases the detector of a Non-Fatal Error(NFE) is not the most >> >> >> appropriate agent to determine the type of the error. For example, >> >> >> when software performs a configuration read from a non-existent >> >> >> device or Function, completer will send an ERR_NONFATAL Message. >> >> >> On some platforms, ERR_NONFATAL results in a System Error, which >> >> >> breaks normal software probing. >> >> >> >> >> >> Advisory Non-Fatal Error(ANFE) is a special case that can be used >> >> >> in above scenario. It is predominantly determined by the role of the >> >> >> detecting agent (Requester, Completer, or Receiver) and the specific >> >> >> error. In such cases, an agent with AER signals the NFE (if enabled) >> >> >> by sending an ERR_COR Message as an advisory to software, instead >of >> >> >> sending ERR_NONFATAL. >> >> >> >> >> >> When processing an ANFE, ideally both correctable error(CE) status >and >> >> >> uncorrectable error(UE) status should be cleared. However, there is >no >> >> >> way to fully identify the UE associated with ANFE. Even worse, a >Fatal >> >> >> Error(FE) or Non-Fatal Error(NFE) may set the same UE status bit as >> >> >> ANFE. Treating an ANFE as NFE will reproduce above mentioned >issue, >> >> >> i.e., breaking softwore probing; treating NFE as ANFE will make us >> >> >> ignoring some UEs which need active recover operation. To avoid >> >clearing >> >> >> UEs that are not ANFE by accident, the most conservative route is >taken >> >> >> here: If any of the FE/NFE Detected bits is set in Device Status, do >not >> >> >> touch UE status, they should be cleared later by the UE handler. >> >Otherwise, >> >> >> a specific set of UEs that may be raised as ANFE according to the >PCIe >> >> >> specification will be cleared if their corresponding severity is Non- >Fatal. >> >> >> >> >> >> To achieve above purpose, store UNCOR_STATUS bits that might be >> >ANFE >> >> >> in aer_err_info.anfe_status. So that those bits could be printed and >> >> >> processed later. >> >> >> >> >> >> Tested-by: Yudong Wang <yudong.w...@intel.com> >> >> >> Co-developed-by: "Wang, Qingshun" ><qingshun.w...@linux.intel.com> >> >> >> Signed-off-by: "Wang, Qingshun" <qingshun.w...@linux.intel.com> >> >> >> Signed-off-by: Zhenzhong Duan <zhenzhong.d...@intel.com> >> >> >> --- >> >> >> drivers/pci/pci.h | 1 + >> >> >> drivers/pci/pcie/aer.c | 45 >> >> >++++++++++++++++++++++++++++++++++++++++++ >> >> >> 2 files changed, 46 insertions(+) >> >> >> >> >> >> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h >> >> >> index 17fed1846847..3f9eb807f9fd 100644 >> >> >> --- a/drivers/pci/pci.h >> >> >> +++ b/drivers/pci/pci.h >> >> >> @@ -412,6 +412,7 @@ struct aer_err_info { >> >> >> >> >> >> unsigned int status; /* COR/UNCOR Error Status >*/ >> >> >> unsigned int mask; /* COR/UNCOR Error Mask */ >> >> >> + unsigned int anfe_status; /* UNCOR Error Status for >ANFE */ >> >> >> struct pcie_tlp_log tlp; /* TLP Header */ >> >> >> }; >> >> >> >> >> >> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c >> >> >> index ac6293c24976..27364ab4b148 100644 >> >> >> --- a/drivers/pci/pcie/aer.c >> >> >> +++ b/drivers/pci/pcie/aer.c >> >> >> @@ -107,6 +107,12 @@ struct aer_stats { >> >> >> > PCI_ERR_ROOT_MULTI_COR_RCV | >> >> > \ >> >> >> > PCI_ERR_ROOT_MULTI_UNCOR_RCV) >> >> >> >> >> >> +#define AER_ERR_ANFE_UNC_MASK >> >> > (PCI_ERR_UNC_POISON_TLP | \ >> >> >> + PCI_ERR_UNC_COMP_TIME | >> >> > \ >> >> >> + PCI_ERR_UNC_COMP_ABORT >| >> >> > \ >> >> >> + PCI_ERR_UNC_UNX_COMP | >> >> > \ >> >> >> + PCI_ERR_UNC_UNSUP) >> >> >> + >> >> >> static int pcie_aer_disable; >> >> >> static pci_ers_result_t aer_root_reset(struct pci_dev *dev); >> >> >> >> >> >> @@ -1196,6 +1202,41 @@ void aer_recover_queue(int domain, >> >unsigned >> >> >int bus, unsigned int devfn, >> >> >> EXPORT_SYMBOL_GPL(aer_recover_queue); >> >> >> #endif >> >> >> >> >> >> +static void anfe_get_uc_status(struct pci_dev *dev, struct >> >aer_err_info >> >> >*info) >> >> >> +{ >> >> >> + u32 uncor_mask, uncor_status; >> >> >> + u16 device_status; >> >> >> + int aer = dev->aer_cap; >> >> >> + >> >> >> + if (pcie_capability_read_word(dev, PCI_EXP_DEVSTA, >> >> >&device_status)) >> >> >> + return; >> >> >> + /* >> >> >> + * Take the most conservative route here. If there are >> >> >> + * Non-Fatal/Fatal errors detected, do not assume any >> >> >> + * bit in uncor_status is set by ANFE. >> >> >> + */ >> >> >> + if (device_status & (PCI_EXP_DEVSTA_NFED | >PCI_EXP_DEVSTA_FED)) >> >> >> + return; >> >> >> + >> >> > >> >> >Is there not a race here? If we happen to get either an NFED or FED >> >> >between the read of device_status above and here we might pick up a >> >status >> >> >that corresponds to that (and hence clear something we should not). >> >> >> >> In this scenario, info->anfe_status is 0. >> > >> >OK. In that case what is the point of the check above? >> >If the code is safe to races, it's safe to go ahead without that check >> >on what might race. >> >> Good question. >> After further digging into the spec, I just found I misunderstood it. >> An UNCUR error raised as ANFE can be raised as NFE in different cases, >> so info->anfe_status can be nonzero here and the race you mentioned >> does exist, the check on PCI_EXP_DEVSTA_FED is also unnecessary. >> Sorry for the misleading. I plan to have below change to fix the race: >> >> unsigned int anfe_status; >> anfe_status = uncor_status & ~uncor_mask & ~info->severity & >> AER_ERR_ANFE_UNC_MASK; >> >> if (pcie_capability_read_word(dev, PCI_EXP_DEVSTA, &device_status)) >> return; >> /* >> * Take the most conservative route here. If there are >> * Non-Fatal errors detected, do not assume any >> * bit in uncor_status is set by ANFE. >> */ >> if (device_status & PCI_EXP_DEVSTA_NFED) >> return; >> info->anfe_status = anfe_status; >> >> With this change, there is still a small window between reading >uncor_status >> and device_status to leak ANFE, but that's the best we can do and better >> than clearing NFE. Let me know if you have better idea😊 > >Worth leaving some breadcrumbs about there being a race (so a comment) >and explain what the side effects of hitting that race are (lost info >on the error I think, but not a missed error)?
Plan to add below comments, let me know if it's unclear for you: "If there is another ANFE between reading uncor_status and clearing PCI_ERR_COR_ADV_NFAT bit in cor_status register, that ANFE isn't recorded in info->anfe_status. It will be read out as NFE in next uncor_status register reading and processed by NFE handler." Thanks Zhenzhong >> >> Thanks >> Zhenzhong >> >> > >> >> >> >> > >> >> >Or am I missing that race being close somewhere? >> >> >> >> The bits leading to NFED or FED is masked out when assigning info- >> >>anfe_status. >> >> Bits for FED is masked out by ~info->severity, >> >> bit for NFED is masked out by AER_ERR_ANFE_UNC_MASK. >> >> >> >> So we never clear status bits for NFED or FED in ANFE handler. >> >> >> >> See below assignment of info->anfe_status. >> >> >> >> Thanks >> >> Zhenzhong >> >> >> >> > >> >> >> + pci_read_config_dword(dev, aer + PCI_ERR_UNCOR_STATUS, >> >> >&uncor_status); >> >> >> + pci_read_config_dword(dev, aer + PCI_ERR_UNCOR_MASK, >> >> >&uncor_mask); >> >> >> + /* >> >> >> + * According to PCIe Base Specification Revision 6.1, >> >> >> + * Section 6.2.3.2.4, if an UNCOR error is raised as >> >> >> + * Advisory Non-Fatal error, it will match the following >> >> >> + * conditions: >> >> >> + * a. The severity of the error is Non-Fatal. >> >> >> + * b. The error is one of the following: >> >> >> + * 1. Poisoned TLP (Section 6.2.3.2.4.3) >> >> >> + * 2. Completion Timeout (Section 6.2.3.2.4.4) >> >> >> + * 3. Completer Abort (Section 6.2.3.2.4.1) >> >> >> + * 4. Unexpected Completion (Section >6.2.3.2.4.5) >> >> >> + * 5. Unsupported Request (Section 6.2.3.2.4.1) >> >> >> + */ >> >> >> + info->anfe_status = uncor_status & ~uncor_mask & ~info- >>severity >> >> >& >> >> >> + AER_ERR_ANFE_UNC_MASK; >> >> >> +} >> >> >> + >> >> >> /** >> >> >> * aer_get_device_error_info - read error status from dev and store >it >> >to >> >> >info >> >> >> * @dev: pointer to the device expected to have a error record >> >> >> @@ -1213,6 +1254,7 @@ int aer_get_device_error_info(struct >pci_dev >> >> >*dev, struct aer_err_info *info) >> >> >> >> >> >> /* Must reset in this function */ >> >> >> info->status = 0; >> >> >> + info->anfe_status = 0; >> >> >> info->tlp_header_valid = 0; >> >> >> >> >> >> /* The device might not support AER */ >> >> >> @@ -1226,6 +1268,9 @@ int aer_get_device_error_info(struct >pci_dev >> >> >*dev, struct aer_err_info *info) >> >> >> &info->mask); >> >> >> if (!(info->status & ~info->mask)) >> >> >> return 0; >> >> >> + >> >> >> + if (info->status & PCI_ERR_COR_ADV_NFAT) >> >> >> + anfe_get_uc_status(dev, info); >> >> >> } else if (type == PCI_EXP_TYPE_ROOT_PORT || >> >> >> type == PCI_EXP_TYPE_RC_EC || >> >> >> type == PCI_EXP_TYPE_DOWNSTREAM || >> >> >> >> >>