On 2/27/2019 11:56 AM, Bolen, Austin wrote:
>
> BTW, this patch in particular is complaining about an error for a
> removed device. The Dell servers referenced in this chain will check if
> the device is removed and if so it will suppress the error so I don't
> think they are susceptible to this particular issue and I agree it is
> broken if they do. If that is the case we can and will fix it in firmware.
>
Confirmed: this issue does not apply to the referenced Dell servers, so I don't have a stake in how this should be handled for those systems. It may be that they just don't support surprise removal. I know in our case all the Linux distributions we qualify (RHEL, SLES, Ubuntu Server) have told us they do not support surprise removal, so I'm guessing that any issues found with surprise removal could potentially fall under the category of "unsupported".

Still, the larger issue of recovering from other types of PCIe errors that are not due to device removal is important. I would expect many systems from many platform makers to be unable to recover from PCIe errors in general, and hopefully the new DPC CER model will help address this and provide added protection for cases like the one above as well.

Thanks,
Austin