On 3/12/2020 7:34 AM, Thomas Monjalon wrote:
> 12/03/2020 04:25, Kalesh Anakkur Purayil:
>> Hi Thomas,
>>
>> On Wed, Mar 11, 2020 at 6:49 PM Thomas Monjalon <tho...@monjalon.net> wrote:
>>
>>> 22/01/2020 11:16, Kalesh A P:
>>>> From: Kalesh AP <kalesh-anakkur.pura...@broadcom.com>
>>>>
>>>> This patch adds support for recovery event in rte_eth_event framework.
>>>> FW error and FW reset conditions would be managed by PMD. Driver uses
>>>
>>> "Driver"? THE driver? :)
>>>
>>>> RTE_ETH_EVENT_INTR_RESET event to notify the applications about the
>>>> FW reset or error.
>>>
>>> Which drivers do that?
>>>
>> [Kalesh]: Second patch in this series implements this behavior in bnxt PMD.
>> Error recovery is a new feature added in bnxt PMD in 19.11. This change is
>> needed to support error recovery functionality.
>>
>>>
>>>> In such cases, PMD would need recovery events to
>>>> notify application about PMD has recovered from FW reset or FW error.
>>>
>>> Sorry I don't understand. You said application is notified of any error.
>>> But the PMD can recover from this error? So what is the error at the end?
>>> If the error is recovered why notifying the application?
>>>
>> [Kalesh]: Let me give you some insight on this.
>>
>> The error recovery solution is a protocol implemented between firmware and
>> bnxt PMD to recover from the fatal errors without a system reboot. There is
>> an alarm thread which constantly monitors the health of the firmware and
>> initiates a recovery when needed.
>>
>> There are two scenarios here:
>>
>> 1. Hardware or firmware encountered an error which firmware detected.
>>    Firmware is in operational status here. In this case, firmware can reset
>>    the chip and notify the driver about the reset.
>> 2. Hardware or firmware encountered an error but firmware is dead/hung.
>>    Firmware is not in operational status. In this case, the only possible way
>>    to recover the adapter is through host driver (bnxt PMD).
>>
>> In both cases, bnxt PMD reinitializes with the FW again after the reset.
>> During that recovery process, data path will be halted and any control path
>> operation would fail. So, bnxt PMD has to notify the application about this
>> reset/error event to prevent any activities from application during this
>> time.
>
> I think you are changing the meaning of the reset event.
> It was described like this:
>     RTE_ETH_EVENT_INTR_RESET,
>             /**< reset interrupt event, sent to VF on PF reset */
>
> Please update this description as well.
>
> Of course, we'll need approval from other PMD maintainers
> to accept the new recovery API.
>

Hi Kalesh,

Is this RFC still relevant/valid?
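
For readers following the thread, the application-side behaviour described above (stop issuing control-path operations when the reset event arrives, resume once the PMD reports recovery) might look roughly like the sketch below. It is a standalone model, not DPDK code: `app_event`, `APP_EVENT_RESET`, `APP_EVENT_RECOVERED`, `app_port`, `app_on_event` and `app_configure` are all hypothetical names. `APP_EVENT_RESET` stands in for RTE_ETH_EVENT_INTR_RESET, and `APP_EVENT_RECOVERED` is an assumed placeholder for the proposed recovery event, whose final name is not given in this thread. In a real application the handler would be registered with rte_eth_dev_callback_register() and would use the ethdev callback signature instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the events discussed in the thread.
 * APP_EVENT_RESET models RTE_ETH_EVENT_INTR_RESET; APP_EVENT_RECOVERED
 * is an assumed name for the proposed recovery event. */
enum app_event {
    APP_EVENT_RESET,
    APP_EVENT_RECOVERED
};

/* Per-port state kept by the application (hypothetical). */
struct app_port {
    bool in_recovery;   /* true between the reset and recovery events */
    unsigned refused;   /* control-path requests refused meanwhile */
};

/* Event handler the application would hook up to the PMD notifications.
 * Returns 0 on success, -1 for an event it does not know. */
static int app_on_event(struct app_port *port, enum app_event ev)
{
    assert(port != NULL);

    switch (ev) {
    case APP_EVENT_RESET:
        /* FW reset or error: the PMD is reinitializing, so stop using the port. */
        port->in_recovery = true;
        return 0;
    case APP_EVENT_RECOVERED:
        /* The PMD has finished reinitializing with the FW: resume. */
        port->in_recovery = false;
        return 0;
    }
    return -1;
}

/* Example control-path operation, gated on the recovery state.
 * Returns 0 if the request may proceed, -1 if it is refused. */
static int app_configure(struct app_port *port)
{
    assert(port != NULL);

    if (port->in_recovery) {
        port->refused++;
        return -1;
    }
    /* A real application would call into the ethdev control path here. */
    return 0;
}
```

The point of the sketch is only that the application needs two notifications to bracket the window in which the port must be left alone. With the reset event alone it cannot tell when it is safe to resume.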