On 10/12/20 12:29 AM, Thomas Monjalon wrote:
> 09/10/2020 05:48, Kalesh A P:
>> From: Kalesh AP <kalesh-anakkur.pura...@broadcom.com>
>>
>> Adding support for device reset and recovery events in the
>> rte_eth_event framework. FW error and FW reset conditions would be
>> managed internally by PMD without needing application intervention.
>> In such cases, PMD would need reset/recovery events to notify application
>> that PMD is undergoing a reset.
>>
>> Signed-off-by: Somnath Kotur <somnath.ko...@broadcom.com>
>> Signed-off-by: Kalesh AP <kalesh-anakkur.pura...@broadcom.com>
>> Reviewed-by: Ajit Khaparde <ajit.khapa...@broadcom.com>
>> Reviewed-by: Asaf Penso <as...@nvidia.com>
>
> The ethdev maintainers are not Cc'ed.
> Please use the option --cc-cmd devtools/get-maintainer.sh
>
>
>> +Error recovery support
>> +~~~~~~~~~~~~~~~~~~~~~~
>> +
>> +When the PMD detects a FW reset or error condition, it will try to recover
>> +from the error without needing the application intervention. In such cases,
>> +PMD would need events to notify the application that it is undergoing
>> +an error recovery.
>> +
>> +The PMD will trigger RTE_ETH_EVENT_ERR_RECOVERING event to notify the
>> +application that PMD detected a FW reset or FW error condition. PMD will
>> +try to recover from the error by itself. Data path will be halted and
>> +control path operations would fail during the recovery period.
>> +
>> +The PMD will trigger RTE_ETH_EVENT_RECOVERED event to notify the application
>> +that the it has recovered from the error condition. Control path and data path
>> +are up now. Since the device undergone a reset, flow rules offloaded prior to
>> +the reset will be lost and the application has to recreate the rules again.

What should be done if the state is not recoverable?

>> diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
>> index 9759f13..9b4b015 100644
>> --- a/lib/librte_ethdev/rte_ethdev.h
>> +++ b/lib/librte_ethdev/rte_ethdev.h
>> @@ -3207,6 +3207,23 @@ enum rte_eth_event_type {
>>          RTE_ETH_EVENT_DESTROY,  /**< port is released */
>>          RTE_ETH_EVENT_IPSEC,    /**< IPsec offload related event */
>>          RTE_ETH_EVENT_FLOW_AGED,/**< New aged-out flows is detected */
>> +        RTE_ETH_EVENT_ERR_RECOVERING,
>> +                /**< port recovering from an error
>> +                 *
>> +                 * PMD detected a FW reset or error condition.
>> +                 * PMD will try to recover from the error.
>> +                 * Data path will be halted and Control path operations
>> +                 * would fail at this time.
>> +                 */
>
> Does it mean the application has nothing to do when receiving this event?
> I think the app should stop polling at least.
>
>> +        RTE_ETH_EVENT_RECOVERED,
>> +                /**< port recovered from an error
>> +                 *
>> +                 * PMD has recovered from the error condition.
>> +                 * Control path and Data path are up now.
>> +                 * Since the device undergone a reset, flow rules
>> +                 * offloaded prior to the reset will be lost and
>> +                 * the application has to recreate the rules again.
>> +                 */
>
> Please be more precise.
> Should the app re-configure the port, setup the queues, start the port?
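
To make the questions above concrete, here is a rough sketch of what I would expect an application to do on these two events. It is only an illustration and makes several assumptions: the enum and the per-port state structure are local, hypothetical stand-ins (not taken from rte_ethdev.h), the handler only mimics the shape of an ethdev event callback (port_id, event, cb_arg, ret_param), and it assumes that port configuration and queue setup survive the reset so that only flow rules have to be recreated. The patch does not state the latter, which is exactly what needs to be documented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Local stand-ins for the two event values proposed in the patch.
 * Hypothetical names: a real application would use the values
 * from rte_ethdev.h instead.
 */
enum app_eth_event {
	APP_ETH_EVENT_ERR_RECOVERING,
	APP_ETH_EVENT_RECOVERED,
};

/* Hypothetical per-port state kept by the application. */
struct app_port_state {
	/* Data path loops check this before calling rx/tx burst.
	 * A real application would need atomic access here. */
	bool polling_enabled;
	/* Set when offloaded flow rules have to be recreated. */
	bool flows_need_replay;
};

/* Handler shaped like an ethdev event callback. */
static int
app_recovery_event_cb(uint16_t port_id, enum app_eth_event event,
		      void *cb_arg, void *ret_param)
{
	struct app_port_state *st = cb_arg;

	(void)port_id;
	(void)ret_param;
	assert(st != NULL);

	switch (event) {
	case APP_ETH_EVENT_ERR_RECOVERING:
		/* Data path is halted and control path calls would
		 * fail, so stop polling until recovery completes. */
		st->polling_enabled = false;
		break;
	case APP_ETH_EVENT_RECOVERED:
		/* Assumption: no re-configure, queue setup or port
		 * start is needed; only flow rules are lost. */
		st->flows_need_replay = true;
		st->polling_enabled = true;
		break;
	}

	return 0;
}
```

If the application is in fact expected to re-configure the port, set up the queues or start the port again after RTE_ETH_EVENT_RECOVERED, the second case above would look quite different. The sketch also has no path for a recovery that never completes, which is the unrecoverable case asked about above.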