On 1/28/2022 12:48 PM, Kalesh A P wrote:
From: Kalesh AP <kalesh-anakkur.pura...@broadcom.com>

Add support for device reset and recovery events in the
rte_eth_event framework. FW error and FW reset conditions are
managed internally by the PMD without needing application intervention.
In such cases, the PMD needs reset/recovery events to notify the application
that it is undergoing a reset.

While most of the recovery process is transparent to the application, since
the driver handles recovery from FW reset or FW error conditions,
the application will still have to reprogram any flows that were offloaded to
the underlying hardware.

Signed-off-by: Kalesh AP <kalesh-anakkur.pura...@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.ko...@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khapa...@broadcom.com>
---
  doc/guides/prog_guide/poll_mode_drv.rst | 24 ++++++++++++++++++++++++
  lib/ethdev/rte_ethdev.h                 | 18 ++++++++++++++++++
  2 files changed, 42 insertions(+)

diff --git a/doc/guides/prog_guide/poll_mode_drv.rst b/doc/guides/prog_guide/poll_mode_drv.rst
index 6831289..9ecc0e4 100644
--- a/doc/guides/prog_guide/poll_mode_drv.rst
+++ b/doc/guides/prog_guide/poll_mode_drv.rst
@@ -623,3 +623,27 @@ by application.
  The PMD itself should not call rte_eth_dev_reset(). The PMD can trigger
  the application to handle reset event. It is duty of application to
  handle all synchronization before it calls rte_eth_dev_reset().
+
+Error recovery support
+~~~~~~~~~~~~~~~~~~~~~~
+
+When the PMD detects a FW reset or error condition, it may try to recover
+from the error without needing application intervention. In such cases,
+the PMD needs events to notify the application that it is undergoing
+an error recovery.
+
+The PMD should trigger RTE_ETH_EVENT_ERR_RECOVERING event to notify the
+application that PMD detected a FW reset or FW error condition. PMD may
+try to recover from the error by itself. Data path may be quiesced and
+control path operations may fail during the recovery period. The application
+should stop polling till it receives RTE_ETH_EVENT_RECOVERED event from the PMD.
+
+The PMD should trigger RTE_ETH_EVENT_RECOVERED event to notify the application
+that it has recovered from the error condition. PMD re-configures the port
+to the state prior to the error condition. Control path and data path are up now.
+Since the device has undergone a reset, flow rules offloaded prior to reset
+may be lost and the application should recreate the rules again.
+
+The PMD should trigger RTE_ETH_EVENT_INTR_RMV event to notify the application
+that it has failed to recover from the error condition. The device may not be
+usable anymore.
diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index 147cc1c..a46819f 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -3818,6 +3818,24 @@ enum rte_eth_event_type {
        RTE_ETH_EVENT_DESTROY,  /**< port is released */
        RTE_ETH_EVENT_IPSEC,    /**< IPsec offload related event */
        RTE_ETH_EVENT_FLOW_AGED,/**< New aged-out flows is detected */
+       RTE_ETH_EVENT_ERR_RECOVERING,
+                       /**< port recovering from an error
+                        *
+                        * PMD detected a FW reset or error condition.
+                        * PMD will try to recover from the error.
+                        * Data path may be quiesced and Control path operations
+                        * may fail at this time.
+                        */
+       RTE_ETH_EVENT_RECOVERED,
+                       /**< port recovered from an error
+                        *
+                        * PMD has recovered from the error condition.
+                        * Control path and Data path are up now.
+                        * PMD re-configures the port to the state prior to the error.
+                        * Since the device has undergone a reset, flow rules
+                        * offloaded prior to reset may be lost and
+                        * the application should recreate the rules again.
+                        */
        RTE_ETH_EVENT_MAX       /**< max value of this enum */
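
For reference, a minimal sketch of how an application callback might react to the three outcomes described in the documentation hunk (recovering, recovered, failed). It is not part of the patch. To keep it compilable without DPDK headers, it uses a hypothetical stand-in enum (`sketch_eth_event`) and a hypothetical `port_state` struct; only the values 11 and 12 are taken from the ABI report in this mail, and the value of the removal event is arbitrary here. In a real application the callback would be registered with rte_eth_dev_callback_register() and would take `enum rte_eth_event_type`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the rte_eth_event_type values involved. */
enum sketch_eth_event {
    SKETCH_EVENT_INTR_RMV,             /* arbitrary value in this sketch */
    SKETCH_EVENT_ERR_RECOVERING = 11,  /* value from the ABI report */
    SKETCH_EVENT_RECOVERED = 12,       /* value from the ABI report */
};

/* Hypothetical per-port state kept by the application. */
struct port_state {
    bool polling;      /* data path is allowed to poll the port */
    bool flows_stale;  /* offloaded flow rules must be re-created */
    bool removed;      /* device is considered unusable */
};

/* Same parameter shape as rte_eth_dev_cb_fn: port, event, user arg, ret param. */
static int
recovery_event_cb(uint16_t port_id, int event, void *cb_arg, void *ret_param)
{
    struct port_state *st = cb_arg;

    (void)port_id;
    (void)ret_param;
    assert(st != NULL);

    switch (event) {
    case SKETCH_EVENT_ERR_RECOVERING:
        /* PMD is recovering: stop polling until RECOVERED arrives. */
        st->polling = false;
        break;
    case SKETCH_EVENT_RECOVERED:
        /* Port is back; flow rules may have been lost in the reset. */
        st->flows_stale = true;
        st->polling = true;
        break;
    case SKETCH_EVENT_INTR_RMV:
        /* Recovery failed: treat the device as gone. */
        st->polling = false;
        st->removed = true;
        break;
    default:
        break;
    }
    return 0;
}
```

The callback only flips flags; the assumption is that the data path thread checks `polling` before each burst and that flow re-creation happens outside the callback context once `flows_stale` is seen.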


Also, the ABI check complains about the 'RTE_ETH_EVENT_MAX' value; cc'ed more people
to evaluate whether it is a false positive:


1 function with some indirect sub-type change:
  [C] 'function int rte_eth_dev_callback_register(uint16_t, rte_eth_event_type, rte_eth_dev_cb_fn, void*)' at rte_ethdev.c:4637:1 has some indirect sub-type changes:
    parameter 3 of type 'typedef rte_eth_dev_cb_fn' has sub-type changes:
      underlying type 'int (typedef uint16_t, enum rte_eth_event_type, void*, void*)*' changed:
        in pointed to type 'function type int (typedef uint16_t, enum rte_eth_event_type, void*, void*)':
          parameter 2 of type 'enum rte_eth_event_type' has sub-type changes:
            type size hasn't changed
            2 enumerator insertions:
              'rte_eth_event_type::RTE_ETH_EVENT_ERR_RECOVERING' value '11'
              'rte_eth_event_type::RTE_ETH_EVENT_RECOVERED' value '12'
            1 enumerator change:
              'rte_eth_event_type::RTE_ETH_EVENT_MAX' from value '11' to '13' at rte_ethdev.h:3807:1
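
To make the concern concrete, here is a small sketch of the one pattern where the shifted MAX value could matter: code built against the old header that sizes a table by RTE_ETH_EVENT_MAX and then sees one of the new event values at run time. The enums and the `count_event` helper are hypothetical stand-ins, with values copied from the report above; whether any real user depends on MAX this way is exactly what needs evaluating.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in: the enum tail as seen by a binary built before the patch. */
enum old_event {
    OLD_EVENT_FLOW_AGED = 10,
    OLD_EVENT_MAX = 11,
};

/* Hypothetical stand-in: the enum tail after the patch (values from the report). */
enum new_event {
    NEW_EVENT_FLOW_AGED = 10,
    NEW_EVENT_ERR_RECOVERING = 11,
    NEW_EVENT_RECOVERED = 12,
    NEW_EVENT_MAX = 13,
};

/*
 * Count an event in a table sized by the OLD max.
 * Returns 0 when counted, -1 when the value is outside what this build knows.
 * Without the range check, values 11 and 12 would index past the table.
 */
static int
count_event(unsigned long *counts, int event)
{
    assert(counts != NULL);
    if (event < 0 || event >= OLD_EVENT_MAX)
        return -1;
    counts[event]++;
    return 0;
}
```

Code that bounds-checks against its own compile-time MAX, as above, just ignores the new events; code that indexes without checking is where the change would become visible.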
