On 4/5/19 1:01 AM, Thomas Monjalon wrote:
Hi,

You forgot to Cc Andrew, co-maintainer of ethdev.

20/03/2019 05:54, Qi Zhang:
Device reset should be implemented asynchronously, since it may be
invoked from the interrupt thread and resetting a device sometimes
requires waiting on a dependency. For example, a VF waits for the PF
to be ready, or a NIC function that is part of an SoC waits for the
whole system reset to complete. These time-consuming tasks would block
the interrupt thread.
The patch renames rte_eth_dev_reset to rte_eth_dev_reset_async and
reworks the implementation. It spawns a new thread which calls
ops->dev_reset and, when finished, raises the event
RTE_ETH_EVENT_RESET_COMPLETE. The application should always wait for
this event before it continues to configure and restart the device.

Signed-off-by: Qi Zhang <qi.z.zh...@intel.com>
---
- * When this function is called, it first stops the port and then calls the
- * PMD specific dev_uninit( ) and dev_init( ) to return the port to initial
- * state, in which no Tx and Rx queues are setup, as if the port has been
- * reset and not started. The port keeps the port id it had before the
- * function call.
- *
- * After calling rte_eth_dev_reset( ), the application should use
- * rte_eth_dev_configure( ), rte_eth_rx_queue_setup( ),
- * rte_eth_tx_queue_setup( ), and rte_eth_dev_start( )
- * to reconfigure the device as appropriate.
- *
- * Note: To avoid unexpected behavior, the application should stop calling
- * Tx and Rx functions before calling rte_eth_dev_reset( ). For thread
- * safety, all these controlling functions should be called from the same
- * thread.
+ * @note
+ * Device reset may have dependencies: for example, a VF reset expects
+ * the PF to be ready, or a NIC function that is part of an SoC needs to
+ * wait for other parts of the system to be ready. These are time-consuming
+ * tasks and will block the current thread.
+ *
+ * As the name suggests, rte_eth_dev_reset_async is an async API. It spawns
+ * a new thread to call ops->dev_reset; once that finishes, it raises
+ * the RTE_ETH_EVENT_RESET_COMPLETE event to notify the application. That
+ * makes things easy for an application that wants to reset the device from
+ * the interrupt thread, since a RTE_ETH_EVENT_INTR_RESET handler is typically
+ * invoked in the interrupt thread. A typical implementation of ops->dev_reset
+ * performs the hardware reset by calling dev_uninit() and
+ * dev_init().
+ *
+ * The application should not assume that the device reset has finished when
+ * rte_eth_dev_reset_async returns; it should always wait for a
+ * RTE_ETH_EVENT_RESET_COMPLETE event and check the reset result.
+ * If the reset succeeded, the application should call rte_eth_dev_configure( ),
+ * rte_eth_rx_queue_setup( ), rte_eth_tx_queue_setup( ),
+ * and rte_eth_dev_start( ) to reconfigure the device as appropriate.
+ *
+ * @note
+ * To avoid unexpected behavior, the application should stop calling
+ * Tx and Rx functions before calling rte_eth_dev_reset_async( ).
   *
   * @param port_id
   *   The port identifier of the Ethernet device.
@@ -1880,12 +1892,10 @@ void rte_eth_dev_close(uint16_t port_id);
   *   - (0) if successful.
   *   - (-EINVAL) if port identifier is invalid.
   *   - (-ENOTSUP) if hardware doesn't support this function.
- *   - (-EPERM) if not ran from the primary process.
- *   - (-EIO) if re-initialisation failed or device is removed.
   *   - (-ENOMEM) if the reset failed due to OOM.
- *   - (-EAGAIN) if the reset temporarily failed and should be retried later.
+ *   - (<0) other errors from low level driver.
   */
-int rte_eth_dev_reset(uint16_t port_id);
+int rte_eth_dev_reset_async(uint16_t port_id);
Sorry I didn't check whether this API is better or not,
but I know it cannot be accepted before proposing a deprecation notice.
Perhaps you may keep the old API and just add the new one.

Honestly, I never really agreed with the purpose of the reset API.
So making it async or not, I have no real opinion...
... but spawning a new thread at each function call, I feel it is bad.

I agree with Thomas.

It looks very dangerous from a locking point of view when a PMD
callback is executed from a dynamically created thread.
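
For illustration only, here is a minimal sketch of the pattern under
discussion, written with plain pthreads rather than DPDK. Every name in it
(dev_reset_async, fake_dev_reset, reset_waiter, reset_and_wait, and so on) is
hypothetical and is not taken from the patch or from ethdev. It only shows
that the completion callback runs on the freshly spawned thread, so any state
it shares with the caller has to be protected by a lock.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical completion callback type; not a DPDK definition. */
typedef void (*reset_done_cb)(uint16_t port_id, int result, void *arg);

struct reset_job {
	uint16_t port_id;
	reset_done_cb cb;
	void *cb_arg;
};

/* Stand-in for the PMD's ops->dev_reset; in this sketch it always succeeds. */
static int fake_dev_reset(uint16_t port_id)
{
	(void)port_id;
	return 0;
}

/* Runs on the dynamically created thread: reset, then notify the caller. */
static void *reset_worker(void *p)
{
	struct reset_job *job = p;
	int ret = fake_dev_reset(job->port_id);

	job->cb(job->port_id, ret, job->cb_arg);
	free(job);
	return NULL;
}

/* Hypothetical async entry point: returns as soon as the thread is started. */
static int dev_reset_async(uint16_t port_id, reset_done_cb cb, void *arg)
{
	struct reset_job *job = malloc(sizeof(*job));
	pthread_t tid;
	int rc;

	if (job == NULL)
		return -ENOMEM;
	job->port_id = port_id;
	job->cb = cb;
	job->cb_arg = arg;

	rc = pthread_create(&tid, NULL, reset_worker, job);
	if (rc != 0) {
		free(job);
		return -rc;
	}
	pthread_detach(tid);
	return 0;
}

/* Application-side state shared between the caller and the worker thread. */
struct reset_waiter {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool done;
	int result;
};

/* Called from the worker thread, hence the mutex around the shared fields. */
static void on_reset_complete(uint16_t port_id, int result, void *arg)
{
	struct reset_waiter *w = arg;

	(void)port_id;
	assert(w != NULL);
	pthread_mutex_lock(&w->lock);
	w->done = true;
	w->result = result;
	pthread_cond_signal(&w->cond);
	pthread_mutex_unlock(&w->lock);
}

/* Start the reset, then block until the completion callback has fired. */
static int reset_and_wait(uint16_t port_id)
{
	struct reset_waiter w = {
		PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, false, 0
	};
	int rc = dev_reset_async(port_id, on_reset_complete, &w);

	if (rc != 0)
		return rc;
	pthread_mutex_lock(&w.lock);
	while (!w.done)
		pthread_cond_wait(&w.cond, &w.lock);
	pthread_mutex_unlock(&w.lock);
	return w.result;
}
```

In this sketch a caller would use reset_and_wait(port_id) and only then go on
to reconfigure the port. A real application would more likely continue from
the callback instead of blocking, and the same locking question then applies
to every piece of driver or application state that the callback touches.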
