[PATCH v3] virtio_net: Fix missed rtnl_unlock
The rtnl_lock would stay locked if allocating promisc_allmulti failed.
Also changed the allocation to GFP_KERNEL.

Fixes: ff7c7d9f5261 ("virtio_net: Remove command data from control_buf")
Reported-by: Eric Dumazet
Link: https://lore.kernel.org/netdev/cann89ilazvaucvhpm6rpjj0owra_ofnx7fhc8d60gv-65ad...@mail.gmail.com/
Signed-off-by: Daniel Jurgens
---
v3:
 - Changed the promisc_allmulti allocation to GFP_KERNEL
v2:
 - Added fixes tag.
---
 drivers/net/virtio_net.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 19a9b50646c7..4e1a0fc0d555 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -2902,14 +2902,14 @@ static void virtnet_rx_mode_work(struct work_struct *work)
 	if (!virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_RX))
 		return;
 
-	rtnl_lock();
-
-	promisc_allmulti = kzalloc(sizeof(*promisc_allmulti), GFP_ATOMIC);
+	promisc_allmulti = kzalloc(sizeof(*promisc_allmulti), GFP_KERNEL);
 	if (!promisc_allmulti) {
 		dev_warn(&dev->dev, "Failed to set RX mode, no memory.\n");
 		return;
 	}
 
+	rtnl_lock();
+
 	*promisc_allmulti = !!(dev->flags & IFF_PROMISC);
 	sg_init_one(sg, promisc_allmulti, sizeof(*promisc_allmulti));
 
-- 
2.45.0
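A minimal sketch of the ordering this fix establishes, assuming a virtio_net-style RX-mode worker (names are illustrative, not the driver's code): the sleeping allocation happens before rtnl_lock(), so the allocation-failure path returns without holding the lock.

/* Sketch only, not driver code.  The point is that the GFP_KERNEL
 * allocation is done before rtnl_lock(), so the -ENOMEM path can
 * simply return with nothing locked.
 */
static void rx_mode_work_sketch(struct net_device *dev)
{
	u8 *promisc_allmulti;

	promisc_allmulti = kzalloc(sizeof(*promisc_allmulti), GFP_KERNEL);
	if (!promisc_allmulti) {
		dev_warn(&dev->dev, "Failed to set RX mode, no memory.\n");
		return;			/* nothing locked yet */
	}

	rtnl_lock();
	/* ... build SGs and send VIRTIO_NET_CTRL_RX commands ... */
	rtnl_unlock();

	kfree(promisc_allmulti);
}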
[PATCH v2] virtio_net: Fix missed rtnl_unlock
The rtnl_lock would stay locked if allocating promisc_allmulti failed.

Fixes: ff7c7d9f5261 ("virtio_net: Remove command data from control_buf")
Reported-by: Eric Dumazet
Link: https://lore.kernel.org/netdev/cann89ilazvaucvhpm6rpjj0owra_ofnx7fhc8d60gv-65ad...@mail.gmail.com/
Signed-off-by: Daniel Jurgens
---
 drivers/net/virtio_net.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 19a9b50646c7..e2b7488f375e 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -2902,14 +2902,14 @@ static void virtnet_rx_mode_work(struct work_struct *work)
 	if (!virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_RX))
 		return;
 
-	rtnl_lock();
-
 	promisc_allmulti = kzalloc(sizeof(*promisc_allmulti), GFP_ATOMIC);
 	if (!promisc_allmulti) {
 		dev_warn(&dev->dev, "Failed to set RX mode, no memory.\n");
 		return;
 	}
 
+	rtnl_lock();
+
 	*promisc_allmulti = !!(dev->flags & IFF_PROMISC);
 	sg_init_one(sg, promisc_allmulti, sizeof(*promisc_allmulti));
 
-- 
2.45.0
[PATCH] virtio_net: Fix missed rtnl_unlock
The rtnl_lock would stay locked if allocating promisc_allmulti failed.

Reported-by: Eric Dumazet
Link: https://lore.kernel.org/netdev/cann89ilazvaucvhpm6rpjj0owra_ofnx7fhc8d60gv-65ad...@mail.gmail.com/
Signed-off-by: Daniel Jurgens
---
 drivers/net/virtio_net.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 19a9b50646c7..e2b7488f375e 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -2902,14 +2902,14 @@ static void virtnet_rx_mode_work(struct work_struct *work)
 	if (!virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_RX))
 		return;
 
-	rtnl_lock();
-
 	promisc_allmulti = kzalloc(sizeof(*promisc_allmulti), GFP_ATOMIC);
 	if (!promisc_allmulti) {
 		dev_warn(&dev->dev, "Failed to set RX mode, no memory.\n");
 		return;
 	}
 
+	rtnl_lock();
+
 	*promisc_allmulti = !!(dev->flags & IFF_PROMISC);
 	sg_init_one(sg, promisc_allmulti, sizeof(*promisc_allmulti));
 
-- 
2.45.0
[PATCH net-next v2 2/2] virtio_net: Add TX stopped and wake counters
Add a tx queue stop and wake counters, they are useful for debugging. $ ./tools/net/ynl/cli.py --spec netlink/specs/netdev.yaml \ --dump qstats-get --json '{"scope": "queue"}' ... {'ifindex': 13, 'queue-id': 0, 'queue-type': 'tx', 'tx-bytes': 14756682850, 'tx-packets': 226465, 'tx-stop': 113208, 'tx-wake': 113208}, {'ifindex': 13, 'queue-id': 1, 'queue-type': 'tx', 'tx-bytes': 18167675008, 'tx-packets': 278660, 'tx-stop': 8632, 'tx-wake': 8632}] Signed-off-by: Daniel Jurgens Reviewed-by: Jiri Pirko --- drivers/net/virtio_net.c | 28 ++-- 1 file changed, 26 insertions(+), 2 deletions(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index 218a446c4c27..df6121c38a1b 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -95,6 +95,8 @@ struct virtnet_sq_stats { u64_stats_t xdp_tx_drops; u64_stats_t kicks; u64_stats_t tx_timeouts; + u64_stats_t stop; + u64_stats_t wake; }; struct virtnet_rq_stats { @@ -145,6 +147,8 @@ static const struct virtnet_stat_desc virtnet_rq_stats_desc[] = { static const struct virtnet_stat_desc virtnet_sq_stats_desc_qstat[] = { VIRTNET_SQ_STAT_QSTAT("packets", packets), VIRTNET_SQ_STAT_QSTAT("bytes", bytes), + VIRTNET_SQ_STAT_QSTAT("stop",stop), + VIRTNET_SQ_STAT_QSTAT("wake",wake), }; static const struct virtnet_stat_desc virtnet_rq_stats_desc_qstat[] = { @@ -1014,6 +1018,9 @@ static void check_sq_full_and_disable(struct virtnet_info *vi, */ if (sq->vq->num_free < 2+MAX_SKB_FRAGS) { netif_stop_subqueue(dev, qnum); + u64_stats_update_begin(&sq->stats.syncp); + u64_stats_inc(&sq->stats.stop); + u64_stats_update_end(&sq->stats.syncp); if (use_napi) { if (unlikely(!virtqueue_enable_cb_delayed(sq->vq))) virtqueue_napi_schedule(&sq->napi, sq->vq); @@ -1022,6 +1029,9 @@ static void check_sq_full_and_disable(struct virtnet_info *vi, free_old_xmit(sq, false); if (sq->vq->num_free >= 2+MAX_SKB_FRAGS) { netif_start_subqueue(dev, qnum); + u64_stats_update_begin(&sq->stats.syncp); + u64_stats_inc(&sq->stats.wake); + u64_stats_update_end(&sq->stats.syncp); virtqueue_disable_cb(sq->vq); } } @@ -2322,8 +2332,14 @@ static void virtnet_poll_cleantx(struct receive_queue *rq) free_old_xmit(sq, true); } while (unlikely(!virtqueue_enable_cb_delayed(sq->vq))); - if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS) + if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS) { + if (netif_tx_queue_stopped(txq)) { + u64_stats_update_begin(&sq->stats.syncp); + u64_stats_inc(&sq->stats.wake); + u64_stats_update_end(&sq->stats.syncp); + } netif_tx_wake_queue(txq); + } __netif_tx_unlock(txq); } @@ -2473,8 +2489,14 @@ static int virtnet_poll_tx(struct napi_struct *napi, int budget) virtqueue_disable_cb(sq->vq); free_old_xmit(sq, true); - if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS) + if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS) { + if (netif_tx_queue_stopped(txq)) { + u64_stats_update_begin(&sq->stats.syncp); + u64_stats_inc(&sq->stats.wake); + u64_stats_update_end(&sq->stats.syncp); + } netif_tx_wake_queue(txq); + } opaque = virtqueue_enable_cb_prepare(sq->vq); @@ -4790,6 +4812,8 @@ static void virtnet_get_base_stats(struct net_device *dev, tx->bytes = 0; tx->packets = 0; + tx->stop = 0; + tx->wake = 0; if (vi->device_stats_cap & VIRTIO_NET_STATS_TYPE_TX_BASIC) { tx->hw_drops = 0; -- 2.45.0
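The new counters are updated under the queue's u64_stats_sync, so a reader snapshots them with the usual fetch/retry loop. A small sketch, assuming the stats layout from this patch (the helper name is illustrative, not part of the series):

/* Sketch: snapshot the new per-queue counters under the same
 * u64_stats_sync the patch updates them under.
 */
static void virtnet_read_stop_wake_sketch(struct virtnet_sq_stats *stats,
					  u64 *stop, u64 *wake)
{
	unsigned int start;

	do {
		start = u64_stats_fetch_begin(&stats->syncp);
		*stop = u64_stats_read(&stats->stop);
		*wake = u64_stats_read(&stats->wake);
	} while (u64_stats_fetch_retry(&stats->syncp, start));
}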
[PATCH net-next v2 1/2] netdev: Add queue stats for TX stop and wake
TX queue stop and wake are counted by some drivers. Support reporting these via netdev-genl queue stats. Signed-off-by: Daniel Jurgens Reviewed-by: Jiri Pirko --- Documentation/netlink/specs/netdev.yaml | 14 ++ include/net/netdev_queues.h | 3 +++ include/uapi/linux/netdev.h | 2 ++ net/core/netdev-genl.c | 4 +++- tools/include/uapi/linux/netdev.h | 2 ++ 5 files changed, 24 insertions(+), 1 deletion(-) diff --git a/Documentation/netlink/specs/netdev.yaml b/Documentation/netlink/specs/netdev.yaml index 2be4b3714d17..11a32373365a 100644 --- a/Documentation/netlink/specs/netdev.yaml +++ b/Documentation/netlink/specs/netdev.yaml @@ -439,6 +439,20 @@ attribute-sets: Number of the packets dropped by the device due to the transmit packets bitrate exceeding the device rate limit. type: uint + - +name: tx-stop +doc: | + Number of times driver paused accepting new tx packets + from the stack to this queue, because the queue was full. + Note that if BQL is supported and enabled on the device + the networking stack will avoid queuing a lot of data at once. +type: uint + - +name: tx-wake +doc: | + Number of times driver re-started accepting send + requests to this queue from the stack. +type: uint operations: list: diff --git a/include/net/netdev_queues.h b/include/net/netdev_queues.h index e7b84f018cee..a8a7e48dfa6c 100644 --- a/include/net/netdev_queues.h +++ b/include/net/netdev_queues.h @@ -41,6 +41,9 @@ struct netdev_queue_stats_tx { u64 hw_gso_wire_bytes; u64 hw_drop_ratelimits; + + u64 stop; + u64 wake; }; /** diff --git a/include/uapi/linux/netdev.h b/include/uapi/linux/netdev.h index cf24f1d9adf8..a8188202413e 100644 --- a/include/uapi/linux/netdev.h +++ b/include/uapi/linux/netdev.h @@ -165,6 +165,8 @@ enum { NETDEV_A_QSTATS_TX_HW_GSO_WIRE_PACKETS, NETDEV_A_QSTATS_TX_HW_GSO_WIRE_BYTES, NETDEV_A_QSTATS_TX_HW_DROP_RATELIMITS, + NETDEV_A_QSTATS_TX_STOP, + NETDEV_A_QSTATS_TX_WAKE, __NETDEV_A_QSTATS_MAX, NETDEV_A_QSTATS_MAX = (__NETDEV_A_QSTATS_MAX - 1) diff --git a/net/core/netdev-genl.c b/net/core/netdev-genl.c index 4b5054087309..1f6ae6379e0f 100644 --- a/net/core/netdev-genl.c +++ b/net/core/netdev-genl.c @@ -517,7 +517,9 @@ netdev_nl_stats_write_tx(struct sk_buff *rsp, struct netdev_queue_stats_tx *tx) netdev_stat_put(rsp, NETDEV_A_QSTATS_TX_HW_GSO_BYTES, tx->hw_gso_bytes) || netdev_stat_put(rsp, NETDEV_A_QSTATS_TX_HW_GSO_WIRE_PACKETS, tx->hw_gso_wire_packets) || netdev_stat_put(rsp, NETDEV_A_QSTATS_TX_HW_GSO_WIRE_BYTES, tx->hw_gso_wire_bytes) || - netdev_stat_put(rsp, NETDEV_A_QSTATS_TX_HW_DROP_RATELIMITS, tx->hw_drop_ratelimits)) + netdev_stat_put(rsp, NETDEV_A_QSTATS_TX_HW_DROP_RATELIMITS, tx->hw_drop_ratelimits) || + netdev_stat_put(rsp, NETDEV_A_QSTATS_TX_STOP, tx->stop) || + netdev_stat_put(rsp, NETDEV_A_QSTATS_TX_WAKE, tx->wake)) return -EMSGSIZE; return 0; } diff --git a/tools/include/uapi/linux/netdev.h b/tools/include/uapi/linux/netdev.h index cf24f1d9adf8..a8188202413e 100644 --- a/tools/include/uapi/linux/netdev.h +++ b/tools/include/uapi/linux/netdev.h @@ -165,6 +165,8 @@ enum { NETDEV_A_QSTATS_TX_HW_GSO_WIRE_PACKETS, NETDEV_A_QSTATS_TX_HW_GSO_WIRE_BYTES, NETDEV_A_QSTATS_TX_HW_DROP_RATELIMITS, + NETDEV_A_QSTATS_TX_STOP, + NETDEV_A_QSTATS_TX_WAKE, __NETDEV_A_QSTATS_MAX, NETDEV_A_QSTATS_MAX = (__NETDEV_A_QSTATS_MAX - 1) -- 2.45.0
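For a driver that wants to expose these counters, the hookup goes through the netdev qstats callbacks this attribute set feeds. A hedged sketch follows; the foo_* names and per-queue counter layout are hypothetical, and only the tx->stop and tx->wake fields come from this patch.

/* Hedged sketch of a driver filling the new fields via the netdev
 * qstats callbacks.  The foo_* types and counters are hypothetical.
 */
struct foo_tx_ring {
	u64 bytes, packets, stop, wake;	/* hypothetical per-queue counters */
};

struct foo_priv {
	struct foo_tx_ring *tx_rings;	/* hypothetical driver private data */
};

static void foo_get_queue_stats_tx(struct net_device *dev, int idx,
				   struct netdev_queue_stats_tx *tx)
{
	struct foo_priv *priv = netdev_priv(dev);
	struct foo_tx_ring *ring = &priv->tx_rings[idx];

	tx->bytes   = ring->bytes;
	tx->packets = ring->packets;
	tx->stop    = ring->stop;	/* times the queue was stopped */
	tx->wake    = ring->wake;	/* times it was restarted */
}

static const struct netdev_stat_ops foo_stat_ops = {
	.get_queue_stats_tx = foo_get_queue_stats_tx,
	/* .get_queue_stats_rx and .get_base_stats omitted for brevity */
};

Registration would typically be dev->stat_ops = &foo_stat_ops; in the driver's probe path, after which the counters show up in the qstats-get dump as tx-stop and tx-wake.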
[PATCH net-next 0/2] Add TX stop/wake counters
Several drivers provide TX stop and wake counters via ethtool stats.
Add those to the netdev queue stats, and use them in virtio_net.

v2:
 - Fixed an accidental line deletion
 - Enhanced documentation

Daniel Jurgens (2):
  netdev: Add queue stats for TX stop and wake
  virtio_net: Add TX stopped and wake counters

 Documentation/netlink/specs/netdev.yaml | 14 +
 drivers/net/virtio_net.c                | 28 +++--
 include/net/netdev_queues.h             |  3 +++
 include/uapi/linux/netdev.h             |  2 ++
 net/core/netdev-genl.c                  |  4 +++-
 tools/include/uapi/linux/netdev.h       |  2 ++
 6 files changed, 50 insertions(+), 3 deletions(-)

-- 
2.45.0
[PATCH] virtio_net: Fix memory leak in virtnet_rx_mode_work
The pointer declaration was missing the __free(kfree).

Fixes: ff7c7d9f5261 ("virtio_net: Remove command data from control_buf")
Reported-by: Jens Axboe
Closes: https://lore.kernel.org/netdev/0674ca1b-020f-4f93-94d0-104964566...@kernel.dk/
Signed-off-by: Daniel Jurgens
---
 drivers/net/virtio_net.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index df6121c38a1b..42da535913ed 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -2884,7 +2884,6 @@ static int virtnet_set_queues(struct virtnet_info *vi, u16 queue_pairs)
 
 static int virtnet_close(struct net_device *dev)
 {
-	u8 *promisc_allmulti __free(kfree) = NULL;
 	struct virtnet_info *vi = netdev_priv(dev);
 	int i;
 
@@ -2905,11 +2904,11 @@ static void virtnet_rx_mode_work(struct work_struct *work)
 {
 	struct virtnet_info *vi =
 		container_of(work, struct virtnet_info, rx_mode_work);
+	u8 *promisc_allmulti __free(kfree) = NULL;
 	struct net_device *dev = vi->dev;
 	struct scatterlist sg[2];
 	struct virtio_net_ctrl_mac *mac_data;
 	struct netdev_hw_addr *ha;
-	u8 *promisc_allmulti;
 	int uc_count;
 	int mc_count;
 	void *buf;
-- 
2.45.0
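For context, __free(kfree) is the scope-based cleanup annotation from <linux/cleanup.h>: the annotated pointer is freed automatically when it goes out of scope, so dropping the annotation leaks the buffer on every invocation of the work item. A minimal sketch of the pattern (not driver code):

/* Minimal sketch of the cleanup pattern this fix restores.  With
 * __free(kfree), the buffer is released automatically on every return
 * path, including early errors.
 */
#include <linux/cleanup.h>
#include <linux/slab.h>

static int cleanup_example(void)
{
	u8 *buf __free(kfree) = NULL;

	buf = kzalloc(16, GFP_KERNEL);
	if (!buf)
		return -ENOMEM;

	/* ... use buf; no explicit kfree() needed ... */
	return 0;	/* buf is freed here automatically */
}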
[PATCH net-next 1/2] netdev: Add queue stats for TX stop and wake
TX queue stop and wake are counted by some drivers. Support reporting these via netdev-genl queue stats. Signed-off-by: Daniel Jurgens Reviewed-by: Jiri Pirko --- Documentation/netlink/specs/netdev.yaml | 10 ++ include/net/netdev_queues.h | 3 +++ include/uapi/linux/netdev.h | 2 ++ net/core/netdev-genl.c | 4 +++- tools/include/uapi/linux/netdev.h | 3 ++- 5 files changed, 20 insertions(+), 2 deletions(-) diff --git a/Documentation/netlink/specs/netdev.yaml b/Documentation/netlink/specs/netdev.yaml index 2be4b3714d17..c8b976d03330 100644 --- a/Documentation/netlink/specs/netdev.yaml +++ b/Documentation/netlink/specs/netdev.yaml @@ -439,6 +439,16 @@ attribute-sets: Number of the packets dropped by the device due to the transmit packets bitrate exceeding the device rate limit. type: uint + - +name: tx-stop +doc: | + Number of times the tx queue was stopped. +type: uint + - +name: tx-wake +doc: | + Number of times the tx queue was restarted. +type: uint operations: list: diff --git a/include/net/netdev_queues.h b/include/net/netdev_queues.h index e7b84f018cee..a8a7e48dfa6c 100644 --- a/include/net/netdev_queues.h +++ b/include/net/netdev_queues.h @@ -41,6 +41,9 @@ struct netdev_queue_stats_tx { u64 hw_gso_wire_bytes; u64 hw_drop_ratelimits; + + u64 stop; + u64 wake; }; /** diff --git a/include/uapi/linux/netdev.h b/include/uapi/linux/netdev.h index cf24f1d9adf8..a8188202413e 100644 --- a/include/uapi/linux/netdev.h +++ b/include/uapi/linux/netdev.h @@ -165,6 +165,8 @@ enum { NETDEV_A_QSTATS_TX_HW_GSO_WIRE_PACKETS, NETDEV_A_QSTATS_TX_HW_GSO_WIRE_BYTES, NETDEV_A_QSTATS_TX_HW_DROP_RATELIMITS, + NETDEV_A_QSTATS_TX_STOP, + NETDEV_A_QSTATS_TX_WAKE, __NETDEV_A_QSTATS_MAX, NETDEV_A_QSTATS_MAX = (__NETDEV_A_QSTATS_MAX - 1) diff --git a/net/core/netdev-genl.c b/net/core/netdev-genl.c index 4b5054087309..1f6ae6379e0f 100644 --- a/net/core/netdev-genl.c +++ b/net/core/netdev-genl.c @@ -517,7 +517,9 @@ netdev_nl_stats_write_tx(struct sk_buff *rsp, struct netdev_queue_stats_tx *tx) netdev_stat_put(rsp, NETDEV_A_QSTATS_TX_HW_GSO_BYTES, tx->hw_gso_bytes) || netdev_stat_put(rsp, NETDEV_A_QSTATS_TX_HW_GSO_WIRE_PACKETS, tx->hw_gso_wire_packets) || netdev_stat_put(rsp, NETDEV_A_QSTATS_TX_HW_GSO_WIRE_BYTES, tx->hw_gso_wire_bytes) || - netdev_stat_put(rsp, NETDEV_A_QSTATS_TX_HW_DROP_RATELIMITS, tx->hw_drop_ratelimits)) + netdev_stat_put(rsp, NETDEV_A_QSTATS_TX_HW_DROP_RATELIMITS, tx->hw_drop_ratelimits) || + netdev_stat_put(rsp, NETDEV_A_QSTATS_TX_STOP, tx->stop) || + netdev_stat_put(rsp, NETDEV_A_QSTATS_TX_WAKE, tx->wake)) return -EMSGSIZE; return 0; } diff --git a/tools/include/uapi/linux/netdev.h b/tools/include/uapi/linux/netdev.h index cf24f1d9adf8..ccf6976b1693 100644 --- a/tools/include/uapi/linux/netdev.h +++ b/tools/include/uapi/linux/netdev.h @@ -164,7 +164,8 @@ enum { NETDEV_A_QSTATS_TX_HW_GSO_BYTES, NETDEV_A_QSTATS_TX_HW_GSO_WIRE_PACKETS, NETDEV_A_QSTATS_TX_HW_GSO_WIRE_BYTES, - NETDEV_A_QSTATS_TX_HW_DROP_RATELIMITS, + NETDEV_A_QSTATS_TX_STOP, + NETDEV_A_QSTATS_TX_WAKE, __NETDEV_A_QSTATS_MAX, NETDEV_A_QSTATS_MAX = (__NETDEV_A_QSTATS_MAX - 1) -- 2.44.0
[PATCH net-next 2/2] virtio_net: Add TX stopped and wake counters
Add a tx queue stop and wake counters, they are useful for debugging. $ ./tools/net/ynl/cli.py --spec netlink/specs/netdev.yaml \ --dump qstats-get --json '{"scope": "queue"}' ... {'ifindex': 13, 'queue-id': 0, 'queue-type': 'tx', 'tx-bytes': 14756682850, 'tx-packets': 226465, 'tx-stop': 113208, 'tx-wake': 113208}, {'ifindex': 13, 'queue-id': 1, 'queue-type': 'tx', 'tx-bytes': 18167675008, 'tx-packets': 278660, 'tx-stop': 8632, 'tx-wake': 8632}] Signed-off-by: Daniel Jurgens Reviewed-by: Jiri Pirko --- drivers/net/virtio_net.c | 28 ++-- 1 file changed, 26 insertions(+), 2 deletions(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index 218a446c4c27..df6121c38a1b 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -95,6 +95,8 @@ struct virtnet_sq_stats { u64_stats_t xdp_tx_drops; u64_stats_t kicks; u64_stats_t tx_timeouts; + u64_stats_t stop; + u64_stats_t wake; }; struct virtnet_rq_stats { @@ -145,6 +147,8 @@ static const struct virtnet_stat_desc virtnet_rq_stats_desc[] = { static const struct virtnet_stat_desc virtnet_sq_stats_desc_qstat[] = { VIRTNET_SQ_STAT_QSTAT("packets", packets), VIRTNET_SQ_STAT_QSTAT("bytes", bytes), + VIRTNET_SQ_STAT_QSTAT("stop",stop), + VIRTNET_SQ_STAT_QSTAT("wake",wake), }; static const struct virtnet_stat_desc virtnet_rq_stats_desc_qstat[] = { @@ -1014,6 +1018,9 @@ static void check_sq_full_and_disable(struct virtnet_info *vi, */ if (sq->vq->num_free < 2+MAX_SKB_FRAGS) { netif_stop_subqueue(dev, qnum); + u64_stats_update_begin(&sq->stats.syncp); + u64_stats_inc(&sq->stats.stop); + u64_stats_update_end(&sq->stats.syncp); if (use_napi) { if (unlikely(!virtqueue_enable_cb_delayed(sq->vq))) virtqueue_napi_schedule(&sq->napi, sq->vq); @@ -1022,6 +1029,9 @@ static void check_sq_full_and_disable(struct virtnet_info *vi, free_old_xmit(sq, false); if (sq->vq->num_free >= 2+MAX_SKB_FRAGS) { netif_start_subqueue(dev, qnum); + u64_stats_update_begin(&sq->stats.syncp); + u64_stats_inc(&sq->stats.wake); + u64_stats_update_end(&sq->stats.syncp); virtqueue_disable_cb(sq->vq); } } @@ -2322,8 +2332,14 @@ static void virtnet_poll_cleantx(struct receive_queue *rq) free_old_xmit(sq, true); } while (unlikely(!virtqueue_enable_cb_delayed(sq->vq))); - if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS) + if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS) { + if (netif_tx_queue_stopped(txq)) { + u64_stats_update_begin(&sq->stats.syncp); + u64_stats_inc(&sq->stats.wake); + u64_stats_update_end(&sq->stats.syncp); + } netif_tx_wake_queue(txq); + } __netif_tx_unlock(txq); } @@ -2473,8 +2489,14 @@ static int virtnet_poll_tx(struct napi_struct *napi, int budget) virtqueue_disable_cb(sq->vq); free_old_xmit(sq, true); - if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS) + if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS) { + if (netif_tx_queue_stopped(txq)) { + u64_stats_update_begin(&sq->stats.syncp); + u64_stats_inc(&sq->stats.wake); + u64_stats_update_end(&sq->stats.syncp); + } netif_tx_wake_queue(txq); + } opaque = virtqueue_enable_cb_prepare(sq->vq); @@ -4790,6 +4812,8 @@ static void virtnet_get_base_stats(struct net_device *dev, tx->bytes = 0; tx->packets = 0; + tx->stop = 0; + tx->wake = 0; if (vi->device_stats_cap & VIRTIO_NET_STATS_TYPE_TX_BASIC) { tx->hw_drops = 0; -- 2.44.0
[PATCH net-next 0/2] Add TX stop/wake counters
Several drivers provide TX stop and wake counters via ethtool stats.
Add those to the netdev queue stats, and use them in virtio_net.

Daniel Jurgens (2):
  netdev: Add queue stats for TX stop and wake
  virtio_net: Add TX stopped and wake counters

 Documentation/netlink/specs/netdev.yaml | 10 +
 drivers/net/virtio_net.c                | 28 +++--
 include/net/netdev_queues.h             |  3 +++
 include/uapi/linux/netdev.h             |  2 ++
 net/core/netdev-genl.c                  |  4 +++-
 tools/include/uapi/linux/netdev.h       |  3 ++-
 6 files changed, 46 insertions(+), 4 deletions(-)

-- 
2.44.0
[PATCH net-next v6 6/6] virtio_net: Remove rtnl lock protection of command buffers
The rtnl lock is no longer needed to protect the control buffer and command VQ. Signed-off-by: Daniel Jurgens Reviewed-by: Jiri Pirko --- drivers/net/virtio_net.c | 24 1 file changed, 4 insertions(+), 20 deletions(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index a7cbfa7f5311..218a446c4c27 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -2824,14 +2824,12 @@ static void virtnet_stats(struct net_device *dev, static void virtnet_ack_link_announce(struct virtnet_info *vi) { - rtnl_lock(); if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_ANNOUNCE, VIRTIO_NET_CTRL_ANNOUNCE_ACK, NULL)) dev_warn(&vi->dev->dev, "Failed to ack link announce.\n"); - rtnl_unlock(); } -static int _virtnet_set_queues(struct virtnet_info *vi, u16 queue_pairs) +static int virtnet_set_queues(struct virtnet_info *vi, u16 queue_pairs) { struct virtio_net_ctrl_mq *mq __free(kfree) = NULL; struct scatterlist sg; @@ -2862,16 +2860,6 @@ static int _virtnet_set_queues(struct virtnet_info *vi, u16 queue_pairs) return 0; } -static int virtnet_set_queues(struct virtnet_info *vi, u16 queue_pairs) -{ - int err; - - rtnl_lock(); - err = _virtnet_set_queues(vi, queue_pairs); - rtnl_unlock(); - return err; -} - static int virtnet_close(struct net_device *dev) { u8 *promisc_allmulti __free(kfree) = NULL; @@ -3477,7 +3465,7 @@ static int virtnet_set_channels(struct net_device *dev, return -EINVAL; cpus_read_lock(); - err = _virtnet_set_queues(vi, queue_pairs); + err = virtnet_set_queues(vi, queue_pairs); if (err) { cpus_read_unlock(); goto err; @@ -4409,9 +4397,6 @@ static void virtnet_rx_dim_work(struct work_struct *work) struct dim_cq_moder update_moder; int qnum, err; - if (!rtnl_trylock()) - return; - qnum = rq - vi->rq; mutex_lock(&rq->dim_lock); @@ -4431,7 +4416,6 @@ static void virtnet_rx_dim_work(struct work_struct *work) } out: mutex_unlock(&rq->dim_lock); - rtnl_unlock(); } static int virtnet_coal_params_supported(struct ethtool_coalesce *ec) @@ -4989,7 +4973,7 @@ static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog, synchronize_net(); } - err = _virtnet_set_queues(vi, curr_qp + xdp_qp); + err = virtnet_set_queues(vi, curr_qp + xdp_qp); if (err) goto err; netif_set_real_num_rx_queues(dev, curr_qp + xdp_qp); @@ -5855,7 +5839,7 @@ static int virtnet_probe(struct virtio_device *vdev) virtio_device_ready(vdev); - _virtnet_set_queues(vi, vi->curr_queue_pairs); + virtnet_set_queues(vi, vi->curr_queue_pairs); /* a random MAC address has been assigned, notify the device. * We don't fail probe if VIRTIO_NET_F_CTRL_MAC_ADDR is not there -- 2.44.0
[PATCH net-next v6 5/6] virtio_net: Add a lock for per queue RX coalesce
Once the RTNL locking around the control buffer is removed there can be contention on the per queue RX interrupt coalescing data. Use a mutex per queue. A mutex is required because virtnet_send_command can sleep. Signed-off-by: Daniel Jurgens --- drivers/net/virtio_net.c | 53 +++- 1 file changed, 41 insertions(+), 12 deletions(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index 386ded936bf1..a7cbfa7f5311 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -312,6 +312,9 @@ struct receive_queue { /* Is dynamic interrupt moderation enabled? */ bool dim_enabled; + /* Used to protect dim_enabled and inter_coal */ + struct mutex dim_lock; + /* Dynamic Interrupt Moderation */ struct dim dim; @@ -2365,6 +2368,10 @@ static int virtnet_poll(struct napi_struct *napi, int budget) /* Out of packets? */ if (received < budget) { napi_complete = virtqueue_napi_complete(napi, rq->vq, received); + /* Intentionally not taking dim_lock here. This may result in a +* spurious net_dim call. But if that happens virtnet_rx_dim_work +* will not act on the scheduled work. +*/ if (napi_complete && rq->dim_enabled) virtnet_rx_dim_update(vi, rq); } @@ -3247,9 +3254,11 @@ static int virtnet_set_ringparam(struct net_device *dev, return err; /* The reason is same as the transmit virtqueue reset */ + mutex_lock(&vi->rq[i].dim_lock); err = virtnet_send_rx_ctrl_coal_vq_cmd(vi, i, vi->intr_coal_rx.max_usecs, vi->intr_coal_rx.max_packets); + mutex_unlock(&vi->rq[i].dim_lock); if (err) return err; } @@ -4255,6 +4264,7 @@ static int virtnet_send_rx_notf_coal_cmds(struct virtnet_info *vi, struct virtio_net_ctrl_coal_rx *coal_rx __free(kfree) = NULL; bool rx_ctrl_dim_on = !!ec->use_adaptive_rx_coalesce; struct scatterlist sgs_rx; + int ret = 0; int i; if (rx_ctrl_dim_on && !virtio_has_feature(vi->vdev, VIRTIO_NET_F_VQ_NOTF_COAL)) @@ -4264,16 +4274,22 @@ static int virtnet_send_rx_notf_coal_cmds(struct virtnet_info *vi, ec->rx_max_coalesced_frames != vi->intr_coal_rx.max_packets)) return -EINVAL; + /* Acquire all queues dim_locks */ + for (i = 0; i < vi->max_queue_pairs; i++) + mutex_lock(&vi->rq[i].dim_lock); + if (rx_ctrl_dim_on && !vi->rx_dim_enabled) { vi->rx_dim_enabled = true; for (i = 0; i < vi->max_queue_pairs; i++) vi->rq[i].dim_enabled = true; - return 0; + goto unlock; } coal_rx = kzalloc(sizeof(*coal_rx), GFP_KERNEL); - if (!coal_rx) - return -ENOMEM; + if (!coal_rx) { + ret = -ENOMEM; + goto unlock; + } if (!rx_ctrl_dim_on && vi->rx_dim_enabled) { vi->rx_dim_enabled = false; @@ -4291,8 +4307,10 @@ static int virtnet_send_rx_notf_coal_cmds(struct virtnet_info *vi, if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_NOTF_COAL, VIRTIO_NET_CTRL_NOTF_COAL_RX_SET, - &sgs_rx)) - return -EINVAL; + &sgs_rx)) { + ret = -EINVAL; + goto unlock; + } vi->intr_coal_rx.max_usecs = ec->rx_coalesce_usecs; vi->intr_coal_rx.max_packets = ec->rx_max_coalesced_frames; @@ -4300,8 +4318,11 @@ static int virtnet_send_rx_notf_coal_cmds(struct virtnet_info *vi, vi->rq[i].intr_coal.max_usecs = ec->rx_coalesce_usecs; vi->rq[i].intr_coal.max_packets = ec->rx_max_coalesced_frames; } +unlock: + for (i = vi->max_queue_pairs - 1; i >= 0; i--) + mutex_unlock(&vi->rq[i].dim_lock); - return 0; + return ret; } static int virtnet_send_notf_coal_cmds(struct virtnet_info *vi, @@ -4325,19 +4346,24 @@ static int virtnet_send_rx_notf_coal_vq_cmds(struct virtnet_info *vi, u16 queue) { bool rx_ctrl_dim_on = !!ec->use_adaptive_rx_coalesce; - bool cur_rx_dim = vi->rq[queue].dim_enabled; u32 max_usecs, max_packets; + bool 
cur_rx_dim; int err; + mutex_lock(&vi->rq[queue].dim_lo
[PATCH net-next v6 4/6] virtio_net: Do DIM update for specified queue only
Since we no longer have to hold the RTNL lock here just do updates for the specified queue. Signed-off-by: Daniel Jurgens --- drivers/net/virtio_net.c | 40 +++- 1 file changed, 15 insertions(+), 25 deletions(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index d7bad74a395f..386ded936bf1 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -4383,38 +4383,28 @@ static void virtnet_rx_dim_work(struct work_struct *work) struct virtnet_info *vi = rq->vq->vdev->priv; struct net_device *dev = vi->dev; struct dim_cq_moder update_moder; - int i, qnum, err; + int qnum, err; if (!rtnl_trylock()) return; - /* Each rxq's work is queued by "net_dim()->schedule_work()" -* in response to NAPI traffic changes. Note that dim->profile_ix -* for each rxq is updated prior to the queuing action. -* So we only need to traverse and update profiles for all rxqs -* in the work which is holding rtnl_lock. -*/ - for (i = 0; i < vi->curr_queue_pairs; i++) { - rq = &vi->rq[i]; - dim = &rq->dim; - qnum = rq - vi->rq; + qnum = rq - vi->rq; - if (!rq->dim_enabled) - continue; + if (!rq->dim_enabled) + goto out; - update_moder = net_dim_get_rx_moderation(dim->mode, dim->profile_ix); - if (update_moder.usec != rq->intr_coal.max_usecs || - update_moder.pkts != rq->intr_coal.max_packets) { - err = virtnet_send_rx_ctrl_coal_vq_cmd(vi, qnum, - update_moder.usec, - update_moder.pkts); - if (err) - pr_debug("%s: Failed to send dim parameters on rxq%d\n", -dev->name, qnum); - dim->state = DIM_START_MEASURE; - } + update_moder = net_dim_get_rx_moderation(dim->mode, dim->profile_ix); + if (update_moder.usec != rq->intr_coal.max_usecs || + update_moder.pkts != rq->intr_coal.max_packets) { + err = virtnet_send_rx_ctrl_coal_vq_cmd(vi, qnum, + update_moder.usec, + update_moder.pkts); + if (err) + pr_debug("%s: Failed to send dim parameters on rxq%d\n", +dev->name, qnum); + dim->state = DIM_START_MEASURE; } - +out: rtnl_unlock(); } -- 2.44.0
[PATCH net-next v6 3/6] virtio_net: Add a lock for the command VQ.
The command VQ will no longer be protected by the RTNL lock. Use a mutex to protect the control buffer header and the VQ. Signed-off-by: Daniel Jurgens Reviewed-by: Jiri Pirko --- drivers/net/virtio_net.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index 451879d570a8..d7bad74a395f 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -411,6 +411,9 @@ struct virtnet_info { /* Has control virtqueue */ bool has_cvq; + /* Lock to protect the control VQ */ + struct mutex cvq_lock; + /* Host can handle any s/g split between our header and packet data */ bool any_header_sg; @@ -2675,6 +2678,7 @@ static bool virtnet_send_command_reply(struct virtnet_info *vi, u8 class, u8 cmd /* Caller should know better */ BUG_ON(!virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_VQ)); + mutex_lock(&vi->cvq_lock); vi->ctrl->status = ~0; vi->ctrl->hdr.class = class; vi->ctrl->hdr.cmd = cmd; @@ -2697,11 +2701,12 @@ static bool virtnet_send_command_reply(struct virtnet_info *vi, u8 class, u8 cmd if (ret < 0) { dev_warn(&vi->vdev->dev, "Failed to add sgs for command vq: %d\n.", ret); + mutex_unlock(&vi->cvq_lock); return false; } if (unlikely(!virtqueue_kick(vi->cvq))) - return vi->ctrl->status == VIRTIO_NET_OK; + goto unlock; /* Spin for a response, the kick causes an ioport write, trapping * into the hypervisor, so the request should be handled immediately. @@ -2712,6 +2717,8 @@ static bool virtnet_send_command_reply(struct virtnet_info *vi, u8 class, u8 cmd cpu_relax(); } +unlock: + mutex_unlock(&vi->cvq_lock); return vi->ctrl->status == VIRTIO_NET_OK; } @@ -5736,6 +5743,8 @@ static int virtnet_probe(struct virtio_device *vdev) if (virtio_has_feature(vdev, VIRTIO_NET_F_CTRL_VQ)) vi->has_cvq = true; + mutex_init(&vi->cvq_lock); + if (virtio_has_feature(vdev, VIRTIO_NET_F_MTU)) { mtu = virtio_cread16(vdev, offsetof(struct virtio_net_config, -- 2.44.0
[PATCH net-next v6 0/6] Remove RTNL lock protection of CVQ
Currently the buffer used for control VQ commands is protected by the
RTNL lock. Previously this wasn't a major concern because the control
VQ was only used during device setup and user interaction. With the
recent addition of dynamic interrupt moderation the control VQ may be
used frequently during normal operation.

This series removes the RTNL lock dependency by introducing a mutex to
protect the control buffer and the writing of SGs to the control VQ.

v6:
 - Rebased over new stats code.
 - Added comment to cvq_lock, init the mutex unconditionally, and
   replaced some duplicate code with a goto.
 - Fixed minor grammar errors, checkpatch warnings, and clarified a
   comment.
v5:
 - Changed cvq_lock to a mutex.
 - Changed dim_lock to a mutex, because it's held while taking the
   cvq_lock.
 - Use spin/mutex_lock/unlock vs guard macros.
v4:
 - Protect dim_enabled with the same lock as intr_coal.
 - Rename intr_coal_lock to dim_lock.
 - Remove some scoped_guard where the error path doesn't have to be in
   the lock.
v3:
 - Changed type of _offloads to __virtio16 to fix static analysis
   warning.
 - Moved a misplaced hunk to the correct patch.
v2:
 - New patch to only process the provided queue in virtnet_dim_work
 - New patch to lock per queue rx coalescing structure.

Daniel Jurgens (6):
  virtio_net: Store RSS setting in virtnet_info
  virtio_net: Remove command data from control_buf
  virtio_net: Add a lock for the command VQ.
  virtio_net: Do DIM update for specified queue only
  virtio_net: Add a lock for per queue RX coalesce
  virtio_net: Remove rtnl lock protection of command buffers

 drivers/net/virtio_net.c | 288 +++
 1 file changed, 173 insertions(+), 115 deletions(-)

-- 
2.44.0
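The changelog above implies a lock ordering: the per-queue dim_lock is taken first, and cvq_lock is taken inside virtnet_send_command(); both must be mutexes because issuing a control-VQ command can sleep. A sketch of a caller under that ordering, using virtnet_send_rx_ctrl_coal_vq_cmd() and the locks from this series (the wrapper name itself is illustrative):

/* Sketch of the nesting the series establishes: dim_lock (outer,
 * per RX queue) -> cvq_lock (inner, inside virtnet_send_command()).
 */
static int set_one_queue_coal_sketch(struct virtnet_info *vi, u16 qnum,
				     u32 usecs, u32 packets)
{
	int err;

	mutex_lock(&vi->rq[qnum].dim_lock);
	/* virtnet_send_rx_ctrl_coal_vq_cmd() -> virtnet_send_command(),
	 * which takes vi->cvq_lock while the command is in flight.
	 */
	err = virtnet_send_rx_ctrl_coal_vq_cmd(vi, qnum, usecs, packets);
	mutex_unlock(&vi->rq[qnum].dim_lock);

	return err;
}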
[PATCH net-next v6 2/6] virtio_net: Remove command data from control_buf
Allocate memory for the data when it's used. Ideally the struct could be on the stack, but we can't DMA stack memory. With this change only the header and status memory are shared between commands, which will allow using a tighter lock than RTNL. Signed-off-by: Daniel Jurgens Reviewed-by: Jiri Pirko --- drivers/net/virtio_net.c | 124 +++ 1 file changed, 85 insertions(+), 39 deletions(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index 9cf93a8a4446..451879d570a8 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -368,15 +368,6 @@ struct virtio_net_ctrl_rss { struct control_buf { struct virtio_net_ctrl_hdr hdr; virtio_net_ctrl_ack status; - struct virtio_net_ctrl_mq mq; - u8 promisc; - u8 allmulti; - __virtio16 vid; - __virtio64 offloads; - struct virtio_net_ctrl_coal_tx coal_tx; - struct virtio_net_ctrl_coal_rx coal_rx; - struct virtio_net_ctrl_coal_vq coal_vq; - struct virtio_net_stats_capabilities stats_cap; }; struct virtnet_info { @@ -2828,14 +2819,19 @@ static void virtnet_ack_link_announce(struct virtnet_info *vi) static int _virtnet_set_queues(struct virtnet_info *vi, u16 queue_pairs) { + struct virtio_net_ctrl_mq *mq __free(kfree) = NULL; struct scatterlist sg; struct net_device *dev = vi->dev; if (!vi->has_cvq || !virtio_has_feature(vi->vdev, VIRTIO_NET_F_MQ)) return 0; - vi->ctrl->mq.virtqueue_pairs = cpu_to_virtio16(vi->vdev, queue_pairs); - sg_init_one(&sg, &vi->ctrl->mq, sizeof(vi->ctrl->mq)); + mq = kzalloc(sizeof(*mq), GFP_KERNEL); + if (!mq) + return -ENOMEM; + + mq->virtqueue_pairs = cpu_to_virtio16(vi->vdev, queue_pairs); + sg_init_one(&sg, mq, sizeof(*mq)); if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_MQ, VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET, &sg)) { @@ -2864,6 +2860,7 @@ static int virtnet_set_queues(struct virtnet_info *vi, u16 queue_pairs) static int virtnet_close(struct net_device *dev) { + u8 *promisc_allmulti __free(kfree) = NULL; struct virtnet_info *vi = netdev_priv(dev); int i; @@ -2888,6 +2885,7 @@ static void virtnet_rx_mode_work(struct work_struct *work) struct scatterlist sg[2]; struct virtio_net_ctrl_mac *mac_data; struct netdev_hw_addr *ha; + u8 *promisc_allmulti; int uc_count; int mc_count; void *buf; @@ -2899,22 +2897,27 @@ static void virtnet_rx_mode_work(struct work_struct *work) rtnl_lock(); - vi->ctrl->promisc = ((dev->flags & IFF_PROMISC) != 0); - vi->ctrl->allmulti = ((dev->flags & IFF_ALLMULTI) != 0); + promisc_allmulti = kzalloc(sizeof(*promisc_allmulti), GFP_ATOMIC); + if (!promisc_allmulti) { + dev_warn(&dev->dev, "Failed to set RX mode, no memory.\n"); + return; + } - sg_init_one(sg, &vi->ctrl->promisc, sizeof(vi->ctrl->promisc)); + *promisc_allmulti = !!(dev->flags & IFF_PROMISC); + sg_init_one(sg, promisc_allmulti, sizeof(*promisc_allmulti)); if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_RX, VIRTIO_NET_CTRL_RX_PROMISC, sg)) dev_warn(&dev->dev, "Failed to %sable promisc mode.\n", -vi->ctrl->promisc ? "en" : "dis"); +*promisc_allmulti ? "en" : "dis"); - sg_init_one(sg, &vi->ctrl->allmulti, sizeof(vi->ctrl->allmulti)); + *promisc_allmulti = !!(dev->flags & IFF_ALLMULTI); + sg_init_one(sg, promisc_allmulti, sizeof(*promisc_allmulti)); if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_RX, VIRTIO_NET_CTRL_RX_ALLMULTI, sg)) dev_warn(&dev->dev, "Failed to %sable allmulti mode.\n", -vi->ctrl->allmulti ? "en" : "dis"); +*promisc_allmulti ? 
"en" : "dis"); netif_addr_lock_bh(dev); @@ -2975,10 +2978,15 @@ static int virtnet_vlan_rx_add_vid(struct net_device *dev, __be16 proto, u16 vid) { struct virtnet_info *vi = netdev_priv(dev); + __virtio16 *_vid __free(kfree) = NULL; struct scatterlist sg; - vi->ctrl->vid = cpu_to_virtio16(vi->vdev, vid); - sg_init_one(&sg, &vi->ctrl->vid, sizeof(vi->ctrl->vid)); + _vid = kzalloc(sizeof(*_vid), GFP_KERNEL); + if (!_vid) + return -ENOMEM; + + *_vid = cpu_to_virtio16(vi->vdev, vid); + sg_init_one(&sg
[PATCH net-next v6 1/6] virtio_net: Store RSS setting in virtnet_info
Stop storing RSS setting in the control buffer. This is prep work for removing RTNL lock protection of the control buffer. Signed-off-by: Daniel Jurgens Reviewed-by: Jiri Pirko --- drivers/net/virtio_net.c | 40 1 file changed, 20 insertions(+), 20 deletions(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index 1fa84790041b..9cf93a8a4446 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -373,7 +373,6 @@ struct control_buf { u8 allmulti; __virtio16 vid; __virtio64 offloads; - struct virtio_net_ctrl_rss rss; struct virtio_net_ctrl_coal_tx coal_tx; struct virtio_net_ctrl_coal_rx coal_rx; struct virtio_net_ctrl_coal_vq coal_vq; @@ -416,6 +415,7 @@ struct virtnet_info { u16 rss_indir_table_size; u32 rss_hash_types_supported; u32 rss_hash_types_saved; + struct virtio_net_ctrl_rss rss; /* Has control virtqueue */ bool has_cvq; @@ -3243,17 +3243,17 @@ static bool virtnet_commit_rss_command(struct virtnet_info *vi) sg_init_table(sgs, 4); sg_buf_size = offsetof(struct virtio_net_ctrl_rss, indirection_table); - sg_set_buf(&sgs[0], &vi->ctrl->rss, sg_buf_size); + sg_set_buf(&sgs[0], &vi->rss, sg_buf_size); - sg_buf_size = sizeof(uint16_t) * (vi->ctrl->rss.indirection_table_mask + 1); - sg_set_buf(&sgs[1], vi->ctrl->rss.indirection_table, sg_buf_size); + sg_buf_size = sizeof(uint16_t) * (vi->rss.indirection_table_mask + 1); + sg_set_buf(&sgs[1], vi->rss.indirection_table, sg_buf_size); sg_buf_size = offsetof(struct virtio_net_ctrl_rss, key) - offsetof(struct virtio_net_ctrl_rss, max_tx_vq); - sg_set_buf(&sgs[2], &vi->ctrl->rss.max_tx_vq, sg_buf_size); + sg_set_buf(&sgs[2], &vi->rss.max_tx_vq, sg_buf_size); sg_buf_size = vi->rss_key_size; - sg_set_buf(&sgs[3], vi->ctrl->rss.key, sg_buf_size); + sg_set_buf(&sgs[3], vi->rss.key, sg_buf_size); if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_MQ, vi->has_rss ? VIRTIO_NET_CTRL_MQ_RSS_CONFIG @@ -3269,21 +3269,21 @@ static void virtnet_init_default_rss(struct virtnet_info *vi) u32 indir_val = 0; int i = 0; - vi->ctrl->rss.hash_types = vi->rss_hash_types_supported; + vi->rss.hash_types = vi->rss_hash_types_supported; vi->rss_hash_types_saved = vi->rss_hash_types_supported; - vi->ctrl->rss.indirection_table_mask = vi->rss_indir_table_size + vi->rss.indirection_table_mask = vi->rss_indir_table_size ? vi->rss_indir_table_size - 1 : 0; - vi->ctrl->rss.unclassified_queue = 0; + vi->rss.unclassified_queue = 0; for (; i < vi->rss_indir_table_size; ++i) { indir_val = ethtool_rxfh_indir_default(i, vi->curr_queue_pairs); - vi->ctrl->rss.indirection_table[i] = indir_val; + vi->rss.indirection_table[i] = indir_val; } - vi->ctrl->rss.max_tx_vq = vi->has_rss ? vi->curr_queue_pairs : 0; - vi->ctrl->rss.hash_key_length = vi->rss_key_size; + vi->rss.max_tx_vq = vi->has_rss ? 
vi->curr_queue_pairs : 0; + vi->rss.hash_key_length = vi->rss_key_size; - netdev_rss_key_fill(vi->ctrl->rss.key, vi->rss_key_size); + netdev_rss_key_fill(vi->rss.key, vi->rss_key_size); } static void virtnet_get_hashflow(const struct virtnet_info *vi, struct ethtool_rxnfc *info) @@ -3394,7 +3394,7 @@ static bool virtnet_set_hashflow(struct virtnet_info *vi, struct ethtool_rxnfc * if (new_hashtypes != vi->rss_hash_types_saved) { vi->rss_hash_types_saved = new_hashtypes; - vi->ctrl->rss.hash_types = vi->rss_hash_types_saved; + vi->rss.hash_types = vi->rss_hash_types_saved; if (vi->dev->features & NETIF_F_RXHASH) return virtnet_commit_rss_command(vi); } @@ -4574,11 +4574,11 @@ static int virtnet_get_rxfh(struct net_device *dev, if (rxfh->indir) { for (i = 0; i < vi->rss_indir_table_size; ++i) - rxfh->indir[i] = vi->ctrl->rss.indirection_table[i]; + rxfh->indir[i] = vi->rss.indirection_table[i]; } if (rxfh->key) - memcpy(rxfh->key, vi->ctrl->rss.key, vi->rss_key_size); + memcpy(rxfh->key, vi->rss.key, vi->rss_key_size); rxfh->hfunc = ETH_RSS_HASH_TOP; @@ -4602,7 +4602,7 @@ static int virtnet_set_rxfh(struct net_device *dev,
[PATCH net-next v5 5/6] virtio_net: Add a lock for per queue RX coalesce
Once the RTNL locking around the control buffer is removed there can be contention on the per queue RX interrupt coalescing data. Use a mutex per queue. A mutex is required because virtnet_send_command can sleep. Signed-off-by: Daniel Jurgens --- drivers/net/virtio_net.c | 53 +++- 1 file changed, 41 insertions(+), 12 deletions(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index af9048ddc3c1..033e1d6ea31b 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -184,6 +184,9 @@ struct receive_queue { /* Is dynamic interrupt moderation enabled? */ bool dim_enabled; + /* Used to protect dim_enabled and inter_coal */ + struct mutex dim_lock; + /* Dynamic Interrupt Moderation */ struct dim dim; @@ -2218,6 +2221,10 @@ static int virtnet_poll(struct napi_struct *napi, int budget) /* Out of packets? */ if (received < budget) { napi_complete = virtqueue_napi_complete(napi, rq->vq, received); + /* Intentionally not taking dim_lock here. This could result +* in a net_dim call with dim now disabled. But virtnet_rx_dim_work +* will take the lock not update settings if dim is now disabled. +*/ if (napi_complete && rq->dim_enabled) virtnet_rx_dim_update(vi, rq); } @@ -3091,9 +3098,11 @@ static int virtnet_set_ringparam(struct net_device *dev, return err; /* The reason is same as the transmit virtqueue reset */ + mutex_lock(&vi->rq[i].dim_lock); err = virtnet_send_rx_ctrl_coal_vq_cmd(vi, i, vi->intr_coal_rx.max_usecs, vi->intr_coal_rx.max_packets); + mutex_unlock(&vi->rq[i].dim_lock); if (err) return err; } @@ -3472,6 +3481,7 @@ static int virtnet_send_rx_notf_coal_cmds(struct virtnet_info *vi, struct virtio_net_ctrl_coal_rx *coal_rx __free(kfree) = NULL; bool rx_ctrl_dim_on = !!ec->use_adaptive_rx_coalesce; struct scatterlist sgs_rx; + int ret = 0; int i; if (rx_ctrl_dim_on && !virtio_has_feature(vi->vdev, VIRTIO_NET_F_VQ_NOTF_COAL)) @@ -3481,16 +3491,22 @@ static int virtnet_send_rx_notf_coal_cmds(struct virtnet_info *vi, ec->rx_max_coalesced_frames != vi->intr_coal_rx.max_packets)) return -EINVAL; + /* Acquire all queues dim_locks */ + for (i = 0; i < vi->max_queue_pairs; i++) + mutex_lock(&vi->rq[i].dim_lock); + if (rx_ctrl_dim_on && !vi->rx_dim_enabled) { vi->rx_dim_enabled = true; for (i = 0; i < vi->max_queue_pairs; i++) vi->rq[i].dim_enabled = true; - return 0; + goto unlock; } coal_rx = kzalloc(sizeof(*coal_rx), GFP_KERNEL); - if (!coal_rx) - return -ENOMEM; + if (!coal_rx) { + ret = -ENOMEM; + goto unlock; + } if (!rx_ctrl_dim_on && vi->rx_dim_enabled) { vi->rx_dim_enabled = false; @@ -3508,8 +3524,10 @@ static int virtnet_send_rx_notf_coal_cmds(struct virtnet_info *vi, if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_NOTF_COAL, VIRTIO_NET_CTRL_NOTF_COAL_RX_SET, - &sgs_rx)) - return -EINVAL; + &sgs_rx)) { + ret = -EINVAL; + goto unlock; + } vi->intr_coal_rx.max_usecs = ec->rx_coalesce_usecs; vi->intr_coal_rx.max_packets = ec->rx_max_coalesced_frames; @@ -3517,8 +3535,11 @@ static int virtnet_send_rx_notf_coal_cmds(struct virtnet_info *vi, vi->rq[i].intr_coal.max_usecs = ec->rx_coalesce_usecs; vi->rq[i].intr_coal.max_packets = ec->rx_max_coalesced_frames; } +unlock: + for (i = vi->max_queue_pairs - 1; i >= 0; i--) + mutex_unlock(&vi->rq[i].dim_lock); - return 0; + return ret; } static int virtnet_send_notf_coal_cmds(struct virtnet_info *vi, @@ -3542,19 +3563,24 @@ static int virtnet_send_rx_notf_coal_vq_cmds(struct virtnet_info *vi, u16 queue) { bool rx_ctrl_dim_on = !!ec->use_adaptive_rx_coalesce; - bool cur_rx_dim = vi->rq[queue].dim_enabled; u32 max_usecs, 
max_packets; + bool cur_rx_dim; int err; + mutex_lock(&
[PATCH net-next v5 6/6] virtio_net: Remove rtnl lock protection of command buffers
The rtnl lock is no longer needed to protect the control buffer and command VQ. Signed-off-by: Daniel Jurgens Reviewed-by: Jiri Pirko --- drivers/net/virtio_net.c | 24 1 file changed, 4 insertions(+), 20 deletions(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index 033e1d6ea31b..d00f4147c7c0 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -2668,14 +2668,12 @@ static void virtnet_stats(struct net_device *dev, static void virtnet_ack_link_announce(struct virtnet_info *vi) { - rtnl_lock(); if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_ANNOUNCE, VIRTIO_NET_CTRL_ANNOUNCE_ACK, NULL)) dev_warn(&vi->dev->dev, "Failed to ack link announce.\n"); - rtnl_unlock(); } -static int _virtnet_set_queues(struct virtnet_info *vi, u16 queue_pairs) +static int virtnet_set_queues(struct virtnet_info *vi, u16 queue_pairs) { struct virtio_net_ctrl_mq *mq __free(kfree) = NULL; struct scatterlist sg; @@ -2706,16 +2704,6 @@ static int _virtnet_set_queues(struct virtnet_info *vi, u16 queue_pairs) return 0; } -static int virtnet_set_queues(struct virtnet_info *vi, u16 queue_pairs) -{ - int err; - - rtnl_lock(); - err = _virtnet_set_queues(vi, queue_pairs); - rtnl_unlock(); - return err; -} - static int virtnet_close(struct net_device *dev) { u8 *promisc_allmulti __free(kfree) = NULL; @@ -3321,7 +3309,7 @@ static int virtnet_set_channels(struct net_device *dev, return -EINVAL; cpus_read_lock(); - err = _virtnet_set_queues(vi, queue_pairs); + err = virtnet_set_queues(vi, queue_pairs); if (err) { cpus_read_unlock(); goto err; @@ -3626,9 +3614,6 @@ static void virtnet_rx_dim_work(struct work_struct *work) struct dim_cq_moder update_moder; int qnum, err; - if (!rtnl_trylock()) - return; - qnum = rq - vi->rq; mutex_lock(&rq->dim_lock); @@ -3648,7 +3633,6 @@ static void virtnet_rx_dim_work(struct work_struct *work) } out: mutex_unlock(&rq->dim_lock); - rtnl_unlock(); } static int virtnet_coal_params_supported(struct ethtool_coalesce *ec) @@ -4117,7 +4101,7 @@ static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog, synchronize_net(); } - err = _virtnet_set_queues(vi, curr_qp + xdp_qp); + err = virtnet_set_queues(vi, curr_qp + xdp_qp); if (err) goto err; netif_set_real_num_rx_queues(dev, curr_qp + xdp_qp); @@ -4939,7 +4923,7 @@ static int virtnet_probe(struct virtio_device *vdev) virtio_device_ready(vdev); - _virtnet_set_queues(vi, vi->curr_queue_pairs); + virtnet_set_queues(vi, vi->curr_queue_pairs); /* a random MAC address has been assigned, notify the device. * We don't fail probe if VIRTIO_NET_F_CTRL_MAC_ADDR is not there -- 2.34.1
[PATCH net-next v5 4/6] virtio_net: Do DIM update for specified queue only
Since we no longer have to hold the RTNL lock here just do updates for the specified queue. Signed-off-by: Daniel Jurgens --- drivers/net/virtio_net.c | 40 +++- 1 file changed, 15 insertions(+), 25 deletions(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index d752c8ac5cd3..af9048ddc3c1 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -3600,38 +3600,28 @@ static void virtnet_rx_dim_work(struct work_struct *work) struct virtnet_info *vi = rq->vq->vdev->priv; struct net_device *dev = vi->dev; struct dim_cq_moder update_moder; - int i, qnum, err; + int qnum, err; if (!rtnl_trylock()) return; - /* Each rxq's work is queued by "net_dim()->schedule_work()" -* in response to NAPI traffic changes. Note that dim->profile_ix -* for each rxq is updated prior to the queuing action. -* So we only need to traverse and update profiles for all rxqs -* in the work which is holding rtnl_lock. -*/ - for (i = 0; i < vi->curr_queue_pairs; i++) { - rq = &vi->rq[i]; - dim = &rq->dim; - qnum = rq - vi->rq; + qnum = rq - vi->rq; - if (!rq->dim_enabled) - continue; + if (!rq->dim_enabled) + goto out; - update_moder = net_dim_get_rx_moderation(dim->mode, dim->profile_ix); - if (update_moder.usec != rq->intr_coal.max_usecs || - update_moder.pkts != rq->intr_coal.max_packets) { - err = virtnet_send_rx_ctrl_coal_vq_cmd(vi, qnum, - update_moder.usec, - update_moder.pkts); - if (err) - pr_debug("%s: Failed to send dim parameters on rxq%d\n", -dev->name, qnum); - dim->state = DIM_START_MEASURE; - } + update_moder = net_dim_get_rx_moderation(dim->mode, dim->profile_ix); + if (update_moder.usec != rq->intr_coal.max_usecs || + update_moder.pkts != rq->intr_coal.max_packets) { + err = virtnet_send_rx_ctrl_coal_vq_cmd(vi, qnum, + update_moder.usec, + update_moder.pkts); + if (err) + pr_debug("%s: Failed to send dim parameters on rxq%d\n", +dev->name, qnum); + dim->state = DIM_START_MEASURE; } - +out: rtnl_unlock(); } -- 2.34.1
[PATCH net-next v5 3/6] virtio_net: Add a lock for the command VQ.
The command VQ will no longer be protected by the RTNL lock. Use a mutex to protect the control buffer header and the VQ. Signed-off-by: Daniel Jurgens Reviewed-by: Jiri Pirko --- drivers/net/virtio_net.c | 12 ++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index 0ee192b45e1e..d752c8ac5cd3 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -282,6 +282,7 @@ struct virtnet_info { /* Has control virtqueue */ bool has_cvq; + struct mutex cvq_lock; /* Host can handle any s/g split between our header and packet data */ bool any_header_sg; @@ -2529,6 +2530,7 @@ static bool virtnet_send_command(struct virtnet_info *vi, u8 class, u8 cmd, /* Caller should know better */ BUG_ON(!virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_VQ)); + mutex_lock(&vi->cvq_lock); vi->ctrl->status = ~0; vi->ctrl->hdr.class = class; vi->ctrl->hdr.cmd = cmd; @@ -2548,11 +2550,14 @@ static bool virtnet_send_command(struct virtnet_info *vi, u8 class, u8 cmd, if (ret < 0) { dev_warn(&vi->vdev->dev, "Failed to add sgs for command vq: %d\n.", ret); + mutex_unlock(&vi->cvq_lock); return false; } - if (unlikely(!virtqueue_kick(vi->cvq))) + if (unlikely(!virtqueue_kick(vi->cvq))) { + mutex_unlock(&vi->cvq_lock); return vi->ctrl->status == VIRTIO_NET_OK; + } /* Spin for a response, the kick causes an ioport write, trapping * into the hypervisor, so the request should be handled immediately. @@ -2563,6 +2568,7 @@ static bool virtnet_send_command(struct virtnet_info *vi, u8 class, u8 cmd, cpu_relax(); } + mutex_unlock(&vi->cvq_lock); return vi->ctrl->status == VIRTIO_NET_OK; } @@ -4818,8 +4824,10 @@ static int virtnet_probe(struct virtio_device *vdev) virtio_has_feature(vdev, VIRTIO_F_VERSION_1)) vi->any_header_sg = true; - if (virtio_has_feature(vdev, VIRTIO_NET_F_CTRL_VQ)) + if (virtio_has_feature(vdev, VIRTIO_NET_F_CTRL_VQ)) { vi->has_cvq = true; + mutex_init(&vi->cvq_lock); + } if (virtio_has_feature(vdev, VIRTIO_NET_F_MTU)) { mtu = virtio_cread16(vdev, -- 2.34.1
[PATCH net-next v5 1/6] virtio_net: Store RSS setting in virtnet_info
Stop storing RSS setting in the control buffer. This is prep work for removing RTNL lock protection of the control buffer. Signed-off-by: Daniel Jurgens Reviewed-by: Jiri Pirko --- drivers/net/virtio_net.c | 40 1 file changed, 20 insertions(+), 20 deletions(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index 115c3c5414f2..7248dae54e1c 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -245,7 +245,6 @@ struct control_buf { u8 allmulti; __virtio16 vid; __virtio64 offloads; - struct virtio_net_ctrl_rss rss; struct virtio_net_ctrl_coal_tx coal_tx; struct virtio_net_ctrl_coal_rx coal_rx; struct virtio_net_ctrl_coal_vq coal_vq; @@ -287,6 +286,7 @@ struct virtnet_info { u16 rss_indir_table_size; u32 rss_hash_types_supported; u32 rss_hash_types_saved; + struct virtio_net_ctrl_rss rss; /* Has control virtqueue */ bool has_cvq; @@ -3087,17 +3087,17 @@ static bool virtnet_commit_rss_command(struct virtnet_info *vi) sg_init_table(sgs, 4); sg_buf_size = offsetof(struct virtio_net_ctrl_rss, indirection_table); - sg_set_buf(&sgs[0], &vi->ctrl->rss, sg_buf_size); + sg_set_buf(&sgs[0], &vi->rss, sg_buf_size); - sg_buf_size = sizeof(uint16_t) * (vi->ctrl->rss.indirection_table_mask + 1); - sg_set_buf(&sgs[1], vi->ctrl->rss.indirection_table, sg_buf_size); + sg_buf_size = sizeof(uint16_t) * (vi->rss.indirection_table_mask + 1); + sg_set_buf(&sgs[1], vi->rss.indirection_table, sg_buf_size); sg_buf_size = offsetof(struct virtio_net_ctrl_rss, key) - offsetof(struct virtio_net_ctrl_rss, max_tx_vq); - sg_set_buf(&sgs[2], &vi->ctrl->rss.max_tx_vq, sg_buf_size); + sg_set_buf(&sgs[2], &vi->rss.max_tx_vq, sg_buf_size); sg_buf_size = vi->rss_key_size; - sg_set_buf(&sgs[3], vi->ctrl->rss.key, sg_buf_size); + sg_set_buf(&sgs[3], vi->rss.key, sg_buf_size); if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_MQ, vi->has_rss ? VIRTIO_NET_CTRL_MQ_RSS_CONFIG @@ -3113,21 +3113,21 @@ static void virtnet_init_default_rss(struct virtnet_info *vi) u32 indir_val = 0; int i = 0; - vi->ctrl->rss.hash_types = vi->rss_hash_types_supported; + vi->rss.hash_types = vi->rss_hash_types_supported; vi->rss_hash_types_saved = vi->rss_hash_types_supported; - vi->ctrl->rss.indirection_table_mask = vi->rss_indir_table_size + vi->rss.indirection_table_mask = vi->rss_indir_table_size ? vi->rss_indir_table_size - 1 : 0; - vi->ctrl->rss.unclassified_queue = 0; + vi->rss.unclassified_queue = 0; for (; i < vi->rss_indir_table_size; ++i) { indir_val = ethtool_rxfh_indir_default(i, vi->curr_queue_pairs); - vi->ctrl->rss.indirection_table[i] = indir_val; + vi->rss.indirection_table[i] = indir_val; } - vi->ctrl->rss.max_tx_vq = vi->has_rss ? vi->curr_queue_pairs : 0; - vi->ctrl->rss.hash_key_length = vi->rss_key_size; + vi->rss.max_tx_vq = vi->has_rss ? 
vi->curr_queue_pairs : 0; + vi->rss.hash_key_length = vi->rss_key_size; - netdev_rss_key_fill(vi->ctrl->rss.key, vi->rss_key_size); + netdev_rss_key_fill(vi->rss.key, vi->rss_key_size); } static void virtnet_get_hashflow(const struct virtnet_info *vi, struct ethtool_rxnfc *info) @@ -3238,7 +3238,7 @@ static bool virtnet_set_hashflow(struct virtnet_info *vi, struct ethtool_rxnfc * if (new_hashtypes != vi->rss_hash_types_saved) { vi->rss_hash_types_saved = new_hashtypes; - vi->ctrl->rss.hash_types = vi->rss_hash_types_saved; + vi->rss.hash_types = vi->rss_hash_types_saved; if (vi->dev->features & NETIF_F_RXHASH) return virtnet_commit_rss_command(vi); } @@ -3791,11 +3791,11 @@ static int virtnet_get_rxfh(struct net_device *dev, if (rxfh->indir) { for (i = 0; i < vi->rss_indir_table_size; ++i) - rxfh->indir[i] = vi->ctrl->rss.indirection_table[i]; + rxfh->indir[i] = vi->rss.indirection_table[i]; } if (rxfh->key) - memcpy(rxfh->key, vi->ctrl->rss.key, vi->rss_key_size); + memcpy(rxfh->key, vi->rss.key, vi->rss_key_size); rxfh->hfunc = ETH_RSS_HASH_TOP; @@ -3819,7 +3819,7 @@ static int virtnet_set_rxfh(struct net_device *dev,
[PATCH net-next v5 2/6] virtio_net: Remove command data from control_buf
Allocate memory for the data when it's used. Ideally the could be on the stack, but we can't DMA stack memory. With this change only the header and status memory are shared between commands, which will allow using a tighter lock than RTNL. Signed-off-by: Daniel Jurgens Reviewed-by: Jiri Pirko --- drivers/net/virtio_net.c | 111 ++- 1 file changed, 75 insertions(+), 36 deletions(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index 7248dae54e1c..0ee192b45e1e 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -240,14 +240,6 @@ struct virtio_net_ctrl_rss { struct control_buf { struct virtio_net_ctrl_hdr hdr; virtio_net_ctrl_ack status; - struct virtio_net_ctrl_mq mq; - u8 promisc; - u8 allmulti; - __virtio16 vid; - __virtio64 offloads; - struct virtio_net_ctrl_coal_tx coal_tx; - struct virtio_net_ctrl_coal_rx coal_rx; - struct virtio_net_ctrl_coal_vq coal_vq; }; struct virtnet_info { @@ -2672,14 +2664,19 @@ static void virtnet_ack_link_announce(struct virtnet_info *vi) static int _virtnet_set_queues(struct virtnet_info *vi, u16 queue_pairs) { + struct virtio_net_ctrl_mq *mq __free(kfree) = NULL; struct scatterlist sg; struct net_device *dev = vi->dev; if (!vi->has_cvq || !virtio_has_feature(vi->vdev, VIRTIO_NET_F_MQ)) return 0; - vi->ctrl->mq.virtqueue_pairs = cpu_to_virtio16(vi->vdev, queue_pairs); - sg_init_one(&sg, &vi->ctrl->mq, sizeof(vi->ctrl->mq)); + mq = kzalloc(sizeof(*mq), GFP_KERNEL); + if (!mq) + return -ENOMEM; + + mq->virtqueue_pairs = cpu_to_virtio16(vi->vdev, queue_pairs); + sg_init_one(&sg, mq, sizeof(*mq)); if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_MQ, VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET, &sg)) { @@ -2708,6 +2705,7 @@ static int virtnet_set_queues(struct virtnet_info *vi, u16 queue_pairs) static int virtnet_close(struct net_device *dev) { + u8 *promisc_allmulti __free(kfree) = NULL; struct virtnet_info *vi = netdev_priv(dev); int i; @@ -2732,6 +2730,7 @@ static void virtnet_rx_mode_work(struct work_struct *work) struct scatterlist sg[2]; struct virtio_net_ctrl_mac *mac_data; struct netdev_hw_addr *ha; + u8 *promisc_allmulti; int uc_count; int mc_count; void *buf; @@ -2743,22 +2742,27 @@ static void virtnet_rx_mode_work(struct work_struct *work) rtnl_lock(); - vi->ctrl->promisc = ((dev->flags & IFF_PROMISC) != 0); - vi->ctrl->allmulti = ((dev->flags & IFF_ALLMULTI) != 0); + promisc_allmulti = kzalloc(sizeof(*promisc_allmulti), GFP_ATOMIC); + if (!promisc_allmulti) { + dev_warn(&dev->dev, "Failed to set RX mode, no memory.\n"); + return; + } - sg_init_one(sg, &vi->ctrl->promisc, sizeof(vi->ctrl->promisc)); + *promisc_allmulti = !!(dev->flags & IFF_PROMISC); + sg_init_one(sg, promisc_allmulti, sizeof(*promisc_allmulti)); if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_RX, VIRTIO_NET_CTRL_RX_PROMISC, sg)) dev_warn(&dev->dev, "Failed to %sable promisc mode.\n", -vi->ctrl->promisc ? "en" : "dis"); +*promisc_allmulti ? "en" : "dis"); - sg_init_one(sg, &vi->ctrl->allmulti, sizeof(vi->ctrl->allmulti)); + *promisc_allmulti = !!(dev->flags & IFF_ALLMULTI); + sg_init_one(sg, promisc_allmulti, sizeof(*promisc_allmulti)); if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_RX, VIRTIO_NET_CTRL_RX_ALLMULTI, sg)) dev_warn(&dev->dev, "Failed to %sable allmulti mode.\n", -vi->ctrl->allmulti ? "en" : "dis"); +*promisc_allmulti ? 
"en" : "dis"); netif_addr_lock_bh(dev); @@ -2819,10 +2823,15 @@ static int virtnet_vlan_rx_add_vid(struct net_device *dev, __be16 proto, u16 vid) { struct virtnet_info *vi = netdev_priv(dev); + __virtio16 *_vid __free(kfree) = NULL; struct scatterlist sg; - vi->ctrl->vid = cpu_to_virtio16(vi->vdev, vid); - sg_init_one(&sg, &vi->ctrl->vid, sizeof(vi->ctrl->vid)); + _vid = kzalloc(sizeof(*_vid), GFP_KERNEL); + if (!_vid) + return -ENOMEM; + + *_vid = cpu_to_virtio16(vi->vdev, vid); + sg_init_one(&sg, _vid, sizeof(*_vid)); if (!virtnet_send_command(vi, VIRTI
[PATCH net-next v5 0/6] Remove RTNL lock protection of CVQ
Currently the buffer used for control VQ commands is protected by the RTNL lock. Previously this wasn't a major concern because the control VQ was only used during device setup and user interaction. With the recent addition of dynamic interrupt moderation the control VQ may be used frequently during normal operation.

This series removes the RTNL lock dependency by introducing a mutex to protect the control buffer and the writing of SGs to the control VQ.

v5:
 - Changed cvq_lock to a mutex.
 - Changed dim_lock to a mutex, because it's held while taking the cvq_lock.
 - Use spin/mutex_lock/unlock vs guard macros.
v4:
 - Protect dim_enabled with the same lock as intr_coal.
 - Rename intr_coal_lock to dim_lock.
 - Remove some scoped_guard where the error path doesn't have to be in the lock.
v3:
 - Changed type of _offloads to __virtio16 to fix a static analysis warning.
 - Moved a misplaced hunk to the correct patch.
v2:
 - New patch to only process the provided queue in virtnet_dim_work.
 - New patch to lock the per queue rx coalescing structure.

Daniel Jurgens (6):
  virtio_net: Store RSS setting in virtnet_info
  virtio_net: Remove command data from control_buf
  virtio_net: Add a lock for the command VQ.
  virtio_net: Do DIM update for specified queue only
  virtio_net: Add a lock for per queue RX coalesce
  virtio_net: Remove rtnl lock protection of command buffers

 drivers/net/virtio_net.c | 276 +++
 1 file changed, 163 insertions(+), 113 deletions(-)

--
2.34.1
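The v5 notes above encode a lock-nesting constraint: dim_lock is held at the point where cvq_lock is taken, and once cvq_lock became a sleeping mutex the outer lock could not remain a spinlock. A minimal sketch of that nesting follows, with made-up demo_* lock names and the explicit lock/unlock calls the changelog mentions; it illustrates the constraint, it is not the driver's code.

#include <linux/mutex.h>

static DEFINE_MUTEX(demo_dim_lock);     /* stands in for the per-queue dim_lock */
static DEFINE_MUTEX(demo_cvq_lock);     /* stands in for the control-VQ lock */

static void demo_push_coalescing(void)
{
        mutex_lock(&demo_dim_lock);
        /* The inner lock is taken while the outer one is held; once the
         * inner lock is a sleeping mutex, the outer lock must be able to
         * sleep too, so it cannot stay a spinlock. */
        mutex_lock(&demo_cvq_lock);
        /* ... build and submit the coalescing command ... */
        mutex_unlock(&demo_cvq_lock);
        mutex_unlock(&demo_dim_lock);
}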
[PATCH net-next v4 6/6] virtio_net: Remove rtnl lock protection of command buffers
The rtnl lock is no longer needed to protect the control buffer and command VQ. Signed-off-by: Daniel Jurgens Reviewed-by: Jiri Pirko --- drivers/net/virtio_net.c | 27 +-- 1 file changed, 5 insertions(+), 22 deletions(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index bae5beafe1a1..5825775af8f8 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -2664,14 +2664,12 @@ static void virtnet_stats(struct net_device *dev, static void virtnet_ack_link_announce(struct virtnet_info *vi) { - rtnl_lock(); if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_ANNOUNCE, VIRTIO_NET_CTRL_ANNOUNCE_ACK, NULL)) dev_warn(&vi->dev->dev, "Failed to ack link announce.\n"); - rtnl_unlock(); } -static int _virtnet_set_queues(struct virtnet_info *vi, u16 queue_pairs) +static int virtnet_set_queues(struct virtnet_info *vi, u16 queue_pairs) { struct virtio_net_ctrl_mq *mq __free(kfree) = NULL; struct scatterlist sg; @@ -2702,16 +2700,6 @@ static int _virtnet_set_queues(struct virtnet_info *vi, u16 queue_pairs) return 0; } -static int virtnet_set_queues(struct virtnet_info *vi, u16 queue_pairs) -{ - int err; - - rtnl_lock(); - err = _virtnet_set_queues(vi, queue_pairs); - rtnl_unlock(); - return err; -} - static int virtnet_close(struct net_device *dev) { u8 *promisc_allmulti __free(kfree) = NULL; @@ -3317,7 +3305,7 @@ static int virtnet_set_channels(struct net_device *dev, return -EINVAL; cpus_read_lock(); - err = _virtnet_set_queues(vi, queue_pairs); + err = virtnet_set_queues(vi, queue_pairs); if (err) { cpus_read_unlock(); goto err; @@ -3621,14 +3609,11 @@ static void virtnet_rx_dim_work(struct work_struct *work) struct dim_cq_moder update_moder; int qnum, err; - if (!rtnl_trylock()) - return; - qnum = rq - vi->rq; guard(spinlock)(&rq->dim_lock); if (!rq->dim_enabled) - goto out; + return; update_moder = net_dim_get_rx_moderation(dim->mode, dim->profile_ix); if (update_moder.usec != rq->intr_coal.max_usecs || @@ -3641,8 +3626,6 @@ static void virtnet_rx_dim_work(struct work_struct *work) dev->name, qnum); dim->state = DIM_START_MEASURE; } -out: - rtnl_unlock(); } static int virtnet_coal_params_supported(struct ethtool_coalesce *ec) @@ -4110,7 +4093,7 @@ static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog, synchronize_net(); } - err = _virtnet_set_queues(vi, curr_qp + xdp_qp); + err = virtnet_set_queues(vi, curr_qp + xdp_qp); if (err) goto err; netif_set_real_num_rx_queues(dev, curr_qp + xdp_qp); @@ -4932,7 +4915,7 @@ static int virtnet_probe(struct virtio_device *vdev) virtio_device_ready(vdev); - _virtnet_set_queues(vi, vi->curr_queue_pairs); + virtnet_set_queues(vi, vi->curr_queue_pairs); /* a random MAC address has been assigned, notify the device. * We don't fail probe if VIRTIO_NET_F_CTRL_MAC_ADDR is not there -- 2.34.1
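A pattern worth noting in the virtnet_set_queues() context above is the __free(kfree) annotation from <linux/cleanup.h>: the heap-allocated command payload is freed automatically when it goes out of scope, so none of the early returns need an explicit kfree(). Below is a hedged, self-contained sketch of the same idea; example_dev and example_send_command() are hypothetical stand-ins for the driver's types and for virtnet_send_command().

#include <linux/cleanup.h>
#include <linux/errno.h>
#include <linux/scatterlist.h>
#include <linux/slab.h>
#include <linux/types.h>

struct example_dev;                     /* hypothetical device, for illustration only */

/* Hypothetical stand-in for virtnet_send_command(); always "succeeds" here. */
static bool example_send_command(struct example_dev *dev, struct scatterlist *sg)
{
        return true;
}

static int example_send_u16(struct example_dev *dev, u16 val)
{
        u16 *buf __free(kfree) = NULL;
        struct scatterlist sg;

        buf = kzalloc(sizeof(*buf), GFP_KERNEL);
        if (!buf)
                return -ENOMEM;         /* nothing allocated, nothing to free */

        *buf = val;                     /* endianness conversion omitted for brevity */
        sg_init_one(&sg, buf, sizeof(*buf));

        /* The command is synchronous, so the buffer only needs to live until
         * this call returns; kfree() then runs automatically on every exit
         * from this scope, including the error return below. */
        if (!example_send_command(dev, &sg))
                return -EINVAL;

        return 0;
}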
[PATCH net-next v4 4/6] virtio_net: Do DIM update for specified queue only
Since we no longer have to hold the RTNL lock here just do updates for the specified queue. Signed-off-by: Daniel Jurgens --- drivers/net/virtio_net.c | 40 +++- 1 file changed, 15 insertions(+), 25 deletions(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index d02f83a919a7..b3aa4d2a15e9 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -3596,38 +3596,28 @@ static void virtnet_rx_dim_work(struct work_struct *work) struct virtnet_info *vi = rq->vq->vdev->priv; struct net_device *dev = vi->dev; struct dim_cq_moder update_moder; - int i, qnum, err; + int qnum, err; if (!rtnl_trylock()) return; - /* Each rxq's work is queued by "net_dim()->schedule_work()" -* in response to NAPI traffic changes. Note that dim->profile_ix -* for each rxq is updated prior to the queuing action. -* So we only need to traverse and update profiles for all rxqs -* in the work which is holding rtnl_lock. -*/ - for (i = 0; i < vi->curr_queue_pairs; i++) { - rq = &vi->rq[i]; - dim = &rq->dim; - qnum = rq - vi->rq; + qnum = rq - vi->rq; - if (!rq->dim_enabled) - continue; + if (!rq->dim_enabled) + goto out; - update_moder = net_dim_get_rx_moderation(dim->mode, dim->profile_ix); - if (update_moder.usec != rq->intr_coal.max_usecs || - update_moder.pkts != rq->intr_coal.max_packets) { - err = virtnet_send_rx_ctrl_coal_vq_cmd(vi, qnum, - update_moder.usec, - update_moder.pkts); - if (err) - pr_debug("%s: Failed to send dim parameters on rxq%d\n", -dev->name, qnum); - dim->state = DIM_START_MEASURE; - } + update_moder = net_dim_get_rx_moderation(dim->mode, dim->profile_ix); + if (update_moder.usec != rq->intr_coal.max_usecs || + update_moder.pkts != rq->intr_coal.max_packets) { + err = virtnet_send_rx_ctrl_coal_vq_cmd(vi, qnum, + update_moder.usec, + update_moder.pkts); + if (err) + pr_debug("%s: Failed to send dim parameters on rxq%d\n", +dev->name, qnum); + dim->state = DIM_START_MEASURE; } - +out: rtnl_unlock(); } -- 2.34.1
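The single-queue version above relies on the work item identifying its own queue: struct dim embeds the work_struct, the dim instance sits inside the receive queue, and the queue index falls out of pointer arithmetic against the queue array (the qnum = rq - vi->rq line). A reduced sketch of that container_of() chain follows; the demo_* struct names and the back-pointer are illustrative, not the driver's exact layout.

#include <linux/container_of.h>
#include <linux/dim.h>
#include <linux/printk.h>
#include <linux/workqueue.h>

struct demo_priv;

struct demo_rq {
        struct dim dim;                 /* embeds the work item (dim.work) */
        struct demo_priv *priv;         /* back-pointer to the device private data */
};

struct demo_priv {
        struct demo_rq rq[4];
};

static void demo_rx_dim_work(struct work_struct *work)
{
        struct dim *dim = container_of(work, struct dim, work);
        struct demo_rq *rq = container_of(dim, struct demo_rq, dim);
        int qnum = rq - rq->priv->rq;   /* same idea as qnum = rq - vi->rq */

        pr_debug("dim work for rxq%d, profile %u\n", qnum, dim->profile_ix);

        /* ... push new coalescing parameters for this queue only ... */
        dim->state = DIM_START_MEASURE;
}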
[PATCH net-next v4 2/6] virtio_net: Remove command data from control_buf
Allocate memory for the data when it's used. Ideally the could be on the stack, but we can't DMA stack memory. With this change only the header and status memory are shared between commands, which will allow using a tighter lock than RTNL. Signed-off-by: Daniel Jurgens Reviewed-by: Jiri Pirko --- drivers/net/virtio_net.c | 111 ++- 1 file changed, 75 insertions(+), 36 deletions(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index 7248dae54e1c..0ee192b45e1e 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -240,14 +240,6 @@ struct virtio_net_ctrl_rss { struct control_buf { struct virtio_net_ctrl_hdr hdr; virtio_net_ctrl_ack status; - struct virtio_net_ctrl_mq mq; - u8 promisc; - u8 allmulti; - __virtio16 vid; - __virtio64 offloads; - struct virtio_net_ctrl_coal_tx coal_tx; - struct virtio_net_ctrl_coal_rx coal_rx; - struct virtio_net_ctrl_coal_vq coal_vq; }; struct virtnet_info { @@ -2672,14 +2664,19 @@ static void virtnet_ack_link_announce(struct virtnet_info *vi) static int _virtnet_set_queues(struct virtnet_info *vi, u16 queue_pairs) { + struct virtio_net_ctrl_mq *mq __free(kfree) = NULL; struct scatterlist sg; struct net_device *dev = vi->dev; if (!vi->has_cvq || !virtio_has_feature(vi->vdev, VIRTIO_NET_F_MQ)) return 0; - vi->ctrl->mq.virtqueue_pairs = cpu_to_virtio16(vi->vdev, queue_pairs); - sg_init_one(&sg, &vi->ctrl->mq, sizeof(vi->ctrl->mq)); + mq = kzalloc(sizeof(*mq), GFP_KERNEL); + if (!mq) + return -ENOMEM; + + mq->virtqueue_pairs = cpu_to_virtio16(vi->vdev, queue_pairs); + sg_init_one(&sg, mq, sizeof(*mq)); if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_MQ, VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET, &sg)) { @@ -2708,6 +2705,7 @@ static int virtnet_set_queues(struct virtnet_info *vi, u16 queue_pairs) static int virtnet_close(struct net_device *dev) { + u8 *promisc_allmulti __free(kfree) = NULL; struct virtnet_info *vi = netdev_priv(dev); int i; @@ -2732,6 +2730,7 @@ static void virtnet_rx_mode_work(struct work_struct *work) struct scatterlist sg[2]; struct virtio_net_ctrl_mac *mac_data; struct netdev_hw_addr *ha; + u8 *promisc_allmulti; int uc_count; int mc_count; void *buf; @@ -2743,22 +2742,27 @@ static void virtnet_rx_mode_work(struct work_struct *work) rtnl_lock(); - vi->ctrl->promisc = ((dev->flags & IFF_PROMISC) != 0); - vi->ctrl->allmulti = ((dev->flags & IFF_ALLMULTI) != 0); + promisc_allmulti = kzalloc(sizeof(*promisc_allmulti), GFP_ATOMIC); + if (!promisc_allmulti) { + dev_warn(&dev->dev, "Failed to set RX mode, no memory.\n"); + return; + } - sg_init_one(sg, &vi->ctrl->promisc, sizeof(vi->ctrl->promisc)); + *promisc_allmulti = !!(dev->flags & IFF_PROMISC); + sg_init_one(sg, promisc_allmulti, sizeof(*promisc_allmulti)); if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_RX, VIRTIO_NET_CTRL_RX_PROMISC, sg)) dev_warn(&dev->dev, "Failed to %sable promisc mode.\n", -vi->ctrl->promisc ? "en" : "dis"); +*promisc_allmulti ? "en" : "dis"); - sg_init_one(sg, &vi->ctrl->allmulti, sizeof(vi->ctrl->allmulti)); + *promisc_allmulti = !!(dev->flags & IFF_ALLMULTI); + sg_init_one(sg, promisc_allmulti, sizeof(*promisc_allmulti)); if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_RX, VIRTIO_NET_CTRL_RX_ALLMULTI, sg)) dev_warn(&dev->dev, "Failed to %sable allmulti mode.\n", -vi->ctrl->allmulti ? "en" : "dis"); +*promisc_allmulti ? 
"en" : "dis"); netif_addr_lock_bh(dev); @@ -2819,10 +2823,15 @@ static int virtnet_vlan_rx_add_vid(struct net_device *dev, __be16 proto, u16 vid) { struct virtnet_info *vi = netdev_priv(dev); + __virtio16 *_vid __free(kfree) = NULL; struct scatterlist sg; - vi->ctrl->vid = cpu_to_virtio16(vi->vdev, vid); - sg_init_one(&sg, &vi->ctrl->vid, sizeof(vi->ctrl->vid)); + _vid = kzalloc(sizeof(*_vid), GFP_KERNEL); + if (!_vid) + return -ENOMEM; + + *_vid = cpu_to_virtio16(vi->vdev, vid); + sg_init_one(&sg, _vid, sizeof(*_vid)); if (!virtnet_send_command(vi, VIRTI
[PATCH net-next v4 3/6] virtio_net: Add a lock for the command VQ.
The command VQ will no longer be protected by the RTNL lock. Use a spinlock to protect the control buffer header and the VQ. Signed-off-by: Daniel Jurgens Reviewed-by: Jiri Pirko --- drivers/net/virtio_net.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index 0ee192b45e1e..d02f83a919a7 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -282,6 +282,7 @@ struct virtnet_info { /* Has control virtqueue */ bool has_cvq; + spinlock_t cvq_lock; /* Host can handle any s/g split between our header and packet data */ bool any_header_sg; @@ -2529,6 +2530,7 @@ static bool virtnet_send_command(struct virtnet_info *vi, u8 class, u8 cmd, /* Caller should know better */ BUG_ON(!virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_VQ)); + guard(spinlock)(&vi->cvq_lock); vi->ctrl->status = ~0; vi->ctrl->hdr.class = class; vi->ctrl->hdr.cmd = cmd; @@ -4818,8 +4820,10 @@ static int virtnet_probe(struct virtio_device *vdev) virtio_has_feature(vdev, VIRTIO_F_VERSION_1)) vi->any_header_sg = true; - if (virtio_has_feature(vdev, VIRTIO_NET_F_CTRL_VQ)) + if (virtio_has_feature(vdev, VIRTIO_NET_F_CTRL_VQ)) { vi->has_cvq = true; + spin_lock_init(&vi->cvq_lock); + } if (virtio_has_feature(vdev, VIRTIO_NET_F_MTU)) { mtu = virtio_cread16(vdev, -- 2.34.1
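The guard(spinlock)() line added above comes from <linux/cleanup.h>: it acquires the lock at the declaration and releases it automatically when the enclosing scope exits, so every return path out of virtnet_send_command() stays covered without explicit unlocks. A self-contained sketch of a shared command header serialized that way (the demo_* names are illustrative):

#include <linux/cleanup.h>
#include <linux/spinlock.h>
#include <linux/types.h>

struct demo_ctrl {
        spinlock_t lock;                /* spin_lock_init() at probe, as in the patch */
        u8 class;
        u8 cmd;
        u8 status;
};

static bool demo_send_command(struct demo_ctrl *ctrl, u8 class, u8 cmd)
{
        guard(spinlock)(&ctrl->lock);   /* released on every return below */

        ctrl->status = ~0;
        ctrl->class = class;
        ctrl->cmd = cmd;

        /* ... add the header and status buffers to the virtqueue, kick,
         * and poll for completion ... */

        return ctrl->status == 0;
}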
[PATCH net-next v4 5/6] virtio_net: Add a lock for per queue RX coalesce
Once the RTNL locking around the control buffer is removed there can be contention on the per queue RX interrupt coalescing data. Use a spin lock per queue. Signed-off-by: Daniel Jurgens --- drivers/net/virtio_net.c | 40 +--- 1 file changed, 33 insertions(+), 7 deletions(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index b3aa4d2a15e9..bae5beafe1a1 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -184,6 +184,9 @@ struct receive_queue { /* Is dynamic interrupt moderation enabled? */ bool dim_enabled; + /* Used to protect dim_enabled and inter_coal */ + spinlock_t dim_lock; + /* Dynamic Interrupt Moderation */ struct dim dim; @@ -2218,6 +2221,10 @@ static int virtnet_poll(struct napi_struct *napi, int budget) /* Out of packets? */ if (received < budget) { napi_complete = virtqueue_napi_complete(napi, rq->vq, received); + /* Intentionally not taking dim_lock here. This could result +* in a net_dim call with dim now disabled. But virtnet_rx_dim_work +* will take the lock not update settings if dim is now disabled. +*/ if (napi_complete && rq->dim_enabled) virtnet_rx_dim_update(vi, rq); } @@ -3087,9 +3094,11 @@ static int virtnet_set_ringparam(struct net_device *dev, return err; /* The reason is same as the transmit virtqueue reset */ + spin_lock(&vi->rq[i].dim_lock); err = virtnet_send_rx_ctrl_coal_vq_cmd(vi, i, vi->intr_coal_rx.max_usecs, vi->intr_coal_rx.max_packets); + spin_unlock(&vi->rq[i].dim_lock); if (err) return err; } @@ -3468,6 +3477,7 @@ static int virtnet_send_rx_notf_coal_cmds(struct virtnet_info *vi, struct virtio_net_ctrl_coal_rx *coal_rx __free(kfree) = NULL; bool rx_ctrl_dim_on = !!ec->use_adaptive_rx_coalesce; struct scatterlist sgs_rx; + int ret = 0; int i; if (rx_ctrl_dim_on && !virtio_has_feature(vi->vdev, VIRTIO_NET_F_VQ_NOTF_COAL)) @@ -3477,16 +3487,22 @@ static int virtnet_send_rx_notf_coal_cmds(struct virtnet_info *vi, ec->rx_max_coalesced_frames != vi->intr_coal_rx.max_packets)) return -EINVAL; + /* Acquire all queues dim_locks */ + for (i = 0; i < vi->max_queue_pairs; i++) + spin_lock(&vi->rq[i].dim_lock); + if (rx_ctrl_dim_on && !vi->rx_dim_enabled) { vi->rx_dim_enabled = true; for (i = 0; i < vi->max_queue_pairs; i++) vi->rq[i].dim_enabled = true; - return 0; + goto unlock; } coal_rx = kzalloc(sizeof(*coal_rx), GFP_KERNEL); - if (!coal_rx) - return -ENOMEM; + if (!coal_rx) { + ret = -ENOMEM; + goto unlock; + } if (!rx_ctrl_dim_on && vi->rx_dim_enabled) { vi->rx_dim_enabled = false; @@ -3504,8 +3520,10 @@ static int virtnet_send_rx_notf_coal_cmds(struct virtnet_info *vi, if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_NOTF_COAL, VIRTIO_NET_CTRL_NOTF_COAL_RX_SET, - &sgs_rx)) - return -EINVAL; + &sgs_rx)) { + ret = -EINVAL; + goto unlock; + } vi->intr_coal_rx.max_usecs = ec->rx_coalesce_usecs; vi->intr_coal_rx.max_packets = ec->rx_max_coalesced_frames; @@ -3513,8 +3531,11 @@ static int virtnet_send_rx_notf_coal_cmds(struct virtnet_info *vi, vi->rq[i].intr_coal.max_usecs = ec->rx_coalesce_usecs; vi->rq[i].intr_coal.max_packets = ec->rx_max_coalesced_frames; } +unlock: + for (i = vi->max_queue_pairs - 1; i >= 0; i--) + spin_unlock(&vi->rq[i].dim_lock); - return 0; + return ret; } static int virtnet_send_notf_coal_cmds(struct virtnet_info *vi, @@ -3538,10 +3559,12 @@ static int virtnet_send_rx_notf_coal_vq_cmds(struct virtnet_info *vi, u16 queue) { bool rx_ctrl_dim_on = !!ec->use_adaptive_rx_coalesce; - bool cur_rx_dim = vi->rq[queue].dim_enabled; u32 max_usecs, max_packets; + bool cur_rx_dim; int err; + 
guard(spinlock)(&vi->rq[queue].dim_lock); + cur_rx_dim = vi->rq[q
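One detail of the (truncated) v4 5/6 hunks above is the locking shape in virtnet_send_rx_notf_coal_cmds(): every queue's dim_lock is taken up front, all paths funnel through the unlock label, and the locks are dropped in reverse order. A stripped-down sketch of that shape, with an illustrative queue count and names, assuming each lock was spin_lock_init()ed at setup:

#include <linux/spinlock.h>
#include <linux/types.h>

#define DEMO_NQ 4

struct demo_q {
        spinlock_t dim_lock;            /* initialized at queue allocation */
        bool dim_enabled;
};

static int demo_set_dim_all(struct demo_q *q, bool enable)
{
        int i, ret = 0;

        /* take every queue's lock up front ... */
        for (i = 0; i < DEMO_NQ; i++)
                spin_lock(&q[i].dim_lock);

        for (i = 0; i < DEMO_NQ; i++)
                q[i].dim_enabled = enable;

        /* ... all paths, including errors, fall through to one unlock
         * sequence that releases the locks in reverse order */
        for (i = DEMO_NQ - 1; i >= 0; i--)
                spin_unlock(&q[i].dim_lock);

        return ret;
}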
[PATCH net-next v4 1/6] virtio_net: Store RSS setting in virtnet_info
Stop storing RSS setting in the control buffer. This is prep work for removing RTNL lock protection of the control buffer. Signed-off-by: Daniel Jurgens Reviewed-by: Jiri Pirko --- drivers/net/virtio_net.c | 40 1 file changed, 20 insertions(+), 20 deletions(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index 115c3c5414f2..7248dae54e1c 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -245,7 +245,6 @@ struct control_buf { u8 allmulti; __virtio16 vid; __virtio64 offloads; - struct virtio_net_ctrl_rss rss; struct virtio_net_ctrl_coal_tx coal_tx; struct virtio_net_ctrl_coal_rx coal_rx; struct virtio_net_ctrl_coal_vq coal_vq; @@ -287,6 +286,7 @@ struct virtnet_info { u16 rss_indir_table_size; u32 rss_hash_types_supported; u32 rss_hash_types_saved; + struct virtio_net_ctrl_rss rss; /* Has control virtqueue */ bool has_cvq; @@ -3087,17 +3087,17 @@ static bool virtnet_commit_rss_command(struct virtnet_info *vi) sg_init_table(sgs, 4); sg_buf_size = offsetof(struct virtio_net_ctrl_rss, indirection_table); - sg_set_buf(&sgs[0], &vi->ctrl->rss, sg_buf_size); + sg_set_buf(&sgs[0], &vi->rss, sg_buf_size); - sg_buf_size = sizeof(uint16_t) * (vi->ctrl->rss.indirection_table_mask + 1); - sg_set_buf(&sgs[1], vi->ctrl->rss.indirection_table, sg_buf_size); + sg_buf_size = sizeof(uint16_t) * (vi->rss.indirection_table_mask + 1); + sg_set_buf(&sgs[1], vi->rss.indirection_table, sg_buf_size); sg_buf_size = offsetof(struct virtio_net_ctrl_rss, key) - offsetof(struct virtio_net_ctrl_rss, max_tx_vq); - sg_set_buf(&sgs[2], &vi->ctrl->rss.max_tx_vq, sg_buf_size); + sg_set_buf(&sgs[2], &vi->rss.max_tx_vq, sg_buf_size); sg_buf_size = vi->rss_key_size; - sg_set_buf(&sgs[3], vi->ctrl->rss.key, sg_buf_size); + sg_set_buf(&sgs[3], vi->rss.key, sg_buf_size); if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_MQ, vi->has_rss ? VIRTIO_NET_CTRL_MQ_RSS_CONFIG @@ -3113,21 +3113,21 @@ static void virtnet_init_default_rss(struct virtnet_info *vi) u32 indir_val = 0; int i = 0; - vi->ctrl->rss.hash_types = vi->rss_hash_types_supported; + vi->rss.hash_types = vi->rss_hash_types_supported; vi->rss_hash_types_saved = vi->rss_hash_types_supported; - vi->ctrl->rss.indirection_table_mask = vi->rss_indir_table_size + vi->rss.indirection_table_mask = vi->rss_indir_table_size ? vi->rss_indir_table_size - 1 : 0; - vi->ctrl->rss.unclassified_queue = 0; + vi->rss.unclassified_queue = 0; for (; i < vi->rss_indir_table_size; ++i) { indir_val = ethtool_rxfh_indir_default(i, vi->curr_queue_pairs); - vi->ctrl->rss.indirection_table[i] = indir_val; + vi->rss.indirection_table[i] = indir_val; } - vi->ctrl->rss.max_tx_vq = vi->has_rss ? vi->curr_queue_pairs : 0; - vi->ctrl->rss.hash_key_length = vi->rss_key_size; + vi->rss.max_tx_vq = vi->has_rss ? 
vi->curr_queue_pairs : 0; + vi->rss.hash_key_length = vi->rss_key_size; - netdev_rss_key_fill(vi->ctrl->rss.key, vi->rss_key_size); + netdev_rss_key_fill(vi->rss.key, vi->rss_key_size); } static void virtnet_get_hashflow(const struct virtnet_info *vi, struct ethtool_rxnfc *info) @@ -3238,7 +3238,7 @@ static bool virtnet_set_hashflow(struct virtnet_info *vi, struct ethtool_rxnfc * if (new_hashtypes != vi->rss_hash_types_saved) { vi->rss_hash_types_saved = new_hashtypes; - vi->ctrl->rss.hash_types = vi->rss_hash_types_saved; + vi->rss.hash_types = vi->rss_hash_types_saved; if (vi->dev->features & NETIF_F_RXHASH) return virtnet_commit_rss_command(vi); } @@ -3791,11 +3791,11 @@ static int virtnet_get_rxfh(struct net_device *dev, if (rxfh->indir) { for (i = 0; i < vi->rss_indir_table_size; ++i) - rxfh->indir[i] = vi->ctrl->rss.indirection_table[i]; + rxfh->indir[i] = vi->rss.indirection_table[i]; } if (rxfh->key) - memcpy(rxfh->key, vi->ctrl->rss.key, vi->rss_key_size); + memcpy(rxfh->key, vi->rss.key, vi->rss_key_size); rxfh->hfunc = ETH_RSS_HASH_TOP; @@ -3819,7 +3819,7 @@ static int virtnet_set_rxfh(struct net_device *dev,
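The virtnet_commit_rss_command() hunk above splits the cached struct virtio_net_ctrl_rss into four scatterlist entries so that only the populated parts of the indirection table and hash key are handed to the device. A sketch of that offsetof()-based splitting is below; the demo_rss layout and array sizes are illustrative (the real structure is local to virtio_net.c and uses device byte order).

#include <linux/scatterlist.h>
#include <linux/stddef.h>
#include <linux/types.h>

/* Illustrative layout only, not the driver's definition. */
struct demo_rss {
        u32 hash_types;
        u16 indirection_table_mask;
        u16 unclassified_queue;
        u16 indirection_table[128];     /* only (mask + 1) entries are populated */
        u16 max_tx_vq;
        u8 hash_key_length;
        u8 key[40];
};

static void demo_fill_rss_sgs(struct scatterlist sgs[4], struct demo_rss *rss)
{
        sg_init_table(sgs, 4);

        /* 1) fixed header fields, up to (not including) the table */
        sg_set_buf(&sgs[0], rss, offsetof(struct demo_rss, indirection_table));

        /* 2) only the populated part of the indirection table */
        sg_set_buf(&sgs[1], rss->indirection_table,
                   (rss->indirection_table_mask + 1) *
                   sizeof(rss->indirection_table[0]));

        /* 3) the fixed fields between the table and the key */
        sg_set_buf(&sgs[2], &rss->max_tx_vq,
                   offsetof(struct demo_rss, key) -
                   offsetof(struct demo_rss, max_tx_vq));

        /* 4) only the bytes of the key that are actually in use */
        sg_set_buf(&sgs[3], rss->key, rss->hash_key_length);
}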
[PATCH net-next v4 0/6] Remove RTNL lock protection of CVQ
Currently the buffer used for control VQ commands is protected by the RTNL lock. Previously this wasn't a major concern because the control VQ was only used during device setup and user interaction. With the recent addition of dynamic interrupt moderation the control VQ may be used frequently during normal operation.

This series removes the RTNL lock dependency by introducing a spin lock to protect the control buffer and the writing of SGs to the control VQ.

v4:
 - Protect dim_enabled with the same lock as intr_coal.
 - Rename intr_coal_lock to dim_lock.
 - Remove some scoped_guard where the error path doesn't have to be in the lock.
v3:
 - Changed type of _offloads to __virtio16 to fix a static analysis warning.
 - Moved a misplaced hunk to the correct patch.
v2:
 - New patch to only process the provided queue in virtnet_dim_work.
 - New patch to lock the per queue rx coalescing structure.

Daniel Jurgens (6):
  virtio_net: Store RSS setting in virtnet_info
  virtio_net: Remove command data from control_buf
  virtio_net: Add a lock for the command VQ.
  virtio_net: Do DIM update for specified queue only
  virtio_net: Add a lock for per queue RX coalesce
  virtio_net: Remove rtnl lock protection of command buffers

 drivers/net/virtio_net.c | 256 +++
 1 file changed, 149 insertions(+), 107 deletions(-)

--
2.34.1
[PATCH net-next v3 6/6] virtio_net: Remove rtnl lock protection of command buffers
The rtnl lock is no longer needed to protect the control buffer and command VQ. Signed-off-by: Daniel Jurgens Reviewed-by: Jiri Pirko --- drivers/net/virtio_net.c | 27 +-- 1 file changed, 5 insertions(+), 22 deletions(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index 8724caa7c2ed..8df8585834f0 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -2658,14 +2658,12 @@ static void virtnet_stats(struct net_device *dev, static void virtnet_ack_link_announce(struct virtnet_info *vi) { - rtnl_lock(); if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_ANNOUNCE, VIRTIO_NET_CTRL_ANNOUNCE_ACK, NULL)) dev_warn(&vi->dev->dev, "Failed to ack link announce.\n"); - rtnl_unlock(); } -static int _virtnet_set_queues(struct virtnet_info *vi, u16 queue_pairs) +static int virtnet_set_queues(struct virtnet_info *vi, u16 queue_pairs) { struct virtio_net_ctrl_mq *mq __free(kfree) = NULL; struct scatterlist sg; @@ -2696,16 +2694,6 @@ static int _virtnet_set_queues(struct virtnet_info *vi, u16 queue_pairs) return 0; } -static int virtnet_set_queues(struct virtnet_info *vi, u16 queue_pairs) -{ - int err; - - rtnl_lock(); - err = _virtnet_set_queues(vi, queue_pairs); - rtnl_unlock(); - return err; -} - static int virtnet_close(struct net_device *dev) { u8 *promisc_allmulti __free(kfree) = NULL; @@ -3311,7 +3299,7 @@ static int virtnet_set_channels(struct net_device *dev, return -EINVAL; cpus_read_lock(); - err = _virtnet_set_queues(vi, queue_pairs); + err = virtnet_set_queues(vi, queue_pairs); if (err) { cpus_read_unlock(); goto err; @@ -3604,13 +3592,10 @@ static void virtnet_rx_dim_work(struct work_struct *work) struct dim_cq_moder update_moder; int qnum, err; - if (!rtnl_trylock()) - return; - qnum = rq - vi->rq; if (!rq->dim_enabled) - goto out; + return; guard(spinlock)(&rq->intr_coal_lock); update_moder = net_dim_get_rx_moderation(dim->mode, dim->profile_ix); @@ -3624,8 +3609,6 @@ static void virtnet_rx_dim_work(struct work_struct *work) dev->name, qnum); dim->state = DIM_START_MEASURE; } -out: - rtnl_unlock(); } static int virtnet_coal_params_supported(struct ethtool_coalesce *ec) @@ -4093,7 +4076,7 @@ static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog, synchronize_net(); } - err = _virtnet_set_queues(vi, curr_qp + xdp_qp); + err = virtnet_set_queues(vi, curr_qp + xdp_qp); if (err) goto err; netif_set_real_num_rx_queues(dev, curr_qp + xdp_qp); @@ -4915,7 +4898,7 @@ static int virtnet_probe(struct virtio_device *vdev) virtio_device_ready(vdev); - _virtnet_set_queues(vi, vi->curr_queue_pairs); + virtnet_set_queues(vi, vi->curr_queue_pairs); /* a random MAC address has been assigned, notify the device. * We don't fail probe if VIRTIO_NET_F_CTRL_MAC_ADDR is not there -- 2.42.0
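For context on the wrapper removed above: the old _virtnet_set_queues()/virtnet_set_queues() pair followed the common locked/unlocked split, where the bare helper assumes the caller already holds RTNL (ethtool and ndo paths) and the wrapper takes RTNL for callers that don't (probe, workqueues). Once the helper no longer needs RTNL, one function suffices. A generic sketch of that old pattern, with hypothetical demo_* names:

#include <linux/rtnetlink.h>

/* The locked variant assumes the caller already holds RTNL. */
static int demo_apply_config(void)
{
        ASSERT_RTNL();
        /* ... program the device ... */
        return 0;
}

/* The wrapper takes RTNL for callers that don't hold it. */
static int demo_apply_config_unlocked(void)
{
        int err;

        rtnl_lock();
        err = demo_apply_config();
        rtnl_unlock();
        return err;
}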
[PATCH net-next v3 5/6] virtio_net: Add a lock for per queue RX coalesce
Once the RTNL locking around the control buffer is removed there can be contention on the per queue RX interrupt coalescing data. Use a spin lock per queue. Signed-off-by: Daniel Jurgens --- drivers/net/virtio_net.c | 23 --- 1 file changed, 16 insertions(+), 7 deletions(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index b3aa4d2a15e9..8724caa7c2ed 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -190,6 +190,7 @@ struct receive_queue { u32 packets_in_napi; struct virtnet_interrupt_coalesce intr_coal; + spinlock_t intr_coal_lock; /* Chain pages by the private ptr. */ struct page *pages; @@ -3087,11 +3088,13 @@ static int virtnet_set_ringparam(struct net_device *dev, return err; /* The reason is same as the transmit virtqueue reset */ - err = virtnet_send_rx_ctrl_coal_vq_cmd(vi, i, - vi->intr_coal_rx.max_usecs, - vi->intr_coal_rx.max_packets); - if (err) - return err; + scoped_guard(spinlock, &vi->rq[i].intr_coal_lock) { + err = virtnet_send_rx_ctrl_coal_vq_cmd(vi, i, + vi->intr_coal_rx.max_usecs, + vi->intr_coal_rx.max_packets); + if (err) + return err; + } } } @@ -3510,8 +3513,10 @@ static int virtnet_send_rx_notf_coal_cmds(struct virtnet_info *vi, vi->intr_coal_rx.max_usecs = ec->rx_coalesce_usecs; vi->intr_coal_rx.max_packets = ec->rx_max_coalesced_frames; for (i = 0; i < vi->max_queue_pairs; i++) { - vi->rq[i].intr_coal.max_usecs = ec->rx_coalesce_usecs; - vi->rq[i].intr_coal.max_packets = ec->rx_max_coalesced_frames; + scoped_guard(spinlock, &vi->rq[i].intr_coal_lock) { + vi->rq[i].intr_coal.max_usecs = ec->rx_coalesce_usecs; + vi->rq[i].intr_coal.max_packets = ec->rx_max_coalesced_frames; + } } return 0; @@ -3542,6 +3547,7 @@ static int virtnet_send_rx_notf_coal_vq_cmds(struct virtnet_info *vi, u32 max_usecs, max_packets; int err; + guard(spinlock)(&vi->rq[queue].intr_coal_lock); max_usecs = vi->rq[queue].intr_coal.max_usecs; max_packets = vi->rq[queue].intr_coal.max_packets; @@ -3606,6 +3612,7 @@ static void virtnet_rx_dim_work(struct work_struct *work) if (!rq->dim_enabled) goto out; + guard(spinlock)(&rq->intr_coal_lock); update_moder = net_dim_get_rx_moderation(dim->mode, dim->profile_ix); if (update_moder.usec != rq->intr_coal.max_usecs || update_moder.pkts != rq->intr_coal.max_packets) { @@ -3756,6 +3763,7 @@ static int virtnet_get_per_queue_coalesce(struct net_device *dev, return -EINVAL; if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_VQ_NOTF_COAL)) { + guard(spinlock)(&vi->rq[queue].intr_coal_lock); ec->rx_coalesce_usecs = vi->rq[queue].intr_coal.max_usecs; ec->tx_coalesce_usecs = vi->sq[queue].intr_coal.max_usecs; ec->tx_max_coalesced_frames = vi->sq[queue].intr_coal.max_packets; @@ -4501,6 +4509,7 @@ static int virtnet_alloc_queues(struct virtnet_info *vi) u64_stats_init(&vi->rq[i].stats.syncp); u64_stats_init(&vi->sq[i].stats.syncp); + spin_lock_init(&vi->rq[i].intr_coal_lock); } return 0; -- 2.42.0
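This revision mixes the two <linux/cleanup.h> lock helpers: scoped_guard() holds the lock only for the braced block (as in the set_ringparam loop above), while guard() holds it until the end of the enclosing scope (as in the coalesce-vq and ethtool paths). A small sketch of the difference, not tied to the driver:

#include <linux/cleanup.h>
#include <linux/errno.h>
#include <linux/spinlock.h>

static int demo(spinlock_t *a, spinlock_t *b, int *val)
{
        scoped_guard(spinlock, a) {
                if (*val < 0)
                        return -EINVAL; /* 'a' is released on this return */
                (*val)++;
        }                               /* 'a' is released here otherwise */

        guard(spinlock)(b);             /* 'b' is held until the function returns */
        (*val)++;
        return 0;
}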
[PATCH net-next v3 4/6] virtio_net: Do DIM update for specified queue only
Since we no longer have to hold the RTNL lock here just do updates for the specified queue. Signed-off-by: Daniel Jurgens --- drivers/net/virtio_net.c | 40 +++- 1 file changed, 15 insertions(+), 25 deletions(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index d02f83a919a7..b3aa4d2a15e9 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -3596,38 +3596,28 @@ static void virtnet_rx_dim_work(struct work_struct *work) struct virtnet_info *vi = rq->vq->vdev->priv; struct net_device *dev = vi->dev; struct dim_cq_moder update_moder; - int i, qnum, err; + int qnum, err; if (!rtnl_trylock()) return; - /* Each rxq's work is queued by "net_dim()->schedule_work()" -* in response to NAPI traffic changes. Note that dim->profile_ix -* for each rxq is updated prior to the queuing action. -* So we only need to traverse and update profiles for all rxqs -* in the work which is holding rtnl_lock. -*/ - for (i = 0; i < vi->curr_queue_pairs; i++) { - rq = &vi->rq[i]; - dim = &rq->dim; - qnum = rq - vi->rq; + qnum = rq - vi->rq; - if (!rq->dim_enabled) - continue; + if (!rq->dim_enabled) + goto out; - update_moder = net_dim_get_rx_moderation(dim->mode, dim->profile_ix); - if (update_moder.usec != rq->intr_coal.max_usecs || - update_moder.pkts != rq->intr_coal.max_packets) { - err = virtnet_send_rx_ctrl_coal_vq_cmd(vi, qnum, - update_moder.usec, - update_moder.pkts); - if (err) - pr_debug("%s: Failed to send dim parameters on rxq%d\n", -dev->name, qnum); - dim->state = DIM_START_MEASURE; - } + update_moder = net_dim_get_rx_moderation(dim->mode, dim->profile_ix); + if (update_moder.usec != rq->intr_coal.max_usecs || + update_moder.pkts != rq->intr_coal.max_packets) { + err = virtnet_send_rx_ctrl_coal_vq_cmd(vi, qnum, + update_moder.usec, + update_moder.pkts); + if (err) + pr_debug("%s: Failed to send dim parameters on rxq%d\n", +dev->name, qnum); + dim->state = DIM_START_MEASURE; } - +out: rtnl_unlock(); } -- 2.42.0
[PATCH net-next v3 2/6] virtio_net: Remove command data from control_buf
Allocate memory for the data when it's used. Ideally the could be on the stack, but we can't DMA stack memory. With this change only the header and status memory are shared between commands, which will allow using a tighter lock than RTNL. Signed-off-by: Daniel Jurgens Reviewed-by: Jiri Pirko --- drivers/net/virtio_net.c | 111 ++- 1 file changed, 75 insertions(+), 36 deletions(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index 7248dae54e1c..0ee192b45e1e 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -240,14 +240,6 @@ struct virtio_net_ctrl_rss { struct control_buf { struct virtio_net_ctrl_hdr hdr; virtio_net_ctrl_ack status; - struct virtio_net_ctrl_mq mq; - u8 promisc; - u8 allmulti; - __virtio16 vid; - __virtio64 offloads; - struct virtio_net_ctrl_coal_tx coal_tx; - struct virtio_net_ctrl_coal_rx coal_rx; - struct virtio_net_ctrl_coal_vq coal_vq; }; struct virtnet_info { @@ -2672,14 +2664,19 @@ static void virtnet_ack_link_announce(struct virtnet_info *vi) static int _virtnet_set_queues(struct virtnet_info *vi, u16 queue_pairs) { + struct virtio_net_ctrl_mq *mq __free(kfree) = NULL; struct scatterlist sg; struct net_device *dev = vi->dev; if (!vi->has_cvq || !virtio_has_feature(vi->vdev, VIRTIO_NET_F_MQ)) return 0; - vi->ctrl->mq.virtqueue_pairs = cpu_to_virtio16(vi->vdev, queue_pairs); - sg_init_one(&sg, &vi->ctrl->mq, sizeof(vi->ctrl->mq)); + mq = kzalloc(sizeof(*mq), GFP_KERNEL); + if (!mq) + return -ENOMEM; + + mq->virtqueue_pairs = cpu_to_virtio16(vi->vdev, queue_pairs); + sg_init_one(&sg, mq, sizeof(*mq)); if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_MQ, VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET, &sg)) { @@ -2708,6 +2705,7 @@ static int virtnet_set_queues(struct virtnet_info *vi, u16 queue_pairs) static int virtnet_close(struct net_device *dev) { + u8 *promisc_allmulti __free(kfree) = NULL; struct virtnet_info *vi = netdev_priv(dev); int i; @@ -2732,6 +2730,7 @@ static void virtnet_rx_mode_work(struct work_struct *work) struct scatterlist sg[2]; struct virtio_net_ctrl_mac *mac_data; struct netdev_hw_addr *ha; + u8 *promisc_allmulti; int uc_count; int mc_count; void *buf; @@ -2743,22 +2742,27 @@ static void virtnet_rx_mode_work(struct work_struct *work) rtnl_lock(); - vi->ctrl->promisc = ((dev->flags & IFF_PROMISC) != 0); - vi->ctrl->allmulti = ((dev->flags & IFF_ALLMULTI) != 0); + promisc_allmulti = kzalloc(sizeof(*promisc_allmulti), GFP_ATOMIC); + if (!promisc_allmulti) { + dev_warn(&dev->dev, "Failed to set RX mode, no memory.\n"); + return; + } - sg_init_one(sg, &vi->ctrl->promisc, sizeof(vi->ctrl->promisc)); + *promisc_allmulti = !!(dev->flags & IFF_PROMISC); + sg_init_one(sg, promisc_allmulti, sizeof(*promisc_allmulti)); if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_RX, VIRTIO_NET_CTRL_RX_PROMISC, sg)) dev_warn(&dev->dev, "Failed to %sable promisc mode.\n", -vi->ctrl->promisc ? "en" : "dis"); +*promisc_allmulti ? "en" : "dis"); - sg_init_one(sg, &vi->ctrl->allmulti, sizeof(vi->ctrl->allmulti)); + *promisc_allmulti = !!(dev->flags & IFF_ALLMULTI); + sg_init_one(sg, promisc_allmulti, sizeof(*promisc_allmulti)); if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_RX, VIRTIO_NET_CTRL_RX_ALLMULTI, sg)) dev_warn(&dev->dev, "Failed to %sable allmulti mode.\n", -vi->ctrl->allmulti ? "en" : "dis"); +*promisc_allmulti ? 
"en" : "dis"); netif_addr_lock_bh(dev); @@ -2819,10 +2823,15 @@ static int virtnet_vlan_rx_add_vid(struct net_device *dev, __be16 proto, u16 vid) { struct virtnet_info *vi = netdev_priv(dev); + __virtio16 *_vid __free(kfree) = NULL; struct scatterlist sg; - vi->ctrl->vid = cpu_to_virtio16(vi->vdev, vid); - sg_init_one(&sg, &vi->ctrl->vid, sizeof(vi->ctrl->vid)); + _vid = kzalloc(sizeof(*_vid), GFP_KERNEL); + if (!_vid) + return -ENOMEM; + + *_vid = cpu_to_virtio16(vi->vdev, vid); + sg_init_one(&sg, _vid, sizeof(*_vid)); if (!virtnet_send_command(vi, VIRTI
[PATCH net-next v3 3/6] virtio_net: Add a lock for the command VQ.
The command VQ will no longer be protected by the RTNL lock. Use a spinlock to protect the control buffer header and the VQ. Signed-off-by: Daniel Jurgens Reviewed-by: Jiri Pirko --- drivers/net/virtio_net.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index 0ee192b45e1e..d02f83a919a7 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -282,6 +282,7 @@ struct virtnet_info { /* Has control virtqueue */ bool has_cvq; + spinlock_t cvq_lock; /* Host can handle any s/g split between our header and packet data */ bool any_header_sg; @@ -2529,6 +2530,7 @@ static bool virtnet_send_command(struct virtnet_info *vi, u8 class, u8 cmd, /* Caller should know better */ BUG_ON(!virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_VQ)); + guard(spinlock)(&vi->cvq_lock); vi->ctrl->status = ~0; vi->ctrl->hdr.class = class; vi->ctrl->hdr.cmd = cmd; @@ -4818,8 +4820,10 @@ static int virtnet_probe(struct virtio_device *vdev) virtio_has_feature(vdev, VIRTIO_F_VERSION_1)) vi->any_header_sg = true; - if (virtio_has_feature(vdev, VIRTIO_NET_F_CTRL_VQ)) + if (virtio_has_feature(vdev, VIRTIO_NET_F_CTRL_VQ)) { vi->has_cvq = true; + spin_lock_init(&vi->cvq_lock); + } if (virtio_has_feature(vdev, VIRTIO_NET_F_MTU)) { mtu = virtio_cread16(vdev, -- 2.42.0
[PATCH net-next v3 0/6] Remove RTNL lock protection of CVQ
Currently the buffer used for control VQ commands is protected by the RTNL lock. Previously this wasn't a major concern because the control VQ was only used during device setup and user interaction. With the recent addition of dynamic interrupt moderation the control VQ may be used frequently during normal operation.

This series removes the RTNL lock dependency by introducing a spin lock to protect the control buffer and the writing of SGs to the control VQ.

v3:
 - Changed type of _offloads to __virtio16 to fix a static analysis warning.
 - Moved a misplaced hunk to the correct patch.
v2:
 - New patch to only process the provided queue in virtnet_dim_work.
 - New patch to lock the per queue rx coalescing structure.

Daniel Jurgens (6):
  virtio_net: Store RSS setting in virtnet_info
  virtio_net: Remove command data from control_buf
  virtio_net: Add a lock for the command VQ.
  virtio_net: Do DIM update for specified queue only
  virtio_net: Add a lock for per queue RX coalesce
  virtio_net: Remove rtnl lock protection of command buffers

 drivers/net/virtio_net.c | 243 +--
 1 file changed, 134 insertions(+), 109 deletions(-)

--
2.42.0
[PATCH net-next v3 1/6] virtio_net: Store RSS setting in virtnet_info
Stop storing RSS setting in the control buffer. This is prep work for removing RTNL lock protection of the control buffer. Signed-off-by: Daniel Jurgens Reviewed-by: Jiri Pirko --- drivers/net/virtio_net.c | 40 1 file changed, 20 insertions(+), 20 deletions(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index 115c3c5414f2..7248dae54e1c 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -245,7 +245,6 @@ struct control_buf { u8 allmulti; __virtio16 vid; __virtio64 offloads; - struct virtio_net_ctrl_rss rss; struct virtio_net_ctrl_coal_tx coal_tx; struct virtio_net_ctrl_coal_rx coal_rx; struct virtio_net_ctrl_coal_vq coal_vq; @@ -287,6 +286,7 @@ struct virtnet_info { u16 rss_indir_table_size; u32 rss_hash_types_supported; u32 rss_hash_types_saved; + struct virtio_net_ctrl_rss rss; /* Has control virtqueue */ bool has_cvq; @@ -3087,17 +3087,17 @@ static bool virtnet_commit_rss_command(struct virtnet_info *vi) sg_init_table(sgs, 4); sg_buf_size = offsetof(struct virtio_net_ctrl_rss, indirection_table); - sg_set_buf(&sgs[0], &vi->ctrl->rss, sg_buf_size); + sg_set_buf(&sgs[0], &vi->rss, sg_buf_size); - sg_buf_size = sizeof(uint16_t) * (vi->ctrl->rss.indirection_table_mask + 1); - sg_set_buf(&sgs[1], vi->ctrl->rss.indirection_table, sg_buf_size); + sg_buf_size = sizeof(uint16_t) * (vi->rss.indirection_table_mask + 1); + sg_set_buf(&sgs[1], vi->rss.indirection_table, sg_buf_size); sg_buf_size = offsetof(struct virtio_net_ctrl_rss, key) - offsetof(struct virtio_net_ctrl_rss, max_tx_vq); - sg_set_buf(&sgs[2], &vi->ctrl->rss.max_tx_vq, sg_buf_size); + sg_set_buf(&sgs[2], &vi->rss.max_tx_vq, sg_buf_size); sg_buf_size = vi->rss_key_size; - sg_set_buf(&sgs[3], vi->ctrl->rss.key, sg_buf_size); + sg_set_buf(&sgs[3], vi->rss.key, sg_buf_size); if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_MQ, vi->has_rss ? VIRTIO_NET_CTRL_MQ_RSS_CONFIG @@ -3113,21 +3113,21 @@ static void virtnet_init_default_rss(struct virtnet_info *vi) u32 indir_val = 0; int i = 0; - vi->ctrl->rss.hash_types = vi->rss_hash_types_supported; + vi->rss.hash_types = vi->rss_hash_types_supported; vi->rss_hash_types_saved = vi->rss_hash_types_supported; - vi->ctrl->rss.indirection_table_mask = vi->rss_indir_table_size + vi->rss.indirection_table_mask = vi->rss_indir_table_size ? vi->rss_indir_table_size - 1 : 0; - vi->ctrl->rss.unclassified_queue = 0; + vi->rss.unclassified_queue = 0; for (; i < vi->rss_indir_table_size; ++i) { indir_val = ethtool_rxfh_indir_default(i, vi->curr_queue_pairs); - vi->ctrl->rss.indirection_table[i] = indir_val; + vi->rss.indirection_table[i] = indir_val; } - vi->ctrl->rss.max_tx_vq = vi->has_rss ? vi->curr_queue_pairs : 0; - vi->ctrl->rss.hash_key_length = vi->rss_key_size; + vi->rss.max_tx_vq = vi->has_rss ? 
vi->curr_queue_pairs : 0; + vi->rss.hash_key_length = vi->rss_key_size; - netdev_rss_key_fill(vi->ctrl->rss.key, vi->rss_key_size); + netdev_rss_key_fill(vi->rss.key, vi->rss_key_size); } static void virtnet_get_hashflow(const struct virtnet_info *vi, struct ethtool_rxnfc *info) @@ -3238,7 +3238,7 @@ static bool virtnet_set_hashflow(struct virtnet_info *vi, struct ethtool_rxnfc * if (new_hashtypes != vi->rss_hash_types_saved) { vi->rss_hash_types_saved = new_hashtypes; - vi->ctrl->rss.hash_types = vi->rss_hash_types_saved; + vi->rss.hash_types = vi->rss_hash_types_saved; if (vi->dev->features & NETIF_F_RXHASH) return virtnet_commit_rss_command(vi); } @@ -3791,11 +3791,11 @@ static int virtnet_get_rxfh(struct net_device *dev, if (rxfh->indir) { for (i = 0; i < vi->rss_indir_table_size; ++i) - rxfh->indir[i] = vi->ctrl->rss.indirection_table[i]; + rxfh->indir[i] = vi->rss.indirection_table[i]; } if (rxfh->key) - memcpy(rxfh->key, vi->ctrl->rss.key, vi->rss_key_size); + memcpy(rxfh->key, vi->rss.key, vi->rss_key_size); rxfh->hfunc = ETH_RSS_HASH_TOP; @@ -3819,7 +3819,7 @@ static int virtnet_set_rxfh(struct net_device *dev,
[PATCH net-next v2 6/6] virtio_net: Remove rtnl lock protection of command buffers
The rtnl lock is no longer needed to protect the control buffer and command VQ. Signed-off-by: Daniel Jurgens Reviewed-by: Jiri Pirko --- drivers/net/virtio_net.c | 27 +-- 1 file changed, 5 insertions(+), 22 deletions(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index 859d767411f8..351d9107f472 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -2658,14 +2658,12 @@ static void virtnet_stats(struct net_device *dev, static void virtnet_ack_link_announce(struct virtnet_info *vi) { - rtnl_lock(); if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_ANNOUNCE, VIRTIO_NET_CTRL_ANNOUNCE_ACK, NULL)) dev_warn(&vi->dev->dev, "Failed to ack link announce.\n"); - rtnl_unlock(); } -static int _virtnet_set_queues(struct virtnet_info *vi, u16 queue_pairs) +static int virtnet_set_queues(struct virtnet_info *vi, u16 queue_pairs) { struct virtio_net_ctrl_mq *mq __free(kfree) = NULL; struct scatterlist sg; @@ -2696,16 +2694,6 @@ static int _virtnet_set_queues(struct virtnet_info *vi, u16 queue_pairs) return 0; } -static int virtnet_set_queues(struct virtnet_info *vi, u16 queue_pairs) -{ - int err; - - rtnl_lock(); - err = _virtnet_set_queues(vi, queue_pairs); - rtnl_unlock(); - return err; -} - static int virtnet_close(struct net_device *dev) { u8 *promisc_allmulti __free(kfree) = NULL; @@ -3311,7 +3299,7 @@ static int virtnet_set_channels(struct net_device *dev, return -EINVAL; cpus_read_lock(); - err = _virtnet_set_queues(vi, queue_pairs); + err = virtnet_set_queues(vi, queue_pairs); if (err) { cpus_read_unlock(); goto err; @@ -3604,13 +3592,10 @@ static void virtnet_rx_dim_work(struct work_struct *work) struct dim_cq_moder update_moder; int qnum, err; - if (!rtnl_trylock()) - return; - qnum = rq - vi->rq; if (!rq->dim_enabled) - goto out; + return; guard(spinlock)(&rq->intr_coal_lock); update_moder = net_dim_get_rx_moderation(dim->mode, dim->profile_ix); @@ -3624,8 +3609,6 @@ static void virtnet_rx_dim_work(struct work_struct *work) dev->name, qnum); dim->state = DIM_START_MEASURE; } -out: - rtnl_unlock(); } static int virtnet_coal_params_supported(struct ethtool_coalesce *ec) @@ -4077,7 +4060,7 @@ static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog, synchronize_net(); } - err = _virtnet_set_queues(vi, curr_qp + xdp_qp); + err = virtnet_set_queues(vi, curr_qp + xdp_qp); if (err) goto err; netif_set_real_num_rx_queues(dev, curr_qp + xdp_qp); @@ -4897,7 +4880,7 @@ static int virtnet_probe(struct virtio_device *vdev) virtio_device_ready(vdev); - _virtnet_set_queues(vi, vi->curr_queue_pairs); + virtnet_set_queues(vi, vi->curr_queue_pairs); /* a random MAC address has been assigned, notify the device. * We don't fail probe if VIRTIO_NET_F_CTRL_MAC_ADDR is not there -- 2.42.0
[PATCH net-next v2 4/6] virtio_net: Do DIM update for specified queue only
Since we no longer have to hold the RTNL lock here just do updates for the specified queue. Signed-off-by: Daniel Jurgens --- drivers/net/virtio_net.c | 38 ++ 1 file changed, 14 insertions(+), 24 deletions(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index b9298544b1b5..9c4bfb1eb15c 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -3596,36 +3596,26 @@ static void virtnet_rx_dim_work(struct work_struct *work) struct virtnet_info *vi = rq->vq->vdev->priv; struct net_device *dev = vi->dev; struct dim_cq_moder update_moder; - int i, qnum, err; + int qnum, err; if (!rtnl_trylock()) return; - /* Each rxq's work is queued by "net_dim()->schedule_work()" -* in response to NAPI traffic changes. Note that dim->profile_ix -* for each rxq is updated prior to the queuing action. -* So we only need to traverse and update profiles for all rxqs -* in the work which is holding rtnl_lock. -*/ - for (i = 0; i < vi->curr_queue_pairs; i++) { - rq = &vi->rq[i]; - dim = &rq->dim; - qnum = rq - vi->rq; + qnum = rq - vi->rq; - if (!rq->dim_enabled) - continue; + if (!rq->dim_enabled) + continue; - update_moder = net_dim_get_rx_moderation(dim->mode, dim->profile_ix); - if (update_moder.usec != rq->intr_coal.max_usecs || - update_moder.pkts != rq->intr_coal.max_packets) { - err = virtnet_send_rx_ctrl_coal_vq_cmd(vi, qnum, - update_moder.usec, - update_moder.pkts); - if (err) - pr_debug("%s: Failed to send dim parameters on rxq%d\n", -dev->name, qnum); - dim->state = DIM_START_MEASURE; - } + update_moder = net_dim_get_rx_moderation(dim->mode, dim->profile_ix); + if (update_moder.usec != rq->intr_coal.max_usecs || + update_moder.pkts != rq->intr_coal.max_packets) { + err = virtnet_send_rx_ctrl_coal_vq_cmd(vi, qnum, + update_moder.usec, + update_moder.pkts); + if (err) + pr_debug("%s: Failed to send dim parameters on rxq%d\n", +dev->name, qnum); + dim->state = DIM_START_MEASURE; } rtnl_unlock(); -- 2.42.0
[PATCH net-next v2 2/6] virtio_net: Remove command data from control_buf
Allocate memory for the data when it's used. Ideally the could be on the stack, but we can't DMA stack memory. With this change only the header and status memory are shared between commands, which will allow using a tighter lock than RTNL. Signed-off-by: Daniel Jurgens Reviewed-by: Jiri Pirko --- drivers/net/virtio_net.c | 111 ++- 1 file changed, 75 insertions(+), 36 deletions(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index 44525e9b09c5..ff93d18992e4 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -240,14 +240,6 @@ struct virtio_net_ctrl_rss { struct control_buf { struct virtio_net_ctrl_hdr hdr; virtio_net_ctrl_ack status; - struct virtio_net_ctrl_mq mq; - u8 promisc; - u8 allmulti; - __virtio16 vid; - __virtio64 offloads; - struct virtio_net_ctrl_coal_tx coal_tx; - struct virtio_net_ctrl_coal_rx coal_rx; - struct virtio_net_ctrl_coal_vq coal_vq; }; struct virtnet_info { @@ -2672,14 +2664,19 @@ static void virtnet_ack_link_announce(struct virtnet_info *vi) static int _virtnet_set_queues(struct virtnet_info *vi, u16 queue_pairs) { + struct virtio_net_ctrl_mq *mq __free(kfree) = NULL; struct scatterlist sg; struct net_device *dev = vi->dev; if (!vi->has_cvq || !virtio_has_feature(vi->vdev, VIRTIO_NET_F_MQ)) return 0; - vi->ctrl->mq.virtqueue_pairs = cpu_to_virtio16(vi->vdev, queue_pairs); - sg_init_one(&sg, &vi->ctrl->mq, sizeof(vi->ctrl->mq)); + mq = kzalloc(sizeof(*mq), GFP_KERNEL); + if (!mq) + return -ENOMEM; + + mq->virtqueue_pairs = cpu_to_virtio16(vi->vdev, queue_pairs); + sg_init_one(&sg, mq, sizeof(*mq)); if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_MQ, VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET, &sg)) { @@ -2708,6 +2705,7 @@ static int virtnet_set_queues(struct virtnet_info *vi, u16 queue_pairs) static int virtnet_close(struct net_device *dev) { + u8 *promisc_allmulti __free(kfree) = NULL; struct virtnet_info *vi = netdev_priv(dev); int i; @@ -2732,6 +2730,7 @@ static void virtnet_rx_mode_work(struct work_struct *work) struct scatterlist sg[2]; struct virtio_net_ctrl_mac *mac_data; struct netdev_hw_addr *ha; + u8 *promisc_allmulti; int uc_count; int mc_count; void *buf; @@ -2743,22 +2742,27 @@ static void virtnet_rx_mode_work(struct work_struct *work) rtnl_lock(); - vi->ctrl->promisc = ((dev->flags & IFF_PROMISC) != 0); - vi->ctrl->allmulti = ((dev->flags & IFF_ALLMULTI) != 0); + promisc_allmulti = kzalloc(sizeof(*promisc_allmulti), GFP_ATOMIC); + if (!promisc_allmulti) { + dev_warn(&dev->dev, "Failed to set RX mode, no memory.\n"); + return; + } - sg_init_one(sg, &vi->ctrl->promisc, sizeof(vi->ctrl->promisc)); + *promisc_allmulti = !!(dev->flags & IFF_PROMISC); + sg_init_one(sg, promisc_allmulti, sizeof(*promisc_allmulti)); if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_RX, VIRTIO_NET_CTRL_RX_PROMISC, sg)) dev_warn(&dev->dev, "Failed to %sable promisc mode.\n", -vi->ctrl->promisc ? "en" : "dis"); +*promisc_allmulti ? "en" : "dis"); - sg_init_one(sg, &vi->ctrl->allmulti, sizeof(vi->ctrl->allmulti)); + *promisc_allmulti = !!(dev->flags & IFF_ALLMULTI); + sg_init_one(sg, promisc_allmulti, sizeof(*promisc_allmulti)); if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_RX, VIRTIO_NET_CTRL_RX_ALLMULTI, sg)) dev_warn(&dev->dev, "Failed to %sable allmulti mode.\n", -vi->ctrl->allmulti ? "en" : "dis"); +*promisc_allmulti ? 
"en" : "dis"); netif_addr_lock_bh(dev); @@ -2819,10 +2823,15 @@ static int virtnet_vlan_rx_add_vid(struct net_device *dev, __be16 proto, u16 vid) { struct virtnet_info *vi = netdev_priv(dev); + __virtio16 *_vid __free(kfree) = NULL; struct scatterlist sg; - vi->ctrl->vid = cpu_to_virtio16(vi->vdev, vid); - sg_init_one(&sg, &vi->ctrl->vid, sizeof(vi->ctrl->vid)); + _vid = kzalloc(sizeof(*_vid), GFP_KERNEL); + if (!_vid) + return -ENOMEM; + + *_vid = cpu_to_virtio16(vi->vdev, vid); + sg_init_one(&sg, _vid, sizeof(*_vid)); if (!virtnet_send_command(vi, VIRTI
[PATCH net-next v2 5/6] virtio_net: Add a lock for per queue RX coalesce
Once the RTNL locking around the control buffer is removed there can be contention on the per queue RX interrupt coalescing data. Use a spin lock per queue. Signed-off-by: Daniel Jurgens --- drivers/net/virtio_net.c | 27 ++- 1 file changed, 18 insertions(+), 9 deletions(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index 9c4bfb1eb15c..859d767411f8 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -190,6 +190,7 @@ struct receive_queue { u32 packets_in_napi; struct virtnet_interrupt_coalesce intr_coal; + spinlock_t intr_coal_lock; /* Chain pages by the private ptr. */ struct page *pages; @@ -3087,11 +3088,13 @@ static int virtnet_set_ringparam(struct net_device *dev, return err; /* The reason is same as the transmit virtqueue reset */ - err = virtnet_send_rx_ctrl_coal_vq_cmd(vi, i, - vi->intr_coal_rx.max_usecs, - vi->intr_coal_rx.max_packets); - if (err) - return err; + scoped_guard(spinlock, &vi->rq[i].intr_coal_lock) { + err = virtnet_send_rx_ctrl_coal_vq_cmd(vi, i, + vi->intr_coal_rx.max_usecs, + vi->intr_coal_rx.max_packets); + if (err) + return err; + } } } @@ -3510,8 +3513,10 @@ static int virtnet_send_rx_notf_coal_cmds(struct virtnet_info *vi, vi->intr_coal_rx.max_usecs = ec->rx_coalesce_usecs; vi->intr_coal_rx.max_packets = ec->rx_max_coalesced_frames; for (i = 0; i < vi->max_queue_pairs; i++) { - vi->rq[i].intr_coal.max_usecs = ec->rx_coalesce_usecs; - vi->rq[i].intr_coal.max_packets = ec->rx_max_coalesced_frames; + scoped_guard(spinlock, &vi->rq[i].intr_coal_lock) { + vi->rq[i].intr_coal.max_usecs = ec->rx_coalesce_usecs; + vi->rq[i].intr_coal.max_packets = ec->rx_max_coalesced_frames; + } } return 0; @@ -3542,6 +3547,7 @@ static int virtnet_send_rx_notf_coal_vq_cmds(struct virtnet_info *vi, u32 max_usecs, max_packets; int err; + guard(spinlock)(&vi->rq[queue].intr_coal_lock); max_usecs = vi->rq[queue].intr_coal.max_usecs; max_packets = vi->rq[queue].intr_coal.max_packets; @@ -3604,8 +3610,9 @@ static void virtnet_rx_dim_work(struct work_struct *work) qnum = rq - vi->rq; if (!rq->dim_enabled) - continue; + goto out; + guard(spinlock)(&rq->intr_coal_lock); update_moder = net_dim_get_rx_moderation(dim->mode, dim->profile_ix); if (update_moder.usec != rq->intr_coal.max_usecs || update_moder.pkts != rq->intr_coal.max_packets) { @@ -3617,7 +3624,7 @@ static void virtnet_rx_dim_work(struct work_struct *work) dev->name, qnum); dim->state = DIM_START_MEASURE; } - +out: rtnl_unlock(); } @@ -3756,6 +3763,7 @@ static int virtnet_get_per_queue_coalesce(struct net_device *dev, return -EINVAL; if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_VQ_NOTF_COAL)) { + guard(spinlock)(&vi->rq[queue].intr_coal_lock); ec->rx_coalesce_usecs = vi->rq[queue].intr_coal.max_usecs; ec->tx_coalesce_usecs = vi->sq[queue].intr_coal.max_usecs; ec->tx_max_coalesced_frames = vi->sq[queue].intr_coal.max_packets; @@ -4485,6 +4493,7 @@ static int virtnet_alloc_queues(struct virtnet_info *vi) u64_stats_init(&vi->rq[i].stats.syncp); u64_stats_init(&vi->sq[i].stats.syncp); + spin_lock_init(&vi->rq[i].intr_coal_lock); } return 0; -- 2.42.0
[PATCH net-next v2 0/6] Remove RTNL lock protection of CVQ
Currently the buffer used for control VQ commands is protected by the RTNL lock. Previously this wasn't a major concern because the control VQ was only used during device setup and user interaction. With the recent addition of dynamic interrupt moderation the control VQ may be used frequently during normal operation.

This series removes the RTNL lock dependency by introducing a spin lock to protect the control buffer and the writing of SGs to the control VQ.

v2:
 - New patch to only process the provided queue in virtnet_dim_work.
 - New patch to lock the per queue rx coalescing structure.

Daniel Jurgens (6):
  virtio_net: Store RSS setting in virtnet_info
  virtio_net: Remove command data from control_buf
  virtio_net: Add a lock for the command VQ.
  virtio_net: Do DIM update for specified queue only
  virtio_net: Add a lock for per queue RX coalesce
  virtio_net: Remove rtnl lock protection of command buffers

 drivers/net/virtio_net.c | 243 +--
 1 file changed, 134 insertions(+), 109 deletions(-)

--
2.42.0
[PATCH net-next v2 3/6] virtio_net: Add a lock for the command VQ.
The command VQ will no longer be protected by the RTNL lock. Use a spinlock to protect the control buffer header and the VQ. Signed-off-by: Daniel Jurgens Reviewed-by: Jiri Pirko --- drivers/net/virtio_net.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index ff93d18992e4..b9298544b1b5 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -282,6 +282,7 @@ struct virtnet_info { /* Has control virtqueue */ bool has_cvq; + spinlock_t cvq_lock; /* Host can handle any s/g split between our header and packet data */ bool any_header_sg; @@ -2529,6 +2530,7 @@ static bool virtnet_send_command(struct virtnet_info *vi, u8 class, u8 cmd, /* Caller should know better */ BUG_ON(!virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_VQ)); + guard(spinlock)(&vi->cvq_lock); vi->ctrl->status = ~0; vi->ctrl->hdr.class = class; vi->ctrl->hdr.cmd = cmd; @@ -4800,8 +4802,10 @@ static int virtnet_probe(struct virtio_device *vdev) virtio_has_feature(vdev, VIRTIO_F_VERSION_1)) vi->any_header_sg = true; - if (virtio_has_feature(vdev, VIRTIO_NET_F_CTRL_VQ)) + if (virtio_has_feature(vdev, VIRTIO_NET_F_CTRL_VQ)) { vi->has_cvq = true; + spin_lock_init(&vi->cvq_lock); + } if (virtio_has_feature(vdev, VIRTIO_NET_F_MTU)) { mtu = virtio_cread16(vdev, -- 2.42.0
[PATCH net-next v2 1/6] virtio_net: Store RSS setting in virtnet_info
Stop storing RSS setting in the control buffer. This is prep work for removing RTNL lock protection of the control buffer. Signed-off-by: Daniel Jurgens Reviewed-by: Jiri Pirko --- drivers/net/virtio_net.c | 40 1 file changed, 20 insertions(+), 20 deletions(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index c22d1118a133..44525e9b09c5 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -245,7 +245,6 @@ struct control_buf { u8 allmulti; __virtio16 vid; __virtio64 offloads; - struct virtio_net_ctrl_rss rss; struct virtio_net_ctrl_coal_tx coal_tx; struct virtio_net_ctrl_coal_rx coal_rx; struct virtio_net_ctrl_coal_vq coal_vq; @@ -287,6 +286,7 @@ struct virtnet_info { u16 rss_indir_table_size; u32 rss_hash_types_supported; u32 rss_hash_types_saved; + struct virtio_net_ctrl_rss rss; /* Has control virtqueue */ bool has_cvq; @@ -3087,17 +3087,17 @@ static bool virtnet_commit_rss_command(struct virtnet_info *vi) sg_init_table(sgs, 4); sg_buf_size = offsetof(struct virtio_net_ctrl_rss, indirection_table); - sg_set_buf(&sgs[0], &vi->ctrl->rss, sg_buf_size); + sg_set_buf(&sgs[0], &vi->rss, sg_buf_size); - sg_buf_size = sizeof(uint16_t) * (vi->ctrl->rss.indirection_table_mask + 1); - sg_set_buf(&sgs[1], vi->ctrl->rss.indirection_table, sg_buf_size); + sg_buf_size = sizeof(uint16_t) * (vi->rss.indirection_table_mask + 1); + sg_set_buf(&sgs[1], vi->rss.indirection_table, sg_buf_size); sg_buf_size = offsetof(struct virtio_net_ctrl_rss, key) - offsetof(struct virtio_net_ctrl_rss, max_tx_vq); - sg_set_buf(&sgs[2], &vi->ctrl->rss.max_tx_vq, sg_buf_size); + sg_set_buf(&sgs[2], &vi->rss.max_tx_vq, sg_buf_size); sg_buf_size = vi->rss_key_size; - sg_set_buf(&sgs[3], vi->ctrl->rss.key, sg_buf_size); + sg_set_buf(&sgs[3], vi->rss.key, sg_buf_size); if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_MQ, vi->has_rss ? VIRTIO_NET_CTRL_MQ_RSS_CONFIG @@ -3113,21 +3113,21 @@ static void virtnet_init_default_rss(struct virtnet_info *vi) u32 indir_val = 0; int i = 0; - vi->ctrl->rss.hash_types = vi->rss_hash_types_supported; + vi->rss.hash_types = vi->rss_hash_types_supported; vi->rss_hash_types_saved = vi->rss_hash_types_supported; - vi->ctrl->rss.indirection_table_mask = vi->rss_indir_table_size + vi->rss.indirection_table_mask = vi->rss_indir_table_size ? vi->rss_indir_table_size - 1 : 0; - vi->ctrl->rss.unclassified_queue = 0; + vi->rss.unclassified_queue = 0; for (; i < vi->rss_indir_table_size; ++i) { indir_val = ethtool_rxfh_indir_default(i, vi->curr_queue_pairs); - vi->ctrl->rss.indirection_table[i] = indir_val; + vi->rss.indirection_table[i] = indir_val; } - vi->ctrl->rss.max_tx_vq = vi->has_rss ? vi->curr_queue_pairs : 0; - vi->ctrl->rss.hash_key_length = vi->rss_key_size; + vi->rss.max_tx_vq = vi->has_rss ? 
vi->curr_queue_pairs : 0; + vi->rss.hash_key_length = vi->rss_key_size; - netdev_rss_key_fill(vi->ctrl->rss.key, vi->rss_key_size); + netdev_rss_key_fill(vi->rss.key, vi->rss_key_size); } static void virtnet_get_hashflow(const struct virtnet_info *vi, struct ethtool_rxnfc *info) @@ -3238,7 +3238,7 @@ static bool virtnet_set_hashflow(struct virtnet_info *vi, struct ethtool_rxnfc * if (new_hashtypes != vi->rss_hash_types_saved) { vi->rss_hash_types_saved = new_hashtypes; - vi->ctrl->rss.hash_types = vi->rss_hash_types_saved; + vi->rss.hash_types = vi->rss_hash_types_saved; if (vi->dev->features & NETIF_F_RXHASH) return virtnet_commit_rss_command(vi); } @@ -3791,11 +3791,11 @@ static int virtnet_get_rxfh(struct net_device *dev, if (rxfh->indir) { for (i = 0; i < vi->rss_indir_table_size; ++i) - rxfh->indir[i] = vi->ctrl->rss.indirection_table[i]; + rxfh->indir[i] = vi->rss.indirection_table[i]; } if (rxfh->key) - memcpy(rxfh->key, vi->ctrl->rss.key, vi->rss_key_size); + memcpy(rxfh->key, vi->rss.key, vi->rss_key_size); rxfh->hfunc = ETH_RSS_HASH_TOP; @@ -3815,10 +3815,10 @@ static int virtnet_set_rxfh(struct net_device *dev, if (r
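As a rough sketch of what virtnet_commit_rss_command() does once the RSS state lives in virtnet_info: a scatterlist is built where each entry points at a separately owned slice of the payload, so nothing needs to sit in the shared control buffer. The struct below is a simplified stand-in, not the real virtio_net_ctrl_rss layout:

#include <linux/scatterlist.h>
#include <linux/stddef.h>
#include <linux/types.h>

struct example_rss {
	u32 hash_types;
	u16 indirection_table_mask;
	u16 unclassified_queue;
	u16 indirection_table[128];	/* real size comes from the device */
	u8 key[40];
};

/* Build a 3-entry scatterlist over one example_rss instance. */
static void example_fill_sgs(struct example_rss *rss, struct scatterlist *sgs)
{
	sg_init_table(sgs, 3);

	/* Header fields up to (not including) the indirection table. */
	sg_set_buf(&sgs[0], rss, offsetof(struct example_rss, indirection_table));

	/* Only the entries the mask covers are sent. */
	sg_set_buf(&sgs[1], rss->indirection_table,
		   sizeof(rss->indirection_table[0]) *
		   (rss->indirection_table_mask + 1));

	/* Hash key last. */
	sg_set_buf(&sgs[2], rss->key, sizeof(rss->key));
}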
[PATCH net-next 4/4] virtio_net: Remove rtnl lock protection of command buffers
The rtnl lock is no longer needed to protect the control buffer and command VQ. Signed-off-by: Daniel Jurgens Reviewed-by: Jiri Pirko --- drivers/net/virtio_net.c | 27 +-- 1 file changed, 5 insertions(+), 22 deletions(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index 41f8dc16ff38..d09ea20b16be 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -2639,14 +2639,12 @@ static void virtnet_stats(struct net_device *dev, static void virtnet_ack_link_announce(struct virtnet_info *vi) { - rtnl_lock(); if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_ANNOUNCE, VIRTIO_NET_CTRL_ANNOUNCE_ACK, NULL)) dev_warn(&vi->dev->dev, "Failed to ack link announce.\n"); - rtnl_unlock(); } -static int _virtnet_set_queues(struct virtnet_info *vi, u16 queue_pairs) +static int virtnet_set_queues(struct virtnet_info *vi, u16 queue_pairs) { struct virtio_net_ctrl_mq *mq __free(kfree) = NULL; struct scatterlist sg; @@ -2677,16 +2675,6 @@ static int _virtnet_set_queues(struct virtnet_info *vi, u16 queue_pairs) return 0; } -static int virtnet_set_queues(struct virtnet_info *vi, u16 queue_pairs) -{ - int err; - - rtnl_lock(); - err = _virtnet_set_queues(vi, queue_pairs); - rtnl_unlock(); - return err; -} - static int virtnet_close(struct net_device *dev) { struct virtnet_info *vi = netdev_priv(dev); @@ -3268,7 +3256,7 @@ static int virtnet_set_channels(struct net_device *dev, return -EINVAL; cpus_read_lock(); - err = _virtnet_set_queues(vi, queue_pairs); + err = virtnet_set_queues(vi, queue_pairs); if (err) { cpus_read_unlock(); goto err; @@ -3558,14 +3546,11 @@ static void virtnet_rx_dim_work(struct work_struct *work) struct dim_cq_moder update_moder; int i, qnum, err; - if (!rtnl_trylock()) - return; - /* Each rxq's work is queued by "net_dim()->schedule_work()" * in response to NAPI traffic changes. Note that dim->profile_ix * for each rxq is updated prior to the queuing action. * So we only need to traverse and update profiles for all rxqs -* in the work which is holding rtnl_lock. +* in the work. */ for (i = 0; i < vi->curr_queue_pairs; i++) { rq = &vi->rq[i]; @@ -3587,8 +3572,6 @@ static void virtnet_rx_dim_work(struct work_struct *work) dim->state = DIM_START_MEASURE; } } - - rtnl_unlock(); } static int virtnet_coal_params_supported(struct ethtool_coalesce *ec) @@ -4036,7 +4019,7 @@ static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog, synchronize_net(); } - err = _virtnet_set_queues(vi, curr_qp + xdp_qp); + err = virtnet_set_queues(vi, curr_qp + xdp_qp); if (err) goto err; netif_set_real_num_rx_queues(dev, curr_qp + xdp_qp); @@ -4852,7 +4835,7 @@ static int virtnet_probe(struct virtio_device *vdev) virtio_device_ready(vdev); - _virtnet_set_queues(vi, vi->curr_queue_pairs); + virtnet_set_queues(vi, vi->curr_queue_pairs); /* a random MAC address has been assigned, notify the device. * We don't fail probe if VIRTIO_NET_F_CTRL_MAC_ADDR is not there -- 2.42.0
[PATCH net-next 3/4] virtio_net: Add a lock for the command VQ.
The command VQ will no longer be protected by the RTNL lock. Use a spinlock to protect the control buffer header and the VQ. Signed-off-by: Daniel Jurgens Reviewed-by: Jiri Pirko --- drivers/net/virtio_net.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index 6780479a1e6c..41f8dc16ff38 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -277,6 +277,7 @@ struct virtnet_info { /* Has control virtqueue */ bool has_cvq; + spinlock_t cvq_lock; /* Host can handle any s/g split between our header and packet data */ bool any_header_sg; @@ -2513,6 +2514,7 @@ static bool virtnet_send_command(struct virtnet_info *vi, u8 class, u8 cmd, /* Caller should know better */ BUG_ON(!virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_VQ)); + guard(spinlock)(&vi->cvq_lock); vi->ctrl->status = ~0; vi->ctrl->hdr.class = class; vi->ctrl->hdr.cmd = cmd; @@ -4756,8 +4758,10 @@ static int virtnet_probe(struct virtio_device *vdev) virtio_has_feature(vdev, VIRTIO_F_VERSION_1)) vi->any_header_sg = true; - if (virtio_has_feature(vdev, VIRTIO_NET_F_CTRL_VQ)) + if (virtio_has_feature(vdev, VIRTIO_NET_F_CTRL_VQ)) { vi->has_cvq = true; + spin_lock_init(&vi->cvq_lock); + } if (virtio_has_feature(vdev, VIRTIO_NET_F_MTU)) { mtu = virtio_cread16(vdev, -- 2.42.0
[PATCH net-next 2/4] virtio_net: Remove command data from control_buf
Allocate memory for the data when it's used. Ideally the could be on the stack, but we can't DMA stack memory. With this change only the header and status memory are shared between commands, which will allow using a tighter lock than RTNL. Signed-off-by: Daniel Jurgens Reviewed-by: Jiri Pirko --- drivers/net/virtio_net.c | 110 ++- 1 file changed, 74 insertions(+), 36 deletions(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index 7419a68cf6e2..6780479a1e6c 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -235,14 +235,6 @@ struct virtio_net_ctrl_rss { struct control_buf { struct virtio_net_ctrl_hdr hdr; virtio_net_ctrl_ack status; - struct virtio_net_ctrl_mq mq; - u8 promisc; - u8 allmulti; - __virtio16 vid; - __virtio64 offloads; - struct virtio_net_ctrl_coal_tx coal_tx; - struct virtio_net_ctrl_coal_rx coal_rx; - struct virtio_net_ctrl_coal_vq coal_vq; }; struct virtnet_info { @@ -2654,14 +2646,19 @@ static void virtnet_ack_link_announce(struct virtnet_info *vi) static int _virtnet_set_queues(struct virtnet_info *vi, u16 queue_pairs) { + struct virtio_net_ctrl_mq *mq __free(kfree) = NULL; struct scatterlist sg; struct net_device *dev = vi->dev; if (!vi->has_cvq || !virtio_has_feature(vi->vdev, VIRTIO_NET_F_MQ)) return 0; - vi->ctrl->mq.virtqueue_pairs = cpu_to_virtio16(vi->vdev, queue_pairs); - sg_init_one(&sg, &vi->ctrl->mq, sizeof(vi->ctrl->mq)); + mq = kzalloc(sizeof(*mq), GFP_KERNEL); + if (!mq) + return -ENOMEM; + + mq->virtqueue_pairs = cpu_to_virtio16(vi->vdev, queue_pairs); + sg_init_one(&sg, mq, sizeof(*mq)); if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_MQ, VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET, &sg)) { @@ -2708,6 +2705,7 @@ static int virtnet_close(struct net_device *dev) static void virtnet_set_rx_mode(struct net_device *dev) { + u8 *promisc_allmulti __free(kfree) = NULL; struct virtnet_info *vi = netdev_priv(dev); struct scatterlist sg[2]; struct virtio_net_ctrl_mac *mac_data; @@ -2721,22 +2719,27 @@ static void virtnet_set_rx_mode(struct net_device *dev) if (!virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_RX)) return; - vi->ctrl->promisc = ((dev->flags & IFF_PROMISC) != 0); - vi->ctrl->allmulti = ((dev->flags & IFF_ALLMULTI) != 0); + promisc_allmulti = kzalloc(sizeof(*promisc_allmulti), GFP_ATOMIC); + if (!promisc_allmulti) { + dev_warn(&dev->dev, "Failed to set RX mode, no memory.\n"); + return; + } - sg_init_one(sg, &vi->ctrl->promisc, sizeof(vi->ctrl->promisc)); + *promisc_allmulti = !!(dev->flags & IFF_PROMISC); + sg_init_one(sg, promisc_allmulti, sizeof(*promisc_allmulti)); if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_RX, VIRTIO_NET_CTRL_RX_PROMISC, sg)) dev_warn(&dev->dev, "Failed to %sable promisc mode.\n", -vi->ctrl->promisc ? "en" : "dis"); +*promisc_allmulti ? "en" : "dis"); - sg_init_one(sg, &vi->ctrl->allmulti, sizeof(vi->ctrl->allmulti)); + *promisc_allmulti = !!(dev->flags & IFF_ALLMULTI); + sg_init_one(sg, promisc_allmulti, sizeof(*promisc_allmulti)); if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_RX, VIRTIO_NET_CTRL_RX_ALLMULTI, sg)) dev_warn(&dev->dev, "Failed to %sable allmulti mode.\n", -vi->ctrl->allmulti ? "en" : "dis"); +*promisc_allmulti ? 
"en" : "dis"); uc_count = netdev_uc_count(dev); mc_count = netdev_mc_count(dev); @@ -2780,10 +2783,15 @@ static int virtnet_vlan_rx_add_vid(struct net_device *dev, __be16 proto, u16 vid) { struct virtnet_info *vi = netdev_priv(dev); + __virtio16 *_vid __free(kfree) = NULL; struct scatterlist sg; - vi->ctrl->vid = cpu_to_virtio16(vi->vdev, vid); - sg_init_one(&sg, &vi->ctrl->vid, sizeof(vi->ctrl->vid)); + _vid = kzalloc(sizeof(*_vid), GFP_KERNEL); + if (!_vid) + return -ENOMEM; + + *_vid = cpu_to_virtio16(vi->vdev, vid); + sg_init_one(&sg, _vid, sizeof(*_vid)); if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_VLAN, VIRTIO_NET_CTRL_VLAN_ADD, &sg)) @@ -2795,10 +2803,15 @@ static int virtnet_
[PATCH net-next 1/4] virtio_net: Store RSS setting in virtnet_info
Stop storing RSS setting in the control buffer. This is prep work for removing RTNL lock protection of the control buffer. Signed-off-by: Daniel Jurgens Reviewed-by: Jiri Pirko --- drivers/net/virtio_net.c | 40 1 file changed, 20 insertions(+), 20 deletions(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index d7ce4a1011ea..7419a68cf6e2 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -240,7 +240,6 @@ struct control_buf { u8 allmulti; __virtio16 vid; __virtio64 offloads; - struct virtio_net_ctrl_rss rss; struct virtio_net_ctrl_coal_tx coal_tx; struct virtio_net_ctrl_coal_rx coal_rx; struct virtio_net_ctrl_coal_vq coal_vq; @@ -282,6 +281,7 @@ struct virtnet_info { u16 rss_indir_table_size; u32 rss_hash_types_supported; u32 rss_hash_types_saved; + struct virtio_net_ctrl_rss rss; /* Has control virtqueue */ bool has_cvq; @@ -3048,17 +3048,17 @@ static bool virtnet_commit_rss_command(struct virtnet_info *vi) sg_init_table(sgs, 4); sg_buf_size = offsetof(struct virtio_net_ctrl_rss, indirection_table); - sg_set_buf(&sgs[0], &vi->ctrl->rss, sg_buf_size); + sg_set_buf(&sgs[0], &vi->rss, sg_buf_size); - sg_buf_size = sizeof(uint16_t) * (vi->ctrl->rss.indirection_table_mask + 1); - sg_set_buf(&sgs[1], vi->ctrl->rss.indirection_table, sg_buf_size); + sg_buf_size = sizeof(uint16_t) * (vi->rss.indirection_table_mask + 1); + sg_set_buf(&sgs[1], vi->rss.indirection_table, sg_buf_size); sg_buf_size = offsetof(struct virtio_net_ctrl_rss, key) - offsetof(struct virtio_net_ctrl_rss, max_tx_vq); - sg_set_buf(&sgs[2], &vi->ctrl->rss.max_tx_vq, sg_buf_size); + sg_set_buf(&sgs[2], &vi->rss.max_tx_vq, sg_buf_size); sg_buf_size = vi->rss_key_size; - sg_set_buf(&sgs[3], vi->ctrl->rss.key, sg_buf_size); + sg_set_buf(&sgs[3], vi->rss.key, sg_buf_size); if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_MQ, vi->has_rss ? VIRTIO_NET_CTRL_MQ_RSS_CONFIG @@ -3074,21 +3074,21 @@ static void virtnet_init_default_rss(struct virtnet_info *vi) u32 indir_val = 0; int i = 0; - vi->ctrl->rss.hash_types = vi->rss_hash_types_supported; + vi->rss.hash_types = vi->rss_hash_types_supported; vi->rss_hash_types_saved = vi->rss_hash_types_supported; - vi->ctrl->rss.indirection_table_mask = vi->rss_indir_table_size + vi->rss.indirection_table_mask = vi->rss_indir_table_size ? vi->rss_indir_table_size - 1 : 0; - vi->ctrl->rss.unclassified_queue = 0; + vi->rss.unclassified_queue = 0; for (; i < vi->rss_indir_table_size; ++i) { indir_val = ethtool_rxfh_indir_default(i, vi->curr_queue_pairs); - vi->ctrl->rss.indirection_table[i] = indir_val; + vi->rss.indirection_table[i] = indir_val; } - vi->ctrl->rss.max_tx_vq = vi->has_rss ? vi->curr_queue_pairs : 0; - vi->ctrl->rss.hash_key_length = vi->rss_key_size; + vi->rss.max_tx_vq = vi->has_rss ? 
vi->curr_queue_pairs : 0; + vi->rss.hash_key_length = vi->rss_key_size; - netdev_rss_key_fill(vi->ctrl->rss.key, vi->rss_key_size); + netdev_rss_key_fill(vi->rss.key, vi->rss_key_size); } static void virtnet_get_hashflow(const struct virtnet_info *vi, struct ethtool_rxnfc *info) @@ -3199,7 +3199,7 @@ static bool virtnet_set_hashflow(struct virtnet_info *vi, struct ethtool_rxnfc * if (new_hashtypes != vi->rss_hash_types_saved) { vi->rss_hash_types_saved = new_hashtypes; - vi->ctrl->rss.hash_types = vi->rss_hash_types_saved; + vi->rss.hash_types = vi->rss_hash_types_saved; if (vi->dev->features & NETIF_F_RXHASH) return virtnet_commit_rss_command(vi); } @@ -3752,11 +3752,11 @@ static int virtnet_get_rxfh(struct net_device *dev, if (rxfh->indir) { for (i = 0; i < vi->rss_indir_table_size; ++i) - rxfh->indir[i] = vi->ctrl->rss.indirection_table[i]; + rxfh->indir[i] = vi->rss.indirection_table[i]; } if (rxfh->key) - memcpy(rxfh->key, vi->ctrl->rss.key, vi->rss_key_size); + memcpy(rxfh->key, vi->rss.key, vi->rss_key_size); rxfh->hfunc = ETH_RSS_HASH_TOP; @@ -3776,10 +3776,10 @@ static int virtnet_set_rxfh(struct net_device *dev, if (r
[PATCH net-next 0/4] Remove RTNL lock protection of CVQ
Currently the buffer used for control VQ commands is protected by the RTNL lock. Previously this wasn't a major concern because the control VQ was only used during device setup and user interaction. With the recent addition of dynamic interrupt moderation, the control VQ may be used frequently during normal operation. This series removes the RTNL lock dependency by introducing a spinlock to protect the control buffer and the writing of SGs to the control VQ. Daniel Jurgens (4): virtio_net: Store RSS setting in virtnet_info virtio_net: Remove command data from control_buf virtio_net: Add a lock for the command VQ. virtio_net: Remove rtnl lock protection of command buffers drivers/net/virtio_net.c | 185 ++- 1 file changed, 104 insertions(+), 81 deletions(-) -- 2.42.0
RE: [PATCH net-next] virtio_net: Add TX stop and wake counters
> From: Michael S. Tsirkin > Sent: Wednesday, February 7, 2024 2:19 PM > To: Daniel Jurgens > Subject: Re: [PATCH net-next] virtio_net: Add TX stop and wake counters > > On Wed, Feb 07, 2024 at 07:38:16PM +0000, Daniel Jurgens wrote: > > > From: Michael S. Tsirkin > > > Sent: Sunday, February 4, 2024 6:40 AM > > > To: Jason Wang > > > Cc: Jakub Kicinski ; Jason Xing > > > ; Daniel Jurgens ; > > > netdev@vger.kernel.org; xuanz...@linux.alibaba.com; > > > virtualizat...@lists.linux.dev; da...@davemloft.net; > > > eduma...@google.com; ab...@redhat.com; Parav Pandit > > > > > > Subject: Re: [PATCH net-next] virtio_net: Add TX stop and wake > > > counters > > > > > > On Sun, Feb 04, 2024 at 09:20:18AM +0800, Jason Wang wrote: > > > > On Sat, Feb 3, 2024 at 12:01 AM Jakub Kicinski > wrote: > > > > > > > > > > On Fri, 2 Feb 2024 14:52:59 +0800 Jason Xing wrote: > > > > > > > Can you say more? I'm curious what's your use case. > > > > > > > > > > > > I'm not working at Nvidia, so my point of view may differ from > theirs. > > > > > > From what I can tell is that those two counters help me narrow > > > > > > down the range if I have to diagnose/debug some issues. > > > > > > > > > > right, i'm asking to collect useful debugging tricks, nothing > > > > > against the patch itself :) > > > > > > > > > > > 1) I sometimes notice that if some irq is held too long (say, > > > > > > one simple case: output of printk printed to the console), > > > > > > those two counters can reflect the issue. > > > > > > 2) Similarly in virtio net, recently I traced such counters > > > > > > the current kernel does not have and it turned out that one of > > > > > > the output queues in the backend behaves badly. > > > > > > ... > > > > > > > > > > > > Stop/wake queue counters may not show directly the root cause > > > > > > of the issue, but help us 'guess' to some extent. > > > > > > > > > > I'm surprised you say you can detect stall-related issues with this. > > > > > I guess virtio doesn't have BQL support, which makes it special. > > > > > > > > Yes, virtio-net has a legacy orphan mode, this is something that > > > > needs to be dropped in the future. This would make BQL much more > > > > easier to be implemented. > > > > > > > > > It's not that we can't implement BQL, it's that it does not seem to > > > be benefitial - has been discussed many times. > > > > > > > > Normal HW drivers with BQL almost never stop the queue by > themselves. > > > > > I mean - if they do, and BQL is active, then the system is > > > > > probably misconfigured (queue is too short). This is what we use > > > > > at Meta to detect stalls in drivers with BQL: > > > > > > > > > > https://lore.kernel.org/all/20240131102150.728960-3-leitao@debia > > > > > n.or > > > > > g/ > > > > > > > > > > Daniel, I think this may be a good enough excuse to add > > > > > per-queue stats to the netdev genl family, if you're up for > > > > > that. LMK if you want more info, otherwise I guess ethtool -S is fine > for now. > > > > > > > > > > > > > Thanks > > > > Michael, > > Are you OK with this patch? Unless I missed it I didn't see a response > from you in our conversation the day I sent it. > > > > I thought what is proposed is adding some support for these stats to core? > Did I misunderstood? > That's a much bigger change and going that route I think still need to count them at the driver level. I said I could potentially take that on as a background project. But would prefer to go with ethtool -S for now. > -- > MST
RE: [PATCH net-next] virtio_net: Add TX stop and wake counters
> From: Michael S. Tsirkin > Sent: Sunday, February 4, 2024 6:40 AM > To: Jason Wang > Cc: Jakub Kicinski ; Jason Xing > ; Daniel Jurgens ; > netdev@vger.kernel.org; xuanz...@linux.alibaba.com; > virtualizat...@lists.linux.dev; da...@davemloft.net; > eduma...@google.com; ab...@redhat.com; Parav Pandit > > Subject: Re: [PATCH net-next] virtio_net: Add TX stop and wake counters > > On Sun, Feb 04, 2024 at 09:20:18AM +0800, Jason Wang wrote: > > On Sat, Feb 3, 2024 at 12:01 AM Jakub Kicinski wrote: > > > > > > On Fri, 2 Feb 2024 14:52:59 +0800 Jason Xing wrote: > > > > > Can you say more? I'm curious what's your use case. > > > > > > > > I'm not working at Nvidia, so my point of view may differ from theirs. > > > > From what I can tell is that those two counters help me narrow > > > > down the range if I have to diagnose/debug some issues. > > > > > > right, i'm asking to collect useful debugging tricks, nothing > > > against the patch itself :) > > > > > > > 1) I sometimes notice that if some irq is held too long (say, one > > > > simple case: output of printk printed to the console), those two > > > > counters can reflect the issue. > > > > 2) Similarly in virtio net, recently I traced such counters the > > > > current kernel does not have and it turned out that one of the > > > > output queues in the backend behaves badly. > > > > ... > > > > > > > > Stop/wake queue counters may not show directly the root cause of > > > > the issue, but help us 'guess' to some extent. > > > > > > I'm surprised you say you can detect stall-related issues with this. > > > I guess virtio doesn't have BQL support, which makes it special. > > > > Yes, virtio-net has a legacy orphan mode, this is something that needs > > to be dropped in the future. This would make BQL much more easier to > > be implemented. > > > It's not that we can't implement BQL, it's that it does not seem to be > benefitial - has been discussed many times. > > > > Normal HW drivers with BQL almost never stop the queue by themselves. > > > I mean - if they do, and BQL is active, then the system is probably > > > misconfigured (queue is too short). This is what we use at Meta to > > > detect stalls in drivers with BQL: > > > > > > https://lore.kernel.org/all/20240131102150.728960-3-lei...@debian.or > > > g/ > > > > > > Daniel, I think this may be a good enough excuse to add per-queue > > > stats to the netdev genl family, if you're up for that. LMK if you > > > want more info, otherwise I guess ethtool -S is fine for now. > > > > > > > Thanks Michael, Are you OK with this patch? Unless I missed it I didn't see a response from you in our conversation the day I sent it.
RE: [PATCH net-next] virtio_net: Add TX stop and wake counters
> From: Jakub Kicinski > Sent: Friday, February 2, 2024 10:01 AM > Subject: Re: [PATCH net-next] virtio_net: Add TX stop and wake counters > > On Fri, 2 Feb 2024 14:52:59 +0800 Jason Xing wrote: > > > Can you say more? I'm curious what's your use case. > > > > I'm not working at Nvidia, so my point of view may differ from theirs. > > From what I can tell is that those two counters help me narrow down > > the range if I have to diagnose/debug some issues. > > right, i'm asking to collect useful debugging tricks, nothing against the > patch > itself :) > > > 1) I sometimes notice that if some irq is held too long (say, one > > simple case: output of printk printed to the console), those two > > counters can reflect the issue. > > 2) Similarly in virtio net, recently I traced such counters the > > current kernel does not have and it turned out that one of the output > > queues in the backend behaves badly. > > ... > > > > Stop/wake queue counters may not show directly the root cause of the > > issue, but help us 'guess' to some extent. > > I'm surprised you say you can detect stall-related issues with this. > I guess virtio doesn't have BQL support, which makes it special. > Normal HW drivers with BQL almost never stop the queue by themselves. > I mean - if they do, and BQL is active, then the system is probably > misconfigured (queue is too short). This is what we use at Meta to detect > stalls in drivers with BQL: > > https://lore.kernel.org/all/20240131102150.728960-3-lei...@debian.org/ > > Daniel, I think this may be a good enough excuse to add per-queue stats to > the netdev genl family, if you're up for that. LMK if you want more info, > otherwise I guess ethtool -S is fine for now. For now, I would like to go with ethtool -S. But I can take on the netdev level as a background task. Are you suggesting adding them to rtnl_link_stats64?
RE: [PATCH net-next] virtio_net: Add TX stop and wake counters
> From: Michael S. Tsirkin > Sent: Tuesday, January 30, 2024 9:53 AM > On Tue, Jan 30, 2024 at 03:50:29PM +0000, Daniel Jurgens wrote: > > > From: Michael S. Tsirkin > > > Sent: Tuesday, January 30, 2024 9:42 AM On Tue, Jan 30, 2024 at > > > 03:40:21PM +, Daniel Jurgens wrote: > > > > > From: Michael S. Tsirkin > > > > > Sent: Tuesday, January 30, 2024 8:58 AM > > > > > > > > > > On Tue, Jan 30, 2024 at 08:25:21AM -0600, Daniel Jurgens wrote: > > > > > > Add a tx queue stop and wake counters, they are useful for > debugging. > > > > > > > > > > > > $ ethtool -S ens5f2 | grep 'tx_stop\|tx_wake' > > > > > > ... > > > > > > tx_queue_1_tx_stop: 16726 > > > > > > tx_queue_1_tx_wake: 16726 > > > > > > ... > > > > > > tx_queue_8_tx_stop: 1500110 > > > > > > tx_queue_8_tx_wake: 1500110 > > > > > > > > > > > > Signed-off-by: Daniel Jurgens > > > > > > Reviewed-by: Parav Pandit > > > > > > > > > > Hmm isn't one always same as the other, except when queue is > stopped? > > > > > And when it is stopped you can see that in the status? > > > > > So how is having two useful? > > > > > > > > At idle the counters will be the same, unless a tx_timeout occurs. > > > > But > > > under load they can be monitored to see which queues are stopped and > > > get an idea of how long they are stopped. > > > > > > how does it give you the idea of how long they are stopped? > > > > By serially monitoring the counter you can see stops that persist long > intervals that are less than the tx_timeout time. > > Why don't you monitor queue status directly? How? I don't know of any interface to check if a queue is stopped. > > > > > > > > Other net drivers (not all), also have the wake counter. > > > > > > Examples? > > > > [danielj@sw-mtx-051 upstream]$ ethtool -i ens2f1np1 > > driver: mlx5_core > > version: 6.7.0+ > > ... > > [danielj@sw-mtx-051 upstream]$ ethtool -S ens2f1np1 | grep wake > > tx_queue_wake: 0 > > tx0_wake: 0 > > Do they have a stop counter too? Yes: [danielj@sw-mtx-051 upstream]$ ethtool -S ens2f1np1 | grep 'stop\|wake' tx_queue_stopped: 0 tx_queue_wake: 0 tx0_stopped: 0 tx0_wake: 0 > > > > > > > > In my opinion it makes the stop counter more useful, at little cost. > > > > > > > > > > > > > > > > > > > > --- > > > > > > drivers/net/virtio_net.c | 26 -- > > > > > > 1 file changed, 24 insertions(+), 2 deletions(-) > > > > > > > > > > > > diff --git a/drivers/net/virtio_net.c > > > > > > b/drivers/net/virtio_net.c index 3cb8aa193884..7e3c31ceaf7e > > > > > > 100644 > > > > > > --- a/drivers/net/virtio_net.c > > > > > > +++ b/drivers/net/virtio_net.c > > > > > > @@ -88,6 +88,8 @@ struct virtnet_sq_stats { > > > > > > u64_stats_t xdp_tx_drops; > > > > > > u64_stats_t kicks; > > > > > > u64_stats_t tx_timeouts; > > > > > > + u64_stats_t tx_stop; > > > > > > + u64_stats_t tx_wake; > > > > > > }; > > > > > > > > > > > > struct virtnet_rq_stats { > > > > > > @@ -112,6 +114,8 @@ static const struct virtnet_stat_desc > > > > > virtnet_sq_stats_desc[] = { > > > > > > { "xdp_tx_drops", VIRTNET_SQ_STAT(xdp_tx_drops) }, > > > > > > { "kicks", VIRTNET_SQ_STAT(kicks) }, > > > > > > { "tx_timeouts",VIRTNET_SQ_STAT(tx_timeouts) }, > > > > > > + { "tx_stop",VIRTNET_SQ_STAT(tx_stop) }, > > > > > > + { "tx_wake",VIRTNET_SQ_STAT(tx_wake) }, > > > > > > }; > > > > > > > > > > > > static const struct virtnet_stat_desc virtnet_rq_stats_desc[] > > > > > > = { @@ > > > > > > -843,6 +847,9 @@ static void check_sq_full_and_disable(struct > > > > > > virtnet_info > > > > > *vi, > >
RE: [PATCH net-next] virtio_net: Add TX stop and wake counters
> From: Michael S. Tsirkin > Sent: Tuesday, January 30, 2024 9:42 AM > On Tue, Jan 30, 2024 at 03:40:21PM +0000, Daniel Jurgens wrote: > > > From: Michael S. Tsirkin > > > Sent: Tuesday, January 30, 2024 8:58 AM > > > > > > On Tue, Jan 30, 2024 at 08:25:21AM -0600, Daniel Jurgens wrote: > > > > Add a tx queue stop and wake counters, they are useful for debugging. > > > > > > > > $ ethtool -S ens5f2 | grep 'tx_stop\|tx_wake' > > > > ... > > > > tx_queue_1_tx_stop: 16726 > > > > tx_queue_1_tx_wake: 16726 > > > > ... > > > > tx_queue_8_tx_stop: 1500110 > > > > tx_queue_8_tx_wake: 1500110 > > > > > > > > Signed-off-by: Daniel Jurgens > > > > Reviewed-by: Parav Pandit > > > > > > Hmm isn't one always same as the other, except when queue is stopped? > > > And when it is stopped you can see that in the status? > > > So how is having two useful? > > > > At idle the counters will be the same, unless a tx_timeout occurs. But > under load they can be monitored to see which queues are stopped and get > an idea of how long they are stopped. > > how does it give you the idea of how long they are stopped? By serially monitoring the counter you can see stops that persist long intervals that are less than the tx_timeout time. > > > Other net drivers (not all), also have the wake counter. > > Examples? [danielj@sw-mtx-051 upstream]$ ethtool -i ens2f1np1 driver: mlx5_core version: 6.7.0+ ... [danielj@sw-mtx-051 upstream]$ ethtool -S ens2f1np1 | grep wake tx_queue_wake: 0 tx0_wake: 0 > > > In my opinion it makes the stop counter more useful, at little cost. > > > > > > > > > > > > --- > > > > drivers/net/virtio_net.c | 26 -- > > > > 1 file changed, 24 insertions(+), 2 deletions(-) > > > > > > > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c > > > > index 3cb8aa193884..7e3c31ceaf7e 100644 > > > > --- a/drivers/net/virtio_net.c > > > > +++ b/drivers/net/virtio_net.c > > > > @@ -88,6 +88,8 @@ struct virtnet_sq_stats { > > > > u64_stats_t xdp_tx_drops; > > > > u64_stats_t kicks; > > > > u64_stats_t tx_timeouts; > > > > + u64_stats_t tx_stop; > > > > + u64_stats_t tx_wake; > > > > }; > > > > > > > > struct virtnet_rq_stats { > > > > @@ -112,6 +114,8 @@ static const struct virtnet_stat_desc > > > virtnet_sq_stats_desc[] = { > > > > { "xdp_tx_drops", VIRTNET_SQ_STAT(xdp_tx_drops) }, > > > > { "kicks", VIRTNET_SQ_STAT(kicks) }, > > > > { "tx_timeouts",VIRTNET_SQ_STAT(tx_timeouts) }, > > > > + { "tx_stop",VIRTNET_SQ_STAT(tx_stop) }, > > > > + { "tx_wake",VIRTNET_SQ_STAT(tx_wake) }, > > > > }; > > > > > > > > static const struct virtnet_stat_desc virtnet_rq_stats_desc[] = { > > > > @@ > > > > -843,6 +847,9 @@ static void check_sq_full_and_disable(struct > > > > virtnet_info > > > *vi, > > > > */ > > > > if (sq->vq->num_free < 2+MAX_SKB_FRAGS) { > > > > netif_stop_subqueue(dev, qnum); > > > > + u64_stats_update_begin(&sq->stats.syncp); > > > > + u64_stats_inc(&sq->stats.tx_stop); > > > > + u64_stats_update_end(&sq->stats.syncp); > > > > if (use_napi) { > > > > if > > > > (unlikely(!virtqueue_enable_cb_delayed(sq->vq))) > > > > virtqueue_napi_schedule(&sq->napi, sq- > > > > vq); > @@ -851,6 +858,9 > > > >@@ static void check_sq_full_and_disable(struct virtnet_info *vi, > > > > free_old_xmit_skbs(sq, false); > > > > if (sq->vq->num_free >= 2+MAX_SKB_FRAGS) { > > > > netif_start_subqueue(dev, qnum); > > > > + > > > > u64_stats_update_begin(&sq->stats.syncp
RE: [PATCH net-next] virtio_net: Add TX stop and wake counters
> From: Heng Qi > Sent: Tuesday, January 30, 2024 9:17 AM > 在 2024/1/30 下午10:25, Daniel Jurgens 写道: > > Add a tx queue stop and wake counters, they are useful for debugging. > > > > $ ethtool -S ens5f2 | grep 'tx_stop\|tx_wake' > > ... > > tx_queue_1_tx_stop: 16726 > > tx_queue_1_tx_wake: 16726 > > ... > > tx_queue_8_tx_stop: 1500110 > > tx_queue_8_tx_wake: 1500110 > > > > Signed-off-by: Daniel Jurgens > > Reviewed-by: Parav Pandit > > --- > > drivers/net/virtio_net.c | 26 -- > > 1 file changed, 24 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index > > 3cb8aa193884..7e3c31ceaf7e 100644 > > --- a/drivers/net/virtio_net.c > > +++ b/drivers/net/virtio_net.c > > @@ -88,6 +88,8 @@ struct virtnet_sq_stats { > > u64_stats_t xdp_tx_drops; > > u64_stats_t kicks; > > u64_stats_t tx_timeouts; > > + u64_stats_t tx_stop; > > + u64_stats_t tx_wake; > > }; > > Hi Daniel! > > tx_stop/wake only counts the status in the I/O path. > Do the status of virtnet_config_changed_work and virtnet_tx_resize need to > be counted? > My motivation for the counter is detecting full TX queues. I don't think counting them in the control path is useful, but it can be done if you disagree. > Thanks, > Heng > > > > > struct virtnet_rq_stats { > > @@ -112,6 +114,8 @@ static const struct virtnet_stat_desc > virtnet_sq_stats_desc[] = { > > { "xdp_tx_drops", VIRTNET_SQ_STAT(xdp_tx_drops) }, > > { "kicks", VIRTNET_SQ_STAT(kicks) }, > > { "tx_timeouts",VIRTNET_SQ_STAT(tx_timeouts) }, > > + { "tx_stop",VIRTNET_SQ_STAT(tx_stop) }, > > + { "tx_wake",VIRTNET_SQ_STAT(tx_wake) }, > > }; > > > > static const struct virtnet_stat_desc virtnet_rq_stats_desc[] = { @@ > > -843,6 +847,9 @@ static void check_sq_full_and_disable(struct virtnet_info > *vi, > > */ > > if (sq->vq->num_free < 2+MAX_SKB_FRAGS) { > > netif_stop_subqueue(dev, qnum); > > + u64_stats_update_begin(&sq->stats.syncp); > > + u64_stats_inc(&sq->stats.tx_stop); > > + u64_stats_update_end(&sq->stats.syncp); > > if (use_napi) { > > if (unlikely(!virtqueue_enable_cb_delayed(sq->vq))) > > virtqueue_napi_schedule(&sq->napi, sq- > >vq); @@ -851,6 +858,9 @@ > > static void check_sq_full_and_disable(struct virtnet_info *vi, > > free_old_xmit_skbs(sq, false); > > if (sq->vq->num_free >= 2+MAX_SKB_FRAGS) { > > netif_start_subqueue(dev, qnum); > > + u64_stats_update_begin(&sq->stats.syncp); > > + u64_stats_inc(&sq->stats.tx_wake); > > + u64_stats_update_end(&sq->stats.syncp); > > virtqueue_disable_cb(sq->vq); > > } > > } > > @@ -2163,8 +2173,14 @@ static void virtnet_poll_cleantx(struct > receive_queue *rq) > > free_old_xmit_skbs(sq, true); > > } while (unlikely(!virtqueue_enable_cb_delayed(sq->vq))); > > > > - if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS) > > + if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS) { > > + if (netif_tx_queue_stopped(txq)) { > > + u64_stats_update_begin(&sq->stats.syncp); > > + u64_stats_inc(&sq->stats.tx_wake); > > + u64_stats_update_end(&sq->stats.syncp); > > + } > > netif_tx_wake_queue(txq); > > + } > > > > __netif_tx_unlock(txq); > > } > > @@ -2310,8 +2326,14 @@ static int virtnet_poll_tx(struct napi_struct > *napi, int budget) > > virtqueue_disable_cb(sq->vq); > > free_old_xmit_skbs(sq, true); > > > > - if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS) > > + if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS) { > > + if (netif_tx_queue_stopped(txq)) { > > + u64_stats_update_begin(&sq->stats.syncp); > > + u64_stats_inc(&sq->stats.tx_wake); > > + u64_stats_update_end(&sq->stats.syncp); > > + } > > netif_tx_wake_queue(txq); > 
> + } > > > > opaque = virtqueue_enable_cb_prepare(sq->vq); > >
RE: [PATCH net-next] virtio_net: Add TX stop and wake counters
> From: Michael S. Tsirkin > Sent: Tuesday, January 30, 2024 8:58 AM > > On Tue, Jan 30, 2024 at 08:25:21AM -0600, Daniel Jurgens wrote: > > Add a tx queue stop and wake counters, they are useful for debugging. > > > > $ ethtool -S ens5f2 | grep 'tx_stop\|tx_wake' > > ... > > tx_queue_1_tx_stop: 16726 > > tx_queue_1_tx_wake: 16726 > > ... > > tx_queue_8_tx_stop: 1500110 > > tx_queue_8_tx_wake: 1500110 > > > > Signed-off-by: Daniel Jurgens > > Reviewed-by: Parav Pandit > > Hmm isn't one always same as the other, except when queue is stopped? > And when it is stopped you can see that in the status? > So how is having two useful? At idle the counters will be the same, unless a tx_timeout occurs. But under load they can be monitored to see which queues are stopped and get an idea of how long they are stopped. Other net drivers (not all), also have the wake counter. In my opinion it makes the stop counter more useful, at little cost. > > > > --- > > drivers/net/virtio_net.c | 26 -- > > 1 file changed, 24 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index > > 3cb8aa193884..7e3c31ceaf7e 100644 > > --- a/drivers/net/virtio_net.c > > +++ b/drivers/net/virtio_net.c > > @@ -88,6 +88,8 @@ struct virtnet_sq_stats { > > u64_stats_t xdp_tx_drops; > > u64_stats_t kicks; > > u64_stats_t tx_timeouts; > > + u64_stats_t tx_stop; > > + u64_stats_t tx_wake; > > }; > > > > struct virtnet_rq_stats { > > @@ -112,6 +114,8 @@ static const struct virtnet_stat_desc > virtnet_sq_stats_desc[] = { > > { "xdp_tx_drops", VIRTNET_SQ_STAT(xdp_tx_drops) }, > > { "kicks", VIRTNET_SQ_STAT(kicks) }, > > { "tx_timeouts",VIRTNET_SQ_STAT(tx_timeouts) }, > > + { "tx_stop",VIRTNET_SQ_STAT(tx_stop) }, > > + { "tx_wake",VIRTNET_SQ_STAT(tx_wake) }, > > }; > > > > static const struct virtnet_stat_desc virtnet_rq_stats_desc[] = { @@ > > -843,6 +847,9 @@ static void check_sq_full_and_disable(struct virtnet_info > *vi, > > */ > > if (sq->vq->num_free < 2+MAX_SKB_FRAGS) { > > netif_stop_subqueue(dev, qnum); > > + u64_stats_update_begin(&sq->stats.syncp); > > + u64_stats_inc(&sq->stats.tx_stop); > > + u64_stats_update_end(&sq->stats.syncp); > > if (use_napi) { > > if (unlikely(!virtqueue_enable_cb_delayed(sq->vq))) > > virtqueue_napi_schedule(&sq->napi, sq- > >vq); @@ -851,6 +858,9 @@ > > static void check_sq_full_and_disable(struct virtnet_info *vi, > > free_old_xmit_skbs(sq, false); > > if (sq->vq->num_free >= 2+MAX_SKB_FRAGS) { > > netif_start_subqueue(dev, qnum); > > + u64_stats_update_begin(&sq->stats.syncp); > > + u64_stats_inc(&sq->stats.tx_wake); > > + u64_stats_update_end(&sq->stats.syncp); > > virtqueue_disable_cb(sq->vq); > > } > > } > > @@ -2163,8 +2173,14 @@ static void virtnet_poll_cleantx(struct > receive_queue *rq) > > free_old_xmit_skbs(sq, true); > > } while (unlikely(!virtqueue_enable_cb_delayed(sq->vq))); > > > > - if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS) > > + if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS) { > > + if (netif_tx_queue_stopped(txq)) { > > + u64_stats_update_begin(&sq->stats.syncp); > > + u64_stats_inc(&sq->stats.tx_wake); > > + u64_stats_update_end(&sq->stats.syncp); > > + } > > netif_tx_wake_queue(txq); > > + } > > > > __netif_tx_unlock(txq); > > } > > @@ -2310,8 +2326,14 @@ static int virtnet_poll_tx(struct napi_struct > *napi, int budget) > > virtqueue_disable_cb(sq->vq); > > free_old_xmit_skbs(sq, true); > > > > - if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS) > > + if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS) { > > + if (netif_tx_queue_stopped(txq)) { > 
> + u64_stats_update_begin(&sq->stats.syncp); > > + u64_stats_inc(&sq->stats.tx_wake); > > + u64_stats_update_end(&sq->stats.syncp); > > + } > > netif_tx_wake_queue(txq); > > + } > > > > opaque = virtqueue_enable_cb_prepare(sq->vq); > > > > -- > > 2.42.0
[PATCH net-next] virtio_net: Add TX stop and wake counters
Add a tx queue stop and wake counters, they are useful for debugging. $ ethtool -S ens5f2 | grep 'tx_stop\|tx_wake' ... tx_queue_1_tx_stop: 16726 tx_queue_1_tx_wake: 16726 ... tx_queue_8_tx_stop: 1500110 tx_queue_8_tx_wake: 1500110 Signed-off-by: Daniel Jurgens Reviewed-by: Parav Pandit --- drivers/net/virtio_net.c | 26 -- 1 file changed, 24 insertions(+), 2 deletions(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index 3cb8aa193884..7e3c31ceaf7e 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -88,6 +88,8 @@ struct virtnet_sq_stats { u64_stats_t xdp_tx_drops; u64_stats_t kicks; u64_stats_t tx_timeouts; + u64_stats_t tx_stop; + u64_stats_t tx_wake; }; struct virtnet_rq_stats { @@ -112,6 +114,8 @@ static const struct virtnet_stat_desc virtnet_sq_stats_desc[] = { { "xdp_tx_drops", VIRTNET_SQ_STAT(xdp_tx_drops) }, { "kicks", VIRTNET_SQ_STAT(kicks) }, { "tx_timeouts",VIRTNET_SQ_STAT(tx_timeouts) }, + { "tx_stop",VIRTNET_SQ_STAT(tx_stop) }, + { "tx_wake",VIRTNET_SQ_STAT(tx_wake) }, }; static const struct virtnet_stat_desc virtnet_rq_stats_desc[] = { @@ -843,6 +847,9 @@ static void check_sq_full_and_disable(struct virtnet_info *vi, */ if (sq->vq->num_free < 2+MAX_SKB_FRAGS) { netif_stop_subqueue(dev, qnum); + u64_stats_update_begin(&sq->stats.syncp); + u64_stats_inc(&sq->stats.tx_stop); + u64_stats_update_end(&sq->stats.syncp); if (use_napi) { if (unlikely(!virtqueue_enable_cb_delayed(sq->vq))) virtqueue_napi_schedule(&sq->napi, sq->vq); @@ -851,6 +858,9 @@ static void check_sq_full_and_disable(struct virtnet_info *vi, free_old_xmit_skbs(sq, false); if (sq->vq->num_free >= 2+MAX_SKB_FRAGS) { netif_start_subqueue(dev, qnum); + u64_stats_update_begin(&sq->stats.syncp); + u64_stats_inc(&sq->stats.tx_wake); + u64_stats_update_end(&sq->stats.syncp); virtqueue_disable_cb(sq->vq); } } @@ -2163,8 +2173,14 @@ static void virtnet_poll_cleantx(struct receive_queue *rq) free_old_xmit_skbs(sq, true); } while (unlikely(!virtqueue_enable_cb_delayed(sq->vq))); - if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS) + if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS) { + if (netif_tx_queue_stopped(txq)) { + u64_stats_update_begin(&sq->stats.syncp); + u64_stats_inc(&sq->stats.tx_wake); + u64_stats_update_end(&sq->stats.syncp); + } netif_tx_wake_queue(txq); + } __netif_tx_unlock(txq); } @@ -2310,8 +2326,14 @@ static int virtnet_poll_tx(struct napi_struct *napi, int budget) virtqueue_disable_cb(sq->vq); free_old_xmit_skbs(sq, true); - if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS) + if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS) { + if (netif_tx_queue_stopped(txq)) { + u64_stats_update_begin(&sq->stats.syncp); + u64_stats_inc(&sq->stats.tx_wake); + u64_stats_update_end(&sq->stats.syncp); + } netif_tx_wake_queue(txq); + } opaque = virtqueue_enable_cb_prepare(sq->vq); -- 2.42.0
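For context on the counter update pattern used throughout the diff: u64_stats_update_begin()/u64_stats_update_end() bracket the increment with a seqcount so a 32-bit reader never observes a torn 64-bit value, and they compile down to almost nothing on 64-bit SMP builds. A simplified writer/reader pair, not the driver code itself (the syncp would be initialized with u64_stats_init() at setup time):

#include <linux/u64_stats_sync.h>

struct example_stats {
	struct u64_stats_sync syncp;
	u64_stats_t stop;
};

static void example_count_stop(struct example_stats *stats)
{
	u64_stats_update_begin(&stats->syncp);
	u64_stats_inc(&stats->stop);
	u64_stats_update_end(&stats->syncp);
}

static u64 example_read_stop(struct example_stats *stats)
{
	unsigned int start;
	u64 val;

	do {
		start = u64_stats_fetch_begin(&stats->syncp);
		val = u64_stats_read(&stats->stop);
	} while (u64_stats_fetch_retry(&stats->syncp, start));

	return val;
}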
Re: [BUG] ethernet:mellanox:mlx5: Oops in health_recover get_nic_state(dev)
On 3/28/2017 4:11 AM, Saeed Mahameed wrote: > On Tue, Mar 28, 2017 at 2:45 AM, Goel, Sameer wrote: >> Stack frame: >> [ 1744.418958] [] get_nic_state+0x24/0x40 [mlx5_core] >> [ 1744.425273] [] health_recover+0x28/0x80 [mlx5_core] >> [ 1744.431496] [] process_one_work+0x150/0x460 >> [ 1744.437218] [] worker_thread+0x50/0x4b8 >> [ 1744.442609] [] kthread+0xd8/0xf0 >> [ 1744.447377] [] ret_from_fork+0x10/0x20 >> >> Summary: >> This issue was seen on QDF2400 system 30 mins after while running speccpu >> 2006. During the test a recoverable PCIe error was seen that gave the >> following log: >> [ 1673.170969] pcieport 0002:00:00.0: aer_status: 0x4000, aer_mask: >> 0x0040 >> [ 1673.177961] pcieport 0002:00:00.0: aer_layer=Transaction Layer, >> aer_agent=Requester ID >> [ 1673.185832] pcieport 0002:00:00.0: aer_uncor_severity: 0x00462030 >> [ 1675.536391] mlx5_core 0002:01:00.0: assert_var[0] 0x >> [ 1675.541093] mlx5_core 0002:01:00.0: assert_var[1] 0x >> [ 1675.546750] mlx5_core 0002:01:00.0: assert_var[2] 0x >> [ 1675.552377] mlx5_core 0002:01:00.0: assert_var[3] 0x >> [ 1675.558040] mlx5_core 0002:01:00.0: assert_var[4] 0x >> [ 1675.563661] mlx5_core 0002:01:00.0: assert_exit_ptr 0x >> [ 1675.569488] mlx5_core 0002:01:00.0: assert_callra 0x >> [ 1675.575120] mlx5_core 0002:01:00.0: fw_ver 15.4095.65535 >> [ 1675.580426] mlx5_core 0002:01:00.0: hw_id 0x >> [ 1675.585363] mlx5_core 0002:01:00.0: irisc_index 255 >> [ 1675.590242] mlx5_core 0002:01:00.0: synd 0xff: unrecognized error >> [ 1675.596301] mlx5_core 0002:01:00.0: ext_synd 0x >> [ 1675.601209] mlx5_core 0002:01:00.0: mlx5_enter_error_state:120:(pid >> 7205): start >> [ 1675.608613] mlx5_core 0002:01:00.0: mlx5_enter_error_state:127:(pid >> 7205): end >> >> After the above log we see the above stackframe and a page fault due to >> invalid dev pointer. >> >> So the the recovery work is queued and the timer is stopped. Somehow the >> workqueue is not cleared and when it runs the dev pointer is invalid. >> >> This issue was difficult to repro and was seen only once in multiple runs on >> a specific device. > Hi Sameer, > > Thanks for the report, > adding more relevant ppl > > Mohamad/Daniel Does the above ring a bell ? > can you check ? > > Thanks > Saeed. Hi Sameer, Can you tell me if you have these 2 patches? 5e44fca504705 ('net/mlx5: Only cancel recovery work when cleaning up device') 689a248df83b ("net/mlx5: Cancel recovery work in remove flow")
Re: [net-next 2/8] net/mlx5: Configure cache line size for start and end padding
On 2/1/2017 5:12 AM, David Laight wrote: > From: Saeed Mahameed >> Sent: 31 January 2017 20:59 >> From: Daniel Jurgens >> >> There is a hardware feature that will pad the start or end of a DMA to >> be cache line aligned to avoid RMWs on the last cache line. The default >> cache line size setting for this feature is 64B. This change configures >> the hardware to use 128B alignment on systems with 128B cache lines. > What guarantees that the extra bytes are actually inside the receive skb's > head and tail room? > > David > > The hardware won't overwrite the length of the posted buffer. This feature is already enabled and defaults to a 64B stride; this patch just configures it properly for 128B cache line sizes. Thanks for reviewing it. Dan
Re: [PATCH net-next 3/9] net/mlx4_core: Set EQ affinity hint to local NUMA CPUs
On 1/16/2017 3:59 PM, Or Gerlitz wrote: > On Mon, Jan 16, 2017 at 11:54 PM, Daniel Jurgens wrote: >> On 1/16/2017 3:44 PM, Or Gerlitz wrote: >>> On Mon, Jan 16, 2017 at 7:29 PM, Tariq Toukan wrote: >>>> From: Daniel Jurgens >>>> >>>> Use CPUs on the close NUMA when setting the EQ affinity hints. >>> Dan, are we sure there are no down-sides for always doing this? this >>> code is probably there for many years and we're introducing here new >>> behaviour to potentially to many Ms installs when they get distro >>> update that includes this patch. > >> I don't see a downside, this just favors using the node local CPUs before >> others. > OK, so this just favors before others and not limits (not in front of > the code now)? would be good to improve the change log and make this > clear. Right, doesn't limit. Just favors the local node. I can work with Tariq to update the commit message if you think it necessary.
Re: [PATCH net-next 3/9] net/mlx4_core: Set EQ affinity hint to local NUMA CPUs
On 1/16/2017 3:44 PM, Or Gerlitz wrote: > On Mon, Jan 16, 2017 at 7:29 PM, Tariq Toukan wrote: >> From: Daniel Jurgens >> >> Use CPUs on the close NUMA when setting the EQ affinity hints. > Dan, are we sure there are no down-sides for always doing this? this > code is probably there for many years and we're introducing here new > behaviour to potentially to many Ms installs when they get distro > update that includes this patch. > I don't see a downside, this just favors using the node local CPUs before others. I don't understand your 2nd sentence there. "Ms installs"?