On 08-10-2025 20:58, Aditya Garg wrote:
On 08-10-2025 20:51, Eric Dumazet wrote:
On Wed, Oct 8, 2025 at 8:16 AM Aditya Garg
<[email protected]> wrote:

On 03-10-2025 21:45, Eric Dumazet wrote:
On Fri, Oct 3, 2025 at 8:47 AM Aditya Garg
<[email protected]> wrote:

The MANA hardware supports a maximum of 30 scatter-gather entries (SGEs)
per TX WQE. In rare configurations where MAX_SKB_FRAGS + 2 exceeds this
limit, the driver drops the skb. Add a check in mana_start_xmit() to
detect such cases and linearize the SKB before transmission.

Return NETDEV_TX_BUSY only for -ENOSPC from mana_gd_post_work_request(),
send other errors to free_sgl_ptr to free resources and record the tx
drop.

Signed-off-by: Aditya Garg <[email protected]>
Reviewed-by: Dipayaan Roy <[email protected]>
---
   drivers/net/ethernet/microsoft/mana/mana_en.c | 26 +++++++++++++++----
   include/net/mana/gdma.h                       |  8 +++++-
   include/net/mana/mana.h                       |  1 +
   3 files changed, 29 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
index f4fc86f20213..22605753ca84 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -20,6 +20,7 @@

   #include <net/mana/mana.h>
   #include <net/mana/mana_auxiliary.h>
+#include <linux/skbuff.h>

   static DEFINE_IDA(mana_adev_ida);

@@ -289,6 +290,19 @@ netdev_tx_t mana_start_xmit(struct sk_buff *skb, struct net_device *ndev)
          cq = &apc->tx_qp[txq_idx].tx_cq;
          tx_stats = &txq->stats;

+       BUILD_BUG_ON(MAX_TX_WQE_SGL_ENTRIES != MANA_MAX_TX_WQE_SGL_ENTRIES);
+       #if (MAX_SKB_FRAGS + 2 > MANA_MAX_TX_WQE_SGL_ENTRIES)
+               if (skb_shinfo(skb)->nr_frags + 2 > MANA_MAX_TX_WQE_SGL_ENTRIES) {
+                       netdev_info_once(ndev,
+                                        "nr_frags %d exceeds max supported sge limit. Attempting skb_linearize\n",
+                                        skb_shinfo(skb)->nr_frags);
+                       if (skb_linearize(skb)) {

This will fail in many cases.

This sort of check is better done in ndo_features_check()

Most probably this would occur for GSO packets, so we can fall back to
software segmentation to avoid this big and risky kmalloc() by all
means.

Look at idpf_features_check()  which has something similar.

Hi Eric,
Thank you for your review. I understand your concerns regarding the use
of skb_linearize() in the xmit path, as it can fail under memory
pressure and introduces additional overhead in the transmit path. Based
on your input, I will work on a v2 that moves the SGE limit check to
the ndo_features_check() path and, for GSO skbs exceeding the hw limit,
disables NETIF_F_GSO_MASK to enforce software segmentation in the
kernel before the call to xmit.
Also, for non-GSO skbs exceeding the SGE hw limit, should we use
skb_linearize() only in that case, or would you suggest some other
approach here?

I think that for non GSO, the linearization attempt is fine.

Note that this is extremely unlikely for non-malicious users, and with
the MTU usually being small (9K or less), the allocation will be much
smaller than for a GSO packet.

Okay. Will send a v2
Hi Eric,
I tested the code by disabling GSO in ndo_features_check when the
number of SGEs exceeds the hardware limit, using iperf for a single TCP
connection with zerocopy enabled. I noticed a significant difference in
throughput compared to when we linearize the skbs. For reference, the
throughput is 35.6 Gbits/sec when using skb_linearize, but drops to
6.75 Gbits/sec when disabling GSO per skb.

Hence, we propose linearizing skbs until the first failure occurs.
After that, we switch to a fail-safe mode by disabling GSO for skbs
with SGE count > hw limit via the ndo_features_check implementation,
while continuing to apply skb_linearize() for non-GSO packets that
exceed the hardware limit. This ensures we remain on the optimal
performance path initially, and only transition to the fail-safe path
after encountering a failure.
Regards,
Aditya
