On 02/08/16 12:58, Adrien Mazarguil wrote:
> On Tue, Aug 02, 2016 at 12:31:35PM +0300, Sagi Grimberg wrote:
>>
>>
>> On 01/08/16 19:43, Adrien Mazarguil wrote:
>>> Hi Sagi,
>>>
>>> On Mon, Aug 01, 2016 at 11:44:21AM +0300, Sagi Grimberg wrote:
>>>> The user is allowed to call ->rx_pkt_burst() even without free
>>>> mbufs in the pool. In this scenario we'll fail allocating a rep mbuf
>>>> on the first iteration (where pkt is still NULL). This would cause us
>>>> to deref a NULL pkt (reset refcount and free).
>>>>
>>>> Fix this by checking the pkt before freeing it.
>>>
>>> Just to be sure, did you get an actual NULL deref crash here or is that an
>>> assumed possibility?
>>>
>>> I'm asking because this problem was supposed to be addressed by:
>>>
>>> a1bdb71a32da ("net/mlx5: fix crash in Rx")
>>
>> I actually got the NULL deref. This happens when the application doesn't
>> restore mbufs to the pool correctly. In the case rte_mbuf_raw_alloc
>> will fail on the first iteration (pkt wasn't assigned) unlike the
>> condition handled in a1bdb71a32da.
>>
>> With this applied, I didn't see the crash.
>
> Thanks for confirming this,

Hey Adrien, I just noticed that I missed the rest of your response
in the previous message (pre-coffee mail browsing...). Your analysis
was spot on.

> now what about the different approach I
> suggested in my previous message to avoid the extra check in the inner loop:
>
>  if (!pkt)
>      pkt = seg;
>  while (pkt != seg) {
>      ...
>  }

We can go this way, but it looks kinda confusing to set pkt = seg and
then iterate on pkt != seg. How about a more explicit approach:
--
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index fce3381ae87a..37573668e43e 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -1572,6 +1572,14 @@ mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
 		rte_prefetch0(wqe);
 		rep = rte_mbuf_raw_alloc(rxq->mp);
 		if (unlikely(rep == NULL)) {
+			++rxq->stats.rx_nombuf;
+			if (!pkt) {
+				/*
+				 * no buffers before we even started,
+				 * bail out silently.
+				 */
+				break;
+			}
 			while (pkt != seg) {
 				assert(pkt != (*rxq->elts)[idx]);
 				seg = NEXT(pkt);
@@ -1579,7 +1587,6 @@ mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
 				__rte_mbuf_raw_free(pkt);
 				pkt = seg;
 			}
-			++rxq->stats.rx_nombuf;
 			break;
 		}
 		if (!pkt) {
--

> Also the fixes line in your commit message?

I'll add it in v2.

Thanks.
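
P.S. For anyone following the thread without the driver source at hand, here is a minimal standalone sketch of the control flow I have in mind. It is not the mlx5 code: struct mock_mbuf, struct mock_stats, mock_free() and handle_alloc_failure() are hypothetical stand-ins for the mbuf, rxq->stats, the raw free and the "rep == NULL" branch, and the loop is simplified to walk a chain from pkt up to (but not including) seg. It only shows the ordering proposed above: count the failure first, return early if no packet is in progress, and only then release segments.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rte_mbuf: only the chain link is modelled. */
struct mock_mbuf {
	struct mock_mbuf *next;
	int freed; /* set by mock_free() instead of returning to a pool */
};

/* Hypothetical stand-in for the per-queue statistics. */
struct mock_stats {
	unsigned long rx_nombuf;
};

/* Hypothetical stand-in for the raw free: just marks the segment. */
static void
mock_free(struct mock_mbuf *m)
{
	m->freed = 1;
}

/*
 * Simplified model of the allocation-failure branch.
 * Counts the failure, then releases the segments of the packet being
 * built, from pkt up to (not including) seg.
 * Returns the number of segments released.
 */
static unsigned int
handle_alloc_failure(struct mock_stats *stats, struct mock_mbuf *pkt,
		     struct mock_mbuf *seg)
{
	unsigned int released = 0;

	++stats->rx_nombuf;
	if (pkt == NULL) {
		/* No packet in progress: nothing to release, bail out. */
		return 0;
	}
	while (pkt != seg) {
		struct mock_mbuf *next = pkt->next;

		mock_free(pkt);
		++released;
		pkt = next;
	}
	return released;
}
```

In this model, dropping the pkt == NULL check would make the loop read pkt->next through a NULL pointer whenever seg is non-NULL, which is the kind of crash I hit. With the check in place the failure is still counted and nothing is touched.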