On 2/7/2024 6:46 AM, Rahul Bhansali wrote:
> 
> 
>> -----Original Message-----
>> From: Ferruh Yigit <ferruh.yi...@amd.com>
>> Sent: Tuesday, February 6, 2024 11:55 PM
>> To: Rahul Bhansali <rbhans...@marvell.com>; dev@dpdk.org; Radu Nicolau
>> <radu.nico...@intel.com>; Akhil Goyal <gak...@marvell.com>; Konstantin
>> Ananyev <konstantin.anan...@huawei.com>; Anoob Joseph
>> <ano...@marvell.com>
>> Subject: [EXT] Re: [PATCH] examples/ipsec-secgw: fix IPsec performance drop
>>
>> On 2/6/2024 12:38 PM, Rahul Bhansali wrote:
>>> Freeing a single packet via rte_pktmbuf_free_bulk() degrades
>>> performance: on cn10k, a drop of up to ~4% was observed for the IPsec
>>> event mode single SA outbound case.
>>>
>>> To fix this, free a single packet with the rte_pktmbuf_free() API
>>> instead.
>>>
>>> Fixes: bd7c063561b3 ("examples/ipsec-secgw: use bulk free")
>>>
>>> Signed-off-by: Rahul Bhansali <rbhans...@marvell.com>
>>> ---
>>>  examples/ipsec-secgw/ipsec-secgw.h | 7 +++----
>>>  1 file changed, 3 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/examples/ipsec-secgw/ipsec-secgw.h b/examples/ipsec-secgw/ipsec-secgw.h
>>> index 8baab44ee7..ec33a982df 100644
>>> --- a/examples/ipsec-secgw/ipsec-secgw.h
>>> +++ b/examples/ipsec-secgw/ipsec-secgw.h
>>> @@ -229,11 +229,10 @@ free_reassembly_fail_pkt(struct rte_mbuf *mb)
>>>  }
>>>
>>>  /* helper routine to free bulk of packets */
>>> -static inline void
>>> -free_pkts(struct rte_mbuf *mb[], uint32_t n)
>>> +static __rte_always_inline void
>>> +free_pkts(struct rte_mbuf *mb[], const uint32_t n)
>>>  {
>>> -   rte_pktmbuf_free_bulk(mb, n);
>>> -
>>> +   n == 1 ? rte_pktmbuf_free(mb[0]) : rte_pktmbuf_free_bulk(mb, n);
>>>     core_stats_update_drop(n);
>>>  }
>>>
>>
>> Hi Rahul,
>>
>> Do you think the 'rte_pktmbuf_free_bulk()' API performance can be
>> improved by a similar change?
> 
> Hi Ferruh,
> Currently 'rte_pktmbuf_free_bulk()' is not inline. If we make it inline,
> along with '__rte_pktmbuf_free_seg_via_array()', then performance can be
> improved similarly.
>

Ah, so the performance improvement comes from 'rte_pktmbuf_free()' being
inline, OK.
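
(For context, the API shapes behind this, as also implied above:
'rte_pktmbuf_free()' is a static inline in the mbuf header, while
'rte_pktmbuf_free_bulk()' is an exported function, so each call pays a
real function-call cost.)
```
/* Declaration shapes only, for reference: */
static inline void rte_pktmbuf_free(struct rte_mbuf *m);                 /* rte_mbuf.h */
void rte_pktmbuf_free_bulk(struct rte_mbuf **mbufs, unsigned int count); /* rte_mbuf.c */
```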

As you are doing performance testing in that area, can you please check
whether '__rte_pktmbuf_free_seg_via_array()' is inlined? As it is a
static function, I expect it to be. If not, can you please test with it
force-inlined (__rte_always_inline)?
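
To illustrate what I mean by force inlining, a minimal sketch; the helper
name below is a hypothetical stand-in, not the exact code from
lib/mbuf/rte_mbuf.c:
```
#include <rte_common.h> /* __rte_always_inline */
#include <rte_mbuf.h>

/* Hypothetical stand-in for the static seg-free helper: the attribute
 * forces the compiler to inline it instead of trusting its heuristics. */
static __rte_always_inline void
free_seg_forced(struct rte_mbuf *m)
{
	rte_pktmbuf_free_seg(m);
}
```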


And I wonder, if handing a single mbuf to the bulk() API is a common
pattern, whether it makes sense to add a new inline wrapper to the
library for this case, given it brings a ~4% improvement. Something like:
```
static inline void
rte_pktmbuf_free_bulk_or_one(struct rte_mbuf **mb, unsigned int n)
{
	if (n == 1)
		rte_pktmbuf_free(mb[0]);
	else
		rte_pktmbuf_free_bulk(mb, n);
}
```
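
With that, the 'free_pkts()' helper from your patch could call the
wrapper directly; a sketch, assuming the wrapper above were added to
rte_mbuf.h:
```
static __rte_always_inline void
free_pkts(struct rte_mbuf *mb[], const uint32_t n)
{
	/* single-mbuf fast path is handled inside the wrapper */
	rte_pktmbuf_free_bulk_or_one(mb, n);
	core_stats_update_drop(n);
}
```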
