On 9/27/2024 11:46 AM, Meade, Niall wrote:
>> On 9/26/2024 3:03 PM, Meade, Niall wrote:
>>>> From: Ferruh Yigit <ferruh.yi...@amd.com>
>>>> Sent: Thursday, September 26, 2024 12:16 AM
>>>> To: Meade, Niall <niall.me...@intel.com>; Thomas Monjalon 
>>>> <tho...@monjalon.net>; Andrew Rybchenko <andrew.rybche...@oktetlabs.ru>; 
>>>> Roman Zhukov <roman.zhu...@arknetworks.am>
>>>> Cc: dev@dpdk.org <dev@dpdk.org>
>>>> Subject: Re: [PATCH v1] ethdev: fix int overflow in descriptor count logic
>>> <snip>
>>>>> The resolution involves upcasting nb_desc to a uint32_t before the
>>>>> RTE_ALIGN_CEIL macro is applied. This change ensures that the subsequent
>>>>> call to RTE_ALIGN_FLOOR(nb_desc + (nb_align - 1), nb_align) does not
>>>>> result in an overflow, as it would when nb_desc is a uint16_t. By using
>>>>> a uint32_t for these operations, the correct behavior is maintained
>>>>> without the risk of overflow.
>>>>>
>>>>
>>>> Hi Niall,
>>>
>>> Hi Ferruh,
>>>
>>>> Thanks for the patch.
>>>>
>>>> For the 'RTE_ALIGN_CEIL(val, align)' macro, 'align' should be power of
>>>> two, as 'desc_lim->nb_align' is uint16_t, max value it can get is 2^15.
>>>> 'val' should be smaller than or equal to 'align', so '*nb_desc' can be
>>>> maximum 2^15.
>>>>
>>>> So RTE_ALIGN_CEIL(2^15-1, 2^15) = 2^15, I think this should work fine
>>>> (although I didn't test).
>>>>
>>>> And even with your uint32_t cast, I think following will fail:
>>>> RTE_ALIGN_CEIL(2^16-1, 2^15)
>>>> (again, not tested).
>>>>
>>>
>>> I tested my code with these values and the behaviour is as expected from
>>> what I can see.
>>> At a high level I ran into this issue when passing uint16_tMAX into
>>> rte_eth_dev_adjust_nb_rx_tx_desc() with the intent of selecting the maximum
>>> ring descriptor size but the minimum was selected.
>>>
>>>> Or maybe I am missing a case, can you please give some actual numbers to
>>>> show the problem and the fix?
>>>
>>> Yes sure! If we take an example of val= (2^16)-1 and align= 32.
>>> RTE_ALIGN_CEIL(val, align) calls RTE_ALIGN_FLOOR(val + align - 1, align). 
>>> With
>>> val as a uint16_t this subsequent macro call results in a wrap around for 
>>> val
>>> (originally was the max uint16_t and now we are attempting to add align to
>>> it). The returned value of RTE_ALIGN_CEIL() in this case is 0. This results 
>>> in
>>> nb_desc being set to 0, and later set to the minimum ring descriptor size 
>>> for
>>> that NIC with *nb_desc = RTE_MAX(*nb_desc, desc_lim->nb_min).
>>>
>>> While this example is an unreasonably large request for a descriptor ring 
>>> size,
>>> the expected behaviour would be that the descriptor ring size defaults back 
>>> to
>>> the maximum possible for that particular NIC, not to the minimum which it
>>> currently does.
>>> By introducing a uint32_t, the wrap around in RTE_ALIGN_FLOOR() is avoided,
>>> keeping the large value of nb_desc_32 which is later set to an appropriate 
>>> size
>>> in RTE_MIN(*nb_desc_32, desc_lim->nb_max)
>>>
>>
>> I see the problem now, thanks.
>>
>> When value > (2^16 - align), next aligned value is 2^16, which is
>> UINT16_MAX + 1, hence wraps to 0, this is kind of expected.
>>
>> For the relevant code, assuming 'desc_lim->nb_max' & 'desc_lim->nb_min'
>> are already aligned to 'desc_lim->nb_align', following should fix the
>> issue, that seems simpler to me, what do you think:
> 
> Yes, while it is a simpler solution there is still potential for an overflow 
> if nb_max
> is equal to 0. If nb_max is 0 while nb_desc is UINT16_MAX, UINT16_MAX will be
> passed to the align macro resulting in an overflow again.
> 

ack

>>
>> ```
>> if (desc_lim->nb_max != 0)
>>         *nb_desc = RTE_MIN(*nb_desc, desc_lim->nb_max);
>>
>> nb_desc_32 = RTE_MAX(nb_desc_32, desc_lim->nb_min);
>>
>> if (desc_lim->nb_align != 0)
>>         *nb_desc = RTE_ALIGN_CEIL(*nb_desc, desc_lim->nb_align);
>> ```
>>
>> Basically just changing the order of the operations...
>>
>> It is not easy to see the problem, can you please give sample values in
>> the commit log (for '*nb_desc', 'nb_align', 'nb_max' & 'nb_min'), that
>> makes much easier to see why above works.
> 
> Yes, good idea! I'll add an example to the commit log for clarity.
> --------------------------------------------------------------
> Intel Research and Development Ireland Limited
> Registered in Ireland
> Registered Office: Collinstown Industrial Park, Leixlip, County Kildare
> Registered Number: 308263
> 
> 
> This e-mail and any attachments may contain confidential material for the sole
> use of the intended recipient(s). Any review or distribution by others is
> strictly prohibited. If you are not the intended recipient, please contact the
> sender and delete all copies.
> 

Reply via email to