On 9/27/2024 11:46 AM, Meade, Niall wrote:
>> On 9/26/2024 3:03 PM, Meade, Niall wrote:
>>>> From: Ferruh Yigit <ferruh.yi...@amd.com>
>>>> Sent: Thursday, September 26, 2024 12:16 AM
>>>> To: Meade, Niall <niall.me...@intel.com>; Thomas Monjalon
>>>> <tho...@monjalon.net>; Andrew Rybchenko <andrew.rybche...@oktetlabs.ru>;
>>>> Roman Zhukov <roman.zhu...@arknetworks.am>
>>>> Cc: dev@dpdk.org <dev@dpdk.org>
>>>> Subject: Re: [PATCH v1] ethdev: fix int overflow in descriptor count logic
>>> <snip>
>>>>> The resolution involves upcasting nb_desc to a uint32_t before the
>>>>> RTE_ALIGN_CEIL macro is applied. This change ensures that the subsequent
>>>>> call to RTE_ALIGN_FLOOR(nb_desc + (nb_align - 1), nb_align) does not
>>>>> result in an overflow, as it would when nb_desc is a uint16_t. By using
>>>>> a uint32_t for these operations, the correct behavior is maintained
>>>>> without the risk of overflow.
>>>>>
>>>>
>>>> Hi Niall,
>>>
>>> Hi Ferruh,
>>>
>>>> Thanks for the patch.
>>>>
>>>> For the 'RTE_ALIGN_CEIL(val, align)' macro, 'align' should be power of
>>>> two, as 'desc_lim->nb_align' is uint16_t, max value it can get is 2^15.
>>>> 'val' should be smaller than or equal to 'align', so '*nb_desc' can be
>>>> maximum 2^15.
>>>>
>>>> So RTE_ALIGN_CEIL(2^15-1, 2^15) = 2^15, I think this should work fine
>>>> (although I didn't test).
>>>>
>>>> And even with your uint32_t cast, I think following will fail:
>>>> RTE_ALIGN_CEIL(2^16-1, 2^15)
>>>> (again, not tested).
>>>>
>>>
>>> I tested my code with these values and the behaviour is as expected from
>>> what I can see.
>>> At a high level I ran into this issue when passing UINT16_MAX into
>>> rte_eth_dev_adjust_nb_rx_tx_desc() with the intent of selecting the maximum
>>> ring descriptor size but the minimum was selected.
>>>
>>>> Or maybe I am missing a case, can you please give some actual numbers to
>>>> show the problem and the fix?
>>>
>>> Yes sure! If we take an example of val = (2^16)-1 and align = 32.
>>> RTE_ALIGN_CEIL(val, align) calls RTE_ALIGN_FLOOR(val + align - 1, align).
>>> With val as a uint16_t this subsequent macro call results in a wrap around
>>> for val (originally was the max uint16_t and now we are attempting to add
>>> align to it). The returned value of RTE_ALIGN_CEIL() in this case is 0.
>>> This results in nb_desc being set to 0, and later set to the minimum ring
>>> descriptor size for that NIC with
>>> *nb_desc = RTE_MAX(*nb_desc, desc_lim->nb_min).
>>>
>>> While this example is an unreasonably large request for a descriptor ring
>>> size, the expected behaviour would be that the descriptor ring size defaults
>>> back to the maximum possible for that particular NIC, not to the minimum
>>> which it currently does.
>>> By introducing a uint32_t, the wrap around in RTE_ALIGN_FLOOR() is avoided,
>>> keeping the large value of nb_desc_32 which is later set to an appropriate
>>> size in RTE_MIN(*nb_desc_32, desc_lim->nb_max)
>>>
>>
>> I see the problem now, thanks.
>>
>> When value > (2^16 - align), next aligned value is 2^16, which is
>> UINT16_MAX + 1, hence wraps to 0, this is kind of expected.
>>
>> For the relevant code, assuming 'desc_lim->nb_max' & 'desc_lim->nb_min'
>> are already aligned to 'desc_lim->nb_align', following should fix the
>> issue, that seems simpler to me, what do you think:
>
> Yes, while it is a simpler solution there is still potential for an overflow
> if nb_max is equal to 0. If nb_max is 0 while nb_desc is UINT16_MAX,
> UINT16_MAX will be passed to the align macro resulting in an overflow again.
>
ack

>>
>> ```
>> if (desc_lim->nb_max != 0)
>>     *nb_desc = RTE_MIN(*nb_desc, desc_lim->nb_max);
>>
>> *nb_desc = RTE_MAX(*nb_desc, desc_lim->nb_min);
>>
>> if (desc_lim->nb_align != 0)
>>     *nb_desc = RTE_ALIGN_CEIL(*nb_desc, desc_lim->nb_align);
>> ```
>>
>> Basically just changing the order of the operations...
>>
>> It is not easy to see the problem, can you please give sample values in
>> the commit log (for '*nb_desc', 'nb_align', 'nb_max' & 'nb_min'), that
>> makes much easier to see why above works.
>
> Yes, good idea! I'll add an example to the commit log for clarity.
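
For anyone following the thread with numbers in hand, here is a minimal
standalone sketch of the two orderings discussed above. It does not use the
DPDK headers: 'struct desc_lim', 'align_ceil_u16()' and the two 'adjust_*()'
functions are hypothetical stand-ins written for illustration, and the sample
limits (nb_max = 4096, nb_min = 64, nb_align = 32) are made up. The helper only
models "round up to a power-of-two multiple, then store in 16 bits", which is
the step the thread identifies as wrapping to 0.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the three limits discussed in the thread;
 * this is not the real struct rte_eth_desc_lim. */
struct desc_lim {
    uint16_t nb_max;
    uint16_t nb_min;
    uint16_t nb_align;
};

/* Round val up to a multiple of align (a power of two) and store the
 * result in 16 bits. The sum is computed in a wider type, so the wrap
 * happens at the final narrowing to uint16_t. */
static uint16_t
align_ceil_u16(uint16_t val, uint16_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (uint16_t)(((uint32_t)val + (align - 1u)) & ~(uint32_t)(align - 1u));
}

/* Align first, then clamp: the order the thread describes as failing. */
uint16_t
adjust_align_first(uint16_t nb_desc, const struct desc_lim *lim)
{
    if (lim->nb_align != 0)
        nb_desc = align_ceil_u16(nb_desc, lim->nb_align);
    if (lim->nb_max != 0 && nb_desc > lim->nb_max)
        nb_desc = lim->nb_max;
    if (nb_desc < lim->nb_min)
        nb_desc = lim->nb_min;
    return nb_desc;
}

/* Clamp first, then align: the reordering proposed in the thread.
 * Assumes nb_max and nb_min are already multiples of nb_align. */
uint16_t
adjust_clamp_first(uint16_t nb_desc, const struct desc_lim *lim)
{
    if (lim->nb_max != 0 && nb_desc > lim->nb_max)
        nb_desc = lim->nb_max;
    if (nb_desc < lim->nb_min)
        nb_desc = lim->nb_min;
    if (lim->nb_align != 0)
        nb_desc = align_ceil_u16(nb_desc, lim->nb_align);
    return nb_desc;
}
```

With nb_desc = 65535 and the sample limits, the align-first order rounds up to
2^16, which is stored as 0 and then raised to nb_min (64), while the
clamp-first order limits the request to nb_max (4096) before aligning and
returns 4096. If nb_max is 0 (treated here as "no upper limit"), the
clamp-first order still passes 65535 to the align step and gets 0 back, which
is the remaining case raised in the reply above.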