On 10/11/21 7:25 PM, Ananyev, Konstantin wrote:
>>> At queue configure stage always allocate space for maximum
>>> possible number (RTE_MAX_QUEUES_PER_PORT) of queue pointers.
>>> That will allow 'fast' inline functions (eth_rx_burst, etc.) to refer
>>> pointer to internal queue data without extra checking of current number
>>> of configured queues.
>>> That would help in future to hide rte_eth_dev and related structures.
>>> It means that from now on, each ethdev port will always consume:
>>> ((2*sizeof(uintptr_t))* RTE_MAX_QUEUES_PER_PORT) bytes of memory for its
>>> queue pointers.
>>> With RTE_MAX_QUEUES_PER_PORT==1024 (default value) it is 16KB per port.
>>>
>>> Signed-off-by: Konstantin Ananyev <[email protected]>
>>> ---
>>>  lib/ethdev/rte_ethdev.c | 36 +++++++++---------------------------
>>>  1 file changed, 9 insertions(+), 27 deletions(-)
>>>
>>> diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
>>> index ed37f8871b..c8abda6dd7 100644
>>> --- a/lib/ethdev/rte_ethdev.c
>>> +++ b/lib/ethdev/rte_ethdev.c
>>> @@ -897,7 +897,8 @@ eth_dev_rx_queue_config(struct rte_eth_dev *dev, uint16_t nb_queues)
>>>  	if (dev->data->rx_queues == NULL && nb_queues != 0) { /* first time configuration */
>>>  		dev->data->rx_queues = rte_zmalloc("ethdev->rx_queues",
>>> -				sizeof(dev->data->rx_queues[0]) * nb_queues,
>>> +				sizeof(dev->data->rx_queues[0]) *
>>> +				RTE_MAX_QUEUES_PER_PORT,
>>>  				RTE_CACHE_LINE_SIZE);
>>
>> Looking at it I have few questions:
>> 1. Why is nb_queues == 0 case kept as an exception? Yes,
>>    strictly speaking it is not the problem of the patch,
>>    DPDK will still segfault (non-debug build) if I
>>    allocate Tx queues only but call rte_eth_rx_burst().
>
> eth_dev_rx_queue_config(.., nb_queues=0) is used in few places to clean-up things.
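As a rough illustration of what the description above means by avoiding "extra checking" in the fast path, and of why question 1 still matters when rx_queues itself was never allocated, a sketch along these lines (this is not the actual rte_eth_rx_burst() code; names are illustrative):

#include <rte_ethdev.h>
#include <rte_mbuf.h>

/*
 * Sketch only, not the real fast-path code.  With rx_queues always
 * allocated for RTE_MAX_QUEUES_PER_PORT entries, the load below stays
 * inside the array for any valid queue_id, so no comparison against
 * dev->data->nb_rx_queues is needed.  If rx_queues itself is NULL
 * (Rx never configured, e.g. a Tx-only port), it is still a NULL
 * dereference - the segfault mentioned in question 1.
 */
static inline uint16_t
sketch_rx_burst(struct rte_eth_dev *dev, uint16_t queue_id,
		struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
{
	void *qd = dev->data->rx_queues[queue_id];

	if (qd == NULL)		/* slot exists, but this queue was not set up */
		return 0;

	return dev->rx_pkt_burst(qd, rx_pkts, nb_pkts);
}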
No, as far as I know. For a Tx-only application (e.g. a traffic generator) it is 100% legal to configure with tx_queues=X, rx_queues=0. The same holds for an Rx-only application (e.g. packet capture).
>> After reading the patch description I thought that
>> we're trying to address it.
>
> We do, though I can't see how we can address it in this patch.
> Though it is a good idea - I think I can add an extra check in
> eth_dev_fp_ops_setup() or around it, and set up RX function pointers
> only when dev->data->rx_queues != NULL. Same for TX.
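A rough sketch of that extra check, assuming the fp_ops scheme this series introduces (struct and field names here are illustrative assumptions, not the final API):

#include <errno.h>
#include <rte_errno.h>
#include <rte_ethdev.h>

/* Placeholder burst function installed when Rx was never configured. */
static uint16_t
sketch_dummy_rx_burst(void *rxq __rte_unused,
		      struct rte_mbuf **rx_pkts __rte_unused,
		      uint16_t nb_pkts __rte_unused)
{
	rte_errno = ENOTSUP;
	return 0;
}

/* Sketch of the check discussed above: only install the real Rx burst
 * callback when the queue-pointer array exists; otherwise fall back to
 * the dummy, so a stray rte_eth_rx_burst() fails gracefully instead of
 * dereferencing a NULL rx_queues array.  The same idea applies to Tx. */
static void
sketch_fp_ops_setup(struct rte_eth_fp_ops *fpo, const struct rte_eth_dev *dev)
{
	if (dev->data->rx_queues == NULL)
		fpo->rx_pkt_burst = sketch_dummy_rx_burst;
	else
		fpo->rx_pkt_burst = dev->rx_pkt_burst;
}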
You don't need to care about these pointers, if these arrays are always allocated. See (3) below.
>> 2. Why do we need to allocate memory dynamically?
>>    Can we just make rx_queues an array of appropriate size?
>
> Pavan already asked the same question. My answer to him:
> Yep, we can, and yes, it will simplify this piece of code.
> The main reason I decided not to do this change now is that it would change
> the layout of the rte_eth_dev_data structure. In this series I tried to
> minimize (or avoid) changes in rte_eth_dev and rte_eth_dev_data as much as
> possible, to avoid any unforeseen performance and functional impacts.
> If we manage to make rte_eth_dev and rte_eth_dev_data private, we can in
> future consider that change and other changes in the rte_eth_dev and
> rte_eth_dev_data layouts without worrying about ABI breakage.
Thanks a lot. Makes sense.
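For context, the alternative raised in (2) would amount to something like the following simplified stand-in (not the real rte_eth_dev_data definition, which has many more fields):

#include <stdint.h>
#include <rte_config.h>		/* RTE_MAX_QUEUES_PER_PORT */

/* Simplified stand-in for the layout change discussed in (2): embed
 * fixed-size queue-pointer arrays instead of rte_zmalloc'd ones.  This
 * removes the allocation/realloc logic, but changes the structure
 * layout (the ABI concern above) and reserves the space for every
 * port, configured or not. */
struct sketch_eth_dev_data {
	void *rx_queues[RTE_MAX_QUEUES_PER_PORT]; /* was: void **rx_queues */
	void *tx_queues[RTE_MAX_QUEUES_PER_PORT]; /* was: void **tx_queues */
	uint16_t nb_rx_queues;
	uint16_t nb_tx_queues;
	/* ...remaining fields unchanged... */
};

That layout is also where the 512K figure in (3) comes from: the space would be reserved up front for all RTE_MAX_ETHPORTS entries of the rte_eth_dev_data array.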
>> May be wasting 512K unconditionally is too much.
>> 3. If wasting 512K is too much, I'd consider to
>>    move allocation to eth_dev_get(). If
>
> Don't understand where 512KB came from.
32 ports * 1024 queues * 2 types * 8 bytes (pointer size), if we allocate as in (2) above.
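Spelled out with the default values (RTE_MAX_QUEUES_PER_PORT = 1024, RTE_MAX_ETHPORTS = 32, 8-byte pointers):

  per port : 2 (Rx + Tx) * 1024 queue pointers * 8 bytes = 16 KB
  all ports: 32 ports * 16 KB                             = 512 KB

i.e. 16 KB is what the patch allocates per configured port, while 512 KB would be the unconditional cost of embedding the arrays for every port.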
> each ethdev port will always consume:
> ((2*sizeof(uintptr_t))* RTE_MAX_QUEUES_PER_PORT) bytes of memory for its
> queue pointers.
> With RTE_MAX_QUEUES_PER_PORT==1024 (default value) it is 16KB per port.
IMHO it would be a bit nicer if the queue pointer arrays were allocated on device get, since the size is fixed. It is just a suggestion. If you disagree, feel free to drop it.
>>>  		if (dev->data->rx_queues == NULL) {
>>>  			dev->data->nb_rx_queues = 0;
>>> @@ -908,21 +909,11 @@ eth_dev_rx_queue_config(struct rte_eth_dev *dev, uint16_t nb_queues)
>>>  		rxq = dev->data->rx_queues;
>>> -		for (i = nb_queues; i < old_nb_queues; i++)
>>> +		for (i = nb_queues; i < old_nb_queues; i++) {
>>>  			(*dev->dev_ops->rx_queue_release)(rxq[i]);
>>> -		rxq = rte_realloc(rxq, sizeof(rxq[0]) * nb_queues,
>>> -				RTE_CACHE_LINE_SIZE);
>>> -		if (rxq == NULL)
>>> -			return -(ENOMEM);
>>> -		if (nb_queues > old_nb_queues) {
>>> -			uint16_t new_qs = nb_queues - old_nb_queues;
>>> -
>>> -			memset(rxq + old_nb_queues, 0,
>>> -				sizeof(rxq[0]) * new_qs);
>>> +			rxq[i] = NULL;

It looks like the patch should be rebased on top of next-net main because of the queue release patches.

[snip]

