[dpdk-dev] [PATCH 3/4] net/bonding: another fix to LACP mempool size
> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Robert Sanford
> Sent: Monday, August 1, 2016 22:43
> To: dev at dpdk.org
> Cc: Doherty, Declan; De Lara Guarch, Pablo; olivier.matz at 6wind.com
> Subject: [dpdk-dev] [PATCH 3/4] net/bonding: another fix to LACP mempool size
>
> The following log message may appear after a slave is idle (or nearly
> idle) for a few minutes: "PMD: Failed to allocate LACP packet from pool".
>
> Problem: All mbufs from a slave's private pool (used exclusively for
> transmitting LACPDUs) have been allocated and are still sitting in the
> device's tx descriptor ring and other cores' mempool caches.
>
> Solution: Ensure that each slave's tx (LACPDU) mempool owns more than
> n-tx-queues * (n-tx-descriptors + per-core-mempool-flush-threshold)
> mbufs.
>
> Note that the LACP tx machine function is the only code that allocates
> from a slave's private pool. It runs in the context of the interrupt
> thread, and thus it has no mempool cache of its own.
>
> Signed-off-by: Robert Sanford
> ---
>  drivers/net/bonding/rte_eth_bond_8023ad.c | 10 +++++++---
>  1 files changed, 7 insertions(+), 3 deletions(-)
> [...]

Reviewed-by: Tomasz Kulasek
[dpdk-dev] [PATCH 3/4] net/bonding: another fix to LACP mempool size
Hi Robert,

On 08/23/2016 10:01 PM, Sanford, Robert wrote:
> Hi Olivier,
>
> On 8/23/16, 11:09 AM, "Olivier MATZ" wrote:
> > Hi Robert,
> >
> > On 08/01/2016 10:42 PM, Robert Sanford wrote:
> > > The following log message may appear after a slave is idle (or nearly
> > > idle) for a few minutes: "PMD: Failed to allocate LACP packet from
> > > pool".
> > > [...]
> >
> > I'm not very familiar with bonding code, so maybe your patch is correct.
> > I think the size of the mempool should be:
> >
> >   BOND_MODE_8023AX_SLAVE_TX_PKTS +
> >     n_cores * RTE_MEMPOOL_CALC_CACHE_FLUSHTHRESH(cache_size)
> >
> > With n_cores = number of cores that can dequeue from the mempool.
> >
> > The safest thing to do would be to have n_cores = RTE_MAX_LCORE. I don't
> > know if bond_dev->data->nb_tx_queues corresponds to this definition, if
> > yes you can ignore my comment ;)
> >
> > Regards,
> > Olivier
>
> Only one, non-EAL thread dequeues from a per-slave, private (LACPDU)
> mempool. The thread calls mbuf-alloc in the context of an
> rte_eal_alarm_set() callback function, and enqueues the mbuf to a
> multi-consumer ring.
>
> The consumer(s) of the mbufs can be any core(s) that invoke the bond
> PMD's tx-burst function. As we know, mbufs can sit around on TX
> descriptor rings long after we transmit them, often until we need to
> reuse the descriptors for subsequent TX packets. When the TX cores
> finally free some of the mbufs, some of the mbufs get left in a pool's
> per-core cache. So, the maximum number of mbufs that may be lying
> around doing nothing is: the "done" mbufs in the TX desc ring and in
> the per-core cache, times the number of TX cores, or approximately
> nb_tx_queues * (nb_tx_desc + CACHE_FLUSHTHRESH(cache_size)). (I
> probably should also update the comment
[dpdk-dev] [PATCH 3/4] net/bonding: another fix to LACP mempool size
Hi Olivier,

On 8/23/16, 11:09 AM, "Olivier MATZ" wrote:
> Hi Robert,
>
> On 08/01/2016 10:42 PM, Robert Sanford wrote:
> > The following log message may appear after a slave is idle (or nearly
> > idle) for a few minutes: "PMD: Failed to allocate LACP packet from
> > pool".
> > [...]
>
> I'm not very familiar with bonding code, so maybe your patch is correct.
> I think the size of the mempool should be:
>
>   BOND_MODE_8023AX_SLAVE_TX_PKTS +
>     n_cores * RTE_MEMPOOL_CALC_CACHE_FLUSHTHRESH(cache_size)
>
> With n_cores = number of cores that can dequeue from the mempool.
>
> The safest thing to do would be to have n_cores = RTE_MAX_LCORE. I don't
> know if bond_dev->data->nb_tx_queues corresponds to this definition, if
> yes you can ignore my comment ;)
>
> Regards,
> Olivier

Only one, non-EAL thread dequeues from a per-slave, private (LACPDU)
mempool. The thread calls mbuf-alloc in the context of an
rte_eal_alarm_set() callback function, and enqueues the mbuf to a
multi-consumer ring.

The consumer(s) of the mbufs can be any core(s) that invoke the bond
PMD's tx-burst function. As we know, mbufs can sit around on TX
descriptor rings long after we transmit them, often until we need to
reuse the descriptors for subsequent TX packets. When the TX cores
finally free some of the mbufs, some of the mbufs get left in a pool's
per-core cache. So, the maximum number of mbufs that may be lying around
doing nothing is: the "done" mbufs in the TX desc ring and in the
per-core cache, times the number of TX cores, or approximately
nb_tx_queues * (nb_tx_desc + CACHE_FLUSHTHRESH(cache_size)). (I probably
should also update the comment above the for-loop that calculates
total_tx_desc.)

When we include normal TX mbufs and private LACPDU mbufs displacing each
other in the TX desc ring, there are many (if not infinite)
possibilities. I wrote a program to simulate all this interaction,
trying different values for cache_size, nb_tx_desc, etc., because it
took a lon
[dpdk-dev] [PATCH 3/4] net/bonding: another fix to LACP mempool size
Hi Robert,

On 08/01/2016 10:42 PM, Robert Sanford wrote:
> The following log message may appear after a slave is idle (or nearly
> idle) for a few minutes: "PMD: Failed to allocate LACP packet from
> pool".
>
> Problem: All mbufs from a slave's private pool (used exclusively for
> transmitting LACPDUs) have been allocated and are still sitting in
> the device's tx descriptor ring and other cores' mempool caches.
>
> Solution: Ensure that each slave's tx (LACPDU) mempool owns more than
> n-tx-queues * (n-tx-descriptors + per-core-mempool-flush-threshold)
> mbufs.
>
> Note that the LACP tx machine function is the only code that allocates
> from a slave's private pool. It runs in the context of the interrupt
> thread, and thus it has no mempool cache of its own.
>
> Signed-off-by: Robert Sanford
> [...]

I'm not very familiar with bonding code, so maybe your patch is correct.
I think the size of the mempool should be:

  BOND_MODE_8023AX_SLAVE_TX_PKTS +
    n_cores * RTE_MEMPOOL_CALC_CACHE_FLUSHTHRESH(cache_size)

With n_cores = number of cores that can dequeue from the mempool.

The safest thing to do would be to have n_cores = RTE_MAX_LCORE. I don't
know if bond_dev->data->nb_tx_queues corresponds to this definition, if
yes you can ignore my comment ;)

Regards,
Olivier
[dpdk-dev] [PATCH 3/4] net/bonding: another fix to LACP mempool size
The following log message may appear after a slave is idle (or nearly
idle) for a few minutes: "PMD: Failed to allocate LACP packet from
pool".

Problem: All mbufs from a slave's private pool (used exclusively for
transmitting LACPDUs) have been allocated and are still sitting in
the device's tx descriptor ring and other cores' mempool caches.

Solution: Ensure that each slave's tx (LACPDU) mempool owns more than
n-tx-queues * (n-tx-descriptors + per-core-mempool-flush-threshold)
mbufs.

Note that the LACP tx machine function is the only code that allocates
from a slave's private pool. It runs in the context of the interrupt
thread, and thus it has no mempool cache of its own.

Signed-off-by: Robert Sanford
---
 drivers/net/bonding/rte_eth_bond_8023ad.c | 10 +++++++---
 1 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/net/bonding/rte_eth_bond_8023ad.c b/drivers/net/bonding/rte_eth_bond_8023ad.c
index 2f7ae70..1207896 100644
--- a/drivers/net/bonding/rte_eth_bond_8023ad.c
+++ b/drivers/net/bonding/rte_eth_bond_8023ad.c
@@ -854,6 +854,8 @@ bond_mode_8023ad_activate_slave(struct rte_eth_dev *bond_dev, uint8_t slave_id)
 	char mem_name[RTE_ETH_NAME_MAX_LEN];
 	int socket_id;
 	unsigned element_size;
+	unsigned cache_size;
+	unsigned cache_flushthresh;
 	uint32_t total_tx_desc;
 	struct bond_tx_queue *bd_tx_q;
 	uint16_t q_id;
@@ -890,19 +892,21 @@ bond_mode_8023ad_activate_slave(struct rte_eth_dev *bond_dev, uint8_t slave_id)
 
 	element_size = sizeof(struct slow_protocol_frame) + sizeof(struct rte_mbuf)
 		+ RTE_PKTMBUF_HEADROOM;
+	cache_size = RTE_MEMPOOL_CACHE_MAX_SIZE >= 32 ?
+		32 : RTE_MEMPOOL_CACHE_MAX_SIZE;
+	cache_flushthresh = RTE_MEMPOOL_CALC_CACHE_FLUSHTHRESH(cache_size);
 
 	/* The size of the mempool should be at least:
 	 * the sum of the TX descriptors + BOND_MODE_8023AX_SLAVE_TX_PKTS */
 	total_tx_desc = BOND_MODE_8023AX_SLAVE_TX_PKTS;
 	for (q_id = 0; q_id < bond_dev->data->nb_tx_queues; q_id++) {
 		bd_tx_q = (struct bond_tx_queue*)bond_dev->data->tx_queues[q_id];
-		total_tx_desc += bd_tx_q->nb_tx_desc;
+		total_tx_desc += bd_tx_q->nb_tx_desc + cache_flushthresh;
 	}
 
 	snprintf(mem_name, RTE_DIM(mem_name), "slave_port%u_pool", slave_id);
 	port->mbuf_pool = rte_mempool_create(mem_name,
-		total_tx_desc, element_size,
-		RTE_MEMPOOL_CACHE_MAX_SIZE >= 32 ? 32 : RTE_MEMPOOL_CACHE_MAX_SIZE,
+		total_tx_desc, element_size, cache_size,
 		sizeof(struct rte_pktmbuf_pool_private), rte_pktmbuf_pool_init,
 		NULL, rte_pktmbuf_init, NULL, socket_id, MEMPOOL_F_NO_SPREAD);
 
-- 
1.7.1