Re: [dpdk-dev] [PATCH v2] net/pcap: fix infinite Rx with large files
The new error message looks great. As I've already given my ack, I'm happy for this to be applied. > -Original Message- > From: Yigit, Ferruh > Sent: Thursday 4 February 2021 16:51 > To: Ferriter, Cian > Cc: Yigit, Ferruh ; dev@dpdk.org; sta...@dpdk.org > Subject: [PATCH v2] net/pcap: fix infinite Rx with large files > > Packet forwarding is not working when infinite Rx feature is used with > large .pcap files that has high number of packets. > > The problem is number of allocated mbufs are less than the infinite Rx > ring size, and all mbufs consumed to fill the ring, so there is no mbuf > left for forwarding. > > Current logic can not detect that infinite Rx ring is not filled > completely and no more mbufs left, and setup continues which leads > silent fail on packet forwarding. > > There isn't much can be done when there is not enough mbuf for the given > .pcap file, so additional checks added to detect the case and fail > explicitly with an error log. > > Bugzilla ID: 595 > Fixes: a3f5252e5cbd ("net/pcap: enable infinitely Rx a pcap file") > Cc: sta...@dpdk.org > > Reported-by: Cian Ferriter > Signed-off-by: Ferruh Yigit > Acked-by: Cian Ferriter > --- > v2: > * Updated log message > --- > drivers/net/pcap/rte_eth_pcap.c | 40 - > 1 file changed, 25 insertions(+), 15 deletions(-) > > diff --git a/drivers/net/pcap/rte_eth_pcap.c > b/drivers/net/pcap/rte_eth_pcap.c > index c7751b7ba742..90f5d75ea87f 100644 > --- a/drivers/net/pcap/rte_eth_pcap.c > +++ b/drivers/net/pcap/rte_eth_pcap.c > @@ -735,6 +735,17 @@ eth_stats_reset(struct rte_eth_dev *dev) > return 0; > } > > +static inline void > +infinite_rx_ring_free(struct rte_ring *pkts) > +{ > + struct rte_mbuf *bufs; > + > + while (!rte_ring_dequeue(pkts, (void **)&bufs)) > + rte_pktmbuf_free(bufs); > + > + rte_ring_free(pkts); > +} > + > static int > eth_dev_close(struct rte_eth_dev *dev) > { > @@ -753,7 +764,6 @@ eth_dev_close(struct rte_eth_dev *dev) > if (internals->infinite_rx) { > for (i = 0; i < dev->data->nb_rx_queues; i++) { > struct pcap_rx_queue *pcap_q = &internals- > >rx_queue[i]; > - struct rte_mbuf *pcap_buf; > > /* >* 'pcap_q->pkts' can be NULL if 'eth_dev_close()' > @@ -762,11 +772,7 @@ eth_dev_close(struct rte_eth_dev *dev) > if (pcap_q->pkts == NULL) > continue; > > - while (!rte_ring_dequeue(pcap_q->pkts, > - (void **)&pcap_buf)) > - rte_pktmbuf_free(pcap_buf); > - > - rte_ring_free(pcap_q->pkts); > + infinite_rx_ring_free(pcap_q->pkts); > } > } > > @@ -835,21 +841,25 @@ eth_rx_queue_setup(struct rte_eth_dev *dev, > while (eth_pcap_rx(pcap_q, bufs, 1)) { > /* Check for multiseg mbufs. */ > if (bufs[0]->nb_segs != 1) { > - rte_pktmbuf_free(*bufs); > - > - while (!rte_ring_dequeue(pcap_q->pkts, > - (void **)bufs)) > - rte_pktmbuf_free(*bufs); > - > - rte_ring_free(pcap_q->pkts); > - PMD_LOG(ERR, "Multiseg mbufs are not > supported in infinite_rx " > - "mode."); > + infinite_rx_ring_free(pcap_q->pkts); > + PMD_LOG(ERR, > + "Multiseg mbufs are not supported in > infinite_rx mode."); > return -EINVAL; > } > > rte_ring_enqueue_bulk(pcap_q->pkts, > (void * const *)bufs, 1, NULL); > } > + > + if (rte_ring_count(pcap_q->pkts) < pcap_pkt_count) { > + infinite_rx_ring_free(pcap_q->pkts); > + PMD_LOG(ERR, > + "Not enough mbufs to accommodate packets > in pcap file. " > + "At least %" PRIu64 " mbufs per queue is > required.", > + pcap_pkt_count); > + return -EINVAL; > + } > + > /* >* Reset the stats for this queue since eth_pcap_rx calls > above >* didn't result in the application receiving packets. > -- > 2.29.2
Re: [dpdk-dev] [PATCH] net/pcap: fix infinite Rx with large files
> -Original Message- > From: Ferruh Yigit > Sent: Thursday 4 February 2021 16:29 > To: Ferriter, Cian > Cc: dev@dpdk.org; sta...@dpdk.org > Subject: Re: [PATCH] net/pcap: fix infinite Rx with large files > > On 2/4/2021 4:03 PM, Ferriter, Cian wrote: > > Hi Ferruh, > > > > This fixes the issue I was seeing. Now an error is reported, rather than > silent failure. > > > > I have one piece of feedback about the particular error message below > inline which you can take or leave, I'm happy for you to upstream this fix > either way. > > > > Acked-by: Cian Ferriter > > > >> -Original Message----- > >> From: Yigit, Ferruh > >> Sent: Wednesday 3 February 2021 15:49 > >> To: Ferriter, Cian > >> Cc: Yigit, Ferruh ; dev@dpdk.org; > sta...@dpdk.org > >> Subject: [PATCH] net/pcap: fix infinite Rx with large files > >> > >> Packet forwarding is not working when infinite Rx feature is used with > >> large .pcap files that has high number of packets. > >> > >> The problem is number of allocated mbufs are less than the infinite Rx > >> ring size, and all mbufs consumed to fill the ring, so there is no mbuf > >> left for forwarding. > >> > >> Current logic can not detect that infinite Rx ring is not filled > >> completely and no more mbufs left, and setup continues which leads > >> silent fail on packet forwarding. > >> > >> There isn't much can be done when there is not enough mbuf for the > given > >> .pcap file, so additional checks added to detect the case and fail > >> explicitly with an error log. > >> > >> Bugzilla ID: 595 > >> Fixes: a3f5252e5cbd ("net/pcap: enable infinitely Rx a pcap file") > >> Cc: sta...@dpdk.org > >> > >> Reported-by: Cian Ferriter > >> Signed-off-by: Ferruh Yigit > >> --- > >> drivers/net/pcap/rte_eth_pcap.c | 40 -- > --- > >> 1 file changed, 25 insertions(+), 15 deletions(-) > >> > >> diff --git a/drivers/net/pcap/rte_eth_pcap.c > >> b/drivers/net/pcap/rte_eth_pcap.c > >> index ff02ade70d1a..98f80368ca1d 100644 > >> --- a/drivers/net/pcap/rte_eth_pcap.c > >> +++ b/drivers/net/pcap/rte_eth_pcap.c > >> @@ -735,6 +735,17 @@ eth_stats_reset(struct rte_eth_dev *dev) > >> return 0; > >> } > >> > >> +static inline void > >> +infinite_rx_ring_free(struct rte_ring *pkts) > >> +{ > >> +struct rte_mbuf *bufs; > >> + > >> +while (!rte_ring_dequeue(pkts, (void **)&bufs)) > >> +rte_pktmbuf_free(bufs); > >> + > >> +rte_ring_free(pkts); > >> +} > >> + > >> static int > >> eth_dev_close(struct rte_eth_dev *dev) > >> { > >> @@ -753,7 +764,6 @@ eth_dev_close(struct rte_eth_dev *dev) > >> if (internals->infinite_rx) { > >> for (i = 0; i < dev->data->nb_rx_queues; i++) { > >> struct pcap_rx_queue *pcap_q = &internals- > >>> rx_queue[i]; > >> -struct rte_mbuf *pcap_buf; > >> > >> /* > >>* 'pcap_q->pkts' can be NULL if 'eth_dev_close()' > >> @@ -762,11 +772,7 @@ eth_dev_close(struct rte_eth_dev *dev) > >> if (pcap_q->pkts == NULL) > >> continue; > >> > >> -while (!rte_ring_dequeue(pcap_q->pkts, > >> -(void **)&pcap_buf)) > >> -rte_pktmbuf_free(pcap_buf); > >> - > >> -rte_ring_free(pcap_q->pkts); > >> +infinite_rx_ring_free(pcap_q->pkts); > >> } > >> } > >> > >> @@ -835,21 +841,25 @@ eth_rx_queue_setup(struct rte_eth_dev *dev, > >> while (eth_pcap_rx(pcap_q, bufs, 1)) { > >> /* Check for multiseg mbufs. */ > >> if (bufs[0]->nb_segs != 1) { > >> -rte_pktmbuf_free(*bufs); > >> - > >> -while (!rte_ring_dequeue(pcap_q->pkts, > >> -(void **)bufs)) > >> -rte_pktmbuf_free(*bufs); > >> - > >> -rte_ring_free(pcap_q->pkts); > >> -PMD_LOG(ERR, "Multiseg mbufs are not > >> supported in infinite_rx " > >> -"mode."); > >> +infinite_rx_ring_free(pcap_q->pkts); > >> +PMD_LOG(ERR, > >> +"Multiseg mbufs are not supported in > >> infinite_rx mode."); > >> return -EINVAL; > >> } > >> > >> rte_ring_enqueue_bulk(pcap_q->pkts, > >> (void * const
Re: [dpdk-dev] [PATCH] net/pcap: fix byte stats for drop Tx
> -Original Message- > From: Ferriter, Cian > Sent: Thursday 4 February 2021 15:48 > To: Yigit, Ferruh > Cc: dev@dpdk.org; sta...@dpdk.org > Subject: RE: [PATCH] net/pcap: fix byte stats for drop Tx > > > > > -Original Message- > > From: Yigit, Ferruh > > Sent: Wednesday 3 February 2021 17:30 > > To: Ferriter, Cian > > Cc: Yigit, Ferruh ; dev@dpdk.org; > sta...@dpdk.org > > Subject: [PATCH] net/pcap: fix byte stats for drop Tx > > > > Drop Tx path in pcap is Tx that just drops the packets, which is used > > for the case only Rx from a pcap file is requested/matters. > > > > The byte stats was calculated using first mbuf segment, which gives > > wrong values for multi segmented mbufs, updated to use packet length > > instead. > > > > Bugzilla ID: 597 > > Fixes: a3f5252e5cbd ("net/pcap: enable infinitely Rx a pcap file") > > Cc: sta...@dpdk.org > > > > Reported-by: Cian Ferriter > > Signed-off-by: Ferruh Yigit > > --- > > Acked-by: Cian Ferriter > > Tested this with 2 large PCAPs and it works as expected: > > testpmd> show port stats all > > NIC statistics for port 0 > > RX-packets: 2 RX-missed: 0 RX-bytes: 131070 > RX-errors: 0 > RX-nombuf: 0 > TX-packets: 2 TX-errors: 0 TX-bytes: 131070 > > Throughput (since last show) > Rx-pps:0 Rx-bps:0 > Tx-pps:0 Tx-bps:0 > > ## > ## [Cian Ferriter] Just to clarify, the Testpmd CMD used to get the stats I pasted above is: dpdk-testpmd -l 1,2 -w 0:00.0 --vdev net_pcap0,rx_pcap=/root/udp_2_max_jumbo.pcap -- --no-flush-rx --i This means that tx_drop was enabled because no tx_pcap file was passed to the PCAP PMD as an arg.
Re: [dpdk-dev] [PATCH] net/pcap: fix infinite Rx with large files
Hi Ferruh, This fixes the issue I was seeing. Now an error is reported, rather than silent failure. I have one piece of feedback about the particular error message below inline which you can take or leave, I'm happy for you to upstream this fix either way. Acked-by: Cian Ferriter > -Original Message- > From: Yigit, Ferruh > Sent: Wednesday 3 February 2021 15:49 > To: Ferriter, Cian > Cc: Yigit, Ferruh ; dev@dpdk.org; sta...@dpdk.org > Subject: [PATCH] net/pcap: fix infinite Rx with large files > > Packet forwarding is not working when infinite Rx feature is used with > large .pcap files that has high number of packets. > > The problem is number of allocated mbufs are less than the infinite Rx > ring size, and all mbufs consumed to fill the ring, so there is no mbuf > left for forwarding. > > Current logic can not detect that infinite Rx ring is not filled > completely and no more mbufs left, and setup continues which leads > silent fail on packet forwarding. > > There isn't much can be done when there is not enough mbuf for the given > .pcap file, so additional checks added to detect the case and fail > explicitly with an error log. > > Bugzilla ID: 595 > Fixes: a3f5252e5cbd ("net/pcap: enable infinitely Rx a pcap file") > Cc: sta...@dpdk.org > > Reported-by: Cian Ferriter > Signed-off-by: Ferruh Yigit > --- > drivers/net/pcap/rte_eth_pcap.c | 40 - > 1 file changed, 25 insertions(+), 15 deletions(-) > > diff --git a/drivers/net/pcap/rte_eth_pcap.c > b/drivers/net/pcap/rte_eth_pcap.c > index ff02ade70d1a..98f80368ca1d 100644 > --- a/drivers/net/pcap/rte_eth_pcap.c > +++ b/drivers/net/pcap/rte_eth_pcap.c > @@ -735,6 +735,17 @@ eth_stats_reset(struct rte_eth_dev *dev) > return 0; > } > > +static inline void > +infinite_rx_ring_free(struct rte_ring *pkts) > +{ > + struct rte_mbuf *bufs; > + > + while (!rte_ring_dequeue(pkts, (void **)&bufs)) > + rte_pktmbuf_free(bufs); > + > + rte_ring_free(pkts); > +} > + > static int > eth_dev_close(struct rte_eth_dev *dev) > { > @@ -753,7 +764,6 @@ eth_dev_close(struct rte_eth_dev *dev) > if (internals->infinite_rx) { > for (i = 0; i < dev->data->nb_rx_queues; i++) { > struct pcap_rx_queue *pcap_q = &internals- > >rx_queue[i]; > - struct rte_mbuf *pcap_buf; > > /* >* 'pcap_q->pkts' can be NULL if 'eth_dev_close()' > @@ -762,11 +772,7 @@ eth_dev_close(struct rte_eth_dev *dev) > if (pcap_q->pkts == NULL) > continue; > > - while (!rte_ring_dequeue(pcap_q->pkts, > - (void **)&pcap_buf)) > - rte_pktmbuf_free(pcap_buf); > - > - rte_ring_free(pcap_q->pkts); > + infinite_rx_ring_free(pcap_q->pkts); > } > } > > @@ -835,21 +841,25 @@ eth_rx_queue_setup(struct rte_eth_dev *dev, > while (eth_pcap_rx(pcap_q, bufs, 1)) { > /* Check for multiseg mbufs. */ > if (bufs[0]->nb_segs != 1) { > - rte_pktmbuf_free(*bufs); > - > - while (!rte_ring_dequeue(pcap_q->pkts, > - (void **)bufs)) > - rte_pktmbuf_free(*bufs); > - > - rte_ring_free(pcap_q->pkts); > - PMD_LOG(ERR, "Multiseg mbufs are not > supported in infinite_rx " > - "mode."); > + infinite_rx_ring_free(pcap_q->pkts); > + PMD_LOG(ERR, > + "Multiseg mbufs are not supported in > infinite_rx mode."); > return -EINVAL; > } > > rte_ring_enqueue_bulk(pcap_q->pkts, > (void * const *)bufs, 1, NULL); > } > + > + if (rte_ring_count(pcap_q->pkts) < pcap_pkt_count) { > + infinite_rx_ring_free(pcap_q->pkts); > + PMD_LOG(ERR, > + "Not enough mbuf to fill the infinite_rx ring. " > + "At least %" PRIu64 " mbufs per queue is > required to fill the ring", > +
Re: [dpdk-dev] [PATCH] net/pcap: fix byte stats for drop Tx
> -Original Message- > From: Yigit, Ferruh > Sent: Wednesday 3 February 2021 17:30 > To: Ferriter, Cian > Cc: Yigit, Ferruh ; dev@dpdk.org; sta...@dpdk.org > Subject: [PATCH] net/pcap: fix byte stats for drop Tx > > Drop Tx path in pcap is Tx that just drops the packets, which is used > for the case only Rx from a pcap file is requested/matters. > > The byte stats was calculated using first mbuf segment, which gives > wrong values for multi segmented mbufs, updated to use packet length > instead. > > Bugzilla ID: 597 > Fixes: a3f5252e5cbd ("net/pcap: enable infinitely Rx a pcap file") > Cc: sta...@dpdk.org > > Reported-by: Cian Ferriter > Signed-off-by: Ferruh Yigit > --- Acked-by: Cian Ferriter Tested this with 2 large PCAPs and it works as expected: testpmd> show port stats all NIC statistics for port 0 RX-packets: 2 RX-missed: 0 RX-bytes: 131070 RX-errors: 0 RX-nombuf: 0 TX-packets: 2 TX-errors: 0 TX-bytes: 131070 Throughput (since last show) Rx-pps:0 Rx-bps:0 Tx-pps:0 Tx-bps:0
Re: [dpdk-dev] [PATCH v2] net/pcap: truncate packet if it is too large
Hi Zhike, I've tested the behavior before and after this patch and can verify that the packets are being correctly truncated. Code looks good to me too. Reviewed-by: Cian Ferriter Thanks, Cian > -Original Message- > From: Zhike Wang > Sent: Monday 9 December 2019 01:53 > To: dev@dpdk.org > Cc: wangzh...@jd.com; Yigit, Ferruh ; Ferriter, Cian > ; Zhike Wang > Subject: [PATCH v2] net/pcap: truncate packet if it is too large > > From: Zhike Wang > > Previously large packet would be dropped, instead now it is better to keep it > via truncating it. > > Signed-off-by: Zhike Wang > --- > drivers/net/pcap/rte_eth_pcap.c | 16 ++-- > 1 file changed, 6 insertions(+), 10 deletions(-) > > diff --git a/drivers/net/pcap/rte_eth_pcap.c > b/drivers/net/pcap/rte_eth_pcap.c index aa7ef6f..b4c79d1 100644 > --- a/drivers/net/pcap/rte_eth_pcap.c > +++ b/drivers/net/pcap/rte_eth_pcap.c > @@ -313,7 +313,7 @@ struct pmd_devargs_all { > struct pcap_pkthdr header; > pcap_dumper_t *dumper; > unsigned char temp_data[RTE_ETH_PCAP_SNAPLEN]; > - size_t len; > + size_t len, caplen; > > pp = rte_eth_devices[dumper_q->port_id].process_private; > dumper = pp->tx_dumper[dumper_q->queue_id]; > @@ -325,28 +325,24 @@ struct pmd_devargs_all { >* dumper */ > for (i = 0; i < nb_pkts; i++) { > mbuf = bufs[i]; > - len = rte_pktmbuf_pkt_len(mbuf); > + len = caplen = rte_pktmbuf_pkt_len(mbuf); > if (unlikely(!rte_pktmbuf_is_contiguous(mbuf) && > len > sizeof(temp_data))) { > - PMD_LOG(ERR, > - "Dropping multi segment PCAP packet. Size > (%zd) > max size (%zd).", > - len, sizeof(temp_data)); > - rte_pktmbuf_free(mbuf); > - continue; > + caplen = sizeof(temp_data); > } > > calculate_timestamp(&header.ts); > header.len = len; > - header.caplen = header.len; > + header.caplen = caplen; > /* rte_pktmbuf_read() returns a pointer to the data directly >* in the mbuf (when the mbuf is contiguous) or, otherwise, >* a pointer to temp_data after copying into it. >*/ > pcap_dump((u_char *)dumper, &header, > - rte_pktmbuf_read(mbuf, 0, len, temp_data)); > + rte_pktmbuf_read(mbuf, 0, caplen, temp_data)); > > num_tx++; > - tx_bytes += len; > + tx_bytes += caplen; > rte_pktmbuf_free(mbuf); > } > > -- > 1.8.3.1 >
Re: [dpdk-dev] [PATCH 1/9] net/pcap: fix argument checks
> -Original Message- > From: Kevin Traynor > Sent: Tuesday 5 November 2019 16:41 > To: David Marchand > Cc: dev ; Ferriter, Cian ; dpdk > stable ; Yigit, Ferruh > Subject: Re: [dpdk-dev] [PATCH 1/9] net/pcap: fix argument checks > > On 30/10/2019 07:52, David Marchand wrote: > > On Tue, Oct 1, 2019 at 2:54 PM Kevin Traynor > wrote: > >> > >> Previously rx/tx_queues were passed into eth_from_pcaps_common() as > >> ptrs and were checked for being NULL. > >> > >> In commit da6ba28f0540 ("net/pcap: use a struct to pass user > >> options") that changed to pass in a ptr to a pmd_devargs_all which > >> contains the rx/tx_queues. > >> > >> The parameter checking was not updated as part of that commit and > >> coverity caught that there was still a check if rx/tx_queues were > >> NULL, apparently after they had been dereferenced. > >> > >> Fix that by checking the pmd_devargs_all ptr and removing the NULL > >> checks on rx/tx_queues. > >> > >> 1231struct pmd_devargs *rx_queues = &devargs_all->rx_queues; > >> 1232struct pmd_devargs *tx_queues = &devargs_all->tx_queues; > >> 1233const unsigned int nb_rx_queues = rx_queues->num_of_queue; > >> deref_ptr: Directly dereferencing pointer tx_queues. > >> 1234const unsigned int nb_tx_queues = tx_queues->num_of_queue; > >> 1235unsigned int i; > >> 1236 > >> 1237/* do some parameter checking */ > >> CID 345004: Dereference before null check (REVERSE_INULL) > >> [select issue] > >> 1238if (rx_queues == NULL && nb_rx_queues > 0) > >> 1239return -1; > >> CID 345029 (#1 of 1): Dereference before null check (REVERSE_INULL) > >> check_after_deref: Null-checking tx_queues suggests that it may be > >> null, but it has already been dereferenced on all paths leading to > >> the check. > >> 1240if (tx_queues == NULL && nb_tx_queues > 0) > >> 1241return -1; > >> > >> Coverity issue: 345029 > >> Coverity issue: 345044 > >> Fixes: da6ba28f0540 ("net/pcap: use a struct to pass user options") > >> Cc: cian.ferri...@intel.com > >> Cc: sta...@dpdk.org > >> > >> Signed-off-by: Kevin Traynor > > > > This patch hides the coverity warning. > > But can't we just remove those checks? > > > > Iiuc, the checks on NULL pointers are unnecessary since > > > https://git.dpdk.org/dpdk/commit/?id=a0fce1193ce13a50c00624cb36605ebe > f7a3e60b. > > > > > > Aside from the Coverity complaint about rxq/txq NULL check after use, the > checks didn't seem to make sense anymore as rxq/txq are part of devargs_all > struct now, so they were removed. > > I added the NULL check on devargs_all to avoid a NULL deference while > getting them instead. The check isn't doing any harm, but looking at the > current paths to this fn. can see devargs_all would not be NULL, so I'm ok to > remove it too. Cian/Ferruh, wdyt? I'm OK to remove this. Looks like it's no longer necessary. Good find David!
Re: [dpdk-dev] [PATCH 1/9] net/pcap: fix argument checks
> -Original Message- > From: Kevin Traynor [mailto:ktray...@redhat.com] > Sent: Tuesday 1 October 2019 13:53 > To: dev@dpdk.org > Cc: Kevin Traynor ; Ferriter, Cian > ; sta...@dpdk.org > Subject: [PATCH 1/9] net/pcap: fix argument checks > > Previously rx/tx_queues were passed into eth_from_pcaps_common() as > ptrs and were checked for being NULL. > > In commit da6ba28f0540 ("net/pcap: use a struct to pass user options") that > changed to pass in a ptr to a pmd_devargs_all which contains the > rx/tx_queues. > > The parameter checking was not updated as part of that commit and coverity > caught that there was still a check if rx/tx_queues were NULL, apparently > after they had been dereferenced. > > Fix that by checking the pmd_devargs_all ptr and removing the NULL checks > on rx/tx_queues. > > 1231struct pmd_devargs *rx_queues = &devargs_all->rx_queues; > 1232struct pmd_devargs *tx_queues = &devargs_all->tx_queues; > 1233const unsigned int nb_rx_queues = rx_queues->num_of_queue; > deref_ptr: Directly dereferencing pointer tx_queues. > 1234const unsigned int nb_tx_queues = tx_queues->num_of_queue; > 1235unsigned int i; > 1236 > 1237/* do some parameter checking */ > CID 345004: Dereference before null check (REVERSE_INULL) > [select issue] > 1238if (rx_queues == NULL && nb_rx_queues > 0) > 1239return -1; > CID 345029 (#1 of 1): Dereference before null check (REVERSE_INULL) > check_after_deref: Null-checking tx_queues suggests that it may be > null, but it has already been dereferenced on all paths leading to > the check. > 1240if (tx_queues == NULL && nb_tx_queues > 0) > 1241return -1; > > Coverity issue: 345029 > Coverity issue: 345044 > Fixes: da6ba28f0540 ("net/pcap: use a struct to pass user options") > Cc: cian.ferri...@intel.com > Cc: sta...@dpdk.org > > Signed-off-by: Kevin Traynor Acked-by: Cian Ferriter > --- > drivers/net/pcap/rte_eth_pcap.c | 19 +++ > 1 file changed, 11 insertions(+), 8 deletions(-) > > diff --git a/drivers/net/pcap/rte_eth_pcap.c > b/drivers/net/pcap/rte_eth_pcap.c index 5489010b6..7cf00306e 100644 > --- a/drivers/net/pcap/rte_eth_pcap.c > +++ b/drivers/net/pcap/rte_eth_pcap.c > @@ -1229,16 +1229,19 @@ eth_from_pcaps_common(struct > rte_vdev_device *vdev, { > struct pmd_process_private *pp; > - struct pmd_devargs *rx_queues = &devargs_all->rx_queues; > - struct pmd_devargs *tx_queues = &devargs_all->tx_queues; > - const unsigned int nb_rx_queues = rx_queues->num_of_queue; > - const unsigned int nb_tx_queues = tx_queues->num_of_queue; > + struct pmd_devargs *rx_queues; > + struct pmd_devargs *tx_queues; > + unsigned int nb_rx_queues; > + unsigned int nb_tx_queues; > unsigned int i; > > - /* do some parameter checking */ > - if (rx_queues == NULL && nb_rx_queues > 0) > - return -1; > - if (tx_queues == NULL && nb_tx_queues > 0) > + if (devargs_all == NULL) > return -1; > > + rx_queues = &devargs_all->rx_queues; > + tx_queues = &devargs_all->tx_queues; > + > + nb_rx_queues = rx_queues->num_of_queue; > + nb_tx_queues = tx_queues->num_of_queue; > + > if (pmd_init_internals(vdev, nb_rx_queues, nb_tx_queues, > internals, > eth_dev) < 0) > -- > 2.21.0
Re: [dpdk-dev] [PATCH 19.08 v2] doc: announce infinite Rx PCAP PMD feature
v2 changes: * added the commit ID to which the announcement is relevant > -Original Message- > From: Ferriter, Cian > Sent: 03 July 2019 12:04 > To: Mcnamara, John ; Kovacevic, Marko > > Cc: dev@dpdk.org; Ferriter, Cian > Subject: [PATCH 19.08 v2] doc: announce infinite Rx PCAP PMD feature > > This feature was added in the following commit: > commit a3f5252e5cbd ("net/pcap: enable infinitely Rx a pcap file") > > Signed-off-by: Cian Ferriter > --- > doc/guides/rel_notes/release_19_08.rst | 6 ++ > 1 file changed, 6 insertions(+) > > diff --git a/doc/guides/rel_notes/release_19_08.rst > b/doc/guides/rel_notes/release_19_08.rst > index 60c258136..c5acfe028 100644 > --- a/doc/guides/rel_notes/release_19_08.rst > +++ b/doc/guides/rel_notes/release_19_08.rst > @@ -116,6 +116,12 @@ New Features >* Enabled need_wakeup feature which can provide efficient support for > case > that application and driver executing on the same core. > > +* **Enabled infinite Rx in the PCAP PMD.** > + > + Added an infinite Rx feature which allows packets in the Rx PCAP of a > + PCAP device to be received repeatedly at a high rate. This can be > + useful for quick performance testing of DPDK apps. > + > * **Updated telemetry library for global metrics support.** > >Updated ``librte_telemetry`` to fetch the global metrics from the > -- > 2.17.1
Re: [dpdk-dev] [PATCH 19.08 v3 2/2] net/pcap: enable infinitely rxing a pcap file
Hi Ferruh, I've responded inline but to summarize, I agree with both of your points. There is one additional piece of rework that I spotted for this version. I have left out a small part of the commit message. Please see my response inline below for more details. Since the rework is minor here, I'd be happy if you wanted to make the changes and apply the patches to save going through another version + review cycle. Let me know if there is anything else. Thanks, Cian > -Original Message- > From: Yigit, Ferruh > Sent: 27 June 2019 18:56 > To: Ferriter, Cian ; Richardson, Bruce > ; Mcnamara, John > ; Kovacevic, Marko > > Cc: dev@dpdk.org > Subject: Re: [PATCH 19.08 v3 2/2] net/pcap: enable infinitely rxing a pcap > file > > On 6/14/2019 3:43 PM, Cian Ferriter wrote: > > It can be useful to use pcap files for some rudimental performance > > Part of the commit message got lost in my splitting of the patch into two patches between v2 and v3. The above line should read: It can be useful to use pcap files for some rudimental performance testing. This patch enables this functionality in the pcap driver. > > At a high level, this works by creaing a ring of sufficient size to > > store the packets in the pcap file passed to the application. When the > > rx function for this mode is called, packets are dequeued from the > > ring for use by the application and also enqueued back on to the ring > > to be "received" again. > > > > A tx_drop mode is also added since transmitting to a tx_pcap file > > isn't desirable at a high traffic rate. > > > > Jumbo frames are not supported in this mode. When filling the ring at > > rx queue setup time, the presence of multi segment mbufs is checked for. > > The PMD will exit on detection of these multi segment mbufs. > > > > Signed-off-by: Cian Ferriter > > <...> > > > @@ -106,6 +106,26 @@ Runtime Config Options > > > > --vdev 'net_pcap0,iface=eth0,phy_mac=1' > > > > +- Use the RX PCAP file to infinitely receive packets > > + > > + In case ``rx_pcap=`` configuration is set, user may want to use the > > + selected PCAP file for rudimental performance testing. This can be done > with a ``devarg`` ``infinite_rx``, for example:: > > + > > + --vdev 'net_pcap0,rx_pcap=file_rx.pcap,infinite_rx=1,tx_drop=1' > > 'tx_drop' seems remaining, it is no more valid. > I missed this, it should be removed. > <...> > > > @@ -1337,6 +1551,9 @@ pmd_pcap_remove(struct rte_vdev_device > *dev) > > eth_dev->data->mac_addrs = NULL; > > } > > > > + if (internals->infinite_rx) > > + eth_dev_close(eth_dev); > > Why 'internals->infinite_rx' check is required to call the 'eth_dev_close()'? > Can we remove the check? Good point, we don't need to check whether the device is in infinite_rx mode, since this is done in eth_dev_close anyway.
Re: [dpdk-dev] [PATCH 19.08 v2] net/pcap: enable infinitely rxing a pcap file
Hi Ferruh, Thanks for the review. I've send a v3 and responded to your comments below. Thanks, Cian > -Original Message- > From: Yigit, Ferruh > Sent: 12 June 2019 15:10 > To: Ferriter, Cian ; Richardson, Bruce > ; Mcnamara, John > ; Kovacevic, Marko > > Cc: dev@dpdk.org > Subject: Re: [PATCH 19.08 v2] net/pcap: enable infinitely rxing a pcap file > > On 6/5/2019 1:46 PM, Ferriter, Cian wrote: > > Adding in my changelog at the top of this, since I forgot to add it in the > original mail: > > > > v2: > > * Rework the method of filling the ring to infinitely rx from > > * Avoids potential huge allocation of mbufs > > * Removes double allocation of mbufs used during queue setup > > * rename count_packets_in_pcaps to count_packets_in_pcap > > * initialize pcap_pkt_count in count_packets_in_pcap > > * use RTE_PMD_REGISTER_PARAM_STRING <0|1> rather than > > * replace calls to rte_panic with proper error returning > > * count rx and tx stat bytes in pcap_rx_infinite and tx_drop > > * make internals->infinite_rx = infinite_rx assignment unconditional > > * add cleanup for infinite_rx in eth_dev_close and pmd_pcap_remove > > * add cleanup when multi seg mbufs are found > > * add some clarifications to the documentation update > > > >> -Original Message- > >> From: Ferriter, Cian > >> Sent: 05 June 2019 12:56 > >> To: Richardson, Bruce ; Yigit, Ferruh > >> ; Mcnamara, John ; > >> Kovacevic, Marko > >> Cc: dev@dpdk.org; Ferriter, Cian > >> Subject: [PATCH 19.08 v2] net/pcap: enable infinitely rxing a pcap > >> file > >> > >> It can be useful to use pcap files for some rudimental performance > testing. > >> This patch enables this functionality in the pcap driver. > >> > >> At a high level, this works by creaing a ring of sufficient size to > >> store the packets in the pcap file passed to the application. When > >> the rx function for this mode is called, packets are dequeued from > >> the ring for use by the application and also enqueued back on to the ring > to be "received" again. > >> > >> A tx_drop mode is also added since transmitting to a tx_pcap file > >> isn't desirable at a high traffic rate. > >> > >> Jumbo frames are not supported in this mode. When filling the ring at > >> rx queue setup time, the presence of multi segment mbufs is checked > for. > >> The PMD will exit on detection of these multi segment mbufs. > >> > >> Signed-off-by: Cian Ferriter > >> --- > >> doc/guides/nics/pcap_ring.rst | 19 +++ > >> drivers/net/pcap/rte_eth_pcap.c | 268 > >> ++-- > >> 2 files changed, 277 insertions(+), 10 deletions(-) > >> > >> diff --git a/doc/guides/nics/pcap_ring.rst > >> b/doc/guides/nics/pcap_ring.rst index c1ef9196b..b272e6fe3 100644 > >> --- a/doc/guides/nics/pcap_ring.rst > >> +++ b/doc/guides/nics/pcap_ring.rst > >> @@ -106,6 +106,25 @@ Runtime Config Options > >> > >> --vdev 'net_pcap0,iface=eth0,phy_mac=1' > >> > >> +- Use the RX PCAP file to infinitely receive packets > >> + > >> + In case ``rx_pcap=`` configuration is set, user may want to use the > >> + selected PCAP file for rudimental performance testing. This can be > >> + done > >> with a ``devarg`` ``infinite_rx``, for example:: > >> + > >> + --vdev 'net_pcap0,rx_pcap=file_rx.pcap,infinite_rx=1,tx_drop=1' > > Can be good to highlight that this flag is not per queue, but should be > provided once (explictly once since code checks it) per Rx. > Added to the docs in the next version. > >> + > >> + When this mode is used, it is recommended to use the ``tx_drop`` > >> ``devarg``. > >> + > >> + This option is device wide, so all queues on a device will either > >> + have this > >> enabled or disabled. > >> + > >> +- Drop all packets on transmit > >> + > >> + The user may want to drop all packets on tx for a device. This can > >> + be done > >> with the ``tx_drop`` ``devarg``, for example:: > >> + > >> + --vdev 'net_pcap0,rx_pcap=file_rx.pcap,tx_drop=1' > >> + > >> + One tx drop queue is created for each rxq on that device. > > Can we drop the ``tx_drop`` completely? > > What happens when no 'tx_pcap' or 'tx_iface' provided at all, to imply the > tx_dro
Re: [dpdk-dev] [PATCH 19.08] net/pcap: enable infinitely rxing a pcap file
Hi Ferruh, Thanks for the review! I've posted the v2 of this patch. My replies are inline Thanks, Cian > -Original Message- > From: Yigit, Ferruh > Sent: 27 May 2019 16:15 > To: Ferriter, Cian ; Richardson, Bruce > ; Mcnamara, John > ; Kovacevic, Marko > > Cc: dev@dpdk.org > Subject: Re: [PATCH 19.08] net/pcap: enable infinitely rxing a pcap file > > On 4/11/2019 12:00 PM, Cian Ferriter wrote: > > It can be useful to use pcap files for some rudimental performance > > testing. This patch enables this functionality in the pcap driver. > > > > At a high level, this works by creaing a ring of sufficient size to > > store the packets in the pcap file passed to the application. When the > > rx function for this mode is called, packets are dequeued from the > > ring for use by the application and also enqueued back on to the ring > > to be "received" again. > > > > A tx_drop mode is also added since transmitting to a tx_pcap file > > isn't desirable at a high traffic rate. > > > > Jumbo frames are not supported in this mode. When filling the ring at > > rx queue setup time, the presence of multi segment mbufs is checked for. > > The PMD will exit on detection of these multi segment mbufs. > > > > Signed-off-by: Cian Ferriter > > --- > > doc/guides/nics/pcap_ring.rst | 15 +++ > > drivers/net/pcap/rte_eth_pcap.c | 209 > > ++-- > > 2 files changed, 215 insertions(+), 9 deletions(-) > > > > diff --git a/doc/guides/nics/pcap_ring.rst > > b/doc/guides/nics/pcap_ring.rst index c1ef9196b..45f55a345 100644 > > --- a/doc/guides/nics/pcap_ring.rst > > +++ b/doc/guides/nics/pcap_ring.rst > > @@ -106,6 +106,21 @@ Runtime Config Options > > > > --vdev 'net_pcap0,iface=eth0,phy_mac=1' > > > > +- Use the RX PCAP file to infinitely receive packets > > + > > + In case ``rx_pcap=`` configuration is set, user may want to use the > > + selected PCAP file for rudimental performance testing. This can be done > with a ``devarg`` ``infinite_rx``, for example:: > > + > > + --vdev 'net_pcap0,rx_pcap=file_rx.pcap,infinite_rx=1,tx_drop=1' > > + > > + When this mode is used, it is recommended to use the ``tx_drop`` > ``devarg``. > > + > > +- Drop all packets on transmit > > + > > + The user may want to drop all packets on tx for a device. This can be done > with the ``tx_drop`` ``devarg``, for example:: > > + > > + --vdev 'net_pcap0,rx_pcap=file_rx.pcap,tx_drop=1' > > Is there a performance drop when tx files are directed to 'loopback' > interface, > like: --vdev 'net_pcap0,rx_pcap=file_rx.pcap,tx_iface=lo' ? > If no performance drop I am for using 'lo' interface instead of adding new > devargs. > While I did think of using /dev/null as method of dropping packets, I didn't think to use the loopback interface. The reason I implemented the tx_drop feature after trying 'tx_pcap=/dev/null' was that the file writes proved too expensive. I've tried out the loopback interface as per your recommendation and it suffers from the same problem as using /dev/null. Below is a comparison of the three options for dropping packets. = tx_drop = CMD: testpmd -l 1,2 --file-prefix=basic -w 0:00.0 --vdev net_pcap0,rx_pcap=/opt/cian/ovs/scripts/pcaps/udp.pcap,infinite_rx=1,tx_drop=1 --vdev net_pcap1,rx_pcap=/opt/cian/ovs/scripts/pcaps/udp.pcap,infinite_rx=1,tx_drop=1 -- --no-flush-rx --rxq=1 --txq=1 --i For each port: Tx-pps: 22680856 = lo = CMD: testpmd -l 1,2 --file-prefix=basic -w 0:00.0 --vdev net_pcap0,rx_pcap=/opt/cian/ovs/scripts/pcaps/udp.pcap,tx_iface=lo,infinite_rx=1 --vdev net_pcap1,rx_pcap=/opt/cian/ovs/scripts/pcaps/udp.pcap,tx_iface=lo,infinite_rx=1 -- --no-flush-rx --rxq=1 --txq=1 --i For each port: Tx-pps: 255507 = /dev/null = CMD: testpmd -l 1,2 --file-prefix=basic -w 0:00.0 --vdev net_pcap0,rx_pcap=/opt/cian/ovs/scripts/pcaps/udp.pcap,tx_pcap=/dev/null,infinite_rx=1 --vdev net_pcap1,rx_pcap=/opt/cian/ovs/scripts/pcaps/udp.pcap,tx_pcap=/dev/null,infinite_rx=1 -- --no-flush-rx --rxq=1 --txq=1 --i For each port: Tx-pps: 3106131 For the case where the loopback or local interface is being used to drop packets, perf top shows that the majority of time is being spent in kernel functions such as tpacket_rcv, do_syscall_64 and kfree_skb. For the case where the /dev/null is being used to drop packets, perf top shows that the majority of time is being spent in libc file operation functions such as _IO_fwrite and _IO_file_xsputn. Tx_drop is 7.3 times faster than /dev/null and ~89 times faster than lo so it's the preferr
Re: [dpdk-dev] [PATCH 19.08 v2] net/pcap: enable infinitely rxing a pcap file
Adding in my changelog at the top of this, since I forgot to add it in the original mail: v2: * Rework the method of filling the ring to infinitely rx from * Avoids potential huge allocation of mbufs * Removes double allocation of mbufs used during queue setup * rename count_packets_in_pcaps to count_packets_in_pcap * initialize pcap_pkt_count in count_packets_in_pcap * use RTE_PMD_REGISTER_PARAM_STRING <0|1> rather than * replace calls to rte_panic with proper error returning * count rx and tx stat bytes in pcap_rx_infinite and tx_drop * make internals->infinite_rx = infinite_rx assignment unconditional * add cleanup for infinite_rx in eth_dev_close and pmd_pcap_remove * add cleanup when multi seg mbufs are found * add some clarifications to the documentation update > -Original Message----- > From: Ferriter, Cian > Sent: 05 June 2019 12:56 > To: Richardson, Bruce ; Yigit, Ferruh > ; Mcnamara, John ; > Kovacevic, Marko > Cc: dev@dpdk.org; Ferriter, Cian > Subject: [PATCH 19.08 v2] net/pcap: enable infinitely rxing a pcap file > > It can be useful to use pcap files for some rudimental performance testing. > This patch enables this functionality in the pcap driver. > > At a high level, this works by creaing a ring of sufficient size to store the > packets in the pcap file passed to the application. When the rx function for > this mode is called, packets are dequeued from the ring for use by the > application and also enqueued back on to the ring to be "received" again. > > A tx_drop mode is also added since transmitting to a tx_pcap file isn't > desirable at a high traffic rate. > > Jumbo frames are not supported in this mode. When filling the ring at rx > queue setup time, the presence of multi segment mbufs is checked for. > The PMD will exit on detection of these multi segment mbufs. > > Signed-off-by: Cian Ferriter > --- > doc/guides/nics/pcap_ring.rst | 19 +++ > drivers/net/pcap/rte_eth_pcap.c | 268 > ++-- > 2 files changed, 277 insertions(+), 10 deletions(-) > > diff --git a/doc/guides/nics/pcap_ring.rst b/doc/guides/nics/pcap_ring.rst > index c1ef9196b..b272e6fe3 100644 > --- a/doc/guides/nics/pcap_ring.rst > +++ b/doc/guides/nics/pcap_ring.rst > @@ -106,6 +106,25 @@ Runtime Config Options > > --vdev 'net_pcap0,iface=eth0,phy_mac=1' > > +- Use the RX PCAP file to infinitely receive packets > + > + In case ``rx_pcap=`` configuration is set, user may want to use the > + selected PCAP file for rudimental performance testing. This can be done > with a ``devarg`` ``infinite_rx``, for example:: > + > + --vdev 'net_pcap0,rx_pcap=file_rx.pcap,infinite_rx=1,tx_drop=1' > + > + When this mode is used, it is recommended to use the ``tx_drop`` > ``devarg``. > + > + This option is device wide, so all queues on a device will either have this > enabled or disabled. > + > +- Drop all packets on transmit > + > + The user may want to drop all packets on tx for a device. This can be done > with the ``tx_drop`` ``devarg``, for example:: > + > + --vdev 'net_pcap0,rx_pcap=file_rx.pcap,tx_drop=1' > + > + One tx drop queue is created for each rxq on that device. > + > Examples of Usage > ^ > > diff --git a/drivers/net/pcap/rte_eth_pcap.c > b/drivers/net/pcap/rte_eth_pcap.c index 10277b9b6..9d239d171 100644 > --- a/drivers/net/pcap/rte_eth_pcap.c > +++ b/drivers/net/pcap/rte_eth_pcap.c > @@ -39,6 +39,8 @@ > #define ETH_PCAP_TX_IFACE_ARG "tx_iface" > #define ETH_PCAP_IFACE_ARG"iface" > #define ETH_PCAP_PHY_MAC_ARG "phy_mac" > +#define ETH_PCAP_INFINITE_RX_ARG "infinite_rx" > +#define ETH_PCAP_TX_DROP_ARG "tx_drop" > > #define ETH_PCAP_ARG_MAXLEN 64 > > @@ -64,6 +66,9 @@ struct pcap_rx_queue { > struct queue_stat rx_stat; > char name[PATH_MAX]; > char type[ETH_PCAP_ARG_MAXLEN]; > + > + /* Contains pre-generated packets to be looped through */ > + struct rte_ring *pkts; > }; > > struct pcap_tx_queue { > @@ -82,6 +87,7 @@ struct pmd_internals { > int if_index; > int single_iface; > int phy_mac; > + int infinite_rx; > }; > > struct pmd_process_private { > @@ -109,6 +115,8 @@ static const char *valid_arguments[] = { > ETH_PCAP_TX_IFACE_ARG, > ETH_PCAP_IFACE_ARG, > ETH_PCAP_PHY_MAC_ARG, > + ETH_PCAP_INFINITE_RX_ARG, > + ETH_PCAP_TX_DROP_ARG, > NULL > }; > > @@ -178,6 +186,43 @@ eth_pcap_gather_data(unsigned char *data, struct > rte_mbuf *mbuf) > } > } > > +static uint16_t > +eth_pcap_rx_
[dpdk-dev] [PATCH] lib/librte_ether: change socket_id passed to rte_memzone_reserve
Hey guys, I just wanted to ask is there anything more that can be done with this patch or is it in an acceptable state for pushing? Cian -Original Message- From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Ferriter, Cian Sent: Monday, January 19, 2015 6:39 PM To: Richardson, Bruce Cc: dev at dpdk.org Subject: Re: [dpdk-dev] [PATCH] lib/librte_ether: change socket_id passed to rte_memzone_reserve I would be happy with the original suggestion. If the ethdev data for a port in use is in cache it removes the performance concern associated the current setup and my fix. The original suggestion also fixes the crash that I was seeing because of memory being reserved from a numa node with no "--socket-mem" allocated. Cian -Original Message- From: Richardson, Bruce Sent: Wednesday, January 14, 2015 10:10 AM To: Ferriter, Cian Cc: dev at dpdk.org Subject: Re: [dpdk-dev] [PATCH] lib/librte_ether: change socket_id passed to rte_memzone_reserve On Tue, Jan 13, 2015 at 06:05:25PM +, Ferriter, Cian wrote: > Comments on alternative solutions: > 1) how would this solution work when there is no NIC present, and > "rte_eth_from_rings" is called? Here, could you have an else where the socket > id of the master core is passed to the "memzone_reserve"? > 2) how would you advise making this change? I have looked at where > "rte_eth_dev_allocate" is being called and in all but one case, there is a > "numa_id" that could be passed in. This isn't the case for " > rte_eth_dev_init" however, is there an easy solution for this? Would there > now need to be an "rte_eth_dev_data" struct for each socket that there is a > NIC attached to, reserving memory from that socket? > > Cian While I think the issues you highlight can probably be overcome, I'm not so sure any more how much it matters what numa node this is allocated on. The ethdev data for any port in use by a port should be in the cache. In that case, if it doesn't matter, your original suggestion would work fine. /Bruce > > -Original Message- > From: Richardson, Bruce > Sent: Tuesday, January 13, 2015 1:56 PM > To: Ferriter, Cian > Cc: dev at dpdk.org > Subject: Re: [dpdk-dev] [PATCH] lib/librte_ether: change socket_id > passed to rte_memzone_reserve > > On Tue, Jan 13, 2015 at 09:23:16AM +, Ferriter, Cian wrote: > > Passing a socket id of "rte_socket_id()" can cause problems in non DPDK > > applications as there is a dependency on the current logical core we are > > running on. > > Passing " rte_lcore_to_socket_id(rte_get_master_lcore())" as the socket id > > to rte_memzone_reserve resolves these issues as the master lcore doesn't > > change. > > > > The only trouble is that when affinitizing the memory for the NICs to the > socket of the master lcore, it gives us no way to correctly configure an app > to use NICs connected to two different sockets on the one system. All memory > for all NICs will end up on the same socket. Two possible alternative > solutions: > 1) affinitize memory to the socket the NIC is connected to > 2) add a socket parameter to the API calls to allow the user complete > control over their memory allocations > > Obviously the second one breaks backward compatibility (assume we modify > existing API call), but is more powerful. > > Thoughts? > > /Bruce > > > -Original Message- > > From: Ferriter, Cian > > Sent: Tuesday, January 13, 2015 9:22 AM > > To: dev at dpdk.org > > Cc: Ferriter, Cian > > Subject: [PATCH] lib/librte_ether: change socket_id passed to > > rte_memzone_reserve > > > > Change the socket id that is passed to rte_memzone_reserve from the socket > > id of current logical core to the socket id of the master_lcore. > > --- > > lib/librte_ether/rte_ethdev.c |2 +- > > 1 files changed, 1 insertions(+), 1 deletions(-) mode change > > 100644 => 100755 lib/librte_ether/rte_ethdev.c > > > > diff --git a/lib/librte_ether/rte_ethdev.c > > b/lib/librte_ether/rte_ethdev.c old mode 100644 new mode 100755 > > index 95f2ceb..835540d > > --- a/lib/librte_ether/rte_ethdev.c > > +++ b/lib/librte_ether/rte_ethdev.c > > @@ -184,7 +184,7 @@ rte_eth_dev_data_alloc(void) > > if (rte_eal_process_type() == RTE_PROC_PRIMARY){ > > mz = rte_memzone_reserve(MZ_RTE_ETH_DEV_DATA, > > RTE_MAX_ETHPORTS * sizeof(*rte_eth_dev_data), > > - rte_socket_id(), flags); > > + rte_lcore_to_socket_id(rte_get_master_lcore()), > > flags); > > } else > > mz = rte_memzone_lookup(MZ_RTE_ETH_DEV_DATA); > > if (mz == NULL) > > -- > > 1.7.4.1 > >
[dpdk-dev] [PATCH] lib/librte_ether: change socket_id passed to rte_memzone_reserve
I would be happy with the original suggestion. If the ethdev data for a port in use is in cache it removes the performance concern associated the current setup and my fix. The original suggestion also fixes the crash that I was seeing because of memory being reserved from a numa node with no "--socket-mem" allocated. Cian -Original Message- From: Richardson, Bruce Sent: Wednesday, January 14, 2015 10:10 AM To: Ferriter, Cian Cc: dev at dpdk.org Subject: Re: [dpdk-dev] [PATCH] lib/librte_ether: change socket_id passed to rte_memzone_reserve On Tue, Jan 13, 2015 at 06:05:25PM +, Ferriter, Cian wrote: > Comments on alternative solutions: > 1) how would this solution work when there is no NIC present, and > "rte_eth_from_rings" is called? Here, could you have an else where the socket > id of the master core is passed to the "memzone_reserve"? > 2) how would you advise making this change? I have looked at where > "rte_eth_dev_allocate" is being called and in all but one case, there is a > "numa_id" that could be passed in. This isn't the case for " > rte_eth_dev_init" however, is there an easy solution for this? Would there > now need to be an "rte_eth_dev_data" struct for each socket that there is a > NIC attached to, reserving memory from that socket? > > Cian While I think the issues you highlight can probably be overcome, I'm not so sure any more how much it matters what numa node this is allocated on. The ethdev data for any port in use by a port should be in the cache. In that case, if it doesn't matter, your original suggestion would work fine. /Bruce > > -Original Message- > From: Richardson, Bruce > Sent: Tuesday, January 13, 2015 1:56 PM > To: Ferriter, Cian > Cc: dev at dpdk.org > Subject: Re: [dpdk-dev] [PATCH] lib/librte_ether: change socket_id > passed to rte_memzone_reserve > > On Tue, Jan 13, 2015 at 09:23:16AM +, Ferriter, Cian wrote: > > Passing a socket id of "rte_socket_id()" can cause problems in non DPDK > > applications as there is a dependency on the current logical core we are > > running on. > > Passing " rte_lcore_to_socket_id(rte_get_master_lcore())" as the socket id > > to rte_memzone_reserve resolves these issues as the master lcore doesn't > > change. > > > > The only trouble is that when affinitizing the memory for the NICs to the > socket of the master lcore, it gives us no way to correctly configure an app > to use NICs connected to two different sockets on the one system. All memory > for all NICs will end up on the same socket. Two possible alternative > solutions: > 1) affinitize memory to the socket the NIC is connected to > 2) add a socket parameter to the API calls to allow the user complete > control over their memory allocations > > Obviously the second one breaks backward compatibility (assume we modify > existing API call), but is more powerful. > > Thoughts? > > /Bruce > > > -Original Message- > > From: Ferriter, Cian > > Sent: Tuesday, January 13, 2015 9:22 AM > > To: dev at dpdk.org > > Cc: Ferriter, Cian > > Subject: [PATCH] lib/librte_ether: change socket_id passed to > > rte_memzone_reserve > > > > Change the socket id that is passed to rte_memzone_reserve from the socket > > id of current logical core to the socket id of the master_lcore. > > --- > > lib/librte_ether/rte_ethdev.c |2 +- > > 1 files changed, 1 insertions(+), 1 deletions(-) mode change > > 100644 => 100755 lib/librte_ether/rte_ethdev.c > > > > diff --git a/lib/librte_ether/rte_ethdev.c > > b/lib/librte_ether/rte_ethdev.c old mode 100644 new mode 100755 > > index 95f2ceb..835540d > > --- a/lib/librte_ether/rte_ethdev.c > > +++ b/lib/librte_ether/rte_ethdev.c > > @@ -184,7 +184,7 @@ rte_eth_dev_data_alloc(void) > > if (rte_eal_process_type() == RTE_PROC_PRIMARY){ > > mz = rte_memzone_reserve(MZ_RTE_ETH_DEV_DATA, > > RTE_MAX_ETHPORTS * sizeof(*rte_eth_dev_data), > > - rte_socket_id(), flags); > > + rte_lcore_to_socket_id(rte_get_master_lcore()), > > flags); > > } else > > mz = rte_memzone_lookup(MZ_RTE_ETH_DEV_DATA); > > if (mz == NULL) > > -- > > 1.7.4.1 > >
[dpdk-dev] [PATCH] lib/librte_ether: change socket_id passed to rte_memzone_reserve
Comments on alternative solutions: 1) how would this solution work when there is no NIC present, and "rte_eth_from_rings" is called? Here, could you have an else where the socket id of the master core is passed to the "memzone_reserve"? 2) how would you advise making this change? I have looked at where "rte_eth_dev_allocate" is being called and in all but one case, there is a "numa_id" that could be passed in. This isn't the case for " rte_eth_dev_init" however, is there an easy solution for this? Would there now need to be an "rte_eth_dev_data" struct for each socket that there is a NIC attached to, reserving memory from that socket? Cian -Original Message- From: Richardson, Bruce Sent: Tuesday, January 13, 2015 1:56 PM To: Ferriter, Cian Cc: dev at dpdk.org Subject: Re: [dpdk-dev] [PATCH] lib/librte_ether: change socket_id passed to rte_memzone_reserve On Tue, Jan 13, 2015 at 09:23:16AM +, Ferriter, Cian wrote: > Passing a socket id of "rte_socket_id()" can cause problems in non DPDK > applications as there is a dependency on the current logical core we are > running on. > Passing " rte_lcore_to_socket_id(rte_get_master_lcore())" as the socket id to > rte_memzone_reserve resolves these issues as the master lcore doesn't change. > The only trouble is that when affinitizing the memory for the NICs to the socket of the master lcore, it gives us no way to correctly configure an app to use NICs connected to two different sockets on the one system. All memory for all NICs will end up on the same socket. Two possible alternative solutions: 1) affinitize memory to the socket the NIC is connected to 2) add a socket parameter to the API calls to allow the user complete control over their memory allocations Obviously the second one breaks backward compatibility (assume we modify existing API call), but is more powerful. Thoughts? /Bruce > -Original Message----- > From: Ferriter, Cian > Sent: Tuesday, January 13, 2015 9:22 AM > To: dev at dpdk.org > Cc: Ferriter, Cian > Subject: [PATCH] lib/librte_ether: change socket_id passed to > rte_memzone_reserve > > Change the socket id that is passed to rte_memzone_reserve from the socket id > of current logical core to the socket id of the master_lcore. > --- > lib/librte_ether/rte_ethdev.c |2 +- > 1 files changed, 1 insertions(+), 1 deletions(-) mode change 100644 > => 100755 lib/librte_ether/rte_ethdev.c > > diff --git a/lib/librte_ether/rte_ethdev.c > b/lib/librte_ether/rte_ethdev.c old mode 100644 new mode 100755 index > 95f2ceb..835540d > --- a/lib/librte_ether/rte_ethdev.c > +++ b/lib/librte_ether/rte_ethdev.c > @@ -184,7 +184,7 @@ rte_eth_dev_data_alloc(void) > if (rte_eal_process_type() == RTE_PROC_PRIMARY){ > mz = rte_memzone_reserve(MZ_RTE_ETH_DEV_DATA, > RTE_MAX_ETHPORTS * sizeof(*rte_eth_dev_data), > - rte_socket_id(), flags); > + rte_lcore_to_socket_id(rte_get_master_lcore()), > flags); > } else > mz = rte_memzone_lookup(MZ_RTE_ETH_DEV_DATA); > if (mz == NULL) > -- > 1.7.4.1 >
[dpdk-dev] [PATCH] lib/librte_ether: change socket_id passed to rte_memzone_reserve
Passing a socket id of "rte_socket_id()" can cause problems in non DPDK applications as there is a dependency on the current logical core we are running on. Passing " rte_lcore_to_socket_id(rte_get_master_lcore())" as the socket id to rte_memzone_reserve resolves these issues as the master lcore doesn't change. -Original Message- From: Ferriter, Cian Sent: Tuesday, January 13, 2015 9:22 AM To: dev at dpdk.org Cc: Ferriter, Cian Subject: [PATCH] lib/librte_ether: change socket_id passed to rte_memzone_reserve Change the socket id that is passed to rte_memzone_reserve from the socket id of current logical core to the socket id of the master_lcore. --- lib/librte_ether/rte_ethdev.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) mode change 100644 => 100755 lib/librte_ether/rte_ethdev.c diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c old mode 100644 new mode 100755 index 95f2ceb..835540d --- a/lib/librte_ether/rte_ethdev.c +++ b/lib/librte_ether/rte_ethdev.c @@ -184,7 +184,7 @@ rte_eth_dev_data_alloc(void) if (rte_eal_process_type() == RTE_PROC_PRIMARY){ mz = rte_memzone_reserve(MZ_RTE_ETH_DEV_DATA, RTE_MAX_ETHPORTS * sizeof(*rte_eth_dev_data), - rte_socket_id(), flags); + rte_lcore_to_socket_id(rte_get_master_lcore()), flags); } else mz = rte_memzone_lookup(MZ_RTE_ETH_DEV_DATA); if (mz == NULL) -- 1.7.4.1