[dpdk-dev] [PATCH] testpmd: modify the mac of csum forwarding
On 2015/8/7 13:37, Zhang, Helin wrote:
>
>> -----Original Message-----
>> From: Qiu, Michael
>> Sent: Friday, August 7, 2015 11:53 AM
>> To: Zhang, Helin; dev at dpdk.org
>> Subject: Re: [dpdk-dev] [PATCH] testpmd: modify the mac of csum forwarding
>>
>> On 2015/8/7 9:06, Zhang, Helin wrote:
>>>> -----Original Message-----
>>>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Michael Qiu
>>>> Sent: Thursday, August 6, 2015 8:29 PM
>>>> To: dev at dpdk.org
>>>> Subject: [dpdk-dev] [PATCH] testpmd: modify the mac of csum forwarding
>>>>
>>>> For some Ethernet switches like Intel RRC, all the packets forwarded out
>>>> by DPDK will be dropped on the switch side, so the packet generator will
>>>> never receive the packets.
>>> Is it because of anti-spoofing? E.g. when the hardware finds that the
>>> dest MAC is the port itself, the packet will be dropped during TX.
>>> You need to state the root cause, and why we need this modification.
>> Actually, it is not the hardware on the PEP (PCI End Point) side, but the
>> switch side.
>>
>> The TX is OK for DPDK and the NIC, but the switch receives the packet and
>> tries to forward it, and the dest MAC is the same as that of the NIC which
>> transmitted this packet. So the switch will drop it as "Loopback
>> Suppression Drop" in RRC. This should only happen when the switch forwards
>> packets by dest MAC.
>>
>>>> Signed-off-by: Michael Qiu
>>>> ---
>>>>  app/test-pmd/csumonly.c | 4 ++++
>>>>  1 file changed, 4 insertions(+)
>>>>
>>>> diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
>>>> index 1bf3485..bf8af1d 100644
>>>> --- a/app/test-pmd/csumonly.c
>>>> +++ b/app/test-pmd/csumonly.c
>>>> @@ -550,6 +550,10 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
>>>>  		 * and inner headers */
>>>>
>>>>  		eth_hdr = rte_pktmbuf_mtod(m, struct ether_hdr *);
>>>> +		ether_addr_copy(&peer_eth_addrs[fs->peer_addr],
>>>> +				&eth_hdr->d_addr);
>>>> +		ether_addr_copy(&ports[fs->tx_port].eth_addr,
>>>> +				&eth_hdr->s_addr);
>>> Is it really necessary? Why do other NICs not need this?
>> Because other NICs are connected directly to the packet generator. If we
>> use a switch to connect the generator and the NICs, I think it will need
>> this.
> There are 'iofwd' and 'mac' modes in testpmd, and mac forwarding will
> modify the dest MAC before transmitting the packet. They are for
> different cases.
> Why not use mac forwarding mode for your testing, and just keep it as is?

Yes, I don't touch iofwd; I only modify csum, for when we test checksum
offload, especially checksum insertion on the TX side.

Thanks,
Michael

> Regards,
> Helin
>
>> Thanks,
>> Michael
>>>>  		parse_ethernet(eth_hdr, &info);
>>>>  		l3_hdr = (char *)eth_hdr + info.l2_len;
>>>>
>>>> --
>>>> 1.9.3
>
[dpdk-dev] [PATCH] testpmd: modify the mac of csum forwarding
> -----Original Message-----
> From: Qiu, Michael
> Sent: Friday, August 7, 2015 11:53 AM
> To: Zhang, Helin; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] testpmd: modify the mac of csum forwarding
>
> On 2015/8/7 9:06, Zhang, Helin wrote:
> >
> >> -----Original Message-----
> >> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Michael Qiu
> >> Sent: Thursday, August 6, 2015 8:29 PM
> >> To: dev at dpdk.org
> >> Subject: [dpdk-dev] [PATCH] testpmd: modify the mac of csum
> >> forwarding
> >>
> >> For some Ethernet switches like Intel RRC, all the packets forwarded out
> >> by DPDK will be dropped on the switch side, so the packet generator will
> >> never receive the packets.
> > Is it because of anti-spoofing? E.g. when the hardware finds that the
> > dest MAC is the port itself, the packet will be dropped during TX.
> > You need to state the root cause, and why we need this modification.
>
> Actually, it is not the hardware on the PEP (PCI End Point) side, but the
> switch side.
>
> The TX is OK for DPDK and the NIC, but the switch receives the packet and
> tries to forward it, and the dest MAC is the same as that of the NIC which
> transmitted this packet. So the switch will drop it as "Loopback
> Suppression Drop" in RRC. This should only happen when the switch forwards
> packets by dest MAC.
>
> >> Signed-off-by: Michael Qiu
> >> ---
> >>  app/test-pmd/csumonly.c | 4 ++++
> >>  1 file changed, 4 insertions(+)
> >>
> >> diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
> >> index 1bf3485..bf8af1d 100644
> >> --- a/app/test-pmd/csumonly.c
> >> +++ b/app/test-pmd/csumonly.c
> >> @@ -550,6 +550,10 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
> >>  		 * and inner headers */
> >>
> >>  		eth_hdr = rte_pktmbuf_mtod(m, struct ether_hdr *);
> >> +		ether_addr_copy(&peer_eth_addrs[fs->peer_addr],
> >> +				&eth_hdr->d_addr);
> >> +		ether_addr_copy(&ports[fs->tx_port].eth_addr,
> >> +				&eth_hdr->s_addr);
> > Is it really necessary? Why do other NICs not need this?
>
> Because other NICs are connected directly to the packet generator. If we
> use a switch to connect the generator and the NICs, I think it will need
> this.

There are 'iofwd' and 'mac' modes in testpmd, and mac forwarding will modify
the dest MAC before transmitting the packet. They are for different cases.
Why not use mac forwarding mode for your testing, and just keep it as is?

Regards,
Helin

> Thanks,
> Michael
>
> >>  		parse_ethernet(eth_hdr, &info);
> >>  		l3_hdr = (char *)eth_hdr + info.l2_len;
> >>
> >> --
> >> 1.9.3
> >
[dpdk-dev] vhost-switch example: huge memory need and CRC off-loading issue
Hi again,

two findings in the vhost-switch example code that can cause grey hair
for starters:

- MAX_QUEUES of 512 causes a pretty high memory need for the application
  (something between 1 and 2G) - is that really needed? I'm now running
  with 32, and I'm able to get away with 256M. Can we tune this default?

- hw_strip_crc is set to 0, but either the igb driver or the ET2 quad
  port adapter I'm using is ignoring this. It does strip the CRC, so
  does software, and I'm losing 4 bytes on each unpadded packet. Known
  issue?

Thanks,
Jan

--
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux
[dpdk-dev] [PATCH] vhost: Notify application of ownership change
On VHOST_*_RESET_OWNER, we reinitialize the device but without telling
the application. That will cause crashes when it continues to invoke
vhost services on the device. Fix it by calling the destruction hook if
the device is still in use.

Signed-off-by: Jan Kiszka
---

This is the surprisingly simple answer to my questions in
http://thread.gmane.org/gmane.comp.networking.dpdk.devel/22661.

 lib/librte_vhost/virtio-net.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/lib/librte_vhost/virtio-net.c b/lib/librte_vhost/virtio-net.c
index b520ec5..3c5b5b2 100644
--- a/lib/librte_vhost/virtio-net.c
+++ b/lib/librte_vhost/virtio-net.c
@@ -402,6 +402,9 @@ reset_owner(struct vhost_device_ctx ctx)

 	ll_dev = get_config_ll_entry(ctx);

+	if ((ll_dev->dev.flags & VIRTIO_DEV_RUNNING))
+		notify_ops->destroy_device(&ll_dev->dev);
+
 	cleanup_device(&ll_dev->dev);
 	init_device(&ll_dev->dev);

--
2.1.4
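For readers following the thread, here is a minimal sketch of the "notify before reset" ordering the patch describes. It is an illustration only: the example_* types, the flag value, and the callback table are hypothetical stand-ins, not librte_vhost definitions, and the reset step is reduced to clearing two fields.

```c
#include <stdint.h>

/* Hypothetical stand-in for the "device is in use" flag. */
#define EXAMPLE_DEV_RUNNING 0x1u

struct example_dev {
    uint32_t flags;
    int ring_ready;     /* stands in for the virtqueue state */
};

/* Hypothetical callback table registered by the application. */
struct example_ops {
    void (*destroy_device)(struct example_dev *dev);
};

/* Counts how often the application was told to let go of a device. */
int example_destroy_calls;

/* Application side: stop polling the device once it is going away. */
void example_app_destroy(struct example_dev *dev)
{
    dev->flags &= ~EXAMPLE_DEV_RUNNING;
    example_destroy_calls++;
}

/* Library side: tell the application first, then reinitialize. */
void example_reset_owner(struct example_dev *dev,
                         const struct example_ops *ops)
{
    if (dev->flags & EXAMPLE_DEV_RUNNING)
        ops->destroy_device(dev);

    /* stands in for the cleanup and re-init of the device */
    dev->ring_ready = 0;
    dev->flags = 0;
}
```

The point of the ordering is that the application never sees a device whose rings were torn down while it still believes the device is running. A second reset on an idle device does not call the hook again.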
[dpdk-dev] [PATCH] testpmd: modify the mac of csum forwarding
On 2015/8/7 9:06, Zhang, Helin wrote:
>
>> -----Original Message-----
>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Michael Qiu
>> Sent: Thursday, August 6, 2015 8:29 PM
>> To: dev at dpdk.org
>> Subject: [dpdk-dev] [PATCH] testpmd: modify the mac of csum forwarding
>>
>> For some Ethernet switches like Intel RRC, all the packets forwarded out
>> by DPDK will be dropped on the switch side, so the packet generator will
>> never receive the packets.
> Is it because of anti-spoofing? E.g. when the hardware finds that the dest
> MAC is the port itself, the packet will be dropped during TX.
> You need to state the root cause, and why we need this modification.

Actually, it is not the hardware on the PEP (PCI End Point) side, but the
switch side.

The TX is OK for DPDK and the NIC, but the switch receives the packet and
tries to forward it, and the dest MAC is the same as that of the NIC which
transmitted this packet. So the switch will drop it as "Loopback
Suppression Drop" in RRC. This should only happen when the switch forwards
packets by dest MAC.

>
>> Signed-off-by: Michael Qiu
>> ---
>>  app/test-pmd/csumonly.c | 4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
>> index 1bf3485..bf8af1d 100644
>> --- a/app/test-pmd/csumonly.c
>> +++ b/app/test-pmd/csumonly.c
>> @@ -550,6 +550,10 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
>>  		 * and inner headers */
>>
>>  		eth_hdr = rte_pktmbuf_mtod(m, struct ether_hdr *);
>> +		ether_addr_copy(&peer_eth_addrs[fs->peer_addr],
>> +				&eth_hdr->d_addr);
>> +		ether_addr_copy(&ports[fs->tx_port].eth_addr,
>> +				&eth_hdr->s_addr);
> Is it really necessary? Why do other NICs not need this?

Because other NICs are connected directly to the packet generator. If we
use a switch to connect the generator and the NICs, I think it will need
this.

Thanks,
Michael

>
>>  		parse_ethernet(eth_hdr, &info);
>>  		l3_hdr = (char *)eth_hdr + info.l2_len;
>>
>> --
>> 1.9.3
>
[dpdk-dev] [PATCH] testpmd: modify the mac of csum forwarding
On 2015/8/7 9:05, De Lara Guarch, Pablo wrote:
> Hi Michael,
>
>> -----Original Message-----
>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Michael Qiu
>> Sent: Friday, August 07, 2015 4:29 AM
>> To: dev at dpdk.org
>> Subject: [dpdk-dev] [PATCH] testpmd: modify the mac of csum forwarding
>>
>> For some Ethernet switches like Intel RRC, all the packets forwarded
>> out by DPDK will be dropped on the switch side, so the packet
>> generator will never receive the packets.
>>
>> Signed-off-by: Michael Qiu
>> ---
>>  app/test-pmd/csumonly.c | 4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
>> index 1bf3485..bf8af1d 100644
>> --- a/app/test-pmd/csumonly.c
>> +++ b/app/test-pmd/csumonly.c
>> @@ -550,6 +550,10 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
>>  		 * and inner headers */
>>
>>  		eth_hdr = rte_pktmbuf_mtod(m, struct ether_hdr *);
>> +		ether_addr_copy(&peer_eth_addrs[fs->peer_addr],
>> +				&eth_hdr->d_addr);
>> +		ether_addr_copy(&ports[fs->tx_port].eth_addr,
>> +				&eth_hdr->s_addr);
>>  		parse_ethernet(eth_hdr, &info);
>>  		l3_hdr = (char *)eth_hdr + info.l2_len;
>>
>> --
>> 1.9.3
> Why do you make this change only in this mode? If NICs like RRC have this
> issue, I assume it would happen in other modes.

Yes, exactly, but for iofwd, if we change the MAC then the mode itself is
changed, am I right?

Thanks,
Michael

> Thanks,
> Pablo
>
[dpdk-dev] [PATCH] ethdev: Fix illegal access of rte_eth_dev_is_detachable()
To obtain the detachable flag, pci_drv is accessed in
rte_eth_dev_is_detachable(). But pci_drv is only valid if the port is
enabled. To avoid an illegal access, check rte_eth_dev_is_valid_port()
before accessing it.

Signed-off-by: Tetsuya Mukawa
---
 lib/librte_ether/rte_ethdev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 5fe1906..6b2400c 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -505,7 +505,7 @@ rte_eth_dev_is_detachable(uint8_t port_id)
 {
 	uint32_t drv_flags;

-	if (port_id >= RTE_MAX_ETHPORTS) {
+	if (!rte_eth_dev_is_valid_port(port_id)) {
 		PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
 		return -EINVAL;
 	}
--
2.1.4
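A small sketch of why a range check alone is not enough, assuming a port table in which the driver pointer is only set while a port is attached. The example_* names and the table layout are hypothetical and are not the rte_ethdev structures; the sketch only mirrors the shape of the fix.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_MAX_PORTS 4

struct example_driver {
    uint32_t drv_flags;
};

struct example_port {
    int attached;                       /* nonzero while the port is in use */
    const struct example_driver *drv;   /* only valid while attached */
};

static struct example_port example_ports[EXAMPLE_MAX_PORTS];

/* In range AND attached: the range check alone would accept empty slots. */
int example_port_is_valid(uint8_t port_id)
{
    return port_id < EXAMPLE_MAX_PORTS && example_ports[port_id].attached;
}

/* Reads the driver flags only after the validity check has passed. */
int example_get_drv_flags(uint8_t port_id, uint32_t *flags)
{
    if (!example_port_is_valid(port_id))
        return -EINVAL;

    *flags = example_ports[port_id].drv->drv_flags;
    return 0;
}
```

With only `port_id < EXAMPLE_MAX_PORTS`, an in-range but unattached slot would reach the `drv` dereference with a NULL pointer, which is the kind of access the patch guards against.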
[dpdk-dev] [PATCH 2/2] mem: fix freeing an IVSHMEM memzone
There is no sync between host and guest to allow removal of memzones,
and freeing them results in undefined behavior.

In the guest, we identify IVSHMEM memsegs/memzones by having
ioremap_addr != 0. In the host, nothing is done to the memzone, meaning
ioremap_addr == 0.

As a solution, mark memzones being added to IVSHMEM in the host, by
setting ioremap_addr, then return an error whenever we try to free an
IVSHMEM memzone.

Fixes: ff909fe21f0 ("mem: introduce memzone freeing")

Signed-off-by: Sergio Gonzalez Monroy
---
 lib/librte_eal/common/eal_common_memzone.c  |  8 ++++++++
 lib/librte_eal/common/include/rte_memzone.h |  4 +++-
 lib/librte_ivshmem/rte_ivshmem.c            | 15 +++++++++++++++
 3 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c
index 7b1d77e..febc56b 100644
--- a/lib/librte_eal/common/eal_common_memzone.c
+++ b/lib/librte_eal/common/eal_common_memzone.c
@@ -322,6 +322,14 @@ rte_memzone_free(const struct rte_memzone *mz)
 	idx = idx / sizeof(struct rte_memzone);

 	addr = mcfg->memzone[idx].addr;
+#ifdef RTE_LIBRTE_IVSHMEM
+	/*
+	 * If ioremap_addr is set, it's an IVSHMEM memzone and we cannot
+	 * free it.
+	 */
+	if (mcfg->memzone[idx].ioremap_addr != 0)
+		ret = -EINVAL;
+#endif
 	if (addr == NULL)
 		ret = -EINVAL;
 	else if (mcfg->memzone_cnt == 0) {
diff --git a/lib/librte_eal/common/include/rte_memzone.h b/lib/librte_eal/common/include/rte_memzone.h
index 38e5f5b..6a9 100644
--- a/lib/librte_eal/common/include/rte_memzone.h
+++ b/lib/librte_eal/common/include/rte_memzone.h
@@ -258,10 +258,12 @@ const struct rte_memzone *rte_memzone_reserve_bounded(const char *name,
 /**
  * Free a memzone.
  *
+ * Note: an IVSHMEM zone cannot be freed.
+ *
  * @param mz
  *   A pointer to the memzone
  * @return
- *  -EINVAL - invalid parameter
+ *  -EINVAL - invalid parameter, IVSHMEM memzone.
  *  0 - success
  */
 int rte_memzone_free(const struct rte_memzone *mz);
diff --git a/lib/librte_ivshmem/rte_ivshmem.c b/lib/librte_ivshmem/rte_ivshmem.c
index 9621906..8fc4b57 100644
--- a/lib/librte_ivshmem/rte_ivshmem.c
+++ b/lib/librte_ivshmem/rte_ivshmem.c
@@ -504,7 +504,22 @@ add_memzone_to_metadata(const struct rte_memzone * mz,
 				config->metadata->name);
 		goto fail;
 	}
+#ifdef RTE_LIBRTE_IVSHMEM
+	struct rte_mem_config *mcfg;
+	unsigned int idx;

+	mcfg = rte_eal_get_configuration()->mem_config;
+
+	rte_rwlock_write_lock(&mcfg->mlock);
+
+	idx = ((uintptr_t)mz - (uintptr_t)mcfg->memzone);
+	idx = idx / sizeof(struct rte_memzone);
+
+	/* mark the memzone not freeable */
+	mcfg->memzone[idx].ioremap_addr = mz->phys_addr;
+
+	rte_rwlock_write_unlock(&mcfg->mlock);
+#endif
 	rte_spinlock_unlock(&config->sl);
 	return 0;
 fail:
--
1.9.3
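The marking scheme can be sketched in isolation as follows. This is a simplified illustration under stated assumptions: the example_* table, its size, and the early-return guard are hypothetical, locking is left out, and the real code works on the shared rte_mem_config instead.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_MAX_ZONES 8

struct example_zone {
    void *addr;
    uint64_t ioremap_addr;  /* nonzero marks a zone shared over IVSHMEM */
};

static struct example_zone example_zones[EXAMPLE_MAX_ZONES];

/* Recover the table index from the zone pointer, as the patch does. */
unsigned int example_zone_index(const struct example_zone *z)
{
    uintptr_t off = (uintptr_t)z - (uintptr_t)example_zones;

    return (unsigned int)(off / sizeof(struct example_zone));
}

/* Refuse to free a marked zone; otherwise release the slot. */
int example_zone_free(const struct example_zone *z)
{
    unsigned int idx = example_zone_index(z);

    if (example_zones[idx].ioremap_addr != 0)
        return -EINVAL;     /* shared zone: the peer may still use it */
    if (example_zones[idx].addr == NULL)
        return -EINVAL;     /* slot is not in use */

    example_zones[idx].addr = NULL;
    return 0;
}
```

The sketch returns as soon as it sees the mark, so none of the later branches can touch a shared zone. That is a design choice of this illustration, not a statement about how the posted patch is structured.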
[dpdk-dev] [PATCH] testpmd: modify the mac of csum forwarding
> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Michael Qiu
> Sent: Thursday, August 6, 2015 8:29 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH] testpmd: modify the mac of csum forwarding
>
> For some Ethernet switches like Intel RRC, all the packets forwarded out
> by DPDK will be dropped on the switch side, so the packet generator will
> never receive the packets.

Is it because of anti-spoofing? E.g. when the hardware finds that the dest
MAC is the port itself, the packet will be dropped during TX.
You need to state the root cause, and why we need this modification.

>
> Signed-off-by: Michael Qiu
> ---
>  app/test-pmd/csumonly.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
> index 1bf3485..bf8af1d 100644
> --- a/app/test-pmd/csumonly.c
> +++ b/app/test-pmd/csumonly.c
> @@ -550,6 +550,10 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
>  		 * and inner headers */
>
>  		eth_hdr = rte_pktmbuf_mtod(m, struct ether_hdr *);
> +		ether_addr_copy(&peer_eth_addrs[fs->peer_addr],
> +				&eth_hdr->d_addr);
> +		ether_addr_copy(&ports[fs->tx_port].eth_addr,
> +				&eth_hdr->s_addr);

Is it really necessary? Why do other NICs not need this?

>  		parse_ethernet(eth_hdr, &info);
>  		l3_hdr = (char *)eth_hdr + info.l2_len;
>
> --
> 1.9.3
[dpdk-dev] [PATCH] testpmd: modify the mac of csum forwarding
Hi Michael,

> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Michael Qiu
> Sent: Friday, August 07, 2015 4:29 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH] testpmd: modify the mac of csum forwarding
>
> For some Ethernet switches like Intel RRC, all the packets forwarded
> out by DPDK will be dropped on the switch side, so the packet
> generator will never receive the packets.
>
> Signed-off-by: Michael Qiu
> ---
>  app/test-pmd/csumonly.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
> index 1bf3485..bf8af1d 100644
> --- a/app/test-pmd/csumonly.c
> +++ b/app/test-pmd/csumonly.c
> @@ -550,6 +550,10 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
>  		 * and inner headers */
>
>  		eth_hdr = rte_pktmbuf_mtod(m, struct ether_hdr *);
> +		ether_addr_copy(&peer_eth_addrs[fs->peer_addr],
> +				&eth_hdr->d_addr);
> +		ether_addr_copy(&ports[fs->tx_port].eth_addr,
> +				&eth_hdr->s_addr);
>  		parse_ethernet(eth_hdr, &info);
>  		l3_hdr = (char *)eth_hdr + info.l2_len;
>
> --
> 1.9.3

Why do you make this change only in this mode? If NICs like RRC have this
issue, I assume it would happen in other modes.

Thanks,
Pablo
[dpdk-dev] vhost: Problem RESET_OWNER processing
Hi,

I was wondering if I'm alone with this: the vhost-switch example crashes
on client disconnects if the client sends a RESET_OWNER message. That's
at least the case for QEMU and vhost-user mode (I suppose vhost-cuse is
legacy now). And it really ruins the party when playing with this
because every VM shutdown or guest reboot triggers it.

I was looking deeper into librte_vhost, and I found that reset_owner()
is doing cleanup_device and then init_device - but without letting the
user know. So vhost-switch crashed in its main loop because it continued
to use the device, namely calling rte_vhost_dequeue_burst (with
dev->virtqueue[]->avail == NULL).

Do we simply need another hook in the vhost API, similar to the
destruction notification?

Jan

--
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux
[dpdk-dev] [PATCH] doc: ip_pipeline app user guide
> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Cristian Dumitrescu
> Sent: Thursday, August 6, 2015 2:48 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH] doc: ip_pipeline app user guide
>
> Added more extensive documentation for ip_pipeline application.
>
> Signed-off-by: Cristian Dumitrescu

Acked-by: John McNamara
[dpdk-dev] [PATCH 1/3] vfio: Added hot removal feature for vfio
Hi Harpal,

> I think maintaining a ref count of groups will solve this problem.

Yes, refcounting seems like the best way to solve this problem.

> Additionally, I have found a bug in existing design in case of multiple
> devices of same group.
> <...>
> Therefore, I will provide a fix for this as well in my next version.

That should probably be in a separate patch.

Thanks,
Anatoly
[dpdk-dev] [PATCH v1] Move EAL thread common functions
Changes include moving common functions in eal_thread.c in linuxapp
and bsdapp into common/eal_common_thread.c file.

Compiled on Linux for following targets
 > x86_64-native-linuxapp-gcc
 > x86_64-native-linuxapp-clang
 > x86_x32-native-linuxapp-gcc

Compiled on FreeBSD for following targets
 > x86_64-native-bsdapp-clang
 > x86_64-native-bsdapp-gcc

Tested on Linux:
 > testpmd (pmd_perf_autotest)

Tested on FreeBSD:
 > testpmd

Successful run of checkpatch.pl on the diffs

Signed-off-by: Ravi Kerur
---
 lib/librte_eal/bsdapp/eal/Makefile        |   3 +-
 lib/librte_eal/bsdapp/eal/eal_thread.c    | 152 -
 lib/librte_eal/common/eal_common_thread.c | 147 +++-
 lib/librte_eal/linuxapp/eal/Makefile      |   3 +-
 lib/librte_eal/linuxapp/eal/eal_thread.c  | 153 --
 5 files changed, 150 insertions(+), 308 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/Makefile b/lib/librte_eal/bsdapp/eal/Makefile
index a969435..93d76bb 100644
--- a/lib/librte_eal/bsdapp/eal/Makefile
+++ b/lib/librte_eal/bsdapp/eal/Makefile
@@ -51,6 +51,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) := eal.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_memory.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_hugepage_info.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_thread.c
+SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_common_thread.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_log.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_pci.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_debug.c
@@ -76,7 +77,6 @@ SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_common_hexdump.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_common_devargs.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_common_dev.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_common_options.c
-SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += eal_common_thread.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += rte_malloc.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += malloc_elem.c
 SRCS-$(CONFIG_RTE_LIBRTE_EAL_BSDAPP) += malloc_heap.c
@@ -90,6 +90,7 @@ CFLAGS_eal_common_log.o := -D_GNU_SOURCE
 # http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12603
 ifeq ($(CONFIG_RTE_TOOLCHAIN_GCC),y)
 CFLAGS_eal_thread.o += -Wno-return-type
+CFLAGS_eal_common_thread.o += -Wno-return-type
 CFLAGS_eal_hpet.o += -Wno-return-type
 endif

diff --git a/lib/librte_eal/bsdapp/eal/eal_thread.c b/lib/librte_eal/bsdapp/eal/eal_thread.c
index 9a03437..5714b8f 100644
--- a/lib/librte_eal/bsdapp/eal/eal_thread.c
+++ b/lib/librte_eal/bsdapp/eal/eal_thread.c
@@ -35,163 +35,11 @@
 #include
 #include
 #include
-#include
-#include
-#include
-#include
 #include
-#include
-#include
-#include
-#include
-#include
-#include
-#include
-#include
-#include
-#include
-
 #include "eal_private.h"
 #include "eal_thread.h"

-RTE_DEFINE_PER_LCORE(unsigned, _lcore_id) = LCORE_ID_ANY;
-RTE_DEFINE_PER_LCORE(unsigned, _socket_id) = (unsigned)SOCKET_ID_ANY;
-RTE_DEFINE_PER_LCORE(rte_cpuset_t, _cpuset);
-
-/*
- * Send a message to a slave lcore identified by slave_id to call a
- * function f with argument arg. Once the execution is done, the
- * remote lcore switch in FINISHED state.
- */
-int
-rte_eal_remote_launch(int (*f)(void *), void *arg, unsigned slave_id)
-{
-	int n;
-	char c = 0;
-	int m2s = lcore_config[slave_id].pipe_master2slave[1];
-	int s2m = lcore_config[slave_id].pipe_slave2master[0];
-
-	if (lcore_config[slave_id].state != WAIT)
-		return -EBUSY;
-
-	lcore_config[slave_id].f = f;
-	lcore_config[slave_id].arg = arg;
-
-	/* send message */
-	n = 0;
-	while (n == 0 || (n < 0 && errno == EINTR))
-		n = write(m2s, &c, 1);
-	if (n < 0)
-		rte_panic("cannot write on configuration pipe\n");
-
-	/* wait ack */
-	do {
-		n = read(s2m, &c, 1);
-	} while (n < 0 && errno == EINTR);
-
-	if (n <= 0)
-		rte_panic("cannot read on configuration pipe\n");
-
-	return 0;
-}
-
-/* set affinity for current thread */
-static int
-eal_thread_set_affinity(void)
-{
-	unsigned lcore_id = rte_lcore_id();
-
-	/* acquire system unique id */
-	rte_gettid();
-
-	/* update EAL thread core affinity */
-	return rte_thread_set_affinity(&lcore_config[lcore_id].cpuset);
-}
-
-void eal_thread_init_master(unsigned lcore_id)
-{
-	/* set the lcore ID in per-lcore memory area */
-	RTE_PER_LCORE(_lcore_id) = lcore_id;
-
-	/* set CPU affinity */
-	if (eal_thread_set_affinity() < 0)
-		rte_panic("cannot set affinity\n");
-}
-
-/* main loop of threads */
-__attribute__((noreturn)) void *
-eal_thread_loop(__attribute__((unused)) void *arg)
-{
-	char c;
-	int n, ret;
-	unsigned lcore_id;
-	pthread_t thread_id;
-	int m2s, s2m;
-	char cpuset[RTE_CPU_AFFINITY_STR_LEN];
-
-	thread_id = pthread_self();
-
-	/* retrieve our lcore_id from
[dpdk-dev] [PATCH v1] Move eal_thread.c common functions.
As per Thomas's suggestion we will split remaining files in EAL cleanup
effort into multiple patches, eal_thread.c is first in this series.

Ravi Kerur (1):
  Move EAL thread common functions

 lib/librte_eal/bsdapp/eal/Makefile        |   3 +-
 lib/librte_eal/bsdapp/eal/eal_thread.c    | 152 -
 lib/librte_eal/common/eal_common_thread.c | 147 +++-
 lib/librte_eal/linuxapp/eal/Makefile      |   3 +-
 lib/librte_eal/linuxapp/eal/eal_thread.c  | 153 --
 5 files changed, 150 insertions(+), 308 deletions(-)

--
1.9.1
[dpdk-dev] [PATCH v1] Change rte_eal_vdev_init to update port_id
Hi Tetsuya,

On Thu, Aug 6, 2015 at 7:25 PM, Tetsuya Mukawa wrote:

> On 2015/08/07 3:04, Ravi Kerur wrote:
> > diff --git a/drivers/net/enic/enic_ethdev.c b/drivers/net/enic/enic_ethdev.c
> > index 8280cea..472ef5a 100644
> > --- a/drivers/net/enic/enic_ethdev.c
> > +++ b/drivers/net/enic/enic_ethdev.c
> > @@ -36,8 +36,8 @@
> >  #include
> >  #include
> >
> > -#include
> >  #include
> > +#include
> >  #include
> >  #include
>
> Hi Ravi,
>
> Do we need this fix?
>
> > diff --git a/drivers/net/mpipe/mpipe_tilegx.c b/drivers/net/mpipe/mpipe_tilegx.c
> > index 743feef..6e3e304 100644
> > --- a/drivers/net/mpipe/mpipe_tilegx.c
> > +++ b/drivers/net/mpipe/mpipe_tilegx.c
> > @@ -1582,6 +1582,7 @@ rte_pmd_mpipe_devinit(const char *ifname,
> >  	if (!eth_dev) {
> >  		RTE_LOG(ERR, PMD, "%s: Failed to allocate device.\n", ifname);
> >  		rte_free(priv);
> > +		return -ENOMEM;
>
> How about separating this fix from the patch, and putting it in a
> cleanup patch series?

rte_pmd_mpipe_devinit is the init function pointer called via
rte_eal_vdev_init. Since we were fixing rte_eal_vdev_init, I thought of
taking care of the mpipe issue as well. If you think it's unrelated to
this patch I will send a separate one.

> >  	}
> >
> >  	RTE_LOG(INFO, PMD, "%s: Initialized mpipe device"
> > diff --git a/lib/librte_eal/common/eal_common_dev.c b/lib/librte_eal/common/eal_common_dev.c
> > index 4089d66..82d5693 100644
> > --- a/lib/librte_eal/common/eal_common_dev.c
> > +++ b/lib/librte_eal/common/eal_common_dev.c
> >
> >  	RTE_LOG(ERR, EAL, "no driver found for %s\n", name);
> > @@ -94,6 +99,7 @@ rte_eal_dev_init(void)
> >  {
> >  	struct rte_devargs *devargs;
> >  	struct rte_driver *driver;
> > +	uint8_t port_id;
> >
> >  	/*
> >  	 * Note that the dev_driver_list is populated here
> > @@ -108,7 +114,7 @@ rte_eal_dev_init(void)
> >  			continue;
> >
> >  		if (rte_eal_vdev_init(devargs->virtual.drv_name,
> > -					devargs->args)) {
> > +					devargs->args, &port_id)) {
>
> After this line, 'port_id' is not actually used anywhere in this
> function.
> Also, I guess we will not use port_id in this function in the future.
> How about fixing rte_eal_vdev_init() to handle a NULL value correctly, to
> remove port_id from this function?
> But I agree your current implementation is also one option.
>
> > diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> > index 5fe1906..355d709 100644
> > --- a/lib/librte_ether/rte_ethdev.c
> > +++ b/lib/librte_ether/rte_ethdev.c
> > +int
> > +rte_eth_dev_get_port_by_addr(const struct rte_pci_addr *addr, uint8_t *port_id)
> > +{
> > +	int i;
> > +	struct rte_pci_device *pci_dev = NULL;
> > +
> > +	if (addr == NULL || port_id == NULL) {
> > +		PMD_DEBUG_TRACE("Null pointer is specified\n");
> > +		return -EINVAL;
> > +	}
> > +
> > +	*port_id = RTE_MAX_ETHPORTS;
> > +
> > +	for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
> > +
> > +		pci_dev = rte_eth_devices[i].pci_dev;
> > +
> > +		if (pci_dev != NULL &&
> > +			pci_dev->addr.domain == addr->domain &&
> > +			pci_dev->addr.bus == addr->bus &&
> > +			pci_dev->addr.devid == addr->devid &&
> > +			pci_dev->addr.function == addr->function) {
>
> You can use rte_eal_compare_pci_addr() here.

Will fix this.

> > +
> > +			*port_id = i;
> > +			return 0;
> > +		}
> > +	}
> > +	return -ENODEV;
> > +}
> > diff --git a/lib/librte_ether/rte_ether_version.map b/lib/librte_ether/rte_ether_version.map
> > index 8345a6c..3d5cb23 100644
> > --- a/lib/librte_ether/rte_ether_version.map
> > +++ b/lib/librte_ether/rte_ether_version.map
> > @@ -125,5 +125,7 @@ DPDK_2.1 {
> >  	rte_eth_timesync_enable;
> >  	rte_eth_timesync_read_rx_timestamp;
> >  	rte_eth_timesync_read_tx_timestamp;
> > +	rte_eth_dev_get_port_by_name;
> > +	rte_eth_dev_get_port_by_addr;
> >
> >  } DPDK_2.0;
>
> Hi Thomas,
>
> Could you please check API consistency?
> Is it OK to add the above functions to DPDK_2.1 even though we are in the
> RC phase, or do they need to be added to DPDK_2.2?

Same question. If it's targeted for 2.2 then I will modify this.

Thanks,
Ravi

> Thanks,
> Tetsuya
>
[dpdk-dev] [PATCH v2] test: fix test_tlb_tx_burst
The test failed on verification if the number of bytes transmitted on each
slave was not within 90% to 110% of the mean number of bytes transmitted
through one slave. This was verified on a real system but is difficult to
achieve using virtualpmd. That's why, for unit tests only, it is sufficient
to verify that under high load (2 seconds of transmission) all slaves are
transmitting, so the traffic is balanced.

v2 changes:
- improved description
- reverted number of packets generated (in v1 it was decreased, but to
  achieve balancing it has to be high load).

Signed-off-by: Michal Jastrzebski
---
 app/test/test_link_bonding.c | 16 +++++++---------
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/app/test/test_link_bonding.c b/app/test/test_link_bonding.c
index 305d45d..388cf11 100644
--- a/app/test/test_link_bonding.c
+++ b/app/test/test_link_bonding.c
@@ -4058,7 +4058,6 @@ test_tlb_tx_burst(void)
 	struct rte_eth_stats port_stats[32];
 	uint64_t sum_ports_opackets = 0, all_bond_opackets = 0, all_bond_obytes = 0;
 	uint16_t pktlen;
-	uint64_t floor_obytes = 0, ceiling_obytes = 0;

 	TEST_ASSERT_SUCCESS(initialize_bonded_device_with_slaves
 			(BONDING_MODE_TLB, 1, 3, 1),
@@ -4070,7 +4069,7 @@ test_tlb_tx_burst(void)
 			"Burst size specified is greater than supported.\n");


-	/* Generate 40 test bursts in 2s of packets to transmit */
+	/* Generate bursts of packets */
 	for (i = 0; i < 40; i++) {
 		/*test two types of mac src own(bonding) and others */
 		if (i % 2 == 0) {
@@ -4123,15 +4122,14 @@ test_tlb_tx_burst(void)
 	TEST_ASSERT_EQUAL(sum_ports_opackets, (uint64_t)all_bond_opackets,
 			"Total packets sent by slaves is not equal to packets sent by bond interface");

-	/* distribution of packets on each slave within +/- 10% of the expected value. */
-	for (i = 0; i < test_params->bonded_slave_count; i++) {
-		floor_obytes = (all_bond_obytes*90)/(test_params->bonded_slave_count*100);
-		ceiling_obytes = (all_bond_obytes*110)/(test_params->bonded_slave_count*100);
-		TEST_ASSERT(port_stats[i].obytes >= floor_obytes &&
-				port_stats[i].obytes <= ceiling_obytes,
-						"Distribution is not even");
+	/* checking if distribution of packets is balanced over slaves */
+	for (i = 0; i < test_params->bonded_slave_count; i++) {
+		TEST_ASSERT(port_stats[i].obytes > 0 &&
+				port_stats[i].obytes < all_bond_obytes,
+						"Packets are not balanced over slaves");
 	}

+
 	/* Put all slaves down and try and transmit */
 	for (i = 0; i < test_params->bonded_slave_count; i++) {
 		virtual_ethdev_simulate_link_status_interrupt(
--
1.7.9.5
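To make the difference between the old and the new assertion concrete, here is a stand-alone sketch of both checks over an array of per-slave byte counters. The function names are hypothetical; the arithmetic follows the removed and added lines of the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Old check: every slave within 90%..110% of the mean byte count. */
int example_within_10_percent(const uint64_t *obytes, size_t n,
                              uint64_t total)
{
    uint64_t floor_obytes = (total * 90) / (n * 100);
    uint64_t ceiling_obytes = (total * 110) / (n * 100);
    size_t i;

    for (i = 0; i < n; i++)
        if (obytes[i] < floor_obytes || obytes[i] > ceiling_obytes)
            return 0;
    return 1;
}

/* New check: every slave sent something and none carried everything. */
int example_all_slaves_used(const uint64_t *obytes, size_t n,
                            uint64_t total)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (obytes[i] == 0 || obytes[i] >= total)
            return 0;
    return 1;
}
```

For three slaves sending 500, 300 and 200 bytes out of 1000, the old bounds are 300 and 366, so the old check rejects the run while the relaxed one accepts it. A run where one slave carries all 1000 bytes fails both.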
[dpdk-dev] [PATCH v2] bonding: 8023ad: fix incorrect typecast of socket
On slave activation in LACP (8023AD) mode, SOCKET_ID_ANY (which is -1) is
cast to unsigned char and then to signed int. As a result, socket_id has
the value 255, not -1, which causes a memory allocation failure.

Signed-off-by: Sergey Balabanov
---
 drivers/net/bonding/rte_eth_bond_8023ad.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/bonding/rte_eth_bond_8023ad.c b/drivers/net/bonding/rte_eth_bond_8023ad.c
index 97a828e..c0f0b99 100644
--- a/drivers/net/bonding/rte_eth_bond_8023ad.c
+++ b/drivers/net/bonding/rte_eth_bond_8023ad.c
@@ -849,7 +849,7 @@ bond_mode_8023ad_activate_slave(struct rte_eth_dev *bond_dev, uint8_t slave_id)
 	};

 	char mem_name[RTE_ETH_NAME_MAX_LEN];
-	uint8_t socket_id;
+	int socket_id;
 	unsigned element_size;

 	/* Given slave mus not be in active list */
--
2.1.4
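The conversion described in the commit message can be reproduced with plain C integer types. The sketch below is illustrative only; EXAMPLE_SOCKET_ID_ANY is a local stand-in whose value (-1) is taken from the message above, and the two helpers are hypothetical.

```c
#include <stdint.h>

/* Local stand-in; the value -1 comes from the commit message. */
#define EXAMPLE_SOCKET_ID_ANY (-1)

/* Buggy pattern: the id passes through a uint8_t before use as int. */
int example_socket_via_uint8(int socket)
{
    uint8_t narrowed = (uint8_t)socket;   /* -1 wraps to 255 */

    return (int)narrowed;
}

/* Fixed pattern: the id stays in an int the whole way. */
int example_socket_via_int(int socket)
{
    int kept = socket;

    return kept;
}
```

Conversion to an unsigned 8-bit type is done modulo 256, so -1 becomes 255 and widening back to int keeps 255. An allocator that treats -1 as "any socket" then sees a request for socket 255 instead.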
[dpdk-dev] [PATCH] testpmd: modify the mac of csum forwarding
For some Ethernet switches like Intel RRC, all the packets forwarded
out by DPDK will be dropped on the switch side, so the packet
generator will never receive the packets.

Signed-off-by: Michael Qiu
---
 app/test-pmd/csumonly.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
index 1bf3485..bf8af1d 100644
--- a/app/test-pmd/csumonly.c
+++ b/app/test-pmd/csumonly.c
@@ -550,6 +550,10 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
 		 * and inner headers */

 		eth_hdr = rte_pktmbuf_mtod(m, struct ether_hdr *);
+		ether_addr_copy(&peer_eth_addrs[fs->peer_addr],
+				&eth_hdr->d_addr);
+		ether_addr_copy(&ports[fs->tx_port].eth_addr,
+				&eth_hdr->s_addr);
 		parse_ethernet(eth_hdr, &info);
 		l3_hdr = (char *)eth_hdr + info.l2_len;

--
1.9.3
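For illustration, the header rewrite performed by the four added lines can be sketched without any DPDK types. The example_* structures below are hypothetical stand-ins for the Ethernet header and address types, and memcpy stands in for the address copy helper.

```c
#include <stdint.h>
#include <string.h>

struct example_ether_addr {
    uint8_t bytes[6];
};

struct example_ether_hdr {
    struct example_ether_addr d_addr;   /* destination MAC */
    struct example_ether_addr s_addr;   /* source MAC */
    uint16_t ether_type;
};

/*
 * Address the outgoing frame to the configured peer and stamp the
 * transmitting port as the source.
 */
void example_rewrite_macs(struct example_ether_hdr *hdr,
                          const struct example_ether_addr *peer,
                          const struct example_ether_addr *tx_port)
{
    memcpy(&hdr->d_addr, peer, sizeof(*peer));
    memcpy(&hdr->s_addr, tx_port, sizeof(*tx_port));
}
```

A frame that arrived addressed to the port itself leaves with the peer as destination, so a switch forwarding by destination MAC no longer sees the sender's own address there.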
[dpdk-dev] [PATCH v1] Change rte_eal_vdev_init to update port_id
On 2015/08/07 3:04, Ravi Kerur wrote: > diff --git a/drivers/net/enic/enic_ethdev.c b/drivers/net/enic/enic_ethdev.c > index 8280cea..472ef5a 100644 > --- a/drivers/net/enic/enic_ethdev.c > +++ b/drivers/net/enic/enic_ethdev.c > @@ -36,8 +36,8 @@ > #include > #include > > -#include > #include > +#include > #include > #include Hi Ravi, Do we need this fixing? > > diff --git a/drivers/net/mpipe/mpipe_tilegx.c > b/drivers/net/mpipe/mpipe_tilegx.c > index 743feef..6e3e304 100644 > --- a/drivers/net/mpipe/mpipe_tilegx.c > +++ b/drivers/net/mpipe/mpipe_tilegx.c > @@ -1582,6 +1582,7 @@ rte_pmd_mpipe_devinit(const char *ifname, > if (!eth_dev) { > RTE_LOG(ERR, PMD, "%s: Failed to allocate device.\n", ifname); > rte_free(priv); > + return -ENOMEM; How about separating this fixing from the patch, and put it as an one of cleanup patch series? > } > > RTE_LOG(INFO, PMD, "%s: Initialized mpipe device" > diff --git a/lib/librte_eal/common/eal_common_dev.c > b/lib/librte_eal/common/eal_common_dev.c > index 4089d66..82d5693 100644 > --- a/lib/librte_eal/common/eal_common_dev.c > +++ b/lib/librte_eal/common/eal_common_dev.c > > RTE_LOG(ERR, EAL, "no driver found for %s\n", name); > @@ -94,6 +99,7 @@ rte_eal_dev_init(void) > { > struct rte_devargs *devargs; > struct rte_driver *driver; > + uint8_t port_id; > > /* >* Note that the dev_driver_list is populated here > @@ -108,7 +114,7 @@ rte_eal_dev_init(void) > continue; > > if (rte_eal_vdev_init(devargs->virtual.drv_name, > - devargs->args)) { > + devargs->args, _id)) { After this line, 'port_id' is actually not used by anywhere in this function. Also, I guess we will not use port_id in this function in the future. How about fixing rte_eal_vdev_init() to handle NULL value correctly to remove port_id from this function? But I agree your current implementation is also one of choice. 
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> index 5fe1906..355d709 100644
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> +int
> +rte_eth_dev_get_port_by_addr(const struct rte_pci_addr *addr, uint8_t *port_id)
> +{
> +    int i;
> +    struct rte_pci_device *pci_dev = NULL;
> +
> +    if (addr == NULL || port_id == NULL) {
> +        PMD_DEBUG_TRACE("Null pointer is specified\n");
> +        return -EINVAL;
> +    }
> +
> +    *port_id = RTE_MAX_ETHPORTS;
> +
> +    for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
> +
> +        pci_dev = rte_eth_devices[i].pci_dev;
> +
> +        if (pci_dev != NULL &&
> +            pci_dev->addr.domain == addr->domain &&
> +            pci_dev->addr.bus == addr->bus &&
> +            pci_dev->addr.devid == addr->devid &&
> +            pci_dev->addr.function == addr->function) {

You can use rte_eal_compare_pci_addr() here.

> +
> +            *port_id = i;
> +            return 0;
> +        }
> +    }
> +    return -ENODEV;
> +}
> diff --git a/lib/librte_ether/rte_ether_version.map b/lib/librte_ether/rte_ether_version.map
> index 8345a6c..3d5cb23 100644
> --- a/lib/librte_ether/rte_ether_version.map
> +++ b/lib/librte_ether/rte_ether_version.map
> @@ -125,5 +125,7 @@ DPDK_2.1 {
>      rte_eth_timesync_enable;
>      rte_eth_timesync_read_rx_timestamp;
>      rte_eth_timesync_read_tx_timestamp;
> +    rte_eth_dev_get_port_by_name;
> +    rte_eth_dev_get_port_by_addr;
>
>  } DPDK_2.0;

Hi Thomas,

Could you please check this for API consistency? Is it OK to add the above functions to DPDK_2.1 even though we are in the RC phase, or do they need to be added to DPDK_2.2?

Thanks,
Tetsuya
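The suggestion to replace the four field comparisons in the quoted lookup with a single helper call can be sketched in isolation as below. This is a minimal sketch, not the DPDK code: struct demo_pci_addr, demo_pci_addr_equal(), demo_get_port_by_addr() and DEMO_MAX_PORTS are hypothetical names, and the helper here is only a stand-in for the role rte_eal_compare_pci_addr() would play (its exact signature and return convention are not shown in the thread).

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_PORTS 4 /* hypothetical table size for the sketch */

/* Hypothetical stand-in for a PCI address (domain:bus:devid.function). */
struct demo_pci_addr {
    uint16_t domain;
    uint8_t bus;
    uint8_t devid;
    uint8_t function;
};

/* Returns nonzero when both addresses name the same PCI function. */
int demo_pci_addr_equal(const struct demo_pci_addr *a,
                        const struct demo_pci_addr *b)
{
    return a->domain == b->domain && a->bus == b->bus &&
           a->devid == b->devid && a->function == b->function;
}

/*
 * Scan a table of per-port PCI addresses (NULL for ports without one)
 * and report the index of the first match through *port_id.
 */
int demo_get_port_by_addr(const struct demo_pci_addr **ports,
                          const struct demo_pci_addr *addr,
                          uint8_t *port_id)
{
    int i;

    if (ports == NULL || addr == NULL || port_id == NULL)
        return -EINVAL;

    *port_id = DEMO_MAX_PORTS; /* out-of-range value meaning "not found" */

    for (i = 0; i < DEMO_MAX_PORTS; i++) {
        if (ports[i] != NULL && demo_pci_addr_equal(ports[i], addr)) {
            *port_id = (uint8_t)i;
            return 0;
        }
    }
    return -ENODEV;
}
```

Keeping the comparison in one helper means a later change to the address layout only has to be made in one place, which is presumably the motivation for the review comment.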
[dpdk-dev] [PATCH] examples/l3fwd: fix compilation issue when using exact-match
On 07/08/2015 10:08, Pablo de Lara wrote:
> L3fwd was trying to use a nonexistent function "simple_ipv6_fwd_4pkts",
> instead it should be "simple_ipv6_fwd_8pkts".
>
> Fixes: 80fcb4d4 ("examples/l3fwd: increase lookup burst size to 8")
>
> Signed-off-by: Pablo de Lara
> ---
>

Acked-by: Sergio Gonzalez Monroy
[dpdk-dev] [PATCH 0/2] Warn user if system has more than 64 cores when using VM power manager
On 06/08/2015 12:07, Pablo de Lara wrote:
> Pablo de Lara (2):
>   examples/vm_power_mgr: show warning when using systems with more than
>     64 cores
>   doc: add known issue regarding VM power mgr in release notes
>
>  doc/guides/rel_notes/known_issues.rst       | 24
>  examples/vm_power_manager/channel_manager.c | 11 +--
>  2 files changed, 29 insertions(+), 6 deletions(-)
>

Acked-by: Sergio Gonzalez Monroy
[dpdk-dev] [PATCH] ethdev: Fix illegal access of rte_eth_dev_is_detachable()
> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Tetsuya Mukawa
> Sent: Friday, August 7, 2015 10:21 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH] ethdev: Fix illegal access of
> rte_eth_dev_is_detachable()
>
> To obtain the detachable flag, rte_eth_dev_is_detachable() accesses pci_drv.
> But pci_drv is only valid if the port is enabled. To avoid an illegal access,
> add a rte_eth_dev_is_valid_port() check before accessing it.
>
> Signed-off-by: Tetsuya Mukawa

Acked-by: Bernard Iremonger
[dpdk-dev] [PATCH] examples/l3fwd: fix compilation issue when using exact-match
L3fwd was trying to use a nonexistent function "simple_ipv6_fwd_4pkts"; it should be "simple_ipv6_fwd_8pkts" instead.

Fixes: 80fcb4d4 ("examples/l3fwd: increase lookup burst size to 8")

Signed-off-by: Pablo de Lara
---
 examples/l3fwd/main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/examples/l3fwd/main.c b/examples/l3fwd/main.c
index 9351322..350c1cb 100644
--- a/examples/l3fwd/main.c
+++ b/examples/l3fwd/main.c
@@ -1714,7 +1714,7 @@ main_loop(__attribute__((unused)) void *dummy)
                             portid, qconf);
                 } else if (ol_flag & PKT_RX_IPV6_HDR) {
 #endif /* RTE_NEXT_ABI */
-                    simple_ipv6_fwd_4pkts(&pkts_burst[j],
+                    simple_ipv6_fwd_8pkts(&pkts_burst[j],
                                 portid, qconf);
                 } else {
                     l3fwd_simple_forward(pkts_burst[j],
-- 
2.4.2
[dpdk-dev] Performance of rte_ring APIs
Hi All,

I have an extremely simple test: a single DPDK EAL thread pulls packets from one 10G port and puts each packet, exactly as is (no changes), on another 10G port. I get 9.5 million pps, so far so good. So it's like this:

dpdk_rx
dpdk_tx
9.5 million pps

Now I do the below, and the performance comes down to about 4 million pps, less than half!

dpdk_rx
rte_ring_mc_dequeue_bulk(my_ring1, my_array, nb_rx)
dpdk_tx
rte_ring_mp_enqueue_bulk(my_ring1, my_array, nb_rx)

Note that I do nothing with the things I dequeue from the ring; I just enqueue them back. So are there any gotchas in using these rings? I am sure I am missing something; it can't drop from 9.5 Mpps to 4 Mpps just because of a dequeue/enqueue?

Rgds,
Gopa.
[dpdk-dev] [PATCH v2] bonding: 8023ad: fix incorrect typecast of socket
> -----Original Message-----
> From: Sergey Balabanov [mailto:balabanovsv at ecotelecom.ru]
> Sent: Friday, August 07, 2015 10:33 AM
> To: dev at dpdk.org
> Cc: De Lara Guarch, Pablo; Sergey Balabanov
> Subject: [PATCH v2] bonding: 8023ad: fix incorrect typecast of socket
>
> On slave activation in LACP (8023AD), SOCKET_ANY_ID (which is -1)
> is cast to unsigned char and then to signed int.
> The result is that socket_id has the value 255, not -1.
> This results in a memory allocation failure.
>
> Signed-off-by: Sergey Balabanov

Acked-by: Pablo de Lara
[dpdk-dev] [PATCH] bonding: 8023ad: fix incorrect typecast of socket
> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Sergey Balabanov
> Sent: Wednesday, August 05, 2015 2:49 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH] bonding: 8023ad: fix incorrect typecast of
> socket
>
> On slave activation in LACP (8023AD), SOCKET_ANY_ID (which is -1)
> is cast to unsigned char and then to signed int.
> The result is that socket_id has the value 255, not -1.
> This results in a memory allocation failure.
> ---
>  drivers/net/bonding/rte_eth_bond_8023ad.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/net/bonding/rte_eth_bond_8023ad.c
> b/drivers/net/bonding/rte_eth_bond_8023ad.c
> index 97a828e..c0f0b99 100644
> --- a/drivers/net/bonding/rte_eth_bond_8023ad.c
> +++ b/drivers/net/bonding/rte_eth_bond_8023ad.c
> @@ -849,7 +849,7 @@ bond_mode_8023ad_activate_slave(struct
> rte_eth_dev *bond_dev, uint8_t slave_id)
>      };
>
>      char mem_name[RTE_ETH_NAME_MAX_LEN];
> -    uint8_t socket_id;
> +    int socket_id;
>      unsigned element_size;
>
>      /* Given slave mus not be in active list */
> --
> 2.1.4

Acked-by: Pablo de Lara

You forgot to sign off the patch. Can you send a v2 with the sign-off? You can include my ack.

Thanks!
Pablo
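The conversion described in the commit message can be reproduced in a few lines of standalone C. This is a minimal sketch with hypothetical names (DEMO_SOCKET_ANY and the two helper functions are not DPDK identifiers); it only shows why storing -1 in a uint8_t and widening it again yields 255, while an int keeps -1.

```c
#include <stdint.h>

/* Hypothetical stand-in for the "any socket" value of -1 from the patch. */
#define DEMO_SOCKET_ANY (-1)

/* Path before the fix: the value passes through a uint8_t variable. */
int demo_socket_id_via_uint8(int socket)
{
    uint8_t narrow = (uint8_t)socket; /* -1 wraps modulo 256 to 255 */
    return (int)narrow;               /* widening keeps 255, not -1 */
}

/* Path after the fix: the value is stored in an int throughout. */
int demo_socket_id_via_int(int socket)
{
    int id = socket; /* -1 is preserved */
    return id;
}
```

With DEMO_SOCKET_ANY as input, the first function returns 255 and the second returns -1, which matches the behavior the patch description reports: an allocator asked for memory on socket 255 fails, whereas -1 is treated as "any socket".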
[dpdk-dev] [PATCH 0/2] Warn user if system has more than 64 cores when using VM power manager
Tested-by: Marvin Liu

> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Pablo de Lara
> Sent: Thursday, August 06, 2015 7:08 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH 0/2] Warn user if system has more than 64 cores
> when using VM power manager
>
> Pablo de Lara (2):
>   examples/vm_power_mgr: show warning when using systems with more than
>     64 cores
>   doc: add known issue regarding VM power mgr in release notes
>
>  doc/guides/rel_notes/known_issues.rst       | 24
>  examples/vm_power_manager/channel_manager.c | 11 +--
>  2 files changed, 29 insertions(+), 6 deletions(-)
>
> --
> 2.4.2
[dpdk-dev] [PATCH v1] doc: prog_guide update for RX interrupt event
> -----Original Message-----
> From: Liang, Cunming
> Sent: Thursday, August 06, 2015 10:19 AM
> To: dev at dpdk.org
> Cc: Mcnamara, John; david.marchand at 6wind.com; Liang, Cunming
> Subject: [PATCH v1] doc: prog_guide update for RX interrupt event
>
> The patch updates the env_abstraction_layer.rst part in prog_guide.
> It adds the RX interrupt event declaration and revises the others in
> interrupt event section.
>
> Signed-off-by: Cunming Liang

Acked-by: Danny Zhou