[dpdk-dev] performance of rte_lpm rule subsystem
Stephen, what was the main reason you used a red-black tree instead of
dir-24-8? Did you switch to trees because the memory working set of the
dir-24-8 algorithm was too big?

2016-04-19 18:46 GMT+03:00 Stephen Hemminger :

> On Tue, 19 Apr 2016 14:11:11 +0300
> ? ??? wrote:
>
> > Hi.
> >
> > Doing some tests with rte_lpm (adding/deleting BGP full table rules) I
> > noticed that the rule subsystem is very slow, even considering that it
> > was probably never designed for use in a data forwarding plane. So I
> > want to propose some changes to the "rule" subsystem.
> >
> > I reimplemented the rule part of the lib using rte_hash, and the
> > performance of adding/deleting routes has increased dramatically.
> > If increasing the speed of adding/deleting routes makes sense for
> > anybody else, I would like to discuss my patch.
> > The patch also includes changes that make next_hop 64 bit, so please
> > just ignore them. The rule changes are in the following functions only:
> >
> > rte_lpm2_create
> >
> > rule_find
> > rule_add
> > rule_delete
> > find_previous_rule
> > delete_depth_small
> > delete_depth_big
> >
> > rte_lpm2_add
> > rte_lpm2_delete
> > rte_lpm2_is_rule_present
> > rte_lpm2_delete_all
> >
>
> We forked LPM back several versions ago.
> I sent the patches to use BSD red-black tree for rules but the patches were
> ignored, mostly because they broke ABI.
>

--
--
Kiselev Alexander
[dpdk-dev] [PATCH] mk: add rpath for applications
2016-04-29 17:34, Ferruh Yigit:
> Add default library output folder to the library search folder.
>
> This is useful for development environment, in production environment
> DPDK libraries already should be in known locations.

Yes, it is useful in a dev environment, but it can be risky or strange when
packaged for a production environment.
Shouldn't we have a switch to keep development garbage out of production?
I suggest using RTE_DEVEL_BUILD.

> Patch removes requirement to set LD_LIBRARY_PATH variable when DPDK
> compiled as shared library.

Yes, this patch could remove
	export LD_LIBRARY_PATH=$build/lib:$LD_LIBRARY_PATH
in scripts/test-null.sh.

[...]
> +ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),y)
> +LDFLAGS += --rpath=$(RTE_SDK_BIN)/lib
> +endif

Isn't it -rpath, with a single dash?

As it is a variable setting, it should be added before the rules,
just after the LDLIBS settings.
[dpdk-dev] [PATCH] app/testpmd: add packet data prefetch in macswap loop
Hi Jerin,

> -----Original Message-----
> From: Jerin Jacob [mailto:jerin.jacob at caviumnetworks.com]
> Sent: Monday, May 02, 2016 1:00 PM
> To: dev at dpdk.org
> Cc: De Lara Guarch, Pablo; Jerin Jacob
> Subject: [dpdk-dev] [PATCH] app/testpmd: add packet data prefetch in
> macswap loop
>
> prefetch the next packet data address in advance in macswap loop
> for performance improvement.
>
> Signed-off-by: Jerin Jacob
> ---
>  app/test-pmd/macswap.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/app/test-pmd/macswap.c b/app/test-pmd/macswap.c
> index 154889d..c10f4b5 100644
> --- a/app/test-pmd/macswap.c
> +++ b/app/test-pmd/macswap.c
> @@ -113,6 +113,9 @@ pkt_burst_mac_swap(struct fwd_stream *fs)
>  	if (txp->tx_ol_flags & TESTPMD_TX_OFFLOAD_INSERT_QINQ)
>  		ol_flags |= PKT_TX_QINQ_PKT;
>  	for (i = 0; i < nb_rx; i++) {
> +		if (likely(i < nb_rx - 1))
> +			rte_prefetch0(rte_pktmbuf_mtod(pkts_burst[i + 1],
> +						       void *));
>  		mb = pkts_burst[i];
>  		eth_hdr = rte_pktmbuf_mtod(mb, struct ether_hdr *);
>
> --
> 2.1.0

This looks good. Could you also add it in the other forwarding modes
(the ones that make changes in the packets)?

Thanks,
Pablo
[dpdk-dev] [PATCH] mk: do not enforce any specific ARM ABI
2016-04-16 00:33, Jan Viktorin:
> The dpdk build system passes -mfloat-abi=softfp, which makes the build fail
> when the selected ABI is EABIhf. The dpdk build system should not make
> assumptions on the selected ARM ABI.
>
> Signed-off-by: Jan Viktorin
> Reported-by: Thomas Petazzoni

Applied, thanks
[dpdk-dev] [PATCH 3/3] vhost: arrange virtio_net fields for better cache sharing
The ifname[] field takes so much space that it separates some frequently
used fields into different cache lines, say, features and broadcast_rarp.

This patch moves all those fields that will be accessed frequently in Rx/Tx
together (before the ifname[] field) to let them share one cache line.

Signed-off-by: Yuanhan Liu
---
 lib/librte_vhost/vhost-net.h | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/lib/librte_vhost/vhost-net.h b/lib/librte_vhost/vhost-net.h
index 9dec83c..3b0ffe7 100644
--- a/lib/librte_vhost/vhost-net.h
+++ b/lib/librte_vhost/vhost-net.h
@@ -123,16 +123,16 @@ struct virtio_net {
 	int			vid;
 	uint32_t		flags;
 	uint16_t		vhost_hlen;
+	/* to tell if we need broadcast rarp packet */
+	rte_atomic16_t		broadcast_rarp;
+	uint32_t		virt_qp_nb;
+	struct vhost_virtqueue	*virtqueue[VHOST_MAX_QUEUE_PAIRS * 2];
 #define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ)
 	char			ifname[IF_NAME_SZ];
-	uint32_t		virt_qp_nb;
 	uint64_t		log_size;
 	uint64_t		log_base;
 	struct ether_addr	mac;
-	/* to tell if we need broadcast rarp packet */
-	rte_atomic16_t		broadcast_rarp;
-	struct vhost_virtqueue	*virtqueue[VHOST_MAX_QUEUE_PAIRS * 2];
 } __rte_cache_aligned;

 /**
--
1.9.0
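
The effect described in this patch can be checked with offsetof(). The sketch below is only an illustration: the two structs are simplified stand-ins (not the real struct virtio_net), and the 64-byte cache line and 4096-byte name buffer are assumptions. It shows that a large array placed between hot fields pushes the later ones onto a different cache line, while grouping them in front of the array keeps them together.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64    /* assumed cache line size in bytes */
#define NAME_SZ    4096  /* stand-in for IF_NAME_SZ (PATH_MAX on Linux) */

/* Hypothetical layout before the change: the name buffer splits hot fields. */
struct dev_before {
	uint64_t features;
	uint32_t flags;
	uint16_t hlen;
	char     name[NAME_SZ];
	uint32_t nb_queue_pairs;
	int16_t  broadcast_rarp;
};

/* Hypothetical layout after the change: hot fields sit before the buffer. */
struct dev_after {
	uint64_t features;
	uint32_t flags;
	uint16_t hlen;
	int16_t  broadcast_rarp;
	uint32_t nb_queue_pairs;
	char     name[NAME_SZ];
};

/* Cache line index of a field, assuming the struct itself is line-aligned. */
static size_t cache_line_of(size_t offset)
{
	return offset / CACHE_LINE;
}
```

With these stand-ins, cache_line_of(offsetof(struct dev_after, broadcast_rarp)) is 0, the same line as features, whereas in dev_before the same field lands thousands of bytes away.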
[dpdk-dev] [PATCH 2/3] vhost: optimize dequeue for small packets
Both the current kernel virtio driver and the DPDK virtio driver use at
least 2 desc buffers for Tx: the first for storing the header, and the
others for storing the data.

Therefore, we could fetch the first data desc buf before the main loop, and
do the copy first before the check of "are we done yet?". This could save
one check for small packets that have just one data desc buffer and need
one mbuf to store it.

Signed-off-by: Yuanhan Liu
---
 lib/librte_vhost/vhost_rxtx.c | 52 ++-
 1 file changed, 36 insertions(+), 16 deletions(-)

diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
index 2c3b810..34d6ed1 100644
--- a/lib/librte_vhost/vhost_rxtx.c
+++ b/lib/librte_vhost/vhost_rxtx.c
@@ -753,18 +753,48 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq,
 		return -1;

 	desc_addr = gpa_to_vva(dev, desc->addr);
-	rte_prefetch0((void *)(uintptr_t)desc_addr);
-
-	/* Retrieve virtio net header */
 	hdr = (struct virtio_net_hdr *)((uintptr_t)desc_addr);
-	desc_avail = desc->len - dev->vhost_hlen;
-	desc_offset = dev->vhost_hlen;
+	rte_prefetch0(hdr);
+
+	/*
+	 * Both current kernel virtio driver and DPDK virtio driver
+	 * use at least 2 desc buffers for Tx: the first for storing
+	 * the header, and others for storing the data.
+	 */
+	if (likely(desc->len == dev->vhost_hlen)) {
+		desc = &vq->desc[desc->next];
+
+		desc_addr = gpa_to_vva(dev, desc->addr);
+		rte_prefetch0((void *)(uintptr_t)desc_addr);
+
+		desc_offset = 0;
+		desc_avail = desc->len;
+		nr_desc += 1;
+
+		PRINT_PACKET(dev, (uintptr_t)desc_addr, desc->len, 0);
+	} else {
+		desc_avail = desc->len - dev->vhost_hlen;
+		desc_offset = dev->vhost_hlen;
+	}

 	mbuf_offset = 0;
 	mbuf_avail = m->buf_len - RTE_PKTMBUF_HEADROOM;
-	while (desc_avail != 0 || (desc->flags & VRING_DESC_F_NEXT) != 0) {
+	while (1) {
+		cpy_len = RTE_MIN(desc_avail, mbuf_avail);
+		rte_memcpy(rte_pktmbuf_mtod_offset(cur, void *, mbuf_offset),
+			(void *)((uintptr_t)(desc_addr + desc_offset)),
+			cpy_len);
+
+		mbuf_avail -= cpy_len;
+		mbuf_offset += cpy_len;
+		desc_avail -= cpy_len;
+		desc_offset += cpy_len;
+
 		/* This desc reaches to its end, get the next one */
 		if (desc_avail == 0) {
+			if ((desc->flags & VRING_DESC_F_NEXT) == 0)
+				break;
+
 			if (unlikely(desc->next >= vq->size ||
 				     ++nr_desc >= vq->size))
 				return -1;
@@ -800,16 +830,6 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq,
 			mbuf_offset = 0;
 			mbuf_avail = cur->buf_len - RTE_PKTMBUF_HEADROOM;
 		}
-
-		cpy_len = RTE_MIN(desc_avail, mbuf_avail);
-		rte_memcpy(rte_pktmbuf_mtod_offset(cur, void *, mbuf_offset),
-			(void *)((uintptr_t)(desc_addr + desc_offset)),
-			cpy_len);
-
-		mbuf_avail -= cpy_len;
-		mbuf_offset += cpy_len;
-		desc_avail -= cpy_len;
-		desc_offset += cpy_len;
 	}

 	prev->data_len = mbuf_offset;
--
1.9.0
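
The "copy first, then ask whether we are done" loop shape from this patch can be shown in isolation. The following is a minimal sketch under simplifying assumptions, not the vhost code: struct seg and gather_chain() are hypothetical names, the destination is a single flat buffer rather than an mbuf chain, and a negative next index marks the end of the chain instead of a flag bit.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a descriptor: data, length, index of next (-1 = last). */
struct seg {
	const uint8_t *data;
	uint32_t len;
	int next;
};

/*
 * Copy a chain of segments into dst. The copy happens at the top of the
 * loop, so a packet that fits in one segment pays for a single
 * "end of chain?" test. Returns bytes copied, or -1 on overflow or a bad chain.
 */
static int gather_chain(const struct seg *segs, int nr_segs, int head,
			uint8_t *dst, uint32_t cap)
{
	const struct seg *s;
	uint32_t s_off = 0, s_avail, d_off = 0;
	int hops = 1;

	if (head < 0 || head >= nr_segs)
		return -1;
	s = &segs[head];
	s_avail = s->len;

	for (;;) {
		uint32_t room = cap - d_off;
		uint32_t n = s_avail < room ? s_avail : room;

		memcpy(dst + d_off, s->data + s_off, n);
		s_off += n;
		s_avail -= n;
		d_off += n;

		/* Data left in this segment means the destination is full. */
		if (s_avail != 0)
			return -1;

		/* Segment exhausted: stop if it was the last one. */
		if (s->next < 0)
			break;

		/* Guard against out-of-range links and loops. */
		if (s->next >= nr_segs || ++hops > nr_segs)
			return -1;

		s = &segs[s->next];
		s_off = 0;
		s_avail = s->len;
	}
	return (int)d_off;
}
```

The hop counter plays the role that nr_desc plays in the patch: it bounds the walk so a malformed chain cannot loop forever.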
[dpdk-dev] [PATCH 1/3] vhost: pre update used ring for Tx and Rx
Pre update and update used ring in batch for Tx and Rx at the stage
while fetching all avail desc idx. This would reduce some cache misses
and hence, increase the performance a bit.

Pre update would be feasible as guest driver will not start processing
those entries as far as we don't update "used->idx". (I'm not 100%
certain I don't miss anything, though).

Cc: Michael S. Tsirkin
Signed-off-by: Yuanhan Liu
---
 lib/librte_vhost/vhost_rxtx.c | 58 +--
 1 file changed, 28 insertions(+), 30 deletions(-)

diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
index c9cd1c5..2c3b810 100644
--- a/lib/librte_vhost/vhost_rxtx.c
+++ b/lib/librte_vhost/vhost_rxtx.c
@@ -137,7 +137,7 @@ copy_virtio_net_hdr(struct virtio_net *dev, uint64_t desc_addr,

 static inline int __attribute__((always_inline))
 copy_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq,
-		  struct rte_mbuf *m, uint16_t desc_idx, uint32_t *copied)
+		  struct rte_mbuf *m, uint16_t desc_idx)
 {
 	uint32_t desc_avail, desc_offset;
 	uint32_t mbuf_avail, mbuf_offset;
@@ -161,7 +161,6 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq,
 	desc_offset = dev->vhost_hlen;
 	desc_avail = desc->len - dev->vhost_hlen;

-	*copied = rte_pktmbuf_pkt_len(m);
 	mbuf_avail = rte_pktmbuf_data_len(m);
 	mbuf_offset = 0;
 	while (mbuf_avail != 0 || m->next != NULL) {
@@ -262,6 +261,7 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
 	struct vhost_virtqueue *vq;
 	uint16_t res_start_idx, res_end_idx;
 	uint16_t desc_indexes[MAX_PKT_BURST];
+	uint16_t used_idx;
 	uint32_t i;

 	LOG_DEBUG(VHOST_DATA, "(%d) %s\n", dev->vid, __func__);
@@ -285,27 +285,29 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
 	/* Retrieve all of the desc indexes first to avoid caching issues. */
 	rte_prefetch0(&vq->avail->ring[res_start_idx & (vq->size - 1)]);
 	for (i = 0; i < count; i++) {
-		desc_indexes[i] = vq->avail->ring[(res_start_idx + i) &
-						  (vq->size - 1)];
+		used_idx = (res_start_idx + i) & (vq->size - 1);
+		desc_indexes[i] = vq->avail->ring[used_idx];
+		vq->used->ring[used_idx].id = desc_indexes[i];
+		vq->used->ring[used_idx].len = pkts[i]->pkt_len +
+					       dev->vhost_hlen;
+		vhost_log_used_vring(dev, vq,
+			offsetof(struct vring_used, ring[used_idx]),
+			sizeof(vq->used->ring[used_idx]));
 	}

 	rte_prefetch0(&vq->desc[desc_indexes[0]]);
 	for (i = 0; i < count; i++) {
 		uint16_t desc_idx = desc_indexes[i];
-		uint16_t used_idx = (res_start_idx + i) & (vq->size - 1);
-		uint32_t copied;
 		int err;

-		err = copy_mbuf_to_desc(dev, vq, pkts[i], desc_idx, &copied);
-
-		vq->used->ring[used_idx].id = desc_idx;
-		if (unlikely(err))
+		err = copy_mbuf_to_desc(dev, vq, pkts[i], desc_idx);
+		if (unlikely(err)) {
+			used_idx = (res_start_idx + i) & (vq->size - 1);
 			vq->used->ring[used_idx].len = dev->vhost_hlen;
-		else
-			vq->used->ring[used_idx].len = copied + dev->vhost_hlen;
-		vhost_log_used_vring(dev, vq,
-			offsetof(struct vring_used, ring[used_idx]),
-			sizeof(vq->used->ring[used_idx]));
+			vhost_log_used_vring(dev, vq,
+				offsetof(struct vring_used, ring[used_idx]),
+				sizeof(vq->used->ring[used_idx]));
+		}

 		if (i + 1 < count)
 			rte_prefetch0(&vq->desc[desc_indexes[i+1]]);
@@ -879,6 +881,7 @@ rte_vhost_dequeue_burst(int vid, uint16_t queue_id,
 	/* Prefetch available ring to retrieve head indexes. */
 	used_idx = vq->last_used_idx & (vq->size - 1);
 	rte_prefetch0(&vq->avail->ring[used_idx]);
+	rte_prefetch0(&vq->used->ring[used_idx]);

 	count = RTE_MIN(count, MAX_PKT_BURST);
 	count = RTE_MIN(count, free_entries);
@@ -887,22 +890,23 @@ rte_vhost_dequeue_burst(int vid, uint16_t queue_id,

 	/* Retrieve all of the head indexes first to avoid caching issues. */
 	for (i = 0; i < count; i++) {
-		desc_indexes[i] = vq->avail->ring[(vq->last_used_idx + i) &
-					(vq->size - 1)];
+		used_idx = (vq->last_used_idx + i) & (vq->size - 1);
+		desc_indexes[i] = vq->avail->ring[used_idx];
+
+		vq->used->ring[used_idx].id = desc_indexes[i];
+		vq->used->ring[used_idx].len = 0;
+		vhost_log_used_vring(dev, vq,
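
The idea in this patch (fill the used-ring entries ahead of time, then make them visible with one index update) can be sketched on its own. Everything below is an illustrative assumption rather than the vhost implementation: struct used_ring is a cut-down stand-in for vring_used, RING_SIZE is fixed at 8, publish_used_batch() is a hypothetical name, and the GCC/Clang __atomic_store_n builtin stands in for whatever write barrier the real code uses before updating used->idx.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 8 /* assumed ring size; must be a power of two */

struct used_elem {
	uint32_t id;
	uint32_t len;
};

struct used_ring {
	uint16_t idx; /* free-running index read by the consumer */
	struct used_elem ring[RING_SIZE];
};

/*
 * Write `count` used entries starting at free-running index `start`,
 * then publish all of them with one release store to idx. Until that
 * store, the consumer has no reason to look at the new entries.
 */
static void publish_used_batch(struct used_ring *used, uint16_t start,
			       const uint16_t *desc_ids, const uint32_t *lens,
			       uint16_t count)
{
	uint16_t i;

	for (i = 0; i < count; i++) {
		/* Masking with (size - 1) wraps the slot inside the ring. */
		uint16_t slot = (uint16_t)(start + i) & (RING_SIZE - 1);

		used->ring[slot].id = desc_ids[i];
		used->ring[slot].len = lens[i];
	}

	/* Release ordering: entry writes become visible before the new idx. */
	__atomic_store_n(&used->idx, (uint16_t)(start + count),
			 __ATOMIC_RELEASE);
}
```

Starting at index 6 with three entries, the slots written are 6, 7 and 0, and idx ends at 9.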
[dpdk-dev] ovs crash when running traffic from VM to VM over DPDK and vhostuser
Running with dpdk 16.04 and the latest ovs from git, and removing
"mrg_rxbuf=off" from the virtio params, the crash is no longer observed.
However, we are witnessing ovs getting stuck, and will post to the ovs
mailing list:

2016-05-02T17:26:18.804Z|00111|ovs_rcu|WARN|blocked 1000 ms waiting for pmd145 to quiesce
2016-05-02T17:26:19.805Z|00112|ovs_rcu|WARN|blocked 2001 ms waiting for pmd145 to quiesce
2016-05-02T17:26:21.804Z|00113|ovs_rcu|WARN|blocked 4000 ms waiting for pmd145 to quiesce
2016-05-02T17:26:25.805Z|00114|ovs_rcu|WARN|blocked 8001 ms waiting for pmd145 to quiesce
2016-05-02T17:26:33.805Z|00115|ovs_rcu|WARN|blocked 16001 ms waiting for pmd145 to quiesce
2016-05-02T17:26:49.805Z|00116|ovs_rcu|WARN|blocked 32001 ms waiting for pmd145 to quiesce
2016-05-02T17:27:14.354Z|00072|ovs_rcu(vhost_thread2)|WARN|blocked 128000 ms waiting for pmd145 to quiesce
2016-05-02T17:27:15.841Z|8|ovs_rcu(urcu3)|WARN|blocked 128001 ms waiting for pmd145 to quiesce
2016-05-02T17:27:21.805Z|00117|ovs_rcu|WARN|blocked 64000 ms waiting for pmd145 to quiesce
2016-05-02T17:28:25.804Z|00118|ovs_rcu|WARN|blocked 128000 ms waiting for pmd145 to quiesce

On Wednesday, 6 April 2016 10:56 AM, Yuanhan Liu wrote:

On Tue, Apr 05, 2016 at 08:36:19PM +, Yi Ba wrote:
>
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 0x7ff1ddffb700 (LWP 21287)]
> 0x00450da7 in update_secure_len (vec_idx=0x7ff1ddff27f8,
> secure_len=0x7ff1ddff27fc, id=13948, vq=0x7fe7992c8940)
>     at /home/stack/ovs-dpdk/dpdk-2.2.0/lib/librte_vhost/vhost_rxtx.c:452
> 452     /home/stack/ovs-dpdk/dpdk-2.2.0/lib/librte_vhost/vhost_rxtx.c: No such
> file or directory.
> (gdb) bt
> #0  0x00450da7 in update_secure_len (vec_idx=0x7ff1ddff27f8,
> secure_len=0x7ff1ddff27fc, id=13948, vq=0x7fe7992c8940)

It looks like a known issue, which has been fixed in this release. So,
could you please just try again with the latest DPDK code? It should be
able to solve your issue.

	--yliu
[dpdk-dev] [PATCH] app/testpmd: add packet data prefetch in macswap loop
Prefetch the next packet data address in advance in the macswap loop
for performance improvement.

Signed-off-by: Jerin Jacob
---
 app/test-pmd/macswap.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/app/test-pmd/macswap.c b/app/test-pmd/macswap.c
index 154889d..c10f4b5 100644
--- a/app/test-pmd/macswap.c
+++ b/app/test-pmd/macswap.c
@@ -113,6 +113,9 @@ pkt_burst_mac_swap(struct fwd_stream *fs)
 	if (txp->tx_ol_flags & TESTPMD_TX_OFFLOAD_INSERT_QINQ)
 		ol_flags |= PKT_TX_QINQ_PKT;
 	for (i = 0; i < nb_rx; i++) {
+		if (likely(i < nb_rx - 1))
+			rte_prefetch0(rte_pktmbuf_mtod(pkts_burst[i + 1],
+						       void *));
 		mb = pkts_burst[i];
 		eth_hdr = rte_pktmbuf_mtod(mb, struct ether_hdr *);

--
2.1.0
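
For readers without a DPDK tree at hand, the prefetch-ahead pattern used by this patch can be reproduced in plain C. This is a rough sketch under assumptions: struct eth_hdr and mac_swap_burst() are hypothetical names, packets are bare header pointers rather than mbufs, and the GCC/Clang builtin __builtin_prefetch stands in for rte_prefetch0().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simplified Ethernet header; a stand-in for struct ether_hdr. */
struct eth_hdr {
	uint8_t dst[6];
	uint8_t src[6];
	uint16_t ether_type;
};

/*
 * Swap source and destination MAC of every packet in the burst.
 * While packet i is being processed, packet i + 1 is prefetched so its
 * header is more likely to be in cache by the next iteration.
 */
static void mac_swap_burst(struct eth_hdr **pkts, uint16_t nb)
{
	uint16_t i;

	for (i = 0; i < nb; i++) {
		uint8_t tmp[6];

		/* Do not prefetch past the end of the burst. */
		if (i + 1 < nb)
			__builtin_prefetch(pkts[i + 1], 1, 3);

		memcpy(tmp, pkts[i]->dst, sizeof(tmp));
		memcpy(pkts[i]->dst, pkts[i]->src, sizeof(tmp));
		memcpy(pkts[i]->src, tmp, sizeof(tmp));
	}
}
```

The prefetch is only a hint: the function produces the same result without it, and whether it helps depends on the platform and on how much work each iteration does.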
[dpdk-dev] [PATCH] cmdline: fix unchecked return value
Hi Daniel,

On 04/14/2016 03:01 PM, Daniel Mrzyglod wrote:
> This patch is for checking if error values occurs.
> fix for coverity errors #13209 & #13195
>
> If the function returns an error value, the error value may be mistaken
> for a normal value.
>
> In rdline_char_in: Value returned from a function is not checked for errors
> before being used
>
> Signed-off-by: Daniel Mrzyglod
> ---
>  lib/librte_cmdline/cmdline_rdline.c | 19 +++
>  1 file changed, 15 insertions(+), 4 deletions(-)
>
> diff --git a/lib/librte_cmdline/cmdline_rdline.c
> b/lib/librte_cmdline/cmdline_rdline.c
> index 1ef2258..e75a556 100644
> --- a/lib/librte_cmdline/cmdline_rdline.c
> +++ b/lib/librte_cmdline/cmdline_rdline.c
> @@ -377,7 +377,10 @@ rdline_char_in(struct rdline *rdl, char c)
>  		case CMDLINE_KEY_CTRL_K:
>  			cirbuf_get_buf_head(&rdl->right, rdl->kill_buf,
>  				RDLINE_BUF_SIZE);
>  			rdl->kill_size = CIRBUF_GET_LEN(&rdl->right);
> -			cirbuf_del_buf_head(&rdl->right, rdl->kill_size);
> +
> +			if (cirbuf_del_buf_head(&rdl->right, rdl->kill_size) <
> 0)
> +				return -EINVAL;
> +
>  			rdline_puts(rdl, vt100_clear_right);
>  			break;
>

I wonder if a better way to fix this wouldn't be to remove the checks
introduced in http://dpdk.org/browse/dpdk/commit/?id=ab971e562860

There is no reason to check that in cirbuf_get_buf_head/tail():
	if (!cbuf || !c)

The function should never fail; it just returns the number of
copied chars. It is the responsibility of the caller to ensure that the
pointer to the circular buffer is not NULL.

Also, rdline_char_in() is not expected to return -EINVAL, but
RDLINE_RES_* instead.

So I think that partially reverting ab971e562860 would fix the coverity
warning.

Regards,
Olivier
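
The contract Olivier describes (the function cannot fail, it only reports how many characters it copied, and a valid buffer pointer is the caller's job) might look like the sketch below. This is not the librte_cmdline cirbuf code: struct ring_buf and ring_get_head() are hypothetical, and assert() is used here only to document the precondition.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical circular buffer: `len` valid bytes starting at `start`. */
struct ring_buf {
	char *buf;
	unsigned int maxlen;
	unsigned int start;
	unsigned int len;
};

/*
 * Copy up to `size` bytes from the head of the buffer into `out`.
 * There is no error path: the return value is always the number of
 * bytes copied, so callers have nothing to check for failure.
 * Precondition (caller's responsibility): rb and out are not NULL.
 */
static unsigned int ring_get_head(const struct ring_buf *rb, char *out,
				  unsigned int size)
{
	unsigned int n, first;

	assert(rb != NULL && out != NULL);

	n = rb->len < size ? rb->len : size;

	/* Bytes available before the data wraps to the array start. */
	first = rb->maxlen - rb->start;
	if (first > n)
		first = n;

	memcpy(out, rb->buf + rb->start, first);
	memcpy(out + first, rb->buf, n - first);
	return n;
}
```

With a return type that has no error values, a static analyzer has no "unchecked error" to report at the call site, which is the point of the suggested partial revert.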
[dpdk-dev] [PATCH 0/4] cleanup debug and dead code
2016-04-22 15:43, Thomas Monjalon:
> With this series, the default log level is not debug anymore.
> And more code depends on debug level instead of having some
> almost dead code.
>
> Thomas Monjalon (4):
>   eal: increase log level of some messages
>   log: increase default level to info
>   examples: remove useless debug flags
>   eal: add assert macro for debug

Applied with small fix discussed for vmxnet3.
[dpdk-dev] [PATCH v3 1/1] cmdline: add any multi string mode to token string
2016-04-29 16:29, Piotr Azarewicz:
> While parsing token string there may be several modes:
> - fixed single string
> - multi-choice single string
> - any single string
>
> This patch adds one more mode - any multi string.
>
> Signed-off-by: Piotr Azarewicz
> Acked-by: Olivier Matz

Applied, thanks
[dpdk-dev] Flow Director Example?
Hi Helin, thanks for the reply. Some code might help me explain myself
better:

	port->configuration = rte_eth_conf {
		.fdir_conf = {
			.mode = RTE_FDIR_MODE_SIGNATURE,
			.pballoc = RTE_FDIR_PBALLOC_64K,
			.mask = rte_eth_fdir_masks {
				.ipv4_mask = rte_eth_ipv4_flow {
					.dst_ip = 0x0,
				},
				.ipv6_mask = rte_eth_ipv6_flow {
					.dst_ip = { 0x0, 0x0, 0x0, 0x0 },
				},
			},
			.status = RTE_FDIR_REPORT_STATUS,
			.drop_queue = 127,
		},
		.rxmode = {
			.mq_mode = ETH_MQ_RX_NONE,
			.max_rx_pkt_len = ETHER_MAX_LEN,
			.split_hdr_size = 0,
			.header_split = 0,
			.hw_ip_checksum = 0,
			.hw_vlan_filter = 0,
			.jumbo_frame = 0,
			.hw_strip_crc = 0,
		},
		.txmode = {
			.mq_mode = ETH_MQ_TX_NONE,
		},
	};

I'm trying to direct packets with the same destination IPv4 or IPv6
address into the same RX queues. I haven't been able to find any examples
of using Flow Director with DPDK, so I'm sure I'm doing something
obviously wrong here, but I can't figure out what it is.

Alex Forster

On 5/2/16, 1:38 AM, "Zhang, Helin" wrote:

>Hi Alex
>
>Can you confirm that you are using DPDK? And how do you use DPDK and possibly
>kernel driver?
>I need your detailed topo of how are you using DPDK, as I am a bit confused.
>Thanks!
>
>Regards,
>Helin
>
>> -----Original Message-----
>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Alex Forster
>> Sent: Saturday, April 30, 2016 4:34 AM
>> To: dev at dpdk.org
>> Subject: [dpdk-dev] Flow Director Example?
>>
>> Hi guys, apologies if this is the wrong list, but the others look pretty
>> bare.
>>
>> We have a 32 core server that has two X520-QDA1's NICs with 2x10G ports
>> plugged into each. I'm using 2016.1 (latest stable) with ixgbe 4.3.15
>> (latest stable).
>> I'm setting up 8 RX queues per port, and I'd like Flow Director in signature
>> mode
>> (?) to place packets into queues based on a hash of destination IPv4 or IPv6
>> address. However, I can't figure out rte_fdir_conf, and despite a good
>> amount of
>> trial and error, each of my ports are still only using one of the RX queues
>> I set up.
>>
>> Would anyone be able to point me in the right direction here? Thanks in
>> advance!
>>
>> Alex Forster
[dpdk-dev] [PATCH 16/16] vhost: make buf vector for scatter Rx local
From: Ilya Maximets

Array of buf_vector's is just an array for temporary storing information
about available descriptors. It used only locally in virtio_dev_merge_rx()
and there is no reason for that array to be shared.

Fix that by allocating local buf_vec inside virtio_dev_merge_rx().

Signed-off-by: Ilya Maximets
Signed-off-by: Yuanhan Liu
---
 lib/librte_vhost/vhost-net.h  |  1 -
 lib/librte_vhost/vhost_rxtx.c | 41 ++---
 2 files changed, 22 insertions(+), 20 deletions(-)

diff --git a/lib/librte_vhost/vhost-net.h b/lib/librte_vhost/vhost-net.h
index 9dec83c..e697d96 100644
--- a/lib/librte_vhost/vhost-net.h
+++ b/lib/librte_vhost/vhost-net.h
@@ -81,7 +81,6 @@ struct vhost_virtqueue {

 	/* Physical address of used ring, for logging */
 	uint64_t		log_guest_addr;
-	struct buf_vector	buf_vec[BUF_VECTOR_MAX];
 } __rte_cache_aligned;

 /* Old kernels have no such macro defined */
diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
index c9cd1c5..96720db 100644
--- a/lib/librte_vhost/vhost_rxtx.c
+++ b/lib/librte_vhost/vhost_rxtx.c
@@ -335,7 +335,8 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,

 static inline int
 fill_vec_buf(struct vhost_virtqueue *vq, uint32_t avail_idx,
-	     uint32_t *allocated, uint32_t *vec_idx)
+	     uint32_t *allocated, uint32_t *vec_idx,
+	     struct buf_vector *buf_vec)
 {
 	uint16_t idx = vq->avail->ring[avail_idx & (vq->size - 1)];
 	uint32_t vec_id = *vec_idx;
@@ -346,9 +347,9 @@ fill_vec_buf(struct vhost_virtqueue *vq, uint32_t avail_idx,
 			return -1;

 		len += vq->desc[idx].len;
-		vq->buf_vec[vec_id].buf_addr = vq->desc[idx].addr;
-		vq->buf_vec[vec_id].buf_len = vq->desc[idx].len;
-		vq->buf_vec[vec_id].desc_idx = idx;
+		buf_vec[vec_id].buf_addr = vq->desc[idx].addr;
+		buf_vec[vec_id].buf_len = vq->desc[idx].len;
+		buf_vec[vec_id].desc_idx = idx;
 		vec_id++;

 		if ((vq->desc[idx].flags & VRING_DESC_F_NEXT) == 0)
@@ -371,7 +372,8 @@ fill_vec_buf(struct vhost_virtqueue *vq, uint32_t avail_idx,
  */
 static inline int
 reserve_avail_buf_mergeable(struct vhost_virtqueue *vq, uint32_t size,
-			    uint16_t *start, uint16_t *end)
+			    uint16_t *start, uint16_t *end,
+			    struct buf_vector *buf_vec)
 {
 	uint16_t res_start_idx;
 	uint16_t res_cur_idx;
@@ -393,7 +395,7 @@ again:
 			return -1;

 		if (unlikely(fill_vec_buf(vq, res_cur_idx, &allocated,
-					  &vec_idx) < 0))
+					  &vec_idx, buf_vec) < 0))
 			return -1;

 		res_cur_idx++;
@@ -427,7 +429,7 @@ again:
 static inline uint32_t __attribute__((always_inline))
 copy_mbuf_to_desc_mergeable(struct virtio_net *dev, struct vhost_virtqueue *vq,
 			    uint16_t res_start_idx, uint16_t res_end_idx,
-			    struct rte_mbuf *m)
+			    struct rte_mbuf *m, struct buf_vector *buf_vec)
 {
 	struct virtio_net_hdr_mrg_rxbuf virtio_hdr = {{0, 0, 0, 0, 0, 0}, 0};
 	uint32_t vec_idx = 0;
@@ -444,10 +446,10 @@ copy_mbuf_to_desc_mergeable(struct virtio_net *dev, struct vhost_virtqueue *vq,
 	LOG_DEBUG(VHOST_DATA, "(%d) current index %d | end index %d\n",
 		dev->vid, cur_idx, res_end_idx);

-	if (vq->buf_vec[vec_idx].buf_len < dev->vhost_hlen)
+	if (buf_vec[vec_idx].buf_len < dev->vhost_hlen)
 		return -1;

-	desc_addr = gpa_to_vva(dev, vq->buf_vec[vec_idx].buf_addr);
+	desc_addr = gpa_to_vva(dev, buf_vec[vec_idx].buf_addr);
 	rte_prefetch0((void *)(uintptr_t)desc_addr);

 	virtio_hdr.num_buffers = res_end_idx - res_start_idx;
@@ -456,10 +458,10 @@ copy_mbuf_to_desc_mergeable(struct virtio_net *dev, struct vhost_virtqueue *vq,

 	virtio_enqueue_offload(m, &virtio_hdr.hdr);
 	copy_virtio_net_hdr(dev, desc_addr, virtio_hdr);
-	vhost_log_write(dev, vq->buf_vec[vec_idx].buf_addr, dev->vhost_hlen);
+	vhost_log_write(dev, buf_vec[vec_idx].buf_addr, dev->vhost_hlen);
 	PRINT_PACKET(dev, (uintptr_t)desc_addr, dev->vhost_hlen, 0);

-	desc_avail = vq->buf_vec[vec_idx].buf_len - dev->vhost_hlen;
+	desc_avail = buf_vec[vec_idx].buf_len - dev->vhost_hlen;
 	desc_offset = dev->vhost_hlen;

 	mbuf_avail = rte_pktmbuf_data_len(m);
@@ -467,7 +469,7 @@ copy_mbuf_to_desc_mergeable(struct virtio_net *dev, struct vhost_virtqueue *vq,
 	while (mbuf_avail != 0 || m->next != NULL) {
 		/* done with current desc buf, get the next one */
 		if (desc_avail == 0) {
-			desc_idx = vq->buf_vec[vec_idx].desc_idx;
+
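
The design choice in this patch, scratch storage owned by the caller instead of living in a shared queue struct, can be illustrated outside of vhost. The sketch below is an assumption-laden miniature: struct desc, fill_vec() and chain_length() are hypothetical, and DESC_F_NEXT and VEC_MAX merely stand in for VRING_DESC_F_NEXT and BUF_VECTOR_MAX.

```c
#include <assert.h>
#include <stdint.h>

#define DESC_F_NEXT 1u /* stand-in for VRING_DESC_F_NEXT */
#define VEC_MAX     8u /* stand-in for BUF_VECTOR_MAX */

struct desc {
	uint64_t addr;
	uint32_t len;
	uint16_t flags;
	uint16_t next;
};

struct buf_vector {
	uint64_t buf_addr;
	uint32_t buf_len;
	uint32_t desc_idx;
};

/*
 * Walk one descriptor chain and record each element in the buf_vec
 * supplied by the caller. Returns the number of entries, or -1 if the
 * chain leaves the table or does not fit in VEC_MAX entries.
 */
static int fill_vec(const struct desc *table, uint16_t table_size,
		    uint16_t head, struct buf_vector *buf_vec)
{
	uint32_t n = 0;
	uint16_t idx = head;

	for (;;) {
		if (idx >= table_size || n >= VEC_MAX)
			return -1;

		buf_vec[n].buf_addr = table[idx].addr;
		buf_vec[n].buf_len = table[idx].len;
		buf_vec[n].desc_idx = idx;
		n++;

		if ((table[idx].flags & DESC_F_NEXT) == 0)
			break;
		idx = table[idx].next;
	}
	return (int)n;
}

/* The scratch array lives on this caller's stack, so nothing is shared. */
static int chain_length(const struct desc *table, uint16_t table_size,
			uint16_t head)
{
	struct buf_vector buf_vec[VEC_MAX];

	return fill_vec(table, table_size, head, buf_vec);
}
```

Because each call gets its own array, two callers working on the same queue cannot overwrite each other's temporary state, and the queue struct no longer carries the array around.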
[dpdk-dev] [PATCH 15/16] vhost: per device vhost_hlen
Virtio net header length is set per device, but not per queue. So, there is no reason to store it in vhost_virtqueue struct, instead, we should store it in virtio_net struct, to make one copy only. Signed-off-by: Yuanhan Liu --- lib/librte_vhost/vhost-net.h | 2 +- lib/librte_vhost/vhost_rxtx.c | 40 lib/librte_vhost/virtio-net.c | 13 ++--- 3 files changed, 23 insertions(+), 32 deletions(-) diff --git a/lib/librte_vhost/vhost-net.h b/lib/librte_vhost/vhost-net.h index 9710009..9dec83c 100644 --- a/lib/librte_vhost/vhost-net.h +++ b/lib/librte_vhost/vhost-net.h @@ -63,7 +63,6 @@ struct vhost_virtqueue { struct vring_avail *avail; struct vring_used *used; uint32_tsize; - uint16_tvhost_hlen; /* Last index used on the available ring */ volatile uint16_t last_used_idx; @@ -123,6 +122,7 @@ struct virtio_net { uint64_tprotocol_features; int vid; uint32_tflags; + uint16_tvhost_hlen; #define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ) charifname[IF_NAME_SZ]; uint32_tvirt_qp_nb; diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c index 65278bb..c9cd1c5 100644 --- a/lib/librte_vhost/vhost_rxtx.c +++ b/lib/librte_vhost/vhost_rxtx.c @@ -126,10 +126,10 @@ virtio_enqueue_offload(struct rte_mbuf *m_buf, struct virtio_net_hdr *net_hdr) } static inline void -copy_virtio_net_hdr(struct vhost_virtqueue *vq, uint64_t desc_addr, +copy_virtio_net_hdr(struct virtio_net *dev, uint64_t desc_addr, struct virtio_net_hdr_mrg_rxbuf hdr) { - if (vq->vhost_hlen == sizeof(struct virtio_net_hdr_mrg_rxbuf)) + if (dev->vhost_hlen == sizeof(struct virtio_net_hdr_mrg_rxbuf)) *(struct virtio_net_hdr_mrg_rxbuf *)(uintptr_t)desc_addr = hdr; else *(struct virtio_net_hdr *)(uintptr_t)desc_addr = hdr.hdr; @@ -147,19 +147,19 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq, struct virtio_net_hdr_mrg_rxbuf virtio_hdr = {{0, 0, 0, 0, 0, 0}, 0}; desc = >desc[desc_idx]; - if (unlikely(desc->len < vq->vhost_hlen)) + if (unlikely(desc->len < dev->vhost_hlen)) 
return -1; desc_addr = gpa_to_vva(dev, desc->addr); rte_prefetch0((void *)(uintptr_t)desc_addr); virtio_enqueue_offload(m, _hdr.hdr); - copy_virtio_net_hdr(vq, desc_addr, virtio_hdr); - vhost_log_write(dev, desc->addr, vq->vhost_hlen); - PRINT_PACKET(dev, (uintptr_t)desc_addr, vq->vhost_hlen, 0); + copy_virtio_net_hdr(dev, desc_addr, virtio_hdr); + vhost_log_write(dev, desc->addr, dev->vhost_hlen); + PRINT_PACKET(dev, (uintptr_t)desc_addr, dev->vhost_hlen, 0); - desc_offset = vq->vhost_hlen; - desc_avail = desc->len - vq->vhost_hlen; + desc_offset = dev->vhost_hlen; + desc_avail = desc->len - dev->vhost_hlen; *copied = rte_pktmbuf_pkt_len(m); mbuf_avail = rte_pktmbuf_data_len(m); @@ -300,9 +300,9 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id, vq->used->ring[used_idx].id = desc_idx; if (unlikely(err)) - vq->used->ring[used_idx].len = vq->vhost_hlen; + vq->used->ring[used_idx].len = dev->vhost_hlen; else - vq->used->ring[used_idx].len = copied + vq->vhost_hlen; + vq->used->ring[used_idx].len = copied + dev->vhost_hlen; vhost_log_used_vring(dev, vq, offsetof(struct vring_used, ring[used_idx]), sizeof(vq->used->ring[used_idx])); @@ -444,7 +444,7 @@ copy_mbuf_to_desc_mergeable(struct virtio_net *dev, struct vhost_virtqueue *vq, LOG_DEBUG(VHOST_DATA, "(%d) current index %d | end index %d\n", dev->vid, cur_idx, res_end_idx); - if (vq->buf_vec[vec_idx].buf_len < vq->vhost_hlen) + if (vq->buf_vec[vec_idx].buf_len < dev->vhost_hlen) return -1; desc_addr = gpa_to_vva(dev, vq->buf_vec[vec_idx].buf_addr); @@ -455,12 +455,12 @@ copy_mbuf_to_desc_mergeable(struct virtio_net *dev, struct vhost_virtqueue *vq, dev->vid, virtio_hdr.num_buffers); virtio_enqueue_offload(m, _hdr.hdr); - copy_virtio_net_hdr(vq, desc_addr, virtio_hdr); - vhost_log_write(dev, vq->buf_vec[vec_idx].buf_addr, vq->vhost_hlen); - PRINT_PACKET(dev, (uintptr_t)desc_addr, vq->vhost_hlen, 0); + copy_virtio_net_hdr(dev, desc_addr, virtio_hdr); + vhost_log_write(dev, vq->buf_vec[vec_idx].buf_addr, 
dev->vhost_hlen); + PRINT_PACKET(dev, (uintptr_t)desc_addr, dev->vhost_hlen, 0); - desc_avail = vq->buf_vec[vec_idx].buf_len -
[dpdk-dev] [PATCH 14/16] vhost: reserve few more space for future extension
"virtio_net_device_ops" is the only struct left open that an application
can access; therefore, it's the only place where a future extension might
introduce an ABI break. So, reserve some space in it.

5 should be plenty, considering that we have barely touched it for a
long while. Another reason to choose 5 is cache alignment: 5 makes
the struct 64 bytes on a 64-bit machine.

With this, we can say with some confidence that we might be free from
ABI violations forever.

Signed-off-by: Yuanhan Liu
---
 lib/librte_vhost/rte_virtio_net.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/lib/librte_vhost/rte_virtio_net.h b/lib/librte_vhost/rte_virtio_net.h
index 388621e..4e50425 100644
--- a/lib/librte_vhost/rte_virtio_net.h
+++ b/lib/librte_vhost/rte_virtio_net.h
@@ -77,6 +77,8 @@ struct virtio_net_device_ops {
 	void (*destroy_device)(int vid);	/**< Remove device. */

 	int (*vring_state_changed)(int vid, uint16_t queue_id, int enable);	/**< triggered when a vring is enabled or disabled */
+
+	void *reserved[5]; /**< Reserved for future extension */
 };

 /**
--
1.9.0
[dpdk-dev] [PATCH 13/16] vhost: remove virtio-net.h
It barely has anything useful there, just 2 functions prototype. Here move them to vhost-net.h, and delete it. Signed-off-by: Yuanhan Liu --- lib/librte_vhost/vhost-net.h | 3 ++ lib/librte_vhost/vhost_cuse/virtio-net-cdev.c | 1 - lib/librte_vhost/vhost_rxtx.c | 1 - lib/librte_vhost/vhost_user/virtio-net-user.c | 1 - lib/librte_vhost/virtio-net.c | 1 - lib/librte_vhost/virtio-net.h | 43 --- 6 files changed, 3 insertions(+), 47 deletions(-) delete mode 100644 lib/librte_vhost/virtio-net.h diff --git a/lib/librte_vhost/vhost-net.h b/lib/librte_vhost/vhost-net.h index 2e4c95d..9710009 100644 --- a/lib/librte_vhost/vhost-net.h +++ b/lib/librte_vhost/vhost-net.h @@ -214,6 +214,9 @@ gpa_to_vva(struct virtio_net *dev, uint64_t guest_pa) return vhost_va; } +struct virtio_net_device_ops const *notify_ops; +struct virtio_net *get_device(int vid); + int vhost_new_device(void); void vhost_destroy_device(int); diff --git a/lib/librte_vhost/vhost_cuse/virtio-net-cdev.c b/lib/librte_vhost/vhost_cuse/virtio-net-cdev.c index 0723a7a..552be7d 100644 --- a/lib/librte_vhost/vhost_cuse/virtio-net-cdev.c +++ b/lib/librte_vhost/vhost_cuse/virtio-net-cdev.c @@ -54,7 +54,6 @@ #include "rte_virtio_net.h" #include "vhost-net.h" #include "virtio-net-cdev.h" -#include "virtio-net.h" #include "eventfd_copy.h" /* Line size for reading maps file. 
*/ diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c index 08cab08..65278bb 100644 --- a/lib/librte_vhost/vhost_rxtx.c +++ b/lib/librte_vhost/vhost_rxtx.c @@ -46,7 +46,6 @@ #include #include "vhost-net.h" -#include "virtio-net.h" #define MAX_PKT_BURST 32 #define VHOST_LOG_PAGE 4096 diff --git a/lib/librte_vhost/vhost_user/virtio-net-user.c b/lib/librte_vhost/vhost_user/virtio-net-user.c index 7fa69a7..6463bdd 100644 --- a/lib/librte_vhost/vhost_user/virtio-net-user.c +++ b/lib/librte_vhost/vhost_user/virtio-net-user.c @@ -43,7 +43,6 @@ #include #include -#include "virtio-net.h" #include "virtio-net-user.h" #include "vhost-net-user.h" #include "vhost-net.h" diff --git a/lib/librte_vhost/virtio-net.c b/lib/librte_vhost/virtio-net.c index 9fd80a8..6577fe0 100644 --- a/lib/librte_vhost/virtio-net.c +++ b/lib/librte_vhost/virtio-net.c @@ -53,7 +53,6 @@ #include #include "vhost-net.h" -#include "virtio-net.h" #define MAX_VHOST_DEVICE 1024 static struct virtio_net *vhost_devices[MAX_VHOST_DEVICE]; diff --git a/lib/librte_vhost/virtio-net.h b/lib/librte_vhost/virtio-net.h deleted file mode 100644 index 9812545..000 --- a/lib/librte_vhost/virtio-net.h +++ /dev/null @@ -1,43 +0,0 @@ -/*- - * BSD LICENSE - * - * Copyright(c) 2010-2014 Intel Corporation. All rights reserved. - * All rights reserved. - * - * Redistribution and use in source and binary forms, with or without - * modification, are permitted provided that the following conditions - * are met: - * - * * Redistributions of source code must retain the above copyright - * notice, this list of conditions and the following disclaimer. - * * Redistributions in binary form must reproduce the above copyright - * notice, this list of conditions and the following disclaimer in - * the documentation and/or other materials provided with the - * distribution. 
- * * Neither the name of Intel Corporation nor the names of its - * contributors may be used to endorse or promote products derived - * from this software without specific prior written permission. - * - * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS - * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT - * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR - * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT - * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, - * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT - * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, - * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY - * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT - * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE - * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. - */ - -#ifndef _VIRTIO_NET_H -#define _VIRTIO_NET_H - -#include "vhost-net.h" -#include "rte_virtio_net.h" - -struct virtio_net_device_ops const *notify_ops; -struct virtio_net *get_device(int vid); - -#endif -- 1.9.0
[dpdk-dev] [PATCH 11/16] vhost: hide internal structs/macros/functions
We are now safe to move all those internal structs/macros/functions to vhost-net.h, to hide them from external access. This patch also breaks long lines and removes some redundant comments. Signed-off-by: Yuanhan Liu --- lib/librte_vhost/rte_virtio_net.h | 128 -- lib/librte_vhost/vhost-net.h | 142 ++ 2 files changed, 142 insertions(+), 128 deletions(-) diff --git a/lib/librte_vhost/rte_virtio_net.h b/lib/librte_vhost/rte_virtio_net.h index 0a26df9..388621e 100644 --- a/lib/librte_vhost/rte_virtio_net.h +++ b/lib/librte_vhost/rte_virtio_net.h @@ -65,111 +65,6 @@ struct rte_mbuf; /* Enum for virtqueue management. */ enum {VIRTIO_RXQ, VIRTIO_TXQ, VIRTIO_QNUM}; -#define BUF_VECTOR_MAX 256 - -/** - * Structure contains buffer address, length and descriptor index - * from vring to do scatter RX. - */ -struct buf_vector { - uint64_t buf_addr; - uint32_t buf_len; - uint32_t desc_idx; -}; - -/** - * Structure contains variables relevant to RX/TX virtqueues. - */ -struct vhost_virtqueue { - struct vring_desc *desc; /**< Virtqueue descriptor ring. */ - struct vring_avail *avail; /**< Virtqueue available ring. */ - struct vring_used *used; /**< Virtqueue used ring. */ - uint32_tsize; /**< Size of descriptor ring. */ - int backend;/**< Backend value to determine if device should started/stopped. */ - uint16_tvhost_hlen; /**< Vhost header length (varies depending on RX merge buffers. */ - volatile uint16_t last_used_idx; /**< Last index used on the available ring */ - volatile uint16_t last_used_idx_res; /**< Used for multiple devices reserving buffers. */ -#define VIRTIO_INVALID_EVENTFD (-1) -#define VIRTIO_UNINITIALIZED_EVENTFD (-2) - int callfd; /**< Used to notify the guest (trigger interrupt). */ - int kickfd; /**< Currently unused as polling mode is enabled. */ - int enabled; - uint64_tlog_guest_addr; /**< Physical address of used ring, for logging */ - uint64_treserved[15]; /**< Reserve some spaces for future extension. 
*/ - struct buf_vector buf_vec[BUF_VECTOR_MAX];/**< for scatter RX. */ -} __rte_cache_aligned; - -/* Old kernels have no such macro defined */ -#ifndef VIRTIO_NET_F_GUEST_ANNOUNCE - #define VIRTIO_NET_F_GUEST_ANNOUNCE 21 -#endif - - -/* - * Make an extra wrapper for VIRTIO_NET_F_MQ and - * VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX as they are - * introduced since kernel v3.8. This makes our - * code buildable for older kernel. - */ -#ifdef VIRTIO_NET_F_MQ - #define VHOST_MAX_QUEUE_PAIRS VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX - #define VHOST_SUPPORTS_MQ (1ULL << VIRTIO_NET_F_MQ) -#else - #define VHOST_MAX_QUEUE_PAIRS 1 - #define VHOST_SUPPORTS_MQ 0 -#endif - -/* - * Define virtio 1.0 for older kernels - */ -#ifndef VIRTIO_F_VERSION_1 - #define VIRTIO_F_VERSION_1 32 -#endif - -/** - * Device structure contains all configuration information relating to the device. - */ -struct virtio_net { - struct virtio_memory*mem; /**< QEMU memory and memory region information. */ - uint64_tfeatures; /**< Negotiated feature set. */ - uint64_tprotocol_features; /**< Negotiated protocol feature set. */ - int vid;/**< device identifier. */ - uint32_tflags; /**< Device flags. Only used to check if device is running on data core. */ -#define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ) - charifname[IF_NAME_SZ]; /**< Name of the tap device or socket path. */ - uint32_tvirt_qp_nb; /**< number of queue pair we have allocated */ - void*priv; /**< private context */ - uint64_tlog_size; /**< Size of log area */ - uint64_tlog_base; /**< Where dirty pages are logged */ - struct ether_addr mac;/**< MAC address */ - rte_atomic16_t broadcast_rarp; /**< A flag to tell if we need broadcast rarp packet */ - uint64_treserved[61]; /**< Reserve some spaces for future extension. */ - struct vhost_virtqueue *virtqueue[VHOST_MAX_QUEUE_PAIRS * 2]; /**< Contains all virtqueue information. 
*/ -} __rte_cache_aligned; - -/** - * Information relating to memory regions including offsets to addresses in QEMUs memory file. - */ -struct virtio_memory_regions { - uint64_tguest_phys_address; /**< Base guest physical
[dpdk-dev] [PATCH 10/16] vhost: export vid as the only interface to applications
With all the previous prepare works, we are just one step away from the final ABI refactoring. That is, to change current API to let them stick to vid instead of the old virtio_net dev. Signed-off-by: Yuanhan Liu --- drivers/net/vhost/rte_eth_vhost.c | 61 ++- examples/vhost/main.c | 41 -- lib/librte_vhost/rte_virtio_net.h | 30 + lib/librte_vhost/vhost_rxtx.c | 15 ++- lib/librte_vhost/vhost_user/virtio-net-user.c | 14 +++--- lib/librte_vhost/virtio-net.c | 17 +--- 6 files changed, 91 insertions(+), 87 deletions(-) diff --git a/drivers/net/vhost/rte_eth_vhost.c b/drivers/net/vhost/rte_eth_vhost.c index 9763cd4..a9dada5 100644 --- a/drivers/net/vhost/rte_eth_vhost.c +++ b/drivers/net/vhost/rte_eth_vhost.c @@ -68,9 +68,9 @@ static struct ether_addr base_eth_addr = { }; struct vhost_queue { + int vid; rte_atomic32_t allow_queuing; rte_atomic32_t while_queuing; - struct virtio_net *device; struct pmd_internal *internal; struct rte_mempool *mb_pool; uint8_t port; @@ -137,7 +137,7 @@ eth_vhost_rx(void *q, struct rte_mbuf **bufs, uint16_t nb_bufs) goto out; /* Dequeue packets from guest TX queue */ - nb_rx = rte_vhost_dequeue_burst(r->device, + nb_rx = rte_vhost_dequeue_burst(r->vid, r->virtqueue_id, r->mb_pool, bufs, nb_bufs); r->rx_pkts += nb_rx; @@ -168,7 +168,7 @@ eth_vhost_tx(void *q, struct rte_mbuf **bufs, uint16_t nb_bufs) goto out; /* Enqueue packets to guest RX queue */ - nb_tx = rte_vhost_enqueue_burst(r->device, + nb_tx = rte_vhost_enqueue_burst(r->vid, r->virtqueue_id, bufs, nb_bufs); r->tx_pkts += nb_tx; @@ -217,7 +217,7 @@ find_internal_resource(int vid) } static int -new_device(struct virtio_net *dev) +new_device(int vid) { struct rte_eth_dev *eth_dev; struct internal_list *list; @@ -228,23 +228,17 @@ new_device(struct virtio_net *dev) int newnode; #endif - if (dev == NULL) { - RTE_LOG(INFO, PMD, "Invalid argument\n"); - return -1; - } - - list = find_internal_resource(dev->vid); + list = find_internal_resource(vid); if (list == NULL) { - RTE_LOG(INFO, PMD, 
"Invalid vid %d\n", dev->vid); + RTE_LOG(INFO, PMD, "Invalid vid %d\n", vid); return -1; } eth_dev = list->eth_dev; internal = eth_dev->data->dev_private; - internal->vid = dev->vid; #ifdef RTE_LIBRTE_VHOST_NUMA - newnode = rte_vhost_get_numa_node(dev->vid); + newnode = rte_vhost_get_numa_node(vid); if (newnode > 0) eth_dev->data->numa_node = newnode; #endif @@ -253,7 +247,7 @@ new_device(struct virtio_net *dev) vq = eth_dev->data->rx_queues[i]; if (vq == NULL) continue; - vq->device = dev; + vq->vid = vid; vq->internal = internal; vq->port = eth_dev->data->port_id; } @@ -261,15 +255,14 @@ new_device(struct virtio_net *dev) vq = eth_dev->data->tx_queues[i]; if (vq == NULL) continue; - vq->device = dev; + vq->vid = vid; vq->internal = internal; vq->port = eth_dev->data->port_id; } - for (i = 0; i < rte_vhost_get_queue_num(dev->vid) * VIRTIO_QNUM; i++) - rte_vhost_enable_guest_notification(dev, i, 0); + for (i = 0; i < rte_vhost_get_queue_num(vid) * VIRTIO_QNUM; i++) + rte_vhost_enable_guest_notification(vid, i, 0); - dev->priv = eth_dev; eth_dev->data->dev_link.link_status = ETH_LINK_UP; for (i = 0; i < eth_dev->data->nb_rx_queues; i++) { @@ -293,22 +286,19 @@ new_device(struct virtio_net *dev) } static void -destroy_device(volatile struct virtio_net *dev) +destroy_device(int vid) { + struct internal_list *list; struct rte_eth_dev *eth_dev; struct vhost_queue *vq; unsigned i; - if (dev == NULL) { - RTE_LOG(INFO, PMD, "Invalid argument\n"); - return; - } - - eth_dev = (struct rte_eth_dev *)dev->priv; - if (eth_dev == NULL) { - RTE_LOG(INFO, PMD, "Failed to find a ethdev\n"); + list = find_internal_resource(vid); + if (list == NULL) { + RTE_LOG(INFO, PMD, "Invalid vid %d\n", vid); return; } + eth_dev = list->eth_dev; /* Wait until rx/tx_pkt_burst stops accessing vhost device */ for (i = 0; i < eth_dev->data->nb_rx_queues; i++) { @@ -330,19 +320,17 @@ destroy_device(volatile struct virtio_net *dev)
[dpdk-dev] [PATCH 09/16] vhost: add few more functions
Add a few more functions that export some fields and information of the virtio_net struct to applications, as we are going to make the struct private. They are:
- rte_vhost_avail_entries: a rename of "rte_vring_available_entries"; "vring" becomes "vhost" to stay consistent with the other functions.
- rte_vhost_get_queue_num: exports the "virt_qp_nb" field.
- rte_vhost_get_numa_node: exports the NUMA node from which the virtio-net device is allocated.
Signed-off-by: Yuanhan Liu --- drivers/net/vhost/rte_eth_vhost.c | 18 - examples/vhost/main.c | 4 +-- lib/librte_vhost/rte_virtio_net.h | 37 +++ lib/librte_vhost/virtio-net.c | 54 +++ 4 files changed, 98 insertions(+), 15 deletions(-) diff --git a/drivers/net/vhost/rte_eth_vhost.c b/drivers/net/vhost/rte_eth_vhost.c index 290fd9e..9763cd4 100644 --- a/drivers/net/vhost/rte_eth_vhost.c +++ b/drivers/net/vhost/rte_eth_vhost.c @@ -33,9 +33,6 @@ #include #include #include -#ifdef RTE_LIBRTE_VHOST_NUMA -#include -#endif #include #include @@ -228,7 +225,7 @@ new_device(struct virtio_net *dev) struct vhost_queue *vq; unsigned i; #ifdef RTE_LIBRTE_VHOST_NUMA - int newnode, ret; + int newnode; #endif if (dev == NULL) { @@ -247,14 +244,9 @@ new_device(struct virtio_net *dev) internal->vid = dev->vid; #ifdef RTE_LIBRTE_VHOST_NUMA - ret = get_mempolicy(, NULL, 0, dev, - MPOL_F_NODE | MPOL_F_ADDR); - if (ret < 0) { - RTE_LOG(ERR, PMD, "Unknown numa node\n"); - return -1; - } - - eth_dev->data->numa_node = newnode; + newnode = rte_vhost_get_numa_node(dev->vid); + if (newnode > 0) + eth_dev->data->numa_node = newnode; #endif for (i = 0; i < eth_dev->data->nb_rx_queues; i++) { @@ -274,7 +266,7 @@ new_device(struct virtio_net *dev) vq->port = eth_dev->data->port_id; } - for (i = 0; i < dev->virt_qp_nb * VIRTIO_QNUM; i++) + for (i = 0; i < rte_vhost_get_queue_num(dev->vid) * VIRTIO_QNUM; i++) rte_vhost_enable_guest_notification(dev, i, 0); dev->priv = eth_dev; diff --git a/examples/vhost/main.c b/examples/vhost/main.c 
index e395e4a..145fa6f 100644 --- a/examples/vhost/main.c +++ b/examples/vhost/main.c @@ -1056,13 +1056,13 @@ drain_eth_rx(struct vhost_dev *vdev) * to diminish packet loss. */ if (enable_retry && - unlikely(rx_count > rte_vring_available_entries(dev, + unlikely(rx_count > rte_vhost_avail_entries(vdev->vid, VIRTIO_RXQ))) { uint32_t retry; for (retry = 0; retry < burst_rx_retry_num; retry++) { rte_delay_us(burst_rx_delay_time); - if (rx_count <= rte_vring_available_entries(dev, + if (rx_count <= rte_vhost_avail_entries(vdev->vid, VIRTIO_RXQ)) break; } diff --git a/lib/librte_vhost/rte_virtio_net.h b/lib/librte_vhost/rte_virtio_net.h index bc64e89..27f6847 100644 --- a/lib/librte_vhost/rte_virtio_net.h +++ b/lib/librte_vhost/rte_virtio_net.h @@ -245,6 +245,43 @@ int rte_vhost_driver_callback_register(struct virtio_net_device_ops const * cons int rte_vhost_driver_session_start(void); /** + * Get how many avail entries are left in the queue @queue_id. + * + * @param vid + * virtio-net device ID + * @param queue_id + * virtio queue index in mq case + * + * @return + * num of avail entires left + */ +uint16_t rte_vhost_avail_entries(int vid, uint16_t queue_id); + +/** + * Get the number of queues the device supports. + * + * @param vid + * virtio-net device ID + * + * @return + * The number of queues, 0 on failure + */ +uint32_t rte_vhost_get_queue_num(int vid); + +/** + * Get the numa node from which the virtio net device's memory + * is allocated. + * + * @param vid + * virtio-net device ID + * + * @return + * The numa node, -1 on failure + */ +int rte_vhost_get_numa_node(int vid); + + +/** * This function adds buffers to the virtio devices RX virtqueue. Buffers can * be received from the physical port or from another virtual device. 
A packet * count is returned to indicate the number of packets that were succesfully diff --git a/lib/librte_vhost/virtio-net.c b/lib/librte_vhost/virtio-net.c index ea28090..6bf4d87 100644 --- a/lib/librte_vhost/virtio-net.c +++ b/lib/librte_vhost/virtio-net.c @@ -753,6 +753,60 @@ int rte_vhost_feature_disable(uint64_t feature_mask) return 0; } +uint16_t +rte_vhost_avail_entries(int vid, uint16_t queue_id) +{ + struct virtio_net *dev; + struct vhost_virtqueue *vq; + + dev = get_device(vid); + if (!dev) + return 0; + + vq = dev->virtqueue[queue_id]; +
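The body of rte_vhost_avail_entries() is cut off in this archive, so the following is only a sketch of how such a count is commonly derived from two free-running 16-bit ring indexes. The struct and function names (demo_vq, demo_avail_entries) are hypothetical and are not part of librte_vhost.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced view of a virtqueue: only the two indexes. */
struct demo_vq {
	uint16_t avail_idx;     /* free-running index written by the guest */
	uint16_t last_used_idx; /* free-running index consumed by the host */
};

/* Number of entries the guest has made available and the host has not
 * yet consumed. The subtraction is done modulo 2^16, so it stays
 * correct when avail_idx has wrapped around and last_used_idx has not. */
uint16_t
demo_avail_entries(const struct demo_vq *vq)
{
	assert(vq != NULL);
	return (uint16_t)(vq->avail_idx - vq->last_used_idx);
}
```

The cast back to uint16_t matters: without it the difference is computed as an int and can come out negative after a wraparound.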
[dpdk-dev] [PATCH 08/16] vhost: query pmd internal by vid
Query internal by vid instead of "ifname", to avoid the dependency of virtio_net struct. Signed-off-by: Yuanhan Liu --- drivers/net/vhost/rte_eth_vhost.c | 19 +-- 1 file changed, 9 insertions(+), 10 deletions(-) diff --git a/drivers/net/vhost/rte_eth_vhost.c b/drivers/net/vhost/rte_eth_vhost.c index 63538c1..290fd9e 100644 --- a/drivers/net/vhost/rte_eth_vhost.c +++ b/drivers/net/vhost/rte_eth_vhost.c @@ -86,6 +86,7 @@ struct vhost_queue { }; struct pmd_internal { + int vid; char *dev_name; char *iface_name; uint16_t max_queues; @@ -194,20 +195,17 @@ eth_dev_configure(struct rte_eth_dev *dev __rte_unused) } static inline struct internal_list * -find_internal_resource(char *ifname) +find_internal_resource(int vid) { int found = 0; struct internal_list *list; struct pmd_internal *internal; - if (ifname == NULL) - return NULL; - pthread_mutex_lock(_list_lock); TAILQ_FOREACH(list, _list, next) { internal = list->eth_dev->data->dev_private; - if (!strcmp(internal->iface_name, ifname)) { + if (internal->vid == vid) { found = 1; break; } @@ -238,14 +236,15 @@ new_device(struct virtio_net *dev) return -1; } - list = find_internal_resource(dev->ifname); + list = find_internal_resource(dev->vid); if (list == NULL) { - RTE_LOG(INFO, PMD, "Invalid device name\n"); + RTE_LOG(INFO, PMD, "Invalid vid %d\n", dev->vid); return -1; } eth_dev = list->eth_dev; internal = eth_dev->data->dev_private; + internal->vid = dev->vid; #ifdef RTE_LIBRTE_VHOST_NUMA ret = get_mempolicy(, NULL, 0, dev, @@ -371,9 +370,9 @@ vring_state_changed(struct virtio_net *dev, uint16_t vring, int enable) return -1; } - list = find_internal_resource(dev->ifname); + list = find_internal_resource(dev->vid); if (list == NULL) { - RTE_LOG(ERR, PMD, "Invalid interface name: %s\n", dev->ifname); + RTE_LOG(ERR, PMD, "Invalid vid %d\n", dev->vid); return -1; } @@ -884,7 +883,7 @@ rte_pmd_vhost_devuninit(const char *name) if (internal == NULL) return -ENODEV; - list = find_internal_resource(internal->iface_name); + 
list = find_internal_resource(internal->vid); if (list == NULL) return -ENODEV; -- 1.9.0
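As a rough illustration of the lookup this patch switches to, the sketch below walks a mutex-protected TAILQ and matches on an integer id instead of comparing interface names. All names (demo_internal, demo_entry, demo_register, demo_find_by_vid) are hypothetical; this is not the rte_eth_vhost.c code.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical per-port private data: the vid is stored next to the name. */
struct demo_internal {
	int vid;
	const char *iface_name;
};

struct demo_entry {
	TAILQ_ENTRY(demo_entry) next;
	struct demo_internal *internal;
};

TAILQ_HEAD(demo_list_head, demo_entry);

static struct demo_list_head demo_list = TAILQ_HEAD_INITIALIZER(demo_list);
static pthread_mutex_t demo_list_lock = PTHREAD_MUTEX_INITIALIZER;

/* Append an entry to the list under the lock. */
void
demo_register(struct demo_entry *e)
{
	assert(e != NULL && e->internal != NULL);
	pthread_mutex_lock(&demo_list_lock);
	TAILQ_INSERT_TAIL(&demo_list, e, next);
	pthread_mutex_unlock(&demo_list_lock);
}

/* Find the entry whose vid matches; an int compare replaces strcmp(). */
struct demo_entry *
demo_find_by_vid(int vid)
{
	struct demo_entry *e;
	struct demo_entry *found = NULL;

	pthread_mutex_lock(&demo_list_lock);
	TAILQ_FOREACH(e, &demo_list, next) {
		if (e->internal->vid == vid) {
			found = e;
			break;
		}
	}
	pthread_mutex_unlock(&demo_list_lock);

	return found;
}
```

Keying on the id means the caller no longer needs a pointer into the virtio_net struct just to read the interface name.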
[dpdk-dev] [PATCH 04/16] example/vhost: make a copy of virtio device id
Make a copy of virtio device id (device_fh) from the virtio_net struct, so that we could have less dependency on the virtio_net struct. Signed-off-by: Yuanhan Liu --- examples/vhost/main.c | 59 --- examples/vhost/main.h | 1 + 2 files changed, 29 insertions(+), 31 deletions(-) diff --git a/examples/vhost/main.c b/examples/vhost/main.c index 23bfe09..7273897 100644 --- a/examples/vhost/main.c +++ b/examples/vhost/main.c @@ -709,7 +709,6 @@ static int link_vmdq(struct vhost_dev *vdev, struct rte_mbuf *m) { struct ether_hdr *pkt_hdr; - struct virtio_net *dev = vdev->dev; int i, ret; /* Learn MAC address of guest device from packet */ @@ -718,7 +717,7 @@ link_vmdq(struct vhost_dev *vdev, struct rte_mbuf *m) if (find_vhost_dev(_hdr->s_addr)) { RTE_LOG(ERR, VHOST_DATA, "(%d) device is using a registered MAC!\n", - dev->device_fh); + vdev->device_fh); return -1; } @@ -726,12 +725,12 @@ link_vmdq(struct vhost_dev *vdev, struct rte_mbuf *m) vdev->mac_address.addr_bytes[i] = pkt_hdr->s_addr.addr_bytes[i]; /* vlan_tag currently uses the device_id. */ - vdev->vlan_tag = vlan_tags[dev->device_fh]; + vdev->vlan_tag = vlan_tags[vdev->device_fh]; /* Print out VMDQ registration info. */ RTE_LOG(INFO, VHOST_DATA, "(%d) mac %02x:%02x:%02x:%02x:%02x:%02x and vlan %d registered\n", - dev->device_fh, + vdev->device_fh, vdev->mac_address.addr_bytes[0], vdev->mac_address.addr_bytes[1], vdev->mac_address.addr_bytes[2], vdev->mac_address.addr_bytes[3], vdev->mac_address.addr_bytes[4], vdev->mac_address.addr_bytes[5], @@ -739,11 +738,11 @@ link_vmdq(struct vhost_dev *vdev, struct rte_mbuf *m) /* Register the MAC address. */ ret = rte_eth_dev_mac_addr_add(ports[0], >mac_address, - (uint32_t)dev->device_fh + vmdq_pool_base); + (uint32_t)vdev->device_fh + vmdq_pool_base); if (ret) RTE_LOG(ERR, VHOST_DATA, "(%d) failed to add device MAC address to VMDQ\n", - dev->device_fh); + vdev->device_fh); /* Enable stripping of the vlan tag as we handle routing. 
*/ if (vlan_strip) @@ -815,7 +814,6 @@ virtio_tx_local(struct vhost_dev *vdev, struct rte_mbuf *m) { struct ether_hdr *pkt_hdr; struct vhost_dev *dst_vdev; - int fh; pkt_hdr = rte_pktmbuf_mtod(m, struct ether_hdr *); @@ -823,19 +821,19 @@ virtio_tx_local(struct vhost_dev *vdev, struct rte_mbuf *m) if (!dst_vdev) return -1; - fh = dst_vdev->dev->device_fh; - if (fh == vdev->dev->device_fh) { + if (vdev->device_fh == dst_vdev->device_fh) { RTE_LOG(DEBUG, VHOST_DATA, "(%d) TX: src and dst MAC is same. Dropping packet.\n", - fh); + vdev->device_fh); return 0; } - RTE_LOG(DEBUG, VHOST_DATA, "(%d) TX: MAC address is local\n", fh); + RTE_LOG(DEBUG, VHOST_DATA, + "(%d) TX: MAC address is local\n", dst_vdev->device_fh); if (unlikely(dst_vdev->remove)) { RTE_LOG(DEBUG, VHOST_DATA, - "(%d) device is marked for removal\n", fh); + "(%d) device is marked for removal\n", dst_vdev->device_fh); return 0; } @@ -848,7 +846,7 @@ virtio_tx_local(struct vhost_dev *vdev, struct rte_mbuf *m) * and get its vlan tag, and offset if it is. */ static inline int __attribute__((always_inline)) -find_local_dest(struct virtio_net *dev, struct rte_mbuf *m, +find_local_dest(struct vhost_dev *vdev, struct rte_mbuf *m, uint32_t *offset, uint16_t *vlan_tag) { struct vhost_dev *dst_vdev; @@ -858,10 +856,10 @@ find_local_dest(struct virtio_net *dev, struct rte_mbuf *m, if (!dst_vdev) return 0; - if (dst_vdev->dev->device_fh == dev->device_fh) { + if (vdev->device_fh == dst_vdev->device_fh) { RTE_LOG(DEBUG, VHOST_DATA, "(%d) TX: src and dst MAC is same. Dropping packet.\n", - dst_vdev->dev->device_fh); + vdev->device_fh); return -1; } @@ -871,11 +869,11 @@ find_local_dest(struct virtio_net *dev, struct rte_mbuf *m, * the packet length by plus it. */ *offset = VLAN_HLEN; - *vlan_tag = vlan_tags[(uint16_t)dst_vdev->dev->device_fh]; + *vlan_tag = vlan_tags[vdev->device_fh]; RTE_LOG(DEBUG, VHOST_DATA, - "(%d) TX: pkt to local VM device id (%d) vlan tag: %u.\n", -
[dpdk-dev] [PATCH 03/16] vhost: declare device_fh as int
Firstly, "int" would be big enough: we don't need 64 bit to represent a virtio-net device id. Secondly, this could let us avoid the ugly "%" PRIu64 ".." stuff. And since ctx.fh is derived from device_fh, declare it as int, too. Signed-off-by: Yuanhan Liu --- examples/vhost/main.c | 45 ++- lib/librte_vhost/rte_virtio_net.h | 2 +- lib/librte_vhost/vhost-net.h | 8 ++--- lib/librte_vhost/vhost_cuse/vhost-net-cdev.c | 34 ++-- lib/librte_vhost/vhost_cuse/virtio-net-cdev.c | 7 ++--- lib/librte_vhost/vhost_rxtx.c | 34 +--- lib/librte_vhost/vhost_user/virtio-net-user.c | 2 +- lib/librte_vhost/virtio-net.c | 21 ++--- 8 files changed, 74 insertions(+), 79 deletions(-) diff --git a/examples/vhost/main.c b/examples/vhost/main.c index 93f9994..23bfe09 100644 --- a/examples/vhost/main.c +++ b/examples/vhost/main.c @@ -717,7 +717,7 @@ link_vmdq(struct vhost_dev *vdev, struct rte_mbuf *m) if (find_vhost_dev(_hdr->s_addr)) { RTE_LOG(ERR, VHOST_DATA, - "Device (%" PRIu64 ") is using a registered MAC!\n", + "(%d) device is using a registered MAC!\n", dev->device_fh); return -1; } @@ -729,7 +729,8 @@ link_vmdq(struct vhost_dev *vdev, struct rte_mbuf *m) vdev->vlan_tag = vlan_tags[dev->device_fh]; /* Print out VMDQ registration info. 
*/ - RTE_LOG(INFO, VHOST_DATA, "(%"PRIu64") MAC_ADDRESS %02x:%02x:%02x:%02x:%02x:%02x and VLAN_TAG %d registered\n", + RTE_LOG(INFO, VHOST_DATA, + "(%d) mac %02x:%02x:%02x:%02x:%02x:%02x and vlan %d registered\n", dev->device_fh, vdev->mac_address.addr_bytes[0], vdev->mac_address.addr_bytes[1], vdev->mac_address.addr_bytes[2], vdev->mac_address.addr_bytes[3], @@ -740,8 +741,9 @@ link_vmdq(struct vhost_dev *vdev, struct rte_mbuf *m) ret = rte_eth_dev_mac_addr_add(ports[0], >mac_address, (uint32_t)dev->device_fh + vmdq_pool_base); if (ret) - RTE_LOG(ERR, VHOST_DATA, "(%"PRIu64") Failed to add device MAC address to VMDQ\n", - dev->device_fh); + RTE_LOG(ERR, VHOST_DATA, + "(%d) failed to add device MAC address to VMDQ\n", + dev->device_fh); /* Enable stripping of the vlan tag as we handle routing. */ if (vlan_strip) @@ -813,7 +815,7 @@ virtio_tx_local(struct vhost_dev *vdev, struct rte_mbuf *m) { struct ether_hdr *pkt_hdr; struct vhost_dev *dst_vdev; - uint64_t fh; + int fh; pkt_hdr = rte_pktmbuf_mtod(m, struct ether_hdr *); @@ -824,17 +826,16 @@ virtio_tx_local(struct vhost_dev *vdev, struct rte_mbuf *m) fh = dst_vdev->dev->device_fh; if (fh == vdev->dev->device_fh) { RTE_LOG(DEBUG, VHOST_DATA, - "(%" PRIu64 ") TX: src and dst MAC is same. " - "Dropping packet.\n", fh); + "(%d) TX: src and dst MAC is same. Dropping packet.\n", + fh); return 0; } - RTE_LOG(DEBUG, VHOST_DATA, - "(%" PRIu64 ") TX: MAC address is local\n", fh); + RTE_LOG(DEBUG, VHOST_DATA, "(%d) TX: MAC address is local\n", fh); if (unlikely(dst_vdev->remove)) { - RTE_LOG(DEBUG, VHOST_DATA, "(%" PRIu64 ") " - "Device is marked for removal\n", fh); + RTE_LOG(DEBUG, VHOST_DATA, + "(%d) device is marked for removal\n", fh); return 0; } @@ -859,8 +860,8 @@ find_local_dest(struct virtio_net *dev, struct rte_mbuf *m, if (dst_vdev->dev->device_fh == dev->device_fh) { RTE_LOG(DEBUG, VHOST_DATA, - "(%" PRIu64 ") TX: src and dst MAC is same. 
" - " Dropping packet.\n", dst_vdev->dev->device_fh); + "(%d) TX: src and dst MAC is same. Dropping packet.\n", + dst_vdev->dev->device_fh); return -1; } @@ -873,8 +874,7 @@ find_local_dest(struct virtio_net *dev, struct rte_mbuf *m, *vlan_tag = vlan_tags[(uint16_t)dst_vdev->dev->device_fh]; RTE_LOG(DEBUG, VHOST_DATA, - "(%" PRIu64 ") TX: pkt to local VM device id: (%" PRIu64 ") " - "vlan tag: %u.\n", + "(%d) TX: pkt to local VM device id (%d) vlan tag: %u.\n", dev->device_fh, dst_vdev->dev->device_fh, *vlan_tag); return 0; @@ -965,8 +965,8 @@ virtio_tx_route(struct vhost_dev *vdev, struct rte_mbuf *m, uint16_t vlan_tag) } } - RTE_LOG(DEBUG, VHOST_DATA, "(%" PRIu64 ") TX: " - "MAC address is
[dpdk-dev] [PATCH 02/16] vhost: set/reset dev flags internally
It does not make sense to ask the application to set/unset the VIRTIO_DEV_RUNNING flag (which is used internally only) in the new_device()/destroy_device() callbacks. Instead, the vhost lib should set it after new_device() succeeds and reset it before destroy_device() is invoked. This patch fixes it. Signed-off-by: Yuanhan Liu --- drivers/net/vhost/rte_eth_vhost.c | 2 -- examples/vhost/main.c | 3 --- lib/librte_vhost/vhost_user/virtio-net-user.c | 11 +++ lib/librte_vhost/virtio-net.c | 21 ++--- 4 files changed, 21 insertions(+), 16 deletions(-) diff --git a/drivers/net/vhost/rte_eth_vhost.c b/drivers/net/vhost/rte_eth_vhost.c index 310cbef..63538c1 100644 --- a/drivers/net/vhost/rte_eth_vhost.c +++ b/drivers/net/vhost/rte_eth_vhost.c @@ -278,7 +278,6 @@ new_device(struct virtio_net *dev) for (i = 0; i < dev->virt_qp_nb * VIRTIO_QNUM; i++) rte_vhost_enable_guest_notification(dev, i, 0); - dev->flags |= VIRTIO_DEV_RUNNING; dev->priv = eth_dev; eth_dev->data->dev_link.link_status = ETH_LINK_UP; @@ -341,7 +340,6 @@ destroy_device(volatile struct virtio_net *dev) eth_dev->data->dev_link.link_status = ETH_LINK_DOWN; dev->priv = NULL; - dev->flags &= ~VIRTIO_DEV_RUNNING; for (i = 0; i < eth_dev->data->nb_rx_queues; i++) { vq = eth_dev->data->rx_queues[i]; diff --git a/examples/vhost/main.c b/examples/vhost/main.c index d3da41b..93f9994 100644 --- a/examples/vhost/main.c +++ b/examples/vhost/main.c @@ -1180,8 +1180,6 @@ destroy_device (volatile struct virtio_net *dev) struct vhost_dev *vdev; int lcore; - dev->flags &= ~VIRTIO_DEV_RUNNING; - vdev = (struct vhost_dev *)dev->priv; /*set the remove flag. */ vdev->remove = 1; @@ -1258,7 +1256,6 @@ new_device (struct virtio_net *dev) /* Disable notifications. 
*/ rte_vhost_enable_guest_notification(dev, VIRTIO_RXQ, 0); rte_vhost_enable_guest_notification(dev, VIRTIO_TXQ, 0); - dev->flags |= VIRTIO_DEV_RUNNING; RTE_LOG(INFO, VHOST_DATA, "(%"PRIu64") Device has been added to data core %d\n", dev->device_fh, vdev->coreid); diff --git a/lib/librte_vhost/vhost_user/virtio-net-user.c b/lib/librte_vhost/vhost_user/virtio-net-user.c index f5248bc..e775e45 100644 --- a/lib/librte_vhost/vhost_user/virtio-net-user.c +++ b/lib/librte_vhost/vhost_user/virtio-net-user.c @@ -115,8 +115,10 @@ user_set_mem_table(struct vhost_device_ctx ctx, struct VhostUserMsg *pmsg) return -1; /* Remove from the data plane. */ - if (dev->flags & VIRTIO_DEV_RUNNING) + if (dev->flags & VIRTIO_DEV_RUNNING) { + dev->flags &= ~VIRTIO_DEV_RUNNING; notify_ops->destroy_device(dev); + } if (dev->mem) { free_mem_region(dev); @@ -286,9 +288,10 @@ user_set_vring_kick(struct vhost_device_ctx ctx, struct VhostUserMsg *pmsg) "vring kick idx:%d file:%d\n", file.index, file.fd); vhost_set_vring_kick(ctx, ); - if (virtio_is_ready(dev) && - !(dev->flags & VIRTIO_DEV_RUNNING)) - notify_ops->new_device(dev); + if (virtio_is_ready(dev) && !(dev->flags & VIRTIO_DEV_RUNNING)) { + if (notify_ops->new_device(dev) == 0) + dev->flags |= VIRTIO_DEV_RUNNING; + } } /* diff --git a/lib/librte_vhost/virtio-net.c b/lib/librte_vhost/virtio-net.c index f7c7215..5eea3be 100644 --- a/lib/librte_vhost/virtio-net.c +++ b/lib/librte_vhost/virtio-net.c @@ -296,8 +296,10 @@ vhost_destroy_device(struct vhost_device_ctx ctx) if (dev == NULL) return; - if (dev->flags & VIRTIO_DEV_RUNNING) + if (dev->flags & VIRTIO_DEV_RUNNING) { + dev->flags &= ~VIRTIO_DEV_RUNNING; notify_ops->destroy_device(dev); + } cleanup_device(dev, 1); free_device(dev); @@ -352,8 +354,10 @@ vhost_reset_owner(struct vhost_device_ctx ctx) if (dev == NULL) return -1; - if (dev->flags & VIRTIO_DEV_RUNNING) + if (dev->flags & VIRTIO_DEV_RUNNING) { + dev->flags &= ~VIRTIO_DEV_RUNNING; notify_ops->destroy_device(dev); + } 
cleanup_device(dev, 0); reset_device(dev); @@ -718,12 +722,15 @@ vhost_set_backend(struct vhost_device_ctx ctx, struct vhost_vring_file *file) if (!(dev->flags & VIRTIO_DEV_RUNNING)) { if (dev->virtqueue[VIRTIO_TXQ]->backend != VIRTIO_DEV_STOPPED && dev->virtqueue[VIRTIO_RXQ]->backend != VIRTIO_DEV_STOPPED) { - return notify_ops->new_device(dev); + if (notify_ops->new_device(dev) < 0) + return -1; + dev->flags |= VIRTIO_DEV_RUNNING; } - /* Otherwise we
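The ordering this patch establishes (set the flag only after new_device() succeeds, clear it before destroy_device() runs) can be sketched as below. This is a reduced illustration, not the library code: demo_dev, demo_ops, demo_start, demo_stop and the three sample callbacks are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_DEV_RUNNING 1u /* hypothetical stand-in for VIRTIO_DEV_RUNNING */

struct demo_dev {
	uint32_t flags;
};

/* Application callbacks; they never touch dev->flags. */
struct demo_ops {
	int (*new_device)(struct demo_dev *dev);
	void (*destroy_device)(struct demo_dev *dev);
};

/* Library side: mark the device running only if the callback accepted it. */
int
demo_start(struct demo_dev *dev, const struct demo_ops *ops)
{
	assert(dev != NULL && ops != NULL);
	if (dev->flags & DEMO_DEV_RUNNING)
		return 0;
	if (ops->new_device(dev) != 0)
		return -1;
	dev->flags |= DEMO_DEV_RUNNING;
	return 0;
}

/* Library side: clear the flag first, then tell the application. */
void
demo_stop(struct demo_dev *dev, const struct demo_ops *ops)
{
	assert(dev != NULL && ops != NULL);
	if (dev->flags & DEMO_DEV_RUNNING) {
		dev->flags &= ~DEMO_DEV_RUNNING;
		ops->destroy_device(dev);
	}
}

/* Sample callbacks used to exercise the two paths. */
static int demo_destroy_calls;

static int demo_accept(struct demo_dev *dev) { (void)dev; return 0; }
static int demo_reject(struct demo_dev *dev) { (void)dev; return -1; }
static void demo_count_destroy(struct demo_dev *dev) { (void)dev; demo_destroy_calls++; }
```

With the flag owned by the library, a callback that fails leaves the device stopped, and destroy_device() is never invoked for a device that was not started.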
[dpdk-dev] [PATCH 01/16] vhost: declare backend with int type
It's an fd, so define it as "int", which also saves the unnecessary
(int) cast.

Signed-off-by: Yuanhan Liu
---
 lib/librte_vhost/rte_virtio_net.h | 2 +-
 lib/librte_vhost/virtio-net.c     | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/librte_vhost/rte_virtio_net.h b/lib/librte_vhost/rte_virtio_net.h
index 600b20b..4d25f79 100644
--- a/lib/librte_vhost/rte_virtio_net.h
+++ b/lib/librte_vhost/rte_virtio_net.h
@@ -85,7 +85,7 @@ struct vhost_virtqueue {
 	struct vring_avail *avail; /**< Virtqueue available ring. */
 	struct vring_used *used; /**< Virtqueue used ring. */
 	uint32_t size; /**< Size of descriptor ring. */
-	uint32_t backend; /**< Backend value to determine if device should started/stopped. */
+	int backend; /**< Backend value to determine if device should started/stopped. */
 	uint16_t vhost_hlen; /**< Vhost header length (varies depending on RX merge buffers. */
 	volatile uint16_t last_used_idx; /**< Last index used on the available ring */
 	volatile uint16_t last_used_idx_res; /**< Used for multiple devices reserving buffers. */
diff --git a/lib/librte_vhost/virtio-net.c b/lib/librte_vhost/virtio-net.c
index d870ad9..f7c7215 100644
--- a/lib/librte_vhost/virtio-net.c
+++ b/lib/librte_vhost/virtio-net.c
@@ -716,8 +716,8 @@ vhost_set_backend(struct vhost_device_ctx ctx, struct vhost_vring_file *file)
 	 * we add the device.
 	 */
 	if (!(dev->flags & VIRTIO_DEV_RUNNING)) {
-		if (((int)dev->virtqueue[VIRTIO_TXQ]->backend != VIRTIO_DEV_STOPPED) &&
-			((int)dev->virtqueue[VIRTIO_RXQ]->backend != VIRTIO_DEV_STOPPED)) {
+		if (dev->virtqueue[VIRTIO_TXQ]->backend != VIRTIO_DEV_STOPPED &&
+		    dev->virtqueue[VIRTIO_RXQ]->backend != VIRTIO_DEV_STOPPED) {
 			return notify_ops->new_device(dev);
 		}
 		/* Otherwise we remove it. */
-- 
1.9.0
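A small sketch of why the field type matters, under the assumption that the "stopped" sentinel is a negative value such as -1 (the patch does not show the definition of VIRTIO_DEV_STOPPED). The names demo_vq_old, demo_vq_new and DEMO_DEV_STOPPED are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_DEV_STOPPED (-1) /* assumed sentinel meaning "no backend fd" */

/* Before: the fd lives in an unsigned field. */
struct demo_vq_old {
	uint32_t backend;
};

/* After: the fd lives in an int, like any other file descriptor. */
struct demo_vq_new {
	int backend;
};

int
demo_old_is_running(const struct demo_vq_old *vq)
{
	assert(vq != NULL);
	/* The cast keeps the comparison signed and avoids a
	 * signed/unsigned comparison warning. */
	return (int)vq->backend != DEMO_DEV_STOPPED;
}

int
demo_new_is_running(const struct demo_vq_new *vq)
{
	assert(vq != NULL);
	/* Both operands are int, so no cast is needed. */
	return vq->backend != DEMO_DEV_STOPPED;
}
```

Both helpers give the same answer on mainstream compilers; the int version simply states the intent without a conversion at every use.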
[dpdk-dev] [PATCH 00/16] vhost ABI/API refactoring
Every time we introduce a new feature to vhost, we are likely to break ABI. Moreover, some cleanups (such as the one from Ilya to remove buf_vec from the vhost_virtqueue struct) also break ABI.

This patch set is meant to resolve the above issue for good, by hiding the virtio_net structure (as well as a few others) internally and exposing the virtio_net device to applications as a number, vid, much like the kernel exposes an fd to user space.

Back to the patch set: the first part makes changes to the vhost example, vhost-pmd and vhost, bit by bit, to remove the dependency on the "virtio_net" struct. It then makes the final change of adapting the current APIs to use "vid".

After that, "virtio_net_device_ops" is the only open struct left that an application can access; therefore, it is the only place that might introduce ABI breakage when it is extended in the future. Hence, I reserved a few more (5) slots in it, to make sure we will not break ABI for a long time and, hopefully, forever.

The last bit of this patch set is some cleanups, including the one from Ilya.

Note that this refactoring breaks the tep_termination example. Well, it's just another copy of the original messy vhost example, and I have no interest in cleaning it up again. Therefore, I might consider removing that example later and adding the vxlan bits to the vhost example.

A few more TODOs: update the release notes, update the lib version, update version.map.

Thanks.
--yliu --- Ilya Maximets (1): vhost: make buf vector for scatter Rx local Yuanhan Liu (15): vhost: declare backend with int type vhost: set/reset dev flags internally vhost: declare device_fh as int example/vhost: make a copy of virtio device id vhost: rename device_fh to vid vhost: get device by vid only vhost: move vhost_device_ctx to cuse vhost: query pmd internal by vid vhost: add few more functions vhost: export vid as the only interface to applications vhost: hide internal structs/macros/functions vhost: remove unnecessary fields vhost: remove virtio-net.h vhost: reserve few more space for future extension vhost: per device vhost_hlen drivers/net/vhost/rte_eth_vhost.c | 86 --- examples/vhost/main.c | 126 --- examples/vhost/main.h | 1 + lib/librte_vhost/rte_virtio_net.h | 197 ++-- lib/librte_vhost/vhost-net.h | 195 +++ lib/librte_vhost/vhost_cuse/vhost-net-cdev.c | 83 +- lib/librte_vhost/vhost_cuse/virtio-net-cdev.c | 30 ++-- lib/librte_vhost/vhost_cuse/virtio-net-cdev.h | 12 +- lib/librte_vhost/vhost_rxtx.c | 133 lib/librte_vhost/vhost_user/vhost-net-user.c | 53 +++ lib/librte_vhost/vhost_user/virtio-net-user.c | 64 lib/librte_vhost/vhost_user/virtio-net-user.h | 18 +-- lib/librte_vhost/virtio-net.c | 213 -- lib/librte_vhost/virtio-net.h | 43 -- 14 files changed, 644 insertions(+), 610 deletions(-) delete mode 100644 lib/librte_vhost/virtio-net.h -- 1.9.0
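The idea of handing applications a number instead of a struct pointer can be sketched as below. This is a minimal illustration under stated assumptions, not the librte_vhost implementation: the demo_* names are hypothetical, and the fixed-size table only mirrors the vhost_devices[MAX_VHOST_DEVICE] array visible in the patches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_MAX_DEVICE 1024 /* hypothetical limit */

/* Private to the library. Applications never see this layout, so fields
 * can be added or removed without changing the public ABI. */
struct demo_dev {
	int vid;
	uint32_t virt_qp_nb;
};

static struct demo_dev *demo_devices[DEMO_MAX_DEVICE];

/* Internal lookup: translate the id back to the private struct. */
static struct demo_dev *
demo_get_device(int vid)
{
	if (vid < 0 || vid >= DEMO_MAX_DEVICE)
		return NULL;
	return demo_devices[vid];
}

/* Allocate a device in the first free slot; return its id or -1. */
int
demo_new_device(uint32_t queue_pairs)
{
	struct demo_dev *dev;
	int vid;

	for (vid = 0; vid < DEMO_MAX_DEVICE; vid++) {
		if (demo_devices[vid] == NULL)
			break;
	}
	if (vid == DEMO_MAX_DEVICE)
		return -1;

	dev = calloc(1, sizeof(*dev));
	if (dev == NULL)
		return -1;

	dev->vid = vid;
	dev->virt_qp_nb = queue_pairs;
	demo_devices[vid] = dev;
	return vid;
}

/* Public accessor: the application passes only the id; 0 on failure. */
uint32_t
demo_get_queue_num(int vid)
{
	struct demo_dev *dev = demo_get_device(vid);

	return dev ? dev->virt_qp_nb : 0;
}

/* Release the slot so the id can be reused. */
void
demo_destroy_device(int vid)
{
	struct demo_dev *dev = demo_get_device(vid);

	if (dev == NULL)
		return;
	assert(dev->vid == vid);
	demo_devices[vid] = NULL;
	free(dev);
}
```

Every public call pays one bounds check and one table lookup; in exchange, the struct layout stops being part of the contract with applications.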
[dpdk-dev] [PATCH] lpm6: fix assigned value is garbage or undefined
2016-04-27 17:07, Daniel Mrzyglod:
> Fix issue reported by clang scan-build
>
> Value of pointer tbl_next was uninitialized. When function lookup_step()
> take else branch it may provide garbage into tbl = tbl_next;
>
> Fixes: 5c510e13a9cb ("lpm: add IPv6 support")
>
> Signed-off-by: Daniel Mrzyglod

Applied, thanks
[dpdk-dev] [PATCH] cfgfile: fix uninitialized scalar variable
> > CID 13323:
> > Uninitialized scalar variable. Using uninitialized value
> > cfg->num_sections when calling rte_cfgfile_close.
> >
> > Fixes: eaafbad419bf ("cfgfile: library to interpret config files")
> >
> > Signed-off-by: Michal Kobylinski
>
> Acked-by: Cristian Dumitrescu

Applied, thanks
[dpdk-dev] [PATCH] cfgfile: fix return value comment
> > Function rte_cfgfile_load can return NULL value, when something goes
> > wrong.
> >
> > Signed-off-by: Dmitriy Yakovlev
> Acked-by: Cristian Dumitrescu

Applied, thanks
[dpdk-dev] [PATCH v2 8/8] examples/vhost: embed statistics into vhost_dev struct
Embed dev_statistics into the vhost_dev struct, which cleans up the code a
bit.

Signed-off-by: Yuanhan Liu
---
 examples/vhost/main.c | 87 +++
 examples/vhost/main.h | 11 +--
 2 files changed, 40 insertions(+), 58 deletions(-)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index 66d3bf2..d3da41b 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -217,15 +217,6 @@ struct mbuf_table lcore_tx_queue[RTE_MAX_LCORE];
 				 / US_PER_S * BURST_TX_DRAIN_US)
 #define VLAN_HLEN       4
 
-/* Per-device statistics struct */
-struct device_statistics {
-	uint64_t tx_total;
-	rte_atomic64_t rx_total_atomic;
-	uint64_t tx;
-	rte_atomic64_t rx_atomic;
-} __rte_cache_aligned;
-struct device_statistics dev_statistics[MAX_DEVICES];
-
 /*
  * Builds up the correct configuration for VMDQ VLAN pool map
  * according to the pool & queue limits.
@@ -799,17 +790,17 @@ unlink_vmdq(struct vhost_dev *vdev)
 }
 
 static inline void __attribute__((always_inline))
-virtio_xmit(struct virtio_net *dst_dev, struct virtio_net *src_dev,
+virtio_xmit(struct vhost_dev *dst_vdev, struct vhost_dev *src_vdev,
 	    struct rte_mbuf *m)
 {
 	uint16_t ret;
 
-	ret = rte_vhost_enqueue_burst(dst_dev, VIRTIO_RXQ, &m, 1);
+	ret = rte_vhost_enqueue_burst(dst_vdev->dev, VIRTIO_RXQ, &m, 1);
 	if (enable_stats) {
-		rte_atomic64_inc(&dev_statistics[dst_dev->device_fh].rx_total_atomic);
-		rte_atomic64_add(&dev_statistics[dst_dev->device_fh].rx_atomic, ret);
-		dev_statistics[src_dev->device_fh].tx_total++;
-		dev_statistics[src_dev->device_fh].tx += ret;
+		rte_atomic64_inc(&dst_vdev->stats.rx_total_atomic);
+		rte_atomic64_add(&dst_vdev->stats.rx_atomic, ret);
+		src_vdev->stats.tx_total++;
+		src_vdev->stats.tx += ret;
 	}
 }
 
@@ -847,7 +838,7 @@ virtio_tx_local(struct vhost_dev *vdev, struct rte_mbuf *m)
 		return 0;
 	}
 
-	virtio_xmit(dst_vdev->dev, vdev->dev, m);
+	virtio_xmit(dst_vdev, vdev, m);
 	return 0;
 }
 
@@ -956,7 +947,7 @@ virtio_tx_route(struct vhost_dev *vdev, struct rte_mbuf *m, uint16_t vlan_tag)
 		struct vhost_dev *vdev2;
 
 		TAILQ_FOREACH(vdev2, &vhost_dev_list, next) {
-			virtio_xmit(vdev2->dev, vdev->dev, m);
+			virtio_xmit(vdev2, vdev, m);
 		}
 		goto queue2nic;
 	}
@@ -1020,8 +1011,8 @@ queue2nic:
 
 	tx_q->m_table[tx_q->len++] = m;
 	if (enable_stats) {
-		dev_statistics[dev->device_fh].tx_total++;
-		dev_statistics[dev->device_fh].tx++;
+		vdev->stats.tx_total++;
+		vdev->stats.tx++;
 	}
 
 	if (unlikely(tx_q->len == MAX_PKT_BURST))
@@ -1082,10 +1073,8 @@ drain_eth_rx(struct vhost_dev *vdev)
 	enqueue_count = rte_vhost_enqueue_burst(dev, VIRTIO_RXQ,
 						pkts, rx_count);
 	if (enable_stats) {
-		uint64_t fh = dev->device_fh;
-
-		rte_atomic64_add(&dev_statistics[fh].rx_total_atomic, rx_count);
-		rte_atomic64_add(&dev_statistics[fh].rx_atomic, enqueue_count);
+		rte_atomic64_add(&vdev->stats.rx_total_atomic, rx_count);
+		rte_atomic64_add(&vdev->stats.rx_atomic, enqueue_count);
 	}
 
 	free_pkts(pkts, rx_count);
@@ -1266,9 +1255,6 @@ new_device (struct virtio_net *dev)
 	TAILQ_INSERT_TAIL(&lcore_info[vdev->coreid].vdev_list, vdev, next);
 	lcore_info[vdev->coreid].device_num++;
 
-	/* Initialize device stats */
-	memset(&dev_statistics[dev->device_fh], 0, sizeof(struct device_statistics));
-
 	/* Disable notifications. */
 	rte_vhost_enable_guest_notification(dev, VIRTIO_RXQ, 0);
 	rte_vhost_enable_guest_notification(dev, VIRTIO_TXQ, 0);
@@ -1299,7 +1285,6 @@ print_stats(void)
 	struct vhost_dev *vdev;
 	uint64_t tx_dropped, rx_dropped;
 	uint64_t tx, tx_total, rx, rx_total;
-	uint32_t device_fh;
 	const char clr[] = { 27, '[', '2', 'J', '\0' };
 	const char top_left[] = { 27, '[', '1', ';', '1', 'H','\0' };
 
@@ -1307,37 +1292,32 @@ print_stats(void)
 		sleep(enable_stats);
 
 		/* Clear screen and move to top left */
-		printf("%s%s", clr, top_left);
-
-		printf("\nDevice statistics ");
+		printf("%s%s\n", clr, top_left);
+		printf("Device statistics =\n");
 
 		TAILQ_FOREACH(vdev, &vhost_dev_list, next) {
-			device_fh = vdev->dev->device_fh;
-			tx_total = dev_statistics[device_fh].tx_total;
-			tx = dev_statistics[device_fh].tx;
+			tx_total = vdev->stats.tx_total;
+
[dpdk-dev] [PATCH v2 7/8] examples/vhost: switch_worker cleanup
switch_worker() is the last piece of messy code that touches the
virtio/vhost device. Clean it up here, so that the later vhost ABI
refactoring will be less painful.

The cleanup is straightforward: break long lines and move some code into
functions. Lastly, comment a bit on switch_worker().

Signed-off-by: Yuanhan Liu
---
 examples/vhost/main.c | 253 +++---
 1 file changed, 136 insertions(+), 117 deletions(-)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index dbb42ee..66d3bf2 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -213,6 +213,8 @@ struct mbuf_table {
 /* TX queue for each data core. */
 struct mbuf_table lcore_tx_queue[RTE_MAX_LCORE];
 
+#define MBUF_TABLE_DRAIN_TSC	((rte_get_tsc_hz() + US_PER_S - 1) \
+				 / US_PER_S * BURST_TX_DRAIN_US)
 #define VLAN_HLEN       4
 
 /* Per-device statistics struct */
@@ -915,16 +917,35 @@ static void virtio_tx_offload(struct rte_mbuf *m)
 	tcp_hdr->cksum = get_psd_sum(l3_hdr, m->ol_flags);
 }
 
+static inline void
+free_pkts(struct rte_mbuf **pkts, uint16_t n)
+{
+	while (n--)
+		rte_pktmbuf_free(pkts[n]);
+}
+
+static inline void __attribute__((always_inline))
+do_drain_mbuf_table(struct mbuf_table *tx_q)
+{
+	uint16_t count;
+
+	count = rte_eth_tx_burst(ports[0], tx_q->txq_id,
+				 tx_q->m_table, tx_q->len);
+	if (unlikely(count < tx_q->len))
+		free_pkts(&tx_q->m_table[count], tx_q->len - count);
+
+	tx_q->len = 0;
+}
+
 /*
- * This function routes the TX packet to the correct interface. This may be a local device
- * or the physical port.
+ * This function routes the TX packet to the correct interface. This
+ * may be a local device or the physical port.
  */
 static inline void __attribute__((always_inline))
 virtio_tx_route(struct vhost_dev *vdev, struct rte_mbuf *m, uint16_t vlan_tag)
 {
 	struct mbuf_table *tx_q;
-	struct rte_mbuf **m_table;
-	unsigned len, ret, offset = 0;
+	unsigned offset = 0;
 	const uint16_t lcore_id = rte_lcore_id();
 	struct virtio_net *dev = vdev->dev;
 	struct ether_hdr *nh;
@@ -960,7 +981,6 @@ queue2nic:
 
 	/*Add packet to the port tx queue*/
 	tx_q = &lcore_tx_queue[lcore_id];
-	len = tx_q->len;
 
 	nh = rte_pktmbuf_mtod(m, struct ether_hdr *);
 	if (unlikely(nh->ether_type == rte_cpu_to_be_16(ETHER_TYPE_VLAN))) {
@@ -998,55 +1018,130 @@ queue2nic:
 	if (m->ol_flags & PKT_TX_TCP_SEG)
 		virtio_tx_offload(m);
 
-	tx_q->m_table[len] = m;
-	len++;
+	tx_q->m_table[tx_q->len++] = m;
 	if (enable_stats) {
 		dev_statistics[dev->device_fh].tx_total++;
 		dev_statistics[dev->device_fh].tx++;
 	}
 
-	if (unlikely(len == MAX_PKT_BURST)) {
-		m_table = (struct rte_mbuf **)tx_q->m_table;
-		ret = rte_eth_tx_burst(ports[0], (uint16_t)tx_q->txq_id, m_table, (uint16_t) len);
-		/* Free any buffers not handled by TX and update the port stats. */
-		if (unlikely(ret < len)) {
-			do {
-				rte_pktmbuf_free(m_table[ret]);
-			} while (++ret < len);
+	if (unlikely(tx_q->len == MAX_PKT_BURST))
+		do_drain_mbuf_table(tx_q);
+}
+
+
+static inline void __attribute__((always_inline))
+drain_mbuf_table(struct mbuf_table *tx_q)
+{
+	static uint64_t prev_tsc;
+	uint64_t cur_tsc;
+
+	if (tx_q->len == 0)
+		return;
+
+	cur_tsc = rte_rdtsc();
+	if (unlikely(cur_tsc - prev_tsc > MBUF_TABLE_DRAIN_TSC)) {
+		prev_tsc = cur_tsc;
+
+		RTE_LOG(DEBUG, VHOST_DATA,
+			"TX queue drained after timeout with burst size %u\n",
+			tx_q->len);
+		do_drain_mbuf_table(tx_q);
+	}
+}
+
+static inline void __attribute__((always_inline))
+drain_eth_rx(struct vhost_dev *vdev)
+{
+	uint16_t rx_count, enqueue_count;
+	struct virtio_net *dev = vdev->dev;
+	struct rte_mbuf *pkts[MAX_PKT_BURST];
+
+	rx_count = rte_eth_rx_burst(ports[0], vdev->vmdq_rx_q,
+				    pkts, MAX_PKT_BURST);
+	if (!rx_count)
+		return;
+
+	/*
+	 * When "enable_retry" is set, here we wait and retry when there
+	 * is no enough free slots in the queue to hold @rx_count packets,
+	 * to diminish packet loss.
+	 */
+	if (enable_retry &&
+	    unlikely(rx_count > rte_vring_available_entries(dev,
+			VIRTIO_RXQ))) {
+		uint32_t retry;
+
+		for (retry = 0; retry < burst_rx_retry_num; retry++) {
+			rte_delay_us(burst_rx_delay_time);
+			if (rx_count <= rte_vring_available_entries(dev,
+
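
For reference, the MBUF_TABLE_DRAIN_TSC arithmetic in the patch rounds the TSC frequency up to whole cycles per microsecond before scaling by the drain interval. A standalone sketch of that calculation follows. The function name drain_tsc is made up here, and the frequency is passed in as a plain argument instead of calling rte_get_tsc_hz().

```c
#include <stdint.h>

#define US_PER_S 1000000ULL

/*
 * Cycles to wait before draining: round the TSC frequency up to whole
 * cycles per microsecond, then multiply by the interval in microseconds.
 */
static uint64_t
drain_tsc(uint64_t tsc_hz, uint64_t drain_us)
{
	return (tsc_hz + US_PER_S - 1) / US_PER_S * drain_us;
}
```

With a 2.5 GHz TSC and the 100 us interval the example uses (BURST_TX_DRAIN_US), this comes to 250000 cycles.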
[dpdk-dev] [PATCH v2 6/8] examples/vhost: fix mbuf allocation failure
It has always been a mystery (at least to me) how many mbufs are enough
when creating an mbuf pool. While the current macro NUM_MBUFS_PER_PORT
gives you some insight, it's not that accurate: it doesn't consider the
case where we may receive a big packet, say 64K when TSO is enabled.

We actually tried to fix it once before, with commit 5499c1fc9baa
("examples/vhost: fix mbuf allocation"), but that just worked around it
by enlarging the pool a bit so that the case described in the commit log
passes.

So, while trying to fix it for good, I thought about how big is big
enough, and which factors need to be considered to figure out a proper
value. This patch is the result: I introduced a helper function to create
the mbuf pool, and do the "how many mbufs are needed" calculation there.
I also added detailed comments on how the number is derived, to serve as
guidelines.

Fixes: 9fd72e3cbd29 ("examples/vhost: add virtio offload")
Fixes: 5499c1fc9baa ("examples/vhost: fix mbuf allocation")

Signed-off-by: Yuanhan Liu
---
 examples/vhost/main.c | 79 ++-
 1 file changed, 59 insertions(+), 20 deletions(-)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index 7448e4f..dbb42ee 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -62,14 +62,6 @@
 /* the maximum number of external ports supported */
 #define MAX_SUP_PORTS 1
 
-/*
- * Calculate the number of buffers needed per port
- */
-#define NUM_MBUFS_PER_PORT ((MAX_QUEUES*RTE_TEST_RX_DESC_DEFAULT) +		\
-				(num_switching_cores*MAX_PKT_BURST) +		\
-				(num_switching_cores*RTE_TEST_TX_DESC_DEFAULT) +\
-				((num_switching_cores+1)*MBUF_CACHE_SIZE))
-
 #define MBUF_CACHE_SIZE	128
 #define MBUF_DATA_SIZE	RTE_MBUF_DEFAULT_BUF_SIZE
 
@@ -110,9 +102,6 @@ static uint32_t enabled_port_mask = 0;
 /* Promiscuous mode */
 static uint32_t promiscuous;
 
-/*Number of switching cores enabled*/
-static uint32_t num_switching_cores = 0;
-
 /* number of devices/queues to support*/
 static uint32_t num_queues = 0;
 static uint32_t num_devices;
@@ -1345,6 +1334,57 @@ sigint_handler(__rte_unused int signum)
 }
 
 /*
+ * While creating an mbuf pool, one key thing is to figure out how
+ * many mbuf entries is enough for our use. FYI, here are some
+ * guidelines:
+ *
+ * - Each rx queue would reserve @nr_rx_desc mbufs at queue setup stage
+ *
+ * - For each switch core (A CPU core does the packet switch), we need
+ *   also make some reservation for receiving the packets from virtio
+ *   Tx queue. How many is enough depends on the usage. It's normally
+ *   a simple calculation like following:
+ *
+ *       MAX_PKT_BURST * max packet size / mbuf size
+ *
+ *   So, we definitely need allocate more mbufs when TSO is enabled.
+ *
+ * - Similarly, for each switching core, we should serve @nr_rx_desc
+ *   mbufs for receiving the packets from physical NIC device.
+ *
+ * - We also need make sure, for each switch core, we have allocated
+ *   enough mbufs to fill up the mbuf cache.
+ */
+static void
+create_mbuf_pool(uint16_t nr_port, uint32_t nr_switch_core, uint32_t mbuf_size,
+	uint32_t nr_queues, uint32_t nr_rx_desc, uint32_t nr_mbuf_cache)
+{
+	uint32_t nr_mbufs;
+	uint32_t nr_mbufs_per_core;
+	uint32_t mtu = 1500;
+
+	if (mergeable)
+		mtu = 9000;
+	if (enable_tso)
+		mtu = 64 * 1024;
+
+	nr_mbufs_per_core  = (mtu + mbuf_size) * MAX_PKT_BURST /
+			(mbuf_size - RTE_PKTMBUF_HEADROOM) * MAX_PKT_BURST;
+	nr_mbufs_per_core += nr_rx_desc;
+	nr_mbufs_per_core  = RTE_MAX(nr_mbufs_per_core, nr_mbuf_cache);
+
+	nr_mbufs  = nr_queues * nr_rx_desc;
+	nr_mbufs += nr_mbufs_per_core * nr_switch_core;
+	nr_mbufs *= nr_port;
+
+	mbuf_pool = rte_pktmbuf_pool_create("MBUF_POOL", nr_mbufs,
+					    nr_mbuf_cache, 0, mbuf_size,
+					    rte_socket_id());
+	if (mbuf_pool == NULL)
+		rte_exit(EXIT_FAILURE, "Cannot create mbuf pool\n");
+}
+
+/*
  * Main function, does initialisation and calls the per-lcore functions. The CUSE
  * device is also registered here to handle the IOCTLs.
  */
@@ -1381,9 +1421,6 @@ main(int argc, char *argv[])
 	if (rte_lcore_count() > RTE_MAX_LCORE)
 		rte_exit(EXIT_FAILURE,"Not enough cores\n");
 
-	/*set the number of swithcing cores available*/
-	num_switching_cores = rte_lcore_count()-1;
-
 	/* Get the number of physical ports. */
 	nb_ports = rte_eth_dev_count();
 	if (nb_ports > RTE_MAX_ETHPORTS)
@@ -1401,12 +1438,14 @@ main(int argc, char *argv[])
 		return -1;
 	}
 
-	/* Create the mbuf pool. */
-	mbuf_pool =
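
As a worked illustration of the sizing guidelines in the patch comment, the sketch below follows the comment's wording ("MAX_PKT_BURST * max packet size / mbuf size", plus @nr_rx_desc, with the mbuf cache as a floor) rather than reproducing the exact expression in create_mbuf_pool(). The function name estimate_nr_mbufs, the rounding up to whole segments and the data_room parameter are assumptions made for this sketch.

```c
#include <stdint.h>

#define MAX_PKT_BURST 32	/* same value as in the example */

/*
 * Rough mbuf count following the guidelines in the patch comment.
 * data_room is the usable payload per mbuf (buffer size minus headroom).
 */
static uint32_t
estimate_nr_mbufs(uint32_t nr_port, uint32_t nr_switch_core,
		  uint32_t nr_queues, uint32_t nr_rx_desc,
		  uint32_t nr_mbuf_cache, uint32_t max_pkt_size,
		  uint32_t data_room)
{
	/* mbuf segments needed to hold one maximum-size packet */
	uint32_t segs = (max_pkt_size + data_room - 1) / data_room;
	/* one burst from virtio Tx plus one ring worth of NIC packets */
	uint32_t per_core = MAX_PKT_BURST * segs + nr_rx_desc;

	/* each core must at least be able to fill its mbuf cache */
	if (per_core < nr_mbuf_cache)
		per_core = nr_mbuf_cache;

	return (nr_queues * nr_rx_desc + per_core * nr_switch_core) * nr_port;
}
```

With illustrative inputs of 1 port, 3 switching cores, 128 queues, 1024 Rx descriptors, a cache of 128 and 2048 bytes of data room, a 1500-byte maximum packet gives 134240 mbufs, while a 64K TSO packet gives 137216. The difference is the extra per-core reservation the commit message talks about.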
[dpdk-dev] [PATCH v2 5/8] examples/vhost: handle broadcast packet
Every time I do a VM2VM iperf test with the vhost example, I have to set
the ARP table manually, as vhost-switch just ignores broadcast packets,
leaving the ARP request unanswered.

Here we transmit a broadcast packet (such as an ARP request) to every
vhost device, as well as to the physical port, to fix the above ARP table
issue.

Signed-off-by: Yuanhan Liu
---
 examples/vhost/main.c | 39 +--
 1 file changed, 29 insertions(+), 10 deletions(-)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index bfcabf3..7448e4f 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -807,6 +807,21 @@ unlink_vmdq(struct vhost_dev *vdev)
 	}
 }
 
+static inline void __attribute__((always_inline))
+virtio_xmit(struct virtio_net *dst_dev, struct virtio_net *src_dev,
+	    struct rte_mbuf *m)
+{
+	uint16_t ret;
+
+	ret = rte_vhost_enqueue_burst(dst_dev, VIRTIO_RXQ, &m, 1);
+	if (enable_stats) {
+		rte_atomic64_inc(&dev_statistics[dst_dev->device_fh].rx_total_atomic);
+		rte_atomic64_add(&dev_statistics[dst_dev->device_fh].rx_atomic, ret);
+		dev_statistics[src_dev->device_fh].tx_total++;
+		dev_statistics[src_dev->device_fh].tx += ret;
+	}
+}
+
 /*
  * Check if the packet destination MAC address is for a local device. If so then put
  * the packet on that devices RX queue. If not then return.
@@ -815,7 +830,6 @@ static inline int __attribute__((always_inline))
 virtio_tx_local(struct vhost_dev *vdev, struct rte_mbuf *m)
 {
 	struct ether_hdr *pkt_hdr;
-	uint64_t ret = 0;
 	struct vhost_dev *dst_vdev;
 	uint64_t fh;
 
@@ -842,15 +856,7 @@ virtio_tx_local(struct vhost_dev *vdev, struct rte_mbuf *m)
 		return 0;
 	}
 
-	/* send the packet to the local virtio device */
-	ret = rte_vhost_enqueue_burst(dst_vdev->dev, VIRTIO_RXQ, &m, 1);
-	if (enable_stats) {
-		rte_atomic64_inc(&dev_statistics[fh].rx_total_atomic);
-		rte_atomic64_add(&dev_statistics[fh].rx_atomic, ret);
-		dev_statistics[vdev->dev->device_fh].tx_total++;
-		dev_statistics[vdev->dev->device_fh].tx += ret;
-	}
-
+	virtio_xmit(dst_vdev->dev, vdev->dev, m);
 	return 0;
 }
 
@@ -934,6 +940,17 @@ virtio_tx_route(struct vhost_dev *vdev, struct rte_mbuf *m, uint16_t vlan_tag)
 	struct virtio_net *dev = vdev->dev;
 	struct ether_hdr *nh;
 
+
+	nh = rte_pktmbuf_mtod(m, struct ether_hdr *);
+	if (unlikely(is_broadcast_ether_addr(&nh->d_addr))) {
+		struct vhost_dev *vdev2;
+
+		TAILQ_FOREACH(vdev2, &vhost_dev_list, next) {
+			virtio_xmit(vdev2->dev, vdev->dev, m);
+		}
+		goto queue2nic;
+	}
+
 	/*check if destination is local VM*/
 	if ((vm2vm_mode == VM2VM_SOFTWARE) && (virtio_tx_local(vdev, m) == 0)) {
 		rte_pktmbuf_free(m);
@@ -950,6 +967,8 @@ virtio_tx_route(struct vhost_dev *vdev, struct rte_mbuf *m, uint16_t vlan_tag)
 		RTE_LOG(DEBUG, VHOST_DATA, "(%" PRIu64 ") TX: "
 			"MAC address is external\n", dev->device_fh);
 
+queue2nic:
+
 	/*Add packet to the port tx queue*/
 	tx_q = &lcore_tx_queue[lcore_id];
 	len = tx_q->len;
-- 
1.9.3
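
The fan-out decision in this patch can be illustrated without any DPDK types. In the sketch below a broadcast destination is delivered to every device, while a unicast destination goes to at most the device owning that MAC. The helper names is_broadcast_mac and count_targets are made up for this sketch, and the extra copy the patch queues to the physical port is left out.

```c
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6

/* Return 1 if every byte of the MAC address is 0xff. */
static int
is_broadcast_mac(const uint8_t mac[MAC_LEN])
{
	static const uint8_t bcast[MAC_LEN] =
		{ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff };

	return memcmp(mac, bcast, MAC_LEN) == 0;
}

/*
 * Count how many devices a frame would be handed to: all of them for a
 * broadcast frame, only the owner(s) of the MAC otherwise.
 */
static unsigned
count_targets(const uint8_t dst[MAC_LEN],
	      const uint8_t devs[][MAC_LEN], unsigned nr_devs)
{
	unsigned i, n = 0;

	if (is_broadcast_mac(dst))
		return nr_devs;

	for (i = 0; i < nr_devs; i++)
		if (memcmp(dst, devs[i], MAC_LEN) == 0)
			n++;
	return n;
}
```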
[dpdk-dev] [PATCH v2 4/8] examples/vhost: use mac compare helper function directly
rte_ether.h already provides a helper function to compare MAC addresses.
There is no need to define our own; use it directly.

Signed-off-by: Yuanhan Liu
---
 examples/vhost/main.c | 14 +-
 1 file changed, 1 insertion(+), 13 deletions(-)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index 135a4a4..bfcabf3 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -104,9 +104,6 @@
 /* Maximum long option length for option parsing. */
 #define MAX_LONG_OPT_SZ 64
 
-/* Used to compare MAC addresses. */
-#define MAC_ADDR_CMP 0xFFFFFFFFFFFFULL
-
 /* mask of enabled ports */
 static uint32_t enabled_port_mask = 0;
 
@@ -708,15 +705,6 @@ static unsigned check_ports_num(unsigned nb_ports)
 	return valid_num_ports;
 }
 
-/*
- * Compares a packet destination MAC address to a device MAC address.
- */
-static inline int __attribute__((always_inline))
-ether_addr_cmp(struct ether_addr *ea, struct ether_addr *eb)
-{
-	return ((*(uint64_t *)ea ^ *(uint64_t *)eb) & MAC_ADDR_CMP) == 0;
-}
-
 static inline struct vhost_dev *__attribute__((always_inline))
 find_vhost_dev(struct ether_addr *mac)
 {
@@ -724,7 +712,7 @@ find_vhost_dev(struct ether_addr *mac)
 
 	TAILQ_FOREACH(vdev, &vhost_dev_list, next) {
 		if (vdev->ready == DEVICE_RX &&
-		    ether_addr_cmp(mac, &vdev->mac_address))
+		    is_same_ether_addr(mac, &vdev->mac_address))
 			return vdev;
 	}
 
-- 
1.9.3
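
The removed ether_addr_cmp() compared two addresses by reading each through a uint64_t pointer and masking the result to 48 bits. The same word-wise comparison can be sketched without reading past the 6-byte address by copying into a zeroed 64-bit value first. This is only an illustration; it makes no claim about how is_same_ether_addr() is implemented, and the names mac_to_u64 and mac_equal are made up here.

```c
#include <stdint.h>
#include <string.h>

/* Copy the 6 address bytes into a zero-initialised 64-bit word. */
static uint64_t
mac_to_u64(const uint8_t mac[6])
{
	uint64_t v = 0;

	memcpy(&v, mac, 6);
	return v;
}

/* Two addresses are equal when their 48-bit values match. */
static int
mac_equal(const uint8_t a[6], const uint8_t b[6])
{
	return mac_to_u64(a) == mac_to_u64(b);
}
```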
[dpdk-dev] [PATCH v2 3/8] examples/vhost: use tailq to link vhost devices
To simplify code and logic. Signed-off-by: Yuanhan Liu --- examples/vhost/main.c | 457 -- examples/vhost/main.h | 32 ++-- 2 files changed, 126 insertions(+), 363 deletions(-) diff --git a/examples/vhost/main.c b/examples/vhost/main.c index 9445100..135a4a4 100644 --- a/examples/vhost/main.c +++ b/examples/vhost/main.c @@ -1,7 +1,7 @@ /*- * BSD LICENSE * - * Copyright(c) 2010-2015 Intel Corporation. All rights reserved. + * Copyright(c) 2010-2016 Intel Corporation. All rights reserved. * All rights reserved. * * Redistribution and use in source and binary forms, with or without @@ -86,10 +86,6 @@ #define DEVICE_RX 1 #define DEVICE_SAFE_REMOVE 2 -/* Config_core_flag status definitions. */ -#define REQUEST_DEV_REMOVAL 1 -#define ACK_DEV_REMOVAL 0 - /* Configurable number of RX/TX ring descriptors */ #define RTE_TEST_RX_DESC_DEFAULT 1024 #define RTE_TEST_TX_DESC_DEFAULT 512 @@ -216,11 +212,9 @@ const uint16_t vlan_tags[] = { /* ethernet addresses of ports */ static struct ether_addr vmdq_ports_eth_addr[RTE_MAX_ETHPORTS]; -/* heads for the main used and free linked lists for the data path. */ -static struct virtio_net_data_ll *ll_root_used = NULL; -static struct virtio_net_data_ll *ll_root_free = NULL; +static struct vhost_dev_tailq_list vhost_dev_list = + TAILQ_HEAD_INITIALIZER(vhost_dev_list); -/* Array of data core structures containing information on individual core linked lists. */ static struct lcore_info lcore_info[RTE_MAX_LCORE]; /* Used for queueing bursts of TX packets. 
*/ @@ -723,6 +717,20 @@ ether_addr_cmp(struct ether_addr *ea, struct ether_addr *eb) return ((*(uint64_t *)ea ^ *(uint64_t *)eb) & MAC_ADDR_CMP) == 0; } +static inline struct vhost_dev *__attribute__((always_inline)) +find_vhost_dev(struct ether_addr *mac) +{ + struct vhost_dev *vdev; + + TAILQ_FOREACH(vdev, _dev_list, next) { + if (vdev->ready == DEVICE_RX && + ether_addr_cmp(mac, >mac_address)) + return vdev; + } + + return NULL; +} + /* * This function learns the MAC address of the device and registers this along with a * vlan tag to a VMDQ. @@ -731,21 +739,17 @@ static int link_vmdq(struct vhost_dev *vdev, struct rte_mbuf *m) { struct ether_hdr *pkt_hdr; - struct virtio_net_data_ll *dev_ll; struct virtio_net *dev = vdev->dev; int i, ret; /* Learn MAC address of guest device from packet */ pkt_hdr = rte_pktmbuf_mtod(m, struct ether_hdr *); - dev_ll = ll_root_used; - - while (dev_ll != NULL) { - if (ether_addr_cmp(&(pkt_hdr->s_addr), _ll->vdev->mac_address)) { - RTE_LOG(INFO, VHOST_DATA, "(%"PRIu64") WARNING: This device is using an existing MAC address and has not been registered.\n", dev->device_fh); - return -1; - } - dev_ll = dev_ll->next; + if (find_vhost_dev(_hdr->s_addr)) { + RTE_LOG(ERR, VHOST_DATA, + "Device (%" PRIu64 ") is using a registered MAC!\n", + dev->device_fh); + return -1; } for (i = 0; i < ETHER_ADDR_LEN; i++) @@ -822,60 +826,44 @@ unlink_vmdq(struct vhost_dev *vdev) static inline int __attribute__((always_inline)) virtio_tx_local(struct vhost_dev *vdev, struct rte_mbuf *m) { - struct virtio_net_data_ll *dev_ll; struct ether_hdr *pkt_hdr; uint64_t ret = 0; - struct virtio_net *dev = vdev->dev; - struct virtio_net *tdev; /* destination virito device */ + struct vhost_dev *dst_vdev; + uint64_t fh; pkt_hdr = rte_pktmbuf_mtod(m, struct ether_hdr *); - /*get the used devices list*/ - dev_ll = ll_root_used; + dst_vdev = find_vhost_dev(_hdr->d_addr); + if (!dst_vdev) + return -1; + + fh = dst_vdev->dev->device_fh; + if (fh == vdev->dev->device_fh) { 
+ RTE_LOG(DEBUG, VHOST_DATA, + "(%" PRIu64 ") TX: src and dst MAC is same. " + "Dropping packet.\n", fh); + return 0; + } - while (dev_ll != NULL) { - if ((dev_ll->vdev->ready == DEVICE_RX) && ether_addr_cmp(&(pkt_hdr->d_addr), - _ll->vdev->mac_address)) { + RTE_LOG(DEBUG, VHOST_DATA, + "(%" PRIu64 ") TX: MAC address is local\n", fh); - /* Drop the packet if the TX packet is destined for the TX device. */ - if (dev_ll->vdev->dev->device_fh == dev->device_fh) { - RTE_LOG(DEBUG, VHOST_DATA, "(%" PRIu64 ") TX: " - "Source and destination MAC addresses are the same. " - "Dropping packet.\n", -
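
The tailq-based lookup this patch introduces can be tried outside DPDK with the plain <sys/queue.h> macros. The sketch below is a simplified stand-in for find_vhost_dev(): the struct dev type, its fields and the list name devs are invented for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/queue.h>

struct dev {
	uint8_t mac[6];
	int ready;		/* non-zero once the device can receive */
	TAILQ_ENTRY(dev) next;	/* list linkage embedded in the element */
};

TAILQ_HEAD(dev_list, dev);

static struct dev_list devs = TAILQ_HEAD_INITIALIZER(devs);

/* Walk the list and return the ready device owning this MAC, if any. */
static struct dev *
find_dev(const uint8_t mac[6])
{
	struct dev *d;

	TAILQ_FOREACH(d, &devs, next) {
		if (d->ready && memcmp(d->mac, mac, 6) == 0)
			return d;
	}

	return NULL;
}
```

Because the linkage lives inside each element, adding or removing a device is a single TAILQ_INSERT_TAIL or TAILQ_REMOVE call, with no separate free-list bookkeeping.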
[dpdk-dev] [PATCH v2 2/8] examples/vhost: remove unused macro and struct
Interestingly, DESC_PER_CACHELINE has never been used since the
introduction of the vhost example. Remove it.

The references to the vlan_ethhdr struct and the VLAN_ETH_HLEN macro were
removed by commit 4d50b6acbd95 ("examples/vhost: adapt Tx routing to
lib"), which forgot to remove the definitions.

Fixes: 4d50b6acbd95 ("examples/vhost: adapt Tx routing to lib")

Signed-off-by: Yuanhan Liu
---
 examples/vhost/main.c | 14 --
 1 file changed, 14 deletions(-)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index 9452bab..9445100 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -111,9 +111,6 @@
 /* Used to compare MAC addresses. */
 #define MAC_ADDR_CMP 0xFFFFFFFFFFFFULL
 
-/* Number of descriptors per cacheline. */
-#define DESC_PER_CACHELINE (RTE_CACHE_LINE_SIZE / sizeof(struct vring_desc))
-
 /* mask of enabled ports */
 static uint32_t enabled_port_mask = 0;
 
@@ -236,18 +233,7 @@ struct mbuf_table {
 /* TX queue for each data core. */
 struct mbuf_table lcore_tx_queue[RTE_MAX_LCORE];
 
-/* Vlan header struct used to insert vlan tags on TX. */
-struct vlan_ethhdr {
-	unsigned char   h_dest[ETH_ALEN];
-	unsigned char   h_source[ETH_ALEN];
-	__be16          h_vlan_proto;
-	__be16          h_vlan_TCI;
-	__be16          h_vlan_encapsulated_proto;
-};
-
-/* Header lengths. */
 #define VLAN_HLEN       4
-#define VLAN_ETH_HLEN   18
 
 /* Per-device statistics struct */
 struct device_statistics {
-- 
1.9.3
[dpdk-dev] [PATCH v2 1/8] examples/vhost: remove the non-working zero copy code
It's reported that it's has not been working for a long while. And due to it's complex, it's better to redesign it than to fix it to make it work again. Signed-off-by: Yuanhan Liu --- v2: remove macro PRINT_PACKET; will not be used anymore --- doc/guides/sample_app_ug/vhost.rst | 36 +- examples/vhost/main.c | 1497 +--- examples/vhost/main.h | 17 - 3 files changed, 25 insertions(+), 1525 deletions(-) diff --git a/doc/guides/sample_app_ug/vhost.rst b/doc/guides/sample_app_ug/vhost.rst index 47ce36c..5f81802 100644 --- a/doc/guides/sample_app_ug/vhost.rst +++ b/doc/guides/sample_app_ug/vhost.rst @@ -491,39 +491,9 @@ The default value is 15. -- --rx-retry 1 --rx-retry-delay 20 **Zero copy.** -The zero copy option enables/disables the zero copy mode for RX/TX packet, -in the zero copy mode the packet buffer address from guest translate into host physical address -and then set directly as DMA address. -If the zero copy mode is disabled, then one copy mode is utilized in the sample. -This option is disabled by default. - -.. code-block:: console - -./vhost-switch -c f -n 4 --socket-mem 1024 --huge-dir /mnt/huge \ - -- --zero-copy [0,1] - -**RX descriptor number.** -The RX descriptor number option specify the Ethernet RX descriptor number, -Linux legacy virtio-net has different behavior in how to use the vring descriptor from DPDK based virtio-net PMD, -the former likely allocate half for virtio header, another half for frame buffer, -while the latter allocate all for frame buffer, -this lead to different number for available frame buffer in vring, -and then lead to different Ethernet RX descriptor number could be used in zero copy mode. -So it is valid only in zero copy mode is enabled. The value is 32 by default. - -.. 
code-block:: console - -./vhost-switch -c f -n 4 --socket-mem 1024 --huge-dir /mnt/huge \ - -- --zero-copy 1 --rx-desc-num [0, n] - -**TX descriptor number.** -The TX descriptor number option specify the Ethernet TX descriptor number, it is valid only in zero copy mode is enabled. -The value is 64 by default. - -.. code-block:: console - -./vhost-switch -c f -n 4 --socket-mem 1024 --huge-dir /mnt/huge \ - -- --zero-copy 1 --tx-desc-num [0, n] +Zero copy mode is removed, due to it has not been working for a while. And +due to the large and complex code, it's better to redesign it than fixing +it to make it work again. Hence, zero copy may be added back later. **VLAN strip.** The VLAN strip option enable/disable the VLAN strip on host, if disabled, the guest will receive the packets with VLAN tag. diff --git a/examples/vhost/main.c b/examples/vhost/main.c index 78fd1ab..9452bab 100644 --- a/examples/vhost/main.c +++ b/examples/vhost/main.c @@ -73,15 +73,6 @@ #define MBUF_CACHE_SIZE128 #define MBUF_DATA_SIZE RTE_MBUF_DEFAULT_BUF_SIZE -/* - * No frame data buffer allocated from host are required for zero copy - * implementation, guest will allocate the frame data buffer, and vhost - * directly use it. - */ -#define VIRTIO_DESCRIPTOR_LEN_ZCP RTE_MBUF_DEFAULT_DATAROOM -#define MBUF_DATA_SIZE_ZCP RTE_MBUF_DEFAULT_BUF_SIZE -#define MBUF_CACHE_SIZE_ZCP 0 - #define MAX_PKT_BURST 32 /* Max burst size for RX/TX */ #define BURST_TX_DRAIN_US 100 /* TX drain every ~100us */ @@ -103,25 +94,6 @@ #define RTE_TEST_RX_DESC_DEFAULT 1024 #define RTE_TEST_TX_DESC_DEFAULT 512 -/* - * Need refine these 2 macros for legacy and DPDK based front end: - * Max vring avail descriptor/entries from guest - MAX_PKT_BURST - * And then adjust power 2. - */ -/* - * For legacy front end, 128 descriptors, - * half for virtio header, another half for mbuf. - */ -#define RTE_TEST_RX_DESC_DEFAULT_ZCP 32 /* legacy: 32, DPDK virt FE: 128. 
*/ -#define RTE_TEST_TX_DESC_DEFAULT_ZCP 64 /* legacy: 64, DPDK virt FE: 64. */ - -/* Get first 4 bytes in mbuf headroom. */ -#define MBUF_HEADROOM_UINT32(mbuf) (*(uint32_t *)((uint8_t *)(mbuf) \ - + sizeof(struct rte_mbuf))) - -/* true if x is a power of 2 */ -#define POWEROF2(x) x)-1) & (x)) == 0) - #define INVALID_PORT_ID 0xFF /* Max number of devices. Limited by vmdq. */ @@ -142,8 +114,6 @@ /* Number of descriptors per cacheline. */ #define DESC_PER_CACHELINE (RTE_CACHE_LINE_SIZE / sizeof(struct vring_desc)) -#define MBUF_EXT_MEM(mb) (rte_mbuf_from_indirect(mb) != (mb)) - /* mask of enabled ports */ static uint32_t enabled_port_mask = 0; @@ -157,29 +127,12 @@ static uint32_t num_switching_cores = 0; static uint32_t num_queues = 0; static uint32_t num_devices; -/* - * Enable zero copy, pkts buffer will directly dma to hw descriptor, - * disabled on default. - */ -static uint32_t zero_copy; +static struct rte_mempool *mbuf_pool; static int mergeable; /* Do vlan strip on host, enabled on default */ static uint32_t vlan_strip = 1; -/* number of descriptors to apply*/ -static uint32_t num_rx_descriptor =
[dpdk-dev] [PATCH v2 0/8] vhost/example cleanup/fix
I'm starting to work on the vhost ABI refactoring, which means I also
have to touch the vhost example code. The vhost example code, however, is
very messy and full of __very__ long lines. That would make a later diff
applying the new vhost API very ugly and therefore unfriendly for review.
That is where this cleanup comes from.

Besides that, there is one enhancement patch, which handles broadcast
packets so that we can relay ARP request packets, making vhost-switch
behave more like a real switch.

There is another patch that (hopefully) fixes the mbuf allocation failure
for good. I also added some guidelines there as comments, showing how to
count how many mbuf entries are enough for our usage.

In other words, an example is meant to be clean and simple, with good
coding style, so that people can pick up the usage easily. So, one way or
another, this patch set is good to have, even without the ABI refactoring.

Note that I'm going to apply it before the end of this week, if there are
no objections.

v2: - some checkpatch fixes

    - cleaned the code about device statistics

---
Yuanhan Liu (8):
  examples/vhost: remove the non-working zero copy code
  examples/vhost: remove unused macro and struct
  examples/vhost: use tailq to link vhost devices
  examples/vhost: use mac compare helper function directly
  examples/vhost: handle broadcast packet
  examples/vhost: fix mbuf allocation failure
  examples/vhost: switch_worker cleanup
  examples/vhost: embed statistics into vhost_dev struct

 doc/guides/sample_app_ug/vhost.rst |   36 +-
 examples/vhost/main.c              | 2394 ++--
 examples/vhost/main.h              |   56 +-
 3 files changed, 391 insertions(+), 2095 deletions(-)

-- 
1.9.3
[dpdk-dev] [PATCH] ethdev: make struct rte_eth_dev cache aligned
Elements of struct rte_eth_dev are used in the fast path. Make struct
rte_eth_dev cache aligned to avoid the cases where rte_eth_dev elements
share the same cache line with other structures.

Signed-off-by: Jerin Jacob
---
 lib/librte_ether/rte_ethdev.c | 2 +-
 lib/librte_ether/rte_ethdev.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index a31018e..04f492d 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -70,7 +70,7 @@
 #include "rte_ethdev.h"
 
 static const char *MZ_RTE_ETH_DEV_DATA = "rte_eth_dev_data";
-struct rte_eth_dev rte_eth_devices[RTE_MAX_ETHPORTS];
+struct rte_eth_dev rte_eth_devices[RTE_MAX_ETHPORTS] __rte_cache_aligned;
 static struct rte_eth_dev_data *rte_eth_dev_data;
 static uint8_t nb_ports;
 
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 8519ff6..e359dda 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1630,7 +1630,7 @@ struct rte_eth_dev {
 	struct rte_eth_rxtx_callback *pre_tx_burst_cbs[RTE_MAX_QUEUES_PER_PORT];
 	uint8_t attached; /**< Flag indicating the port is attached */
 	enum rte_eth_dev_type dev_type; /**< Flag indicating the device type */
-};
+} __rte_cache_aligned;
 
 struct rte_eth_dev_sriov {
 	uint8_t active;               /**< SRIOV is active with 16, 32 or 64 pools */
-- 
2.1.0
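
The effect of aligning the type, as this patch does, can be seen in a small standalone sketch. It uses the GCC/clang aligned attribute directly, with an assumed 64-byte line size, instead of __rte_cache_aligned; the struct port_state and its fields are invented for the illustration.

```c
#include <stdint.h>

#define CACHE_LINE_SIZE 64	/* assumed cache line size */

/* Aligning the type also pads sizeof() to a multiple of the alignment. */
struct port_state {
	void *rx_fn;
	void *tx_fn;
	uint8_t attached;
} __attribute__((aligned(CACHE_LINE_SIZE)));

/* Every array element now starts on its own cache line. */
static struct port_state ports[4];
```

Since the size is rounded up to the alignment, neighbouring elements of the array, and whatever object follows it, never share a line with the fast-path fields.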
[dpdk-dev] [PATCH] ip_pipeline: configuration file parser cleanup
This patch updates the parsing routines of packet queues (pktq_in/out
fields in the PIPELINE section) and message queues (msgq_in/out fields
in the MSGQ section) specified in the ip_pipeline configuration file.

Signed-off-by: Jasvinder Singh
Acked-by: Cristian Dumitrescu
---
 examples/ip_pipeline/config_parse.c | 221
 1 file changed, 45 insertions(+), 176 deletions(-)

diff --git a/examples/ip_pipeline/config_parse.c b/examples/ip_pipeline/config_parse.c
index ab79cd5..72e3d61 100644
--- a/examples/ip_pipeline/config_parse.c
+++ b/examples/ip_pipeline/config_parse.c
@@ -1128,61 +1128,29 @@ parse_pipeline_pcap_sink(struct app_params *app,
 static int
 parse_pipeline_pktq_in(struct app_params *app,
 	struct app_pipeline_params *p,
-	const char *value)
+	char *value)
 {
-	const char *next = value;
-	char *end;
-	char name[APP_PARAM_NAME_SIZE];
-	size_t name_len;
 
-	while (*next != '\0') {
+	while (1) {
 		enum app_pktq_in_type type;
 		int id;
-		char *end_space;
-		char *end_tab;
+		char *token = strtok_r(value, PARSE_DELIMITER, );
 
-		next = skip_white_spaces(next);
-		if (!next)
+		if (token == NULL)
 			break;
 
-		end_space = strchr(next, ' ');
-		end_tab = strchr(next, '\t');
-
-		if (end_space && (!end_tab))
-			end = end_space;
-		else if ((!end_space) && end_tab)
-			end = end_tab;
-		else if (end_space && end_tab)
-			end = RTE_MIN(end_space, end_tab);
-		else
-			end = NULL;
-
-		if (!end)
-			name_len = strlen(next);
-		else
-			name_len = end - next;
-
-		if (name_len == 0 || name_len == sizeof(name))
-			return -EINVAL;
-
-		strncpy(name, next, name_len);
-		name[name_len] = '\0';
-		next += name_len;
-		if (*next != '\0')
-			next++;
-
-		if (validate_name(name, "RXQ", 2) == 0) {
+		if (validate_name(token, "RXQ", 2) == 0) {
 			type = APP_PKTQ_IN_HWQ;
-			id = APP_PARAM_ADD(app->hwq_in_params, name);
-		} else if (validate_name(name, "SWQ", 1) == 0) {
+			id = APP_PARAM_ADD(app->hwq_in_params, token);
+		} else if (validate_name(token, "SWQ", 1) == 0) {
 			type = APP_PKTQ_IN_SWQ;
-			id = APP_PARAM_ADD(app->swq_params, name);
-		} else if (validate_name(name, "TM", 1) == 0) {
+			id = APP_PARAM_ADD(app->swq_params, token);
+		} else if (validate_name(token, "TM", 1) == 0) {
 			type = APP_PKTQ_IN_TM;
-			id = APP_PARAM_ADD(app->tm_params, name);
-		} else if (validate_name(name, "SOURCE", 1) == 0) {
+			id = APP_PARAM_ADD(app->tm_params, token);
+		} else if (validate_name(token, "SOURCE", 1) == 0) {
 			type = APP_PKTQ_IN_SOURCE;
-			id = APP_PARAM_ADD(app->source_params, name);
+			id = APP_PARAM_ADD(app->source_params, token);
 		} else
 			return -EINVAL;
 
@@ -1200,60 +1168,28 @@ parse_pipeline_pktq_in(struct app_params *app,
 static int
 parse_pipeline_pktq_out(struct app_params *app,
 	struct app_pipeline_params *p,
-	const char *value)
+	char *value)
 {
-	const char *next = value;
-	char *end;
-	char name[APP_PARAM_NAME_SIZE];
-	size_t name_len;
-
-	while (*next != '\0') {
-		enum app_pktq_out_type type;
+	while (1) {
+		enum app_pktq_in_type type;
 		int id;
-		char *end_space;
-		char *end_tab;
+		char *token = strtok_r(value, PARSE_DELIMITER, );
 
-		next = skip_white_spaces(next);
-		if (!next)
+		if (token == NULL)
 			break;
 
-		end_space = strchr(next, ' ');
-		end_tab = strchr(next, '\t');
-
-		if (end_space && (!end_tab))
-			end = end_space;
-		else if ((!end_space) && end_tab)
-			end = end_tab;
-		else if (end_space && end_tab)
-			end = RTE_MIN(end_space, end_tab);
-		else
-			end = NULL;
-
-		if (!end)
-			name_len = strlen(next);
-		else
-			name_len = end - next;
-
-		if (name_len == 0 || name_len == sizeof(name))
-
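The core of the change above is replacing hand-rolled space/tab scanning with strtok_r. A minimal standalone sketch of that tokenizing pattern follows. Everything in it is an assumption made for illustration: the helper name split_tokens is hypothetical, PARSE_DELIMITER is defined here as space and tab (the real definition is not shown in the patch), and a separate local save pointer is used because the third strtok_r argument is not legible in the diff as archived.

```c
#define _POSIX_C_SOURCE 200809L	/* expose strtok_r */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: space and tab, the two characters the removed code
 * searched for by hand. The real PARSE_DELIMITER may differ. */
#define PARSE_DELIMITER " \t"

/*
 * Hypothetical helper: split `value` in place into delimiter-separated
 * tokens. Stores up to `max` token pointers in `out` and returns the
 * number of tokens, or -1 if there are more than `max`.
 *
 * strtok_r overwrites delimiters with '\0', so the buffer must be
 * writable. That is why the patched functions take `char *value`
 * instead of `const char *value`.
 */
static int
split_tokens(char *value, char **out, size_t max)
{
	size_t n = 0;
	char *save = NULL;
	char *token;

	assert(value != NULL && out != NULL);

	for (token = strtok_r(value, PARSE_DELIMITER, &save);
	     token != NULL;
	     token = strtok_r(NULL, PARSE_DELIMITER, &save)) {
		if (n == max)
			return -1;
		out[n++] = token;
	}

	return (int)n;
}
```

Runs of consecutive delimiters are skipped by strtok_r itself, which is what makes the separate skip_white_spaces() call and the name_len bookkeeping unnecessary in this style.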
[dpdk-dev] [PATCH] lpm6: fix missing header dependency
2016-04-28 15:08, Igor Ryzhov:
> Include stdint.h for the definition of uint*_t types.
>
> Signed-off-by: Igor Ryzhov

Applied, thanks
[dpdk-dev] [PATCH] lpm: fix freeing of rules_tbl in rte_lpm_free_v20
> > Back then, when we fixed the missing free in lpm, I was too quick to say
> > yes when asked whether it applies not only to lpm6 but also to all of
> > the lpm code.
> >
> > It turned out not to apply to all of it. In rte_lpm_create_v20 there
> > is an unexpected fused allocation:
> >   mem_size = sizeof(*lpm) + (sizeof(lpm->rules_tbl[0]) * max_rules);
> >   [...]
> >   lpm = (struct rte_lpm_v20 *)rte_zmalloc_socket(mem_name, mem_size,
> >           RTE_CACHE_LINE_SIZE, socket_id);
> >
> > As a result, lpm->rules_tbl does not have its own struct malloc_elem
> > that can be derived via RTE_PTR_SUB(data, MALLOC_ELEM_HEADER_LEN) in
> > malloc_elem_from_data.
> > Because of that, rte_lpm_free_v20 accidentally misderives the elem and
> > assumes it is ELEM_FREE, triggering this check in malloc_elem_free:
> >   if (!malloc_elem_cookies_ok(elem) || elem->state !=
> >           return -1;
> >
> > While it seems counter-intuitive, the way to properly remove rules_tbl
> > in the old fused allocation style of rte_lpm_free_v20 is to not remove
> > it.
> >
> > The newer rte_lpm_free_v1604 is safe because in rte_lpm_create_v1604
> > rules_tbl is a separate allocation.
> >
> > Fixes: d4c18f0a1d5d ("lpm: fix missing free")
> >
> > Signed-off-by: Christian Ehrhardt
> > Acked-by: Olivier Matz
>
> Thanks, I missed it too during the review.

Applied, thanks
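The fused versus separate allocation distinction described in the commit message can be sketched outside DPDK. The following is only an illustration under stated assumptions: it uses plain calloc/free instead of rte_zmalloc_socket/rte_free, and the struct and function names (table_fused, table_split, fused_create and so on) are hypothetical, not the rte_lpm definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct rule {
	uint32_t ip;
	uint8_t next_hop;
};

/* Fused layout (the v20 style): header and rules share ONE allocation. */
struct table_fused {
	uint32_t max_rules;
	struct rule rules_tbl[];	/* flexible array member */
};

/* Split layout (the v1604 style): rules_tbl is its OWN allocation. */
struct table_split {
	uint32_t max_rules;
	struct rule *rules_tbl;
};

static struct table_fused *
fused_create(uint32_t max_rules)
{
	size_t mem_size = sizeof(struct table_fused) +
		sizeof(struct rule) * max_rules;
	struct table_fused *t = calloc(1, mem_size);

	if (t != NULL)
		t->max_rules = max_rules;
	return t;
}

static void
fused_free(struct table_fused *t)
{
	/* rules_tbl points into the middle of t's block; it has no
	 * allocator header of its own, so it must NOT be freed. */
	free(t);
}

static struct table_split *
split_create(uint32_t max_rules)
{
	struct table_split *t = calloc(1, sizeof(*t));

	if (t == NULL)
		return NULL;
	t->rules_tbl = calloc(max_rules, sizeof(t->rules_tbl[0]));
	if (t->rules_tbl == NULL) {
		free(t);
		return NULL;
	}
	t->max_rules = max_rules;
	return t;
}

static void
split_free(struct table_split *t)
{
	if (t == NULL)
		return;
	free(t->rules_tbl);	/* separate block: freed separately */
	free(t);
}
```

In the fused case, passing rules_tbl to the allocator's free routine hands it an interior pointer, which is the situation the commit message describes for rte_lpm_free_v20.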
[dpdk-dev] [PATCH v2] virtio: fix modify drv_flags for specific device
On Thu, Apr 28, 2016 at 06:08:59PM +, Jianfeng Tan wrote:
> Issue: virtio's drv_flags are decided by the device type (modern vs
> legacy), by which kernel driver is used, and by the features negotiated
> with the backend (especially VIRTIO_NET_STATUS). This makes it possible
> for multiple virtio devices to have different versions of drv_flags, but
> this variable is currently shared by all virtio devices.
>
> How to fix: use dev_flags, a device-specific variable, to store this info.
>
> Fixes: da978dfdc43 ("virtio: use port IO to get PCI resource")
>
> Reported-by: David Marchand
> Suggested-by: David Marchand

Hi David,

May I ask you to review this and give your ACK if you see no issues?

Thanks.

	--yliu
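The shared-flag problem described in the quoted commit message can be illustrated with a small standalone sketch. All names here (struct drv, struct dev, FLAG_LINK_INTR, the probe_* functions) are hypothetical stand-ins, not the virtio PMD or ethdev definitions; the sketch only shows why a per-driver field cannot hold a per-device decision.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FLAG_LINK_INTR 0x1u	/* hypothetical capability bit */

struct drv {
	uint32_t drv_flags;	/* one copy, shared by every device */
};

struct dev {
	struct drv *drv;
	uint32_t dev_flags;	/* one copy per device */
};

/* Problematic style: each probe rewrites the shared driver flags, so
 * the last device probed decides the value seen by all devices. */
static void
probe_shared(struct dev *d, bool supports_link_intr)
{
	if (supports_link_intr)
		d->drv->drv_flags |= FLAG_LINK_INTR;
	else
		d->drv->drv_flags &= ~FLAG_LINK_INTR;
}

/* Per-device style: the decision is stored on the device itself. */
static void
probe_per_device(struct dev *d, bool supports_link_intr)
{
	if (supports_link_intr)
		d->dev_flags |= FLAG_LINK_INTR;
	else
		d->dev_flags &= ~FLAG_LINK_INTR;
}
```

With two devices bound to the same driver, probing the second one through probe_shared() silently changes what the first one reports, while probe_per_device() keeps the two results independent.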
[dpdk-dev] Flow Director Example?
Hi Alex

Can you confirm that you are using DPDK? And how do you use DPDK and
possibly the kernel driver? I need a detailed topology of how you are
using DPDK, as I am a bit confused. Thanks!

Regards,
Helin

> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Alex Forster
> Sent: Saturday, April 30, 2016 4:34 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] Flow Director Example?
>
> Hi guys, apologies if this is the wrong list, but the others look pretty
> bare.
>
> We have a 32 core server that has two X520-QDA1 NICs with 2x10G ports
> plugged into each. I'm using 2016.1 (latest stable) with ixgbe 4.3.15
> (latest stable).
> I'm setting up 8 RX queues per port, and I'd like Flow Director in
> signature mode (?) to place packets into queues based on a hash of the
> destination IPv4 or IPv6 address. However, I can't figure out
> rte_fdir_conf, and despite a good amount of trial and error, each of my
> ports is still only using one of the RX queues I set up.
>
> Would anyone be able to point me in the right direction here? Thanks in
> advance!
>
> Alex Forster