[dpdk-dev] [Q] l2fwd in examples directory

2015-10-18 Thread Moon-Sang Lee
Thanks, Bruce. I didn't know that PCI slots have direct socket affinity. Is it static, or configurable through PCI configuration space? Well, my NUT, a two-node NUMA machine, always seems to return -1 from rte_eth_dev_socket_id(portid), whether portid is 0, 1, or another value. I appreciate if you explain

[dpdk-dev] [PATCH v2 0/7] virtio ring layout optimization and simple rx/tx processing

2015-10-18 Thread Huawei Xie
In a DPDK-based switching environment, vhost mostly runs on a dedicated core while virtio processing in guest VMs runs on different cores. Take RX for example: with the generic implementation, for each guest buffer, a) virtio driver allocates a descriptor from free descriptor list b) modify the entry of a

[dpdk-dev] [PATCH v2 3/7] virtio: rx/tx ring layout optimization

2015-10-18 Thread Huawei Xie
In a DPDK-based switching environment, vhost mostly runs on a dedicated core while virtio processing in guest VMs runs on different cores. Take RX for example: with the generic implementation, for each guest buffer, a) virtio driver allocates a descriptor from free descriptor list b) modify the entry of a

[dpdk-dev] [PATCH v2 4/7] virtio: fill RX avail ring with blank mbufs

2015-10-18 Thread Huawei Xie
fill avail ring with blank mbufs in virtio_dev_vring_start Signed-off-by: Huawei Xie --- drivers/net/virtio/Makefile | 2 +- drivers/net/virtio/virtio_rxtx.c| 6 ++- drivers/net/virtio/virtio_rxtx.h| 3 ++ drivers/net/virtio/virtio_rxtx_simple.c | 84 ++

[dpdk-dev] [PATCH v2 1/7] virtio: add virtio_rxtx.h header file

2015-10-18 Thread Huawei Xie
Would move all rx/tx related code into this header file in future. Add RTE_VIRTIO_PMD_MAX_BURST. Signed-off-by: Huawei Xie --- drivers/net/virtio/virtio_ethdev.c | 1 + drivers/net/virtio/virtio_rxtx.c | 1 + drivers/net/virtio/virtio_rxtx.h | 34 ++ 3 files

[dpdk-dev] [PATCH v2 2/7] virtio: add software rx ring, fake_buf into virtqueue

2015-10-18 Thread Huawei Xie
Add software RX ring in virtqueue. Add fake_mbuf in virtqueue for wraparound processing. Use global simple_rxtx to indicate whether simple rxtx is enabled Signed-off-by: Huawei Xie --- drivers/net/virtio/virtio_ethdev.c | 12 drivers/net/virtio/virtio_rxtx.c | 7 +++ drivers/

[dpdk-dev] [PATCH v2 5/7] virtio: virtio vec rx

2015-10-18 Thread Huawei Xie
With fixed avail ring, we don't need to get desc idx from avail ring. virtio driver only has to deal with desc ring. This patch uses vector instruction to accelerate processing desc ring. Signed-off-by: Huawei Xie --- drivers/net/virtio/virtio_ethdev.h | 2 + drivers/net/virtio/virtio_rxt

[dpdk-dev] [PATCH v2 6/7] virtio: simple tx routine

2015-10-18 Thread Huawei Xie
bulk free of mbufs when clean used ring. shift operation of idx could be further saved if vq_free_cnt means free slots rather than free descriptors. Signed-off-by: Huawei Xie --- drivers/net/virtio/virtio_ethdev.h | 3 ++ drivers/net/virtio/virtio_rxtx_simple.c | 95 +++

[dpdk-dev] [PATCH v2 7/7] virtio: pick simple rx/tx func

2015-10-18 Thread Huawei Xie
simple rx/tx func is enabled when user specifies single segment, no offload support. merge-able should be disabled to use simple rxtx. Signed-off-by: Huawei Xie --- drivers/net/virtio/virtio_rxtx.c | 12 1 file changed, 12 insertions(+) diff --git a/drivers/net/virtio/virtio_rxtx.

[dpdk-dev] [PATCH] vhost-user: enable virtio 1.0

2015-10-18 Thread Michael S. Tsirkin
On Fri, Oct 16, 2015 at 02:52:30PM +0100, Bruce Richardson wrote: > On Thu, Oct 15, 2015 at 04:18:59PM +0300, Michael S. Tsirkin wrote: > > On Thu, Oct 15, 2015 at 02:08:39PM +0300, Marcel Apfelbaum wrote: > > > Make vhost-user virtio 1.0 compatible by adding it to the > > > supported features and

[dpdk-dev] [PATCH v2 6/7] virtio: simple tx routine

2015-10-18 Thread Stephen Hemminger
On Sun, 18 Oct 2015 14:29:03 +0800 Huawei Xie wrote: > bulk free of mbufs when clean used ring. > shift operation of idx could be further saved if vq_free_cnt means > free slots rather than free descriptors. > > Signed-off-by: Huawei Xie Did you measure this? I finished my transmit optimizatio

[dpdk-dev] [PATCH v2 6/7] virtio: simple tx routine

2015-10-18 Thread Stephen Hemminger
On Sun, 18 Oct 2015 14:29:03 +0800 Huawei Xie wrote: > + > + for (i = 1; i < VIRTIO_TX_FREE_NR; i++) { > + m = (struct rte_mbuf *)vq->vq_descx[desc_idx++].cookie; > + if (likely(m->pool == free[0]->pool)) > + free[nb_free++] = m; > + els
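
The quoted loop groups mbufs that share a pool so they can be returned together. A minimal, self-contained sketch of that batching idea follows. It does not use DPDK: `pool_sketch`, `buf_sketch`, `pool_put_bulk`, `free_bufs_bulk` and `MAX_BATCH` are hypothetical names for illustration, and the counters stand in for a real bulk-return call.

```c
#include <stddef.h>

#define MAX_BATCH 32 /* hypothetical batch limit, not a DPDK constant */

/* Hypothetical stand-ins for a memory pool and a packet buffer. */
struct pool_sketch { unsigned returned; unsigned bulk_calls; };
struct buf_sketch  { struct pool_sketch *pool; };

/* Stand-in for a bulk return: records how many buffers came back
 * and how many calls were needed. */
void pool_put_bulk(struct pool_sketch *p, struct buf_sketch **bufs, unsigned n)
{
    (void)bufs;
    p->returned += n;
    p->bulk_calls++;
}

/* Return n buffers, issuing one bulk call per run of buffers that
 * share a pool (or per MAX_BATCH buffers, whichever comes first). */
void free_bufs_bulk(struct buf_sketch **bufs, unsigned n)
{
    struct buf_sketch *batch[MAX_BATCH];
    unsigned nb = 0;

    for (unsigned i = 0; i < n; i++) {
        struct buf_sketch *m = bufs[i];

        /* Flush when the pool changes or the batch is full. */
        if (nb > 0 && (m->pool != batch[0]->pool || nb == MAX_BATCH)) {
            pool_put_bulk(batch[0]->pool, batch, nb);
            nb = 0;
        }
        batch[nb++] = m;
    }
    if (nb > 0)
        pool_put_bulk(batch[0]->pool, batch, nb);
}
```

When all buffers come from one pool, which is the common case the patch optimizes for, the whole run is returned in a single call.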

[dpdk-dev] [PATCH v2 6/7] virtio: simple tx routine

2015-10-18 Thread Stephen Hemminger
+static inline void __attribute__((always_inline)) +virtio_xmit_cleanup(struct virtqueue *vq) +{ Please don't use always inline, frustrating the compiler isn't going to help. + uint16_t i, desc_idx; + int nb_free = 0; + struct rte_mbuf *m, *free[VIRTIO_TX_MAX_FREE_BUF_SZ]; + +

[dpdk-dev] [PATCH v2 2/7] virtio: add software rx ring, fake_buf into virtqueue

2015-10-18 Thread Stephen Hemminger
On Sun, 18 Oct 2015 14:28:59 +0800 Huawei Xie wrote: > + if (vq->sw_ring) > + rte_free(vq->sw_ring); > + Do not need to test for NULL before calling rte_free. Better to just rely on the fact that rte_free(NULL) is documented to be ok (no operation).
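
The same pattern can be shown without DPDK. The sketch below uses the standard C `free`, which the C standard also defines as a no-op for a null pointer, as a stand-in for `rte_free`; `vq_sketch` and `vq_release` are hypothetical names, not part of the virtio PMD.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the virtqueue field under review. */
struct vq_sketch {
    void *sw_ring;
};

/* No NULL test is needed: free(NULL) performs no action, and the
 * review comment says rte_free(NULL) is documented the same way. */
void vq_release(struct vq_sketch *vq)
{
    free(vq->sw_ring);
    vq->sw_ring = NULL; /* makes a repeated release harmless */
}
```

Dropping the guard removes a branch and a line of code without changing behavior.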

[dpdk-dev] [PATCH v2 0/5] virtio: Tx performance improvements

2015-10-18 Thread Stephen Hemminger
This is a tested version of the virtio Tx performance improvements that I posted earlier on the list, and described at the DPDK Userspace meeting in Dublin. Together they get a 25% performance improvement for both small packet and large multi-segment packet case when testing from DPDK guest applica

[dpdk-dev] [PATCH 1/5] virtio: clean up space checks on xmit

2015-10-18 Thread Stephen Hemminger
The space check for the transmit ring only needs a single conditional, i.e. it only needs to recheck for space if there was no space in the first check. This can help performance and simplifies the loop. Signed-off-by: Stephen Hemminger --- drivers/net/virtio/virtio_rxtx.c | 66 -
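
A minimal sketch of the "recheck only after a failed first check" shape described above. This is not the patch itself: `txq_sketch`, `txq_cleanup` and `txq_reserve` are hypothetical, and the cleanup step simply moves a counter instead of walking a used ring.

```c
/* Hypothetical transmit queue state. */
struct txq_sketch {
    unsigned free_cnt;     /* descriptors available right now */
    unsigned used_pending; /* descriptors reclaimable by cleanup */
    unsigned cleanups;     /* how often cleanup ran (for illustration) */
};

/* Stand-in for reclaiming completed descriptors. */
void txq_cleanup(struct txq_sketch *q)
{
    q->free_cnt += q->used_pending;
    q->used_pending = 0;
    q->cleanups++;
}

/* Reserve `need` descriptors. Cleanup and the second check happen
 * only when the first check fails. Returns 1 on success, 0 if the
 * ring is still too full after cleanup. */
int txq_reserve(struct txq_sketch *q, unsigned need)
{
    if (q->free_cnt < need) {
        txq_cleanup(q);
        if (q->free_cnt < need)
            return 0;
    }
    q->free_cnt -= need;
    return 1;
}
```

On the fast path, where the ring has room, the function evaluates one comparison and never touches the used ring.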

[dpdk-dev] [PATCH 2/5] virtio: don't use unlikely for normal tx stuff

2015-10-18 Thread Stephen Hemminger
Don't use unlikely() for VLAN or ring getting full. GCC will not optimize code in unlikely paths and since these can happen with normal code that can hurt performance. Signed-off-by: Stephen Hemminger --- drivers/net/virtio/virtio_rxtx.c | 7 +++ 1 file changed, 3 insertions(+), 4 deletions(
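
For context, branch hints of this kind are usually built on the GCC/Clang builtin `__builtin_expect`. The sketch below assumes such a compiler; `unlikely_sketch`, `count_tagged` and `check_ring` are hypothetical names. It keeps the hint on a genuine error path and leaves a routine condition (a non-zero VLAN tag) unannotated, which is the distinction the patch draws.

```c
#include <stddef.h>

/* Typical shape of a branch hint on GCC/Clang (sketch name). */
#define unlikely_sketch(x) __builtin_expect(!!(x), 0)

/* VLAN-tagged frames are normal traffic, so the test carries no hint. */
unsigned count_tagged(const unsigned short *vlan_tci, unsigned n)
{
    unsigned tagged = 0;

    for (unsigned i = 0; i < n; i++) {
        if (vlan_tci[i] != 0)
            tagged++;
    }
    return tagged;
}

/* A missing ring is a real error, so a hint is reasonable here. */
int check_ring(const void *ring)
{
    if (unlikely_sketch(ring == NULL))
        return -1;
    return 0;
}
```

The hint changes only how the compiler lays out and optimizes the branches, never the result, so removing it is safe from a correctness standpoint.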

[dpdk-dev] [PATCH 3/5] virtio: use indirect ring elements

2015-10-18 Thread Stephen Hemminger
The virtio ring in QEMU/KVM is usually limited to 256 entries and the normal way that virtio driver was queuing mbufs required nsegs + 1 ring elements. By using the indirect ring element feature if available, each packet will take only one ring slot even for multi-segment packets. Signed-off-by: S
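
The arithmetic behind this is easy to check. The sketch below encodes only what the description states (nsegs + 1 ring elements without indirect descriptors, one slot with them); `ring_slots_needed` and `packets_per_ring` are hypothetical helper names.

```c
/* Ring slots one packet occupies, per the description above. */
unsigned ring_slots_needed(unsigned nsegs, int use_indirect)
{
    return use_indirect ? 1u : nsegs + 1u;
}

/* How many such packets fit in a ring of ring_size entries. */
unsigned packets_per_ring(unsigned ring_size, unsigned nsegs, int use_indirect)
{
    return ring_size / ring_slots_needed(nsegs, use_indirect);
}
```

For a 256-entry ring and 3-segment packets, `packets_per_ring(256, 3, 0)` gives 64 packets in flight, while `packets_per_ring(256, 3, 1)` gives 256.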

[dpdk-dev] [PATCH 4/5] virtio: use any layout on transmit

2015-10-18 Thread Stephen Hemminger
Virtio supports a feature that allows the sender to put the transmit header prepended to the data. It requires that the mbuf be writeable and correctly aligned, and that the feature has been negotiated. If all this works out, then it will be the optimum way to transmit a single segment packet. Signed-off-by: Steph

[dpdk-dev] [PATCH 5/5] virtio: optimize transmit enqueue

2015-10-18 Thread Stephen Hemminger
All the error checks in virtqueue_enqueue_xmit are already done by the caller. Therefore they can be removed to improve performance. Signed-off-by: Stephen Hemminger --- drivers/net/virtio/virtio_rxtx.c | 23 ++- 1 file changed, 2 insertions(+), 21 deletions(-) diff --git a/