[dpdk-dev] [PATCH v3 1/4] vmxnet3: restore tx data ring support

2016-01-13 Thread Yong Wang
On 1/5/16, 4:48 PM, "Stephen Hemminger"  wrote:


>On Tue,  5 Jan 2016 16:12:55 -0800
>Yong Wang  wrote:
>
>> @@ -365,6 +366,14 @@ vmxnet3_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
>>  			break;
>>  		}
>>  
>> +		if (rte_pktmbuf_pkt_len(txm) <= VMXNET3_HDR_COPY_SIZE) {
>> +			struct Vmxnet3_TxDataDesc *tdd;
>> +
>> +			tdd = txq->data_ring.base + txq->cmd_ring.next2fill;
>> +			copy_size = rte_pktmbuf_pkt_len(txm);
>> +			rte_memcpy(tdd->data, rte_pktmbuf_mtod(txm, char *),
>> +				   copy_size);
>> +		}
>
>Good idea to use a local region which optimizes the copy in the host,
>but this implementation needs to be more general.
>
>As written it is broken for multi-segment packets. A multi-segment
>packet will have a pktlen >= datalen as in:
>  m -> nb_segs=3, pktlen=1200, datalen=200
>-> datalen=900
>-> datalen=100
>
>There are two ways to fix this. You could test for nb_segs == 1,
>or better yet, optimize each segment: it might be that the first
>segment (or tail segment) would fit in the available data area.

Currently the vmxnet3 backend has a 128B limit on the data area, so
the copy path is never taken for the multi-segmented pkt shown above
(its pktlen exceeds 128B).  But I agree it does not work for all
multi-segmented packets.  The following packet is such an example.

m -> nb_segs=3, pktlen=128, datalen=64
-> datalen=32
-> datalen=32


It's unclear if/how we might get into such a multi-segmented pkt,
but I agree we should handle this case.  Patch updated, taking the
simple approach (checking for nb_segs == 1).  I'll leave the
optimization as a future patch.
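
Concretely, the copy path in the updated patch becomes something along
these lines (sketch only; same field names as in the v3 hunk quoted
above, VMXNET3_HDR_COPY_SIZE being the 128B data-ring slot size):

	/* sketch: take the copy path only for single-segment packets
	 * that fit in the 128B data-ring slot */
	if (txm->nb_segs == 1 &&
	    rte_pktmbuf_pkt_len(txm) <= VMXNET3_HDR_COPY_SIZE) {
		struct Vmxnet3_TxDataDesc *tdd;

		tdd = txq->data_ring.base + txq->cmd_ring.next2fill;
		copy_size = rte_pktmbuf_pkt_len(txm);
		rte_memcpy(tdd->data, rte_pktmbuf_mtod(txm, char *),
			   copy_size);
	}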


[dpdk-dev] [PATCH v3 1/4] vmxnet3: restore tx data ring support

2016-01-12 Thread Stephen Hemminger
On Wed, 13 Jan 2016 02:20:01 +
Yong Wang  wrote:

> >Good idea to use a local region which optimizes the copy in the host,
> >but this implementation needs to be more general.
> >
> >As written it is broken for multi-segment packets. A multi-segment
> >packet will have a pktlen >= datalen as in:
> >  m -> nb_segs=3, pktlen=1200, datalen=200
> >-> datalen=900
> >-> datalen=100
> >
> >There are two ways to fix this. You could test for nb_segs == 1,
> >or better yet, optimize each segment: it might be that the first
> >segment (or tail segment) would fit in the available data area.
> 
> Currently the vmxnet3 backend has a 128B limit on the data area, so
> the copy path is never taken for the multi-segmented pkt shown above
> (its pktlen exceeds 128B).  But I agree it does not work for all
> multi-segmented packets.  The following packet is such an example.
> 
> m -> nb_segs=3, pktlen=128, datalen=64
> -> datalen=32
> -> datalen=32  
> 
> 
> It's unclear if/how we might get into such a multi-segmented pkt,
> but I agree we should handle this case.  Patch updated, taking the
> simple approach (checking for nb_segs == 1).  I'll leave the
> optimization as a future patch.

Such a packet can happen when adding a tunnel header such as VXLAN,
if the underlying packet is shared (refcnt > 1) or does not have
enough headroom for the tunnel header.
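
For illustration, an encap path along these lines produces exactly that
shape (hypothetical helper, not from this patch; hdr_pool is an assumed
mempool for the small header mbufs):

/* hypothetical sketch: prepend a tunnel header to a packet that is
 * shared or lacks headroom by chaining on a new header mbuf */
static struct rte_mbuf *
encap_with_header(struct rte_mempool *hdr_pool, struct rte_mbuf *pkt,
		  uint16_t hdr_len)
{
	struct rte_mbuf *hdr = rte_pktmbuf_alloc(hdr_pool);

	if (hdr == NULL)
		return NULL;
	hdr->data_len = hdr_len;	/* header in its own segment */
	hdr->next = pkt;
	hdr->nb_segs = pkt->nb_segs + 1;
	hdr->pkt_len = pkt->pkt_len + hdr_len;
	/* result: nb_segs >= 2 with a small first segment, e.g.
	 * nb_segs=3, pkt_len=128 as in the example above */
	return hdr;
}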


[dpdk-dev] [PATCH v3 1/4] vmxnet3: restore tx data ring support

2016-01-05 Thread Stephen Hemminger
On Tue,  5 Jan 2016 16:12:55 -0800
Yong Wang  wrote:

> @@ -365,6 +366,14 @@ vmxnet3_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
>  			break;
>  		}
>  
> +		if (rte_pktmbuf_pkt_len(txm) <= VMXNET3_HDR_COPY_SIZE) {
> +			struct Vmxnet3_TxDataDesc *tdd;
> +
> +			tdd = txq->data_ring.base + txq->cmd_ring.next2fill;
> +			copy_size = rte_pktmbuf_pkt_len(txm);
> +			rte_memcpy(tdd->data, rte_pktmbuf_mtod(txm, char *),
> +				   copy_size);
> +		}

Good idea to use a local region which optimizes the copy in the host,
but this implementation needs to be more general.

As written it is broken for multi-segment packets. A multi-segment
packet will have a pktlen >= datalen as in:
  m -> nb_segs=3, pktlen=1200, datalen=200
-> datalen=900
-> datalen=100

There are two ways to fix this. You could test for nb_segs == 1,
or better yet, optimize each segment: it might be that the first
segment (or tail segment) would fit in the available data area.
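
The per-segment variant would look roughly like this inside the fill
loop (rough sketch only, ignoring data-ring slot accounting and the
matching descriptor address update):

	/* rough sketch: copy any segment small enough for the
	 * data-ring slot of its own descriptor, not just
	 * single-segment packets */
	if (m_seg->data_len <= VMXNET3_HDR_COPY_SIZE) {
		struct Vmxnet3_TxDataDesc *tdd;

		tdd = txq->data_ring.base + txq->cmd_ring.next2fill;
		rte_memcpy(tdd->data, rte_pktmbuf_mtod(m_seg, char *),
			   m_seg->data_len);
		/* then point gdesc->txd.addr at the data ring
		 * rather than at the mbuf segment */
	}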


[dpdk-dev] [PATCH v3 1/4] vmxnet3: restore tx data ring support

2016-01-05 Thread Yong Wang
Tx data ring support was removed in a previous change
that added multi-seg transmit.  This change adds it back.

According to the original commit (2e849373), the 64B pkt
rate with l2fwd improved by ~20% on an Ivy Bridge
server, at which point we start to hit a bottleneck
on the rx side.

I also re-did the same test on a different setup (Haswell
processor, ~2.3GHz clock rate) on top of master and
still observed a ~17% performance gain.

Fixes: 7ba5de417e3c ("vmxnet3: support multi-segment transmit")

Signed-off-by: Yong Wang 
---
 doc/guides/rel_notes/release_2_3.rst |  5 +
 drivers/net/vmxnet3/vmxnet3_rxtx.c   | 17 -
 2 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/doc/guides/rel_notes/release_2_3.rst b/doc/guides/rel_notes/release_2_3.rst
index 99de186..a23c8ac 100644
--- a/doc/guides/rel_notes/release_2_3.rst
+++ b/doc/guides/rel_notes/release_2_3.rst
@@ -15,6 +15,11 @@ EAL
 Drivers
~~~~~~~

+* **vmxnet3: restore tx data ring.**
+
+  Tx data ring has been shown to improve small pkt forwarding performance
+  on vSphere environment.
+

 Libraries
~~~~~~~~~
diff --git a/drivers/net/vmxnet3/vmxnet3_rxtx.c b/drivers/net/vmxnet3/vmxnet3_rxtx.c
index 4de5d89..2202d31 100644
--- a/drivers/net/vmxnet3/vmxnet3_rxtx.c
+++ b/drivers/net/vmxnet3/vmxnet3_rxtx.c
@@ -348,6 +348,7 @@ vmxnet3_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
 		uint32_t first2fill, avail, dw2;
 		struct rte_mbuf *txm = tx_pkts[nb_tx];
 		struct rte_mbuf *m_seg = txm;
+		int copy_size = 0;
 
 		/* Is this packet execessively fragmented, then drop */
 		if (unlikely(txm->nb_segs > VMXNET3_MAX_TXD_PER_PKT)) {
@@ -365,6 +366,14 @@ vmxnet3_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
 			break;
 		}
 
+		if (rte_pktmbuf_pkt_len(txm) <= VMXNET3_HDR_COPY_SIZE) {
+			struct Vmxnet3_TxDataDesc *tdd;
+
+			tdd = txq->data_ring.base + txq->cmd_ring.next2fill;
+			copy_size = rte_pktmbuf_pkt_len(txm);
+			rte_memcpy(tdd->data, rte_pktmbuf_mtod(txm, char *),
+				   copy_size);
+		}
+
 		/* use the previous gen bit for the SOP desc */
 		dw2 = (txq->cmd_ring.gen ^ 0x1) << VMXNET3_TXD_GEN_SHIFT;
 		first2fill = txq->cmd_ring.next2fill;
@@ -377,7 +386,13 @@ vmxnet3_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
 			   transmit buffer size (16K) is greater than
 			   maximum sizeof mbuf segment size. */
 			gdesc = txq->cmd_ring.base + txq->cmd_ring.next2fill;
-			gdesc->txd.addr = RTE_MBUF_DATA_DMA_ADDR(m_seg);
+			if (copy_size)
+				gdesc->txd.addr = rte_cpu_to_le_64(txq->data_ring.basePA +
+								txq->cmd_ring.next2fill *
+								sizeof(struct Vmxnet3_TxDataDesc));
+			else
+				gdesc->txd.addr = RTE_MBUF_DATA_DMA_ADDR(m_seg);
+
 			gdesc->dword[2] = dw2 | m_seg->data_len;
 			gdesc->dword[3] = 0;

-- 
1.9.1