date:20151028

[dpdk-dev] [PATCH v8] mem: command line option to delete hugepage backing files

2015-10-28 Thread Sergio Gonzalez Monroy

On 28/10/2015 22:04, Shesha Sreenivasamurthy wrote:
> When an application using huge pages crashes or exits, the hugetlbfs
> backing files are not cleaned up. This patch cleans up those files.
> Some multi-process DPDK applications may benefit from those
> backing files. Therefore, I have made the cleanup configurable so that an
> application that does not need the backing files can remove them,
> without changing the current default behavior. The application itself could
> clean them up; however, the rationale for DPDK doing so is that DPDK
> created them, and therefore it is better that DPDK unlinks them.
>
> Signed-off-by: Shesha Sreenivasamurthy 
> Acked-by: Sergio Gonzalez Monroy 
> ---
v3:
- Fix typo in comments

v2:
- Update function X return value
>   lib/librte_eal/common/eal_common_options.c | 12 +++
>   lib/librte_eal/common/eal_internal_cfg.h   |  1 +
>   lib/librte_eal/common/eal_options.h|  2 ++
>   lib/librte_eal/linuxapp/eal/eal_memory.c   | 32 ++
>   4 files changed, 47 insertions(+)
>
>
Patch looks good!

Just a couple of things for next time ;)
You might be aware of them, but it doesn't hurt to mention them:
- When sending a new version, use --in-reply-to with the last version of the 
patch sent; it's easier to have all patches in the same thread (if your 
email client supports it).
- Also, when sending a new version, it's useful to describe what has changed 
since the previous version
   (add such info after the three dashes, as shown above).

Cheers,
Sergio

[dpdk-dev] [PATCH v7 4/8] vhost: rxtx: use queue id instead of constant ring index

2015-10-28 Thread Michael S. Tsirkin

On Wed, Oct 28, 2015 at 06:30:41PM -0200, Flavio Leitner wrote:
> On Sat, Oct 24, 2015 at 08:47:10PM +0300, Michael S. Tsirkin wrote:
> > On Sat, Oct 24, 2015 at 12:34:08AM -0200, Flavio Leitner wrote:
> > > On Thu, Oct 22, 2015 at 02:32:31PM +0300, Michael S. Tsirkin wrote:
> > > > On Thu, Oct 22, 2015 at 05:49:55PM +0800, Yuanhan Liu wrote:
> > > > > On Wed, Oct 21, 2015 at 05:26:18PM +0300, Michael S. Tsirkin wrote:
> > > > > > On Wed, Oct 21, 2015 at 08:48:15PM +0800, Yuanhan Liu wrote:
> > > > > > > > Please note that for virtio devices, guest is supposed to
> > > > > > > > control the placement of incoming packets in RX queues.
> > > > > > > 
> > > > > > > I may not follow you.
> > > > > > > 
> > > > > > > Enqueuing packets to a RX queue is done at vhost lib, outside the
> > > > > > > guest, how could the guest take the control here?
> > > > > > > 
> > > > > > >   --yliu
> > > > > > 
> > > > > > vhost should do what guest told it to.
> > > > > > 
> > > > > > See virtio spec:
> > > > > > 5.1.6.5.5 Automatic receive steering in multiqueue mode
> > > > > 
> > > > > Spec says:
> > > > > 
> > > > > After the driver transmitted a packet of a flow on transmitqX,
> > > > > the device SHOULD cause incoming packets for that flow to be
> > > > > steered to receiveqX.
> > > > > 
> > > > > 
> > > > > Michael, I still have no idea how vhost could know the flow even
> > > > > after discussion with Huawei. Could you be more specific about
> > > > > > this? Say, how could the guest know that? And how could the guest tell
> > > > > > vhost which RX queue it is going to use?
> > > > > 
> > > > > Thanks.
> > > > > 
> > > > >   --yliu
> > > > 
> > > > I don't really understand the question.
> > > > 
> > > > > When a guest transmits a packet, it makes a decision
> > > > about the flow to use, and maps that to a tx/rx pair of queues.
> > > > 
> > > > It sends packets out on the tx queue and expects device to
> > > > return packets from the same flow on the rx queue.
> > > 
> > > Why? I can understand that there should be a mapping between
> > > flows and queues in a way that there is no re-ordering, but
> > > I can't see the relation of receiving a flow with a TX queue.
> > > 
> > > fbl
> > 
> > That's the way virtio chose to program the rx steering logic.
> > 
> > It's low overhead (no special commands), and
> > works well for TCP when user is an endpoint since rx and tx
> > for tcp are generally tied (because of ack handling).
> > 
> > We can discuss other ways, e.g. special commands for guests to
> > program steering.
> > We'd have to first see some data showing the current scheme
> > is problematic somehow.
> 
> The issue is that the restriction imposes operations to be done in the
> data path.  For instance, Open vSwitch has N number of threads to manage
> X RX queues. We distribute them in round-robin fashion.  So, the thread
> polling one RX queue will do all the packet processing and push it to the
> TX queue of the other device (vhost-user or not) using the same 'id'.
> 
> Doing so we can avoid locking between threads and TX queues and any other
> extra computation while still keeping the packet ordering/distribution fine.
> 
> However, if vhost-user has to send packets according to the guest mapping,
> it will require locking between queues and additional operations to select
> the appropriate queue.  Those actions will cause performance issues.


You only need to send updates if the guest moves a flow to another queue.
This is very rare since the guest must avoid reordering.

Oh and you don't have to have locking.  Just update the table and let
the target pick up the new value at its leisure; in the worst case, a
packet ends up in the wrong queue.
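
To make the lock-free idea concrete, here is a minimal sketch of such a table. It is not taken from vhost or from the virtio spec: the table size, the flow hash input and the names steer_update/steer_lookup are all assumptions made for illustration. The writer stores the queue id with a relaxed atomic store and the data path reads it with a relaxed load, so a reader may briefly see the old queue id, which matches the "worst case a packet ends up in the wrong queue" behaviour described above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define STEER_TABLE_SIZE 256u	/* hypothetical size, must be a power of two */

static_assert((STEER_TABLE_SIZE & (STEER_TABLE_SIZE - 1)) == 0,
	      "table size must be a power of two");

/* One entry per flow bucket: the queue the guest last transmitted on. */
static _Atomic uint16_t steer_table[STEER_TABLE_SIZE];

/* Called when the guest is seen transmitting a flow on TX queue `qid`. */
static inline void
steer_update(uint32_t flow_hash, uint16_t qid)
{
	atomic_store_explicit(&steer_table[flow_hash & (STEER_TABLE_SIZE - 1)],
			      qid, memory_order_relaxed);
}

/* Data path: pick the RX queue for an incoming packet of this flow. */
static inline uint16_t
steer_lookup(uint32_t flow_hash)
{
	return atomic_load_explicit(
		&steer_table[flow_hash & (STEER_TABLE_SIZE - 1)],
		memory_order_relaxed);
}
```

No lock is taken on either side; the only cost on the data path is one table load per lookup.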


> I see no real benefit from enforcing the guest mapping outside to
> justify all the computation cost, so my suggestion is to change the
> spec to suggest that behavior, but not to require that to be compliant.
> 
> Does that make sense?
> 
> Thanks,
> fbl

It's not a question of what the spec says, it's a question of the
quality of implementation: guest needs to be able to balance load
between CPUs serving the queues, this means it needs a way to control
steering.

IMO having dpdk control it makes no sense in the scenario.

This is different from dpdk sending packets to real NIC
queues which all operate in parallel.

-- 
MST

[dpdk-dev] [PATCH v3 08/16] fm10k: add Vector RX scatter function

2015-10-28 Thread Liang, Cunming

Hi Mark,

On 10/27/2015 5:46 PM, Chen Jing D(Mark) wrote:
> From: "Chen Jing D(Mark)" 
>
> Add func fm10k_recv_scattered_pkts_vec to receive chained packets
> with SSE instructions.
>
> Signed-off-by: Chen Jing D(Mark) 
> ---
>   drivers/net/fm10k/fm10k.h  |2 +
>   drivers/net/fm10k/fm10k_rxtx_vec.c |   88 
> 
>   2 files changed, 90 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/net/fm10k/fm10k.h b/drivers/net/fm10k/fm10k.h
> index 1502ae3..06697fa 100644
> --- a/drivers/net/fm10k/fm10k.h
> +++ b/drivers/net/fm10k/fm10k.h
> @@ -329,4 +329,6 @@ uint16_t fm10k_xmit_pkts(void *tx_queue, struct rte_mbuf 
> **tx_pkts,
>   int fm10k_rxq_vec_setup(struct fm10k_rx_queue *rxq);
>   int fm10k_rx_vec_condition_check(struct rte_eth_dev *);
>   uint16_t fm10k_recv_pkts_vec(void *, struct rte_mbuf **, uint16_t);
> +uint16_t fm10k_recv_scattered_pkts_vec(void *, struct rte_mbuf **,
> + uint16_t);
>   #endif
> diff --git a/drivers/net/fm10k/fm10k_rxtx_vec.c 
> b/drivers/net/fm10k/fm10k_rxtx_vec.c
> index 2e6f1a2..3fd5d45 100644
> --- a/drivers/net/fm10k/fm10k_rxtx_vec.c
> +++ b/drivers/net/fm10k/fm10k_rxtx_vec.c
> @@ -513,3 +513,91 @@ fm10k_recv_pkts_vec(void *rx_queue, struct rte_mbuf 
> **rx_pkts,
>   {
>   return fm10k_recv_raw_pkts_vec(rx_queue, rx_pkts, nb_pkts, NULL);
>   }
> +
> +static inline uint16_t
> +fm10k_reassemble_packets(struct fm10k_rx_queue *rxq,
> + struct rte_mbuf **rx_bufs,
> + uint16_t nb_bufs, uint8_t *split_flags)
> +{
> + struct rte_mbuf *pkts[RTE_FM10K_MAX_RX_BURST]; /*finished pkts*/
> + struct rte_mbuf *start = rxq->pkt_first_seg;
> + struct rte_mbuf *end =  rxq->pkt_last_seg;
> + unsigned pkt_idx, buf_idx;
> +
> +
> + for (buf_idx = 0, pkt_idx = 0; buf_idx < nb_bufs; buf_idx++) {
> + if (end != NULL) {
> + /* processing a split packet */
> + end->next = rx_bufs[buf_idx];
> + start->nb_segs++;
> + start->pkt_len += rx_bufs[buf_idx]->data_len;
> + end = end->next;
> +
> + if (!split_flags[buf_idx]) {
> + /* it's the last packet of the set */
> + start->hash = end->hash;
> + start->ol_flags = end->ol_flags;
> + pkts[pkt_idx++] = start;
> + start = end = NULL;
> + }
> + } else {
> + /* not processing a split packet */
> + if (!split_flags[buf_idx]) {
> + /* not a split packet, save and skip */
> + pkts[pkt_idx++] = rx_bufs[buf_idx];
> + continue;
> + }
> + end = start = rx_bufs[buf_idx];
> + }
I guess you forgot to consider the crc_len during processing. /Steve
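
A rough sketch of what accounting for the CRC could look like when a chain is finished. This is not the fm10k code: struct seg is a hypothetical, cut-down stand-in for struct rte_mbuf, and strip_crc is an illustrative helper name. It assumes the trailing crc_len bytes are still present in the buffers and that the segment before the last one is large enough to absorb the remainder of the CRC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, cut-down stand-in for struct rte_mbuf. */
struct seg {
	struct seg *next;
	uint32_t pkt_len;	/* valid in the first segment only */
	uint16_t data_len;	/* bytes held by this segment */
	uint8_t nb_segs;	/* valid in the first segment only */
};

/*
 * Strip `crc_len` trailing bytes from a finished chain.  Returns the
 * segment that was unlinked because it held only CRC bytes (the caller
 * frees it), or NULL if no segment was unlinked.
 */
static struct seg *
strip_crc(struct seg *start, struct seg *end, uint16_t crc_len)
{
	struct seg *prev;

	assert(start != NULL && end != NULL && start->pkt_len >= crc_len);

	start->pkt_len -= crc_len;
	if (end->data_len > crc_len) {
		end->data_len -= crc_len;
		return NULL;
	}

	/* Single segment holding nothing but the CRC: just empty it. */
	if (start == end) {
		end->data_len = 0;
		return NULL;
	}

	/* The CRC spills into the previous segment: trim it, drop `end`. */
	prev = start;
	while (prev->next != end)
		prev = prev->next;
	prev->data_len -= (uint16_t)(crc_len - end->data_len);
	prev->next = NULL;
	start->nb_segs--;
	return end;
}
```

In the loop quoted above, such a call would sit in the "last packet of the set" branch, before the finished packet is stored in pkts[].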

[dpdk-dev] [PATCH v3 04/16] fm10k: add func to re-allocate mbuf for RX ring

2015-10-28 Thread Liang, Cunming

Hi Mark,

On 10/27/2015 5:46 PM, Chen Jing D(Mark) wrote:
> From: "Chen Jing D(Mark)" 
>
> Add function fm10k_rxq_rearm to re-allocate mbuf for used desc
> in RX HW ring.
>
> Signed-off-by: Chen Jing D(Mark) 
> ---
>   drivers/net/fm10k/fm10k.h  |9 
>   drivers/net/fm10k/fm10k_ethdev.c   |3 +
>   drivers/net/fm10k/fm10k_rxtx_vec.c |   90 
> 
>   3 files changed, 102 insertions(+), 0 deletions(-)
[...]
> +static inline void
> +fm10k_rxq_rearm(struct fm10k_rx_queue *rxq)
> +{
> + int i;
> + uint16_t rx_id;
> + volatile union fm10k_rx_desc *rxdp;
> + struct rte_mbuf **mb_alloc = &rxq->sw_ring[rxq->rxrearm_start];
> + struct rte_mbuf *mb0, *mb1;
> + __m128i head_off = _mm_set_epi64x(
> + RTE_PKTMBUF_HEADROOM + FM10K_RX_DATABUF_ALIGN - 1,
> + RTE_PKTMBUF_HEADROOM + FM10K_RX_DATABUF_ALIGN - 1);
> + __m128i dma_addr0, dma_addr1;
> + /* Rx buffer need to be aligned with 512 byte */
> + const __m128i hba_msk = _mm_set_epi64x(0,
> + UINT64_MAX - FM10K_RX_DATABUF_ALIGN + 1);
> +
> + rxdp = rxq->hw_ring + rxq->rxrearm_start;
> +
> + /* Pull 'n' more MBUFs into the software ring */
> + if (rte_mempool_get_bulk(rxq->mp,
> +  (void *)mb_alloc,
> +  RTE_FM10K_RXQ_REARM_THRESH) < 0) {
Here's one potential issue when the allocation fails. As the tail won't be 
updated, the head will eventually equal the tail. HW won't write back any 
more; however, the SW recv_raw_pkts_vec only checks the DD bit, so an old 
'dirty' descriptor (whose DD bit has not been cleared) will be taken, and 
processing will move on to check the next one, even beyond the tail. I'm 
sorry I didn't catch it the first time. /Steve
> + rte_eth_devices[rxq->port_id].data->rx_mbuf_alloc_failed +=
> + RTE_FM10K_RXQ_REARM_THRESH;
> + return;
> + }
> +
> +
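
The failure case Steve describes can be modelled in a few lines. The sketch below is a toy model, not the fm10k driver: the ring size, the DESC_DD bit value, struct rxq and both function names are assumptions. It shows one possible remedy, namely clearing the DD bit of the descriptors that could not be rearmed so that a poll loop that only checks DD stops at them.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 8u	/* hypothetical ring size, power of two */
#define DESC_DD   0x1u	/* hypothetical "descriptor done" status bit */

struct desc {
	uint32_t status;
};

struct rxq {
	struct desc ring[RING_SIZE];
	uint16_t next_to_check;	/* where software polls next */
	uint16_t rearm_start;	/* first descriptor awaiting a buffer */
	uint16_t rearm_nb;	/* number of descriptors awaiting rearm */
};

/* Rearm failed: clear DD on the stale descriptors so polling stops. */
static void
rearm_failed(struct rxq *q)
{
	uint16_t i, idx = q->rearm_start;

	assert(q->rearm_nb <= RING_SIZE);
	for (i = 0; i < q->rearm_nb; i++) {
		q->ring[idx].status &= ~DESC_DD;
		idx = (uint16_t)((idx + 1) & (RING_SIZE - 1));
	}
}

/* Poll: consume consecutive descriptors that have DD set. */
static uint16_t
poll_done(struct rxq *q, uint16_t max)
{
	uint16_t n = 0;

	while (n < max && (q->ring[q->next_to_check].status & DESC_DD)) {
		q->next_to_check =
			(uint16_t)((q->next_to_check + 1) & (RING_SIZE - 1));
		q->rearm_nb++;
		n++;
	}
	return n;
}
```

Without the call to rearm_failed(), a second poll_done() over a fully consumed ring would return the same stale descriptors again.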

[dpdk-dev] [Patch] Eth Driver: Optimization for improved NIC processing rates

2015-10-28 Thread Polehn, Mike A

Hi Bruce!

Thank you for reviewing, sorry I didn't write as clearly as possible.

I was trying to say more than "the performance improved". I didn't call out RFC
2544 since many people may not know much about it. I was also trying to convey
what was observed and the conclusion derived from the observation without
getting too long.

When the NIC processing loop rate is around 400,000/sec, the entry and exit
savings are not easily observable, because the average data rate variation from
test to test is higher than the packet rate gain. If the RFC 2544 zero-loss
convergence is set too fine, the time it takes to complete a test increases
substantially (I set my convergence to about 0.25% of line rate) at 60 seconds
per measurement point. Unless the current convergence data rate is close to
zero loss for the next point, a small improvement is not going to show up as a
higher zero-loss rate. However, the test consists of a series of measurements,
each of which reports average latency and packet loss. Also, since the test
equipment uses a predefined sequence algorithm that generates the same data
rates to a high degree of accuracy for each test, the results for the same data
rates can be compared across tests. If someone repeats the tests, I am pointing
to the particular data to look at. One 60-second measurement by itself does not
give sufficient accuracy to draw a conclusion, but information correlated
across multiple measurements gives a basis for a correct conclusion.

For l3fwd to be stable with i40e, the queues need to be enlarged (I use 2k) and
the packet count also needs to be increased. This then achieves 100% zero-loss
line rate with 64-byte packets for two 10 GbE connections (given the correct
Fortville firmware). This makes it good for verifying the correct NIC firmware,
but it does not work well for testing since the data is network limited. I have
my own stable packet processing code which I used for testing. I have multiple
programs, but during the optimization cycle I hit line rate and had to move to
a 5-tuple processing program with a higher load to proceed. I have a doc that
covers this setup and the optimization results, but it cannot be shared.
Someone making their own measurements needs to have run sufficient tests to
understand the stability of their test environment.

Mike

-Original Message-
From: Richardson, Bruce 
Sent: Wednesday, October 28, 2015 3:45 AM
To: Polehn, Mike A
Cc: dev at dpdk.org
Subject: Re: [dpdk-dev] [Patch] Eth Driver: Optimization for improved NIC processing rates

On Tue, Oct 27, 2015 at 08:56:31PM +, Polehn, Mike A wrote:
> Prefetch of interface access variables while calling into driver RX and TX 
> subroutines.
> 
> For converging zero loss packet task tests, a small drop in latency 
> for zero loss measurements and small drop in lost packet counts for 
> the lossy measurement points was observed, indicating some savings of 
> execution clock cycles.
> 
Hi Mike,

the commit log message above seems a bit awkward to read. If I understand it 
correctly, would the below suggestion be a shorter, clearer equivalent?

Prefetch RX and TX queue variables in ethdev before driver function call

This has been measured to produce higher throughput and reduced latency
in RFC 2544 throughput tests.

Or perhaps you could suggest some similar wording yourself. It would also be 
good to clarify with which applications the improvements were seen: was it 
testpmd, l3fwd, or something else?

Regards,
/Bruce
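
For readers unfamiliar with the technique under discussion, a minimal sketch of prefetching queue state before the indirect driver call follows. It is not the posted patch: struct eth_dev, eth_rx_burst and stub_rx are hypothetical, cut-down names, and __builtin_prefetch is the GCC/Clang builtin rather than a DPDK call. Whether the hint helps depends on the workload, as the measurements discussed above suggest.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct pkt;	/* opaque packet type, stands in for struct rte_mbuf */

typedef uint16_t (*rx_burst_t)(void *rxq, struct pkt **pkts, uint16_t n);

/* Hypothetical, cut-down device structure. */
struct eth_dev {
	rx_burst_t rx_burst;	/* driver receive function */
	void **rx_queues;	/* per-queue driver state */
};

/* Stub driver: counts calls in the queue state, returns no packets. */
static uint16_t
stub_rx(void *rxq, struct pkt **pkts, uint16_t n)
{
	(void)pkts;
	(void)n;
	(*(int *)rxq)++;
	return 0;
}

static inline uint16_t
eth_rx_burst(struct eth_dev *dev, uint16_t queue_id,
	     struct pkt **pkts, uint16_t n)
{
	void *rxq = dev->rx_queues[queue_id];

	assert(rxq != NULL);
	/*
	 * Hint to start loading the queue state before the indirect
	 * call into the driver (read access, keep in all cache levels).
	 */
	__builtin_prefetch(rxq, 0, 3);
	return dev->rx_burst(rxq, pkts, n);
}
```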

[dpdk-dev] [PATCH v5 3/4] vhost: using EVENTFD_COPY2

2015-10-28 Thread Pavel Boldin

Huawei, Thomas,

Please find an updated patchset in the appropriate mail thread.

With best regards,
Pavel

On Mon, Oct 26, 2015 at 3:45 AM, Xie, Huawei  wrote:

> On 10/21/2015 8:16 PM, Pavel Boldin wrote:
> > Xie,
> >
> > Please find my comments intermixed below.
> >
> > On Tue, Oct 20, 2015 at 12:52 PM, Xie, Huawei wrote:
> >
> > Thanks Pavel for this work.
> > This is what we think is the better implementation for eventfd
> > proxy, in
> > our last review.
> > Could you add an additional patch to remove the old implementation?
> >
> > I'm not really sure if we should do it -- imagine upgrading from one
> > version of DPDK to another.
> > Given the current implementation, backward compatibility is preserved.
> I couldn't imagine a case where anyone would run an old dpdk app with the new
> dpdk module. However, I am OK with you leaving it here :), we could remove it
> in the next release.
> Could you finish rebasing the patch before the end of next week? Otherwise
> it will lose its chance of being merged.
> >
> >
> >
> > Again, please run checkpatch.pl against your patch.
> >
> > Oops. Thanks for pointing out.
> >
> >
> > On 8/29/2015 2:51 AM, Pavel Boldin wrote:
> >
> > [...]
> > > +
> > > +int
> > > +eventfd_init(void)
> > > +{
> > > + if (eventfd_link > 0)
> > 0 could be valid fd. Change it to:
> >
> > Got it. Thanks.
> >
> >
> > if (eventfd_link >= 0)
> > Change it elsewhere if I missed it.
> > > +int
> > > +eventfd_free(void)
> > > +{
> > > + if (eventfd_link > 0)
> > same as above:
> > if (eventfd_link >= 0)
> >
> > [...]
> >
> >
> > --
> > Sincerely,
> >  Pavel
>
>
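
The point about 0 being a valid descriptor can be shown with a small stand-alone sketch of a lazily opened, cached descriptor. The names link_fd, link_init and link_free are hypothetical, and the path is passed in so the sketch can be tried against any device node. Resetting the variable to -1 after close() is an addition made in this sketch and was not requested in the review.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* -1 means "not open"; 0 is a perfectly valid file descriptor. */
static int link_fd = -1;

/* Open the device once; later calls reuse the cached descriptor. */
static int
link_init(const char *path)
{
	assert(path != NULL);

	if (link_fd >= 0)
		return 0;

	link_fd = open(path, O_RDWR);
	return link_fd < 0 ? -1 : 0;
}

/* Close the device and restore the sentinel so init can run again. */
static void
link_free(void)
{
	if (link_fd >= 0) {
		close(link_fd);
		link_fd = -1;
	}
}
```

With a `> 0` test instead, a process that had closed stdin and received descriptor 0 from open() would reopen the device on every call and never close it.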

[dpdk-dev] [PATCH v6 3/3] vhost: using EVENTFD_COPY2

2015-10-28 Thread Pavel Boldin

Signed-off-by: Pavel Boldin 
---
 lib/librte_vhost/vhost_cuse/eventfd_copy.c   | 54 ++--
 lib/librte_vhost/vhost_cuse/eventfd_copy.h   |  6 
 lib/librte_vhost/vhost_cuse/vhost-net-cdev.c |  3 ++
 3 files changed, 44 insertions(+), 19 deletions(-)

diff --git a/lib/librte_vhost/vhost_cuse/eventfd_copy.c 
b/lib/librte_vhost/vhost_cuse/eventfd_copy.c
index 4d697a2..154b32a 100644
--- a/lib/librte_vhost/vhost_cuse/eventfd_copy.c
+++ b/lib/librte_vhost/vhost_cuse/eventfd_copy.c
@@ -46,6 +46,32 @@

 static const char eventfd_cdev[] = "/dev/eventfd-link";

+static int eventfd_link = -1;
+
+int
+eventfd_init(void)
+{
+   if (eventfd_link >= 0)
+   return 0;
+
+   eventfd_link = open(eventfd_cdev, O_RDWR);
+   if (eventfd_link < 0) {
+   RTE_LOG(ERR, VHOST_CONFIG,
+   "eventfd_link module is not loaded\n");
+   return -1;
+   }
+
+   return 0;
+}
+
+int
+eventfd_free(void)
+{
+   if (eventfd_link >= 0)
+   close(eventfd_link);
+   return 0;
+}
+
 /*
  * This function uses the eventfd_link kernel module to copy an eventfd file
  * descriptor provided by QEMU in to our process space.
@@ -53,36 +79,26 @@ static const char eventfd_cdev[] = "/dev/eventfd-link";
 int
 eventfd_copy(int target_fd, int target_pid)
 {
-   int eventfd_link, ret;
-   struct eventfd_copy eventfd_copy;
-   int fd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
+   int ret;
+   struct eventfd_copy2 eventfd_copy2;

-   if (fd == -1)
-   return -1;

/* Open the character device to the kernel module. */
/* TODO: check this earlier rather than fail until VM boots! */
-   eventfd_link = open(eventfd_cdev, O_RDWR);
-   if (eventfd_link < 0) {
-   RTE_LOG(ERR, VHOST_CONFIG,
-   "eventfd_link module is not loaded\n");
-   close(fd);
+   if (eventfd_init() < 0)
return -1;
-   }

-   eventfd_copy.source_fd = fd;
-   eventfd_copy.target_fd = target_fd;
-   eventfd_copy.target_pid = target_pid;
+   eventfd_copy2.fd = target_fd;
+   eventfd_copy2.pid = target_pid;
+   eventfd_copy2.flags = O_NONBLOCK | O_CLOEXEC;
/* Call the IOCTL to copy the eventfd. */
-   ret = ioctl(eventfd_link, EVENTFD_COPY, &eventfd_copy);
-   close(eventfd_link);
+   ret = ioctl(eventfd_link, EVENTFD_COPY2, &eventfd_copy2);

if (ret < 0) {
RTE_LOG(ERR, VHOST_CONFIG,
-   "EVENTFD_COPY ioctl failed\n");
-   close(fd);
+   "EVENTFD_COPY2 ioctl failed\n");
return -1;
}

-   return fd;
+   return ret;
 }
diff --git a/lib/librte_vhost/vhost_cuse/eventfd_copy.h 
b/lib/librte_vhost/vhost_cuse/eventfd_copy.h
index 19ae30d..5f446ca 100644
--- a/lib/librte_vhost/vhost_cuse/eventfd_copy.h
+++ b/lib/librte_vhost/vhost_cuse/eventfd_copy.h
@@ -34,6 +34,12 @@
 #define _EVENTFD_H

 int
+eventfd_init(void);
+
+int
+eventfd_free(void);
+
+int
 eventfd_copy(int target_fd, int target_pid);

 #endif
diff --git a/lib/librte_vhost/vhost_cuse/vhost-net-cdev.c 
b/lib/librte_vhost/vhost_cuse/vhost-net-cdev.c
index 1ae7c49..ae7ad8d 100644
--- a/lib/librte_vhost/vhost_cuse/vhost-net-cdev.c
+++ b/lib/librte_vhost/vhost_cuse/vhost-net-cdev.c
@@ -373,6 +373,9 @@ rte_vhost_driver_register(const char *dev_name)
return -1;
}

+   if (eventfd_init() < 0)
+   return -1;
+
/*
 * The device name is created. This is passed to QEMU so that it can
 * register the device with our application.
-- 
1.9.1

[dpdk-dev] [PATCH v6 2/3] vhost: add EVENTFD_COPY2 ioctl

2015-10-28 Thread Pavel Boldin

Signed-off-by: Pavel Boldin 
---
 lib/librte_vhost/eventfd_link/eventfd_link.c | 61 
 lib/librte_vhost/eventfd_link/eventfd_link.h | 28 ++---
 2 files changed, 84 insertions(+), 5 deletions(-)

diff --git a/lib/librte_vhost/eventfd_link/eventfd_link.c 
b/lib/librte_vhost/eventfd_link/eventfd_link.c
index 7cbebd4..c54a938 100644
--- a/lib/librte_vhost/eventfd_link/eventfd_link.c
+++ b/lib/librte_vhost/eventfd_link/eventfd_link.c
@@ -78,6 +78,64 @@ fget_from_files(struct files_struct *files, unsigned fd)
 }

 static long
+eventfd_link_ioctl_copy2(unsigned long arg)
+{
+   void __user *argp = (void __user *) arg;
+   struct task_struct *task_target = NULL;
+   struct file *file;
+   struct files_struct *files;
+   struct eventfd_copy2 eventfd_copy2;
+   long ret = -EFAULT;
+
+   if (copy_from_user(&eventfd_copy2, argp, sizeof(struct eventfd_copy2)))
+   goto out;
+
+   /*
+* Find the task struct for the target pid
+*/
+   ret = -ESRCH;
+
+   task_target =
+   get_pid_task(find_vpid(eventfd_copy2.pid), PIDTYPE_PID);
+   if (task_target == NULL) {
+   pr_info("Unable to find pid %d\n", eventfd_copy2.pid);
+   goto out;
+   }
+
+   ret = -ESTALE;
+   files = get_files_struct(task_target);
+   if (files == NULL) {
+   pr_info("Failed to get target files struct\n");
+   goto out_task;
+   }
+
+   ret = -EBADF;
+   file = fget_from_files(files, eventfd_copy2.fd);
+   put_files_struct(files);
+
+   if (file == NULL) {
+   pr_info("Failed to get fd %d from target\n", eventfd_copy2.fd);
+   goto out_task;
+   }
+
+   /*
+* Install the file struct from the target process into the
+* newly allocated file descriptor of the source process.
+*/
+   ret = get_unused_fd_flags(eventfd_copy2.flags);
+   if (ret < 0) {
+   fput(file);
+   goto out_task;
+   }
+   fd_install(ret, file);
+
+out_task:
+   put_task_struct(task_target);
+out:
+   return ret;
+}
+
+static long
 eventfd_link_ioctl_copy(unsigned long arg)
 {
void __user *argp = (void __user *) arg;
@@ -176,6 +234,9 @@ eventfd_link_ioctl(struct file *f, unsigned int ioctl, 
unsigned long arg)
case EVENTFD_COPY:
ret = eventfd_link_ioctl_copy(arg);
break;
+   case EVENTFD_COPY2:
+   ret = eventfd_link_ioctl_copy2(arg);
+   break;
}

return ret;
diff --git a/lib/librte_vhost/eventfd_link/eventfd_link.h 
b/lib/librte_vhost/eventfd_link/eventfd_link.h
index ea619ec..5ebc20b 100644
--- a/lib/librte_vhost/eventfd_link/eventfd_link.h
+++ b/lib/librte_vhost/eventfd_link/eventfd_link.h
@@ -61,11 +61,6 @@
 #define _EVENTFD_LINK_H_

 /*
- * ioctl to copy an fd entry in calling process to an fd in a target process
- */
-#define EVENTFD_COPY 1
-
-/*
  * arguements for the EVENTFD_COPY ioctl
  */
 struct eventfd_copy {
@@ -73,4 +68,27 @@ struct eventfd_copy {
unsigned source_fd; /* fd in the calling pid */
pid_t target_pid; /* pid of the target pid */
 };
+
+/*
+ * ioctl to copy an fd entry in calling process to an fd in a target process
+ * NOTE: this one should be
+ * #define EVENTFD_COPY _IOWR('D', 1, struct eventfd_copy) actually
+ */
+#define EVENTFD_COPY 1
+
+/*
+ * arguments for the EVENTFD_COPY2 ioctl
+ */
+struct eventfd_copy2 {
+   unsigned fd; /* fd to steal */
+   pid_t pid; /* pid of the process to steal from */
+   unsigned flags; /* flags to allocate new fd with */
+};
+
+/*
+ * ioctl to copy an fd entry from the target process into newly allocated
+ * fd in the calling process
+ */
+#define EVENTFD_COPY2 _IOW('D', 2, struct eventfd_copy2)
+
 #endif /* _EVENTFD_LINK_H_ */
-- 
1.9.1
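
As an illustration of what the _IOW() definition in this patch buys compared with the bare constant 1 used for EVENTFD_COPY, the sketch below unpacks both request numbers with the standard Linux _IOC_* macros. The struct and the two defines are copied from the patch; struct ioctl_fields and ioctl_decode are hypothetical helpers added only for this example.

```c
#include <assert.h>
#include <linux/ioctl.h>
#include <sys/types.h>

/* Copied from the patch above. */
struct eventfd_copy2 {
	unsigned fd;	/* fd to steal */
	pid_t pid;	/* pid of the process to steal from */
	unsigned flags;	/* flags to allocate new fd with */
};

#define EVENTFD_COPY  1
#define EVENTFD_COPY2 _IOW('D', 2, struct eventfd_copy2)

/* The size of the argument struct is encoded in the request number. */
static_assert(_IOC_SIZE(EVENTFD_COPY2) == sizeof(struct eventfd_copy2),
	      "size field must match the argument struct");

/* Hypothetical helper: unpacked view of an ioctl request number. */
struct ioctl_fields {
	unsigned dir, type, nr, size;
};

static struct ioctl_fields
ioctl_decode(unsigned cmd)
{
	struct ioctl_fields f = {
		_IOC_DIR(cmd), _IOC_TYPE(cmd), _IOC_NR(cmd), _IOC_SIZE(cmd)
	};
	return f;
}
```

Decoding EVENTFD_COPY2 yields direction _IOC_WRITE, type 'D', number 2 and the struct size, whereas the legacy value 1 carries no type or size information at all.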

[dpdk-dev] [PATCH v6 1/3] vhost: eventfd_link: refactoring EVENTFD_COPY handler

2015-10-28 Thread Pavel Boldin

* Move ioctl `EVENTFD_COPY' code to a separate function
* Remove extra #includes
* Introduce function fget_from_files
* Fix ioctl return values

Signed-off-by: Pavel Boldin 
---
 lib/librte_vhost/eventfd_link/eventfd_link.c | 181 +++
 1 file changed, 100 insertions(+), 81 deletions(-)

diff --git a/lib/librte_vhost/eventfd_link/eventfd_link.c 
b/lib/librte_vhost/eventfd_link/eventfd_link.c
index 62c45c8..7cbebd4 100644
--- a/lib/librte_vhost/eventfd_link/eventfd_link.c
+++ b/lib/librte_vhost/eventfd_link/eventfd_link.c
@@ -22,18 +22,11 @@
  *   Intel Corporation
  */

-#include 
 #include 
 #include 
-#include 
-#include 
 #include 
-#include 
-#include 
-#include 
-#include 
-#include 
 #include 
+#include 

 #include "eventfd_link.h"

@@ -65,9 +58,27 @@ put_files_struct(struct files_struct *files)
BUG();
 }

+static struct file *
+fget_from_files(struct files_struct *files, unsigned fd)
+{
+   struct file *file;
+
+   rcu_read_lock();
+   file = fcheck_files(files, fd);
+   if (file) {
+   if (file->f_mode & FMODE_PATH ||
+   !atomic_long_inc_not_zero(&file->f_count)) {
+
+   file = NULL;
+   }
+   }
+   rcu_read_unlock();
+
+   return file;
+}

 static long
-eventfd_link_ioctl(struct file *f, unsigned int ioctl, unsigned long arg)
+eventfd_link_ioctl_copy(unsigned long arg)
 {
void __user *argp = (void __user *) arg;
struct task_struct *task_target = NULL;
@@ -75,91 +86,99 @@ eventfd_link_ioctl(struct file *f, unsigned int ioctl, 
unsigned long arg)
struct files_struct *files;
struct fdtable *fdt;
struct eventfd_copy eventfd_copy;
+   long ret = -EFAULT;

-   switch (ioctl) {
-   case EVENTFD_COPY:
-   if (copy_from_user(&eventfd_copy, argp,
-   sizeof(struct eventfd_copy)))
-   return -EFAULT;
-
-   /*
-* Find the task struct for the target pid
-*/
-   task_target =
-   pid_task(find_vpid(eventfd_copy.target_pid), 
PIDTYPE_PID);
-   if (task_target == NULL) {
-   pr_debug("Failed to get mem ctx for target pid\n");
-   return -EFAULT;
-   }
+   if (copy_from_user(&eventfd_copy, argp, sizeof(struct eventfd_copy)))
+   goto out;

-   files = get_files_struct(current);
-   if (files == NULL) {
-   pr_debug("Failed to get files struct\n");
-   return -EFAULT;
-   }
+   /*
+* Find the task struct for the target pid
+*/
+   ret = -ESRCH;

-   rcu_read_lock();
-   file = fcheck_files(files, eventfd_copy.source_fd);
-   if (file) {
-   if (file->f_mode & FMODE_PATH ||
-   !atomic_long_inc_not_zero(&file->f_count))
-   file = NULL;
-   }
-   rcu_read_unlock();
-   put_files_struct(files);
+   task_target =
+   get_pid_task(find_vpid(eventfd_copy.target_pid), PIDTYPE_PID);
+   if (task_target == NULL) {
+   pr_info("Unable to find pid %d\n", eventfd_copy.target_pid);
+   goto out;
+   }

-   if (file == NULL) {
-   pr_debug("Failed to get file from source pid\n");
-   return 0;
-   }
+   ret = -ESTALE;
+   files = get_files_struct(current);
+   if (files == NULL) {
+   pr_info("Failed to get current files struct\n");
+   goto out_task;
+   }

-   /*
-* Release the existing eventfd in the source process
-*/
-   spin_lock(&files->file_lock);
-   fput(file);
-   filp_close(file, files);
-   fdt = files_fdtable(files);
-   fdt->fd[eventfd_copy.source_fd] = NULL;
-   spin_unlock(&files->file_lock);
-
-   /*
-* Find the file struct associated with the target fd.
-*/
-
-   files = get_files_struct(task_target);
-   if (files == NULL) {
-   pr_debug("Failed to get files struct\n");
-   return -EFAULT;
-   }
+   ret = -EBADF;
+   file = fget_from_files(files, eventfd_copy.source_fd);

-   rcu_read_lock();
-   file = fcheck_files(files, eventfd_copy.target_fd);
-   if (file) {
-   if (file->f_mode & FMODE_PATH ||
-   !atomic_long_inc_not_zero(&file->f_count))
-   file = NULL;
-   }
-   rcu_read_unlock();
+   if (file == NULL) {
+   pr_info("Failed to

[dpdk-dev] [PATCH v6 0/3] vhost: eventfd_link refactoring

2015-10-28 Thread Pavel Boldin

The patchset contains an attempt at refactoring the `eventfd_link`
kernel module that is used to steal an FD in DPDK.

The first patch refactors the old EVENTFD_COPY handler, fixing the code path
and the errors returned from kernel space. This patch is retained
for backward compatibility.

The next one introduces a new, cleaner implementation, the
EVENTFD_COPY2 ioctl, which allocates a new fd for the `struct file'
being stolen.

The last patch uses this new mechanism in the DPDK userspace.

Pavel Boldin (3):
  vhost: eventfd_link: refactoring EVENTFD_COPY handler
  vhost: add EVENTFD_COPY2 ioctl
  vhost: using EVENTFD_COPY2

 lib/librte_vhost/eventfd_link/eventfd_link.c | 249 ++-
 lib/librte_vhost/eventfd_link/eventfd_link.h |  28 ++-
 lib/librte_vhost/vhost_cuse/eventfd_copy.c   |  54 --
 lib/librte_vhost/vhost_cuse/eventfd_copy.h   |   6 +
 lib/librte_vhost/vhost_cuse/vhost-net-cdev.c |   3 +
 5 files changed, 231 insertions(+), 109 deletions(-)

-- 
1.9.1

[dpdk-dev] dpdk/vhost-user and VM migration

2015-10-28 Thread Yuanhan Liu

On Wed, Oct 28, 2015 at 05:52:44AM -0400, Amnon Ilan wrote:
> 
> 
> - Original Message -
> > From: "Yuanhan Liu" 
> > To: "Michael S. Tsirkin" 
> > Cc: dev at dpdk.org
> > Sent: Friday, October 16, 2015 10:37:38 AM
> > Subject: Re: [dpdk-dev] dpdk/vhost-user and VM migration
> > 
> > On Wed, Oct 14, 2015 at 12:16:29AM +0300, Michael S. Tsirkin wrote:
> > > Hello!
> > > I am currently looking at how using dpdk on host, accessing VM memory
> > > using the vhost-user interface, interacts with VM migration.
> > > 
> > > The issue is that any changes made to VM memory need to be tracked so
> > > that updates can be sent from migration source to destination.
> > > 
> > > At the moment, there's a proposal of an interface extension to
> > > vhost-user which adds ability to do this tracking through shared memory.
> > > dpdk would then be responsible for tracking these updates using atomic
> > > operations to set bits (per page written) in a memory bitmap.
> > > 
> > > This only needs to happen during migration, at other times there could
> > > be a jump to skip this logging.
> > > 
> > > Is this a reasonable approach?
> > 
> > Hi Michael,
> > 
> > As I stated in another email, adding dpdk/vhost-user vm migration
> > support is my second TODO. However, I barely know anything about
> > vm migration so far, so I can't tell yet.
> > 
> > I will revisit this question when I have finished my first item and
> > after some more investigation.
> 
> Yuanhan, 
> 
> Live-migration for vhost-user is now available upstream.

Hi Amnon,

Yes, I'm aware of that.

> Do you need some guidance on how to implement it in DPDK?

Thanks a lot for offering to help!  However, I haven't started
on it yet; I should be able to start in two or three weeks if
everything goes well here.  I may ask for your help then if I run into
trouble.

--yliu

> > > Would performance degradation during
> > > migration associated with atomics affect the performance to a level
> > > where it's no longer useful?  Pls note these logs aren't latency
> > > sensitive, so can be done on a separate core, and can be batched.
> > > 
> > > 
> > > One alternative I'm considering is extending linux kernel so it can do
> > > this tracking automatically, by marking pages read-only, detecting a
> > > pagefault and logging the write, then making the pages writeable.  This
> > > would mean higher worst-case overhead (pagefaults are expensive) but
> > > lower average one (no extra code after the first fault).  Not sure how
> > > feasible this is yet; this would be harder to implement and it would only
> > > apply to newer host kernels.
> > > 
> > > Any feedback would be appreciated.
> > > 
> > > --
> > > MST
> >

[dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture

2015-10-28 Thread David Marchand

On Wed, Oct 28, 2015 at 6:38 PM, Jan Viktorin wrote:

>
>
> > - can you update the 2.2 release notes as part of this patchset to announce
> > armv7 support ?
>
> Yes, but where?
>

I would say "New Features" in doc/guides/rel_notes/release_2_2.rst.


> > - I am not really sure the acl and lpm fixes really belong to this patchset
> > as a larger cleanup is necessary to have all libraries compile fine on
> > non-x86
>
> So, you mean to omit those and disable them all? The LPM and ACL fixes
> will then be included in 2.3?
>

This sounds more sane to me, rather than workarounds only for arm.


> > - since you introduce a new architecture, do you intend to run daily build
> > checks and send reports to the test-report mailing list ?
>
> I think this is possible if I automate it somehow. Do you mean to
> test every individual patch? I have no tools for this (any ideas?). If
> it's just about git pull && test_script.sh, then it is quite OK.
>
> I'd appreciate some help, ideas, advice, and experience in this area...
>

I am pretty sure Thomas has some ideas about this.


Thanks.

-- 
David Marchand

[dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture

2015-10-28 Thread Jan Viktorin

On Wed, 28 Oct 2015 15:54:47 +0100
David Marchand  wrote:

> Hello Jan,
> 
> On Tue, Oct 27, 2015 at 8:13 PM, Jan Viktorin wrote:
> 
> > Hello DPDK community,
> >
> > this is the third attempt to post support for ARMv7 into the DPDK.
> > There are changes related to the LPM and ACL libraries only:
> >
> > * included rte_vect.h, however, it is more a placeholder
> > * rte_lpm.h was simplified due to the previous point
> > * ACL now compiles as we detect whether the compiler
> >   supports SSE 4.1
> >  
> 
> This patchset looks good to me (with the minor comments I sent).
> And armv8 support should fit quite well in this.
> 
> A last few things :
> - checkpatch is not happy with some patches, can you have a look at this ?

I will check this.

> - can you update the 2.2 release notes as part of this patchset to announce
> armv7 support ?

Yes, but where?

> - I am not really sure the acl and lpm fixes really belong to this patchset
> as a larger cleanup is necessary to have all libraries compile fine on
> non-x86

So, you mean to omit those and disable them all? The LPM and ACL fixes
will be then included in 2.3?

> - since you introduce a new architecture, do you intend to run daily build
> checks and send reports to the test-report mailing list ?

I think, this is possible, if I automate it somehow. Do you mean to
test every individual patch? I have no tools for this (some ideas?). If
it's just about git pull && test_script.sh, then it is quite OK.

I'd appreciate some help, ideas, advice, experiences in this area...

> 
> 
> Thanks.
> 

Jan

-- 
  Jan Viktorin                E-mail: Viktorin at RehiveTech.com
  System Architect            Web:    www.RehiveTech.com
  RehiveTech
  Brno, Czech Republic

[dpdk-dev] [PATCH v3 13/17] gcc/arm: avoid alignment errors to break build

2015-10-28 Thread Jan Viktorin

On Wed, 28 Oct 2015 13:16:24 +0100
David Marchand  wrote:

> On Tue, Oct 27, 2015 at 8:13 PM, Jan Viktorin 
> wrote:
> 
> > There are several issues with alignment when compiling for ARMv7.
> > They are not considered to be fatal (ARMv7 supports unaligned
> > access of 32b words), so we just leave them as warnings. They
> > should be solved later, however.
> >
> > Signed-off-by: Jan Viktorin 
> > Signed-off-by: Vlastimil Kosar 
> > ---
> >  mk/toolchain/gcc/rte.vars.mk | 6 ++
> >  1 file changed, 6 insertions(+)
> >
> > diff --git a/mk/toolchain/gcc/rte.vars.mk b/mk/toolchain/gcc/rte.vars.mk
> > index 0f51c66..8f9c396 100644
> > --- a/mk/toolchain/gcc/rte.vars.mk
> > +++ b/mk/toolchain/gcc/rte.vars.mk
> > @@ -77,6 +77,12 @@ WERROR_FLAGS += -Wcast-align -Wnested-externs -Wcast-qual
> >  WERROR_FLAGS += -Wformat-nonliteral -Wformat-security
> >  WERROR_FLAGS += -Wundef -Wwrite-strings
> >
> > +# There are many issues reported for ARMv7 architecture
> > +# which are not necessarily fatal. Report as warnings.
> > +ifeq ($(CONFIG_RTE_ARCH_ARMv7),y)
> > +WERROR_FLAGS += -Wno-error
> > +endif
> > +
> >  
> 
> Can we disable only "known" problems ?
> 
> Something like :
> WERROR_FLAGS += -Wno-error=cast-align
> 
> 

Sure! That's a better idea, I always forget about these possibilities in
GCC...
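
For context on what -Wcast-align typically complains about, here is a minimal sketch of an alignment-safe 32-bit read. The function name read_u32_unaligned is hypothetical and is not taken from the patchset; the point is only that going through memcpy avoids the pointer cast the warning is about.

```c
#include <stdint.h>
#include <string.h>

/*
 * Reads a 32-bit value from an arbitrarily aligned byte pointer.
 * A direct cast such as *(const uint32_t *)p is the pattern that
 * -Wcast-align reports on strict-alignment targets. memcpy avoids the
 * cast, and compilers usually lower it to a plain load where legal.
 */
static inline uint32_t read_u32_unaligned(const uint8_t *p)
{
    uint32_t v;

    memcpy(&v, p, sizeof(v));
    return v;
}
```

With -Wno-error=cast-align the cast form would still be reported, only as a warning, so call sites can be converted to a helper like this one at a time.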

Jan

-- 
  Jan Viktorin                E-mail: Viktorin at RehiveTech.com
  System Architect            Web:    www.RehiveTech.com
  RehiveTech
  Brno, Czech Republic

[dpdk-dev] [PATCH v2 01/16] mk: Introduce ARMv7 architecture

2015-10-28 Thread Jan Viktorin

On Wed, 28 Oct 2015 14:39:27 +0100
David Marchand  wrote:

> On Mon, Oct 26, 2015 at 5:37 PM, Jan Viktorin 
> wrote:
> 
> >
> > diff --git a/config/defconfig_arm-armv7-a-linuxapp-gcc b/config/defconfig_arm-armv7-a-linuxapp-gcc
> > new file mode 100644
> > index 000..5b582a8
> > --- /dev/null
> > +++ b/config/defconfig_arm-armv7-a-linuxapp-gcc
> > @@ -0,0 +1,78 @@
> >
> > +# fails to compile on ARM
> > +CONFIG_RTE_LIBRTE_ACL=n
> > +CONFIG_RTE_LIBRTE_LPM=n
> >  
> 
> librte_lpm is used by librte_table, used by librte_pipeline.
> 
> So until lpm is fixed (later in your patchset), this config file won't
> build.
> 
> 

So, you mean to disable all those? OK.

Jan


-- 
  Jan Viktorin                E-mail: Viktorin at RehiveTech.com
  System Architect            Web:    www.RehiveTech.com
  RehiveTech
  Brno, Czech Republic

[dpdk-dev] [PATCH v2 01/16] mk: Introduce ARMv7 architecture

2015-10-28 Thread Jan Viktorin

On Wed, 28 Oct 2015 14:34:40 +0100
David Marchand  wrote:

> On Mon, Oct 26, 2015 at 5:37 PM, Jan Viktorin 
> wrote:
> 
> > From: Vlastimil Kosar 
> >
> > Make DPDK run on ARMv7-A architecture. This patch assumes
> > ARM Cortex-A9. However, it is known to be working on Cortex-A7
> > and Cortex-A15.
> >
> > Signed-off-by: Vlastimil Kosar 
> > Signed-off-by: Jan Viktorin 
> > ---
> > v1 -> v2:
> > * the -mtune parameter of GCC is configurable now
> > * the -mfpu=neon can be turned off
> >
> > Signed-off-by: Jan Viktorin 
> > ---
> >  config/defconfig_arm-armv7-a-linuxapp-gcc | 78
> > +++
> >  mk/arch/arm/rte.vars.mk   | 39 
> >  mk/machine/armv7-a/rte.vars.mk| 67 ++
> >  3 files changed, 184 insertions(+)
> >  create mode 100644 config/defconfig_arm-armv7-a-linuxapp-gcc
> >  create mode 100644 mk/arch/arm/rte.vars.mk
> >  create mode 100644 mk/machine/armv7-a/rte.vars.mk
> >  
> 
> This patch comes too early in the patchset, I would put it once compilation
> is fine (more comment to come, btw), so once all headers are in place, not
> before.

Agree, this was done by watching the Power 8 patchset. But this seems
quite logical.

> 
> Besides, do we really need this -a suffix ?

It is the full name of the ARM architecture - armv7, A profile. There
are 3 profiles: A - application, R - real time, M - microcontroller.
They differ in what MMU/MPU and other such things they support. But,
finally, we can omit it. The M profile is unsuitable for DPDK anyway. The
R profile may be however used under certain circumstances (I believe, it
can run Linux).

Jan

-- 
  Jan Viktorin                E-mail: Viktorin at RehiveTech.com
  System Architect            Web:    www.RehiveTech.com
  RehiveTech
  Brno, Czech Republic

[dpdk-dev] [PATCH v7 4/8] vhost: rxtx: use queue id instead of constant ring index

2015-10-28 Thread Flavio Leitner

On Sat, Oct 24, 2015 at 08:47:10PM +0300, Michael S. Tsirkin wrote:
> On Sat, Oct 24, 2015 at 12:34:08AM -0200, Flavio Leitner wrote:
> > On Thu, Oct 22, 2015 at 02:32:31PM +0300, Michael S. Tsirkin wrote:
> > > On Thu, Oct 22, 2015 at 05:49:55PM +0800, Yuanhan Liu wrote:
> > > > On Wed, Oct 21, 2015 at 05:26:18PM +0300, Michael S. Tsirkin wrote:
> > > > > On Wed, Oct 21, 2015 at 08:48:15PM +0800, Yuanhan Liu wrote:
> > > > > > > Please note that for virtio devices, guest is supposed to
> > > > > > > control the placement of incoming packets in RX queues.
> > > > > > 
> > > > > > I may not follow you.
> > > > > > 
> > > > > > Enqueuing packets to a RX queue is done at vhost lib, outside the
> > > > > > guest, how could the guest take the control here?
> > > > > > 
> > > > > > --yliu
> > > > > 
> > > > > vhost should do what guest told it to.
> > > > > 
> > > > > See virtio spec:
> > > > >   5.1.6.5.5 Automatic receive steering in multiqueue mode
> > > > 
> > > > Spec says:
> > > > 
> > > > After the driver transmitted a packet of a flow on transmitqX,
> > > > the device SHOULD cause incoming packets for that flow to be
> > > > steered to receiveqX.
> > > > 
> > > > 
> > > > Michael, I still have no idea how vhost could know the flow even
> > > > after discussion with Huawei. Could you be more specific about
> > > > this? Say, how could guest know that? And how could guest tell
> > > > > vhost which RX queue it is going to use?
> > > > 
> > > > Thanks.
> > > > 
> > > > --yliu
> > > 
> > > I don't really understand the question.
> > > 
> > > > When a guest transmits a packet, it makes a decision
> > > about the flow to use, and maps that to a tx/rx pair of queues.
> > > 
> > > It sends packets out on the tx queue and expects device to
> > > return packets from the same flow on the rx queue.
> > 
> > Why? I can understand that there should be a mapping between
> > flows and queues in a way that there is no re-ordering, but
> > I can't see the relation of receiving a flow with a TX queue.
> > 
> > fbl
> 
> That's the way virtio chose to program the rx steering logic.
> 
> It's low overhead (no special commands), and
> works well for TCP when user is an endpoint since rx and tx
> for tcp are generally tied (because of ack handling).
> 
> We can discuss other ways, e.g. special commands for guests to
> program steering.
> We'd have to first see some data showing the current scheme
> is problematic somehow.

The issue is that the restriction imposes operations to be done in the
data path.  For instance, Open vSwitch has N number of threads to manage
X RX queues. We distribute them in round-robin fashion.  So, the thread
polling one RX queue will do all the packet processing and push it to the
TX queue of the other device (vhost-user or not) using the same 'id'.

Doing so we can avoid locking between threads and TX queues and any other
extra computation while still keeping the packet ordering/distribution fine.
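
A minimal sketch of the scheme described above, assuming the 'id' is the index of the polling thread. The function names are hypothetical and this is not Open vSwitch code.

```c
#include <assert.h>
#include <stdint.h>

/* Round-robin assignment: RX queue q is polled by thread q % n_threads. */
static inline uint32_t rxq_to_thread(uint32_t rxq, uint32_t n_threads)
{
    assert(n_threads > 0);
    return rxq % n_threads;
}

/*
 * Each thread transmits only on the TX queue that carries its own id.
 * Two different threads therefore never touch the same TX queue, which
 * is why no lock is needed on the transmit side.
 */
static inline uint32_t thread_to_txq(uint32_t thread_id)
{
    return thread_id;
}
```

The mapping is pure arithmetic on indices, so nothing has to be looked up or locked per packet.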

However, if vhost-user has to send packets according to the guest mapping,
it will require locking between queues and additional operations to select
the appropriate queue.  Those actions will cause performance issues.

I see no real benefit from enforcing the guest mapping outside to
justify all the computation cost, so my suggestion is to change the
spec to suggest that behavior, but not to require that to be compliant.

Does that make sense?

Thanks,
fbl

[dpdk-dev] [PATCH v2] enic: improve Tx packet rate

2015-10-28 Thread John Daley

For every packet sent, a completion was being requested and the
posted_index register on the nic was being updated. Instead, request a
completion and update the posted index once per burst after all
packets have been sent by the burst function.
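
The idea can be sketched with a simulated work queue. The struct fake_wq and its helpers are hypothetical stand-ins, not the enic or vnic API; the counters only make the number of register writes and completion requests visible.

```c
#include <stdint.h>

/* Simulated work queue; every field is an illustrative assumption. */
struct fake_wq {
    uint32_t head;                  /* next free descriptor, software only */
    uint32_t posted_index;          /* last value written to the "register" */
    uint32_t doorbell_writes;       /* how many times the register was written */
    uint32_t completions_requested; /* descriptors flagged for a completion */
};

/* Queues one descriptor without touching the device register. */
static inline void fake_wq_enqueue(struct fake_wq *wq, int request_completion)
{
    wq->head++;
    if (request_completion)
        wq->completions_requested++;
}

/* Publishes everything queued so far with a single register write. */
static inline void fake_wq_post(struct fake_wq *wq)
{
    wq->posted_index = wq->head;
    wq->doorbell_writes++;
}

/* Sends a burst: one completion request and one register write in total. */
static inline uint16_t fake_xmit_burst(struct fake_wq *wq, uint16_t nb_pkts)
{
    uint16_t i;

    for (i = 0; i < nb_pkts; i++)
        fake_wq_enqueue(wq, i == nb_pkts - 1);
    if (nb_pkts > 0)
        fake_wq_post(wq);
    return nb_pkts;
}
```

For a burst of 32 packets this performs one register write instead of 32.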

Acked-by: Sujith Sankar 
Signed-off-by: John Daley 
---
 drivers/net/enic/enic.h| 12 +++-
 drivers/net/enic/enic_ethdev.c | 22 ++
 drivers/net/enic/enic_main.c   | 28 +++-
 3 files changed, 36 insertions(+), 26 deletions(-)

diff --git a/drivers/net/enic/enic.h b/drivers/net/enic/enic.h
index 91faeaf..9e78305 100644
--- a/drivers/net/enic/enic.h
+++ b/drivers/net/enic/enic.h
@@ -48,7 +48,7 @@

 #define DRV_NAME   "enic_pmd"
 #define DRV_DESCRIPTION    "Cisco VIC Ethernet NIC Poll-mode Driver"
-#define DRV_VERSION    "1.0.0.5"
+#define DRV_VERSION    "1.0.0.6"
 #define DRV_COPYRIGHT  "Copyright 2008-2015 Cisco Systems, Inc"

 #define ENIC_WQ_MAX    8
@@ -187,10 +187,12 @@ extern void enic_add_packet_filter(struct enic *enic);
 extern void enic_set_mac_address(struct enic *enic, uint8_t *mac_addr);
 extern void enic_del_mac_address(struct enic *enic);
 extern unsigned int enic_cleanup_wq(struct enic *enic, struct vnic_wq *wq);
-extern int enic_send_pkt(struct enic *enic, struct vnic_wq *wq,
-   struct rte_mbuf *tx_pkt, unsigned short len,
-   uint8_t sop, uint8_t eop,
-   uint16_t ol_flags, uint16_t vlan_tag);
+extern void enic_send_pkt(struct enic *enic, struct vnic_wq *wq,
+ struct rte_mbuf *tx_pkt, unsigned short len,
+ uint8_t sop, uint8_t eop, uint8_t cq_entry,
+ uint16_t ol_flags, uint16_t vlan_tag);
+
+extern void enic_post_wq_index(struct vnic_wq *wq);
 extern int enic_poll(struct vnic_rq *rq, struct rte_mbuf **rx_pkts,
unsigned int budget, unsigned int *work_done);
 extern int enic_probe(struct enic *enic);
diff --git a/drivers/net/enic/enic_ethdev.c b/drivers/net/enic/enic_ethdev.c
index e385560..f8f7817 100644
--- a/drivers/net/enic/enic_ethdev.c
+++ b/drivers/net/enic/enic_ethdev.c
@@ -488,21 +488,26 @@ static uint16_t enicpmd_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
unsigned int seg_len;
unsigned int inc_len;
unsigned int nb_segs;
-   struct rte_mbuf *tx_pkt;
+   struct rte_mbuf *tx_pkt, *next_tx_pkt;
struct vnic_wq *wq = (struct vnic_wq *)tx_queue;
struct enic *enic = vnic_dev_priv(wq->vdev);
unsigned short vlan_id;
unsigned short ol_flags;
+   uint8_t last_seg, eop;

for (index = 0; index < nb_pkts; index++) {
tx_pkt = *tx_pkts++;
inc_len = 0;
nb_segs = tx_pkt->nb_segs;
if (nb_segs > vnic_wq_desc_avail(wq)) {
+   if (index > 0)
+   enic_post_wq_index(wq);
+
/* wq cleanup and try again */
if (!enic_cleanup_wq(enic, wq) ||
-   (nb_segs > vnic_wq_desc_avail(wq)))
+   (nb_segs > vnic_wq_desc_avail(wq))) {
return index;
+   }
}
pkt_len = tx_pkt->pkt_len;
vlan_id = tx_pkt->vlan_tci;
@@ -510,14 +515,15 @@ static uint16_t enicpmd_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
for (frags = 0; inc_len < pkt_len; frags++) {
if (!tx_pkt)
break;
+   next_tx_pkt = tx_pkt->next;
seg_len = tx_pkt->data_len;
inc_len += seg_len;
-   if (enic_send_pkt(enic, wq, tx_pkt,
-   (unsigned short)seg_len, !frags,
-   (pkt_len == inc_len), ol_flags, vlan_id)) {
-   break;
-   }
-   tx_pkt = tx_pkt->next;
+   eop = (pkt_len == inc_len) || (!next_tx_pkt);
+   last_seg = eop &&
+   (index == ((unsigned int)nb_pkts - 1));
+   enic_send_pkt(enic, wq, tx_pkt, (unsigned short)seg_len,
+ !frags, eop, last_seg, ol_flags, vlan_id);
+   tx_pkt = next_tx_pkt;
}
}

diff --git a/drivers/net/enic/enic_main.c b/drivers/net/enic/enic_main.c
index c3c62c4..07a9810 100644
--- a/drivers/net/enic/enic_main.c
+++ b/drivers/net/enic/enic_main.c
@@ -58,6 +58,7 @@
 #include "vnic_cq.h"
 #include "vnic_intr.h"
 #include "vnic_nic.h"
+#include "enic_vnic_wq.h"

 static inline int enic_is_sriov_vf(struct enic *enic)
 {
@@ -151,15 +152,18 @@ unsigned int enic_cleanup_wq(struct enic *enic, struct vnic_wq *wq)
-1 /*wq_work_to_do*/, e

[dpdk-dev] [PATCH v2 0/4] RSS enhancement on Intel x550 NIC

2015-10-28 Thread Thomas Monjalon

> > This patch set implements the RSS enhancement on x550.
> > The enhancement includes, the PF RSS redirection table is enlarged
> > from 128 entries to 512 entries, the VF doesn't share the same
> > registers with PF and per VF RSS redirection table is provided.
> > 
> > V2:
> > Condense the code.
> > Add release notes.
> > 
> > Wenzhuo Lu (4):
> >   ixgbe: 512 entries RSS table on x550
> >   ixgbe: VF RSS config on x550
> >   ixgbe: VF RSS reta/hash query and update
> >   doc: release notes update for RSS enhancement
> 
> Acked-by: Konstantin Ananyev 
> Great work :)

Applied, thanks
The doc patch is split and merged in previous commits.
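
As a rough illustration of what a larger redirection table buys, here is a sketch of an RSS-style lookup. It assumes that the low bits of the hash select the table entry; the names and the indexing rule are illustrative assumptions, not taken from the ixgbe driver.

```c
#include <assert.h>
#include <stdint.h>

#define RETA_SIZE_128 128u  /* table size before the enhancement */
#define RETA_SIZE_512 512u  /* PF table size on x550 per the cover letter */

/*
 * Picks the RX queue for a packet from its RSS hash.
 * reta_size must be a power of two so that masking works.
 */
static inline uint16_t reta_lookup(const uint16_t *reta, uint32_t reta_size,
                                   uint32_t hash)
{
    assert(reta_size != 0 && (reta_size & (reta_size - 1)) == 0);
    return reta[hash & (reta_size - 1)];
}
```

With 512 entries, two more bits of the hash take part in the selection, which allows a finer-grained spread of flows across queues.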

[dpdk-dev] [PATCH v2] e1000: mark rxq with RTE_SET_USED

2015-10-28 Thread Thomas Monjalon

2015-10-28 16:18, De Lara Guarch, Pablo:
> > This patch marks rxq with RTE_SET_USED in
> > rx_desc_hlen_type_rss_to_pkt_flags(), when
> > ieee1588 is disabled. Previously a compilation
> > error occurred on unused-parameter.
> > 
> > Fixes: 1ce6591e238a ("igb: fix ieee1588 frame identification in i210")
> > 
> > Signed-off-by: Harry van Haaren 
> 
> Acked-by: Pablo de Lara 

Applied, thanks

To avoid such breakage, we must carefully check compilation with
different combinations of options.
Hopefully the automated tests will help us more in the near future.
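
The kind of breakage fixed here can be sketched as follows. SET_USED is a local stand-in assumed to behave like a cast to void, FEATURE_TIMESYNC is a hypothetical build option, and the function is not the real rx_desc_hlen_type_rss_to_pkt_flags().

```c
#include <stdint.h>

/* Local stand-in for a "mark as used" macro: a plain cast to void. */
#define SET_USED(x) ((void)(x))

struct rx_queue {
    uint32_t flags;
};

/*
 * rxq is only read when the optional feature is compiled in. Without
 * the SET_USED() call, a build with the feature disabled and
 * -Werror=unused-parameter enabled would fail.
 */
static inline uint64_t desc_to_pkt_flags(struct rx_queue *rxq, uint32_t hl_tp_rs)
{
    uint64_t pkt_flags = (hl_tp_rs & 0xF) ? 1ULL : 0ULL;

#ifdef FEATURE_TIMESYNC
    if (rxq->flags & 1)
        pkt_flags |= 2ULL;
#else
    SET_USED(rxq);
#endif
    return pkt_flags;
}
```

Building the same file once with and once without -DFEATURE_TIMESYNC is the sort of option combination that catches this class of error.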

[dpdk-dev] [PATCH] lib/lpm:fix two issues in the delete_depth_small()

2015-10-28 Thread Nikita Kozlov

On 10/28/2015 03:40 PM, Bruce Richardson wrote:
> On Wed, Oct 28, 2015 at 11:44:15AM +0800, Jijiang Liu wrote:
>> Fix two issues in the delete_depth_small() function.
>>  
>> 1> The control is not strict in this function.
>>  
>> In the following structure,
>> struct rte_lpm_tbl24_entry {
>> union {
>> uint8_t next_hop;
>> uint8_t tbl8_gindex;
>> };
>>  uint8_t ext_entry :1;
>> }
>>  
>> When ext_entry = 0, use next_hop only to process rte_lpm_tbl24_entry.
>>  
>> When ext_entry = 1, use tbl8_gindex to process the rte_lpm_tbl8_entry.
>>  
>> When using the LPM24 + 8 algorithm, ext_entry decides whether to process
>> the rte_lpm_tbl24_entry structure or the rte_lpm_tbl8_entry structure.
>> If a route is deleted, the prefix of the previous route is used to override
>> the deleted route. When (lpm->tbl24[i].ext_entry == 0 &&
>> lpm->tbl24[i].depth > depth) the entry should be ignored, but due to the
>> incorrect logic, next_hop is used as tbl8_gindex and the
>> rte_lpm_tbl8_entry is processed.
>>  
>> 2> Initialization of rte_lpm_tbl8_entry is incorrect in this function
>>  
>> This function replaces the old rte_lpm_tbl8_entry with a new one, which we
>> call A. But valid_group is not set to VALID, so it stays INVALID.
>> Then, when adding a new route whose depth is > 24, the tbl8_alloc() function
>> searches the rte_lpm_tbl8_entry array for an INVALID valid_group
>> and returns A to the add_depth_big function, so A's data is
>> overridden.
>>
>> Signed-off-by: NaNa 
>>
> Hi NaNa, Jijiang,
>
> since this patch contains two separate fixes, it would be better split into
> two separate patches, one for each fix. Also, please add a "Fixes" line to
> the commit log.
>
> Are there still plans for a unit test to demonstrate the bug(s) and make it
> easy for us to verify the fix?
>
> Regards,
> /Bruce
Hello,

It's the same fix as the one sent here (which contained some tests,
maybe we can use them ?)
http://dpdk.org/ml/archives/dev/2015-October/025871.html .
For what it's worth, we are using those fixes at my company and they are
fixing the described bug.
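
The first issue in the quoted description can be sketched as a three-way decision per tbl24 entry. The structure below is a simplified stand-in (the union is flattened into a single value field and a depth field is assumed), and tbl24_delete_action is a hypothetical name, not the patch itself.

```c
#include <stdint.h>

/* Simplified stand-in for a tbl24 entry; the layout is an assumption. */
struct tbl24_entry {
    uint8_t value;      /* next_hop if ext_entry == 0, tbl8 group index if 1 */
    uint8_t valid;
    uint8_t ext_entry;
    uint8_t depth;
};

enum delete_action {
    DELETE_SKIP,      /* a more specific route owns this entry, leave it */
    DELETE_REPLACE,   /* covered by the deleted route: invalidate or overwrite */
    DELETE_WALK_TBL8  /* value is a tbl8 group index, descend into tbl8 */
};

/*
 * Decides what to do with one tbl24 entry while deleting a route of the
 * given depth. The bug described above corresponds to treating every
 * entry that is not replaced as a tbl8 pointer, even when ext_entry == 0.
 */
static inline enum delete_action
tbl24_delete_action(const struct tbl24_entry *e, uint8_t depth)
{
    if (e->ext_entry == 1)
        return DELETE_WALK_TBL8;
    if (e->depth <= depth)
        return DELETE_REPLACE;
    return DELETE_SKIP;
}
```

Checking ext_entry explicitly is what keeps a next_hop value from being misread as a tbl8 group index.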

-- 
Nikita

[dpdk-dev] [PATCH v4] ixgbe: Drop flow control frames from VFs

2015-10-28 Thread Thomas Monjalon

> > This patch will drop flow control frames from being transmitted from VSIs.
> > With this patch in place a malicious VF cannot send flow control or PFC
> > packets out on the wire.
> > 
> > V2:
> > Reword the comments.
> > 
> > V3:
> > Move the check of set_ethertype_anti_spoofing to the top of the function, to
> > avoid occupying an ethertype_filter entity without using it.
> > 
> > V4:
> > Remove the useless braces and return.
> > 
> > Signed-off-by: Wenzhuo Lu 
> Acked-by: Helin Zhang 

Applied, thanks
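
For readers unfamiliar with how such frames are recognised: 802.3x PAUSE and PFC frames both carry the MAC Control EtherType 0x8808. The sketch below is a software illustration only; the patch programs a hardware filter, and is_flow_control_frame is a hypothetical helper that assumes an untagged Ethernet frame.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ETHERTYPE_MAC_CONTROL 0x8808  /* used by PAUSE and PFC frames */
#define ETH_HEADER_LEN 14             /* dst MAC + src MAC + EtherType */

/* Returns true if an untagged Ethernet frame is a MAC Control frame. */
static inline bool is_flow_control_frame(const uint8_t *frame, size_t len)
{
    uint16_t ethertype;

    if (frame == NULL || len < ETH_HEADER_LEN)
        return false;
    /* The EtherType is stored big-endian at byte offsets 12 and 13. */
    ethertype = (uint16_t)((frame[12] << 8) | frame[13]);
    return ethertype == ETHERTYPE_MAC_CONTROL;
}
```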

[dpdk-dev] [PATCH v2 01/16] mk: Introduce ARMv7 architecture

2015-10-28 Thread Richardson, Bruce



> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jan Viktorin
> Sent: Wednesday, October 28, 2015 5:32 PM
> To: David Marchand 
> Cc: dev at dpdk.org; Vlastimil Kosar 
> Subject: Re: [dpdk-dev] [PATCH v2 01/16] mk: Introduce ARMv7 architecture
> 
> On Wed, 28 Oct 2015 14:34:40 +0100
> David Marchand  wrote:
> 
> > On Mon, Oct 26, 2015 at 5:37 PM, Jan Viktorin
> > 
> > wrote:
> >
> > > From: Vlastimil Kosar 
> > >
> > > Make DPDK run on ARMv7-A architecture. This patch assumes ARM
> > > Cortex-A9. However, it is known to be working on Cortex-A7 and
> > > Cortex-A15.
> > >
> > > Signed-off-by: Vlastimil Kosar 
> > > Signed-off-by: Jan Viktorin 
> > > ---
> > > v1 -> v2:
> > > * the -mtune parameter of GCC is configurable now
> > > * the -mfpu=neon can be turned off
> > >
> > > Signed-off-by: Jan Viktorin 
> > > ---
> > >  config/defconfig_arm-armv7-a-linuxapp-gcc | 78
> > > +++
> > >  mk/arch/arm/rte.vars.mk   | 39 
> > >  mk/machine/armv7-a/rte.vars.mk| 67
> ++
> > >  3 files changed, 184 insertions(+)
> > >  create mode 100644 config/defconfig_arm-armv7-a-linuxapp-gcc
> > >  create mode 100644 mk/arch/arm/rte.vars.mk  create mode 100644
> > > mk/machine/armv7-a/rte.vars.mk
> > >
> >
> > This patch comes too early in the patchset, I would put it once
> > compilation is fine (more comment to come, btw), so once all headers
> > are in place, not before.
> 
> Agree, this was done by watching the Power 8 patchset. But this seems
> quite logical.
> 
> >
> > Besides, do we really need this -a suffix ?
> 
> It is the full name of the ARM architecture - armv7, A profile. There are
> 3 profiles: A - application, R - real time, M - microcontroller.
> They differ in what MMU/MPU and other such things they support. But,
> finally, we can omit it. The M profile is unsuitable for DPDK anyway. The R
> profile may be however used under certain circumstances (I believe, it can
> run Linux).
> 
If you do want to include the "a" maybe just drop the "-" before it, please. 
The "-" is used to separate the elements of the RTE_TARGET and so it doesn't 
look right.

/Bruce
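
To make the concern concrete: an RTE_TARGET such as x86_64-native-linuxapp-gcc has four "-" separated elements, so an extra "-" inside the machine name adds a fifth. The helper below is a hypothetical sketch, not part of the build system.

```c
#include <stddef.h>

/* Counts the '-' separated elements in a target name. */
static inline size_t count_target_elements(const char *target)
{
    size_t n = 1;

    if (target == NULL || *target == '\0')
        return 0;
    for (; *target != '\0'; target++)
        if (*target == '-')
            n++;
    return n;
}
```

With this count, arm-armv7-a-linuxapp-gcc yields five elements while arm-armv7a-linuxapp-gcc yields the usual four.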

[dpdk-dev] [PATCH v3 00/11] Port XStats

2015-10-28 Thread Tom Crugnale

Hi Harry,

We are planning on using the xstats API for periodic stats collection through a 
polling thread.  This would be done in a generic NIC agnostic manner, which 
would require that the xstats identifiers have consistent naming amongst all of 
the NIC types.  It would likely be polled several times per second and would 
only gather a subset of all available xstats types.  

I have reviewed your patches and am interested in providing some API 
enhancements and bugfixes.  Are you willing to provide feedback on such changes?

Thank you,
Tom

-----Original Message-----
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Harry van Haaren
Sent: Thursday, October 22, 2015 11:48 AM
To: dev at dpdk.org
Subject: [dpdk-dev] [PATCH v3 00/11] Port XStats

This patchset adds an implementation of the xstats_get() and xstats_reset() API 
to the following PMDs: virtio, igb, igbvf, ixgbe, ixgbevf, i40e, i40evf and 
fm10k.

The xstats API allows DPDK apps to gain access to extended statistics from each 
port on a NIC. These statistics are structured as per a scheme detailed in the 
patch for the doc/prog_guide.


Harry van Haaren (11):
  doc: add extended statistics notes
  doc: add extended statistics to prog_guide
  ethdev: update xstats_get() strings and Q handling
  virtio: add xstats() implementation
  igb: add xstats() implementation
  igbvf: add xstats() implementation
  ixgbe: update statistic strings to scheme
  ixgbevf: add xstats() functions to VF
  i40e: add xstats() implementation
  i40evf: add xstats() implementation
  fm10k: add xstats() implementation

 doc/guides/prog_guide/poll_mode_drv.rst |  51 -
 doc/guides/rel_notes/release_2_2.rst|  12 ++
 drivers/net/e1000/igb_ethdev.c  | 194 +-
 drivers/net/fm10k/fm10k_ethdev.c|  87 
 drivers/net/i40e/i40e_ethdev.c  | 265 +++-
 drivers/net/i40e/i40e_ethdev_vf.c   |  89 +++-
 drivers/net/ixgbe/ixgbe_ethdev.c| 346 +---
 drivers/net/virtio/virtio_ethdev.c  |  98 -
 drivers/net/virtio/virtio_rxtx.c|  32 +++
 drivers/net/virtio/virtqueue.h  |   4 +
 lib/librte_ether/rte_ethdev.c   |  38 ++--
 11 files changed, 1154 insertions(+), 62 deletions(-)

--
1.9.1

[dpdk-dev] [PATCH v2] ixgbe: prefetch cacheline after pointer becomes valid

2015-10-28 Thread Thomas Monjalon

2015-09-28 10:53, Bruce Richardson:
> On Fri, Sep 25, 2015 at 10:44:51AM -0700, Zoltan Kiss wrote:
> > At the original point the rx_pkts[pos( + n)] pointers are not initialized,
> > so the code is prefetching random data.
> > 
> > Signed-off-by: Zoltan Kiss 
> 
> Acked-by: Bruce Richardson 

Applied, thanks
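
The ordering issue can be sketched outside the driver as follows. The struct pkt and fill_then_prefetch are hypothetical, and __builtin_prefetch is the GCC/Clang builtin rather than the DPDK prefetch wrapper.

```c
#include <stddef.h>
#include <stdint.h>

struct pkt {
    uint32_t len;
};

/*
 * Stores each packet pointer first and only then prefetches through it.
 * Prefetching out[i] before the store would pull in whatever stale
 * address the slot happened to hold, which wastes the prefetch.
 */
static inline uint32_t fill_then_prefetch(struct pkt **out, struct pkt *pool,
                                          size_t n)
{
    uint32_t total = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        out[i] = &pool[i];           /* pointer becomes valid here */
        __builtin_prefetch(out[i]);  /* useful: the address is meaningful */
    }
    for (i = 0; i < n; i++)
        total += out[i]->len;
    return total;
}
```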

[dpdk-dev] [PATCH v3] ixgbe: fix access to last byte of EEPROM

2015-10-28 Thread Thomas Monjalon

> > Incorrect operator in ixgbe_get_eeprom & ixgbe_set_eeprom prevents
> > last byte of EEPROM being read/written, and hence cannot be dumped
> > or updated in entirety using these functions.
> > 
> > Fixes: 0198848a47f5 ("ixgbe: add access to specific device info")
> > 
> > Signed-off-by: Remy Horton 
> 
> Acked-by: Konstantin Ananyev 

Applied, thanks
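
The patch body is not quoted in this thread, so the exact operator is an assumption. An off-by-one range check of the following shape would produce the symptom described: a request that ends exactly at the end of the EEPROM gets rejected. Both function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Too strict: rejects a range whose end equals the device size. */
static inline bool range_ok_strict(uint32_t first, uint32_t length,
                                   uint32_t size)
{
    return !(first >= size || first + length >= size);
}

/* Accepts any range that lies fully inside [0, size). */
static inline bool range_ok_inclusive(uint32_t first, uint32_t length,
                                      uint32_t size)
{
    return !(first >= size || first + length > size);
}
```

With a 16-byte device, a full dump (first 0, length 16) fails the strict check but passes the inclusive one.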

[dpdk-dev] [PATCH v4 7/7] librte_table: performance improvement on rte_prefetch offset

2015-10-28 Thread roy.fan.zh...@intel.com

From: Fan Zhang 

This patch modifies rte_prefetch offsets to improve hash/lru
table lookup performance.

Signed-off-by: Fan Zhang 
---
 lib/librte_table/rte_table_hash_ext.c   | 10 ---
 lib/librte_table/rte_table_hash_key16.c | 51 +
 lib/librte_table/rte_table_hash_key32.c | 35 +++---
 lib/librte_table/rte_table_hash_key8.c  | 51 +
 lib/librte_table/rte_table_hash_lru.c   | 10 ---
 5 files changed, 85 insertions(+), 72 deletions(-)

diff --git a/lib/librte_table/rte_table_hash_ext.c b/lib/librte_table/rte_table_hash_ext.c
index 1fa15c8..854e1a5 100644
--- a/lib/librte_table/rte_table_hash_ext.c
+++ b/lib/librte_table/rte_table_hash_ext.c
@@ -648,6 +648,7 @@ static int rte_table_hash_ext_lookup_unoptimized(
 {  \
uint64_t pkt00_mask, pkt01_mask;\
struct rte_mbuf *mbuf00, *mbuf01;   \
+   uint32_t key_offset = t->key_offset;\
\
pkt00_index = __builtin_ctzll(pkts_mask);   \
pkt00_mask = 1LLU << pkt00_index;   \
@@ -659,8 +660,8 @@ static int rte_table_hash_ext_lookup_unoptimized(
pkts_mask &= ~pkt01_mask;   \
mbuf01 = pkts[pkt01_index]; \
\
-   rte_prefetch0(RTE_MBUF_METADATA_UINT8_PTR(mbuf00, 0));  \
-   rte_prefetch0(RTE_MBUF_METADATA_UINT8_PTR(mbuf01, 0));  \
+   rte_prefetch0(RTE_MBUF_METADATA_UINT8_PTR(mbuf00, key_offset));\
+   rte_prefetch0(RTE_MBUF_METADATA_UINT8_PTR(mbuf01, key_offset));\
 }

 #define lookup2_stage0_with_odd_support(t, g, pkts, pkts_mask, pkt00_index, \
@@ -668,6 +669,7 @@ static int rte_table_hash_ext_lookup_unoptimized(
 {  \
uint64_t pkt00_mask, pkt01_mask;\
struct rte_mbuf *mbuf00, *mbuf01;   \
+   uint32_t key_offset = t->key_offset;\
\
pkt00_index = __builtin_ctzll(pkts_mask);   \
pkt00_mask = 1LLU << pkt00_index;   \
@@ -681,8 +683,8 @@ static int rte_table_hash_ext_lookup_unoptimized(
pkts_mask &= ~pkt01_mask;   \
mbuf01 = pkts[pkt01_index]; \
\
-   rte_prefetch0(RTE_MBUF_METADATA_UINT8_PTR(mbuf00, 0));  \
-   rte_prefetch0(RTE_MBUF_METADATA_UINT8_PTR(mbuf01, 0));  \
+   rte_prefetch0(RTE_MBUF_METADATA_UINT8_PTR(mbuf00, key_offset));\
+   rte_prefetch0(RTE_MBUF_METADATA_UINT8_PTR(mbuf01, key_offset));\
 }

 #define lookup2_stage1(t, g, pkts, pkt10_index, pkt11_index)   \
diff --git a/lib/librte_table/rte_table_hash_key16.c b/lib/librte_table/rte_table_hash_key16.c
index 427b534..21130b9 100644
--- a/lib/librte_table/rte_table_hash_key16.c
+++ b/lib/librte_table/rte_table_hash_key16.c
@@ -595,16 +595,17 @@ rte_table_hash_entry_delete_key16_ext(
pos = 3;\
 }

-#define lookup1_stage0(pkt0_index, mbuf0, pkts, pkts_mask) \
+#define lookup1_stage0(pkt0_index, mbuf0, pkts, pkts_mask, f)  \
 {  \
uint64_t pkt_mask;  \
+   uint32_t key_offset = f->key_offset;\
\
pkt0_index = __builtin_ctzll(pkts_mask);\
pkt_mask = 1LLU << pkt0_index;  \
pkts_mask &= ~pkt_mask; \
\
mbuf0 = pkts[pkt0_index];   \
-   rte_prefetch0(RTE_MBUF_METADATA_UINT8_PTR(mbuf0, 0));   \
+   rte_prefetch0(RTE_MBUF_METADATA_UINT8_PTR(mbuf0, key_offset));\
 }

 #define lookup1_stage1(mbuf1, bucket1, f)  \
@@ -729,36 +730,38 @@ rte_table_hash_entry_delete_key16_ext(
 }

 #define lookup2_stage0(pkt00_index, pkt01_index, mbuf00, mbuf01,\
-   pkts, pkts_mask)\
+   pkts, pkts_mask, f) \
 {  \
uint64_t pkt00_mask, pkt01_mask;\
+   uint32_t key_offset = f->key_offset;\

[dpdk-dev] [PATCH v4 6/7] example/ip_pipeline/pipeline: update flow_classification pipeline

2015-10-28 Thread roy.fan.zh...@intel.com

From: Fan Zhang 

This patch updates the flow_classification pipeline for the key_mask
parameter added to the 8/16-byte key hash parameters. The update gives the
user an optional key_mask configuration item that is applied to the packets.

Signed-off-by: Fan Zhang 
---
 .../pipeline/pipeline_flow_classification_be.c | 56 --
 1 file changed, 52 insertions(+), 4 deletions(-)

diff --git a/examples/ip_pipeline/pipeline/pipeline_flow_classification_be.c b/examples/ip_pipeline/pipeline/pipeline_flow_classification_be.c
index 06a648d..e22f96f 100644
--- a/examples/ip_pipeline/pipeline/pipeline_flow_classification_be.c
+++ b/examples/ip_pipeline/pipeline/pipeline_flow_classification_be.c
@@ -37,6 +37,7 @@
 #include 
 #include 
 #include 
+#include 

 #include "pipeline_flow_classification_be.h"
 #include "hash_func.h"
@@ -49,6 +50,7 @@ struct pipeline_flow_classification {
uint32_t key_offset;
uint32_t key_size;
uint32_t hash_offset;
+   uint8_t *key_mask;
 } __rte_cache_aligned;

 static void *
@@ -125,8 +127,12 @@ pipeline_fc_parse_args(struct pipeline_flow_classification *p,
uint32_t key_offset_present = 0;
uint32_t key_size_present = 0;
uint32_t hash_offset_present = 0;
+   uint32_t key_mask_present = 0;

uint32_t i;
+   char *key_mask_str = NULL;
+
+   p->hash_offset = 0;

for (i = 0; i < params->n_args; i++) {
char *arg_name = params->args_name[i];
@@ -171,6 +177,20 @@ pipeline_fc_parse_args(struct pipeline_flow_classification *p,
continue;
}

+   /* key_mask */
+   if (strcmp(arg_name, "key_mask") == 0) {
+   if (key_mask_present)
+   return -1;
+
+   key_mask_str = strdup(arg_value);
+   if (key_mask_str == NULL)
+   return -1;
+
+   key_mask_present = 1;
+
+   continue;
+   }
+
/* hash_offset */
if (strcmp(arg_name, "hash_offset") == 0) {
if (hash_offset_present)
@@ -189,10 +209,23 @@ pipeline_fc_parse_args(struct pipeline_flow_classification *p,
/* Check that mandatory arguments are present */
if ((n_flows_present == 0) ||
(key_offset_present == 0) ||
-   (key_size_present == 0) ||
-   (hash_offset_present == 0))
+   (key_size_present == 0))
return -1;

+   if (key_mask_present) {
+   p->key_mask = rte_malloc(NULL, p->key_size, 0);
+   if (p->key_mask == NULL)
+   return -1;
+
+   if (parse_hex_string(key_mask_str, p->key_mask, &p->key_size)
+   != 0) {
+   free(p->key_mask);
+   return -1;
+   }
+
+   free(key_mask_str);
+   }
+
return 0;
 }

@@ -297,6 +330,7 @@ static void *pipeline_fc_init(struct pipeline_params *params,
.signature_offset = p_fc->hash_offset,
.key_offset = p_fc->key_offset,
.f_hash = hash_func[(p_fc->key_size / 8) - 1],
+   .key_mask = p_fc->key_mask,
.seed = 0,
};

@@ -307,6 +341,7 @@ static void *pipeline_fc_init(struct pipeline_params *params,
.signature_offset = p_fc->hash_offset,
.key_offset = p_fc->key_offset,
.f_hash = hash_func[(p_fc->key_size / 8) - 1],
+   .key_mask = p_fc->key_mask,
.seed = 0,
};

@@ -336,12 +371,25 @@ static void *pipeline_fc_init(struct pipeline_params *params,

switch (p_fc->key_size) {
case 8:
-   table_params.ops = &rte_table_hash_key8_lru_ops;
+   if (p_fc->hash_offset != 0) {
+   table_params.ops =
+   &rte_table_hash_key8_ext_ops;
+   } else {
+   table_params.ops =
+   &rte_table_hash_key8_ext_dosig_ops;
+   }
table_params.arg_create = &table_hash_key8_params;
break;
+   break;

case 16:
-   table_params.ops = &rte_table_hash_key16_ext_ops;
+   if (p_fc->hash_offset != 0) {
+   table_params.ops =
+   &rte_table_hash_key16_ext_ops;
+   } else {
+   table_params.ops =
+   &rte_table_hash_key16_ext_dosig_ops;
+   }
table_p

[dpdk-dev] [PATCH v4 5/7] example/ip_pipeline: add parse_hex_string for internal use

2015-10-28 Thread roy.fan.zh...@intel.com

From: Fan Zhang 

This patch adds the parse_hex_string function, which parses a hex string
into a uint8_t array.

Signed-off-by: Fan Zhang 
---
 examples/ip_pipeline/config_parse.c | 52 +
 examples/ip_pipeline/pipeline_be.h  |  4 +++
 2 files changed, 56 insertions(+)

diff --git a/examples/ip_pipeline/config_parse.c b/examples/ip_pipeline/config_parse.c
index c9b78f9..ab7c518 100644
--- a/examples/ip_pipeline/config_parse.c
+++ b/examples/ip_pipeline/config_parse.c
@@ -455,6 +455,58 @@ parse_pipeline_core(uint32_t *socket,
return 0;
 }

+static uint32_t
+get_hex_val(char c)
+{
+   switch (c) {
+   case '0': case '1': case '2': case '3': case '4': case '5':
+   case '6': case '7': case '8': case '9':
+   return c - '0';
+   case 'A': case 'B': case 'C': case 'D': case 'E': case 'F':
+   return c - 'A' + 10;
+   case 'a': case 'b': case 'c': case 'd': case 'e': case 'f':
+   return c - 'a' + 10;
+   default:
+   return 0;
+   }
+}
+
+int
+parse_hex_string(char *src, uint8_t *dst, uint32_t *size)
+{
+   char *c;
+   uint32_t len, i;
+
+   /* Check input parameters */
+   if ((src == NULL) ||
+   (dst == NULL) ||
+   (size == NULL) ||
+   (*size == 0))
+   return -1;
+
+   len = strlen(src);
+   if (((len & 3) != 0) ||
+   (len > (*size) * 2))
+   return -1;
+   *size = len / 2;
+
+   for (c = src; *c != 0; c++) {
+   if ((((*c) >= '0') && ((*c) <= '9')) ||
+   (((*c) >= 'A') && ((*c) <= 'F')) ||
+   (((*c) >= 'a') && ((*c) <= 'f')))
+   continue;
+
+   return -1;
+   }
+
+   /* Convert chars to bytes */
+   for (i = 0; i < *size; i++)
+   dst[i] = get_hex_val(src[2 * i]) * 16 +
+   get_hex_val(src[2 * i + 1]);
+
+   return 0;
+}
+
 static size_t
 skip_digits(const char *src)
 {
diff --git a/examples/ip_pipeline/pipeline_be.h b/examples/ip_pipeline/pipeline_be.h
index 51f1e4f..2e46440 100644
--- a/examples/ip_pipeline/pipeline_be.h
+++ b/examples/ip_pipeline/pipeline_be.h
@@ -253,4 +253,8 @@ struct pipeline_be_ops {
pipeline_be_op_track f_track;
 };

+/* Parse hex string to uint8_t array */
+int
+parse_hex_string(char *src, uint8_t *dst, uint32_t *size);
+
 #endif
-- 
2.1.0
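
For experimenting with the conversion outside the ip_pipeline tree, here is a standalone sketch of the same idea. It is a variant rather than a copy of the patch: hex_to_bytes and hex_val are hypothetical names, digits are validated in the same pass as the conversion, and any even-length string is accepted (the patch tests len & 3).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns the value of one hex digit, or -1 if c is not a hex digit. */
static inline int hex_val(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    if (c >= 'A' && c <= 'F')
        return c - 'A' + 10;
    return -1;
}

/*
 * Converts a hex string to bytes. On entry *size is the capacity of dst,
 * on success it is set to the number of bytes written. Returns 0 on
 * success and -1 on bad input; dst may be partially written on failure.
 */
static inline int hex_to_bytes(const char *src, uint8_t *dst, size_t *size)
{
    size_t len, i;

    if (src == NULL || dst == NULL || size == NULL || *size == 0)
        return -1;

    len = strlen(src);
    if ((len & 1) != 0 || len > *size * 2)
        return -1;

    for (i = 0; i < len / 2; i++) {
        int hi = hex_val(src[2 * i]);
        int lo = hex_val(src[2 * i + 1]);

        if (hi < 0 || lo < 0)
            return -1;
        dst[i] = (uint8_t)(hi * 16 + lo);
    }

    *size = len / 2;
    return 0;
}
```

Returning -1 from hex_val for a bad digit lets the caller reject the string instead of silently treating the character as zero.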

[dpdk-dev] [PATCH v4 4/7] app/test-pipeline: modify pipeline test

2015-10-28 Thread roy.fan.zh...@intel.com

From: Fan Zhang 

Test-pipeline has been updated to work with the added
key_mask parameter for 8-byte key extendible
bucket and LRU tables.

Signed-off-by: Fan Zhang 
---
 app/test-pipeline/pipeline_hash.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/app/test-pipeline/pipeline_hash.c b/app/test-pipeline/pipeline_hash.c
index 5e4e17f..8b888d7 100644
--- a/app/test-pipeline/pipeline_hash.c
+++ b/app/test-pipeline/pipeline_hash.c
@@ -216,6 +216,7 @@ app_main_loop_worker_pipeline_hash(void) {
.n_entries_ext = 1 << 23,
.signature_offset = APP_METADATA_OFFSET(0),
.key_offset = APP_METADATA_OFFSET(32),
+   .key_mask = NULL,
.f_hash = test_hash,
.seed = 0,
};
@@ -240,6 +241,7 @@ app_main_loop_worker_pipeline_hash(void) {
.n_entries = 1 << 24,
.signature_offset = APP_METADATA_OFFSET(0),
.key_offset = APP_METADATA_OFFSET(32),
+   .key_mask = NULL,
.f_hash = test_hash,
.seed = 0,
};
@@ -267,6 +269,7 @@ app_main_loop_worker_pipeline_hash(void) {
.key_offset = APP_METADATA_OFFSET(32),
.f_hash = test_hash,
.seed = 0,
+   .key_mask = NULL,
};

struct rte_pipeline_table_params table_params = {
@@ -291,6 +294,7 @@ app_main_loop_worker_pipeline_hash(void) {
.key_offset = APP_METADATA_OFFSET(32),
.f_hash = test_hash,
.seed = 0,
+   .key_mask = NULL,
};

struct rte_pipeline_table_params table_params = {
-- 
2.1.0

[dpdk-dev] [PATCH v4 3/7] app/test: modify app/test_table_combined and app/test_table_tables

2015-10-28 Thread roy.fan.zh...@intel.com

From: Fan Zhang 

Tests have been updated to work with the added key_mask parameter for 8-byte
key extendible bucket and LRU tables.

Signed-off-by: Fan Zhang 
---
 app/test/test_table_combined.c | 5 -
 app/test/test_table_tables.c   | 6 --
 2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/app/test/test_table_combined.c b/app/test/test_table_combined.c
index 18daeec..8bf4aeb 100644
--- a/app/test/test_table_combined.c
+++ b/app/test/test_table_combined.c
@@ -418,9 +418,9 @@ test_table_hash8lru(void)
struct rte_table_hash_key8_lru_params key8lru_params = {
.n_entries = 1<<24,
.f_hash = pipeline_test_hash,
-   .seed = 0,
.signature_offset = APP_METADATA_OFFSET(0),
.key_offset = APP_METADATA_OFFSET(32),
+   .key_mask = NULL,
};

uint8_t key8lru[8];
@@ -479,6 +479,7 @@ test_table_hash16lru(void)
.seed = 0,
.signature_offset = APP_METADATA_OFFSET(0),
.key_offset = APP_METADATA_OFFSET(32),
+   .key_mask = NULL,
};

uint8_t key16lru[16];
@@ -596,6 +597,7 @@ test_table_hash8ext(void)
.seed = 0,
.signature_offset = APP_METADATA_OFFSET(0),
.key_offset = APP_METADATA_OFFSET(32),
+   .key_mask = NULL,
};

uint8_t key8ext[8];
@@ -662,6 +664,7 @@ test_table_hash16ext(void)
.seed = 0,
.signature_offset = APP_METADATA_OFFSET(0),
.key_offset = APP_METADATA_OFFSET(32),
+   .key_mask = NULL,
};

uint8_t key16ext[16];
diff --git a/app/test/test_table_tables.c b/app/test/test_table_tables.c
index cf7c62d..b6364c4 100644
--- a/app/test/test_table_tables.c
+++ b/app/test/test_table_tables.c
@@ -669,7 +669,8 @@ test_table_hash_lru_generic(struct rte_table_ops *ops)
.f_hash = pipeline_test_hash,
.seed = 0,
.signature_offset = APP_METADATA_OFFSET(1),
-   .key_offset = APP_METADATA_OFFSET(32)
+   .key_offset = APP_METADATA_OFFSET(32),
+   .key_mask = NULL,
};

hash_params.n_entries = 0;
@@ -784,7 +785,8 @@ test_table_hash_ext_generic(struct rte_table_ops *ops)
.f_hash = pipeline_test_hash,
.seed = 0,
.signature_offset = APP_METADATA_OFFSET(1),
-   .key_offset = APP_METADATA_OFFSET(32)
+   .key_offset = APP_METADATA_OFFSET(32),
+   .key_mask = NULL,
};

hash_params.n_entries = 0;
-- 
2.1.0

[dpdk-dev] [PATCH v4 2/7] librte_table: add 16 byte hash table operations with computed lookup

2015-10-28 Thread roy.fan.zh...@intel.com

From: Fan Zhang 

This patch adds hash table operations for key signature
computed on lookup ("do-sig") for LRU hash tables and extendible buckets.

Signed-off-by: Fan Zhang 
---
 lib/librte_table/rte_table_hash.h   |   8 +
 lib/librte_table/rte_table_hash_key16.c | 359 +++-
 2 files changed, 364 insertions(+), 3 deletions(-)

diff --git a/lib/librte_table/rte_table_hash.h b/lib/librte_table/rte_table_hash.h
index e2c60e1..9d17516 100644
--- a/lib/librte_table/rte_table_hash.h
+++ b/lib/librte_table/rte_table_hash.h
@@ -271,6 +271,10 @@ struct rte_table_hash_key16_lru_params {
 /** LRU hash table operations for pre-computed key signature */
 extern struct rte_table_ops rte_table_hash_key16_lru_ops;

+/** LRU hash table operations for key signature computed on lookup
+("do-sig") */
+extern struct rte_table_ops rte_table_hash_key16_lru_dosig_ops;
+
 /** Extendible bucket hash table parameters */
 struct rte_table_hash_key16_ext_params {
/** Maximum number of entries (and keys) in the table */
@@ -301,6 +305,10 @@ struct rte_table_hash_key16_ext_params {
 /** Extendible bucket operations for pre-computed key signature */
 extern struct rte_table_ops rte_table_hash_key16_ext_ops;

+/** Extendible bucket hash table operations for key signature computed on
+lookup ("do-sig") */
+extern struct rte_table_ops rte_table_hash_key16_ext_dosig_ops;
+
 /**
  * 32-byte key hash tables
  *
diff --git a/lib/librte_table/rte_table_hash_key16.c b/lib/librte_table/rte_table_hash_key16.c
index 0d6cc55..427b534 100644
--- a/lib/librte_table/rte_table_hash_key16.c
+++ b/lib/librte_table/rte_table_hash_key16.c
@@ -620,6 +620,27 @@ rte_table_hash_entry_delete_key16_ext(
rte_prefetch0((void *)(((uintptr_t) bucket1) + RTE_CACHE_LINE_SIZE));\
 }

+#define lookup1_stage1_dosig(mbuf1, bucket1, f)\
+{  \
+   uint64_t *key;  \
+   uint64_t signature = 0; \
+   uint32_t bucket_index;  \
+   uint64_t hash_key_buffer[2];\
+   \
+   key = RTE_MBUF_METADATA_UINT64_PTR(mbuf1, f->key_offset);\
+   \
+   hash_key_buffer[0] = key[0] & f->key_mask[0];   \
+   hash_key_buffer[1] = key[1] & f->key_mask[1];   \
+   signature = f->f_hash(hash_key_buffer,  \
+   RTE_TABLE_HASH_KEY_SIZE, f->seed);  \
+   \
+   bucket_index = signature & (f->n_buckets - 1);  \
+   bucket1 = (struct rte_bucket_4_16 *)\
+   &f->memory[bucket_index * f->bucket_size];  \
+   rte_prefetch0(bucket1); \
+   rte_prefetch0((void *)(((uintptr_t) bucket1) + RTE_CACHE_LINE_SIZE));\
+}
+
 #define lookup1_stage2_lru(pkt2_index, mbuf2, bucket2, \
pkts_mask_out, entries, f)  \
 {  \
@@ -769,6 +790,36 @@ rte_table_hash_entry_delete_key16_ext(
rte_prefetch0((void *)(((uintptr_t) bucket11) + RTE_CACHE_LINE_SIZE));\
 }

+#define lookup2_stage1_dosig(mbuf10, mbuf11, bucket10, bucket11, f)\
+{  \
+   uint64_t *key10, *key11;\
+   uint64_t hash_offset_buffer[2]; \
+   uint64_t signature10, signature11;  \
+   uint32_t bucket10_index, bucket11_index;\
+   \
+   key10 = RTE_MBUF_METADATA_UINT64_PTR(mbuf10, f->key_offset);\
+   hash_offset_buffer[0] = key10[0] & f->key_mask[0];  \
+   hash_offset_buffer[1] = key10[1] & f->key_mask[1];  \
+   signature10 = f->f_hash(hash_offset_buffer, \
+   RTE_TABLE_HASH_KEY_SIZE, f->seed);\
+   bucket10_index = signature10 & (f->n_buckets - 1);  \
+   bucket10 = (struct rte_bucket_4_16 *)   \
+   &f->memory[bucket10_index * f->bucket_size];\
+   rte_prefetch0(bucket10);\
+   rte_prefetch0((void *)(((uintptr_t) bucket10) + RTE_CACHE_LINE_SIZE));\
+   \
+   key11 = RTE_MBUF_METADATA_UINT64_PTR(mbuf11, f->key_offset);\
+   hash_offset_buffer[0] = key11[0] & f->key_mask[0];  \
+   hash_offset_buffer[1] = key11[1] & f->key_mask[1];  \
+   signature11 = f->f_hash(hash_offset_buffer, \
+   RTE_TABLE_HASH_KEY_SIZE, f->seed);\
+   bucket11_index = signature11 & (f

[dpdk-dev] [PATCH v4 1/7] librte_table: add key_mask parameter to 8- and 16-bytes key hash parameters

2015-10-28 Thread roy.fan.zh...@intel.com

From: Fan Zhang 

This patch relates to the ABI change proposed for librte_table.
The key_mask parameter is added for 8-byte and 16-byte
key extendible bucket and LRU tables. The release notes
are updated and the deprecation notice is removed.

Signed-off-by: Fan Zhang 
Signed-off-by: Jasvinder Singh 
---
 doc/guides/rel_notes/deprecation.rst|  4 ---
 doc/guides/rel_notes/release_2_2.rst|  4 +++
 lib/librte_table/rte_table_hash.h   | 12 
 lib/librte_table/rte_table_hash_key16.c | 52 ++-
 lib/librte_table/rte_table_hash_key8.c  | 54 +++--
 lib/librte_table/rte_table_version.map  |  7 +
 6 files changed, 112 insertions(+), 21 deletions(-)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index a391ff0..16ec9f8 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -44,10 +44,6 @@ Deprecation Notices
 * librte_table: New functions for table entry bulk add/delete will be added
   to the table operations structure.

-* librte_table hash: Key mask parameter will be added to the hash table
-  parameter structure for 8-byte key and 16-byte key extendible bucket and
-  LRU tables.
-
 * librte_pipeline: The prototype for the pipeline input port, output port
   and table action handlers will be updated:
   the pipeline parameter will be added, the packets mask parameter will be
diff --git a/doc/guides/rel_notes/release_2_2.rst b/doc/guides/rel_notes/release_2_2.rst
index 128f956..7beba40 100644
--- a/doc/guides/rel_notes/release_2_2.rst
+++ b/doc/guides/rel_notes/release_2_2.rst
@@ -132,6 +132,10 @@ ABI Changes
 * librte_cfgfile: Allow longer names and values by increasing the constants
   CFG_NAME_LEN and CFG_VALUE_LEN to 64 and 256 respectively.

+* librte_table hash: The key mask parameter is added to the hash table
+  parameter structure for 8-byte key and 16-byte key extendible bucket
+  and LRU tables.
+

 Shared Library Versions
 -----------------------
diff --git a/lib/librte_table/rte_table_hash.h b/lib/librte_table/rte_table_hash.h
index 9181942..e2c60e1 100644
--- a/lib/librte_table/rte_table_hash.h
+++ b/lib/librte_table/rte_table_hash.h
@@ -196,6 +196,9 @@ struct rte_table_hash_key8_lru_params {

/** Byte offset within packet meta-data where the key is located */
uint32_t key_offset;
+
+   /** Bit-mask to be AND-ed to the key on lookup */
+   uint8_t *key_mask;
 };

 /** LRU hash table operations for pre-computed key signature */
@@ -226,6 +229,9 @@ struct rte_table_hash_key8_ext_params {

/** Byte offset within packet meta-data where the key is located */
uint32_t key_offset;
+
+   /** Bit-mask to be AND-ed to the key on lookup */
+   uint8_t *key_mask;
 };

 /** Extendible bucket hash table operations for pre-computed key signature */
@@ -257,6 +263,9 @@ struct rte_table_hash_key16_lru_params {

/** Byte offset within packet meta-data where the key is located */
uint32_t key_offset;
+
+   /** Bit-mask to be AND-ed to the key on lookup */
+   uint8_t *key_mask;
 };

 /** LRU hash table operations for pre-computed key signature */
@@ -284,6 +293,9 @@ struct rte_table_hash_key16_ext_params {

/** Byte offset within packet meta-data where the key is located */
uint32_t key_offset;
+
+   /** Bit-mask to be AND-ed to the key on lookup */
+   uint8_t *key_mask;
 };

 /** Extendible bucket operations for pre-computed key signature */
diff --git a/lib/librte_table/rte_table_hash_key16.c b/lib/librte_table/rte_table_hash_key16.c
index f6a3306..0d6cc55 100644
--- a/lib/librte_table/rte_table_hash_key16.c
+++ b/lib/librte_table/rte_table_hash_key16.c
@@ -85,6 +85,7 @@ struct rte_table_hash {
uint32_t bucket_size;
uint32_t signature_offset;
uint32_t key_offset;
+   uint64_t key_mask[2];
rte_table_hash_op_hash f_hash;
uint64_t seed;

@@ -164,6 +165,14 @@ rte_table_hash_create_key16_lru(void *params,
f->f_hash = p->f_hash;
f->seed = p->seed;

+   if (p->key_mask != NULL) {
+   f->key_mask[0] = ((uint64_t *)p->key_mask)[0];
+   f->key_mask[1] = ((uint64_t *)p->key_mask)[1];
+   } else {
+   f->key_mask[0] = 0xFFFFFFFFFFFFFFFFLLU;
+   f->key_mask[1] = 0xFFFFFFFFFFFFFFFFLLU;
+   }
+
for (i = 0; i < n_buckets; i++) {
struct rte_bucket_4_16 *bucket;

@@ -384,6 +393,14 @@ rte_table_hash_create_key16_ext(void *params,
for (i = 0; i < n_buckets_ext; i++)
f->stack[i] = i;

+   if (p->key_mask != NULL) {
+   f->key_mask[0] = (((uint64_t *)p->key_mask)[0]);
+   f->key_mask[1] = (((uint64_t *)p->key_mask)[1]);
+   } else {
+   f->key_mask[0] = 0xFFFFFFFFFFFFFFFFLLU;
+   f->key_mask[1] = 0xFFFFFFFFFFFFFFFFLLU;
+   }
+
return f;
 }

@@ -609,11 +626,

[dpdk-dev] [PATCH v4 0/7] librte_table: add key_mask parameter to

2015-10-28 Thread roy.fan.zh...@intel.com

From: Fan Zhang 

This patchset relates to the ABI change announced for librte_table.
The key_mask parameter has been added to the hash table
parameter structure for 8-byte key and 16-byte key extendible
bucket and LRU tables.

v2:
*updated release note

v3:
*merged release note with source code patch
*fixed build error: added missing symbol to
librte_table/rte_table_version.map

v4:
*modified rte_prefetch offsets to improve hash/lru table
lookup performance. 

Acked-by: Cristian Dumitrescu 

Fan Zhang (7):
  librte_table: add key_mask parameter to 8- and 16-bytes key hash
parameters
  librte_table: add 16 byte hash table operations with computed lookup
  app/test: modify app/test_table_combined and app/test_table_tables
  app/test-pipeline: modify pipeline test
  example/ip_pipeline: add parse_hex_string for internal use
  example/ip_pipeline/pipeline: update flow_classification pipeline
  librte_table: performance improvement on rte_prefetch offset

 app/test-pipeline/pipeline_hash.c  |   4 +
 app/test/test_table_combined.c |   5 +-
 app/test/test_table_tables.c   |   6 +-
 doc/guides/rel_notes/deprecation.rst   |   4 -
 doc/guides/rel_notes/release_2_2.rst   |   4 +
 examples/ip_pipeline/config_parse.c|  52 +++
 .../pipeline/pipeline_flow_classification_be.c |  56 ++-
 examples/ip_pipeline/pipeline_be.h |   4 +
 lib/librte_table/rte_table_hash.h  |  20 +
 lib/librte_table/rte_table_hash_ext.c  |  10 +-
 lib/librte_table/rte_table_hash_key16.c| 446 +++--
 lib/librte_table/rte_table_hash_key32.c|  35 +-
 lib/librte_table/rte_table_hash_key8.c | 105 +++--
 lib/librte_table/rte_table_hash_lru.c  |  10 +-
 lib/librte_table/rte_table_version.map |   7 +
 15 files changed, 673 insertions(+), 95 deletions(-)

-- 
2.1.0

[dpdk-dev] [PATCH] lib/lpm:fix two issues in the delete_depth_small()

2015-10-28 Thread Bruce Richardson

On Wed, Oct 28, 2015 at 05:55:59PM +0100, Nikita Kozlov wrote:
> On 10/28/2015 03:40 PM, Bruce Richardson wrote:
> > On Wed, Oct 28, 2015 at 11:44:15AM +0800, Jijiang Liu wrote:
> >> Fix two issues in the delete_depth_small() function.
> >>  
> >> 1> The control is not strict in this function.
> >>  
> >> In the following structure,
> >> struct rte_lpm_tbl24_entry {
> >> union {
> >> uint8_t next_hop;
> >> uint8_t tbl8_gindex;
> >> };
> >>  uint8_t ext_entry :1;
> >> }
> >>  
> >> When ext_entry = 0, use next_hop only to process rte_lpm_tbl24_entry.
> >>  
> >> When ext_entry = 1, use tbl8_gindex to process the rte_lpm_tbl8_entry.
> >>  
> >> When using the LPM24 + 8 algorithm, ext_entry decides whether to process the
> >> rte_lpm_tbl24_entry structure or the rte_lpm_tbl8_entry structure.
> >> If a route is deleted, the prefix of the previous route is used to override
> >> the deleted route. When (lpm->tbl24[i].ext_entry == 0 &&
> >> lpm->tbl24[i].depth > depth)
> >> the entry should be ignored, but due to the incorrect logic, next_hop is used
> >> as tbl8_gindex and the rte_lpm_tbl8_entry is processed.
> >>  
> >> 2> Initialization of rte_lpm_tbl8_entry is incorrect in this function
> >>  
> >> In this function, a new rte_lpm_tbl8_entry, which we call A, replaces the old
> >> rte_lpm_tbl8_entry. But its valid_group is not set to VALID, so it will be
> >> INVALID.
> >> Then, when adding a new route whose depth is > 24, the tbl8_alloc() function
> >> searches the rte_lpm_tbl8_entry array for an INVALID valid_group,
> >> and it returns A to the add_depth_big function, so A's data is
> >> overridden.
> >>
> >> Signed-off-by: NaNa 
> >>
> > Hi NaNa, Jijiang,
> >
> > since this patch contains two separate fixes, it would be better split into
> > two separate patches, one for each fix. Also, please add a "Fixes" line to
> > the commit log.
> >
> > Are there still plans for a unit test to demonstrate the bug(s) and make it
> > easy for us to verify the fix?
> >
> > Regards,
> > /Bruce
> Hello,
> 
> It's the same fix as the one sent here (which contained some tests,
> maybe we can use them ?)
> http://dpdk.org/ml/archives/dev/2015-October/025871.html .
> For what it's worth, we are using those fixes at my company and they
> fix the described bug.
> 
Ok, great, so there are tests available. Unfortunately, the previous patches
haven't come through correctly, for example, see the tests patch in patchwork:
http://dpdk.org/dev/patchwork/patch/7934/
Given that the fix appears to work for you, it's something we need to get into
the release with or without tests, but with cleaned up tests would be better,
obviously. :-)

/Bruce
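
As a reading aid for the first issue discussed in this thread, here is a small sketch of the check the quoted description asks for: the shared byte of a tbl24 entry must only be read as next_hop when ext_entry is 0, and only be read as tbl8_gindex when ext_entry is 1. This is my reading of the description, not the submitted patch. The struct layout is simplified, and enum tbl24_action and classify_on_delete() are hypothetical names.

```c
#include <stdint.h>

/* Simplified tbl24 entry, modelled on the structure quoted above */
struct tbl24_entry {
	union {
		uint8_t next_hop;    /* meaningful when ext_entry == 0 */
		uint8_t tbl8_gindex; /* meaningful when ext_entry == 1 */
	};
	uint8_t valid     :1;
	uint8_t ext_entry :1;
	uint8_t depth     :6;
};

/* Hypothetical outcome of inspecting one entry while deleting a route */
enum tbl24_action {
	TBL24_SKIP,      /* entry belongs to a deeper route: leave it alone */
	TBL24_OVERWRITE, /* entry is covered by the deleted route */
	TBL24_DESCEND    /* entry points to a tbl8 group: process that group */
};

static enum tbl24_action
classify_on_delete(const struct tbl24_entry *e, uint8_t depth)
{
	if (e->ext_entry == 0) {
		/* Non-extended entry: never treat next_hop as a tbl8 index */
		return (e->depth <= depth) ? TBL24_OVERWRITE : TBL24_SKIP;
	}

	/* Extended entry: the shared byte is a tbl8 group index */
	return TBL24_DESCEND;
}
```

The point of the three-way split is that the case ext_entry == 0 with a larger depth falls through to "skip" instead of being handled by the tbl8 branch.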

[dpdk-dev] Support for configuring MTU on i40e

2015-10-28 Thread Tom Crugnale

Hi,

I am trying to configure the MTU through rte_eth_dev_set_mtu() on an i40e 
interface and noticed that the function pointer inside of the rte_eth_dev 
struct for mtu_set is not populated from the i40e code.
It seems that the only API that will allow for restricting the MTU is to call 
rte_eth_dev_configure() with the max_rx_pkt_len field set appropriately inside 
of the passed rte_eth_conf structure.

This makes it a bit awkward from any code that is common for all NIC types, 
forcing the caller to fall back on rte_eth_dev_configure() when 
rte_eth_dev_set_mtu() returns -ENOTSUP.

Is there any reason why an implementation for setting the MTU isn't supported 
through the proper API here?

Thanks,
Tom

[dpdk-dev] [PATCH v2 4/4] doc: extend commands in testpmd and update release note

2015-10-28 Thread Jingjing Wu

Modify the doc about flow director commands to support filtering in VFs.
Remove the related ABI deprecation notice.
Update the release notes.

Signed-off-by: Jingjing Wu 
---
 doc/guides/rel_notes/deprecation.rst|  4 
 doc/guides/rel_notes/release_2_2.rst|  2 ++
 doc/guides/testpmd_app_ug/testpmd_funcs.rst | 15 +--
 3 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index a391ff0..cd2b80c 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -17,10 +17,6 @@ Deprecation Notices
   imissed, ibadcrc, ibadlen, imcasts, fdirmatch, fdirmiss,
   tx_pause_xon, rx_pause_xon, tx_pause_xoff, rx_pause_xoff

-* ABI changes are planned for struct rte_eth_fdir_flow_ext in order to support
-  flow director filtering in VF. The release 2.1 does not contain these ABI
-  changes, but release 2.2 will, and no backwards compatibility is planned.
-
 * ABI changes are planned for struct rte_eth_fdir_filter and
   rte_eth_fdir_masks in order to support new flow director modes,
   MAC VLAN and Cloud, on x550. The MAC VLAN mode means the MAC and
diff --git a/doc/guides/rel_notes/release_2_2.rst b/doc/guides/rel_notes/release_2_2.rst
index de6916e..d934776 100644
--- a/doc/guides/rel_notes/release_2_2.rst
+++ b/doc/guides/rel_notes/release_2_2.rst
@@ -124,6 +124,8 @@ ABI Changes
 * librte_cfgfile: Allow longer names and values by increasing the constants
   CFG_NAME_LEN and CFG_VALUE_LEN to 64 and 256 respectively.

+* The rte_eth_fdir_flow_ext structure is changed. New fields are added to
+  support flow director filtering in VF.

 Shared Library Versions
 -----------------------
diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index 71d831b..eae5249 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -1644,14 +1644,16 @@ Different NICs may have different capabilities, command show port fdir (port_id)
 flow (ipv4-other|ipv4-frag|ipv6-other|ipv6-frag)
 src (src_ip_address) dst (dst_ip_address) \
 vlan (vlan_value) flexbytes (flexbytes_value) \
-(drop|fwd) queue (queue_id) fd_id (fd_id_value)
+(drop|fwd) pf|vf(vf_id) queue (queue_id) \
+fd_id (fd_id_value)

flow_director_filter (port_id) (add|del|update) \
 flow (ipv4-tcp|ipv4-udp|ipv6-tcp|ipv6-udp) \
 src (src_ip_address) (src_port) \
 dst (dst_ip_address) (dst_port) \
 vlan (vlan_value) flexbytes (flexbytes_value) \
-(drop|fwd) queue (queue_id) fd_id (fd_id_value)
+(drop|fwd) pf|vf(vf_id) queue (queue_id) \
+fd_id (fd_id_value)

flow_director_filter (port_id) (add|del|update) \
 flow (ipv4-sctp|ipv6-sctp) \
@@ -1659,21 +1661,22 @@ Different NICs may have different capabilities, command show port fdir (port_id)
 dst (dst_ip_address) (dst_port)
 tag (verification_tag) vlan (vlan_value) \
 flexbytes (flexbytes_value) (drop|fwd) \
-queue (queue_id) fd_id (fd_id_value)
+pf|vf(vf_id) queue (queue_id) fd_id (fd_id_value)

flow_director_filter (port_id) (add|del|update) flow l2_payload \
 ether (ethertype) flexbytes (flexbytes_value) \
-(drop|fwd) queue (queue_id) fd_id (fd_id_value)
+(drop|fwd) pf|vf(vf_id) queue (queue_id) \
+fd_id (fd_id_value)

 For example, to add an ipv4-udp flow type filter::

testpmd> flow_director_filter 0 add flow ipv4-udp src 2.2.2.3 32 \
-dst 2.2.2.5 33 vlan 0x1 flexbytes (0x88,0x48) fwd queue 1 fd_id 1
+dst 2.2.2.5 33 vlan 0x1 flexbytes (0x88,0x48) fwd pf queue 1 fd_id 1

 For example, add an ipv4-other flow type filter::

testpmd> flow_director_filter 0 add flow ipv4-other src 2.2.2.3 \
- dst 2.2.2.5 vlan 0x1 flexbytes (0x88,0x48) fwd queue 1 fd_id 1
+ dst 2.2.2.5 vlan 0x1 flexbytes (0x88,0x48) fwd pf queue 1 fd_id 1

 flush_flow_director
 ~~~~~~~~~~~~~~~~~~~
-- 
2.4.0
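
The pf|vf(vf_id) token documented above is either the literal "pf" or "vf" followed by a decimal VF index, for example "vf2". A minimal standalone sketch of parsing such a token with strtoul() follows. parse_pf_vf() and its max_vfs argument are hypothetical and stand in for the device limit the real command would query. Unlike a bare strtoul() call, this sketch deliberately rejects "vf" with no digits as well as signed or space-padded numbers; that stricter behaviour is an assumption of the sketch.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical parser for the "pf" / "vf<N>" token.
 * Returns 0 on success and fills *is_vf (0 or 1) and *vf_id,
 * or -1 if the token is malformed or N is out of range.
 */
static int
parse_pf_vf(const char *tok, unsigned long max_vfs,
	uint8_t *is_vf, uint16_t *vf_id)
{
	char *end;
	unsigned long id;

	if (strcmp(tok, "pf") == 0) {
		*is_vf = 0;
		*vf_id = 0;
		return 0;
	}

	if (strncmp(tok, "vf", 2) != 0)
		return -1;

	/* strtoul() accepts "", "+1" and " 1"; require a leading digit */
	if (tok[2] < '0' || tok[2] > '9')
		return -1;

	errno = 0;
	id = strtoul(tok + 2, &end, 10);
	if (errno != 0 || *end != '\0' || id >= max_vfs || id > UINT16_MAX)
		return -1;

	*is_vf = 1;
	*vf_id = (uint16_t)id;
	return 0;
}
```

For instance, with max_vfs set to 4 the tokens "pf" and "vf2" are accepted, while "vf4", "vf" and "vfx" are rejected.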

[dpdk-dev] [PATCH v2 3/4] testpmd: extend commands

2015-10-28 Thread Jingjing Wu

This patch extends the flow director commands to support filtering in VFs.

Signed-off-by: Jingjing Wu 
---
 app/test-pmd/cmdline.c | 41 +
 1 file changed, 37 insertions(+), 4 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 0f8f48f..22476f5 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -631,7 +631,8 @@ static void cmd_help_long_parsed(void *parsed_result,
" flow (ipv4-other|ipv4-frag|ipv6-other|ipv6-frag)"
" src (src_ip_address) dst (dst_ip_address)"
" vlan (vlan_value) flexbytes (flexbytes_value)"
-   " (drop|fwd) queue (queue_id) fd_id (fd_id_value)\n"
+   " (drop|fwd) pf|vf(vf_id) queue (queue_id)"
+   " fd_id (fd_id_value)\n"
"Add/Del an IP type flow director filter.\n\n"

"flow_director_filter (port_id) (add|del|update)"
@@ -639,7 +640,8 @@ static void cmd_help_long_parsed(void *parsed_result,
" src (src_ip_address) (src_port)"
" dst (dst_ip_address) (dst_port)"
" vlan (vlan_value) flexbytes (flexbytes_value)"
-   " (drop|fwd) queue (queue_id) fd_id (fd_id_value)\n"
+   " (drop|fwd) pf|vf(vf_id) queue (queue_id)"
+   " fd_id (fd_id_value)\n"
"Add/Del an UDP/TCP type flow director filter.\n\n"

"flow_director_filter (port_id) (add|del|update)"
@@ -648,13 +650,13 @@ static void cmd_help_long_parsed(void *parsed_result,
" dst (dst_ip_address) (dst_port)"
" tag (verification_tag) vlan (vlan_value)"
" flexbytes (flexbytes_value) (drop|fwd)"
-   " queue (queue_id) fd_id (fd_id_value)\n"
+   " pf|vf(vf_id) queue (queue_id) fd_id (fd_id_value)\n"
"Add/Del a SCTP type flow director filter.\n\n"

"flow_director_filter (port_id) (add|del|update)"
" flow l2_payload ether (ethertype)"
" flexbytes (flexbytes_value) (drop|fwd)"
-   " queue (queue_id) fd_id (fd_id_value)\n"
+   " pf|vf(vf_id) queue (queue_id) fd_id (fd_id_value)\n"
"Add/Del a l2 payload type flow director 
filter.\n\n"

"flush_flow_director (port_id)\n"
@@ -7742,6 +7744,7 @@ struct cmd_flow_director_result {
uint16_t vlan_value;
cmdline_fixed_string_t flexbytes;
cmdline_fixed_string_t flexbytes_value;
+   cmdline_fixed_string_t pf_vf;
cmdline_fixed_string_t drop;
cmdline_fixed_string_t queue;
uint16_t  queue_id;
@@ -7848,6 +7851,8 @@ cmd_flow_director_filter_parsed(void *parsed_result,
struct cmd_flow_director_result *res = parsed_result;
struct rte_eth_fdir_filter entry;
uint8_t flexbytes[RTE_ETH_FDIR_MAX_FLEXLEN];
+   char *end;
+   unsigned long vf_id;
int ret = 0;

ret = rte_eth_dev_filter_supported(res->port_id, RTE_ETH_FILTER_FDIR);
@@ -7941,6 +7946,27 @@ cmd_flow_director_filter_parsed(void *parsed_result,
entry.action.behavior = RTE_ETH_FDIR_REJECT;
else
entry.action.behavior = RTE_ETH_FDIR_ACCEPT;
+
+   if (!strcmp(res->pf_vf, "pf"))
+   entry.input.flow_ext.is_vf = 0;
+   else if (!strncmp(res->pf_vf, "vf", 2)) {
+   struct rte_eth_dev_info dev_info;
+
+   memset(&dev_info, 0, sizeof(dev_info));
+   rte_eth_dev_info_get(res->port_id, &dev_info);
+   errno = 0;
+   vf_id = strtoul(res->pf_vf + 2, &end, 10);
+   if (errno != 0 || *end != '\0' || vf_id >= dev_info.max_vfs) {
+   printf("invalid parameter %s.\n", res->pf_vf);
+   return;
+   }
+   entry.input.flow_ext.is_vf = 1;
+   entry.input.flow_ext.dst_id = (uint16_t)vf_id;
+   } else {
+   printf("invalid parameter %s.\n", res->pf_vf);
+   return;
+   }
+
/* set to report FD ID by default */
entry.action.report_status = RTE_ETH_FDIR_REPORT_ID;
entry.action.rx_queue = res->queue_id;
@@ -8020,6 +8046,9 @@ cmdline_parse_token_string_t cmd_flow_director_flexbytes_value =
 cmdline_parse_token_string_t cmd_flow_director_drop =
TOKEN_STRING_INITIALIZER(struct cmd_flow_director_result,
 drop, "drop#fwd");
+cmdline_parse_token_string_t cmd_flow_director_pf_vf =
+   TOKEN_STRING_INITIALIZER(struct cmd_flow_director_result,
+ pf_vf, NULL);
 cmdline_parse_token_string_t cmd_flow_director_queue =
TOKEN_STRING_INIT

[dpdk-dev] [PATCH v2 2/4] i40e: extend flow diretcor to support filtering in VFs

2015-10-28 Thread Jingjing Wu

This patch extends flow director to support filtering in VFs.
It also fixes device stop so that the flow director queue interrupts are
unbound and disabled.

Signed-off-by: Jingjing Wu 
---
 drivers/net/i40e/i40e_ethdev.c |  4 ++--
 drivers/net/i40e/i40e_fdir.c   | 15 ---
 2 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 2dd9fdc..b3207e3 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -1031,8 +1031,8 @@ i40e_dev_stop(struct rte_eth_dev *dev)
}

if (pf->fdir.fdir_vsi) {
-   i40e_vsi_queues_bind_intr(pf->fdir.fdir_vsi);
-   i40e_vsi_enable_queues_intr(pf->fdir.fdir_vsi);
+   i40e_vsi_queues_unbind_intr(pf->fdir.fdir_vsi);
+   i40e_vsi_disable_queues_intr(pf->fdir.fdir_vsi);
}
/* Clear all queues and release memory */
i40e_dev_clear_queues(dev);
diff --git a/drivers/net/i40e/i40e_fdir.c b/drivers/net/i40e/i40e_fdir.c
index c9ce98f..73a69c9 100644
--- a/drivers/net/i40e/i40e_fdir.c
+++ b/drivers/net/i40e/i40e_fdir.c
@@ -1022,6 +1022,11 @@ i40e_add_del_fdir_filter(struct rte_eth_dev *dev,
PMD_DRV_LOG(ERR, "Invalid queue ID");
return -EINVAL;
}
+   if (filter->input.flow_ext.is_vf &&
+   filter->input.flow_ext.dst_id >= pf->vf_num) {
+   PMD_DRV_LOG(ERR, "Invalid VF ID");
+   return -EINVAL;
+   }

memset(pkt, 0, I40E_FDIR_PKT_LEN);

@@ -1061,7 +1066,7 @@ i40e_fdir_filter_programming(struct i40e_pf *pf,
volatile struct i40e_tx_desc *txdp;
volatile struct i40e_filter_program_desc *fdirdp;
uint32_t td_cmd;
-   uint16_t i;
+   uint16_t vsi_id, i;
uint8_t dest;

PMD_DRV_LOG(INFO, "filling filter programming descriptor.");
@@ -1083,9 +1088,13 @@ i40e_fdir_filter_programming(struct i40e_pf *pf,
  I40E_TXD_FLTR_QW0_PCTYPE_SHIFT) &
  I40E_TXD_FLTR_QW0_PCTYPE_MASK);

-   /* Use LAN VSI Id by default */
+   if (filter->input.flow_ext.is_vf)
+   vsi_id = pf->vfs[filter->input.flow_ext.dst_id].vsi->vsi_id;
+   else
+   /* Use LAN VSI Id by default */
+   vsi_id = pf->main_vsi->vsi_id;
fdirdp->qindex_flex_ptype_vsi |=
-   rte_cpu_to_le_32((pf->main_vsi->vsi_id <<
+   rte_cpu_to_le_32((vsi_id <<
  I40E_TXD_FLTR_QW0_DEST_VSI_SHIFT) &
  I40E_TXD_FLTR_QW0_DEST_VSI_MASK);

-- 
2.4.0

[dpdk-dev] [PATCH v2 1/4] ethdev: extend struct to support flow director in VFs

2015-10-28 Thread Jingjing Wu

This patch extends struct rte_eth_fdir_flow_ext to support flow
director in VFs.

Signed-off-by: Jingjing Wu 
---
 lib/librte_ether/rte_eth_ctrl.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/lib/librte_ether/rte_eth_ctrl.h b/lib/librte_ether/rte_eth_ctrl.h
index 26b7b33..403e6b8 100644
--- a/lib/librte_ether/rte_eth_ctrl.h
+++ b/lib/librte_ether/rte_eth_ctrl.h
@@ -398,6 +398,8 @@ struct rte_eth_fdir_flow_ext {
uint16_t vlan_tci;
uint8_t flexbytes[RTE_ETH_FDIR_MAX_FLEXLEN];
/**< It is filled by the flexible payload to match. */
+   uint8_t is_vf;   /**< 1 for VF, 0 for port dev */
+   uint16_t dst_id; /**< VF ID, available when is_vf is 1*/
 };

 /**
-- 
2.4.0

[dpdk-dev] [PATCH v2 0/4] extend flow director to support VF filtering in i40e driver

2015-10-28 Thread Jingjing Wu

This patch set extends flow director to VF filtering in i40e driver.

v2 change:
 - rework the doc, including release notes and testpmd guide

Jingjing Wu (4):
  ethdev: extend struct to support flow director in VFs
  i40e: extend flow diretcor to support filtering in VFs
  testpmd: extend commands
  doc: extend commands in testpmd and remove related ABI deprecation

 app/test-pmd/cmdline.c  | 41 ++---
 doc/guides/rel_notes/deprecation.rst|  4 ---
 doc/guides/rel_notes/release_2_2.rst|  2 ++
 doc/guides/testpmd_app_ug/testpmd_funcs.rst | 15 ++-
 drivers/net/i40e/i40e_ethdev.c  |  4 +--
 drivers/net/i40e/i40e_fdir.c| 15 ---
 lib/librte_ether/rte_eth_ctrl.h |  2 ++
 7 files changed, 64 insertions(+), 19 deletions(-)

-- 
2.4.0

[dpdk-dev] Wrong TCP Checkum computed by hardware

2015-10-28 Thread Padam Jeet Singh

> Yes, you are correct, I just noticed you declared it as static. If possible,
> send more code to me, and I can help you with this.

 Thanks,
 Padam

I have the following code:


mbuf->ol_flags = (uint16_t) (mbuf->ol_flags &
(~PKT_TX_OFFLOAD_MASK));
mbuf->ol_flags |= PKT_TX_IP_CKSUM;
ipv4hdr->hdr_checksum = 0;
tcphdr = (struct tcp_hdr *)(pkt + sizeof(struct ether_hdr) +
sizeof(struct ipv4_hdr));
#ifdef L4CSUM_SW

tcphdr->cksum = 0;
tcphdr->cksum = get_ipv4_udptcp_checksum(ipv4hdr,   (uint16_t*)tcphdr);

#else

mbuf->ol_flags |= PKT_TX_TCP_CKSUM;
if(!(mbuf->ol_flags & PKT_TX_IPV4))
mbuf->ol_flags |= PKT_TX_IPV4;
tcphdr->cksum = 0;
tcphdr->cksum = rte_ipv4_phdr_cksum(ipv4hdr);

#endif

mbuf->pkt.vlan_macip.f.vlan_tci = 100;
mbuf->ol_flags |= PKT_TX_VLAN_PKT;

I have added a macro called L4CSUM_SW which, if enabled, does the L4 checksum
computation in software. When I enable this, the packet received has the correct
TCP checksum. It is important to note that in the hardware path the checksum is
being computed, but it is computed wrong when TX_VLAN, IP_CKSUM and TCP_CKSUM
are all enabled.

When I do the same test with VLAN removed from the setup, the TCP checksum 
computation is correct. 

The system exhibiting this issue is an Intel(R) Xeon(R) CPU E5-2620 v3 @
2.40GHz X 2, with 2 X 82599ES on a PCIe3 bus:
02:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
Flags: bus master, fast devsel, latency 0, IRQ 32
Memory at c7d2 (64-bit, non-prefetchable) [size=128K]
Memory at c7d44000 (64-bit, non-prefetchable) [size=16K]
Capabilities: [40] Power Management version 3
Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
Capabilities: [70] MSI-X: Enable+ Count=64 Masked-
Capabilities: [a0] Express Endpoint, MSI 00
Capabilities: [e0] Vital Product Data
Capabilities: [100] Advanced Error Reporting
Capabilities: [140] Device Serial Number 00-90-0b-ff-ff-3f-19-d0
Capabilities: [150] Alternative Routing-ID Interpretation (ARI)
Capabilities: [160] Single Root I/O Virtualization (SR-IOV)
Kernel driver in use: igb_uio
Kernel modules: ixgbe
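
For context on the hardware path in the code above: with TCP checksum offload, the checksum field is seeded with the IPv4 pseudo-header sum (the code does this through rte_ipv4_phdr_cksum()) and the NIC is expected to add the TCP header and payload. The sketch below shows, as I understand it, what that pseudo-header sum covers. ipv4_phdr_sum() is a hypothetical helper that takes host-order values for readability; the DPDK function works on the network-order header fields, so this is an illustration and not a drop-in replacement.

```c
#include <stdint.h>

/*
 * Hypothetical helper: one's-complement sum of the IPv4 pseudo-header
 * (source address, destination address, zero byte + protocol, L4 length).
 * All arguments are host-order values. The result is the folded 16-bit
 * sum and is NOT complemented, since the remaining TCP bytes are still
 * to be added by whoever finishes the checksum.
 */
static uint16_t
ipv4_phdr_sum(uint32_t src, uint32_t dst, uint8_t proto, uint16_t l4_len)
{
	uint32_t sum = 0;

	sum += src >> 16;
	sum += src & 0xFFFF;
	sum += dst >> 16;
	sum += dst & 0xFFFF;
	sum += proto;  /* 16-bit word: zero byte followed by protocol */
	sum += l4_len; /* TCP header + payload length in bytes */

	/* Fold carries back into the low 16 bits */
	while (sum >> 16)
		sum = (sum & 0xFFFF) + (sum >> 16);

	return (uint16_t)sum;
}
```

As a worked example, for 192.168.0.1 to 192.168.0.199, protocol 6 (TCP) and an L4 length of 20 bytes, the words add up to 0x18232, which folds to 0x8233.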

[dpdk-dev] [PATCH v2] e1000: mark rxq with RTE_SET_USED

2015-10-28 Thread De Lara Guarch, Pablo



> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Harry van Haaren
> Sent: Wednesday, October 28, 2015 4:08 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v2] e1000: mark rxq with RTE_SET_USED
> 
> This patch marks rxq with RTE_SET_USED in
> rx_desc_hlen_type_rss_to_pkt_flags(), when
> ieee1588 is disabled. Previously a compilation
> error occurred on unused-parameter.
> 
> Fixes: 1ce6591e238a ("igb: fix ieee1588 frame identification in i210")
> 
> Signed-off-by: Harry van Haaren 
> ---
> 
> v2: Fixed Fixes line (the irony)
> 
>  drivers/net/e1000/igb_rxtx.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/net/e1000/igb_rxtx.c b/drivers/net/e1000/igb_rxtx.c
> index 66bc3f0..d734a19 100644
> --- a/drivers/net/e1000/igb_rxtx.c
> +++ b/drivers/net/e1000/igb_rxtx.c
> @@ -732,6 +732,8 @@ rx_desc_hlen_type_rss_to_pkt_flags(struct
> igb_rx_queue *rxq, uint32_t hl_tp_rs)
>   pkt_flags |= ip_pkt_etqf_map[(hl_tp_rs >> 12) & 0x07];
>   else
>   pkt_flags |= ip_pkt_etqf_map[(hl_tp_rs >> 4) & 0x07];
> +#else
> + RTE_SET_USED(rxq);
>  #endif
> 
>   return pkt_flags;
> --
> 1.9.1

Acked-by: Pablo de Lara
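
For readers unfamiliar with the idiom, a minimal sketch of what the patch relies on. It assumes RTE_SET_USED boils down to a cast to void; DEMO_SET_USED, DEMO_IEEE1588, struct demo_queue and demo_flags() are hypothetical names used only for illustration:

```c
#include <stdint.h>

/* Hypothetical stand-in for RTE_SET_USED: evaluate the argument, discard it */
#define DEMO_SET_USED(x) (void)(x)

struct demo_queue {
	uint32_t port_id;
};

/*
 * Mirrors the shape of the patched function: the queue pointer is only
 * needed when the optional feature is compiled in.
 */
static uint32_t
demo_flags(struct demo_queue *q, uint32_t hl_tp_rs)
{
	uint32_t flags = hl_tp_rs & 0x0f;

#ifdef DEMO_IEEE1588
	flags |= q->port_id << 8;
#else
	/* Keeps builds with -Wunused-parameter -Werror from failing */
	DEMO_SET_USED(q);
#endif

	return flags;
}
```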

[dpdk-dev] [PATCH v2] e1000: mark rxq with RTE_SET_USED

2015-10-28 Thread Harry van Haaren

This patch marks rxq with RTE_SET_USED in
rx_desc_hlen_type_rss_to_pkt_flags(), when
ieee1588 is disabled. Previously a compilation
error occurred on unused-parameter.

Fixes: 1ce6591e238a ("igb: fix ieee1588 frame identification in i210")

Signed-off-by: Harry van Haaren 
---

v2: Fixed Fixes line (the irony)

 drivers/net/e1000/igb_rxtx.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/e1000/igb_rxtx.c b/drivers/net/e1000/igb_rxtx.c
index 66bc3f0..d734a19 100644
--- a/drivers/net/e1000/igb_rxtx.c
+++ b/drivers/net/e1000/igb_rxtx.c
@@ -732,6 +732,8 @@ rx_desc_hlen_type_rss_to_pkt_flags(struct igb_rx_queue 
*rxq, uint32_t hl_tp_rs)
pkt_flags |= ip_pkt_etqf_map[(hl_tp_rs >> 12) & 0x07];
else
pkt_flags |= ip_pkt_etqf_map[(hl_tp_rs >> 4) & 0x07];
+#else
+   RTE_SET_USED(rxq);
 #endif

return pkt_flags;
-- 
1.9.1

[dpdk-dev] [PATCH] e1000: mark rxq with RTE_SET_USED

2015-10-28 Thread Harry van Haaren

This patch marks rxq with RTE_SET_USED in
rx_desc_hlen_type_rss_to_pkt_flags(), when
ieee1588 is disabled. Previously a compilation
error occurred on unused-parameter.

Fixes: c6c79fa425f1 ("e1000: do not release queue on alloc error")

Signed-off-by: Harry van Haaren 
---
 drivers/net/e1000/igb_rxtx.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/e1000/igb_rxtx.c b/drivers/net/e1000/igb_rxtx.c
index 66bc3f0..d734a19 100644
--- a/drivers/net/e1000/igb_rxtx.c
+++ b/drivers/net/e1000/igb_rxtx.c
@@ -732,6 +732,8 @@ rx_desc_hlen_type_rss_to_pkt_flags(struct igb_rx_queue 
*rxq, uint32_t hl_tp_rs)
pkt_flags |= ip_pkt_etqf_map[(hl_tp_rs >> 12) & 0x07];
else
pkt_flags |= ip_pkt_etqf_map[(hl_tp_rs >> 4) & 0x07];
+#else
+   RTE_SET_USED(rxq);
 #endif

return pkt_flags;
-- 
1.9.1

[dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture

2015-10-28 Thread David Marchand

Hello Jan,

On Tue, Oct 27, 2015 at 8:13 PM, Jan Viktorin 
wrote:

> Hello DPDK community,
>
> this is the third attempt to post support for ARMv7 into the DPDK.
> There are changes related to the LPM and ACL libraries only:
>
> * included rte_vect.h, however, it is more a placeholder
> * rte_lpm.h was simplified due to the previous point
> * ACL now compiles as we detect whether the compiler
>   supports SSE 4.1
>

This patchset looks good to me (with the minor comments I sent).
And armv8 support should fit quite well in this.

A few last things:
- checkpatch is not happy with some patches, can you have a look at this?
- can you update the 2.2 release notes as part of this patchset to announce
armv7 support?
- I am not really sure the acl and lpm fixes belong in this patchset, as a
larger cleanup is necessary to have all libraries compile fine on non-x86
- since you introduce a new architecture, do you intend to run daily build
checks and send reports to the test-report mailing list?


Thanks.

-- 
David Marchand

[dpdk-dev] [PATCH] ixgbe: add mspdc to rx errors

2015-10-28 Thread Thomas Monjalon

> > This patch adds the mspdc (MAC Short Packet Discard Count) to the total rx
> > errors, as discussed on the dev at dpdk mailing
> > list: http://comments.gmane.org/gmane.comp.networking.dpdk.devel/23717
> > 
> > Suggested-by: Igor Ryzhov 
> > Signed-off-by: Harry van Haaren 
> Yes, why not including it in the error counter :)
> Acked-by: Wenzhuo Lu 

Thanks for remembering to add the Suggested-by tag.
Applied, thanks

[dpdk-dev] [PATCH v3] ixgbe: remove rx jabber from ierrors

2015-10-28 Thread Thomas Monjalon

> > Remove receive jabber count (rjc) from ierrors count as the register
> > overlaps with the CRC error register, previously causing some packets to
> > be counted twice.
> > 
> > Signed-off-by: Harry van Haaren 
> Acked-by: Wenzhuo Lu 

Applied, thanks

[dpdk-dev] [PATCH] ixgbe: fix 82599 / 82598 register differences

2015-10-28 Thread Thomas Monjalon

> > Ixgbe based 82598 and 82599 have different priority receive link-on register
> > addresses. This is solved in base/ by providing PXONRXC and
> > PXONXCNT as separate macros. This patch ensures the correct address is
> > read, avoiding reading garbage values.
> > 
> > Also PXON2OFFCNT doesn't exist in 82598, so it is not read for that MAC.
> > 
> > This issue has existed since the drivers were imported into DPDK, but was
> > not easily discoverable as xstats were not available.
> > Tested using testpmd> show port xstats all
> > 
> > Fixes: af75078fece3 ("first public release")
> > 
> > Signed-off-by: Harry van Haaren 
> Acked-by: Wenzhuo Lu 

Applied, thanks

[dpdk-dev] [PATCH v8] mem: command line option to delete hugepage backing files

2015-10-28 Thread Shesha Sreenivasamurthy

When an application using huge-pages crashes or exits, the hugetlbfs
backing files are not cleaned up. This is a patch to clean those files.
There are multi-process DPDK applications that may benefit from those
backing files. Therefore, I have made that configurable so that the
application that does not need those backing files can remove them, thus
not changing the current default behavior. The application itself can
clean them up; however, the rationale for DPDK doing so is that DPDK
created them and therefore it is better that DPDK unlinks them.

Signed-off-by: Shesha Sreenivasamurthy 
Acked-by: Sergio Gonzalez Monroy 
---
 lib/librte_eal/common/eal_common_options.c | 12 +++
 lib/librte_eal/common/eal_internal_cfg.h   |  1 +
 lib/librte_eal/common/eal_options.h|  2 ++
 lib/librte_eal/linuxapp/eal/eal_memory.c   | 32 ++
 4 files changed, 47 insertions(+)

diff --git a/lib/librte_eal/common/eal_common_options.c 
b/lib/librte_eal/common/eal_common_options.c
index c614477..4e73b85 100644
--- a/lib/librte_eal/common/eal_common_options.c
+++ b/lib/librte_eal/common/eal_common_options.c
@@ -79,6 +79,7 @@ eal_long_options[] = {
{OPT_MASTER_LCORE,  1, NULL, OPT_MASTER_LCORE_NUM },
{OPT_NO_HPET,   0, NULL, OPT_NO_HPET_NUM  },
{OPT_NO_HUGE,   0, NULL, OPT_NO_HUGE_NUM  },
+   {OPT_HUGE_UNLINK,   0, NULL, OPT_HUGE_UNLINK_NUM  },
{OPT_NO_PCI,0, NULL, OPT_NO_PCI_NUM   },
{OPT_NO_SHCONF, 0, NULL, OPT_NO_SHCONF_NUM},
{OPT_PCI_BLACKLIST, 1, NULL, OPT_PCI_BLACKLIST_NUM},
@@ -718,6 +719,10 @@ eal_parse_common_option(int opt, const char *optarg,
conf->no_hugetlbfs = 1;
break;

+   case OPT_HUGE_UNLINK_NUM:
+   conf->hugepage_unlink = 1;
+   break;
+
case OPT_NO_PCI_NUM:
conf->no_pci = 1;
break;
@@ -841,6 +846,12 @@ eal_check_common_options(struct internal_config 
*internal_cfg)
return -1;
}

+   if (internal_cfg->no_hugetlbfs && internal_cfg->hugepage_unlink) {
+   RTE_LOG(ERR, EAL, "Option --"OPT_HUGE_UNLINK" cannot "
+   "be specified together with --"OPT_NO_HUGE"\n");
+   return -1;
+   }
+
if (rte_eal_devargs_type_count(RTE_DEVTYPE_WHITELISTED_PCI) != 0 &&
rte_eal_devargs_type_count(RTE_DEVTYPE_BLACKLISTED_PCI) != 0) {
RTE_LOG(ERR, EAL, "Options blacklist (-b) and whitelist (-w) "
@@ -891,6 +902,7 @@ eal_common_usage(void)
   "  -h, --help  This help\n"
   "\nEAL options for DEBUG use only:\n"
   "  --"OPT_NO_HUGE"   Use malloc instead of hugetlbfs\n"
+  "  --"OPT_HUGE_UNLINK"   Unlink hugepage files after init\n"
   "  --"OPT_NO_PCI"Disable PCI\n"
   "  --"OPT_NO_HPET"   Disable HPET\n"
   "  --"OPT_NO_SHCONF" No shared config (mmap'd files)\n"
diff --git a/lib/librte_eal/common/eal_internal_cfg.h 
b/lib/librte_eal/common/eal_internal_cfg.h
index e2ecb0d..292013c 100644
--- a/lib/librte_eal/common/eal_internal_cfg.h
+++ b/lib/librte_eal/common/eal_internal_cfg.h
@@ -64,6 +64,7 @@ struct internal_config {
volatile unsigned force_nchannel; /**< force number of channels */
volatile unsigned force_nrank;/**< force number of ranks */
volatile unsigned no_hugetlbfs;   /**< true to disable hugetlbfs */
+   unsigned hugepage_unlink; /** < true to unlink backing files */
volatile unsigned xen_dom0_support; /**< support app running on Xen 
Dom0*/
volatile unsigned no_pci; /**< true to disable PCI */
volatile unsigned no_hpet;/**< true to disable HPET */
diff --git a/lib/librte_eal/common/eal_options.h 
b/lib/librte_eal/common/eal_options.h
index f6714d9..745f38c 100644
--- a/lib/librte_eal/common/eal_options.h
+++ b/lib/librte_eal/common/eal_options.h
@@ -63,6 +63,8 @@ enum {
OPT_PROC_TYPE_NUM,
 #define OPT_NO_HPET   "no-hpet"
OPT_NO_HPET_NUM,
+#define OPT_HUGE_UNLINK"huge-unlink"
+   OPT_HUGE_UNLINK_NUM,
 #define OPT_NO_HUGE   "no-huge"
OPT_NO_HUGE_NUM,
 #define OPT_NO_PCI"no-pci"
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c 
b/lib/librte_eal/linuxapp/eal/eal_memory.c
index ac2745e..3a9d190 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -786,6 +786,30 @@ copy_hugepages_to_shared_mem(struct hugepage_file * dst, 
int dest_size,
return 0;
 }

+static int
+unlink_hugepage_files(struct hugepage_file *hugepg_tbl,
+   unsigned num_hp_info)
+{
+   unsigned socket, size;
+   int page, nrpages = 0;
+
+   /* get total number of hugepages */
+   for (size = 0; size < num_hp_info; size++
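
The idea behind --huge-unlink can be illustrated with a minimal POSIX sketch: once a file is mapped, its directory entry can be removed and the mapping stays usable until it is unmapped, so nothing is left behind if the process later crashes. This is not the code from the patch; it uses an ordinary temporary file instead of hugetlbfs, and map_then_unlink and the /tmp path are hypothetical:

```c
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Map a file, unlink it, and keep using the mapping.
 * Returns 0 when the mapping is still usable after the unlink and the
 * path is gone from the filesystem, -1 on any failure.
 */
static int
map_then_unlink(size_t len)
{
	char path[] = "/tmp/unlink_demo_XXXXXX"; /* hypothetical location */
	unsigned char *va;
	int fd, ok;

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	if (ftruncate(fd, (off_t)len) < 0) {
		close(fd);
		unlink(path);
		return -1;
	}

	va = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd); /* the mapping holds its own reference to the file */
	if (va == MAP_FAILED) {
		unlink(path);
		return -1;
	}

	/* Remove the directory entry; the pages live until munmap() */
	if (unlink(path) < 0) {
		munmap(va, len);
		return -1;
	}

	memset(va, 0xab, len);
	ok = (va[len - 1] == 0xab) && (access(path, F_OK) != 0);

	munmap(va, len);
	return ok ? 0 : -1;
}
```

The trade-off mentioned in the commit message follows directly: once the path is gone, a secondary process can no longer open and map the same file.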

[dpdk-dev] [PATCH v3 01/17] mk: Introduce ARMv7 architecture

2015-10-28 Thread David Marchand

On Wed, Oct 28, 2015 at 11:56 AM, Jan Viktorin 
wrote:

> On Wed, 28 Oct 2015 11:09:21 +0100
> David Marchand  wrote:
>
> > +# PCI is usually not used on ARM
> > > +CONFIG_RTE_EAL_IGB_UIO=n
> > >
> >
> > Not sure "usually not used" is a good reason to disable something.
> > Is there a real issue on arm with igb_uio code (compilation, pci
> accesses) ?
> >
>
> Well, it requires to set some options in Linux Kernel (at least PCI
> support) which are usually disabled by the in-kernel *arm*_defconfigs.
> Moreover, it seems I cannot enable it for some ARM architectures (I've
> tried Altera SoC FPGA). That's because you hardly find an ARMv7 system
> with a PCI bus. I suppose that if somebody _really_ needs this, she would
> enable it by hand.
>
> At the moment, it breaks my common builds... The driver is mostly
> useless on ARMv7 and just takes space in the filesystem.
>
>
Ok, well, at the moment, you seem to be the only user :-)
Let's see what other people say.


-- 
David Marchand

[dpdk-dev] [PATCH] lib/lpm:fix two issues in the delete_depth_small()

2015-10-28 Thread Bruce Richardson

On Wed, Oct 28, 2015 at 11:44:15AM +0800, Jijiang Liu wrote:
> Fix two issues in the delete_depth_small() function.
>  
> 1> The control is not strict in this function.
>  
> In the following structure,
> struct rte_lpm_tbl24_entry {
> union {
> uint8_t next_hop;
> uint8_t tbl8_gindex;
> };
>  uint8_t ext_entry :1;
> }
>  
> When ext_entry = 0, use next_hop only, to process the rte_lpm_tbl24_entry.
>  
> When ext_entry = 1, use tbl8_gindex to process the rte_lpm_tbl8_entry.
>  
> When using the LPM24 + 8 algorithm, ext_entry decides whether to process the
> rte_lpm_tbl24_entry structure or the rte_lpm_tbl8_entry structure.
> If a route is deleted, the prefix of the previous route is used to override
> the deleted route. When (lpm->tbl24[i].ext_entry == 0 &&
> lpm->tbl24[i].depth > depth) the entry should be ignored, but due to the
> incorrect logic, the next_hop is used as tbl8_gindex and the
> rte_lpm_tbl8_entry is processed.
>  
> 2> Initialization of rte_lpm_tbl8_entry is incorrect in this function 
>  
> In this function, a new rte_lpm_tbl8_entry, which we call A, replaces the old
> rte_lpm_tbl8_entry. But valid_group is not set to VALID, so it will be
> INVALID.
> Then, when adding a new route whose depth is > 24, the tbl8_alloc() function
> searches the rte_lpm_tbl8_entry array for an INVALID valid_group
> and returns A to the add_depth_big function, so A's data is
> overridden.
> 
> Signed-off-by: NaNa 
> 

Hi NaNa, Jijiang,

since this patch contains two separate fixes, it would be better split into
two separate patches, one for each fix. Also, please add a "Fixes" line to
the commit log.

Are there still plans for a unit test to demonstrate the bug(s) and make it easy
for us to verify the fix?

Regards,
/Bruce
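
The first issue described in the patch can be sketched as a three-way decision on each tbl24 entry. This is only an illustration of the description above, with a simplified entry layout; struct demo_tbl24_entry, enum demo_action and demo_classify() are hypothetical names, not the rte_lpm code:

```c
#include <stdint.h>

/* Simplified tbl24 entry: the union is interpreted according to ext_entry */
struct demo_tbl24_entry {
	union {
		uint8_t next_hop;    /* valid when ext_entry == 0 */
		uint8_t tbl8_gindex; /* valid when ext_entry == 1 */
	};
	uint8_t valid     :1;
	uint8_t ext_entry :1;
	uint8_t depth     :6;
};

enum demo_action {
	DEMO_SKIP,        /* a longer prefix owns this entry: leave it alone */
	DEMO_REPLACE_HOP, /* overwrite next_hop with the covering route */
	DEMO_WALK_TBL8    /* follow tbl8_gindex into the tbl8 group */
};

/* Decide what deleting a route of the given depth does to one entry */
static enum demo_action
demo_classify(const struct demo_tbl24_entry *e, uint8_t depth)
{
	if (e->ext_entry == 0)
		return (e->depth <= depth) ? DEMO_REPLACE_HOP : DEMO_SKIP;

	/* Only here may the union be read as tbl8_gindex */
	return DEMO_WALK_TBL8;
}
```

The bug as reported corresponds to falling through to the tbl8 branch for the DEMO_SKIP case, which reads next_hop as if it were a group index.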

[dpdk-dev] [PATCH 0/2] fix vf statistic wraparound handling in macro

2015-10-28 Thread Thomas Monjalon

2015-10-12 17:45, Harry van Haaren:
> The following two patches fix a misinterpretation of the cyclic
> counters of igb and ixgbe VF. When the 32bit value wraps around,
> the code now handles the wrapped new value appropriately.
> 
> v2:
> - Reimplemented with Alex's suggested fix for off-by-one
> 
> v1:
> - Initial implementation
> 
> Harry van Haaren (2):
>   ixgbe: fix VF statistic wraparound handling macro
>   igb: fix VF statistic wraparound handling macro

Applied (with spacing fixes), thanks

[dpdk-dev] [PATCH v2 01/16] mk: Introduce ARMv7 architecture

2015-10-28 Thread David Marchand

On Mon, Oct 26, 2015 at 5:37 PM, Jan Viktorin 
wrote:

>
> diff --git a/config/defconfig_arm-armv7-a-linuxapp-gcc
> b/config/defconfig_arm-armv7-a-linuxapp-gcc
> new file mode 100644
> index 000..5b582a8
> --- /dev/null
> +++ b/config/defconfig_arm-armv7-a-linuxapp-gcc
> @@ -0,0 +1,78 @@
>
> +# fails to compile on ARM
> +CONFIG_RTE_LIBRTE_ACL=n
> +CONFIG_RTE_LIBRTE_LPM=n
>

librte_lpm is used by librte_table, used by librte_pipeline.

So until lpm is fixed (later in your patchset), this config file won't
build.

-- 
David Marchand

[dpdk-dev] [PATCH 1/2] ixgbe: fix VF statistic wraparound handling macro

2015-10-28 Thread Thomas Monjalon

2015-10-12 17:45, Harry van Haaren:
> - cur += latest - last;   \
> + cur += (latest-last) & UINT_MAX;\

CHECK:SPACING: spaces preferred around that '-' (ctx:VxV)

Please use checkpatch before submitting.
Thanks
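
For context, a minimal sketch of why the mask in the quoted line matters. It assumes the hardware counter is 32 bits wide and that the saved "last" value is kept in a 64-bit field, so that a plain subtraction after a wrap would produce a huge 64-bit number; demo_update_stat() is a hypothetical helper, not the driver macro:

```c
#include <stdint.h>

/*
 * Accumulate a wrapping 32-bit hardware counter into a 64-bit total.
 * latest: value just read from the register
 * last:   value read on the previous call (updated here)
 * cur:    running 64-bit total (updated here)
 */
static void
demo_update_stat(uint32_t latest, uint64_t *last, uint64_t *cur)
{
	/* Masking keeps only the low 32 bits of the difference, which is
	 * the real increment even when the register has wrapped. */
	*cur += ((uint64_t)latest - *last) & UINT32_MAX;
	*last = latest;
}
```

With last = 0xFFFFFFF0 and latest = 0x10, the masked difference is 0x20, i.e. 32 packets across the wrap.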

[dpdk-dev] [PATCH v2 01/16] mk: Introduce ARMv7 architecture

2015-10-28 Thread David Marchand

On Mon, Oct 26, 2015 at 5:37 PM, Jan Viktorin 
wrote:

> From: Vlastimil Kosar 
>
> Make DPDK run on ARMv7-A architecture. This patch assumes
> ARM Cortex-A9. However, it is known to be working on Cortex-A7
> and Cortex-A15.
>
> Signed-off-by: Vlastimil Kosar 
> Signed-off-by: Jan Viktorin 
> ---
> v1 -> v2:
> * the -mtune parameter of GCC is configurable now
> * the -mfpu=neon can be turned off
>
> Signed-off-by: Jan Viktorin 
> ---
>  config/defconfig_arm-armv7-a-linuxapp-gcc | 78
> +++
>  mk/arch/arm/rte.vars.mk   | 39 
>  mk/machine/armv7-a/rte.vars.mk| 67 ++
>  3 files changed, 184 insertions(+)
>  create mode 100644 config/defconfig_arm-armv7-a-linuxapp-gcc
>  create mode 100644 mk/arch/arm/rte.vars.mk
>  create mode 100644 mk/machine/armv7-a/rte.vars.mk
>

This patch comes too early in the patchset; I would put it once compilation
is fine (more comments to come, btw), i.e. once all headers are in place, not
before.

Besides, do we really need this -a suffix ?


-- 
David Marchand

[dpdk-dev] [PATCH v4 3/3] examples/ip_pipeline: add mp/mc and frag/ras swq

2015-10-28 Thread Piotr Azarewicz

Add integrated MP/MC and fragmentation/reassembly support to SWQs

Signed-off-by: Piotr Azarewicz 
---
 examples/ip_pipeline/app.h  |   14 +++
 examples/ip_pipeline/config_check.c |   45 +++-
 examples/ip_pipeline/config_parse.c |  195 +--
 examples/ip_pipeline/init.c |  165 -
 examples/ip_pipeline/main.c |4 +-
 examples/ip_pipeline/pipeline_be.h  |   18 
 6 files changed, 402 insertions(+), 39 deletions(-)

diff --git a/examples/ip_pipeline/app.h b/examples/ip_pipeline/app.h
index 521e3a0..943466e 100644
--- a/examples/ip_pipeline/app.h
+++ b/examples/ip_pipeline/app.h
@@ -107,6 +107,14 @@ struct app_pktq_swq_params {
uint32_t dropless;
uint64_t n_retries;
uint32_t cpu_socket_id;
+   uint32_t ipv4_frag;
+   uint32_t ipv6_frag;
+   uint32_t ipv4_ras;
+   uint32_t ipv6_ras;
+   uint32_t mtu;
+   uint32_t metadata_size;
+   uint32_t mempool_direct_id;
+   uint32_t mempool_indirect_id;
 };

 #ifndef APP_FILE_NAME_SIZE
@@ -405,6 +413,10 @@ struct app_params {
char app_name[APP_APPNAME_SIZE];
const char *config_file;
const char *script_file;
+   const char *parser_file;
+   const char *output_file;
+   const char *preproc;
+   const char *preproc_args;
uint64_t port_mask;
uint32_t log_level;

@@ -880,6 +892,8 @@ int app_config_init(struct app_params *app);
 int app_config_args(struct app_params *app,
int argc, char **argv);

+int app_config_preproc(struct app_params *app);
+
 int app_config_parse(struct app_params *app,
const char *file_name);

diff --git a/examples/ip_pipeline/config_check.c 
b/examples/ip_pipeline/config_check.c
index 07f4c8b..8052bc4 100644
--- a/examples/ip_pipeline/config_check.c
+++ b/examples/ip_pipeline/config_check.c
@@ -33,6 +33,8 @@

 #include 

+#include 
+
 #include "app.h"

 static void
@@ -193,6 +195,7 @@ check_swqs(struct app_params *app)
struct app_pktq_swq_params *p = &app->swq_params[i];
uint32_t n_readers = app_swq_get_readers(app, p);
uint32_t n_writers = app_swq_get_writers(app, p);
+   uint32_t n_flags;

APP_CHECK((p->size > 0),
"%s size is 0\n", p->name);
@@ -217,14 +220,48 @@ check_swqs(struct app_params *app)
APP_CHECK((n_readers != 0),
"%s has no reader\n", p->name);

-   APP_CHECK((n_readers == 1),
-   "%s has more than one reader\n", p->name);
+   if (n_readers > 1)
+   APP_LOG(app, LOW, "%s has more than one reader", 
p->name);

APP_CHECK((n_writers != 0),
"%s has no writer\n", p->name);

-   APP_CHECK((n_writers == 1),
-   "%s has more than one writer\n", p->name);
+   if (n_writers > 1)
+   APP_LOG(app, LOW, "%s has more than one writer", 
p->name);
+
+   n_flags = p->ipv4_frag + p->ipv6_frag + p->ipv4_ras + 
p->ipv6_ras;
+
+   APP_CHECK((n_flags < 2),
+   "%s has more than one fragmentation or reassembly mode 
enabled\n",
+   p->name);
+
+   APP_CHECK((!((n_readers > 1) && (n_flags == 1))),
+   "%s has more than one reader when fragmentation or 
reassembly"
+   " mode enabled\n",
+   p->name);
+
+   APP_CHECK((!((n_writers > 1) && (n_flags == 1))),
+   "%s has more than one writer when fragmentation or 
reassembly"
+   " mode enabled\n",
+   p->name);
+
+   n_flags = p->ipv4_ras + p->ipv6_ras;
+
+   APP_CHECK((!((p->dropless == 1) && (n_flags == 1))),
+   "%s has dropless when reassembly mode enabled\n", 
p->name);
+
+   n_flags = p->ipv4_frag + p->ipv6_frag;
+
+   if (n_flags == 1) {
+   uint16_t ip_hdr_size = (p->ipv4_frag) ? sizeof(struct 
ipv4_hdr) :
+   sizeof(struct ipv6_hdr);
+
+   APP_CHECK((p->mtu > ip_hdr_size),
+   "%s mtu size is smaller than ip header\n", 
p->name);
+
+   APP_CHECK((!((p->mtu - ip_hdr_size) % 8)),
+   "%s mtu size is incorrect\n", p->name);
+   }
}
 }

diff --git a/examples/ip_pipeline/config_parse.c 
b/examples/ip_pipeline/config_parse.c
index c9b78f9..a35bd3e 100644
--- a/examples/ip_pipeline/config_parse.c
+++ b/examples/ip_pipeline/config_parse.c
@@ -156,6 +156,14 @@ static const struct app_pktq_swq_params default_swq_params 
= {
.dropless = 0,
.n_retries = 0,
.cpu_socket_id = 0,
+   .ipv4_frag = 0,
+   .ipv6_frag = 0,
+   .ipv4_ras = 0,
+
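
The MTU checks added to check_swqs() above can be sketched on their own. Fragment offsets are expressed in units of 8 bytes, so the payload carried by each non-final fragment has to be a multiple of 8. The sketch assumes header sizes of 20 bytes (IPv4 without options) and 40 bytes (fixed IPv6 header), matching the sizeof() expressions in the patch; demo_frag_mtu_ok() and the DEMO_* constants are hypothetical:

```c
#include <stdint.h>

#define DEMO_IPV4_HDR_SIZE 20u /* IPv4 header without options */
#define DEMO_IPV6_HDR_SIZE 40u /* fixed IPv6 header */

/* Returns 1 if mtu leaves a non-empty payload that is a multiple of 8 bytes */
static int
demo_frag_mtu_ok(uint32_t mtu, int is_ipv4)
{
	uint32_t hdr = is_ipv4 ? DEMO_IPV4_HDR_SIZE : DEMO_IPV6_HDR_SIZE;

	if (mtu <= hdr)
		return 0;

	return ((mtu - hdr) % 8) == 0;
}
```

Under these assumptions an MTU of 1500 passes for IPv4 (1480 is a multiple of 8) but not for IPv6 (1460 is not), while 1496 passes for IPv6.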

[dpdk-dev] [PATCH v4 2/3] port: fix ras/frag ring ports

2015-10-28 Thread Piotr Azarewicz

Bug fixes for ring ports with IPv4/IPv6 reassembly support.
The previous implementation could not work properly because the wrong
process function was chosen.
Also, since the IP header is known when an IP packet is processed, we can
set the l3_len parameter here.

Fix usage of the RTE_MBUF_METADATA_* macros following their redefinition.

Fixes: 50f54a84dfb7 ("port: add IPv6 reassembly port")
Fixes: ba92d511ddac ("port: move metadata offset reference at mbuf head")

Signed-off-by: Piotr Azarewicz 
---
 lib/librte_port/rte_port_frag.c |5 +++--
 lib/librte_port/rte_port_ras.c  |8 ++--
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/lib/librte_port/rte_port_frag.c b/lib/librte_port/rte_port_frag.c
index 3720d5d..0fcace9 100644
--- a/lib/librte_port/rte_port_frag.c
+++ b/lib/librte_port/rte_port_frag.c
@@ -229,9 +229,10 @@ rte_port_ring_reader_frag_rx(void *port,

/* Copy meta-data from input jumbo packet to its fragments */
for (i = 0; i < p->n_frags; i++) {
-   uint8_t *src = RTE_MBUF_METADATA_UINT8_PTR(pkt, 0);
+   uint8_t *src =
+ RTE_MBUF_METADATA_UINT8_PTR(pkt, sizeof(struct 
rte_mbuf));
uint8_t *dst =
-   RTE_MBUF_METADATA_UINT8_PTR(p->frags[i], 0);
+ RTE_MBUF_METADATA_UINT8_PTR(p->frags[i], 
sizeof(struct rte_mbuf));

memcpy(dst, src, p->metadata_size);
}
diff --git a/lib/librte_port/rte_port_ras.c b/lib/librte_port/rte_port_ras.c
index 8a2e554..c4bb508 100644
--- a/lib/librte_port/rte_port_ras.c
+++ b/lib/librte_port/rte_port_ras.c
@@ -144,7 +144,7 @@ rte_port_ring_writer_ras_create(void *params, int 
socket_id, int is_ipv4)
port->tx_burst_sz = conf->tx_burst_sz;
port->tx_buf_count = 0;

-   port->f_ras = (is_ipv4 == 0) ? process_ipv4 : process_ipv6;
+   port->f_ras = (is_ipv4 == 1) ? process_ipv4 : process_ipv6;

return port;
 }
@@ -182,7 +182,7 @@ process_ipv4(struct rte_port_ring_writer_ras *p, struct 
rte_mbuf *pkt)
/* Assume there is no ethernet header */
struct ipv4_hdr *pkt_hdr = rte_pktmbuf_mtod(pkt, struct ipv4_hdr *);

-   /* Get "Do not fragment" flag and fragment offset */
+   /* Get "More fragments" flag and fragment offset */
uint16_t frag_field = rte_be_to_cpu_16(pkt_hdr->fragment_offset);
uint16_t frag_offset = (uint16_t)(frag_field & IPV4_HDR_OFFSET_MASK);
uint16_t frag_flag = (uint16_t)(frag_field & IPV4_HDR_MF_FLAG);
@@ -195,6 +195,8 @@ process_ipv4(struct rte_port_ring_writer_ras *p, struct 
rte_mbuf *pkt)
struct rte_ip_frag_tbl *tbl = p->frag_tbl;
struct rte_ip_frag_death_row *dr = &p->death_row;

+   pkt->l3_len = sizeof(*pkt_hdr);
+
/* Process this fragment */
mo = rte_ipv4_frag_reassemble_packet(tbl, dr, pkt, rte_rdtsc(),
pkt_hdr);
@@ -225,6 +227,8 @@ process_ipv6(struct rte_port_ring_writer_ras *p, struct 
rte_mbuf *pkt)
struct rte_ip_frag_tbl *tbl = p->frag_tbl;
struct rte_ip_frag_death_row *dr = &p->death_row;

+   pkt->l3_len = sizeof(*pkt_hdr) + sizeof(*frag_hdr);
+
/* Process this fragment */
mo = rte_ipv6_frag_reassemble_packet(tbl, dr, pkt, rte_rdtsc(), 
pkt_hdr,
frag_hdr);
-- 
1.7.9.5
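
The comment corrected in process_ipv4() refers to the IPv4 flags/offset word. A minimal sketch of how that word is decoded, assuming the RFC 791 layout ("More fragments" is bit 0x2000, the low 13 bits are the offset in 8-byte units); the DEMO_* constants and demo_ipv4_is_fragment() are hypothetical stand-ins for the DPDK definitions:

```c
#include <stdint.h>

#define DEMO_IPV4_MF_FLAG     0x2000u /* "More fragments" bit */
#define DEMO_IPV4_OFFSET_MASK 0x1fffu /* 13-bit fragment offset */

/* frag_field is the flags/offset word already converted to host byte order */
static int
demo_ipv4_is_fragment(uint16_t frag_field)
{
	uint16_t offset = frag_field & DEMO_IPV4_OFFSET_MASK;
	uint16_t mf = frag_field & DEMO_IPV4_MF_FLAG;

	/* First and middle fragments have MF set; the last one has a
	 * non-zero offset. Anything else is a complete packet. */
	return (offset != 0) || (mf != 0);
}
```

Note that the "Do not fragment" bit (0x4000) plays no part in this decision, which is why the original comment was misleading.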

[dpdk-dev] [PATCH v4 1/3] port: add mp/mc ring ports

2015-10-28 Thread Piotr Azarewicz

ring_multi_reader input port (on top of multi consumer rte_ring)
ring_multi_writer output port (on top of multi producer rte_ring)

Signed-off-by: Piotr Azarewicz 
---
 lib/librte_port/rte_port_ring.c  |  311 +++---
 lib/librte_port/rte_port_ring.h  |   35 +++-
 lib/librte_port/rte_port_version.map |9 +
 3 files changed, 325 insertions(+), 30 deletions(-)

diff --git a/lib/librte_port/rte_port_ring.c b/lib/librte_port/rte_port_ring.c
index 9461c05..755dfc1 100644
--- a/lib/librte_port/rte_port_ring.c
+++ b/lib/librte_port/rte_port_ring.c
@@ -63,15 +63,19 @@ struct rte_port_ring_reader {
 };

 static void *
-rte_port_ring_reader_create(void *params, int socket_id)
+rte_port_ring_reader_create_internal(void *params, int socket_id,
+   uint32_t is_multi)
 {
struct rte_port_ring_reader_params *conf =
(struct rte_port_ring_reader_params *) params;
struct rte_port_ring_reader *port;

/* Check input parameters */
-   if (conf == NULL) {
-   RTE_LOG(ERR, PORT, "%s: params is NULL\n", __func__);
+   if ((conf == NULL) ||
+   (conf->ring == NULL) ||
+   (conf->ring->cons.sc_dequeue && is_multi) ||
+   (!(conf->ring->cons.sc_dequeue) && !is_multi)) {
+   RTE_LOG(ERR, PORT, "%s: Invalid Parameters\n", __func__);
return NULL;
}

@@ -89,6 +93,18 @@ rte_port_ring_reader_create(void *params, int socket_id)
return port;
 }

+static void *
+rte_port_ring_reader_create(void *params, int socket_id)
+{
+   return rte_port_ring_reader_create_internal(params, socket_id, 0);
+}
+
+static void *
+rte_port_ring_multi_reader_create(void *params, int socket_id)
+{
+   return rte_port_ring_reader_create_internal(params, socket_id, 1);
+}
+
 static int
 rte_port_ring_reader_rx(void *port, struct rte_mbuf **pkts, uint32_t n_pkts)
 {
@@ -102,6 +118,19 @@ rte_port_ring_reader_rx(void *port, struct rte_mbuf 
**pkts, uint32_t n_pkts)
 }

 static int
+rte_port_ring_multi_reader_rx(void *port, struct rte_mbuf **pkts,
+   uint32_t n_pkts)
+{
+   struct rte_port_ring_reader *p = (struct rte_port_ring_reader *) port;
+   uint32_t nb_rx;
+
+   nb_rx = rte_ring_mc_dequeue_burst(p->ring, (void **) pkts, n_pkts);
+   RTE_PORT_RING_READER_STATS_PKTS_IN_ADD(p, nb_rx);
+
+   return nb_rx;
+}
+
+static int
 rte_port_ring_reader_free(void *port)
 {
if (port == NULL) {
@@ -155,10 +184,12 @@ struct rte_port_ring_writer {
uint32_t tx_burst_sz;
uint32_t tx_buf_count;
uint64_t bsz_mask;
+   uint32_t is_multi;
 };

 static void *
-rte_port_ring_writer_create(void *params, int socket_id)
+rte_port_ring_writer_create_internal(void *params, int socket_id,
+   uint32_t is_multi)
 {
struct rte_port_ring_writer_params *conf =
(struct rte_port_ring_writer_params *) params;
@@ -166,7 +197,9 @@ rte_port_ring_writer_create(void *params, int socket_id)

/* Check input parameters */
if ((conf == NULL) ||
-   (conf->ring == NULL) ||
+   (conf->ring == NULL) ||
+   (conf->ring->prod.sp_enqueue && is_multi) ||
+   (!(conf->ring->prod.sp_enqueue) && !is_multi) ||
(conf->tx_burst_sz > RTE_PORT_IN_BURST_SIZE_MAX)) {
RTE_LOG(ERR, PORT, "%s: Invalid Parameters\n", __func__);
return NULL;
@@ -185,10 +218,23 @@ rte_port_ring_writer_create(void *params, int socket_id)
port->tx_burst_sz = conf->tx_burst_sz;
port->tx_buf_count = 0;
port->bsz_mask = 1LLU << (conf->tx_burst_sz - 1);
+   port->is_multi = is_multi;

return port;
 }

+static void *
+rte_port_ring_writer_create(void *params, int socket_id)
+{
+   return rte_port_ring_writer_create_internal(params, socket_id, 0);
+}
+
+static void *
+rte_port_ring_multi_writer_create(void *params, int socket_id)
+{
+   return rte_port_ring_writer_create_internal(params, socket_id, 1);
+}
+
 static inline void
 send_burst(struct rte_port_ring_writer *p)
 {
@@ -204,6 +250,21 @@ send_burst(struct rte_port_ring_writer *p)
p->tx_buf_count = 0;
 }

+static inline void
+send_burst_mp(struct rte_port_ring_writer *p)
+{
+   uint32_t nb_tx;
+
+   nb_tx = rte_ring_mp_enqueue_burst(p->ring, (void **)p->tx_buf,
+   p->tx_buf_count);
+
+   RTE_PORT_RING_WRITER_STATS_PKTS_DROP_ADD(p, p->tx_buf_count - nb_tx);
+   for ( ; nb_tx < p->tx_buf_count; nb_tx++)
+   rte_pktmbuf_free(p->tx_buf[nb_tx]);
+
+   p->tx_buf_count = 0;
+}
+
 static int
 rte_port_ring_writer_tx(void *port, struct rte_mbuf *pkt)
 {
@@ -218,9 +279,23 @@ rte_port_ring_writer_tx(void *port, struct rte_mbuf *pkt)
 }

 static int
-rte_port_ring_writer_tx_bulk(void *port,
+rte_port_ring_multi_writer_tx(void *port, struct rte_mbuf *pkt)
+{
+   struct rte_port_ring_writer *p = (struct rte_port_ri
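
The parameter check added to the create functions above rejects any mismatch between how the ring was created and which port flavour is requested. As a stand-alone sketch of that rule, with demo_ring_mode_ok() as a hypothetical helper and plain integers instead of the rte_ring fields:

```c
#include <stdint.h>

/*
 * ring_is_single: non-zero if the ring was created single-consumer
 *                 (reader side) or single-producer (writer side)
 * port_is_multi:  non-zero if a multi reader/writer port is requested
 * Returns 1 when the combination is accepted, 0 when it is rejected.
 */
static int
demo_ring_mode_ok(uint32_t ring_is_single, uint32_t port_is_multi)
{
	/* A multi port on a single-mode ring would not be thread-safe */
	if (ring_is_single && port_is_multi)
		return 0;

	/* A single port on a multi-mode ring is rejected as well */
	if (!ring_is_single && !port_is_multi)
		return 0;

	return 1;
}
```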

[dpdk-dev] [PATCH v4 0/3] ip_pipeline: add MP/MC and frag/ras support to SWQs

2015-10-28 Thread Piotr Azarewicz

This patch set enhances the ip_pipeline application:
- librte_port: add support for multi-producer/multi-consumer ring ports
- librte_port: bug fixes for ring ports with IPv4/IPv6 reassembly support
- ip_pipeline application: integrate MP/MC and fragmentation/reassembly support
  to SWQs

v2 changes:
- rte_port_ring:
  - fixed checkpatch errors
  - interlace the implementation of multi into the implementation of single
  - reduced the amount of code duplication

v3 changes:
- new functions added in the .map
- add a "Fixes:" tag in commit comment

v4 changes:
- fix usage of RTE_MBUF_METADATA_* macros

Acked-by: Cristian Dumitrescu 

Piotr Azarewicz (3):
  port: add mp/mc ring ports
  port: fix ras/frag ring ports
  examples/ip_pipeline: add mp/mc and frag/ras swq

 examples/ip_pipeline/app.h   |   14 ++
 examples/ip_pipeline/config_check.c  |   45 -
 examples/ip_pipeline/config_parse.c  |  195 +++--
 examples/ip_pipeline/init.c  |  165 +++---
 examples/ip_pipeline/main.c  |4 +-
 examples/ip_pipeline/pipeline_be.h   |   18 ++
 lib/librte_port/rte_port_frag.c  |5 +-
 lib/librte_port/rte_port_ras.c   |8 +-
 lib/librte_port/rte_port_ring.c  |  311 +++---
 lib/librte_port/rte_port_ring.h  |   35 +++-
 lib/librte_port/rte_port_version.map |9 +
 11 files changed, 736 insertions(+), 73 deletions(-)

-- 
1.7.9.5

[dpdk-dev] SR-IOV - VF: allowing unicast & multicast MAC addresses

2015-10-28 Thread Shaham Fridenberg

Hey all,

I upgraded from dpdk 1.8.0 to 2.1.0, and have a VM to which I attach several 
VFs in SR-IOV mode.

I managed to use set_mc_addr_list API to allow a specific multicast address via 
the VF, RX side.

Is it possible to allow a range of multicast addresses, for example by 
prefix? I couldn't find anything like that.

Also, is there some API to allow any MAC address via the VF itself (assuming PF 
is not accessible, VF is directly attached to the VM)?

I found the filter_ctrl API, but it seems to be implemented and used in the PF only.

Thanks,
Shaham

[dpdk-dev] [PATCH] igb: fix IEEE1588 frame identification in i210

2015-10-28 Thread Thomas Monjalon

> > Fixed issue where the flag PKT_RX_IEEE1588_PTP was not being set in Intel
> > I210 NIC, as EtherType in RX descriptor is in bits 8:10 of Packet Type and
> > not in the default bits 0:2.
> > 
> > Fixes known issue "IEEE1588 support possibly not working with an Intel
> > Ethernet Controller I210 NIC"
> > 
> > Signed-off-by: Pablo de Lara 
> Acked-by: Wenzhuo Lu 

Applied, thanks

[dpdk-dev] [PATCH] librte: fix igb_uio's access to pci_dev->msi_list for kernels >= 4.3

2015-10-28 Thread David Hunt

Fix to take this kernel change into account: https://lkml.org/lkml/2015/7/9/101
That change has been applied in kernel 4.3.0-rc6.

Signed-off-by: David Hunt 
---
 lib/librte_eal/linuxapp/igb_uio/igb_uio.c |5 +
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c 
b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
index 865a276..3bda5d2 100644
--- a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
+++ b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
@@ -254,8 +254,13 @@ igbuio_pci_irqcontrol(struct uio_info *info, s32 irq_state)
else if (udev->mode == RTE_INTR_MODE_MSIX) {
struct msi_desc *desc;

+#if (LINUX_VERSION_CODE < KERNEL_VERSION(4, 3, 0))
list_for_each_entry(desc, &pdev->msi_list, list)
igbuio_msix_mask_irq(desc, irq_state);
+#else
+   list_for_each_entry(desc, &pdev->dev.msi_list, list)
+   igbuio_msix_mask_irq(desc, irq_state);
+#endif
}
pci_cfg_access_unlock(pdev);

-- 
1.7.4.1
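
The version guard in the patch compares packed integers. A minimal sketch of that comparison, assuming the classic encoding of KERNEL_VERSION as (major << 16) + (minor << 8) + patch; DEMO_KERNEL_VERSION and demo_msi_list_in_dev() are hypothetical names, and real code should keep using the macros from the kernel headers:

```c
/* Hypothetical re-statement of the classic KERNEL_VERSION encoding */
#define DEMO_KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/* Returns 1 when msi_list is expected under pdev->dev (kernel >= 4.3) */
static int
demo_msi_list_in_dev(int version_code)
{
	return version_code >= DEMO_KERNEL_VERSION(4, 3, 0);
}
```

Because the fields are packed most-significant first, an ordinary integer comparison orders versions correctly, so 4.2.8 compares lower than 4.3.0.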

[dpdk-dev] [PATCH] ixgbe: fix the wrong prompt for VF TSO

2015-10-28 Thread Thomas Monjalon

> > When setting TSO on VF ixgbe NICs, for example, 82599, x550, the prompt
> > that TSO is not supported will be printed. But TSO is supported by VF ixgbe
> > NICs.
> > We should add TSO to the capability flag, so, we will not see the wrong
> > prompt.
> > 
> > Signed-off-by: Wenzhuo Lu 
> Acked-by: Jingjing Wu 

Applied, thanks

[dpdk-dev] Wrong TCP Checkum computed by hardware

2015-10-28 Thread Padam Jeet Singh


> On 28-Oct-2015, at 1:46 pm, Liu, Jijiang  wrote:
> 
> 
> 
>> -Original Message-
>> From: Padam Jeet Singh [mailto:padam.singh at inventum.net]
>> Sent: Wednesday, October 28, 2015 4:12 PM
>> To: Liu, Jijiang
>> Cc: dev at dpdk.org; Matthew Hall
>> Subject: Re: [dpdk-dev] Wrong TCP Checkum computed by hardware
>> 
>> 
>>> On 28-Oct-2015, at 1:31 pm, Liu, Jijiang  wrote:
>>> 
>>> 
>>>> -Original Message-
>>>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Padam Jeet Singh
>>>> Sent: Wednesday, October 28, 2015 3:20 PM
>>>> To: Matthew Hall
>>>> Cc: dev at dpdk.org
>>>> Subject: Re: [dpdk-dev] Wrong TCP Checkum computed by hardware
>>>> 
>>> 
>>> Did you set the txq_flags?
>>> If the flag is not set, the default value is IXGBE_SIMPLE_FLAGS, and
>>> TX offloads do not work.
>>> 
>>> See the following codes in ixgbe_rxtx.c file
>>> 
>>> /* Use a simple Tx queue (no offloads, no multi segs) if possible */
>>> if (((txq->txq_flags & IXGBE_SIMPLE_FLAGS) == IXGBE_SIMPLE_FLAGS)
>>> && (txq->tx_rs_thresh >=
>> RTE_PMD_IXGBE_TX_MAX_BURST)) {
>>> PMD_INIT_LOG(INFO, "Using simple tx code path");
>>> ...
>>> dev->tx_pkt_burst = ixgbe_xmit_pkts_simple;
>>> } else {
>>>  ...
>>> dev->tx_pkt_burst = ixgbe_xmit_pkts;
>>> }
>>> 
>>> 
>>> --Jijiang Liu
>> 
>> I initialise the queue with the following structure:
>> 
>> static const struct rte_eth_txconf tx_conf = {
>>  .tx_thresh = {
>>  .pthresh = 32,  /* Ring prefetch threshold */
>>  .hthresh = 0,   /* Ring host threshold */
>>  .wthresh = 0,   /* Ring writeback threshold */
>>  },
>>  .tx_free_thresh = 0,/* Use PMD default values */
>>  .tx_rs_thresh = 0,  /* Use PMD default values */
>> };
>> 
>> This would set the txq_flags to zero - so the tx_pkt_burst function would
>> always point to ixgbe_xmit_pkts. Also, as observed only TCP checksum is
>> computed wrong when there is VLAN TX Offload + IP Offload + TCP offload
>> bits set.  VLAN TX Offload + IP Offload + TCP CKSUM in software generates
>> correct packet on the wire.
> 
> I don't think the txq_flags is 0 if you just initialized the struct 
> rte_eth_txconf like that.

It's declared as a global static, so it is indeed 0. I also added some debug
output around the init of the queue:

for (i = 0; i < tx; ++i) {
	ret = rte_eth_tx_queue_setup(port, i, NB_TXD,
			rte_eth_dev_socket_id(port), &tx_conf);
	RTE_LOG(INFO, APP, "Port %u TXQ[%d] txflags = %d\n", (unsigned)port,
			i, tx_conf.txq_flags);
	if (ret < 0)
		rte_exit(EXIT_FAILURE, "Could not set up TX queue %d for "
				"port%u (%d)", i, (unsigned)port, ret);
}

And got the following result:

Oct 28 13:55:34 localhost fpnas[1322]: APP: Port 0 TXQ[0] txflags = 0
Oct 28 13:55:34 localhost fpnas[1322]: APP: Port 0 TXQ[1] txflags = 0
Oct 28 13:55:34 localhost fpnas[1322]: APP: Port 1 TXQ[0] txflags = 0
Oct 28 13:55:34 localhost fpnas[1322]: APP: Port 1 TXQ[1] txflags = 0
Oct 28 13:55:35 localhost fpnas[1322]: APP: Port 2 TXQ[0] txflags = 0
Oct 28 13:55:35 localhost fpnas[1322]: APP: Port 3 TXQ[0] txflags = 0
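
The zero seen in the log follows from C initialization rules: any member not named in a designated initializer is zero-initialized. A minimal sketch of that point, assuming a trimmed-down, hypothetical stand-in for struct rte_eth_txconf (struct sketch_txconf) and a made-up mask value (SKETCH_SIMPLE_FLAGS), since the real IXGBE_SIMPLE_FLAGS value is not shown in this thread:

```c
#include <stdint.h>

/* Hypothetical, trimmed-down stand-ins for the DPDK structures. */
struct sketch_thresh {
	uint8_t pthresh;
	uint8_t hthresh;
	uint8_t wthresh;
};

struct sketch_txconf {
	struct sketch_thresh tx_thresh;
	uint16_t tx_rs_thresh;
	uint16_t tx_free_thresh;
	uint32_t txq_flags;
};

/* Made-up mask value, for illustration only. */
#define SKETCH_SIMPLE_FLAGS 0x0F01u

static const struct sketch_txconf sketch_conf = {
	.tx_thresh = { .pthresh = 32, .hthresh = 0, .wthresh = 0 },
	.tx_free_thresh = 0,
	.tx_rs_thresh = 0,
	/* txq_flags is not named, so C initializes it to 0 */
};

/* Flag half of the quoted path-selection test; the tx_rs_thresh half
 * of the condition is left out of this sketch. */
static int sketch_uses_simple_path(uint32_t txq_flags)
{
	return (txq_flags & SKETCH_SIMPLE_FLAGS) == SKETCH_SIMPLE_FLAGS;
}
```

With txq_flags equal to 0 the masked comparison fails, so under this sketch the simple path would not be selected, which matches the reasoning in the message above.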


> 
>> Thanks,
>> Padam

[dpdk-dev] [PATCH v3 0/2] User-space ethtool sample application

2015-10-28 Thread Ananyev, Konstantin

Hi Remy

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Remy Horton
> Sent: Wednesday, October 28, 2015 11:12 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v3 0/2] User-space ethtool sample application
> 
> Further enhancements to the userspace ethtool implementation that was
> submitted in 2.1 and packaged as a self-contained sample application.
> Implements an rte_ethtool shim layer based on rte_ethdev API, along
> with a command prompt driven demonstration application.

Looks good in general, just a few small things, see below.
Thanks
Konstantin


> 
> This patchset depends on:
> * http://dpdk.org/dev/patchwork/patch/6563/
> * http://dpdk.org/dev/patchwork/patch/7340/
> * http://dpdk.org/dev/patchwork/patch/8070/
> * http://dpdk.org/dev/patchwork/patch/8067/
> * http://dpdk.org/dev/patchwork/patch/8075/
> * http://dpdk.org/dev/patchwork/patch/8074/
> * http://dpdk.org/dev/patchwork/patch/8072/
> * http://dpdk.org/dev/patchwork/patch/8071/
> * http://dpdk.org/dev/patchwork/patch/8073/
> * http://dpdk.org/dev/patchwork/patch/8068/
> * http://dpdk.org/dev/patchwork/patch/8069/

I think it actually depends only on these two:
http://dpdk.org/dev/patchwork/patch/6563/
http://dpdk.org/dev/patchwork/patch/8070/

1) From [PATCH v3 1/2] example: add user-space ethtool sample application

+int main(int argc, char **argv)
+{
+   int cnt_args_parsed;
+   uint32_t idx_port;
+   uint32_t id_core;
+   uint32_t cnt_ports;
+   struct app_port *ptr_port;
+
+   /* Init runtime enviornment */
+   cnt_args_parsed = rte_eal_init(argc, argv);
+   if (cnt_args_parsed < 0)
+   rte_exit(EXIT_FAILURE, "rte_eal_init(): Failed");
+
+   cnt_ports = rte_eth_dev_count();
+   printf("Eth NICs: %i\n", cnt_ports);
+   if (cnt_ports > MAX_PORTS) {
+   printf("Info: Using only %i of %i ports\n",
+   cnt_ports, MAX_PORTS
+   );
+   cnt_ports = MAX_PORTS;
+   }
+
+   setup_ports(&app_cfg, cnt_ports);
+
+   id_core = rte_lcore_id();
+   for (idx_port = 0; idx_port < cnt_ports; idx_port++) {
+   id_core = rte_get_next_lcore(id_core, 1, 0);
+   if (id_core == RTE_MAX_LCORE) {
+   printf("Warning: More ports than cores. "
+   "Some ports will not be active.\n");
+   break;
+   }

The master core is not always the first active core.
The user can select a master lcore with the '--master-lcore X' command-line option.
Another thing - why do you need a separate lcore for each port?
It means that a user who wants to test X different ports needs X+1 lcores to
run the app.
I understand that you are trying to keep the app as simple as possible, but why
not vice versa then?
Make all ports managed by one slave lcore?
It might slow things down as the number of ports grows, but on the other hand
this app is mainly for functional testing of the ethtool shim layer, so I think
it is not a problem.

+   ptr_port = &app_cfg.ports[idx_port];
+   rte_eal_remote_launch(slave_main, ptr_port, id_core);
+   }
+
+   ethapp_main();
+
+   for (idx_port = 0; idx_port < cnt_ports; idx_port++)
+   app_cfg.ports[idx_port].exit_now = 1;
+   RTE_LCORE_FOREACH_SLAVE(id_core) {
+   if (rte_eal_wait_lcore(id_core) < 0)
+   return -1;
+   }
+
+   return 0;
+}
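
The single-slave-lcore suggestion in item 1) can be sketched without any DPDK calls. This is only an illustration under stated assumptions: struct sketch_port and sketch_worker() are hypothetical, the poll counter stands in for the real per-port work, and only the exit_now flag mirrors the quoted patch.

```c
#include <stddef.h>

/* Hypothetical per-port state; only exit_now mirrors the quoted patch. */
struct sketch_port {
	volatile int exit_now;   /* set by the control thread to stop the port */
	unsigned long polls;     /* how many times this port was serviced */
};

/* One worker services every port in round-robin order, instead of one
 * lcore per port. Returns after max_rounds passes or once all ports
 * have exit_now set, whichever comes first. */
static unsigned long sketch_worker(struct sketch_port *ports, size_t n,
				   unsigned long max_rounds)
{
	unsigned long total = 0;
	unsigned long round;
	size_t i, active;

	for (round = 0; round < max_rounds; round++) {
		active = 0;
		for (i = 0; i < n; i++) {
			if (ports[i].exit_now)
				continue;
			ports[i].polls++;   /* stand-in for rx/tx burst work */
			total++;
			active++;
		}
		if (active == 0)
			break;
	}
	return total;
}
```

With this shape the application would need two lcores regardless of the port count: the master for the command prompt and one slave running the loop.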

2) From: [PATCH v3 2/2] doc: add user-space ethtool sample app guide

It looks like a guide for keep-alive, not ethtool :)

[dpdk-dev] [PATCH v3 01/17] mk: Introduce ARMv7 architecture

2015-10-28 Thread Hunt, David

On 28/10/2015 10:56, Jan Viktorin wrote:
> On Wed, 28 Oct 2015 11:09:21 +0100
> David Marchand  wrote:
>
>> Hello Jan,
>>
>> On Tue, Oct 27, 2015 at 8:13 PM, Jan Viktorin 
>> wrote:
>>
>> +# PCI is usually not used on ARM
>>> +CONFIG_RTE_EAL_IGB_UIO=n
>>>
>>
>> Not sure "usually not used" is a good reason to disable something.
>> Is there a real issue on arm with igb_uio code (compilation, pci accesses) ?
>>
>
> Well, it requires to set some options in Linux Kernel (at least PCI
> support) which are usually disabled by the in-kernel *arm*_defconfigs.
> Moreover, it seems I cannot enable it for some ARM architectures (I've
> tried Altera SoC FPGA). That's because you hardly find an ARMv7 system
> with a PCI bus. I suppose that if somebody _really_ needs this, she would
> enable it by hand.
>
> At the moment, it breaks my common builds... The driver is mostly
> useless on ARMv7 and just takes space in the filesystem.

I have an ARMv8 board here that I've built a new kernel for the purposes 
of an ARMv8 port, and it took quite a while to get the PCI
functionality all working, including implementing a fix to the kernel 
PCI driver to expose the mmap resources in sysfs properly. But after 
that, igb_uio compiles fine (on the ARMv8 patch) and works with a 
Niantic to pass traffic between ports.

If the majority of ARMv7 boards don't have a PCI bus, then I'd suggest 
leaving igb_uio disabled. Those few boards with PCI will most likely 
have a correctly configured kernel (and source) ready to go, so enabling 
igb_uio for them will be easy, but disabling seems a more sensible default 
for the majority of ARMv7 users.

Rgds,
Dave.

[dpdk-dev] [PATCH v6 2/9] null: fix segfault when null_pmd added to bonding

2015-10-28 Thread Kulasek, TomaszX


> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Tuesday, October 27, 2015 17:59
> To: Kulasek, TomaszX
> Cc: dev at dpdk.org; Tetsuya Mukawa
> Subject: Re: [dpdk-dev] [PATCH v6 2/9] null: fix segfault when null_pmd
> added to bonding
> 
> Hi,
> There is no change in v6 for this patch which was acked by Tetsuya.
> So why not keep the Acked-by below your Signed-off-by?
> 
> It seems patches 2, 3, 4 and 5 were Acked by Tetsuya.
> Other acks I'm missing?
> 

Hi,

Patches 4 and 5 were changed following Tetsuya's suggestions and have already 
been reviewed. The changes are not big, but I'm not sure whether they should be 
re-acked by Tetsuya or whether I can just copy the ack.

Tomasz

[dpdk-dev] Wrong TCP Checkum computed by hardware

2015-10-28 Thread Padam Jeet Singh


> On 28-Oct-2015, at 1:31 pm, Liu, Jijiang  wrote:
> 
> 
>> -Original Message-
>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Padam Jeet Singh
>> Sent: Wednesday, October 28, 2015 3:20 PM
>> To: Matthew Hall
>> Cc: dev at dpdk.org
>> Subject: Re: [dpdk-dev] Wrong TCP Checkum computed by hardware
>> 
> 
> Did you set the txq_flags?
> If the flag is not set, the default value is IXGBE_SIMPLE_FLAGS, and TX 
> offloads do not work.
> 
> See the following code in the ixgbe_rxtx.c file:
> 
> /* Use a simple Tx queue (no offloads, no multi segs) if possible */
>   if (((txq->txq_flags & IXGBE_SIMPLE_FLAGS) == IXGBE_SIMPLE_FLAGS)
>   && (txq->tx_rs_thresh >= RTE_PMD_IXGBE_TX_MAX_BURST)) {
>   PMD_INIT_LOG(INFO, "Using simple tx code path");
>  ...
>   dev->tx_pkt_burst = ixgbe_xmit_pkts_simple;
>   } else {
>...
>   dev->tx_pkt_burst = ixgbe_xmit_pkts;
>   }
> 
> 
> --Jijiang Liu

I initialise the queue with the following structure:

static const struct rte_eth_txconf tx_conf = {
.tx_thresh = {
.pthresh = 32,  /* Ring prefetch threshold */
.hthresh = 0,   /* Ring host threshold */
.wthresh = 0,   /* Ring writeback threshold */
},
.tx_free_thresh = 0,/* Use PMD default values */
.tx_rs_thresh = 0,  /* Use PMD default values */
};

This would set the txq_flags to zero - so the tx_pkt_burst function would 
always point to ixgbe_xmit_pkts. Also, as observed only TCP checksum is 
computed wrong when there is VLAN TX Offload + IP Offload + TCP offload bits 
set.  VLAN TX Offload + IP Offload + TCP CKSUM in software generates correct 
packet on the wire.

Thanks,
Padam

[dpdk-dev] [PATCH v3 13/17] gcc/arm: avoid alignment errors to break build

2015-10-28 Thread David Marchand

On Tue, Oct 27, 2015 at 8:13 PM, Jan Viktorin 
wrote:

> There are several issues with alignment when compiling for ARMv7.
> They are not considered to be fatal (ARMv7 supports unaligned
> access of 32b words), so we just leave them as warnings. They
> should be solved later, however.
>
> Signed-off-by: Jan Viktorin 
> Signed-off-by: Vlastimil Kosar 
> ---
>  mk/toolchain/gcc/rte.vars.mk | 6 ++
>  1 file changed, 6 insertions(+)
>
> diff --git a/mk/toolchain/gcc/rte.vars.mk b/mk/toolchain/gcc/rte.vars.mk
> index 0f51c66..8f9c396 100644
> --- a/mk/toolchain/gcc/rte.vars.mk
> +++ b/mk/toolchain/gcc/rte.vars.mk
> @@ -77,6 +77,12 @@ WERROR_FLAGS += -Wcast-align -Wnested-externs -Wcast-qual
>  WERROR_FLAGS += -Wformat-nonliteral -Wformat-security
>  WERROR_FLAGS += -Wundef -Wwrite-strings
>
> +# There are many issues reported for ARMv7 architecture
> +# which are not necessarily fatal. Report as warnings.
> +ifeq ($(CONFIG_RTE_ARCH_ARMv7),y)
> +WERROR_FLAGS += -Wno-error
> +endif
> +
>

Can we disable only "known" problems ?

Something like :
WERROR_FLAGS += -Wno-error=cast-align


-- 
David Marchand
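
For context on the "known" problem named above: -Wcast-align typically fires when a byte pointer is cast to a wider type on a target with stricter alignment requirements. A minimal sketch of one common way to avoid the cast, assuming a hypothetical helper sketch_read_u32() that is not part of DPDK:

```c
#include <stdint.h>
#include <string.h>

/* Reads a 32-bit value from a possibly unaligned byte buffer.
 * A direct cast such as *(const uint32_t *)(buf + off) is the kind of
 * construct -Wcast-align tends to report; memcpy avoids the cast and
 * lets the compiler pick a safe load sequence. */
static uint32_t sketch_read_u32(const uint8_t *buf, size_t off)
{
	uint32_t v;

	memcpy(&v, buf + off, sizeof(v));
	return v;
}
```

Narrowing the makefile change to -Wno-error=cast-align keeps every other warning fatal while call sites like this are cleaned up over time.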

[dpdk-dev] Wrong TCP Checkum computed by hardware

2015-10-28 Thread Padam Jeet Singh

On 28-Oct-2015, at 12:27 pm, Matthew Hall  wrote:
> 
> On Wed, Oct 28, 2015 at 12:20:22PM +0530, Padam Jeet Singh wrote:
>> Any hint what could I be doing wrong here?
> 
> When this kind of stuff doesn't work it often will depend on the exact 
> version of card, chip, etc. if there are any errata.
> 

82599ES or ixgbe PMD has not had any bug fixes related to offload - at least 
none that I can see in the git commits.

One important finding: the issue only occurs when I do TX VLAN offload using 
PKT_TX_VLAN_PKT (filling vlan tci, l2_len and l3_len with the VLAN ID, 
sizeof(struct ether_hdr) and sizeof(struct ipv4_hdr) respectively).

So basically VLAN OFFLOAD + IP CSUM OFFLOAD + TCP CSUM OFFLOAD causes the TCP 
checksum to be computed wrong. VLAN Offload + IP CSUM Offload + TCP CSUM in 
software produces correct results. I now suspect this to be an ixgbe driver bug, 
as nothing in my code can trigger this behaviour.

> So you might want to collect the specifics of the board with lspci -v, 
> ethtool, and pulling it out to check the chip and board revisions.

02:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ 
Network Connection (rev 01)
Flags: bus master, fast devsel, latency 0, IRQ 32
Memory at c7d2 (64-bit, non-prefetchable) [size=128K]
Memory at c7d44000 (64-bit, non-prefetchable) [size=16K]
Capabilities: [40] Power Management version 3
Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
Capabilities: [70] MSI-X: Enable+ Count=64 Masked-
Capabilities: [a0] Express Endpoint, MSI 00
Capabilities: [e0] Vital Product Data
Capabilities: [100] Advanced Error Reporting
Capabilities: [140] Device Serial Number 00-90-0b-ff-ff-3f-19-d0
Capabilities: [150] Alternative Routing-ID Interpretation (ARI)
Capabilities: [160] Single Root I/O Virtualization (SR-IOV)
Kernel driver in use: igb_uio
Kernel modules: ixgbe

> 
> In addition check over the example apps and see how things work there 
> compared with your own code. Often the DPDK interfaces are kind of complex 
> and small pointer or mbuf manipulation mistakes can cause very odd results.
> 

None of the sample code addresses the scenario which I have = VLAN offload + IP 
+ TCP offload.

> Matthew.

Thanks,
Padam

[dpdk-dev] [PATCH v3] vhost: Fix wrong handling of virtqueue array index

2015-10-28 Thread Tetsuya Mukawa

The patch fixes wrong handling of virtqueue array index when
GET_VRING_BASE message comes.

Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_vhost/vhost_user/virtio-net-user.c | 10 +++---
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/lib/librte_vhost/vhost_user/virtio-net-user.c 
b/lib/librte_vhost/vhost_user/virtio-net-user.c
index a998ad8..d07452a 100644
--- a/lib/librte_vhost/vhost_user/virtio-net-user.c
+++ b/lib/librte_vhost/vhost_user/virtio-net-user.c
@@ -300,13 +300,9 @@ user_get_vring_base(struct vhost_device_ctx ctx,
 * sent and only sent in vhost_vring_stop.
 * TODO: cleanup the vring, it isn't usable since here.
 */
-   if (dev->virtqueue[state->index + VIRTIO_RXQ]->kickfd >= 0) {
-   close(dev->virtqueue[state->index + VIRTIO_RXQ]->kickfd);
-   dev->virtqueue[state->index + VIRTIO_RXQ]->kickfd = -1;
-   }
-   if (dev->virtqueue[state->index + VIRTIO_TXQ]->kickfd >= 0) {
-   close(dev->virtqueue[state->index + VIRTIO_TXQ]->kickfd);
-   dev->virtqueue[state->index + VIRTIO_TXQ]->kickfd = -1;
+   if (dev->virtqueue[state->index]->kickfd >= 0) {
+   close(dev->virtqueue[state->index]->kickfd);
+   dev->virtqueue[state->index]->kickfd = -1;
}

return 0;
-- 
2.1.4
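
The indexing issue the patch addresses can be sketched in isolation. This is an illustration under assumptions, not the vhost code: struct sketch_dev, SKETCH_NR_VRINGS and sketch_release_kickfd() are hypothetical, and the RXQ/TXQ offsets of 0 and 1 are assumed here. The point is that the index carried by the message already selects one vring, so no RXQ/TXQ offset is added on top of it.

```c
/* Assumed offsets within a queue pair, for illustration only. */
#define SKETCH_VIRTIO_RXQ 0
#define SKETCH_VIRTIO_TXQ 1
#define SKETCH_NR_VRINGS  4

/* Hypothetical device holding one kick fd per vring. */
struct sketch_dev {
	int kickfd[SKETCH_NR_VRINGS];
};

/* Invalidates the kick fd of exactly one vring, selected directly by
 * index. Returns the previous fd so the caller can close() it, or -1
 * when the index is out of range or the fd was already invalid. */
static int sketch_release_kickfd(struct sketch_dev *dev, unsigned index)
{
	int old;

	if (index >= SKETCH_NR_VRINGS)
		return -1;
	old = dev->kickfd[index];
	dev->kickfd[index] = -1;
	return old;
}
```

With the pre-patch expression, index + SKETCH_VIRTIO_TXQ would touch the neighbouring vring as well, and for the last index it would read past the end of the array.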

[dpdk-dev] [PATCH v2] vhost: Fix wrong handling of virtqueue array index

2015-10-28 Thread Tetsuya Mukawa

On 2015/10/28 12:35, Xie, Huawei wrote:
> Missed signoff?
> On 10/28/2015 11:00 AM, Tetsuya Mukawa wrote:
>> The patch fixes wrong handling of virtqueue array index when
>> GET_VRING_BASE message comes.
>> ---
>>  lib/librte_vhost/vhost_user/virtio-net-user.c | 10 +++---
>>  1 file changed, 3 insertions(+), 7 deletions(-)
>>
>>
Oops, I forgot it.

[dpdk-dev] [PATCH v3 3/3] example: add keep alive sample application

2015-10-28 Thread Tahhan, Maryam

> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Remy Horton
> Sent: Wednesday, October 28, 2015 8:52 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v3 3/3] example: add keep alive sample
> application
> 
> Modification of l2fwd to demonstrate keep-alive functionality.
> 
> Signed-off-by: Remy Horton 
> ---

Acked-by: Maryam Tahhan

[dpdk-dev] [PATCH v3 1/3] rte: add keep alive functionality

2015-10-28 Thread Tahhan, Maryam

> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Remy Horton
> Sent: Wednesday, October 28, 2015 8:52 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v3 1/3] rte: add keep alive functionality
> 
> Adds functions for detecting and reporting the live-ness of LCores, the
> primary requirement of which is minimal overheads for the
> core(s) being checked. Core failures are notified via an application defined
> callback.
> 
> Signed-off-by: Remy Horton 
> ---

Acked-by: Maryam Tahhan

[dpdk-dev] |ERROR| pw 8090-8106 acl: handle when SSE 4.1 is unsupported

2015-10-28 Thread Jan Viktorin

On Wed, 28 Oct 2015 11:17:37 +
"Ananyev, Konstantin"  wrote:

> Hi Jan,
> 
> > -Original Message-
> > From: Jan Viktorin [mailto:viktorin at rehivetech.com]
> > Sent: Wednesday, October 28, 2015 11:00 AM
> > To: Ananyev, Konstantin
> > Cc: dev at dpdk.org
> > Subject: Fw: |ERROR| pw 8090-8106 acl: handle when SSE 4.1 is unsupported
> > 
> > Hello Konstantin,
> > 
> > the ACL patch (with changes as you suggested yesterday) breaks the
> > following test. I am confused about this. There is nothing in the log
> > that would help me to determine the source of the problem. How is this
> > test related to ACL? Moreover, I cannot see the actual packet drop
> > there, what packet has been dropped?  
> 
> I am also confused - testpmd doesn't use librte_acl at all.
> Are there any other changes in your patch set?
> Something related to vector intrinsics emulation?
> As they are now in rte_vect.h and are seen not only by the LPM library?

So, is it related to the whole ARMv7 patch set? I thought it was only this
single patch... The only other thing I've changed is the LPM, where for
x86/64 it keeps the previous implementation and for other platforms it
introduces a workaround without SSE. The new rte_vect.h is ARM-only.

Jan

> Perhaps someone from the STV team can explain exactly which packet fails?
> Konstantin
> 
> > 
> > Regards
> > Jan
> > 
> > Begin forwarded message:
> > 
> > Date: 27 Oct 2015 13:44:40 -0700
> > From: sys_stv at intel.com
> > To: test-report at dpdk.org,viktorin at rehivetech.com
> > Subject: |ERROR| pw 8090-8106  acl: handle when SSE 4.1 is unsupported
> > 
> > 
> > Test-Label: Intel Niantic on Fedora
> > Test-Status: ERROR
> > Patchwork: http://www.dpdk.org/dev/patchwork/patch/8106/
> > 
> > DPDK git baseline: affc455438f4cbd3b14e2d9a24fbc154e22d68d3
> > Patchwork ID: 8090-8106
> > http://www.dpdk.org/dev/patchwork/patch/8106/
> > Submitter: Jan Viktorin 
> > Date: Tue, 27 Oct 2015 20:13:49 +0100
> > 
> > Compilation:
> > OS: fedora
> > Nic: niantic
> > i686-native-linuxapp-gcc: compile pass
> > x86_64-native-linuxapp-gcc: compile pass
> > 
> > DTS validation:
> > OS: fedora
> > Nic: Niantic
> > TestType: auto
> > GCC: 4
> > x86_64-native-linuxapp-gcc:  total 75, passed 74, failed 1.
> > Failed Case List:
> > Target: x86_64-native-linuxapp-gcc
> > OS: fedora
> > Failed DTS case:
> > checksum_offload_with_vlan:
> > http://dpdk.org/browse/tools/dts/tree/test_plans/checksum_offload_test_plan.rst
> > 
> > DTS Validation Error:
> > ==
> > ==
> > ==
> > ==
> > TEST SUITE : TestChecksumOffload
> > 
> > ---
> > Begin: Test Casetest_checksum_offload_with_vlan
> > --
> > FAILED  'Unexpected Packets Drop'
> > --
> > [   SUITE_DUT_CMD]  ./x86_64-native-linuxapp-gcc/app/testpmd -c 0x6 -n
> > 4  -- -i --portmask=0x3 --disable-hw-vlan --enable-rx-cksum --crc-strip
> > --txqflags=0 [   SUITE_DUT_CMD]  set verbose 1 [   SUITE_DUT_CMD]  set
> > fwd csum [   SUITE_DUT_CMD]  csum set ip hw 0
> > [   SUITE_DUT_CMD]  csum set udp hw 0
> > [   SUITE_DUT_CMD]  csum set tcp hw 0
> > [   SUITE_DUT_CMD]  csum set sctp hw 0
> > [   SUITE_DUT_CMD]  csum set ip hw 1
> > [   SUITE_DUT_CMD]  csum set udp hw 1
> > [   SUITE_DUT_CMD]  csum set tcp hw 1
> > [   SUITE_DUT_CMD]  csum set sctp hw 1
> > [   SUITE_DUT_CMD]  start
> > [SUITE_TESTER_CMD]  scapy
> > [SUITE_TESTER_CMD]  sys.path.append("./")
> > [SUITE_TESTER_CMD]  from sctp import *
> > [SUITE_TESTER_CMD]  p = Ether(dst="90:e2:ba:4a:54:81",
> > src="52:00:00:00:00:00")/IPv6(src="::2")/UDP()/("X"*46)
> > [SUITE_TESTER_CMD]  p.show2() [SUITE_TESTER_CMD]  p =
> > Ether(dst="90:e2:ba:4a:54:81",
> > src="52:00:00:00:00:00")/IP(src="127.0.0.2")/SCTP()/("X"*48)
> > [SUITE_TESTER_CMD]  p.show2() [SUITE_TESTER_CMD]  p =
> > Ether(dst="90:e2:ba:4a:54:81",
> > src="52:00:00:00:00:00")/IPv6(src="::2")/TCP()/("X"*46)
> > [SUITE_TESTER_CMD]  p.show2() [SUITE_TESTER_CMD]  p =
> > Ether(dst="90:e2:ba:4a:54:81",
> > src="52:00:00:00:00:00")/IP(src="127.0.0.2")/UDP()/("X"*46)
> > [SUITE_TESTER_CMD]  p.show2() [SUITE_TESTER_CMD]  p =
> > Ether(dst="90:e2:ba:4a:54:81",
> > src="52:00:00:00:00:00")/IP(src="127.0.0.2")/TCP()/("X"*46)
> > [SUITE_TESTER_CMD]  p.show2() [SUITE_TESTER_CMD]  exit()
> > [SUITE_TESTER_CMD]  echo -n '' >  scapyResult.txt [SUITE_TESTER_CMD]
> > scp sniff.py root at 10.239.128.80:~/ [SUITE_TESTER_CMD]  SCAPY Receive
> > setup: [SUITE_TESTER_CMD]  killall scapy 2>/dev/null; echo tester
> > [SUITE_TESTER_CMD]  scapy [SUITE_TESTER_CMD]  subprocess.call("scapy -c
> > sniff.py &", shell=True) [SUITE_TESTER_CMD]  sys.path.append("./")
> > [SUITE_TESTER_CMD]  import sctp [SUIT

[dpdk-dev] Wrong TCP Checkum computed by hardware

2015-10-28 Thread Padam Jeet Singh

Hi,

While implementing sample packet generation code, I am setting up the offload 
flags so that the hardware (82599ES in my case) computes the IP and TCP 
checksums. However, it seems that only the IP checksum is computed correctly; 
the TCP checksum is computed, but the value is wrong. Here is a snippet of what 
I am doing:

/* First reset all offload flags */
mbuf->ol_flags = (uint16_t) (mbuf->ol_flags & (~PKT_TX_OFFLOAD_MASK));
/* Then enable IP Checksum offload */
mbuf->ol_flags |= PKT_TX_IP_CKSUM;
/* This is required to be set to 0 */
ipv4hdr->hdr_checksum = 0;
mbuf->ol_flags |= PKT_TX_TCP_CKSUM;
/* Compute the PSD header */
tcphdr->cksum = rte_ipv4_phdr_cksum(ipv4hdr);

When I replace the checksum computation from hardware-offload to a software 
implementation, the checksum computed is correct. I have verified both the 
cases with Wireshark on the system receiving the packets.

Any hint as to what I could be doing wrong here?

Thanks,
Padam
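
For reference when comparing the hardware and software results, the value seeded into the TCP checksum field before offload is a one's-complement sum over the IPv4 pseudo header. The following is a self-contained sketch of that arithmetic, not the DPDK implementation: sketch_fold() and sketch_ipv4_phdr_sum() are hypothetical helpers, and all values are kept in host byte order for readability.

```c
#include <stdint.h>

/* Folds a 32-bit accumulator into a 16-bit one's-complement sum. */
static uint16_t sketch_fold(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Non-complemented one's-complement sum of the IPv4 pseudo header:
 * source address, destination address, zero byte + protocol, L4 length.
 * Inputs and result are in host byte order in this sketch; real packet
 * code works on network-order header fields. */
static uint16_t sketch_ipv4_phdr_sum(uint32_t src, uint32_t dst,
				     uint8_t proto, uint16_t l4_len)
{
	uint32_t sum = 0;

	sum += src >> 16;
	sum += src & 0xffff;
	sum += dst >> 16;
	sum += dst & 0xffff;
	sum += proto;      /* zero byte and protocol form one 16-bit word */
	sum += l4_len;     /* TCP header plus payload length */
	return sketch_fold(sum);
}
```

For example, 192.168.0.1 to 192.168.0.2 with protocol 6 and an L4 length of 20 gives 0x816E under this sketch; a capture of the seeded field can be checked against a value computed this way.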

[dpdk-dev] [PATCH v3 06/11] igbvf: add xstats() implementation

2015-10-28 Thread Tahhan, Maryam

> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Harry van Haaren
> Sent: Thursday, October 22, 2015 4:48 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v3 06/11] igbvf: add xstats() implementation
> 
> Add xstats functionality to igbvf PMD, adding necessary statistic strings.
> 
> Signed-off-by: Harry van Haaren 
> ---

Acked-by: Maryam Tahhan

[dpdk-dev] [PATCH v3 08/11] ixgbevf: add xstats() functions to VF

2015-10-28 Thread Tahhan, Maryam

> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Harry van Haaren
> Sent: Thursday, October 22, 2015 4:49 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v3 08/11] ixgbevf: add xstats() functions to VF
> 
> Add xstats() functions and stat strings as necessary to ixgbevf PMD.
> 
> Signed-off-by: Harry van Haaren 
> ---

Acked-by: Maryam Tahhan

[dpdk-dev] [PATCH v3 10/11] i40evf: add xstats() implementation

2015-10-28 Thread Tahhan, Maryam

> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Harry van Haaren
> Sent: Thursday, October 22, 2015 4:49 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v3 10/11] i40evf: add xstats() implementation
> 
> Add implementation of xstats() functions in i40evf PMD.
> 
> Signed-off-by: Harry van Haaren 
> ---

Acked-by: Maryam Tahhan

[dpdk-dev] Re: [PATCH] lib/lpm:fix two issues in the delete_depth_small()

2015-10-28 Thread 洪余柯(洪余柯)

Acked-by: yuke.hyk at alibaba-inc.com

--
From: nana.nn [mailto:nana.nn at alibaba-inc.com] 
Sent: October 28, 2015 12:02
To: "???(??)"
Cc: dev at dpdk.org; Jijiang Liu
Subject: Re: [PATCH] lib/lpm:fix two issues in the delete_depth_small()

Hi,
 yuke, please acked-by~ 
On Oct 28, 2015, at 11:44 AM, Jijiang Liu  wrote:

> Fix two issues in the delete_depth_small() function.
> 
> 1> The control is not strict in this function.
> 
> In the following structure,
> struct rte_lpm_tbl24_entry {
>union {
>uint8_t next_hop;
>uint8_t tbl8_gindex;
>};
> uint8_t ext_entry :1;
> }
> 
> When ext_entry = 0, only next_hop is used, to process the rte_lpm_tbl24_entry.
> 
> When ext_entry = 1, tbl8_gindex is used to process the rte_lpm_tbl8_entry.
> 
> When using the LPM24 + 8 algorithm, ext_entry decides whether the 
> rte_lpm_tbl24_entry structure or the rte_lpm_tbl8_entry structure is processed. 
> If a route is deleted, the prefix of the previous route is used to override the 
> deleted route. When (lpm->tbl24[i].ext_entry == 0 && lpm->tbl24[i].depth > 
> depth), the entry should be ignored, but due to the incorrect logic, next_hop 
> is used as tbl8_gindex and the rte_lpm_tbl8_entry is processed.
> 
> 2> Initialization of rte_lpm_tbl8_entry is incorrect in this function 
> 
> In this function, a new rte_lpm_tbl8_entry, which we call A, replaces the old 
> rte_lpm_tbl8_entry. But valid_group is not set to VALID, so it will be 
> INVALID.
> Then, when adding a new route whose depth is > 24, the tbl8_alloc() function 
> searches the rte_lpm_tbl8_entry array for an INVALID valid_group and returns 
> A to the add_depth_big function, so A's data is overridden.
> 
> Signed-off-by: NaNa 
> 
> ---
> lib/librte_lpm/rte_lpm.c |7 +++
> 1 files changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/lib/librte_lpm/rte_lpm.c b/lib/librte_lpm/rte_lpm.c
> index 163ba3c..3981452 100644
> --- a/lib/librte_lpm/rte_lpm.c
> +++ b/lib/librte_lpm/rte_lpm.c
> @@ -734,8 +734,7 @@ delete_depth_small(struct rte_lpm *lpm, uint32_t 
> ip_masked,
>   if (lpm->tbl24[i].ext_entry == 0 &&
>   lpm->tbl24[i].depth <= depth ) {
>   lpm->tbl24[i].valid = INVALID;
> - }
> - else {
> + } else if (lpm->tbl24[i].ext_entry == 1) {
>   /*
>* If TBL24 entry is extended, then there has
>* to be a rule with depth >= 25 in the
> @@ -770,6 +769,7 @@ delete_depth_small(struct rte_lpm *lpm, uint32_t 
> ip_masked,
> 
>   struct rte_lpm_tbl8_entry new_tbl8_entry = {
>   .valid = VALID,
> + .valid_group = VALID,
>   .depth = sub_rule_depth,
>   .next_hop = lpm->rules_tbl
>   [sub_rule_index].next_hop,
> @@ -780,8 +780,7 @@ delete_depth_small(struct rte_lpm *lpm, uint32_t 
> ip_masked,
>   if (lpm->tbl24[i].ext_entry == 0 &&
>   lpm->tbl24[i].depth <= depth ) {
>   lpm->tbl24[i] = new_tbl24_entry;
> - }
> - else {
> + } else  if (lpm->tbl24[i].ext_entry == 1) {
>   /*
>* If TBL24 entry is extended, then there has
>* to be a rule with depth >= 25 in the
> -- 
> 1.7.7.6
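
The three-way decision that the first fix introduces can be sketched on its own. This is an illustration under assumptions: struct sketch_tbl24_entry is a simplified, hypothetical copy of the entry layout quoted above (the valid and depth bit-fields are assumed), and sketch_delete_action() only names the branch taken; it does not modify any table.

```c
#include <stdint.h>

/* Simplified, hypothetical copy of the tbl24 entry described above. */
struct sketch_tbl24_entry {
	union {
		uint8_t next_hop;     /* meaningful when ext_entry == 0 */
		uint8_t tbl8_gindex;  /* meaningful when ext_entry == 1 */
	};
	uint8_t valid     :1;
	uint8_t ext_entry :1;
	uint8_t depth     :6;
};

enum sketch_action {
	SKETCH_INVALIDATE,  /* plain entry no deeper than the deleted rule */
	SKETCH_WALK_TBL8,   /* extended entry: follow tbl8_gindex */
	SKETCH_SKIP         /* plain entry from a deeper rule: leave it alone */
};

/* Mirrors the patched if / else-if chain: the tbl8 branch is taken only
 * when ext_entry is really set, so next_hop is never misread as a tbl8
 * group index. */
static enum sketch_action
sketch_delete_action(const struct sketch_tbl24_entry *e, uint8_t depth)
{
	if (e->ext_entry == 0 && e->depth <= depth)
		return SKETCH_INVALIDATE;
	if (e->ext_entry == 1)
		return SKETCH_WALK_TBL8;
	return SKETCH_SKIP;
}
```

Before the patch, the plain "else" sent the SKETCH_SKIP case down the tbl8 branch, which is the misinterpretation the commit message describes.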

[dpdk-dev] Fw: |ERROR| pw 8090-8106 acl: handle when SSE 4.1 is unsupported

2015-10-28 Thread Jan Viktorin

Hello Konstantin,

the ACL patch (with changes as you suggested yesterday) breaks the
following test. I am confused about this. There is nothing in the log
that would help me to determine the source of the problem. How is this
test related to ACL? Moreover, I cannot see the actual packet drop
there, what packet has been dropped?

Regards
Jan

Begin forwarded message:

Date: 27 Oct 2015 13:44:40 -0700
From: sys_...@intel.com
To: test-report at dpdk.org,viktorin at rehivetech.com
Subject: |ERROR| pw 8090-8106  acl: handle when SSE 4.1 is unsupported


Test-Label: Intel Niantic on Fedora
Test-Status: ERROR
Patchwork: http://www.dpdk.org/dev/patchwork/patch/8106/

DPDK git baseline: affc455438f4cbd3b14e2d9a24fbc154e22d68d3
Patchwork ID: 8090-8106
http://www.dpdk.org/dev/patchwork/patch/8106/
Submitter: Jan Viktorin 
Date: Tue, 27 Oct 2015 20:13:49 +0100

Compilation:
OS: fedora
Nic: niantic
i686-native-linuxapp-gcc: compile pass
x86_64-native-linuxapp-gcc: compile pass

DTS validation: 
OS: fedora
Nic: Niantic
TestType: auto
GCC: 4
x86_64-native-linuxapp-gcc:  total 75, passed 74, failed 1.
Failed Case List:
Target: x86_64-native-linuxapp-gcc
OS: fedora
Failed DTS case: 
checksum_offload_with_vlan:
http://dpdk.org/browse/tools/dts/tree/test_plans/checksum_offload_test_plan.rst

DTS Validation Error:


TEST SUITE : TestChecksumOffload

---
Begin: Test Case test_checksum_offload_with_vlan
--
FAILED  'Unexpected Packets Drop'
--
[   SUITE_DUT_CMD]  ./x86_64-native-linuxapp-gcc/app/testpmd -c 0x6 -n
4  -- -i --portmask=0x3 --disable-hw-vlan --enable-rx-cksum --crc-strip
--txqflags=0 [   SUITE_DUT_CMD]  set verbose 1 [   SUITE_DUT_CMD]  set
fwd csum [   SUITE_DUT_CMD]  csum set ip hw 0
[   SUITE_DUT_CMD]  csum set udp hw 0
[   SUITE_DUT_CMD]  csum set tcp hw 0
[   SUITE_DUT_CMD]  csum set sctp hw 0
[   SUITE_DUT_CMD]  csum set ip hw 1
[   SUITE_DUT_CMD]  csum set udp hw 1
[   SUITE_DUT_CMD]  csum set tcp hw 1
[   SUITE_DUT_CMD]  csum set sctp hw 1
[   SUITE_DUT_CMD]  start
[SUITE_TESTER_CMD]  scapy
[SUITE_TESTER_CMD]  sys.path.append("./")
[SUITE_TESTER_CMD]  from sctp import *
[SUITE_TESTER_CMD]  p = Ether(dst="90:e2:ba:4a:54:81",
src="52:00:00:00:00:00")/IPv6(src="::2")/UDP()/("X"*46)
[SUITE_TESTER_CMD]  p.show2() [SUITE_TESTER_CMD]  p =
Ether(dst="90:e2:ba:4a:54:81",
src="52:00:00:00:00:00")/IP(src="127.0.0.2")/SCTP()/("X"*48)
[SUITE_TESTER_CMD]  p.show2() [SUITE_TESTER_CMD]  p =
Ether(dst="90:e2:ba:4a:54:81",
src="52:00:00:00:00:00")/IPv6(src="::2")/TCP()/("X"*46)
[SUITE_TESTER_CMD]  p.show2() [SUITE_TESTER_CMD]  p =
Ether(dst="90:e2:ba:4a:54:81",
src="52:00:00:00:00:00")/IP(src="127.0.0.2")/UDP()/("X"*46)
[SUITE_TESTER_CMD]  p.show2() [SUITE_TESTER_CMD]  p =
Ether(dst="90:e2:ba:4a:54:81",
src="52:00:00:00:00:00")/IP(src="127.0.0.2")/TCP()/("X"*46)
[SUITE_TESTER_CMD]  p.show2() [SUITE_TESTER_CMD]  exit()
[SUITE_TESTER_CMD]  echo -n '' >  scapyResult.txt [SUITE_TESTER_CMD]
scp sniff.py root at 10.239.128.80:~/ [SUITE_TESTER_CMD]  SCAPY Receive
setup: [SUITE_TESTER_CMD]  killall scapy 2>/dev/null; echo tester
[SUITE_TESTER_CMD]  scapy [SUITE_TESTER_CMD]  subprocess.call("scapy -c
sniff.py &", shell=True) [SUITE_TESTER_CMD]  sys.path.append("./")
[SUITE_TESTER_CMD]  import sctp [SUITE_TESTER_CMD]  from sctp import *
[SUITE_TESTER_CMD]  sendp([Ether(dst="90:e2:ba:4a:54:81",
src="52:00:00:00:00:00")/IPv6(src="::1")/UDP(chksum=0xf)/("X"*46)],
iface="p785p2") [SUITE_TESTER_CMD]
sendp([Ether(dst="90:e2:ba:4a:54:81",
src="52:00:00:00:00:00")/IP(chksum=0x0)/SCTP(chksum=0xf)/("X"*48)],
iface="p785p2") [SUITE_TESTER_CMD]
sendp([Ether(dst="90:e2:ba:4a:54:81",
src="52:00:00:00:00:00")/IPv6(src="::1")/TCP(chksum=0xf)/("X"*46)],
iface="p785p2") [SUITE_TESTER_CMD]
sendp([Ether(dst="90:e2:ba:4a:54:81",
src="52:00:00:00:00:00")/IP(chksum=0x0)/UDP(chksum=0xf)/("X"*46)],
iface="p785p2") [SUITE_TESTER_CMD]
sendp([Ether(dst="90:e2:ba:4a:54:81",
src="52:00:00:00:00:00")/IP(chksum=0x0)/TCP(chksum=0xf)/("X"*46)],
iface="p785p2") [SUITE_TESTER_CMD]  exit() [SUITE_TESTER_CMD]  cat
scapyResult.txt [SUITE_TESTER_CMD]  SCAPY Result: End
test_checksum_offload_with_vlan
---
[SUITE_DUT_CMD] quit



DPDK STV team 


-- 
  Jan Viktorin             E-mail: Viktorin at RehiveTech.com
  System Architect         Web:    www.RehiveTech.com
  RehiveTech
  Brno, Czech Republic

[dpdk-dev] [PATCH v2] vhost: Fix wrong handling of virtqueue array index

2015-10-28 Thread Tetsuya Mukawa

The patch fixes wrong handling of virtqueue array index when
GET_VRING_BASE message comes.
---
 lib/librte_vhost/vhost_user/virtio-net-user.c | 10 +++---
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/lib/librte_vhost/vhost_user/virtio-net-user.c 
b/lib/librte_vhost/vhost_user/virtio-net-user.c
index a998ad8..d07452a 100644
--- a/lib/librte_vhost/vhost_user/virtio-net-user.c
+++ b/lib/librte_vhost/vhost_user/virtio-net-user.c
@@ -300,13 +300,9 @@ user_get_vring_base(struct vhost_device_ctx ctx,
 * sent and only sent in vhost_vring_stop.
 * TODO: cleanup the vring, it isn't usable since here.
 */
-   if (dev->virtqueue[state->index + VIRTIO_RXQ]->kickfd >= 0) {
-   close(dev->virtqueue[state->index + VIRTIO_RXQ]->kickfd);
-   dev->virtqueue[state->index + VIRTIO_RXQ]->kickfd = -1;
-   }
-   if (dev->virtqueue[state->index + VIRTIO_TXQ]->kickfd >= 0) {
-   close(dev->virtqueue[state->index + VIRTIO_TXQ]->kickfd);
-   dev->virtqueue[state->index + VIRTIO_TXQ]->kickfd = -1;
+   if (dev->virtqueue[state->index]->kickfd >= 0) {
+   close(dev->virtqueue[state->index]->kickfd);
+   dev->virtqueue[state->index]->kickfd = -1;
}

return 0;
-- 
2.1.4

[dpdk-dev] |ERROR| pw 8090-8106 acl: handle when SSE 4.1 is unsupported

2015-10-28 Thread Ananyev, Konstantin



> -Original Message-
> From: Jan Viktorin [mailto:viktorin at rehivetech.com]
> Sent: Wednesday, October 28, 2015 11:22 AM
> To: Ananyev, Konstantin
> Cc: dev at dpdk.org; Cao, Waterman
> Subject: Re: |ERROR| pw 8090-8106 acl: handle when SSE 4.1 is unsupported
> 
> On Wed, 28 Oct 2015 11:17:37 +
> "Ananyev, Konstantin"  wrote:
> 
> > Hi Jan,
> >
> > > -Original Message-
> > > From: Jan Viktorin [mailto:viktorin at rehivetech.com]
> > > Sent: Wednesday, October 28, 2015 11:00 AM
> > > To: Ananyev, Konstantin
> > > Cc: dev at dpdk.org
> > > Subject: Fw: |ERROR| pw 8090-8106 acl: handle when SSE 4.1 is unsupported
> > >
> > > Hello Konstantin,
> > >
> > > the ACL patch (with changes as you suggested yesterday) breaks the
> > > following test. I am confused about this. There is nothing in the log
> > > that would help me to determine the source of the problem. How is this
> > > test related to ACL? Moreover, I cannot see the actual packet drop
> > > there, what packet has been dropped?
> >
> > I am also confused - testpmd doesn't use librte_acl at all.
> > Are there any other changes in your patch set?
> > Something related to vector intrinsics emulation?
> > As they are now in rte_vect.h and are seen not only by the LPM library?
> 
> So, it is related to all the ARMv7 patch set? I thought it is only this
> single patch... The only other thing I've changed is the LPM where for
> x86/64 it keeps the previous implementation and for other platforms it
> introduces a workaround without SSE. The new rte_vect.h is ARM-only.

Yep, it should be arm-only...

> 
> Jan
> 
> > Perhaps someone from the STC team can explain exactly which packet fails?
> > Konstantin
> >
> > >
> > > Regards
> > > Jan
> > >
> > > Begin forwarded message:
> > >
> > > Date: 27 Oct 2015 13:44:40 -0700
> > > From: sys_stv at intel.com
> > > To: test-report at dpdk.org,viktorin at rehivetech.com
> > > Subject: |ERROR| pw 8090-8106  acl: handle when SSE 4.1 is unsupported
> > >
> > >
> > > Test-Label: Intel Niantic on Fedora
> > > Test-Status: ERROR
> > > Patchwork: http://www.dpdk.org/dev/patchwork/patch/8106/
> > >
> > > DPDK git baseline: affc455438f4cbd3b14e2d9a24fbc154e22d68d3
> > > Patchwork ID: 8090-8106
> > > http://www.dpdk.org/dev/patchwork/patch/8106/
> > > Submitter: Jan Viktorin 
> > > Date: Tue, 27 Oct 2015 20:13:49 +0100
> > >
> > > Compilation:
> > > OS: fedora
> > > Nic: niantic
> > > i686-native-linuxapp-gcc: compile pass
> > > x86_64-native-linuxapp-gcc: compile pass
> > >
> > > DTS validation:
> > > OS: fedora
> > > Nic: Niantic
> > > TestType: auto
> > > GCC: 4
> > > x86_64-native-linuxapp-gcc:  total 75, passed 74, failed 1.
> > > Failed Case List:
> > > Target: x86_64-native-linuxapp-gcc
> > > OS: fedora
> > > Failed DTS case:
> > > checksum_offload_with_vlan:
> > > http://dpdk.org/browse/tools/dts/tree/test_plans/checksum_offload_test_plan.rst
> > >
> > > DTS Validation Error:
> > >
> > > ==
> > > ==
> > > ==
> > > ==
> > > TEST SUITE : TestChecksumOffload
> > >
> > > ---
> > > Begin: Test Case test_checksum_offload_with_vlan
> > > --
> > > FAILED  'Unexpected Packets Drop'
> > > --
> > > [   SUITE_DUT_CMD]  ./x86_64-native-linuxapp-gcc/app/testpmd -c 0x6 -n
> > > 4  -- -i --portmask=0x3 --disable-hw-vlan --enable-rx-cksum --crc-strip
> > > --txqflags=0 [   SUITE_DUT_CMD]  set verbose 1 [   SUITE_DUT_CMD]  set
> > > fwd csum [   SUITE_DUT_CMD]  csum set ip hw 0
> > > [   SUITE_DUT_CMD]  csum set udp hw 0
> > > [   SUITE_DUT_CMD]  csum set tcp hw 0
> > > [   SUITE_DUT_CMD]  csum set sctp hw 0
> > > [   SUITE_DUT_CMD]  csum set ip hw 1
> > > [   SUITE_DUT_CMD]  csum set udp hw 1
> > > [   SUITE_DUT_CMD]  csum set tcp hw 1
> > > [   SUITE_DUT_CMD]  csum set sctp hw 1
> > > [   SUITE_DUT_CMD]  start
> > > [SUITE_TESTER_CMD]  scapy
> > > [SUITE_TESTER_CMD]  sys.path.append("./")
> > > [SUITE_TESTER_CMD]  from sctp import *
> > > [SUITE_TESTER_CMD]  p = Ether(dst="90:e2:ba:4a:54:81",
> > > src="52:00:00:00:00:00")/IPv6(src="::2")/UDP()/("X"*46)
> > > [SUITE_TESTER_CMD]  p.show2() [SUITE_TESTER_CMD]  p =
> > > Ether(dst="90:e2:ba:4a:54:81",
> > > src="52:00:00:00:00:00")/IP(src="127.0.0.2")/SCTP()/("X"*48)
> > > [SUITE_TESTER_CMD]  p.show2() [SUITE_TESTER_CMD]  p =
> > > Ether(dst="90:e2:ba:4a:54:81",
> > > src="52:00:00:00:00:00")/IPv6(src="::2")/TCP()/("X"*46)
> > > [SUITE_TESTER_CMD]  p.show2() [SUITE_TESTER_CMD]  p =
> > > Ether(dst="90:e2:ba:4a:54:81",
> > > src="52:00:00:00:00:00")/IP(src="127.0.0.2")/UDP()/("X"*46)
> > > [SUITE_TESTER_CMD]  p.show2() [SUITE_TESTER_CMD]  p =
> > > Ether(dst="90:e2:ba:4a:54:81",
> > >

[dpdk-dev] [PATCH v3 01/17] mk: Introduce ARMv7 architecture

2015-10-28 Thread Jan Viktorin

On Wed, 28 Oct 2015 11:09:21 +0100
David Marchand  wrote:

> Hello Jan,
> 
> On Tue, Oct 27, 2015 at 8:13 PM, Jan Viktorin 
> wrote:
> 
> >
> > diff --git a/config/defconfig_arm-armv7-a-linuxapp-gcc
> > b/config/defconfig_arm-armv7-a-linuxapp-gcc
> > new file mode 100644
> > index 000..5a778cf
> > --- /dev/null
> > +++ b/config/defconfig_arm-armv7-a-linuxapp-gcc
> > +
> > +# avoids using i686/x86_64 SIMD instructions, nothing for ARM
> > +CONFIG_RTE_BITMAP_OPTIMIZATIONS=0
> >  
> 
> (yet another build flag which has to disappear, and bitmap
> header should be moved from librte_sched to eal with arch-specific
> implementations when applicable)
> 
> Well, I am a bit confused by this comment.
> For me, gcc provides ctzll builtins.
> https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html
> 
> And with your patchset applied, it builds fine with
> RTE_BITMAP_OPTIMIZATIONS enabled using gcc 4.7.3 for arm on ubuntu 14.04.
> Is there a dependency on gcc version ?

It seems there is no need for this. I will remove it; DPDK compiles
fine.
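
For reference, the GCC builtin mentioned above can be used as in the following sketch. It assumes GCC or Clang. The function toy_scan() is a hypothetical helper written for this note and is not the rte_bitmap API. Note that __builtin_ctzll() is undefined for a zero argument, so empty slabs are skipped first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of rte_bitmap): return the position of
 * the first set bit across n 64-bit slabs, or -1 if no bit is set.
 */
static int toy_scan(const uint64_t *slabs, size_t n)
{
    assert(slabs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (slabs[i] != 0) {
            /* Count trailing zeros only on a non-zero slab. */
            return (int)(i * 64 + (size_t)__builtin_ctzll(slabs[i]));
        }
    }
    return -1;
}
```

Nothing in this sketch is x86-specific; the compiler selects a suitable instruction sequence for the target.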

> 
> 
> +# PCI is usually not used on ARM
> > +CONFIG_RTE_EAL_IGB_UIO=n
> >  
> 
> Not sure "usually not used" is a good reason to disable something.
> Is there a real issue on arm with igb_uio code (compilation, pci accesses) ?
> 

Well, it requires setting some options in the Linux kernel (at least PCI
support) which are usually disabled by the in-kernel *arm*_defconfigs.
Moreover, it seems I cannot enable it for some ARM architectures (I've
tried Altera SoC FPGA). That's because you can hardly find an ARMv7 system
with a PCI bus. I suppose that if somebody _really_ needs this, she would
enable it by hand.

At the moment, it breaks my common builds... The driver is mostly
useless on ARMv7 and just takes space in the filesystem.

> 
> Thanks.
> 

Regards
Jan



-- 
  Jan Viktorin                E-mail: Viktorin at RehiveTech.com
  System Architect            Web:    www.RehiveTech.com
  RehiveTech
  Brno, Czech Republic

[dpdk-dev] [PATCH] lib/lpm:fix two issues in the delete_depth_small()

2015-10-28 Thread Jijiang Liu

Fix two issues in the delete_depth_small() function.

1> The branch condition in this function is not strict enough.

Consider the following structure:
struct rte_lpm_tbl24_entry {
        union {
                uint8_t next_hop;
                uint8_t tbl8_gindex;
        };
        uint8_t ext_entry :1;
};

When ext_entry = 0, only next_hop is used, and the rte_lpm_tbl24_entry
itself is processed.

When ext_entry = 1, tbl8_gindex is used to process the rte_lpm_tbl8_entry.

The LPM24 + 8 algorithm uses ext_entry to decide whether to process the
rte_lpm_tbl24_entry structure or the rte_lpm_tbl8_entry structure.
When a route is deleted, the prefix of the previous route is used to
override the deleted route. An entry with
(lpm->tbl24[i].ext_entry == 0 && lpm->tbl24[i].depth > depth)
should be ignored, but due to the incorrect logic its next_hop is used as
tbl8_gindex and the rte_lpm_tbl8_entry is processed.

2> Initialization of rte_lpm_tbl8_entry is incorrect in this function.

This function uses a new rte_lpm_tbl8_entry, which we call A, to replace the
old rte_lpm_tbl8_entry. But valid_group is not set to VALID, so it is INVALID.
Then, when a new route with depth > 24 is added, the tbl8_alloc() function
searches the rte_lpm_tbl8_entry array for an INVALID valid_group and
returns A to the add_depth_big() function, so A's data is overwritten.

Signed-off-by: NaNa 

---
 lib/librte_lpm/rte_lpm.c |7 +++
 1 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/lib/librte_lpm/rte_lpm.c b/lib/librte_lpm/rte_lpm.c
index 163ba3c..3981452 100644
--- a/lib/librte_lpm/rte_lpm.c
+++ b/lib/librte_lpm/rte_lpm.c
@@ -734,8 +734,7 @@ delete_depth_small(struct rte_lpm *lpm, uint32_t ip_masked,
if (lpm->tbl24[i].ext_entry == 0 &&
lpm->tbl24[i].depth <= depth ) {
lpm->tbl24[i].valid = INVALID;
-   }
-   else {
+   } else if (lpm->tbl24[i].ext_entry == 1) {
/*
 * If TBL24 entry is extended, then there has
 * to be a rule with depth >= 25 in the
@@ -770,6 +769,7 @@ delete_depth_small(struct rte_lpm *lpm, uint32_t ip_masked,

struct rte_lpm_tbl8_entry new_tbl8_entry = {
.valid = VALID,
+   .valid_group = VALID,
.depth = sub_rule_depth,
.next_hop = lpm->rules_tbl
[sub_rule_index].next_hop,
@@ -780,8 +780,7 @@ delete_depth_small(struct rte_lpm *lpm, uint32_t ip_masked,
if (lpm->tbl24[i].ext_entry == 0 &&
lpm->tbl24[i].depth <= depth ) {
lpm->tbl24[i] = new_tbl24_entry;
-   }
-   else {
+   } else  if (lpm->tbl24[i].ext_entry == 1) {
/*
 * If TBL24 entry is extended, then there has
 * to be a rule with depth >= 25 in the
-- 
1.7.7.6
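
The first fix above amounts to a three-way decision for each tbl24 entry instead of a two-way one. The sketch below shows that decision in isolation. It uses a simplified, hypothetical entry layout (toy_tbl24, with whole bytes instead of bitfields) and hypothetical names (toy_classify, TOY_*); it is not the rte_lpm code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical simplified entry; the real rte_lpm_tbl24_entry uses bitfields. */
struct toy_tbl24 {
    uint8_t next_hop_or_gindex;  /* next hop, or tbl8 group index if ext_entry */
    uint8_t valid;
    uint8_t ext_entry;
    uint8_t depth;
};

enum toy_action {
    TOY_REPLACE,    /* entry belongs to the deleted rule: invalidate or replace */
    TOY_WALK_TBL8,  /* entry is extended: the stored value is a tbl8 group index */
    TOY_SKIP        /* deeper, non-extended rule: must be left untouched */
};

/* Decide what to do with one tbl24 entry when deleting a rule of the given depth. */
static enum toy_action toy_classify(const struct toy_tbl24 *e, uint8_t depth)
{
    assert(e != NULL);
    if (e->ext_entry == 0 && e->depth <= depth)
        return TOY_REPLACE;
    if (e->ext_entry == 1)
        return TOY_WALK_TBL8;
    /* Before the fix this case fell into the tbl8 branch by mistake. */
    return TOY_SKIP;
}
```

The TOY_SKIP case is the one the original else branch mishandled, because a next hop was interpreted as a tbl8 group index.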

[dpdk-dev] [RFC PATCH v2] vhost: Add VHOST PMD

2015-10-28 Thread Tetsuya Mukawa

On 2015/10/27 22:44, Traynor, Kevin wrote:
>> -Original Message-
>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Tetsuya Mukawa
> [snip]
>
>> Hi,
>>
>> I have submitted latest patches.
>> I will keep vhost library until we will have agreement to merge it to
>> vhost PMD.
> Longer term there are pros and cons to keeping the vhost library. Personally
> I think it would make sense to remove sometime as trying to maintain two API's
> has a cost, but I think adding a deprecation notice in DPDK 2.2 for removal in
> DPDK 2.3 is very premature. Until it's proven *in the field* that the vhost PMD
> is a suitable fully functioning replacement for the vhost library and users
> have time to migrate, then please don't remove.

Hi Kevin,

Thanks for commenting. I agree it's not the time to add a deprecation notice.
(I haven't included it in the vhost PMD patches.)

Tetsuya

[dpdk-dev] [PATCH v3 2/3] docs: add keep alive sample app guide

2015-10-28 Thread Van Haaren, Harry

> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Remy Horton
> Sent: Wednesday, October 28, 2015 8:52 AM
> To: dev at dpdk.org
> Cc: Browne, John J
> Subject: [dpdk-dev] [PATCH v3 2/3] docs: add keep alive sample app guide
> 
> Signed-off-by: Maryam Tahhan 
> Signed-off-by: John J Browne 
> Signed-off-by: Remy Horton 
> ---
>  doc/guides/sample_app_ug/index.rst  |   1 +
>  doc/guides/sample_app_ug/keep_alive.rst | 191 
> 
>  2 files changed, 192 insertions(+)
>  create mode 100644 doc/guides/sample_app_ug/keep_alive.rst

Acked-by: Harry van Haaren

[dpdk-dev] [PATCH v7] mem: command line option to delete hugepage backing files

2015-10-28 Thread Shesha Sreenivasamurthy

When an application using huge pages crashes or exits, the hugetlbfs backing
files are not cleaned up. This patch cleans up those files. Some multi-process
DPDK applications may benefit from those backing files, so I have made the
behavior configurable: an application that does not need the backing files can
remove them, and the current default behavior is unchanged. The application
itself could clean them up; however, the rationale for having DPDK do it is
that DPDK created the files, so it is better that DPDK unlinks them.
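
As background for the approach: on POSIX systems a mapping created with mmap() stays valid after the backing file is unlinked, and the storage is released once the last mapping and descriptor are gone. The sketch below demonstrates this with an ordinary file rather than a hugetlbfs file. The function toy_map_then_unlink() and the path passed to it are hypothetical and are not part of the patch.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Create and map a file, unlink it, then keep using the mapping.
 * Returns 0 if the memory is still usable after unlink(), -1 on error.
 */
static int toy_map_then_unlink(const char *path, size_t len)
{
    assert(path != NULL && len > 0);

    int fd = open(path, O_CREAT | O_RDWR, 0600);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        unlink(path);
        return -1;
    }

    void *va = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                      /* the mapping keeps the file alive */
    if (va == MAP_FAILED) {
        unlink(path);
        return -1;
    }

    if (unlink(path) != 0) {        /* name disappears, mapping remains */
        munmap(va, len);
        return -1;
    }

    memset(va, 0xab, len);          /* still writable after unlink */
    int ok = ((unsigned char *)va)[len - 1] == 0xab;
    munmap(va, len);
    return ok ? 0 : -1;
}
```

The trade-off is the one described above: once the name is gone, another process can no longer open the file to attach to the same memory, which is why the behaviour is opt-in.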

Signed-off-by: Shesha Sreenivasamurthy 
Acked-by: Sergio Gonzalez Monroy 
---
 lib/librte_eal/common/eal_common_options.c | 12 
 lib/librte_eal/common/eal_internal_cfg.h   |  1 +
 lib/librte_eal/common/eal_options.h|  2 ++
 lib/librte_eal/linuxapp/eal/eal_memory.c   | 31 ++
 4 files changed, 46 insertions(+)

diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c
index c614477..4e73b85 100644
--- a/lib/librte_eal/common/eal_common_options.c
+++ b/lib/librte_eal/common/eal_common_options.c
@@ -79,6 +79,7 @@ eal_long_options[] = {
{OPT_MASTER_LCORE,  1, NULL, OPT_MASTER_LCORE_NUM },
{OPT_NO_HPET,   0, NULL, OPT_NO_HPET_NUM  },
{OPT_NO_HUGE,   0, NULL, OPT_NO_HUGE_NUM  },
+   {OPT_HUGE_UNLINK,   0, NULL, OPT_HUGE_UNLINK_NUM  },
{OPT_NO_PCI,0, NULL, OPT_NO_PCI_NUM   },
{OPT_NO_SHCONF, 0, NULL, OPT_NO_SHCONF_NUM},
{OPT_PCI_BLACKLIST, 1, NULL, OPT_PCI_BLACKLIST_NUM},
@@ -718,6 +719,10 @@ eal_parse_common_option(int opt, const char *optarg,
conf->no_hugetlbfs = 1;
break;

+   case OPT_HUGE_UNLINK_NUM:
+   conf->hugepage_unlink = 1;
+   break;
+
case OPT_NO_PCI_NUM:
conf->no_pci = 1;
break;
@@ -841,6 +846,12 @@ eal_check_common_options(struct internal_config *internal_cfg)
return -1;
}

+   if (internal_cfg->no_hugetlbfs && internal_cfg->hugepage_unlink) {
+   RTE_LOG(ERR, EAL, "Option --"OPT_HUGE_UNLINK" cannot "
+   "be specified together with --"OPT_NO_HUGE"\n");
+   return -1;
+   }
+
if (rte_eal_devargs_type_count(RTE_DEVTYPE_WHITELISTED_PCI) != 0 &&
rte_eal_devargs_type_count(RTE_DEVTYPE_BLACKLISTED_PCI) != 0) {
RTE_LOG(ERR, EAL, "Options blacklist (-b) and whitelist (-w) "
@@ -891,6 +902,7 @@ eal_common_usage(void)
   "  -h, --help  This help\n"
   "\nEAL options for DEBUG use only:\n"
   "  --"OPT_NO_HUGE"   Use malloc instead of hugetlbfs\n"
+  "  --"OPT_HUGE_UNLINK"   Unlink hugepage files after init\n"
   "  --"OPT_NO_PCI"Disable PCI\n"
   "  --"OPT_NO_HPET"   Disable HPET\n"
   "  --"OPT_NO_SHCONF" No shared config (mmap'd files)\n"
diff --git a/lib/librte_eal/common/eal_internal_cfg.h b/lib/librte_eal/common/eal_internal_cfg.h
index e2ecb0d..292013c 100644
--- a/lib/librte_eal/common/eal_internal_cfg.h
+++ b/lib/librte_eal/common/eal_internal_cfg.h
@@ -64,6 +64,7 @@ struct internal_config {
volatile unsigned force_nchannel; /**< force number of channels */
volatile unsigned force_nrank;/**< force number of ranks */
volatile unsigned no_hugetlbfs;   /**< true to disable hugetlbfs */
+   unsigned hugepage_unlink; /**< true to unlink backing files */
volatile unsigned xen_dom0_support; /**< support app running on Xen 
Dom0*/
volatile unsigned no_pci; /**< true to disable PCI */
volatile unsigned no_hpet;/**< true to disable HPET */
diff --git a/lib/librte_eal/common/eal_options.h b/lib/librte_eal/common/eal_options.h
index f6714d9..745f38c 100644
--- a/lib/librte_eal/common/eal_options.h
+++ b/lib/librte_eal/common/eal_options.h
@@ -63,6 +63,8 @@ enum {
OPT_PROC_TYPE_NUM,
 #define OPT_NO_HPET   "no-hpet"
OPT_NO_HPET_NUM,
+#define OPT_HUGE_UNLINK"huge-unlink"
+   OPT_HUGE_UNLINK_NUM,
 #define OPT_NO_HUGE   "no-huge"
OPT_NO_HUGE_NUM,
 #define OPT_NO_PCI"no-pci"
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index ac2745e..982e83e 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -786,6 +786,29 @@ copy_hugepages_to_shared_mem(struct hugepage_file * dst, int dest_size,
return 0;
 }

+static int
+unlink_hugepage_files(struct hugepage_file *hugepg_tbl,
+   unsigned num_hp_info)
+{
+   unsigned socket, size;
+   int page, nrpages = 0;
+
+   /* get total number of hugepages */
+   for (size = 0; size < num_hp_info

[dpdk-dev] |ERROR| pw 8090-8106 acl: handle when SSE 4.1 is unsupported

2015-10-28 Thread Ananyev, Konstantin

Hi Jan,

> -Original Message-
> From: Jan Viktorin [mailto:viktorin at rehivetech.com]
> Sent: Wednesday, October 28, 2015 11:00 AM
> To: Ananyev, Konstantin
> Cc: dev at dpdk.org
> Subject: Fw: |ERROR| pw 8090-8106 acl: handle when SSE 4.1 is unsupported
> 
> Hello Konstantin,
> 
> the ACL patch (with changes as you suggested yesterday) breaks the
> following test. I am confused about this. There is nothing in the log
> that would help me to determine the source of the problem. How is this
> test related to ACL? Moreover, I cannot see the actual packet drop
> there, what packet has been dropped?

I am also confused - testpmd doesn't use librte_acl at all.
Are there any other changes in your patch set?
Something related to vector intrinsics emulation?
As they are now in rte_vect.h and are seen not only by the LPM library?
Perhaps someone from the STC team can explain exactly which packet fails?
Konstantin

> 
> Regards
> Jan
> 
> Begin forwarded message:
> 
> Date: 27 Oct 2015 13:44:40 -0700
> From: sys_stv at intel.com
> To: test-report at dpdk.org,viktorin at rehivetech.com
> Subject: |ERROR| pw 8090-8106  acl: handle when SSE 4.1 is unsupported
> 
> 
> Test-Label: Intel Niantic on Fedora
> Test-Status: ERROR
> Patchwork: http://www.dpdk.org/dev/patchwork/patch/8106/
> 
> DPDK git baseline: affc455438f4cbd3b14e2d9a24fbc154e22d68d3
> Patchwork ID: 8090-8106
> http://www.dpdk.org/dev/patchwork/patch/8106/
> Submitter: Jan Viktorin 
> Date: Tue, 27 Oct 2015 20:13:49 +0100
> 
> Compilation:
> OS: fedora
> Nic: niantic
> i686-native-linuxapp-gcc: compile pass
> x86_64-native-linuxapp-gcc: compile pass
> 
> DTS validation:
> OS: fedora
> Nic: Niantic
> TestType: auto
> GCC: 4
> x86_64-native-linuxapp-gcc:  total 75, passed 74, failed 1.
> Failed Case List:
> Target: x86_64-native-linuxapp-gcc
> OS: fedora
> Failed DTS case:
> checksum_offload_with_vlan:
> http://dpdk.org/browse/tools/dts/tree/test_plans/checksum_offload_test_plan.rst
> 
> DTS Validation Error:
> ==
> ==
> ==
> ==
> TEST SUITE : TestChecksumOffload
> 
> ---
> Begin: Test Case test_checksum_offload_with_vlan
> --
> FAILED  'Unexpected Packets Drop'
> --
> [   SUITE_DUT_CMD]  ./x86_64-native-linuxapp-gcc/app/testpmd -c 0x6 -n
> 4  -- -i --portmask=0x3 --disable-hw-vlan --enable-rx-cksum --crc-strip
> --txqflags=0 [   SUITE_DUT_CMD]  set verbose 1 [   SUITE_DUT_CMD]  set
> fwd csum [   SUITE_DUT_CMD]  csum set ip hw 0
> [   SUITE_DUT_CMD]  csum set udp hw 0
> [   SUITE_DUT_CMD]  csum set tcp hw 0
> [   SUITE_DUT_CMD]  csum set sctp hw 0
> [   SUITE_DUT_CMD]  csum set ip hw 1
> [   SUITE_DUT_CMD]  csum set udp hw 1
> [   SUITE_DUT_CMD]  csum set tcp hw 1
> [   SUITE_DUT_CMD]  csum set sctp hw 1
> [   SUITE_DUT_CMD]  start
> [SUITE_TESTER_CMD]  scapy
> [SUITE_TESTER_CMD]  sys.path.append("./")
> [SUITE_TESTER_CMD]  from sctp import *
> [SUITE_TESTER_CMD]  p = Ether(dst="90:e2:ba:4a:54:81",
> src="52:00:00:00:00:00")/IPv6(src="::2")/UDP()/("X"*46)
> [SUITE_TESTER_CMD]  p.show2() [SUITE_TESTER_CMD]  p =
> Ether(dst="90:e2:ba:4a:54:81",
> src="52:00:00:00:00:00")/IP(src="127.0.0.2")/SCTP()/("X"*48)
> [SUITE_TESTER_CMD]  p.show2() [SUITE_TESTER_CMD]  p =
> Ether(dst="90:e2:ba:4a:54:81",
> src="52:00:00:00:00:00")/IPv6(src="::2")/TCP()/("X"*46)
> [SUITE_TESTER_CMD]  p.show2() [SUITE_TESTER_CMD]  p =
> Ether(dst="90:e2:ba:4a:54:81",
> src="52:00:00:00:00:00")/IP(src="127.0.0.2")/UDP()/("X"*46)
> [SUITE_TESTER_CMD]  p.show2() [SUITE_TESTER_CMD]  p =
> Ether(dst="90:e2:ba:4a:54:81",
> src="52:00:00:00:00:00")/IP(src="127.0.0.2")/TCP()/("X"*46)
> [SUITE_TESTER_CMD]  p.show2() [SUITE_TESTER_CMD]  exit()
> [SUITE_TESTER_CMD]  echo -n '' >  scapyResult.txt [SUITE_TESTER_CMD]
> scp sniff.py root at 10.239.128.80:~/ [SUITE_TESTER_CMD]  SCAPY Receive
> setup: [SUITE_TESTER_CMD]  killall scapy 2>/dev/null; echo tester
> [SUITE_TESTER_CMD]  scapy [SUITE_TESTER_CMD]  subprocess.call("scapy -c
> sniff.py &", shell=True) [SUITE_TESTER_CMD]  sys.path.append("./")
> [SUITE_TESTER_CMD]  import sctp [SUITE_TESTER_CMD]  from sctp import *
> [SUITE_TESTER_CMD]  sendp([Ether(dst="90:e2:ba:4a:54:81",
> src="52:00:00:00:00:00")/IPv6(src="::1")/UDP(chksum=0xf)/("X"*46)],
> iface="p785p2") [SUITE_TESTER_CMD]
> sendp([Ether(dst="90:e2:ba:4a:54:81",
> src="52:00:00:00:00:00")/IP(chksum=0x0)/SCTP(chksum=0xf)/("X"*48)],
> iface="p785p2") [SUITE_TESTER_CMD]
> sendp([Ether(dst="90:e2:ba:4a:54:81",
> src="52:00:00:00:00:00")/IPv6(src="::1")/TCP(chksum=0xf)/("X"*46)],
> iface="p785p2") [SUITE_TESTER_CMD]
> sendp([Ether(dst="90:e2:ba:4a:54:81",
> src="52:00:00:00:00:00")/IP(chksum=0x0)/UDP(chk

[dpdk-dev] [Patch 1/2] i40e RX Bulk Alloc: Larger list size (33 to 128) throughput optimization

2015-10-28 Thread Bruce Richardson

On Tue, Oct 27, 2015 at 08:56:36PM +, Polehn, Mike A wrote:
> Combined 2 subroutines of code into one subroutine with one read operation
> followed by buffer allocate and load loop.
>
> Eliminated the staging queue and subroutine, which removed extra pointer list
> movements and reduced number of active variable cache pages during for call.
>
> Reduced queue position variables to just 2, the next read point and last NIC
> RX descriptor position, also changed to allowing NIC descriptor table to not
> always need to be filled.
>
> Removed NIC register update write from per loop to one per driver write call
> to minimize CPU stalls waiting of multiple SMB synchronization points and for
> earlier NIC register writes that often take large cycle counts to complete.
> For example with an input packet list of 33, with the default loops size of
> 32, the second NIC register write will occur just after RX processing for
> just 1 packet, resulting in large CPU stall time.
>
> Eliminated initial rx packet present test before rx processing loop that also
> checks, since less free time is generally available when packets are present
> than when not processing any input packets.
>
> Used some standard variables to help reduce overhead of non-standard variable
> sizes.
>
> Reduced number of variables, reordered variable structure to put most active
> variables in first cache line, better utilize memory bytes inside cache line,
> and reduced active cache line count to 1 cache line during processing call.
> Other RX subroutine sets might still use more than 1 variable cache line.
> 
> Signed-off-by: Mike A. Polehn 

Hi Mike,

Thanks for the contribution.

However, this patch seems to contain a lot of changes to the i40e code. Since
you have multiple optimizations listed above in the description it would be
good if you could submit this patch as multiple patches, one for each
optimization. That would make it far easier for us to review and test. The
same would apply to patch 2 of this set, which looks to have multiple changes
in a single patch too. Also, each patch should have a unique title stating
very briefly what the one change in that patch is.

Regards,
/Bruce

[dpdk-dev] [PATCH v3 2/2] doc: add user-space ethtool sample app guide

2015-10-28 Thread Remy Horton

Signed-off-by: Remy Horton 
---
 doc/guides/sample_app_ug/index.rst  |   1 +
 doc/guides/sample_app_ug/keep_alive.rst | 191 
 2 files changed, 192 insertions(+)
 create mode 100644 doc/guides/sample_app_ug/keep_alive.rst

diff --git a/doc/guides/sample_app_ug/index.rst b/doc/guides/sample_app_ug/index.rst
index 9beedd9..11b8b14 100644
--- a/doc/guides/sample_app_ug/index.rst
+++ b/doc/guides/sample_app_ug/index.rst
@@ -49,6 +49,7 @@ Sample Applications User Guide
 ipv4_multicast
 ip_reassembly
 kernel_nic_interface
+keep_alive
 l2_forward_job_stats
 l2_forward_real_virtual
 l3_forward
diff --git a/doc/guides/sample_app_ug/keep_alive.rst b/doc/guides/sample_app_ug/keep_alive.rst
new file mode 100644
index 000..080811b
--- /dev/null
+++ b/doc/guides/sample_app_ug/keep_alive.rst
@@ -0,0 +1,191 @@
+
+..  BSD LICENSE
+Copyright(c) 2015 Intel Corporation. All rights reserved.
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions
+are met:
+
+* Redistributions of source code must retain the above copyright
+notice, this list of conditions and the following disclaimer.
+* Redistributions in binary form must reproduce the above copyright
+notice, this list of conditions and the following disclaimer in
+the documentation and/or other materials provided with the
+distribution.
+* Neither the name of Intel Corporation nor the names of its
+contributors may be used to endorse or promote products derived
+from this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+Keep Alive Sample Application
+=============================
+
+The Keep Alive application is a simple example of a
+heartbeat/watchdog for packet processing cores. It demonstrates how
+to detect 'failed' DPDK cores and notify a fault management entity
+of this failure. Its purpose is to ensure the failure of the core
+does not result in a fault that is not detectable by a management
+entity.
+
+
+Overview
+--------
+
+The application demonstrates how to protect against 'silent outages'
+on packet processing cores. A Keep Alive Monitor Agent Core (master)
+monitors the state of packet processing cores (worker cores) by
+dispatching pings at a regular time interval (default is 5ms) and
+monitoring the state of the cores. Core states are: Alive, MIA, Dead
+or Buried. MIA indicates a missed ping, and Dead indicates two missed
+pings within the specified time interval. When a core is Dead, a
+callback function is invoked to restart the packet processing core.
+A real-life application might use this callback function to notify a
+higher level fault management entity of the core failure in order to
+take the appropriate corrective action.
+
+Note: Only the worker cores are monitored. A local (on the host) mechanism
+or agent to supervise the Keep Alive Monitor Agent Core DPDK core is required
+to detect its failure.
+
+Note: This application is based on the L2 forwarding application. As
+such, the initialization and run-time paths are very similar to those
+of the L2 forwarding application.
+
+Compiling the Application
+-------------------------
+
+To compile the application:
+
+#.  Go to the sample application directory:
+
+    .. code-block:: console
+
+        export RTE_SDK=/path/to/rte_sdk
+        cd ${RTE_SDK}/examples/keep_alive
+
+#.  Set the target (a default target is used if not specified). For example:
+
+    .. code-block:: console
+
+        export RTE_TARGET=x86_64-native-linuxapp-gcc
+
+    See the *DPDK Getting Started Guide* for possible RTE_TARGET values.
+
+#.  Build the application:
+
+    .. code-block:: console
+
+        make
+
+Running the Application
+-----------------------
+
+The application has a number of command line options:
+
+.. code-block:: console
+
+    ./build/l2fwd-keepalive [EAL options] \
+        -- -p PORTMASK [-q NQ] [-K PERIOD] [-T PERIOD]
+
+where,
+
+* ``p PORTMASK``: A hexadecimal bitmask of the ports to configure
+
+* ``q NQ``: A number of queues (=ports) per lcore (default is 1)
+
+* ``K PER
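
The core states described in the guide's overview (Alive, MIA, Dead, Buried) can be pictured as a small state machine driven by the monitor's periodic ping. The following sketch is an illustration only, with hypothetical names (toy_core, toy_mark_alive, toy_monitor_tick, toy_count_failure); it is not the keep alive sample's implementation. Treating Buried as "Dead and already reported" is an assumption, since the guide does not define that state.

```c
#include <assert.h>
#include <stddef.h>

enum toy_state { TOY_ALIVE, TOY_MIA, TOY_DEAD, TOY_BURIED };

struct toy_core {
    enum toy_state state;
    int pinged;     /* set by the worker, cleared by the monitor */
    int failures;   /* how many times the failure callback fired */
};

/* Called by a worker core from its packet processing loop. */
static void toy_mark_alive(struct toy_core *c)
{
    c->pinged = 1;
}

/* Example failure callback: just count the event. */
static void toy_count_failure(struct toy_core *c)
{
    c->failures++;
}

/* Called by the monitor core once per ping interval. */
static void toy_monitor_tick(struct toy_core *c,
                             void (*on_dead)(struct toy_core *))
{
    assert(c != NULL);
    if (c->pinged) {                /* worker answered since the last tick */
        c->pinged = 0;
        c->state = TOY_ALIVE;
        return;
    }
    switch (c->state) {
    case TOY_ALIVE:                 /* first missed ping */
        c->state = TOY_MIA;
        break;
    case TOY_MIA:                   /* second missed ping: report once */
        c->state = TOY_DEAD;
        if (on_dead != NULL)
            on_dead(c);
        break;
    case TOY_DEAD:                  /* already reported (assumed meaning) */
        c->state = TOY_BURIED;
        break;
    case TOY_BURIED:
        break;
    }
}
```

Reporting only on the MIA to Dead transition keeps the callback from firing on every subsequent tick for a core that stays unresponsive.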

[dpdk-dev] [PATCH v3 1/2] example: add user-space ethtool sample application

2015-10-28 Thread Remy Horton

Further enhancements to the userspace ethtool implementation that was
submitted in 2.1 and packaged as a self-contained sample application.
Implements an rte_ethtool shim layer based on rte_ethdev API, along
with a command prompt driven demonstration application.

Signed-off-by: Remy Horton 
---
 examples/ethtool/Makefile |  48 ++
 examples/ethtool/ethtool-app/Makefile |  54 +++
 examples/ethtool/ethtool-app/ethapp.c | 873 ++
 examples/ethtool/ethtool-app/ethapp.h |  41 ++
 examples/ethtool/ethtool-app/main.c   | 288 +++
 examples/ethtool/lib/Makefile |  57 +++
 examples/ethtool/lib/rte_ethtool.c| 421 
 examples/ethtool/lib/rte_ethtool.h| 410 
 8 files changed, 2192 insertions(+)
 create mode 100644 examples/ethtool/Makefile
 create mode 100644 examples/ethtool/ethtool-app/Makefile
 create mode 100644 examples/ethtool/ethtool-app/ethapp.c
 create mode 100644 examples/ethtool/ethtool-app/ethapp.h
 create mode 100644 examples/ethtool/ethtool-app/main.c
 create mode 100644 examples/ethtool/lib/Makefile
 create mode 100644 examples/ethtool/lib/rte_ethtool.c
 create mode 100644 examples/ethtool/lib/rte_ethtool.h

diff --git a/examples/ethtool/Makefile b/examples/ethtool/Makefile
new file mode 100644
index 000..94f8ee3
--- /dev/null
+++ b/examples/ethtool/Makefile
@@ -0,0 +1,48 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2015 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+# * Neither the name of Intel Corporation nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+ifeq ($(RTE_SDK),)
+$(error "Please define RTE_SDK environment variable")
+endif
+
+# Default target, can be overwritten by command line or environment
+RTE_TARGET ?= x86_64-native-linuxapp-gcc
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+ifneq ($(CONFIG_RTE_EXEC_ENV),"linuxapp")
+$(error This application can only operate in a linuxapp environment, \
+please change the definition of the RTE_TARGET environment variable)
+endif
+
+DIRS-y += lib ethtool-app
+
+include $(RTE_SDK)/mk/rte.extsubdir.mk
diff --git a/examples/ethtool/ethtool-app/Makefile b/examples/ethtool/ethtool-app/Makefile
new file mode 100644
index 000..09c66ad
--- /dev/null
+++ b/examples/ethtool/ethtool-app/Makefile
@@ -0,0 +1,54 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+# * Neither the name of Intel Corporation nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR

[dpdk-dev] [PATCH v3 0/2] User-space ethtool sample application

2015-10-28 Thread Remy Horton

Further enhancements to the userspace ethtool implementation that was
submitted in 2.1 and packaged as a self-contained sample application.
Implements an rte_ethtool shim layer based on rte_ethdev API, along
with a command prompt driven demonstration application.

This patchset depends on:
* http://dpdk.org/dev/patchwork/patch/6563/
* http://dpdk.org/dev/patchwork/patch/7340/
* http://dpdk.org/dev/patchwork/patch/8070/
* http://dpdk.org/dev/patchwork/patch/8067/
* http://dpdk.org/dev/patchwork/patch/8075/
* http://dpdk.org/dev/patchwork/patch/8074/
* http://dpdk.org/dev/patchwork/patch/8072/
* http://dpdk.org/dev/patchwork/patch/8071/
* http://dpdk.org/dev/patchwork/patch/8073/
* http://dpdk.org/dev/patchwork/patch/8068/
* http://dpdk.org/dev/patchwork/patch/8069/

v3:
* Made use of enums for core state.
* Fixed Makefile issue.
* Fixed incorrect assumption with core ids.
* Changed handling of more ports than cores.

v2:
* Replaced l2fwd base with simpler application.
* Added ringparam functions.
* Added documentation.

Remy Horton (2):
  example: User-space ethtool sample application
  doc: add user-space ethtool sample app guide

 doc/guides/sample_app_ug/index.rst  |   1 +
 doc/guides/sample_app_ug/keep_alive.rst | 191 +++
 examples/ethtool/Makefile   |  48 ++
 examples/ethtool/ethtool-app/Makefile   |  54 ++
 examples/ethtool/ethtool-app/ethapp.c   | 873 
 examples/ethtool/ethtool-app/ethapp.h   |  41 ++
 examples/ethtool/ethtool-app/main.c | 288 +++
 examples/ethtool/lib/Makefile   |  57 +++
 examples/ethtool/lib/rte_ethtool.c  | 421 +++
 examples/ethtool/lib/rte_ethtool.h  | 410 +++
 10 files changed, 2384 insertions(+)
 create mode 100644 doc/guides/sample_app_ug/keep_alive.rst
 create mode 100644 examples/ethtool/Makefile
 create mode 100644 examples/ethtool/ethtool-app/Makefile
 create mode 100644 examples/ethtool/ethtool-app/ethapp.c
 create mode 100644 examples/ethtool/ethtool-app/ethapp.h
 create mode 100644 examples/ethtool/ethtool-app/main.c
 create mode 100644 examples/ethtool/lib/Makefile
 create mode 100644 examples/ethtool/lib/rte_ethtool.c
 create mode 100644 examples/ethtool/lib/rte_ethtool.h

-- 
1.9.3

[dpdk-dev] [PATCH v3 01/17] mk: Introduce ARMv7 architecture

2015-10-28 Thread David Marchand

Hello Jan,

On Tue, Oct 27, 2015 at 8:13 PM, Jan Viktorin wrote:

>
> diff --git a/config/defconfig_arm-armv7-a-linuxapp-gcc
> b/config/defconfig_arm-armv7-a-linuxapp-gcc
> new file mode 100644
> index 000..5a778cf
> --- /dev/null
> +++ b/config/defconfig_arm-armv7-a-linuxapp-gcc
> +
> +# avoids using i686/x86_64 SIMD instructions, nothing for ARM
> +CONFIG_RTE_BITMAP_OPTIMIZATIONS=0
>

(yet another build flag which has to disappear, and bitmap
header should be moved from librte_sched to eal with arch-specific
implementations when applicable)

Well, I am a bit confused by this comment.
For me, gcc provides ctzll builtins.
https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html

And with your patchset applied, it builds fine with
RTE_BITMAP_OPTIMIZATIONS enabled using gcc 4.7.3 for arm on ubuntu 14.04.
Is there a dependency on the gcc version?
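
To illustrate the point about the builtin, here is a minimal sketch of a
bitmap slab scan that relies on the compiler rather than on
architecture-specific instructions. The helper name first_set_bit is
hypothetical and is not taken from the bitmap header; __builtin_ctzll is
the GCC builtin documented at the link above.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return the position (0..63) of the least significant set bit in a
 * 64-bit bitmap slab. The result of __builtin_ctzll() is undefined for
 * an input of 0, so the caller must pass a non-zero slab.
 */
static inline uint32_t
first_set_bit(uint64_t slab)
{
	assert(slab != 0);
	return (uint32_t)__builtin_ctzll(slab);
}
```

Because the builtin is lowered by the compiler for the target, the same
source should build for both x86 and ARM without a per-architecture flag.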

> +# PCI is usually not used on ARM
> +CONFIG_RTE_EAL_IGB_UIO=n
>

Not sure "usually not used" is a good reason to disable something.
Is there a real issue on ARM with the igb_uio code (compilation, PCI accesses)?

Thanks.

-- 
David Marchand

[dpdk-dev] [PATCH] rte_sched: release enqueued mbufs on rte_sched_port_free()

2015-10-28 Thread Simon Kagstrom

Otherwise mbufs will leak when the port is destroyed. The
rte_sched_port_qbase() and rte_sched_port_qsize() functions are used
in free now, so move them up.

Signed-off-by: Simon Kagstrom 
---
 lib/librte_sched/rte_sched.c | 44 +++-
 1 file changed, 27 insertions(+), 17 deletions(-)

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index 9c9419d..81462cd 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -312,6 +312,23 @@ rte_sched_port_queues_per_port(struct rte_sched_port *port)
return RTE_SCHED_QUEUES_PER_PIPE * port->n_pipes_per_subport * port->n_subports_per_port;
 }

+static inline struct rte_mbuf **
+rte_sched_port_qbase(struct rte_sched_port *port, uint32_t qindex)
+{
+   uint32_t pindex = qindex >> 4;
+   uint32_t qpos = qindex & 0xF;
+
+   return (port->queue_array + pindex * port->qsize_sum + port->qsize_add[qpos]);
+}
+
+static inline uint16_t
+rte_sched_port_qsize(struct rte_sched_port *port, uint32_t qindex)
+{
+   uint32_t tc = (qindex >> 2) & 0x3;
+
+   return port->qsize[tc];
+}
+
 static int
 rte_sched_port_check_params(struct rte_sched_port_params *params)
 {
@@ -717,11 +734,21 @@ rte_sched_port_config(struct rte_sched_port_params *params)
 void
 rte_sched_port_free(struct rte_sched_port *port)
 {
+   unsigned int queue;
/* Check user parameters */
if (port == NULL){
return;
}

+   /* Free enqueued mbufs */
+   for (queue = 0; queue < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; queue++) {
+   unsigned int i;
+   struct rte_mbuf **mbufs = rte_sched_port_qbase(port, queue);
+
+   for (i = 0; i < rte_sched_port_qsize(port, queue); i++)
+   rte_pktmbuf_free(mbufs[i]);
+   }
+
rte_bitmap_free(port->bmp);
rte_free(port);
 }
@@ -1032,23 +1059,6 @@ rte_sched_port_qindex(struct rte_sched_port *port, 
uint32_t subport, uint32_t pi
return result;
 }

-static inline struct rte_mbuf **
-rte_sched_port_qbase(struct rte_sched_port *port, uint32_t qindex)
-{
-   uint32_t pindex = qindex >> 4;
-   uint32_t qpos = qindex & 0xF;
-
-   return (port->queue_array + pindex * port->qsize_sum + port->qsize_add[qpos]);
-}
-
-static inline uint16_t
-rte_sched_port_qsize(struct rte_sched_port *port, uint32_t qindex)
-{
-   uint32_t tc = (qindex >> 2) & 0x3;
-
-   return port->qsize[tc];
-}
-
 #if RTE_SCHED_DEBUG

 static inline int
-- 
1.9.1
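
For readers following the two helpers moved by this patch, the sketch
below spells out how a queue index is split by their shifts and masks.
The struct and function names are hypothetical; the field for the queue
inside a traffic class (the low two bits) is an assumption inferred from
the masks and does not appear in the patch itself.

```c
#include <stdint.h>

/* Hypothetical container for the fields packed into a queue index. */
struct qindex_fields {
	uint32_t pipe;  /* qindex >> 4: 16 queues per pipe */
	uint32_t qpos;  /* qindex & 0xF: queue position inside the pipe */
	uint32_t tc;    /* (qindex >> 2) & 0x3: traffic class */
	uint32_t tc_q;  /* qindex & 0x3: queue inside the class (assumed) */
};

/* Split a queue index using the same shifts and masks as the patch. */
static struct qindex_fields
qindex_split(uint32_t qindex)
{
	struct qindex_fields f;

	f.pipe = qindex >> 4;
	f.qpos = qindex & 0xF;
	f.tc = (qindex >> 2) & 0x3;
	f.tc_q = qindex & 0x3;
	return f;
}
```

For example, queue index 0x2B maps to pipe 2, position 11 inside that
pipe, and traffic class 2.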

[dpdk-dev] [PATCH] ixgbe: change logging for ixgbe tx code path selection

2015-10-28 Thread Bruce Richardson

On Tue, Oct 27, 2015 at 05:31:59PM +, Traynor, Kevin wrote:
> 
> > -Original Message-
> > From: Ananyev, Konstantin
> > Sent: Tuesday, October 27, 2015 12:13 PM
> > To: Richardson, Bruce; Traynor, Kevin
> > Cc: dev at dpdk.org
> > Subject: RE: [dpdk-dev] [PATCH] ixgbe: change logging for ixgbe tx code path
> > selection
> > 
> > 
> > 
> > > -Original Message-
> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Bruce Richardson
> > > Sent: Tuesday, October 27, 2015 11:50 AM
> > > To: Traynor, Kevin
> > > Cc: dev at dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH] ixgbe: change logging for ixgbe tx code
> > path selection
> > >
> > > On Tue, Oct 27, 2015 at 11:41:08AM +, Kevin Traynor wrote:
> > > > Simple and vector are different tx code paths. If vector
> > > > is selected, change logging from:
> > > > PMD: ixgbe_set_tx_function(): Using simple tx code path
> > > > PMD: ixgbe_set_tx_function(): Vector tx enabled.
> > > >
> > > > to:
> > > > PMD: ixgbe_set_tx_function(): Using vector tx code path
> > > >
> > > > or, if simple selected:
> > > > PMD: ixgbe_set_tx_function(): Using simple tx code path
> > > >
> > > > The dangling else in the #ifdef makes readability difficult,
> > > > so resolving in way that seems most readable.
> > > >
> > > > Signed-off-by: Kevin Traynor 
> > > > ---
> > > >  drivers/net/ixgbe/ixgbe_rxtx.c |8 +---
> > > >  1 files changed, 5 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c
> > b/drivers/net/ixgbe/ixgbe_rxtx.c
> > > > index a598a72..11d7feb 100644
> > > > --- a/drivers/net/ixgbe/ixgbe_rxtx.c
> > > > +++ b/drivers/net/ixgbe/ixgbe_rxtx.c
> > > > @@ -1963,16 +1963,18 @@ ixgbe_set_tx_function(struct rte_eth_dev *dev,
> > struct ixgbe_tx_queue *txq)
> > > > /* Use a simple Tx queue (no offloads, no multi segs) if 
> > > > possible */
> > > > if (((txq->txq_flags & IXGBE_SIMPLE_FLAGS) == 
> > > > IXGBE_SIMPLE_FLAGS)
> > > > && (txq->tx_rs_thresh >= 
> > > > RTE_PMD_IXGBE_TX_MAX_BURST)) {
> > > > -   PMD_INIT_LOG(DEBUG, "Using simple tx code path");
> > > >  #ifdef RTE_IXGBE_INC_VECTOR
> > > > if (txq->tx_rs_thresh <= RTE_IXGBE_TX_MAX_FREE_BUF_SZ &&
> > > > (rte_eal_process_type() != 
> > > > RTE_PROC_PRIMARY ||
> > > > ixgbe_txq_vec_setup(txq) == 0)) 
> > > > {
> > > > -   PMD_INIT_LOG(DEBUG, "Vector tx enabled.");
> > > > +   PMD_INIT_LOG(DEBUG, "Using vector tx code 
> > > > path");
> > > > dev->tx_pkt_burst = ixgbe_xmit_pkts_vec;
> > > > } else
> > > >  #endif
> > > > -   dev->tx_pkt_burst = ixgbe_xmit_pkts_simple;
> > > > +   {
> > > > +   PMD_INIT_LOG(DEBUG, "Using simple tx code 
> > > > path");
> > > > +   dev->tx_pkt_burst = ixgbe_xmit_pkts_simple;
> > > > +   }
> > > > } else {
> > > > PMD_INIT_LOG(DEBUG, "Using full-featured tx code path");
> > > > PMD_INIT_LOG(DEBUG,
> > > > --
> > > > 1.7.4.1
> > > >
> > > Hi Kevin,
> > >
> > > can I suggest a slight alternative here that might help make things 
> > > easier.
> > > Instead of printing the message as we pick the code path, why not have a
> > "logmsg"
> > > pointer variable that is assigned in the code, and then print out the log
> > path
> > > at the end.
> > >
> > > This would have a number of advantages:
> > > 1. there are no issues with changing our mind, so we can assign one path
> > type,
> > > and then later change it to something different without cluttering up the
> > debug
> > > output with the history of our code's flow.
> > > 2. it means that you don't have a problem with smaller else legs as you 
> > > can
> > > easily do multiple assignments in the one line using a comma as:
> > >   dev->tx_pkt_burst = ixgbe_xmit_pkts_simple, logmsg = "Using simple
> > ...";
> > 
> > While I like approach with logmsg, please avoid commas here.
> > It will make this piece of code even harder to read (at least for me).
> > Konstantin
> 
> yeah, sure. I agree with changing for 1. I also agree with Konstantin re 
> commas.
> The code under the dangling else is aligned incorrectly/correctly depending on
> whether the #ifdef is true or not, so I think adding multiple statements with 
> {}
> now will make it obvious for the next person who modifies.
> 
A final alternative is to have a switch statement at the end of the block for
printing based on what the final function selection is. That way there is only
a single assignment needed and no awkward braces. :-)
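
A rough sketch of that idea, under stated assumptions: C cannot switch on
a function pointer, so this version records the selection in an enum
first. The enum, the function names and the two boolean inputs are all
hypothetical and only stand in for the checks done in
ixgbe_set_tx_function(); the message strings are the ones discussed in
this thread.

```c
#include <stdbool.h>

/* Hypothetical identifiers for the three tx code paths. */
enum tx_path {
	TX_PATH_FULL,
	TX_PATH_SIMPLE,
	TX_PATH_VECTOR,
};

/* Pick the path once; nothing is logged while deciding. */
static enum tx_path
select_tx_path(bool simple_ok, bool vector_ok)
{
	if (!simple_ok)
		return TX_PATH_FULL;
	return vector_ok ? TX_PATH_VECTOR : TX_PATH_SIMPLE;
}

/* Single place that maps the final selection to its log message. */
static const char *
tx_path_logmsg(enum tx_path path)
{
	switch (path) {
	case TX_PATH_VECTOR:
		return "Using vector tx code path";
	case TX_PATH_SIMPLE:
		return "Using simple tx code path";
	case TX_PATH_FULL:
		return "Using full-featured tx code path";
	}
	return "Unknown tx code path";
}
```

The caller would log tx_path_logmsg(path) once after the selection is
final, and the same switch could also assign the burst function, so the
dangling else around the #ifdef is no longer needed.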

[dpdk-dev] [Patch] Eth Driver: Optimization for improved NIC processing rates

2015-10-28 Thread Bruce Richardson

On Tue, Oct 27, 2015 at 08:56:31PM +, Polehn, Mike A wrote:
> Prefetch of interface access variables while calling into driver RX and TX 
> subroutines.
> 
> For converging zero loss packet task tests, a small drop in latency for zero 
> loss measurements 
> and small drop in lost packet counts for the lossy measurement points was 
> observed, 
> indicating some savings of execution clock cycles.
> 
Hi Mike,

the commit log message above seems a bit awkward to read. If I understand it
correctly, would the below suggestion be a shorter, clearer equivalent?

Prefetch RX and TX queue variables in ethdev before driver function call

This has been measured to produce higher throughput and reduced latency
in RFC 2544 throughput tests.

Or perhaps you could suggest some similar wording yourself. It would also be
good to clarify with which applications the improvements were seen: was it
testpmd, l3fwd or something else?
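
For context, the kind of change being described might look roughly like
the sketch below: issue a prefetch hint for the queue data just before
calling into the driver. This is not the code from the patch; struct
rx_queue, drain_queue and rx_burst_prefetch are hypothetical stand-ins,
and __builtin_prefetch is the GCC builtin.

```c
#include <stdint.h>

/* Hypothetical stand-in for a driver RX queue. */
struct rx_queue {
	uint16_t pending;  /* packets waiting in the queue */
};

typedef uint16_t (*rx_burst_fn)(struct rx_queue *q, uint16_t max_pkts);

/* Hypothetical driver function: take up to max_pkts from the queue. */
static uint16_t
drain_queue(struct rx_queue *q, uint16_t max_pkts)
{
	uint16_t n = q->pending < max_pkts ? q->pending : max_pkts;

	q->pending = (uint16_t)(q->pending - n);
	return n;
}

/* Prefetch the queue structure, then call the driver function. */
static inline uint16_t
rx_burst_prefetch(rx_burst_fn fn, struct rx_queue *q, uint16_t max_pkts)
{
	__builtin_prefetch(q);  /* hint only; does not change the result */
	return fn(q, max_pkts);
}
```

The prefetch is only a hint to the CPU, so any gain depends on the
workload, which is why naming the test application in the commit log
matters.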

Regards,
/Bruce

[dpdk-dev] [PATCH v4] ip_pipeline: add more functions to routing-pipeline

2015-10-28 Thread Jasvinder Singh

This patch adds the following features to the
routing-pipeline to enable it for various NFV
use-cases:

1. Fast-path ARP table enable/disable
2. Double-tagged VLAN (Q-in-Q) packet encapsulation
for the next-hop
3. MPLS encapsulation for the next-hop
4. Add colour (Traffic-class for QoS) to the MPLS tag
5. Classification action to select the input queue
of the hierarchical scheduler (QoS)

The features proposed above can be enabled
(or disabled) through the parameters specified
in the configuration file, as below:

[PIPELINE0]
type = ROUTING
core = 1
pktq_in = RXQ0.0 RXQ1.0 RXQ2.0 RXQ3.0
pktq_out = TXQ0.0 TXQ1.0 TXQ2.0 TXQ3.0
n_routes = 4096
n_arp_entries = 1024
ip_hdr_offset = 142
arp_key_offset = 64
encap = ethernet_qinq
qinq_sched = no

The LPM table entries might include additional
fields depending upon the packet encapsulation
(Q-in-Q, MPLS) for the next-hop. The CLI
commands for adding or deleting such entries
to the LPM table have been implemented. Action
handlers for QinQ and MPLS encapsulation,
a classification action to select the input queue
of the hierarchical scheduler (QoS) and adding
colour (Traffic-class for QoS) to the MPLS
tag have been implemented.
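
As a rough illustration of the Q-in-Q encapsulation mentioned above, the
sketch below writes a double-tagged Ethernet header into a byte buffer.
It is not the action handler from this patch: the function names are
hypothetical, the outer tag type 0x88A8 (IEEE 802.1ad) is an assumption
because the values used by the patch are not shown here, and the
priority bits of both tags are left at zero.

```c
#include <stdint.h>
#include <string.h>

#define ETHER_TYPE_QINQ 0x88A8  /* IEEE 802.1ad service tag (outer) */
#define ETHER_TYPE_VLAN 0x8100  /* IEEE 802.1Q customer tag (inner) */
#define ETHER_TYPE_IPV4 0x0800
#define QINQ_HDR_LEN 22         /* 6 + 6 + 4 + 4 + 2 bytes */

/* Store a 16-bit value in network (big-endian) byte order. */
static void
put_be16(uint8_t *p, uint16_t v)
{
	p[0] = (uint8_t)(v >> 8);
	p[1] = (uint8_t)(v & 0xFF);
}

/* Write a double-tagged Ethernet header for an IPv4 payload into hdr. */
static void
qinq_hdr_write(uint8_t hdr[QINQ_HDR_LEN], const uint8_t dst[6],
	const uint8_t src[6], uint16_t svlan, uint16_t cvlan)
{
	memcpy(hdr, dst, 6);
	memcpy(hdr + 6, src, 6);
	put_be16(hdr + 12, ETHER_TYPE_QINQ);
	put_be16(hdr + 14, svlan & 0x0FFF);  /* 12-bit VLAN ID */
	put_be16(hdr + 16, ETHER_TYPE_VLAN);
	put_be16(hdr + 18, cvlan & 0x0FFF);
	put_be16(hdr + 20, ETHER_TYPE_IPV4);
}
```

The svlan and cvlan arguments correspond to the SVLAN and CVLAN values
that print_route displays for a Q-in-Q route.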

v2:
*fixed bug in print_route

v3:
*replaced config file "l2 = qinq/mpls" with
"encap = ethernet/ethernet_qinq/ethernet_mpls"
*added config file option "dbg_ah_disable=yes/no"
to disable table action handlers for routing and
arp, which is a quick way to disable packet
encapsulation.

*compacted routing table action handlers

*writing 6 bytes for the macaddr_dst instead of
8 bytes during encapsulation, as the additional
2 bytes are located on previous cache line.

v4:
*fixed bug: the RTE_MBUF_METADATA_* macros used to
access the packet meta-data also cover the packet
mbuf structure.

Signed-off-by: Jasvinder Singh 

Acked-by: Cristian Dumitrescu 

---
 examples/ip_pipeline/pipeline/pipeline_routing.c   |  806 -
 examples/ip_pipeline/pipeline/pipeline_routing.h   |8 +-
 .../ip_pipeline/pipeline/pipeline_routing_be.c | 1262 ++--
 .../ip_pipeline/pipeline/pipeline_routing_be.h |   72 +-
 4 files changed, 1960 insertions(+), 188 deletions(-)

diff --git a/examples/ip_pipeline/pipeline/pipeline_routing.c 
b/examples/ip_pipeline/pipeline/pipeline_routing.c
index beec982..4f6ff81 100644
--- a/examples/ip_pipeline/pipeline/pipeline_routing.c
+++ b/examples/ip_pipeline/pipeline/pipeline_routing.c
@@ -43,7 +43,7 @@

 struct app_pipeline_routing_route {
struct pipeline_routing_route_key key;
-   struct app_pipeline_routing_route_params params;
+   struct pipeline_routing_route_data data;
void *entry_ptr;

TAILQ_ENTRY(app_pipeline_routing_route) node;
@@ -187,21 +187,55 @@ print_route(const struct app_pipeline_routing_route 
*route)
&route->key.key.ipv4;

printf("IP Prefix = %" PRIu32 ".%" PRIu32
-   ".%" PRIu32 ".%" PRIu32 "/%" PRIu32 " => "
-   "(Port = %" PRIu32 ", Next Hop IP = "
-   "%" PRIu32 ".%" PRIu32 ".%" PRIu32 ".%" PRIu32 ")\n",
+   ".%" PRIu32 ".%" PRIu32 "/%" PRIu32
+   " => (Port = %" PRIu32,
+
(key->ip >> 24) & 0xFF,
(key->ip >> 16) & 0xFF,
(key->ip >> 8) & 0xFF,
key->ip & 0xFF,

key->depth,
-   route->params.port_id,
+   route->data.port_id);
+
+   if (route->data.flags & PIPELINE_ROUTING_ROUTE_ARP)
+   printf(
+   ", Next Hop IP = %" PRIu32 ".%" PRIu32
+   ".%" PRIu32 ".%" PRIu32,
+
+   (route->data.ethernet.ip >> 24) & 0xFF,
+   (route->data.ethernet.ip >> 16) & 0xFF,
+   (route->data.ethernet.ip >> 8) & 0xFF,
+   route->data.ethernet.ip & 0xFF);
+   else
+   printf(
+   ", Next Hop HWaddress = %02" PRIx32
+   ":%02" PRIx32 ":%02" PRIx32
+   ":%02" PRIx32 ":%02" PRIx32
+   ":%02" PRIx32,
+
+   route->data.ethernet.macaddr.addr_bytes[0],
+   route->data.ethernet.macaddr.addr_bytes[1],
+   route->data.ethernet.macaddr.addr_bytes[2],
+   route->data.ethernet.macaddr.addr_bytes[3],
+   route->data.ethernet.macaddr.addr_bytes[4],
+   route->data.ethernet.macaddr.addr_bytes[5]);
+
+   if (route->data.flags & PIPELINE_ROUTING_ROUTE_QINQ)
+   printf(", QinQ SVLAN = %" PRIu32 " CVLAN = %" PRIu32,
+   route->data.l2.qinq.svlan,
+

[dpdk-dev] [PATCHv7 0/9] ethdev: add new API to retrieve RX/TX queue information

2015-10-28 Thread Remy Horton


On 27/10/2015 12:51, Konstantin Ananyev wrote:
> Konstantin Ananyev (9):
>ethdev: add new API to retrieve RX/TX queue information
>i40e: add support for eth_(rxq|txq)_info_get and (rx|tx)_desc_lim
>ixgbe: add support for eth_(rxq|txq)_info_get and (rx|tx)_desc_lim
>e1000: add support for eth_(rxq|txq)_info_get and (rx|tx)_desc_lim
>fm10k: add HW specific desc_lim data into dev_info
>cxgbe: add HW specific desc_lim data into dev_info
>vmxnet3: add HW specific desc_lim data into dev_info
>testpmd: add new command to display RX/TX queue information
>doc: release notes update for queue_info_get() and (rx|tx)_desc_limit

Acked-by: Remy Horton
