[ovs-dev] [PATCH ovs v1 0/2] Introduce dpdkvdpa netdev

2020-04-02 Thread Noa Ezra
Introduce dpdkvdpa netdev allowing HW offloads over VirtIO network
devices.

dpdkvdpa ports can be added to netdev bridges with the following
command:
ovs-vsctl add-port br0 vdpa0 -- set Interface vdpa0 type=dpdkvdpa
options:vdpa-socket-path=
options:vdpa-accelerator-devargs=
options:dpdk-devargs=,representor=[id]
options:vdpa-max-queues=
vdpa-max-queues is an optional field.

vDPA netdev is designed to support both SW and HW acceleration.
SR-IOV capable NICs can use the SW acceleration, which relays packets
between VF and virtIO ports.
Support for vDPA configuration will be added in a future patch,
so that HW mode can configure vDPA capable NICs.

The dpdkvdpa netdev supports all kinds of traffic (TCP, UDP, NFV etc).

Using a dpdkvdpa port forwards packets between VF and VirtIO guests
with better performance than using standard VirtIO ports.
In the first scenario, a guest is connected to OVS using VirtIO.
In the second scenario, a guest is connected to OVS using a dpdkvdpa port.
The guest is running testpmd.
A traffic generator (iperf3 or Ixia) sends packets to OVS.
In this comparison, the dpdkvdpa port improves performance by ~35%.

https://travis-ci.org/github/noaezra/OVS/builds/670001370

Patch 1 provides the vdpa functionality as a pre-step without a functional
change.
Patch 2 introduces the dpdkvdpa vport.


Noa Ezra (2):
  netdev-dpdk-vdpa: Introduce dpdkvdpa netdev
  netdev-dpdk: Add dpdkvdpa port

 Documentation/automake.mk           |   1 +
 Documentation/topics/dpdk/index.rst |   1 +
 Documentation/topics/dpdk/vdpa.rst  |  90 
 NEWS                                |   1 +
 lib/automake.mk                     |   4 +-
 lib/netdev-dpdk-vdpa.c              | 820 
 lib/netdev-dpdk-vdpa.h              |  55 +++
 lib/netdev-dpdk.c                   | 164 +++-
 vswitchd/vswitch.xml                |  25 ++
 9 files changed, 1159 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/topics/dpdk/vdpa.rst
 create mode 100755 lib/netdev-dpdk-vdpa.c
 create mode 100644 lib/netdev-dpdk-vdpa.h

-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovs v1 2/2] netdev-dpdk: Add dpdkvdpa port

2020-04-02 Thread Noa Ezra
The dpdkvdpa netdev works with 3 components:
a vhost-user socket, a vdpa device (a real vdpa device or a VF), and a
representor of the vdpa device.

To add a new vDPA port, add a new port to an existing bridge
with type dpdkvdpa and the vDPA options:
ovs-vsctl add-port br0 vdpa0 -- set Interface vdpa0 type=dpdkvdpa
   options:vdpa-socket-path=
   options:vdpa-accelerator-devargs=
   options:dpdk-devargs=,representor=[id]

When this command runs, OVS creates a new netdev:
1. Register vhost-user-client device.
2. Open and configure VF dpdk port.
3. Open and configure representor dpdk port.

The new netdev uses the netdev_rxq_recv() function to receive
packets from the VF and push them to vhost-user, and to receive packets
from vhost-user and push them to the VF.
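
As a rough illustration of the two-way relay described above, the sketch
below moves one burst in each direction between two in-memory stand-ins for
the VF and vhost-user ports. Every name in it (fifo, fake_port, relay_once,
QLEN, BURST) is hypothetical and not taken from the patch; the real code
works on DPDK mbufs and port queues.

```c
#include <assert.h>
#include <stddef.h>

#define QLEN  16   /* Hypothetical queue depth. */
#define BURST 4    /* Hypothetical burst size. */

/* A simple linear FIFO of packet ids (no wrap-around, for brevity). */
struct fifo {
    int pkts[QLEN];
    size_t head;   /* Next slot to read. */
    size_t tail;   /* Next slot to write. */
};

/* Hypothetical stand-in for a port: not a DPDK or OVS type. */
struct fake_port {
    struct fifo rx;   /* Packets waiting to be received from the port. */
    struct fifo tx;   /* Packets that were sent out of the port. */
};

/* Appends up to 'n' packets; anything that does not fit is dropped. */
static size_t
fifo_push(struct fifo *f, const int *in, size_t n)
{
    size_t i = 0;

    while (i < n && f->tail < QLEN) {
        f->pkts[f->tail++] = in[i++];
    }
    return i;
}

/* Removes up to 'max' packets and returns how many were removed. */
static size_t
fifo_pop(struct fifo *f, int *out, size_t max)
{
    size_t n = 0;

    while (n < max && f->head < f->tail) {
        out[n++] = f->pkts[f->head++];
    }
    return n;
}

/* Receives one burst from 'from' and transmits it on 'to'. */
static size_t
relay_dir(struct fake_port *from, struct fake_port *to)
{
    int burst[BURST];
    size_t n = fifo_pop(&from->rx, burst, BURST);

    return fifo_push(&to->tx, burst, n);
}

/* One relay iteration: VF -> VM, then VM -> VF. */
static size_t
relay_once(struct fake_port *vf, struct fake_port *vm)
{
    return relay_dir(vf, vm) + relay_dir(vm, vf);
}
```

In this model a receive poll on the port would call relay_once() so that
traffic keeps flowing in both directions without a dedicated thread; whether
the patch schedules the relay exactly this way is an assumption.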

Signed-off-by: Noa Ezra 
Reviewed-by: Oz Shlomo 
---
 Documentation/automake.mk           |   1 +
 Documentation/topics/dpdk/index.rst |   1 +
 Documentation/topics/dpdk/vdpa.rst  |  90 
 NEWS                                |   1 +
 lib/netdev-dpdk.c                   | 164 +++-
 vswitchd/vswitch.xml                |  25 ++
 6 files changed, 281 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/topics/dpdk/vdpa.rst

diff --git a/Documentation/automake.mk b/Documentation/automake.mk
index f85c432..7caf6e7 100644
--- a/Documentation/automake.mk
+++ b/Documentation/automake.mk
@@ -41,6 +41,7 @@ DOC_SOURCE = \
Documentation/topics/dpdk/qos.rst \
Documentation/topics/dpdk/vdev.rst \
Documentation/topics/dpdk/vhost-user.rst \
+   Documentation/topics/dpdk/vdpa.rst \
Documentation/topics/fuzzing/index.rst \
Documentation/topics/fuzzing/what-is-fuzzing.rst \
Documentation/topics/fuzzing/ovs-fuzzing-infrastructure.rst \
diff --git a/Documentation/topics/dpdk/index.rst b/Documentation/topics/dpdk/index.rst
index a5be5e3..e8595c3 100644
--- a/Documentation/topics/dpdk/index.rst
+++ b/Documentation/topics/dpdk/index.rst
@@ -39,3 +39,4 @@ DPDK Support
/topics/dpdk/qos
/topics/dpdk/jumbo-frames
/topics/dpdk/memory
+   /topics/dpdk/vdpa
diff --git a/Documentation/topics/dpdk/vdpa.rst b/Documentation/topics/dpdk/vdpa.rst
new file mode 100644
index 0000000..34c5300
--- /dev/null
+++ b/Documentation/topics/dpdk/vdpa.rst
@@ -0,0 +1,90 @@
+..
+  Copyright (c) 2019 Mellanox Technologies, Ltd.
+
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at:
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+  WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+  License for the specific language governing permissions and limitations
+  under the License.
+
+  Convention for heading levels in Open vSwitch documentation:
+
+  =======  Heading 0 (reserved for the title in a document)
+  -------  Heading 1
+  ~~~~~~~  Heading 2
+  +++++++  Heading 3
+  '''''''  Heading 4
+
+  Avoid deeper levels because they do not render well.
+
+
+===============
+DPDK VDPA Ports
+===============
+
+In user space there are two main approaches to communicate with a guest (VM),
+using virtIO ports (e.g. netdev type=dpdkvhostuser/dpdkvhostuserclient) or
+SR-IOV using phy ports (e.g. netdev type=dpdk).
+With phy ports, a port representor is attached to OVS and the matching VF is
+passed through to the guest.
+HW rules can process packets from up-link and direct them to the VF without
+going through SW (OVS) and therefore using phy ports gives the best
+performance.
+However, the SR-IOV architecture requires the guest to use a driver that is
+specific to the underlying HW. A HW-specific driver has two main drawbacks:
+1. Breaks virtualization in some sense (guest aware of the HW), can also limit
+the type of images supported.
+2. Less natural support for live migration.
+
+Using a virtIO port solves both problems, but reduces performance and loses
+some functionality; for example, some HW offloads cannot be supported when
+working directly with virtIO.
+
+The new dpdkvdpa netdev type resolves this conflict.
+It is very similar to the regular dpdk netdev but has some additional
+functionality.
+The port translates between a phy port and a virtIO port: it takes packets
+from an rx-queue and sends them to the matching tx-queue, transferring packets
+from a virtIO guest (VM) to a VF and vice versa, and so gains the benefits of
+both SR-IOV and virtIO.
+
+Quick Example
+-------------
+
+Configure OVS bridge and ports
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+You must first create a bridge and add ports to the switch.
+Since 

[ovs-dev] [PATCH ovs v1 1/2] netdev-dpdk-vdpa: Introduce dpdkvdpa netdev

2020-04-02 Thread Noa Ezra
vDPA netdev is designed to support both SW and HW use cases.
HW mode will be used to configure vDPA capable devices.
SW acceleration is used to leverage SRIOV offloads to virtio guests
by relaying packets between VF and virtio devices.
Add the SW relay forwarding logic as a pre-step for adding dpdkvdpa
port with no functional change.
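
The qpair structure in this patch carries mb_head and mb_tail indexes next
to a packet array sized for two bursts. One plausible reading, offered here
only as an assumption, is that packets the TX side could not accept stay
buffered between the two indexes and are retried on the next pass. The
sketch below shows that idea with hypothetical names (pending, fake_tx,
pending_add, pending_flush) and plain ints in place of mbufs.

```c
#include <assert.h>
#include <stdint.h>

#define BURST 4   /* Hypothetical burst size. */

/* Hypothetical buffer of packets awaiting transmission. */
struct pending {
    int pkts[BURST * 2];
    uint8_t head;   /* Index of the first unsent packet. */
    uint8_t tail;   /* One past the last buffered packet. */
};

/* Stand-in for a TX call that accepts at most 'room' packets. */
static uint8_t
fake_tx(const int *pkts, uint8_t n, uint8_t room)
{
    (void) pkts;
    return n < room ? n : room;
}

/* Buffers newly received packets; returns how many fit. */
static uint8_t
pending_add(struct pending *p, const int *rx, uint8_t n)
{
    uint8_t i = 0;

    while (i < n && p->tail < BURST * 2) {
        p->pkts[p->tail++] = rx[i++];
    }
    return i;
}

/* Tries to send everything buffered and keeps the unsent remainder.
 * The indexes are only rewound once the buffer is completely empty. */
static uint8_t
pending_flush(struct pending *p, uint8_t room)
{
    uint8_t n = (uint8_t) (p->tail - p->head);
    uint8_t sent = fake_tx(&p->pkts[p->head], n, room);

    p->head = (uint8_t) (p->head + sent);
    if (p->head == p->tail) {
        p->head = 0;
        p->tail = 0;
    }
    return sent;
}
```

Keeping the leftover packets instead of dropping them trades a little memory
for fewer losses when the destination queue is briefly full.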

Signed-off-by: Noa Ezra 
Reviewed-by: Oz Shlomo 
---
 lib/automake.mk|   4 +-
 lib/netdev-dpdk-vdpa.c | 820 +
 lib/netdev-dpdk-vdpa.h |  55 
 3 files changed, 878 insertions(+), 1 deletion(-)
 create mode 100755 lib/netdev-dpdk-vdpa.c
 create mode 100644 lib/netdev-dpdk-vdpa.h

diff --git a/lib/automake.mk b/lib/automake.mk
index 95925b5..b57682c 100644
--- a/lib/automake.mk
+++ b/lib/automake.mk
@@ -146,6 +146,7 @@ lib_libopenvswitch_la_SOURCES = \
lib/netdev-offload.h \
lib/netdev-offload-provider.h \
lib/netdev-provider.h \
+   lib/netdev-dpdk-vdpa.h \
lib/netdev-vport.c \
lib/netdev-vport.h \
lib/netdev-vport-private.h \
@@ -429,7 +430,8 @@ if DPDK_NETDEV
 lib_libopenvswitch_la_SOURCES += \
lib/dpdk.c \
lib/netdev-dpdk.c \
-   lib/netdev-offload-dpdk.c
+   lib/netdev-offload-dpdk.c \
+   lib/netdev-dpdk-vdpa.c
 else
 lib_libopenvswitch_la_SOURCES += \
lib/dpdk-stub.c
diff --git a/lib/netdev-dpdk-vdpa.c b/lib/netdev-dpdk-vdpa.c
new file mode 100755
index 0000000..c6ed061
--- /dev/null
+++ b/lib/netdev-dpdk-vdpa.c
@@ -0,0 +1,820 @@
+/*
+ * Copyright (c) 2019 Mellanox Technologies, Ltd.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#include 
+#include "netdev-dpdk-vdpa.h"
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "netdev-provider.h"
+#include "openvswitch/vlog.h"
+#include "dp-packet.h"
+#include "util.h"
+
+VLOG_DEFINE_THIS_MODULE(netdev_dpdk_vdpa);
+
+#define NETDEV_DPDK_VDPA_SIZEOF_MBUF        (sizeof(struct rte_mbuf *))
+#define NETDEV_DPDK_VDPA_MAX_QPAIRS         16
+#define NETDEV_DPDK_VDPA_INVALID_QUEUE_ID   0x
+#define NETDEV_DPDK_VDPA_STATS_MAX_STR_SIZE 64
+#define NETDEV_DPDK_VDPA_RX_DESC_DEFAULT    512
+
+enum netdev_dpdk_vdpa_port_type {
+NETDEV_DPDK_VDPA_PORT_TYPE_VM,
+NETDEV_DPDK_VDPA_PORT_TYPE_VF
+};
+
+struct netdev_dpdk_vdpa_relay_flow {
+struct rte_flow *flow;
+bool queues_en[RTE_MAX_QUEUES_PER_PORT];
+uint32_t priority;
+};
+
+struct netdev_dpdk_vdpa_qpair {
+uint16_t port_id_rx;
+uint16_t port_id_tx;
+uint16_t pr_queue;
+uint8_t mb_head;
+uint8_t mb_tail;
+struct rte_mbuf *pkts[NETDEV_MAX_BURST * 2];
+};
+
+struct netdev_dpdk_vdpa_relay {
+PADDED_MEMBERS(CACHE_LINE_SIZE,
+struct netdev_dpdk_vdpa_qpair qpair[NETDEV_DPDK_VDPA_MAX_QPAIRS * 2];
+uint16_t num_queues;
+struct netdev_dpdk_vdpa_relay_flow flow_params;
+int port_id_vm;
+int port_id_vf;
+uint16_t vf_mtu;
+int n_rxq;
+char *vf_pci;
+char *vm_socket;
+char *vhost_name;
+bool started;
+);
+};
+
+static int
+netdev_dpdk_vdpa_port_from_name(const char *name)
+{
+int port_id;
+size_t len;
+
+len = strlen(name);
+for (port_id = 0; port_id < RTE_MAX_ETHPORTS; port_id++) {
+if (rte_eth_dev_is_valid_port(port_id) &&
+!strncmp(name, rte_eth_devices[port_id].device->name, len)) {
+return port_id;
+}
+}
+VLOG_ERR("No port was found for %s", name);
+return ENODEV;
+}
+
+static void
+netdev_dpdk_vdpa_free(void *ptr)
+{
+if (ptr == NULL) {
+return;
+}
+free(ptr);
+ptr = NULL;
+}
+
+static void
+netdev_dpdk_vdpa_clear_relay(struct netdev_dpdk_vdpa_relay *relay)
+{
+uint16_t q;
+uint8_t i;
+
+for (q = 0; q < relay->num_queues; q++) {
+for (i = relay->qpair[q].mb_head; i < relay->qpair[q].mb_tail; i++) {
+rte_pktmbuf_free(relay->qpair[q].pkts[i]);
+}
+relay->qpair[q].mb_head = 0;
+relay->qpair[q].mb_tail = 0;
+relay->qpair[q].port_id_rx = 0;
+relay->qpair[q].port_id_tx = 0;
+relay->qpair[q].pr_queue = NETDEV_DPDK_VDPA_INVALID_QUEUE_ID;
+}
+
+relay->started = false;
+relay->port_id_vm = 0;
+relay->port_i

[ovs-dev] [PATCH ovs v1 0/2] Allow setting MAC on DPDK interfaces

2020-03-05 Thread Noa Ezra
In a cloud topology where SR-IOV with port representors is in use and the
VM is not trusted, the orchestration layer should set the VF MAC address.
With DPDK, an architectural limitation prevents setting the VF MAC
address from the host (Linux tooling).
Following a previous discussion (https://patchwork.ozlabs.org/patch/1215075/),
it was agreed to add a new ovs-appctl command for setting the MAC address on
port representors.

ovs-appctl netdev-dpdk/set-mac  
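
The command takes the MAC address as a string, so it has to be parsed into
six bytes before it can be applied. The patch uses OVS's own
eth_addr_from_string() for this; the standalone sketch below only
illustrates the idea with a hypothetical parse_mac() helper built on
sscanf(), and is deliberately lenient (for example, it accepts single-digit
octets).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define MAC_LEN 6

/* Hypothetical helper, not an OVS function: parses "xx:xx:xx:xx:xx:xx"
 * into 'mac'.  Returns false on malformed or trailing input. */
static bool
parse_mac(const char *s, uint8_t mac[MAC_LEN])
{
    unsigned int b[MAC_LEN];
    int consumed = 0;
    int i;

    if (!s) {
        return false;
    }
    /* %n records how many characters were consumed so that trailing
     * garbage after the sixth octet can be rejected. */
    if (sscanf(s, "%2x:%2x:%2x:%2x:%2x:%2x%n",
               &b[0], &b[1], &b[2], &b[3], &b[4], &b[5],
               &consumed) != MAC_LEN
        || s[consumed] != '\0') {
        return false;
    }
    for (i = 0; i < MAC_LEN; i++) {
        mac[i] = (uint8_t) b[i];
    }
    return true;
}
```

Rejecting the string before touching the device keeps a malformed argument
from ever reaching the hardware.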

Ilya Maximets (1):
  netdev-dpdk: Add ability to set MAC address.

Noa Ezra (1):
  netdev-dpdk: Allow setting MAC on DPDK interfaces

 lib/netdev-dpdk.c | 55 ---
 1 file changed, 52 insertions(+), 3 deletions(-)

-- 
1.8.3.1



[ovs-dev] [PATCH ovs v1 1/2] netdev-dpdk: Add ability to set MAC address.

2020-03-05 Thread Noa Ezra
From: Ilya Maximets 

It is possible to set the MAC address for DPDK ports by calling
rte_eth_dev_default_mac_addr_set().  Until now OVS did not use this
functionality, so the real MAC address was never configured on the device.

With this change, the following command results in a real MAC address
update on the HW NIC:

  ovs-vsctl set Interface  mac="xx:xx:xx:xx:xx:xx"
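
The change follows a "commit on success" pattern: the cached address is
updated only after the device accepted the new one. The sketch below shows
that pattern in isolation. The fake_dev type, its hw_set_mac callback and
the hw_ok/hw_reject helpers are hypothetical stand-ins; in the patch the
hardware call is rte_eth_dev_default_mac_addr_set().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6

/* Hypothetical device model, not the OVS struct netdev_dpdk. */
struct fake_dev {
    uint8_t hwaddr[MAC_LEN];   /* Cached MAC address. */
    int change_seq;            /* Bumped whenever the cached MAC changes. */
    /* Returns 0 on success or a negative errno value on failure. */
    int (*hw_set_mac)(const uint8_t mac[MAC_LEN]);
};

static int
hw_ok(const uint8_t mac[MAC_LEN])
{
    (void) mac;
    return 0;
}

static int
hw_reject(const uint8_t mac[MAC_LEN])
{
    (void) mac;
    return -EPERM;
}

/* Returns 0 on success or a positive errno value on failure. */
static int
dev_set_mac(struct fake_dev *dev, const uint8_t mac[MAC_LEN])
{
    int err = 0;

    if (!memcmp(dev->hwaddr, mac, MAC_LEN)) {
        return 0;                      /* Nothing to do. */
    }
    if (dev->hw_set_mac) {
        err = dev->hw_set_mac(mac);    /* Program the device first. */
    }
    if (!err) {
        memcpy(dev->hwaddr, mac, MAC_LEN);   /* Commit only on success. */
        dev->change_seq++;
    }
    return -err;
}
```

With this ordering the cached address never disagrees with what the device
actually accepted, and the caller learns about the failure.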

Signed-off-by: Ilya Maximets 
Acked-by: Ben Pfaff 
---
 lib/netdev-dpdk.c | 19 ---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 7ab8186..e375b3d 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -2968,15 +2968,28 @@ static int
 netdev_dpdk_set_etheraddr(struct netdev *netdev, const struct eth_addr mac)
 {
     struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
+    int err = 0;
 
     ovs_mutex_lock(&dev->mutex);
     if (!eth_addr_equals(dev->hwaddr, mac)) {
-        dev->hwaddr = mac;
-        netdev_change_seq_changed(netdev);
+        if (dev->type == DPDK_DEV_ETH) {
+            struct rte_ether_addr ea;
+
+            memcpy(ea.addr_bytes, mac.ea, ETH_ADDR_LEN);
+            err = rte_eth_dev_default_mac_addr_set(dev->port_id, &ea);
+        }
+        if (!err) {
+            dev->hwaddr = mac;
+            netdev_change_seq_changed(netdev);
+        } else {
+            VLOG_WARN("%s: Failed to set requested mac("ETH_ADDR_FMT"): %s",
+                      netdev_get_name(netdev), ETH_ADDR_ARGS(mac),
+                      rte_strerror(-err));
+        }
     }
     ovs_mutex_unlock(&dev->mutex);
 
-    return 0;
+    return -err;
 }
 
 static int
-- 
1.8.3.1



[ovs-dev] [PATCH ovs v1 2/2] netdev-dpdk: Allow setting MAC on DPDK interfaces

2020-03-05 Thread Noa Ezra
Add a command for setting the MAC address of DPDK interfaces:
ovs-appctl netdev-dpdk/set-mac  

Signed-off-by: Noa Ezra 
Acked-by: Roni Bar Yanai 
---
 lib/netdev-dpdk.c | 36 
 1 file changed, 36 insertions(+)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index e375b3d..2b8adac 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -3917,6 +3917,38 @@ out:
     netdev_close(netdev);
 }
 
+static void
+netdev_dpdk_set_mac(struct unixctl_conn *conn, int argc OVS_UNUSED,
+                    const char *argv[], void *aux OVS_UNUSED)
+{
+    struct netdev *netdev = NULL;
+    char *response = NULL;
+    struct eth_addr mac;
+    int error;
+
+    netdev = netdev_from_name(argv[1]);
+    if (!netdev || !is_dpdk_class(netdev->netdev_class)) {
+        unixctl_command_reply_error(conn, "Not a DPDK Interface");
+        return;
+    }
+
+    if (!argv[2] || !eth_addr_from_string(argv[2], &mac)) {
+        response = xasprintf("No MAC address to set.");
+        goto out;
+    }
+
+    error = netdev_dpdk_set_etheraddr(netdev, mac);
+    response = error
+        ? xasprintf("interface %s: setting MAC failed (%s)",
+                    argv[1], ovs_strerror(error))
+        : xasprintf("set-mac done.");
+
+out:
+    unixctl_command_reply(conn, response);
+    free(response);
+    netdev_close(netdev);
+}
+
 /*
  * Set virtqueue flags so that we do not receive interrupts.
  */
@@ -4256,6 +4288,10 @@ netdev_dpdk_class_init(void)
  "[netdev]", 0, 1,
                              netdev_dpdk_get_mempool_info, NULL);
 
+    unixctl_command_register("netdev-dpdk/set-mac",
+                             "[netdev] [mac]", 2, 2,
+                             netdev_dpdk_set_mac, NULL);
+
     ret = rte_eth_dev_callback_register(RTE_ETH_ALL,
                                         RTE_ETH_EVENT_INTR_RESET,
                                         dpdk_eth_event_callback, NULL);
-- 
1.8.3.1



Re: [ovs-dev] [PATCH ovs v3 2/2] netdev-dpdk: Add dpdkvdpa port

2019-10-22 Thread Noa Ezra
Hi,
Please see the answer below.

Thanks,
Noa.

> -----Original Message-----
> From: William Tu [mailto:u9012...@gmail.com]
> Sent: Friday, October 18, 2019 12:34 AM
> To: Noa Ezra 
> Cc: ovs-dev@openvswitch.org; Oz Shlomo ; Majd
> Dibbiny ; Ameer Mahagneh
> ; Eli Britstein 
> Subject: Re: [ovs-dev] [PATCH ovs v3 2/2] netdev-dpdk: Add dpdkvdpa port
> 
> On Thu, Oct 17, 2019 at 02:16:56PM +0300, Noa Ezra wrote:
> 
> Hi Noa,
> 
> Thanks for the patch. I'm new to this and have a question below.
> 
> > dpdkvdpa netdev works with 3 components:
> > vhost-user socket, vdpa device: real vdpa device or a VF and
> > representor of "vdpa device".
> >
> > In order to add a new vDPA port, add a new port to existing bridge
> > with type dpdkvdpa and vDPA options:
> > ovs-vsctl add-port br0 vdpa0 -- set Interface vdpa0 type=dpdkvdpa
> >options:vdpa-socket-path=
> >options:vdpa-accelerator-devargs=
> >options:dpdk-devargs=,representor=[id]
> >
> > On this command OVS will create a new netdev:
> > 1. Register vhost-user-client device.
> > 2. Open and configure VF dpdk port.
> > 3. Open and configure representor dpdk port.
> >
> > The new netdev will use netdev_rxq_recv() function in order to receive
> > packets from VF and push to vhost-user and receive packets from
> > vhost-user and push to VF.
> 
> So does OVS in this case is able to apply OpenFlow rules on packets?
> 
> When netdev_dpdk_vdpa_rxq_recv() is invoked, does the batch of packets
> go into OVS's parse, lookup, action pipeline? Or all packets go directly into
> VM if (VF -> VM) and vice versa?
> 
> Is
> fwd_rx = netdev_dpdk_vdpa_rxq_recv_impl(dev->relay, rxq->queue_id);
> forward packets from vhost-user to VF and ret =
> netdev_dpdk_rxq_recv(rxq, batch, qfill); forward packets from vhost-user to
> VM?

I hope I understood your question correctly: netdev_dpdk_vdpa_rxq_recv()
forwards packets from the VM to the VF and vice versa.
There is no change in the processing of packets between the VF and the
up-link, and no change in the packet headers.
The new netdev only translates between the SR-IOV (phy) VF and the virtIO VM.

> Thanks
> William



[ovs-dev] [PATCH ovs v3 1/2] netdev-dpdk-vdpa: Introduce dpdkvdpa netdev

2019-10-17 Thread Noa Ezra
vDPA netdev is designed to support both SW and HW use cases.
HW mode will be used to configure vDPA capable devices.
SW acceleration is used to leverage SRIOV offloads to virtio guests
by relaying packets between VF and virtio devices.
Add the SW relay forwarding logic as a pre-step for adding dpdkvdpa
port with no functional change.

Signed-off-by: Noa Ezra 
Reviewed-by: Oz Shlomo 
---
 lib/automake.mk|   4 +-
 lib/netdev-dpdk-vdpa.c | 750 +
 lib/netdev-dpdk-vdpa.h |  54 
 3 files changed, 807 insertions(+), 1 deletion(-)
 create mode 100755 lib/netdev-dpdk-vdpa.c
 create mode 100644 lib/netdev-dpdk-vdpa.h

diff --git a/lib/automake.mk b/lib/automake.mk
index 17b36b4..38e027f 100644
--- a/lib/automake.mk
+++ b/lib/automake.mk
@@ -144,6 +144,7 @@ lib_libopenvswitch_la_SOURCES = \
lib/netdev-offload.h \
lib/netdev-offload-provider.h \
lib/netdev-provider.h \
+   lib/netdev-dpdk-vdpa.h \
lib/netdev-vport.c \
lib/netdev-vport.h \
lib/netdev-vport-private.h \
@@ -426,7 +427,8 @@ if DPDK_NETDEV
 lib_libopenvswitch_la_SOURCES += \
lib/dpdk.c \
lib/netdev-dpdk.c \
-   lib/netdev-offload-dpdk.c
+   lib/netdev-offload-dpdk.c \
+   lib/netdev-dpdk-vdpa.c
 else
 lib_libopenvswitch_la_SOURCES += \
lib/dpdk-stub.c
diff --git a/lib/netdev-dpdk-vdpa.c b/lib/netdev-dpdk-vdpa.c
new file mode 100755
index 0000000..d8f8fb0
--- /dev/null
+++ b/lib/netdev-dpdk-vdpa.c
@@ -0,0 +1,750 @@
+/*
+ * Copyright (c) 2019 Mellanox Technologies, Ltd.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#include 
+#include "netdev-dpdk-vdpa.h"
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "netdev-provider.h"
+#include "openvswitch/vlog.h"
+#include "dp-packet.h"
+#include "util.h"
+
+VLOG_DEFINE_THIS_MODULE(netdev_dpdk_vdpa);
+
+#define NETDEV_DPDK_VDPA_SIZEOF_MBUF        (sizeof(struct rte_mbuf *))
+#define NETDEV_DPDK_VDPA_MAX_QPAIRS         128
+#define NETDEV_DPDK_VDPA_INVALID_QUEUE_ID   0x
+#define NETDEV_DPDK_VDPA_STATS_MAX_STR_SIZE 64
+#define NETDEV_DPDK_VDPA_RX_DESC_DEFAULT    512
+
+enum netdev_dpdk_vdpa_port_type {
+NETDEV_DPDK_VDPA_PORT_TYPE_VM,
+NETDEV_DPDK_VDPA_PORT_TYPE_VF
+};
+
+struct netdev_dpdk_vdpa_relay_flow {
+struct rte_flow *flow;
+bool queues_en[RTE_MAX_QUEUES_PER_PORT];
+uint32_t priority;
+};
+
+struct netdev_dpdk_vdpa_qpair {
+uint16_t port_id_rx;
+uint16_t port_id_tx;
+uint16_t pr_queue;
+uint8_t mb_head;
+uint8_t mb_tail;
+struct rte_mbuf *pkts[NETDEV_MAX_BURST * 2];
+};
+
+struct netdev_dpdk_vdpa_relay {
+PADDED_MEMBERS(CACHE_LINE_SIZE,
+struct netdev_dpdk_vdpa_qpair qpair[NETDEV_DPDK_VDPA_MAX_QPAIRS * 2];
+uint16_t num_queues;
+struct netdev_dpdk_vdpa_relay_flow flow_params;
+int port_id_vm;
+int port_id_vf;
+uint16_t vf_mtu;
+int n_rxq;
+char *vf_pci;
+char *vm_socket;
+char *vhost_name;
+);
+};
+
+static int
+netdev_dpdk_vdpa_port_from_name(const char *name)
+{
+int port_id;
+size_t len;
+
+len = strlen(name);
+for (port_id = 0; port_id < RTE_MAX_ETHPORTS; port_id++) {
+if (rte_eth_dev_is_valid_port(port_id) &&
+!strncmp(name, rte_eth_devices[port_id].device->name, len)) {
+return port_id;
+}
+}
+VLOG_ERR("No port was found for %s", name);
+return -1;
+}
+
+static void
+netdev_dpdk_vdpa_free(void *ptr)
+{
+if (ptr == NULL) {
+return;
+}
+free(ptr);
+ptr = NULL;
+}
+static void
+netdev_dpdk_vdpa_clear_relay(struct netdev_dpdk_vdpa_relay *relay)
+{
+uint16_t q;
+uint8_t i;
+
+for (q = 0; q < relay->num_queues; q++) {
+for (i = relay->qpair[q].mb_head; i < relay->qpair[q].mb_tail; i++) {
+rte_pktmbuf_free(relay->qpair[q].pkts[i]);
+}
+relay->qpair[q].mb_head = 0;
+relay->qpair[q].mb_tail = 0;
+relay->qpair[q].port_id_rx = 0;
+relay->qpair[q].port_id_tx = 0;
+relay->qpair[q].pr_queue = NETDEV_DPDK_VDPA_INVALID_QUEUE_ID;
+}
+
+relay->port_id_vm = 0;
+relay->port_id_vf = 0;
+relay->num_queues = 0;
+rel

[ovs-dev] [PATCH ovs v3 2/2] netdev-dpdk: Add dpdkvdpa port

2019-10-17 Thread Noa Ezra
The dpdkvdpa netdev works with 3 components:
a vhost-user socket, a vdpa device (a real vdpa device or a VF), and a
representor of the vdpa device.

To add a new vDPA port, add a new port to an existing bridge
with type dpdkvdpa and the vDPA options:
ovs-vsctl add-port br0 vdpa0 -- set Interface vdpa0 type=dpdkvdpa
   options:vdpa-socket-path=
   options:vdpa-accelerator-devargs=
   options:dpdk-devargs=,representor=[id]

When this command runs, OVS creates a new netdev:
1. Register vhost-user-client device.
2. Open and configure VF dpdk port.
3. Open and configure representor dpdk port.

The new netdev uses the netdev_rxq_recv() function to receive
packets from the VF and push them to vhost-user, and to receive packets
from vhost-user and push them to the VF.

Signed-off-by: Noa Ezra 
Reviewed-by: Oz Shlomo 
---
 Documentation/automake.mk           |   1 +
 Documentation/topics/dpdk/index.rst |   1 +
 Documentation/topics/dpdk/vdpa.rst  |  90 
 NEWS                                |   1 +
 lib/netdev-dpdk.c                   | 162 
 vswitchd/vswitch.xml                |  25 ++
 6 files changed, 280 insertions(+)
 create mode 100644 Documentation/topics/dpdk/vdpa.rst

diff --git a/Documentation/automake.mk b/Documentation/automake.mk
index cd68f3b..ee574bc 100644
--- a/Documentation/automake.mk
+++ b/Documentation/automake.mk
@@ -43,6 +43,7 @@ DOC_SOURCE = \
Documentation/topics/dpdk/ring.rst \
Documentation/topics/dpdk/vdev.rst \
Documentation/topics/dpdk/vhost-user.rst \
+   Documentation/topics/dpdk/vdpa.rst \
Documentation/topics/fuzzing/index.rst \
Documentation/topics/fuzzing/what-is-fuzzing.rst \
Documentation/topics/fuzzing/ovs-fuzzing-infrastructure.rst \
diff --git a/Documentation/topics/dpdk/index.rst b/Documentation/topics/dpdk/index.rst
index cf24a7b..c1d4ea7 100644
--- a/Documentation/topics/dpdk/index.rst
+++ b/Documentation/topics/dpdk/index.rst
@@ -41,3 +41,4 @@ The DPDK Datapath
/topics/dpdk/pdump
/topics/dpdk/jumbo-frames
/topics/dpdk/memory
+   /topics/dpdk/vdpa
diff --git a/Documentation/topics/dpdk/vdpa.rst b/Documentation/topics/dpdk/vdpa.rst
new file mode 100644
index 0000000..34c5300
--- /dev/null
+++ b/Documentation/topics/dpdk/vdpa.rst
@@ -0,0 +1,90 @@
+..
+  Copyright (c) 2019 Mellanox Technologies, Ltd.
+
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at:
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+  WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+  License for the specific language governing permissions and limitations
+  under the License.
+
+  Convention for heading levels in Open vSwitch documentation:
+
+  =======  Heading 0 (reserved for the title in a document)
+  -------  Heading 1
+  ~~~~~~~  Heading 2
+  +++++++  Heading 3
+  '''''''  Heading 4
+
+  Avoid deeper levels because they do not render well.
+
+
+===============
+DPDK VDPA Ports
+===============
+
+In user space there are two main approaches to communicate with a guest (VM),
+using virtIO ports (e.g. netdev type=dpdkvhostuser/dpdkvhostuserclient) or
+SR-IOV using phy ports (e.g. netdev type=dpdk).
+With phy ports, a port representor is attached to OVS and the matching VF is
+passed through to the guest.
+HW rules can process packets from up-link and direct them to the VF without
+going through SW (OVS) and therefore using phy ports gives the best
+performance.
+However, the SR-IOV architecture requires the guest to use a driver that is
+specific to the underlying HW. A HW-specific driver has two main drawbacks:
+1. Breaks virtualization in some sense (guest aware of the HW), can also limit
+the type of images supported.
+2. Less natural support for live migration.
+
+Using a virtIO port solves both problems, but reduces performance and loses
+some functionality; for example, some HW offloads cannot be supported when
+working directly with virtIO.
+
+The new dpdkvdpa netdev type resolves this conflict.
+It is very similar to the regular dpdk netdev but has some additional
+functionality.
+The port translates between a phy port and a virtIO port: it takes packets
+from an rx-queue and sends them to the matching tx-queue, transferring packets
+from a virtIO guest (VM) to a VF and vice versa, and so gains the benefits of
+both SR-IOV and virtIO.
+
+Quick Example
+-------------
+
+Configure OVS bridge and ports
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+You must first create a bridge and add ports to the switch.
+Since the dpdk

[ovs-dev] [PATCH ovs v3 0/2] Introduce dpdkvdpa netdev

2019-10-17 Thread Noa Ezra
There are two approaches to communicating with a guest: virtIO or
SR-IOV.
With SR-IOV, a port representor is attached to OVS and the matching VF
is passed through to the VM.
HW rules can process packets from the up-link and direct them to the VF
without going through SW (OVS), and therefore SR-IOV gives the best
performance.
However, the SR-IOV architecture requires the guest to use a driver
that is specific to the underlying HW. A HW-specific driver has two main
drawbacks:
1. Breaks virtualization in some sense (VM aware of the HW), can also
   limit the type of images supported.
2. Less natural support for live migration.

Using a virtIO interface solves both problems, but reduces performance and
loses some functionality; for example, some HW offloads cannot be
supported when working directly with virtIO.
To resolve this conflict, we created a new netdev type, dpdkvdpa.
The new netdev is similar to a regular dpdk netdev, but it has
additional functionality for transferring packets from a virtIO
guest (VM) to a VF and vice versa. With this solution we get the benefits
of both SR-IOV and virtIO.
vDPA netdev is designed to support both SW and HW use cases.
HW mode will be used to configure vDPA capable devices. Support
for this mode is in progress in the DPDK community.
SW acceleration is used to bring SR-IOV offloads to virtIO guests
by relaying packets between VF and virtIO devices, and serves as a
pre-step for supporting vDPA in HW mode.

Running example:
1. Configure OVS bridge and ports:
ovs-vsctl add-br br0-ovs -- set bridge br0-ovs datapath_type=netdev
ovs-vsctl add-port br0-ovs pf -- set Interface pf type=dpdk \
options:dpdk-devargs=
ovs-vsctl add-port br0-ovs vdpa0 -- set Interface vdpa0 type=dpdkvdpa \
options:vdpa-socket-path= \
options:vdpa-accelerator-devargs= \
options:dpdk-devargs=,representor=[id]
2. Run a virtIO guest (VM) in server mode that creates the socket of
   the vDPA port.
3. Send traffic.

Noa Ezra (2):
  netdev-dpdk-vdpa: Introduce dpdkvdpa netdev
  netdev-dpdk: Add dpdkvdpa port

 Documentation/automake.mk           |   1 +
 Documentation/topics/dpdk/index.rst |   1 +
 Documentation/topics/dpdk/vdpa.rst  |  90 +
 NEWS                                |   1 +
 lib/automake.mk                     |   4 +-
 lib/netdev-dpdk-vdpa.c              | 750 
 lib/netdev-dpdk-vdpa.h              |  54 +++
 lib/netdev-dpdk.c                   | 162 
 vswitchd/vswitch.xml                |  25 ++
 9 files changed, 1087 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/topics/dpdk/vdpa.rst
 create mode 100755 lib/netdev-dpdk-vdpa.c
 create mode 100644 lib/netdev-dpdk-vdpa.h

-- 
1.8.3.1



[ovs-dev] [PATCH ovs v2 2/2] netdev-dpdk: Add dpdkvdpa port

2019-10-02 Thread Noa Ezra
The dpdkvdpa netdev works with 3 components:
a vhost-user socket, a vdpa device (a real vdpa device or a VF), and a
representor of the vdpa device.

To add a new vDPA port, add a new port to an existing bridge
with type dpdkvdpa and the vDPA options:
ovs-vsctl add-port br0 vdpa0 -- set Interface vdpa0 type=dpdkvdpa
   options:vdpa-socket-path=
   options:vdpa-accelerator-devargs=
   options:dpdk-devargs=,representor=[id]

When this command runs, OVS creates a new netdev:
1. Register vhost-user-client device.
2. Open and configure VF dpdk port.
3. Open and configure representor dpdk port.

The new netdev uses the netdev_rxq_recv() function to receive
packets from the VF and push them to vhost-user, and to receive packets
from vhost-user and push them to the VF.

Signed-off-by: Noa Ezra 
Reviewed-by: Oz Shlomo 
---
 NEWS |   1 +
 lib/netdev-dpdk.c| 162 +++
 vswitchd/vswitch.xml |  25 
 3 files changed, 188 insertions(+)

diff --git a/NEWS b/NEWS
index f5a0b8f..6f315c6 100644
--- a/NEWS
+++ b/NEWS
@@ -542,6 +542,7 @@ v2.6.0 - 27 Sep 2016
  * Remove dpdkvhostcuse port type.
  * OVS client mode for vHost and vHost reconnect (Requires QEMU 2.7)
  * 'dpdkvhostuserclient' port type.
+ * 'dpdkvdpa' port type.
- Increase number of registers to 16.
- ovs-benchmark: This utility has been removed due to lack of use and
  bitrot.
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index bc20d68..16ddf58 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -47,6 +47,7 @@
 #include "dpif-netdev.h"
 #include "fatal-signal.h"
 #include "netdev-provider.h"
+#include "netdev-dpdk-vdpa.h"
 #include "netdev-vport.h"
 #include "odp-util.h"
 #include "openvswitch/dynamic-string.h"
@@ -137,6 +138,9 @@ typedef uint16_t dpdk_port_t;
 /* Legacy default value for vhost tx retries. */
 #define VHOST_ENQ_RETRY_DEF 8
 
+/* Size of VDPA custom stats. */
+#define VDPA_CUSTOM_STATS_SIZE  4
+
 #define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ)
 
 static const struct rte_eth_conf port_conf = {
@@ -461,6 +465,8 @@ struct netdev_dpdk {
 int rte_xstats_ids_size;
 uint64_t *rte_xstats_ids;
 );
+
+struct netdev_dpdk_vdpa_relay *relay;
 };
 
 struct netdev_rxq_dpdk {
@@ -1346,6 +1352,30 @@ netdev_dpdk_construct(struct netdev *netdev)
     return err;
 }
 
+static int
+netdev_dpdk_vdpa_construct(struct netdev *netdev)
+{
+    struct netdev_dpdk *dev;
+    int err;
+
+    err = netdev_dpdk_construct(netdev);
+    if (err) {
+        VLOG_ERR("netdev_dpdk_construct failed. Port: %s", netdev->name);
+        goto out;
+    }
+
+    ovs_mutex_lock(&dpdk_mutex);
+    dev = netdev_dpdk_cast(netdev);
+    dev->relay = netdev_dpdk_vdpa_alloc_relay();
+    if (!dev->relay) {
+        err = ENOMEM;
+    }
+
+    ovs_mutex_unlock(&dpdk_mutex);
+out:
+    return err;
+}
+
 static void
 common_destruct(struct netdev_dpdk *dev)
     OVS_REQUIRES(dpdk_mutex)
@@ -1428,6 +1458,19 @@ dpdk_vhost_driver_unregister(struct netdev_dpdk *dev OVS_UNUSED,
 }
 
 static void
+netdev_dpdk_vdpa_destruct(struct netdev *netdev)
+{
+    struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
+
+    ovs_mutex_lock(&dpdk_mutex);
+    netdev_dpdk_vdpa_destruct_impl(dev->relay);
+    rte_free(dev->relay);
+    ovs_mutex_unlock(&dpdk_mutex);
+
+    netdev_dpdk_destruct(netdev);
+}
+
+static void
 netdev_dpdk_vhost_destruct(struct netdev *netdev)
 {
     struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
@@ -1878,6 +1921,47 @@ out:
 }
 
 static int
+netdev_dpdk_vdpa_set_config(struct netdev *netdev, const struct smap *args,
+                            char **errp)
+{
+    struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
+    const char *vdpa_accelerator_devargs =
+        smap_get(args, "vdpa-accelerator-devargs");
+    const char *vdpa_socket_path =
+        smap_get(args, "vdpa-socket-path");
+    int err = 0;
+
+    if ((vdpa_accelerator_devargs == NULL) || (vdpa_socket_path == NULL)) {
+        VLOG_ERR("netdev_dpdk_vdpa_set_config failed. "
+                 "Required arguments are missing for VDPA port %s",
+                 netdev->name);
+        goto free_relay;
+    }
+
+    err = netdev_dpdk_set_config(netdev, args, errp);
+    if (err) {
+        VLOG_ERR("netdev_dpdk_set_config failed. Port: %s", netdev->name);
+        goto free_relay;
+    }
+
+    err = netdev_dpdk_vdpa_config_impl(dev->relay, dev->port_id,
+                                       vdpa_socket_path,
+                                       vdpa_accelerator_devargs);
+    if (err) {
+        VLOG_ERR("netdev_dpdk_vdpa_config_impl failed. Port %s",
+                 netdev->name);
+        goto free_relay;
+    }
+
+    goto out;
+
+free_relay:

[ovs-dev] [PATCH ovs v2 1/2] netdev-dpdk-vdpa: Introduce dpdkvdpa netdev

2019-10-02 Thread Noa Ezra
The vDPA netdev is designed to support both SW and HW use cases.
HW mode will be used to configure vDPA-capable devices.
SW acceleration leverages SR-IOV offloads for virtio guests by relaying
packets between VF and virtio devices.
Add the SW relay forwarding logic as a pre-step for adding the dpdkvdpa
port, with no functional change.
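
The relay keeps, per queue pair, a small packet array with head and tail
indexes (mb_head, mb_tail and pkts[] in struct netdev_dpdk_vdpa_qpair
below). The forwarding code that uses them is not visible in the quoted
part of the diff, so the following is only one plausible way such a
buffer can behave, not the patch's implementation. All sketch_* names,
the burst size of 32 and the use of ints in place of struct rte_mbuf
pointers are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_BURST 32                     /* assumed burst size */
#define SKETCH_BUF_LEN   (SKETCH_MAX_BURST * 2)

struct sketch_qpair {
    uint8_t mb_head;           /* first packet not yet transmitted */
    uint8_t mb_tail;           /* one past the last buffered packet */
    int pkts[SKETCH_BUF_LEN];  /* ints stand in for struct rte_mbuf * */
};

/* Number of buffered packets still waiting to be transmitted. */
static size_t
sketch_pending(const struct sketch_qpair *q)
{
    return (size_t) (q->mb_tail - q->mb_head);
}

/* Appends up to 'n' received packets at the tail; returns how many fit. */
static size_t
sketch_enqueue(struct sketch_qpair *q, const int *pkts, size_t n)
{
    size_t room = (size_t) (SKETCH_BUF_LEN - q->mb_tail);
    size_t stored = n < room ? n : room;

    for (size_t i = 0; i < stored; i++) {
        q->pkts[q->mb_tail++] = pkts[i];
    }
    return stored;
}

/* Marks 'sent' packets as transmitted and rewinds the buffer once empty. */
static void
sketch_consume(struct sketch_qpair *q, size_t sent)
{
    assert(sent <= sketch_pending(q));
    q->mb_head = (uint8_t) (q->mb_head + sent);
    if (q->mb_head == q->mb_tail) {
        q->mb_head = 0;
        q->mb_tail = 0;
    }
}
```

With this layout a partial transmit leaves the unsent packets between
mb_head and mb_tail, which is the same range that
netdev_dpdk_vdpa_clear_relay() frees.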

Signed-off-by: Noa Ezra 
Reviewed-by: Oz Shlomo 
---
 lib/automake.mk|   4 +-
 lib/netdev-dpdk-vdpa.c | 750 +
 lib/netdev-dpdk-vdpa.h |  54 
 3 files changed, 807 insertions(+), 1 deletion(-)
 create mode 100755 lib/netdev-dpdk-vdpa.c
 create mode 100644 lib/netdev-dpdk-vdpa.h

diff --git a/lib/automake.mk b/lib/automake.mk
index 17b36b4..38e027f 100644
--- a/lib/automake.mk
+++ b/lib/automake.mk
@@ -144,6 +144,7 @@ lib_libopenvswitch_la_SOURCES = \
    lib/netdev-offload.h \
    lib/netdev-offload-provider.h \
    lib/netdev-provider.h \
+   lib/netdev-dpdk-vdpa.h \
    lib/netdev-vport.c \
    lib/netdev-vport.h \
    lib/netdev-vport-private.h \
@@ -426,7 +427,8 @@ if DPDK_NETDEV
 lib_libopenvswitch_la_SOURCES += \
    lib/dpdk.c \
    lib/netdev-dpdk.c \
-   lib/netdev-offload-dpdk.c
+   lib/netdev-offload-dpdk.c \
+   lib/netdev-dpdk-vdpa.c
 else
 lib_libopenvswitch_la_SOURCES += \
    lib/dpdk-stub.c
diff --git a/lib/netdev-dpdk-vdpa.c b/lib/netdev-dpdk-vdpa.c
new file mode 100755
index 000..d8f8fb0
--- /dev/null
+++ b/lib/netdev-dpdk-vdpa.c
@@ -0,0 +1,750 @@
+/*
+ * Copyright (c) 2019 Mellanox Technologies, Ltd.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#include 
+#include "netdev-dpdk-vdpa.h"
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "netdev-provider.h"
+#include "openvswitch/vlog.h"
+#include "dp-packet.h"
+#include "util.h"
+
+VLOG_DEFINE_THIS_MODULE(netdev_dpdk_vdpa);
+
+#define NETDEV_DPDK_VDPA_SIZEOF_MBUF        (sizeof(struct rte_mbuf *))
+#define NETDEV_DPDK_VDPA_MAX_QPAIRS         128
+#define NETDEV_DPDK_VDPA_INVALID_QUEUE_ID   0x
+#define NETDEV_DPDK_VDPA_STATS_MAX_STR_SIZE 64
+#define NETDEV_DPDK_VDPA_RX_DESC_DEFAULT    512
+
+enum netdev_dpdk_vdpa_port_type {
+    NETDEV_DPDK_VDPA_PORT_TYPE_VM,
+    NETDEV_DPDK_VDPA_PORT_TYPE_VF
+};
+
+struct netdev_dpdk_vdpa_relay_flow {
+    struct rte_flow *flow;
+    bool queues_en[RTE_MAX_QUEUES_PER_PORT];
+    uint32_t priority;
+};
+
+struct netdev_dpdk_vdpa_qpair {
+    uint16_t port_id_rx;
+    uint16_t port_id_tx;
+    uint16_t pr_queue;
+    uint8_t mb_head;
+    uint8_t mb_tail;
+    struct rte_mbuf *pkts[NETDEV_MAX_BURST * 2];
+};
+
+struct netdev_dpdk_vdpa_relay {
+    PADDED_MEMBERS(CACHE_LINE_SIZE,
+        struct netdev_dpdk_vdpa_qpair qpair[NETDEV_DPDK_VDPA_MAX_QPAIRS * 2];
+        uint16_t num_queues;
+        struct netdev_dpdk_vdpa_relay_flow flow_params;
+        int port_id_vm;
+        int port_id_vf;
+        uint16_t vf_mtu;
+        int n_rxq;
+        char *vf_pci;
+        char *vm_socket;
+        char *vhost_name;
+    );
+};
+
+static int
+netdev_dpdk_vdpa_port_from_name(const char *name)
+{
+    int port_id;
+    size_t len;
+
+    len = strlen(name);
+    for (port_id = 0; port_id < RTE_MAX_ETHPORTS; port_id++) {
+        if (rte_eth_dev_is_valid_port(port_id) &&
+            !strncmp(name, rte_eth_devices[port_id].device->name, len)) {
+            return port_id;
+        }
+    }
+    VLOG_ERR("No port was found for %s", name);
+    return -1;
+}
+
+static void
+netdev_dpdk_vdpa_free(void *ptr)
+{
+    if (ptr == NULL) {
+        return;
+    }
+    free(ptr);
+    ptr = NULL;
+}
+static void
+netdev_dpdk_vdpa_clear_relay(struct netdev_dpdk_vdpa_relay *relay)
+{
+    uint16_t q;
+    uint8_t i;
+
+    for (q = 0; q < relay->num_queues; q++) {
+        for (i = relay->qpair[q].mb_head; i < relay->qpair[q].mb_tail; i++) {
+            rte_pktmbuf_free(relay->qpair[q].pkts[i]);
+        }
+        relay->qpair[q].mb_head = 0;
+        relay->qpair[q].mb_tail = 0;
+        relay->qpair[q].port_id_rx = 0;
+        relay->qpair[q].port_id_tx = 0;
+        relay->qpair[q].pr_queue = NETDEV_DPDK_VDPA_INVALID_QUEUE_ID;
+    }
+
+    relay->port_id_vm = 0;
+    relay->port_id_vf = 0;
+    relay->num_queues = 0;
+rel

[ovs-dev] [PATCH ovs v2 0/2] Introduce dpdkvdpa netdev

2019-10-02 Thread Noa Ezra
There are two approaches to communicating with a guest: virtIO and
SR-IOV.
With SR-IOV, a port representor is attached to OVS and the matching VF
is passed through to the VM.
HW rules can process packets from the uplink and direct them to the VF
without going through SW (OVS), so SR-IOV gives the best performance.
However, the SR-IOV architecture requires the guest to use a driver
that is specific to the underlying HW. A HW-specific driver has two main
drawbacks:
1. It breaks virtualization in some sense (the VM is aware of the HW)
   and can limit the types of images supported.
2. It offers less natural support for live migration.

Using a virtIO interface solves both problems, but it reduces
performance and loses some functionality; for example, some HW offloads
cannot be supported when working directly with virtIO.
To resolve this conflict, we created a new netdev type, dpdkvdpa.
The new netdev is similar to a regular dpdk netdev, but it adds
functionality for transferring packets from a virtIO guest (VM) to a VF
and vice versa. With this solution we get the benefits of both SR-IOV
and virtIO.
The vDPA netdev is designed to support both SW and HW use cases.
HW mode will be used to configure vDPA-capable devices. Support for
this mode is in progress in the DPDK community.
SW acceleration leverages SR-IOV offloads for virtIO guests by relaying
packets between VF and virtIO devices, and serves as a pre-step for
supporting vDPA in HW mode.

Running example:
1. Configure OVS bridge and ports:
ovs-vsctl add-br br0-ovs -- set bridge br0-ovs datapath_type=netdev
ovs-vsctl add-port br0-ovs pf -- set Interface pf type=dpdk \
    options:dpdk-devargs=
ovs-vsctl add-port br0-ovs vdpa0 -- set Interface vdpa0 type=dpdkvdpa \
options:vdpa-socket-path= \
options:vdpa-accelerator-devargs= \
options:dpdk-devargs=,representor=[id]
2. Run a virtIO guest (VM) in server mode that creates the socket of
   the vDPA port.
3. Send traffic.
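
Of the options passed in step 1, the set_config code in patch 2/2
rejects a dpdkvdpa port when vdpa-socket-path or
vdpa-accelerator-devargs is missing. The sketch below shows that kind of
check in isolation. struct kv, kv_get() and check_vdpa_options() are
hypothetical helpers, not OVS APIs (OVS itself uses struct smap), and
both the EINVAL return value and the rejection of empty values are
assumptions of this sketch.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical key/value pair standing in for an OVS options map. */
struct kv {
    const char *key;
    const char *value;
};

/* Returns the value stored for 'key', or NULL if the key is absent. */
static const char *
kv_get(const struct kv *opts, size_t n, const char *key)
{
    for (size_t i = 0; i < n; i++) {
        if (!strcmp(opts[i].key, key)) {
            return opts[i].value;
        }
    }
    return NULL;
}

/* Returns 0 if every required vDPA option is present and non-empty,
 * otherwise EINVAL (an assumed error code for this sketch). */
static int
check_vdpa_options(const struct kv *opts, size_t n)
{
    static const char *const required[] = {
        "vdpa-socket-path",
        "vdpa-accelerator-devargs",
    };

    for (size_t i = 0; i < sizeof required / sizeof required[0]; i++) {
        const char *value = kv_get(opts, n, required[i]);

        if (value == NULL || value[0] == '\0') {
            return EINVAL;
        }
    }
    return 0;
}
```

Validating the options before touching any device keeps a
misconfigured port from leaving a half-initialized relay behind.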

Noa Ezra (2):
  netdev-dpdk-vdpa: Introduce dpdkvdpa netdev
  netdev-dpdk: Add dpdkvdpa port

 NEWS   |   1 +
 lib/automake.mk|   4 +-
 lib/netdev-dpdk-vdpa.c | 750 +
 lib/netdev-dpdk-vdpa.h |  54 
 lib/netdev-dpdk.c  | 162 +++
 vswitchd/vswitch.xml   |  25 ++
 6 files changed, 995 insertions(+), 1 deletion(-)
 create mode 100755 lib/netdev-dpdk-vdpa.c
 create mode 100644 lib/netdev-dpdk-vdpa.h

-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovs V1 1/2] netdev-dpdk-vdpa: Introduce dpdkvdpa netdev

2019-09-14 Thread Noa Ezra
The vDPA netdev is designed to support both SW and HW use cases.
HW mode will be used to configure vDPA-capable devices.
SW acceleration leverages SR-IOV offloads for virtio guests by relaying
packets between VF and virtio devices.
Add the SW relay forwarding logic as a pre-step for adding the dpdkvdpa
port, with no functional change.

Signed-off-by: Noa Ezra 
Reviewed-by: Oz Shlomo 
---
 lib/automake.mk|   4 +-
 lib/netdev-dpdk-vdpa.c | 750 +
 lib/netdev-dpdk-vdpa.h |  54 
 3 files changed, 807 insertions(+), 1 deletion(-)
 create mode 100755 lib/netdev-dpdk-vdpa.c
 create mode 100644 lib/netdev-dpdk-vdpa.h

diff --git a/lib/automake.mk b/lib/automake.mk
index 17b36b4..38e027f 100644
--- a/lib/automake.mk
+++ b/lib/automake.mk
@@ -144,6 +144,7 @@ lib_libopenvswitch_la_SOURCES = \
    lib/netdev-offload.h \
    lib/netdev-offload-provider.h \
    lib/netdev-provider.h \
+   lib/netdev-dpdk-vdpa.h \
    lib/netdev-vport.c \
    lib/netdev-vport.h \
    lib/netdev-vport-private.h \
@@ -426,7 +427,8 @@ if DPDK_NETDEV
 lib_libopenvswitch_la_SOURCES += \
    lib/dpdk.c \
    lib/netdev-dpdk.c \
-   lib/netdev-offload-dpdk.c
+   lib/netdev-offload-dpdk.c \
+   lib/netdev-dpdk-vdpa.c
 else
 lib_libopenvswitch_la_SOURCES += \
    lib/dpdk-stub.c
diff --git a/lib/netdev-dpdk-vdpa.c b/lib/netdev-dpdk-vdpa.c
new file mode 100755
index 000..ca831f2
--- /dev/null
+++ b/lib/netdev-dpdk-vdpa.c
@@ -0,0 +1,750 @@
+/*
+ * Copyright (c) 2019 Mellanox Technologies, Ltd.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#include 
+#include "netdev-dpdk-vdpa.h"
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "netdev-provider.h"
+#include "openvswitch/vlog.h"
+#include "dp-packet.h"
+#include "util.h"
+
+VLOG_DEFINE_THIS_MODULE(netdev_dpdk_vdpa);
+
+#define NETDEV_DPDK_VDPA_SIZEOF_MBUF        (sizeof(struct rte_mbuf *))
+#define NETDEV_DPDK_VDPA_MAX_QPAIRS         128
+#define NETDEV_DPDK_VDPA_INVALID_QUEUE_ID   0x
+#define NETDEV_DPDK_VDPA_STATS_MAX_STR_SIZE 64
+#define NETDEV_DPDK_VDPA_RX_DESC_DEFAULT    512
+
+enum netdev_dpdk_vdpa_port_type {
+    NETDEV_DPDK_VDPA_PORT_TYPE_VM,
+    NETDEV_DPDK_VDPA_PORT_TYPE_VF
+};
+
+struct netdev_dpdk_vdpa_relay_flow {
+    struct rte_flow *flow;
+    bool queues_en[RTE_MAX_QUEUES_PER_PORT];
+    uint32_t priority;
+};
+
+struct netdev_dpdk_vdpa_qpair {
+    uint16_t port_id_rx;
+    uint16_t port_id_tx;
+    uint16_t pr_queue;
+    uint8_t mb_head;
+    uint8_t mb_tail;
+    struct rte_mbuf *pkts[NETDEV_MAX_BURST * 2];
+};
+
+struct netdev_dpdk_vdpa_relay {
+    PADDED_MEMBERS(CACHE_LINE_SIZE,
+        struct netdev_dpdk_vdpa_qpair qpair[NETDEV_DPDK_VDPA_MAX_QPAIRS * 2];
+        uint16_t num_queues;
+        struct netdev_dpdk_vdpa_relay_flow flow_params;
+        int port_id_vm;
+        int port_id_vf;
+        uint16_t vf_mtu;
+        int n_rxq;
+        char *vf_pci;
+        char *vm_socket;
+        char *vhost_name;
+    );
+};
+
+static int
+netdev_dpdk_vdpa_port_from_name(const char *name)
+{
+    int port_id;
+    size_t len;
+
+    len = strlen(name);
+    for (port_id = 0; port_id < RTE_MAX_ETHPORTS; port_id++) {
+        if (rte_eth_dev_is_valid_port(port_id) &&
+            !strncmp(name, rte_eth_devices[port_id].device->name, len)) {
+            return port_id;
+        }
+    }
+    VLOG_ERR("No port was found for %s", name);
+    return -1;
+}
+
+static void
+netdev_dpdk_vdpa_free(void *ptr)
+{
+    if (ptr == NULL) {
+        return;
+    }
+    free(ptr);
+    ptr = NULL;
+}
+static void
+netdev_dpdk_vdpa_clear_relay(struct netdev_dpdk_vdpa_relay *relay)
+{
+    uint16_t q;
+    uint8_t i;
+
+    for (q = 0; q < relay->num_queues; q++) {
+        for (i = relay->qpair[q].mb_head; i < relay->qpair[q].mb_tail; i++) {
+            rte_pktmbuf_free(relay->qpair[q].pkts[i]);
+        }
+        relay->qpair[q].mb_head = 0;
+        relay->qpair[q].mb_tail = 0;
+        relay->qpair[q].port_id_rx = 0;
+        relay->qpair[q].port_id_tx = 0;
+        relay->qpair[q].pr_queue = NETDEV_DPDK_VDPA_INVALID_QUEUE_ID;
+    }
+
+    relay->port_id_vm = 0;
+    relay->port_id_vf = 0;
+    relay->num_queues = 0;
+rel

[ovs-dev] [PATCH ovs V1 2/2] netdev-dpdk: Add dpdkvdpa port

2019-09-14 Thread Noa Ezra
The dpdkvdpa netdev works with three components:
a vhost-user socket, a vdpa device (a real vdpa device or a VF), and a
representor of the vdpa device.

To add a new vDPA port, add a port of type dpdkvdpa to an existing
bridge and set the vDPA options:
ovs-vsctl add-port br0 vdpa0 -- set Interface vdpa0 type=dpdkvdpa \
   options:vdpa-socket-path= \
   options:vdpa-accelerator-devargs= \
   options:dpdk-devargs=,representor=[id]

When this command is issued, OVS creates a new netdev and performs the
following steps:
1. Registers a vhost-user-client device.
2. Opens and configures the VF dpdk port.
3. Opens and configures the representor dpdk port.
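
The three steps above can be read as an ordered bring-up in which a
failure at a later step undoes the earlier ones. The sketch below
illustrates that pattern only. struct vdpa_setup, step_start(),
step_stop() and vdpa_bring_up() are hypothetical names, the steps are
simulated with flags instead of real devices, and whether the patch
unwinds in exactly this order is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>

enum { STEP_VHOST, STEP_VF, STEP_REPRESENTOR, N_STEPS };

/* Hypothetical setup state; 'fail_at' simulates a failing step. */
struct vdpa_setup {
    bool done[N_STEPS];
    int fail_at;            /* step index that fails, or -1 for none */
};

/* Simulates starting one step; returns 0 on success, -1 on failure. */
static int
step_start(struct vdpa_setup *s, int step)
{
    if (step == s->fail_at) {
        return -1;
    }
    s->done[step] = true;
    return 0;
}

/* Simulates undoing one step. */
static void
step_stop(struct vdpa_setup *s, int step)
{
    s->done[step] = false;
}

/* Runs the steps in order; on failure, undoes completed steps in
 * reverse order and returns -1. */
static int
vdpa_bring_up(struct vdpa_setup *s)
{
    for (int step = 0; step < N_STEPS; step++) {
        if (step_start(s, step) != 0) {
            while (step-- > 0) {
                step_stop(s, step);
            }
            return -1;
        }
    }
    return 0;
}
```

Unwinding in reverse order means a failed representor setup does not
leave a registered vhost-user device or an open VF port behind.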

The new netdev uses the netdev_rxq_recv() function to relay packets in
both directions: it receives packets from the VF and pushes them to
vhost-user, and receives packets from vhost-user and pushes them to the VF.

Signed-off-by: Noa Ezra 
Reviewed-by: Oz Shlomo 
---
 NEWS |   1 +
 lib/netdev-dpdk.c| 162 +++
 vswitchd/vswitch.xml |  25 
 3 files changed, 188 insertions(+)

diff --git a/NEWS b/NEWS
index f5a0b8f..6f315c6 100644
--- a/NEWS
+++ b/NEWS
@@ -542,6 +542,7 @@ v2.6.0 - 27 Sep 2016
      * Remove dpdkvhostcuse port type.
      * OVS client mode for vHost and vHost reconnect (Requires QEMU 2.7)
      * 'dpdkvhostuserclient' port type.
+     * 'dpdkvdpa' port type.
    - Increase number of registers to 16.
    - ovs-benchmark: This utility has been removed due to lack of use and
      bitrot.
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index bc20d68..16ddf58 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -47,6 +47,7 @@
 #include "dpif-netdev.h"
 #include "fatal-signal.h"
 #include "netdev-provider.h"
+#include "netdev-dpdk-vdpa.h"
 #include "netdev-vport.h"
 #include "odp-util.h"
 #include "openvswitch/dynamic-string.h"
@@ -137,6 +138,9 @@ typedef uint16_t dpdk_port_t;
 /* Legacy default value for vhost tx retries. */
 #define VHOST_ENQ_RETRY_DEF 8
 
+/* Size of VDPA custom stats. */
+#define VDPA_CUSTOM_STATS_SIZE  4
+
 #define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ)
 
 static const struct rte_eth_conf port_conf = {
@@ -461,6 +465,8 @@ struct netdev_dpdk {
         int rte_xstats_ids_size;
         uint64_t *rte_xstats_ids;
     );
+
+    struct netdev_dpdk_vdpa_relay *relay;
 };
 
 struct netdev_rxq_dpdk {
@@ -1346,6 +1352,30 @@ netdev_dpdk_construct(struct netdev *netdev)
     return err;
 }
 
+static int
+netdev_dpdk_vdpa_construct(struct netdev *netdev)
+{
+    struct netdev_dpdk *dev;
+    int err;
+
+    err = netdev_dpdk_construct(netdev);
+    if (err) {
+        VLOG_ERR("netdev_dpdk_construct failed. Port: %s", netdev->name);
+        goto out;
+    }
+
+    ovs_mutex_lock(&dpdk_mutex);
+    dev = netdev_dpdk_cast(netdev);
+    dev->relay = netdev_dpdk_vdpa_alloc_relay();
+    if (!dev->relay) {
+        err = ENOMEM;
+    }
+
+    ovs_mutex_unlock(&dpdk_mutex);
+out:
+    return err;
+}
+
 static void
 common_destruct(struct netdev_dpdk *dev)
     OVS_REQUIRES(dpdk_mutex)
@@ -1428,6 +1458,19 @@ dpdk_vhost_driver_unregister(struct netdev_dpdk *dev OVS_UNUSED,
 }
 
 static void
+netdev_dpdk_vdpa_destruct(struct netdev *netdev)
+{
+    struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
+
+    ovs_mutex_lock(&dpdk_mutex);
+    netdev_dpdk_vdpa_destruct_impl(dev->relay);
+    rte_free(dev->relay);
+    ovs_mutex_unlock(&dpdk_mutex);
+
+    netdev_dpdk_destruct(netdev);
+}
+
+static void
 netdev_dpdk_vhost_destruct(struct netdev *netdev)
 {
     struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
@@ -1878,6 +1921,47 @@ out:
 }
 
 static int
+netdev_dpdk_vdpa_set_config(struct netdev *netdev, const struct smap *args,
+                            char **errp)
+{
+    struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
+    const char *vdpa_accelerator_devargs =
+        smap_get(args, "vdpa-accelerator-devargs");
+    const char *vdpa_socket_path =
+        smap_get(args, "vdpa-socket-path");
+    int err = 0;
+
+    if ((vdpa_accelerator_devargs == NULL) || (vdpa_socket_path == NULL)) {
+        VLOG_ERR("netdev_dpdk_vdpa_set_config failed. "
+                 "Required arguments are missing for VDPA port %s",
+                 netdev->name);
+        goto free_relay;
+    }
+
+    err = netdev_dpdk_set_config(netdev, args, errp);
+    if (err) {
+        VLOG_ERR("netdev_dpdk_set_config failed. Port: %s", netdev->name);
+        goto free_relay;
+    }
+
+    err = netdev_dpdk_vdpa_config_impl(dev->relay, dev->port_id,
+                                       vdpa_socket_path,
+                                       vdpa_accelerator_devargs);
+    if (err) {
+        VLOG_ERR("netdev_dpdk_vdpa_config_impl failed. Port %s",
+                 netdev->name);
+        goto free_relay;
+    }
+
+    goto out;
+
+free_relay:

[ovs-dev] [PATCH ovs V1 0/2] Introduce dpdkvdpa netdev

2019-09-14 Thread Noa Ezra
Introduce the dpdkvdpa netdev, which allows HW offloads over VirtIO
network devices.

dpdkvdpa ports can be added to netdev bridges with the following command:
ovs-vsctl add-port br0 vdpa0 -- set Interface vdpa0 type=dpdkvdpa \
    options:vdpa-socket-path= \
    options:vdpa-accelerator-devargs= \
    options:dpdk-devargs=,representor=[id]

The vDPA netdev is designed to support both SW and HW acceleration.
SR-IOV-capable NICs can use SW acceleration, which relays packets
between VF and virtIO ports.
HW mode will configure vDPA-capable NICs.

Patch 1 provides the vdpa functionality as a pre-step without a functional
change.
Patch 2 introduces the dpdkvdpa vport.


Noa Ezra (2):
  netdev-dpdk-vdpa: Introduce dpdkvdpa netdev
  netdev-dpdk: Add dpdkvdpa port

 NEWS   |   1 +
 lib/automake.mk|   4 +-
 lib/netdev-dpdk-vdpa.c | 750 +
 lib/netdev-dpdk-vdpa.h |  54 
 lib/netdev-dpdk.c  | 162 +++
 vswitchd/vswitch.xml   |  25 ++
 6 files changed, 995 insertions(+), 1 deletion(-)
 create mode 100755 lib/netdev-dpdk-vdpa.c
 create mode 100644 lib/netdev-dpdk-vdpa.h

-- 
1.8.3.1
