[ovs-dev] [PATCH ovs v1 0/2] Introduce dpdkvdpa netdev
Introduce dpdkvdpa netdev allowing HW offloads over VirtIO network devices.

dpdkvdpa ports can be added to netdev bridges with the following command:

    ovs-vsctl add-port br0 vdpa0 -- set Interface vdpa0 type=dpdkvdpa \
        options:vdpa-socket-path= \
        options:vdpa-accelerator-devargs= \
        options:dpdk-devargs=,representor=[id] \
        options:vdpa-max-queues=

vdpa-max-queues is an optional field.

The vDPA netdev is designed to support both SW and HW acceleration.
SR-IOV capable NICs can use the SW acceleration, which relays packets
between VF and VirtIO ports.
A future patch will add support for vDPA configuration, so that HW mode
will configure vDPA capable NICs.

The dpdkvdpa netdev supports all kinds of traffic (TCP, UDP, NFV, etc.).
Using a dpdkvdpa port forwards packets between a VF and VirtIO guests
with better performance than standard VirtIO ports.

In the first scenario, a guest is connected to OVS using VirtIO.
In the second scenario, a guest is connected to OVS using a dpdkvdpa port.
The guest is running testpmd. A traffic generator (iperf3 or Ixia) is
sending packets to OVS.
In this case, the dpdkvdpa port improves performance by ~35%.

https://travis-ci.org/github/noaezra/OVS/builds/670001370

Patch 1 provides the vdpa functionality as a pre-step without a
functional change.
Patch 2 introduces the dpdkvdpa vport.

Noa Ezra (2):
  netdev-dpdk-vdpa: Introduce dpdkvdpa netdev
  netdev-dpdk: Add dpdkvdpa port

 Documentation/automake.mk           |   1 +
 Documentation/topics/dpdk/index.rst |   1 +
 Documentation/topics/dpdk/vdpa.rst  |  90
 NEWS                                |   1 +
 lib/automake.mk                     |   4 +-
 lib/netdev-dpdk-vdpa.c              | 820
 lib/netdev-dpdk-vdpa.h              |  55 +++
 lib/netdev-dpdk.c                   | 164 +++-
 vswitchd/vswitch.xml                |  25 ++
 9 files changed, 1159 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/topics/dpdk/vdpa.rst
 create mode 100755 lib/netdev-dpdk-vdpa.c
 create mode 100644 lib/netdev-dpdk-vdpa.h

--
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH ovs v1 2/2] netdev-dpdk: Add dpdkvdpa port
dpdkvdpa netdev works with 3 components:
vhost-user socket, vdpa device: real vdpa device or a VF and
representor of "vdpa device".

In order to add a new vDPA port, add a new port to existing bridge
with type dpdkvdpa and vDPA options:
ovs-vsctl add-port br0 vdpa0 -- set Interface vdpa0 type=dpdkvdpa
   options:vdpa-socket-path=
   options:vdpa-accelerator-devargs=
   options:dpdk-devargs=,representor=[id]

On this command OVS will create a new netdev:
1. Register vhost-user-client device.
2. Open and configure VF dpdk port.
3. Open and configure representor dpdk port.

The new netdev will use netdev_rxq_recv() function in order to receive
packets from VF and push to vhost-user and receive packets from
vhost-user and push to VF.

Signed-off-by: Noa Ezra
Reviewed-by: Oz Shlomo
---
 Documentation/automake.mk           |   1 +
 Documentation/topics/dpdk/index.rst |   1 +
 Documentation/topics/dpdk/vdpa.rst  |  90
 NEWS                                |   1 +
 lib/netdev-dpdk.c                   | 164 +++-
 vswitchd/vswitch.xml                |  25 ++
 6 files changed, 281 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/topics/dpdk/vdpa.rst

diff --git a/Documentation/automake.mk b/Documentation/automake.mk
index f85c432..7caf6e7 100644
--- a/Documentation/automake.mk
+++ b/Documentation/automake.mk
@@ -41,6 +41,7 @@ DOC_SOURCE = \
 	Documentation/topics/dpdk/qos.rst \
 	Documentation/topics/dpdk/vdev.rst \
 	Documentation/topics/dpdk/vhost-user.rst \
+	Documentation/topics/dpdk/vdpa.rst \
 	Documentation/topics/fuzzing/index.rst \
 	Documentation/topics/fuzzing/what-is-fuzzing.rst \
 	Documentation/topics/fuzzing/ovs-fuzzing-infrastructure.rst \
diff --git a/Documentation/topics/dpdk/index.rst b/Documentation/topics/dpdk/index.rst
index a5be5e3..e8595c3 100644
--- a/Documentation/topics/dpdk/index.rst
+++ b/Documentation/topics/dpdk/index.rst
@@ -39,3 +39,4 @@ DPDK Support
    /topics/dpdk/qos
    /topics/dpdk/jumbo-frames
    /topics/dpdk/memory
+   /topics/dpdk/vdpa
diff --git a/Documentation/topics/dpdk/vdpa.rst b/Documentation/topics/dpdk/vdpa.rst
new file mode 100644
index 000..34c5300
--- /dev/null
+++ b/Documentation/topics/dpdk/vdpa.rst
@@ -0,0 +1,90 @@
+..
+      Copyright (c) 2019 Mellanox Technologies, Ltd.
+
+      Licensed under the Apache License, Version 2.0 (the "License");
+      you may not use this file except in compliance with the License.
+      You may obtain a copy of the License at:
+
+          http://www.apache.org/licenses/LICENSE-2.0
+
+      Unless required by applicable law or agreed to in writing, software
+      distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+      WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+      License for the specific language governing permissions and limitations
+      under the License.
+
+      Convention for heading levels in Open vSwitch documentation:
+
+      ===  Heading 0 (reserved for the title in a document)
+      ---  Heading 1
+      ~~~  Heading 2
+      +++  Heading 3
+      '''''''  Heading 4
+
+      Avoid deeper levels because they do not render well.
+
+
+===============
+DPDK VDPA Ports
+===============
+
+In user space there are two main approaches to communicate with a guest (VM),
+using virtIO ports (e.g. netdev type=dpdkvhoshuser/dpdkvhostuserclient) or
+SR-IOV using phy ports (e.g. netdev type = dpdk).
+Phy ports allow working with port representor which is attached to the OVS and
+a matching VF is given with pass-through to the guest.
+HW rules can process packets from up-link and direct them to the VF without
+going through SW (OVS) and therefore using phy ports gives the best
+performance.
+However, SR-IOV architecture requires that the guest will use a driver which is
+specific to the underlying HW. Specific HW driver has two main drawbacks:
+1. Breaks virtualization in some sense (guest aware of the HW), can also limit
+the type of images supported.
+2. Less natural support for live migration.
+
+Using virtIO port solves both problems, but reduces performance and causes
+losing of some functionality, for example, for some HW offload, working
+directly with virtIO cannot be supported.
+
+We created a new netdev type- dpdkvdpa. dpdkvdpa port solves this conflict.
+The new netdev is basically very similar to regular dpdk netdev but it has some
+additional functionally.
+This port translates between phy port to virtIO port, it takes packets from
+rx-queue and send them to the suitable tx-queue and allows to transfer packets
+from virtIO guest (VM) to a VF and vice versa and benefit both SR-IOV and
+virtIO.
+
+Quick Example
+-------------
+
+Configure OVS bridge and ports
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+you must first create a bridge and add ports to the switch.
+Since
[ovs-dev] [PATCH ovs v1 1/2] netdev-dpdk-vdpa: Introduce dpdkvdpa netdev
vDPA netdev is designed to support both SW and HW use cases.
HW mode will be used to configure vDPA capable devices.
SW acceleration is used to leverage SRIOV offloads to virtio guests
by relaying packets between VF and virtio devices.
Add the SW relay forwarding logic as a pre-step for adding dpdkvdpa port
with no functional change.

Signed-off-by: Noa Ezra
Reviewed-by: Oz Shlomo
---
 lib/automake.mk        |   4 +-
 lib/netdev-dpdk-vdpa.c | 820 +
 lib/netdev-dpdk-vdpa.h |  55
 3 files changed, 878 insertions(+), 1 deletion(-)
 create mode 100755 lib/netdev-dpdk-vdpa.c
 create mode 100644 lib/netdev-dpdk-vdpa.h

diff --git a/lib/automake.mk b/lib/automake.mk
index 95925b5..b57682c 100644
--- a/lib/automake.mk
+++ b/lib/automake.mk
@@ -146,6 +146,7 @@ lib_libopenvswitch_la_SOURCES = \
 	lib/netdev-offload.h \
 	lib/netdev-offload-provider.h \
 	lib/netdev-provider.h \
+	lib/netdev-dpdk-vdpa.h \
 	lib/netdev-vport.c \
 	lib/netdev-vport.h \
 	lib/netdev-vport-private.h \
@@ -429,7 +430,8 @@ if DPDK_NETDEV
 lib_libopenvswitch_la_SOURCES += \
 	lib/dpdk.c \
 	lib/netdev-dpdk.c \
-	lib/netdev-offload-dpdk.c
+	lib/netdev-offload-dpdk.c \
+	lib/netdev-dpdk-vdpa.c
 else
 lib_libopenvswitch_la_SOURCES += \
 	lib/dpdk-stub.c
diff --git a/lib/netdev-dpdk-vdpa.c b/lib/netdev-dpdk-vdpa.c
new file mode 100755
index 000..c6ed061
--- /dev/null
+++ b/lib/netdev-dpdk-vdpa.c
@@ -0,0 +1,820 @@
+/*
+ * Copyright (c) 2019 Mellanox Technologies, Ltd.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#include
+#include "netdev-dpdk-vdpa.h"
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include "netdev-provider.h"
+#include "openvswitch/vlog.h"
+#include "dp-packet.h"
+#include "util.h"
+
+VLOG_DEFINE_THIS_MODULE(netdev_dpdk_vdpa);
+
+#define NETDEV_DPDK_VDPA_SIZEOF_MBUF        (sizeof(struct rte_mbuf *))
+#define NETDEV_DPDK_VDPA_MAX_QPAIRS         16
+#define NETDEV_DPDK_VDPA_INVALID_QUEUE_ID   0x
+#define NETDEV_DPDK_VDPA_STATS_MAX_STR_SIZE 64
+#define NETDEV_DPDK_VDPA_RX_DESC_DEFAULT    512
+
+enum netdev_dpdk_vdpa_port_type {
+    NETDEV_DPDK_VDPA_PORT_TYPE_VM,
+    NETDEV_DPDK_VDPA_PORT_TYPE_VF
+};
+
+struct netdev_dpdk_vdpa_relay_flow {
+    struct rte_flow *flow;
+    bool queues_en[RTE_MAX_QUEUES_PER_PORT];
+    uint32_t priority;
+};
+
+struct netdev_dpdk_vdpa_qpair {
+    uint16_t port_id_rx;
+    uint16_t port_id_tx;
+    uint16_t pr_queue;
+    uint8_t mb_head;
+    uint8_t mb_tail;
+    struct rte_mbuf *pkts[NETDEV_MAX_BURST * 2];
+};
+
+struct netdev_dpdk_vdpa_relay {
+    PADDED_MEMBERS(CACHE_LINE_SIZE,
+        struct netdev_dpdk_vdpa_qpair qpair[NETDEV_DPDK_VDPA_MAX_QPAIRS * 2];
+        uint16_t num_queues;
+        struct netdev_dpdk_vdpa_relay_flow flow_params;
+        int port_id_vm;
+        int port_id_vf;
+        uint16_t vf_mtu;
+        int n_rxq;
+        char *vf_pci;
+        char *vm_socket;
+        char *vhost_name;
+        bool started;
+        );
+};
+
+static int
+netdev_dpdk_vdpa_port_from_name(const char *name)
+{
+    int port_id;
+    size_t len;
+
+    len = strlen(name);
+    for (port_id = 0; port_id < RTE_MAX_ETHPORTS; port_id++) {
+        if (rte_eth_dev_is_valid_port(port_id) &&
+            !strncmp(name, rte_eth_devices[port_id].device->name, len)) {
+            return port_id;
+        }
+    }
+    VLOG_ERR("No port was found for %s", name);
+    return ENODEV;
+}
+
+static void
+netdev_dpdk_vdpa_free(void *ptr)
+{
+    if (ptr == NULL) {
+        return;
+    }
+    free(ptr);
+    ptr = NULL;
+}
+
+static void
+netdev_dpdk_vdpa_clear_relay(struct netdev_dpdk_vdpa_relay *relay)
+{
+    uint16_t q;
+    uint8_t i;
+
+    for (q = 0; q < relay->num_queues; q++) {
+        for (i = relay->qpair[q].mb_head; i < relay->qpair[q].mb_tail; i++) {
+            rte_pktmbuf_free(relay->qpair[q].pkts[i]);
+        }
+        relay->qpair[q].mb_head = 0;
+        relay->qpair[q].mb_tail = 0;
+        relay->qpair[q].port_id_rx = 0;
+        relay->qpair[q].port_id_tx = 0;
+        relay->qpair[q].pr_queue = NETDEV_DPDK_VDPA_INVALID_QUEUE_ID;
+    }
+
+    relay->started = false;
+    relay->port_id_vm = 0;
+    relay->port_i
[ovs-dev] [PATCH ovs v1 0/2] Allow setting MAC on DPDK interfaces
In a cloud topology, when SR-IOV with port representors is in use and
the VM is not trusted, the orchestration should set the VF MAC address.
When using DPDK, an architectural limitation prevents setting the VF MAC
address from the host (Linux tooling).

According to a previous discussion
(https://patchwork.ozlabs.org/patch/1215075/), it was agreed to add a new
API in ovs-appctl for setting the MAC address on port representors:

    ovs-appctl netdev-dpdk/set-mac

Ilya Maximets (1):
  netdev-dpdk: Add ability to set MAC address.

Noa Ezra (1):
  netdev-dpdk: Allow setting MAC on DPDK interfaces

 lib/netdev-dpdk.c | 55 ---
 1 file changed, 52 insertions(+), 3 deletions(-)

--
1.8.3.1
[ovs-dev] [PATCH ovs v1 1/2] netdev-dpdk: Add ability to set MAC address.
From: Ilya Maximets

It is possible to set MAC address for DPDK ports by calling
rte_eth_dev_default_mac_addr_set().  For some reason OVS didn't
use this functionality avoiding real MAC address configuration.

With this change following command will result in real MAC address
update on HW NIC:

    ovs-vsctl set Interface mac="xx:xx:xx:xx:xx:xx"

Signed-off-by: Ilya Maximets
Acked-by: Ben Pfaff
---
 lib/netdev-dpdk.c | 19 ---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 7ab8186..e375b3d 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -2968,15 +2968,28 @@ static int
 netdev_dpdk_set_etheraddr(struct netdev *netdev, const struct eth_addr mac)
 {
     struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
+    int err = 0;

     ovs_mutex_lock(&dev->mutex);
     if (!eth_addr_equals(dev->hwaddr, mac)) {
-        dev->hwaddr = mac;
-        netdev_change_seq_changed(netdev);
+        if (dev->type == DPDK_DEV_ETH) {
+            struct rte_ether_addr ea;
+
+            memcpy(ea.addr_bytes, mac.ea, ETH_ADDR_LEN);
+            err = rte_eth_dev_default_mac_addr_set(dev->port_id, &ea);
+        }
+        if (!err) {
+            dev->hwaddr = mac;
+            netdev_change_seq_changed(netdev);
+        } else {
+            VLOG_WARN("%s: Failed to set requested mac("ETH_ADDR_FMT"): %s",
+                      netdev_get_name(netdev), ETH_ADDR_ARGS(mac),
+                      rte_strerror(-err));
+        }
     }
     ovs_mutex_unlock(&dev->mutex);

-    return 0;
+    return -err;
 }

 static int
--
1.8.3.1
[ovs-dev] [PATCH ovs v1 2/2] netdev-dpdk: Allow setting MAC on DPDK interfaces
Adding a command for setting MAC of DPDK interfaces using:

    ovs-appctl netdev-dpdk/set-mac

Signed-off-by: Noa Ezra
Acked-by: Roni Bar Yanai
---
 lib/netdev-dpdk.c | 36
 1 file changed, 36 insertions(+)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index e375b3d..2b8adac 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -3917,6 +3917,38 @@ out:
     netdev_close(netdev);
 }

+static void
+netdev_dpdk_set_mac(struct unixctl_conn *conn, int argc OVS_UNUSED,
+                    const char *argv[], void *aux OVS_UNUSED)
+{
+    struct netdev *netdev = NULL;
+    char *response = NULL;
+    struct eth_addr mac;
+    int error;
+
+    netdev = netdev_from_name(argv[1]);
+    if (!netdev || !is_dpdk_class(netdev->netdev_class)) {
+        unixctl_command_reply_error(conn, "Not a DPDK Interface");
+        return;
+    }
+
+    if (!argv[2] || !eth_addr_from_string(argv[2], &mac)) {
+        response = xasprintf("No MAC address to set.");
+        goto out;
+    }
+
+    error = netdev_dpdk_set_etheraddr(netdev, mac);
+    if (error) {
+        response = xasprintf("interface %s: setting MAC failed (%s)",
+                             argv[1], ovs_strerror(error));
+    }
+    response = xasprintf("set-mac done.");
+
+out:
+    unixctl_command_reply(conn, response);
+    netdev_close(netdev);
+}
+
 /*
  * Set virtqueue flags so that we do not receive interrupts.
  */
@@ -4256,6 +4288,10 @@ netdev_dpdk_class_init(void)
                                  "[netdev]", 0, 1,
                                  netdev_dpdk_get_mempool_info, NULL);

+        unixctl_command_register("netdev-dpdk/set-mac",
+                                 "[netdev] [mac]", 2, 2,
+                                 netdev_dpdk_set_mac, NULL);
+
         ret = rte_eth_dev_callback_register(RTE_ETH_ALL,
                                             RTE_ETH_EVENT_INTR_RESET,
                                             dpdk_eth_event_callback, NULL);
--
1.8.3.1
Re: [ovs-dev] [PATCH ovs v3 2/2] netdev-dpdk: Add dpdkvdpa port
Hi,
Please see the answer below.

Thanks,
Noa.

> -----Original Message-----
> From: William Tu [mailto:u9012...@gmail.com]
> Sent: Friday, October 18, 2019 12:34 AM
> To: Noa Ezra
> Cc: ovs-dev@openvswitch.org; Oz Shlomo; Majd Dibbiny; Ameer Mahagneh;
> Eli Britstein
> Subject: Re: [ovs-dev] [PATCH ovs v3 2/2] netdev-dpdk: Add dpdkvdpa port
>
> On Thu, Oct 17, 2019 at 02:16:56PM +0300, Noa Ezra wrote:
>
> Hi Noa,
>
> Thanks for the patch. I'm new to this and have a question below.
>
> > dpdkvdpa netdev works with 3 components:
> > vhost-user socket, vdpa device: real vdpa device or a VF and
> > representor of "vdpa device".
> >
> > In order to add a new vDPA port, add a new port to existing bridge
> > with type dpdkvdpa and vDPA options:
> > ovs-vsctl add-port br0 vdpa0 -- set Interface vdpa0 type=dpdkvdpa
> >    options:vdpa-socket-path=
> >    options:vdpa-accelerator-devargs=
> >    options:dpdk-devargs=,representor=[id]
> >
> > On this command OVS will create a new netdev:
> > 1. Register vhost-user-client device.
> > 2. Open and configure VF dpdk port.
> > 3. Open and configure representor dpdk port.
> >
> > The new netdev will use netdev_rxq_recv() function in order to receive
> > packets from VF and push to vhost-user and receive packets from
> > vhost-user and push to VF.
>
> So does OVS in this case is able to apply OpenFlow rules on packets?
>
> When netdev_dpdk_vdpa_rxq_recv() is invoked, does the batch of packets
> go into OVS's parse, lookup, action pipeline? Or all packets go directly into
> VM if (VF -> VM) and vice versa?
>
> Is
> fwd_rx = netdev_dpdk_vdpa_rxq_recv_impl(dev->relay, rxq->queue_id);
> forward packets from vhost-user to VF and ret =
> netdev_dpdk_rxq_recv(rxq, batch, qfill); forward packets from vhost-user to
> VM?

I hope that I understand your question correctly:
netdev_dpdk_vdpa_rxq_recv forwards packets from the VM to the VF and
vice versa.
There is no change in the processing of the packet between the VF and
the up-link, and no change in the packet's header. The new netdev only
translates between the SR-IOV (phy) VF and the virtIO VM.

> Thanks
> William
[ovs-dev] [PATCH ovs v3 1/2] netdev-dpdk-vdpa: Introduce dpdkvdpa netdev
vDPA netdev is designed to support both SW and HW use cases.
HW mode will be used to configure vDPA capable devices.
SW acceleration is used to leverage SRIOV offloads to virtio guests
by relaying packets between VF and virtio devices.
Add the SW relay forwarding logic as a pre-step for adding dpdkvdpa port
with no functional change.

Signed-off-by: Noa Ezra
Reviewed-by: Oz Shlomo
---
 lib/automake.mk        |   4 +-
 lib/netdev-dpdk-vdpa.c | 750 +
 lib/netdev-dpdk-vdpa.h |  54
 3 files changed, 807 insertions(+), 1 deletion(-)
 create mode 100755 lib/netdev-dpdk-vdpa.c
 create mode 100644 lib/netdev-dpdk-vdpa.h

diff --git a/lib/automake.mk b/lib/automake.mk
index 17b36b4..38e027f 100644
--- a/lib/automake.mk
+++ b/lib/automake.mk
@@ -144,6 +144,7 @@ lib_libopenvswitch_la_SOURCES = \
 	lib/netdev-offload.h \
 	lib/netdev-offload-provider.h \
 	lib/netdev-provider.h \
+	lib/netdev-dpdk-vdpa.h \
 	lib/netdev-vport.c \
 	lib/netdev-vport.h \
 	lib/netdev-vport-private.h \
@@ -426,7 +427,8 @@ if DPDK_NETDEV
 lib_libopenvswitch_la_SOURCES += \
 	lib/dpdk.c \
 	lib/netdev-dpdk.c \
-	lib/netdev-offload-dpdk.c
+	lib/netdev-offload-dpdk.c \
+	lib/netdev-dpdk-vdpa.c
 else
 lib_libopenvswitch_la_SOURCES += \
 	lib/dpdk-stub.c
diff --git a/lib/netdev-dpdk-vdpa.c b/lib/netdev-dpdk-vdpa.c
new file mode 100755
index 000..d8f8fb0
--- /dev/null
+++ b/lib/netdev-dpdk-vdpa.c
@@ -0,0 +1,750 @@
+/*
+ * Copyright (c) 2019 Mellanox Technologies, Ltd.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#include
+#include "netdev-dpdk-vdpa.h"
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include "netdev-provider.h"
+#include "openvswitch/vlog.h"
+#include "dp-packet.h"
+#include "util.h"
+
+VLOG_DEFINE_THIS_MODULE(netdev_dpdk_vdpa);
+
+#define NETDEV_DPDK_VDPA_SIZEOF_MBUF        (sizeof(struct rte_mbuf *))
+#define NETDEV_DPDK_VDPA_MAX_QPAIRS         128
+#define NETDEV_DPDK_VDPA_INVALID_QUEUE_ID   0x
+#define NETDEV_DPDK_VDPA_STATS_MAX_STR_SIZE 64
+#define NETDEV_DPDK_VDPA_RX_DESC_DEFAULT    512
+
+enum netdev_dpdk_vdpa_port_type {
+    NETDEV_DPDK_VDPA_PORT_TYPE_VM,
+    NETDEV_DPDK_VDPA_PORT_TYPE_VF
+};
+
+struct netdev_dpdk_vdpa_relay_flow {
+    struct rte_flow *flow;
+    bool queues_en[RTE_MAX_QUEUES_PER_PORT];
+    uint32_t priority;
+};
+
+struct netdev_dpdk_vdpa_qpair {
+    uint16_t port_id_rx;
+    uint16_t port_id_tx;
+    uint16_t pr_queue;
+    uint8_t mb_head;
+    uint8_t mb_tail;
+    struct rte_mbuf *pkts[NETDEV_MAX_BURST * 2];
+};
+
+struct netdev_dpdk_vdpa_relay {
+    PADDED_MEMBERS(CACHE_LINE_SIZE,
+        struct netdev_dpdk_vdpa_qpair qpair[NETDEV_DPDK_VDPA_MAX_QPAIRS * 2];
+        uint16_t num_queues;
+        struct netdev_dpdk_vdpa_relay_flow flow_params;
+        int port_id_vm;
+        int port_id_vf;
+        uint16_t vf_mtu;
+        int n_rxq;
+        char *vf_pci;
+        char *vm_socket;
+        char *vhost_name;
+        );
+};
+
+static int
+netdev_dpdk_vdpa_port_from_name(const char *name)
+{
+    int port_id;
+    size_t len;
+
+    len = strlen(name);
+    for (port_id = 0; port_id < RTE_MAX_ETHPORTS; port_id++) {
+        if (rte_eth_dev_is_valid_port(port_id) &&
+            !strncmp(name, rte_eth_devices[port_id].device->name, len)) {
+            return port_id;
+        }
+    }
+    VLOG_ERR("No port was found for %s", name);
+    return -1;
+}
+
+static void
+netdev_dpdk_vdpa_free(void *ptr)
+{
+    if (ptr == NULL) {
+        return;
+    }
+    free(ptr);
+    ptr = NULL;
+}
+static void
+netdev_dpdk_vdpa_clear_relay(struct netdev_dpdk_vdpa_relay *relay)
+{
+    uint16_t q;
+    uint8_t i;
+
+    for (q = 0; q < relay->num_queues; q++) {
+        for (i = relay->qpair[q].mb_head; i < relay->qpair[q].mb_tail; i++) {
+            rte_pktmbuf_free(relay->qpair[q].pkts[i]);
+        }
+        relay->qpair[q].mb_head = 0;
+        relay->qpair[q].mb_tail = 0;
+        relay->qpair[q].port_id_rx = 0;
+        relay->qpair[q].port_id_tx = 0;
+        relay->qpair[q].pr_queue = NETDEV_DPDK_VDPA_INVALID_QUEUE_ID;
+    }
+
+    relay->port_id_vm = 0;
+    relay->port_id_vf = 0;
+    relay->num_queues = 0;
+    rel
[ovs-dev] [PATCH ovs v3 2/2] netdev-dpdk: Add dpdkvdpa port
dpdkvdpa netdev works with 3 components:
vhost-user socket, vdpa device: real vdpa device or a VF and
representor of "vdpa device".

In order to add a new vDPA port, add a new port to existing bridge
with type dpdkvdpa and vDPA options:
ovs-vsctl add-port br0 vdpa0 -- set Interface vdpa0 type=dpdkvdpa
   options:vdpa-socket-path=
   options:vdpa-accelerator-devargs=
   options:dpdk-devargs=,representor=[id]

On this command OVS will create a new netdev:
1. Register vhost-user-client device.
2. Open and configure VF dpdk port.
3. Open and configure representor dpdk port.

The new netdev will use netdev_rxq_recv() function in order to receive
packets from VF and push to vhost-user and receive packets from
vhost-user and push to VF.

Signed-off-by: Noa Ezra
Reviewed-by: Oz Shlomo
---
 Documentation/automake.mk           |   1 +
 Documentation/topics/dpdk/index.rst |   1 +
 Documentation/topics/dpdk/vdpa.rst  |  90
 NEWS                                |   1 +
 lib/netdev-dpdk.c                   | 162
 vswitchd/vswitch.xml                |  25 ++
 6 files changed, 280 insertions(+)
 create mode 100644 Documentation/topics/dpdk/vdpa.rst

diff --git a/Documentation/automake.mk b/Documentation/automake.mk
index cd68f3b..ee574bc 100644
--- a/Documentation/automake.mk
+++ b/Documentation/automake.mk
@@ -43,6 +43,7 @@ DOC_SOURCE = \
 	Documentation/topics/dpdk/ring.rst \
 	Documentation/topics/dpdk/vdev.rst \
 	Documentation/topics/dpdk/vhost-user.rst \
+	Documentation/topics/dpdk/vdpa.rst \
 	Documentation/topics/fuzzing/index.rst \
 	Documentation/topics/fuzzing/what-is-fuzzing.rst \
 	Documentation/topics/fuzzing/ovs-fuzzing-infrastructure.rst \
diff --git a/Documentation/topics/dpdk/index.rst b/Documentation/topics/dpdk/index.rst
index cf24a7b..c1d4ea7 100644
--- a/Documentation/topics/dpdk/index.rst
+++ b/Documentation/topics/dpdk/index.rst
@@ -41,3 +41,4 @@ The DPDK Datapath
    /topics/dpdk/pdump
    /topics/dpdk/jumbo-frames
    /topics/dpdk/memory
+   /topics/dpdk/vdpa
diff --git a/Documentation/topics/dpdk/vdpa.rst b/Documentation/topics/dpdk/vdpa.rst
new file mode 100644
index 000..34c5300
--- /dev/null
+++ b/Documentation/topics/dpdk/vdpa.rst
@@ -0,0 +1,90 @@
+..
+      Copyright (c) 2019 Mellanox Technologies, Ltd.
+
+      Licensed under the Apache License, Version 2.0 (the "License");
+      you may not use this file except in compliance with the License.
+      You may obtain a copy of the License at:
+
+          http://www.apache.org/licenses/LICENSE-2.0
+
+      Unless required by applicable law or agreed to in writing, software
+      distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+      WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+      License for the specific language governing permissions and limitations
+      under the License.
+
+      Convention for heading levels in Open vSwitch documentation:
+
+      ===  Heading 0 (reserved for the title in a document)
+      ---  Heading 1
+      ~~~  Heading 2
+      +++  Heading 3
+      '''''''  Heading 4
+
+      Avoid deeper levels because they do not render well.
+
+
+===============
+DPDK VDPA Ports
+===============
+
+In user space there are two main approaches to communicate with a guest (VM),
+using virtIO ports (e.g. netdev type=dpdkvhoshuser/dpdkvhostuserclient) or
+SR-IOV using phy ports (e.g. netdev type = dpdk).
+Phy ports allow working with port representor which is attached to the OVS and
+a matching VF is given with pass-through to the guest.
+HW rules can process packets from up-link and direct them to the VF without
+going through SW (OVS) and therefore using phy ports gives the best
+performance.
+However, SR-IOV architecture requires that the guest will use a driver which is
+specific to the underlying HW. Specific HW driver has two main drawbacks:
+1. Breaks virtualization in some sense (guest aware of the HW), can also limit
+the type of images supported.
+2. Less natural support for live migration.
+
+Using virtIO port solves both problems, but reduces performance and causes
+losing of some functionality, for example, for some HW offload, working
+directly with virtIO cannot be supported.
+
+We created a new netdev type- dpdkvdpa. dpdkvdpa port solves this conflict.
+The new netdev is basically very similar to regular dpdk netdev but it has some
+additional functionally.
+This port translates between phy port to virtIO port, it takes packets from
+rx-queue and send them to the suitable tx-queue and allows to transfer packets
+from virtIO guest (VM) to a VF and vice versa and benefit both SR-IOV and
+virtIO.
+
+Quick Example
+-------------
+
+Configure OVS bridge and ports
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+you must first create a bridge and add ports to the switch.
+Since the dpdk
[ovs-dev] [PATCH ovs v3 0/2] Introduce dpdkvdpa netdev
There are two approaches to communicating with a guest: virtIO or SR-IOV.
With SR-IOV, a port representor is attached to OVS and the matching VF is
passed through to the VM.
HW rules can process packets from the up-link and direct them to the VF
without going through SW (OVS), and therefore SR-IOV gives the best
performance.
However, the SR-IOV architecture requires the guest to use a driver that
is specific to the underlying HW. A HW-specific driver has two main
drawbacks:
1. It breaks virtualization in some sense (the VM is aware of the HW) and
   can also limit the type of images supported.
2. It has less natural support for live migration.

Using a virtIO interface solves both problems, but reduces performance
and loses some functionality; for example, some HW offloads cannot be
supported when working directly with virtIO.
To resolve this conflict, we created a new netdev type, dpdkvdpa.
The new netdev is basically similar to a regular dpdk netdev, but it has
some additional functionality for transferring packets from a virtIO
guest (VM) to a VF and vice versa. With this solution we can benefit from
both SR-IOV and virtIO.

The vDPA netdev is designed to support both SW and HW use cases.
HW mode will be used to configure vDPA capable devices. Support for this
mode is in progress in the dpdk community.
SW acceleration is used to leverage SR-IOV offloads to virtIO guests by
relaying packets between VF and virtio devices, and as a pre-step for
supporting vDPA in HW mode.

Running example:
1. Configure OVS bridge and ports:
ovs-vsctl add-br br0-ovs -- set bridge br0-ovs datapath_type=netdev
ovs-vsctl add-port br0-ovs pf -- set Interface pf type=dpdk \
    options:dpdk-devargs=
ovs-vsctl add-port br0 vdpa0 -- set Interface vdpa0 type=dpdkvdpa \
    options:vdpa-socket-path= \
    options:vdpa-accelerator-devargs= \
    options:dpdk-devargs=,representor=[id]
2. Run a virtIO guest (VM) in server mode that creates the socket of the
   vDPA port.
3. Send traffic.

Noa Ezra (2):
  netdev-dpdk-vdpa: Introduce dpdkvdpa netdev
  netdev-dpdk: Add dpdkvdpa port

 Documentation/automake.mk           |   1 +
 Documentation/topics/dpdk/index.rst |   1 +
 Documentation/topics/dpdk/vdpa.rst  |  90 +
 NEWS                                |   1 +
 lib/automake.mk                     |   4 +-
 lib/netdev-dpdk-vdpa.c              | 750
 lib/netdev-dpdk-vdpa.h              |  54 +++
 lib/netdev-dpdk.c                   | 162
 vswitchd/vswitch.xml                |  25 ++
 9 files changed, 1087 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/topics/dpdk/vdpa.rst
 create mode 100755 lib/netdev-dpdk-vdpa.c
 create mode 100644 lib/netdev-dpdk-vdpa.h

--
1.8.3.1
[ovs-dev] [PATCH ovs v2 2/2] netdev-dpdk: Add dpdkvdpa port
dpdkvdpa netdev works with 3 components:
vhost-user socket, vdpa device: real vdpa device or a VF and
representor of "vdpa device".

In order to add a new vDPA port, add a new port to existing bridge
with type dpdkvdpa and vDPA options:
ovs-vsctl add-port br0 vdpa0 -- set Interface vdpa0 type=dpdkvdpa
   options:vdpa-socket-path=
   options:vdpa-accelerator-devargs=
   options:dpdk-devargs=,representor=[id]

On this command OVS will create a new netdev:
1. Register vhost-user-client device.
2. Open and configure VF dpdk port.
3. Open and configure representor dpdk port.

The new netdev will use netdev_rxq_recv() function in order to receive
packets from VF and push to vhost-user and receive packets from
vhost-user and push to VF.

Signed-off-by: Noa Ezra
Reviewed-by: Oz Shlomo
---
 NEWS                 |   1 +
 lib/netdev-dpdk.c    | 162 +++
 vswitchd/vswitch.xml |  25
 3 files changed, 188 insertions(+)

diff --git a/NEWS b/NEWS
index f5a0b8f..6f315c6 100644
--- a/NEWS
+++ b/NEWS
@@ -542,6 +542,7 @@ v2.6.0 - 27 Sep 2016
      * Remove dpdkvhostcuse port type.
      * OVS client mode for vHost and vHost reconnect (Requires QEMU 2.7)
      * 'dpdkvhostuserclient' port type.
+     * 'dpdkvdpa' port type.
    - Increase number of registers to 16.
    - ovs-benchmark: This utility has been removed due to lack of use and
      bitrot.
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index bc20d68..16ddf58 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -47,6 +47,7 @@
 #include "dpif-netdev.h"
 #include "fatal-signal.h"
 #include "netdev-provider.h"
+#include "netdev-dpdk-vdpa.h"
 #include "netdev-vport.h"
 #include "odp-util.h"
 #include "openvswitch/dynamic-string.h"
@@ -137,6 +138,9 @@ typedef uint16_t dpdk_port_t;
 /* Legacy default value for vhost tx retries. */
 #define VHOST_ENQ_RETRY_DEF 8

+/* Size of VDPA custom stats. */
+#define VDPA_CUSTOM_STATS_SIZE 4
+
 #define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ)

 static const struct rte_eth_conf port_conf = {
@@ -461,6 +465,8 @@ struct netdev_dpdk {
         int rte_xstats_ids_size;
         uint64_t *rte_xstats_ids;
     );
+
+    struct netdev_dpdk_vdpa_relay *relay;
 };

 struct netdev_rxq_dpdk {
@@ -1346,6 +1352,30 @@ netdev_dpdk_construct(struct netdev *netdev)
     return err;
 }

+static int
+netdev_dpdk_vdpa_construct(struct netdev *netdev)
+{
+    struct netdev_dpdk *dev;
+    int err;
+
+    err = netdev_dpdk_construct(netdev);
+    if (err) {
+        VLOG_ERR("netdev_dpdk_construct failed. Port: %s\n", netdev->name);
+        goto out;
+    }
+
+    ovs_mutex_lock(&dpdk_mutex);
+    dev = netdev_dpdk_cast(netdev);
+    dev->relay = netdev_dpdk_vdpa_alloc_relay();
+    if (!dev->relay) {
+        err = ENOMEM;
+    }
+
+    ovs_mutex_unlock(&dpdk_mutex);
+out:
+    return err;
+}
+
 static void
 common_destruct(struct netdev_dpdk *dev)
     OVS_REQUIRES(dpdk_mutex)
@@ -1428,6 +1458,19 @@ dpdk_vhost_driver_unregister(struct netdev_dpdk *dev OVS_UNUSED,
 }

 static void
+netdev_dpdk_vdpa_destruct(struct netdev *netdev)
+{
+    struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
+
+    ovs_mutex_lock(&dpdk_mutex);
+    netdev_dpdk_vdpa_destruct_impl(dev->relay);
+    rte_free(dev->relay);
+    ovs_mutex_unlock(&dpdk_mutex);
+
+    netdev_dpdk_destruct(netdev);
+}
+
+static void
 netdev_dpdk_vhost_destruct(struct netdev *netdev)
 {
     struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
@@ -1878,6 +1921,47 @@ out:
 }

 static int
+netdev_dpdk_vdpa_set_config(struct netdev *netdev, const struct smap *args,
+                            char **errp)
+{
+    struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
+    const char *vdpa_accelerator_devargs =
+        smap_get(args, "vdpa-accelerator-devargs");
+    const char *vdpa_socket_path =
+        smap_get(args, "vdpa-socket-path");
+    int err = 0;
+
+    if ((vdpa_accelerator_devargs == NULL) || (vdpa_socket_path == NULL)) {
+        VLOG_ERR("netdev_dpdk_vdpa_set_config failed."
+                 "Required arguments are missing for VDPA port %s",
+                 netdev->name);
+        goto free_relay;
+    }
+
+    err = netdev_dpdk_set_config(netdev, args, errp);
+    if (err) {
+        VLOG_ERR("netdev_dpdk_set_config failed. Port: %s", netdev->name);
+        goto free_relay;
+    }
+
+    err = netdev_dpdk_vdpa_config_impl(dev->relay, dev->port_id,
+                                       vdpa_socket_path,
+                                       vdpa_accelerator_devargs);
+    if (err) {
+        VLOG_ERR("netdev_dpdk_vdpa_config_impl failed. Port %s",
+                 netdev->name);
+        goto free_relay;
+    }
+
+    goto out;
+
+free_relay:
[ovs-dev] [PATCH ovs v2 1/2] netdev-dpdk-vdpa: Introduce dpdkvdpa netdev
vDPA netdev is designed to support both SW and HW use cases.
HW mode will be used to configure vDPA capable devices.
SW acceleration is used to leverage SRIOV offloads to virtio guests
by relaying packets between VF and virtio devices.
Add the SW relay forwarding logic as a pre-step for adding dpdkvdpa
port with no functional change.

Signed-off-by: Noa Ezra
Reviewed-by: Oz Shlomo
---
 lib/automake.mk        |   4 +-
 lib/netdev-dpdk-vdpa.c | 750 +
 lib/netdev-dpdk-vdpa.h |  54
 3 files changed, 807 insertions(+), 1 deletion(-)
 create mode 100755 lib/netdev-dpdk-vdpa.c
 create mode 100644 lib/netdev-dpdk-vdpa.h

diff --git a/lib/automake.mk b/lib/automake.mk
index 17b36b4..38e027f 100644
--- a/lib/automake.mk
+++ b/lib/automake.mk
@@ -144,6 +144,7 @@ lib_libopenvswitch_la_SOURCES = \
 	lib/netdev-offload.h \
 	lib/netdev-offload-provider.h \
 	lib/netdev-provider.h \
+	lib/netdev-dpdk-vdpa.h \
 	lib/netdev-vport.c \
 	lib/netdev-vport.h \
 	lib/netdev-vport-private.h \
@@ -426,7 +427,8 @@ if DPDK_NETDEV
 lib_libopenvswitch_la_SOURCES += \
 	lib/dpdk.c \
 	lib/netdev-dpdk.c \
-	lib/netdev-offload-dpdk.c
+	lib/netdev-offload-dpdk.c \
+	lib/netdev-dpdk-vdpa.c
 else
 lib_libopenvswitch_la_SOURCES += \
 	lib/dpdk-stub.c
diff --git a/lib/netdev-dpdk-vdpa.c b/lib/netdev-dpdk-vdpa.c
new file mode 100755
index 000..d8f8fb0
--- /dev/null
+++ b/lib/netdev-dpdk-vdpa.c
@@ -0,0 +1,750 @@
+/*
+ * Copyright (c) 2019 Mellanox Technologies, Ltd.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#include
+#include "netdev-dpdk-vdpa.h"
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include "netdev-provider.h"
+#include "openvswitch/vlog.h"
+#include "dp-packet.h"
+#include "util.h"
+
+VLOG_DEFINE_THIS_MODULE(netdev_dpdk_vdpa);
+
+#define NETDEV_DPDK_VDPA_SIZEOF_MBUF        (sizeof(struct rte_mbuf *))
+#define NETDEV_DPDK_VDPA_MAX_QPAIRS         128
+#define NETDEV_DPDK_VDPA_INVALID_QUEUE_ID   0x
+#define NETDEV_DPDK_VDPA_STATS_MAX_STR_SIZE 64
+#define NETDEV_DPDK_VDPA_RX_DESC_DEFAULT    512
+
+enum netdev_dpdk_vdpa_port_type {
+    NETDEV_DPDK_VDPA_PORT_TYPE_VM,
+    NETDEV_DPDK_VDPA_PORT_TYPE_VF
+};
+
+struct netdev_dpdk_vdpa_relay_flow {
+    struct rte_flow *flow;
+    bool queues_en[RTE_MAX_QUEUES_PER_PORT];
+    uint32_t priority;
+};
+
+struct netdev_dpdk_vdpa_qpair {
+    uint16_t port_id_rx;
+    uint16_t port_id_tx;
+    uint16_t pr_queue;
+    uint8_t mb_head;
+    uint8_t mb_tail;
+    struct rte_mbuf *pkts[NETDEV_MAX_BURST * 2];
+};
+
+struct netdev_dpdk_vdpa_relay {
+    PADDED_MEMBERS(CACHE_LINE_SIZE,
+        struct netdev_dpdk_vdpa_qpair qpair[NETDEV_DPDK_VDPA_MAX_QPAIRS * 2];
+        uint16_t num_queues;
+        struct netdev_dpdk_vdpa_relay_flow flow_params;
+        int port_id_vm;
+        int port_id_vf;
+        uint16_t vf_mtu;
+        int n_rxq;
+        char *vf_pci;
+        char *vm_socket;
+        char *vhost_name;
+    );
+};
+
+static int
+netdev_dpdk_vdpa_port_from_name(const char *name)
+{
+    int port_id;
+    size_t len;
+
+    len = strlen(name);
+    for (port_id = 0; port_id < RTE_MAX_ETHPORTS; port_id++) {
+        if (rte_eth_dev_is_valid_port(port_id) &&
+            !strncmp(name, rte_eth_devices[port_id].device->name, len)) {
+            return port_id;
+        }
+    }
+    VLOG_ERR("No port was found for %s", name);
+    return -1;
+}
+
+static void
+netdev_dpdk_vdpa_free(void *ptr)
+{
+    if (ptr == NULL) {
+        return;
+    }
+    free(ptr);
+    ptr = NULL;
+}
+static void
+netdev_dpdk_vdpa_clear_relay(struct netdev_dpdk_vdpa_relay *relay)
+{
+    uint16_t q;
+    uint8_t i;
+
+    for (q = 0; q < relay->num_queues; q++) {
+        for (i = relay->qpair[q].mb_head; i < relay->qpair[q].mb_tail; i++) {
+            rte_pktmbuf_free(relay->qpair[q].pkts[i]);
+        }
+        relay->qpair[q].mb_head = 0;
+        relay->qpair[q].mb_tail = 0;
+        relay->qpair[q].port_id_rx = 0;
+        relay->qpair[q].port_id_tx = 0;
+        relay->qpair[q].pr_queue = NETDEV_DPDK_VDPA_INVALID_QUEUE_ID;
+    }
+
+    relay->port_id_vm = 0;
+    relay->port_id_vf = 0;
+    relay->num_queues = 0;
+    rel
[ovs-dev] [PATCH ovs v2 0/2] Introduce dpdkvdpa netdev
There are two approaches to communicating with a guest: virtIO or SR-IOV.
SR-IOV allows working with a port representor which is attached to OVS,
while a matching VF is passed through to the VM.
HW rules can process packets from the up-link and direct them to the VF
without going through SW (OVS), and therefore SR-IOV gives the best
performance.
However, the SR-IOV architecture requires the guest to use a driver that
is specific to the underlying HW. A HW-specific driver has two main
drawbacks:
1. It breaks virtualization in some sense (the VM is aware of the HW), and
   it can also limit the type of images supported.
2. It has less natural support for live migration.

Using a virtIO interface solves both problems, but it reduces performance
and loses some functionality; for example, some HW offloads cannot be
supported when working directly with virtIO.
To resolve this conflict, we created a new netdev type, dpdkvdpa.
The new netdev is basically similar to a regular dpdk netdev, but it has
some additional functionality for transferring packets from a virtIO
guest (VM) to a VF and vice versa. With this solution we can benefit from
both SR-IOV and virtIO.

vDPA netdev is designed to support both SW and HW use cases.
HW mode will be used to configure vDPA capable devices. Support for this
mode is in progress in the dpdk community.
SW acceleration is used to leverage SR-IOV offloads to virtIO guests by
relaying packets between VF and virtIO devices, and as a pre-step for
supporting vDPA in HW mode.

Running example:
1. Configure the OVS bridge and ports:
    ovs-vsctl add-br br0-ovs -- set bridge br0-ovs datapath_type=netdev
    ovs-vsctl add-port br0-ovs pf -- set Interface pf type=dpdk \
        options:dpdk-devargs=
    ovs-vsctl add-port br0-ovs vdpa0 -- set Interface vdpa0 type=dpdkvdpa \
        options:vdpa-socket-path= \
        options:vdpa-accelerator-devargs= \
        options:dpdk-devargs=,representor=[id]
2. Run a virtIO guest (VM) in server mode that creates the socket of the
   vDPA port.
3. Send traffic.
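As a rough illustration of the SW relay idea described above (packets received on one side are pushed to the other, in both directions), here is a minimal sketch in plain C. It is not the patch's implementation and uses no DPDK calls: `mock_queue`, `mock_port`, `relay_dir`, `relay_step` and `relay_poll` are hypothetical names, and `BURST` and `QUEUE_CAP` are assumed sizes. The staging buffer with `head`/`tail` indices is only loosely modelled on the `mb_head`/`mb_tail` fields of the queue-pair struct in patch 1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BURST     32   /* assumed burst size, hypothetical */
#define QUEUE_CAP 64   /* assumed depth of a mock queue, hypothetical */

/* Hypothetical stand-in for a device queue: a bounded FIFO of packet ids. */
struct mock_queue {
    uint32_t pkts[QUEUE_CAP];
    size_t len;
};

/* Hypothetical port: "rx" holds packets waiting to be received from the
 * device, "tx" holds packets that were delivered to it. */
struct mock_port {
    struct mock_queue rx;
    struct mock_queue tx;
};

/* Staging buffer for one relay direction. Packets in [head, tail) were
 * received but not yet accepted by the destination. */
struct relay_dir {
    uint32_t staged[BURST];
    size_t head;
    size_t tail;
};

struct relay {
    struct relay_dir vf_to_vm;
    struct relay_dir vm_to_vf;
};

/* Dequeue up to 'max' packets from the front of 'q'. */
static size_t
queue_pop(struct mock_queue *q, uint32_t *out, size_t max)
{
    size_t n = q->len < max ? q->len : max;

    memcpy(out, q->pkts, n * sizeof q->pkts[0]);
    memmove(q->pkts, q->pkts + n, (q->len - n) * sizeof q->pkts[0]);
    q->len -= n;
    return n;
}

/* Enqueue up to 'n' packets; returns how many the queue accepted. */
static size_t
queue_push(struct mock_queue *q, const uint32_t *in, size_t n)
{
    size_t room = QUEUE_CAP - q->len;
    size_t k = n < room ? n : room;

    memcpy(q->pkts + q->len, in, k * sizeof in[0]);
    q->len += k;
    return k;
}

/* One relay step src -> dst. Leftovers from a previous step are flushed
 * first; a new burst is received only when the staging buffer is empty. */
static size_t
relay_step(struct relay_dir *d, struct mock_port *src, struct mock_port *dst)
{
    size_t sent;

    assert(d->head <= d->tail && d->tail <= BURST);
    if (d->head == d->tail) {
        d->head = 0;
        d->tail = queue_pop(&src->rx, d->staged, BURST);
    }
    sent = queue_push(&dst->tx, d->staged + d->head, d->tail - d->head);
    d->head += sent;
    return sent;
}

/* Relay both directions once; returns the number of packets forwarded. */
static size_t
relay_poll(struct relay *r, struct mock_port *vf, struct mock_port *vm)
{
    return relay_step(&r->vf_to_vm, vf, vm)
           + relay_step(&r->vm_to_vf, vm, vf);
}
```

Calling `relay_poll()` repeatedly from a polling loop would drain both receive sides; packets the destination cannot accept stay staged and are retried on the next call rather than dropped. Whether the real relay drops or retries in that case is not stated in this cover letter, so that choice is an assumption of the sketch.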
Noa Ezra (2):
  netdev-dpdk-vdpa: Introduce dpdkvdpa netdev
  netdev-dpdk: Add dpdkvdpa port

 NEWS                   |   1 +
 lib/automake.mk        |   4 +-
 lib/netdev-dpdk-vdpa.c | 750 +
 lib/netdev-dpdk-vdpa.h |  54
 lib/netdev-dpdk.c      | 162 +++
 vswitchd/vswitch.xml   |  25 ++
 6 files changed, 995 insertions(+), 1 deletion(-)
 create mode 100755 lib/netdev-dpdk-vdpa.c
 create mode 100644 lib/netdev-dpdk-vdpa.h

--
1.8.3.1
[ovs-dev] [PATCH ovs V1 1/2] netdev-dpdk-vdpa: Introduce dpdkvdpa netdev
vDPA netdev is designed to support both SW and HW use cases.
HW mode will be used to configure vDPA capable devices.
SW acceleration is used to leverage SRIOV offloads to virtio guests
by relaying packets between VF and virtio devices.
Add the SW relay forwarding logic as a pre-step for adding dpdkvdpa
port with no functional change.

Signed-off-by: Noa Ezra
Reviewed-by: Oz Shlomo
---
 lib/automake.mk        |   4 +-
 lib/netdev-dpdk-vdpa.c | 750 +
 lib/netdev-dpdk-vdpa.h |  54
 3 files changed, 807 insertions(+), 1 deletion(-)
 create mode 100755 lib/netdev-dpdk-vdpa.c
 create mode 100644 lib/netdev-dpdk-vdpa.h

diff --git a/lib/automake.mk b/lib/automake.mk
index 17b36b4..38e027f 100644
--- a/lib/automake.mk
+++ b/lib/automake.mk
@@ -144,6 +144,7 @@ lib_libopenvswitch_la_SOURCES = \
 	lib/netdev-offload.h \
 	lib/netdev-offload-provider.h \
 	lib/netdev-provider.h \
+	lib/netdev-dpdk-vdpa.h \
 	lib/netdev-vport.c \
 	lib/netdev-vport.h \
 	lib/netdev-vport-private.h \
@@ -426,7 +427,8 @@ if DPDK_NETDEV
 lib_libopenvswitch_la_SOURCES += \
 	lib/dpdk.c \
 	lib/netdev-dpdk.c \
-	lib/netdev-offload-dpdk.c
+	lib/netdev-offload-dpdk.c \
+	lib/netdev-dpdk-vdpa.c
 else
 lib_libopenvswitch_la_SOURCES += \
 	lib/dpdk-stub.c
diff --git a/lib/netdev-dpdk-vdpa.c b/lib/netdev-dpdk-vdpa.c
new file mode 100755
index 000..ca831f2
--- /dev/null
+++ b/lib/netdev-dpdk-vdpa.c
@@ -0,0 +1,750 @@
+/*
+ * Copyright (c) 2019 Mellanox Technologies, Ltd.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#include
+#include "netdev-dpdk-vdpa.h"
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include "netdev-provider.h"
+#include "openvswitch/vlog.h"
+#include "dp-packet.h"
+#include "util.h"
+
+VLOG_DEFINE_THIS_MODULE(netdev_dpdk_vdpa);
+
+#define NETDEV_DPDK_VDPA_SIZEOF_MBUF        (sizeof(struct rte_mbuf *))
+#define NETDEV_DPDK_VDPA_MAX_QPAIRS         128
+#define NETDEV_DPDK_VDPA_INVALID_QUEUE_ID   0x
+#define NETDEV_DPDK_VDPA_STATS_MAX_STR_SIZE 64
+#define NETDEV_DPDK_VDPA_RX_DESC_DEFAULT    512
+
+enum netdev_dpdk_vdpa_port_type {
+    NETDEV_DPDK_VDPA_PORT_TYPE_VM,
+    NETDEV_DPDK_VDPA_PORT_TYPE_VF
+};
+
+struct netdev_dpdk_vdpa_relay_flow {
+    struct rte_flow *flow;
+    bool queues_en[RTE_MAX_QUEUES_PER_PORT];
+    uint32_t priority;
+};
+
+struct netdev_dpdk_vdpa_qpair {
+    uint16_t port_id_rx;
+    uint16_t port_id_tx;
+    uint16_t pr_queue;
+    uint8_t mb_head;
+    uint8_t mb_tail;
+    struct rte_mbuf *pkts[NETDEV_MAX_BURST * 2];
+};
+
+struct netdev_dpdk_vdpa_relay {
+    PADDED_MEMBERS(CACHE_LINE_SIZE,
+        struct netdev_dpdk_vdpa_qpair qpair[NETDEV_DPDK_VDPA_MAX_QPAIRS * 2];
+        uint16_t num_queues;
+        struct netdev_dpdk_vdpa_relay_flow flow_params;
+        int port_id_vm;
+        int port_id_vf;
+        uint16_t vf_mtu;
+        int n_rxq;
+        char *vf_pci;
+        char *vm_socket;
+        char *vhost_name;
+    );
+};
+
+static int
+netdev_dpdk_vdpa_port_from_name(const char *name)
+{
+    int port_id;
+    size_t len;
+
+    len = strlen(name);
+    for (port_id = 0; port_id < RTE_MAX_ETHPORTS; port_id++) {
+        if (rte_eth_dev_is_valid_port(port_id) &&
+            !strncmp(name, rte_eth_devices[port_id].device->name, len)) {
+            return port_id;
+        }
+    }
+    VLOG_ERR("No port was found for %s", name);
+    return -1;
+}
+
+static void
+netdev_dpdk_vdpa_free(void *ptr)
+{
+    if (ptr == NULL) {
+        return;
+    }
+    free(ptr);
+    ptr = NULL;
+}
+static void
+netdev_dpdk_vdpa_clear_relay(struct netdev_dpdk_vdpa_relay *relay)
+{
+    uint16_t q;
+    uint8_t i;
+
+    for (q = 0; q < relay->num_queues; q++) {
+        for (i = relay->qpair[q].mb_head; i < relay->qpair[q].mb_tail; i++) {
+            rte_pktmbuf_free(relay->qpair[q].pkts[i]);
+        }
+        relay->qpair[q].mb_head = 0;
+        relay->qpair[q].mb_tail = 0;
+        relay->qpair[q].port_id_rx = 0;
+        relay->qpair[q].port_id_tx = 0;
+        relay->qpair[q].pr_queue = NETDEV_DPDK_VDPA_INVALID_QUEUE_ID;
+    }
+
+    relay->port_id_vm = 0;
+    relay->port_id_vf = 0;
+    relay->num_queues = 0;
+    rel
[ovs-dev] [PATCH ovs V1 2/2] netdev-dpdk: Add dpdkvdpa port
The dpdkvdpa netdev works with three components: a vhost-user socket, a
vdpa device (a real vdpa device or a VF), and a representor of the
vdpa device.

To add a new vDPA port, add a new port to an existing bridge with type
dpdkvdpa and the vDPA options:

    ovs-vsctl add-port br0 vdpa0 -- set Interface vdpa0 type=dpdkvdpa \
        options:vdpa-socket-path= \
        options:vdpa-accelerator-devargs= \
        options:dpdk-devargs=,representor=[id]

With this command OVS creates a new netdev:
1. Register a vhost-user-client device.
2. Open and configure the VF dpdk port.
3. Open and configure the representor dpdk port.

The new netdev uses the netdev_rxq_recv() function to receive packets
from the VF and push them to vhost-user, and to receive packets from
vhost-user and push them to the VF.

Signed-off-by: Noa Ezra
Reviewed-by: Oz Shlomo
---
 NEWS                 |   1 +
 lib/netdev-dpdk.c    | 162 +++
 vswitchd/vswitch.xml |  25
 3 files changed, 188 insertions(+)

diff --git a/NEWS b/NEWS
index f5a0b8f..6f315c6 100644
--- a/NEWS
+++ b/NEWS
@@ -542,6 +542,7 @@ v2.6.0 - 27 Sep 2016
      * Remove dpdkvhostcuse port type.
      * OVS client mode for vHost and vHost reconnect (Requires QEMU 2.7)
      * 'dpdkvhostuserclient' port type.
+     * 'dpdkvdpa' port type.
    - Increase number of registers to 16.
    - ovs-benchmark: This utility has been removed due to lack of use and
      bitrot.
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index bc20d68..16ddf58 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -47,6 +47,7 @@
 #include "dpif-netdev.h"
 #include "fatal-signal.h"
 #include "netdev-provider.h"
+#include "netdev-dpdk-vdpa.h"
 #include "netdev-vport.h"
 #include "odp-util.h"
 #include "openvswitch/dynamic-string.h"
@@ -137,6 +138,9 @@ typedef uint16_t dpdk_port_t;
 /* Legacy default value for vhost tx retries. */
 #define VHOST_ENQ_RETRY_DEF 8

+/* Size of VDPA custom stats. */
+#define VDPA_CUSTOM_STATS_SIZE 4
+
 #define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ)

 static const struct rte_eth_conf port_conf = {
@@ -461,6 +465,8 @@ struct netdev_dpdk {
         int rte_xstats_ids_size;
         uint64_t *rte_xstats_ids;
     );
+
+    struct netdev_dpdk_vdpa_relay *relay;
 };

 struct netdev_rxq_dpdk {
@@ -1346,6 +1352,30 @@ netdev_dpdk_construct(struct netdev *netdev)
     return err;
 }

+static int
+netdev_dpdk_vdpa_construct(struct netdev *netdev)
+{
+    struct netdev_dpdk *dev;
+    int err;
+
+    err = netdev_dpdk_construct(netdev);
+    if (err) {
+        VLOG_ERR("netdev_dpdk_construct failed. Port: %s\n", netdev->name);
+        goto out;
+    }
+
+    ovs_mutex_lock(&dpdk_mutex);
+    dev = netdev_dpdk_cast(netdev);
+    dev->relay = netdev_dpdk_vdpa_alloc_relay();
+    if (!dev->relay) {
+        err = ENOMEM;
+    }
+
+    ovs_mutex_unlock(&dpdk_mutex);
+out:
+    return err;
+}
+
 static void
 common_destruct(struct netdev_dpdk *dev)
     OVS_REQUIRES(dpdk_mutex)
@@ -1428,6 +1458,19 @@ dpdk_vhost_driver_unregister(struct netdev_dpdk *dev OVS_UNUSED,
 }

 static void
+netdev_dpdk_vdpa_destruct(struct netdev *netdev)
+{
+    struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
+
+    ovs_mutex_lock(&dpdk_mutex);
+    netdev_dpdk_vdpa_destruct_impl(dev->relay);
+    rte_free(dev->relay);
+    ovs_mutex_unlock(&dpdk_mutex);
+
+    netdev_dpdk_destruct(netdev);
+}
+
+static void
 netdev_dpdk_vhost_destruct(struct netdev *netdev)
 {
     struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
@@ -1878,6 +1921,47 @@ out:
 }

 static int
+netdev_dpdk_vdpa_set_config(struct netdev *netdev, const struct smap *args,
+                            char **errp)
+{
+    struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
+    const char *vdpa_accelerator_devargs =
+                smap_get(args, "vdpa-accelerator-devargs");
+    const char *vdpa_socket_path =
+                smap_get(args, "vdpa-socket-path");
+    int err = 0;
+
+    if ((vdpa_accelerator_devargs == NULL) || (vdpa_socket_path == NULL)) {
+        VLOG_ERR("netdev_dpdk_vdpa_set_config failed."
+                 "Required arguments are missing for VDPA port %s",
+                 netdev->name);
+        goto free_relay;
+    }
+
+    err = netdev_dpdk_set_config(netdev, args, errp);
+    if (err) {
+        VLOG_ERR("netdev_dpdk_set_config failed. Port: %s", netdev->name);
+        goto free_relay;
+    }
+
+    err = netdev_dpdk_vdpa_config_impl(dev->relay, dev->port_id,
+                                       vdpa_socket_path,
+                                       vdpa_accelerator_devargs);
+    if (err) {
+        VLOG_ERR("netdev_dpdk_vdpa_config_impl failed. Port %s",
+                 netdev->name);
+        goto free_relay;
+    }
+
+    goto out;
+
+free_relay:
[ovs-dev] [PATCH ovs V1 0/2] Introduce dpdkvdpa netdev
Introduce dpdkvdpa netdev allowing HW offloads over VirtIO network devices.
dpdkvdpa ports can be added to netdev bridges with the following command:

    ovs-vsctl add-port br0 vdpa0 -- set Interface vdpa0 type=dpdkvdpa \
        options:vdpa-socket-path= \
        options:vdpa-accelerator-devargs= \
        options:dpdk-devargs=,representor=[id]

vDPA netdev is designed to support both SW and HW acceleration.
SRIOV capable NICs can use the SW acceleration, which relays packets
between VF and virtIO ports.
HW mode will configure vDPA capable NICs.

Patch 1 provides the vdpa functionality as a pre-step without a functional
change.
Patch 2 introduces the dpdkvdpa vport.

Noa Ezra (2):
  netdev-dpdk-vdpa: Introduce dpdkvdpa netdev
  netdev-dpdk: Add dpdkvdpa port

 NEWS                   |   1 +
 lib/automake.mk        |   4 +-
 lib/netdev-dpdk-vdpa.c | 750 +
 lib/netdev-dpdk-vdpa.h |  54
 lib/netdev-dpdk.c      | 162 +++
 vswitchd/vswitch.xml   |  25 ++
 6 files changed, 995 insertions(+), 1 deletion(-)
 create mode 100755 lib/netdev-dpdk-vdpa.c
 create mode 100644 lib/netdev-dpdk-vdpa.h

--
1.8.3.1