Background
==========
The OVS HW offload solution consists of forwarding and control. The HW
implements an embedded switch that connects SRIOV VFs and forwards packets
according to dynamically configured HW rules (packets can also be altered by
HW rules). Packets that have no forwarding rule, called exception packets,
are sent to the control path (OVS SW). OVS SW handles the exception packet
just as in SW-only mode, i.e. it performs an upcall if no DP flow exists.
OVS SW uses a port representor to represent the VF, see
https://doc.dpdk.org/guides/prog_guide/switch_representation.html.
Packets sent from the VF arrive at the port representor, and packets sent to
the port representor arrive at the VF. Once OVS SW generates a data plane
flow, a new HW rule is configured in the embedded switch. Following packets
on the same flow are then handled by HW only: they go directly from VF (or
uplink) to VF without reaching SW.
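For illustration only, such an embedded-switch rule can be expressed with
the DPDK rte_flow API roughly as below. This is a minimal sketch, not the
actual OVS offload code path; the port ids and the single ethertype match
are placeholder assumptions.

    /* Sketch: offload a flow learned in SW into the embedded switch so
     * that following packets bypass OVS.  Port ids and the match are
     * placeholders. */
    #include <rte_byteorder.h>
    #include <rte_ether.h>
    #include <rte_flow.h>

    static struct rte_flow *
    offload_example(uint16_t repr_port_id, uint16_t dst_vf_port_id)
    {
        struct rte_flow_attr attr = { .transfer = 1 };  /* e-switch rule */
        struct rte_flow_item_eth eth_spec = {
            .type = RTE_BE16(RTE_ETHER_TYPE_IPV4),
        };
        struct rte_flow_item_eth eth_mask = { .type = RTE_BE16(0xffff) };
        struct rte_flow_item pattern[] = {
            { .type = RTE_FLOW_ITEM_TYPE_ETH,
              .spec = &eth_spec, .mask = &eth_mask },
            { .type = RTE_FLOW_ITEM_TYPE_END },
        };
        struct rte_flow_action_port_id dst = { .id = dst_vf_port_id };
        struct rte_flow_action actions[] = {
            { .type = RTE_FLOW_ACTION_TYPE_PORT_ID, .conf = &dst },
            { .type = RTE_FLOW_ACTION_TYPE_END },
        };
        struct rte_flow_error err;

        /* The rule is created through the representor of the source
         * port; matching packets are sent to the destination VF. */
        return rte_flow_create(repr_port_id, &attr, pattern, actions, &err);
    }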
For some HW architectures, only the SRIOV HW offload architecture presented
above is supported. The SRIOV architecture requires the guest to install a
driver which is specific to the underlying HW. A HW-specific driver
introduces two main problems for virtualization:
1. It breaks virtualization in some sense (the VM is aware of the HW).
2. It has less natural support for live migration.
Using a virtio interface solves both problems (at the expense of some loss
in functionality and performance). However, for some HW offloads, working
directly with virtio cannot be supported.

HW offload for virtio architecture
==================================
We suggest an architecture for HW offload of a virtio interface that adds
another component, called virtio-forwarder, on top of the current
architecture. The forwarder is a software or hardware (for vdpa) component
that connects the VF with a matching virtio interface, as shown below:

         ---------        -------------
        |   OVS   |      |  forwarder  |          ---------
         ---------        -------------          |  guest  |
            | PR1          | VF1   | virtio1 ----|         |
            |              |                      ---------
         ---------------------------
        |          e-switch         |---- uplink
         ---------------------------

The forwarder's role is to function as a wire between the VF and the virtio
interface. The forwarder reads packets from the rx-queue and sends them to
the peer tx-queue (and vice versa). Since the function in this case is
reduced to forwarding packets without inspecting them, a single core can
push a very high number of PPS (near DPDK forwarding performance).

There are 3 sub use cases.

OVS-dpdk
--------
This is the basic use case that was just described. In this use case we
have a port representor, a VF and a virtio interface (forwarding should be
done between the VF and virtio).

Vdpa
----
Vdpa enables the HW to put packets directly into the VM's virtio queues. In
this case the forwarding is done in HW, but some SW still has to handle the
control: configure the queues and adjust the configuration according to
VHOST updates.

OVS-kernel
----------
OVS-kernel HW offload has the same limitation. However, when only
forwarding packets, DPDK has a great performance advantage over the kernel.
It would be good to also add this use case, weighing implementation effort
against performance gain.

Why not just use a standalone forwarder (DPDK test PMD)?
---------------------------------------------------------
1. When HW offload is running, we expect most of the traffic to be handled
   by HW, so the PMD thread will be mostly idle. We don't want to burn
   another core for the forwarding.
2. A standalone application is another application with all the additional
   overheads: start-up, configuration, monitoring, etc., besides being
   another project, which means another dependency.
3. We can reuse the existing OVS load balancing and NUMA awareness.
   Forwarding should show the exact same symptoms of an unbalanced workload
   as a regular rx-queue.
4. We might need some prioritization: exception packets are more important
   than forwarding. Being in the same domain makes it possible to add such
   prioritization while keeping the CPU requirement to a minimum.

OVS virtio-forwarder
====================
The suggestion is to put the wire and control functionality in the
hw-offload module. Looking at the forwarder functionality, we have control
and data. The control is the configuration: virtio/VF matching (and type),
queue configuration (defined when the VM is initialized, and can change),
etc. The data is the actual forwarding, which needs a context to run in,
sketched below.
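To make the data part concrete, the per-queue wire can be sketched with
plain DPDK burst calls. This assumes two already configured and started
DPDK ports, one for the VF and one for the vhost-user (virtio) side; the
port ids, queue id and burst size are placeholder assumptions.

    #include <rte_ethdev.h>
    #include <rte_mbuf.h>

    #define FWD_BURST 32

    /* Move one burst from src_port to dst_port without inspecting it. */
    static void
    fwd_one_direction(uint16_t src_port, uint16_t dst_port, uint16_t qid)
    {
        struct rte_mbuf *pkts[FWD_BURST];
        uint16_t nb_rx, nb_tx;

        nb_rx = rte_eth_rx_burst(src_port, qid, pkts, FWD_BURST);
        if (nb_rx == 0) {
            return;
        }
        nb_tx = rte_eth_tx_burst(dst_port, qid, pkts, nb_rx);
        /* Free whatever the peer tx-queue could not accept. */
        while (nb_tx < nb_rx) {
            rte_pktmbuf_free(pkts[nb_tx++]);
        }
    }

    /* One iteration of the wire: VF -> virtio and virtio -> VF. */
    static void
    fwd_iteration(uint16_t vf_port, uint16_t virtio_port, uint16_t qid)
    {
        fwd_one_direction(vf_port, virtio_port, qid);
        fwd_one_direction(virtio_port, vf_port, qid);
    }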
As explained, forwarding is reduced to a simple rx-burst and tx-burst where
everything can be predefined after the configuration. We add the forwarding
layer to the HW offload module and configure it separately. For example:

    ovs-appctl hw-offload/set-fw vhost-server-path=/tmp/dpdkvhostvm1:rxq=2 dpdk-devargs=0000:08:00.0 type=pr:[1]

Once configured, we attach the context according to the user configuration.
In the basic use case we hook into the port representor scheduling, so we
can use the OVS scheduler: when the port representor rx-queue is polled, we
forward the packets for it and account the cycles on the port representor
(rx-queue), so OVS can rebalance if needed. This way we use the PMD thread's
empty cycles (see the sketch at the end of this note).

If no port representor is added, we hook into the scheduler as a generic
call: every scheduling cycle we call the HW virtio-forwarder, with a limited
quota to avoid starving the rx-queues. Although we cannot use the OVS
scheduling features in this case, we still reuse most of the forwarder code
and solve the problem for kernel-OVS with minor additional effort.

From the OVS perspective this is HW offload functionality; no ports are
added to OVS. The functionality and statistics can only be accessed through
the HW offload module, and only a minimal code change is needed in OVS,
mainly for hooking the calling context.
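To illustrate the hooking point, here is a rough sketch, not a proposed
patch. The structures and function names below are invented for this
illustration and are not existing OVS code; fwd_iteration() is the wire
loop sketched earlier, and rte_rdtsc() stands in for the per-rx-queue cycle
accounting OVS already performs.

    #include <stdint.h>
    #include <rte_cycles.h>

    struct fwd_binding {        /* one VF <-> virtio pairing (assumed) */
        uint16_t vf_port;
        uint16_t virtio_port;
        uint16_t qid;
    };

    struct repr_rxq_stats {     /* per representor rx-queue accounting */
        uint64_t cycles;
    };

    /* The wire loop sketched earlier. */
    void fwd_iteration(uint16_t vf_port, uint16_t virtio_port, uint16_t qid);

    static void
    poll_representor_rxq(struct fwd_binding *fwd, struct repr_rxq_stats *stats)
    {
        uint64_t start = rte_rdtsc();

        if (fwd) {
            /* Run the forwarder in the PMD thread's otherwise idle cycles. */
            fwd_iteration(fwd->vf_port, fwd->virtio_port, fwd->qid);
        }

        /* ... regular exception-packet processing of the representor
         * rx-queue would happen here ... */

        /* Charge everything to the representor rx-queue so the existing
         * PMD load balancing can react if the forwarding load grows. */
        stats->cycles += rte_rdtsc() - start;
    }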