Hi Liang-min, When replying inline, please do not prefix with ">>" as it is handled as quoted text. There is no need to prefix.
On 5/18/21 8:00 PM, Wang, Liang-min wrote: >> -----Original Message----- >> From: Maxime Coquelin <maxime.coque...@redhat.com> >> Sent: Tuesday, May 18, 2021 12:15 PM >> To: Miskell, Timothy <timothy.misk...@intel.com>; d...@openvswitch.org >> Cc: Wang, Liang-min <liang-min.w...@intel.com> >> Subject: Re: [PATCH] Extends the existing mirror configuration parameters >> >> Hi Timothy, Liang-min, >> >> Thanks for rebasing the patch. >> A list of delta against the first RFC could help the reviewers. >> I notice one change in the right direction is the conversion to Vhost >> API datapath instead of Vhost PMD. >> >>> In this patch we support both Vhost API and Vhost PMD because OVS supports >>> both >>> VhostUser and Vdev ports. >>> >> Also, I would suggest to have the patch split in several incremental >> patches to ease the review. >> >>> Thank you for suggestion. We will provide incremental patches on next >>> submission >>> >> On 5/10/21 6:00 PM, Timothy Miskell wrote: >>> From: Liang-min Wang <liang-min.w...@intel.com> >>> >>> The following parameters are added: >>> - mirror-offload: to turn on/off mirror offloading. >>> - output-port-name: specify a port, using name string, that is on a >>> different >>> bridge >>> - output-src-vlan: output port vlan for each select-src-port. >>> - output-dst-vlan: output port vlan for each select-dst-port. >>> - flow-src-mac: use src mac address of each select-dst-port for the header >>> scan. >>> - flow-dst-mac: use dst mac address of each select-src-port for the header >>> scan. >>> - mirror-tunnel-addr: BDF string of the tunnel device. >>> >>> ovs-vsctl test change because new mirroring parameters are introduced in >> this patch >> >> It would help to provide examples of usage of these new parameters. >> >>> Will add examples in the new patches >>> >>> Create a defer procedure call thread to handle all mirror offload requests. >>> This is a light-weight thread which remains in sleep-state when there is no >> new request. >>> This is created between ovs-vsctl and mirror offloading back end >>> >>> Implementing DPDK tx-burst (VIRTIO ingress traffic >>> mirror) and rx-burst (VIRTIO egress traffic mirror) callbacks. >>> Each callback functions implement the following tasks: >>> 1. Enable per-packet VLAN insertion >>> - for port mirroring, all packets are enabled per-packet VLAN insertion. >>> - for flow mirroring, only packet header matches the required mac >> address >>> are enabled. >>> 2. Sending the packets to the specified transport port (output-port in >>> mirror offload configuration) >>> - for port mirroring, all packets are sent to the transport port. >>> - for flow mirroring, only matched packets are sent. >>> 3. Restore each packet attributes (remove DPDK per-packet offload flag) >> >> I will for sure have more questions later, but please find a few >> comments/questions below: >> >>> Signed-off-by: Liang-min Wang <liang-min.w...@intel.com> >>> Tested-by: Timothy Miskell <timothy.misk...@intel.com> >>> Suggested-by: Munish Mehan <mm6...@att.com> >>> --- >>> lib/automake.mk | 2 + >>> lib/netdev-dpdk-mirror.c | 516 >> +++++++++++++++++++++++++++++++++++++ >>> lib/netdev-dpdk-mirror.h | 83 ++++++ >>> lib/netdev-dpdk.c | 397 ++++++++++++++++++++++++++++ >>> lib/netdev-provider.h | 16 ++ >>> lib/netdev.c | 386 +++++++++++++++++++++++++++ >>> lib/netdev.h | 16 ++ >>> tests/ovs-vsctl.at | 2 + >>> vswitchd/bridge.c | 271 ++++++++++++++++++- >>> vswitchd/vswitch.ovsschema | 24 +- >>> vswitchd/vswitch.xml | 50 ++++ >>> 11 files changed, 1759 insertions(+), 4 deletions(-) >>> create mode 100644 lib/netdev-dpdk-mirror.c >>> create mode 100644 lib/netdev-dpdk-mirror.h >>> ... >>> + >>> +/* port/flow mirror traffic processors */ >>> +static inline uint16_t >>> +netdev_custom_mirror_offload_cb(uint16_t qidx, struct rte_mbuf >> **pkts, >>> + uint16_t nb_pkts, void *user_params) >>> +{ >>> + struct mirror_param *data = user_params; >>> + uint16_t i, dst_qidx, match_count = 0; >>> + uint16_t pkt_trans; >>> + uint16_t dst_port_id = data->dst_port_id; >>> + uint16_t dst_vlan_id = data->dst_vlan_id; >>> + struct rte_mbuf **pkt_buf = &data->pkt_buf[qidx * data- >>> max_burst_size]; >>> + >>> + if (nb_pkts == 0) { >>> + return 0; >>> + } >>> + >>> + if (nb_pkts > data->max_burst_size) { >>> + VLOG_ERR("Per-flow batch size, %d, exceeds maximum limit\n", >> nb_pkts); >>> + return 0; >>> + } >>> + >>> + for (i = 0; i < nb_pkts; i++) { >>> + if (data->custom_scan(pkts[i], user_params)) { >>> + pkt_buf[match_count] = pkts[i]; >>> + pkt_buf[match_count]->ol_flags |= PKT_TX_VLAN_PKT; >> >> Does it work if the packet already has a VLAN inserted? >> >>> Good catch. The design is based upon no VLAN insertion offloading is >>> applied on source traffic. That is a significant limitation, I haven not seen it documented in the series. Also, the IEEE paper mentions using QinQ when the packet has already a VLAN inserted, but it is not in the series. Is there a reason QinQ is not possible? IIUC, that means the TAP VM would see traffic having no VLAN, whereas it has in reality? >>> + pkt_buf[match_count]->vlan_tci = dst_vlan_id; >>> + rte_mbuf_refcnt_update(pkt_buf[match_count], 1); >> >> >> >>> + match_count++; >>> + } >>> + } >>> + >>> + dst_qidx = (data->n_dst_queue > qidx)?qidx:(data->n_dst_queue -1); >> >> Wouldn't it scale better with: >> dst_qidx = qidx % data->n_dst_queue >> ? >> >>> We tried to avoid using "%" operator. We could add "unlikely" and the >>> suggested "%" to make improvement Not sure adding 'unlikely' is really necessary. The cost of the modulo operation is nothing compared to all we do in this path. >>> >>> + >>> + rte_spinlock_lock(&data->locks[dst_qidx]); >>> + pkt_trans = rte_eth_tx_burst(dst_port_id, dst_qidx, pkt_buf, >> match_count); >>> + rte_spinlock_unlock(&data->locks[dst_qidx]); >>> + >>> + for (i = 0; i < match_count; i++) { >>> + pkt_buf[i]->ol_flags &= ~PKT_TX_VLAN_PKT; >>> + } >> >> In order to further reduce the performance impact of mirroring, have you >> envisaged to offload it to dedicated PMD threads? >> >>> The mirror-tunnel design is a comprised approach between hardware TAP and >>> software TAP. >>> The tunnel itself is designed to have very little impact on source traffic >>> processing core. From >>> our benchmark on SR-IOV, L2 forwarding, mirroring, we only observed 10-20% >>> impact on 64-byte packets, and >>> we did not observe impact when running traffic with packet size with >>> 128-byte or above. The cover letter mentions 20-30% for 64B packets, and the IEEE paper seems to indicate a significant impact up to 512B packets (~25%?). Maybe the workload is different in these tests, that could explain why you don't see an impact starting 128B? >>> >>> + >>> + while (unlikely (pkt_trans < match_count)) { >>> + rte_pktmbuf_free(pkt_buf[pkt_trans]); >>> + pkt_trans++; >>> + } >>> + >>> + return nb_pkts; >>> +} >>> + ... >>> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml >>> index 4597a215d..fd2049a7f 100644 >>> --- a/vswitchd/vswitch.xml >>> +++ b/vswitchd/vswitch.xml >>> @@ -4869,11 +4869,35 @@ ovs-vsctl add-port br0 p0 -- set Interface p0 >> type=patch options:peer=p1 \ >>> selected VLANs. >>> </p> >>> >>> + <column name="mirror_tunnel_addr"> >>> + BDF string of the tunnel device on which mirrored traffic will be >>> + transmitted. >>> + </column> >>> + >>> <column name="select_all"> >>> If true, every packet arriving or departing on any port is >>> selected for mirroring. >>> </column> >>> >>> + <column name="mirror_offload"> >>> + If true, a hw-assisted port mirroring is configured instead >>> + default mirroring. >>> + </column> >>> + >>> + <column name="flow_src_mac"> >>> + The source MAC address(es) for per-flow mirroring. Each MAC >>> + address is separate by ','. This parametr is paired with >>> + select_dst_port. A '0' MAC address indicates the requested mirror >>> + is a per-port mirroring, otherwise it's a per-flow mirroring >>> + </column> >>> + >>> + <column name="flow_dst_mac"> >>> + The destination MAC address(es) for per-flow mirroring. Each MAC >>> + address is separate by ','. This parametr is paired with >>> + select_src_port. A '0' MAC address indicates the requested mirror >>> + is a per-port mirroring, otherwise it's a per-flow mirroring >>> + </column> >>> + >>> <column name="select_dst_port"> >>> Ports on which departing packets are selected for mirroring. >>> </column> >>> @@ -4955,6 +4979,32 @@ ovs-vsctl add-port br0 p0 -- set Interface p0 >> type=patch options:peer=p1 \ >>> </p> >>> </column> >>> >>> + <column name="output_src_vlan"> >>> + <p>Output VLAN for selected source port packets, if nonempty.</p> >>> + <p> >>> + <em>Please note:</em> This is different than >>> + <ref column="output-vlan"/> This vlan is used to add an >>> additional >>> + vlan tag on the mirror traffic, regardless it contains vlan or >>> not. >>> + The receive end could choose to filter out this additional vlan. >>> + This option is provided so the mirrored traffic could maintain >>> its >>> + original vlan informaiton, and this mirror can be used to filter >>> + out un-wanted traffic such as in <ref column="mirror_offload"/>. >>> + </p> >>> + </column> >>> + >>> + <column name="output_dst_vlan"> >>> + <p>Output VLAN for selected destination port packets, if >> nonempty.</p> >>> + <p> >>> + <em>Please note:</em> This is different than >>> + <ref column="output-vlan"/> This vlan is used to add an >>> additional >>> + vlan tag on the mirror traffic, regardless it contains vlan or >>> not. >>> + The receive end could choose to filter out this additional vlan. >>> + This option is provided so the mirrored traffic could maintain >>> its >>> + original vlan informaiton, and this mirror cab be used to filter >>> + out un-wanted traffic such as in <ref column="mirror_offload"/>. >>> + </p> >>> + </column> >>> + >>> <column name="snaplen"> >>> <p>Maximum per-packet number of bytes to mirror.</p> >>> <p>A mirrored packet with size larger than <ref column="snaplen"/> >>> > These parameters are DPDK specific, but nothing mentions it. It would confuse the OVS-Kernel users. Regards, Maxime _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev