Hi Timothy, Liang-min,

Thanks for rebasing the patch. A list of the changes since the first RFC would help the reviewers. One change I notice in the right direction is the move to the Vhost API datapath instead of the Vhost PMD.
I would also suggest splitting the patch into several incremental patches to ease the review.

On 5/10/21 6:00 PM, Timothy Miskell wrote:
> From: Liang-min Wang <liang-min.w...@intel.com>
>
> The following parameters are added:
> - mirror-offload: to turn on/off mirror offloading.
> - output-port-name: specify a port, using name string, that is on a different
>   bridge
> - output-src-vlan: output port vlan for each select-src-port.
> - output-dst-vlan: output port vlan for each select-dst-port.
> - flow-src-mac: use src mac address of each select-dst-port for the header
>   scan.
> - flow-dst-mac: use dst mac address of each select-src-port for the header
>   scan.
> - mirror-tunnel-addr: BDF string of the tunnel device.
>
> ovs-vsctl test change because new mirroring parameters are introduced in this
> patch

It would help to provide usage examples for these new parameters.

> Create a defer procedure call thread to handle all mirror offload requests.
> This is a light-weight thread which remains in sleep-state when there is no
> new request.
> This is created between ovs-vsctl and mirror offloading back end
>
> Implementing DPDK tx-burst (VIRTIO ingress traffic
> mirror) and rx-burst (VIRTIO egress traffic mirror) callbacks.
> Each callback functions implement the following tasks:
> 1. Enable per-packet VLAN insertion
>    - for port mirroring, all packets are enabled per-packet VLAN insertion.
>    - for flow mirroring, only packet header matches the required mac address
>      are enabled.
> 2. Sending the packets to the specified transport port (output-port in
>    mirror offload configuration)
>    - for port mirroring, all packets are sent to the transport port.
>    - for flow mirroring, only matched packets are sent.
> 3. Restore each packet attributes (remove DPDK per-packet offload flag)

I will certainly have more questions later, but please find a few comments and questions below:

> Signed-off-by: Liang-min Wang <liang-min.w...@intel.com>
> Tested-by: Timothy Miskell <timothy.misk...@intel.com>
> Suggested-by: Munish Mehan <mm6...@att.com>
> ---
>  lib/automake.mk | 2 +
>  lib/netdev-dpdk-mirror.c | 516 +++++++++++++++++++++++++++++++++++++
>  lib/netdev-dpdk-mirror.h | 83 ++++++
>  lib/netdev-dpdk.c | 397 ++++++++++++++++++++++++++++
>  lib/netdev-provider.h | 16 ++
>  lib/netdev.c | 386 +++++++++++++++++++++++++++
>  lib/netdev.h | 16 ++
>  tests/ovs-vsctl.at | 2 +
>  vswitchd/bridge.c | 271 ++++++++++++++++++-
>  vswitchd/vswitch.ovsschema | 24 +-
>  vswitchd/vswitch.xml | 50 ++++
>  11 files changed, 1759 insertions(+), 4 deletions(-)
>  create mode 100644 lib/netdev-dpdk-mirror.c
>  create mode 100644 lib/netdev-dpdk-mirror.h
>
> diff --git a/lib/automake.mk b/lib/automake.mk
> index 39901bd6d..dcafbfaca 100644
> --- a/lib/automake.mk
> +++ b/lib/automake.mk
> @@ -170,6 +170,7 @@ lib_libopenvswitch_la_SOURCES = \
>         lib/multipath.h \
>         lib/namemap.c \
>         lib/netdev-dpdk.h \
> +       lib/netdev-dpdk-mirror.h \
>         lib/netdev-dummy.c \
>         lib/netdev-offload.c \
>         lib/netdev-offload.h \
> @@ -460,6 +461,7 @@ if DPDK_NETDEV
>  lib_libopenvswitch_la_SOURCES += \
>         lib/dpdk.c \
>         lib/netdev-dpdk.c \
> +       lib/netdev-dpdk-mirror.c \
>         lib/netdev-offload-dpdk.c
>  else
>  lib_libopenvswitch_la_SOURCES += \
> diff --git a/lib/netdev-dpdk-mirror.c b/lib/netdev-dpdk-mirror.c
> new file mode 100644
> index 000000000..ff2701660
> --- /dev/null
> +++ b/lib/netdev-dpdk-mirror.c
> @@ -0,0 +1,516 @@
> +/*
> + * Copyright (c) 2014, 2015, 2016, 2017 Nicira, Inc.
> + * Copyright (c) 2019 Mellanox Technologies, Ltd.
> + *
> + * Licensed under the Apache License, Version 2.0 (the "License");
> + * you may not use this file except in compliance with the License.
> + * You may obtain a copy of the License at: > + * > + * http://www.apache.org/licenses/LICENSE-2.0 > + * > + * Unless required by applicable law or agreed to in writing, software > + * distributed under the License is distributed on an "AS IS" BASIS, > + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. > + * See the License for the specific language governing permissions and > + * limitations under the License. > + */ > +#include <config.h> > +#include <rte_ethdev.h> > + > +#include "netdev-dpdk-mirror.h" > +#include "openvswitch/vlog.h" > +#include "openvswitch/dynamic-string.h" > +#include "util.h" > + > +#define MAC_ADDR_MAP 0x0000FFFFFFFFFFFFULL > +#define is_mac_addr_match(a,b) (((a^b)&MAC_ADDR_MAP) == 0) > +#define INIT_MIRROR_DB_SIZE 8 > +#define INVALID_DEVICE_ID 0xFFFFFFFF > + > +VLOG_DEFINE_THIS_MODULE(netdev_dpdk_mirror); > + > +/* port/flow mirror database management routines */ > +/* > + * The below API is for port/flow mirror offloading which uses a different > DPDK > + * interface as rte-flow. 
> + */ > +static int mirror_port_db_size = 0; > +static int mirror_port_used = 0; > +static struct mirror_offload_port *mirror_port_db = NULL; > + > +static void > +netdev_mirror_db_init(struct mirror_offload_port *db, int size) > +{ > + int i; > + > + for (i = 0; i < size; i++) { > + db[i].dev_id = INVALID_DEVICE_ID; > + memset(&db[i].rx, 0, sizeof(struct mirror_param)); > + memset(&db[i].tx, 0, sizeof(struct mirror_param)); > + } > +} > + > +/* Double the db size when it runs out of space */ > +static int > +netdev_mirror_db_resize(void) > +{ > + int new_size = mirror_port_db_size << 1; > + struct mirror_offload_port *new_db = xmalloc( > + sizeof(struct mirror_offload_port)*new_size); > + > + memcpy(new_db, mirror_port_db, sizeof(struct mirror_offload_port) > + *mirror_port_db_size); > + netdev_mirror_db_init(&new_db[mirror_port_db_size], mirror_port_db_size); > + mirror_port_db_size = new_size; > + mirror_port_db = new_db; > + > + return 0; > +} > + > + > +static struct mirror_offload_port* > +netdev_mirror_data_find(uint32_t dev_id) > +{ > + int i; > + > + if (mirror_port_db == NULL) { > + return NULL; > + } > + > + for (i = 0; i < mirror_port_db_size; i++) { > + if (dev_id == mirror_port_db[i].dev_id) { > + return &mirror_port_db[i]; > + } > + } > + return NULL; > +} > + > +static struct mirror_offload_port* > +netdev_mirror_data_add(uint32_t dev_id, int tx, > + struct mirror_param *new_param) > +{ > + struct mirror_offload_port *target = NULL; > + int i; > + > + if (!mirror_port_db) { > + mirror_port_db_size = INIT_MIRROR_DB_SIZE; > + mirror_port_db = xmalloc(sizeof(struct mirror_offload_port)* > + mirror_port_db_size); > + netdev_mirror_db_init(mirror_port_db, mirror_port_db_size); > + } > + target = netdev_mirror_data_find(dev_id); > + if (target) { > + if (tx) { > + if (target->tx.mirror_cb) { > + VLOG_ERR("Attempt to add ingress mirror offloading" > + " on port, %d, while one is outstanding\n", dev_id); > + return target; > + } > + > + memcpy(&target->tx, 
new_param, sizeof(*new_param)); > + } else { > + if (target->rx.mirror_cb) { > + VLOG_ERR("Attempt to add egress mirror offloading" > + " on port, %d, while one is outstanding\n", dev_id); > + return target; > + } > + > + memcpy(&target->rx, new_param, sizeof(struct mirror_param)); > + } > + } else { > + struct mirror_param *param; > + /* find an unused spot on db */ > + for (i = 0; i < mirror_port_db_size; i++) { > + if (mirror_port_db[i].dev_id == INVALID_DEVICE_ID) { > + break; > + } > + } > + if (i == mirror_port_db_size && netdev_mirror_db_resize()) { > + return NULL; > + } > + > + param = tx ? &mirror_port_db[i].tx : &mirror_port_db[i].rx; > + memcpy(param, new_param, sizeof(struct mirror_param)); > + > + target = &mirror_port_db[i]; > + target->dev_id = dev_id; > + mirror_port_used ++; > + } > + return target; > +} > + > +static void > +netdev_mirror_data_remove(uint32_t dev_id, int tx) { > + struct mirror_offload_port *target = netdev_mirror_data_find(dev_id); > + > + if (!target) { > + VLOG_ERR("Attempt to remove unsaved port, %d, %s callback\n", > + dev_id, tx?"tx": "rx"); > + } > + > + if (tx) { > + memset(&target->tx, 0, sizeof(struct mirror_param)); > + } else { > + memset(&target->rx, 0, sizeof(struct mirror_param)); > + } > + > + if ((target->rx.mirror_cb == NULL) && > + (target->tx.mirror_cb == NULL)) { > + target->dev_id = INVALID_DEVICE_ID; > + mirror_port_used --; > + /* release port mirror db memory when there > + * is no outstanding port mirror offloading > + * configuration > + */ > + if (mirror_port_used == 0) { > + free(mirror_port_db); > + mirror_port_db = NULL; > + mirror_port_db_size = 0; > + } > + } > +} > + > +void > +netdev_mirror_data_proc(uint32_t dev_id, mirror_data_op op, > + int tx, struct mirror_param *in_param, > + struct mirror_offload_port **out_param) > +{ > + switch (op) { > + case mirror_data_find: > + *out_param = netdev_mirror_data_find(dev_id); > + break; > + case mirror_data_add: > + *out_param = 
netdev_mirror_data_add(dev_id, tx, in_param);
> +        break;
> +    case mirror_data_rem:
> +        netdev_mirror_data_remove(dev_id, tx);
> +        break;
> +    }
> +}
> +
> +/* port/flow mirror traffic processors */
> +static inline uint16_t
> +netdev_custom_mirror_offload_cb(uint16_t qidx, struct rte_mbuf **pkts,
> +    uint16_t nb_pkts, void *user_params)
> +{
> +    struct mirror_param *data = user_params;
> +    uint16_t i, dst_qidx, match_count = 0;
> +    uint16_t pkt_trans;
> +    uint16_t dst_port_id = data->dst_port_id;
> +    uint16_t dst_vlan_id = data->dst_vlan_id;
> +    struct rte_mbuf **pkt_buf = &data->pkt_buf[qidx * data->max_burst_size];
> +
> +    if (nb_pkts == 0) {
> +        return 0;
> +    }
> +
> +    if (nb_pkts > data->max_burst_size) {
> +        VLOG_ERR("Per-flow batch size, %d, exceeds maximum limit\n",
> nb_pkts);
> +        return 0;
> +    }
> +
> +    for (i = 0; i < nb_pkts; i++) {
> +        if (data->custom_scan(pkts[i], user_params)) {
> +            pkt_buf[match_count] = pkts[i];
> +            pkt_buf[match_count]->ol_flags |= PKT_TX_VLAN_PKT;

Does this work if the packet already carries a VLAN tag?

> +            pkt_buf[match_count]->vlan_tci = dst_vlan_id;
> +            rte_mbuf_refcnt_update(pkt_buf[match_count], 1);
> +            match_count++;
> +        }
> +    }
> +
> +    dst_qidx = (data->n_dst_queue > qidx)?qidx:(data->n_dst_queue -1);

Wouldn't it scale better with: dst_qidx = qidx % data->n_dst_queue?

> +
> +    rte_spinlock_lock(&data->locks[dst_qidx]);
> +    pkt_trans = rte_eth_tx_burst(dst_port_id, dst_qidx, pkt_buf,
> match_count);
> +    rte_spinlock_unlock(&data->locks[dst_qidx]);
> +
> +    for (i = 0; i < match_count; i++) {
> +        pkt_buf[i]->ol_flags &= ~PKT_TX_VLAN_PKT;
> +    }

To further reduce the performance impact of mirroring, have you considered offloading it to dedicated PMD threads?

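To make the queue-mapping question above concrete, here is a minimal standalone sketch comparing the two mappings. It is not taken from the patch: the helper names map_clamp, map_modulo and max_sharers are hypothetical, and only the two expressions for dst_qidx mirror the code under review.

```c
#include <assert.h>
#include <stdint.h>

/* Mapping as written in the patch: every source queue at or beyond the
 * last destination queue collapses onto the last destination queue. */
static uint16_t
map_clamp(uint16_t qidx, uint16_t n_dst_queue)
{
    return (n_dst_queue > qidx) ? qidx : (uint16_t) (n_dst_queue - 1);
}

/* Suggested mapping: wrap around so that the extra source queues are
 * spread over all destination queues.  n_dst_queue must be non-zero. */
static uint16_t
map_modulo(uint16_t qidx, uint16_t n_dst_queue)
{
    assert(n_dst_queue > 0);
    return (uint16_t) (qidx % n_dst_queue);
}

typedef uint16_t (*queue_map_fn)(uint16_t qidx, uint16_t n_dst_queue);

/* Hypothetical helper: returns the largest number of source queues that
 * end up sharing a single destination queue (and hence its spinlock). */
static unsigned int
max_sharers(queue_map_fn map, uint16_t n_src_queue, uint16_t n_dst_queue)
{
    unsigned int worst = 0;

    for (uint16_t d = 0; d < n_dst_queue; d++) {
        unsigned int count = 0;

        for (uint16_t s = 0; s < n_src_queue; s++) {
            if (map(s, n_dst_queue) == d) {
                count++;
            }
        }
        if (count > worst) {
            worst = count;
        }
    }
    return worst;
}
```

Taking 8 source queues and 4 destination queues as made-up numbers, the clamped mapping sends source queues 3 through 7 to destination queue 3, so five queues contend for one lock, whereas the modulo mapping puts two source queues on each destination queue.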
> + > + while (unlikely (pkt_trans < match_count)) { > + rte_pktmbuf_free(pkt_buf[pkt_trans]); > + pkt_trans++; > + } > + > + return nb_pkts; > +} > + > +static inline uint16_t > +netdev_flow_mirror_offload_cb(uint16_t qidx, struct rte_mbuf **pkts, > + uint16_t nb_pkts, void *user_params, uint32_t offset) > +{ > + struct mirror_param *data = user_params; > + uint16_t i, dst_qidx, match_count = 0; > + uint16_t pkt_trans; > + uint16_t dst_port_id = data->dst_port_id; > + uint16_t dst_vlan_id = data->dst_vlan_id; > + uint64_t target_addr = *(uint64_t *) data->extra_data; > + struct rte_mbuf **pkt_buf = &data->pkt_buf[qidx * data->max_burst_size]; > + > + if (nb_pkts == 0) { > + return 0; > + } > + > + if (nb_pkts > data->max_burst_size) { > + VLOG_ERR("Per-flow batch size, %d, exceeds maximum limit\n", > nb_pkts); > + return 0; > + } > + > + for (i = 0; i < nb_pkts; i++) { > + uint64_t *dst_mac_addr = > + rte_pktmbuf_mtod_offset(pkts[i], void *, offset); > + if (is_mac_addr_match(target_addr, (*dst_mac_addr))) { > + pkt_buf[match_count] = pkts[i]; > + pkt_buf[match_count]->ol_flags |= PKT_TX_VLAN_PKT; > + pkt_buf[match_count]->vlan_tci = dst_vlan_id; > + rte_mbuf_refcnt_update(pkt_buf[match_count], 1); > + match_count ++; > + } > + } > + > + dst_qidx = (data->n_dst_queue > qidx) ? 
qidx : (data->n_dst_queue -1); > + > + rte_spinlock_lock(&data->locks[dst_qidx]); > + pkt_trans = rte_eth_tx_burst(dst_port_id, dst_qidx, pkt_buf, > match_count); > + rte_spinlock_unlock(&data->locks[dst_qidx]); > + > + for (i = 0; i < match_count; i++) { > + pkt_buf[i]->ol_flags &= ~PKT_TX_VLAN_PKT; > + } > + > + while (unlikely (pkt_trans < match_count)) { > + rte_pktmbuf_free(pkt_buf[pkt_trans]); > + pkt_trans++; > + } > + > + return nb_pkts; > +} > + > +static inline uint16_t > +netdev_port_mirror_offload_cb(uint16_t qidx, struct rte_mbuf **pkts, > + uint16_t nb_pkts, void *user_params) > +{ > + struct mirror_param *data = user_params; > + uint16_t i, dst_qidx; > + uint16_t pkt_trans; > + uint16_t dst_port_id = data->dst_port_id; > + uint16_t dst_vlan_id = data->dst_vlan_id; > + > + if (nb_pkts == 0) { > + return 0; > + } > + > + for (i = 0; i < nb_pkts; i++) { > + pkts[i]->ol_flags |= PKT_TX_VLAN_PKT; > + pkts[i]->vlan_tci = dst_vlan_id; > + rte_mbuf_refcnt_update(pkts[i], 1); > + } > + > + dst_qidx = (data->n_dst_queue > qidx) ? 
qidx : (data->n_dst_queue -1); > + > + rte_spinlock_lock(&data->locks[dst_qidx]); > + pkt_trans = rte_eth_tx_burst(dst_port_id, dst_qidx, pkts, nb_pkts); > + rte_spinlock_unlock(&data->locks[dst_qidx]); > + > + for (i = 0; i < nb_pkts; i++) { > + pkts[i]->ol_flags &= ~PKT_TX_VLAN_PKT; > + } > + > + while (unlikely (pkt_trans < nb_pkts)) { > + rte_pktmbuf_free(pkts[pkt_trans]); > + pkt_trans++; > + } > + > + return nb_pkts; > +} > + > +static inline uint16_t > +netdev_rx_custom_mirror_offload_cb(uint16_t port_id OVS_UNUSED, > + uint16_t qidx, struct rte_mbuf **pkts, uint16_t nb_pkts, > + uint16_t maxi_pkts OVS_UNUSED, void *user_params) > +{ > + return netdev_custom_mirror_offload_cb(qidx, pkts, nb_pkts, user_params); > +} > + > +static inline uint16_t > +netdev_tx_custom_mirror_offload_cb(uint16_t port_id OVS_UNUSED, > + uint16_t qidx, struct rte_mbuf **pkts, uint16_t nb_pkts, > + void *user_params) > +{ > + return netdev_custom_mirror_offload_cb(qidx, pkts, nb_pkts, user_params); > +} > + > +static inline uint16_t > +netdev_rx_flow_mirror_offload_cb(uint16_t port_id OVS_UNUSED, > + uint16_t qidx, struct rte_mbuf **pkts, uint16_t nb_pkts, > + uint16_t maxi_pkts OVS_UNUSED, void *user_params) > +{ > + return netdev_flow_mirror_offload_cb(qidx, pkts, nb_pkts, user_params, > 0); > +} > + > +static inline uint16_t > +netdev_tx_flow_mirror_offload_cb(uint16_t port_id OVS_UNUSED, > + uint16_t qidx, struct rte_mbuf **pkts, uint16_t nb_pkts, > + void *user_params) > +{ > + return netdev_flow_mirror_offload_cb(qidx, pkts, nb_pkts, user_params, > 6); > +} > + > +static inline uint16_t > +netdev_rx_port_mirror_offload_cb(uint16_t port_id OVS_UNUSED, > + uint16_t qidx, struct rte_mbuf **pkts, uint16_t nb_pkts, > + uint16_t max_pkts OVS_UNUSED, void *user_params) > +{ > + return netdev_port_mirror_offload_cb(qidx, pkts, nb_pkts, user_params); > +} > + > +static inline uint16_t > +netdev_tx_port_mirror_offload_cb(uint16_t port_id OVS_UNUSED, > + uint16_t qidx, struct rte_mbuf 
**pkts, uint16_t nb_pkts, > + void *user_params) > +{ > + return netdev_port_mirror_offload_cb(qidx, pkts, nb_pkts, user_params); > +} > + > +static rte_rx_callback_fn > +netdev_mirror_rx_cb(rte_mirror_type mirror_type) > +{ > + switch (mirror_type) { > + case mirror_port: > + return netdev_rx_port_mirror_offload_cb; > + case mirror_flow_mac: > + return netdev_rx_flow_mirror_offload_cb; > + case mirror_flow_custom: > + return netdev_rx_custom_mirror_offload_cb; > + case mirror_invalid: > + return NULL; > + } > + VLOG_ERR("Un-supported mirror type\n"); > + return NULL; > +} > + > +static rte_tx_callback_fn > +netdev_mirror_tx_cb(rte_mirror_type mirror_type) > +{ > + switch (mirror_type) { > + case mirror_port: > + return netdev_tx_port_mirror_offload_cb; > + case mirror_flow_mac: > + return netdev_tx_flow_mirror_offload_cb; > + break; > + case mirror_flow_custom: > + return netdev_tx_custom_mirror_offload_cb; > + case mirror_invalid: > + return NULL; > + } > + VLOG_ERR("Un-supported mirror type\n"); > + return NULL; > +} > + > +void > +netdev_mirror_cb_set(struct mirror_param *data, uint16_t port_id, > + int pmd_cb, int tx) > +{ > + unsigned int qid; > + > + data->pkt_buf = NULL; > + if (data->extra_data_size) { > + data->pkt_buf = xmalloc(sizeof(mirror_fn_cb)*data->max_burst_size * > + data->n_src_queue); > + } > + > + data->mirror_cb = xmalloc(sizeof(struct rte_eth_rxtx_callback *) > + * data->n_src_queue); > + for (qid = 0; qid < data->n_src_queue; qid++) { > + if (pmd_cb) { > + if (tx) { > + data->mirror_cb[qid].pmd = rte_eth_add_tx_callback(port_id, > + qid, netdev_mirror_tx_cb(data->mirror_type), data); > + } else { > + data->mirror_cb[qid].pmd = rte_eth_add_rx_callback(port_id, > + qid, netdev_mirror_rx_cb(data->mirror_type), data); > + } > + } else { > + struct rte_eth_rxtx_callback *rxtx_cb = > + xmalloc(sizeof(struct rte_eth_rxtx_callback)); > + > + data->mirror_cb[qid].direct = rxtx_cb; > + rxtx_cb->next = NULL; > + rxtx_cb->param = data; > + > + if (tx) 
{ > + rxtx_cb->fn.tx = netdev_mirror_tx_cb(data->mirror_type); > + } else { > + rxtx_cb->fn.rx = netdev_mirror_rx_cb(data->mirror_type); > + } > + } > + } > +} > + > +/* port/flow mirroring device (port) register/un-registe routines */ > +int > +netdev_eth_register_mirror(uint16_t src_port, struct mirror_param *param, > + int tx_cb) > +{ > + struct mirror_offload_port *port_info = NULL; > + struct mirror_param *data; > + > + netdev_mirror_data_proc(src_port, mirror_data_add, tx_cb, param, > + &port_info); > + if (!port_info) { > + return -1; > + } > + > + data = tx_cb ? &port_info->tx : &port_info->rx; > + netdev_mirror_cb_set(data, src_port, 1, tx_cb); > + > + return 0; > +} > + > +int > +netdev_eth_unregister_mirror(uint16_t src_port, int tx_cb) > +{ > + /* release both cb and pkt_buf */ > + unsigned int i; > + struct mirror_offload_port *port_info = NULL; > + struct mirror_param *data; > + > + netdev_mirror_data_proc(src_port, mirror_data_find, tx_cb, NULL, > + &port_info); > + if (port_info == NULL) { > + VLOG_ERR("Source port %d is not on outstanding port mirror db\n", > + src_port); > + return -1; > + } > + data = tx_cb ? 
&port_info->tx : &port_info->rx; > + > + for (i = 0; i < data->n_src_queue; i++) { > + if (data->mirror_cb[i].pmd) { > + if (tx_cb) { > + rte_eth_remove_tx_callback(src_port, i, > + data->mirror_cb[i].pmd); > + } else { > + rte_eth_remove_rx_callback(src_port, i, > + data->mirror_cb[i].pmd); > + } > + } > + data->mirror_cb[i].pmd = NULL; > + } > + free(data->mirror_cb); > + > + if (data->pkt_buf) { > + free(data->pkt_buf); > + data->pkt_buf = NULL; > + } > + > + if (data->extra_data) { > + free(data->extra_data); > + data->extra_data = NULL; > + data->extra_data_size = 0; > + } > + > + netdev_mirror_data_proc(src_port, mirror_data_rem, tx_cb, NULL, NULL); > + return 0; > +} > diff --git a/lib/netdev-dpdk-mirror.h b/lib/netdev-dpdk-mirror.h > new file mode 100644 > index 000000000..ee4b933ba > --- /dev/null > +++ b/lib/netdev-dpdk-mirror.h > @@ -0,0 +1,83 @@ > +/* > + * Copyright (c) 2014, 2015, 2016 Nicira, Inc. > + * > + * Licensed under the Apache License, Version 2.0 (the "License"); > + * you may not use this file except in compliance with the License. > + * You may obtain a copy of the License at: > + * > + * http://www.apache.org/licenses/LICENSE-2.0 > + * > + * Unless required by applicable law or agreed to in writing, software > + * distributed under the License is distributed on an "AS IS" BASIS, > + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. > + * See the License for the specific language governing permissions and > + * limitations under the License. 
> + */ > + > +#ifndef NETDEV_DPDK_MIRROR_H > +#define NETDEV_DPDK_MIRROR_H > + > +#include "openvswitch/types.h" > + > +#ifdef __cplusplus > +extern "C" { > +#endif > + > +typedef enum { > + mirror_data_find, /* find the mirror-data allocated */ > + mirror_data_add, /* add a new mirror_param data int DB */ > + mirror_data_rem, /* remove a mirror_param from the DB */ > +} mirror_data_op; > + > +typedef int (*rte_mirror_scan_fn)(struct rte_mbuf *pkt, void *user_param); > +typedef enum { > + mirror_port, /* port mirror */ > + mirror_flow_mac, /* flow mirror according to source mac */ > + mirror_flow_custom, /* flow mirror according to a callback scn */ > + mirror_invalid, /* invalid mirror_type */ > +} rte_mirror_type; > + > +typedef union { > + const struct rte_eth_rxtx_callback *pmd; > + struct rte_eth_rxtx_callback *direct; > +} mirror_fn_cb; > + > +struct mirror_param { > + uint16_t dst_port_id; > + uint16_t dst_vlan_id; > + rte_spinlock_t *locks; > + int n_src_queue; > + int n_dst_queue; > + struct rte_mbuf **pkt_buf; > + mirror_fn_cb *mirror_cb; > + unsigned int max_burst_size; > + rte_mirror_scan_fn custom_scan; > + rte_mirror_type mirror_type; > + unsigned int extra_data_size; > + void *extra_data; /* extra mirror parameter */ > +}; > + > +struct mirror_offload_port { > + uint32_t dev_id; > + struct mirror_param rx; > + struct mirror_param tx; > +}; > + > +bool netdev_port_started(uint16_t port_id, uint32_t *num_tx_queue); > +int netdev_get_portid_from_addr(const char *pci_addr_str, uint16_t *port_id); > +int netdev_tunnel_port_setup(uint16_t portid, uint32_t *num_queue); > + > +void netdev_mirror_data_proc(uint32_t dev_id, mirror_data_op op, > + int tx, struct mirror_param *in_param, > + struct mirror_offload_port **out_param); > +void netdev_mirror_cb_set(struct mirror_param *data, uint16_t port_id, > + int pmd, int tx); > +int netdev_eth_register_mirror(uint16_t src_port, > + struct mirror_param *param, int tx_cb); > +int 
netdev_eth_unregister_mirror(uint16_t src_port, int tx_cb); > + > +#ifdef __cplusplus > +} > +#endif > + > +#endif /* netdev-dpdk-mirror.h */ > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c > index 9d8096668..eb6644333 100644 > --- a/lib/netdev-dpdk.c > +++ b/lib/netdev-dpdk.c > @@ -48,6 +48,7 @@ > #include "fatal-signal.h" > #include "if-notifier.h" > #include "netdev-provider.h" > +#include "netdev-dpdk-mirror.h" > #include "netdev-vport.h" > #include "odp-util.h" > #include "openvswitch/dynamic-string.h" > @@ -171,6 +172,16 @@ static const struct rte_eth_conf port_conf = { > }, > }; > > +struct mirror_tunnel_port_info { > + uint16_t port_id; > + rte_spinlock_t *locks; > + uint32_t share_count; > + uint32_t num_queue; > + bool port_started; > + struct mirror_tunnel_port_info *next; > +}; > +static struct mirror_tunnel_port_info *mirror_tunnel_head = NULL; > + > /* > * These callbacks allow virtio-net devices to be added to vhost ports when > * configuration has been fully completed. 
> @@ -443,6 +454,8 @@ struct netdev_dpdk { > }; > struct dpdk_tx_queue *tx_q; > struct rte_eth_link link; > + mirror_fn_cb *rx_cb; /* shared pointer */ > + mirror_fn_cb *tx_cb; > ); > > PADDED_MEMBERS_CACHELINE_MARKER(CACHE_LINE_SIZE, cacheline1, > @@ -2417,6 +2430,13 @@ netdev_dpdk_vhost_rxq_recv(struct netdev_rxq *rxq, > nb_rx = rte_vhost_dequeue_burst(vid, qid, dev->dpdk_mp->mp, > (struct rte_mbuf **) batch->packets, > NETDEV_MAX_BURST); > + > + if (dev->rx_cb && dev->rx_cb[qid].direct->fn.rx) { > + dev->rx_cb[qid].direct->fn.rx((uint16_t) vid, qid, > + (struct rte_mbuf **) batch->packets, nb_rx, > + NETDEV_MAX_BURST, dev->rx_cb[qid].direct->param); > + } > + > if (!nb_rx) { > return EAGAIN; > } > @@ -2634,6 +2654,10 @@ __netdev_dpdk_vhost_send(struct netdev *netdev, int > qid, > int vhost_qid = qid * VIRTIO_QNUM + VIRTIO_RXQ; > unsigned int tx_pkts; > > + if (dev->tx_cb && dev->tx_cb[qid].direct->fn.tx) { > + dev->tx_cb[qid].direct->fn.tx((uint16_t) vid, qid, cur_pkts, cnt, > + dev->tx_cb[qid].direct->param); > + } > tx_pkts = rte_vhost_enqueue_burst(vid, vhost_qid, cur_pkts, cnt); > if (OVS_LIKELY(tx_pkts)) { > /* Packets have been sent.*/ > @@ -5291,6 +5315,376 @@ netdev_dpdk_rte_flow_query_count(struct netdev > *netdev, > return ret; > } > > +/* > + * mirror tunnel device management routines > + * mirror tunnel devices are devices reserved solely for > + * traffic mirroring > + */ > +static void > +netdev_dpdk_update_mt_list(struct mirror_tunnel_port_info *mt_port_info, > + bool add_port) > +{ > + struct mirror_tunnel_port_info *ptr = mirror_tunnel_head; > + > + if (add_port) { > + if (!ptr) { > + mirror_tunnel_head = mt_port_info; > + return; > + } > + while (ptr->next) { > + ptr = ptr->next; > + } > + ptr->next = mt_port_info; > + } else { > + while (ptr->next && > + ptr->next->port_id != mt_port_info->port_id) { > + ptr = ptr->next; > + } > + > + if (ptr->next) { > + ptr->next = ptr->next->next; > + free(mt_port_info); > + } else { > + if (ptr->port_id == 
mt_port_info->port_id) { > + mirror_tunnel_head = NULL; > + free(mt_port_info); > + } else { > + VLOG_ERR("Fail to find %s mirror port (%d) info\n", > + add_port?"add":"remove", mt_port_info->port_id); > + } > + } > + } > +} > + > +static struct mirror_tunnel_port_info* > +netdev_dpdk_get_mt_port_info(uint16_t port_id) > +{ > + struct mirror_tunnel_port_info *mt_port_info; > + > + if (mirror_tunnel_head) { > + mt_port_info = mirror_tunnel_head; > + while (mt_port_info) { > + if (mt_port_info->port_id == port_id) { > + return mt_port_info; > + } > + mt_port_info = mt_port_info->next; > + } > + VLOG_ERR("Could not tunnel port with port-id %d\n", > + port_id); > + } > + > + mt_port_info = xmalloc(sizeof(struct mirror_tunnel_port_info)); > + memset(mt_port_info, 0, sizeof(*mt_port_info)); > + mt_port_info->port_id = port_id; > + mt_port_info->next = NULL; > + > + return mt_port_info; > +} > + > +static int > +netdev_dpdk_addr_to_portid(const char *pci_addr_str, uint16_t *port_id) > +{ > + struct rte_pci_device *pci_dev; > + struct rte_pci_addr pci_addr; > + int i; > + > + if (rte_pci_addr_parse(pci_addr_str, &pci_addr)) { > + VLOG_ERR("Incorrect pci address %s\n", pci_addr_str); > + return -1; > + } > + > + for (i = 0; i < RTE_MAX_ETHPORTS; i++) { > + struct rte_pci_addr *eth_pci_addr; > + > + if (!rte_eth_devices[i].device) { > + continue; > + } > + > + pci_dev = RTE_ETH_DEV_TO_PCI(&rte_eth_devices[i]); > + if (!pci_dev) { > + continue; > + } > + > + eth_pci_addr = &pci_dev->addr; > + > + if (pci_addr.bus == eth_pci_addr->bus && > + pci_addr.devid == eth_pci_addr->devid && > + pci_addr.domain == eth_pci_addr->domain && > + pci_addr.function == eth_pci_addr->function) { > + *port_id = i; > + > + return 0; > + } > + } > + > + return -1; > +} > + > +static int > +netdev_dpdk_mt_open(uint16_t port_id, struct mirror_param *param) > +{ > + struct rte_eth_dev_info dev_info; > + struct rte_eth_txconf txq_conf; > + struct rte_eth_rxconf rxq_conf; > + struct rte_mempool 
*pktbuf; > + > + struct mirror_tunnel_port_info *mt_info; > + > + uint16_t nb_rxd = NIC_PORT_DEFAULT_RXQ_SIZE; > + uint16_t nb_txd = NIC_PORT_DEFAULT_TXQ_SIZE; > + unsigned int i, num_queue; > + > + struct rte_eth_conf mt_port_conf = { > + .rxmode = { > + .split_hdr_size = 0, > + }, > + .txmode = { > + .mq_mode = ETH_MQ_TX_NONE, > + }, > + }; > + > + mt_info = netdev_dpdk_get_mt_port_info(port_id); > + if (!mt_info) { > + return -1; > + } > + > + if (mt_info->port_started) { > + param->n_dst_queue = mt_info->num_queue; > + param->dst_port_id = port_id; > + param->locks = mt_info->locks; > + mt_info->share_count++; > + > + return 0; > + } > + > + rte_eth_dev_info_get(port_id, &dev_info); > + num_queue = param->n_src_queue; > + > + /* A tunnel device doesn't require mbuf. It's used as > + * hardware channel, transmit packets with > + * mbuf provided by source. Need this mbuf creation > + * to finish port initialization > + */ > + pktbuf = rte_pktmbuf_pool_create( > + "tunnel-port", > + (dev_info.rx_desc_lim.nb_max + dev_info.tx_desc_lim.nb_max), > + RTE_MEMPOOL_CACHE_MAX_SIZE, 0, RTE_MBUF_DEFAULT_BUF_SIZE, > + rte_eth_dev_socket_id(port_id)); > + > + mt_port_conf.txmode.offloads |= DEV_TX_OFFLOAD_VLAN_INSERT; > + if (dev_info.tx_offload_capa & DEV_TX_OFFLOAD_MBUF_FAST_FREE) { > + mt_port_conf.txmode.offloads |= DEV_TX_OFFLOAD_MBUF_FAST_FREE; > + } > + rte_eth_dev_configure(port_id, 1, num_queue, &mt_port_conf); > + > + /* init one Rx queue */ > + rxq_conf = dev_info.default_rxconf; > + rxq_conf.offloads = mt_port_conf.rxmode.offloads; > + if (rte_eth_rx_queue_setup(port_id, 0, nb_rxd, > + rte_eth_dev_socket_id(port_id), &rxq_conf, pktbuf) < 0) > + VLOG_ERR("fail to setup tunnel port (%d) rx-queue\n", port_id); > + > + /* init # of Tx queue as part of mirror-tunnel setup */ > + txq_conf = dev_info.default_txconf; > + txq_conf.offloads |= mt_port_conf.txmode.offloads; > + for (i = 0; i < num_queue; i++) { > + if (rte_eth_tx_queue_setup(port_id, > + i, nb_txd, > + 
rte_eth_dev_socket_id(port_id), > + &txq_conf) < 0) { > + VLOG_ERR("fail to setup tunnel port (%d) tx queue #%u\n", > + port_id, i); > + return -1; > + } > + } > + > + if (rte_eth_dev_start(port_id) < 0) { > + VLOG_ERR("fail to start tunnel port %d\n", port_id); > + return -1; > + } > + > + mt_info->locks = xmalloc(num_queue * sizeof(rte_spinlock_t)); > + if (mt_info->locks) { > + for (i = 0; i < mt_info->num_queue; i++) { > + rte_spinlock_init(&mt_info->locks[i]); > + } > + } else { > + return -1; > + } > + mt_info->share_count = 1; > + mt_info->port_started = true; > + mt_info->num_queue = num_queue; > + > + param->n_dst_queue = mt_info->num_queue; > + param->dst_port_id = port_id; > + param->locks = mt_info->locks; > + > + netdev_dpdk_update_mt_list(mt_info, true); > + return 0; > +} > + > +static void > +netdev_dpdk_mt_close(uint16_t mirror_port_id) > +{ > + struct mirror_tunnel_port_info *mt_port_info = > + netdev_dpdk_get_mt_port_info(mirror_port_id); > + > + if (mt_port_info) { > + mt_port_info->share_count--; > + if (!mt_port_info->share_count) { > + netdev_dpdk_update_mt_list(mt_port_info, false); > + rte_eth_dev_stop(mirror_port_id); > + rte_eth_dev_close(mirror_port_id); > + } > + } > +} > + > +/* vhost device mirror registration and un-registration routines */ > +static int > +netdev_vhost_register_mirror(struct netdev_dpdk *dev, > + struct mirror_param *param, int tx_cb) > +{ > + uint32_t vid = netdev_dpdk_get_vid(dev); > + struct mirror_offload_port *port_info = NULL; > + struct mirror_param *data; > + > + netdev_mirror_data_proc(vid, mirror_data_add, tx_cb, param, &port_info); > + if (!port_info) { > + return -1; > + } > + > + data = tx_cb ? 
&port_info->tx : &port_info->rx; > + netdev_mirror_cb_set(data, (uint16_t) vid, 0, tx_cb); > + > + if (tx_cb) { > + dev->tx_cb = data->mirror_cb; > + } else { > + dev->rx_cb = data->mirror_cb; > + } > + > + return 0; > +} > + > +static int > +netdev_vhost_unregister_mirror(struct netdev_dpdk *dev, int tx_cb) > +{ > + /* release both cb and pkt_buf */ > + unsigned int i; > + uint32_t vid = netdev_dpdk_get_vid(dev); > + struct mirror_offload_port *port_info = NULL; > + struct mirror_param *data; > + > + netdev_mirror_data_proc(vid, mirror_data_find, tx_cb, NULL, &port_info); > + if (port_info == NULL) { > + VLOG_ERR("Source port %d is not on outstanding port mirror db\n", > vid); > + return -1; > + } > + data = tx_cb ? &port_info->tx : &port_info->rx; > + > + if (tx_cb) { > + dev->tx_cb = NULL; > + } else { > + dev->rx_cb = NULL; > + } > + > + for (i = 0; i < data->n_src_queue; i++) { > + free(data->mirror_cb[i].direct); > + } > + > + free(data->mirror_cb); > + > + if (data->pkt_buf) { > + free(data->pkt_buf); > + data->pkt_buf = NULL; > + } > + > + if (data->extra_data) { > + free(data->extra_data); > + data->extra_data = NULL; > + data->extra_data_size = 0; > + } > + > + netdev_mirror_data_proc(vid, mirror_data_rem, tx_cb, NULL, NULL); > + return 0; > +} > + > +static int > +netdev_dpdk_mirror_offload(struct netdev *src, struct eth_addr *flow_addr, > + uint16_t vlan_id, char *mirror_tunnel_addr, > + bool add_mirror, bool tx_cb) { > + struct netdev_dpdk *src_dev = netdev_dpdk_cast(src); > + bool eth_dev = src_dev->type == DPDK_DEV_ETH; > + uint16_t mirror_port_id; > + int status = 0; > + > + if (netdev_dpdk_addr_to_portid(mirror_tunnel_addr, &mirror_port_id)) { > + VLOG_ERR("Could not find tunnel port with BDF addr %s\n", > + mirror_tunnel_addr); > + return -1; > + } > + if (add_mirror) { > + uint32_t i; > + struct mirror_param data; > + uint64_t mac_addr = 0; > + > + memset(&data, 0, sizeof(struct mirror_param)); > + data.extra_data_size = 0; > + data.extra_data = 
NULL; > + data.mirror_type = mirror_port; > + for (i = 0; i < 6; i++) { > + mac_addr <<= 8; > + mac_addr |= flow_addr->ea[6 - i - 1]; > + } > + if (mac_addr) { > + data.mirror_type = mirror_flow_mac; > + data.extra_data_size = sizeof(uint64_t); > + data.extra_data = xmalloc(sizeof(uint64_t)); > + memcpy(data.extra_data, &mac_addr, sizeof(uint64_t)); > + } > + data.dst_vlan_id = vlan_id; > + data.n_src_queue = tx_cb?src->n_txq:src->n_rxq; > + data.max_burst_size = NETDEV_MAX_BURST; > + > + if (netdev_dpdk_mt_open(mirror_port_id, &data)) { > + VLOG_ERR("Fail to initialize mirror tunnel port %d\n", > + mirror_port_id); > + return -1; > + } > + > + VLOG_INFO("register %s device with %s mirror-offload with" > + "src-port:%d (%s) and output-port:%d (%s) vlan-id=%d flow-mac=" > + "0x%" PRIx64 "\n", > + eth_dev?"ethdev":"vhost", > + tx_cb?"ingress":"egress", src_dev->port_id, > + src->name, mirror_port_id, mirror_tunnel_addr, vlan_id, > + (uint64_t)__builtin_bswap64(mac_addr)); > + > + if (eth_dev) { > + status = netdev_eth_register_mirror(src_dev->port_id, &data, > + tx_cb); > + } else { > + status = netdev_vhost_register_mirror(src_dev, &data, tx_cb); > + } > + } else { > + VLOG_INFO("unregister %s device with %s mirror-offload with" > + " src-port:%d(%s)\n", > + eth_dev?"ethdev":"vhost", > + tx_cb?"ingress":"egress", src_dev->port_id, > + src->name); > + > + if (eth_dev) { > + status = netdev_eth_unregister_mirror(src_dev->port_id, tx_cb); > + } else { > + status = netdev_vhost_unregister_mirror(src_dev, tx_cb); > + } > + > + netdev_dpdk_mt_close(mirror_port_id); > + } > + > + return status; > +} > + > #define NETDEV_DPDK_CLASS_COMMON \ > .is_pmd = true, \ > .alloc = netdev_dpdk_alloc, \ > @@ -5340,6 +5734,7 @@ static const struct netdev_class dpdk_class = { > .construct = netdev_dpdk_construct, > .set_config = netdev_dpdk_set_config, > .send = netdev_dpdk_eth_send, > + .mirror_offload = netdev_dpdk_mirror_offload, > }; > > static const struct netdev_class 
dpdk_vhost_class = { > @@ -5355,6 +5750,7 @@ static const struct netdev_class dpdk_vhost_class = { > .reconfigure = netdev_dpdk_vhost_reconfigure, > .rxq_recv = netdev_dpdk_vhost_rxq_recv, > .rxq_enabled = netdev_dpdk_vhost_rxq_enabled, > + .mirror_offload = netdev_dpdk_mirror_offload, > }; > > static const struct netdev_class dpdk_vhost_client_class = { > @@ -5371,6 +5767,7 @@ static const struct netdev_class > dpdk_vhost_client_class = { > .reconfigure = netdev_dpdk_vhost_client_reconfigure, > .rxq_recv = netdev_dpdk_vhost_rxq_recv, > .rxq_enabled = netdev_dpdk_vhost_rxq_enabled, > + .mirror_offload = netdev_dpdk_mirror_offload, > }; > > void > diff --git a/lib/netdev-provider.h b/lib/netdev-provider.h > index 73dce2fca..dab278dcd 100644 > --- a/lib/netdev-provider.h > +++ b/lib/netdev-provider.h > @@ -834,6 +834,22 @@ struct netdev_class { > /* Get a block_id from the netdev. > * Returns the block_id or 0 if none exists for netdev. */ > uint32_t (*get_block_id)(struct netdev *); > + > + /* Configure a mirror offload setting on a netdev. > + * 'src': netdev traffic to be mirrored > + * 'flow_addr': the destination mac address is of source traffic for > + * inspection. > + * 'dst': netdev where mirror traffic is transmitted. > + * 'vlan_id': vlag to be added to the mirrored packets. > + * 'mt_pci_addr': mirror tunnel pcie address. > + * 'add_mirror': true: configure a mirror traffic; false: remove mirror > + * 'ingress': true: mirror 'src' netdev Rx traffic; false: mirror > + * 'src' netdev Tx traffic. 
> + */ > + int (*mirror_offload)(struct netdev *src, struct eth_addr *flow_addr, > + uint16_t vlan_id, char *mt_pci_addr, > + bool add_mirror, bool ingress); > + > }; > > int netdev_register_provider(const struct netdev_class *); > diff --git a/lib/netdev.c b/lib/netdev.c > index 91e91955c..464c2f8fe 100644 > --- a/lib/netdev.c > +++ b/lib/netdev.c > @@ -69,6 +69,8 @@ COVERAGE_DEFINE(netdev_get_stats); > COVERAGE_DEFINE(netdev_send_prepare_drops); > COVERAGE_DEFINE(netdev_push_header_drops); > > +#define MIRROR_DB_INIT_SIZE 8 > + > struct netdev_saved_flags { > struct netdev *netdev; > struct ovs_list node; /* In struct netdev's saved_flags_list. > */ > @@ -2297,3 +2299,387 @@ netdev_free_custom_stats_counters(struct > netdev_custom_stats *custom_stats) > } > } > } > + > + > +struct netdev_mirror_offload_item { > + struct mirror_offload_info info; > + > + struct ovs_list node; > +}; > + > +struct netdev_mirror_offload { > + struct ovs_mutex mutex; > + struct ovs_list list; > + pthread_cond_t cond; > +}; > + > +static struct netdev_mirror_offload netdev_mirror_offload = { > + .mutex = OVS_MUTEX_INITIALIZER, > + .list = OVS_LIST_INITIALIZER(&netdev_mirror_offload.list), > +}; > + > +static struct ovsthread_once offload_thread_once > + = OVSTHREAD_ONCE_INITIALIZER; > + > +static void *netdev_mirror_offload_main(void *data); > + > +/* > + * Re-size mirror_db when it's out of space. 
> + * Always double the buffer when it's needed > + */ > +static int > +netdev_mirror_db_resize(struct netdev_mirror_offload_item ***old_db, > + int *old_db_size) > +{ > + struct netdev_mirror_offload_item **new_db; > + int cur_size = *old_db_size; > + int new_size; > + > + if (!cur_size) { > + new_size = MIRROR_DB_INIT_SIZE; > + } else { > + new_size = 2 * cur_size; > + } > + > + new_db = xzalloc(sizeof(struct netdev_mirror_offload_item *) * new_size); > + > + if (!new_db) { > + VLOG_ERR("Out of memory!!!"); > + return -1; > + } > + memset(new_db, 0, sizeof(struct netdev_mirror_offload_item *) * > new_size); > + > + if (cur_size) { > + int i; > + > + for (i = 0; i < cur_size; i++) { > + new_db[i] = (*old_db)[i]; > + } > + free(*old_db); > + } > + > + *old_db = new_db; > + *old_db_size = new_size; > + > + return 0; > +} > + > +static void > +netdev_free_mirror_offload(struct netdev_mirror_offload_item *offload) > +{ > + if (!offload) { > + return; > + } > + > + if (offload->info.src) { > + free(offload->info.src); > + } > + if (offload->info.dst) { > + free(offload->info.dst); > + } > + if (offload->info.flow_dst_mac) { > + free(offload->info.flow_dst_mac); > + } > + if (offload->info.flow_src_mac) { > + free(offload->info.flow_src_mac); > + } > + if (offload->info.output_src_tags) { > + free(offload->info.output_src_tags); > + } > + if (offload->info.output_dst_tags) { > + free(offload->info.output_dst_tags); > + } > + if (offload->info.name) { > + free(offload->info.name); > + } > + if (offload->info.mirror_tunnel_addr) { > + free(offload->info.mirror_tunnel_addr); > + } > + > + free(offload); > +} > + > +static struct > +netdev_mirror_offload_item * > +netdev_alloc_mirror_offload(struct mirror_offload_info *info) > +{ > + struct netdev_mirror_offload_item *offload; > + int i; > + > + offload = xzalloc(sizeof(*offload)); > + memcpy(&offload->info, info, sizeof(struct mirror_offload_info)); > + > + if (info->name) { > + offload->info.name = 
xzalloc(strlen(info->name) + 1); > + if (offload->info.name) { > + ovs_strzcpy(offload->info.name, info->name, strlen(info->name)); > + } > + } > + > + if (info->mirror_tunnel_addr) { > + offload->info.mirror_tunnel_addr = > + xzalloc(strlen(info->mirror_tunnel_addr) + 1); > + if (offload->info.mirror_tunnel_addr) { > + ovs_strzcpy(offload->info.mirror_tunnel_addr, > + info->mirror_tunnel_addr, > + strlen(info->mirror_tunnel_addr)); > + } > + } > + > + /* only add_mirror request include valid configuration */ > + if (info->n_src_port) { > + offload->info.src = xzalloc(sizeof(struct netdev > *)*info->n_src_port); > + offload->info.flow_dst_mac = xzalloc(sizeof(struct eth_addr)* > + info->n_src_port); > + offload->info.output_src_tags = xzalloc(sizeof(uint16_t)* > + info->n_src_port); > + if (!offload->info.src || !offload->info.flow_dst_mac || > + !offload->info.output_src_tags) { > + VLOG_ERR("Out of memory!!!"); > + netdev_free_mirror_offload(offload); > + return NULL; > + } > + > + for (i = 0; i < info->n_src_port; i++) { > + offload->info.src[i] = info->src[i]; > + offload->info.output_src_tags[i] = info->output_src_tags[i]; > + memcpy(&offload->info.flow_dst_mac[i], &info->flow_dst_mac[i], > + sizeof(struct eth_addr)); > + } > + } > + > + if (info->n_dst_port) { > + offload->info.dst = xzalloc(sizeof(struct netdev > *)*info->n_dst_port); > + offload->info.flow_src_mac = xzalloc(sizeof(struct eth_addr)* > + info->n_dst_port); > + offload->info.output_dst_tags = xzalloc(sizeof(uint16_t)* > + info->n_dst_port); > + if (!offload->info.dst || !offload->info.flow_src_mac || > + !offload->info.output_dst_tags) { > + VLOG_ERR("Out of memory!!!"); > + netdev_free_mirror_offload(offload); > + return NULL; > + } > + > + for (i = 0; i < info->n_dst_port; i++) { > + offload->info.dst[i] = info->dst[i]; > + offload->info.output_dst_tags[i] = info->output_dst_tags[i]; > + memcpy(&offload->info.flow_src_mac[i], &info->flow_src_mac[i], > + sizeof(struct eth_addr)); > + } > + } 
> + > + return offload; > +} > + > +static void > +netdev_append_mirror_offload(struct netdev_mirror_offload_item *offload) > +{ > + ovs_mutex_lock(&netdev_mirror_offload.mutex); > + ovs_list_push_back(&netdev_mirror_offload.list, &offload->node); > + xpthread_cond_signal(&netdev_mirror_offload.cond); > + ovs_mutex_unlock(&netdev_mirror_offload.mutex); > +} > + > +void > +netdev_mirror_offload_put(struct mirror_offload_info *info) > +{ > + struct netdev_mirror_offload_item *offload; > + /* only support tunnel port for traffic mirroring */ > + if (info->add_mirror && !info->mirror_tunnel_addr) { > + return; > + } > + > + if (ovsthread_once_start(&offload_thread_once)) { > + xpthread_cond_init(&netdev_mirror_offload.cond, NULL); > + ovs_thread_create("netdev_mirror_offload", > + netdev_mirror_offload_main, NULL); > + ovsthread_once_done(&offload_thread_once); > + } > + > + offload = netdev_alloc_mirror_offload(info); > + netdev_append_mirror_offload(offload); > +} > + > +static int > +netdev_mirror_offload_configue(struct mirror_offload_info *info, > + bool add_mirror) > +{ > + int un_support_count = 0; > + int ret; > + > + if (info->n_src_port) { > + for (int i = 0; i < info->n_src_port; i++) { > + const struct netdev_class *class = > + info->src[i]->netdev_class; > + if (!class) { > + return -1; > + } > + if (class->mirror_offload) { > + ret = class->mirror_offload( > + info->src[i], > + &info->flow_dst_mac[i], > + info->output_src_tags[i], > + info->mirror_tunnel_addr, > + add_mirror, false); > + if (ret) { > + VLOG_ERR("Fail to %s mirror-offload" > + " configuration %s\n", > + add_mirror ? 
"add" : "remove", > + info->name); > + return ret; > + } > + } else { > + un_support_count++; > + } > + } > + } > + > + if (info->n_dst_port) { > + for (int i = 0; i < info->n_dst_port; i++) { > + const struct netdev_class *class = > + info->dst[i]->netdev_class; > + if (!class) { > + return -1; > + } > + if (class->mirror_offload) { > + ret = class->mirror_offload( > + info->dst[i], > + &info->flow_src_mac[i], > + info->output_dst_tags[i], > + info->mirror_tunnel_addr, > + add_mirror, true); > + if (ret) { > + VLOG_ERR("Fail to %s mirror-offload" > + " configuration %s\n", > + add_mirror ? "add" : "remove", > + info->name); > + return ret; > + } > + } else { > + un_support_count++; > + } > + } > + } > + > + return un_support_count; > +} > + > +static void * > +netdev_mirror_offload_main(void *data OVS_UNUSED) > +{ > + struct netdev_mirror_offload_item *offload; > + struct mirror_offload_info *info; > + struct ovs_list *list; > + struct netdev_mirror_offload_item **offload_db = NULL; > + int offload_used_count = 0; > + int offload_db_size = 0; > + int ret, i, ind; > + > + /* continue polling to check if there is an outstanding request */ > + for (;;) { > + ovs_mutex_lock(&netdev_mirror_offload.mutex); > + if (ovs_list_is_empty(&netdev_mirror_offload.list)) { > + ovsrcu_quiesce_start(); > + ovs_mutex_cond_wait(&netdev_mirror_offload.cond, > + &netdev_mirror_offload.mutex); > + ovsrcu_quiesce_end(); > + } > + list = ovs_list_pop_front(&netdev_mirror_offload.list); > + offload = CONTAINER_OF(list, struct netdev_mirror_offload_item, > + node); > + ovs_mutex_unlock(&netdev_mirror_offload.mutex); > + > + if (!offload_db_size && > + netdev_mirror_db_resize(&offload_db, &offload_db_size)){ > + return NULL; > + } > + > + ind = offload_db_size; > + for (i = 0; i < offload_db_size; i++) { > + if (offload_db[i] && > + !strncmp(offload_db[i]->info.name, offload->info.name, > + strlen(offload->info.name) + 1)) { > + ind = i; > + break; > + } > + } > + > + if 
(!offload->info.add_mirror) { > + /* remove mirror offload setup */ > + if (ind == offload_db_size) { > + VLOG_WARN("Mirror offload remove configuration, %s, " > + "not found; clear mirror offload operation" > + " aborted\n", offload->info.name); > + continue; > + } > + } else { > + /* add mirror offload */ > + if (ind < offload_db_size) { > + netdev_free_mirror_offload(offload); > + VLOG_WARN("Attempt adding an existing mirror-offload " > + "configuration; request aborted\n"); > + continue; > + } > + > + if (offload_used_count == offload_db_size && > + netdev_mirror_db_resize(&offload_db, &offload_db_size)) { > + return NULL; > + } > + } > + > + info = offload->info.add_mirror ? &offload->info : > + &offload_db[ind]->info; > + ret = netdev_mirror_offload_configue(info, offload->info.add_mirror); > + > + if (ret) { > + VLOG_ERR("%s mirror configuration fails due to %s\n", > + offload->info.add_mirror ? "Add" : "Remove", > + ret > 0 ? "unsupport source traffic type" : > + "device is not ready"); > + netdev_free_mirror_offload(offload); > + continue; > + } else { > + VLOG_INFO("Succeed %s mirror-offload configuration: %s", > + offload->info.add_mirror ? 
"adding" : "removing", > + offload->info.name); > + } > + > + if (offload->info.add_mirror) { > + for (i = 0; i < offload_db_size; i++) { > + if (offload_db[i] == NULL) { > + offload_db[i] = offload; > + offload_used_count++; > + break; > + } > + } > + } else { > + /* remove the prior "add" request */ > + netdev_free_mirror_offload(offload_db[ind]); > + offload_db[ind] = NULL; > + > + /* remove the current("remove") request */ > + netdev_free_mirror_offload(offload); > + offload_used_count--; > + } > + > + /* free db when the used count drop to 0 */ > + if (!offload_used_count) { > + free(offload_db); > + offload_db = NULL; > + offload_db_size = 0; > + } > + } > + > + /* clean up memory */ > + for (i = 0; i < offload_db_size; i++) { > + if (offload_db[i]) { > + netdev_free_mirror_offload(offload_db[i]); > + } > + } > + if (offload_db) { > + free(offload_db); > + } > + > + return NULL; > +} > diff --git a/lib/netdev.h b/lib/netdev.h > index b705a9e56..cce042fc7 100644 > --- a/lib/netdev.h > +++ b/lib/netdev.h > @@ -201,6 +201,22 @@ int netdev_send(struct netdev *, int qid, struct > dp_packet_batch *, > bool concurrent_txq); > void netdev_send_wait(struct netdev *, int qid); > > +/* Hardware assisted mirror offloading*/ > +struct mirror_offload_info { > + struct netdev **src; > + struct netdev **dst; > + int n_src_port; > + int n_dst_port; > + struct eth_addr *flow_src_mac; > + struct eth_addr *flow_dst_mac; > + uint16_t *output_src_tags; > + uint16_t *output_dst_tags; > + bool add_mirror; > + char *mirror_tunnel_addr; > + char *name; > +}; > +void netdev_mirror_offload_put(struct mirror_offload_info *); > + > /* native tunnel APIs */ > /* Structure to pass parameters required to build a tunnel header. 
*/ > struct netdev_tnl_build_header_params { > diff --git a/tests/ovs-vsctl.at b/tests/ovs-vsctl.at > index dccb11741..ff6e9e625 100644 > --- a/tests/ovs-vsctl.at > +++ b/tests/ovs-vsctl.at > @@ -1364,7 +1364,9 @@ _uuid : <1> > name : eth1 > _uuid : <2> > name : mymirror > +output_dst_vlan : [] > output_port : <1> > +output_src_vlan : [] > output_vlan : [] > select_all : false > select_dst_port : [<0>] > diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c > index 5ed7e8234..7b7603513 100644 > --- a/vswitchd/bridge.c > +++ b/vswitchd/bridge.c > @@ -38,6 +38,7 @@ > #include "mac-learning.h" > #include "mcast-snooping.h" > #include "netdev.h" > +#include "netdev-provider.h" > #include "netdev-offload.h" > #include "nx-match.h" > #include "ofproto/bond.h" > @@ -330,6 +331,9 @@ static void mirror_destroy(struct mirror *); > static bool mirror_configure(struct mirror *); > static void mirror_refresh_stats(struct mirror *); > > +static void mirror_offload_destroy(struct mirror *); > +static bool mirror_offload_configure(struct mirror *); > + > static void iface_configure_lacp(struct iface *, > struct lacp_member_settings *); > static bool iface_create(struct bridge *, const struct ovsrec_interface *, > @@ -423,6 +427,35 @@ if_notifier_changed(struct if_notifier *notifier > OVS_UNUSED) > seq_wait(ifaces_changed, last_ifaces_changed); > return changed; > } > + > +static struct port * > +port_lookup_all(const char *port_name) > +{ > + struct bridge *br; > + struct port *port = NULL; > + int found = 0; > + > + HMAP_FOR_EACH (br, node, &all_bridges) { > + struct port *temp_port = NULL; > + temp_port = port_lookup(br, port_name); > + if (temp_port) { > + if (!port) { > + port = temp_port; > + } > + found++; > + } > + } > + > + if (found) { > + if (found > 1) { > + VLOG_INFO("More than one bridge owns port with name:%s\n", > + port_name); > + } > + return port; > + } > + return NULL; > +} > + > > /* Public functions. 
*/ > > @@ -5055,14 +5088,228 @@ mirror_create(struct bridge *br, const struct > ovsrec_mirror *cfg) > return m; > } > > +static struct netdev *get_netdev_from_port(struct mirror *m, > + struct port **port, const char *name) > +{ > + struct port *temp_port; > + struct iface *iface; > + > + *port = NULL; > + temp_port = port_lookup(m->bridge, name); > + if (temp_port) { > + LIST_FOR_EACH (iface, port_elem, &temp_port->ifaces) { > + if (iface) { > + *port = temp_port; > + return iface->netdev; > + } > + } > + } > + /* try different bridges */ > + temp_port = port_lookup_all(name); > + if (temp_port) { > + LIST_FOR_EACH (iface, port_elem, &temp_port->ifaces) { > + if (iface) { > + *port = temp_port; > + return iface->netdev; > + } > + } > + } > + return NULL; > +} > + > +static void > +release_mirror_offload_info(struct mirror_offload_info *info) > +{ > + if (info->src) { > + free(info->src); > + } > + if (info->dst) { > + free(info->dst); > + } > + if (info->flow_dst_mac) { > + free(info->flow_dst_mac); > + } > + if (info->flow_src_mac) { > + free(info->flow_src_mac); > + } > + if (info->output_src_tags) { > + free(info->output_src_tags); > + } > + if (info->output_dst_tags) { > + free(info->output_dst_tags); > + } > + if (info->name) { > + free(info->name); > + } > + if (info->mirror_tunnel_addr) { > + free(info->mirror_tunnel_addr); > + } > +} > + > +static int > +set_mirror_offload_info(struct mirror *m, struct mirror_offload_info *info) > +{ > + const struct ovsrec_mirror *cfg = m->cfg; > + struct port *port = NULL; > + int i; > + > + if (m->name) { > + info->name = xmalloc(strlen(m->name) + 1); > + ovs_strzcpy(info->name, m->name, strlen(m->name)); > + } > + > + if (cfg->mirror_tunnel_addr) { > + info->mirror_tunnel_addr = xmalloc(strlen(cfg->mirror_tunnel_addr) > + + 1); > + ovs_strzcpy(info->mirror_tunnel_addr, cfg->mirror_tunnel_addr, > + strlen(cfg->mirror_tunnel_addr)); > + } else { > + VLOG_ERR("mirror-offload configuration fails because" > + " lack of 
tunnel device\n"); > + return -1; > + } > + > + /* source port */ > + info->n_src_port = cfg->n_select_src_port; > + if (info->n_src_port) { > + info->src = xmalloc(sizeof(struct netdev *)*info->n_src_port); > + info->flow_dst_mac = xmalloc(sizeof(struct eth_addr)* > + info->n_src_port); > + if (info->n_src_port != cfg->n_output_src_vlan) { > + VLOG_ERR("src port count:%d ouput src vlan count:%lu", > + info->n_src_port, (unsigned long) cfg->n_output_src_vlan); > + return -1; > + } > + info->output_src_tags = xmalloc(sizeof(uint16_t)*info->n_src_port); > + } > + > + if (info->n_src_port) { > + /* find netdev instance for each port */ > + for (i = 0; i < info->n_src_port; i++) { > + info->src[i] = get_netdev_from_port(m, &port, > + cfg->select_src_port[i]->name); > + if (!info->src[i]) { > + VLOG_ERR("src-port: %s is not a netdev device\n", > + cfg->select_src_port[i]->name); > + return -1; > + } > + } > + memset(info->flow_dst_mac, 0, sizeof(struct eth_addr)* > + info->n_src_port); > + > + /* > + * for source port, flow is separated by > + * different dst mac addr > + */ > + if (cfg->n_flow_dst_mac) { > + int dst_count = (info->n_src_port > cfg->n_flow_dst_mac)? > + cfg->n_flow_dst_mac:info->n_src_port; > + for (i = 0; i < dst_count; i++) { > + eth_addr_from_string(cfg->flow_dst_mac[i], > + &info->flow_dst_mac[i]); > + } > + } > + > + if (cfg->n_output_src_vlan) { > + int count = (cfg->n_output_src_vlan > info->n_src_port)? 
> + info->n_src_port:cfg->n_output_src_vlan; > + for (i = 0; i < count; i++) { > + info->output_src_tags[i] = cfg->output_src_vlan[i] & 0xFFF; > + } > + } > + } > + > + /* dst ports */ > + info->n_dst_port = cfg->n_select_dst_port; > + if (info->n_dst_port) { > + info->dst = xmalloc(sizeof(struct netdev *)*info->n_dst_port); > + info->flow_src_mac = xmalloc(sizeof(struct eth_addr)* > + info->n_dst_port); > + if (info->n_dst_port != cfg->n_output_dst_vlan) { > + VLOG_ERR("dst port count:%d ouput dst vlan count:%lu\n", > + info->n_dst_port, (unsigned long) cfg->n_output_dst_vlan); > + return -1; > + } > + info->output_dst_tags = xmalloc(sizeof(uint16_t)*info->n_dst_port); > + } > + > + if (info->n_dst_port) { > + for (i = 0; i < info->n_dst_port; i++) { > + info->dst[i] = get_netdev_from_port(m, &port, > + cfg->select_dst_port[i]->name); > + if (!info->dst[i]) { > + VLOG_ERR("dst-port: %s is not a netdev device\n", > + cfg->select_dst_port[i]->name); > + return -1; > + } > + } > + memset(info->flow_src_mac, 0, sizeof(struct eth_addr)* > + info->n_dst_port); > + > + /* > + * for destination port, flow is separated by > + * different src mac addr > + */ > + if (cfg->n_flow_src_mac) { > + int src_count = (info->n_dst_port > cfg->n_flow_src_mac)? > + cfg->n_flow_src_mac:info->n_dst_port; > + for (i = 0; i < src_count; i++) { > + eth_addr_from_string(cfg->flow_src_mac[i], > + &info->flow_src_mac[i]); > + } > + } > + > + if (cfg->n_output_dst_vlan) { > + int count = (cfg->n_output_dst_vlan > info->n_dst_port)? 
> + info->n_dst_port:cfg->n_output_dst_vlan; > + for (i = 0; i < count; i++) { > + info->output_dst_tags[i] = cfg->output_dst_vlan[i] & 0xFFF; > + } > + } > + } > + > + VLOG_INFO("sucess creating mirror-offload(%s): with %d src-port" > + " streams %d dst-port streams to tunnel %s\n", > + cfg->name, info->n_src_port, info->n_dst_port, > + info->mirror_tunnel_addr?info->mirror_tunnel_addr:"none"); > + return 0; > +} > + > +static void > +mirror_offload_destroy(struct mirror *m) > +{ > + struct mirror_offload_info info; > + > + memset(&info, 0, sizeof(struct mirror_offload_info)); > + info.add_mirror = false; > + if (m->name) { > + info.name = xmalloc(strlen(m->name) + 1); > + if (info.name) { > + ovs_strzcpy(info.name, m->name, strlen(m->name)); > + } > + } > + > + netdev_mirror_offload_put(&info); > + if (info.name) { > + free(info.name); > + } > + if (info.mirror_tunnel_addr) { > + free(info.mirror_tunnel_addr); > + } > +} > + > static void > mirror_destroy(struct mirror *m) > { > if (m) { > struct bridge *br = m->bridge; > > - if (br->ofproto) { > - ofproto_mirror_unregister(br->ofproto, m); > + if (m->cfg && m->cfg->mirror_offload) { > + mirror_offload_destroy(m); > + } else { > + if (br->ofproto) { > + ofproto_mirror_unregister(br->ofproto, m); > + } > } > > hmap_remove(&br->mirrors, &m->hmap_node); > @@ -5094,12 +5341,32 @@ mirror_collect_ports(struct mirror *m, > *n_out_portsp = n_out_ports; > } > > +static bool > +mirror_offload_configure(struct mirror *m) > +{ > + struct mirror_offload_info info; > + > + memset(&info, 0, sizeof(struct mirror_offload_info)); > + info.add_mirror = true; > + if (set_mirror_offload_info(m, &info)) { > + release_mirror_offload_info(&info); > + return false; > + } > + > + netdev_mirror_offload_put(&info); > + release_mirror_offload_info(&info); > + return true; > +} > + > static bool > mirror_configure(struct mirror *m) > { > const struct ovsrec_mirror *cfg = m->cfg; > struct ofproto_mirror_settings s; > > + if 
(cfg->mirror_offload) { > + return mirror_offload_configure(m); > + } > /* Set name. */ > if (strcmp(cfg->name, m->name)) { > free(m->name); > diff --git a/vswitchd/vswitch.ovsschema b/vswitchd/vswitch.ovsschema > index 0666c8c76..4a1a34a1f 100644 > --- a/vswitchd/vswitch.ovsschema > +++ b/vswitchd/vswitch.ovsschema > @@ -1,6 +1,6 @@ > {"name": "Open_vSwitch", > - "version": "8.2.0", > - "cksum": "1076640191 26427", > + "version": "8.2.1", > + "cksum": "4051567316 27206", > "tables": { > "Open_vSwitch": { > "columns": { > @@ -418,8 +418,18 @@ > "columns": { > "name": { > "type": "string"}, > + "mirror_tunnel_addr": { > + "type": "string"}, > "select_all": { > "type": "boolean"}, > + "mirror_offload": { > + "type": "boolean"}, > + "flow_src_mac": { > + "type": {"key": {"type": "string"}, > + "min": 0, "max": "unlimited"}}, > + "flow_dst_mac": { > + "type": {"key": {"type": "string"}, > + "min": 0, "max": "unlimited"}}, > "select_src_port": { > "type": {"key": {"type": "uuid", > "refTable": "Port", > @@ -440,6 +450,16 @@ > "refTable": "Port", > "refType": "weak"}, > "min": 0, "max": 1}}, > + "output_src_vlan": { > + "type": {"key": {"type": "integer", > + "minInteger": 0, > + "maxInteger": 4294967295}, > + "min": 0, "max": 4096}}, > + "output_dst_vlan": { > + "type": {"key": {"type": "integer", > + "minInteger": 0, > + "maxInteger": 4294967295}, > + "min": 0, "max": 4096}}, > "output_vlan": { > "type": {"key": {"type": "integer", > "minInteger": 1, > diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml > index 4597a215d..fd2049a7f 100644 > --- a/vswitchd/vswitch.xml > +++ b/vswitchd/vswitch.xml > @@ -4869,11 +4869,35 @@ ovs-vsctl add-port br0 p0 -- set Interface p0 > type=patch options:peer=p1 \ > selected VLANs. > </p> > > + <column name="mirror_tunnel_addr"> > + BDF string of the tunnel device on which mirrored traffic will be > + transmitted. 
> + </column> > + > <column name="select_all"> > If true, every packet arriving or departing on any port is > selected for mirroring. > </column> > > + <column name="mirror_offload"> > + If true, a hw-assisted port mirroring is configured instead > + default mirroring. > + </column> > + > + <column name="flow_src_mac"> > + The source MAC address(es) for per-flow mirroring. Each MAC > + address is separate by ','. This parametr is paired with > + select_dst_port. A '0' MAC address indicates the requested mirror > + is a per-port mirroring, otherwise it's a per-flow mirroring > + </column> > + > + <column name="flow_dst_mac"> > + The destination MAC address(es) for per-flow mirroring. Each MAC > + address is separate by ','. This parametr is paired with > + select_src_port. A '0' MAC address indicates the requested mirror > + is a per-port mirroring, otherwise it's a per-flow mirroring > + </column> > + > <column name="select_dst_port"> > Ports on which departing packets are selected for mirroring. > </column> > @@ -4955,6 +4979,32 @@ ovs-vsctl add-port br0 p0 -- set Interface p0 > type=patch options:peer=p1 \ > </p> > </column> > > + <column name="output_src_vlan"> > + <p>Output VLAN for selected source port packets, if nonempty.</p> > + <p> > + <em>Please note:</em> This is different than > + <ref column="output-vlan"/> This vlan is used to add an additional > + vlan tag on the mirror traffic, regardless it contains vlan or not. > + The receive end could choose to filter out this additional vlan. > + This option is provided so the mirrored traffic could maintain its > + original vlan informaiton, and this mirror can be used to filter > + out un-wanted traffic such as in <ref column="mirror_offload"/>. 
> + </p> > + </column> > + > + <column name="output_dst_vlan"> > + <p>Output VLAN for selected destination port packets, if > nonempty.</p> > + <p> > + <em>Please note:</em> This is different than > + <ref column="output-vlan"/> This vlan is used to add an additional > + vlan tag on the mirror traffic, regardless it contains vlan or not. > + The receive end could choose to filter out this additional vlan. > + This option is provided so the mirrored traffic could maintain its > + original vlan informaiton, and this mirror cab be used to filter > + out un-wanted traffic such as in <ref column="mirror_offload"/>. > + </p> > + </column> > + > <column name="snaplen"> > <p>Maximum per-packet number of bytes to mirror.</p> > <p>A mirrored packet with size larger than <ref column="snaplen"/>
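
A note on the flow-MAC handling in netdev_dpdk_mirror_offload() quoted above: the loop shifts in flow_addr->ea[5] first and ea[0] last, so ea[0] ends up in the least significant byte of mac_addr, and the VLOG_INFO call byte-swaps that value before printing it. The arithmetic can be checked in isolation with the minimal sketch below. It assumes nothing from OVS or DPDK: struct mac6, pack_mac_reversed(), swap64() and mac_pack_self_check() are hypothetical stand-ins for struct eth_addr, the patch's loop and __builtin_bswap64(), and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for OVS's struct eth_addr: six bytes in wire order. */
struct mac6 {
    uint8_t ea[6];
};

/* Same arithmetic as the loop in the quoted patch: ea[5] is shifted in
 * first and ea[0] last, so ea[0] lands in the least significant byte. */
static uint64_t
pack_mac_reversed(const struct mac6 *mac)
{
    uint64_t v = 0;
    int i;

    for (i = 0; i < 6; i++) {
        v <<= 8;
        v |= mac->ea[6 - i - 1];
    }
    return v;
}

/* Portable 64-bit byte swap, standing in for __builtin_bswap64(). */
static uint64_t
swap64(uint64_t v)
{
    uint64_t r = 0;
    int i;

    for (i = 0; i < 8; i++) {
        r = (r << 8) | (v & 0xff);
        v >>= 8;
    }
    return r;
}

/* Self-check for aa:bb:cc:dd:ee:ff; returns 0 when both values match. */
static int
mac_pack_self_check(void)
{
    const struct mac6 mac = { { 0xaa, 0xbb, 0xcc, 0xdd, 0xee, 0xff } };
    uint64_t packed = pack_mac_reversed(&mac);

    assert(packed == UINT64_C(0x0000ffeeddccbbaa));
    assert(swap64(packed) == UINT64_C(0xaabbccddeeff0000));
    return 0;
}
```

If this reading is right, the flow-mac logged for aa:bb:cc:dd:ee:ff would be 0xaabbccddeeff0000, that is, the address followed by two zero bytes. The eight bytes that memcpy() places in extra_data would be in wire order only on a little-endian host. It may be worth stating in a comment which layout the callback expects.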