Hi Liang-min,

When replying inline, please do not prefix with ">>" as it is handled as
quoted text. There is no need to prefix.

On 5/18/21 8:00 PM, Wang, Liang-min wrote:
>> -----Original Message-----
>> From: Maxime Coquelin <maxime.coque...@redhat.com>
>> Sent: Tuesday, May 18, 2021 12:15 PM
>> To: Miskell, Timothy <timothy.misk...@intel.com>; d...@openvswitch.org
>> Cc: Wang, Liang-min <liang-min.w...@intel.com>
>> Subject: Re: [PATCH] Extends the existing mirror configuration parameters
>>
>> Hi Timothy, Liang-min,
>>
>> Thanks for rebasing the patch.
>> A list of delta against the first RFC could help the reviewers.
>> I notice one change in the right direction is the conversion to Vhost
>> API datapath instead of Vhost PMD.
>>
>>> In this patch we support both Vhost API and Vhost PMD because OVS supports 
>>> both
>>> VhostUser and Vdev ports.
>>>
>> Also, I would suggest to have the patch split in several incremental
>> patches to ease the review.
>>
>>> Thank you for suggestion. We will provide incremental patches on next 
>>> submission
>>>
>> On 5/10/21 6:00 PM, Timothy Miskell wrote:
>>> From: Liang-min Wang <liang-min.w...@intel.com>
>>>
>>> The following parameters are added:
>>>  - mirror-offload: to turn on/off mirror offloading.
>>>  - output-port-name: specify a port, using name string, that is on a 
>>> different
>>>    bridge
>>>  - output-src-vlan: output port vlan for each select-src-port.
>>>  - output-dst-vlan: output port vlan for each select-dst-port.
>>>  - flow-src-mac: use src mac address of each select-dst-port for the header
>>>    scan.
>>>  - flow-dst-mac: use dst mac address of each select-src-port for the header
>>>    scan.
>>>  - mirror-tunnel-addr: BDF string of the tunnel device.
>>>
>>> ovs-vsctl test change because new mirroring parameters are introduced in
>> this patch
>>
>> It would help to provide examples of usage of these new parameters.
>>
>>> Will add examples in the new patches
>>>
>>> Create a defer procedure call thread to handle all mirror offload requests.
>>> This is a light-weight thread which remains in sleep-state when there is no
>> new request.
>>> This is created between ovs-vsctl and mirror offloading back end
>>>
>>> Implementing DPDK tx-burst (VIRTIO ingress traffic
>>> mirror) and rx-burst (VIRTIO egress traffic mirror) callbacks.
>>> Each callback  functions implement the following tasks:
>>>  1. Enable per-packet VLAN insertion
>>>    - for port mirroring, all packets are enabled per-packet VLAN insertion.
>>>    - for flow mirroring, only packet header matches the required mac
>> address
>>>      are enabled.
>>>  2. Sending the packets to the specified transport port (output-port in
>>>     mirror offload configuration)
>>>    - for port mirroring, all packets are sent to the transport port.
>>>    - for flow mirroring, only matched packets are sent.
>>>  3. Restore each packet attributes (remove DPDK per-packet offload flag)
>>
>> I will for sure have more questions later, but please find a few
>> comments/questions below:
>>
>>> Signed-off-by: Liang-min Wang <liang-min.w...@intel.com>
>>> Tested-by: Timothy Miskell <timothy.misk...@intel.com>
>>> Suggested-by: Munish Mehan <mm6...@att.com>
>>> ---
>>>  lib/automake.mk            |   2 +
>>>  lib/netdev-dpdk-mirror.c   | 516
>> +++++++++++++++++++++++++++++++++++++
>>>  lib/netdev-dpdk-mirror.h   |  83 ++++++
>>>  lib/netdev-dpdk.c          | 397 ++++++++++++++++++++++++++++
>>>  lib/netdev-provider.h      |  16 ++
>>>  lib/netdev.c               | 386 +++++++++++++++++++++++++++
>>>  lib/netdev.h               |  16 ++
>>>  tests/ovs-vsctl.at         |   2 +
>>>  vswitchd/bridge.c          | 271 ++++++++++++++++++-
>>>  vswitchd/vswitch.ovsschema |  24 +-
>>>  vswitchd/vswitch.xml       |  50 ++++
>>>  11 files changed, 1759 insertions(+), 4 deletions(-)
>>>  create mode 100644 lib/netdev-dpdk-mirror.c
>>>  create mode 100644 lib/netdev-dpdk-mirror.h
>>>

...

>>> +
>>> +/* port/flow mirror traffic processors */
>>> +static inline uint16_t
>>> +netdev_custom_mirror_offload_cb(uint16_t qidx, struct rte_mbuf
>> **pkts,
>>> +    uint16_t nb_pkts, void *user_params)
>>> +{
>>> +    struct mirror_param *data = user_params;
>>> +    uint16_t i, dst_qidx, match_count = 0;
>>> +    uint16_t pkt_trans;
>>> +    uint16_t dst_port_id = data->dst_port_id;
>>> +    uint16_t dst_vlan_id = data->dst_vlan_id;
>>> +    struct rte_mbuf **pkt_buf = &data->pkt_buf[qidx * data-
>>> max_burst_size];
>>> +
>>> +    if (nb_pkts == 0) {
>>> +        return 0;
>>> +    }
>>> +
>>> +    if (nb_pkts > data->max_burst_size) {
>>> +        VLOG_ERR("Per-flow batch size, %d, exceeds maximum limit\n",
>> nb_pkts);
>>> +        return 0;
>>> +    }
>>> +
>>> +    for (i = 0; i < nb_pkts; i++) {
>>> +        if (data->custom_scan(pkts[i], user_params)) {
>>> +            pkt_buf[match_count] = pkts[i];
>>> +            pkt_buf[match_count]->ol_flags |= PKT_TX_VLAN_PKT;
>>
>> Does it work if the packet already has a VLAN inserted?
>>
>>> Good catch. The design is based upon no VLAN insertion offloading is 
>>> applied on source traffic. 

That is a significant limitation, I haven not seen it documented in the
series. Also, the IEEE paper mentions using QinQ when the packet has
already a VLAN inserted, but it is not in the series. Is there a reason
QinQ is not possible?

IIUC, that means the TAP VM would see traffic having no VLAN, whereas it
has in reality?

>>> +            pkt_buf[match_count]->vlan_tci = dst_vlan_id;
>>> +            rte_mbuf_refcnt_update(pkt_buf[match_count], 1);
>>
>>
>>
>>> +            match_count++;
>>> +        }
>>> +    }
>>> +
>>> +    dst_qidx = (data->n_dst_queue > qidx)?qidx:(data->n_dst_queue -1);
>>
>> Wouldn't it scale better with:
>> dst_qidx = qidx % data->n_dst_queue
>> ?
>>
>>> We tried to avoid using "%" operator. We could add "unlikely" and the 
>>> suggested "%" to make improvement

Not sure adding 'unlikely' is really necessary. The cost of the modulo
operation is nothing compared to all we do in this path.

>>>
>>> +
>>> +    rte_spinlock_lock(&data->locks[dst_qidx]);
>>> +    pkt_trans = rte_eth_tx_burst(dst_port_id, dst_qidx, pkt_buf,
>> match_count);
>>> +    rte_spinlock_unlock(&data->locks[dst_qidx]);
>>> +
>>> +    for (i = 0; i < match_count; i++) {
>>> +        pkt_buf[i]->ol_flags &= ~PKT_TX_VLAN_PKT;
>>> +    }
>>
>> In order to further reduce the performance impact of mirroring, have you
>> envisaged to offload it to dedicated PMD threads?
>>
>>> The mirror-tunnel design is a comprised approach between hardware TAP and 
>>> software TAP.
>>> The tunnel itself is designed to have very little impact on source traffic 
>>> processing core. From
>>> our benchmark on SR-IOV, L2 forwarding, mirroring, we only observed 10-20% 
>>> impact on 64-byte packets, and
>>> we did not observe impact when running traffic with packet size with 
>>> 128-byte or above.

The cover letter mentions 20-30% for 64B packets, and the IEEE paper
seems to indicate a significant impact up to 512B packets (~25%?).

Maybe the workload is different in these tests, that could explain why
you don't see an impact starting 128B?

>>>
>>> +
>>> +    while (unlikely (pkt_trans < match_count)) {
>>> +        rte_pktmbuf_free(pkt_buf[pkt_trans]);
>>> +        pkt_trans++;
>>> +    }
>>> +
>>> +    return nb_pkts;
>>> +}
>>> +

...

>>> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
>>> index 4597a215d..fd2049a7f 100644
>>> --- a/vswitchd/vswitch.xml
>>> +++ b/vswitchd/vswitch.xml
>>> @@ -4869,11 +4869,35 @@ ovs-vsctl add-port br0 p0 -- set Interface p0
>> type=patch options:peer=p1 \
>>>          selected VLANs.
>>>        </p>
>>>
>>> +      <column name="mirror_tunnel_addr">
>>> +        BDF string of the tunnel device on which mirrored traffic will be
>>> +        transmitted.
>>> +      </column>
>>> +
>>>        <column name="select_all">
>>>          If true, every packet arriving or departing on any port is
>>>          selected for mirroring.
>>>        </column>
>>>
>>> +      <column name="mirror_offload">
>>> +        If true, a hw-assisted port mirroring is configured instead
>>> +        default mirroring.
>>> +      </column>
>>> +
>>> +      <column name="flow_src_mac">
>>> +        The source MAC address(es) for per-flow mirroring. Each MAC
>>> +        address is separate by ','. This parametr is paired with
>>> +        select_dst_port. A '0' MAC address indicates the requested mirror
>>> +        is a per-port mirroring, otherwise it's a per-flow mirroring
>>> +      </column>
>>> +
>>> +      <column name="flow_dst_mac">
>>> +        The destination MAC address(es) for per-flow mirroring. Each MAC
>>> +        address is separate by ','. This parametr is paired with
>>> +        select_src_port. A '0' MAC address indicates the requested mirror
>>> +        is a per-port mirroring, otherwise it's a per-flow mirroring
>>> +      </column>
>>> +
>>>        <column name="select_dst_port">
>>>          Ports on which departing packets are selected for mirroring.
>>>        </column>
>>> @@ -4955,6 +4979,32 @@ ovs-vsctl add-port br0 p0 -- set Interface p0
>> type=patch options:peer=p1 \
>>>          </p>
>>>        </column>
>>>
>>> +      <column name="output_src_vlan">
>>> +        <p>Output VLAN for selected source port packets, if nonempty.</p>
>>> +        <p>
>>> +          <em>Please note:</em> This is different than
>>> +          <ref column="output-vlan"/> This vlan is used to add an 
>>> additional
>>> +          vlan tag on the mirror traffic, regardless it contains vlan or 
>>> not.
>>> +          The receive end could choose to filter out this additional vlan.
>>> +          This option is provided so the mirrored traffic could maintain 
>>> its
>>> +          original vlan informaiton, and this mirror can be used to filter
>>> +          out un-wanted traffic such as in <ref column="mirror_offload"/>.
>>> +        </p>
>>> +      </column>
>>> +
>>> +      <column name="output_dst_vlan">
>>> +        <p>Output VLAN for selected destination port packets, if
>> nonempty.</p>
>>> +        <p>
>>> +          <em>Please note:</em> This is different than
>>> +          <ref column="output-vlan"/> This vlan is used to add an 
>>> additional
>>> +          vlan tag on the mirror traffic, regardless it contains vlan or 
>>> not.
>>> +          The receive end could choose to filter out this additional vlan.
>>> +          This option is provided so the mirrored traffic could maintain 
>>> its
>>> +          original vlan informaiton, and this mirror cab be used to filter
>>> +          out un-wanted traffic such as in <ref column="mirror_offload"/>.
>>> +        </p>
>>> +      </column>
>>> +
>>>        <column name="snaplen">
>>>          <p>Maximum per-packet number of bytes to mirror.</p>
>>>          <p>A mirrored packet with size larger than <ref column="snaplen"/>
>>>
> 

These parameters are DPDK specific, but nothing mentions it. It would
confuse the OVS-Kernel users.

Regards,
Maxime

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to