On Tue, 2022-06-28 at 13:18 +0200, Eelco Chaudron wrote:
> 
> 
> On 28 Jun 2022, at 10:52, Jianbo Liu wrote:
> 
> > On Tue, 2022-06-28 at 09:52 +0200, Eelco Chaudron wrote:
> > > 
> > > 
> > > On 27 Jun 2022, at 16:58, Jianbo Liu wrote:
> > > 
> > > > On Mon, 2022-06-27 at 11:32 +0200, Eelco Chaudron wrote:
> > > > > 
> > > > > 
> > > > > On 21 Jun 2022, at 10:22, Jianbo Liu wrote:
> > > > > 
> > > > > > On Mon, 2022-06-20 at 12:18 +0200, Eelco Chaudron wrote:
> > > > > > > On 27 May 2022, at 11:00, Jianbo Liu wrote:
> > > > > > > 
> > > > > > > > OVS meters are created in advance and openflow rules
> > > > > > > > refer
> > > > > > > > to
> > > > > > > > them
> > > > > > > > by
> > > > > > > > their unique ID. New tc_police API is used to offload
> > > > > > > > them.
> > > > > > > > By
> > > > > > > > calling
> > > > > > > > the API, police actions are created and meters are
> > > > > > > > mapped
> > > > > > > > to
> > > > > > > > them.
> > > > > > > > These actions then can be used in tc filter rules by
> > > > > > > > the
> > > > > > > > index.
> > > > > > > > 
> > > > > > > > Signed-off-by: Jianbo Liu <jian...@nvidia.com>
> > > > > > > > ---
> > > > > > > >  NEWS                             |  2 ++
> > > > > > > >  lib/dpif-netlink.c               | 31
> > > > > > > > +++++++++++++++++---
> > > > > > > > -
> > > > > > > >  tests/system-offloads-traffic.at | 48
> > > > > > > > ++++++++++++++++++++++++++++++++
> > > > > > > >  3 files changed, 76 insertions(+), 5 deletions(-)
> > > > > > > > 
> > > > > > > > diff --git a/NEWS b/NEWS
> > > > > > > > index eece0d0b2..dfd659d4e 100644
> > > > > > > > --- a/NEWS
> > > > > > > > +++ b/NEWS
> > > > > > > > @@ -29,6 +29,8 @@ Post-v2.17.0
> > > > > > > >     - Windows:
> > > > > > > >       * Conntrack support for TCPv6, UDPv6, ICMPv6,
> > > > > > > > FTPv6.
> > > > > > > >       * IPv6 Geneve tunnel support.
> > > > > > > > +   - Linux datapath:
> > > > > > > > +     * Add offloading meter tc police.
> > > > > > > > 
> > > > > > > > 
> > > > > > > >  v2.17.0 - 17 Feb 2022
> > > > > > > > diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
> > > > > > > > index 06e1e8ca0..0af9ee77e 100644
> > > > > > > > --- a/lib/dpif-netlink.c
> > > > > > > > +++ b/lib/dpif-netlink.c
> > > > > > > > @@ -4163,11 +4163,18 @@ static int
> > > > > > > >  dpif_netlink_meter_set(struct dpif *dpif_,
> > > > > > > > ofproto_meter_id
> > > > > > > > meter_id,
> > > > > > > >                         struct ofputil_meter_config
> > > > > > > > *config)
> > > > > > > >  {
> > > > > > > > +    int err;
> > > > > > > > +
> > > > > > > >      if (probe_broken_meters(dpif_)) {
> > > > > > > >          return ENOMEM;
> > > > > > > >      }
> > > > > > > > 
> > > > > > > > -    return dpif_netlink_meter_set__(dpif_, meter_id,
> > > > > > > > config);
> > > > > > > > +    err = dpif_netlink_meter_set__(dpif_, meter_id,
> > > > > > > > config);
> > > > > > > > +    if (!err && netdev_is_flow_api_enabled()) {
> > > > > > > > +        meter_offload_set(meter_id, config);
> > > > > > > 
> > > > > > > Although currently we always return 0, we should still
> > > > > > > account
> > > > > > > for it
> > > > > > > to change in the future, so we should set err to the
> > > > > > > return
> > > > > > > value.
> > > > > > > 
> > > > > > 
> > > > > > If there is err from meter_offload_set, it will be passed
> > > > > > to
> > > > > > the
> > > > > > caller
> > > > > > of dpif_netlink_meter_set(). I don't agree with that,
> > > > > > because
> > > > > > the
> > > > > > caller thinks meter_set operation fail, but actually not.
> > > > > > Besides,
> > > > > > we
> > > > > > allow the case that dp meter_set success, but offloading
> > > > > > fails,
> > > > > > so
> > > > > > the
> > > > > > return the error of meter_offload_set seems unnecessary.
> > > > > 
> > > > > If this is the case, we should change the
> > > > > dpif_netlink_meter_set()
> > > > > API to return void.
> > > > > And add a comment to the function why it would not return an
> > > > > error:
> > > > > 
> > > > 
> > > > OK.
> > > > 
> > > > > --- a/lib/netdev-offload.c
> > > > > +++ b/lib/netdev-offload.c
> > > > > @@ -196,7 +196,7 @@ netdev_assign_flow_api(struct netdev
> > > > > *netdev)
> > > > >      return -1;
> > > > >  }
> > > > > 
> > > > > -int
> > > > > +void
> > > > >  meter_offload_set(ofproto_meter_id meter_id,
> > > > >                    struct ofputil_meter_config *config)
> > > > >  {
> > > > > @@ -212,8 +212,8 @@ meter_offload_set(ofproto_meter_id
> > > > > meter_id,
> > > > >             }
> > > > >          }
> > > > >      }
> > > > > -
> > > > > -    return 0;
> > > > > +    /* Offload APIs could fail, for example, because the
> > > > > offload
> > > > > is
> > > > > not
> > > > > +     * supported. This is fine, as the offload API should
> > > > > take
> > > > > care
> > > > > of this. */
> > > > >  }
> > > > > 
> > > > > 
> > > > > 
> > > > > > >  +        err = meter_offload_set(meter_id, config);
> > > > > > > 
> > > > > > > > +    }
> > > > > > > > +
> > > > > 
> > > > > <SNIP>
> > > > > 
> > > > > > > > diff --git a/tests/system-offloads-traffic.at
> > > > > > > > b/tests/system-
> > > > > > > > offloads-traffic.at
> > > > > > > > index 80bc1dd5c..7ec75340f 100644
> > > > > > > > --- a/tests/system-offloads-traffic.at
> > > > > > > > +++ b/tests/system-offloads-traffic.at
> > > > > > > > @@ -168,3 +168,51 @@ matchall
> > > > > > > >  ])
> > > > > > > >  OVS_TRAFFIC_VSWITCHD_STOP
> > > > > > > >  AT_CLEANUP
> > > > > > > > +
> > > > > > > > +AT_SETUP([offloads - check if meter offloading ])
> > > > > > > 
> > > > > > > Can we replace if with interface, as I keep on reading it
> > > > > > > as
> > > > > > > "if".
> > > > > > > 
> > > > > > 
> > > > > > Sure.
> > > > > > 
> > > > > > > > +AT_KEYWORDS([meter])
> > > > > > > > +AT_SKIP_IF([test $SUPPORT_TC_INGRESS_PPS = "no"])
> > > > > > > > +OVS_TRAFFIC_VSWITCHD_START()
> > > > > > > > +
> > > > > > > > +AT_CHECK([ovs-vsctl set Open_vSwitch .
> > > > > > > > other_config:hw-
> > > > > > > > offload=true])
> > > > > > > > +AT_CHECK([ovs-ofctl -O OpenFlow13 add-meter br0
> > > > > > > > 'meter=1
> > > > > > > > pktps
> > > > > > > > bands=type=drop rate=1'])
> > > > > > > > +
> > > > > > > > +ADD_NAMESPACES(at_ns0, at_ns1)
> > > > > > > > +ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24",
> > > > > > > > "f0:00:00:01:01:01")
> > > > > > > > +ADD_VETH(p1, at_ns1, br0, "10.1.1.2/24",
> > > > > > > > "f0:00:00:01:01:02")
> > > > > > > > +
> > > > > > > > +NS_CHECK_EXEC([at_ns0], [ip neigh add 10.1.1.2 lladdr
> > > > > > > > f0:00:00:01:01:02 dev p0])
> > > > > > > > +NS_CHECK_EXEC([at_ns1], [ip neigh add 10.1.1.1 lladdr
> > > > > > > > f0:00:00:01:01:01 dev p1])
> > > > > > > > +
> > > > > > > > +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0
> > > > > > > > "actions=normal"])
> > > > > > > > +NS_CHECK_EXEC([at_ns0], [ping -q -c 10 -i 0.1 -w 2
> > > > > > > > 10.1.1.2 |
> > > > > > > > FORMAT_PING], [0], [dnl
> > > > > > > > +10 packets transmitted, 10 received, 0% packet loss,
> > > > > > > > time
> > > > > > > > 0ms
> > > > > > > > +])
> > > > > > > > +
> > > > > > > > +NETNS_DAEMONIZE([at_ns1], [nc -u -l 5678 > /dev/null
> > > > > > > > ],
> > > > > > > > [nc0.pid])
> > > > > > > > +
> > > > > > > > +AT_CHECK([ovs-ofctl -O OpenFlow13 del-flows br0])
> > > > > > > > +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0
> > > > > > > > "priority=10,in_port=ovs-p0,udp
> > > > > > > > actions=meter:1,normal"])
> > > > > > > > +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0
> > > > > > > > "priority=1
> > > > > > > > actions=normal"])
> > > > > > > > +
> > > > > > > > +NS_CHECK_EXEC([at_ns0], [sleep 0.5; echo "mark" | nc -
> > > > > > > > u
> > > > > > > > 10.1.1.2
> > > > > > > > 5678 -p 6789])
> > > > > > > 
> > > > > > > Any specific reason why you need the sleep 0.5 here? Is
> > > > > > > it to
> > > > > > > be
> > > > > > > sure
> > > > > > > the flow is programmed?
> > > > > > 
> > > > > > I remember I added this because there are failures
> > > > > > sometimes. I
> > > > > > don't
> > > > > > know why, but obviously they are related to this patchset.
> > > > > > So I
> > > > > > added
> > > > > > the sleep to avoid them. It's only 0.5s, should be no
> > > > > > problem,
> > > > > > right?
> > > > > 
> > > > > I did a lot of runs, but could not get it to fail without it.
> > > > > So
> > > > > if
> > > > > it fails in your case it would be good to investigate.
> > > > 
> > > > I can't reproduce today, though I run many times. It's related
> > > > to
> > > > my
> > > > setup, I don't test on physical machine, but a virtual machine.
> > > > 
> > > > > 
> > > > > > > If so, you might just want to do a ovs-vsctl dump-flows
> > > > > > > and
> > > > > > > checke
> > > > > > > the output?
> > > > > > > 
> > > > > > 
> > > > > > I don't understand your qustion. I checked ovs-vsctl dump-
> > > > > > flows
> > > > > > below.
> > > > 
> > > > If I remember correctly, the used is "never", not "0.001s".
> > > > 
> > > > > 
> > > > > I mean in the failure case without he 0.5 seconds, what do
> > > > > you
> > > > > get.
> > > > > 
> > > > > > > > +AT_CHECK([ovs-appctl dpctl/dump-flows | grep "meter" |
> > > > > > > > DUMP_CLEAN_SORTED], [0], [dnl
> > > > > > > > +in_port(2),eth(macs),eth_type(0x0800),ipv4(proto=17,fr
> > > > > > > > ag=n
> > > > > > > > o),
> > > > > > > > packets:0, bytes:0, used:0.001s,
> > > > > > > > actions:outputmeter(0),3
> > > > > > > > +])
> > > > > > > 
> > > > > > > Here you verify the DP rule is inserted, but should you
> > > > > > > not
> > > > > > > also
> > > > > > > wait
> > > > > > > a second to make sure the meter is reset?
> > > > > > 
> > > > > > OK. I will add.
> > > > > > 
> > > > > > > Because in theory your could/should sent 11 packets in 1
> > > > > > > second,
> > > > > > > so
> > > > > > > 10 should be dropped!?
> > > > > > > This is the case in the kernel environment, but with TC
> > > > > > > the
> > > > > > > learning
> > > > > > > packet is not passing trough the TC meter (this might
> > > > > > > also be
> > > > > > > a
> > > > > > > corner case we need to document).
> > > > > > > 
> > > > > > 
> > > > > > Yes, it's the reason. But where to document?
> > > > > 
> > > > > Not sure where either, I guess in NEWS and maybe it’s time to
> > > > > add
> > > > > a
> > > > > howto/tc-offload.rst file?
> > > > > 
> > > > 
> > > > Maybe just adding comments here, to explain why it's the
> > > > result.
> > > > And
> > > > it's not any related to "howto".
> > > 
> > > I think this is an important difference in behaviour we should
> > > document for the end-user. So that’s why I think we need this in
> > > the
> > > documentation somewhere.
> > > 
> > > > > > > > +for i in `seq 10`; do
> > > > > > > > +NS_CHECK_EXEC([at_ns0], [echo "mark" | nc -u 10.1.1.2
> > > > > > > > 5678
> > > > > > > > -p
> > > > > > > > 6789])
> > > > > > > > +done
> > > > > > > > +
> > > > > > > > +AT_CHECK([ovs-appctl dpctl/dump-flows | grep "meter" |
> > > > > > > > DUMP_CLEAN_SORTED], [0], [dnl
> > > > > > > > +in_port(2),eth(macs),eth_type(0x0800),ipv4(proto=17,fr
> > > > > > > > ag=n
> > > > > > > > o),
> > > > > > > > packets:10, bytes:330, used:0.001s,
> > > > > > > > actions:outputmeter(0),3
> > > > > > > > +])
> > > > > > > > +
> > > > > > > > +AT_CHECK([ovs-ofctl -O OpenFlow13 meter-stats br0 |
> > > > > > > > sed -e
> > > > > > > > 's/duration:[[0-9]].[[0-9]]*s/duration:0.001s/'], [0],
> > > > > > > > [dnl
> > > > > > > > +OFPST_METER reply (OF1.3) (xid=0x2):
> > > > > > > > +meter:1 flow_count:1 packet_in_count:11
> > > > > > > > byte_in_count:377
> > > > > > > > duration:0.001s bands:
> > > > > > > 
> > > > > > > byte_in_count:517 is a lot larger than with kernel code,
> > > > > > > do
> > > > > > > we
> > > > > > > know
> > > > > > > why? We should document it.
> > > > > > > 
> > > > > > 
> > > > > > I think DP also count the bytes of mac layer.
> > > > > 
> > > > > Any idea why?
> > > > 
> > > > From ovs_meter_execute() in datapath/meter.c, the packet size
> > > > of
> > > > band
> > > > stats is added by skb->len, which should include the len of mac
> > > > layer.
> > > 
> > > So this needs to be fixed in the kernel. I can remember someone
> > > from
> > > Nvidia was already looking at this?
> > 
> > Sorry, I don't know.
> > 
> > > 
> > > > > > > > +0: packet_count:9 byte_count:0
> > > > > > > 
> > > > > > > Here in theory we should report byte_count, like this:
> > > > > > > 
> > > > > > > 0: packet_count:10 byte_count:470
> > > > > > > 
> > > > > > > However, TC does not support dropped byte count, only
> > > > > > > packet_count.
> > > > > > > So we should be ok for now, but we must add this
> > > > > > > limitation
> > > > > > > to
> > > > > > > the
> > > > > > > documentation somewhere that it's clear we will not
> > > > > > > report
> > > > > > > byte
> > > > > > > count
> > > > > > > with TC offload.
> > > > > > > 
> > > > > > > If you run this test without HW offload enabled you can
> > > > > > > see
> > > > > > > the
> > > > > > > difference in behavior, and I think there should be none
> > > > > > > (or
> > > > > > > at
> > > > > > > least
> > > > > > > the corner cases should be documented).
> > > > > > > You could also add a "- offloads disabled" variant of
> > > > > > > this
> > > > > > > test,
> > > > > > > like
> > > > > > > other features do and add some additional reasoning why
> > > > > > > it's
> > > > > > > different there.
> > > > > > > 
> > > > > > 
> > > > > > Sure, I will add.
> > > > > 
> > > > > Guess this could go to the same howto/tc-offload.rst in the
> > > > > limitations section?
> > > > > 
> > > > 
> > > > Maybe also add comments here, don't waste time to find info
> > > > from
> > > > the
> > > > other document :)
> > > 
> > > See above, end-users do not read test cases, so we need this in
> > > some
> > > user documentation.
> > 
> > Sorry, I don't understand. Are we talking about the tests, and why
> > do
> > we get these expected results regarding the numbers of bytes and
> > packets? Maybe adding comments here is enough, user may not notice
> > these differences if he doesn't run this tests, right?
> 
> I’m talking about the fact that the “ovs-ofctl -O OpenFlow13 meter-
> stats br0” is not reporting byte count when hwoffload is enabled.
> This is a missing feature/difference from previous behavior and we
> should document this for the end-user.
> 

OK, I see. Will add howto/tc-offload.rst to document the feature of
meter offload and its limitation.

> > > > > > 
> > > > > > > > +])
> > > > > > > > +
> > > > > > > > +OVS_TRAFFIC_VSWITCHD_STOP
> > > > > > > > +AT_CLEANUP
> > > > > > > > -- 
> > > > > > > > 2.26.2
> > > > > > > 
> > > > > > > This concludes my review of v5.
> > > > > > > 
> > > > > > > //Eelco
> > > > > > > 
> > > > > 
> > > 
> 

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to