Re: [ovs-dev] [PATCH v2 5/5] dpif-lookup: add avx512 gather implementation

2020-05-20 Thread Federico Iezzi
On Wed, 20 May 2020 at 15:32, William Tu  wrote:

> On Wed, May 20, 2020 at 3:35 AM Federico Iezzi  wrote:
> >
> >
> >
> >
> >
> > On Wed, 20 May 2020 at 12:20, Van Haaren, Harry <
> harry.van.haa...@intel.com> wrote:
> >>
> >> > -----Original Message-----
> >> > From: William Tu 
> >> > Sent: Wednesday, May 20, 2020 1:12 AM
> >> > To: Van Haaren, Harry 
> >> > Cc: ovs-dev@openvswitch.org; i.maxim...@ovn.org
> >> > Subject: Re: [ovs-dev] [PATCH v2 5/5] dpif-lookup: add avx512 gather
> >> > implementation
> >> >
> >> > On Mon, May 18, 2020 at 9:12 AM Van Haaren, Harry
> >> >  wrote:
> >> > >
> >> > > > -----Original Message-----
> >> > > > From: William Tu 
> >> > > > Sent: Monday, May 18, 2020 3:58 PM
> >> > > > To: Van Haaren, Harry 
> >> > > > Cc: ovs-dev@openvswitch.org; i.maxim...@ovn.org
> >> > > > Subject: Re: [ovs-dev] [PATCH v2 5/5] dpif-lookup: add avx512
> gather
> >> > > > implementation
> >> > > >
> >> > > > On Wed, May 06, 2020 at 02:06:09PM +0100, Harry van Haaren wrote:
> >> > > > > This commit adds an AVX-512 dpcls lookup implementation.
> >> > > > > It uses the AVX-512 SIMD ISA to perform multiple miniflow
> >> > > > > operations in parallel.
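For context, the "gather" in the subject refers to the AVX-512 gather
instructions: a single gather fetches eight 64-bit miniflow blocks at once,
which is what makes the parallel miniflow processing possible. A minimal
sketch of the idea, not the patch's actual code (the helper name is made up):

    #include <immintrin.h>
    #include <stdint.h>

    /* Fetch eight 64-bit values from 'base' at the offsets in 'idx'.
     * scale = 8 because the indices are in units of uint64_t.
     * Requires a CPU and compiler flags with AVX512F support. */
    static inline __m512i
    gather8_u64(const uint64_t *base, __m512i idx)
    {
        return _mm512_i64gather_epi64(idx, base, 8);
    }

One such instruction replaces eight scalar loads.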
> >>
> >> 
> >>
> >> > Hi Harry,
> >> >
> >> > I managed to find a machine with avx512 in google cloud and did some
> >> > performance testing. I saw lower performance when enabling avx512,
> >
> >
> > AVX512 instruction path lowers the clock speed well below the base
> > frequency [1].
> > Aren't you killing the PMD performance while improving only the lookup one?
> >
> > [1]
> > https://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/2nd-gen-xeon-scalable-spec-update.pdf
> > (see page 20)
> >
>
> Hi Federico,
>
> Thanks for sharing the link.
> Does that mean if OVS PMD uses avx512 on one core, then all the other
> cores' frequency will be lower?
>

The clock is reduced only on the cores that execute AVX-512 instructions, to
cope with the thermals.
I'm not sure whether there is a situation where AVX-512 code is executed only
on specific PMDs; if that happens it is bad, as some PMDs may be faster or
slower than others (see below).
It's kinda like when dynamic turbo boost is enabled and some PMDs go faster
because of the higher clock.


>
> There are some discussion here:
>
> https://lemire.me/blog/2018/09/07/avx-512-when-and-how-to-use-these-new-instructions/


Wow, quite interesting. Thanks!


>
> My take is that overall down clocking will happen, but the application
> will get better performance.
>

Indeed the part of the code written for AVX-512 goes much faster; the rest
stays on the normal path and will go slower due to the reduced clock.
Those are different use cases and programs, but see the Cannon Lake AnandTech
review regarding what AVX-512 can deliver:

###
When we crank on the AVX2 and AVX512, there is no stopping the Cannon Lake
chip here. At a score of 4519, it beats a full 18-core Core i9-7980XE
processor running in non-AVX.
https://www.anandtech.com/show/13405/intel-10nm-cannon-lake-and-core-i3-8121u-deep-dive-review/9
###

Indeed you have to expect much-improved performance from it; the question
is how much the non-AVX-512 code will slow down.
See also this one ->
https://travisdowns.github.io/blog/2020/01/17/avxfreq1.html

HTH,
Federico


> William
>
>


Re: [ovs-dev] [PATCH v2 5/5] dpif-lookup: add avx512 gather implementation

2020-05-20 Thread Federico Iezzi
On Wed, 20 May 2020 at 12:20, Van Haaren, Harry 
wrote:

> > -----Original Message-----
> > From: William Tu 
> > Sent: Wednesday, May 20, 2020 1:12 AM
> > To: Van Haaren, Harry 
> > Cc: ovs-dev@openvswitch.org; i.maxim...@ovn.org
> > Subject: Re: [ovs-dev] [PATCH v2 5/5] dpif-lookup: add avx512 gather
> > implementation
> >
> > On Mon, May 18, 2020 at 9:12 AM Van Haaren, Harry
> >  wrote:
> > >
> > > > -----Original Message-----
> > > > From: William Tu 
> > > > Sent: Monday, May 18, 2020 3:58 PM
> > > > To: Van Haaren, Harry 
> > > > Cc: ovs-dev@openvswitch.org; i.maxim...@ovn.org
> > > > Subject: Re: [ovs-dev] [PATCH v2 5/5] dpif-lookup: add avx512 gather
> > > > implementation
> > > >
> > > > On Wed, May 06, 2020 at 02:06:09PM +0100, Harry van Haaren wrote:
> > > > > This commit adds an AVX-512 dpcls lookup implementation.
> > > > > It uses the AVX-512 SIMD ISA to perform multiple miniflow
> > > > > operations in parallel.
>
> 
>
> > Hi Harry,
> >
> > I managed to find a machine with avx512 in google cloud and did some
> > performance testing. I saw lower performance when enabling avx512,
>

AVX512 instruction path lowers the clock speed well below the base
frequency [1].
Aren't you killing the PMD performance while improving only the lookup one?

[1]
https://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/2nd-gen-xeon-scalable-spec-update.pdf
(see page 20)


> > I believe I did something wrong. Do you mind having a look:
> >
> > 1) first a compile error
> > diff --git a/lib/dpif-netdev-lookup.c b/lib/dpif-netdev-lookup.c
> > index b22a26b8c8a2..5c71096c10c5 100644
> > --- a/lib/dpif-netdev-lookup.c
> > +++ b/lib/dpif-netdev-lookup.c
> > @@ -1,5 +1,6 @@
> >
> >  #include 
> > +#include 
> >  #include "dpif-netdev-lookup.h"
> >
> >  #include "openvswitch/vlog.h"
>
> Existing code compiles fine here - but I've added this in the v3, thanks
> for flagging.
>
>
> > 2) cpuinfo
> > flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
> > pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm
> > constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq
> > pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt
> > aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch
> > invpcid_single pti ssbd ibrs ibpb stibp fsgsbase tsc_adjust bmi1 hle
> > avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap
> > clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1
> > xsaves arat md_clear arch_capabilities
>
> The avx512f and dq/cd/bw/vl extensions indicate AVX512 is available on
> this machine, all looks good so far.
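As a side note, the same availability check can be done programmatically; a
minimal sketch using the GCC/Clang builtin, shown for illustration and not
the probe the patch itself uses:

    #include <stdbool.h>

    /* True if the AVX-512 subsets a gather-based lookup would rely on
     * are present on this CPU (foundation + byte/word extensions). */
    static bool
    avx512_available(void)
    {
        return __builtin_cpu_supports("avx512f")
               && __builtin_cpu_supports("avx512bw");
    }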
>
>
> >
> > 3) start ovs and set avx and traffic gen
> >  ovs-appctl dpif-netdev/subtable-lookup-set avx512_gather 5
> >  ovs-vsctl add-port br0 tg0 -- set int tg0 type=dpdk
> >
> options:dpdk-devargs=vdev:net_pcap0,rx_pcap=/root/ovs/p0.pcap,infinite_rx=1
>
> The output of the first command (enabling the AVX512 lookup) posts some
> output to the INFO log, please ensure it's there?
>
> 2020-05-20T09:39:09Z|00262|dpif_netdev_lookup|INFO|Subtable function
> 'avx512_gather' set priority to 4
> 2020-05-20T09:39:09Z|6|dpif_netdev(pmd-c15/id:99)|INFO|reprobing sub
> func, 5 1
>
>
> > 4) dp flows with miniflow info
> > root@instance-3:~/ovs# ovs-appctl dpctl/dump-flows -m
> > flow-dump from pmd on cpu core: 0
> > ufid:caf1-2e15-418c-a7d4-b4ec377593ca,
> >
> skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(
> >
> 0/0),recirc_id(0),dp_hash(0/0),in_port(tg0),packet_type(ns=0,id=0),eth(src=42:01:
> > 0a:b6:00:02,dst=42:01:0a:b6:00:01),eth_type(0x0800),ipv4(src=
> 10.182.0.2/0.0.0.0,
> > dst=76.21.95.192/0.0.0.0,proto=6/0,tos=0x10/0,ttl=64/0,frag=no
> ),tcp(src=22/0,ds
> > t=62190/0),tcp_flags(0/0),
> > packets:0, bytes:0, used:never, dp:ovs, actions:drop,
> > dp-extra-info:miniflow_bits(5,1)
> > ufid:78cc1751-3a81-4dba-900c-b3507d965bdc,
> >
> skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(
> >
> 0/0),recirc_id(0),dp_hash(0/0),in_port(tg0),packet_type(ns=0,id=0),eth(src=42:01:
> > 0a:b6:00:02,dst=42:01:0a:b6:00:01),eth_type(0x0800),ipv4(src=
> 10.182.0.2/0.0.0.0,
> > dst=169.254.169.254/0.0.0.0,proto=6/0,tos=0/0,ttl=64/0,frag=no
> ),tcp(src=51650/
> > 0,dst=80/0),tcp_flags(0/0),
> > packets:0, bytes:0, used:never, dp:ovs, actions:drop,
> > dp-extra-info:miniflow_bits(5,1)
>
> It seems the "packets:0, bytes:0,used:never" tags indicate that there is
> no traffic hitting these rules at all?
>
> Output here (with traffic running for a while) shows:
> packets:621588543, bytes:37295312580, used:0.000s, dp:ovs, actions:dpdk1,
> dp-extra-info:miniflow_bits(4,1)
>
>
> > 5) pmd-stat-show
> > root@instance-3:~/ovs# ovs-appctl dpif-netdev/pmd-stats-show
> > pmd thread numa_id 0 core_id 0:
> >   packets received: 19838528
> >   packet recirculations: 0
> >   avg. datapath passes per packet: 1.00
> >   emc hits: 0
> >   smc hits: 0
> >   megaflow 

Re: [ovs-dev] [PATCH v18] Improved Packet Drop Statistics in OVS

2020-01-03 Thread Federico Iezzi
> 
> On 18 Dec 2019, at 3:20 pm, Ilya Maximets  wrote:
> 
> On 18.12.2019 05:48, Anju Thomas wrote:
>> Currently OVS maintains explicit packet drop/error counters only on port
>> level.  Packets that are dropped as part of normal OpenFlow processing
>> are counted in flow stats of “drop” flows or as table misses in table
>> stats. These can only be interpreted by controllers that know the
>> semantics of the configured OpenFlow pipeline.  Without that knowledge,
>> it is impossible for an OVS user to obtain e.g. the total number of
>> packets dropped due to OpenFlow rules.
>> 
>> Furthermore, there are numerous other reasons for which packets can be
>> dropped by OVS slow path that are not related to the OpenFlow pipeline.
>> The generated datapath flow entries include a drop action to avoid
>> further expensive upcalls to the slow path, but subsequent packets
>> dropped by the datapath are not accounted anywhere.
>> 
>> Finally, the datapath itself drops packets in certain error situations.
>> Also, these drops are today not accounted for. This makes it difficult
>> for OVS users to monitor packet drops in an OVS instance and to alert a
>> management system in case of an unexpected increase of such drops.
>> Also, OVS troubleshooters face difficulties in analysing packet drops.
>> 
>> With this patch we implement following changes to address the issues
>> mentioned above.
>> 
>> 1. Identify and account all the silent packet drop scenarios
>> 
>> 2. Display these drops in ovs-appctl coverage/show
>> 
>> Co-authored-by: Rohith Basavaraja 
>> Co-authored-by: Keshav Gupta 
>> Signed-off-by: Anju Thomas 
>> Signed-off-by: Rohith Basavaraja 
>> Signed-off-by: Keshav Gupta 
>> Acked-by: Eelco Chaudron 
>> ---
> 
> Thanks.  This version looks OK to me besides the fact that I'd rename
> the patch to something like "userspace: Improved packet drop statistics.",
> but this is minor.  One more thing is that we might want to rename
> 'enum xlate_error' to 'enum ovs_xlate_error' and prefix all its members
> with 'OVS_' to keep the style along with kernel definitions, but this is
> a mechanical change that could be done later in a separate patch as it
> will touch a lot of unrelated code.
> 
> Acked-by: Ilya Maximets 
> 
> Hi Ben,
> could you, please, take a look at this patch one more time?
> There were a couple of changes as we moved 'enum xlate_error' to
> openvswitch.h header to avoid inclusion of 'ofproto/ofproto-dpif-xlate.h'
> from the 'lib' code, new datapath capability was documented and we
> cleaned the patch up a little bit.

Any news? We’re finally this close to merging the patch set.

> 
> BTW, I will be traveling starting from tomorrow until the end of next
> week, so feel free to apply this patch if it looks good to you.  I could
> apply it myself today (if you could reply on it today) or after my trip.
> 
> Best regards, Ilya Maximets.


Re: [ovs-dev] [PATCH v5] Adding support for PMD auto load balancing

2019-01-14 Thread Federico Iezzi
Maybe it's a bit late for this series, but would it be possible, in a
future enhancement, to have a user parameter to set a different value
for ALB_PMD_LOAD_THRESHOLD?
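For illustration, such a knob could follow the pattern of the existing
other_config options; a minimal sketch, assuming a hypothetical key name
pmd-auto-lb-load-threshold and OVS's existing smap_get_int() helper:

    #include "smap.h"

    /* Read the overload threshold (percent) from other_config, keeping the
     * current hard-coded 95 as the default and ignoring bogus values. */
    static unsigned int
    alb_load_threshold(const struct smap *other_config)
    {
        int t = smap_get_int(other_config, "pmd-auto-lb-load-threshold", 95);
        return (t > 0 && t <= 100) ? (unsigned int) t : 95;
    }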

Regards,
Federico

FEDERICO IEZZI

SR. TELCO ARCHITECT

Red Hat EMEA

fie...@redhat.com    M: +31-6-5152-9709



On Mon, 14 Jan 2019 at 11:56, Nitin Katiyar  wrote:
>
> Port rx queues that have not been statically assigned to PMDs are currently
> assigned based on periodically sampled load measurements.
> The assignment is performed at specific instances – port addition, port
> deletion, upon reassignment request via CLI etc.
>
> Due to changes in traffic pattern over time, it can cause uneven load among
> the PMDs, thus resulting in lower overall throughput.
>
> This patch enables the support of auto load balancing of PMDs based on
> measured load of RX queues. Each PMD measures the processing load for each
> of its associated queues every 10 seconds. If the aggregated PMD load reaches
> 95% for 6 consecutive intervals then the PMD considers itself to be overloaded.
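In pseudo-C, the overload test described above boils down to something like
the following illustrative sketch (not the patch's code):

    #include <stdbool.h>

    #define ALB_LOAD_THRESHOLD 95   /* percent */
    #define ALB_OVERLOAD_CNT    6   /* consecutive 10-second intervals */

    struct pmd_alb {
        unsigned int over_cnt;  /* intervals at/above the threshold so far */
    };

    static bool
    pmd_overloaded(struct pmd_alb *alb, unsigned int load_pct)
    {
        alb->over_cnt = load_pct >= ALB_LOAD_THRESHOLD ? alb->over_cnt + 1 : 0;
        return alb->over_cnt >= ALB_OVERLOAD_CNT;
    }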
>
> If any PMD is overloaded, a dry-run of the PMD assignment algorithm is
> performed by OVS main thread. The dry-run does NOT change the existing
> queue to PMD assignments.
>
> If the resultant mapping of dry-run indicates an improved distribution
> of the load then the actual reassignment will be performed.
>
> The automatic rebalancing will be disabled by default and has to be
> enabled via configuration option. The interval (in minutes) between
> two consecutive rebalancing can also be configured via CLI, default
> is 1 min.
>
> Following example commands can be used to set the auto-lb params:
> ovs-vsctl set open_vswitch . other_config:pmd-auto-lb="true"
> ovs-vsctl set open_vswitch . other_config:pmd-auto-lb-rebalance-intvl="5"
>
> Co-authored-by: Rohith Basavaraja 
> Co-authored-by: Venkatesan Pradeep 
> Signed-off-by: Rohith Basavaraja 
> Signed-off-by: Venkatesan Pradeep 
> Signed-off-by: Nitin Katiyar 
> ---
>  Documentation/topics/dpdk/pmd.rst |  41 +
>  NEWS  |   1 +
>  lib/dpif-netdev.c | 379 ++
>  vswitchd/vswitch.xml  |  41 +
>  4 files changed, 462 insertions(+)
>
> diff --git a/Documentation/topics/dpdk/pmd.rst b/Documentation/topics/dpdk/pmd.rst
> index dd9172d..c273b40 100644
> --- a/Documentation/topics/dpdk/pmd.rst
> +++ b/Documentation/topics/dpdk/pmd.rst
> @@ -183,3 +183,44 @@ or can be triggered by using::
> In addition, the output of ``pmd-rxq-show`` was modified to include
> Rx queue utilization of the PMD as a percentage. Prior to this, tracking of
> stats was not available.
> +
> +Automatic assignment of Port/Rx Queue to PMD Threads (experimental)
> +--------------------------------------------------------------------
> +
> +Cycle or utilization based allocation of Rx queues to PMDs gives efficient
> +load distribution but it is not adaptive to changes in traffic pattern
> +occurring over time. This causes uneven load among the PMDs which results
> +in overall lower throughput.
> +
> +To address this, automatic load balancing of PMDs can be set by::
> +
> +$ ovs-vsctl set open_vswitch . other_config:pmd-auto-lb="true"
> +
> +If pmd-auto-lb is set to true AND cycle based assignment is enabled then auto
> +load balancing of PMDs is enabled provided there are 2 or more non-isolated
> +PMDs and at least one of these PMDs is polling more than one RX queue.
> +
> +Once auto load balancing is set, each non-isolated PMD measures the
> +processing load for each of its associated queues every 10 seconds. If the
> +aggregated PMD load reaches 95% for 6 consecutive intervals then the PMD
> +considers itself to be overloaded.
> +
> +If any PMD is overloaded, a dry-run of the PMD assignment algorithm is
> +performed by OVS main thread. The dry-run does NOT change the existing queue
> +to PMD assignments.
> +
> +If the resultant mapping of dry-run indicates an improved distribution of the
> +load then the actual reassignment will be performed.
> +
> +The minimum time between 2 consecutive PMD auto load balancing iterations can
> +also be configured by::
> +
> +$ ovs-vsctl set open_vswitch . \
> +other_config:pmd-auto-lb-rebal-interval="<interval>"
> +
> +where <interval> is a value in minutes. The default interval is 1 minute
> +and setting it to 0 will also result in the default value, i.e. 1 min.
> +
> +A user can use this option to avoid frequent triggering of auto load
> +balancing of PMDs. E.g. set this (in min) such that it occurs once in a few
> +hours or a 
>

Re: [ovs-dev] netdev-dpdk: Support the link speed of XL710

2018-10-11 Thread Federico Iezzi
So, any news on the other link speeds like 25, 50, and 100Gbps?

Thanks

On Mon, 3 Sep 2018 at 19:54, Flavio Leitner  wrote:

> On Fri, Aug 31, 2018 at 05:54:25AM +0200, Federico Iezzi wrote:
> > Any comment here?
> > This seems like a very easy commit :-)
> >
> >
> >
> > On Thu, 23 Aug 2018 at 13:34, Ian Stokes  wrote:
> >
> > > On 8/22/2018 6:14 PM, Federico Iezzi wrote:
> > > > DPDK exposes API all the way from 10Mbps to 100Gbps.
> > > > http://doc.dpdk.org/api/rte__ethdev_8h_source.html
> > > >
> > > > Can other cards be added? 25G is now getting really popular.
> > > >
> > > > Thanks
> > >
> > > It’s a good point, technically there’s nothing stopping users from
> > > using 25/50/56/100 Gbps HW.
> > >
> > > 25/50/56 Gb are not defined specifically as a port feature rate in the
> > > openflow specifications at this time so they would have to be defined
> as
> > > NETDEV_F_OTHER to correlate to the feature rate not being in the
> > > ofp_port feature list in openflow.
> > >
> > > The following incremental on the patch below should suffice:
> > >
> > > @@ -2735,9 +2735,21 @@ netdev_dpdk_get_features(const struct netdev *netdev,
> > >          if (link.link_speed == ETH_SPEED_NUM_10G) {
> > >              *current = NETDEV_F_10GB_FD;
> > >          }
> > > +        if (link.link_speed == ETH_SPEED_NUM_25G) {
> > > +            *current = NETDEV_F_OTHER;
> > > +        }
> > >          if (link.link_speed == ETH_SPEED_NUM_40G) {
> > >              *current = NETDEV_F_40GB_FD;
> > >          }
> > > +        if (link.link_speed == ETH_SPEED_NUM_50G) {
> > > +            *current = NETDEV_F_OTHER;
> > > +        }
> > > +        if (link.link_speed == ETH_SPEED_NUM_56G) {
> > > +            *current = NETDEV_F_OTHER;
> > > +        }
> > > +        if (link.link_speed == ETH_SPEED_NUM_100G) {
> > > +            *current = NETDEV_F_100GB_FD;
> > > +        }
> > >
> > > What are peoples thoughts? I can submit this as a separate patch if
> > > preferred.
>
> That's all the supported speeds OF defines as you mentioned, but there
> are free bits there if it's important to report a more accurate speed.
>
> Alternatively we could expose that information through get_status() as
> a translated string most probably.
>
> What worries me is that the 'current' variable is allocated on the stack and
> dpdk doesn't zero it like bsd or linux does, so if the speed falls outside
> those values, it will use whatever was on the stack before, as reported
> originally.
>
> Perhaps we could do:
>
>     uint32_t feat = 0;
>
>     if (link.link_duplex == ETH_LINK_FULL_DUPLEX) {
>         switch (link.link_speed) {
>         /* OpenFlow defined values: see enum ofp_port_features */
>         [...]
>         case ETH_SPEED_NUM_10G:
>             feat |= NETDEV_F_10GB_FD;
>             break;
>         case ETH_SPEED_NUM_40G:
>             feat |= NETDEV_F_40GB_FD;
>             break;
>         case ETH_SPEED_NUM_100G:
>             feat |= NETDEV_F_100GB_FD;
>             break;
>         default:
>             feat |= NETDEV_F_OTHER;
>         }
>     } else if (link.link_duplex == ETH_LINK_HALF_DUPLEX) {
>         [...]
>     }
>
>     if (link.link_autoneg) {
>         feat |= NETDEV_F_AUTONEG;
>     }
>
>     *current = feat;
>     *advertised = *supported = *peer = 0;
>
>     return 0;
>
>
> fbl
>
> > >
> > > Thanks
> > > Ian
> > >
> > >
> > > >
> > > > On Wed, 22 Aug 2018 at 16:28, Stokes, Ian <ian.sto...@intel.com> wrote:
> > > >
> > > >  > In the scenario of XL710, the link speed which is stored in the
> > > >  > Interface table is not 40G. Because the query of link speed is only
> > > >  > implemented up to 10G, the parameter 'current' will be a random
> > > >  > value at higher link speeds. In this case, an incorrect link speed
> > > >  > for the XL710 NIC will be stored in the database.
> > > >  >
> > > >
> > > > Good catch, I've tested a

Re: [ovs-dev] netdev-dpdk: Support the link speed of XL710

2018-08-30 Thread Federico Iezzi
Any comment here?
This seems like a very easy commit :-)



On Thu, 23 Aug 2018 at 13:34, Ian Stokes  wrote:

> On 8/22/2018 6:14 PM, Federico Iezzi wrote:
> > DPDK exposes API all the way from 10Mbps to 100Gbps.
> > http://doc.dpdk.org/api/rte__ethdev_8h_source.html
> >
> > Can other cards be added? 25G is now getting really popular.
> >
> > Thanks
>
> It’s a good point, technically there’s nothing stopping users from using
> 25/50/56/100 Gbps HW.
>
> 25/50/56 Gb are not defined specifically as a port feature rate in the
> openflow specifications at this time so they would have to be defined as
> NETDEV_F_OTHER to correlate to the feature rate not being in the
> ofp_port feature list in openflow.
>
> The following incremental on the patch below should suffice:
>
> @@ -2735,9 +2735,21 @@ netdev_dpdk_get_features(const struct netdev *netdev,
>          if (link.link_speed == ETH_SPEED_NUM_10G) {
>              *current = NETDEV_F_10GB_FD;
>          }
> +        if (link.link_speed == ETH_SPEED_NUM_25G) {
> +            *current = NETDEV_F_OTHER;
> +        }
>          if (link.link_speed == ETH_SPEED_NUM_40G) {
>              *current = NETDEV_F_40GB_FD;
>          }
> +        if (link.link_speed == ETH_SPEED_NUM_50G) {
> +            *current = NETDEV_F_OTHER;
> +        }
> +        if (link.link_speed == ETH_SPEED_NUM_56G) {
> +            *current = NETDEV_F_OTHER;
> +        }
> +        if (link.link_speed == ETH_SPEED_NUM_100G) {
> +            *current = NETDEV_F_100GB_FD;
> +        }
>
> What are peoples thoughts? I can submit this as a separate patch if
> preferred.
>
> Thanks
> Ian
>
>
> >
> > On Wed, 22 Aug 2018 at 16:28, Stokes, Ian <ian.sto...@intel.com> wrote:
> >
> >  > In the scenario of XL710, the link speed which is stored in the
> >  > Interface table is not 40G. Because the query of link speed is only
> >  > implemented up to 10G, the parameter 'current' will be a random
> >  > value at higher link speeds. In this case, an incorrect link speed
> >  > for the XL710 NIC will be stored in the database.
> >  >
> >
> > Good catch, I've tested and it works as expected. I'll add this to
> > the dpdk_merge pull request for this week and backport it to the
> > previous release branches also.
> >
> > Thanks
> > Ian
> >
> >  > Signed-off-by: Xu Binbin <xu.binb...@zte.com.cn>
> >  > ---
> >  >  lib/netdev-dpdk.c | 3 +++
> >  >  1 file changed, 3 insertions(+)
> >  >
> >  > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> >  > index ac02a09..e4b6ced 100644
> >  > --- a/lib/netdev-dpdk.c
> >  > +++ b/lib/netdev-dpdk.c
> >  > @@ -2735,6 +2735,9 @@ netdev_dpdk_get_features(const struct netdev *netdev,
> >  >          if (link.link_speed == ETH_SPEED_NUM_10G) {
> >  >              *current = NETDEV_F_10GB_FD;
> >  >          }
> >  > +        if (link.link_speed == ETH_SPEED_NUM_40G) {
> >  > +            *current = NETDEV_F_40GB_FD;
> >  > +        }
> >  >      }
> >  >
> >  >  if (link.link_autoneg) {
> >  > --
> >  > 1.8.3.1
> >  >


Re: [ovs-dev] netdev-dpdk: Support the link speed of XL710

2018-08-22 Thread Federico Iezzi
DPDK exposes API all the way from 10Mbps to 100Gbps.
http://doc.dpdk.org/api/rte__ethdev_8h_source.html

Can other cards be added? 25G is now getting really popular.

Thanks

On Wed, 22 Aug 2018 at 16:28, Stokes, Ian  wrote:

> > In the scenario of XL710, the link speed which is stored in the Interface
> > table is not 40G. Because the query of link speed is only implemented up to
> > 10G, the parameter 'current' will be a random value at higher link speeds.
> > In this case, an incorrect link speed for the XL710 NIC will be stored in
> > the database.
> >
>
> Good catch, I've tested and it works as expected. I'll add this to the
> dpdk_merge pull request for this week and backport it to the previous
> release branches also.
>
> Thanks
> Ian
>
> > Signed-off-by: Xu Binbin 
> > ---
> >  lib/netdev-dpdk.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> > index ac02a09..e4b6ced 100644
> > --- a/lib/netdev-dpdk.c
> > +++ b/lib/netdev-dpdk.c
> > @@ -2735,6 +2735,9 @@ netdev_dpdk_get_features(const struct netdev *netdev,
> >          if (link.link_speed == ETH_SPEED_NUM_10G) {
> >              *current = NETDEV_F_10GB_FD;
> >          }
> > +        if (link.link_speed == ETH_SPEED_NUM_40G) {
> > +            *current = NETDEV_F_40GB_FD;
> > +        }
> >      }
> >
> >  if (link.link_autoneg) {
> > --
> > 1.8.3.1
> >


Re: [ovs-dev] [PATCH] Improved Packet Drop Statistics in OVS.

2018-07-24 Thread Federico Iezzi
Hi all,

It seems Rohith Basavaraja's email has been disabled.
I believe it's too late for someone else to pick up Rohith's work, right?

Federico

On Mon, 11 Jun 2018 at 10:33, Stokes, Ian  wrote:

> > I'm very sorry for double posting.
> > I believe this patch would represent a major feature in terms of OVS's
> > usability.
> > Has anyone got any time for reviewing it?
> >
>
> I'm planning to look at it this week.
>
> Thanks
> Ian
>
> > Thanks,
> > Federico
> >
> > On 6 June 2018 at 17:37, Jan Scheurich 
> wrote:
> >
> > > The user-space part for packet drop stats should be generic and work
> > > with any dpif datapath.
> > > So, if someone implemented the equivalent drop stats functionality in
> > > the kernel datapath that would be very welcome.
> > > We in Ericsson cannot do that currently due to license restrictions.
> > >
> > > Regards, Jan
> > >
> > > > -----Original Message-----
> > > > From: ovs-dev-boun...@openvswitch.org
> > > > 
> > > On Behalf Of Rohith Basavaraja
> > > > Sent: Friday, 25 May, 2018 07:37
> > > > To: Ben Pfaff 
> > > > Cc: d...@openvswitch.org
> > > > Subject: Re: [ovs-dev] [PATCH] Improved Packet Drop Statistics in
> OVS.
> > > >
> > > > Thanks Ben for the clarification. Yes this new stuff is used only in
> > > > the DPDK datapath and it’s not used in the kernel datapath.
> > > >
> > > > Thanks
> > > > Rohith
> > > >
> > > > On 25/05/18, 2:52 AM, "Ben Pfaff"  wrote:
> > > >
> > > > On Thu, May 24, 2018 at 02:19:06AM +, Rohith Basavaraja wrote:
> > > > > Only  changes in
> > > > > datapath/linux/compat/include/linux/openvswitch.h
> > > > > are related to OvS Kernel module.
> > > >
> > > > On a second look, I see that the new stuff here is only for the DPDK
> > > > datapath.  If you don't intend to add this feature to the kernel
> > > > datapath, there should be no problem.  Never mind.
> > > >
> > > >


Re: [ovs-dev] [PATCH] Improved Packet Drop Statistics in OVS.

2018-07-10 Thread Federico Iezzi
I see it doesn't merge anymore; I have very little hope of seeing it in
2.10 ...

On Mon, 11 Jun 2018 at 10:33, Stokes, Ian  wrote:

> > I'm very sorry for double posting.
> > I believe this patch would represent a major feature in term of OVS's
> > usability.
> > Has anyone got any time for reviewing it?
> >
>
> I'm planning to look at it this week.
>
> Thanks
> Ian
>
> > Thanks,
> > Federico
> >
> > On 6 June 2018 at 17:37, Jan Scheurich 
> wrote:
> >
> > > The user-space part for packet drop stats should be generic and work
> > > with any dpif datapath.
> > > So, if someone implemented the equivalent drop stats functionality in
> > > the kernel datapath that would be very welcome.
> > > We in Ericsson cannot do that currently due to license restrictions.
> > >
> > > Regards, Jan
> > >
> > > > -----Original Message-----
> > > > From: ovs-dev-boun...@openvswitch.org
> > > > 
> > > On Behalf Of Rohith Basavaraja
> > > > Sent: Friday, 25 May, 2018 07:37
> > > > To: Ben Pfaff 
> > > > Cc: d...@openvswitch.org
> > > > Subject: Re: [ovs-dev] [PATCH] Improved Packet Drop Statistics in
> OVS.
> > > >
> > > > Thanks Ben for the clarification. Yes this new stuff is used only in
> > > > the DPDK datapath and it’s not used in the kernel datapath.
> > > >
> > > > Thanks
> > > > Rohith
> > > >
> > > > On 25/05/18, 2:52 AM, "Ben Pfaff"  wrote:
> > > >
> > > > On Thu, May 24, 2018 at 02:19:06AM +, Rohith Basavaraja wrote:
> > > > > Only  changes in
> > > > > datapath/linux/compat/include/linux/openvswitch.h
> > > > > are related to OvS Kernel module.
> > > >
> > > > On a second look, I see that the new stuff here is only for the DPDK
> > > > datapath.  If you don't intend to add this feature to the kernel
> > > > datapath, there should be no problem.  Never mind.
> > > >
> > > >


Re: [ovs-dev] [PATCH v2 1/2] Fix packet drops on LACP bond after link up

2018-07-03 Thread Federico Iezzi
Anyone got any time to review this patch?

Thanks!

On Mon, 25 Jun 2018 at 11:19, Manohar Krishnappa Chidambaraswamy <
manohar.krishnappa.chidambarasw...@ericsson.com> wrote:

> Hi Ben,
>
> Does this patch apply without issues?
>
> Would you be able to look at 2/2 of this series as well?
>
> Thanx
> Manu
>
> On 18/06/18, 2:05 PM, "ovs-dev-boun...@openvswitch.org on behalf of
> Manohar Krishnappa Chidambaraswamy"
> <manohar.krishnappa.chidambarasw...@ericsson.com> wrote:
>
> Ben,
>
> Here are the v2 diffs. Hope this applies without any issue.
>
> Thanx
> Manu
>
> Signed-off-by: Manohar K C
> 
> CC: Jan Scheurich 
> CC: Nitin Katiyar 
> ---
> v1 1/2: https://patchwork.ozlabs.org/patch/915285/
> v2 1/2: Rebased to master
>
>  lib/lacp.c   | 14 --
>  lib/lacp.h   |  3 ++-
>  ofproto/bond.c   | 18 +++---
>  ofproto/ofproto-dpif-xlate.c | 13 -
>  4 files changed, 41 insertions(+), 7 deletions(-)
>
> diff --git a/lib/lacp.c b/lib/lacp.c
> index d6b36aa..9e43e06 100644
> --- a/lib/lacp.c
> +++ b/lib/lacp.c
>  static struct slave *slave_lookup(const struct lacp *, const void *slave)
>      OVS_REQUIRES(mutex);
>  static bool info_tx_equal(struct lacp_info *, struct lacp_info *)
>  OVS_REQUIRES(mutex);
> +static bool slave_may_enable__(struct slave *slave) OVS_REQUIRES(mutex);
>
>  static unixctl_cb_func lacp_unixctl_show;
>  static unixctl_cb_func lacp_unixctl_show_stats;
> @@ -324,8 +325,8 @@ lacp_is_active(const struct lacp *lacp) OVS_EXCLUDED(mutex)
>  /* Processes 'packet' which was received on 'slave_'.  This function should be
>   * called on all packets received on 'slave_' with Ethernet Type ETH_TYPE_LACP.
>   */
> -void
> -lacp_process_packet(struct lacp *lacp, const void *slave_,
> +bool
> +lacp_process_packet(struct lacp *lacp, const void *bond, const void *slave_,
>                      const struct dp_packet *packet)
>      OVS_EXCLUDED(mutex)
>  {
> @@ -333,6 +334,7 @@ lacp_process_packet(struct lacp *lacp, const void *slave_,
>  const struct lacp_pdu *pdu;
>  long long int tx_rate;
>  struct slave *slave;
> +    bool lacp_may_enable = false;
>
>  lacp_lock();
>  slave = slave_lookup(lacp, slave_);
> @@ -362,8 +364,16 @@ lacp_process_packet(struct lacp *lacp, const void *slave_,
>  slave->partner = pdu->actor;
>  }
>
> +    /*
> +     * Evaluate may_enable here to avoid dropping of packets till main
> +     * thread sets may_enable to true.
> +     */
> +
> +    lacp_may_enable = slave_may_enable__(slave);
>  out:
>  lacp_unlock();
> +
> +    return lacp_may_enable;
>  }
>
>  /* Returns the lacp_status of the given 'lacp' object (which may be
> NULL). */
> diff --git a/lib/lacp.h b/lib/lacp.h
> index f35cff5..1505c2c 100644
> --- a/lib/lacp.h
> +++ b/lib/lacp.h
> @@ -46,7 +46,8 @@ struct lacp *lacp_ref(const struct lacp *);
>  void lacp_configure(struct lacp *, const struct lacp_settings *);
>  bool lacp_is_active(const struct lacp *);
>
> -void lacp_process_packet(struct lacp *, const void *slave,
> +bool lacp_process_packet(struct lacp *, const void *bond,
> + const void *slave,
>   const struct dp_packet *packet);
>  enum lacp_status lacp_status(const struct lacp *);
>
> diff --git a/ofproto/bond.c b/ofproto/bond.c
> index f87cdba..5fc1e0e 100644
> --- a/ofproto/bond.c
> +++ b/ofproto/bond.c
> @@ -777,6 +777,7 @@ bond_check_admissibility(struct bond *bond, const void *slave_,
>  {
>  enum bond_verdict verdict = BV_DROP;
>  struct bond_slave *slave;
> +static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>
>  ovs_rwlock_rdlock();
>  slave = bond_slave_lookup(bond, slave_);
> @@ -794,7 +795,13 @@ bond_check_admissibility(struct bond *bond, const void *slave_,
>       * drop all incoming traffic except if lacp_fallback_ab is enabled. */
>      switch (bond->lacp_status) {
>      case LACP_NEGOTIATED:
> -        verdict = slave->enabled ? BV_ACCEPT : BV_DROP;
> +        /*
> +         * To reduce packet-drops due to delay in enabling of slave (post
> +         * LACP-SYNC), from main thread, check for may_enable as well.
> +         * When may_enable is TRUE, it means LACP is UP and waiting for
> +         * the main thread to run LACP state machine and enable the slave.
> +         */
> +        verdict = (slave->enabled || slave->may_enable) ? BV_ACCEPT : BV_DROP;
>          goto out;
>      case LACP_CONFIGURED:
>          if (!bond->lacp_fallback_ab) {
> 

Re: [ovs-dev] [PATCH] Improved Packet Drop Statistics in OVS.

2018-06-08 Thread Federico Iezzi
I'm very sorry for double posting.
I believe this patch would represent a major feature in terms of OVS's
usability.
Has anyone got any time for reviewing it?

Thanks,
Federico

On 6 June 2018 at 17:37, Jan Scheurich  wrote:

> The user-space part for packet drop stats should be generic and work with
> any dpif datapath.
> So, if someone implemented the equivalent drop stats functionality in the
> kernel datapath that would be very welcome.
> We in Ericsson cannot do that currently due to license restrictions.
>
> Regards, Jan
>
> > -----Original Message-----
> > From: ovs-dev-boun...@openvswitch.org 
> On Behalf Of Rohith Basavaraja
> > Sent: Friday, 25 May, 2018 07:37
> > To: Ben Pfaff 
> > Cc: d...@openvswitch.org
> > Subject: Re: [ovs-dev] [PATCH] Improved Packet Drop Statistics in OVS.
> >
> > Thanks Ben for the clarification. Yes this new stuff is used only in the
> > DPDK datapath and it’s not used in the kernel datapath.
> >
> > Thanks
> > Rohith
> >
> > On 25/05/18, 2:52 AM, "Ben Pfaff"  wrote:
> >
> > On Thu, May 24, 2018 at 02:19:06AM +, Rohith Basavaraja wrote:
> > > Only  changes in
> > > datapath/linux/compat/include/linux/openvswitch.h
> > > are related to OvS Kernel module.
> >
> > On a second look, I see that the new stuff here is only for the DPDK
> > datapath.  If you don't intend to add this feature to the kernel
> > datapath, there should be no problem.  Never mind.
> >
> >
> > ___
> > dev mailing list
> > d...@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] Improved Packet Drop Statistics in OVS.

2018-05-25 Thread Federico Iezzi
Hello everyone,

May I ask you to provide some (even brief) documentation about each specific
drop counter?
That would really help in understanding why a specific drop is happening.
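For reference, OVS's usual mechanism for this kind of counter is the
coverage framework; a counter is declared and bumped with the
COVERAGE_DEFINE()/COVERAGE_INC() macros from lib/coverage.h, roughly like
this (the counter name below is made up):

    #include "coverage.h"

    COVERAGE_DEFINE(drop_example);      /* shows up in coverage/show */

    static void
    handle_bad_packet(void)
    {
        COVERAGE_INC(drop_example);     /* one more drop of this kind */
    }

Even a one-line comment next to each COVERAGE_DEFINE would document it.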

Thanks,
Federico

On 25 May 2018 at 07:44, Rohith Basavaraja 
wrote:

> Hi Greg,
>
> Thanks a lot for sharing the info.
> Ben clarified that it should be ok if this new stuff is used only in the
> DPDK datapath and it’s not used in the kernel datapath.
>
> Thanks
> Rohith
>
>
>
> On 24/05/18, 9:05 PM, "Gregory Rose"  wrote:
>
> On 5/23/2018 7:19 PM, Rohith Basavaraja wrote:
> > Hi Ben,
> >
> > Only  changes in
> > datapath/linux/compat/include/linux/openvswitch.h
> > are related to OvS Kernel module.
> >
> > The changes in openvswitch.h are not upstreamed yet.
> > Can you please let me know the procedure for submitting the
> > changes for the OvS kernel module?
> >
> > In the meanwhile can I get code review feedback for the changes, while I
> > figure out how to commit changes in
> > datapath/linux/compat/include/linux/openvswitch.h.
> >
> > Thanks
> > Rohith
>
> You need to submit changes to openvswitch.h to the Linux network mail
> list.
>
> https://www.kernel.org/doc/Documentation/networking/netdev-FAQ.txt
>
> After acceptance upstream we can then pull in the change.
>
> And this might help as well.
>
> https://www.kernel.org/doc/html/v4.16/process/submitting-patches.html
>
> Thanks,
>
> - Greg
>
> >
> >
> > On 24/05/18, 2:03 AM, "Ben Pfaff"  wrote:
> >
> >  I see that this adds code to the OVS kernel module.  Are these changes
> >  upstream already in Linux?  Usually we use an "upstream first"
> >  philosophy regarding the kernel module.
> >
> >


[ovs-dev] About missing news entry for LSC backports

2018-05-13 Thread Federico Iezzi
Hey there,

As per subject, you may want to write something down in the NEWS as
part of the next dpdk_merge for the recent LSC backport in the
branches 2.7, 2.8, and 2.9 as well.

As you know, LSC provides a quite important workaround for the
Fortville link check, and an entry in the NEWS would give it more
visibility.

Cheers,
Federico


Re: [ovs-dev] [PATCH v6 0/6] OVS-DPDK rxq to pmd assignment improvements.

2017-09-11 Thread Federico Iezzi
On Fri, Aug 25, 2017 at 10:36 AM, Darrell Ball  wrote:
> I applied the series to https://github.com/darball/ovs/commits/dpdk_merge

I believe this series is worth a mention in the NEWS, as it reengineers an
important OVS-DPDK limitation.

>
>
>
> On 8/24/17, 4:37 PM, "Kevin Traynor"  wrote:
>
> For the DPDK datapath, by default rxqs are assigned to available pmds
> in round robin order with no weight or priority.
>
> It can happen that some very busy queues are handled by one pmd which
> does not have enough cycles to prevent packets being dropped on them.
> While at the same time another pmd which handles queues with no traffic
> on them is essentially idle.
>
> Rxq to pmd assignment happens as a result of a number of events and
> when it does, the same unweighted round robin approach is applied
> each time.
>
> This patchset proposes to improve the round robin nature of rxq to pmd
> assignment by counting the processing cycles used by the rxqs during
> their operation and incorporating that data into assignment.
>
> Before assigning in a round robin manner, the rxqs will be sorted in
> order of the processing cycles they have been consuming. Assuming
> multiple pmds, this ensures that the rxqs measured to be using the
> most processing cycles will be assigned to different cores.
>
> In some cases the measured cycles for an rxq may not be available, as
> the rxq is new, or may not be useful for assignment, as traffic patterns
> may change.  In those cases the code will essentially fall back to a
> round robin similar to what currently exists. However, in the case
> where data is available and is a reliable indication of future rxq cycle
> consumption, rxq to pmd distribution will be much improved.
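To make the scheme concrete, the assignment step boils down to something
like the following illustrative sketch, with made-up types (not the
patch's code):

    #include <stdint.h>
    #include <stdlib.h>

    struct rxq_sched { uint64_t cycles; int pmd_id; };

    /* Sort descending by measured processing cycles. */
    static int
    rxq_cycles_cmp(const void *a_, const void *b_)
    {
        const struct rxq_sched *a = a_, *b = b_;
        return (a->cycles < b->cycles) - (a->cycles > b->cycles);
    }

    /* Assumes n_pmds > 0; the busiest queues land on distinct PMDs first. */
    static void
    rxq_round_robin(struct rxq_sched *rxqs, size_t n_rxqs, int n_pmds)
    {
        qsort(rxqs, n_rxqs, sizeof rxqs[0], rxq_cycles_cmp);
        for (size_t i = 0; i < n_rxqs; i++) {
            rxqs[i].pmd_id = (int) (i % (size_t) n_pmds);
        }
    }

The patch's actual PMD selection order (patch 5/6) is more refined, but
sorted-then-round-robin is the core of it.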
>
> V5 -> V6
> 
> Minor changes to 2/6, 3/6, 4/6, 5/6 from Darrell's review comments.
>
> V4 -> V5
> 
> Changed history of rxq considered during assignment to 1 min. In order
> to have data available quicker than 1 min and not to be using up to
> 1 min old data, introduced storing of data in multiple intervals
> similar to suggestion by Darrell. Some minor name changes to reflect
> this.
>
> 2/6 Added storage for multiple intervals
> 3/6 Store cycles into interval
> 4/6 Sum cycles from intervals and use total for assignment
>
> V3 -> V4
> 
> 4/6
> Rebased to accomodate new cross numa assigment.
>
> V2 -> V3
> 
> Dropped v2 1/7 as not reusing dpcls optimisation interval anymore
>
> 2/6
> Moved unused functions to 3/6 to avoid compiler warning
>
> 3/6
> Made pmd rxq interval independent from dpcls opt interval
>
> 4/6
> Moved docs about rebalance command to when it is available in 6/6
> Added logging info for pmd to rxq assignment
>
> 5/6
> Added an example to docs
>
> 6/6
> Noted in commit msg that Jan requested this for testing purposes
>
> V1 -> V2
> 
> Dropped Ciara's patch to change how pmd cycles are counted as it merged.
>
> 6/7: Rebased unit tests.
>
>
> Kevin Traynor (6):
>   dpif-netdev: Change polled_queue to use dp_netdev_rxq.
>   dpif-netdev: Add rxq processing cycle counters.
>   dpif-netdev: Count the rxq processing cycles for an rxq.
>   dpif-netdev: Change rxq_scheduling to use rxq processing cycles.
>   dpif-netdev: Change pmd selection order.
>   dpif-netdev: Add ovs-appctl dpif-netdev/pmd-rxq-rebalance.
>
>  Documentation/howto/dpdk.rst |  26 
>  lib/dpif-netdev.c| 298 ---
>  tests/pmd.at |   2 +-
>  vswitchd/ovs-vswitchd.8.in   |   2 +
>  4 files changed, 284 insertions(+), 44 deletions(-)
>
> --
> 1.8.3.1
>
>
>
>
>


Re: [ovs-dev] [PATCH ovs V8 00/26] Introducing HW offload support for openvswitch

2017-05-03 Thread Federico Iezzi
On Wed, May 3, 2017 at 5:07 PM, Roi Dayan  wrote:
> This patch series introduces rule offload functionality to dpif-netlink
> via netdev ports new flow offloading API. The user can specify whether to
> enable rule offloading or not via OVS configuration. Netdev providers
> are able to implement netdev flow offload API in order to offload rules.
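(Per the series description, the on/off switch is the hw-offload key in
other_config, i.e. something like
'ovs-vsctl set Open_vSwitch . other_config:hw-offload=true', while the
tc-policy option controls the flower skip_hw/skip_sw behaviour; see the
V1->V2 and V2->V3 notes below.)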
>
> This patch series also implements one offload scheme for netdev-linux,
> using the TC flower classifier, which was chosen because it's sort of natural
> to state OVS DP rules for this classifier. However, the code can be
> extended to support other classifiers such as U32, eBPF, etc which support
> offload as well.
>
> The use-case we are currently addressing is the newly sriov switchdev mode
> in the Linux kernel which was introduced in version 4.8.
> This series was tested against sriov vfs vports representors of the
> Mellanox 100G ConnectX-4 series exposed by the mlx5 kernel driver.
>
>
> V7->V8
> - Refactor dpif logging functions and use them in dpif-netlink
> - Ignore internal devices from netdev hashmap
> - Refactor netdev hmap naming to prefix netdev_ports_*
> - Use single hashmap with 2 nodes for ufid/tc mapping
> - Verify ifindex is valid in netdev_hmap_port_add
> - Close netdev in netdev_tc_flow_get error flow
> - Improve comments for flow offload api
> - Reorder flow api output args to be last args
> - Remove redundant netdev_flow_support
> - Fix using uninitialized var 's'
>
> Not done:
> refactor netdev_ports_* functions to accept a typed obj
> (e.g. netdev_ports struct) instead of void*.
> We can do it as a follow-up commit later.
>
> V6->V7:
> - Fix L3 IPv4 matching got broken
> - Refactor offloads test and testsuite to have same prefix
> - Better handling of unsupported match attributes
>
> V5->V6:
> - Rebase over master branch, fix compilation issue
> - Add Nicira copyright to tc interface
>
> V4->V5:
> - Fix compat
> - Fix VXLAN IPv6 tunnel matching
> - Fix order of actions in dump flows
> - Update ovs-dpctl man page about the addtion of type to dump-flows
>
> Travis
> https://travis-ci.org/roidayan/ovs/builds/213735371
> AppVeyor
> https://ci.appveyor.com/project/roidayan/ovs/build/1.0.18
>
> V3->V4:
> - Move declarations to the right commit with implementation
> - Fix tc_get_flower flow return false success
> - Fix memory leaks - not releasing tc_transact replies
> - Fix travis failure for OSX compilation
> - Fix loop in dpif_netlink_flow_dump_next
> - Fix declared default value for tc-policy in vswitch.xml
> - Refactor loop in netdev_tc_flow_dump_next
> - Add missing error checks in parse_flow_put
> - Fix handling modify request where old rule is in hw and new
>   rule is not supported and needs to be in sw.
> - Use 2 hashmaps instead of 1 for faster reverse lookup of ufid from tc
> - Init ports when enabling hw-offload after OVS is running
>
> TODO: Fix breaking of datapath compilation
>   Fix testsuite failures
>
> Travis
> https://travis-ci.org/roidayan/ovs/builds/210549325
> AppVeyor
> https://ci.appveyor.com/project/roidayan/ovs/build/1.0.15
>
> V2->V3:
> - Code styling fixes
> - Bug fixes
> - Using already available macros/functions to match current OVS code
> - Refactored code according to V2 review
> - Replaced bool option skip-hw for string option tc-policy
> - Added hw offload tests using policy skip_hw
> - Fixed most compatibility compiling issues
> - Travis
> https://travis-ci.org/roidayan/ovs/builds/199610124
> - AppVeyor
> https://ci.appveyor.com/project/roidayan/ovs/build/1.0.14
> - Fixed compiling with DPDK enabled
>
> TODO:
> - need to fix datapath compiling issues found in travis after adding tc
>   compatibility headers
> - need to fix failing test cases because of get_ifindex
>
> V1->V2:
> - Added generic netdev flow offloads API.
> - Implemented relevant flow API in netdev-linux (and netdev-vport).
> - Added an other_config hw-offload option to enable offloading (defaults to false).
> - Fixed coding style to conform with OVS.
> - Policy removed for now. (Will be discussed how best implemented later).
>
>
> Thanks,
> Paul & Roi
>
>
> Paul Blakey (24):
>   tc: Add tc flower interface
>   netdev: Adding a new netdev api to be used for offloading flows
>   other-config: Add hw-offload switch to control netdev flow offloading
>   other-config: Add tc-policy switch to control tc flower flag
>   dpif: Save added ports in a port map for netdev flow api use
>   dpif-netlink: Flush added ports using netdev flow api
>   netdev-tc-offloads: Implement netdev flow flush using tc interface
>   dpif-netlink: Dump netdevs flows on flow dump
>   netdev-tc-offloads: Add ufid to tc/netdev map
>   netdev-tc-offloads: Implement netdev