Re: [ovs-discuss] max mega flow 64k per pmd or per dpcls?

2017-07-03 Thread Hui Xiang
Your help is greatly appreciated; thanks, Bodireddy.

On Mon, Jul 3, 2017 at 10:57 PM, Bodireddy, Bhanuprakash <
bhanuprakash.bodire...@intel.com> wrote:

> >>What is your use case(s)?
> >>My use case is to set up a vBRAS VNF with OVS-DPDK, a typical NFV
> >>deployment, and it requires good performance. However, OVS-DPDK still
> >>does not seem to meet its needs compared with hardware offloading, so we
> >>are evaluating VPP as well.
> >As you mentioned VPP here, it's worth looking at the benchmarks comparing
> >OvS and VPP for the L3-VPN use case, carried out by Intel and Ericsson and
> >presented at the OvS Fall conference.
> >The slides can be found at
> >http://openvswitch.org/support/ovscon2016/8/1400-gray.pdf.
> >On page 12 of that PDF, why does the classifier show constant throughput
> >with increasing concurrent L4 flows? Shouldn't performance degrade with
> >more subtable lookups, as you mentioned?
>
> You raised a good point. The reason is the 'sorted subtable ranking'
> implementation in the 2.7 release.
> With this, the subtable vector is kept sorted by frequency of hits, which
> reduces the number of subtable lookups per packet.
> That is why I asked for the 'avg. subtable lookups per hit:' number.
>
> I recommend watching the video of the presentation here
> https://www.youtube.com/watch?v=cxRcfn2x4eE , as the
> bottlenecks you are referring to in this thread are more or less similar to
> the ones discussed at the conference.
>
> - Bhanuprakash.
>


Re: [ovs-discuss] max mega flow 64k per pmd or per dpcls?

2017-07-03 Thread Bodireddy, Bhanuprakash
>>What is your use case(s)?
>>My use case is to set up a vBRAS VNF with OVS-DPDK, a typical NFV
>>deployment, and it requires good performance. However, OVS-DPDK still does
>>not seem to meet its needs compared with hardware offloading, so we are
>>evaluating VPP as well.
>As you mentioned VPP here, it's worth looking at the benchmarks comparing
>OvS and VPP for the L3-VPN use case, carried out by Intel and Ericsson and
>presented at the OvS Fall conference.
>The slides can be found at
>http://openvswitch.org/support/ovscon2016/8/1400-gray.pdf.
>On page 12 of that PDF, why does the classifier show constant throughput
>with increasing concurrent L4 flows? Shouldn't performance degrade with more
>subtable lookups, as you mentioned?

You raised a good point. The reason is the 'sorted subtable ranking'
implementation in the 2.7 release.
With this, the subtable vector is kept sorted by frequency of hits, which
reduces the number of subtable lookups per packet.
That is why I asked for the 'avg. subtable lookups per hit:' number.
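(For reference, a quick way to read that counter; a value of 'avg. subtable
lookups per hit' close to 1 means the ranking keeps most lookups in the
first-ranked subtable:)

    # Per-PMD statistics, including 'avg. subtable lookups per hit'
    ovs-appctl dpif-netdev/pmd-stats-show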

I recommend watching the video of the presentation here
https://www.youtube.com/watch?v=cxRcfn2x4eE , as the
bottlenecks you are referring to in this thread are more or less similar to
the ones discussed at the conference.

- Bhanuprakash.


Re: [ovs-discuss] max mega flow 64k per pmd or per dpcls?

2017-07-03 Thread Hui Xiang
Thanks very much again, Bodireddy! Comments inline.

On Mon, Jul 3, 2017 at 5:00 PM, Bodireddy, Bhanuprakash <
bhanuprakash.bodire...@intel.com> wrote:

> It’s a long weekend in the US, and I will try to answer some of your
> questions in Darrell's absence.
>
> >Why do you think having more than 64k per PMD would be optimal?
> >I originally thought that the bottleneck is in the classifier because it is
> >saturated, so lookups have to go to the flow table; that is why I thought,
> >why not just increase the dpcls flows per PMD. But it seems I was wrong,
> >based on your explanation.
>
> For a few use cases, much of the bottleneck moves to the classifier when the
> EMC is saturated. You may have to add more PMD threads (again, this depends
> on the availability of cores in your case).
> As your initial investigation showed the classifier is the bottleneck, I am
> just curious about a few things:
>  - In the 'dpif-netdev/pmd-stats-show' output, what does the
>    'avg. subtable lookups per hit:' value look like?
>  - In steady state, does 'dpcls_lookup()' top the list of functions in
>    'perf top'?
>
Those are great suggestions; I'll check further.

> >What is your use case(s)?
> >My use case is to set up a vBRAS VNF with OVS-DPDK, a typical NFV
> >deployment, and it requires good performance. However, OVS-DPDK still
> >does not seem to meet its needs compared with hardware offloading, so we
> >are evaluating VPP as well.
> As you mentioned VPP here, it's worth looking at the benchmarks comparing
> OvS and VPP for the L3-VPN use case, carried out by Intel and Ericsson and
> presented at the OvS Fall conference.
> The slides can be found at
> http://openvswitch.org/support/ovscon2016/8/1400-gray.pdf.
>
On page 12 of that PDF, why does the classifier show constant throughput with
increasing concurrent L4 flows? Shouldn't performance degrade with more
subtable lookups, as you mentioned?

>
> >Basically I am looking to find out what the bottleneck is so far in
> >OVS-DPDK (it seems to be in flow lookup), and whether there are solutions
> >being discussed or in progress.
>
> I personally did some investigation in this area. One of the bottlenecks
> in the classifier is the sub-table lookup.
> Murmur hash is used in OvS, and it is recommended to enable intrinsics with
> -march=native or CFLAGS="-msse4.2" if you haven't already.
> If you have more subtables, the lookups may be taking significant cycles.
> I presume you are using OvS 2.7; some optimizations were done there to
> improve classifier performance (subtable ranking, hash optimizations).
> If emc_lookup()/emc_insert() show up in the top 5 functions taking
> significant cycles, it is worth disabling EMC as below:
>   'ovs-vsctl set Open_vSwitch . other_config:emc-insert-inv-prob=0'
>
Thanks much for your advice.

>
> >Do you want this number to be larger by default?
> >I am not sure; I need to understand whether it is good or bad to set it
> >larger.
> >Do you want this number to be configurable?
> >Probably; that would be good.
> >
> >BTW, after reading part of the DPDK documentation, it stresses reducing
> >copies between cache and memory and getting as many cache hits as possible,
> >so that fewer CPU cycles are spent fetching data. But now I am totally lost
> >on how the OVS-DPDK EMC and classifier map onto the LLC.
>
> I didn't get your question here. A PMD is like any other thread and has an
> EMC, and a classifier per ingress port.
> The EMC, classifier subtables and other data structures will make it to the
> LLC when accessed.
>
ACK.

>
> As already mentioned, using RDT Cache Allocation Technology (CAT) one can
> assign cache ways to high-priority threads:
> https://software.intel.com/en-us/articles/introduction-to-cache-allocation-technology
>
> - Bhanuprakash.
>
>


Re: [ovs-discuss] max mega flow 64k per pmd or per dpcls?

2017-07-03 Thread Bodireddy, Bhanuprakash
It’s a long weekend in the US, and I will try to answer some of your questions
in Darrell's absence.

>Why do you think having more than 64k per PMD would be optimal?
>I originally thought that the bottleneck is in the classifier because it is
>saturated, so lookups have to go to the flow table; that is why I thought, why
>not just increase the dpcls flows per PMD. But it seems I was wrong, based on
>your explanation.

For a few use cases, much of the bottleneck moves to the classifier when the
EMC is saturated. You may have to add more PMD threads (again, this depends on
the availability of cores in your case).
As your initial investigation showed the classifier is the bottleneck, I am
just curious about a few things (a quick sketch of both checks follows):
 - In the 'dpif-netdev/pmd-stats-show' output, what does the
   'avg. subtable lookups per hit:' value look like?
 - In steady state, does 'dpcls_lookup()' top the list of functions in
   'perf top'?
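(A minimal sketch of both checks, assuming perf is installed on the host and
ovs-vswitchd runs as a single process:)

    # Reset the per-PMD counters, let traffic run, then read them back
    ovs-appctl dpif-netdev/pmd-stats-clear
    ovs-appctl dpif-netdev/pmd-stats-show
    # Sample the hottest functions in the vswitchd process and check whether
    # dpcls_lookup() dominates the profile
    perf top -p $(pidof ovs-vswitchd)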

>What is your use case(s)?
>My use case is to set up a vBRAS VNF with OVS-DPDK, a typical NFV deployment,
>and it requires good performance. However, OVS-DPDK still does not seem to
>meet its needs compared with hardware offloading, so we are evaluating VPP as
>well.
As you mentioned VPP here, it's worth looking at the benchmarks comparing OvS
and VPP for the L3-VPN use case, carried out by Intel and Ericsson and
presented at the OvS Fall conference.
The slides can be found at
http://openvswitch.org/support/ovscon2016/8/1400-gray.pdf.

>Basically I am looking to find out what the bottleneck is so far in OVS-DPDK
>(it seems to be in flow lookup), and whether there are solutions being
>discussed or in progress.

I personally did some investigation in this area. One of the bottlenecks in
the classifier is the sub-table lookup.
Murmur hash is used in OvS, and it is recommended to enable intrinsics with
-march=native or CFLAGS="-msse4.2" if you haven't already.
If you have more subtables, the lookups may be taking significant cycles. I
presume you are using OvS 2.7; some optimizations were done there to improve
classifier performance (subtable ranking, hash optimizations).
If emc_lookup()/emc_insert() show up in the top 5 functions taking significant
cycles, it is worth disabling EMC as below:
  'ovs-vsctl set Open_vSwitch . other_config:emc-insert-inv-prob=0'
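(A sketch of both suggestions; the build flags are the usual ones for a
from-source build, and the second ovs-vsctl line simply restores the default
insertion probability:)

    # Build OvS with SSE4.2/native intrinsics so the hardware CRC32-based
    # hash can be used instead of the plain murmur hash
    ./configure CFLAGS="-g -O2 -msse4.2"    # or CFLAGS="-g -O2 -march=native"
    make && make install

    # Disable EMC insertion at runtime (inverse probability 0 = never insert)
    ovs-vsctl set Open_vSwitch . other_config:emc-insert-inv-prob=0
    # Restore the default (insert roughly 1 in every 100 flows)
    ovs-vsctl set Open_vSwitch . other_config:emc-insert-inv-prob=100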

>Do you want this number to be larger by default?
>I am not sure; I need to understand whether it is good or bad to set it larger.
>Do you want this number to be configurable?
>Probably; that would be good.
>
>BTW, after reading part of the DPDK documentation, it stresses reducing copies
>between cache and memory and getting as many cache hits as possible, so that
>fewer CPU cycles are spent fetching data. But now I am totally lost on how the
>OVS-DPDK EMC and classifier map onto the LLC.

I didn't get your question here. A PMD is like any other thread and has an EMC,
and a classifier per ingress port.
The EMC, classifier subtables and other data structures will make it to the LLC
when accessed.

As already mentioned, using RDT Cache Allocation Technology (CAT) one can
assign cache ways to high-priority threads:
https://software.intel.com/en-us/articles/introduction-to-cache-allocation-technology
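(As an illustration of that approach with the pqos tool from Intel's
intel-cmt-cat package; the class-of-service ID, way bitmask and core numbers
below are made-up example values:)

    # Define class of service 1 with four dedicated LLC ways (bitmask 0x000f)
    pqos -e "llc:1=0x000f"
    # Associate the PMD cores (here cores 2 and 3) with that class
    pqos -a "llc:1=2,3"
    # Show the resulting allocation and association
    pqos -s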

- Bhanuprakash.



Re: [ovs-discuss] max mega flow 64k per pmd or per dpcls?

2017-06-30 Thread Hui Xiang
Thanks, Darrell; comments inline.

On Sat, Jul 1, 2017 at 1:02 AM, Darrell Ball  wrote:

>
>
>
>
> From: Hui Xiang
> Date: Thursday, June 29, 2017 at 6:57 PM
> To: Darrell Ball
> Cc: "Bodireddy, Bhanuprakash", "ovs-discuss@openvswitch.org"
> Subject: Re: [ovs-discuss] max mega flow 64k per pmd or per dpcls?
>
>
>
> I am interested in how 'reasonable' is defined here, how it was arrived at,
> and what the 'many cases' are. Is there any document or link with this
> information? Please shed some light on it for me.
>
>
>
> It is based on real usage scenarios for the number of megaflows needed.
>
> The usage may be less in most cases.
>
> In cases where a larger number is needed, it may imply that more threads,
> dividing the work among the queues, would be better.
>
Yes, more threads are better, but the overall number of cores is limited: the
more threads are pinned to cores for OVS-DPDK, the fewer are available for VMs.

>
>
> Why do you think having more than 64k per PMD would be optimal?
>
I originally thought that the bottleneck is in the classifier because it is
saturated, so lookups have to go to the flow table; that is why I thought, why
not just increase the dpcls flows per PMD. But it seems I was wrong, based on
your explanation.

> What is your use case(s) ?
>
My use case is to set up a vBRAS VNF with OVS-DPDK, a typical NFV deployment,
and it requires good performance. However, OVS-DPDK still does not seem to
meet its needs compared with hardware offloading, so we are evaluating VPP as
well. Basically I am trying to find out what the bottleneck is so far in
OVS-DPDK (it seems to be in flow lookup), and whether there are solutions
being discussed or in progress.

> Do you want this number to be larger by default?
>
I am not sure; I need to understand whether it is good or bad to set it
larger.

> Do you want this number to be configurable?
>
Probably; that would be good.

>
>
BTW, after reading part of the DPDK documentation, it stresses reducing copies
between cache and memory and getting as many cache hits as possible, so that
fewer CPU cycles are spent fetching data. But now I am totally lost on how the
OVS-DPDK EMC and classifier map onto the LLC.

>
>
> On Thu, Jun 29, 2017 at 10:47 PM, Darrell Ball  wrote:
>
> Q: “how it is calculated in such an exact number? “
>
> A: It is a reasonable number to accommodate many cases.
>
> Q: “If there are more ports added for polling, for avoid competing can I
> increase the 64k size into a
> bigger one?”
>
> A: If a larger number is needed, it may imply that adding another PMD and
> dividing the forwarding
> work would be best.  Maybe even a smaller number of flows may be best
> served with more PMDs.
>
>
>
>
>
>
> On 6/29/17, 7:23 AM, "ovs-discuss-boun...@openvswitch.org on behalf of
> Bodireddy, Bhanuprakash" <bhanuprakash.bodire...@intel.com> wrote:
>
> >
>
> >I guess the answer is now the general LLC is 2.5M per core so that
> there is 64k
>
> >flows per thread.
>
>
>
> AFAIK, the no. of flows here may not have to do anything with LLC.
> Also there is EMC cache(8k entries) of ~4MB per PMD thread.
>
>
>
>
>
> Yes the performance will be nice with simple test cases (P2P with 1
> PMD thread) as most of this fits in to LLC. But in real scenarios  OvS-DPDK
> can be memory bound.
>
>
>
> BTW, on my DUT the LLC is 35MB and has 28 cores and so the assumption
> of 2.5M/core isn't right.
>
>
>
> - Bhanuprakash.
>
>
>
> >
>
> >On Fri, Jun 23, 2017 at 11:15 AM, Hui Xiang 
> wrote:
>
> >Thanks Darrell,
>
> >
>
> >More questions:
>
> >Why not allocating 64k for each dpcls? does the 64k just fit in L3
> cache or
>
> >anywhere? how it is calculated in such an exact number?  If there are
> more
>
> >ports added for polling, for avoid competing can I increase the 64k
> size into a
>
> >bigger one? Thanks.
>
> >
>
> >Hui.
>
> >
>
> >
>
>
>
>
>


Re: [ovs-discuss] max mega flow 64k per pmd or per dpcls?

2017-06-30 Thread Darrell Ball


From: Hui Xiang 
Date: Thursday, June 29, 2017 at 6:57 PM
To: Darrell Ball 
Cc: "Bodireddy, Bhanuprakash" , 
"ovs-discuss@openvswitch.org" 
Subject: Re: [ovs-discuss] max mega flow 64k per pmd or per dpcls?

I am interested in how 'reasonable' is defined here, how it was arrived at, and
what the 'many cases' are. Is there any document or link with this information?
Please shed some light on it for me.

It is based on real usage scenarios for the number of megaflows needed.
Actual usage may be less in most cases.
In cases where a larger number is needed, it may imply that more threads,
dividing the work among the queues, would be better.

Why do you think having more than 64k per PMD would be optimal?
What is your use case(s)?
Do you want this number to be larger by default?
Do you want this number to be configurable?


On Thu, Jun 29, 2017 at 10:47 PM, Darrell Ball <db...@vmware.com> wrote:
Q: “how it is calculated in such an exact number? “

A: It is a reasonable number to accommodate many cases.

Q: “If there are more ports added for polling, for avoid competing can I 
increase the 64k size into a
bigger one?”

A: If a larger number is needed, it may imply that adding another PMD and 
dividing the forwarding
work would be best.  Maybe even a smaller number of flows may be best served 
with more PMDs.





On 6/29/17, 7:23 AM, "ovs-discuss-boun...@openvswitch.org on behalf of
Bodireddy, Bhanuprakash" <bhanuprakash.bodire...@intel.com> wrote:

>

>I guess the answer is now the general LLC is 2.5M per core so that there 
is 64k

>flows per thread.



AFAIK, the no. of flows here may not have to do anything with LLC.  Also 
there is EMC cache(8k entries) of ~4MB per PMD thread.





Yes the performance will be nice with simple test cases (P2P with 1 PMD 
thread) as most of this fits in to LLC. But in real scenarios  OvS-DPDK can be 
memory bound.



BTW, on my DUT the LLC is 35MB and has 28 cores and so the assumption of 
2.5M/core isn't right.



- Bhanuprakash.



>

On Fri, Jun 23, 2017 at 11:15 AM, Hui Xiang <xiangh...@gmail.com> wrote:

>Thanks Darrell,

>

>More questions:

>Why not allocating 64k for each dpcls? does the 64k just fit in L3 cache or

>anywhere? how it is calculated in such an exact number?  If there are more

>ports added for polling, for avoid competing can I increase the 64k size 
into a

>bigger one? Thanks.

>

>Hui.

>

>






Re: [ovs-discuss] max mega flow 64k per pmd or per dpcls?

2017-06-30 Thread Hui Xiang
Thank you very much, Bodireddy; I appreciate your reply.



On Fri, Jun 30, 2017 at 5:19 PM, Bodireddy, Bhanuprakash <
bhanuprakash.bodire...@intel.com> wrote:

> >
> >Thanks Bodireddy.
> >
> >Sorry I am a bit confused about the EMC occupied size per PMD, here[1]
> has a
> >different story.
>
> Initially EMC had 1024 entries and the patch [1] increased it to 8k.  By
> doing so in simple test
> scenarios, most of the flows will hit EMC and we can achieve wire speed
> for smaller packets.
> With this patch the EMC cache size is now @ ~4MB.
>
> It should be noted that there can be multiple PMD threads and whole lot of
> other threads running
> on the compute. Cache is limited and all the threads contend for LLC
> resources. This leads to cache thrashing
> and impacts performance.
>
> To alleviate this problem, Intel's (Resource Director Technology) RDT is
> used to partition the LLC and
> assign cache ways to different threads based on priority.
>
> >
> >Do you mean in real scenarios OVS-DPDK can be memory bound on EMC?
> OvS is flexible and results are use case dependent.  With some use cases ,
> EMC quickly gets saturated and packets will be sent to classifier.
> Some of the bottlenecks I referred are in classifier. .
>
> >I thought EMC should be totally fit in LLC.
> As pointed earlier there may be lot of threads on the compute and this
> assumption may not be right always.
>
> >
> >If the megaflows just part in LLC, then the cost of copy between memory
> and
> >LLC should be large, isn't it not like what defined as 'fast path' in
> userspace
> >compared with kernel datapath? And if most of megaflows are in memory,
> >the reason of every PMD  has one dpcls instance is to follow the rule PMD
> >thread should has local data as most as it can, but not every PMD put it
> in its
> >local cache, if that is true, I can't see why 64k is the limit num,
> unless this is an
> >experience best value calculated from vtune/perf resutls.
> >
> >You are probably enabled hyper-thread with 35MB and got 28 cores.
>
> I have E5-2695 v3, dual socket with 14 cores per socket. I will have 56
> cores with HT enabled.
>
> - Bhanuprakash.
>
> >
> >[1] https://mail.openvswitch.org/pipermail/ovs-dev/2015-May/298999.html
> >
> >
> >
> >On Thu, Jun 29, 2017 at 10:23 PM, Bodireddy, Bhanuprakash
> > wrote:
> >>
> >>I guess the answer is now the general LLC is 2.5M per core so that there
> is
> >64k
> >>flows per thread.
> >
> >AFAIK, the no. of flows here may not have to do anything with LLC.  Also
> there
> >is EMC cache(8k entries) of ~4MB per PMD thread.
> >Yes the performance will be nice with simple test cases (P2P with 1 PMD
> >thread) as most of this fits in to LLC. But in real scenarios  OvS-DPDK
> can be
> >memory bound.
> >
> >BTW, on my DUT the LLC is 35MB and has 28 cores and so the assumption of
> >2.5M/core isn't right.
> >
> >- Bhanuprakash.
> >
> >>
> >>On Fri, Jun 23, 2017 at 11:15 AM, Hui Xiang  wrote:
> >>Thanks Darrell,
> >>
> >>More questions:
> >>Why not allocating 64k for each dpcls? does the 64k just fit in L3 cache
> or
> >>anywhere? how it is calculated in such an exact number?  If there are
> more
> >>ports added for polling, for avoid competing can I increase the 64k size
> into a
> >>bigger one? Thanks.
> >>
> >>Hui.
> >>
> >>
>
>


Re: [ovs-discuss] max mega flow 64k per pmd or per dpcls?

2017-06-30 Thread Bodireddy, Bhanuprakash
>
>Thanks Bodireddy.
>
>Sorry, I am a bit confused about the size occupied by the EMC per PMD; here
>[1] tells a different story.

Initially the EMC had 1024 entries, and the patch [1] increased it to 8k. With
that change, in simple test scenarios most of the flows will hit the EMC and we
can achieve wire speed for smaller packets.
With this patch the EMC cache size is now ~4MB per PMD.

It should be noted that there can be multiple PMD threads and a whole lot of
other threads running on the compute node. Cache is limited, and all the
threads contend for LLC resources. This leads to cache thrashing and impacts
performance.

To alleviate this problem, Intel's Resource Director Technology (RDT) can be
used to partition the LLC and assign cache ways to different threads based on
priority.

>
>Do you mean in real scenarios OVS-DPDK can be memory bound on EMC?
OvS is flexible and results are use-case dependent. With some use cases, the
EMC quickly gets saturated and packets will be sent to the classifier.
Some of the bottlenecks I referred to are in the classifier.

>I thought the EMC should fit entirely in the LLC.
As pointed out earlier, there may be a lot of threads on the compute node, so
this assumption may not always be right.

>
>If only part of the megaflows are in the LLC, then the cost of copying between
>memory and the LLC should be large; isn't that unlike what is described as the
>'fast path' in userspace compared with the kernel datapath? And if most of the
>megaflows are in memory, the reason every PMD has one dpcls instance is to
>follow the rule that a PMD thread should keep its data as local as possible,
>yet not every PMD keeps it in its local cache. If that is true, I can't see
>why 64k is the limit, unless it is an empirical best value derived from
>vtune/perf results.
>
>You have probably enabled hyper-threading, with 35MB LLC and 28 cores.

I have an E5-2695 v3, dual socket, with 14 cores per socket. That gives 56
logical cores with HT enabled.

- Bhanuprakash.

>
>[1] https://mail.openvswitch.org/pipermail/ovs-dev/2015-May/298999.html
>
>
>
>On Thu, Jun 29, 2017 at 10:23 PM, Bodireddy, Bhanuprakash
> wrote:
>>
>>I guess the answer is now the general LLC is 2.5M per core so that there is
>64k
>>flows per thread.
>
>AFAIK, the no. of flows here may not have to do anything with LLC.  Also there
>is EMC cache(8k entries) of ~4MB per PMD thread.
>Yes the performance will be nice with simple test cases (P2P with 1 PMD
>thread) as most of this fits in to LLC. But in real scenarios  OvS-DPDK can be
>memory bound.
>
>BTW, on my DUT the LLC is 35MB and has 28 cores and so the assumption of
>2.5M/core isn't right.
>
>- Bhanuprakash.
>
>>
>>On Fri, Jun 23, 2017 at 11:15 AM, Hui Xiang  wrote:
>>Thanks Darrell,
>>
>>More questions:
>>Why not allocating 64k for each dpcls? does the 64k just fit in L3 cache or
>>anywhere? how it is calculated in such an exact number?  If there are more
>>ports added for polling, for avoid competing can I increase the 64k size into 
>>a
>>bigger one? Thanks.
>>
>>Hui.
>>
>>



Re: [ovs-discuss] max mega flow 64k per pmd or per dpcls?

2017-06-29 Thread Hui Xiang
I am interested in how 'reasonable' is defined here, how it was arrived at,
and what the 'many cases' are. Is there any document or link with this
information? Please shed some light on it for me.

On Thu, Jun 29, 2017 at 10:47 PM, Darrell Ball  wrote:

> Q: “how it is calculated in such an exact number? “
>
> A: It is a reasonable number to accommodate many cases.
>
> Q: “If there are more ports added for polling, for avoid competing can I
> increase the 64k size into a
> bigger one?”
>
> A: If a larger number is needed, it may imply that adding another PMD and
> dividing the forwarding
> work would be best.  Maybe even a smaller number of flows may be best
> served with more PMDs.
>
>
>
>
>
> On 6/29/17, 7:23 AM, "ovs-discuss-boun...@openvswitch.org on behalf of
> Bodireddy, Bhanuprakash" <bhanuprakash.bodire...@intel.com> wrote:
>
> >
>
> >I guess the answer is now the general LLC is 2.5M per core so that
> there is 64k
>
> >flows per thread.
>
>
>
> AFAIK, the no. of flows here may not have to do anything with LLC.
> Also there is EMC cache(8k entries) of ~4MB per PMD thread.
>
>
>
>
>
> Yes the performance will be nice with simple test cases (P2P with 1
> PMD thread) as most of this fits in to LLC. But in real scenarios  OvS-DPDK
> can be memory bound.
>
>
>
> BTW, on my DUT the LLC is 35MB and has 28 cores and so the assumption
> of 2.5M/core isn't right.
>
>
>
> - Bhanuprakash.
>
>
>
> >
>
> >On Fri, Jun 23, 2017 at 11:15 AM, Hui Xiang 
> wrote:
>
> >Thanks Darrell,
>
> >
>
> >More questions:
>
> >Why not allocating 64k for each dpcls? does the 64k just fit in L3
> cache or
>
> >anywhere? how it is calculated in such an exact number?  If there are
> more
>
> >ports added for polling, for avoid competing can I increase the 64k
> size into a
>
> >bigger one? Thanks.
>
> >
>
> >Hui.
>
> >
>
> >
>
>
>
>
>
>


Re: [ovs-discuss] max mega flow 64k per pmd or per dpcls?

2017-06-29 Thread Hui Xiang
Thanks Bodireddy.

Sorry, I am a bit confused about the size occupied by the EMC per PMD; here [1]
tells a different story.

Do you mean that in real scenarios OVS-DPDK can be memory bound on the EMC? I
thought the EMC should fit entirely in the LLC.

If only part of the megaflows are in the LLC, then the cost of copying between
memory and the LLC should be large; isn't that unlike what is described as the
'fast path' in userspace compared with the kernel datapath? And if most of the
megaflows are in memory, the reason every PMD has one dpcls instance is to
follow the rule that a PMD thread should keep its data as local as possible,
yet not every PMD keeps it in its local cache. If that is true, I can't see why
64k is the limit, unless it is an empirical best value derived from vtune/perf
results.

You have probably enabled hyper-threading, with 35MB LLC and 28 cores.

[1] https://mail.openvswitch.org/pipermail/ovs-dev/2015-May/298999.html



On Thu, Jun 29, 2017 at 10:23 PM, Bodireddy, Bhanuprakash <
bhanuprakash.bodire...@intel.com> wrote:

> >
> >I guess the answer is now the general LLC is 2.5M per core so that there
> is 64k
> >flows per thread.
>
> AFAIK, the no. of flows here may not have to do anything with LLC.  Also
> there is EMC cache(8k entries) of ~4MB per PMD thread.
> Yes the performance will be nice with simple test cases (P2P with 1 PMD
> thread) as most of this fits in to LLC. But in real scenarios  OvS-DPDK can
> be memory bound.
>
> BTW, on my DUT the LLC is 35MB and has 28 cores and so the assumption of
> 2.5M/core isn't right.
>
> - Bhanuprakash.
>
> >
> >On Fri, Jun 23, 2017 at 11:15 AM, Hui Xiang  wrote:
> >Thanks Darrell,
> >
> >More questions:
> >Why not allocating 64k for each dpcls? does the 64k just fit in L3 cache
> or
> >anywhere? how it is calculated in such an exact number?  If there are more
> >ports added for polling, for avoid competing can I increase the 64k size
> into a
> >bigger one? Thanks.
> >
> >Hui.
> >
> >
>
>


Re: [ovs-discuss] max mega flow 64k per pmd or per dpcls?

2017-06-29 Thread Darrell Ball
Q: “how was it calculated to be such an exact number?”

A: It is a reasonable number that accommodates many cases.

Q: “If more ports are added for polling, can I increase the 64k size to a
bigger one to avoid competition?”

A: If a larger number is needed, it may imply that adding another PMD and
dividing the forwarding work would be best. Maybe even a smaller number of
flows would be best served by more PMDs.
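(A minimal sketch of that approach; the core mask, port name and queue-to-core
mapping below are only example values:)

    # Run PMD threads on cores 1 and 2 (mask 0x6) instead of a single core
    ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x6
    # Optionally pin individual rx queues of a DPDK port to specific PMD cores
    ovs-vsctl set Interface dpdk0 other_config:pmd-rxq-affinity="0:1,1:2"
    # Check which rx queue ended up on which PMD
    ovs-appctl dpif-netdev/pmd-rxq-show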





On 6/29/17, 7:23 AM, "ovs-discuss-boun...@openvswitch.org on behalf of 
Bodireddy, Bhanuprakash"  wrote:

>

>I guess the answer is now the general LLC is 2.5M per core so that there 
is 64k

>flows per thread.



AFAIK, the no. of flows here may not have to do anything with LLC.  Also 
there is EMC cache(8k entries) of ~4MB per PMD thread.





Yes the performance will be nice with simple test cases (P2P with 1 PMD 
thread) as most of this fits in to LLC. But in real scenarios  OvS-DPDK can be 
memory bound.



BTW, on my DUT the LLC is 35MB and has 28 cores and so the assumption of 
2.5M/core isn't right. 



- Bhanuprakash.



>

>On Fri, Jun 23, 2017 at 11:15 AM, Hui Xiang  wrote:

>Thanks Darrell,

>

>More questions:

>Why not allocating 64k for each dpcls? does the 64k just fit in L3 cache or

>anywhere? how it is calculated in such an exact number?  If there are more

>ports added for polling, for avoid competing can I increase the 64k size 
into a

>bigger one? Thanks.

>

>Hui.

>

>







Re: [ovs-discuss] max mega flow 64k per pmd or per dpcls?

2017-06-29 Thread Bodireddy, Bhanuprakash
>
>I guess the answer is that the typical LLC is 2.5M per core, so there are 64k
>flows per thread.

AFAIK, the number of flows here may not have anything to do with the LLC. Also,
there is the EMC cache (8k entries), ~4MB per PMD thread.
Yes, performance will be nice with simple test cases (P2P with 1 PMD thread),
as most of this fits into the LLC. But in real scenarios OvS-DPDK can be memory
bound.

BTW, my DUT has a 35MB LLC and 28 cores, so the assumption of 2.5M/core isn't
right.
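(For anyone repeating this check on their own DUT, the LLC size and core count
can be read with something like:)

    # Report L3 cache size plus socket/core/thread counts on the local host
    lscpu | grep -E 'L3 cache|^CPU\(s\)|Core\(s\) per socket|Socket\(s\)'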

- Bhanuprakash.

>
>On Fri, Jun 23, 2017 at 11:15 AM, Hui Xiang  wrote:
>Thanks Darrell,
>
>More questions:
>Why not allocating 64k for each dpcls? does the 64k just fit in L3 cache or
>anywhere? how it is calculated in such an exact number?  If there are more
>ports added for polling, for avoid competing can I increase the 64k size into a
>bigger one? Thanks.
>
>Hui.
>
>



Re: [ovs-discuss] max mega flow 64k per pmd or per dpcls?

2017-06-29 Thread Hui Xiang
I guess the answer, then, is that the typical LLC is 2.5M per core, so there
are 64k flows per thread.

On Fri, Jun 23, 2017 at 11:15 AM, Hui Xiang  wrote:

> Thanks Darrell,
>
> More questions:
> Why not allocating 64k for each dpcls? does the 64k just fit in L3 cache
> or anywhere? how it is calculated in such an exact number?  If there are
> more ports added for polling, for avoid competing can I increase the 64k
> size into a bigger one? Thanks.
>
> Hui.
>
>
>


Re: [ovs-discuss] max mega flow 64k per pmd or per dpcls?

2017-06-22 Thread Hui Xiang
Thanks Darrell,

More questions:
Why not allocate 64k for each dpcls? Does the 64k just fit in the L3 cache, or
somewhere else? How was it calculated to be such an exact number? If more ports
are added for polling, can I increase the 64k size to a bigger one to avoid
competition? Thanks.

Hui.


Re: [ovs-discuss] max mega flow 64k per pmd or per dpcls?

2017-06-22 Thread Darrell Ball


From:  on behalf of Hui Xiang 

Date: Thursday, June 22, 2017 at 2:22 AM
To: "ovs-discuss@openvswitch.org" 
Subject: Re: [ovs-discuss] max mega flow 64k per pmd or per dpcls?

Could anyone help answer this?

On Tue, Jun 20, 2017 at 6:22 PM, Hui Xiang <xiangh...@gmail.com> wrote:
Hello guys,

  I have seen that there will be one dpcls instance for each port per PMD, and
it seems the flow table's maximum number of entries is 64k per PMD rather than
per dpcls,


“my question is, if there are several dpcls instances per PMD, would they
compete for the 64k per PMD?”

yes
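(If it helps to see where the number comes from: assuming the limit is still
the MAX_FLOWS constant in the userspace datapath, it can be located in an OvS
source tree with:)

    # The per-PMD flow table limit (65536) used by dpif-netdev
    grep -n "MAX_FLOWS" lib/dpif-netdev.c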


BR.
Hui.



Re: [ovs-discuss] max mega flow 64k per pmd or per dpcls?

2017-06-22 Thread Hui Xiang
Could anyone help answer this?

On Tue, Jun 20, 2017 at 6:22 PM, Hui Xiang  wrote:

> Hello guys,
>
>   I have seen that there will be one dpcls instance for each port per PMD,
> and it seems the flow table's maximum number of entries is 64k per PMD rather
> than per dpcls. My question is, if there are several dpcls instances per PMD,
> would they compete for the 64k per PMD?
>
>
> BR.
> Hui.
>