Re: [ovs-discuss] max mega flow 64k per pmd or per dpcls?
Your help was greatly appreciated, thanks Bodireddy.

On Mon, Jul 3, 2017 at 10:57 PM, Bodireddy, Bhanuprakash <bhanuprakash.bodire...@intel.com> wrote:

> >>What is your use case(s)?
> >>My use case might be setting up a vBRAS VNF with OVS-DPDK as a normal NFV
> >>case, and it requires good performance. However, OVS-DPDK still does not
> >>seem to reach its needs compared with hardware offloading; we are
> >>evaluating VPP as well.
> >As you mentioned VPP here, it's worth looking at the benchmarks comparing
> >OvS and VPP for the L3-VPN use case by Intel and Ericsson, presented at the
> >OvS Fall conference.
> >The slides can be found @
> >http://openvswitch.org/support/ovscon2016/8/1400-gray.pdf.
> >On page 12 of the above PDF, why does the classifier show constant
> >throughput with increasing concurrent L4 flows? Shouldn't performance
> >degrade with more subtable lookups, as you mentioned?
>
> You raised a good point. The reason is the 'sorted subtable ranking'
> implementation in the 2.7 release.
> With this, the subtable vector is sorted by frequency of hits, which
> reduces the number of subtable lookups.
> That is why I asked for the 'avg. subtable lookups per hit:' number.
>
> I recommend watching the video of the presentation here
> https://www.youtube.com/watch?v=cxRcfn2x4eE , as the bottlenecks you are
> referring to in this thread are more or less similar to the ones discussed
> at the conference.
>
> - Bhanuprakash.

___ discuss mailing list disc...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
Re: [ovs-discuss] max mega flow 64k per pmd or per dpcls?
>>What is your use case(s)?
>>My use case might be setting up a vBRAS VNF with OVS-DPDK as a normal NFV
>>case, and it requires good performance. However, OVS-DPDK still does not
>>seem to reach its needs compared with hardware offloading; we are
>>evaluating VPP as well.
>As you mentioned VPP here, it's worth looking at the benchmarks comparing
>OvS and VPP for the L3-VPN use case by Intel and Ericsson, presented at the
>OvS Fall conference.
>The slides can be found @
>http://openvswitch.org/support/ovscon2016/8/1400-gray.pdf.
>On page 12 of the above PDF, why does the classifier show constant
>throughput with increasing concurrent L4 flows? Shouldn't performance
>degrade with more subtable lookups, as you mentioned?

You raised a good point. The reason is the 'sorted subtable ranking' implementation in the 2.7 release. With this, the subtable vector is sorted by frequency of hits, which reduces the number of subtable lookups. That is why I asked for the 'avg. subtable lookups per hit:' number.

I recommend watching the video of the presentation here https://www.youtube.com/watch?v=cxRcfn2x4eE , as the bottlenecks you are referring to in this thread are more or less similar to the ones discussed at the conference.

- Bhanuprakash.
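To make the 'sorted subtable ranking' effect concrete, here is a toy model (this is a sketch, not the actual dpif-netdev code; all names are invented): if the hot subtable sits last in the probe order, every hit costs a full scan, and periodically re-sorting the vector by hit count drops the average lookups per hit toward 1.

```python
# Toy model of dpcls subtable ranking (illustrative only, not OVS code).
class Subtable:
    def __init__(self, name, keys):
        self.name = name
        self.keys = set(keys)   # masked flow keys this subtable can match
        self.hits = 0

def lookup(subtables, key):
    """Probe subtables in vector order; return (match, probes used)."""
    for probes, st in enumerate(subtables, start=1):
        if key in st.keys:
            st.hits += 1
            return st.name, probes
    return None, len(subtables)

def resort(subtables):
    """Periodic re-sort by hit frequency, as done per interval in OVS 2.7."""
    subtables.sort(key=lambda st: st.hits, reverse=True)

tables = [Subtable("t0", {"a"}), Subtable("t1", {"b"}), Subtable("t2", {"c"})]
# Traffic dominated by key "c", which lives in the last subtable.
_, probes = lookup(tables, "c")
print(probes)            # hot subtable is probed last -> 3
for _ in range(99):
    lookup(tables, "c")  # accumulate hit counts
resort(tables)
_, probes = lookup(tables, "c")
print(probes)            # after ranking, probed first -> 1
```

This is why throughput on page 12 of the slides stays flat as concurrent flows grow: the lookup cost tracks the hit-ranked order, not the raw subtable count.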
Re: [ovs-discuss] max mega flow 64k per pmd or per dpcls?
Thanks much again, Bodireddy! Comments inline.

On Mon, Jul 3, 2017 at 5:00 PM, Bodireddy, Bhanuprakash <bhanuprakash.bodire...@intel.com> wrote:

> It's a long weekend in the US, and I will try answering some of your
> questions in Darrell's absence.
>
> >Why do you think having more than 64k per PMD would be optimal?
> >I originally thought that the bottleneck is in the classifier because it
> >is saturated, so that lookups have to go to the flow table; so I thought,
> >why not just increase the dpcls flows per PMD? But it seems I was wrong,
> >based on your explanation.
>
> For a few use cases, much of the bottleneck moves to the classifier when
> the EMC is saturated. You may have to add more PMD threads (again, this
> depends on the availability of cores in your case).
> As your initial investigation showed the classifier is the bottleneck, I am
> just curious about a few things:
> - In the 'dpif-netdev/pmd-stats-show' output, what does the 'avg. subtable
> lookups per hit:' number look like?
> - In steady state, does 'dpcls_lookup()' top the list of functions in
> 'perf top'?

That is great advice; I'll check further.

> >What is your use case(s)?
> >My use case might be setting up a vBRAS VNF with OVS-DPDK as a normal NFV
> >case, and it requires good performance. However, OVS-DPDK still does not
> >seem to reach its needs compared with hardware offloading; we are
> >evaluating VPP as well.
> As you mentioned VPP here, it's worth looking at the benchmarks comparing
> OvS and VPP for the L3-VPN use case by Intel and Ericsson, presented at the
> OvS Fall conference.
> The slides can be found @ http://openvswitch.org/support/ovscon2016/8/1400-gray.pdf.

On page 12 of the above PDF, why does the classifier show constant throughput with increasing concurrent L4 flows? Shouldn't performance degrade with more subtable lookups, as you mentioned?

> >Basically, I am looking to find out what the bottleneck in OVS-DPDK is so
> >far (it seems to be in flow lookup), and whether there are solutions being
> >discussed or work in progress.
>
> I personally did some investigation in this area. One of the bottlenecks in
> the classifier is due to subtable lookup.
> Murmur hash is used in OvS, and it is recommended to enable intrinsics with
> -march=native / CFLAGS="-msse4.2" if not already done.
> If you have more subtables, the lookups may be taking significant cycles.
> I presume you are using OvS 2.7. Some optimizations were done to improve
> classifier performance (subtable ranking, hash optimizations).
> If emc_lookup()/emc_insert() show up in the top 5 functions taking
> significant cycles, it is worth disabling the EMC as below:
> 'ovs-vsctl set Open_vSwitch . other_config:emc-insert-inv-prob=0'

Thanks much for your advice.

> >Are you wanting this number to be larger by default?
> >I am not sure; I need to understand whether it is good or bad to set it
> >larger.
> >Are you wanting this number to be configurable?
> >Probably good.
>
> >BTW, after reading part of the DPDK documentation, it stresses reducing
> >copies between cache and memory and getting cache hits as much as possible
> >so that fewer CPU cycles are spent fetching data. But now I am totally
> >lost on how the OVS-DPDK EMC and classifier map to the LLC.
>
> I didn't get your question here. A PMD is like any other thread and has an
> EMC and a classifier per ingress port.
> The EMC, classifier subtables and other data structures will make it to the
> LLC when accessed.

ACK.

> As already mentioned, using RDT Cache Allocation Technology (CAT), one can
> assign cache ways to high-priority threads:
> https://software.intel.com/en-us/articles/introduction-to-cache-allocation-technology
>
> - Bhanuprakash.
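For reference, the checks suggested above map to these commands (assuming a running ovs-vswitchd with the userspace/netdev datapath; exact output fields vary slightly by OVS version):

```shell
# Per-PMD statistics, including 'avg. subtable lookups per hit'
ovs-appctl dpif-netdev/pmd-stats-show

# Reset the counters, let traffic run, then sample again for a clean window
ovs-appctl dpif-netdev/pmd-stats-clear

# Profile the running switch; look for dpcls_lookup()/emc_lookup() near the top
perf top -p $(pidof ovs-vswitchd)

# Disable EMC insertion entirely (OVS 2.7+), so lookups exercise the dpcls
ovs-vsctl set Open_vSwitch . other_config:emc-insert-inv-prob=0
```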
Re: [ovs-discuss] max mega flow 64k per pmd or per dpcls?
It's a long weekend in the US, and I will try answering some of your questions in Darrell's absence.

>Why do you think having more than 64k per PMD would be optimal?
>I originally thought that the bottleneck is in the classifier because it is
>saturated, so that lookups have to go to the flow table; so I thought, why
>not just increase the dpcls flows per PMD? But it seems I was wrong, based
>on your explanation.

For a few use cases, much of the bottleneck moves to the classifier when the EMC is saturated. You may have to add more PMD threads (again, this depends on the availability of cores in your case). As your initial investigation showed the classifier is the bottleneck, I am just curious about a few things:
- In the 'dpif-netdev/pmd-stats-show' output, what does the 'avg. subtable lookups per hit:' number look like?
- In steady state, does 'dpcls_lookup()' top the list of functions in 'perf top'?

>What is your use case(s)?
>My use case might be setting up a vBRAS VNF with OVS-DPDK as a normal NFV
>case, and it requires good performance. However, OVS-DPDK still does not
>seem to reach its needs compared with hardware offloading; we are evaluating
>VPP as well.

As you mentioned VPP here, it's worth looking at the benchmarks comparing OvS and VPP for the L3-VPN use case by Intel and Ericsson, presented at the OvS Fall conference. The slides can be found @ http://openvswitch.org/support/ovscon2016/8/1400-gray.pdf.

>Basically, I am looking to find out what the bottleneck in OVS-DPDK is so
>far (it seems to be in flow lookup), and whether there are solutions being
>discussed or work in progress.

I personally did some investigation in this area. One of the bottlenecks in the classifier is due to subtable lookup. Murmur hash is used in OvS, and it is recommended to enable intrinsics with -march=native / CFLAGS="-msse4.2" if not already done. If you have more subtables, the lookups may be taking significant cycles. I presume you are using OvS 2.7. Some optimizations were done to improve classifier performance (subtable ranking, hash optimizations). If emc_lookup()/emc_insert() show up in the top 5 functions taking significant cycles, it is worth disabling the EMC as below:
'ovs-vsctl set Open_vSwitch . other_config:emc-insert-inv-prob=0'

>Are you wanting this number to be larger by default?
>I am not sure; I need to understand whether it is good or bad to set it
>larger.
>Are you wanting this number to be configurable?
>Probably good.
>
>BTW, after reading part of the DPDK documentation, it stresses reducing
>copies between cache and memory and getting cache hits as much as possible
>so that fewer CPU cycles are spent fetching data. But now I am totally lost
>on how the OVS-DPDK EMC and classifier map to the LLC.

I didn't get your question here. A PMD is like any other thread and has an EMC and a classifier per ingress port. The EMC, classifier subtables and other data structures will make it to the LLC when accessed.

As already mentioned, using RDT Cache Allocation Technology (CAT), one can assign cache ways to high-priority threads:
https://software.intel.com/en-us/articles/introduction-to-cache-allocation-technology

- Bhanuprakash.
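The CAT article above boils down to a couple of commands with the `pqos` tool from intel-cmt-cat (the class-of-service ID, cache-way mask, and core numbers below are illustrative only; valid mask lengths depend on the CPU):

```shell
# Show current allocation capability and state
pqos -s

# Give class-of-service 1 a dedicated set of LLC ways (example mask)
pqos -e "llc:1=0x0ff0"

# Associate the PMD cores (here 2 and 4) with class-of-service 1
pqos -a "llc:1=2,4"
```

This keeps other threads on the host from thrashing the cache ways the PMD threads depend on.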
Re: [ovs-discuss] max mega flow 64k per pmd or per dpcls?
Thanks Darrell, comments inline.

On Sat, Jul 1, 2017 at 1:02 AM, Darrell Ball wrote:

> *From: *Hui Xiang
> *Date: *Thursday, June 29, 2017 at 6:57 PM
> *To: *Darrell Ball
> *Cc: *"Bodireddy, Bhanuprakash" , "ovs-discuss@openvswitch.org"
> *Subject: *Re: [ovs-discuss] max mega flow 64k per pmd or per dpcls?
>
> I am interested in how 'reasonable' is defined here, how it was arrived at,
> and what the 'many cases' are. Is there any document/link with this
> information? Please shed some light on it.
>
> It is based on real usage scenarios for the number of megaflows needed.
> The usage may be less in most cases.
> In cases where it is larger, it may imply that more threads, dividing the
> work among queues, would be better.

Yes, more threads are better, but the overall cores are limited: the more threads pinned to cores for OVS-DPDK, the fewer available for VMs.

> Why do you think having more than 64k per PMD would be optimal?

I originally thought that the bottleneck is in the classifier because it is saturated, so that lookups have to go to the flow table; so I thought, why not just increase the dpcls flows per PMD? But it seems I was wrong, based on your explanation.

> What is your use case(s)?

My use case might be setting up a vBRAS VNF with OVS-DPDK as a normal NFV case, and it requires good performance. However, OVS-DPDK still does not seem to reach its needs compared with hardware offloading; we are evaluating VPP as well. Basically, I am looking to find out what the bottleneck in OVS-DPDK is so far (it seems to be in flow lookup), and whether there are solutions being discussed or work in progress.

> Are you wanting this number to be larger by default?

I am not sure; I need to understand whether it is good or bad to set it larger.

> Are you wanting this number to be configurable?

Probably good.
BTW, after reading part of the DPDK documentation, it stresses reducing copies between cache and memory and getting cache hits as much as possible so that fewer CPU cycles are spent fetching data. But now I am totally lost on how the OVS-DPDK EMC and classifier map to the LLC.

> On Thu, Jun 29, 2017 at 10:47 PM, Darrell Ball wrote:
>
> Q: "how it is calculated in such an exact number?"
>
> A: It is a reasonable number to accommodate many cases.
>
> Q: "If there are more ports added for polling, for avoid competing can I
> increase the 64k size into a bigger one?"
>
> A: If a larger number is needed, it may imply that adding another PMD and
> dividing the forwarding work would be best. Maybe even a smaller number of
> flows may be best served with more PMDs.
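Darrell's suggestion of adding another PMD and dividing the forwarding work is a configuration change; a sketch (the core IDs and port name are illustrative):

```shell
# Run PMD threads on cores 2 and 4 (mask 0x14 = bits 2 and 4 set);
# core numbering here is just an example.
ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x14

# Optionally pin a port's rx queues to specific PMD cores (OVS 2.7+);
# 'dpdk0' is a hypothetical port name.
ovs-vsctl set Interface dpdk0 other_config:pmd-rxq-affinity="0:2,1:4"
```

The trade-off Hui raises still applies: every core given to a PMD thread is a core taken away from the VMs.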
Re: [ovs-discuss] max mega flow 64k per pmd or per dpcls?
From: Hui Xiang
Date: Thursday, June 29, 2017 at 6:57 PM
To: Darrell Ball
Cc: "Bodireddy, Bhanuprakash" , "ovs-discuss@openvswitch.org"
Subject: Re: [ovs-discuss] max mega flow 64k per pmd or per dpcls?

I am interested in how 'reasonable' is defined here, how it was arrived at, and what the 'many cases' are. Is there any document/link with this information? Please shed some light on it.

It is based on real usage scenarios for the number of megaflows needed.
The usage may be less in most cases.
In cases where it is larger, it may imply that more threads, dividing the work among queues, would be better.

Why do you think having more than 64k per PMD would be optimal?
What is your use case(s)?
Are you wanting this number to be larger by default?
Are you wanting this number to be configurable?

On Thu, Jun 29, 2017 at 10:47 PM, Darrell Ball <db...@vmware.com> wrote:

Q: "how it is calculated in such an exact number?"

A: It is a reasonable number to accommodate many cases.

Q: "If there are more ports added for polling, for avoid competing can I increase the 64k size into a bigger one?"

A: If a larger number is needed, it may imply that adding another PMD and dividing the forwarding work would be best. Maybe even a smaller number of flows may be best served with more PMDs.

On 6/29/17, 7:23 AM, "ovs-discuss-boun...@openvswitch.org on behalf of Bodireddy, Bhanuprakash" wrote:

> >I guess the answer is now the general LLC is 2.5M per core so that there
> >is 64k flows per thread.
>
> AFAIK, the number of flows here may not have anything to do with the LLC.
> Also, there is the EMC cache (8k entries) of ~4MB per PMD thread.
> Yes, the performance will be nice with simple test cases (P2P with 1 PMD
> thread), as most of this fits into the LLC. But in real scenarios OvS-DPDK
> can be memory bound.
>
> BTW, on my DUT the LLC is 35MB and it has 28 cores, so the assumption of
> 2.5M/core isn't right.
>
> - Bhanuprakash.
>
> >On Fri, Jun 23, 2017 at 11:15 AM, Hui Xiang <xiangh...@gmail.com> wrote:
> >Thanks Darrell,
> >
> >More questions:
> >Why not allocate 64k for each dpcls? Does the 64k just fit in the L3 cache
> >or anywhere? How was such an exact number calculated? If there are more
> >ports added for polling, to avoid competition can I increase the 64k size
> >to a bigger one? Thanks.
> >
> >Hui.
Re: [ovs-discuss] max mega flow 64k per pmd or per dpcls?
Thank you very much, Bodireddy; I appreciated your reply.

On Fri, Jun 30, 2017 at 5:19 PM, Bodireddy, Bhanuprakash <bhanuprakash.bodire...@intel.com> wrote:

> >Thanks Bodireddy.
> >
> >Sorry, I am a bit confused about the EMC size occupied per PMD; here[1]
> >has a different story.
>
> Initially the EMC had 1024 entries, and the patch [1] increased it to 8k.
> By doing so, in simple test scenarios most of the flows will hit the EMC
> and we can achieve wire speed for smaller packets. With this patch the EMC
> cache size is now @ ~4MB.
>
> It should be noted that there can be multiple PMD threads and a whole lot
> of other threads running on the compute node. Cache is limited, and all the
> threads contend for LLC resources. This leads to cache thrashing and
> impacts performance.
>
> To alleviate this problem, Intel's Resource Director Technology (RDT) is
> used to partition the LLC and assign cache ways to different threads based
> on priority.
>
> >Do you mean that in real scenarios OVS-DPDK can be memory bound on the
> >EMC?
>
> OvS is flexible and results are use-case dependent. With some use cases,
> the EMC quickly gets saturated and packets will be sent to the classifier.
> Some of the bottlenecks I referred to are in the classifier.
>
> >I thought the EMC should fit entirely in the LLC.
>
> As pointed out earlier, there may be a lot of threads on the compute node,
> and this assumption may not always be right.
>
> >If the megaflows are only partly in the LLC, then the cost of copying
> >between memory and the LLC should be large; isn't that unlike what is
> >described as the 'fast path' in userspace compared with the kernel
> >datapath? And if most of the megaflows are in memory, the reason every PMD
> >has one dpcls instance is to follow the rule that a PMD thread should keep
> >data as local as it can, but not every PMD keeps it in its local cache. If
> >that is true, I can't see why 64k is the limit, unless this is a best
> >value derived from experience with vtune/perf results.
>
> >You probably have hyper-threading enabled with 35MB and got 28 cores.
>
> I have an E5-2695 v3, dual socket, with 14 cores per socket. I will have 56
> cores with HT enabled.
>
> - Bhanuprakash.
>
> >[1] https://mail.openvswitch.org/pipermail/ovs-dev/2015-May/298999.html
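As a back-of-the-envelope check on the "8k entries ≈ 4MB" figure: the per-entry size below is an assumed round number (roughly a flow pointer plus a netdev_flow_key; the real size depends on the OVS version and build), but it reproduces the quoted total.

```python
# Rough EMC sizing; 512 bytes/entry is an assumption for illustration.
EMC_ENTRIES = 8192        # 8k entries after the 2015 patch [1]
BYTES_PER_ENTRY = 512     # assumed: flow pointer + miniflow-based key

emc_bytes = EMC_ENTRIES * BYTES_PER_ENTRY
print(emc_bytes / 2**20)  # -> 4.0  (MiB per PMD thread)
```

Multiply by the number of PMD threads, and it is easy to see why several PMDs plus other busy threads can overrun a shared LLC.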
Re: [ovs-discuss] max mega flow 64k per pmd or per dpcls?
>Thanks Bodireddy.
>
>Sorry, I am a bit confused about the EMC size occupied per PMD; here[1] has
>a different story.

Initially the EMC had 1024 entries, and the patch [1] increased it to 8k. By doing so, in simple test scenarios most of the flows will hit the EMC and we can achieve wire speed for smaller packets. With this patch the EMC cache size is now @ ~4MB.

It should be noted that there can be multiple PMD threads and a whole lot of other threads running on the compute node. Cache is limited, and all the threads contend for LLC resources. This leads to cache thrashing and impacts performance.

To alleviate this problem, Intel's Resource Director Technology (RDT) is used to partition the LLC and assign cache ways to different threads based on priority.

>Do you mean that in real scenarios OVS-DPDK can be memory bound on the EMC?

OvS is flexible and results are use-case dependent. With some use cases, the EMC quickly gets saturated and packets will be sent to the classifier. Some of the bottlenecks I referred to are in the classifier.

>I thought the EMC should fit entirely in the LLC.

As pointed out earlier, there may be a lot of threads on the compute node, and this assumption may not always be right.

>If the megaflows are only partly in the LLC, then the cost of copying
>between memory and the LLC should be large; isn't that unlike what is
>described as the 'fast path' in userspace compared with the kernel datapath?
>And if most of the megaflows are in memory, the reason every PMD has one
>dpcls instance is to follow the rule that a PMD thread should keep data as
>local as it can, but not every PMD keeps it in its local cache. If that is
>true, I can't see why 64k is the limit, unless this is a best value derived
>from experience with vtune/perf results.

>You probably have hyper-threading enabled with 35MB and got 28 cores.

I have an E5-2695 v3, dual socket, with 14 cores per socket. I will have 56 cores with HT enabled.

- Bhanuprakash.
>[1] https://mail.openvswitch.org/pipermail/ovs-dev/2015-May/298999.html
Re: [ovs-discuss] max mega flow 64k per pmd or per dpcls?
I am interested in how 'reasonable' is defined here, how it was arrived at, and what the 'many cases' are. Is there any document/link with this information? Please shed some light on it.

On Thu, Jun 29, 2017 at 10:47 PM, Darrell Ball wrote:

> Q: "how it is calculated in such an exact number?"
>
> A: It is a reasonable number to accommodate many cases.
>
> Q: "If there are more ports added for polling, for avoid competing can I
> increase the 64k size into a bigger one?"
>
> A: If a larger number is needed, it may imply that adding another PMD and
> dividing the forwarding work would be best. Maybe even a smaller number of
> flows may be best served with more PMDs.
>
> On 6/29/17, 7:23 AM, "ovs-discuss-boun...@openvswitch.org on behalf of
> Bodireddy, Bhanuprakash" wrote:
>
> >I guess the answer is now the general LLC is 2.5M per core so that there
> >is 64k flows per thread.
>
> AFAIK, the number of flows here may not have anything to do with the LLC.
> Also, there is the EMC cache (8k entries) of ~4MB per PMD thread.
>
> Yes, the performance will be nice with simple test cases (P2P with 1 PMD
> thread), as most of this fits into the LLC. But in real scenarios OvS-DPDK
> can be memory bound.
>
> BTW, on my DUT the LLC is 35MB and it has 28 cores, so the assumption of
> 2.5M/core isn't right.
>
> - Bhanuprakash.
>
> >On Fri, Jun 23, 2017 at 11:15 AM, Hui Xiang wrote:
> >Thanks Darrell,
> >
> >More questions:
> >Why not allocate 64k for each dpcls? Does the 64k just fit in the L3 cache
> >or anywhere? How was such an exact number calculated? If there are more
> >ports added for polling, to avoid competition can I increase the 64k size
> >to a bigger one? Thanks.
> >
> >Hui.
Re: [ovs-discuss] max mega flow 64k per pmd or per dpcls?
Thanks Bodireddy.

Sorry, I am a bit confused about the EMC size occupied per PMD; here[1] has a different story.

Do you mean that in real scenarios OVS-DPDK can be memory bound on the EMC? I thought the EMC should fit entirely in the LLC.

If the megaflows are only partly in the LLC, then the cost of copying between memory and the LLC should be large; isn't that unlike what is described as the 'fast path' in userspace compared with the kernel datapath? And if most of the megaflows are in memory, the reason every PMD has one dpcls instance is to follow the rule that a PMD thread should keep data as local as it can, but not every PMD keeps it in its local cache. If that is true, I can't see why 64k is the limit, unless this is a best value derived from experience with vtune/perf results.

You probably have hyper-threading enabled with 35MB and got 28 cores.

[1] https://mail.openvswitch.org/pipermail/ovs-dev/2015-May/298999.html

On Thu, Jun 29, 2017 at 10:23 PM, Bodireddy, Bhanuprakash <bhanuprakash.bodire...@intel.com> wrote:

> >I guess the answer is now the general LLC is 2.5M per core so that there
> >is 64k flows per thread.
>
> AFAIK, the number of flows here may not have anything to do with the LLC.
> Also, there is the EMC cache (8k entries) of ~4MB per PMD thread.
> Yes, the performance will be nice with simple test cases (P2P with 1 PMD
> thread), as most of this fits into the LLC. But in real scenarios OvS-DPDK
> can be memory bound.
>
> BTW, on my DUT the LLC is 35MB and it has 28 cores, so the assumption of
> 2.5M/core isn't right.
>
> - Bhanuprakash.
>
> >On Fri, Jun 23, 2017 at 11:15 AM, Hui Xiang wrote:
> >Thanks Darrell,
> >
> >More questions:
> >Why not allocate 64k for each dpcls? Does the 64k just fit in the L3 cache
> >or anywhere? How was such an exact number calculated? If there are more
> >ports added for polling, to avoid competition can I increase the 64k size
> >to a bigger one? Thanks.
> >
> >Hui.
Re: [ovs-discuss] max mega flow 64k per pmd or per dpcls?
Q: "how it is calculated in such an exact number?"

A: It is a reasonable number to accommodate many cases.

Q: "If there are more ports added for polling, for avoid competing can I increase the 64k size into a bigger one?"

A: If a larger number is needed, it may imply that adding another PMD and dividing the forwarding work would be best. Maybe even a smaller number of flows may be best served with more PMDs.

On 6/29/17, 7:23 AM, "ovs-discuss-boun...@openvswitch.org on behalf of Bodireddy, Bhanuprakash" wrote:

> >I guess the answer is now the general LLC is 2.5M per core so that there
> >is 64k flows per thread.
>
> AFAIK, the number of flows here may not have anything to do with the LLC.
> Also, there is the EMC cache (8k entries) of ~4MB per PMD thread.
> Yes, the performance will be nice with simple test cases (P2P with 1 PMD
> thread), as most of this fits into the LLC. But in real scenarios OvS-DPDK
> can be memory bound.
>
> BTW, on my DUT the LLC is 35MB and it has 28 cores, so the assumption of
> 2.5M/core isn't right.
>
> - Bhanuprakash.
>
> >On Fri, Jun 23, 2017 at 11:15 AM, Hui Xiang wrote:
> >Thanks Darrell,
> >
> >More questions:
> >Why not allocate 64k for each dpcls? Does the 64k just fit in the L3 cache
> >or anywhere? How was such an exact number calculated? If there are more
> >ports added for polling, to avoid competition can I increase the 64k size
> >to a bigger one? Thanks.
> >
> >Hui.
Re: [ovs-discuss] max mega flow 64k per pmd or per dpcls?
>I guess the answer is now the general LLC is 2.5M per core so that there is
>64k flows per thread.

AFAIK, the number of flows here may not have anything to do with the LLC. Also, there is the EMC cache (8k entries) of ~4MB per PMD thread. Yes, the performance will be nice with simple test cases (P2P with 1 PMD thread), as most of this fits into the LLC. But in real scenarios OvS-DPDK can be memory bound.

BTW, on my DUT the LLC is 35MB and it has 28 cores, so the assumption of 2.5M/core isn't right.

- Bhanuprakash.

>On Fri, Jun 23, 2017 at 11:15 AM, Hui Xiang wrote:
>Thanks Darrell,
>
>More questions:
>Why not allocate 64k for each dpcls? Does the 64k just fit in the L3 cache
>or anywhere? How was such an exact number calculated? If there are more
>ports added for polling, to avoid competition can I increase the 64k size to
>a bigger one? Thanks.
>
>Hui.
Re: [ovs-discuss] max mega flow 64k per pmd or per dpcls?
I guess the answer is that the typical LLC is 2.5MB per core, so that there are 64k flows per thread.

On Fri, Jun 23, 2017 at 11:15 AM, Hui Xiang wrote:

> Thanks Darrell,
>
> More questions:
> Why not allocate 64k for each dpcls? Does the 64k just fit in the L3 cache
> or anywhere? How was such an exact number calculated? If there are more
> ports added for polling, to avoid competition can I increase the 64k size
> to a bigger one? Thanks.
>
> Hui.
Re: [ovs-discuss] max mega flow 64k per pmd or per dpcls?
Thanks Darrell,

More questions:
Why not allocate 64k for each dpcls? Does the 64k just fit in the L3 cache or anywhere? How was such an exact number calculated? If there are more ports added for polling, to avoid competition can I increase the 64k size to a bigger one? Thanks.

Hui.
Re: [ovs-discuss] max mega flow 64k per pmd or per dpcls?
From: on behalf of Hui Xiang
Date: Thursday, June 22, 2017 at 2:22 AM
To: "ovs-discuss@openvswitch.org"
Subject: Re: [ovs-discuss] max mega flow 64k per pmd or per dpcls?

Could anyone help to answer this?

On Tue, Jun 20, 2017 at 6:22 PM, Hui Xiang <xiangh...@gmail.com> wrote:

Hello guys,

I have seen that there will be one dpcls instance for each port per PMD, and it seems the flow table max entries number is 64k per PMD rather than per dpcls. "My question is: if there are several dpcls instances per PMD, would they compete for the 64k per PMD?"

yes

BR.
Hui.
Re: [ovs-discuss] max mega flow 64k per pmd or per dpcls?
Could anyone help to answer this?

On Tue, Jun 20, 2017 at 6:22 PM, Hui Xiang wrote:

> Hello guys,
>
> I have seen that there will be one dpcls instance for each port per PMD,
> and it seems the flow table max entries number is 64k per PMD rather than
> per dpcls. My question is: if there are several dpcls instances per PMD,
> would they compete for the 64k per PMD?
>
> BR.
> Hui.
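For context on the number itself: the 64k discussed throughout this thread is the 2^16 flow-table cap in the userspace datapath (MAX_FLOWS in lib/dpif-netdev.c), enforced per PMD thread, which is why the dpcls instances of one PMD share it. The per-entry byte figure below is an assumption for illustration, not a measured value:

```python
# The '64k' in this thread is 2**16 megaflow entries, capped per PMD thread,
# not per dpcls instance.
MAX_FLOWS = 1 << 16
print(MAX_FLOWS)                         # -> 65536

# Rough footprint if each megaflow entry costs ~400 bytes (assumed figure):
# far larger than a 2.5MB/core LLC slice, which supports the point made
# later in the thread that the cap is not an LLC-sizing choice.
approx_bytes = MAX_FLOWS * 400
print(round(approx_bytes / 2**20, 1))    # -> 25.0  (MiB)
```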