Re: [ovs-discuss] [ovs-dev] OVS DPDK NUMA pmd assignment question for physical port

2017-09-11 Thread O Mahony, Billy
Hi Wang,

I believe that the processing cycles reported in the PMD stats include EMC processing time.

I mention this only because your results are surprising: the bug could be a 
factor if you are running code where it still exists. The patch carries a Fixes: 
tag (I think) that should help you figure out whether your results were 
potentially affected by this issue.
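For example, something like the following should tell you whether your tree already 
has Ilya's fix (the search string and commit id are only placeholders; substitute 
words from the actual patch subject and the real commit id from the list archive):

   git log --oneline --grep="cycle" -- lib/dpif-netdev.c
   git tag --contains <fix-commit-id>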

Regards,
/Billy. 

> -Original Message-
> From: 王志克 [mailto:wangzh...@jd.com]
> Sent: Monday, September 11, 2017 3:00 AM
> To: O Mahony, Billy ; ovs-
> d...@openvswitch.org; Jan Scheurich ; Darrell
> Ball ; ovs-discuss@openvswitch.org; Kevin Traynor
> 
> Subject: RE: [ovs-dev] OVS DPDK NUMA pmd assignment question for
> physical port
> 
> Hi Billy,
> 
> In my test, almost all traffic went through the EMC. So the fix does not impact
> the result, especially since we want to know the difference (not the exact number).
> 
> Can you test to get some data? Thanks.
> 
> Br,
> Wang Zhike
> 
> -Original Message-
> From: O Mahony, Billy [mailto:billy.o.mah...@intel.com]
> Sent: Friday, September 08, 2017 11:18 PM
> To: 王志克; ovs-...@openvswitch.org; Jan Scheurich; Darrell Ball; ovs-
> disc...@openvswitch.org; Kevin Traynor
> Subject: RE: [ovs-dev] OVS DPDK NUMA pmd assignment question for
> physical port
> 
> Hi Wang,
> 
> https://mail.openvswitch.org/pipermail/ovs-dev/2017-August/337309.html
> 
> I see it's been acked and is due to be pushed to master with other changes
> on the dpdk merge branch so you'll have to apply it manually for now.
> 
> /Billy.
> 
> > -Original Message-
> > From: 王志克 [mailto:wangzh...@jd.com]
> > Sent: Friday, September 8, 2017 11:48 AM
> > To: ovs-...@openvswitch.org; Jan Scheurich
> > ; O Mahony, Billy
> > ; Darrell Ball ; ovs-
> > disc...@openvswitch.org; Kevin Traynor 
> > Subject: Re: [ovs-dev] OVS DPDK NUMA pmd assignment question for
> > physical port
> >
> > Hi Billy,
> >
> > I used ovs2.7.0. I searched the git log, and not sure which commit it
> > is. Do you happen to know?
> >
> > Yes, I cleared the stats after traffic run.
> >
> > Br,
> > Wang Zhike
> >
> >
> > From: "O Mahony, Billy" 
> > To: "wangzh...@jd.com" , Jan Scheurich
> > , Darrell Ball ,
> > "ovs-discuss@openvswitch.org" ,
> > "ovs-...@openvswitch.org" , Kevin
> Traynor
> > 
> > Subject: Re: [ovs-dev] OVS DPDK NUMA pmd assignment question for
> > physical port
> > Message-ID:
> > <03135aea779d444e90975c2703f148dc58c19...@irsmsx107.ger.c
> > orp.intel.com>
> >
> > Content-Type: text/plain; charset="utf-8"
> >
> > Hi Wang,
> >
> > Thanks for the figures. Unexpected results as you say. Two things come
> > to
> > mind:
> >
> > I'm not sure what code you are using but the cycles per packet
> > statistic was broken for a while recently. Ilya posted a patch to fix
> > it so make sure you have that patch included.
> >
> > Also remember to reset the pmd stats after you start your traffic and
> > then measure after a short duration.
> >
> > Regards,
> > Billy.
> >
> >
> >
> > From: 王志克 [mailto:wangzh...@jd.com]
> > Sent: Friday, September 8, 2017 8:01 AM
> > To: Jan Scheurich ; O Mahony, Billy
> > ; Darrell Ball ; ovs-
> > disc...@openvswitch.org; ovs-...@openvswitch.org; Kevin Traynor
> > 
> > Subject: RE: [ovs-dev] OVS DPDK NUMA pmd assignment question for
> > physical port
> >
> >
> > Hi All,
> >
> >
> >
> > I tested below cases, and get some performance data. The data shows
> > there is little impact for cross NUMA communication, which is
> > different from my expectation. (Previously I mentioned that cross NUMA
> > would add 60% cycles, but I can NOT reproduce it any more).
> >
> >
> >
> > @Jan,
> >
> > You mentioned cross NUMA communication would cost lots more cycles.
> > Can you share your data? I am not sure whether I made some mistake or
> not.
> >
> >
> >
> > @All,
> >
> > Welcome your data if you have data for similar cases. Thanks.
> >
> >
> >
> > Case1: VM0->PMD0->NIC0
> >
> > Case2:VM1->PMD1->NIC0
> >
> > Case3:VM1->PMD0->NIC0
> >
> > Case4:NIC0->PMD0->VM0
> >
> > Case5:NIC0->PMD1->VM1
> >
> > Case6:NIC0->PMD0->VM1
> >
> >
> >
> >        VM Tx Mpps   Host Tx Mpps   avg cycles per packet   avg processing cycles per packet
> > Case1  1.4          1.4            512                     415
> > Case2  1.3          1.3            537                     436
> > Case3  1.35         1.35           514                     390
> >
> >        VM Rx Mpps   Host Rx Mpps   avg cycles per packet   avg processing cycles per packet
> > Case4  1.3          1.3            549                     533
> > Case5  1.3          1.3            559                     540
> > Case6  1.28         1.28           568                     551

Re: [ovs-discuss] [ovs-dev] OVS DPDK NUMA pmd assignment question for physical port

2017-09-10 Thread 王志克
Hi Jan,

Do you have some test data about the cross-NUMA impact?

Thanks.

Br,
Wang Zhike

-Original Message-
From: Jan Scheurich [mailto:jan.scheur...@ericsson.com] 
Sent: Wednesday, September 06, 2017 9:33 PM
To: O Mahony, Billy; 王志克; Darrell Ball; ovs-discuss@openvswitch.org; 
ovs-...@openvswitch.org; Kevin Traynor
Subject: RE: [ovs-dev] OVS DPDK NUMA pmd assignment question for physical port

Hi Billy,

> You are going to have to take the hit crossing the NUMA boundary at some 
> point if your NIC and VM are on different NUMAs.
> 
> So are you saying that it is more expensive to cross the NUMA boundary from 
> the pmd to the VM than to cross it from the NIC to the
> PMD?

Indeed, that is the case: If the NIC crosses the QPI bus when storing packets 
in the remote NUMA there is no cost involved for the PMD. (The QPI bandwidth is 
typically not a bottleneck.) The PMD only performs local memory access.

On the other hand, if the PMD crosses the QPI when copying packets into a 
remote VM, there is a huge latency penalty involved, consuming lots of PMD 
cycles that cannot be spent on processing packets. We at Ericsson have observed 
exactly this behavior.

This latency penalty becomes even worse when the LLC cache hit rate is degraded 
due to LLC cache contention with real VNFs and/or unfavorable packet buffer 
re-use patterns as exhibited by real VNFs compared to typical synthetic 
benchmark apps like DPDK testpmd.

> 
> If so then in that case you'd like to have two (for example) PMDs polling 2 
> queues on the same NIC. With the PMDs on each of the
> NUMA nodes forwarding to the VMs local to that NUMA?
> 
> Of course your NIC would then also need to be able to know which VM (or at least
> which NUMA the VM is on) in order to send the frame
> to the correct rxq.

That would indeed be optimal but hard to realize in the general case (e.g. with 
VXLAN encapsulation) as the actual destination is only known after tunnel pop. 
Here perhaps some probabilistic steering of RSS hash values based on measured 
distribution of final destinations might help in the future.

But even without that in place, we need PMDs on both NUMAs anyhow (for 
NUMA-aware polling of vhostuser ports), so why not use them to also poll remote 
eth ports. We can achieve better average performance with fewer PMDs than with 
the current limitation to NUMA-local polling.

BR, Jan

___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] [ovs-dev] OVS DPDK NUMA pmd assignment question for physical port

2017-09-10 Thread 王志克
Hi Billy,

In my test, almost all traffic went through the EMC. So the fix does not impact 
the result, especially since we want to know the difference (not the exact number).

Can you test to get some data? Thanks.

Br,
Wang Zhike

-Original Message-
From: O Mahony, Billy [mailto:billy.o.mah...@intel.com] 
Sent: Friday, September 08, 2017 11:18 PM
To: 王志克; ovs-...@openvswitch.org; Jan Scheurich; Darrell Ball; 
ovs-discuss@openvswitch.org; Kevin Traynor
Subject: RE: [ovs-dev] OVS DPDK NUMA pmd assignment question for physical port

Hi Wang,

https://mail.openvswitch.org/pipermail/ovs-dev/2017-August/337309.html

I see it's been acked and is due to be pushed to master with other changes on 
the dpdk merge branch so you'll have to apply it manually for now.

/Billy. 

> -Original Message-
> From: 王志克 [mailto:wangzh...@jd.com]
> Sent: Friday, September 8, 2017 11:48 AM
> To: ovs-...@openvswitch.org; Jan Scheurich
> ; O Mahony, Billy
> ; Darrell Ball ; ovs-
> disc...@openvswitch.org; Kevin Traynor 
> Subject: Re: [ovs-dev] OVS DPDK NUMA pmd assignment question for
> physical port
> 
> Hi Billy,
> 
> I used ovs2.7.0. I searched the git log, and not sure which commit it is. Do 
> you
> happen to know?
> 
> Yes, I cleared the stats after traffic run.
> 
> Br,
> Wang Zhike
> 
> 
> From: "O Mahony, Billy" 
> To: "wangzh...@jd.com" , Jan Scheurich
>   , Darrell Ball ,
>   "ovs-discuss@openvswitch.org" ,
>   "ovs-...@openvswitch.org" , Kevin
> Traynor
>   
> Subject: Re: [ovs-dev] OVS DPDK NUMA pmd assignment question for
>   physical port
> Message-ID:
>   <03135aea779d444e90975c2703f148dc58c19...@irsmsx107.ger.c
> orp.intel.com>
> 
> Content-Type: text/plain; charset="utf-8"
> 
> Hi Wang,
> 
> Thanks for the figures. Unexpected results as you say. Two things come to
> mind:
> 
> I'm not sure what code you are using but the cycles per packet statistic was
> broken for a while recently. Ilya posted a patch to fix it so make sure you
> have that patch included.
> 
> Also remember to reset the pmd stats after you start your traffic and then
> measure after a short duration.
> 
> Regards,
> Billy.
> 
> 
> 
> From: 王志克 [mailto:wangzh...@jd.com]
> Sent: Friday, September 8, 2017 8:01 AM
> To: Jan Scheurich ; O Mahony, Billy
> ; Darrell Ball ; ovs-
> disc...@openvswitch.org; ovs-...@openvswitch.org; Kevin Traynor
> 
> Subject: RE: [ovs-dev] OVS DPDK NUMA pmd assignment question for
> physical port
> 
> 
> Hi All,
> 
> 
> 
> I tested below cases, and get some performance data. The data shows there
> is little impact for cross NUMA communication, which is different from my
> expectation. (Previously I mentioned that cross NUMA would add 60%
> cycles, but I can NOT reproduce it any more).
> 
> 
> 
> @Jan,
> 
> You mentioned cross NUMA communication would cost lots more cycles. Can
> you share your data? I am not sure whether I made some mistake or not.
> 
> 
> 
> @All,
> 
> Welcome your data if you have data for similar cases. Thanks.
> 
> 
> 
> Case1: VM0->PMD0->NIC0
> 
> Case2:VM1->PMD1->NIC0
> 
> Case3:VM1->PMD0->NIC0
> 
> Case4:NIC0->PMD0->VM0
> 
> Case5:NIC0->PMD1->VM1
> 
> Case6:NIC0->PMD0->VM1
> 
> 
> 
>        VM Tx Mpps   Host Tx Mpps   avg cycles per packet   avg processing cycles per packet
> Case1  1.4          1.4            512                     415
> Case2  1.3          1.3            537                     436
> Case3  1.35         1.35           514                     390
>
>        VM Rx Mpps   Host Rx Mpps   avg cycles per packet   avg processing cycles per packet
> Case4  1.3          1.3            549                     533
> Case5  1.3          1.3            559                     540
> Case6  1.28         1.28           568                     551
> 
> 
> 
> Br,
> 
> Wang Zhike
> 
> 
> 
> -Original Message-
> From: Jan Scheurich [mailto:jan.scheur...@ericsson.com]
> Sent: Wednesday, September 06, 2017 9:33 PM
> To: O Mahony, Billy; 王志克; Darrell Ball; ovs-
> disc...@openvswitch.org; ovs-
> d...@openvswitch.org; Kevin Traynor
> Subject: RE: [ovs-dev] OVS DPDK NUMA pmd assignment question for
> physical port
> 
> 
> 
> Hi Billy,
> 
> 
> 
> > You are going to have to take the hit crossing the NUMA boundary at some
> point if your NIC and VM are on different NUMAs.
> 
> >
> 
> > So are you saying that it is more expensive to cross the NUMA boundary
> from the pmd to the VM than to cross it from the NIC to the
> 

Re: [ovs-discuss] [ovs-dev] OVS DPDK NUMA pmd assignment question for physical port

2017-09-08 Thread O Mahony, Billy
Hi Wang,

https://mail.openvswitch.org/pipermail/ovs-dev/2017-August/337309.html

I see it's been acked and is due to be pushed to master with other changes on 
the dpdk merge branch so you'll have to apply it manually for now.

/Billy. 

> -Original Message-
> From: 王志克 [mailto:wangzh...@jd.com]
> Sent: Friday, September 8, 2017 11:48 AM
> To: ovs-...@openvswitch.org; Jan Scheurich
> ; O Mahony, Billy
> ; Darrell Ball ; ovs-
> disc...@openvswitch.org; Kevin Traynor 
> Subject: Re: [ovs-dev] OVS DPDK NUMA pmd assignment question for
> physical port
> 
> Hi Billy,
> 
> I used ovs2.7.0. I searched the git log, and not sure which commit it is. Do 
> you
> happen to know?
> 
> Yes, I cleared the stats after traffic run.
> 
> Br,
> Wang Zhike
> 
> 
> From: "O Mahony, Billy" 
> To: "wangzh...@jd.com" , Jan Scheurich
>   , Darrell Ball ,
>   "ovs-discuss@openvswitch.org" ,
>   "ovs-...@openvswitch.org" , Kevin
> Traynor
>   
> Subject: Re: [ovs-dev] OVS DPDK NUMA pmd assignment question for
>   physical port
> Message-ID:
>   <03135aea779d444e90975c2703f148dc58c19...@irsmsx107.ger.c
> orp.intel.com>
> 
> Content-Type: text/plain; charset="utf-8"
> 
> Hi Wang,
> 
> Thanks for the figures. Unexpected results as you say. Two things come to
> mind:
> 
> I'm not sure what code you are using but the cycles per packet statistic was
> broken for a while recently. Ilya posted a patch to fix it so make sure you
> have that patch included.
> 
> Also remember to reset the pmd stats after you start your traffic and then
> measure after a short duration.
> 
> Regards,
> Billy.
> 
> 
> 
> From: 王志克 [mailto:wangzh...@jd.com]
> Sent: Friday, September 8, 2017 8:01 AM
> To: Jan Scheurich ; O Mahony, Billy
> ; Darrell Ball ; ovs-
> disc...@openvswitch.org; ovs-...@openvswitch.org; Kevin Traynor
> 
> Subject: RE: [ovs-dev] OVS DPDK NUMA pmd assignment question for
> physical port
> 
> 
> Hi All,
> 
> 
> 
> I tested below cases, and get some performance data. The data shows there
> is little impact for cross NUMA communication, which is different from my
> expectation. (Previously I mentioned that cross NUMA would add 60%
> cycles, but I can NOT reproduce it any more).
> 
> 
> 
> @Jan,
> 
> You mentioned cross NUMA communication would cost lots more cycles. Can
> you share your data? I am not sure whether I made some mistake or not.
> 
> 
> 
> @All,
> 
> Welcome your data if you have data for similar cases. Thanks.
> 
> 
> 
> Case1: VM0->PMD0->NIC0
> 
> Case2:VM1->PMD1->NIC0
> 
> Case3:VM1->PMD0->NIC0
> 
> Case4:NIC0->PMD0->VM0
> 
> Case5:NIC0->PMD1->VM1
> 
> Case6:NIC0->PMD0->VM1
> 
> 
> 
>        VM Tx Mpps   Host Tx Mpps   avg cycles per packet   avg processing cycles per packet
> Case1  1.4          1.4            512                     415
> Case2  1.3          1.3            537                     436
> Case3  1.35         1.35           514                     390
>
>        VM Rx Mpps   Host Rx Mpps   avg cycles per packet   avg processing cycles per packet
> Case4  1.3          1.3            549                     533
> Case5  1.3          1.3            559                     540
> Case6  1.28         1.28           568                     551
> 
> 
> 
> Br,
> 
> Wang Zhike
> 
> 
> 
> -Original Message-
> From: Jan Scheurich [mailto:jan.scheur...@ericsson.com]
> Sent: Wednesday, September 06, 2017 9:33 PM
> To: O Mahony, Billy; 王志克; Darrell Ball; ovs-
> disc...@openvswitch.org; ovs-
> d...@openvswitch.org; Kevin Traynor
> Subject: RE: [ovs-dev] OVS DPDK NUMA pmd assignment question for
> physical port
> 
> 
> 
> Hi Billy,
> 
> 
> 
> > You are going to have to take the hit crossing the NUMA boundary at some
> point if your NIC and VM are on different NUMAs.
> 
> >
> 
> > So are you saying that it is more expensive to cross the NUMA boundary
> from the pmd to the VM than to cross it from the NIC to the
> 
> > PMD?
> 
> 
> 
> Indeed, that is the case: If the NIC crosses the QPI bus when storing packets
> in the remote NUMA there is no cost involved for the PMD. (The QPI
> bandwidth is typically not a bottleneck.) The PMD only performs local
> memory access.
> 
> 
> 
> On the other hand, if the PMD crosses the QPI when copying packets into a
> remote VM, there is a huge latency penalty involved, consuming lots of PMD
> cycles that cannot be spent on processing packets. We at Ericsson have
> observed exactly this behavior.
> 
> 
> 
> 

Re: [ovs-discuss] [ovs-dev] OVS DPDK NUMA pmd assignment question for physical port

2017-09-08 Thread 王志克
Hi Billy,

I used OVS 2.7.0. I searched the git log and am not sure which commit it is. Do 
you happen to know?

Yes, I cleared the stats after starting the traffic.

Br,
Wang Zhike


From: "O Mahony, Billy" 
To: "wangzh...@jd.com" , Jan Scheurich
, Darrell Ball ,
"ovs-discuss@openvswitch.org" ,
"ovs-...@openvswitch.org" , Kevin Traynor

Subject: Re: [ovs-dev] OVS DPDK NUMA pmd assignment question for
physical port
Message-ID:
<03135aea779d444e90975c2703f148dc58c19...@irsmsx107.ger.corp.intel.com>

Content-Type: text/plain; charset="utf-8"

Hi Wang,

Thanks for the figures. Unexpected results as you say. Two things come to mind:

I'm not sure what code you are using but the cycles per packet statistic was 
broken for a while recently. Ilya posted a patch to fix it so make sure you 
have that patch included.

Also remember to reset the pmd stats after you start your traffic and then 
measure after a short duration.

Regards,
Billy.



From: 王志克 [mailto:wangzh...@jd.com]
Sent: Friday, September 8, 2017 8:01 AM
To: Jan Scheurich ; O Mahony, Billy 
; Darrell Ball ; 
ovs-discuss@openvswitch.org; ovs-...@openvswitch.org; Kevin Traynor 

Subject: RE: [ovs-dev] OVS DPDK NUMA pmd assignment question for physical port


Hi All,



I tested the cases below and got some performance data. The data shows there is 
little impact from cross-NUMA communication, which is different from my 
expectation. (Previously I mentioned that cross-NUMA traffic would add 60% more 
cycles, but I can NOT reproduce that any more).



@Jan,

You mentioned cross NUMA communication would cost lots more cycles. Can you 
share your data? I am not sure whether I made some mistake or not.



@All,

Welcome your data if you have data for similar cases. Thanks.



Case1: VM0->PMD0->NIC0

Case2:VM1->PMD1->NIC0

Case3:VM1->PMD0->NIC0

Case4:NIC0->PMD0->VM0

Case5:NIC0->PMD1->VM1

Case6:NIC0->PMD0->VM1



       VM Tx Mpps   Host Tx Mpps   avg cycles per packet   avg processing cycles per packet
Case1  1.4          1.4            512                     415
Case2  1.3          1.3            537                     436
Case3  1.35         1.35           514                     390

       VM Rx Mpps   Host Rx Mpps   avg cycles per packet   avg processing cycles per packet
Case4  1.3          1.3            549                     533
Case5  1.3          1.3            559                     540
Case6  1.28         1.28           568                     551



Br,

Wang Zhike



-Original Message-
From: Jan Scheurich [mailto:jan.scheur...@ericsson.com]
Sent: Wednesday, September 06, 2017 9:33 PM
To: O Mahony, Billy; 王志克; Darrell Ball; 
ovs-discuss@openvswitch.org; 
ovs-...@openvswitch.org; Kevin Traynor
Subject: RE: [ovs-dev] OVS DPDK NUMA pmd assignment question for physical port



Hi Billy,



> You are going to have to take the hit crossing the NUMA boundary at some 
> point if your NIC and VM are on different NUMAs.

>

> So are you saying that it is more expensive to cross the NUMA boundary from 
> the pmd to the VM than to cross it from the NIC to the

> PMD?



Indeed, that is the case: If the NIC crosses the QPI bus when storing packets 
in the remote NUMA there is no cost involved for the PMD. (The QPI bandwidth is 
typically not a bottleneck.) The PMD only performs local memory access.



On the other hand, if the PMD crosses the QPI when copying packets into a 
remote VM, there is a huge latency penalty involved, consuming lots of PMD 
cycles that cannot be spent on processing packets. We at Ericsson have observed 
exactly this behavior.



This latency penalty becomes even worse when the LLC cache hit rate is degraded 
due to LLC cache contention with real VNFs and/or unfavorable packet buffer 
re-use patterns as exhibited by real VNFs compared to typical synthetic 
benchmark apps like DPDK testpmd.



>

> If so then in that case you'd like to have two (for example) PMDs polling 2 
> queues on the same NIC. With the PMDs on each of the

> NUMA nodes forwarding to the VMs local to that NUMA?

>

> Of course your NIC would then also need to be able to know which VM (or at least
> which NUMA the VM is on) in order to send the frame

> to the correct rxq.



That would indeed be optimal but hard to realize in the general case (e.g. with 
VXLAN encapsulation) as the actual destination is only known after tunnel pop. 
Here perhaps some probabilistic steering of RSS hash values based on measured 
distribution of final destinations might help in the future.



But even without that in 

Re: [ovs-discuss] [ovs-dev] OVS DPDK NUMA pmd assignment question for physical port

2017-09-08 Thread O Mahony, Billy
Hi Wang,

Thanks for the figures. Unexpected results as you say. Two things come to mind:

I’m not sure what code you are using but the cycles per packet statistic was 
broken for a while recently. Ilya posted a patch to fix it so make sure you 
have that patch included.

Also remember to reset the pmd stats after you start your traffic and then 
measure after a short duration.
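
For example, with the standard appctl commands (run on the host while the 
traffic is flowing):

   ovs-appctl dpif-netdev/pmd-stats-clear
   # ... let the traffic run for a few seconds ...
   ovs-appctl dpif-netdev/pmd-stats-show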

Regards,
Billy.



From: 王志克 [mailto:wangzh...@jd.com]
Sent: Friday, September 8, 2017 8:01 AM
To: Jan Scheurich ; O Mahony, Billy 
; Darrell Ball ; 
ovs-discuss@openvswitch.org; ovs-...@openvswitch.org; Kevin Traynor 

Subject: RE: [ovs-dev] OVS DPDK NUMA pmd assignment question for physical port


Hi All,



I tested the cases below and got some performance data. The data shows there is 
little impact from cross-NUMA communication, which is different from my 
expectation. (Previously I mentioned that cross-NUMA traffic would add 60% more 
cycles, but I can NOT reproduce that any more).



@Jan,

You mentioned cross NUMA communication would cost lots more cycles. Can you 
share your data? I am not sure whether I made some mistake or not.



@All,

Welcome your data if you have data for similar cases. Thanks.



Case1: VM0->PMD0->NIC0

Case2:VM1->PMD1->NIC0

Case3:VM1->PMD0->NIC0

Case4:NIC0->PMD0->VM0

Case5:NIC0->PMD1->VM1

Case6:NIC0->PMD0->VM1



       VM Tx Mpps   Host Tx Mpps   avg cycles per packet   avg processing cycles per packet
Case1  1.4          1.4            512                     415
Case2  1.3          1.3            537                     436
Case3  1.35         1.35           514                     390

       VM Rx Mpps   Host Rx Mpps   avg cycles per packet   avg processing cycles per packet
Case4  1.3          1.3            549                     533
Case5  1.3          1.3            559                     540
Case6  1.28         1.28           568                     551



Br,

Wang Zhike



-Original Message-
From: Jan Scheurich [mailto:jan.scheur...@ericsson.com]
Sent: Wednesday, September 06, 2017 9:33 PM
To: O Mahony, Billy; 王志克; Darrell Ball; 
ovs-discuss@openvswitch.org; 
ovs-...@openvswitch.org; Kevin Traynor
Subject: RE: [ovs-dev] OVS DPDK NUMA pmd assignment question for physical port



Hi Billy,



> You are going to have to take the hit crossing the NUMA boundary at some 
> point if your NIC and VM are on different NUMAs.

>

> So are you saying that it is more expensive to cross the NUMA boundary from 
> the pmd to the VM than to cross it from the NIC to the

> PMD?



Indeed, that is the case: If the NIC crosses the QPI bus when storing packets 
in the remote NUMA there is no cost involved for the PMD. (The QPI bandwidth is 
typically not a bottleneck.) The PMD only performs local memory access.



On the other hand, if the PMD crosses the QPI when copying packets into a 
remote VM, there is a huge latency penalty involved, consuming lots of PMD 
cycles that cannot be spent on processing packets. We at Ericsson have observed 
exactly this behavior.



This latency penalty becomes even worse when the LLC cache hit rate is degraded 
due to LLC cache contention with real VNFs and/or unfavorable packet buffer 
re-use patterns as exhibited by real VNFs compared to typical synthetic 
benchmark apps like DPDK testpmd.



>

> If so then in that case you'd like to have two (for example) PMDs polling 2 
> queues on the same NIC. With the PMDs on each of the

> NUMA nodes forwarding to the VMs local to that NUMA?

>

> Of course your NIC would then also need to be able to know which VM (or at least
> which NUMA the VM is on) in order to send the frame

> to the correct rxq.



That would indeed be optimal but hard to realize in the general case (e.g. with 
VXLAN encapsulation) as the actual destination is only known after tunnel pop. 
Here perhaps some probabilistic steering of RSS hash values based on measured 
distribution of final destinations might help in the future.



But even without that in place, we need PMDs on both NUMAs anyhow (for 
NUMA-aware polling of vhostuser ports), so why not use them to also poll remote 
eth ports. We can achieve better average performance with fewer PMDs than with 
the current limitation to NUMA-local polling.



BR, Jan


___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] [ovs-dev] OVS DPDK NUMA pmd assignment question for physical port

2017-09-08 Thread 王志克
Hi All,



I tested the cases below and got some performance data. The data shows there is 
little impact from cross-NUMA communication, which is different from my 
expectation. (Previously I mentioned that cross-NUMA traffic would add 60% more 
cycles, but I can NOT reproduce that any more).



@Jan,

You mentioned cross NUMA communication would cost lots more cycles. Can you 
share your data? I am not sure whether I made some mistake or not.



@All,

Welcome your data if you have data for similar cases. Thanks.



Case1: VM0->PMD0->NIC0

Case2:VM1->PMD1->NIC0

Case3:VM1->PMD0->NIC0

Case4:NIC0->PMD0->VM0

Case5:NIC0->PMD1->VM1

Case6:NIC0->PMD0->VM1



       VM Tx Mpps   Host Tx Mpps   avg cycles per packet   avg processing cycles per packet
Case1  1.4          1.4            512                     415
Case2  1.3          1.3            537                     436
Case3  1.35         1.35           514                     390

       VM Rx Mpps   Host Rx Mpps   avg cycles per packet   avg processing cycles per packet
Case4  1.3          1.3            549                     533
Case5  1.3          1.3            559                     540
Case6  1.28         1.28           568                     551
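
The rxq-to-PMD placement used in each case can be double-checked on the host with:

   ovs-appctl dpif-netdev/pmd-rxq-show

which lists, per PMD thread (core id and NUMA node), the port rx queues it polls.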



Br,

Wang Zhike



-Original Message-
From: Jan Scheurich [mailto:jan.scheur...@ericsson.com]
Sent: Wednesday, September 06, 2017 9:33 PM
To: O Mahony, Billy; 王志克; Darrell Ball; 
ovs-discuss@openvswitch.org; 
ovs-...@openvswitch.org; Kevin Traynor
Subject: RE: [ovs-dev] OVS DPDK NUMA pmd assignment question for physical port



Hi Billy,



> You are going to have to take the hit crossing the NUMA boundary at some 
> point if your NIC and VM are on different NUMAs.

>

> So are you saying that it is more expensive to cross the NUMA boundary from 
> the pmd to the VM than to cross it from the NIC to the

> PMD?



Indeed, that is the case: If the NIC crosses the QPI bus when storing packets 
in the remote NUMA there is no cost involved for the PMD. (The QPI bandwidth is 
typically not a bottleneck.) The PMD only performs local memory access.



On the other hand, if the PMD crosses the QPI when copying packets into a 
remote VM, there is a huge latency penalty involved, consuming lots of PMD 
cycles that cannot be spent on processing packets. We at Ericsson have observed 
exactly this behavior.



This latency penalty becomes even worse when the LLC cache hit rate is degraded 
due to LLC cache contention with real VNFs and/or unfavorable packet buffer 
re-use patterns as exhibited by real VNFs compared to typical synthetic 
benchmark apps like DPDK testpmd.



>

> If so then in that case you'd like to have two (for example) PMDs polling 2 
> queues on the same NIC. With the PMDs on each of the

> NUMA nodes forwarding to the VMs local to that NUMA?

>

> Of course your NIC would then also need to be able to know which VM (or at least
> which NUMA the VM is on) in order to send the frame

> to the correct rxq.



That would indeed be optimal but hard to realize in the general case (e.g. with 
VXLAN encapsulation) as the actual destination is only known after tunnel pop. 
Here perhaps some probabilistic steering of RSS hash values based on measured 
distribution of final destinations might help in the future.



But even without that in place, we need PMDs on both NUMAs anyhow (for 
NUMA-aware polling of vhostuser ports), so why not use them to also poll remote 
eth ports. We can achieve better average performance with fewer PMDs than with 
the current limitation to NUMA-local polling.



BR, Jan


___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] [ovs-dev] OVS DPDK NUMA pmd assignment question for physical port

2017-09-08 Thread 王志克
Hi All,



I tested the cases below and got some performance data. The data shows there is 
little impact from cross-NUMA communication, which is different from my 
expectation. (Previously I mentioned that cross-NUMA traffic would add 60% more 
cycles, but I can NOT reproduce that any more).



@Jan,

You mentioned cross NUMA communication would cost lots more cycles. Can you 
share your data? I am not sure whether I made some mistake or not.



@All,

Welcome your data if you have data for similar cases. Thanks.



Cases:

[The case definitions were in an embedded image (image001.png), not reproduced here.]

       VM Tx Mpps   Host Tx Mpps   avg cycles per packet   avg processing cycles per packet
Case1  1.4          1.4            512                     415
Case2  1.3          1.3            537                     436
Case3  1.35         1.35           514                     390

       VM Rx Mpps   Host Rx Mpps   avg cycles per packet   avg processing cycles per packet
Case4  1.3          1.3            549                     533
Case5  1.3          1.3            559                     540
Case6  1.28         1.28           568                     551




Br,

Wang Zhike



-Original Message-
From: Jan Scheurich [mailto:jan.scheur...@ericsson.com]
Sent: Wednesday, September 06, 2017 9:33 PM
To: O Mahony, Billy; 王志克; Darrell Ball; ovs-discuss@openvswitch.org; 
ovs-...@openvswitch.org; Kevin Traynor
Subject: RE: [ovs-dev] OVS DPDK NUMA pmd assignment question for physical port



Hi Billy,



> You are going to have to take the hit crossing the NUMA boundary at some 
> point if your NIC and VM are on different NUMAs.

>

> So are you saying that it is more expensive to cross the NUMA boundary from 
> the pmd to the VM than to cross it from the NIC to the

> PMD?



Indeed, that is the case: If the NIC crosses the QPI bus when storing packets 
in the remote NUMA there is no cost involved for the PMD. (The QPI bandwidth is 
typically not a bottleneck.) The PMD only performs local memory access.



On the other hand, if the PMD crosses the QPI when copying packets into a 
remote VM, there is a huge latency penalty involved, consuming lots of PMD 
cycles that cannot be spent on processing packets. We at Ericsson have observed 
exactly this behavior.



This latency penalty becomes even worse when the LLC cache hit rate is degraded 
due to LLC cache contention with real VNFs and/or unfavorable packet buffer 
re-use patterns as exhibited by real VNFs compared to typical synthetic 
benchmark apps like DPDK testpmd.



>

> If so then in that case you'd like to have two (for example) PMDs polling 2 
> queues on the same NIC. With the PMDs on each of the

> NUMA nodes forwarding to the VMs local to that NUMA?

>

> > Of course your NIC would then also need to be able to know which VM (or at least
> which NUMA the VM is on) in order to send the frame

> to the correct rxq.



That would indeed be optimal but hard to realize in the general case (e.g. with 
VXLAN encapsulation) as the actual destination is only known after tunnel pop. 
Here perhaps some probabilistic steering of RSS hash values based on measured 
distribution of final destinations might help in the future.



But even without that in place, we need PMDs on both NUMAs anyhow (for 
NUMA-aware polling of vhostuser ports), so why not use them to also poll remote 
eth ports. We can achieve better average performance with fewer PMDs than with 
the current limitation to NUMA-local polling.



BR, Jan


___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] [ovs-dev] OVS DPDK NUMA pmd assignment question for physical port

2017-09-06 Thread 王志克


-Original Message-
From: O Mahony, Billy [mailto:billy.o.mah...@intel.com] 
Sent: Wednesday, September 06, 2017 10:49 PM
To: Kevin Traynor; Jan Scheurich; 王志克; Darrell Ball; 
ovs-discuss@openvswitch.org; ovs-...@openvswitch.org
Subject: RE: [ovs-dev] OVS DPDK NUMA pmd assignment question for physical port



> -Original Message-
> From: Kevin Traynor [mailto:ktray...@redhat.com]
> Sent: Wednesday, September 6, 2017 2:50 PM
> To: Jan Scheurich ; O Mahony, Billy
> ; wangzh...@jd.com; Darrell Ball
> ; ovs-discuss@openvswitch.org; ovs-
> d...@openvswitch.org
> Subject: Re: [ovs-dev] OVS DPDK NUMA pmd assignment question for
> physical port
> 
> On 09/06/2017 02:33 PM, Jan Scheurich wrote:
> > Hi Billy,
> >
> >> You are going to have to take the hit crossing the NUMA boundary at
> some point if your NIC and VM are on different NUMAs.
> >>
> >> So are you saying that it is more expensive to cross the NUMA
> >> boundary from the pmd to the VM than to cross it from the NIC to the
> PMD?
> >
> > Indeed, that is the case: If the NIC crosses the QPI bus when storing
> packets in the remote NUMA there is no cost involved for the PMD. (The QPI
> bandwidth is typically not a bottleneck.) The PMD only performs local
> memory access.
> >
> > On the other hand, if the PMD crosses the QPI when copying packets into a
> remote VM, there is a huge latency penalty involved, consuming lots of PMD
> cycles that cannot be spent on processing packets. We at Ericsson have
> observed exactly this behavior.
> >
> > This latency penalty becomes even worse when the LLC cache hit rate is
> degraded due to LLC cache contention with real VNFs and/or unfavorable
> packet buffer re-use patterns as exhibited by real VNFs compared to typical
> synthetic benchmark apps like DPDK testpmd.
> >
> >>
> >> If so then in that case you'd like to have two (for example) PMDs
> >> polling 2 queues on the same NIC. With the PMDs on each of the NUMA
> nodes forwarding to the VMs local to that NUMA?
> >>
> >> Of course your NIC would then also need to be able to know which VM (or
> >> at least which NUMA the VM is on) in order to send the frame to the
> correct rxq.
> >
> > That would indeed be optimal but hard to realize in the general case (e.g.
> with VXLAN encapsulation) as the actual destination is only known after
> tunnel pop. Here perhaps some probabilistic steering of RSS hash values
> based on measured distribution of final destinations might help in the future.
> >
> > But even without that in place, we need PMDs on both NUMAs anyhow
> (for NUMA-aware polling of vhostuser ports), so why not use them to also
> poll remote eth ports. We can achieve better average performance with
> fewer PMDs than with the current limitation to NUMA-local polling.
> >
> 
> If the user has some knowledge of the numa locality of ports and can place
> VM's accordingly, default cross-numa assignment can harm performance.
> Also, it would make for very unpredictable performance from test to test and
> even for flow to flow on a datapath.
[[BO'M]] Wang's original request would constitute default cross numa assignment 
but I don't think this modified proposal would as it still requires explicit 
config to assign to the remote NUMA.

[Wangzhike] I think either a configuration option or a compile-time option is OK for me, 
since only the physical NIC rxqs need to be configured. It is only a one-shot job.
Regarding the test concern, I think it is worth clarifying the performance 
difference if the new behavior improves the rx throughput a lot.
> 
> Kevin.
> 
> > BR, Jan
> >

___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] [ovs-dev] OVS DPDK NUMA pmd assignment question for physical port

2017-09-06 Thread 王志克
Hi Billy,

Please see my reply in line.

Br,
Wang Zhike

-Original Message-
From: O Mahony, Billy [mailto:billy.o.mah...@intel.com] 
Sent: Wednesday, September 06, 2017 9:01 PM
To: 王志克; Darrell Ball; ovs-discuss@openvswitch.org; ovs-...@openvswitch.org; 
Kevin Traynor
Subject: RE: [ovs-dev] OVS DPDK NUMA pmd assignment question for physical port

Hi Wang,

I think the mention of pinning was confusing me a little. Let me see if I fully 
understand your use case: you don't 'want' to pin anything, but you are using 
it as a way to force the distribution of rxqs from a single NIC across PMDs 
on different NUMAs, because without pinning all rxqs are assigned to the NUMA-local 
PMD, leaving the other PMD totally unused.

But then when you used pinning, the PMDs became isolated, so the vhostuser 
ports' rxqs would not be assigned to the PMDs unless they too were pinned. That 
worked but was not manageable as VMs (and vhost ports) came and went.

Yes? 
[Wang Zhike] Yes, exactly.

In that case what we probably want is the ability to pin an rxq to a pmd but 
without also isolating the pmd. So the PMD could be assigned some rxqs manually 
and still have others automatically assigned. 

But what I still don't understand is why you don't put both PMDs on the same 
NUMA node. Given that you cannot program the NIC to know which VM a frame is 
for, you would have to RSS the frames across rxqs (i.e. across NUMA nodes). 
Of those going to the NIC's local NUMA node, 50% would have to go across the NUMA 
boundary when their destination VM was decided - which is okay - they have to 
cross the boundary at some point. But of the frames going to the non-local NUMA, 
50% will actually be destined for what was originally the local NUMA 
node. Those packets (25% of all traffic) will cross NUMA *twice*, 
whereas if all PMDs were on the NIC's NUMA node those frames would never have 
had to pass between NUMA nodes.

In short I think it's more efficient to have both PMDs on the same NUMA node as 
the NIC.

[Wang Zhike] Considering the Tx direction, i.e. from a VM on a different NUMA node 
to the phy NIC, I am not sure whether your proposal would degrade the Tx 
performance...
I will try to test different cross-NUMA scenarios to get the performance penalty 
data.

There is one more comment below.

> -Original Message-
> From: 王志克 [mailto:wangzh...@jd.com]
> Sent: Wednesday, September 6, 2017 12:50 PM
> To: O Mahony, Billy ; Darrell Ball
> ; ovs-discuss@openvswitch.org; ovs-
> d...@openvswitch.org; Kevin Traynor 
> Subject: RE: [ovs-dev] OVS DPDK NUMA pmd assignment question for
> physical port
> 
> Hi Billy,
> 
> See my reply in line.
> 
> Br,
> Wang Zhike
> 
> -Original Message-
> From: O Mahony, Billy [mailto:billy.o.mah...@intel.com]
> Sent: Wednesday, September 06, 2017 7:26 PM
> To: 王志克; Darrell Ball; ovs-discuss@openvswitch.org; ovs-
> d...@openvswitch.org; Kevin Traynor
> Subject: RE: [ovs-dev] OVS DPDK NUMA pmd assignment question for
> physical port
> 
> Hi Wang,
> 
> You are going to have to take the hit crossing the NUMA boundary at some
> point if your NIC and VM are on different NUMAs.
> 
> So are you saying that it is more expensive to cross the NUMA boundary
> from the pmd to the VM than to cross it from the NIC to the PMD?
> 
> [Wang Zhike] I do not have such data. I hope we can try the new behavior
> and get the test result, and then know whether and how much performance
> can be improved.

[[BO'M]] You don't need a code change to compare the performance of these two 
scenarios. You can simulate it by pinning queues to VMs. I'd imagine crossing 
the NUMA boundary during the PCI DMA would be cheaper than crossing it over 
vhost. But I don't know what the result would be, and this would be a pretty 
interesting figure to have, by the way.


> 
> If so then in that case you'd like to have two (for example) PMDs polling 2
> queues on the same NIC. With the PMDs on each of the NUMA nodes
> forwarding to the VMs local to that NUMA?
> 
> Of course your NIC would then also need to be able to know which VM (or at
> least which NUMA the VM is on) in order to send the frame to the correct
> rxq.
> 
> [Wang Zhike] Currently I do not know how to achieve it. From my view, the NIC
> does not know which NUMA node is the destination of the packet. Only
> after OVS handling (e.g. looking up the forwarding rule in OVS) can it know
> the destination. If the NIC does not know the destination NUMA socket, it does
> not matter which PMD polls it.
> 
> 
> /Billy.
> 
> > -Original Message-
> > From: 王志克 [mailto:wangzh...@jd.com]
> > Sent: Wednesday, September 6, 2017 11:41 AM
> > To: O Mahony, Billy ; Darrell Ball
> > ; ovs-discuss@openvswitch.org; ovs-
> > d...@openvswitch.org; Kevin Traynor 
> > Subject: RE: [ovs-dev] OVS DPDK NUMA pmd assignment question for
> > physical port
> >
> > Hi Billy,

Re: [ovs-discuss] [ovs-dev] OVS DPDK NUMA pmd assignment question for physical port

2017-09-06 Thread Jan Scheurich
> 
> I think the mention of pinning was confusing me a little. Let me see if I 
> fully understand your use case:  You don't 'want' to pin
> anything but you are using it as a way to force the distribution of rxq from 
> a single nic across to PMDs on different NUMAs. As without
> pinning all rxqs are assigned to the NUMA-local pmd leaving the other PMD 
> totally unused.
> 
> But then when you used pinning, the PMDs became isolated, so the vhostuser 
> ports rxqs would not be assigned to the PMDs unless
> they too were pinned. Which worked but was not manageable as VM (and vhost 
> ports) came and went.
> 
> Yes?

Yes!!!

> 
> In that case what we probably want is the ability to pin an rxq to a pmd but 
> without also isolating the pmd. So the PMD could be
> assigned some rxqs manually and still have others automatically assigned.

Wonderful. That is exactly what I have wanted to propose for a while: Separate 
PMD isolation from pinning of Rx queues. 

Tying these two together makes it impossible to use pinning of Rx queues in 
OpenStack context (without the addition of dedicated PMDs/cores). And even 
during manual testing it is a nightmare to have to manually pin all 48 
vhostuser queues just because we want to pin the two heavy-loaded Rx queues to 
different PMDs.

The idea would be to introduce a separate configuration option for PMDs to 
isolate them, and no longer automatically set that when pinning an rx queue to 
the PMD.
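
For reference, with today's behaviour a pinning like the following (port name 
and queue/core ids are purely illustrative) isolates the chosen PMDs as a side 
effect, so every other rxq that should run on them has to be pinned by hand as well:

   ovs-vsctl set Interface dpdk0 other_config:pmd-rxq-affinity="0:2,1:22"

The proposal would keep this pinning syntax but move the isolation behaviour 
into a separate, explicit per-PMD option.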

BR, Jan
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] [ovs-dev] OVS DPDK NUMA pmd assignment question for physical port

2017-09-06 Thread Jan Scheurich
Hi Billy,

> You are going to have to take the hit crossing the NUMA boundary at some 
> point if your NIC and VM are on different NUMAs.
> 
> So are you saying that it is more expensive to cross the NUMA boundary from 
> the pmd to the VM than to cross it from the NIC to the
> PMD?

Indeed, that is the case: If the NIC crosses the QPI bus when storing packets 
in the remote NUMA there is no cost involved for the PMD. (The QPI bandwidth is 
typically not a bottleneck.) The PMD only performs local memory access.

On the other hand, if the PMD crosses the QPI when copying packets into a 
remote VM, there is a huge latency penalty involved, consuming lots of PMD 
cycles that cannot be spent on processing packets. We at Ericsson have observed 
exactly this behavior.

This latency penalty becomes even worse when the LLC cache hit rate is degraded 
due to LLC cache contention with real VNFs and/or unfavorable packet buffer 
re-use patterns as exhibited by real VNFs compared to typical synthetic 
benchmark apps like DPDK testpmd.

> 
> If so then in that case you'd like to have two (for example) PMDs polling 2 
> queues on the same NIC. With the PMDs on each of the
> NUMA nodes forwarding to the VMs local to that NUMA?
> 
> Of course your NIC would then also need to be able to know which VM (or at least
> which NUMA the VM is on) in order to send the frame
> to the correct rxq.

That would indeed be optimal but hard to realize in the general case (e.g. with 
VXLAN encapsulation) as the actual destination is only known after tunnel pop. 
Here perhaps some probabilistic steering of RSS hash values based on measured 
distribution of final destinations might help in the future.

But even without that in place, we need PMDs on both NUMAs anyhow (for 
NUMA-aware polling of vhostuser ports), so why not use them to also poll remote 
eth ports. We can achieve better average performance with fewer PMDs than with 
the current limitation to NUMA-local polling.

BR, Jan

___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] [ovs-dev] OVS DPDK NUMA pmd assignment question for physical port

2017-09-06 Thread O Mahony, Billy


> -Original Message-
> From: Kevin Traynor [mailto:ktray...@redhat.com]
> Sent: Wednesday, September 6, 2017 3:02 PM
> To: Jan Scheurich ; O Mahony, Billy
> ; wangzh...@jd.com; Darrell Ball
> ; ovs-discuss@openvswitch.org; ovs-
> d...@openvswitch.org
> Subject: Re: [ovs-dev] OVS DPDK NUMA pmd assignment question for
> physical port
> 
> On 09/06/2017 02:43 PM, Jan Scheurich wrote:
> >>
> >> I think the mention of pinning was confusing me a little. Let me see
> >> if I fully understand your use case:  You don't 'want' to pin
> >> anything but you are using it as a way to force the distribution of rxq 
> >> from
> a single nic across to PMDs on different NUMAs. As without pinning all rxqs
> are assigned to the NUMA-local pmd leaving the other PMD totally unused.
> >>
> >> But then when you used pinning, the PMDs became isolated, so the
> >> vhostuser ports rxqs would not be assigned to the PMDs unless they too
> were pinned. Which worked but was not manageable as VM (and vhost
> ports) came and went.
> >>
> >> Yes?
> >
> > Yes!!!
[[BO'M]] Hurrah!
> >
> >>
> >> In that case what we probably want is the ability to pin an rxq to a
> >> pmd but without also isolating the pmd. So the PMD could be assigned
> some rxqs manually and still have others automatically assigned.
> >
> > Wonderful. That is exactly what I have wanted to propose for a while:
> Separate PMD isolation from pinning of Rx queues.
> >
> > Tying these two together makes it impossible to use pinning of Rx queues
> in OpenStack context (without the addition of dedicated PMDs/cores). And
> even during manual testing it is a nightmare to have to manually pin all 48
> vhostuser queues just because we want to pin the two heavy-loaded Rx
> queues to different PMDs.
> >
> 
> That sounds like it would be useful. Do you know in advance of running which
> rxq's they will be? i.e. you know it's a particular port and there is only one
> queue. Or you don't know but analyze at runtime and then reconfigure?
> 
> > The idea would be to introduce a separate configuration option for PMDs
> to isolate them, and no longer automatically set that when pinning an rx
> queue to the PMD.
> >
> 
> Please don't break backward compatibility. I think it would be better to keep
> the existing command as is and add a new softer version that allows other
> rxq's to be scheduled on that pmd also.
[[BO'M]] Although, is the implicit isolation feature of pmd-rxq-affinity actually 
used in the wild? Still, it's sensible to introduce the new 'softer 
version' as you say. 
> 
> Kevin.
> 
> > BR, Jan
> >

___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] [ovs-dev] OVS DPDK NUMA pmd assignment question for physical port

2017-09-06 Thread O Mahony, Billy


> -Original Message-
> From: Kevin Traynor [mailto:ktray...@redhat.com]
> Sent: Wednesday, September 6, 2017 2:50 PM
> To: Jan Scheurich ; O Mahony, Billy
> ; wangzh...@jd.com; Darrell Ball
> ; ovs-discuss@openvswitch.org; ovs-
> d...@openvswitch.org
> Subject: Re: [ovs-dev] OVS DPDK NUMA pmd assignment question for
> physical port
> 
> On 09/06/2017 02:33 PM, Jan Scheurich wrote:
> > Hi Billy,
> >
> >> You are going to have to take the hit crossing the NUMA boundary at
> some point if your NIC and VM are on different NUMAs.
> >>
> >> So are you saying that it is more expensive to cross the NUMA
> >> boundary from the pmd to the VM than to cross it from the NIC to the
> PMD?
> >
> > Indeed, that is the case: If the NIC crosses the QPI bus when storing
> packets in the remote NUMA there is no cost involved for the PMD. (The QPI
> bandwidth is typically not a bottleneck.) The PMD only performs local
> memory access.
> >
> > On the other hand, if the PMD crosses the QPI when copying packets into a
> remote VM, there is a huge latency penalty involved, consuming lots of PMD
> cycles that cannot be spent on processing packets. We at Ericsson have
> observed exactly this behavior.
> >
> > This latency penalty becomes even worse when the LLC cache hit rate is
> degraded due to LLC cache contention with real VNFs and/or unfavorable
> packet buffer re-use patterns as exhibited by real VNFs compared to typical
> synthetic benchmark apps like DPDK testpmd.
> >
> >>
> >> If so then in that case you'd like to have two (for example) PMDs
> >> polling 2 queues on the same NIC. With the PMDs on each of the NUMA
> nodes forwarding to the VMs local to that NUMA?
> >>
> >> Of course your NIC would then also need to be able to know which VM (or
> >> at least which NUMA the VM is on) in order to send the frame to the
> correct rxq.
> >
> > That would indeed be optimal but hard to realize in the general case (e.g.
> with VXLAN encapsulation) as the actual destination is only known after
> tunnel pop. Here perhaps some probabilistic steering of RSS hash values
> based on measured distribution of final destinations might help in the future.
> >
> > But even without that in place, we need PMDs on both NUMAs anyhow
> (for NUMA-aware polling of vhostuser ports), so why not use them to also
> poll remote eth ports. We can achieve better average performance with
> fewer PMDs than with the current limitation to NUMA-local polling.
> >
> 
> If the user has some knowledge of the numa locality of ports and can place
> VM's accordingly, default cross-numa assignment can harm performance.
> Also, it would make for very unpredictable performance from test to test and
> even for flow to flow on a datapath.
[[BO'M]] Wang's original request would constitute default cross numa assignment 
but I don't think this modified proposal would as it still requires explicit 
config to assign to the remote NUMA.
> 
> Kevin.
> 
> > BR, Jan
> >

___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] [ovs-dev] OVS DPDK NUMA pmd assignment question for physical port

2017-09-06 Thread Kevin Traynor
On 09/06/2017 02:43 PM, Jan Scheurich wrote:
>>
>> I think the mention of pinning was confusing me a little. Let me see if I 
>> fully understand your use case:  You don't 'want' to pin
>> anything but you are using it as a way to force the distribution of rxq from 
>> a single nic across to PMDs on different NUMAs. As without
>> pinning all rxqs are assigned to the NUMA-local pmd leaving the other PMD 
>> totally unused.
>>
>> But then when you used pinning, the PMDs became isolated, so the vhostuser 
>> ports rxqs would not be assigned to the PMDs unless
>> they too were pinned. Which worked but was not manageable as VM (and vhost 
>> ports) came and went.
>>
>> Yes?
> 
> Yes!!!
> 
>>
>> In that case what we probably want is the ability to pin an rxq to a pmd but 
>> without also isolating the pmd. So the PMD could be
>> assigned some rxqs manually and still have others automatically assigned.
> 
> Wonderful. That is exactly what I have wanted to propose for a while: 
> Separate PMD isolation from pinning of Rx queues. 
> 
> Tying these two together makes it impossible to use pinning of Rx queues in 
> OpenStack context (without the addition of dedicated PMDs/cores). And even 
> during manual testing it is a nightmare to have to manually pin all 48 
> vhostuser queues just because we want to pin the two heavy-loaded Rx queues 
> to different PMDs.
> 

That sounds like it would be useful. Do you know in advance of running
which rxq's they will be? i.e. you know it's a particular port and there
is only one queue. Or you don't know but analyze at runtime and then
reconfigure?

> The idea would be to introduce a separate configuration option for PMDs to 
> isolate them, and no longer automatically set that when pinning an rx queue 
> to the PMD.
> 

Please don't break backward compatibility. I think it would be better to
keep the existing command as is and add a new softer version that allows
other rxq's to be scheduled on that pmd also.

Kevin.

> BR, Jan
> 

___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] [ovs-dev] OVS DPDK NUMA pmd assignment question for physical port

2017-09-06 Thread Kevin Traynor
On 09/06/2017 02:33 PM, Jan Scheurich wrote:
> Hi Billy,
> 
>> You are going to have to take the hit crossing the NUMA boundary at some 
>> point if your NIC and VM are on different NUMAs.
>>
>> So are you saying that it is more expensive to cross the NUMA boundary from 
>> the pmd to the VM than to cross it from the NIC to the
>> PMD?
> 
> Indeed, that is the case: If the NIC crosses the QPI bus when storing packets 
> in the remote NUMA there is no cost involved for the PMD. (The QPI bandwidth 
> is typically not a bottleneck.) The PMD only performs local memory access.
> 
> On the other hand, if the PMD crosses the QPI when copying packets into a 
> remote VM, there is a huge latency penalty involved, consuming lots of PMD 
> cycles that cannot be spent on processing packets. We at Ericsson have 
> observed exactly this behavior.
> 
> This latency penalty becomes even worse when the LLC cache hit rate is 
> degraded due to LLC cache contention with real VNFs and/or unfavorable packet 
> buffer re-use patterns as exhibited by real VNFs compared to typical 
> synthetic benchmark apps like DPDK testpmd.
> 
>>
>> If so then in that case you'd like to have two (for example) PMDs polling 2 
>> queues on the same NIC. With the PMDs on each of the
>> NUMA nodes forwarding to the VMs local to that NUMA?
>>
>> Of course your NIC would then also need to be able to know which VM (or at 
>> least which NUMA the VM is on) in order to send the frame
>> to the correct rxq.
> 
> That would indeed be optimal but hard to realize in the general case (e.g. 
> with VXLAN encapsulation) as the actual destination is only known after 
> tunnel pop. Here perhaps some probabilistic steering of RSS hash values based 
> on measured distribution of final destinations might help in the future.
> 
> But even without that in place, we need PMDs on both NUMAs anyhow (for 
> NUMA-aware polling of vhostuser ports), so why not use them to also poll 
> remote eth ports. We can achieve better average performance with fewer PMDs 
> than with the current limitation to NUMA-local polling.
> 

If the user has some knowledge of the numa locality of ports and can
place VM's accordingly, default cross-numa assignment can harm
performance. Also, it would make for very unpredictable performance from
test to test and even for flow to flow on a datapath.

Kevin.

> BR, Jan
> 

___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] [ovs-dev] OVS DPDK NUMA pmd assignment question for physical port

2017-09-06 Thread O Mahony, Billy
Hi Wang,

I think the mention of pinning was confusing me a little. Let me see if I fully 
understand your use case: you don't 'want' to pin anything, but you are using 
it as a way to force the distribution of rxqs from a single NIC across PMDs 
on different NUMAs, because without pinning all rxqs are assigned to the NUMA-local 
PMD, leaving the other PMD totally unused.

But then when you used pinning, the PMDs became isolated, so the vhostuser 
ports' rxqs would not be assigned to the PMDs unless they too were pinned. That 
worked but was not manageable as VMs (and vhost ports) came and went.

Yes? 

In that case what we probably want is the ability to pin an rxq to a pmd but 
without also isolating the pmd. So the PMD could be assigned some rxqs manually 
and still have others automatically assigned. 

But what I still don't understand is why you don't put both PMDs on the same 
NUMA node. Given that you cannot program the NIC to know which VM a frame is 
for, you would have to RSS the frames across rxqs (i.e. across NUMA nodes). 
Of those going to the NIC's local NUMA node, 50% would have to go across the NUMA 
boundary when their destination VM was decided - which is okay - they have to 
cross the boundary at some point. But of the frames going to the non-local NUMA, 
50% will actually be destined for what was originally the local NUMA 
node. Those packets (25% of all traffic) will cross NUMA *twice*, 
whereas if all PMDs were on the NIC's NUMA node those frames would never have 
had to pass between NUMA nodes.

In short I think it's more efficient to have both PMDs on the same NUMA node as 
the NIC.
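
As a rough sketch (PCI address and core ids are illustrative, adjust to your 
host), you can check which node the NIC sits on and then give OVS two PMD cores 
on that node:

   cat /sys/bus/pci/devices/0000:05:00.0/numa_node
   # e.g. if it reports node 0 and cores 2 and 4 are on node 0:
   ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x14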

There is one more comment below.

> -Original Message-
> From: 王志克 [mailto:wangzh...@jd.com]
> Sent: Wednesday, September 6, 2017 12:50 PM
> To: O Mahony, Billy ; Darrell Ball
> ; ovs-discuss@openvswitch.org; ovs-
> d...@openvswitch.org; Kevin Traynor 
> Subject: RE: [ovs-dev] OVS DPDK NUMA pmd assignment question for
> physical port
> 
> Hi Billy,
> 
> See my reply in line.
> 
> Br,
> Wang Zhike
> 
> -Original Message-
> From: O Mahony, Billy [mailto:billy.o.mah...@intel.com]
> Sent: Wednesday, September 06, 2017 7:26 PM
> To: 王志克; Darrell Ball; ovs-discuss@openvswitch.org; ovs-
> d...@openvswitch.org; Kevin Traynor
> Subject: RE: [ovs-dev] OVS DPDK NUMA pmd assignment question for
> physical port
> 
> Hi Wang,
> 
> You are going to have to take the hit crossing the NUMA boundary at some
> point if your NIC and VM are on different NUMAs.
> 
> So are you saying that it is more expensive to cross the NUMA boundary
> from the pmd to the VM than to cross it from the NIC to the PMD?
> 
> [Wang Zhike] I do not have such data. I hope we can try the new behavior
> and get the test result, and then know whether and how much performance
> can be improved.

[[BO'M]] You don't need a code change to compare the performance of these two 
scenarios. You can simulate it by pinning queues to VMs. I'd imagine crossing 
the NUMA boundary during the PCI DMA would be cheaper than crossing it over 
vhost. But I don't know what the result would be, and it would be a pretty 
interesting figure to have, by the way.
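
(A rough way to set up that comparison - the interface name and core ids here are 
only placeholders - would be to pin one rxq of the phy port to a NUMA-local pmd and 
another to a remote one, then compare the per-pmd cycle counts:

    $ ovs-vsctl set Interface dpdk0 other_config:pmd-rxq-affinity="0:2,1:20"
    $ ovs-appctl dpif-netdev/pmd-stats-clear
    $ ovs-appctl dpif-netdev/pmd-stats-show

The "avg processing cycles per packet" reported for the pmd on core 2 versus the pmd 
on core 20 would then give an indication of the extra cost of the remote path.)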


> 
> If so then in that case you'd like to have two (for example) PMDs polling 2
> queues on the same NIC. With the PMDs on each of the NUMA nodes
> forwarding to the VMs local to that NUMA?
> 
> Of course your NIC would then also need to be able to know which VM (or at
> least which NUMA the VM is on) in order to send the frame to the correct
> rxq.
> 
> [Wang Zhike] Currently I do not know how to achieve it. From my view, the NIC
> does not know which NUMA node should be the destination of the packet. Only
> after OVS handling (e.g. looking up the forwarding rule in OVS) can it know
> the destination. If the NIC does not know the destination NUMA socket, it does
> not matter which PMD polls it.
> 
> 
> /Billy.
> 
> > -Original Message-
> > From: 王志克 [mailto:wangzh...@jd.com]
> > Sent: Wednesday, September 6, 2017 11:41 AM
> > To: O Mahony, Billy ; Darrell Ball
> > ; ovs-discuss@openvswitch.org; ovs-
> > d...@openvswitch.org; Kevin Traynor 
> > Subject: RE: [ovs-dev] OVS DPDK NUMA pmd assignment question for
> > physical port
> >
> > Hi Billy,
> >
> > It depends on the destination of the traffic.
> >
> > I observed that if the traffic destination is across the NUMA socket, the
> > "avg processing cycles per packet" would increase by 60% compared with
> > traffic to the same NUMA socket.
> >
> > Br,
> > Wang Zhike
> >
> > -Original Message-
> > From: O Mahony, Billy [mailto:billy.o.mah...@intel.com]
> > Sent: Wednesday, September 06, 2017 6:35 PM
> > To: 王志克; Darrell Ball; ovs-discuss@openvswitch.org; ovs-
> > d...@openvswitch.org; Kevin Traynor
> > Subject: RE: [ovs-dev] OVS DPDK NUMA pmd assignment question for
> > physical port
> >
> > Hi Wang,
> >
> > If you create several 

Re: [ovs-discuss] [ovs-dev] OVS DPDK NUMA pmd assignment question for physical port

2017-09-06 Thread 王志克
Hi Billy,

See my reply in line.

Br,
Wang Zhike

-Original Message-
From: O Mahony, Billy [mailto:billy.o.mah...@intel.com] 
Sent: Wednesday, September 06, 2017 7:26 PM
To: 王志克; Darrell Ball; ovs-discuss@openvswitch.org; ovs-...@openvswitch.org; 
Kevin Traynor
Subject: RE: [ovs-dev] OVS DPDK NUMA pmd assignment question for physical port

Hi Wang,

You are going to have to take the hit crossing the NUMA boundary at some point 
if your NIC and VM are on different NUMAs.

So are you saying that it is more expensive to cross the NUMA boundary from the 
pmd to the VM than to cross it from the NIC to the PMD?

[Wang Zhike] I do not have such data. I hope we can try the new behavior and 
get the test result, and then know whether and how much performance can be 
improved.

If so then in that case you'd like to have two (for example) PMDs polling 2 
queues on the same NIC. With the PMDs on each of the NUMA nodes forwarding to 
the VMs local to that NUMA?

Of course your NIC would then also need to be able to know which VM (or at least 
which NUMA the VM is on) in order to send the frame to the correct rxq. 

[Wang Zhike] Currently I do not know how to achieve it. From my view, the NIC does 
not know which NUMA node should be the destination of the packet. Only after OVS 
handling (e.g. looking up the forwarding rule in OVS) can it know the 
destination. If the NIC does not know the destination NUMA socket, it does not 
matter which PMD polls it.


/Billy. 

> -Original Message-
> From: 王志克 [mailto:wangzh...@jd.com]
> Sent: Wednesday, September 6, 2017 11:41 AM
> To: O Mahony, Billy ; Darrell Ball
> ; ovs-discuss@openvswitch.org; ovs-
> d...@openvswitch.org; Kevin Traynor 
> Subject: RE: [ovs-dev] OVS DPDK NUMA pmd assignment question for
> physical port
> 
> Hi Billy,
> 
> It depends on the destination of the traffic.
> 
> I observed that if the traffic destination is across the NUMA socket, the "avg
> processing cycles per packet" would increase by 60% compared with traffic to
> the same NUMA socket.
> 
> Br,
> Wang Zhike
> 
> -Original Message-
> From: O Mahony, Billy [mailto:billy.o.mah...@intel.com]
> Sent: Wednesday, September 06, 2017 6:35 PM
> To: 王志克; Darrell Ball; ovs-discuss@openvswitch.org; ovs-
> d...@openvswitch.org; Kevin Traynor
> Subject: RE: [ovs-dev] OVS DPDK NUMA pmd assignment question for
> physical port
> 
> Hi Wang,
> 
> If you create several PMDs on the NUMA of the physical port does that have
> the same performance characteristic?
> 
> /Billy
> 
> 
> 
> > -Original Message-
> > From: 王志克 [mailto:wangzh...@jd.com]
> > Sent: Wednesday, September 6, 2017 10:20 AM
> > To: O Mahony, Billy ; Darrell Ball
> > ; ovs-discuss@openvswitch.org; ovs-
> > d...@openvswitch.org; Kevin Traynor 
> > Subject: RE: [ovs-dev] OVS DPDK NUMA pmd assignment question for
> > physical port
> >
> > Hi Billy,
> >
> > Yes, I want to achieve better performance.
> >
> > The commit "dpif-netdev: Assign ports to pmds on non-local numa node"
> > can NOT meet my needs.
> >
> > I do have pmd on socket 0 to poll the physical NIC which is also on socket 
> > 0.
> > However, this is not enough since I also have other pmd on socket 1. I
> > hope such pmds on socket 1 can together poll physical NIC. In this
> > way, we have more CPU (in my case, double CPU) to poll the NIC, which
> > results in performance improvement.
> >
> > BR,
> > Wang Zhike
> >
> > -Original Message-
> > From: O Mahony, Billy [mailto:billy.o.mah...@intel.com]
> > Sent: Wednesday, September 06, 2017 5:14 PM
> > To: Darrell Ball; 王志克; ovs-discuss@openvswitch.org; ovs-
> > d...@openvswitch.org; Kevin Traynor
> > Subject: RE: [ovs-dev] OVS DPDK NUMA pmd assignment question for
> > physical port
> >
> > Hi Wang,
> >
> > A change was committed to head of master 2017-08-02 "dpif-netdev:
> > Assign ports to pmds on non-local numa node" which if I understand
> > your request correctly will do what you require.
> >
> > However it is not clear to me why you are pinning rxqs to PMDs in the
> > first instance. Currently if you configure at least on pmd on each
> > numa there should always be a PMD available. Is the pinning for
> performance reasons?
> >
> > Regards,
> > Billy
> >
> >
> >
> > > -Original Message-
> > > From: Darrell Ball [mailto:db...@vmware.com]
> > > Sent: Wednesday, September 6, 2017 8:25 AM
> > > To: 王志克 ; ovs-discuss@openvswitch.org; ovs-
> > > d...@openvswitch.org; O Mahony, Billy ;
> > Kevin
> > > Traynor 
> > > Subject: Re: [ovs-dev] OVS DPDK NUMA pmd assignment question for
> > > physical port
> > >
> > > Adding Billy and Kevin
> > >
> > >
> > > On 9/6/17, 12:22 AM, "Darrell Ball"  wrote:
> > >
> > >
> > >
> > > On 9/6/17, 12:03 AM, "王志克"  wrote:
> > >
> > > Hi Darrell,
> > >
> > > 

Re: [ovs-discuss] [ovs-dev] OVS DPDK NUMA pmd assignment question for physical port

2017-09-06 Thread O Mahony, Billy
Hi Wang,

You are going to have to take the hit crossing the NUMA boundary at some point 
if your NIC and VM are on different NUMAs.

So are you saying that it is more expensive to cross the NUMA boundary from the 
pmd to the VM than to cross it from the NIC to the PMD?

If so then in that case you'd like to have two (for example) PMDs polling 2 
queues on the same NIC. With the PMDs on each of the NUMA nodes forwarding to 
the VMs local to that NUMA?

Of course your NIC would then also need to be able to know which VM (or at least 
which NUMA the VM is on) in order to send the frame to the correct rxq. 

/Billy. 

> -Original Message-
> From: 王志克 [mailto:wangzh...@jd.com]
> Sent: Wednesday, September 6, 2017 11:41 AM
> To: O Mahony, Billy ; Darrell Ball
> ; ovs-discuss@openvswitch.org; ovs-
> d...@openvswitch.org; Kevin Traynor 
> Subject: RE: [ovs-dev] OVS DPDK NUMA pmd assignment question for
> physical port
> 
> Hi Billy,
> 
> It depends on the destination of the traffic.
> 
> I observed that if the traffic destination is across the NUMA socket, the "avg
> processing cycles per packet" would increase by 60% compared with traffic to
> the same NUMA socket.
> 
> Br,
> Wang Zhike
> 
> -Original Message-
> From: O Mahony, Billy [mailto:billy.o.mah...@intel.com]
> Sent: Wednesday, September 06, 2017 6:35 PM
> To: 王志克; Darrell Ball; ovs-discuss@openvswitch.org; ovs-
> d...@openvswitch.org; Kevin Traynor
> Subject: RE: [ovs-dev] OVS DPDK NUMA pmd assignment question for
> physical port
> 
> Hi Wang,
> 
> If you create several PMDs on the NUMA of the physical port does that have
> the same performance characteristic?
> 
> /Billy
> 
> 
> 
> > -Original Message-
> > From: 王志克 [mailto:wangzh...@jd.com]
> > Sent: Wednesday, September 6, 2017 10:20 AM
> > To: O Mahony, Billy ; Darrell Ball
> > ; ovs-discuss@openvswitch.org; ovs-
> > d...@openvswitch.org; Kevin Traynor 
> > Subject: RE: [ovs-dev] OVS DPDK NUMA pmd assignment question for
> > physical port
> >
> > Hi Billy,
> >
> > Yes, I want to achieve better performance.
> >
> > The commit "dpif-netdev: Assign ports to pmds on non-local numa node"
> > can NOT meet my needs.
> >
> > I do have pmd on socket 0 to poll the physical NIC which is also on socket 
> > 0.
> > However, this is not enough since I also have other pmd on socket 1. I
> > hope such pmds on socket 1 can together poll physical NIC. In this
> > way, we have more CPU (in my case, double CPU) to poll the NIC, which
> > results in performance improvement.
> >
> > BR,
> > Wang Zhike
> >
> > -Original Message-
> > From: O Mahony, Billy [mailto:billy.o.mah...@intel.com]
> > Sent: Wednesday, September 06, 2017 5:14 PM
> > To: Darrell Ball; 王志克; ovs-discuss@openvswitch.org; ovs-
> > d...@openvswitch.org; Kevin Traynor
> > Subject: RE: [ovs-dev] OVS DPDK NUMA pmd assignment question for
> > physical port
> >
> > Hi Wang,
> >
> > A change was committed to head of master 2017-08-02 "dpif-netdev:
> > Assign ports to pmds on non-local numa node" which if I understand
> > your request correctly will do what you require.
> >
> > However it is not clear to me why you are pinning rxqs to PMDs in the
> > first instance. Currently if you configure at least one pmd on each
> > numa there should always be a PMD available. Is the pinning for
> performance reasons?
> >
> > Regards,
> > Billy
> >
> >
> >
> > > -Original Message-
> > > From: Darrell Ball [mailto:db...@vmware.com]
> > > Sent: Wednesday, September 6, 2017 8:25 AM
> > > To: 王志克 ; ovs-discuss@openvswitch.org; ovs-
> > > d...@openvswitch.org; O Mahony, Billy ;
> > Kevin
> > > Traynor 
> > > Subject: Re: [ovs-dev] OVS DPDK NUMA pmd assignment question for
> > > physical port
> > >
> > > Adding Billy and Kevin
> > >
> > >
> > > On 9/6/17, 12:22 AM, "Darrell Ball"  wrote:
> > >
> > >
> > >
> > > On 9/6/17, 12:03 AM, "王志克"  wrote:
> > >
> > > Hi Darrell,
> > >
> > > pmd-rxq-affinity has below limitation: (so isolated pmd can
> > > not be used for others, which is not my expectation. Lots of VMs
> > > come and go on the fly, and manual assignment is not feasible.)
> > >   >>After that PMD threads on cores where RX queues
> > > was pinned will become isolated. This means that this thread will
> > > poll only pinned RX queues
> > >
> > > My problem is that I have several CPUs spreading on
> > > different NUMA nodes. I hope all these CPU can have chance to serve
> the rxq.
> > > However, because the phy NIC only locates on one certain socket
> > > node, non-same numa pmd/CPU would be excluded. So I am wondering
> > > whether
> > we
> > > can have different behavior for phy port rxq:
> > >   round-robin to all PMDs even the pmd on different NUMA 
> > > socket.
> > >

Re: [ovs-discuss] [ovs-dev] OVS DPDK NUMA pmd assignment question for physical port

2017-09-06 Thread 王志克
Hi Billy,

It depends on the destination of the traffic.

I observed that if the traffic destination is across the NUMA socket, the "avg 
processing cycles per packet" would increase by 60% compared with traffic to the 
same NUMA socket.
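
(For reference, that figure comes from the per-pmd statistics; assuming traffic is 
already running, a typical way to take the measurement is:

    $ ovs-appctl dpif-netdev/pmd-stats-clear
      ... let traffic run for a while ...
    $ ovs-appctl dpif-netdev/pmd-stats-show

and then read the "avg processing cycles per packet" line for each pmd thread.)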

Br,
Wang Zhike

-Original Message-
From: O Mahony, Billy [mailto:billy.o.mah...@intel.com] 
Sent: Wednesday, September 06, 2017 6:35 PM
To: 王志克; Darrell Ball; ovs-discuss@openvswitch.org; ovs-...@openvswitch.org; 
Kevin Traynor
Subject: RE: [ovs-dev] OVS DPDK NUMA pmd assignment question for physical port

Hi Wang,

If you create several PMDs on the NUMA of the physical port does that have the 
same performance characteristic? 

/Billy



> -Original Message-
> From: 王志克 [mailto:wangzh...@jd.com]
> Sent: Wednesday, September 6, 2017 10:20 AM
> To: O Mahony, Billy ; Darrell Ball
> ; ovs-discuss@openvswitch.org; ovs-
> d...@openvswitch.org; Kevin Traynor 
> Subject: RE: [ovs-dev] OVS DPDK NUMA pmd assignment question for
> physical port
> 
> Hi Billy,
> 
> Yes, I want to achieve better performance.
> 
> The commit "dpif-netdev: Assign ports to pmds on non-local numa node" can
> NOT meet my needs.
> 
> I do have pmd on socket 0 to poll the physical NIC which is also on socket 0.
> However, this is not enough since I also have other pmd on socket 1. I hope
> such pmds on socket 1 can together poll physical NIC. In this way, we have
> more CPU (in my case, double CPU) to poll the NIC, which results in
> performance improvement.
> 
> BR,
> Wang Zhike
> 
> -Original Message-
> From: O Mahony, Billy [mailto:billy.o.mah...@intel.com]
> Sent: Wednesday, September 06, 2017 5:14 PM
> To: Darrell Ball; 王志克; ovs-discuss@openvswitch.org; ovs-
> d...@openvswitch.org; Kevin Traynor
> Subject: RE: [ovs-dev] OVS DPDK NUMA pmd assignment question for
> physical port
> 
> Hi Wang,
> 
> A change was committed to head of master 2017-08-02 "dpif-netdev: Assign
> ports to pmds on non-local numa node" which if I understand your request
> correctly will do what you require.
> 
> However it is not clear to me why you are pinning rxqs to PMDs in the first
> instance. Currently if you configure at least one pmd on each numa there
> should always be a PMD available. Is the pinning for performance reasons?
> 
> Regards,
> Billy
> 
> 
> 
> > -Original Message-
> > From: Darrell Ball [mailto:db...@vmware.com]
> > Sent: Wednesday, September 6, 2017 8:25 AM
> > To: 王志克 ; ovs-discuss@openvswitch.org; ovs-
> > d...@openvswitch.org; O Mahony, Billy ;
> Kevin
> > Traynor 
> > Subject: Re: [ovs-dev] OVS DPDK NUMA pmd assignment question for
> > physical port
> >
> > Adding Billy and Kevin
> >
> >
> > On 9/6/17, 12:22 AM, "Darrell Ball"  wrote:
> >
> >
> >
> > On 9/6/17, 12:03 AM, "王志克"  wrote:
> >
> > Hi Darrell,
> >
> > pmd-rxq-affinity has below limitation: (so isolated pmd can
> > not be used for others, which is not my expectation. Lots of VMs come
> > and go on the fly, and manual assignment is not feasible.)
> >   >>After that PMD threads on cores where RX queues
> > was pinned will become isolated. This means that this thread will poll
> > only pinned RX queues
> >
> > My problem is that I have several CPUs spreading on different
> > NUMA nodes. I hope all these CPU can have chance to serve the rxq.
> > However, because the phy NIC only locates on one certain socket node,
> > non-same numa pmd/CPU would be excluded. So I am wondering whether
> we
> > can have different behavior for phy port rxq:
> >   round-robin to all PMDs even the pmd on different NUMA socket.
> >
> > I guess this is a common case, and I believe it would improve
> > rx performance.
> >
> >
> > [Darrell] I agree it would be a common problem and some
> > distribution would seem to make sense, maybe factoring in some
> > favoring of local numa PMDs ?
> > Maybe an optional config to enable ?
> >
> >
> > Br,
> > Wang Zhike
> >
> >

___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] [ovs-dev] OVS DPDK NUMA pmd assignment question for physical port

2017-09-06 Thread Kevin Traynor
On 09/06/2017 08:03 AM, 王志克 wrote:
> Hi Darrell,
> 
> pmd-rxq-affinity has below limitation: (so isolated pmd can not be used for 
> others, which is not my expectation. Lots of VMs come and go on the fly, and 
> manual assignment is not feasible.)
>   >>After that PMD threads on cores where RX queues was pinned will 
> become isolated. This means that this thread will poll only pinned RX queues
> 
> My problem is that I have several CPUs spreading on different NUMA nodes. I 
> hope all these CPU can have chance to serve the rxq. However, because the phy 
> NIC only locates on one certain socket node, non-same numa pmd/CPU would be 
> excluded. So I am wondering whether we can have different behavior for phy 
> port rxq: 
>   round-robin to all PMDs even the pmd on different NUMA socket.
> 
> I guess this is a common case, and I believe it would improve rx performance.
> 

The issue is that cross-numa datapaths incur a large performance penalty
(~2x cycles). This is the reason rxq assignment uses pmds from the same
numa node as the port. Also, any rxqs from other ports that are also
scheduled on the same pmd could suffer as a result of cpu starvation
from that cross-numa assignment.

An issue was that in the case of no pmds available on the correct NUMA
node for a port, it meant that rxqs from that port were not polled at
all. Billy's commit addressed that by allowing cross-numa assignment
*only* in the event of no pmds on the same numa node as the port.
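
(One way to see which pmd each rxq ended up on, and hence whether a cross-numa 
assignment was made, is:

    $ ovs-appctl dpif-netdev/pmd-rxq-show

which lists, per pmd thread and its numa id, the ports and rx queues it is polling.)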

If you look through the threads on Billy's patch you'll see more
discussion on it.

Kevin.


> Br,
> Wang Zhike
> -Original Message-
> From: Darrell Ball [mailto:db...@vmware.com] 
> Sent: Wednesday, September 06, 2017 1:39 PM
> To: 王志克; ovs-discuss@openvswitch.org; ovs-...@openvswitch.org
> Subject: Re: [ovs-dev] OVS DPDK NUMA pmd assignment question for physical port
> 
> You could use  pmd-rxq-affinity for the queues you want serviced locally and 
> let the others go remote
> 
> On 9/5/17, 8:14 PM, "王志克"  wrote:
> 
> It is a bit different from my expectation.
> 
> 
> 
> I have separate CPU and pmd for each NUMA node. However, the physical NIC 
> only locates on NUMA socket0. So only part of CPU and pmd (the ones in same 
> NUMA node) can poll the physical NIC. Since I have multiple rx queue, I hope 
> part queues can be polled with pmd on same node, others can be polled with 
> pmd on non-local numa node. In this way, we have more pmds contributing to the 
> polling of physical NIC, so performance improvement is expected from total rx 
> traffic view.
> 
> 
> 
> Br,
> 
> Wang Zhike
> 
> 
> 
> -Original Message-
> 
> From: Darrell Ball [mailto:db...@vmware.com] 
> 
> Sent: Wednesday, September 06, 2017 10:47 AM
> 
> To: 王志克; ovs-discuss@openvswitch.org; ovs-...@openvswitch.org
> 
> Subject: Re: [ovs-dev] OVS DPDK NUMA pmd assignment question for physical 
> port
> 
> 
> 
> This same numa node limitation was already removed, although same numa is 
> preferred for performance reasons.
> 
> 
> 
> commit c37813fdb030b4270d05ad61943754f67021a50d
> 
> Author: Billy O'Mahony 
> 
> Date:   Tue Aug 1 14:38:43 2017 -0700
> 
> 
> 
> dpif-netdev: Assign ports to pmds on non-local numa node.
> 
> 
> 
> Previously if there is no available (non-isolated) pmd on the numa 
> node
> 
> for a port then the port is not polled at all. This can result in a
> 
> non-operational system until such time as nics are physically
> 
> repositioned. It is preferable to operate with a pmd on the 'wrong' 
> numa
> 
> node albeit with lower performance. Local pmds are still chosen when
> 
> available.
> 
> 
> 
> Signed-off-by: Billy O'Mahony 
> 
> Signed-off-by: Ilya Maximets 
> 
> Co-authored-by: Ilya Maximets 
> 
> 
> 
> 
> 
> The sentence “The rx queues are assigned to pmd threads on the same NUMA 
> node in a round-robin fashion.”
> 
> 
> 
> under
> 
> 
> 
> DPDK Physical Port Rx Queues¶
> 
> 
> 
> should be removed since it is outdated in a couple of ways and there is 
> other correct documentation on the same page
> 
> and also here 
> https://urldefense.proofpoint.com/v2/url?u=http-3A__docs.openvswitch.org_en_latest_howto_dpdk_=DwIGaQ=uilaK90D4TOVoH58JNXRgQ=BVhFA09CGX7JQ5Ih-uZnsw=iNebKvfYjcXbjMsmtLJqThRUImv8W4PRrYWpD-QwUVg=KG3MmQe4QkUkyG3xsCoF6DakFsZh_eg9aEyhYFUKF2c=
>  
> 
> 
> 
> Maybe you could submit a patch ?
> 
> 
> 
> Thanks Darrell
> 
> 
> 
> 
> 
> On 9/5/17, 7:18 PM, "ovs-dev-boun...@openvswitch.org on behalf 

Re: [ovs-discuss] [ovs-dev] OVS DPDK NUMA pmd assignment question for physical port

2017-09-06 Thread O Mahony, Billy
Hi Wang,

A change was committed to head of master 2017-08-02 "dpif-netdev: Assign ports 
to pmds on non-local numa node" which if I understand your request correctly 
will do what you require.

However it is not clear to me why you are pinning rxqs to PMDs in the first 
instance. Currently if you configure at least one pmd on each numa there should 
always be a PMD available. Is the pinning for performance reasons?
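
(For example - the core ids here are only illustrative - a pmd-cpu-mask selecting 
one core on each socket, say core 2 on numa 0 and core 18 on numa 1, would be:

    $ ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x40004

giving one pmd per numa node, so vhost ports on either socket always have a 
numa-local pmd.)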

Regards,
Billy



> -Original Message-
> From: Darrell Ball [mailto:db...@vmware.com]
> Sent: Wednesday, September 6, 2017 8:25 AM
> To: 王志克 ; ovs-discuss@openvswitch.org; ovs-
> d...@openvswitch.org; O Mahony, Billy ; Kevin
> Traynor 
> Subject: Re: [ovs-dev] OVS DPDK NUMA pmd assignment question for
> physical port
> 
> Adding Billy and Kevin
> 
> 
> On 9/6/17, 12:22 AM, "Darrell Ball"  wrote:
> 
> 
> 
> On 9/6/17, 12:03 AM, "王志克"  wrote:
> 
> Hi Darrell,
> 
> pmd-rxq-affinity has below limitation: (so isolated pmd can not be 
> used
> for others, which is not my expectation. Lots of VMs come and go on the fly,
> and manual assignment is not feasible.)
>   >>After that PMD threads on cores where RX queues was pinned
> will become isolated. This means that this thread will poll only pinned RX
> queues
> 
> My problem is that I have several CPUs spreading on different NUMA
> nodes. I hope all these CPU can have chance to serve the rxq. However,
> because the phy NIC only locates on one certain socket node, non-same
> numa pmd/CPU would be excluded. So I am wondering whether we can
> have different behavior for phy port rxq:
>   round-robin to all PMDs even the pmd on different NUMA socket.
> 
> I guess this is a common case, and I believe it would improve rx
> performance.
> 
> 
> [Darrell] I agree it would be a common problem and some distribution
> would seem to make sense, maybe factoring in some favoring of local numa
> PMDs ?
> Maybe an optional config to enable ?
> 
> 
> Br,
> Wang Zhike
> 
> 

___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] [ovs-dev] OVS DPDK NUMA pmd assignment question for physical port

2017-09-06 Thread Darrell Ball
Adding Billy and Kevin


On 9/6/17, 12:22 AM, "Darrell Ball"  wrote:



On 9/6/17, 12:03 AM, "王志克"  wrote:

Hi Darrell,

pmd-rxq-affinity has below limitation: (so isolated pmd can not be used 
for others, which is not my expectation. Lots of VMs come and go on the fly, 
and manual assignment is not feasible.)
  >>After that PMD threads on cores where RX queues was pinned 
will become isolated. This means that this thread will poll only pinned RX 
queues

My problem is that I have several CPUs spreading on different NUMA 
nodes. I hope all these CPU can have chance to serve the rxq. However, because 
the phy NIC only locates on one certain socket node, non-same numa pmd/CPU 
would be excluded. So I am wondering whether we can have different behavior for 
phy port rxq: 
  round-robin to all PMDs even the pmd on different NUMA socket.

I guess this is a common case, and I believe it would improve rx 
performance.


[Darrell] I agree it would be a common problem and some distribution would 
seem to make sense, maybe factoring in some favoring of local numa PMDs ?
Maybe an optional config to enable ?
  

Br,
Wang Zhike



___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] [ovs-dev] OVS DPDK NUMA pmd assignment question for physical port

2017-09-06 Thread Darrell Ball


On 9/6/17, 12:03 AM, "王志克"  wrote:

Hi Darrell,

pmd-rxq-affinity has below limitation: (so isolated pmd can not be used for 
others, which is not my expectation. Lots of VMs come and go on the fly, and 
manual assignment is not feasible.)
  >>After that PMD threads on cores where RX queues was pinned will 
become isolated. This means that this thread will poll only pinned RX queues

My problem is that I have several CPUs spreading on different NUMA nodes. I 
hope all these CPU can have chance to serve the rxq. However, because the phy 
NIC only locates on one certain socket node, non-same numa pmd/CPU would be 
excluded. So I am wondering whether we can have different behavior for phy port 
rxq: 
  round-robin to all PMDs even the pmd on different NUMA socket.

I guess this is a common case, and I believe it would improve rx 
performance.


[Darrell] I agree it would be a common problem and some distribution would seem 
to make sense, maybe factoring in some favoring of local numa PMDs ?
Maybe an optional config to enable ?
  

Br,
Wang Zhike

___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] [ovs-dev] OVS DPDK NUMA pmd assignment question for physical port

2017-09-06 Thread 王志克
Hi Darrell,

pmd-rxq-affinity has the limitation below (so an isolated pmd can not be used for 
others, which is not what I want; lots of VMs come and go on the fly, and 
manual assignment is not feasible):
  >>After that PMD threads on cores where RX queues was pinned will 
become isolated. This means that this thread will poll only pinned RX queues

My problem is that I have several CPUs spread across different NUMA nodes. I 
hope all of these CPUs can have a chance to serve the rxqs. However, because the phy 
NIC is located on one particular socket, pmds/CPUs on the other NUMA node are 
excluded. So I am wondering whether we can have different behavior for phy 
port rxqs: 
  round-robin to all PMDs, even the pmds on a different NUMA socket.

I guess this is a common case, and I believe it would improve rx performance.
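
(For illustration, with a hypothetical port name, multiple rx queues on the phy port 
are configured as:

    $ ovs-vsctl set Interface dpdk0 options:n_rxq=4

and the idea is that those queues could then be distributed round-robin over all 
pmds rather than only over the numa-local ones.)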

Br,
Wang Zhike
-Original Message-
From: Darrell Ball [mailto:db...@vmware.com] 
Sent: Wednesday, September 06, 2017 1:39 PM
To: 王志克; ovs-discuss@openvswitch.org; ovs-...@openvswitch.org
Subject: Re: [ovs-dev] OVS DPDK NUMA pmd assignment question for physical port

You could use  pmd-rxq-affinity for the queues you want serviced locally and 
let the others go remote

On 9/5/17, 8:14 PM, "王志克"  wrote:

It is a bit different from my expectation.



I have separate CPU and pmd for each NUMA node. However, the physical NIC 
only locates on NUMA socket0. So only part of CPU and pmd (the ones in same 
NUMA node) can poll the physical NIC. Since I have multiple rx queue, I hope 
part queues can be polled with pmd on same node, others can be polled with pmd 
on non-local numa node. In this way, we have more pmds contributing to the polling 
of physical NIC, so performance improvement is expected from total rx traffic 
view.



Br,

Wang Zhike



-Original Message-

From: Darrell Ball [mailto:db...@vmware.com] 

Sent: Wednesday, September 06, 2017 10:47 AM

To: 王志克; ovs-discuss@openvswitch.org; ovs-...@openvswitch.org

Subject: Re: [ovs-dev] OVS DPDK NUMA pmd assignment question for physical 
port



This same numa node limitation was already removed, although same numa is 
preferred for performance reasons.



commit c37813fdb030b4270d05ad61943754f67021a50d

Author: Billy O'Mahony 

Date:   Tue Aug 1 14:38:43 2017 -0700



dpif-netdev: Assign ports to pmds on non-local numa node.



Previously if there is no available (non-isolated) pmd on the numa node

for a port then the port is not polled at all. This can result in a

non-operational system until such time as nics are physically

repositioned. It is preferable to operate with a pmd on the 'wrong' numa

node albeit with lower performance. Local pmds are still chosen when

available.



Signed-off-by: Billy O'Mahony 

Signed-off-by: Ilya Maximets 

Co-authored-by: Ilya Maximets 





The sentence “The rx queues are assigned to pmd threads on the same NUMA 
node in a round-robin fashion.”



under



DPDK Physical Port Rx Queues¶



should be removed since it is outdated in a couple of ways and there is 
other correct documentation on the same page

and also here 
https://urldefense.proofpoint.com/v2/url?u=http-3A__docs.openvswitch.org_en_latest_howto_dpdk_=DwIGaQ=uilaK90D4TOVoH58JNXRgQ=BVhFA09CGX7JQ5Ih-uZnsw=iNebKvfYjcXbjMsmtLJqThRUImv8W4PRrYWpD-QwUVg=KG3MmQe4QkUkyG3xsCoF6DakFsZh_eg9aEyhYFUKF2c=
 



Maybe you could submit a patch ?



Thanks Darrell





On 9/5/17, 7:18 PM, "ovs-dev-boun...@openvswitch.org on behalf of 王志克" 
 wrote:



Hi All,







I read below doc about pmd assignment for physical port. I think the 
limitation “on the same NUMA node” may not be efficient.








https://urldefense.proofpoint.com/v2/url?u=http-3A__docs.openvswitch.org_en_latest_intro_install_dpdk_=DwIGaQ=uilaK90D4TOVoH58JNXRgQ=BVhFA09CGX7JQ5Ih-uZnsw=pqvCrQwfrcDxvwcpuouzVymiBkev1vHpnOlef-ZMev8=4wch_Q6fqo0stIDE4K2loh0z-dshuligqsrAV_h-QuU=
 



DPDK Physical Port Rx 
Queues¶







$ 

Re: [ovs-discuss] [ovs-dev] OVS DPDK NUMA pmd assignment question for physical port

2017-09-05 Thread Darrell Ball
You could use  pmd-rxq-affinity for the queues you want serviced locally and 
let the others go remote

On 9/5/17, 8:14 PM, "王志克"  wrote:

It is a bit different from my expectation.



I have separate CPU and pmd for each NUMA node. However, the physical NIC 
only locates on NUMA socket0. So only part of CPU and pmd (the ones in same 
NUMA node) can poll the physical NIC. Since I have multiple rx queue, I hope 
part queues can be polled with pmd on same node, others can be polled with pmd 
on non-local numa node. In this way, we have more pmds contributing to the polling 
of physical NIC, so performance improvement is expected from total rx traffic 
view.



Br,

Wang Zhike



-Original Message-

From: Darrell Ball [mailto:db...@vmware.com] 

Sent: Wednesday, September 06, 2017 10:47 AM

To: 王志克; ovs-discuss@openvswitch.org; ovs-...@openvswitch.org

Subject: Re: [ovs-dev] OVS DPDK NUMA pmd assignment question for physical 
port



This same numa node limitation was already removed, although same numa is 
preferred for performance reasons.



commit c37813fdb030b4270d05ad61943754f67021a50d

Author: Billy O'Mahony 

Date:   Tue Aug 1 14:38:43 2017 -0700



dpif-netdev: Assign ports to pmds on non-local numa node.



Previously if there is no available (non-isolated) pmd on the numa node

for a port then the port is not polled at all. This can result in a

non-operational system until such time as nics are physically

repositioned. It is preferable to operate with a pmd on the 'wrong' numa

node albeit with lower performance. Local pmds are still chosen when

available.



Signed-off-by: Billy O'Mahony 

Signed-off-by: Ilya Maximets 

Co-authored-by: Ilya Maximets 





The sentence “The rx queues are assigned to pmd threads on the same NUMA 
node in a round-robin fashion.”



under



DPDK Physical Port Rx Queues¶



should be removed since it is outdated in a couple of ways and there is 
other correct documentation on the same page

and also here 
https://urldefense.proofpoint.com/v2/url?u=http-3A__docs.openvswitch.org_en_latest_howto_dpdk_=DwIGaQ=uilaK90D4TOVoH58JNXRgQ=BVhFA09CGX7JQ5Ih-uZnsw=iNebKvfYjcXbjMsmtLJqThRUImv8W4PRrYWpD-QwUVg=KG3MmQe4QkUkyG3xsCoF6DakFsZh_eg9aEyhYFUKF2c=
 



Maybe you could submit a patch ?



Thanks Darrell





On 9/5/17, 7:18 PM, "ovs-dev-boun...@openvswitch.org on behalf of 王志克" 
 wrote:



Hi All,







I read below doc about pmd assignment for physical port. I think the 
limitation “on the same NUMA node” may not be efficient.








https://urldefense.proofpoint.com/v2/url?u=http-3A__docs.openvswitch.org_en_latest_intro_install_dpdk_=DwIGaQ=uilaK90D4TOVoH58JNXRgQ=BVhFA09CGX7JQ5Ih-uZnsw=pqvCrQwfrcDxvwcpuouzVymiBkev1vHpnOlef-ZMev8=4wch_Q6fqo0stIDE4K2loh0z-dshuligqsrAV_h-QuU=
 



DPDK Physical Port Rx 
Queues¶







$ ovs-vsctl set Interface  options:n_rxq=







The above command sets the number of rx queues for DPDK physical 
interface. The rx queues are assigned to pmd threads on the same NUMA node in a 
round-robin fashion.



Consider below case:







One host has one PCI NIC on NUMA node 0, and has 4 VMs, which spread in 
NUMA node 0 and 1. There are multiple rx queues configured on the physical NIC. 
We configured 4 pmd (two cpu from NUMA node0, and two cpu from node 1). Since 
the physical NIC locates on NUMA node0, only pmds on same NUMA node can poll 
its rxq. As a result, only two cpu can be used for polling physical NIC.







If we compare the OVS kernel mode, there is no such limitation.







So question:



should we remove the “same NUMA node” limitation for physical port rx 
queues? Or do we have other options to improve the performance for this case?


Re: [ovs-discuss] [ovs-dev] OVS DPDK NUMA pmd assignment question for physical port

2017-09-05 Thread 王志克
It is a bit different from my expectation.

I have separate CPUs and pmds for each NUMA node. However, the physical NIC is only 
located on NUMA socket0, so only part of the CPUs and pmds (the ones on the same NUMA 
node) can poll the physical NIC. Since I have multiple rx queues, I hope some 
queues can be polled by pmds on the same node and others by pmds on the 
non-local numa node. In this way, more pmds contribute to the polling of the 
physical NIC, so a performance improvement is expected from the total rx traffic view.

Br,
Wang Zhike

-Original Message-
From: Darrell Ball [mailto:db...@vmware.com] 
Sent: Wednesday, September 06, 2017 10:47 AM
To: 王志克; ovs-discuss@openvswitch.org; ovs-...@openvswitch.org
Subject: Re: [ovs-dev] OVS DPDK NUMA pmd assignment question for physical port

This same numa node limitation was already removed, although same numa is 
preferred for performance reasons.

commit c37813fdb030b4270d05ad61943754f67021a50d
Author: Billy O'Mahony 
Date:   Tue Aug 1 14:38:43 2017 -0700

dpif-netdev: Assign ports to pmds on non-local numa node.

Previously if there is no available (non-isolated) pmd on the numa node
for a port then the port is not polled at all. This can result in a
non-operational system until such time as nics are physically
repositioned. It is preferable to operate with a pmd on the 'wrong' numa
node albeit with lower performance. Local pmds are still chosen when
available.

Signed-off-by: Billy O'Mahony 
Signed-off-by: Ilya Maximets 
Co-authored-by: Ilya Maximets 


The sentence “The rx queues are assigned to pmd threads on the same NUMA node 
in a round-robin fashion.”

under

DPDK Physical Port Rx Queues¶

should be removed since it is outdated in a couple of ways and there is other 
correct documentation on the same page
and also here http://docs.openvswitch.org/en/latest/howto/dpdk/

Maybe you could submit a patch ?

Thanks Darrell


On 9/5/17, 7:18 PM, "ovs-dev-boun...@openvswitch.org on behalf of 王志克" 
 wrote:

Hi All,



I read below doc about pmd assignment for physical port. I think the 
limitation “on the same NUMA node” may not be efficient.




https://urldefense.proofpoint.com/v2/url?u=http-3A__docs.openvswitch.org_en_latest_intro_install_dpdk_=DwIGaQ=uilaK90D4TOVoH58JNXRgQ=BVhFA09CGX7JQ5Ih-uZnsw=pqvCrQwfrcDxvwcpuouzVymiBkev1vHpnOlef-ZMev8=4wch_Q6fqo0stIDE4K2loh0z-dshuligqsrAV_h-QuU=
 

DPDK Physical Port Rx 
Queues¶



$ ovs-vsctl set Interface  options:n_rxq=



The above command sets the number of rx queues for DPDK physical interface. 
The rx queues are assigned to pmd threads on the same NUMA node in a 
round-robin fashion.

Consider below case:



One host has one PCI NIC on NUMA node 0, and has 4 VMs, which spread in 
NUMA node 0 and 1. There are multiple rx queues configured on the physical NIC. 
We configured 4 pmd (two cpu from NUMA node0, and two cpu from node 1). Since 
the physical NIC locates on NUMA node0, only pmds on same NUMA node can poll 
its rxq. As a result, only two cpu can be used for polling physical NIC.



If we compare the OVS kernel mode, there is no such limitation.



So question:

should we remove the “same NUMA node” limitation for physical port rx queues? 
Or do we have other options to improve the performance for this case?



Br,

Wang Zhike



___
dev mailing list
d...@openvswitch.org

https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_mailman_listinfo_ovs-2Ddev=DwIGaQ=uilaK90D4TOVoH58JNXRgQ=BVhFA09CGX7JQ5Ih-uZnsw=pqvCrQwfrcDxvwcpuouzVymiBkev1vHpnOlef-ZMev8=Whz73vLTYWkBuEL6reD88bkzCgSfqpgb7MDiCG5fB4A=
 


___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] [ovs-dev] OVS DPDK NUMA pmd assignment question for physical port

2017-09-05 Thread Darrell Ball
This same numa node limitation was already removed, although same numa is 
preferred for performance reasons.

commit c37813fdb030b4270d05ad61943754f67021a50d
Author: Billy O'Mahony 
Date:   Tue Aug 1 14:38:43 2017 -0700

dpif-netdev: Assign ports to pmds on non-local numa node.

Previously if there is no available (non-isolated) pmd on the numa node
for a port then the port is not polled at all. This can result in a
non-operational system until such time as nics are physically
repositioned. It is preferable to operate with a pmd on the 'wrong' numa
node albeit with lower performance. Local pmds are still chosen when
available.

Signed-off-by: Billy O'Mahony 
Signed-off-by: Ilya Maximets 
Co-authored-by: Ilya Maximets 


The sentence “The rx queues are assigned to pmd threads on the same NUMA node 
in a round-robin fashion.”

under

DPDK Physical Port Rx Queues¶

should be removed since it is outdated in a couple of ways and there is other 
correct documentation on the same page
and also here http://docs.openvswitch.org/en/latest/howto/dpdk/

Maybe you could submit a patch ?

Thanks Darrell


On 9/5/17, 7:18 PM, "ovs-dev-boun...@openvswitch.org on behalf of 王志克" 
 wrote:

Hi All,



I read below doc about pmd assignment for physical port. I think the 
limitation “on the same NUMA node” may not be efficient.




https://urldefense.proofpoint.com/v2/url?u=http-3A__docs.openvswitch.org_en_latest_intro_install_dpdk_=DwIGaQ=uilaK90D4TOVoH58JNXRgQ=BVhFA09CGX7JQ5Ih-uZnsw=pqvCrQwfrcDxvwcpuouzVymiBkev1vHpnOlef-ZMev8=4wch_Q6fqo0stIDE4K2loh0z-dshuligqsrAV_h-QuU=
 

DPDK Physical Port Rx 
Queues¶



$ ovs-vsctl set Interface  options:n_rxq=



The above command sets the number of rx queues for DPDK physical interface. 
The rx queues are assigned to pmd threads on the same NUMA node in a 
round-robin fashion.

Consider below case:



One host has one PCI NIC on NUMA node 0, and has 4 VMs, which spread in 
NUMA node 0 and 1. There are multiple rx queues configured on the physical NIC. 
We configured 4 pmd (two cpu from NUMA node0, and two cpu from node 1). Since 
the physical NIC locates on NUMA node0, only pmds on same NUMA node can poll 
its rxq. As a result, only two cpu can be used for polling physical NIC.



If we compare the OVS kernel mode, there is no such limitation.



So question:

should we remove the “same NUMA node” limitation for physical port rx queues? 
Or do we have other options to improve the performance for this case?



Br,

Wang Zhike



___
dev mailing list
d...@openvswitch.org

https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_mailman_listinfo_ovs-2Ddev=DwIGaQ=uilaK90D4TOVoH58JNXRgQ=BVhFA09CGX7JQ5Ih-uZnsw=pqvCrQwfrcDxvwcpuouzVymiBkev1vHpnOlef-ZMev8=Whz73vLTYWkBuEL6reD88bkzCgSfqpgb7MDiCG5fB4A=
 


___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss