[dpdk-dev] Order of system brought up affects throughput with qos_sched app

2015-09-10 Thread Dumitrescu, Cristian
Hi Wei,

You simply need to do the math and create a model for each of your token 
buckets, considering parameters like: size of the bucket, initial number of 
credits in the bucket, credit update rate for the bucket, rate of input packets 
(in bytes per second) hitting that bucket and consuming credits, etc.
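
For illustration only (this sketch is not from the thread or from rte_sched itself, and the numbers are placeholders), a minimal single-bucket model of that math in C could look like this:

#include <stdio.h>

/* Minimal token-bucket model (illustration only, not the rte_sched code).
 * Rates are in bytes per second, sizes in bytes, time in seconds. */
struct tb_model {
	double size;    /* bucket capacity ("tb size") */
	double credits; /* current credits (initial fill) */
	double rate;    /* credit update rate ("tb rate") */
};

/* Advance the bucket by dt seconds under a given offered load.
 * Returns the number of bytes that pass in this interval. */
static double tb_step(struct tb_model *tb, double input_rate, double dt)
{
	double demand, passed;

	tb->credits += tb->rate * dt;      /* replenish */
	if (tb->credits > tb->size)
		tb->credits = tb->size;    /* cap at bucket size */

	demand = input_rate * dt;          /* bytes that want to pass */
	passed = demand < tb->credits ? demand : tb->credits;
	tb->credits -= passed;             /* consume */
	return passed;
}

int main(void)
{
	/* Placeholder numbers: a 1 MB bucket that starts full, replenished at
	 * 125000 B/s, hit by 1.25e9 B/s of traffic, stepped in 1 ms slices. */
	struct tb_model tb = { .size = 1e6, .credits = 1e6, .rate = 125000.0 };
	int i;

	for (i = 0; i < 10; i++) {
		double passed = tb_step(&tb, 1.25e9, 0.001);
		printf("t=%2d ms passed=%.0f B credits=%.0f B\n",
		       i + 1, passed, tb.credits);
	}
	return 0;
}

Feeding your actual per-subport and per-pipe parameters into a model like this shows how long any start-up surplus of credits lasts before the configured rates take over.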

Regards,
Cristian

From: Wei Shen [mailto:wshen0...@outlook.com]
Sent: Thursday, September 10, 2015 1:47 AM
To: Dumitrescu, Cristian; dev at dpdk.org
Subject: RE: [dpdk-dev] Order of system brought up affects throughput with 
qos_sched app

Hi Cristian,

Thanks for your quick response. I did a quick test of your hypothesis and it 
sort of came out as you mentioned. That is, it went back to ~4Gbps after around 
ten minutes with the previous profile I posted.

In another test, I set the pipe token rate to ~20Mbps instead of full line rate 
each. Although I did run into the same order issue, I haven't noticed any slow 
down yet by the time of this email (it's been up for an hour or so).

I am sorry but I still don't get why. Do you mean the ~10Gbps throughput seen in 
order #2 is made possible by the initial accumulation of credits, and later, when 
the app runs long enough, the old credits run out and throughput gets capped by 
the credit rate? But in the profile I set everything to line rate, so I don't 
think this is the bottleneck.

Could you please illustrate it further? Appreciate it. Thank you.

> From: cristian.dumitrescu at intel.com
> To: wshen0123 at outlook.com; dev at dpdk.org
> Subject: RE: [dpdk-dev] Order of system brought up affects throughput with 
> qos_sched app
> Date: Wed, 9 Sep 2015 19:54:12 +
>
> Hi Wei,
>
> Here is another hypothesis for you to consider: if the size of your token 
> buckets (used to store subport and pipe credits) is big (and it actually is 
> set big in the default config file of the app), then when no packets are 
> received for a long while (which is the case when you start the app first and 
> the traffic gen later), the token buckets are continuously replenished (with 
> nothing consumed) until they become full; when packets start to arrive, the 
> token buckets are full and it can take a long time (might be minutes or even 
> hours, depending on how big your buckets are) until they come down to their 
> normal values (this time can actually be computed/estimated).
>
> If this is what happens in your case, lowering the size of your buckets will 
> help.
>
> Regards,
> Cristian
>
> > -----Original Message-----
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Wei Shen
> > Sent: Wednesday, September 9, 2015 9:39 PM
> > To: dev at dpdk.org
> > Subject: [dpdk-dev] Order of system brought up affects throughput with
> > qos_sched app
> >
> > Hi all,
> > I ran into problems with qos_sched depending on the order in which the system
> > is brought up. I can bring up the system in two ways:
> > 1. Start the traffic gen first, then start qos_sched.
> > 2. Start qos_sched first, then start the traffic gen.
> > With 256K pipes, 64 queue size, and 128B packet size, I got ~4Gbps with order
> > #1, while I got 10Gbps with order #2.
> > qos_sched command stats showed that ~59% packets got dropped in RX
> > (rte_ring_enqueue).
> > Plus, with #1, if I restart the traffic gen later, I would regain 10Gbps
> > throughput, which suggests that this is not an initialization issue but 
> > runtime
> > behavior.
> > I also tried to assign qos_sched on different cores and got the same result.
> > I suspect that there is an rte_ring bug when connecting two cores, where one
> > core starts enqueuing before the other core is ready to dequeue.
> > Have you experienced the same issue? Appreciate your help.
> >
> > Wei Shen.
> >
> > My system spec is:
> > Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz
> > 15 * 1G hugepages
> > qos_sched argument: ./build/app/qos_sched -c 1c0002 -n 4 -- --pfc "0,1,20,18,19" --cfg profile.cfg
> > profile.cfg:
> > [port]
> > frame overhead = 20
> > number of subports per port = 1
> > number of pipes per subport = 262144
> > queue sizes = 64 64 64 64
> >
> > ; Subport configuration
> > [subport 0]
> > tb rate = 125000 ; Bytes per second
> > tb size = 100 ; Bytes
> > tc 0 rate = 125000 ; Bytes per second
> > tc 1 rate = 125000 ; Bytes per second
> > tc 2 rate = 125000 ; Bytes per second
> > tc 3 rate = 125000 ; Bytes per second
> > tc period = 10 ; Milliseconds
> > pipe 0-262143 = 0 ; These pipes are configured with pipe profile 0

[dpdk-dev] Order of system brought up affects throughput with qos_sched app

2015-09-09 Thread Wei Shen
Hi Cristian,
Thanks for your quick response. I did a quick test of your hypothesis and it 
sort of came out as you mentioned. That is, it went back to ~4Gbps after around 
ten minutes with the previous profile I posted.
In another test, I set the pipe token rate to ~20Mbps instead of full line rate 
each. Although I did run into the same order issue, I haven't noticed any slow 
down yet by the time of this email (it's been up for an hour or so).
I am sorry but I still don't get why. Do you mean the ~10Gbps throughput seen in 
order #2 is made possible by the initial accumulation of credits, and later, when 
the app runs long enough, the old credits run out and throughput gets capped by 
the credit rate? But in the profile I set everything to line rate, so I don't 
think this is the bottleneck.
Could you please illustrate it further? Appreciate it. Thank you.
> From: cristian.dumitrescu at intel.com
> To: wshen0123 at outlook.com; dev at dpdk.org
> Subject: RE: [dpdk-dev] Order of system brought up affects throughput with
> qos_sched app
> Date: Wed, 9 Sep 2015 19:54:12 +
> 
> Hi Wei,
> 
> Here is another hypothesis for you to consider: if the size of your token 
> buckets (used to store subport and pipe credits) is big (and it actually is 
> set big in the default config file of the app), then when no packets are 
> received for a long while (which is the case when you start the app first and 
> the traffic gen later), the token buckets are continuously replenished (with 
> nothing consumed) until they become full; when packets start to arrive, the 
> token buckets are full and it can take a long time (might be minutes or even 
> hours, depending on how big your buckets are) until they come down to their 
> normal values (this time can actually be computed/estimated).
> 
> If this is what happens in your case, lowering the size of your buckets will 
> help.
> 
> Regards,
> Cristian
> 
> > -----Original Message-----
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Wei Shen
> > Sent: Wednesday, September 9, 2015 9:39 PM
> > To: dev at dpdk.org
> > Subject: [dpdk-dev] Order of system brought up affects throughput with
> > qos_sched app
> > 
> > Hi all,
> > I ran into problems with qos_sched depending on the order in which the system
> > is brought up. I can bring up the system in two ways:
> > 1. Start the traffic gen first, then start qos_sched.
> > 2. Start qos_sched first, then start the traffic gen.
> > With 256K pipes, 64 queue size, and 128B packet size, I got ~4Gbps with order
> > #1, while I got 10Gbps with order #2.
> > qos_sched command stats showed that ~59% packets got dropped in RX
> > (rte_ring_enqueue).
> > Plus, with #1, if I restart the traffic gen later, I would regain 10Gbps
> > throughput, which suggests that this is not an initialization issue but 
> > runtime
> > behavior.
> > I also tried to assign qos_sched on different cores and got the same result.
> > I suspect that there is an rte_ring bug when connecting two cores, where one
> > core starts enqueuing before the other core is ready to dequeue.
> > Have you experienced the same issue? Appreciate your help.
> > 
> > Wei Shen.
> >
> > My system spec is:
> > Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz
> > 15 * 1G hugepages
> > qos_sched argument: ./build/app/qos_sched -c 1c0002 -n 4 -- --pfc "0,1,20,18,19" --cfg profile.cfg
> > profile.cfg:
> > [port]
> > frame overhead = 20
> > number of subports per port = 1
> > number of pipes per subport = 262144
> > queue sizes = 64 64 64 64
> >
> > ; Subport configuration
> > [subport 0]
> > tb rate = 125000 ; Bytes per second
> > tb size = 100 ; Bytes
> > tc 0 rate = 125000 ; Bytes per second
> > tc 1 rate = 125000 ; Bytes per second
> > tc 2 rate = 125000 ; Bytes per second
> > tc 3 rate = 125000 ; Bytes per second
> > tc period = 10 ; Milliseconds
> > pipe 0-262143 = 0 ; These pipes are configured with pipe profile 0
> >
> > ; Pipe configuration
> > [pipe profile 0]
> > tb rate = 125000 ; Bytes per second
> > tb size = 100 ; Bytes
> > tc 0 rate = 125000 ; Bytes per second
> > tc 1 rate = 125000 ; Bytes per second
> > tc 2 rate = 125000 ; Bytes per second
> > tc 3 rate = 125000 ; Bytes per second
> > tc period = 10 ; Milliseconds
> > tc 3 oversubscription weight = 1
> > tc 0 wrr weights = 1 1 1 1
> > tc 1 wrr weights = 1 1 1 1
> > tc 2 wrr weights = 1 1 1 1
> > tc 3 wrr weights = 1 1 1 1



[dpdk-dev] Order of system brought up affects throughput with qos_sched app

2015-09-09 Thread Dumitrescu, Cristian
Hi Wei,

Here is another hypothesis for you to consider: if the size of your token 
buckets (used to store subport and pipe credits) is big (and it actually is set 
big in the default config file of the app), then when no packets are received 
for a long while (which is the case when you start the app first and the 
traffic gen later), the token buckets are continuously replenished (with 
nothing consumed) until they become full; when packets start to arrive, the 
token buckets are full and it can take a long time (might be minutes or even 
hours, depending on how big your buckets are) until they come down to their 
normal values (this time can actually be computed/estimated).
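
For a rough sense of that estimate (an illustration, not code from this thread; the numbers are placeholders): while an over-full bucket is draining, credits leave at the input rate and arrive at the tb rate, so the time back to steady state is roughly tb_size / (input_rate - tb_rate) whenever the offered load exceeds the token rate.

#include <stdio.h>

/* Rough estimate of how long a full token bucket needs to drain back to its
 * steady-state level (illustration only; the values below are placeholders).
 * Returns seconds, or a negative value if the offered load never exceeds the
 * token rate (in which case the bucket simply stays full). */
static double tb_drain_time(double tb_size_bytes, double tb_rate_Bps,
			    double input_rate_Bps)
{
	double net_drain = input_rate_Bps - tb_rate_Bps; /* net bytes/s leaving */

	if (net_drain <= 0.0)
		return -1.0;
	return tb_size_bytes / net_drain;
}

int main(void)
{
	/* Placeholder numbers: a 100 MB bucket replenished at 125000 B/s and
	 * hit by 150000 B/s of traffic takes about an hour to drain. */
	double t = tb_drain_time(1e8, 125000.0, 150000.0);

	printf("estimated drain time: %.0f s (%.1f min)\n", t, t / 60.0);
	return 0;
}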

If this is what happens in your case, lowering the size of your buckets will 
help.
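
One simple way to judge whether a bucket is "big" in this sense (again just a sketch with assumed placeholder values, not a figure from this thread) is to express the bucket size as seconds of credits at the token rate; that is also how long an idle period needs to be to fill it, and an upper bound on how long the surplus can mask the configured rate.

#include <stdio.h>

/* Seconds of credits a token bucket holds at its configured rate
 * (illustration only; the values below are placeholders). */
static double tb_seconds_of_credits(double tb_size_bytes, double tb_rate_Bps)
{
	return tb_size_bytes / tb_rate_Bps;
}

int main(void)
{
	/* Placeholder comparison at a 125000 B/s token rate. */
	printf("1 MB bucket:   %.0f s of credits\n",
	       tb_seconds_of_credits(1e6, 125000.0));
	printf("100 MB bucket: %.0f s of credits\n",
	       tb_seconds_of_credits(1e8, 125000.0));
	return 0;
}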

Regards,
Cristian

> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Wei Shen
> Sent: Wednesday, September 9, 2015 9:39 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] Order of system brought up affects throughput with
> qos_sched app
> 
> Hi all,
> I ran into problems with qos_sched depending on the order in which the system
> is brought up. I can bring up the system in two ways:
> 1. Start the traffic gen first, then start qos_sched.
> 2. Start qos_sched first, then start the traffic gen.
> With 256K pipes, 64 queue size, and 128B packet size, I got ~4Gbps with order
> #1, while I got 10Gbps with order #2.
> qos_sched command stats showed that ~59% packets got dropped in RX
> (rte_ring_enqueue).
> Plus, with #1, if I restart the traffic gen later, I would regain 10Gbps
> throughput, which suggests that this is not an initialization issue but 
> runtime
> behavior.
> I also tried to assign qos_sched on different cores and got the same result.
> I suspect that there is an rte_ring bug when connecting two cores, where one
> core starts enqueuing before the other core is ready to dequeue.
> Have you experienced the same issue? Appreciate your help.
> 
> Wei Shen.
>
> My system spec is:
> Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz
> 15 * 1G hugepages
> qos_sched argument: ./build/app/qos_sched -c 1c0002 -n 4 -- --pfc "0,1,20,18,19" --cfg profile.cfg
> profile.cfg:
> [port]
> frame overhead = 20
> number of subports per port = 1
> number of pipes per subport = 262144
> queue sizes = 64 64 64 64
>
> ; Subport configuration
> [subport 0]
> tb rate = 125000 ; Bytes per second
> tb size = 100 ; Bytes
> tc 0 rate = 125000 ; Bytes per second
> tc 1 rate = 125000 ; Bytes per second
> tc 2 rate = 125000 ; Bytes per second
> tc 3 rate = 125000 ; Bytes per second
> tc period = 10 ; Milliseconds
> pipe 0-262143 = 0 ; These pipes are configured with pipe profile 0
>
> ; Pipe configuration
> [pipe profile 0]
> tb rate = 125000 ; Bytes per second
> tb size = 100 ; Bytes
> tc 0 rate = 125000 ; Bytes per second
> tc 1 rate = 125000 ; Bytes per second
> tc 2 rate = 125000 ; Bytes per second
> tc 3 rate = 125000 ; Bytes per second
> tc period = 10 ; Milliseconds
> tc 3 oversubscription weight = 1
> tc 0 wrr weights = 1 1 1 1
> tc 1 wrr weights = 1 1 1 1
> tc 2 wrr weights = 1 1 1 1
> tc 3 wrr weights = 1 1 1 1


[dpdk-dev] Order of system brought up affects throughput with qos_sched app

2015-09-09 Thread Wei Shen
Hi all,
I ran into problems with qos_sched depending on the order in which the system is 
brought up. I can bring up the system in two ways:
1. Start the traffic gen first, then start qos_sched.
2. Start qos_sched first, then start the traffic gen.
With 256K pipes, 64 queue size, and 128B packet size, I got ~4Gbps with order #1, 
while I got 10Gbps with order #2.
qos_sched command stats showed that ~59% packets got dropped in RX 
(rte_ring_enqueue).
Plus, with #1, if I restart the traffic gen later, I would regain 10Gbps 
throughput, which suggests that this is not an initialization issue but runtime 
behavior.
I also tried to assign qos_sched on different cores and got the same result.
I suspect that there is an rte_ring bug when connecting two cores, where one core 
starts enqueuing before the other core is ready to dequeue.
Have you experienced the same issue? Appreciate your help.

Wei Shen.

My system spec is:
Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz
15 * 1G hugepages
qos_sched argument: ./build/app/qos_sched -c 1c0002 -n 4 -- --pfc "0,1,20,18,19" --cfg profile.cfg
profile.cfg:
[port]
frame overhead = 20
number of subports per port = 1
number of pipes per subport = 262144
queue sizes = 64 64 64 64

; Subport configuration
[subport 0]
tb rate = 125000 ; Bytes per second
tb size = 100 ; Bytes
tc 0 rate = 125000 ; Bytes per second
tc 1 rate = 125000 ; Bytes per second
tc 2 rate = 125000 ; Bytes per second
tc 3 rate = 125000 ; Bytes per second
tc period = 10 ; Milliseconds
pipe 0-262143 = 0 ; These pipes are configured with pipe profile 0

; Pipe configuration
[pipe profile 0]
tb rate = 125000 ; Bytes per second
tb size = 100 ; Bytes
tc 0 rate = 125000 ; Bytes per second
tc 1 rate = 125000 ; Bytes per second
tc 2 rate = 125000 ; Bytes per second
tc 3 rate = 125000 ; Bytes per second
tc period = 10 ; Milliseconds
tc 3 oversubscription weight = 1
tc 0 wrr weights = 1 1 1 1
tc 1 wrr weights = 1 1 1 1
tc 2 wrr weights = 1 1 1 1
tc 3 wrr weights = 1 1 1 1