> 
> Hi Jerin,
> 
> Following the guide to use the PMU counters (KO inserted and DPDK
> recompiled), the numbers increased more than tenfold (do bigger numbers
> here mean more precise?). Is this valid and expected?
This is correct: bigger numbers mean more precise/granular results.
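
To make the granularity point concrete, here is a rough arithmetic sketch.
Only the 100MHz timer rate comes from this thread; the 2.0GHz core clock
is an assumed figure for illustration, not a measured value for this
machine, and the helper names are made up for the example.

```python
# Only TIMER_HZ is taken from the thread; CPU_HZ is an assumption.
TIMER_HZ = 100_000_000      # generic timer rate quoted in this thread
CPU_HZ = 2_000_000_000      # assumed core clock counted by the PMU cycle counter


def tick_ns(freq_hz):
    """Duration of one counter tick in nanoseconds."""
    return 1e9 / freq_hz


def ticks_for(duration_ns, freq_hz):
    """Whole ticks a counter running at freq_hz records over duration_ns."""
    return int(duration_ns * freq_hz // 1_000_000_000)


# How much larger the reported numbers get when switching counters.
scale = CPU_HZ // TIMER_HZ

if __name__ == "__main__":
    print("timer tick (ns):", tick_ns(TIMER_HZ))
    print("cycle tick (ns):", tick_ns(CPU_HZ))
    print("25 ns in timer ticks:", ticks_for(25, TIMER_HZ))
    print("25 ns in cycle ticks:", ticks_for(25, CPU_HZ))
    print("scale factor:", scale)
```

Under that assumption a 25ns operation is 2 ticks of the 100MHz timer but
50 cycles of the PMU counter, so the same interval is reported as a larger
and finer-grained number. A 10x or larger jump would be consistent with a
core clock above 1GHz.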

> No significant difference was seen.
This is what we are interested in. Do you have numbers from before and
after this change?

> 
> gavin@net-arm-thunderx2:~/community/dpdk$ sudo ./test/test/test -l 16-19,44-47,72-75,100-103 -n 4 --socket-mem=1024  -- -i
> RTE>>ring_perf_autotest (#1 run w/o the patch)
> ### Testing single element and burst enq/deq ###
> SP/SC single enq/dequeue: 103
> MP/MC single enq/dequeue: 130
> SP/SC burst enq/dequeue (size: 8): 18
> MP/MC burst enq/dequeue (size: 8): 21
> SP/SC burst enq/dequeue (size: 32): 7
> MP/MC burst enq/dequeue (size: 32): 8
> 
> ### Testing empty dequeue ###
> SC empty dequeue: 3.00
> MC empty dequeue: 3.00
> 
> ### Testing using a single lcore ###
> SP/SC bulk enq/dequeue (size: 8): 17.48
> MP/MC bulk enq/dequeue (size: 8): 21.77
> SP/SC bulk enq/dequeue (size: 32): 7.39
> MP/MC bulk enq/dequeue (size: 32): 8.52
> 
> ### Testing using two hyperthreads ###
> SP/SC bulk enq/dequeue (size: 8): 31.32
> MP/MC bulk enq/dequeue (size: 8): 38.52
> SP/SC bulk enq/dequeue (size: 32): 13.39
> MP/MC bulk enq/dequeue (size: 32): 14.15
> 
> ### Testing using two physical cores ###
> SP/SC bulk enq/dequeue (size: 8): 75.00
> MP/MC bulk enq/dequeue (size: 8): 141.97
> SP/SC bulk enq/dequeue (size: 32): 23.85
> MP/MC bulk enq/dequeue (size: 32): 36.13
> Test OK
> RTE>>ring_perf_autotest (#2 run w/o the patch)
> ### Testing single element and burst enq/deq ###
> SP/SC single enq/dequeue: 103
> MP/MC single enq/dequeue: 130
> SP/SC burst enq/dequeue (size: 8): 18
> MP/MC burst enq/dequeue (size: 8): 21
> SP/SC burst enq/dequeue (size: 32): 7
> MP/MC burst enq/dequeue (size: 32): 8
> 
> ### Testing empty dequeue ###
> SC empty dequeue: 3.00
> MC empty dequeue: 3.00
> 
> ### Testing using a single lcore ###
> SP/SC bulk enq/dequeue (size: 8): 17.48
> MP/MC bulk enq/dequeue (size: 8): 21.77
> SP/SC bulk enq/dequeue (size: 32): 7.38
> MP/MC bulk enq/dequeue (size: 32): 8.52
> 
> ### Testing using two hyperthreads ###
> SP/SC bulk enq/dequeue (size: 8): 31.31
> MP/MC bulk enq/dequeue (size: 8): 38.52
> SP/SC bulk enq/dequeue (size: 32): 13.33
> MP/MC bulk enq/dequeue (size: 32): 14.16
> 
> ### Testing using two physical cores ###
> SP/SC bulk enq/dequeue (size: 8): 75.74
> MP/MC bulk enq/dequeue (size: 8): 147.33
> SP/SC bulk enq/dequeue (size: 32): 24.79
> MP/MC bulk enq/dequeue (size: 32): 40.09
> Test OK
> 
> RTE>>ring_perf_autotest (#1 run w/ the patch)
> ### Testing single element and burst enq/deq ###
> SP/SC single enq/dequeue: 103
> MP/MC single enq/dequeue: 129
> SP/SC burst enq/dequeue (size: 8): 18
> MP/MC burst enq/dequeue (size: 8): 22
> SP/SC burst enq/dequeue (size: 32): 7
> MP/MC burst enq/dequeue (size: 32): 8
> 
> ### Testing empty dequeue ###
> SC empty dequeue: 3.00
> MC empty dequeue: 4.00
> 
> ### Testing using a single lcore ###
> SP/SC bulk enq/dequeue (size: 8): 17.89
> MP/MC bulk enq/dequeue (size: 8): 21.77
> SP/SC bulk enq/dequeue (size: 32): 7.50
> MP/MC bulk enq/dequeue (size: 32): 8.52
> 
> ### Testing using two hyperthreads ###
> SP/SC bulk enq/dequeue (size: 8): 31.24
> MP/MC bulk enq/dequeue (size: 8): 38.14
> SP/SC bulk enq/dequeue (size: 32): 13.24
> MP/MC bulk enq/dequeue (size: 32): 14.69
> 
> ### Testing using two physical cores ###
> SP/SC bulk enq/dequeue (size: 8): 74.63
> MP/MC bulk enq/dequeue (size: 8): 137.61
> SP/SC bulk enq/dequeue (size: 32): 24.82
> MP/MC bulk enq/dequeue (size: 32): 36.64
> Test OK
> RTE>>ring_perf_autotest (#2 run w/ the patch)
> ### Testing single element and burst enq/deq ###
> SP/SC single enq/dequeue: 103
> MP/MC single enq/dequeue: 129
> SP/SC burst enq/dequeue (size: 8): 18
> MP/MC burst enq/dequeue (size: 8): 22
> SP/SC burst enq/dequeue (size: 32): 7
> MP/MC burst enq/dequeue (size: 32): 8
> 
> ### Testing empty dequeue ###
> SC empty dequeue: 3.00
> MC empty dequeue: 4.00
> 
> ### Testing using a single lcore ###
> SP/SC bulk enq/dequeue (size: 8): 17.89
> MP/MC bulk enq/dequeue (size: 8): 21.77
> SP/SC bulk enq/dequeue (size: 32): 7.50
> MP/MC bulk enq/dequeue (size: 32): 8.52
> 
> ### Testing using two hyperthreads ###
> SP/SC bulk enq/dequeue (size: 8): 31.53
> MP/MC bulk enq/dequeue (size: 8): 38.59
> SP/SC bulk enq/dequeue (size: 32): 13.24
> MP/MC bulk enq/dequeue (size: 32): 14.69
> 
> ### Testing using two physical cores ###
> SP/SC bulk enq/dequeue (size: 8): 75.60
> MP/MC bulk enq/dequeue (size: 8): 149.14
> SP/SC bulk enq/dequeue (size: 32): 25.13
> MP/MC bulk enq/dequeue (size: 32): 40.60
> Test OK
> 
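
One way to put the runs above side by side is a small script along these
lines. It is only a sketch: the function names are made up for
illustration, the regular expression assumes the "bulk enq/dequeue" label
format seen in this log, and it should be fed one test section at a time
because the same labels repeat in every section.

```python
import re

# Matches the bulk enq/dequeue result lines printed by ring_perf_autotest.
PATTERN = re.compile(
    r"((?:SP/SC|MP/MC) bulk enq/dequeue \(size: \d+\)):\s*([\d.]+)"
)


def parse(log_text):
    """Return {label: value}, tolerating '> ' quote markers and wrapped lines."""
    flat = " ".join(line.lstrip("> ").strip() for line in log_text.splitlines())
    return {label: float(value) for label, value in PATTERN.findall(flat)}


def pct_change(before, after):
    """Percent change per label; negative means the 'after' number is lower."""
    return {
        label: (after[label] - before[label]) / before[label] * 100.0
        for label in before
        if label in after
    }


# "Two physical cores" section: run #1 w/o the patch and the first run
# w/ the patch, copied from the quoted log.
WITHOUT_PATCH = """\
> ### Testing using two physical cores ### SP/SC bulk enq/dequeue (size: 8):
> 75.00 MP/MC bulk enq/dequeue (size: 8): 141.97 SP/SC bulk enq/dequeue
> (size: 32): 23.85 MP/MC bulk enq/dequeue (size: 32): 36.13 Test OK
"""
WITH_PATCH = """\
> ### Testing using two physical cores ### SP/SC bulk enq/dequeue (size: 8):
> 74.63 MP/MC bulk enq/dequeue (size: 8): 137.61 SP/SC bulk enq/dequeue
> (size: 32): 24.82 MP/MC bulk enq/dequeue (size: 32): 36.64 Test OK
"""

if __name__ == "__main__":
    deltas = pct_change(parse(WITHOUT_PATCH), parse(WITH_PATCH))
    for label, delta in deltas.items():
        print(f"{label}: {delta:+.2f}%")
```

For the MP/MC (size: 8) case this gives roughly -3.1% with the patch, while
the two runs w/o the patch differ from each other by about 3.8% (141.97 vs
147.33), so on these figures the delta looks like run-to-run noise rather
than an effect of the patch.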
> 
> > -----Original Message-----
> > From: Jerin Jacob <[email protected]>
> > Sent: Monday, October 8, 2018 6:50 PM
> > To: Gavin Hu (Arm Technology China) <[email protected]>
> > Cc: Ola Liljedahl <[email protected]>; [email protected]; Honnappa
> > Nagarahalli <[email protected]>; Ananyev, Konstantin
> > <[email protected]>; Steve Capper <[email protected]>;
> > nd <[email protected]>; [email protected]
> > Subject: Re: [PATCH v3 1/3] ring: read tail using atomic load
> >
> > -----Original Message-----
> > > Date: Mon, 8 Oct 2018 10:33:43 +0000
> > > From: "Gavin Hu (Arm Technology China)" <[email protected]>
> > > To: Ola Liljedahl <[email protected]>, Jerin Jacob
> > > <[email protected]>
> > > CC: "[email protected]" <[email protected]>, Honnappa Nagarahalli
> > > <[email protected]>, "Ananyev, Konstantin"
> > >  <[email protected]>, Steve Capper <[email protected]>,
> > > nd  <[email protected]>, "[email protected]" <[email protected]>
> > > Subject: RE: [PATCH v3 1/3] ring: read tail using atomic load
> > >
> > >
> > > I did benchmarking w/o and w/ the patch; it did not show any
> > > noticeable differences in terms of latency.
> > > Here is the full log (3 runs w/o the patch and 2 runs w/ the patch).
> > >
> > > sudo ./test/test/test -l 16-19,44-47,72-75,100-103 -n 4
> > > --socket-mem=1024  -- -i
> >
> > These counters are running at 100MHz. Use PMU counters to get more
> > accurate results.
> >
> > https://doc.dpdk.org/guides/prog_guide/profile_app.html
> > See: 55.2. Profiling on ARM64
> >
