> Hi Jerin,
>
> Following the guide to use the PMU counters(KO inserted and DPDK
> recompiled), the numbers increased 10+ folds(bigger numbers here mean
> more precise?), is this valid and expected?

This is correct; bigger numbers mean more precise/granular results.

> No significant difference was seen.

This is what we are interested in. Do you have numbers from before and
after this change?

>
> gavin@net-arm-thunderx2:~/community/dpdk$ sudo ./test/test/test -l 16-19,44-47,72-75,100-103 -n 4 --socket-mem=1024 -- -i
> RTE>>ring_perf_autotest (#1 run w/o the patch)
> ### Testing single element and burst enq/deq ###
> SP/SC single enq/dequeue: 103
> MP/MC single enq/dequeue: 130
> SP/SC burst enq/dequeue (size: 8): 18
> MP/MC burst enq/dequeue (size: 8): 21
> SP/SC burst enq/dequeue (size: 32): 7
> MP/MC burst enq/dequeue (size: 32): 8
>
> ### Testing empty dequeue ###
> SC empty dequeue: 3.00
> MC empty dequeue: 3.00
>
> ### Testing using a single lcore ###
> SP/SC bulk enq/dequeue (size: 8): 17.48
> MP/MC bulk enq/dequeue (size: 8): 21.77
> SP/SC bulk enq/dequeue (size: 32): 7.39
> MP/MC bulk enq/dequeue (size: 32): 8.52
>
> ### Testing using two hyperthreads ###
> SP/SC bulk enq/dequeue (size: 8): 31.32
> MP/MC bulk enq/dequeue (size: 8): 38.52
> SP/SC bulk enq/dequeue (size: 32): 13.39
> MP/MC bulk enq/dequeue (size: 32): 14.15
>
> ### Testing using two physical cores ###
> SP/SC bulk enq/dequeue (size: 8): 75.00
> MP/MC bulk enq/dequeue (size: 8): 141.97
> SP/SC bulk enq/dequeue (size: 32): 23.85
> MP/MC bulk enq/dequeue (size: 32): 36.13
> Test OK
> RTE>>ring_perf_autotest (#2 run w/o the patch)
> ### Testing single element and burst enq/deq ###
> SP/SC single enq/dequeue: 103
> MP/MC single enq/dequeue: 130
> SP/SC burst enq/dequeue (size: 8): 18
> MP/MC burst enq/dequeue (size: 8): 21
> SP/SC burst enq/dequeue (size: 32): 7
> MP/MC burst enq/dequeue (size: 32): 8
>
> ### Testing empty dequeue ###
> SC empty dequeue: 3.00
> MC empty dequeue: 3.00
>
> ### Testing using a single lcore ###
> SP/SC bulk enq/dequeue (size: 8): 17.48
> MP/MC bulk enq/dequeue (size: 8): 21.77
> SP/SC bulk enq/dequeue (size: 32): 7.38
> MP/MC bulk enq/dequeue (size: 32): 8.52
>
> ### Testing using two hyperthreads ###
> SP/SC bulk enq/dequeue (size: 8): 31.31
> MP/MC bulk enq/dequeue (size: 8): 38.52
> SP/SC bulk enq/dequeue (size: 32): 13.33
> MP/MC bulk enq/dequeue (size: 32): 14.16
>
> ### Testing using two physical cores ###
> SP/SC bulk enq/dequeue (size: 8): 75.74
> MP/MC bulk enq/dequeue (size: 8): 147.33
> SP/SC bulk enq/dequeue (size: 32): 24.79
> MP/MC bulk enq/dequeue (size: 32): 40.09
> Test OK
>
> RTE>>ring_perf_autotest (#1 run w/ the patch)
> ### Testing single element and burst enq/deq ###
> SP/SC single enq/dequeue: 103
> MP/MC single enq/dequeue: 129
> SP/SC burst enq/dequeue (size: 8): 18
> MP/MC burst enq/dequeue (size: 8): 22
> SP/SC burst enq/dequeue (size: 32): 7
> MP/MC burst enq/dequeue (size: 32): 8
>
> ### Testing empty dequeue ###
> SC empty dequeue: 3.00
> MC empty dequeue: 4.00
>
> ### Testing using a single lcore ###
> SP/SC bulk enq/dequeue (size: 8): 17.89
> MP/MC bulk enq/dequeue (size: 8): 21.77
> SP/SC bulk enq/dequeue (size: 32): 7.50
> MP/MC bulk enq/dequeue (size: 32): 8.52
>
> ### Testing using two hyperthreads ###
> SP/SC bulk enq/dequeue (size: 8): 31.24
> MP/MC bulk enq/dequeue (size: 8): 38.14
> SP/SC bulk enq/dequeue (size: 32): 13.24
> MP/MC bulk enq/dequeue (size: 32): 14.69
>
> ### Testing using two physical cores ###
> SP/SC bulk enq/dequeue (size: 8): 74.63
> MP/MC bulk enq/dequeue (size: 8): 137.61
> SP/SC bulk enq/dequeue (size: 32): 24.82
> MP/MC bulk enq/dequeue (size: 32): 36.64
> Test OK
> RTE>>ring_perf_autotest (#1 run w/ the patch)
> ### Testing single element and burst enq/deq ###
> SP/SC single enq/dequeue: 103
> MP/MC single enq/dequeue: 129
> SP/SC burst enq/dequeue (size: 8): 18
> MP/MC burst enq/dequeue (size: 8): 22
> SP/SC burst enq/dequeue (size: 32): 7
> MP/MC burst enq/dequeue (size: 32): 8
>
> ### Testing empty dequeue ###
> SC empty dequeue: 3.00
> MC empty dequeue: 4.00
>
> ### Testing using a single lcore ###
> SP/SC bulk enq/dequeue (size: 8): 17.89
> MP/MC bulk enq/dequeue (size: 8): 21.77
> SP/SC bulk enq/dequeue (size: 32): 7.50
> MP/MC bulk enq/dequeue (size: 32): 8.52
>
> ### Testing using two hyperthreads ###
> SP/SC bulk enq/dequeue (size: 8): 31.53
> MP/MC bulk enq/dequeue (size: 8): 38.59
> SP/SC bulk enq/dequeue (size: 32): 13.24
> MP/MC bulk enq/dequeue (size: 32): 14.69
>
> ### Testing using two physical cores ###
> SP/SC bulk enq/dequeue (size: 8): 75.60
> MP/MC bulk enq/dequeue (size: 8): 149.14
> SP/SC bulk enq/dequeue (size: 32): 25.13
> MP/MC bulk enq/dequeue (size: 32): 40.60
> Test OK
>
>
> > -----Original Message-----
> > From: Jerin Jacob <[email protected]>
> > Sent: Monday, October 8, 2018 6:50 PM
> > To: Gavin Hu (Arm Technology China) <[email protected]>
> > Cc: Ola Liljedahl <[email protected]>; [email protected]; Honnappa
> > Nagarahalli <[email protected]>; Ananyev, Konstantin
> > <[email protected]>; Steve Capper <[email protected]>;
> > nd <[email protected]>; [email protected]
> > Subject: Re: [PATCH v3 1/3] ring: read tail using atomic load
> >
> > -----Original Message-----
> > > Date: Mon, 8 Oct 2018 10:33:43 +0000
> > > From: "Gavin Hu (Arm Technology China)" <[email protected]>
> > > To: Ola Liljedahl <[email protected]>, Jerin Jacob
> > > <[email protected]>
> > > CC: "[email protected]" <[email protected]>, Honnappa Nagarahalli
> > > <[email protected]>, "Ananyev, Konstantin"
> > > <[email protected]>, Steve Capper <[email protected]>,
> > > nd <[email protected]>, "[email protected]" <[email protected]>
> > > Subject: RE: [PATCH v3 1/3] ring: read tail using atomic load
> > >
> > >
> > > I did benchmarking w/o and w/ the patch, it did not show any noticeable
> > > differences in terms of latency.
> > > Here is the full log( 3 runs w/o the patch and 2 runs w/ the patch).
> > >
> > > sudo ./test/test/test -l 16-19,44-47,72-75,100-103 -n 4
> > > --socket-mem=1024 -- -i
> >
> > These counters are running at 100MHz. Use PMU counters to get more
> > accurate results.
> >
> > https://doc.dpdk.org/guides/prog_guide/profile_app.html
> > See: 55.2. Profiling on ARM64
> >
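To illustrate the granularity point about the 100MHz counter, here is a rough sketch of the arithmetic. The 100 MHz figure is the one quoted in the thread; the 2.0 GHz core clock and the function names are assumptions made up for this example, not values reported for the test machine.

```python
# Sketch: how coarse is one tick of a 100 MHz counter in CPU cycles?
TIMER_HZ = 100_000_000    # counter frequency quoted in the thread (100 MHz)
CPU_HZ = 2_000_000_000    # ASSUMPTION: hypothetical 2.0 GHz core clock


def tick_ns(timer_hz):
    """Duration of one counter tick in nanoseconds."""
    return 1e9 / timer_hz


def cpu_cycles_per_tick(cpu_hz, timer_hz):
    """CPU clock cycles that elapse during one counter tick."""
    return cpu_hz / timer_hz


if __name__ == "__main__":
    print(f"one tick = {tick_ns(TIMER_HZ):.1f} ns")
    print(f"one tick = {cpu_cycles_per_tick(CPU_HZ, TIMER_HZ):.1f} CPU cycles")
```

Under the assumed clock, a single tick of the 100 MHz counter spans many core cycles, so short operations are only resolved in coarse steps. A PMU cycle counter counts at the core clock, which is why it reports larger and more granular values.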

