@Maxim: after PR327 is merged I'll make the changes to replace
scheduler mode with direct pktio mode.

@Honnappa: Right now sender mode uses one pktout_queue per thread. It
can of course be modified to support more interfaces (more
pktout_queues), but I would vote for faster interfaces (40G?).
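
For reference, a rough sketch of the arithmetic behind "line rate",
assuming minimum-size 64-byte Ethernet frames (FCS included) plus 20
bytes of per-frame wire overhead (preamble, SFD and inter-frame gap).
The helper name line_rate_pps is only illustrative; it is not part of
odp_generator:

```c
#include <stdint.h>

/* Per-frame wire overhead in bytes: 7 preamble + 1 SFD + 12 inter-frame gap. */
#define ETH_WIRE_OVERHEAD_BYTES 20u

/* Theoretical maximum packets per second on an Ethernet link.
 * link_bps:    link speed in bits per second (e.g. 10000000000 for 10G)
 * frame_bytes: frame size including FCS (64 for minimum-size frames)
 * The result is rounded down to a whole packet. */
uint64_t line_rate_pps(uint64_t link_bps, uint32_t frame_bytes)
{
    uint64_t bits_on_wire =
        ((uint64_t)frame_bytes + ETH_WIRE_OVERHEAD_BYTES) * 8u;

    return link_bps / bits_on_wire;
}
```

Under these assumptions a 10G port tops out at about 14.88 mpps
(line_rate_pps(10000000000ULL, 64) gives 14880952), while a 40G port
allows roughly 59.5 mpps.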

@Bill: Nice!!!... but kind of tricky to get it right (in limited
time). A simple workaround is to use 40G interfaces for this
benchmark (if I remember correctly from Nokia's results, it reaches
around 20 mpps per core on a regular Xeon).
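
A minimal sketch of the self-tuning idea Bill describes below, under
loudly stated assumptions: measure_pps_fn is a hypothetical callback
that runs the benchmark with process_packet() burning a given number
of cycles per packet and returns the achieved rate; the rate is
assumed never to increase as the cycle count grows, which is what
makes a binary search valid; simulated_pps is a toy model used only
to exercise the search, not a real measurement:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback: run the benchmark with process_packet() consuming
 * 'cycles' per packet and return the achieved packet rate in pps. */
typedef double (*measure_pps_fn)(uint32_t cycles, void *ctx);

/* Largest per-packet cycle count in [0, max_cycles] that still sustains
 * line_rate_pps, or -1 if line rate is not reached even with 0 cycles.
 * Assumes the measured rate never increases when 'cycles' increases. */
int64_t find_cycle_budget(measure_pps_fn measure, void *ctx,
                          double line_rate_pps, uint32_t max_cycles)
{
    uint32_t lo = 0, hi = max_cycles;

    assert(measure != NULL);

    if (measure(0, ctx) < line_rate_pps)
        return -1;

    /* Invariant: 'lo' sustains line rate and the answer lies in [lo, hi]. */
    while (lo < hi) {
        uint32_t mid = hi - (hi - lo) / 2; /* upper middle, so 'lo' advances */

        if (measure(mid, ctx) >= line_rate_pps)
            lo = mid;
        else
            hi = mid - 1;
    }
    return (int64_t)lo;
}

/* Toy model, for illustration only: a core running at 'hz' spends
 * 'overhead_cycles' on RX/TX per packet and the wire caps the rate. */
struct sim_core {
    double hz;
    double overhead_cycles;
    double line_rate_pps;
};

double simulated_pps(uint32_t cycles, void *ctx)
{
    const struct sim_core *c = ctx;
    double rate = c->hz / (c->overhead_cycles + (double)cycles);

    return rate > c->line_rate_pps ? c->line_rate_pps : rate;
}
```

With the toy model set to an arbitrary 2 GHz core, 100 cycles of RX/TX
overhead and 10G line rate (14880952 pps), the search returns a budget
of 34 cycles. A real run would have noisy measurements, so each probe
would likely need to be repeated or compared with some tolerance.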

On 7 December 2017 at 21:32, Bill Fischofer <bill.fischo...@linaro.org> wrote:
>
>
> On Thu, Dec 7, 2017 at 12:55 PM, Honnappa Nagarahalli
> <honnappa.nagaraha...@linaro.org> wrote:
>>
>> On 7 December 2017 at 08:01, Bogdan Pricope <bogdan.pric...@linaro.org>
>> wrote:
>> > TX is at line rate. Probably will get RX at line rate in direct mode,
>> > too.
>> > Problem is how can you see the performance increase/degradation if you
>> > can process more than line rate with one core?
>>
>> Any possibility to add one more port?
>>
>
> The usual way to measure this is to insert a process_packet() routine in the
> loop that consumes a configurable number of cycles. Real applications do
> more than just RX/TX processing but do something with the packets. The lower
> the system overhead the larger the cycle budget process_packet() has while
> maintaining line rate. A good benchmarking tool will self-tune this to find
> the number of cycles process_packet() can consume at line rate. That's the
> measure of efficiency of most interest from a data plane application
> perspective.
>
>>
>> >
>> > I guess .. enable csum option... ?
>> >
>> > On 7 December 2017 at 15:46, Maxim Uvarov <maxim.uva...@linaro.org>
>> > wrote:
>> >> Nice. TX is at line rate, right? The next step is probably to add an
>> >> RX path without the scheduler. Then we will have a good testing
>> >> environment.
>> >>
>> >>
>> >> On 7 December 2017 at 16:12, Bogdan Pricope <bogdan.pric...@linaro.org>
>> >> wrote:
>> >>>
>> >>> More results with odp_generator in lava setup:
>> >>>
>> >>>  7.6 mpps (TX) / 5.9 mpps (RX) - api-next with PR313 (Petri)
>> >>>  8.3 mpps (TX) / 6.3 mpps (RX) - api-next with PR313 (Petri)
>> >>>      + remove 1m sleep + replace atomic counters
>> >>> 14.8 mpps (TX) / 6.5 mpps (RX) - api-next with PR313 (Petri)
>> >>>      + remove 1m sleep + replace atomic counters
>> >>>      + remove csum calculation/validation
>> >>> 14.8 mpps (TX) / 6.8 mpps (RX) - master with PR327
>> >>>      (remove 1m sleep + replace atomic counters
>> >>>      + remove csum calculation/validation)
>> >>>
>> >>> /Bogdan
>> >>>
>> >>>
>> >>> On 6 December 2017 at 13:49, Maxim Uvarov <maxim.uva...@linaro.org>
>> >>> wrote:
>> >>> > Small update. Double-checked that increasing the number of
>> >>> > descriptors does not have any effect in odp_generator.
>> >>> >
>> >>> > Disabling checksums in odp_generator increases TX from 7M to 13M pps
>> >>> > and RX from 5.9M to 6.1M pps.
>> >>> > Because the generator uses predefined packets with precalculated
>> >>> > checksums, there is no need to enable checksums inside the generator.
>> >>> >
>> >>> > It looks like the problem is inside the DPDK driver itself.
>> >>> >
>> >>> > For this PR I think we need to merge it together with changes to
>> >>> > odp_generator (the same as for l2fwd) to enable HW checksum,
>> >>> > which has to be disabled by default.
>> >>> >
>> >>> > Maxim.
>> >>> >
>> >>> >
>> >>> > On 6 December 2017 at 10:46, Maxim Uvarov <maxim.uva...@linaro.org>
>> >>> > wrote:
>> >>> >>
>> >>> >> Skip this message. I will recheck. I pushed the wrong branch to lava.
>> >>> >>
>> >>> >> On 6 December 2017 at 10:42, Maxim Uvarov <maxim.uva...@linaro.org>
>> >>> >> wrote:
>> >>> >>>
>> >>> >>> Ilias was right yesterday. If the number of descriptors is
>> >>> >>> increased to 1024, TX becomes 10M again.
>> >>> >>>
>> >>> >>> +               ret = rte_eth_tx_queue_setup(port_id, i,
>> >>> >>> +                       dev_info.tx_desc_lim.nb_max > 1024 ?
>> >>> >>> +                       1024 : dev_info.tx_desc_lim.nb_max,
>> >>> >>>                         rte_eth_dev_socket_id(port_id),
>> >>> >>>                         txconf);
>> >>> >>>
>> >>> >>> +               ret = rte_eth_rx_queue_setup(port_id, i,
>> >>> >>> +                       dev_info.rx_desc_lim.nb_max > 1024 ?
>> >>> >>> +                       1024 : dev_info.rx_desc_lim.nb_max,
>> >>> >>>                         rte_eth_dev_socket_id(port_id),
>> >>> >>>                         NULL, pkt_dpdk->pkt_pool);
>> >>> >>>
>> >>> >>>
>> >>> >>>
>> >>> >>>
>> >>> >>> Maxim.
>> >>> >>>
>> >>> >>> On 5 December 2017 at 11:20, Elo, Matias (Nokia - FI/Espoo)
>> >>> >>> <matias....@nokia.com> wrote:
>> >>> >>>>
>> >>> >>>> When I tested enabling HW checksum with Fortville NICs (i40e), the
>> >>> >>>> slower driver path alone caused a ~20% throughput drop in the l2fwd
>> >>> >>>> test. This was without actually calculating the checksums; I simply
>> >>> >>>> forced the slower driver path (no vectorization).
>> >>> >>>>
>> >>> >>>> -Matias
>> >>> >>>>
>> >>> >>>>
>> >>> >>>> > On 5 Dec 2017, at 8:59, Bogdan Pricope
>> >>> >>>> > <bogdan.pric...@linaro.org>
>> >>> >>>> > wrote:
>> >>> >>>> >
>> >>> >>>> > On the RX side this is a kind-of expected result, since it uses
>> >>> >>>> > scheduler mode.
>> >>> >>>> >
>> >>> >>>> > On the TX side there is this drop from 10 mpps to 7.69 mpps that
>> >>> >>>> > is unexpected.
>> >>> >>>> >
>> >>> >>>> > So Petri, when you said:
>> >>> >>>> > "DPDK uses less optimized driver code (on Intel NICs at least)
>> >>> >>>> > when any of the L4 checksum offloads is enabled."
>> >>> >>>> >
>> >>> >>>> > were you referring to this kind of drop in performance?
>> >>> >>>> >
>> >>> >>>> > There is that 'folklore' that SW csum is faster on small packets
>> >>> >>>> > while HW csum is faster on bigger packets. Do you have this kind
>> >>> >>>> > of data?
>> >>> >>>> >
>> >>> >>>> > Anyway, for this particular case (odp_generator), since the UDP
>> >>> >>>> > header/payload does not change during the test (for now), the
>> >>> >>>> > csum is calculated only once at the beginning of the test. So we
>> >>> >>>> > are comparing HW IPv4 + HW UDP csum vs. SW IPv4 csum... yet the
>> >>> >>>> > difference in performance is huge...
>> >>> >>>> >
>> >>> >>>> >
>> >>> >>>> > On 4 December 2017 at 20:37, Maxim Uvarov
>> >>> >>>> > <maxim.uva...@linaro.org>
>> >>> >>>> > wrote:
>> >>> >>>> >> I added isolcpus and mounted huge pages; TX became more stable
>> >>> >>>> >> at 7.6M. But anyway it's better to test performance for this PR,
>> >>> >>>> >> because the previous speed was 10M.
>> >>> >>>> >>
>> >>> >>>> >> Maxim.
>> >>> >>>> >>
>> >>> >>>> >> On 12/04/17 19:42, Honnappa Nagarahalli wrote:
>> >>> >>>> >>> Can you run with Linux-DPDK in ODP 2.0?
>> >>> >>>> >>>
>> >>> >>>> >>> On 4 December 2017 at 09:54, Maxim Uvarov
>> >>> >>>> >>> <maxim.uva...@linaro.org>
>> >>> >>>> >>> wrote:
>> >>> >>>> >>>> After applying the patches cleanly and fixing the run scripts,
>> >>> >>>> >>>> I made it run.
>> >>> >>>> >>>>
>> >>> >>>> >>>> But the results are really bad. --enable-dpdk-zero-copy
>> >>> >>>> >>>>
>> >>> >>>> >>>> TX rate is:
>> >>> >>>> >>>> 7673155 pps
>> >>> >>>> >>>>
>> >>> >>>> >>>> RX rate is:
>> >>> >>>> >>>> 5989846 pps
>> >>> >>>> >>>>
>> >>> >>>> >>>>
>> >>> >>>> >>>> Before patch PR 313 TX was 10M pps.
>> >>> >>>> >>>>
>> >>> >>>> >>>> I re-ran the task and TX is 3.3M pps. All tests are single
>> >>> >>>> >>>> core. So something strange happens in lava or in this PR.
>> >>> >>>> >>>>
>> >>> >>>> >>>> Maxim.
>> >>> >>>> >>>>
>> >>> >>>> >>>>
>> >>> >>>> >>>> On 12/04/17 17:03, Bogdan Pricope wrote:
>> >>> >>>> >>>>> On TX (https://lng.validation.linaro.org/scheduler/job/23252.0)
>> >>> >>>> >>>>> I see:
>> >>> >>>> >>>>>
>> >>> >>>> >>>>> ODP_REPO='https://github.com/muvarov/odp'
>> >>> >>>> >>>>> ODP_BRANCH='api-next'
>> >>> >>>> >>>>>
>> >>> >>>> >>>>>
>> >>> >>>> >>>>> On RX (https://lng.validation.linaro.org/scheduler/job/23252.1)
>> >>> >>>> >>>>> I see:
>> >>> >>>> >>>>>
>> >>> >>>> >>>>> ODP_REPO='https://github.com/muvarov/odp'
>> >>> >>>> >>>>> ODP_BRANCH='devel/api-next_shsum'
>> >>> >>>> >>>>>
>> >>> >>>> >>>>>
>> >>> >>>> >>>>> or are you referring to other test?
>> >>> >>>> >>>>>
>> >>> >>>> >>>>>
>> >>> >>>> >>>>> On 4 December 2017 at 15:53, Maxim Uvarov
>> >>> >>>> >>>>> <maxim.uva...@linaro.org> wrote:
>> >>> >>>> >>>>>>
>> >>> >>>> >>>>>>
>> >>> >>>> >>>>>> On 4 December 2017 at 15:11, Bogdan Pricope
>> >>> >>>> >>>>>> <bogdan.pric...@linaro.org>
>> >>> >>>> >>>>>> wrote:
>> >>> >>>> >>>>>>>
>> >>> >>>> >>>>>>> You need to put 313 on TX side (not RX).
>> >>> >>>> >>>>>>
>> >>> >>>> >>>>>>
>> >>> >>>> >>>>>>
>> >>> >>>> >>>>>> Both RX and TX have the patches from 313. l2fwd works on the
>> >>> >>>> >>>>>> recv side. Generator does not work.
>> >>> >>>> >>>>>>
>> >>> >>>> >>>>>> Maxim.
>> >>> >>>> >>>>>>
>> >>> >>>> >>>>>>
>> >>> >>>> >>>>>>>
>> >>> >>>> >>>>>>>
>> >>> >>>> >>>>>>> On 4 December 2017 at 13:19, Savolainen, Petri (Nokia -
>> >>> >>>> >>>>>>> FI/Espoo)
>> >>> >>>> >>>>>>> <petri.savolai...@nokia.com> wrote:
>> >>> >>>> >>>>>>>> Is the DPDK version 17.08? Other versions might not work
>> >>> >>>> >>>>>>>> properly.
>> >>> >>>> >>>>>>>>
>> >>> >>>> >>>>>>>>
>> >>> >>>> >>>>>>>>
>> >>> >>>> >>>>>>>> -Petri
>> >>> >>>> >>>>>>>>
>> >>> >>>> >>>>>>>>
>> >>> >>>> >>>>>>>>
>> >>> >>>> >>>>>>>> From: Maxim Uvarov [mailto:maxim.uva...@linaro.org]
>> >>> >>>> >>>>>>>> Sent: Monday, December 04, 2017 1:10 PM
>> >>> >>>> >>>>>>>> To: Savolainen, Petri (Nokia - FI/Espoo)
>> >>> >>>> >>>>>>>> <petri.savolai...@nokia.com>
>> >>> >>>> >>>>>>>> Cc: Bogdan Pricope <bogdan.pric...@linaro.org>;
>> >>> >>>> >>>>>>>> lng-odp-forward
>> >>> >>>> >>>>>>>> <lng-odp@lists.linaro.org>
>> >>> >>>> >>>>>>>>
>> >>> >>>> >>>>>>>>
>> >>> >>>> >>>>>>>> Subject: Re: [lng-odp] odp dpdk
>> >>> >>>> >>>>>>>>
>> >>> >>>> >>>>>>>>
>> >>> >>>> >>>>>>>>
>> >>> >>>> >>>>>>>> 313 does not work either:
>> >>> >>>> >>>>>>>>
>> >>> >>>> >>>>>>>> https://lng.validation.linaro.org/scheduler/job/23242.1
>> >>> >>>> >>>>>>>>
>> >>> >>>> >>>>>>>> I will replace the RX side with l2fwd and see what happens
>> >>> >>>> >>>>>>>> there.
>> >>> >>>> >>>>>>>>
>> >>> >>>> >>>>>>>> Maxim.
>> >>> >>>> >>>>>>>>
>> >>> >>>> >>>>>>>>
>> >>> >>>> >>>>>>>>
>> >>> >>>> >>>>>>>>
>> >>> >>>> >>>>>>>>
>> >>> >>>> >>>>>>>> On 4 December 2017 at 13:46, Savolainen, Petri (Nokia -
>> >>> >>>> >>>>>>>> FI/Espoo)
>> >>> >>>> >>>>>>>> <petri.savolai...@nokia.com> wrote:
>> >>> >>>> >>>>>>>>
>> >>> >>>> >>>>>>>> Maxim, try https://github.com/Linaro/odp/pull/313
>> >>> >>>> >>>>>>>> It has been tested to fix checksum insert for 10/40GE
>> >>> >>>> >>>>>>>> Intel NICs.
>> >>> >>>> >>>>>>>>
>> >>> >>>> >>>>>>>> -Petri
>> >>> >>>> >>>>>>>>
>> >>> >>>> >>>>>>>>
>> >>> >>>> >>>>>>>>> -----Original Message-----
>> >>> >>>> >>>>>>>>> From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org]
>> >>> >>>> >>>>>>>>> On
>> >>> >>>> >>>>>>>>> Behalf Of
>> >>> >>>> >>>>>>>>> Bogdan Pricope
>> >>> >>>> >>>>>>>>> Sent: Monday, December 04, 2017 12:21 PM
>> >>> >>>> >>>>>>>>> To: Maxim Uvarov <maxim.uva...@linaro.org>
>> >>> >>>> >>>>>>>>> Cc: lng-odp-forward <lng-odp@lists.linaro.org>
>> >>> >>>> >>>>>>>>> Subject: Re: [lng-odp] odp dpdk
>> >>> >>>> >>>>>>>>>
>> >>> >>>> >>>>>>>>> I suspect this is actually caused by a csum issue on the
>> >>> >>>> >>>>>>>>> TX side: on RX, socket pktio does not validate the csum
>> >>> >>>> >>>>>>>>> (and accepts the packets), but on dpdk pktio the csum is
>> >>> >>>> >>>>>>>>> validated and the packets are dropped.
>> >>> >>>> >>>>>>>>>
>> >>> >>>> >>>>>>>>> I am not seeing this in my setup because the default
>> >>> >>>> >>>>>>>>> txq_flags for the igb driver (1G interface) is
>> >>> >>>> >>>>>>>>>
>> >>> >>>> >>>>>>>>> .txq_flags = 0
>> >>> >>>> >>>>>>>>>
>> >>> >>>> >>>>>>>>> while for ixgbe (10G interface) it is:
>> >>> >>>> >>>>>>>>>
>> >>> >>>> >>>>>>>>> .txq_flags = ETH_TXQ_FLAGS_NOMULTSEGS |
>> >>> >>>> >>>>>>>>>                ETH_TXQ_FLAGS_NOOFFLOADS,
>> >>> >>>> >>>>>>>>>
>> >>> >>>> >>>>>>>>>
>> >>> >>>> >>>>>>>>> /B
>> >>> >>>> >>>>>>>>>
>> >>> >>>> >>>>>>>>>
>> >>> >>>> >>>>>>>>>
>> >>> >>>> >>>>>>>>>
>> >>> >>>> >>>>>>>>> On 1 December 2017 at 23:47, Maxim Uvarov
>> >>> >>>> >>>>>>>>> <maxim.uva...@linaro.org>
>> >>> >>>> >>>>>>>>> wrote:
>> >>> >>>> >>>>>>>>>>
>> >>> >>>> >>>>>>>>>> Looking at dpdk pktio support and the generator, it looks
>> >>> >>>> >>>>>>>>>> like the receive part is broken. If I use sockets for
>> >>> >>>> >>>>>>>>>> receive it works well, but receive with dpdk does not get
>> >>> >>>> >>>>>>>>>> any packets, for both master and api-next. Can somebody
>> >>> >>>> >>>>>>>>>> please confirm that this is so? Lava is not super
>> >>> >>>> >>>>>>>>>> friendly for debugging the issue.
>> >>> >>>> >>>>>>>>>>
>> >>> >>>> >>>>>>>>>>
>> >>> >>>> >>>>>>>>>> 1. Recv
>> >>> >>>> >>>>>>>>>> odp_generator -I 0 -m r -c 0x4
>> >>> >>>> >>>>>>>>>>
>> >>> >>>> >>>>>>>>>>
>> >>> >>>> >>>>>>>>>> https://lng.validation.linaro.org/scheduler/job/23206.1
>> >>> >>>> >>>>>>>>>> Network devices using DPDK-compatible driver
>> >>> >>>> >>>>>>>>>> ============================================
>> >>> >>>> >>>>>>>>>> 0000:07:00.1 '82599ES 10-Gigabit SFI/SFP+ Network Connection 10fb' drv=igb_uio unused=
>> >>> >>>> >>>>>>>>>>
>> >>> >>>> >>>>>>>>>>
>> >>> >>>> >>>>>>>>>>
>> >>> >>>> >>>>>>>>>> 2. Send
>> >>> >>>> >>>>>>>>>> odp_generator -I 0 --srcmac 38:ea:a7:93:98:94 \
>> >>> >>>> >>>>>>>>>>     --dstmac 38:ea:a7:93:83:a0 --srcip 192.168.100.2 \
>> >>> >>>> >>>>>>>>>>     --dstip 192.168.100.1 -m u -i 0 -c 0x8 -p 18 \
>> >>> >>>> >>>>>>>>>>     -e 5000 -f 5001 -n 800000000
>> >>> >>>> >>>>>>>>>>
>> >>> >>>> >>>>>>>>>>
>> >>> >>>> >>>>>>>>>> https://lng.validation.linaro.org/scheduler/job/23206.0
>> >>> >>>> >>>>>>>>>>
>> >>> >>>> >>>>>>>>>> Thank you,
>> >>> >>>> >>>>>>>>>> Maxim.
>> >>> >>>> >>>>>>>>
>> >>> >>>> >>>>>>>>
>> >>> >>>> >>>>>>
>> >>> >>>> >>>>>>
>> >>> >>>> >>>>
>> >>> >>>> >>
>> >>> >>>>
>> >>> >>>
>> >>> >>
>> >>> >
>> >>
>> >>
>
>
