Just to avoid potential confusion: my recent ethernet-input change mandates
that frames coming from input nodes are not aggregated. The reason is that
processing a frame is much faster when we know in advance that all packets in
the frame share the same sw_if_index.

In other words, the input node calls vlib_get_new_next_frame() instead of
vlib_get_next_frame(), and it stores flags and sw_if_index inside the frame
scalar data.

At the moment this is the only case in the current codebase which doesn't
aggregate.
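
To make the idea concrete, here is a minimal toy sketch in plain C. It is not
the VPP code: toy_frame_t, TOY_FRAME_F_SINGLE_SW_IF_IDX and both functions are
hypothetical names invented for illustration. It only shows why a frame-level
flag plus one sw_if_index in the frame's scalar area lets the consumer skip
per-packet interface checks.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins, NOT the real vlib/vnet definitions. */
#define TOY_FRAME_F_SINGLE_SW_IF_IDX (1u << 0)
#define TOY_FRAME_SIZE 256

typedef struct
{
  uint32_t flags;       /* frame-level hints written by the producer */
  uint32_t sw_if_index; /* "scalar data": valid only if the flag is set */
  uint32_t n_vectors;   /* number of packets in the frame */
  uint32_t pkt_sw_if_index[TOY_FRAME_SIZE]; /* per-packet metadata */
} toy_frame_t;

/* Producer: an input node fills a fresh frame for exactly one interface,
   so it can promise that every packet shares the same sw_if_index. */
void
toy_input_fill (toy_frame_t *f, uint32_t sw_if_index, uint32_t n_pkts)
{
  assert (n_pkts <= TOY_FRAME_SIZE);
  f->flags = TOY_FRAME_F_SINGLE_SW_IF_IDX;
  f->sw_if_index = sw_if_index;
  f->n_vectors = n_pkts;
  for (uint32_t i = 0; i < n_pkts; i++)
    f->pkt_sw_if_index[i] = sw_if_index;
}

/* Consumer: how many per-packet sw_if_index reads are needed to know
   which interface each packet arrived on. */
uint32_t
toy_per_packet_reads (const toy_frame_t *f)
{
  if (f->flags & TOY_FRAME_F_SINGLE_SW_IF_IDX)
    return 0; /* fast path: f->sw_if_index covers the whole frame */
  return f->n_vectors; /* slow path: every packet must be inspected */
}
```

If a second input node were allowed to append to the same frame, the flag
could no longer be trusted, which is why such frames are not aggregated.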


> On 25 Jan 2019, at 14:03, Dave Barach via Lists.Fd.Io 
> <dbarach=cisco....@lists.fd.io> wrote:
> 
> Dear Kingwei,
>  
> On a per-thread basis, only one input node is active at a time. In the 2x 
> active input node case you sketched, the second input node will take frame 
> ownership of the ethernet input frame - which should be pending but not yet 
> dispatched - and add more buffer indices to it. It’s possible that the first 
> frame will fill and that a second frame will be allocated and [partially] 
> filled; that’s not the typical case.
>  
> One lap around the track later, the first input node will take back ownership.
>  
> Let’s say that input node 1 adds 50 packets to the ethernet-input frame, and 
> input node 2 adds another 50 packets. The frame ownership change dance yields 
> a vector of size 100 in ethernet-input. My guess is that the increase in 
> efficiency from ethernet-input forward in the graph more than compensates for 
> the fixed cost of the frame change ownership dance. This is to confirm what 
> you wrote.
>  
> I usually call this effect “input aggregation.” Also holds true for the 
> handoff node, especially when handing off frames from multiple threads to a 
> single thread.
>  
> The alternative: dispatch two smaller frames instead of one big one. Doing 
> that might not be awful if all input nodes produced a fair number of packets. 
> The situation becomes awful when the 2..N input nodes produce very different 
> numbers of packets, e.g. 99 and 1. Anyhow, the code doesn’t work that way and 
> isn’t going to work that way, so I digress...
>  
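
A rough way to see the amortization argument above is a toy cost model. The
function names and every number fed into them are assumptions chosen for
illustration, not measurements of VPP: a fixed cost per node dispatch, a
fixed cost per ownership change, and a per-packet cost.

```c
#include <assert.h>
#include <stdint.h>

/* Cost of dispatching a node once over a frame of n_pkts packets.
   "fixed" models per-dispatch overhead, "per_pkt" the per-packet work. */
uint64_t
toy_dispatch_cost (uint64_t fixed, uint64_t per_pkt, uint32_t n_pkts)
{
  return n_pkts ? fixed + per_pkt * n_pkts : 0;
}

/* Two input nodes share one frame: one ownership change, one dispatch. */
uint64_t
toy_cost_aggregated (uint64_t fixed, uint64_t change, uint64_t per_pkt,
                     uint32_t a, uint32_t b)
{
  return change + toy_dispatch_cost (fixed, per_pkt, a + b);
}

/* The alternative: each input node's packets are dispatched separately. */
uint64_t
toy_cost_separate (uint64_t fixed, uint64_t per_pkt, uint32_t a, uint32_t b)
{
  return toy_dispatch_cost (fixed, per_pkt, a)
         + toy_dispatch_cost (fixed, per_pkt, b);
}
```

With assumed costs of fixed = 200, change = 20 and per_pkt = 10, the 50 + 50
case costs 1220 aggregated versus 1400 separate. In the 99 + 1 case the lone
packet pays 210 on its own, which is the awful situation described above.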
> Thanks... Dave  
>  
> From: Kingwel Xie <kingwel....@ericsson.com>
> Sent: Friday, January 25, 2019 5:13 AM
> To: Dave Barach (dbarach) <dbar...@cisco.com>; vpp-dev <vpp-dev@lists.fd.io>
> Subject: RE: [vpp-dev] Question about vlib_next_frame_change_ownership
>  
> Hi Dave,
>  
> After checking the code and some debug sessions, I realized where the bug 
> is: crypto-input, which calls vlib_trace_buffer() earlier than 
> get_next_frame/put_next_frame. Therefore, the next_frame->flag is overwritten 
> by get_next_frame/change_ownership. I’ve made a patch for it: 
> https://gerrit.fd.io/r/17079. I would appreciate your comments.
>  
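
The ordering problem described above can be sketched with a toy model. This
is not the vlib implementation: toy_next_frame_t, TOY_FRAME_TRACE and the
functions are hypothetical, and the ownership change is reduced to the plain
swap of the two next_frame slots that this thread describes.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_FRAME_TRACE (1u << 0) /* hypothetical stand-in for the trace flag */

typedef struct
{
  uint32_t frame_index; /* which frame this slot currently points at */
  uint32_t flags;       /* per-next-frame flags, including the trace flag */
} toy_next_frame_t;

/* Toy ownership change: the caller's slot and the current owner's slot
   are swapped, so the caller ends up holding the pending frame. */
void
toy_change_ownership (toy_next_frame_t *mine, toy_next_frame_t *owner)
{
  toy_next_frame_t tmp = *mine;
  *mine = *owner;
  *owner = tmp;
}

/* Buggy order: mark the slot for tracing, then acquire the frame.
   The swap replaces the freshly set flags with the old owner's flags. */
uint32_t
toy_trace_then_acquire (toy_next_frame_t *mine, toy_next_frame_t *owner)
{
  mine->flags |= TOY_FRAME_TRACE;
  toy_change_ownership (mine, owner);
  return mine->flags;
}

/* Fixed order: acquire the frame first, then set the trace flag. */
uint32_t
toy_acquire_then_trace (toy_next_frame_t *mine, toy_next_frame_t *owner)
{
  toy_change_ownership (mine, owner);
  mine->flags |= TOY_FRAME_TRACE;
  return mine->flags;
}
```

In the buggy order the flag ends up on the other node's slot, so the frame
that actually gets dispatched is no longer marked for tracing.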
> Again, about vlib_next_frame_change_ownership: as I understand it, this 
> mechanism will always try to enqueue buffers to the owner’s frame_index, so 
> please consider this scenario:
>  
> We have 2 input nodes running at the same time, dpdk-input and avf-input, 
> and they compete with each other to enqueue to ethernet-input. The downside 
> is a memcpy per frame, as we discussed. The upside: assuming they both have 
> 10 buffers per frame, ethernet-input gets a chance to process 20 buffers in 
> one batch, which is good for performance. Please confirm whether this is the 
> expected behavior of frame ownership.
>  
> Regards,
> Kingwel
>  
>  
> From: vpp-dev@lists.fd.io <vpp-dev@lists.fd.io> On Behalf Of Dave Barach via 
> Lists.Fd.Io
> Sent: Friday, January 25, 2019 4:52 AM
> To: Kingwel Xie <kingwel....@ericsson.com>; vpp-dev <vpp-dev@lists.fd.io>
> Cc: vpp-dev@lists.fd.io
> Subject: Re: [vpp-dev] Question about vlib_next_frame_change_ownership
>  
> The vpp packet trace which I extracted from your dispatch trace seems exactly 
> as I would have expected. See below. In a pg test like this one using a 
> loopback interface, anything past loopN-tx is irrelevant. The ipsec packet 
> turns into an ARP request for 18.1.0.241.
>  
> In non-cyclic graph cases, we don’t end up changing frame ownership at all. 
> In this case, you’re doing a double lookup. One small memcpy per frame is a 
> trivial cost, especially when one remembers that the cost is amortized over 
> all the packets in the frame. 
>  
> Until you produce a repeatable demonstration of the claimed issue, there’s 
> nothing that we can do.
>  
> Thanks... Dave
>  
> VPP Buffer Trace
>     Trace: 
>     Trace: 00:00:53:959410: pg-input
>     Trace:   stream ipsec0, 100 bytes, 0 sw_if_index
>     Trace:   current data 0, length 100, buffer-pool 0, clone-count 0, trace 0x0
>     Trace:   UDP: 192.168.2.255 -> 1.2.3.4
>     Trace:     tos 0x00, ttl 64, length 28, checksum 0xb324
>     Trace:     fragment id 0x0000
>     Trace:   UDP: 4321 -> 1234
>     Trace:     length 80, checksum 0x30d9
>     Trace: 00:00:53:959426: ip4-input
>     Trace:   UDP: 192.168.2.255 -> 1.2.3.4
>     Trace:     tos 0x00, ttl 64, length 28, checksum 0xb324
>     Trace:     fragment id 0x0000
>     Trace:   UDP: 4321 -> 1234
>     Trace:     length 80, checksum 0x30d9
>     Trace: 00:00:53:959519: ip4-lookup
>     Trace:   fib 0 dpo-idx 2 flow hash: 0x00000000
>     Trace:   UDP: 192.168.2.255 -> 1.2.3.4
>     Trace:     tos 0x00, ttl 64, length 28, checksum 0xb324
>     Trace:     fragment id 0x0000
>     Trace:   UDP: 4321 -> 1234
>     Trace:     length 80, checksum 0x30d9
>     Trace: 00:00:53:959598: ip4-rewrite
>     Trace:   tx_sw_if_index 2 dpo-idx 2 : ipv4 via 0.0.0.0 ipsec0: mtu:9000 flow hash: 0x00000000
>     Trace:   00000000: 4500001c000000003f11b424c0a802ff0102030410e104d2005030d900010203
>     Trace:   00000020: 0405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f
>     Trace: 00:00:53:959687: ipsec0-output
>     Trace:   ipsec0
>     Trace:   00000000: 4500001c000000003f11b424c0a802ff0102030410e104d2005030d900010203
>     Trace:   00000020: 0405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f20212223
>     Trace:   00000040: 2425262728292a2b2c2d2e2f303132333435363738393a3b3c3d3e3f40414243
>     Trace:   00000060: 44454647000000000000000000000000000000000000000000000000
>     Trace: 00:00:53:959802: ipsec0-tx
>     Trace:   IPSec: spi 1 seq 1
>     Trace: 00:00:53:959934: esp4-encrypt
>     Trace:   esp: spi 1 seq 1 crypto aes-cbc-128 integrity sha1-96
>     Trace: 00:00:53:960084: ip4-lookup
>     Trace:   fib 0 dpo-idx 0 flow hash: 0x00000000
>     Trace:   IPSEC_ESP: 18.1.0.71 -> 18.1.0.241
>     Trace:     tos 0x00, ttl 254, length 168, checksum 0x96ea
>     Trace:     fragment id 0x0000
>     Trace: 00:00:53:960209: ip4-glean
>     Trace:     IPSEC_ESP: 18.1.0.71 -> 18.1.0.241
>     Trace:       tos 0x00, ttl 254, length 168, checksum 0x96ea
>     Trace:       fragment id 0x0000
>     Trace: 00:00:53:960336: loop0-output
>     Trace:   loop0
>     Trace:   ARP: de:ad:00:00:00:00 -> ff:ff:ff:ff:ff:ff
>     Trace:   request, type ethernet/IP4, address size 6/4
>     Trace:   de:ad:00:00:00:00/18.1.0.71 -> 00:00:00:00:00:00/18.1.0.241
>     Trace: 00:00:53:960491: error-drop
>     Trace:   ip4-glean: ARP requests sent
>  
>  
> From: Kingwel Xie <kingwel....@ericsson.com>
> Sent: Thursday, January 24, 2019 2:43 AM
> To: Dave Barach (dbarach) <dbar...@cisco.com>; vpp-dev <vpp-dev@lists.fd.io>
> Subject: RE: [vpp-dev] Question about vlib_next_frame_change_ownership
>  
> Ok. As requested, pcap trace & test script attached. Actually I made some 
> simplification to indicate the problem – using native IPSEC instead of DPDK.
>  
> You can see in the buffer trace that ip-lookup is reached from ip-input in 
> the beginning, then from esp-encrypt later. It means the ownership of 
> ip-lookup will be changed back and forth, with a 16x3=48 byte memcpy per 
> frame. In some cases, the trace flag in next_frame will be lost, which 
> breaks the buffer trace. I made a patch for further discussion about it:  
> https://gerrit.fd.io/r/17037
>  
> Test log shown below:
>  
> DBGvpp# show version
> vpp v19.04-rc0~24-g0702554 built by root on ubuntu89 at Sat Jan 19 22:13:50 EST 2019
> DBGvpp# 
> DBGvpp# exec ipsec
> loop0
> DBGvpp# 
> DBGvpp# pcap dispatch trace on max 1000 file vpp.pcap buffer-trace pg-input 10
> Buffer tracing of 10 pkts from pg-input enabled...
> pcap dispatch capture on...
> DBGvpp# 
> DBGvpp# 
> DBGvpp# packet-generator enable-stream ipsec0
> DBGvpp#           
> DBGvpp# pcap dispatch trace off
> captured 14 pkts...
> saved to /tmp/vpp.pcap...
> DBGvpp# 
> DBGvpp# show trace
> ------------------- Start of thread 0 kw_main -------------------
> Packet 1
>  
> 00:00:53:959410: pg-input
>   stream ipsec0, 100 bytes, 0 sw_if_index
>   current data 0, length 100, buffer-pool 0, clone-count 0, trace 0x0
>   UDP: 192.168.2.255 -> 1.2.3.4
>     tos 0x00, ttl 64, length 28, checksum 0xb324
>     fragment id 0x0000
>   UDP: 4321 -> 1234
>     length 80, checksum 0x30d9
> 00:00:53:959426: ip4-input
>   UDP: 192.168.2.255 -> 1.2.3.4
>     tos 0x00, ttl 64, length 28, checksum 0xb324
>     fragment id 0x0000
>   UDP: 4321 -> 1234
>     length 80, checksum 0x30d9
> 00:00:53:959519: ip4-lookup
>   fib 0 dpo-idx 2 flow hash: 0x00000000
>   UDP: 192.168.2.255 -> 1.2.3.4
>     tos 0x00, ttl 64, length 28, checksum 0xb324
>     fragment id 0x0000
>   UDP: 4321 -> 1234
>     length 80, checksum 0x30d9
> 00:00:53:959598: ip4-rewrite
>   tx_sw_if_index 2 dpo-idx 2 : ipv4 via 0.0.0.0 ipsec0: mtu:9000 flow hash: 0x00000000
>   00000000: 4500001c000000003f11b424c0a802ff0102030410e104d2005030d900010203
>   00000020: 0405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f
> 00:00:53:959687: ipsec0-output
>   ipsec0
>   00000000: 4500001c000000003f11b424c0a802ff0102030410e104d2005030d900010203
>   00000020: 0405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f20212223
>   00000040: 2425262728292a2b2c2d2e2f303132333435363738393a3b3c3d3e3f40414243
>   00000060: 44454647000000000000000000000000000000000000000000000000
> 00:00:53:959802: ipsec0-tx
>   IPSec: spi 1 seq 1
> 00:00:53:959934: esp4-encrypt
>   esp: spi 1 seq 1 crypto aes-cbc-128 integrity sha1-96
> 00:00:53:960084: ip4-lookup
>   fib 0 dpo-idx 0 flow hash: 0x00000000
>   IPSEC_ESP: 18.1.0.71 -> 18.1.0.241
>     tos 0x00, ttl 254, length 168, checksum 0x96ea
>     fragment id 0x0000
> 00:00:53:960209: ip4-glean
>     IPSEC_ESP: 18.1.0.71 -> 18.1.0.241
>       tos 0x00, ttl 254, length 168, checksum 0x96ea
>       fragment id 0x0000
> 00:00:53:960336: loop0-output
>   loop0
>   ARP: de:ad:00:00:00:00 -> ff:ff:ff:ff:ff:ff
>   request, type ethernet/IP4, address size 6/4
>   de:ad:00:00:00:00/18.1.0.71 -> 00:00:00:00:00:00/18.1.0.241
> 00:00:53:960491: error-drop
>   ip4-glean: ARP requests sent
> 00:00:53:960780: ethernet-input
>   ARP: de:ad:00:00:00:00 -> ff:ff:ff:ff:ff:ff
> 00:00:53:960927: arp-input
>   request, type ethernet/IP4, address size 6/4
>   de:ad:00:00:00:00/18.1.0.71 -> 00:00:00:00:00:00/18.1.0.241
> 00:00:53:961126: error-drop
>   arp-input: IP4 source address matches local interface
>  
>  
> From: Dave Barach (dbarach) <dbar...@cisco.com>
> Sent: Wednesday, January 23, 2019 11:33 PM
> To: Kingwel Xie <kingwel....@ericsson.com>; vpp-dev <vpp-dev@lists.fd.io>
> Subject: RE: [vpp-dev] Question about vlib_next_frame_change_ownership
>  
> Please write up the issue and share the config and pg input script as I 
> asked. You might find that the issue disappears pretty rapidly, with no 
> further action on your part... (😉)...
>  
> The basic graph engine is not a place to start hacking based on “I think I 
> get it...” 
>  
> Thanks... Dave
>  
> From: vpp-dev@lists.fd.io <vpp-dev@lists.fd.io> On Behalf Of Kingwel Xie
> Sent: Wednesday, January 23, 2019 10:18 AM
> To: Dave Barach (dbarach) <dbar...@cisco.com>; vpp-dev <vpp-dev@lists.fd.io>
> Subject: Re: [vpp-dev] Question about vlib_next_frame_change_ownership
>  
> Thanks. I think I get it. By maintaining the ownership, VPP is able to 
> enqueue all buffers destined for the same target node into the owner's next 
> frame at one time. This avoids dispatching the node function multiple times.
>  
> The bug is still there. I will create a patch later for further discussion.
>  
> And maybe there is some room to improve: considering we have two input 
> nodes which both add elements to a third node, we will see the ownership of 
> this node being switched on a per-frame basis.
>  
> - Kingwel
>  
> 
> -------- Original message --------
> Subject: RE: Question about vlib_next_frame_change_ownership
> From: "Dave Barach (dbarach)" <dbar...@cisco.com>
> Sent: January 23, 2019, 8:49 PM
> Cc: Kingwel Xie <kingwel....@ericsson.com>, vpp-dev <vpp-dev@lists.fd.io>
> As you've probably noticed, the buffer manager has been under active 
> development. That may or may not have anything to do with the issue.
>  
> Please follow the bug reporting process: 
> https://wiki.fd.io/view/VPP/BugReports. In this case, using master/latest, 
> please create a Jira ticket including the exact configuration, packet 
> generator input script, and a dispatch pcap trace:  
>  
> "pcap dispatch trace on file dtrace max 10000 buffer-trace pg-input 1000",
> start the pg stream 
> "pcap dispatch trace off".
> Results in /tmp/dtrace.
>  
> I'm not going to speculate on what's going on at this point. Please write up 
> the issue so we can look at it.
>  
> For a decent explanation of the frame ownership scheme, take a look at 
> https://fdio-vpp.readthedocs.io/en/latest/gettingstarted/developers/vlib.html 
> under "Complications".
>  
> HTH... Dave
>  
> -----Original Message-----
> From: Kingwel Xie <kingwel....@ericsson.com>
> Sent: Wednesday, January 23, 2019 2:16 AM
> To: Dave Barach (dbarach) <dbar...@cisco.com>; vpp-dev <vpp-dev@lists.fd.io>
> Subject: Question about vlib_next_frame_change_ownership
>  
> Hi Dave and all,
>  
> I'm looking at a buffer trace issue with DPDK IPSEC. It turns out the flag 
> VLIB_FRAME_TRACE is broken in vlib_next_frame_change_ownership().
>  
> The node path in my setup is:  pg-input -> ip-input -> ip-lookup -> ... -> 
> dpdk-esp-encrypt -> cryptodev -> crypto-input -> ip-lookup -> ...
>  
> As you can see, the ip-lookup node has ip-input as its owner node in the 
> beginning, then the owner is changed to crypto-input shortly after. This 
> change causes us to swap the current next_frame with the owner's in 
> vlib_next_frame_change_ownership(). As a result, the VLIB_FRAME_TRACE in 
> next_frame->flag will be overwritten.
>  
> The fix could be very simple, but I'm wondering why we have to change the 
> ownership of the next_frame. Actually, I can observe the ownership changing 
> back and forth between ip-input and crypto-input for every frame, which leads 
> to performance degradation. However, it looks fine to me even if we don’t 
> care about the ownership. In that case, ip-lookup would be dispatched by 
> either ip-input or crypto-input, each with a different next_frame. I guess I 
> must have missed something; I would appreciate it if you could elaborate.
>  
> Regards,
> Kingwel
> -=-=-=-=-=-=-=-=-=-=-=-
> Links: You receive all messages sent to this group.
> 
> View/Reply Online (#12010): https://lists.fd.io/g/vpp-dev/message/12010 
> <https://lists.fd.io/g/vpp-dev/message/12010>
> Mute This Topic: https://lists.fd.io/mt/29430823/675642 
> <https://lists.fd.io/mt/29430823/675642>
> Group Owner: vpp-dev+ow...@lists.fd.io <mailto:vpp-dev+ow...@lists.fd.io>
> Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub 
> <https://lists.fd.io/g/vpp-dev/unsub>  [dmar...@me.com 
> <mailto:dmar...@me.com>]
> -=-=-=-=-=-=-=-=-=-=-=-

-- 
Damjan

View/Reply Online (#12013): https://lists.fd.io/g/vpp-dev/message/12013
