What exactly were you printing? vlib_pending_frame_t has no thread_index field:
/* A frame pending dispatch by main loop. */
typedef struct
{
  /* Node and runtime for this frame. */
  u32 node_runtime_index;

  /* Frame index (in the heap). */
  u32 frame_index;

  /* Start of next frames for this node. */
  u32 next_frame_index;

  /* Special value for next_frame_index when there is no next frame. */
#define VLIB_PENDING_FRAME_NO_NEXT_FRAME ((u32) ~0)
} vlib_pending_frame_t;

Thanks…

Dave

From: Yuliang Li [mailto:yuliang...@yale.edu]
Sent: Friday, July 7, 2017 8:04 PM
To: Dave Barach (dbarach) <dbar...@cisco.com>
Cc: vpp-dev@lists.fd.io
Subject: Re: [vpp-dev] set mapping from node to thread

Hi Dave,

Thanks for the detailed response. However, in the experiment I ran, I see the same packet being processed by two threads. That is why I asked this question. Maybe I made a mistake; here is what I did:

- In the vlib_main_or_worker_loop function, I print all pending frames, along with their thread_index and node name.
- I start vpp with 4 worker threads and set up SNAT according to the progressive tutorial (https://wiki.fd.io/view/VPP/Progressive_VPP_Tutorial).
- I run iperf traffic through the SNAT.

The printed information shows that each TCP packet going from the inside to the outside of the SNAT passes through ethernet-input and snat-in2out. But the thread_index at ethernet-input is 2, while the thread_index at snat-in2out is 1.

Is the above expected?

Thanks,

On Fri, Jul 7, 2017 at 12:22 PM, Dave Barach (dbarach) <dbar...@cisco.com> wrote:

Dear Yuliang,

From a high level: vpp creates N identical graph replicas in a multi-core configuration. When practicable, we use hardware RSS hashing to ensure that all packets belonging to a specific flow are processed [in order!] by the same thread / graph replica. In effect, embarrassing parallelism.

It’s easy enough to hand off packets between threads - see the “handoff-node” - but we avoid that whenever possible.

Although one could - and I have - divided graph nodes across threads to create pipelines, that scheme needs significant dynamic tuning to handle a traffic pattern change. It’s hard to map nodes onto cores so that each thread in a pipeline uses approximately the same number of clocks/pkt; critical, since pipelines run at the speed of the slowest stage.

It’s possible to hand off a full frame of packets for less than two clocks/pkt. Unfortunately, that’s the least significant issue. Handing off a packet from one core/thread to another guarantees a bunch of memory/cache subsystem pressure as the system moves packet data and metadata from A to B.

HTH...

Dave

From: vpp-dev-boun...@lists.fd.io [mailto:vpp-dev-boun...@lists.fd.io] On Behalf Of Yuliang Li
Sent: Friday, July 7, 2017 1:14 PM
To: vpp-dev@lists.fd.io
Subject: [vpp-dev] set mapping from node to thread

Hi,

Is there a way to set which node should run on which thread? And is there a command that shows the mapping from nodes to threads?

Thanks,
--
Yuliang Li
PhD student
Department of Computer Science
Yale University
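
Editor's sketch: the point above about hashing each flow onto a single thread / graph replica can be illustrated roughly as follows. This is not VPP code. The names flow_key_t, flow_hash, and pick_worker are hypothetical, and FNV-1a stands in for the NIC's RSS hash (real NICs typically use a Toeplitz hash). The only property it is meant to show is that identical flow keys always select the same worker, so no handoff between threads is needed.

```c
#include <stdint.h>

/* Hypothetical 5-tuple flow key; this is NOT a VPP type. */
typedef struct
{
  uint32_t src_ip;
  uint32_t dst_ip;
  uint16_t src_port;
  uint16_t dst_port;
  uint8_t protocol;
} flow_key_t;

/* One FNV-1a step: fold a single byte into the running hash. */
static uint32_t
fnv1a_byte (uint32_t h, uint8_t b)
{
  return (h ^ b) * 16777619u;
}

/* Fold a 32-bit value into the hash, least significant byte first. */
static uint32_t
fnv1a_u32 (uint32_t h, uint32_t v)
{
  for (int i = 0; i < 4; i++)
    h = fnv1a_byte (h, (uint8_t) (v >> (8 * i)));
  return h;
}

/* Stand-in for the hardware RSS hash (assumption: FNV-1a, not Toeplitz). */
static uint32_t
flow_hash (const flow_key_t * k)
{
  uint32_t h = 2166136261u;	/* FNV-1a 32-bit offset basis */
  h = fnv1a_u32 (h, k->src_ip);
  h = fnv1a_u32 (h, k->dst_ip);
  h = fnv1a_u32 (h, ((uint32_t) k->src_port << 16) | k->dst_port);
  h = fnv1a_byte (h, k->protocol);
  return h;
}

/* Map a flow to one of n_workers graph replicas; n_workers must be > 0. */
static uint32_t
pick_worker (const flow_key_t * k, uint32_t n_workers)
{
  return flow_hash (k) % n_workers;
}
```

Calling pick_worker twice with the same key and the same worker count returns the same index, which is what keeps a flow's packets in order on one thread. Different flows may or may not land on different workers; that depends on the hash.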