What exactly were you printing? A vlib_pending_frame_t doesn't have a 
thread_index field:


/* A frame pending dispatch by main loop. */
typedef struct
{
  /* Node and runtime for this frame. */
  u32 node_runtime_index;

  /* Frame index (in the heap). */
  u32 frame_index;

  /* Start of next frames for this node. */
  u32 next_frame_index;

  /* Special value for next_frame_index when there is no next frame. */
#define VLIB_PENDING_FRAME_NO_NEXT_FRAME ((u32) ~0)
} vlib_pending_frame_t;

Thanks… Dave

From: Yuliang Li [mailto:yuliang...@yale.edu]
Sent: Friday, July 7, 2017 8:04 PM
To: Dave Barach (dbarach) <dbar...@cisco.com>
Cc: vpp-dev@lists.fd.io
Subject: Re: [vpp-dev] set mapping from node to thread

Hi Dave,

Thanks for the detailed response.

However, in the experiment I ran, I see the same packet being processed by 
two threads, which is why I asked this question. Maybe I made a mistake; 
here is what I did:

- In the vlib_main_or_worker_loop function, I print every pending frame, along 
with its thread_index and node name.
- I start vpp with 4 worker threads and set up SNAT according to the 
progressive tutorial<https://wiki.fd.io/view/VPP/Progressive_VPP_Tutorial>.
- I run iperf traffic through the SNAT.

The printed output shows that each TCP packet going from the inside to the 
outside of the SNAT passes through ethernet-input and snat-in2out, but the 
thread_index at ethernet-input is 2, while the thread_index at snat-in2out 
is 1.

Is the above expected?

Thanks,

On Fri, Jul 7, 2017 at 12:22 PM, Dave Barach (dbarach) 
<dbar...@cisco.com> wrote:
Dear Yuliang,

From a high level: vpp creates N identical graph replicas in a multi-core 
configuration. When practicable, we use hardware RSS hashing to ensure that all 
packets belonging to a specific flow are processed [in order!] by the same 
thread / graph replica. In effect, embarrassing parallelism.
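
As a rough illustration of the flow-to-thread idea above, here is a minimal C 
sketch. It is not VPP code, and it does not reproduce the hash a NIC actually 
computes for RSS: flow_key_t, flow_hash and flow_to_worker are hypothetical 
names, and FNV-1a stands in for the hardware hash. The only point is that a 
deterministic hash of the flow's 5-tuple always selects the same worker, so 
packets of one flow stay in order on one graph replica.

```c
#include <stdint.h>

/* Hypothetical 5-tuple flow key (illustration only, not a VPP type). */
typedef struct
{
  uint32_t src_ip;
  uint32_t dst_ip;
  uint16_t src_port;
  uint16_t dst_port;
  uint8_t protocol;
} flow_key_t;

/* Fold one 32-bit word into an FNV-1a hash, one byte at a time.
   FNV-1a is an assumption here, standing in for the NIC's RSS hash. */
uint32_t
fnv1a_u32 (uint32_t hash, uint32_t word)
{
  int i;
  for (i = 0; i < 4; i++)
    {
      hash ^= (word >> (8 * i)) & 0xff;
      hash *= 16777619u;	/* FNV prime */
    }
  return hash;
}

/* Hash every field of the flow key. */
uint32_t
flow_hash (const flow_key_t * k)
{
  uint32_t h = 2166136261u;	/* FNV offset basis */
  h = fnv1a_u32 (h, k->src_ip);
  h = fnv1a_u32 (h, k->dst_ip);
  h = fnv1a_u32 (h, ((uint32_t) k->src_port << 16) | k->dst_port);
  h = fnv1a_u32 (h, k->protocol);
  return h;
}

/* Select a worker index in [0, n_workers). n_workers must be nonzero. */
uint32_t
flow_to_worker (const flow_key_t * k, uint32_t n_workers)
{
  return flow_hash (k) % n_workers;
}
```

Calling flow_to_worker twice with the same key and the same worker count 
returns the same index, which is the property that keeps a flow on one thread.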

It’s easy enough to hand off packets between threads - see the “handoff-node” - 
but we avoid that whenever possible.

Although one could - and I have - divided graph nodes across threads to create 
pipelines, that scheme needs significant dynamic tuning to handle a traffic 
pattern change. It’s hard to map nodes onto cores so that each thread in a 
pipeline uses approximately the same number of clocks/pkt; critical, since 
pipelines run at the speed of the slowest stage.
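
To make the "slowest stage" point concrete, here is a small C sketch. The 
function names and the per-stage clock counts are assumptions for 
illustration, not measurements or VPP APIs. It computes the throughput of a 
pipeline in which each stage runs on its own core: the rate is the core clock 
divided by the most expensive stage, regardless of how cheap the other stages 
are.

```c
#include <stddef.h>
#include <stdint.h>

/* Return the cost, in clocks/pkt, of the most expensive stage.
   Returns 0 for an empty pipeline. */
uint32_t
pipeline_bottleneck_clocks (const uint32_t * stage_clocks, size_t n_stages)
{
  uint32_t worst = 0;
  size_t i;
  for (i = 0; i < n_stages; i++)
    if (stage_clocks[i] > worst)
      worst = stage_clocks[i];
  return worst;
}

/* Packets per second through the pipeline, assuming one core per stage
   and every core running at clock_hz. Returns 0 if there is no stage
   with a nonzero cost. */
uint64_t
pipeline_pps (uint64_t clock_hz, const uint32_t * stage_clocks,
	      size_t n_stages)
{
  uint32_t worst = pipeline_bottleneck_clocks (stage_clocks, n_stages);
  return worst ? clock_hz / worst : 0;
}
```

With hypothetical stage costs of 100, 250 and 150 clocks/pkt on 2.5 GHz 
cores, pipeline_pps gives 2.5e9 / 250 = 10 million packets per second; the 
100-clock and 150-clock stages spend part of their time idle.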

It’s possible to hand off a full frame of packets for less than two clocks/pkt. 
Unfortunately, that’s the least significant issue. Handing off a packet from 
one core/thread to another guarantees a bunch of memory/cache subsystem 
pressure as the system moves packet data and metadata from A to B.

HTH... Dave

From: vpp-dev-boun...@lists.fd.io [mailto:vpp-dev-boun...@lists.fd.io] On 
Behalf Of Yuliang Li
Sent: Friday, July 7, 2017 1:14 PM
To: vpp-dev@lists.fd.io
Subject: [vpp-dev] set mapping from node to thread
Subject: [vpp-dev] set mapping from node to thread

Hi,

Is there a way to set which node should run on which thread? And is there a 
command that shows the mapping from nodes to threads?

Thanks,
--
Yuliang Li
PhD student
Department of Computer Science
Yale University


