Re: [vpp-dev] Packet processing time.
Hi Mohammed / Dave,

How would you measure the latency of a packet? For example, given the clocks and vectors/call figures for each node, can we measure it?

Thanks,
Regards,
Venu

On Tue, 21 Apr 2020 at 16:54, Mohammed Hawari wrote:
> [quoted text trimmed; Mohammed's and Dave's messages appear in full below]
View/Reply Online (#20021): https://lists.fd.io/g/vpp-dev/message/20021
Re: [vpp-dev] Packet processing time.
Hi Chris,

Evaluating packet processing time in software is a very challenging issue; as mentioned by Dave, the measurement itself is likely to impact the performance we are trying to evaluate. I worked on that issue and have an unpublished, under-review academic paper proposing a solution using the NetFPGA-SUME platform. Basically, I built a custom FPGA design mimicking a NIC capable of timestamping every packet at ingress and egress (immediately after a packet arrives from the wire, and immediately before it departs on the wire). I also wrote a DPDK driver for that NIC and made it work with VPP, so that the latency introduced by VPP plus PCI-based DMA can be evaluated. I played with this design and VPP in various configurations (l2-patch, l2 cross-connect and l3 forward) and I think it could be an interesting tool to diagnose latency issues on a per-packet basis. The downside is, of course, that from the perspective of VPP this is a custom NIC with a custom driver (not necessarily super-optimised), and the evaluated packet forwarding latency takes the driver's performance into account.

If you are interested in discussing this work, I can give you more details and resources in unicast, don't hesitate to contact me :)

Cheers,

Mohammed Hawari
Software Engineer & PhD student
Cisco Systems

> On 18 Apr 2020, at 22:14, Dave Barach via lists.fd.io wrote:
> [quoted text trimmed; Dave's message appears in full below]
View/Reply Online (#16123): https://lists.fd.io/g/vpp-dev/message/16123
Re: [vpp-dev] Packet processing time.
If you turn on the main loop dispatch event logs and look at the results in the g2 viewer [or dump them in ascii] you can make pretty accurate lap time estimates for any workload. Roughly speaking, packets take 1 lap time to arrive and then leave.

The “circuit-node <node-name>” game produces one elog event per frame, so you can look at several million frame circuit times.

Individually timestamping packets would be more precise, but calling clib_cpu_time_now(...) (rdtsc instrs on x86_64) twice per packet would almost certainly affect forwarding performance.

See https://fd.io/docs/vpp/master/gettingstarted/developers/eventviewer.html

/*?
 * Control event logging of api, cli, and thread barrier events
 * With no arguments, displays the current trace status.
 * Name the event groups you wish to trace or stop tracing.
 *
 * @cliexpar
 * @clistart
 * elog trace api cli barrier
 * elog trace api cli barrier disable
 * elog trace dispatch
 * elog trace circuit-node ethernet-input
 * elog trace
 * @cliend
 * @cliexcmd{elog trace [api][cli][barrier][disable]}
 ?*/
/* *INDENT-OFF* */

From: vpp-dev@lists.fd.io On Behalf Of Christian Hopps
Sent: Saturday, April 18, 2020 3:14 PM
Subject: [vpp-dev] Packet processing time.

[quoted text trimmed; Chris's original message appears in full below]
View/Reply Online (#16111): https://lists.fd.io/g/vpp-dev/message/16111
[vpp-dev] Packet processing time.
The recent discussion on reference counting and barrier timing has got me interested in packet processing time. I realize there's a way to use "show runtime" along with knowledge of the arc a packet follows, but I'm curious if something more straight-forward has been attempted where packets are timestamped on ingress (or creation) and stats are collected on egress (transmission)?

I also have an unrelated interest in hooking into the graph immediate-post-transmission -- I'd like to adjust an input queue size only when the packet that enqueued on it is actually transmitted on the wire, and not just handed off downstream on the arc -- this would likely be the same place packet stat collection might occur. :)

Thanks,
Chris.

View/Reply Online (#16110): https://lists.fd.io/g/vpp-dev/message/16110