Re: [vpp-dev] Packet processing time.

2021-08-25 Thread Venumadhav Josyula
Hi Mohammed / Dave,

How would you measure the latency of a packet? For example, can we measure the
clocks and vectors/call for each node?
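
For reference, a rough sketch of what I mean (output format from "show runtime";
the numbers below are purely illustrative, not real measurements):

vpp# clear runtime
  ... run traffic for a while ...
vpp# show runtime
Name             State     Calls    Vectors  Suspends   Clocks  Vectors/Call
ethernet-input   active     1000     256000         0    1.9e1        256.00
ip4-input        active     1000     256000         0    2.6e1        256.00
ip4-lookup       active     1000     256000         0    3.1e1        256.00

Here Clocks is (roughly) the average cycles spent per packet in that node and
Vectors/Call the average frame size, but these are averages rather than
per-packet latency.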

Thanks,
Regards,
Venu

On Tue, 21 Apr 2020 at 16:54, Mohammed Hawari wrote:

> Hi Chris,
>
> Evaluating packet processing time in software is a very challenging issue;
> as Dave mentioned, the measurement itself is likely to impact the performance
> we are trying to evaluate. I worked on that issue and have an academic paper,
> currently under review, proposing a solution based on the NetFPGA-SUME
> platform. Basically, I built a custom FPGA design mimicking a NIC that
> timestamps every packet at ingress and egress (immediately after a packet
> arrives from the wire, and immediately before it departs on the wire). I also
> wrote a DPDK driver for that NIC and made it work with VPP, so that the
> latency introduced by VPP plus the PCI-based DMA can be evaluated. I played
> with this design and VPP in various configurations (l2-patch, l2
> cross-connect, and l3 forward), and I think it could be an interesting tool
> for diagnosing latency issues on a “per-packet” basis.
> The downside is, of course, that from VPP’s perspective this is a custom NIC
> with a custom driver (not necessarily super-optimised), so the evaluated
> packet forwarding latency includes the driver’s performance.
>
> If you are interested in discussing this work, I can give you more details
> and resources in unicast; don’t hesitate to contact me :)
>
> Cheers,
>
> Mohammed Hawari
> Software Engineer & PhD student
> Cisco Systems
>
>
> On 18 Apr 2020, at 22:14, Dave Barach via lists.fd.io
> <dbarach=cisco@lists.fd.io> wrote:
>
> If you turn on the main loop dispatch event logs and look at the results
> in the g2 viewer [or dump them in ascii] you can make pretty accurate lap
> time estimates for any workload. Roughly speaking, packets take 1 lap time
> to arrive and then leave.
>
> The “circuit-node <node-name>” game produces one elog event per frame, so
> you can look at several million frame circuit times.
>
> Individually timestamping packets would be more precise, but calling
> clib_cpu_time_now(...) (rdtsc instrs on x86_64) twice per packet would
> almost certainly affect forwarding performance.
>
> See
> https://fd.io/docs/vpp/master/gettingstarted/developers/eventviewer.html
>
> /*?
> * Control event logging of api, cli, and thread barrier events
> * With no arguments, displays the current trace status.
> * Name the event groups you wish to trace or stop tracing.
> *
> * @cliexpar
> * @clistart
> * elog trace api cli barrier
> * elog trace api cli barrier disable
> * elog trace dispatch
> * elog trace circuit-node ethernet-input
> * elog trace
> * @cliend
> * @cliexcmd{elog trace [api][cli][barrier][disable]}
> ?*/
> /* *INDENT-OFF* */
>
> From: vpp-dev@lists.fd.io On Behalf Of Christian Hopps
> Sent: Saturday, April 18, 2020 3:14 PM
> To: vpp-dev
> Cc: Christian Hopps
> Subject: [vpp-dev] Packet processing time.
>
>
> The recent discussion on reference counting and barrier timing has got me
> interested in packet processing time. I realize there's a way to use "show
> runtime" along with knowledge of the arc a packet follows, but I'm curious
> whether something more straightforward has been attempted, where packets are
> timestamped on ingress (or creation) and stats are collected on egress
> (transmission)?
>
> I also have an unrelated interest in hooking into the graph immediately
> post-transmission -- I'd like to adjust an input queue size only when the
> packet that was enqueued on it is actually transmitted on the wire, and not
> just handed off downstream on the arc -- this would likely be the same place
> packet stat collection might occur. :)
>
> Thanks,
> Chris.
>




Re: [vpp-dev] Packet processing time.

2020-04-21 Thread Mohammed Hawari
Hi Chris,

Evaluating packet processing time in software is a very challenging issue; as
Dave mentioned, the measurement itself is likely to impact the performance we
are trying to evaluate. I worked on that issue and have an academic paper,
currently under review, proposing a solution based on the NetFPGA-SUME
platform. Basically, I built a custom FPGA design mimicking a NIC that
timestamps every packet at ingress and egress (immediately after a packet
arrives from the wire, and immediately before it departs on the wire). I also
wrote a DPDK driver for that NIC and made it work with VPP, so that the latency
introduced by VPP plus the PCI-based DMA can be evaluated. I played with this
design and VPP in various configurations (l2-patch, l2 cross-connect, and l3
forward), and I think it could be an interesting tool for diagnosing latency
issues on a “per-packet” basis.
The downside is, of course, that from VPP’s perspective this is a custom NIC
with a custom driver (not necessarily super-optimised), so the evaluated packet
forwarding latency includes the driver’s performance.

If you are interested in discussing this work, I can give you more details and
resources in unicast; don’t hesitate to contact me :)

Cheers,

Mohammed Hawari
Software Engineer & PhD student
Cisco Systems


> On 18 Apr 2020, at 22:14, Dave Barach via lists.fd.io wrote:
> 
> If you turn on the main loop dispatch event logs and look at the results in 
> the g2 viewer [or dump them in ascii] you can make pretty accurate lap time 
> estimates for any workload. Roughly speaking, packets take 1 lap time to 
> arrive and then leave. 
>
> The “circuit-node <node-name>” game produces one elog event per frame, so you 
> can look at several million frame circuit times.
>
> Individually timestamping packets would be more precise, but calling 
> clib_cpu_time_now(...) (rdtsc instrs on x86_64) twice per packet would almost 
> certainly affect forwarding performance.
>
> See https://fd.io/docs/vpp/master/gettingstarted/developers/eventviewer.html
>
> /*?
> * Control event logging of api, cli, and thread barrier events
> * With no arguments, displays the current trace status.
> * Name the event groups you wish to trace or stop tracing.
> *
> * @cliexpar
> * @clistart
> * elog trace api cli barrier
> * elog trace api cli barrier disable
> * elog trace dispatch
> * elog trace circuit-node ethernet-input
> * elog trace
> * @cliend
> * @cliexcmd{elog trace [api][cli][barrier][disable]}
> ?*/
> /* *INDENT-OFF* */
>
> From: vpp-dev@lists.fd.io On Behalf Of Christian Hopps
> Sent: Saturday, April 18, 2020 3:14 PM
> To: vpp-dev <vpp-dev@lists.fd.io>
> Cc: Christian Hopps <cho...@chopps.org>
> Subject: [vpp-dev] Packet processing time.
>
> The recent discussion on reference counting and barrier timing has got me
> interested in packet processing time. I realize there's a way to use "show
> runtime" along with knowledge of the arc a packet follows, but I'm curious
> whether something more straightforward has been attempted, where packets are
> timestamped on ingress (or creation) and stats are collected on egress
> (transmission)?
>
> I also have an unrelated interest in hooking into the graph immediately
> post-transmission -- I'd like to adjust an input queue size only when the
> packet that was enqueued on it is actually transmitted on the wire, and not
> just handed off downstream on the arc -- this would likely be the same place
> packet stat collection might occur. :)
> 
> Thanks,
> Chris.
> 



Re: [vpp-dev] Packet processing time.

2020-04-18 Thread Dave Barach via lists.fd.io
If you turn on the main loop dispatch event logs and look at the results in the 
g2 viewer [or dump them in ascii] you can make pretty accurate lap time 
estimates for any workload. Roughly speaking, packets take 1 lap time to arrive 
and then leave.

The “circuit-node <node-name>” game produces one elog event per frame, so you 
can look at several million frame circuit times.

Individually timestamping packets would be more precise, but calling 
clib_cpu_time_now(...) (rdtsc instrs on x86_64) twice per packet would almost 
certainly affect forwarding performance.
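
For illustration only, a minimal sketch of what such per-packet timestamping
could look like; the use of the buffer opaque2 area and the helper names here
are assumptions made for the sketch, not an existing VPP mechanism:

#include <vlib/vlib.h>
#include <vppinfra/time.h>      /* clib_cpu_time_now(): rdtsc on x86_64 */

/* Hypothetical ingress-side stamp: one rdtsc per packet. Assumes the first
   8 bytes of the buffer's opaque2 scratch area are free for the experiment. */
static_always_inline void
stamp_rx_time (vlib_buffer_t * b)
{
  *(u64 *) b->opaque2 = clib_cpu_time_now ();
}

/* Hypothetical egress-side read: a second rdtsc per packet. */
static_always_inline u64
rx_to_tx_cycles (vlib_buffer_t * b)
{
  return clib_cpu_time_now () - *(u64 *) b->opaque2;
}

Two timestamp reads per packet, plus the extra buffer touches to store and
reload the stamp, are exactly the per-packet overhead being cautioned against
here.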

See https://fd.io/docs/vpp/master/gettingstarted/developers/eventviewer.html

/*?
* Control event logging of api, cli, and thread barrier events
* With no arguments, displays the current trace status.
* Name the event groups you wish to trace or stop tracing.
*
* @cliexpar
* @clistart
* elog trace api cli barrier
* elog trace api cli barrier disable
* elog trace dispatch
* elog trace circuit-node ethernet-input
* elog trace
* @cliend
* @cliexcmd{elog trace [api][cli][barrier][disable]}
?*/
/* *INDENT-OFF* */
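
Putting those commands together, a possible capture sequence looks roughly like
this (a sketch; the exact save location and the viewing step are assumptions,
and the event viewer page linked above is the authoritative reference):

vpp# elog trace dispatch
vpp# elog trace circuit-node ethernet-input
  ... run the workload for a while ...
vpp# show event-logger
vpp# event-logger save circuit

Then open the saved log in the g2 viewer, or dump it in ascii, as described in
the event viewer documentation.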

From: vpp-dev@lists.fd.io On Behalf Of Christian Hopps
Sent: Saturday, April 18, 2020 3:14 PM
To: vpp-dev
Cc: Christian Hopps
Subject: [vpp-dev] Packet processing time.

The recent discussion on reference counting and barrier timing has got me
interested in packet processing time. I realize there's a way to use "show
runtime" along with knowledge of the arc a packet follows, but I'm curious
whether something more straightforward has been attempted, where packets are
timestamped on ingress (or creation) and stats are collected on egress
(transmission)?

I also have an unrelated interest in hooking into the graph immediately
post-transmission -- I'd like to adjust an input queue size only when the
packet that was enqueued on it is actually transmitted on the wire, and not
just handed off downstream on the arc -- this would likely be the same place
packet stat collection might occur. :)

Thanks,
Chris.




[vpp-dev] Packet processing time.

2020-04-18 Thread Christian Hopps
The recent discussion on reference counting and barrier timing has got me
interested in packet processing time. I realize there's a way to use "show
runtime" along with knowledge of the arc a packet follows, but I'm curious
whether something more straightforward has been attempted, where packets are
timestamped on ingress (or creation) and stats are collected on egress
(transmission)?
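
For example, the "show runtime" approach amounts to summing the per-packet
Clocks column over the nodes on the arc; with purely illustrative numbers for
an ipv4 forwarding path (node names vary by interface and driver):

  ethernet-input    2.0e1
  ip4-input         2.6e1
  ip4-lookup        3.1e1
  ip4-rewrite       2.4e1
  <iface>-output    1.1e1
  <iface>-tx        4.0e1
  -----------------------
  ~152 clocks/packet of node processing, which still leaves out driver rx/tx
  time and any queueing before dispatch.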

I also have an unrelated interest in hooking into the graph immediately
post-transmission -- I'd like to adjust an input queue size only when the
packet that was enqueued on it is actually transmitted on the wire, and not
just handed off downstream on the arc -- this would likely be the same place
packet stat collection might occur. :)

Thanks,
Chris.

