Hi,

We’ve been working with Tibor and a few other csit-dev and vpp-dev folks on a 
design to improve VPP performance trend tracking and automate anomaly detection 
(regression/progression). The result of this work is a design write-up published 
here: https://wiki.fd.io/view/CSIT/PerformanceTrendingAnalysis

We’re now working on a POC to put two things in place:

1. PT - performance trending
  a. improved version of existing csit performance trending jobs.
  b. initially measuring only MRR (maximum received rate), with unlimited 
tolerance of packet loss, to measure vpp code efficiency (vpp vector size at 
max), and because it’s fast.
  c. optimized PDR and NDR searches will be added later.
  d. jobs should start running today/tomorrow, with daily cadence, 
1 run/testbed/day.

2. PA - performance analysis
  a. automated trend analysis and anomaly detection.
  b. failed jobs signaling regressions.
  c. anomaly detection based on usual statistical metrics: trimmed moving avg, 
stdev, mean.
    - efficiency of detection to be tuned - see above link for description.
  d. funky graphs summarizing the trend and identifying performance regression 
and progression - daily, weekly, monthly views.
  e. concept code is already generating some graphs - can show on vpp-dev and 
csit-dev calls for feedback.
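
To make the PA detection idea in item 2c a bit more concrete, here is a rough 
Python sketch of how a new MRR sample could be classified against a trimmed 
moving average and standard deviation. This is only an illustration under 
stated assumptions, not the POC code: the function names, the 14-sample window, 
the 10% trim and the 3-sigma threshold are all made up for the example, and the 
actual rules are the ones described in the design write-up.

```python
import statistics


def trimmed(samples, trim_fraction=0.1):
    """Return samples with the lowest and highest trim_fraction removed."""
    ordered = sorted(samples)
    k = int(len(ordered) * trim_fraction)
    return ordered[k:len(ordered) - k] if k else ordered


def classify(history, new_value, window=14, trim_fraction=0.1, n_sigma=3.0):
    """Classify new_value against the recent trend.

    history   -- earlier MRR results, oldest first (hypothetical unit: Mpps)
    window    -- number of most recent runs to consider (assumed value)
    n_sigma   -- width of the "normal" band in standard deviations (assumed)
    """
    recent = history[-window:]
    core = trimmed(recent, trim_fraction)
    if len(core) < 2:
        # Not enough data to estimate a spread; do not raise an alarm.
        return "normal"
    avg = statistics.mean(core)
    dev = statistics.stdev(core)
    if new_value < avg - n_sigma * dev:
        return "regression"
    if new_value > avg + n_sigma * dev:
        return "progression"
    return "normal"


if __name__ == "__main__":
    # Made-up daily MRR results, for illustration only.
    past_runs = [10.0, 10.1, 9.9, 10.2, 9.8, 10.0, 10.1,
                 9.9, 10.0, 10.1, 9.9, 10.0, 10.2, 9.8]
    print(classify(past_runs, 8.0))
```

A trending job could call something like classify() once per test case after 
each daily run and mark the job as failed when the result is "regression", 
which is the signaling described in item 2b.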

Hope folks find it useful - looking for comments and feedback.

Cheers,
-Maciek
