henrikingo commented on issue #137:
URL: https://github.com/apache/otava/issues/137#issuecomment-4008759836

   @ligurio Thanks for using Otava and being active here. Getting more input is 
going to be crucial in the next 6-12 months when, for the first time ever, we 
work on a common upstream and make some UX changes and improvements. If you're 
not already on it, I invite you to [join the mailing 
list](https://otava.apache.org/docs/community), where we occasionally have such 
conversations too...
   
   A couple of comments on your workflow:
   
   In every place I've seen Otava used (and used it myself), it was used within 
some larger web-based UI. Unfortunately none of these were open sourced, nor 
are they even publicly available for you to view, although maybe the Datastax 
way of doing this is to a large extent available in Otava. Namely, at Datastax, 
benchmark results would be submitted to Prometheus and the associated Grafana 
dashboard. Otava then has functionality to read data from Prometheus, compute 
change points, and write them back as Grafana annotations. 
(Over time, other databases have been added.)
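
   For the curious, that loop can be sketched roughly like below. This is a hedged illustration, not Otava's actual code: the Prometheus `query_range` and Grafana `POST /api/annotations` endpoints are the standard public APIs, but the host names and the metric query are placeholders, and the change point detection step itself is omitted.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

PROM = "http://prometheus:9090"   # placeholder host
GRAFANA = "http://grafana:3000"   # placeholder host

def fetch_series(query, start, end, step="1h"):
    """Pull one metric series via Prometheus' query_range API."""
    url = (f"{PROM}/api/v1/query_range?query={quote(query)}"
           f"&start={start}&end={end}&step={step}")
    with urlopen(url) as resp:
        payload = json.load(resp)
    # each value arrives as a [timestamp, "value"] pair
    return [(int(t), float(v))
            for t, v in payload["data"]["result"][0]["values"]]

def annotation_payload(ts_seconds, metric, direction):
    """Build the JSON body for Grafana's POST /api/annotations."""
    return {
        "time": ts_seconds * 1000,  # Grafana expects epoch milliseconds
        "tags": ["otava", "change-point"],
        "text": f"{metric}: {direction} shift detected",
    }

def post_annotation(body, api_key):
    """Write one detected change point back to Grafana as an annotation."""
    req = Request(f"{GRAFANA}/api/annotations",
                  data=json.dumps(body).encode(),
                  headers={"Content-Type": "application/json",
                           "Authorization": f"Bearer {api_key}"})
    with urlopen(req) as resp:
        return resp.status
```

   Annotations tagged this way can then be overlaid on the same Grafana panels the benchmark results are already graphed on.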
   
   At MongoDB we had performance graphs directly in the CI system (very similar 
to what you might have in Jenkins, for example) and also Jira integration, so 
that we could create a Jira issue directly from the change point alert in CI. 
Further, this included a feature using custom Jira fields: when a regression 
was eventually fixed, the commit sha of the fix would be added to the same Jira 
issue, so that the regression and the fix were paired together. The CI graphs 
would pick up this information from Jira, and fixed change points could be 
presented in a less urgent color than the ones not yet addressed. Marking false 
positives worked the same way: another click of a button.
   
   But speaking more generally, I always advocate to store your test results 
and compute change points outside and after the actual test run. There are 
several reasons:
   
   - First of all, Otava isn't designed to find a change point immediately 
after the test. In many cases it might flag a change only after 4-5 more tests 
have run. Only at that point can you be certain that the change really was 
persistent and not just random noise.
   - As always with "big data", you may want to rerun the analytics part on the 
same data you already have. For example, you may want to change the p-value 
or some other parameter. Or fix a bug... You want to be able to do this 
without having to rerun the actual benchmarks.
   - If you analyze the 30 most recent points inside your workflow, and there 
is an actual regression, you will now be alerted of the same regression 30 
times, no? If you only alert / fail the job when the most recent point is the 
change point, then re-read the first point above.
   - And if you have two change points within the 30-day window, how would you 
notice the second one if the job is already failing because of the first change 
point?
   
   Like I said, unfortunately I'm not aware of any Otava-based dashboards that 
are publicly available. Nyrkiö is a commercial SaaS offering that provides this 
same type of graphing, plus integration with GitHub pull requests and issues. 
(I'm unsure about the etiquette here, but it seemed on topic to mention it in 
this case.) Here is a random example of a pull request comment from Nyrkiö 
about one benchmark result being significantly slower than before: 
https://github.com/nyrkio/nyrkio/pull/968#issuecomment-3905270510 And here is 
the same for a push event, in which case an issue is created: 
https://github.com/unodb-dev/unodb/issues/832 The link in the issue is broken; 
it tried to link to 
https://nyrkio.com/public/https%3A%2F%2Fgithub.com%2Funodb-dev%2Funodb/master/UnoDB_Benchmarks__x64_?commit=d2ab269909c64723eb5930201bde3bd8b7cefad3&timestamp=1766116904#full_n4_sequential_insert%3Cunodb::benchmark::olc_db%3E/32768
   
   So with this kind of graphing, the ability to rerun the analytics (for 
example after fine-tuning parameters), and integration with a ticket system, 
triaging results should be much more pleasant. You should get more than 50% 
correct alerts, you should only get them once, and you should be able to mark a 
change as closed either because it was fixed or because it was considered 
invalid. (Nyrkiö doesn't do this last bit.)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.