henrikingo commented on issue #137: URL: https://github.com/apache/otava/issues/137#issuecomment-4008759836
@ligurio Thanks for using Otava and being active here. Getting more input is going to be crucial in the next 6-12 months, when for the first time ever we work on a common upstream and make some UX changes and improvements. If you're not already, I invite you to [join the mailing list](https://otava.apache.org/docs/community), where we occasionally have such conversations too...

A couple of comments on your workflow:

In every place I've seen Otava used (and used it myself), it was used within some larger web-based UI. Unfortunately none of these were open sourced, nor are they even publicly available for you to view, although maybe the Datastax way of doing this is to a large extent available in Otava. Namely, at Datastax, benchmark results would be submitted to Prometheus and the associated Grafana dashboard. Otava then has functionality to read data from Prometheus, compute change points, and write them back into Prometheus as Grafana annotations. (Over time, other databases have been added.)

In MongoDB we had performance graphs directly in the CI system (very similar to what you might have in Jenkins, for example) and also Jira integration, so that we could create a Jira issue directly from the change point alert in CI. Further, this included a feature using custom Jira fields: when a regression was eventually fixed, the commit sha of the fix would be added to the same Jira issue, so that the regression and the fix were paired together. The CI graphs would read this information from Jira, and fixed change points could be presented in a less urgent color than the ones not yet addressed. The same went for processing false positives; it was another click of a button.

But speaking more generally, I always advocate storing your test results and computing change points outside of, and after, the actual test run. There are several reasons:

- First of all, Otava isn't designed to find a change point immediately after the test. In many cases it might flag a change only after 4-5 more tests have been run. It is only at that point that you can be certain the change really was persistent and not some random jump up or down.
- As always with "big data", you may want to rerun the analytics part on the same data that you already have. For example, you may want to change the p-value or some other parameter. Or fix a bug... You want to be able to do this without having to rerun the actual benchmarks.
- If you analyze the 30 most recent points inside your workflow, and there is an actual regression, you will now be alerted about the same regression 30 times, no? If you are saying you only alert / fail the job when the most recent point is the change point, then re-read the first point.
- And if you have two change points within the 30-day window, how would you notice the second one if the job is already failing because of the first change point?

Like I said, unfortunately I'm not aware of any Otava-based dashboards that are publicly available. Nyrkiö is a commercial SaaS offering that provides this same type of graphing, plus integration with GitHub pull requests and issues. (I'm unsure about the etiquette here, but it seemed on topic to mention it in this case.) Here is a random example of a pull request comment from Nyrkiö about one benchmark result being significantly slower than before:
https://github.com/nyrkio/nyrkio/pull/968#issuecomment-3905270510

And here is the same for a push event; in that case an issue is created: https://github.com/unodb-dev/unodb/issues/832

The link in the issue is broken; it tried to link to https://nyrkio.com/public/https%3A%2F%2Fgithub.com%2Funodb-dev%2Funodb/master/UnoDB_Benchmarks__x64_?commit=d2ab269909c64723eb5930201bde3bd8b7cefad3×tamp=1766116904#full_n4_sequential_insert%3Cunodb::benchmark::olc_db%3E/32768

So with this kind of graphing, the ability to rerun the analytics (for example, after fine-tuning parameters), and integration with a ticket system, triaging results should be much more pleasant. You should get more than 50% correct alerts, you should only get them once, and you should be able to mark a change as closed, either because it was fixed or because it was considered invalid. (Nyrkiö doesn't do this last bit.)
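To make the first bullet above concrete, here is a minimal sketch of why a change point is often only confirmed after several more results have accumulated. This is *not* Otava's actual algorithm (Otava uses E-Divisive means); a simple Welch's t statistic and an arbitrary threshold stand in here purely to illustrate the statistics, and all numbers are made up:

```python
# Illustration only: a real shift that is similar in size to the run-to-run
# noise cannot be distinguished from noise until several post-change
# measurements exist. (Not Otava's E-Divisive algorithm.)
from statistics import mean, stdev

def t_statistic(before, after):
    """Welch's t statistic between two samples (absolute value)."""
    v1, v2 = stdev(before) ** 2, stdev(after) ** 2
    return abs(mean(before) - mean(after)) / (v1 / len(before) + v2 / len(after)) ** 0.5

# Hypothetical benchmark timings: a noisy baseline around 100...
baseline = [98.5, 101.2, 99.0, 102.3, 100.1, 97.8, 101.5, 99.6, 100.9, 98.1]
# ...then a real ~3-unit regression, comparable in magnitude to the noise.
post_change = [101.0, 103.4, 102.9, 103.6, 103.1]

THRESHOLD = 3.0  # arbitrary significance cutoff for this illustration
for n in range(2, len(post_change) + 1):
    t = t_statistic(baseline, post_change[:n])
    verdict = "change point!" if t > THRESHOLD else "inconclusive"
    print(f"{n} post-change results: t = {t:.2f} -> {verdict}")
```

With this data the statistic grows as evidence accumulates and only crosses the threshold once four post-change results exist, which is why analyzing results after the fact, rather than failing the job that produced the single newest point, tends to work better.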
