On Mon, Mar 21, 2022 at 11:29:42AM +0100, Lukáš Doktor wrote:
> Hello Stefan,
>
> On 21. 03. 22 at 10:42, Stefan Hajnoczi wrote:
> > On Mon, Mar 21, 2022 at 09:46:12AM +0100, Lukáš Doktor wrote:
> >> Dear qemu developers,
> >>
> >> you might remember the "replied to" email from a bit over a year ago to
> >> raise a discussion about a qemu performance regression CI. On KVM forum I
> >> presented
> >> https://www.youtube.com/watch?v=Cbm3o4ACE3Y&list=PLbzoR-pLrL6q4ZzA4VRpy42Ua4-D2xHUR&index=9
> >> some details about my testing pipeline. I think it's stable enough to
> >> become part of the official CI so people can consume, rely on it and
> >> hopefully even suggest configuration changes.
> >>
> >> The CI consists of:
> >>
> >> 1. Jenkins pipeline(s) - internal, not available to developers, running
> >>    daily builds of the latest available commit
> >> 2. Publicly available anonymized results:
> >>    https://ldoktor.github.io/tmp/RedHat-Perf-worker1/
> >
> > This link is 404.
> >
>
> My mistake, it works well without the trailing slash:
> https://ldoktor.github.io/tmp/RedHat-Perf-worker1
>
> >> 3. (optional) a manual gitlab pulling job which is triggered by the
> >>    Jenkins pipeline when that particular commit is checked
> >>
> >> The (1) is described here:
> >> https://run-perf.readthedocs.io/en/latest/jenkins.html and can be
> >> replicated on other premises and the individual jobs can be executed
> >> directly https://run-perf.readthedocs.io on any linux box using Fedora
> >> guests (via pip or container
> >> https://run-perf.readthedocs.io/en/latest/container.html ).
> >>
> >> As for the (3) I made a testing pipeline available here:
> >> https://gitlab.com/ldoktor/qemu/-/pipelines with one always-passing test
> >> and one allow-to-fail actual testing job. If you think such integration
> >> would be useful, I can add it as another job to the official qemu repo.
> >> Note the integration is a bit hacky as, due to resources, we can not test
> >> all commits but rather test on daily basis, which is not officially
> >> supported by gitlab.
> >>
> >> Note the aim of this project is to ensure some very basic system-level
> >> workflow performance stays the same or that the differences are described
> >> and ideally pinned to individual commits. It should not replace thorough
> >> release testing or low-level performance tests.
> >
> > If I understand correctly the GitLab CI integration you described
> > follows the "push" model where Jenkins (running on your own machine)
> > triggers a manual job in GitLab CI simply to indicate the status of the
> > nightly performance regression test?
> >
> > What process should QEMU follow to handle performance regressions
> > identified by your job? In other words, which stakeholders need to
> > triage, notify, debug, etc when a regression is identified?
> >
> > My guess is:
> > - Someone (you or the qemu.git committer) needs to watch the job status
> >   and triage failures.
> > - That person then notifies likely authors of suspected commits so they can
> >   investigate.
> > - The authors need a way to reproduce the issue - either locally or by
> >   pushing commits to GitLab and waiting for test results.
> > - Fixes will be merged as additional qemu.git commits since commit history
> >   cannot be rewritten.
> > - If necessary a git-revert(1) commit can be merged to temporarily undo a
> >   commit that caused issues.
> >
> > Who will watch the job status and triage failures?
> >
> > Stefan
>
> This is exactly the main question I'd like to resolve as part of
> considering-this-to-be-official-part-of-the-upstream-qemu-testing. At this
> point our team is offering its service to maintain this single worker for
> daily jobs, monitoring the status and pinging people in case of bisectable
> results.

That's great! The main hurdle is finding someone to triage regressions and
if you are volunteering to do that then these regression tests would be
helpful to QEMU.

> From the upstream qemu community we are mainly looking for feedback:
>
> * whether they'd want to be notified of such issues (and via what means)

I have CCed Kevin Wolf in case he has any questions regarding how fio
regressions will be handled.

I'm happy to be contacted when a regression bisects to a commit I authored.

> * whether the current approach seems to be actually performing useful tasks
> * whether the reports are understandable

Reports aren't something I would look at as a developer. Although the
history and current status may be useful to some maintainers, that
information isn't critical. Developers simply need to know which commit
introduced a regression and the details of how to reproduce it.

> * whether the reports should be regularly pushed into publicly available
>   place (or just on regression/improvement)
> * whether there are any volunteers to be interested in non-clearly-bisectable
>   issues (probably by-topic)

One option is to notify maintainers, but when I'm in this position myself I
usually only investigate critical issues due to limited time.

Regarding how to contact people, I suggest emailing them and CCing
qemu-devel so others are aware.

Thanks,
Stefan