On Mon, Mar 21, 2022 at 11:29:42AM +0100, Lukáš Doktor wrote:
> Hello Stefan,
> 
> On 21. 03. 22 at 10:42, Stefan Hajnoczi wrote:
> > On Mon, Mar 21, 2022 at 09:46:12AM +0100, Lukáš Doktor wrote:
> >> Dear qemu developers,
> >>
> >> you might remember the "replied to" email from a bit over a year ago, which 
> >> raised a discussion about a qemu performance regression CI. At KVM Forum I 
> >> presented some details about my testing pipeline: 
> >> https://www.youtube.com/watch?v=Cbm3o4ACE3Y&list=PLbzoR-pLrL6q4ZzA4VRpy42Ua4-D2xHUR&index=9
> >> I think it's stable enough to become part of the official CI so people can 
> >> consume it, rely on it and hopefully even suggest configuration changes.
> >>
> >> The CI consists of:
> >>
> >> 1. Jenkins pipeline(s) - internal, not available to developers, running 
> >> daily builds of the latest available commit
> >> 2. Publicly available anonymized results: 
> >> https://ldoktor.github.io/tmp/RedHat-Perf-worker1/
> > 
> > This link is 404.
> > 
> 
> My mistake, it works well without the trailing slash: 
> https://ldoktor.github.io/tmp/RedHat-Perf-worker1
> 
> >> 3. (optional) a manual gitlab pulling job which is triggered by the Jenkins 
> >> pipeline when that particular commit is checked
> >>
> >> (1) is described here: 
> >> https://run-perf.readthedocs.io/en/latest/jenkins.html and can be 
> >> replicated on other premises. The individual jobs can also be executed 
> >> directly ( https://run-perf.readthedocs.io ) on any Linux box using Fedora 
> >> guests (via pip or container 
> >> https://run-perf.readthedocs.io/en/latest/container.html ).
> >>
> >> As for (3), I made a testing pipeline available here: 
> >> https://gitlab.com/ldoktor/qemu/-/pipelines with one always-passing test 
> >> and one allow-to-fail actual testing job. If you think such integration 
> >> would be useful, I can add it as another job to the official qemu repo. 
> >> Note the integration is a bit hacky as, due to resources, we cannot test 
> >> all commits but rather test on a daily basis, which is not officially 
> >> supported by gitlab.
> >>
> >> Note the aim of this project is to ensure some very basic system-level 
> >> workflow performance stays the same or that the differences are described 
> >> and ideally pinned to individual commits. It should not replace thorough 
> >> release testing or low-level performance tests.
> > 
> > If I understand correctly the GitLab CI integration you described
> > follows the "push" model where Jenkins (running on your own machine)
> > triggers a manual job in GitLab CI simply to indicate the status of the
> > nightly performance regression test?
> > 
> > What process should QEMU follow to handle performance regressions
> > identified by your job? In other words, which stakeholders need to
> > triage, notify, debug, etc when a regression is identified?
> > 
> > My guess is:
> > - Someone (you or the qemu.git committer) needs to watch the job status and 
> > triage failures.
> > - That person then notifies likely authors of suspected commits so they can 
> > investigate.
> > - The authors need a way to reproduce the issue - either locally or by 
> > pushing commits to GitLab and waiting for test results.
> > - Fixes will be merged as additional qemu.git commits since commit history 
> > cannot be rewritten.
> > - If necessary a git-revert(1) commit can be merged to temporarily undo a 
> > commit that caused issues.
> > 
> > Who will watch the job status and triage failures?
> > 
> > Stefan
> 
> This is exactly the main question I'd like to resolve as part of 
> considering-this-to-be-official-part-of-the-upstream-qemu-testing. At this 
> point our team is offering its services to maintain this single worker for 
> daily jobs, monitoring the status and pinging people in case of bisectable 
> results.

That's great! The main hurdle is finding someone to triage regressions
and if you are volunteering to do that then these regression tests would
be helpful to QEMU.

> From the upstream qemu community we are mainly looking for feedback:
> 
> * whether they'd want to be notified of such issues (and via what means)

I have CCed Kevin Wolf in case he has any questions regarding how fio
regressions will be handled.

I'm happy to be contacted when a regression bisects to a commit I
authored.

> * whether the current approach seems to be actually performing useful tasks
> * whether the reports are understandable

Reports aren't something I would look at as a developer. Although the
history and current status may be useful to some maintainers, that
information isn't critical. Developers simply need to know which commit
introduced a regression and the details of how to run the regression test.

> * whether the reports should be regularly pushed to a publicly available 
> place (or just on regression/improvement)
> * whether there are any volunteers interested in non-clearly-bisectable 
> issues (probably by-topic)

One option is to notify maintainers, but when I'm in this position
myself I usually only investigate critical issues due to limited time.

Regarding how to contact people, I suggest emailing them and CCing
qemu-devel so others are aware.
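
As a rough illustration of what such a notification could carry, here is a
minimal Python sketch. Only the qemu-devel list address is real; the helper
name, the sender and author addresses, the commit hash and the summary text
are made up for the example, and the reproduction command is left as a
placeholder. It only builds the message and does not send anything.

```python
from email.message import EmailMessage

# Real list address; every other name and value below is hypothetical.
QEMU_DEVEL = "qemu-devel@nongnu.org"


def build_regression_mail(sender, author, commit, summary, reproduce_cmd):
    """Build (but do not send) a notification for a bisected regression."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = author        # author of the commit the bisection landed on
    msg["Cc"] = QEMU_DEVEL    # CC the list so others are aware
    msg["Subject"] = f"Performance regression bisected to {commit[:12]}"
    # The body carries the two things a developer needs: the commit and
    # how to run the regression test.
    msg.set_content(
        f"The daily performance job bisected a regression to commit {commit}.\n"
        f"\n"
        f"Summary: {summary}\n"
        f"To reproduce: {reproduce_cmd}\n"
    )
    return msg


if __name__ == "__main__":
    mail = build_regression_mail(
        sender="perf-ci@example.org",
        author="developer@example.org",
        commit="0123456789abcdef0123456789abcdef01234567",
        summary="(hypothetical) fio read throughput dropped",
        reproduce_cmd="<command taken from the job log>",
    )
    print(mail)
```

Handing the resulting message to smtplib or a local mail command is left out
on purpose, since how the worker sends mail depends on its setup.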

Thanks,
Stefan
