Re: New automated test coverage: openQA tests of critical path updates
On Mon, 2017-02-27 at 10:22 -0800, Adam Williamson wrote:
> Hi folks!
>
> I am currently rolling out some changes to the Fedora openQA deployment
> which enable a new testing workflow. From now on, a subset of openQA
> tests should be run automatically on every critpath update, both on
> initial submission and on any edit of the update.

Hi again folks! Just a quick update on progress here so far. The deployment went pretty well, and the tests have been running now for the last week or so. You can view all the results so far here:

https://openqa.fedoraproject.org/group_overview/2?limit_builds=400

One thing you might notice right away is the list sort order. openQA currently sorts 'builds' (in this context, the update is the 'build') on the assumption that they sort as dotted version strings, which Fedora update IDs (the string we use as the 'build' value for these tests) certainly don't. I've got a PR in progress upstream to allow us to sort these differently, and that should get changed soon.

About half way through last week I implemented a change which means any failed test is automatically retried; this cut down quite a lot on false failures caused by transient bugs, mirror issues etc. There are still occasional cases of this, though. For now you can force all the tests to be re-run by editing the update in any way at all; in future we'll probably try and set up some system which lets you request re-runs of failed tests if the failures don't look like an actual bug in the update.

This week I'm aiming to get the necessary changes made so that Bodhi will find and display these results alongside the Taskotron results in its web UI, which should make them much more visible.

There's another significant factor which I hadn't considered: today was the Bodhi activation point for Fedora 26, meaning we now have Fedora 26 critpath updates we could test. For now I've decided to go ahead and try and test Branched updates, and just see how much of a mess it turns out to be.
I suspect, though, that we'll have problems with the tests failing due to underlying bugs far more often (certainly several of the tests currently fail on Branched, for instance), and also we'll have problems with the base disk images much more often for pre-release branches. It may prove to be difficult or impossible to provide useful feedback for Branched updates with this system, and if so, we'll turn it off. -- Adam Williamson Fedora QA Community Monkey IRC: adamw | Twitter: AdamW_Fedora | XMPP: adamw AT happyassassin . net http://www.happyassassin.net ___ qa-devel mailing list -- qa-devel@lists.fedoraproject.org To unsubscribe send an email to qa-devel-le...@lists.fedoraproject.org
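The automatic retry mentioned in the message above could be sketched roughly as follows. This is only an illustration of the idea, not the actual openQA scheduler change: `run_with_retry`, `run_test` and `max_attempts` are hypothetical names, and the real implementation may decide what to retry quite differently.

```python
from typing import Callable, Tuple


def run_with_retry(run_test: Callable[[], bool], max_attempts: int = 2) -> Tuple[bool, int]:
    """Run a test, retrying on failure; return (passed, attempts_used)."""
    for attempt in range(1, max_attempts + 1):
        if run_test():
            # Stop as soon as one attempt passes.
            return True, attempt
    # Every attempt failed, so the failure is less likely to be transient.
    return False, max_attempts


# Hypothetical flaky test: fails once (say, a mirror hiccup), then passes.
outcomes = iter([False, True])
passed, attempts = run_with_retry(lambda: next(outcomes))
print(passed, attempts)
```

A failure that persists across all attempts is still reported as a failure, which is why this reduces false failures without hiding real bugs in the update.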
Re: New automated test coverage: openQA tests of critical path updates
> On Thu, 2017-03-02 at 04:31 -0500, Kamil Paral wrote:
> > > The job can - and already does - log the exact packages it actually
> > > got, but I don't think there's an easy way for it to take the
> > > 'last_modified' date for the update at the time it does the download.
> >
> > I don't know how you download the rpms, but a single python call can
> > do that (http get and parse the json). Again, to prevent race
> > conditions, it would be good to do the call before and after
> > downloading the rpms and compare the timestamp. These race conditions
> > occur surprisingly often once you start executing hundreds/thousands
> > tasks a day.
> >
> > But if this is easier done in the scheduler, I think that's totally fine.
>
> During test execution, we can only really type stuff into the console.
> We try to keep the amount of typing-into-consoles we do to a minimum,
> too, as the more there is, the more likely it is openQA will choke on a
> keypress and fail. (Though these tests already involve quite a lot of
> typing, can't avoid it.) The test just uses the Bodhi CLI client to
> download the packages.

Sure, I'm not saying this needs to happen during the actual test. That seems silly, if we can do the same thing in the scheduler (initial timestamp) and in the reporter (end timestamp).

> I mean, it's not impossible, we *could* just type in a curl / Python
> one-liner (or use something like httpie to hit the API to get it). I'm
> just questioning whether it's worth the effort.

It's not necessary now, in the "development" phase. But once we want to gate on it, I think it's very important (if we want our gating mechanics to be reliable).
Re: New automated test coverage: openQA tests of critical path updates
On Thu, 2017-03-02 at 04:31 -0500, Kamil Paral wrote:
> > The job can - and already does - log the exact packages it actually
> > got, but I don't think there's an easy way for it to take the
> > 'last_modified' date for the update at the time it does the download.
>
> I don't know how you download the rpms, but a single python call can
> do that (http get and parse the json). Again, to prevent race
> conditions, it would be good to do the call before and after
> downloading the rpms and compare the timestamp. These race conditions
> occur surprisingly often once you start executing hundreds/thousands
> tasks a day.
>
> But if this is easier done in the scheduler, I think that's totally fine.

During test execution, we can only really type stuff into the console. We try to keep the amount of typing-into-consoles we do to a minimum, too, as the more there is, the more likely it is openQA will choke on a keypress and fail. (Though these tests already involve quite a lot of typing, can't avoid it.) The test just uses the Bodhi CLI client to download the packages.

I mean, it's not impossible, we *could* just type in a curl / Python one-liner (or use something like httpie to hit the API to get it). I'm just questioning whether it's worth the effort.

--
Adam Williamson
Fedora QA Community Monkey
IRC: adamw | Twitter: AdamW_Fedora | XMPP: adamw AT happyassassin . net
http://www.happyassassin.net
Re: New automated test coverage: openQA tests of critical path updates
> 2017-03-01 18:04 GMT+01:00 Adam Williamson:
> > I'm not so sure it's really necessary, and doing it is actually tricky
> > for openQA. Only the openQA job itself knows what packages it actually
> > tested, and it doesn't have an easy way to get the associated
> > timestamp. The scheduler could easily get the timestamp at the time the
> > job was created, or at the time the job completed, but that will never
> > be 100% reliable, because the job actually goes and does the download
> > somewhere in between those two times.
>
> I thought that Bodhi should be the one providing timestamps...

Yes, Bodhi is providing these timestamps. Sorry if that was not obvious from my email.

$ curl 'https://bodhi.fedoraproject.org/updates/FEDORA-2017-e12389b771' | python -m json.tool | grep date_modified
    "date_modified": "2017-03-02 09:18:13",
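For anyone who would rather do the lookup above from Python than with curl and grep, a minimal sketch might look like this. Only the URL and the `date_modified` key come from the command above; the `Accept` header, the helper names, and the nesting of the sample document are assumptions, which is why the key is searched for recursively instead of being read from a fixed path.

```python
import json
import urllib.request
from typing import Any, Optional

BODHI_URL = "https://bodhi.fedoraproject.org/updates/"  # taken from the curl command


def find_key(data: Any, key: str) -> Optional[Any]:
    """Depth-first search for the first occurrence of `key` in parsed JSON."""
    if isinstance(data, dict):
        if key in data:
            return data[key]
        children = list(data.values())
    elif isinstance(data, list):
        children = data
    else:
        return None
    for child in children:
        found = find_key(child, key)
        if found is not None:
            return found
    return None


def fetch_date_modified(update_id: str) -> Optional[str]:
    """Fetch one update as JSON and return its date_modified (needs network)."""
    request = urllib.request.Request(
        BODHI_URL + update_id,
        headers={"Accept": "application/json"},  # assumption: ask explicitly for JSON
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return find_key(json.load(response), "date_modified")


# Offline check against a hypothetical response shape (the nesting is assumed).
sample = json.loads('{"update": {"date_modified": "2017-03-02 09:18:13"}}')
print(find_key(sample, "date_modified"))
```

`fetch_date_modified("FEDORA-2017-e12389b771")` would be the networked equivalent of the shell pipeline; it is defined here but deliberately not called.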
Re: New automated test coverage: openQA tests of critical path updates
> > There's one important thing we need to do first, though. Bodhi ID
> > doesn't identify the thing tested uniquely, because Bodhi updates are
> > mutable (and the ID is kept). So Bodhi (or any gating tools) can't
> > rely on just retrieving the latest result for a particular Bodhi ID
> > and trust that result. It might be old and no longer reflect the
> > current state. We need to extend bodhi_update results with
> > "timestamp" key in extra data, that will report the "last_modified"
> > time of the Bodhi update tested. And Bodhi (or any other tool) must
> > not only query for item=$bodhi_id&type=bodhi_update, but also for
> > &timestamp=$timestamp. Only with this we can be sure we've really
> > tested particular Bodhi update.
>
> I'm not so sure it's really necessary, and doing it is actually tricky
> for openQA. Only the openQA job itself knows what packages it actually
> tested, and it doesn't have an easy way to get the associated
> timestamp. The scheduler could easily get the timestamp at the time the
> job was created, or at the time the job completed, but that will never
> be 100% reliable, because the job actually goes and does the download
> somewhere in between those two times.

This problem is not exclusive to openqa, it affects all tasks that test bodhi updates and download the included rpms (there's always a race condition window).

For openqa, I see two options here:

a) record the timestamp in the scheduler when the job is created and use it. Either it will be correct, or if the race condition happens, it will publish a result based on testing newer packages with an older timestamp. That's slightly incorrect, but not really a problem. Because the update edit event scheduled another openqa run, and that will publish an up-to-date result. So there's no harm done.

b) record the timestamp in the scheduler when the job is created, and when the job is finished. If they don't match, ignore the result, don't publish it.
The update edit event scheduled another openqa run anyway. Again, no harm done and we didn't populate resultsdb with an incorrect result. (This is similar to what we do in certain taskotron tasks - if we detect that a bodhi update state doesn't match at the time when we publish results, we print it into the logs and skip them.) > > The job can - and already does - log the exact packages it actually > got, but I don't think there's an easy way for it to take the > 'last_modified' date for the update at the time it does the download. I don't know how you download the rpms, but a single python call can do that (http get and parse the json). Again, to prevent race conditions, it would be good to do the call before and after downloading the rpms and compare the timestamp. These race conditions occur surprisingly often once you start executing hundreds/thousands tasks a day. But if this is easier done in the scheduler, I think that's totally fine. > > OTOH, I don't think it's really too bad just to show the 'most recent' > results. That should usually only be out of date for a few minutes > after an update is edited. It might be possible to do a 'tests > running...' spinner when there are jobs scheduled or running for the > update in question, even. You're assuming here that the new task will finish successfully. It will often not. From my experience, network is the bane of automated testing. Bodhi will time out, koji will time out, they will return http 5xx errors, etc. Taskotron tasks are plagued with it (at least dozens such failures a day). That's why I try to detect the race condition and either not record it at all, or record it with the older timestamp, which is safe - you don't mislead people/tools when looking at the results. The worst thing to happen here is that a result is missing for a long time. 
And people will then complain (and we start investigating) or they'll use the "request re-testing" button, which we'll have to provide sooner or later (because all systems are imperfect).

Of course I'm not saying we need to have this *now*. But I think it's necessary for gating updates.
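Option (b) from the message above, reading the timestamp before and after the job and dropping the result when the two differ, could be sketched like this. It is an illustration under stated assumptions, not scheduler code: `run_guarded`, `get_timestamp` and `run_job` are hypothetical names, and in practice `get_timestamp` would query Bodhi for the update's modification time.

```python
from typing import Callable, Optional, Tuple, TypeVar

T = TypeVar("T")


def run_guarded(get_timestamp: Callable[[], str],
                run_job: Callable[[], T]) -> Optional[Tuple[T, str]]:
    """Run a job and return (result, timestamp), or None if the update
    was edited while the job was running."""
    before = get_timestamp()
    result = run_job()
    after = get_timestamp()
    if before != after:
        # The update changed mid-run, so we cannot say which packages were
        # tested. Skip publishing; the edit has scheduled a fresh run anyway.
        return None
    return result, before


# Update untouched during the run: the result is kept with its timestamp.
stable = run_guarded(lambda: "2017-03-02 09:18:13", lambda: "passed")

# Update edited during the run (hypothetical timestamps): the result is dropped.
stamps = iter(["2017-03-02 09:18:13", "2017-03-02 10:00:00"])
edited = run_guarded(lambda: next(stamps), lambda: "passed")

print(stable)
print(edited)
```

Option (a) would be the same function without the second call and the comparison: always return `(result, before)` and accept that a result may occasionally be published against a slightly older timestamp.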
Re: New automated test coverage: openQA tests of critical path updates
2017-03-01 18:04 GMT+01:00 Adam Williamson:
> I'm not so sure it's really necessary, and doing it is actually tricky
> for openQA. Only the openQA job itself knows what packages it actually
> tested, and it doesn't have an easy way to get the associated
> timestamp. The scheduler could easily get the timestamp at the time the
> job was created, or at the time the job completed, but that will never
> be 100% reliable, because the job actually goes and does the download
> somewhere in between those two times.

I thought that Bodhi should be the one providing timestamps...
Re: New automated test coverage: openQA tests of critical path updates
On Wed, 2017-03-01 at 11:18 -0500, Kamil Paral wrote:
> So my first thought was to recommend you to also publish just
> type=koji_build results and finish this transition. But then I
> realized that's wrong. OpenQA operates completely different than the
> aforementioned tasks do. We operate on builds, and can distinguish
> which build is causing issues. Collating them into bodhi_update
> results is just a convenience measure for the consumer. But you
> operate on the whole update as a set. You can't distinguish which
> build of the update caused the issues, you just know that some of
> them did. So the smallest unit for you is bodhi update, and you
> should report results as such.

Yes, exactly.

> The way forward is, I believe, to extend Bodhi to query both
> type=koji_build for all included builds and collate the results (if
> needed), and also query type=bodhi_update and shows those results as
> well. Because different tasks operate on different type of data,
> which influences how they publish the results. And both use cases are
> valid.

Yep, again, this is what I was expecting to do.

> There's one important thing we need to do first, though. Bodhi ID
> doesn't identify the thing tested uniquely, because Bodhi updates are
> mutable (and the ID is kept). So Bodhi (or any gating tools) can't
> rely on just retrieving the latest result for a particular Bodhi ID
> and trust that result. It might be old and no longer reflect the
> current state. We need to extend bodhi_update results with
> "timestamp" key in extra data, that will report the "last_modified"
> time of the Bodhi update tested. And Bodhi (or any other tool) must
> not only query for item=$bodhi_id&type=bodhi_update, but also for
> &timestamp=$timestamp. Only with this we can be sure we've really
> tested particular Bodhi update.

I'm not so sure it's really necessary, and doing it is actually tricky for openQA. Only the openQA job itself knows what packages it actually tested, and it doesn't have an easy way to get the associated timestamp. The scheduler could easily get the timestamp at the time the job was created, or at the time the job completed, but that will never be 100% reliable, because the job actually goes and does the download somewhere in between those two times.

The job can - and already does - log the exact packages it actually got, but I don't think there's an easy way for it to take the 'last_modified' date for the update at the time it does the download.

OTOH, I don't think it's really too bad just to show the 'most recent' results. That should usually only be out of date for a few minutes after an update is edited. It might be possible to do a 'tests running...' spinner when there are jobs scheduled or running for the update in question, even.

--
Adam Williamson
Fedora QA Community Monkey
IRC: adamw | Twitter: AdamW_Fedora | XMPP: adamw AT happyassassin . net
http://www.happyassassin.net
Re: New automated test coverage: openQA tests of critical path updates
> Hi folks!
>
> I am currently rolling out some changes to the Fedora openQA deployment
> which enable a new testing workflow. From now on, a subset of openQA
> tests should be run automatically on every critpath update, both on
> initial submission and on any edit of the update.
>
> For the next little while, at least, this won't be incredibly visible.
> openQA sends out fedmsgs for all tests, so you can sign up for FMN
> notifications to learn about these results. They'll also be
> discoverable from the openQA web UI - https://openqa.fedoraproject.org
> . The results are also being forwarded to ResultsDB, so they'll be
> visible via ResultsDB API queries and the ResultsDB web UI. But for
> now, that's it...I think.
>
> Our intent is to set up the necessary bits so that these results will
> show up in the Bodhi web UI alongside the results for relevant
> Taskotron tests. There's an outside possibility that Bodhi is actually
> already set up to find these results in ResultsDB, in which case
> they'll just suddenly start showing up in Bodhi - we should know about
> that soon enough. :) But most likely Bodhi will need a bit of a tweak
> to find them.

Let me add a bit of technical background here. Bodhi web UI now queries ResultsDB for all available testcases, and then asks for item=$item&type=koji_build for all these testcases. So the new results won't be visible (unless you start reporting them per build).

Our depcheck and upgradepath tasks report both type=koji_build (for each SRPM) and type=bodhi_update (for each bodhi update). Both these tools internally process RPMs or builds, and then query Bodhi at the end and collate the results to bodhi_update results. We want to get rid of that collating, because it has numerous issues:

1. It's slow, because Bodhi is very slow to respond
2. It often causes the task to fail, because Bodhi often returns 500 errors
3. It's prone to race conditions. It happens often that a Bodhi update is edited between the task start and end, changing included builds.
4. It's unnecessary, because Bodhi knows all this information, and can collate the data itself, without any network issues or race conditions.

So we still publish type=bodhi_update for compatibility reasons (I'm not sure whether some part of the Bodhi backend might still use this data), but we want to get rid of it.

So my first thought was to recommend that you also publish just type=koji_build results and finish this transition. But then I realized that's wrong. OpenQA operates completely differently from the aforementioned tasks. We operate on builds, and can distinguish which build is causing issues. Collating them into bodhi_update results is just a convenience measure for the consumer. But you operate on the whole update as a set. You can't distinguish which build of the update caused the issues, you just know that some of them did. So the smallest unit for you is the bodhi update, and you should report results as such.

The way forward is, I believe, to extend Bodhi to query both type=koji_build for all included builds and collate the results (if needed), and also query type=bodhi_update and show those results as well. Because different tasks operate on different types of data, which influences how they publish the results. And both use cases are valid.

There's one important thing we need to do first, though. Bodhi ID doesn't identify the thing tested uniquely, because Bodhi updates are mutable (and the ID is kept). So Bodhi (or any gating tools) can't rely on just retrieving the latest result for a particular Bodhi ID and trust that result. It might be old and no longer reflect the current state. We need to extend bodhi_update results with a "timestamp" key in extra data, that will report the "last_modified" time of the Bodhi update tested. And Bodhi (or any other tool) must not only query for item=$bodhi_id&type=bodhi_update, but also for &timestamp=$timestamp. Only with this can we be sure we've really tested a particular Bodhi update.
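The query proposed in the message above, filtering on `item`, `type` and the proposed `timestamp` extra-data key, could be built along these lines. This is a sketch only: the `timestamp` key is the proposal from this thread and not an existing field, `bodhi_update_query` is a hypothetical helper, and the ResultsDB endpoint that the query string would be appended to is deliberately left out.

```python
from urllib.parse import urlencode


def bodhi_update_query(update_id: str, last_modified: str) -> str:
    """Build the query string for results of one exact state of an update."""
    return urlencode([
        ("item", update_id),
        ("type", "bodhi_update"),
        # Proposed extra-data key: the update's modification time at test time.
        ("timestamp", last_modified),
    ])


query = bodhi_update_query("FEDORA-2017-e12389b771", "2017-03-02 09:18:13")
print(query)
# item=FEDORA-2017-e12389b771&type=bodhi_update&timestamp=2017-03-02+09%3A18%3A13
```

Because the timestamp is part of the query, a result recorded before the update was edited simply does not match, so a consumer never mistakes it for a result on the current state.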
New automated test coverage: openQA tests of critical path updates
Hi folks! I am currently rolling out some changes to the Fedora openQA deployment which enable a new testing workflow. From now on, a subset of openQA tests should be run automatically on every critpath update, both on initial submission and on any edit of the update. For the next little while, at least, this won't be incredibly visible. openQA sends out fedmsgs for all tests, so you can sign up for FMN notifications to learn about these results. They'll also be discoverable from the openQA web UI - https://openqa.fedoraproject.org . The results are also being forwarded to ResultsDB, so they'll be visible via ResultsDB API queries and the ResultsDB web UI. But for now, that's it...I think. Our intent is to set up the necessary bits so that these results will show up in the Bodhi web UI alongside the results for relevant Taskotron tests. There's an outside possibility that Bodhi is actually already set up to find these results in ResultsDB, in which case they'll just suddenly start showing up in Bodhi - we should know about that soon enough. :) But most likely Bodhi will need a bit of a tweak to find them. This is probably a good thing, because we need to let the tests run for a while to find out how reliable they are, and if there's an unacceptable number of false negatives/positives. Once we have some info on that and are happy that we can get things sufficiently reliable for the results to be useful, we'll hook up the Bodhi integration. The tests that are run are most of the tests that, on the 'compose test' workflow, get run on the Server DVD and Workstation Live images after installation. Between them they do a decent job of covering basic system functionality. They also cover FreeIPA server and client setup, and Workstation browser (Firefox) and terminal functionality. So hopefully, if your critpath update completely breaks one of those basic workflows, you'll find out about it before pushing it stable. 
At present it looks like the Workstation tests may sometimes fail simply because the base install gets stuck during boot for some reason; I'm going to look into that this week. In testing so far the Server tests seem fairly reliable, but I want to gather data from a few days worth of test runs to see how those look.

Once we start sending results to Bodhi, I'll try and write up some basic instructions on how to interpret and debug openQA test results; QA folks will also be available in IRC and by email for help with this, of course.

You can see sample runs on Server:

https://openqa.stg.fedoraproject.org/tests/overview?groupid=1&build=FEDORA-2017-376ae2b92c&version=25&distri=fedora

and Workstation:

https://openqa.stg.fedoraproject.org/tests/overview?version=25&distri=fedora&build=FEDORA-2017-87896dfb59&groupid=1

the 'desktop_notifications_live' failure is a stale bit of data - that test isn't actually run any more because obviously it makes no sense in this context, but because it got run one time in early development, openQA continues to show it for that update (it won't show for any *other* update). The `desktop_update_graphical` fail is a good example of the kind of issue I'll have to look into this week: it seems to have failed because of an intermittent crasher bug in PackageKit, rather than an issue in the update. We'll have to look at skipping known-unreliable tests, or marking them somehow so you know the deal in Bodhi, or automatically re-running them, or things along those lines.

--
Adam Williamson
Fedora QA Community Monkey
IRC: adamw | Twitter: AdamW_Fedora | XMPP: adamw AT happyassassin . net
http://www.happyassassin.net