On Fri, 2016-11-25 at 09:27 -0500, Bastien Nocera wrote:
> > The problem is that we didn't get around to running the test until the day before the go/no-go. There's a lot of stuff to test, and anything which only one person is likely to test is a risk. Frankly speaking, given how humans work, things that involve digging some piece of hardware you never touch out of a pile and hooking it up to a keyboard and mouse and a monitor and power and network is quite likely to get passed over in favour of something you can run in a VM. Especially if it's 4:30. This is why I have an Unused Arm Devices Pile Of Shame on my desk...
> >
> > So, partly this is our fault because we could've tested this earlier and didn't. But it's also the case that we really need more redundancy in as much of the required testing as possible.
>
> Is there any continuous testing done on the images on the installer? Is it on real hardware? Is it possible to mock hardware setups? Comparing boot setups on working and non-working installations.

I don't know what you mean by "the images on the installer", and I don't know precisely what you mean by "mock hardware setups" - mock what about them, and in what way, exactly?

To take the specific example we're starting from here, if you mean 'can we fake up something like an OS X dual boot install on a virtual machine', the answer is kinda yes, but it's not entirely straightforward. I actually did this in order to test my fix for the bug, but as well as faking up a disk layout that approximated an OS X install, I had to patch anaconda to think the system was a Mac (I just sabotaged that check to always return True) and provide that patch as an anaconda updates.img. Otherwise anaconda would've known the system wasn't a Mac and wouldn't have hit the right code path.
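For the record, that patch really was about as crude as it sounds. The function name below is made up for illustration - the real check lives somewhere in anaconda's platform / arch detection code - but the whole hack boiled down to something like this:

    # Illustrative sketch only, not the actual anaconda source: the point
    # is just that the Mac detection gets short-circuited to always say yes.
    def is_mac():
        # a plain VM now takes the same code path as real Apple hardware
        # with an OS X install on it
        return True

Roll a change like that into an updates.img, hand it to the installer with the inst.updates= boot option, and anaconda happily behaves as though it's running on a Mac.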
> I think it would be possible to do testing that didn't rely quite as much on manual testing, through regression testing on "mock" hardware (a hacked up VM with a test disk image), comparing the partition types after installation against a working setup, comparing the file lists in the boot partition, etc.

We do rather a lot of automated testing of the installer, in fact:

https://openqa.fedoraproject.org/tests/overview?distri=fedora&version=Rawhide&build=Fedora-Rawhide-20161125.n.0&groupid=1
is a current Rawhide test run (missing some tests as some images are missing)

https://openqa.fedoraproject.org/tests/overview?distri=fedora&version=25&groupid=1&build=Fedora-25-20161115.0
is the Fedora 25 Final automated test run (with all tests)

Those tests are run on every 'nightly' Branched and Rawhide compose, and on all 'candidate' Branched composes. (We also run the Atomic ostree installer test on the two-week Atomic nightlies.) Reports from these tests are mailed to this list every day, under the title "Compose check report". I frequently reply to them with details about the bugs the tests found.

We could, of course, do a lot *more* of this testing. Just personally I've got a list of about 30 things I want to add to the test set. But there's only two and a half people working on the openQA tests, and we have other stuff to do as well. And of course we have to monitor the tests that *are* run, investigate the bugs they discover, file them, and follow up on them.

I should probably note some RH inside baseball at this point: there's a general theme in RH at present that people would like to have less divergence between Fedora and RHEL testing processes; specifically, it would be nice to have some of the testing that RH's release test team does on RHEL builds done on Fedora builds too. Most of those tests run on Beaker; Fedora technically has a Beaker instance, but it's not sufficiently set up for us to be able to actually run the tests RH has on it yet. That whole 'oh, just get Fedora a Beaker setup, shouldn't be hard, right?!' problem is dumped on tflink's lap at present. Like everyone else, he has a million other things to do and that one isn't his highest priority, and he also keeps getting roadblocked on it by things that are out of his control, AIUI. Once we have a usable Beaker instance we can try importing some of RH's tests to it and setting them up to run on Fedora, though I would be *utterly* unsurprised if that turns out to be a lot more work than it sounds like.

We currently run all openQA tests on virtual machines. openQA *does* in fact have the necessary bits to run tests on bare metal - SUSE does this - but we don't do that and haven't investigated the possibility in detail yet. Fedora's openQA instances are in PHX, so it'd at least involve putting new hardware in there, which is its own process, and apparently we don't have unlimited physical *space* there for them. It's really something we just haven't looked at at all yet. Beaker is rather more focused on metal testing - it's kinda as much an 'allocate resources on machines with specific properties' system as it is a testing system, really - but again, we're not really in a position to be doing stuff on Beaker yet.

> I'm surprised that the Anaconda, and blivet developers aren't taking part of this conversation. I'd certainly like them to point out all the ways in which they're already doing what I mentioned, and showing how we could add more test cases.

anaconda has its own automated test system and runner, called kickstart-tests:

https://github.com/rhinstaller/kickstart-tests

The basic concept of kickstart-tests is that a 'test' is a kickstart which runs an install and then, from %post, decides whether whatever it was testing worked and writes out either a 'SUCCESS' message or some kind of details about what went wrong to a canary file. The test runner spins up a VM, runs an install using the kickstart, sucks the canary file out of the VM and records the result.
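To give a flavour of the idea, a test looks very roughly like this - a paraphrased sketch, not a real test from the repo, and the canary path and the specific check here are just for illustration:

    # sketch of a kickstart-tests style test, not copied from the repo;
    # the normal kickstart install commands (partitioning, packages and
    # so on) go above the %post section
    %post
    # decide whether the thing under test actually worked, and leave the
    # verdict in a canary file for the runner to pull out of the VM
    if [ -f /boot/efi/EFI/fedora/grubx64.efi ]; then
        echo SUCCESS > /root/RESULT
    else
        echo "bootloader files missing from the EFI system partition" > /root/RESULT
    fi
    %end

The nice thing about that pattern is that the pass/fail logic lives entirely inside the kickstart, so the runner doesn't need to know anything about what any individual test is checking.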
As I understand it, these tests are not fully integrated CI-style into the anaconda development process yet (though that was/is the goal), but they are run quite frequently.

I wrote a tool about six months back which actually takes the kickstart-tests repository and processes the kickstart tests into openQA tests. It was kind of a proof of concept and took some shortcuts, but it worked quite well. (I'd link to the results, except my personal openQA box - which I ran the tests on - dropped dead last month and I haven't got around to fixing it yet.) We had some discussions with the anaconda team about ways in which that could possibly be officially adopted, but it kinda petered out when I ran out of time, and the script would probably need a bit of work now (though likely not much).

One constraint with openQA is that we have somewhat limited resources for running the tests - we can run 14 tests in parallel on the production openQA instance, and 8 on the staging instance - so we can't just add new tests entirely willy-nilly; we have to be somewhat strategic, and this was one reason I didn't just throw all the kickstart-tests into the existing openQA instances.
-- 
Adam Williamson
Fedora QA Community Monkey
IRC: adamw | Twitter: AdamW_Fedora | XMPP: adamw AT happyassassin . net
http://www.happyassassin.net
_______________________________________________
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org