On Fri, 2016-11-25 at 09:27 -0500, Bastien Nocera wrote:
> 
> > The problem is that we didn't get around to running the test until the
> > day before the go/no-go. There's a lot of stuff to test, and anything
> > which only one person is likely to test is a risk. Frankly speaking,
> > given how humans work, things that involve digging some piece of
> > hardware you never touch out of a pile and hooking it up to a keyboard
> > and mouse and a monitor and power and network are quite likely to get
> > passed over in favour of something you can run in a VM. Especially if
> > it's 4:30. This is why I have an Unused Arm Devices Pile Of Shame on my
> > desk...
> > 
> > So, partly this is our fault because we could've tested this earlier and
> > didn't. But it's also the case that we really need more redundancy in as
> > much of the required testing as possible.
> 
> Is there any continuous testing done on the images on the installer? Is it
> on real hardware? Is it possible to mock hardware setups? Comparing
> boot setups on working and non-working installations.

I don't know what you mean by "the images on the installer", and I
don't know precisely what you mean by "mock hardware setups". Mock what
about them, in what way, exactly?

To take the specific example we're starting from here, if you mean 'can
we fake up something like an OS X dual boot install on a virtual
machine', the answer is kinda yes, but it's not entirely
straightforward. I actually did this in order to test my fix for the
bug, but as well as faking up a disk layout that approximated an OS X
install, I had to patch anaconda to think the system was a Mac (I just
sabotaged that check to always return True) and provide that patch as
an anaconda updates.img. Otherwise anaconda would've known the system
wasn't a Mac and wouldn't have hit the right code path.
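
To give a rough idea, the patch amounted to something like the sketch
below (the function name here is purely illustrative, not the actual
check in the anaconda/blivet source; the point is just that the check is
forced to claim 'yes, this is a Mac'):

    # illustrative sketch only - the real check lives in the installer's
    # platform detection code and looks a bit different
    def is_mac():
        # normally this would inspect the hardware to decide whether we're
        # on Apple kit; for the test VM we force it so anaconda takes the
        # Mac dual-boot code path
        return True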

> I think it would be possible to do testing that didn't rely quite as much
> on manual testing, through regression testing on "mock" hardware (a hacked
> up VM with a test disk image), comparing the partition types after 
> installation
> against a working setup, comparing the file lists in the boot partition,
> etc.

We do rather a lot of automated testing of the installer, in fact:

https://openqa.fedoraproject.org/tests/overview?distri=fedora&version=Rawhide&build=Fedora-Rawhide-20161125.n.0&groupid=1
 is a current Rawhide test run (missing some tests as some images are missing)
https://openqa.fedoraproject.org/tests/overview?distri=fedora&version=25&groupid=1&build=Fedora-25-20161115.0
 is the Fedora 25 Final automated test run (with all tests)

Those tests are run on every 'nightly' Branched and Rawhide compose,
and on all 'candidate' Branched composes. (We also run the Atomic
ostree installer test on the two-week Atomic nightlies).

Reports from these tests are mailed to this list every day, under the
title "Compose check report". I frequently reply to them with details
about the bugs the tests found.

We could, of course, do a lot *more* of this testing. Just personally
I've got a list of about 30 things I want to add to the test set. But
there are only two and a half people working on the openQA tests, and we
have other stuff to do as well. And of course we have to monitor the
tests that *are* run, investigate the bugs they discover, file them,
and follow up on them.

I should probably note some RH inside baseball at this point: there's a
general theme in RH at present that people would like to have less
divergence between Fedora and RHEL testing processes; specifically, it
would be nice to have some of the testing that RH's release test team
does on RHEL builds done on Fedora builds too. Most of those tests run
on Beaker; Fedora technically has a Beaker instance but it's not
sufficiently set up for us to be able to actually run the tests RH has
on it yet. That whole 'oh just get Fedora a Beaker setup, shouldn't be
hard, right?!' problem is dumped in tflink's lap at present. Like
everyone else, he has a million other things to do and that one isn't
his highest priority, and he also keeps getting roadblocked on it by
things that are out of his control, AIUI.

Once we have a usable Beaker instance we can try importing some of RH's
tests to it and setting them up to run on Fedora, though I would be
*utterly* unsurprised if that turns out to be a lot more work than it
sounds like.

We currently run all openQA tests on virtual machines. openQA *does* in
fact have the necessary bits to run tests on bare metal - SUSE does
this - but we don't do that and haven't investigated the possibility in
detail yet. Fedora's openQA instances are in PHX, so it'd at least
involve putting new hardware in there, which is its own process, and
apparently we don't have unlimited physical *space* there for new
machines either. It's really something we just haven't looked into at
all yet.

Beaker is rather more focused on metal testing - it's kinda as much an
'allocate resources on machines with specific properties' system as it
is a testing system, really - but again, we're not really in a position
to be doing stuff on Beaker yet.

> I'm surprised that the Anaconda and blivet developers aren't taking part
> in this conversation. I'd certainly like them to point out all the ways in
> which they're already doing what I mentioned, and to show how we could
> add more test cases.

anaconda has its own automated test system and runner, called
kickstart-tests:

https://github.com/rhinstaller/kickstart-tests

The basic concept of kickstart-tests is that a 'test' is a kickstart
which runs an install and then, from %post, decides whether whatever it
was testing worked and writes out either a 'SUCCESS' message or some
kind of details about what went wrong to a canary file. The test runner
spins up a VM, runs an install using the kickstart, sucks the canary
file out of the VM and records the result.
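
In rough pseudo-Python, the flow boils down to something like this (just
a sketch of the concept, not the real runner; the helper functions and
the canary path are placeholders made up for illustration):

    # sketch only - these helpers stand in for the bits that actually
    # drive the VM and pull files out of it, they're not real
    # kickstart-tests functions
    def run_install_in_vm(kickstart_path):
        raise NotImplementedError("placeholder: boot a VM, install from kickstart")

    def fetch_file_from_vm(path_in_guest):
        raise NotImplementedError("placeholder: copy a file out of the guest")

    def run_one_test(kickstart_path, canary="/root/RESULT"):  # path is hypothetical
        run_install_in_vm(kickstart_path)
        result = fetch_file_from_vm(canary).strip()
        # %post wrote either 'SUCCESS' or some detail of what went wrong
        return result == "SUCCESS", result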

As I understand it, these tests are not fully integrated CI-style into
the anaconda development process yet (though that was/is the goal), but
they are run quite frequently.

I wrote a tool about six months back which actually takes the
kickstart-tests repository and processes the kickstart tests into
openQA tests. It was kind of a proof of concept and took some
shortcuts, but it worked quite well. (I'd link to the results, except
my personal openQA box - which I ran the tests on - dropped dead last
month and I haven't got around to fixing it yet).

We had some discussions with the anaconda team about ways in which that
could possibly be officially adopted, but it kinda petered out when I
ran out of time, and the script would probably need a bit of work now
(though likely not much). One constraint with openQA is we have
somewhat limited resources for running the tests - we can run 14 tests
in parallel on the production openQA instance, and 8 on the staging
instance - so we can't just add new tests entirely willy-nilly; we have
to be somewhat strategic, and this was one reason I didn't just throw
all the kickstart-tests into the existing openQA instances.
-- 
Adam Williamson
Fedora QA Community Monkey
IRC: adamw | Twitter: AdamW_Fedora | XMPP: adamw AT happyassassin . net
http://www.happyassassin.net