On Tue, 2018-03-06 at 20:01 +0100, Petr Vobornik via FreeIPA-devel
wrote:
> Hi FreeIPA contributors,
> 
> First, I apologize for such a long mail.
> 
> In the team, we have been discussing what upstream CI should look like
> and what to expect from it. Various proposals were discussed. In this
> email I'd like to write down what I believe the expectations/goals of
> it should be, or the scope of those expectations. Most are based on
> Fedora CI and Cockpit Project ideals. The intention of this mail is to
> convey this information and also to create space for feedback and to
> start a specific discussion on the individual points.
> 
> In the first part, I'm mostly focusing on the ideals, not the
> implementation (the use of some specific CI). This is the core part to
> focus on. When we agree on the ideals/goals, we can discuss the
> implementation.
> 
> In the second part, I mention some aspects of PR-CI and its
> capabilities, and why I think it fulfills the ideals above.
> 
> I don't want to take much of your time and I hope that I'm not
> repeating myself much, but I have a feeling that we lack a write-up
> like this, i.e. that not everybody knows the motivation behind PR-CI
> or some parts of it. I did not cover everything. Also, there is the
> idea of transforming the CI runner into a bot for other non-test
> tasks, but that is for another mail and another discussion.
> 
> # Upstream CI expectations:
> 
> Upstream CI should be upstream, so that upstream contributors can use
> it. It should also be simple enough to modify.
> 
> Being upstream means:
> * an upstream contributor is able to add a test
> * an upstream contributor is able to modify the way tests are run
> (e.g. install Selenium and Firefox on a host for Web UI tests; a
> sketch follows below)
> * an upstream contributor is able to inspect the results and logs of a
> test run
> * a test run can be referenced in public resources like Pagure
> * doing the above should not require any action outside of upstream
> channels
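> 
> For illustration, modifying how hosts are prepared could be as small
> as adding an Ansible task. This is only a hypothetical sketch; the
> package names and playbook layout are illustrative, not the actual
> PR-CI playbooks:
> 
>   # webui-prereqs.yml - hypothetical playbook for Web UI test hosts
>   - hosts: webui
>     become: true
>     tasks:
>       - name: Install Firefox and the Selenium Python bindings
>         package:
>           name:
>             - firefox
>             - python3-selenium
>           state: present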
> 
> ## Goal of CI:
> Keep good enough quality to release often without regressions.
> 
> So in essence, I see 2 goals: testing (assessment of quality) and
> being upstream friendly.
> 
> ## What to test:
> * ideally everything on each pull request
> 
> ## Complexity of the test
> * testing everything might raise questions like: OK, what about
> performance tests or big infra tests? My personal opinion is that such
> tests might create quite a high demand on the testing infra - too much
> unnecessary complexity. Upstream CI could support tests with e.g. 6
> hosts, but the number of such tests should be limited. In my opinion,
> this could be the threshold between what upstream tests and what
> downstream providers of FreeIPA test. So in practice, upstream CI
> should test "almost everything". ;)
> 
> ## How quickly to test it:
> * in a reasonable time after opening/updating a PR. Reasonable in this
> context means a time that still provides timely feedback but also
> allows running something non-trivial. 1.5 h was chosen as an arbitrary
> value which can fulfill this.
> 
> ## How a test should look
> Ideally, in a form that can easily be reused on OSes other than Fedora.
> 
> ## How many parallel pull requests to plan for under these constraints:
> * when designing PR-CI, 3 was chosen. It seemed like a value close to
> the usual peak in practice.
> 
> ## Other expectations on test infra:
> * be able to divide the resource cost among more entities/companies
> (the concept of a trusted runner)
> * an upstream contributor should be able to write and debug tests in
> the same fashion as tests are run in the CI
> 
> 
> All testing, and especially resource-demanding testing like the
> multihost tests FreeIPA has, requires significant resources.
> Currently, those are provided by Red Hat. If somebody else would like
> to join the party, or if there are other available resources, then why
> not utilize them.
> 
> Being able to reproduce the test environment on private hardware
> allows contributors to write and debug tests. If this is not possible,
> then contributing tests becomes difficult. So ideally, the majority of
> tests should be runnable on a well-equipped laptop (e.g. with 12-16 GB
> of memory) while still leaving room to do some work on it.
> 
> ## How results should look like
> Up to the team. For me personally, even the current state is good
> enough (meaning the GitHub interface). But the decentralized and
> public aspect could support other views, like Michal's dashboard POC,
> which is quite nice. It could also support sending results to DBs like
> ResultsDB <https://fedoraproject.org/wiki/ResultsDB> for future
> analysis and processing. Such a DB can have its own UI and, most
> importantly, doesn't need to be developed by us.
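> 
> As an illustration, a single test result pushed to such a DB could be
> a small record like the one below. This is only a sketch; the exact
> ResultsDB field names and values may differ:
> 
>   testcase: freeipa.pr-ci.test_webui
>   outcome: PASSED
>   ref_url: https://github.com/freeipa/freeipa/pull/<PR number>
>   data:
>     fedora: 27
>     arch: x86_64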
> 
> ## Compromises and release cadence
> 
> Is it possible to test everything in 1.5 h while having 3 parallel
> PRs? Maybe. But probably not, or at least not at the beginning. If it
> is, then let us explore how. If not, then let's come up with
> reasonable compromises.
> 
> Being able to test everything on each PR allows us to release at any
> moment.
> 
> If we make compromises, then we also compromise on how often we are
> able to release. This might still be OK; it is up to the team's
> decision.
> 
> I'd prefer that we aim for releasing upstream FreeIPA every 14 days.
> It might be too ambitious at the beginning, so e.g. 1 month could be
> OK at the start.
> 
> 
> # PR-CI specifics and why it fulfills the goals:
> 
> It's completely upstream - the code, test definitions, implementation
> and deployment specification. The only private part is the actual
> deployment and operation of a runner. But if the upstream core team
> has a way to "bless" runners as trusted, then this is fine; more
> entities can deploy such runners. In practice, this blessing is done
> with a fedorahosted private key and a GitHub token.
> 
> ## Nightly tests
> 
> Nightly tests are a compromise which basically says: we are not able
> to test everything in every PR, so at least let's test the rest
> nightly. It was easy to set up without much work required. But that
> doesn't mean a different approach cannot be taken.
> 
> ## Alternatives to nightly tests
> 
> In last week's discussion, Lex was talking about testing more stuff
> and utilizing runners when they are idle. I can imagine that there
> might be a set of jobs with higher priority to give fast feedback and
> then another set of jobs (currently the ones in the nightlies) with
> lower priority. That would still provide good feedback soon enough and
> full feedback a bit later.
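> 
> A rough, hypothetical sketch of how such a split could be expressed in
> the job definitions (the job names and priority values are only
> illustrative, not the current PR-CI schema):
> 
>   jobs:
>     webui_unit:
>       priority: 100   # gating set, runs on every PR
>     external_ca:
>       priority: 10    # extended set, picked up when runners are idle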
> 
> ## Test stability
> We often say PR-CI while meaning tests run on a pull request. But the
> hidden jewel of the project is the way things are run on a runner. It
> uses local virtualization via Vagrant and test configuration via
> Ansible, with predefined, regularly updated host images.
> * local virtualization does not tie us to specific provisioning
> systems. It can work on OpenStack, Beaker, bare metal and maybe also
> OpenShift or other cloud providers like AWS.
> * Ansible just makes setup easy
> * custom images are the key part here. The core idea is that we
> prepare an image with the dependencies, test it first, and if it does
> not regress, it can be used. So bugs in dependencies won't stop
> development. For FreeIPA, which has a huge dependency tree, this is
> really a game changer. Regular updates and pre-testing of images
> before use are the key to catching and fixing such bugs while keeping
> the images relatively up-to-date.
> 
> ## Optimizations:
> 
> Not every job has the same size. There are tests which require 1 host
> and tests which need 5. Currently, all runners have the same size, and
> thus small tasks don't utilize resources well. Support for different
> sizes will allow us to run more things in parallel with the same
> resources.
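> 
> A hypothetical sketch of how per-topology sizes could be declared so a
> scheduler can match small and big jobs to differently sized runners
> (the field names and numbers are only illustrative):
> 
>   topologies:
>     master_only:              # 1 host, fits a small runner
>       cpu: 2
>       memory: 2750
>     master_3repl_1client:     # 5 hosts, needs a big runner
>       cpu: 8
>       memory: 12000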
> 
> ## Test definition yaml
> 
> The key benefit is that adding a test to run on a PR, or in the future
> in the nightlies, can be done by a simple modification of
> `.freeipa-pr-ci.yaml`.
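> 
> Only as an illustration (the real schema may differ in its details),
> adding a job could look roughly like this:
> 
>   jobs:
>     fedora/test_simple_replication:
>       requires: [fedora/build]
>       priority: 50
>       job:
>         class: RunPytest
>         args:
>           test_suite: test_integration/test_simple_replication.py
>           topology: master_1repl
>           timeout: 3600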
> 
> Kudos if you got here.

Thanks! :-)

I may not agree with some minor details, but I appreciate you setting
down the direction, with which I fully agree.

My only comment is that you should really prioritize splitting the
tests into pieces, for a few reasons (from experience with OpenShift
CI):

1) you can parallelize all those runs if you get enough hardware
2) you can rerun individual parts that failed; this is really important
for saving resources when a flake makes a single part fail and you just
want to rerun it because you know it was an infrastructure failure and
not a test failure
3) when submitting a PR you might first run the tests you expect to
show errors (again, this saves resources)
4) when there are failures they are reported earlier, and the PR author
can get to work right away on fixing them
5) the total time taken is lower the more tests you can run in parallel
6) there is less chance for one test to influence the outcome of
another that comes later when the infra is not completely rebuilt for
each test

My 2c, parallelizing tests really buys you a lot.

Simo.

-- 
Simo Sorce
Sr. Principal Software Engineer
Red Hat, Inc
