Do you know how they got rid of flakiness in their tests? We've spent a bunch of effort fixing flaky tests (and in marking the remaining flaky tests as flaky), but there's still a long tail of flakiness. I wonder if that sort of thing might be different for OpenStack if they have a different approach to testing than we do.
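To make the "marking tests as flaky" part concrete, here is a minimal sketch of one way a test could be classified by re-running it, assuming each test can be re-run in isolation. The name `classify_test` and the `attempts` parameter are hypothetical; this is not how WebKit's or OpenStack's tooling actually works.

```python
from typing import Callable


def classify_test(run_test: Callable[[], bool], attempts: int = 5) -> str:
    """Run a test several times and label it by how consistent it is.

    run_test is a hypothetical callable returning True on pass, False on fail.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    results = [run_test() for _ in range(attempts)]
    if all(results):
        return "pass"   # passed on every attempt
    if not any(results):
        return "fail"   # failed on every attempt
    return "flaky"      # mixed results across identical runs
```

A test that always passes or always fails keeps a stable label; only mixed results across identical runs are reported as flaky. Note that a test which fails rarely can still be labelled "pass" with a small number of attempts, which is one reason a long tail of flakiness is hard to eliminate.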
Adam

On Mon, Feb 4, 2013 at 5:14 PM, Tim Ansell <[email protected]> wrote:
> Hey guys,
>
> Last week a number of the team here at Google Sydney, including myself,
> attended the Linux.conf.au 2013 conference. The conference was a blast and
> the hot topic this year was OpenStack, an Open Source Cloud layer.
>
> The OpenStack project has grown from being a small project to having over
> 500 active committers and continues to grow at a rapid pace. Both the
> Continuous Integration Miniconf
> (http://lca2013.linux.org.au/schedule/30102/view_talk?day=monday) and main
> conference included talks from OpenStack leaders about how they have tried
> to handle this growth and I think we can learn from their successes and
> failures. All of OpenStack's infrastructure is documented in the
> following talks: http://openstack-ci.github.com/publications/
>
> I pulled the following stats to see how comparable the projects are:
>
> OpenStack
> (http://openstack-ci.github.com/publications/lca2013-ci/index.html#(3))
>
>   Over 500 Active Technical Contributors
>   As many as 200 trunk changes an hour
>   18 (integrated) projects (and growing)
>
> I tried looking these up in WebKit and got the following:
>
>   ~200 active contributors
>   As many as ~12 trunk changes an hour
>   1 project, but 7 target platforms
>
> One of the most interesting parts of OpenStack was having a "gated trunk".
> From their talk:
>>
>> Before each change to the OpenStack projects is merged into the main tree,
>> unit and integration tests are run on the change, and only if they pass, is
>> the change merged. We call this "gating".
>
> There is a lot of debate about the value of a gated trunk on the internet,
> which I'm not going to repeat here.
> OpenStack's experience has been that it
> preserves the following properties:
> http://openstack-ci.github.com/publications/lca2013-ci/index.html#(9)
>
> Ensures Code Quality
> Protects developers
>   Devs always start from working code
> Protects tree
>   Bad code doesn't land
> Egalitarian
>   Process is the same for everyone
>   Process is transparent
>   Process is automated
>
> These are all things that came up in Eric's "WebKit wishes" email, especially
> the parts about having an always green tree. The egalitarian nature of the
> system also helps with trusting people as you *know* they cannot break the
> tree. This system is similar to our commit queue; however, nobody has
> privileges to bypass the queue.
>
> OpenStack has 18 projects which are all tightly integrated; for example, a
> change in the API in one project could break another project. For this
> reason they gate changes on test runs from all projects before allowing a
> commit to land in any of them. While WebKit is only a single project, the
> process of requiring multiple jobs to be green is similar to WebKit needing
> to support multiple platforms.
>
> They do point out that when this system is set up, the system has to be
> ultra repeatable and reliable:
>>
>> Once everything is automated, the projects stops if the automation does -
>> http://openstack-ci.github.com/publications/lca2013-ci/index.html#(8)
>
> To allow this to happen, OpenStack has managed to eliminate all flaky tests
> in their suite. WebKit is not at this stage and still has a large number of
> tests which are failing and/or flaky. Luckily, WebKit has much better
> infrastructure for dealing with and tracking them down.
>
> Other things they have done to try and make this process work are:
>
> Like WebKit, every patch is required to have code review before being
> submitted. OpenStack requires two positive reviews before allowing a commit
> to be submitted, rather than the single one that WebKit needs.
> Like WebKit, OpenStack has an "early warning system" which runs all tests as
> soon as a patch is submitted.
>
> The complete OpenStack test suite takes around 1 hour to run, but as they
> have more than 1 event per hour their landing system needs pipelining. They
> have developed a system called Zuul to make this happen. Before they had
> this pipeline process, changes were taking many hours to land.
>
> You can see their currently running system at http://zuul.openstack.org/ and
> find out more about Zuul at the following locations:
>>
>> Zuul: a Pipelining Trunk Gating System
>> http://amo-probos.org/post/14
>>
>> http://mirror.linux.org.au/linux.conf.au/2013/ogv/OpenStack_Zuul.ogv
>
> I guess this is something we should discuss further.
>
> Tim 'mithro' Ansell
>
> _______________________________________________
> webkit-dev mailing list
> [email protected]
> https://lists.webkit.org/mailman/listinfo/webkit-dev
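The gating and pipelining described in the quoted message can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not Zuul's actual code: it assumes changes are tested speculatively on top of the changes queued ahead of them, and the names `gate_pipeline` and `passes` are hypothetical. The test runs within one round are written as a loop here, but the point of pipelining is that a real system would run them in parallel.

```python
from typing import Callable, List, Sequence, Tuple


def gate_pipeline(
    queue: Sequence[str],
    passes: Callable[[Tuple[str, ...]], bool],
) -> Tuple[List[str], List[str], int]:
    """Return (merged, rejected, rounds) for a queue of changes.

    passes(tree) is a hypothetical callable that runs the full test suite
    against the given sequence of changes and returns True if it is green.
    """
    merged: List[str] = []
    rejected: List[str] = []
    pending = list(queue)
    rounds = 0
    while pending:
        rounds += 1
        # Speculatively test each pending change on top of the merged tree
        # plus every pending change ahead of it (parallel in a real system).
        results = [
            passes(tuple(merged) + tuple(pending[: i + 1]))
            for i in range(len(pending))
        ]
        # Everything ahead of the first failure was tested on exactly the
        # tree it will land on, so it can merge.
        first_failure = results.index(False) if False in results else len(pending)
        merged.extend(pending[:first_failure])
        if first_failure < len(pending):
            # The failing change is rejected; changes behind it were tested
            # on a tree that included it, so they are retested next round.
            rejected.append(pending[first_failure])
        pending = pending[first_failure + 1:]
    return merged, rejected, rounds
```

With a one-hour suite, four green changes land after a single round instead of four sequential hours. If the second of four changes breaks the tests, `gate_pipeline(["A", "B", "C", "D"], lambda tree: "B" not in tree)` returns `(["A", "C", "D"], ["B"], 2)`: only the changes behind the failure pay for a second round.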

