On Sun, Oct 20, 2013 at 12:18 PM, Tim Bell <[email protected]> wrote:
>
> Is it easy? No... it is hard, whether in an integrated test suite or on
> its own.
> Can it be solved? Yes, we have done incredible things with the current
> QA infrastructure.
> Should it be split off from other testing? No, I want EVERY commit to
> have this check. Performance through benchmarking at scale is fundamental.
>
> Integration with the current process means all code has to pass the bar...
> a new project makes it optional and therefore makes the users do the
> debugging... just the kind of thing that drives those users away...
Tim, we already have a very basic gating test to check for degraded
performance (the large-ops test in tempest). Does it have all the issues
listed below? Yes: it doesn't detect minor performance degradations, it
doesn't work on the RAX cloud (slower VMs), et cetera. But it's a start.

After debugging the nova-network / rootwrap issues
(https://bugs.launchpad.net/oslo/+bug/1199433,
https://review.openstack.org/#/c/38000/) that caused nova to time out when
booting just 50 instances, I added a test that boots n instances at once
(n=150 in the gate) using the fake virt driver. We are now gating on the
nova-network version and are getting ready to gate on the neutron version
too. The tests are pretty fast too:

gate-tempest-devstack-vm-large-ops SUCCESS in 13m 44s
gate-tempest-devstack-vm-neutron-large-ops SUCCESS in 16m 09s (non-voting)

best,
Joe

>
> Tim
>
> > -----Original Message-----
> > From: Robert Collins [mailto:[email protected]]
> > Sent: 20 October 2013 21:03
> > To: OpenStack Development Mailing List
> > Subject: Re: [openstack-dev] Announce of Rally - benchmarking system for OpenStack
> >
> > On 21 October 2013 07:36, Alex Gaynor <[email protected]> wrote:
> > > There are several issues involved in doing automated regression
> > > checking for benchmarks:
> > >
> > > - You need a platform which is stable. Right now all our CI runs on
> > >   virtualized instances, and I don't think there's any particular
> > >   guarantee it'll be the same underlying hardware; further, virtualized
> > >   systems tend to be very noisy and not give you the stability you need.
> > > - You need your benchmarks to be very high precision if you really
> > >   want to rule out regressions of more than N% without a lot of false
> > >   positives.
> > > - You need more than just checks on individual builds, you need
> > >   long-term trend checking - 100 1% regressions are worse than a single
> > >   50% regression.
> >
> > Let me offer a couple more key things:
> > - you need a platform that is representative of your deployments:
> >   1000 physical hypervisors have rather different checkin patterns than
> >   1 qemu hypervisor.
> > - you need a workload that is representative of your deployments:
> >   10000 VMs spread over 500 physical hypervisors routing traffic through
> >   one neutron software switch will have rather different load
> >   characteristics than 5 qemu VMs in a KVM VM hosted in an all-in-one
> >   configuration.
> >
> > Neither the platform - # of components, their configuration, etc. - nor
> > the workload in devstack-gate is representative of production
> > deployments of any except the most modest clouds. That's fine -
> > devstack-gate to date has been about base functionality, not digging
> > down into race conditions.
> >
> > I think having a dedicated tool aimed at:
> > - setting up *many different* production-like environments and running
> > - many production-like workloads and
> > - reporting back which ones work and which ones don't
> >
> > makes a huge amount of sense.
> >
> > From the reports from that tool we can craft targeted unit tests or
> > isolated functional tests to capture the problem and prevent it
> > worsening or regressing (once fixed). See for instance Joe Gordon's
> > fake hypervisor, which is great for targeted testing.
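To make that concrete, here is a rough, hypothetical sketch of the pattern
the large-ops job exercises (not the actual tempest scenario). It assumes a
devstack where nova runs the fake virt driver; the credentials, image ID and
flavor ID below are placeholders:

import time

from novaclient import client

NUM_INSTANCES = 150        # the value the gate job uses
IMAGE_ID = '<image-uuid>'  # placeholder
FLAVOR_ID = '42'           # placeholder

nova = client.Client('2', 'admin', 'secret', 'admin',
                     'http://127.0.0.1:5000/v2.0')

# One API call asking nova to schedule NUM_INSTANCES servers at once.
# With the fake virt driver no real VMs are started, so this stresses
# the API, scheduler, network and rootwrap paths rather than a hypervisor.
nova.servers.create('large-ops', IMAGE_ID, FLAVOR_ID,
                    min_count=NUM_INSTANCES, max_count=NUM_INSTANCES)

# Poll until everything is ACTIVE, failing fast on ERROR or timeout.
deadline = time.time() + 600
while True:
    statuses = [s.status for s in nova.servers.list()
                if s.name.startswith('large-ops')]
    if any(status == 'ERROR' for status in statuses):
        raise RuntimeError('at least one instance went to ERROR')
    if len(statuses) >= NUM_INSTANCES and all(
            status == 'ACTIVE' for status in statuses):
        break
    if time.time() > deadline:
        raise RuntimeError('instances did not become ACTIVE in time')
    time.sleep(5)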
> >
> > That said, I also agree with the sentiment expressed that the
> > workload-driving portion of Rally doesn't seem different enough from
> > Tempest to warrant being separate; it seems to me that Rally could be
> > built like this:
> >
> > - a thing that does deployments spread out over a phase space of
> >   configurations
> > - instrumentation for deployments that permits the data visibility
> >   needed to analyse problems
> > - tests for tempest that stress a deployment
> >
> > So the single-button-push Rally would:
> > - take a set of hardware
> > - in a loop:
> >   - deploy a configuration, run Tempest, report data
> >
> > That would reuse Tempest and still be a single-button-push data-gathering
> > thing, and if Tempest isn't capable of generating enough concurrency/load
> > [for a single test - ignore parallel execution of different tests] then
> > that seems like something we should fix in Tempest, because
> > concurrency/race conditions are things we need tests for in devstack-gate.
> >
> > -Rob
> >
> > --
> > Robert Collins <[email protected]>
> > Distinguished Technologist
> > HP Converged Cloud
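For what it's worth, the single-button-push loop Rob describes could be
quite small. This is purely a hypothetical sketch: the deploy / run_tempest /
report helpers are made up to show the shape, not an existing API, and the
phase-space values are arbitrary:

import itertools


def deploy(hardware, config):
    """Stand-in for real deployment tooling (devstack, TripleO, ...)."""
    raise NotImplementedError


def run_tempest(cloud, concurrency):
    """Stand-in: run tempest's stress/scenario tests against the cloud."""
    raise NotImplementedError


def report(config, results):
    """Stand-in: publish results somewhere humans can compare runs."""
    raise NotImplementedError


def rally(hardware):
    # Sweep a small "phase space" of deployment configurations.
    phase_space = itertools.product(
        ['nova-network', 'neutron'],   # network service
        [1, 2, 16],                    # number of compute nodes
        [4, 32],                       # tempest concurrency
    )
    for network, computes, concurrency in phase_space:
        config = {'network': network, 'computes': computes}
        cloud = deploy(hardware, config)
        results = run_tempest(cloud, concurrency)
        report(config, results)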
