Hi,
I forgot to add the new design doc to Makefile.am, so I'd like to include
the following interdiff:
diff --git a/Makefile.am b/Makefile.am
index fbeb9f2..2ce5b24 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -534,6 +534,7 @@ docinput = \
doc/design-optables.rst \
doc/design-ovf-support.rst \
doc/design-partitioned.rst \
+ doc/design-performance-tests.rst \
doc/design-query-splitting.rst \
doc/design-query2.rst \
doc/design-reason-trail.rst \
On Wed, Apr 16, 2014 at 2:47 PM, Thomas Thrainer <[email protected]> wrote:
> This design doc describes which tests are added in order to test the
> performance of Ganeti, specifically when handling multiple jobs in
> parallel.
>
> Note that this design doc is submitted to stable-2.10 so performance
> changes over different Ganeti versions can be captured.
>
> Signed-off-by: Thomas Thrainer <[email protected]>
> ---
>
> If you have additional test scenarios in mind, please share them
> with me. Ideally, also include a rationale for why a scenario is
> relevant.
>
> doc/design-draft.rst | 1 +
> doc/design-performance-tests.rst | 96 ++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 97 insertions(+)
> create mode 100644 doc/design-performance-tests.rst
>
> diff --git a/doc/design-draft.rst b/doc/design-draft.rst
> index 35f6c96..81faa80 100644
> --- a/doc/design-draft.rst
> +++ b/doc/design-draft.rst
> @@ -19,6 +19,7 @@ Design document drafts
> design-ceph-ganeti-support.rst
> design-daemons.rst
> design-hsqueeze.rst
> + design-performance-tests.rst
>
> .. vim: set textwidth=72 :
> .. Local Variables:
> diff --git a/doc/design-performance-tests.rst b/doc/design-performance-tests.rst
> new file mode 100644
> index 0000000..1f804e0
> --- /dev/null
> +++ b/doc/design-performance-tests.rst
> @@ -0,0 +1,96 @@
> +========================
> +Performance tests for QA
> +========================
> +
> +.. contents:: :depth: 4
> +
> +This design document describes performance tests to be added to QA in
> +order to measure performance changes over time.
> +
> +Current state and shortcomings
> +==============================
> +
> +Currently, only functional QA tests are performed. Those tests verify
> +the correct behaviour of Ganeti in various configurations, but are not
> +designed to continuously monitor the performance of Ganeti.
> +
> +The current QA tests don't execute multiple tasks/jobs in parallel.
> +Therefore, the locking part of Ganeti receives hardly any testing,
> +neither functional nor performance-related.
> +
> +On the plus side, Ganeti's QA code does already measure the runtime of
> +individual tests, which is leveraged in this design.
> +
> +Proposed changes
> +================
> +
> +The tests to be added in the context of this design document focus on
> +two areas:
> +
> + * Job queue performance. How does Ganeti handle a lot of submitted
> + jobs?
> + * Parallel job execution performance. How well does Ganeti
> + parallelize jobs?
> +
> +In order to make it easier to recognize performance related tests, all
> +tests added in the context of this design get a description with a
> +"PERFORMANCE: " prefix.
> +
> +Job queue performance
> +---------------------
> +
> +Tests targeting the job queue should eliminate external factors (like
> +network/disk performance or hypervisor delays) as much as possible, so
> +they are designed to run in a vcluster QA environment.
> +
> +The following tests are added to the QA:
> +
> + * Submit the maximum number of instance create jobs in parallel. As
> + soon as a creation job starts to run, submit a removal job for this
> + instance.
> + * Submit as many instance create jobs as there are nodes in the
> + cluster in parallel (for non-redundant instances). Removal jobs
> + as above.
> + * For the maximum number of instances in the cluster, submit modify
> + jobs (modify hypervisor and backend parameters) in parallel.
> + * For the maximum number of instances in the cluster, submit stop,
> + start, reboot and reinstall jobs in parallel.
> + * For the maximum number of instances in the cluster, submit multiple
> + list and info jobs in parallel.
> + * For the maximum number of instances in the cluster, submit move
> + jobs in parallel.
> + * For the maximum number of instances in the cluster, submit add-,
> + remove- and list-tags jobs.
> +
> +Parallel job execution performance
> +----------------------------------
> +
> +Tests targeting the performance of parallel execution of "real" jobs
> +in close-to-production clusters should actually perform all operations,
> +such as creating disks and starting instances. This way, real world
> +locking or waiting issues can be reproduced. Performing all those
> +operations does require quite some time though, so only a smaller
> +number of instances and parallel jobs can be tested realistically.
> +
> +The following tests are added to the QA:
> +
> + * Submit twice as many instance creation requests as there are
> + nodes in the cluster, using DRBD as disk template. As soon as a
> + creation job starts to run, submit a removal job for this instance.
> + * Create an instance using DRBD. Fail it over, migrate it, recreate
> + its disk and change its secondary node while creating an additional
> + instance in parallel to each of those operations.
> +
> +Future work
> +===========
> +
> +Based on test results of the tests listed above, additional tests can
> +be added to cover more real-world use-cases. Also, based on user
> +requests, specially crafted performance tests modeling those workloads
> +can be added too.
> +
> +.. vim: set textwidth=72 :
> +.. Local Variables:
> +.. mode: rst
> +.. fill-column: 72
> +.. End:
> --
> 1.9.1.423.g4596e3a
>
>
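
To make the "submit a removal job as soon as the creation job starts to
run" scenario from the design more concrete, here is a rough Python
sketch. It is a self-contained simulation and not Ganeti QA code: the
function name `run_create_remove_scenario`, the thread pool standing in
for the job queue, the `threading.Event` standing in for instance locks
and the short sleep standing in for real work are all assumptions made
for illustration. Only the "PERFORMANCE: " description prefix and the
idea of measuring the runtime come from the design doc.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_create_remove_scenario(instance_names, max_parallel=4):
    """Create instances in parallel; queue each removal once its creation runs.

    Hypothetical simulation: nothing here talks to a real cluster.
    """
    events = []              # ordered (action, instance) records
    lock = threading.Lock()  # guards events and removal_futures
    removal_futures = []
    created = {name: threading.Event() for name in instance_names}

    def record(action, name):
        with lock:
            events.append((action, name))

    def remove(name):
        # Stand-in for lock contention: the removal job has to wait
        # until the creation job has finished.
        created[name].wait()
        record("remove", name)

    def create(name, pool):
        # The creation job is running now, so submit the removal job.
        future = pool.submit(remove, name)
        with lock:
            removal_futures.append(future)
        try:
            time.sleep(0.01)  # simulated work (assumption)
            record("create", name)
        finally:
            created[name].set()  # never leave the removal job blocked

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        creations = [pool.submit(create, n, pool) for n in instance_names]
        for future in creations:
            future.result()  # re-raises any exception from the job
        # Each removal is submitted before its creation finishes, so the
        # list is complete once all creation jobs have returned.
        for future in list(removal_futures):
            future.result()
    elapsed = time.perf_counter() - start

    return {
        "description": "PERFORMANCE: parallel instance create/remove",
        "seconds": elapsed,
        "events": events,
    }
```

Calling `run_create_remove_scenario(["inst1", "inst2"], max_parallel=2)`
returns the measured wall-clock time together with the order in which
the simulated jobs completed. In a real test, the simulated work would
be replaced by actual job submission and the elapsed time would be
picked up by the existing QA runtime measurement.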
--
Thomas Thrainer | Software Engineer | [email protected] |