----- Original Message -----
> From: "Paolo Bonzini" <pbonz...@redhat.com>
> To: "Sebastian Tanase" <sebastian.tan...@openwide.fr>, qemu-devel@nongnu.org
> Cc: aligu...@amazon.com, afaer...@suse.de, r...@twiddle.net,
>     "peter maydell" <peter.mayd...@linaro.org>, mich...@walle.cc,
>     a...@alex.org.uk, stefa...@redhat.com, lcapitul...@redhat.com,
>     crobi...@redhat.com, arm...@redhat.com, wenchaoq...@gmail.com,
>     quint...@redhat.com, kw...@redhat.com, m...@redhat.com,
>     "pierre lemagourou" <pierre.lemagou...@openwide.fr>,
>     "jeremy rosen" <jeremy.ro...@openwide.fr>,
>     "camille begue" <camille.be...@openwide.fr>
> Sent: Friday, July 25, 2014 12:13:44
> Subject: Re: [PATCH V5 4/6] cpu_exec: Add sleeping algorithm
>
> On 25/07/2014 11:56, Sebastian Tanase wrote:
> > The goal is to put QEMU to sleep whenever the guest clock is ahead
> > of the host clock (we use the monotonic clocks). The amount of time
> > to sleep is calculated in the execution loop in cpu_exec.
> >
> > At first, we tried to approximate, at each iteration of the for
> > loop, the real time elapsed while searching for a TB (generating it
> > or retrieving it from the cache) and executing it. We would then
> > approximate the virtual time corresponding to the number of virtual
> > instructions executed. The difference between these 2 values would
> > tell us whether the guest is ahead or behind. However, the function
> > used for measuring the real time
> > (qemu_clock_get_ns(QEMU_CLOCK_REALTIME)) proved to be very
> > expensive: it added an overhead of 13% of the total run time.
> >
> > Therefore, we modified the algorithm so that it only takes the
> > difference between the 2 clocks into account at the beginning of
> > the cpu_exec function. During the for loop we try to reduce the
> > guest's advance only by computing the virtual time elapsed and
> > sleeping if necessary. The overhead is thus reduced to 3%.
> > Even though this method still has a noticeable overhead, it is no
> > longer a bottleneck when trying to achieve a better guest frequency
> > for which the guest clock is faster than the host one.
> >
> > As for the alignment of the 2 clocks, with the first algorithm the
> > guest clock was oscillating between -1 and 1 ms around the host
> > clock. With the second algorithm the guest is 5 ms behind the host,
> > which is still acceptable for our use case.
> >
> > The tests were conducted using fio and stress. The host machine is
> > an i5 CPU at 3.10GHz running Debian Jessie (kernel 3.12). The guest
> > machine is an arm versatile-pb built with buildroot.
> >
> > Currently, on our test machine, the lowest icount suitable for
> > aligning the 2 clocks is 6. However, we observe that the IO tests
> > (using fio) are slower than the cpu tests (using stress).
> >
> > Signed-off-by: Sebastian Tanase <sebastian.tan...@openwide.fr>
> > Tested-by: Camille Bégué <camille.be...@openwide.fr>
> > Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
> > ---
> >  cpu-exec.c           | 91 ++++++++++++++++++++++++++++++++++++++++
> >  cpus.c               | 17 ++++++++++
> >  include/qemu/timer.h |  1 +
> >  3 files changed, 109 insertions(+)
> >
> > diff --git a/cpu-exec.c b/cpu-exec.c
> > index 38e5f02..1a725b6 100644
> > --- a/cpu-exec.c
> > +++ b/cpu-exec.c
> > @@ -22,6 +22,84 @@
> >  #include "tcg.h"
> >  #include "qemu/atomic.h"
> >  #include "sysemu/qtest.h"
> > +#include "qemu/timer.h"
> > +
> > +/* -icount align implementation. */
> > +
> > +typedef struct SyncClocks {
> > +    int64_t diff_clk;
> > +    int64_t original_instr_counter;
> > +} SyncClocks;
> > +
> > +#if !defined(CONFIG_USER_ONLY)
> > +/* Allow the guest to have a max 3ms advance.
> > + * The difference between the 2 clocks could therefore
> > + * oscillate around 0.
> > + */
> > +#define VM_CLOCK_ADVANCE 3000000
> > +
> > +static int64_t delay_host(int64_t diff_clk)
> > +{
> > +    if (diff_clk > VM_CLOCK_ADVANCE) {
> > +#ifndef _WIN32
> > +        struct timespec sleep_delay, rem_delay;
> > +        sleep_delay.tv_sec = diff_clk / 1000000000LL;
> > +        sleep_delay.tv_nsec = diff_clk % 1000000000LL;
> > +        if (nanosleep(&sleep_delay, &rem_delay) < 0) {
> > +            diff_clk -= (sleep_delay.tv_sec - rem_delay.tv_sec) * 1000000000LL;
> > +            diff_clk -= sleep_delay.tv_nsec - rem_delay.tv_nsec;
> > +        } else {
> > +            diff_clk = 0;
> > +        }
> > +#else
> > +        Sleep(diff_clk / SCALE_MS);
> > +        diff_clk = 0;
> > +#endif
> > +    }
> > +    return diff_clk;
> > +}
> > +
> > +static int64_t instr_to_vtime(int64_t instr_counter, const CPUState *cpu)
> > +{
> > +    int64_t instr_exec_time;
> > +    instr_exec_time = instr_counter -
> > +                      (cpu->icount_extra +
> > +                       cpu->icount_decr.u16.low);
> > +    instr_exec_time = instr_exec_time << icount_time_shift;
> > +
> > +    return instr_exec_time;
> > +}
> > +
> > +static void align_clocks(SyncClocks *sc, const CPUState *cpu)
> > +{
> > +    if (!icount_align_option) {
> > +        return;
> > +    }
> > +    sc->diff_clk += instr_to_vtime(sc->original_instr_counter, cpu);
> > +    sc->original_instr_counter = cpu->icount_extra + cpu->icount_decr.u16.low;
> > +    sc->diff_clk = delay_host(sc->diff_clk);
> > +}
>
> Just two comments:
>
> 1) perhaps s/original/last/ in original_instr_counter?
>
> 2) I think I prefer this to be written like:
>
>     instr_counter = cpu->icount_extra + cpu->icount_decr.u16.low;
>     instr_exec_time = sc->original_instr_counter - instr_counter;
>     sc->original_instr_counter = instr_counter;
>     sc->diff_clk += instr_exec_time << icount_time_shift;
>     sc->diff_clk = delay_host(sc->diff_clk);
>
> If you agree, I can do it when applying the patches.
>
Sure, no problem.

> Thanks for your persistence, I'm very happy with this version!
>
> As a follow-up, do you think it's possible to modify the places where
> you run align_clocks, so that you sleep with the iothread mutex *not*
> taken?
>
> Paolo

I'll consider that and run some tests.

Thank you very much for your help and guidance.

Sebastian
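[Editor's note: Paolo's suggested rewrite of align_clocks() can be exercised outside QEMU with plain integers. The sketch below is illustrative only: the name align_clocks_sketch, the instr_counter parameter (standing in for cpu->icount_extra + cpu->icount_decr.u16.low), and the sample icount_time_shift value are assumptions, and the nanosleep() call in delay_host() is replaced by a return flag so the bookkeeping can be tested in isolation.]

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the patch: let the guest run at most 3 ms ahead (in ns). */
#define VM_CLOCK_ADVANCE 3000000LL

/* Sample value; in QEMU, icount_time_shift N means one guest
 * instruction accounts for 2^N ns of virtual time. */
static const int icount_time_shift = 3;

typedef struct SyncClocks {
    int64_t diff_clk;            /* guest clock minus host clock, in ns */
    int64_t last_instr_counter;  /* instruction budget at last alignment */
} SyncClocks;

/* Paolo's preferred formulation: derive the elapsed virtual time from
 * the drop in the remaining instruction budget, fold it into diff_clk,
 * and report whether delay_host() would have to sleep (the real code
 * would call nanosleep() at that point instead of returning a flag). */
static int align_clocks_sketch(SyncClocks *sc, int64_t instr_counter)
{
    int64_t instr_exec_time = sc->last_instr_counter - instr_counter;

    sc->last_instr_counter = instr_counter;
    sc->diff_clk += instr_exec_time << icount_time_shift;
    return sc->diff_clk > VM_CLOCK_ADVANCE;  /* 1 if a sleep is due */
}
```

Starting from a budget of 1000 instructions, a call with instr_counter == 600 records 400 executed instructions, i.e. 400 << 3 = 3200 ns added to diff_clk; a sleep only becomes due once diff_clk exceeds the 3 ms allowance.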