Re: [Qemu-devel] [PATCH 4/4] target-ppc: ensure we include the decrementer value during migration

2016-01-17 Thread David Gibson
On Fri, Jan 15, 2016 at 05:46:10PM +, Mark Cave-Ayland wrote:
> On 12/01/16 02:44, David Gibson wrote:
> 
> >>> In other words, isn't this just skipping the decrementer interrupts at
> >>> the qemu level rather than the guest level?
> >>>
> >>> It seems that instead we should be reconstructing the decrementer on
> >>> the destination based on an offset from the timebase.
> >>
> >> Well I haven't really looked at how time warping works during in
> >> migration for QEMU, however this seems to be the method used by
> >> hw/ppc/ppc.c's timebase_post_load() function but my understanding is
> >> that this isn't currently available for the g3beige/mac99 machines?
> > 
> > Ah.. yes, it looks like the timebase migration stuff is only hooked in
> > on the pseries machine type.  As far as I can tell it should be
> > trivial to add it to other machines though - it doesn't appear to rely
> > on anything outside the common ppc timebase stuff.
> > 
> >> Should the patch in fact do this but also add decrementer support? And
> >> if it did, would this have a negative effect on pseries?
> > 
> > Yes, I think that's the right approach.  Note that rather than
> > duplicating the logic to adjust the decrementer over migration, it
> > should be possible to encode the decrementer as a diff from the
> > timebase across the migration.
> > 
> > In fact.. I'm not sure it ever makes sense to store the decrementer
> > value as a direct value, since it's constantly changing - probably
> > makes more sense to derive it from the timebase whenever it is needed.
> > 
> > As far as I know that should be fine for pseries.  I think the current
> > behaviour is probably technically wrong for pseries as well, but the
> > timing code of our Linux guests is robust enough to handle a small
> > displacement to the time of the next decrementer interrupt.
> 
> I've had a bit of an experiment trying to implement something suitable,
> but I'm not 100% certain I've got this right.
> 
> >From the code my understanding is that the timebase is effectively
> free-running and so if a migration takes 5s then you use tb_offset to
> calculate the difference between the timebase before migration, and
> subsequently apply the offset for all future reads of the timebase for
> the lifetime of the CPU (i.e. the migrated guest is effectively living
> at a point in the past where the timebase is consistent).

Um.. no.  At least in the usual configuration, the timebase represents
real, wall-clock time, so we expect it to jump forward across the
migration downtime.  This is important because the guest will use the
timebase to calculate real time differences.

However, the absolute value of the timebase may be different on the
*host* between source and destination for migration.  So what we need
to do is before migration we work out the delta between host and guest
notions of wall clock time (as defined by the guest timebase), and
transfer that in the migration stream.

On the destination we initialize the guest timebase so that the guest
maintains the same realtime offset from the host.  This means that as
long as source and destination system time is synchronized, guest
real-time tracking will continue correctly across the migration.

We do also make sure that the guest timebase never goes backwards, but
that would only happen if the source and destination host times were
badly out of sync.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH 4/4] target-ppc: ensure we include the decrementer value during migration

2016-01-15 Thread Mark Cave-Ayland
On 12/01/16 02:44, David Gibson wrote:

>>> In other words, isn't this just skipping the decrementer interrupts at
>>> the qemu level rather than the guest level?
>>>
>>> It seems that instead we should be reconstructing the decrementer on
>>> the destination based on an offset from the timebase.
>>
>> Well I haven't really looked at how time warping works during in
>> migration for QEMU, however this seems to be the method used by
>> hw/ppc/ppc.c's timebase_post_load() function but my understanding is
>> that this isn't currently available for the g3beige/mac99 machines?
> 
> Ah.. yes, it looks like the timebase migration stuff is only hooked in
> on the pseries machine type.  As far as I can tell it should be
> trivial to add it to other machines though - it doesn't appear to rely
> on anything outside the common ppc timebase stuff.
> 
>> Should the patch in fact do this but also add decrementer support? And
>> if it did, would this have a negative effect on pseries?
> 
> Yes, I think that's the right approach.  Note that rather than
> duplicating the logic to adjust the decrementer over migration, it
> should be possible to encode the decrementer as a diff from the
> timebase across the migration.
> 
> In fact.. I'm not sure it ever makes sense to store the decrementer
> value as a direct value, since it's constantly changing - probably
> makes more sense to derive it from the timebase whenever it is needed.
> 
> As far as I know that should be fine for pseries.  I think the current
> behaviour is probably technically wrong for pseries as well, but the
> timing code of our Linux guests is robust enough to handle a small
> displacement to the time of the next decrementer interrupt.

I've had a bit of an experiment trying to implement something suitable,
but I'm not 100% certain I've got this right.

>From the code my understanding is that the timebase is effectively
free-running and so if a migration takes 5s then you use tb_offset to
calculate the difference between the timebase before migration, and
subsequently apply the offset for all future reads of the timebase for
the lifetime of the CPU (i.e. the migrated guest is effectively living
at a point in the past where the timebase is consistent).

So I do see that it would make more sense to switch the decrementer over
to an offset from timebase to avoid duplicating the time-shift migration
logic, but my PPC experience is fairly limited. Any particular pointers
as to what is the best way to approach this? It seems getting this done
separately first will make the changes in the last patch trivial.


Many thanks,

Mark.




Re: [Qemu-devel] [PATCH 4/4] target-ppc: ensure we include the decrementer value during migration

2016-01-11 Thread David Gibson
On Mon, Jan 11, 2016 at 07:43:54AM +, Mark Cave-Ayland wrote:
> On 11/01/16 04:55, David Gibson wrote:
> 
> > On Mon, Jan 11, 2016 at 12:18:31PM +1100, Alexey Kardashevskiy wrote:
> >> On 01/09/2016 01:21 AM, Mark Cave-Ayland wrote:
> >>> On 08/01/16 02:47, Alexey Kardashevskiy wrote:
> >>>
>  On 01/07/2016 05:22 AM, Mark Cave-Ayland wrote:
> > During local testing with TCG, intermittent errors were found when
> > trying to
> > migrate Darwin OS images.
> >
> > The underlying cause was that Darwin resets the decrementer value to
> > fairly
> > small values on each interrupt. cpu_ppc_set_tb_clk() sets the default
> > value
> > of the decrementer to 0x during initialisation which typically
> > corresponds to several seconds. Hence when restoring the image, the 
> > guest
> > would effectively "lose" decrementer interrupts during this time causing
> > confusion in the guest.
> >
> > NOTE: there does seem to be some overlap here with the
> > vmstate_ppc_timebase
> > code, however it doesn't seem to handle multiple CPUs which is why
> > I've gone
> > for an independent implementation.
> 
>  It does handle multiple CPUs:
> 
>  static int timebase_post_load(void *opaque, int version_id)
>  {
>  ...
>   /* Set new offset to all CPUs */
>   CPU_FOREACH(cpu) {
>   PowerPCCPU *pcpu = POWERPC_CPU(cpu);
>   pcpu->env.tb_env->tb_offset = tb_off_adj;
>   }
> 
> 
>  It does not transfer DECR though, it transfers the offset instead, one
>  for all CPUs.
> 
> 
>  The easier solution would be just adding this instead of the whole patch:
> 
>  spr_register(env, SPR_DECR, "DECR",
>   SPR_NOACCESS, SPR_NOACCESS,
>   _read_decr, _write_decr,
>   0x);
> 
>  somewhere in target-ppc/translate_init.c for CPUs you want to support,
>  gen_tbl() seems to be the right place for this. As long as it is just
>  spr_register() and not spr_register_kvm(), I suppose it should not break
>  KVM and pseries.
> >>>
> >>> I've just tried adding that but it then gives the following error on
> >>> startup:
> >>>
> >>> Error: Trying to register SPR 22 (016) twice !
> >>>
> >>> Based upon this, the existing registration seems fine. I think the
> >>> problem is that I can't see anything in __cpu_ppc_store_decr() that
> >>> updates the spr[SPR_DECR] value when the decrementer register is
> >>> changed, so it needs to be explicitly added to
> >>> cpu_pre_save()/cpu_post_load():
> >>>
> >>>
> >>> diff --git a/target-ppc/machine.c b/target-ppc/machine.c
> >>> index 251a84b..495e58d 100644
> >>> --- a/target-ppc/machine.c
> >>> +++ b/target-ppc/machine.c
> >>> @@ -141,6 +141,7 @@ static void cpu_pre_save(void *opaque)
> >>>  env->spr[SPR_CFAR] = env->cfar;
> >>>  #endif
> >>>  env->spr[SPR_BOOKE_SPEFSCR] = env->spe_fscr;
> >>> +env->spr[SPR_DECR] = cpu_ppc_load_decr(env);
> >>>
> >>>  for (i = 0; (i < 4) && (i < env->nb_BATs); i++) {
> >>>  env->spr[SPR_DBAT0U + 2*i] = env->DBAT[0][i];
> >>> @@ -175,6 +176,7 @@ static int cpu_post_load(void *opaque, int version_id)
> >>>  env->cfar = env->spr[SPR_CFAR];
> >>>  #endif
> >>>  env->spe_fscr = env->spr[SPR_BOOKE_SPEFSCR];
> >>> +cpu_ppc_store_decr(env, env->spr[SPR_DECR]);
> >>>
> >>>  for (i = 0; (i < 4) && (i < env->nb_BATs); i++) {
> >>>  env->DBAT[0][i] = env->spr[SPR_DBAT0U + 2*i];
> >>>
> >>>
> >>> I've just tried the diff above instead of my original version and
> >>> repeated my savevm/loadvm pair test with a Darwin installation and it
> >>> also fixes the random hang I was seeing on loadvm.
> >>>
> >>> Seemingly this should work on KVM in that cpu_ppc_load_decr() and
> >>> cpu_ppc_store_decr() become no-ops as long as KVM maintains
> >>> env->spr[SPR_DECR], but a second set of eyeballs would be useful here.
> >>>
> >>> If you can let me know if this is suitable then I'll update the patchset
> >>> based upon your feedback and send out a v2.
> >>
> >>
> >> Looks good to me, I'd just wait a day or two in the case if David wants to
> >> comment.
> > 
> > I was on holiday and missed the start of this thread, I'm afraid, so I
> > don't fully understand the context here.
> 
> It's part of a patchset I posted which fixes up problems I had with
> migrating g3beige/mac99 machines under TCG:
> https://lists.gnu.org/archive/html/qemu-devel/2016-01/msg00544.html.
> 
> Apologies for not adding you as CC directly - are you still helping to
> cover ppc-next for Alex?

Yes, I am.

> > Am I right in thinking that this change will essentially freeze the
> > decrementer across the migration downtime?  That doesn't seem right,
> > since the decrementer is supposed to be linked to the timebase and
> > represent real time passing.
> 
> Yes, that's correct.
> 
> > In other words, isn't this just skipping 

Re: [Qemu-devel] [PATCH 4/4] target-ppc: ensure we include the decrementer value during migration

2016-01-10 Thread Mark Cave-Ayland
On 11/01/16 04:55, David Gibson wrote:

> On Mon, Jan 11, 2016 at 12:18:31PM +1100, Alexey Kardashevskiy wrote:
>> On 01/09/2016 01:21 AM, Mark Cave-Ayland wrote:
>>> On 08/01/16 02:47, Alexey Kardashevskiy wrote:
>>>
 On 01/07/2016 05:22 AM, Mark Cave-Ayland wrote:
> During local testing with TCG, intermittent errors were found when
> trying to
> migrate Darwin OS images.
>
> The underlying cause was that Darwin resets the decrementer value to
> fairly
> small values on each interrupt. cpu_ppc_set_tb_clk() sets the default
> value
> of the decrementer to 0x during initialisation which typically
> corresponds to several seconds. Hence when restoring the image, the guest
> would effectively "lose" decrementer interrupts during this time causing
> confusion in the guest.
>
> NOTE: there does seem to be some overlap here with the
> vmstate_ppc_timebase
> code, however it doesn't seem to handle multiple CPUs which is why
> I've gone
> for an independent implementation.

 It does handle multiple CPUs:

 static int timebase_post_load(void *opaque, int version_id)
 {
 ...
  /* Set new offset to all CPUs */
  CPU_FOREACH(cpu) {
  PowerPCCPU *pcpu = POWERPC_CPU(cpu);
  pcpu->env.tb_env->tb_offset = tb_off_adj;
  }


 It does not transfer DECR though, it transfers the offset instead, one
 for all CPUs.


 The easier solution would be just adding this instead of the whole patch:

 spr_register(env, SPR_DECR, "DECR",
  SPR_NOACCESS, SPR_NOACCESS,
  _read_decr, _write_decr,
  0x);

 somewhere in target-ppc/translate_init.c for CPUs you want to support,
 gen_tbl() seems to be the right place for this. As long as it is just
 spr_register() and not spr_register_kvm(), I suppose it should not break
 KVM and pseries.
>>>
>>> I've just tried adding that but it then gives the following error on
>>> startup:
>>>
>>> Error: Trying to register SPR 22 (016) twice !
>>>
>>> Based upon this, the existing registration seems fine. I think the
>>> problem is that I can't see anything in __cpu_ppc_store_decr() that
>>> updates the spr[SPR_DECR] value when the decrementer register is
>>> changed, so it needs to be explicitly added to
>>> cpu_pre_save()/cpu_post_load():
>>>
>>>
>>> diff --git a/target-ppc/machine.c b/target-ppc/machine.c
>>> index 251a84b..495e58d 100644
>>> --- a/target-ppc/machine.c
>>> +++ b/target-ppc/machine.c
>>> @@ -141,6 +141,7 @@ static void cpu_pre_save(void *opaque)
>>>  env->spr[SPR_CFAR] = env->cfar;
>>>  #endif
>>>  env->spr[SPR_BOOKE_SPEFSCR] = env->spe_fscr;
>>> +env->spr[SPR_DECR] = cpu_ppc_load_decr(env);
>>>
>>>  for (i = 0; (i < 4) && (i < env->nb_BATs); i++) {
>>>  env->spr[SPR_DBAT0U + 2*i] = env->DBAT[0][i];
>>> @@ -175,6 +176,7 @@ static int cpu_post_load(void *opaque, int version_id)
>>>  env->cfar = env->spr[SPR_CFAR];
>>>  #endif
>>>  env->spe_fscr = env->spr[SPR_BOOKE_SPEFSCR];
>>> +cpu_ppc_store_decr(env, env->spr[SPR_DECR]);
>>>
>>>  for (i = 0; (i < 4) && (i < env->nb_BATs); i++) {
>>>  env->DBAT[0][i] = env->spr[SPR_DBAT0U + 2*i];
>>>
>>>
>>> I've just tried the diff above instead of my original version and
>>> repeated my savevm/loadvm pair test with a Darwin installation and it
>>> also fixes the random hang I was seeing on loadvm.
>>>
>>> Seemingly this should work on KVM in that cpu_ppc_load_decr() and
>>> cpu_ppc_store_decr() become no-ops as long as KVM maintains
>>> env->spr[SPR_DECR], but a second set of eyeballs would be useful here.
>>>
>>> If you can let me know if this is suitable then I'll update the patchset
>>> based upon your feedback and send out a v2.
>>
>>
>> Looks good to me, I'd just wait a day or two in the case if David wants to
>> comment.
> 
> I was on holiday and missed the start of this thread, I'm afraid, so I
> don't fully understand the context here.

It's part of a patchset I posted which fixes up problems I had with
migrating g3beige/mac99 machines under TCG:
https://lists.gnu.org/archive/html/qemu-devel/2016-01/msg00544.html.

Apologies for not adding you as CC directly - are you still helping to
cover ppc-next for Alex?

> Am I right in thinking that this change will essentially freeze the
> decrementer across the migration downtime?  That doesn't seem right,
> since the decrementer is supposed to be linked to the timebase and
> represent real time passing.

Yes, that's correct.

> In other words, isn't this just skipping the decrementer interrupts at
> the qemu level rather than the guest level?
> 
> It seems that instead we should be reconstructing the decrementer on
> the destination based on an offset from the timebase.

Well I haven't really looked at how time warping works during in
migration for QEMU, however this seems to be the method 

Re: [Qemu-devel] [PATCH 4/4] target-ppc: ensure we include the decrementer value during migration

2016-01-10 Thread David Gibson
On Mon, Jan 11, 2016 at 12:18:31PM +1100, Alexey Kardashevskiy wrote:
> On 01/09/2016 01:21 AM, Mark Cave-Ayland wrote:
> >On 08/01/16 02:47, Alexey Kardashevskiy wrote:
> >
> >>On 01/07/2016 05:22 AM, Mark Cave-Ayland wrote:
> >>>During local testing with TCG, intermittent errors were found when
> >>>trying to
> >>>migrate Darwin OS images.
> >>>
> >>>The underlying cause was that Darwin resets the decrementer value to
> >>>fairly
> >>>small values on each interrupt. cpu_ppc_set_tb_clk() sets the default
> >>>value
> >>>of the decrementer to 0x during initialisation which typically
> >>>corresponds to several seconds. Hence when restoring the image, the guest
> >>>would effectively "lose" decrementer interrupts during this time causing
> >>>confusion in the guest.
> >>>
> >>>NOTE: there does seem to be some overlap here with the
> >>>vmstate_ppc_timebase
> >>>code, however it doesn't seem to handle multiple CPUs which is why
> >>>I've gone
> >>>for an independent implementation.
> >>
> >>It does handle multiple CPUs:
> >>
> >>static int timebase_post_load(void *opaque, int version_id)
> >>{
> >>...
> >>  /* Set new offset to all CPUs */
> >>  CPU_FOREACH(cpu) {
> >>  PowerPCCPU *pcpu = POWERPC_CPU(cpu);
> >>  pcpu->env.tb_env->tb_offset = tb_off_adj;
> >>  }
> >>
> >>
> >>It does not transfer DECR though, it transfers the offset instead, one
> >>for all CPUs.
> >>
> >>
> >>The easier solution would be just adding this instead of the whole patch:
> >>
> >>spr_register(env, SPR_DECR, "DECR",
> >>  SPR_NOACCESS, SPR_NOACCESS,
> >>  _read_decr, _write_decr,
> >>  0x);
> >>
> >>somewhere in target-ppc/translate_init.c for CPUs you want to support,
> >>gen_tbl() seems to be the right place for this. As long as it is just
> >>spr_register() and not spr_register_kvm(), I suppose it should not break
> >>KVM and pseries.
> >
> >I've just tried adding that but it then gives the following error on
> >startup:
> >
> >Error: Trying to register SPR 22 (016) twice !
> >
> >Based upon this, the existing registration seems fine. I think the
> >problem is that I can't see anything in __cpu_ppc_store_decr() that
> >updates the spr[SPR_DECR] value when the decrementer register is
> >changed, so it needs to be explicitly added to
> >cpu_pre_save()/cpu_post_load():
> >
> >
> >diff --git a/target-ppc/machine.c b/target-ppc/machine.c
> >index 251a84b..495e58d 100644
> >--- a/target-ppc/machine.c
> >+++ b/target-ppc/machine.c
> >@@ -141,6 +141,7 @@ static void cpu_pre_save(void *opaque)
> >  env->spr[SPR_CFAR] = env->cfar;
> >  #endif
> >  env->spr[SPR_BOOKE_SPEFSCR] = env->spe_fscr;
> >+env->spr[SPR_DECR] = cpu_ppc_load_decr(env);
> >
> >  for (i = 0; (i < 4) && (i < env->nb_BATs); i++) {
> >  env->spr[SPR_DBAT0U + 2*i] = env->DBAT[0][i];
> >@@ -175,6 +176,7 @@ static int cpu_post_load(void *opaque, int version_id)
> >  env->cfar = env->spr[SPR_CFAR];
> >  #endif
> >  env->spe_fscr = env->spr[SPR_BOOKE_SPEFSCR];
> >+cpu_ppc_store_decr(env, env->spr[SPR_DECR]);
> >
> >  for (i = 0; (i < 4) && (i < env->nb_BATs); i++) {
> >  env->DBAT[0][i] = env->spr[SPR_DBAT0U + 2*i];
> >
> >
> >I've just tried the diff above instead of my original version and
> >repeated my savevm/loadvm pair test with a Darwin installation and it
> >also fixes the random hang I was seeing on loadvm.
> >
> >Seemingly this should work on KVM in that cpu_ppc_load_decr() and
> >cpu_ppc_store_decr() become no-ops as long as KVM maintains
> >env->spr[SPR_DECR], but a second set of eyeballs would be useful here.
> >
> >If you can let me know if this is suitable then I'll update the patchset
> >based upon your feedback and send out a v2.
> 
> 
> Looks good to me, I'd just wait a day or two in the case if David wants to
> comment.

I was on holiday and missed the start of this thread, I'm afraid, so I
don't fully understand the context here.

Am I right in thinking that this change will essentially freeze the
decrementer across the migration downtime?  That doesn't seem right,
since the decrementer is supposed to be linked to the timebase and
represent real time passing.

In other words, isn't this just skipping the decrementer interrupts at
the qemu level rather than the guest level?

It seems that instead we should be reconstructing the decrementer on
the destination based on an offset from the timebase.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH 4/4] target-ppc: ensure we include the decrementer value during migration

2016-01-10 Thread Alexey Kardashevskiy

On 01/09/2016 01:21 AM, Mark Cave-Ayland wrote:

On 08/01/16 02:47, Alexey Kardashevskiy wrote:


On 01/07/2016 05:22 AM, Mark Cave-Ayland wrote:

During local testing with TCG, intermittent errors were found when
trying to
migrate Darwin OS images.

The underlying cause was that Darwin resets the decrementer value to
fairly
small values on each interrupt. cpu_ppc_set_tb_clk() sets the default
value
of the decrementer to 0x during initialisation which typically
corresponds to several seconds. Hence when restoring the image, the guest
would effectively "lose" decrementer interrupts during this time causing
confusion in the guest.

NOTE: there does seem to be some overlap here with the
vmstate_ppc_timebase
code, however it doesn't seem to handle multiple CPUs which is why
I've gone
for an independent implementation.


It does handle multiple CPUs:

static int timebase_post_load(void *opaque, int version_id)
{
...
  /* Set new offset to all CPUs */
  CPU_FOREACH(cpu) {
  PowerPCCPU *pcpu = POWERPC_CPU(cpu);
  pcpu->env.tb_env->tb_offset = tb_off_adj;
  }


It does not transfer DECR though, it transfers the offset instead, one
for all CPUs.


The easier solution would be just adding this instead of the whole patch:

spr_register(env, SPR_DECR, "DECR",
  SPR_NOACCESS, SPR_NOACCESS,
  _read_decr, _write_decr,
  0x);

somewhere in target-ppc/translate_init.c for CPUs you want to support,
gen_tbl() seems to be the right place for this. As long as it is just
spr_register() and not spr_register_kvm(), I suppose it should not break
KVM and pseries.


I've just tried adding that but it then gives the following error on
startup:

Error: Trying to register SPR 22 (016) twice !

Based upon this, the existing registration seems fine. I think the
problem is that I can't see anything in __cpu_ppc_store_decr() that
updates the spr[SPR_DECR] value when the decrementer register is
changed, so it needs to be explicitly added to
cpu_pre_save()/cpu_post_load():


diff --git a/target-ppc/machine.c b/target-ppc/machine.c
index 251a84b..495e58d 100644
--- a/target-ppc/machine.c
+++ b/target-ppc/machine.c
@@ -141,6 +141,7 @@ static void cpu_pre_save(void *opaque)
  env->spr[SPR_CFAR] = env->cfar;
  #endif
  env->spr[SPR_BOOKE_SPEFSCR] = env->spe_fscr;
+env->spr[SPR_DECR] = cpu_ppc_load_decr(env);

  for (i = 0; (i < 4) && (i < env->nb_BATs); i++) {
  env->spr[SPR_DBAT0U + 2*i] = env->DBAT[0][i];
@@ -175,6 +176,7 @@ static int cpu_post_load(void *opaque, int version_id)
  env->cfar = env->spr[SPR_CFAR];
  #endif
  env->spe_fscr = env->spr[SPR_BOOKE_SPEFSCR];
+cpu_ppc_store_decr(env, env->spr[SPR_DECR]);

  for (i = 0; (i < 4) && (i < env->nb_BATs); i++) {
  env->DBAT[0][i] = env->spr[SPR_DBAT0U + 2*i];


I've just tried the diff above instead of my original version and
repeated my savevm/loadvm pair test with a Darwin installation and it
also fixes the random hang I was seeing on loadvm.

Seemingly this should work on KVM in that cpu_ppc_load_decr() and
cpu_ppc_store_decr() become no-ops as long as KVM maintains
env->spr[SPR_DECR], but a second set of eyeballs would be useful here.

If you can let me know if this is suitable then I'll update the patchset
based upon your feedback and send out a v2.



Looks good to me, I'd just wait a day or two in the case if David wants to 
comment.




--
Alexey



Re: [Qemu-devel] [PATCH 4/4] target-ppc: ensure we include the decrementer value during migration

2016-01-08 Thread Mark Cave-Ayland
On 08/01/16 02:47, Alexey Kardashevskiy wrote:

> On 01/07/2016 05:22 AM, Mark Cave-Ayland wrote:
>> During local testing with TCG, intermittent errors were found when
>> trying to
>> migrate Darwin OS images.
>>
>> The underlying cause was that Darwin resets the decrementer value to
>> fairly
>> small values on each interrupt. cpu_ppc_set_tb_clk() sets the default
>> value
>> of the decrementer to 0x during initialisation which typically
>> corresponds to several seconds. Hence when restoring the image, the guest
>> would effectively "lose" decrementer interrupts during this time causing
>> confusion in the guest.
>>
>> NOTE: there does seem to be some overlap here with the
>> vmstate_ppc_timebase
>> code, however it doesn't seem to handle multiple CPUs which is why
>> I've gone
>> for an independent implementation.
> 
> It does handle multiple CPUs:
> 
> static int timebase_post_load(void *opaque, int version_id)
> {
> ...
>  /* Set new offset to all CPUs */
>  CPU_FOREACH(cpu) {
>  PowerPCCPU *pcpu = POWERPC_CPU(cpu);
>  pcpu->env.tb_env->tb_offset = tb_off_adj;
>  }
> 
> 
> It does not transfer DECR though, it transfers the offset instead, one
> for all CPUs.
> 
> 
> The easier solution would be just adding this instead of the whole patch:
> 
> spr_register(env, SPR_DECR, "DECR",
>  SPR_NOACCESS, SPR_NOACCESS,
>  _read_decr, _write_decr,
>  0x);
> 
> somewhere in target-ppc/translate_init.c for CPUs you want to support,
> gen_tbl() seems to be the right place for this. As long as it is just
> spr_register() and not spr_register_kvm(), I suppose it should not break
> KVM and pseries.

I've just tried adding that but it then gives the following error on
startup:

Error: Trying to register SPR 22 (016) twice !

Based upon this, the existing registration seems fine. I think the
problem is that I can't see anything in __cpu_ppc_store_decr() that
updates the spr[SPR_DECR] value when the decrementer register is
changed, so it needs to be explicitly added to
cpu_pre_save()/cpu_post_load():


diff --git a/target-ppc/machine.c b/target-ppc/machine.c
index 251a84b..495e58d 100644
--- a/target-ppc/machine.c
+++ b/target-ppc/machine.c
@@ -141,6 +141,7 @@ static void cpu_pre_save(void *opaque)
 env->spr[SPR_CFAR] = env->cfar;
 #endif
 env->spr[SPR_BOOKE_SPEFSCR] = env->spe_fscr;
+env->spr[SPR_DECR] = cpu_ppc_load_decr(env);

 for (i = 0; (i < 4) && (i < env->nb_BATs); i++) {
 env->spr[SPR_DBAT0U + 2*i] = env->DBAT[0][i];
@@ -175,6 +176,7 @@ static int cpu_post_load(void *opaque, int version_id)
 env->cfar = env->spr[SPR_CFAR];
 #endif
 env->spe_fscr = env->spr[SPR_BOOKE_SPEFSCR];
+cpu_ppc_store_decr(env, env->spr[SPR_DECR]);

 for (i = 0; (i < 4) && (i < env->nb_BATs); i++) {
 env->DBAT[0][i] = env->spr[SPR_DBAT0U + 2*i];


I've just tried the diff above instead of my original version and
repeated my savevm/loadvm pair test with a Darwin installation and it
also fixes the random hang I was seeing on loadvm.

Seemingly this should work on KVM in that cpu_ppc_load_decr() and
cpu_ppc_store_decr() become no-ops as long as KVM maintains
env->spr[SPR_DECR], but a second set of eyeballs would be useful here.

If you can let me know if this is suitable then I'll update the patchset
based upon your feedback and send out a v2.


ATB,

Mark.




Re: [Qemu-devel] [PATCH 4/4] target-ppc: ensure we include the decrementer value during migration

2016-01-07 Thread Alexey Kardashevskiy

On 01/07/2016 05:22 AM, Mark Cave-Ayland wrote:

During local testing with TCG, intermittent errors were found when trying to
migrate Darwin OS images.

The underlying cause was that Darwin resets the decrementer value to fairly
small values on each interrupt. cpu_ppc_set_tb_clk() sets the default value
of the decrementer to 0x during initialisation which typically
corresponds to several seconds. Hence when restoring the image, the guest
would effectively "lose" decrementer interrupts during this time causing
confusion in the guest.

NOTE: there does seem to be some overlap here with the vmstate_ppc_timebase
code, however it doesn't seem to handle multiple CPUs which is why I've gone
for an independent implementation.


It does handle multiple CPUs:

static int timebase_post_load(void *opaque, int version_id)
{
...
 /* Set new offset to all CPUs */
 CPU_FOREACH(cpu) {
 PowerPCCPU *pcpu = POWERPC_CPU(cpu);
 pcpu->env.tb_env->tb_offset = tb_off_adj;
 }


It does not transfer DECR though, it transfers the offset instead, one for 
all CPUs.



The easier solution would be just adding this instead of the whole patch:

spr_register(env, SPR_DECR, "DECR",
 SPR_NOACCESS, SPR_NOACCESS,
 _read_decr, _write_decr,
 0x);

somewhere in target-ppc/translate_init.c for CPUs you want to support, 
gen_tbl() seems to be the right place for this. As long as it is just 
spr_register() and not spr_register_kvm(), I suppose it should not break 
KVM and pseries.







Signed-off-by: Mark Cave-Ayland 
---
  target-ppc/machine.c |   49 +
  1 file changed, 49 insertions(+)

diff --git a/target-ppc/machine.c b/target-ppc/machine.c
index cb56423..5ee6269 100644
--- a/target-ppc/machine.c
+++ b/target-ppc/machine.c
@@ -499,6 +499,54 @@ static const VMStateDescription vmstate_tlbmas = {
  }
  };

+static bool decr_needed(void *opaque)
+{
+return true;
+}
+
+static int get_decr_state(QEMUFile *f, void *opaque, size_t size)
+{
+PowerPCCPU *cpu = opaque;
+CPUPPCState *env = >env;
+
+cpu_ppc_store_decr(env, qemu_get_be32(f));
+
+return 0;
+}
+
+static void put_decr_state(QEMUFile *f, void *opaque, size_t size)
+{
+PowerPCCPU *cpu = opaque;
+CPUPPCState *env = >env;
+
+qemu_put_be32(f, cpu_ppc_load_decr(env));
+}
+
+static const VMStateInfo vmstate_info_decr = {
+.name = "decr_state",
+.get = get_decr_state,
+.put = put_decr_state
+};
+
+static const VMStateDescription vmstate_decr = {
+.name = "cpu/decr",
+.version_id = 0,
+.minimum_version_id = 0,
+.needed = decr_needed,
+.fields = (VMStateField[]) {
+{
+.name = "cpu/decr",
+.version_id   = 0,
+.field_exists = NULL,
+.size = 0,
+.info = _info_decr,
+.flags= VMS_SINGLE,
+.offset   = 0,
+},
+VMSTATE_END_OF_LIST()
+}
+};
+
  const VMStateDescription vmstate_ppc_cpu = {
  .name = "cpu",
  .version_id = 5,
@@ -555,6 +603,7 @@ const VMStateDescription vmstate_ppc_cpu = {
  _tlb6xx,
  _tlbemb,
  _tlbmas,
+_decr,
  NULL
  }
  };




--
Alexey