Re: [Qemu-devel] [Qemu-ppc] [PATCH v8 4/6] target/ppc: Build rtas error log upon an MCE

2019-05-13 Thread Greg Kurz
On Mon, 22 Apr 2019 12:33:26 +0530
Aravinda Prasad  wrote:

> Upon a machine check exception (MCE) in a guest address space,
> KVM causes a guest exit to enable QEMU to build and pass the
> error to the guest in the PAPR defined rtas error log format.
> 
> This patch builds the rtas error log, copies it to the rtas_addr
> and then invokes the guest registered machine check handler. The
> handler in the guest takes suitable action(s) depending on the type
> and criticality of the error. For example, if an error is
> unrecoverable memory corruption in an application inside the
> guest, then the guest kernel sends a SIGBUS to the application.
> For recoverable errors, the guest performs recovery actions and
> logs the error.
> 
> Signed-off-by: Aravinda Prasad 
> ---
>  hw/ppc/spapr.c |4 +
>  hw/ppc/spapr_events.c  |  245 
> 
>  include/hw/ppc/spapr.h |4 +
>  3 files changed, 253 insertions(+)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 2779efe..ffd1715 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -2918,6 +2918,10 @@ static void spapr_machine_init(MachineState *machine)
>  error_report("Could not get size of LPAR rtas '%s'", filename);
>  exit(1);
>  }
> +
> +/* Resize blob to accommodate error log. */
> +spapr->rtas_size = spapr_get_rtas_size(spapr->rtas_size);
> +

This is the only user for spapr_get_rtas_size(), which is trivial.
I suggest you simply open-code it here.

But also, spapr->rtas_size is a guest visible thing, "rtas-size" prop in the
DT. Since existing machine types don't do that, I guess we should only use
the new size if cap-fwnmi-mce=on for the sake of compatibility.

>  spapr->rtas_blob = g_malloc(spapr->rtas_size);
>  if (load_image_size(filename, spapr->rtas_blob, spapr->rtas_size) < 0) {
>  error_report("Could not load LPAR rtas '%s'", filename);
> diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
> index 9922a23..4032db0 100644
> --- a/hw/ppc/spapr_events.c
> +++ b/hw/ppc/spapr_events.c
> @@ -212,6 +212,106 @@ struct hp_extended_log {
>  struct rtas_event_log_v6_hp hp;
>  } QEMU_PACKED;
>  
> +struct rtas_event_log_v6_mc {

Even if the rest of the code in this file seems to ignore CODING_STYLE,
maybe it's time to start using CamelCase.

David ?

> +#define RTAS_LOG_V6_SECTION_ID_MC   0x4D43 /* MC */
> +struct rtas_event_log_v6_section_header hdr;
> +uint32_t fru_id;
> +uint32_t proc_id;
> +uint8_t error_type;
> +#define RTAS_LOG_V6_MC_TYPE_UE   0
> +#define RTAS_LOG_V6_MC_TYPE_SLB  1
> +#define RTAS_LOG_V6_MC_TYPE_ERAT 2
> +#define RTAS_LOG_V6_MC_TYPE_TLB  4
> +#define RTAS_LOG_V6_MC_TYPE_D_CACHE  5
> +#define RTAS_LOG_V6_MC_TYPE_I_CACHE  7
> +uint8_t sub_err_type;
> +#define RTAS_LOG_V6_MC_UE_INDETERMINATE  0
> +#define RTAS_LOG_V6_MC_UE_IFETCH 1
> +#define RTAS_LOG_V6_MC_UE_PAGE_TABLE_WALK_IFETCH 2
> +#define RTAS_LOG_V6_MC_UE_LOAD_STORE 3
> +#define RTAS_LOG_V6_MC_UE_PAGE_TABLE_WALK_LOAD_STORE 4
> +#define RTAS_LOG_V6_MC_SLB_PARITY0
> +#define RTAS_LOG_V6_MC_SLB_MULTIHIT  1
> +#define RTAS_LOG_V6_MC_SLB_INDETERMINATE 2
> +#define RTAS_LOG_V6_MC_ERAT_PARITY   1
> +#define RTAS_LOG_V6_MC_ERAT_MULTIHIT 2
> +#define RTAS_LOG_V6_MC_ERAT_INDETERMINATE3
> +#define RTAS_LOG_V6_MC_TLB_PARITY1
> +#define RTAS_LOG_V6_MC_TLB_MULTIHIT  2
> +#define RTAS_LOG_V6_MC_TLB_INDETERMINATE 3
> +uint8_t reserved_1[6];
> +uint64_t effective_address;
> +uint64_t logical_address;
> +} QEMU_PACKED;
> +
> +struct mc_extended_log {
> +struct rtas_event_log_v6 v6hdr;
> +struct rtas_event_log_v6_mc mc;
> +} QEMU_PACKED;
> +
> +struct MC_ierror_table {
> +unsigned long srr1_mask;
> +unsigned long srr1_value;
> +bool nip_valid; /* nip is a valid indicator of faulting address */
> +uint8_t error_type;
> +uint8_t error_subtype;
> +unsigned int initiator;
> +unsigned int severity;
> +};
> +
> +static const struct MC_ierror_table mc_ierror_table[] = {
> +{ 0x081c, 0x0004, true,
> +  RTAS_LOG_V6_MC_TYPE_UE, RTAS_LOG_V6_MC_UE_IFETCH,
> +  RTAS_LOG_INITIATOR_CPU, RTAS_LOG_SEVERITY_ERROR_SYNC, },
> +{ 0x081c, 0x0008, true,
> +  RTAS_LOG_V6_MC_TYPE_SLB, RTAS_LOG_V6_MC_SLB_PARITY,
> +  RTAS_LOG_INITIATOR_CPU, RTAS_LOG_SEVERITY_ERROR_SYNC, },
> +{ 0x081c, 0x000c, true,
> +  RTAS_LOG_V6_MC_TYPE_SLB, RTAS_LOG_V6_MC_SLB_MULTIHIT,
> +  RTAS_LOG_INITIATOR_CPU, RTAS_LOG_SEVERITY_ERROR_SYNC, },
> +{ 0x081c, 0x0010, true,
> +  RTAS_LOG_

Re: [Qemu-devel] [Qemu-ppc] [PATCH v8 4/6] target/ppc: Build rtas error log upon an MCE

2019-05-13 Thread David Gibson
On Mon, May 13, 2019 at 01:30:53PM +0200, Greg Kurz wrote:
> On Mon, 22 Apr 2019 12:33:26 +0530
> Aravinda Prasad  wrote:
> 
> > Upon a machine check exception (MCE) in a guest address space,
> > KVM causes a guest exit to enable QEMU to build and pass the
> > error to the guest in the PAPR defined rtas error log format.
> > 
> > This patch builds the rtas error log, copies it to the rtas_addr
> > and then invokes the guest registered machine check handler. The
> > handler in the guest takes suitable action(s) depending on the type
> > and criticality of the error. For example, if an error is
> > unrecoverable memory corruption in an application inside the
> > guest, then the guest kernel sends a SIGBUS to the application.
> > For recoverable errors, the guest performs recovery actions and
> > logs the error.
> > 
> > Signed-off-by: Aravinda Prasad 
> > ---
> >  hw/ppc/spapr.c |4 +
> >  hw/ppc/spapr_events.c  |  245 
> > 
> >  include/hw/ppc/spapr.h |4 +
> >  3 files changed, 253 insertions(+)
> > 
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index 2779efe..ffd1715 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -2918,6 +2918,10 @@ static void spapr_machine_init(MachineState *machine)
> >  error_report("Could not get size of LPAR rtas '%s'", filename);
> >  exit(1);
> >  }
> > +
> > +/* Resize blob to accommodate error log. */
> > +spapr->rtas_size = spapr_get_rtas_size(spapr->rtas_size);
> > +
> 
> This is the only user for spapr_get_rtas_size(), which is trivial.
> I suggest you simply open-code it here.

I agree.

> But also, spapr->rtas_size is a guest visible thing, "rtas-size" prop in the
> DT. Since existing machine types don't do that, I guess we should only use
> the new size if cap-fwnmi-mce=on for the sake of compatibility.

Yes, that's a good idea.  Changing this is very unlikely to break a
guest, but it's easy to be safe here so let's do it.

> 
> >  spapr->rtas_blob = g_malloc(spapr->rtas_size);
> >  if (load_image_size(filename, spapr->rtas_blob, spapr->rtas_size) < 0) 
> > {
> >  error_report("Could not load LPAR rtas '%s'", filename);
> > diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
> > index 9922a23..4032db0 100644
> > --- a/hw/ppc/spapr_events.c
> > +++ b/hw/ppc/spapr_events.c
> > @@ -212,6 +212,106 @@ struct hp_extended_log {
> >  struct rtas_event_log_v6_hp hp;
> >  } QEMU_PACKED;
> >  
> > +struct rtas_event_log_v6_mc {
> 
> Even if the rest of the code in this file seems to ignore CODING_STYLE,
> maybe it's time to start using CamelCase.
> 
> David ?

Out of scope here, I think.

> > +#define RTAS_LOG_V6_SECTION_ID_MC   0x4D43 /* MC */
> > +struct rtas_event_log_v6_section_header hdr;
> > +uint32_t fru_id;
> > +uint32_t proc_id;
> > +uint8_t error_type;
> > +#define RTAS_LOG_V6_MC_TYPE_UE   0
> > +#define RTAS_LOG_V6_MC_TYPE_SLB  1
> > +#define RTAS_LOG_V6_MC_TYPE_ERAT 2
> > +#define RTAS_LOG_V6_MC_TYPE_TLB  4
> > +#define RTAS_LOG_V6_MC_TYPE_D_CACHE  5
> > +#define RTAS_LOG_V6_MC_TYPE_I_CACHE  7
> > +uint8_t sub_err_type;
> > +#define RTAS_LOG_V6_MC_UE_INDETERMINATE  0
> > +#define RTAS_LOG_V6_MC_UE_IFETCH 1
> > +#define RTAS_LOG_V6_MC_UE_PAGE_TABLE_WALK_IFETCH 2
> > +#define RTAS_LOG_V6_MC_UE_LOAD_STORE 3
> > +#define RTAS_LOG_V6_MC_UE_PAGE_TABLE_WALK_LOAD_STORE 4
> > +#define RTAS_LOG_V6_MC_SLB_PARITY0
> > +#define RTAS_LOG_V6_MC_SLB_MULTIHIT  1
> > +#define RTAS_LOG_V6_MC_SLB_INDETERMINATE 2
> > +#define RTAS_LOG_V6_MC_ERAT_PARITY   1
> > +#define RTAS_LOG_V6_MC_ERAT_MULTIHIT 2
> > +#define RTAS_LOG_V6_MC_ERAT_INDETERMINATE3
> > +#define RTAS_LOG_V6_MC_TLB_PARITY1
> > +#define RTAS_LOG_V6_MC_TLB_MULTIHIT  2
> > +#define RTAS_LOG_V6_MC_TLB_INDETERMINATE 3
> > +uint8_t reserved_1[6];
> > +uint64_t effective_address;
> > +uint64_t logical_address;
> > +} QEMU_PACKED;
> > +
> > +struct mc_extended_log {
> > +struct rtas_event_log_v6 v6hdr;
> > +struct rtas_event_log_v6_mc mc;
> > +} QEMU_PACKED;
> > +
> > +struct MC_ierror_table {
> > +unsigned long srr1_mask;
> > +unsigned long srr1_value;
> > +bool nip_valid; /* nip is a valid indicator of faulting address */
> > +uint8_t error_type;
> > +uint8_t error_subtype;
> > +unsigned int initiator;
> > +unsigned int severity;
> > +};
> > +
> > +static const struct MC_ierror_table mc_ierror_table[] = {
> > +{ 0x081c, 0x0004, true,
> > +  RTAS_LOG_V6_MC_TYPE_UE, RTAS_LOG_V6_MC_UE_IFETCH,
> > +  RTAS_LOG_INITIATOR

Re: [Qemu-devel] [Qemu-ppc] [PATCH v8 4/6] target/ppc: Build rtas error log upon an MCE

2019-05-13 Thread Aravinda Prasad



On Tuesday 14 May 2019 05:38 AM, David Gibson wrote:
> On Mon, May 13, 2019 at 01:30:53PM +0200, Greg Kurz wrote:
>> On Mon, 22 Apr 2019 12:33:26 +0530
>> Aravinda Prasad  wrote:
>>
>>> Upon a machine check exception (MCE) in a guest address space,
>>> KVM causes a guest exit to enable QEMU to build and pass the
>>> error to the guest in the PAPR defined rtas error log format.
>>>
>>> This patch builds the rtas error log, copies it to the rtas_addr
>>> and then invokes the guest registered machine check handler. The
>>> handler in the guest takes suitable action(s) depending on the type
>>> and criticality of the error. For example, if an error is
>>> unrecoverable memory corruption in an application inside the
>>> guest, then the guest kernel sends a SIGBUS to the application.
>>> For recoverable errors, the guest performs recovery actions and
>>> logs the error.
>>>
>>> Signed-off-by: Aravinda Prasad 
>>> ---
>>>  hw/ppc/spapr.c |4 +
>>>  hw/ppc/spapr_events.c  |  245 
>>> 
>>>  include/hw/ppc/spapr.h |4 +
>>>  3 files changed, 253 insertions(+)
>>>
>>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>>> index 2779efe..ffd1715 100644
>>> --- a/hw/ppc/spapr.c
>>> +++ b/hw/ppc/spapr.c
>>> @@ -2918,6 +2918,10 @@ static void spapr_machine_init(MachineState *machine)
>>>  error_report("Could not get size of LPAR rtas '%s'", filename);
>>>  exit(1);
>>>  }
>>> +
>>> +/* Resize blob to accommodate error log. */
>>> +spapr->rtas_size = spapr_get_rtas_size(spapr->rtas_size);
>>> +
>>
>> This is the only user for spapr_get_rtas_size(), which is trivial.
>> I suggest you simply open-code it here.
> 
> I agree.

Sure.

> 
>> But also, spapr->rtas_size is a guest visible thing, "rtas-size" prop in the
>> DT. Since existing machine types don't do that, I guess we should only use
>> the new size if cap-fwnmi-mce=on for the sake of compatibility.
> 
> Yes, that's a good idea.  Changing this is very unlikely to break a
> guest, but it's easy to be safe here so let's do it.

I did it like that because the rtas_blob is allocated based on rtas_size
in spapr_machine_init(). During spapr_machine_init() it is not know if
the guest calls "ibm, nmi-register". So if we want to use the new size
only when cap_fwnmi=on, then we have to realloc the blob in "ibm,
nmi-register".


> 
>>
>>>  spapr->rtas_blob = g_malloc(spapr->rtas_size);
>>>  if (load_image_size(filename, spapr->rtas_blob, spapr->rtas_size) < 0) 
>>> {
>>>  error_report("Could not load LPAR rtas '%s'", filename);
>>> diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
>>> index 9922a23..4032db0 100644
>>> --- a/hw/ppc/spapr_events.c
>>> +++ b/hw/ppc/spapr_events.c
>>> @@ -212,6 +212,106 @@ struct hp_extended_log {
>>>  struct rtas_event_log_v6_hp hp;
>>>  } QEMU_PACKED;
>>>  
>>> +struct rtas_event_log_v6_mc {
>>
>> Even if the rest of the code in this file seems to ignore CODING_STYLE,
>> maybe it's time to start using CamelCase.
>>
>> David ?
> 
> Out of scope here, I think.
> 
>>> +#define RTAS_LOG_V6_SECTION_ID_MC   0x4D43 /* MC */
>>> +struct rtas_event_log_v6_section_header hdr;
>>> +uint32_t fru_id;
>>> +uint32_t proc_id;
>>> +uint8_t error_type;
>>> +#define RTAS_LOG_V6_MC_TYPE_UE   0
>>> +#define RTAS_LOG_V6_MC_TYPE_SLB  1
>>> +#define RTAS_LOG_V6_MC_TYPE_ERAT 2
>>> +#define RTAS_LOG_V6_MC_TYPE_TLB  4
>>> +#define RTAS_LOG_V6_MC_TYPE_D_CACHE  5
>>> +#define RTAS_LOG_V6_MC_TYPE_I_CACHE  7
>>> +uint8_t sub_err_type;
>>> +#define RTAS_LOG_V6_MC_UE_INDETERMINATE  0
>>> +#define RTAS_LOG_V6_MC_UE_IFETCH 1
>>> +#define RTAS_LOG_V6_MC_UE_PAGE_TABLE_WALK_IFETCH 2
>>> +#define RTAS_LOG_V6_MC_UE_LOAD_STORE 3
>>> +#define RTAS_LOG_V6_MC_UE_PAGE_TABLE_WALK_LOAD_STORE 4
>>> +#define RTAS_LOG_V6_MC_SLB_PARITY0
>>> +#define RTAS_LOG_V6_MC_SLB_MULTIHIT  1
>>> +#define RTAS_LOG_V6_MC_SLB_INDETERMINATE 2
>>> +#define RTAS_LOG_V6_MC_ERAT_PARITY   1
>>> +#define RTAS_LOG_V6_MC_ERAT_MULTIHIT 2
>>> +#define RTAS_LOG_V6_MC_ERAT_INDETERMINATE3
>>> +#define RTAS_LOG_V6_MC_TLB_PARITY1
>>> +#define RTAS_LOG_V6_MC_TLB_MULTIHIT  2
>>> +#define RTAS_LOG_V6_MC_TLB_INDETERMINATE 3
>>> +uint8_t reserved_1[6];
>>> +uint64_t effective_address;
>>> +uint64_t logical_address;
>>> +} QEMU_PACKED;
>>> +
>>> +struct mc_extended_log {
>>> +struct rtas_event_log_v6 v6hdr;
>>> +struct rtas_event_log_v6_mc mc;
>>> +} QEMU_PACKED;
>>> +
>>> +struct MC_ierror_table {
>>> +unsigned long srr1_mask;
>>> +unsigned long srr1_value;
>>> +bool nip_vali

Re: [Qemu-devel] [Qemu-ppc] [PATCH v8 4/6] target/ppc: Build rtas error log upon an MCE

2019-05-13 Thread David Gibson
On Tue, May 14, 2019 at 09:56:41AM +0530, Aravinda Prasad wrote:
> 
> 
> On Tuesday 14 May 2019 05:38 AM, David Gibson wrote:
> > On Mon, May 13, 2019 at 01:30:53PM +0200, Greg Kurz wrote:
> >> On Mon, 22 Apr 2019 12:33:26 +0530
> >> Aravinda Prasad  wrote:
> >>
> >>> Upon a machine check exception (MCE) in a guest address space,
> >>> KVM causes a guest exit to enable QEMU to build and pass the
> >>> error to the guest in the PAPR defined rtas error log format.
> >>>
> >>> This patch builds the rtas error log, copies it to the rtas_addr
> >>> and then invokes the guest registered machine check handler. The
> >>> handler in the guest takes suitable action(s) depending on the type
> >>> and criticality of the error. For example, if an error is
> >>> unrecoverable memory corruption in an application inside the
> >>> guest, then the guest kernel sends a SIGBUS to the application.
> >>> For recoverable errors, the guest performs recovery actions and
> >>> logs the error.
> >>>
> >>> Signed-off-by: Aravinda Prasad 
> >>> ---
> >>>  hw/ppc/spapr.c |4 +
> >>>  hw/ppc/spapr_events.c  |  245 
> >>> 
> >>>  include/hw/ppc/spapr.h |4 +
> >>>  3 files changed, 253 insertions(+)
> >>>
> >>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> >>> index 2779efe..ffd1715 100644
> >>> --- a/hw/ppc/spapr.c
> >>> +++ b/hw/ppc/spapr.c
> >>> @@ -2918,6 +2918,10 @@ static void spapr_machine_init(MachineState 
> >>> *machine)
> >>>  error_report("Could not get size of LPAR rtas '%s'", filename);
> >>>  exit(1);
> >>>  }
> >>> +
> >>> +/* Resize blob to accommodate error log. */
> >>> +spapr->rtas_size = spapr_get_rtas_size(spapr->rtas_size);
> >>> +
> >>
> >> This is the only user for spapr_get_rtas_size(), which is trivial.
> >> I suggest you simply open-code it here.
> > 
> > I agree.
> 
> Sure.
> 
> > 
> >> But also, spapr->rtas_size is a guest visible thing, "rtas-size" prop in 
> >> the
> >> DT. Since existing machine types don't do that, I guess we should only use
> >> the new size if cap-fwnmi-mce=on for the sake of compatibility.
> > 
> > Yes, that's a good idea.  Changing this is very unlikely to break a
> > guest, but it's easy to be safe here so let's do it.
> 
> I did it like that because the rtas_blob is allocated based on rtas_size
> in spapr_machine_init(). During spapr_machine_init() it is not know if
> the guest calls "ibm, nmi-register". So if we want to use the new size
> only when cap_fwnmi=on, then we have to realloc the blob in "ibm,
> nmi-register".

What?  Just always allocate the necessary space in
spapr_machine_init() if cap_fwnmi=on, it'll be wasted if
ibm,nmi-register is never called, but it's not that much space so we
don't really care.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [Qemu-devel] [Qemu-ppc] [PATCH v8 4/6] target/ppc: Build rtas error log upon an MCE

2019-05-13 Thread Aravinda Prasad



On Tuesday 14 May 2019 10:10 AM, David Gibson wrote:
> On Tue, May 14, 2019 at 09:56:41AM +0530, Aravinda Prasad wrote:
>>
>>
>> On Tuesday 14 May 2019 05:38 AM, David Gibson wrote:
>>> On Mon, May 13, 2019 at 01:30:53PM +0200, Greg Kurz wrote:
 On Mon, 22 Apr 2019 12:33:26 +0530
 Aravinda Prasad  wrote:

> Upon a machine check exception (MCE) in a guest address space,
> KVM causes a guest exit to enable QEMU to build and pass the
> error to the guest in the PAPR defined rtas error log format.
>
> This patch builds the rtas error log, copies it to the rtas_addr
> and then invokes the guest registered machine check handler. The
> handler in the guest takes suitable action(s) depending on the type
> and criticality of the error. For example, if an error is
> unrecoverable memory corruption in an application inside the
> guest, then the guest kernel sends a SIGBUS to the application.
> For recoverable errors, the guest performs recovery actions and
> logs the error.
>
> Signed-off-by: Aravinda Prasad 
> ---
>  hw/ppc/spapr.c |4 +
>  hw/ppc/spapr_events.c  |  245 
> 
>  include/hw/ppc/spapr.h |4 +
>  3 files changed, 253 insertions(+)
>
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 2779efe..ffd1715 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -2918,6 +2918,10 @@ static void spapr_machine_init(MachineState 
> *machine)
>  error_report("Could not get size of LPAR rtas '%s'", filename);
>  exit(1);
>  }
> +
> +/* Resize blob to accommodate error log. */
> +spapr->rtas_size = spapr_get_rtas_size(spapr->rtas_size);
> +

 This is the only user for spapr_get_rtas_size(), which is trivial.
 I suggest you simply open-code it here.
>>>
>>> I agree.
>>
>> Sure.
>>
>>>
 But also, spapr->rtas_size is a guest visible thing, "rtas-size" prop in 
 the
 DT. Since existing machine types don't do that, I guess we should only use
 the new size if cap-fwnmi-mce=on for the sake of compatibility.
>>>
>>> Yes, that's a good idea.  Changing this is very unlikely to break a
>>> guest, but it's easy to be safe here so let's do it.
>>
>> I did it like that because the rtas_blob is allocated based on rtas_size
>> in spapr_machine_init(). During spapr_machine_init() it is not know if
>> the guest calls "ibm, nmi-register". So if we want to use the new size
>> only when cap_fwnmi=on, then we have to realloc the blob in "ibm,
>> nmi-register".
> 
> What?  Just always allocate the necessary space in
> spapr_machine_init() if cap_fwnmi=on, it'll be wasted if
> ibm,nmi-register is never called, but it's not that much space so we
> don't really care.

Yes, not that much space, and ibm,nmi-register is called when the Linux
kernel boots. I guess, even though other OSes might not call
ibm,nmi-register, they do not constitute significant QEMU on Power users.

So I think, I will keep the code as is.

> 

-- 
Regards,
Aravinda




Re: [Qemu-devel] [Qemu-ppc] [PATCH v8 4/6] target/ppc: Build rtas error log upon an MCE

2019-04-23 Thread Aravinda Prasad



On Tuesday 23 April 2019 08:08 PM, Fabiano Rosas wrote:
> Aravinda Prasad  writes:
> 
>> +/*
>> + * Properly set bits in MSR before we invoke the handler.
>> + * SRR0/1, DAR and DSISR are properly set by KVM
>> + */
>> +if (!(*pcc->interrupts_big_endian)(cpu)) {
>> +msr |= (1ULL << MSR_LE);
>> +}
>> +
>> +if (env->msr && (1ULL << MSR_SF)) {
> 
> Don't you mean & instead of &&?

Ah.. yes.. Thanks for pointing out.

> 
>> +msr |= (1ULL << MSR_SF);
>> +}
> 
> 

-- 
Regards,
Aravinda




Re: [Qemu-devel] [Qemu-ppc] [PATCH v8 4/6] target/ppc: Build rtas error log upon an MCE

2019-05-15 Thread David Gibson
On Tue, May 14, 2019 at 10:36:17AM +0530, Aravinda Prasad wrote:
> 
> 
> On Tuesday 14 May 2019 10:10 AM, David Gibson wrote:
> > On Tue, May 14, 2019 at 09:56:41AM +0530, Aravinda Prasad wrote:
> >>
> >>
> >> On Tuesday 14 May 2019 05:38 AM, David Gibson wrote:
> >>> On Mon, May 13, 2019 at 01:30:53PM +0200, Greg Kurz wrote:
>  On Mon, 22 Apr 2019 12:33:26 +0530
>  Aravinda Prasad  wrote:
> 
> > Upon a machine check exception (MCE) in a guest address space,
> > KVM causes a guest exit to enable QEMU to build and pass the
> > error to the guest in the PAPR defined rtas error log format.
> >
> > This patch builds the rtas error log, copies it to the rtas_addr
> > and then invokes the guest registered machine check handler. The
> > handler in the guest takes suitable action(s) depending on the type
> > and criticality of the error. For example, if an error is
> > unrecoverable memory corruption in an application inside the
> > guest, then the guest kernel sends a SIGBUS to the application.
> > For recoverable errors, the guest performs recovery actions and
> > logs the error.
> >
> > Signed-off-by: Aravinda Prasad 
> > ---
> >  hw/ppc/spapr.c |4 +
> >  hw/ppc/spapr_events.c  |  245 
> > 
> >  include/hw/ppc/spapr.h |4 +
> >  3 files changed, 253 insertions(+)
> >
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index 2779efe..ffd1715 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -2918,6 +2918,10 @@ static void spapr_machine_init(MachineState 
> > *machine)
> >  error_report("Could not get size of LPAR rtas '%s'", filename);
> >  exit(1);
> >  }
> > +
> > +/* Resize blob to accommodate error log. */
> > +spapr->rtas_size = spapr_get_rtas_size(spapr->rtas_size);
> > +
> 
>  This is the only user for spapr_get_rtas_size(), which is trivial.
>  I suggest you simply open-code it here.
> >>>
> >>> I agree.
> >>
> >> Sure.
> >>
> >>>
>  But also, spapr->rtas_size is a guest visible thing, "rtas-size" prop in 
>  the
>  DT. Since existing machine types don't do that, I guess we should only 
>  use
>  the new size if cap-fwnmi-mce=on for the sake of compatibility.
> >>>
> >>> Yes, that's a good idea.  Changing this is very unlikely to break a
> >>> guest, but it's easy to be safe here so let's do it.
> >>
> >> I did it like that because the rtas_blob is allocated based on rtas_size
> >> in spapr_machine_init(). During spapr_machine_init() it is not know if
> >> the guest calls "ibm, nmi-register". So if we want to use the new size
> >> only when cap_fwnmi=on, then we have to realloc the blob in "ibm,
> >> nmi-register".
> > 
> > What?  Just always allocate the necessary space in
> > spapr_machine_init() if cap_fwnmi=on, it'll be wasted if
> > ibm,nmi-register is never called, but it's not that much space so we
> > don't really care.
> 
> Yes, not that much space, and ibm,nmi-register is called when the Linux
> kernel boots. I guess, even though other OSes might not call
> ibm,nmi-register, they do not constitute significant QEMU on Power users.
> 
> So I think, I will keep the code as is.

No, that's not right.  It's impractical to change the allocation
depending on whether fwnmi is currently active.  But you *can* (and
should) base the allocation on whether fwnmi is *possible* - that is,
the value of the spapr cap.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [Qemu-devel] [Qemu-ppc] [PATCH v8 4/6] target/ppc: Build rtas error log upon an MCE

2019-05-15 Thread Aravinda Prasad



On Thursday 16 May 2019 07:17 AM, David Gibson wrote:
> On Tue, May 14, 2019 at 10:36:17AM +0530, Aravinda Prasad wrote:
>>
>>
>> On Tuesday 14 May 2019 10:10 AM, David Gibson wrote:
>>> On Tue, May 14, 2019 at 09:56:41AM +0530, Aravinda Prasad wrote:


 On Tuesday 14 May 2019 05:38 AM, David Gibson wrote:
> On Mon, May 13, 2019 at 01:30:53PM +0200, Greg Kurz wrote:
>> On Mon, 22 Apr 2019 12:33:26 +0530
>> Aravinda Prasad  wrote:
>>
>>> Upon a machine check exception (MCE) in a guest address space,
>>> KVM causes a guest exit to enable QEMU to build and pass the
>>> error to the guest in the PAPR defined rtas error log format.
>>>
>>> This patch builds the rtas error log, copies it to the rtas_addr
>>> and then invokes the guest registered machine check handler. The
>>> handler in the guest takes suitable action(s) depending on the type
>>> and criticality of the error. For example, if an error is
>>> unrecoverable memory corruption in an application inside the
>>> guest, then the guest kernel sends a SIGBUS to the application.
>>> For recoverable errors, the guest performs recovery actions and
>>> logs the error.
>>>
>>> Signed-off-by: Aravinda Prasad 
>>> ---
>>>  hw/ppc/spapr.c |4 +
>>>  hw/ppc/spapr_events.c  |  245 
>>> 
>>>  include/hw/ppc/spapr.h |4 +
>>>  3 files changed, 253 insertions(+)
>>>
>>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>>> index 2779efe..ffd1715 100644
>>> --- a/hw/ppc/spapr.c
>>> +++ b/hw/ppc/spapr.c
>>> @@ -2918,6 +2918,10 @@ static void spapr_machine_init(MachineState 
>>> *machine)
>>>  error_report("Could not get size of LPAR rtas '%s'", filename);
>>>  exit(1);
>>>  }
>>> +
>>> +/* Resize blob to accommodate error log. */
>>> +spapr->rtas_size = spapr_get_rtas_size(spapr->rtas_size);
>>> +
>>
>> This is the only user for spapr_get_rtas_size(), which is trivial.
>> I suggest you simply open-code it here.
>
> I agree.

 Sure.

>
>> But also, spapr->rtas_size is a guest visible thing, "rtas-size" prop in 
>> the
>> DT. Since existing machine types don't do that, I guess we should only 
>> use
>> the new size if cap-fwnmi-mce=on for the sake of compatibility.
>
> Yes, that's a good idea.  Changing this is very unlikely to break a
> guest, but it's easy to be safe here so let's do it.

 I did it like that because the rtas_blob is allocated based on rtas_size
 in spapr_machine_init(). During spapr_machine_init() it is not know if
 the guest calls "ibm, nmi-register". So if we want to use the new size
 only when cap_fwnmi=on, then we have to realloc the blob in "ibm,
 nmi-register".
>>>
>>> What?  Just always allocate the necessary space in
>>> spapr_machine_init() if cap_fwnmi=on, it'll be wasted if
>>> ibm,nmi-register is never called, but it's not that much space so we
>>> don't really care.
>>
>> Yes, not that much space, and ibm,nmi-register is called when the Linux
>> kernel boots. I guess, even though other OSes might not call
>> ibm,nmi-register, they do not constitute significant QEMU on Power users.
>>
>> So I think, I will keep the code as is.
> 
> No, that's not right.  It's impractical to change the allocation
> depending on whether fwnmi is currently active.  But you *can* (and
> should) base the allocation on whether fwnmi is *possible* - that is,
> the value of the spapr cap.

Sure..

> 

-- 
Regards,
Aravinda