On Wednesday 27 March 2019 08:16 PM, Greg Kurz wrote:
> On Wed, 27 Mar 2019 12:18:31 +0530
> Aravinda Prasad <aravi...@linux.vnet.ibm.com> wrote:
> 
>> On Tuesday 26 March 2019 03:35 PM, Greg Kurz wrote:
>>> On Tue, 26 Mar 2019 14:45:57 +0530
>>> Aravinda Prasad <aravi...@linux.vnet.ibm.com> wrote:
>>>   
>>>> On Tuesday 26 March 2019 02:00 PM, Greg Kurz wrote:  
>>>>> On Tue, 26 Mar 2019 10:32:35 +1100
>>>>> David Gibson <da...@gibson.dropbear.id.au> wrote:
>>>>>     
>>>>>> On Mon, Mar 25, 2019 at 01:56:50PM +0530, Aravinda Prasad wrote:    
>>>>>>>
>>>>>>>
>>>>>>> On Monday 25 March 2019 12:00 PM, David Gibson wrote:      
>>>>>>>> On Fri, Mar 22, 2019 at 12:04:07PM +0530, Aravinda Prasad wrote:      
>>>>>>>>> This patch builds the rtas error log, copies it to the
>>>>>>>>> rtas_addr and then invokes the guest registered machine
>>>>>>>>> check handler.      
>>>>>>>>
>>>>>>>> This commit message needs more context.  When is this occurring, why
>>>>>>>> do we need this?
>>>>>>>>
>>>>>>>> [I can answer those questions now, but whether I - or anyone else -
>>>>>>>>  will be able to looking back at this commit from years in the future
>>>>>>>>  is a different question]      
>>>>>>>
>>>>>>> will add more info.      
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>> [snip]    
>>>>>>>>> +static uint64_t spapr_get_rtas_addr(void)
>>>>>>>>> +{
>>>>>>>>> +    SpaprMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
>>>>>>>>> +    int rtas_node;
>>>>>>>>> +    const struct fdt_property *rtas_addr_prop;
>>>>>>>>> +    void *fdt = spapr->fdt_blob;
>>>>>>>>> +    uint32_t rtas_addr;
>>>>>>>>> +
>>>>>>>>> +    /* fetch rtas addr from fdt */
>>>>>>>>> +    rtas_node = fdt_path_offset(fdt, "/rtas");
>>>>>>>>> +    g_assert(rtas_node >= 0);
>>>>>>>>> +
>>>>>>>>> +    rtas_addr_prop = fdt_get_property(fdt, rtas_node, 
>>>>>>>>> "linux,rtas-base", NULL);
>>>>>>>>> +    g_assert(rtas_addr_prop);
>>>>>>>>> +
>>>>>>>>> +    rtas_addr = fdt32_to_cpu(*(uint32_t *)rtas_addr_prop->data);
>>>>>>>>> +    return (uint64_t)rtas_addr;      
>>>>>>>>
>>>>>>>> It seems a bit roundabout to pull the rtas address out of the device
>>>>>>>> tree, since it was us that put it in there in the first place.      
>>>>>>>
>>>>>>> Slof can change the rtas address. So we need to get the updated rtas
>>>>>>> address.      
>>>>>>
>>>>>> Ah, ok.
>>>>>>    
>>>>>
>>>>> Yeah, and knowing that the DT is guest originated makes me a bit
>>>>> nervous when I see the g_assert()... a misbehaving guest could
>>>>> possibly abort QEMU. Either there should be some sanity checks
>>>>> performed earlier or an non-fatal error path should be added in
>>>>> this function IMHO.    
>>>>
>>>> Is it not the QEMU that builds the DT and provides it to the guest?
>>>>  
>>>
>>> Yes, but then the guest can push a new DT with the KVMPPC_H_UPDATE_DT hcall.
>>> We only do some minimalist sanity checks in h_update_dt(). I don't think
>>> we want to abort QEMU because the guest sent a DT where "linux,rtas-base"
>>> is missing for example.
>>>   
>>>> Also, spapr_get_rtas_addr() is called during physical memory corruption
>>>> which is a fatal error.  
>>>
> 
> BTW... IIUC this can also occur if the guest created duplicate SLB entries,
> which isn't fatal from an hypervisor perspective.

yes.. right..

> 
>>> Not that fatal since we care to report it to the guest.  
>>
>> True, but if guest does not provide rtas_addr then I am not getting the
>> point on why terminating the QEMU instance (which actually terminates
>> the guest) is a problem. Am I missing something?
>>
> 
> This isn't terminating QEMU, this is aborting QEMU, which should only
> ever happen in case of a QEMU bug or g_malloc() failure.
> 
> LoPAPR v1.1 7.3.14 Firmware Assisted Non-Maskable Interrupts Option (FWNMI)
> doesn't describe this specific case of course. However, it gives a hint on
> what to do in case of a fatal condition where it isn't possible to notify
> the guest:
> 
>       "...the firmware has no choice but to declare the condition
>       fatal, log the result and execute the partition’s reboot policy."
> 
> ie, either triggering a system_reset or exiting QEMU.

Now I see the point. I will modify it to trigger system_reset instead of
aborting.

> 
>>>   
>>>> So, if we cannot fetch rtas_addr (required to
>>>> build and pass the error info to the guest) then I think we should abort.
>>>>  
>>>
>>> Maybe we cannot do anything better at this point, but then we should
>>> do some earlier checks and switch to the old machine check behaviour
>>> if what we need is missing from the updated DT for example.  
>>
>> We can do some checks earlier, may be during fwnmi registration to see
>> if rtas entry is missing. Again, the guest can possibly update DT after
>> fwnmi registration.
>>
> 
> Hmm... you're right. Maybe also add these checks to h_update_dt() if the
> guest registered fwnmi ? Or even forbid DT updates since we really expect
> this to occur only once from SLOF between machine resets ? In which case,
> it would be appropriate to keep the asserts in spapr_get_rtas_addr().

As you pointed out above, we can trigger system_reset if we fail to
fetch rtas_addr. I think this approach is simple and straight forward
and will not forbid guests from updating the DT.

> 
>> But I think we cannot switch to old machine check behavior if we cannot
>> fetch the rtas_addr, because according to PAPR the guest would have
>> relinquished 0x200 vector to the firmware when fwnmi is registered. So
>> we cannot expect the guest to handle 0x200 interrupt.
>>
> 
> I wonder how kexec fits in the picture... I mean what if the guest switches
> from an FWNMI-capable kernel to an FWNMI-not-capable one ? There's no machine
> reset in this case...

Ah.. not sure I need to check. So in this case QEMU will never know that
the guest rebooted into an FWNMI-not-capable kexec kernel.

> 
>>
>>>   
>>>>>     
>>>>  
>>>   
>>
> 

-- 
Regards,
Aravinda


Reply via email to