On Tue, 11 Nov 2025 14:08:13 +1000
Gavin Shan <[email protected]> wrote:

> Hi Igor and Jonathan,
> 
> On 11/11/25 12:49 AM, Igor Mammedov wrote:
> > On Thu, 6 Nov 2025 13:15:52 +1000
> > Gavin Shan <[email protected]> wrote:  
> >> On 11/6/25 12:14 AM, Jonathan Cameron wrote:  
> >>> On Wed,  5 Nov 2025 21:44:49 +1000
> >>> Gavin Shan <[email protected]> wrote:
> >>>      
> >>>> In the situation where host and guest has 64KiB and 4KiB page sizes,
> >>>> one problematic host page affects 16 guest pages. we need to send 16
> >>>> consective errors in this specific case.
> >>>>
> >>>> Extend acpi_ghes_memory_errors() to support multiple CPERs after the
> >>>> hunk of code to generate the GHES error status is pulled out from
> >>>> ghes_gen_err_data_uncorrectable_recoverable(). The status field of
> >>>> generic error status block is also updated accordingly if multiple
> >>>> error data entries are contained in the generic error status block.
> >>>>
> >>>> Signed-off-by: Gavin Shan <[email protected]>  
> >>> Hi Gavin,
> >>>
> >>> Mostly fine, but a few comments on the defines added and a
> >>> question on what the multiple things are meant to mean?
> >>>      
> >>
> >> Thanks for your review and comments, replies as below.
> >>  
> >>>> diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
> >>>> index a9c08e73c0..527b85c8d8 100644
> >>>> --- a/hw/acpi/ghes.c
> >>>> +++ b/hw/acpi/ghes.c
> >>>> @@ -57,8 +57,12 @@
> >>>>    /* The memory section CPER size, UEFI 2.6: N.2.5 Memory Error Section 
> >>>> */
> >>>>    #define ACPI_GHES_MEM_CPER_LENGTH           80
> >>>>    
> >>>> -/* Masks for block_status flags */
> >>>> -#define ACPI_GEBS_UNCORRECTABLE         1
> >>>> +/* Bits for block_status flags */
> >>>> +#define ACPI_GEBS_UNCORRECTABLE           0
> >>>> +#define ACPI_GEBS_CORRECTABLE             1
> >>>> +#define ACPI_GEBS_MULTIPLE_UNCORRECTABLE  2
> >>>> +#define ACPI_GEBS_MULTIPLE_CORRECTABLE    3  
> >>>
> >>> So this maps to the bits in block status.
> >>>
> >>> I'm not actually sure what these multiple variants are meant to tell us.
> >>> The multiple error blocks example referred to by the spec is a way to 
> >>> represent
> >>> the same error applying to multiple places.  So that's one error, many 
> >>> blocks.
> >>> I have no idea if we set these bits in that case.
> >>>
> >>> Based on a quick look I don't think linux even takes any notice.  THere
> >>> are defines in actbl1.h but I'm not seeing any use made of them.
> >>>      
> >>
> >> I hope Igor can confirm since it was suggested by him.
> >>
> >> It's hard to understand how exactly these multiple variants are used from 
> >> the
> >> spec. In ACPI 6.5 Table 18.11, it's explained as below.
> >>
> >> Bit [2] - Multiple Uncorrectable Errors: If set to one, indicates that more
> >> than one uncorrectable errors have been detected.
> >>
> >> I don't see those multiple variants have been used by Linux. So I think 
> >> it's
> >> safe to drop them.  
> > 
> > even though example describes 'same' error at different components,
> > the bit fields descriptions doesn't set any limits on what 'more than one' 
> > means.
> > 
> > Also from guest POV it's multiple different pages that we are reporting here
> > as multiple CPERs.
> > It seems to me that setting *_MULTIPLE_* here is correct thing to do.
> >   
> 
> I don't have strong opinions. Lets keep to set _MULTIPLE_ flag if Jonathan
> is fine. Again, this field isn't used by Linux guest.
I don't care strongly.  Maybe we should ask for a spec clarification as I doubt
implementations will be consistent on this given the vague description and that
Linux ignores it today.

> 
> >>>> +#define ACPI_GEBS_ERROR_DATA_ENTRIES      4  
> >>>
> >>> This is bits 4-13 and the define isn't used. I'd drop it.
> >>>      
> >>
> >> The definition is used in acpi_ghes_memory_errors() of this patch. However,
> >> I don't see it has been used by Linux. This field isn't used by Linux to 
> >> determine
> >> the total number of error entries. So I think I can drop it either if Igor 
> >> is ok.
> >>  
> 
> Lets keep this field either in next revision if Jonathan is fine.

I'm fine with the field, but not the value.  As far as I can tell form the 
spec, it should
be a mask, not a single bit.

> 
> Thanks,
> Gavin
> 
> 
> 


Reply via email to