Hi James, Mark,
On Tue, Jul 9, 2019 at 8:52 PM Tyler Baicar OS wrote:
> On Mon, Jul 8, 2019 at 10:10 AM James Morse wrote:
> > On 02/07/2019 17:51, Tyler Baicar OS wrote:
> > > @@ -632,6 +633,8 @@ static int do_sea(unsigned long addr, unsigned int
> > >
On Mon, Jul 8, 2019 at 10:10 AM James Morse wrote:
> On 02/07/2019 17:51, Tyler Baicar OS wrote:
> > On systems that support the ARM RAS extension, synchronous external
> > abort syndrome information could be captured in the core's RAS extension
> > system registers. So
Hello Shiju,
Thank you for the feedback!
On Thu, Jul 4, 2019 at 12:03 PM Shiju Jose wrote:
> >+struct ras_ext_regs {
> >+ u64 err_fr;
> >+ u64 err_ctlr;
> >+ u64 err_status;
> >+ u64 err_addr;
> >+ u64 err_misc0;
> >+ u64 err_misc1;
> >+ u64 err_misc2;
> >+
Hello Andrew,
Thank you for the feedback!
On Wed, Jul 3, 2019 at 5:26 AM Andrew Murray wrote:
>
> On Tue, Jul 02, 2019 at 04:51:38PM +, Tyler Baicar OS wrote:
> > Add support for parsing the ARM Error Source Table and basic handling of
> > errors reported through both
ork:
- UER handling to avoid panic
- Looping through all external abort capable (ERRFR.UE != 0) error
nodes in SEA/SEI handling
- ARMv8.4 extension support
[0] https://static.docs.arm.com/den0085/a/DEN0085_RAS_ACPI_1.0_BETA_1.pdf
Tyler Baicar (4):
ACPI/AEST: Initial AEST driver
arm64: m
Add a trace event for hardware errors reported by the ARMv8.2
RAS extension registers.
Signed-off-by: Tyler Baicar
---
arch/arm64/kernel/ras.c | 3 +++
drivers/acpi/arm64/aest.c | 4
include/ras/ras_event.h | 46 ++
3 files changed, 53
On systems that support the ARM RAS extension, serror interrupt syndrome
information could be captured in the core's RAS extension system
registers. When handling serrors, check the RAS system registers for
error syndrome information.
Signed-off-by: Tyler Baicar
---
arch/arm64/kernel/tr
Add support for parsing the ARM Error Source Table and basic handling of
errors reported through both memory mapped and system register interfaces.
Signed-off-by: Tyler Baicar
---
arch/arm64/include/asm/ras.h | 41 +
arch/arm64/kernel/Makefile | 2 +-
arch/arm64/kernel/ras.c | 67
On systems that support the ARM RAS extension, synchronous external
abort syndrome information could be captured in the core's RAS extension
system registers. So, when handling SEAs check the RAS system registers
for error syndrome information.
Signed-off-by: Tyler Baicar
---
arch/arm
t the patches since the hardware error record on that machine has
> been cleared.
>
> Ross Lagerwall (2):
> acpi/apei: Fix possible out-of-bounds access to BERT region
> efi/cper: Fix possible out-of-bounds access
For both patches:
Tested-by: Tyler Baicar
> for exactly the same fatal error.
>
> Otherwise ghes_probe(), running in the crash kernel, would see
> an unhandled error in the APEI generic error status block and
> panic again, thereby precluding any crash dump.
>
> Signed-off-by: Lenny Szubowicz
> Signed-off-by: Da
On Tue, Nov 27, 2018 at 1:32 PM Sinan Kaya wrote:
>
> On 11/27/2018 1:22 PM, alex_gagn...@dellteam.com wrote:
> > On 11/20/2018 04:08 PM, Sinan Kaya wrote:
> >> I followed the ASWG thread yesterday. There will be a meeting next week to
> >> discuss this.
> >
> > Any updates on the meeting?
> >
> >
d *arg)
> (*num_dimm)++;
> }
>
> +static int get_dimm_smbios_index(u16 handle)
> +{
> + struct mem_ctl_info *mci;
> + int i;
> +
> + mci = ghes_pvt->mci;
> +
Minor nit: you could define and set mci in the same line to save some
space here.
Otherwise this patch looks good to me.
Reviewed-by: Tyler Baicar
On Thu, Aug 30, 2018 at 12:32 PM, James Morse wrote:
> Hi Fan,
>
> On 30/08/18 15:40, wufan wrote:
@@ -327,12 +349,20 @@ void ghes_edac_report_mem_error(int sev,
>>> struct cper_sec_mem_err *mem_err)
p += sprintf(p, "bit_pos:%d ", mem_err->bit_pos);
if (mem_err->vali
On Tue, Aug 28, 2018 at 1:11 PM, James Morse wrote:
> On 24/08/18 16:14, Tyler Baicar wrote:
>> On Fri, Aug 24, 2018 at 5:48 AM, James Morse wrote:
>>> On 23/08/18 16:46, Tyler Baicar wrote:
>>> so edac_raw_mc_handle_error() has no clue where the error happened. (I
On Fri, Aug 24, 2018 at 5:48 AM, James Morse wrote:
> On 23/08/18 16:46, Tyler Baicar wrote:
>> On Thu, Aug 23, 2018 at 5:29 AM James Morse wrote:
>>> On 19/07/18 19:36, Tyler Baicar wrote:
>>>> This seems pretty hacky to me, so if anyone has other suggestion
Hello James,
On Thu, Aug 23, 2018 at 5:29 AM James Morse wrote:
> On 19/07/18 19:36, Tyler Baicar wrote:
> > On 7/19/2018 10:46 AM, James Morse wrote:
> >> On 19/07/18 15:01, Borislav Petkov wrote:
> >>> On Mon, Jul 16, 2018 at 01:26:49PM -0400, Tyler Baicar wrote:
On Thu, Aug 9, 2018 at 6:16 PM, gengdongjiu wrote:
> 2018-08-10 5:05 GMT+08:00 Tyler Baicar :
>> On Thu, Aug 9, 2018 at 8:32 AM, gengdongjiu wrote:
>>>
>>> 2018-08-08 0:26 GMT+08:00 Dongjiu Geng :
>>> > In order to remove the additional check before calling
On 7/19/2018 10:46 AM, James Morse wrote:
On 19/07/18 15:01, Borislav Petkov wrote:
On Mon, Jul 16, 2018 at 01:26:49PM -0400, Tyler Baicar wrote:
Enable per-layer error reporting for ARM systems so that the error
counters are incremented per-DIMM.
On ARM systems that use firmware first error
systems so that
the EDAC error counters are incremented based on DIMM number as per the
SMBIOS table rather than just incrementing the noinfo counters on the
memory controller.
Signed-off-by: Tyler Baicar
---
drivers/edac/ghes_edac.c | 15 ---
1 file changed, 12 insertions(+), 3
lspci uses abbreviated naming for AER error strings. Adopt the
same naming convention for the AER printing so they match.
Signed-off-by: Tyler Baicar
---
drivers/pci/pcie/aer.c | 46 +++---
1 file changed, 23 insertions(+), 23 deletions(-)
diff --git a
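A minimal user-space sketch of the naming idea described in this patch, not the patch itself: a table maps correctable error status bits to short lspci-style names, and a helper prints the names of the bits that are set. The bit positions follow the PCIe Correctable Error Status register; the abbreviations are the ones lspci is assumed to print, and `cor_err_name` and `format_cor_status` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Illustrative table only: lspci-style names for a few correctable
 * error status bits. The layout is an assumption, not the kernel's. */
static const char *const cor_err_name[32] = {
    [0]  = "RxErr",    /* Receiver Error */
    [6]  = "BadTLP",   /* Bad TLP */
    [7]  = "BadDLLP",  /* Bad DLLP */
    [8]  = "Rollover", /* REPLAY_NUM Rollover */
    [12] = "Timeout",  /* Replay Timer Timeout */
};

/* Write the space-separated names of all set bits into buf and return
 * the number of characters written. Output stops early if buf is full. */
static size_t format_cor_status(uint32_t status, char *buf, size_t len)
{
    size_t used = 0;
    unsigned int bit;

    assert(buf != NULL && len > 0);
    buf[0] = '\0';

    for (bit = 0; bit < 32; bit++) {
        const char *name;
        int n;

        if (!(status & (UINT32_C(1) << bit)))
            continue;

        /* Bits without a table entry are reported generically. */
        name = cor_err_name[bit] ? cor_err_name[bit] : "Unknown";
        n = snprintf(buf + used, len - used, "%s%s",
                     used ? " " : "", name);
        if (n < 0 || (size_t)n >= len - used)
            break;
        used += (size_t)n;
    }
    return used;
}
```

Keeping the names in one table indexed by bit number means the log output and the lspci decoding can be compared by eye, which is the point of matching the two conventions.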
On 6/21/2018 5:25 PM, Rajat Jain wrote:
On Thu, Jun 21, 2018 at 11:48 AM, Bjorn Helgaas wrote:
[+cc Tyler for AER dmesg decoding]
- Tyler posted a patch [1] to update those dmesg strings so they match
the way lspci decodes them. I really liked that update, but we
never quite finished it
On 5/22/2018 10:32 AM, Alex G. wrote:
I think the biggest problem is having a policy to panic on "fatal"
errors, instead of letting the error handler make that decision. I'd
much rather kill that stupid policy, but people seem to like it for some
reason.
You can get around that panic and still
On 5/21/2018 9:49 AM, Alexandru Gagniuc wrote:
+/* PCIe errors should not cause a panic. */
+static int ghes_sec_pcie_severity(struct acpi_hest_generic_data *gdata)
+{
+ struct cper_sec_pcie *pcie_err = acpi_hest_get_payload(gdata);
+
+ if (pcie_err->validation_bits & CPER_PCIE_VALID_
.
Signed-off-by: Alexandru Gagniuc
Tested-by: Tyler Baicar
Thanks!
---
drivers/pci/pcie/aer/aerdrv_errprint.c | 16 +---
1 file changed, 9 insertions(+), 7 deletions(-)
diff --git a/drivers/pci/pcie/aer/aerdrv_errprint.c b/drivers/pci/pcie/aer/aerdrv_errprint.c
index cfc89dd57831
On 2/24/2018 2:20 AM, Dave Young wrote:
On 02/23/18 at 12:42pm, Tyler Baicar wrote:
If ESRT initialization fails due to an unsupported version, the
early_memremap allocation is never unmapped. This will cause an
early ioremap leak. So, make sure to unmap the memory allocation
before returning
Hello Akashi,
On 3/6/2018 4:00 AM, AKASHI Takahiro wrote:
Tyler, Jeffrey,
On Fri, Mar 02, 2018 at 08:27:11AM -0500, Tyler Baicar wrote:
On 3/2/2018 12:53 AM, AKASHI Takahiro wrote:
Tyler, Jeffrey,
[Note: This issue takes place in kexec, not kdump. So to be precise,
it is not the same
://lists.infradead.org/pipermail/linux-arm-kernel/2018-January/553098.html
]
On Thu, Mar 01, 2018 at 12:56:38PM -0500, Tyler Baicar wrote:
Hello,
On 2/28/2018 9:50 PM, AKASHI Takahiro wrote:
Hi,
On Wed, Feb 28, 2018 at 08:39:42AM -0700, Jeffrey Hugo wrote:
On 2/27/2018 11:19 PM, AKASHI Takahiro
Hello Bjorn,
On 2/7/2018 3:11 PM, Tyler Baicar wrote:
Currently the AER driver uses cper_print_bits() to print the AER status
string. This causes the status string to not include the proper PCI device
name prefix that the other AER prints include. Also, it has a different
print level than all
Hello Ard,
On 2/24/2018 3:03 AM, Ard Biesheuvel wrote:
Hi Tyler,
On 23 February 2018 at 19:42, Tyler Baicar wrote:
The ESRT memory region is being exposed as System RAM in /proc/iomem
which is wrong because it cannot be overwritten. This memory is needed
for kexec kernels in order to
If ESRT initialization fails due to an unsupported version, the
early_memremap allocation is never unmapped. This will cause an
early ioremap leak. So, make sure to unmap the memory allocation
before returning from efi_esrt_init().
Signed-off-by: Tyler Baicar
---
drivers/firmware/efi/esrt.c | 2
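The leak described in this commit message can be illustrated with a small user-space sketch. It is not the kernel code: `fake_early_memremap()` and `fake_early_memunmap()` are hypothetical stand-ins that only count outstanding mappings, `struct fake_esrt` is a simplified header, and the supported version value of 1 is an assumption made for the example. The point is the single exit path that unmaps on the unsupported-version error as well as on success.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of mappings that were created and not yet released. */
static int live_mappings;

/* Hypothetical stand-in for early_memremap(): just counts the mapping. */
static void *fake_early_memremap(void *phys, size_t size)
{
    (void)size;
    live_mappings++;
    return phys;
}

/* Hypothetical stand-in for early_memunmap(): releases one mapping. */
static void fake_early_memunmap(void *virt, size_t size)
{
    (void)virt;
    (void)size;
    live_mappings--;
}

/* Simplified table header; only the version field matters here. */
struct fake_esrt {
    uint64_t version;
};

/* Returns 0 on success, -1 if the table version is not supported. */
static int fake_esrt_init(struct fake_esrt *table)
{
    struct fake_esrt *va;
    int ret = 0;

    assert(table != NULL);
    va = fake_early_memremap(table, sizeof(*table));

    if (va->version != 1) {
        /* Unsupported version: fall through to the unmap below
         * instead of returning with the mapping still live. */
        ret = -1;
        goto out_unmap;
    }

    /* ... a real implementation would parse the entries here ... */

out_unmap:
    fake_early_memunmap(va, sizeof(*table));
    return ret;
}
```

Routing every return through one `out_unmap` label is a common way to make this kind of leak hard to reintroduce when new error checks are added later.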
that it is not overwritten.
Signed-off-by: Tyler Baicar
Tested-by: Jeffrey Hugo
---
drivers/firmware/efi/esrt.c | 8
1 file changed, 8 insertions(+)
diff --git a/drivers/firmware/efi/esrt.c b/drivers/firmware/efi/esrt.c
index 504f3c3..f5f79c7 100644
--- a/drivers/firmware/efi/esrt.c
returning.
This still leaves ESRT unable to initialize in the kexec'd kernel, so now
mark the ESRT memory block as nomap so that this memory is not treated as
System RAM. With this change I'm able to see that the ESRT data is not
overwritten when running a kexec'd kernel.
Tyler Ba
:00.0: aer_layer=Data Link Layer, aer_agent=Transmitter ID
Signed-off-by: Tyler Baicar
---
drivers/pci/pcie/aer/aerdrv_errprint.c | 71 ++
1 file changed, 47 insertions(+), 24 deletions(-)
diff --git a/drivers/pci/pcie/aer/aerdrv_errprint.c
b/drivers/pci/pcie/aer
safe, because we are in process context.
On some platforms, when an SEA is triggered, the physical address could be
reported by the memory section or by the processor section, so we save the
address in both places.
For this series - Tested-by: Tyler Baicar
Note that this will probably need to be rebased on top of
Commit-ID: 301f55b1a9177132d2b9ce8a90bf0ae4b37bb850
Gitweb: https://git.kernel.org/tip/301f55b1a9177132d2b9ce8a90bf0ae4b37bb850
Author: Tyler Baicar
AuthorDate: Tue, 2 Jan 2018 18:10:42 +
Committer: Ingo Molnar
CommitDate: Wed, 3 Jan 2018 14:03:48 +0100
efi: Parse ARM error
Commit-ID: c6d8c8ef1d0d94fdae9f5d72982963db89f9cdad
Gitweb: https://git.kernel.org/tip/c6d8c8ef1d0d94fdae9f5d72982963db89f9cdad
Author: Tyler Baicar
AuthorDate: Tue, 2 Jan 2018 18:10:41 +
Committer: Ingo Molnar
CommitDate: Wed, 3 Jan 2018 14:03:48 +0100
efi: Move ARM CPER code to
On 11/15/2017 12:56 PM, Bjorn Helgaas wrote:
Hi Tyler,
On Wed, Nov 15, 2017 at 09:47:41AM -0500, Tyler Baicar wrote:
On 10/17/2017 11:42 AM, Tyler Baicar wrote:
Currently the AER driver uses cper_print_bits() to print the AER status
string. This causes the status string to not include the
First, break the PCIe AER handling out into its own function to separate
it from the standard GHES processing.
Then fix the AER handling to process all errors in the AER driver rather
than only handling recoverable errors.
V4: Rebase to 4.15-rc1 and add reviewed-by
Tyler Baicar (2):
acpi: apei
severity
Signed-off-by: Tyler Baicar
Reviewed-by: Borislav Petkov
---
drivers/acpi/apei/ghes.c | 22 +-
1 file changed, 17 insertions(+), 5 deletions(-)
diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index f67eb76..cc65d19 100644
--- a/drivers/acpi/apei
Move PCIe AER error handling code into a separate function.
Signed-off-by: Tyler Baicar
Reviewed-by: Borislav Petkov
---
drivers/acpi/apei/ghes.c | 64 +---
1 file changed, 34 insertions(+), 30 deletions(-)
diff --git a/drivers/acpi/apei/ghes.c b
severity
Signed-off-by: Tyler Baicar
---
drivers/acpi/apei/ghes.c | 22 +-
1 file changed, 17 insertions(+), 5 deletions(-)
diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 839c3d5..15dbf65 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei
On 10/17/2017 11:42 AM, Tyler Baicar wrote:
Currently the AER driver uses cper_print_bits() to print the AER status
string. This causes the status string to not include the proper PCI device
name prefix that the other AER prints include. Also, it has a different
print level than all the other
On 10/2/2017 7:19 PM, Bjorn Helgaas wrote:
On Mon, Aug 28, 2017 at 11:09:44AM -0600, Tyler Baicar wrote:
Correctable errors do not need any software intervention, so
avoid calling into the software recovery process for correctable
errors.
Signed-off-by: Tyler Baicar
---
drivers/pci/pcie/aer
On 11/13/2017 7:36 AM, Dongdong Liu wrote:
On 2017/11/9 3:13, Tyler Baicar wrote:
Currently the GHES code only calls into the AER driver for
recoverable type errors. This is incorrect because errors of
other severities do not get logged by the AER driver and do not
get exposed to user space via
On 11/9/2017 4:46 AM, Borislav Petkov wrote:
On Wed, Nov 08, 2017 at 12:13:12PM -0700, Tyler Baicar wrote:
Currently the GHES code only calls into the AER driver for
recoverable type errors. This is incorrect because errors of
other severities do not get logged by the AER driver and do not
get
On 11/9/2017 4:46 AM, Borislav Petkov wrote:
On Wed, Nov 08, 2017 at 12:13:12PM -0700, Tyler Baicar wrote:
Currently the GHES code only calls into the AER driver for
recoverable type errors. This is incorrect because errors of
other severities do not get logged by the AER driver and do not
get
severity
Signed-off-by: Tyler Baicar
---
drivers/acpi/apei/ghes.c | 8 +++-
1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 839c3d5..bb65fa6 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -458,14
Move PCIe AER error handling code into a separate function.
Signed-off-by: Tyler Baicar
---
drivers/acpi/apei/ghes.c | 64 +---
1 file changed, 34 insertions(+), 30 deletions(-)
diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index
First, break the PCIe AER handling out into its own function to separate
it from the standard GHES processing.
Then fix the AER handling to process all errors in the AER driver rather
than only handling recoverable errors.
Tyler Baicar (2):
acpi: apei: handle PCIe AER errors in separate
On 10/11/2017 1:09 PM, Bjorn Helgaas wrote:
On Wed, Oct 11, 2017 at 10:37:47AM -0400, Tyler Baicar wrote:
On 10/2/2017 7:19 PM, Bjorn Helgaas wrote:
On Mon, Aug 28, 2017 at 11:09:44AM -0600, Tyler Baicar wrote:
Correctable errors do not need any software intervention, so
avoid calling into
On 10/17/2017 11:28 AM, Tyler Baicar wrote:
Currently the GHES code only calls into the AER driver for
recoverable type errors. This is incorrect because errors of
other severities do not get logged by the AER driver and do not
get exposed to user space via the AER trace event. So, call
into the
On 10/20/2017 7:55 PM, Bjorn Helgaas wrote:
On Tue, Oct 17, 2017 at 09:42:02AM -0600, Tyler Baicar wrote:
Currently the AER driver uses cper_print_bits() to print the AER status
string. This causes the status string to not include the proper PCI device
name prefix that the other AER prints
ghes_ioremap_area or arch_apei_flush_tlb_one(),
rip them out.
RFC as I've only build-tested this on x86. For arm64 I've tested it on a
software model. Any more testing would be welcome. These patches are based
on rc7.
For the arm64 and APEI patches:
Tested-by: Tyler Baicar
Verified on arm64. I no long
On 10/30/2017 1:46 PM, Linus Torvalds wrote:
On Mon, Oct 30, 2017 at 10:20 AM, Linus Torvalds wrote:
I will add a "might_sleep()" to ioremap_page_range() itself, so that
we get this warning more reliably and much earlier. Right now it has
been hidden by the fact that most of the time the time th
On 10/30/2017 10:06 AM, Borislav Petkov wrote:
On Mon, Oct 30, 2017 at 10:01:52AM -0400, Tyler Baicar wrote:
This is not as important for polling sources as it is for the
interrupt sources since polling sources are regularly checked and
shouldn't be used for fatal error scenarios. For inte
On 10/30/2017 7:05 AM, Borislav Petkov wrote:
On Mon, Oct 30, 2017 at 12:18:35AM +0100, Fengguang Wu wrote:
CC related developers for the BUG in v4.14-rc6.
On Sun, Oct 29, 2017 at 11:51:55PM +0100, Fengguang Wu wrote:
Hi Linus,
Up to now we see the below boot error/warnings when testing v4.14
On 10/29/2017 9:23 PM, Qiang Zheng wrote:
In the current error status block processing flow, if a wrong format is
detected, the GHES table ack is not cleared.
As a result, new errors cannot be written to the GHES table, because UEFI
needs to check the ack to know whether the error was handled by the OS.
This patch solves the issue, no matte
On 10/18/2017 6:14 AM, David Laight wrote:
From: Tyler Baicar [mailto:tbai...@codeaurora.org]
Sent: 17 October 2017 18:14
On 10/17/2017 12:00 PM, David Laight wrote:
From: Tyler Baicar
Sent: 17 October 2017 16:42
Currently the AER driver uses cper_print_bits() to print the AER status
string
On 10/17/2017 3:30 PM, Andy Shevchenko wrote:
On Tue, 2017-10-17 at 11:23 -0600, Tyler Baicar wrote:
ARM errors just print out the error information value, then the
value needs to be manually decoded as per the UEFI spec. Add
decoding of the ARM error information value so that the kernel
logs
in UEFI 2.7
spec tables 263-265.
Signed-off-by: Tyler Baicar
---
drivers/firmware/efi/cper.c | 213 +++-
include/linux/cper.h| 44 +
2 files changed, 255 insertions(+), 2 deletions(-)
diff --git a/drivers/firmware/efi/cper.c b/drivers
On 10/17/2017 12:00 PM, David Laight wrote:
From: Tyler Baicar
Sent: 17 October 2017 16:42
Currently the AER driver uses cper_print_bits() to print the AER status
string. This causes the status string to not include the proper PCI device
name prefix that the other AER prints include. Also, it
Layer, aer_agent=Receiver ID
pcieport 0003:00:00.0: aer_status: 0x1000, aer_mask: 0xe000
pcieport 0003:00:00.0: Replay Timer Timeout
pcieport 0003:00:00.0: aer_layer=Data Link Layer, aer_agent=Transmitter ID
Signed-off-by: Tyler Baicar
---
drivers/pci/pcie/aer/aerdrv_errprint.c | 15
severity.
Signed-off-by: Tyler Baicar
---
drivers/acpi/apei/ghes.c | 76 +---
1 file changed, 46 insertions(+), 30 deletions(-)
diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 3c3a37b..d7801bc 100644
--- a/drivers/acpi/apei
has no chance to be called. Hence, remove the unnecessary
handling when CONFIG_ACPI_APEI_SEA is not defined.
For the NMI notification, it has the same issue as SEA notification,
so also remove the unused dead-code for it.
Cc: Tyler Baicar
Cc: James Morse
Signed-off-by: Dongjiu Geng
Tested-by
y Luck
Cc: Borislav Petkov
Cc: Tyler Baicar
Cc: Will Deacon
Cc: James Morse
Cc: "Jonathan (Zhixiong) Zhang"
Cc: Shiju Jose
Cc: linux-a...@vger.kernel.org
Signed-off-by: Kees Cook
Tested-by: Tyler Baicar
Verified that the polled error sources still work with this timer setup.
Thanks,
On 10/2/2017 7:19 PM, Bjorn Helgaas wrote:
On Mon, Aug 28, 2017 at 11:09:44AM -0600, Tyler Baicar wrote:
Correctable errors do not need any software intervention, so
avoid calling into the software recovery process for correctable
errors.
Signed-off-by: Tyler Baicar
---
drivers/pci/pcie/aer
On 9/27/2017 6:05 AM, gengdongjiu wrote:
Tyler, Stephen
On 2017/9/27 3:23, Tyler Baicar wrote:
Signed-off-by: Dongjiu Geng
Tested-by: Tyler Baicar
Tested this functionality using SEA support.
++Stephen,
Something to be aware of, this patch will conflict with
https://lkml.org/lkml/2017/9
not accurate, so EL3 firmware
should identify the address to an invalid value.
Signed-off-by: Dongjiu Geng
Tested-by: Tyler Baicar
Tested this functionality using SEA support.
++Stephen,
Something to be aware of, this patch will conflict with
https://lkml.org/lkml/2017/9/14/663
It may make
On 9/13/2017 8:40 AM, Baicar, Tyler wrote:
On 8/29/2017 2:16 AM, Borislav Petkov wrote:
On Mon, Aug 28, 2017 at 10:53:41AM -0600, Tyler Baicar wrote:
Currently we acknowledge errors before clearing the error status.
This could cause a new error to be populated by firmware in-between
the error
severity.
Signed-off-by: Tyler Baicar
---
drivers/acpi/apei/ghes.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index d661d45..5cab238 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -489,9
Correctable errors do not need any software intervention, so
avoid calling into the software recovery process for correctable
errors.
Signed-off-by: Tyler Baicar
---
drivers/pci/pcie/aer/aerdrv_core.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/pci/pcie/aer
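A minimal sketch of the control flow this commit message describes, under stated assumptions: the severity enum, `sketch_do_recovery()` and `sketch_handle_error()` are hypothetical names for illustration and do not correspond to the AER driver's actual functions. The only behavior shown is that a correctable error returns before the recovery path is entered.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical severity levels for the sketch. */
enum sketch_severity {
    SKETCH_CORRECTABLE,
    SKETCH_NONFATAL,
    SKETCH_FATAL,
};

/* Counts how often the (stand-in) recovery path was entered. */
static int recovery_calls;

/* Stand-in for the software recovery process. */
static void sketch_do_recovery(enum sketch_severity sev)
{
    (void)sev;
    recovery_calls++;
}

/* Returns true if recovery was attempted, false if it was skipped. */
static bool sketch_handle_error(enum sketch_severity sev)
{
    assert(sev <= SKETCH_FATAL);

    /* Logging would happen here for every severity. */

    /* Correctable errors were already fixed by hardware, so no
     * software recovery is needed. */
    if (sev == SKETCH_CORRECTABLE)
        return false;

    sketch_do_recovery(sev);
    return true;
}
```

The early return keeps logging and recovery separate: every error can still be reported, while only uncorrectable ones pay the cost of the recovery process.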
tatus before acknowledging the errors.
Also, make sure to acknowledge the error if the error status read
fails.
V3: Separate check for -ENOENT return value
V2: Only send error ack if there was an error populated
Remove curly braces that are no longer needed
Signed-off-by: Tyler Baicar
---
dr
tatus before acknowledging the errors.
Also, make sure to acknowledge the error if the error status read
fails.
V2: Only send error ack if there was an error populated
Remove curly braces that are no longer needed
Signed-off-by: Tyler Baicar
---
drivers/acpi/apei/ghes.c | 9 +++--
1
GHES estatus iteration to properly increment through
the estatus blocks similar to how the CPER estatus printing
iterates through them.
Fixes: bbcc2e7b642e ("ras: acpi/apei: cper: add support for generic data v3 structure")
Signed-off-by: Tyler Baicar
Tested-by: Austin Christ
---
dr
tatus before acknowledging the errors.
Also, make sure to acknowledge the error if the error status read
fails.
Signed-off-by: Tyler Baicar
---
drivers/acpi/apei/ghes.c | 6 ++
1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
eliminating the race condition.
Add support for parsing of GHESv2 sub-tables as well.
Signed-off-by: Tyler Baicar
CC: Jonathan (Zhixiong) Zhang
Reviewed-by: James Morse
---
drivers/acpi/apei/ghes.c | 59 +---
drivers/acpi/apei/hest.c | 7 --
include/acpi
Add support for ARM Common Platform Error Record (CPER).
UEFI 2.6 specification adds support for ARM specific
processor error information to be reported as part of the
CPER records. This provides more detail for processor error logs.
Signed-off-by: Tyler Baicar
CC: Jonathan (Zhixiong) Zhang
_t to map with in
the same way as ghes_ioremap_pfn_irq().
Signed-off-by: Tyler Baicar
CC: Jonathan (Zhixiong) Zhang
Reviewed-by: James Morse
Acked-by: Catalin Marinas
---
arch/arm64/Kconfig| 2 ++
arch/arm64/mm/fault.c | 17
drivers/acpi/apei/Kconfig
. The OS
should panic when a hardware error record is received with this
severity.
Call panic() after CPER data in error status block is printed if
severity is fatal, before each error section is handled.
Signed-off-by: Jonathan (Zhixiong) Zhang
Signed-off-by: Tyler Baicar
Reviewed-by: James Mors
rated.
Generate a trace event which contains the raw error data for
non-standard section type error records.
Signed-off-by: Tyler Baicar
CC: Jonathan (Zhixiong) Zhang
Tested-by: Shiju Jose
---
drivers/acpi/apei/ghes.c | 27 +++
drivers/ras/ras.c | 10 +-
in
message of an unsupported FSC would be printed and nothing else
would happen. With this patch, the code gets routed to the APEI
handling of SEAs in the host kernel to report the SEA information.
Signed-off-by: Tyler Baicar
Acked-by: Catalin Marinas
Acked-by: Marc Zyngier
Acked-by: Christoffer Dall
then be decoded using vendor
specific tools.
Signed-off-by: Tyler Baicar
CC: Jonathan (Zhixiong) Zhang
Reviewed-by: James Morse
---
drivers/firmware/efi/cper.c | 11 +--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/c
section N.2.4.4.
Signed-off-by: Tyler Baicar
Acked-by: Steven Rostedt
Reviewed-by: Xie XiuQi
---
drivers/acpi/apei/ghes.c| 6 +-
drivers/firmware/efi/cper.c | 1 +
drivers/ras/ras.c | 6 ++
include/linux/ras.h | 3 +++
include/ras/ras_event.h | 45
The ACPI 6.1 spec adds a new revision of the generic error data
entry structure. Add support to handle the new structure as well
as properly verify and iterate through the generic data entries.
Signed-off-by: Tyler Baicar
CC: Jonathan (Zhixiong) Zhang
---
drivers/acpi/apei/ghes.c| 11
kml.org/lkml/2016/2/5/544
Jonathan (Zhixiong) Zhang (1):
acpi: apei: panic OS with fatal error status block
Tyler Baicar (10):
acpi: apei: read ack upon ghes record consumption
ras: acpi/apei: cper: add support for generic data v3 structure
cper: add timestamp print to CPER status printing
The ACPI 6.1 spec added a timestamp to the generic error data
entry structure. Print the timestamp out when printing out the
error information.
Signed-off-by: Tyler Baicar
CC: Jonathan (Zhixiong) Zhang
---
drivers/firmware/efi/cper.c | 26 ++
1 file changed, 26
specific SEA faults so that the
new SEA handler is used.
Signed-off-by: Tyler Baicar
CC: Jonathan (Zhixiong) Zhang
Reviewed-by: James Morse
Acked-by: Catalin Marinas
---
arch/arm64/include/asm/esr.h | 1 +
arch/arm64/mm/fault.c| 45 ++--
2 files
wasn't present.
V3: Check for pending errors of all GHES types
Signed-off-by: Tyler Baicar
---
drivers/acpi/apei/ghes.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index d0855c0..5347230 100644
--- a/drivers/acpi/apei/ghes.c
The ACPI 6.1 spec adds a new revision of the generic error data
entry structure. Add support to handle the new structure as well
as properly verify and iterate through the generic data entries.
Signed-off-by: Tyler Baicar
CC: Jonathan (Zhixiong) Zhang
---
drivers/acpi/apei/ghes.c| 11
specific SEA faults so that the
new SEA handler is used.
Signed-off-by: Tyler Baicar
CC: Jonathan (Zhixiong) Zhang
Reviewed-by: James Morse
Acked-by: Catalin Marinas
---
arch/arm64/include/asm/esr.h | 1 +
arch/arm64/mm/fault.c| 45 ++--
2 files
. The OS
should panic when a hardware error record is received with this
severity.
Call panic() after CPER data in error status block is printed if
severity is fatal, before each error section is handled.
Signed-off-by: Jonathan (Zhixiong) Zhang
Signed-off-by: Tyler Baicar
Reviewed-by: James Mors
message of an unsupported FSC would be printed and nothing else
would happen. With this patch, the code gets routed to the APEI
handling of SEAs in the host kernel to report the SEA information.
Signed-off-by: Tyler Baicar
Acked-by: Catalin Marinas
Acked-by: Marc Zyngier
Acked-by: Christoffer Dall
rated.
Generate a trace event which contains the raw error data for
non-standard section type error records.
Signed-off-by: Tyler Baicar
CC: Jonathan (Zhixiong) Zhang
Tested-by: Shiju Jose
---
drivers/acpi/apei/ghes.c | 27 +++
drivers/ras/ras.c | 9 +
in
section N.2.4.4.
Signed-off-by: Tyler Baicar
Acked-by: Steven Rostedt
Reviewed-by: Xie XiuQi
---
drivers/acpi/apei/ghes.c| 6 +-
drivers/firmware/efi/cper.c | 1 +
drivers/ras/ras.c | 6 ++
include/linux/ras.h | 3 +++
include/ras/ras_event.h | 45
0
[ 140.739226] {1}[Hardware Error]: 0050: 0101 0001
0000
...
The raw data from the error can then be decoded using vendor
specific tools.
Signed-off-by: Tyler Baicar
CC: Jonathan (Zhixiong) Zhang
Reviewed-by: James
Add support for ARM Common Platform Error Record (CPER).
UEFI 2.6 specification adds support for ARM specific
processor error information to be reported as part of the
CPER records. This provides more detail for processor error logs.
Signed-off-by: Tyler Baicar
CC: Jonathan (Zhixiong) Zhang
The ACPI 6.1 spec added a timestamp to the generic error data
entry structure. Print the timestamp out when printing out the
error information.
Signed-off-by: Tyler Baicar
CC: Jonathan (Zhixiong) Zhang
---
drivers/firmware/efi/cper.c | 26 ++
1 file changed, 26
_t to map with in
the same way as ghes_ioremap_pfn_irq().
Signed-off-by: Tyler Baicar
CC: Jonathan (Zhixiong) Zhang
Reviewed-by: James Morse
Acked-by: Catalin Marinas
---
arch/arm64/Kconfig| 2 ++
arch/arm64/mm/fault.c | 17
drivers/acpi/apei/Kconfig
eliminating the race condition.
Add support for parsing of GHESv2 sub-tables as well.
Signed-off-by: Tyler Baicar
CC: Jonathan (Zhixiong) Zhang
Reviewed-by: James Morse
---
drivers/acpi/apei/ghes.c | 59 +---
drivers/acpi/apei/hest.c | 7 --
include/acpi