Am Wed, 17 May 2023 12:52:31 +0200 schrieb Jan Kiszka <[email protected]>:
> On 15.05.23 20:20, 'Henning Schild' via EFI Boot Guard wrote: > > Tested on a Supermicro X11 series (Simatic IPC 1047E) and on some > > other random Supermicro X9 series machine. Should work for any > > board that has IPMI in any version. When an IPMI watchdog is > > present, iTCO is likely not working. The Linux driver would detect > > that and has special code to deal with especially Supermicro, here > > we use probe order. > > > > Developed using the IPMI Spec 2.0 sections 9 and 27. > > > > Signed-off-by: Henning Schild <[email protected]> > > --- > > Makefile.am | 2 + > > drivers/watchdog/ipmi_wdt.c | 193 > > ++++++++++++++++++++++++++++++++++++ 2 files changed, 195 > > insertions(+) create mode 100644 drivers/watchdog/ipmi_wdt.c > > > > diff --git a/Makefile.am b/Makefile.am > > index 48c560f72f0c..ba2440025a0c 100644 > > --- a/Makefile.am > > +++ b/Makefile.am > > @@ -159,12 +159,14 @@ if BOOTLOADER > > if ARCH_IS_X86 > > # NOTE: wdat.c is placed first so it is tried before any other > > drivers # NOTE: ipc4x7e_wdt.c must be *before* itco.c > > +# NOTE: ipmi_wdt.c must be *before* itco.c > > efi_sources_watchdogs = \ > > drivers/watchdog/wdat.c \ > > drivers/watchdog/amdfch_wdt.c \ > > drivers/watchdog/i6300esb.c \ > > drivers/watchdog/atom-quark.c \ > > drivers/watchdog/ipc4x7e_wdt.c \ > > + drivers/watchdog/ipmi_wdt.c \ > > drivers/watchdog/itco.c \ > > drivers/watchdog/hpwdt.c > > else > > diff --git a/drivers/watchdog/ipmi_wdt.c > > b/drivers/watchdog/ipmi_wdt.c new file mode 100644 > > index 000000000000..af55e57c5a0e > > --- /dev/null > > +++ b/drivers/watchdog/ipmi_wdt.c > > @@ -0,0 +1,193 @@ > > +/* > > + * EFI Boot Guard > > + * > > + * Copyright (c) Siemens AG, 2023 > > + * > > + * Authors: > > + * Henning Schild <[email protected]> > > + * > > + * This work is licensed under the terms of the GNU GPL, version > > 2. See > > + * the COPYING file in the top-level directory. > > + * > > + * SPDX-License-Identifier: GPL-2.0 > > + */ > > + > > +#include <efi.h> > > +#include <pci/header.h> > > +#include <sys/io.h> > > +#include "utils.h" > > + > > +#define SMBIOS_TYPE_IPMI_KCS 38 > > +#define IPMI_KCS_DEFAULT_IOBASE 0xca2 > > + > > +#define IPMI_KCS_STS_OBF 0x1 > > +#define IPMI_KCS_STS_IBF 0x2 > > + > > +#define IPMI_KCS_CMD_ABORT 0x60 > > +#define IPMI_KCS_CMD_WRITE_START 0x61 > > +#define IPMI_KCS_CMD_WRITE_END 0x62 > > + > > +#define IPMI_KCS_NETFS_LUN_WDT 0x18 > > + > > +#define IPMI_WDT_CMD_RESET 0x22 > > +#define IPMI_WDT_CMD_SET 0x24 > > +#define IPMI_WDT_SET_USE_OSLOAD 0x3 > > +#define IPMI_WDT_SET_ACTION_HARD_RESET 0x1 > > + > > + > > Extra newline. > > > +#define kcs_sts_is_error(sts) (((sts >> 6 ) & 0x3) == 0x3) > > + > > +static char > > +set_wdt_data[] = {IPMI_WDT_SET_USE_OSLOAD, > > IPMI_WDT_SET_ACTION_HARD_RESET, > > + 0x00, 0x00,0x00, 0x00}; > > + > > +static EFI_STATUS > > +kcs_wait_iobf(UINT16 io_base, char iobf) > > +{ > > + char sts; > > + > > + while (1) { > > + sts = inb(io_base + 1); > > + if (kcs_sts_is_error(sts)) > > + return EFI_DEVICE_ERROR; > > + if (iobf == IPMI_KCS_STS_IBF) { > > + // IBF we wait for clear > > + if (!(sts & IPMI_KCS_STS_IBF)) > > + break; > > + } else { > > + // OBF we wait for set > > + if (sts & IPMI_KCS_STS_OBF) > > + break; > > + } > > + } > > Is that loop guaranteed to eventually terminate? Or do we have to set > a waiting limit? What does the kernel do? Not sure what the kernel really does, did not dig that deep since the kernel code is hard to read and was not even looked at. The loop is expected to eventually end. We wait for input to become available, or output to be received by the BMC. Or we should see an error. But thinking about it again, i think it might indeed be better to put a timer around all of that. I will have a look at the kernel and freeipmi (userspace implementation) and kind of expect endless retry loops guarded with timers that use multiples of 5s (number coming from the spec). > > + > > + return EFI_SUCCESS; > > +} > > + > > +static EFI_STATUS > > +kcs_outb(UINT8 value, UINT16 io_base, UINT16 port) > > +{ > > + EFI_STATUS status; > > + > > + status = kcs_wait_iobf(io_base, IPMI_KCS_STS_IBF); > > + if (status) > > + return status; > > + > > + outb(value, io_base + port); > > + // dummy read > > Why? As delay or because the BMC expects a read in order to continue? BMC expects that, that is written in the SPEC and also called "dummy read" there. > > + inb(io_base); > > + > > + return EFI_SUCCESS; > > +} > > + > > +static EFI_STATUS > > +_send_ipmi_cmd(UINT16 io_base, char cmd, char *data, int datalen) > > +{ > > + int i, err = 0; > > + char lastbyte = cmd; > > + > > + err += kcs_outb(IPMI_KCS_CMD_WRITE_START, io_base, 1); > > + err += kcs_outb(IPMI_KCS_NETFS_LUN_WDT, io_base, 0); > > + > > + if (datalen) { > > + lastbyte = data[datalen - 1]; > > + err += kcs_outb(cmd, io_base, 0); > > + for (i = 0; i < datalen - 1; i++) > > + err += kcs_outb(data[i], io_base, 0); > > + } > > + > > + err += kcs_outb(IPMI_KCS_CMD_WRITE_END, io_base, 1); > > + err += kcs_outb(lastbyte, io_base, 0); > > + > > + if (err) > > + return EFI_DEVICE_ERROR; > > + > > + return kcs_wait_iobf(io_base, IPMI_KCS_STS_OBF); > > +} > > + > > +static VOID > > +handle_ipmi_error(UINT16 io_base) > > +{ > > + WARNING(L"Handling Error Status 0x%x\n", inb(io_base + 1)); > > + > > + outb(IPMI_KCS_CMD_ABORT, io_base + 1); > > + > > + if (kcs_wait_iobf(io_base, IPMI_KCS_STS_IBF)) > > + // stuck in error state, hopefully does not happen > > > > Reads a bit like "we are lost when we get here", which is not > completely true as we have an error message on the console and an > error return code from the caller, send_ipmi_cmd. What we then do > with the error code is a different story, see below. I will remove the comments. We come here when we are in error-state and hope to transition out of it by sending an abort command. If that all goes well we can try sending our real command again ... but if we are stuck in error we eventually have to give up. > > + return; > > + > > + if (inb((io_base + 1) & IPMI_KCS_STS_OBF)) > > + inb(io_base); > > + outb(0x0, io_base); > > + > > + if (kcs_wait_iobf(io_base, IPMI_KCS_STS_IBF)) > > + // stuck in error state, hopefully does not happen > > + return; > > +} > > + > > +static EFI_STATUS > > +send_ipmi_cmd(UINT16 io_base, char cmd, char *data, int datalen) > > +{ > > + EFI_STATUS status; > > + int i; > > + > > + for (i = 0; i < 5; i++) { > > + status = _send_ipmi_cmd(io_base, cmd, data, > > datalen); > > + if (status == EFI_SUCCESS) > > + return status; > > + handle_ipmi_error(io_base); > > + } > > + > > + return status; > > +} > > + > > +static EFI_STATUS __attribute__((constructor)) > > +init(__attribute__((unused)) EFI_PCI_IO *pci_io, > > + __attribute__((unused)) UINT16 pci_vendor_id, > > + __attribute__((unused)) UINT16 pci_device_id, > > + UINTN timeout) > > +{ > > + SMBIOS_STRUCTURE_TABLE *smbios_table; > > + SMBIOS_STRUCTURE_POINTER smbios_struct; > > + EFI_STATUS status; > > + UINT64 io_base; > > + UINT16 *timeout_value; > > + > > + status = LibGetSystemConfigurationTable(&SMBIOSTableGuid, > > + (VOID > > **)&smbios_table); + > > + if (status != EFI_SUCCESS) > > + return EFI_UNSUPPORTED; > > + > > + smbios_struct = smbios_find_struct(smbios_table, > > SMBIOS_TYPE_IPMI_KCS); + > > + if (smbios_struct.Raw == NULL) > > + return EFI_UNSUPPORTED; > > + > > + io_base = *((UINT64 *)(smbios_struct.Raw + 8)); > > + if (io_base == 0) { > > + io_base = IPMI_KCS_DEFAULT_IOBASE; > > + } else { > > + if (!(io_base & 1)) > > + // MMIO not implemented > > + return EFI_UNSUPPORTED; > > + > > + io_base &= ~1; > > + } > > + > > + INFO(L"Detected IPMI watchdog at I/O 0x%x\n", io_base); > > + timeout_value = (UINT16 *)(set_wdt_data + 4); > > + *timeout_value = timeout * 10; > > + > > + status = send_ipmi_cmd(io_base, IPMI_WDT_CMD_SET, > > set_wdt_data, > > + sizeof(set_wdt_data)); > > + if (status == EFI_SUCCESS) > > + status = send_ipmi_cmd(io_base, > > IPMI_WDT_CMD_RESET, NULL, 0); + > > + if (status != EFI_SUCCESS) { > > + WARNING(L"Watchdog device repeatedly reported > > errors. " \ > > + "Failed to arm the watchdog, but still > > booting up.\n"); > > That's deviating for other watchdog drivers. What makes this watchdog > different to justify it? If nothing, we can still argue about the best > behavior at EBG level. But, IIRC, a failing EFI application would > cause a reboot attempt of the firmware. This watchdog driver needs to have error-handling built-in the actual arming of the watchdog. Because we talk to a potentially multi-user device and there can be races. But you are totally right, i never thought about what would happen if i just returned the error to the firmware. I played with all the error handling in an UEFI shell ... where an error code looks like a "stuck in boot". I think i will indeed remove much of the error handling and retry/abort logic and simply pass the error to the firmware. And for the "breaking up endless waiting" simply use a 5s timer. We manage to arm in less than 5 seconds, or we tell the firmware there was an error. Henning > Jan > > > + } > > + > > + return EFI_SUCCESS; > > +} > -- You received this message because you are subscribed to the Google Groups "EFI Boot Guard" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To view this discussion on the web visit https://groups.google.com/d/msgid/efibootguard-dev/20230522163653.18f68869%40md1za8fc.ad001.siemens.net.
