Hi, I've made some progress, but I've got some serious problems that I'll need help with.

The current patchset (with 36 patches) is visible in my "smm_wip1" branch on github:

  https://github.com/lersek/edk2/compare/master...lersek:smm_wip1

I'm not posting the patchset to edk2-devel, for two reasons:
- it is not testable yet (it's partially testable in the middle)
- it is huge (due to importing PiSmmCpuDxeSmm in patch #24, ad85ba0).

If you're interested, please fetch the series from github.

So here's a quick overview and status report.

** First, let's (re)state the purpose of this work:

The Secure Boot feature as built into OVMF is not actually secure; it does not enforce separation between a potentially malicious runtime guest OS and the runtime firmware (OVMF itself). OVMF's Secure Boot implementation can easily be subverted for example by directly poking into the pflash chip that stores authenticated variables, or messing with various special memory areas that the S3 resume PEIM processes (as data) or executes (as code, see eg. the boot script executor image).

The planned solution for this is to add SMRAM / SMI / SMM support to all of QEMU, KVM, and OVMF. Gerd and Paolo have been working on the first two, and I've been struggling with the third.

QEMU and KVM intend to provide TSEG emulation for SMRAM purposes (the guest RAM underneath TSEG can be shown, hidden, and locked away until platform reset), and (independently) QEMU intends to restrict access to the read-write mapped varstore pflash chip to System Management Mode.

** High level description for OVMF:

The edk2 tree includes most of the necessary modules (beyond the protocols that are by design platform dependent). So, on a very high level, the tasks are the following:

(1) Audit the special memory ranges used by OVMF, and make them secure against a malicious runtime OS.
This is addressed by going through the ranges one by one (they are described in detail in the OVMF whitepaper, in the "comprehensive memory map of OVMF" chapter), and protecting them / restoring them from the pristine (read-only mapped) "code" flash chip, as appropriate.

(2) Pull in the general SMM-related infrastructure (drivers, libraries).

(3) Switch the build to SMM-specific variants of drivers (and libraries) where both SMM and non-SMM implementations are available (and where they make a difference for security).

(4) Implement the modules (protocols and PPIs) that adapt the platform-independent infrastructure to the QEMU Q35 platform.

These tasks come together from several sources:
- Vol4 of the PI1.4 spec
- protocol GUID and PPI dependencies discovered from INF files and source code
- searching INF file names for *Smm*
- Jiewen's and Vincent's whitepapers about S3 and UEFI variables

** Let's see what the patches cover thus far.

> 1 e5726fb OvmfPkg: introduce -D SMM_REQUIRE and PcdSmmSmramRequire

Switching between library instances (which are statically linked into executable modules) at runtime, depending on the underlying QEMU's support for SMM, would be both very complicated and ugly, so we decided on IRC to make SMM support a build time switch. This patch introduces that switch (and the corresponding Feature PCD).

Importantly, an OVMF binary built with this switch will *require* SMM support in QEMU, and refuse to run after a certain point when it is not available. (We can discuss this question later; right now it is the least of my concerns.)
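
To illustrate the build-time switch from patch #1, here is a minimal standalone C sketch. It is not code from the patch. It assumes that the Feature PCD behaves like a boolean fixed at compile time, and that some platform probe reports whether SMRAM is available. The names SMM_SMRAM_REQUIRE, PLATFORM_INFO, BOOT_DECISION and DecideBoot() are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Compile with -DSMM_REQUIRE to select the SMM-only flavor of the build.
 * The constant below stands in for a Feature PCD: a boolean that is fixed
 * at build time. (Hypothetical name, for illustration only.) */
#ifdef SMM_REQUIRE
#define SMM_SMRAM_REQUIRE true
#else
#define SMM_SMRAM_REQUIRE false
#endif

typedef enum {
  BOOT_WITHOUT_SMM,  /* build does not ask for SMM; platform not consulted */
  BOOT_WITH_SMM,     /* build requires SMM and the platform provides it    */
  BOOT_REFUSED       /* build requires SMM but the platform lacks it       */
} BOOT_DECISION;

/* Hypothetical result of probing the platform at runtime. */
typedef struct {
  bool HasSmram;
} PLATFORM_INFO;

/* Decide how to proceed, given the build-time requirement and the probe. */
BOOT_DECISION
DecideBoot (bool SmmRequired, const PLATFORM_INFO *Platform)
{
  assert (Platform != NULL);

  if (!SmmRequired) {
    return BOOT_WITHOUT_SMM;
  }
  return Platform->HasSmram ? BOOT_WITH_SMM : BOOT_REFUSED;
}
```

A caller would pass the build-time constant, as in DecideBoot (SMM_SMRAM_REQUIRE, &Platform). The point of the sketch is that no library instance is swapped at runtime: the only runtime decision left is whether to continue or to refuse.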

> 2 40f831a MdePkg: BaseExtractGuidedSectionLib: allow forced reinit
>   of handler table
> 3 b3eb2a4 OvmfPkg: set PcdBaseExtractGuidedSectionLibForceInit for
>   SEC on SMM_REQUIRE
> 4 7154624 OvmfPkg: Sec: assert the build-time calculated end of the
>   scratch buffer
> 5 4d1c9dd OvmfPkg: decompress FVs on S3 resume if SMM_REQUIRE is set
> 6 5f09195 OvmfPkg: PlatformPei: allow caching in
>   AddReservedMemoryBaseSizeHob()
> 7 241f95b OvmfPkg: PlatformPei: account for TSEG size with
>   PcdSmmSmramRequire set

These patches belong to task (1) -- auditing & protecting the special memory ranges used by OVMF, plus adjusting the memory map in general for TSEG.

Patch #2 fixes a potential security issue in an MdePkg library. I sent that patch on Apr 23rd to the tianocore-security list. I got no feedback, so I guess it's not overly significant. I plan to simply post it with the rest, at some point. FWIW, patch #2 could be posted and applied independently, so let me know if I should do that.

> 8 b9ba372 OvmfPkg: split Include/OvmfPlatforms.h
> 9 4aa7b9a OvmfPkg: consolidate POWER_MGMT_REGISTER_Q35() on
>   "Q35MchIch9.h" macros
> 10 d6aac71 OvmfPkg: consolidate POWER_MGMT_REGISTER_PIIX4() on
>    "I440FxPiix4.h" macros
> 11 780c910 OvmfPkg: extract some bits and port offsets common to Q35
>    and I440FX

These patches refactor the macros that relate to Q35 vs. PIIX4. This sub-series builds upon Gabriel's earlier work, and it could be posted and applied independently. Let me know if I should do that.

The refactoring is necessary because many more macros are needed for massaging TSEG & co, and this is the cleanest way to introduce them.

> 12 788c603 OvmfPkg: add PEIM for providing TSEG-as-SMRAM during PEI
> 13 98d9262 OvmfPkg: add DXE_DRIVER for providing TSEG-as-SMRAM during
>    boot-time DXE
> 14 5bc3653 OvmfPkg: implement EFI_SMM_CONTROL2_PROTOCOL with a
>    DXE_RUNTIME_DRIVER

These three patches implement the following PPIs and protocols:
- PEI_SMM_ACCESS_PPI
- EFI_SMM_ACCESS2_PROTOCOL
- EFI_SMM_CONTROL2_PROTOCOL

They cover part of task (4).

> 15 d253184 FIXME: DROP THIS once SMI_LOCK actually locks GBL_SMI_EN
> 16 34a32f9 FIXME: DROP THIS -- SMI_LOCK should protect APMC_EN as
>    well

These are temporary workarounds for issues in the QEMU work-in-progress series. They will be dropped later.

> 17 0014c34 MdeModulePkg: SmmIplEntry(): don't suppress SMM core
>    startup failure

This fixes a bug in MdeModulePkg that I've come across. (It is normally not exposed; it was triggered by an earlier, buggy, EFI_SMM_ACCESS2_PROTOCOL of mine.) This fix could be posted and committed separately.

> 18 36b1aa7 OvmfPkg: pull in the SMM IPL and SMM core
> 19 637601e OvmfPkg: pull in CpuIo2Smm driver

They belong to task (2).

> 20 c1d5295 OvmfPkg: AcpiS3SaveDxe: fix protocol usage hint in the INF
>    file
> 21 db70e8c OvmfPkg: AcpiS3SaveDxe: don't fake LockBox protocol if
>    SMM_REQUIRE

These belong to task (3), in particular to the LockBox provider (see patches just below).

> 22 73446c4 OvmfPkg: LockBox: -D SMM_REQUIRE excludes our fake lockbox

This patch completes task (1). It is located here in the series because it forms part of the larger LockBox switch-over, which follows below.

> 23 1e23249 OvmfPkg: LockBox: use SMM stack with -D SMM_REQUIRE

Belongs to task (3).

All of the above is (hopefully) pretty clear and reasonably doable.

The horrors start here, because for the SMM stack to work, more platform-specific modules are needed that are incredibly complex.
In particular, some driver needs to
- provide EFI_SMM_CONFIGURATION_PROTOCOL,
- 'Initialize the SMM entry vector with the code necessary to meet the entry point requirements described in “Entering & Exiting SMM”',
- populate the SMM_S3_RESUME_STATE object in SMRAM that carries vital information from normal boot to S3 resume.

(
A digression: Remember that my RFC / v1 / v2 series for S3 support faked SMRAM. Ultimately we dropped that idea back then, but those versions *worked* (they had bugs / issues of course, but not in relation to SMRAM usage). And, I never implemented any of the above. So how was that possible?

Turns out the SMM_COMMUNICATION_PROTOCOL implementation in edk2, ie. SmmCommunicationCommunicate() in "MdeModulePkg/Core/PiSmmCore/PiSmmIpl.c", is very smart. It is capable of entering SMM, but it doesn't insist on it, *if* the system is still early enough into the boot process (namely, SMRAM has not been locked yet; locking happens near the end of DXE).

Hence, a missing EFI_SMM_CONFIGURATION_PROTOCOL was no problem in those series of mine, because my S3 work never had a *runtime* component. The only time SMM drivers were actually in use was before SMRAM got locked, and at that point edk2's SMM_COMMUNICATION_PROTOCOL does not *want* to enter SMM -- since SMRAM is open(able) anyway. Plus, I simply faked SMM_S3_RESUME_STATE.

This is why the need for EFI_SMM_CONFIGURATION_PROTOCOL is a cold shower now. We want runtime drivers to work -- see variables -- so we can't forego locking SMRAM near the end of DXE, and entering SMM when the variable drivers need it at runtime.
)

The SMM entry vector is a large undertaking. Thankfully Jiewen pointed us to the open source (3-clause BSDL) Quark_EDKII_v1.1.0 distribution:

  http://thread.gmane.org/gmane.comp.emulators.qemu/331335/focus=14016

which provides a *huge* driver, "PiSmmCpuDxeSmm", that covers the above.
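
Going back to the digression above, the behavior described there for the communication path can be modeled roughly as follows. This is a standalone C sketch of that description only; it is not the logic in PiSmmIpl.c. SMM_STATE, COMM_PATH and ChooseCommPath() are hypothetical names, and the third outcome (no usable path once SMRAM is locked and no entry code has been installed) is an assumption that spells out why the missing protocol hurts at runtime.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum {
  COMM_DIRECT_CALL,  /* SMRAM still open(able): handler reached directly   */
  COMM_VIA_SMI,      /* SMRAM locked: handler reached by entering SMM      */
  COMM_UNAVAILABLE   /* SMRAM locked, but nothing installed the entry code */
} COMM_PATH;

/* Hypothetical snapshot of the two facts the decision depends on. */
typedef struct {
  bool SmramLocked;           /* set near the end of DXE                   */
  bool EntryVectorInstalled;  /* some driver provided the SMM entry code   */
} SMM_STATE;

/* Pick the way a communicate request reaches the SMM handlers. */
COMM_PATH
ChooseCommPath (const SMM_STATE *State)
{
  assert (State != NULL);

  if (!State->SmramLocked) {
    /* Early boot: no need to insist on entering SMM. */
    return COMM_DIRECT_CALL;
  }
  return State->EntryVectorInstalled ? COMM_VIA_SMI : COMM_UNAVAILABLE;
}
```

In this model, the earlier S3 series only ever exercised the first branch, which is why they worked without any entry vector. Runtime variable services need the second branch, and that is where the entry code becomes mandatory.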

So, in the rest of the patch series, I imported that driver to OVMF, and gradually got it to build:

> 24 ad85ba0 OvmfPkg: import PiSmmCpuDxeSmm from
>    Quark_EDKII_v1.1.0/IA32FamilyCpuBasePkg
> 25 b805d4b OvmfPkg: PiSmmCpuDxeSmm: eliminate SmmLib dependency
> 26 63921a3 OvmfPkg: import CpuConfigLib from
>    Quark_EDKII_v1.1.0/IA32FamilyCpuBasePkg
> 27 4007781 OvmfPkg: import SmmCpuPlatformHookLibNull from
>    Quark_EDKII_v1.1.0/IA32FamilyCpuBasePkg
> 28 19c251c OvmfPkg: resolve ReportStatusCodeLib for DXE_SMM_DRIVER
>    modules
> 29 860f330 OvmfPkg: replace IA32FamilyCpuBasePkg.dec references with
>    OvmfPkg.dec
> 30 a548d0a OvmfPkg: replace gEfiCpuTokenSpaceGuid with
>    gQuarkPortCpuTokenSpaceGuid
> 31 5cb4d15 OvmfPkg: PiSmmCpuDxeSmm: fix namespace for
>    PcdCpuMaxLogicalProcessorNumber
> 32 e7809c2 FIXME: OvmfPkg: import PCDs from
>    Quark_EDKII_v1.1.0/IA32FamilyCpuBasePkg
> 33 c6d4730 OvmfPkg: import three protocols from
>    Quark_EDKII_v1.1.0/IA32FamilyCpuBasePkg
> 34 2a4cfd7 OvmfPkg: PiSmmCpuDxeSmm: fix warning about
>    UINT32-to-(VOID*) conversion
> 35 3484b2f OvmfPkg: PiSmmCpuDxeSmm: fix up pathname in include
>    directive
> 36 def1f58 FIXME: OvmfPkg: build PiSmmCpuDxeSmm for -D SMM_REQUIRE

These patches belong to task (4).

** What's complete, missing.

("Complete" below means "ready for testing".)

Task (1) is complete.

Task (2) is complete.

Task (3) is partially complete. What remains is to replace the Variable --> FTW --> FVB protocol stack with an SMM one (where we enter SMM as early as in the variable drivers -- the variable driver is split into a non-privileged runtime and a privileged SMM part, and FTW and FVB both function in SMM). This doesn't look hard (most of the modules are already there); I have plans.

In fact, if you check out the tree at patch #23, you should be able to build it and play with it. S3 won't work, variables won't be "secure", but the SMM core (and "some" SMM drivers) will be loaded into SMRAM, and should work.
(Assuming you have a QEMU build with Gerd's & Paolo's WIP patches, and use TCG acceleration (or build a host kernel with Paolo's KVM patches I guess).)

Task (4) is where things fall apart.

A (comparatively) small TODO item is to actually provide the SMM FVB implementation (by reworking OvmfPkg/QemuFlashFvbServicesRuntimeDxe) for the variable protocol stack above. But that shouldn't be the problem.

Trouble is that the PiSmmCpuDxeSmm driver imported from Quark_EDKII_v1.1.0 creates a huge mess for me. To begin with, this driver should have always lived in edk2, given that it is Intel *CPU*, not Intel *chipset* specific, and UefiCpuPkg is already Intel CPU specific.

So here are the *specific* issues I'm facing (and need help with):

* Problem #1 for task (4):

Because Quark is 32-bit only, the (mostly assembly) code under "OvmfPkg/QuarkPort/PiSmmCpuDxeSmm/Ia32" that (partly) constitutes the SMM entry vector does not *build* for X64 guests. I don't know what it would take to make that code build & work for X64 guests:
- just replace some registers and instructions, and maybe recalculate some offsets?
- or else, is X64 architecturally different in this regard, and a rewrite from scratch is needed?

(But, again, this code should have always lived in edk2, under UefiCpuPkg...)

* Problem #2 (terrible) for task (4):

I audited all PCDs used by PiSmmCpuDxeSmm carefully. Most of them are fixed or feature PCDs, fine. However, there are two dynamic PCDs that carry important information (and we can't just go with a default):
- PcdCpuConfigContextBuffer
- PcdCpuS3DataAddress

PiSmmCpuDxeSmm *consumes* these PCDs, and the driver that produces them -- brace for impact -- is

  Quark_EDKII_v1.1.0/IA32FamilyCpuBasePkg/CpuMpDxe/

This means that PiSmmCpuDxeSmm will never work unless we throw out our current MpService implementation, located in edk2's UefiCpuPkg/CpuDxe/ (contributed by Chen Fan of Fujitsu), and replace it with the one from the Quark distribution.
Which means that I'd have to import *another* 300+ KB driver from the Quark package, get it to build, and pray that it doesn't have even *further* dependencies.

* Problem #3 (smaller, in comparison) for task (4):

The default SMBASE area, starting at guest-phys address 0x30000, and continuing for 0x10000 bytes, doesn't seem to be protected with any kind of memory allocation HOB or similar in Quark_EDKII_v1.1.0. (Grep for SMM_DEFAULT_SMBASE.) Why is this okay?

As far as I understand, when a CPU enters SMM for the very first time, it will happily scribble over this area (saving state etc). We should make sure in advance (starting from PEI) that no later memory allocation (in PEI or in DXE) can accidentally overlap with that memory range, because otherwise, when the SMBASE relocation is performed, such an allocation would be corrupted.

Can someone please advise wrt. problems #1 to #3?

Thanks
Laszlo