* Laszlo Ersek (ler...@redhat.com) wrote:
> On 09/06/18 19:23, Dr. David Alan Gilbert wrote:
> > * Marc-André Lureau (marcandre.lur...@gmail.com) wrote:
> >> Hi
> >>
> >> On Thu, Sep 6, 2018 at 1:42 PM Dr. David Alan Gilbert
> >> <dgilb...@redhat.com> wrote:
> >>>
> >>> * Marc-André Lureau (marcandre.lur...@gmail.com) wrote:
> >>>> Hi
> >>>>
> >>>> On Thu, Sep 6, 2018 at 12:59 PM Dr. David Alan Gilbert
> >>>> <dgilb...@redhat.com> wrote:
> >>>>>
> >>>>> * Marc-André Lureau (marcandre.lur...@gmail.com) wrote:
> >>>>>> Hi
> >>>>>>
> >>>>>> On Thu, Sep 6, 2018 at 11:58 AM Igor Mammedov <imamm...@redhat.com> wrote:
> >>>>>>>
> >>>>>>> On Thu, 6 Sep 2018 07:50:09 +0400
> >>>>>>> Marc-André Lureau <marcandre.lur...@gmail.com> wrote:
> >>>>>>>
> >>>>>>>> Hi
> >>>>>>>>
> >>>>>>>> On Tue, Sep 4, 2018 at 10:47 AM Igor Mammedov <imamm...@redhat.com> wrote:
> >>>>>>>>>
> >>>>>>>>> On Fri, 31 Aug 2018 19:24:24 +0200
> >>>>>>>>> Marc-André Lureau <marcandre.lur...@redhat.com> wrote:
> >>>>>>>>>
> >>>>>>>>>> This allows to pass the last failing test from the Windows HLK TPM 2.0
> >>>>>>>>>> TCG PPI 1.3 tests.
> >>>>>>>>>>
> >>>>>>>>>> The interface is described in the "TCG Platform Reset Attack
> >>>>>>>>>> Mitigation Specification", chapter 6 "ACPI _DSM Function". According
> >>>>>>>>>> to Laszlo, it's not so easy to implement in OVMF, he suggested to do
> >>>>>>>>>> it in qemu instead.
> >>>>>>>>>>
> >>>>>>>>>> Signed-off-by: Marc-André Lureau <marcandre.lur...@redhat.com>
> >>>>>>>>>> ---
> >>>>>>>>>>  hw/tpm/tpm_ppi.h     |  2 ++
> >>>>>>>>>>  hw/i386/acpi-build.c | 46 ++++++++++++++++++++++++++++++++++++++++++++
> >>>>>>>>>>  hw/tpm/tpm_crb.c     |  1 +
> >>>>>>>>>>  hw/tpm/tpm_ppi.c     | 23 ++++++++++++++++++++++
> >>>>>>>>>>  hw/tpm/tpm_tis.c     |  1 +
> >>>>>>>>>>  docs/specs/tpm.txt   |  2 ++
> >>>>>>>>>>  hw/tpm/trace-events  |  3 +++
> >>>>>>>>>>  7 files changed, 78 insertions(+)
> >>>>>>>>>>
> >>>>>>>>>> diff --git a/hw/tpm/tpm_ppi.h b/hw/tpm/tpm_ppi.h
> >>>>>>>>>> index f6458bf87e..3239751e9f 100644
> >>>>>>>>>> --- a/hw/tpm/tpm_ppi.h
> >>>>>>>>>> +++ b/hw/tpm/tpm_ppi.h
> >>>>>>>>>> @@ -23,4 +23,6 @@ typedef struct TPMPPI {
> >>>>>>>>>>  bool tpm_ppi_init(TPMPPI *tpmppi, struct MemoryRegion *m,
> >>>>>>>>>>                    hwaddr addr, Object *obj, Error **errp);
> >>>>>>>>>>
> >>>>>>>>>> +void tpm_ppi_reset(TPMPPI *tpmppi);
> >>>>>>>>>> +
> >>>>>>>>>>  #endif /* TPM_TPM_PPI_H */
> >>>>>>>>>> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> >>>>>>>>>> index c5e9a6e11d..2ab3e8fae7 100644
> >>>>>>>>>> --- a/hw/i386/acpi-build.c
> >>>>>>>>>> +++ b/hw/i386/acpi-build.c
> >>>>>>>>>> @@ -1824,6 +1824,13 @@ build_tpm_ppi(TPMIf *tpm, Aml *dev)
> >>>>>>>>>>      pprq = aml_name("PPRQ");
> >>>>>>>>>>      pprm = aml_name("PPRM");
> >>>>>>>>>>
> >>>>>>>>>> +    aml_append(dev,
> >>>>>>>>>> +               aml_operation_region("TPP3", AML_SYSTEM_MEMORY,
> >>>>>>>>>> +                                    aml_int(TPM_PPI_ADDR_BASE + 0x15a),
> >>>>>>>>>> +                                    0x1));
> >>>>>>>>>> +    field = aml_field("TPP3", AML_BYTE_ACC, AML_NOLOCK, AML_PRESERVE);
> >>>>>>>>>> +    aml_append(field, aml_named_field("MOVV", 8));
> >>>>>>>>>> +    aml_append(dev, field);
> >>>>>>>>>>      /*
> >>>>>>>>>>       * DerefOf in Windows is broken with SYSTEM_MEMORY.  Use a dynamic
> >>>>>>>>>>       * operation region inside of a method for getting FUNC[op].
> >>>>>>>>>> @@ -2166,7 +2173,46 @@ build_tpm_ppi(TPMIf *tpm, Aml *dev)
> >>>>>>>>>>              aml_append(ifctx, aml_return(aml_buffer(1, zerobyte)));
> >>>>>>>>>>          }
> >>>>>>>>>>          aml_append(method, ifctx);
> >>>>>>>>>> +
> >>>>>>>>>> +        ifctx = aml_if(
> >>>>>>>>>> +            aml_equal(uuid,
> >>>>>>>>>> +                      aml_touuid("376054ED-CC13-4675-901C-4756D7F2D45D")));
> >>>>>>>>>> +        {
> >>>>>>>>>> +            /* standard DSM query function */
> >>>>>>>>>> +            ifctx2 = aml_if(aml_equal(function, zero));
> >>>>>>>>>> +            {
> >>>>>>>>>> +                uint8_t byte_list[1] = { 0x03 };
> >>>>>>>>>> +                aml_append(ifctx2, aml_return(aml_buffer(1, byte_list)));
> >>>>>>>>>> +            }
> >>>>>>>>>> +            aml_append(ifctx, ifctx2);
> >>>>>>>>>> +
> >>>>>>>>>> +            /*
> >>>>>>>>>> +             * TCG Platform Reset Attack Mitigation Specification 1.0 Ch.6
> >>>>>>>>>> +             *
> >>>>>>>>>> +             * Arg 2 (Integer): Function Index = 1
> >>>>>>>>>> +             * Arg 3 (Package): Arguments = Package: Type: Integer
> >>>>>>>>>> +             *                  Operation Value of the Request
> >>>>>>>>>> +             * Returns: Type: Integer
> >>>>>>>>>> +             *          0: Success
> >>>>>>>>>> +             *          1: General Failure
> >>>>>>>>>> +             */
> >>>>>>>>>> +            ifctx2 = aml_if(aml_equal(function, one));
> >>>>>>>>>> +            {
> >>>>>>>>>> +                aml_append(ifctx2,
> >>>>>>>>>> +                           aml_store(aml_derefof(aml_index(arguments, zero)),
> >>>>>>>>>> +                                     op));
> >>>>>>>>>> +                {
> >>>>>>>>>> +                    aml_append(ifctx2, aml_store(op, aml_name("MOVV")));
> >>>>>>>>>> +
> >>>>>>>>>> +                    /* 0: success */
> >>>>>>>>>> +                    aml_append(ifctx2, aml_return(zero));
> >>>>>>>>>> +                }
> >>>>>>>>>> +            }
> >>>>>>>>>> +            aml_append(ifctx, ifctx2);
> >>>>>>>>>> +        }
> >>>>>>>>>> +        aml_append(method, ifctx);
> >>>>>>>>>>      }
> >>>>>>>>>> +
> >>>>>>>>>>      aml_append(dev, method);
> >>>>>>>>>>  }
> >>>>>>>>>>
> >>>>>>>>>> diff --git a/hw/tpm/tpm_crb.c b/hw/tpm/tpm_crb.c
> >>>>>>>>>> index b243222fd6..48f6a716ad 100644
> >>>>>>>>>> --- a/hw/tpm/tpm_crb.c
> >>>>>>>>>> +++ b/hw/tpm/tpm_crb.c
> >>>>>>>>>> @@ -233,6 +233,7 @@ static void tpm_crb_reset(void *dev)
> >>>>>>>>>>  {
> >>>>>>>>>>      CRBState *s = CRB(dev);
> >>>>>>>>>>
> >>>>>>>>>> +    tpm_ppi_reset(&s->ppi);
> >>>>>>>>>>      tpm_backend_reset(s->tpmbe);
> >>>>>>>>>>
> >>>>>>>>>>      memset(s->regs, 0, sizeof(s->regs));
> >>>>>>>>>> diff --git a/hw/tpm/tpm_ppi.c b/hw/tpm/tpm_ppi.c
> >>>>>>>>>> index 8b46b9dd4b..ce43bc5729 100644
> >>>>>>>>>> --- a/hw/tpm/tpm_ppi.c
> >>>>>>>>>> +++ b/hw/tpm/tpm_ppi.c
> >>>>>>>>>> @@ -16,8 +16,30 @@
> >>>>>>>>>>  #include "qapi/error.h"
> >>>>>>>>>>  #include "cpu.h"
> >>>>>>>>>>  #include "sysemu/memory_mapping.h"
> >>>>>>>>>> +#include "sysemu/reset.h"
> >>>>>>>>>>  #include "migration/vmstate.h"
> >>>>>>>>>>  #include "tpm_ppi.h"
> >>>>>>>>>> +#include "trace.h"
> >>>>>>>>>> +
> >>>>>>>>>> +void tpm_ppi_reset(TPMPPI *tpmppi)
> >>>>>>>>>> +{
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>> +    char *ptr = memory_region_get_ram_ptr(&tpmppi->ram);
> >>>>>>>>> nvdimm seems to use cpu_physical_memory_read() to access guest
> >>>>>>>>> accessible memory, so question is what's difference?
> >>>>>>>>
> >>>>>>>> cpu_physical_memory_read() is higher level, doing dispatch on address
> >>>>>>>> and length checks.
> >>>>>>>>
> >>>>>>>> This is a bit unnecessary, as ppi->buf could be accessed directly.
> >>>>>>> [...]
> >>>>>>>>>> +        memset(block->host_addr, 0,
> >>>>>>>>>> +               block->target_end - block->target_start);
> >>>>>>>>>> +    }
> >>>>>>> my concern here is that if we directly touch guest memory here
> >>>>>>> we might get in trouble on migration without dirtying modified
> >>>>>>> ranges
> >>>>>>
> >>>>>> It is a read-only of one byte.
> >>>>>> by the time the reset handler is called, the memory must have been
> >>>>>> already migrated.
> >>>>>
> >>>>> Looks like a write to me?
> >>>>
> >>>> the PPI RAM memory is read for the "memory clear" byte
> >>>> The whole guest RAM is reset to 0 if set.
> >>>
> >>> Oh, I see; hmm.
> >>> How do you avoid zeroing things like persistent memory?  Or ROMs?  Or EFI
> >>> pflash?
> >>
> >> guest_phys_blocks_append() only cares about RAM (see
> >> guest_phys_blocks_region_add)
> >
> > Hmm, promising; it uses:
> >     if (!memory_region_is_ram(section->mr) ||
> >         memory_region_is_ram_device(section->mr)) {
> >         return;
> >     }
> >
> > so ram_device is used by vfio and vhost-user; I don't see anything else.
> > pflash init's as a rom_device so that's probably OK.
> > But things like backends/hostmem-file.c just use
> > memory_region_init_ram_from_file even if they're shared or PMEM.
> > So, I think this would wipe an attached PMEM device - do you want to or
> > not?
>
> I think that question could be put, "does the reset attack mitigation
> spec recommend / require clearing persistent memory"? I've got no idea.
> The reset attack is that the platform is re-set (forcibly, uncleanly)
> while the original OS holds some secrets in memory, then the attacker's
> OS (or firmware application) is loaded, and those scavenge the
> leftovers. Would the original OS keep secrets in PMEM? I don't know.
No, I don't know either; and I think part of the answer might depend
what PMEM is being used for; if it's being used as actual storage with
a filesystem or database on, you probably don't want to clear it - I
mean that's what the (P)ersistence is for.

> I guess all address space that a guest OS could consider as "read-write
> memory" should be wiped. I think guest_phys_blocks_append() is a good
> choice here; originally I wrote that for supporting guest RAM dump; see
> commit c5d7f60f06142. Note that the is_ram_device bit was added
> separately, see commits e4dc3f5909ab9 and 21e00fa55f3fd -- but that's a
> change in the "right" direction here, because it *restricts* what
> regions qualify (for dumping, and here for clearing).

Yeah, I'm wondering if we should add a check for the pmem flag at the
same place.

Having said that; I don't think that's a question for this patch
series; if we agree that guest_phys_blocks* is the right thing to use
then it's a separate question about adding the pmem check to there.
(I didn't know about guest_phys_block* and would have probably just
used qemu_ram_foreach_block )

Dave

> >
> >>>
> >>>>> Also, don't forget that a guest reset can happen during a migration.
> >>>>
> >>>> Hmm, does cpu_physical_memory_read() guarantee the memory has been
> >>>> migrated?
> >>>> Is there a way to wait for migration to be completed in a reset handler?
> >>>
> >>> No; remember that migration can take a significant amount of time (many
> >>> minutes) as you stuff many GB of RAM down a network.
> >>>
> >>> So you can be in the situation where:
> >>>   a) Migration starts
> >>>   b) Migration sends a copy of most of RAM across
> >>>   c) Guest dirties lots of RAM in parallel with b
> >>>   d) migration sends some of the RAM again
> >>>   e) guest reboots
> >>>   f) migration keeps sending ram across
> >>>   g) Migration finally completes and starts on destination
> >>>
> >>> a-f are all happening on the source side as the guest is still running
> >>> and doing whatever it wants (including reboots).
> >>>
> >>> Given something like acpi-build.c's acpi_ram_update's call to
> >>> memory_region_set_dirty, would that work for you?
> >>
> >> after the memset(), it should then call:
> >>
> >>   memory_region_set_dirty(block->mr, 0, block->target_end -
> >>                           block->target_start);
> >>
> >> looks about right?
> >
> > I think so.
>
> I'll admit I can't follow the dirtying discussion. But, I'm reminded of
> another commit of Dave's, namely 90c647db8d59 ("Fix pflash migration",
> 2016-04-15). Would the same trick apply here?
>
> (
>
> BTW, I'm sorry about not having followed this series closely -- I feel
> bad that we can't solve this within the firmware, but we really can't.
> The issue is that this memory clearing would have to occur really early
> into the firmware, at the latest in the PEI phase.
>
> However, both the 32-bit and the 64-bit variants of OVMF's PEI phase
> have access only to the lowest 4GB of the memory address space. Mapping
> all RAM (even with a "sliding bank") for clearing it would be a real
> mess. To that, add the fact that OVMF's PEI phase executes from RAM (not
> from pflash -- in OVMF, only SEC executes from pflash, and it
> decompresses the PEI + DXE firmware volumes to RAM), so PEI would have
> to clear all RAM *except* the areas its own self occupies, such as code,
> global variables (static data), heap, stack, other processor data
> structures (IDT, GDT, page tables, ...).
> And, no gaps inside those
> should be left out either, because the previous OS might have left
> secrets there...
>
> This is actually easier for firmware that runs on physical hardware; for
> two reasons. First, on physical hardware, the PEI phase of the firmware
> runs from pflash (it might use CAR, Cache-As-RAM, for a small temporary
> r/w storage), so it doesn't have to worry about scribbling over itself.
> Second, on physical hardware, the memory controller too has to be booted
> up -- the PEI code that does this is all vendors' most closely guarded
> secret, and *never* open source --; and when the firmware massages
> chipset registers for that, it can use non-architected means to get the
> RAM to clear itself.
>
> In comparison, in QEMU/KVM guests, the RAM just *is*. While that's a
> huge relief most of the time, in this case, the fact that the RAM can
> start out as nonzero, is a big problem. Hence my plea to implement the
> feature in QEMU.
>
> )
>
> Thanks,
> Laszlo

--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK