On 07/28/2017 08:38 AM, Laszlo Ersek wrote:
On 07/27/17 21:00, Brijesh Singh wrote:
On 07/27/2017 12:56 PM, Ard Biesheuvel wrote:
On 27 July 2017 at 18:16, Laszlo Ersek <ler...@redhat.com> wrote:

Normally, Map allocates the bounce buffer internally, and Unmap
releases the bounce buffer internally (for BusMasterRead and
BusMasterWrite), which is not right here. If you use
AllocateBuffer() separately, then Map() -- with
BusMasterCommonBuffer -- will not allocate internally, and Unmap()
will also not deallocate internally. So, in the ExitBootServices()
callback, you will be able to call Unmap() only, and then *not* call
FreeBuffer() at all.

This is why I suggest introducing all four functions (Allocate,
Free, Map, Unmap) to the VIRTIO_DEVICE_PROTOCOL, and why all virtio
drivers should use all four functions explicitly, not just Map and
Unmap.

... Actually, I think there is a bug in
"OvmfPkg/IoMmuDxe/AmdSevIoMmu.c". You have the following
distribution of operations at the moment:

- IoMmuAllocateBuffer() allocates pages and clears the memory
   encryption mask

- IoMmuFreeBuffer() re-sets the memory encryption mask, and
   deallocates pages

- IoMmuMap() does nothing at all when BusMasterCommonBuffer is
   requested (and IoMmuAllocateBuffer() was called previously).
   Otherwise, IoMmuMap() allocates pages, allocates MAP_INFO, and
   clears the memory encryption mask.

- IoMmuUnmap() does nothing when cleaning up a BusMasterCommonBuffer
   operation (--> NO_MAPPING). Otherwise, IoMmuUnmap() clears the
   encryption mask, and frees both the pages and MAP_INFO.


I agree with you, there is a bug in AmdSevIoMmu.c. I will fix it. And
introduce list to track if the buffer was allocated by us. If buffer
was allocated by us then we can safely say that it does not require a
bounce buffer. While implementing the initial code I was not able to
see BusMasterCommonBuffer usage. But with virtio things are getting a
bit more clear in my mind.

The purpose of the internal list is a little bit different -- it is a
"free list", not a tracker list.

Under the proposed scheme, a MAP_INFO structure will have to be
allocated for *all* mappings: both for CommonBuffer (where the actual
pages come from the caller, who retrieved them earlier with an
AllocateBuffer() call), and for Read/Write (where the actual pages will
be allocated internally to Map). Allocating the MAP_INFO structure in
Map() is necessary in *both* cases because the Unmap() function can
*only* consult this MAP_INFO structure to learn the address and the size
at which it has to re-set the memory encryption mask. This is because
the Unmap() function gets no other input parameter.

Then, regardless of the original operation (CommonBuffer vs.
Read/Write), the MAP_INFO structure has to be recycled in Unmap(). (For
CommonBuffer, the actual pages are owned by the caller, so we don't free
them here; for Read/Write, the actual pages are owned by Map/Unmap, so
we free them in Unmap() *in addition* to recycling MAP_INFO.) The main
point is that MAP_INFO cannot be released with FreePool() because that
could change the UEFI memmap during ExitBootServices() processing. So
MAP_INFO has to be "released" to an internal *free* list.

And since we have an internal free list, the original MAP_INFO
allocation in Map() should consult the free list first, and only if it
is empty should it fall back to AllocatePool().

Whether the actual pages tracked by MAP_INFO were allocated internally
by Map(), or externally by the caller (via AllocateBuffer()) is an
independent question. (And it can be seen from the MAP_INFO.Operation
field.)

Does this make it clearer?


It makes sense. thanks

One question, do we need to return EFI_NOT_SUPPORTED error when we get
request to map a buffer with CommonBuffer but the buffer was not
allocated using "EFI_PCI_IO_PROTOCOL->AllocateBuffer (...)"?

At least as per spec, it seems if caller wants to map a buffer with
CommonBuffer then it must have called "EFI_PCI_IO_PROTOCOL->AllocateBuffer 
(...)"



This distribution of operations seems wrong. The key point is that
AllocateBuffer() *need not* result in a buffer that is immediately
usable, and that client code is required to call Map()
*unconditionally*, even if BusMasterCommonBuffer is the desired
operation. Therefore, the right distribution of operations is:

- IoMmuAllocateBuffer() allocates pages and does not touch the
   encryption mask..

- IoMmuFreeBuffer() deallocates pages and does not touch the
   encryption mask.


Actually one of main reason why we cleared and restored the memory
encryption mask during allocate/free is because we also consume the
IOMMU protocol in QemuFwCfgLib as a method to allocate and free a DMA
buffer. I am certainly open to suggestions.

Argh. That's again my fault then, I should have noticed it. :( I
apologize, the Map/Unmap/AllocateBuffer/FreeBuffer APIs are a "first"
for me as well.

As discussed earlier (and confirmed by Ard), calling *just*
AllocateBuffer() is never enough, Map()/Unmap() are always necessary.

So here's what I suggest (to keep the changes localized):

- Extend the prototype of InternalQemuFwCfgSevDmaAllocateBuffer()
   function to output a "VOID *Mapping" parameter as well, in addition to
   the address.

- Modify the prototype of the InternalQemuFwCfgSevDmaFreeBuffer()
   function to take a "VOID *Mapping" parameter in addition to the buffer
   address.

- In the DXE implementation of InternalQemuFwCfgSevDmaAllocateBuffer(),
   keep the current IOMMU AllocateBuffer() call, but also call IOMMU
   Map(), with CommonBuffer. Furthermore, propagate the Mapping output of
   Map() outwards. (Note that Map() may have to be called in a loop,
   dependent on "NumberOfBytes".)

- In the DXE implementation of InternalQemuFwCfgSevDmaFreeBuffer(), call
   the IOMMU Unmap() function first (using the new Mapping parameter).


I will update the code and send patch for review. thanks


[1]
https://github.com/tianocore/edk2/blob/master/OvmfPkg/Library/QemuFwCfgLib/QemuFwCfgDxe.c#L159

[2]
https://github.com/tianocore/edk2/blob/master/OvmfPkg/Library/QemuFwCfgLib/QemuFwCfgDxe.c#L197


- IoMmuMap() does not allocate pages when BusMasterCommonBuffer is
   requested, and it allocates pages (bounce buffer) otherwise.


I am trying to wrap my head around how we can support
BusMasterCommonBuffer when buffer was not allocated by us. Changing
the memory encryption mask in a page table will not update the
contents.

Thank you for the clarification. I've now tried to review the current
code for a better understanding. Are the below statements correct?

- For the guest, guest memory is always readable transparently.
- For the host, guest memory is always readable *as it was written
   last*.
- If the guest memory was last written with the C bit clear, then the
   host can read it as plaintext (regardless of the current state of the
   C bit).
- If the guest memory was last written with the C bit set, then the host
   can only read garbage (regardless of the current state of the C bit).


Your understanding is correct.

*If* this is the case, then:

(a) We already have a sort of guest->host information leak. Namely, in
IoMmuFreeBuffer() [OvmfPkg/IoMmuDxe/AmdSevIoMmu.c], the C bit is
restored, and the pages are released with gBS->FreePages(). However, the
pages being released are not rewritten in place (they are not actually
re-encrypted, only marked for encryption whenever they are next written
to). This means that pages released like this remain readable to the
hypervisor for an unpredictable time.

This is not necessarily a critical problem (after all the contents of
those pages were visible to the hypervisor at some point anyway!); but
in the longer term, such pages could accumulate, and if the hypervisor
is compromised later, it could potentially read an unbounded amount of
"past guest data" as plain-text.


Very good catch, I can *zero* the pages. IIRC, I did not consider doing
so because it may introduce unnecessary perform hit (mainly when dealing
with larger pages). I think when we load kernel or initrd using QemuFwCfg
DMA interface we allocate large buffers.

I will still go ahead and experiment *zeroing* page and measure the performance
impact before we integrate the solution.

(b) Plus, approaching the question from the Map() direction, we need to
consider two scenarios:

- Client code calls AllocateBuffer(), then Map(), and it writes to the
   buffer only then. This should be safe.
- client code calls AllocateBuffer(), writes to it, and then calls
   Map(). This will result in memory contents that look like garbage to
   the hypervisor. Bad.

I can imagine the following to handle these cases: in the Map() and
Unmap() functions, we have to decrypt and encrypt the memory contents
in-place, after changing the C bit (without allocating additional
memory). Introduce a static UINT8 array with EFI_PAGE_SIZE bytes (this
will always remain in encrypted memory). Update the C bit with a single
function call for the entire range (like now) -- this will not affect
the guest-readability of the pages --, then bounce each page within the
range to the static buffer and back to its original place. In effect
this will in-place encrypt or decrypt the memory, and will be faster
than a byte-wise rewrite.

* BusMasterRead (host will want to read guest memory):
   - Client calls Map() without calling AllocateBuffer() first. Map()
     allocates an internal bounce buffer, clears the C bit, and does the
     copying.
   - Client fires off the PCI transaction.
   - Client calls Unmap(), without calling FreeBuffer() later. Unmap()
     restores the C-bit, *zeros* the pages, and releases the pages.


Yes this will work just fine (see my previous comments on *zeroing* pages)

* BusMasterWrite (host will want to write to guest memory):
   - Client calls Map() without calling AllocateBuffer() first. Map()
     allocates an internal bounce buffer and clears the C bit.
   - Client fires off the PCI transaction.
   - Client calls Unmap(), without calling FreeBuffer() later. Unmap()
     copies the bounce buffer to crpyted (client-owned) memory, restores
     the C-bit, *zeros* the pages, and releases the pages.


Yes, this will work just fine (see my previous comments on *zeroing* pages)

* BusMasterCommonBuffer:
   - Client calls AllocateBuffer(), and places some data in the returned
     memory.
   - Client calls Map(). Map() clears the C bit in one fell swoop, and
     then decrypts the buffer in-place (by bouncing it page-wise to the
     static array and back).
   - Client communicates with the device.
   - Client calls Unmap(). Unmap() restores the C bit in one fell swoop,
     and encrypts the buffer in-place (by bouncing it page-wise to the
     static array and back).
   - Client reads some residual data from the buffer.
   - Client calls FreeBuffer(). FreeBuffer() relases the pages.


Yes this works fine as long as the client uses 
EFI_PCI_IO_PROTOCOL.AllocateBuffer()
to allocate the buffer.


This is suitable for the discussed ExitBootServices() callback update
too, where the callback will reset the virtio device (like now) and then
call Unmap() for all shared areas, without calling FreeBuffer() on them.
The above logic will re-encrypt the contents without affecting the UEFI
memmap.

Also since the memory encryption mask works on PAGE_SIZE hence
changing the encryption mask on not our allocated buffer could mess
things up (e.g if NumberOfBytes is not PAGE_SIZE aligned).

This is not a problem for BusMasterRead and BusMasterWrite because for
those operations, Map() allocates the internal bounce buffer. Also not a
problem for BusMasterCommonBuffer, because then the client first calls
AllocateBuffer (whole number of pages), and so Map() can round up the
number of bytes and de-crypt full pages in-place (see above).


    *Regardless* of BusMaster operation, the following actions are
    carried out unconditionally:

    . the memory encryption mask is cleared in this function (and in
      this function only),

    . An attempt is made to grab a MAP_INFO structure from an
      internal free list (to be introduced!). The head of the list is
      a new static variable. If the free list is empty, then a
      MAP_INFO structure is allocated with AllocatePool(). The
      NO_MAPPING macro becomes unused and can be deleted from the
      source code.

- IoMmuUnmap() clears the encryption mask unconditionally. (For
   this, it has to consult the MAP_INFO structure that is being
   passed in from the caller.) In addition:

    . If MapInfo->Operation is BusMasterCommonBuffer, then we know
      the allocation was done separately in AllocateBuffer, so we do
      not release the pages. Otherwise, we do release the pages.

    . MapInfo is linked back on the internal free list (see above).
      It is *never* released with FreePool().

    This approach guarantees that IoMmuUnmap() can de-program the
    IOMMU (= re-set the memory encryption mask) without changing the
    UEFI memory map. (I trust that MemEncryptSevSetPageEncMask() will
    not split page tables internally when it *re*sets the encryption
    mask -- is that correct?)

Yes, MemEncryptSevSetPageEncMask() will not split pages when it
restores mask.

Great, thanks.
Laszlo

_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel

Reply via email to