Re: Partial BAR Address Allocation

2017-02-22 Thread Bjorn Helgaas
[+cc Joerg, iommu list]

On Wed, Feb 22, 2017 at 03:44:53PM -0500, Sinan Kaya wrote:
> On 2/22/2017 1:44 PM, Bjorn Helgaas wrote:
> > There is no way for a driver to say "I only need this memory BAR and
> > not the other ones."  The reason is because the PCI_COMMAND_MEMORY bit
> > enables *all* the memory BARs; there's no way to enable memory BARs
> > selectively.  If we enable memory BARs and one of them is unassigned,
> > that unassigned BAR is enabled, and the device will respond at
> > whatever address the register happens to contain, and that may cause
> > conflicts.
> > 
> > I'm not sure this answers your question.  Do you want to get rid of
> > 32-bit BAR addresses because your host bridge doesn't have a window to
> > 32-bit PCI addresses?  It's typical for a bridge to support a window
> > to the 32-bit PCI space as well as one to the 64-bit PCI space.  Often
> > it performs address translation for the 32-bit window so it doesn't
> > have to be in the 32-bit area on the CPU side, e.g., you could have
> > something like this where we have three host bridges and the 2-4GB
> > space on each PCI root bus is addressable:
> > 
> >   pci_bus 0000:00: root bus resource [mem 0x1080000000-0x10ffffffff] (bus address [0x80000000-0xffffffff])
> >   pci_bus 0001:00: root bus resource [mem 0x1180000000-0x11ffffffff] (bus address [0x80000000-0xffffffff])
> >   pci_bus 0002:00: root bus resource [mem 0x1280000000-0x12ffffffff] (bus address [0x80000000-0xffffffff])
> 
> The problem is that according to PCI specification BAR addresses and
> DMA addresses cannot overlap.
> 
> From PCI-to-PCI Bridge Arch. spec.: "A bridge forwards PCI memory
> transactions from its primary interface to its secondary interface
> (downstream) if a memory address is in the range defined by the
> Memory Base and Memory Limit registers (when the base is less than
> or equal to the limit) as illustrated in Figure 4-3. Conversely, a
> memory transaction on the secondary interface that is within this
> address range will not be forwarded upstream to the primary
> interface."
> 
> To be specific, if your DMA address happens to be in
> [0x80000000-0xffffffff] and the root port's aperture includes this
> range, the DMA will never make it to system memory.
> 
> Lorenzo and Robin took some steps to carve out PCI addresses out of
> DMA addresses in IOMMU drivers by using iova_reserve_pci_windows()
> function.
> 
> However, I see that we are still exposed when the operating system
> doesn't have any IOMMU driver and is using the SWIOTLB for instance. 

Hmmm.  I guess SWIOTLB assumes there's no address translation in the
DMA direction, right?  If there's no address translation in the PIO
direction, PCI bus BAR addresses are identical to the CPU-side
addresses.  In that case, there's no conflict because we already have
to assign BARs so they never look like a system memory address.

But if there *is* address translation in the PIO direction, we can
have conflicts because the bridge can translate CPU-side PIO accesses
to arbitrary PCI bus addresses.

> The FW solution I'm looking at requires carving out some part of the
> DDR from before OS boot so that OS doesn't reclaim that area for
> DMA.

If you want to reach system RAM, I guess you need to make sure you
only DMA to bus addresses outside the host bridge windows, as you said
above.  DMA inside the windows would be handled as peer-to-peer DMA.
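(For context, the IOMMU-side carve-out mentioned above works by reserving the host bridge windows from the DMA/IOVA allocator so those ranges are never handed out as DMA addresses. Below is a minimal sketch of the idea, loosely modeled on iova_reserve_pci_windows(); the function itself is illustrative, not the actual implementation.)

	static void reserve_pci_windows_sketch(struct pci_dev *dev,
					       struct iova_domain *iovad)
	{
		struct pci_host_bridge *bridge = pci_find_host_bridge(dev->bus);
		struct resource_entry *window;

		resource_list_for_each_entry(window, &bridge->windows) {
			if (resource_type(window->res) != IORESOURCE_MEM)
				continue;

			/* keep the window's bus address range out of the IOVA
			 * space; window->offset converts CPU-side resource
			 * addresses to PCI bus addresses
			 */
			reserve_iova(iovad,
				     iova_pfn(iovad, window->res->start - window->offset),
				     iova_pfn(iovad, window->res->end - window->offset));
		}
	}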

> I'm not very happy with this solution. I'm also surprised that there
> is no generic solution in the kernel that takes care of this for all
> root ports regardless of IOMMU driver presence.

The PCI core isn't really involved in allocating DMA addresses,
although there definitely is the connection with PCI-to-PCI bridge
windows that you mentioned.  I added IOMMU guys, who would know a lot
more than I do.

Bjorn


Re: [RFC PATCH v4 16/28] x86: Add support for changing memory encryption attribute

2017-02-22 Thread Borislav Petkov
On Thu, Feb 16, 2017 at 09:45:35AM -0600, Tom Lendacky wrote:
> Add support for changing the memory encryption attribute for one or more
> memory pages.

"This will be useful when we, , for example."

> Signed-off-by: Tom Lendacky 
> ---
>  arch/x86/include/asm/cacheflush.h |    3 ++
>  arch/x86/mm/pageattr.c            |   66 ++++++++++++++++++++++++++++++++++
>  2 files changed, 69 insertions(+)
> 
> diff --git a/arch/x86/include/asm/cacheflush.h b/arch/x86/include/asm/cacheflush.h
> index 872877d..33ae60a 100644
> --- a/arch/x86/include/asm/cacheflush.h
> +++ b/arch/x86/include/asm/cacheflush.h
> @@ -12,6 +12,7 @@
>   * Executability : eXeutable, NoteXecutable
>   * Read/Write: ReadOnly, ReadWrite
>   * Presence  : NotPresent
> + * Encryption: Encrypted, Decrypted
>   *
>   * Within a category, the attributes are mutually exclusive.
>   *
> @@ -47,6 +48,8 @@
>  int set_memory_rw(unsigned long addr, int numpages);
>  int set_memory_np(unsigned long addr, int numpages);
>  int set_memory_4k(unsigned long addr, int numpages);
> +int set_memory_encrypted(unsigned long addr, int numpages);
> +int set_memory_decrypted(unsigned long addr, int numpages);
>  
>  int set_memory_array_uc(unsigned long *addr, int addrinarray);
>  int set_memory_array_wc(unsigned long *addr, int addrinarray);
> diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
> index 91c5c63..9710f5c 100644
> --- a/arch/x86/mm/pageattr.c
> +++ b/arch/x86/mm/pageattr.c
> @@ -1742,6 +1742,72 @@ int set_memory_4k(unsigned long addr, int numpages)
>   __pgprot(0), 1, 0, NULL);
>  }
>  
> +static int __set_memory_enc_dec(unsigned long addr, int numpages, bool enc)
> +{
> + struct cpa_data cpa;
> + unsigned long start;
> + int ret;
> +
> + /* Nothing to do if the _PAGE_ENC attribute is zero */
> + if (_PAGE_ENC == 0)

Why not:

if (!sme_active())

?

> + return 0;
> +
> + /* Save original start address since it will be modified */

That's obvious - it is a small-enough function to fit on the screen. No
need for the comment.

> + start = addr;
> +
> + memset(&cpa, 0, sizeof(cpa));
> + cpa.vaddr = &addr;
> + cpa.numpages = numpages;
> + cpa.mask_set = enc ? __pgprot(_PAGE_ENC) : __pgprot(0);
> + cpa.mask_clr = enc ? __pgprot(0) : __pgprot(_PAGE_ENC);
> + cpa.pgd = init_mm.pgd;
> +
> + /* Should not be working on unaligned addresses */
> + if (WARN_ONCE(*cpa.vaddr & ~PAGE_MASK,
> +   "misaligned address: %#lx\n", *cpa.vaddr))

Use addr here so that you don't have to deref. gcc is probably smart
enough but the code should look more readable this way too.

> + *cpa.vaddr &= PAGE_MASK;

I know, you must use cpa.vaddr here but if you move that alignment check
over the cpa assignment, you can use addr solely.

> +
> + /* Must avoid aliasing mappings in the highmem code */
> + kmap_flush_unused();
> + vm_unmap_aliases();
> +
> + /*
> +  * Before changing the encryption attribute, we need to flush caches.
> +  */
> + if (static_cpu_has(X86_FEATURE_CLFLUSH))
> + cpa_flush_range(start, numpages, 1);
> + else
> + cpa_flush_all(1);

I guess we don't really need the distinction since a SME CPU most
definitely implies CLFLUSH support but ok, let's be careful.

> +
> + ret = __change_page_attr_set_clr(&cpa, 1);
> +
> + /*
> +  * After changing the encryption attribute, we need to flush TLBs
> +  * again in case any speculative TLB caching occurred (but no need
> +  * to flush caches again).  We could just use cpa_flush_all(), but
> +  * in case TLB flushing gets optimized in the cpa_flush_range()
> +  * path use the same logic as above.
> +  */
> + if (static_cpu_has(X86_FEATURE_CLFLUSH))
> + cpa_flush_range(start, numpages, 0);
> + else
> + cpa_flush_all(0);
> +
> + return ret;
> +}
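(For context, a caller of the new helpers would look something like the sketch below; the function and its use case are hypothetical, invented for illustration and not taken from the series.)

	/* flip a buffer to unencrypted before sharing it with a device or
	 * hypervisor, and make it private again afterwards
	 */
	static int example_share_buffer(struct page *page, int numpages)
	{
		unsigned long vaddr = (unsigned long)page_address(page);
		int ret;

		ret = set_memory_decrypted(vaddr, numpages);
		if (ret)
			return ret;

		/* ... the other side accesses the buffer in the clear here ... */

		return set_memory_encrypted(vaddr, numpages);
	}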

-- 
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


Re: [RFC PATCH v4 10/28] x86: Insure that boot memory areas are mapped properly

2017-02-22 Thread Tom Lendacky

On 2/20/2017 1:45 PM, Borislav Petkov wrote:

On Thu, Feb 16, 2017 at 09:44:11AM -0600, Tom Lendacky wrote:

The boot data and command line data are present in memory in a decrypted
state and are copied early in the boot process.  The early page fault
support will map these areas as encrypted, so before attempting to copy
them, add decrypted mappings so the data is accessed properly when copied.

For the initrd, encrypt this data in place. Since the initrd area will later
be mapped as encrypted, the data will then be accessed properly.

Signed-off-by: Tom Lendacky 
---


...


diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 182a4c7..03f8e74 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -46,13 +46,18 @@ static void __init reset_early_page_tables(void)
write_cr3(__sme_pa_nodebug(early_level4_pgt));
 }

+void __init __early_pgtable_flush(void)
+{
+   write_cr3(__sme_pa_nodebug(early_level4_pgt));
+}


Move that to mem_encrypt.c where it is used and make it static. The diff
below, ontop of this patch, seems to build fine here.


Ok, I can do that.



Also, aren't those mappings global so that you need to toggle CR4.PGE
for that?

PAGE_KERNEL at least has _PAGE_GLOBAL set.


The early_pmd_flags has _PAGE_GLOBAL cleared:

pmdval_t early_pmd_flags = __PAGE_KERNEL_LARGE & ~(_PAGE_GLOBAL | _PAGE_NX);

so I didn't do the CR4.PGE toggle. I could always add it to be safe in
case that is ever changed. It only happens twice, on the map and on the
unmap, so it shouldn't be a big deal.
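(For reference, the usual way to force out global TLB entries is the CR4.PGE toggle; a sketch of what that would add, assuming interrupts are disabled, and not part of the patch:)

	static void early_flush_tlb_global_sketch(void)
	{
		unsigned long cr4 = native_read_cr4();

		/* clearing PGE flushes global TLB entries; restoring it re-enables them */
		native_write_cr4(cr4 & ~X86_CR4_PGE);
		native_write_cr4(cr4);
	}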




+
 /* Create a new PMD entry */
-int __init early_make_pgtable(unsigned long address)
+int __init __early_make_pgtable(unsigned long address, pmdval_t pmd)


__early_make_pmd() then, since it creates a PMD entry.


unsigned long physaddr = address - __PAGE_OFFSET;
pgdval_t pgd, *pgd_p;
pudval_t pud, *pud_p;
-   pmdval_t pmd, *pmd_p;
+   pmdval_t *pmd_p;

/* Invalid address or early pgt is done ?  */
	if (physaddr >= MAXMEM || read_cr3() != __sme_pa_nodebug(early_level4_pgt))


...


diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index ac3565c..ec548e9 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -16,8 +16,12 @@

 #include 
 #include 
+#include 
+#include 

 extern pmdval_t early_pmd_flags;
+int __init __early_make_pgtable(unsigned long, pmdval_t);
+void __init __early_pgtable_flush(void);


What's with the forward declarations?

Those should be in some header AFAICT.


I can add them to a header, probably arch/x86/include/asm/pgtable.h.

Thanks,
Tom




  * Since SME related variables are set early in the boot process they must
@@ -103,6 +107,76 @@ void __init sme_early_decrypt(resource_size_t paddr, unsigned long size)
__sme_early_enc_dec(paddr, size, false);
 }


...

---
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 03f8e74c7223..c47500d72330 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -46,11 +46,6 @@ static void __init reset_early_page_tables(void)
write_cr3(__sme_pa_nodebug(early_level4_pgt));
 }

-void __init __early_pgtable_flush(void)
-{
-   write_cr3(__sme_pa_nodebug(early_level4_pgt));
-}
-
 /* Create a new PMD entry */
 int __init __early_make_pgtable(unsigned long address, pmdval_t pmd)
 {
diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index ec548e9a76f1..0af020b36232 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -21,7 +21,7 @@

 extern pmdval_t early_pmd_flags;
 int __init __early_make_pgtable(unsigned long, pmdval_t);
-void __init __early_pgtable_flush(void);
+extern pgd_t early_level4_pgt[PTRS_PER_PGD];

 /*
  * Since SME related variables are set early in the boot process they must
@@ -34,6 +34,11 @@ EXPORT_SYMBOL_GPL(sme_me_mask);
 /* Buffer used for early in-place encryption by BSP, no locking needed */
 static char sme_early_buffer[PAGE_SIZE] __aligned(PAGE_SIZE);

+static void __init early_pgtable_flush(void)
+{
+   write_cr3(__sme_pa_nodebug(early_level4_pgt));
+}
+
 /*
  * This routine does not change the underlying encryption setting of the
  * page(s) that map this memory. It assumes that eventually the memory is
@@ -158,7 +163,7 @@ void __init sme_unmap_bootdata(char *real_mode_data)
 */
__sme_map_unmap_bootdata(real_mode_data, false);

-   __early_pgtable_flush();
+   early_pgtable_flush();
 }

 void __init sme_map_bootdata(char *real_mode_data)
@@ -174,7 +179,7 @@ void __init sme_map_bootdata(char *real_mode_data)
 */
__sme_map_unmap_bootdata(real_mode_data, true);

-   __early_pgtable_flush();
+   early_pgtable_flush();
 }

 void __init sme_early_init(void)




Re: [RFC PATCH v4 07/28] x86: Provide general kernel support for memory encryption

2017-02-22 Thread Dave Hansen
On 02/16/2017 07:43 AM, Tom Lendacky wrote:
> )
> @@ -673,7 +683,7 @@ static inline unsigned long pgd_page_vaddr(pgd_t pgd)
>   * Currently stuck as a macro due to indirect forward reference to
>   * linux/mmzone.h's __section_mem_map_addr() definition:
>   */
> -#define pgd_page(pgd)	pfn_to_page(pgd_val(pgd) >> PAGE_SHIFT)
> +#define pgd_page(pgd)	pfn_to_page(pgd_pfn(pgd))

FWIW, these seem like good cleanups that can go in separately from the
rest of your series.


Re: [RFC PATCH v4 07/28] x86: Provide general kernel support for memory encryption

2017-02-22 Thread Dave Hansen
On 02/16/2017 07:43 AM, Tom Lendacky wrote:
>  static inline unsigned long pte_pfn(pte_t pte)
>  {
> - return (pte_val(pte) & PTE_PFN_MASK) >> PAGE_SHIFT;
> + return (pte_val(pte) & ~sme_me_mask & PTE_PFN_MASK) >> PAGE_SHIFT;
>  }
>  
>  static inline unsigned long pmd_pfn(pmd_t pmd)
>  {
> - return (pmd_val(pmd) & pmd_pfn_mask(pmd)) >> PAGE_SHIFT;
> + return (pmd_val(pmd) & ~sme_me_mask & pmd_pfn_mask(pmd)) >> PAGE_SHIFT;
>  }

Could you talk a bit about why you chose to do the "~sme_me_mask" bit in
here instead of making it a part of PTE_PFN_MASK / pmd_pfn_mask(pmd)?

It might not matter, but I'd be worried that this ends up breaking
direct users of PTE_PFN_MASK / pmd_pfn_mask(pmd) since they now no
longer mask the PFN out of a PTE.
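(A sketch of the alternative being suggested; the mask name is invented for illustration and this is not the actual patch:)

	/* fold the encryption mask into the PFN mask once, instead of open-coding
	 * "~sme_me_mask" at every pte_pfn()/pmd_pfn() call site
	 */
	#define PTE_PFN_MASK_SME	(PTE_PFN_MASK & ~sme_me_mask)

	static inline unsigned long pte_pfn(pte_t pte)
	{
		return (pte_val(pte) & PTE_PFN_MASK_SME) >> PAGE_SHIFT;
	}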


Re: [RFC PATCH v4 07/28] x86: Provide general kernel support for memory encryption

2017-02-22 Thread Tom Lendacky

On 2/20/2017 12:38 PM, Borislav Petkov wrote:

On Thu, Feb 16, 2017 at 09:43:32AM -0600, Tom Lendacky wrote:

Adding general kernel support for memory encryption includes:
- Modify and create some page table macros to include the Secure Memory
  Encryption (SME) memory encryption mask
- Modify and create some macros for calculating physical and virtual
  memory addresses
- Provide an SME initialization routine to update the protection map with
  the memory encryption mask so that it is used by default
- #undef CONFIG_AMD_MEM_ENCRYPT in the compressed boot path

Signed-off-by: Tom Lendacky 


...


+#define __sme_pa(x)		(__pa((x)) | sme_me_mask)
+#define __sme_pa_nodebug(x)	(__pa_nodebug((x)) | sme_me_mask)
+
 #else  /* !CONFIG_AMD_MEM_ENCRYPT */

 #ifndef sme_me_mask
@@ -35,6 +42,13 @@ static inline bool sme_active(void)
 }
 #endif

+static inline void __init sme_early_init(void)
+{
+}
+
+#define __sme_pa   __pa
+#define __sme_pa_nodebug   __pa_nodebug


One more thing - in the !CONFIG_AMD_MEM_ENCRYPT case, sme_me_mask is 0
so you don't need to define __sme_pa* again.


Makes sense.  I'll move those macros outside the #ifdef (I'll do the
same for the new __sme_clr() and __sme_set() macros, too).
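(Sketched out, that unified arrangement could look like the following; illustrative only, and it relies on sme_me_mask being defined as 0 when SME is compiled out:)

	/* valid for both CONFIG_AMD_MEM_ENCRYPT=y and =n, since sme_me_mask
	 * is 0 in the latter case
	 */
	#define __sme_set(x)		((x) | sme_me_mask)
	#define __sme_clr(x)		((x) & ~sme_me_mask)
	#define __sme_pa(x)		(__pa(x) | sme_me_mask)
	#define __sme_pa_nodebug(x)	(__pa_nodebug(x) | sme_me_mask)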

Thanks,
Tom






Re: [PATCH v1 1/6] iommu/dmar: Fix crash on boot when DMAR is disabled

2017-02-22 Thread Andy Shevchenko
On Wed, 2017-02-22 at 12:26 +0100, Joerg Roedel wrote:
> Hi Andy,
> 
> On Wed, Feb 15, 2017 at 04:42:21PM +0200, Andy Shevchenko wrote:
> > By default CONFIG_INTEL_IOMMU_DEFAULT_ON is not set and thus
> > dmar_disabled variable is set.
> > 
> > Intel IOMMU driver based on above doesn't set intel_iommu_enabled
> > variable.
> > 
> > The commit b0119e870837 ("iommu: Introduce new 'struct
> > iommu_device'")
> > mistakenly assumes it never happens and tries to unregister not ever
> > registered resources, which crashes the kernel at boot time:
> > 
> > BUG: unable to handle kernel NULL pointer dereference at 0008
> > IP: iommu_device_unregister+0x31/0x60
> > 
> > Make unregister procedure conditional in free_iommu().
> > 
> > Fixes: b0119e870837 ("iommu: Introduce new 'struct iommu_device'")
> > Cc: Joerg Roedel 
> > Signed-off-by: Andy Shevchenko 
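(For reference, the conditional unregister described above would look roughly like this in free_iommu(); a sketch of the idea, not necessarily the exact committed hunk:)

	if (intel_iommu_enabled) {
		iommu_device_sysfs_remove(&iommu->iommu);
		iommu_device_unregister(&iommu->iommu);
	}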
> 
> Thanks for the patch, I applied it and will send it upstream asap.

Thanks! Any comments on the rest? I would like to send v2.

-- 
Andy Shevchenko 
Intel Finland Oy


Re: [RFC PATCH v4 09/28] x86: Add support for early encryption/decryption of memory

2017-02-22 Thread Tom Lendacky

On 2/20/2017 12:22 PM, Borislav Petkov wrote:

On Thu, Feb 16, 2017 at 09:43:58AM -0600, Tom Lendacky wrote:

Add support to be able to either encrypt or decrypt data in place during
the early stages of booting the kernel. This does not change the memory
encryption attribute - it is used for ensuring that data present in either
an encrypted or decrypted memory area is in the proper state (for example
the initrd will have been loaded by the boot loader and will not be
encrypted, but the memory that it resides in is marked as encrypted).

The early_memmap support is enhanced to specify encrypted and decrypted
mappings with and without write-protection. The use of write-protection is
necessary when encrypting data "in place". The write-protect attribute is
considered cacheable for loads, but not stores. This implies that the
hardware will never give the core a dirty line with this memtype.

Signed-off-by: Tom Lendacky 
---
 arch/x86/include/asm/mem_encrypt.h |   15 +++
 arch/x86/mm/mem_encrypt.c  |   79 
 2 files changed, 94 insertions(+)


...


diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index d71df97..ac3565c 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -14,6 +14,9 @@
 #include 
 #include 

+#include 
+#include 
+
 extern pmdval_t early_pmd_flags;

 /*
@@ -24,6 +27,82 @@
 unsigned long sme_me_mask __section(.data) = 0;
 EXPORT_SYMBOL_GPL(sme_me_mask);

+/* Buffer used for early in-place encryption by BSP, no locking needed */
+static char sme_early_buffer[PAGE_SIZE] __aligned(PAGE_SIZE);
+
+/*
+ * This routine does not change the underlying encryption setting of the
+ * page(s) that map this memory. It assumes that eventually the memory is
+ * meant to be accessed as either encrypted or decrypted but the contents
+ * are currently not in the desired stated.


   state.


Will fix.




+ *
+ * This routine follows the steps outlined in the AMD64 Architecture
+ * Programmer's Manual Volume 2, Section 7.10.8 Encrypt-in-Place.
+ */
+static void __init __sme_early_enc_dec(resource_size_t paddr,
+  unsigned long size, bool enc)
+{
+   void *src, *dst;
+   size_t len;
+
+   if (!sme_me_mask)
+   return;
+
+   local_flush_tlb();
+   wbinvd();
+
+   /*
+* There are limited number of early mapping slots, so map (at most)
+* one page at time.
+*/
+   while (size) {
+   len = min_t(size_t, sizeof(sme_early_buffer), size);
+
+   /*
+* Create write protected mappings for the current format


  write-protected


Ok.




+* of the memory.
+*/
+   src = enc ? early_memremap_decrypted_wp(paddr, len) :
+   early_memremap_encrypted_wp(paddr, len);
+
+   /*
+* Create mappings for the desired format of the memory.
+*/


That comment can go - you already say that in the previous one.


Ok.




+   dst = enc ? early_memremap_encrypted(paddr, len) :
+   early_memremap_decrypted(paddr, len);


Btw, looking at this again, it seems to me that if you write it this
way:

if (enc) {
src = early_memremap_decrypted_wp(paddr, len);
dst = early_memremap_encrypted(paddr, len);
} else {
src = early_memremap_encrypted_wp(paddr, len);
dst = early_memremap_decrypted(paddr, len);
}

it might become even more readable. Anyway, just an idea - your decision
which is better.


I go back and forth on that one, too.  Not sure what I'll do, I guess it
will depend on my mood :).




+
+   /*
+* If a mapping can't be obtained to perform the operation,
+* then eventual access of that area will in the desired


s/will //


Yup.

Thanks,
Tom




+* mode will cause a crash.
+*/
+   BUG_ON(!src || !dst);
+
+   /*
+* Use a temporary buffer, of cache-line multiple size, to
+* avoid data corruption as documented in the APM.
+*/
+   memcpy(sme_early_buffer, src, len);
+   memcpy(dst, sme_early_buffer, len);
+
+   early_memunmap(dst, len);
+   early_memunmap(src, len);
+
+   paddr += len;
+   size -= len;
+   }
+}
+
+void __init sme_early_encrypt(resource_size_t paddr, unsigned long size)
+{
+   __sme_early_enc_dec(paddr, size, true);
+}
+
+void __init sme_early_decrypt(resource_size_t paddr, unsigned long size)
+{
+   __sme_early_enc_dec(paddr, size, false);
+}
+
 void __init sme_early_init(void)
 {
unsigned int i;





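(As a usage illustration, hypothetical and not a hunk from this patch: the kind of caller these helpers target is the early initrd handling, where the boot loader left the data unencrypted but the kernel will later map that memory encrypted. The boot_params fields below are only meant as an example.)

	/* encrypt the initrd in place so the later encrypted mapping reads it
	 * correctly
	 */
	if (boot_params.hdr.ramdisk_image && boot_params.hdr.ramdisk_size)
		sme_early_encrypt(boot_params.hdr.ramdisk_image,
				  boot_params.hdr.ramdisk_size);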

Re: [RFC PATCH v4 08/28] x86: Extend the early_memremap support with additional attrs

2017-02-22 Thread Tom Lendacky

On 2/20/2017 9:43 AM, Borislav Petkov wrote:

On Thu, Feb 16, 2017 at 09:43:48AM -0600, Tom Lendacky wrote:

Add to the early_memremap support to be able to specify encrypted and


early_memremap()

Please append "()" to function names in your commit messages text.


decrypted mappings with and without write-protection. The use of
write-protection is necessary when encrypting data "in place". The
write-protect attribute is considered cacheable for loads, but not
stores. This implies that the hardware will never give the core a
dirty line with this memtype.


By "hardware will never give" you mean that WP writes won't land dirty
in the cache but will go out to mem and when some other core needs them,
they will have to come from memory?


I think this best explains it, from Table 7-8 of the APM Vol 2:

"Reads allocate cache lines on a cache miss, but only to the shared
state. All writes update main memory. Cache lines are not allocated
on a write miss. Write hits invalidate the cache line and update
main memory."

We're early enough that only the BSP is running and we don't have
to worry about accesses from other cores.  If this was to be used
outside of early boot processing, then some safeties might have to
be added.




Signed-off-by: Tom Lendacky 
---
 arch/x86/Kconfig |4 +++
 arch/x86/include/asm/fixmap.h|   13 ++
 arch/x86/include/asm/pgtable_types.h |8 ++
 arch/x86/mm/ioremap.c|   44 ++
 include/asm-generic/early_ioremap.h  |2 ++
 mm/early_ioremap.c   |   10 
 6 files changed, 81 insertions(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index a3b8c71..581eae4 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1417,6 +1417,10 @@ config AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT
  If set to N, then the encryption of system memory can be
  activated with the mem_encrypt=on command line option.

+config ARCH_USE_MEMREMAP_PROT
+   def_bool y
+   depends on AMD_MEM_ENCRYPT


Why do we need this?

IOW, all those helpers below will end up being defined unconditionally,
in practice. Think distro kernels. Then saving the couple of bytes is
not really worth the overhead.


I added this because some other architectures use a u64 for the
protection value instead of an unsigned long (i386 for one) and it
was causing build errors/warnings on those archs. And trying to bring
in the header to use pgprot_t instead of an unsigned long caused a ton
of build issues. This seemed to be the simplest and least intrusive way
to approach the issue.

Thanks,
Tom






RE: [PATCH 2/2] iommu: add qcom_iommu

2017-02-22 Thread Sricharan
Hi Rob,

<..>

>>>+++ b/Documentation/devicetree/bindings/iommu/qcom,iommu.txt
>>>@@ -0,0 +1,45 @@
>>>+* QCOM IOMMU Implementation
>>>+
>>>+Qualcomm "B" family devices which are not compatible with arm-smmu have
>>>+a similar looking IOMMU but without access to the global register space.
>>>+This is modelled as separate IOMMU devices which have just a single
>>>+master.
>>>+
>>>+** Required properties:
>>>+
>>>+- compatible: Should be one of:
>>>+
>>>+"qcom,msm8916-iommu-context-bank"
>>>+
>>>+  depending on the particular implementation and/or the
>>>+  version of the architecture implemented.
>>>+
>>>+- reg   : Base address and size of the SMMU.  And optionally,
>>>+  if present, the "smmu_local_base"
>>>+
>>>+- interrupts: The context fault irq.
>>>+
>>>+- #iommu-cells  : Must be 0
>>>+
>>>+- qcom,iommu-ctx-asid   : context ASID
>>>+
>>>+- qcom,iommu-secure-id  : secure-id
>>>+
>>>+- clocks: The interface clock (iface_clk) and bus clock (bus_clk)
>>>+
>>>+** Examples:
>>>+
>>>+  mdp_iommu: iommu-context-bank@1e24000 {
>>>+  compatible = "qcom,msm8916-iommu-context-bank";
>>>+  reg = <0x1e24000 0x1000
>>>+  0x1ef 0x3000>;
>>>+  reg-names = "iommu_base", "smmu_local_base";
>>>+  interrupts = ;
>>>+  qcom,iommu-ctx-asid = <4>;
>>>+  qcom,iommu-secure-id = <17>;
>>
>> This is not a per-context-bank property and can be programmed for a
>> given iommu only once. So we call qcom_iommu_sec_init once per
>> context bank, which does not look correct. Similarly for
>> smmu_local_base as well. So should this be handled globally,
>> once for all contexts?
>
>yeah, smmu_local_base and secure-id would be duplicate for all context
>banks that are part of the same actual iommu.  (But it was Robin's
>suggestion to just model this as separate context-bank devices, since
>we cannot touch the global space).
>
>Did I misunderstand the downstream driver code?  It looked like
>qcom_scm_restore_sec_cfg() was called once on first attach per
>context-bank, not globally for the entire iommu, which is what I'm
>doing with this driver.  But I haven't yet tried to enable other
>context-banks in the apps iommu yet.
>

The downstream driver seems to call the sec_cfg once per iommu, when a
context is attached for the first time, and not for the subsequent
contexts that are attached. So it is programmed only once and not for
every context; that is how I read it. Anyway, when you add more
context banks we can see if that causes trouble.
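(For illustration, the once-per-instance guard being discussed might look roughly like this, reusing the secure_init/sec_id fields from the patch; a sketch, not the actual driver code, and the second qcom_scm_restore_sec_cfg() argument is assumed to be the unused "spare" field:)

	/* program the secure id only the first time a context of this
	 * IOMMU instance is attached
	 */
	if (!qcom_iommu->secure_init) {
		ret = qcom_scm_restore_sec_cfg(qcom_iommu->sec_id, 0);
		if (ret) {
			dev_err(qcom_iommu->dev, "secure init failed: %d\n", ret);
			return ret;
		}
		qcom_iommu->secure_init = true;
	}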

Regards,
 Sricharan



[git pull] IOMMU Fix for Linux v4.11-rc0

2017-02-22 Thread Joerg Roedel
Hi Linus,

The following changes since commit ebb4949eb32ff500602f960525592fc4e614c5a7:

  Merge tag 'iommu-updates-v4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu (2017-02-20 16:42:43 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git tags/iommu-fix-v4.11-rc0

for you to fetch changes up to c37a01779b3954d9c8f9ac4f663a03c11f69fded:

  iommu/vt-d: Fix crash on boot when DMAR is disabled (2017-02-22 12:25:31 +0100)


IOMMU Fix for v4.11-rc0

* Fix a boot crash caused by the VT-d driver when booted with IOMMU
  disabled. This was introduced with the recent IOMMU changes.


Andy Shevchenko (1):
  iommu/vt-d: Fix crash on boot when DMAR is disabled

 drivers/iommu/dmar.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

Please pull.

Thanks,

Joerg



Re: [PATCH 2/2] iommu: add qcom_iommu

2017-02-22 Thread Rob Clark
On Wed, Feb 22, 2017 at 4:31 AM, Sricharan  wrote:
> Hi Rob,
>
>>diff --git a/Documentation/devicetree/bindings/iommu/qcom,iommu.txt
>>b/Documentation/devicetree/bindings/iommu/qcom,iommu.txt
>>new file mode 100644
>>index 000..78a8d65
>>--- /dev/null
>>+++ b/Documentation/devicetree/bindings/iommu/qcom,iommu.txt
>>@@ -0,0 +1,45 @@
>>+* QCOM IOMMU Implementation
>>+
>>+Qualcomm "B" family devices which are not compatible with arm-smmu have
>>+a similar looking IOMMU but without access to the global register space.
>>+This is modelled as separate IOMMU devices which have just a single
>>+master.
>>+
>>+** Required properties:
>>+
>>+- compatible: Should be one of:
>>+
>>+"qcom,msm8916-iommu-context-bank"
>>+
>>+  depending on the particular implementation and/or the
>>+  version of the architecture implemented.
>>+
>>+- reg   : Base address and size of the SMMU.  And optionally,
>>+  if present, the "smmu_local_base"
>>+
>>+- interrupts: The context fault irq.
>>+
>>+- #iommu-cells  : Must be 0
>>+
>>+- qcom,iommu-ctx-asid   : context ASID
>>+
>>+- qcom,iommu-secure-id  : secure-id
>>+
>>+- clocks: The interface clock (iface_clk) and bus clock (bus_clk)
>>+
>>+** Examples:
>>+
>>+  mdp_iommu: iommu-context-bank@1e24000 {
>>+  compatible = "qcom,msm8916-iommu-context-bank";
>>+  reg = <0x1e24000 0x1000
>>+  0x1ef 0x3000>;
>>+  reg-names = "iommu_base", "smmu_local_base";
>>+  interrupts = ;
>>+  qcom,iommu-ctx-asid = <4>;
>>+  qcom,iommu-secure-id = <17>;
>
> This is not a per-context-bank property and can be programmed for a
> given iommu only once. So we call qcom_iommu_sec_init once per
> context bank, which does not look correct. Similarly for
> smmu_local_base as well. So should this be handled globally,
> once for all contexts?

yeah, smmu_local_base and secure-id would be duplicate for all context
banks that are part of the same actual iommu.  (But it was Robin's
suggestion to just model this as separate context-bank devices, since
we cannot touch the global space).

Did I misunderstand the downstream driver code?  It looked like
qcom_scm_restore_sec_cfg() was called once on first attach per
context-bank, not globally for the entire iommu, which is what I'm
doing with this driver.  But I haven't yet tried to enable other
context-banks in the apps iommu yet.

>>+  #iommu-cells = <0>;
>>+  clocks = <&gcc GCC_SMMU_CFG_CLK>,
>>+   <&gcc GCC_APSS_TCU_CLK>;
>>+  clock-names = "iface_clk", "bus_clk";
>
> I am trying to generalize the clock bindings for MMU-500 and one more
> qcom-specific one. Anyway, this can follow that.

no problem to adapt to what you come up with for arm-smmu, it is
basically the same requirements.

>>+  status = "okay";
>
> <..>
>
>>+#define pr_fmt(fmt) "qcom-iommu: " fmt
>>+
>>+#include 
>>+#include 
>>+#include 
>>+#include 
>>+#include 
>>+#include 
>>+#include 
>>+#include 
>>+#include 
>>+#include 
>>+#include 
>>+#include 
>>+#include 
>>+#include 
>>+#include 
>>+#include 
>>+#include 
>>+#include 
>>+#include 
>>+#include 
>>+#include 
>>+
>>+#include "io-pgtable.h"
>>+#include "arm-smmu-regs.h"
>>+
>+// TODO are these qcom specific, or just something no one bothered to add to arm-smmu
>>+#define SMMU_CB_TLBSYNC  0x7f0
>>+#define SMMU_CB_TLBSTATUS0x7f4
>
> I think the reason was that in arm-smmu we are using the global TLBSYNC/STATUS
> bits, as they are used in both the global device reset and the flush path.
> Otherwise, it is correct to add this here.

ok, that is what I suspected.. in next version I'll add these two to
the shared header instead
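(For reference, the per-context sync those two registers enable is: kick TLBSYNC, then poll TLBSTATUS until the sync-active bit clears. A sketch, assuming the active bit is bit 0, and not the driver's actual code:)

	static void qcom_iommu_tlb_sync_sketch(struct qcom_iommu_device *qcom_iommu)
	{
		unsigned int spin = 0;

		writel_relaxed(0, qcom_iommu->base + SMMU_CB_TLBSYNC);

		/* TLBSTATUS stays active while the sync is still in flight */
		while (readl_relaxed(qcom_iommu->base + SMMU_CB_TLBSTATUS) & 0x1) {
			if (++spin == 1000) {
				dev_err(qcom_iommu->dev, "TLB sync timed out\n");
				break;
			}
			cpu_relax();
		}
	}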

>>+#define SMMU_INTR_SEL_NS 0x2000
>>+
>>+
>>+struct qcom_iommu_device {
>>+  struct device   *dev;
>>+
>>+  void __iomem*base;
>>+  void __iomem*local_base;
>>+  unsigned int irq;
>>+  struct clk  *iface_clk;
>>+  struct clk  *bus_clk;
>>+
>>+  bool secure_init;
>>+  u32  asid;  /* asid and ctx bank # are 1:1 */
>>+  u32  sec_id;
>>+
>>+  /* single group per device: */
>>+  struct iommu_group  *group;
>>+};
>>+
>>+struct qcom_iommu_domain {
>>+  struct qcom_iommu_device*iommu;
>>+  struct io_pgtable_ops   *pgtbl_ops;
>>+  spinlock_t   pgtbl_lock;
>>+  struct mutex init_mutex; /* Protects iommu pointer 
>>*/
>>+  struct iommu_domain  domain;
>>+};
>>+
>>+static struct qcom_iommu_domain *to_qcom_iommu_domain(struct iommu_domain 
>>*dom)
>>+{
>>+  return container_of(dom, struct qcom_iommu_domain, domain);
>>+}
>>+
>>+static const struct iommu_ops qcom_iommu_ops;
>>+static struct platform_driver

Re: [RFC PATCH v4 07/28] x86: Provide general kernel support for memory encryption

2017-02-22 Thread Borislav Petkov
On Tue, Feb 21, 2017 at 11:18:08AM -0600, Tom Lendacky wrote:
> It's the latter.  It's really only used for working with values that
> will either be written to or read from cr3.  I'll add some comments
> around the macros as well as expand on it in the commit message.

Ok, that makes sense. Normally we will have the mask in the lower levels
of the pagetable hierarchy but we need to add it to the CR3 value by
hand. Yap.
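(I.e. at the CR3 write sites the mask is ORed in explicitly; a tiny illustrative sketch, with pgd standing in for whatever page-table root is being loaded:)

	/* the PGD's physical address must carry the encryption bit, since
	 * nothing above CR3 applies it for us
	 */
	write_cr3(__sme_pa(pgd));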

> Ok, I'll try and come up with something...  maybe __sme_rm or
> __sme_clear (__sme_clr).

__sme_clr looks nice to me :)

Thanks.

-- 
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


Re: [bug report] iommu/arm-smmu: Make use of the iommu_register interface

2017-02-22 Thread Joerg Roedel
On Wed, Feb 22, 2017 at 11:26:48AM +, Robin Murphy wrote:
> The pointer isn't cleared because the whole fwspec is freed on the very
> next line.

You are right, stupid me :/ Sorry for the noise.



Joerg



Re: [bug report] iommu/arm-smmu: Make use of the iommu_register interface

2017-02-22 Thread Robin Murphy
On 22/02/17 11:00, Joerg Roedel wrote:
> Hi Dan,
> 
> thanks for the report! There are more bogus things going on here.
> 
> On Wed, Feb 15, 2017 at 11:36:48AM +0300, Dan Carpenter wrote:
>> The patch 9648cbc9625b: "iommu/arm-smmu: Make use of the 
>> iommu_register interface" from Feb 1, 2017, leads to the following 
>> Smatch complaint:
>>
>> drivers/iommu/arm-smmu-v3.c:1810 arm_smmu_remove_device()
>>   warn: variable dereferenced before check 'master' (see line 1809)
>>
>> drivers/iommu/arm-smmu-v3.c
>>   1808   master = fwspec->iommu_priv;
>>   1809   smmu = master->smmu;
>>
>> New dereference.
>>
>>   1810   if (master && master->ste.valid)
>> ^^
>> Old code checked for NULL.
>>
>>   1811   arm_smmu_detach_dev(dev);
>>   1812   iommu_group_remove_device(dev);
> 
> So the master pointer comes from fwspec->iommu_priv, and master is freed
> later in the function. But I can't find where the fwspec->iommu_priv
> pointer is cleared. To me it looks like this breaks when a device is
> removed and the added again.

The pointer isn't cleared because the whole fwspec is freed on the very
next line.

Robin.

> 
> Robin, Will, can you have a look please?
> 
> 
> Thanks,
> 
>   Joerg
> 



Re: [PATCH v1 1/6] iommu/dmar: Fix crash on boot when DMAR is disabled

2017-02-22 Thread Joerg Roedel
Hi Andy,

On Wed, Feb 15, 2017 at 04:42:21PM +0200, Andy Shevchenko wrote:
> By default CONFIG_INTEL_IOMMU_DEFAULT_ON is not set and thus
> dmar_disabled variable is set.
> 
> Intel IOMMU driver based on above doesn't set intel_iommu_enabled
> variable.
> 
> The commit b0119e870837 ("iommu: Introduce new 'struct iommu_device'")
> mistakenly assumes it never happens and tries to unregister not ever
> registered resources, which crashes the kernel at boot time:
> 
>   BUG: unable to handle kernel NULL pointer dereference at 0008
>   IP: iommu_device_unregister+0x31/0x60
> 
> Make unregister procedure conditional in free_iommu().
> 
> Fixes: b0119e870837 ("iommu: Introduce new 'struct iommu_device'")
> Cc: Joerg Roedel 
> Signed-off-by: Andy Shevchenko 

Thanks for the patch, I applied it and will send it upstream asap.



Joerg



Re: [bug report] iommu/arm-smmu: Make use of the iommu_register interface

2017-02-22 Thread Joerg Roedel
Hi Dan,

thanks for the report! There are more bogus things going on here.

On Wed, Feb 15, 2017 at 11:36:48AM +0300, Dan Carpenter wrote:
> The patch 9648cbc9625b: "iommu/arm-smmu: Make use of the 
> iommu_register interface" from Feb 1, 2017, leads to the following 
> Smatch complaint:
> 
> drivers/iommu/arm-smmu-v3.c:1810 arm_smmu_remove_device()
>warn: variable dereferenced before check 'master' (see line 1809)
> 
> drivers/iommu/arm-smmu-v3.c
>   1808master = fwspec->iommu_priv;
>   1809smmu = master->smmu;
>
> New dereference.
> 
>   1810if (master && master->ste.valid)
> ^^
> Old code checked for NULL.
> 
>   1811arm_smmu_detach_dev(dev);
>   1812iommu_group_remove_device(dev);

So the master pointer comes from fwspec->iommu_priv, and master is freed
later in the function. But I can't find where the fwspec->iommu_priv
pointer is cleared. To me it looks like this breaks when a device is
removed and the added again.

Robin, Will, can you have a look please?


Thanks,

Joerg



RE: [PATCH 2/2] iommu: add qcom_iommu

2017-02-22 Thread Sricharan
Hi Rob,

>diff --git a/Documentation/devicetree/bindings/iommu/qcom,iommu.txt
>b/Documentation/devicetree/bindings/iommu/qcom,iommu.txt
>new file mode 100644
>index 000..78a8d65
>--- /dev/null
>+++ b/Documentation/devicetree/bindings/iommu/qcom,iommu.txt
>@@ -0,0 +1,45 @@
>+* QCOM IOMMU Implementation
>+
>+Qualcomm "B" family devices which are not compatible with arm-smmu have
>+a similar looking IOMMU but without access to the global register space.
>+This is modelled as separate IOMMU devices which have just a single
>+master.
>+
>+** Required properties:
>+
>+- compatible: Should be one of:
>+
>+"qcom,msm8916-iommu-context-bank"
>+
>+  depending on the particular implementation and/or the
>+  version of the architecture implemented.
>+
>+- reg   : Base address and size of the SMMU.  And optionally,
>+  if present, the "smmu_local_base"
>+
>+- interrupts: The context fault irq.
>+
>+- #iommu-cells  : Must be 0
>+
>+- qcom,iommu-ctx-asid   : context ASID
>+
>+- qcom,iommu-secure-id  : secure-id
>+
>+- clocks: The interface clock (iface_clk) and bus clock (bus_clk)
>+
>+** Examples:
>+
>+  mdp_iommu: iommu-context-bank@1e24000 {
>+  compatible = "qcom,msm8916-iommu-context-bank";
>+  reg = <0x1e24000 0x1000
>+  0x1ef 0x3000>;
>+  reg-names = "iommu_base", "smmu_local_base";
>+  interrupts = ;
>+  qcom,iommu-ctx-asid = <4>;
>+  qcom,iommu-secure-id = <17>;

This is not a per-context-bank property and can be programmed for a
given iommu only once. So we call qcom_iommu_sec_init once per
context bank, which does not look correct. Similarly for
smmu_local_base as well. So should this be handled globally,
once for all contexts?

>+  #iommu-cells = <0>;
>+  clocks = <&gcc GCC_SMMU_CFG_CLK>,
>+   <&gcc GCC_APSS_TCU_CLK>;
>+  clock-names = "iface_clk", "bus_clk";

I am trying to generalize the clock bindings for MMU-500 and one more
qcom-specific one. Anyway, this can follow that.

>+  status = "okay";

<..>

>+#define pr_fmt(fmt) "qcom-iommu: " fmt
>+
>+#include 
>+#include 
>+#include 
>+#include 
>+#include 
>+#include 
>+#include 
>+#include 
>+#include 
>+#include 
>+#include 
>+#include 
>+#include 
>+#include 
>+#include 
>+#include 
>+#include 
>+#include 
>+#include 
>+#include 
>+#include 
>+
>+#include "io-pgtable.h"
>+#include "arm-smmu-regs.h"
>+
>+// TODO are these qcom specific, or just something no one bothered to add to arm-smmu
>+#define SMMU_CB_TLBSYNC  0x7f0
>+#define SMMU_CB_TLBSTATUS0x7f4

I think the reason was that in arm-smmu we are using the global TLBSYNC/STATUS
bits, as they are used in both the global device reset and the flush path.
Otherwise, it is correct to add this here.

>+#define SMMU_INTR_SEL_NS 0x2000
>+
>+
>+struct qcom_iommu_device {
>+  struct device   *dev;
>+
>+  void __iomem*base;
>+  void __iomem*local_base;
>+  unsigned int irq;
>+  struct clk  *iface_clk;
>+  struct clk  *bus_clk;
>+
>+  bool secure_init;
>+  u32  asid;  /* asid and ctx bank # are 1:1 */
>+  u32  sec_id;
>+
>+  /* single group per device: */
>+  struct iommu_group  *group;
>+};
>+
>+struct qcom_iommu_domain {
>+  struct qcom_iommu_device*iommu;
>+  struct io_pgtable_ops   *pgtbl_ops;
>+  spinlock_t   pgtbl_lock;
>+  struct mutex init_mutex; /* Protects iommu pointer 
>*/
>+  struct iommu_domain  domain;
>+};
>+
>+static struct qcom_iommu_domain *to_qcom_iommu_domain(struct iommu_domain 
>*dom)
>+{
>+  return container_of(dom, struct qcom_iommu_domain, domain);
>+}
>+
>+static const struct iommu_ops qcom_iommu_ops;
>+static struct platform_driver qcom_iommu_driver;
>+
>+static struct qcom_iommu_device * dev_to_iommu(struct device *dev)
>+{
>+  struct iommu_fwspec *fwspec = dev->iommu_fwspec;
>+  if (WARN_ON(!fwspec || fwspec->ops != &qcom_iommu_ops))
>+  return NULL;
>+  return fwspec->iommu_priv;
>+}
>+
>+static inline void
>+iommu_writel(struct qcom_iommu_device *qcom_iommu, unsigned reg, u32 val)
>+{
>+  writel_relaxed(val, qcom_iommu->base + reg);
>+}
>+
>+static inline void
>+iommu_writeq(struct qcom_iommu_device *qcom_iommu, unsigned reg, u64 val)
>+{
>+  writeq_relaxed(val, qcom_iommu->base + reg);
>+}
>+
>+static inline u32
>+iommu_readl(struct qcom_iommu_device *qcom_iommu, unsigned reg)
>+{
>+  return readl_relaxed(qcom_iommu->base + reg);
>+}
>+
>+static inline u32
>+iommu_readq(struct qcom_iommu_device *qcom_iommu, unsigned reg)
>+{
>+  return readq_relaxed(qcom_iommu->base + reg);
>+}
>+
>+static void __sync_tlb(s