Re: [PATCH v12 22/22] selftests/vm: Fix deadlock in protection_keys.c

2018-03-16 Thread Dave Hansen
On 02/21/2018 05:55 PM, Ram Pai wrote:
> From: Thiago Jung Bauermann <bauer...@linux.vnet.ibm.com>
> 
> The sig_chld() handler calls dprintf2() taking care of setting
> dprint_in_signal so that sigsafe_printf() won't call printf().
> Unfortunately, this precaution is is negated by dprintf_level(), which
> has a call to fflush().
> 
> This function acquires a lock, which means that if the signal interrupts an
> ongoing fflush() the process will deadlock. At least on powerpc this is
> easy to trigger, resulting in the following backtrace when attaching to the
> frozen process:

Ugh, yeah, I've run into this too.

Acked-by: Dave Hansen <dave.han...@intel.com>
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v12 21/22] selftests/vm: sub-page allocator

2018-03-16 Thread Dave Hansen
On 02/21/2018 05:55 PM, Ram Pai wrote:
...
> @@ -888,6 +917,7 @@ void setup_hugetlbfs(void)
>  void *(*pkey_malloc[])(long size, int prot, u16 pkey) = {
>  
>   malloc_pkey_with_mprotect,
> + malloc_pkey_with_mprotect_subpage,
>   malloc_pkey_anon_huge,
>   malloc_pkey_hugetlb
>  /* can not do direct with the pkey_mprotect() API:


I think I'd rather have an #ifdef on the array entries than have the
malloc entry do nothing on x86.  Maybe have a ppc-specific section at
the end?
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v12 20/22] selftests/vm: testcases must restore pkey-permissions

2018-03-16 Thread Dave Hansen
On 02/21/2018 05:55 PM, Ram Pai wrote:
> Generally the signal handler restores the state of the pkey register
> before returning. However there are times when the read/write operation
> can legitamely fail without invoking the signal handler.  Eg: A
> sys_read() operaton to a write-protected page should be disallowed.  In
> such a case the state of the pkey register is not restored to its
> original state.  The test case is responsible for restoring the key
> register state to its original value.

Oh, that's a good point.

Could we just do this in a common place, though?  Like reset the
register after each test?  Seems more foolproof.
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v12 19/22] selftests/vm: detect write violation on a mapped access-denied-key page

2018-03-16 Thread Dave Hansen
On 02/21/2018 05:55 PM, Ram Pai wrote:
> detect write-violation on a page to which access-disabled
> key is associated much after the page is mapped.

Acked-by: Dave Hansen <dave.han...@intel.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v12 18/22] selftests/vm: associate key on a mapped page and detect write violation

2018-03-16 Thread Dave Hansen
On 02/21/2018 05:55 PM, Ram Pai wrote:
> detect write-violation on a page to which write-disabled
> key is associated much after the page is mapped.

The more tests the merrier.

Acked-by: Dave Hansen <dave.han...@intel.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v12 17/22] selftests/vm: associate key on a mapped page and detect access violation

2018-03-16 Thread Dave Hansen
On 02/21/2018 05:55 PM, Ram Pai wrote:
> detect access-violation on a page to which access-disabled
> key is associated much after the page is mapped.

Looks fine to me.  Did this actually find a bug for you?

Acked-by: Dave Hansen <dave.han...@intel.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v12 16/22] selftests/vm: fix an assertion in test_pkey_alloc_exhaust()

2018-03-16 Thread Dave Hansen
On 02/21/2018 05:55 PM, Ram Pai wrote:
> +static inline int arch_reserved_keys(void)
> +{
> +#if defined(__i386__) || defined(__x86_64__) /* arch */
> + return NR_RESERVED_PKEYS;
> +#elif __powerpc64__ /* arch */
> + if (sysconf(_SC_PAGESIZE) == 4096)
> + return NR_RESERVED_PKEYS_4K;
> + else
> + return NR_RESERVED_PKEYS_64K;
> +#else /* arch */
> + NOT SUPPORTED
> +#endif /* arch */
> +}

Yeah, this is hideous.

Please either do it in one header:

#ifdef x86..
static inline int arch_reserved_keys(void)
{
}
...
#elif ppc
static inline int arch_reserved_keys(void)
{
}
...
#else
#error
#endif

Or in multiple:

#ifdef x86..
#include 
#elif ppc
#include 
#else
#error
#endif
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v12 15/22] selftests/vm: powerpc implementation to check support for pkey

2018-03-16 Thread Dave Hansen
On 02/21/2018 05:55 PM, Ram Pai wrote:
>  #define PAGE_SIZE (0x1UL << 16)
> -static inline int cpu_has_pku(void)
> +static inline bool is_pkey_supported(void)
>  {
> - return 1;
> + /*
> +  * No simple way to determine this.
> +  * lets try allocating a key and see if it succeeds.
> +  */
> + int ret = sys_pkey_alloc(0, 0);
> +
> + if (ret > 0) {
> + sys_pkey_free(ret);
> + return true;
> + }
> + return false;
>  }

The point of doing this was to have a test for the CPU that way separate
from the syscalls.

Can you leave cpu_has_pkeys() in place?


--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v12 13/22] selftests/vm: powerpc implementation for generic abstraction

2018-03-16 Thread Dave Hansen
On 02/21/2018 05:55 PM, Ram Pai wrote:
>  static inline u32 pkey_to_shift(int pkey)
>  {
> +#if defined(__i386__) || defined(__x86_64__) /* arch */
>   return pkey * PKEY_BITS_PER_PKEY;
> +#elif __powerpc64__ /* arch */
> + return (NR_PKEYS - pkey - 1) * PKEY_BITS_PER_PKEY;
> +#endif /* arch */
>  }

I really detest the #if #else style.  Can't we just have a pkey_ppc.h
and a pkey_x86.h or something?
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v12 12/22] selftests/vm: generic cleanup

2018-03-16 Thread Dave Hansen
On 02/21/2018 05:55 PM, Ram Pai wrote:
> cleanup the code to satisfy coding styles.
> 
> cc: Dave Hansen <dave.han...@intel.com>
> cc: Florian Weimer <fwei...@redhat.com>
> Signed-off-by: Ram Pai <linux...@us.ibm.com>
> ---
>  tools/testing/selftests/vm/protection_keys.c |   81 
> ++
>  1 files changed, 43 insertions(+), 38 deletions(-)
> 
> diff --git a/tools/testing/selftests/vm/protection_keys.c 
> b/tools/testing/selftests/vm/protection_keys.c
> index 6054093..6fdd8f5 100644
> --- a/tools/testing/selftests/vm/protection_keys.c
> +++ b/tools/testing/selftests/vm/protection_keys.c
> @@ -4,7 +4,7 @@
>   *
>   * There are examples in here of:
>   *  * how to set protection keys on memory
> - *  * how to set/clear bits in pkey registers (the rights register)
> + *  * how to set/clear bits in Protection Key registers (the rights register)

I don't think CodingStyle says to do this. :)

>   *  * how to handle SEGV_PKUERR signals and extract pkey-relevant
>   *information from the siginfo
>   *
> @@ -13,13 +13,18 @@
>   *   prefault pages in at malloc, or not
>   *   protect MPX bounds tables with protection keys?
>   *   make sure VMA splitting/merging is working correctly
> - *   OOMs can destroy mm->mmap (see exit_mmap()), so make sure it is immune 
> to pkeys
> - *   look for pkey "leaks" where it is still set on a VMA but "freed" back 
> to the kernel
> - *   do a plain mprotect() to a mprotect_pkey() area and make sure the pkey 
> sticks
> + *   OOMs can destroy mm->mmap (see exit_mmap()),
> + *   so make sure it is immune to pkeys
> + *   look for pkey "leaks" where it is still set on a VMA
> + *but "freed" back to the kernel
> + *   do a plain mprotect() to a mprotect_pkey() area and make
> + *sure the pkey sticks

Ram, I'm not sure where this came from, but this looks horrid.  Please
don't do this to the file

>   * Compile like this:
> - *   gcc  -o protection_keys-O2 -g -std=gnu99 -pthread -Wall 
> protection_keys.c -lrt -ldl -lm
> - *   gcc -m32 -o protection_keys_32 -O2 -g -std=gnu99 -pthread -Wall 
> protection_keys.c -lrt -ldl -lm
> + *   gcc  -o protection_keys-O2 -g -std=gnu99
> + *-pthread -Wall protection_keys.c -lrt -ldl -lm
> + *   gcc -m32 -o protection_keys_32 -O2 -g -std=gnu99
> + *-pthread -Wall protection_keys.c -lrt -ldl -lm
>   */

Please just leave this, or remove it from the file.  It was a long line
so it could be copied and pasted, this ruins that.



>  #define _GNU_SOURCE
>  #include 
> @@ -251,26 +256,11 @@ void signal_handler(int signum, siginfo_t *si, void 
> *vucontext)
>   dprintf1("signal pkey_reg from  pkey_reg: %016lx\n", __rdpkey_reg());
>   dprintf1("pkey from siginfo: %jx\n", siginfo_pkey);
>   *(u64 *)pkey_reg_ptr = 0x;
> - dprintf1("WARNING: set PRKU=0 to allow faulting instruction to 
> continue\n");
> + dprintf1("WARNING: set PKEY_REG=0 to allow faulting instruction "
> + "to continue\n");
>   pkey_faults++;
>   dprintf1("<<<<==\n");
>   return;
> - if (trapno == 14) {
> - fprintf(stderr,
> - "ERROR: In signal handler, page fault, trapno = %d, ip 
> = %016lx\n",
> - trapno, ip);
> - fprintf(stderr, "si_addr %p\n", si->si_addr);
> - fprintf(stderr, "REG_ERR: %lx\n",
> - (unsigned 
> long)uctxt->uc_mcontext.gregs[REG_ERR]);
> - exit(1);
> - } else {
> - fprintf(stderr, "unexpected trap %d! at 0x%lx\n", trapno, ip);
> - fprintf(stderr, "si_addr %p\n", si->si_addr);
> - fprintf(stderr, "REG_ERR: %lx\n",
> - (unsigned 
> long)uctxt->uc_mcontext.gregs[REG_ERR]);
> - exit(2);
> - }
> - dprint_in_signal = 0;
>  }

I think this is just randomly removing code now.

I think you should probably just drop this patch.  It's not really
brining anything useful.
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v12 11/22] selftests/vm: pkey register should match shadow pkey

2018-03-16 Thread Dave Hansen
On 02/21/2018 05:55 PM, Ram Pai wrote:
> expected_pkey_fault() is comparing the contents of pkey
> register with 0. This may not be true all the time. There
> could be bits set by default by the architecture
> which can never be changed. Hence compare the value against
> shadow pkey register, which is supposed to track the bits
> accurately all throughout
> 
> cc: Dave Hansen <dave.han...@intel.com>
> cc: Florian Weimer <fwei...@redhat.com>
> Signed-off-by: Ram Pai <linux...@us.ibm.com>
> ---
>  tools/testing/selftests/vm/protection_keys.c |4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/testing/selftests/vm/protection_keys.c 
> b/tools/testing/selftests/vm/protection_keys.c
> index 254b66d..6054093 100644
> --- a/tools/testing/selftests/vm/protection_keys.c
> +++ b/tools/testing/selftests/vm/protection_keys.c
> @@ -926,10 +926,10 @@ void expected_pkey_fault(int pkey)
>   pkey_assert(last_pkey_faults + 1 == pkey_faults);
>   pkey_assert(last_si_pkey == pkey);
>   /*
> -  * The signal handler shold have cleared out PKEY register to let the
> +  * The signal handler shold have cleared out pkey-register to let the

Heh, you randomly changed the formatting and didn't bother with my awful
typo. :)

>* test program continue.  We now have to restore it.
>*/
> - if (__rdpkey_reg() != 0)
> + if (__rdpkey_reg() != shadow_pkey_reg)
>   pkey_assert(0);
>  
>   __wrpkey_reg(shadow_pkey_reg);
> 

I don't think this should be "shadow_pkey_reg".  This was just trying to
double-check that the signal handler messed around with PKRU the way we
expected.

We could also just check that the disable bits for 'pkey' are clear at
this point.  That would be almost as good.
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v12 10/22] selftests/vm: introduce two arch independent abstraction

2018-03-16 Thread Dave Hansen
On 02/21/2018 05:55 PM, Ram Pai wrote:
> open_hugepage_file() <- opens the huge page file
> get_start_key() <--  provides the first non-reserved key.
> 

Looks reasonable.

Reviewed-by: Dave Hansen <dave.han...@intel.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v12 09/22] selftests/vm: fix alloc_random_pkey() to make it really random

2018-03-16 Thread Dave Hansen
On 02/21/2018 05:55 PM, Ram Pai wrote:
> alloc_random_pkey() was allocating the same pkey every time.
> Not all pkeys were geting tested. fixed it.
...
> @@ -602,13 +603,15 @@ int alloc_random_pkey(void)
>   int alloced_pkeys[NR_PKEYS];
>   int nr_alloced = 0;
>   int random_index;
> +
>   memset(alloced_pkeys, 0, sizeof(alloced_pkeys));
> + srand((unsigned int)time(NULL));
>  
>   /* allocate every possible key and make a note of which ones we got */
>   max_nr_pkey_allocs = NR_PKEYS;
> - max_nr_pkey_allocs = 1;
>   for (i = 0; i < max_nr_pkey_allocs; i++) {
>   int new_pkey = alloc_pkey();

The srand() is probably useful, but won't this always just do a single
alloc_pkey() now?  That seems like it will mean we always use the first
one the kernel gives us, which isn't random.

> - dprintf1("%s()::%d, ret: %d pkey_reg: 0x%x shadow: 0x%x\n", __func__,
> - __LINE__, ret, __rdpkey_reg(), shadow_pkey_reg);
> + dprintf1("%s()::%d, ret: %d pkey_reg: 0x%x shadow: 0x%016lx\n",
> + __func__, __LINE__, ret, __rdpkey_reg(), shadow_pkey_reg);
>   return ret;
>  }

This belonged in the pkey_reg_t patch, I think.

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v12 08/22] selftests/vm: clear the bits in shadow reg when a pkey is freed.

2018-03-16 Thread Dave Hansen
On 02/21/2018 05:55 PM, Ram Pai wrote:
> When a key is freed, the  key  is  no  more  effective.
> Clear the bits corresponding to the pkey in the shadow
> register. Otherwise  it  will carry some spurious bits
> which can trigger false-positive asserts.
...
> diff --git a/tools/testing/selftests/vm/protection_keys.c 
> b/tools/testing/selftests/vm/protection_keys.c
> index ca54a95..aaf9f09 100644
> --- a/tools/testing/selftests/vm/protection_keys.c
> +++ b/tools/testing/selftests/vm/protection_keys.c
> @@ -582,6 +582,9 @@ int alloc_pkey(void)
>  int sys_pkey_free(unsigned long pkey)
>  {
>   int ret = syscall(SYS_pkey_free, pkey);
> +
> + if (!ret)
> + shadow_pkey_reg &= reset_bits(pkey, PKEY_DISABLE_ACCESS);
>   dprintf1("%s(pkey=%ld) syscall ret: %d\n", __func__, pkey, ret);
>   return ret;
>  }

Did this cause problems for you in practice?

On x86, sys_pkey_free() does not affect PKRU, so this isn't quite right.
 I'd much rather have the actual tests explicitly clear the PKRU bits
and also in the process clear the shadow bits.
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v12 07/22] selftests/vm: fixed bugs in pkey_disable_clear()

2018-03-16 Thread Dave Hansen
On 02/21/2018 05:55 PM, Ram Pai wrote:
> --- a/tools/testing/selftests/vm/protection_keys.c
> +++ b/tools/testing/selftests/vm/protection_keys.c
> @@ -461,7 +461,7 @@ void pkey_disable_clear(int pkey, int flags)
>   pkey, pkey, pkey_rights);
>   pkey_assert(pkey_rights >= 0);
>  
> - pkey_rights |= flags;
> + pkey_rights &= ~flags;
>  
>   ret = pkey_set(pkey, pkey_rights, 0);
>   /* pkey_reg and flags have the same format */
> @@ -475,7 +475,7 @@ void pkey_disable_clear(int pkey, int flags)
>   dprintf1("%s(%d) pkey_reg: 0x%016lx\n", __func__,
>   pkey, rdpkey_reg());
>   if (flags)
> - assert(rdpkey_reg() > orig_pkey_reg);
> + assert(rdpkey_reg() < orig_pkey_reg);
>  }
>  
>  void pkey_write_allow(int pkey)

This seems so horribly wrong that I wonder how it worked in the first
place.  Any idea?
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v12 06/22] selftests/vm: fix the wrong assert in pkey_disable_set()

2018-03-16 Thread Dave Hansen
On 02/21/2018 05:55 PM, Ram Pai wrote:
> If the flag is 0, no bits will be set. Hence we cant expect
> the resulting bitmap to have a higher value than what it
> was earlier.
> 
> cc: Dave Hansen <dave.han...@intel.com>
> cc: Florian Weimer <fwei...@redhat.com>
> Signed-off-by: Ram Pai <linux...@us.ibm.com>
> ---
>  tools/testing/selftests/vm/protection_keys.c |2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/tools/testing/selftests/vm/protection_keys.c 
> b/tools/testing/selftests/vm/protection_keys.c
> index 83216c5..0109388 100644
> --- a/tools/testing/selftests/vm/protection_keys.c
> +++ b/tools/testing/selftests/vm/protection_keys.c
> @@ -443,7 +443,7 @@ void pkey_disable_set(int pkey, int flags)
>   dprintf1("%s(%d) pkey_reg: 0x%lx\n",
>   __func__, pkey, rdpkey_reg());
>   if (flags)
> - pkey_assert(rdpkey_reg() > orig_pkey_reg);
> + pkey_assert(rdpkey_reg() >= orig_pkey_reg);
>   dprintf1("END<---%s(%d, 0x%x)\n", __func__,
>   pkey, flags);
>  }

I'm not sure about this one.  Did this cause a problem for you?

Why would you call this and ask no bits to be set?

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v12 05/22] selftests/vm: generic function to handle shadow key register

2018-03-16 Thread Dave Hansen
On 02/21/2018 05:55 PM, Ram Pai wrote:
> +static inline u32 pkey_to_shift(int pkey)
> +{
> + return pkey * PKEY_BITS_PER_PKEY;
> +}

pkey_bit_position(), perhaps?

> +static inline pkey_reg_t reset_bits(int pkey, pkey_reg_t bits)
> +{
> + u32 shift = pkey_to_shift(pkey);
> +
> + return ~(bits << shift);
> +}

I'd prefer clear_pkey_flags() or maybe clear_pkey_bits().  "reset" can
mean "reset to 0" or "reset to 1".

Also, why the u32 here?  Isn't an int more appropriate?

> +static inline pkey_reg_t left_shift_bits(int pkey, pkey_reg_t bits)
> +{
> + u32 shift = pkey_to_shift(pkey);
> +
> + return (bits << shift);
> +}
> +
> +static inline pkey_reg_t right_shift_bits(int pkey, pkey_reg_t bits)
> +{
> + u32 shift = pkey_to_shift(pkey);
> +
> + return (bits >> shift);
> +}

Some comments on these would be handy.  Basically that this takes a
per-key flags value and puts it at the right position so it can be
shoved in the register.
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v12 04/22] selftests/vm: typecast the pkey register

2018-03-16 Thread Dave Hansen
On 02/21/2018 05:55 PM, Ram Pai wrote:
> -static inline unsigned int _rdpkey_reg(int line)
> +static inline pkey_reg_t _rdpkey_reg(int line)
>  {
> - unsigned int pkey_reg = __rdpkey_reg();
> + pkey_reg_t pkey_reg = __rdpkey_reg();
>  
> - dprintf4("rdpkey_reg(line=%d) pkey_reg: %x shadow: %x\n",
> + dprintf4("rdpkey_reg(line=%d) pkey_reg: %016lx shadow: %016lx\n",
>   line, pkey_reg, shadow_pkey_reg);
>   assert(pkey_reg == shadow_pkey_reg);

Hmm.  So we're using %lx for an int?  Doesn't the compiler complain
about this?
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v12 02/22] selftests/vm: rename all references to pkru to a generic name

2018-03-16 Thread Dave Hansen
On 02/21/2018 05:55 PM, Ram Pai wrote:
>  int pkey_set(int pkey, unsigned long rights, unsigned long flags)
>  {
>   u32 mask = (PKEY_DISABLE_ACCESS|PKEY_DISABLE_WRITE);
> - u32 old_pkru = __rdpkru();
> - u32 new_pkru;
> + u32 old_pkey_reg = __rdpkey_reg();
> + u32 new_pkey_reg;

If we're not using the _actual_ instruction names ("rdpkru"), I think
I'd rather this be something more readable, like: __read_pkey_reg().

But, it's OK-ish the way it is.

Reviewed-by: Dave Hansen <dave.han...@intel.com>
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] [v2] docs: clarify security-bugs disclosure policy

2018-03-07 Thread Dave Hansen

From: Dave Hansen <dave.han...@linux.intel.com>

I think we need to soften the language a bit.  It might scare folks
off, especially the:

 We prefer to fully disclose the bug as soon as possible.

which is not really the case.  Linus says:

It's not full disclosure, it's not coordinated disclosure,
and it's not "no disclosure".  It's more like just "timely
open fixes".

I changed a bit of the wording in here, but mostly to remove the word
"disclosure" since it seems to mean very specific things to people
that we do not mean here.

Signed-off-by: Dave Hansen <dave.han...@linux.intel.com>
Reviewed-by: Dan Williams <dan.j.willi...@intel.com>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Greg Kroah-Hartman <gre...@linuxfoundation.org>
Cc: Linus Torvalds <torva...@linux-foundation.org>
Cc: Alan Cox <gno...@lxorguk.ukuu.org.uk>
Cc: Andrea Arcangeli <aarca...@redhat.com>
Cc: Andy Lutomirski <l...@kernel.org>
Cc: Kees Cook <keesc...@google.com>
Cc: Tim Chen <tim.c.c...@linux.intel.com>
Cc: Alexander Viro <v...@zeniv.linux.org.uk>
Cc: Andrew Morton <a...@linux-foundation.org>
Cc: linux-doc@vger.kernel.org
Cc: Jonathan Corbet <cor...@lwn.net>
Cc: Mark Rutland <mark.rutl...@arm.com>
---

 b/Documentation/admin-guide/security-bugs.rst |   24 +---
 1 file changed, 13 insertions(+), 11 deletions(-)

diff -puN Documentation/admin-guide/security-bugs.rst~embargo2 
Documentation/admin-guide/security-bugs.rst
--- a/Documentation/admin-guide/security-bugs.rst~embargo2  2018-03-07 
13:23:49.390228208 -0800
+++ b/Documentation/admin-guide/security-bugs.rst   2018-03-07 
13:42:37.618225395 -0800
@@ -29,18 +29,20 @@ made public.
 Disclosure
 --
 
-The goal of the Linux kernel security team is to work with the
-bug submitter to bug resolution as well as disclosure.  We prefer
-to fully disclose the bug as soon as possible.  It is reasonable to
-delay disclosure when the bug or the fix is not yet fully understood,
-the solution is not well-tested or for vendor coordination.  However, we
-expect these delays to be short, measurable in days, not weeks or months.
-A disclosure date is negotiated by the security team working with the
-bug submitter as well as vendors.  However, the kernel security team
-holds the final say when setting a disclosure date.  The timeframe for
-disclosure is from immediate (esp. if it's already publicly known)
+The goal of the Linux kernel security team is to work with the bug
+submitter to understand and fix the bug.  We prefer to publish the fix as
+soon as possible, but try to avoid public discussion of the bug itself
+and leave that to others.
+
+Publishing the fix may be delayed when the bug or the fix is not yet
+fully understood, the solution is not well-tested or for vendor
+coordination.  However, we expect these delays to be short, measurable in
+days, not weeks or months.  A release date is negotiated by the security
+team working with the bug submitter as well as vendors.  However, the
+kernel security team holds the final say when setting a timeframe.  The
+timeframe varies from immediate (esp. if it's already publicly known bug)
 to a few weeks.  As a basic default policy, we expect report date to
-disclosure date to be on the order of 7 days.
+release date to be on the order of 7 days.
 
 Coordination
 
_
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] docs: clarify security-bugs disclosure policy

2018-03-06 Thread Dave Hansen

From: Dave Hansen <dave.han...@linux.intel.com>

I think we need to soften the language a bit.  It might scare folks
off, especially the:

 We prefer to fully disclose the bug as soon as possible.

which is not really the case.  As Greg mentioned in private mail, we
really do not prefer to disclose things until *after* a fix.  The
whole "we have the final say" is also a bit harsh.

Signed-off-by: Dave Hansen <dave.han...@linux.intel.com>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Greg Kroah-Hartman <gre...@linuxfoundation.org>
Cc: Linus Torvalds <torva...@linux-foundation.org>
Cc: Alan Cox <gno...@lxorguk.ukuu.org.uk>
Cc: Andrea Arcangeli <aarca...@redhat.com>
Cc: Andy Lutomirski <l...@kernel.org>
Cc: Kees Cook <keesc...@google.com>
Cc: Tim Chen <tim.c.c...@linux.intel.com>
Cc: Dan Williams <dan.j.willi...@intel.com>
Cc: Alexander Viro <v...@zeniv.linux.org.uk>
Cc: Andrew Morton <a...@linux-foundation.org>
Cc: linux-doc@vger.kernel.org
Cc: Jonathan Corbet <cor...@lwn.net>
Cc: Mark Rutland <mark.rutl...@arm.com>
---

 b/Documentation/admin-guide/security-bugs.rst |   26 +++---
 1 file changed, 15 insertions(+), 11 deletions(-)

diff -puN Documentation/admin-guide/security-bugs.rst~embargo 
Documentation/admin-guide/security-bugs.rst
--- a/Documentation/admin-guide/security-bugs.rst~embargo   2018-03-06 
14:47:04.519431230 -0800
+++ b/Documentation/admin-guide/security-bugs.rst   2018-03-06 
14:57:46.410429629 -0800
@@ -29,18 +29,22 @@ made public.
 Disclosure
 --
 
-The goal of the Linux kernel security team is to work with the
-bug submitter to bug resolution as well as disclosure.  We prefer
-to fully disclose the bug as soon as possible.  It is reasonable to
-delay disclosure when the bug or the fix is not yet fully understood,
-the solution is not well-tested or for vendor coordination.  However, we
-expect these delays to be short, measurable in days, not weeks or months.
+The goal of the Linux kernel security team is to work with the bug
+submitter to bug resolution as well as disclosure.  We prefer to fully
+disclose the bug as soon as possible after a fix is available.  It is
+customary to delay disclosure when the bug or the fix is not yet fully
+understood, the solution is not well-tested or for vendor coordination.
+However, we expect these delays to typically be short, measurable in
+days, not weeks or months.
+
 A disclosure date is negotiated by the security team working with the
-bug submitter as well as vendors.  However, the kernel security team
-holds the final say when setting a disclosure date.  The timeframe for
-disclosure is from immediate (esp. if it's already publicly known)
-to a few weeks.  As a basic default policy, we expect report date to
-disclosure date to be on the order of 7 days.
+bug submitter as well as affected vendors.  The security team prefers
+coordinated disclosure and will consider pre-existing, reasonable
+disclosure dates.
+
+The timeframe for disclosure ranges from immediate (esp. if it's
+already publicly known) to a few weeks.  As a basic default policy, we
+expect report date to disclosure date to be on the order of 7 days.
 
 Coordination
 
_
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v12 10/11] sparc64: Add support for ADI (Application Data Integrity)

2018-03-05 Thread Dave Hansen
On 03/05/2018 01:37 PM, Khalid Aziz wrote:
>> How big can this storage get, btw?  Superficially it seems like it might
>> be able to be gigantic for a large, sparse VMA.
>>
> Tags are stored only for the pages being swapped out, not for the pages
> in entire vma. Each tag storage page can hold tags for 128 pages (each
> page has 128 4-bit tags, hence 64 bytes are needed to store tags for an
> entire page allowing each page to store tags for 128 pages). Sparse VMA
> does not cause any problems since holes do not have corresponding pages
> that will be swapped out. Tag storage pages are freed once all the pages
> they store tags for have been swapped back in, except for a small number
> of pages (maximum of 8) marked for emergency tag storage.

With a linear scan holding a process-wide spinlock?  If you have a fast
swap device, does this become the bottleneck when swapping ADI-tagged
memory?

FWIW, this tag storage is complex and subtle enough code that it
deserves to be in its own well-documented patch, not buried in a
thousand-line patch.

> +tag_storage_desc_t *find_tag_store(struct mm_struct *mm,
> +struct vm_area_struct *vma,
> +unsigned long addr)
> +{
> + tag_storage_desc_t *tag_desc = NULL;
> + unsigned long i, max_desc, flags;
> +
> + /* Check if this vma already has tag storage descriptor
> +  * allocated for it.
> +  */
> + max_desc = PAGE_SIZE/sizeof(tag_storage_desc_t);
> + if (mm->context.tag_store) {
> + tag_desc = mm->context.tag_store;
> + spin_lock_irqsave(>context.tag_lock, flags);
> + for (i = 0; i < max_desc; i++) {
> + if ((addr >= tag_desc->start) &&
> + ((addr + PAGE_SIZE - 1) <= tag_desc->end))
> + break;
> + tag_desc++;
> + }
> + spin_unlock_irqrestore(>context.tag_lock, flags);

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v12 10/11] sparc64: Add support for ADI (Application Data Integrity)

2018-03-05 Thread Dave Hansen
On 03/05/2018 01:14 PM, Khalid Aziz wrote:
> Are you suggesting that vma returned by find_vma() could be split or
> merged underneath me if I do not hold mmap_sem and thus make the flag
> check invalid? If so, that is a good point.

This part does make me think that this code hasn't been tested very
thoroughly.  Could you describe the testing that you have done?  For MPX
and protection keys, I added something to tools/testing/selftests/x86,
for instance.
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v12 10/11] sparc64: Add support for ADI (Application Data Integrity)

2018-03-05 Thread Dave Hansen
On 03/05/2018 01:14 PM, Khalid Aziz wrote:
> On 03/05/2018 12:22 PM, Dave Hansen wrote:
>> On 02/21/2018 09:15 AM, Khalid Aziz wrote:
>>> +#define arch_validate_prot(prot, addr) sparc_validate_prot(prot, addr)
>>> +static inline int sparc_validate_prot(unsigned long prot, unsigned
>>> long addr)
>>> +{
>>> +    if (prot & ~(PROT_READ | PROT_WRITE | PROT_EXEC | PROT_SEM |
>>> PROT_ADI))
>>> +    return 0;
>>> +    if (prot & PROT_ADI) {
>>> +    if (!adi_capable())
>>> +    return 0;
>>> +
>>> +    if (addr) {
>>> +    struct vm_area_struct *vma;
>>> +
>>> +    vma = find_vma(current->mm, addr);
>>> +    if (vma) {
>>> +    /* ADI can not be enabled on PFN
>>> + * mapped pages
>>> + */
>>> +    if (vma->vm_flags & (VM_PFNMAP | VM_MIXEDMAP))
>>> +    return 0;
>>
>> You don't hold mmap_sem here.  How can this work?
>>
> Are you suggesting that vma returned by find_vma() could be split or
> merged underneath me if I do not hold mmap_sem and thus make the flag
> check invalid? If so, that is a good point.

Um, yes.  You can't walk the vma tree without holding mmap_sem.
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v12 10/11] sparc64: Add support for ADI (Application Data Integrity)

2018-03-05 Thread Dave Hansen
On 02/21/2018 09:15 AM, Khalid Aziz wrote:
> +tag_storage_desc_t *alloc_tag_store(struct mm_struct *mm,
> + struct vm_area_struct *vma,
> + unsigned long addr)
...
> + tags = kzalloc(size, GFP_NOWAIT|__GFP_NOWARN);
> + if (tags == NULL) {
> + tag_desc->tag_users = 0;
> + tag_desc = NULL;
> + goto out;
> + }
> + tag_desc->start = addr;
> + tag_desc->tags = tags;
> + tag_desc->end = end_addr;
> +
> +out:
> + spin_unlock_irqrestore(>context.tag_lock, flags);
> + return tag_desc;
> +}

OK, sorry, I missed this.  I do see that you now have per-ADI-block tag
storage and it is not per-page.

How big can this storage get, btw?  Superficially it seems like it might
be able to be gigantic for a large, sparse VMA.
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v12 10/11] sparc64: Add support for ADI (Application Data Integrity)

2018-03-05 Thread Dave Hansen
On 02/21/2018 09:15 AM, Khalid Aziz wrote:
> +#define arch_validate_prot(prot, addr) sparc_validate_prot(prot, addr)
> +static inline int sparc_validate_prot(unsigned long prot, unsigned long addr)
> +{
> + if (prot & ~(PROT_READ | PROT_WRITE | PROT_EXEC | PROT_SEM | PROT_ADI))
> + return 0;
> + if (prot & PROT_ADI) {
> + if (!adi_capable())
> + return 0;
> +
> + if (addr) {
> + struct vm_area_struct *vma;
> +
> + vma = find_vma(current->mm, addr);
> + if (vma) {
> + /* ADI can not be enabled on PFN
> +  * mapped pages
> +  */
> + if (vma->vm_flags & (VM_PFNMAP | VM_MIXEDMAP))
> + return 0;

You don't hold mmap_sem here.  How can this work?
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v9 29/51] mm/mprotect, powerpc/mm/pkeys, x86/mm/pkeys: Add sysfs interface

2017-12-18 Thread Dave Hansen
On 12/18/2017 02:18 PM, Ram Pai wrote:
> b) minimum number of keys available to the application.
>   if libraries consumes a few, they could provide a library
>   interface to the application informing the number available to
>   the application.  The library interface can leverage (b) to
>   provide the information.

OK, let's see a real user of this including a few libraries.  Then we'll
put it in the kernel.

> c) types of disable-rights supported by keys.
>   Helps the application to determine the types of disable-features
>   available. This is helpful, otherwise the app has to 
>   make pkey_alloc() call with the corresponding parameter set
>   and see if it suceeds or fails. Painful from an application
>   point of view, in my opinion.

Again, let's see a real-world use of this.  How does it look?  How does
an app "fall back" if it can't set a restriction the way it wants to?

Are we *sure* that such an interface makes sense?  For instance, will it
be possible for some keys to be execute-disable while other are only
write-disable?

> I think on x86 you look for some hardware registers to determine which
> hardware features are enabled by the kernel.

No, we use CPUID.  It's a part of the ISA that's designed for
enumerating CPU and (sometimes) OS support for CPU features.

> We do not have generic support for something like that on ppc.
> The kernel looks at the device tree to determine what hardware features
> are available. But does not have mechanism to tell the hardware to track
> which of its features are currently enabled/used by the kernel; atleast
> not for the memory-key feature.

Bummer.  You're missing out.

But, you could still do this with a syscall.  "Hey, kernel, do you
support this feature?"
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v9 29/51] mm/mprotect, powerpc/mm/pkeys, x86/mm/pkeys: Add sysfs interface

2017-12-18 Thread Dave Hansen
On 11/06/2017 12:57 AM, Ram Pai wrote:
> Expose useful information for programs using memory protection keys.
> Provide implementation for powerpc and x86.
> 
> On a powerpc system with pkeys support, here is what is shown:
> 
> $ head /sys/kernel/mm/protection_keys/*
> ==> /sys/kernel/mm/protection_keys/disable_access_supported <==
> true

This is cute, but I don't think it should be part of the ABI.  Put it in
debugfs if you want it for cute tests.  The stuff that this tells you
can and should come from pkey_alloc() for the ABI.

http://man7.org/linux/man-pages/man7/pkeys.7.html

>Any application wanting to use protection keys needs to be able to
>function without them.  They might be unavailable because the
>hardware that the application runs on does not support them, the
>kernel code does not contain support, the kernel support has been
>disabled, or because the keys have all been allocated, perhaps by a
>library the application is using.  It is recommended that
>applications wanting to use protection keys should simply call
>pkey_alloc(2) and test whether the call succeeds, instead of
>attempting to detect support for the feature in any other way.

Do you really not have standard way on ppc to say whether hardware
features are supported by the kernel?  For instance, how do you know if
a given set of registers are known to and are being context-switched by
the kernel?
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v9 00/51] powerpc, mm: Memory Protection Keys

2017-11-07 Thread Dave Hansen
On 11/07/2017 02:39 PM, Ram Pai wrote:
> 
> As per the current semantics of sys_pkey_free(); the way I understand it,
> the calling thread is saying disassociate me from this key.

No.  It is saying: "this *process* no longer has any uses of this key,
it can be reused".
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC v5 34/38] procfs: display the protection-key number associated with a vma

2017-07-13 Thread Dave Hansen
On 07/13/2017 01:03 AM, Ram Pai wrote:
> On Tue, Jul 11, 2017 at 11:13:56AM -0700, Dave Hansen wrote:
>> On 07/05/2017 02:22 PM, Ram Pai wrote:
>>> +#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
>>> +void arch_show_smap(struct seq_file *m, struct vm_area_struct *vma)
>>> +{
>>> +   seq_printf(m, "ProtectionKey:  %8u\n", vma_pkey(vma));
>>> +}
>>> +#endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
>>
>> This seems like kinda silly unnecessary duplication.  Could we just put
>> this in the fs/proc/ code and #ifdef it on ARCH_HAS_PKEYS?
> 
> Well x86 predicates it based on availability of X86_FEATURE_OSPKE.
> 
> powerpc doesn't need that check or any similar check. So trying to
> generalize the code does not save much IMHO.

I know all your hardware doesn't support it. :)

So, for instance, if you are running on a new POWER9 with radix page
tables, you will just always output "ProtectionKey: 0" in every VMA,
regardless?

> maybe have a seperate inline function that does
> seq_printf(m, "ProtectionKey:  %8u\n", vma_pkey(vma));
> and is called from x86 and powerpc's arch_show_smap()?
> At least will keep the string format captured in 
> one single place.

Now that we have two architectures, is there a strong reason we can't
just have an arch_pkeys_enabled(), and stick the seq_printf() back in
generic code?
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC v5 12/38] mm: ability to disable execute permission on a key at creation

2017-07-11 Thread Dave Hansen
On 07/11/2017 03:14 PM, Ram Pai wrote:
> Now how many does the kernel use to reserve for itself is something
> the kernel knows too and hence can expose it, though the information
> may change dynamically as the kernel reserves and releases the key
> based on its internal needs. 
> 
> So i think we can expose this informaton through procfs/sysfs and let
> the application decide how it wants to use the information.

Why bother?  On x86, you'll be told either 14 or 15 depending on whether
you tried to create a mapping in the process without execute permission.
 You can't use all 14 or 15 unless you actually call pkey_alloc() anyway
because the /proc check is inherently racy.

I'm just not sure I see the value in creating a new ABI for it.
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC v5 12/38] mm: ability to disable execute permission on a key at creation

2017-07-11 Thread Dave Hansen
On 07/11/2017 02:51 PM, Ram Pai wrote:
> On Wed, Jul 12, 2017 at 07:29:37AM +1000, Benjamin Herrenschmidt wrote:
>> On Tue, 2017-07-11 at 11:11 -0700, Dave Hansen wrote:
>>> On 07/05/2017 02:21 PM, Ram Pai wrote:
>>>> Currently sys_pkey_create() provides the ability to disable read
>>>> and write permission on the key, at  creation. powerpc  has  the
>>>> hardware support to disable execute on a pkey as well.This patch
>>>> enhances the interface to let disable execute  at  key  creation
>>>> time. x86 does  not  allow  this.  Hence the next patch will add
>>>> ability  in  x86  to  return  error  if  PKEY_DISABLE_EXECUTE is
>>>> specified.
>>
>> That leads to the question... How do you tell userspace.
>>
>> (apologies if I missed that in an existing patch in the series)
>>
>> How do we inform userspace of the key capabilities ? There are at least
>> two things userspace may want to know already:
>>
>>  - What protection bits are supported for a key
> 
> the userspace is the one which allocates the keys and enables/disables the
> protection bits on the key. the kernel is just a facilitator. Now if the
> use space wants to know the current permissions on a given key, it can
> just read the AMR/PKRU register on powerpc/intel respectively.

Let's say I want to execute-disable a region.  Can I use protection
keys?  Do I do

pkey_mprotect(... PKEY_DISABLE_EXECUTE);

and assume that the -EINVAL is because PKEY_DISABLE_EXECUTE is
unsupported, or do I do:

#ifdef __ppc__
pkey = pkey_aloc();
pkey_mprotect(... PKEY_DISABLE_EXECUTE);
#else
mprotect();
#endif

>>  - How many keys exist
> 
> There is no standard way of finding this other than trying to allocate
> as many till you fail.  A procfs or sysfs file can be added to expose
> this information.

It's also dynamic.  On x86, you lose a key if you've used the
execute-only support.  We also reserve the right to steal more in the
future if we want.

>>  - Which keys are available for use by userspace. On PowerPC, the
>> kernel can reserve some keys for itself, so can the hypervisor. In
>> fact, they do.
> 
> this information can be exposed through /proc or /sysfs
> 
> I am sure there will be more demands and requirements as applications
> start leveraging these feature.

For 5 bits, I think just having someone run pkey_alloc() in a loop is
fine.  I don't think we really need to enumerate it in some other way.
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC v5 38/38] Documentation: PowerPC specific updates to memory protection keys

2017-07-11 Thread Dave Hansen
On 07/05/2017 02:22 PM, Ram Pai wrote:
> Add documentation updates that capture PowerPC specific changes.
> 
> Signed-off-by: Ram Pai 
> ---
>  Documentation/vm/protection-keys.txt |   85 
> ++
>  1 files changed, 65 insertions(+), 20 deletions(-)
> 
> diff --git a/Documentation/vm/protection-keys.txt 
> b/Documentation/vm/protection-keys.txt
> index b643045..d50b6ab 100644
> --- a/Documentation/vm/protection-keys.txt
> +++ b/Documentation/vm/protection-keys.txt
> @@ -1,21 +1,46 @@
> -Memory Protection Keys for Userspace (PKU aka PKEYs) is a CPU feature
> -which will be found on future Intel CPUs.
> +Memory Protection Keys for Userspace (PKU aka PKEYs) is a CPU feature found 
> in
> +new generation of intel CPUs and on PowerPC 7 and higher CPUs.

Please try not to change the wording here.  I really did mean to
literally put "future Intel CPUs."  Also, you broke my nice wrapping. :)

I'm also thinking that this needs to be more generic.  The ppc _CPU_
feature is *NOT* for userspace-only, right?

>  Memory Protection Keys provides a mechanism for enforcing page-based
> -protections, but without requiring modification of the page tables
> -when an application changes protection domains.  It works by
> -dedicating 4 previously ignored bits in each page table entry to a
> -"protection key", giving 16 possible keys.
> -
> -There is also a new user-accessible register (PKRU) with two separate
> -bits (Access Disable and Write Disable) for each key.  Being a CPU
> -register, PKRU is inherently thread-local, potentially giving each
> -thread a different set of protections from every other thread.
> -
> -There are two new instructions (RDPKRU/WRPKRU) for reading and writing
> -to the new register.  The feature is only available in 64-bit mode,
> -even though there is theoretically space in the PAE PTEs.  These
> -permissions are enforced on data access only and have no effect on
> +protections, but without requiring modification of the page tables when an
> +application changes protection domains.
> +
> +
> +On Intel:
> +
> + It works by dedicating 4 previously ignored bits in each page table
> + entry to a "protection key", giving 16 possible keys.
> +
> + There is also a new user-accessible register (PKRU) with two separate
> + bits (Access Disable and Write Disable) for each key.  Being a CPU
> + register, PKRU is inherently thread-local, potentially giving each
> + thread a different set of protections from every other thread.
> +
> + There are two new instructions (RDPKRU/WRPKRU) for reading and writing
> + to the new register.  The feature is only available in 64-bit mode,
> + even though there is theoretically space in the PAE PTEs.  These
> + permissions are enforced on data access only and have no effect on
> + instruction fetches.
> +
> +
> +On PowerPC:
> +
> + It works by dedicating 5 page table entry bits to a "protection key",
> + giving 32 possible keys.
> +
> + There  is  a  user-accessible  register (AMR)  with  two separate bits;
> + Access Disable and  Write  Disable, for  each key.  Being  a  CPU
> + register,  AMR  is inherently  thread-local,  potentially  giving  each
> + thread a different set of protections from every other thread.  NOTE:
> + Disabling read permission does not disable write and vice-versa.
> +
> + The feature is available on 64-bit HPTE mode only.
> + 'mtspr 0xd, mem' reads the AMR register
> + 'mfspr mem, 0xd' writes into the AMR register.

The whole "being a CPU register" bits seem pretty common.  Should it be
in the leading paragraph that is shared?

> +Permissions are enforced on data access only and have no effect on
>  instruction fetches.

Shouldn't we mention the ppc support for execute-disable here too?

Also, *does* this apply to ppc?  You have it both in this common area
and in the x86 portion.

>  === Syscalls ===
> @@ -28,9 +53,9 @@ There are 3 system calls which directly interact with pkeys:
> unsigned long prot, int pkey);
>  
>  Before a pkey can be used, it must first be allocated with
> -pkey_alloc().  An application calls the WRPKRU instruction
> +pkey_alloc().  An application calls the WRPKRU/AMR instruction
>  directly in order to change access permissions to memory covered
> -with a key.  In this example WRPKRU is wrapped by a C function
> +with a key.  In this example WRPKRU/AMR is wrapped by a C function
>  called pkey_set().
>  
>   int real_prot = PROT_READ|PROT_WRITE;
> @@ -52,11 +77,11 @@ is no longer in use:
>   munmap(ptr, PAGE_SIZE);
>   pkey_free(pkey);
>  
> -(Note: pkey_set() is a wrapper for the RDPKRU and WRPKRU instructions.
> +(Note: pkey_set() is a wrapper for the RDPKRU,WRPKRU or AMR instructions.
>   An example implementation can be found in
>   tools/testing/selftests/x86/protection_keys.c)
>  
> 

Re: [RFC v5 34/38] procfs: display the protection-key number associated with a vma

2017-07-11 Thread Dave Hansen
On 07/05/2017 02:22 PM, Ram Pai wrote:
> +#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
> +void arch_show_smap(struct seq_file *m, struct vm_area_struct *vma)
> +{
> + seq_printf(m, "ProtectionKey:  %8u\n", vma_pkey(vma));
> +}
> +#endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */

This seems like kinda silly unnecessary duplication.  Could we just put
this in the fs/proc/ code and #ifdef it on ARCH_HAS_PKEYS?
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC v5 13/38] x86: disallow pkey creation with PKEY_DISABLE_EXECUTE

2017-07-11 Thread Dave Hansen
On 07/05/2017 02:21 PM, Ram Pai wrote:
> x86 does not support disabling execute permissions on a pkey.
> 
> Signed-off-by: Ram Pai 
> ---
>  arch/x86/kernel/fpu/xstate.c |3 +++
>  1 files changed, 3 insertions(+), 0 deletions(-)
> 
> diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
> index c24ac1e..d582631 100644
> --- a/arch/x86/kernel/fpu/xstate.c
> +++ b/arch/x86/kernel/fpu/xstate.c
> @@ -900,6 +900,9 @@ int arch_set_user_pkey_access(struct task_struct *tsk, 
> int pkey,
>   if (!boot_cpu_has(X86_FEATURE_OSPKE))
>   return -EINVAL;
>  
> + if (init_val & PKEY_DISABLE_EXECUTE)
> + return -EINVAL;

I'd really rather that we define a supported mask instead of having each
architecture go through and list which ones it supports.
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC v5 12/38] mm: ability to disable execute permission on a key at creation

2017-07-11 Thread Dave Hansen
On 07/05/2017 02:21 PM, Ram Pai wrote:
> Currently sys_pkey_create() provides the ability to disable read
> and write permission on the key, at  creation. powerpc  has  the
> hardware support to disable execute on a pkey as well.This patch
> enhances the interface to let disable execute  at  key  creation
> time. x86 does  not  allow  this.  Hence the next patch will add
> ability  in  x86  to  return  error  if  PKEY_DISABLE_EXECUTE is
> specified.
> 
> Signed-off-by: Ram Pai 
> ---
>  include/uapi/asm-generic/mman-common.h |4 +++-
>  1 files changed, 3 insertions(+), 1 deletions(-)
> 
> diff --git a/include/uapi/asm-generic/mman-common.h 
> b/include/uapi/asm-generic/mman-common.h
> index 8c27db0..bf4fa07 100644
> --- a/include/uapi/asm-generic/mman-common.h
> +++ b/include/uapi/asm-generic/mman-common.h
> @@ -74,7 +74,9 @@
>  
>  #define PKEY_DISABLE_ACCESS  0x1
>  #define PKEY_DISABLE_WRITE   0x2
> +#define PKEY_DISABLE_EXECUTE 0x4
>  #define PKEY_ACCESS_MASK (PKEY_DISABLE_ACCESS |\
> -  PKEY_DISABLE_WRITE)
> +  PKEY_DISABLE_WRITE  |\
> +  PKEY_DISABLE_EXECUTE)

If you do this, it breaks bisection.  Can you please just do this in one
patch?

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC v5 11/38] mm: introduce an additional vma bit for powerpc pkey

2017-07-11 Thread Dave Hansen
On 07/05/2017 02:21 PM, Ram Pai wrote:
> Currently there are only 4bits in the vma flags to support 16 keys
> on x86.  powerpc supports 32 keys, which needs 5bits. This patch
> introduces an addition bit in the vma flags.
> 
> Signed-off-by: Ram Pai 
> ---
>  fs/proc/task_mmu.c |6 +-
>  include/linux/mm.h |   18 +-
>  2 files changed, 18 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index f0c8b33..2ddc298 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -666,12 +666,16 @@ static void show_smap_vma_flags(struct seq_file *m, 
> struct vm_area_struct *vma)
>   [ilog2(VM_MERGEABLE)]   = "mg",
>   [ilog2(VM_UFFD_MISSING)]= "um",
>   [ilog2(VM_UFFD_WP)] = "uw",
> -#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
> +#ifdef CONFIG_ARCH_HAS_PKEYS
>   /* These come out via ProtectionKey: */
>   [ilog2(VM_PKEY_BIT0)]   = "",
>   [ilog2(VM_PKEY_BIT1)]   = "",
>   [ilog2(VM_PKEY_BIT2)]   = "",
>   [ilog2(VM_PKEY_BIT3)]   = "",
> +#endif /* CONFIG_ARCH_HAS_PKEYS */
> +#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
> + /* Additional bit in ProtectionKey: */
> + [ilog2(VM_PKEY_BIT4)]   = "",
>  #endif

I'd probably just leave the #ifdef out and eat the byte or whatever of
storage that this costs us on x86.

>   };
>   size_t i;
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 7cb17c6..3d35bcc 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -208,21 +208,29 @@ extern int overcommit_kbytes_handler(struct ctl_table 
> *, int, void __user *,
>  #define VM_HIGH_ARCH_BIT_1   33  /* bit only usable on 64-bit 
> architectures */
>  #define VM_HIGH_ARCH_BIT_2   34  /* bit only usable on 64-bit 
> architectures */
>  #define VM_HIGH_ARCH_BIT_3   35  /* bit only usable on 64-bit 
> architectures */
> +#define VM_HIGH_ARCH_BIT_4   36  /* bit only usable on 64-bit arch */

Please just copy the above lines.

>  #define VM_HIGH_ARCH_0   BIT(VM_HIGH_ARCH_BIT_0)
>  #define VM_HIGH_ARCH_1   BIT(VM_HIGH_ARCH_BIT_1)
>  #define VM_HIGH_ARCH_2   BIT(VM_HIGH_ARCH_BIT_2)
>  #define VM_HIGH_ARCH_3   BIT(VM_HIGH_ARCH_BIT_3)
> +#define VM_HIGH_ARCH_4   BIT(VM_HIGH_ARCH_BIT_4)
>  #endif /* CONFIG_ARCH_USES_HIGH_VMA_FLAGS */
>  
> -#if defined(CONFIG_X86)
> -# define VM_PAT  VM_ARCH_1   /* PAT reserves whole VMA at 
> once (x86) */
> -#if defined (CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS)
> +#ifdef CONFIG_ARCH_HAS_PKEYS
>  # define VM_PKEY_SHIFT   VM_HIGH_ARCH_BIT_0
> -# define VM_PKEY_BIT0VM_HIGH_ARCH_0  /* A protection key is a 4-bit 
> value */
> +# define VM_PKEY_BIT0VM_HIGH_ARCH_0
>  # define VM_PKEY_BIT1VM_HIGH_ARCH_1
>  # define VM_PKEY_BIT2VM_HIGH_ARCH_2
>  # define VM_PKEY_BIT3VM_HIGH_ARCH_3
> -#endif
> +#endif /* CONFIG_ARCH_HAS_PKEYS */

We have the space here, so can we just say that it's 4-bits on x86 and 5
on ppc?

> +#if defined(CONFIG_PPC64_MEMORY_PROTECTION_KEYS)
> +# define VM_PKEY_BIT4VM_HIGH_ARCH_4 /* additional key bit used on 
> ppc64 */
> +#endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */

Why bother #ifdef'ing a #define?

> +#if defined(CONFIG_X86)
> +# define VM_PAT  VM_ARCH_1   /* PAT reserves whole VMA at 
> once (x86) */
>  #elif defined(CONFIG_PPC)
>  # define VM_SAO  VM_ARCH_1   /* Strong Access Ordering 
> (powerpc) */
>  #elif defined(CONFIG_PARISC)
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC v5 36/38] selftest: PowerPC specific test updates to memory protection keys

2017-07-11 Thread Dave Hansen
On 07/05/2017 02:22 PM, Ram Pai wrote:
> Abstracted out the arch specific code into the header file, and
> added powerpc specific changes.
> 
> a) added 4k-backed hpte, memory allocator, powerpc specific.
> b) added three test case where the key is associated after the page is
>   accessed/allocated/mapped.
> c) cleaned up the code to make checkpatch.pl happy

There's a *lot* of churn here.  If it breaks, I'm going to have a heck
of a time figuring out which hunk broke.  Is there any way to break this
up into a series of things that we have a chance at bisecting?
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v5 31/32] x86: Add sysfs support for Secure Memory Encryption

2017-04-27 Thread Dave Hansen
On 04/27/2017 12:25 AM, Dave Young wrote:
> On 04/21/17 at 02:55pm, Dave Hansen wrote:
>> On 04/18/2017 02:22 PM, Tom Lendacky wrote:
>>> Add sysfs support for SME so that user-space utilities (kdump, etc.) can
>>> determine if SME is active.
>>>
>>> A new directory will be created:
>>>   /sys/kernel/mm/sme/
>>>
>>> And two entries within the new directory:
>>>   /sys/kernel/mm/sme/active
>>>   /sys/kernel/mm/sme/encryption_mask
>>
>> Why do they care, and what will they be doing with this information?
> 
> Since kdump will copy old memory but need this to know if the old memory
> was encrypted or not. With this sysfs file we can know the previous SME
> status and pass to kdump kernel as like a kernel param.
> 
> Tom, have you got chance to try if it works or not?

What will the kdump kernel do with it though?  We kexec() into that
kernel so the SME keys will all be the same, right?  So, will the kdump
kernel be just setting the encryption bit in the PTE so it can copy the
old plaintext out?

Why do we need both 'active' and 'encryption_mask'?  How could it be
that the hardware-enumerated 'encryption_mask' changes across a kexec()?
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v5 31/32] x86: Add sysfs support for Secure Memory Encryption

2017-04-21 Thread Dave Hansen
On 04/18/2017 02:22 PM, Tom Lendacky wrote:
> Add sysfs support for SME so that user-space utilities (kdump, etc.) can
> determine if SME is active.
> 
> A new directory will be created:
>   /sys/kernel/mm/sme/
> 
> And two entries within the new directory:
>   /sys/kernel/mm/sme/active
>   /sys/kernel/mm/sme/encryption_mask

Why do they care, and what will they be doing with this information?
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v5 09/32] x86/mm: Provide general kernel support for memory encryption

2017-04-21 Thread Dave Hansen
On 04/18/2017 02:17 PM, Tom Lendacky wrote:
> @@ -55,7 +57,7 @@ static inline void copy_user_page(void *to, void *from, 
> unsigned long vaddr,
>   __phys_addr_symbol(__phys_reloc_hide((unsigned long)(x)))
>  
>  #ifndef __va
> -#define __va(x)  ((void *)((unsigned 
> long)(x)+PAGE_OFFSET))
> +#define __va(x)  ((void *)(__sme_clr(x) + PAGE_OFFSET))
>  #endif

It seems wrong to be modifying __va().  It currently takes a physical
address, and this modifies it to take a physical address plus the SME bits.

How does that end up ever happening?  If we are pulling physical
addresses out of the page tables, we use p??_phys().  I'd expect *those*
to be masking off the SME bits.

Is it these cases?

pgd_t *base = __va(read_cr3());

For those, it seems like we really want to create two modes of reading
cr3.  One that truly reads CR3 and another that reads the pgd's physical
address out of CR3.  Then you only do the SME masking on the one
fetching a physical address, and the SME bits never leak into __va().
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH v4 07/28] x86: Provide general kernel support for memory encryption

2017-02-22 Thread Dave Hansen
On 02/16/2017 07:43 AM, Tom Lendacky wrote:
> )
> @@ -673,7 +683,7 @@ static inline unsigned long pgd_page_vaddr(pgd_t pgd)
>   * Currently stuck as a macro due to indirect forward reference to
>   * linux/mmzone.h's __section_mem_map_addr() definition:
>   */
> -#define pgd_page(pgd)pfn_to_page(pgd_val(pgd) >> PAGE_SHIFT)
> +#define pgd_page(pgd)pfn_to_page(pgd_pfn(pgd))

FWIW, these seem like good cleanups that can go in separately from the
rest of your series.
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH v4 07/28] x86: Provide general kernel support for memory encryption

2017-02-22 Thread Dave Hansen
On 02/16/2017 07:43 AM, Tom Lendacky wrote:
>  static inline unsigned long pte_pfn(pte_t pte)
>  {
> - return (pte_val(pte) & PTE_PFN_MASK) >> PAGE_SHIFT;
> + return (pte_val(pte) & ~sme_me_mask & PTE_PFN_MASK) >> PAGE_SHIFT;
>  }
>  
>  static inline unsigned long pmd_pfn(pmd_t pmd)
>  {
> - return (pmd_val(pmd) & pmd_pfn_mask(pmd)) >> PAGE_SHIFT;
> + return (pmd_val(pmd) & ~sme_me_mask & pmd_pfn_mask(pmd)) >> PAGE_SHIFT;
>  }

Could you talk a bit about why you chose to do the "~sme_me_mask" bit in
here instead of making it a part of PTE_PFN_MASK / pmd_pfn_mask(pmd)?

It might not matter, but I'd be worried that this ends up breaking
direct users of PTE_PFN_MASK / pmd_pfn_mask(pmd) since they now no
longer mask the PFN out of a PTE.
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 0/4] Application Data Integrity feature introduced by SPARC M7

2017-01-13 Thread Dave Hansen
On 01/13/2017 07:29 AM, Rob Gardner wrote:
> so perhaps ADI should simply be disallowed for memory mapped to
> files, and this particular complication can be avoided. Thoughts?

What's a "file" from your perspective?

In Linux, shared memory is a file.  hugetlbfs is done with files.  Many
databases mmap() their data into their address space.
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 0/4] Application Data Integrity feature introduced by SPARC M7

2017-01-12 Thread Dave Hansen
On 01/12/2017 08:50 AM, Khalid Aziz wrote:
> 2. Any shared page that has ADI protection enabled on it, must stay ADI
> protected across all processes sharing it.

Is that true?

What happens if a page with ADI tags set is accessed via a PTE without
the ADI enablement bit set?

> COW creates an intersection of the two. It creates a new copy of the
> shared data. It is a new data page and hence the process creating it
> must be the one responsible for enabling ADI protection on it.

Do you mean that the application must be responsible?  Or the kernel
running in the context of the new process must be responsible?

> It is also a copy of what was ADI protected data, so should it
> inherit the protection instead?

I think the COW'd copy must inherit the VMA bit, the PTE bits, and the
tags on the cachelines.

> I misspoke earlier. I had misinterpreted the results of test I ran.
> Changing the tag on shared memory is allowed by memory controller. The
> requirement is every one sharing the page must switch to the new tag or
> else they get SIGSEGV.

I asked this in the last mail, but I guess I'll ask it again.  Please
answer this directly.

If we require that everyone coordinate their tags on the backing
physical memory, and we allow a lower-privileged program to access the
same data as a more-privileged one, then the lower-privilege app can
cause arbitrary crashes in the privileged application.

For instance, say sudo mmap()'s /etc/passwd and uses ADI tags to protect
the mapping.  Couldn't any other app in the system prevent sudo from
working?

How can we *EVER* allow tags to be set on non-writable mappings?

> I am inclined to suggest we copy the tags to the new data page on COW
> and that will continue to enforce ADI on the COW'd pages even though
> COW'd pages are new data pages. This is the logically consistent
> behavior. Does that make sense?

Yes, I think this is what you have to do.
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 0/4] Application Data Integrity feature introduced by SPARC M7

2017-01-11 Thread Dave Hansen
On 01/11/2017 04:22 PM, Khalid Aziz wrote:
...
> All of the tag coordination can happen in userspace. Once a process sets
> a tag on a physical page mapped in its address space, another process
> that has mapped the same physical page in its address space can only set
> the tag to exact same value. Attempts to set a different tag are caught
> by memory controller and result in MCD trap and kernel sends SIGSEGV to
> the process trying to set a different tag.

Again, I don't think these semantics will work for anything other than
explicitly shared memory.  This behavior ensures that it is *entirely*
unsafe to use ADI on any data that any process you do not control might
be able to mmap().  That's a *HUGE* caveat for the feature and can't
imagine ever seeing this get merged without addressing it.

I think it's fairly simple to address, though a bit expensive.  First,
you can't allow the VMA bit to get set on non-writable mappings.
Second, you'll have to force COW to occur on read-only pages in writable
mappings before the PTE bit can get set.  I think you can probably even
do this in the faults that presumably occur when you try to set ADI tags
on memory mapped with non-ADI PTEs.

>> If you want to use it on copy-on-write'able data, you've got to ensure
>> that you've got entirely private copies.  I'm not sure we even have an
>> interface to guarantee that.  How could this work after a fork() on
>> un-COW'd, but COW'able data?
> 
> On COW, kernel maps the the source and destination pages with
> kmap_atomic() and copies the data over to the new page and the new page
> wouldn't be ADI protected unless the child process chooses to do so.

What do you mean by "ADI protection"?

I think of ADI _protection_ as coming from the PTE and/or VMA bits.
Those are copied at fork() from the old VMA to the new one.  Is there a
reason the child won't implicitly inherit these that I missed?

Whether the parent or the child does the COW fault is basically random.
Whether they get the ADI-tagged page, or the non-ADI-tagged copy is thus
effectively random.  Assuming that the new page has its tags cleared
(and thus is tagged not to be protected), whether your data continues to
be protected or not after a fork() is random.

That doesn't seem like workable behavior.


--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 0/4] Application Data Integrity feature introduced by SPARC M7

2017-01-11 Thread Dave Hansen
On 01/11/2017 10:50 AM, Khalid Aziz wrote:
> On 01/11/2017 11:13 AM, Dave Hansen wrote:
>> On 01/11/2017 08:56 AM, Khalid Aziz wrote:
>> For memory shared by two different processes, do they have to agree on
>> what the tags are, or can they differ?
> 
> The two processes have to agree on the tag. This is part of the security
> design to prevent other processes from accessing pages belonging to
> another process unless they know the tag set on those pages.

So what do you do with static data, say from a shared executable?  You
need to ensure that two different processes from two different privilege
domains can't set different tags on the same physical memory.  That
would seem to mean that you must not allow tags to be set of memory
unless you have write access to it.  Or, you have to ensure that any
file that you might want to use this feature on is entirely unreadable
(well, un-mmap()-able really) by anybody that you are not coordinating with.

If you want to use it on copy-on-write'able data, you've got to ensure
that you've got entirely private copies.  I'm not sure we even have an
interface to guarantee that.  How could this work after a fork() on
un-COW'd, but COW'able data?

>>> Potential for side
>>> effects is too high in such case and would require kernel to either
>>> track tags for every page as they are re-allocated or migrated, or scrub
>>> pages constantly to ensure we do not get spurious tag mismatches. Unless
>>> there is a very strong reason to blindly set TTE.mcd on every PTE, I
>>> think the risk of instability is too high without lot of extra code.
>>
>> Ahh, ok.  That makes sense.  Clearing the tags is expensive.  We must
>> either clear tags or know the previous tags of the memory before we
>> access it.
>>
>> Are any of the tags special?  Do any of them mean "don't do any
>> checking", or similar?
> 
> Tag values of 0 and 15 can be considered special. Setting tag to 15 on
> memory range is disallowed. Accessing a memory location whose tag is
> cleared (means set to 0) with any tag value in the VA is allowed. Once a
> tag is set on a memory, and PSTATE.mcde and TTE.mcd are set, there isn't
> a tag that can be used to bypass version check by MMU.

Bummer.  If the hardware had allowed a special VA tag to bypass checks,
then you wouldn't need to worry about clearing the tags, and you
wouldn't need the interface to control the PTE bit setting/clearing.

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 0/4] Application Data Integrity feature introduced by SPARC M7

2017-01-11 Thread Dave Hansen
On 01/11/2017 08:56 AM, Khalid Aziz wrote:
> On 01/11/2017 09:33 AM, Dave Hansen wrote:
>> On 01/11/2017 08:12 AM, Khalid Aziz wrote:
>>> A userspace task enables ADI through mprotect(). This patch series adds
>>> a page protection bit PROT_ADI and a corresponding VMA flag
>>> VM_SPARC_ADI. VM_SPARC_ADI is used to trigger setting TTE.mcd bit in the
>>> sparc pte that enables ADI checking on the corresponding page.
>>
>> Is there a cost in the hardware associated with doing this "ADI
>> checking"?  For instance, instead of having this new mprotect()
>> interface, why not just always set TTE.mcd on all PTEs?
> 
> There is no performance penalty in the MMU to check tags, but if
> PSTATE.mcd bit is set and TTE.mcde is set, the tag in VA must match what
> was set on the physical page for all memory accesses.

OK, then I'm misunderstanding the architecture again.

For memory shared by two different processes, do they have to agree on
what the tags are, or can they differ?

> Potential for side
> effects is too high in such case and would require kernel to either
> track tags for every page as they are re-allocated or migrated, or scrub
> pages constantly to ensure we do not get spurious tag mismatches. Unless
> there is a very strong reason to blindly set TTE.mcd on every PTE, I
> think the risk of instability is too high without lot of extra code.

Ahh, ok.  That makes sense.  Clearing the tags is expensive.  We must
either clear tags or know the previous tags of the memory before we
access it.

Are any of the tags special?  Do any of them mean "don't do any
checking", or similar?

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 0/4] Application Data Integrity feature introduced by SPARC M7

2017-01-11 Thread Dave Hansen
On 01/11/2017 08:12 AM, Khalid Aziz wrote:
> A userspace task enables ADI through mprotect(). This patch series adds
> a page protection bit PROT_ADI and a corresponding VMA flag
> VM_SPARC_ADI. VM_SPARC_ADI is used to trigger setting TTE.mcd bit in the
> sparc pte that enables ADI checking on the corresponding page.

Is there a cost in the hardware associated with doing this "ADI
checking"?  For instance, instead of having this new mprotect()
interface, why not just always set TTE.mcd on all PTEs?

Also, should this be a privileged interface in some way?  The hardware
is storing these tags *somewhere* and that storage is consuming
resources *somewhere*.  What stops a crafty attacker from mmap()'ing a
128TB chunk of the zero pages and storing ADI tags for all of it?
That'll be 128TB/64*4bits = 1TB worth of 4-bit tags.  Page tables, for
instance, consume a comparable amount of storage, but the OS *knows*
about those and can factor them into OOM decisions.
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH v3] sparc64: Add support for Application Data Integrity (ADI)

2017-01-06 Thread Dave Hansen
On 01/06/2017 08:22 AM, Khalid Aziz wrote:
> On 01/06/2017 08:36 AM, Dave Hansen wrote:
>> On 01/06/2017 07:32 AM, Khalid Aziz wrote:
>>> I agree with you on simplicity first. Subpage granularity is complex,
>>> but the architecture allows for subpage granularity. Maybe the right
>>> approach is to support this at page granularity first for swappable
>>> pages and then expand to subpage granularity in a subsequent patch?
>>> Pages locked in memory can already use subpage granularity with my
>>> patch.
>>
>> What do you mean by "locked in memory"?  mlock()'d memory can still be
>> migrated around and still requires "swap" ptes, for instance.
> 
> You are right. Page migration can invalidate subpage granularity even
> for locked pages. Is it possible to use cpusets to keep a task and its
> memory locked on a single node?

It's going to be hard to impossible to guarantee.  mlock() doesn't
guarantee that things won't change physical addresses.  You'd have to
change that guarantee or chase all the things in the kernel that might
change physical addresses (compaction, ksm, etc...).

Actually, that reminds me...  How does your code interface with ksm?  Or
is there no interaction needed since you're always working on virtual
addresses?

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH v3] sparc64: Add support for Application Data Integrity (ADI)

2017-01-06 Thread Dave Hansen
On 01/06/2017 07:32 AM, Khalid Aziz wrote:
> I agree with you on simplicity first. Subpage granularity is complex,
> but the architecture allows for subpage granularity. Maybe the right
> approach is to support this at page granularity first for swappable
> pages and then expand to subpage granularity in a subsequent patch?
> Pages locked in memory can already use subpage granularity with my patch.

What do you mean by "locked in memory"?  mlock()'d memory can still be
migrated around and still requires "swap" ptes, for instance.
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH v3] sparc64: Add support for Application Data Integrity (ADI)

2017-01-05 Thread Dave Hansen
On 01/04/2017 04:26 PM, Khalid Aziz wrote:
...
> No, we do not have space to stuff PAGE_SIZE/64 version tags in swap pte.
> There is enough space for just one tag per page. DaveM had suggested
> doing this since the usual case is for a task to set one tag per page
> even though MMU does not require it. I have implemented this as first
> pass to start a discussion and get feedback on whether rest of the
> swapping implementation and other changes look right, hence the patch is
> "RFC". If this all looks good, I can expand swapping support in a
> subsequent patch or iteration of this patch to allocate space in
> mm_context_t possibly to store per cacheline tags. I am open to any
> other ideas on storing this larger number of version tags.

FWIW, This is the kind of thing that would be really useful to point out
to reviewers instead of requiring them to ferret it out of the code.  It
has huge implications for how applications use this feature.

As for where to store the tags...  It's potentially a *lot* of data, so
I think it'll be a pain any way you do it.

If you, instead, can live with doing things on a PAGE_SIZE granularity
like pkeys does, you could just store it in the VMA and have the kernel
tag the data at the same time it zeroes the pages.
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH v3] sparc64: Add support for Application Data Integrity (ADI)

2017-01-04 Thread Dave Hansen
On 01/04/2017 04:05 PM, Rob Gardner wrote:
>> What if two different small pages have different tags and khugepaged
>> comes along and tries to collapse them?  Will the page be split if a
>> user attempts to set two different tags inside two different small-page
>> portions of a single THP?
> 
> The MCD tags operate at a resolution of cache lines (64 bytes). Page
> sizes don't matter except that each virtual page must have a bit set in
> its TTE to allow MCD to be enabled on the page. Any page can have many
> different tags, one for each cache line.

Is an "MCD tag" the same thing as a "ADI version tag"?

The thing that confused me here is that we're taking an entire page of
"ADI version tags" and stuffing them into a swap pte (in
set_swp_pte_at()).  Do we somehow have enough space in a swap pte on
sparc to fit PAGE_SIZE/64 "ADI version tag"s in there?
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH v3] sparc64: Add support for Application Data Integrity (ADI)

2017-01-04 Thread Dave Hansen
On 01/04/2017 03:58 PM, Khalid Aziz wrote:
>> How does this all work with large pages?
> 
> It works with large pages the same way as normal sized pages. The TTE
> for a large page also will have the mcd bit set in it and tags are set
> and referenced the same way.

But does the user setting the tags need to know what the page size is?

What if two different small pages have different tags and khugepaged
comes along and tries to collapse them?  Will the page be split if a
user attempts to set two different tags inside two different small-page
portions of a single THP?
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH v3] sparc64: Add support for Application Data Integrity (ADI)

2017-01-04 Thread Dave Hansen
On 01/04/2017 03:46 PM, Khalid Aziz wrote:
>> It would also be really nice to see a high-level breakdown explaining
>> what you had to modify, especially since this affects all of the system
>> calls that take a PROT_* argument.  The sample code is nice, but it's no
>> substitute for writing it down.
> 
> I will expand the explanation in Documentation/sparc/adi.txt.

I think (partially) duplicating that in a cover letter would also be
nice.  The documentation is a bit buried in the 1,000 lines of code.
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH v3] sparc64: Add support for Application Data Integrity (ADI)

2017-01-04 Thread Dave Hansen
On 01/04/2017 03:44 PM, Rob Gardner wrote:
> On 01/04/2017 03:40 PM, Dave Hansen wrote:
>> On 01/04/2017 03:35 PM, Rob Gardner wrote:
>>> Tags are not cleared at all when memory is freed, but rather, lazily
>>> (and automatically) cleared when memory is allocated.
>> What does "allocated" mean in this context?  Physical or virtual? What
>> does this do, for instance?
> 
> The first time a virtual page is touched by a process after the malloc,
> the kernel does clear_user_page() or something similar, which zeroes the
> memory. At the same time, the memory tags are cleared.

OK, so the tags can't survive a MADV_FREE.  That's definitely something
for apps to understand that use MADV_FREE as a substitute for memset().
It also means that tags can't be set for physically unallocated memory.

Neither of those are deal killers, but it would be nice to document it.

How does this all work with large pages?
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH v3] sparc64: Add support for Application Data Integrity (ADI)

2017-01-04 Thread Dave Hansen
On 01/04/2017 03:35 PM, Rob Gardner wrote:
> Tags are not cleared at all when memory is freed, but rather, lazily
> (and automatically) cleared when memory is allocated.

What does "allocated" mean in this context?  Physical or virtual? What
does this do, for instance?

ptr = malloc(PAGE_SIZE);
set_tag(ptr, 14);
madvise(ptr, PAGE_SIZE, MADV_FREE);
printf("tag: %d\n", get_tag(ptr));
free(ptr);
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH v3] sparc64: Add support for Application Data Integrity (ADI)

2017-01-04 Thread Dave Hansen
One other high-level comment:  It would be nice to see the
arch-independent and x86 portions broken out and explained in their own
right, even if they're small patches.  It's a bit cruel to make us
scroll through a thousand lines of sparc code to see the bits
interesting to us.

It would also be really nice to see a high-level breakdown explaining
what you had to modify, especially since this affects all of the system
calls that take a PROT_* argument.  The sample code is nice, but it's no
substitute for writing it down.
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH v3] sparc64: Add support for Application Data Integrity (ADI)

2017-01-04 Thread Dave Hansen
On 01/04/2017 02:46 PM, Khalid Aziz wrote:
> This patch extends mprotect to enable ADI (TSTATE.mcde), enable/disable
> MCD (Memory Corruption Detection) on selected memory ranges, enable
> TTE.mcd in PTEs, return ADI parameters to userspace and save/restore ADI
> version tags on page swap out/in. 

I'm a bit confused why we need all the mechanics with set_swp_pte_at().
For pkeys, for instance, all of the PTEs under a given VMA share a pkey.
 When swapping something in, we just get the pkey out of the VMA and
populate the PTE.

ADI doesn't seem to have a similar restriction.  The feature is turned
on or off at a VMA granularity, but we do not (or can enforce that all
pages under a given VMA must share a tag.

But this leads to an interesting question: is the tag associated with
the (populated?) pte, or the virtual address?  Can you have tags
associated with non-present addresses?  What's the mechanism that clears
the tags at munmap() or MADV_FREE time?

Is the tag storage a precious resource?  Can it be exhausted?
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/4] x86/cpufeature: Add User-Mode Instruction Prevention definitions

2016-11-11 Thread Dave Hansen
On 11/10/2016 08:08 PM, Ricardo Neri wrote:
> Thanks for the suggestions. Perhaps I can include these metrics in my
> V2. On th other hand, Dave Hansen gave a good argument on potential
> conflicts when, of instance running on an AMD CPU. UMIP is enabled by
> setting a bit in CR4. If that bit is not supposed to be set, that could
> cause a #GP fault.

I just meant that some folks probably appreciate being able to build out
all the Intel-specific features.  Not that it causes a functional problem.


--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/4] x86/cpufeature: Add User-Mode Instruction Prevention definitions

2016-11-10 Thread Dave Hansen
On 11/09/2016 07:24 PM, Ricardo Neri wrote:
> On Wed, 2016-11-09 at 03:02 -0800, Andy Lutomirski wrote:
...
>> > What I mean is: why does this need a config option at all?
> I intended this feature to be configurable at build time in case someone
> wants to build a kernel without it; similar to other features such as
> SMAP. Is this not needed? Should Linux be built with this feature always
> enabled?

I think marking these features with their own CONFIG's is a really good
idea.  It helps the tinification effort.  It's also nice for folks that
might want to turn all the Intel features off because they're running on
AMD or something.

We don't necessarily need prompts for *everything*, but I can't imagine
just slapping the code in without #ifdefs of any kind.
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] x86/pkeys: Update documentation

2016-10-04 Thread Dave Hansen

From: Dave Hansen <dave.han...@intel.com>

There are a few items that have gotten stale in the protection
keys documentation.  The config option description only applied
to the execute-only support and is not accurate for the current
code.  There was also a typo with the number of system calls.  I
also wanted to call out that pkey_set() is not a kernel-provided
facility, and where to find an implementation.

Signed-off-by: Dave Hansen <dave.han...@intel.com>
Cc: x...@kernel.org
Cc: Jonathan Corbet <cor...@lwn.net>
Cc: linux-doc@vger.kernel.org
---

 b/Documentation/x86/protection-keys.txt |   14 +-
 1 file changed, 5 insertions(+), 9 deletions(-)

diff -puN Documentation/x86/protection-keys.txt~pkeys-docfix 
Documentation/x86/protection-keys.txt
--- a/Documentation/x86/protection-keys.txt~pkeys-docfix2016-10-04 
09:31:01.361928429 -0700
+++ b/Documentation/x86/protection-keys.txt 2016-10-04 09:32:39.142383585 
-0700
@@ -20,7 +20,7 @@ instruction fetches.
 
 === Syscalls ===
 
-There are 2 system calls which directly interact with pkeys:
+There are 3 system calls which directly interact with pkeys:
 
int pkey_alloc(unsigned long flags, unsigned long init_access_rights)
int pkey_free(int pkey);
@@ -52,6 +52,10 @@ is no longer in use:
munmap(ptr, PAGE_SIZE);
pkey_free(pkey);
 
+(Note: pkey_set() is a wrapper for the RDPKRU and WRPKRU instructions.
+ An example implementation can be found in
+ tools/testing/selftests/x86/protection_keys.c)
+
 === Behavior ===
 
 The kernel attempts to make protection keys consistent with the
@@ -79,11 +83,3 @@ with a read():
 The kernel will send a SIGSEGV in both cases, but si_code will be set
 to SEGV_PKERR when violating protection keys versus SEGV_ACCERR when
 the plain mprotect() permissions are violated.
-
-=== Config Option ===
-
-This config option adds approximately 1.5kb of text. and 50 bytes of
-data to the executable.  A workload which does large O_DIRECT reads
-of holes in XFS files was run to exercise get_user_pages_fast().  No
-performance delta was observed with the config option
-enabled or disabled.
_
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] mm: Add the ram_latent_entropy kernel parameter

2016-08-11 Thread Dave Hansen
On 08/10/2016 03:28 PM, Kees Cook wrote:
> + if (ram_latent_entropy && !PageHighMem(page) &&
> + page_to_pfn(page) < 0x10) {
> + u64 hash = 0;
> + size_t index, end = PAGE_SIZE * nr_pages / sizeof(hash);
> + const u64 *data = lowmem_page_address(page);
> +
> + for (index = 0; index < end; index++)
> + hash ^= hash + data[index];
> + add_device_randomness((const void *), sizeof(hash));
> + }

When I was first reading this, I thought it was using the _addresses_ of
the freed memory for entropy.  But it's actually using the _contents_.
The description could probably use a wee bit of sprucing up.

It might also be nice to say in the patch description (and the
Documentation/) what you expect to be in this memory.  It will obviously
be zeros for the vast majority of the space, but I do wonder what else
ends up in there in practice.

Why is it limited to 4GB?  Just so it doesn't go and try to XOR the
contents of a multi-TB system if it got turned on there? :)
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 2/4] x86, boot: PUD VA support for physical mapping (x86_64)

2016-05-02 Thread Dave Hansen
On 05/02/2016 02:41 PM, Thomas Garnier wrote:
> Minor change that allows early boot physical mapping of PUD level virtual
> addresses. This change prepares usage of different virtual addresses for
> KASLR memory randomization. It has no impact on default usage.
...
> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index 89d9747..6adfbce 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -526,10 +526,10 @@ phys_pud_init(pud_t *pud_page, unsigned long addr, 
> unsigned long end,
>  {
>   unsigned long pages = 0, next;
>   unsigned long last_map_addr = end;
> - int i = pud_index(addr);
> + int i = pud_index((unsigned long)__va(addr));
>  
>   for (; i < PTRS_PER_PUD; i++, addr = next) {
> - pud_t *pud = pud_page + pud_index(addr);
> + pud_t *pud = pud_page + pud_index((unsigned long)__va(addr));
>   pmd_t *pmd;
>   pgprot_t prot = PAGE_KERNEL;

pud_index() is supposed to take a virtual address.  We were passing a
physical address in here, and it all just worked because PAGE_OFFSET is
PUD-aligned.  Now that you are moving PAGE_OFFSET around a bit and not
PUD-aligning it, this breaks.  Right?

Could you spell this out a bit more the changelog?
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] sparc64: Add support for Application Data Integrity (ADI)

2016-03-07 Thread Dave Hansen
On 03/07/2016 11:46 AM, Khalid Aziz wrote:
> On 03/07/2016 12:22 PM, David Miller wrote:
>> Khalid, maybe you should share notes with the folks working on x86
>> protection keys.
> 
> Good idea. Sparc ADI feature is indeed similar to x86 protection keys
> sounds like.

There are definitely some similarities.  But protection keys doesn't
have any additional tables in which to keep metadata.  It keeps all of
its data in the page tables.  It also doesn't have an impact on the
virtual address layout.

But, it does have metadata to store in the VMA, has a special
siginfo->si_code, and it uses mprotect() (although a new pkey_mprotect()
variant that takes an extra argument).

Protection Keys are described a bit more here:

> http://git.kernel.org/cgit/linux/kernel/git/daveh/x86-pkeys.git/tree/Documentation/x86/protection-keys.txt?h=pkeys-v025=1b5b8a8836de8eb667027178b4820665dea5a038

MPX is another Intel feature separate from protection keys, but *it* has
some tables that it keep its metadata memory and special special
instructions to move metadata in and out of it.  It also has a prctl()
to enable/disable kernel assistance for the feature.  Unlike ADI, the
tables are exposed (and accessible) to user applications in normal
application memory.

MPX's documentation is here:

> http://git.kernel.org/cgit/linux/kernel/git/daveh/x86-pkeys.git/tree/Documentation/x86/intel_mpx.txt

Overall, I'm not seeing much overlap at all between the features, honestly.
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] sparc64: Add support for Application Data Integrity (ADI)

2016-03-07 Thread Dave Hansen
On 03/07/2016 08:06 AM, Khalid Aziz wrote:
> Top 4-bits of sparc64 virtual address are used for version tag only when
> a process has its PSTATE.mcde bit set and it is accessing a memory
> region that has ADI enabled on it (TTE.mcd set) and a version tag was
> set on the virtual address being accessed. These 4-bits retain their
> original semantics in all other cases.

OK, so this effectively reduces the address space of a process using the
feature.  Do we need to do anything explicit to keep an app from using
that address space?  Do we make sure the kernel doesn't place VMAs
there?  Do we respect mmap() hints that try to place memory there?
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] sparc64: Add support for Application Data Integrity (ADI)

2016-03-07 Thread Dave Hansen
On 03/02/2016 12:39 PM, Khalid Aziz wrote:
> +long enable_sparc_adi(unsigned long addr, unsigned long len)
> +{
> + unsigned long end, pagemask;
> + int error;
> + struct vm_area_struct *vma, *vma2;
> + struct mm_struct *mm;
> +
> + if (!ADI_CAPABLE())
> + return -EINVAL;
...

This whole thing with the VMA splitting and so forth looks pretty darn
arch-independent.  Are you sure you need that much arch-specific code
for it, or can you share more of the generic VMA management code?
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] sparc64: Add support for Application Data Integrity (ADI)

2016-03-07 Thread Dave Hansen
On 03/02/2016 12:39 PM, Khalid Aziz wrote:
> --- a/include/uapi/asm-generic/siginfo.h
> +++ b/include/uapi/asm-generic/siginfo.h
> @@ -206,7 +206,10 @@ typedef struct siginfo {
>  #define SEGV_MAPERR  (__SI_FAULT|1)  /* address not mapped to object */
>  #define SEGV_ACCERR  (__SI_FAULT|2)  /* invalid permissions for mapped 
> object */
>  #define SEGV_BNDERR  (__SI_FAULT|3)  /* failed address bound checks */
> -#define NSIGSEGV 3
> +#define SEGV_ACCADI  (__SI_FAULT|4)  /* ADI not enabled for mapped object */
> +#define SEGV_ADIDERR (__SI_FAULT|5)  /* Disrupting MCD error */
> +#define SEGV_ADIPERR (__SI_FAULT|6)  /* Precise MCD exception */
> +#define NSIGSEGV 6

FYI, this will conflict with code in -tip right now:

> http://git.kernel.org/cgit/linux/kernel/git/tip/tip.git/commit/?h=mm/pkeys=cd0ea35ff5511cde299a61c21a95889b4a71464e

It's not a big deal to resolve, of course.
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html