Re: [RFC PATCH] skiboot machine check handler

2020-01-20 Thread Nicholas Piggin
Mahesh J Salgaonkar's on January 16, 2020 5:03 pm:
> On 2019-12-11 20:01:18 Wed, Nicholas Piggin wrote:
>> Provide facilities to decode machine checks into human readable
>> strings, with only sufficient information required to deal with
>> them sanely.
>> 
>> The old machine check stuff was over engineered. The philosophy
>> here is that OPAL should correct anything it possibly can, what
>> it can't handle but the OS might be able to do something with
>> (e.g., uncorrected memory error or SLB multi-hit), it passes back
>> to Linux. Anything else, the OS doesn't care. It doesn't want a
>> huge struct of severities and levels and originators etc that it
>> can't do anything with -- just provide human readable strings
>> for what happened and what was done with it.
>> 
>> A Linux driver for this will be able to cope with new processors.
>> 
>> This also uses the same facility to decode machine checks in OPAL
>> boot.
>> 
>> The code is a bit in flux because it's sitting on top of a few
>> other RFC patches and not quite complete, just wanted opinions
>> about it.
> 
> opal_handle_mce() may have to be treated as special opal call. For MCE
> that occurs in OPAL context, Linux making opal call will clobber
> original opal call stack which hit MCE. Same is true with nested MCE in
> OPAL. Should it just continue using same r1 to avoid clobbering or have
> a separate stack for mce opal call ?

Ah, it wasn't clear in my message, sorry: this would only be made
available to kernels which use the new calling convention where the
kernel provides its own stack for OPAL to use.

That may be controversial itself, that's another RFC but if we went
ahead with that approach, then handling re-entrant interrupts like
this becomes easy because Linux does all the hard work with NMI/MCE
stacks etc.

Thanks,
Nick


Re: [PATCH -next] powerpc/maple: fix comparing pointer to 0

2020-01-20 Thread Segher Boessenkool
On Mon, Jan 20, 2020 at 05:52:15PM -0800, Joe Perches wrote:
> On Tue, 2020-01-21 at 09:31 +0800, Chen Zhou wrote:
> > Fixes coccicheck warning:
> > ./arch/powerpc/platforms/maple/setup.c:232:15-16:
> > WARNING comparing pointer to 0
> 
> Does anyone have or use these powerpc maple boards anymore?
> 
> Maybe the whole codebase should just be deleted instead.

This is used for *all* non-Apple 970 systems (not running virtualized),
not just actual Maple.


Segher


Re:Re: [PATCH] powerpc/sysdev: fix compile errors

2020-01-20 Thread 王文虎
From: Andrew Donnellan
Date: 2020-01-21 14:13:07
To: wangwenhu, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
Kate Stewart, Greg Kroah-Hartman, Richard Fontana, Thomas Gleixner,
linuxppc-dev@lists.ozlabs.org, linux-ker...@vger.kernel.org
Cc: triv...@kernel.org, loneh...@hotmail.com, wenhu.w...@vivo.com
Subject: Re: [PATCH] powerpc/sysdev: fix compile errors

>On 21/1/20 4:31 pm, wangwenhu wrote:
>> From: wangwenhu 
>> 
>> Include arch/powerpc/include/asm/io.h into fsl_85xx_cache_sram.c to
>> fix the implicit declaration compile errors when building Cache-Sram.
>> 
>> arch/powerpc/sysdev/fsl_85xx_cache_sram.c: In function 
>> ‘instantiate_cache_sram’:
>> arch/powerpc/sysdev/fsl_85xx_cache_sram.c:97:26: error: implicit declaration 
>> of function ‘ioremap_coherent’; did you mean ‘bitmap_complement’? 
>> [-Werror=implicit-function-declaration]
>>cache_sram->base_virt = ioremap_coherent(cache_sram->base_phys,
>>^~~~
>>bitmap_complement
>> arch/powerpc/sysdev/fsl_85xx_cache_sram.c:97:24: error: assignment makes 
>> pointer from integer without a cast [-Werror=int-conversion]
>>cache_sram->base_virt = ioremap_coherent(cache_sram->base_phys,
>>  ^
>> arch/powerpc/sysdev/fsl_85xx_cache_sram.c:123:2: error: implicit declaration 
>> of function ‘iounmap’; did you mean ‘roundup’? 
>> [-Werror=implicit-function-declaration]
>>iounmap(cache_sram->base_virt);
>>^~~
>>roundup
>> cc1: all warnings being treated as errors
>> 
>> Signed-off-by: wangwenhu 
>
>How long has this code been broken for?

It's been broken almost 15 months since the commit below:
"commit aa91796ec46339f2ed53da311bd3ea77a3e4dfe1
Author: Christophe Leroy 
Date:   Tue Oct 9 13:51:41 2018 +

powerpc: don't use ioremap_prot() nor __ioremap() unless really needed."

And we are working on it now for further development.

>
>> ---
>>   arch/powerpc/sysdev/fsl_85xx_cache_sram.c | 1 +
>>   1 file changed, 1 insertion(+)
>> 
>> diff --git a/arch/powerpc/sysdev/fsl_85xx_cache_sram.c 
>> b/arch/powerpc/sysdev/fsl_85xx_cache_sram.c
>> index f6c665dac725..29b6868eff7d 100644
>> --- a/arch/powerpc/sysdev/fsl_85xx_cache_sram.c
>> +++ b/arch/powerpc/sysdev/fsl_85xx_cache_sram.c
>> @@ -17,6 +17,7 @@
>>   #include 
>>   #include 
>>   #include 
>> +#include <asm/io.h>
>> 
>>   #include "fsl_85xx_cache_ctlr.h"
>> 
>
>-- 
>Andrew Donnellan  OzLabs, ADL Canberra
>a...@linux.ibm.com IBM Australia Limited
>

Wenhu



Re: [PATCH v2 05/27] powerpc: Map & release OpenCAPI LPC memory

2020-01-20 Thread Andrew Donnellan

On 3/12/19 2:46 pm, Alastair D'Silva wrote:
> From: Alastair D'Silva 
> 
> This patch adds platform support to map & release LPC memory.

Might want to explain what LPC is.

Otherwise:

Reviewed-by: Andrew Donnellan 

> Signed-off-by: Alastair D'Silva 
> ---
>   arch/powerpc/include/asm/pnv-ocxl.h   |  2 ++
>   arch/powerpc/platforms/powernv/ocxl.c | 42 +++
>   2 files changed, 44 insertions(+)
> 
> diff --git a/arch/powerpc/include/asm/pnv-ocxl.h b/arch/powerpc/include/asm/pnv-ocxl.h
> index 7de82647e761..f8f8ffb48aa8 100644
> --- a/arch/powerpc/include/asm/pnv-ocxl.h
> +++ b/arch/powerpc/include/asm/pnv-ocxl.h
> @@ -32,5 +32,7 @@ extern int pnv_ocxl_spa_remove_pe_from_cache(void *platform_data, int pe_handle)
>   
>   extern int pnv_ocxl_alloc_xive_irq(u32 *irq, u64 *trigger_addr);
>   extern void pnv_ocxl_free_xive_irq(u32 irq);
> +extern u64 pnv_ocxl_platform_lpc_setup(struct pci_dev *pdev, u64 size);
> +extern void pnv_ocxl_platform_lpc_release(struct pci_dev *pdev);

nit: I don't think these need to be extern?
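
[Editor's note] Some background on the nit above: in C, a function declaration at file scope has external linkage by default, so `extern` on a prototype is redundant. The following is a minimal userspace sketch, not code from the patch; `lpc_size_pages()` is a hypothetical helper invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only; not part of the patch. */

/* Declaration with an explicit 'extern'. */
extern uint64_t lpc_size_pages(uint64_t size, uint64_t page_size);

/* Same declaration without 'extern': the linkage is identical, so the
 * two declarations are compatible and can coexist. */
uint64_t lpc_size_pages(uint64_t size, uint64_t page_size);

/* One definition satisfies both declarations.
 * Rounds 'size' up to a whole number of pages. */
uint64_t lpc_size_pages(uint64_t size, uint64_t page_size)
{
	assert(page_size != 0);
	return (size + page_size - 1) / page_size;
}
```

Both declarations refer to the same function, which is why dropping `extern` from a header prototype does not change behaviour.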


--
Andrew Donnellan  OzLabs, ADL Canberra
a...@linux.ibm.com IBM Australia Limited



Re:Re: [PATCH] powerpc/Kconfig: Make FSL_85XX_CACHE_SRAM configurable

2020-01-20 Thread 王文虎
From: Scott Wood
Date: 2020-01-21 13:49:59
To: "王文虎"
Cc: wangwenhu, Kumar Gala, Benjamin Herrenschmidt, Paul Mackerras,
Michael Ellerman, linuxppc-dev@lists.ozlabs.org,
linux-ker...@vger.kernel.org, triv...@kernel.org, Rai Harninder
Subject: Re: [PATCH] powerpc/Kconfig: Make FSL_85XX_CACHE_SRAM configurable

>On Tue, 2020-01-21 at 13:20 +0800, 王文虎 wrote:
>> From: Scott Wood 
>> Date: 2020-01-21 11:25:25
>> To:  wangwenhu ,Kumar Gala ,
>> Benjamin Herrenschmidt ,Paul Mackerras <
>> pau...@samba.org>,Michael Ellerman ,
>> linuxppc-dev@lists.ozlabs.org,linux-ker...@vger.kernel.org
>> Cc:  triv...@kernel.org,wenhu.w...@vivo.com,Rai Harninder <
>> harninder@nxp.com>
>> Subject: Re: [PATCH] powerpc/Kconfig: Make FSL_85XX_CACHE_SRAM configurable
>> 
>> > On Mon, 2020-01-20 at 06:43 -0800, wangwenhu wrote:
>> > > From: wangwenhu 
>> > > 
>> > > When generating .config file with menuconfig on Freescale BOOKE
>> > > SOC, FSL_85XX_CACHE_SRAM is not configurable for the lack of
>> > > description in the Kconfig field, which makes it impossible
>> > > to support L2Cache-Sram driver. Add a description to make it
>> > > configurable.
>> > > 
>> > > Signed-off-by: wangwenhu 
>> > 
>> > The intent was that drivers using the SRAM API would select the
>> > symbol.  What
>> > is the use case for selecting it manually?
>> > 
>> 
>> With a repository of multiple products(meaning different defconfigs) and
>> multiple
>> developers, the Kconfigs of the Kernel Source Tree change frequently. So the
>> "make menuconfig"
>> process is needed for defconfigs' re-generating or updating for the
>> complexity of dependencies
>> between different features defined in the Kconfigs.
>
>That doesn't answer my question of how the SRAM code would be useful other
>than to some other driver that uses the API (which would use "select").  There
>is no userspace API.  You could use the kernel command line to configure the
>SRAM but you need to get the address of it for it to be useful.
>

As you ask below: via /dev/mem, or by calling it directly within the kernel.
Those users are not submitted yet; they are under development.

>> > Since this code was added almost ten years ago and there are still no (in-
>> > tree?) users of the API, we should just remove the sram code (unless this
>> > prods someone to submit such a user very soon).
>> > 
>> 
>> Yes, pretty long a time. But we DO really use the API now for
>> PPCE500/Freescale SoC.
>
>I do not see any users in the kernel tree.  Are you talking about out-of-tree
>code, or something that you've submitted or will submit soon?  Or are you
>accessing it via /dev/mem?
>

Both, but not submitted yet, and partly under development.

>> Like sometimes we need to reset the whole RAM, then the L2-Cache would be
>> used as
>> SRAM for backup using. Since it is useful for us now, a re-consideration is
>> recommended.
>
>Where is the code that would do this?
>

Currently under development, and not submitted yet.

>-Scott
>> 
>

Wenhu



Re: [PATCH] powerpc/sysdev: fix compile errors

2020-01-20 Thread Andrew Donnellan

On 21/1/20 4:31 pm, wangwenhu wrote:
> From: wangwenhu 
> 
> Include arch/powerpc/include/asm/io.h into fsl_85xx_cache_sram.c to
> fix the implicit declaration compile errors when building Cache-Sram.
> 
> arch/powerpc/sysdev/fsl_85xx_cache_sram.c: In function ‘instantiate_cache_sram’:
> arch/powerpc/sysdev/fsl_85xx_cache_sram.c:97:26: error: implicit declaration of 
> function ‘ioremap_coherent’; did you mean ‘bitmap_complement’? 
> [-Werror=implicit-function-declaration]
>    cache_sram->base_virt = ioremap_coherent(cache_sram->base_phys,
>    ^~~~
>    bitmap_complement
> arch/powerpc/sysdev/fsl_85xx_cache_sram.c:97:24: error: assignment makes 
> pointer from integer without a cast [-Werror=int-conversion]
>    cache_sram->base_virt = ioremap_coherent(cache_sram->base_phys,
>  ^
> arch/powerpc/sysdev/fsl_85xx_cache_sram.c:123:2: error: implicit declaration of 
> function ‘iounmap’; did you mean ‘roundup’? 
> [-Werror=implicit-function-declaration]
>    iounmap(cache_sram->base_virt);
>    ^~~
>    roundup
> cc1: all warnings being treated as errors
> 
> Signed-off-by: wangwenhu 

How long has this code been broken for?

> ---
>   arch/powerpc/sysdev/fsl_85xx_cache_sram.c | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/arch/powerpc/sysdev/fsl_85xx_cache_sram.c b/arch/powerpc/sysdev/fsl_85xx_cache_sram.c
> index f6c665dac725..29b6868eff7d 100644
> --- a/arch/powerpc/sysdev/fsl_85xx_cache_sram.c
> +++ b/arch/powerpc/sysdev/fsl_85xx_cache_sram.c
> @@ -17,6 +17,7 @@
>   #include 
>   #include 
>   #include 
> +#include <asm/io.h>
> 
>   #include "fsl_85xx_cache_ctlr.h"


--
Andrew Donnellan  OzLabs, ADL Canberra
a...@linux.ibm.com IBM Australia Limited



Re: [PATCH] powerpc/sysdev: fix compile errors

2020-01-20 Thread Christophe Leroy




On 21/01/2020 at 06:31, wangwenhu wrote:
> From: wangwenhu 
> 
> Include arch/powerpc/include/asm/io.h into fsl_85xx_cache_sram.c to
> fix the implicit declaration compile errors when building Cache-Sram.


It is usually better to include <linux/io.h> instead of <asm/io.h>

Christophe


Re: [PATCH] powerpc/Kconfig: Make FSL_85XX_CACHE_SRAM configurable

2020-01-20 Thread Scott Wood
On Tue, 2020-01-21 at 13:20 +0800, 王文虎 wrote:
> From: Scott Wood 
> Date: 2020-01-21 11:25:25
> To:  wangwenhu ,Kumar Gala ,
> Benjamin Herrenschmidt ,Paul Mackerras <
> pau...@samba.org>,Michael Ellerman ,
> linuxppc-dev@lists.ozlabs.org,linux-ker...@vger.kernel.org
> Cc:  triv...@kernel.org,wenhu.w...@vivo.com,Rai Harninder <
> harninder@nxp.com>
> Subject: Re: [PATCH] powerpc/Kconfig: Make FSL_85XX_CACHE_SRAM configurable
> 
> > On Mon, 2020-01-20 at 06:43 -0800, wangwenhu wrote:
> > > From: wangwenhu 
> > > 
> > > When generating .config file with menuconfig on Freescale BOOKE
> > > SOC, FSL_85XX_CACHE_SRAM is not configurable for the lack of
> > > description in the Kconfig field, which makes it impossible
> > > to support L2Cache-Sram driver. Add a description to make it
> > > configurable.
> > > 
> > > Signed-off-by: wangwenhu 
> > 
> > The intent was that drivers using the SRAM API would select the
> > symbol.  What
> > is the use case for selecting it manually?
> > 
> 
> With a repository of multiple products(meaning different defconfigs) and
> multiple
> developers, the Kconfigs of the Kernel Source Tree change frequently. So the
> "make menuconfig"
> process is needed for defconfigs' re-generating or updating for the
> complexity of dependencies
> between different features defined in the Kconfigs.

That doesn't answer my question of how the SRAM code would be useful other
than to some other driver that uses the API (which would use "select").  There
is no userspace API.  You could use the kernel command line to configure the
SRAM but you need to get the address of it for it to be useful.

> > Since this code was added almost ten years ago and there are still no (in-
> > tree?) users of the API, we should just remove the sram code (unless this
> > prods someone to submit such a user very soon).
> > 
> 
> Yes, pretty long a time. But we DO really use the API now for
> PPCE500/Freescale SoC.

I do not see any users in the kernel tree.  Are you talking about out-of-tree
code, or something that you've submitted or will submit soon?  Or are you
accessing it via /dev/mem?

> Like sometimes we need to reset the whole RAM, then the L2-Cache would be
> used as
> SRAM for backup using. Since it is useful for us now, a re-consideration is
> recommended.

Where is the code that would do this?

-Scott
> 



Re:[PATCH] powerpc/Kconfig: Make FSL_85XX_CACHE_SRAM configurable

2020-01-20 Thread 王文虎
From: Scott Wood
Date: 2020-01-21 11:25:25
To: wangwenhu, Kumar Gala, Benjamin Herrenschmidt, Paul Mackerras,
Michael Ellerman, linuxppc-dev@lists.ozlabs.org, linux-ker...@vger.kernel.org
Cc: triv...@kernel.org, wenhu.w...@vivo.com, Rai Harninder
Subject: Re: [PATCH] powerpc/Kconfig: Make FSL_85XX_CACHE_SRAM configurable

>On Mon, 2020-01-20 at 06:43 -0800, wangwenhu wrote:
>> From: wangwenhu 
>> 
>> When generating .config file with menuconfig on Freescale BOOKE
>> SOC, FSL_85XX_CACHE_SRAM is not configurable for the lack of
>> description in the Kconfig field, which makes it impossible
>> to support L2Cache-Sram driver. Add a description to make it
>> configurable.
>> 
>> Signed-off-by: wangwenhu 
>
>The intent was that drivers using the SRAM API would select the symbol.  What
>is the use case for selecting it manually?
>

Our repository holds multiple products (each with its own defconfig) and has
multiple developers, so the Kconfigs of the kernel source tree change
frequently. Because of the complex dependencies between the features defined
in the Kconfigs, we rely on "make menuconfig" to regenerate or update the
defconfigs.

>Since this code was added almost ten years ago and there are still no (in-
>tree?) users of the API, we should just remove the sram code (unless this
>prods someone to submit such a user very soon).
>

Yes, it has been a long time. But we really DO use the API now on
PPCE500/Freescale SoCs. For example, we sometimes need to reset the whole RAM,
and the L2 cache is then used as SRAM for backup. Since it is useful for us
now, we recommend reconsidering.

>-Scott
>
>

--
Wenhu
vivo



[Bug 205099] KASAN hit at raid6_pq: BUG: Unable to handle kernel data access at 0x00f0fd0d

2020-01-20 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=205099

--- Comment #19 from Christophe Leroy (christophe.le...@c-s.fr) ---
Can you tell exactly where it stops during boot? Or take a photo of the
screen?

In parallel, could you try (without VMAP_STACK) increasing CONFIG_THREAD_SHIFT
to 14? It will double the size of the stacks.
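
[Editor's note] The doubling follows from the kernel stack size being a power of two derived from the shift (THREAD_SIZE = 1 << THREAD_SHIFT). A small userspace sketch of the arithmetic follows. It assumes the current setting is 13, which the comment implies but does not state; `thread_size_bytes()` is a hypothetical helper name.

```c
#include <assert.h>

/* Stack size in bytes for a given shift, mirroring the usual
 * THREAD_SIZE = 1 << THREAD_SHIFT relationship.
 * Hypothetical helper, for illustration only. */
unsigned long thread_size_bytes(unsigned int shift)
{
	assert(shift < 8 * sizeof(unsigned long));
	return 1UL << shift;
}
```

Under that assumption, a shift of 13 gives 8192-byte stacks and a shift of 14 gives 16384-byte stacks.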

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[PATCH] powerpc/sysdev: fix compile errors

2020-01-20 Thread wangwenhu
From: wangwenhu 

Include arch/powerpc/include/asm/io.h into fsl_85xx_cache_sram.c to
fix the implicit declaration compile errors when building Cache-Sram.

arch/powerpc/sysdev/fsl_85xx_cache_sram.c: In function ‘instantiate_cache_sram’:
arch/powerpc/sysdev/fsl_85xx_cache_sram.c:97:26: error: implicit declaration of 
function ‘ioremap_coherent’; did you mean ‘bitmap_complement’? 
[-Werror=implicit-function-declaration]
  cache_sram->base_virt = ioremap_coherent(cache_sram->base_phys,
  ^~~~
  bitmap_complement
arch/powerpc/sysdev/fsl_85xx_cache_sram.c:97:24: error: assignment makes 
pointer from integer without a cast [-Werror=int-conversion]
  cache_sram->base_virt = ioremap_coherent(cache_sram->base_phys,
^
arch/powerpc/sysdev/fsl_85xx_cache_sram.c:123:2: error: implicit declaration of 
function ‘iounmap’; did you mean ‘roundup’? 
[-Werror=implicit-function-declaration]
  iounmap(cache_sram->base_virt);
  ^~~
  roundup
cc1: all warnings being treated as errors

Signed-off-by: wangwenhu 
---
 arch/powerpc/sysdev/fsl_85xx_cache_sram.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/sysdev/fsl_85xx_cache_sram.c 
b/arch/powerpc/sysdev/fsl_85xx_cache_sram.c
index f6c665dac725..29b6868eff7d 100644
--- a/arch/powerpc/sysdev/fsl_85xx_cache_sram.c
+++ b/arch/powerpc/sysdev/fsl_85xx_cache_sram.c
@@ -17,6 +17,7 @@
 #include 
 #include 
 #include 
+#include <asm/io.h>
 
 #include "fsl_85xx_cache_ctlr.h"
 
-- 
2.17.1
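
[Editor's note] The two kinds of error in the log above are related: when no prototype is visible, the compiler falls back to treating the call as returning int, and assigning that int to a pointer produces the "makes pointer from integer" diagnostic. The userspace sketch below shows the working form, with the prototype visible before the call. `fake_ioremap()`, `struct cache_sram_demo` and `instantiate_demo()` are hypothetical stand-ins, not the kernel API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Stand-in for a prototype that would normally come from a header.
 * If this declaration were missing, the call below would be an
 * implicit declaration, assumed to return int.
 */
void *fake_ioremap(uintptr_t phys, size_t size);

static unsigned char backing[64];	/* pretend device memory */

struct cache_sram_demo {
	uintptr_t base_phys;
	void *base_virt;
};

/* With the prototype in scope, the pointer assignment type-checks. */
int instantiate_demo(struct cache_sram_demo *s, uintptr_t phys, size_t size)
{
	assert(s != NULL);
	s->base_phys = phys;
	s->base_virt = fake_ioremap(phys, size);
	return s->base_virt ? 0 : -1;
}

/* Toy mapping: succeeds only if the request fits the backing buffer. */
void *fake_ioremap(uintptr_t phys, size_t size)
{
	(void)phys;
	return size <= sizeof(backing) ? backing : NULL;
}
```

In the patch, the added include plays the role of the prototype line at the top of this sketch.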



[PATCH v2 10/10] powerpc/configs/skiroot: Enable CONFIG_PRINTK_CALLER

2020-01-20 Thread Michael Ellerman
This adds the CPU or thread number to printk messages. This can help
decipher concurrent oopses that have been interleaved.

Example output, of PID1 (T1) triggering a warning:

  [1.581678][T1] WARNING: CPU: 0 PID: 1 at crypto/rsa-pkcs1pad.c:539 pkcs1pad_verify+0x38/0x140
  [1.581681][T1] Modules linked in:
  [1.581693][T1] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.5.0-rc5-gcc-8.2.0-00121-gf84c2e595927-dirty #1515
  [1.581700][T1] NIP:  c0207d64 LR: c0207d3c CTR: c0207d2c
  [1.581708][T1] REGS: c000fd2e7560 TRAP: 0700   Not tainted  (5.5.0-rc5-gcc-8.2.0-00121-gf84c2e595927-dirty)
  [1.581712][T1] MSR:  90029033   CR: 44000222  XER: 0004
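
[Editor's note] The `T1` tag in the output above is the caller field: `T` plus a thread id when the message comes from task context, or `C` plus a CPU number otherwise. The following userspace sketch shows how such a tag could be built. The top-bit encoding is modelled on the kernel's caller id but should be read as an assumption here; `format_caller()` and `CALLER_IS_CPU` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed encoding: the top bit marks the id as a CPU number
 * rather than a thread id. */
#define CALLER_IS_CPU 0x80000000u

/* Write "T<id>" or "C<cpu>" into buf; returns snprintf's result. */
int format_caller(uint32_t caller_id, char *buf, size_t len)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%c%u",
			(caller_id & CALLER_IS_CPU) ? 'C' : 'T',
			(unsigned int)(caller_id & ~CALLER_IS_CPU));
}
```

With this tag on every line, interleaved oopses from different CPUs or threads can be separated by filtering on the caller field.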

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/configs/skiroot_defconfig | 1 +
 1 file changed, 1 insertion(+)

v2: New.

diff --git a/arch/powerpc/configs/skiroot_defconfig 
b/arch/powerpc/configs/skiroot_defconfig
index ca6f1842aa29..ae1d7137a84e 100644
--- a/arch/powerpc/configs/skiroot_defconfig
+++ b/arch/powerpc/configs/skiroot_defconfig
@@ -294,6 +294,7 @@ CONFIG_LIBCRC32C=y
 # CONFIG_XZ_DEC_ARMTHUMB is not set
 # CONFIG_XZ_DEC_SPARC is not set
 CONFIG_PRINTK_TIME=y
+CONFIG_PRINTK_CALLER=y
 CONFIG_MAGIC_SYSRQ=y
 CONFIG_SLUB_DEBUG_ON=y
 CONFIG_SCHED_STACK_END_CHECK=y
-- 
2.21.1



[PATCH v2 09/10] powerpc/configs/skiroot: Enable some more hardening options

2020-01-20 Thread Michael Ellerman
Enable more hardening options.

Note BUG_ON_DATA_CORRUPTION selects DEBUG_LIST and is essentially just
a synonym for it.

DEBUG_SG, DEBUG_NOTIFIERS, DEBUG_LIST, DEBUG_CREDENTIALS and
SCHED_STACK_END_CHECK should all be low overhead and just add a few
extra checks.

SLAB_FREELIST_RANDOM, and SLUB_DEBUG_ON will add some overhead to the
SLAB allocator, but nothing that should be meaningful for skiroot.

Unselecting SLAB_MERGE_DEFAULT causes the SLAB to use more memory, but
the skiroot kernel shouldn't be memory constrained on any of our
systems, all it does is run a small bootloader.

Disabling merging has some security/robustness benefit as it means a
use-after-free or overflow will be limited to the objects in that
slab, rather than potentially affecting objects from unrelated slabs
that have been merged.

Note also that slab merging is disabled anyway by enabling
SLUB_DEBUG_ON, because of the SLAB_NEVER_MERGE mask.
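
[Editor's note] The interaction described above amounts to a flag-mask test: debug-on-by-default sets debug flags on every cache, and any cache carrying a never-merge flag is excluded from merging. The following is a simplified userspace sketch of that logic. All `DEMO_*` names and bit values are hypothetical and do not match the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits, for illustration; the real values differ. */
#define DEMO_RED_ZONE      (1u << 0)
#define DEMO_POISON        (1u << 1)
#define DEMO_STORE_USER    (1u << 2)
#define DEMO_HWCACHE_ALIGN (1u << 3)	/* an ordinary, merge-friendly flag */

/* Flags that make a cache ineligible for merging. */
#define DEMO_NEVER_MERGE   (DEMO_RED_ZONE | DEMO_POISON | DEMO_STORE_USER)

/* Flags assumed to be applied to every cache when debugging is on. */
#define DEMO_DEBUG_DEFAULT (DEMO_RED_ZONE | DEMO_POISON | DEMO_STORE_USER)

/* A cache can be merged only if no never-merge bit is set. */
bool cache_mergeable(unsigned int flags, bool debug_on)
{
	/* Premise of the sketch: the debug defaults overlap the mask. */
	assert((DEMO_DEBUG_DEFAULT & DEMO_NEVER_MERGE) != 0);
	if (debug_on)
		flags |= DEMO_DEBUG_DEFAULT;
	return (flags & DEMO_NEVER_MERGE) == 0;
}
```

In this model, turning debugging on makes every cache unmergeable regardless of its own flags, which is why the explicit merge setting becomes redundant.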

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/configs/skiroot_defconfig | 8 
 1 file changed, 8 insertions(+)

v2: Add more explanation about slab merging.

diff --git a/arch/powerpc/configs/skiroot_defconfig 
b/arch/powerpc/configs/skiroot_defconfig
index 28cfd68e8b16..ca6f1842aa29 100644
--- a/arch/powerpc/configs/skiroot_defconfig
+++ b/arch/powerpc/configs/skiroot_defconfig
@@ -23,6 +23,8 @@ CONFIG_EXPERT=y
 # CONFIG_AIO is not set
 CONFIG_PERF_EVENTS=y
 # CONFIG_COMPAT_BRK is not set
+# CONFIG_SLAB_MERGE_DEFAULT is not set
+CONFIG_SLAB_FREELIST_RANDOM=y
 CONFIG_SLAB_FREELIST_HARDENED=y
 CONFIG_PPC64=y
 CONFIG_ALTIVEC=y
@@ -293,6 +295,8 @@ CONFIG_LIBCRC32C=y
 # CONFIG_XZ_DEC_SPARC is not set
 CONFIG_PRINTK_TIME=y
 CONFIG_MAGIC_SYSRQ=y
+CONFIG_SLUB_DEBUG_ON=y
+CONFIG_SCHED_STACK_END_CHECK=y
 CONFIG_DEBUG_STACKOVERFLOW=y
 CONFIG_PANIC_ON_OOPS=y
 CONFIG_SOFTLOCKUP_DETECTOR=y
@@ -301,6 +305,10 @@ CONFIG_HARDLOCKUP_DETECTOR=y
 CONFIG_BOOTPARAM_HARDLOCKUP_PANIC=y
 CONFIG_WQ_WATCHDOG=y
 # CONFIG_SCHED_DEBUG is not set
+CONFIG_DEBUG_SG=y
+CONFIG_DEBUG_NOTIFIERS=y
+CONFIG_BUG_ON_DATA_CORRUPTION=y
+CONFIG_DEBUG_CREDENTIALS=y
 # CONFIG_FTRACE is not set
 CONFIG_XMON=y
 # CONFIG_RUNTIME_TESTING_MENU is not set
-- 
2.21.1



[PATCH v2 08/10] powerpc/configs/skiroot: Disable xmon default & enable reboot on panic

2020-01-20 Thread Michael Ellerman
If the skiroot kernel crashes we don't want it sitting at an xmon
prompt forever. Instead it's more helpful to reboot and bring the
boot loader back up, and if the crash was transient we can then boot
successfully.

Similarly if we panic we should reboot, with a short timeout in case
someone is watching the console.

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/configs/skiroot_defconfig | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

v2: No change.

diff --git a/arch/powerpc/configs/skiroot_defconfig 
b/arch/powerpc/configs/skiroot_defconfig
index 93b478436a2b..28cfd68e8b16 100644
--- a/arch/powerpc/configs/skiroot_defconfig
+++ b/arch/powerpc/configs/skiroot_defconfig
@@ -29,6 +29,7 @@ CONFIG_ALTIVEC=y
 CONFIG_VSX=y
 CONFIG_NR_CPUS=2048
 CONFIG_CPU_LITTLE_ENDIAN=y
+CONFIG_PANIC_TIMEOUT=30
 # CONFIG_PPC_VAS is not set
 # CONFIG_PPC_PSERIES is not set
 # CONFIG_PPC_OF_BOOT_TRAMPOLINE is not set
@@ -293,6 +294,7 @@ CONFIG_LIBCRC32C=y
 CONFIG_PRINTK_TIME=y
 CONFIG_MAGIC_SYSRQ=y
 CONFIG_DEBUG_STACKOVERFLOW=y
+CONFIG_PANIC_ON_OOPS=y
 CONFIG_SOFTLOCKUP_DETECTOR=y
 CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC=y
 CONFIG_HARDLOCKUP_DETECTOR=y
@@ -301,5 +303,4 @@ CONFIG_WQ_WATCHDOG=y
 # CONFIG_SCHED_DEBUG is not set
 # CONFIG_FTRACE is not set
 CONFIG_XMON=y
-CONFIG_XMON_DEFAULT=y
 # CONFIG_RUNTIME_TESTING_MENU is not set
-- 
2.21.1



[PATCH v2 07/10] powerpc/configs/skiroot: Enable security features

2020-01-20 Thread Michael Ellerman
From: Joel Stanley 

This turns on HARDENED_USERCOPY with HARDENED_USERCOPY_PAGESPAN, and
FORTIFY_SOURCE.

It also enables SECURITY_LOCKDOWN_LSM with _EARLY and
LOCK_DOWN_KERNEL_FORCE_INTEGRITY options enabled. This still allows
xmon to be used in read-only mode.

MODULE_SIG is selected by lockdown, so it is still enabled.

Signed-off-by: Joel Stanley 
[mpe: Switch to lockdown integrity mode per oohal]
Signed-off-by: Michael Ellerman 
---
 arch/powerpc/configs/skiroot_defconfig | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

v2: Switch to lockdown integrity mode rather than confidentiality as noticed by
dja and discussed with jms and oohal.

diff --git a/arch/powerpc/configs/skiroot_defconfig 
b/arch/powerpc/configs/skiroot_defconfig
index 24a210fe0049..93b478436a2b 100644
--- a/arch/powerpc/configs/skiroot_defconfig
+++ b/arch/powerpc/configs/skiroot_defconfig
@@ -49,7 +49,6 @@ CONFIG_JUMP_LABEL=y
 CONFIG_STRICT_KERNEL_RWX=y
 CONFIG_MODULES=y
 CONFIG_MODULE_UNLOAD=y
-CONFIG_MODULE_SIG=y
 CONFIG_MODULE_SIG_FORCE=y
 CONFIG_MODULE_SIG_SHA512=y
 CONFIG_PARTITION_ADVANCED=y
@@ -272,6 +271,16 @@ CONFIG_NLS_ASCII=y
 CONFIG_NLS_ISO8859_1=y
 CONFIG_NLS_UTF8=y
 CONFIG_ENCRYPTED_KEYS=y
+CONFIG_SECURITY=y
+CONFIG_HARDENED_USERCOPY=y
+# CONFIG_HARDENED_USERCOPY_FALLBACK is not set
+CONFIG_HARDENED_USERCOPY_PAGESPAN=y
+CONFIG_FORTIFY_SOURCE=y
+CONFIG_SECURITY_LOCKDOWN_LSM=y
+CONFIG_SECURITY_LOCKDOWN_LSM_EARLY=y
+CONFIG_LOCK_DOWN_KERNEL_FORCE_INTEGRITY=y
+# CONFIG_INTEGRITY is not set
+CONFIG_LSM="yama,loadpin,safesetid,integrity"
 # CONFIG_CRYPTO_HW is not set
 CONFIG_CRC16=y
 CONFIG_CRC_ITU_T=y
-- 
2.21.1



[PATCH v2 06/10] powerpc/configs/skiroot: Update for symbol movement only

2020-01-20 Thread Michael Ellerman
Signed-off-by: Michael Ellerman 
---
 arch/powerpc/configs/skiroot_defconfig | 42 +-
 1 file changed, 21 insertions(+), 21 deletions(-)

v2: No change.

diff --git a/arch/powerpc/configs/skiroot_defconfig 
b/arch/powerpc/configs/skiroot_defconfig
index 0aa060eef06c..24a210fe0049 100644
--- a/arch/powerpc/configs/skiroot_defconfig
+++ b/arch/powerpc/configs/skiroot_defconfig
@@ -1,8 +1,3 @@
-CONFIG_PPC64=y
-CONFIG_ALTIVEC=y
-CONFIG_VSX=y
-CONFIG_NR_CPUS=2048
-CONFIG_CPU_LITTLE_ENDIAN=y
 CONFIG_KERNEL_XZ=y
 # CONFIG_SWAP is not set
 CONFIG_SYSVIPC=y
@@ -29,16 +24,11 @@ CONFIG_EXPERT=y
 CONFIG_PERF_EVENTS=y
 # CONFIG_COMPAT_BRK is not set
 CONFIG_SLAB_FREELIST_HARDENED=y
-CONFIG_JUMP_LABEL=y
-CONFIG_STRICT_KERNEL_RWX=y
-CONFIG_MODULES=y
-CONFIG_MODULE_UNLOAD=y
-CONFIG_MODULE_SIG=y
-CONFIG_MODULE_SIG_FORCE=y
-CONFIG_MODULE_SIG_SHA512=y
-CONFIG_PARTITION_ADVANCED=y
-# CONFIG_MQ_IOSCHED_DEADLINE is not set
-# CONFIG_MQ_IOSCHED_KYBER is not set
+CONFIG_PPC64=y
+CONFIG_ALTIVEC=y
+CONFIG_VSX=y
+CONFIG_NR_CPUS=2048
+CONFIG_CPU_LITTLE_ENDIAN=y
 # CONFIG_PPC_VAS is not set
 # CONFIG_PPC_PSERIES is not set
 # CONFIG_PPC_OF_BOOT_TRAMPOLINE is not set
@@ -49,14 +39,24 @@ CONFIG_KEXEC=y
 CONFIG_PRESERVE_FA_DUMP=y
 CONFIG_IRQ_ALL_CPUS=y
 CONFIG_NUMA=y
-# CONFIG_COMPACTION is not set
-# CONFIG_MIGRATION is not set
 CONFIG_PPC_64K_PAGES=y
 CONFIG_SCHED_SMT=y
 CONFIG_CMDLINE_BOOL=y
 CONFIG_CMDLINE="console=tty0 console=hvc0 ipr.fast_reboot=1 quiet"
 # CONFIG_SECCOMP is not set
 # CONFIG_PPC_MEM_KEYS is not set
+CONFIG_JUMP_LABEL=y
+CONFIG_STRICT_KERNEL_RWX=y
+CONFIG_MODULES=y
+CONFIG_MODULE_UNLOAD=y
+CONFIG_MODULE_SIG=y
+CONFIG_MODULE_SIG_FORCE=y
+CONFIG_MODULE_SIG_SHA512=y
+CONFIG_PARTITION_ADVANCED=y
+# CONFIG_MQ_IOSCHED_DEADLINE is not set
+# CONFIG_MQ_IOSCHED_KYBER is not set
+# CONFIG_COMPACTION is not set
+# CONFIG_MIGRATION is not set
 CONFIG_NET=y
 CONFIG_PACKET=y
 CONFIG_UNIX=y
@@ -153,7 +153,6 @@ CONFIG_IGB=m
 CONFIG_IXGB=m
 CONFIG_IXGBE=m
 CONFIG_I40E=m
-CONFIG_S2IO=m
 # CONFIG_NET_VENDOR_MARVELL is not set
 CONFIG_MLX4_EN=m
 # CONFIG_MLX4_CORE_GEN2 is not set
@@ -164,6 +163,7 @@ CONFIG_MLX5_CORE_EN=y
 # CONFIG_NET_VENDOR_MICROSEMI is not set
 CONFIG_MYRI10GE=m
 # CONFIG_NET_VENDOR_NATSEMI is not set
+CONFIG_S2IO=m
 # CONFIG_NET_VENDOR_NETRONOME is not set
 # CONFIG_NET_VENDOR_NI is not set
 # CONFIG_NET_VENDOR_NVIDIA is not set
@@ -271,6 +271,8 @@ CONFIG_NLS_CODEPAGE_437=y
 CONFIG_NLS_ASCII=y
 CONFIG_NLS_ISO8859_1=y
 CONFIG_NLS_UTF8=y
+CONFIG_ENCRYPTED_KEYS=y
+# CONFIG_CRYPTO_HW is not set
 CONFIG_CRC16=y
 CONFIG_CRC_ITU_T=y
 CONFIG_LIBCRC32C=y
@@ -289,8 +291,6 @@ CONFIG_BOOTPARAM_HARDLOCKUP_PANIC=y
 CONFIG_WQ_WATCHDOG=y
 # CONFIG_SCHED_DEBUG is not set
 # CONFIG_FTRACE is not set
-# CONFIG_RUNTIME_TESTING_MENU is not set
 CONFIG_XMON=y
 CONFIG_XMON_DEFAULT=y
-CONFIG_ENCRYPTED_KEYS=y
-# CONFIG_CRYPTO_HW is not set
+# CONFIG_RUNTIME_TESTING_MENU is not set
-- 
2.21.1



[PATCH v2 05/10] powerpc/configs/skiroot: Drop default n CONFIG_CRYPTO_ECHAINIV

2020-01-20 Thread Michael Ellerman
It's default n so we don't need to disable it.

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/configs/skiroot_defconfig | 1 -
 1 file changed, 1 deletion(-)

v2: No change.

diff --git a/arch/powerpc/configs/skiroot_defconfig 
b/arch/powerpc/configs/skiroot_defconfig
index 74cffb854c0f..0aa060eef06c 100644
--- a/arch/powerpc/configs/skiroot_defconfig
+++ b/arch/powerpc/configs/skiroot_defconfig
@@ -293,5 +293,4 @@ CONFIG_WQ_WATCHDOG=y
 CONFIG_XMON=y
 CONFIG_XMON_DEFAULT=y
 CONFIG_ENCRYPTED_KEYS=y
-# CONFIG_CRYPTO_ECHAINIV is not set
 # CONFIG_CRYPTO_HW is not set
-- 
2.21.1



[PATCH v2 04/10] powerpc/configs/skiroot: Drop HID_LOGITECH

2020-01-20 Thread Michael Ellerman
Commit bdd08fff4915 ("HID: logitech: Add depends on LEDS_CLASS to
Logitech Kconfig entry") made HID_LOGITECH depend on LEDS_CLASS which
we do not enable, meaning we are not actually enabling those drivers
any more.

The Kconfig help text suggests USB HID compliant Logitech devices
will continue to work without HID_LOGITECH, so just drop it.

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/configs/skiroot_defconfig | 1 -
 1 file changed, 1 deletion(-)

v2: No change.

diff --git a/arch/powerpc/configs/skiroot_defconfig 
b/arch/powerpc/configs/skiroot_defconfig
index 3eee39c50941..74cffb854c0f 100644
--- a/arch/powerpc/configs/skiroot_defconfig
+++ b/arch/powerpc/configs/skiroot_defconfig
@@ -235,7 +235,6 @@ CONFIG_HID_CYPRESS=y
 CONFIG_HID_EZKEY=y
 CONFIG_HID_ITE=y
 CONFIG_HID_KENSINGTON=y
-CONFIG_HID_LOGITECH=y
 CONFIG_HID_MICROSOFT=y
 CONFIG_HID_MONTEREY=y
 CONFIG_USB_HIDDEV=y
-- 
2.21.1



[PATCH v2 03/10] powerpc/configs: Drop NET_VENDOR_HP which moved to staging

2020-01-20 Thread Michael Ellerman
The HP network driver moved to staging in commit 52340b82cf1a ("hp100:
Move 100BaseVG AnyLAN driver to staging") meaning we don't need to
disable it any more in our defconfigs.

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/configs/44x/akebono_defconfig | 1 -
 arch/powerpc/configs/skiroot_defconfig | 1 -
 2 files changed, 2 deletions(-)

v2: No change.

diff --git a/arch/powerpc/configs/44x/akebono_defconfig 
b/arch/powerpc/configs/44x/akebono_defconfig
index f0c8a07cc274..7705a5c3f4ea 100644
--- a/arch/powerpc/configs/44x/akebono_defconfig
+++ b/arch/powerpc/configs/44x/akebono_defconfig
@@ -59,7 +59,6 @@ CONFIG_BLK_DEV_SD=y
 # CONFIG_NET_VENDOR_DLINK is not set
 # CONFIG_NET_VENDOR_EMULEX is not set
 # CONFIG_NET_VENDOR_EXAR is not set
-# CONFIG_NET_VENDOR_HP is not set
 CONFIG_IBM_EMAC=y
 # CONFIG_NET_VENDOR_MARVELL is not set
 # CONFIG_NET_VENDOR_MELLANOX is not set
diff --git a/arch/powerpc/configs/skiroot_defconfig 
b/arch/powerpc/configs/skiroot_defconfig
index eaaffe9ae8b9..3eee39c50941 100644
--- a/arch/powerpc/configs/skiroot_defconfig
+++ b/arch/powerpc/configs/skiroot_defconfig
@@ -146,7 +146,6 @@ CONFIG_CHELSIO_T1=m
 # CONFIG_NET_VENDOR_DLINK is not set
 CONFIG_BE2NET=m
 # CONFIG_NET_VENDOR_EZCHIP is not set
-# CONFIG_NET_VENDOR_HP is not set
 # CONFIG_NET_VENDOR_HUAWEI is not set
 CONFIG_E1000=m
 CONFIG_E1000E=m
-- 
2.21.1



[PATCH v2 01/10] powerpc/configs: Drop CONFIG_QLGE which moved to staging

2020-01-20 Thread Michael Ellerman
The QLGE driver moved to staging in commit 955315b0dc8c ("qlge: Move
drivers/net/ethernet/qlogic/qlge/ to drivers/staging/qlge/"), meaning
our defconfigs that enable it have no effect as we don't enable
CONFIG_STAGING.

It sounds like the device is obsolete, so drop the driver.

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/configs/powernv_defconfig | 1 -
 arch/powerpc/configs/ppc64_defconfig   | 1 -
 arch/powerpc/configs/ppc6xx_defconfig  | 1 -
 arch/powerpc/configs/pseries_defconfig | 1 -
 arch/powerpc/configs/skiroot_defconfig | 1 -
 5 files changed, 5 deletions(-)

v2: No change.

diff --git a/arch/powerpc/configs/powernv_defconfig 
b/arch/powerpc/configs/powernv_defconfig
index 32841456a573..71749377d164 100644
--- a/arch/powerpc/configs/powernv_defconfig
+++ b/arch/powerpc/configs/powernv_defconfig
@@ -181,7 +181,6 @@ CONFIG_MLX5_FPGA=y
 CONFIG_MLX5_CORE_EN=y
 CONFIG_MLX5_CORE_IPOIB=y
 CONFIG_MYRI10GE=m
-CONFIG_QLGE=m
 CONFIG_NETXEN_NIC=m
 CONFIG_USB_NET_DRIVERS=m
 # CONFIG_WLAN is not set
diff --git a/arch/powerpc/configs/ppc64_defconfig 
b/arch/powerpc/configs/ppc64_defconfig
index b250e6f5a7ca..7e68cb222c7b 100644
--- a/arch/powerpc/configs/ppc64_defconfig
+++ b/arch/powerpc/configs/ppc64_defconfig
@@ -189,7 +189,6 @@ CONFIG_MLX4_EN=m
 CONFIG_MYRI10GE=m
 CONFIG_S2IO=m
 CONFIG_PASEMI_MAC=y
-CONFIG_QLGE=m
 CONFIG_NETXEN_NIC=m
 CONFIG_SUNGEM=y
 CONFIG_GELIC_NET=m
diff --git a/arch/powerpc/configs/ppc6xx_defconfig b/arch/powerpc/configs/ppc6xx_defconfig
index 7e28919041cf..3e2f44f38ac5 100644
--- a/arch/powerpc/configs/ppc6xx_defconfig
+++ b/arch/powerpc/configs/ppc6xx_defconfig
@@ -507,7 +507,6 @@ CONFIG_FORCEDETH=m
 CONFIG_HAMACHI=m
 CONFIG_YELLOWFIN=m
 CONFIG_QLA3XXX=m
-CONFIG_QLGE=m
 CONFIG_NETXEN_NIC=m
 CONFIG_8139CP=m
 CONFIG_8139TOO=m
diff --git a/arch/powerpc/configs/pseries_defconfig b/arch/powerpc/configs/pseries_defconfig
index 26126b4d4de3..6b68109e248f 100644
--- a/arch/powerpc/configs/pseries_defconfig
+++ b/arch/powerpc/configs/pseries_defconfig
@@ -169,7 +169,6 @@ CONFIG_IXGBE=m
 CONFIG_I40E=m
 CONFIG_MLX4_EN=m
 CONFIG_MYRI10GE=m
-CONFIG_QLGE=m
 CONFIG_NETXEN_NIC=m
 CONFIG_PPP=m
 CONFIG_PPP_BSDCOMP=m
diff --git a/arch/powerpc/configs/skiroot_defconfig b/arch/powerpc/configs/skiroot_defconfig
index 069f67f12731..7ff1ff1ddc28 100644
--- a/arch/powerpc/configs/skiroot_defconfig
+++ b/arch/powerpc/configs/skiroot_defconfig
@@ -171,7 +171,6 @@ CONFIG_MYRI10GE=m
 # CONFIG_NET_VENDOR_NVIDIA is not set
 # CONFIG_NET_VENDOR_OKI is not set
 # CONFIG_NET_VENDOR_PACKET_ENGINES is not set
-CONFIG_QLGE=m
 CONFIG_NETXEN_NIC=m
 CONFIG_QED=m
 CONFIG_QEDE=m
-- 
2.21.1



[PATCH v2 02/10] powerpc/configs: NET_CADENCE became NET_VENDOR_CADENCE

2020-01-20 Thread Michael Ellerman
The NET_CADENCE symbol was renamed to NET_VENDOR_CADENCE, so we don't
need to disable the former, see commit 0df5f81c481e ("net: ethernet:
Add missing VENDOR to Cadence and Packet Engines symbols").

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/configs/skiroot_defconfig | 1 -
 1 file changed, 1 deletion(-)

v2: No change.

diff --git a/arch/powerpc/configs/skiroot_defconfig b/arch/powerpc/configs/skiroot_defconfig
index 7ff1ff1ddc28..eaaffe9ae8b9 100644
--- a/arch/powerpc/configs/skiroot_defconfig
+++ b/arch/powerpc/configs/skiroot_defconfig
@@ -138,7 +138,6 @@ CONFIG_TIGON3=m
 CONFIG_BNX2X=m
 # CONFIG_NET_VENDOR_BROCADE is not set
 # CONFIG_NET_VENDOR_CADENCE is not set
-# CONFIG_NET_CADENCE is not set
 # CONFIG_NET_VENDOR_CAVIUM is not set
 CONFIG_CHELSIO_T1=m
 # CONFIG_NET_VENDOR_CISCO is not set
-- 
2.21.1



Re: [RFC PATCH 9/9] powerpc/configs/skiroot: Enable some more hardening options

2020-01-20 Thread Michael Ellerman
Joel Stanley  writes:
> On Thu, 16 Jan 2020 at 01:48, Michael Ellerman  wrote:
>>
>> Enable more hardening options.
>>
>> Note BUG_ON_DATA_CORRUPTION selects DEBUG_LIST and is essentially just
>> a synonym for it.
>>
>> DEBUG_SG, DEBUG_NOTIFIERS, DEBUG_LIST, DEBUG_CREDENTIALS and
>> SCHED_STACK_END_CHECK should all be low overhead and just add a few
>> extra checks.
>>
>> Unselecting SLAB_MERGE_DEFAULT causes the SLAB to use more memory, but
>> the skiroot kernel shouldn't be memory constrained on any of our
>> systems, all it does is run a small bootloader.
>
> Why do we unselect it?

The help text pretty much explains it:

config SLAB_MERGE_DEFAULT
bool "Allow slab caches to be merged"
default y
help
  For reduced kernel memory fragmentation, slab caches can be
  merged when they share the same size and other characteristics.
  This carries a risk of kernel heap overflows being able to
  overwrite objects from merged caches (and more easily control
  cache layout), which makes such heap attacks easier to exploit
  by attackers. By keeping caches unmerged, these kinds of exploits
  can usually only damage objects in the same cache. To disable
  merging at runtime, "slab_nomerge" can be passed on the kernel
  command line.


So unselecting it uses a bit more memory but has some
security/robustness benefit.

I should probably also mention that it essentially has no effect because
we're also enabling SLUB_DEBUG_ON, and that causes some of the flags in
SLAB_NEVER_MERGE to be set, which also disables merging.
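
As a rough illustration of the merge decision described above, here is a
minimal userspace sketch. The struct, the flag bits and model_can_merge() are
hypothetical names made up for this example; they are not the kernel's
definitions, and the real rules check more than size and flags.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits for illustration; not the kernel's values. */
#define MODEL_DEBUG_POISON   (1u << 0)
#define MODEL_DEBUG_REDZONE  (1u << 1)
#define MODEL_NEVER_MERGE    (MODEL_DEBUG_POISON | MODEL_DEBUG_REDZONE)

/* Hypothetical cache descriptor: only what the merge check looks at. */
struct model_cache {
	size_t object_size;
	unsigned int flags;
};

/*
 * Decide whether two caches may share backing storage.
 * merge_default models SLAB_MERGE_DEFAULT=y without "slab_nomerge".
 */
static bool model_can_merge(const struct model_cache *a,
			    const struct model_cache *b,
			    bool merge_default)
{
	if (!merge_default)
		return false;	/* merging switched off globally */
	if ((a->flags | b->flags) & MODEL_NEVER_MERGE)
		return false;	/* a debug flag forbids merging */
	return a->object_size == b->object_size && a->flags == b->flags;
}
```

In this model, either turning off the default or setting one of the debug
flags is enough to keep two same-sized caches apart, which matches the point
that enabling SLUB_DEBUG_ON makes the SLAB_MERGE_DEFAULT setting mostly moot.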

cheers


Re: [PATCH] powerpc/pseries/vio: Fix iommu_table use-after-free refcount warning

2020-01-20 Thread Alexey Kardashevskiy



On 21/01/2020 09:10, Tyrel Datwyler wrote:
> From: Tyrel Datwyler 
> 
> Commit e5afdf9dd515 ("powerpc/vfio_spapr_tce: Add reference counting to
> iommu_table") missed an iommu_table allocation in the pseries vio code.
> The iommu_table is allocated with kzalloc and as a result the associated
> kref gets a value of zero. This has the side effect that during a DLPAR
> remove of the associated virtual IOA the iommu_tce_table_put() triggers
> a use-after-free underflow warning.
> 
> Call Trace:
> [c002879e39f0] [c071ecb4] refcount_warn_saturate+0x184/0x190
> (unreliable)
> [c002879e3a50] [c00500ac] iommu_tce_table_put+0x9c/0xb0
> [c002879e3a70] [c00f54e4] vio_dev_release+0x34/0x70
> [c002879e3aa0] [c087cfa4] device_release+0x54/0xf0
> [c002879e3b10] [c0d64c84] kobject_cleanup+0xa4/0x240
> [c002879e3b90] [c087d358] put_device+0x28/0x40
> [c002879e3bb0] [c07a328c] dlpar_remove_slot+0x15c/0x250
> [c002879e3c50] [c07a348c] remove_slot_store+0xac/0xf0
> [c002879e3cd0] [c0d64220] kobj_attr_store+0x30/0x60
> [c002879e3cf0] [c04ff13c] sysfs_kf_write+0x6c/0xa0
> [c002879e3d10] [c04fde4c] kernfs_fop_write+0x18c/0x260
> [c002879e3d60] [c0410f3c] __vfs_write+0x3c/0x70
> [c002879e3d80] [c0415408] vfs_write+0xc8/0x250
> [c002879e3dd0] [c04157dc] ksys_write+0x7c/0x120
> [c002879e3e20] [c000b278] system_call+0x5c/0x68
> 
> Further, since the refcount was always zero the iommu_tce_table_put()
> fails to call the iommu_table release function resulting in a leak.
> 
> Fix this issue by initializing the iommu_table kref immediately after
> allocation.
> 
> Fixes: e5afdf9dd515 ("powerpc/vfio_spapr_tce: Add reference counting to iommu_table")
> Signed-off-by: Tyrel Datwyler 



Reviewed-by: Alexey Kardashevskiy 




> ---
>  arch/powerpc/platforms/pseries/vio.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/arch/powerpc/platforms/pseries/vio.c b/arch/powerpc/platforms/pseries/vio.c
> index 79e2287..f682b7b 100644
> --- a/arch/powerpc/platforms/pseries/vio.c
> +++ b/arch/powerpc/platforms/pseries/vio.c
> @@ -1176,6 +1176,8 @@ static struct iommu_table *vio_build_iommu_table(struct vio_dev *dev)
>   if (tbl == NULL)
>   return NULL;
>  
> + kref_init(&tbl->it_kref);
> +
>   of_parse_dma_window(dev->dev.of_node, dma_window,
>   &tbl->it_index, &offset, &size);
>  
> 

-- 
Alexey


Re: [PATCH] powerpc/Kconfig: Make FSL_85XX_CACHE_SRAM configurable

2020-01-20 Thread Scott Wood
On Mon, 2020-01-20 at 06:43 -0800, wangwenhu wrote:
> From: wangwenhu 
> 
> When generating .config file with menuconfig on Freescale BOOKE
> SOC, FSL_85XX_CACHE_SRAM is not configurable for the lack of
> description in the Kconfig field, which makes it impossible
> to support L2Cache-Sram driver. Add a description to make it
> configurable.
> 
> Signed-off-by: wangwenhu 

The intent was that drivers using the SRAM API would select the symbol.  What
is the use case for selecting it manually?

Since this code was added almost ten years ago and there are still no (in-
tree?) users of the API, we should just remove the sram code (unless this
prods someone to submit such a user very soon).

-Scott




Re: [PATCH -next] powerpc/maple: fix comparing pointer to 0

2020-01-20 Thread Joe Perches
On Tue, 2020-01-21 at 09:31 +0800, Chen Zhou wrote:
> Fixes coccicheck warning:
> ./arch/powerpc/platforms/maple/setup.c:232:15-16:
>   WARNING comparing pointer to 0

Does anyone have or use these powerpc maple boards anymore?

Maybe the whole codebase should just be deleted instead.

If not, setup.c has an unused DBG macro that could be removed too.
---
 arch/powerpc/platforms/maple/setup.c | 6 --
 1 file changed, 6 deletions(-)

diff --git a/arch/powerpc/platforms/maple/setup.c b/arch/powerpc/platforms/maple/setup.c
index 47f7310..d6a083c 100644
--- a/arch/powerpc/platforms/maple/setup.c
+++ b/arch/powerpc/platforms/maple/setup.c
@@ -57,12 +57,6 @@
 
 #include "maple.h"
 
-#ifdef DEBUG
-#define DBG(fmt...) udbg_printf(fmt)
-#else
-#define DBG(fmt...)
-#endif
-
 static unsigned long maple_find_nvram_base(void)
 {
struct device_node *rtcs;




[PATCH -next] powerpc/maple: fix comparing pointer to 0

2020-01-20 Thread Chen Zhou
Fixes coccicheck warning:
./arch/powerpc/platforms/maple/setup.c:232:15-16:
WARNING comparing pointer to 0

Compare pointer-typed values to NULL rather than 0.

Signed-off-by: Chen Zhou 
---
 arch/powerpc/platforms/maple/setup.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/maple/setup.c b/arch/powerpc/platforms/maple/setup.c
index 47f7310..00a0780 100644
--- a/arch/powerpc/platforms/maple/setup.c
+++ b/arch/powerpc/platforms/maple/setup.c
@@ -229,7 +229,7 @@ static void __init maple_init_IRQ(void)
root = of_find_node_by_path("/");
naddr = of_n_addr_cells(root);
opprop = of_get_property(root, "platform-open-pic", &opplen);
-   if (opprop != 0) {
+   if (opprop) {
openpic_addr = of_read_number(opprop, naddr);
has_isus = (opplen > naddr);
printk(KERN_DEBUG "OpenPIC addr: %lx, has ISUs: %d\n",
-- 
2.7.4



[Bug 205099] KASAN hit at raid6_pq: BUG: Unable to handle kernel data access at 0x00f0fd0d

2020-01-20 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=205099

--- Comment #18 from Erhard F. (erhar...@mailbox.org) ---
(In reply to Christophe Leroy from comment #17)
> Created attachment 286907 [details]
> Patch to fix kasan with KASAN_VMALLOC and VMAP_STACK
> 
> Please try the attached patch, it fixes the setup of the kasan early hash
> table  when VMAP_STACK is enabled.
Sorry, but still the same situation with this patch applied.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[PATCH] powerpc/pseries/vio: Fix iommu_table use-after-free refcount warning

2020-01-20 Thread Tyrel Datwyler
From: Tyrel Datwyler 

Commit e5afdf9dd515 ("powerpc/vfio_spapr_tce: Add reference counting to
iommu_table") missed an iommu_table allocation in the pseries vio code.
The iommu_table is allocated with kzalloc and as a result the associated
kref gets a value of zero. This has the side effect that during a DLPAR
remove of the associated virtual IOA the iommu_tce_table_put() triggers
a use-after-free underflow warning.

Call Trace:
[c002879e39f0] [c071ecb4] refcount_warn_saturate+0x184/0x190
(unreliable)
[c002879e3a50] [c00500ac] iommu_tce_table_put+0x9c/0xb0
[c002879e3a70] [c00f54e4] vio_dev_release+0x34/0x70
[c002879e3aa0] [c087cfa4] device_release+0x54/0xf0
[c002879e3b10] [c0d64c84] kobject_cleanup+0xa4/0x240
[c002879e3b90] [c087d358] put_device+0x28/0x40
[c002879e3bb0] [c07a328c] dlpar_remove_slot+0x15c/0x250
[c002879e3c50] [c07a348c] remove_slot_store+0xac/0xf0
[c002879e3cd0] [c0d64220] kobj_attr_store+0x30/0x60
[c002879e3cf0] [c04ff13c] sysfs_kf_write+0x6c/0xa0
[c002879e3d10] [c04fde4c] kernfs_fop_write+0x18c/0x260
[c002879e3d60] [c0410f3c] __vfs_write+0x3c/0x70
[c002879e3d80] [c0415408] vfs_write+0xc8/0x250
[c002879e3dd0] [c04157dc] ksys_write+0x7c/0x120
[c002879e3e20] [c000b278] system_call+0x5c/0x68

Further, since the refcount was always zero the iommu_tce_table_put()
fails to call the iommu_table release function resulting in a leak.

Fix this issue by initializing the iommu_table kref immediately after
allocation.

Fixes: e5afdf9dd515 ("powerpc/vfio_spapr_tce: Add reference counting to iommu_table")
Signed-off-by: Tyrel Datwyler 
---
 arch/powerpc/platforms/pseries/vio.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/platforms/pseries/vio.c b/arch/powerpc/platforms/pseries/vio.c
index 79e2287..f682b7b 100644
--- a/arch/powerpc/platforms/pseries/vio.c
+++ b/arch/powerpc/platforms/pseries/vio.c
@@ -1176,6 +1176,8 @@ static struct iommu_table *vio_build_iommu_table(struct vio_dev *dev)
if (tbl == NULL)
return NULL;
 
+   kref_init(&tbl->it_kref);
+
of_parse_dma_window(dev->dev.of_node, dma_window,
&tbl->it_index, &offset, &size);
 
-- 
1.8.3.1
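
To make the failure mode in this commit message easier to follow, here is a
small userspace model of a reference count embedded in a zero-filled
allocation. All model_* names are hypothetical and calloc() merely stands in
for kzalloc(); this is a sketch of the behaviour described above, not the
kernel's kref or refcount_t implementation.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel's struct kref; not the kernel code. */
struct model_kref {
	unsigned int refcount;
};

/* Hypothetical table type; only the embedded reference count matters. */
struct model_table {
	struct model_kref it_kref;
	bool released;
};

/* Counterpart of kref_init(): a new object starts with one reference. */
static void model_kref_init(struct model_kref *k)
{
	k->refcount = 1;
}

/*
 * Counterpart of a put operation: returns 1 if this call dropped the last
 * reference and ran the release step, 0 if references remain, and -1 if
 * the count was already zero (the case the kernel warns about).
 */
static int model_table_put(struct model_table *t)
{
	if (t->it_kref.refcount == 0)
		return -1;		/* underflow: release never runs */
	if (--t->it_kref.refcount == 0) {
		t->released = true;	/* stands in for the release callback */
		return 1;
	}
	return 0;
}

/* calloc() zero-fills like kzalloc(), so the count starts at 0. */
static struct model_table *model_table_alloc(bool init_kref)
{
	struct model_table *t = calloc(1, sizeof(*t));

	if (t && init_kref)
		model_kref_init(&t->it_kref);
	return t;
}
```

With init_kref false the first put reports an underflow and the release step
is skipped, which corresponds to both the warning and the leak mentioned
above. With init_kref true the same put releases the table.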



[PATCH] powerpc/Kconfig: Make FSL_85XX_CACHE_SRAM configurable

2020-01-20 Thread wangwenhu
From: wangwenhu 

When generating .config file with menuconfig on Freescale BOOKE
SOC, FSL_85XX_CACHE_SRAM is not configurable for the lack of
description in the Kconfig field, which makes it impossible
to support L2Cache-Sram driver. Add a description to make it
configurable.

Signed-off-by: wangwenhu 
---
 arch/powerpc/platforms/85xx/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/85xx/Kconfig b/arch/powerpc/platforms/85xx/Kconfig
index fa3d29dcb57e..ee5ba10b98cb 100644
--- a/arch/powerpc/platforms/85xx/Kconfig
+++ b/arch/powerpc/platforms/85xx/Kconfig
@@ -17,7 +17,7 @@ if FSL_SOC_BOOKE
 if PPC32

 config FSL_85XX_CACHE_SRAM
-   bool
+   bool "Freescale Cache-Sram"
select PPC_LIB_RHEAP
help
  When selected, this option enables cache-sram support
--
2.23.0



Re: [PATCH v2 00/10] Impveovements for random.h/archrandom.h

2020-01-20 Thread Borislav Petkov
On Mon, Jan 20, 2020 at 05:26:27PM +, Mark Brown wrote:
> I think the important thing here is that *someone* takes the patches.
> We've now got Ted and Borislav both saying they're OK applying the
> patches, an additional proposal that Andrew takes the patches, nobody
> saying anything negative about applying the patches and yet the patches
> are not applied.  The random tree sounds like a sensible enough tree to
> take this so if Ted picks them up perhaps that's most sensible?

Yes, Ted, pls pick them up so that we're done with this.

Thx.

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette


Re: [RFC PATCH v4 00/11] powerpc: switch VDSO to C implementation.

2020-01-20 Thread Segher Boessenkool
On Mon, Jan 20, 2020 at 06:08:23PM +0100, Christophe Leroy wrote:
> Not easy I think.
> 
> First we have the unavoidable ASM entry function that can't be dropped 
> because of the CR[SO] bit to set on error or clear on no error and that 
> can't be done in C.

Yup.

> In our ASM VDSO, fixed shifts are used, while in generic C VDSO, shifts 
> are generic and read from the VDSO data.

Does that cost more than just a few cycles?

> And there is still some funny code generated by GCC (8.1), like:
> 
>  620: 7d 29 3c 30 srw r9,r9,r7
>  624: 21 87 00 20 subfic  r12,r7,32
>  628: 7d 07 3c 31 srw. r7,r8,r7
>  62c: 7d 08 60 30 slw r8,r8,r12
>  630: 7d 0b 4b 78 or  r11,r8,r9

(This can be done cheaper for fixed shifts, you can use rlwimi then).

>  634: 39 40 00 00 li  r10,0
>  638: 40 82 00 84 bne 6bc <__c_kernel_clock_gettime+0x114>
>  63c: 81 23 00 24 lwz r9,36(r3)
>  640: 81 05 00 00 lwz r8,0(r5)
> ...
>  6bc: 7d 69 5b 78 mr  r9,r11
>  6c0: 7c ea 3b 78 mr  r10,r7
>  6c4: 7d 2b 4b 78 mr  r11,r9
>  6c8: 4b ff ff 74 b   63c <__c_kernel_clock_gettime+0x94>
> 
> This branch to 6bc is totally useless:
> - copying r11 into r9 is pointless as r9 is overwritten in 63c
> - copying back r9 into r11 is pointless as r11 has not been modified 
> in between.

Yeah, huh, how did that happen.

> - loading r10 with 0 then overwriting r10 with r7 when r7 is not 0 is 
> pointless as well, could have directly put the result of srw. in r10.

This may be harder to make the compiler do.

But the r9/r11 thing suggests you are preventing optimisation somewhere,
maybe with some asm?  Do you have some small testcase I can compile?


Segher


Re: [PATCH v2 00/10] Impveovements for random.h/archrandom.h

2020-01-20 Thread Mark Brown
On Fri, Jan 10, 2020 at 12:05:59PM -0500, Theodore Y. Ts'o wrote:
> On Fri, Jan 10, 2020 at 04:51:53PM +0100, Borislav Petkov wrote:
> > On Fri, Jan 10, 2020 at 02:54:12PM +, Mark Brown wrote:

> > > This is a resend of a series from Richard Henderson last posted back in
> > > November:

> > >
> > > https://lore.kernel.org/linux-arm-kernel/20191106141308.30535-1-...@twiddle.net/

> > > Back then Borislav said they looked good and asked if he should take
> > > them through the tip tree but things seem to have got lost since then.

> > Or, alternatively, akpm could take them. In any case, if someone else
> > ends up doing that, for the x86 bits:

> Or I can take them through the random.git tree, since we have a lot of
> changes this cycle going to Linus anyway.  Any objections?

I think the important thing here is that *someone* takes the patches.
We've now got Ted and Borislav both saying they're OK applying the
patches, an additional proposal that Andrew takes the patches, nobody
saying anything negative about applying the patches and yet the patches
are not applied.  The random tree sounds like a sensible enough tree to
take this so if Ted picks them up perhaps that's most sensible?




Re: [RFC PATCH v4 00/11] powerpc: switch VDSO to C implementation.

2020-01-20 Thread Christophe Leroy




On 20/01/2020 at 16:19, Segher Boessenkool wrote:
> On Mon, Jan 20, 2020 at 02:56:00PM +, Christophe Leroy wrote:
>>> Nice!  Much better.
>>>
>>> It should be tested on more representative hardware, too, but this looks
>>> promising alright :-)
>>
>> mpc832x (e300c2 core) at 333 MHz:
>>
>> Before:
>>
>> gettimeofday:vdso: 235 nsec/call
>> clock-gettime-realtime:vdso: 244 nsec/call
>>
>> With the series:
>>
>> gettimeofday:vdso: 271 nsec/call
>> clock-gettime-realtime:vdso: 281 nsec/call
>
> Those are important, and degrade ~15%.  That is acceptable IMO, but do
> you see a way to optimise this (later)?


Not easy I think.

First we have the unavoidable ASM entry function that can't be dropped 
because of the CR[SO] bit to set on error or clear on no error and that 
can't be done in C.


In our ASM VDSO, fixed shifts are used, while in generic C VDSO, shifts 
are generic and read from the VDSO data.


And there is still some funny code generated by GCC (8.1), like:

 620:   7d 29 3c 30 srw r9,r9,r7
 624:   21 87 00 20 subfic  r12,r7,32
 628:   7d 07 3c 31 srw. r7,r8,r7
 62c:   7d 08 60 30 slw r8,r8,r12
 630:   7d 0b 4b 78 or  r11,r8,r9
 634:   39 40 00 00 li  r10,0
 638:   40 82 00 84 bne 6bc <__c_kernel_clock_gettime+0x114>
 63c:   81 23 00 24 lwz r9,36(r3)
 640:   81 05 00 00 lwz r8,0(r5)
...
 6bc:   7d 69 5b 78 mr  r9,r11
 6c0:   7c ea 3b 78 mr  r10,r7
 6c4:   7d 2b 4b 78 mr  r11,r9
 6c8:   4b ff ff 74 b   63c <__c_kernel_clock_gettime+0x94>

This branch to 6bc is totally useless:
- copying r11 into r9 is pointless as r9 is overwritten in 63c
- copying back r9 into r11 is pointless as r11 has not been modified 
in between.
- loading r10 with 0 then overwriting r10 with r7 when r7 is not 0 is 
pointless as well, could have directly put the result of srw. in r10.
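
For reference, the difference between a fixed and a run-time shift can be
sketched as below. The struct layout, the model_* names and the constant 24
are assumptions made for this illustration, not the actual vDSO data layout.
On a 32-bit target, a 64-bit right shift by a count only known at run time
needs several instructions, which looks like what the srw/subfic/slw/or
sequence above is doing.

```c
#include <stdint.h>

/* Hypothetical subset of the per-clock data a generic C vDSO reads. */
struct model_vdso_data {
	uint64_t cycle_last;	/* timebase value at the last update */
	uint64_t base_ns;	/* shifted nanoseconds at the last update */
	uint32_t mult;		/* cycles to shifted-ns multiplier */
	uint32_t shift;		/* run-time shift, read from the data page */
};

/* Generic form: the shift count is only known at run time. */
static uint64_t model_ns_generic(const struct model_vdso_data *vd,
				 uint64_t cycles)
{
	uint64_t ns = vd->base_ns + (cycles - vd->cycle_last) * vd->mult;

	return ns >> vd->shift;
}

/* Fixed form: the shift is a compile-time constant (value assumed here). */
#define MODEL_FIXED_SHIFT 24

static uint64_t model_ns_fixed(const struct model_vdso_data *vd,
			       uint64_t cycles)
{
	uint64_t ns = vd->base_ns + (cycles - vd->cycle_last) * vd->mult;

	return ns >> MODEL_FIXED_SHIFT;
}
```

Both functions give the same result whenever vd->shift equals the constant;
the fixed variant just lets the compiler pick a cheaper instruction sequence.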


Christophe


Re: [RFC PATCH v4 00/11] powerpc: switch VDSO to C implementation.

2020-01-20 Thread Segher Boessenkool
On Mon, Jan 20, 2020 at 02:56:00PM +, Christophe Leroy wrote:
> >Nice!  Much better.
> >
> >It should be tested on more representative hardware, too, but this looks
> >promising alright :-)
> 
> mpc832x (e300c2 core) at 333 MHz:
> 
> Before:
> 
> gettimeofday:vdso: 235 nsec/call
> clock-gettime-realtime:vdso: 244 nsec/call
> 
> With the series:
> 
> gettimeofday:vdso: 271 nsec/call
> clock-gettime-realtime:vdso: 281 nsec/call

Those are important, and degrade ~15%.  That is acceptable IMO, but do
you see a way to optimise this (later)?

Anyway, excellent results, thanks for your persistence!
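
The ~15% figure can be checked from the numbers quoted above with a small
helper. model_overhead_pct() is a hypothetical name used only for this
sketch, and it assumes the "after" time is not smaller than the "before"
time.

```c
/*
 * Relative slowdown in percent, rounded to the nearest integer.
 * Assumes after_ns >= before_ns and before_ns != 0.
 */
static unsigned int model_overhead_pct(unsigned int before_ns,
				       unsigned int after_ns)
{
	return ((after_ns - before_ns) * 100u + before_ns / 2) / before_ns;
}
```

Feeding it 235 and 271 (gettimeofday) or 244 and 281
(clock-gettime-realtime) gives 15 in both cases.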


Segher


Re: [RFC PATCH v4 00/11] powerpc: switch VDSO to C implementation.

2020-01-20 Thread Christophe Leroy

Hi

On 01/17/2020 08:58 AM, Segher Boessenkool wrote:

Hi!

On Thu, Jan 16, 2020 at 05:58:24PM +, Christophe Leroy wrote:

On a powerpc8xx, with current powerpc/32 ASM VDSO:

gettimeofday:vdso: 907 nsec/call
clock-getres-realtime:vdso: 484 nsec/call
clock-gettime-realtime:vdso: 899 nsec/call

The first patch adds VDSO generic C support without any changes to common code.
Performance is as follows:

gettimeofday:vdso: 1211 nsec/call
clock-getres-realtime:vdso: 722 nsec/call
clock-gettime-realtime:vdso: 1216 nsec/call

Then a few changes in the common code have allowed performance improvement. At
the end of the series we have:

gettimeofday:vdso: 974 nsec/call
clock-getres-realtime:vdso: 545 nsec/call
clock-gettime-realtime:vdso: 941 nsec/call

The final result is rather close to pure ASM VDSO:
* 7% more on gettimeofday (9 cycles)
* 5% more on clock-gettime-realtime (6 cycles)
* 12% more on clock-getres-realtime (8 cycles)


Nice!  Much better.

It should be tested on more representative hardware, too, but this looks
promising alright :-)



mpc832x (e300c2 core) at 333 MHz:

Before:

gettimeofday:vdso: 235 nsec/call
clock-getres-realtime-coarse:vdso: 1668 nsec/call
clock-gettime-realtime-coarse:vdso: 1338 nsec/call
clock-getres-realtime:vdso: 135 nsec/call
clock-gettime-realtime:vdso: 244 nsec/call
clock-getres-boottime:vdso: 1232 nsec/call
clock-gettime-boottime:vdso: 1935 nsec/call
clock-getres-tai:vdso: 1257 nsec/call
clock-gettime-tai:vdso: 1898 nsec/call
clock-getres-monotonic-raw:vdso: 1229 nsec/call
clock-gettime-monotonic-raw:vdso: 1541 nsec/call
clock-getres-monotonic-coarse:vdso: 1699 nsec/call
clock-gettime-monotonic-coarse:vdso: 1477 nsec/call
clock-getres-monotonic:vdso: 135 nsec/call
clock-gettime-monotonic:vdso: 283 nsec/call

With the series:

gettimeofday:vdso: 271 nsec/call
clock-getres-realtime-coarse:vdso: 159 nsec/call
clock-gettime-realtime-coarse:vdso: 184 nsec/call
clock-getres-realtime:vdso: 163 nsec/call
clock-gettime-realtime:vdso: 281 nsec/call
clock-getres-boottime:vdso: 169 nsec/call
clock-gettime-boottime:vdso: 274 nsec/call
clock-getres-tai:vdso: 163 nsec/call
clock-gettime-tai:vdso: 277 nsec/call
clock-getres-monotonic-raw:vdso: 166 nsec/call
clock-gettime-monotonic-raw:vdso: 302 nsec/call
clock-getres-monotonic-coarse:vdso: 159 nsec/call
clock-gettime-monotonic-coarse:vdso: 184 nsec/call
clock-getres-monotonic:vdso: 166 nsec/call
clock-gettime-monotonic:vdso: 274 nsec/call

Christophe


Re: [PATCH v2] selftests: vm: Fix 64-bit test builds for powerpc64le

2020-01-20 Thread Kamalesh Babulal
On 1/20/20 7:29 PM, Sandipan Das wrote:
> Some tests are built only for 64-bit systems. This makes
> sure that these tests are built for both big and little
> endian variants of powerpc64.
> 
> Fixes: 7549b3364201 ("selftests: vm: Build/Run 64bit tests only on 64bit arch")
> Signed-off-by: Sandipan Das 

I was about to point out the missing run_vmtests change in your v1.

Reviewed-by: Kamalesh Babulal 


-- 
Kamalesh



[PATCH v2] selftests: vm: Fix 64-bit test builds for powerpc64le

2020-01-20 Thread Sandipan Das
Some tests are built only for 64-bit systems. This makes
sure that these tests are built for both big and little
endian variants of powerpc64.

Fixes: 7549b3364201 ("selftests: vm: Build/Run 64bit tests only on 64bit arch")
Signed-off-by: Sandipan Das 
---
Changelog:

v2:
  - Added required changes in run_vmtests.

---
 tools/testing/selftests/vm/Makefile| 2 +-
 tools/testing/selftests/vm/run_vmtests | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/vm/Makefile b/tools/testing/selftests/vm/Makefile
index 7f9a8a8c31da..f3d11f4fca38 100644
--- a/tools/testing/selftests/vm/Makefile
+++ b/tools/testing/selftests/vm/Makefile
@@ -19,7 +19,7 @@ TEST_GEN_FILES += thuge-gen
 TEST_GEN_FILES += transhuge-stress
 TEST_GEN_FILES += userfaultfd
 
-ifneq (,$(filter $(ARCH),arm64 ia64 mips64 parisc64 ppc64 riscv64 s390x sh64 sparc64 x86_64))
+ifneq (,$(filter $(ARCH),arm64 ia64 mips64 parisc64 ppc64 ppc64le riscv64 s390x sh64 sparc64 x86_64))
 TEST_GEN_FILES += va_128TBswitch
 TEST_GEN_FILES += virtual_address_range
 endif
diff --git a/tools/testing/selftests/vm/run_vmtests b/tools/testing/selftests/vm/run_vmtests
index a692ea828317..db8e0d1c7b39 100755
--- a/tools/testing/selftests/vm/run_vmtests
+++ b/tools/testing/selftests/vm/run_vmtests
@@ -59,7 +59,7 @@ else
 fi
 
 #filter 64bit architectures
-ARCH64STR="arm64 ia64 mips64 parisc64 ppc64 riscv64 s390x sh64 sparc64 x86_64"
+ARCH64STR="arm64 ia64 mips64 parisc64 ppc64 ppc64le riscv64 s390x sh64 sparc64 x86_64"
 if [ -z $ARCH ]; then
   ARCH=`uname -m 2>/dev/null | sed -e 's/aarch64.*/arm64/'`
 fi
-- 
2.17.1
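
The reason "ppc64" in the list did not already cover ppc64le is that make's
$(filter ...) compares whole words. The C helper below is a minimal sketch of
that whole-word lookup; model_word_in_list() is a hypothetical name, and it
is only assumed here that the run_vmtests check behaves the same way for this
case.

```c
#include <stdbool.h>
#include <string.h>

/* Return true if `word` appears as a whole space-separated word in `list`. */
static bool model_word_in_list(const char *word, const char *list)
{
	size_t wlen = strlen(word);
	const char *p = list;

	while (*p) {
		size_t len;

		while (*p == ' ')	/* skip separators */
			p++;
		len = strcspn(p, " ");	/* length of the next word */
		if (len == wlen && len > 0 && strncmp(p, word, len) == 0)
			return true;
		p += len;
	}
	return false;
}
```

Looking up "ppc64le" in the old list fails even though "ppc64" is present,
and succeeds once "ppc64le" is added as its own word.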



[PATCH] selftests: vm: Fix 64-bit test builds for powerpc64le

2020-01-20 Thread Sandipan Das
Some tests are built only for 64-bit systems. This makes
sure that these tests are built for both big and little
endian variants of powerpc64.

Fixes: 7549b3364201 ("selftests: vm: Build/Run 64bit tests only on 64bit arch")
Signed-off-by: Sandipan Das 
---
 tools/testing/selftests/vm/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/vm/Makefile b/tools/testing/selftests/vm/Makefile
index 7f9a8a8c31da..f3d11f4fca38 100644
--- a/tools/testing/selftests/vm/Makefile
+++ b/tools/testing/selftests/vm/Makefile
@@ -19,7 +19,7 @@ TEST_GEN_FILES += thuge-gen
 TEST_GEN_FILES += transhuge-stress
 TEST_GEN_FILES += userfaultfd
 
-ifneq (,$(filter $(ARCH),arm64 ia64 mips64 parisc64 ppc64 riscv64 s390x sh64 sparc64 x86_64))
+ifneq (,$(filter $(ARCH),arm64 ia64 mips64 parisc64 ppc64 ppc64le riscv64 s390x sh64 sparc64 x86_64))
 TEST_GEN_FILES += va_128TBswitch
 TEST_GEN_FILES += virtual_address_range
 endif
-- 
2.17.1



Re: [PATCH] ide: remove set but not used variable 'hwif'

2020-01-20 Thread David Miller
From: Wang Hai 
Date: Sat, 26 Oct 2019 09:57:38 +0800

> Fix the following gcc warning:
> 
> drivers/ide/pmac.c: In function pmac_ide_setup_device:
> drivers/ide/pmac.c:1027:14: warning: variable hwif set but not used
> [-Wunused-but-set-variable]
> 
> Fixes: d58b0c39e32f ("powerpc/macio: Rework hotplug media bay support")
> Reported-by: Hulk Robot 
> Signed-off-by: Wang Hai 

Applied.


[Bug 205099] KASAN hit at raid6_pq: BUG: Unable to handle kernel data access at 0x00f0fd0d

2020-01-20 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=205099

--- Comment #17 from Christophe Leroy (christophe.le...@c-s.fr) ---
Created attachment 286907
  --> https://bugzilla.kernel.org/attachment.cgi?id=286907&action=edit
Patch to fix kasan with KASAN_VMALLOC and VMAP_STACK

Please try the attached patch, it fixes the setup of the kasan early hash table
 when VMAP_STACK is enabled.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[PATCH v5 10/10] drivers/oprofile: open access for CAP_PERFMON privileged process

2020-01-20 Thread Alexey Budankov


Open access to monitoring for CAP_PERFMON privileged processes.
For backward compatibility reasons access to the monitoring remains
open for CAP_SYS_ADMIN privileged processes but CAP_SYS_ADMIN usage
for secure monitoring is discouraged with respect to CAP_PERFMON
capability. Providing the access under CAP_PERFMON capability singly,
without the rest of CAP_SYS_ADMIN credentials, excludes chances to
misuse the credentials and makes the operations more secure.

Signed-off-by: Alexey Budankov 
---
 drivers/oprofile/event_buffer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/oprofile/event_buffer.c b/drivers/oprofile/event_buffer.c
index 12ea4a4ad607..6c9edc8bbc95 100644
--- a/drivers/oprofile/event_buffer.c
+++ b/drivers/oprofile/event_buffer.c
@@ -113,7 +113,7 @@ static int event_buffer_open(struct inode *inode, struct file *file)
 {
int err = -EPERM;
 
-   if (!capable(CAP_SYS_ADMIN))
+   if (!perfmon_capable())
return -EPERM;
 
if (test_and_set_bit_lock(0, &buffer_opened))
-- 
2.20.1
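
The access rule these patches switch to can be modelled in userspace as
below. This is a sketch under assumptions: the helper itself is introduced
earlier in the series and is not shown in this part of the thread, so the
"CAP_PERFMON or CAP_SYS_ADMIN" logic is inferred from the commit message, and
all model_* names, the bitmask representation and the capability numbers (21
and 38) are used here for illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Capability numbers assumed for this model. */
#define MODEL_CAP_SYS_ADMIN 21
#define MODEL_CAP_PERFMON   38

/* Hypothetical task model: one bit per effective capability. */
struct model_task {
	uint64_t cap_effective;
};

static bool model_capable(const struct model_task *t, int cap)
{
	return (t->cap_effective >> cap) & 1u;
}

/* Either the narrow capability or the legacy broad one grants access. */
static bool model_perfmon_capable(const struct model_task *t)
{
	return model_capable(t, MODEL_CAP_PERFMON) ||
	       model_capable(t, MODEL_CAP_SYS_ADMIN);
}

/* Shape of the converted checks: refuse unless the task may monitor. */
static int model_event_buffer_open(const struct model_task *t)
{
	if (!model_perfmon_capable(t))
		return -1;	/* stands in for -EPERM */
	return 0;
}
```

A task holding only the narrow capability passes the check without carrying
the rest of what CAP_SYS_ADMIN allows, while existing CAP_SYS_ADMIN users
keep working.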


[PATCH v5 09/10] drivers/perf: open access for CAP_PERFMON privileged process

2020-01-20 Thread Alexey Budankov


Open access to monitoring for CAP_PERFMON privileged processes.
For backward compatibility reasons access to the monitoring remains
open for CAP_SYS_ADMIN privileged processes but CAP_SYS_ADMIN usage
for secure monitoring is discouraged with respect to CAP_PERFMON
capability. Providing the access under CAP_PERFMON capability singly,
without the rest of CAP_SYS_ADMIN credentials, excludes chances to
misuse the credentials and makes the operations more secure.

Signed-off-by: Alexey Budankov 
---
 drivers/perf/arm_spe_pmu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
index 4e4984a55cd1..5dff81bc3324 100644
--- a/drivers/perf/arm_spe_pmu.c
+++ b/drivers/perf/arm_spe_pmu.c
@@ -274,7 +274,7 @@ static u64 arm_spe_event_to_pmscr(struct perf_event *event)
if (!attr->exclude_kernel)
reg |= BIT(SYS_PMSCR_EL1_E1SPE_SHIFT);
 
-   if (IS_ENABLED(CONFIG_PID_IN_CONTEXTIDR) && capable(CAP_SYS_ADMIN))
+   if (IS_ENABLED(CONFIG_PID_IN_CONTEXTIDR) && perfmon_capable())
reg |= BIT(SYS_PMSCR_EL1_CX_SHIFT);
 
return reg;
@@ -700,7 +700,7 @@ static int arm_spe_pmu_event_init(struct perf_event *event)
return -EOPNOTSUPP;
 
reg = arm_spe_event_to_pmscr(event);
-   if (!capable(CAP_SYS_ADMIN) &&
+   if (!perfmon_capable() &&
(reg & (BIT(SYS_PMSCR_EL1_PA_SHIFT) |
BIT(SYS_PMSCR_EL1_CX_SHIFT) |
BIT(SYS_PMSCR_EL1_PCT_SHIFT))))
-- 
2.20.1



[PATCH v5 08/10] parisc/perf: open access for CAP_PERFMON privileged process

2020-01-20 Thread Alexey Budankov


Open access to monitoring for CAP_PERFMON privileged processes.
For backward compatibility reasons access to the monitoring remains
open for CAP_SYS_ADMIN privileged processes but CAP_SYS_ADMIN usage
for secure monitoring is discouraged with respect to CAP_PERFMON
capability. Providing the access under CAP_PERFMON capability singly,
without the rest of CAP_SYS_ADMIN credentials, excludes chances to
misuse the credentials and makes the operations more secure.

Signed-off-by: Alexey Budankov 
---
 arch/parisc/kernel/perf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/parisc/kernel/perf.c b/arch/parisc/kernel/perf.c
index 676683641d00..c4208d027794 100644
--- a/arch/parisc/kernel/perf.c
+++ b/arch/parisc/kernel/perf.c
@@ -300,7 +300,7 @@ static ssize_t perf_write(struct file *file, const char __user *buf,
else
return -EFAULT;
 
-   if (!capable(CAP_SYS_ADMIN))
+   if (!perfmon_capable())
return -EACCES;
 
if (count != sizeof(uint32_t))
-- 
2.20.1




[PATCH v5 07/10] powerpc/perf: open access for CAP_PERFMON privileged process

2020-01-20 Thread Alexey Budankov


Open access to monitoring for CAP_PERFMON privileged processes.
For backward compatibility reasons access to the monitoring remains
open for CAP_SYS_ADMIN privileged processes but CAP_SYS_ADMIN usage
for secure monitoring is discouraged with respect to CAP_PERFMON
capability. Providing the access under CAP_PERFMON capability singly,
without the rest of CAP_SYS_ADMIN credentials, excludes chances to
misuse the credentials and makes the operations more secure.

Signed-off-by: Alexey Budankov 
---
 arch/powerpc/perf/imc-pmu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/perf/imc-pmu.c b/arch/powerpc/perf/imc-pmu.c
index cb50a9e1fd2d..e837717492e4 100644
--- a/arch/powerpc/perf/imc-pmu.c
+++ b/arch/powerpc/perf/imc-pmu.c
@@ -898,7 +898,7 @@ static int thread_imc_event_init(struct perf_event *event)
if (event->attr.type != event->pmu->type)
return -ENOENT;
 
-   if (!capable(CAP_SYS_ADMIN))
+   if (!perfmon_capable())
return -EACCES;
 
/* Sampling not supported */
@@ -1307,7 +1307,7 @@ static int trace_imc_event_init(struct perf_event *event)
if (event->attr.type != event->pmu->type)
return -ENOENT;
 
-   if (!capable(CAP_SYS_ADMIN))
+   if (!perfmon_capable())
return -EACCES;
 
/* Return if this is a couting event */
-- 
2.20.1


[PATCH v5 06/10] trace/bpf_trace: open access for CAP_PERFMON privileged process

2020-01-20 Thread Alexey Budankov


Open access to bpf_trace monitoring for CAP_PERFMON privileged processes.
For backward compatibility reasons access to bpf_trace monitoring remains
open for CAP_SYS_ADMIN privileged processes but CAP_SYS_ADMIN usage for
secure bpf_trace monitoring is discouraged with respect to CAP_PERFMON
capability. Providing the access under CAP_PERFMON capability singly,
without the rest of CAP_SYS_ADMIN credentials, excludes chances to misuse
the credentials and makes operations more secure.

Signed-off-by: Alexey Budankov 
---
 kernel/trace/bpf_trace.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index e5ef4ae9edb5..334f1d71ebb1 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -1395,7 +1395,7 @@ int perf_event_query_prog_array(struct perf_event *event, void __user *info)
u32 *ids, prog_cnt, ids_len;
int ret;
 
-   if (!capable(CAP_SYS_ADMIN))
+   if (!perfmon_capable())
return -EPERM;
if (event->attr.type != PERF_TYPE_TRACEPOINT)
return -EINVAL;
-- 
2.20.1



[PATCH v5 05/10] drm/i915/perf: open access for CAP_PERFMON privileged process

2020-01-20 Thread Alexey Budankov


Open access to i915_perf monitoring for CAP_PERFMON privileged processes.
For backward compatibility reasons access to i915_perf subsystem remains
open for CAP_SYS_ADMIN privileged processes but CAP_SYS_ADMIN usage for
secure i915_perf monitoring is discouraged with respect to CAP_PERFMON
capability. Providing the access under CAP_PERFMON capability singly,
without the rest of CAP_SYS_ADMIN credentials, excludes chances to misuse
the credentials and makes operations more secure.

Signed-off-by: Alexey Budankov 
---
 drivers/gpu/drm/i915/i915_perf.c | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 2ae14bc14931..d89347861b7d 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -3375,10 +3375,10 @@ i915_perf_open_ioctl_locked(struct i915_perf *perf,
/* Similar to perf's kernel.perf_paranoid_cpu sysctl option
 * we check a dev.i915.perf_stream_paranoid sysctl option
 * to determine if it's ok to access system wide OA counters
-* without CAP_SYS_ADMIN privileges.
+* without CAP_PERFMON or CAP_SYS_ADMIN privileges.
 */
if (privileged_op &&
-   i915_perf_stream_paranoid && !capable(CAP_SYS_ADMIN)) {
+   i915_perf_stream_paranoid && !perfmon_capable()) {
DRM_DEBUG("Insufficient privileges to open i915 perf stream\n");
ret = -EACCES;
goto err_ctx;
@@ -3571,9 +3571,8 @@ static int read_properties_unlocked(struct i915_perf *perf,
} else
oa_freq_hz = 0;
 
-   if (oa_freq_hz > i915_oa_max_sample_rate &&
-   !capable(CAP_SYS_ADMIN)) {
-   DRM_DEBUG("OA exponent would exceed the max sampling frequency (sysctl dev.i915.oa_max_sample_rate) %uHz without root privileges\n",
+   if (oa_freq_hz > i915_oa_max_sample_rate && !perfmon_capable()) {
+   DRM_DEBUG("OA exponent would exceed the max sampling frequency (sysctl dev.i915.oa_max_sample_rate) %uHz without CAP_PERFMON or CAP_SYS_ADMIN privileges\n",
  i915_oa_max_sample_rate);
return -EACCES;
}
@@ -3994,7 +3993,7 @@ int i915_perf_add_config_ioctl(struct drm_device *dev, void *data,
return -EINVAL;
}
 
-   if (i915_perf_stream_paranoid && !capable(CAP_SYS_ADMIN)) {
+   if (i915_perf_stream_paranoid && !perfmon_capable()) {
DRM_DEBUG("Insufficient privileges to add i915 OA config\n");
return -EACCES;
}
@@ -4141,7 +4140,7 @@ int i915_perf_remove_config_ioctl(struct drm_device *dev, void *data,
return -ENOTSUPP;
}
 
-   if (i915_perf_stream_paranoid && !capable(CAP_SYS_ADMIN)) {
+   if (i915_perf_stream_paranoid && !perfmon_capable()) {
DRM_DEBUG("Insufficient privileges to remove i915 OA config\n");
return -EACCES;
}
-- 
2.20.1



[PATCH v5 04/10] perf tool: extend Perf tool with CAP_PERFMON capability support

2020-01-20 Thread Alexey Budankov


Extend error messages to mention the CAP_PERFMON capability as an
alternative to CAP_SYS_ADMIN for secure system performance monitoring
and observability operations. Make perf_event_paranoid_check() and
__cmd_ftrace() aware of the CAP_PERFMON capability.

Signed-off-by: Alexey Budankov 
---
 tools/perf/builtin-ftrace.c |  5 +++--
 tools/perf/design.txt   |  3 ++-
 tools/perf/util/cap.h   |  4 
 tools/perf/util/evsel.c | 10 +-
 tools/perf/util/util.c  |  1 +
 5 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/tools/perf/builtin-ftrace.c b/tools/perf/builtin-ftrace.c
index d5adc417a4ca..55eda54240fb 100644
--- a/tools/perf/builtin-ftrace.c
+++ b/tools/perf/builtin-ftrace.c
@@ -284,10 +284,11 @@ static int __cmd_ftrace(struct perf_ftrace *ftrace, int argc, const char **argv)
.events = POLLIN,
};
 
-   if (!perf_cap__capable(CAP_SYS_ADMIN)) {
+   if (!(perf_cap__capable(CAP_PERFMON) ||
+ perf_cap__capable(CAP_SYS_ADMIN))) {
pr_err("ftrace only works for %s!\n",
 #ifdef HAVE_LIBCAP_SUPPORT
-   "users with the SYS_ADMIN capability"
+   "users with the CAP_PERFMON or CAP_SYS_ADMIN capability"
 #else
"root"
 #endif
diff --git a/tools/perf/design.txt b/tools/perf/design.txt
index 0453ba26cdbd..a42fab308ff6 100644
--- a/tools/perf/design.txt
+++ b/tools/perf/design.txt
@@ -258,7 +258,8 @@ gets schedule to. Per task counters can be created by any user, for
 their own tasks.
 
 A 'pid == -1' and 'cpu == x' counter is a per CPU counter that counts
-all events on CPU-x. Per CPU counters need CAP_SYS_ADMIN privilege.
+all events on CPU-x. Per CPU counters need CAP_PERFMON or CAP_SYS_ADMIN
+privilege.
 
 The 'flags' parameter is currently unused and must be zero.
 
diff --git a/tools/perf/util/cap.h b/tools/perf/util/cap.h
index 051dc590ceee..ae52878c0b2e 100644
--- a/tools/perf/util/cap.h
+++ b/tools/perf/util/cap.h
@@ -29,4 +29,8 @@ static inline bool perf_cap__capable(int cap __maybe_unused)
 #define CAP_SYSLOG 34
 #endif
 
+#ifndef CAP_PERFMON
+#define CAP_PERFMON 38
+#endif
+
 #endif /* __PERF_CAP_H */
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index a69e64236120..a35f17723dd3 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -2491,14 +2491,14 @@ int perf_evsel__open_strerror(struct evsel *evsel, struct target *target,
 "You may not have permission to collect %sstats.\n\n"
 "Consider tweaking /proc/sys/kernel/perf_event_paranoid,\n"
 "which controls use of the performance events system by\n"
-"unprivileged users (without CAP_SYS_ADMIN).\n\n"
+"unprivileged users (without CAP_PERFMON or CAP_SYS_ADMIN).\n\n"
 "The current value is %d:\n\n"
 "  -1: Allow use of (almost) all events by all users\n"
 "  Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK\n"
-">= 0: Disallow ftrace function tracepoint by users without CAP_SYS_ADMIN\n"
-"  Disallow raw tracepoint access by users without CAP_SYS_ADMIN\n"
-">= 1: Disallow CPU event access by users without CAP_SYS_ADMIN\n"
-">= 2: Disallow kernel profiling by users without CAP_SYS_ADMIN\n\n"
+">= 0: Disallow ftrace function tracepoint by users without CAP_PERFMON or CAP_SYS_ADMIN\n"
+"  Disallow raw tracepoint access by users without CAP_PERFMON or CAP_SYS_ADMIN\n"
+">= 1: Disallow CPU event access by users without CAP_PERFMON or CAP_SYS_ADMIN\n"
+">= 2: Disallow kernel profiling by users without CAP_PERFMON or CAP_SYS_ADMIN\n\n"
 "To make this setting permanent, edit /etc/sysctl.conf too, e.g.:\n\n"
 "  kernel.perf_event_paranoid = -1\n" ,
 target->system_wide ? "system-wide " : "",
diff --git a/tools/perf/util/util.c b/tools/perf/util/util.c
index 969ae560dad9..51cf3071db74 100644
--- a/tools/perf/util/util.c
+++ b/tools/perf/util/util.c
@@ -272,6 +272,7 @@ int perf_event_paranoid(void)
 bool perf_event_paranoid_check(int max_level)
 {
return perf_cap__capable(CAP_SYS_ADMIN) ||
+   perf_cap__capable(CAP_PERFMON) ||
perf_event_paranoid() <= max_level;
 }
 
-- 
2.20.1
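
The change to perf_event_paranoid_check() in the patch above can be
illustrated in plain user space. The following is only a sketch, not the
perf tool's code: it reads the effective capability mask from
/proc/self/status instead of using libcap, and cap_mask_has(),
read_effective_caps() and paranoid_check() are hypothetical names. The
capability numbers (21 for CAP_SYS_ADMIN, 38 for CAP_PERFMON) are the
ones used in this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Capability numbers as used in this series (38 is new with CAP_PERFMON). */
enum { CAP_SYS_ADMIN_NR = 21, CAP_PERFMON_NR = 38 };

/* True if bit 'cap' is set in a 64-bit capability mask. */
bool cap_mask_has(uint64_t mask, int cap)
{
	assert(cap >= 0 && cap < 64);
	return (mask >> cap) & 1;
}

/* Hypothetical helper: read the CapEff mask of the calling process. */
bool read_effective_caps(uint64_t *mask)
{
	char line[256];
	bool found = false;
	FILE *f = fopen("/proc/self/status", "r");

	if (!f)
		return false;
	while (fgets(line, sizeof(line), f)) {
		unsigned long long v;

		/* The line looks like "CapEff:\t0000003fffffffff". */
		if (sscanf(line, "CapEff: %llx", &v) == 1) {
			*mask = v;
			found = true;
			break;
		}
	}
	fclose(f);
	return found;
}

/* Same shape as perf_event_paranoid_check() after the patch. */
bool paranoid_check(uint64_t caps, int paranoid, int max_level)
{
	return cap_mask_has(caps, CAP_SYS_ADMIN_NR) ||
	       cap_mask_has(caps, CAP_PERFMON_NR) ||
	       paranoid <= max_level;
}
```

A caller would fetch the mask once with read_effective_caps() and pass it,
together with the current perf_event_paranoid value, to paranoid_check().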




[PATCH v5 03/10] perf/core: open access to anon probes for CAP_PERFMON privileged process

2020-01-20 Thread Alexey Budankov


Open access to anon kprobes, uprobes and eBPF tracing for CAP_PERFMON
privileged processes. For backward compatibility reasons access remains
open for CAP_SYS_ADMIN privileged processes but CAP_SYS_ADMIN usage for
secure monitoring is discouraged with respect to CAP_PERFMON capability.
Providing the access under CAP_PERFMON capability singly, without the
rest of CAP_SYS_ADMIN credentials, excludes chances to misuse the
credentials and makes operations more secure.

Anon kprobes and uprobes are used by ftrace and eBPF. perf probe uses
ftrace to define new kprobe events, and those events are treated as
tracepoint events. eBPF defines new probes via perf_event_open syscall
and then the probes are used in eBPF tracing.

Signed-off-by: Alexey Budankov 
---
 kernel/events/core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index b1fcbbe24849..8a6c0b08451d 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -9088,7 +9088,7 @@ static int perf_kprobe_event_init(struct perf_event *event)
if (event->attr.type != perf_kprobe.type)
return -ENOENT;
 
-   if (!capable(CAP_SYS_ADMIN))
+   if (!perfmon_capable())
return -EACCES;
 
/*
@@ -9148,7 +9148,7 @@ static int perf_uprobe_event_init(struct perf_event *event)
if (event->attr.type != perf_uprobe.type)
return -ENOENT;
 
-   if (!capable(CAP_SYS_ADMIN))
+   if (!perfmon_capable())
return -EACCES;
 
/*
-- 
2.20.1


[PATCH v5 02/10] perf/core: open access to the core for CAP_PERFMON privileged process

2020-01-20 Thread Alexey Budankov


Open access to monitoring of kernel code, system, tracepoints and namespaces
data for a CAP_PERFMON privileged process. For backward compatibility
reasons access to perf_events subsystem remains open for CAP_SYS_ADMIN
privileged processes but CAP_SYS_ADMIN usage for secure perf_events
monitoring is discouraged with respect to CAP_PERFMON capability.
Providing the access under CAP_PERFMON capability singly, without the rest
of CAP_SYS_ADMIN credentials, excludes chances to misuse the credentials
and makes operation more secure.

Signed-off-by: Alexey Budankov 
---
 include/linux/perf_event.h | 6 +++---
 kernel/events/core.c   | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 6d4c22aee384..730469babcc2 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1285,7 +1285,7 @@ static inline int perf_is_paranoid(void)
 
 static inline int perf_allow_kernel(struct perf_event_attr *attr)
 {
-   if (sysctl_perf_event_paranoid > 1 && !capable(CAP_SYS_ADMIN))
+   if (sysctl_perf_event_paranoid > 1 && !perfmon_capable())
return -EACCES;
 
return security_perf_event_open(attr, PERF_SECURITY_KERNEL);
@@ -1293,7 +1293,7 @@ static inline int perf_allow_kernel(struct perf_event_attr *attr)
 
 static inline int perf_allow_cpu(struct perf_event_attr *attr)
 {
-   if (sysctl_perf_event_paranoid > 0 && !capable(CAP_SYS_ADMIN))
+   if (sysctl_perf_event_paranoid > 0 && !perfmon_capable())
return -EACCES;
 
return security_perf_event_open(attr, PERF_SECURITY_CPU);
@@ -1301,7 +1301,7 @@ static inline int perf_allow_cpu(struct perf_event_attr *attr)
 
 static inline int perf_allow_tracepoint(struct perf_event_attr *attr)
 {
-   if (sysctl_perf_event_paranoid > -1 && !capable(CAP_SYS_ADMIN))
+   if (sysctl_perf_event_paranoid > -1 && !perfmon_capable())
return -EPERM;
 
return security_perf_event_open(attr, PERF_SECURITY_TRACEPOINT);
diff --git a/kernel/events/core.c b/kernel/events/core.c
index a1f8bde19b56..b1fcbbe24849 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -11186,7 +11186,7 @@ SYSCALL_DEFINE5(perf_event_open,
}
 
if (attr.namespaces) {
-   if (!capable(CAP_SYS_ADMIN))
+   if (!perfmon_capable())
return -EACCES;
}
 
-- 
2.20.1
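
The three helpers touched in include/linux/perf_event.h share one pattern:
each compares sysctl_perf_event_paranoid against its own threshold and
only then consults the capability. The user-space model below is a sketch
of that pattern, with the thresholds and error codes copied from the diff
above. perf_paranoid_gate() and enum perf_op are hypothetical names, the
'privileged' argument stands in for perfmon_capable(), and the
security_perf_event_open() call that follows in the kernel is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Operation classes gated by the three helpers in the patch. */
enum perf_op { PERF_OP_TRACEPOINT, PERF_OP_CPU, PERF_OP_KERNEL };

/*
 * Returns 0 if the operation passes the paranoid gate, or a negative
 * errno as in the patch. 'privileged' stands in for perfmon_capable().
 */
int perf_paranoid_gate(enum perf_op op, int paranoid, bool privileged)
{
	/* Thresholds from the diff: tracepoint > -1, cpu > 0, kernel > 1. */
	static const int threshold[] = {
		[PERF_OP_TRACEPOINT] = -1,
		[PERF_OP_CPU] = 0,
		[PERF_OP_KERNEL] = 1,
	};

	assert((int)op >= 0 && (int)op <= PERF_OP_KERNEL);

	if (paranoid > threshold[op] && !privileged)
		return op == PERF_OP_TRACEPOINT ? -EPERM : -EACCES;

	return 0;
}
```

With the default-style setting of 2, an unprivileged caller is refused
kernel profiling, while a caller holding CAP_PERFMON (or CAP_SYS_ADMIN)
passes all three gates.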



[PATCH v5 01/10] capabilities: introduce CAP_PERFMON to kernel and user space

2020-01-20 Thread Alexey Budankov


Introduce CAP_PERFMON capability designed to secure system performance
monitoring and observability operations so that CAP_PERFMON would assist
CAP_SYS_ADMIN capability in its governing role for perf_events, i915_perf
and other performance monitoring and observability subsystems.

CAP_PERFMON intends to harden system security and integrity during system
performance monitoring and observability operations by decreasing attack
surface that is available to a CAP_SYS_ADMIN privileged process [1].
Providing access to system performance monitoring and observability
operations under CAP_PERFMON capability singly, without the rest of
CAP_SYS_ADMIN credentials, excludes chances to misuse the credentials and
makes operation more secure.

CAP_PERFMON intends to take over CAP_SYS_ADMIN credentials related to
system performance monitoring and observability operations and balance
amount of CAP_SYS_ADMIN credentials following the recommendations in the
capabilities man page [1] for CAP_SYS_ADMIN: "Note: this capability is
overloaded; see Notes to kernel developers, below."

Although the software running under CAP_PERFMON can not ensure avoidance
of related hardware issues, the software can still mitigate these issues
following the official embargoed hardware issues mitigation procedure [2].
The bugs in the software itself could be fixed following the standard
kernel development process [3] to maintain and harden security of system
performance monitoring and observability operations.

[1] http://man7.org/linux/man-pages/man7/capabilities.7.html
[2] https://www.kernel.org/doc/html/latest/process/embargoed-hardware-issues.html
[3] https://www.kernel.org/doc/html/latest/admin-guide/security-bugs.html

Signed-off-by: Alexey Budankov 
---
 include/linux/capability.h  | 12 
 include/uapi/linux/capability.h |  8 +++-
 security/selinux/include/classmap.h |  4 ++--
 3 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/include/linux/capability.h b/include/linux/capability.h
index ecce0f43c73a..8784969d91e1 100644
--- a/include/linux/capability.h
+++ b/include/linux/capability.h
@@ -251,6 +251,18 @@ extern bool privileged_wrt_inode_uidgid(struct user_namespace *ns, const struct
 extern bool capable_wrt_inode_uidgid(const struct inode *inode, int cap);
 extern bool file_ns_capable(const struct file *file, struct user_namespace *ns, int cap);
 extern bool ptracer_capable(struct task_struct *tsk, struct user_namespace *ns);
+static inline bool perfmon_capable(void)
+{
+   struct user_namespace *ns = &init_user_ns;
+
+   if (ns_capable_noaudit(ns, CAP_PERFMON))
+   return ns_capable(ns, CAP_PERFMON);
+
+   if (ns_capable_noaudit(ns, CAP_SYS_ADMIN))
+   return ns_capable(ns, CAP_SYS_ADMIN);
+
+   return false;
+}
 
 /* audit system wants to get cap info from files as well */
 extern int get_vfs_caps_from_disk(const struct dentry *dentry, struct cpu_vfs_cap_data *cpu_caps);
diff --git a/include/uapi/linux/capability.h b/include/uapi/linux/capability.h
index 240fdb9a60f6..8b416e5f3afa 100644
--- a/include/uapi/linux/capability.h
+++ b/include/uapi/linux/capability.h
@@ -366,8 +366,14 @@ struct vfs_ns_cap_data {
 
 #define CAP_AUDIT_READ 37
 
+/*
+ * Allow system performance and observability privileged operations
+ * using perf_events, i915_perf and other kernel subsystems
+ */
+
+#define CAP_PERFMON 38
 
-#define CAP_LAST_CAP CAP_AUDIT_READ
+#define CAP_LAST_CAP CAP_PERFMON
 
 #define cap_valid(x) ((x) >= 0 && (x) <= CAP_LAST_CAP)
 
diff --git a/security/selinux/include/classmap.h b/security/selinux/include/classmap.h
index 7db24855e12d..c599b0c2b0e7 100644
--- a/security/selinux/include/classmap.h
+++ b/security/selinux/include/classmap.h
@@ -27,9 +27,9 @@
"audit_control", "setfcap"
 
 #define COMMON_CAP2_PERMS  "mac_override", "mac_admin", "syslog", \
-   "wake_alarm", "block_suspend", "audit_read"
+   "wake_alarm", "block_suspend", "audit_read", "perfmon"
 
-#if CAP_LAST_CAP > CAP_AUDIT_READ
+#if CAP_LAST_CAP > CAP_PERFMON
 #error New capability defined, please update COMMON_CAP2_PERMS.
 #endif
 
-- 
2.20.1
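
The control flow of perfmon_capable() in the patch above probes each
capability with a non-audited check first and repeats the check in audited
form only when the probe succeeds. A toy user-space model of that flow,
assuming nothing about the kernel beyond what the diff shows, could look
like this. struct task_model and the model_*() functions are hypothetical,
and counting every audited check is a simplification of what the audit
subsystem actually records.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { CAP_SYS_ADMIN_NR = 21, CAP_PERFMON_NR = 38 };

/* Toy stand-in for a task's credentials plus an audit counter. */
struct task_model {
	uint64_t effective;   /* bitmask of held capabilities */
	unsigned int audited; /* number of audited capability checks */
};

/* Models ns_capable_noaudit(): answers without touching the audit count. */
bool model_capable_noaudit(const struct task_model *t, int cap)
{
	assert(cap >= 0 && cap < 64);
	return (t->effective >> cap) & 1;
}

/* Models ns_capable(): same answer, but the check is counted as audited. */
bool model_capable(struct task_model *t, int cap)
{
	t->audited++;
	return model_capable_noaudit(t, cap);
}

/* Same control flow as perfmon_capable() in the patch. */
bool model_perfmon_capable(struct task_model *t)
{
	if (model_capable_noaudit(t, CAP_PERFMON_NR))
		return model_capable(t, CAP_PERFMON_NR);

	if (model_capable_noaudit(t, CAP_SYS_ADMIN_NR))
		return model_capable(t, CAP_SYS_ADMIN_NR);

	return false;
}
```

In this model a task holding neither capability is refused without any
audited check, and a task holding CAP_PERFMON never triggers an audited
CAP_SYS_ADMIN check.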




Re: [FSL P5020 P5040 PPC] Onboard SD card doesn't work anymore after the 'mmc-v5.4-2' updates

2020-01-20 Thread Ulf Hansson
On Mon, 20 Jan 2020 at 10:17, Christian Zigotzky  wrote:
>
> On 16.01.20 at 16:46, Ulf Hansson wrote:
> > On Thu, 16 Jan 2020 at 12:18, Christian Zigotzky wrote:
> >> Hi All,
> >>
> >> We still need the attached patch for our onboard SD card interface
> >> [1,2]. Could you please add this patch to the tree?
> > No, because according to previous discussion that isn't the correct
> > solution and more importantly it will break other archs (if I recall
> > correctly).
> >
> > Looks like someone from the ppc community needs to pick up the ball.
> I am not sure if the ppc community has to fix this issue because your
> updates (mmc-v5.4-2) are responsible for this issue. If nobody wants to
> fix this issue then we will lose the onboard SD card support in the
> future. PLEASE check the 'mmc-v5.4-2' updates again.

Applying your suggested fix breaks other archs/boards. It's really not
a good situation, but I will not take a step back when it's quite easy
to take a step forward instead.

Someone just needs to care and send a patch. It doesn't look that hard
to me, but maybe I am wrong.

Apologies if this isn't the answer you wanted, but that's all I can do
for now, sorry.

Kind regards
Uffe


[PATCH v5 0/10] Introduce CAP_PERFMON to secure system performance monitoring and observability

2020-01-20 Thread Alexey Budankov


Currently access to perf_events, i915_perf and other performance monitoring and
observability subsystems of the kernel is open for a privileged process [1] with
CAP_SYS_ADMIN capability enabled in the process effective set [2].

This patch set introduces CAP_PERFMON capability designed to secure system
performance monitoring and observability operations so that CAP_PERFMON would
assist CAP_SYS_ADMIN capability in its governing role for perf_events, i915_perf
and other performance monitoring and observability subsystems of the kernel.

CAP_PERFMON intends to take over CAP_SYS_ADMIN credentials related to system
performance monitoring and observability operations and balance amount of
CAP_SYS_ADMIN credentials following the recommendations in the capabilities man
page [2] for CAP_SYS_ADMIN: "Note: this capability is overloaded; see Notes to
kernel developers, below."

CAP_PERFMON intends to harden system security and integrity during system
performance monitoring and observability operations by decreasing attack surface
that is available to a CAP_SYS_ADMIN privileged process [2]. Providing the
access to system performance monitoring and observability operations under
CAP_PERFMON capability singly, without the rest of CAP_SYS_ADMIN credentials,
excludes chances to misuse the credentials and makes the operation more secure.

For backward compatibility reasons access to system performance monitoring and
observability subsystems of the kernel remains open for CAP_SYS_ADMIN privileged
processes but CAP_SYS_ADMIN capability usage for secure system performance
monitoring and observability operations is discouraged with respect to the
designed CAP_PERFMON capability.

CAP_PERFMON intends to meet the demand to secure system performance monitoring
and observability operations in security sensitive, restricted, multiuser
production environments (e.g. HPC clusters, cloud and virtual compute
environments) where root or CAP_SYS_ADMIN credentials are not available to mass
users of a system because of security considerations.

A possible alternative solution to this capabilities balancing, system security
hardening task could be to use the existing CAP_SYS_PTRACE capability to govern
system performance monitoring and observability operations. However,
CAP_SYS_PTRACE still provides users with more credentials than are required for
secure performance monitoring and observability operations, and this excess is
avoided by the designed CAP_PERFMON capability.

Although the software running under CAP_PERFMON can not ensure avoidance of
related hardware issues, the software can still mitigate those issues following
the official embargoed hardware issues mitigation procedure [3]. The bugs in
the software itself could be fixed following the standard kernel development
process [4] to maintain and harden security of system performance monitoring
and observability operations. After all, the patch set is shaped in the way
that simplifies procedure for backtracking of possible issues and bugs [5] as
much as possible.

The patch set is for tip perf/core repository:
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip perf/core
sha1: 5738891229a25e9e678122a843cbf0466a456d0c

---
Changes in v5:
- renamed CAP_SYS_PERFMON to CAP_PERFMON
- extended perfmon_capable() with noaudit checks
Changes in v4:
- converted perfmon_capable() into an inline function
- made perf_events kprobes, uprobes, hw breakpoints and namespaces data
  available to CAP_SYS_PERFMON privileged processes
- applied perfmon_capable() to drivers/perf and drivers/oprofile
- extended __cmd_ftrace() with support of CAP_SYS_PERFMON
Changes in v3:
- implemented perfmon_capable() macros aggregating required capabilities checks
Changes in v2:
- made perf_events trace points available to CAP_SYS_PERFMON privileged
  processes
- made perf_event_paranoid_check() treat CAP_SYS_PERFMON equally to
  CAP_SYS_ADMIN
- applied CAP_SYS_PERFMON to i915_perf, bpf_trace, powerpc and parisc system
  performance monitoring and observability related subsystems

---
Alexey Budankov (10):
  capabilities: introduce CAP_PERFMON to kernel and user space
  perf/core: open access to the core for CAP_PERFMON privileged process
  perf/core: open access to anon probes for CAP_PERFMON privileged process
  perf tool: extend Perf tool with CAP_PERFMON capability support
  drm/i915/perf: open access for CAP_PERFMON privileged process
  trace/bpf_trace: open access for CAP_PERFMON privileged process
  powerpc/perf: open access for CAP_PERFMON privileged process
  parisc/perf: open access for CAP_PERFMON privileged process
  drivers/perf: open access for CAP_PERFMON privileged process
  drivers/oprofile: open access for CAP_PERFMON privileged process

 arch/parisc/kernel/perf.c   |  2 +-
 arch/powerpc/perf/imc-pmu.c |  4 ++--
 drivers/gpu/drm/i915/i915_perf.c| 13 ++---
 drivers/oprofile/event_buffer.c |  2 +-
 drivers/perf/arm_spe_pmu.c  |  4 ++--
 

[PATCH 2/2] powerpc/perf: Implement a global lock to avoid races between trace, core and thread imc events.

2020-01-20 Thread Anju T Sudhakar
IMC (In-Memory Collection counters) does performance monitoring in
two different modes: accumulation mode (core-imc and thread-imc events)
and trace mode (trace-imc events). A CPU thread can be in either
accumulation mode or trace mode at a time; the mode is selected via the
LDBAR register in the POWER architecture. The current design does not
address the races between thread-imc and trace-imc events.

This patch implements a global id and lock to avoid the races between
core, trace and thread imc events. With this global id-lock
implementation, the system can run only one of core, thread or trace imc
events at a time. For example, to run any core-imc events, thread/trace
imc events must not be enabled/monitored.
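
The id-plus-refcount scheme described above can be modelled outside the
kernel as a small state machine. This is a sketch only: imc_claim(),
imc_release(), struct imc_global_model and the enum values are
hypothetical names, and the mutex that the patch holds around every
access is omitted, so callers are assumed to serialize the calls
themselves.

```c
#include <assert.h>
#include <errno.h>

/* Illustrative mode ids; 0 means no mode currently owns the IMC hardware. */
enum imc_mode {
	IMC_MODE_NONE = 0,
	IMC_MODE_CORE,
	IMC_MODE_THREAD,
	IMC_MODE_TRACE,
};

struct imc_global_model {
	enum imc_mode id; /* mode currently in use, or IMC_MODE_NONE */
	int refc;         /* number of events holding that mode */
};

/* Claim the hardware for 'mode'; fails if another mode is active. */
int imc_claim(struct imc_global_model *g, enum imc_mode mode)
{
	assert(mode != IMC_MODE_NONE);

	if (g->id != IMC_MODE_NONE && g->id != mode)
		return -EBUSY;

	g->id = mode;
	g->refc++;
	return 0;
}

/* Drop one reference; the last release frees the hardware for other modes. */
void imc_release(struct imc_global_model *g, enum imc_mode mode)
{
	if (g->id != mode)
		return;

	if (--g->refc <= 0) {
		g->refc = 0;
		g->id = IMC_MODE_NONE;
	}
}
```

An event_init path would call imc_claim() and return its error, and the
matching release path would call imc_release() for the same mode.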
 
Signed-off-by: Anju T Sudhakar 
---
 arch/powerpc/perf/imc-pmu.c | 177 +++-
 1 file changed, 153 insertions(+), 24 deletions(-)

diff --git a/arch/powerpc/perf/imc-pmu.c b/arch/powerpc/perf/imc-pmu.c
index cb50a9e1fd2d..2e220f199530 100644
--- a/arch/powerpc/perf/imc-pmu.c
+++ b/arch/powerpc/perf/imc-pmu.c
@@ -44,6 +44,16 @@ static DEFINE_PER_CPU(u64 *, trace_imc_mem);
 static struct imc_pmu_ref *trace_imc_refc;
 static int trace_imc_mem_size;
 
+/*
+ * Global data structure used to avoid races between thread,
+ * core and trace-imc
+ */
+static struct imc_pmu_ref imc_global_refc = {
+   .lock = __MUTEX_INITIALIZER(imc_global_refc.lock),
+   .id = 0,
+   .refc = 0,
+};
+
 static struct imc_pmu *imc_event_to_pmu(struct perf_event *event)
 {
return container_of(event->pmu, struct imc_pmu, pmu);
@@ -759,6 +769,20 @@ static void core_imc_counters_release(struct perf_event *event)
ref->refc = 0;
}
mutex_unlock(&ref->lock);
+
+   mutex_lock(&imc_global_refc.lock);
+   if (imc_global_refc.id == IMC_DOMAIN_CORE) {
+   imc_global_refc.refc--;
+   /*
+* If no other thread is running any core-imc
+* event, set the global id to zero.
+*/
+   if (imc_global_refc.refc <= 0) {
+   imc_global_refc.refc = 0;
+   imc_global_refc.id = 0;
+   }
+   }
+   mutex_unlock(&imc_global_refc.lock);
 }
 
 static int core_imc_event_init(struct perf_event *event)
@@ -779,6 +803,22 @@ static int core_imc_event_init(struct perf_event *event)
if (event->cpu < 0)
return -EINVAL;
 
+   /*
+* Take the global lock, and make sure
+* no other thread is running any trace OR thread imc event
+*/
+   mutex_lock(&imc_global_refc.lock);
+   if (imc_global_refc.id == 0) {
+   imc_global_refc.id = IMC_DOMAIN_CORE;
+   imc_global_refc.refc++;
+   } else if (imc_global_refc.id == IMC_DOMAIN_CORE) {
+   imc_global_refc.refc++;
+   } else {
+   mutex_unlock(&imc_global_refc.lock);
+   return -EBUSY;
+   }
+   mutex_unlock(&imc_global_refc.lock);
+
event->hw.idx = -1;
pmu = imc_event_to_pmu(event);
 
@@ -877,7 +917,16 @@ static int ppc_thread_imc_cpu_online(unsigned int cpu)
 
 static int ppc_thread_imc_cpu_offline(unsigned int cpu)
 {
-   mtspr(SPRN_LDBAR, 0);
+   /*
+* Toggle the bit 0 of LDBAR.
+*
+* If bit 0 of LDBAR is unset, it will stop posting
+* the counter data to memory.
+* For thread-imc, bit 0 of LDBAR will be set to 1 in the
+* event_add function. So toggle this bit here, to stop the updates
+* to memory in the cpu_offline path.
+*/
+   mtspr(SPRN_LDBAR, (mfspr(SPRN_LDBAR) ^ (1UL << 63)));
return 0;
 }
 
@@ -889,6 +938,24 @@ static int thread_imc_cpu_init(void)
  ppc_thread_imc_cpu_offline);
 }
 
+static void thread_imc_counters_release(struct perf_event *event)
+{
+
+   mutex_lock(&imc_global_refc.lock);
+   if (imc_global_refc.id == IMC_DOMAIN_THREAD) {
+   imc_global_refc.refc--;
+   /*
+* If no other thread is running any thread-imc
+* event, set the global id to zero.
+*/
+   if (imc_global_refc.refc <= 0) {
+   imc_global_refc.refc = 0;
+   imc_global_refc.id = 0;
+   }
+   }
+   mutex_unlock(&imc_global_refc.lock);
+}
+
 static int thread_imc_event_init(struct perf_event *event)
 {
u32 config = event->attr.config;
@@ -905,6 +972,27 @@ static int thread_imc_event_init(struct perf_event *event)
if (event->hw.sample_period)
return -EINVAL;
 
+   mutex_lock(&imc_global_refc.lock);
+   /*
+* Check if any other thread is running
+* core-engine, if not set the global id to
+* thread-imc.
+*/
+   if (imc_global_refc.id == 0) {
+   imc_global_refc.id = IMC_DOMAIN_THREAD;
+   imc_global_refc.refc++;
+   } else if (imc_global_refc.id == IMC_DOMAIN_THREAD) 
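
The cpu_offline hunk earlier in this patch writes
mfspr(SPRN_LDBAR) ^ (1UL << 63). In IBM bit numbering, bit 0 is the most
significant bit of the 64-bit register, which is why "bit 0" corresponds
to 1UL << 63. The sketch below illustrates that mapping and contrasts the
XOR used in the hunk, which flips the bit whatever its current value,
with a mask that always clears it. The helper names are hypothetical and
this is plain arithmetic, not kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* IBM bit numbering: bit 0 is the most significant bit of 64. */
uint64_t ibm_bit64(unsigned int n)
{
	assert(n < 64);
	return UINT64_C(1) << (63 - n);
}

/* What the hunk does: flip bit 0, whatever its current value. */
uint64_t ldbar_toggle_bit0(uint64_t ldbar)
{
	return ldbar ^ ibm_bit64(0);
}

/* Alternative for comparison: force bit 0 to zero, keep the other bits. */
uint64_t ldbar_clear_bit0(uint64_t ldbar)
{
	return ldbar & ~ibm_bit64(0);
}
```

The two helpers agree when bit 0 is already set. They differ when it is
clear: the toggle sets it, while the mask leaves the value unchanged.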

[PATCH 1/2] powerpc/powernv: Re-enable imc trace-mode in kernel

2020-01-20 Thread Anju T Sudhakar
Commit 249fad734a25 ("powerpc/perf: Disable trace_imc pmu") disabled
IMC (In-Memory Collection) trace mode in the kernel, since frequent
switching between accumulation mode and trace mode via the LDBAR SPR
can trigger a checkstop (system crash) in the hardware.

This patch re-enables IMC trace mode in the kernel.

The following patch in this series addresses the mode switching issue
by implementing a global lock, which restricts usage to either
accumulation mode or trace mode at any one time.

Signed-off-by: Anju T Sudhakar 
---
 arch/powerpc/platforms/powernv/opal-imc.c | 9 +
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/opal-imc.c b/arch/powerpc/platforms/powernv/opal-imc.c
index 000b350d4060..3b4518f4b643 100644
--- a/arch/powerpc/platforms/powernv/opal-imc.c
+++ b/arch/powerpc/platforms/powernv/opal-imc.c
@@ -278,14 +278,7 @@ static int opal_imc_counters_probe(struct platform_device *pdev)
domain = IMC_DOMAIN_THREAD;
break;
case IMC_TYPE_TRACE:
-   /*
-* FIXME. Using trace_imc events to monitor application
-* or KVM thread performance can cause a checkstop
-* (system crash).
-* Disable it for now.
-*/
-   pr_info_once("IMC: disabling trace_imc PMU\n");
-   domain = -1;
+   domain = IMC_DOMAIN_TRACE;
break;
default:
pr_warn("IMC Unknown Device type \n");
-- 
2.20.1



Re: [PATCH RFC v1] mm: is_mem_section_removable() overhaul

2020-01-20 Thread David Hildenbrand
On 20.01.20 10:14, David Hildenbrand wrote:
> On 20.01.20 08:48, Michal Hocko wrote:
>> On Fri 17-01-20 08:57:51, Dan Williams wrote:
>> [...]
>>> Unless the user is willing to hold the device_hotplug_lock over the
>>> evaluation then the result is unreliable.
>>
>> Do we want to hold the device_hotplug_lock from this user readable file
>> in the first place? My book says that this just waits to become a
>> problem.
> 
> It was the "big hammer" solution for this RFC.
> 
> I think we could do with a try_lock() on the device_lock() paired with a
> device->removed flag. The latter is helpful for properly catching zombie
> devices on the onlining/offlining path either way (and on my todo list).

We do have dev->p->dead which could come in handy.

-- 
Thanks,

David / dhildenb



[FSL P5020 P5040 PPC] Onboard SD card doesn't work anymore after the 'mmc-v5.4-2' updates

2020-01-20 Thread Christian Zigotzky

On 16.01.20 at 16:46, Ulf Hansson wrote:
> On Thu, 16 Jan 2020 at 12:18, Christian Zigotzky wrote:
>> Hi All,
>>
>> We still need the attached patch for our onboard SD card interface
>> [1,2]. Could you please add this patch to the tree?
> No, because according to previous discussion that isn't the correct
> solution and more importantly it will break other archs (if I recall
> correctly).
>
> Looks like someone from the ppc community needs to pick up the ball.

I am not sure if the ppc community has to fix this issue because your
updates (mmc-v5.4-2) are responsible for this issue. If nobody wants to
fix this issue then we will lose the onboard SD card support in the
future. PLEASE check the 'mmc-v5.4-2' updates again.

>> Thanks,
>> Christian
>>
>> [1] https://www.spinics.net/lists/linux-mmc/msg56211.html
> I think this discussion even suggested some viable solutions, so it
> should just be a matter of sending a patch :-)
>
>> [2]
>> http://forum.hyperion-entertainment.com/viewtopic.php?f=58=4349=20#p49012
>
> Kind regards
> Uffe




Re: [PATCH RFC v1] mm: is_mem_section_removable() overhaul

2020-01-20 Thread David Hildenbrand
On 20.01.20 08:48, Michal Hocko wrote:
> On Fri 17-01-20 08:57:51, Dan Williams wrote:
> [...]
>> Unless the user is willing to hold the device_hotplug_lock over the
>> evaluation then the result is unreliable.
> 
> Do we want to hold the device_hotplug_lock from this user readable file
> in the first place? My book says that this just waits to become a
> problem.

It was the "big hammer" solution for this RFC.

I think we could do with a try_lock() on the device_lock() paired with a
device->removed flag. The latter is helpful for properly catching zombie
devices on the onlining/offlining path either way (and on my todo list).

> 
> Really, the interface is flawed and should have never been merged in the
> first place. We cannot simply remove it altogether I am afraid so let's
> at least remove the bogus code and pretend that the world is a better
> place where everything is removable except the reality sucks...

As I expressed already, the interface works as designed/documented and
has been used like that for years. I tend to agree that it never should
have been merged like that.

We have (at least) two places that are racy (with concurrent memory
hotplug):

1. /sys/.../memoryX/removable
- a) make it always return yes and make the interface useless
- b) add proper locking and keep it running as is (e.g., so David can
 identify offlineable memory blocks :) ).

2. /sys/.../memoryX/valid_zones
- a) always return "none" if the memory is online
- b) add proper locking and keep it running as is
- c) cache the result ("zone") when a block is onlined (e.g., in
mem->zone. If it is NULL, either mixed zones or unknown)

At least 2. already screams for proper device_lock() locking, as
mem->state is not stable across the function call.

1a and 2a are the easiest solutions but remove all ways to identify if a
memory block could theoretically be offlined - without trying
(especially, also to identify the MOVABLE zone).

I tend to prefer 1b) and 2c), paired with proper device_lock() locking.
We don't affect existing use cases but are able to simplify the code +
fix the races.
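
The try_lock() plus removed-flag idea mentioned earlier in this mail
could look roughly like the following user-space sketch. It is not kernel
code: a C11 atomic_flag stands in for device_lock(), and
struct memblock_model, its fields and memblock_show_removable() are
hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy memory block device; field names are illustrative only. */
struct memblock_model {
	atomic_flag lock; /* stands in for device_lock() */
	bool removed;     /* set once the device is being torn down */
	bool removable;   /* answer that is only stable under the lock */
};

/*
 * Report removability without blocking. Returns 1 or 0 on success,
 * -EAGAIN if the lock is contended, -ENODEV for a zombie device.
 */
int memblock_show_removable(struct memblock_model *mb)
{
	int ret;

	assert(mb);

	/* Trylock: back off instead of sleeping if someone holds the lock. */
	if (atomic_flag_test_and_set(&mb->lock))
		return -EAGAIN;

	if (mb->removed)
		ret = -ENODEV;
	else
		ret = mb->removable ? 1 : 0;

	atomic_flag_clear(&mb->lock);
	return ret;
}
```

A sysfs-style reader built this way never waits on a concurrent hotplug
operation; it either gets a consistent answer or is told to retry.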

What's your opinion? Any alternatives?

-- 
Thanks,

David / dhildenb



[PATCH v17 16/24] selftests/vm/pkeys: Fix assertion in test_pkey_alloc_exhaust()

2020-01-20 Thread Sandipan Das
From: Ram Pai 

Some pkeys which are valid on the hardware are reserved
and not available for application use. These keys cannot
be allocated.

test_pkey_alloc_exhaust() tries to account for these and
has an assertion which validates that all available pkeys
have been exhaustively allocated. However, the expression
that is currently used is only valid for x86. On powerpc,
one more pkey is reserved than on x86. Hence, the assertion
is made to use an arch-specific helper to get the correct
count of reserved pkeys.

cc: Dave Hansen 
cc: Florian Weimer 
Signed-off-by: Ram Pai 
Signed-off-by: Sandipan Das 
---
 tools/testing/selftests/vm/protection_keys.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/vm/protection_keys.c b/tools/testing/selftests/vm/protection_keys.c
index e6de078a9196..5fcbbc525364 100644
--- a/tools/testing/selftests/vm/protection_keys.c
+++ b/tools/testing/selftests/vm/protection_keys.c
@@ -1153,6 +1153,7 @@ void test_pkey_alloc_exhaust(int *ptr, u16 pkey)
dprintf3("%s()::%d\n", __func__, __LINE__);
 
/*
+* On x86:
 * There are 16 pkeys supported in hardware.  Three are
 * allocated by the time we get here:
 *   1. The default key (0)
@@ -1160,8 +1161,16 @@ void test_pkey_alloc_exhaust(int *ptr, u16 pkey)
 *   3. One allocated by the test code and passed in via
 *  'pkey' to this function.
 * Ensure that we can allocate at least another 13 (16-3).
+*
+* On powerpc:
+* There are either 5, 28, 29 or 32 pkeys supported in
+* hardware depending on the page size (4K or 64K) and
+* platform (powernv or powervm). Four are allocated by
+* the time we get here. These include pkey-0, pkey-1,
+* exec-only pkey and the one allocated by the test code.
+* Ensure that we can allocate the remaining.
 */
-   pkey_assert(i >= NR_PKEYS-3);
+   pkey_assert(i >= (NR_PKEYS - get_arch_reserved_keys() - 1));
 
for (i = 0; i < nr_allocated_pkeys; i++) {
err = sys_pkey_free(allocated_pkeys[i]);
-- 
2.17.1
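
The arithmetic behind the new assertion can be sketched as follows.
The struct and function names are hypothetical, and the key counts
used with them are illustrative; the selftest itself takes them from
NR_PKEYS and get_arch_reserved_keys().

```c
#include <assert.h>

/* Illustrative values only; the selftest takes them from per-arch headers. */
struct pkey_arch_sketch {
	int nr_pkeys;		/* keys the hardware supports */
	int nr_reserved;	/* keys the application can never allocate */
};

/*
 * Lower bound on successful pkey_alloc() calls inside the test:
 * everything except the reserved keys and the one key the test
 * harness already allocated and passed in.
 */
int min_expected_allocs(const struct pkey_arch_sketch *arch)
{
	assert(arch->nr_pkeys > arch->nr_reserved);
	return arch->nr_pkeys - arch->nr_reserved - 1;
}
```

With 16 keys and 2 reserved this gives 13, matching the "16-3" in the
x86 comment. With 32 keys and 3 reserved, one of the powerpc
configurations the comment describes, it gives 28.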



[PATCH v17 17/24] selftests/vm/pkeys: Improve checks to determine pkey support

2020-01-20 Thread Sandipan Das
From: Ram Pai 

For the pkeys subsystem to work, both the CPU and the
kernel need to have support. So, additionally check if
the kernel supports pkeys apart from the CPU feature
checks.

cc: Dave Hansen 
cc: Florian Weimer 
Signed-off-by: Ram Pai 
Signed-off-by: Sandipan Das 
---
 tools/testing/selftests/vm/pkey-helpers.h| 30 
 tools/testing/selftests/vm/pkey-powerpc.h|  3 +-
 tools/testing/selftests/vm/pkey-x86.h|  2 +-
 tools/testing/selftests/vm/protection_keys.c |  7 +++--
 4 files changed, 37 insertions(+), 5 deletions(-)

diff --git a/tools/testing/selftests/vm/pkey-helpers.h b/tools/testing/selftests/vm/pkey-helpers.h
index 2f4b1eb3a680..59ccdff18214 100644
--- a/tools/testing/selftests/vm/pkey-helpers.h
+++ b/tools/testing/selftests/vm/pkey-helpers.h
@@ -76,6 +76,8 @@ extern void abort_hooks(void);
 
 __attribute__((noinline)) int read_ptr(int *ptr);
 void expected_pkey_fault(int pkey);
+int sys_pkey_alloc(unsigned long flags, unsigned long init_val);
+int sys_pkey_free(unsigned long pkey);
 
 #if defined(__i386__) || defined(__x86_64__) /* arch */
 #include "pkey-x86.h"
@@ -186,4 +188,32 @@ static inline u32 *siginfo_get_pkey_ptr(siginfo_t *si)
 #endif
 }
 
+static inline int kernel_has_pkeys(void)
+{
+   /* try allocating a key and see if it succeeds */
+   int ret = sys_pkey_alloc(0, 0);
+   if (ret <= 0) {
+   return 0;
+   }
+   sys_pkey_free(ret);
+   return 1;
+}
+
+static inline int is_pkeys_supported(void)
+{
+   /* check if the cpu supports pkeys */
+   if (!cpu_has_pkeys()) {
+   dprintf1("SKIP: %s: no CPU support\n", __func__);
+   return 0;
+   }
+
+   /* check if the kernel supports pkeys */
+   if (!kernel_has_pkeys()) {
+   dprintf1("SKIP: %s: no kernel support\n", __func__);
+   return 0;
+   }
+
+   return 1;
+}
+
 #endif /* _PKEYS_HELPER_H */
diff --git a/tools/testing/selftests/vm/pkey-powerpc.h b/tools/testing/selftests/vm/pkey-powerpc.h
index 319673bbab0b..7d7c3ffafdd9 100644
--- a/tools/testing/selftests/vm/pkey-powerpc.h
+++ b/tools/testing/selftests/vm/pkey-powerpc.h
@@ -63,8 +63,9 @@ static inline void __write_pkey_reg(u64 pkey_reg)
__func__, __read_pkey_reg(), pkey_reg);
 }
 
-static inline int cpu_has_pku(void)
+static inline int cpu_has_pkeys(void)
 {
+   /* No simple way to determine this */
return 1;
 }
 
diff --git a/tools/testing/selftests/vm/pkey-x86.h b/tools/testing/selftests/vm/pkey-x86.h
index a0c59d4f7af2..6421b846aa16 100644
--- a/tools/testing/selftests/vm/pkey-x86.h
+++ b/tools/testing/selftests/vm/pkey-x86.h
@@ -97,7 +97,7 @@ static inline void __cpuid(unsigned int *eax, unsigned int *ebx,
 #define X86_FEATURE_PKU    (1<<3) /* Protection Keys for Userspace */
 #define X86_FEATURE_OSPKE  (1<<4) /* OS Protection Keys Enable */
 
-static inline int cpu_has_pku(void)
+static inline int cpu_has_pkeys(void)
 {
unsigned int eax;
unsigned int ebx;
diff --git a/tools/testing/selftests/vm/protection_keys.c b/tools/testing/selftests/vm/protection_keys.c
index 5fcbbc525364..95f173049f43 100644
--- a/tools/testing/selftests/vm/protection_keys.c
+++ b/tools/testing/selftests/vm/protection_keys.c
@@ -1378,7 +1378,7 @@ void test_mprotect_pkey_on_unsupported_cpu(int *ptr, u16 pkey)
int size = PAGE_SIZE;
int sret;
 
-   if (cpu_has_pku()) {
+   if (cpu_has_pkeys()) {
dprintf1("SKIP: %s: no CPU support\n", __func__);
return;
}
@@ -1447,12 +1447,13 @@ void pkey_setup_shadow(void)
 int main(void)
 {
int nr_iterations = 22;
+   int pkeys_supported = is_pkeys_supported();
 
setup_handlers();
 
-   printf("has pku: %d\n", cpu_has_pku());
+   printf("has pkeys: %d\n", pkeys_supported);
 
-   if (!cpu_has_pku()) {
+   if (!pkeys_supported) {
int size = PAGE_SIZE;
int *ptr;
 
-- 
2.17.1
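
Outside the selftest harness, the same kernel probe can be sketched
with raw syscalls. This assumes Linux and a libc whose headers define
SYS_pkey_alloc and SYS_pkey_free; the function name is hypothetical,
and where the syscall numbers are missing the sketch simply reports
no support.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* for syscall() */
#endif
#include <assert.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Returns 1 if a pkey could be allocated and freed again, 0 otherwise. */
int kernel_has_pkeys_sketch(void)
{
#if defined(SYS_pkey_alloc) && defined(SYS_pkey_free)
	/* arguments are (flags, initial access rights), both zero here */
	long key = syscall(SYS_pkey_alloc, 0UL, 0UL);

	if (key <= 0)	/* same check as the patch: error or key 0 */
		return 0;
	syscall(SYS_pkey_free, key);
	return 1;
#else
	return 0;	/* headers too old to know the syscall numbers */
#endif
}
```

Like the helper in the patch, this cannot tell a kernel without pkey
support from one whose keys are all currently in use; both cases
show up as a failed allocation.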



[PATCH v17 20/24] selftests/vm/pkeys: Detect write violation on a mapped access-denied-key page

2020-01-20 Thread Sandipan Das
From: Ram Pai 

Detect a write violation on a page with which an access-disabled
key is associated long after the page is mapped.

cc: Dave Hansen 
cc: Florian Weimer 
Signed-off-by: Ram Pai 
Acked-by: Dave Hansen 
Signed-off-by: Sandipan Das 
---
 tools/testing/selftests/vm/protection_keys.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/tools/testing/selftests/vm/protection_keys.c b/tools/testing/selftests/vm/protection_keys.c
index cb31a5cdf6d9..8bb4de103874 100644
--- a/tools/testing/selftests/vm/protection_keys.c
+++ b/tools/testing/selftests/vm/protection_keys.c
@@ -1027,6 +1027,18 @@ void test_write_of_access_disabled_region(int *ptr, u16 pkey)
*ptr = __LINE__;
expected_pkey_fault(pkey);
 }
+
+void test_write_of_access_disabled_region_with_page_already_mapped(int *ptr,
+   u16 pkey)
+{
+   *ptr = __LINE__;
+   dprintf1("disabling access; after accessing the page, "
+   " to PKEY[%02d], doing write\n", pkey);
+   pkey_access_deny(pkey);
+   *ptr = __LINE__;
+   expected_pkey_fault(pkey);
+}
+
 void test_kernel_write_of_access_disabled_region(int *ptr, u16 pkey)
 {
int ret;
@@ -1423,6 +1435,7 @@ void (*pkey_tests[])(int *ptr, u16 pkey) = {
test_write_of_write_disabled_region,
test_write_of_write_disabled_region_with_page_already_mapped,
test_write_of_access_disabled_region,
+   test_write_of_access_disabled_region_with_page_already_mapped,
test_kernel_write_of_access_disabled_region,
test_kernel_write_of_write_disabled_region,
test_kernel_gup_of_access_disabled_region,
-- 
2.17.1



[PATCH v17 05/24] selftests/vm/pkeys: Move some definitions to arch-specific header

2020-01-20 Thread Sandipan Das
From: Thiago Jung Bauermann 

In preparation for multi-arch support, move definitions which
have arch-specific values to an x86-specific header.

cc: Dave Hansen 
cc: Florian Weimer 
Signed-off-by: Ram Pai 
Signed-off-by: Thiago Jung Bauermann 
Acked-by: Dave Hansen 
Signed-off-by: Sandipan Das 
---
 tools/testing/selftests/vm/pkey-helpers.h| 111 +
 tools/testing/selftests/vm/pkey-x86.h| 156 +++
 tools/testing/selftests/vm/protection_keys.c |  47 --
 3 files changed, 162 insertions(+), 152 deletions(-)
 create mode 100644 tools/testing/selftests/vm/pkey-x86.h

diff --git a/tools/testing/selftests/vm/pkey-helpers.h b/tools/testing/selftests/vm/pkey-helpers.h
index 6ad1bd54ef94..3ed2f021bf7a 100644
--- a/tools/testing/selftests/vm/pkey-helpers.h
+++ b/tools/testing/selftests/vm/pkey-helpers.h
@@ -21,9 +21,6 @@
 
 #define PTR_ERR_ENOTSUP ((void *)-ENOTSUP)
 
-#define NR_PKEYS 16
-#define PKEY_BITS_PER_PKEY 2
-
 #ifndef DEBUG_LEVEL
 #define DEBUG_LEVEL 0
 #endif
@@ -73,19 +70,13 @@ extern void abort_hooks(void);
}   \
 } while (0)
 
+#if defined(__i386__) || defined(__x86_64__) /* arch */
+#include "pkey-x86.h"
+#else /* arch */
+#error Architecture not supported
+#endif /* arch */
+
 extern unsigned int shadow_pkey_reg;
-static inline unsigned int __read_pkey_reg(void)
-{
-   unsigned int eax, edx;
-   unsigned int ecx = 0;
-   unsigned int pkey_reg;
-
-   asm volatile(".byte 0x0f,0x01,0xee\n\t"
-: "=a" (eax), "=d" (edx)
-: "c" (ecx));
-   pkey_reg = eax;
-   return pkey_reg;
-}
 
 static inline unsigned int _read_pkey_reg(int line)
 {
@@ -100,19 +91,6 @@ static inline unsigned int _read_pkey_reg(int line)
 
 #define read_pkey_reg() _read_pkey_reg(__LINE__)
 
-static inline void __write_pkey_reg(unsigned int pkey_reg)
-{
-   unsigned int eax = pkey_reg;
-   unsigned int ecx = 0;
-   unsigned int edx = 0;
-
-   dprintf4("%s() changing %08x to %08x\n", __func__,
-   __read_pkey_reg(), pkey_reg);
-   asm volatile(".byte 0x0f,0x01,0xef\n\t"
-: : "a" (eax), "c" (ecx), "d" (edx));
-   assert(pkey_reg == __read_pkey_reg());
-}
-
 static inline void write_pkey_reg(unsigned int pkey_reg)
 {
dprintf4("%s() changing %08x to %08x\n", __func__,
@@ -157,83 +135,6 @@ static inline void __pkey_write_allow(int pkey, int do_allow_write)
dprintf4("pkey_reg now: %08x\n", read_pkey_reg());
 }
 
-#define PAGE_SIZE 4096
-#define MB (1<<20)
-
-static inline void __cpuid(unsigned int *eax, unsigned int *ebx,
-   unsigned int *ecx, unsigned int *edx)
-{
-   /* ecx is often an input as well as an output. */
-   asm volatile(
-   "cpuid;"
-   : "=a" (*eax),
- "=b" (*ebx),
- "=c" (*ecx),
- "=d" (*edx)
-   : "0" (*eax), "2" (*ecx));
-}
-
-/* Intel-defined CPU features, CPUID level 0x0007:0 (ecx) */
-#define X86_FEATURE_PKU    (1<<3) /* Protection Keys for Userspace */
-#define X86_FEATURE_OSPKE  (1<<4) /* OS Protection Keys Enable */
-
-static inline int cpu_has_pku(void)
-{
-   unsigned int eax;
-   unsigned int ebx;
-   unsigned int ecx;
-   unsigned int edx;
-
-   eax = 0x7;
-   ecx = 0x0;
-   __cpuid(&eax, &ebx, &ecx, &edx);
-
-   if (!(ecx & X86_FEATURE_PKU)) {
-   dprintf2("cpu does not have PKU\n");
-   return 0;
-   }
-   if (!(ecx & X86_FEATURE_OSPKE)) {
-   dprintf2("cpu does not have OSPKE\n");
-   return 0;
-   }
-   return 1;
-}
-
-#define XSTATE_PKEY_BIT    (9)
-#define XSTATE_PKEY        0x200
-
-int pkey_reg_xstate_offset(void)
-{
-   unsigned int eax;
-   unsigned int ebx;
-   unsigned int ecx;
-   unsigned int edx;
-   int xstate_offset;
-   int xstate_size;
-   unsigned long XSTATE_CPUID = 0xd;
-   int leaf;
-
-   /* assume that XSTATE_PKEY is set in XCR0 */
-   leaf = XSTATE_PKEY_BIT;
-   {
-   eax = XSTATE_CPUID;
-   ecx = leaf;
-   __cpuid(&eax, &ebx, &ecx, &edx);
-
-   if (leaf == XSTATE_PKEY_BIT) {
-   xstate_offset = ebx;
-   xstate_size = eax;
-   }
-   }
-
-   if (xstate_size == 0) {
-   printf("could not find size/offset of PKEY in xsave state\n");
-   return 0;
-   }
-
-   return xstate_offset;
-}
-
 #define ARRAY_SIZE(x) (sizeof(x) / sizeof(*(x)))
 #define ALIGN_UP(x, align_to)  (((x) + ((align_to)-1)) & ~((align_to)-1))
 #define ALIGN_DOWN(x, align_to) ((x) & ~((align_to)-1))
diff --git a/tools/testing/selftests/vm/pkey-x86.h b/tools/testing/selftests/vm/pkey-x86.h
new file mode 100644
index 000000000000..2f04ade8ca9c
--- /dev/null
+++ b/tools/testing/selftests/vm/pkey-x86.h
@@ -0,0 +1,156 @@
+/* 

[PATCH v17 11/24] selftests/vm/pkeys: Fix alloc_random_pkey() to make it really random

2020-01-20 Thread Sandipan Das
From: Ram Pai 

alloc_random_pkey() was allocating the same pkey every
time, so not all pkeys were getting tested. This fixes it.

cc: Dave Hansen 
cc: Florian Weimer 
Signed-off-by: Ram Pai 
Acked-by: Dave Hansen 
Signed-off-by: Sandipan Das 
---
 tools/testing/selftests/vm/protection_keys.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/vm/protection_keys.c b/tools/testing/selftests/vm/protection_keys.c
index 7fd52d5c4bfd..9cc82b65f828 100644
--- a/tools/testing/selftests/vm/protection_keys.c
+++ b/tools/testing/selftests/vm/protection_keys.c
@@ -25,6 +25,7 @@
 #define __SANE_USERSPACE_TYPES__
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -546,10 +547,10 @@ int alloc_random_pkey(void)
int nr_alloced = 0;
int random_index;
memset(alloced_pkeys, 0, sizeof(alloced_pkeys));
+   srand((unsigned int)time(NULL));
 
/* allocate every possible key and make a note of which ones we got */
max_nr_pkey_allocs = NR_PKEYS;
-   max_nr_pkey_allocs = 1;
for (i = 0; i < max_nr_pkey_allocs; i++) {
int new_pkey = alloc_pkey();
if (new_pkey < 0)
-- 
2.17.1
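
The selection step can be sketched in isolation as below. The helper
names are hypothetical. With only one key in the pool, which is what
the stray "max_nr_pkey_allocs = 1" amounted to, the pick is the same
every time regardless of seeding.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/*
 * Pick one entry of keys[0..nr-1] at random. Seeding is left to the
 * caller so that it happens once per process, not once per pick.
 */
int pick_random_key(const int *keys, int nr)
{
	assert(keys && nr > 0);
	return keys[rand() % nr];
}

/* Seed the generator from the wall clock, as the patch does. */
void seed_from_clock(void)
{
	srand((unsigned int)time(NULL));
}
```

Seeding once per process is a choice made for this sketch: calling
srand(time(NULL)) twice within the same second restarts the same
sequence.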



[PATCH v17 13/24] selftests/vm/pkeys: Introduce generic pkey abstractions

2020-01-20 Thread Sandipan Das
From: Ram Pai 

This introduces some generic abstractions and provides
the corresponding architecture-specific implementations
for these abstractions.

cc: Dave Hansen 
cc: Florian Weimer 
Signed-off-by: Ram Pai 
Signed-off-by: Thiago Jung Bauermann 
Signed-off-by: Sandipan Das 
---
 tools/testing/selftests/vm/pkey-helpers.h| 12 
 tools/testing/selftests/vm/pkey-x86.h| 15 +++
 tools/testing/selftests/vm/protection_keys.c |  8 ++--
 3 files changed, 29 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/vm/pkey-helpers.h b/tools/testing/selftests/vm/pkey-helpers.h
index 0e3da7c8d628..621fb2a0a5ef 100644
--- a/tools/testing/selftests/vm/pkey-helpers.h
+++ b/tools/testing/selftests/vm/pkey-helpers.h
@@ -74,6 +74,9 @@ extern void abort_hooks(void);
}   \
 } while (0)
 
+__attribute__((noinline)) int read_ptr(int *ptr);
+void expected_pkey_fault(int pkey);
+
 #if defined(__i386__) || defined(__x86_64__) /* arch */
 #include "pkey-x86.h"
 #else /* arch */
@@ -172,4 +175,13 @@ static inline void __pkey_write_allow(int pkey, int do_allow_write)
 #define __stringify_1(x...) #x
 #define __stringify(x...)   __stringify_1(x)
 
+static inline u32 *siginfo_get_pkey_ptr(siginfo_t *si)
+{
+#ifdef si_pkey
+   return &si->si_pkey;
+#else
+   return (u32 *)(((u8 *)si) + si_pkey_offset);
+#endif
+}
+
 #endif /* _PKEYS_HELPER_H */
diff --git a/tools/testing/selftests/vm/pkey-x86.h b/tools/testing/selftests/vm/pkey-x86.h
index def2a1bcf6a5..a0c59d4f7af2 100644
--- a/tools/testing/selftests/vm/pkey-x86.h
+++ b/tools/testing/selftests/vm/pkey-x86.h
@@ -42,6 +42,7 @@
 #endif
 
 #define NR_PKEYS   16
+#define NR_RESERVED_PKEYS  2 /* pkey-0 and exec-only-pkey */
 #define PKEY_BITS_PER_PKEY 2
 #define HPAGE_SIZE (1UL<<21)
 #define PAGE_SIZE  4096
@@ -158,4 +159,18 @@ int pkey_reg_xstate_offset(void)
return xstate_offset;
 }
 
+static inline int get_arch_reserved_keys(void)
+{
+   return NR_RESERVED_PKEYS;
+}
+
+void expect_fault_on_read_execonly_key(void *p1, int pkey)
+{
+   int ptr_contents;
+
+   ptr_contents = read_ptr(p1);
+   dprintf2("ptr (%p) contents@%d: %x\n", p1, __LINE__, ptr_contents);
+   expected_pkey_fault(pkey);
+}
+
 #endif /* _PKEYS_X86_H */
diff --git a/tools/testing/selftests/vm/protection_keys.c b/tools/testing/selftests/vm/protection_keys.c
index 535e464e27e9..57c71056c93d 100644
--- a/tools/testing/selftests/vm/protection_keys.c
+++ b/tools/testing/selftests/vm/protection_keys.c
@@ -1307,9 +1307,7 @@ void test_executing_on_unreadable_memory(int *ptr, u16 pkey)
madvise(p1, PAGE_SIZE, MADV_DONTNEED);
lots_o_noops_around_write();
do_not_expect_pkey_fault("executing on PROT_EXEC memory");
-   ptr_contents = read_ptr(p1);
-   dprintf2("ptr (%p) contents@%d: %x\n", p1, __LINE__, ptr_contents);
-   expected_pkey_fault(pkey);
+   expect_fault_on_read_execonly_key(p1, pkey);
 }
 
 void test_implicit_mprotect_exec_only_memory(int *ptr, u16 pkey)
@@ -1336,9 +1334,7 @@ void test_implicit_mprotect_exec_only_memory(int *ptr, u16 pkey)
madvise(p1, PAGE_SIZE, MADV_DONTNEED);
lots_o_noops_around_write();
do_not_expect_pkey_fault("executing on PROT_EXEC memory");
-   ptr_contents = read_ptr(p1);
-   dprintf2("ptr (%p) contents@%d: %x\n", p1, __LINE__, ptr_contents);
-   expected_pkey_fault(UNKNOWN_PKEY);
+   expect_fault_on_read_execonly_key(p1, UNKNOWN_PKEY);
 
/*
 * Put the memory back to non-PROT_EXEC.  Should clear the
-- 
2.17.1
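
The offset fallback in siginfo_get_pkey_ptr() can be illustrated with
a made-up record. The struct below is hypothetical and its layout has
nothing to do with the real siginfo_t; it only shows how a field is
reached through a byte offset when the headers do not name it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record standing in for siginfo_t; layout is made up. */
struct fake_siginfo {
	int signo;
	int code;
	void *addr;
	uint32_t pkey;
};

/* Offset a build without the member name would have to hardcode. */
#define FAKE_PKEY_OFFSET offsetof(struct fake_siginfo, pkey)

/* Locate the pkey field through a byte offset instead of a member name. */
uint32_t *fake_siginfo_pkey_ptr(struct fake_siginfo *si)
{
	assert(si);
	return (uint32_t *)((uint8_t *)si + FAKE_PKEY_OFFSET);
}
```

In the selftest the offset is a per-architecture constant
(si_pkey_offset) rather than something derived with offsetof().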



[PATCH v17 07/24] selftests: vm: pkeys: Use sane types for pkey register

2020-01-20 Thread Sandipan Das
The size of the pkey register can vary across architectures.
This converts the data type of all its references to u64 in
preparation for multi-arch support.

To keep the definition of the u64 type consistent and remove
format specifier related warnings, __SANE_USERSPACE_TYPES__
is defined as suggested by Michael Ellerman.

Signed-off-by: Sandipan Das 
---
 tools/testing/selftests/vm/pkey-helpers.h| 31 +++
 tools/testing/selftests/vm/pkey-x86.h|  8 +-
 tools/testing/selftests/vm/protection_keys.c | 86 
 3 files changed, 72 insertions(+), 53 deletions(-)

diff --git a/tools/testing/selftests/vm/pkey-helpers.h b/tools/testing/selftests/vm/pkey-helpers.h
index 7f18a82e54fc..dfbce49269ce 100644
--- a/tools/testing/selftests/vm/pkey-helpers.h
+++ b/tools/testing/selftests/vm/pkey-helpers.h
@@ -14,10 +14,10 @@
 #include 
 
 /* Define some kernel-like types */
-#define  u8 uint8_t
-#define u16 uint16_t
-#define u32 uint32_t
-#define u64 uint64_t
+#define  u8 __u8
+#define u16 __u16
+#define u32 __u32
+#define u64 __u64
 
 #define PTR_ERR_ENOTSUP ((void *)-ENOTSUP)
 
@@ -80,13 +80,14 @@ extern void abort_hooks(void);
 #error Architecture not supported
 #endif /* arch */
 
-extern unsigned int shadow_pkey_reg;
+extern u64 shadow_pkey_reg;
 
-static inline unsigned int _read_pkey_reg(int line)
+static inline u64 _read_pkey_reg(int line)
 {
-   unsigned int pkey_reg = __read_pkey_reg();
+   u64 pkey_reg = __read_pkey_reg();
 
-   dprintf4("read_pkey_reg(line=%d) pkey_reg: %x shadow: %x\n",
+   dprintf4("read_pkey_reg(line=%d) pkey_reg: %016llx"
+   " shadow: %016llx\n",
line, pkey_reg, shadow_pkey_reg);
assert(pkey_reg == shadow_pkey_reg);
 
@@ -95,15 +96,15 @@ static inline unsigned int _read_pkey_reg(int line)
 
 #define read_pkey_reg() _read_pkey_reg(__LINE__)
 
-static inline void write_pkey_reg(unsigned int pkey_reg)
+static inline void write_pkey_reg(u64 pkey_reg)
 {
-   dprintf4("%s() changing %08x to %08x\n", __func__,
+   dprintf4("%s() changing %016llx to %016llx\n", __func__,
__read_pkey_reg(), pkey_reg);
/* will do the shadow check for us: */
read_pkey_reg();
__write_pkey_reg(pkey_reg);
shadow_pkey_reg = pkey_reg;
-   dprintf4("%s(%08x) pkey_reg: %08x\n", __func__,
+   dprintf4("%s(%016llx) pkey_reg: %016llx\n", __func__,
pkey_reg, __read_pkey_reg());
 }
 
@@ -113,7 +114,7 @@ static inline void write_pkey_reg(unsigned int pkey_reg)
  */
 static inline void __pkey_access_allow(int pkey, int do_allow)
 {
-   unsigned int pkey_reg = read_pkey_reg();
+   u64 pkey_reg = read_pkey_reg();
int bit = pkey * 2;
 
if (do_allow)
@@ -121,13 +122,13 @@ static inline void __pkey_access_allow(int pkey, int do_allow)
else
pkey_reg |= (1<
 #include 
 #include 
@@ -48,7 +49,7 @@
 int iteration_nr = 1;
 int test_nr;
 
-unsigned int shadow_pkey_reg;
+u64 shadow_pkey_reg;
 int dprint_in_signal;
 char dprint_in_signal_buffer[DPRINT_IN_SIGNAL_BUF_SIZE];
 
@@ -163,7 +164,7 @@ void dump_mem(void *dumpme, int len_bytes)
 
for (i = 0; i < len_bytes; i += sizeof(u64)) {
u64 *ptr = (u64 *)(c + i);
-   dprintf1("dump[%03d][@%p]: %016jx\n", i, ptr, *ptr);
+   dprintf1("dump[%03d][@%p]: %016llx\n", i, ptr, *ptr);
}
 }
 
@@ -205,7 +206,8 @@ void signal_handler(int signum, siginfo_t *si, void *vucontext)
 
dprint_in_signal = 1;
dprintf1("===SIGSEGV\n");
-   dprintf1("%s()::%d, pkey_reg: 0x%x shadow: %x\n", __func__, __LINE__,
+   dprintf1("%s()::%d, pkey_reg: 0x%016llx shadow: %016llx\n",
+   __func__, __LINE__,
__read_pkey_reg(), shadow_pkey_reg);
 
trapno = uctxt->uc_mcontext.gregs[REG_TRAPNO];
@@ -213,8 +215,9 @@ void signal_handler(int signum, siginfo_t *si, void *vucontext)
fpregset = uctxt->uc_mcontext.fpregs;
fpregs = (void *)fpregset;
 
-   dprintf2("%s() trapno: %d ip: 0x%lx info->si_code: %s/%d\n", __func__,
-   trapno, ip, si_code_str(si->si_code), si->si_code);
+   dprintf2("%s() trapno: %d ip: 0x%016lx info->si_code: %s/%d\n",
+   __func__, trapno, ip, si_code_str(si->si_code),
+   si->si_code);
 #ifdef __i386__
/*
 * 32-bit has some extra padding so that userspace can tell whether
@@ -256,8 +259,9 @@ void signal_handler(int signum, siginfo_t *si, void *vucontext)
 * need __read_pkey_reg() version so we do not do shadow_pkey_reg
 * checking
 */
-   dprintf1("signal pkey_reg from  pkey_reg: %08x\n", __read_pkey_reg());
-   dprintf1("pkey from siginfo: %jx\n", siginfo_pkey);
+   dprintf1("signal pkey_reg from  pkey_reg: %016llx\n",
+   __read_pkey_reg());

[PATCH v17 22/24] selftests/vm/pkeys: Test correct behaviour of pkey-0

2020-01-20 Thread Sandipan Das
From: Ram Pai 

Ensure that pkey-0 is allocated on start and that it can
be attached dynamically in various modes, without failures.

cc: Dave Hansen 
cc: Florian Weimer 
Signed-off-by: Ram Pai 
Signed-off-by: Sandipan Das 
---
 tools/testing/selftests/vm/protection_keys.c | 53 
 1 file changed, 53 insertions(+)

diff --git a/tools/testing/selftests/vm/protection_keys.c b/tools/testing/selftests/vm/protection_keys.c
index d4952b57cc90..a1cb9a71e77c 100644
--- a/tools/testing/selftests/vm/protection_keys.c
+++ b/tools/testing/selftests/vm/protection_keys.c
@@ -964,6 +964,58 @@ __attribute__((noinline)) int read_ptr(int *ptr)
return *ptr;
 }
 
+void test_pkey_alloc_free_attach_pkey0(int *ptr, u16 pkey)
+{
+   int i, err;
+   int max_nr_pkey_allocs;
+   int alloced_pkeys[NR_PKEYS];
+   int nr_alloced = 0;
+   long size;
+
+   pkey_assert(pkey_last_malloc_record);
+   size = pkey_last_malloc_record->size;
+   /*
+* This is a bit of a hack.  But mprotect() requires
+* huge-page-aligned sizes when operating on hugetlbfs.
+* So, make sure that we use something that's a multiple
+* of a huge page when we can.
+*/
+   if (size >= HPAGE_SIZE)
+   size = HPAGE_SIZE;
+
+   /* allocate every possible key and make sure key-0 never got allocated */
+   max_nr_pkey_allocs = NR_PKEYS;
+   for (i = 0; i < max_nr_pkey_allocs; i++) {
+   int new_pkey = alloc_pkey();
+   pkey_assert(new_pkey != 0);
+
+   if (new_pkey < 0)
+   break;
+   alloced_pkeys[nr_alloced++] = new_pkey;
+   }
+   /* free all the allocated keys */
+   for (i = 0; i < nr_alloced; i++) {
+   int free_ret;
+
+   if (!alloced_pkeys[i])
+   continue;
+   free_ret = sys_pkey_free(alloced_pkeys[i]);
+   pkey_assert(!free_ret);
+   }
+
+   /* attach key-0 in various modes */
+   err = sys_mprotect_pkey(ptr, size, PROT_READ, 0);
+   pkey_assert(!err);
+   err = sys_mprotect_pkey(ptr, size, PROT_WRITE, 0);
+   pkey_assert(!err);
+   err = sys_mprotect_pkey(ptr, size, PROT_EXEC, 0);
+   pkey_assert(!err);
+   err = sys_mprotect_pkey(ptr, size, PROT_READ|PROT_WRITE, 0);
+   pkey_assert(!err);
+   err = sys_mprotect_pkey(ptr, size, PROT_READ|PROT_WRITE|PROT_EXEC, 0);
+   pkey_assert(!err);
+}
+
 void test_read_of_write_disabled_region(int *ptr, u16 pkey)
 {
int ptr_contents;
@@ -1448,6 +1500,7 @@ void (*pkey_tests[])(int *ptr, u16 pkey) = {
test_pkey_syscalls_on_non_allocated_pkey,
test_pkey_syscalls_bad_args,
test_pkey_alloc_exhaust,
+   test_pkey_alloc_free_attach_pkey0,
 };
 
 void run_tests_once(void)
-- 
2.17.1



[PATCH v17 23/24] selftests/vm/pkeys: Override access right definitions on powerpc

2020-01-20 Thread Sandipan Das
From: Ram Pai 

Some platforms hardcode the x86 values for PKEY_DISABLE_ACCESS
and PKEY_DISABLE_WRITE in their headers, for example in
/usr/include/bits/mman-shared.h.

This overrides the definitions with correct values for powerpc.

cc: Dave Hansen 
cc: Florian Weimer 
Signed-off-by: Ram Pai 
Signed-off-by: Sandipan Das 
---
 tools/testing/selftests/vm/pkey-powerpc.h | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/vm/pkey-powerpc.h b/tools/testing/selftests/vm/pkey-powerpc.h
index d31665c48f5e..02bd4dd7d467 100644
--- a/tools/testing/selftests/vm/pkey-powerpc.h
+++ b/tools/testing/selftests/vm/pkey-powerpc.h
@@ -16,11 +16,13 @@
 #define fpregs fp_regs
 #define si_pkey_offset 0x20
 
-#ifndef PKEY_DISABLE_ACCESS
+#ifdef PKEY_DISABLE_ACCESS
+#undef PKEY_DISABLE_ACCESS
 # define PKEY_DISABLE_ACCESS   0x3  /* disable read and write */
 #endif
 
-#ifndef PKEY_DISABLE_WRITE
+#ifdef PKEY_DISABLE_WRITE
+#undef PKEY_DISABLE_WRITE
 # define PKEY_DISABLE_WRITE    0x2
 #endif
 
-- 
2.17.1
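
For comparison, here is a sketch of an unconditional override, which
is not what the patch above does. The first define only simulates a
libc header shipping the x86 value, and pkey_rights_mask() is a
hypothetical helper added so the result can be checked.

```c
#include <assert.h>

/* Stand-in for a libc header that ships the x86 encoding. */
#define PKEY_DISABLE_ACCESS	0x1
/* PKEY_DISABLE_WRITE is deliberately left undefined at this point. */

/* Unconditional override: #undef of an undefined name is a no-op. */
#undef PKEY_DISABLE_ACCESS
#define PKEY_DISABLE_ACCESS	0x3	/* disable read and write */

#undef PKEY_DISABLE_WRITE
#define PKEY_DISABLE_WRITE	0x2

static_assert(PKEY_DISABLE_ACCESS == 0x3, "powerpc encoding expected");

/* Union of both disable bits, using the overridden values. */
int pkey_rights_mask(void)
{
	return PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE;
}
```

With the #ifdef form in the patch, a macro that the system headers do
not provide at all stays undefined; the unconditional form covers
that case as well.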



[PATCH v17 19/24] selftests/vm/pkeys: Associate key on a mapped page and detect write violation

2020-01-20 Thread Sandipan Das
From: Ram Pai 

Detect a write violation on a page with which a write-disabled
key is associated long after the page is mapped.

cc: Dave Hansen 
cc: Florian Weimer 
Signed-off-by: Ram Pai 
Acked-by: Dave Hansen 
Signed-off-by: Sandipan Das 
---
 tools/testing/selftests/vm/protection_keys.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/tools/testing/selftests/vm/protection_keys.c b/tools/testing/selftests/vm/protection_keys.c
index f65d384ef6a0..cb31a5cdf6d9 100644
--- a/tools/testing/selftests/vm/protection_keys.c
+++ b/tools/testing/selftests/vm/protection_keys.c
@@ -1002,6 +1002,17 @@ void test_read_of_access_disabled_region_with_page_already_mapped(int *ptr,
expected_pkey_fault(pkey);
 }
 
+void test_write_of_write_disabled_region_with_page_already_mapped(int *ptr,
+   u16 pkey)
+{
+   *ptr = __LINE__;
+   dprintf1("disabling write access; after accessing the page, "
+   "to PKEY[%02d], doing write\n", pkey);
+   pkey_write_deny(pkey);
+   *ptr = __LINE__;
+   expected_pkey_fault(pkey);
+}
+
 void test_write_of_write_disabled_region(int *ptr, u16 pkey)
 {
dprintf1("disabling write access to PKEY[%02d], doing write\n", pkey);
@@ -1410,6 +1421,7 @@ void (*pkey_tests[])(int *ptr, u16 pkey) = {
test_read_of_access_disabled_region,
test_read_of_access_disabled_region_with_page_already_mapped,
test_write_of_write_disabled_region,
+   test_write_of_write_disabled_region_with_page_already_mapped,
test_write_of_access_disabled_region,
test_kernel_write_of_access_disabled_region,
test_kernel_write_of_write_disabled_region,
-- 
2.17.1



[PATCH v17 06/24] selftests/vm/pkeys: Make gcc check arguments of sigsafe_printf()

2020-01-20 Thread Sandipan Das
From: Thiago Jung Bauermann 

This will help us ensure we print pkey_reg_t values correctly in
different architectures.

Signed-off-by: Thiago Jung Bauermann 
Signed-off-by: Sandipan Das 
---
 tools/testing/selftests/vm/pkey-helpers.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/tools/testing/selftests/vm/pkey-helpers.h b/tools/testing/selftests/vm/pkey-helpers.h
index 3ed2f021bf7a..7f18a82e54fc 100644
--- a/tools/testing/selftests/vm/pkey-helpers.h
+++ b/tools/testing/selftests/vm/pkey-helpers.h
@@ -27,6 +27,10 @@
 #define DPRINT_IN_SIGNAL_BUF_SIZE 4096
 extern int dprint_in_signal;
 extern char dprint_in_signal_buffer[DPRINT_IN_SIGNAL_BUF_SIZE];
+
+#ifdef __GNUC__
+__attribute__((format(printf, 1, 2)))
+#endif
 static inline void sigsafe_printf(const char *format, ...)
 {
va_list ap;
-- 
2.17.1
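
A small self-contained sketch of the same attribute on a different,
hypothetical helper. Here the format string is the third parameter,
so the attribute reads format(printf, 3, 4); sigsafe_printf() takes
the format first, hence (1, 2) in the patch.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * printf-style helper writing into a caller-supplied buffer. The
 * attribute says that parameter 3 is the format string and the
 * variadic arguments start at position 4, which lets -Wformat
 * check every call site.
 */
#ifdef __GNUC__
__attribute__((format(printf, 3, 4)))
#endif
int checked_snprintf(char *buf, size_t len, const char *format, ...)
{
	va_list ap;
	int n;

	assert(buf && len > 0);
	va_start(ap, format);
	n = vsnprintf(buf, len, format, ap);
	va_end(ap);
	return n;
}
```

With the attribute in place, passing a 64-bit register value to a
"%x" conversion draws a format warning at compile time (with -Wformat
enabled) instead of printing a truncated value.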



[PATCH v17 08/24] selftests: vm: pkeys: Add helpers for pkey bits

2020-01-20 Thread Sandipan Das
This introduces some functions that help with setting
or clearing bits of a particular pkey. This also adds
an abstraction for getting a pkey's bit position in
the pkey register as this may vary across architectures.

Signed-off-by: Sandipan Das 
---
 tools/testing/selftests/vm/pkey-helpers.h| 22 ++
 tools/testing/selftests/vm/pkey-x86.h|  5 +++
 tools/testing/selftests/vm/protection_keys.c | 32 ++--
 3 files changed, 36 insertions(+), 23 deletions(-)

diff --git a/tools/testing/selftests/vm/pkey-helpers.h b/tools/testing/selftests/vm/pkey-helpers.h
index dfbce49269ce..0e3da7c8d628 100644
--- a/tools/testing/selftests/vm/pkey-helpers.h
+++ b/tools/testing/selftests/vm/pkey-helpers.h
@@ -80,6 +80,28 @@ extern void abort_hooks(void);
 #error Architecture not supported
 #endif /* arch */
 
+#define PKEY_MASK  (PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE)
+
+static inline u64 set_pkey_bits(u64 reg, int pkey, u64 flags)
+{
+   u32 shift = pkey_bit_position(pkey);
+   /* mask out bits from pkey in old value */
+   reg &= ~((u64)PKEY_MASK << shift);
+   /* OR in new bits for pkey */
+   reg |= (flags & PKEY_MASK) << shift;
+   return reg;
+}
+
+static inline u64 get_pkey_bits(u64 reg, int pkey)
+{
+   u32 shift = pkey_bit_position(pkey);
+   /*
+* shift down the relevant bits to the lowest two, then
+* mask off all the other higher bits
+*/
+   return ((reg >> shift) & PKEY_MASK);
+}
+
 extern u64 shadow_pkey_reg;
 
 static inline u64 _read_pkey_reg(int line)
diff --git a/tools/testing/selftests/vm/pkey-x86.h b/tools/testing/selftests/vm/pkey-x86.h
index 6ffea27e2d2d..def2a1bcf6a5 100644
--- a/tools/testing/selftests/vm/pkey-x86.h
+++ b/tools/testing/selftests/vm/pkey-x86.h
@@ -118,6 +118,11 @@ static inline int cpu_has_pku(void)
return 1;
 }
 
+static inline u32 pkey_bit_position(int pkey)
+{
+   return pkey * PKEY_BITS_PER_PKEY;
+}
+
 #define XSTATE_PKEY_BIT    (9)
 #define XSTATE_PKEY        0x200
 
diff --git a/tools/testing/selftests/vm/protection_keys.c b/tools/testing/selftests/vm/protection_keys.c
index efa35cc6f6b9..bed9d4de12b4 100644
--- a/tools/testing/selftests/vm/protection_keys.c
+++ b/tools/testing/selftests/vm/protection_keys.c
@@ -334,25 +334,13 @@ pid_t fork_lazy_child(void)
 
 static u32 hw_pkey_get(int pkey, unsigned long flags)
 {
-   u32 mask = (PKEY_DISABLE_ACCESS|PKEY_DISABLE_WRITE);
u64 pkey_reg = __read_pkey_reg();
-   u64 shifted_pkey_reg;
-   u32 masked_pkey_reg;
 
dprintf1("%s(pkey=%d, flags=%lx) = %x / %d\n",
__func__, pkey, flags, 0, 0);
dprintf2("%s() raw pkey_reg: %016llx\n", __func__, pkey_reg);
 
-   shifted_pkey_reg = (pkey_reg >> (pkey * PKEY_BITS_PER_PKEY));
-   dprintf2("%s() shifted_pkey_reg: %016llx\n", __func__,
-   shifted_pkey_reg);
-   masked_pkey_reg = shifted_pkey_reg & mask;
-   dprintf2("%s() masked  pkey_reg: %x\n", __func__, masked_pkey_reg);
-   /*
-* shift down the relevant bits to the lowest two, then
-* mask off all the other high bits.
-*/
-   return masked_pkey_reg;
+   return (u32) get_pkey_bits(pkey_reg, pkey);
 }
 
 static int hw_pkey_set(int pkey, unsigned long rights, unsigned long flags)
@@ -364,12 +352,8 @@ static int hw_pkey_set(int pkey, unsigned long rights, unsigned long flags)
/* make sure that 'rights' only contains the bits we expect: */
assert(!(rights & ~mask));
 
-   /* copy old pkey_reg */
-   new_pkey_reg = old_pkey_reg;
-   /* mask out bits from pkey in old value: */
-   new_pkey_reg &= ~(mask << (pkey * PKEY_BITS_PER_PKEY));
-   /* OR in new bits for pkey: */
-   new_pkey_reg |= (rights << (pkey * PKEY_BITS_PER_PKEY));
+   /* modify bits accordingly in old pkey_reg and assign it */
+   new_pkey_reg = set_pkey_bits(old_pkey_reg, pkey, rights);
 
__write_pkey_reg(new_pkey_reg);
 
@@ -403,7 +387,7 @@ void pkey_disable_set(int pkey, int flags)
ret = hw_pkey_set(pkey, pkey_rights, syscall_flags);
assert(!ret);
/* pkey_reg and flags have the same format */
-   shadow_pkey_reg |= flags << (pkey * 2);
+   shadow_pkey_reg = set_pkey_bits(shadow_pkey_reg, pkey, pkey_rights);
dprintf1("%s(%d) shadow: 0x%016llx\n",
__func__, pkey, shadow_pkey_reg);
 
@@ -437,7 +421,7 @@ void pkey_disable_clear(int pkey, int flags)
pkey_rights |= flags;
 
ret = hw_pkey_set(pkey, pkey_rights, 0);
-   shadow_pkey_reg &= ~(flags << (pkey * 2));
+   shadow_pkey_reg = set_pkey_bits(shadow_pkey_reg, pkey, pkey_rights);
pkey_assert(ret >= 0);
 
pkey_rights = hw_pkey_get(pkey, syscall_flags);
@@ -513,7 +497,8 @@ int alloc_pkey(void)
shadow_pkey_reg);
if (ret) {
/* clear both the bits: */
-   shadow_pkey_reg &= ~(0x3 

[PATCH v17 09/24] selftests/vm/pkeys: Fix pkey_disable_clear()

2020-01-20 Thread Sandipan Das
From: Ram Pai 

Currently, pkey_disable_clear() sets the specified bits
instead of clearing them. This has been dead code up to now
because its only callers, i.e. pkey_access_allow() and
pkey_write_allow(), are also unused.

cc: Dave Hansen 
cc: Florian Weimer 
Signed-off-by: Ram Pai 
Acked-by: Dave Hansen 
Signed-off-by: Sandipan Das 
---
 tools/testing/selftests/vm/protection_keys.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/vm/protection_keys.c 
b/tools/testing/selftests/vm/protection_keys.c
index bed9d4de12b4..4b1ddb526228 100644
--- a/tools/testing/selftests/vm/protection_keys.c
+++ b/tools/testing/selftests/vm/protection_keys.c
@@ -418,7 +418,7 @@ void pkey_disable_clear(int pkey, int flags)
pkey, pkey, pkey_rights);
pkey_assert(pkey_rights >= 0);
 
-   pkey_rights |= flags;
+   pkey_rights &= ~flags;
 
ret = hw_pkey_set(pkey, pkey_rights, 0);
shadow_pkey_reg = set_pkey_bits(shadow_pkey_reg, pkey, pkey_rights);
@@ -431,7 +431,7 @@ void pkey_disable_clear(int pkey, int flags)
dprintf1("%s(%d) pkey_reg: 0x%016llx\n", __func__,
pkey, read_pkey_reg());
if (flags)
-   assert(read_pkey_reg() > orig_pkey_reg);
+   assert(read_pkey_reg() < orig_pkey_reg);
 }
 
 void pkey_write_allow(int pkey)
-- 
2.17.1
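
The effect of the one-operator change in pkey_disable_clear() can be seen in isolation. The following is a minimal sketch, not code from the selftest: the sketch_* names are hypothetical, and the flag values are assumed to be the generic UAPI ones (0x1 for access, 0x2 for write).

```c
#include <assert.h>
#include <stdint.h>

/* Assumed flag values for illustration only (generic UAPI definitions). */
#define SKETCH_PKEY_DISABLE_ACCESS 0x1u
#define SKETCH_PKEY_DISABLE_WRITE  0x2u
#define SKETCH_PKEY_MASK \
	(SKETCH_PKEY_DISABLE_ACCESS | SKETCH_PKEY_DISABLE_WRITE)

/* Old behaviour: the "clear" path ORed the flags in, so it set them. */
static uint32_t sketch_clear_buggy(uint32_t rights, uint32_t flags)
{
	return rights | flags;
}

/* Fixed behaviour: drop exactly the requested disable bits. */
static uint32_t sketch_clear_fixed(uint32_t rights, uint32_t flags)
{
	/* Only the two per-key disable bits are valid here. */
	assert(!(flags & ~SKETCH_PKEY_MASK));
	return rights & ~flags;
}
```

With a key that is currently write-disabled (0x2), clearing PKEY_DISABLE_WRITE leaves 0x2 in the old form and 0x0 in the fixed form.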



[PATCH v17 14/24] selftests/vm/pkeys: Introduce powerpc support

2020-01-20 Thread Sandipan Das
From: Ram Pai 

This makes use of the abstractions added earlier and
introduces support for powerpc.

For powerpc, after receiving the SIGSEGV, the signal
handler must explicitly restore access permissions
for the faulting pkey to allow the test to continue.
As this makes use of pkey_access_allow(), all of its
dependencies and other similar functions have been
moved ahead of the signal handler.

cc: Dave Hansen 
cc: Florian Weimer 
Signed-off-by: Ram Pai 
Signed-off-by: Sandipan Das 
---
 tools/testing/selftests/vm/pkey-helpers.h|   2 +
 tools/testing/selftests/vm/pkey-powerpc.h|  90 +++
 tools/testing/selftests/vm/protection_keys.c | 269 ++-
 3 files changed, 233 insertions(+), 128 deletions(-)
 create mode 100644 tools/testing/selftests/vm/pkey-powerpc.h

diff --git a/tools/testing/selftests/vm/pkey-helpers.h 
b/tools/testing/selftests/vm/pkey-helpers.h
index 621fb2a0a5ef..2f4b1eb3a680 100644
--- a/tools/testing/selftests/vm/pkey-helpers.h
+++ b/tools/testing/selftests/vm/pkey-helpers.h
@@ -79,6 +79,8 @@ void expected_pkey_fault(int pkey);
 
 #if defined(__i386__) || defined(__x86_64__) /* arch */
 #include "pkey-x86.h"
+#elif defined(__powerpc64__) /* arch */
+#include "pkey-powerpc.h"
 #else /* arch */
 #error Architecture not supported
 #endif /* arch */
diff --git a/tools/testing/selftests/vm/pkey-powerpc.h 
b/tools/testing/selftests/vm/pkey-powerpc.h
new file mode 100644
index ..c79f4160a6a0
--- /dev/null
+++ b/tools/testing/selftests/vm/pkey-powerpc.h
@@ -0,0 +1,90 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef _PKEYS_POWERPC_H
+#define _PKEYS_POWERPC_H
+
+#ifndef SYS_mprotect_key
+# define SYS_mprotect_key  386
+#endif
+#ifndef SYS_pkey_alloc
+# define SYS_pkey_alloc384
+# define SYS_pkey_free 385
+#endif
+#define REG_IP_IDX PT_NIP
+#define REG_TRAPNO PT_TRAP
+#define gregs  gp_regs
+#define fpregs fp_regs
+#define si_pkey_offset 0x20
+
+#ifndef PKEY_DISABLE_ACCESS
+# define PKEY_DISABLE_ACCESS   0x3  /* disable read and write */
+#endif
+
+#ifndef PKEY_DISABLE_WRITE
+# define PKEY_DISABLE_WRITE0x2
+#endif
+
+#define NR_PKEYS   32
+#define NR_RESERVED_PKEYS_4K   27 /* pkey-0, pkey-1, exec-only-pkey
+ and 24 other keys that cannot be
+ represented in the PTE */
+#define NR_RESERVED_PKEYS_64K  3  /* pkey-0, pkey-1 and exec-only-pkey */
+#define PKEY_BITS_PER_PKEY 2
+#define HPAGE_SIZE (1UL << 24)
+#define PAGE_SIZE  (1UL << 16)
+
+static inline u32 pkey_bit_position(int pkey)
+{
+   return (NR_PKEYS - pkey - 1) * PKEY_BITS_PER_PKEY;
+}
+
+static inline u64 __read_pkey_reg(void)
+{
+   u64 pkey_reg;
+
+   asm volatile("mfspr %0, 0xd" : "=r" (pkey_reg));
+
+   return pkey_reg;
+}
+
+static inline void __write_pkey_reg(u64 pkey_reg)
+{
+   u64 amr = pkey_reg;
+
+   dprintf4("%s() changing %016llx to %016llx\n",
+__func__, __read_pkey_reg(), pkey_reg);
+
+   asm volatile("mtspr 0xd, %0" : : "r" ((unsigned long)(amr)) : "memory");
+
+   dprintf4("%s() pkey register after changing %016llx to %016llx\n",
+   __func__, __read_pkey_reg(), pkey_reg);
+}
+
+static inline int cpu_has_pku(void)
+{
+   return 1;
+}
+
+static inline int get_arch_reserved_keys(void)
+{
+   if (sysconf(_SC_PAGESIZE) == 4096)
+   return NR_RESERVED_PKEYS_4K;
+   else
+   return NR_RESERVED_PKEYS_64K;
+}
+
+void expect_fault_on_read_execonly_key(void *p1, int pkey)
+{
+   /*
+* powerpc does not allow userspace to change permissions of exec-only
+* keys since those keys are not allocated by userspace. The signal
+* handler won't be able to reset the permissions, which means the code
+* will infinitely continue to segfault here.
+*/
+   return;
+}
+
+/* 4-byte instructions * 16384 = 64K page */
+#define __page_o_noops() asm(".rept 16384 ; nop; .endr")
+
+#endif /* _PKEYS_POWERPC_H */
diff --git a/tools/testing/selftests/vm/protection_keys.c 
b/tools/testing/selftests/vm/protection_keys.c
index 57c71056c93d..e6de078a9196 100644
--- a/tools/testing/selftests/vm/protection_keys.c
+++ b/tools/testing/selftests/vm/protection_keys.c
@@ -169,6 +169,125 @@ void dump_mem(void *dumpme, int len_bytes)
}
 }
 
+static u32 hw_pkey_get(int pkey, unsigned long flags)
+{
+   u64 pkey_reg = __read_pkey_reg();
+
+   dprintf1("%s(pkey=%d, flags=%lx) = %x / %d\n",
+   __func__, pkey, flags, 0, 0);
+   dprintf2("%s() raw pkey_reg: %016llx\n", __func__, pkey_reg);
+
+   return (u32) get_pkey_bits(pkey_reg, pkey);
+}
+
+static int hw_pkey_set(int pkey, unsigned long rights, unsigned long flags)
+{
+   u32 mask = (PKEY_DISABLE_ACCESS|PKEY_DISABLE_WRITE);
+   u64 old_pkey_reg = __read_pkey_reg();
+   
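
The powerpc header above places pkey 0 in the most significant two bits of the register, which is why pkey_bit_position() counts down from NR_PKEYS. The sketch below shows how a getter and setter could be built on that position. It is an assumption about the shape of get_pkey_bits() and set_pkey_bits(), whose definitions are not part of this excerpt; all sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NR_PKEYS           32
#define SKETCH_PKEY_BITS_PER_PKEY 2
#define SKETCH_PKEY_MASK          0x3ull

/* Same arithmetic as pkey_bit_position() in pkey-powerpc.h. */
static uint32_t sketch_pkey_bit_position(int pkey)
{
	assert(pkey >= 0 && pkey < SKETCH_NR_PKEYS);
	return (SKETCH_NR_PKEYS - pkey - 1) * SKETCH_PKEY_BITS_PER_PKEY;
}

/* Shift the key's two bits down to the bottom and mask off the rest. */
static uint64_t sketch_get_pkey_bits(uint64_t reg, int pkey)
{
	return (reg >> sketch_pkey_bit_position(pkey)) & SKETCH_PKEY_MASK;
}

/* Clear the key's old bits, then OR in the new rights. */
static uint64_t sketch_set_pkey_bits(uint64_t reg, int pkey, uint64_t rights)
{
	uint32_t shift = sketch_pkey_bit_position(pkey);

	reg &= ~(SKETCH_PKEY_MASK << shift);
	reg |= (rights & SKETCH_PKEY_MASK) << shift;
	return reg;
}
```

Because the setter clears before it ORs, the same helper serves both pkey_disable_set() and pkey_disable_clear(), given the already-computed rights value.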

[PATCH v17 24/24] selftests: vm: pkeys: Use the correct page size on powerpc

2020-01-20 Thread Sandipan Das
Both 4K and 64K pages are supported on powerpc. Parts of
the selftest code perform alignment computations based on
the PAGE_SIZE macro which is currently hardcoded to 64K
for powerpc. This causes some test failures on kernels
configured with 4K page size.

In some cases, we need to enforce function alignment on
page size. Since this can only be done at build time,
64K is used as the alignment factor as that also ensures
4K alignment.

Signed-off-by: Sandipan Das 
---
 tools/testing/selftests/vm/pkey-powerpc.h| 2 +-
 tools/testing/selftests/vm/protection_keys.c | 5 +
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/vm/pkey-powerpc.h 
b/tools/testing/selftests/vm/pkey-powerpc.h
index 02bd4dd7d467..3a761e51a587 100644
--- a/tools/testing/selftests/vm/pkey-powerpc.h
+++ b/tools/testing/selftests/vm/pkey-powerpc.h
@@ -36,7 +36,7 @@
 pkey-31 and exec-only key */
 #define PKEY_BITS_PER_PKEY 2
 #define HPAGE_SIZE (1UL << 24)
-#define PAGE_SIZE  (1UL << 16)
+#define PAGE_SIZE  sysconf(_SC_PAGESIZE)
 
 static inline u32 pkey_bit_position(int pkey)
 {
diff --git a/tools/testing/selftests/vm/protection_keys.c 
b/tools/testing/selftests/vm/protection_keys.c
index a1cb9a71e77c..fc19addcb5c8 100644
--- a/tools/testing/selftests/vm/protection_keys.c
+++ b/tools/testing/selftests/vm/protection_keys.c
@@ -146,7 +146,12 @@ void abort_hooks(void)
  * will then fault, which makes sure that the fault code handles
  * execute-only memory properly.
  */
+#ifdef __powerpc64__
+/* This way, both 4K and 64K alignment are maintained */
+__attribute__((__aligned__(65536)))
+#else
 __attribute__((__aligned__(PAGE_SIZE)))
+#endif
 void lots_o_noops_around_write(int *write_to_me)
 {
dprintf3("running %s()\n", __func__);
-- 
2.17.1
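
The split between a build-time alignment constant and a run-time page size can be illustrated as follows. This is a minimal sketch with hypothetical sketch_* names, not part of the patch. It relies on the GCC aligned attribute needing a compile-time constant, which a sysconf() call cannot provide.

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/* Build-time alignment: must be a constant, so the larger size is used. */
#define SKETCH_BUILD_ALIGN 65536UL

/* Any 64K-aligned address is also 4K-aligned. */
_Static_assert(SKETCH_BUILD_ALIGN % 4096 == 0,
	       "64K alignment must imply 4K alignment");

/* Run-time page size, as PAGE_SIZE now resolves to on powerpc. */
static long sketch_runtime_page_size(void)
{
	return sysconf(_SC_PAGESIZE);
}

/* Power-of-two alignment check on an address. */
static int sketch_is_aligned(uintptr_t addr, unsigned long align)
{
	assert(align && !(align & (align - 1)));
	return (addr & (align - 1)) == 0;
}

/* The attribute argument has to be a constant expression. */
__attribute__((__aligned__(SKETCH_BUILD_ALIGN)))
void sketch_aligned_function(void)
{
}
```

Whether the run-time address of such a function keeps the full 64K alignment may also depend on how the binary is linked and loaded, so the sketch only checks the arithmetic.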



[PATCH v17 15/24] selftests/vm/pkeys: Fix number of reserved powerpc pkeys

2020-01-20 Thread Sandipan Das
From: "Desnes A. Nunes do Rosario" 

The number of reserved pkeys in a PowerNV environment is
different from that on PowerVM or KVM.

Tested on PowerVM and PowerNV environments.

Signed-off-by: "Desnes A. Nunes do Rosario" 
Signed-off-by: Ram Pai 
Signed-off-by: Sandipan Das 
---
 tools/testing/selftests/vm/pkey-powerpc.h | 22 --
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/vm/pkey-powerpc.h 
b/tools/testing/selftests/vm/pkey-powerpc.h
index c79f4160a6a0..319673bbab0b 100644
--- a/tools/testing/selftests/vm/pkey-powerpc.h
+++ b/tools/testing/selftests/vm/pkey-powerpc.h
@@ -28,7 +28,10 @@
 #define NR_RESERVED_PKEYS_4K   27 /* pkey-0, pkey-1, exec-only-pkey
  and 24 other keys that cannot be
  represented in the PTE */
-#define NR_RESERVED_PKEYS_64K  3  /* pkey-0, pkey-1 and exec-only-pkey */
+#define NR_RESERVED_PKEYS_64K_3KEYS3 /* PowerNV and KVM: pkey-0,
+pkey-1 and exec-only key */
+#define NR_RESERVED_PKEYS_64K_4KEYS4 /* PowerVM: pkey-0, pkey-1,
+pkey-31 and exec-only key */
 #define PKEY_BITS_PER_PKEY 2
 #define HPAGE_SIZE (1UL << 24)
 #define PAGE_SIZE  (1UL << 16)
@@ -65,12 +68,27 @@ static inline int cpu_has_pku(void)
return 1;
 }
 
+static inline bool arch_is_powervm()
+{
+   struct stat buf;
+
+   if ((stat("/sys/firmware/devicetree/base/ibm,partition-name", &buf) == 0) &&
+   (stat("/sys/firmware/devicetree/base/hmc-managed?", &buf) == 0) &&
+   (stat("/sys/firmware/devicetree/base/chosen/qemu,graphic-width", &buf) == -1) )
+   return true;
+
+   return false;
+}
+
 static inline int get_arch_reserved_keys(void)
 {
if (sysconf(_SC_PAGESIZE) == 4096)
return NR_RESERVED_PKEYS_4K;
else
-   return NR_RESERVED_PKEYS_64K;
+   if (arch_is_powervm())
+   return NR_RESERVED_PKEYS_64K_4KEYS;
+   else
+   return NR_RESERVED_PKEYS_64K_3KEYS;
 }
 
 void expect_fault_on_read_execonly_key(void *p1, int pkey)
-- 
2.17.1
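
The decision made by get_arch_reserved_keys() after this patch can be written as a pure function of the page size and the platform. The sketch below is illustrative only: the sketch_* names are hypothetical, and the constants are the counts given in the header comments (27 for 4K pages, 4 on PowerVM, 3 otherwise).

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/stat.h>

/* True if the path can be stat()ed, as arch_is_powervm() tests it. */
static bool sketch_path_exists(const char *path)
{
	struct stat buf;

	return stat(path, &buf) == 0;
}

/* Reserved key count as a function of page size and platform. */
static int sketch_reserved_keys(long page_size, bool is_powervm)
{
	assert(page_size > 0);

	if (page_size == 4096)
		return 27;	/* 4K pages: same count on every platform */

	return is_powervm ? 4 : 3;	/* 64K pages */
}
```

Keeping the device-tree probing separate from the arithmetic makes the 64K case easy to exercise without a PowerVM partition.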



[PATCH v17 10/24] selftests/vm/pkeys: Fix assertion in pkey_disable_set/clear()

2020-01-20 Thread Sandipan Das
From: Ram Pai 

In some cases, a pkey's bits need not necessarily change
in a way that the value of the pkey register increases
when performing a pkey_disable_set() or decreases when
performing a pkey_disable_clear().

For example, on powerpc, if a pkey's current state is
PKEY_DISABLE_ACCESS and we perform a pkey_write_disable()
on it, the bits still remain the same. We will observe
something similar when the pkey's current state is 0 and
a pkey_access_enable() is performed on it.

Either case would cause the strict comparisons in these
assertions to fail, so make the comparisons non-strict.

cc: Dave Hansen 
cc: Florian Weimer 
Signed-off-by: Ram Pai 
Signed-off-by: Sandipan Das 
---
 tools/testing/selftests/vm/protection_keys.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/vm/protection_keys.c 
b/tools/testing/selftests/vm/protection_keys.c
index 4b1ddb526228..7fd52d5c4bfd 100644
--- a/tools/testing/selftests/vm/protection_keys.c
+++ b/tools/testing/selftests/vm/protection_keys.c
@@ -400,7 +400,7 @@ void pkey_disable_set(int pkey, int flags)
dprintf1("%s(%d) pkey_reg: 0x%016llx\n",
__func__, pkey, read_pkey_reg());
if (flags)
-   pkey_assert(read_pkey_reg() > orig_pkey_reg);
+   pkey_assert(read_pkey_reg() >= orig_pkey_reg);
dprintf1("END<---%s(%d, 0x%x)\n", __func__,
pkey, flags);
 }
@@ -431,7 +431,7 @@ void pkey_disable_clear(int pkey, int flags)
dprintf1("%s(%d) pkey_reg: 0x%016llx\n", __func__,
pkey, read_pkey_reg());
if (flags)
-   assert(read_pkey_reg() < orig_pkey_reg);
+   assert(read_pkey_reg() <= orig_pkey_reg);
 }
 
 void pkey_write_allow(int pkey)
-- 
2.17.1
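
The powerpc case from the commit message can be worked through on a single key's two bits. This is a minimal sketch assuming the powerpc flag values from pkey-powerpc.h (0x3 for access, 0x2 for write); the sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* powerpc values: PKEY_DISABLE_ACCESS covers both read and write. */
#define SKETCH_PPC_PKEY_DISABLE_ACCESS 0x3u
#define SKETCH_PPC_PKEY_DISABLE_WRITE  0x2u

/* Rights after a disable-set operation on one key. */
static uint32_t sketch_disable_set(uint32_t rights, uint32_t flags)
{
	assert(flags <= SKETCH_PPC_PKEY_DISABLE_ACCESS);
	return rights | flags;
}

/* The check made after a set: strict (old) or non-strict (new). */
static int sketch_set_check(uint32_t before, uint32_t after, int strict)
{
	return strict ? (after > before) : (after >= before);
}
```

Starting from 0x3 and setting PKEY_DISABLE_WRITE yields 0x3 again, so the strict form fails while the non-strict form holds.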



[PATCH v17 21/24] selftests/vm/pkeys: Introduce a sub-page allocator

2020-01-20 Thread Sandipan Das
From: Ram Pai 

This introduces a new allocator that allocates 4K hardware
pages to back 64K Linux pages. This allocator is available
only on powerpc.

cc: Dave Hansen 
cc: Florian Weimer 
Signed-off-by: Ram Pai 
Signed-off-by: Thiago Jung Bauermann 
Signed-off-by: Sandipan Das 
---
 tools/testing/selftests/vm/pkey-helpers.h|  6 +
 tools/testing/selftests/vm/pkey-powerpc.h| 25 
 tools/testing/selftests/vm/pkey-x86.h|  5 
 tools/testing/selftests/vm/protection_keys.c |  1 +
 4 files changed, 37 insertions(+)

diff --git a/tools/testing/selftests/vm/pkey-helpers.h 
b/tools/testing/selftests/vm/pkey-helpers.h
index 59ccdff18214..622a85848f61 100644
--- a/tools/testing/selftests/vm/pkey-helpers.h
+++ b/tools/testing/selftests/vm/pkey-helpers.h
@@ -28,6 +28,9 @@
 extern int dprint_in_signal;
 extern char dprint_in_signal_buffer[DPRINT_IN_SIGNAL_BUF_SIZE];
 
+extern int test_nr;
+extern int iteration_nr;
+
 #ifdef __GNUC__
 __attribute__((format(printf, 1, 2)))
 #endif
@@ -78,6 +81,9 @@ __attribute__((noinline)) int read_ptr(int *ptr);
 void expected_pkey_fault(int pkey);
 int sys_pkey_alloc(unsigned long flags, unsigned long init_val);
 int sys_pkey_free(unsigned long pkey);
+int mprotect_pkey(void *ptr, size_t size, unsigned long orig_prot,
+   unsigned long pkey);
+void record_pkey_malloc(void *ptr, long size, int prot);
 
 #if defined(__i386__) || defined(__x86_64__) /* arch */
 #include "pkey-x86.h"
diff --git a/tools/testing/selftests/vm/pkey-powerpc.h 
b/tools/testing/selftests/vm/pkey-powerpc.h
index 7d7c3ffafdd9..d31665c48f5e 100644
--- a/tools/testing/selftests/vm/pkey-powerpc.h
+++ b/tools/testing/selftests/vm/pkey-powerpc.h
@@ -106,4 +106,29 @@ void expect_fault_on_read_execonly_key(void *p1, int pkey)
 /* 4-byte instructions * 16384 = 64K page */
 #define __page_o_noops() asm(".rept 16384 ; nop; .endr")
 
+void *malloc_pkey_with_mprotect_subpage(long size, int prot, u16 pkey)
+{
+   void *ptr;
+   int ret;
+
+   dprintf1("doing %s(size=%ld, prot=0x%x, pkey=%d)\n", __func__,
+   size, prot, pkey);
+   pkey_assert(pkey < NR_PKEYS);
+   ptr = mmap(NULL, size, prot, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
+   pkey_assert(ptr != (void *)-1);
+
+   ret = syscall(__NR_subpage_prot, ptr, size, NULL);
+   if (ret) {
+   perror("subpage_perm");
+   return PTR_ERR_ENOTSUP;
+   }
+
+   ret = mprotect_pkey((void *)ptr, PAGE_SIZE, prot, pkey);
+   pkey_assert(!ret);
+   record_pkey_malloc(ptr, size, prot);
+
+   dprintf1("%s() for pkey %d @ %p\n", __func__, pkey, ptr);
+   return ptr;
+}
+
 #endif /* _PKEYS_POWERPC_H */
diff --git a/tools/testing/selftests/vm/pkey-x86.h 
b/tools/testing/selftests/vm/pkey-x86.h
index 6421b846aa16..3be20f5d5275 100644
--- a/tools/testing/selftests/vm/pkey-x86.h
+++ b/tools/testing/selftests/vm/pkey-x86.h
@@ -173,4 +173,9 @@ void expect_fault_on_read_execonly_key(void *p1, int pkey)
expected_pkey_fault(pkey);
 }
 
+void *malloc_pkey_with_mprotect_subpage(long size, int prot, u16 pkey)
+{
+   return PTR_ERR_ENOTSUP;
+}
+
 #endif /* _PKEYS_X86_H */
diff --git a/tools/testing/selftests/vm/protection_keys.c 
b/tools/testing/selftests/vm/protection_keys.c
index 8bb4de103874..d4952b57cc90 100644
--- a/tools/testing/selftests/vm/protection_keys.c
+++ b/tools/testing/selftests/vm/protection_keys.c
@@ -845,6 +845,7 @@ void *malloc_pkey_mmap_dax(long size, int prot, u16 pkey)
 void *(*pkey_malloc[])(long size, int prot, u16 pkey) = {
 
malloc_pkey_with_mprotect,
+   malloc_pkey_with_mprotect_subpage,
malloc_pkey_anon_huge,
malloc_pkey_hugetlb
 /* can not do direct with the pkey_mprotect() API:
-- 
2.17.1



[PATCH v17 18/24] selftests/vm/pkeys: Associate key on a mapped page and detect access violation

2020-01-20 Thread Sandipan Das
From: Ram Pai 

Detect an access violation on a page with which an
access-disabled key is associated long after the page was mapped.

cc: Dave Hansen 
cc: Florian Weimer 
Signed-off-by: Ram Pai 
Acked-by: Dave Hansen 
Signed-off-by: Sandipan Das 
---
 tools/testing/selftests/vm/protection_keys.c | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/tools/testing/selftests/vm/protection_keys.c 
b/tools/testing/selftests/vm/protection_keys.c
index 95f173049f43..f65d384ef6a0 100644
--- a/tools/testing/selftests/vm/protection_keys.c
+++ b/tools/testing/selftests/vm/protection_keys.c
@@ -984,6 +984,24 @@ void test_read_of_access_disabled_region(int *ptr, u16 
pkey)
dprintf1("*ptr: %d\n", ptr_contents);
expected_pkey_fault(pkey);
 }
+
+void test_read_of_access_disabled_region_with_page_already_mapped(int *ptr,
+   u16 pkey)
+{
+   int ptr_contents;
+
+   dprintf1("disabling access to PKEY[%02d], doing read @ %p\n",
+   pkey, ptr);
+   ptr_contents = read_ptr(ptr);
+   dprintf1("reading ptr before disabling the read : %d\n",
+   ptr_contents);
+   read_pkey_reg();
+   pkey_access_deny(pkey);
+   ptr_contents = read_ptr(ptr);
+   dprintf1("*ptr: %d\n", ptr_contents);
+   expected_pkey_fault(pkey);
+}
+
 void test_write_of_write_disabled_region(int *ptr, u16 pkey)
 {
dprintf1("disabling write access to PKEY[%02d], doing write\n", pkey);
@@ -1390,6 +1408,7 @@ void test_mprotect_pkey_on_unsupported_cpu(int *ptr, u16 
pkey)
 void (*pkey_tests[])(int *ptr, u16 pkey) = {
test_read_of_write_disabled_region,
test_read_of_access_disabled_region,
+   test_read_of_access_disabled_region_with_page_already_mapped,
test_write_of_write_disabled_region,
test_write_of_access_disabled_region,
test_kernel_write_of_access_disabled_region,
-- 
2.17.1



[PATCH v17 03/24] selftests/vm/pkeys: Rename all references to pkru to a generic name

2020-01-20 Thread Sandipan Das
From: Ram Pai 

This renames PKRU references to "pkey_reg" or "pkey" based on
the usage.

cc: Dave Hansen 
cc: Florian Weimer 
Signed-off-by: Ram Pai 
Signed-off-by: Thiago Jung Bauermann 
Reviewed-by: Dave Hansen 
Signed-off-by: Sandipan Das 
---
 tools/testing/selftests/vm/pkey-helpers.h|  85 +++
 tools/testing/selftests/vm/protection_keys.c | 240 ++-
 2 files changed, 170 insertions(+), 155 deletions(-)

diff --git a/tools/testing/selftests/vm/pkey-helpers.h 
b/tools/testing/selftests/vm/pkey-helpers.h
index 254e5436bdd9..d5779be4793f 100644
--- a/tools/testing/selftests/vm/pkey-helpers.h
+++ b/tools/testing/selftests/vm/pkey-helpers.h
@@ -14,7 +14,7 @@
 #include 
 
 #define NR_PKEYS 16
-#define PKRU_BITS_PER_PKEY 2
+#define PKEY_BITS_PER_PKEY 2
 
 #ifndef DEBUG_LEVEL
 #define DEBUG_LEVEL 0
@@ -53,85 +53,88 @@ static inline void sigsafe_printf(const char *format, ...)
 #define dprintf3(args...) dprintf_level(3, args)
 #define dprintf4(args...) dprintf_level(4, args)
 
-extern unsigned int shadow_pkru;
-static inline unsigned int __rdpkru(void)
+extern unsigned int shadow_pkey_reg;
+static inline unsigned int __read_pkey_reg(void)
 {
unsigned int eax, edx;
unsigned int ecx = 0;
-   unsigned int pkru;
+   unsigned int pkey_reg;
 
asm volatile(".byte 0x0f,0x01,0xee\n\t"
 : "=a" (eax), "=d" (edx)
 : "c" (ecx));
-   pkru = eax;
-   return pkru;
+   pkey_reg = eax;
+   return pkey_reg;
 }
 
-static inline unsigned int _rdpkru(int line)
+static inline unsigned int _read_pkey_reg(int line)
 {
-   unsigned int pkru = __rdpkru();
+   unsigned int pkey_reg = __read_pkey_reg();
 
-   dprintf4("rdpkru(line=%d) pkru: %x shadow: %x\n",
-   line, pkru, shadow_pkru);
-   assert(pkru == shadow_pkru);
+   dprintf4("read_pkey_reg(line=%d) pkey_reg: %x shadow: %x\n",
+   line, pkey_reg, shadow_pkey_reg);
+   assert(pkey_reg == shadow_pkey_reg);
 
-   return pkru;
+   return pkey_reg;
 }
 
-#define rdpkru() _rdpkru(__LINE__)
+#define read_pkey_reg() _read_pkey_reg(__LINE__)
 
-static inline void __wrpkru(unsigned int pkru)
+static inline void __write_pkey_reg(unsigned int pkey_reg)
 {
-   unsigned int eax = pkru;
+   unsigned int eax = pkey_reg;
unsigned int ecx = 0;
unsigned int edx = 0;
 
-   dprintf4("%s() changing %08x to %08x\n", __func__, __rdpkru(), pkru);
+   dprintf4("%s() changing %08x to %08x\n", __func__,
+   __read_pkey_reg(), pkey_reg);
asm volatile(".byte 0x0f,0x01,0xef\n\t"
 : : "a" (eax), "c" (ecx), "d" (edx));
-   assert(pkru == __rdpkru());
+   assert(pkey_reg == __read_pkey_reg());
 }
 
-static inline void wrpkru(unsigned int pkru)
+static inline void write_pkey_reg(unsigned int pkey_reg)
 {
-   dprintf4("%s() changing %08x to %08x\n", __func__, __rdpkru(), pkru);
+   dprintf4("%s() changing %08x to %08x\n", __func__,
+   __read_pkey_reg(), pkey_reg);
/* will do the shadow check for us: */
-   rdpkru();
-   __wrpkru(pkru);
-   shadow_pkru = pkru;
-   dprintf4("%s(%08x) pkru: %08x\n", __func__, pkru, __rdpkru());
+   read_pkey_reg();
+   __write_pkey_reg(pkey_reg);
+   shadow_pkey_reg = pkey_reg;
+   dprintf4("%s(%08x) pkey_reg: %08x\n", __func__,
+   pkey_reg, __read_pkey_reg());
 }
 
 /*
  * These are technically racy. since something could
- * change PKRU between the read and the write.
+ * change PKEY register between the read and the write.
  */
 static inline void __pkey_access_allow(int pkey, int do_allow)
 {
-   unsigned int pkru = rdpkru();
+   unsigned int pkey_reg = read_pkey_reg();
int bit = pkey * 2;
 
if (do_allow)
-   pkru &= (1<>>>===SIGSEGV\n");
-   dprintf1("%s()::%d, pkru: 0x%x shadow: %x\n", __func__, __LINE__,
-   __rdpkru(), shadow_pkru);
+   dprintf1("%s()::%d, pkey_reg: 0x%x shadow: %x\n", __func__, __LINE__,
+   __read_pkey_reg(), shadow_pkey_reg);
 
trapno = uctxt->uc_mcontext.gregs[REG_TRAPNO];
ip = uctxt->uc_mcontext.gregs[REG_IP_IDX];
@@ -289,19 +289,19 @@ void signal_handler(int signum, siginfo_t *si, void 
*vucontext)
 */
fpregs += 0x70;
 #endif
-   pkru_offset = pkru_xstate_offset();
-   pkru_ptr = (void *)(&fpregs[pkru_offset]);
+   pkey_reg_offset = pkey_reg_xstate_offset();
+   pkey_reg_ptr = (void *)(&fpregs[pkey_reg_offset]);
 
dprintf1("siginfo: %p\n", si);
dprintf1(" fpregs: %p\n", fpregs);
/*
-* If we got a PKRU fault, we *HAVE* to have at least one bit set in
+* If we got a PKEY fault, we *HAVE* to have at least one bit set in
 * here.
 */
-   dprintf1("pkru_xstate_offset: %d\n", 

[PATCH v17 12/24] selftests: vm: pkeys: Use the correct huge page size

2020-01-20 Thread Sandipan Das
The huge page size can vary across architectures. This will
ensure that the correct huge page size is used when accessing
the hugetlb controls under sysfs. Instead of using a hardcoded
page size (i.e. 2MB), this now uses the HPAGE_SIZE macro which
is arch-specific.

Signed-off-by: Sandipan Das 
---
 tools/testing/selftests/vm/protection_keys.c | 23 ++--
 1 file changed, 16 insertions(+), 7 deletions(-)

diff --git a/tools/testing/selftests/vm/protection_keys.c 
b/tools/testing/selftests/vm/protection_keys.c
index 9cc82b65f828..535e464e27e9 100644
--- a/tools/testing/selftests/vm/protection_keys.c
+++ b/tools/testing/selftests/vm/protection_keys.c
@@ -739,12 +739,15 @@ void *malloc_pkey_anon_huge(long size, int prot, u16 pkey)
 }
 
 int hugetlb_setup_ok;
+#define SYSFS_FMT_NR_HUGE_PAGES "/sys/kernel/mm/hugepages/hugepages-%ldkB/nr_hugepages"
 #define GET_NR_HUGE_PAGES 10
 void setup_hugetlbfs(void)
 {
int err;
int fd;
-   char buf[] = "123";
+   char buf[256];
+   long hpagesz_kb;
+   long hpagesz_mb;
 
if (geteuid() != 0) {
fprintf(stderr, "WARNING: not run as root, can not do hugetlb 
test\n");
@@ -755,11 +758,16 @@ void setup_hugetlbfs(void)
 
/*
 * Now go make sure that we got the pages and that they
-* are 2M pages.  Someone might have made 1G the default.
+* are PMD-level pages. Someone might have made PUD-level
+* pages the default.
 */
-   fd = open("/sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages", 
O_RDONLY);
+   hpagesz_kb = HPAGE_SIZE / 1024;
+   hpagesz_mb = hpagesz_kb / 1024;
+   sprintf(buf, SYSFS_FMT_NR_HUGE_PAGES, hpagesz_kb);
+   fd = open(buf, O_RDONLY);
if (fd < 0) {
-   perror("opening sysfs 2M hugetlb config");
+   fprintf(stderr, "opening sysfs %ldM hugetlb config: %s\n",
+   hpagesz_mb, strerror(errno));
return;
}
 
@@ -767,13 +775,14 @@ void setup_hugetlbfs(void)
err = read(fd, buf, sizeof(buf)-1);
close(fd);
if (err <= 0) {
-   perror("reading sysfs 2M hugetlb config");
+   fprintf(stderr, "reading sysfs %ldM hugetlb config: %s\n",
+   hpagesz_mb, strerror(errno));
return;
}
 
if (atoi(buf) != GET_NR_HUGE_PAGES) {
-   fprintf(stderr, "could not confirm 2M pages, got: '%s' expected 
%d\n",
-   buf, GET_NR_HUGE_PAGES);
+   fprintf(stderr, "could not confirm %ldM pages, got: '%s' 
expected %d\n",
+   hpagesz_mb, buf, GET_NR_HUGE_PAGES);
return;
}
 
-- 
2.17.1
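
How the sysfs path follows from the arch-specific huge page size can be sketched on its own. The helper name and its bounds checking are hypothetical and not part of the patch, which uses sprintf() into a 256-byte buffer. The format string is the one the patch introduces.

```c
#include <assert.h>
#include <stdio.h>

#define SKETCH_SYSFS_FMT_NR_HUGE_PAGES \
	"/sys/kernel/mm/hugepages/hugepages-%ldkB/nr_hugepages"

/*
 * Build the nr_hugepages path for a huge page size given in bytes.
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int sketch_hugepage_sysfs_path(char *buf, size_t len,
				      unsigned long hpage_size)
{
	long hpagesz_kb = (long)(hpage_size / 1024);
	int n;

	assert(buf && len > 0);
	n = snprintf(buf, len, SKETCH_SYSFS_FMT_NR_HUGE_PAGES, hpagesz_kb);
	return (n > 0 && (size_t)n < len) ? 0 : -1;
}
```

With HPAGE_SIZE at 1UL << 21 this gives the hugepages-2048kB directory, and with the powerpc value of 1UL << 24 it gives hugepages-16384kB.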



[PATCH v17 04/24] selftests/vm/pkeys: Move generic definitions to header file

2020-01-20 Thread Sandipan Das
From: Ram Pai 

Move all the generic definitions and helper functions to the
header file.

cc: Dave Hansen 
cc: Florian Weimer 
Signed-off-by: Ram Pai 
Signed-off-by: Thiago Jung Bauermann 
Acked-by: Dave Hansen 
Signed-off-by: Sandipan Das 
---
 tools/testing/selftests/vm/pkey-helpers.h| 35 +---
 tools/testing/selftests/vm/protection_keys.c | 27 ---
 2 files changed, 30 insertions(+), 32 deletions(-)

diff --git a/tools/testing/selftests/vm/pkey-helpers.h 
b/tools/testing/selftests/vm/pkey-helpers.h
index d5779be4793f..6ad1bd54ef94 100644
--- a/tools/testing/selftests/vm/pkey-helpers.h
+++ b/tools/testing/selftests/vm/pkey-helpers.h
@@ -13,6 +13,14 @@
 #include 
 #include 
 
+/* Define some kernel-like types */
+#define  u8 uint8_t
+#define u16 uint16_t
+#define u32 uint32_t
+#define u64 uint64_t
+
+#define PTR_ERR_ENOTSUP ((void *)-ENOTSUP)
+
 #define NR_PKEYS 16
 #define PKEY_BITS_PER_PKEY 2
 
@@ -53,6 +61,18 @@ static inline void sigsafe_printf(const char *format, ...)
 #define dprintf3(args...) dprintf_level(3, args)
 #define dprintf4(args...) dprintf_level(4, args)
 
+extern void abort_hooks(void);
+#define pkey_assert(condition) do {\
+   if (!(condition)) { \
+   dprintf0("assert() at %s::%d test_nr: %d iteration: %d\n", \
+   __FILE__, __LINE__, \
+   test_nr, iteration_nr); \
+   dprintf0("errno at assert: %d", errno); \
+   abort_hooks();  \
+   exit(__LINE__); \
+   }   \
+} while (0)
+
 extern unsigned int shadow_pkey_reg;
 static inline unsigned int __read_pkey_reg(void)
 {
@@ -137,11 +157,6 @@ static inline void __pkey_write_allow(int pkey, int 
do_allow_write)
dprintf4("pkey_reg now: %08x\n", read_pkey_reg());
 }
 
-#define PROT_PKEY0 0x10/* protection key value (bit 0) */
-#define PROT_PKEY1 0x20/* protection key value (bit 1) */
-#define PROT_PKEY2 0x40/* protection key value (bit 2) */
-#define PROT_PKEY3 0x80/* protection key value (bit 3) */
-
 #define PAGE_SIZE 4096
 #define MB (1<<20)
 
@@ -219,4 +234,14 @@ int pkey_reg_xstate_offset(void)
return xstate_offset;
 }
 
+#define ARRAY_SIZE(x) (sizeof(x) / sizeof(*(x)))
+#define ALIGN_UP(x, align_to)  (((x) + ((align_to)-1)) & ~((align_to)-1))
+#define ALIGN_DOWN(x, align_to) ((x) & ~((align_to)-1))
+#define ALIGN_PTR_UP(p, ptr_align_to)  \
+   ((typeof(p))ALIGN_UP((unsigned long)(p), ptr_align_to))
+#define ALIGN_PTR_DOWN(p, ptr_align_to)\
+   ((typeof(p))ALIGN_DOWN((unsigned long)(p), ptr_align_to))
+#define __stringify_1(x...) #x
+#define __stringify(x...)   __stringify_1(x)
+
 #endif /* _PKEYS_HELPER_H */
diff --git a/tools/testing/selftests/vm/protection_keys.c 
b/tools/testing/selftests/vm/protection_keys.c
index 2f4ab81c570d..42ffb58810f2 100644
--- a/tools/testing/selftests/vm/protection_keys.c
+++ b/tools/testing/selftests/vm/protection_keys.c
@@ -51,31 +51,10 @@ int test_nr;
 unsigned int shadow_pkey_reg;
 
 #define HPAGE_SIZE (1UL<<21)
-#define ARRAY_SIZE(x) (sizeof(x) / sizeof(*(x)))
-#define ALIGN_UP(x, align_to)  (((x) + ((align_to)-1)) & ~((align_to)-1))
-#define ALIGN_DOWN(x, align_to) ((x) & ~((align_to)-1))
-#define ALIGN_PTR_UP(p, ptr_align_to)  ((typeof(p))ALIGN_UP((unsigned 
long)(p),ptr_align_to))
-#define ALIGN_PTR_DOWN(p, ptr_align_to)
((typeof(p))ALIGN_DOWN((unsigned long)(p),  ptr_align_to))
-#define __stringify_1(x...) #x
-#define __stringify(x...)   __stringify_1(x)
-
-#define PTR_ERR_ENOTSUP ((void *)-ENOTSUP)
 
 int dprint_in_signal;
 char dprint_in_signal_buffer[DPRINT_IN_SIGNAL_BUF_SIZE];
 
-extern void abort_hooks(void);
-#define pkey_assert(condition) do {\
-   if (!(condition)) { \
-   dprintf0("assert() at %s::%d test_nr: %d iteration: %d\n", \
-   __FILE__, __LINE__, \
-   test_nr, iteration_nr); \
-   dprintf0("errno at assert: %d", errno); \
-   abort_hooks();  \
-   exit(__LINE__); \
-   }   \
-} while (0)
-
 void cat_into_file(char *str, char *file)
 {
int fd = open(file, O_RDWR);
@@ -186,12 +165,6 @@ void lots_o_noops_around_write(int *write_to_me)
dprintf3("%s() done\n", __func__);
 }
 
-/* Define some kernel-like types */
-#define  u8 uint8_t
-#define u16 uint16_t
-#define u32 uint32_t
-#define u64 uint64_t
-
 #ifdef __i386__
 
 #ifndef SYS_mprotect_key
-- 
2.17.1
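
The alignment macros moved into the header are plain power-of-two mask arithmetic. A minimal sketch of how they behave, with the macro bodies copied from the patch under SKETCH_ prefixed names and two hypothetical wrapper functions added for checking:

```c
#include <assert.h>

#define SKETCH_ALIGN_UP(x, align_to) \
	(((x) + ((align_to) - 1)) & ~((align_to) - 1))
#define SKETCH_ALIGN_DOWN(x, align_to) ((x) & ~((align_to) - 1))

/* Round x up to the next multiple of a power-of-two alignment. */
static unsigned long sketch_align_up(unsigned long x, unsigned long align_to)
{
	/* The mask trick is only valid for powers of two. */
	assert(align_to && !(align_to & (align_to - 1)));
	return SKETCH_ALIGN_UP(x, align_to);
}

/* Round x down to the previous multiple of a power-of-two alignment. */
static unsigned long sketch_align_down(unsigned long x, unsigned long align_to)
{
	assert(align_to && !(align_to & (align_to - 1)));
	return SKETCH_ALIGN_DOWN(x, align_to);
}
```

An already aligned value is returned unchanged by both helpers, which is what the selftest relies on when it aligns pointers to page boundaries.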



[PATCH v17 02/24] selftests: vm: pkeys: Fix multilib builds for x86

2020-01-20 Thread Sandipan Das
This ensures that both 32-bit and 64-bit binaries are generated
when this is built on an x86_64 system. Most of the changes have
been borrowed from tools/testing/selftests/x86/Makefile.

Signed-off-by: Sandipan Das 
---
 tools/testing/selftests/vm/Makefile | 49 +
 1 file changed, 49 insertions(+)

diff --git a/tools/testing/selftests/vm/Makefile 
b/tools/testing/selftests/vm/Makefile
index 4e9c741be6af..7fa0adf11f6a 100644
--- a/tools/testing/selftests/vm/Makefile
+++ b/tools/testing/selftests/vm/Makefile
@@ -18,7 +18,56 @@ TEST_GEN_FILES += on-fault-limit
 TEST_GEN_FILES += thuge-gen
 TEST_GEN_FILES += transhuge-stress
 TEST_GEN_FILES += userfaultfd
+
+ifeq ($(ARCH), x86_64)
+CAN_BUILD_I386 := $(shell ./../x86/check_cc.sh $(CC) ../x86/trivial_32bit_program.c -m32)
+CAN_BUILD_X86_64 := $(shell ./../x86/check_cc.sh $(CC) ../x86/trivial_64bit_program.c)
+CAN_BUILD_WITH_NOPIE := $(shell ./../x86/check_cc.sh $(CC) ../x86/trivial_program.c -no-pie)
+
+TARGETS := protection_keys
+BINARIES_32 := $(TARGETS:%=%_32)
+BINARIES_64 := $(TARGETS:%=%_64)
+
+.PHONY: $(TARGETS)
+
+ifeq ($(CAN_BUILD_WITH_NOPIE),1)
+CFLAGS += -no-pie
+endif
+
+ifeq ($(CAN_BUILD_I386),1)
+TEST_GEN_FILES += $(BINARIES_32)
+$(TARGETS): $(BINARIES_32)
+$(BINARIES_32): %_32: %.c
+   $(CC) -m32 -o $@ $(CFLAGS) $(EXTRA_CFLAGS) $^ -lrt -ldl -lm
+endif
+
+ifeq ($(CAN_BUILD_X86_64),1)
+TEST_GEN_FILES += $(BINARIES_64)
+$(TARGETS): $(BINARIES_64)
+$(BINARIES_64): %_64: %.c
+   $(CC) -m64 -o $@ $(CFLAGS) $(EXTRA_CFLAGS) $^ -lrt -ldl
+endif
+
+# x86_64 users should be encouraged to install 32-bit libraries
+ifeq ($(CAN_BUILD_I386)$(CAN_BUILD_X86_64),01)
+$(TARGETS): warn_32bit_failure
+
+warn_32bit_failure:
+   @echo "Warning: you seem to have a broken 32-bit build" 2>&1;   \
+   echo "environment.  This will reduce test coverage of 64-bit" 2>&1; \
+   echo "kernels.  If you are using a Debian-like distribution," 2>&1; \
+   echo "try:"; 2>&1;  \
+   echo "";    \
+   echo "  apt-get install gcc-multilib libc6-i386 libc6-dev-i386";    \
+   echo "";    \
+   echo "If you are using a Fedora-like distribution, try:";   \
+   echo "";    \
+   echo "  yum install glibc-devel.*i686"; \
+   exit 0;
+endif
+else
 TEST_GEN_FILES += protection_keys
+endif
 
 ifneq (,$(filter $(ARCH),arm64 ia64 mips64 parisc64 ppc64 riscv64 s390x sh64 
sparc64 x86_64))
 TEST_GEN_FILES += va_128TBswitch
-- 
2.17.1



[PATCH v17 01/24] selftests/x86/pkeys: Move selftests to arch-neutral directory

2020-01-20 Thread Sandipan Das
From: Ram Pai 

cc: Dave Hansen 
cc: Florian Weimer 
Signed-off-by: Ram Pai 
Signed-off-by: Thiago Jung Bauermann 
Acked-by: Ingo Molnar 
Acked-by: Dave Hansen 
Signed-off-by: Sandipan Das 
---
 tools/testing/selftests/vm/.gitignore | 1 +
 tools/testing/selftests/vm/Makefile   | 1 +
 tools/testing/selftests/{x86 => vm}/pkey-helpers.h| 0
 tools/testing/selftests/{x86 => vm}/protection_keys.c | 0
 tools/testing/selftests/x86/.gitignore| 1 -
 tools/testing/selftests/x86/Makefile  | 2 +-
 6 files changed, 3 insertions(+), 2 deletions(-)
 rename tools/testing/selftests/{x86 => vm}/pkey-helpers.h (100%)
 rename tools/testing/selftests/{x86 => vm}/protection_keys.c (100%)

diff --git a/tools/testing/selftests/vm/.gitignore 
b/tools/testing/selftests/vm/.gitignore
index 31b3c98b6d34..c55837bf39fa 100644
--- a/tools/testing/selftests/vm/.gitignore
+++ b/tools/testing/selftests/vm/.gitignore
@@ -14,3 +14,4 @@ virtual_address_range
 gup_benchmark
 va_128TBswitch
 map_fixed_noreplace
+protection_keys
diff --git a/tools/testing/selftests/vm/Makefile b/tools/testing/selftests/vm/Makefile
index 7f9a8a8c31da..4e9c741be6af 100644
--- a/tools/testing/selftests/vm/Makefile
+++ b/tools/testing/selftests/vm/Makefile
@@ -18,6 +18,7 @@ TEST_GEN_FILES += on-fault-limit
 TEST_GEN_FILES += thuge-gen
 TEST_GEN_FILES += transhuge-stress
 TEST_GEN_FILES += userfaultfd
+TEST_GEN_FILES += protection_keys
 
 ifneq (,$(filter $(ARCH),arm64 ia64 mips64 parisc64 ppc64 riscv64 s390x sh64 sparc64 x86_64))
 TEST_GEN_FILES += va_128TBswitch
diff --git a/tools/testing/selftests/x86/pkey-helpers.h b/tools/testing/selftests/vm/pkey-helpers.h
similarity index 100%
rename from tools/testing/selftests/x86/pkey-helpers.h
rename to tools/testing/selftests/vm/pkey-helpers.h
diff --git a/tools/testing/selftests/x86/protection_keys.c b/tools/testing/selftests/vm/protection_keys.c
similarity index 100%
rename from tools/testing/selftests/x86/protection_keys.c
rename to tools/testing/selftests/vm/protection_keys.c
diff --git a/tools/testing/selftests/x86/.gitignore b/tools/testing/selftests/x86/.gitignore
index 7757f73ff9a3..eb30ffd83876 100644
--- a/tools/testing/selftests/x86/.gitignore
+++ b/tools/testing/selftests/x86/.gitignore
@@ -11,5 +11,4 @@ ldt_gdt
 iopl
 mpx-mini-test
 ioperm
-protection_keys
 test_vdso
diff --git a/tools/testing/selftests/x86/Makefile b/tools/testing/selftests/x86/Makefile
index 5d49bfec1e9a..5f16821c7f63 100644
--- a/tools/testing/selftests/x86/Makefile
+++ b/tools/testing/selftests/x86/Makefile
@@ -12,7 +12,7 @@ CAN_BUILD_WITH_NOPIE := $(shell ./check_cc.sh $(CC) trivial_program.c -no-pie)
 
 TARGETS_C_BOTHBITS := single_step_syscall sysret_ss_attrs syscall_nt test_mremap_vdso \
check_initial_reg_state sigreturn iopl ioperm \
-   protection_keys test_vdso test_vsyscall mov_ss_trap \
+   test_vdso test_vsyscall mov_ss_trap \
syscall_arg_fault
 TARGETS_C_32BIT_ONLY := entry_from_vm86 test_syscall_vdso unwind_vdso \
test_FCMOV test_FCOMI test_FISTTP \
-- 
2.17.1



[PATCH v16 00/23] selftests, powerpc, x86: Memory Protection Keys

2020-01-20 Thread Sandipan Das
Memory protection keys enable an application to protect its address
space from inadvertent access by its own code.

This feature is now enabled on powerpc and has been available since
4.16-rc1. The patches move the selftests to an arch-neutral directory
and enhance their test coverage.

Tested on powerpc64 and x86_64 (Skylake-SP).

Link to development branch:
https://github.com/sandip4n/linux/tree/pkey-selftests

Changelog
---------
Link to previous version (v16):
https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=153824

v17:
(1) Fixed issues with i386 builds when running on x86_64
based on feedback from Dave.
(2) Replaced patch 6 from previous version with patch 7.
This addresses u64 format specifier related concerns
that Michael had raised in v15.

v16:
(1) Rebased on top of latest master.
(2) Switched to u64 instead of using an arch-dependent
pkey_reg_t type for references to the pkey register
based on suggestions from Dave, Michal and Michael.
(3) Removed build time determination of page size based
on suggestion from Michael.
(4) Fixed comment before the definition of __page_o_noops()
from patch 13 ("selftests/vm/pkeys: Introduce powerpc
support").

v15:
(1) Rebased on top of latest master.
(2) Addressed review comments from Dave Hansen.
(3) Moved code for getting or setting pkey bits to new
helpers. These changes replace patch 7 of v14.
(4) Added a fix which ensures that the correct count of
reserved keys is used across different platforms.
(5) Added a fix which ensures that the correct page size
is used as powerpc supports both 4K and 64K pages.
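
As a rough illustration of the helper idea in v15 item (3) and the u64
register type from v16 item (2), the sketch below computes where a key's
two permission bits sit in a 64-bit register value. The constants and
function names are assumptions made for this example and are not copied
from the patches. It assumes that the x86 PKRU layout counts keys up from
bit 0 and that the powerpc AMR layout counts keys down from the most
significant bits.

```c
#include <stdint.h>

/* Assumed layout constants, named for this sketch only. */
#define SKETCH_BITS_PER_PKEY     2
#define SKETCH_NR_PKEYS_POWERPC  32
#define SKETCH_PKEY_MASK         ((uint64_t)0x3)

/* x86 (assumed): key 0 occupies bits 1:0, key 1 bits 3:2, and so on. */
static inline unsigned int sketch_bitpos_x86(int pkey)
{
	return (unsigned int)pkey * SKETCH_BITS_PER_PKEY;
}

/* powerpc (assumed): key 0 occupies the two most significant bits. */
static inline unsigned int sketch_bitpos_powerpc(int pkey)
{
	return (unsigned int)(SKETCH_NR_PKEYS_POWERPC - pkey - 1) *
	       SKETCH_BITS_PER_PKEY;
}

/* Return the two permission bits stored at 'shift' in 'reg'. */
static inline uint64_t sketch_get_pkey_bits(uint64_t reg, unsigned int shift)
{
	return (reg >> shift) & SKETCH_PKEY_MASK;
}

/* Replace the two permission bits at 'shift' in 'reg' with 'rights'. */
static inline uint64_t sketch_set_pkey_bits(uint64_t reg, unsigned int shift,
					    uint64_t rights)
{
	uint64_t mask = SKETCH_PKEY_MASK << shift;

	return (reg & ~mask) | ((rights & SKETCH_PKEY_MASK) << shift);
}
```

Keeping the register value as a plain 64-bit integer means the same get
and set helpers can serve both layouts, with only the bit-position
function differing per architecture.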

v14:
(1) Incorporated another round of comments from Dave Hansen.

v13:
(1) Incorporated comments from Dave Hansen.
(2) Added one more test for correct pkey-0 behavior.

v12:
(1) Fixed the offset of the pkey field in the siginfo structure for
x86_64 and powerpc. The tests now also try to use the actual
field if the headers define it.

v11:
(1) Fixed a deadlock in the ptrace testcase.

v10 and prior:
(1) Moved the testcase to arch neutral directory.
(2) Split the changes into incremental patches.

Desnes A. Nunes do Rosario (1):
  selftests/vm/pkeys: Fix number of reserved powerpc pkeys

Ram Pai (16):
  selftests/x86/pkeys: Move selftests to arch-neutral directory
  selftests/vm/pkeys: Rename all references to pkru to a generic name
  selftests/vm/pkeys: Move generic definitions to header file
  selftests/vm/pkeys: Fix pkey_disable_clear()
  selftests/vm/pkeys: Fix assertion in pkey_disable_set/clear()
  selftests/vm/pkeys: Fix alloc_random_pkey() to make it really random
  selftests/vm/pkeys: Introduce generic pkey abstractions
  selftests/vm/pkeys: Introduce powerpc support
  selftests/vm/pkeys: Fix assertion in test_pkey_alloc_exhaust()
  selftests/vm/pkeys: Improve checks to determine pkey support
  selftests/vm/pkeys: Associate key on a mapped page and detect access
violation
  selftests/vm/pkeys: Associate key on a mapped page and detect write
violation
  selftests/vm/pkeys: Detect write violation on a mapped
access-denied-key page
  selftests/vm/pkeys: Introduce a sub-page allocator
  selftests/vm/pkeys: Test correct behaviour of pkey-0
  selftests/vm/pkeys: Override access right definitions on powerpc

Sandipan Das (5):
  selftests: vm: pkeys: Fix multilib builds for x86
  selftests: vm: pkeys: Use sane types for pkey register
  selftests: vm: pkeys: Add helpers for pkey bits
  selftests: vm: pkeys: Use the correct huge page size
  selftests: vm: pkeys: Use the correct page size on powerpc

Thiago Jung Bauermann (2):
  selftests/vm/pkeys: Move some definitions to arch-specific header
  selftests/vm/pkeys: Make gcc check arguments of sigsafe_printf()

 tools/testing/selftests/vm/.gitignore |   1 +
 tools/testing/selftests/vm/Makefile   |  50 ++
 tools/testing/selftests/vm/pkey-helpers.h | 225 ++
 tools/testing/selftests/vm/pkey-powerpc.h | 136 
 tools/testing/selftests/vm/pkey-x86.h | 181 +
 .../selftests/{x86 => vm}/protection_keys.c   | 696 ++
 tools/testing/selftests/x86/.gitignore|   1 -
 tools/testing/selftests/x86/Makefile  |   2 +-
 tools/testing/selftests/x86/pkey-helpers.h| 219 --
 9 files changed, 979 insertions(+), 532 deletions(-)
 create mode 100644 tools/testing/selftests/vm/pkey-helpers.h
 create mode 100644 tools/testing/selftests/vm/pkey-powerpc.h
 create mode 100644 tools/testing/selftests/vm/pkey-x86.h
 rename tools/testing/selftests/{x86 => vm}/protection_keys.c (74%)
 delete mode 100644 tools/testing/selftests/x86/pkey-helpers.h

-- 
2.17.1



Re: [PATCH v6 1/5] powerpc/mm: Implement set_memory() routines

2020-01-20 Thread Christophe Leroy




On 24/12/2019 at 06:55, Russell Currey wrote:

The set_memory_{ro/rw/nx/x}() functions are required for STRICT_MODULE_RWX,
and are generally useful primitives to have.  This implementation is
designed to be completely generic across powerpc's many MMUs.

It's possible that this could be optimised to be faster for specific
MMUs, but the focus is on having a generic and safe implementation for
now.

This implementation does not handle cases where the caller is attempting
to change the mapping of the page it is executing from, or if another
CPU is concurrently using the page being altered.  These cases likely
shouldn't happen, but a more complex implementation with MMU-specific code
could safely handle them, so that is left as a TODO for now.

Signed-off-by: Russell Currey 
---
  arch/powerpc/Kconfig  |  1 +
  arch/powerpc/include/asm/set_memory.h | 32 +++
  arch/powerpc/mm/Makefile  |  1 +
  arch/powerpc/mm/pageattr.c| 83 +++
  4 files changed, 117 insertions(+)
  create mode 100644 arch/powerpc/include/asm/set_memory.h
  create mode 100644 arch/powerpc/mm/pageattr.c

+static int __change_page_attr(pte_t *ptep, unsigned long addr, void *data)
+{
+   int action = *((int *)data);
+   pte_t pte_val;


pte_val is really not a good name, because pte_val() is already a
function which returns the value of a pte_t variable.


Here you should name it 'pte' as usual.

Christophe
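
The naming concern above can be illustrated with a small user-space
model. Everything in it is a stand-in: the pte_t layout, the
MODEL_PAGE_WRITE bit and make_readonly() are hypothetical and only mimic
the shape of the kernel helpers. It assumes pte_val() is an inline
function, in which case a local variable also called pte_val would shadow
it and make pte_val(x) fail to compile in that scope.

```c
#include <stdint.h>

/* Stand-in for the kernel type; illustration only. */
typedef struct { uint64_t pte; } pte_t;

/* Hypothetical write-permission bit for this model. */
#define MODEL_PAGE_WRITE 0x2ULL

/* Accessor returning the raw value of a pte_t, like the kernel's pte_val(). */
static inline uint64_t pte_val(pte_t pte)
{
	return pte.pte;
}

/* Model of pte_wrprotect(): clear the write bit and return the new entry. */
static inline pte_t pte_wrprotect(pte_t pte)
{
	pte.pte &= ~MODEL_PAGE_WRITE;
	return pte;
}

/*
 * The local is named 'pte', so pte_val() stays callable here.
 * Naming it 'pte_val' instead would hide the accessor above.
 */
static uint64_t make_readonly(pte_t *ptep)
{
	pte_t pte = *ptep;

	pte = pte_wrprotect(pte);
	*ptep = pte;

	return pte_val(pte);
}
```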


+
+   // invalidate the PTE so it's safe to modify
+   pte_val = ptep_get_and_clear(&init_mm, addr, ptep);
+   flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
+
+   // modify the PTE bits as desired, then apply
+   switch (action) {
+   case SET_MEMORY_RO:
+   pte_val = pte_wrprotect(pte_val);
+   break;
+   case SET_MEMORY_RW:
+   pte_val = pte_mkwrite(pte_val);
+   break;
+   case SET_MEMORY_NX:
+   pte_val = pte_exprotect(pte_val);
+   break;
+   case SET_MEMORY_X:
+   pte_val = pte_mkexec(pte_val);
+   break;
+   default:
+   WARN_ON(true);
+   return -EINVAL;
+   }
+
+   set_pte_at(&init_mm, addr, ptep, pte_val);
+
+   return 0;
+}
+