Re: [PATCH v4 1/3] powerpc/perf: Use stack siar instead of mfspr

2021-08-18 Thread Nageswara Sastry




On 18/08/21 10:45 pm, Kajol Jain wrote:

Minor optimization in the 'perf_instruction_pointer' function code by
making use of stack siar instead of mfspr.

Fixes: 75382aa72f06 ("powerpc/perf: Move code to select SIAR or pt_regs
into perf_read_regs")
Signed-off-by: Kajol Jain 



Tested this patch series, not seeing any '0' values.
Tested-by: Nageswara R Sastry 

example output:
# perf report -D | grep addr
0 26236879714 0x3dcc8 [0x38]: PERF_RECORD_SAMPLE(IP, 0x1): 1446/1446: 0xc0113584 period: 1 addr: 0
0 26236882500 0x3dd00 [0x38]: PERF_RECORD_SAMPLE(IP, 0x1): 1446/1446: 0xc0113584 period: 1 addr: 0
0 26236883436 0x3dd38 [0x38]: PERF_RECORD_SAMPLE(IP, 0x1): 1446/1446: 0xc0113584 period: 10 addr: 0

...



---
  arch/powerpc/perf/core-book3s.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index bb0ee716de91..1b464aad29c4 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -2260,7 +2260,7 @@ unsigned long perf_instruction_pointer(struct pt_regs 
*regs)
else
return regs->nip;
} else if (use_siar && siar_valid(regs))
-   return mfspr(SPRN_SIAR) + perf_ip_adjust(regs);
+   return siar + perf_ip_adjust(regs);
else if (use_siar)
return 0;   // no valid instruction pointer
else



--
Thanks and Regards
R.Nageswara Sastry


Re: [PATCH v7 1/2] tty: hvc: pass DMA capable memory to put_chars()

2021-08-18 Thread Xianting Tian



On 2021/8/18 at 11:17 AM, Jiri Slaby wrote:

Hi,

On 17. 08. 21, 15:22, Xianting Tian wrote:

As well known, hvc backend can register its opertions to hvc backend.
the opertions contain put_chars(), get_chars() and so on.


"operations". And there too:


Some hvc backend may do dma in its opertions. eg, put_chars() of
virtio-console. But in the code of hvc framework, it may pass DMA
incapable memory to put_chars() under a specific configuration, which
is explained in commit c4baad5029(virtio-console: avoid DMA from stack):
1, c[] is on stack,
    hvc_console_print():
char c[N_OUTBUF] __ALIGNED__;
cons_ops[index]->put_chars(vtermnos[index], c, i);
2, ch is on stack,
    static void hvc_poll_put_char(,,char ch)
    {
struct tty_struct *tty = driver->ttys[0];
struct hvc_struct *hp = tty->driver_data;
int n;

do {
    n = hp->ops->put_chars(hp->vtermno, &ch, 1);
} while (n <= 0);
    }

Commit c4baad5029 is just a fix to avoid DMA from stack memory, which
is passed to virtio-console by the hvc framework in the above code. But I think
the fix is aggressive: it directly uses kmemdup() to allocate a new buffer
from the kmalloc area and does a memcpy regardless of whether the memory is
in the kmalloc area or not. More importantly, it would be better fixed in the hvc
framework, by changing it to never pass stack memory to the put_chars()
function in the first place. Otherwise, we would still face the same issue if
a new hvc backend using DMA is added in the future.

We make 'char c[N_OUTBUF]' part of 'struct hvc_struct', so hp->c is no
longer on the stack. We can use it in the above two cases.


In fact, you need only a single char for the poll case 
(hvc_poll_put_char), so hvc_struct needs to contain only c, not an array.


OTOH, you need c[N_OUTBUF] in the console case (hvc_console_print), 
but not whole hvc_struct. So cons_hvcs should be an array of structs 
composed of only the lock and the buffer.


Hum.

Or maybe rethink and take care of the console case by kmemdup and be 
done with that case? What problem do you have with allocating 16 
bytes? It should be quite easy and really fast (lockless) in most 
cases. On the contrary, your solution has to take a spinlock to access 
the global buffer.
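
A rough sketch of that per-console layout (the names here are illustrative
only, not taken from any posted version of the patch):

	struct hvc_console_buf {
		spinlock_t lock;
		char buf[N_OUTBUF] __ALIGNED__;	/* DMA-capable buffer for hvc_console_print() */
	};

	static struct hvc_console_buf cons_bufs[MAX_NR_HVC_CONSOLES];

	/* while hvc_struct itself would only need a single character for polling: */
	struct hvc_struct {
		/* ... existing members ... */
		char out_ch;	/* handed to ops->put_chars() by hvc_poll_put_char() */
	};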


As I replied before, this issue might be solved just by adjusting the 
alignment to L1_CACHE_BYTES or at least 16: 


#define __ALIGNED__ __attribute__((__aligned__(L1_CACHE_BYTES)))

Then c[16] won't cross a page boundary, that is to say its physical 
address range is contiguous.  Could you comment on this?


I submitted v8, but I found it still can't solve this issue. Even if we make 
'char out_buf[N_OUTBUF]' and 'char out_ch' part of 'struct 
hvc_struct' and use them separately, we still need a lock to protect each 
buffer. Once a lock is involved, it will impact hvc performance.


So should we go back to the original intention of this solution and just fix 
the kmemdup issue in the virtio_console driver?






The other fix is to use L1_CACHE_BYTES as the alignment; using 'sizeof(long)' as
the DMA alignment is wrong. Also use struct_size() to calculate the size of
hvc_struct.


This ought to be in separate patches.

thanks,


[PATCH] powerpc/tau: Add 'static' storage qualifier to 'tau_work' definition

2021-08-18 Thread Finn Thain
This patch prevents the following sparse warning.

arch/powerpc/kernel/tau_6xx.c:199:1: sparse: sparse: symbol 'tau_work'
was not declared. Should it be static?

Reported-by: kernel test robot 
Signed-off-by: Finn Thain 
---
 arch/powerpc/kernel/tau_6xx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/tau_6xx.c b/arch/powerpc/kernel/tau_6xx.c
index b9a047d92ec0..8e83d19fe8fa 100644
--- a/arch/powerpc/kernel/tau_6xx.c
+++ b/arch/powerpc/kernel/tau_6xx.c
@@ -164,7 +164,7 @@ static void tau_work_func(struct work_struct *work)
queue_work(tau_workq, work);
 }
 
-DECLARE_WORK(tau_work, tau_work_func);
+static DECLARE_WORK(tau_work, tau_work_func);
 
 /*
  * setup the TAU
-- 
2.26.3



Re: [PATCH v2 61/63] powerpc: Split memset() to avoid multi-field overflow

2021-08-18 Thread Kees Cook
On Wed, Aug 18, 2021 at 08:42:18AM +0200, Christophe Leroy wrote:
> 
> 
> On 18/08/2021 at 08:05, Kees Cook wrote:
> > In preparation for FORTIFY_SOURCE performing compile-time and run-time
> > field bounds checking for memset(), avoid intentionally writing across
> > neighboring fields.
> > 
> > Instead of writing across a field boundary with memset(), move the call
> > to just the array, and an explicit zeroing of the prior field.
> > 
> > Cc: Benjamin Herrenschmidt 
> > Cc: Qinglang Miao 
> > Cc: "Gustavo A. R. Silva" 
> > Cc: Hulk Robot 
> > Cc: Wang Wensheng 
> > Cc: linuxppc-dev@lists.ozlabs.org
> > Signed-off-by: Kees Cook 
> > Reviewed-by: Michael Ellerman 
> > Link: https://lore.kernel.org/lkml/87czqsnmw9@mpe.ellerman.id.au
> > ---
> >   drivers/macintosh/smu.c | 3 ++-
> >   1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/macintosh/smu.c b/drivers/macintosh/smu.c
> > index 94fb63a7b357..59ce431da7ef 100644
> > --- a/drivers/macintosh/smu.c
> > +++ b/drivers/macintosh/smu.c
> > @@ -848,7 +848,8 @@ int smu_queue_i2c(struct smu_i2c_cmd *cmd)
> > cmd->read = cmd->info.devaddr & 0x01;
> > switch(cmd->info.type) {
> > case SMU_I2C_TRANSFER_SIMPLE:
> > -   memset(&cmd->info.sublen, 0, 4);
> > +   cmd->info.sublen = 0;
> > +   memset(&cmd->info.subaddr, 0, 3);
> 
> subaddr[] is an array, should the & be avoided?

It results in the same thing, but it's better form to not have the &; I
will fix this.

> And while at it, why not use sizeof(subaddr) instead of 3 ?

Agreed. :)

Thanks!
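
(For completeness, a sketch of what the hunk might end up looking like with
both suggestions applied; this is an illustration, not necessarily the actual
follow-up hunk:)

	case SMU_I2C_TRANSFER_SIMPLE:
		cmd->info.sublen = 0;
		memset(cmd->info.subaddr, 0, sizeof(cmd->info.subaddr));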

-- 
Kees Cook


[PATCH] soc: fsl: guts: Fix a resource leak in the error handling path of 'fsl_guts_probe()'

2021-08-18 Thread Christophe JAILLET
If an error occurs after 'of_find_node_by_path()', the reference taken for
'root' will never be released and some memory will leak.

Instead of adding an error handling path and modifying all the
'return -SOMETHING' into 'goto errorpath', use 'devm_add_action_or_reset()'
to release the reference when needed.

Simplify the remove function accordingly.

As an extra benefit, the 'root' global variable can now be removed as well.

Fixes: 3c0d64e867ed ("soc: fsl: guts: reuse machine name from device tree")
Signed-off-by: Christophe JAILLET 
---
Compile tested only
---
 drivers/soc/fsl/guts.c | 16 ++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/soc/fsl/guts.c b/drivers/soc/fsl/guts.c
index d5e9a5f2c087..4d9476c7b87c 100644
--- a/drivers/soc/fsl/guts.c
+++ b/drivers/soc/fsl/guts.c
@@ -28,7 +28,6 @@ struct fsl_soc_die_attr {
 static struct guts *guts;
 static struct soc_device_attribute soc_dev_attr;
 static struct soc_device *soc_dev;
-static struct device_node *root;
 
 
 /* SoC die attribute definition for QorIQ platform */
@@ -136,14 +135,23 @@ static u32 fsl_guts_get_svr(void)
return svr;
 }
 
+static void fsl_guts_put_root(void *data)
+{
+   struct device_node *root = data;
+
+   of_node_put(root);
+}
+
 static int fsl_guts_probe(struct platform_device *pdev)
 {
struct device_node *np = pdev->dev.of_node;
	struct device *dev = &pdev->dev;
+   struct device_node *root;
struct resource *res;
const struct fsl_soc_die_attr *soc_die;
const char *machine;
u32 svr;
+   int ret;
 
/* Initialize guts */
guts = devm_kzalloc(dev, sizeof(*guts), GFP_KERNEL);
@@ -159,6 +167,10 @@ static int fsl_guts_probe(struct platform_device *pdev)
 
/* Register soc device */
root = of_find_node_by_path("/");
+   ret = devm_add_action_or_reset(dev, fsl_guts_put_root, root);
+   if (ret)
+   return ret;
+
	if (of_property_read_string(root, "model", &machine))
		of_property_read_string_index(root, "compatible", 0, &machine);
if (machine)
@@ -197,7 +209,7 @@ static int fsl_guts_probe(struct platform_device *pdev)
 static int fsl_guts_remove(struct platform_device *dev)
 {
soc_device_unregister(soc_dev);
-   of_node_put(root);
+
return 0;
 }
 
-- 
2.30.2



Re: [PATCH v8 2/3] tty: hvc: pass DMA capable memory to put_chars()

2021-08-18 Thread kernel test robot
Hi Xianting,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on tty/tty-testing]
[also build test WARNING on char-misc/char-misc-testing soc/for-next v5.14-rc6 
next-20210818]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Xianting-Tian/make-hvc-pass-dma-capable-memory-to-its-backend/20210818-162408
base:   https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git 
tty-testing
config: arm64-randconfig-r025-20210818 (attached as .config)
compiler: clang version 14.0.0 (https://github.com/llvm/llvm-project 
d2b574a4dea5b718e4386bf2e26af0126e5978ce)
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# install arm64 cross compiling tool for clang build
# apt-get install binutils-aarch64-linux-gnu
# 
https://github.com/0day-ci/linux/commit/e1b7662dafceb07a6905b64da2f1be27498c4a46
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review 
Xianting-Tian/make-hvc-pass-dma-capable-memory-to-its-backend/20210818-162408
git checkout e1b7662dafceb07a6905b64da2f1be27498c4a46
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=arm64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All warnings (new ones prefixed by >>):

>> drivers/tty/hvc/hvc_console.c:880:39: warning: incompatible integer to 
>> pointer conversion passing 'char' to parameter of type 'const char *'; take 
>> the address with & [-Wint-conversion]
   n = hp->ops->put_chars(hp->vtermno, hp->out_ch, 1);
   ^~
   &
   1 warning generated.


vim +880 drivers/tty/hvc/hvc_console.c

   870  
   871  static void hvc_poll_put_char(struct tty_driver *driver, int line, char 
ch)
   872  {
   873  struct tty_struct *tty = driver->ttys[0];
   874  struct hvc_struct *hp = tty->driver_data;
   875  int n;
   876  
   877  hp->out_ch = ch;
   878  
   879  do {
 > 880  n = hp->ops->put_chars(hp->vtermno, hp->out_ch, 1);
   881  } while (n <= 0);
   882  }
   883  #endif
   884  
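
The suggested fix amounts to passing the address of the buffered character,
i.e. presumably something like the following (a guess at the intended change,
not the actual follow-up hunk):

	n = hp->ops->put_chars(hp->vtermno, &hp->out_ch, 1);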

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


.config.gz
Description: application/gzip


[PATCH v4 3/3] powerpc/perf: Fix the check for SIAR value

2021-08-18 Thread Kajol Jain
In case of random sampling, there can be scenarios where the
Sample Instruction Address Register (SIAR) may not latch
to the sampled instruction and could result in
a value of 0. In these scenarios it is preferred to
return regs->nip. These corner cases are seen in the
previous generation (p9) also.

The patch adds a check for the SIAR value along with the regs_use_siar
and siar_valid checks so that the function will return
regs->nip in case SIAR is zero.

The patch drops the code under the PPMU_P10_DD1 flag check,
which handled the SIAR 0 case only for Power10 DD1.

Fixes: 2ca13a4cc56c9 ("powerpc/perf: Use regs->nip when SIAR is zero")
Signed-off-by: Kajol Jain 
---

Changelog:
v3 -> v4
- Remove the use_siar variable and directly use the regs_use_siar() call as
  suggested by Christophe Leroy

v2 -> v3
- Drop adding new ternary condition to check siar value.
- Remove siar check specific for PPMU_P10_DD1 and add
  it along with common checks as suggested by Christophe Leroy
  and Michael Ellerman

 arch/powerpc/perf/core-book3s.c | 8 +---
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 23ec89a59893..b0a589409039 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -2251,15 +2251,9 @@ unsigned long perf_misc_flags(struct pt_regs *regs)
  */
 unsigned long perf_instruction_pointer(struct pt_regs *regs)
 {
-   bool use_siar = regs_use_siar(regs);
unsigned long siar = mfspr(SPRN_SIAR);
 
-   if (ppmu && (ppmu->flags & PPMU_P10_DD1)) {
-   if (siar)
-   return siar;
-   else
-   return regs->nip;
-   } else if (use_siar && siar_valid(regs))
+   if (regs_use_siar(regs) && siar_valid(regs) && siar)
return siar + perf_ip_adjust(regs);
else
return regs->nip;
-- 
2.26.2



[PATCH v4 2/3] powerpc/perf: Drop the case of returning 0 as instruction pointer

2021-08-18 Thread Kajol Jain
Drop the case of returning 0 as instruction pointer since kernel
never executes at 0 and userspace almost never does either.

Fixes: e6878835ac47 ("powerpc/perf: Sample only if SIAR-Valid
bit is set in P7+")
Signed-off-by: Kajol Jain 
---
 arch/powerpc/perf/core-book3s.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 1b464aad29c4..23ec89a59893 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -2261,8 +2261,6 @@ unsigned long perf_instruction_pointer(struct pt_regs 
*regs)
return regs->nip;
} else if (use_siar && siar_valid(regs))
return siar + perf_ip_adjust(regs);
-   else if (use_siar)
-   return 0;   // no valid instruction pointer
else
return regs->nip;
 }
-- 
2.26.2



[PATCH v4 1/3] powerpc/perf: Use stack siar instead of mfspr

2021-08-18 Thread Kajol Jain
Minor optimization in the 'perf_instruction_pointer' function code by
making use of stack siar instead of mfspr.

Fixes: 75382aa72f06 ("powerpc/perf: Move code to select SIAR or pt_regs
into perf_read_regs")
Signed-off-by: Kajol Jain 
---
 arch/powerpc/perf/core-book3s.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index bb0ee716de91..1b464aad29c4 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -2260,7 +2260,7 @@ unsigned long perf_instruction_pointer(struct pt_regs 
*regs)
else
return regs->nip;
} else if (use_siar && siar_valid(regs))
-   return mfspr(SPRN_SIAR) + perf_ip_adjust(regs);
+   return siar + perf_ip_adjust(regs);
else if (use_siar)
return 0;   // no valid instruction pointer
else
-- 
2.26.2



Re: [PATCH v3 3/3] powerpc/perf: Fix the check for SIAR value

2021-08-18 Thread kajoljain



On 8/18/21 6:58 PM, Christophe Leroy wrote:
> 
> 
> On 18/08/2021 at 15:19, Kajol Jain wrote:
>> In case of random sampling, there can be scenarios where the
>> Sample Instruction Address Register (SIAR) may not latch
>> to the sampled instruction and could result in
>> a value of 0. In these scenarios it is preferred to
>> return regs->nip. These corner cases are seen in the
>> previous generation (p9) also.
>>
>> The patch adds a check for the SIAR value along with the use_siar and
>> siar_valid checks so that the function will return regs->nip
>> in case SIAR is zero.
>>
>> The patch drops the code under the PPMU_P10_DD1 flag check,
>> which handled the SIAR 0 case only for Power10 DD1.
>>
>> Fixes: 2ca13a4cc56c9 ("powerpc/perf: Use regs->nip when SIAR is zero")
>> Signed-off-by: Kajol Jain 
>> ---
>>
>> Changelog:
>> - Drop adding new ternary condition to check siar value.
>> - Remove siar check specific for PPMU_P10_DD1 and add
>>    it along with common checks as suggested by Christophe Leroy
>>    and Michael Ellerman
>>
>>   arch/powerpc/perf/core-book3s.c | 7 +--
>>   1 file changed, 1 insertion(+), 6 deletions(-)
>>
>> diff --git a/arch/powerpc/perf/core-book3s.c 
>> b/arch/powerpc/perf/core-book3s.c
>> index 23ec89a59893..55efbba7572b 100644
>> --- a/arch/powerpc/perf/core-book3s.c
>> +++ b/arch/powerpc/perf/core-book3s.c
>> @@ -2254,12 +2254,7 @@ unsigned long perf_instruction_pointer(struct pt_regs 
>> *regs)
>>   bool use_siar = regs_use_siar(regs);
>>   unsigned long siar = mfspr(SPRN_SIAR);
>>   -    if (ppmu && (ppmu->flags & PPMU_P10_DD1)) {
>> -    if (siar)
>> -    return siar;
>> -    else
>> -    return regs->nip;
>> -    } else if (use_siar && siar_valid(regs))
>> +    if (use_siar && siar_valid(regs) && siar)
> 
> You can probably now do
> 
> +    if (regs_use_siar(regs) && siar_valid(regs) && siar)
> 
> and remove use_siar

Hi Christophe,
 I will update it. Thanks for pointing it out.

Thanks,
Kajol Jain

> 
>>   return siar + perf_ip_adjust(regs);
>>   else
>>   return regs->nip;
>>


Re: [PATCH v2] powerpc/mm: Fix set_memory_*() against concurrent accesses

2021-08-18 Thread Laurent Vivier
On 18/08/2021 14:05, Michael Ellerman wrote:
> Laurent reported that STRICT_MODULE_RWX was causing intermittent crashes
> on one of his systems:
> 
>   kernel tried to execute exec-protected page (c00804073278) - exploit 
> attempt? (uid: 0)
>   BUG: Unable to handle kernel instruction fetch
>   Faulting instruction address: 0xc00804073278
>   Oops: Kernel access of bad area, sig: 11 [#1]
>   LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
>   Modules linked in: drm virtio_console fuse drm_panel_orientation_quirks ...
>   CPU: 3 PID: 44 Comm: kworker/3:1 Not tainted 5.14.0-rc4+ #12
>   Workqueue: events control_work_handler [virtio_console]
>   NIP:  c00804073278 LR: c00804073278 CTR: c01e9de0
>   REGS: c0002e4ef7e0 TRAP: 0400   Not tainted  (5.14.0-rc4+)
>   MSR:  80004280b033   CR: 24002822 XER: 
> 200400cf
>   ...
>   NIP fill_queue+0xf0/0x210 [virtio_console]
>   LR  fill_queue+0xf0/0x210 [virtio_console]
>   Call Trace:
> fill_queue+0xb4/0x210 [virtio_console] (unreliable)
> add_port+0x1a8/0x470 [virtio_console]
> control_work_handler+0xbc/0x1e8 [virtio_console]
> process_one_work+0x290/0x590
> worker_thread+0x88/0x620
> kthread+0x194/0x1a0
> ret_from_kernel_thread+0x5c/0x64
> 
> Jordan, Fabiano & Murilo were able to reproduce and identify that the
> problem is caused by the call to module_enable_ro() in do_init_module(),
> which happens after the module's init function has already been called.
> 
> Our current implementation of change_page_attr() is not safe against
> concurrent accesses, because it invalidates the PTE before flushing the
> TLB and then installing the new PTE. That leaves a window in time where
> there is no valid PTE for the page, if another CPU tries to access the
> page at that time we see something like the fault above.
> 
> We can't simply switch to set_pte_at()/flush TLB, because our hash MMU
> code doesn't handle a set_pte_at() of a valid PTE. See [1].
> 
> But we do have pte_update(), which replaces the old PTE with the new,
> meaning there's no window where the PTE is invalid. And the hash MMU
> version hash__pte_update() deals with synchronising the hash page table
> correctly.
> 
> [1]: https://lore.kernel.org/linuxppc-dev/87y318wp9r@linux.ibm.com/
> 
> Fixes: 1f9ad21c3b38 ("powerpc/mm: Implement set_memory() routines")
> Reported-by: Laurent Vivier 
> Signed-off-by: Fabiano Rosas 
> Signed-off-by: Michael Ellerman 
> ---
>  arch/powerpc/mm/pageattr.c | 23 ++-
>  1 file changed, 10 insertions(+), 13 deletions(-)
> 
> v2: Use pte_update(..., ~0, pte_val(pte), ...) as suggested by Fabiano,
> and ptep_get() as suggested by Christophe.
> 
> diff --git a/arch/powerpc/mm/pageattr.c b/arch/powerpc/mm/pageattr.c
> index 0876216ceee6..edea388e9d3f 100644
> --- a/arch/powerpc/mm/pageattr.c
> +++ b/arch/powerpc/mm/pageattr.c
> @@ -18,16 +18,12 @@
>  /*
>   * Updates the attributes of a page in three steps:
>   *
> - * 1. invalidate the page table entry
> - * 2. flush the TLB
> - * 3. install the new entry with the updated attributes
> - *
> - * Invalidating the pte means there are situations where this will not work
> - * when in theory it should.
> - * For example:
> - * - removing write from page whilst it is being executed
> - * - setting a page read-only whilst it is being read by another CPU
> + * 1. take the page_table_lock
> + * 2. install the new entry with the updated attributes
> + * 3. flush the TLB
>   *
> + * This sequence is safe against concurrent updates, and also allows 
> updating the
> + * attributes of a page currently being executed or accessed.
>   */
>  static int change_page_attr(pte_t *ptep, unsigned long addr, void *data)
>  {
> @@ -36,9 +32,7 @@ static int change_page_attr(pte_t *ptep, unsigned long 
> addr, void *data)
>  
> >   spin_lock(&init_mm.page_table_lock);
>  
> - /* invalidate the PTE so it's safe to modify */
> > - pte = ptep_get_and_clear(&init_mm, addr, ptep);
> - flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
> + pte = ptep_get(ptep);
>  
>   /* modify the PTE bits as desired, then apply */
>   switch (action) {
> @@ -59,11 +53,14 @@ static int change_page_attr(pte_t *ptep, unsigned long 
> addr, void *data)
>   break;
>   }
>  
> > - set_pte_at(&init_mm, addr, ptep, pte);
> > + pte_update(&init_mm, addr, ptep, ~0UL, pte_val(pte), 0);
>  
>   /* See ptesync comment in radix__set_pte_at() */
>   if (radix_enabled())
>   asm volatile("ptesync": : :"memory");
> +
> + flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
> +
> >   spin_unlock(&init_mm.page_table_lock);
>  
>   return 0;
> 
> base-commit: cbc06f051c524dcfe52ef0d1f30647828e226d30
> 

Tested-by: Laurent Vivier 



[PATCH 1/1] selftests/powerpc: Add memmove_64 test

2021-08-18 Thread Ritesh Harjani
While debugging an issue, we wanted to check whether the arch specific
kernel memmove implementation is correct. This selftest could help test that.

Suggested-by: Aneesh Kumar K.V 
Suggested-by: Vaibhav Jain 
Signed-off-by: Ritesh Harjani 
---
 tools/testing/selftests/powerpc/Makefile  |  1 +
 .../selftests/powerpc/memmoveloop/.gitignore  |  2 +
 .../selftests/powerpc/memmoveloop/Makefile| 19 +++
 .../powerpc/memmoveloop/asm/asm-compat.h  |  0
 .../powerpc/memmoveloop/asm/export.h  |  4 ++
 .../powerpc/memmoveloop/asm/feature-fixups.h  |  0
 .../selftests/powerpc/memmoveloop/asm/kasan.h |  0
 .../powerpc/memmoveloop/asm/ppc_asm.h | 39 +
 .../powerpc/memmoveloop/asm/processor.h   |  0
 .../selftests/powerpc/memmoveloop/mem_64.S|  1 +
 .../selftests/powerpc/memmoveloop/memcpy_64.S |  1 +
 .../selftests/powerpc/memmoveloop/stubs.S |  8 +++
 .../selftests/powerpc/memmoveloop/validate.c  | 56 +++
 13 files changed, 131 insertions(+)
 create mode 100644 tools/testing/selftests/powerpc/memmoveloop/.gitignore
 create mode 100644 tools/testing/selftests/powerpc/memmoveloop/Makefile
 create mode 100644 tools/testing/selftests/powerpc/memmoveloop/asm/asm-compat.h
 create mode 100644 tools/testing/selftests/powerpc/memmoveloop/asm/export.h
 create mode 100644 
tools/testing/selftests/powerpc/memmoveloop/asm/feature-fixups.h
 create mode 100644 tools/testing/selftests/powerpc/memmoveloop/asm/kasan.h
 create mode 100644 tools/testing/selftests/powerpc/memmoveloop/asm/ppc_asm.h
 create mode 100644 tools/testing/selftests/powerpc/memmoveloop/asm/processor.h
 create mode 120000 tools/testing/selftests/powerpc/memmoveloop/mem_64.S
 create mode 120000 tools/testing/selftests/powerpc/memmoveloop/memcpy_64.S
 create mode 100644 tools/testing/selftests/powerpc/memmoveloop/stubs.S
 create mode 100644 tools/testing/selftests/powerpc/memmoveloop/validate.c

diff --git a/tools/testing/selftests/powerpc/Makefile 
b/tools/testing/selftests/powerpc/Makefile
index 0830e63818c1..d110b3e5cbcd 100644
--- a/tools/testing/selftests/powerpc/Makefile
+++ b/tools/testing/selftests/powerpc/Makefile
@@ -16,6 +16,7 @@ export CFLAGS
 SUB_DIRS = alignment   \
   benchmarks   \
   cache_shape  \
+  memmoveloop  \
   copyloops\
   dscr \
   mm   \
diff --git a/tools/testing/selftests/powerpc/memmoveloop/.gitignore 
b/tools/testing/selftests/powerpc/memmoveloop/.gitignore
new file mode 100644
index ..56c1426013d5
--- /dev/null
+++ b/tools/testing/selftests/powerpc/memmoveloop/.gitignore
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0-only
+memmove_64
diff --git a/tools/testing/selftests/powerpc/memmoveloop/Makefile 
b/tools/testing/selftests/powerpc/memmoveloop/Makefile
new file mode 100644
index ..d58d8c100099
--- /dev/null
+++ b/tools/testing/selftests/powerpc/memmoveloop/Makefile
@@ -0,0 +1,19 @@
+# SPDX-License-Identifier: GPL-2.0
+CFLAGS += -m64
+CFLAGS += -I$(CURDIR)
+CFLAGS += -D SELFTEST
+CFLAGS += -maltivec
+
+ASFLAGS = $(CFLAGS) -Wa,-mpower4
+
+TEST_GEN_PROGS := memmove_64
+EXTRA_SOURCES := validate.c ../harness.c stubs.S
+CPPFLAGS += -D TEST_MEMMOVE=test_memmove
+
+top_srcdir = ../../../../..
+include ../../lib.mk
+
+$(OUTPUT)/memmove_64: mem_64.S memcpy_64.S $(EXTRA_SOURCES)
+   $(CC) $(CPPFLAGS) $(CFLAGS) \
+   -D TEST_MEMMOVE=test_memmove \
+   -o $@ $^
diff --git a/tools/testing/selftests/powerpc/memmoveloop/asm/asm-compat.h 
b/tools/testing/selftests/powerpc/memmoveloop/asm/asm-compat.h
new file mode 100644
index ..e69de29bb2d1
diff --git a/tools/testing/selftests/powerpc/memmoveloop/asm/export.h 
b/tools/testing/selftests/powerpc/memmoveloop/asm/export.h
new file mode 100644
index ..e6b80d5fbd14
--- /dev/null
+++ b/tools/testing/selftests/powerpc/memmoveloop/asm/export.h
@@ -0,0 +1,4 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#define EXPORT_SYMBOL(x)
+#define EXPORT_SYMBOL_GPL(x)
+#define EXPORT_SYMBOL_KASAN(x)
diff --git a/tools/testing/selftests/powerpc/memmoveloop/asm/feature-fixups.h 
b/tools/testing/selftests/powerpc/memmoveloop/asm/feature-fixups.h
new file mode 100644
index ..e69de29bb2d1
diff --git a/tools/testing/selftests/powerpc/memmoveloop/asm/kasan.h 
b/tools/testing/selftests/powerpc/memmoveloop/asm/kasan.h
new file mode 100644
index ..e69de29bb2d1
diff --git a/tools/testing/selftests/powerpc/memmoveloop/asm/ppc_asm.h 
b/tools/testing/selftests/powerpc/memmoveloop/asm/ppc_asm.h
new file mode 100644
index ..117005c56e19
--- /dev/null
+++ b/tools/testing/selftests/powerpc/memmoveloop/asm/ppc_asm.h
@@ -0,0 +1,39 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __SELFTESTS_POWERPC_PPC_ASM_H
+#define __SELFTESTS_POWERPC_PPC_ASM_H
+#include 
+
+#define CONFIG_ALTIVEC
+
+#define r1 1
+
+#define R14 r14
+#define 
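
(validate.c itself is not shown above; purely as an illustration, and not the
code from this patch, an overlap check for a memmove implementation typically
compares the routine under test against the libc reference:)

	#include <string.h>

	/* The Makefile above passes -D TEST_MEMMOVE=test_memmove, so the
	 * routine under test is presumably reachable under this name. */
	void *TEST_MEMMOVE(void *dst, const void *src, unsigned long len);

	static int check_overlap(char *buf, char *ref, unsigned long size,
				 unsigned long offset, unsigned long len)
	{
		/* start from identical contents */
		memcpy(ref, buf, size);

		/* reference result from libc memmove() ... */
		memmove(ref + offset, ref, len);
		/* ... versus the routine under test */
		TEST_MEMMOVE(buf + offset, buf, len);

		return memcmp(ref, buf, size) ? 1 : 0;
	}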

Re: [PATCH v2 1/2] powerpc/bug: Remove specific powerpc BUG_ON() and WARN_ON() on PPC32

2021-08-18 Thread Segher Boessenkool
On Fri, Aug 13, 2021 at 04:08:13PM +1000, Nicholas Piggin wrote:
> This one possibly the branches end up in predictors, whereas conditional 
> trap is always just speculated not to hit. Branches may also have a
> throughput limit on execution whereas trap could be more (1 per cycle
> vs 4 per cycle on POWER9).

I thought only *taken* branches are just one per cycle?  And those
branches are only taken for the exceptional condition (or the case where
we do not care about performance, anyway, if we do have an error most of
the time ;-) )

> On typical ppc32 CPUs, maybe it's a more obvious win. As you say there
> is the CFAR issue as well which makes it a problem for 64s. It would
> have been nice if it could use the same code though.

On 64-bit the code looks better for the no-error path as well.

> Maybe one day gcc's __builtin_trap() will become smart enough around
> conditional statements that it it generates better code and tries to
> avoid branches.

Internally *all* traps are conditional, in GCC.  It also can optimise
them quite well.  There must be something in the kernel macros that
prevents good optimisation.
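
As a concrete example of the pattern under discussion (illustrative only, not
the kernel's actual BUG_ON()/WARN_ON() macros):

	/* With the powerpc trap-based BUG_ON(), the test and the trap below
	 * collapse into a single conditional trap instruction (a twnei/tdnei
	 * variant); with a plain __builtin_trap() guard, compilers today emit
	 * a compare and a branch, with the trap on the error path. */
	void check(unsigned long x)
	{
		if (x != 0)
			__builtin_trap();
	}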


Segher


Re: [PATCH v2] powerpc/mm: Fix set_memory_*() against concurrent accesses

2021-08-18 Thread Murilo Opsfelder Araújo

On 8/18/21 9:05 AM, Michael Ellerman wrote:

Laurent reported that STRICT_MODULE_RWX was causing intermittent crashes
on one of his systems:

   kernel tried to execute exec-protected page (c00804073278) - exploit 
attempt? (uid: 0)
   BUG: Unable to handle kernel instruction fetch
   Faulting instruction address: 0xc00804073278
   Oops: Kernel access of bad area, sig: 11 [#1]
   LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
   Modules linked in: drm virtio_console fuse drm_panel_orientation_quirks ...
   CPU: 3 PID: 44 Comm: kworker/3:1 Not tainted 5.14.0-rc4+ #12
   Workqueue: events control_work_handler [virtio_console]
   NIP:  c00804073278 LR: c00804073278 CTR: c01e9de0
   REGS: c0002e4ef7e0 TRAP: 0400   Not tainted  (5.14.0-rc4+)
   MSR:  80004280b033   CR: 24002822 XER: 
200400cf
   ...
   NIP fill_queue+0xf0/0x210 [virtio_console]
   LR  fill_queue+0xf0/0x210 [virtio_console]
   Call Trace:
 fill_queue+0xb4/0x210 [virtio_console] (unreliable)
 add_port+0x1a8/0x470 [virtio_console]
 control_work_handler+0xbc/0x1e8 [virtio_console]
 process_one_work+0x290/0x590
 worker_thread+0x88/0x620
 kthread+0x194/0x1a0
 ret_from_kernel_thread+0x5c/0x64

Jordan, Fabiano & Murilo were able to reproduce and identify that the
problem is caused by the call to module_enable_ro() in do_init_module(),
which happens after the module's init function has already been called.

Our current implementation of change_page_attr() is not safe against
concurrent accesses, because it invalidates the PTE before flushing the
TLB and then installing the new PTE. That leaves a window in time where
there is no valid PTE for the page, if another CPU tries to access the
page at that time we see something like the fault above.

We can't simply switch to set_pte_at()/flush TLB, because our hash MMU
code doesn't handle a set_pte_at() of a valid PTE. See [1].

But we do have pte_update(), which replaces the old PTE with the new,
meaning there's no window where the PTE is invalid. And the hash MMU
version hash__pte_update() deals with synchronising the hash page table
correctly.

[1]: https://lore.kernel.org/linuxppc-dev/87y318wp9r@linux.ibm.com/

Fixes: 1f9ad21c3b38 ("powerpc/mm: Implement set_memory() routines")
Reported-by: Laurent Vivier 
Signed-off-by: Fabiano Rosas 
Signed-off-by: Michael Ellerman 


Reviewed-by: Murilo Opsfelder Araújo 


---
  arch/powerpc/mm/pageattr.c | 23 ++-
  1 file changed, 10 insertions(+), 13 deletions(-)

v2: Use pte_update(..., ~0, pte_val(pte), ...) as suggested by Fabiano,
 and ptep_get() as suggested by Christophe.

diff --git a/arch/powerpc/mm/pageattr.c b/arch/powerpc/mm/pageattr.c
index 0876216ceee6..edea388e9d3f 100644
--- a/arch/powerpc/mm/pageattr.c
+++ b/arch/powerpc/mm/pageattr.c
@@ -18,16 +18,12 @@
  /*
   * Updates the attributes of a page in three steps:
   *
- * 1. invalidate the page table entry
- * 2. flush the TLB
- * 3. install the new entry with the updated attributes
- *
- * Invalidating the pte means there are situations where this will not work
- * when in theory it should.
- * For example:
- * - removing write from page whilst it is being executed
- * - setting a page read-only whilst it is being read by another CPU
+ * 1. take the page_table_lock
+ * 2. install the new entry with the updated attributes
+ * 3. flush the TLB
   *
+ * This sequence is safe against concurrent updates, and also allows updating 
the
+ * attributes of a page currently being executed or accessed.
   */
  static int change_page_attr(pte_t *ptep, unsigned long addr, void *data)
  {
@@ -36,9 +32,7 @@ static int change_page_attr(pte_t *ptep, unsigned long addr, 
void *data)
  
  	spin_lock(&init_mm.page_table_lock);
  
-	/* invalidate the PTE so it's safe to modify */

-   pte = ptep_get_and_clear(&init_mm, addr, ptep);
-   flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
+   pte = ptep_get(ptep);
  
  	/* modify the PTE bits as desired, then apply */

switch (action) {
@@ -59,11 +53,14 @@ static int change_page_attr(pte_t *ptep, unsigned long 
addr, void *data)
break;
}
  
-	set_pte_at(&init_mm, addr, ptep, pte);

+   pte_update(&init_mm, addr, ptep, ~0UL, pte_val(pte), 0);
  
  	/* See ptesync comment in radix__set_pte_at() */

if (radix_enabled())
asm volatile("ptesync": : :"memory");
+
+   flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
+
	spin_unlock(&init_mm.page_table_lock);
  
  	return 0;


base-commit: cbc06f051c524dcfe52ef0d1f30647828e226d30




--
Murilo


Re: [PATCH 3/3] powerpc: move the install rule to arch/powerpc/Makefile

2021-08-18 Thread Masahiro Yamada
On Sat, Jul 31, 2021 at 5:30 AM Nick Desaulniers
 wrote:
>
> On Thu, Jul 29, 2021 at 7:22 AM Masahiro Yamada  wrote:
> >
> > Currently, the install target in arch/powerpc/Makefile descends into
> > arch/powerpc/boot/Makefile to invoke the shell script, but there is no
> > good reason to do so.
>
> Sure, but there are more arch/ subdirs that DO invoke install.sh from
> arch/*/boot/Makefile than not:
>
> arch/*/boot/Makefile:
> - parisc
> - nios2
> - arm
> - nds32
> - sparc
> - riscv
> - s390
> - ppc (this patch)
> - x86
> - arm64

I sent patches for these architectures.

Check LKML.






> arch/*/Makefile:
> - ia64
> - m68k
>
> Patch is fine, but right now the tree is a bit inconsistent.
>
> >
> > arch/powerpc/Makefile can run the shell script directly.
> >
> > Signed-off-by: Masahiro Yamada 
> > ---
> >
> >  arch/powerpc/Makefile  | 3 ++-
> >  arch/powerpc/boot/Makefile | 6 --
> >  2 files changed, 2 insertions(+), 7 deletions(-)
> >
> > diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
> > index 6505d66f1193..9aaf1abbc641 100644
> > --- a/arch/powerpc/Makefile
> > +++ b/arch/powerpc/Makefile
> > @@ -407,7 +407,8 @@ endef
> >
> >  PHONY += install
> >  install:
> > -   $(Q)$(MAKE) $(build)=$(boot) install
> > +   sh -x $(srctree)/$(boot)/install.sh "$(KERNELRELEASE)" vmlinux \
> > +   System.map "$(INSTALL_PATH)"
> >
> >  archclean:
> > $(Q)$(MAKE) $(clean)=$(boot)
> > diff --git a/arch/powerpc/boot/Makefile b/arch/powerpc/boot/Makefile
> > index 0d165bd98b61..10c0fb306f15 100644
> > --- a/arch/powerpc/boot/Makefile
> > +++ b/arch/powerpc/boot/Makefile
> > @@ -444,12 +444,6 @@ $(obj)/zImage: $(addprefix $(obj)/, 
> > $(image-y))
> >  $(obj)/zImage.initrd:  $(addprefix $(obj)/, $(initrd-y))
> > $(Q)rm -f $@; ln $< $@
> >
> > -# Only install the vmlinux
> > -install:
> > -   sh -x $(srctree)/$(src)/install.sh "$(KERNELRELEASE)" vmlinux 
> > System.map "$(INSTALL_PATH)"
> > -
> > -PHONY += install
> > -
> >  # anything not in $(targets)
> >  clean-files += $(image-) $(initrd-) cuImage.* dtbImage.* treeImage.* \
> > zImage zImage.initrd zImage.chrp zImage.coff zImage.holly \
> > --
> > 2.27.0
> >
>
>
> --
> Thanks,
> ~Nick Desaulniers



-- 
Best Regards
Masahiro Yamada


Re: [PATCH v2 0/1] cpufreq:powernv: Fix init_chip_info initialization in numa=off

2021-08-18 Thread Michael Ellerman
On Wed, 28 Jul 2021 17:34:59 +0530, Pratik R. Sampat wrote:
> v1: https://lkml.org/lkml/2021/7/26/1509
> Changelog v1-->v2:
> Based on comments from Gautham,
> 1. Included a #define for MAX_NR_CHIPS instead of hardcoding the
> allocation.
> 
> Pratik R. Sampat (1):
>   cpufreq:powernv: Fix init_chip_info initialization in numa=off
> 
> [...]

Applied to powerpc/next.

[1/1] cpufreq:powernv: Fix init_chip_info initialization in numa=off
  https://git.kernel.org/powerpc/c/f34ee9cb2c5ac5af426fee6fa4591a34d187e696

cheers


Re: [PATCH] powerpc/64s/perf: Always use SIAR for kernel interrupts

2021-08-18 Thread Michael Ellerman
On Wed, 21 Jul 2021 00:15:04 +1000, Nicholas Piggin wrote:
> If an interrupt is taken in kernel mode, always use SIAR for it rather than
> looking at regs_sipr. This prevents samples piling up around interrupt
> enable (hard enable or interrupt replay via soft enable) in PMUs / modes
> where the PR sample indication is not in synch with SIAR.
> 
> This results in better sampling of interrupt entry and exit in particular.

Applied to powerpc/next.

[1/1] powerpc/64s/perf: Always use SIAR for kernel interrupts
  https://git.kernel.org/powerpc/c/cf9c615cde49fb5d2480549c8c955a0a387798d3

cheers


Re: [PATCH] cpuidle: pseries: Mark pseries_idle_proble() as __init

2021-08-18 Thread Michael Ellerman
On Tue, 3 Aug 2021 14:15:47 -0700, Nathan Chancellor wrote:
> After commit 7cbd631d4dec ("cpuidle: pseries: Fixup CEDE0 latency only
> for POWER10 onwards"), pseries_idle_probe() is no longer inlined when
> compiling with clang, which causes a modpost warning:
> 
> WARNING: modpost: vmlinux.o(.text+0xc86a54): Section mismatch in
> reference from the function pseries_idle_probe() to the function
> .init.text:fixup_cede0_latency()
> The function pseries_idle_probe() references
> the function __init fixup_cede0_latency().
> This is often because pseries_idle_probe lacks a __init
> annotation or the annotation of fixup_cede0_latency is wrong.
> 
> [...]

Applied to powerpc/next.

[1/1] cpuidle: pseries: Mark pseries_idle_proble() as __init
  https://git.kernel.org/powerpc/c/d04691d373e75c83424b85c0e68e4a3f9370c10d

cheers


Re: [PATCHv2 0/3] Subject: [PATCHv2 0/3] Make cache-object aware of L3 siblings by parsing "ibm, thread-groups" property

2021-08-18 Thread Michael Ellerman
On Wed, 28 Jul 2021 23:26:04 +0530, Parth Shah wrote:
> Changes from v1 -> v2:
> - Based on Gautham's comments, use a separate thread_group_l3_cache_map
>   and modify parsing code to build cache_map for L3. This makes the
>   cache_map building code isolated from the parsing code.
> v1 can be found at:
> https://lists.ozlabs.org/pipermail/linuxppc-dev/2021-June/230680.html
> 
> [...]

Applied to powerpc/next.

[1/3] powerpc/cacheinfo: Lookup cache by dt node and thread-group id
  https://git.kernel.org/powerpc/c/a4bec516b9c0823d7e2bb8c8928c98b535cf9adf
[2/3] powerpc/cacheinfo: Remove the redundant get_shared_cpu_map()
  https://git.kernel.org/powerpc/c/69aa8e078545bc14d84a8b4b3cb914ac8f9f280e
[3/3] powerpc/smp: Use existing L2 cache_map cpumask to find L3 cache siblings
  https://git.kernel.org/powerpc/c/e9ef81e1079b0c4c374fba0f9affa7129c7c913b

cheers


Re: [PATCH 1/3] powerpc: remove unused zInstall target from arch/powerpc/boot/Makefile

2021-08-18 Thread Michael Ellerman
On Thu, 29 Jul 2021 23:19:35 +0900, Masahiro Yamada wrote:
> Commit c913e5f95e54 ("powerpc/boot: Don't install zImage.* from make
> install") added the zInstall target to arch/powerpc/boot/Makefile,
> but you cannot use it since the corresponding hook is missing in
> arch/powerpc/Makefile.
> 
> It has never worked since its addition. Nobody has complained about
> it for 7 years, which means this code was unneeded.
> 
> [...]

Applied to powerpc/next.

[1/3] powerpc: remove unused zInstall target from arch/powerpc/boot/Makefile
  https://git.kernel.org/powerpc/c/156ca4e650bfb9a4259b427069caa11b5a4df3d4
[2/3] powerpc: make the install target not depend on any build artifact
  https://git.kernel.org/powerpc/c/9bef456b20581e630ef9a13555ca04fed65a859d
[3/3] powerpc: move the install rule to arch/powerpc/Makefile
  https://git.kernel.org/powerpc/c/86ff0bce2e9665c8b074930fe6caed615da070c1

cheers


Re: [PATCH v4 0/5] nvmem: nintendo-otp: Add new driver for the Wii and Wii U OTP

2021-08-18 Thread Michael Ellerman
On Sun, 1 Aug 2021 09:38:17 +0200, Emmanuel Gil Peyrot wrote:
> The OTP is a read-only memory area which contains various keys and
> signatures used to decrypt, encrypt or verify various pieces of storage.
> 
> Its size depends on the console, it is 128 bytes on the Wii and
> 1024 bytes on the Wii U (split into eight 128 bytes banks).
> 
> It can be used directly by writing into one register and reading from
> the other one, without any additional synchronisation.
> 
> [...]

Patches 3-5 applied to powerpc/next.

[3/5] powerpc: wii.dts: Reduce the size of the control area
  https://git.kernel.org/powerpc/c/b11748e693166679acc13c8a4328a71efe1d4a89
[4/5] powerpc: wii.dts: Expose the OTP on this platform
  https://git.kernel.org/powerpc/c/562a610b4c5119034aed300f6ae212ec7a20c4b4
[5/5] powerpc: wii_defconfig: Enable OTP by default
  https://git.kernel.org/powerpc/c/140a89b7bfe65e9649c4a3678f74c32556834ec1

cheers


Re: [PATCH v5] pseries: prevent free CPU ids to be reused on another node

2021-08-18 Thread Michael Ellerman
On Thu, 29 Apr 2021 19:49:08 +0200, Laurent Dufour wrote:
> When a CPU is hot added, the CPU ids are taken from the available mask from
> the lowest possible set. If that set of values was previously used for CPUs
> attached to a different node, it appears to applications as if these CPUs
> have migrated from one node to another, which is not expected in real
> life.
> 
> To prevent this, the CPU ids used for each node need to be recorded and
> not reused on another node. However, to prevent CPU hotplug from failing
> when a node runs out of CPU ids, the capability to reuse
> other nodes’ free CPU ids is kept. A warning is displayed in such a case
> to warn the user.
> 
> [...]

Applied to powerpc/next.

[1/1] pseries: prevent free CPU ids to be reused on another node
  https://git.kernel.org/powerpc/c/bd1dd4c5f5286df0148b5b316f37c583b8f55fa1

cheers


Re: [PATCH v5] pseries/drmem: update LMBs after LPM

2021-08-18 Thread Michael Ellerman
On Mon, 17 May 2021 11:06:06 +0200, Laurent Dufour wrote:
> After a LPM, the device tree node ibm,dynamic-reconfiguration-memory may be
> updated by the hypervisor in the case the NUMA topology of the LPAR's
> memory is updated.
> 
> This is handled by the kernel, but the memory's node is not updated because
> there is no way to move a memory block between nodes from the Linux kernel
> point of view.
> 
> [...]

Applied to powerpc/next.

[1/1] pseries/drmem: update LMBs after LPM
  https://git.kernel.org/powerpc/c/d144f4d5a8a804133d20ff311d7be70bcdbfaac2

cheers


Re: [PATCH v2] ppc64/numa: consider the max numa node for migratable LPAR

2021-08-18 Thread Michael Ellerman
On Tue, 11 May 2021 09:31:36 +0200, Laurent Dufour wrote:
> When an LPAR is migratable, we should consider the maximum possible NUMA
> node instead of the number of NUMA nodes on the actual system.
> 
> The DT property 'ibm,current-associativity-domains' defines the maximum
> number of nodes the LPAR can see when running on that box. But if the LPAR
> is migrated to another box, it may see up to the number of nodes defined by
> 'ibm,max-associativity-domains'. So if an LPAR is migratable, that value
> should be used.
> 
> [...]

Applied to powerpc/next.

[1/1] ppc64/numa: consider the max numa node for migratable LPAR
  https://git.kernel.org/powerpc/c/9c7248bb8de31f51c693bfa6a6ea53b1c07e0fa8

cheers


Re: [PATCH] powerpc/kexec: fix for_each_child.cocci warning

2021-08-18 Thread Michael Ellerman
On Tue, 3 Aug 2021 16:59:55 +0200 (CEST), Julia Lawall wrote:
> for_each_node_by_type should have of_node_put() before return.
> 
> Generated by: scripts/coccinelle/iterators/for_each_child.cocci

Applied to powerpc/next.

[1/1] powerpc/kexec: fix for_each_child.cocci warning
  https://git.kernel.org/powerpc/c/c00103abf76fd3916596afd07dd3fdeee0dca15d

cheers


Re: [PATCH v2 00/32] powerpc: Add MSI IRQ domains to PCI drivers

2021-08-18 Thread Michael Ellerman
On Thu, 1 Jul 2021 15:27:18 +0200, Cédric Le Goater wrote:
> This series adds support for MSI IRQ domains on top of the XICS (P8)
> and XIVE (P9/P10) IRQ domains for the PowerNV (baremetal) and pSeries
> (VM) platforms. It should simplify and improve IRQ affinity of PCI
> MSIs under these PowerPC platforms, specially for drivers distributing
> multiple RX/TX queues on the different CPUs of the system.
> 
> Data locality can still be improved with an interrupt controller node
> per chip but this requires FW changes. It could be done under OPAL.
> 
> [...]

Patches 1-31 applied to powerpc/next.

[01/32] powerpc/pseries/pci: Introduce __find_pe_total_msi()

https://git.kernel.org/powerpc/c/786e5b102a0007d81579822eac23cb5bfaa0b65f
[02/32] powerpc/pseries/pci: Introduce rtas_prepare_msi_irqs()

https://git.kernel.org/powerpc/c/e81202007363bd694b711f307f02320b5f98edaa
[03/32] powerpc/xive: Add support for IRQ domain hierarchy

https://git.kernel.org/powerpc/c/14be098c5387eb93b794f299f3c3e2ddf6038ec7
[04/32] powerpc/xive: Ease debugging of xive_irq_set_affinity()

https://git.kernel.org/powerpc/c/6c2ab2a5d634d4e30445ee5d52d5d1469bf74aa2
[05/32] powerpc/pseries/pci: Add MSI domains

https://git.kernel.org/powerpc/c/a5f3d2c17b07e69166b93209f34a5fb8271a6810
[06/32] powerpc/xive: Drop unmask of MSIs at startup

https://git.kernel.org/powerpc/c/5690bcae186084a8544b1819f0d89399268bd0cf
[07/32] powerpc/xive: Remove irqd_is_started() check when setting the affinity

https://git.kernel.org/powerpc/c/292145a6e598c1e6633b8f5f607706b46f552ab9
[08/32] powerpc/pseries/pci: Add a domain_free_irqs() handler

https://git.kernel.org/powerpc/c/07817a578a7a79638537480b8847dc7a12f293c5
[09/32] powerpc/pseries/pci: Add a msi_free() handler to clear XIVE data

https://git.kernel.org/powerpc/c/9a014f456881e947bf8cdd8c984a207097e6c096
[10/32] powerpc/pseries/pci: Add support of MSI domains to PHB hotplug

https://git.kernel.org/powerpc/c/174db9e7f775ce06fc6949c9abbe758b3eb8171c
[11/32] powerpc/powernv/pci: Introduce __pnv_pci_ioda_msi_setup()

https://git.kernel.org/powerpc/c/2c50d7e99e39eba92b93210e740f3f9e5a06ba54
[12/32] powerpc/powernv/pci: Add MSI domains

https://git.kernel.org/powerpc/c/0fcfe2247e75070361af2b6845030cada92cdbf8
[13/32] KVM: PPC: Book3S HV: Use the new IRQ chip to detect passthrough 
interrupts

https://git.kernel.org/powerpc/c/ba418a0278265ad65f2f9544e743b7dbff3b994b
[14/32] KVM: PPC: Book3S HV: XIVE: Change interface of passthrough interrupt 
routines

https://git.kernel.org/powerpc/c/e5e78b15113a73d0294141d9796969fa7b10fa3c
[15/32] KVM: PPC: Book3S HV: XIVE: Fix mapping of passthrough interrupts

https://git.kernel.org/powerpc/c/51be9e51a8000ffc6a33083ceca9da9303ed4dc5
[16/32] powerpc/xics: Remove ICS list

https://git.kernel.org/powerpc/c/298f6f952885eeb1f25461f085c6c238bcd9fc5e
[17/32] powerpc/xics: Rename the map handler in a check handler

https://git.kernel.org/powerpc/c/248af248a8f45461662fb633eca4adf24ae704ad
[18/32] powerpc/xics: Give a name to the default XICS IRQ domain

https://git.kernel.org/powerpc/c/7d14f6c60b76fa7f3f89d81a95385576ca33b483
[19/32] powerpc/xics: Add debug logging to the set_irq_affinity handlers

https://git.kernel.org/powerpc/c/53b34e8db73af98fa652641bf490384dc665d0f2
[20/32] powerpc/xics: Add support for IRQ domain hierarchy

https://git.kernel.org/powerpc/c/e4f0aa3b4731430ad73fb4485e97f751c7500668
[21/32] powerpc/powernv/pci: Customize the MSI EOI handler to support PHB3

https://git.kernel.org/powerpc/c/bbb25af8fbdba4acaf955e412a84eb2eea48697c
[22/32] powerpc/pci: Drop XIVE restriction on MSI domains

https://git.kernel.org/powerpc/c/679e30b9536eeb93bc8c9a39c0ddc77dec536f6b
[23/32] powerpc/xics: Drop unmask of MSIs at startup

https://git.kernel.org/powerpc/c/1e661f81a522eadfe4bc5bb1ec9fbae27c13f163
[24/32] powerpc/pseries/pci: Drop unused MSI code

https://git.kernel.org/powerpc/c/3005123eea0daa18d98602ab64b2ce3ad087d849
[25/32] powerpc/powernv/pci: Drop unused MSI code

https://git.kernel.org/powerpc/c/6d9ba6121b1cf453985d08c141970a1b44cd9cf1
[26/32] powerpc/powernv/pci: Adapt is_pnv_opal_msi() to detect passthrough 
interrupt

https://git.kernel.org/powerpc/c/f1a377f86f51b381cfc30bf2270f8a5f81e35ee9
[27/32] powerpc/xics: Fix IRQ migration

https://git.kernel.org/powerpc/c/c80198a21792ac59412871e4e6fad5041c9be8e4
[28/32] powerpc/powernv/pci: Set the IRQ chip data for P8/CXL devices

https://git.kernel.org/powerpc/c/5cd69651ceeed15e021cf7d19f1b1be0a80c0c7a
[29/32] powerpc/powernv/pci: Rework pnv_opal_pci_msi_eoi()

https://git.kernel.org/powerpc/c/c325712b5f85e561ea89bae2ba5d0104e797e42c
[30/32] KVM: PPC: Book3S HV: XICS: Fix mapping of passthrough interrupts

https://git.kernel.org/powerpc/c/1753081f2d445f9157550692fcc4221cd3ff0958
[31/32] 

Re: [PATCH] powerpc: Always inline radix_enabled() to fix build failure

2021-08-18 Thread Michael Ellerman
On Wed, 4 Aug 2021 11:37:24 +1000, Jordan Niethe wrote:
> This is the same as commit acdad8fb4a15 ("powerpc: Force inlining of
> mmu_has_feature to fix build failure") but for radix_enabled().  The
> config in the linked bugzilla causes the following build failure:
> 
> LD  .tmp_vmlinux.kallsyms1
> powerpc64-linux-ld: arch/powerpc/mm/pgtable.o: in function 
> `.__ptep_set_access_flags':
> pgtable.c:(.text+0x17c): undefined reference to 
> `.radix__ptep_set_access_flags'
...
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc: Always inline radix_enabled() to fix build failure
  https://git.kernel.org/powerpc/c/27fd051dc218e5b6cb2da5dbb3f342879ff1

cheers


Re: [PATCH] powerpc/non-smp: Inconditionaly call smp_mb() on switch_mm

2021-08-18 Thread Michael Ellerman
On Mon, 5 Jul 2021 12:00:50 +0000 (UTC), Christophe Leroy wrote:
> Commit 3ccfebedd8cf ("powerpc, membarrier: Skip memory barrier in
> switch_mm()") added some logic to skip the smp_mb() in
> switch_mm_irqs_off() before the call to switch_mmu_context().
> 
> However, on non SMP smp_mb() is just a compiler barrier and doing
> it inconditionaly is simpler than the logic used to check
> whether the barrier is needed or not.
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/non-smp: Inconditionaly call smp_mb() on switch_mm
  https://git.kernel.org/powerpc/c/c8a6d91005343dea0d53be0ff0620c66934dcd44

cheers


Re: [PATCH kernel v2] KVM: PPC: Use arch_get_random_seed_long instead of powernv variant

2021-08-18 Thread Michael Ellerman
On Thu, 5 Aug 2021 17:56:49 +1000, Alexey Kardashevskiy wrote:
> The powernv_get_random_long() does not work in nested KVM (which is
> pseries) and produces a crash when accessing in_be64(rng->regs) in
> powernv_get_random_long().
> 
> This replaces powernv_get_random_long with the ppc_md machine hook
> wrapper.

Applied to powerpc/next.

[1/1] KVM: PPC: Use arch_get_random_seed_long instead of powernv variant
  https://git.kernel.org/powerpc/c/2ac78e0c00184a9ba53d507be7549c69a3f566b6

cheers


Re: [PATCH v8 0/5] Add support for FORM2 associativity

2021-08-18 Thread Michael Ellerman
On Thu, 12 Aug 2021 18:52:18 +0530, Aneesh Kumar K.V wrote:
> Form2 associativity adds a much more flexible NUMA topology layout
> than what is provided by Form1. More details can be found in patch 7.
> 
> $ numactl -H
> ...
> node distances:
> node   0   1   2   3
>   0:  10  11  222  33
>   1:  44  10  55  66
>   2:  77  88  10  99
>   3:  101  121  132  10
> $
> 
> [...]

Applied to powerpc/next.

[1/5] powerpc/pseries: rename min_common_depth to primary_domain_index
  https://git.kernel.org/powerpc/c/7e35ef662ca05c42dbc2f401bb76d9219dd7fd02
[2/5] powerpc/pseries: Rename TYPE1_AFFINITY to FORM1_AFFINITY
  https://git.kernel.org/powerpc/c/0eacd06bb8adea8dd9edb0a30144166d9f227e64
[3/5] powerpc/pseries: Consolidate different NUMA distance update code paths
  https://git.kernel.org/powerpc/c/8ddc6448ec5a5ef50eaa581a7dec0e12a02850ff
[4/5] powerpc/pseries: Add a helper for form1 cpu distance
  https://git.kernel.org/powerpc/c/ef31cb83d19c4c589d650747cd5a7e502be9f665
[5/5] powerpc/pseries: Add support for FORM2 associativity
  https://git.kernel.org/powerpc/c/1c6b5a7e74052768977855f95d6b8812f6e7772c

cheers


Re: [PATCH] powerpc/configs: Disable legacy ptys on microwatt defconfig

2021-08-18 Thread Michael Ellerman
On Thu, 5 Aug 2021 11:20:05 +1000, Anton Blanchard wrote:
> We shouldn't need legacy ptys, and disabling the option improves boot
> time by about 0.5 seconds.

Applied to powerpc/next.

[1/1] powerpc/configs: Disable legacy ptys on microwatt defconfig
  https://git.kernel.org/powerpc/c/9b49f979b3d560cb75ea9f1a596baf432d566798

cheers


Re: [PATCH] powerpc: Remove in_kernel_text()

2021-08-18 Thread Michael Ellerman
On Sun, 27 Jun 2021 17:09:18 +0000 (UTC), Christophe Leroy wrote:
> The last user of in_kernel_text() stopped using it with
> commit 549e8152de80 ("powerpc: Make the 64-bit kernel as a
> position-independent executable").
> 
> Generic function is_kernel_text() does the same.
> 
> So remove it.

Applied to powerpc/next.

[1/1] powerpc: Remove in_kernel_text()
  https://git.kernel.org/powerpc/c/09ca497528dac12cbbceab8197011c875a96d053

cheers


Re: [PATCH] powerpc: use IRQF_NO_DEBUG for IPIs

2021-08-18 Thread Michael Ellerman
On Mon, 19 Jul 2021 15:06:14 +0200, Cédric Le Goater wrote:
> There is no need to use the lockup detector ("noirqdebug") for IPIs.
> The ipistorm benchmark measures a ~10% improvement on high systems
> when this flag is set.

Applied to powerpc/next.

[1/1] powerpc: use IRQF_NO_DEBUG for IPIs
  https://git.kernel.org/powerpc/c/17df41fec5b80b16ea4774495f1eb730e2225619

cheers


Re: [PATCH 0/2] KVM: PPC: Book3S HV: XIVE: Improve guest entries and exits

2021-08-18 Thread Michael Ellerman
On Tue, 20 Jul 2021 15:42:07 +0200, Cédric Le Goater wrote:
> The XIVE interrupt controller on P10 can automatically save and
> restore the state of the interrupt registers under the internal NVP
> structure representing the VCPU. This saves a costly store/load in
> guest entries and exits.
> 
> Thanks,
> 
> [...]

Applied to powerpc/next.

[1/2] KVM: PPC: Book3S HV: XIVE: Add a 'flags' field
  https://git.kernel.org/powerpc/c/b68c6646cce5ee8caefa6333ee743f960222dcea
[2/2] KVM: PPC: Book3S HV: XIVE: Add support for automatic save-restore
  https://git.kernel.org/powerpc/c/f5af0a978776b710f16dc99a85496b1e760bf9e0

cheers


Re: [PATCH v2 1/2] powerpc/bug: Remove specific powerpc BUG_ON() and WARN_ON() on PPC32

2021-08-18 Thread Michael Ellerman
On Tue, 13 Apr 2021 16:38:09 +0000 (UTC), Christophe Leroy wrote:
> powerpc BUG_ON() and WARN_ON() are based on using twnei instruction.
> 
> For catching simple conditions like a variable having value 0, this
> is efficient because it does the test and the trap at the same time.
> But most conditions used with BUG_ON or WARN_ON are more complex and
> forces GCC to format the condition into a 0 or 1 value in a register.
> This will usually require 2 to 3 instructions.
> 
> [...]

Applied to powerpc/next.

[1/2] powerpc/bug: Remove specific powerpc BUG_ON() and WARN_ON() on PPC32
  https://git.kernel.org/powerpc/c/db87a7199229b75c9996bf78117eceb81854fce2
[2/2] powerpc/bug: Provide better flexibility to WARN_ON/__WARN_FLAGS() with 
asm goto
  https://git.kernel.org/powerpc/c/1e688dd2a3d6759d416616ff07afc4bb836c4213

cheers


Re: (subset) [PATCH 00/38] Replace deprecated CPU-hotplug

2021-08-18 Thread Michael Ellerman
On Tue, 3 Aug 2021 16:15:43 +0200, Sebastian Andrzej Siewior wrote:
> This is a tree wide replacement of the deprecated CPU hotplug functions
> which are only wrappers around the actual functions.
> 
> Each patch is independent and can be picked up by the relevant maintainer.
> 
> [...]

Applied to powerpc/next.

[03/38] powerpc: Replace deprecated CPU-hotplug functions.

https://git.kernel.org/powerpc/c/5ae36401ca4ea2737d779ce7c267444b16530001

cheers


Re: [PATCH v2 1/2] powerpc/book3s64/radix: make tlb_single_page_flush_ceiling a debugfs entry

2021-08-18 Thread Michael Ellerman
On Thu, 12 Aug 2021 18:58:30 +0530, Aneesh Kumar K.V wrote:
> Similar to x86/s390 add a debugfs file to tune tlb_single_page_flush_ceiling.
> Also add a debugfs entry for tlb_local_single_page_flush_ceiling.

Applied to powerpc/next.

[1/2] powerpc/book3s64/radix: make tlb_single_page_flush_ceiling a debugfs entry
  https://git.kernel.org/powerpc/c/3e188b1ae8807f26cc5a530a9d55f3f643fe050a
[2/2] powerpc: rename powerpc_debugfs_root to arch_debugfs_dir
  https://git.kernel.org/powerpc/c/dbf77fed8b302e87561c7c2fc06050c88f4d3120

cheers


Re: [PATCH v3 3/3] powerpc/perf: Fix the check for SIAR value

2021-08-18 Thread Christophe Leroy




On 18/08/2021 at 15:19, Kajol Jain wrote:

In case of random sampling, there can be scenarios where the
Sample Instruction Address Register (SIAR) may not latch
to the sampled instruction and could result in
a value of 0. In these scenarios it is preferred to
return regs->nip. These corner cases are seen in the
previous generation (p9) also.

The patch adds a check for the SIAR value along with the use_siar and
siar_valid checks so that the function will return regs->nip
in case SIAR is zero.

The patch drops the code under the PPMU_P10_DD1 flag check,
which handled the SIAR 0 case only for Power10 DD1.

Fixes: 2ca13a4cc56c9 ("powerpc/perf: Use regs->nip when SIAR is zero")
Signed-off-by: Kajol Jain 
---

Changelog:
- Drop adding new ternary condition to check siar value.
- Remove siar check specific for PPMU_P10_DD1 and add
   it along with common checks as suggested by Christophe Leroy
   and Michael Ellerman

  arch/powerpc/perf/core-book3s.c | 7 +--
  1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 23ec89a59893..55efbba7572b 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -2254,12 +2254,7 @@ unsigned long perf_instruction_pointer(struct pt_regs 
*regs)
bool use_siar = regs_use_siar(regs);
unsigned long siar = mfspr(SPRN_SIAR);
  
-	if (ppmu && (ppmu->flags & PPMU_P10_DD1)) {

-   if (siar)
-   return siar;
-   else
-   return regs->nip;
-   } else if (use_siar && siar_valid(regs))
+   if (use_siar && siar_valid(regs) && siar)


You can probably now do

+   if (regs_use_siar(regs) && siar_valid(regs) && siar)

and remove use_siar


return siar + perf_ip_adjust(regs);
else
return regs->nip;
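
For illustration, a minimal sketch of how the function might look with that
simplification applied (just the shape of it, not the posted patch):

	unsigned long perf_instruction_pointer(struct pt_regs *regs)
	{
		unsigned long siar = mfspr(SPRN_SIAR);

		/* use_siar folded into the condition, as suggested above */
		if (regs_use_siar(regs) && siar_valid(regs) && siar)
			return siar + perf_ip_adjust(regs);

		return regs->nip;
	}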



[PATCH v3 3/3] powerpc/perf: Fix the check for SIAR value

2021-08-18 Thread Kajol Jain
In case of random sampling, there can be scenarios where the
Sample Instruction Address Register (SIAR) may not latch to
the sampled instruction and could result in a value of 0. In
these scenarios it is preferred to return regs->nip. These
corner cases are seen in the previous generation (p9) also.

The patch adds a check for the SIAR value along with the use_siar
and siar_valid checks so that the function will return regs->nip
in case SIAR is zero.

The patch drops the code under the PPMU_P10_DD1 flag check,
which handled the SIAR 0 case only for Power10 DD1.

Fixes: 2ca13a4cc56c9 ("powerpc/perf: Use regs->nip when SIAR is zero")
Signed-off-by: Kajol Jain 
---

Changelog:
- Drop adding new ternary condition to check siar value.
- Remove siar check specific for PPMU_P10_DD1 and add
  it along with common checks as suggested by Christophe Leroy
  and Michael Ellerman

 arch/powerpc/perf/core-book3s.c | 7 +--
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 23ec89a59893..55efbba7572b 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -2254,12 +2254,7 @@ unsigned long perf_instruction_pointer(struct pt_regs 
*regs)
bool use_siar = regs_use_siar(regs);
unsigned long siar = mfspr(SPRN_SIAR);
 
-   if (ppmu && (ppmu->flags & PPMU_P10_DD1)) {
-   if (siar)
-   return siar;
-   else
-   return regs->nip;
-   } else if (use_siar && siar_valid(regs))
+   if (use_siar && siar_valid(regs) && siar)
return siar + perf_ip_adjust(regs);
else
return regs->nip;
-- 
2.26.2



[PATCH v3 2/3] powerpc/perf: Drop the case of returning 0 as instruction pointer

2021-08-18 Thread Kajol Jain
Drop the case of returning 0 as the instruction pointer, since the kernel
never executes at 0 and userspace almost never does either.

Fixes: e6878835ac47 ("powerpc/perf: Sample only if SIAR-Valid
bit is set in P7+")
Signed-off-by: Kajol Jain 
---
 arch/powerpc/perf/core-book3s.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 1b464aad29c4..23ec89a59893 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -2261,8 +2261,6 @@ unsigned long perf_instruction_pointer(struct pt_regs 
*regs)
return regs->nip;
} else if (use_siar && siar_valid(regs))
return siar + perf_ip_adjust(regs);
-   else if (use_siar)
-   return 0;   // no valid instruction pointer
else
return regs->nip;
 }
-- 
2.26.2



[PATCH v3 1/3] powerpc/perf: Use stack siar instead of mfspr

2021-08-18 Thread Kajol Jain
Minor optimization in the 'perf_instruction_pointer' function code by
making use of stack siar instead of mfspr.

Fixes: 75382aa72f06 ("powerpc/perf: Move code to select SIAR or pt_regs
into perf_read_regs")
Signed-off-by: Kajol Jain 
---
 arch/powerpc/perf/core-book3s.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index bb0ee716de91..1b464aad29c4 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -2260,7 +2260,7 @@ unsigned long perf_instruction_pointer(struct pt_regs 
*regs)
else
return regs->nip;
} else if (use_siar && siar_valid(regs))
-   return mfspr(SPRN_SIAR) + perf_ip_adjust(regs);
+   return siar + perf_ip_adjust(regs);
else if (use_siar)
return 0;   // no valid instruction pointer
else
-- 
2.26.2



Re: [PATCH v1 2/4] powerpc/64s/perf: add power_pmu_running to query whether perf is being used

2021-08-18 Thread Athira Rajeev



> On 18-Aug-2021, at 5:11 PM, Nicholas Piggin  wrote:
> 
> Excerpts from Madhavan Srinivasan's message of August 17, 2021 11:06 pm:
>> 
>> On 8/16/21 12:59 PM, Nicholas Piggin wrote:
>>> Interrupt handling code would like to know whether perf is enabled, to
>>> know whether it should enable MSR[EE] to improve PMI coverage.
>>> 
>>> Cc: Madhavan Srinivasan 
>>> Cc: Athira Rajeev 
>>> Signed-off-by: Nicholas Piggin 
>>> ---
>>>  arch/powerpc/include/asm/hw_irq.h |  2 ++
>>>  arch/powerpc/perf/core-book3s.c   | 13 +
>>>  2 files changed, 15 insertions(+)
>>> 
>>> diff --git a/arch/powerpc/include/asm/hw_irq.h 
>>> b/arch/powerpc/include/asm/hw_irq.h
>>> index 21cc571ea9c2..2d5c0d3ccbb6 100644
>>> --- a/arch/powerpc/include/asm/hw_irq.h
>>> +++ b/arch/powerpc/include/asm/hw_irq.h
>>> @@ -306,6 +306,8 @@ static inline bool lazy_irq_pending_nocheck(void)
>>> return __lazy_irq_pending(local_paca->irq_happened);
>>>  }
>>> 
>>> +bool power_pmu_running(void);
>>> +
>>>  /*
>>>   * This is called by asynchronous interrupts to conditionally
>>>   * re-enable hard interrupts after having cleared the source
>>> diff --git a/arch/powerpc/perf/core-book3s.c 
>>> b/arch/powerpc/perf/core-book3s.c
>>> index bb0ee716de91..76114a9afb2b 100644
>>> --- a/arch/powerpc/perf/core-book3s.c
>>> +++ b/arch/powerpc/perf/core-book3s.c
>>> @@ -2380,6 +2380,19 @@ static void perf_event_interrupt(struct pt_regs 
>>> *regs)
>>> perf_sample_event_took(sched_clock() - start_clock);
>>>  }
>>> 
>>> +bool power_pmu_running(void)
>>> +{
>>> +   struct cpu_hw_events *cpuhw;
>>> +
>>> +   /* Could this simply test local_paca->pmcregs_in_use? */
>>> +
>>> +   if (!ppmu)
>>> +   return false;
>> 
>> 
>> This covers only when perf platform driver is not registered,
>> but we should also check for MMCR0[32], since pmu sprs can be
>> accessed via sysfs.
> 
> In that case do they actually do anything with the PMI? I don't think it 
> should matter hopefully.
> 
> But I do think a lot of this stuff could be cleaned up. We have 
> pmcs_enabled and ppc_enable_pmcs() in sysfs.c, ppc_set_pmu_inuse(), 
> ppc_md.enable_pmcs(), reserve_pmc_hardware(), etc and different users 
> call different things. We don't consistently disable either, e.g., we 
> never disable the H_PERFMON facility after we stop using perf even 
> though it says that slows down partition switch.

Hi Nick,

I have started going through the code paths and working on getting the
code cleaned up.
I will post a patch set for the cleanup.

Thanks
Athira Rajeev
> 
> I started to have a look at sorting it out but it looks like a big
> job so would take a bit of time if we want to do it.
> 
> Thanks,
> Nick



Re: [PATCH v2] powerpc/mm: Fix set_memory_*() against concurrent accesses

2021-08-18 Thread Christophe Leroy




Le 18/08/2021 à 14:05, Michael Ellerman a écrit :

Laurent reported that STRICT_MODULE_RWX was causing intermittent crashes
on one of his systems:

   kernel tried to execute exec-protected page (c00804073278) - exploit 
attempt? (uid: 0)
   BUG: Unable to handle kernel instruction fetch
   Faulting instruction address: 0xc00804073278
   Oops: Kernel access of bad area, sig: 11 [#1]
   LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
   Modules linked in: drm virtio_console fuse drm_panel_orientation_quirks ...
   CPU: 3 PID: 44 Comm: kworker/3:1 Not tainted 5.14.0-rc4+ #12
   Workqueue: events control_work_handler [virtio_console]
   NIP:  c00804073278 LR: c00804073278 CTR: c01e9de0
   REGS: c0002e4ef7e0 TRAP: 0400   Not tainted  (5.14.0-rc4+)
   MSR:  80004280b033   CR: 24002822 XER: 
200400cf
   ...
   NIP fill_queue+0xf0/0x210 [virtio_console]
   LR  fill_queue+0xf0/0x210 [virtio_console]
   Call Trace:
 fill_queue+0xb4/0x210 [virtio_console] (unreliable)
 add_port+0x1a8/0x470 [virtio_console]
 control_work_handler+0xbc/0x1e8 [virtio_console]
 process_one_work+0x290/0x590
 worker_thread+0x88/0x620
 kthread+0x194/0x1a0
 ret_from_kernel_thread+0x5c/0x64

Jordan, Fabiano & Murilo were able to reproduce and identify that the
problem is caused by the call to module_enable_ro() in do_init_module(),
which happens after the module's init function has already been called.

Our current implementation of change_page_attr() is not safe against
concurrent accesses, because it invalidates the PTE before flushing the
TLB and then installing the new PTE. That leaves a window in time where
there is no valid PTE for the page, if another CPU tries to access the
page at that time we see something like the fault above.

We can't simply switch to set_pte_at()/flush TLB, because our hash MMU
code doesn't handle a set_pte_at() of a valid PTE. See [1].

But we do have pte_update(), which replaces the old PTE with the new,
meaning there's no window where the PTE is invalid. And the hash MMU
version hash__pte_update() deals with synchronising the hash page table
correctly.

[1]: https://lore.kernel.org/linuxppc-dev/87y318wp9r@linux.ibm.com/

Fixes: 1f9ad21c3b38 ("powerpc/mm: Implement set_memory() routines")
Reported-by: Laurent Vivier 
Signed-off-by: Fabiano Rosas 
Signed-off-by: Michael Ellerman 


Reviewed-by: Christophe Leroy 


---
  arch/powerpc/mm/pageattr.c | 23 ++-
  1 file changed, 10 insertions(+), 13 deletions(-)

v2: Use pte_update(..., ~0, pte_val(pte), ...) as suggested by Fabiano,
 and ptep_get() as suggested by Christophe.

diff --git a/arch/powerpc/mm/pageattr.c b/arch/powerpc/mm/pageattr.c
index 0876216ceee6..edea388e9d3f 100644
--- a/arch/powerpc/mm/pageattr.c
+++ b/arch/powerpc/mm/pageattr.c
@@ -18,16 +18,12 @@
  /*
   * Updates the attributes of a page in three steps:
   *
- * 1. invalidate the page table entry
- * 2. flush the TLB
- * 3. install the new entry with the updated attributes
- *
- * Invalidating the pte means there are situations where this will not work
- * when in theory it should.
- * For example:
- * - removing write from page whilst it is being executed
- * - setting a page read-only whilst it is being read by another CPU
+ * 1. take the page_table_lock
+ * 2. install the new entry with the updated attributes
+ * 3. flush the TLB
   *
+ * This sequence is safe against concurrent updates, and also allows updating 
the
+ * attributes of a page currently being executed or accessed.
   */
  static int change_page_attr(pte_t *ptep, unsigned long addr, void *data)
  {
@@ -36,9 +32,7 @@ static int change_page_attr(pte_t *ptep, unsigned long addr, 
void *data)
  
  	spin_lock(&init_mm.page_table_lock);
  
-	/* invalidate the PTE so it's safe to modify */

-   pte = ptep_get_and_clear(&init_mm, addr, ptep);
-   flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
+   pte = ptep_get(ptep);
  
  	/* modify the PTE bits as desired, then apply */

switch (action) {
@@ -59,11 +53,14 @@ static int change_page_attr(pte_t *ptep, unsigned long 
addr, void *data)
break;
}
  
-	set_pte_at(&init_mm, addr, ptep, pte);

+   pte_update(&init_mm, addr, ptep, ~0UL, pte_val(pte), 0);
  
  	/* See ptesync comment in radix__set_pte_at() */

if (radix_enabled())
asm volatile("ptesync": : :"memory");
+
+   flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
+
	spin_unlock(&init_mm.page_table_lock);
  
  	return 0;


base-commit: cbc06f051c524dcfe52ef0d1f30647828e226d30



[PATCH v2] powerpc/mm: Fix set_memory_*() against concurrent accesses

2021-08-18 Thread Michael Ellerman
Laurent reported that STRICT_MODULE_RWX was causing intermittent crashes
on one of his systems:

  kernel tried to execute exec-protected page (c00804073278) - exploit 
attempt? (uid: 0)
  BUG: Unable to handle kernel instruction fetch
  Faulting instruction address: 0xc00804073278
  Oops: Kernel access of bad area, sig: 11 [#1]
  LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
  Modules linked in: drm virtio_console fuse drm_panel_orientation_quirks ...
  CPU: 3 PID: 44 Comm: kworker/3:1 Not tainted 5.14.0-rc4+ #12
  Workqueue: events control_work_handler [virtio_console]
  NIP:  c00804073278 LR: c00804073278 CTR: c01e9de0
  REGS: c0002e4ef7e0 TRAP: 0400   Not tainted  (5.14.0-rc4+)
  MSR:  80004280b033   CR: 24002822 XER: 
200400cf
  ...
  NIP fill_queue+0xf0/0x210 [virtio_console]
  LR  fill_queue+0xf0/0x210 [virtio_console]
  Call Trace:
fill_queue+0xb4/0x210 [virtio_console] (unreliable)
add_port+0x1a8/0x470 [virtio_console]
control_work_handler+0xbc/0x1e8 [virtio_console]
process_one_work+0x290/0x590
worker_thread+0x88/0x620
kthread+0x194/0x1a0
ret_from_kernel_thread+0x5c/0x64

Jordan, Fabiano & Murilo were able to reproduce and identify that the
problem is caused by the call to module_enable_ro() in do_init_module(),
which happens after the module's init function has already been called.

Our current implementation of change_page_attr() is not safe against
concurrent accesses, because it invalidates the PTE before flushing the
TLB and then installing the new PTE. That leaves a window in time where
there is no valid PTE for the page, if another CPU tries to access the
page at that time we see something like the fault above.

We can't simply switch to set_pte_at()/flush TLB, because our hash MMU
code doesn't handle a set_pte_at() of a valid PTE. See [1].

But we do have pte_update(), which replaces the old PTE with the new,
meaning there's no window where the PTE is invalid. And the hash MMU
version hash__pte_update() deals with synchronising the hash page table
correctly.

[1]: https://lore.kernel.org/linuxppc-dev/87y318wp9r@linux.ibm.com/

Fixes: 1f9ad21c3b38 ("powerpc/mm: Implement set_memory() routines")
Reported-by: Laurent Vivier 
Signed-off-by: Fabiano Rosas 
Signed-off-by: Michael Ellerman 
---
 arch/powerpc/mm/pageattr.c | 23 ++-
 1 file changed, 10 insertions(+), 13 deletions(-)

v2: Use pte_update(..., ~0, pte_val(pte), ...) as suggested by Fabiano,
and ptep_get() as suggested by Christophe.

diff --git a/arch/powerpc/mm/pageattr.c b/arch/powerpc/mm/pageattr.c
index 0876216ceee6..edea388e9d3f 100644
--- a/arch/powerpc/mm/pageattr.c
+++ b/arch/powerpc/mm/pageattr.c
@@ -18,16 +18,12 @@
 /*
  * Updates the attributes of a page in three steps:
  *
- * 1. invalidate the page table entry
- * 2. flush the TLB
- * 3. install the new entry with the updated attributes
- *
- * Invalidating the pte means there are situations where this will not work
- * when in theory it should.
- * For example:
- * - removing write from page whilst it is being executed
- * - setting a page read-only whilst it is being read by another CPU
+ * 1. take the page_table_lock
+ * 2. install the new entry with the updated attributes
+ * 3. flush the TLB
  *
+ * This sequence is safe against concurrent updates, and also allows updating 
the
+ * attributes of a page currently being executed or accessed.
  */
 static int change_page_attr(pte_t *ptep, unsigned long addr, void *data)
 {
@@ -36,9 +32,7 @@ static int change_page_attr(pte_t *ptep, unsigned long addr, 
void *data)
 
	spin_lock(&init_mm.page_table_lock);
 
-   /* invalidate the PTE so it's safe to modify */
-   pte = ptep_get_and_clear(&init_mm, addr, ptep);
-   flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
+   pte = ptep_get(ptep);
 
/* modify the PTE bits as desired, then apply */
switch (action) {
@@ -59,11 +53,14 @@ static int change_page_attr(pte_t *ptep, unsigned long 
addr, void *data)
break;
}
 
-   set_pte_at(&init_mm, addr, ptep, pte);
+   pte_update(&init_mm, addr, ptep, ~0UL, pte_val(pte), 0);
 
/* See ptesync comment in radix__set_pte_at() */
if (radix_enabled())
asm volatile("ptesync": : :"memory");
+
+   flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
+
	spin_unlock(&init_mm.page_table_lock);
 
return 0;

base-commit: cbc06f051c524dcfe52ef0d1f30647828e226d30
-- 
2.25.1



Re: [PATCH v1 2/4] powerpc/64s/perf: add power_pmu_running to query whether perf is being used

2021-08-18 Thread Nicholas Piggin
Excerpts from Madhavan Srinivasan's message of August 17, 2021 11:06 pm:
> 
> On 8/16/21 12:59 PM, Nicholas Piggin wrote:
>> Interrupt handling code would like to know whether perf is enabled, to
>> know whether it should enable MSR[EE] to improve PMI coverage.
>>
>> Cc: Madhavan Srinivasan 
>> Cc: Athira Rajeev 
>> Signed-off-by: Nicholas Piggin 
>> ---
>>   arch/powerpc/include/asm/hw_irq.h |  2 ++
>>   arch/powerpc/perf/core-book3s.c   | 13 +
>>   2 files changed, 15 insertions(+)
>>
>> diff --git a/arch/powerpc/include/asm/hw_irq.h 
>> b/arch/powerpc/include/asm/hw_irq.h
>> index 21cc571ea9c2..2d5c0d3ccbb6 100644
>> --- a/arch/powerpc/include/asm/hw_irq.h
>> +++ b/arch/powerpc/include/asm/hw_irq.h
>> @@ -306,6 +306,8 @@ static inline bool lazy_irq_pending_nocheck(void)
>>  return __lazy_irq_pending(local_paca->irq_happened);
>>   }
>>   
>> +bool power_pmu_running(void);
>> +
>>   /*
>>* This is called by asynchronous interrupts to conditionally
>>* re-enable hard interrupts after having cleared the source
>> diff --git a/arch/powerpc/perf/core-book3s.c 
>> b/arch/powerpc/perf/core-book3s.c
>> index bb0ee716de91..76114a9afb2b 100644
>> --- a/arch/powerpc/perf/core-book3s.c
>> +++ b/arch/powerpc/perf/core-book3s.c
>> @@ -2380,6 +2380,19 @@ static void perf_event_interrupt(struct pt_regs *regs)
>>  perf_sample_event_took(sched_clock() - start_clock);
>>   }
>>   
>> +bool power_pmu_running(void)
>> +{
>> +struct cpu_hw_events *cpuhw;
>> +
>> +/* Could this simply test local_paca->pmcregs_in_use? */
>> +
>> +if (!ppmu)
>> +return false;
> 
> 
> This covers only when perf platform driver is not registered,
> but we should also check for MMCR0[32], since pmu sprs can be
> accessed via sysfs.

In that case do they actually do anything with the PMI? I don't think it 
should matter hopefully.

But I do think a lot of this stuff could be cleaned up. We have 
pmcs_enabled and ppc_enable_pmcs() in sysfs.c, ppc_set_pmu_inuse(), 
ppc_md.enable_pmcs(), reserve_pmc_hardware(), etc and different users 
call different things. We don't consistently disable either, e.g., we 
never disable the H_PERFMON facility after we stop using perf even 
though it says that slows down partition switch.

I started to have a look at sorting it out but it looks like a big
job so would take a bit of time if we want to do it.

Thanks,
Nick
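
For reference, a rough sketch of the additional MMCR0 check Madhavan
describes above, assuming MMCR0_FC (the global freeze bit, MMCR0[32]) is
the right thing to test (illustrative only, not part of the posted series):

	bool power_pmu_running(void)
	{
		if (!ppmu)
			return false;

		/* Treat a frozen PMU (MMCR0[FC] set, e.g. only sysfs users) as not running */
		return !(mfspr(SPRN_MMCR0) & MMCR0_FC);
	}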


Re: [PATCH] scsi: ibmvfc: Stop using scsi_cmnd.tag

2021-08-18 Thread John Garry

Hi Martin,




Use scsi_cmd_to_rq(scsi_cmnd)->tag in preference to scsi_cmnd.tag.


Applied to 5.15/scsi-staging and rebased for bisectability.



Thanks, and sorry for the hassle. But I would still like the maintainers 
to have a look, as I was curious about current usage of scsi_cmnd.tag in 
that driver.



Just to be picky, it looks like there's another scsi_cmnd tag lurking in
qla1280.c, but it's sitting behind an #ifdef DEBUG_QLA1280.



That driver does not even compile with DEBUG_QLA1280 set beforehand. 
I'll fix that up and send as separate patches in case you want to 
shuffle the tag patch in earlier, which is prob not worth the effort.


I've done a good few more x86 randconfigs and tried to audit the code 
for more references, so hopefully that's the last.


Thanks


Re: [PATCH v8 2/3] tty: hvc: pass DMA capable memory to put_chars()

2021-08-18 Thread kernel test robot
Hi Xianting,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on tty/tty-testing]
[also build test WARNING on char-misc/char-misc-testing soc/for-next v5.14-rc6 
next-20210818]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Xianting-Tian/make-hvc-pass-dma-capable-memory-to-its-backend/20210818-162408
base:   https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git 
tty-testing
config: sh-allmodconfig (attached as .config)
compiler: sh4-linux-gcc (GCC) 11.2.0
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# 
https://github.com/0day-ci/linux/commit/e1b7662dafceb07a6905b64da2f1be27498c4a46
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review 
Xianting-Tian/make-hvc-pass-dma-capable-memory-to-its-backend/20210818-162408
git checkout e1b7662dafceb07a6905b64da2f1be27498c4a46
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross ARCH=sh 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All warnings (new ones prefixed by >>):

   drivers/tty/hvc/hvc_console.c: In function 'hvc_poll_put_char':
>> drivers/tty/hvc/hvc_console.c:880:55: warning: passing argument 2 of 
>> 'hp->ops->put_chars' makes pointer from integer without a cast 
>> [-Wint-conversion]
 880 | n = hp->ops->put_chars(hp->vtermno, hp->out_ch, 1);
 | ~~^~~~
 |   |
 |   char
   drivers/tty/hvc/hvc_console.c:880:55: note: expected 'const char *' but 
argument is of type 'char'

Kconfig warnings: (for reference only)
   WARNING: unmet direct dependencies detected for SND_ATMEL_SOC_PDC
   Depends on SOUND && !UML && SND && SND_SOC && SND_ATMEL_SOC && HAS_DMA
   Selected by
   - SND_ATMEL_SOC_SSC && SOUND && !UML && SND && SND_SOC && SND_ATMEL_SOC
   - SND_ATMEL_SOC_SSC_PDC && SOUND && !UML && SND && SND_SOC && SND_ATMEL_SOC 
&& ATMEL_SSC


vim +880 drivers/tty/hvc/hvc_console.c

   870  
   871  static void hvc_poll_put_char(struct tty_driver *driver, int line, char 
ch)
   872  {
   873  struct tty_struct *tty = driver->ttys[0];
   874  struct hvc_struct *hp = tty->driver_data;
   875  int n;
   876  
   877  hp->out_ch = ch;
   878  
   879  do {
 > 880  n = hp->ops->put_chars(hp->vtermno, hp->out_ch, 1);
   881  } while (n <= 0);
   882  }
   883  #endif
   884  
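
(The warning is about passing the char by value; presumably a follow-up
revision would take the address of the buffered character instead, e.g.:

	n = hp->ops->put_chars(hp->vtermno, &hp->out_ch, 1);

but that is an assumption about a future version, not part of this report.)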

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


.config.gz
Description: application/gzip


[PATCH] powerpc/32: Remove unnecessary calculations in load_up_{fpu/altivec}

2021-08-18 Thread Christophe Leroy
No need to re-read SPRN_THREAD; we can calculate the thread address
from current (r2).

And remove a reload of value 1 into r4 as r4 is already 1.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/kernel/fpu.S| 3 +--
 arch/powerpc/kernel/vector.S | 4 +---
 2 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kernel/fpu.S b/arch/powerpc/kernel/fpu.S
index 6010adcee16e..ba4afe3b5a9c 100644
--- a/arch/powerpc/kernel/fpu.S
+++ b/arch/powerpc/kernel/fpu.S
@@ -91,8 +91,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_VSX)
isync
/* enable use of FP after return */
 #ifdef CONFIG_PPC32
-   mfspr   r5,SPRN_SPRG_THREAD /* current task's THREAD (phys) */
-   tovirt(r5, r5)
+   addir5,r2,THREAD
lwz r4,THREAD_FPEXC_MODE(r5)
ori r9,r9,MSR_FP/* enable FP for current */
or  r9,r9,r4
diff --git a/arch/powerpc/kernel/vector.S b/arch/powerpc/kernel/vector.S
index fc120fac1910..ba03eedfdcd8 100644
--- a/arch/powerpc/kernel/vector.S
+++ b/arch/powerpc/kernel/vector.S
@@ -65,9 +65,8 @@ _GLOBAL(load_up_altivec)
 1:
/* enable use of VMX after return */
 #ifdef CONFIG_PPC32
-   mfspr   r5,SPRN_SPRG_THREAD /* current task's THREAD (phys) 
*/
+   addir5,r2,THREAD
orisr9,r9,MSR_VEC@h
-   tovirt(r5, r5)
 #else
ld  r4,PACACURRENT(r13)
addir5,r4,THREAD/* Get THREAD */
@@ -81,7 +80,6 @@ _GLOBAL(load_up_altivec)
li  r4,1
stb r4,THREAD_LOAD_VEC(r5)
addir6,r5,THREAD_VRSTATE
-   li  r4,1
li  r10,VRSTATE_VSCR
stw r4,THREAD_USED_VR(r5)
lvx v0,r10,r6
-- 
2.25.0



[PATCH v8 3/3] virtio-console: remove unnecessary kmemdup()

2021-08-18 Thread Xianting Tian
This reverts commit c4baad5029 ("virtio-console: avoid DMA from stack").

The hvc framework will never pass stack memory to the put_chars() function,
so the call to kmemdup() is unnecessary and we can remove it.

Signed-off-by: Xianting Tian 
Reviewed-by: Shile Zhang 
---
 drivers/char/virtio_console.c | 12 ++--
 1 file changed, 2 insertions(+), 10 deletions(-)

diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
index 7eaf303a7..4ed3ffb1d 100644
--- a/drivers/char/virtio_console.c
+++ b/drivers/char/virtio_console.c
@@ -1117,8 +1117,6 @@ static int put_chars(u32 vtermno, const char *buf, int 
count)
 {
struct port *port;
struct scatterlist sg[1];
-   void *data;
-   int ret;
 
if (unlikely(early_put_chars))
return early_put_chars(vtermno, buf, count);
@@ -1127,14 +1125,8 @@ static int put_chars(u32 vtermno, const char *buf, int 
count)
if (!port)
return -EPIPE;
 
-   data = kmemdup(buf, count, GFP_ATOMIC);
-   if (!data)
-   return -ENOMEM;
-
-   sg_init_one(sg, data, count);
-   ret = __send_to_port(port, sg, 1, count, data, false);
-   kfree(data);
-   return ret;
+   sg_init_one(sg, buf, count);
+   return __send_to_port(port, sg, 1, count, (void *)buf, false);
 }
 
 /*
-- 
2.17.1



[PATCH v8 2/3] tty: hvc: pass DMA capable memory to put_chars()

2021-08-18 Thread Xianting Tian
As is well known, an hvc backend driver (e.g., virtio-console) can register
its operations with the hvc framework. The operations can contain
put_chars(), get_chars() and so on.

Some hvc backends may do DMA in their operations, e.g., put_chars() of
virtio-console. But in the code of the hvc framework, DMA-incapable
memory may be passed to put_chars() under a specific configuration,
which is explained in commit c4baad5029 ("virtio-console: avoid DMA
from stack"):
1, c[] is on stack,
   hvc_console_print():
char c[N_OUTBUF] __ALIGNED__;
cons_ops[index]->put_chars(vtermnos[index], c, i);
2, ch is on stack,
   static void hvc_poll_put_char(,,char ch)
   {
struct tty_struct *tty = driver->ttys[0];
struct hvc_struct *hp = tty->driver_data;
int n;

do {
n = hp->ops->put_chars(hp->vtermno, &ch, 1);
} while (n <= 0);
   }

Commit c4baad5029 is just a fix to avoid DMA from the stack memory that
the hvc framework passes to virtio-console in the above code. But I think
the fix is aggressive: it directly uses kmemdup() to allocate a new buffer
from the kmalloc area and does a memcpy, no matter whether the memory is
in the kmalloc area or not. But most importantly, it should better be
fixed in the hvc framework, by changing it to never pass stack memory to
the put_chars() function in the first place. Otherwise, we still face the
same issue if a new hvc backend using DMA is added in the future.

In this patch, we make 'char out_buf[N_OUTBUF]' and 'char out_ch' part
of 'struct hvc_struct', so both buffers are no longer on the stack. We
can use them in the above two cases respectively.

Introduce another array (cons_outbufs[]) for the buffer pointers, next to
the cons_ops[] and vtermnos[] arrays. With the array, we can easily find
the buffer instead of traversing the hp list.

With the patch, we can remove the fix c4baad5029.

Signed-off-by: Xianting Tian 
Reviewed-by: Shile Zhang 
---
 drivers/tty/hvc/hvc_console.c | 27 ---
 drivers/tty/hvc/hvc_console.h | 16 ++--
 2 files changed, 26 insertions(+), 17 deletions(-)

diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c
index 5bb8c4e44..300e9c037 100644
--- a/drivers/tty/hvc/hvc_console.c
+++ b/drivers/tty/hvc/hvc_console.c
@@ -41,16 +41,6 @@
  */
 #define HVC_CLOSE_WAIT (HZ/100) /* 1/10 of a second */
 
-/*
- * These sizes are most efficient for vio, because they are the
- * native transfer size. We could make them selectable in the
- * future to better deal with backends that want other buffer sizes.
- */
-#define N_OUTBUF   16
-#define N_INBUF16
-
-#define __ALIGNED__ __attribute__((__aligned__(L1_CACHE_BYTES)))
-
 static struct tty_driver *hvc_driver;
 static struct task_struct *hvc_task;
 
@@ -142,6 +132,7 @@ static int hvc_flush(struct hvc_struct *hp)
 static const struct hv_ops *cons_ops[MAX_NR_HVC_CONSOLES];
 static uint32_t vtermnos[MAX_NR_HVC_CONSOLES] =
{[0 ... MAX_NR_HVC_CONSOLES - 1] = -1};
+static char *cons_outbufs[MAX_NR_HVC_CONSOLES];
 
 /*
  * Console APIs, NOT TTY.  These APIs are available immediately when
@@ -151,7 +142,7 @@ static uint32_t vtermnos[MAX_NR_HVC_CONSOLES] =
 static void hvc_console_print(struct console *co, const char *b,
  unsigned count)
 {
-   char c[N_OUTBUF] __ALIGNED__;
+   char *c;
unsigned i = 0, n = 0;
int r, donecr = 0, index = co->index;
 
@@ -163,6 +154,10 @@ static void hvc_console_print(struct console *co, const 
char *b,
if (vtermnos[index] == -1)
return;
 
+   c = cons_outbufs[index];
+   if (!c)
+   return;
+
while (count > 0 || i > 0) {
if (count > 0 && i < sizeof(c)) {
if (b[n] == '\n' && !donecr) {
@@ -879,8 +874,10 @@ static void hvc_poll_put_char(struct tty_driver *driver, 
int line, char ch)
struct hvc_struct *hp = tty->driver_data;
int n;
 
+   hp->out_ch = ch;
+
do {
-   n = hp->ops->put_chars(hp->vtermno, &ch, 1);
+   n = hp->ops->put_chars(hp->vtermno, hp->out_ch, 1);
} while (n <= 0);
 }
 #endif
@@ -922,8 +919,7 @@ struct hvc_struct *hvc_alloc(uint32_t vtermno, int data,
return ERR_PTR(err);
}
 
-   hp = kzalloc(ALIGN(sizeof(*hp), sizeof(long)) + outbuf_size,
-   GFP_KERNEL);
+   hp = kzalloc(struct_size(hp, outbuf, outbuf_size), GFP_KERNEL);
if (!hp)
return ERR_PTR(-ENOMEM);
 
@@ -931,7 +927,6 @@ struct hvc_struct *hvc_alloc(uint32_t vtermno, int data,
hp->data = data;
hp->ops = ops;
hp->outbuf_size = outbuf_size;
-   hp->outbuf = &((char *)hp)[ALIGN(sizeof(*hp), sizeof(long))];
 
	tty_port_init(&hp->port);
	hp->port.ops = &hvc_port_ops;
@@ -964,6 +959,7 @@ struct hvc_struct *hvc_alloc(uint32_t vtermno, int data,
if (i < MAX_NR_HVC_CONSOLES) {
cons_ops[i] = ops;
vtermnos[i] = 

[PATCH v8 0/3] make hvc pass dma capable memory to its backend

2021-08-18 Thread Xianting Tian
Dear all,

This patch series makes the hvc framework pass DMA capable memory to
put_chars() of the hvc backend (e.g., virtio-console), and reverts commit
c4baad5029 ("virtio-console: avoid DMA from stack").

V1
virtio-console: avoid DMA from vmalloc area
https://lkml.org/lkml/2021/7/27/494

For v1 patch, Arnd Bergmann suggests to fix the issue in the first
place:
Make hvc pass DMA capable memory to put_chars()
The fix suggestion is included in v2.

V2
[PATCH 1/2] tty: hvc: pass DMA capable memory to put_chars()
https://lkml.org/lkml/2021/8/1/8
[PATCH 2/2] virtio-console: remove unnecessary kmemdup()
https://lkml.org/lkml/2021/8/1/9

For v2 patch, Arnd Bergmann suggests to make new buf part of the
hvc_struct structure, and fix the compile issue.
The fix suggestion is included in v3.

V3
[PATCH v3 1/2] tty: hvc: pass DMA capable memory to put_chars()
https://lkml.org/lkml/2021/8/3/1347
[PATCH v3 2/2] virtio-console: remove unnecessary kmemdup()
https://lkml.org/lkml/2021/8/3/1348

For v3 patch, Jiri Slaby suggests to make 'char c[N_OUTBUF]' part of
hvc_struct, and make 'hp->outbuf' aligned and use struct_size() to
calculate the size of hvc_struct. The fix suggestion is included in
v4.

V4
[PATCH v4 0/2] make hvc pass dma capable memory to its backend
https://lkml.org/lkml/2021/8/5/1350
[PATCH v4 1/2] tty: hvc: pass DMA capable memory to put_chars()
https://lkml.org/lkml/2021/8/5/1351
[PATCH v4 2/2] virtio-console: remove unnecessary kmemdup()
https://lkml.org/lkml/2021/8/5/1352

For v4 patch, Arnd Bergmann suggests to introduce another
array (cons_outbuf[]) for the buffer pointers next to the cons_ops[]
and vtermnos[] arrays. This fix is included in the v5 patch.

V5
Arnd Bergmann suggests to use "L1_CACHE_BYTES" as the DMA alignment;
using 'sizeof(long)' as the DMA alignment is wrong. Fixed in v6.

V6
It contained a coding error, fixed in v7, which worked normally
according to the test results.

V7
Greg KH suggests adding testers and code reviewers,
Jiri Slaby suggests using a lockless buffer and fixing the DMA alignment
in a separate patch.
The above are addressed in v8.

drivers/tty/hvc/hvc_console.c | 27 ---
drivers/tty/hvc/hvc_console.h | 16 ++--
drivers/tty/hvc/hvc_console.h | 16 --
3 file changed


[PATCH v8 1/3] tty: hvc: use correct dma alignment size

2021-08-18 Thread Xianting Tian
Use L1_CACHE_BYTES as the DMA alignment size; using 'sizeof(long)'
is wrong.

Signed-off-by: Xianting Tian 
Reviewed-by: Shile Zhang 
---
 drivers/tty/hvc/hvc_console.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c
index 5bb8c4e44..5957ab728 100644
--- a/drivers/tty/hvc/hvc_console.c
+++ b/drivers/tty/hvc/hvc_console.c
@@ -49,7 +49,7 @@
 #define N_OUTBUF   16
 #define N_INBUF16
 
-#define __ALIGNED__ __attribute__((__aligned__(sizeof(long))))
+#define __ALIGNED__ __attribute__((__aligned__(L1_CACHE_BYTES)))
 
 static struct tty_driver *hvc_driver;
 static struct task_struct *hvc_task;
-- 
2.17.1



Re: [PATCH] powerpc/mm: Fix set_memory_*() against concurrent accesses

2021-08-18 Thread Michael Ellerman
Fabiano Rosas  writes:
> Michael Ellerman  writes:
>
> Hi, I already mentioned these things in private, but I'll post here so
> everyone can see:
>
>> Because pte_update() takes the set of PTE bits to set and clear we can't
>> use our existing helpers, eg. pte_wrprotect() etc. and instead have to
>> open code the set of flags. We will clean that up somehow in a future
>> commit.
>
> I tested the following on P9 and it seems to work fine. Not sure if it
> works for CONFIG_PPC_8xx, though.
>
>
>  static int change_page_attr(pte_t *ptep, unsigned long addr, void *data)
>  {
>   long action = (long)data;
>   pte_t pte;
>  
>   spin_lock(&init_mm.page_table_lock);
> -
> - /* invalidate the PTE so it's safe to modify */
> - pte = ptep_get_and_clear(_mm, addr, ptep);
> - flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
> + pte = *ptep;
>  
>   /* modify the PTE bits as desired, then apply */
>   switch (action) {
> @@ -59,11 +42,9 @@ static int change_page_attr(pte_t *ptep, unsigned long 
> addr, void *data)
>   break;
>   }
>  
> - set_pte_at(&init_mm, addr, ptep, pte);
> + pte_update(&init_mm, addr, ptep, ~0UL, pte_val(pte), 0);

I avoided that because it's not atomic, but pte_update() is not atomic
on some platforms anyway. And for now at least we still have the
page_table_lock as well.

So you're right that's a nicer way to do it.

And I'll use ptep_get() as Christophe suggested.

> + flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
>  
> - /* See ptesync comment in radix__set_pte_at() */
> - if (radix_enabled())
> - asm volatile("ptesync": : :"memory");

I didn't do that because I wanted to keep the patch minimal. We can do
that as a separate patch.

>   spin_unlock(&init_mm.page_table_lock);
>  
>   return 0;
> ---
>
> For reference, the full patch is here:
> https://github.com/farosas/linux/commit/923c95c84d7081d7be9503bf5b276dd93bd17036.patch
>
>>
>> [1]: https://lore.kernel.org/linuxppc-dev/87y318wp9r@linux.ibm.com/
>>
>> Fixes: 1f9ad21c3b38 ("powerpc/mm: Implement set_memory() routines")
>> Reported-by: Laurent Vivier 
>> Signed-off-by: Michael Ellerman 
>> ---
>
> ...
>
>> -set_pte_at(&init_mm, addr, ptep, pte);
>> +pte_update(&init_mm, addr, ptep, clear, set, 0);
>>  
>>  /* See ptesync comment in radix__set_pte_at() */
>>  if (radix_enabled())
>>  asm volatile("ptesync": : :"memory");
>> +
>> +flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
>
> I think there's an optimization possible here, when relaxing access, to
> skip the TLB flush. Would still need the ptesync though. Similar to what
> Nick did in e5f7cb58c2b7 ("powerpc/64s/radix: do not flush TLB when
> relaxing access").

That commit is specific to Radix, whereas this code needs to work on all
platforms.

We'd need to verify that it's safe to skip the flush on all platforms
and CPU versions.

What I think we can do, and would possibly be a more meaningful
optimisation, is to move the TLB flush out of the loop and up into
change_memory_attr(). So we just do it once for the range, rather than
per page. But that too would be a separate patch.
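
A rough sketch of that follow-up, assuming the existing start/size locals
in change_memory_attr() and with the per-page flush dropped from
change_page_attr() (illustrative only, not a posted patch):

	/* in change_memory_attr(), flush once for the whole range */
	ret = apply_to_existing_page_range(&init_mm, start, size,
					   change_page_attr, (void *)action);
	flush_tlb_kernel_range(start, start + size);
	return ret;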

cheers


[PATCH] sched/topology: Skip updating masks for non-online nodes

2021-08-18 Thread Srikar Dronamraju
From: Valentin Schneider 

The scheduler currently expects NUMA node distances to be stable from
init onwards, and as a consequence builds the related data structures
once-and-for-all at init (see sched_init_numa()).

Unfortunately, on some architectures node distance is unreliable for
offline nodes and may very well change upon onlining.

Skip over offline nodes during sched_init_numa(). Track nodes that have
been onlined at least once, and trigger a build of a node's NUMA masks
when it is first onlined post-init.

Cc: LKML 
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Nathan Lynch 
Cc: Michael Ellerman 
Cc: Ingo Molnar 
Cc: Peter Zijlstra 
Cc: Valentin Schneider 
Cc: Gautham R Shenoy 
Cc: Dietmar Eggemann 
Cc: Mel Gorman 
Cc: Vincent Guittot 
Cc: Rik van Riel 
Cc: Geetika Moolchandani 
Cc: Laurent Dufour 
Reported-by: Geetika Moolchandani 
Signed-off-by: Srikar Dronamraju 
Signed-off-by: Valentin Schneider 
---
Changelog:
[Fixed warning: no previous prototype for '__sched_domains_numa_masks_set']
 kernel/sched/topology.c | 65 +
 1 file changed, 65 insertions(+)

diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index b77ad49dc14f..4e8698e62f07 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -1482,6 +1482,8 @@ int   sched_max_numa_distance;
 static int *sched_domains_numa_distance;
 static struct cpumask  ***sched_domains_numa_masks;
 int __read_mostly  node_reclaim_distance = RECLAIM_DISTANCE;
+
+static unsigned long __read_mostly *sched_numa_onlined_nodes;
 #endif
 
 /*
@@ -1833,6 +1835,16 @@ void sched_init_numa(void)
sched_domains_numa_masks[i][j] = mask;
 
for_each_node(k) {
+   /*
+* Distance information can be unreliable for
+* offline nodes, defer building the node
+* masks to its bringup.
+* This relies on all unique distance values
+* still being visible at init time.
+*/
+   if (!node_online(j))
+   continue;
+
if (sched_debug() && (node_distance(j, k) != 
node_distance(k, j)))
sched_numa_warn("Node-distance not 
symmetric");
 
@@ -1886,6 +1898,53 @@ void sched_init_numa(void)
sched_max_numa_distance = sched_domains_numa_distance[nr_levels - 1];
 
init_numa_topology_type();
+
+   sched_numa_onlined_nodes = bitmap_alloc(nr_node_ids, GFP_KERNEL);
+   if (!sched_numa_onlined_nodes)
+   return;
+
+   bitmap_zero(sched_numa_onlined_nodes, nr_node_ids);
+   for_each_online_node(i)
+   bitmap_set(sched_numa_onlined_nodes, i, 1);
+}
+
+static void __sched_domains_numa_masks_set(unsigned int node)
+{
+   int i, j;
+
+   /*
+* NUMA masks are not built for offline nodes in sched_init_numa().
+* Thus, when a CPU of a never-onlined-before node gets plugged in,
+* adding that new CPU to the right NUMA masks is not sufficient: the
+* masks of that CPU's node must also be updated.
+*/
+   if (test_bit(node, sched_numa_onlined_nodes))
+   return;
+
+   bitmap_set(sched_numa_onlined_nodes, node, 1);
+
+   for (i = 0; i < sched_domains_numa_levels; i++) {
+   for (j = 0; j < nr_node_ids; j++) {
+   if (!node_online(j) || node == j)
+   continue;
+
+   if (node_distance(j, node) > 
sched_domains_numa_distance[i])
+   continue;
+
+   /* Add remote nodes in our masks */
+   cpumask_or(sched_domains_numa_masks[i][node],
+  sched_domains_numa_masks[i][node],
+  sched_domains_numa_masks[0][j]);
+   }
+   }
+
+   /*
+* A new node has been brought up, potentially changing the topology
+* classification.
+*
+* Note that this is racy vs any use of sched_numa_topology_type :/
+*/
+   init_numa_topology_type();
 }
 
 void sched_domains_numa_masks_set(unsigned int cpu)
@@ -1893,8 +1952,14 @@ void sched_domains_numa_masks_set(unsigned int cpu)
int node = cpu_to_node(cpu);
int i, j;
 
+   __sched_domains_numa_masks_set(node);
+
for (i = 0; i < sched_domains_numa_levels; i++) {
for (j = 0; j < nr_node_ids; j++) {
+   if (!node_online(j))
+   continue;
+
+   /* Set ourselves in the remote node's masks */
if (node_distance(j, node) <= 
sched_domains_numa_distance[i])

[PATCH v2] powerpc/32s: Fix random crashes by adding isync() after locking/unlocking KUEP

2021-08-18 Thread Christophe Leroy
Commit b5efec00b671 ("powerpc/32s: Move KUEP locking/unlocking in C")
removed the 'isync' instruction after adding/removing NX bit in user
segments. The reasoning behind this change was that when setting the
NX bit we don't mind it taking effect with delay as the kernel never
executes text from userspace, and when clearing the NX bit this is
to return to userspace and then the 'rfi' should synchronise the
context.

However, it looks like on book3s/32 with a hash page table, at least
on the G3 processor, we get an unexpected fault from userspace, which
is then followed by something wrong in the verification of MSR_PR
at the end of another interrupt.

This is fixed by adding back the removed isync() following the update
of the NX bit in the user segment registers. Only do it for cores with
a hash table, as 603 cores don't exhibit that problem and the two isync
instructions increase the ./null_syscall selftest by 6 cycles on an
MPC 832x.
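
A rough sketch of the kind of change described, with helper and feature
names being approximations rather than the exact code in the patch:

	static __always_inline void kuep_lock(void)
	{
		if (kuep_is_disabled())
			return;

		update_user_segments(mfsr(0) | SR_NX);
		/* 603 cores don't need the context sync, hash cores do */
		if (mmu_has_feature(MMU_FTR_HPTE_TABLE))
			isync();
	}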

First problem: unexpected WARN_ON() for mysterious PROTFAULT

[   62.896426] WARNING: CPU: 0 PID: 1660 at arch/powerpc/mm/fault.c:354 
do_page_fault+0x6c/0x5b0
[   62.918111] Modules linked in:
[   62.923350] CPU: 0 PID: 1660 Comm: Xorg Not tainted 
5.13.0-pmac-00028-gb3c15b60339a #40
[   62.943476] NIP:  c001b5c8 LR: c001b6f8 CTR: 
[   62.954714] REGS: e2d09e40 TRAP: 0700   Not tainted  
(5.13.0-pmac-00028-gb3c15b60339a)
[   62.974581] MSR:  00021032   CR: 42d04f30  XER: 2000
[   62.990009]
   GPR00: c000424c e2d09f00 c301b680 e2d09f40 001e 
4200 00cba028 
   GPR08: 0800 4810 c301b680 e2d09f30 22d09f30 
00c1fff0 00cba000 a7b7ba4c
   GPR16: 0031     
 a7b7b0d0 00c5c010
   GPR24: a7b7b64c a7b7d2f0 0004  c1efa6c0 
00cba02c 0300 e2d09f40
[   63.075238] NIP [c001b5c8] do_page_fault+0x6c/0x5b0
[   63.085952] LR [c001b6f8] do_page_fault+0x19c/0x5b0
[   63.096678] Call Trace:
[   63.100075] [e2d09f00] [e2d09f04] 0xe2d09f04 (unreliable)
[   63.112359] [e2d09f30] [c000424c] DataAccess_virt+0xd4/0xe4
[   63.125168] --- interrupt: 300 at 0xa7a261dc
[   63.134060] NIP:  a7a261dc LR: a7a253bc CTR: 
[   63.145302] REGS: e2d09f40 TRAP: 0300   Not tainted  
(5.13.0-pmac-00028-gb3c15b60339a)
[   63.165167] MSR:  d032   CR: 228428e2  XER: 
2000
[   63.182162] DAR: 00cba02c DSISR: 4200
   GPR00: a7a27448 afa6b0e0 a74c35c0 a7b7b614 001e 
a7b7b614 00cba028 
   GPR08: 00020fd9 0031 00cb9ff8 a7a273b0 220028e2 
00c1fff0 00cba000 a7b7ba4c
   GPR16: 0031     
 a7b7b0d0 00c5c010
   GPR24: a7b7b64c a7b7d2f0 0004 0002 001e 
a7b7b614 a7b7aff4 0030
[   63.275233] NIP [a7a261dc] 0xa7a261dc
[   63.282291] LR [a7a253bc] 0xa7a253bc
[   63.289087] --- interrupt: 300
[   63.294322] Instruction dump:
[   63.299291] 7c4a1378 810300a0 75278410 83820298 83a300a4 553b018c 
551e0036 4082038c
[   63.318630] 2e1b 40920228 75280800 41820220 <0fe0> 3b60 
41920214 81420594
[   63.338503] ---[ end trace f642a84639cba377 ]---

Second problem: MSR PR is seen unset although the interrupt frame shows it set

[   63.666846] kernel BUG at arch/powerpc/kernel/interrupt.c:458!
[   63.680633] Oops: Exception in kernel mode, sig: 5 [#1]
[   63.692201] BE PAGE_SIZE=4K MMU=Hash SMP NR_CPUS=2 PowerMac
[   63.705011] Modules linked in:
[   63.710247] CPU: 0 PID: 1660 Comm: Xorg Tainted: GW 
5.13.0-pmac-00028-gb3c15b60339a #40
[   63.734553] NIP:  c0011434 LR: c001629c CTR: 
[   63.745796] REGS: e2d09e70 TRAP: 0700   Tainted: GW  
(5.13.0-pmac-00028-gb3c15b60339a)
[   63.769844] MSR:  00029032   CR: 42d09f30  XER: 

[   63.786052]
   GPR00:  e2d09f30 c301b680 e2d09f40 8344 
c44d0e68 e2d09e8c 
   GPR08: 0002 00dc228a 4000 e2d09f30 22d09f30 
00c1fff0 afa6ceb4 00c26144
   GPR16: 00c25fb8 00c26140 afa6ceb8 9000 00c944d8 
001c  0020
   GPR24:  01fb afa6d1b4 0001  
a539a2a0 a530fd80 0089
[   63.871284] NIP [c0011434] interrupt_exit_kernel_prepare+0x10/0x70
[   63.885922] LR [c001629c] interrupt_return+0x9c/0x144
[   63.897168] Call Trace:
[   63.900562] [e2d09f30] [c000424c] DataAccess_virt+0xd4/0xe4 
(unreliable)
[   63.916773] --- interrupt: 300 at 0xa09be008
[   63.925659] NIP:  a09be008 LR: a09bdfe8 CTR: a09bdfc0
[   63.936903] REGS: e2d09f40 TRAP: 0300   Tainted: GW  

Re: [PATCH v2 61/63] powerpc: Split memset() to avoid multi-field overflow

2021-08-18 Thread Christophe Leroy




Le 18/08/2021 à 08:05, Kees Cook a écrit :

In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memset(), avoid intentionally writing across
neighboring fields.

Instead of writing across a field boundary with memset(), move the call
to just the array, and an explicit zeroing of the prior field.

Cc: Benjamin Herrenschmidt 
Cc: Qinglang Miao 
Cc: "Gustavo A. R. Silva" 
Cc: Hulk Robot 
Cc: Wang Wensheng 
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Kees Cook 
Reviewed-by: Michael Ellerman 
Link: https://lore.kernel.org/lkml/87czqsnmw9@mpe.ellerman.id.au
---
  drivers/macintosh/smu.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/macintosh/smu.c b/drivers/macintosh/smu.c
index 94fb63a7b357..59ce431da7ef 100644
--- a/drivers/macintosh/smu.c
+++ b/drivers/macintosh/smu.c
@@ -848,7 +848,8 @@ int smu_queue_i2c(struct smu_i2c_cmd *cmd)
cmd->read = cmd->info.devaddr & 0x01;
switch(cmd->info.type) {
case SMU_I2C_TRANSFER_SIMPLE:
-   memset(&cmd->info.sublen, 0, 4);
+   cmd->info.sublen = 0;
+   memset(&cmd->info.subaddr, 0, 3);


subaddr[] is an array, should the & be avoided?
And while at it, why not use sizeof(subaddr) instead of 3?
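
Presumably the suggestion amounts to something like (sketch):

	memset(cmd->info.subaddr, 0, sizeof(cmd->info.subaddr));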



break;
case SMU_I2C_TRANSFER_COMBINED:
cmd->info.devaddr &= 0xfe;



[PATCH v2 57/63] powerpc/signal32: Use struct_group() to zero spe regs

2021-08-18 Thread Kees Cook
In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memset(), avoid intentionally writing across
neighboring fields.

Add a struct_group() for the spe registers so that memset() can correctly reason
about the size:

   In function 'fortify_memset_chk',
   inlined from 'restore_user_regs.part.0' at 
arch/powerpc/kernel/signal_32.c:539:3:
>> include/linux/fortify-string.h:195:4: error: call to 
>> '__write_overflow_field' declared with attribute warning: detected write 
>> beyond size of field (1st parameter); maybe use struct_group()? 
>> [-Werror=attribute-warning]
 195 |__write_overflow_field();
 |^~~~
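
For context, struct_group(spe, ...) roughly expands to an anonymous union
of an unnamed struct and a named struct with identical members, so the old
field names keep working while the group can be addressed and sized as a
single object (sketch of the expansion, not the exact macro):

	union {
		struct {
			unsigned long	evr[32];
			u64		acc;
		};
		struct {
			unsigned long	evr[32];
			u64		acc;
		} spe;
	};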

Cc: Michael Ellerman 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: Christophe Leroy 
Cc: Sudeep Holla 
Cc: linuxppc-dev@lists.ozlabs.org
Reported-by: kernel test robot 
Signed-off-by: Kees Cook 
---
 arch/powerpc/include/asm/processor.h | 6 --
 arch/powerpc/kernel/signal_32.c  | 6 +++---
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/processor.h 
b/arch/powerpc/include/asm/processor.h
index f348e564f7dd..05dc567cb9a8 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -191,8 +191,10 @@ struct thread_struct {
int used_vsr;   /* set if process has used VSX */
 #endif /* CONFIG_VSX */
 #ifdef CONFIG_SPE
-   unsigned long   evr[32];/* upper 32-bits of SPE regs */
-   u64 acc;/* Accumulator */
+   struct_group(spe,
+   unsigned long   evr[32];/* upper 32-bits of SPE regs */
+   u64 acc;/* Accumulator */
+   );
unsigned long   spefscr;/* SPE & eFP status */
unsigned long   spefscr_last;   /* SPEFSCR value on last prctl
   call or trap return */
diff --git a/arch/powerpc/kernel/signal_32.c b/arch/powerpc/kernel/signal_32.c
index 0608581967f0..77b86caf5c51 100644
--- a/arch/powerpc/kernel/signal_32.c
+++ b/arch/powerpc/kernel/signal_32.c
@@ -532,11 +532,11 @@ static long restore_user_regs(struct pt_regs *regs,
regs_set_return_msr(regs, regs->msr & ~MSR_SPE);
if (msr & MSR_SPE) {
/* restore spe registers from the stack */
-   unsafe_copy_from_user(current->thread.evr, &sr->mc_vregs,
- ELF_NEVRREG * sizeof(u32), failed);
+   unsafe_copy_from_user(&current->thread.spe, &sr->mc_vregs,
+ sizeof(current->thread.spe), failed);
current->thread.used_spe = true;
} else if (current->thread.used_spe)
-   memset(current->thread.evr, 0, ELF_NEVRREG * sizeof(u32));
+   memset(&current->thread.spe, 0, sizeof(current->thread.spe));
 
/* Always get SPEFSCR back */
	unsafe_get_user(current->thread.spefscr, (u32 __user *)&sr->mc_vregs + 
ELF_NEVRREG, failed);
-- 
2.30.2



[PATCH v2 61/63] powerpc: Split memset() to avoid multi-field overflow

2021-08-18 Thread Kees Cook
In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memset(), avoid intentionally writing across
neighboring fields.

Instead of writing across a field boundary with memset(), move the call
to just the array, and an explicit zeroing of the prior field.

Cc: Benjamin Herrenschmidt 
Cc: Qinglang Miao 
Cc: "Gustavo A. R. Silva" 
Cc: Hulk Robot 
Cc: Wang Wensheng 
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Kees Cook 
Reviewed-by: Michael Ellerman 
Link: https://lore.kernel.org/lkml/87czqsnmw9@mpe.ellerman.id.au
---
 drivers/macintosh/smu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/macintosh/smu.c b/drivers/macintosh/smu.c
index 94fb63a7b357..59ce431da7ef 100644
--- a/drivers/macintosh/smu.c
+++ b/drivers/macintosh/smu.c
@@ -848,7 +848,8 @@ int smu_queue_i2c(struct smu_i2c_cmd *cmd)
cmd->read = cmd->info.devaddr & 0x01;
switch(cmd->info.type) {
case SMU_I2C_TRANSFER_SIMPLE:
-   memset(&cmd->info.sublen, 0, 4);
+   cmd->info.sublen = 0;
+   memset(&cmd->info.subaddr, 0, 3);
break;
case SMU_I2C_TRANSFER_COMBINED:
cmd->info.devaddr &= 0xfe;
-- 
2.30.2



[PATCH v2 36/63] scsi: ibmvscsi: Avoid multi-field memset() overflow by aiming at srp

2021-08-18 Thread Kees Cook
In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memset(), avoid intentionally writing across
neighboring fields.

Instead of writing beyond the end of evt_struct->iu.srp.cmd, target the
upper union (evt_struct->iu.srp) instead, as that's what is being wiped.

Cc: Tyrel Datwyler 
Cc: Michael Ellerman 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: "James E.J. Bottomley" 
Cc: "Martin K. Petersen" 
Cc: linux-s...@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Kees Cook 
Acked-by: Martin K. Petersen 
Link: https://lore.kernel.org/lkml/yq135rzp79c@ca-mkp.ca.oracle.com
Acked-by: Tyrel Datwyler 
Link: 
https://lore.kernel.org/lkml/6eae8434-e9a7-aa74-628b-b515b3695...@linux.ibm.com
---
 drivers/scsi/ibmvscsi/ibmvscsi.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c b/drivers/scsi/ibmvscsi/ibmvscsi.c
index 50df7dd9cb91..ea8e01f49cba 100644
--- a/drivers/scsi/ibmvscsi/ibmvscsi.c
+++ b/drivers/scsi/ibmvscsi/ibmvscsi.c
@@ -1055,8 +1055,9 @@ static int ibmvscsi_queuecommand_lck(struct scsi_cmnd 
*cmnd,
return SCSI_MLQUEUE_HOST_BUSY;
 
/* Set up the actual SRP IU */
+   BUILD_BUG_ON(sizeof(evt_struct->iu.srp) != SRP_MAX_IU_LEN);
+   memset(&evt_struct->iu.srp, 0x00, sizeof(evt_struct->iu.srp));
	srp_cmd = &evt_struct->iu.srp.cmd;
-   memset(srp_cmd, 0x00, SRP_MAX_IU_LEN);
srp_cmd->opcode = SRP_CMD;
memcpy(srp_cmd->cdb, cmnd->cmnd, sizeof(srp_cmd->cdb));
	int_to_scsilun(lun, &srp_cmd->lun);
-- 
2.30.2