Re: [PATCH 2/6] powerpc: Provide syscall wrapper

2022-06-15 Thread Rohan McLure
> On 01/06/2022 at 10:29, Christophe Leroy wrote:
>> On 01/06/2022 at 07:48, Rohan McLure wrote:
>>> 
>>> Syscall wrapper implemented as per s390, x86, arm64, providing the
>>> option for GPRs to be cleared on entry to the kernel, reducing caller
>>> influence on speculation within the syscall routine. The wrapper
>>> is a macro that emits syscall handler implementations with parameters
>>> passed by stack pointer.
>> Passing parameters by stack is going to be sub-optimal. Did you measure 
>> the implied performance degradation? We usually use the 
>> null_syscall selftest for that every time we touch syscall entries/exits.
> 
> I did a test with null_syscall on an 8xx. Surprisingly, I get more than 20% 
> improvement with your series.
> 
> Looking at the generated code in more detail, we see that 
> system_call_exception() is lighter, as no stack frame is needed now: the 
> compiler has enough registers available.
> 
> Before the patch:
> 
> c000c9ec :
> c000c9ec: 94 21 ff f0 stwu r1,-16(r1)
> c000c9f0: 93 e1 00 0c stw r31,12(r1)
> c000c9f4: 7d 5f 53 78 mr r31,r10
> c000c9f8: 81 4a 00 84 lwz r10,132(r10)
> c000c9fc: 90 7f 00 88 stw r3,136(r31)
> c000ca00: 71 4b 00 02 andi. r11,r10,2
> c000ca04: 41 82 00 4c beq c000ca50 
> c000ca08: 71 4b 40 00 andi. r11,r10,16384
> c000ca0c: 41 82 00 50 beq c000ca5c 
> c000ca10: 71 4a 80 00 andi. r10,r10,32768
> c000ca14: 41 82 00 54 beq c000ca68 
> c000ca18: 7c 50 13 a6 mtspr 80,r2
> c000ca1c: 81 42 00 4c lwz r10,76(r2)
> c000ca20: 71 4a 84 91 andi. r10,r10,33937
> c000ca24: 40 82 00 50 bne c000ca74 
> c000ca28: 28 09 01 c2 cmplwi r9,450
> c000ca2c: 41 81 00 88 bgt c000cab4 
> c000ca30: 3d 40 c0 6f lis r10,-16273
> c000ca34: 55 29 10 3a rlwinm r9,r9,2,0,29
> c000ca38: 39 4a c1 c5 addi r10,r10,-15931
> c000ca3c: 7d 2a 48 2e lwzx r9,r10,r9
> c000ca40: 83 e1 00 0c lwz r31,12(r1)
> c000ca44: 7d 29 03 a6 mtctr r9
> c000ca48: 38 21 00 10 addi r1,r1,16
> c000ca4c: 4e 80 04 20 bctr
> ...
> 
> After the patch:
> c000cc94 :
> c000cc94: 81 24 00 84 lwz r9,132(r4)
> c000cc98: 81 44 00 0c lwz r10,12(r4)
> c000cc9c: 71 28 00 02 andi. r8,r9,2
> c000cca0: 91 44 00 88 stw r10,136(r4)
> c000cca4: 41 82 00 48 beq c000ccec 
> c000cca8: 71 2a 40 00 andi. r10,r9,16384
> c000ccac: 41 82 00 44 beq c000ccf0 
> c000ccb0: 71 29 80 00 andi. r9,r9,32768
> c000ccb4: 41 82 00 40 beq c000ccf4 
> c000ccb8: 7c 50 13 a6 mtspr 80,r2
> c000ccbc: 81 22 00 4c lwz r9,76(r2)
> c000ccc0: 71 29 84 91 andi. r9,r9,33937
> c000ccc4: 40 82 00 34 bne c000ccf8 
> c000ccc8: 28 03 01 c2 cmplwi r3,450
> c000cccc: 41 81 00 78 bgt c000cd44 
> c000ccd0: 3d 20 c0 70 lis r9,-16272
> c000ccd4: 54 63 10 3a rlwinm r3,r3,2,0,29
> c000ccd8: 39 29 81 c5 addi r9,r9,-32315
> c000ccdc: 7d 29 18 2e lwzx r9,r9,r3
> c000cce0: 7c 83 23 78 mr r3,r4
> c000cce4: 7d 29 03 a6 mtctr r9
> c000cce8: 4e 80 04 20 bctr
> ...
> 
> 
> 
>> Why go via the stack? The main advantage of a RISC processor like powerpc is 
>> that, unlike x86, there are enough registers to avoid going through memory. 
>> RISC processors are optimised for three-operand operations and many 
>> registers, and usually have slow memory in return.
> 
> Well, thinking about it once more: in fact, registers are saved to the stack 
> anyway. At the start of syscall functions they are likely to still be hot in 
> the cache, so reading them back takes just a few cycles. And it eventually 
> gives the compiler the opportunity to organise things better.
> 
> 
>>> 
>>> For platforms supporting this syscall wrapper, emit symbols with usual
>>> in-register parameters (`sys...`) to support calls to syscall handlers
>>> from within the kernel.
>>> 
>>> Syscalls are wrapped on all platforms except the Cell processor. SPUs require
>>> access to syscall prototypes, which are omitted with ARCH_HAS_SYSCALL_WRAPPER
>>> enabled.
>> This commit message isn't very clear; please describe in more detail what 
>> is done, how, and why.
> 
> 
> Christophe

Thanks for checking this, Christophe.

>> Why go via the stack? The main advantage of a RISC processor like powerpc is 
>> that, unlike x86, there are enough registers to avoid going through memory. 
>> RISC processors are optimised for three-operand operations and many 
>> registers, and usually have slow memory in return.
> 
> Well, thinking about it once more. In fact registers are saved to the stack 
> anyway. At the start of syscall functions they are likely to still be 
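
As a rough illustration of the wrapper idea debated in this thread (not the code from the series), the sketch below shows a macro that emits an entry point taking only a pointer to the saved registers and an inner handler with ordinary typed parameters. The names `mock_regs`, `MOCK_SYSCALL_DEFINE2`, `wrapped_sys_add` and `do_add` are all hypothetical, and the register layout is reduced to a bare GPR array; it assumes only that the first two syscall arguments live in r3 and r4.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct pt_regs: only the GPRs. */
struct mock_regs {
	unsigned long gpr[32];
};

/*
 * Hypothetical wrapper macro for a two-argument syscall. It emits:
 *   - an outer entry point that receives only a pointer to the saved
 *     registers and reloads the arguments from r3 and r4, and
 *   - an inner handler that takes ordinary typed parameters.
 */
#define MOCK_SYSCALL_DEFINE2(name, t1, a1, t2, a2)			\
	static long do_##name(t1 a1, t2 a2);				\
	long wrapped_sys_##name(const struct mock_regs *regs)		\
	{								\
		assert(regs != NULL);					\
		return do_##name((t1)regs->gpr[3], (t2)regs->gpr[4]);	\
	}								\
	static long do_##name(t1 a1, t2 a2)

/* Example handler: adds its two arguments. */
MOCK_SYSCALL_DEFINE2(add, long, a, long, b)
{
	return a + b;
}
```

Because the outer function sees only the register save area, the live argument registers could in principle be cleared before it runs, which is the speculation-hardening option the commit message mentions.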

Re: [PATCH] kexec: replace crash_mem_range with range

2022-06-15 Thread Li Chen
Hi Baoquan,

  On Wed, 15 Jun 2022 19:03:53 -0700 Baoquan He  wrote 
 > On 06/14/22 at 10:04pm, Li Chen wrote:
 > > From: Li Chen 
 > > 
 > > We already have struct range, so just use it.
 > 
 > Looks good, have you tested it?

No, I don't have a ppc machine; I have only compile-tested it on x86.

Regards,
Li

 > 
 > > 
 > > Signed-off-by: Li Chen 
 > > ---
 > >  arch/powerpc/kexec/file_load_64.c | 2 +-
 > >  arch/powerpc/kexec/ranges.c   | 8 
 > >  include/linux/kexec.h | 7 ++-
 > >  kernel/kexec_file.c   | 2 +-
 > >  4 files changed, 8 insertions(+), 11 deletions(-)
 > > 
 > > diff --git a/arch/powerpc/kexec/file_load_64.c 
 > > b/arch/powerpc/kexec/file_load_64.c
 > > index b4981b651d9a..583b7fc478f2 100644
 > > --- a/arch/powerpc/kexec/file_load_64.c
 > > +++ b/arch/powerpc/kexec/file_load_64.c
 > > @@ -34,7 +34,7 @@ struct umem_info {
 > >  
 > >  /* usable memory ranges to look up */
 > >  unsigned int nr_ranges;
 > > -const struct crash_mem_range *ranges;
 > > +const struct range *ranges;
 > >  };
 > >  
 > >  const struct kexec_file_ops * const kexec_file_loaders[] = {
 > > diff --git a/arch/powerpc/kexec/ranges.c b/arch/powerpc/kexec/ranges.c
 > > index 563e9989a5bf..5fc53a5fcfdf 100644
 > > --- a/arch/powerpc/kexec/ranges.c
 > > +++ b/arch/powerpc/kexec/ranges.c
 > > @@ -33,7 +33,7 @@
 > >  static inline unsigned int get_max_nr_ranges(size_t size)
 > >  {
 > >  return ((size - sizeof(struct crash_mem)) /
 > > -sizeof(struct crash_mem_range));
 > > +sizeof(struct range));
 > >  }
 > >  
 > >  /**
 > > @@ -51,7 +51,7 @@ static inline size_t get_mem_rngs_size(struct crash_mem 
 > > *mem_rngs)
 > >  return 0;
 > >  
 > >  size = (sizeof(struct crash_mem) +
 > > -(mem_rngs->max_nr_ranges * sizeof(struct crash_mem_range)));
 > > +(mem_rngs->max_nr_ranges * sizeof(struct range)));
 > >  
 > >  /*
 > >   * Memory is allocated in size multiple of MEM_RANGE_CHUNK_SZ.
 > > @@ -98,7 +98,7 @@ static int __add_mem_range(struct crash_mem 
 > > **mem_ranges, u64 base, u64 size)
 > >   */
 > >  static void __merge_memory_ranges(struct crash_mem *mem_rngs)
 > >  {
 > > -struct crash_mem_range *ranges;
 > > +struct range *ranges;
 > >  int i, idx;
 > >  
 > >  if (!mem_rngs)
 > > @@ -123,7 +123,7 @@ static void __merge_memory_ranges(struct crash_mem 
 > > *mem_rngs)
 > >  /* cmp_func_t callback to sort ranges with sort() */
 > >  static int rngcmp(const void *_x, const void *_y)
 > >  {
 > > -const struct crash_mem_range *x = _x, *y = _y;
 > > +const struct range *x = _x, *y = _y;
 > >  
 > >  if (x->start > y->start)
 > >  return 1;
 > > diff --git a/include/linux/kexec.h b/include/linux/kexec.h
 > > index 58d1b58a971e..d7ab4ad4c619 100644
 > > --- a/include/linux/kexec.h
 > > +++ b/include/linux/kexec.h
 > > @@ -17,6 +17,7 @@
 > >  
 > >  #include 
 > >  #include 
 > > +#include 
 > >  
 > >  #include 
 > >  
 > > @@ -214,14 +215,10 @@ int kexec_locate_mem_hole(struct kexec_buf *kbuf);
 > >  /* Alignment required for elf header segment */
 > >  #define ELF_CORE_HEADER_ALIGN   4096
 > >  
 > > -struct crash_mem_range {
 > > -u64 start, end;
 > > -};
 > > -
 > >  struct crash_mem {
 > >  unsigned int max_nr_ranges;
 > >  unsigned int nr_ranges;
 > > -struct crash_mem_range ranges[];
 > > +struct range ranges[];
 > >  };
 > >  
 > >  extern int crash_exclude_mem_range(struct crash_mem *mem,
 > > diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
 > > index 8347fc158d2b..f2758af86b93 100644
 > > --- a/kernel/kexec_file.c
 > > +++ b/kernel/kexec_file.c
 > > @@ -1183,7 +1183,7 @@ int crash_exclude_mem_range(struct crash_mem *mem,
 > >  {
 > >  int i, j;
 > >  unsigned long long start, end, p_start, p_end;
 > > -struct crash_mem_range temp_range = {0, 0};
 > > +struct range temp_range = {0, 0};
 > >  
 > >  for (i = 0; i < mem->nr_ranges; i++) {
 > >  start = mem->ranges[i].start;
 > > -- 
 > > 2.36.1
 > > 
 > > 
 > > 
 > > ___
 > > kexec mailing list
 > > ke...@lists.infradead.org
 > > http://lists.infradead.org/mailman/listinfo/kexec
 > > 
 > 
 > 


Re: [PATCH] kexec: replace crash_mem_range with range

2022-06-15 Thread Baoquan He
On 06/14/22 at 10:04pm, Li Chen wrote:
> From: Li Chen 
> 
> We already have struct range, so just use it.

Looks good, have you tested it?

> 
> Signed-off-by: Li Chen 
> ---
>  arch/powerpc/kexec/file_load_64.c | 2 +-
>  arch/powerpc/kexec/ranges.c   | 8 
>  include/linux/kexec.h | 7 ++-
>  kernel/kexec_file.c   | 2 +-
>  4 files changed, 8 insertions(+), 11 deletions(-)
> 
> diff --git a/arch/powerpc/kexec/file_load_64.c 
> b/arch/powerpc/kexec/file_load_64.c
> index b4981b651d9a..583b7fc478f2 100644
> --- a/arch/powerpc/kexec/file_load_64.c
> +++ b/arch/powerpc/kexec/file_load_64.c
> @@ -34,7 +34,7 @@ struct umem_info {
>  
>   /* usable memory ranges to look up */
>   unsigned int nr_ranges;
> - const struct crash_mem_range *ranges;
> + const struct range *ranges;
>  };
>  
>  const struct kexec_file_ops * const kexec_file_loaders[] = {
> diff --git a/arch/powerpc/kexec/ranges.c b/arch/powerpc/kexec/ranges.c
> index 563e9989a5bf..5fc53a5fcfdf 100644
> --- a/arch/powerpc/kexec/ranges.c
> +++ b/arch/powerpc/kexec/ranges.c
> @@ -33,7 +33,7 @@
>  static inline unsigned int get_max_nr_ranges(size_t size)
>  {
>   return ((size - sizeof(struct crash_mem)) /
> - sizeof(struct crash_mem_range));
> + sizeof(struct range));
>  }
>  
>  /**
> @@ -51,7 +51,7 @@ static inline size_t get_mem_rngs_size(struct crash_mem 
> *mem_rngs)
>   return 0;
>  
>   size = (sizeof(struct crash_mem) +
> - (mem_rngs->max_nr_ranges * sizeof(struct crash_mem_range)));
> + (mem_rngs->max_nr_ranges * sizeof(struct range)));
>  
>   /*
>* Memory is allocated in size multiple of MEM_RANGE_CHUNK_SZ.
> @@ -98,7 +98,7 @@ static int __add_mem_range(struct crash_mem **mem_ranges, 
> u64 base, u64 size)
>   */
>  static void __merge_memory_ranges(struct crash_mem *mem_rngs)
>  {
> - struct crash_mem_range *ranges;
> + struct range *ranges;
>   int i, idx;
>  
>   if (!mem_rngs)
> @@ -123,7 +123,7 @@ static void __merge_memory_ranges(struct crash_mem 
> *mem_rngs)
>  /* cmp_func_t callback to sort ranges with sort() */
>  static int rngcmp(const void *_x, const void *_y)
>  {
> - const struct crash_mem_range *x = _x, *y = _y;
> + const struct range *x = _x, *y = _y;
>  
>   if (x->start > y->start)
>   return 1;
> diff --git a/include/linux/kexec.h b/include/linux/kexec.h
> index 58d1b58a971e..d7ab4ad4c619 100644
> --- a/include/linux/kexec.h
> +++ b/include/linux/kexec.h
> @@ -17,6 +17,7 @@
>  
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  
> @@ -214,14 +215,10 @@ int kexec_locate_mem_hole(struct kexec_buf *kbuf);
>  /* Alignment required for elf header segment */
>  #define ELF_CORE_HEADER_ALIGN   4096
>  
> -struct crash_mem_range {
> - u64 start, end;
> -};
> -
>  struct crash_mem {
>   unsigned int max_nr_ranges;
>   unsigned int nr_ranges;
> - struct crash_mem_range ranges[];
> + struct range ranges[];
>  };
>  
>  extern int crash_exclude_mem_range(struct crash_mem *mem,
> diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> index 8347fc158d2b..f2758af86b93 100644
> --- a/kernel/kexec_file.c
> +++ b/kernel/kexec_file.c
> @@ -1183,7 +1183,7 @@ int crash_exclude_mem_range(struct crash_mem *mem,
>  {
>   int i, j;
>   unsigned long long start, end, p_start, p_end;
> - struct crash_mem_range temp_range = {0, 0};
> + struct range temp_range = {0, 0};
>  
>   for (i = 0; i < mem->nr_ranges; i++) {
>   start = mem->ranges[i].start;
> -- 
> 2.36.1
> 
> 
> 
> ___
> kexec mailing list
> ke...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec
> 
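
To make the substitution in the quoted patch concrete, here is a small user-space sketch, assuming a local `struct range` with the same two 64-bit `start`/`end` fields as the structure being removed. The helpers `crash_mem_size`, `max_ranges_for` and `crash_mem_alloc` are hypothetical names for illustration; only `rngcmp` mirrors a function named in the diff, and it is used here with `qsort()` rather than the kernel's `sort()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* User-space stand-in for the kernel's struct range (start/end pair). */
struct range {
	uint64_t start;
	uint64_t end;
};

/* Same shape as struct crash_mem after the patch. */
struct crash_mem {
	unsigned int max_nr_ranges;
	unsigned int nr_ranges;
	struct range ranges[];
};

/* Bytes needed for a crash_mem holding max_nr_ranges entries. */
size_t crash_mem_size(unsigned int max_nr_ranges)
{
	return sizeof(struct crash_mem) +
	       (size_t)max_nr_ranges * sizeof(struct range);
}

/* How many ranges fit in a buffer of the given size (inverse of the above). */
unsigned int max_ranges_for(size_t size)
{
	assert(size >= sizeof(struct crash_mem));
	return (unsigned int)((size - sizeof(struct crash_mem)) /
			      sizeof(struct range));
}

/* qsort() comparator ordering ranges by start address. */
int rngcmp(const void *_x, const void *_y)
{
	const struct range *x = _x, *y = _y;

	if (x->start > y->start)
		return 1;
	if (x->start < y->start)
		return -1;
	return 0;
}

/* Allocate a zeroed crash_mem with room for max_nr_ranges entries. */
struct crash_mem *crash_mem_alloc(unsigned int max_nr_ranges)
{
	struct crash_mem *mem = calloc(1, crash_mem_size(max_nr_ranges));

	if (mem)
		mem->max_nr_ranges = max_nr_ranges;
	return mem;
}
```

Since both structures carry the same two fields, the size arithmetic and the comparator are unaffected by the rename, which is why the diff only touches type names.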



Re: [PATCH] kprobes: Enable tracing for mololithic kernel images

2022-06-15 Thread jar...@kernel.org


On Wed, Jun 15, 2022 at 08:37:07AM +0200, h...@lst.de wrote:
> On Tue, Jun 14, 2022 at 03:32:38PM +0300, jar...@kernel.org wrote:
> > > Like say for a next step we moved prog pack out of bpf into core code,
> > > gave it its own copy of module_alloc(), and then made kprobes use it.
> > > Then we would have something with improved W^X guard rails, and kprobes
> > > would not depend on modules anymore. I think maybe it's a step in the
> > > right direction, even if it's not perfect.
> > 
> > So you're saying that I should (as a first step) basically clone
> > module_alloc() implementation for kprobes, and in future for BPF 
> > use, in order to get a clean starting point?
> 
> I don't think cloning the code helps anyone.  The fact that except
> for the eBPF mess everyone uses module_alloc and the related
> infrastructure is a feature and not a bug.  The interface should
> become better than what we have right now, but there are few enough
> users that this can be done in one go.
> 
> So assuming we really care deeply enough about fancy tracing without
> modules (and I'm not sure we do, even if you don't use modules it
> doesn't hurt to just build the modules code, I do that all the time
> for my test machines), the general approach in your series is the
> right one.

OK, thanks for the elaboration!

However I bake it, I doubt that the next version is going to be the final
version, given all the angles. Therefore, I will mostly follow Christophe's
suggestions on compilation flags, and also split this into per-arch
patches.

That should at least be a step in the right direction.

BR, Jarkko


Re: [PATCH] kprobes: Enable tracing for mololithic kernel images

2022-06-15 Thread Jarkko Sakkinen


On Tue, Jun 14, 2022 at 12:36:25PM +, Christophe Leroy wrote:
> 
> 
> On 14/06/2022 at 14:26, Jarkko Sakkinen wrote:
> > On Thu, Jun 09, 2022 at 06:44:45AM -0700, Luis Chamberlain wrote:
> >> On Thu, Jun 09, 2022 at 08:47:38AM +0100, Russell King (Oracle) wrote:
> >>> On Wed, Jun 08, 2022 at 02:59:27AM +0300, Jarkko Sakkinen wrote:
>  diff --git a/arch/arm/kernel/Makefile b/arch/arm/kernel/Makefile
>  index 553866751e1a..d2bb954cd54f 100644
>  --- a/arch/arm/kernel/Makefile
>  +++ b/arch/arm/kernel/Makefile
>  @@ -44,6 +44,11 @@ obj-$(CONFIG_CPU_IDLE)+= cpuidle.o
>    obj-$(CONFIG_ISA_DMA_API)  += dma.o
>    obj-$(CONFIG_FIQ)  += fiq.o fiqasm.o
>    obj-$(CONFIG_MODULES)  += armksyms.o module.o
>  +ifeq ($(CONFIG_MODULES),y)
>  +obj-y   += module_alloc.o
>  +else
>  +obj-$(CONFIG_KPROBES)   += module_alloc.o
>  +endif
> >>>
> >>> Doesn't:
> >>>
> >>> obj-$(CONFIG_MODULES) += module_alloc.o
> >>> obj-$(CONFIG_KPROBES) += module_alloc.o
> >>
> >> That just begs for a new kconfig symbol for the object, and for
> >> the object then to be built with it.
> >>
> >> The archs which override the default can use ARCH_HAS_VM_ALLOC_EXEC.
> >> Please note that the respective free is important as well and it's
> >> not clear if we need another define for the free. Someone has
> >> to do that work. We want to ensure to noexec the code on free and
> >> this can vary on each arch.
> > 
> > Let me check if I understand this (not 100% sure).
> > 
> > So if an arch defines ARCH_HAS_VMALLOC_EXEC, then this would set
> > config flag CONFIG_VMALLOC_EXEC, which would be used to include
> > the compilation unit?
> > 
> 
> I guess you have two possible approaches.
> 
> Either architectures select CONFIG_ARCH_HAS_VMALLOC_EXEC at all times and 
> then you add a CONFIG_VMALLOC_EXEC which depends on 
> CONFIG_ARCH_HAS_VMALLOC_EXEC and CONFIG_MODULES or CONFIG_KPROBES,
> 
> Or architectures select CONFIG_ARCH_HAS_VMALLOC_EXEC only when either 
> CONFIG_MODULES or CONFIG_KPROBES is selected, in that case there is no 
> need for a CONFIG_VMALLOC_EXEC.

Right, got it now. Thanks for the elaboration.

> Christophe

BR, Jarkko


[PATCH] arch: powerpc: platforms: 512x: Add missing of_node_put()

2022-06-15 Thread Liang He
In mpc5121_clk_init(), of_find_compatible_node() will return a
node pointer with refcount incremented. We should use of_node_put()
when it is not used anymore.

Signed-off-by: Liang He 
---
 arch/powerpc/platforms/512x/clock-commonclk.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/platforms/512x/clock-commonclk.c 
b/arch/powerpc/platforms/512x/clock-commonclk.c
index 0652c7e69225..ca475462e95b 100644
--- a/arch/powerpc/platforms/512x/clock-commonclk.c
+++ b/arch/powerpc/platforms/512x/clock-commonclk.c
@@ -1208,6 +1208,8 @@ int __init mpc5121_clk_init(void)
/* register as an OF clock provider */
mpc5121_clk_register_of_provider(clk_np);
 
+   of_node_put(clk_np);
+
/*
 * unbreak not yet adjusted peripheral drivers during migration
 * towards fully operational common clock support, and allow
-- 
2.25.1
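
The reference-count balance that this patch restores can be sketched in user space as follows. This is only an illustration under stated assumptions: `mock_node`, `mock_find_node`, `mock_node_put` and `mock_clk_init` are hypothetical stand-ins, not kernel APIs, and the lookup is reduced to bumping a counter on the node it is handed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct device_node: just a reference count. */
struct mock_node {
	int refcount;
};

/* Mimics a lookup that returns the node with its refcount raised. */
struct mock_node *mock_find_node(struct mock_node *np)
{
	if (np)
		np->refcount++;
	return np;
}

/* Mimics the matching put: drops one reference; NULL is tolerated. */
void mock_node_put(struct mock_node *np)
{
	if (!np)
		return;
	assert(np->refcount > 0);
	np->refcount--;
}

/* Init-style function that holds the node only for its own duration. */
int mock_clk_init(struct mock_node *tree_node)
{
	struct mock_node *clk_np = mock_find_node(tree_node);

	/* ... register the clock provider using clk_np here ... */

	mock_node_put(clk_np);	/* balance the lookup above */
	return 0;
}
```

After `mock_clk_init()` returns, the node's count is back to its starting value; without the put, every call would leave one reference behind.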



Re: [PATCH] powerpc: Enable execve syscall exit tracepoint

2022-06-15 Thread Sumit Dubey2
Tested-by: Sumit Dubey2 <sumit.dub...@ibm.com>


[PATCH] arch: powerpc: platforms: 85xx: Add missing of_node_put in sgy_cts1000.c

2022-06-15 Thread Liang He
Signed-off-by: Liang He 
---
 arch/powerpc/platforms/85xx/sgy_cts1000.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/arch/powerpc/platforms/85xx/sgy_cts1000.c 
b/arch/powerpc/platforms/85xx/sgy_cts1000.c
index 98ae64075193..2a45b30852b2 100644
--- a/arch/powerpc/platforms/85xx/sgy_cts1000.c
+++ b/arch/powerpc/platforms/85xx/sgy_cts1000.c
@@ -85,17 +85,24 @@ static int gpio_halt_probe(struct platform_device *pdev)
/* Technically we could just read the first one, but punish
 * DT writers for invalid form. */
if (of_gpio_count(halt_node) != 1)
+   {
+   of_node_put(halt_node);
return -EINVAL;
+   }
 
/* Get the gpio number relative to the dynamic base. */
gpio = of_get_gpio_flags(halt_node, 0, );
if (!gpio_is_valid(gpio))
+   {
+   of_node_put(halt_node);
return -EINVAL;
+   }
 
err = gpio_request(gpio, "gpio-halt");
if (err) {
printk(KERN_ERR "gpio-halt: error requesting GPIO %d.\n",
   gpio);
+   of_node_put(halt_node);
halt_node = NULL;
return err;
}
@@ -112,6 +119,7 @@ static int gpio_halt_probe(struct platform_device *pdev)
printk(KERN_ERR "gpio-halt: error requesting IRQ %d for "
   "GPIO %d.\n", irq, gpio);
gpio_free(gpio);
+   of_node_put(halt_node);
halt_node = NULL;
return err;
}
@@ -123,6 +131,8 @@ static int gpio_halt_probe(struct platform_device *pdev)
printk(KERN_INFO "gpio-halt: registered GPIO %d (%d trigger, %d"
   " irq).\n", gpio, trigger, irq);
 
+   of_node_put(halt_node);
+
return 0;
 }
 
-- 
2.25.1
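
When a put has to be repeated on every exit path, as in the patch above, one alternative is a single unwind label. The sketch below shows that pattern in user space; it is not a rewrite of the driver, and `fake_node`, `fake_node_get`, `fake_node_put` and `fake_probe` are hypothetical names with the GPIO checks reduced to two integer parameters.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct device_node: just a reference count. */
struct fake_node {
	int refcount;
};

/* Takes a reference on the node, as a device-tree lookup would. */
struct fake_node *fake_node_get(struct fake_node *np)
{
	if (np)
		np->refcount++;
	return np;
}

/* Drops one reference; NULL is tolerated. */
void fake_node_put(struct fake_node *np)
{
	if (!np)
		return;
	assert(np->refcount > 0);
	np->refcount--;
}

/* Probe-style function with one exit label that releases the node. */
int fake_probe(struct fake_node *tree_node, int gpio_count, int gpio_valid)
{
	struct fake_node *halt_node = fake_node_get(tree_node);
	int err = 0;

	if (!halt_node)
		return -ENODEV;

	if (gpio_count != 1) {
		err = -EINVAL;
		goto out_put;
	}

	if (!gpio_valid) {
		err = -EINVAL;
		goto out_put;
	}

	/* ... request the GPIO and IRQ here ... */

out_put:
	fake_node_put(halt_node);
	return err;
}
```

With a single label, adding another early return later cannot silently skip the put, at the cost of one extra `err` variable.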



Re: [PATCH] kprobes: Enable tracing for mololithic kernel images

2022-06-15 Thread h...@lst.de


On Tue, Jun 14, 2022 at 03:32:38PM +0300, jar...@kernel.org wrote:
> > Like say for a next step we moved prog pack out of bpf into core code,
> > gave it its own copy of module_alloc(), and then made kprobes use it.
> > Then we would have something with improved W^X guard rails, and kprobes
> > would not depend on modules anymore. I think maybe it's a step in the
> > right direction, even if it's not perfect.
> 
> So you're saying that I should (as a first step) basically clone
> module_alloc() implementation for kprobes, and in future for BPF 
> use, in order to get a clean starting point?

I don't think cloning the code helps anyone.  The fact that except
for the eBPF mess everyone uses module_alloc and the related
infrastructure is a feature and not a bug.  The interface should
become better than what we have right now, but there are few enough
users that this can be done in one go.

So assuming we really care deeply enough about fancy tracing without
modules (and I'm not sure we do, even if you don't use modules it
doesn't hurt to just build the modules code, I do that all the time
for my test machines), the general approach in your series is the
right one.


Re: [PATCH 23/36] arm64,smp: Remove trace_.*_rcuidle() usage

2022-06-15 Thread Marc Zyngier


On Tue, 14 Jun 2022 17:24:48 +0100,
Mark Rutland  wrote:
> 
> On Wed, Jun 08, 2022 at 04:27:46PM +0200, Peter Zijlstra wrote:
> > Ever since commit d3afc7f12987 ("arm64: Allow IPIs to be handled as
> > normal interrupts") this function is called in regular IRQ context.
> > 
> > Signed-off-by: Peter Zijlstra (Intel) 
> 
> [adding Marc since he authored that commit]
> 
> Makes sense to me:
> 
>   Acked-by: Mark Rutland 
> 
> Mark.
> 
> > ---
> >  arch/arm64/kernel/smp.c |4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > --- a/arch/arm64/kernel/smp.c
> > +++ b/arch/arm64/kernel/smp.c
> > @@ -865,7 +865,7 @@ static void do_handle_IPI(int ipinr)
> > unsigned int cpu = smp_processor_id();
> >  
> > if ((unsigned)ipinr < NR_IPI)
> > -   trace_ipi_entry_rcuidle(ipi_types[ipinr]);
> > +   trace_ipi_entry(ipi_types[ipinr]);
> >  
> > switch (ipinr) {
> > case IPI_RESCHEDULE:
> > @@ -914,7 +914,7 @@ static void do_handle_IPI(int ipinr)
> > }
> >  
> > if ((unsigned)ipinr < NR_IPI)
> > -   trace_ipi_exit_rcuidle(ipi_types[ipinr]);
> > +   trace_ipi_exit(ipi_types[ipinr]);
> >  }
> >  
> >  static irqreturn_t ipi_handler(int irq, void *data)

Acked-by: Marc Zyngier 

M.

-- 
Without deviation from the norm, progress is not possible.


Re: [PATCH 34.5/36] cpuidle,omap4: Push RCU-idle into omap4_enter_lowpower()

2022-06-15 Thread Tony Lindgren


Hi,

Adding Aaro Koskinen and Peter Vasil for pm24xx for n800 and n810 related
idle.

* Peter Zijlstra  [220614 22:07]:
> On Mon, Jun 13, 2022 at 03:39:05PM +0300, Tony Lindgren wrote:
> > OMAP4 uses full SoC suspend modes as idle states, as such it needs the
> > whole power-domain and clock-domain code from the idle path.
> > 
> > All that code is not suitable to run with RCU disabled, as such push
> > RCU-idle deeper still.
> > 
> > Signed-off-by: Tony Lindgren 
> > ---
> > 
> > Peter here's one more for your series, looks like this is needed to avoid
> > warnings similar to what you did for omap3.
> 
> Thanks Tony!
> 
> I've had a brief look at omap2_pm_idle() and do I understand it right
> that something like the below patch would reduce it to a simple 'WFI'?

Yes that should do for omap2_do_wfi().

> What do I do with the rest of that code, because I don't think this
> thing has a cpuidle driver to take over, effectively turning it into
> dead code.

As we are establishing a policy where deeper idle states must be
handled by cpuidle, and for the most part that has been the case for at least
10 years, I'd just drop the unused functions with an explanation in the
patch of why we're doing it. Or the functions could be tagged with
__maybe_unused if folks prefer that.

In the pm24xx case we are not really causing a regression for users as
there are still pending patches to make n800 and n810 truly usable with
the mainline kernel. At least the PMIC and LCD related patches need some
work [0]. The deeper idle states can be added back later using cpuidle
as needed so we have a clear path.

Aaro & Peter V, do you have any better suggestions here as this will
mostly affect you guys currently?

Regards,

Tony

[0] 
https://lore.kernel.org/linux-omap/20211224214512.1583430-1-peter.va...@gmail.com/


> --- a/arch/arm/mach-omap2/pm24xx.c
> +++ b/arch/arm/mach-omap2/pm24xx.c
> @@ -126,10 +126,20 @@ static int omap2_allow_mpu_retention(voi
>   return 1;
>  }
>  
> -static void omap2_enter_mpu_retention(void)
> +static void omap2_do_wfi(void)
>  {
>   const int zero = 0;
>  
> + /* WFI */
> + asm("mcr p15, 0, %0, c7, c0, 4" : : "r" (zero) : "memory", "cc");
> +}
> +
> +#if 0
> +/*
> + * possible cpuidle implementation between WFI and full_retention above
> + */
> +static void omap2_enter_mpu_retention(void)
> +{
>   /* The peripherals seem not to be able to wake up the MPU when
>* it is in retention mode. */
>   if (omap2_allow_mpu_retention()) {
> @@ -146,8 +157,7 @@ static void omap2_enter_mpu_retention(vo
>   pwrdm_set_next_pwrst(mpu_pwrdm, PWRDM_POWER_ON);
>   }
>  
> - /* WFI */
> - asm("mcr p15, 0, %0, c7, c0, 4" : : "r" (zero) : "memory", "cc");
> + omap2_do_wfi();
>  
>   pwrdm_set_next_pwrst(mpu_pwrdm, PWRDM_POWER_ON);
>  }
> @@ -161,6 +171,7 @@ static int omap2_can_sleep(void)
>  
>   return 1;
>  }
> +#endif
>  
>  static void omap2_pm_idle(void)
>  {
> @@ -169,6 +180,7 @@ static void omap2_pm_idle(void)
>   if (omap_irq_pending())
>   return;
>  
> +#if 0
>   error = cpu_cluster_pm_enter();
>   if (error || !omap2_can_sleep()) {
>   omap2_enter_mpu_retention();
> @@ -179,6 +191,9 @@ static void omap2_pm_idle(void)
>  
>  out_cpu_cluster_pm:
>   cpu_cluster_pm_exit();
> +#else
> + omap2_do_wfi();
> +#endif
>  }
>  
>  static void __init prcm_setup_regs(void)


[PATCH] kexec: replace crash_mem_range with range

2022-06-15 Thread Li Chen
From: Li Chen 

We already have struct range, so just use it.

Signed-off-by: Li Chen 
---
 arch/powerpc/kexec/file_load_64.c | 2 +-
 arch/powerpc/kexec/ranges.c   | 8 
 include/linux/kexec.h | 7 ++-
 kernel/kexec_file.c   | 2 +-
 4 files changed, 8 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/kexec/file_load_64.c 
b/arch/powerpc/kexec/file_load_64.c
index b4981b651d9a..583b7fc478f2 100644
--- a/arch/powerpc/kexec/file_load_64.c
+++ b/arch/powerpc/kexec/file_load_64.c
@@ -34,7 +34,7 @@ struct umem_info {
 
/* usable memory ranges to look up */
unsigned int nr_ranges;
-   const struct crash_mem_range *ranges;
+   const struct range *ranges;
 };
 
 const struct kexec_file_ops * const kexec_file_loaders[] = {
diff --git a/arch/powerpc/kexec/ranges.c b/arch/powerpc/kexec/ranges.c
index 563e9989a5bf..5fc53a5fcfdf 100644
--- a/arch/powerpc/kexec/ranges.c
+++ b/arch/powerpc/kexec/ranges.c
@@ -33,7 +33,7 @@
 static inline unsigned int get_max_nr_ranges(size_t size)
 {
return ((size - sizeof(struct crash_mem)) /
-   sizeof(struct crash_mem_range));
+   sizeof(struct range));
 }
 
 /**
@@ -51,7 +51,7 @@ static inline size_t get_mem_rngs_size(struct crash_mem 
*mem_rngs)
return 0;
 
size = (sizeof(struct crash_mem) +
-   (mem_rngs->max_nr_ranges * sizeof(struct crash_mem_range)));
+   (mem_rngs->max_nr_ranges * sizeof(struct range)));
 
/*
 * Memory is allocated in size multiple of MEM_RANGE_CHUNK_SZ.
@@ -98,7 +98,7 @@ static int __add_mem_range(struct crash_mem **mem_ranges, u64 
base, u64 size)
  */
 static void __merge_memory_ranges(struct crash_mem *mem_rngs)
 {
-   struct crash_mem_range *ranges;
+   struct range *ranges;
int i, idx;
 
if (!mem_rngs)
@@ -123,7 +123,7 @@ static void __merge_memory_ranges(struct crash_mem 
*mem_rngs)
 /* cmp_func_t callback to sort ranges with sort() */
 static int rngcmp(const void *_x, const void *_y)
 {
-   const struct crash_mem_range *x = _x, *y = _y;
+   const struct range *x = _x, *y = _y;
 
if (x->start > y->start)
return 1;
diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index 58d1b58a971e..d7ab4ad4c619 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -17,6 +17,7 @@
 
 #include 
 #include 
+#include 
 
 #include 
 
@@ -214,14 +215,10 @@ int kexec_locate_mem_hole(struct kexec_buf *kbuf);
 /* Alignment required for elf header segment */
 #define ELF_CORE_HEADER_ALIGN   4096
 
-struct crash_mem_range {
-   u64 start, end;
-};
-
 struct crash_mem {
unsigned int max_nr_ranges;
unsigned int nr_ranges;
-   struct crash_mem_range ranges[];
+   struct range ranges[];
 };
 
 extern int crash_exclude_mem_range(struct crash_mem *mem,
diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index 8347fc158d2b..f2758af86b93 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -1183,7 +1183,7 @@ int crash_exclude_mem_range(struct crash_mem *mem,
 {
int i, j;
unsigned long long start, end, p_start, p_end;
-   struct crash_mem_range temp_range = {0, 0};
+   struct range temp_range = {0, 0};
 
for (i = 0; i < mem->nr_ranges; i++) {
start = mem->ranges[i].start;
-- 
2.36.1




Re: [PATCH 16/36] rcu: Fix rcu_idle_exit()

2022-06-15 Thread Paul E. McKenney


On Wed, Jun 08, 2022 at 04:27:39PM +0200, Peter Zijlstra wrote:
> Current rcu_idle_exit() is terminally broken because it uses
> local_irq_{save,restore}(), which are traced which uses RCU.
> 
> However, now that all the callers are sure to have IRQs disabled, we
> can remove these calls.
> 
> Signed-off-by: Peter Zijlstra (Intel) 
> Acked-by: Paul E. McKenney 

We have some fun conflicts between this series and Frederic's context-tracking
series.  But it looks like these can be resolved by:

1.  A patch on top of Frederic's series that provides the old rcu_*()
names for the functions now prefixed with ct_*() such as
ct_idle_exit().

2.  Another patch on top of Frederic's series that takes the
changes remaining from this patch, shown below.  Frederic's
series uses raw_local_irq_save() and raw_local_irq_restore(),
which can then be removed.

Or is there a better way to do this?
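
To make item 1 concrete, here is a minimal user-space sketch of what such a
compatibility shim might look like. The stub body of ct_idle_exit() and the
call counter are assumptions added only so the fragment builds outside the
kernel; the real function lives in kernel/context_tracking.c and the real
patch may well look different.

```c
/*
 * Stand-in for the kernel's ct_idle_exit().  The counter is an
 * assumption for this sketch: it only records that the call arrived.
 */
static int ct_idle_exit_calls;

static void ct_idle_exit(void)
{
	ct_idle_exit_calls++;
}

/*
 * Old rcu_*() name kept as a thin wrapper, so callers that have not
 * been converted yet keep building against the ct_*() implementation.
 */
static inline void rcu_idle_exit(void)
{
	ct_idle_exit();
}
```

A caller that still uses rcu_idle_exit() ends up in ct_idle_exit() with no
other change, which is all the shim is meant to guarantee.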

Thanx, Paul



commit f64cee8c159e9863a74594efe3d33fb513a6a7b5
Author: Peter Zijlstra 
Date:   Tue Jun 14 17:24:43 2022 -0700

context_tracking: Interrupts always disabled for ct_idle_exit()

Now that the idle-loop cleanups have ensured that rcu_idle_exit() is
always invoked with interrupts disabled, remove the interrupt disabling
in favor of a debug check.

Signed-off-by: Peter Zijlstra 
Cc: Frederic Weisbecker 
Signed-off-by: Paul E. McKenney 

diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index 1da44803fd319..99310cf5b0254 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -332,11 +332,8 @@ EXPORT_SYMBOL_GPL(ct_idle_enter);
  */
 void noinstr ct_idle_exit(void)
 {
-   unsigned long flags;
-
-   raw_local_irq_save(flags);
+   WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !raw_irqs_disabled());
ct_kernel_enter(false, RCU_DYNTICKS_IDX - CONTEXT_IDLE);
-   raw_local_irq_restore(flags);
 }
 EXPORT_SYMBOL_GPL(ct_idle_exit);
 


Re: [PATCH 34.5/36] cpuidle,omap4: Push RCU-idle into omap4_enter_lowpower()

2022-06-15 Thread Peter Zijlstra


On Mon, Jun 13, 2022 at 03:39:05PM +0300, Tony Lindgren wrote:
> OMAP4 uses full SoC suspend modes as idle states, as such it needs the
> whole power-domain and clock-domain code from the idle path.
> 
> All that code is not suitable to run with RCU disabled, as such push
> RCU-idle deeper still.
> 
> Signed-off-by: Tony Lindgren 
> ---
> 
> Peter here's one more for your series, looks like this is needed to avoid
> warnings similar to what you did for omap3.

Thanks Tony!

I've had a brief look at omap2_pm_idle() and do I understand it right
that something like the below patch would reduce it to a simple 'WFI'?

What do I do with the rest of that code, because I don't think this
thing has a cpuidle driver to take over, effectively turning it into
dead code.

--- a/arch/arm/mach-omap2/pm24xx.c
+++ b/arch/arm/mach-omap2/pm24xx.c
@@ -126,10 +126,20 @@ static int omap2_allow_mpu_retention(voi
return 1;
 }
 
-static void omap2_enter_mpu_retention(void)
+static void omap2_do_wfi(void)
 {
const int zero = 0;
 
+   /* WFI */
+   asm("mcr p15, 0, %0, c7, c0, 4" : : "r" (zero) : "memory", "cc");
+}
+
+#if 0
+/*
+ * possible cpuidle implementation between WFI and full_retention above
+ */
+static void omap2_enter_mpu_retention(void)
+{
/* The peripherals seem not to be able to wake up the MPU when
 * it is in retention mode. */
if (omap2_allow_mpu_retention()) {
@@ -146,8 +157,7 @@ static void omap2_enter_mpu_retention(vo
pwrdm_set_next_pwrst(mpu_pwrdm, PWRDM_POWER_ON);
}
 
-   /* WFI */
-   asm("mcr p15, 0, %0, c7, c0, 4" : : "r" (zero) : "memory", "cc");
+   omap2_do_wfi();
 
pwrdm_set_next_pwrst(mpu_pwrdm, PWRDM_POWER_ON);
 }
@@ -161,6 +171,7 @@ static int omap2_can_sleep(void)
 
return 1;
 }
+#endif
 
 static void omap2_pm_idle(void)
 {
@@ -169,6 +180,7 @@ static void omap2_pm_idle(void)
if (omap_irq_pending())
return;
 
+#if 0
error = cpu_cluster_pm_enter();
if (error || !omap2_can_sleep()) {
omap2_enter_mpu_retention();
@@ -179,6 +191,9 @@ static void omap2_pm_idle(void)
 
 out_cpu_cluster_pm:
cpu_cluster_pm_exit();
+#else
+   omap2_do_wfi();
+#endif
 }
 
 static void __init prcm_setup_regs(void)


Re: [PATCH v2 0/4] pseries-wdt: initial support for H_WATCHDOG-based watchdog timers

2022-06-15 Thread Daniel Henrique Barboza

Hi,

I tried this series out with mainline QEMU built with Alexey's patch [1]
and I wasn't able to get it to work. I'm using a simple QEMU command line
booting a fedora36 guest in a Power9 boston host:

sudo ./qemu-system-ppc64 \
-M pseries,cap-cfpc=broken,cap-sbbc=broken,cap-ibs=broken,cap-ccf-assist=off,ic-mode=dual \
-m 4G -accel kvm -cpu POWER9 -smp 1,maxcpus=1,threads=1,cores=1,sockets=1 \
-device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x2 \
-drive file=/home/danielhb/fedora36.qcow2,if=none,id=drive-scsi0-0-0-0,format=qcow2,cache=none \
-device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=2 \
-device qemu-xhci,id=usb,bus=pci.0,addr=0x4 -nographic -display none


Guest is running v5.19-rc2 with this series applied. Kernel config consists of
'pseries_le_defconfig' plus the following 'watchdog' related changes:

[root@fedora ~]# cat linux/.config | grep PSERIES_WDT
CONFIG_PSERIES_WDT=y

[root@fedora ~]# cat linux/.config | grep -i watchdog
CONFIG_PPC_WATCHDOG=y
CONFIG_HAVE_NMI_WATCHDOG=y
CONFIG_WATCHDOG=y
CONFIG_WATCHDOG_CORE=y
CONFIG_WATCHDOG_NOWAYOUT=y
CONFIG_WATCHDOG_HANDLE_BOOT_ENABLED=y
CONFIG_WATCHDOG_OPEN_TIMEOUT=0
# CONFIG_WATCHDOG_SYSFS is not set
# CONFIG_WATCHDOG_HRTIMER_PRETIMEOUT is not set
# Watchdog Pretimeout Governors
# CONFIG_WATCHDOG_PRETIMEOUT_GOV is not set
# Watchdog Device Drivers
# CONFIG_SOFT_WATCHDOG is not set
# CONFIG_XILINX_WATCHDOG is not set
# CONFIG_ZIIRAVE_WATCHDOG is not set
# CONFIG_CADENCE_WATCHDOG is not set
# CONFIG_DW_WATCHDOG is not set
# CONFIG_MAX63XX_WATCHDOG is not set
CONFIG_WATCHDOG_RTAS=y
# PCI-based Watchdog Cards
# CONFIG_PCIPCWATCHDOG is not set
# USB-based Watchdog Cards
# CONFIG_USBPCWATCHDOG is not set
# CONFIG_WQ_WATCHDOG is not set
[root@fedora ~]#



Kernel command line:

[root@fedora ~]# cat /proc/cmdline
BOOT_IMAGE=(ieee1275/disk,msdos2)/vmlinuz-5.19.0-rc2-00054-g12ede8ffb103 \
root=/dev/mapper/fedora_fedora-root ro rd.lvm.lv=fedora_fedora/root \
pseries-wdt.timeout=60 pseries-wdt.nowayout=1 pseries-wdt.action=2


With all that, executing

echo V > /dev/watchdog0

Does nothing. dmesg is clean and the guest doesn't reboot after the 60 sec
timeout.  I also tried with PSERIES_WDT being compiled as a module instead
of built-in. Same results.


What am I missing?


[1] https://patchwork.ozlabs.org/project/qemu-ppc/patch/20220608030153.1862335-1-...@ozlabs.ru/



Thanks,


Daniel




On 6/2/22 14:53, Scott Cheloha wrote:

PAPR v2.12 defines a new hypercall, H_WATCHDOG.  This patch series
adds support for this hypercall to powerpc/pseries kernels and
introduces a new watchdog driver, "pseries-wdt", for the virtual
timers exposed by the hypercall.

This series is preceded by the following:

RFC v1: https://lore.kernel.org/linux-watchdog/20220413165104.179144-1-chel...@linux.ibm.com/
RFC v2: https://lore.kernel.org/linux-watchdog/20220509174357.5448-1-chel...@linux.ibm.com/
PATCH v1: https://lore.kernel.org/linux-watchdog/20220520183552.33426-1-chel...@linux.ibm.com/

Changes of note from PATCH v1:

- Trim down the large comment documenting the H_WATCHDOG hypercall.
   The comment is likely to rot, so remove anything we aren't using
   and anything overly obvious.

- Remove any preprocessor definitions not actually used in the module
   right now.  If we want to use other features offered by the hypercall
   we can add them in later.  They're just clutter until then.

- Simplify the "action" module parameter.  The value is now an index
   into an array of possible timeoutAction values.  This design removes
   the need for the custom get/set methods used in PATCH v1.

   Now we merely need to check that the "action" value is a valid
   index during pseries_wdt_probe().  Easy.

- Make the timeoutAction a member of pseries_wdt, "action".  This
   eliminates the use of a global variable during pseries_wdt_start().

- Use watchdog_init_timeout() idiomatically.  Check its return value
   and error out of pseries_wdt_probe() if it fails.
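
The "action as array index" design described above can be sketched roughly
as follows. This is only an illustration: the table contents are placeholder
numbers, not the PAPR timeoutAction encodings, and wdt_pick_action() is a
hypothetical helper name rather than anything taken from the series.

```c
#include <errno.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Placeholder encodings, NOT the values defined by PAPR. */
static const unsigned long action_table[] = {
	0x1,	/* index 0: e.g. power off */
	0x2,	/* index 1: e.g. restart */
	0x3,	/* index 2: e.g. dump, then restart */
};

/*
 * Hypothetical probe-time check: accept the module parameter only if
 * it is a valid index, then hand back the value to store in the
 * per-device structure.
 */
static int wdt_pick_action(unsigned int action, unsigned long *out)
{
	if (action >= ARRAY_SIZE(action_table))
		return -EINVAL;

	*out = action_table[action];
	return 0;
}
```

With a plain index there is nothing to parse, so an ordinary unsigned-int
module parameter and one bounds check at probe time are enough.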




[PATCH] powerpc: Merge hardirq stack and softirq stack

2022-06-15 Thread Christophe Leroy
__do_IRQ() doesn't switch to the hardirq stack if we are already on the
softirq stack.

do_softirq() bails out early without doing anything when already in
an interrupt.

invoke_softirq() is on the task stack when it calls do_softirq_own_stack().

So there is no situation where we switch from the hardirq stack to the
softirq stack, nor from the softirq stack to the hardirq stack.

It is therefore not necessary to have two stacks because they are
never used at the same time.

Merge both stacks into a new one called normirq_ctx.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/irq.h |  3 +--
 arch/powerpc/kernel/irq.c  | 18 +++---
 arch/powerpc/kernel/process.c  |  6 +-
 arch/powerpc/kernel/setup_32.c |  6 ++
 arch/powerpc/kernel/setup_64.c |  6 ++
 5 files changed, 13 insertions(+), 26 deletions(-)

diff --git a/arch/powerpc/include/asm/irq.h b/arch/powerpc/include/asm/irq.h
index 13f0409dd617..03de3fe3488c 100644
--- a/arch/powerpc/include/asm/irq.h
+++ b/arch/powerpc/include/asm/irq.h
@@ -49,8 +49,7 @@ extern void *mcheckirq_ctx[NR_CPUS];
 /*
  * Per-cpu stacks for handling hard and soft interrupts.
  */
-extern void *hardirq_ctx[NR_CPUS];
-extern void *softirq_ctx[NR_CPUS];
+extern void *normirq_ctx[NR_CPUS];
 
 void __do_IRQ(struct pt_regs *regs);
 extern void __init init_IRQ(void);
diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
index dd09919c3c66..7c0455cd7aae 100644
--- a/arch/powerpc/kernel/irq.c
+++ b/arch/powerpc/kernel/irq.c
@@ -683,17 +683,16 @@ void __do_irq(struct pt_regs *regs)
 void __do_IRQ(struct pt_regs *regs)
 {
struct pt_regs *old_regs = set_irq_regs(regs);
-   void *cursp, *irqsp, *sirqsp;
+   void *cursp, *irqsp;
 
/* Switch to the irq stack to handle this */
cursp = (void *)(current_stack_pointer & ~(THREAD_SIZE - 1));
-   irqsp = hardirq_ctx[raw_smp_processor_id()];
-   sirqsp = softirq_ctx[raw_smp_processor_id()];
+   irqsp = normirq_ctx[raw_smp_processor_id()];
 
check_stack_overflow();
 
/* Already there ? */
-   if (unlikely(cursp == irqsp || cursp == sirqsp)) {
+   if (unlikely(cursp == irqsp)) {
__do_irq(regs);
set_irq_regs(old_regs);
return;
@@ -719,10 +718,8 @@ static void __init vmap_irqstack_init(void)
 {
int i;
 
-   for_each_possible_cpu(i) {
-   softirq_ctx[i] = alloc_vm_stack();
-   hardirq_ctx[i] = alloc_vm_stack();
-   }
+   for_each_possible_cpu(i)
+   normirq_ctx[i] = alloc_vm_stack();
 }
 
 
@@ -744,12 +741,11 @@ void*dbgirq_ctx[NR_CPUS] __read_mostly;
 void *mcheckirq_ctx[NR_CPUS] __read_mostly;
 #endif
 
-void *softirq_ctx[NR_CPUS] __read_mostly;
-void *hardirq_ctx[NR_CPUS] __read_mostly;
+void *normirq_ctx[NR_CPUS] __read_mostly;
 
 void do_softirq_own_stack(void)
 {
-   call_do_softirq(softirq_ctx[smp_processor_id()]);
+   call_do_softirq(normirq_ctx[smp_processor_id()]);
 }
 
 irq_hw_number_t virq_to_hw(unsigned int virq)
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index ee0433809621..4b724d86ed9d 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -2089,11 +2089,7 @@ static inline int valid_irq_stack(unsigned long sp, 
struct task_struct *p,
unsigned long stack_page;
unsigned long cpu = task_cpu(p);
 
-   stack_page = (unsigned long)hardirq_ctx[cpu];
-   if (sp >= stack_page && sp <= stack_page + THREAD_SIZE - nbytes)
-   return 1;
-
-   stack_page = (unsigned long)softirq_ctx[cpu];
+   stack_page = (unsigned long)normirq_ctx[cpu];
if (sp >= stack_page && sp <= stack_page + THREAD_SIZE - nbytes)
return 1;
 
diff --git a/arch/powerpc/kernel/setup_32.c b/arch/powerpc/kernel/setup_32.c
index 813261789303..cad0e4fbdd4b 100644
--- a/arch/powerpc/kernel/setup_32.c
+++ b/arch/powerpc/kernel/setup_32.c
@@ -158,10 +158,8 @@ void __init irqstack_early_init(void)
 
/* interrupt stacks must be in lowmem, we get that for free on ppc32
 * as the memblock is limited to lowmem by default */
-   for_each_possible_cpu(i) {
-   softirq_ctx[i] = alloc_stack();
-   hardirq_ctx[i] = alloc_stack();
-   }
+   for_each_possible_cpu(i)
+   normirq_ctx[i] = alloc_stack();
 }
 
 #ifdef CONFIG_VMAP_STACK
diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index 5761f08dae95..70ba227d13fc 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -718,10 +718,8 @@ void __init irqstack_early_init(void)
 * cannot afford to take SLB misses on them. They are not
 * accessed in realmode.
 */
-   for_each_possible_cpu(i) {
-   softirq_ctx[i] = alloc_stack(limit, i);
-   hardirq_ctx[i] = alloc_stack(limit, i);
-   }
+   for_each_possible_cpu(i)
+   normirq_ctx[i] = 

Re: [RFC PATCH v2 0/7] objtool: Enable and implement --mcount option on powerpc

2022-06-15 Thread Christophe Leroy


On 25/05/2022 at 20:12, Sathvika Vasireddy wrote:
> 
> On 25/05/22 23:09, Christophe Leroy wrote:
>> Hi Sathvika,
>>
>> On 25/05/2022 at 12:14, Sathvika Vasireddy wrote:
>>> Hi Christophe,
>>>
>>> On 24/05/22 18:47, Christophe Leroy wrote:
>>>> This draft series adds PPC32 support to Sathvika's series.
>>>> Verified on pmac32 on QEMU.
>>>>
>>>> It should in principle also work for PPC64 BE but for the time being
>>>> something goes wrong. In the beginning I had a segfault hence the first
>>>> patch. But I still get no mcount section in the files.
>>> Since PPC64 BE uses older elfv1 ABI, it prepends a dot to symbols.
>>> And so, the relocation records in case of PPC64BE point to "._mcount",
>>> rather than just "_mcount". We should be looking for "._mcount" to be
>>> able to generate mcount_loc section in the files.
>>>
>>> Like:
>>>
>>> diff --git a/tools/objtool/check.c b/tools/objtool/check.c
>>> index 70be5a72e838..7da5bf8c7236 100644
>>> --- a/tools/objtool/check.c
>>> +++ b/tools/objtool/check.c
>>> @@ -2185,7 +2185,7 @@ static int classify_symbols(struct objtool_file
>>> *file)
>>>       if (arch_is_retpoline(func))
>>>       func->retpoline_thunk = true;
>>>
>>> -   if ((!strcmp(func->name, "__fentry__")) ||
>>> (!strcmp(func->name, "_mcount")))
>>> +   if ((!strcmp(func->name, "__fentry__")) ||
>>> (!strcmp(func->name, "_mcount")) || (!strcmp(func->name, "._mcount")))
>>>       func->fentry = true;
>>>
>>>       if (is_profiling_func(func->name))
>>>
>>>
>>> With this change, I could see __mcount_loc section being
>>> generated in individual ppc64be object files.
>>>
>> Or should we implement an equivalent of arch_ftrace_match_adjust() in
>> objtool ?
> 
> Yeah, I think it makes more sense if we make it arch specific.
> Thanks for the suggestion. I'll make this change in next revision :-)
> 
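
For reference, the kind of arch hook discussed above might look roughly like
the sketch below. Both function names are made up for illustration and are
not existing objtool symbols; the only fact taken from the thread is that the
ELFv1 ABI used by PPC64 BE prefixes the symbol with a dot.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical powerpc hook: match the mcount symbol whether or not
 * it carries the leading dot seen in ELFv1 relocation records.
 */
static bool arch_mcount_match(const char *name)
{
	if (name[0] == '.')
		name++;

	return !strcmp(name, "_mcount");
}

/* Generic side: __fentry__ stays common, the rest is left to the arch. */
static bool is_fentry_symbol(const char *name)
{
	return !strcmp(name, "__fentry__") || arch_mcount_match(name);
}
```

That would keep the string comparison in classify_symbols() free of
powerpc-specific names.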

Do you have any idea when you plan to send next revision ?

I'm really looking forward to submitting the inline static calls on top 
of your series.

Thanks
Christophe

Re: [PATCH] arch/*: Disable softirq stacks on PREEMPT_RT.

2022-06-15 Thread Arnd Bergmann
On Wed, Jun 15, 2022 at 8:57 AM Christoph Hellwig  wrote:
>
> On Tue, Jun 14, 2022 at 08:18:14PM +0200, Sebastian Andrzej Siewior wrote:
> > Disable the unused softirqs stacks on PREEMPT_RT to safe some memory and
>
> s/safe/save/


Applied to the asm-generic tree with the above fixup, thanks!

  Arnd


Re: [PATCH 24/30] panic: Refactor the panic path

2022-06-15 Thread Guilherme G. Piccoli
Perfect Petr, thanks for your feedback!

I'll be out for some weeks, but after that my plan is to split
the series into 2 parts:

(a) The general fixes, which should be reviewed by subsystem maintainers
and even merged individually by them.

(b) The proper panic refactor, which includes the notifiers list split,
etc. I'll think about what I consider the best solution for the
crash_dump required ones, and will try to split it into very simple patches
to make it easier to review.

Cheers,


Guilherme


Re: [PATCH 2/6] powerpc: Provide syscall wrapper

2022-06-15 Thread Arnd Bergmann
On Wed, Jun 15, 2022 at 3:47 AM Rohan McLure  wrote:
> > On 3 Jun 2022, at 7:04 pm, Arnd Bergmann  wrote:
> > On Wed, Jun 1, 2022 at 7:48 AM Rohan McLure  wrote:

> > What is the benefit of having a separate set of macros for this? I think
> > that adds more complexity than it saves in the end.
>
> I was unsure whether the exact return types needed to be respected for syscall
> handlers or not. I realise that under the existing behaviour,
> system_call_exception performs an indirect call, the return type of which is
> interpreted as a long, so the return type should be irrelevant. On inspection
> PPC_SYSCALL_DEFINE is readily replaceable with COMPAT_SYSCALL_DEFINE as you
> have suggested.
>
> Before resubmitting this series, I will try for a patch series which
> modernises syscall handlers in arch/powerpc, and inspect where powerpc
> private versions are strictly necessary, using __ARCH_WANT_... wherever
> possible.

Ok, great! The parameter ordering is a bit tricky for some of them. I think
in most cases the version used by risc-v now should be the same as what
you need for powerpc (with the appropriate compat_arg_u64_dual()).
If some don't work, I would suggest modifying the common code so it can
handle both riscv and powerpc instead of keeping a private copy.
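
To illustrate what the split 64-bit arguments amount to: a compat task passes
each u64 as two 32-bit halves, and the handler has to glue them back
together. A stand-alone sketch, where glue_u64() is a made-up name and the
register order (the endian-dependent part that compat_arg_u64_dual() is there
to hide) is left to the caller:

```c
#include <stdint.h>

/*
 * Hypothetical helper: rebuild a 64-bit syscall argument from the two
 * 32-bit halves received in separate registers.  The caller names the
 * halves explicitly, so endianness does not matter here.
 */
static inline uint64_t glue_u64(uint32_t hi, uint32_t lo)
{
	return ((uint64_t)hi << 32) | lo;
}
```

If the common code performs this step the same way for riscv and powerpc,
only the order in which the two halves arrive remains arch-specific.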

   Arnd


Re: [PATCH] arch/*: Disable softirq stacks on PREEMPT_RT.

2022-06-15 Thread Christoph Hellwig
On Tue, Jun 14, 2022 at 08:18:14PM +0200, Sebastian Andrzej Siewior wrote:
> Disable the unused softirqs stacks on PREEMPT_RT to safe some memory and

s/safe/save/


Re: [PATCHv2 2/2] uio:powerpc:mpc85xx: l2-cache-sram uio driver implementation

2022-06-15 Thread Christophe Leroy


On 15/06/2022 at 07:57, Wang Wenhu wrote:
> Freescale mpc85xx l2-cache could be optionally configured as SRAM partly
> or fully. Users can make use of it as a block of independent memory that
> offers special usage, such as for debuging or other critical status info
> storage, which keeps consistently even when the whole system crashed.
> Applications can make use of UIO driver to access the SRAM from user level.
> 
> Once there was another driver version for the l2-cache-sram for SRAM access
> in kernel space. It had been removed recently.
> See: 
> https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git/commit/?id=dc21ed2aef4150fc2fcf58227a4ff24502015c03
> 
> Signed-off-by: Wang Wenhu 
> ---
> v2:
>   - Use __be32 instead of u32 for big-endian data declarations;

I get the following warnings with
'make drivers/uio/uio_fsl_85xx_cache_sram.o C=2':

   CHECK   drivers/uio/uio_fsl_85xx_cache_sram.c
drivers/uio/uio_fsl_85xx_cache_sram.c:96:19: warning: incorrect type in argument 1 (different base types)
drivers/uio/uio_fsl_85xx_cache_sram.c:96:19:    expected unsigned int volatile [noderef] [usertype] __iomem *addr
drivers/uio/uio_fsl_85xx_cache_sram.c:96:19:    got restricted __be32 [noderef] __iomem *
drivers/uio/uio_fsl_85xx_cache_sram.c:100:27: warning: incorrect type in argument 1 (different base types)
drivers/uio/uio_fsl_85xx_cache_sram.c:100:27:    expected unsigned int volatile [noderef] [usertype] __iomem *addr
drivers/uio/uio_fsl_85xx_cache_sram.c:100:27:    got restricted __be32 [noderef] __iomem *
drivers/uio/uio_fsl_85xx_cache_sram.c:102:9: warning: incorrect type in argument 1 (different base types)
drivers/uio/uio_fsl_85xx_cache_sram.c:102:9:    expected unsigned int volatile [noderef] [usertype] __iomem *addr
drivers/uio/uio_fsl_85xx_cache_sram.c:102:9:    got restricted __be32 [noderef] __iomem *
drivers/uio/uio_fsl_85xx_cache_sram.c:102:9: warning: incorrect type in argument 1 (different base types)
drivers/uio/uio_fsl_85xx_cache_sram.c:102:9:    expected unsigned int const volatile [noderef] [usertype] __iomem *addr
drivers/uio/uio_fsl_85xx_cache_sram.c:102:9:    got restricted __be32 [noderef] __iomem *
drivers/uio/uio_fsl_85xx_cache_sram.c:106:17: warning: incorrect type in argument 1 (different base types)
drivers/uio/uio_fsl_85xx_cache_sram.c:106:17:    expected unsigned int volatile [noderef] [usertype] __iomem *addr
drivers/uio/uio_fsl_85xx_cache_sram.c:106:17:    got restricted __be32 [noderef] __iomem *
drivers/uio/uio_fsl_85xx_cache_sram.c:106:17: warning: incorrect type in argument 1 (different base types)
drivers/uio/uio_fsl_85xx_cache_sram.c:106:17:    expected unsigned int const volatile [noderef] [usertype] __iomem *addr
drivers/uio/uio_fsl_85xx_cache_sram.c:106:17:    got restricted __be32 [noderef] __iomem *
drivers/uio/uio_fsl_85xx_cache_sram.c:110:17: warning: incorrect type in argument 1 (different base types)
drivers/uio/uio_fsl_85xx_cache_sram.c:110:17:    expected unsigned int volatile [noderef] [usertype] __iomem *addr
drivers/uio/uio_fsl_85xx_cache_sram.c:110:17:    got restricted __be32 [noderef] __iomem *
drivers/uio/uio_fsl_85xx_cache_sram.c:110:17: warning: incorrect type in argument 1 (different base types)
drivers/uio/uio_fsl_85xx_cache_sram.c:110:17:    expected unsigned int const volatile [noderef] [usertype] __iomem *addr
drivers/uio/uio_fsl_85xx_cache_sram.c:110:17:    got restricted __be32 [noderef] __iomem *
drivers/uio/uio_fsl_85xx_cache_sram.c:114:17: warning: incorrect type in argument 1 (different base types)
drivers/uio/uio_fsl_85xx_cache_sram.c:114:17:    expected unsigned int volatile [noderef] [usertype] __iomem *addr
drivers/uio/uio_fsl_85xx_cache_sram.c:114:17:    got restricted __be32 [noderef] __iomem *
drivers/uio/uio_fsl_85xx_cache_sram.c:114:17: warning: incorrect type in argument 1 (different base types)
drivers/uio/uio_fsl_85xx_cache_sram.c:114:17:    expected unsigned int const volatile [noderef] [usertype] __iomem *addr
drivers/uio/uio_fsl_85xx_cache_sram.c:114:17:    got restricted __be32 [noderef] __iomem *
drivers/uio/uio_fsl_85xx_cache_sram.c:119:17: warning: incorrect type in argument 1 (different base types)
drivers/uio/uio_fsl_85xx_cache_sram.c:119:17:    expected unsigned int volatile [noderef] [usertype] __iomem *addr
drivers/uio/uio_fsl_85xx_cache_sram.c:119:17:    got restricted __be32 [noderef] __iomem *
drivers/uio/uio_fsl_85xx_cache_sram.c:119:17: warning: incorrect type in argument 1 (different base types)
drivers/uio/uio_fsl_85xx_cache_sram.c:119:17:    expected unsigned int const volatile [noderef] [usertype] __iomem *addr
drivers/uio/uio_fsl_85xx_cache_sram.c:119:17:    got restricted __be32 [noderef] __iomem *


>   - Use generic ioremap_cache instead of ioremap_coherent;
>   - Physical address support both 32 and 64 bits;
>   - Addressed some other comments from Greg.
> ---
>   drivers/uio/Kconfig   |  14 ++
>   drivers/uio/Makefile  

Re: [PATCHv2 1/2] mm: eliminate ifdef of HAVE_IOREMAP_PROT in .c files

2022-06-15 Thread Christoph Hellwig
Did you verify that all architectures actually provide a ioremap_prot
prototype?
The header situation for ioremap* is a mess unfortunately.


Re: [PATCHv2 2/2] uio:powerpc:mpc85xx: l2-cache-sram uio driver implementation

2022-06-15 Thread Christophe Leroy


On 15/06/2022 at 07:57, Wang Wenhu wrote:
> Freescale mpc85xx l2-cache could be optionally configured as SRAM partly
> or fully. Users can make use of it as a block of independent memory that
> offers special usage, such as for debuging or other critical status info
> storage, which keeps consistently even when the whole system crashed.
> Applications can make use of UIO driver to access the SRAM from user level.
> 
> Once there was another driver version for the l2-cache-sram for SRAM access
> in kernel space. It had been removed recently.
> See: 
> https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git/commit/?id=dc21ed2aef4150fc2fcf58227a4ff24502015c03
> 
> Signed-off-by: Wang Wenhu 
> ---
> v2:
>   - Use __be32 instead of u32 for big-endian data declarations;
>   - Use generic ioremap_cache instead of ioremap_coherent;
>   - Physical address support both 32 and 64 bits;
>   - Addressed some other comments from Greg.
> ---
>   drivers/uio/Kconfig   |  14 ++
>   drivers/uio/Makefile  |   1 +
>   drivers/uio/uio_fsl_85xx_cache_sram.c | 288 ++
>   3 files changed, 303 insertions(+)
>   create mode 100644 drivers/uio/uio_fsl_85xx_cache_sram.c
> 
> diff --git a/drivers/uio/Kconfig b/drivers/uio/Kconfig
> index 2e16c5338e5b..f7604584a12c 100644
> --- a/drivers/uio/Kconfig
> +++ b/drivers/uio/Kconfig
> @@ -105,6 +105,20 @@ config UIO_NETX
> To compile this driver as a module, choose M here; the module
> will be called uio_netx.
>   
> +config UIO_FSL_85XX_CACHE_SRAM
> + tristate "Freescale 85xx L2-Cache-SRAM UIO driver"
> + depends on FSL_SOC_BOOKE && PPC32
> + help
> +   Driver for user level access of freescale mpc85xx l2-cache-sram.
> +
> +   Freescale's mpc85xx provides an option of configuring a part of
> +   (or full) cache memory as SRAM. The driver does this configuring
> +   work and exports SRAM to user-space for access form user level.
> +   This is extremely helpful for user applications that require
> +   high performance memory accesses.
> +
> +   If you don't know what to do here, say N.
> +
>   config UIO_FSL_ELBC_GPCM
>   tristate "eLBC/GPCM driver"
>   depends on FSL_LBC
> diff --git a/drivers/uio/Makefile b/drivers/uio/Makefile
> index f2f416a14228..1ba07d92a1b1 100644
> --- a/drivers/uio/Makefile
> +++ b/drivers/uio/Makefile
> @@ -12,3 +12,4 @@ obj-$(CONFIG_UIO_MF624) += uio_mf624.o
>   obj-$(CONFIG_UIO_FSL_ELBC_GPCM) += uio_fsl_elbc_gpcm.o
>   obj-$(CONFIG_UIO_HV_GENERIC)+= uio_hv_generic.o
>   obj-$(CONFIG_UIO_DFL)   += uio_dfl.o
> +obj-$(CONFIG_UIO_FSL_85XX_CACHE_SRAM)+= uio_fsl_85xx_cache_sram.o
> diff --git a/drivers/uio/uio_fsl_85xx_cache_sram.c 
> b/drivers/uio/uio_fsl_85xx_cache_sram.c
> new file mode 100644
> index ..6f91b0aa946b
> --- /dev/null
> +++ b/drivers/uio/uio_fsl_85xx_cache_sram.c
> @@ -0,0 +1,288 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (C) 2022 Wang Wenhu 
> + * All rights reserved.
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#define DRIVER_NAME  "uio_mpc85xx_cache_sram"
> +#define UIO_INFO_VER "0.0.1"
> +#define UIO_NAME "uio_cache_sram"
> +
> +#define L2CR_L2FI  0x4000  /* L2 flash invalidate */
> +#define L2CR_L2IO  0x0020  /* L2 instruction only */
> +#define L2CR_SRAM_ZERO   0x  /* L2SRAM zero size */
> +#define L2CR_SRAM_FULL   0x0001  /* L2SRAM full size */
> +#define L2CR_SRAM_HALF   0x0002  /* L2SRAM half size */
> +#define L2CR_SRAM_TWO_HALFS  0x0003  /* L2SRAM two half sizes */
> +#define L2CR_SRAM_QUART  0x0004  /* L2SRAM one quarter size */
> +#define L2CR_SRAM_TWO_QUARTS 0x0005  /* L2SRAM two quarter size */
> +#define L2CR_SRAM_EIGHTH 0x0006  /* L2SRAM one eighth size */
> +#define L2CR_SRAM_TWO_EIGHTH 0x0007  /* L2SRAM two eighth size */
> +
> +#define L2SRAM_OPTIMAL_SZ_SHIFT  0x0003  /* Optimum size for L2SRAM */
> +
> +#define L2SRAM_BAR_MSK_LO18  0xC000  /* Lower 18 bits */
> +#define L2SRAM_BARE_MSK_HI4  0x000F  /* Upper 4 bits */
> +
> +enum cache_sram_lock_ways {
> + LOCK_WAYS_ZERO  = 0,
> + LOCK_WAYS_EIGHTH= 1,
> + LOCK_WAYS_TWO_EIGHTH= 2,
> + LOCK_WAYS_HALF  = 4,
> + LOCK_WAYS_FULL  = 8,
> +};
> +
> +struct mpc85xx_l2ctlr {
> + __be32  ctl;/* 0x000 - L2 control */
> + u8  res1[0xC];
> + __be32  ewar0;  /* 0x010 - External write address 0 */
> + __be32  ewarea0;/* 0x014 - External write address extended 0 */
> + __be32  ewcr0;  /* 0x018 - External write ctrl */
> + u8  res2[4];
> + __be32  ewar1;  /* 0x020 - External write address 1 */
> + __be32  ewarea1;  

Re: [PATCHv2 2/2] uio:powerpc:mpc85xx: l2-cache-sram uio driver implementation

2022-06-15 Thread Christoph Hellwig
As pointed out last time:  uio is the wrong interface to expose sram,
and any kind of ioremap is the wrong way to map it.


Re: 回复: [PATCH 2/2] uio:powerpc:mpc85xx: l2-cache-sram uio driver implementation

2022-06-15 Thread Christophe Leroy


On 14/06/2022 at 08:09, Wenhu Wang wrote:
>>> +
>>> +static const struct vm_operations_struct uio_cache_sram_vm_ops = {
>>> +#ifdef CONFIG_HAVE_IOREMAP_PROT
>>
>> Same here.
>>
> 
> I tried to eliminate it in mainline
> See: [PATCH v2] mm: eliminate ifdef of HAVE_IOREMAP_PROT in .c files
> https://lkml.org/lkml/2022/6/10/695
> 
>>> + .access = generic_access_phys,
>>> +#endif
>>> +};

Another solution is to do:


static const struct vm_operations_struct uio_cache_sram_vm_ops = {
	.access = IS_ENABLED(CONFIG_HAVE_IOREMAP_PROT) ? generic_access_phys : NULL,
};
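
For readers outside the kernel tree, the effect of this form can be
sketched in plain userspace C. Everything below is a stand-in and an
assumption made for illustration: DEMO_IS_ENABLED only imitates the
kernel's IS_ENABLED() for options defined to 1, and demo_access_phys,
struct demo_vm_ops and the CONFIG_DEMO_* names are hypothetical, not
kernel symbols.

```c
#include <stddef.h>

/*
 * Userspace imitation of the IS_ENABLED() idea: expands to 1 when the
 * option is defined to 1, and to 0 when it is not defined at all.
 */
#define DEMO_PLACEHOLDER_1 0,
#define DEMO_SECOND_ARG(ignored, val, ...) val
#define DEMO_IS_DEFINED_(junk_or_arg) DEMO_SECOND_ARG(junk_or_arg 1, 0, 0)
#define DEMO_IS_DEFINED(val) DEMO_IS_DEFINED_(DEMO_PLACEHOLDER_##val)
#define DEMO_IS_ENABLED(option) DEMO_IS_DEFINED(option)

#define CONFIG_DEMO_ENABLED 1	/* hypothetical option that is set */
/* CONFIG_DEMO_DISABLED is deliberately left undefined. */

typedef int (*demo_access_fn)(void *buf, int len, int write);

/* Stand-in for generic_access_phys(): pretends to transfer len bytes. */
static int demo_access_phys(void *buf, int len, int write)
{
	(void)buf;
	(void)write;
	return len;
}

struct demo_vm_ops {
	demo_access_fn access;
};

/* Option set: the hook points at the real function. */
static const struct demo_vm_ops demo_ops_enabled = {
	.access = DEMO_IS_ENABLED(CONFIG_DEMO_ENABLED) ? demo_access_phys : NULL,
};

/* Option not set: the hook is NULL, with no #ifdef in sight. */
static const struct demo_vm_ops demo_ops_disabled = {
	.access = DEMO_IS_ENABLED(CONFIG_DEMO_DISABLED) ? demo_access_phys : NULL,
};
```

Both arms of the conditional are still compiled, so the function named
in the enabled arm has to be at least declared in every configuration.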


Christophe

[PATCHv2 2/2] uio:powerpc:mpc85xx: l2-cache-sram uio driver implementation

2022-06-15 Thread Wang Wenhu
The L2 cache on Freescale mpc85xx can optionally be configured, partly or
fully, as SRAM. Users can treat it as a block of independent memory for
special purposes, such as storing debugging or other critical status
information that stays intact even after the whole system has crashed.
Applications can use the UIO driver to access the SRAM from user level.

An earlier driver provided access to the l2-cache-sram from kernel space.
It was removed recently.
See: 
https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git/commit/?id=dc21ed2aef4150fc2fcf58227a4ff24502015c03

Signed-off-by: Wang Wenhu 
---
v2:
 - Use __be32 instead of u32 for big-endian data declarations;
 - Use generic ioremap_cache instead of ioremap_coherent;
 - Support both 32-bit and 64-bit physical addresses;
 - Addressed some other comments from Greg.
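
As a rough illustration of the user-level access mentioned in the commit
message, a UIO region is normally reached with mmap() on the device node,
selecting mapping N by passing N pages as the offset. The sketch below
assumes that convention; the helper names are hypothetical, and the
device path (for example /dev/uio0), the map index and the length are
assumptions the caller must supply, typically after reading the size from
/sys/class/uio/uioX/maps/mapN/size.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* UIO convention: mapping N is selected with an mmap() offset of N pages. */
static off_t uio_map_offset(unsigned int map_index)
{
	return (off_t)map_index * (off_t)sysconf(_SC_PAGESIZE);
}

/*
 * Map `len` bytes of mapping `map_index` of the UIO device at `dev_path`.
 * Returns NULL on any failure. The mapping stays valid after the file
 * descriptor is closed; release it with munmap().
 */
static void *uio_map_region(const char *dev_path, unsigned int map_index,
			    size_t len)
{
	void *p;
	int fd = open(dev_path, O_RDWR);

	if (fd < 0)
		return NULL;

	p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd,
		 uio_map_offset(map_index));
	close(fd);

	return p == MAP_FAILED ? NULL : p;
}
```

A caller would then read and write the SRAM through the returned pointer
and call munmap() when done.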
---
 drivers/uio/Kconfig   |  14 ++
 drivers/uio/Makefile  |   1 +
 drivers/uio/uio_fsl_85xx_cache_sram.c | 288 ++
 3 files changed, 303 insertions(+)
 create mode 100644 drivers/uio/uio_fsl_85xx_cache_sram.c

diff --git a/drivers/uio/Kconfig b/drivers/uio/Kconfig
index 2e16c5338e5b..f7604584a12c 100644
--- a/drivers/uio/Kconfig
+++ b/drivers/uio/Kconfig
@@ -105,6 +105,20 @@ config UIO_NETX
  To compile this driver as a module, choose M here; the module
  will be called uio_netx.
 
+config UIO_FSL_85XX_CACHE_SRAM
+   tristate "Freescale 85xx L2-Cache-SRAM UIO driver"
+   depends on FSL_SOC_BOOKE && PPC32
+   help
+ Driver for user level access of freescale mpc85xx l2-cache-sram.
+
+ Freescale's mpc85xx provides an option of configuring a part of
+ (or full) cache memory as SRAM. The driver does this configuring
+ work and exports SRAM to user-space for access from user level.
+ This is extremely helpful for user applications that require
+ high performance memory accesses.
+
+ If you don't know what to do here, say N.
+
 config UIO_FSL_ELBC_GPCM
tristate "eLBC/GPCM driver"
depends on FSL_LBC
diff --git a/drivers/uio/Makefile b/drivers/uio/Makefile
index f2f416a14228..1ba07d92a1b1 100644
--- a/drivers/uio/Makefile
+++ b/drivers/uio/Makefile
@@ -12,3 +12,4 @@ obj-$(CONFIG_UIO_MF624) += uio_mf624.o
 obj-$(CONFIG_UIO_FSL_ELBC_GPCM)+= uio_fsl_elbc_gpcm.o
 obj-$(CONFIG_UIO_HV_GENERIC)   += uio_hv_generic.o
 obj-$(CONFIG_UIO_DFL)  += uio_dfl.o
+obj-$(CONFIG_UIO_FSL_85XX_CACHE_SRAM)  += uio_fsl_85xx_cache_sram.o
diff --git a/drivers/uio/uio_fsl_85xx_cache_sram.c b/drivers/uio/uio_fsl_85xx_cache_sram.c
new file mode 100644
index ..6f91b0aa946b
--- /dev/null
+++ b/drivers/uio/uio_fsl_85xx_cache_sram.c
@@ -0,0 +1,288 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2022 Wang Wenhu 
+ * All rights reserved.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define DRIVER_NAME  "uio_mpc85xx_cache_sram"
+#define UIO_INFO_VER  "0.0.1"
+#define UIO_NAME  "uio_cache_sram"
+
+#define L2CR_L2FI  0x4000  /* L2 flash invalidate */
+#define L2CR_L2IO  0x0020  /* L2 instruction only */
+#define L2CR_SRAM_ZERO  0x  /* L2SRAM zero size */
+#define L2CR_SRAM_FULL  0x0001  /* L2SRAM full size */
+#define L2CR_SRAM_HALF  0x0002  /* L2SRAM half size */
+#define L2CR_SRAM_TWO_HALFS  0x0003  /* L2SRAM two half sizes */
+#define L2CR_SRAM_QUART  0x0004  /* L2SRAM one quarter size */
+#define L2CR_SRAM_TWO_QUARTS  0x0005  /* L2SRAM two quarter size */
+#define L2CR_SRAM_EIGHTH  0x0006  /* L2SRAM one eighth size */
+#define L2CR_SRAM_TWO_EIGHTH  0x0007  /* L2SRAM two eighth size */
+
+#define L2SRAM_OPTIMAL_SZ_SHIFT  0x0003  /* Optimum size for L2SRAM */
+
+#define L2SRAM_BAR_MSK_LO18  0xC000  /* Lower 18 bits */
+#define L2SRAM_BARE_MSK_HI4  0x000F  /* Upper 4 bits */
+
+enum cache_sram_lock_ways {
+   LOCK_WAYS_ZERO  = 0,
+   LOCK_WAYS_EIGHTH= 1,
+   LOCK_WAYS_TWO_EIGHTH= 2,
+   LOCK_WAYS_HALF  = 4,
+   LOCK_WAYS_FULL  = 8,
+};
+
+struct mpc85xx_l2ctlr {
+   __be32  ctl;/* 0x000 - L2 control */
+   u8  res1[0xC];
+   __be32  ewar0;  /* 0x010 - External write address 0 */
+   __be32  ewarea0;/* 0x014 - External write address extended 0 */
+   __be32  ewcr0;  /* 0x018 - External write ctrl */
+   u8  res2[4];
+   __be32  ewar1;  /* 0x020 - External write address 1 */
+   __be32  ewarea1;/* 0x024 - External write address extended 1 */
+   __be32  ewcr1;  /* 0x028 - External write ctrl 1 */
+   u8  res3[4];
+   __be32  ewar2;  /* 0x030 - External write address 2 */
+   __be32  ewarea2;   

[PATCHv2 1/2] mm: eliminate ifdef of HAVE_IOREMAP_PROT in .c files

2022-06-15 Thread Wang Wenhu
The "Conditional Compilation" chapter of the kernel coding-style
documentation recommends avoiding preprocessor conditionals in .c files
wherever possible.

This is a good opportunity to remove the CONFIG_HAVE_IOREMAP_PROT
conditionals from the .c files that reference it and to confine its usage
to mm/memory.c.
HAVE_IOREMAP_PROT is supported by some architectures, such as powerpc and
x86, but not by others, such as arm. Functions that depend on it therefore
need a no-op version. Currently the only such function is
generic_access_phys, which is referenced by several other modules.

Signed-off-by: Wang Wenhu 
---
v2:
 - Added IS_ENABLED(CONFIG_HAVE_IOREMAP_PROT) condition in __access_remote_vm
 - Added a no-op generic_access_phys() in mm/memory.c instead of the former
   "static inline" version in include/linux/mm.h
Former: https://lore.kernel.org/linux-mm/yqmrtwah5fiws...@kroah.com/T/
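
The no-op fallback described in the commit message can be sketched in
userspace C as follows. This is only an illustration under stated
assumptions: DEMO_HAVE_IOREMAP_PROT stands in for the real config symbol,
and struct demo_region, demo_access_phys and struct demo_vm_ops are
hypothetical names, not the kernel's types.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical switch standing in for CONFIG_HAVE_IOREMAP_PROT. */
/* #define DEMO_HAVE_IOREMAP_PROT 1 */

struct demo_region {
	unsigned char bytes[16];
};

#ifdef DEMO_HAVE_IOREMAP_PROT
/* Real version: copy between the region and buf, return bytes moved. */
int demo_access_phys(struct demo_region *r, unsigned long off,
		     void *buf, int len, int write)
{
	if (len < 0 || off > sizeof(r->bytes) ||
	    (unsigned long)len > sizeof(r->bytes) - off)
		return 0;
	if (write)
		memcpy(r->bytes + off, buf, (size_t)len);
	else
		memcpy(buf, r->bytes + off, (size_t)len);
	return len;
}
#else
/* No-op version: same signature, reports that nothing was transferred. */
int demo_access_phys(struct demo_region *r, unsigned long off,
		     void *buf, int len, int write)
{
	(void)r;
	(void)off;
	(void)buf;
	(void)len;
	(void)write;
	return 0;
}
#endif

/* The only preprocessor conditional lives above; users need none. */
struct demo_vm_ops {
	int (*access)(struct demo_region *r, unsigned long off,
		      void *buf, int len, int write);
};

static const struct demo_vm_ops demo_ops = {
	.access = demo_access_phys,
};
```

With the switch left undefined, demo_ops.access is still a valid pointer,
and calling it returns 0 without touching the region.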
---
 drivers/char/mem.c  |  2 --
 drivers/fpga/dfl-afu-main.c |  2 --
 drivers/pci/mmap.c  |  2 --
 drivers/uio/uio.c   |  2 --
 mm/memory.c | 13 +
 5 files changed, 9 insertions(+), 12 deletions(-)

diff --git a/drivers/char/mem.c b/drivers/char/mem.c
index 84ca98ed1dad..40186a441e38 100644
--- a/drivers/char/mem.c
+++ b/drivers/char/mem.c
@@ -354,9 +354,7 @@ static inline int private_mapping_ok(struct vm_area_struct *vma)
 #endif
 
 static const struct vm_operations_struct mmap_mem_ops = {
-#ifdef CONFIG_HAVE_IOREMAP_PROT
.access = generic_access_phys
-#endif
 };
 
 static int mmap_mem(struct file *file, struct vm_area_struct *vma)
diff --git a/drivers/fpga/dfl-afu-main.c b/drivers/fpga/dfl-afu-main.c
index 7f621e96d3b8..833e14806c7a 100644
--- a/drivers/fpga/dfl-afu-main.c
+++ b/drivers/fpga/dfl-afu-main.c
@@ -797,9 +797,7 @@ static long afu_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
 }
 
 static const struct vm_operations_struct afu_vma_ops = {
-#ifdef CONFIG_HAVE_IOREMAP_PROT
.access = generic_access_phys,
-#endif
 };
 
 static int afu_mmap(struct file *filp, struct vm_area_struct *vma)
diff --git a/drivers/pci/mmap.c b/drivers/pci/mmap.c
index b8c9011987f4..1dcfabf80453 100644
--- a/drivers/pci/mmap.c
+++ b/drivers/pci/mmap.c
@@ -35,9 +35,7 @@ int pci_mmap_page_range(struct pci_dev *pdev, int bar,
 #endif
 
 static const struct vm_operations_struct pci_phys_vm_ops = {
-#ifdef CONFIG_HAVE_IOREMAP_PROT
.access = generic_access_phys,
-#endif
 };
 
 int pci_mmap_resource_range(struct pci_dev *pdev, int bar,
diff --git a/drivers/uio/uio.c b/drivers/uio/uio.c
index 43afbb7c5ab9..c9205a121007 100644
--- a/drivers/uio/uio.c
+++ b/drivers/uio/uio.c
@@ -719,9 +719,7 @@ static int uio_mmap_logical(struct vm_area_struct *vma)
 }
 
 static const struct vm_operations_struct uio_physical_vm_ops = {
-#ifdef CONFIG_HAVE_IOREMAP_PROT
.access = generic_access_phys,
-#endif
 };
 
 static int uio_mmap_physical(struct vm_area_struct *vma)
diff --git a/mm/memory.c b/mm/memory.c
index 7a089145cad4..7c0e59085456 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -5413,6 +5413,13 @@ int generic_access_phys(struct vm_area_struct *vma, unsigned long addr,
return ret;
 }
 EXPORT_SYMBOL_GPL(generic_access_phys);
+#else
+int generic_access_phys(struct vm_area_struct *vma, unsigned long addr,
+   void *buf, int len, int write)
+{
+   return 0;
+}
+EXPORT_SYMBOL_GPL(generic_access_phys);
 #endif
 
 /*
@@ -5437,9 +5444,8 @@ int __access_remote_vm(struct mm_struct *mm, unsigned long addr, void *buf,
ret = get_user_pages_remote(mm, addr, 1,
gup_flags, &page, &vma, NULL);
if (ret <= 0) {
-#ifndef CONFIG_HAVE_IOREMAP_PROT
-   break;
-#else
+   if (!IS_ENABLED(CONFIG_HAVE_IOREMAP_PROT))
+   break;
/*
 * Check if this is a VM_IO | VM_PFNMAP VMA, which
 * we can access using slightly different code.
@@ -5453,7 +5459,6 @@ int __access_remote_vm(struct mm_struct *mm, unsigned long addr, void *buf,
if (ret <= 0)
break;
bytes = ret;
-#endif
} else {
bytes = len;
offset = addr & (PAGE_SIZE-1);
-- 
2.25.1



[PATCHv2 0/2] uio:powerpc:mpc85xx: l2-cache-sram uio driver

2022-06-15 Thread Wang Wenhu
This series adds a UIO driver for the Freescale mpc85xx that configures
its l2-cache-sram as a block of SRAM and gives user-level applications
access to that SRAM.

1/2: Coding-style cleanup of the CONFIG_HAVE_IOREMAP_PROT macro usage;
2/2: Implementation of the uio driver.

This is the second version, which addresses some comments:
1. Use __be32 instead of u32 for the big-endian data declarations;
2. Remove the "static inline" definition of generic_access_phys from the
.h file and provide a no-op definition in mm/memory.c;
3. Use generic ioremap_cache instead of ioremap_coherent.
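
To illustrate why item 1 matters: a field declared as big-endian data
cannot be read as a native integer on a little-endian host without a
conversion, and in the kernel the __be32 annotation lets sparse flag code
that forgets it. The userspace sketch below shows the conversion itself;
demo_be32_load and demo_be32_store are hypothetical helper names, not
kernel accessors.

```c
#include <stdint.h>

/* Decode a 32-bit big-endian value from raw bytes on any host. */
static uint32_t demo_be32_load(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Encode a 32-bit value into big-endian byte order on any host. */
static void demo_be32_store(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)(v >> 24);
	p[1] = (uint8_t)(v >> 16);
	p[2] = (uint8_t)(v >> 8);
	p[3] = (uint8_t)v;
}
```

Because the helpers work byte by byte, the result does not depend on the
endianness of the machine running them.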

For v1, see:
1/2: https://lore.kernel.org/all/20220610144348.GA595923@bhelgaas/T/
2/2: https://lore.kernel.org/lkml/yqhy1uxwclljm...@kroah.com/

Wang Wenhu (2):
  mm: eliminate ifdef of HAVE_IOREMAP_PROT in .c files
  uio:powerpc:mpc85xx: l2-cache-sram uio driver implementation

 drivers/char/mem.c|   2 -
 drivers/fpga/dfl-afu-main.c   |   2 -
 drivers/pci/mmap.c|   2 -
 drivers/uio/Kconfig   |  14 ++
 drivers/uio/Makefile  |   1 +
 drivers/uio/uio.c |   2 -
 drivers/uio/uio_fsl_85xx_cache_sram.c | 288 ++
 mm/memory.c   |  13 +-
 8 files changed, 312 insertions(+), 12 deletions(-)
 create mode 100644 drivers/uio/uio_fsl_85xx_cache_sram.c

-- 
2.25.1