date:20220201

On Wed, Feb 02, 2022 at 06:54:43AM +, Matthew Garrett wrote:
> On Wed, Feb 02, 2022 at 07:10:02AM +0100, Greg KH wrote:
> > On Wed, Feb 02, 2022 at 04:01:57AM +, Matthew Garrett wrote:
> > > We're talking about things that have massively different semantics.
> > 
> > I see lots of different platforms trying to provide access to their
> > "secure" firmware data to userspace in different ways.  That feels to me
> > like they are the same thing that userspace would care about in a
> > unified way.
> 
> EFI variables are largely for the OS to provide information to the 
> firmware, while this patchset is to provide information from the 
> firmware to the OS. I don't see why we'd expect to use the same userland 
> tooling for both.

I totally agree, I'm not worried about EFI variables here, I don't know
why that came up.

> In the broader case - I don't think we *can* use the same userland
> tooling for everything. For example, the patches to add support for 
> manipulating the Power secure boot keys originally attempted to make it 
> look like efivars, but the underlying firmware semantics are 
> sufficiently different that even exposing the same kernel interface 
> wouldn't be a sufficient abstraction and userland would still need to 
> behave differently. Exposing an interface that looks consistent but 
> isn't is arguably worse for userland than exposing explicitly distinct 
> interfaces.

So what does userspace really need here?  Just the ability to find if
the platform has blobs that it cares about, and how to read/write them.

I see different platform patches trying to stick these blobs in
different locations and ways to access (securityfs, sysfs, char device
node), which seems crazy to me.  Why can't we at least pick one way to
access these to start with, and then have the filesystem layout be
platform-specific as needed, which will give the correct hints to
userspace as to what it needs to do here?

thanks,

greg k-h

Re: [PATCH] mm: Merge pte_mkhuge() call into arch_make_huge_pte()



Le 02/02/2022 à 07:18, Mike Rapoport a écrit :
> On Wed, Feb 02, 2022 at 11:08:06AM +0530, Anshuman Khandual wrote:
>> Each call into pte_mkhuge() is invariably followed by arch_make_huge_pte().
>> Instead arch_make_huge_pte() can accommodate pte_mkhuge() at the beginning.
>> This updates generic fallback stub for arch_make_huge_pte() and available
>> platforms definitions. This makes huge pte creation much cleaner and easier
>> to follow.
> 
> Won't it break architectures that don't define arch_make_huge_pte()?

It shouldn't, see below

>> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
>> index d1897a69c540..52c462390aee 100644
>> --- a/include/linux/hugetlb.h
>> +++ b/include/linux/hugetlb.h
>> @@ -754,7 +754,7 @@ static inline void arch_clear_hugepage_flags(struct page 
>> *page) { }
>>   static inline pte_t arch_make_huge_pte(pte_t entry, unsigned int shift,
>> vm_flags_t flags)
>>   {
>> -return entry;
>> +return pte_mkhuge(entry);
>>   }
>>   #endif
>>

Re: [PATCH] mm: Merge pte_mkhuge() call into arch_make_huge_pte()



Le 02/02/2022 à 06:38, Anshuman Khandual a écrit :
> Each call into pte_mkhuge() is invariably followed by arch_make_huge_pte().
> Instead arch_make_huge_pte() can accommodate pte_mkhuge() at the beginning.
> This updates generic fallback stub for arch_make_huge_pte() and available
> platforms definitions. This makes huge pte creation much cleaner and easier
> to follow.

I think it is a good cleanup. I always wonder why commit d9ed9faac283 
("mm: add new arch_make_huge_pte() method for tile support") didn't move 
the pte_mkhuge() into arch_make_huge_pte().

When I implemented arch_make_huge_pte() for powerpc 8xx, in one case 
arch_make_huge_pte() have to undo the things done by pte_mkhuge(), see below

As a second step we could probably try to get rid of pte_mkhuge() 
completely, at least in the core.

> 
> Cc: Catalin Marinas 
> Cc: Will Deacon 
> Cc: Michael Ellerman 
> Cc: Paul Mackerras 
> Cc: "David S. Miller" 
> Cc: Mike Kravetz 
> Cc: Andrew Morton 
> Cc: linux-arm-ker...@lists.infradead.org
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: sparcli...@vger.kernel.org
> Cc: linux...@kvack.org
> Cc: linux-ker...@vger.kernel.org
> Signed-off-by: Anshuman Khandual 

Reviewed-by: Christophe Leroy 

> ---
>   arch/arm64/mm/hugetlbpage.c  | 1 +
>   arch/powerpc/include/asm/nohash/32/hugetlb-8xx.h | 1 +
>   arch/sparc/mm/hugetlbpage.c  | 1 +
>   include/linux/hugetlb.h  | 2 +-
>   mm/hugetlb.c | 3 +--
>   mm/vmalloc.c | 1 -
>   6 files changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
> index ffb9c229610a..228226c5fa80 100644
> --- a/arch/arm64/mm/hugetlbpage.c
> +++ b/arch/arm64/mm/hugetlbpage.c
> @@ -347,6 +347,7 @@ pte_t arch_make_huge_pte(pte_t entry, unsigned int shift, 
> vm_flags_t flags)
>   {
>   size_t pagesize = 1UL << shift;
>   
> + entry = pte_mkhuge(entry);
>   if (pagesize == CONT_PTE_SIZE) {
>   entry = pte_mkcont(entry);
>   } else if (pagesize == CONT_PMD_SIZE) {
> diff --git a/arch/powerpc/include/asm/nohash/32/hugetlb-8xx.h 
> b/arch/powerpc/include/asm/nohash/32/hugetlb-8xx.h
> index 64b6c608eca4..e41e095158c7 100644
> --- a/arch/powerpc/include/asm/nohash/32/hugetlb-8xx.h
> +++ b/arch/powerpc/include/asm/nohash/32/hugetlb-8xx.h
> @@ -70,6 +70,7 @@ static inline pte_t arch_make_huge_pte(pte_t entry, 
> unsigned int shift, vm_flags
>   {
>   size_t size = 1UL << shift;
>   
> + entry = pte_mkhuge(entry);

Could drop that and replace the below by:

if (size == SZ_16K)
return __pte(pte_val(entry) | _PAGE_SPS);
else
return __pte(pte_val(entry) | _PAGE_SPS | _PAGE_HUGE);


>   if (size == SZ_16K)
>   return __pte(pte_val(entry) & ~_PAGE_HUGE);
>   else
> diff --git a/arch/sparc/mm/hugetlbpage.c b/arch/sparc/mm/hugetlbpage.c
> index 0f49fada2093..d8e0e3c7038d 100644
> --- a/arch/sparc/mm/hugetlbpage.c
> +++ b/arch/sparc/mm/hugetlbpage.c
> @@ -181,6 +181,7 @@ pte_t arch_make_huge_pte(pte_t entry, unsigned int shift, 
> vm_flags_t flags)
>   {
>   pte_t pte;
>   
> + entry = pte_mkhuge(entry);
>   pte = hugepage_shift_to_tte(entry, shift);
>   
>   #ifdef CONFIG_SPARC64
> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
> index d1897a69c540..52c462390aee 100644
> --- a/include/linux/hugetlb.h
> +++ b/include/linux/hugetlb.h
> @@ -754,7 +754,7 @@ static inline void arch_clear_hugepage_flags(struct page 
> *page) { }
>   static inline pte_t arch_make_huge_pte(pte_t entry, unsigned int shift,
>  vm_flags_t flags)
>   {
> - return entry;
> + return pte_mkhuge(entry);
>   }
>   #endif
>   
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 61895cc01d09..5ca253c1b4e4 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -4637,7 +4637,6 @@ static pte_t make_huge_pte(struct vm_area_struct *vma, 
> struct page *page,
>  vma->vm_page_prot));
>   }
>   entry = pte_mkyoung(entry);
> - entry = pte_mkhuge(entry);
>   entry = arch_make_huge_pte(entry, shift, vma->vm_flags);
>   
>   return entry;
> @@ -6172,7 +6171,7 @@ unsigned long hugetlb_change_protection(struct 
> vm_area_struct *vma,
>   unsigned int shift = huge_page_shift(hstate_vma(vma));
>   
>   old_pte = huge_ptep_modify_prot_start(vma, address, 
> ptep);
> - pte = pte_mkhuge(huge_pte_modify(old_pte, newprot));
> + pte = huge_pte_modify(old_pte, newprot);
>   pte = arch_make_huge_pte(pte, shift, vma->vm_flags);
>   huge_ptep_modify_prot_commit(vma, address, ptep, 
> old_pte, pte);
>   pages++;
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 4165304d3547..d0b14dd73adc 100644

Re: [PATCH] mm: Merge pte_mkhuge() call into arch_make_huge_pte()

2022-02-01 Thread Mike Rapoport

On Wed, Feb 02, 2022 at 11:08:06AM +0530, Anshuman Khandual wrote:
> Each call into pte_mkhuge() is invariably followed by arch_make_huge_pte().
> Instead arch_make_huge_pte() can accommodate pte_mkhuge() at the beginning.
> This updates generic fallback stub for arch_make_huge_pte() and available
> platforms definitions. This makes huge pte creation much cleaner and easier
> to follow.

Won't it break architectures that don't define arch_make_huge_pte()?
 
> Cc: Catalin Marinas 
> Cc: Will Deacon 
> Cc: Michael Ellerman 
> Cc: Paul Mackerras 
> Cc: "David S. Miller" 
> Cc: Mike Kravetz 
> Cc: Andrew Morton 
> Cc: linux-arm-ker...@lists.infradead.org
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: sparcli...@vger.kernel.org
> Cc: linux...@kvack.org
> Cc: linux-ker...@vger.kernel.org
> Signed-off-by: Anshuman Khandual 
> ---
>  arch/arm64/mm/hugetlbpage.c  | 1 +
>  arch/powerpc/include/asm/nohash/32/hugetlb-8xx.h | 1 +
>  arch/sparc/mm/hugetlbpage.c  | 1 +
>  include/linux/hugetlb.h  | 2 +-
>  mm/hugetlb.c | 3 +--
>  mm/vmalloc.c | 1 -
>  6 files changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
> index ffb9c229610a..228226c5fa80 100644
> --- a/arch/arm64/mm/hugetlbpage.c
> +++ b/arch/arm64/mm/hugetlbpage.c
> @@ -347,6 +347,7 @@ pte_t arch_make_huge_pte(pte_t entry, unsigned int shift, 
> vm_flags_t flags)
>  {
>   size_t pagesize = 1UL << shift;
>  
> + entry = pte_mkhuge(entry);
>   if (pagesize == CONT_PTE_SIZE) {
>   entry = pte_mkcont(entry);
>   } else if (pagesize == CONT_PMD_SIZE) {
> diff --git a/arch/powerpc/include/asm/nohash/32/hugetlb-8xx.h 
> b/arch/powerpc/include/asm/nohash/32/hugetlb-8xx.h
> index 64b6c608eca4..e41e095158c7 100644
> --- a/arch/powerpc/include/asm/nohash/32/hugetlb-8xx.h
> +++ b/arch/powerpc/include/asm/nohash/32/hugetlb-8xx.h
> @@ -70,6 +70,7 @@ static inline pte_t arch_make_huge_pte(pte_t entry, 
> unsigned int shift, vm_flags
>  {
>   size_t size = 1UL << shift;
>  
> + entry = pte_mkhuge(entry);
>   if (size == SZ_16K)
>   return __pte(pte_val(entry) & ~_PAGE_HUGE);
>   else
> diff --git a/arch/sparc/mm/hugetlbpage.c b/arch/sparc/mm/hugetlbpage.c
> index 0f49fada2093..d8e0e3c7038d 100644
> --- a/arch/sparc/mm/hugetlbpage.c
> +++ b/arch/sparc/mm/hugetlbpage.c
> @@ -181,6 +181,7 @@ pte_t arch_make_huge_pte(pte_t entry, unsigned int shift, 
> vm_flags_t flags)
>  {
>   pte_t pte;
>  
> + entry = pte_mkhuge(entry);
>   pte = hugepage_shift_to_tte(entry, shift);
>  
>  #ifdef CONFIG_SPARC64
> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
> index d1897a69c540..52c462390aee 100644
> --- a/include/linux/hugetlb.h
> +++ b/include/linux/hugetlb.h
> @@ -754,7 +754,7 @@ static inline void arch_clear_hugepage_flags(struct page 
> *page) { }
>  static inline pte_t arch_make_huge_pte(pte_t entry, unsigned int shift,
>  vm_flags_t flags)
>  {
> - return entry;
> + return pte_mkhuge(entry);
>  }
>  #endif
>  
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 61895cc01d09..5ca253c1b4e4 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -4637,7 +4637,6 @@ static pte_t make_huge_pte(struct vm_area_struct *vma, 
> struct page *page,
>  vma->vm_page_prot));
>   }
>   entry = pte_mkyoung(entry);
> - entry = pte_mkhuge(entry);
>   entry = arch_make_huge_pte(entry, shift, vma->vm_flags);
>  
>   return entry;
> @@ -6172,7 +6171,7 @@ unsigned long hugetlb_change_protection(struct 
> vm_area_struct *vma,
>   unsigned int shift = huge_page_shift(hstate_vma(vma));
>  
>   old_pte = huge_ptep_modify_prot_start(vma, address, 
> ptep);
> - pte = pte_mkhuge(huge_pte_modify(old_pte, newprot));
> + pte = huge_pte_modify(old_pte, newprot);
>   pte = arch_make_huge_pte(pte, shift, vma->vm_flags);
>   huge_ptep_modify_prot_commit(vma, address, ptep, 
> old_pte, pte);
>   pages++;
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 4165304d3547..d0b14dd73adc 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -118,7 +118,6 @@ static int vmap_pte_range(pmd_t *pmd, unsigned long addr, 
> unsigned long end,
>   if (size != PAGE_SIZE) {
>   pte_t entry = pfn_pte(pfn, prot);
>  
> - entry = pte_mkhuge(entry);
>   entry = arch_make_huge_pte(entry, ilog2(size), 0);
>   set_huge_pte_at(&init_mm, addr, pte, entry);
>   pfn += PFN_DOWN(size);
> -- 
> 2.25.1
> 
> 

-- 
Sincerely yours,
Mike.

Re: [PATCH v7 0/5] Allow guest access to EFI confidential computing secret area

On Wed, Feb 02, 2022 at 04:01:57AM +, Matthew Garrett wrote:
> On Tue, Feb 01, 2022 at 09:24:50AM -0500, James Bottomley wrote:
> > On Tue, 2022-02-01 at 14:50 +0100, Greg KH wrote:
> > > You all need to work together to come up with a unified place for
> > > this and stop making it platform-specific.
> 
> We're talking about things that have massively different semantics.

I see lots of different platforms trying to provide access to their
"secure" firmware data to userspace in different ways.  That feels to me
like they are the same thing that userspace would care about in a
unified way.

Unless we expeect userspace tools to have to be platform-specific for
all of this?  That does not seem wise.

what am I missing here?

thanks,

greg k-h

[PATCH] mm: Merge pte_mkhuge() call into arch_make_huge_pte()

2022-02-01 Thread Anshuman Khandual

Each call into pte_mkhuge() is invariably followed by arch_make_huge_pte().
Instead arch_make_huge_pte() can accommodate pte_mkhuge() at the beginning.
This updates generic fallback stub for arch_make_huge_pte() and available
platforms definitions. This makes huge pte creation much cleaner and easier
to follow.

Cc: Catalin Marinas 
Cc: Will Deacon 
Cc: Michael Ellerman 
Cc: Paul Mackerras 
Cc: "David S. Miller" 
Cc: Mike Kravetz 
Cc: Andrew Morton 
Cc: linux-arm-ker...@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: sparcli...@vger.kernel.org
Cc: linux...@kvack.org
Cc: linux-ker...@vger.kernel.org
Signed-off-by: Anshuman Khandual 
---
 arch/arm64/mm/hugetlbpage.c  | 1 +
 arch/powerpc/include/asm/nohash/32/hugetlb-8xx.h | 1 +
 arch/sparc/mm/hugetlbpage.c  | 1 +
 include/linux/hugetlb.h  | 2 +-
 mm/hugetlb.c | 3 +--
 mm/vmalloc.c | 1 -
 6 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index ffb9c229610a..228226c5fa80 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -347,6 +347,7 @@ pte_t arch_make_huge_pte(pte_t entry, unsigned int shift, 
vm_flags_t flags)
 {
size_t pagesize = 1UL << shift;
 
+   entry = pte_mkhuge(entry);
if (pagesize == CONT_PTE_SIZE) {
entry = pte_mkcont(entry);
} else if (pagesize == CONT_PMD_SIZE) {
diff --git a/arch/powerpc/include/asm/nohash/32/hugetlb-8xx.h 
b/arch/powerpc/include/asm/nohash/32/hugetlb-8xx.h
index 64b6c608eca4..e41e095158c7 100644
--- a/arch/powerpc/include/asm/nohash/32/hugetlb-8xx.h
+++ b/arch/powerpc/include/asm/nohash/32/hugetlb-8xx.h
@@ -70,6 +70,7 @@ static inline pte_t arch_make_huge_pte(pte_t entry, unsigned 
int shift, vm_flags
 {
size_t size = 1UL << shift;
 
+   entry = pte_mkhuge(entry);
if (size == SZ_16K)
return __pte(pte_val(entry) & ~_PAGE_HUGE);
else
diff --git a/arch/sparc/mm/hugetlbpage.c b/arch/sparc/mm/hugetlbpage.c
index 0f49fada2093..d8e0e3c7038d 100644
--- a/arch/sparc/mm/hugetlbpage.c
+++ b/arch/sparc/mm/hugetlbpage.c
@@ -181,6 +181,7 @@ pte_t arch_make_huge_pte(pte_t entry, unsigned int shift, 
vm_flags_t flags)
 {
pte_t pte;
 
+   entry = pte_mkhuge(entry);
pte = hugepage_shift_to_tte(entry, shift);
 
 #ifdef CONFIG_SPARC64
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index d1897a69c540..52c462390aee 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -754,7 +754,7 @@ static inline void arch_clear_hugepage_flags(struct page 
*page) { }
 static inline pte_t arch_make_huge_pte(pte_t entry, unsigned int shift,
   vm_flags_t flags)
 {
-   return entry;
+   return pte_mkhuge(entry);
 }
 #endif
 
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 61895cc01d09..5ca253c1b4e4 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4637,7 +4637,6 @@ static pte_t make_huge_pte(struct vm_area_struct *vma, 
struct page *page,
   vma->vm_page_prot));
}
entry = pte_mkyoung(entry);
-   entry = pte_mkhuge(entry);
entry = arch_make_huge_pte(entry, shift, vma->vm_flags);
 
return entry;
@@ -6172,7 +6171,7 @@ unsigned long hugetlb_change_protection(struct 
vm_area_struct *vma,
unsigned int shift = huge_page_shift(hstate_vma(vma));
 
old_pte = huge_ptep_modify_prot_start(vma, address, 
ptep);
-   pte = pte_mkhuge(huge_pte_modify(old_pte, newprot));
+   pte = huge_pte_modify(old_pte, newprot);
pte = arch_make_huge_pte(pte, shift, vma->vm_flags);
huge_ptep_modify_prot_commit(vma, address, ptep, 
old_pte, pte);
pages++;
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 4165304d3547..d0b14dd73adc 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -118,7 +118,6 @@ static int vmap_pte_range(pmd_t *pmd, unsigned long addr, 
unsigned long end,
if (size != PAGE_SIZE) {
pte_t entry = pfn_pte(pfn, prot);
 
-   entry = pte_mkhuge(entry);
entry = arch_make_huge_pte(entry, ilog2(size), 0);
set_huge_pte_at(&init_mm, addr, pte, entry);
pfn += PFN_DOWN(size);
-- 
2.25.1

[PATCH v2] powerpc/ptdump: Fix sparse warning in hashpagetable.c

As reported by sparse:

  arch/powerpc/mm/ptdump/hashpagetable.c:264:29: warning: restricted __be64 
degrades to integer
  arch/powerpc/mm/ptdump/hashpagetable.c:265:49: warning: restricted __be64 
degrades to integer
  arch/powerpc/mm/ptdump/hashpagetable.c:267:36: warning: incorrect type in 
assignment (different base types)
  arch/powerpc/mm/ptdump/hashpagetable.c:267:36:expected unsigned long long 
[usertype]
  arch/powerpc/mm/ptdump/hashpagetable.c:267:36:got restricted __be64 
[usertype] v
  arch/powerpc/mm/ptdump/hashpagetable.c:268:36: warning: incorrect type in 
assignment (different base types)
  arch/powerpc/mm/ptdump/hashpagetable.c:268:36:expected unsigned long long 
[usertype]
  arch/powerpc/mm/ptdump/hashpagetable.c:268:36:got restricted __be64 
[usertype] r

The values returned by plpar_pte_read_4() are CPU endian, not __be64, so
assigning them to struct hash_pte confuses sparse. As a minimal fix open
code a struct to hold the values with CPU endian types.

Reported-by: kernel test robot 
Reported-by: Christophe Leroy 
Signed-off-by: Michael Ellerman 
---
 arch/powerpc/mm/ptdump/hashpagetable.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

v2: Don't use struct hash_pte at all.

Replaces v1 
http://patchwork.ozlabs.org/project/linuxppc-dev/patch/bbc196451dd34521d239023ccca488db35b8fff1.1643567900.git.christophe.le...@csgroup.eu/

diff --git a/arch/powerpc/mm/ptdump/hashpagetable.c 
b/arch/powerpc/mm/ptdump/hashpagetable.c
index c7f824d294b2..9a601587836b 100644
--- a/arch/powerpc/mm/ptdump/hashpagetable.c
+++ b/arch/powerpc/mm/ptdump/hashpagetable.c
@@ -238,7 +238,10 @@ static int native_find(unsigned long ea, int psize, bool 
primary, u64 *v, u64
 
 static int pseries_find(unsigned long ea, int psize, bool primary, u64 *v, u64 
*r)
 {
-   struct hash_pte ptes[4];
+   struct {
+   unsigned long v;
+   unsigned long r;
+   } ptes[4];
unsigned long vsid, vpn, hash, hpte_group, want_v;
int i, j, ssize = mmu_kernel_ssize;
long lpar_rc = 0;
-- 
2.34.1

[PATCH V2] powerpc/perf: Fix task context setting for trace imc

2022-02-01 Thread Athira Rajeev

Trace IMC (In-Memory collection counters) in powerpc is
useful for application level profiling. For trace_imc,
presently task context (task_ctx_nr) is set to
perf_hw_context. But perf_hw_context is to be used for
cpu PMU. So for trace_imc, even though it is per thread
PMU, it is preferred to use sw_context inorder to be able
to do application level monitoring. Hence change the
task_ctx_nr to use perf_sw_context.

Fixes: 012ae244845f ("powerpc/perf: Trace imc PMU functions")
Signed-off-by: Athira Rajeev 
Reviewed-by: Madhavan Srinivasan 
---
Changelog:
v1 -> v2:
 Added comment in code on why perf_sw_context is used.
Notes:
 trace_imc_event_init currently uses context as
 perf_hw_context. But ideally there can only be a single
 PMU for perf_hw_context events which is core PMU.
 Reference:
 commit 26657848502b ("perf/core: Verify we have a single perf_hw_context PMU")
 Reason for using "perf_sw_context" instead of invalid_context
 is that, task level monitoring is restricted with
 invalid_context.

 arch/powerpc/perf/imc-pmu.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/perf/imc-pmu.c b/arch/powerpc/perf/imc-pmu.c
index e106909ff9c3..8fe57601e61d 100644
--- a/arch/powerpc/perf/imc-pmu.c
+++ b/arch/powerpc/perf/imc-pmu.c
@@ -1457,7 +1457,13 @@ static int trace_imc_event_init(struct perf_event *event)
 
event->hw.idx = -1;
 
-   event->pmu->task_ctx_nr = perf_hw_context;
+   /*
+* There can only be a single PMU for
+* perf_hw_context events which is assigned
+* to core PMU. Hence use "perf_sw_context" for
+* trace_imc.
+*/
+   event->pmu->task_ctx_nr = perf_sw_context;
event->destroy = reset_global_refc;
return 0;
 }
-- 
2.33.0

Re: [PATCH V5 15/21] riscv: compat: Add hw capability check for elf

On Tue, Feb 1, 2022 at 11:07 PM  wrote:
>
> From: Guo Ren 
>
> Detect hardware COMPAT (32bit U-mode) capability in rv64. If not
> support COMPAT mode in hw, compat_elf_check_arch would return
> false by compat_binfmt_elf.c
>
> Signed-off-by: Guo Ren 
> Signed-off-by: Guo Ren 
> Cc: Arnd Bergmann 
> Cc: Christoph Hellwig 
> ---
>  arch/riscv/include/asm/elf.h |  3 ++-
>  arch/riscv/kernel/process.c  | 32 
>  2 files changed, 34 insertions(+), 1 deletion(-)
>
> diff --git a/arch/riscv/include/asm/elf.h b/arch/riscv/include/asm/elf.h
> index aee40040917b..3a4293dc7229 100644
> --- a/arch/riscv/include/asm/elf.h
> +++ b/arch/riscv/include/asm/elf.h
> @@ -40,7 +40,8 @@
>   * elf64_hdr e_machine's offset are different. The checker is
>   * a little bit simple compare to other architectures.
>   */
> -#define compat_elf_check_arch(x) ((x)->e_machine == EM_RISCV)
> +extern bool compat_elf_check_arch(Elf32_Ehdr *hdr);
> +#define compat_elf_check_arch  compat_elf_check_arch
>
>  #define CORE_DUMP_USE_REGSET
>  #define ELF_EXEC_PAGESIZE  (PAGE_SIZE)
> diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c
> index 1a666ad299b4..758847cba391 100644
> --- a/arch/riscv/kernel/process.c
> +++ b/arch/riscv/kernel/process.c
> @@ -83,6 +83,38 @@ void show_regs(struct pt_regs *regs)
> dump_backtrace(regs, NULL, KERN_DEFAULT);
>  }
>
> +#ifdef CONFIG_COMPAT
> +static bool compat_mode_support __read_mostly;
> +
> +bool compat_elf_check_arch(Elf32_Ehdr *hdr)
> +{
> +   if (compat_mode_support && (hdr->e_machine == EM_RISCV))
> +   return true;
> +   else
> +   return false;
> +}
> +
> +static int compat_mode_detect(void)
Forgot __init, here

> +{
> +   unsigned long tmp = csr_read(CSR_STATUS);
> +
> +   csr_write(CSR_STATUS, (tmp & ~SR_UXL) | SR_UXL_32);
> +
> +   if ((csr_read(CSR_STATUS) & SR_UXL) != SR_UXL_32) {
> +   pr_info("riscv: 32bit compat mode detect failed\n");
> +   compat_mode_support = false;
> +   } else {
> +   compat_mode_support = true;
> +   pr_info("riscv: 32bit compat mode detected\n");
> +   }
> +
> +   csr_write(CSR_STATUS, tmp);
> +
> +   return 0;
> +}
> +arch_initcall(compat_mode_detect);
> +#endif
> +
>  void start_thread(struct pt_regs *regs, unsigned long pc,
> unsigned long sp)
>  {
> --
> 2.25.1
>


-- 
Best Regards
 Guo Ren

ML: https://lore.kernel.org/linux-csky/

[PATCH] spi: mpc512x-psc: Fix compile errors

2022-02-01 Thread Linus Walleij

My patch created compilation bugs in the MPC512x-PSC driver.
Fix them up.

Cc: Uwe Kleine-König 
Cc: Anatolij Gustschin 
Cc: linuxppc-dev@lists.ozlabs.org
Reported-by: kernel test robot 
Fixes: 2818824ced4b (" spi: mpc512x-psc: Convert to use GPIO descriptors")
Signed-off-by: Linus Walleij 
---
(This is because I don't have a PPC cross compiler.)
---
 drivers/spi/spi-mpc512x-psc.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/spi/spi-mpc512x-psc.c b/drivers/spi/spi-mpc512x-psc.c
index 8a488d8e4c1b..03630359ce70 100644
--- a/drivers/spi/spi-mpc512x-psc.c
+++ b/drivers/spi/spi-mpc512x-psc.c
@@ -127,7 +127,7 @@ static void mpc512x_psc_spi_activate_cs(struct spi_device 
*spi)
out_be32(psc_addr(mps, ccr), ccr);
mps->bits_per_word = cs->bits_per_word;
 
-   if (cs->gpiod) {
+   if (spi->cs_gpiod) {
if (mps->cs_control)
/* boardfile override */
mps->cs_control(spi, (spi->mode & SPI_CS_HIGH) ? 1 : 0);
@@ -373,7 +373,6 @@ static int mpc512x_psc_spi_unprep_xfer_hw(struct spi_master 
*master)
 static int mpc512x_psc_spi_setup(struct spi_device *spi)
 {
struct mpc512x_psc_spi_cs *cs = spi->controller_state;
-   int ret;
 
if (spi->bits_per_word % 8)
return -EINVAL;
-- 
2.34.1

Re: [PATCH v7 0/5] Allow guest access to EFI confidential computing secret area

2022-02-01 Thread Dr. David Alan Gilbert

* James Bottomley (j...@linux.ibm.com) wrote:
> [cc's added]
> On Tue, 2022-02-01 at 14:50 +0100, Greg KH wrote:
> > On Tue, Feb 01, 2022 at 12:44:08PM +, Dov Murik wrote:
> [...]
> > > # ls -la /sys/kernel/security/coco/efi_secret
> > > total 0
> > > drwxr-xr-x 2 root root 0 Jun 28 11:55 .
> > > drwxr-xr-x 3 root root 0 Jun 28 11:54 ..
> > > -r--r- 1 root root 0 Jun 28 11:54 736870e5-84f0-4973-92ec-
> > > 06879ce3da0b
> > > -r--r- 1 root root 0 Jun 28 11:54 83c83f7f-1356-4975-8b7e-
> > > d3a0b54312c6
> > > -r--r- 1 root root 0 Jun 28 11:54 9553f55d-3da2-43ee-ab5d-
> > > ff17f78864d2
> > 
> > Please see my comments on the powerpc version of this type of thing:
> > 
> > https://lore.kernel.org/r/20220122005637.28199-1-na...@linux.ibm.com
> 
> If you want a debate, actually cc'ing the people on the other thread
> would have been a good start ...
> 
> For those added, this patch series is at:
> 
> https://lore.kernel.org/all/20220201124413.1093099-1-dovmu...@linux.ibm.com/
> 
> > You all need to work together to come up with a unified place for
> > this and stop making it platform-specific.
> 
> I'm not entirely sure of that.  If you look at the differences between
> EFI variables and the COCO proposal: the former has an update API
> which, in the case of signed variables, is rather complex and a UC16
> content requirement.  The latter is binary data with read only/delete. 
> Plus each variable in EFI is described by a GUID, so having a directory
> of random guids, some of which behave like COCO secrets and some of
> which are EFI variables is going to be incredibly confusing (and also
> break all our current listing tools which seems somewhat undesirable).
> 
> So we could end up with 
> 
> /efivar
> /coco
> 
> To achieve the separation, but I really don't see what this buys us. 
> Both filesystems would likely end up with different backends because of
> the semantic differences and we can easily start now in different
> places (effectively we've already done this for efi variables) and
> unify later if that is the chosen direction, so it doesn't look like a
> blocker.
> 
> > Until then, we can't take this.
> 
> I don't believe anyone was asking you to take it.

I have some sympathy in wanting some unification; (I'm not sure that
list of comparison even includes the TDX world).
But I'm not sure if they're the same thing - these are strictly
constants, they're not changable.

But it is a messy list of differences - especially things like the
UTF-16 stuff
I guess the PowerVM key naming contains nul and / can be ignored
- if anyone is silly enough to create keys with those names then they
can not access them; so at least that would solve that problem.

I don't really understand the talk of 32bit attributes in either the
uEFI or PowerVM key store case.

Is that GOOGLE_SMI stuff already there? If so I guess there's not much
we can do  - but it's a shame that there's the directory per variable.

Dave



> James
> 
> 
-- 
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [PATCH V5 21/21] KVM: compat: riscv: Prevent KVM_COMPAT from being selected

On Wed, Feb 2, 2022 at 12:11 AM Anup Patel  wrote:
>
> On Tue, Feb 1, 2022 at 9:31 PM Paolo Bonzini  wrote:
> >
> > On 2/1/22 16:44, Anup Patel wrote:
> > > +Paolo
> > >
> > > On Tue, Feb 1, 2022 at 8:38 PM  wrote:
> > >>
> > >> From: Guo Ren 
> > >>
> > >> Current riscv doesn't support the 32bit KVM API. Let's make it
> > >> clear by not selecting KVM_COMPAT.
> > >>
> > >> Signed-off-by: Guo Ren 
> > >> Signed-off-by: Guo Ren 
> > >> Cc: Arnd Bergmann 
> > >> Cc: Anup Patel 
> > >
> > > This looks good to me.
> > >
> > > Reviewed-by: Anup Patel 
> >
> > Hi Anup,
> >
> > feel free to send this via a pull request (perhaps together with Mark
> > Rutland's entry/exit rework).
>
> Sure, I will do like you suggested.
Great, thx.

>
> Regards,
> Anup
>
> >
> > Paolo
> >



-- 
Best Regards
 Guo Ren

ML: https://lore.kernel.org/linux-csky/

Re: [PATCH 2/3] powerpc/pseries/vas: Add VAS migration handler

2022-02-01 Thread Nathan Lynch

Haren Myneni  writes:
> On Mon, 2022-01-31 at 10:37 -0600, Nathan Lynch wrote:
>> Haren Myneni  writes:
>> > Since the VAS windows belong to the VAS hardware resource, the
>> > hypervisor expects the partition to close them on source partition
>> > and reopen them after the partition migrated on the destination
>> > machine.
>> 
>> Not clear to me what "expects" really means here. Would it be
>> accurate to say "requires" instead? If the OS fails to close the
>> windows before suspend, what happens?
>
> I will change it to 'requires' - These VAS windows have to be closed
> before migration so that these windows / credits will be available to
> other LPARs on the source machine.

I'll rephrase.

I want to know whether the architecture (PAPR or whichever doc) imposes
a requirement on the OS (i.e. using words like "must") to close the
windows, or does it merely make a recommendation ("should"), or is it
silent on the subject?

In your testing, if Linux has windows open at the time of calling
ibm,suspend-me, does the call fail or succeed?

Re: [PATCH V5 21/21] KVM: compat: riscv: Prevent KVM_COMPAT from being selected

2022-02-01 Thread Anup Patel

On Tue, Feb 1, 2022 at 9:31 PM Paolo Bonzini  wrote:
>
> On 2/1/22 16:44, Anup Patel wrote:
> > +Paolo
> >
> > On Tue, Feb 1, 2022 at 8:38 PM  wrote:
> >>
> >> From: Guo Ren 
> >>
> >> Current riscv doesn't support the 32bit KVM API. Let's make it
> >> clear by not selecting KVM_COMPAT.
> >>
> >> Signed-off-by: Guo Ren 
> >> Signed-off-by: Guo Ren 
> >> Cc: Arnd Bergmann 
> >> Cc: Anup Patel 
> >
> > This looks good to me.
> >
> > Reviewed-by: Anup Patel 
>
> Hi Anup,
>
> feel free to send this via a pull request (perhaps together with Mark
> Rutland's entry/exit rework).

Sure, I will do like you suggested.

Regards,
Anup

>
> Paolo
>

Re: [PATCH V5 21/21] KVM: compat: riscv: Prevent KVM_COMPAT from being selected

2022-02-01 Thread Paolo Bonzini


On 2/1/22 16:44, Anup Patel wrote:

+Paolo

On Tue, Feb 1, 2022 at 8:38 PM  wrote:


From: Guo Ren 

Current riscv doesn't support the 32bit KVM API. Let's make it
clear by not selecting KVM_COMPAT.

Signed-off-by: Guo Ren 
Signed-off-by: Guo Ren 
Cc: Arnd Bergmann 
Cc: Anup Patel 


This looks good to me.

Reviewed-by: Anup Patel 


Hi Anup,

feel free to send this via a pull request (perhaps together with Mark 
Rutland's entry/exit rework).


Paolo

Re: [PATCH V5 21/21] KVM: compat: riscv: Prevent KVM_COMPAT from being selected

2022-02-01 Thread Anup Patel

+Paolo

On Tue, Feb 1, 2022 at 8:38 PM  wrote:
>
> From: Guo Ren 
>
> Current riscv doesn't support the 32bit KVM API. Let's make it
> clear by not selecting KVM_COMPAT.
>
> Signed-off-by: Guo Ren 
> Signed-off-by: Guo Ren 
> Cc: Arnd Bergmann 
> Cc: Anup Patel 

This looks good to me.

Reviewed-by: Anup Patel 

Regards,
Anup

> ---
>  virt/kvm/Kconfig | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
> index f4834c20e4a6..a8c5c9f06b3c 100644
> --- a/virt/kvm/Kconfig
> +++ b/virt/kvm/Kconfig
> @@ -53,7 +53,7 @@ config KVM_GENERIC_DIRTYLOG_READ_PROTECT
>
>  config KVM_COMPAT
> def_bool y
> -   depends on KVM && COMPAT && !(S390 || ARM64)
> +   depends on KVM && COMPAT && !(S390 || ARM64 || RISCV)
>
>  config HAVE_KVM_IRQ_BYPASS
> bool
> --
> 2.25.1
>

Re: [RFC PATCH 0/2] powerpc/pseries: add support for local secure storage called Platform Keystore(PKS)

2022-02-01 Thread Dave Hansen

On 1/21/22 16:56, Nayna Jain wrote:
> Nayna Jain (2):
>   pseries: define driver for Platform Keystore
>   pseries: define sysfs interface to expose PKS variables

Hi Folks,

There another feature that we might want to consider in the naming here:

> https://lore.kernel.org/all/20220127175505.851391-1-ira.we...@intel.com/

Protection Keys for Supervisor pages is also called PKS.  It's also not
entirely impossible that powerpc might want to start using this code at
some point, just like what happened with the userspace protection keys[1].

I don't think it's the end of the world either way, but it might save a
hapless user or kernel developer some confusion if we can avoid
including two "PKS" features in the kernel.  I just wanted to make sure
we were aware of the other's existence. :)

1.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/powerpc/include/asm/pkeys.h

[PATCH V5 21/21] KVM: compat: riscv: Prevent KVM_COMPAT from being selected

From: Guo Ren 

Current riscv doesn't support the 32bit KVM API. Let's make it
clear by not selecting KVM_COMPAT.

Signed-off-by: Guo Ren 
Signed-off-by: Guo Ren 
Cc: Arnd Bergmann 
Cc: Anup Patel 
---
 virt/kvm/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
index f4834c20e4a6..a8c5c9f06b3c 100644
--- a/virt/kvm/Kconfig
+++ b/virt/kvm/Kconfig
@@ -53,7 +53,7 @@ config KVM_GENERIC_DIRTYLOG_READ_PROTECT
 
 config KVM_COMPAT
def_bool y
-   depends on KVM && COMPAT && !(S390 || ARM64)
+   depends on KVM && COMPAT && !(S390 || ARM64 || RISCV)
 
 config HAVE_KVM_IRQ_BYPASS
bool
-- 
2.25.1

[PATCH V5 20/21] riscv: compat: Add COMPAT Kbuild skeletal support

From: Guo Ren 

Adds initial skeletal COMPAT Kbuild (Running 32bit U-mode on
64bit S-mode) support.
 - Setup kconfig & dummy functions for compiling.
 - Implement compat_start_thread by the way.

Signed-off-by: Guo Ren 
Signed-off-by: Guo Ren 
Reviewed-by: Arnd Bergmann 
Cc: Palmer Dabbelt 
---
 arch/riscv/Kconfig | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 5adcbd9b5e88..6f11df8c189f 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -73,6 +73,7 @@ config RISCV
select HAVE_ARCH_KGDB if !XIP_KERNEL
select HAVE_ARCH_KGDB_QXFER_PKT
select HAVE_ARCH_MMAP_RND_BITS if MMU
+   select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT
select HAVE_ARCH_SECCOMP_FILTER
select HAVE_ARCH_TRACEHOOK
select HAVE_ARCH_TRANSPARENT_HUGEPAGE if 64BIT && MMU
@@ -123,12 +124,18 @@ config ARCH_MMAP_RND_BITS_MIN
default 18 if 64BIT
default 8
 
+config ARCH_MMAP_RND_COMPAT_BITS_MIN
+   default 8
+
 # max bits determined by the following formula:
 #  VA_BITS - PAGE_SHIFT - 3
 config ARCH_MMAP_RND_BITS_MAX
default 24 if 64BIT # SV39 based
default 17
 
+config ARCH_MMAP_RND_COMPAT_BITS_MAX
+   default 17
+
 # set if we run in machine mode, cleared if we run in supervisor mode
 config RISCV_M_MODE
bool
@@ -406,6 +413,18 @@ config CRASH_DUMP
 
  For more details see Documentation/admin-guide/kdump/kdump.rst
 
+config COMPAT
+   bool "Kernel support for 32-bit U-mode"
+   default 64BIT
+   depends on 64BIT && MMU
+   help
+ This option enables support for a 32-bit U-mode running under a 64-bit
+ kernel at S-mode. riscv32-specific components such as system calls,
+ the user helper functions (vdso), signal rt_frame functions and the
+ ptrace interface are handled appropriately by the kernel.
+
+ If you want to execute 32-bit userspace applications, say Y.
+
 endmenu
 
 menu "Boot options"
-- 
2.25.1

[PATCH V5 19/21] riscv: compat: ptrace: Add compat_arch_ptrace implement

From: Guo Ren 

Now, you can use native gdb on riscv64 for rv32 app debugging.

$ uname -a
Linux buildroot 5.16.0-rc4-00036-gbef6b82fdf23-dirty #53 SMP Mon Dec 20 
23:06:53 CST 2021 riscv64 GNU/Linux
$ cat /proc/cpuinfo
processor   : 0
hart: 0
isa : rv64imafdcsuh
mmu : sv48

$ file /bin/busybox
/bin/busybox: setuid ELF 32-bit LSB shared object, UCB RISC-V, version 1 
(SYSV), dynamically linked, interpreter /lib/ld-linux-riscv32-ilp32d.so.1, for 
GNU/Linux 5.15.0, stripped
$ file /usr/bin/gdb
/usr/bin/gdb: ELF 32-bit LSB shared object, UCB RISC-V, version 1 (GNU/Linux), 
dynamically linked, interpreter /lib/ld-linux-riscv32-ilp32d.so.1, for 
GNU/Linux 5.15.0, stripped
$ /usr/bin/gdb /bin/busybox
GNU gdb (GDB) 10.2
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
...
Reading symbols from /bin/busybox...
(No debugging symbols found in /bin/busybox)
(gdb) b main
Breakpoint 1 at 0x8ddc
(gdb) r
Starting program: /bin/busybox
Failed to read a valid object file image from memory.

Breakpoint 1, 0x555a8ddc in main ()
(gdb) i r
ra 0x77df0b74   0x77df0b74
sp 0x7fdd3d10   0x7fdd3d10
gp 0x5567e800   0x5567e800 
tp 0x77f64280   0x77f64280
t0 0x0  0
t1 0x555a6fac   1431990188
t2 0x77dd8db4   2011008436
fp 0x7fdd3e34   0x7fdd3e34
s1 0x7fdd3e34   2145205812
a0 0x   -1
a1 0x2000   8192
a2 0x7fdd3e3c   2145205820
a3 0x0  0
a4 0x7fdd3d30   2145205552
a5 0x555a8dc0   1431997888
a6 0x77f2c170   2012397936
a7 0x6a7c7a2f   1786542639
s2 0x0  0
s3 0x0  0
s4 0x555a8dc0   1431997888
s5 0x77f8a3a8   2012783528
s6 0x7fdd3e3c   2145205820
s7 0x5567cecc   1432866508
--Type  for more, q to quit, c to continue without paging--
s8 0x1  1
s9 0x0  0
s100x55634448   1432568904
s110x0  0
t3 0x77df0bb8   2011106232
t4 0x42fc   17148
t5 0x0  0
t6 0x40 64
pc 0x555a8ddc   0x555a8ddc 
(gdb) si
0x555a78f0 in mallopt@plt ()
(gdb) c
Continuing.
BusyBox v1.34.1 (2021-12-19 22:39:48 CST) multi-call binary.
BusyBox is copyrighted by many authors between 1998-2015.
Licensed under GPLv2. See source distribution for detailed
copyright notices.

Usage: busybox [function [arguments]...]
   or: busybox --list[-full]
...
[Inferior 1 (process 107) exited normally]
(gdb) q

Signed-off-by: Guo Ren 
Signed-off-by: Guo Ren 
Cc: Arnd Bergmann 
Cc: Palmer Dabbelt 
---
 arch/riscv/kernel/ptrace.c | 87 +++---
 1 file changed, 82 insertions(+), 5 deletions(-)

diff --git a/arch/riscv/kernel/ptrace.c b/arch/riscv/kernel/ptrace.c
index a89243730153..bb387593a121 100644
--- a/arch/riscv/kernel/ptrace.c
+++ b/arch/riscv/kernel/ptrace.c
@@ -12,6 +12,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -111,11 +112,6 @@ static const struct user_regset_view 
riscv_user_native_view = {
.n = ARRAY_SIZE(riscv_user_regset),
 };
 
-const struct user_regset_view *task_user_regset_view(struct task_struct *task)
-{
-   return &riscv_user_native_view;
-}
-
 struct pt_regs_offset {
const char *name;
int offset;
@@ -273,3 +269,84 @@ __visible void do_syscall_trace_exit(struct pt_regs *regs)
trace_sys_exit(regs, regs_return_value(regs));
 #endif
 }
+
+#ifdef CONFIG_COMPAT
+static int compat_riscv_gpr_get(struct task_struct *target,
+   const struct user_regset *regset,
+   struct membuf to)
+{
+   struct compat_user_regs_struct cregs;
+
+   regs_to_cregs(&cregs, task_pt_regs(target));
+
+   return membuf_write(&to, &cregs,
+   sizeof(struct compat_user_regs_struct));
+}
+
+static int compat_riscv_gpr_set(struct task_struct *target,
+   const struct user_regset *regset,
+   unsigned int pos, unsigned int count,
+   const void *kbuf, const void __user *ubuf)
+{
+   int ret;
+   struct compat_user_regs_struct cregs;
+
+   ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &cregs, 0, -1);
+
+   cregs_to_regs(&cregs, task_pt_regs(target));
+
+   return ret;
+}
+
+static const struct user_regset compat_riscv_user_regset[] = {
+   [REGSET_X] = {
+   .core_note_type = NT_PRSTATUS,
+   .n = ELF_NGREG,
+   .size = sizeof(compat_elf_greg_t),
+   .align = sizeof(compat_elf_greg_t),
+   .regset_get = compat_riscv_gpr_get

[PATCH V5 18/21] riscv: compat: signal: Add rt_frame implementation

From: Guo Ren 

Implement compat_setup_rt_frame for sigcontext save & restore. The
main process is the same with signal, but the rv32 pt_regs' size
is different from rv64's, so we needs convert them.

Signed-off-by: Guo Ren 
Signed-off-by: Guo Ren 
Cc: Arnd Bergmann 
Cc: Palmer Dabbelt 
---
 arch/riscv/kernel/Makefile|   1 +
 arch/riscv/kernel/compat_signal.c | 243 ++
 arch/riscv/kernel/signal.c|  13 +-
 3 files changed, 256 insertions(+), 1 deletion(-)
 create mode 100644 arch/riscv/kernel/compat_signal.c

diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile
index 88e79f481c21..a46f9807c59e 100644
--- a/arch/riscv/kernel/Makefile
+++ b/arch/riscv/kernel/Makefile
@@ -67,4 +67,5 @@ obj-$(CONFIG_JUMP_LABEL)  += jump_label.o
 
 obj-$(CONFIG_EFI)  += efi.o
 obj-$(CONFIG_COMPAT)   += compat_syscall_table.o
+obj-$(CONFIG_COMPAT)   += compat_signal.o
 obj-$(CONFIG_COMPAT)   += compat_vdso/
diff --git a/arch/riscv/kernel/compat_signal.c 
b/arch/riscv/kernel/compat_signal.c
new file mode 100644
index ..7041742ded08
--- /dev/null
+++ b/arch/riscv/kernel/compat_signal.c
@@ -0,0 +1,243 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+
+#define COMPAT_DEBUG_SIG 0
+
+struct compat_sigcontext {
+   struct compat_user_regs_struct sc_regs;
+   union __riscv_fp_state sc_fpregs;
+};
+
+struct compat_ucontext {
+   compat_ulong_t  uc_flags;
+   struct compat_ucontext  *uc_link;
+   compat_stack_t  uc_stack;
+   sigset_tuc_sigmask;
+   /* There's some padding here to allow sigset_t to be expanded in the
+* future.  Though this is unlikely, other architectures put uc_sigmask
+* at the end of this structure and explicitly state it can be
+* expanded, so we didn't want to box ourselves in here. */
+   __u8  __unused[1024 / 8 - sizeof(sigset_t)];
+   /* We can't put uc_sigmask at the end of this structure because we need
+* to be able to expand sigcontext in the future.  For example, the
+* vector ISA extension will almost certainly add ISA state.  We want
+* to ensure all user-visible ISA state can be saved and restored via a
+* ucontext, so we're putting this at the end in order to allow for
+* infinite extensibility.  Since we know this will be extended and we
+* assume sigset_t won't be extended an extreme amount, we're
+* prioritizing this. */
+   struct compat_sigcontext uc_mcontext;
+};
+
+struct compat_rt_sigframe {
+   struct compat_siginfo info;
+   struct compat_ucontext uc;
+};
+
+#ifdef CONFIG_FPU
+static long compat_restore_fp_state(struct pt_regs *regs,
+   union __riscv_fp_state __user *sc_fpregs)
+{
+   long err;
+   struct __riscv_d_ext_state __user *state = &sc_fpregs->d;
+   size_t i;
+
+   err = __copy_from_user(¤t->thread.fstate, state, sizeof(*state));
+   if (unlikely(err))
+   return err;
+
+   fstate_restore(current, regs);
+
+   /* We support no other extension state at this time. */
+   for (i = 0; i < ARRAY_SIZE(sc_fpregs->q.reserved); i++) {
+   u32 value;
+
+   err = __get_user(value, &sc_fpregs->q.reserved[i]);
+   if (unlikely(err))
+   break;
+   if (value != 0)
+   return -EINVAL;
+   }
+
+   return err;
+}
+
+static long compat_save_fp_state(struct pt_regs *regs,
+ union __riscv_fp_state __user *sc_fpregs)
+{
+   long err;
+   struct __riscv_d_ext_state __user *state = &sc_fpregs->d;
+   size_t i;
+
+   fstate_save(current, regs);
+   err = __copy_to_user(state, ¤t->thread.fstate, sizeof(*state));
+   if (unlikely(err))
+   return err;
+
+   /* We support no other extension state at this time. */
+   for (i = 0; i < ARRAY_SIZE(sc_fpregs->q.reserved); i++) {
+   err = __put_user(0, &sc_fpregs->q.reserved[i]);
+   if (unlikely(err))
+   break;
+   }
+
+   return err;
+}
+#else
+#define compat_save_fp_state(task, regs) (0)
+#define compat_restore_fp_state(task, regs) (0)
+#endif
+
+static long compat_restore_sigcontext(struct pt_regs *regs,
+   struct compat_sigcontext __user *sc)
+{
+   long err;
+   struct compat_user_regs_struct cregs;
+
+   /* sc_regs is structured the same as the start of pt_regs */
+   err = __copy_from_user(&cregs, &sc->sc_regs, sizeof(sc->sc_regs));
+
+   cregs_to_regs(&cregs, regs);
+
+   /* Restore the floating-point state. */
+   if (has_fpu())
+   err |= compat_restore_fp_state(regs, &sc->sc_fpregs);
+   return err;
+}
+
+COMPAT_SYSCALL_DEFINE0(rt_sigreturn)
+{
+   str

[PATCH V5 17/21] riscv: compat: vdso: Add setup additional pages implementation

From: Guo Ren 

Reconstruct __setup_additional_pages() by appending vdso info
pointer argument to meet compat_vdso_info requirement. And change
vm_special_mapping *dm, *cm initialization into static.

Signed-off-by: Guo Ren 
Signed-off-by: Guo Ren 
Cc: Arnd Bergmann 
Cc: Palmer Dabbelt 
---
 arch/riscv/include/asm/elf.h |   5 ++
 arch/riscv/include/asm/mmu.h |   1 +
 arch/riscv/kernel/vdso.c | 104 +--
 3 files changed, 81 insertions(+), 29 deletions(-)

diff --git a/arch/riscv/include/asm/elf.h b/arch/riscv/include/asm/elf.h
index 3a4293dc7229..d87d3bcc758d 100644
--- a/arch/riscv/include/asm/elf.h
+++ b/arch/riscv/include/asm/elf.h
@@ -134,5 +134,10 @@ do {if ((ex).e_ident[EI_CLASS] == ELFCLASS32)  
\
 typedef compat_ulong_t compat_elf_greg_t;
 typedef compat_elf_greg_t  compat_elf_gregset_t[ELF_NGREG];
 
+extern int compat_arch_setup_additional_pages(struct linux_binprm *bprm,
+ int uses_interp);
+#define compat_arch_setup_additional_pages \
+   compat_arch_setup_additional_pages
+
 #endif /* CONFIG_COMPAT */
 #endif /* _ASM_RISCV_ELF_H */
diff --git a/arch/riscv/include/asm/mmu.h b/arch/riscv/include/asm/mmu.h
index 0099dc116168..cedcf8ea3c76 100644
--- a/arch/riscv/include/asm/mmu.h
+++ b/arch/riscv/include/asm/mmu.h
@@ -16,6 +16,7 @@ typedef struct {
atomic_long_t id;
 #endif
void *vdso;
+   void *vdso_info;
 #ifdef CONFIG_SMP
/* A local icache flush is needed before user execution can resume. */
cpumask_t icache_stale_mask;
diff --git a/arch/riscv/kernel/vdso.c b/arch/riscv/kernel/vdso.c
index a9436a65161a..deca69524799 100644
--- a/arch/riscv/kernel/vdso.c
+++ b/arch/riscv/kernel/vdso.c
@@ -23,6 +23,9 @@ struct vdso_data {
 #endif
 
 extern char vdso_start[], vdso_end[];
+#ifdef CONFIG_COMPAT
+extern char compat_vdso_start[], compat_vdso_end[];
+#endif
 
 enum vvar_pages {
VVAR_DATA_PAGE_OFFSET,
@@ -30,6 +33,11 @@ enum vvar_pages {
VVAR_NR_PAGES,
 };
 
+enum rv_vdso_map {
+   RV_VDSO_MAP_VVAR,
+   RV_VDSO_MAP_VDSO,
+};
+
 #define VVAR_SIZE  (VVAR_NR_PAGES << PAGE_SHIFT)
 
 /*
@@ -52,12 +60,6 @@ struct __vdso_info {
struct vm_special_mapping *cm;
 };
 
-static struct __vdso_info vdso_info __ro_after_init = {
-   .name = "vdso",
-   .vdso_code_start = vdso_start,
-   .vdso_code_end = vdso_end,
-};
-
 static int vdso_mremap(const struct vm_special_mapping *sm,
   struct vm_area_struct *new_vma)
 {
@@ -66,35 +68,35 @@ static int vdso_mremap(const struct vm_special_mapping *sm,
return 0;
 }
 
-static int __init __vdso_init(void)
+static int __init __vdso_init(struct __vdso_info *vdso_info)
 {
unsigned int i;
struct page **vdso_pagelist;
unsigned long pfn;
 
-   if (memcmp(vdso_info.vdso_code_start, "\177ELF", 4)) {
+   if (memcmp(vdso_info->vdso_code_start, "\177ELF", 4)) {
pr_err("vDSO is not a valid ELF object!\n");
return -EINVAL;
}
 
-   vdso_info.vdso_pages = (
-   vdso_info.vdso_code_end -
-   vdso_info.vdso_code_start) >>
+   vdso_info->vdso_pages = (
+   vdso_info->vdso_code_end -
+   vdso_info->vdso_code_start) >>
PAGE_SHIFT;
 
-   vdso_pagelist = kcalloc(vdso_info.vdso_pages,
+   vdso_pagelist = kcalloc(vdso_info->vdso_pages,
sizeof(struct page *),
GFP_KERNEL);
if (vdso_pagelist == NULL)
return -ENOMEM;
 
/* Grab the vDSO code pages. */
-   pfn = sym_to_pfn(vdso_info.vdso_code_start);
+   pfn = sym_to_pfn(vdso_info->vdso_code_start);
 
-   for (i = 0; i < vdso_info.vdso_pages; i++)
+   for (i = 0; i < vdso_info->vdso_pages; i++)
vdso_pagelist[i] = pfn_to_page(pfn + i);
 
-   vdso_info.cm->pages = vdso_pagelist;
+   vdso_info->cm->pages = vdso_pagelist;
 
return 0;
 }
@@ -116,13 +118,14 @@ int vdso_join_timens(struct task_struct *task, struct 
time_namespace *ns)
 {
struct mm_struct *mm = task->mm;
struct vm_area_struct *vma;
+   struct __vdso_info *vdso_info = mm->context.vdso_info;
 
mmap_read_lock(mm);
 
for (vma = mm->mmap; vma; vma = vma->vm_next) {
unsigned long size = vma->vm_end - vma->vm_start;
 
-   if (vma_is_special_mapping(vma, vdso_info.dm))
+   if (vma_is_special_mapping(vma, vdso_info->dm))
zap_page_range(vma, vma->vm_start, size);
}
 
@@ -187,11 +190,6 @@ static vm_fault_t vvar_fault(const struct 
vm_special_mapping *sm,
return vmf_insert_pfn(vma, vmf->address, pfn);
 }
 
-enum rv_vdso_map {
-   RV_VDSO_MAP_VVAR,
-   RV_VDSO_MAP_VDSO,
-};
-
 static struct vm_special_mapping rv_vdso_maps[] __ro_after_init = {

[PATCH V5 16/21] riscv: compat: vdso: Add rv32 VDSO base code implementation

From: Guo Ren 

There is no vgettimeofday supported in rv32 that makes simple to
generate rv32 vdso code which only needs riscv64 compiler. Other
architectures need change compiler or -m (machine parameter) to
support vdso32 compiling. If rv32 support vgettimeofday (which
cause C compile) in future, we would add CROSS_COMPILE to support
that makes more requirement on compiler enviornment.

linux-rv64/arch/riscv/kernel/compat_vdso/compat_vdso.so.dbg:
file format elf64-littleriscv

Disassembly of section .text:

0800 <__vdso_rt_sigreturn>:
 800:   08b00893li  a7,139
 804:   0073ecall
 808:   unimp
...

080c <__vdso_getcpu>:
 80c:   0a800893li  a7,168
 810:   0073ecall
 814:   8082ret
...

0818 <__vdso_flush_icache>:
 818:   10300893li  a7,259
 81c:   0073ecall
 820:   8082ret

linux-rv32/arch/riscv/kernel/vdso/vdso.so.dbg:
file format elf32-littleriscv

Disassembly of section .text:

0800 <__vdso_rt_sigreturn>:
 800:   08b00893li  a7,139
 804:   0073ecall
 808:   unimp
...

080c <__vdso_getcpu>:
 80c:   0a800893li  a7,168
 810:   0073ecall
 814:   8082ret
...

0818 <__vdso_flush_icache>:
 818:   10300893li  a7,259
 81c:   0073ecall
 820:   8082ret

Finally, reuse all *.S from vdso in compat_vdso that makes
implementation clear and readable.

Signed-off-by: Guo Ren 
Signed-off-by: Guo Ren 
Cc: Arnd Bergmann 
Cc: Palmer Dabbelt 
---
 arch/riscv/Makefile   |  5 ++
 arch/riscv/include/asm/vdso.h |  9 +++
 arch/riscv/kernel/Makefile|  1 +
 arch/riscv/kernel/compat_vdso/.gitignore  |  2 +
 arch/riscv/kernel/compat_vdso/Makefile| 68 +++
 arch/riscv/kernel/compat_vdso/compat_vdso.S   |  8 +++
 .../kernel/compat_vdso/compat_vdso.lds.S  |  3 +
 arch/riscv/kernel/compat_vdso/flush_icache.S  |  3 +
 .../compat_vdso/gen_compat_vdso_offsets.sh|  5 ++
 arch/riscv/kernel/compat_vdso/getcpu.S|  3 +
 arch/riscv/kernel/compat_vdso/note.S  |  3 +
 arch/riscv/kernel/compat_vdso/rt_sigreturn.S  |  3 +
 arch/riscv/kernel/vdso/vdso.S |  6 +-
 13 files changed, 118 insertions(+), 1 deletion(-)
 create mode 100644 arch/riscv/kernel/compat_vdso/.gitignore
 create mode 100644 arch/riscv/kernel/compat_vdso/Makefile
 create mode 100644 arch/riscv/kernel/compat_vdso/compat_vdso.S
 create mode 100644 arch/riscv/kernel/compat_vdso/compat_vdso.lds.S
 create mode 100644 arch/riscv/kernel/compat_vdso/flush_icache.S
 create mode 100755 arch/riscv/kernel/compat_vdso/gen_compat_vdso_offsets.sh
 create mode 100644 arch/riscv/kernel/compat_vdso/getcpu.S
 create mode 100644 arch/riscv/kernel/compat_vdso/note.S
 create mode 100644 arch/riscv/kernel/compat_vdso/rt_sigreturn.S

diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
index a02e588c4947..f73d50552e09 100644
--- a/arch/riscv/Makefile
+++ b/arch/riscv/Makefile
@@ -106,12 +106,17 @@ libs-$(CONFIG_EFI_STUB) += 
$(objtree)/drivers/firmware/efi/libstub/lib.a
 PHONY += vdso_install
 vdso_install:
$(Q)$(MAKE) $(build)=arch/riscv/kernel/vdso $@
+   $(if $(CONFIG_COMPAT),$(Q)$(MAKE) \
+   $(build)=arch/riscv/kernel/compat_vdso $@)
 
 ifeq ($(KBUILD_EXTMOD),)
 ifeq ($(CONFIG_MMU),y)
 prepare: vdso_prepare
 vdso_prepare: prepare0
$(Q)$(MAKE) $(build)=arch/riscv/kernel/vdso 
include/generated/vdso-offsets.h
+   $(if $(CONFIG_COMPAT),$(Q)$(MAKE) \
+   $(build)=arch/riscv/kernel/compat_vdso 
include/generated/compat_vdso-offsets.h)
+
 endif
 endif
 
diff --git a/arch/riscv/include/asm/vdso.h b/arch/riscv/include/asm/vdso.h
index bc6f75f3a199..af981426fe0f 100644
--- a/arch/riscv/include/asm/vdso.h
+++ b/arch/riscv/include/asm/vdso.h
@@ -21,6 +21,15 @@
 
 #define VDSO_SYMBOL(base, name)
\
(void __user *)((unsigned long)(base) + __vdso_##name##_offset)
+
+#ifdef CONFIG_COMPAT
+#include 
+
+#define COMPAT_VDSO_SYMBOL(base, name) 
\
+   (void __user *)((unsigned long)(base) + compat__vdso_##name##_offset)
+
+#endif /* CONFIG_COMPAT */
+
 #endif /* !__ASSEMBLY__ */
 
 #endif /* CONFIG_MMU */
diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile
index 954dc7043ad2..88e79f481c21 100644
--- a/arch/riscv/kernel/Makefile
+++ b/arch/riscv/kernel/Makefile
@@ -67,3 +67,4 @@ obj-$(CONFIG_JUMP_LABEL)  += jump_label.o
 
 obj-$(CONFIG_EFI)  += efi.o
 obj-$(CONFIG_COMPAT)   += compat_syscall_table.o
+obj-$(CONFIG_COMPAT)   += compat_vdso/
diff --git a/arch/

[PATCH V5 15/21] riscv: compat: Add hw capability check for elf

From: Guo Ren 

Detect hardware COMPAT (32bit U-mode) capability in rv64. If not
support COMPAT mode in hw, compat_elf_check_arch would return
false by compat_binfmt_elf.c

Signed-off-by: Guo Ren 
Signed-off-by: Guo Ren 
Cc: Arnd Bergmann 
Cc: Christoph Hellwig 
---
 arch/riscv/include/asm/elf.h |  3 ++-
 arch/riscv/kernel/process.c  | 32 
 2 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/arch/riscv/include/asm/elf.h b/arch/riscv/include/asm/elf.h
index aee40040917b..3a4293dc7229 100644
--- a/arch/riscv/include/asm/elf.h
+++ b/arch/riscv/include/asm/elf.h
@@ -40,7 +40,8 @@
  * elf64_hdr e_machine's offset are different. The checker is
  * a little bit simple compare to other architectures.
  */
-#define compat_elf_check_arch(x) ((x)->e_machine == EM_RISCV)
+extern bool compat_elf_check_arch(Elf32_Ehdr *hdr);
+#define compat_elf_check_arch  compat_elf_check_arch
 
 #define CORE_DUMP_USE_REGSET
 #define ELF_EXEC_PAGESIZE  (PAGE_SIZE)
diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c
index 1a666ad299b4..758847cba391 100644
--- a/arch/riscv/kernel/process.c
+++ b/arch/riscv/kernel/process.c
@@ -83,6 +83,38 @@ void show_regs(struct pt_regs *regs)
dump_backtrace(regs, NULL, KERN_DEFAULT);
 }
 
+#ifdef CONFIG_COMPAT
+static bool compat_mode_support __read_mostly;
+
+bool compat_elf_check_arch(Elf32_Ehdr *hdr)
+{
+   if (compat_mode_support && (hdr->e_machine == EM_RISCV))
+   return true;
+   else
+   return false;
+}
+
+static int compat_mode_detect(void)
+{
+   unsigned long tmp = csr_read(CSR_STATUS);
+
+   csr_write(CSR_STATUS, (tmp & ~SR_UXL) | SR_UXL_32);
+
+   if ((csr_read(CSR_STATUS) & SR_UXL) != SR_UXL_32) {
+   pr_info("riscv: 32bit compat mode detect failed\n");
+   compat_mode_support = false;
+   } else {
+   compat_mode_support = true;
+   pr_info("riscv: 32bit compat mode detected\n");
+   }
+
+   csr_write(CSR_STATUS, tmp);
+
+   return 0;
+}
+arch_initcall(compat_mode_detect);
+#endif
+
 void start_thread(struct pt_regs *regs, unsigned long pc,
unsigned long sp)
 {
-- 
2.25.1

[PATCH V5 14/21] riscv: compat: Add elf.h implementation

From: Guo Ren 

Implement necessary type and macro for compat elf. See the code
comment for detail.

Signed-off-by: Guo Ren 
Signed-off-by: Guo Ren 
Reviewed-by: Arnd Bergmann 
---
 arch/riscv/include/asm/elf.h | 46 +++-
 1 file changed, 45 insertions(+), 1 deletion(-)

diff --git a/arch/riscv/include/asm/elf.h b/arch/riscv/include/asm/elf.h
index f53c40026c7a..aee40040917b 100644
--- a/arch/riscv/include/asm/elf.h
+++ b/arch/riscv/include/asm/elf.h
@@ -8,6 +8,8 @@
 #ifndef _ASM_RISCV_ELF_H
 #define _ASM_RISCV_ELF_H
 
+#include 
+#include 
 #include 
 #include 
 #include 
@@ -18,11 +20,13 @@
  */
 #define ELF_ARCH   EM_RISCV
 
+#ifndef ELF_CLASS
 #ifdef CONFIG_64BIT
 #define ELF_CLASS  ELFCLASS64
 #else
 #define ELF_CLASS  ELFCLASS32
 #endif
+#endif
 
 #define ELF_DATA   ELFDATA2LSB
 
@@ -31,6 +35,13 @@
  */
 #define elf_check_arch(x) ((x)->e_machine == EM_RISCV)
 
+/*
+ * Use the same code with elf_check_arch, because elf32_hdr &
+ * elf64_hdr e_machine's offset are different. The checker is
+ * a little bit simple compare to other architectures.
+ */
+#define compat_elf_check_arch(x) ((x)->e_machine == EM_RISCV)
+
 #define CORE_DUMP_USE_REGSET
 #define ELF_EXEC_PAGESIZE  (PAGE_SIZE)
 
@@ -43,8 +54,14 @@
 #define ELF_ET_DYN_BASE((TASK_SIZE / 3) * 2)
 
 #ifdef CONFIG_64BIT
+#ifdef CONFIG_COMPAT
+#define STACK_RND_MASK (test_thread_flag(TIF_32BIT) ? \
+0x7ff >> (PAGE_SHIFT - 12) : \
+0x3 >> (PAGE_SHIFT - 12))
+#else
 #define STACK_RND_MASK (0x3 >> (PAGE_SHIFT - 12))
 #endif
+#endif
 /*
  * This yields a mask that user programs can use to figure out what
  * instruction set this CPU supports.  This could be done in user space,
@@ -60,11 +77,19 @@ extern unsigned long elf_hwcap;
  */
 #define ELF_PLATFORM   (NULL)
 
+#define COMPAT_ELF_PLATFORM(NULL)
+
 #ifdef CONFIG_MMU
 #define ARCH_DLINFO\
 do {   \
+   /*  \
+* Note that we add ulong after elf_addr_t because  \
+* casting current->mm->context.vdso triggers a cast\
+* warning of cast from pointer to integer for  \
+* COMPAT ELFCLASS32.   \
+*/ \
NEW_AUX_ENT(AT_SYSINFO_EHDR,\
-   (elf_addr_t)current->mm->context.vdso); \
+   (elf_addr_t)(ulong)current->mm->context.vdso);  \
NEW_AUX_ENT(AT_L1I_CACHESIZE,   \
get_cache_size(1, CACHE_TYPE_INST));\
NEW_AUX_ENT(AT_L1I_CACHEGEOMETRY,   \
@@ -90,4 +115,23 @@ do {
\
*(struct user_regs_struct *)regs;   \
 } while (0);
 
+#ifdef CONFIG_COMPAT
+
+#define SET_PERSONALITY(ex)\
+do {if ((ex).e_ident[EI_CLASS] == ELFCLASS32)  \
+   set_thread_flag(TIF_32BIT); \
+   else\
+   clear_thread_flag(TIF_32BIT);   \
+   if (personality(current->personality) != PER_LINUX32)   \
+   set_personality(PER_LINUX | \
+   (current->personality & (~PER_MASK)));  \
+} while (0)
+
+#define COMPAT_ELF_ET_DYN_BASE ((TASK_SIZE_32 / 3) * 2)
+
+/* rv32 registers */
+typedef compat_ulong_t compat_elf_greg_t;
+typedef compat_elf_greg_t  compat_elf_gregset_t[ELF_NGREG];
+
+#endif /* CONFIG_COMPAT */
 #endif /* _ASM_RISCV_ELF_H */
-- 
2.25.1

[PATCH V5 13/21] riscv: compat: process: Add UXL_32 support in start_thread

From: Guo Ren 

If the current task is in COMPAT mode, set SR_UXL_32 in status for
returning userspace. We need CONFIG _COMPAT to prevent compiling
errors with rv32 defconfig.

Signed-off-by: Guo Ren 
Signed-off-by: Guo Ren 
Cc: Arnd Bergmann 
Cc: Palmer Dabbelt 
---
 arch/riscv/kernel/process.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c
index 03ac3aa611f5..1a666ad299b4 100644
--- a/arch/riscv/kernel/process.c
+++ b/arch/riscv/kernel/process.c
@@ -97,6 +97,11 @@ void start_thread(struct pt_regs *regs, unsigned long pc,
}
regs->epc = pc;
regs->sp = sp;
+
+#ifdef CONFIG_COMPAT
+   if (is_compat_task())
+   regs->status |= SR_UXL_32;
+#endif
 }
 
 void flush_thread(void)
-- 
2.25.1

[PATCH V5 12/21] riscv: compat: syscall: Add entry.S implementation

From: Guo Ren 

Implement the entry of compat_sys_call_table[] in asm. Ref to
riscv-privileged spec 4.1.1 Supervisor Status Register (sstatus):

 BIT[32:33] = UXL[1:0]:
 - 1:32
 - 2:64
 - 3:128

Signed-off-by: Guo Ren 
Signed-off-by: Guo Ren 
Cc: Arnd Bergmann 
Cc: Palmer Dabbelt 
---
 arch/riscv/include/asm/csr.h |  7 +++
 arch/riscv/kernel/entry.S| 18 --
 2 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h
index ae711692eec9..eed96fa62d66 100644
--- a/arch/riscv/include/asm/csr.h
+++ b/arch/riscv/include/asm/csr.h
@@ -36,6 +36,13 @@
 #define SR_SD  _AC(0x8000, UL) /* FS/XS dirty */
 #endif
 
+#ifdef CONFIG_COMPAT
+#define SR_UXL _AC(0x3, UL) /* XLEN mask for U-mode */
+#define SR_UXL_32  _AC(0x1, UL) /* XLEN = 32 for U-mode */
+#define SR_UXL_64  _AC(0x2, UL) /* XLEN = 64 for U-mode */
+#define SR_UXL_SHIFT   32
+#endif
+
 /* SATP flags */
 #ifndef CONFIG_64BIT
 #define SATP_PPN   _AC(0x003F, UL)
diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S
index ed29e9c8f660..1951743f09b3 100644
--- a/arch/riscv/kernel/entry.S
+++ b/arch/riscv/kernel/entry.S
@@ -207,13 +207,27 @@ check_syscall_nr:
 * Syscall number held in a7.
 * If syscall number is above allowed value, redirect to ni_syscall.
 */
-   bgeu a7, t0, 1f
+   bgeu a7, t0, 3f
+#ifdef CONFIG_COMPAT
+   REG_L s0, PT_STATUS(sp)
+   srli s0, s0, SR_UXL_SHIFT
+   andi s0, s0, (SR_UXL >> SR_UXL_SHIFT)
+   li t0, (SR_UXL_32 >> SR_UXL_SHIFT)
+   sub t0, s0, t0
+   bnez t0, 1f
+
+   /* Call compat_syscall */
+   la s0, compat_sys_call_table
+   j 2f
+1:
+#endif
/* Call syscall */
la s0, sys_call_table
+2:
slli t0, a7, RISCV_LGPTR
add s0, s0, t0
REG_L s0, 0(s0)
-1:
+3:
jalr s0
 
 ret_from_syscall:
-- 
2.25.1

[PATCH V5 11/21] riscv: compat: syscall: Add compat_sys_call_table implementation

From: Guo Ren 

Implement compat sys_call_table and some system call functions:
truncate64, ftruncate64, fallocate, pread64, pwrite64,
sync_file_range, readahead, fadvise64_64 which need argument
translation.

Signed-off-by: Guo Ren 
Signed-off-by: Guo Ren 
Cc: Arnd Bergmann 
Cc: Palmer Dabbelt 
---
 arch/riscv/include/asm/syscall.h |  1 +
 arch/riscv/include/asm/unistd.h  | 11 +++
 arch/riscv/include/uapi/asm/unistd.h |  2 +-
 arch/riscv/kernel/Makefile   |  1 +
 arch/riscv/kernel/compat_syscall_table.c | 19 
 arch/riscv/kernel/sys_riscv.c|  6 ++--
 fs/open.c| 24 +++
 fs/read_write.c  | 16 ++
 fs/sync.c|  9 ++
 include/asm-generic/compat.h |  7 +
 include/linux/compat.h   | 37 
 mm/fadvise.c | 11 +++
 mm/readahead.c   |  7 +
 13 files changed, 148 insertions(+), 3 deletions(-)
 create mode 100644 arch/riscv/kernel/compat_syscall_table.c

diff --git a/arch/riscv/include/asm/syscall.h b/arch/riscv/include/asm/syscall.h
index 7ac6a0e275f2..384a63b86420 100644
--- a/arch/riscv/include/asm/syscall.h
+++ b/arch/riscv/include/asm/syscall.h
@@ -16,6 +16,7 @@
 
 /* The array of function pointers for syscalls. */
 extern void * const sys_call_table[];
+extern void * const compat_sys_call_table[];
 
 /*
  * Only the low 32 bits of orig_r0 are meaningful, so we return int.
diff --git a/arch/riscv/include/asm/unistd.h b/arch/riscv/include/asm/unistd.h
index 6c316093a1e5..5ddac412b578 100644
--- a/arch/riscv/include/asm/unistd.h
+++ b/arch/riscv/include/asm/unistd.h
@@ -11,6 +11,17 @@
 #define __ARCH_WANT_SYS_CLONE
 #define __ARCH_WANT_MEMFD_SECRET
 
+#ifdef CONFIG_COMPAT
+#define __ARCH_WANT_COMPAT_TRUNCATE64
+#define __ARCH_WANT_COMPAT_FTRUNCATE64
+#define __ARCH_WANT_COMPAT_FALLOCATE
+#define __ARCH_WANT_COMPAT_PREAD64
+#define __ARCH_WANT_COMPAT_PWRITE64
+#define __ARCH_WANT_COMPAT_SYNC_FILE_RANGE
+#define __ARCH_WANT_COMPAT_READAHEAD
+#define __ARCH_WANT_COMPAT_FADVISE64_64
+#endif
+
 #include 
 
 #define NR_syscalls (__NR_syscalls)
diff --git a/arch/riscv/include/uapi/asm/unistd.h 
b/arch/riscv/include/uapi/asm/unistd.h
index 8062996c2dfd..c9e50eed14aa 100644
--- a/arch/riscv/include/uapi/asm/unistd.h
+++ b/arch/riscv/include/uapi/asm/unistd.h
@@ -15,7 +15,7 @@
  * along with this program.  If not, see .
  */
 
-#ifdef __LP64__
+#if defined(__LP64__) && !defined(__SYSCALL_COMPAT)
 #define __ARCH_WANT_NEW_STAT
 #define __ARCH_WANT_SET_GET_RLIMIT
 #endif /* __LP64__ */
diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile
index 612556faa527..954dc7043ad2 100644
--- a/arch/riscv/kernel/Makefile
+++ b/arch/riscv/kernel/Makefile
@@ -66,3 +66,4 @@ obj-$(CONFIG_CRASH_DUMP)  += crash_dump.o
 obj-$(CONFIG_JUMP_LABEL)   += jump_label.o
 
 obj-$(CONFIG_EFI)  += efi.o
+obj-$(CONFIG_COMPAT)   += compat_syscall_table.o
diff --git a/arch/riscv/kernel/compat_syscall_table.c 
b/arch/riscv/kernel/compat_syscall_table.c
new file mode 100644
index ..651f2b009c28
--- /dev/null
+++ b/arch/riscv/kernel/compat_syscall_table.c
@@ -0,0 +1,19 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#define __SYSCALL_COMPAT
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#undef __SYSCALL
+#define __SYSCALL(nr, call)  [nr] = (call),
+
+asmlinkage long compat_sys_rt_sigreturn(void);
+
+void * const compat_sys_call_table[__NR_syscalls] = {
+   [0 ... __NR_syscalls - 1] = sys_ni_syscall,
+#include 
+};
diff --git a/arch/riscv/kernel/sys_riscv.c b/arch/riscv/kernel/sys_riscv.c
index 12f8a7fce78b..9c0194f176fc 100644
--- a/arch/riscv/kernel/sys_riscv.c
+++ b/arch/riscv/kernel/sys_riscv.c
@@ -33,7 +33,9 @@ SYSCALL_DEFINE6(mmap, unsigned long, addr, unsigned long, len,
 {
return riscv_sys_mmap(addr, len, prot, flags, fd, offset, 0);
 }
-#else
+#endif
+
+#if defined(CONFIG_32BIT) || defined(CONFIG_COMPAT)
 SYSCALL_DEFINE6(mmap2, unsigned long, addr, unsigned long, len,
unsigned long, prot, unsigned long, flags,
unsigned long, fd, off_t, offset)
@@ -44,7 +46,7 @@ SYSCALL_DEFINE6(mmap2, unsigned long, addr, unsigned long, 
len,
 */
return riscv_sys_mmap(addr, len, prot, flags, fd, offset, 12);
 }
-#endif /* !CONFIG_64BIT */
+#endif
 
 /*
  * Allows the instruction cache to be flushed from userspace.  Despite RISC-V
diff --git a/fs/open.c b/fs/open.c
index 9ff2f621b760..b25613f7c0a7 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -224,6 +224,21 @@ SYSCALL_DEFINE2(ftruncate64, unsigned int, fd, loff_t, 
length)
 }
 #endif /* BITS_PER_LONG == 32 */
 
+#if defined(CONFIG_COMPAT) && defined(__ARCH_WANT_COMPAT_TRUNCATE64)
+COMPAT_SYSCALL_DEFINE3(truncate64, const char __user *, pathname,
+  compat_arg_u64_dual(length))
+{
+

[PATCH V5 10/21] riscv: compat: Re-implement TASK_SIZE for COMPAT_32BIT

From: Guo Ren 

Make TASK_SIZE from const to dynamic detect TIF_32BIT flag
function. Refer to arm64 to implement DEFAULT_MAP_WINDOW_64 for
efi-stub.

Limit 32-bit compatible process in 0-2GB virtual address range
(which is enough for real scenarios), because it could avoid
address sign extend problem when 32-bit enter 64-bit and ease
software design.

The standard 32-bit TASK_SIZE is 0x9dc0:FIXADDR_START, and
compared to a compatible 32-bit, it increases 476MB for the
application's virtual address.

Signed-off-by: Guo Ren 
Signed-off-by: Guo Ren 
Reviewed-by: Arnd Bergmann 
---
 arch/riscv/include/asm/pgtable.h | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
index 7e949f25c933..f0d125ea3ceb 100644
--- a/arch/riscv/include/asm/pgtable.h
+++ b/arch/riscv/include/asm/pgtable.h
@@ -704,8 +704,17 @@ static inline pmd_t pmdp_establish(struct vm_area_struct 
*vma,
  * 63–48 all equal to bit 47, or else a page-fault exception will occur."
  */
 #ifdef CONFIG_64BIT
-#define TASK_SIZE  (PGDIR_SIZE * PTRS_PER_PGD / 2)
-#define TASK_SIZE_MIN  (PGDIR_SIZE_L3 * PTRS_PER_PGD / 2)
+#define TASK_SIZE_64   (PGDIR_SIZE * PTRS_PER_PGD / 2)
+#define TASK_SIZE_MIN  (PGDIR_SIZE_L3 * PTRS_PER_PGD / 2)
+
+#ifdef CONFIG_COMPAT
+#define TASK_SIZE_32   (_AC(0x8000, UL) - PAGE_SIZE)
+#define TASK_SIZE  (test_thread_flag(TIF_32BIT) ? \
+TASK_SIZE_32 : TASK_SIZE_64)
+#else
+#define TASK_SIZE  TASK_SIZE_64
+#endif
+
 #else
 #define TASK_SIZE  FIXADDR_START
 #define TASK_SIZE_MIN  TASK_SIZE
-- 
2.25.1

[PATCH V5 09/21] riscv: compat: Add basic compat data type implementation

From: Guo Ren 

Implement riscv asm/compat.h for struct compat_xxx,
is_compat_task, compat_user_regset, regset convert.

The rv64 compat.h has inherited most of the structs
from the generic one.

Signed-off-by: Guo Ren 
Signed-off-by: Guo Ren 
Cc: Arnd Bergmann 
Cc: Palmer Dabbelt 
---
 arch/riscv/include/asm/compat.h  | 129 +++
 arch/riscv/include/asm/thread_info.h |   1 +
 2 files changed, 130 insertions(+)
 create mode 100644 arch/riscv/include/asm/compat.h

diff --git a/arch/riscv/include/asm/compat.h b/arch/riscv/include/asm/compat.h
new file mode 100644
index ..2ac955b51148
--- /dev/null
+++ b/arch/riscv/include/asm/compat.h
@@ -0,0 +1,129 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef __ASM_COMPAT_H
+#define __ASM_COMPAT_H
+
+#define COMPAT_UTS_MACHINE "riscv\0\0"
+
+/*
+ * Architecture specific compatibility types
+ */
+#include 
+#include 
+#include 
+#include 
+
+static inline int is_compat_task(void)
+{
+   return test_thread_flag(TIF_32BIT);
+}
+
+struct compat_user_regs_struct {
+   compat_ulong_t pc;
+   compat_ulong_t ra;
+   compat_ulong_t sp;
+   compat_ulong_t gp;
+   compat_ulong_t tp;
+   compat_ulong_t t0;
+   compat_ulong_t t1;
+   compat_ulong_t t2;
+   compat_ulong_t s0;
+   compat_ulong_t s1;
+   compat_ulong_t a0;
+   compat_ulong_t a1;
+   compat_ulong_t a2;
+   compat_ulong_t a3;
+   compat_ulong_t a4;
+   compat_ulong_t a5;
+   compat_ulong_t a6;
+   compat_ulong_t a7;
+   compat_ulong_t s2;
+   compat_ulong_t s3;
+   compat_ulong_t s4;
+   compat_ulong_t s5;
+   compat_ulong_t s6;
+   compat_ulong_t s7;
+   compat_ulong_t s8;
+   compat_ulong_t s9;
+   compat_ulong_t s10;
+   compat_ulong_t s11;
+   compat_ulong_t t3;
+   compat_ulong_t t4;
+   compat_ulong_t t5;
+   compat_ulong_t t6;
+};
+
+static inline void regs_to_cregs(struct compat_user_regs_struct *cregs,
+struct pt_regs *regs)
+{
+   cregs->pc   = (compat_ulong_t) regs->epc;
+   cregs->ra   = (compat_ulong_t) regs->ra;
+   cregs->sp   = (compat_ulong_t) regs->sp;
+   cregs->gp   = (compat_ulong_t) regs->gp;
+   cregs->tp   = (compat_ulong_t) regs->tp;
+   cregs->t0   = (compat_ulong_t) regs->t0;
+   cregs->t1   = (compat_ulong_t) regs->t1;
+   cregs->t2   = (compat_ulong_t) regs->t2;
+   cregs->s0   = (compat_ulong_t) regs->s0;
+   cregs->s1   = (compat_ulong_t) regs->s1;
+   cregs->a0   = (compat_ulong_t) regs->a0;
+   cregs->a1   = (compat_ulong_t) regs->a1;
+   cregs->a2   = (compat_ulong_t) regs->a2;
+   cregs->a3   = (compat_ulong_t) regs->a3;
+   cregs->a4   = (compat_ulong_t) regs->a4;
+   cregs->a5   = (compat_ulong_t) regs->a5;
+   cregs->a6   = (compat_ulong_t) regs->a6;
+   cregs->a7   = (compat_ulong_t) regs->a7;
+   cregs->s2   = (compat_ulong_t) regs->s2;
+   cregs->s3   = (compat_ulong_t) regs->s3;
+   cregs->s4   = (compat_ulong_t) regs->s4;
+   cregs->s5   = (compat_ulong_t) regs->s5;
+   cregs->s6   = (compat_ulong_t) regs->s6;
+   cregs->s7   = (compat_ulong_t) regs->s7;
+   cregs->s8   = (compat_ulong_t) regs->s8;
+   cregs->s9   = (compat_ulong_t) regs->s9;
+   cregs->s10  = (compat_ulong_t) regs->s10;
+   cregs->s11  = (compat_ulong_t) regs->s11;
+   cregs->t3   = (compat_ulong_t) regs->t3;
+   cregs->t4   = (compat_ulong_t) regs->t4;
+   cregs->t5   = (compat_ulong_t) regs->t5;
+   cregs->t6   = (compat_ulong_t) regs->t6;
+};
+
+static inline void cregs_to_regs(struct compat_user_regs_struct *cregs,
+struct pt_regs *regs)
+{
+   regs->epc   = (unsigned long) cregs->pc;
+   regs->ra= (unsigned long) cregs->ra;
+   regs->sp= (unsigned long) cregs->sp;
+   regs->gp= (unsigned long) cregs->gp;
+   regs->tp= (unsigned long) cregs->tp;
+   regs->t0= (unsigned long) cregs->t0;
+   regs->t1= (unsigned long) cregs->t1;
+   regs->t2= (unsigned long) cregs->t2;
+   regs->s0= (unsigned long) cregs->s0;
+   regs->s1= (unsigned long) cregs->s1;
+   regs->a0= (unsigned long) cregs->a0;
+   regs->a1= (unsigned long) cregs->a1;
+   regs->a2= (unsigned long) cregs->a2;
+   regs->a3= (unsigned long) cregs->a3;
+   regs->a4= (unsigned long) cregs->a4;
+   regs->a5= (unsigned long) cregs->a5;
+   regs->a6= (unsigned long) cregs->a6;
+   regs->a7= (unsigned long) cregs->a7;
+   regs->s2= (unsigned long) cregs->s2;
+   regs->s3= (unsigned long) cregs->s3;
+   regs->s4= (unsigned

[PATCH V5 08/21] riscv: Fixup difference with defconfig

From: Guo Ren 

Let's follow the origin patch's spirit:

The only difference between rv32_defconfig and defconfig is that
rv32_defconfig has  CONFIG_ARCH_RV32I=y.

This is helpful to compare rv64-compat-rv32 v.s. rv32-linux.

Fixes: 1b937e8faa87ccfb ("RISC-V: Add separate defconfig for 32bit systems")
Signed-off-by: Guo Ren 
Signed-off-by: Guo Ren 
Reviewed-by: Arnd Bergmann 
Cc: Palmer Dabbelt 
---
 arch/riscv/Makefile   |   4 +
 arch/riscv/configs/rv32_defconfig | 135 --
 2 files changed, 4 insertions(+), 135 deletions(-)
 delete mode 100644 arch/riscv/configs/rv32_defconfig

diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
index 8a107ed18b0d..a02e588c4947 100644
--- a/arch/riscv/Makefile
+++ b/arch/riscv/Makefile
@@ -148,3 +148,7 @@ PHONY += rv64_randconfig
 rv64_randconfig:
$(Q)$(MAKE) 
KCONFIG_ALLCONFIG=$(srctree)/arch/riscv/configs/64-bit.config \
-f $(srctree)/Makefile randconfig
+
+PHONY += rv32_defconfig
+rv32_defconfig:
+   $(Q)$(MAKE) -f $(srctree)/Makefile defconfig 32-bit.config
diff --git a/arch/riscv/configs/rv32_defconfig 
b/arch/riscv/configs/rv32_defconfig
deleted file mode 100644
index 8b56a7f1eb06..
--- a/arch/riscv/configs/rv32_defconfig
+++ /dev/null
@@ -1,135 +0,0 @@
-CONFIG_SYSVIPC=y
-CONFIG_POSIX_MQUEUE=y
-CONFIG_NO_HZ_IDLE=y
-CONFIG_HIGH_RES_TIMERS=y
-CONFIG_BPF_SYSCALL=y
-CONFIG_IKCONFIG=y
-CONFIG_IKCONFIG_PROC=y
-CONFIG_CGROUPS=y
-CONFIG_CGROUP_SCHED=y
-CONFIG_CFS_BANDWIDTH=y
-CONFIG_CGROUP_BPF=y
-CONFIG_NAMESPACES=y
-CONFIG_USER_NS=y
-CONFIG_CHECKPOINT_RESTORE=y
-CONFIG_BLK_DEV_INITRD=y
-CONFIG_EXPERT=y
-# CONFIG_SYSFS_SYSCALL is not set
-CONFIG_SOC_SIFIVE=y
-CONFIG_SOC_VIRT=y
-CONFIG_ARCH_RV32I=y
-CONFIG_SMP=y
-CONFIG_HOTPLUG_CPU=y
-CONFIG_VIRTUALIZATION=y
-CONFIG_KVM=m
-CONFIG_JUMP_LABEL=y
-CONFIG_MODULES=y
-CONFIG_MODULE_UNLOAD=y
-CONFIG_NET=y
-CONFIG_PACKET=y
-CONFIG_UNIX=y
-CONFIG_INET=y
-CONFIG_IP_MULTICAST=y
-CONFIG_IP_ADVANCED_ROUTER=y
-CONFIG_IP_PNP=y
-CONFIG_IP_PNP_DHCP=y
-CONFIG_IP_PNP_BOOTP=y
-CONFIG_IP_PNP_RARP=y
-CONFIG_NETLINK_DIAG=y
-CONFIG_NET_9P=y
-CONFIG_NET_9P_VIRTIO=y
-CONFIG_PCI=y
-CONFIG_PCIEPORTBUS=y
-CONFIG_PCI_HOST_GENERIC=y
-CONFIG_PCIE_XILINX=y
-CONFIG_DEVTMPFS=y
-CONFIG_DEVTMPFS_MOUNT=y
-CONFIG_BLK_DEV_LOOP=y
-CONFIG_VIRTIO_BLK=y
-CONFIG_BLK_DEV_SD=y
-CONFIG_BLK_DEV_SR=y
-CONFIG_SCSI_VIRTIO=y
-CONFIG_ATA=y
-CONFIG_SATA_AHCI=y
-CONFIG_SATA_AHCI_PLATFORM=y
-CONFIG_NETDEVICES=y
-CONFIG_VIRTIO_NET=y
-CONFIG_MACB=y
-CONFIG_E1000E=y
-CONFIG_R8169=y
-CONFIG_MICROSEMI_PHY=y
-CONFIG_INPUT_MOUSEDEV=y
-CONFIG_SERIAL_8250=y
-CONFIG_SERIAL_8250_CONSOLE=y
-CONFIG_SERIAL_OF_PLATFORM=y
-CONFIG_SERIAL_EARLYCON_RISCV_SBI=y
-CONFIG_HVC_RISCV_SBI=y
-CONFIG_VIRTIO_CONSOLE=y
-CONFIG_HW_RANDOM=y
-CONFIG_HW_RANDOM_VIRTIO=y
-CONFIG_SPI=y
-CONFIG_SPI_SIFIVE=y
-# CONFIG_PTP_1588_CLOCK is not set
-CONFIG_DRM=y
-CONFIG_DRM_RADEON=y
-CONFIG_DRM_VIRTIO_GPU=y
-CONFIG_FB=y
-CONFIG_FRAMEBUFFER_CONSOLE=y
-CONFIG_USB=y
-CONFIG_USB_XHCI_HCD=y
-CONFIG_USB_XHCI_PLATFORM=y
-CONFIG_USB_EHCI_HCD=y
-CONFIG_USB_EHCI_HCD_PLATFORM=y
-CONFIG_USB_OHCI_HCD=y
-CONFIG_USB_OHCI_HCD_PLATFORM=y
-CONFIG_USB_STORAGE=y
-CONFIG_USB_UAS=y
-CONFIG_MMC=y
-CONFIG_MMC_SPI=y
-CONFIG_RTC_CLASS=y
-CONFIG_VIRTIO_PCI=y
-CONFIG_VIRTIO_BALLOON=y
-CONFIG_VIRTIO_INPUT=y
-CONFIG_VIRTIO_MMIO=y
-CONFIG_RPMSG_CHAR=y
-CONFIG_RPMSG_VIRTIO=y
-CONFIG_EXT4_FS=y
-CONFIG_EXT4_FS_POSIX_ACL=y
-CONFIG_AUTOFS4_FS=y
-CONFIG_MSDOS_FS=y
-CONFIG_VFAT_FS=y
-CONFIG_TMPFS=y
-CONFIG_TMPFS_POSIX_ACL=y
-CONFIG_NFS_FS=y
-CONFIG_NFS_V4=y
-CONFIG_NFS_V4_1=y
-CONFIG_NFS_V4_2=y
-CONFIG_ROOT_NFS=y
-CONFIG_9P_FS=y
-CONFIG_CRYPTO_USER_API_HASH=y
-CONFIG_CRYPTO_DEV_VIRTIO=y
-CONFIG_PRINTK_TIME=y
-CONFIG_DEBUG_FS=y
-CONFIG_DEBUG_PAGEALLOC=y
-CONFIG_SCHED_STACK_END_CHECK=y
-CONFIG_DEBUG_VM=y
-CONFIG_DEBUG_VM_PGFLAGS=y
-CONFIG_DEBUG_MEMORY_INIT=y
-CONFIG_DEBUG_PER_CPU_MAPS=y
-CONFIG_SOFTLOCKUP_DETECTOR=y
-CONFIG_WQ_WATCHDOG=y
-CONFIG_DEBUG_TIMEKEEPING=y
-CONFIG_DEBUG_RT_MUTEXES=y
-CONFIG_DEBUG_SPINLOCK=y
-CONFIG_DEBUG_MUTEXES=y
-CONFIG_DEBUG_RWSEMS=y
-CONFIG_DEBUG_ATOMIC_SLEEP=y
-CONFIG_STACKTRACE=y
-CONFIG_DEBUG_LIST=y
-CONFIG_DEBUG_PLIST=y
-CONFIG_DEBUG_SG=y
-# CONFIG_RCU_TRACE is not set
-CONFIG_RCU_EQS_DEBUG=y
-# CONFIG_FTRACE is not set
-# CONFIG_RUNTIME_TESTING_MENU is not set
-CONFIG_MEMTEST=y
-- 
2.25.1

[PATCH V5 07/21] syscalls: compat: Fix the missing part for __SYSCALL_COMPAT

From: Guo Ren 

Make "uapi asm unistd.h" could be used for architectures' COMPAT
mode. The __SYSCALL_COMPAT is first used in riscv.

Signed-off-by: Guo Ren 
Signed-off-by: Guo Ren 
Reviewed-by: Arnd Bergmann 
Reviewed-by: Christoph Hellwig 
---
 include/uapi/asm-generic/unistd.h   | 4 ++--
 tools/include/uapi/asm-generic/unistd.h | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/uapi/asm-generic/unistd.h 
b/include/uapi/asm-generic/unistd.h
index 1c48b0ae3ba3..45fa180cc56a 100644
--- a/include/uapi/asm-generic/unistd.h
+++ b/include/uapi/asm-generic/unistd.h
@@ -383,7 +383,7 @@ __SYSCALL(__NR_syslog, sys_syslog)
 
 /* kernel/ptrace.c */
 #define __NR_ptrace 117
-__SYSCALL(__NR_ptrace, sys_ptrace)
+__SC_COMP(__NR_ptrace, sys_ptrace, compat_sys_ptrace)
 
 /* kernel/sched/core.c */
 #define __NR_sched_setparam 118
@@ -779,7 +779,7 @@ __SYSCALL(__NR_rseq, sys_rseq)
 #define __NR_kexec_file_load 294
 __SYSCALL(__NR_kexec_file_load, sys_kexec_file_load)
 /* 295 through 402 are unassigned to sync up with generic numbers, don't use */
-#if __BITS_PER_LONG == 32
+#if defined(__SYSCALL_COMPAT) || __BITS_PER_LONG == 32
 #define __NR_clock_gettime64 403
 __SYSCALL(__NR_clock_gettime64, sys_clock_gettime)
 #define __NR_clock_settime64 404
diff --git a/tools/include/uapi/asm-generic/unistd.h 
b/tools/include/uapi/asm-generic/unistd.h
index 1c48b0ae3ba3..45fa180cc56a 100644
--- a/tools/include/uapi/asm-generic/unistd.h
+++ b/tools/include/uapi/asm-generic/unistd.h
@@ -383,7 +383,7 @@ __SYSCALL(__NR_syslog, sys_syslog)
 
 /* kernel/ptrace.c */
 #define __NR_ptrace 117
-__SYSCALL(__NR_ptrace, sys_ptrace)
+__SC_COMP(__NR_ptrace, sys_ptrace, compat_sys_ptrace)
 
 /* kernel/sched/core.c */
 #define __NR_sched_setparam 118
@@ -779,7 +779,7 @@ __SYSCALL(__NR_rseq, sys_rseq)
 #define __NR_kexec_file_load 294
 __SYSCALL(__NR_kexec_file_load, sys_kexec_file_load)
 /* 295 through 402 are unassigned to sync up with generic numbers, don't use */
-#if __BITS_PER_LONG == 32
+#if defined(__SYSCALL_COMPAT) || __BITS_PER_LONG == 32
 #define __NR_clock_gettime64 403
 __SYSCALL(__NR_clock_gettime64, sys_clock_gettime)
 #define __NR_clock_settime64 404
-- 
2.25.1

[PATCH V5 06/21] asm-generic: compat: Cleanup duplicate definitions

From: Guo Ren 

There are 7 64bit architectures that support Linux COMPAT mode to
run 32bit applications. A lot of definitions are duplicate:
 - COMPAT_USER_HZ
 - COMPAT_RLIM_INFINITY
 - COMPAT_OFF_T_MAX
 - __compat_uid_t, __compat_uid_t
 - compat_dev_t
 - compat_ipc_pid_t
 - struct compat_flock
 - struct compat_flock64
 - struct compat_statfs
 - struct compat_ipc64_perm, compat_semid64_ds,
  compat_msqid64_ds, compat_shmid64_ds

Cleanup duplicate definitions and merge them into asm-generic.

Signed-off-by: Guo Ren 
Signed-off-by: Guo Ren 
Reviewed-by: Arnd Bergmann 
Cc: Palmer Dabbelt 
---
 arch/arm64/include/asm/compat.h   |  71 ++--
 arch/mips/include/asm/compat.h|  18 ++---
 arch/parisc/include/asm/compat.h  |  29 ++--
 arch/powerpc/include/asm/compat.h |  30 ++---
 arch/s390/include/asm/compat.h|  79 --
 arch/sparc/include/asm/compat.h   |  39 ---
 arch/x86/include/asm/compat.h |  80 --
 include/asm-generic/compat.h  | 106 ++
 8 files changed, 168 insertions(+), 284 deletions(-)

diff --git a/arch/arm64/include/asm/compat.h b/arch/arm64/include/asm/compat.h
index e0faec1984a1..46317319738a 100644
--- a/arch/arm64/include/asm/compat.h
+++ b/arch/arm64/include/asm/compat.h
@@ -8,6 +8,13 @@
 #define compat_mode_t compat_mode_t
 typedef u16compat_mode_t;
 
+#define __compat_uid_t __compat_uid_t
+typedef u16__compat_uid_t;
+typedef u16__compat_gid_t;
+
+#define compat_ipc_pid_t compat_ipc_pid_t
+typedef u16 compat_ipc_pid_t;
+
 #include 
 
 #ifdef CONFIG_COMPAT
@@ -19,21 +26,15 @@ typedef u16 compat_mode_t;
 #include 
 #include 
 
-#define COMPAT_USER_HZ 100
 #ifdef __AARCH64EB__
 #define COMPAT_UTS_MACHINE "armv8b\0\0"
 #else
 #define COMPAT_UTS_MACHINE "armv8l\0\0"
 #endif
 
-typedef u16__compat_uid_t;
-typedef u16__compat_gid_t;
 typedef u16__compat_uid16_t;
 typedef u16__compat_gid16_t;
-typedef u32compat_dev_t;
 typedef s32compat_nlink_t;
-typedef u16compat_ipc_pid_t;
-typedef __kernel_fsid_tcompat_fsid_t;
 
 struct compat_stat {
 #ifdef __AARCH64EB__
@@ -87,64 +88,6 @@ struct compat_statfs {
 #define compat_user_stack_pointer() (user_stack_pointer(task_pt_regs(current)))
 #define COMPAT_MINSIGSTKSZ 2048
 
-struct compat_ipc64_perm {
-   compat_key_t key;
-   __compat_uid32_t uid;
-   __compat_gid32_t gid;
-   __compat_uid32_t cuid;
-   __compat_gid32_t cgid;
-   unsigned short mode;
-   unsigned short __pad1;
-   unsigned short seq;
-   unsigned short __pad2;
-   compat_ulong_t unused1;
-   compat_ulong_t unused2;
-};
-
-struct compat_semid64_ds {
-   struct compat_ipc64_perm sem_perm;
-   compat_ulong_t sem_otime;
-   compat_ulong_t sem_otime_high;
-   compat_ulong_t sem_ctime;
-   compat_ulong_t sem_ctime_high;
-   compat_ulong_t sem_nsems;
-   compat_ulong_t __unused3;
-   compat_ulong_t __unused4;
-};
-
-struct compat_msqid64_ds {
-   struct compat_ipc64_perm msg_perm;
-   compat_ulong_t msg_stime;
-   compat_ulong_t msg_stime_high;
-   compat_ulong_t msg_rtime;
-   compat_ulong_t msg_rtime_high;
-   compat_ulong_t msg_ctime;
-   compat_ulong_t msg_ctime_high;
-   compat_ulong_t msg_cbytes;
-   compat_ulong_t msg_qnum;
-   compat_ulong_t msg_qbytes;
-   compat_pid_t   msg_lspid;
-   compat_pid_t   msg_lrpid;
-   compat_ulong_t __unused4;
-   compat_ulong_t __unused5;
-};
-
-struct compat_shmid64_ds {
-   struct compat_ipc64_perm shm_perm;
-   compat_size_t  shm_segsz;
-   compat_ulong_t shm_atime;
-   compat_ulong_t shm_atime_high;
-   compat_ulong_t shm_dtime;
-   compat_ulong_t shm_dtime_high;
-   compat_ulong_t shm_ctime;
-   compat_ulong_t shm_ctime_high;
-   compat_pid_t   shm_cpid;
-   compat_pid_t   shm_lpid;
-   compat_ulong_t shm_nattch;
-   compat_ulong_t __unused4;
-   compat_ulong_t __unused5;
-};
-
 static inline int is_compat_task(void)
 {
return test_thread_flag(TIF_32BIT);
diff --git a/arch/mips/include/asm/compat.h b/arch/mips/include/asm/compat.h
index 6d6e5a451f4d..ec01dc000a41 100644
--- a/arch/mips/include/asm/compat.h
+++ b/arch/mips/include/asm/compat.h
@@ -9,28 +9,28 @@
 #include 
 #include 
 
+#define __compat_uid_t __compat_uid_t
 typedef s32__compat_uid_t;
 typedef s32__compat_gid_t;
+
 typedef __compat_uid_t __compat_uid32_t;
 typedef __compat_gid_t __compat_gid32_t;
 #define __compat_uid32_t __compat_uid32_t
-#define __compat_gid32_t __compat_gid32_t
+
+#define compat_statfs  compat_statfs
+#define compat_ipc64_perm  compat_ipc64_perm
 
 #define _COMPAT_NSIG   128 /* Don't ask !$@#% ...  */
 #define _COMPAT_NSIG_BPW   32
 typedef u32

[PATCH V5 05/21] fs: stat: compat: Add __ARCH_WANT_COMPAT_STAT

From: Guo Ren 

RISC-V doesn't neeed compat_stat, so using __ARCH_WANT_COMPAT_STAT
to exclude unnecessary SYSCALL functions.

Signed-off-by: Guo Ren 
Signed-off-by: Guo Ren 
Reviewed-by: Arnd Bergmann 
Reviewed-by: Christoph Hellwig 
Cc: Palmer Dabbelt 
---
 arch/arm64/include/asm/unistd.h   | 1 +
 arch/mips/include/asm/unistd.h| 2 ++
 arch/parisc/include/asm/unistd.h  | 1 +
 arch/powerpc/include/asm/unistd.h | 1 +
 arch/s390/include/asm/unistd.h| 1 +
 arch/sparc/include/asm/unistd.h   | 1 +
 arch/x86/include/asm/unistd.h | 1 +
 fs/stat.c | 2 +-
 8 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h
index 4e65da3445c7..037feba03a51 100644
--- a/arch/arm64/include/asm/unistd.h
+++ b/arch/arm64/include/asm/unistd.h
@@ -3,6 +3,7 @@
  * Copyright (C) 2012 ARM Ltd.
  */
 #ifdef CONFIG_COMPAT
+#define __ARCH_WANT_COMPAT_STAT
 #define __ARCH_WANT_COMPAT_STAT64
 #define __ARCH_WANT_SYS_GETHOSTNAME
 #define __ARCH_WANT_SYS_PAUSE
diff --git a/arch/mips/include/asm/unistd.h b/arch/mips/include/asm/unistd.h
index c2196b1b6604..25a5253db7f4 100644
--- a/arch/mips/include/asm/unistd.h
+++ b/arch/mips/include/asm/unistd.h
@@ -50,6 +50,8 @@
 # ifdef CONFIG_32BIT
 #  define __ARCH_WANT_STAT64
 #  define __ARCH_WANT_SYS_TIME32
+# else
+#  define __ARCH_WANT_COMPAT_STAT
 # endif
 # ifdef CONFIG_MIPS32_O32
 #  define __ARCH_WANT_SYS_TIME32
diff --git a/arch/parisc/include/asm/unistd.h b/arch/parisc/include/asm/unistd.h
index cd438e4150f6..14e0668184cb 100644
--- a/arch/parisc/include/asm/unistd.h
+++ b/arch/parisc/include/asm/unistd.h
@@ -168,6 +168,7 @@ type name(type1 arg1, type2 arg2, type3 arg3, type4 arg4, 
type5 arg5)   \
 #define __ARCH_WANT_SYS_CLONE
 #define __ARCH_WANT_SYS_CLONE3
 #define __ARCH_WANT_COMPAT_SYS_SENDFILE
+#define __ARCH_WANT_COMPAT_STAT
 
 #ifdef CONFIG_64BIT
 #define __ARCH_WANT_SYS_TIME
diff --git a/arch/powerpc/include/asm/unistd.h 
b/arch/powerpc/include/asm/unistd.h
index 5eb462af6766..b1129b4ef57d 100644
--- a/arch/powerpc/include/asm/unistd.h
+++ b/arch/powerpc/include/asm/unistd.h
@@ -44,6 +44,7 @@
 #define __ARCH_WANT_SYS_TIME
 #define __ARCH_WANT_SYS_UTIME
 #define __ARCH_WANT_SYS_NEWFSTATAT
+#define __ARCH_WANT_COMPAT_STAT
 #define __ARCH_WANT_COMPAT_SYS_SENDFILE
 #endif
 #define __ARCH_WANT_SYS_FORK
diff --git a/arch/s390/include/asm/unistd.h b/arch/s390/include/asm/unistd.h
index 9e9f75ef046a..4260bc5ce7f8 100644
--- a/arch/s390/include/asm/unistd.h
+++ b/arch/s390/include/asm/unistd.h
@@ -28,6 +28,7 @@
 #define __ARCH_WANT_SYS_SIGPENDING
 #define __ARCH_WANT_SYS_SIGPROCMASK
 # ifdef CONFIG_COMPAT
+#   define __ARCH_WANT_COMPAT_STAT
 #   define __ARCH_WANT_SYS_TIME32
 #   define __ARCH_WANT_SYS_UTIME32
 # endif
diff --git a/arch/sparc/include/asm/unistd.h b/arch/sparc/include/asm/unistd.h
index 1e66278ba4a5..d6bc76706a7a 100644
--- a/arch/sparc/include/asm/unistd.h
+++ b/arch/sparc/include/asm/unistd.h
@@ -46,6 +46,7 @@
 #define __ARCH_WANT_SYS_TIME
 #define __ARCH_WANT_SYS_UTIME
 #define __ARCH_WANT_COMPAT_SYS_SENDFILE
+#define __ARCH_WANT_COMPAT_STAT
 #endif
 
 #ifdef __32bit_syscall_numbers__
diff --git a/arch/x86/include/asm/unistd.h b/arch/x86/include/asm/unistd.h
index 80e9d5206a71..761173ccc33c 100644
--- a/arch/x86/include/asm/unistd.h
+++ b/arch/x86/include/asm/unistd.h
@@ -22,6 +22,7 @@
 #  include 
 #  define __ARCH_WANT_SYS_TIME
 #  define __ARCH_WANT_SYS_UTIME
+#  define __ARCH_WANT_COMPAT_STAT
 #  define __ARCH_WANT_COMPAT_SYS_PREADV64
 #  define __ARCH_WANT_COMPAT_SYS_PWRITEV64
 #  define __ARCH_WANT_COMPAT_SYS_PREADV64V2
diff --git a/fs/stat.c b/fs/stat.c
index 28d2020ba1f4..ffdeb9065d53 100644
--- a/fs/stat.c
+++ b/fs/stat.c
@@ -639,7 +639,7 @@ SYSCALL_DEFINE5(statx,
return do_statx(dfd, filename, flags, mask, buffer);
 }
 
-#ifdef CONFIG_COMPAT
+#if defined(CONFIG_COMPAT) && defined(__ARCH_WANT_COMPAT_STAT)
 static int cp_compat_stat(struct kstat *stat, struct compat_stat __user *ubuf)
 {
struct compat_stat tmp;
-- 
2.25.1

[PATCH V5 04/21] kconfig: Add SYSVIPC_COMPAT for all architectures

From: Guo Ren 

The existing per-arch definitions are pretty much historic cruft.
Move SYSVIPC_COMPAT into init/Kconfig.

Signed-off-by: Guo Ren 
Signed-off-by: Guo Ren 
Acked-by: Arnd Bergmann 
Reviewed-by: Christoph Hellwig 
Cc: Palmer Dabbelt 
---
 arch/arm64/Kconfig   | 4 
 arch/mips/Kconfig| 5 -
 arch/parisc/Kconfig  | 4 
 arch/powerpc/Kconfig | 5 -
 arch/s390/Kconfig| 3 ---
 arch/sparc/Kconfig   | 5 -
 arch/x86/Kconfig | 4 
 init/Kconfig | 4 
 8 files changed, 4 insertions(+), 30 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index f2b5a4abef21..c6fe78fdd58b 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -2091,10 +2091,6 @@ config DMI
 
 endmenu
 
-config SYSVIPC_COMPAT
-   def_bool y
-   depends on COMPAT && SYSVIPC
-
 menu "Power management options"
 
 source "kernel/power/Kconfig"
diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index 058446f01487..91a17ad380c9 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -3170,16 +3170,12 @@ config MIPS32_COMPAT
 config COMPAT
bool
 
-config SYSVIPC_COMPAT
-   bool
-
 config MIPS32_O32
bool "Kernel support for o32 binaries"
depends on 64BIT
select ARCH_WANT_OLD_COMPAT_IPC
select COMPAT
select MIPS32_COMPAT
-   select SYSVIPC_COMPAT if SYSVIPC
help
  Select this option if you want to run o32 binaries.  These are pure
  32-bit binaries as used by the 32-bit Linux/MIPS port.  Most of
@@ -3193,7 +3189,6 @@ config MIPS32_N32
select ARCH_WANT_COMPAT_IPC_PARSE_VERSION
select COMPAT
select MIPS32_COMPAT
-   select SYSVIPC_COMPAT if SYSVIPC
help
  Select this option if you want to run n32 binaries.  These are
  64-bit binaries using 32-bit quantities for addressing and certain
diff --git a/arch/parisc/Kconfig b/arch/parisc/Kconfig
index 43c1c880def6..bc56759d44a2 100644
--- a/arch/parisc/Kconfig
+++ b/arch/parisc/Kconfig
@@ -345,10 +345,6 @@ config COMPAT
def_bool y
depends on 64BIT
 
-config SYSVIPC_COMPAT
-   def_bool y
-   depends on COMPAT && SYSVIPC
-
 config AUDIT_ARCH
def_bool y
 
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index b779603978e1..5a32b7f21af2 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -291,11 +291,6 @@ config COMPAT
select ARCH_WANT_OLD_COMPAT_IPC
select COMPAT_OLD_SIGACTION
 
-config SYSVIPC_COMPAT
-   bool
-   depends on COMPAT && SYSVIPC
-   default y
-
 config SCHED_OMIT_FRAME_POINTER
bool
default y
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index be9f39fd06df..80f69cafbb87 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -459,9 +459,6 @@ config COMPAT
  (and some other stuff like libraries and such) is needed for
  executing 31 bit applications.  It is safe to say "Y".
 
-config SYSVIPC_COMPAT
-   def_bool y if COMPAT && SYSVIPC
-
 config SMP
def_bool y
 
diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
index 1cab1b284f1a..15d5725bd623 100644
--- a/arch/sparc/Kconfig
+++ b/arch/sparc/Kconfig
@@ -488,9 +488,4 @@ config COMPAT
select ARCH_WANT_OLD_COMPAT_IPC
select COMPAT_OLD_SIGACTION
 
-config SYSVIPC_COMPAT
-   bool
-   depends on COMPAT && SYSVIPC
-   default y
-
 source "drivers/sbus/char/Kconfig"
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 9f5bd41bf660..7d0487189f6e 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2860,10 +2860,6 @@ config COMPAT
 if COMPAT
 config COMPAT_FOR_U64_ALIGNMENT
def_bool y
-
-config SYSVIPC_COMPAT
-   def_bool y
-   depends on SYSVIPC
 endif
 
 endmenu
diff --git a/init/Kconfig b/init/Kconfig
index e9119bf54b1f..589ccec56571 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -386,6 +386,10 @@ config SYSVIPC_SYSCTL
depends on SYSCTL
default y
 
+config SYSVIPC_COMPAT
+   def_bool y
+   depends on COMPAT && SYSVIPC
+
 config POSIX_MQUEUE
bool "POSIX Message Queues"
depends on NET
-- 
2.25.1

[PATCH V5 03/21] compat: consolidate the compat_flock{, 64} definition

From: Christoph Hellwig 

Provide a single common definition for the compat_flock and
compat_flock64 structures using the same tricks as for the native
variants.  Another extra define is added for the packing required on
x86.

Signed-off-by: Christoph Hellwig 
Signed-off-by: Guo Ren 
Reviewed-by: Arnd Bergmann 
---
 arch/arm64/include/asm/compat.h   | 16 
 arch/mips/include/asm/compat.h| 19 ++-
 arch/parisc/include/asm/compat.h  | 16 
 arch/powerpc/include/asm/compat.h | 16 
 arch/s390/include/asm/compat.h| 16 
 arch/sparc/include/asm/compat.h   | 18 +-
 arch/x86/include/asm/compat.h | 20 +++-
 include/linux/compat.h| 31 +++
 8 files changed, 37 insertions(+), 115 deletions(-)

diff --git a/arch/arm64/include/asm/compat.h b/arch/arm64/include/asm/compat.h
index 276328765408..e0faec1984a1 100644
--- a/arch/arm64/include/asm/compat.h
+++ b/arch/arm64/include/asm/compat.h
@@ -65,22 +65,6 @@ struct compat_stat {
compat_ulong_t  __unused4[2];
 };
 
-struct compat_flock {
-   short   l_type;
-   short   l_whence;
-   compat_off_tl_start;
-   compat_off_tl_len;
-   compat_pid_tl_pid;
-};
-
-struct compat_flock64 {
-   short   l_type;
-   short   l_whence;
-   compat_loff_t   l_start;
-   compat_loff_t   l_len;
-   compat_pid_tl_pid;
-};
-
 struct compat_statfs {
int f_type;
int f_bsize;
diff --git a/arch/mips/include/asm/compat.h b/arch/mips/include/asm/compat.h
index 6a350c1f70d7..6d6e5a451f4d 100644
--- a/arch/mips/include/asm/compat.h
+++ b/arch/mips/include/asm/compat.h
@@ -55,23 +55,8 @@ struct compat_stat {
s32 st_pad4[14];
 };
 
-struct compat_flock {
-   short   l_type;
-   short   l_whence;
-   compat_off_tl_start;
-   compat_off_tl_len;
-   s32 l_sysid;
-   compat_pid_tl_pid;
-   s32 pad[4];
-};
-
-struct compat_flock64 {
-   short   l_type;
-   short   l_whence;
-   compat_loff_t   l_start;
-   compat_loff_t   l_len;
-   compat_pid_tl_pid;
-};
+#define __ARCH_COMPAT_FLOCK_EXTRA_SYSIDs32 l_sysid;
+#define __ARCH_COMPAT_FLOCK_PADs32 pad[4];
 
 struct compat_statfs {
int f_type;
diff --git a/arch/parisc/include/asm/compat.h b/arch/parisc/include/asm/compat.h
index c04f5a637c39..a1e4534d8050 100644
--- a/arch/parisc/include/asm/compat.h
+++ b/arch/parisc/include/asm/compat.h
@@ -53,22 +53,6 @@ struct compat_stat {
u32 st_spare4[3];
 };
 
-struct compat_flock {
-   short   l_type;
-   short   l_whence;
-   compat_off_tl_start;
-   compat_off_tl_len;
-   compat_pid_tl_pid;
-};
-
-struct compat_flock64 {
-   short   l_type;
-   short   l_whence;
-   compat_loff_t   l_start;
-   compat_loff_t   l_len;
-   compat_pid_tl_pid;
-};
-
 struct compat_statfs {
s32 f_type;
s32 f_bsize;
diff --git a/arch/powerpc/include/asm/compat.h 
b/arch/powerpc/include/asm/compat.h
index 83d8f70779cb..5ef3c7c83c34 100644
--- a/arch/powerpc/include/asm/compat.h
+++ b/arch/powerpc/include/asm/compat.h
@@ -44,22 +44,6 @@ struct compat_stat {
u32 __unused4[2];
 };
 
-struct compat_flock {
-   short   l_type;
-   short   l_whence;
-   compat_off_tl_start;
-   compat_off_tl_len;
-   compat_pid_tl_pid;
-};
-
-struct compat_flock64 {
-   short   l_type;
-   short   l_whence;
-   compat_loff_t   l_start;
-   compat_loff_t   l_len;
-   compat_pid_tl_pid;
-};
-
 struct compat_statfs {
int f_type;
int f_bsize;
diff --git a/arch/s390/include/asm/compat.h b/arch/s390/include/asm/compat.h
index 0f14b3188b1b..07f04d37068b 100644
--- a/arch/s390/include/asm/compat.h
+++ b/arch/s390/include/asm/compat.h
@@ -102,22 +102,6 @@ struct compat_stat {
u32 __unused5;
 };
 
-struct compat_flock {
-   short   l_type;
-   short   l_whence;
-   compat_off_tl_start;
-   compat_off_tl_len;
-   compat_pid_tl_pid;
-};
-
-struct compat_flock64 {
-   short   l_type;
-   short   l_whence;
-   compat_loff_t   l_start;
-   compat_loff_t   l_len;
-   compat_pid_tl_pid;
-};
-
 struct compat_statfs {
u32 f_type;
u32 f_bsize;
diff --git a/arch/sparc/include/asm/compat.h b/arch/sparc/include/asm/compat.h
index 108078751bb5..d78fb44942e0 100644
--- a/arch/sparc/in

[PATCH V5 02/21] uapi: always define F_GETLK64/F_SETLK64/F_SETLKW64 in fcntl.h

From: Christoph Hellwig 

The F_GETLK64/F_SETLK64/F_SETLKW64 fcntl opcodes are only implemented
for the 32-bit syscall APIs, but are also needed for compat handling
on 64-bit kernels.

Consolidate them in unistd.h instead of definining the internal compat
definitions in compat.h, which is rather error prone (e.g. parisc
gets the values wrong currently).

Note that before this change they were never visible to userspace due
to the fact that CONFIG_64BIT is only set for kernel builds.

Signed-off-by: Christoph Hellwig 
Signed-off-by: Guo Ren 
Reviewed-by: Arnd Bergmann 
---
 arch/arm64/include/asm/compat.h| 4 
 arch/mips/include/asm/compat.h | 4 
 arch/mips/include/uapi/asm/fcntl.h | 4 ++--
 arch/powerpc/include/asm/compat.h  | 4 
 arch/s390/include/asm/compat.h | 4 
 arch/sparc/include/asm/compat.h| 4 
 arch/x86/include/asm/compat.h  | 4 
 include/uapi/asm-generic/fcntl.h   | 4 ++--
 tools/include/uapi/asm-generic/fcntl.h | 2 --
 9 files changed, 4 insertions(+), 30 deletions(-)

diff --git a/arch/arm64/include/asm/compat.h b/arch/arm64/include/asm/compat.h
index eaa6ca062d89..276328765408 100644
--- a/arch/arm64/include/asm/compat.h
+++ b/arch/arm64/include/asm/compat.h
@@ -73,10 +73,6 @@ struct compat_flock {
compat_pid_tl_pid;
 };
 
-#define F_GETLK64  12  /*  using 'struct flock64' */
-#define F_SETLK64  13
-#define F_SETLKW64 14
-
 struct compat_flock64 {
short   l_type;
short   l_whence;
diff --git a/arch/mips/include/asm/compat.h b/arch/mips/include/asm/compat.h
index bbb3bc5a42fd..6a350c1f70d7 100644
--- a/arch/mips/include/asm/compat.h
+++ b/arch/mips/include/asm/compat.h
@@ -65,10 +65,6 @@ struct compat_flock {
s32 pad[4];
 };
 
-#define F_GETLK64  33
-#define F_SETLK64  34
-#define F_SETLKW64 35
-
 struct compat_flock64 {
short   l_type;
short   l_whence;
diff --git a/arch/mips/include/uapi/asm/fcntl.h 
b/arch/mips/include/uapi/asm/fcntl.h
index 9e44ac810db9..0369a38e3d4f 100644
--- a/arch/mips/include/uapi/asm/fcntl.h
+++ b/arch/mips/include/uapi/asm/fcntl.h
@@ -44,11 +44,11 @@
 #define F_SETOWN   24  /*  for sockets. */
 #define F_GETOWN   23  /*  for sockets. */
 
-#ifndef __mips64
+#if __BITS_PER_LONG == 32 || defined(__KERNEL__)
 #define F_GETLK64  33  /*  using 'struct flock64' */
 #define F_SETLK64  34
 #define F_SETLKW64 35
-#endif
+#endif /* __BITS_PER_LONG == 32 || defined(__KERNEL__) */
 
 #if _MIPS_SIM != _MIPS_SIM_ABI64
 #define __ARCH_FLOCK_EXTRA_SYSID   long l_sysid;
diff --git a/arch/powerpc/include/asm/compat.h 
b/arch/powerpc/include/asm/compat.h
index 7afc96fb6524..83d8f70779cb 100644
--- a/arch/powerpc/include/asm/compat.h
+++ b/arch/powerpc/include/asm/compat.h
@@ -52,10 +52,6 @@ struct compat_flock {
compat_pid_tl_pid;
 };
 
-#define F_GETLK64  12  /*  using 'struct flock64' */
-#define F_SETLK64  13
-#define F_SETLKW64 14
-
 struct compat_flock64 {
short   l_type;
short   l_whence;
diff --git a/arch/s390/include/asm/compat.h b/arch/s390/include/asm/compat.h
index cdc7ae72529d..0f14b3188b1b 100644
--- a/arch/s390/include/asm/compat.h
+++ b/arch/s390/include/asm/compat.h
@@ -110,10 +110,6 @@ struct compat_flock {
compat_pid_tl_pid;
 };
 
-#define F_GETLK64   12
-#define F_SETLK64   13
-#define F_SETLKW64  14
-
 struct compat_flock64 {
short   l_type;
short   l_whence;
diff --git a/arch/sparc/include/asm/compat.h b/arch/sparc/include/asm/compat.h
index bd949fcf9d63..108078751bb5 100644
--- a/arch/sparc/include/asm/compat.h
+++ b/arch/sparc/include/asm/compat.h
@@ -84,10 +84,6 @@ struct compat_flock {
short   __unused;
 };
 
-#define F_GETLK64  12
-#define F_SETLK64  13
-#define F_SETLKW64 14
-
 struct compat_flock64 {
short   l_type;
short   l_whence;
diff --git a/arch/x86/include/asm/compat.h b/arch/x86/include/asm/compat.h
index 7516e4199b3c..8d19a212f4f2 100644
--- a/arch/x86/include/asm/compat.h
+++ b/arch/x86/include/asm/compat.h
@@ -58,10 +58,6 @@ struct compat_flock {
compat_pid_tl_pid;
 };
 
-#define F_GETLK64  12  /*  using 'struct flock64' */
-#define F_SETLK64  13
-#define F_SETLKW64 14
-
 /*
  * IA32 uses 4 byte alignment for 64 bit quantities,
  * so we need to pack this structure.
diff --git a/include/uapi/asm-generic/fcntl.h b/include/uapi/asm-generic/fcntl.h
index 77aa9f2ff98d..f13d37b60775 100644
--- a/include/uapi/asm-generic/fcntl.h
+++ b/include/uapi/asm-generic/fcntl.h
@@ -116,13 +116,13 @@
 #define F_GETSIG   11  /* for sockets. */
 #endif
 
-#ifndef CONFIG_64BIT
+#if __BITS_PER_LONG == 32 || defined(__KERNEL__)
 #ifndef F_GETLK64
 #define F_GETLK64  12  /*  using 'struct flock64' */
 #define F_SETLK64

[PATCH V5 01/21] uapi: simplify __ARCH_FLOCK{,64}_PAD a little

From: Christoph Hellwig 

Don't bother to define the symbols empty, just don't use them.
That makes the intent a little more clear.

Remove the unused HAVE_ARCH_STRUCT_FLOCK64 define and merge the
32-bit mips struct flock into the generic one.

Add a new __ARCH_FLOCK_EXTRA_SYSID macro following the style of
__ARCH_FLOCK_PAD to avoid having a separate definition just for
one architecture.

Signed-off-by: Christoph Hellwig 
Signed-off-by: Guo Ren 
Reviewed-by: Arnd Bergmann 
---
 arch/mips/include/uapi/asm/fcntl.h | 26 +++---
 include/uapi/asm-generic/fcntl.h   | 19 +++
 tools/include/uapi/asm-generic/fcntl.h | 19 +++
 3 files changed, 17 insertions(+), 47 deletions(-)

diff --git a/arch/mips/include/uapi/asm/fcntl.h 
b/arch/mips/include/uapi/asm/fcntl.h
index 42e13dead543..9e44ac810db9 100644
--- a/arch/mips/include/uapi/asm/fcntl.h
+++ b/arch/mips/include/uapi/asm/fcntl.h
@@ -50,30 +50,10 @@
 #define F_SETLKW64 35
 #endif
 
-/*
- * The flavours of struct flock.  "struct flock" is the ABI compliant
- * variant.  Finally struct flock64 is the LFS variant of struct flock.
 As
- * a historic accident and inconsistence with the ABI definition it doesn't
- * contain all the same fields as struct flock.
- */
-
 #if _MIPS_SIM != _MIPS_SIM_ABI64
-
-#include 
-
-struct flock {
-   short   l_type;
-   short   l_whence;
-   __kernel_off_t  l_start;
-   __kernel_off_t  l_len;
-   longl_sysid;
-   __kernel_pid_t l_pid;
-   longpad[4];
-};
-
-#define HAVE_ARCH_STRUCT_FLOCK
-
-#endif /* _MIPS_SIM == _MIPS_SIM_ABI32 */
+#define __ARCH_FLOCK_EXTRA_SYSID   long l_sysid;
+#define __ARCH_FLOCK_PAD   long pad[4];
+#endif
 
 #include 
 
diff --git a/include/uapi/asm-generic/fcntl.h b/include/uapi/asm-generic/fcntl.h
index ecd0f5bdfc1d..77aa9f2ff98d 100644
--- a/include/uapi/asm-generic/fcntl.h
+++ b/include/uapi/asm-generic/fcntl.h
@@ -192,25 +192,19 @@ struct f_owner_ex {
 
 #define F_LINUX_SPECIFIC_BASE  1024
 
-#ifndef HAVE_ARCH_STRUCT_FLOCK
-#ifndef __ARCH_FLOCK_PAD
-#define __ARCH_FLOCK_PAD
-#endif
-
 struct flock {
short   l_type;
short   l_whence;
__kernel_off_t  l_start;
__kernel_off_t  l_len;
__kernel_pid_t  l_pid;
-   __ARCH_FLOCK_PAD
-};
+#ifdef __ARCH_FLOCK_EXTRA_SYSID
+   __ARCH_FLOCK_EXTRA_SYSID
 #endif
-
-#ifndef HAVE_ARCH_STRUCT_FLOCK64
-#ifndef __ARCH_FLOCK64_PAD
-#define __ARCH_FLOCK64_PAD
+#ifdef __ARCH_FLOCK_PAD
+   __ARCH_FLOCK_PAD
 #endif
+};
 
 struct flock64 {
short  l_type;
@@ -218,8 +212,9 @@ struct flock64 {
__kernel_loff_t l_start;
__kernel_loff_t l_len;
__kernel_pid_t  l_pid;
+#ifdef __ARCH_FLOCK64_PAD
__ARCH_FLOCK64_PAD
-};
 #endif
+};
 
 #endif /* _ASM_GENERIC_FCNTL_H */
diff --git a/tools/include/uapi/asm-generic/fcntl.h 
b/tools/include/uapi/asm-generic/fcntl.h
index ac190958c981..99bc9b15ce2b 100644
--- a/tools/include/uapi/asm-generic/fcntl.h
+++ b/tools/include/uapi/asm-generic/fcntl.h
@@ -187,25 +187,19 @@ struct f_owner_ex {
 
 #define F_LINUX_SPECIFIC_BASE  1024
 
-#ifndef HAVE_ARCH_STRUCT_FLOCK
-#ifndef __ARCH_FLOCK_PAD
-#define __ARCH_FLOCK_PAD
-#endif
-
 struct flock {
short   l_type;
short   l_whence;
__kernel_off_t  l_start;
__kernel_off_t  l_len;
__kernel_pid_t  l_pid;
-   __ARCH_FLOCK_PAD
-};
+#ifdef __ARCH_FLOCK_EXTRA_SYSID
+   __ARCH_FLOCK_EXTRA_SYSID
 #endif
-
-#ifndef HAVE_ARCH_STRUCT_FLOCK64
-#ifndef __ARCH_FLOCK64_PAD
-#define __ARCH_FLOCK64_PAD
+#ifdef __ARCH_FLOCK_PAD
+   __ARCH_FLOCK_PAD
 #endif
+};
 
 struct flock64 {
short  l_type;
@@ -213,8 +207,9 @@ struct flock64 {
__kernel_loff_t l_start;
__kernel_loff_t l_len;
__kernel_pid_t  l_pid;
+#ifdef __ARCH_FLOCK64_PAD
__ARCH_FLOCK64_PAD
-};
 #endif
+};
 
 #endif /* _ASM_GENERIC_FCNTL_H */
-- 
2.25.1

[PATCH V5 00/21] riscv: compat: Add COMPAT mode support for rv64

From: Guo Ren 

Currently, most 64-bit architectures (x86, parisc, powerpc, arm64,
s390, mips, sparc) have supported COMPAT mode. But they all have
history issues and can't use standard linux unistd.h. RISC-V would
be first standard __SYSCALL_COMPAT user of include/uapi/asm-generic
/unistd.h.

The patchset are based on v5.17-rc2, you can compare rv64-compat32
v.s. rv32-whole in qemu with following step:

 - Prepare rv32 rootfs & fw_jump.bin by buildroot.org
   $ git clone git://git.busybox.net/buildroot
   $ cd buildroot
   $ make qemu_riscv32_virt_defconfig O=qemu_riscv32_virt_defconfig
   $ make -C qemu_riscv32_virt_defconfig
   $ make qemu_riscv64_virt_defconfig O=qemu_riscv64_virt_defconfig
   $ make -C qemu_riscv64_virt_defconfig
   (Got fw_jump.bin & rootfs.ext2 in qemu_riscvXX_virt_defconfig/images)

 - Prepare Linux rv32 & rv64 Image
   $ git clone g...@github.com:c-sky/csky-linux.git -b riscv_compat_v5 linux
   $ cd linux
   $ echo "CONFIG_STRICT_KERNEL_RWX=n" >> arch/riscv/configs/defconfig
   $ echo "CONFIG_STRICT_MODULE_RWX=n" >> arch/riscv/configs/defconfig
   $ make ARCH=riscv CROSS_COMPILE=riscv32-buildroot-linux-gnu- 
O=../build-rv32/ rv32_defconfig
   $ make ARCH=riscv CROSS_COMPILE=riscv32-buildroot-linux-gnu- 
O=../build-rv32/ Image
   $ make ARCH=riscv CROSS_COMPILE=riscv64-buildroot-linux-gnu- 
O=../build-rv64/ defconfig
   $ make ARCH=riscv CROSS_COMPILE=riscv64-buildroot-linux-gnu- 
O=../build-rv64/ Image

 - Prepare Qemu: (made by LIU Zhiwei )
   $ git clone g...@github.com:alistair23/qemu.git -b 
riscv-to-apply.for-upstream linux
   $ cd qemu
   $ ./configure --target-list="riscv64-softmmu riscv32-softmmu"
   $ make

Now let's compare rv32-compat with rv32-native memory footprint. Kernel with 
rv32 = rv64
defconfig, rootfs, opensbi, Qemu are the same.

 - Run rv64 with rv32 rootfs in compat mode:
   $ ./build/qemu-system-riscv64 -cpu rv64,x-h=true -M virt -m 64m -nographic 
-bios qemu_riscv64_virt_defconfig/images/fw_jump.bin -kernel build-rv64/Image 
-drive file qemu_riscv32_virt_defconfig/images/rootfs.ext2,format=raw,id=hd0 
-device virtio-blk-device,drive=hd0 -append "rootwait root=/dev/vda ro 
console=ttyS0 earlycon=sbi" -netdev user,id=net0 -device 
virtio-net-device,netdev=net0

QEMU emulator version 6.2.50 (v6.2.0-29-g196d7182c8)
OpenSBI v0.9
[0.00] Linux version 5.16.0-rc6-00017-g750f87086bdd-dirty 
(guoren@guoren-Z87-HD3) (riscv64-unknown-linux-gnu-gcc (GCC) 10.2.0, GNU ld 
(GNU Binutils) 2.37) #96 SMP Tue Dec 28 21:01:55 CST 2021
[0.00] OF: fdt: Ignoring memory range 0x8000 - 0x8020
[0.00] Machine model: riscv-virtio,qemu
[0.00] earlycon: sbi0 at I/O port 0x0 (options '')
[0.00] printk: bootconsole [sbi0] enabled
[0.00] efi: UEFI not found.
[0.00] Zone ranges:
[0.00]   DMA32[mem 0x8020-0x83ff]
[0.00]   Normal   empty
[0.00] Movable zone start for each node
[0.00] Early memory node ranges
[0.00]   node   0: [mem 0x8020-0x83ff]
[0.00] Initmem setup node 0 [mem 0x8020-0x83ff]
[0.00] SBI specification v0.2 detected
[0.00] SBI implementation ID=0x1 Version=0x9
[0.00] SBI TIME extension detected
[0.00] SBI IPI extension detected
[0.00] SBI RFENCE extension detected
[0.00] SBI v0.2 HSM extension detected
[0.00] riscv: ISA extensions acdfhimsu
[0.00] riscv: ELF capabilities acdfim
[0.00] percpu: Embedded 17 pages/cpu s30696 r8192 d30744 u69632
[0.00] Built 1 zonelists, mobility grouping on.  Total pages: 15655
[0.00] Kernel command line: rootwait root=/dev/vda ro console=ttyS0 
earlycon=sbi
[0.00] Dentry cache hash table entries: 8192 (order: 4, 65536 bytes, 
linear)
[0.00] Inode-cache hash table entries: 4096 (order: 3, 32768 bytes, 
linear)
[0.00] mem auto-init: stack:off, heap alloc:off, heap free:off
[0.00] Virtual kernel memory layout:
[0.00]   fixmap : 0xffcefee0 - 0xffceff00   (2048 
kB)
[0.00]   pci io : 0xffceff00 - 0xffcf   (  16 
MB)
[0.00]  vmemmap : 0xffcf - 0xffcf   (4095 
MB)
[0.00]  vmalloc : 0xffd0 - 0xffdf   (65535 
MB)
[0.00]   lowmem : 0xffe0 - 0xffe003e0   (  62 
MB)
[0.00]   kernel : 0x8000 - 0x   (2047 
MB)
[0.00] Memory: 52788K/63488K available (6184K kernel code, 888K rwdata, 
1917K rodata, 294K init, 297K bss, 10700K reserved, 0K cma-reserved)
[0.00] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
[0.00] rcu: Hierarchical RCU implementation.
[0.00] rcu: RCU restricting CPUs from NR_CPUS=8 to nr_cpu_ids=1.
[0.00] rcu: RCU debug extended QS entry/exit.
[0.00]  Tracing variant of Tasks RC

Re: [PATCH v7 0/5] Allow guest access to EFI confidential computing secret area

2022-02-01 Thread James Bottomley

On Tue, 2022-02-01 at 15:41 +0100, Greg KH wrote:
> On Tue, Feb 01, 2022 at 09:24:50AM -0500, James Bottomley wrote:
> > [cc's added]
> > On Tue, 2022-02-01 at 14:50 +0100, Greg KH wrote:
[...]
> > > You all need to work together to come up with a unified place for
> > > this and stop making it platform-specific.
> > 
> > I'm not entirely sure of that.  If you look at the differences
> > between EFI variables and the COCO proposal: the former has an
> > update API which, in the case of signed variables, is rather
> > complex and a UC16 content requirement.  The latter is binary data
> > with read only/delete.  Plus each variable in EFI is described by a
> > GUID, so having a directory of random guids, some of which behave
> > like COCO secrets and some of which are EFI variables is going to
> > be incredibly confusing (and also break all our current listing
> > tools which seems somewhat undesirable).
> > 
> > So we could end up with 
> > 
> > /efivar
> > /coco
> 
> The powerpc stuff is not efi.  But yes, that is messy here.  But why
> doesn't the powerpc follow the coco standard?

There is no coco standard for EFI variables.  There's only a UEFI
variable standard which, I believe, power tries to follow in some
measure since the variables are mostly used for its version of secure
boot.  Certainly you're either a power or UEFI platform but not both,
so they could live at the same location ... that's not true with the
coco ones.  I added the cc's to see if there are other ideas, but I
really think the use cases are too disjoint.

As Daniel has previously proposed, it might be possible to unify the
power and UEFI implementations ... useful if we want them to respond to
the same tooling, but we'll do that by giving them the same EFI
semantics.  The semantics and source of the coco secrets will still be
immutable and completely alien to whatever backend does the non
volatile power/efi authenticated variables, so we'll still need two
different backends and then it's just a question of arguing about path,
which doesn't make sense as a blocker.

> > To achieve the separation, but I really don't see what this buys
> > us.  Both filesystems would likely end up with different backends
> > because of the semantic differences and we can easily start now in
> > different places (effectively we've already done this for efi
> > variables) and unify later if that is the chosen direction, so it
> > doesn't look like a blocker.
> > 
> > > Until then, we can't take this.
> > 
> > I don't believe anyone was asking you to take it.
> 
> I was on the review list...

You raised a doc/API concenrn.  I think you were on the review list to
ensure it got addressed.

James

Re: [PATCH v7 0/5] Allow guest access to EFI confidential computing secret area

On Tue, Feb 01, 2022 at 09:24:50AM -0500, James Bottomley wrote:
> [cc's added]
> On Tue, 2022-02-01 at 14:50 +0100, Greg KH wrote:
> > On Tue, Feb 01, 2022 at 12:44:08PM +, Dov Murik wrote:
> [...]
> > > # ls -la /sys/kernel/security/coco/efi_secret
> > > total 0
> > > drwxr-xr-x 2 root root 0 Jun 28 11:55 .
> > > drwxr-xr-x 3 root root 0 Jun 28 11:54 ..
> > > -r--r- 1 root root 0 Jun 28 11:54 736870e5-84f0-4973-92ec-
> > > 06879ce3da0b
> > > -r--r- 1 root root 0 Jun 28 11:54 83c83f7f-1356-4975-8b7e-
> > > d3a0b54312c6
> > > -r--r- 1 root root 0 Jun 28 11:54 9553f55d-3da2-43ee-ab5d-
> > > ff17f78864d2
> > 
> > Please see my comments on the powerpc version of this type of thing:
> > 
> > https://lore.kernel.org/r/20220122005637.28199-1-na...@linux.ibm.com
> 
> If you want a debate, actually cc'ing the people on the other thread
> would have been a good start ...
> 
> For those added, this patch series is at:
> 
> https://lore.kernel.org/all/20220201124413.1093099-1-dovmu...@linux.ibm.com/

Thanks for adding everyone.

> > You all need to work together to come up with a unified place for
> > this and stop making it platform-specific.
> 
> I'm not entirely sure of that.  If you look at the differences between
> EFI variables and the COCO proposal: the former has an update API
> which, in the case of signed variables, is rather complex and a UC16
> content requirement.  The latter is binary data with read only/delete. 
> Plus each variable in EFI is described by a GUID, so having a directory
> of random guids, some of which behave like COCO secrets and some of
> which are EFI variables is going to be incredibly confusing (and also
> break all our current listing tools which seems somewhat undesirable).
> 
> So we could end up with 
> 
> /efivar
> /coco

The powerpc stuff is not efi.  But yes, that is messy here.  But why
doesn't the powerpc follow the coco standard?

> To achieve the separation, but I really don't see what this buys us. 
> Both filesystems would likely end up with different backends because of
> the semantic differences and we can easily start now in different
> places (effectively we've already done this for efi variables) and
> unify later if that is the chosen direction, so it doesn't look like a
> blocker.
> 
> > Until then, we can't take this.
> 
> I don't believe anyone was asking you to take it.

I was on the review list...

Re: [PATCH v7 0/5] Allow guest access to EFI confidential computing secret area

2022-02-01 Thread James Bottomley

[cc's added]
On Tue, 2022-02-01 at 14:50 +0100, Greg KH wrote:
> On Tue, Feb 01, 2022 at 12:44:08PM +, Dov Murik wrote:
[...]
> > # ls -la /sys/kernel/security/coco/efi_secret
> > total 0
> > drwxr-xr-x 2 root root 0 Jun 28 11:55 .
> > drwxr-xr-x 3 root root 0 Jun 28 11:54 ..
> > -r--r- 1 root root 0 Jun 28 11:54 736870e5-84f0-4973-92ec-
> > 06879ce3da0b
> > -r--r- 1 root root 0 Jun 28 11:54 83c83f7f-1356-4975-8b7e-
> > d3a0b54312c6
> > -r--r- 1 root root 0 Jun 28 11:54 9553f55d-3da2-43ee-ab5d-
> > ff17f78864d2
> 
> Please see my comments on the powerpc version of this type of thing:
>   
> https://lore.kernel.org/r/20220122005637.28199-1-na...@linux.ibm.com

If you want a debate, actually cc'ing the people on the other thread
would have been a good start ...

For those added, this patch series is at:

https://lore.kernel.org/all/20220201124413.1093099-1-dovmu...@linux.ibm.com/

> You all need to work together to come up with a unified place for
> this and stop making it platform-specific.

I'm not entirely sure of that.  If you look at the differences between
EFI variables and the COCO proposal: the former has an update API
which, in the case of signed variables, is rather complex and a UC16
content requirement.  The latter is binary data with read only/delete. 
Plus each variable in EFI is described by a GUID, so having a directory
of random guids, some of which behave like COCO secrets and some of
which are EFI variables is going to be incredibly confusing (and also
break all our current listing tools which seems somewhat undesirable).

So we could end up with 

/efivar
/coco

To achieve the separation, but I really don't see what this buys us. 
Both filesystems would likely end up with different backends because of
the semantic differences and we can easily start now in different
places (effectively we've already done this for efi variables) and
unify later if that is the chosen direction, so it doesn't look like a
blocker.

> Until then, we can't take this.

I don't believe anyone was asking you to take it.

James

Re: [PATCH V4 16/17] riscv: compat: Add COMPAT Kbuild skeletal support

On Tue, Feb 1, 2022 at 7:48 PM Arnd Bergmann  wrote:
>
> On Tue, Feb 1, 2022 at 11:26 AM Guo Ren  wrote:
> >
> > Hi Arnd & Christoph,
> >
> > The UXL field controls the value of XLEN for U-mode, termed UXLEN,
> > which may differ from the
> > value of XLEN for S-mode, termed SXLEN. The encoding of UXL is the
> > same as that of the MXL
> > field of misa, shown in Table 3.1.
> >
> > Here is the patch. (We needn't exception helper, because we are in
> > S-mode and UXL wouldn't affect.)
>
> Looks good to me, just a few details that could be improved
>
> > -#define compat_elf_check_arch(x) ((x)->e_machine == EM_RISCV)
> > +#ifdef CONFIG_COMPAT
> > +#define compat_elf_check_arch compat_elf_check_arch
> > +extern bool compat_elf_check_arch(Elf32_Ehdr *hdr);
> > +#endif
>
> No need for the #ifdef
Okay

> > +}
>
> > +void compat_mode_detect(void)
>
> __init
Okay

>
> > +{
> > + unsigned long tmp = csr_read(CSR_STATUS);
> > + csr_write(CSR_STATUS, (tmp & ~SR_UXL) | SR_UXL_32);
> > +
> > + if ((csr_read(CSR_STATUS) & SR_UXL) != SR_UXL_32) {
> > + csr_write(CSR_STATUS, tmp);
> > + return;
> > + }
> > +
> > + csr_write(CSR_STATUS, tmp);
> > + compat_mode_support = true;
> > +
> > + pr_info("riscv: compat: 32bit U-mode applications support\n");
> > +}
>
> I think an entry in /proc/cpuinfo would be more helpful than the pr_info at
> boot time. Maybe a follow-up patch though, as there is no obvious place
> to put it. On other architectures, you typically have a set of space
> separated feature names, but riscv has a single string that describes
> the ISA, and this feature is technically the support for a second ISA.
Yes, it should be another patch after discussion.

>
>  Arnd



-- 
Best Regards
 Guo Ren

ML: https://lore.kernel.org/linux-csky/

Re: [RFC PATCH 0/2] powerpc/pseries: add support for local secure storage called Platform Keystore(PKS)

On Mon, Jan 24, 2022 at 11:25:17AM +1100, Daniel Axtens wrote:
> Hi Greg,
> 
> > Ok, this is like the 3rd or 4th different platform-specific proposal for
> > this type of functionality.  I think we need to give up on
> > platform-specific user/kernel apis on this (random sysfs/securityfs
> > files scattered around the tree), and come up with a standard place for
> > all of this.
> 
> I agree that we do have a number of platforms exposing superficially
> similar functionality. Indeed, back in 2019 I had a crack at a unified
> approach: [1] [2].
> 
> Looking back at it now, I am not sure it ever would have worked because
> the semantics of the underlying firmware stores are quite
> different. Here are the ones I know about:
> 
>  - OpenPower/PowerNV Secure Variables:
>  
>* Firmware semantics:
>  - flat variable space
>  - variables are fixed in firmware, can neither be created nor
> destroyed
>  - variable names are ASCII
>  - no concept of policy/attributes
>  
>* Current kernel interface semantics:
>  - names are case sensitive
>  - directory per variable 
> 
> 
>  - (U)EFI variables:
>  
>* Firmware semantics:
>  - flat variable space
>  - variables can be created/destroyed but the semantics are fiddly
> [3]
>  - variable names are UTF-16 + UUID
>  - variables have 32-bit attributes
> 
>* efivarfs interface semantics:
>  - file per variable
>  - attributes are the first 4 bytes of the file
>  - names are partially case-insensitive (UUID part) and partially
> case-sensitive ('name' part)
> 
>* sysfs interface semantics (as used by CONFIG_GOOGLE_SMI)
>  - directory per variable
>  - attributes are a separate sysfs file
>  - to create a variable you write a serialised structure to
> `/sys/firmware/efi/vars/new_var`, to delete a var you write
> to `.../del_var`
>  - names are case-sensitive including the UUID
> 
> 
>  - PowerVM Partition Key Store Variables:
>  
>* Firmware semantics:
>  - _not_ a flat space, there are 3 domains ("consumers"): firmware,
> bootloader and OS (not yet supported by the patch set)
>  - variables can be created and destroyed but the semantics are
> fiddly and fiddly in different ways to UEFI [4]
>  - variable names are arbitrary byte strings: the hypervisor permits
> names to contain nul and /.
>  - variables have 32-bit attributes ("policy") that don't align with
> UEFI attributes
> 
>* No stable kernel interface yet
> 
> Even if we could come up with some stable kernel interface features
> (e.g. decide if we want file per variable vs directory per variable), I
> don't know how easy it would be to deal with the underlying semantic
> differences - I think userspace would still need substantial
> per-platform knowledge.
> 
> Or have I misunderstood what you're asking for? (If you want them all to
> live under /sys/firmware, these ones all already do...)

I want them to be unified in some way, right now there are lots of
proposals for the same type of thing, but in different places (sysfs,
securityfs, somewhere else), and with different names.

Please work together.

thanks,

greg k-h

Re: [PATCH] powerpc/xive: Add some error handling code to 'xive_spapr_init()'



Le 01/02/2022 à 13:32, Christophe JAILLET a écrit :
> Le 01/02/2022 à 12:31, Christophe Leroy a écrit :
>> Hi,
>>
>> Le 01/08/2019 à 13:09, Christophe JAILLET a écrit :
>>> 'xive_irq_bitmap_add()' can return -ENOMEM.
>>> In this case, we should free the memory already allocated and return
>>> 'false' to the caller.
>>>
>>> Also add an error path which undoes the 'tima = ioremap(...)'
>>
>> This old patch doesn't apply, if it is still relevant can you please 
>> rebase ?
>>
>> Thanks
>> Christophe
>>
> 
> Hi, funny to see a 2 1/2 years old patch to pop-up like that :)
> It still looks relevant to me.

Yeah I'm trying to clean some dust in Patchwork.

> 
> V2 sent.
> Still not compile tested.
> 

At least it's all green at 
https://patchwork.ozlabs.org/project/linuxppc-dev/patch/564998101804886b151235c8a9f93020923bfd2c.1643718324.git.christophe.jail...@wanadoo.fr/

Christophe

Re: [PATCH v2] powerpc/xive: Add some error handling code to 'xive_spapr_init()'

2022-02-01 Thread Cédric Le Goater


On 2/1/22 13:31, Christophe JAILLET wrote:

'xive_irq_bitmap_add()' can return -ENOMEM.
In this case, we should free the memory already allocated and return
'false' to the caller.

Also add an error path which undoes the 'tima = ioremap(...)'

Signed-off-by: Christophe JAILLET 


Reviewed-by: Cédric Le Goater 


---
NOT compile tested (I don't have a cross compiler and won't install one).
So if some correction or improvement are needed, feel free to propose and
commit it directly.



A cross compiler takes a couple of seconds to install on any distro.
It takes a little more to compile the pseries defconfig. To test with
QEMU, grab the disk image here :

  
https://github.com/legoater/qemu-ppc-boot/blob/main/buildroot/qemu_ppc64le_pseries-2021.11-7-g3058e75456-20211206

run :

  qemu-system-ppc64 -M pseries -cpu POWER9 -kernel vmlinux -append "console=hvc0 
rootwait root=/dev/sda" -drive file=rootfs.ext2,if=scsi,index=0,format=raw 
-nographic -net nic -net user -serial mon:stdio

and you will have a pseries machine with network and disk using the
XIVE interrupt controller.

To get more info on the genirq layer and the XIVE driver, simply
append :

  dyndbg="file arch/powerpc/sysdev/xive/* +p; file kernel/irq/* +p"

Thanks,

C.

Re: [PATCH] powerpc/xive: Add some error handling code to 'xive_spapr_init()'

2022-02-01 Thread Christophe JAILLET


Le 01/02/2022 à 12:31, Christophe Leroy a écrit :

Hi,

Le 01/08/2019 à 13:09, Christophe JAILLET a écrit :

'xive_irq_bitmap_add()' can return -ENOMEM.
In this case, we should free the memory already allocated and return
'false' to the caller.

Also add an error path which undoes the 'tima = ioremap(...)'


This old patch doesn't apply, if it is still relevant can you please 
rebase ?


Thanks
Christophe



Hi, funny to see a 2 1/2 years old patch to pop-up like that :)
It still looks relevant to me.

V2 sent.
Still not compile tested.

CJ

[PATCH v2] powerpc/xive: Add some error handling code to 'xive_spapr_init()'

2022-02-01 Thread Christophe JAILLET

'xive_irq_bitmap_add()' can return -ENOMEM.
In this case, we should free the memory already allocated and return
'false' to the caller.

Also add an error path which undoes the 'tima = ioremap(...)'

Signed-off-by: Christophe JAILLET 
---
NOT compile tested (I don't have a cross compiler and won't install one).
So if some correction or improvement are needed, feel free to propose and
commit it directly.

v2: rebase with latest -next
---
 arch/powerpc/sysdev/xive/spapr.c | 36 +---
 1 file changed, 28 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/sysdev/xive/spapr.c b/arch/powerpc/sysdev/xive/spapr.c
index 928f95004501..29456c255f9f 100644
--- a/arch/powerpc/sysdev/xive/spapr.c
+++ b/arch/powerpc/sysdev/xive/spapr.c
@@ -67,6 +67,17 @@ static int __init xive_irq_bitmap_add(int base, int count)
return 0;
 }
 
+static void xive_irq_bitmap_remove_all(void)
+{
+   struct xive_irq_bitmap *xibm, *tmp;
+
+   list_for_each_entry_safe(xibm, tmp, &xive_irq_bitmaps, list) {
+   list_del(&xibm->list);
+   kfree(xibm->bitmap);
+   kfree(xibm);
+   }
+}
+
 static int __xive_irq_bitmap_alloc(struct xive_irq_bitmap *xibm)
 {
int irq;
@@ -803,7 +814,7 @@ bool __init xive_spapr_init(void)
u32 val;
u32 len;
const __be32 *reg;
-   int i;
+   int i, err;
 
if (xive_spapr_disabled())
return false;
@@ -828,23 +839,26 @@ bool __init xive_spapr_init(void)
}
 
if (!xive_get_max_prio(&max_prio))
-   return false;
+   goto err_unmap;
 
/* Feed the IRQ number allocator with the ranges given in the DT */
reg = of_get_property(np, "ibm,xive-lisn-ranges", &len);
if (!reg) {
pr_err("Failed to read 'ibm,xive-lisn-ranges' property\n");
-   return false;
+   goto err_unmap;
}
 
if (len % (2 * sizeof(u32)) != 0) {
pr_err("invalid 'ibm,xive-lisn-ranges' property\n");
-   return false;
+   goto err_unmap;
}
 
-   for (i = 0; i < len / (2 * sizeof(u32)); i++, reg += 2)
-   xive_irq_bitmap_add(be32_to_cpu(reg[0]),
-   be32_to_cpu(reg[1]));
+   for (i = 0; i < len / (2 * sizeof(u32)); i++, reg += 2) {
+   err = xive_irq_bitmap_add(be32_to_cpu(reg[0]),
+ be32_to_cpu(reg[1]));
+   if (err < 0)
+   goto err_mem_free;
+   }
 
/* Iterate the EQ sizes and pick one */
of_property_for_each_u32(np, "ibm,xive-eq-sizes", prop, reg, val) {
@@ -855,10 +869,16 @@ bool __init xive_spapr_init(void)
 
/* Initialize XIVE core with our backend */
if (!xive_core_init(np, &xive_spapr_ops, tima, TM_QW1_OS, max_prio))
-   return false;
+   goto err_mem_free;
 
pr_info("Using %dkB queues\n", 1 << (xive_queue_shift - 10));
return true;
+
+err_mem_free:
+   xive_irq_bitmap_remove_all();
+err_unmap:
+   iounmap(tima);
+   return false;
 }
 
 machine_arch_initcall(pseries, xive_core_debug_init);
-- 
2.32.0

Re: [PATCH] powerpc/kasan: Fix early region not updated correctly



Le 29/12/2021 à 04:52, Chen Jingwen a écrit :
> The shadow's page table is not updated when PTE_RPN_SHIFT is 24
> and PAGE_SHIFT is 12. It not only causes false positives but
> also false negative as shown the following text.
> 
> Fix it by bringing the logic of kasan_early_shadow_page_entry here.
> 
> 1. False Positive:
> ==
> BUG: KASAN: vmalloc-out-of-bounds in pcpu_alloc+0x508/0xa50
> Write of size 16 at addr f57f3be0 by task swapper/0/1
> 
> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.15.0-12267-gdebe436e77c7 #1
> Call Trace:
> [c80d1c20] [c07fe7b8] dump_stack_lvl+0x4c/0x6c (unreliable)
> [c80d1c40] [c02ff668] print_address_description.constprop.0+0x88/0x300
> [c80d1c70] [c02ff45c] kasan_report+0x1ec/0x200
> [c80d1cb0] [c0300b20] kasan_check_range+0x160/0x2f0
> [c80d1cc0] [c03018a4] memset+0x34/0x90
> [c80d1ce0] [c0280108] pcpu_alloc+0x508/0xa50
> [c80d1d40] [c02fd7bc] __kmem_cache_create+0xfc/0x570
> [c80d1d70] [c0283d64] kmem_cache_create_usercopy+0x274/0x3e0
> [c80d1db0] [c2036580] init_sd+0xc4/0x1d0
> [c80d1de0] [c00044a0] do_one_initcall+0xc0/0x33c
> [c80d1eb0] [c2001624] kernel_init_freeable+0x2c8/0x384
> [c80d1ef0] [c0004b14] kernel_init+0x24/0x170
> [c80d1f10] [c001b26c] ret_from_kernel_thread+0x5c/0x64
> 
> Memory state around the buggy address:
>   f57f3a80: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
>   f57f3b00: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
>> f57f3b80: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
> ^
>   f57f3c00: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
>   f57f3c80: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
> ==
> 
> 2. False Negative (with KASAN tests):
> ==
> Before fix:
>  ok 45 - kmalloc_double_kzfree
>  # vmalloc_oob: EXPECTATION FAILED at lib/test_kasan.c:1039
>  KASAN failure expected in "((volatile char *)area)[3100]", but none 
> occurred
>  not ok 46 - vmalloc_oob
>  not ok 1 - kasan
> 
> ==
> After fix:
>  ok 1 - kasan
> 
> Fixes: cbd18991e24fe ("powerpc/mm: Fix an Oops in kasan_mmu_init()")
> Cc: sta...@vger.kernel.org # 5.4.x
> Signed-off-by: Chen Jingwen 

Reviewed-by: Christophe Leroy 

> ---
>   arch/powerpc/mm/kasan/kasan_init_32.c | 3 +--
>   1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/mm/kasan/kasan_init_32.c 
> b/arch/powerpc/mm/kasan/kasan_init_32.c
> index cf8770b1a692e..f3e4d069e0ba7 100644
> --- a/arch/powerpc/mm/kasan/kasan_init_32.c
> +++ b/arch/powerpc/mm/kasan/kasan_init_32.c
> @@ -83,13 +83,12 @@ void __init
>   kasan_update_early_region(unsigned long k_start, unsigned long k_end, pte_t 
> pte)
>   {
>   unsigned long k_cur;
> - phys_addr_t pa = __pa(kasan_early_shadow_page);
>   
>   for (k_cur = k_start; k_cur != k_end; k_cur += PAGE_SIZE) {
>   pmd_t *pmd = pmd_off_k(k_cur);
>   pte_t *ptep = pte_offset_kernel(pmd, k_cur);
>   
> - if ((pte_val(*ptep) & PTE_RPN_MASK) != pa)
> + if (pte_page(*ptep) != 
> virt_to_page(lm_alias(kasan_early_shadow_page)))
>   continue;
>   
>   __set_pte_at(&init_mm, k_cur, ptep, pte, 0);

Re: [PATCH] powerpc/ptdump: Fix sparse warning in hashpagetable.c

Christophe Leroy  writes:
>   arch/powerpc/mm/ptdump/hashpagetable.c:264:29: warning: restricted __be64 
> degrades to integer
>   arch/powerpc/mm/ptdump/hashpagetable.c:265:49: warning: restricted __be64 
> degrades to integer
>   arch/powerpc/mm/ptdump/hashpagetable.c:267:36: warning: incorrect type in 
> assignment (different base types)
>   arch/powerpc/mm/ptdump/hashpagetable.c:267:36:expected unsigned long 
> long [usertype]
>   arch/powerpc/mm/ptdump/hashpagetable.c:267:36:got restricted __be64 
> [usertype] v
>   arch/powerpc/mm/ptdump/hashpagetable.c:268:36: warning: incorrect type in 
> assignment (different base types)
>   arch/powerpc/mm/ptdump/hashpagetable.c:268:36:expected unsigned long 
> long [usertype]
>   arch/powerpc/mm/ptdump/hashpagetable.c:268:36:got restricted __be64 
> [usertype] r
>
> struct hash_pte fields have type __be64. Convert them to
> regular long before using them.

Your patch changes one side of the comparison but not the other, which
implies the code doesn't work at the moment, ie. it should never be
matching.

But it does work at the moment, so there must be something else going on.

> diff --git a/arch/powerpc/mm/ptdump/hashpagetable.c 
> b/arch/powerpc/mm/ptdump/hashpagetable.c
> index c7f824d294b2..bf60ab1bedb9 100644
> --- a/arch/powerpc/mm/ptdump/hashpagetable.c
> +++ b/arch/powerpc/mm/ptdump/hashpagetable.c
> @@ -261,11 +261,11 @@ static int pseries_find(unsigned long ea, int psize, 
> bool primary, u64 *v, u64 *

Expanding the context a little:

for (i = 0; i < HPTES_PER_GROUP; i += 4, hpte_group += 4) {
lpar_rc = plpar_pte_read_4(0, hpte_group, (void *)ptes);

>   if (lpar_rc)
>   continue;
>   for (j = 0; j < 4; j++) {
> - if (HPTE_V_COMPARE(ptes[j].v, want_v) &&
> - (ptes[j].v & HPTE_V_VALID)) {
> + if (HPTE_V_COMPARE(be64_to_cpu(ptes[j].v), want_v) &&
> + (be64_to_cpu(ptes[j].v) & HPTE_V_VALID)) {
>   /* HPTE matches */
> - *v = ptes[j].v;
> - *r = ptes[j].r;
> + *v = be64_to_cpu(ptes[j].v);
> + *r = be64_to_cpu(ptes[j].r);
>   return 0;
>   }
>   }

Turns out the values returned from plpar_pte_read_4() are already in CPU
endian.

We pass an on-stack buffer to plpar_pte_read_4():

  plpar_pte_read_4(0, hpte_group, (void *)ptes);

Which makes it look like the hypercall is writing to memory (our
buffer), so we'd expect the values to need an endian swap.

But plpar_pte_read_4() writes into that buffer from another on-stack
buffer:

static inline long plpar_pte_read_4(unsigned long flags, unsigned long ptex,
unsigned long *ptes)

{
long rc;
unsigned long retbuf[PLPAR_HCALL9_BUFSIZE];

rc = plpar_hcall9(H_READ, retbuf, flags | H_READ_4, ptex);

memcpy(ptes, retbuf, 8*sizeof(unsigned long));

return rc;
}


And the values in that stack buffer are actually returned from the
hypervisor in registers, r4-r11, and written into retbuf by the asm
wrapper:

_GLOBAL_TOC(plpar_hcall9)
HMT_MEDIUM

mfcrr0
stw r0,8(r1)

HCALL_BRANCH(plpar_hcall9_trace)

std r4,STK_PARAM(R4)(r1) /* Save ret buffer */  <- this 
is retbuf

mr  r4,r5
mr  r5,r6
mr  r6,r7
mr  r7,r8
mr  r8,r9
mr  r9,r10
ld  r10,STK_PARAM(R11)(r1)   /* put arg7 in R10 */
ld  r11,STK_PARAM(R12)(r1)   /* put arg8 in R11 */
ld  r12,STK_PARAM(R13)(r1)/* put arg9 in R12 */

HVSC/* invoke the hypervisor */

mr  r0,r12
ld  r12,STK_PARAM(R4)(r1)   <- reload retbuf into r12
std r4,  0(r12)
std r5,  8(r12)
std r6, 16(r12)
std r7, 24(r12)
std r8, 32(r12)
std r9, 40(r12)
std r10,48(r12)
std r11,56(r12)
std r0, 64(r12)


Although the values are BE in memory in the actual HPT, they're read by
the hypervisor which does the byte swap for us, and then when the
hypervisor returns they're returned in the registers. So there's no
extra byte swap needed.

Possibly we should move struct hash_pte into hash_native.c, which is
where it's almost exclusively used, and is used to point to actual HPTEs
in memory.

But for now I think the patch below is a minimal fix for this sparse
warning, it's what other callers of plpar_pte_read_4() are doing.

cheers


diff --git a/arch/powerpc/mm/ptdump/hashpagetable.c 
b/arch/powerpc/mm/ptdump/hashpagetable.c
index c7f824d294b2..9a601587836b 100644
--- a/arch/powerpc/mm/ptdump/hashpagetable.c
+++ b/arch/powerpc

Re: microwatt booting linux-5.7 under verilator

2022-02-01 Thread Luke Kenneth Casson Leighton

On Tue, Feb 1, 2022 at 11:53 AM Michael Ellerman  wrote:

> If you build with CONFIG_RELOCATABLE=y and CONFIG_RELOCATABLE_TEST=y the
> kernel will run wherever you load it (must be 64K aligned), without
> copying itself down to zero first. That will save you a few cycles.

ahh, thank you :)

l.

Re: [Oops][next-20220128] 5.17.0-rc1 kernel panics on my powerpc box

Andy Shevchenko  writes:
> On Mon, Jan 31, 2022 at 12:30:14PM +0530, Abdul Haleem wrote:
>> Greeting's
>> 
>> Today's linux-next kernel failed to boot 5.17.0-rc1-next-20220128 with 
>> kernel Oops on PowerVM LPAR
>> 
>> dmesg:
>> Started hybrid virtual network scan and config.
>> Started VDO volume services.
>> Started Dynamic System Tuning Daemon.
>> Attempted to run process '/opt/rsct/bin/trspoolmgr' with NULL argv
>> Created slice system-systemd\x2dcoredump.slice.
>> Started Process Core Dump (PID 3726/UID 0).
>> Started Process Core Dump (PID 4032/UID 0).
>> Started Process Core Dump (PID 4200/UID 0).
>> Started RMC-Resource Monitioring and Control.
>> Started Process Core Dump (PID 4319/UID 0).
>> rpaphp: RPA HOT Plug PCI Controller Driver version: 0.1
>> rpaphp: Slot [U78D2.001.WZS01DT-P1-C10] registered
>> Started Process Core Dump (PID 4687/UID 0).
>> Started Process Core Dump (PID 4806/UID 0).
>> Started Process Core Dump (PID 4973/UID 0).
>> Async-gnnft timeout - hdl=7.
>> BUG: Unable to handle kernel data access on read at 0x5deadbeef12a
>> Faulting instruction address: 0xc0221f4c
>> Oops: Kernel access of bad area, sig: 11 [#1]
>> LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
>> Modules linked in: rpadlpar_io rpaphp tcp_diag udp_diag inet_diag unix_diag 
>> af_packet_diag netlink_diag bonding rfkill sunrpc dm_round_robin 
>> dm_multipath dm_mod ocrdma ib_uverbs ib_core pseries_rng xts vmx_crypto 
>> uio_pdrv_genirq gf128mul uio sch_fq_codel ext4 mbcache jbd2 sd_mod sg 
>> qla2xxx ibmvscsi ibmveth scsi_transport_srp nvme_fc nvme_fabrics nvme_core 
>> be2net t10_pi scsi_transport_fc
>> CPU: 8 PID: 5782 Comm: mksquashfs Not tainted 
>> 5.17.0-rc1-next-20220128-autotest #1
>> NIP:  c0221f4c LR: c0221ee4 CTR: 0006
>> REGS: c001008bb920 TRAP: 0380   Not tainted  
>> (5.17.0-rc1-next-20220128-autotest)
>> MSR:  80009033   CR: 24004224  XER: 2004
>> CFAR: c0220df8 IRQMASK: 1
>> GPR00: c0221ee4 c001008bbbc0 c28d6100 9cc0
>> GPR04:  01c0 01c0 
>> GPR08:  5deadbeef122 c001008bbbe8 0001
>> GPR12: c0221d50 c7fe6680 7fffbcf0 0100
>> GPR16: 0100 7fffbce64420 0001 0002
>> GPR20: c2912108 5deadbeef122 c0079c008668 
>> GPR24:  c001008bbbe8 9540 c2913a00
>> GPR28:  c000a4560950 c0079c008600 c20e8600
>> NIP [c0221f4c] run_timer_softirq+0x1fc/0x7c0
>> LR [c0221ee4] run_timer_softirq+0x194/0x7c0
>> Call Trace:
>> [c001008bbbc0] [c0221ee4] run_timer_softirq+0x194/0x7c0 
>> (unreliable)
>> [c001008bbc90] [c0ca7e5c] __do_softirq+0x15c/0x3d0
>> [c001008bbd80] [c014f538] irq_exit+0x168/0x1b0
>> [c001008bbdb0] [c0027184] timer_interrupt+0x1a4/0x3e0
>> [c001008bbe10] [c0009a08] decrementer_common_virt+0x208/0x210
>> --- interrupt: 900 at 0x7fffbccb3850
>> NIP:  7fffbccb3850 LR: 7fffbccb51f8 CTR: 
>> REGS: c001008bbe80 TRAP: 0900   Not tainted  
>> (5.17.0-rc1-next-20220128-autotest)
>> MSR:  8280f033   CR: 42004488  
>> XER: 2004
>> CFAR:  IRQMASK: 0
>> GPR00: 7fff 7ffe8afde4d0 7fffbcce7f00 7ffe74020c50
>> GPR04: 6bfa 3cf1 7ffe740223a0 0005
>> GPR08: 7ffe74028fb4 09a3 0004 0039
>> GPR12: 7ffe740323b0 7ffe8afe68e0 7fffbcf0 7ffe8a7d
>> GPR16: 7fffbce64410 7fffbce64420  7fffbce60318
>> GPR20: 7ffe740323b0 7ffe740423c0 3fff 0005
>> GPR24: 0004 7fffbccc8058  000c
>> GPR28: 0102 4415 7ffe7402df8b 7ffe7402e08d
>> NIP [7fffbccb3850] 0x7fffbccb3850
>> LR [7fffbccb51f8] 0x7fffbccb51f8
>> --- interrupt: 900
>> Instruction dump:
>> 6000 e939 2fa9 419effc8 ebb9 fbbe0008 6000 e93d
>> e95d0008 2fa9 f92a 419e0008  813d0020 fb1d0008 ea9d0018
>> ---[ end trace  ]---
>> 
>> The fault instruction points to
>> 
>> # gdb -batch /boot/vmlinuz-5.17.0-rc1-next-20220128-autotest -ex 'list 
>> *(0xc0221f4c)'
>> 0xc0221f4c is in run_timer_softirq (./include/linux/list.h:850).
>> 845  struct hlist_node *next = n->next;
>> 846  struct hlist_node **pprev = n->pprev;
>> 847  
>> 848  WRITE_ONCE(*pprev, next);
>> 849  if (next)
>> 850  WRITE_ONCE(next->pprev, pprev);
>> 851  }
>> 852  
>> 853  /**
>> 854   * hlist_del - Delete the specified hlist_node from its list
>
> It's quite likely not a culprit, but the result of some (race?) condition.
> Cc'ing to Thomas, maybe he has an

Re: [PATCH v3] powerpc: Fix virt_addr_valid() check



Le 27/01/2022 à 13:37, Kefeng Wang a écrit :
> When run ethtool eth0 on PowerPC64, the BUG occurred,
> 
>usercopy: Kernel memory exposure attempt detected from SLUB object not in 
> SLUB page?! (offset 0, size 1048)!
>kernel BUG at mm/usercopy.c:99
>...
>usercopy_abort+0x64/0xa0 (unreliable)
>__check_heap_object+0x168/0x190
>__check_object_size+0x1a0/0x200
>dev_ethtool+0x2494/0x2b20
>dev_ioctl+0x5d0/0x770
>sock_do_ioctl+0xf0/0x1d0
>sock_ioctl+0x3ec/0x5a0
>__se_sys_ioctl+0xf0/0x160
>system_call_exception+0xfc/0x1f0
>system_call_common+0xf8/0x200
> 
> The code shows below,
> 
>data = vzalloc(array_size(gstrings.len, ETH_GSTRING_LEN));
>copy_to_user(useraddr, data, gstrings.len * ETH_GSTRING_LEN))
> 
> The data is alloced by vmalloc(), virt_addr_valid(ptr) will return true
> on PowerPC64, which leads to the panic.
> 
> As commit 4dd7554a6456 ("powerpc/64: Add VIRTUAL_BUG_ON checks for __va
> and __pa addresses") does, let's check the virt addr above PAGE_OFFSET in
> the virt_addr_valid() for PowerPC64, which will make sure that the passed
> address is a valid linear map address.
> 
> Meanwhile, PAGE_OFFSET is the virtual address of the start of lowmem,
> the check is suitable for PowerPC32 too.
> 
> Signed-off-by: Kefeng Wang 

Reviewed-by: Christophe Leroy 

> ---
> v3:
> - update changelog and remove a redundant cast
>   arch/powerpc/include/asm/page.h | 5 -
>   1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/include/asm/page.h b/arch/powerpc/include/asm/page.h
> index 254687258f42..a8a29a23ce2d 100644
> --- a/arch/powerpc/include/asm/page.h
> +++ b/arch/powerpc/include/asm/page.h
> @@ -132,7 +132,10 @@ static inline bool pfn_valid(unsigned long pfn)
>   #define virt_to_page(kaddr) pfn_to_page(virt_to_pfn(kaddr))
>   #define pfn_to_kaddr(pfn)   __va((pfn) << PAGE_SHIFT)
>   
> -#define virt_addr_valid(kaddr)   pfn_valid(virt_to_pfn(kaddr))
> +#define virt_addr_valid(vaddr)   ({  
> \
> + unsigned long _addr = (unsigned long)vaddr; 
> \
> + _addr >= PAGE_OFFSET && pfn_valid(virt_to_pfn(_addr));  \
> +})
>   
>   /*
>* On Book-E parts we need __va to parse the device tree and we can't

Re: microwatt booting linux-5.7 under verilator

Nicholas Piggin  writes:
> Excerpts from lkcl's message of January 31, 2022 2:19 pm:
>> 
>> On January 31, 2022 3:31:41 AM UTC, Nicholas Piggin  
>> wrote:
>>>Hi Luke,
>>>
>>>Interesting to read about the project, thanks for the post.
>> 
>> no problem. it's been i think 18 years since i last did linux kernel work.
>> 
 i also had to fix a couple of things in the linux kernel source
 https://git.kernel.org/pub/scm/linux/kernel/git/joel/microwatt.git
>>>
>>>I think these have mostly (all?) been upstreamed now.
>> 
>> i believe so, although last i checked (6 months?) there was some of dts 
>> still to do. instructions online all tend to refer to joel or benh's tree(s)
>> 
 this led me to add support for CONFIG_KERNEL_UNCOMPRESSED
 and cut that time entirely, hence why you can see this in the console
>>>log:
 
 0x5b0e10 bytes of uncompressed data copied
>>>
>>>Interesting, it looks like your HAVE_KERNEL_UNCOMPRESSED support
>>>patch is pretty trivial. 
>> 
>> yeah i was really surprised, it was all there
>> 
>>> We should be able to upstream it pretty
>>>easily I think?
>> 
>> don't see why not.
>
> Okay then we should.
>
>> 
>> the next interesting thing which would save another hour when emulating HDL 
>> at this astoundingly-slow speed of sub-1000 instructions per second would be 
>> in-place execution: no memcpy, just jump.
>> 
>> i seem to recall this (inplace execution) being a standard option back in 
>> 2003 when i was doing xda-developers wince smartphone reverse-emgineering, 
>> although with it being 19 years ago i could be wrong
>
> Not sure of the details on that. Is it memcpy()ing out of ROM or RAM to 
> RAM? Is this in the arch boot code? (I don't know very well).

If you build with CONFIG_RELOCATABLE=y and CONFIG_RELOCATABLE_TEST=y the
kernel will run wherever you load it (must be 64K aligned), without
copying itself down to zero first. That will save you a few cycles.

cheers

Re: [PATCH V4 16/17] riscv: compat: Add COMPAT Kbuild skeletal support

2022-02-01 Thread Arnd Bergmann

On Tue, Feb 1, 2022 at 11:26 AM Guo Ren  wrote:
>
> Hi Arnd & Christoph,
>
> The UXL field controls the value of XLEN for U-mode, termed UXLEN,
> which may differ from the
> value of XLEN for S-mode, termed SXLEN. The encoding of UXL is the
> same as that of the MXL
> field of misa, shown in Table 3.1.
>
> Here is the patch. (We needn't exception helper, because we are in
> S-mode and UXL wouldn't affect.)

Looks good to me, just a few details that could be improved

> -#define compat_elf_check_arch(x) ((x)->e_machine == EM_RISCV)
> +#ifdef CONFIG_COMPAT
> +#define compat_elf_check_arch compat_elf_check_arch
> +extern bool compat_elf_check_arch(Elf32_Ehdr *hdr);
> +#endif

No need for the #ifdef
> +}

> +void compat_mode_detect(void)

__init

> +{
> + unsigned long tmp = csr_read(CSR_STATUS);
> + csr_write(CSR_STATUS, (tmp & ~SR_UXL) | SR_UXL_32);
> +
> + if ((csr_read(CSR_STATUS) & SR_UXL) != SR_UXL_32) {
> + csr_write(CSR_STATUS, tmp);
> + return;
> + }
> +
> + csr_write(CSR_STATUS, tmp);
> + compat_mode_support = true;
> +
> + pr_info("riscv: compat: 32bit U-mode applications support\n");
> +}

I think an entry in /proc/cpuinfo would be more helpful than the pr_info at
boot time. Maybe a follow-up patch though, as there is no obvious place
to put it. On other architectures, you typically have a set of space
separated feature names, but riscv has a single string that describes
the ISA, and this feature is technically the support for a second ISA.

 Arnd

Re: powerpc: Set crashkernel offset to mid of RMA region

Sourabh Jain  writes:
> On large config LPARs (having 192 and more cores), Linux fails to boot
> due to insufficient memory in the first memblock. It is due to the
> memory reservation for the crash kernel which starts at 128MB offset of
> the first memblock. This memory reservation for the crash kernel doesn't
> leave enough space in the first memblock to accommodate other essential
> system resources.
>
> The crash kernel start address was set to 128MB offset by default to
> ensure that the crash kernel get some memory below the RMA region which
> is used to be of size 256MB. But given that the RMA region size can be
> 512MB or more, setting the crash kernel offset to mid of RMA size will
> leave enough space for kernel to allocate memory for other system
> resources.
>
> Since the above crash kernel offset change is only applicable to the LPAR
> platform, the LPAR feature detection is pushed before the crash kernel
> reservation. The rest of LPAR specific initialization will still
> be done during pseries_probe_fw_features as usual.
>
> Signed-off-by: Sourabh Jain 
> Reported-and-tested-by: Abdul haleem 
>
> ---
>  arch/powerpc/kernel/rtas.c |  4 
>  arch/powerpc/kexec/core.c  | 15 +++
>  2 files changed, 15 insertions(+), 4 deletions(-)
>
>  ---
>  Change in v3:
>   Dropped 1st and 2nd patch from v2. 1st and 2nd patch from v2 patch
>   series [1] try to discover 1T segment MMU feature support
>   BEFORE boot CPU paca allocation ([1] describes why it is needed).
>   MPE has posted a patch [2] that archives a similar objective by moving
>   boot CPU paca allocation after mmu_early_init_devtree().
>
> NOTE: This patch is dependent on the patch [2].
>
> [1] 
> https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20211018084434.217772-3-sourabhj...@linux.ibm.com/
> [2] https://lists.ozlabs.org/pipermail/linuxppc-dev/2022-January/239175.html
>  ---
>
> diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
> index 733e6ef36758..06df7464fb57 100644
> --- a/arch/powerpc/kernel/rtas.c
> +++ b/arch/powerpc/kernel/rtas.c
> @@ -1313,6 +1313,10 @@ int __init early_init_dt_scan_rtas(unsigned long node,
>   entryp = of_get_flat_dt_prop(node, "linux,rtas-entry", NULL);
>   sizep  = of_get_flat_dt_prop(node, "rtas-size", NULL);
>  
> + /* need this feature to decide the crashkernel offset */
> + if (of_get_flat_dt_prop(node, "ibm,hypertas-functions", NULL))
> + powerpc_firmware_features |= FW_FEATURE_LPAR;
> +

As you'd have seen this breaks the 32-bit build. It will need an #ifdef
CONFIG_PPC64 around it.

>   if (basep && entryp && sizep) {
>   rtas.base = *basep;
>   rtas.entry = *entryp;
> diff --git a/arch/powerpc/kexec/core.c b/arch/powerpc/kexec/core.c
> index 8b68d9f91a03..abf5897ae88c 100644
> --- a/arch/powerpc/kexec/core.c
> +++ b/arch/powerpc/kexec/core.c
> @@ -134,11 +134,18 @@ void __init reserve_crashkernel(void)
>   if (!crashk_res.start) {
>  #ifdef CONFIG_PPC64
>   /*
> -  * On 64bit we split the RMO in half but cap it at half of
> -  * a small SLB (128MB) since the crash kernel needs to place
> -  * itself and some stacks to be in the first segment.
> +  * On the LPAR platform place the crash kernel to mid of
> +  * RMA size (512MB or more) to ensure the crash kernel
> +  * gets enough space to place itself and some stack to be
> +  * in the first segment. At the same time normal kernel
> +  * also get enough space to allocate memory for essential
> +  * system resource in the first segment. Keep the crash
> +  * kernel starts at 128MB offset on other platforms.
>*/
> - crashk_res.start = min(0x800ULL, (ppc64_rma_size / 2));
> + if (firmware_has_feature(FW_FEATURE_LPAR))
> + crashk_res.start = ppc64_rma_size / 2;
> + else
> + crashk_res.start = min(0x800ULL, (ppc64_rma_size / 
> 2));

I think this will break on machines using Radix won't it? At this point
in boot ppc64_rma_size will be == 0. Because we won't call into 
hash__setup_initial_memory_limit().

That's not changed by your patch, but seems like this code needs to be
more careful/clever.

cheers

>  #else
>   crashk_res.start = KDUMP_KERNELBASE;
>  #endif
> -- 
> 2.34.1

Re: [PATCH] powerpc/xive: Add some error handling code to 'xive_spapr_init()'


Hi,

Le 01/08/2019 à 13:09, Christophe JAILLET a écrit :

'xive_irq_bitmap_add()' can return -ENOMEM.
In this case, we should free the memory already allocated and return
'false' to the caller.

Also add an error path which undoes the 'tima = ioremap(...)'


This old patch doesn't apply, if it is still relevant can you please 
rebase ?


Thanks
Christophe



Signed-off-by: Christophe JAILLET 
---
NOT compile tested (I don't have a cross compiler and won't install one).
So if some correction or improvement are needed, feel free to propose and
commit it directly.
---
  arch/powerpc/sysdev/xive/spapr.c | 36 +---
  1 file changed, 28 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/sysdev/xive/spapr.c b/arch/powerpc/sysdev/xive/spapr.c
index 52198131c75e..b3ae0b76c433 100644
--- a/arch/powerpc/sysdev/xive/spapr.c
+++ b/arch/powerpc/sysdev/xive/spapr.c
@@ -64,6 +64,17 @@ static int xive_irq_bitmap_add(int base, int count)
return 0;
  }
  
+static void xive_irq_bitmap_remove_all(void)

+{
+   struct xive_irq_bitmap *xibm, *tmp;
+
+   list_for_each_entry_safe(xibm, tmp, &xive_irq_bitmaps, list) {
+   list_del(&xibm->list);
+   kfree(xibm->bitmap);
+   kfree(xibm);
+   }
+}
+
  static int __xive_irq_bitmap_alloc(struct xive_irq_bitmap *xibm)
  {
int irq;
@@ -723,7 +734,7 @@ bool __init xive_spapr_init(void)
u32 val;
u32 len;
const __be32 *reg;
-   int i;
+   int i, err;
  
  	if (xive_spapr_disabled())

return false;
@@ -748,23 +759,26 @@ bool __init xive_spapr_init(void)
}
  
  	if (!xive_get_max_prio(&max_prio))

-   return false;
+   goto err_unmap;
  
  	/* Feed the IRQ number allocator with the ranges given in the DT */

reg = of_get_property(np, "ibm,xive-lisn-ranges", &len);
if (!reg) {
pr_err("Failed to read 'ibm,xive-lisn-ranges' property\n");
-   return false;
+   goto err_unmap;
}
  
  	if (len % (2 * sizeof(u32)) != 0) {

pr_err("invalid 'ibm,xive-lisn-ranges' property\n");
-   return false;
+   goto err_unmap;
}
  
-	for (i = 0; i < len / (2 * sizeof(u32)); i++, reg += 2)

-   xive_irq_bitmap_add(be32_to_cpu(reg[0]),
-   be32_to_cpu(reg[1]));
+   for (i = 0; i < len / (2 * sizeof(u32)); i++, reg += 2) {
+   err = xive_irq_bitmap_add(be32_to_cpu(reg[0]),
+ be32_to_cpu(reg[1]));
+   if (err < 0)
+   goto err_mem_free;
+   }
  
  	/* Iterate the EQ sizes and pick one */

of_property_for_each_u32(np, "ibm,xive-eq-sizes", prop, reg, val) {
@@ -775,8 +789,14 @@ bool __init xive_spapr_init(void)
  
  	/* Initialize XIVE core with our backend */

if (!xive_core_init(&xive_spapr_ops, tima, TM_QW1_OS, max_prio))
-   return false;
+   goto err_mem_free;
  
  	pr_info("Using %dkB queues\n", 1 << (xive_queue_shift - 10));

return true;
+
+err_mem_free:
+   xive_irq_bitmap_remove_all();
+err_unmap:
+   iounmap(tima);
+   return false;
  }

Re: [PATCH 1/2] powerpc: reserve memory for capture kernel after hugepages init





Le 27/06/2019 à 21:21, Hari Bathini a écrit :

Sometimes, memory reservation for KDump/FADump can overlap with memory
marked for hugepages. This overlap leads to error, hang in KDump case
and copy error reported by f/w in case of FADump, while trying to
capture dump. Report error while setting up memory for the capture
kernel instead of running into issues while capturing dump, by moving
KDump/FADump reservation below MMU early init and failing gracefully
when hugepages memory overlaps with capture kernel memory.


This patch doesn't apply, if it's still needed can you please rebase ?

Christophe



Signed-off-by: Hari Bathini 
---
  arch/powerpc/kernel/prom.c |   16 
  1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
index 7159e79..454e19cf 100644
--- a/arch/powerpc/kernel/prom.c
+++ b/arch/powerpc/kernel/prom.c
@@ -731,14 +731,6 @@ void __init early_init_devtree(void *params)
if (PHYSICAL_START > MEMORY_START)
memblock_reserve(MEMORY_START, 0x8000);
reserve_kdump_trampoline();
-#ifdef CONFIG_FA_DUMP
-   /*
-* If we fail to reserve memory for firmware-assisted dump then
-* fallback to kexec based kdump.
-*/
-   if (fadump_reserve_mem() == 0)
-#endif
-   reserve_crashkernel();
early_reserve_mem();
  
  	/* Ensure that total memory size is page-aligned. */

@@ -777,6 +769,14 @@ void __init early_init_devtree(void *params)
  #endif
  
  	mmu_early_init_devtree();

+#ifdef CONFIG_FA_DUMP
+   /*
+* If we fail to reserve memory for firmware-assisted dump then
+* fallback to kexec based kdump.
+*/
+   if (fadump_reserve_mem() == 0)
+#endif
+   reserve_crashkernel();
  
  #ifdef CONFIG_PPC_POWERNV

/* Scan and build the list of machine check recoverable ranges */

Re: consolidate the compat fcntl definitions v2

Hi Christoph,

Le 31/01/2022 à 07:49, Christoph Hellwig a écrit :
> Hi all,
> 
> currenty the compat fcnt definitions are duplicate for all compat
> architectures, and the native fcntl64 definitions aren't even usable
> from userspace due to a bogus CONFIG_64BIT ifdef.  This series tries
> to sort out all that.

There must be something wrong with this cover letter, patchwork doesn't 
see it, see 
https://patchwork.ozlabs.org/project/linuxppc-dev/list/?submitter=82

Any idea what the problem is ?

Thanks
Christophe

> 
> Changes since v1:
>   - only make the F*64 defines uapi visible for 32-bit architectures
> 
> Diffstat:
>   arch/arm64/include/asm/compat.h|   20 
>   arch/mips/include/asm/compat.h |   23 ++-
>   arch/mips/include/uapi/asm/fcntl.h |   30 +-
>   arch/parisc/include/asm/compat.h   |   16 
>   arch/powerpc/include/asm/compat.h  |   20 
>   arch/s390/include/asm/compat.h |   20 
>   arch/sparc/include/asm/compat.h|   22 +-
>   arch/x86/include/asm/compat.h  |   24 +++-
>   include/linux/compat.h |   31 
> +++
>   include/uapi/asm-generic/fcntl.h   |   23 +--
>   tools/include/uapi/asm-generic/fcntl.h |   21 +++--
>   11 files changed, 58 insertions(+), 192 deletions(-)

Re: microwatt booting linux-5.7 under verilator

2022-02-01 Thread Luke Kenneth Casson Leighton

On Tue, Feb 1, 2022 at 6:27 AM Nicholas Piggin  wrote:

> Not sure of the details on that. Is it memcpy()ing out of ROM or RAM to
> RAM? Is this in the arch boot code? (I don't know very well).

RAM to RAM.  arch/powerpc/boot/main.c:

if (uncompressed_image) {
memcpy(addr, vmlinuz_addr + ei.elfoffset, ei.loadsize);
printf("0x%lx bytes of uncompressed data copied\n\r",
   ei.loadsize);
goto out;
}

in some systems those would be two different types of RAM,
(one would be on-board SRAM, the target would be DRAM
which had previously been initialised by the previous chain-boot
loader e.g. u-boot)

[in other circumstances, the source location might be addressable
SPI NOR flash, which would be slower, expensive, and therefore
compression is plain common sense, in which case it's out of
scope for this discussion.]

in the case of the simulation - and also in the case of the
WinCE Smartphone hand-held reverse-engineering using
GNUHARET.EXE (similar to LOADLIN.EXE if anyone remembers
that) - the uncompressed initramfs are both in the same
RAM, so the memcpy is completely redundant.

the only good reason for the memcpy would be to ensure
that the start location is at a known-fixed offset, and of course
that can be arranged in advance by the simulator.  even if
it has to be at 0x___ that can be arranged
by moving the cold-boot loader to an alternative hard-reset
start address and telling the simulated-core to start from there.

> >
> > other areas are the memset before VM is set up, followed by memset *again* 
> > on.individual pages once created.  those are an hour each
>
> Seems like we could should avoid the duplication and maybe be able to
> add an option to skip zeroing (I thought there was one, maybe thinking
> of something else).

it makes sense for security reasons (on real hardware) - a simulation
not so much, it's guaranteed to be all-zeros at startup.

> Are you using optimize for size? That can result in much slower code in
> some places. In skiboot we compile some of the string.h library code
> with -O2 for example.

interesting - no, this is default options.  have to be careful not to
introduce any VSX instructions (the core doesn't have them).

   CROSS_COMPILE="ccache powerpc64le-linux-gnu-" \
   ARCH=powerpc \
   make -j16 O=microwatt

l.

Re: powerpc: Set crashkernel offset to mid of RMA region

2022-02-01 Thread Hari Bathini

On 28/01/22 3:34 pm, Sourabh Jain wrote:

On large config LPARs (having 192 and more cores), Linux fails to boot
due to insufficient memory in the first memblock. It is due to the
memory reservation for the crash kernel which starts at 128MB offset of
the first memblock. This memory reservation for the crash kernel doesn't
leave enough space in the first memblock to accommodate other essential
system resources.

The crash kernel start address was set to 128MB offset by default to
ensure that the crash kernel get some memory below the RMA region which
is used to be of size 256MB. But given that the RMA region size can be
512MB or more, setting the crash kernel offset to mid of RMA size will
leave enough space for kernel to allocate memory for other system
resources.

Since the above crash kernel offset change is only applicable to the LPAR
platform, the LPAR feature detection is pushed before the crash kernel
reservation. The rest of LPAR specific initialization will still
be done during pseries_probe_fw_features as usual.

Signed-off-by: Sourabh Jain
Reported-and-tested-by: Abdul haleem

---
arch/powerpc/kernel/rtas.c | 4
arch/powerpc/kexec/core.c | 15 +++
2 files changed, 15 insertions(+), 4 deletions(-)

---
Change in v3:
Dropped 1st and 2nd patch from v2. 1st and 2nd patch from v2 patch
series [1] try to discover 1T segment MMU feature support
BEFORE boot CPU paca allocation ([1] describes why it is needed).
MPE has posted a patch [2] that archives a similar objective by moving
boot CPU paca allocation after mmu_early_init_devtree().

NOTE: This patch is dependent on the patch [2].

[1]https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20211018084434.217772-3-sourabhj...@linux.ibm.com/
[2]https://lists.ozlabs.org/pipermail/linuxppc-dev/2022-January/239175.html

This dependency info must be captured somewhere within the changelog to
be useful.

[RESEND PATCH v2] powerpc/fadump: register for fadump as early as possible

2022-02-01 Thread Hari Bathini

Crash recovery (fadump) is setup in the userspace by some service.
This service rebuilds initrd with dump capture capability, if it is
not already dump capture capable before proceeding to register for
firmware assisted dump (echo 1 > /sys/kernel/fadump/registered). But
arming the kernel with crash recovery support does not have to wait
for userspace configuration. So, register for fadump while setting
it up itself. This can at worst lead to a scenario, where /proc/vmcore
is ready afer crash but the initrd does not know how/where to offload
it, which is always better than not having a /proc/vmcore at all due
to incomplete configuration in the userspace at the time of crash.

Commit 0823c68b054b ("powerpc/fadump: re-register firmware-assisted
dump if already registered") ensures this change does not break
userspace.

Signed-off-by: Hari Bathini 
---

* Resending the patch after rebasing on latest code.

Changes in V2:
* Updated the changelog with bit more explanation about userspace issue
  with/without this change.
* Added a comment in the code for why setup_fadump function is changed
  from subsys_init() to subsys_init_sync() call.


 arch/powerpc/kernel/fadump.c | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
index d03e488cfe9c..97d80df4563f 100644
--- a/arch/powerpc/kernel/fadump.c
+++ b/arch/powerpc/kernel/fadump.c
@@ -1637,9 +1637,11 @@ int __init setup_fadump(void)
if (fw_dump.ops->fadump_process(&fw_dump) < 0)
fadump_invalidate_release_mem();
}
-   /* Initialize the kernel dump memory structure for FAD registration. */
-   else if (fw_dump.reserve_dump_area_size)
+   /* Initialize the kernel dump memory structure and register with f/w */
+   else if (fw_dump.reserve_dump_area_size) {
fw_dump.ops->fadump_init_mem_struct(&fw_dump);
+   register_fadump();
+   }
 
/*
 * In case of panic, fadump is triggered via ppc_panic_event()
@@ -1651,7 +1653,12 @@ int __init setup_fadump(void)
 
return 1;
 }
-subsys_initcall(setup_fadump);
+/*
+ * Replace subsys_initcall() with subsys_initcall_sync() as there is dependency
+ * with crash_save_vmcoreinfo_init() to ensure vmcoreinfo initialization is 
done
+ * before regisering with f/w.
+ */
+subsys_initcall_sync(setup_fadump);
 #else /* !CONFIG_PRESERVE_FA_DUMP */
 
 /* Scan the Firmware Assisted dump configuration details. */
-- 
2.34.1

Re: [PATCH V4 16/17] riscv: compat: Add COMPAT Kbuild skeletal support

Hi Arnd & Christoph,

The UXL field controls the value of XLEN for U-mode, termed UXLEN,
which may differ from the
value of XLEN for S-mode, termed SXLEN. The encoding of UXL is the
same as that of the MXL
field of misa, shown in Table 3.1.

Here is the patch. (We needn't exception helper, because we are in
S-mode and UXL wouldn't affect.)

 arch/riscv/include/asm/elf.h   |  5 -
 arch/riscv/include/asm/processor.h |  1 +
 arch/riscv/kernel/process.c| 22 ++
 arch/riscv/kernel/setup.c  |  5 +
 4 files changed, 32 insertions(+), 1 deletion(-)

diff --git a/arch/riscv/include/asm/elf.h b/arch/riscv/include/asm/elf.h
index 37f1cbdaa242..6baa49c4fba1 100644
--- a/arch/riscv/include/asm/elf.h
+++ b/arch/riscv/include/asm/elf.h
@@ -35,7 +35,10 @@
  */
 #define elf_check_arch(x) ((x)->e_machine == EM_RISCV)

-#define compat_elf_check_arch(x) ((x)->e_machine == EM_RISCV)
+#ifdef CONFIG_COMPAT
+#define compat_elf_check_arch compat_elf_check_arch
+extern bool compat_elf_check_arch(Elf32_Ehdr *hdr);
+#endif

 #define CORE_DUMP_USE_REGSET
 #define ELF_EXEC_PAGESIZE (PAGE_SIZE)
diff --git a/arch/riscv/include/asm/processor.h
b/arch/riscv/include/asm/processor.h
index 9544c138d9ce..8b288ac0d704 100644
--- a/arch/riscv/include/asm/processor.h
+++ b/arch/riscv/include/asm/processor.h
@@ -64,6 +64,7 @@ extern void start_thread(struct pt_regs *regs,
 #ifdef CONFIG_COMPAT
 extern void compat_start_thread(struct pt_regs *regs,
  unsigned long pc, unsigned long sp);
+extern void compat_mode_detect(void);

 #define DEFAULT_MAP_WINDOW_64 TASK_SIZE_64
 #else
diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c
index 9ebf9a95e5ea..496d09c5d384 100644
--- a/arch/riscv/kernel/process.c
+++ b/arch/riscv/kernel/process.c
@@ -101,6 +101,28 @@ void start_thread(struct pt_regs *regs, unsigned long pc,
 }

 #ifdef CONFIG_COMPAT
+static bool compat_mode_support __read_mostly = false;
+
+bool compat_elf_check_arch(Elf32_Ehdr *hdr)
+{
+ if (compat_mode_support && (hdr->e_machine == EM_RISCV))
+ return true;
+
+ return false;
+}
+
+void compat_mode_detect(void)
+{
+ unsigned long tmp = csr_read(CSR_STATUS);
+ csr_write(CSR_STATUS, (tmp & ~SR_UXL) | SR_UXL_32);
+
+ if ((csr_read(CSR_STATUS) & SR_UXL) != SR_UXL_32) {
+ csr_write(CSR_STATUS, tmp);
+ return;
+ }
+
+ csr_write(CSR_STATUS, tmp);
+ compat_mode_support = true;
+
+ pr_info("riscv: compat: 32bit U-mode applications support\n");
+}
+
 void compat_start_thread(struct pt_regs *regs, unsigned long pc,
  unsigned long sp)
 {
diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
index b42bfdc67482..be131219d549 100644
--- a/arch/riscv/kernel/setup.c
+++ b/arch/riscv/kernel/setup.c
@@ -12,6 +12,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -294,6 +295,10 @@ void __init setup_arch(char **cmdline_p)
  setup_smp();
 #endif

+#ifdef CONFIG_COMPAT
+ compat_mode_detect();
+#endif
+
  riscv_fill_hwcap();
 }
On Tue, Feb 1, 2022 at 5:36 PM Arnd Bergmann  wrote:
>
> On Tue, Feb 1, 2022 at 10:13 AM Guo Ren  wrote:
> > On Tue, Feb 1, 2022 at 3:45 PM Christoph Hellwig  wrote:
> > > On Mon, Jan 31, 2022 at 09:50:58PM +0800, Guo Ren wrote:
> > > > On Mon, Jan 31, 2022 at 8:26 PM Christoph Hellwig  
> > > > wrote:
> > > > >
> > > > > Given that most rv64 implementations can't run in rv32 mode, what is 
> > > > > the
> > > > > failure mode if someone tries it with the compat mode enabled?
> > > > A static linked simple hello_world could still run on a non-compat
> > > > support hardware. But most rv32 apps would meet different userspace
> > > > segment faults.
> > > >
> > > > Current code would let the machine try the rv32 apps without detecting
> > > > whether hw support or not.
> > >
> > > Hmm, we probably want some kind of check for not even offer running
> > > rv32 binaries.  I guess trying to write UXL some time during early
> > > boot and catching the resulting exception would be the way to go?
> >
> > Emm... I think it's unnecessary. Free rv32 app running won't cause
> > system problem, just as a wrong elf running. They are U-mode
> > privileged.
>
> While it's not a security issue, I think it would be helpful to get a
> user-readable error message and a machine-readable /proc/cpuinfo
> flag to see if a particular system can run rv32 binaries rather than
> relying on SIGILL to kill a process.
--
2.25.1


>
> Arnd



--
Best Regards
 Guo Ren

ML: https://lore.kernel.org/linux-csky/

Re: [PATCH RFC v1] drivers/base/node: consolidate node device subsystem initialization in node_dev_init()

2022-02-01 Thread David Hildenbrand

On 31.01.22 10:40, Oscar Salvador wrote:
> On Mon, Jan 31, 2022 at 08:48:54AM +0100, David Hildenbrand wrote:
>> Hi Oscar,
> 
> Hi David :-),
> 
>> Right, and the idea is that the online state of nodes (+ node/zone
>> ranges) already has to be known at that point in time, because
>> otherwise, we'd be in bigger trouble.
> 
> Yeah, I wanted to check where exactly did we mark the nodes online,
> and for the few architectures I checked it happens in setup_arch(),
> which is called very early in start_kernel(), while driver_init()
> gets called through arch_call_rest_init(), which happens at the end
> of the function.
> 
> I am not sure whether we want to remark that somehow in the changelog,
> so it is crystal clear that by the time the node_dev_init() gets called,
> we already set the nodes online.
> 
> Anyway, just saying, but is fine as is.

I'll adjust the first paragraph to:

... and call node_dev_init() after memory_dev_init() from driver_init(),
so before any of the existing arch/subsys calls. All online nodes should
be known at that point: early during boot, arch code determines node and
zone ranges and sets the relevant nodes online; usually this happens in
setup_arch().

Thanks!

-- 
Thanks,

David / dhildenb

Re: [PATCH RFC v1] drivers/base/node: consolidate node device subsystem initialization in node_dev_init()

2022-02-01 Thread Anatoly Pugachev

On Mon, Jan 31, 2022 at 2:11 PM David Hildenbrand  wrote:
>
> ... and call node_dev_init() after memory_dev_init() from driver_init(),
> so before any of the existing arch/subsys calls. All online nodes should
> be known at that point.
>
> This is in line with memory_dev_init(), which initializes the memory
> device subsystem and creates all memory block devices.
>
> Similar to memory_dev_init(), panic() if anything goes wrong, we don't
> want to continue with such basic initialization errors.
>
> The important part is that node_dev_init() gets called after
> memory_dev_init() and after cpu_dev_init(), but before any of the
> relevant archs call register_cpu() to register the new cpu device under
> the node device. The latter should be the case for the current users
> of topology_init().
>
> RFC because I tested only on x86-64 and s390x, I think I cross-compiled all
> applicable architectures except riscv and sparc.

Compiled and boot tested on sparc.

Tested-by: Anatoly Pugachev  (sparc64)

Re: [PATCH V4 16/17] riscv: compat: Add COMPAT Kbuild skeletal support

2022-02-01 Thread Arnd Bergmann

On Tue, Feb 1, 2022 at 10:13 AM Guo Ren  wrote:
> On Tue, Feb 1, 2022 at 3:45 PM Christoph Hellwig  wrote:
> > On Mon, Jan 31, 2022 at 09:50:58PM +0800, Guo Ren wrote:
> > > On Mon, Jan 31, 2022 at 8:26 PM Christoph Hellwig  
> > > wrote:
> > > >
> > > > Given that most rv64 implementations can't run in rv32 mode, what is the
> > > > failure mode if someone tries it with the compat mode enabled?
> > > A static linked simple hello_world could still run on a non-compat
> > > support hardware. But most rv32 apps would meet different userspace
> > > segment faults.
> > >
> > > Current code would let the machine try the rv32 apps without detecting
> > > whether hw support or not.
> >
> > Hmm, we probably want some kind of check for not even offer running
> > rv32 binaries.  I guess trying to write UXL some time during early
> > boot and catching the resulting exception would be the way to go?
>
> Emm... I think it's unnecessary. Free rv32 app running won't cause
> system problem, just as a wrong elf running. They are U-mode
> privileged.

While it's not a security issue, I think it would be helpful to get a
user-readable error message and a machine-readable /proc/cpuinfo
flag to see if a particular system can run rv32 binaries rather than
relying on SIGILL to kill a process.

Arnd

Re: Build regressions/improvements in v5.17-rc2

2022-02-01 Thread Geert Uytterhoeven


On Mon, 31 Jan 2022, Geert Uytterhoeven wrote:

JFYI, when comparing v5.17-rc2[1] to v5.17-rc1[3], the summaries are:
 - build errors: +1/-3


  + error: arch/powerpc/kvm/book3s_64_entry.o: relocation truncated to fit: 
R_PPC64_REL14 (stub) against symbol `system_reset_common' defined in .text section 
in arch/powerpc/kernel/head_64.o:  => (.text+0x3ec)

powerpc-gcc5/powerpc-allyesconfig


[1] 
http://kisskb.ellerman.id.au/kisskb/branch/linus/head/26291c54e111ff6ba87a164d85d4a4e134b7315c/
 (all 99 configs)
[3] 
http://kisskb.ellerman.id.au/kisskb/branch/linus/head/e783362eb54cd99b2cac8b3a9aeac942e6f6ac07/
 (all 99 configs)


Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

Re: [PATCH V4 16/17] riscv: compat: Add COMPAT Kbuild skeletal support