Re: [PATCH V2 net] ibmvnic: Continue with reset if set link down failed

2021-04-21 Thread Sukadev Bhattiprolu
Lijun Pan [lijunp...@gmail.com] wrote:
> > Now, sure we can attempt a "thorough hard reset" which also does
> > the same hcalls to reestablish the connection. Is there any
> > other magic in do_hard_reset()? But in addition, it also frees lot
> > more Linux kernel buffers and reallocates them for instance.
> 
> Working around everything in do_reset will make the code very difficult

We are not working around everything. We are doing in do_reset()
exactly what we would do in hard reset for this error (ignore the
set link down error and try to reestablish the connection with the
VIOS).

What we are avoiding is unnecessary work on the Linux side for a
communication problem on the VIOS side.
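
To make that concrete, here is a simplified, standalone sketch of the flow
being argued for. The helper names are made up for illustration; this is not
the actual ibmvnic code (see drivers/net/ethernet/ibm/ibmvnic.c for that):

#include <stdio.h>

/* Stand-ins for the real operations, purely for illustration. */
static int set_link_state_down(void) { return -1; } /* fails while the CRQ is inactive */
static int release_crq(void)         { return 0; }  /* models the H_FREE_CRQ step */
static int register_crq(void)        { return 0; }  /* models the H_REG_CRQ step */

static int do_reset_sketch(void)
{
	int rc;

	rc = set_link_state_down();
	if (rc)
		/* Do not abandon the reset: a failed link-down is expected
		 * when the CRQ is already inactive, so log it and move on. */
		printf("set link down failed (rc=%d), continuing reset\n", rc);

	rc = release_crq();
	if (rc)
		return rc;

	/* Re-register the CRQ to reestablish the connection with the VIOS. */
	return register_crq();
}

int main(void)
{
	return do_reset_sketch();
}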

> to manage. Ultimately do_reset can do anything I am afraid, and do_hard_reset
> can be removed completely or merged into do_reset.
> 
> >
> > If we are having a communication problem with the VIOS, what is
> > the point of freeing and reallocating Linux kernel buffers? Beside
> > being inefficient, it would expose us to even more errors during
> > reset under heavy workloads?
> 
> No real customer runs the system under that heavy load created by
> HTX stress test, which can tear down any working system.

We need to talk to capacity planning and test architects about that,
but all I want to know is what hard reset would do differently to
fix this communication error with VIOS.

Sukadev


Re: [V3 PATCH 16/16] crypto/nx: Add sysfs interface to export NX capabilities

2021-04-21 Thread Herbert Xu
On Sat, Apr 17, 2021 at 02:13:40PM -0700, Haren Myneni wrote:
> 
> Changes to export the following NXGZIP capabilities through sysfs:
> 
> /sys/devices/vio/ibm,compression-v1/NxGzCaps:
> min_compress_len  /*Recommended minimum compress length in bytes*/
> min_decompress_len /*Recommended minimum decompress length in bytes*/
> req_max_processed_len /* Maximum number of bytes processed in one
>   request */
> 
> Signed-off-by: Haren Myneni 
> ---
>  drivers/crypto/nx/nx-common-pseries.c | 43 +++
>  1 file changed, 43 insertions(+)

Acked-by: Herbert Xu 
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
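
As an aside, consuming these capabilities from userspace is a plain sysfs
file read. A small illustrative sketch, assuming the directory and attribute
names exactly as given in the description above (the final sysfs layout may
differ):

#include <stdio.h>

int main(void)
{
	/* Path taken from the patch description above; treat it as an
	 * example rather than a stable ABI path. */
	const char *path =
		"/sys/devices/vio/ibm,compression-v1/NxGzCaps/min_compress_len";
	char buf[64];
	FILE *f = fopen(path, "r");

	if (!f) {
		perror("fopen");
		return 1;
	}
	if (fgets(buf, sizeof(buf), f))
		printf("min_compress_len: %s", buf);
	fclose(f);
	return 0;
}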


Re: [V3 PATCH 15/16] crypto/nx: Get NX capabilities for GZIP coprocessor type

2021-04-21 Thread Herbert Xu
On Sat, Apr 17, 2021 at 02:12:51PM -0700, Haren Myneni wrote:
> 
> phyp provides NX capabilities which gives recommended minimum
> compression / decompression length and maximum request buffer size
> in bytes.
> 
> Changes to get NX overall capabilities which points to the specific
> features phyp supports. Then retrieve NXGZIP specific capabilities.
> 
> Signed-off-by: Haren Myneni 
> ---
>  drivers/crypto/nx/nx-common-pseries.c | 83 +++
>  1 file changed, 83 insertions(+)

Acked-by: Herbert Xu 
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [V3 PATCH 14/16] crypto/nx: Register and unregister VAS interface

2021-04-21 Thread Herbert Xu
On Sat, Apr 17, 2021 at 02:12:12PM -0700, Haren Myneni wrote:
> 
> Changes to create /dev/crypto/nx-gzip interface with VAS register
> and to remove this interface with VAS unregister.
> 
> Signed-off-by: Haren Myneni 
> ---
>  drivers/crypto/nx/Kconfig | 1 +
>  drivers/crypto/nx/nx-common-pseries.c | 9 +
>  2 files changed, 10 insertions(+)

Acked-by: Herbert Xu 
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [V3 PATCH 13/16] crypto/nx: Rename nx-842-pseries file name to nx-common-pseries

2021-04-21 Thread Herbert Xu
On Sat, Apr 17, 2021 at 02:11:15PM -0700, Haren Myneni wrote:
> 
> Rename nx-842-pseries.c to nx-common-pseries.c to add code for new
> GZIP compression type. The actual functionality is not changed in
> this patch.
> 
> Signed-off-by: Haren Myneni 
> ---
>  drivers/crypto/nx/Makefile  | 2 +-
>  drivers/crypto/nx/{nx-842-pseries.c => nx-common-pseries.c} | 0
>  2 files changed, 1 insertion(+), 1 deletion(-)
>  rename drivers/crypto/nx/{nx-842-pseries.c => nx-common-pseries.c} (100%)
> 
> diff --git a/drivers/crypto/nx/Makefile b/drivers/crypto/nx/Makefile
> index bc89a20e5d9d..d00181a26dd6 100644
> --- a/drivers/crypto/nx/Makefile
> +++ b/drivers/crypto/nx/Makefile
> @@ -14,5 +14,5 @@ nx-crypto-objs := nx.o \
>  obj-$(CONFIG_CRYPTO_DEV_NX_COMPRESS_PSERIES) += nx-compress-pseries.o 
> nx-compress.o
>  obj-$(CONFIG_CRYPTO_DEV_NX_COMPRESS_POWERNV) += nx-compress-powernv.o 
> nx-compress.o
>  nx-compress-objs := nx-842.o
> -nx-compress-pseries-objs := nx-842-pseries.o
> +nx-compress-pseries-objs := nx-common-pseries.o
>  nx-compress-powernv-objs := nx-common-powernv.o
> diff --git a/drivers/crypto/nx/nx-842-pseries.c 
> b/drivers/crypto/nx/nx-common-pseries.c
> similarity index 100%
> rename from drivers/crypto/nx/nx-842-pseries.c
> rename to drivers/crypto/nx/nx-common-pseries.c

Acked-by: Herbert Xu 
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH 1/1] powerpc/kernel/iommu: Align size for IOMMU_PAGE_SIZE() to save TCEs

2021-04-21 Thread Leonardo Bras
Hello,

This patch was also reviewed when it was part of another patchset:
http://patchwork.ozlabs.org/project/linuxppc-dev/patch/20200911170738.82818-4-leobra...@gmail.com/

On Thu, 2021-03-18 at 14:44 -0300, Leonardo Bras wrote:
> Currently both iommu_alloc_coherent() and iommu_free_coherent() align the
> desired allocation size to PAGE_SIZE, and gets system pages and IOMMU
> mappings (TCEs) for that value.
> 
> When IOMMU_PAGE_SIZE < PAGE_SIZE, this behavior may cause unnecessary
> TCEs to be created for mapping the whole system page.
> 
> Example:
> - PAGE_SIZE = 64k, IOMMU_PAGE_SIZE() = 4k
> - iommu_alloc_coherent() is called for 128 bytes
> - 1 system page (64k) is allocated
> - 16 IOMMU pages (16 x 4k) are allocated (16 TCEs used)
> 
> It would be enough to use a single TCE for this, so 15 TCEs are
> wasted in the process.
> 
> Update iommu_*_coherent() to make sure the size alignment happens only
> for IOMMU_PAGE_SIZE() before calling iommu_alloc() and iommu_free().
> 
> Also, on iommu_range_alloc(), replace ALIGN(n, 1 << tbl->it_page_shift)
> with IOMMU_PAGE_ALIGN(n, tbl), which is easier to read and does the
> same.
> 
> Signed-off-by: Leonardo Bras 
> Reviewed-by: Alexey Kardashevskiy 
> ---
>  arch/powerpc/kernel/iommu.c | 11 ++-
>  1 file changed, 6 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
> index 5b69a6a72a0e..3329ef045805 100644
> --- a/arch/powerpc/kernel/iommu.c
> +++ b/arch/powerpc/kernel/iommu.c
> @@ -851,6 +851,7 @@ void *iommu_alloc_coherent(struct device *dev, struct 
> iommu_table *tbl,
>   unsigned int order;
>   unsigned int nio_pages, io_order;
>   struct page *page;
> + size_t size_io = size;
>  
> 
>   size = PAGE_ALIGN(size);
>   order = get_order(size);
> @@ -877,8 +878,9 @@ void *iommu_alloc_coherent(struct device *dev, struct 
> iommu_table *tbl,
>   memset(ret, 0, size);
>  
> 
>   /* Set up tces to cover the allocated range */
> - nio_pages = size >> tbl->it_page_shift;
> - io_order = get_iommu_order(size, tbl);
> + size_io = IOMMU_PAGE_ALIGN(size_io, tbl);
> + nio_pages = size_io >> tbl->it_page_shift;
> + io_order = get_iommu_order(size_io, tbl);
>   mapping = iommu_alloc(dev, tbl, ret, nio_pages, DMA_BIDIRECTIONAL,
>     mask >> tbl->it_page_shift, io_order, 0);
>   if (mapping == DMA_MAPPING_ERROR) {
> @@ -893,10 +895,9 @@ void iommu_free_coherent(struct iommu_table *tbl, size_t 
> size,
>    void *vaddr, dma_addr_t dma_handle)
>  {
>   if (tbl) {
> - unsigned int nio_pages;
> + size_t size_io = IOMMU_PAGE_ALIGN(size, tbl);
> + unsigned int nio_pages = size_io >> tbl->it_page_shift;
>  
> 
> - size = PAGE_ALIGN(size);
> - nio_pages = size >> tbl->it_page_shift;
>   iommu_free(tbl, dma_handle, nio_pages);
>   size = PAGE_ALIGN(size);
>   free_pages((unsigned long)vaddr, get_order(size));
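
For reference, the arithmetic in the example above can be checked with a
small standalone sketch (not kernel code; the sizes simply model a 64K
system page and 4K IOMMU pages):

#include <stdio.h>

#define ALIGN_UP(x, a)	(((x) + (a) - 1) & ~((a) - 1))

int main(void)
{
	unsigned long size = 128;		/* requested allocation in bytes */
	unsigned long page_size = 64 * 1024;	/* PAGE_SIZE = 64K */
	unsigned long io_page_size = 4 * 1024;	/* IOMMU_PAGE_SIZE() = 4K */

	/* Old behaviour: align to PAGE_SIZE before computing IOMMU pages. */
	unsigned long old_tces = ALIGN_UP(size, page_size) / io_page_size;

	/* New behaviour: align only to IOMMU_PAGE_SIZE() for the mapping. */
	unsigned long new_tces = ALIGN_UP(size, io_page_size) / io_page_size;

	printf("TCEs before: %lu, after: %lu\n", old_tces, new_tces); /* 16 vs 1 */
	return 0;
}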




Re: [PATCH 1/1] powerpc/kernel/iommu: Use largepool as a last resort when !largealloc

2021-04-21 Thread Leonardo Bras
Hello,

FYI: This patch was reviewed when it was part of another patchset:
http://patchwork.ozlabs.org/project/linuxppc-dev/patch/20200817234033.442511-4-leobra...@gmail.com/


On Thu, 2021-03-18 at 14:44 -0300, Leonardo Bras wrote:
> As of today, doing iommu_range_alloc() only for !largealloc (npages <= 15)
> will only be able to use 3/4 of the available pages, given pages on
> largepool  not being available for !largealloc.
> 
> This could mean some drivers not being able to fully use all the available
> pages for the DMA window.
> 
> Add pages on largepool as a last resort for !largealloc, making all pages
> of the DMA window available.
> 
> Signed-off-by: Leonardo Bras 
> Reviewed-by: Alexey Kardashevskiy 
> ---
>  arch/powerpc/kernel/iommu.c | 9 +
>  1 file changed, 9 insertions(+)
> 
> diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
> index 3329ef045805..ae6ad8dca605 100644
> --- a/arch/powerpc/kernel/iommu.c
> +++ b/arch/powerpc/kernel/iommu.c
> @@ -255,6 +255,15 @@ static unsigned long iommu_range_alloc(struct device 
> *dev,
>   pass++;
>   goto again;
>  
> 
> + } else if (pass == tbl->nr_pools + 1) {
> + /* Last resort: try largepool */
> + spin_unlock(&pool->lock);
> + pool = &tbl->large_pool;
> + spin_lock(&pool->lock);
> + pool->hint = pool->start;
> + pass++;
> + goto again;
> +
>   } else {
>   /* Give up */
>   spin_unlock_irqrestore(&(pool->lock), flags);




Re: [PATCH] powerpc/mce: save ignore_event flag unconditionally for UE

2021-04-21 Thread Ganesh



On 4/22/21 11:31 AM, Ganesh wrote:

On 4/7/21 10:28 AM, Ganesh Goudar wrote:


When we hit a UE while using the machine check safe copy routines, the
ignore_event flag is set and the event is ignored by the MCE handler.
The flag is also saved for deferred handling and printing of the MCE
event information, but as of now the flag is saved only when the
effective address is provided or the physical address is calculated,
which is not right.

Save the ignore_event flag regardless of whether the effective address
is provided or the physical address is calculated.

Without this change the following log is seen when the event is to be
ignored.

[  512.971365] MCE: CPU1: machine check (Severe)  UE Load/Store [Recovered]
[  512.971509] MCE: CPU1: NIP: [c00b67c0] memcpy+0x40/0x90
[  512.971655] MCE: CPU1: Initiator CPU
[  512.971739] MCE: CPU1: Unknown
[  512.972209] MCE: CPU1: machine check (Severe)  UE Load/Store [Recovered]
[  512.972334] MCE: CPU1: NIP: [c00b6808] memcpy+0x88/0x90
[  512.972456] MCE: CPU1: Initiator CPU
[  512.972534] MCE: CPU1: Unknown

Signed-off-by: Ganesh Goudar 
---
  arch/powerpc/kernel/mce.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)


Hi mpe, Any comments on this patch?

Please ignore, I see it's applied.
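
Since the diff body is not quoted above: as described, the change boils down
to saving the flag before, rather than inside, the physical-address check. A
simplified standalone sketch of that logic (the structure and function names
here are illustrative, not the exact ones in arch/powerpc/kernel/mce.c):

#include <stdbool.h>
#include <stdio.h>

struct ue_event_sketch {
	bool physical_address_provided;
	bool ignore_event;
};

static void save_ue_event(struct ue_event_sketch *evt, bool ignore_event,
			  unsigned long phys_addr)
{
	/* Fix: record ignore_event unconditionally for the UE... */
	evt->ignore_event = ignore_event;

	if (phys_addr != (unsigned long)-1) {
		evt->physical_address_provided = true;
		/* ...instead of only on this path, which is what the old
		 * code effectively did. */
	}
}

int main(void)
{
	struct ue_event_sketch evt = { 0 };

	/* UE with no physical address: the flag must still be saved. */
	save_ue_event(&evt, true, (unsigned long)-1);
	printf("ignore_event saved: %d\n", evt.ignore_event);
	return 0;
}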


Re: [PATCH] powerpc/mce: save ignore_event flag unconditionally for UE

2021-04-21 Thread Ganesh

On 4/7/21 10:28 AM, Ganesh Goudar wrote:


When we hit a UE while using the machine check safe copy routines, the
ignore_event flag is set and the event is ignored by the MCE handler.
The flag is also saved for deferred handling and printing of the MCE
event information, but as of now the flag is saved only when the
effective address is provided or the physical address is calculated,
which is not right.

Save the ignore_event flag regardless of whether the effective address
is provided or the physical address is calculated.

Without this change the following log is seen when the event is to be
ignored.

[  512.971365] MCE: CPU1: machine check (Severe)  UE Load/Store [Recovered]
[  512.971509] MCE: CPU1: NIP: [c00b67c0] memcpy+0x40/0x90
[  512.971655] MCE: CPU1: Initiator CPU
[  512.971739] MCE: CPU1: Unknown
[  512.972209] MCE: CPU1: machine check (Severe)  UE Load/Store [Recovered]
[  512.972334] MCE: CPU1: NIP: [c00b6808] memcpy+0x88/0x90
[  512.972456] MCE: CPU1: Initiator CPU
[  512.972534] MCE: CPU1: Unknown

Signed-off-by: Ganesh Goudar 
---
  arch/powerpc/kernel/mce.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)


Hi mpe, Any comments on this patch?



[PATCH v5 9/9] powerpc/mm: Enable move pmd/pud

2021-04-21 Thread Aneesh Kumar K.V
mremap HAVE_MOVE_PMD/PUD optimization time comparison for 1GB region:
1GB mremap - Source PTE-aligned, Destination PTE-aligned
  mremap time: 1127034ns
1GB mremap - Source PMD-aligned, Destination PMD-aligned
  mremap time:  508817ns
1GB mremap - Source PUD-aligned, Destination PUD-aligned
  mremap time:   23046ns

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/platforms/Kconfig.cputype | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/platforms/Kconfig.cputype 
b/arch/powerpc/platforms/Kconfig.cputype
index 3ce907523b1e..2e666e569fdf 100644
--- a/arch/powerpc/platforms/Kconfig.cputype
+++ b/arch/powerpc/platforms/Kconfig.cputype
@@ -97,6 +97,8 @@ config PPC_BOOK3S_64
select PPC_HAVE_PMU_SUPPORT
select SYS_SUPPORTS_HUGETLBFS
select HAVE_ARCH_TRANSPARENT_HUGEPAGE
+   select HAVE_MOVE_PMD
+   select HAVE_MOVE_PUD
select ARCH_ENABLE_THP_MIGRATION if TRANSPARENT_HUGEPAGE
select ARCH_SUPPORTS_NUMA_BALANCING
select IRQ_WORK
-- 
2.30.2



[PATCH v5 8/9] mm/mremap: Allow arch runtime override

2021-04-21 Thread Aneesh Kumar K.V
Architectures like ppc64 support faster mremap only with radix
translation. Hence allow a runtime check for whether fast mremap is
supported.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/tlb.h |  6 ++
 mm/mremap.c| 15 ++-
 2 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/tlb.h b/arch/powerpc/include/asm/tlb.h
index 160422a439aa..09a9ae5f3656 100644
--- a/arch/powerpc/include/asm/tlb.h
+++ b/arch/powerpc/include/asm/tlb.h
@@ -83,5 +83,11 @@ static inline int mm_is_thread_local(struct mm_struct *mm)
 }
 #endif
 
+#define arch_supports_page_table_move arch_supports_page_table_move
+static inline bool arch_supports_page_table_move(void)
+{
+   return radix_enabled();
+}
+
 #endif /* __KERNEL__ */
 #endif /* __ASM_POWERPC_TLB_H */
diff --git a/mm/mremap.c b/mm/mremap.c
index 9effca76bf17..27306168440f 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -25,7 +25,7 @@
 #include 
 
 #include 
-#include 
+#include 
 #include 
 
 #include "internal.h"
@@ -220,6 +220,15 @@ static inline void flush_pte_tlb_pwc_range(struct 
vm_area_struct *vma,
 }
 #endif
 
+#ifndef arch_supports_page_table_move
+#define arch_supports_page_table_move arch_supports_page_table_move
+static inline bool arch_supports_page_table_move(void)
+{
+   return IS_ENABLED(CONFIG_HAVE_MOVE_PMD) ||
+   IS_ENABLED(CONFIG_HAVE_MOVE_PUD);
+}
+#endif
+
 #ifdef CONFIG_HAVE_MOVE_PMD
 static bool move_normal_pmd(struct vm_area_struct *vma, unsigned long old_addr,
  unsigned long new_addr, pmd_t *old_pmd, pmd_t *new_pmd)
@@ -228,6 +237,8 @@ static bool move_normal_pmd(struct vm_area_struct *vma, 
unsigned long old_addr,
struct mm_struct *mm = vma->vm_mm;
pmd_t pmd;
 
+   if (!arch_supports_page_table_move())
+   return false;
/*
 * The destination pmd shouldn't be established, free_pgtables()
 * should have released it.
@@ -294,6 +305,8 @@ static bool move_normal_pud(struct vm_area_struct *vma, 
unsigned long old_addr,
struct mm_struct *mm = vma->vm_mm;
pud_t pud;
 
+   if (!arch_supports_page_table_move())
+   return false;
/*
 * The destination pud shouldn't be established, free_pgtables()
 * should have released it.
-- 
2.30.2
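
A note on the override mechanism in this patch: the arch header defines the
function and also defines the macro to its own name, so the generic #ifndef
fallback in mm/mremap.c is compiled out. A standalone sketch of that idiom
(toy return values, not the kernel code):

#include <stdbool.h>
#include <stdio.h>

/* "Arch header": provides the override and claims the macro name. */
#define arch_supports_page_table_move arch_supports_page_table_move
static inline bool arch_supports_page_table_move(void)
{
	return true;	/* e.g. radix_enabled() on ppc64 */
}

/* "Generic code": only used when no arch override was defined. */
#ifndef arch_supports_page_table_move
#define arch_supports_page_table_move arch_supports_page_table_move
static inline bool arch_supports_page_table_move(void)
{
	return false;
}
#endif

int main(void)
{
	printf("fast mremap allowed: %d\n", arch_supports_page_table_move());
	return 0;
}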



[PATCH v5 7/9] mm/mremap: Move TLB flush outside page table lock

2021-04-21 Thread Aneesh Kumar K.V
Move the TLB flush outside the page table lock so that the kernel does
less work while holding the lock. Releasing the ptl while the old TLB
contents are still valid is fine: any access through those stale entries
behaves as if it happened before the level-3 or level-2 entry update.

Signed-off-by: Aneesh Kumar K.V 
---
 mm/mremap.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/mm/mremap.c b/mm/mremap.c
index 109560977944..9effca76bf17 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -258,7 +258,7 @@ static bool move_normal_pmd(struct vm_area_struct *vma, 
unsigned long old_addr,
 * We don't have to worry about the ordering of src and dst
 * ptlocks because exclusive mmap_lock prevents deadlock.
 */
-   old_ptl = pmd_lock(vma->vm_mm, old_pmd);
+   old_ptl = pmd_lock(mm, old_pmd);
new_ptl = pmd_lockptr(mm, new_pmd);
if (new_ptl != old_ptl)
spin_lock_nested(new_ptl, SINGLE_DEPTH_NESTING);
@@ -270,11 +270,11 @@ static bool move_normal_pmd(struct vm_area_struct *vma, 
unsigned long old_addr,
VM_BUG_ON(!pmd_none(*new_pmd));
pmd_populate(mm, new_pmd, (pgtable_t)pmd_page_vaddr(pmd));
 
-   flush_pte_tlb_pwc_range(vma, old_addr, old_addr + PMD_SIZE);
if (new_ptl != old_ptl)
spin_unlock(new_ptl);
spin_unlock(old_ptl);
 
+   flush_pte_tlb_pwc_range(vma, old_addr, old_addr + PMD_SIZE);
return true;
 }
 #else
@@ -305,7 +305,7 @@ static bool move_normal_pud(struct vm_area_struct *vma, 
unsigned long old_addr,
 * We don't have to worry about the ordering of src and dst
 * ptlocks because exclusive mmap_lock prevents deadlock.
 */
-   old_ptl = pud_lock(vma->vm_mm, old_pud);
+   old_ptl = pud_lock(mm, old_pud);
new_ptl = pud_lockptr(mm, new_pud);
if (new_ptl != old_ptl)
spin_lock_nested(new_ptl, SINGLE_DEPTH_NESTING);
@@ -317,11 +317,11 @@ static bool move_normal_pud(struct vm_area_struct *vma, 
unsigned long old_addr,
VM_BUG_ON(!pud_none(*new_pud));
 
pud_populate(mm, new_pud, (pmd_t *)pud_page_vaddr(pud));
-   flush_pte_tlb_pwc_range(vma, old_addr, old_addr + PUD_SIZE);
if (new_ptl != old_ptl)
spin_unlock(new_ptl);
spin_unlock(old_ptl);
 
+   flush_pte_tlb_pwc_range(vma, old_addr, old_addr + PUD_SIZE);
return true;
 }
 #else
-- 
2.30.2



[PATCH v5 6/9] mm/mremap: Use range flush that does TLB and page walk cache flush

2021-04-21 Thread Aneesh Kumar K.V
Some architectures have the concept of a page walk cache, which needs to
be flushed when updating higher levels of the page tables. A fast mremap
that moves page table pages instead of copying pte entries should flush
the page walk cache, since the old translation cache is no longer valid.

Add a new helper, flush_pte_tlb_pwc_range(), which invalidates both the
TLB and the page walk cache for ranges where TLB entries are mapped with
page size PAGE_SIZE.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/book3s/64/tlbflush.h | 10 ++
 mm/mremap.c   | 14 --
 2 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush.h 
b/arch/powerpc/include/asm/book3s/64/tlbflush.h
index f9f8a3a264f7..e84fee9db106 100644
--- a/arch/powerpc/include/asm/book3s/64/tlbflush.h
+++ b/arch/powerpc/include/asm/book3s/64/tlbflush.h
@@ -80,6 +80,16 @@ static inline void flush_hugetlb_tlb_range(struct 
vm_area_struct *vma,
return flush_hugetlb_tlb_pwc_range(vma, start, end, false);
 }
 
+#define flush_pte_tlb_pwc_range flush_tlb_pwc_range
+static inline void flush_pte_tlb_pwc_range(struct vm_area_struct *vma,
+  unsigned long start, unsigned long 
end)
+{
+   if (radix_enabled())
+   return radix__flush_tlb_pwc_range_psize(vma->vm_mm, start,
+   end, mmu_virtual_psize, 
true);
+   return hash__flush_tlb_range(vma, start, end);
+}
+
 static inline void flush_tlb_range(struct vm_area_struct *vma,
   unsigned long start, unsigned long end)
 {
diff --git a/mm/mremap.c b/mm/mremap.c
index 574287f9bb39..109560977944 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -210,6 +210,16 @@ static void move_ptes(struct vm_area_struct *vma, pmd_t 
*old_pmd,
drop_rmap_locks(vma);
 }
 
+#ifndef flush_pte_tlb_pwc_range
+#define flush_pte_tlb_pwc_range flush_pte_tlb_pwc_range
+static inline void flush_pte_tlb_pwc_range(struct vm_area_struct *vma,
+  unsigned long start,
+  unsigned long end)
+{
+   return flush_tlb_range(vma, start, end);
+}
+#endif
+
 #ifdef CONFIG_HAVE_MOVE_PMD
 static bool move_normal_pmd(struct vm_area_struct *vma, unsigned long old_addr,
  unsigned long new_addr, pmd_t *old_pmd, pmd_t *new_pmd)
@@ -260,7 +270,7 @@ static bool move_normal_pmd(struct vm_area_struct *vma, 
unsigned long old_addr,
VM_BUG_ON(!pmd_none(*new_pmd));
pmd_populate(mm, new_pmd, (pgtable_t)pmd_page_vaddr(pmd));
 
-   flush_tlb_range(vma, old_addr, old_addr + PMD_SIZE);
+   flush_pte_tlb_pwc_range(vma, old_addr, old_addr + PMD_SIZE);
if (new_ptl != old_ptl)
spin_unlock(new_ptl);
spin_unlock(old_ptl);
@@ -307,7 +317,7 @@ static bool move_normal_pud(struct vm_area_struct *vma, 
unsigned long old_addr,
VM_BUG_ON(!pud_none(*new_pud));
 
pud_populate(mm, new_pud, (pmd_t *)pud_page_vaddr(pud));
-   flush_tlb_range(vma, old_addr, old_addr + PUD_SIZE);
+   flush_pte_tlb_pwc_range(vma, old_addr, old_addr + PUD_SIZE);
if (new_ptl != old_ptl)
spin_unlock(new_ptl);
spin_unlock(old_ptl);
-- 
2.30.2



[PATCH v5 5/9] powerpc/mm/book3s64: Update tlb flush routines to take a page walk cache flush argument

2021-04-21 Thread Aneesh Kumar K.V
No functional change in this patch

Signed-off-by: Aneesh Kumar K.V 
---
 .../include/asm/book3s/64/tlbflush-radix.h| 19 +++-
 arch/powerpc/include/asm/book3s/64/tlbflush.h | 23 ---
 arch/powerpc/mm/book3s64/radix_hugetlbpage.c  |  4 +--
 arch/powerpc/mm/book3s64/radix_tlb.c  | 29 +++
 4 files changed, 42 insertions(+), 33 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h 
b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
index 8b33601cdb9d..171441a43b35 100644
--- a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
+++ b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
@@ -56,15 +56,18 @@ static inline void radix__flush_all_lpid_guest(unsigned int 
lpid)
 }
 #endif
 
-extern void radix__flush_hugetlb_tlb_range(struct vm_area_struct *vma,
-  unsigned long start, unsigned long 
end);
-extern void radix__flush_tlb_range_psize(struct mm_struct *mm, unsigned long 
start,
-unsigned long end, int psize);
-extern void radix__flush_pmd_tlb_range(struct vm_area_struct *vma,
-  unsigned long start, unsigned long end);
-extern void radix__flush_tlb_range(struct vm_area_struct *vma, unsigned long 
start,
+void radix__flush_hugetlb_tlb_range(struct vm_area_struct *vma,
+   unsigned long start, unsigned long end,
+   bool flush_pwc);
+void radix__flush_pmd_tlb_range(struct vm_area_struct *vma,
+   unsigned long start, unsigned long end,
+   bool flush_pwc);
+void radix__flush_tlb_pwc_range_psize(struct mm_struct *mm, unsigned long 
start,
+ unsigned long end, int psize, bool 
flush_pwc);
+void radix__flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
unsigned long end);
-extern void radix__flush_tlb_kernel_range(unsigned long start, unsigned long 
end);
+void radix__flush_tlb_kernel_range(unsigned long start, unsigned long end);
+
 
 extern void radix__local_flush_tlb_mm(struct mm_struct *mm);
 extern void radix__local_flush_all_mm(struct mm_struct *mm);
diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush.h 
b/arch/powerpc/include/asm/book3s/64/tlbflush.h
index 215973b4cb26..f9f8a3a264f7 100644
--- a/arch/powerpc/include/asm/book3s/64/tlbflush.h
+++ b/arch/powerpc/include/asm/book3s/64/tlbflush.h
@@ -45,13 +45,30 @@ static inline void tlbiel_all_lpid(bool radix)
hash__tlbiel_all(TLB_INVAL_SCOPE_LPID);
 }
 
+static inline void flush_pmd_tlb_pwc_range(struct vm_area_struct *vma,
+  unsigned long start,
+  unsigned long end,
+  bool flush_pwc)
+{
+   if (radix_enabled())
+   return radix__flush_pmd_tlb_range(vma, start, end, flush_pwc);
+   return hash__flush_tlb_range(vma, start, end);
+}
 
 #define __HAVE_ARCH_FLUSH_PMD_TLB_RANGE
 static inline void flush_pmd_tlb_range(struct vm_area_struct *vma,
   unsigned long start, unsigned long end)
+{
+   return flush_pmd_tlb_pwc_range(vma, start, end, false);
+}
+
+static inline void flush_hugetlb_tlb_pwc_range(struct vm_area_struct *vma,
+  unsigned long start,
+  unsigned long end,
+  bool flush_pwc)
 {
if (radix_enabled())
-   return radix__flush_pmd_tlb_range(vma, start, end);
+   return radix__flush_hugetlb_tlb_range(vma, start, end, 
flush_pwc);
return hash__flush_tlb_range(vma, start, end);
 }
 
@@ -60,9 +77,7 @@ static inline void flush_hugetlb_tlb_range(struct 
vm_area_struct *vma,
   unsigned long start,
   unsigned long end)
 {
-   if (radix_enabled())
-   return radix__flush_hugetlb_tlb_range(vma, start, end);
-   return hash__flush_tlb_range(vma, start, end);
+   return flush_hugetlb_tlb_pwc_range(vma, start, end, false);
 }
 
 static inline void flush_tlb_range(struct vm_area_struct *vma,
diff --git a/arch/powerpc/mm/book3s64/radix_hugetlbpage.c 
b/arch/powerpc/mm/book3s64/radix_hugetlbpage.c
index cb91071eef52..e62f5679b119 100644
--- a/arch/powerpc/mm/book3s64/radix_hugetlbpage.c
+++ b/arch/powerpc/mm/book3s64/radix_hugetlbpage.c
@@ -26,13 +26,13 @@ void radix__local_flush_hugetlb_page(struct vm_area_struct 
*vma, unsigned long v
 }
 
 void radix__flush_hugetlb_tlb_range(struct vm_area_struct *vma, unsigned long 
start,
-  unsigned long end)
+   unsigned long end, bool flush_pwc)
 {
int psize;
struct hstate *hstate = hstate_f

[PATCH v5 4/9] powerpc/mm/book3s64: Fix possible build error

2021-04-21 Thread Aneesh Kumar K.V
Update _tlbiel_pid() such that we can avoid build errors like below when
using this function in other places.

arch/powerpc/mm/book3s64/radix_tlb.c: In function 
‘__radix__flush_tlb_range_psize’:
arch/powerpc/mm/book3s64/radix_tlb.c:114:2: warning: ‘asm’ operand 3 probably 
does not match constraints
  114 |  asm volatile(PPC_TLBIEL(%0, %4, %3, %2, %1)
  |  ^~~
arch/powerpc/mm/book3s64/radix_tlb.c:114:2: error: impossible constraint in 
‘asm’
make[4]: *** [scripts/Makefile.build:271: arch/powerpc/mm/book3s64/radix_tlb.o] 
Error 1

With this fix, we can also drop the __always_inline on
__radix__flush_tlb_range_psize(), which was added by commit e12d6d7d46a6
("powerpc/mm/radix: mark __radix__flush_tlb_range_psize() as __always_inline").

Reviewed-by: Christophe Leroy 
Acked-by: Michael Ellerman 
Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/mm/book3s64/radix_tlb.c | 26 +-
 1 file changed, 17 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/mm/book3s64/radix_tlb.c 
b/arch/powerpc/mm/book3s64/radix_tlb.c
index 409e61210789..817a02ef6032 100644
--- a/arch/powerpc/mm/book3s64/radix_tlb.c
+++ b/arch/powerpc/mm/book3s64/radix_tlb.c
@@ -291,22 +291,30 @@ static inline void fixup_tlbie_lpid(unsigned long lpid)
 /*
  * We use 128 set in radix mode and 256 set in hpt mode.
  */
-static __always_inline void _tlbiel_pid(unsigned long pid, unsigned long ric)
+static inline void _tlbiel_pid(unsigned long pid, unsigned long ric)
 {
int set;
 
asm volatile("ptesync": : :"memory");
 
-   /*
-* Flush the first set of the TLB, and if we're doing a RIC_FLUSH_ALL,
-* also flush the entire Page Walk Cache.
-*/
-   __tlbiel_pid(pid, 0, ric);
+   switch (ric) {
+   case RIC_FLUSH_PWC:
 
-   /* For PWC, only one flush is needed */
-   if (ric == RIC_FLUSH_PWC) {
+   /* For PWC, only one flush is needed */
+   __tlbiel_pid(pid, 0, RIC_FLUSH_PWC);
ppc_after_tlbiel_barrier();
return;
+   case RIC_FLUSH_TLB:
+   __tlbiel_pid(pid, 0, RIC_FLUSH_TLB);
+   break;
+   case RIC_FLUSH_ALL:
+   default:
+   /*
+* Flush the first set of the TLB, and if
+* we're doing a RIC_FLUSH_ALL, also flush
+* the entire Page Walk Cache.
+*/
+   __tlbiel_pid(pid, 0, RIC_FLUSH_ALL);
}
 
if (!cpu_has_feature(CPU_FTR_ARCH_31)) {
@@ -1176,7 +1184,7 @@ void radix__tlb_flush(struct mmu_gather *tlb)
}
 }
 
-static __always_inline void __radix__flush_tlb_range_psize(struct mm_struct 
*mm,
+static void __radix__flush_tlb_range_psize(struct mm_struct *mm,
unsigned long start, unsigned long end,
int psize, bool also_pwc)
 {
-- 
2.30.2



[PATCH v5 3/9] mm/mremap: Use pmd/pud_poplulate to update page table entries

2021-04-21 Thread Aneesh Kumar K.V
pmd/pud_populate is the right interface for setting the respective page
table entries. Some architectures, like ppc64, assume that set_pmd/pud_at
can only be used to set a hugepage PTE. Since we are not setting up a
hugepage PTE here, use the pmd/pud_populate interface.

Signed-off-by: Aneesh Kumar K.V 
---
 mm/mremap.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/mm/mremap.c b/mm/mremap.c
index ec8f840399ed..574287f9bb39 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -26,6 +26,7 @@
 
 #include 
 #include 
+#include 
 
 #include "internal.h"
 
@@ -257,9 +258,8 @@ static bool move_normal_pmd(struct vm_area_struct *vma, 
unsigned long old_addr,
pmd_clear(old_pmd);
 
VM_BUG_ON(!pmd_none(*new_pmd));
+   pmd_populate(mm, new_pmd, (pgtable_t)pmd_page_vaddr(pmd));
 
-   /* Set the new pmd */
-   set_pmd_at(mm, new_addr, new_pmd, pmd);
flush_tlb_range(vma, old_addr, old_addr + PMD_SIZE);
if (new_ptl != old_ptl)
spin_unlock(new_ptl);
@@ -306,8 +306,7 @@ static bool move_normal_pud(struct vm_area_struct *vma, 
unsigned long old_addr,
 
VM_BUG_ON(!pud_none(*new_pud));
 
-   /* Set the new pud */
-   set_pud_at(mm, new_addr, new_pud, pud);
+   pud_populate(mm, new_pud, (pmd_t *)pud_page_vaddr(pud));
flush_tlb_range(vma, old_addr, old_addr + PUD_SIZE);
if (new_ptl != old_ptl)
spin_unlock(new_ptl);
-- 
2.30.2



[PATCH v5 2/9] selftest/mremap_test: Avoid crash with static build

2021-04-21 Thread Aneesh Kumar K.V
With a large mmap size, the mapping can overlap with the text area, and
using MAP_FIXED then results in unmapping that area. Switch to
MAP_FIXED_NOREPLACE and handle the EEXIST error.

Reviewed-by: Kalesh Singh 
Signed-off-by: Aneesh Kumar K.V 
---
 tools/testing/selftests/vm/mremap_test.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/vm/mremap_test.c 
b/tools/testing/selftests/vm/mremap_test.c
index c9a5461eb786..0624d1bd71b5 100644
--- a/tools/testing/selftests/vm/mremap_test.c
+++ b/tools/testing/selftests/vm/mremap_test.c
@@ -75,9 +75,10 @@ static void *get_source_mapping(struct config c)
 retry:
addr += c.src_alignment;
src_addr = mmap((void *) addr, c.region_size, PROT_READ | PROT_WRITE,
-   MAP_FIXED | MAP_ANONYMOUS | MAP_SHARED, -1, 0);
+   MAP_FIXED_NOREPLACE | MAP_ANONYMOUS | MAP_SHARED,
+   -1, 0);
if (src_addr == MAP_FAILED) {
-   if (errno == EPERM)
+   if (errno == EPERM || errno == EEXIST)
goto retry;
goto error;
}
-- 
2.30.2



[PATCH v5 1/9] selftest/mremap_test: Update the test to handle pagesize other than 4K

2021-04-21 Thread Aneesh Kumar K.V
Instead of hardcoding the 4K page size, fetch it using sysconf(). For the
performance measurements, the test still assumes 2M and 1G hugepage sizes.

Reviewed-by: Kalesh Singh 
Signed-off-by: Aneesh Kumar K.V 
---
 tools/testing/selftests/vm/mremap_test.c | 113 ---
 1 file changed, 61 insertions(+), 52 deletions(-)

diff --git a/tools/testing/selftests/vm/mremap_test.c 
b/tools/testing/selftests/vm/mremap_test.c
index 9c391d016922..c9a5461eb786 100644
--- a/tools/testing/selftests/vm/mremap_test.c
+++ b/tools/testing/selftests/vm/mremap_test.c
@@ -45,14 +45,15 @@ enum {
_4MB = 4ULL << 20,
_1GB = 1ULL << 30,
_2GB = 2ULL << 30,
-   PTE = _4KB,
PMD = _2MB,
PUD = _1GB,
 };
 
+#define PTE page_size
+
 #define MAKE_TEST(source_align, destination_align, size,   \
  overlaps, should_fail, test_name) \
-{  \
+(struct test){ \
.name = test_name,  \
.config = { \
.src_alignment = source_align,  \
@@ -252,12 +253,17 @@ static int parse_args(int argc, char **argv, unsigned int 
*threshold_mb,
return 0;
 }
 
+#define MAX_TEST 13
+#define MAX_PERF_TEST 3
 int main(int argc, char **argv)
 {
int failures = 0;
int i, run_perf_tests;
unsigned int threshold_mb = VALIDATION_DEFAULT_THRESHOLD;
unsigned int pattern_seed;
+   struct test test_cases[MAX_TEST];
+   struct test perf_test_cases[MAX_PERF_TEST];
+   int page_size;
time_t t;
 
pattern_seed = (unsigned int) time(&t);
@@ -268,56 +274,59 @@ int main(int argc, char **argv)
ksft_print_msg("Test 
configs:\n\tthreshold_mb=%u\n\tpattern_seed=%u\n\n",
   threshold_mb, pattern_seed);
 
-   struct test test_cases[] = {
-   /* Expected mremap failures */
-   MAKE_TEST(_4KB, _4KB, _4KB, OVERLAPPING, EXPECT_FAILURE,
- "mremap - Source and Destination Regions Overlapping"),
-   MAKE_TEST(_4KB, _1KB, _4KB, NON_OVERLAPPING, EXPECT_FAILURE,
- "mremap - Destination Address Misaligned (1KB-aligned)"),
-   MAKE_TEST(_1KB, _4KB, _4KB, NON_OVERLAPPING, EXPECT_FAILURE,
- "mremap - Source Address Misaligned (1KB-aligned)"),
-
-   /* Src addr PTE aligned */
-   MAKE_TEST(PTE, PTE, _8KB, NON_OVERLAPPING, EXPECT_SUCCESS,
- "8KB mremap - Source PTE-aligned, Destination PTE-aligned"),
-
-   /* Src addr 1MB aligned */
-   MAKE_TEST(_1MB, PTE, _2MB, NON_OVERLAPPING, EXPECT_SUCCESS,
- "2MB mremap - Source 1MB-aligned, Destination PTE-aligned"),
-   MAKE_TEST(_1MB, _1MB, _2MB, NON_OVERLAPPING, EXPECT_SUCCESS,
- "2MB mremap - Source 1MB-aligned, Destination 1MB-aligned"),
-
-   /* Src addr PMD aligned */
-   MAKE_TEST(PMD, PTE, _4MB, NON_OVERLAPPING, EXPECT_SUCCESS,
- "4MB mremap - Source PMD-aligned, Destination PTE-aligned"),
-   MAKE_TEST(PMD, _1MB, _4MB, NON_OVERLAPPING, EXPECT_SUCCESS,
- "4MB mremap - Source PMD-aligned, Destination 1MB-aligned"),
-   MAKE_TEST(PMD, PMD, _4MB, NON_OVERLAPPING, EXPECT_SUCCESS,
- "4MB mremap - Source PMD-aligned, Destination PMD-aligned"),
-
-   /* Src addr PUD aligned */
-   MAKE_TEST(PUD, PTE, _2GB, NON_OVERLAPPING, EXPECT_SUCCESS,
- "2GB mremap - Source PUD-aligned, Destination PTE-aligned"),
-   MAKE_TEST(PUD, _1MB, _2GB, NON_OVERLAPPING, EXPECT_SUCCESS,
- "2GB mremap - Source PUD-aligned, Destination 1MB-aligned"),
-   MAKE_TEST(PUD, PMD, _2GB, NON_OVERLAPPING, EXPECT_SUCCESS,
- "2GB mremap - Source PUD-aligned, Destination PMD-aligned"),
-   MAKE_TEST(PUD, PUD, _2GB, NON_OVERLAPPING, EXPECT_SUCCESS,
- "2GB mremap - Source PUD-aligned, Destination PUD-aligned"),
-   };
-
-   struct test perf_test_cases[] = {
-   /*
-* mremap 1GB region - Page table level aligned time
-* comparison.
-*/
-   MAKE_TEST(PTE, PTE, _1GB, NON_OVERLAPPING, EXPECT_SUCCESS,
- "1GB mremap - Source PTE-aligned, Destination PTE-aligned"),
-   MAKE_TEST(PMD, PMD, _1GB, NON_OVERLAPPING, EXPECT_SUCCESS,
- "1GB mremap - Source PMD-aligned, Destination PMD-aligned"),
-   MAKE_TEST(PUD, PUD, _1GB, NON_OVERLAPPING, EXPECT_SUCCESS,
- "1GB mremap - Source PUD-aligned, Destination PUD-aligned"),
-   };
+   page_size = sysconf(_SC_PAGESIZE);
+
+   /* Expected mremap failures */
+   test_cases[0] = MAKE_TES

[PATCH v5 0/9] Speedup mremap on ppc64

2021-04-21 Thread Aneesh Kumar K.V
Hi,

This patchset enables MOVE_PMD/MOVE_PUD support on power. This requires
the platform to support updating higher-level page tables without
updating individual page table entries, and it also needs to invalidate
the Page Walk Cache on architectures that support one.

Changes from v4:
* Change function name and arguments based on review feedback.

Changes from v3:
* Fix build error reported by kernel test robot
* Address review feedback.

Changes from v2:
* switch from using mmu_gather to flush_pte_tlb_pwc_range() 

Changes from v1:
* Rebase to recent upstream
* Fix build issues with tlb_gather_mmu changes



Aneesh Kumar K.V (9):
  selftest/mremap_test: Update the test to handle pagesize other than 4K
  selftest/mremap_test: Avoid crash with static build
  mm/mremap: Use pmd/pud_poplulate to update page table entries
  powerpc/mm/book3s64: Fix possible build error
  powerpc/mm/book3s64: Update tlb flush routines to take a page walk
cache flush argument
  mm/mremap: Use range flush that does TLB and page walk cache flush
  mm/mremap: Move TLB flush outside page table lock
  mm/mremap: Allow arch runtime override
  powerpc/mm: Enable move pmd/pud

 .../include/asm/book3s/64/tlbflush-radix.h|  19 +--
 arch/powerpc/include/asm/book3s/64/tlbflush.h |  29 -
 arch/powerpc/include/asm/tlb.h|   6 +
 arch/powerpc/mm/book3s64/radix_hugetlbpage.c  |   4 +-
 arch/powerpc/mm/book3s64/radix_tlb.c  |  55 
 arch/powerpc/platforms/Kconfig.cputype|   2 +
 mm/mremap.c   |  40 --
 tools/testing/selftests/vm/mremap_test.c  | 118 ++
 8 files changed, 170 insertions(+), 103 deletions(-)

-- 
2.30.2



Re: [PATCH V2 net] ibmvnic: Continue with reset if set link down failed

2021-04-21 Thread Lijun Pan
On Tue, Apr 20, 2021 at 4:37 PM Dany Madden  wrote:
>
> When ibmvnic gets a FATAL error message from the vnicserver, it marks
> the Command Respond Queue (CRQ) inactive and resets the adapter. If this
> FATAL reset fails and a transmission timeout reset follows, the CRQ is
> still inactive, ibmvnic's attempt to set link down will also fail. If
> ibmvnic abandons the reset because of this failed set link down and this
> is the last reset in the workqueue, then this adapter will be left in an
> inoperable state.
>
> Instead, make the driver ignore this link down failure and continue to
> free and re-register CRQ so that the adapter has an opportunity to
> recover.
>
> Fixes: ed651a10875f ("ibmvnic: Updated reset handling")
> Signed-off-by: Dany Madden 
> Reviewed-by: Rick Lindsley 
> Reviewed-by: Sukadev Bhattiprolu 

One thing I would like to point out, as Nathan Lynch already has, is
that Reviewed-by tags given by the same group of people from the same
company lose credibility over time if you never critique or ask
questions on the list.


Re: [PATCH V2 net] ibmvnic: Continue with reset if set link down failed

2021-04-21 Thread Lijun Pan
On Wed, Apr 21, 2021 at 3:06 AM Rick Lindsley
 wrote:
>
> On 4/20/21 2:42 PM, Lijun Pan wrote:
> >
> > This v2 does not address the concerns mentioned in v1.
> > And I think it is better to exit with error from do_reset, and schedule a 
> > thorough
> > do_hard_reset if the adapter is already in an unstable state.
>
> But the point is that the testing and analysis has indicated that doing a full
> hard reset is not necessary. We are about to take the very action which will 
> fix
> this situation, but currently do not.

The testing was done on this patch. It was not performed on a full hard reset.
So I don't think you could even compare the two results.

>
> Please describe the advantage in deferring it further by routing it through
> do_hard_reset().  I don't see one.

It is not deferred. It exits with an error and calls do_hard_reset.
See my reply to Suka's mail.


Re: [PATCH V2 net] ibmvnic: Continue with reset if set link down failed

2021-04-21 Thread Lijun Pan
On Wed, Apr 21, 2021 at 2:25 AM Sukadev Bhattiprolu
 wrote:
>
> Lijun Pan [l...@linux.vnet.ibm.com] wrote:
> >
> >
> > > On Apr 20, 2021, at 4:35 PM, Dany Madden  wrote:
> > >
> > > When ibmvnic gets a FATAL error message from the vnicserver, it marks
> > > the Command Respond Queue (CRQ) inactive and resets the adapter. If this
> > > FATAL reset fails and a transmission timeout reset follows, the CRQ is
> > > still inactive, ibmvnic's attempt to set link down will also fail. If
> > > ibmvnic abandons the reset because of this failed set link down and this
> > > is the last reset in the workqueue, then this adapter will be left in an
> > > inoperable state.
> > >
> > > Instead, make the driver ignore this link down failure and continue to
> > > free and re-register CRQ so that the adapter has an opportunity to
> > > recover.
> >
> > This v2 does not address the concerns mentioned in v1.
> > And I think it is better to exit with error from do_reset, and schedule a 
> > thorough
> > do_hard_reset if the adapter is already in an unstable state.
>
> We had a FATAL error and when handling it, we failed to send a
> link-down message to the VIOS. So what we need to try next is to
> reset the connection with the VIOS. For this we must talk to the
> firmware using the H_FREE_CRQ and H_REG_CRQ hcalls. do_reset()
> does just that in ibmvnic_reset_crq().
>
> Now, sure we can attempt a "thorough hard reset" which also does
> the same hcalls to reestablish the connection. Is there any
> other magic in do_hard_reset()? But in addition, it also frees lot
> more Linux kernel buffers and reallocates them for instance.

Working around everything in do_reset will make the code very difficult
to manage. Ultimately do_reset can do anything I am afraid, and do_hard_reset
can be removed completely or merged into do_reset.

>
> If we are having a communication problem with the VIOS, what is
> the point of freeing and reallocating Linux kernel buffers? Beside
> being inefficient, it would expose us to even more errors during
> reset under heavy workloads?

No real customer runs the system under that heavy load created by
HTX stress test, which can tear down any working system.

>
> From what I understand so far, do_reset() is complicated because
> it is attempting some optimizations.  If we are going to fall back
> to hard reset for every error we might as well drop the do_reset()
> and just do the "thorough hard reset" every time right?

I think such optimizations are geared toward passing HTX tests. Whether
the optimization actually benefits the adapter, say by making it more
stable, I doubt. I think there should be a trade-off between optimization
and stability.


Re: [PATCH] powerpc: Initialize local variable fdt to NULL in elf64_load()

2021-04-21 Thread Daniel Axtens
Daniel Axtens  writes:

> Hi Lakshmi,
>
>> On 4/15/21 12:14 PM, Lakshmi Ramasubramanian wrote:
>>
>> Sorry - missed copying device-tree and powerpc mailing lists.
>>
>>> There are a few "goto out;" statements before the local variable "fdt"
>>> is initialized through the call to of_kexec_alloc_and_setup_fdt() in
>>> elf64_load(). This will result in an uninitialized "fdt" being passed
>>> to kvfree() in this function if there is an error before the call to
>>> of_kexec_alloc_and_setup_fdt().
>>> 
>>> Initialize the local variable "fdt" to NULL.
>>>
> I'm a huge fan of initialising local variables! But I'm struggling to
> find the code path that will lead to an uninit fdt being returned...

OK, so perhaps this was putting it too strongly. I have been bitten
by uninitialised things enough in C that I may have taken a slightly
overly aggressive view of fixing them in the source rather than the
compiler. I do think compiler-level mitigations are better, and I take
the point that we don't want to defeat compiler checking.

(Does anyone - and by anyone I mean any large distro - compile with
local variables inited by the compiler?)

I was reading the version in powerpc/next, clearly I should have looked
at linux-next. Having said that, I think I will leave the rest of the
bikeshedding to the rest of you, you all seem to have it in hand :)

Kind regards,
Daniel

>
> The out label reads in part:
>
>   /* Make kimage_file_post_load_cleanup free the fdt buffer for us. */
>   return ret ? ERR_PTR(ret) : fdt;
>
> As far as I can tell, any time we get a non-zero ret, we're going to
> return an error pointer rather than the uninitialised value...
>
> (btw, it does look like we might leak fdt if we have an error after we
> successfully kmalloc it.)
>
> Am I missing something? Can you link to the report for the kernel test
> robot or from Dan? 
>
> FWIW, I think it's worth including this patch _anyway_ because initing
> local variables is good practice, but I'm just not sure on the
> justification.
>
> Kind regards,
> Daniel
>
>>> Signed-off-by: Lakshmi Ramasubramanian 
>>> Reported-by: kernel test robot 
>>> Reported-by: Dan Carpenter 
>>> ---
>>>   arch/powerpc/kexec/elf_64.c | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>> 
>>> diff --git a/arch/powerpc/kexec/elf_64.c b/arch/powerpc/kexec/elf_64.c
>>> index 5a569bb51349..0051440c1f77 100644
>>> --- a/arch/powerpc/kexec/elf_64.c
>>> +++ b/arch/powerpc/kexec/elf_64.c
>>> @@ -32,7 +32,7 @@ static void *elf64_load(struct kimage *image, char 
>>> *kernel_buf,
>>> int ret;
>>> unsigned long kernel_load_addr;
>>> unsigned long initrd_load_addr = 0, fdt_load_addr;
>>> -   void *fdt;
>>> +   void *fdt = NULL;
>>> const void *slave_code;
>>> struct elfhdr ehdr;
>>> char *modified_cmdline = NULL;
>>> 
>>
>> thanks,
>>   -lakshmi


Re: mmu.c:undefined reference to `patch__hash_page_A0'

2021-04-21 Thread Randy Dunlap
On 4/21/21 1:43 AM, Christophe Leroy wrote:
> 
> 
> Le 18/04/2021 à 19:15, Randy Dunlap a écrit :
>> On 4/18/21 3:43 AM, Christophe Leroy wrote:
>>>
>>>
>>> Le 18/04/2021 à 02:02, Randy Dunlap a écrit :
>>>> HI--
>>>>
>>>> I no longer see this build error.
>>>
>>> Fixed by 
>>> https://github.com/torvalds/linux/commit/acdad8fb4a1574323db88f98a38b630691574e16
>>>
>>>> However:
>>>>

...

>>>>
>>>> I do see this build error:
>>>>
>>>> powerpc-linux-ld: arch/powerpc/boot/wrapper.a(decompress.o): in function 
>>>> `partial_decompress':
>>>> decompress.c:(.text+0x1f0): undefined reference to `__decompress'
>>>>
>>>> when either
>>>> CONFIG_KERNEL_LZO=y
>>>> or
>>>> CONFIG_KERNEL_LZMA=y
>>>>
>>>> but the build succeeds when either
>>>> CONFIG_KERNEL_GZIP=y
>>>> or
>>>> CONFIG_KERNEL_XZ=y
>>>>
>>>> I guess that is due to arch/powerpc/boot/decompress.c doing this:
>>>>
>>>> #ifdef CONFIG_KERNEL_GZIP
>>>> #    include "decompress_inflate.c"
>>>> #endif
>>>>
>>>> #ifdef CONFIG_KERNEL_XZ
>>>> #    include "xz_config.h"
>>>> #    include "../../../lib/decompress_unxz.c"
>>>> #endif
>>>>
>>>>
>>>> It would be nice to require one of KERNEL_GZIP or KERNEL_XZ
>>>> to be set/enabled (maybe unless a uImage is being built?).
>>>
>>>
>>> Can you test by 
>>> https://patchwork.ozlabs.org/project/linuxppc-dev/patch/a74fce4dfc9fa32da6ce3470bbedcecf795de1ec.1591189069.git.christophe.le...@csgroup.eu/
>>>  ?
>>
>> Hi Christophe,
>>
>> I get build errors for both LZO and LZMA:
>>
> 
> Can you check with the following changes on top of my patch:
> 
> diff --git a/lib/decompress_unlzo.c b/lib/decompress_unlzo.c
> index a8dbde4b32d4..f06f925385c0 100644
> --- a/lib/decompress_unlzo.c
> +++ b/lib/decompress_unlzo.c
> @@ -23,13 +23,15 @@
>  #include 
>  #endif
> 
> -#include 
>  #ifdef __KERNEL__
>  #include 
> +#endif
> +#include 
> +#ifdef __KERNEL__
>  #include 
> +#include 
>  #endif
> 
> -#include 
>  #include 
> 
>  static const unsigned char lzop_magic[] = {

Hi Christophe,
Sorry for the delay -- it's been a very busy day here.

For CONFIG_KERNEL_LZMA=y, I get a couple of warnings:

  BOOTCC  arch/powerpc/boot/decompress.o
In file included from ../arch/powerpc/boot/decompress.c:38:
../arch/powerpc/boot/../../../lib/decompress_unlzma.c: In function 'unlzma':
../arch/powerpc/boot/../../../lib/decompress_unlzma.c:582:21: warning: pointer 
targets in passing argument 3 of 'rc_init' differ in signedness [-Wpointer-sign]
  582 |  rc_init(&rc, fill, inbuf, in_len);
  | ^
  | |
  | unsigned char *
../arch/powerpc/boot/../../../lib/decompress_unlzma.c:107:18: note: expected 
'char *' but argument is of type 'unsigned char *'
  107 |    char *buffer, long buffer_size)
  |~~^~


and for CONFIG_KERNEL_LZO=y, this one warning:

  BOOTCC  arch/powerpc/boot/decompress.o
In file included from ../arch/powerpc/boot/decompress.c:43:
../arch/powerpc/boot/../../../lib/decompress_unlzo.c: In function 
'parse_header':
../arch/powerpc/boot/../../../lib/decompress_unlzo.c:51:5: warning: variable 
'level' set but not used [-Wunused-but-set-variable]
   51 |  u8 level = 0;
  | ^

Note: the patch above did not apply cleanly for me so any problems
above could be due to my mangling the patch.
The patch that I used is below.

Thanks.
---
---
 lib/decompress_unlzo.c |7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

--- linux-next-20210421.orig/lib/decompress_unlzo.c
+++ linux-next-20210421/lib/decompress_unlzo.c
@@ -23,13 +23,16 @@
 #include 
 #endif
 
-#include 
 #ifdef __KERNEL__
 #include 
-#include 
 #endif
+#include 
 
+#ifdef __KERNEL__
+#include 
 #include 
+#endif
+
 #include 
 
 static const unsigned char lzop_magic[] = {



[powerpc:next] BUILD SUCCESS 39352430aaa05fbe4ba710231c70b334513078f2

2021-04-21 Thread kernel test robot
 allyesconfig
m68k allmodconfig
m68k defconfig
m68k allyesconfig
nios2   defconfig
arc  allyesconfig
nds32 allnoconfig
nds32   defconfig
nios2allyesconfig
csky defconfig
alphaallyesconfig
xtensa   allyesconfig
h8300allyesconfig
arc defconfig
sh   allmodconfig
parisc  defconfig
s390 allyesconfig
s390 allmodconfig
parisc   allyesconfig
s390 defconfig
sparcallyesconfig
i386 defconfig
mips allyesconfig
mips allmodconfig
powerpc  allyesconfig
powerpc   allnoconfig
x86_64   randconfig-a004-20210421
x86_64   randconfig-a002-20210421
x86_64   randconfig-a001-20210421
x86_64   randconfig-a005-20210421
x86_64   randconfig-a006-20210421
x86_64   randconfig-a003-20210421
i386 randconfig-a005-20210421
i386 randconfig-a002-20210421
i386 randconfig-a001-20210421
i386 randconfig-a006-20210421
i386 randconfig-a004-20210421
i386 randconfig-a003-20210421
i386 randconfig-a012-20210421
i386 randconfig-a014-20210421
i386 randconfig-a011-20210421
i386 randconfig-a013-20210421
i386 randconfig-a015-20210421
i386 randconfig-a016-20210421
riscvnommu_k210_defconfig
riscvnommu_virt_defconfig
riscv   defconfig
riscv  rv32_defconfig
um   allmodconfig
um allnoconfig
um   allyesconfig
x86_64rhel-8.3-kselftests
x86_64  defconfig
x86_64   rhel-8.3
x86_64  rhel-8.3-kbuiltin
x86_64  kexec

clang tested configs:
x86_64   randconfig-a015-20210421
x86_64   randconfig-a016-20210421
x86_64   randconfig-a011-20210421
x86_64   randconfig-a014-20210421
x86_64   randconfig-a013-20210421
x86_64   randconfig-a012-20210421

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


[powerpc:next-test] BUILD REGRESSION f76c7820fc6ef641b75b5142aea72f1485c73bb1

2021-04-21 Thread kernel test robot
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git 
next-test
branch HEAD: f76c7820fc6ef641b75b5142aea72f1485c73bb1  powerpc/powernv: Fix 
type of opal_mpipl_query_tag() addr argument

Error/Warning reports:

https://lore.kernel.org/linuxppc-dev/202104220600.1zf0gkvf-...@intel.com

Error/Warning in current branch:

arch/powerpc/include/asm/book3s/64/hash-pkey.h:10:23: error: 'VM_PKEY_BIT0' 
undeclared (first use in this function); did you mean 'H_PTE_PKEY_BIT0'?
arch/powerpc/include/asm/book3s/64/hash-pkey.h:11:16: error: 'VM_PKEY_BIT1' 
undeclared (first use in this function); did you mean 'H_PTE_PKEY_BIT1'?
arch/powerpc/include/asm/book3s/64/hash-pkey.h:12:16: error: 'VM_PKEY_BIT2' 
undeclared (first use in this function)
arch/powerpc/include/asm/book3s/64/hash-pkey.h:13:16: error: 'VM_PKEY_BIT3' 
undeclared (first use in this function); did you mean 'H_PTE_PKEY_BIT3'?
arch/powerpc/include/asm/book3s/64/hash-pkey.h:14:16: error: 'VM_PKEY_BIT4' 
undeclared (first use in this function); did you mean 'H_PTE_PKEY_BIT4'?
arch/powerpc/include/asm/mmu_context.h:287:19: error: redefinition of 
'pte_to_hpte_pkey_bits'

Error/Warning ids grouped by kconfigs:

gcc_recent_errors
`-- powerpc64-randconfig-p001-20210421
|-- 
arch-powerpc-include-asm-book3s-hash-pkey.h:error:VM_PKEY_BIT0-undeclared-(first-use-in-this-function)
|-- 
arch-powerpc-include-asm-book3s-hash-pkey.h:error:VM_PKEY_BIT1-undeclared-(first-use-in-this-function)
|-- 
arch-powerpc-include-asm-book3s-hash-pkey.h:error:VM_PKEY_BIT2-undeclared-(first-use-in-this-function)
|-- 
arch-powerpc-include-asm-book3s-hash-pkey.h:error:VM_PKEY_BIT3-undeclared-(first-use-in-this-function)
|-- 
arch-powerpc-include-asm-book3s-hash-pkey.h:error:VM_PKEY_BIT4-undeclared-(first-use-in-this-function)
`-- 
arch-powerpc-include-asm-mmu_context.h:error:redefinition-of-pte_to_hpte_pkey_bits

elapsed time: 721m

configs tested: 162
configs skipped: 3

gcc tested configs:
arm defconfig
arm64allyesconfig
arm64   defconfig
arm  allyesconfig
arm  allmodconfig
riscvallmodconfig
x86_64   allyesconfig
i386 allyesconfig
riscvallyesconfig
armlart_defconfig
mipsmalta_qemu_32r6_defconfig
mips  malta_defconfig
sh shx3_defconfig
powerpc  ppc44x_defconfig
um  defconfig
mipsworkpad_defconfig
powerpc   currituck_defconfig
powerpc mpc837x_mds_defconfig
powerpc xes_mpc85xx_defconfig
riscv allnoconfig
powerpc  pcm030_defconfig
arm s3c6400_defconfig
xtensa  iss_defconfig
mips  loongson3_defconfig
powerpc  iss476-smp_defconfig
arm  alldefconfig
arm   omap2plus_defconfig
arm  tct_hammer_defconfig
mips   rs90_defconfig
arm  integrator_defconfig
arm  footbridge_defconfig
h8300h8300h-sim_defconfig
xtensa  defconfig
alpha   defconfig
sparc   defconfig
powerpc asp8347_defconfig
ia64  tiger_defconfig
mipsnlm_xlr_defconfig
mips  decstation_64_defconfig
m68k  hp300_defconfig
mips   xway_defconfig
powerpc  storcenter_defconfig
mips   ci20_defconfig
arm lpc18xx_defconfig
mips  bmips_stb_defconfig
armmvebu_v7_defconfig
powerpc  allmodconfig
xtensa  nommu_kc705_defconfig
arm ezx_defconfig
powerpc   eiger_defconfig
mips   rbtx49xx_defconfig
arm  exynos_defconfig
sh   se7619_defconfig
sh  sdk7780_defconfig
shtitan_defconfig
arm cm_x300_defconfig
arm bcm2835_defconfig
armkeystone_defconfig
ia64 bigsur_defconfig
s390  debug_defconfig
x86_64   alldefconfig
m68k amcore_defconfig
mips   capcella_defco

[powerpc:merge] BUILD SUCCESS d20f726744a0312b4b6613333bda7da9bc52fb75

2021-04-21 Thread kernel test robot
onfig
i386 alldefconfig
arc  axs103_smp_defconfig
sh   sh2007_defconfig
mips loongson1b_defconfig
powerpc mpc8313_rdb_defconfig
powerpcfsp2_defconfig
arm mv78xx0_defconfig
powerpc mpc83xx_defconfig
xtensa   alldefconfig
sh   se7722_defconfig
openrisc alldefconfig
powerpc  acadia_defconfig
mipse55_defconfig
shmigor_defconfig
ia64 allmodconfig
ia64 defconfig
ia64 allyesconfig
m68k allmodconfig
m68k defconfig
m68k allyesconfig
nios2   defconfig
arc  allyesconfig
nds32 allnoconfig
nds32   defconfig
nios2allyesconfig
csky defconfig
alphaallyesconfig
xtensa   allyesconfig
h8300allyesconfig
arc defconfig
sh   allmodconfig
parisc  defconfig
s390 allyesconfig
s390 allmodconfig
parisc   allyesconfig
s390 defconfig
sparcallyesconfig
i386 defconfig
mips allyesconfig
mips allmodconfig
powerpc  allyesconfig
powerpc   allnoconfig
x86_64   randconfig-a004-20210421
x86_64   randconfig-a002-20210421
x86_64   randconfig-a001-20210421
x86_64   randconfig-a005-20210421
x86_64   randconfig-a006-20210421
x86_64   randconfig-a003-20210421
i386 randconfig-a005-20210421
i386 randconfig-a002-20210421
i386 randconfig-a001-20210421
i386 randconfig-a006-20210421
i386 randconfig-a004-20210421
i386 randconfig-a003-20210421
i386 randconfig-a012-20210421
i386 randconfig-a014-20210421
i386 randconfig-a011-20210421
i386 randconfig-a013-20210421
i386 randconfig-a015-20210421
i386 randconfig-a016-20210421
riscvnommu_k210_defconfig
riscvnommu_virt_defconfig
riscv   defconfig
riscv  rv32_defconfig
um   allmodconfig
um allnoconfig
um   allyesconfig
x86_64rhel-8.3-kselftests
x86_64  defconfig
x86_64   rhel-8.3
x86_64  rhel-8.3-kbuiltin
x86_64  kexec

clang tested configs:
x86_64   randconfig-a015-20210421
x86_64   randconfig-a016-20210421
x86_64   randconfig-a011-20210421
x86_64   randconfig-a014-20210421
x86_64   randconfig-a013-20210421
x86_64   randconfig-a012-20210421

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


[powerpc:next-test 245/263] arch/powerpc/include/asm/mmu_context.h:287:19: error: redefinition of 'pte_to_hpte_pkey_bits'

2021-04-21 Thread kernel test robot
tree:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git 
next-test
head:   f76c7820fc6ef641b75b5142aea72f1485c73bb1
commit: e4e8bc1df691ba5ba749d1e2b67acf9827e51a35 [245/263] powerpc/kvm: Fix PR 
KVM with KUAP/MEM_KEYS enabled
config: powerpc64-randconfig-p001-20210421 (attached as .config)
compiler: powerpc64le-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# 
https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git/commit/?id=e4e8bc1df691ba5ba749d1e2b67acf9827e51a35
git remote add powerpc 
https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git
git fetch --no-tags powerpc next-test
git checkout e4e8bc1df691ba5ba749d1e2b67acf9827e51a35
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross W=1 
ARCH=powerpc64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

   In file included from arch/powerpc/include/asm/book3s/64/pkeys.h:6,
from arch/powerpc/kvm/book3s_64_mmu_host.c:15:
   arch/powerpc/include/asm/book3s/64/hash-pkey.h: In function 
'hash__vmflag_to_pte_pkey_bits':
>> arch/powerpc/include/asm/book3s/64/hash-pkey.h:10:23: error: 'VM_PKEY_BIT0' 
>> undeclared (first use in this function); did you mean 'H_PTE_PKEY_BIT0'?
  10 |  return (((vm_flags & VM_PKEY_BIT0) ? H_PTE_PKEY_BIT0 : 0x0UL) |
 |   ^~~~
 |   H_PTE_PKEY_BIT0
   arch/powerpc/include/asm/book3s/64/hash-pkey.h:10:23: note: each undeclared 
identifier is reported only once for each function it appears in
>> arch/powerpc/include/asm/book3s/64/hash-pkey.h:11:16: error: 'VM_PKEY_BIT1' 
>> undeclared (first use in this function); did you mean 'H_PTE_PKEY_BIT1'?
  11 |   ((vm_flags & VM_PKEY_BIT1) ? H_PTE_PKEY_BIT1 : 0x0UL) |
 |^~~~
 |H_PTE_PKEY_BIT1
>> arch/powerpc/include/asm/book3s/64/hash-pkey.h:12:16: error: 'VM_PKEY_BIT2' 
>> undeclared (first use in this function)
  12 |   ((vm_flags & VM_PKEY_BIT2) ? H_PTE_PKEY_BIT2 : 0x0UL) |
 |^~~~
>> arch/powerpc/include/asm/book3s/64/hash-pkey.h:13:16: error: 'VM_PKEY_BIT3' 
>> undeclared (first use in this function); did you mean 'H_PTE_PKEY_BIT3'?
  13 |   ((vm_flags & VM_PKEY_BIT3) ? H_PTE_PKEY_BIT3 : 0x0UL) |
 |^~~~
 |H_PTE_PKEY_BIT3
>> arch/powerpc/include/asm/book3s/64/hash-pkey.h:14:16: error: 'VM_PKEY_BIT4' 
>> undeclared (first use in this function); did you mean 'H_PTE_PKEY_BIT4'?
  14 |   ((vm_flags & VM_PKEY_BIT4) ? H_PTE_PKEY_BIT4 : 0x0UL));
 |^~~~
 |H_PTE_PKEY_BIT4
   In file included from arch/powerpc/kvm/book3s_64_mmu_host.c:17:
   arch/powerpc/include/asm/mmu_context.h: At top level:
>> arch/powerpc/include/asm/mmu_context.h:287:19: error: redefinition of 
>> 'pte_to_hpte_pkey_bits'
 287 | static inline u64 pte_to_hpte_pkey_bits(u64 pteflags, unsigned long 
flags)
 |   ^
   In file included from arch/powerpc/include/asm/book3s/64/pkeys.h:6,
from arch/powerpc/kvm/book3s_64_mmu_host.c:15:
   arch/powerpc/include/asm/book3s/64/hash-pkey.h:17:19: note: previous 
definition of 'pte_to_hpte_pkey_bits' was here
  17 | static inline u64 pte_to_hpte_pkey_bits(u64 pteflags, unsigned long 
flags)
 |   ^


vim +/pte_to_hpte_pkey_bits +287 arch/powerpc/include/asm/mmu_context.h

87bbabbed8a770 Ram Pai  2018-01-18  286  
d94b827e89dc3f Aneesh Kumar K.V 2020-11-27 @287  static inline u64 
pte_to_hpte_pkey_bits(u64 pteflags, unsigned long flags)
a6590ca55f1f49 Ram Pai  2018-01-18  288  {
a6590ca55f1f49 Ram Pai  2018-01-18  289 return 0x0UL;
a6590ca55f1f49 Ram Pai  2018-01-18  290  }
a6590ca55f1f49 Ram Pai  2018-01-18  291  

:: The code at line 287 was first introduced by commit
:: d94b827e89dc3f92cd871d10f4992a6bd3c861e5 powerpc/book3s64/kuap: Use Key 
3 for kernel mapping with hash translation

:: TO: Aneesh Kumar K.V 
:: CC: Michael Ellerman 

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


.config.gz
Description: application/gzip


Re: [PATCH bpf-next 1/2] bpf: Remove bpf_jit_enable=2 debugging mode

2021-04-21 Thread John Fastabend
Christophe Leroy wrote:
> 
> 
> Le 20/04/2021 à 05:28, Alexei Starovoitov a écrit :
> > On Sat, Apr 17, 2021 at 1:16 AM Christophe Leroy
> >  wrote:
> >>
> >>
> >>
> >> Le 16/04/2021 à 01:49, Alexei Starovoitov a écrit :
> >>> On Thu, Apr 15, 2021 at 8:41 AM Quentin Monnet  
> >>> wrote:
> 
>  2021-04-15 16:37 UTC+0200 ~ Daniel Borkmann 
> > On 4/15/21 11:32 AM, Jianlin Lv wrote:
> >> For debugging JITs, dumping the JITed image to kernel log is 
> >> discouraged,
> >> "bpftool prog dump jited" is much better way to examine JITed dumps.
> >> This patch get rid of the code related to bpf_jit_enable=2 mode and
> >> update the proc handler of bpf_jit_enable, also added auxiliary
> >> information to explain how to use bpf_jit_disasm tool after this 
> >> change.
> >>
> >> Signed-off-by: Jianlin Lv 
> 
>  Hello,
> 
>  For what it's worth, I have already seen people dump the JIT image in
>  kernel logs in Qemu VMs running with just a busybox, not for kernel
>  development, but in a context where buiding/using bpftool was not
>  possible.
> >>>
> >>> If building/using bpftool is not possible then majority of selftests won't
> >>> be exercised. I don't think such environment is suitable for any kind
> >>> of bpf development. Much so for JIT debugging.
> >>> While bpf_jit_enable=2 is nothing but the debugging tool for JIT 
> >>> developers.
> >>> I'd rather nuke that code instead of carrying it from kernel to kernel.
> >>>
> >>
> >> When I implemented JIT for PPC32, it was extremely helpfull.
> >>
> >> As far as I understand, for the time being bpftool is not usable in my 
> >> environment because it
> >> doesn't support cross compilation when the target's endianess differs from 
> >> the building host
> >> endianess, see discussion at
> >> https://lore.kernel.org/bpf/21e66a09-514f-f426-b9e2-13baab0b9...@csgroup.eu/
> >>
> >> That's right that selftests can't be exercised because they don't build.
> >>
> >> The question might be candid as I didn't investigate much about the 
> >> replacement of "bpf_jit_enable=2
> >> debugging mode" by bpftool, how do we use bpftool exactly for that ? 
> >> Especially when using the BPF
> >> test module ?
> > 
> > the kernel developers can add any amount of printk and dumps to debug
> > their code,
> > but such debugging aid should not be part of the production kernel.
> > That sysctl was two things at once: debugging tool for kernel devs and
> > introspection for users.
> > bpftool jit dump solves the 2nd part. It provides JIT introspection to 
> > users.
> > Debugging of the kernel can be done with any amount of auxiliary code
> > including calling print_hex_dump() during jiting.
> > 
> 
> I get the following message when trying the command suggested in the patch 
> message:
> 
> root@vgoip:~# ./bpftool prog dump jited
> Error: No libbfd support
> 
> Christophe

Seems your bpftool was built without libbfd; can you rebuild with libbfd
installed?

.John


Re: [PATCH bpf-next 1/2] bpf: Remove bpf_jit_enable=2 debugging mode

2021-04-21 Thread Quentin Monnet
2021-04-21 15:10 UTC+0200 ~ Christophe Leroy 
> 
> 
> Le 20/04/2021 à 05:28, Alexei Starovoitov a écrit :
>> On Sat, Apr 17, 2021 at 1:16 AM Christophe Leroy
>>  wrote:
>>>
>>>
>>>
>>> Le 16/04/2021 à 01:49, Alexei Starovoitov a écrit :
 On Thu, Apr 15, 2021 at 8:41 AM Quentin Monnet
  wrote:
>
> 2021-04-15 16:37 UTC+0200 ~ Daniel Borkmann 
>> On 4/15/21 11:32 AM, Jianlin Lv wrote:
>>> For debugging JITs, dumping the JITed image to kernel log is
>>> discouraged,
>>> "bpftool prog dump jited" is much better way to examine JITed dumps.
>>> This patch get rid of the code related to bpf_jit_enable=2 mode and
>>> update the proc handler of bpf_jit_enable, also added auxiliary
>>> information to explain how to use bpf_jit_disasm tool after this
>>> change.
>>>
>>> Signed-off-by: Jianlin Lv 
>
> Hello,
>
> For what it's worth, I have already seen people dump the JIT image in
> kernel logs in Qemu VMs running with just a busybox, not for kernel
> development, but in a context where buiding/using bpftool was not
> possible.

 If building/using bpftool is not possible then majority of selftests
 won't
 be exercised. I don't think such environment is suitable for any kind
 of bpf development. Much so for JIT debugging.
 While bpf_jit_enable=2 is nothing but the debugging tool for JIT
 developers.
 I'd rather nuke that code instead of carrying it from kernel to kernel.

>>>
>>> When I implemented JIT for PPC32, it was extremely helpfull.
>>>
>>> As far as I understand, for the time being bpftool is not usable in
>>> my environment because it
>>> doesn't support cross compilation when the target's endianess differs
>>> from the building host
>>> endianess, see discussion at
>>> https://lore.kernel.org/bpf/21e66a09-514f-f426-b9e2-13baab0b9...@csgroup.eu/
>>>
>>>
>>> That's right that selftests can't be exercised because they don't build.
>>>
>>> The question might be candid as I didn't investigate much about the
>>> replacement of "bpf_jit_enable=2
>>> debugging mode" by bpftool, how do we use bpftool exactly for that ?
>>> Especially when using the BPF
>>> test module ?
>>
>> the kernel developers can add any amount of printk and dumps to debug
>> their code,
>> but such debugging aid should not be part of the production kernel.
>> That sysctl was two things at once: debugging tool for kernel devs and
>> introspection for users.
>> bpftool jit dump solves the 2nd part. It provides JIT introspection to
>> users.
>> Debugging of the kernel can be done with any amount of auxiliary code
>> including calling print_hex_dump() during jiting.
>>
> 
> I get the following message when trying the command suggested in the
> patch message:
> 
> root@vgoip:~# ./bpftool prog dump jited
> Error: No libbfd support
> 
> Christophe

Hi Christophe,

Bpftool relies on libbfd to disassemble the JIT-ed instructions, but
this is an optional dependency and your version of bpftool has been
compiled without it.

You could try to install it on your system (it is usually shipped with
binutils, package "binutils-dev" on Ubuntu for example). If you want to
cross-compile bpftool, the libbfd version provided by your distribution
may not include support for the target architecture. In that case you
would have to build libbfd yourself to make sure it supports it.

Then you can clean up the results from the libbfd probing:

$ make -C tools/build/feature/ clean

and recompile bpftool.

I hope this helps,
Quentin


Re: [PATCH bpf-next 1/2] bpf: Remove bpf_jit_enable=2 debugging mode

2021-04-21 Thread Christophe Leroy




Le 20/04/2021 à 05:28, Alexei Starovoitov a écrit :

On Sat, Apr 17, 2021 at 1:16 AM Christophe Leroy
 wrote:




Le 16/04/2021 à 01:49, Alexei Starovoitov a écrit :

On Thu, Apr 15, 2021 at 8:41 AM Quentin Monnet  wrote:


2021-04-15 16:37 UTC+0200 ~ Daniel Borkmann 

On 4/15/21 11:32 AM, Jianlin Lv wrote:

For debugging JITs, dumping the JITed image to kernel log is discouraged,
"bpftool prog dump jited" is much better way to examine JITed dumps.
This patch get rid of the code related to bpf_jit_enable=2 mode and
update the proc handler of bpf_jit_enable, also added auxiliary
information to explain how to use bpf_jit_disasm tool after this change.

Signed-off-by: Jianlin Lv 


Hello,

For what it's worth, I have already seen people dump the JIT image in
kernel logs in Qemu VMs running with just a busybox, not for kernel
development, but in a context where buiding/using bpftool was not
possible.


If building/using bpftool is not possible then majority of selftests won't
be exercised. I don't think such environment is suitable for any kind
of bpf development. Much so for JIT debugging.
While bpf_jit_enable=2 is nothing but the debugging tool for JIT developers.
I'd rather nuke that code instead of carrying it from kernel to kernel.



When I implemented JIT for PPC32, it was extremely helpfull.

As far as I understand, for the time being bpftool is not usable in my 
environment because it
doesn't support cross compilation when the target's endianess differs from the 
building host
endianess, see discussion at
https://lore.kernel.org/bpf/21e66a09-514f-f426-b9e2-13baab0b9...@csgroup.eu/

That's right that selftests can't be exercised because they don't build.

The question might be candid as I didn't investigate much about the replacement of 
"bpf_jit_enable=2
debugging mode" by bpftool, how do we use bpftool exactly for that ? Especially 
when using the BPF
test module ?


the kernel developers can add any amount of printk and dumps to debug
their code,
but such debugging aid should not be part of the production kernel.
That sysctl was two things at once: debugging tool for kernel devs and
introspection for users.
bpftool jit dump solves the 2nd part. It provides JIT introspection to users.
Debugging of the kernel can be done with any amount of auxiliary code
including calling print_hex_dump() during jiting.



I get the following message when trying the command suggested in the patch 
message:

root@vgoip:~# ./bpftool prog dump jited
Error: No libbfd support

Christophe
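
To make the print_hex_dump() approach Alexei describes above concrete, a JIT
developer bringing up a new backend could add a small local helper along these
lines (a sketch only, not part of any posted patch; the helper name and log
text are made up):

#include <linux/printk.h>
#include <linux/filter.h>

/* Local debugging aid: dump a freshly JITed image to the kernel log. */
static void jit_debug_dump_image(struct bpf_prog *fp, void *image,
				 unsigned int image_size)
{
	pr_info("JIT image for prog %s: %u bytes\n", fp->aux->name, image_size);
	print_hex_dump(KERN_INFO, "jit: ", DUMP_PREFIX_OFFSET,
		       16, 1, image, image_size, false);
}

The hex dump can then be fed to a disassembler offline, much as
bpf_jit_enable=2 allowed, without the code ever shipping in a production
kernel.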


[PATCH v3] powerpc: make ALTIVEC select PPC_FPU

2021-04-21 Thread Randy Dunlap
On a kernel config with ALTIVEC=y and PPC_FPU not set/enabled,
there are build errors:

drivers/cpufreq/pmac32-cpufreq.c:262:2: error: implicit declaration of function 
'enable_kernel_fp' [-Werror,-Wimplicit-function-declaration]
   enable_kernel_fp();
../arch/powerpc/lib/sstep.c: In function 'do_vec_load':
../arch/powerpc/lib/sstep.c:637:3: error: implicit declaration of function 
'put_vr' [-Werror=implicit-function-declaration]
  637 |   put_vr(rn, &u.v);
  |   ^~
../arch/powerpc/lib/sstep.c: In function 'do_vec_store':
../arch/powerpc/lib/sstep.c:660:3: error: implicit declaration of function 
'get_vr'; did you mean 'get_oc'? [-Werror=implicit-function-declaration]
  660 |   get_vr(rn, &u.v);
  |   ^~

In theory ALTIVEC is independent of PPC_FPU but in practice nobody
is going to build such a machine, so make ALTIVEC require PPC_FPU
by selecting it.

Signed-off-by: Randy Dunlap 
Reported-by: kernel test robot 
Cc: Michael Ellerman 
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Christophe Leroy 
Cc: Segher Boessenkool 
Cc: l...@intel.com
---
v2: change ALTIVEC depends on PPC_FPU to select (Christophe and Michael)
v3: other platforms don't need to select PPC_FPU since they select ALTIVEC
(Christophe; thanks)

 arch/powerpc/platforms/Kconfig.cputype |1 +
 1 file changed, 1 insertion(+)

--- linux-next-20210416.orig/arch/powerpc/platforms/Kconfig.cputype
+++ linux-next-20210416/arch/powerpc/platforms/Kconfig.cputype
@@ -310,6 +310,7 @@ config PHYS_64BIT
 config ALTIVEC
bool "AltiVec Support"
depends on PPC_BOOK3S_32 || PPC_BOOK3S_64 || (PPC_E500MC && PPC64)
+   select PPC_FPU
help
  This option enables kernel support for the Altivec extensions to the
  PowerPC processor. The kernel currently supports saving and restoring


Re: [PATCH v2] powerpc: make ALTIVEC select PPC_FPU

2021-04-21 Thread Randy Dunlap
On 4/20/21 10:19 PM, Christophe Leroy wrote:
> 
> 
> Le 21/04/2021 à 04:56, Randy Dunlap a écrit :
>> On a kernel config with ALTIVEC=y and PPC_FPU not set/enabled,
>> there are build errors:
>>
>> drivers/cpufreq/pmac32-cpufreq.c:262:2: error: implicit declaration of 
>> function 'enable_kernel_fp' [-Werror,-Wimplicit-function-declaration]
>>     enable_kernel_fp();
>> ../arch/powerpc/lib/sstep.c: In function 'do_vec_load':
>> ../arch/powerpc/lib/sstep.c:637:3: error: implicit declaration of function 
>> 'put_vr' [-Werror=implicit-function-declaration]
>>    637 |   put_vr(rn, &u.v);
>>    |   ^~
>> ../arch/powerpc/lib/sstep.c: In function 'do_vec_store':
>> ../arch/powerpc/lib/sstep.c:660:3: error: implicit declaration of function 
>> 'get_vr'; did you mean 'get_oc'? [-Werror=implicit-function-declaration]
>>    660 |   get_vr(rn, &u.v);
>>    |   ^~
>>
>> In theory ALTIVEC is independent of PPC_FPU but in practice nobody
>> is going to build such a machine, so make ALTIVEC require PPC_FPU
>> by selecting it.
>>
>> Signed-off-by: Randy Dunlap 
>> Reported-by: kernel test robot 
>> Cc: Michael Ellerman 
>> Cc: linuxppc-dev@lists.ozlabs.org
>> Cc: Christophe Leroy 
>> Cc: Segher Boessenkool 
>> Cc: l...@intel.com
>> ---
>> v2: change ALTIVEC depends on PPC_FPU to select (Christophe and Michael)
>>
>>   arch/powerpc/platforms/86xx/Kconfig    |    1 +
>>   arch/powerpc/platforms/Kconfig.cputype |    2 ++
>>   2 files changed, 3 insertions(+)
>>
>> --- linux-next-20210416.orig/arch/powerpc/platforms/86xx/Kconfig
>> +++ linux-next-20210416/arch/powerpc/platforms/86xx/Kconfig
>> @@ -4,6 +4,7 @@ menuconfig PPC_86xx
>>   bool "86xx-based boards"
>>   depends on PPC_BOOK3S_32
>>   select FSL_SOC
>> +    select PPC_FPU
> 
> Now that ALTIVEC selects PPC_FPU by itself, I don't think you need that.
> 
>>   select ALTIVEC
>>   help
>>     The Freescale E600 SoCs have 74xx cores.
>> --- linux-next-20210416.orig/arch/powerpc/platforms/Kconfig.cputype
>> +++ linux-next-20210416/arch/powerpc/platforms/Kconfig.cputype
>> @@ -186,6 +186,7 @@ config E300C3_CPU
>>   config G4_CPU
>>   bool "G4 (74xx)"
>>   depends on PPC_BOOK3S_32
>> +    select PPC_FPU
> 
> Same

Of course. v3 coming up.
Thanks.

>>   select ALTIVEC
>>     endchoice
>> @@ -310,6 +311,7 @@ config PHYS_64BIT
>>   config ALTIVEC
>>   bool "AltiVec Support"
>>   depends on PPC_BOOK3S_32 || PPC_BOOK3S_64 || (PPC_E500MC && PPC64)
>> +    select PPC_FPU
>>   help
>>     This option enables kernel support for the Altivec extensions to the
>>     PowerPC processor. The kernel currently supports saving and restoring
>>


-- 
~Randy


[PATCH] opal/fadump: fix fadump to work with a different endian capture kernel

2021-04-21 Thread Hari Bathini
Dump capture would fail if the capture kernel is not of the same endianness
as the production kernel, because the in-memory data structure (struct
opal_fadump_mem_struct) shared between the production kernel and the capture
kernel assumes the same endianness for both kernels, which doesn't always
have to be true. Fix it by giving struct opal_fadump_mem_struct a
well-defined endianness.

Signed-off-by: Hari Bathini 
---
 arch/powerpc/platforms/powernv/opal-fadump.c |   94 ++
 arch/powerpc/platforms/powernv/opal-fadump.h |   10 +--
 2 files changed, 57 insertions(+), 47 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/opal-fadump.c 
b/arch/powerpc/platforms/powernv/opal-fadump.c
index 9a360ced663b..e23a51a05f99 100644
--- a/arch/powerpc/platforms/powernv/opal-fadump.c
+++ b/arch/powerpc/platforms/powernv/opal-fadump.c
@@ -60,7 +60,7 @@ void __init opal_fadump_dt_scan(struct fw_dump *fadump_conf, 
u64 node)
addr = be64_to_cpu(addr);
pr_debug("Kernel metadata addr: %llx\n", addr);
opal_fdm_active = (void *)addr;
-   if (opal_fdm_active->registered_regions == 0)
+   if (be16_to_cpu(opal_fdm_active->registered_regions) == 0)
return;
 
ret = opal_mpipl_query_tag(OPAL_MPIPL_TAG_BOOT_MEM, &addr);
@@ -95,17 +95,17 @@ static int opal_fadump_unregister(struct fw_dump 
*fadump_conf);
 static void opal_fadump_update_config(struct fw_dump *fadump_conf,
  const struct opal_fadump_mem_struct *fdm)
 {
-   pr_debug("Boot memory regions count: %d\n", fdm->region_cnt);
+   pr_debug("Boot memory regions count: %d\n", 
be16_to_cpu(fdm->region_cnt));
 
/*
 * The destination address of the first boot memory region is the
 * destination address of boot memory regions.
 */
-   fadump_conf->boot_mem_dest_addr = fdm->rgn[0].dest;
+   fadump_conf->boot_mem_dest_addr = be64_to_cpu(fdm->rgn[0].dest);
pr_debug("Destination address of boot memory regions: %#016llx\n",
 fadump_conf->boot_mem_dest_addr);
 
-   fadump_conf->fadumphdr_addr = fdm->fadumphdr_addr;
+   fadump_conf->fadumphdr_addr = be64_to_cpu(fdm->fadumphdr_addr);
 }
 
 /*
@@ -126,9 +126,9 @@ static void opal_fadump_get_config(struct fw_dump 
*fadump_conf,
fadump_conf->boot_memory_size = 0;
 
pr_debug("Boot memory regions:\n");
-   for (i = 0; i < fdm->region_cnt; i++) {
-   base = fdm->rgn[i].src;
-   size = fdm->rgn[i].size;
+   for (i = 0; i < be16_to_cpu(fdm->region_cnt); i++) {
+   base = be64_to_cpu(fdm->rgn[i].src);
+   size = be64_to_cpu(fdm->rgn[i].size);
pr_debug("\t[%03d] base: 0x%lx, size: 0x%lx\n", i, base, size);
 
fadump_conf->boot_mem_addr[i] = base;
@@ -143,7 +143,7 @@ static void opal_fadump_get_config(struct fw_dump 
*fadump_conf,
 * Start address of reserve dump area (permanent reservation) for
 * re-registering FADump after dump capture.
 */
-   fadump_conf->reserve_dump_area_start = fdm->rgn[0].dest;
+   fadump_conf->reserve_dump_area_start = be64_to_cpu(fdm->rgn[0].dest);
 
/*
 * Rarely, but it can so happen that system crashes before all
@@ -155,13 +155,14 @@ static void opal_fadump_get_config(struct fw_dump 
*fadump_conf,
 * Hope the memory that could not be preserved only has pages
 * that are usually filtered out while saving the vmcore.
 */
-   if (fdm->region_cnt > fdm->registered_regions) {
+   if (be16_to_cpu(fdm->region_cnt) > 
be16_to_cpu(fdm->registered_regions)) {
pr_warn("Not all memory regions were saved!!!\n");
pr_warn("  Unsaved memory regions:\n");
-   i = fdm->registered_regions;
-   while (i < fdm->region_cnt) {
+   i = be16_to_cpu(fdm->registered_regions);
+   while (i < be16_to_cpu(fdm->region_cnt)) {
pr_warn("\t[%03d] base: 0x%llx, size: 0x%llx\n",
-   i, fdm->rgn[i].src, fdm->rgn[i].size);
+   i, be64_to_cpu(fdm->rgn[i].src),
+   be64_to_cpu(fdm->rgn[i].size));
i++;
}
 
@@ -170,7 +171,7 @@ static void opal_fadump_get_config(struct fw_dump 
*fadump_conf,
}
 
fadump_conf->boot_mem_top = (fadump_conf->boot_memory_size + hole_size);
-   fadump_conf->boot_mem_regs_cnt = fdm->region_cnt;
+   fadump_conf->boot_mem_regs_cnt = be16_to_cpu(fdm->region_cnt);
opal_fadump_update_config(fadump_conf, fdm);
 }
 
@@ -178,35 +179,38 @@ static void opal_fadump_get_config(struct fw_dump 
*fadump_conf,
 static void opal_fadump_init_metadata(struct opal_fadump_mem_struct *fdm)
 {
fdm->version = OPAL_FADUMP_VERSION;
-   fdm->region_cnt = 0;
-   fdm->registered_regions = 0;
-   fdm->fadumphdr_addr = 0;
+   fdm
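
The conversion pattern the patch applies throughout can be summarised with a
short sketch: the shared in-memory layout is declared with explicit
__be16/__be64 fields and every access converts to or from CPU endianness (the
struct and helper names below are simplified stand-ins, not the real
opal_fadump_mem_struct):

#include <linux/types.h>
#include <asm/byteorder.h>

/* Simplified stand-in for a shared, always-big-endian layout. */
struct shared_fadump_hdr {
	__be16 region_cnt;
	__be16 registered_regions;
	__be64 fadumphdr_addr;
};

static u64 read_fadumphdr_addr(const struct shared_fadump_hdr *hdr)
{
	/* convert from the fixed big-endian layout to CPU endianness */
	return be64_to_cpu(hdr->fadumphdr_addr);
}

static void set_region_cnt(struct shared_fadump_hdr *hdr, u16 cnt)
{
	/* convert to big endian before storing into the shared structure */
	hdr->region_cnt = cpu_to_be16(cnt);
}

With a fixed in-memory endianness, a little-endian capture kernel can parse
metadata written by a big-endian production kernel and vice versa.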

Re: [PATCH] ASoC: fsl: imx-pcm-dma: Don't request dma channel in probe

2021-04-21 Thread Mark Brown
On Wed, Apr 21, 2021 at 07:43:18PM +0200, Lucas Stach wrote:

> If your driver code drops the rpm refcount to 0 and starts the
> autosuspend timer while a cyclic transfer is still in flight this is
> clearly a bug. Autosuspend is not there to paper over driver bugs, but
> to amortize cost of actually suspending and resuming the hardware. Your
> driver code must still work even if the timeout is 0, i.e. the hardware
> is immediately suspended after you drop the rpm refcount to 0.

> If you still have transfers queued/in-flight the driver code must keep
> a rpm reference.

Right, failing to do that is a clear bug.

Please delete unneeded context from mails when replying.  Doing this
makes it much easier to find your reply in the message, helping ensure
it won't be missed by people scrolling through the irrelevant quoted
material.


signature.asc
Description: PGP signature
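
A minimal sketch of the rule Lucas describes, in terms of the generic
runtime-PM helpers (only the pm_runtime_* calls are real; the surrounding
function names are made up, and a real driver would hang this off its
start/complete paths):

#include <linux/device.h>
#include <linux/pm_runtime.h>

/* Hold a runtime-PM reference for as long as a cyclic transfer is in flight. */
static int example_start_cyclic(struct device *dev)
{
	int ret = pm_runtime_get_sync(dev);	/* resume HW, take a reference */

	if (ret < 0) {
		pm_runtime_put_noidle(dev);
		return ret;
	}
	/* ... program and start the transfer ... */
	return 0;
}

static void example_transfer_complete(struct device *dev)
{
	/* ... stop and clean up the transfer ... */
	pm_runtime_mark_last_busy(dev);		/* restart the autosuspend timer */
	pm_runtime_put_autosuspend(dev);	/* drop the reference; HW may suspend */
}

With this structure the driver keeps working even with an autosuspend delay
of 0, which is the property Lucas points out above.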


[PATCH] powerpc/52xx: Fix an invalid ASM expression ('addi' used instead of 'add')

2021-04-21 Thread Christophe Leroy
  AS  arch/powerpc/platforms/52xx/lite5200_sleep.o
arch/powerpc/platforms/52xx/lite5200_sleep.S: Assembler messages:
arch/powerpc/platforms/52xx/lite5200_sleep.S:184: Warning: invalid register 
expression

In the following code, 'addi' is wrong and has to be 'add': 'addi' takes an
immediate, not a register, as its last operand, hence the "invalid register
expression" warning.

/* local udelay in sram is needed */
  udelay: /* r11 - tb_ticks_per_usec, r12 - usecs, overwrites r13 */
mullw   r12, r12, r11
mftb    r13 /* start */
addi    r12, r13, r12 /* end */

Fixes: ee983079ce04 ("[POWERPC] MPC5200 low power mode")
Signed-off-by: Christophe Leroy 
---
 arch/powerpc/platforms/52xx/lite5200_sleep.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/52xx/lite5200_sleep.S 
b/arch/powerpc/platforms/52xx/lite5200_sleep.S
index 11475c58ea43..afee8b1515a8 100644
--- a/arch/powerpc/platforms/52xx/lite5200_sleep.S
+++ b/arch/powerpc/platforms/52xx/lite5200_sleep.S
@@ -181,7 +181,7 @@ sram_code:
   udelay: /* r11 - tb_ticks_per_usec, r12 - usecs, overwrites r13 */
mullw   r12, r12, r11
mftb    r13 /* start */
-   addi    r12, r13, r12 /* end */
+   add r12, r13, r12 /* end */
 1:
mftb    r13 /* current */
cmp cr0, r13, r12
-- 
2.25.0



[PATCH v2] powerpc/kconfig: Restore alphabetic order of the selects under CONFIG_PPC

2021-04-21 Thread Christophe Leroy
Commit a7d2475af7ae ("powerpc: Sort the selects under CONFIG_PPC")
sorted all selects under CONFIG_PPC.

Four years later, several items have been introduced in the wrong place,
and a few others have been renamed without being moved to their correct
place.

Reorder them now.

While we are at it, simplify the test for a couple of them:
- PPC_64 && PPC_PSERIES is simplified to PPC_PSERIES
- PPC_64 && PPC_BOOK3S is simplified to PPC_BOOK3S_64

Signed-off-by: Christophe Leroy 
---
v2: Rebased on d20f726744a0 ("Automatic merge of 'next' into merge (2021-04-21 
22:57)")
---
 arch/powerpc/Kconfig | 34 +-
 1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index d2e31a578e26..b2970c8241d5 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -118,28 +118,29 @@ config PPC
# Please keep this list sorted alphabetically.
#
select ARCH_32BIT_OFF_T if PPC32
+   select ARCH_HAS_COPY_MC if PPC64
select ARCH_HAS_DEBUG_VIRTUAL
select ARCH_HAS_DEBUG_VM_PGTABLE
select ARCH_HAS_DEVMEM_IS_ALLOWED
+   select ARCH_HAS_DMA_MAP_DIRECT  if PPC_PSERIES
select ARCH_HAS_ELF_RANDOMIZE
select ARCH_HAS_FORTIFY_SOURCE
select ARCH_HAS_GCOV_PROFILE_ALL
-   select ARCH_HAS_KCOV
select ARCH_HAS_HUGEPD  if HUGETLB_PAGE
+   select ARCH_HAS_KCOV
+   select ARCH_HAS_MEMBARRIER_CALLBACKS
+   select ARCH_HAS_MEMBARRIER_SYNC_CORE
select ARCH_HAS_MEMREMAP_COMPAT_ALIGN
select ARCH_HAS_MMIOWB  if PPC64
+   select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
select ARCH_HAS_PHYS_TO_DMA
select ARCH_HAS_PMEM_API
-   select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
select ARCH_HAS_PTE_DEVMAP  if PPC_BOOK3S_64
select ARCH_HAS_PTE_SPECIAL
-   select ARCH_HAS_MEMBARRIER_CALLBACKS
-   select ARCH_HAS_MEMBARRIER_SYNC_CORE
select ARCH_HAS_SCALED_CPUTIME  if VIRT_CPU_ACCOUNTING_NATIVE 
&& PPC_BOOK3S_64
select ARCH_HAS_STRICT_KERNEL_RWX   if ((PPC_BOOK3S_64 || PPC32) && 
!HIBERNATION)
select ARCH_HAS_TICK_BROADCAST  if GENERIC_CLOCKEVENTS_BROADCAST
select ARCH_HAS_UACCESS_FLUSHCACHE
-   select ARCH_HAS_COPY_MC if PPC64
select ARCH_HAS_UBSAN_SANITIZE_ALL
select ARCH_HAVE_NMI_SAFE_CMPXCHG
select ARCH_KEEP_MEMBLOCK
@@ -161,9 +162,8 @@ config PPC
select BUILDTIME_TABLE_SORT
select CLONE_BACKWARDS
select DCACHE_WORD_ACCESS   if PPC64 && CPU_LITTLE_ENDIAN
-   select DMA_OPS  if PPC64
select DMA_OPS_BYPASS   if PPC64
-   select ARCH_HAS_DMA_MAP_DIRECT  if PPC64 && PPC_PSERIES
+   select DMA_OPS  if PPC64
select DYNAMIC_FTRACE   if FUNCTION_TRACER
select EDAC_ATOMIC_SCRUB
select EDAC_SUPPORT
@@ -188,18 +188,16 @@ config PPC
select HAVE_ARCH_JUMP_LABEL_RELATIVE
select HAVE_ARCH_KASAN  if PPC32 && PPC_PAGE_SHIFT <= 14
select HAVE_ARCH_KASAN_VMALLOC  if PPC32 && PPC_PAGE_SHIFT <= 14
-   select HAVE_ARCH_KGDB
select HAVE_ARCH_KFENCE if PPC32
+   select HAVE_ARCH_KGDB
select HAVE_ARCH_MMAP_RND_BITS
select HAVE_ARCH_MMAP_RND_COMPAT_BITS   if COMPAT
select HAVE_ARCH_NVRAM_OPS
select HAVE_ARCH_SECCOMP_FILTER
select HAVE_ARCH_TRACEHOOK
select HAVE_ASM_MODVERSIONS
-   select HAVE_C_RECORDMCOUNT
-   select HAVE_STACKPROTECTOR  if PPC64 && 
$(cc-option,-mstack-protector-guard=tls -mstack-protector-guard-reg=r13)
-   select HAVE_STACKPROTECTOR  if PPC32 && 
$(cc-option,-mstack-protector-guard=tls -mstack-protector-guard-reg=r2)
select HAVE_CONTEXT_TRACKINGif PPC64
+   select HAVE_C_RECORDMCOUNT
select HAVE_DEBUG_KMEMLEAK
select HAVE_DEBUG_STACKOVERFLOW
select HAVE_DYNAMIC_FTRACE
@@ -213,10 +211,13 @@ config PPC
select HAVE_FUNCTION_TRACER
select HAVE_GCC_PLUGINS if GCC_VERSION >= 50200   # 
plugin support on gcc <= 5.1 is buggy on PPC
select HAVE_GENERIC_VDSO
+   select HAVE_HARDLOCKUP_DETECTOR_ARCHif PPC_BOOK3S_64 && SMP
+   select HAVE_HARDLOCKUP_DETECTOR_PERFif PERF_EVENTS && 
HAVE_PERF_EVENTS_NMI && !HAVE_HARDLOCKUP_DETECTOR_ARCH
select HAVE_HW_BREAKPOINT   if PERF_EVENTS && (PPC_BOOK3S 
|| PPC_8xx)
select HAVE_IDE
select HAVE_IOREMAP_PROT
select HAVE_IRQ_EXIT_ON_IRQ_STACK
+   select HAVE_IRQ_TIME_ACCOUNTING
select HAVE_KERNEL_GZIP
select HAVE_KERNEL_LZMA if DEFAULT_UIMAGE
select HAVE_KERNEL_LZO  if DEFAULT_UIMAGE
@@ -228,25 +229,24 @@ config PPC
   

[PATCH v2 2/2] powerpc: If kexec_build_elf_info() fails return immediately from elf64_load()

2021-04-21 Thread Lakshmi Ramasubramanian
Uninitialized local variable "elf_info" would be passed to
kexec_free_elf_info() if kexec_build_elf_info() returns an error
in elf64_load().

If kexec_build_elf_info() returns an error, return the error
immediately.

Signed-off-by: Lakshmi Ramasubramanian 
Reported-by: Dan Carpenter 
Reviewed-by: Michael Ellerman 
---
 arch/powerpc/kexec/elf_64.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kexec/elf_64.c b/arch/powerpc/kexec/elf_64.c
index 02662e72c53d..eeb258002d1e 100644
--- a/arch/powerpc/kexec/elf_64.c
+++ b/arch/powerpc/kexec/elf_64.c
@@ -45,7 +45,7 @@ static void *elf64_load(struct kimage *image, char 
*kernel_buf,
 
ret = kexec_build_elf_info(kernel_buf, kernel_len, &ehdr, &elf_info);
if (ret)
-   goto out;
+   return ERR_PTR(ret);
 
if (image->type == KEXEC_TYPE_CRASH) {
/* min & max buffer values for kdump case */
-- 
2.31.0



[PATCH v2 1/2] powerpc: Free fdt on error in elf64_load()

2021-04-21 Thread Lakshmi Ramasubramanian
There are a few "goto out;" statements before the local variable "fdt"
is initialized through the call to of_kexec_alloc_and_setup_fdt() in
elf64_load().  This will result in an uninitialized "fdt" being passed
to kvfree() in this function if there is an error before the call to
of_kexec_alloc_and_setup_fdt().

If there is any error after fdt is allocated, but before it is
saved in the arch specific kimage struct, free the fdt.

Reported-by: kernel test robot 
Reported-by: Dan Carpenter 
Signed-off-by: Michael Ellerman 
Signed-off-by: Lakshmi Ramasubramanian 
---
 arch/powerpc/kexec/elf_64.c | 16 ++--
 1 file changed, 6 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/kexec/elf_64.c b/arch/powerpc/kexec/elf_64.c
index 5a569bb51349..02662e72c53d 100644
--- a/arch/powerpc/kexec/elf_64.c
+++ b/arch/powerpc/kexec/elf_64.c
@@ -114,7 +114,7 @@ static void *elf64_load(struct kimage *image, char 
*kernel_buf,
ret = setup_new_fdt_ppc64(image, fdt, initrd_load_addr,
  initrd_len, cmdline);
if (ret)
-   goto out;
+   goto out_free_fdt;
 
fdt_pack(fdt);
 
@@ -125,7 +125,7 @@ static void *elf64_load(struct kimage *image, char 
*kernel_buf,
kbuf.mem = KEXEC_BUF_MEM_UNKNOWN;
ret = kexec_add_buffer(&kbuf);
if (ret)
-   goto out;
+   goto out_free_fdt;
 
/* FDT will be freed in arch_kimage_file_post_load_cleanup */
image->arch.fdt = fdt;
@@ -140,18 +140,14 @@ static void *elf64_load(struct kimage *image, char 
*kernel_buf,
if (ret)
pr_err("Error setting up the purgatory.\n");
 
+   goto out;
+
+out_free_fdt:
+   kvfree(fdt);
 out:
kfree(modified_cmdline);
kexec_free_elf_info(&elf_info);
 
-   /*
-* Once FDT buffer has been successfully passed to kexec_add_buffer(),
-* the FDT buffer address is saved in image->arch.fdt. In that case,
-* the memory cannot be freed here in case of any other error.
-*/
-   if (ret && !image->arch.fdt)
-   kvfree(fdt);
-
return ret ? ERR_PTR(ret) : NULL;
 }
 
-- 
2.31.0
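
The ownership rule the patch relies on can be seen in isolation with a small
sketch (struct owner, prepare() and load_example() are made up; only the
kvmalloc()/kvfree() pairing and the goto structure mirror the change above):
the buffer is freed on every error path taken before it is handed over, and
never after.

#include <linux/mm.h>	/* kvmalloc(), kvfree() */
#include <linux/err.h>

struct owner { void *buf; };

static int prepare(void *buf)
{
	return 0;	/* stand-in for setup work that may fail */
}

static void *load_example(struct owner *owner, size_t size)
{
	void *buf;
	int ret;

	buf = kvmalloc(size, GFP_KERNEL);
	if (!buf)
		return ERR_PTR(-ENOMEM);

	ret = prepare(buf);
	if (ret)
		goto out_free;		/* buffer is still ours: free it here */

	owner->buf = buf;		/* ownership transferred; freed later elsewhere */
	return NULL;

out_free:
	kvfree(buf);
	return ERR_PTR(ret);
}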



[PATCH] powerpc/64s: Fix mm_cpumask memory ordering comment

2021-04-21 Thread Nicholas Piggin
The memory ordering comment no longer applies, because mm_ctx_id is
no longer used anywhere. At best it was always difficult to follow.

It's better to consider the load on which the slbmte depends, which
the MMU depends on before it can start loading TLBs, rather than a
store which may or may not have a subsequent dependency chain to the
slbmte.

So update the comment to use the load of the mm's user context ID.
This is much more analogous to the radix ordering too, which is good.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/mm/mmu_context.c | 24 +---
 1 file changed, 13 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/mm/mmu_context.c b/arch/powerpc/mm/mmu_context.c
index 18f20da0d348..a857af401738 100644
--- a/arch/powerpc/mm/mmu_context.c
+++ b/arch/powerpc/mm/mmu_context.c
@@ -43,24 +43,26 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct 
mm_struct *next,
 
/*
 * This full barrier orders the store to the cpumask above vs
-* a subsequent operation which allows this CPU to begin loading
-* translations for next.
+* a subsequent load which allows this CPU/MMU to begin loading
+* translations for 'next' from page table PTEs into the TLB.
 *
-* When using the radix MMU that operation is the load of the
+* When using the radix MMU, that operation is the load of the
 * MMU context id, which is then moved to SPRN_PID.
 *
 * For the hash MMU it is either the first load from slb_cache
-* in switch_slb(), and/or the store of paca->mm_ctx_id in
-* copy_mm_to_paca().
+* in switch_slb() to preload the SLBs, or the load of
+* get_user_context which loads the context for the VSID hash
+* to insert a new SLB, in the SLB fault handler.
 *
 * On the other side, the barrier is in mm/tlb-radix.c for
-* radix which orders earlier stores to clear the PTEs vs
-* the load of mm_cpumask. And pte_xchg which does the same
-* thing for hash.
+* radix which orders earlier stores to clear the PTEs before
+* the load of mm_cpumask to check which CPU TLBs should be
+* flushed. For hash, pte_xchg to clear the PTE includes the
+* barrier.
 *
-* This full barrier is needed by membarrier when switching
-* between processes after store to rq->curr, before user-space
-* memory accesses.
+* This full barrier is also needed by membarrier when
+* switching between processes after store to rq->curr, before
+* user-space memory accesses.
 */
smp_mb();
 
-- 
2.23.0
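
The ordering the updated comment describes is the usual two-sided
store/barrier/load pairing; a heavily simplified sketch (function names are
illustrative, only smp_mb() and the mm_cpumask()/cpumask helpers are real):

#include <asm/barrier.h>	/* smp_mb() */
#include <linux/cpumask.h>
#include <linux/mm_types.h>	/* struct mm_struct, mm_cpumask() */

/*
 * Context-switch side (simplified from switch_mm_irqs_off()): store to the
 * cpumask, full barrier, then the load that lets the MMU start loading
 * translations (context id for radix, SLB/VSID lookups for hash).
 */
static void switch_side(struct mm_struct *next, int cpu)
{
	cpumask_set_cpu(cpu, mm_cpumask(next));
	smp_mb();
	/* ... load next's user context ID, MMU may begin loading TLBs ... */
}

/*
 * TLB-flush side (simplified): clear the PTE, barrier (provided by pte_xchg()
 * for hash or by the radix flush code), then load mm_cpumask() to decide
 * which CPUs still need to be flushed.
 */
static void flush_side(struct mm_struct *mm)
{
	/* ... clear the PTE ... */
	smp_mb();
	/* ... read mm_cpumask(mm) and flush/IPI the CPUs still set in it ... */
}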



Re: [PATCH 1/1] of/pci: Add IORESOURCE_MEM_64 to resource flags for 64-bit memory addresses

2021-04-21 Thread Leonardo Bras
On Tue, 2021-04-20 at 17:34 -0500, Rob Herring wrote:
> > [...]
> > I think the point here is bus resources not getting the MEM_64 flag,
> > but device resources getting it correctly. Is that supposed to happen?
> 
> I experimented with this on Arm with qemu and it seems fine there too.
> Looks like the BARs are first read and will have bit 2 set by default
> (or hardwired?). Now I'm just wondering why powerpc needs the code it
> has...
> 
> Anyways, I'll apply the patch.
> 
> Rob

Thanks Rob!




Re: [PATCH 1/2] powerpc: Free fdt on error in elf64_load()

2021-04-21 Thread Santosh Sivaraj
Lakshmi Ramasubramanian  writes:

> On 4/20/21 10:35 PM, Santosh Sivaraj wrote:
> Hi Santosh,
>
>> 
>>> There are a few "goto out;" statements before the local variable "fdt"
>>> is initialized through the call to of_kexec_alloc_and_setup_fdt() in
>>> elf64_load().  This will result in an uninitialized "fdt" being passed
>>> to kvfree() in this function if there is an error before the call to
>>> of_kexec_alloc_and_setup_fdt().
>>>
>>> If there is any error after fdt is allocated, but before it is
>>> saved in the arch specific kimage struct, free the fdt.
>>>
>>> Signed-off-by: Lakshmi Ramasubramanian 
>>> Reported-by: kernel test robot 
>>> Reported-by: Dan Carpenter 
>>> Suggested-by: Michael Ellerman 
>>> ---
>>>   arch/powerpc/kexec/elf_64.c | 16 ++--
>>>   1 file changed, 6 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/arch/powerpc/kexec/elf_64.c b/arch/powerpc/kexec/elf_64.c
>>> index 5a569bb51349..02662e72c53d 100644
>>> --- a/arch/powerpc/kexec/elf_64.c
>>> +++ b/arch/powerpc/kexec/elf_64.c
>>> @@ -114,7 +114,7 @@ static void *elf64_load(struct kimage *image, char 
>>> *kernel_buf,
>>> ret = setup_new_fdt_ppc64(image, fdt, initrd_load_addr,
>>>   initrd_len, cmdline);
>>> if (ret)
>>> -   goto out;
>>> +   goto out_free_fdt;
>> 
>> Shouldn't there be a goto out_free_fdt if fdt_open_into fails?
>
> You are likely looking at elf_64.c in the mainline branch. The patch I 
> have submitted is based on Rob's device-tree for-next branch. Please see 
> the link below:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git/tree/arch/powerpc/kexec/elf_64.c?h=for-next

That's right, I was indeed looking at the mainline. Sorry for the noise.

Thanks,
Santosh

>
>> 
>>>   
>>> fdt_pack(fdt);
>>>   
>>> @@ -125,7 +125,7 @@ static void *elf64_load(struct kimage *image, char 
>>> *kernel_buf,
>>> kbuf.mem = KEXEC_BUF_MEM_UNKNOWN;
>>> ret = kexec_add_buffer(&kbuf);
>>> if (ret)
>>> -   goto out;
>>> +   goto out_free_fdt;
>>>   
>>> /* FDT will be freed in arch_kimage_file_post_load_cleanup */
>>> image->arch.fdt = fdt;
>>> @@ -140,18 +140,14 @@ static void *elf64_load(struct kimage *image, char 
>>> *kernel_buf,
>>> if (ret)
>>> pr_err("Error setting up the purgatory.\n");
>>>   
>>> +   goto out;
>>> +
>>> +out_free_fdt:
>>> +   kvfree(fdt);
>> 
>> Can just use kfree here?
> "fdt" is allocated through kvmalloc(). So it is freed using kvfree.
>
> thanks,
>   -lakshmi
>
>>>   out:
>>> kfree(modified_cmdline);
>>> kexec_free_elf_info(&elf_info);
>>>   
>>> -   /*
>>> -* Once FDT buffer has been successfully passed to kexec_add_buffer(),
>>> -* the FDT buffer address is saved in image->arch.fdt. In that case,
>>> -* the memory cannot be freed here in case of any other error.
>>> -*/
>>> -   if (ret && !image->arch.fdt)
>>> -   kvfree(fdt);
>>> -
>>> return ret ? ERR_PTR(ret) : NULL;
>>>   }
>>>   
>>> -- 
>>> 2.31.0


Re: [PATCH 1/2] powerpc: Free fdt on error in elf64_load()

2021-04-21 Thread Lakshmi Ramasubramanian

On 4/21/21 12:18 AM, Michael Ellerman wrote:

Lakshmi Ramasubramanian  writes:

There are a few "goto out;" statements before the local variable "fdt"
is initialized through the call to of_kexec_alloc_and_setup_fdt() in
elf64_load().  This will result in an uninitialized "fdt" being passed
to kvfree() in this function if there is an error before the call to
of_kexec_alloc_and_setup_fdt().

If there is any error after fdt is allocated, but before it is
saved in the arch specific kimage struct, free the fdt.

Signed-off-by: Lakshmi Ramasubramanian 
Reported-by: kernel test robot 
Reported-by: Dan Carpenter 
Suggested-by: Michael Ellerman 


I basically sent you the diff, so this should probably be:

   Reported-by: kernel test robot 
   Reported-by: Dan Carpenter 
   Signed-off-by: Michael Ellerman 
   Signed-off-by: Lakshmi Ramasubramanian 

Otherwise looks good to me, thanks for turning it into a proper patch
and submitting it.


I will submit the patch again with the above change.
Thanks for reviewing the patch.

Could you please review [PATCH 2/2] as well?

thanks,
 -lakshmi



cheers



diff --git a/arch/powerpc/kexec/elf_64.c b/arch/powerpc/kexec/elf_64.c
index 5a569bb51349..02662e72c53d 100644
--- a/arch/powerpc/kexec/elf_64.c
+++ b/arch/powerpc/kexec/elf_64.c
@@ -114,7 +114,7 @@ static void *elf64_load(struct kimage *image, char 
*kernel_buf,
ret = setup_new_fdt_ppc64(image, fdt, initrd_load_addr,
  initrd_len, cmdline);
if (ret)
-   goto out;
+   goto out_free_fdt;
  
  	fdt_pack(fdt);
  
@@ -125,7 +125,7 @@ static void *elf64_load(struct kimage *image, char *kernel_buf,

kbuf.mem = KEXEC_BUF_MEM_UNKNOWN;
ret = kexec_add_buffer(&kbuf);
if (ret)
-   goto out;
+   goto out_free_fdt;
  
  	/* FDT will be freed in arch_kimage_file_post_load_cleanup */

image->arch.fdt = fdt;
@@ -140,18 +140,14 @@ static void *elf64_load(struct kimage *image, char 
*kernel_buf,
if (ret)
pr_err("Error setting up the purgatory.\n");
  
+	goto out;

+
+out_free_fdt:
+   kvfree(fdt);
  out:
kfree(modified_cmdline);
kexec_free_elf_info(&elf_info);
  
-	/*

-* Once FDT buffer has been successfully passed to kexec_add_buffer(),
-* the FDT buffer address is saved in image->arch.fdt. In that case,
-* the memory cannot be freed here in case of any other error.
-*/
-   if (ret && !image->arch.fdt)
-   kvfree(fdt);
-
return ret ? ERR_PTR(ret) : NULL;
  }
  
--

2.31.0




Re: [PATCH 1/2] powerpc: Free fdt on error in elf64_load()

2021-04-21 Thread Lakshmi Ramasubramanian

On 4/20/21 10:35 PM, Santosh Sivaraj wrote:
Hi Santosh,




There are a few "goto out;" statements before the local variable "fdt"
is initialized through the call to of_kexec_alloc_and_setup_fdt() in
elf64_load().  This will result in an uninitialized "fdt" being passed
to kvfree() in this function if there is an error before the call to
of_kexec_alloc_and_setup_fdt().

If there is any error after fdt is allocated, but before it is
saved in the arch specific kimage struct, free the fdt.

Signed-off-by: Lakshmi Ramasubramanian 
Reported-by: kernel test robot 
Reported-by: Dan Carpenter 
Suggested-by: Michael Ellerman 
---
  arch/powerpc/kexec/elf_64.c | 16 ++--
  1 file changed, 6 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/kexec/elf_64.c b/arch/powerpc/kexec/elf_64.c
index 5a569bb51349..02662e72c53d 100644
--- a/arch/powerpc/kexec/elf_64.c
+++ b/arch/powerpc/kexec/elf_64.c
@@ -114,7 +114,7 @@ static void *elf64_load(struct kimage *image, char 
*kernel_buf,
ret = setup_new_fdt_ppc64(image, fdt, initrd_load_addr,
  initrd_len, cmdline);
if (ret)
-   goto out;
+   goto out_free_fdt;


Shouldn't there be a goto out_free_fdt if fdt_open_into fails?


You are likely looking at elf_64.c in the mainline branch. The patch I 
have submitted is based on Rob's device-tree for-next branch. Please see 
the link below:


https://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git/tree/arch/powerpc/kexec/elf_64.c?h=for-next



  
  	fdt_pack(fdt);
  
@@ -125,7 +125,7 @@ static void *elf64_load(struct kimage *image, char *kernel_buf,

kbuf.mem = KEXEC_BUF_MEM_UNKNOWN;
ret = kexec_add_buffer(&kbuf);
if (ret)
-   goto out;
+   goto out_free_fdt;
  
  	/* FDT will be freed in arch_kimage_file_post_load_cleanup */

image->arch.fdt = fdt;
@@ -140,18 +140,14 @@ static void *elf64_load(struct kimage *image, char 
*kernel_buf,
if (ret)
pr_err("Error setting up the purgatory.\n");
  
+	goto out;

+
+out_free_fdt:
+   kvfree(fdt);


Can just use kfree here?

"fdt" is allocated through kvmalloc(). So it is freed using kvfree.

thanks,
 -lakshmi
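
For reference, the reason it must be kvfree() rather than kfree(): kvmalloc()
may satisfy the allocation from the slab allocator or fall back to vmalloc(),
and kvfree() is the helper that frees either kind of backing. A tiny sketch
(the function is illustrative only):

#include <linux/mm.h>	/* kvmalloc(), kvfree() */

static void kv_pairing_example(size_t size)
{
	void *buf = kvmalloc(size, GFP_KERNEL);	/* kmalloc- or vmalloc-backed */

	if (!buf)
		return;
	kvfree(buf);	/* handles either backing; plain kfree() must not be used */
}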


  out:
kfree(modified_cmdline);
kexec_free_elf_info(&elf_info);
  
-	/*

-* Once FDT buffer has been successfully passed to kexec_add_buffer(),
-* the FDT buffer address is saved in image->arch.fdt. In that case,
-* the memory cannot be freed here in case of any other error.
-*/
-   if (ret && !image->arch.fdt)
-   kvfree(fdt);
-
return ret ? ERR_PTR(ret) : NULL;
  }
  
--

2.31.0




Re: [PATCH v1] KVM: PPC: Book3S HV P9: implement kvmppc_xive_pull_vcpu in C

2021-04-21 Thread Cédric Le Goater
On 4/13/21 3:38 PM, Nicholas Piggin wrote:
> This is more symmetric with kvmppc_xive_push_vcpu, and has the advantage
> that it runs with the MMU on.
> 
> The extra test added to the asm will go away with a future change.
> 
> Reviewed-by: Cédric Le Goater 
> Reviewed-by: Alexey Kardashevskiy 
> Signed-off-by: Nicholas Piggin 
> ---
> Another bit that came from the KVM Cify series.
> 
> Thanks,
> Nick
> 
>  arch/powerpc/include/asm/kvm_ppc.h  |  2 ++
>  arch/powerpc/kvm/book3s_hv.c|  2 ++
>  arch/powerpc/kvm/book3s_hv_rmhandlers.S |  5 
>  arch/powerpc/kvm/book3s_xive.c  | 31 +
>  4 files changed, 40 insertions(+)
> 
> diff --git a/arch/powerpc/include/asm/kvm_ppc.h 
> b/arch/powerpc/include/asm/kvm_ppc.h
> index 9531b1c1b190..73b1ca5a6471 100644
> --- a/arch/powerpc/include/asm/kvm_ppc.h
> +++ b/arch/powerpc/include/asm/kvm_ppc.h
> @@ -672,6 +672,7 @@ extern int kvmppc_xive_set_icp(struct kvm_vcpu *vcpu, u64 
> icpval);
>  extern int kvmppc_xive_set_irq(struct kvm *kvm, int irq_source_id, u32 irq,
>  int level, bool line_status);
>  extern void kvmppc_xive_push_vcpu(struct kvm_vcpu *vcpu);
> +extern void kvmppc_xive_pull_vcpu(struct kvm_vcpu *vcpu);
>  
>  static inline int kvmppc_xive_enabled(struct kvm_vcpu *vcpu)
>  {
> @@ -712,6 +713,7 @@ static inline int kvmppc_xive_set_icp(struct kvm_vcpu 
> *vcpu, u64 icpval) { retur
>  static inline int kvmppc_xive_set_irq(struct kvm *kvm, int irq_source_id, 
> u32 irq,
> int level, bool line_status) { return 
> -ENODEV; }
>  static inline void kvmppc_xive_push_vcpu(struct kvm_vcpu *vcpu) { }
> +static inline void kvmppc_xive_pull_vcpu(struct kvm_vcpu *vcpu) { }
>  
>  static inline int kvmppc_xive_enabled(struct kvm_vcpu *vcpu)
>   { return 0; }
> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> index 4a532410e128..981bcaf787a8 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -3570,6 +3570,8 @@ static int kvmhv_load_hv_regs_and_go(struct kvm_vcpu 
> *vcpu, u64 time_limit,
>  
>   trap = __kvmhv_vcpu_entry_p9(vcpu);
>  
> + kvmppc_xive_pull_vcpu(vcpu);
> +
>   /* Advance host PURR/SPURR by the amount used by guest */
>   purr = mfspr(SPRN_PURR);
>   spurr = mfspr(SPRN_SPURR);
> diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S 
> b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> index 75405ef53238..c11597f815e4 100644
> --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> @@ -1442,6 +1442,11 @@ guest_exit_cont:   /* r9 = vcpu, r12 = 
> trap, r13 = paca */
>   bl  kvmhv_accumulate_time
>  #endif
>  #ifdef CONFIG_KVM_XICS
> + /* If we came in through the P9 short path, xive pull is done in C */
> + lwz r0, STACK_SLOT_SHORT_PATH(r1)
> + cmpwi   r0, 0
> + bne 1f
> +
>   /* We are exiting, pull the VP from the XIVE */
>   lbz r0, VCPU_XIVE_PUSHED(r9)
>   cmpwi   cr0, r0, 0
> diff --git a/arch/powerpc/kvm/book3s_xive.c b/arch/powerpc/kvm/book3s_xive.c
> index e7219b6f5f9a..741bf1f4387a 100644
> --- a/arch/powerpc/kvm/book3s_xive.c
> +++ b/arch/powerpc/kvm/book3s_xive.c
> @@ -127,6 +127,37 @@ void kvmppc_xive_push_vcpu(struct kvm_vcpu *vcpu)
>  }
>  EXPORT_SYMBOL_GPL(kvmppc_xive_push_vcpu);
>  
> +/*
> + * Pull a vcpu's context from the XIVE on guest exit.
> + * This assumes we are in virtual mode (MMU on)
> + */
> +void kvmppc_xive_pull_vcpu(struct kvm_vcpu *vcpu)
> +{
> + void __iomem *tima = local_paca->kvm_hstate.xive_tima_virt;
> +
> + if (!vcpu->arch.xive_pushed)
> + return;
> +
> + /*
> +  * Should not have been pushed if there is no tima
> +  */
> + if (WARN_ON(!tima))
> + return;
> +
> + eieio();
> + /* First load to pull the context, we ignore the value */
> + __raw_readl(tima + TM_SPC_PULL_OS_CTX);
> + /* Second load to recover the context state (Words 0 and 1) */
> + vcpu->arch.xive_saved_state.w01 = __raw_readq(tima + TM_QW1_OS);

This load could be removed on P10, since HW is configured to do the same.
It should save a few cycles.

C. 

> + /* Fixup some of the state for the next load */
> + vcpu->arch.xive_saved_state.lsmfb = 0;
> + vcpu->arch.xive_saved_state.ack = 0xff;
> + vcpu->arch.xive_pushed = 0;
> + eieio();
> +}
> +EXPORT_SYMBOL_GPL(kvmppc_xive_pull_vcpu);
> +
>  /*
>   * This is a simple trigger for a generic XIVE IRQ. This must
>   * only be called for interrupts that support a trigger page
> 



Re: [PATCH] powerpc/pseries: Add shutdown() to vio_driver and vio_bus

2021-04-21 Thread Michael Ellerman
On Thu, 1 Apr 2021 18:13:25 -0600, Tyrel Datwyler wrote:
> Currently, neither the vio_bus or vio_driver structures provide support
> for a shutdown() routine.
> 
> Add support for shutdown() by allowing drivers to provide a
> implementation via function pointer in their vio_driver struct and
> provide a proper implementation in the driver template for the vio_bus
> that calls a vio drivers shutdown() if defined.
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/pseries: Add shutdown() to vio_driver and vio_bus
  https://git.kernel.org/powerpc/c/39d0099f94390eb7a677e1a5c9bb56a4daa242a1

cheers


Re: [PATCH] powerpc/pseries: Stop calling printk in rtas_stop_self()

2021-04-21 Thread Michael Ellerman
On Sun, 18 Apr 2021 23:54:13 +1000, Michael Ellerman wrote:
> RCU complains about us calling printk() from an offline CPU:
> 
>   =
>   WARNING: suspicious RCU usage
>   5.12.0-rc7-02874-g7cf90e481cb8 #1 Not tainted
>   -
>   kernel/locking/lockdep.c:3568 RCU-list traversed in non-reader section!!
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/pseries: Stop calling printk in rtas_stop_self()
  https://git.kernel.org/powerpc/c/ed8029d7b472369a010a1901358567ca3b6dbb0d

cheers


Re: [PATCH] powerpc: Only define _TASK_CPU for 32-bit

2021-04-21 Thread Michael Ellerman
On Sun, 18 Apr 2021 23:16:41 +1000, Michael Ellerman wrote:
> We have some interesting code in our Makefile to define _TASK_CPU, based
> on awk'ing the value out of asm-offsets.h. It exists to circumvent some
> circular header dependencies that prevent us from referring to
> task_struct in the relevant code. See the comment around _TASK_CPU in
> smp.h for more detail.
> 
> Maybe one day we can come up with a better solution, but for now we can
> at least limit that logic to 32-bit, because it's not needed for 64-bit.

Applied to powerpc/next.

[1/1] powerpc: Only define _TASK_CPU for 32-bit
  https://git.kernel.org/powerpc/c/3027a37c06be364e6443d3df3adf45576fba50cb

cheers


Re: [PATCH] powerpc/kvm: Fix PR KVM with KUAP/MEM_KEYS enabled

2021-04-21 Thread Michael Ellerman
On Mon, 19 Apr 2021 22:01:39 +1000, Michael Ellerman wrote:
> The changes to add KUAP support with the hash MMU broke booting of KVM
> PR guests. The symptom is no visible progress of the guest, or possibly
> just "SLOF" being printed to the qemu console.
> 
> Host code is still executing, but breaking into xmon might show a stack
> trace such as:
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/kvm: Fix PR KVM with KUAP/MEM_KEYS enabled
  https://git.kernel.org/powerpc/c/e4e8bc1df691ba5ba749d1e2b67acf9827e51a35

cheers


Re: [PATCH 2/2] powerpc/perf: Add platform specific check_attr_config

2021-04-21 Thread Michael Ellerman
On Thu, 8 Apr 2021 13:15:04 +0530, Madhavan Srinivasan wrote:
> Add platform specific attr.config value checks. Patch
> includes checks for both power9 and power10.

Applied to powerpc/next.

[2/2] powerpc/perf: Add platform specific check_attr_config
  https://git.kernel.org/powerpc/c/d8a1d6c58986d8778768b15dc5bac0b4b082d345

cheers


Re: [PATCH 1/1] powerpc/pseries/iommu: Fix window size for direct mapping with pmem

2021-04-21 Thread Michael Ellerman
On Tue, 20 Apr 2021 01:54:04 -0300, Leonardo Bras wrote:
> As of today, if the DDW is big enough to fit (1 << MAX_PHYSMEM_BITS) it's
> possible to use direct DMA mapping even with pmem region.
> 
> But, if that happens, the window size (len) is set to
> (MAX_PHYSMEM_BITS - page_shift) instead of MAX_PHYSMEM_BITS, causing a
> pagesize times smaller DDW to be created, being insufficient for correct
> usage.
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/pseries/iommu: Fix window size for direct mapping with pmem
  https://git.kernel.org/powerpc/c/a9d2f9bb225fd2a764aef57738ab6c7f38d782ae

cheers


Re: [PATCH] Documentation/powerpc: Add proper links for manual and tests

2021-04-21 Thread Michael Ellerman
On Sun, 18 Apr 2021 12:29:42 -0700, Haren Myneni wrote:
> The links that are mentioned in this document are no longer
> valid. So changed the proper links for NXGZIP user manual and
> test cases.

Applied to powerpc/next.

[1/1] Documentation/powerpc: Add proper links for manual and tests
  https://git.kernel.org/powerpc/c/2886e2df10beaf50352dad7a90907251bc692029

cheers


Re: [PATCH] powerpc/pseries/mce: Fix a typo in error type assignment

2021-04-21 Thread Michael Ellerman
On Fri, 16 Apr 2021 18:27:50 +0530, Ganesh Goudar wrote:
> The error type is ICACHE and DCACHE, for case MCE_ERROR_TYPE_ICACHE.

Applied to powerpc/next.

[1/1] powerpc/pseries/mce: Fix a typo in error type assignment
  https://git.kernel.org/powerpc/c/864ec4d40c83365b16483d88990e7e579537635c

cheers


Re: [PATCH] powerpc/mce: save ignore_event flag unconditionally for UE

2021-04-21 Thread Michael Ellerman
On Wed, 7 Apr 2021 10:28:16 +0530, Ganesh Goudar wrote:
> When we hit an UE while using machine check safe copy routines,
> ignore_event flag is set and the event is ignored by mce handler,
> And the flag is also saved for defered handling and printing of
> mce event information, But as of now saving of this flag is done
> on checking if the effective address is provided or physical address
> is calculated, which is not right.
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/mce: save ignore_event flag unconditionally for UE
  https://git.kernel.org/powerpc/c/92d9d61be519f32f16c07602db5bcbe30a0836fe

cheers


Re: [PATCH 0/2] pseries: UNISOLATE DRCs to signal device removal error

2021-04-21 Thread Michael Ellerman
On Fri, 16 Apr 2021 18:02:14 -0300, Daniel Henrique Barboza wrote:
> At this moment, PAPR [1] does not have a way to report errors during a device
> removal operation. This puts a strain in the hypervisor, which needs extra
> mechanisms to try to fallback and recover from an error that might have
> happened during the removal. The QEMU community has dealt with it during these
> years by either trying to preempt the error before sending the HP event or, in
> case of a guest side failure, reboot the guest to complete the removal 
> process.
> 
> [...]

Applied to powerpc/next.

[1/2] dlpar.c: introduce dlpar_unisolate_drc()
  https://git.kernel.org/powerpc/c/0e3b3ff83ce24a7a01e467ca42e3e33e87195c0d
[2/2] hotplug-cpu.c: set UNISOLATE on dlpar_cpu_remove() failure
  https://git.kernel.org/powerpc/c/29c9a2699e71a7866a98ebdf6ea38135d31b4e1f

cheers


Re: [PATCH v3 1/4] powerpc: Remove probe_user_read_inst()

2021-04-21 Thread Michael Ellerman
On Wed, 14 Apr 2021 13:08:40 + (UTC), Christophe Leroy wrote:
> Its name comes from former probe_user_read() function.
> That function is now called copy_from_user_nofault().
> 
> probe_user_read_inst() uses copy_from_user_nofault() to read only
> a few bytes. It is suboptimal.
> 
> It does the same as get_user_inst() but in addition disables
> page faults.
> 
> [...]

Applied to powerpc/next.

[1/4] powerpc: Remove probe_user_read_inst()
  https://git.kernel.org/powerpc/c/6ac7897f08e04b47df3955d7691652e9d12d4068
[2/4] powerpc: Make probe_kernel_read_inst() common to PPC32 and PPC64
  https://git.kernel.org/powerpc/c/6449078d50111c839bb7156c3b99b9def80eed42
[3/4] powerpc: Rename probe_kernel_read_inst()
  https://git.kernel.org/powerpc/c/41d6cf68b5f611934bcc6a7d4a1a2d9bfd04b420
[4/4] powerpc: Move copy_from_kernel_nofault_inst()
  https://git.kernel.org/powerpc/c/39352430aaa05fbe4ba710231c70b334513078f2

cheers


Re: [PATCH v2 1/2] powerpc/inst: ppc_inst_as_u64() becomes ppc_inst_as_ulong()

2021-04-21 Thread Michael Ellerman
On Tue, 20 Apr 2021 14:02:06 + (UTC), Christophe Leroy wrote:
> In order to simplify use on PPC32, change ppc_inst_as_u64()
> into ppc_inst_as_ulong() that returns the 32 bits instruction
> on PPC32.
> 
> Will be used when porting OPTPROBES to PPC32.

Applied to powerpc/next.

[1/2] powerpc/inst: ppc_inst_as_u64() becomes ppc_inst_as_ulong()
  https://git.kernel.org/powerpc/c/693557ebf407a85ea400a0b501bb97687d8f4856
[2/2] powerpc: Enable OPTPROBES on PPC32
  https://git.kernel.org/powerpc/c/eacf4c0202654adfa94bbb17b5c5c77c0be14af8

cheers


Re: [PATCH] selftests: timens: Fix gettime_perf to work on powerpc

2021-04-21 Thread Michael Ellerman
On Wed, 31 Mar 2021 13:59:17 + (UTC), Christophe Leroy wrote:
> On powerpc:
> - VDSO library is named linux-vdso32.so.1 or linux-vdso64.so.1
> - clock_gettime is named __kernel_clock_gettime()
> 
> Ensure gettime_perf tries these names before giving up.

Applied to powerpc/next.

[1/1] selftests: timens: Fix gettime_perf to work on powerpc
  https://git.kernel.org/powerpc/c/f56607e85ee38f2a5bb7096e24e2d40f35d714f9

cheers


Re: [PATCH 1/3] powerpc/ebpf32: Fix comment on BPF_ALU{64} | BPF_LSH | BPF_K

2021-04-21 Thread Michael Ellerman
On Mon, 12 Apr 2021 11:44:16 + (UTC), Christophe Leroy wrote:
> Replace <<== by <<=

Applied to powerpc/next.

[1/3] powerpc/ebpf32: Fix comment on BPF_ALU{64} | BPF_LSH | BPF_K
  https://git.kernel.org/powerpc/c/d228cc4969663623e6b5a749b02e4619352a0a8d
[2/3] powerpc/ebpf32: Rework 64 bits shifts to avoid tests and branches
  https://git.kernel.org/powerpc/c/e7de0023e1232f42a10ef6af03352538cc27eaf6
[3/3] powerpc/ebpf32: Use standard function call for functions within 32M 
distance
  https://git.kernel.org/powerpc/c/ee7c3ec3b4b1222b30272624897826bc40d79bc5

cheers


Re: [PATCH] powerpc/32: Use r2 in wrtspr() instead of r0

2021-04-21 Thread Michael Ellerman
On Fri, 22 Jan 2021 07:15:03 + (UTC), Christophe Leroy wrote:
> wrtspr() is a function to write an arbitrary value in a special
> register. It is used on 8xx to write to SPRN_NRI, SPRN_EID and
> SPRN_EIE. Writing any value to one of those will play with MSR EE
> and MSR RI regardless of that value.
> 
> r0 is used many places in the generated code and using r0 for
> that creates an unnecessary dependency of this instruction with
> preceding ones using r0 in a few places in vmlinux.
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/32: Use r2 in wrtspr() instead of r0
  https://git.kernel.org/powerpc/c/867e762480f4ad4106b16299a373fa23eccf5b4b

cheers


Re: [PATCH 1/3] powerpc/8xx: Enhance readability of trap types

2021-04-21 Thread Michael Ellerman
On Mon, 19 Apr 2021 15:48:09 + (UTC), Christophe Leroy wrote:
> This patch makes use of trap types in head_8xx.S

Applied to powerpc/next.

[1/3] powerpc/8xx: Enhance readability of trap types
  https://git.kernel.org/powerpc/c/0f5eb28a6ce6ab0882010e6727bfd6e8cd569273
[2/3] powerpc/32s: Enhance readability of trap types
  https://git.kernel.org/powerpc/c/7fab639729ce4a0ecb3c528cd68b0c0598696ef9
[3/3] powerpc/irq: Enhance readability of trap types
  https://git.kernel.org/powerpc/c/e522331173ec9af563461e0fae534e83ce39e8e3

cheers


Re: [PATCH V2 0/5] powerpc/perf: Export processor pipeline stage cycles information

2021-04-21 Thread Michael Ellerman
On Mon, 22 Mar 2021 10:57:22 -0400, Athira Rajeev wrote:
> Performance Monitoring Unit (PMU) registers in powerpc exports
> number of cycles elapsed between different stages in the pipeline.
> Example, sampling registers in ISA v3.1.
> 
> This patchset implements kernel and perf tools support to expose
> these pipeline stage cycles using the sample type PERF_SAMPLE_WEIGHT_TYPE.
> 
> [...]

Patch 1 applied to powerpc/next.

[1/5] powerpc/perf: Expose processor pipeline stage cycles using 
PERF_SAMPLE_WEIGHT_STRUCT
  https://git.kernel.org/powerpc/c/af31fd0c9107e400a8eb89d0eafb40bb78802f79

cheers


[PATCH 2/2] powerpc/powernv: Fix type of opal_mpipl_query_tag() addr argument

2021-04-21 Thread Michael Ellerman
opal_mpipl_query_tag() takes a pointer to a 64-bit value, which firmware
writes to. As OPAL is traditionally big endian, this value will be
big endian.

This can be confirmed by looking at the implementation in skiboot:

  static uint64_t opal_mpipl_query_tag(enum opal_mpipl_tags tag, __be64 
*tag_val)
  {
...
*tag_val = cpu_to_be64(opal_mpipl_tags[tag]);
return OPAL_SUCCESS;
  }

Fix the declaration to annotate that the value is big endian.

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/include/asm/opal.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
index 9986ac34b8e2..c76157237e22 100644
--- a/arch/powerpc/include/asm/opal.h
+++ b/arch/powerpc/include/asm/opal.h
@@ -307,7 +307,7 @@ int opal_secvar_enqueue_update(const char *key, uint64_t 
key_len, u8 *data,
 
 s64 opal_mpipl_update(enum opal_mpipl_ops op, u64 src, u64 dest, u64 size);
 s64 opal_mpipl_register_tag(enum opal_mpipl_tags tag, u64 addr);
-s64 opal_mpipl_query_tag(enum opal_mpipl_tags tag, u64 *addr);
+s64 opal_mpipl_query_tag(enum opal_mpipl_tags tag, __be64 *addr);
 
 s64 opal_signal_system_reset(s32 cpu);
 s64 opal_quiesce(u64 shutdown_type, s32 cpu);
-- 
2.25.1
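
With the annotation in place, callers are expected to convert the returned value
with be64_to_cpu() before using it. A minimal sketch, with an illustrative
wrapper name (the tag constant comes from the opal_mpipl_tags enum):

  static u64 query_mpipl_kernel_tag(void)
  {
          __be64 addr;

          /* firmware fills 'addr' in big endian; convert before use */
          if (opal_mpipl_query_tag(OPAL_MPIPL_TAG_KERNEL, &addr) != OPAL_SUCCESS)
                  return 0;

          return be64_to_cpu(addr);
  }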



[PATCH 1/2] powerpc/fadump: Fix sparse warnings

2021-04-21 Thread Michael Ellerman
Sparse says:
  arch/powerpc/kernel/fadump.c:48:16: warning: symbol 'fadump_kobj' was not 
declared. Should it be static?
  arch/powerpc/kernel/fadump.c:55:27: warning: symbol 'crash_mrange_info' was 
not declared. Should it be static?
  arch/powerpc/kernel/fadump.c:61:27: warning: symbol 'reserved_mrange_info' 
was not declared. Should it be static?
  arch/powerpc/kernel/fadump.c:83:12: warning: symbol 'fadump_cma_init' was not 
declared. Should it be static?

And indeed none of them are used outside this file, so they can all be made
static. Also, fadump_kobj needs to be moved inside the ifdef where it's
used.

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/kernel/fadump.c | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
index 000e3b7f3fca..b990075285f5 100644
--- a/arch/powerpc/kernel/fadump.c
+++ b/arch/powerpc/kernel/fadump.c
@@ -45,22 +45,21 @@ static struct fw_dump fw_dump;
 
 static void __init fadump_reserve_crash_area(u64 base);
 
-struct kobject *fadump_kobj;
-
 #ifndef CONFIG_PRESERVE_FA_DUMP
 
+static struct kobject *fadump_kobj;
+
 static atomic_t cpus_in_fadump;
 static DEFINE_MUTEX(fadump_mutex);
 
-struct fadump_mrange_info crash_mrange_info = { "crash", NULL, 0, 0, 0, false 
};
+static struct fadump_mrange_info crash_mrange_info = { "crash", NULL, 0, 0, 0, 
false };
 
 #define RESERVED_RNGS_SZ   16384 /* 16K - 128 entries */
 #define RESERVED_RNGS_CNT  (RESERVED_RNGS_SZ / \
 sizeof(struct fadump_memory_range))
 static struct fadump_memory_range rngs[RESERVED_RNGS_CNT];
-struct fadump_mrange_info reserved_mrange_info = { "reserved", rngs,
-  RESERVED_RNGS_SZ, 0,
-  RESERVED_RNGS_CNT, true };
+static struct fadump_mrange_info
+reserved_mrange_info = { "reserved", rngs, RESERVED_RNGS_SZ, 0, 
RESERVED_RNGS_CNT, true };
 
 static void __init early_init_dt_scan_reserved_ranges(unsigned long node);
 
@@ -80,7 +79,7 @@ static struct cma *fadump_cma;
  * But for some reason even if it fails we still have the memory reservation
  * with us and we can still continue doing fadump.
  */
-int __init fadump_cma_init(void)
+static int __init fadump_cma_init(void)
 {
unsigned long long base, size;
int rc;
-- 
2.25.1



Re: [PATCH v11 6/6] powerpc: Book3S 64-bit outline-only KASAN support

2021-04-21 Thread Christophe Leroy




On 19/03/2021 at 15:40, Daniel Axtens wrote:

diff --git a/arch/powerpc/mm/ptdump/ptdump.c b/arch/powerpc/mm/ptdump/ptdump.c
index aca354fb670b..63672aa656e8 100644
--- a/arch/powerpc/mm/ptdump/ptdump.c
+++ b/arch/powerpc/mm/ptdump/ptdump.c
@@ -20,6 +20,7 @@
  #include 
  #include 
  #include 
+#include 
  #include 
  #include 
  
@@ -317,6 +318,23 @@ static void walk_pud(struct pg_state *st, p4d_t *p4d, unsigned long start)

unsigned long addr;
unsigned int i;
  
+#if defined(CONFIG_KASAN) && defined(CONFIG_PPC_BOOK3S_64)

+   /*
+* On radix + KASAN, we want to check for the KASAN "early" shadow
+* which covers huge quantities of memory with the same set of
+* read-only PTEs. If it is, we want to note the first page (to see
+* the status change), and then note the last page. This gives us good
+* results without spending ages noting the exact same PTEs over 100s of
+* terabytes of memory.
+*/
+   if (p4d_page(*p4d) == virt_to_page(lm_alias(kasan_early_shadow_pud))) {
+   walk_pmd(st, pud, start);
+   addr = start + (PTRS_PER_PUD - 1) * PUD_SIZE;
+   walk_pmd(st, pud, addr);
+   return;
+   }
+#endif
+
for (i = 0; i < PTRS_PER_PUD; i++, pud++) {
addr = start + i * PUD_SIZE;
if (!pud_none(*pud) && !pud_is_leaf(*pud))



The above changes should not be necessary once PPC_PTDUMP is converted to 
GENERIC_PTDUMP.

See https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=239795


Christophe


Re: [PATCH] powerpc/64s: Add load address to plt branch targets before moved to linked location for non-relocatable kernels

2021-04-21 Thread Christophe Leroy




On 21/04/2021 at 04:17, Jordan Niethe wrote:

Large branches will go through the plt which includes a stub that loads
a target address from the .branch_lt section. On a relocatable kernel the
targets in .branch_lt have relocations so they will be fixed up for
where the kernel is running by relocate().

For a non-relocatable kernel obviously there are no relocations.
However, until the kernel is moved down to its linked address it is
expected to be able to run wherever it is loaded. For pseries machines
prom_init() is called before running at the linked address.

Certain configs result in a large kernel such as STRICT_KERNEL_RWX
(because of the larger data shift):

config DATA_SHIFT
int "Data shift" if DATA_SHIFT_BOOL
default 24 if STRICT_KERNEL_RWX && PPC64

These large kernels lead to prom_init()'s final call to __start()
generating a plt branch:

bl  c218 <0078.plt_branch.__start>

This results in the kernel jumping to the linked address of __start,
0xc000, when really it needs to jump to the
0xc000 + the runtime address because the kernel is still
running at the load address.



On ppc32 it seems to be different. I can't find plt_branch or lt_branch or 
whatever.

Looks like the stubs are placed at the end of .head section, and just after 
prom_init:

c0003858 :
c0003858:   7d 08 02 a6 mflrr8
c000385c:   48 00 d5 35 bl  c0010d90 
c0003860:   7d 08 03 a6 mtlrr8
c0003864:   3d 03 c2 04 addis   r8,r3,-15868
c0003868:   39 08 1d 08 addir8,r8,7432
c000386c:   2c 08 00 00 cmpwi   r8,0
c0003870:   4d 82 00 20 beqlr
c0003874:   81 68 00 00 lwz r11,0(r8)
c0003878:   81 08 00 04 lwz r8,4(r8)
c000387c:   7d 1f 83 a6 mtdbatl 3,r8
c0003880:   7d 7e 83 a6 mtdbatu 3,r11
c0003884:   4e 80 00 20 blr
c0003888:   3d 80 c2 00 lis r12,-15872
c000388c:   39 8c 16 dc addir12,r12,5852
c0003890:   7d 89 03 a6 mtctr   r12
c0003894:   4e 80 04 20 bctr
c0003898:   3d 80 c2 01 lis r12,-15871
c000389c:   39 8c e1 38 addir12,r12,-7880
c00038a0:   7d 89 03 a6 mtctr   r12
c00038a4:   4e 80 04 20 bctr
c00038a8:   3d 80 c2 00 lis r12,-15872
c00038ac:   39 8c 74 d0 addir12,r12,29904
c00038b0:   7d 89 03 a6 mtctr   r12
c00038b4:   4e 80 04 20 bctr
c00038b8:   3d 80 c2 00 lis r12,-15872
c00038bc:   39 8c 73 38 addir12,r12,29496
c00038c0:   7d 89 03 a6 mtctr   r12
c00038c4:   4e 80 04 20 bctr
c00038c8:   3d 80 c2 01 lis r12,-15871
c00038cc:   39 8c 83 6c addir12,r12,-31892
c00038d0:   7d 89 03 a6 mtctr   r12
c00038d4:   4e 80 04 20 bctr
c00038d8:   3d 80 c2 01 lis r12,-15871
c00038dc:   39 8c 8f 08 addir12,r12,-28920
c00038e0:   7d 89 03 a6 mtctr   r12
c00038e4:   4e 80 04 20 bctr

Disassembly of section .text:

c0004000 :


c20016dc :
c20016dc:   94 21 ff 50 stwur1,-176(r1)
c20016e0:   7c 08 02 a6 mflrr0
c20016e4:   42 9f 00 05 bcl 20,4*cr7+so,c20016e8 
c20016e8:   bd c1 00 68 stmwr14,104(r1)
c20016ec:   7f c8 02 a6 mflrr30
c20016f0:   90 01 00 b4 stw r0,180(r1)
c20016f4:   7c bb 2b 78 mr  r27,r5
c20016f8:   80 1e ff f0 lwz r0,-16(r30)

c20026d4:   4a 00 ed 69 bl  c001143c 
c20026d8:   7f e3 fb 78 mr  r3,r31
c20026dc:   7f 24 cb 78 mr  r4,r25
c20026e0:   39 20 00 00 li  r9,0
c20026e4:   39 00 00 00 li  r8,0
c20026e8:   38 e0 00 00 li  r7,0
c20026ec:   38 c0 00 00 li  r6,0
c20026f0:   38 a0 00 00 li  r5,0
c20026f4:   48 00 00 61 bl  c2002754 
c20026f8:   38 60 00 00 li  r3,0
c20026fc:   80 01 00 b4 lwz r0,180(r1)
c2002700:   81 c1 00 68 lwz r14,104(r1)
c2002704:   81 e1 00 6c lwz r15,108(r1)
c2002708:   7c 08 03 a6 mtlrr0
c200270c:   82 01 00 70 lwz r16,112(r1)
c2002710:   82 21 00 74 lwz r17,116(r1)
c2002714:   82 41 00 78 lwz r18,120(r1)
c2002718:   82 61 00 7c lwz r19,124(r1)
c200271c:   82 81 00 80 lwz r20,128(r1)
c2002720:   82 a1 00 84 lwz r21,132(r1)
c2002724:   82 c1 00 88 lwz r22,136(r1)
c2002728:   82 e1 00 8c lwz r23,140(r1)
c200272c:   83 01 00 90 lwz r24,144(r1)
c2002730:   83 21 00 94 lwz r25,148(r1)
c2002734:   83 41 00 98 lwz r26,152(r1)
c2002738:   83 61 00 9c lwz r27,156(r1)
c200273c:   83 81 00 a0 lwz r28,160(r1)
c2002740:   83 a1 00 a4 lwz r29,164(r1)
c2002744:   83 c1 00 a8 lwz r30,168(r1)
c2002748:   83 e1 00 ac lwz r31,172(r1)
c200274c:   38 21
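
Conceptually, what the patch subject describes amounts to biasing every
.branch_lt entry by the load/link offset before the kernel runs at its linked
address. A purely illustrative C sketch of that idea (the real fix belongs in
the early boot assembly, and the names below are made up):

  static void fixup_plt_targets(unsigned long *lt_start, unsigned long *lt_end,
                                unsigned long load_addr, unsigned long link_addr)
  {
          unsigned long offset = load_addr - link_addr;
          unsigned long *entry;

          /* make each plt_branch stub resolve into the running copy */
          for (entry = lt_start; entry < lt_end; entry++)
                  *entry += offset;
  }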

[PATCH v2] soc: fsl: qe: Remove unused function

2021-04-21 Thread Jiapeng Chong
The last callers of this function were removed by commit d7c2878cfcfa
("soc: fsl: qe: remove unused qe_ic_set_* functions"):
https://github.com/torvalds/linux/commit/d7c2878cfcfa

Fix the following clang warning:

drivers/soc/fsl/qe/qe_ic.c:234:29: warning: unused function
'qe_ic_from_irq' [-Wunused-function].

Reported-by: Abaci Robot 
Signed-off-by: Jiapeng Chong 
---
Changes in v2:
  - Modified submission information.

 drivers/soc/fsl/qe/qe_ic.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/drivers/soc/fsl/qe/qe_ic.c b/drivers/soc/fsl/qe/qe_ic.c
index 0390af9..b573712 100644
--- a/drivers/soc/fsl/qe/qe_ic.c
+++ b/drivers/soc/fsl/qe/qe_ic.c
@@ -231,11 +231,6 @@ static inline void qe_ic_write(__be32  __iomem *base, 
unsigned int reg,
qe_iowrite32be(value, base + (reg >> 2));
 }
 
-static inline struct qe_ic *qe_ic_from_irq(unsigned int virq)
-{
-   return irq_get_chip_data(virq);
-}
-
 static inline struct qe_ic *qe_ic_from_irq_data(struct irq_data *d)
 {
return irq_data_get_irq_chip_data(d);
-- 
1.8.3.1



Re: powerpc{32,64} randconfigs

2021-04-21 Thread Masahiro Yamada
On Wed, Apr 21, 2021 at 4:15 PM Michael Ellerman  wrote:
>
> Randy Dunlap  writes:
> > Hi,
> >
> > Is there a way to do this?
> >
> > $ make ARCH=powerpc randconfig # and force PPC32
>
> Sort of:
>
> $ KCONFIG_ALLCONFIG=arch/powerpc/configs/book3s_32.config make randconfig
>
> But that also forces BOOK3S.
>
> > and separately
> > $ make ARCH=powerpc randconfig # and force PPC64
>
> No.
>
> ...
> > OK, I have a patch that seems for work as far as setting
> > PPC32=y or PPC64=y... but it has a problem during linking
> > of vmlinux:
> >
> > crosstool/gcc-9.3.0-nolibc/powerpc-linux/bin/powerpc-linux-ld:./arch/powerpc/kernel/vmlinux.lds:6:
> >  syntax error
> >
> > and the (bad) generated vmlinux.lds file says (at line 6):
> >
> > OUTPUT_ARCH(1:common)
> >
> > while it should say:
> >
> > OUTPUT_ARCH(powerpc:common)
> >
> > Does anyone have any ideas about this problem?
>
> I guess your patch broke something? :D
> Not sure sorry.
>
> What about something like this?
>
> cheers
>
>
> diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
> index 3212d076ac6a..712c5e8768ce 100644
> --- a/arch/powerpc/Makefile
> +++ b/arch/powerpc/Makefile
> @@ -376,6 +376,16 @@ PHONY += ppc64_book3e_allmodconfig
> $(Q)$(MAKE) 
> KCONFIG_ALLCONFIG=$(srctree)/arch/powerpc/configs/85xx-64bit.config \
> -f $(srctree)/Makefile allmodconfig
>
> +PHONY += ppc32_randconfig
> +ppc32_randconfig:
> +   $(Q)$(MAKE) 
> KCONFIG_ALLCONFIG=$(srctree)/arch/powerpc/configs/32-bit.config \
> +   -f $(srctree)/Makefile randconfig
> +
> +PHONY += ppc64_randconfig
> +ppc64_randconfig:
> +   $(Q)$(MAKE) 
> KCONFIG_ALLCONFIG=$(srctree)/arch/powerpc/configs/64-bit.config \
> +   -f $(srctree)/Makefile randconfig
> +
>  define archhelp
>@echo '* zImage  - Build default images selected by kernel config'
>@echo '  zImage.*- Compressed kernel image 
> (arch/$(ARCH)/boot/zImage.*)'
> diff --git a/arch/powerpc/configs/32-bit.config 
> b/arch/powerpc/configs/32-bit.config
> new file mode 100644
> index ..bdf833009006
> --- /dev/null
> +++ b/arch/powerpc/configs/32-bit.config
> @@ -0,0 +1 @@
> +CONFIG_PPC64=n

Please do:

# CONFIG_PPC64 is not set





> diff --git a/arch/powerpc/configs/64-bit.config 
> b/arch/powerpc/configs/64-bit.config
> new file mode 100644
> index ..0fe6406929e2
> --- /dev/null
> +++ b/arch/powerpc/configs/64-bit.config
> @@ -0,0 +1 @@
> +CONFIG_PPC64=y
>


-- 
Best Regards
Masahiro Yamada


Re: [PATCH] powerpc/64s: Add load address to plt branch targets before moved to linked location for non-relocatable kernels

2021-04-21 Thread Christophe Leroy




On 21/04/2021 at 04:17, Jordan Niethe wrote:

Large branches will go through the plt which includes a stub that loads
a target address from the .branch_lt section. On a relocatable kernel the
targets in .branch_lt have relocations so they will be fixed up for
where the kernel is running by relocate().

For a non-relocatable kernel obviously there are no relocations.
However, until the kernel is moved down to its linked address it is
expected to be able to run wherever it is loaded. For pseries machines
prom_init() is called before running at the linked address.

Certain configs result in a large kernel such as STRICT_KERNEL_RWX
(because of the larger data shift):


Same problem occurs on 32s, see discussion at 
https://bugzilla.kernel.org/show_bug.cgi?id=208181#c14




config DATA_SHIFT
int "Data shift" if DATA_SHIFT_BOOL
default 24 if STRICT_KERNEL_RWX && PPC64

These large kernels lead to prom_init()'s final call to __start()
generating a plt branch:

bl  c218 <0078.plt_branch.__start>

This results in the kernel jumping to the linked address of __start,
0xc000, when really it needs to jump to the
0xc000 + the runtime address because the kernel is still
running at the load address.

The first 256 bytes are already copied to address 0 so the kernel will
run until

b   __start_initialization_multiplatform

because there is nothing yet at __start_initialization_multiplatform
this will inevitably crash. At this point the exception handlers are
still OF's.

On phyp this will look like:

OF stdout device is: /vdevice/vty@3000
Preparing to boot Linux version 5.12.0-rc3-63029-gada7d7e600c0 (gcc (GCC) 8.4.1 
20200928 (Red Hat 8.4.1-1), GNU ld version 2.30-93.el8) #1 SMP Wed Apr 7 
07:24:20 EDT 2021
Detected machine type: 0101
command line: BOOT_IMAGE=/vmlinuz-5.12.0-rc3-63029-gada7d7e600c0
Max number of cores passed to firmware: 256 (NR_CPUS = 2048)
Calling ibm,client-architecture-support... done
memory layout at init:
   memory_limit :  (16 MB aligned)
   alloc_bottom : 0edc
   alloc_top: 2000
   alloc_top_hi : 2000
   rmo_top  : 2000
   ram_top  : 2000
instantiating rtas at 0x1ec3... done
prom_hold_cpus: skipped
copying OF device tree...
Building dt strings...
Building dt structure...
Device tree strings 0x0edd -> 0x0edd1809
Device tree struct  0x0ede -> 0x0edf
Quiescing Open Firmware ...
Booting Linux via __start() @ 0x0a71 ...
DEFAULT CATCH!, exception-handler=fff6
at   %SRR0: 0f20   %SRR1: 80081000
Open Firmware exception handler entered from non-OF code
Client's Fix Pt Regs:
  00 0c713134 08a9fc00 0caf9c00 0edc
  04 0a71   
  08   0a7200fc 3003
  0c c000   0b5a9820
  10 0b5a9b38 0b5a9988 0b5a9f38 0b660c10
  14 0b5a9f60 013d 1ec3 1ec3
  18 0b5a9840 0a71 0028 0edc0008
  1c 0edc 0cb6  0edc
Special Regs:
 %IV: 0700 %CR: 44000202%XER:   %DSISR: 
   %SRR0: 0f20   %SRR1: 80081000
 %LR: 0c71326c%CTR: c000
%DAR: 
Virtual PID = 0
DEFAULT CATCH!, throw-code=fff6
Call History

throw  - c3f05c
$call-method  - c4f0b4
(poplocals)  - c40a00
key-fillq  - c4f4cc
?xoff  - c4f5b4
(poplocals)  - c40a00
(stdout-write)  - c4fa64
(emit)  - c4fb3c
space  - c4dfc8
quit  - c5336c
quit  - c53100
My Fix Pt Regs:
  00 8000b002  deadbeef 00c4f0b0
  04 08bfff80 deadbeef 0004 00c09010
  08 0005   
  0c 872a40a8   08d2cf30
  10 00e7d968 00e7d968 00c4f0a8 00c4f0b4
  14 fff6 08bfff80 c8ff21fbd0ff41fb f8ffe1fbb1fd21f8
  18 00c19000 00c3e000 00c1af80 00c1cfc0
  1c 00c26000 00c460f0 00c17fa8 00c16fe0
Special Regs:
 %IV: 0900 %CR: 84800208%XER: 00040010  %DSISR: 
   %SRR0: 00c3eec8   %SRR1: 8000b002
 %LR: 00c3f05c%CTR: 00c4f0b0
%DAR: 
...

On qemu it will just appear to be stuck after
Booting Linux via __start() @ 0x0040 ...:

SLOF **
QEMU Starting
  Build Date = Apr  9 2021 14:13:31
  FW Version = git-33a7322de13e9dca
  Press "s" to ente

Re: [PATCH 1/2] mm: Fix struct page layout on 32-bit systems

2021-04-21 Thread Arnd Bergmann
On Wed, Apr 21, 2021 at 10:43 AM David Laight  wrote:
> From: Arnd Bergmann Sent: 20 April 2021 22:20
> > On Tue, Apr 20, 2021 at 11:14 PM Vineet Gupta  
> > wrote:
> > > On 4/20/21 12:07 AM, Arnd Bergmann wrote:
> >
> > > >
> > > > which means that half the 32-bit architectures do this. This may
> > > > cause more problems when arc and/or microblaze want to support
> > > > 64-bit kernels and compat mode in the future on their latest hardware,
> > > > as that means duplicating the x86 specific hacks we have for compat.
> > > >
> > > > What is alignof(u64) on 64-bit arc?
> > >
> > > $ echo 'int a = __alignof__(long long);' | arc64-linux-gnu-gcc -xc -
> > > -Wall -S -o - | grep -A1 a: | tail -n 1 | cut -f 3
> > > 8
> >
> > Ok, good.
>
> That test doesn't prove anything.
> Try running on x86:
> $ echo 'int a = __alignof__(long long);' | gcc -xc - -Wall -S -o - -m32
> a:
> .long   8

Right, I had wondered about that one after I sent the email.

> Using '__alignof__(struct {long long x;})' does give the expected 4.
>
> __alignof__() returns the preferred alignment, not the enforced
> alignment - go figure.

I checked the others as well now, and i386 is the only one that
changed here: m68k still has '2', while arc/csky/h8300/microblaze/
nios2/or1k/sh/i386 all have '4' and the rest have '8'.

 Arnd
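
The distinction is easy to reproduce with a small standalone test; built with
-m32 on x86 it prints 8, 4 and 4, showing that the preferred alignment of the
scalar differs from what is actually enforced inside structures:

  #include <stdio.h>
  #include <stddef.h>

  struct s { char c; long long x; };

  int main(void)
  {
          /* preferred alignment of the bare scalar type */
          printf("__alignof__(long long) = %zu\n", __alignof__(long long));
          /* alignment actually enforced for the type inside a struct */
          printf("__alignof__(struct s)  = %zu\n", __alignof__(struct s));
          /* where the long long member really lands */
          printf("offsetof(struct s, x)  = %zu\n", (size_t)offsetof(struct s, x));
          return 0;
  }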


Re: mmu.c:undefined reference to `patch__hash_page_A0'

2021-04-21 Thread Christophe Leroy




On 18/04/2021 at 19:15, Randy Dunlap wrote:

On 4/18/21 3:43 AM, Christophe Leroy wrote:



On 18/04/2021 at 02:02, Randy Dunlap wrote:

HI--

I no longer see this build error.


Fixed by 
https://github.com/torvalds/linux/commit/acdad8fb4a1574323db88f98a38b630691574e16


However:

On 2/27/21 2:24 AM, kernel test robot wrote:

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
master
head:   3fb6d0e00efc958d01c2f109c8453033a2d96796
commit: 259149cf7c3c6195e6199e045ca988c31d081cab powerpc/32s: Only build hash 
code when CONFIG_PPC_BOOK3S_604 is selected
date:   4 weeks ago
config: powerpc64-randconfig-r013-20210227 (attached as .config)


ktr/lkp, this is a PPC32 .config file that is attached, not PPC64.

Also:


compiler: powerpc-linux-gcc (GCC) 9.3.0


...



I do see this build error:

powerpc-linux-ld: arch/powerpc/boot/wrapper.a(decompress.o): in function 
`partial_decompress':
decompress.c:(.text+0x1f0): undefined reference to `__decompress'

when either
CONFIG_KERNEL_LZO=y
or
CONFIG_KERNEL_LZMA=y

but the build succeeds when either
CONFIG_KERNEL_GZIP=y
or
CONFIG_KERNEL_XZ=y

I guess that is due to arch/powerpc/boot/decompress.c doing this:

#ifdef CONFIG_KERNEL_GZIP
#    include "decompress_inflate.c"
#endif

#ifdef CONFIG_KERNEL_XZ
#    include "xz_config.h"
#    include "../../../lib/decompress_unxz.c"
#endif


It would be nice to require one of KERNEL_GZIP or KERNEL_XZ
to be set/enabled (maybe unless a uImage is being built?).



Can you test by 
https://patchwork.ozlabs.org/project/linuxppc-dev/patch/a74fce4dfc9fa32da6ce3470bbedcecf795de1ec.1591189069.git.christophe.le...@csgroup.eu/
 ?


Hi Christophe,

I get build errors for both LZO and LZMA:



Can you check with the following changes on top of my patch:

diff --git a/lib/decompress_unlzo.c b/lib/decompress_unlzo.c
index a8dbde4b32d4..f06f925385c0 100644
--- a/lib/decompress_unlzo.c
+++ b/lib/decompress_unlzo.c
@@ -23,13 +23,15 @@
 #include 
 #endif

-#include 
 #ifdef __KERNEL__
 #include 
+#endif
+#include 
+#ifdef __KERNEL__
 #include 
+#include 
 #endif

-#include 
 #include 

 static const unsigned char lzop_magic[] = {



Thanks
Christophe


RE: [PATCH 1/2] mm: Fix struct page layout on 32-bit systems

2021-04-21 Thread David Laight
From: Arnd Bergmann
> Sent: 20 April 2021 22:20
> 
> On Tue, Apr 20, 2021 at 11:14 PM Vineet Gupta
>  wrote:
> > On 4/20/21 12:07 AM, Arnd Bergmann wrote:
> 
> > >
> > > which means that half the 32-bit architectures do this. This may
> > > cause more problems when arc and/or microblaze want to support
> > > 64-bit kernels and compat mode in the future on their latest hardware,
> > > as that means duplicating the x86 specific hacks we have for compat.
> > >
> > > What is alignof(u64) on 64-bit arc?
> >
> > $ echo 'int a = __alignof__(long long);' | arc64-linux-gnu-gcc -xc -
> > -Wall -S -o - | grep -A1 a: | tail -n 1 | cut -f 3
> > 8
> 
> Ok, good.

That test doesn't prove anything.
Try running on x86:
$ echo 'int a = __alignof__(long long);' | gcc -xc - -Wall -S -o - -m32
.file   ""
.globl  a
.data
.align 4
.type   a, @object
.size   a, 4
a:
.long   8
.ident  "GCC: (Ubuntu 5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609"
.section.note.GNU-stack,"",@progbits

Using '__alignof__(struct {long long x;})' does give the expected 4.

__alignof__() returns the preferred alignment, not the enforced
alignment - go figure.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


Re: [PATCH] soc: fsl: qe: remove unused function

2021-04-21 Thread Jiapeng Chong

On 2021/4/16 15:06, Christophe Leroy wrote:



On 16/04/2021 at 08:57, Daniel Axtens wrote:

Hi Jiapeng,


Fix the following clang warning:


You are not fixing a warning, you are removing a function in order to 
fix a warning ...




drivers/soc/fsl/qe/qe_ic.c:234:29: warning: unused function
'qe_ic_from_irq' [-Wunused-function].


It would be wise to mention that the last users of the function were removed 
by commit d7c2878cfcfa ("soc: fsl: qe: remove unused qe_ic_set_* 
functions")


https://github.com/torvalds/linux/commit/d7c2878cfcfa



Reported-by: Abaci Robot 
Signed-off-by: Jiapeng Chong 
---
  drivers/soc/fsl/qe/qe_ic.c | 5 -
  1 file changed, 5 deletions(-)

diff --git a/drivers/soc/fsl/qe/qe_ic.c b/drivers/soc/fsl/qe/qe_ic.c
index 0390af9..b573712 100644
--- a/drivers/soc/fsl/qe/qe_ic.c
+++ b/drivers/soc/fsl/qe/qe_ic.c
@@ -231,11 +231,6 @@ static inline void qe_ic_write(__be32  __iomem 
*base, unsigned int reg,

  qe_iowrite32be(value, base + (reg >> 2));
  }
-static inline struct qe_ic *qe_ic_from_irq(unsigned int virq)
-{
-    return irq_get_chip_data(virq);
-}


This seems good to me.

  * We know that this function can't be called directly from outside the
   file, because it is static.

  * The function address isn't used as a function pointer anywhere, so
    that means it can't be called from outside the file that way (also
    it's inline, which would make using a function pointer unwise!)

  * There are no obvious macros in that file that might construct the name
    of the function in a way that is hidden from grep.

All in all, I am fairly confident that the function is indeed not used.

Reviewed-by: Daniel Axtens 

Kind regards,
Daniel


-
  static inline struct qe_ic *qe_ic_from_irq_data(struct irq_data *d)
  {
  return irq_data_get_irq_chip_data(d);
--
1.8.3.1

Hi,
I will follow the advice and send V2 later.


Re: [PATCH V2 net] ibmvnic: Continue with reset if set link down failed

2021-04-21 Thread Rick Lindsley

On 4/20/21 2:42 PM, Lijun Pan wrote:


This v2 does not address the concerns mentioned in v1.
And I think it is better to exit with an error from do_reset, and schedule a
thorough do_hard_reset if the adapter is already in an unstable state.


But the point is that the testing and analysis have indicated that doing a full
hard reset is not necessary. We are about to take the very action which will fix
this situation, but currently we do not.

Please describe the advantage in deferring it further by routing it through
do_hard_reset().  I don't see one.

Rick


Re: [PATCH 1/2] powerpc/sstep: Add emulation support for ‘setb’ instruction

2021-04-21 Thread Michael Ellerman
"Naveen N. Rao"  writes:
> Daniel Axtens wrote:
>> Sathvika Vasireddy  writes:
>> 
>>> This adds emulation support for the following instruction:
>>>* Set Boolean (setb)
>>>
>>> Signed-off-by: Sathvika Vasireddy 
...
>> 
>> If you do end up respinning the patch, I think it would be good to make
>> the maths a bit clearer. I think it works because a left shift of 2 is
>> the same as multiplying by 4, but it would be easier to follow if you
>> used a temporary variable for btf.
>
> Indeed. I wonder if it is better to follow the ISA itself. Per the ISA, 
> the bit we are interested in is:
>   4 x BFA + 32
>
> So, if we use that along with the PPC_BIT() macro, we get:
>   if (regs->ccr & PPC_BIT(ra + 32))

Use of PPC_BIT risks annoying your maintainer :)

cheers
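
For readers following the arithmetic: since CR0 occupies the top nibble of
regs->ccr, bit 4*BFA+32 of the ISA's 64-bit numbering is simply the LT bit of
CR field BFA. A sketch of what the emulation has to compute, written without
PPC_BIT() and with illustrative variable names:

  /* field BFA's LSB sits at bit 28 - 4*bfa of the 32-bit CR image */
  unsigned int crf = (regs->ccr >> (28 - 4 * bfa)) & 0xf;
  long val;

  if (crf & 0x8)          /* LT set */
          val = -1;
  else if (crf & 0x4)     /* GT set */
          val = 1;
  else
          val = 0;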


Re: [PATCH 2/2] powerpc: If kexec_build_elf_info() fails return immediately from elf64_load()

2021-04-21 Thread Michael Ellerman
Lakshmi Ramasubramanian  writes:

> Uninitialized local variable "elf_info" would be passed to
> kexec_free_elf_info() if kexec_build_elf_info() returns an error
> in elf64_load().
>
> If kexec_build_elf_info() returns an error, return the error
> immediately.
>
> Signed-off-by: Lakshmi Ramasubramanian 
> Reported-by: Dan Carpenter 

Reviewed-by: Michael Ellerman 

cheers

> diff --git a/arch/powerpc/kexec/elf_64.c b/arch/powerpc/kexec/elf_64.c
> index 02662e72c53d..eeb258002d1e 100644
> --- a/arch/powerpc/kexec/elf_64.c
> +++ b/arch/powerpc/kexec/elf_64.c
> @@ -45,7 +45,7 @@ static void *elf64_load(struct kimage *image, char 
> *kernel_buf,
>  
>   ret = kexec_build_elf_info(kernel_buf, kernel_len, &ehdr, &elf_info);
>   if (ret)
> - goto out;
> + return ERR_PTR(ret);
>  
>   if (image->type == KEXEC_TYPE_CRASH) {
>   /* min & max buffer values for kdump case */
> -- 
> 2.31.0


Re: [PATCH 1/2] powerpc: Free fdt on error in elf64_load()

2021-04-21 Thread Michael Ellerman
Lakshmi Ramasubramanian  writes:
> There are a few "goto out;" statements before the local variable "fdt"
> is initialized through the call to of_kexec_alloc_and_setup_fdt() in
> elf64_load().  This will result in an uninitialized "fdt" being passed
> to kvfree() in this function if there is an error before the call to
> of_kexec_alloc_and_setup_fdt().
>
> If there is any error after fdt is allocated, but before it is
> saved in the arch specific kimage struct, free the fdt.
>
> Signed-off-by: Lakshmi Ramasubramanian 
> Reported-by: kernel test robot 
> Reported-by: Dan Carpenter 
> Suggested-by: Michael Ellerman 

I basically sent you the diff, so this should probably be:

  Reported-by: kernel test robot 
  Reported-by: Dan Carpenter 
  Signed-off-by: Michael Ellerman 
  Signed-off-by: Lakshmi Ramasubramanian 

Otherwise looks good to me, thanks for turning it into a proper patch
and submitting it.

cheers


> diff --git a/arch/powerpc/kexec/elf_64.c b/arch/powerpc/kexec/elf_64.c
> index 5a569bb51349..02662e72c53d 100644
> --- a/arch/powerpc/kexec/elf_64.c
> +++ b/arch/powerpc/kexec/elf_64.c
> @@ -114,7 +114,7 @@ static void *elf64_load(struct kimage *image, char 
> *kernel_buf,
>   ret = setup_new_fdt_ppc64(image, fdt, initrd_load_addr,
> initrd_len, cmdline);
>   if (ret)
> - goto out;
> + goto out_free_fdt;
>  
>   fdt_pack(fdt);
>  
> @@ -125,7 +125,7 @@ static void *elf64_load(struct kimage *image, char 
> *kernel_buf,
>   kbuf.mem = KEXEC_BUF_MEM_UNKNOWN;
>   ret = kexec_add_buffer(&kbuf);
>   if (ret)
> - goto out;
> + goto out_free_fdt;
>  
>   /* FDT will be freed in arch_kimage_file_post_load_cleanup */
>   image->arch.fdt = fdt;
> @@ -140,18 +140,14 @@ static void *elf64_load(struct kimage *image, char 
> *kernel_buf,
>   if (ret)
>   pr_err("Error setting up the purgatory.\n");
>  
> + goto out;
> +
> +out_free_fdt:
> + kvfree(fdt);
>  out:
>   kfree(modified_cmdline);
>   kexec_free_elf_info(&elf_info);
>  
> - /*
> -  * Once FDT buffer has been successfully passed to kexec_add_buffer(),
> -  * the FDT buffer address is saved in image->arch.fdt. In that case,
> -  * the memory cannot be freed here in case of any other error.
> -  */
> - if (ret && !image->arch.fdt)
> - kvfree(fdt);
> -
>   return ret ? ERR_PTR(ret) : NULL;
>  }
>  
> -- 
> 2.31.0


Re: powerpc{32,64} randconfigs

2021-04-21 Thread Michael Ellerman
Randy Dunlap  writes:
> Hi,
>
> Is there a way to do this?
>
> $ make ARCH=powerpc randconfig # and force PPC32

Sort of:

$ KCONFIG_ALLCONFIG=arch/powerpc/configs/book3s_32.config make randconfig

But that also forces BOOK3S.

> and separately
> $ make ARCH=powerpc randconfig # and force PPC64

No.

...
> OK, I have a patch that seems for work as far as setting
> PPC32=y or PPC64=y... but it has a problem during linking
> of vmlinux:
>
> crosstool/gcc-9.3.0-nolibc/powerpc-linux/bin/powerpc-linux-ld:./arch/powerpc/kernel/vmlinux.lds:6:
>  syntax error
>
> and the (bad) generated vmlinux.lds file says (at line 6):
>
> OUTPUT_ARCH(1:common)
>
> while it should say:
>
> OUTPUT_ARCH(powerpc:common)
>
> Does anyone have any ideas about this problem?

I guess your patch broke something? :D
Not sure sorry.

What about something like this?

cheers


diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
index 3212d076ac6a..712c5e8768ce 100644
--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
@@ -376,6 +376,16 @@ PHONY += ppc64_book3e_allmodconfig
$(Q)$(MAKE) 
KCONFIG_ALLCONFIG=$(srctree)/arch/powerpc/configs/85xx-64bit.config \
-f $(srctree)/Makefile allmodconfig
 
+PHONY += ppc32_randconfig
+ppc32_randconfig:
+   $(Q)$(MAKE) 
KCONFIG_ALLCONFIG=$(srctree)/arch/powerpc/configs/32-bit.config \
+   -f $(srctree)/Makefile randconfig
+
+PHONY += ppc64_randconfig
+ppc64_randconfig:
+   $(Q)$(MAKE) 
KCONFIG_ALLCONFIG=$(srctree)/arch/powerpc/configs/64-bit.config \
+   -f $(srctree)/Makefile randconfig
+
 define archhelp
   @echo '* zImage  - Build default images selected by kernel config'
   @echo '  zImage.*- Compressed kernel image 
(arch/$(ARCH)/boot/zImage.*)'
diff --git a/arch/powerpc/configs/32-bit.config 
b/arch/powerpc/configs/32-bit.config
new file mode 100644
index ..bdf833009006
--- /dev/null
+++ b/arch/powerpc/configs/32-bit.config
@@ -0,0 +1 @@
+CONFIG_PPC64=n
diff --git a/arch/powerpc/configs/64-bit.config 
b/arch/powerpc/configs/64-bit.config
new file mode 100644
index ..0fe6406929e2
--- /dev/null
+++ b/arch/powerpc/configs/64-bit.config
@@ -0,0 +1 @@
+CONFIG_PPC64=y