Re: [RFC PATCH v2 01/10] lib: vdso: ensure all arches have 32bit fallback

2019-12-30 Thread Arnd Bergmann
On Mon, Dec 23, 2019 at 3:31 PM Christophe Leroy
 wrote:
>
> In order to simplify the next step, which moves the fallback call to
> arch level, ensure all arches have a 32bit fallback instead of
> handling the lack of a 32bit fallback in the common code based on
> VDSO_HAS_32BIT_FALLBACK.
>
> Signed-off-by: Christophe Leroy 

I like the idea of removing VDSO_HAS_32BIT_FALLBACK and ensuring
that all 32-bit architectures implement these fallbacks, but we really should *not*
have any implementation calling the 64-bit syscalls.

> +static __always_inline
> +long clock_gettime32_fallback(clockid_t _clkid, struct old_timespec32 *_ts)
> +{
> +   struct __kernel_timespec ts;
> +   int ret = clock_gettime_fallback(_clkid, &ts);
> +
> +   if (likely(!ret)) {
> +   _ts->tv_sec = ts.tv_sec;
> +   _ts->tv_nsec = ts.tv_nsec;
> +   }
> +   return ret;
> +}
> +
> +static __always_inline
> +long clock_getres32_fallback(clockid_t _clkid, struct old_timespec32 *_ts)
> +{
> +   struct __kernel_timespec ts;
> +   int ret = clock_getres_fallback(_clkid, &ts);
> +
> +   if (likely(!ret && _ts)) {
> +   _ts->tv_sec = ts.tv_sec;
> +   _ts->tv_nsec = ts.tv_nsec;
> +   }
> +   return ret;
> +}

Please change these to call __NR_clock_gettime and __NR_clock_getres
instead of __NR_clock_gettime64/__NR_clock_getres_time64, for multiple reasons:

- When doing migration between containers, the vdso may get copied into
  an application running on a kernel that does not support the time64
  variants, and then the fallback fails.

- When CONFIG_COMPAT_32BIT_TIME is disabled, the time32 syscalls
  return -ENOSYS, and the vdso version should have the exact same behavior
  to avoid surprises. In particular, an application that checks clock_gettime()
  to see if the time32 syscalls are part of the kernel would get an incorrect
  result here.

arch/arm64/include/asm/vdso/compat_gettimeofday.h already does this,
I think you can just copy the implementation or find a way to share it.
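
Roughly, that means each 32-bit fallback should issue the native time32
syscall directly instead of converting from a time64 result. A sketch
(untested; arch_vdso_syscall2() is a made-up placeholder for whatever
inline-asm syscall stub the arch's gettimeofday.h already provides, not a
real kernel helper):

static __always_inline
long clock_gettime32_fallback(clockid_t _clkid, struct old_timespec32 *_ts)
{
	/* invoke __NR_clock_gettime directly, no time64 round trip */
	return arch_vdso_syscall2(__NR_clock_gettime, (long)_clkid, (long)_ts);
}

static __always_inline
long clock_getres32_fallback(clockid_t _clkid, struct old_timespec32 *_ts)
{
	/* same, using the native __NR_clock_getres */
	return arch_vdso_syscall2(__NR_clock_getres, (long)_clkid, (long)_ts);
}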

> diff --git a/arch/arm64/include/asm/vdso/gettimeofday.h b/arch/arm64/include/asm/vdso/gettimeofday.h
> index b08f476b72b4..c41c86a07423 100644
> --- a/arch/arm64/include/asm/vdso/gettimeofday.h
> +++ b/arch/arm64/include/asm/vdso/gettimeofday.h
> @@ -66,6 +66,32 @@ int clock_getres_fallback(clockid_t _clkid, struct __kernel_timespec *_ts)
> return ret;
>  }
>
> +static __always_inline
> +long clock_gettime32_fallback(clockid_t _clkid, struct old_timespec32 *_ts)
> +{
> +   struct __kernel_timespec ts;
> +   int ret = clock_gettime_fallback(_clkid, &ts);
> +
> +   if (likely(!ret)) {
> +   _ts->tv_sec = ts.tv_sec;
> +   _ts->tv_nsec = ts.tv_nsec;
> +   }
> +   return ret;
> +}

As Andy said, this makes no sense at all; nothing should ever call this on a
64-bit architecture.

> diff --git a/arch/mips/include/asm/vdso/gettimeofday.h b/arch/mips/include/asm/vdso/gettimeofday.h
> index b08825531e9f..60608e930a5c 100644
> --- a/arch/mips/include/asm/vdso/gettimeofday.h
> +++ b/arch/mips/include/asm/vdso/gettimeofday.h
> @@ -109,8 +109,6 @@ static __always_inline int clock_getres_fallback(
>
>  #if _MIPS_SIM != _MIPS_SIM_ABI64
>
> -#define VDSO_HAS_32BIT_FALLBACK1
> -
>  static __always_inline long clock_gettime32_fallback(
> clockid_t _clkid,
> struct old_timespec32 *_ts)
> @@ -150,6 +148,32 @@ static __always_inline int clock_getres32_fallback(
>
> return error ? -ret : ret;
>  }
> +#else
> +static __always_inline
> +long clock_gettime32_fallback(clockid_t _clkid, struct old_timespec32 *_ts)
> +{
> +   struct __kernel_timespec ts;
> +   int ret = clock_gettime_fallback(_clkid, &ts);
> +
> +   if (likely(!ret)) {
> +   _ts->tv_sec = ts.tv_sec;
> +   _ts->tv_nsec = ts.tv_nsec;
> +   }
> +   return ret;
> +}
> +

Same here.

> --- a/lib/vdso/gettimeofday.c
> +++ b/lib/vdso/gettimeofday.c
> @@ -125,13 +125,8 @@ __cvdso_clock_gettime32(clockid_t clock, struct old_timespec32 *res)
>
> ret = __cvdso_clock_gettime_common(clock, );
>
> -#ifdef VDSO_HAS_32BIT_FALLBACK
> if (unlikely(ret))
> return clock_gettime32_fallback(clock, res);
> -#else
> -   if (unlikely(ret))
> -   ret = clock_gettime_fallback(clock, );
> -#endif
>
> if (likely(!ret)) {
> res->tv_sec = ts.tv_sec;

Removing the #ifdef and the fallback seems fine. I think this is actually
required for correctness on arm32 as well. Maybe enclose the entire function in

#ifdef VDSO_HAS_CLOCK_GETTIME32

to only define it when it is called?
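
Something like this (sketch only; the body is the existing code from
lib/vdso/gettimeofday.c after this patch, exact attributes may differ):

#ifdef VDSO_HAS_CLOCK_GETTIME32
static __maybe_unused int
__cvdso_clock_gettime32(clockid_t clock, struct old_timespec32 *res)
{
	struct __kernel_timespec ts;
	int ret = __cvdso_clock_gettime_common(clock, &ts);

	/* arch-provided time32 fallback, no time64 conversion */
	if (unlikely(ret))
		return clock_gettime32_fallback(clock, res);

	if (likely(!ret)) {
		res->tv_sec = ts.tv_sec;
		res->tv_nsec = ts.tv_nsec;
	}
	return ret;
}
#endif /* VDSO_HAS_CLOCK_GETTIME32 */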

> @@ -238,13 +233,8 @@ __cvdso_clock_getres_time32(clockid_t clock, struct old_timespec32 *res)
>
> ret = __cvdso_clock_getres_common(clock, );
>
> -#ifdef VDSO_HAS_32BIT_FALLBACK
> if (unlikely(ret))
> return clock_getres32_fallback(clock, res);

Re: [RFC PATCH v2 05/10] lib: vdso: inline do_hres()

2019-12-30 Thread Arnd Bergmann
On Mon, Dec 23, 2019 at 3:31 PM Christophe Leroy
 wrote:
>
> do_hres() is called from several places, so GCC doesn't inline
> it at first.
>
> do_hres() takes a struct __kernel_timespec * parameter for
> passing the result. In the 32-bit case, this parameter corresponds
> to a local var in the caller. In order to provide a pointer
> to this structure, the caller has to put it on its stack and
> do_hres() has to write the result to the stack. This is suboptimal,
> especially on a RISC processor like powerpc.
>
> By making GCC inline the function, the struct __kernel_timespec
> remains a local var using registers, avoiding the need to write to
> and read from the stack.
>
> The improvement is significant on powerpc.
>
> Signed-off-by: Christophe Leroy 

Good idea, I can see how this ends up being an improvement
for most of the callers.

Acked-by: Arnd Bergmann 
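
A minimal standalone example of the effect (made-up names, not the kernel
code): once the callee is forced inline, the caller's temporary 64-bit
timespec no longer needs a stack slot on most RISC targets.

#define __always_inline inline __attribute__((always_inline))

struct ts64 { long long tv_sec; long long tv_nsec; };
struct ts32 { int tv_sec; int tv_nsec; };

/* stands in for do_hres(): fills a 64-bit timespec */
static __always_inline int get_time64(struct ts64 *ts)
{
	ts->tv_sec = 1234;
	ts->tv_nsec = 5678;
	return 0;
}

/* stands in for the 32-bit vdso entry point */
int get_time32(struct ts32 *res)
{
	struct ts64 ts;	/* with forced inlining this can stay in registers */
	int ret = get_time64(&ts);

	if (!ret) {
		res->tv_sec = (int)ts.tv_sec;
		res->tv_nsec = (int)ts.tv_nsec;
	}
	return ret;
}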


[PATCH v2 1/2] powerpc32/booke: consistently return phys_addr_t in __pa()

2019-12-30 Thread yingjie_bai
From: Bai Yingjie 

When CONFIG_RELOCATABLE=y is set, VIRT_PHYS_OFFSET is a 64-bit variable,
thus __pa() returns a 64-bit value.
But when CONFIG_RELOCATABLE=n, __pa() returns a 32-bit value.

We should make __pa() consistently return phys_addr_t, even if the upper
bits are known to always be zero in a particular config.

Signed-off-by: Bai Yingjie 
---
 arch/powerpc/include/asm/page.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/page.h b/arch/powerpc/include/asm/page.h
index 7f1fd41e3065..86332080399a 100644
--- a/arch/powerpc/include/asm/page.h
+++ b/arch/powerpc/include/asm/page.h
@@ -209,7 +209,7 @@ static inline bool pfn_valid(unsigned long pfn)
  */
 #if defined(CONFIG_PPC32) && defined(CONFIG_BOOKE)
 #define __va(x) ((void *)(unsigned long)((phys_addr_t)(x) + VIRT_PHYS_OFFSET))
-#define __pa(x) ((unsigned long)(x) - VIRT_PHYS_OFFSET)
+#define __pa(x) ((phys_addr_t)(unsigned long)(x) - VIRT_PHYS_OFFSET)
 #else
 #ifdef CONFIG_PPC64
 /*
-- 
2.17.1
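
The truncation this guards against shows up once physical addresses exceed
32 bits. A small standalone illustration (made-up values; uint32_t stands in
for a 32-bit kernel's unsigned long, this is not kernel code):

#include <inttypes.h>
#include <stdio.h>

int main(void)
{
	uint64_t phys_start = 0x100000000ULL;	/* DDR mapped above 4G via LAW */
	uint64_t virt_base  = 0xc0000000ULL;	/* kernel virtual base */
	int64_t  virt_phys_offset = (int64_t)(virt_base - phys_start);

	uint32_t va = 0xc0001000u;		/* some lowmem virtual address */

	uint64_t pa_full      = (uint64_t)va - virt_phys_offset;
	uint32_t pa_truncated = (uint32_t)((uint64_t)va - virt_phys_offset);

	printf("phys_addr_t   __pa(): 0x%" PRIx64 "\n", pa_full);      /* 0x100001000 */
	printf("unsigned long __pa(): 0x%" PRIx32 "\n", pa_truncated); /* 0x1000 */
	return 0;
}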



[PATCH v2 2/2] powerpc/mpc85xx: also write addr_h to spin table for 64bit boot entry

2019-12-30 Thread yingjie_bai
From: Bai Yingjie 

CPUs like the P4080 have a 36-bit physical address space; their DDR
physical start address can be configured above 4G via the LAW registers.

For systems whose physical memory start address is configured above 4G,
we also need to write addr_h into the spin table of the target secondary
CPU, so that addr_h and addr_l together represent a 64-bit physical
address.
Otherwise the secondary core cannot find the correct entry point to start from.

This does no harm in the normal case, where addr_h is all 0.

Signed-off-by: Bai Yingjie 
---
 arch/powerpc/platforms/85xx/smp.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/arch/powerpc/platforms/85xx/smp.c b/arch/powerpc/platforms/85xx/smp.c
index 8c7ea2486bc0..e241516ae013 100644
--- a/arch/powerpc/platforms/85xx/smp.c
+++ b/arch/powerpc/platforms/85xx/smp.c
@@ -252,6 +252,13 @@ static int smp_85xx_start_cpu(int cpu)
out_be64((u64 *)(&spin_table->addr_h),
__pa(ppc_function_entry(generic_secondary_smp_init)));
 #else
+   /*
+* Also write addr_h to the spin table for systems whose physical
+* memory start address is configured above 4G; otherwise the
+* secondary core cannot find the correct entry point to start from.
+*/
+   out_be32(&spin_table->addr_h, __pa(__early_start) >> 32);
out_be32(&spin_table->addr_l, __pa(__early_start));
 #endif
flush_spin_table(spin_table);
-- 
2.17.1
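
As a rough illustration of how addr_h and addr_l combine into the 64-bit
entry address the secondary core starts from (made-up value, plain C rather
than the 85xx code):

#include <inttypes.h>
#include <stdio.h>

int main(void)
{
	uint64_t entry = 0x100002000ULL;	/* e.g. __pa(__early_start) above 4G */

	uint32_t addr_h = (uint32_t)(entry >> 32);	/* written by the new code */
	uint32_t addr_l = (uint32_t)entry;		/* written as before */

	/* what the secondary core reconstructs from the spin table */
	uint64_t seen = ((uint64_t)addr_h << 32) | addr_l;

	printf("addr_h=0x%" PRIx32 " addr_l=0x%" PRIx32 " -> entry 0x%" PRIx64 "\n",
	       addr_h, addr_l, seen);
	/* without addr_h, the core would fetch entry 0x2000 instead */
	return 0;
}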