Re: linux-next: manual merge of the spi tree with the powerpc tree
Hi Mark,

On Fri, 12 Feb 2021 12:27:59 + Mark Brown wrote:
>
> On Fri, Feb 12, 2021 at 03:31:42PM +1100, Stephen Rothwell wrote:
>
> > BTW Mark: the author's address in 258ea99fe25a uses a non-existent domain
> > :-(
>
> Ugh, I think that's something gone wrong with b4 :( A bit late now to
> try to fix it up.

Not sure about that, the email (following the link to lore from the
commit) has the same address (...@public.gmane.com) and that domain
does not exist. In fact the email headers (in lore) look like this:

    From: Sergiu Cuciurean
    To: , ,
    Cc: Sergiu Cuciurean

So I am surprised that it was received by anyone. Maybe gmane has an
internal reply system that is screwed.

--
Cheers,
Stephen Rothwell
Re: [PATCH v5 10/10] powerpc/signal64: Use __get_user() to copy sigset_t
"Christopher M. Riedl" writes: > Usually sigset_t is exactly 8B which is a "trivial" size and does not > warrant using __copy_from_user(). Use __get_user() directly in > anticipation of future work to remove the trivial size optimizations > from __copy_from_user(). Calling __get_user() also results in a small > boost to signal handling throughput here. Modulo the comments from Christophe, this looks good to me. It seems to do what it says on the tin. Reviewed-by: Daniel Axtens Do you know if this patch is responsible for the slightly increase in radix performance you observed in your cover letter? The rest of the series shouldn't really make things faster than the no-KUAP case... Kind regards, Daniel > > Signed-off-by: Christopher M. Riedl > --- > arch/powerpc/kernel/signal_64.c | 14 -- > 1 file changed, 12 insertions(+), 2 deletions(-) > > diff --git a/arch/powerpc/kernel/signal_64.c b/arch/powerpc/kernel/signal_64.c > index 817b64e1e409..42fdc4a7ff72 100644 > --- a/arch/powerpc/kernel/signal_64.c > +++ b/arch/powerpc/kernel/signal_64.c > @@ -97,6 +97,14 @@ static void prepare_setup_sigcontext(struct task_struct > *tsk, int ctx_has_vsx_re > #endif /* CONFIG_VSX */ > } > > +static inline int get_user_sigset(sigset_t *dst, const sigset_t *src) > +{ > + if (sizeof(sigset_t) <= 8) > + return __get_user(dst->sig[0], >sig[0]); > + else > + return __copy_from_user(dst, src, sizeof(sigset_t)); > +} > + > /* > * Set up the sigcontext for the signal frame. > */ > @@ -701,8 +709,9 @@ SYSCALL_DEFINE3(swapcontext, struct ucontext __user *, > old_ctx, >* We kill the task with a SIGSEGV in this situation. 
>*/ > > - if (__copy_from_user(, _ctx->uc_sigmask, sizeof(set))) > + if (get_user_sigset(, _ctx->uc_sigmask)) > do_exit(SIGSEGV); > + > set_current_blocked(); > > if (!user_read_access_begin(new_ctx, ctx_size)) > @@ -740,8 +749,9 @@ SYSCALL_DEFINE0(rt_sigreturn) > if (!access_ok(uc, sizeof(*uc))) > goto badframe; > > - if (__copy_from_user(, >uc_sigmask, sizeof(set))) > + if (get_user_sigset(, >uc_sigmask)) > goto badframe; > + > set_current_blocked(); > > #ifdef CONFIG_PPC_TRANSACTIONAL_MEM > -- > 2.26.1
Re: [PATCH v5 07/10] powerpc/signal64: Replace restore_sigcontext() w/ unsafe_restore_sigcontext()
Hi Chris,

> Previously restore_sigcontext() performed a costly KUAP switch on every
> uaccess operation. These repeated uaccess switches cause a significant
> drop in signal handling performance.
>
> Rewrite restore_sigcontext() to assume that a userspace read access
> window is open. Replace all uaccess functions with their 'unsafe'
> versions which avoid the repeated uaccess switches.

Much the same comments apply here as to the last patch:

- the commit message might be improved by adding that you are also
  changing the calling functions to open the uaccess window before
  calling into the new unsafe functions
- I have checked that the safe to unsafe conversions look right.
- I think you're opening too wide a window in user_read_access_begin, it
  seems to me that it could be reduced to just the uc_mcontext. (Again,
  not that it makes a difference with the current HW.)

Kind regards,
Daniel

> Signed-off-by: Christopher M. Riedl
> ---
>  arch/powerpc/kernel/signal_64.c | 68 +++++++++++++++++++++++++++++++++++++++++++++---------------------------
>  1 file changed, 41 insertions(+), 27 deletions(-)
>
> diff --git a/arch/powerpc/kernel/signal_64.c b/arch/powerpc/kernel/signal_64.c
> index 4248e4489ff1..d668f8af18fe 100644
> --- a/arch/powerpc/kernel/signal_64.c
> +++ b/arch/powerpc/kernel/signal_64.c
> @@ -326,14 +326,14 @@ static long setup_tm_sigcontexts(struct sigcontext __user *sc,
>  /*
>   * Restore the sigcontext from the signal frame.
>   */
> -
> -static long restore_sigcontext(struct task_struct *tsk, sigset_t *set, int sig,
> -			       struct sigcontext __user *sc)
> +#define unsafe_restore_sigcontext(tsk, set, sig, sc, e) \
> +	unsafe_op_wrap(__unsafe_restore_sigcontext(tsk, set, sig, sc), e)
> +static long notrace __unsafe_restore_sigcontext(struct task_struct *tsk, sigset_t *set,
> +						int sig, struct sigcontext __user *sc)
>  {
>  #ifdef CONFIG_ALTIVEC
>  	elf_vrreg_t __user *v_regs;
>  #endif
> -	unsigned long err = 0;
>  	unsigned long save_r13 = 0;
>  	unsigned long msr;
>  	struct pt_regs *regs = tsk->thread.regs;
> @@ -348,27 +348,28 @@ static long restore_sigcontext(struct task_struct *tsk, sigset_t *set, int sig,
>  		save_r13 = regs->gpr[13];
>
>  	/* copy the GPRs */
> -	err |= __copy_from_user(regs->gpr, sc->gp_regs, sizeof(regs->gpr));
> -	err |= __get_user(regs->nip, &sc->gp_regs[PT_NIP]);
> +	unsafe_copy_from_user(regs->gpr, sc->gp_regs, sizeof(regs->gpr),
> +			      efault_out);
> +	unsafe_get_user(regs->nip, &sc->gp_regs[PT_NIP], efault_out);
>  	/* get MSR separately, transfer the LE bit if doing signal return */
> -	err |= __get_user(msr, &sc->gp_regs[PT_MSR]);
> +	unsafe_get_user(msr, &sc->gp_regs[PT_MSR], efault_out);
>  	if (sig)
>  		regs->msr = (regs->msr & ~MSR_LE) | (msr & MSR_LE);
> -	err |= __get_user(regs->orig_gpr3, &sc->gp_regs[PT_ORIG_R3]);
> -	err |= __get_user(regs->ctr, &sc->gp_regs[PT_CTR]);
> -	err |= __get_user(regs->link, &sc->gp_regs[PT_LNK]);
> -	err |= __get_user(regs->xer, &sc->gp_regs[PT_XER]);
> -	err |= __get_user(regs->ccr, &sc->gp_regs[PT_CCR]);
> +	unsafe_get_user(regs->orig_gpr3, &sc->gp_regs[PT_ORIG_R3], efault_out);
> +	unsafe_get_user(regs->ctr, &sc->gp_regs[PT_CTR], efault_out);
> +	unsafe_get_user(regs->link, &sc->gp_regs[PT_LNK], efault_out);
> +	unsafe_get_user(regs->xer, &sc->gp_regs[PT_XER], efault_out);
> +	unsafe_get_user(regs->ccr, &sc->gp_regs[PT_CCR], efault_out);
>  	/* Don't allow userspace to set SOFTE */
>  	set_trap_norestart(regs);
> -	err |= __get_user(regs->dar, &sc->gp_regs[PT_DAR]);
> -	err |= __get_user(regs->dsisr, &sc->gp_regs[PT_DSISR]);
> -	err |= __get_user(regs->result, &sc->gp_regs[PT_RESULT]);
> +	unsafe_get_user(regs->dar, &sc->gp_regs[PT_DAR], efault_out);
> +	unsafe_get_user(regs->dsisr, &sc->gp_regs[PT_DSISR], efault_out);
> +	unsafe_get_user(regs->result, &sc->gp_regs[PT_RESULT], efault_out);
>
>  	if (!sig)
>  		regs->gpr[13] = save_r13;
>  	if (set != NULL)
> -		err |= __get_user(set->sig[0], &sc->oldmask);
> +		unsafe_get_user(set->sig[0], &sc->oldmask, efault_out);
>
>  	/*
>  	 * Force reload of FP/VEC.
> @@ -378,29 +379,28 @@ static long restore_sigcontext(struct task_struct *tsk, sigset_t *set, int sig,
>  	regs->msr &= ~(MSR_FP | MSR_FE0 | MSR_FE1 | MSR_VEC | MSR_VSX);
>
>  #ifdef CONFIG_ALTIVEC
> -	err |= __get_user(v_regs, &sc->v_regs);
> -	if (err)
> -		return err;
> +	unsafe_get_user(v_regs, &sc->v_regs, efault_out);
>  	if (v_regs && !access_ok(v_regs, 34 * sizeof(vector128)))
>  		return -EFAULT;
>  	/* Copy 33 vec registers (vr0..31 and vscr) from the stack */
>  	if (v_regs != NULL && (msr & MSR_VEC) != 0) {
> -		err |= __copy_from_user(&tsk->thread.vr_state, v_regs,
> -					33 * sizeof(vector128));
> +
Re: [PATCH] powerpc/pseries: Don't enforce MSI affinity with kdump
Hi Greg,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on powerpc/next]
[also build test ERROR on linus/master v5.11-rc7 next-20210211]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Greg-Kurz/powerpc-pseries-Don-t-enforce-MSI-affinity-with-kdump/20210213-004658
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-allyesconfig (attached as .config)
compiler: powerpc64-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/1e5f7523fcfc57ab9437b8c7b29a974b62bde79d
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Greg-Kurz/powerpc-pseries-Don-t-enforce-MSI-affinity-with-kdump/20210213-004658
        git checkout 1e5f7523fcfc57ab9437b8c7b29a974b62bde79d
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=powerpc

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot

All errors (new ones prefixed by >>):

   arch/powerpc/platforms/pseries/msi.c: In function 'rtas_setup_msi_irqs':
>> arch/powerpc/platforms/pseries/msi.c:478:7: error: implicit declaration of function 'is_kdump_kernel' [-Werror=implicit-function-declaration]
     478 |   if (is_kdump_kernel())
         |       ^~~~~~~~~~~~~~~
   cc1: some warnings being treated as errors

vim +/is_kdump_kernel +478 arch/powerpc/platforms/pseries/msi.c

   369
   370	static int rtas_setup_msi_irqs(struct pci_dev *pdev, int nvec_in, int type)
   371	{
   372		struct pci_dn *pdn;
   373		int hwirq, virq, i, quota, rc;
   374		struct msi_desc *entry;
   375		struct msi_msg msg;
   376		int nvec = nvec_in;
   377		int use_32bit_msi_hack = 0;
   378
   379		if (type == PCI_CAP_ID_MSIX)
   380			rc = check_req_msix(pdev, nvec);
   381		else
   382			rc = check_req_msi(pdev, nvec);
   383
   384		if (rc)
   385			return rc;
   386
   387		quota = msi_quota_for_device(pdev, nvec);
   388
   389		if (quota && quota < nvec)
   390			return quota;
   391
   392		if (type == PCI_CAP_ID_MSIX && check_msix_entries(pdev))
   393			return -EINVAL;
   394
   395		/*
   396		 * Firmware currently refuse any non power of two allocation
   397		 * so we round up if the quota will allow it.
   398		 */
   399		if (type == PCI_CAP_ID_MSIX) {
   400			int m = roundup_pow_of_two(nvec);
   401			quota = msi_quota_for_device(pdev, m);
   402
   403			if (quota >= m)
   404				nvec = m;
   405		}
   406
   407		pdn = pci_get_pdn(pdev);
   408
   409		/*
   410		 * Try the new more explicit firmware interface, if that fails fall
   411		 * back to the old interface. The old interface is known to never
   412		 * return MSI-Xs.
   413		 */
   414	again:
   415		if (type == PCI_CAP_ID_MSI) {
   416			if (pdev->no_64bit_msi) {
   417				rc = rtas_change_msi(pdn, RTAS_CHANGE_32MSI_FN, nvec);
   418				if (rc < 0) {
   419					/*
   420					 * We only want to run the 32 bit MSI hack below if
   421					 * the max bus speed is Gen2 speed
   422					 */
   423					if (pdev->bus->max_bus_speed != PCIE_SPEED_5_0GT)
   424						return rc;
   425
   426					use_32bit_msi_hack = 1;
   427				}
   428			} else
   429				rc = -1;
   430
   431			if (rc < 0)
   432				rc = rtas_change_msi(pdn, RTAS_CHANGE_MSI_FN, nvec);
   433
   434			if (rc < 0) {
   435				pr_debug("rtas_msi: trying the old firmware call.\n");
   436				rc = rtas_change_msi(pdn, RTAS_CHANGE_FN, nvec);
   437			}
   438
   439			if (use_32bit_msi_hack && rc > 0)
   440				rtas_hack_32bit_msi_gen2(pdev);
   441		} else
   442			rc = rtas_change_msi(pdn, RTAS_CHANGE_MSIX_FN, nvec);
   443
   444		if (rc != nvec) {
   445
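[For reference, `is_kdump_kernel()` is declared in `<linux/crash_dump.h>`, so the likely fix for the robot's report is adding the missing include to msi.c. An untested sketch of the fixup — the surrounding include shown as context is assumed, not taken from the file:]

```diff
--- a/arch/powerpc/platforms/pseries/msi.c
+++ b/arch/powerpc/platforms/pseries/msi.c
@@
 #include <linux/irq.h>
+#include <linux/crash_dump.h>
```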
Re: [RFC PATCH 3/9] KVM: PPC: Book3S 64: add hcall interrupt handler
Nicholas Piggin writes:
> Add a separate hcall entry point. This can be used to deal with the
> different calling convention.
>
> Signed-off-by: Nicholas Piggin

Reviewed-by: Fabiano Rosas

> ---
>  arch/powerpc/kernel/exceptions-64s.S | 4 ++--
>  arch/powerpc/kvm/book3s_64_entry.S   | 4 ++++
>  2 files changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
> index e6f7fc7c61a1..c25395b5921a 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -2028,13 +2028,13 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
>  	 * Requires __LOAD_FAR_HANDLER beause kvmppc_interrupt lives
>  	 * outside the head section.
>  	 */
> -	__LOAD_FAR_HANDLER(r10, kvmppc_interrupt)
> +	__LOAD_FAR_HANDLER(r10, kvmppc_hcall)
>  	mtctr	r10
>  	ld	r10,PACA_EXGEN+EX_R10(r13)
>  	bctr
>  #else
>  	ld	r10,PACA_EXGEN+EX_R10(r13)
> -	b	kvmppc_interrupt
> +	b	kvmppc_hcall
>  #endif
>  #endif
>
> diff --git a/arch/powerpc/kvm/book3s_64_entry.S b/arch/powerpc/kvm/book3s_64_entry.S
> index 8e7216f3c3ee..3b894b90862f 100644
> --- a/arch/powerpc/kvm/book3s_64_entry.S
> +++ b/arch/powerpc/kvm/book3s_64_entry.S
> @@ -9,6 +9,10 @@
>  /*
>   * We come here from the first-level interrupt handlers.
>   */
> +.global	kvmppc_hcall
> +.balign	IFETCH_ALIGN_BYTES
> +kvmppc_hcall:
> +
>  .global	kvmppc_interrupt
>  .balign	IFETCH_ALIGN_BYTES
>  kvmppc_interrupt:
Re: [RFC PATCH 2/9] KVM: PPC: Book3S 64: Move GUEST_MODE_SKIP test into KVM
Nicholas Piggin writes:
> Move the GUEST_MODE_SKIP logic into KVM code. This is quite a KVM
> internal detail that has no real need to be in common handlers.

LGTM,

> (XXX: Need to confirm CBE handlers etc)

Do these interrupts exist only in Cell? I see that they set HSRRs and
MSR_HV, but CPU_FTRS_CELL does not contain CPU_HVMODE. So I don't get
why they use the KVM macros.

And for the instruction_breakpoint (0x1300) I think it would help if we
could at least restrict when it is built. But I don't know what
ISA/processor version it is from.

> Signed-off-by: Nicholas Piggin
> ---
>  arch/powerpc/kernel/exceptions-64s.S | 64
>  arch/powerpc/kvm/book3s_64_entry.S   | 50 ++
>  2 files changed, 42 insertions(+), 72 deletions(-)
>
> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
> index 65659ea3cec4..e6f7fc7c61a1 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -133,7 +133,6 @@ name:
>  #define IBRANCH_TO_COMMON	.L_IBRANCH_TO_COMMON_\name\()	/* ENTRY branch to common */
>  #define IREALMODE_COMMON	.L_IREALMODE_COMMON_\name\()	/* Common runs in realmode */
>  #define IMASK		.L_IMASK_\name\()	/* IRQ soft-mask bit */
> -#define IKVM_SKIP	.L_IKVM_SKIP_\name\()	/* Generate KVM skip handler */
>  #define IKVM_REAL	.L_IKVM_REAL_\name\()	/* Real entry tests KVM */
>  #define __IKVM_REAL(name)	.L_IKVM_REAL_ ## name
>  #define IKVM_VIRT	.L_IKVM_VIRT_\name\()	/* Virt entry tests KVM */
> @@ -191,9 +190,6 @@ do_define_int n
>  	.ifndef IMASK
>  		IMASK=0
>  	.endif
> -	.ifndef IKVM_SKIP
> -		IKVM_SKIP=0
> -	.endif
>  	.ifndef IKVM_REAL
>  		IKVM_REAL=0
>  	.endif
> @@ -254,15 +250,10 @@ do_define_int n
>  	.balign IFETCH_ALIGN_BYTES
>  \name\()_kvm:
>
> -	.if IKVM_SKIP
> -	cmpwi	r10,KVM_GUEST_MODE_SKIP
> -	beq	89f
> -	.else
>  BEGIN_FTR_SECTION
>  	ld	r10,IAREA+EX_CFAR(r13)
>  	std	r10,HSTATE_CFAR(r13)
>  END_FTR_SECTION_IFSET(CPU_FTR_CFAR)
> -	.endif
>
>  	ld	r10,IAREA+EX_CTR(r13)
>  	mtctr	r10
> @@ -289,27 +280,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
>  	ori	r12,r12,(IVEC)
>  	.endif
>  	b	kvmppc_interrupt
> -
> -	.if IKVM_SKIP
> -89:	mtocrf	0x80,r9
> -	ld	r10,IAREA+EX_CTR(r13)
> -	mtctr	r10
> -	ld	r9,IAREA+EX_R9(r13)
> -	ld	r10,IAREA+EX_R10(r13)
> -	ld	r11,IAREA+EX_R11(r13)
> -	ld	r12,IAREA+EX_R12(r13)
> -	.if IHSRR_IF_HVMODE
> -	BEGIN_FTR_SECTION
> -	b	kvmppc_skip_Hinterrupt
> -	FTR_SECTION_ELSE
> -	b	kvmppc_skip_interrupt
> -	ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE | CPU_FTR_ARCH_206)
> -	.elseif IHSRR
> -	b	kvmppc_skip_Hinterrupt
> -	.else
> -	b	kvmppc_skip_interrupt
> -	.endif
> -	.endif
>  	.endm
>
>  #else
> @@ -1128,7 +1098,6 @@ INT_DEFINE_BEGIN(machine_check)
>  	ISET_RI=0
>  	IDAR=1
>  	IDSISR=1
> -	IKVM_SKIP=1
>  	IKVM_REAL=1
>  INT_DEFINE_END(machine_check)
>
> @@ -1419,7 +1388,6 @@ INT_DEFINE_BEGIN(data_access)
>  	IVEC=0x300
>  	IDAR=1
>  	IDSISR=1
> -	IKVM_SKIP=1
>  	IKVM_REAL=1
>  INT_DEFINE_END(data_access)
>
> @@ -1469,7 +1437,6 @@ INT_DEFINE_BEGIN(data_access_slb)
>  	IAREA=PACA_EXSLB
>  	IRECONCILE=0
>  	IDAR=1
> -	IKVM_SKIP=1
>  	IKVM_REAL=1
>  INT_DEFINE_END(data_access_slb)
>
> @@ -2116,7 +2083,6 @@ INT_DEFINE_BEGIN(h_data_storage)
>  	IHSRR=1
>  	IDAR=1
>  	IDSISR=1
> -	IKVM_SKIP=1
>  	IKVM_REAL=1
>  	IKVM_VIRT=1
>  INT_DEFINE_END(h_data_storage)
>
> @@ -2573,7 +2539,6 @@ EXC_VIRT_NONE(0x5100, 0x100)
>  INT_DEFINE_BEGIN(cbe_system_error)
>  	IVEC=0x1200
>  	IHSRR=1
> -	IKVM_SKIP=1
>  	IKVM_REAL=1
>  INT_DEFINE_END(cbe_system_error)
>
> @@ -2598,7 +2563,6 @@ EXC_VIRT_NONE(0x5200, 0x100)
>  INT_DEFINE_BEGIN(instruction_breakpoint)
>  	IVEC=0x1300
>  #ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
> -	IKVM_SKIP=1
>  	IKVM_REAL=1
>  #endif
>  INT_DEFINE_END(instruction_breakpoint)
>
> @@ -2744,7 +2708,6 @@ EXC_COMMON_BEGIN(denorm_exception_common)
>  INT_DEFINE_BEGIN(cbe_maintenance)
>  	IVEC=0x1600
>  	IHSRR=1
> -	IKVM_SKIP=1
>  	IKVM_REAL=1
>  INT_DEFINE_END(cbe_maintenance)
>
> @@ -2797,7 +2760,6 @@ EXC_COMMON_BEGIN(altivec_assist_common)
>  INT_DEFINE_BEGIN(cbe_thermal)
>  	IVEC=0x1800
>  	IHSRR=1
> -	IKVM_SKIP=1
>  	IKVM_REAL=1
>  INT_DEFINE_END(cbe_thermal)
>
> @@ -3081,32 +3043,6 @@ EXPORT_SYMBOL(do_uaccess_flush)
>  	MASKED_INTERRUPT
>  	MASKED_INTERRUPT hsrr=1
>
> -#ifdef CONFIG_KVM_BOOK3S_64_HANDLER
> -kvmppc_skip_interrupt:
> -	/*
> -	 * Here all GPRs are unchanged from when the interrupt happened
> -	 * except for r13, which is saved in SPRG_SCRATCH0.
> -	 */
> -	mfspr
Re: [PATCH v17 02/10] of: Add a common kexec FDT setup function
Lakshmi Ramasubramanian writes:

> On 2/12/21 10:24 AM, Rob Herring wrote:
>> On Fri, Feb 12, 2021 at 11:19 AM Lakshmi Ramasubramanian
>> wrote:
>>>
>>> On 2/12/21 6:38 AM, Rob Herring wrote:
>>>> On Thu, Feb 11, 2021 at 7:17 PM Lakshmi Ramasubramanian
>>>> wrote:
>>>>>
>>>>> On 2/11/21 5:09 PM, Thiago Jung Bauermann wrote:
>>>>>>
>>>>>> There's actually a complication that I just noticed and needs to be
>>>>>> addressed. More below.
>>>>>>
>>>>>
>>>>> <...>
>>>>>
>>>>>>> +
>>>>>>> +/*
>>>>>>> + * of_kexec_alloc_and_setup_fdt - Alloc and setup a new Flattened Device Tree
>>>>>>> + *
>>>>>>> + * @image:		kexec image being loaded.
>>>>>>> + * @initrd_load_addr:	Address where the next initrd will be loaded.
>>>>>>> + * @initrd_len:		Size of the next initrd, or 0 if there will be none.
>>>>>>> + * @cmdline:		Command line for the next kernel, or NULL if there will
>>>>>>> + *			be none.
>>>>>>> + *
>>>>>>> + * Return: fdt on success, or NULL errno on error.
>>>>>>> + */
>>>>>>> +void *of_kexec_alloc_and_setup_fdt(const struct kimage *image,
>>>>>>> +				   unsigned long initrd_load_addr,
>>>>>>> +				   unsigned long initrd_len,
>>>>>>> +				   const char *cmdline)
>>>>>>> +{
>>>>>>> +	void *fdt;
>>>>>>> +	int ret, chosen_node;
>>>>>>> +	const void *prop;
>>>>>>> +	unsigned long fdt_size;
>>>>>>> +
>>>>>>> +	fdt_size = fdt_totalsize(initial_boot_params) +
>>>>>>> +		   (cmdline ? strlen(cmdline) : 0) +
>>>>>>> +		   FDT_EXTRA_SPACE;
>>>>>>
>>>>>> Just adding 4 KB to initial_boot_params won't be enough for crash
>>>>>> kernels on ppc64. The current powerpc code doubles the size of
>>>>>> initial_boot_params (which is normally larger than 4 KB) and even that
>>>>>> isn't enough. A patch was added to powerpc/next today which uses a more
>>>>>> precise (but arch-specific) formula:
>>>>>>
>>>>>> https://lore.kernel.org/linuxppc-dev/161243826811.119001.14083048209224609814.stgit@hbathini/
>>>>>>
>>>>>> So I believe we need a hook here where architectures can provide their
>>>>>> own specific calculation for the size of the fdt. Perhaps a weakly
>>>>>> defined function providing a default implementation which an
>>>>>> arch-specific file can override (a la arch_kexec_kernel_image_load())?
>>>>>>
>>>>>> Then the powerpc specific hook would be the kexec_fdt_totalsize_ppc64()
>>>>>> function from the patch I linked above.
>>>>>>
>>>>>
>>>>> Do you think it'd better to add "fdt_size" parameter to
>>>>> of_kexec_alloc_and_setup_fdt() so that the caller can provide the
>>>>> desired FDT buffer size?
>>>>
>>>> Yes, I guess so. But please define the param as extra size, not total
>>>> size. The kernel command line size addition can be in the common code.
>>>
>>> Will do. Just to clarify -
>>>
>>> The common code will do:
>>>
>>> fdt_totalsize(initial_boot_params) + strlen(cmdline) + extra_fdt_size
>>>
>>> The caller will pass "extra_fdt_size"
>>> ARM64 => 4KB
>>> PPC64 => fdt_totalsize(initial_boot_params) - which will be updated when
>>> the patch Thiago had referred to is merged.
>>
>> Yes, I'd leave the 4KB in there by default and arm64 use 0.
>>
>
> Sounds good.
>
> common:
> fdt_totalsize(initial_boot_params) + strlen(cmdline) + 0x1000 + extra
>
> arm64 => 0 for extra
> ppc => fdt_totalsize(initial_boot_params) for extra.

Looks good to me.

--
Thiago Jung Bauermann
IBM Linux Technology Center
Re: [PATCH v17 02/10] of: Add a common kexec FDT setup function
On 2/12/21 10:24 AM, Rob Herring wrote:

On Fri, Feb 12, 2021 at 11:19 AM Lakshmi Ramasubramanian wrote:

On 2/12/21 6:38 AM, Rob Herring wrote:

On Thu, Feb 11, 2021 at 7:17 PM Lakshmi Ramasubramanian wrote:

On 2/11/21 5:09 PM, Thiago Jung Bauermann wrote:

There's actually a complication that I just noticed and needs to be
addressed. More below.

<...>

+
+/*
+ * of_kexec_alloc_and_setup_fdt - Alloc and setup a new Flattened Device Tree
+ *
+ * @image:		kexec image being loaded.
+ * @initrd_load_addr:	Address where the next initrd will be loaded.
+ * @initrd_len:		Size of the next initrd, or 0 if there will be none.
+ * @cmdline:		Command line for the next kernel, or NULL if there will
+ *			be none.
+ *
+ * Return: fdt on success, or NULL errno on error.
+ */
+void *of_kexec_alloc_and_setup_fdt(const struct kimage *image,
+				   unsigned long initrd_load_addr,
+				   unsigned long initrd_len,
+				   const char *cmdline)
+{
+	void *fdt;
+	int ret, chosen_node;
+	const void *prop;
+	unsigned long fdt_size;
+
+	fdt_size = fdt_totalsize(initial_boot_params) +
+		   (cmdline ? strlen(cmdline) : 0) +
+		   FDT_EXTRA_SPACE;

Just adding 4 KB to initial_boot_params won't be enough for crash
kernels on ppc64. The current powerpc code doubles the size of
initial_boot_params (which is normally larger than 4 KB) and even that
isn't enough. A patch was added to powerpc/next today which uses a more
precise (but arch-specific) formula:

https://lore.kernel.org/linuxppc-dev/161243826811.119001.14083048209224609814.stgit@hbathini/

So I believe we need a hook here where architectures can provide their
own specific calculation for the size of the fdt. Perhaps a weakly
defined function providing a default implementation which an
arch-specific file can override (a la arch_kexec_kernel_image_load())?

Then the powerpc specific hook would be the kexec_fdt_totalsize_ppc64()
function from the patch I linked above.

Do you think it'd better to add "fdt_size" parameter to
of_kexec_alloc_and_setup_fdt() so that the caller can provide the
desired FDT buffer size?

Yes, I guess so. But please define the param as extra size, not total
size. The kernel command line size addition can be in the common code.

Will do. Just to clarify -

The common code will do:

fdt_totalsize(initial_boot_params) + strlen(cmdline) + extra_fdt_size

The caller will pass "extra_fdt_size"
ARM64 => 4KB
PPC64 => fdt_totalsize(initial_boot_params) - which will be updated when
the patch Thiago had referred to is merged.

Yes, I'd leave the 4KB in there by default and arm64 use 0.

Sounds good.

common:
fdt_totalsize(initial_boot_params) + strlen(cmdline) + 0x1000 + extra

arm64 => 0 for extra
ppc => fdt_totalsize(initial_boot_params) for extra.

-lakshmi
Re: [PATCH v17 02/10] of: Add a common kexec FDT setup function
On Fri, Feb 12, 2021 at 11:19 AM Lakshmi Ramasubramanian wrote:
>
> On 2/12/21 6:38 AM, Rob Herring wrote:
> > On Thu, Feb 11, 2021 at 7:17 PM Lakshmi Ramasubramanian
> > wrote:
> >>
> >> On 2/11/21 5:09 PM, Thiago Jung Bauermann wrote:
> >>>
> >>> There's actually a complication that I just noticed and needs to be
> >>> addressed. More below.
> >>>
> >>
> >> <...>
> >>
> >>>> +
> >>>> +/*
> >>>> + * of_kexec_alloc_and_setup_fdt - Alloc and setup a new Flattened Device Tree
> >>>> + *
> >>>> + * @image:		kexec image being loaded.
> >>>> + * @initrd_load_addr:	Address where the next initrd will be loaded.
> >>>> + * @initrd_len:		Size of the next initrd, or 0 if there will be none.
> >>>> + * @cmdline:		Command line for the next kernel, or NULL if there will
> >>>> + *			be none.
> >>>> + *
> >>>> + * Return: fdt on success, or NULL errno on error.
> >>>> + */
> >>>> +void *of_kexec_alloc_and_setup_fdt(const struct kimage *image,
> >>>> +				   unsigned long initrd_load_addr,
> >>>> +				   unsigned long initrd_len,
> >>>> +				   const char *cmdline)
> >>>> +{
> >>>> +	void *fdt;
> >>>> +	int ret, chosen_node;
> >>>> +	const void *prop;
> >>>> +	unsigned long fdt_size;
> >>>> +
> >>>> +	fdt_size = fdt_totalsize(initial_boot_params) +
> >>>> +		   (cmdline ? strlen(cmdline) : 0) +
> >>>> +		   FDT_EXTRA_SPACE;
> >>>
> >>> Just adding 4 KB to initial_boot_params won't be enough for crash
> >>> kernels on ppc64. The current powerpc code doubles the size of
> >>> initial_boot_params (which is normally larger than 4 KB) and even that
> >>> isn't enough. A patch was added to powerpc/next today which uses a more
> >>> precise (but arch-specific) formula:
> >>>
> >>> https://lore.kernel.org/linuxppc-dev/161243826811.119001.14083048209224609814.stgit@hbathini/
> >>>
> >>> So I believe we need a hook here where architectures can provide their
> >>> own specific calculation for the size of the fdt. Perhaps a weakly
> >>> defined function providing a default implementation which an
> >>> arch-specific file can override (a la arch_kexec_kernel_image_load())?
> >>>
> >>> Then the powerpc specific hook would be the kexec_fdt_totalsize_ppc64()
> >>> function from the patch I linked above.
> >>>
> >>
> >> Do you think it'd better to add "fdt_size" parameter to
> >> of_kexec_alloc_and_setup_fdt() so that the caller can provide the
> >> desired FDT buffer size?
> >
> > Yes, I guess so. But please define the param as extra size, not total
> > size. The kernel command line size addition can be in the common code.
>
> Will do. Just to clarify -
>
> The common code will do:
>
> fdt_totalsize(initial_boot_params) + strlen(cmdline) + extra_fdt_size
>
> The caller will pass "extra_fdt_size"
> ARM64 => 4KB
> PPC64 => fdt_totalsize(initial_boot_params) - which will be updated when
> the patch Thiago had referred to is merged.

Yes, I'd leave the 4KB in there by default and arm64 use 0.

Rob
Re: [PATCH] powerpc/pseries: Don't enforce MSI affinity with kdump
On 2/12/21 5:41 PM, Greg Kurz wrote:
> Depending on the number of online CPUs in the original kernel, it is
> likely for CPU #0 to be offline in a kdump kernel. The associated IRQs
> in the affinity mappings provided by irq_create_affinity_masks() are
> thus not started by irq_startup(), as per-design with managed IRQs.
>
> This can be a problem with multi-queue block devices driven by blk-mq :
> such a non-started IRQ is very likely paired with the single queue
> enforced by blk-mq during kdump (see blk_mq_alloc_tag_set()). This
> causes the device to remain silent and likely hangs the guest at
> some point.
>
> This is a regression caused by commit 9ea69a55b3b9 ("powerpc/pseries:
> Pass MSI affinity to irq_create_mapping()"). Note that this only happens
> with the XIVE interrupt controller because XICS has a workaround to bypass
> affinity, which is activated during kdump with the "noirqdistrib" kernel
> parameter.
>
> The issue comes from a combination of factors:
> - discrepancy between the number of queues detected by the multi-queue
>   block driver, that was used to create the MSI vectors, and the single
>   queue mode enforced later on by blk-mq because of kdump (i.e. keeping
>   all queues fixes the issue)
> - CPU#0 offline (i.e. kdump always succeed with CPU#0)
>
> Given that I couldn't reproduce on x86, which seems to always have CPU#0
> online even during kdump, I'm not sure where this should be fixed. Hence
> going for another approach : fine-grained affinity is for performance
> and we don't really care about that during kdump. Simply revert to the
> previous working behavior of ignoring affinity masks in this case only.
>
> Fixes: 9ea69a55b3b9 ("powerpc/pseries: Pass MSI affinity to irq_create_mapping()")
> Cc: lviv...@redhat.com
> Cc: sta...@vger.kernel.org
> Signed-off-by: Greg Kurz

Reviewed-by: Cédric Le Goater

Thanks for tracking this issue. This layer needs a rework. Patches
adding a MSI domain should be ready in a couple of releases. Hopefully.

C.

> ---
>  arch/powerpc/platforms/pseries/msi.c | 24 ++++++++++++++++++++++--
>  1 file changed, 22 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/platforms/pseries/msi.c b/arch/powerpc/platforms/pseries/msi.c
> index b3ac2455faad..29d04b83288d 100644
> --- a/arch/powerpc/platforms/pseries/msi.c
> +++ b/arch/powerpc/platforms/pseries/msi.c
> @@ -458,8 +458,28 @@ static int rtas_setup_msi_irqs(struct pci_dev *pdev, int nvec_in, int type)
>  			return hwirq;
>  		}
>
> -		virq = irq_create_mapping_affinity(NULL, hwirq,
> -						   entry->affinity);
> +		/*
> +		 * Depending on the number of online CPUs in the original
> +		 * kernel, it is likely for CPU #0 to be offline in a kdump
> +		 * kernel. The associated IRQs in the affinity mappings
> +		 * provided by irq_create_affinity_masks() are thus not
> +		 * started by irq_startup(), as per-design for managed IRQs.
> +		 * This can be a problem with multi-queue block devices driven
> +		 * by blk-mq : such a non-started IRQ is very likely paired
> +		 * with the single queue enforced by blk-mq during kdump (see
> +		 * blk_mq_alloc_tag_set()). This causes the device to remain
> +		 * silent and likely hangs the guest at some point.
> +		 *
> +		 * We don't really care for fine-grained affinity when doing
> +		 * kdump actually : simply ignore the pre-computed affinity
> +		 * masks in this case and let the default mask with all CPUs
> +		 * be used when creating the IRQ mappings.
> +		 */
> +		if (is_kdump_kernel())
> +			virq = irq_create_mapping(NULL, hwirq);
> +		else
> +			virq = irq_create_mapping_affinity(NULL, hwirq,
> +							   entry->affinity);
>
>  		if (!virq) {
>  			pr_debug("rtas_msi: Failed mapping hwirq %d\n", hwirq);
>
Re: [PATCH v17 02/10] of: Add a common kexec FDT setup function
On 2/12/21 6:38 AM, Rob Herring wrote:

On Thu, Feb 11, 2021 at 7:17 PM Lakshmi Ramasubramanian wrote:

On 2/11/21 5:09 PM, Thiago Jung Bauermann wrote:

There's actually a complication that I just noticed and needs to be
addressed. More below.

<...>

+
+/*
+ * of_kexec_alloc_and_setup_fdt - Alloc and setup a new Flattened Device Tree
+ *
+ * @image:		kexec image being loaded.
+ * @initrd_load_addr:	Address where the next initrd will be loaded.
+ * @initrd_len:		Size of the next initrd, or 0 if there will be none.
+ * @cmdline:		Command line for the next kernel, or NULL if there will
+ *			be none.
+ *
+ * Return: fdt on success, or NULL errno on error.
+ */
+void *of_kexec_alloc_and_setup_fdt(const struct kimage *image,
+				   unsigned long initrd_load_addr,
+				   unsigned long initrd_len,
+				   const char *cmdline)
+{
+	void *fdt;
+	int ret, chosen_node;
+	const void *prop;
+	unsigned long fdt_size;
+
+	fdt_size = fdt_totalsize(initial_boot_params) +
+		   (cmdline ? strlen(cmdline) : 0) +
+		   FDT_EXTRA_SPACE;

Just adding 4 KB to initial_boot_params won't be enough for crash
kernels on ppc64. The current powerpc code doubles the size of
initial_boot_params (which is normally larger than 4 KB) and even that
isn't enough. A patch was added to powerpc/next today which uses a more
precise (but arch-specific) formula:

https://lore.kernel.org/linuxppc-dev/161243826811.119001.14083048209224609814.stgit@hbathini/

So I believe we need a hook here where architectures can provide their
own specific calculation for the size of the fdt. Perhaps a weakly
defined function providing a default implementation which an
arch-specific file can override (a la arch_kexec_kernel_image_load())?

Then the powerpc specific hook would be the kexec_fdt_totalsize_ppc64()
function from the patch I linked above.

Do you think it'd better to add "fdt_size" parameter to
of_kexec_alloc_and_setup_fdt() so that the caller can provide the
desired FDT buffer size?

Yes, I guess so. But please define the param as extra size, not total
size. The kernel command line size addition can be in the common code.

Will do. Just to clarify -

The common code will do:

fdt_totalsize(initial_boot_params) + strlen(cmdline) + extra_fdt_size

The caller will pass "extra_fdt_size"
ARM64 => 4KB
PPC64 => fdt_totalsize(initial_boot_params) - which will be updated when
the patch Thiago had referred to is merged.

The above change is also going to conflict, so I think this may have to
wait. Or I'll take the common and arm bits and powerpc can be converted
next cycle (or after the merge window).

thanks.

-lakshmi
Re: [PATCH] powerpc/pseries: Don't enforce MSI affinity with kdump
On 12/02/2021 17:41, Greg Kurz wrote:
> Depending on the number of online CPUs in the original kernel, it is
> likely for CPU #0 to be offline in a kdump kernel. The associated IRQs
> in the affinity mappings provided by irq_create_affinity_masks() are
> thus not started by irq_startup(), as per-design with managed IRQs.
>
> This can be a problem with multi-queue block devices driven by blk-mq :
> such a non-started IRQ is very likely paired with the single queue
> enforced by blk-mq during kdump (see blk_mq_alloc_tag_set()). This
> causes the device to remain silent and likely hangs the guest at
> some point.
>
> This is a regression caused by commit 9ea69a55b3b9 ("powerpc/pseries:
> Pass MSI affinity to irq_create_mapping()"). Note that this only happens
> with the XIVE interrupt controller because XICS has a workaround to bypass
> affinity, which is activated during kdump with the "noirqdistrib" kernel
> parameter.
>
> The issue comes from a combination of factors:
> - discrepancy between the number of queues detected by the multi-queue
>   block driver, that was used to create the MSI vectors, and the single
>   queue mode enforced later on by blk-mq because of kdump (i.e. keeping
>   all queues fixes the issue)
> - CPU#0 offline (i.e. kdump always succeed with CPU#0)
>
> Given that I couldn't reproduce on x86, which seems to always have CPU#0
> online even during kdump, I'm not sure where this should be fixed. Hence
> going for another approach : fine-grained affinity is for performance
> and we don't really care about that during kdump. Simply revert to the
> previous working behavior of ignoring affinity masks in this case only.
>
> Fixes: 9ea69a55b3b9 ("powerpc/pseries: Pass MSI affinity to
> irq_create_mapping()")
> Cc: lviv...@redhat.com
> Cc: sta...@vger.kernel.org
> Signed-off-by: Greg Kurz
> ---
>  arch/powerpc/platforms/pseries/msi.c | 24 ++--
>  1 file changed, 22 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/platforms/pseries/msi.c
> b/arch/powerpc/platforms/pseries/msi.c
> index b3ac2455faad..29d04b83288d 100644
> --- a/arch/powerpc/platforms/pseries/msi.c
> +++ b/arch/powerpc/platforms/pseries/msi.c
> @@ -458,8 +458,28 @@ static int rtas_setup_msi_irqs(struct pci_dev *pdev, int
> nvec_in, int type)
>  		return hwirq;
>  	}
>
> -	virq = irq_create_mapping_affinity(NULL, hwirq,
> -					   entry->affinity);
> +	/*
> +	 * Depending on the number of online CPUs in the original
> +	 * kernel, it is likely for CPU #0 to be offline in a kdump
> +	 * kernel. The associated IRQs in the affinity mappings
> +	 * provided by irq_create_affinity_masks() are thus not
> +	 * started by irq_startup(), as per-design for managed IRQs.
> +	 * This can be a problem with multi-queue block devices driven
> +	 * by blk-mq : such a non-started IRQ is very likely paired
> +	 * with the single queue enforced by blk-mq during kdump (see
> +	 * blk_mq_alloc_tag_set()). This causes the device to remain
> +	 * silent and likely hangs the guest at some point.
> +	 *
> +	 * We don't really care for fine-grained affinity when doing
> +	 * kdump actually : simply ignore the pre-computed affinity
> +	 * masks in this case and let the default mask with all CPUs
> +	 * be used when creating the IRQ mappings.
> +	 */
> +	if (is_kdump_kernel())
> +		virq = irq_create_mapping(NULL, hwirq);
> +	else
> +		virq = irq_create_mapping_affinity(NULL, hwirq,
> +						   entry->affinity);
>
>  	if (!virq) {
>  		pr_debug("rtas_msi: Failed mapping hwirq %d\n", hwirq);
>
Reviewed-by: Laurent Vivier
[PATCH] powerpc/pseries: Don't enforce MSI affinity with kdump
Depending on the number of online CPUs in the original kernel, it is
likely for CPU #0 to be offline in a kdump kernel. The associated IRQs
in the affinity mappings provided by irq_create_affinity_masks() are
thus not started by irq_startup(), as per-design with managed IRQs.

This can be a problem with multi-queue block devices driven by blk-mq:
such a non-started IRQ is very likely paired with the single queue
enforced by blk-mq during kdump (see blk_mq_alloc_tag_set()). This
causes the device to remain silent and likely hangs the guest at
some point.

This is a regression caused by commit 9ea69a55b3b9 ("powerpc/pseries:
Pass MSI affinity to irq_create_mapping()"). Note that this only happens
with the XIVE interrupt controller because XICS has a workaround to bypass
affinity, which is activated during kdump with the "noirqdistrib" kernel
parameter.

The issue comes from a combination of factors:
- discrepancy between the number of queues detected by the multi-queue
  block driver, that was used to create the MSI vectors, and the single
  queue mode enforced later on by blk-mq because of kdump (i.e. keeping
  all queues fixes the issue)
- CPU#0 offline (i.e. kdump always succeeds with CPU#0)

Given that I couldn't reproduce on x86, which seems to always have CPU#0
online even during kdump, I'm not sure where this should be fixed. Hence
going for another approach: fine-grained affinity is for performance
and we don't really care about that during kdump. Simply revert to the
previous working behavior of ignoring affinity masks in this case only.
Fixes: 9ea69a55b3b9 ("powerpc/pseries: Pass MSI affinity to irq_create_mapping()")
Cc: lviv...@redhat.com
Cc: sta...@vger.kernel.org
Signed-off-by: Greg Kurz
---
 arch/powerpc/platforms/pseries/msi.c | 24 ++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/msi.c b/arch/powerpc/platforms/pseries/msi.c
index b3ac2455faad..29d04b83288d 100644
--- a/arch/powerpc/platforms/pseries/msi.c
+++ b/arch/powerpc/platforms/pseries/msi.c
@@ -458,8 +458,28 @@ static int rtas_setup_msi_irqs(struct pci_dev *pdev, int nvec_in, int type)
 		return hwirq;
 	}

-	virq = irq_create_mapping_affinity(NULL, hwirq,
-					   entry->affinity);
+	/*
+	 * Depending on the number of online CPUs in the original
+	 * kernel, it is likely for CPU #0 to be offline in a kdump
+	 * kernel. The associated IRQs in the affinity mappings
+	 * provided by irq_create_affinity_masks() are thus not
+	 * started by irq_startup(), as per-design for managed IRQs.
+	 * This can be a problem with multi-queue block devices driven
+	 * by blk-mq : such a non-started IRQ is very likely paired
+	 * with the single queue enforced by blk-mq during kdump (see
+	 * blk_mq_alloc_tag_set()). This causes the device to remain
+	 * silent and likely hangs the guest at some point.
+	 *
+	 * We don't really care for fine-grained affinity when doing
+	 * kdump actually : simply ignore the pre-computed affinity
+	 * masks in this case and let the default mask with all CPUs
+	 * be used when creating the IRQ mappings.
+	 */
+	if (is_kdump_kernel())
+		virq = irq_create_mapping(NULL, hwirq);
+	else
+		virq = irq_create_mapping_affinity(NULL, hwirq,
+						   entry->affinity);

 	if (!virq) {
 		pr_debug("rtas_msi: Failed mapping hwirq %d\n", hwirq);
--
2.26.2
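[For illustration, the special case the patch introduces reduces to one decision: under kdump, drop the precomputed per-queue mask and fall back to the default mask (all CPUs), so the single blk-mq queue's IRQ is guaranteed to be started even with CPU #0 offline. The types and helper below are userspace stand-ins, not the real kernel API.]

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for struct irq_affinity_desc / cpumask in the real code. */
struct cpumask { unsigned long bits; };

static struct cpumask demo_mask = { 0x1 };

/* Returning NULL models calling irq_create_mapping() (default mask for
 * all CPUs); returning the entry's mask models calling
 * irq_create_mapping_affinity() with the precomputed affinity. */
static const struct cpumask *pick_irq_affinity(bool is_kdump_kernel,
					       const struct cpumask *entry_affinity)
{
	return is_kdump_kernel ? NULL : entry_affinity;
}
```

[The design choice mirrors the XICS "noirqdistrib" workaround mentioned above: affinity is a performance optimization, so discarding it in the kdump-only path is safe.]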
Re: [PATCH 2/2] powerpc/uaccess: Move might_fault() into user_access_begin()
On 08/02/2021 at 14:57, Michael Ellerman wrote:
> We have a might_fault() check in __unsafe_put_user_goto(), but that is
> dangerous as it potentially calls lots of code while user access is
> enabled.
>
> It also triggers the check Alexey added to the irq restore path to
> catch cases like that:
>
>   WARNING: CPU: 30 PID: 1 at arch/powerpc/include/asm/book3s/64/kup.h:324 arch_local_irq_restore+0x160/0x190
>   NIP arch_local_irq_restore+0x160/0x190
>   LR lock_is_held_type+0x140/0x200
>   Call Trace:
>     0xc0007f392ff8 (unreliable)
>     ___might_sleep+0x180/0x320
>     __might_fault+0x50/0xe0
>     filldir64+0x2d0/0x5d0
>     call_filldir+0xc8/0x180
>     ext4_readdir+0x948/0xb40
>     iterate_dir+0x1ec/0x240
>     sys_getdents64+0x80/0x290
>     system_call_exception+0x160/0x280
>     system_call_common+0xf0/0x27c
>
> So remove the might fault check from unsafe_put_user().
>
> Any call to unsafe_put_user() must be inside a region that's had user
> access enabled with user_access_begin(), so move the might_fault() in
> there. That also allows us to drop the is_kernel_addr() test, because
> there should be no code using user_access_begin() in order to access a
> kernel address.

x86 and mips only have might_fault() on get_user() and put_user(), neither
on __get_user() nor on __put_user() nor on the unsafe alternative.

We have might_fault() in __get_user_nocheck() that is used by __get_user()
and __get_user_allowed() ie by unsafe_get_user(). Shouldn't those be
dropped as well?
Christophe

> Signed-off-by: Michael Ellerman
> ---
>  arch/powerpc/include/asm/uaccess.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
> index 70347ee34c94..71640eca7341 100644
> --- a/arch/powerpc/include/asm/uaccess.h
> +++ b/arch/powerpc/include/asm/uaccess.h
> @@ -214,8 +214,6 @@ do {								\
>  #define __unsafe_put_user_goto(x, ptr, size, label)		\
>  do {								\
>  	__typeof__(*(ptr)) __user *__pu_addr = (ptr);		\
> -	if (!is_kernel_addr((unsigned long)__pu_addr))		\
> -		might_fault();					\
>  	__chk_user_ptr(ptr);					\
>  	__put_user_size_goto((x), __pu_addr, (size), label);	\
>  } while (0)
> @@ -494,6 +492,8 @@ extern void memcpy_page_flushcache(char *to, struct page *page, size_t offset,
>  static __must_check inline bool user_access_begin(const void __user *ptr, size_t len)
>  {
> +	might_fault();
> +
>  	if (unlikely(!access_ok(ptr, len)))
>  		return false;
>
>  	allow_read_write_user((void __user *)ptr, ptr, len);
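[As an aside, the effect of the move can be modeled in userspace: after the patch, might_fault() runs exactly once in user_access_begin(), before user access is enabled, instead of once per unsafe_put_user() while it is enabled. All names below are stand-ins for the kernel helpers, with counters replacing the real lockdep/sleep checks.]

```c
#include <stdbool.h>

static int might_fault_calls;
static bool user_access_enabled;

/* Stand-in for the real might_fault(): just count invocations. */
static void might_fault(void) { might_fault_calls++; }

static bool user_access_begin(void)
{
	might_fault();              /* checked once, before access is enabled */
	user_access_enabled = true; /* models allow_read_write_user() */
	return true;
}

/* After the patch there is no might_fault() here any more. */
static void unsafe_put_user(void) { }

static void user_access_end(void) { user_access_enabled = false; }

/* Perform n_puts stores inside one open-coded region and report how many
 * might_fault() checks ran: one, regardless of n_puts. */
static int demo_sequence(int n_puts)
{
	if (!user_access_begin())
		return -1;
	for (int i = 0; i < n_puts; i++)
		unsafe_put_user();
	user_access_end();
	return might_fault_calls;
}
```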
Re: [PATCH v17 02/10] of: Add a common kexec FDT setup function
On Thu, Feb 11, 2021 at 7:17 PM Lakshmi Ramasubramanian wrote:
>
> On 2/11/21 5:09 PM, Thiago Jung Bauermann wrote:
> >
> > There's actually a complication that I just noticed and needs to be
> > addressed. More below.
> >
> > <...>
> >
> >> +
> >> +/*
> >> + * of_kexec_alloc_and_setup_fdt - Alloc and setup a new Flattened Device
> >> Tree
> >> + *
> >> + * @image:		kexec image being loaded.
> >> + * @initrd_load_addr:	Address where the next initrd will be loaded.
> >> + * @initrd_len:	Size of the next initrd, or 0 if there will
> >> be none.
> >> + * @cmdline:	Command line for the next kernel, or NULL if
> >> there will
> >> + *		be none.
> >> + *
> >> + * Return: fdt on success, or NULL errno on error.
> >> + */
> >> +void *of_kexec_alloc_and_setup_fdt(const struct kimage *image,
> >> +				   unsigned long initrd_load_addr,
> >> +				   unsigned long initrd_len,
> >> +				   const char *cmdline)
> >> +{
> >> +	void *fdt;
> >> +	int ret, chosen_node;
> >> +	const void *prop;
> >> +	unsigned long fdt_size;
> >> +
> >> +	fdt_size = fdt_totalsize(initial_boot_params) +
> >> +		   (cmdline ? strlen(cmdline) : 0) +
> >> +		   FDT_EXTRA_SPACE;
> >
> > Just adding 4 KB to initial_boot_params won't be enough for crash
> > kernels on ppc64. The current powerpc code doubles the size of
> > initial_boot_params (which is normally larger than 4 KB) and even that
> > isn't enough. A patch was added to powerpc/next today which uses a more
> > precise (but arch-specific) formula:
> >
> > https://lore.kernel.org/linuxppc-dev/161243826811.119001.14083048209224609814.stgit@hbathini/
> >
> > So I believe we need a hook here where architectures can provide their
> > own specific calculation for the size of the fdt. Perhaps a weakly
> > defined function providing a default implementation which an
> > arch-specific file can override (a la arch_kexec_kernel_image_load())?
> >
> > Then the powerpc specific hook would be the kexec_fdt_totalsize_ppc64()
> > function from the patch I linked above.
> >
>
> Do you think it'd better to add "fdt_size" parameter to
> of_kexec_alloc_and_setup_fdt() so that the caller can provide the
> desired FDT buffer size?

Yes, I guess so. But please define the param as extra size, not total
size. The kernel command line size addition can be in the common code.

The above change is also going to conflict, so I think this may have to
wait. Or I'll take the common and arm bits and powerpc can be converted
next cycle (or after the merge window).

Rob
Re: linux-next: manual merge of the spi tree with the powerpc tree
On Fri, Feb 12, 2021 at 03:31:42PM +1100, Stephen Rothwell wrote:

> BTW Mark: the author's address in 258ea99fe25a uses a non existent domain :-(

Ugh, I think that's something gone wrong with b4 :( A bit late now to
try to fix it up.

signature.asc
Description: PGP signature
[PATCH for 5.10] powerpc/32: Preserve cr1 in exception prolog stack check to fix build error
This is a backport of 3642eb21256a ("powerpc/32: Preserve cr1 in exception
prolog stack check to fix build error") for kernel 5.10.

It fixes the build failure on v5.10 reported by kernel test robot and by
David Michael.

This fix is not in the Linux tree yet, it is in the next branch in the
powerpc tree.

(cherry picked from commit 3642eb21256a317ac14e9ed560242c6d20cf06d9)

THREAD_ALIGN_SHIFT = THREAD_SHIFT + 1 = PAGE_SHIFT + 1
Maximum PAGE_SHIFT is 18 for 256k pages so THREAD_ALIGN_SHIFT is 19
at the maximum.

No need to clobber cr1, it can be preserved when moving r1 into CR
when we check stack overflow.

This reduces the number of instructions in Machine Check Exception
prolog and fixes a build failure reported by the kernel test robot
on v5.10 stable when building with RTAS + VMAP_STACK + KVM. That
build failure is due to too many instructions in the prolog hence
not fitting between 0x200 and 0x300.

Although the problem doesn't show up in mainline, it is still worth
the change.

Fixes: 98bf2d3f4970 ("powerpc/32s: Fix RTAS machine check with VMAP stack")
Cc: sta...@vger.kernel.org
Reported-by: kernel test robot
Signed-off-by: Christophe Leroy
Signed-off-by: Michael Ellerman
Link: https://lore.kernel.org/r/5ae4d545e3ac58e133d2599e0deb88843cb494fc.1612768623.git.christophe.le...@csgroup.eu
---
 arch/powerpc/kernel/head_32.h        | 2 +-
 arch/powerpc/kernel/head_book3s_32.S | 6 --
 2 files changed, 1 insertion(+), 7 deletions(-)

diff --git a/arch/powerpc/kernel/head_32.h b/arch/powerpc/kernel/head_32.h
index c88e66adecb5..fef0b34a77c9 100644
--- a/arch/powerpc/kernel/head_32.h
+++ b/arch/powerpc/kernel/head_32.h
@@ -56,7 +56,7 @@
 1:	tophys_novmstack r11, r11
 #ifdef CONFIG_VMAP_STACK
-	mtcrf	0x7f, r1
+	mtcrf	0x3f, r1
 	bt	32 - THREAD_ALIGN_SHIFT, stack_overflow
 #endif
 .endm
diff --git a/arch/powerpc/kernel/head_book3s_32.S b/arch/powerpc/kernel/head_book3s_32.S
index d66da35f2e8d..2729d8fa6e77 100644
--- a/arch/powerpc/kernel/head_book3s_32.S
+++ b/arch/powerpc/kernel/head_book3s_32.S
@@ -280,12 +280,6 @@ MachineCheck:
 7:	EXCEPTION_PROLOG_2
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 #ifdef CONFIG_PPC_CHRP
-#ifdef CONFIG_VMAP_STACK
-	mfspr	r4, SPRN_SPRG_THREAD
-	tovirt(r4, r4)
-	lwz	r4, RTAS_SP(r4)
-	cmpwi	cr1, r4, 0
-#endif
 	beq	cr1, machine_check_tramp
 	twi	31, 0, 0
 #else
--
2.25.0
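[For reference, the reason narrowing the mask from 0x7f to 0x3f preserves cr1 follows from how the mtcrf FXM mask maps to CR fields: bit (0x80 >> n) of the mask selects CR field n, so 0x7f writes fields 1-7 while 0x3f writes only fields 2-7, leaving cr1 intact for the later "beq cr1, machine_check_tramp" test. The helper below is an illustrative model of that mapping, not kernel code.]

```c
#include <stdbool.h>

/* True if a mtcrf instruction with the given 8-bit FXM mask would write
 * CR field cr_field (0..7). Bit (0x80 >> n) of FXM selects field n. */
static bool mtcrf_writes_field(unsigned int fxm, unsigned int cr_field)
{
	return (fxm & (0x80u >> cr_field)) != 0;
}
```

[Fields 3-4, which hold the bit tested by "bt 32 - THREAD_ALIGN_SHIFT" for any THREAD_ALIGN_SHIFT in the 13-19 range, are still covered by 0x3f.]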