Re: powerpc/process: Fix flush_all_to_thread for SPE
On Thu, 2018-10-25 at 12:07:54 UTC, "DATACOM - Felipe.Rechia" wrote: > From: "Felipe Rechia" > Date: Wed, 24 Oct 2018 10:57:22 -0300 > Subject: [PATCH] powerpc/process: Fix flush_all_to_thread for SPE > > Fix a bug introduced by the creation of flush_all_to_thread() for > processors that have SPE (Signal Processing Engine) and use it to > compute floating-point operations. > > >From userspace perspective, the problem was seen in attempts of > computing floating-point operations which should generate exceptions. > For example: > > fork(); > float x = 0.0 / 0.0; > isnan(x); // forked process returns False (should be True) > > The operation above also should always cause the SPEFSCR FINV bit to > be set. However, the SPE floating-point exceptions were turned off > after a fork(). > > Kernel versions prior to the bug used flush_spe_to_thread(), which > first saves SPEFSCR register values in tsk->thread and then calls > giveup_spe(tsk). > > After commit 579e633e764e, the save_all() function was called first > to giveup_spe(), and then the SPEFSCR register values were saved in > tsk->thread. This would save the SPEFSCR register values after > disabling SPE for that thread, causing the bug described above. > > Fixes 579e633e764e ("powerpc: create flush_all_to_thread()") > Signed-off-by: felipe.rechia Applied to powerpc next, thanks. https://git.kernel.org/powerpc/c/e901378578c62202594cba0f6c076f cheers
Re: powerpc/pseries: add missing cpumask.h include file
On Tue, 2018-10-23 at 01:32:12 UTC, Tyrel Datwyler wrote: > Build error is encountered when inlcuding if no explicit or > implicit include of cpumask.h exists in the including file. > > In file included from arch/powerpc/platforms/pseries/hotplug-pci.c:3:0: > ./arch/powerpc/include/asm/rtas.h:360:34: error: unknown type name > 'cpumask_var_t' > extern int rtas_online_cpus_mask(cpumask_var_t cpus); > ^ > ./arch/powerpc/include/asm/rtas.h:361:35: error: unknown type name > 'cpumask_var_t' > extern int rtas_offline_cpus_mask(cpumask_var_t cpus); > > Fixes: 120496ac2d2d ("powerpc: Bring all threads online prior to > migration/hibernation") > Signed-off-by: Tyrel Datwyler Applied to powerpc next, thanks. https://git.kernel.org/powerpc/c/8dce6b2215eaa91dbf04463e11098a cheers
Re: selftests/powerpc: Fix ptrace tm failure
On Mon, 2018-10-22 at 21:16:26 UTC, Breno Leitao wrote: > Test ptrace-tm-spd-gpr fails on current kernel (4.19) due to a segmentation > fault that happens on the child process prior to setting cptr[2] = 1. This > causes the parent process to wait forever at 'while (!pptr[2])' and the test > to > be killed by the test harness framework by timeout, thus, failing. > > The segmentation fault happens because of a inline assembly being > generated as: > > 0x1355c lfsf0, 0(0) > > This is reading memory position 0x0 and causing the segmentation fault. > > This code is being generated by ASM_LOAD_FPR_SINGLE_PRECISION(flt_4), where > flt_4 is passed to the inline assembly block as: > > [flt_4] "r" () > > Since the inline assembly 'r' constraint means any GPR, gpr0 is being > chosen, thus causing this issue when issuing a Load Floating-Point Single > instruction. > > This patch simply changes the constraint to 'b', which specify that this > register will be used as base, and r0 is not allowed to be used, avoiding > this issue. > > Other than that, removing flt_2 register from the input operands, since it > is not used by the inline assembly code at all. > > Cc: sta...@vger.kernel.org > Signed-off-by: Breno Leitao > Acked-by: Segher Boessenkool Applied to powerpc next, thanks. https://git.kernel.org/powerpc/c/48dc0ef19044bfb69193302fbe3a83 cheers
Re: [v2] KVM: PPC: Use exported tb_to_ns() function in decrementer emulation
On Sat, 2018-10-20 at 09:54:55 UTC, Paul Mackerras wrote: > This changes the KVM code that emulates the decrementer function to do > the conversion of decrementer values to time intervals in nanoseconds > by calling the tb_to_ns() function exported by the powerpc timer code, > in preference to open-coded arithmetic using values from the > decrementer_clockevent struct. Similarly, the HV-KVM code that did > the same conversion using arithmetic on tb_ticks_per_sec also now > uses tb_to_ns(). > > Signed-off-by: Paul Mackerras Applied to powerpc next, thanks. https://git.kernel.org/powerpc/c/c43befca86ae35cc82bd889484bd17 cheers
Re: [v6,08/20] powerpc/8xx: Use patch_site for perf counters setup
On Fri, 2018-10-19 at 06:55:08 UTC, Christophe Leroy wrote: > The 8xx TLB miss routines are patched when (de)activating > perf counters. > > This patch uses the new patch_site functionality in order > to get a better code readability and avoid a label mess when > dumping the code with 'objdump -d' > > Signed-off-by: Christophe Leroy Applied to powerpc next, thanks. https://git.kernel.org/powerpc/c/709cf19c5749308603ffa12557d8bd cheers
Re: [v6,07/20] powerpc/8xx: Use patch_site for memory setup patching
On Fri, 2018-10-19 at 06:55:06 UTC, Christophe Leroy wrote: > The 8xx TLB miss routines are patched at startup at several places. > > This patch uses the new patch_site functionality in order > to get a better code readability and avoid a label mess when > dumping the code with 'objdump -d' > > Signed-off-by: Christophe Leroy Applied to powerpc next, thanks. https://git.kernel.org/powerpc/c/1a210878bf21de3f60646c13001d04 cheers
Re: [v6, 06/20] powerpc/code-patching: add a helper to get the address of a patch_site
On Fri, 2018-10-19 at 06:55:04 UTC, Christophe Leroy wrote: > This patch adds a helper to get the address of a patch_site > > Signed-off-by: Christophe Leroy Applied to powerpc next, thanks. https://git.kernel.org/powerpc/c/082e2869fc022de4db9977e06558da cheers
Re: [v6, 01/20] Revert "powerpc/8xx: Use L1 entry APG to handle _PAGE_ACCESSED for CONFIG_SWAP"
On Fri, 2018-10-19 at 06:54:54 UTC, Christophe Leroy wrote: > This reverts commit 4f94b2c7462d9720b2afa7e8e8d4c19446bb31ce. > > That commit was buggy, as it used rlwinm instead of rlwimi. > Instead of fixing that bug, we revert the previous commit in order to > reduce the dependency between L1 entries and L2 entries > > Fixes: 4f94b2c7462d9 ("powerpc/8xx: Use L1 entry APG to handle _PAGE_ACCESSED > for CONFIG_SWAP") > Cc: sta...@vger.kernel.org > Signed-off-by: Christophe Leroy Applied to powerpc next, thanks. https://git.kernel.org/powerpc/c/cc4ebf5c0a3440ed0a32d25c55ebdb cheers
Re: powerpc/pseries: Export maximum memory value
On Wed, 2018-10-10 at 10:22:59 UTC, Aravinda Prasad wrote: > This patch exports the maximum possible amount of memory > configured on the system via /proc/powerpc/lparcfg. > > Signed-off-by: Aravinda Prasad Applied to powerpc next, thanks. https://git.kernel.org/powerpc/c/772b039fd9a7e12d5fc80e6f649341 cheers
Re: [PATCH 5/6] pci: layerscape: Add the EP mode support.
Hi, On 31/10/18 8:03 AM, Xiaowei Bao wrote: > > > -Original Message- > From: Xiaowei Bao > Sent: 2018年10月26日 17:19 > To: 'Kishon Vijay Abraham I' ; bhelg...@google.com; > robh...@kernel.org; mark.rutl...@arm.com; shawn...@kernel.org; Leo Li > ; lorenzo.pieral...@arm.com; a...@arndb.de; > gre...@linuxfoundation.org; M.h. Lian ; Mingkai Hu > ; Roy Zang ; > kstew...@linuxfoundation.org; cyrille.pitc...@free-electrons.com; > pombreda...@nexb.com; shawn@rock-chips.com; niklas.cas...@axis.com; > linux-...@vger.kernel.org; devicet...@vger.kernel.org; > linux-ker...@vger.kernel.org; linux-arm-ker...@lists.infradead.org; > linuxppc-dev@lists.ozlabs.org > Cc: Jiafei Pan > Subject: RE: [PATCH 5/6] pci: layerscape: Add the EP mode support. > > > > -Original Message- > From: Kishon Vijay Abraham I > Sent: 2018年10月26日 13:29 > To: Xiaowei Bao ; bhelg...@google.com; > robh...@kernel.org; mark.rutl...@arm.com; shawn...@kernel.org; Leo Li > ; lorenzo.pieral...@arm.com; a...@arndb.de; > gre...@linuxfoundation.org; M.h. Lian ; Mingkai Hu > ; Roy Zang ; > kstew...@linuxfoundation.org; cyrille.pitc...@free-electrons.com; > pombreda...@nexb.com; shawn@rock-chips.com; niklas.cas...@axis.com; > linux-...@vger.kernel.org; devicet...@vger.kernel.org; > linux-ker...@vger.kernel.org; linux-arm-ker...@lists.infradead.org; > linuxppc-dev@lists.ozlabs.org > Subject: Re: [PATCH 5/6] pci: layerscape: Add the EP mode support. > > Hi, > > On Thursday 25 October 2018 04:39 PM, Xiaowei Bao wrote: >> Add the PCIe EP mode support for layerscape platform. >> >> Signed-off-by: Xiaowei Bao >> --- >> drivers/pci/controller/dwc/Makefile|2 +- >> drivers/pci/controller/dwc/pci-layerscape-ep.c | 161 >> >> 2 files changed, 162 insertions(+), 1 deletions(-) create mode >> 100644 drivers/pci/controller/dwc/pci-layerscape-ep.c >> >> diff --git a/drivers/pci/controller/dwc/Makefile >> b/drivers/pci/controller/dwc/Makefile >> index 5d2ce72..b26d617 100644 >> --- a/drivers/pci/controller/dwc/Makefile >> +++ b/drivers/pci/controller/dwc/Makefile >> @@ -8,7 +8,7 @@ obj-$(CONFIG_PCI_EXYNOS) += pci-exynos.o >> obj-$(CONFIG_PCI_IMX6) += pci-imx6.o >> obj-$(CONFIG_PCIE_SPEAR13XX) += pcie-spear13xx.o >> obj-$(CONFIG_PCI_KEYSTONE) += pci-keystone-dw.o pci-keystone.o >> -obj-$(CONFIG_PCI_LAYERSCAPE) += pci-layerscape.o >> +obj-$(CONFIG_PCI_LAYERSCAPE) += pci-layerscape.o pci-layerscape-ep.o >> obj-$(CONFIG_PCIE_QCOM) += pcie-qcom.o >> obj-$(CONFIG_PCIE_ARMADA_8K) += pcie-armada8k.o >> obj-$(CONFIG_PCIE_ARTPEC6) += pcie-artpec6.o diff --git >> a/drivers/pci/controller/dwc/pci-layerscape-ep.c >> b/drivers/pci/controller/dwc/pci-layerscape-ep.c >> new file mode 100644 >> index 000..3b33bbc >> --- /dev/null >> +++ b/drivers/pci/controller/dwc/pci-layerscape-ep.c >> @@ -0,0 +1,161 @@ >> +// SPDX-License-Identifier: GPL-2.0 >> +/* >> + * PCIe controller EP driver for Freescale Layerscape SoCs >> + * >> + * Copyright (C) 2018 NXP Semiconductor. >> + * >> + * Author: Xiaowei Bao */ >> + >> +#include >> +#include >> +#include >> +#include >> +#include >> +#include >> +#include >> +#include >> + >> +#include "pcie-designware.h" >> + >> +#define PCIE_DBI2_OFFSET0x1000 /* DBI2 base address*/ > > The base address should come from dt. >> + >> +struct ls_pcie_ep { >> +struct dw_pcie *pci; >> +}; >> + >> +#define to_ls_pcie_ep(x)dev_get_drvdata((x)->dev) >> + >> +static bool ls_pcie_is_bridge(struct ls_pcie_ep *pcie) { >> +struct dw_pcie *pci = pcie->pci; >> +u32 header_type; >> + >> +header_type = ioread8(pci->dbi_base + PCI_HEADER_TYPE); >> +header_type &= 0x7f; >> + >> +return header_type == PCI_HEADER_TYPE_BRIDGE; } >> + >> +static int ls_pcie_establish_link(struct dw_pcie *pci) { >> +return 0; >> +} > > There should be some way by which EP should tell RC that it is not configured > yet. Are there no bits to control LTSSM state initialization or Configuration > retry status enabling? > [Xiaowei Bao] There have not bits to control LTSSM state to tell the RC it is > configured. The start link is auto completed. > [Xiaowei Bao] Hi Kishon, is there any advice? If there is no HW support, I don't think anything could be done here. This could result in RC reading configuration space even before EP is fully initialized. >> + >> +static const struct dw_pcie_ops ls_pcie_ep_ops = { >> +.start_link = ls_pcie_establish_link, }; >> + >> +static const struct of_device_id ls_pcie_ep_of_match[] = { >> +{ .compatible = "fsl,ls-pcie-ep",}, >> +{ }, >> +}; >> + >> +static void ls_pcie_ep_init(struct dw_pcie_ep *ep) { >> +struct dw_pcie *pci = to_dw_pcie_from_ep(ep); >> +struct pci_epc *epc = ep->epc; >> +enum pci_barno bar; >> + >> +for (bar = BAR_0; bar <= BAR_5; bar++) >> +dw_pcie_ep_reset_bar(pci, bar); >> + >> +epc->features |= EPC_FEATURE_NO_LINKUP_NOTIFIER; } >> + >>
Re: [PATCH 1/5] powerpc/64s: Guarded Userspace Access Prevention
On Sun, 2018-10-28 at 18:57 +0100, LEROY Christophe wrote: > Russell Currey a écrit : > > > Guarded Userspace Access Prevention (GUAP) utilises a feature of > > the Radix MMU which disallows read and write access to userspace > > addresses. By utilising this, the kernel is prevented from > > accessing > > user data from outside of trusted paths that perform proper safety > > checks, > > such as copy_{to/from}_user() and friends. > > > > Userspace access is disabled from early boot and is only enabled > > when: > > > > - exiting the kernel and entering userspace > > - performing an operation like copy_{to/from}_user() > > - context switching to a process that has access enabled > > > > and similarly, access is disabled again when exiting userspace and > > entering > > the kernel. > > > > This feature has a slight performance impact which I roughly > > measured to be > > 3% slower in the worst case (performing 1GB of 1 byte > > read()/write() > > syscalls), and is gated behind the CONFIG_PPC_RADIX_GUAP option for > > performance-critical builds. > > > > This feature can be tested by using the lkdtm driver > > (CONFIG_LKDTM=y) and > > performing the following: > > > > echo ACCESS_USERSPACE > [debugfs]/provoke-crash/DIRECT > > > > if enabled, this should send SIGSEGV to the thread. > > > > Signed-off-by: Russell Currey > > I think this patch should be split in at least two parts: > First part for implementing the generic part, including the changes > to > futex and csum, and a second part implementing the radix part. I'll see how I go making generic handlers - I am concerned about the implementation becoming more complex than it needs to be just to accommodate potential future changes that could end up having different requirements anyway, rather than something simple that works today. > > --- > > Since the previous version of this patchset (named KHRAP) there > > have been > > several changes, some of which include: > > > > - macro naming, suggested by Nick > > - builds should be fixed outside of 64s > > - no longer unlock heading out to userspace > > - removal of unnecessary isyncs > > - more config option testing > > - removal of save/restore > > - use pr_crit() and reword message on fault > > > > arch/powerpc/include/asm/exception-64e.h | 3 ++ > > arch/powerpc/include/asm/exception-64s.h | 19 +++- > > arch/powerpc/include/asm/mmu.h | 7 +++ > > arch/powerpc/include/asm/paca.h | 3 ++ > > arch/powerpc/include/asm/reg.h | 1 + > > arch/powerpc/include/asm/uaccess.h | 57 > > > > arch/powerpc/kernel/asm-offsets.c| 1 + > > arch/powerpc/kernel/dt_cpu_ftrs.c| 4 ++ > > arch/powerpc/kernel/entry_64.S | 17 ++- > > arch/powerpc/mm/fault.c | 12 + > > arch/powerpc/mm/pgtable-radix.c | 2 + > > arch/powerpc/mm/pkeys.c | 7 ++- > > arch/powerpc/platforms/Kconfig.cputype | 15 +++ > > 13 files changed, 135 insertions(+), 13 deletions(-) > > > > diff --git a/arch/powerpc/include/asm/exception-64e.h > > b/arch/powerpc/include/asm/exception-64e.h > > index 555e22d5e07f..bf25015834ee 100644 > > --- a/arch/powerpc/include/asm/exception-64e.h > > +++ b/arch/powerpc/include/asm/exception-64e.h > > @@ -215,5 +215,8 @@ exc_##label##_book3e: > > #define RFI_TO_USER > > \ > > rfi > > > > +#define UNLOCK_USER_ACCESS(reg) > > +#define LOCK_USER_ACCESS(reg) > > + > > #endif /* _ASM_POWERPC_EXCEPTION_64E_H */ > > > > diff --git a/arch/powerpc/include/asm/exception-64s.h > > b/arch/powerpc/include/asm/exception-64s.h > > index 3b4767ed3ec5..0cac5bd380ca 100644 > > --- a/arch/powerpc/include/asm/exception-64s.h > > +++ b/arch/powerpc/include/asm/exception-64s.h > > @@ -264,6 +264,19 @@ BEGIN_FTR_SECTION_NESTED(943) > > \ > > std ra,offset(r13); \ > > END_FTR_SECTION_NESTED(ftr,ftr,943) > > > > +#define LOCK_USER_ACCESS(reg) > > \ > > +BEGIN_MMU_FTR_SECTION_NESTED(944) > > \ > > + LOAD_REG_IMMEDIATE(reg,AMR_LOCKED); \ > > + mtspr SPRN_AMR,reg; > > \ > > +END_MMU_FTR_SECTION_NESTED(MMU_FTR_RADIX_GUAP,MMU_FTR_RADIX_GUAP,9 > > 44) > > + > > +#define UNLOCK_USER_ACCESS(reg) > > \ > > +BEGIN_MMU_FTR_SECTION_NESTED(945) > > \ > > + li reg,0; \ > > + mtspr SPRN_AMR,reg; > > \ > > + isync > > \ > >
Re: [PATCH 0/5] Guarded Userspace Access Prevention on Radix
On Fri, 2018-10-26 at 18:29 +0200, LEROY Christophe wrote: > Russell Currey a écrit : > > > Guarded Userspace Access Prevention is a security mechanism that > > prevents > > the kernel from being able to read and write userspace addresses > > outside of > > the allowed paths, most commonly copy_{to/from}_user(). > > > > At present, the only CPU that supports this is POWER9, and only > > while using > > the Radix MMU. Privileged reads and writes cannot access user data > > when > > key 0 of the AMR is set. This is described in the "Radix Tree > > Translation > > Storage Protection" section of the POWER ISA as of version 3.0. > > It is not right that only power9 can support that. It's true that not only P9 can support it, but there are more considerations under hash than radix, implementing this for radix is a first step. > > The 8xx has mmu access protection registers which can serve the > same > purpose. Today on the 8xx kernel space is group 0 and user space is > group 1. Group 0 is set to "page defined access permission" in > MD_AP > and MI_AP registers, and group 1 is set to "all accesses done with > supervisor rights". By setting group 1 to "user and supervisor > interpretation swapped" we can forbid kernel access to user space > while still allowing user access to it. Then by simply changing > group > 1 mode at dedicated places we can lock/unlock kernel access to user > space. > > Could you implement something as generic as possible having that in > mind for a future patch ? I don't think anything in this series is particularly problematic in relation to future work for hash. I am interested in doing a hash implementation in future too. - Russell > > Christophe > > > GUAP code sets key 0 of the AMR (thus disabling accesses of user > > data) > > early during boot, and only ever "unlocks" access prior to certain > > operations, like copy_{to/from}_user(), futex ops, etc. Setting > > this does > > not prevent unprivileged access, so userspace can operate fine > > while access > > is locked. > > > > There is a performance impact, although I don't consider it > > heavy. Running > > a worst-case benchmark of a 1GB copy 1 byte at a time (and thus > > constant > > read(1) write(1) syscalls), I found enabling GUAP to be 3.5% slower > > than > > when disabled. In most cases, the difference is negligible. The > > main > > performance impact is the mtspr instruction, which is quite slow. > > > > There are a few caveats with this series that could be improved > > upon in > > future. Right now there is no saving and restoring of the AMR > > value - > > there is no userspace exploitation of the AMR on Radix in POWER9, > > but if > > this were to change in future, saving and restoring the value would > > be > > necessary. > > > > No attempt to optimise cases of repeated calls - for example, if > > some > > code was repeatedly calling copy_to_user() for small sizes very > > frequently, > > it would be slower than the equivalent of wrapping that code in an > > unlock > > and lock and only having to modify the AMR once. > > > > There are some interesting cases that I've attempted to handle, > > such as if > > the AMR is unlocked (i.e. because a copy_{to_from}_user is in > > progress)... > > > > - and an exception is taken, the kernel would then be running > > with the > > AMR unlocked and freely able to access userspace again. I am > > working > > around this by storing a flag in the PACA to indicate if the > > AMR is > > unlocked (to save a costly SPR read), and if so, locking the > > AMR in > > the exception entry path and unlocking it on the way out. > > > > - and gets context switched out, goes into a path that locks > > the AMR, > > then context switches back, access will be disabled and will > > fault. > > As a result, I context switch the AMR between tasks as if it > > was used > > by userspace like hash (which already implements this). > > > > Another consideration is use of the isync instruction. Without an > > isync > > following the mtspr instruction, there is no guarantee that the > > change > > takes effect. The issue is that isync is very slow, and so I tried > > to > > avoid them wherever necessary. In this series, the only place an > > isync > > gets used is after *unlocking* the AMR, because if an access takes > > place > > and access is still prevented, the kernel will fault. > > > > On the flipside, a slight delay in unlocking caused by skipping an > > isync > > potentially allows a small window of vulnerability. It is my > > opinion > > that this window is practically impossible to exploit, but if > > someone > > thinks otherwise, please do share. > > > > This series is my first attempt at POWER assembly so all feedback > > is very > > welcome. > > > > The official theme song of this series can be found here: > > https://www.youtube.com/watch?v=QjTrnKAcYjE > > > > Russell Currey (5): > > powerpc/64s:
[PATCH] raid6/ppc: Fix build for clang
We cannot build these files with clang as it does not allow altivec instructions in assembly when -msoft-float is passed. Jinsong Ji wrote: > We currently disable Altivec/VSX support when enabling soft-float. So > any usage of vector builtins will break. > > Enable Altivec/VSX with soft-float may need quite some clean up work, so > I guess this is currently a limitation. > > Removing -msoft-float will make it work (and we are lucky that no > floating point instructions will be generated as well). This is a workaround until the issue is resolved in clang. Link: https://bugs.llvm.org/show_bug.cgi?id=31177 Link: https://github.com/ClangBuiltLinux/linux/issues/239 Signed-off-by: Joel Stanley --- lib/raid6/Makefile | 15 +++ 1 file changed, 15 insertions(+) diff --git a/lib/raid6/Makefile b/lib/raid6/Makefile index 2f8b61dfd9b0..3a844e6fd01c 100644 --- a/lib/raid6/Makefile +++ b/lib/raid6/Makefile @@ -18,6 +18,21 @@ quiet_cmd_unroll = UNROLL $@ ifeq ($(CONFIG_ALTIVEC),y) altivec_flags := -maltivec $(call cc-option,-mabi=altivec) + +ifdef CONFIG_CC_IS_CLANG +# clang ppc port does not yet support -maltifec when -msoft-float is +# enabled. A future release of clang will resolve this +# https://bugs.llvm.org/show_bug.cgi?id=31177 +CFLAGS_REMOVE_altivec1.o += -msoft-float +CFLAGS_REMOVE_altivec2.o += -msoft-float +CFLAGS_REMOVE_altivec4.o += -msoft-float +CFLAGS_REMOVE_altivec8.o += -msoft-float +CFLAGS_REMOVE_altivec8.o += -msoft-float +CFLAGS_REMOVE_vpermxor1.o += -msoft-float +CFLAGS_REMOVE_vpermxor2.o += -msoft-float +CFLAGS_REMOVE_vpermxor4.o += -msoft-float +CFLAGS_REMOVE_vpermxor8.o += -msoft-float +endif endif # The GCC option -ffreestanding is required in order to compile code containing -- 2.19.1
RE: [PATCH 5/6] pci: layerscape: Add the EP mode support.
-Original Message- From: Xiaowei Bao Sent: 2018年10月26日 17:19 To: 'Kishon Vijay Abraham I' ; bhelg...@google.com; robh...@kernel.org; mark.rutl...@arm.com; shawn...@kernel.org; Leo Li ; lorenzo.pieral...@arm.com; a...@arndb.de; gre...@linuxfoundation.org; M.h. Lian ; Mingkai Hu ; Roy Zang ; kstew...@linuxfoundation.org; cyrille.pitc...@free-electrons.com; pombreda...@nexb.com; shawn@rock-chips.com; niklas.cas...@axis.com; linux-...@vger.kernel.org; devicet...@vger.kernel.org; linux-ker...@vger.kernel.org; linux-arm-ker...@lists.infradead.org; linuxppc-dev@lists.ozlabs.org Cc: Jiafei Pan Subject: RE: [PATCH 5/6] pci: layerscape: Add the EP mode support. -Original Message- From: Kishon Vijay Abraham I Sent: 2018年10月26日 13:29 To: Xiaowei Bao ; bhelg...@google.com; robh...@kernel.org; mark.rutl...@arm.com; shawn...@kernel.org; Leo Li ; lorenzo.pieral...@arm.com; a...@arndb.de; gre...@linuxfoundation.org; M.h. Lian ; Mingkai Hu ; Roy Zang ; kstew...@linuxfoundation.org; cyrille.pitc...@free-electrons.com; pombreda...@nexb.com; shawn@rock-chips.com; niklas.cas...@axis.com; linux-...@vger.kernel.org; devicet...@vger.kernel.org; linux-ker...@vger.kernel.org; linux-arm-ker...@lists.infradead.org; linuxppc-dev@lists.ozlabs.org Subject: Re: [PATCH 5/6] pci: layerscape: Add the EP mode support. Hi, On Thursday 25 October 2018 04:39 PM, Xiaowei Bao wrote: > Add the PCIe EP mode support for layerscape platform. > > Signed-off-by: Xiaowei Bao > --- > drivers/pci/controller/dwc/Makefile|2 +- > drivers/pci/controller/dwc/pci-layerscape-ep.c | 161 > > 2 files changed, 162 insertions(+), 1 deletions(-) create mode > 100644 drivers/pci/controller/dwc/pci-layerscape-ep.c > > diff --git a/drivers/pci/controller/dwc/Makefile > b/drivers/pci/controller/dwc/Makefile > index 5d2ce72..b26d617 100644 > --- a/drivers/pci/controller/dwc/Makefile > +++ b/drivers/pci/controller/dwc/Makefile > @@ -8,7 +8,7 @@ obj-$(CONFIG_PCI_EXYNOS) += pci-exynos.o > obj-$(CONFIG_PCI_IMX6) += pci-imx6.o > obj-$(CONFIG_PCIE_SPEAR13XX) += pcie-spear13xx.o > obj-$(CONFIG_PCI_KEYSTONE) += pci-keystone-dw.o pci-keystone.o > -obj-$(CONFIG_PCI_LAYERSCAPE) += pci-layerscape.o > +obj-$(CONFIG_PCI_LAYERSCAPE) += pci-layerscape.o pci-layerscape-ep.o > obj-$(CONFIG_PCIE_QCOM) += pcie-qcom.o > obj-$(CONFIG_PCIE_ARMADA_8K) += pcie-armada8k.o > obj-$(CONFIG_PCIE_ARTPEC6) += pcie-artpec6.o diff --git > a/drivers/pci/controller/dwc/pci-layerscape-ep.c > b/drivers/pci/controller/dwc/pci-layerscape-ep.c > new file mode 100644 > index 000..3b33bbc > --- /dev/null > +++ b/drivers/pci/controller/dwc/pci-layerscape-ep.c > @@ -0,0 +1,161 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* > + * PCIe controller EP driver for Freescale Layerscape SoCs > + * > + * Copyright (C) 2018 NXP Semiconductor. > + * > + * Author: Xiaowei Bao */ > + > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > + > +#include "pcie-designware.h" > + > +#define PCIE_DBI2_OFFSET 0x1000 /* DBI2 base address*/ The base address should come from dt. > + > +struct ls_pcie_ep { > + struct dw_pcie *pci; > +}; > + > +#define to_ls_pcie_ep(x) dev_get_drvdata((x)->dev) > + > +static bool ls_pcie_is_bridge(struct ls_pcie_ep *pcie) { > + struct dw_pcie *pci = pcie->pci; > + u32 header_type; > + > + header_type = ioread8(pci->dbi_base + PCI_HEADER_TYPE); > + header_type &= 0x7f; > + > + return header_type == PCI_HEADER_TYPE_BRIDGE; } > + > +static int ls_pcie_establish_link(struct dw_pcie *pci) { > + return 0; > +} There should be some way by which EP should tell RC that it is not configured yet. Are there no bits to control LTSSM state initialization or Configuration retry status enabling? [Xiaowei Bao] There have not bits to control LTSSM state to tell the RC it is configured. The start link is auto completed. [Xiaowei Bao] Hi Kishon, is there any advice? > + > +static const struct dw_pcie_ops ls_pcie_ep_ops = { > + .start_link = ls_pcie_establish_link, }; > + > +static const struct of_device_id ls_pcie_ep_of_match[] = { > + { .compatible = "fsl,ls-pcie-ep",}, > + { }, > +}; > + > +static void ls_pcie_ep_init(struct dw_pcie_ep *ep) { > + struct dw_pcie *pci = to_dw_pcie_from_ep(ep); > + struct pci_epc *epc = ep->epc; > + enum pci_barno bar; > + > + for (bar = BAR_0; bar <= BAR_5; bar++) > + dw_pcie_ep_reset_bar(pci, bar); > + > + epc->features |= EPC_FEATURE_NO_LINKUP_NOTIFIER; } > + > +static int ls_pcie_ep_raise_irq(struct dw_pcie_ep *ep, u8 func_no, > + enum pci_epc_irq_type type, u16 > interrupt_num) { > + struct dw_pcie *pci = to_dw_pcie_from_ep(ep); > + > + switch (type) { > + case PCI_EPC_IRQ_LEGACY: > + return dw_pcie_ep_raise_legacy_irq(ep, func_no); > + case PCI_EPC_IRQ_MSI: > +
Re: [PATCH v2 2/2] kgdb: Fix kgdb_roundup_cpus() for arches who used smp_call_function()
Hi Douglas, Thank you for the patch! Perhaps something to improve: [auto build test WARNING on kgdb/kgdb-next] [also build test WARNING on v4.19 next-20181030] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Douglas-Anderson/kgdb-Fix-kgdb_roundup_cpus/20181031-063733 base: https://git.kernel.org/pub/scm/linux/kernel/git/jwessel/kgdb.git kgdb-next reproduce: make htmldocs All warnings (new ones prefixed by >>): WARNING: convert(1) not found, for SVG to PDF conversion install ImageMagick (https://www.imagemagick.org) include/linux/srcu.h:175: warning: Function parameter or member 'p' not described in 'srcu_dereference_notrace' include/linux/srcu.h:175: warning: Function parameter or member 'sp' not described in 'srcu_dereference_notrace' include/linux/gfp.h:1: warning: no structured comments found >> include/linux/kgdb.h:188: warning: Function parameter or member 'ignored' >> not described in 'kgdb_call_nmi_hook' include/net/cfg80211.h:4869: warning: Excess function parameter 'ptr' description in 'reg_query_regdb_wmm' include/net/cfg80211.h:4869: warning: Excess function parameter 'ptr' description in 'reg_query_regdb_wmm' include/net/cfg80211.h:4869: warning: Excess function parameter 'ptr' description in 'reg_query_regdb_wmm' include/net/cfg80211.h:4869: warning: Excess function parameter 'ptr' description in 'reg_query_regdb_wmm' include/net/cfg80211.h:4869: warning: Excess function parameter 'ptr' description in 'reg_query_regdb_wmm' include/net/cfg80211.h:4869: warning: Excess function parameter 'ptr' description in 'reg_query_regdb_wmm' include/net/cfg80211.h:4869: warning: Excess function parameter 'ptr' description in 'reg_query_regdb_wmm' include/net/cfg80211.h:4869: warning: Excess function parameter 'ptr' description in 'reg_query_regdb_wmm' include/net/cfg80211.h:4869: warning: Excess function parameter 'ptr' description in 'reg_query_regdb_wmm' include/net/cfg80211.h:4869: warning: Excess function parameter 'ptr' description in 'reg_query_regdb_wmm' include/net/cfg80211.h:4869: warning: Excess function parameter 'ptr' description in 'reg_query_regdb_wmm' include/net/cfg80211.h:4869: warning: Excess function parameter 'ptr' description in 'reg_query_regdb_wmm' include/net/cfg80211.h:4381: warning: Function parameter or member 'wext.ibss' not described in 'wireless_dev' include/net/cfg80211.h:4381: warning: Function parameter or member 'wext.connect' not described in 'wireless_dev' include/net/cfg80211.h:4381: warning: Function parameter or member 'wext.keys' not described in 'wireless_dev' include/net/cfg80211.h:4381: warning: Function parameter or member 'wext.ie' not described in 'wireless_dev' include/net/cfg80211.h:4381: warning: Function parameter or member 'wext.ie_len' not described in 'wireless_dev' include/net/cfg80211.h:4381: warning: Function parameter or member 'wext.bssid' not described in 'wireless_dev' include/net/cfg80211.h:4381: warning: Function parameter or member 'wext.ssid' not described in 'wireless_dev' include/net/cfg80211.h:4381: warning: Function parameter or member 'wext.default_key' not described in 'wireless_dev' include/net/cfg80211.h:4381: warning: Function parameter or member 'wext.default_mgmt_key' not described in 'wireless_dev' include/net/cfg80211.h:4381: warning: Function parameter or member 'wext.prev_bssid_valid' not described in 'wireless_dev' include/net/cfg80211.h:4869: warning: Excess function parameter 'ptr' description in 'reg_query_regdb_wmm' include/net/cfg80211.h:4869: warning: Excess function parameter 'ptr' description in 'reg_query_regdb_wmm' include/net/cfg80211.h:4869: warning: Excess function parameter 'ptr' description in 'reg_query_regdb_wmm' include/net/cfg80211.h:4869: warning: Excess function parameter 'ptr' description in 'reg_query_regdb_wmm' include/net/cfg80211.h:4869: warning: Excess function parameter 'ptr' description in 'reg_query_regdb_wmm' include/net/cfg80211.h:4869: warning: Excess function parameter 'ptr' description in 'reg_query_regdb_wmm' include/net/cfg80211.h:4869: warning: Excess function parameter 'ptr' description in 'reg_query_regdb_wmm' include/net/cfg80211.h:4869: warning: Excess function parameter 'ptr' description in 'reg_query_regdb_wmm' include/net/cfg80211.h:4869: warning: Excess function parameter 'ptr' description in 'reg_query_regdb_wmm' include/net/cfg80211.h:4869: warning: Excess function parameter 'ptr' description in 'reg_query_regdb_wmm' include/net/cfg80211.h:4869: warning: Excess function parameter 'ptr' description in 'reg_query_regdb_wmm' include/net/cfg80211.h:4869: warning: Excess function parameter 'ptr' description in 'reg_query_regdb_wmm' include/net/cfg80211.h:4869: warning: Excess function parameter
[PATCH] powerpc/xmon: Relax frame size for clang
When building with clang (8 trunk, 7.0 release) the frame size limit is hit: arch/powerpc/xmon/xmon.c:452:12: warning: stack frame size of 2576 bytes in function 'xmon_core' [-Wframe-larger-than=] Some investigation by Naveen indicates this is due to clang saving the addresses to printf format strings on the stack. While this issue is investigated, bump up the frame size limit for xmon when building with clang. Link: https://github.com/ClangBuiltLinux/linux/issues/252 Signed-off-by: Joel Stanley --- arch/powerpc/xmon/Makefile | 6 ++ 1 file changed, 6 insertions(+) diff --git a/arch/powerpc/xmon/Makefile b/arch/powerpc/xmon/Makefile index 69e7fb47bcaa..878f9c1d3615 100644 --- a/arch/powerpc/xmon/Makefile +++ b/arch/powerpc/xmon/Makefile @@ -11,6 +11,12 @@ UBSAN_SANITIZE := n ORIG_CFLAGS := $(KBUILD_CFLAGS) KBUILD_CFLAGS = $(subst $(CC_FLAGS_FTRACE),,$(ORIG_CFLAGS)) +ifdef CONFIG_CC_IS_CLANG +# clang stores addresses on the stack causing the frame size to blow +# out. See https://github.com/ClangBuiltLinux/linux/issues/252 +KBUILD_CFLAGS += -Wframe-larger-than=4096 +endif + ccflags-$(CONFIG_PPC64) := $(NO_MINIMAL_TOC) obj-y += xmon.o nonstdio.o spr_access.o -- 2.19.1
Re: [PATCH 2/2] selftests/powerpc: Skip test instead of failing
On 10/30/2018 08:16 AM, Michael Ellerman wrote: > Thiago Jung Bauermann writes: > >> Breno Leitao writes: >> >>> Hi Tyrel, >>> >>> On 10/23/2018 05:41 PM, Tyrel Datwyler wrote: > --- a/tools/testing/selftests/powerpc/ptrace/core-pkey.c > +++ b/tools/testing/selftests/powerpc/ptrace/core-pkey.c > @@ -352,17 +352,11 @@ static int write_core_pattern(const char > *core_pattern) > FILE *f; > > f = fopen(core_pattern_file, "w"); > - if (!f) { > - perror("Error writing to core_pattern file"); > - return TEST_FAIL; > - } > + SKIP_IF(!f); > > ret = fwrite(core_pattern, 1, len, f); > fclose(f); > - if (ret != len) { > - perror("Error writing to core_pattern file"); > - return TEST_FAIL; > - } > + SKIP_IF(ret != len); >>> If we don't have proper privileges we should fail on the open, right? So wouldn't we still want to fail on the write if something goes wrong? >>> >>> That is a good point. Should the test fail or skip if it is not possible >>> to create the infrastructure to run the core test? >>> >>> Trying to find the answer in the current test sets, I find tests where >>> the self test skips if the test environment is not able to be set up, as >>> for example, when a memory allocation fails. >>> >>> File: tools/testing/selftests/powerpc/alignment/alignment_handler.c >>> >>> ci1 = mmap(NULL, bufsize, PROT_WRITE, MAP_SHARED, >>>fd, bufsize); >>> if ((ci0 == MAP_FAILED) || (ci1 == MAP_FAILED)) { >>> printf("\n"); >>> perror("mmap failed"); >>> SKIP_IF(1); >>> } >> >> I think TEST_FAIL means the test was able to exercise the feature >> and found a problem with it. In this case, the test wasn't able to >> exercise the feature so it's not appropriate. >> >> Ideally, there should be a TEST_ERROR result for a case like this where >> an unexpected problem prevented the testcase from exercising the >> feature. >> >> If we're to use the an existing result then I vote for SKIP_IF. > > Yeah I agree. > > See for example some of the TM tests, which skip if TM is not available. > Or the alignment test which skips if it can't open /dev/fb0. > > In this case it should print "you need to be root to run this" and then > skip. Agreed that there should be some indicator of why we are skipping the test. My original point I was trying to make was that I thought skipping a failed open was okay because we are likely not root. However, I wasn't sure that a failed write was okay to skip as that could be an indicator that something has actually been broken. -Tyrel > > cheers >
Re: [PATCH 0/7] serial: Finish kgdb on qcom_geni; fix many lockdep splats w/ kgdb
Hi, On Tue, Oct 30, 2018 at 4:56 AM Daniel Thompson wrote: > > On Mon, Oct 29, 2018 at 11:07:00AM -0700, Douglas Anderson wrote: > > Looking back, this is pretty much two series squashed that could be > > treated indepdently. The first is a serial series and the second is a > > kgdb series. > > Indeed. > > I couldn't work out the link between the first 5 patches and the last 2 > until I read this... > > Is anything in the 01->05 patch set even related to kgdb? From the stack > traces it looks to me like the lock dep warning would trigger for any > sysrq. I think separating into two threads for v2 would be sensible. Yes, sorry about the mess. Splitting this into two series makes the most sense. Then I can focus more on trying to get the CCs right and people can just get the patches that matter to them. OK, I've sent v2 of both series out now. Please yell if you can't find them for whatever reason. -Doug -Doug
Re: [PATCH 7/7] kgdb: Remove irq flags and local_irq_enable/disable from roundup
Hi, On Tue, Oct 30, 2018 at 4:46 AM Daniel Thompson wrote: > > On Mon, Oct 29, 2018 at 11:07:07AM -0700, Douglas Anderson wrote: > > The function kgdb_roundup_cpus() was passed a parameter that was > > documented as: > > > > > the flags that will be used when restoring the interrupts. There is > > > local_irq_save() call before kgdb_roundup_cpus(). > > > > Nobody used those flags. Anyone who wanted to temporarily turn on > > interrupts just did local_irq_enable() and local_irq_disable() without > > looking at them. So we can definitely remove the flags. > > On the whole I'd rather that this change... > > > > Speaking of calling local_irq_enable(), it seems like a bad idea and > > it caused a nice splat on my system with lockdep turned on. > > Specifically it hit: > > DEBUG_LOCKS_WARN_ON(current->hardirq_context) > > ... and fixes for this this were in separate patches. They don't appear > especially related. Agreed that this is cleaner. Done for v2. -Doug
Re: [PATCH 6/7] smp: Don't yell about IRQs disabled in kgdb_roundup_cpus()
Daniel, On Tue, Oct 30, 2018 at 2:42 AM Daniel Thompson wrote: > > On Mon, Oct 29, 2018 at 11:07:06AM -0700, Douglas Anderson wrote: > > In kgdb_roundup_cpus() we've got code that looks like: > > local_irq_enable(); > > smp_call_function(kgdb_call_nmi_hook, NULL, 0); > > local_irq_disable(); > > > > In certain cases when we drop into kgdb (like with sysrq-g on a serial > > console) we'll get a big yell that looks like: > > > > sysrq: SysRq : DEBUG > > [ cut here ] > > DEBUG_LOCKS_WARN_ON(current->hardirq_context) > > WARNING: CPU: 0 PID: 0 at .../kernel/locking/lockdep.c:2875 > > lockdep_hardirqs_on+0xf0/0x160 > > CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.19.0 #27 > > pstate: 604003c9 (nZCv DAIF +PAN -UAO) > > pc : lockdep_hardirqs_on+0xf0/0x160 > > ... > > Call trace: > >lockdep_hardirqs_on+0xf0/0x160 > >trace_hardirqs_on+0x188/0x1ac > >kgdb_roundup_cpus+0x14/0x3c > >kgdb_cpu_enter+0x53c/0x5cc > >kgdb_handle_exception+0x180/0x1d4 > >kgdb_compiled_brk_fn+0x30/0x3c > >brk_handler+0x134/0x178 > >do_debug_exception+0xfc/0x178 > >el1_dbg+0x18/0x78 > >kgdb_breakpoint+0x34/0x58 > >sysrq_handle_dbg+0x54/0x5c > >__handle_sysrq+0x114/0x21c > >handle_sysrq+0x30/0x3c > >qcom_geni_serial_isr+0x2dc/0x30c > > ... > > ... > > irq event stamp: ...45 > > hardirqs last enabled at (...44): [...] __do_softirq+0xd8/0x4e4 > > hardirqs last disabled at (...45): [...] el1_irq+0x74/0x130 > > softirqs last enabled at (...42): [...] _local_bh_enable+0x2c/0x34 > > softirqs last disabled at (...43): [...] irq_exit+0xa8/0x100 > > ---[ end trace adf21f830c46e638 ]--- > > > > Let's add kgdb to the list of reasons not to warn in > > smp_call_function_many(). That will allow us (in a future patch) to > > stop calling local_irq_enable() which will get rid of the original > > splat. > > > > NOTE: with this change comes the obvious question: will we start > > deadlocking more often now when we drop into the debugger. I can't > > say that for sure one way or the other, but the fact that we do the > > same logic for "oops_in_progress" makes me feel like it shouldn't > > happen too often. Also note that the old logic of turning on > > interrupts temporarily wasn't exactly safe since (I presume) that > > could have violated spin_lock_irqsave() semantics and ended up with a > > deadlock of its own. > > This is part of the code to bring all the cores to a halt and since > the other cores are still running kgdb isn't yet able to use the fact > all the CPUs are halted to bend the rules. It is better for this code > to play by the rules if it can. > > Is is possible to get the roundup functions to use a private csd > alongside smp_call_function_single_async()? We could add a helper > function to the debug core to avoid having to add cpu_online loops into > every kgdb_roundup_cpus() implementaton. Exactly the kind of helpful suggestion I was looking for. Thank you very much! See v2 and hopefully it matches what you were thinking of. -Doug
[PATCH v2 2/2] kgdb: Fix kgdb_roundup_cpus() for arches who used smp_call_function()
When I had lockdep turned on and dropped into kgdb I got a nice splat on my system. Specifically it hit: DEBUG_LOCKS_WARN_ON(current->hardirq_context) Specifically it looked like this: sysrq: SysRq : DEBUG [ cut here ] DEBUG_LOCKS_WARN_ON(current->hardirq_context) WARNING: CPU: 0 PID: 0 at .../kernel/locking/lockdep.c:2875 lockdep_hardirqs_on+0xf0/0x160 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.19.0 #27 pstate: 604003c9 (nZCv DAIF +PAN -UAO) pc : lockdep_hardirqs_on+0xf0/0x160 ... Call trace: lockdep_hardirqs_on+0xf0/0x160 trace_hardirqs_on+0x188/0x1ac kgdb_roundup_cpus+0x14/0x3c kgdb_cpu_enter+0x53c/0x5cc kgdb_handle_exception+0x180/0x1d4 kgdb_compiled_brk_fn+0x30/0x3c brk_handler+0x134/0x178 do_debug_exception+0xfc/0x178 el1_dbg+0x18/0x78 kgdb_breakpoint+0x34/0x58 sysrq_handle_dbg+0x54/0x5c __handle_sysrq+0x114/0x21c handle_sysrq+0x30/0x3c qcom_geni_serial_isr+0x2dc/0x30c ... ... irq event stamp: ...45 hardirqs last enabled at (...44): [...] __do_softirq+0xd8/0x4e4 hardirqs last disabled at (...45): [...] el1_irq+0x74/0x130 softirqs last enabled at (...42): [...] _local_bh_enable+0x2c/0x34 softirqs last disabled at (...43): [...] irq_exit+0xa8/0x100 ---[ end trace adf21f830c46e638 ]--- Looking closely at it, it seems like a really bad idea to be calling local_irq_enable() in kgdb_roundup_cpus(). If nothing else that seems like it could violate spinlock semantics and cause a deadlock. Instead, let's use a private csd alongside smp_call_function_single_async() to round up the other CPUs. Using smp_call_function_single_async() doesn't require interrupts to be enabled so we can remove the offending bit of code. In order to avoid duplicating this across all the architectures that use the default kgdb_roundup_cpus(), we'll add a "weak" implementation to debug_core.c. Looking at all the people who previously had copies of this code, there were a few variants. I've attempted to keep the variants working like they used to. Specifically: * For arch/arc we passed NULL to kgdb_nmicallback() instead of get_irq_regs(). * For arch/mips there was a bit of extra code around kgdb_nmicallback() Suggested-by: Daniel Thompson Signed-off-by: Douglas Anderson --- Changes in v2: - Removing irq flags separated from fixing lockdep splat. - Don't use smp_call_function (Daniel). arch/arc/kernel/kgdb.c | 10 ++ arch/arm/kernel/kgdb.c | 12 arch/arm64/kernel/kgdb.c | 12 arch/hexagon/kernel/kgdb.c | 27 --- arch/mips/kernel/kgdb.c| 9 + arch/powerpc/kernel/kgdb.c | 4 ++-- arch/sh/kernel/kgdb.c | 12 include/linux/kgdb.h | 15 +-- kernel/debug/debug_core.c | 36 9 files changed, 54 insertions(+), 83 deletions(-) diff --git a/arch/arc/kernel/kgdb.c b/arch/arc/kernel/kgdb.c index 0932851028e0..68d9fe4b5aa7 100644 --- a/arch/arc/kernel/kgdb.c +++ b/arch/arc/kernel/kgdb.c @@ -192,18 +192,12 @@ void kgdb_arch_set_pc(struct pt_regs *regs, unsigned long ip) instruction_pointer(regs) = ip; } -static void kgdb_call_nmi_hook(void *ignored) +void kgdb_call_nmi_hook(void *ignored) { + /* Default implementation passes get_irq_regs() but we don't */ kgdb_nmicallback(raw_smp_processor_id(), NULL); } -void kgdb_roundup_cpus(void) -{ - local_irq_enable(); - smp_call_function(kgdb_call_nmi_hook, NULL, 0); - local_irq_disable(); -} - struct kgdb_arch arch_kgdb_ops = { /* breakpoint instruction: TRAP_S 0x3 */ #ifdef CONFIG_CPU_BIG_ENDIAN diff --git a/arch/arm/kernel/kgdb.c b/arch/arm/kernel/kgdb.c index f21077b077be..d9a69e941463 100644 --- a/arch/arm/kernel/kgdb.c +++ b/arch/arm/kernel/kgdb.c @@ -170,18 +170,6 @@ static struct undef_hook kgdb_compiled_brkpt_hook = { .fn = kgdb_compiled_brk_fn }; -static void kgdb_call_nmi_hook(void *ignored) -{ - kgdb_nmicallback(raw_smp_processor_id(), get_irq_regs()); -} - -void kgdb_roundup_cpus(void) -{ - local_irq_enable(); - smp_call_function(kgdb_call_nmi_hook, NULL, 0); - local_irq_disable(); -} - static int __kgdb_notify(struct die_args *args, unsigned long cmd) { struct pt_regs *regs = args->regs; diff --git a/arch/arm64/kernel/kgdb.c b/arch/arm64/kernel/kgdb.c index 12c339ff6e75..da880247c734 100644 --- a/arch/arm64/kernel/kgdb.c +++ b/arch/arm64/kernel/kgdb.c @@ -284,18 +284,6 @@ static struct step_hook kgdb_step_hook = { .fn = kgdb_step_brk_fn }; -static void kgdb_call_nmi_hook(void *ignored) -{ - kgdb_nmicallback(raw_smp_processor_id(), get_irq_regs()); -} - -void kgdb_roundup_cpus(void) -{ - local_irq_enable(); - smp_call_function(kgdb_call_nmi_hook, NULL, 0); - local_irq_disable(); -} - static int __kgdb_notify(struct die_args *args, unsigned long cmd) {
[PATCH v2 1/2] kgdb: Remove irq flags from roundup
The function kgdb_roundup_cpus() was passed a parameter that was documented as: > the flags that will be used when restoring the interrupts. There is > local_irq_save() call before kgdb_roundup_cpus(). Nobody used those flags. Anyone who wanted to temporarily turn on interrupts just did local_irq_enable() and local_irq_disable() without looking at them. So we can definitely remove the flags. Signed-off-by: Douglas Anderson --- Changes in v2: - Removing irq flags separated from fixing lockdep splat. arch/arc/kernel/kgdb.c | 2 +- arch/arm/kernel/kgdb.c | 2 +- arch/arm64/kernel/kgdb.c | 2 +- arch/hexagon/kernel/kgdb.c | 9 ++--- arch/mips/kernel/kgdb.c| 2 +- arch/powerpc/kernel/kgdb.c | 2 +- arch/sh/kernel/kgdb.c | 2 +- arch/sparc/kernel/smp_64.c | 2 +- arch/x86/kernel/kgdb.c | 9 ++--- include/linux/kgdb.h | 9 ++--- kernel/debug/debug_core.c | 2 +- 11 files changed, 14 insertions(+), 29 deletions(-) diff --git a/arch/arc/kernel/kgdb.c b/arch/arc/kernel/kgdb.c index 9a3c34af2ae8..0932851028e0 100644 --- a/arch/arc/kernel/kgdb.c +++ b/arch/arc/kernel/kgdb.c @@ -197,7 +197,7 @@ static void kgdb_call_nmi_hook(void *ignored) kgdb_nmicallback(raw_smp_processor_id(), NULL); } -void kgdb_roundup_cpus(unsigned long flags) +void kgdb_roundup_cpus(void) { local_irq_enable(); smp_call_function(kgdb_call_nmi_hook, NULL, 0); diff --git a/arch/arm/kernel/kgdb.c b/arch/arm/kernel/kgdb.c index caa0dbe3dc61..f21077b077be 100644 --- a/arch/arm/kernel/kgdb.c +++ b/arch/arm/kernel/kgdb.c @@ -175,7 +175,7 @@ static void kgdb_call_nmi_hook(void *ignored) kgdb_nmicallback(raw_smp_processor_id(), get_irq_regs()); } -void kgdb_roundup_cpus(unsigned long flags) +void kgdb_roundup_cpus(void) { local_irq_enable(); smp_call_function(kgdb_call_nmi_hook, NULL, 0); diff --git a/arch/arm64/kernel/kgdb.c b/arch/arm64/kernel/kgdb.c index a20de58061a8..12c339ff6e75 100644 --- a/arch/arm64/kernel/kgdb.c +++ b/arch/arm64/kernel/kgdb.c @@ -289,7 +289,7 @@ static void kgdb_call_nmi_hook(void *ignored) kgdb_nmicallback(raw_smp_processor_id(), get_irq_regs()); } -void kgdb_roundup_cpus(unsigned long flags) +void kgdb_roundup_cpus(void) { local_irq_enable(); smp_call_function(kgdb_call_nmi_hook, NULL, 0); diff --git a/arch/hexagon/kernel/kgdb.c b/arch/hexagon/kernel/kgdb.c index 16c24b22d0b2..012e0e230ac2 100644 --- a/arch/hexagon/kernel/kgdb.c +++ b/arch/hexagon/kernel/kgdb.c @@ -119,17 +119,12 @@ void kgdb_arch_set_pc(struct pt_regs *regs, unsigned long pc) /** * kgdb_roundup_cpus - Get other CPUs into a holding pattern - * @flags: Current IRQ state * * On SMP systems, we need to get the attention of the other CPUs * and get them be in a known state. This should do what is needed * to get the other CPUs to call kgdb_wait(). Note that on some arches, * the NMI approach is not used for rounding up all the CPUs. For example, - * in case of MIPS, smp_call_function() is used to roundup CPUs. In - * this case, we have to make sure that interrupts are enabled before - * calling smp_call_function(). The argument to this function is - * the flags that will be used when restoring the interrupts. There is - * local_irq_save() call before kgdb_roundup_cpus(). + * in case of MIPS, smp_call_function() is used to roundup CPUs. * * On non-SMP systems, this is not called. */ @@ -139,7 +134,7 @@ static void hexagon_kgdb_nmi_hook(void *ignored) kgdb_nmicallback(raw_smp_processor_id(), get_irq_regs()); } -void kgdb_roundup_cpus(unsigned long flags) +void kgdb_roundup_cpus(void) { local_irq_enable(); smp_call_function(hexagon_kgdb_nmi_hook, NULL, 0); diff --git a/arch/mips/kernel/kgdb.c b/arch/mips/kernel/kgdb.c index eb6c0d582626..2b05effc17b4 100644 --- a/arch/mips/kernel/kgdb.c +++ b/arch/mips/kernel/kgdb.c @@ -219,7 +219,7 @@ static void kgdb_call_nmi_hook(void *ignored) set_fs(old_fs); } -void kgdb_roundup_cpus(unsigned long flags) +void kgdb_roundup_cpus(void) { local_irq_enable(); smp_call_function(kgdb_call_nmi_hook, NULL, 0); diff --git a/arch/powerpc/kernel/kgdb.c b/arch/powerpc/kernel/kgdb.c index 59c578f865aa..b0e804844be0 100644 --- a/arch/powerpc/kernel/kgdb.c +++ b/arch/powerpc/kernel/kgdb.c @@ -124,7 +124,7 @@ static int kgdb_call_nmi_hook(struct pt_regs *regs) } #ifdef CONFIG_SMP -void kgdb_roundup_cpus(unsigned long flags) +void kgdb_roundup_cpus(void) { smp_send_debugger_break(); } diff --git a/arch/sh/kernel/kgdb.c b/arch/sh/kernel/kgdb.c index 4f04c6638a4d..cc57630f6bf2 100644 --- a/arch/sh/kernel/kgdb.c +++ b/arch/sh/kernel/kgdb.c @@ -319,7 +319,7 @@ static void kgdb_call_nmi_hook(void *ignored) kgdb_nmicallback(raw_smp_processor_id(), get_irq_regs()); } -void kgdb_roundup_cpus(unsigned long flags) +void kgdb_roundup_cpus(void) { local_irq_enable(); smp_call_function(kgdb_call_nmi_hook,
[PATCH v2 0/2] kgdb: Fix kgdb_roundup_cpus()
This series was originally part of the series ("serial: Finish kgdb on qcom_geni; fix many lockdep splats w/ kgdb") but it made sense to split it up. It's believed that dropping into kgdb should be more robust once these patches are applied. Changes in v2: - Removing irq flags separated from fixing lockdep splat. - Don't use smp_call_function (Daniel). Douglas Anderson (2): kgdb: Remove irq flags from roundup kgdb: Fix kgdb_roundup_cpus() for arches who used smp_call_function() arch/arc/kernel/kgdb.c | 10 ++ arch/arm/kernel/kgdb.c | 12 arch/arm64/kernel/kgdb.c | 12 arch/hexagon/kernel/kgdb.c | 32 arch/mips/kernel/kgdb.c| 9 + arch/powerpc/kernel/kgdb.c | 6 +++--- arch/sh/kernel/kgdb.c | 12 arch/sparc/kernel/smp_64.c | 2 +- arch/x86/kernel/kgdb.c | 9 ++--- include/linux/kgdb.h | 22 ++ kernel/debug/debug_core.c | 38 +- 11 files changed, 60 insertions(+), 104 deletions(-) -- 2.19.1.568.g152ad8e336-goog
Re: [PATCH v2 1/2] kbuild: replace cc-name test with CONFIG_CC_IS_CLANG
On Tue, 30 Oct 2018 at 23:58, Masahiro Yamada wrote: > > Evaluating cc-name invokes the compiler every time even when you are > not compiling anything, like 'make help'. This is not efficient. > > The compiler type has been already detected in the Kconfig stage. > Use CONFIG_CC_IS_CLANG, instead. Thanks, I didn't know about this. > > Signed-off-by: Masahiro Yamada > Acked-by: Michael Ellerman (powerpc) Acked-by: Joel Stanley Cheers, Joel
Re: [PATCH] hwmon: (ibmpowernv) Remove bogus __init annotations
On Sun, Oct 28, 2018 at 06:16:51PM +0100, Geert Uytterhoeven wrote: > If gcc decides not to inline make_sensor_label(): > > WARNING: vmlinux.o(.text+0x4df549c): Section mismatch in reference from > the function .create_device_attrs() to the function > .init.text:.make_sensor_label() > The function .create_device_attrs() references > the function __init .make_sensor_label(). > This is often because .create_device_attrs lacks a __init > annotation or the annotation of .make_sensor_label is wrong. > > As .probe() can be called after freeing of __init memory, all __init > annotiations in the driver are bogus, and should be removed. > > Signed-off-by: Geert Uytterhoeven Applied. Thanks, Guenter > --- > Compile-tested only. > --- > drivers/hwmon/ibmpowernv.c | 7 +++ > 1 file changed, 3 insertions(+), 4 deletions(-) > > diff --git a/drivers/hwmon/ibmpowernv.c b/drivers/hwmon/ibmpowernv.c > index 0ccca87f527191dc..293dd1c6c7b36ef2 100644 > --- a/drivers/hwmon/ibmpowernv.c > +++ b/drivers/hwmon/ibmpowernv.c > @@ -181,7 +181,7 @@ static ssize_t show_label(struct device *dev, struct > device_attribute *devattr, > return sprintf(buf, "%s\n", sdata->label); > } > > -static int __init get_logical_cpu(int hwcpu) > +static int get_logical_cpu(int hwcpu) > { > int cpu; > > @@ -192,9 +192,8 @@ static int __init get_logical_cpu(int hwcpu) > return -ENOENT; > } > > -static void __init make_sensor_label(struct device_node *np, > - struct sensor_data *sdata, > - const char *label) > +static void make_sensor_label(struct device_node *np, > + struct sensor_data *sdata, const char *label) > { > u32 id; > size_t n;
Re: [PATCH v2 1/2] kbuild: replace cc-name test with CONFIG_CC_IS_CLANG
Hi Masahiro, On Tue, Oct 30, 2018 at 10:26:33PM +0900, Masahiro Yamada wrote: > Evaluating cc-name invokes the compiler every time even when you are > not compiling anything, like 'make help'. This is not efficient. > > The compiler type has been already detected in the Kconfig stage. > Use CONFIG_CC_IS_CLANG, instead. > > Signed-off-by: Masahiro Yamada > Acked-by: Michael Ellerman (powerpc) Looks good to me: Acked-by: Paul Burton (MIPS) Thanks, Paul
Re: [PATCH 6/7] smp: Don't yell about IRQs disabled in kgdb_roundup_cpus()
On Mon, Oct 29, 2018 at 11:07:06AM -0700, Douglas Anderson wrote: > In kgdb_roundup_cpus() we've got code that looks like: > local_irq_enable(); > smp_call_function(kgdb_call_nmi_hook, NULL, 0); > local_irq_disable(); > Let's add kgdb to the list of reasons not to warn in > smp_call_function_many(). That will allow us (in a future patch) to > stop calling local_irq_enable() which will get rid of the original > splat. > > NOTE: with this change comes the obvious question: will we start > deadlocking more often now when we drop into the debugger. I can't > say that for sure one way or the other, but the fact that we do the > same logic for "oops_in_progress" makes me feel like it shouldn't > happen too often. Also note that the old logic of turning on > interrupts temporarily wasn't exactly safe since (I presume) that > could have violated spin_lock_irqsave() semantics and ended up with a > deadlock of its own. How is any of that not utterly and terminally broken? > @@ -413,7 +414,8 @@ void smp_call_function_many(const struct cpumask *mask, >* can't happen. >*/ > WARN_ON_ONCE(cpu_online(this_cpu) && irqs_disabled() > - && !oops_in_progress && !early_boot_irqs_disabled); > + && !oops_in_progress && !early_boot_irqs_disabled > + && !in_dbg_master()); > > /* Try to fastpath. So, what's a CPU they want? Ignoring this one. */ > cpu = cpumask_first_and(mask, cpu_online_mask); Not a fan of this. There is a distinct difference between oops_in_progress and dropping into kgdb in that you don't ever expect to return/survive oopses, whereas we do expect to survive kgdb. Also, how does kgdb work at all without actual NMIs ? So no, NAK on this.
[PATCH] arch: fix without checked-return value with lseek
lseek should have returned value but we miss it maybe. This is detected by Coverity scan: CID: 1440481 Signed-off-by: Bo YU --- arch/powerpc/boot/addnote.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/boot/addnote.c b/arch/powerpc/boot/addnote.c index 9d9f6f334d3c..3da3e2b1b51b 100644 --- a/arch/powerpc/boot/addnote.c +++ b/arch/powerpc/boot/addnote.c @@ -223,7 +223,11 @@ main(int ac, char **av) PUT_16(E_PHNUM, np + 2); /* write back */ - lseek(fd, (long) 0, SEEK_SET); + i = lseek(fd, (long) 0, SEEK_SET); + if (i < 0) { + perror("lseek"); + exit(1); + } i = write(fd, buf, n); if (i < 0) { perror("write"); -- 2.11.0
Re: [PATCH 0/7] serial: Finish kgdb on qcom_geni; fix many lockdep splats w/ kgdb
On Mon, Oct 29, 2018 at 11:07:00AM -0700, Douglas Anderson wrote: > I started out this series trying to make sysrq work over the serial > console on qcom_geni_serial, then fell into a rat's nest. > > To solve the deadlock I faced when enabling sysrq I tried to borrow > code from '8250_port.c' which avoided grabbing the port lock in > console_write(). ...but since these days I try to run with lockdep on > all the time, I found it caused an annoying lockdep splat (which I > also reproduced on my rk3399 board). ...so I instead changed my > qcom_geni_serial solution to borrow code from 'msm_serial.c' > > I wasn't super happy with the solution in 'msm_serial.c' though. I > don't like releasing the spinlock there. Not only is it ugly but it > means we are unlocking / re-locking _all the time_ even though sysrq > characters are rare. ...so I came up with what I think is a better > solution and then implemented it for qcom_geni_serial. > > Since I had a good way to test 8250-based UARTs, I also fixed that > driver to use my new method. When doing so, I ran into a missing > msm_serial.c at all, so I didn't switch that (or all other serial > drivers for that matter) to the new method. > > After fixing all the above issues, I found the next lockdep splat in > kgdb and I think I've worked around it in a good-enough way, but I'm > much less confident about this. Hopefully folks can take a look at > it. > > In general, patches earlier in this series should be "less > controversial" and hopefully can land even if people don't like > patches later in the series. ;-) > > Looking back, this is pretty much two series squashed that could be > treated indepdently. The first is a serial series and the second is a > kgdb series. > > For all serial patches I'd expect them to go through the tty tree once > they've been reviewed. > > If folks are OK w/ the 'smp' patch it probably should go in some core > kernel tree. The kgdb patch won't work without it, though, so to land > that we'd need coordination between the folks landing that and the > folks landing the 'smp' patch. I have got only 0/7 and 5/7, everything okay with your mail client and other tools? > > > Douglas Anderson (7): > serial: qcom_geni_serial: Finish supporting sysrq > serial: core: Allow processing sysrq at port unlock time > serial: qcom_geni_serial: Process sysrq at port unlock time > serial: core: Include console.h from serial_core.h > serial: 8250: Process sysrq at port unlock time > smp: Don't yell about IRQs disabled in kgdb_roundup_cpus() > kgdb: Remove irq flags and local_irq_enable/disable from roundup > > arch/arc/kernel/kgdb.c | 4 +-- > arch/arm/kernel/kgdb.c | 4 +-- > arch/arm64/kernel/kgdb.c| 4 +-- > arch/hexagon/kernel/kgdb.c | 11 ++ > arch/mips/kernel/kgdb.c | 4 +-- > arch/powerpc/kernel/kgdb.c | 2 +- > arch/sh/kernel/kgdb.c | 4 +-- > arch/sparc/kernel/smp_64.c | 2 +- > arch/x86/kernel/kgdb.c | 9 ++--- > drivers/tty/serial/8250/8250_aspeed_vuart.c | 6 +++- > drivers/tty/serial/8250/8250_fsl.c | 6 +++- > drivers/tty/serial/8250/8250_omap.c | 6 +++- > drivers/tty/serial/8250/8250_port.c | 8 ++--- > drivers/tty/serial/qcom_geni_serial.c | 10 -- > include/linux/kgdb.h| 9 ++--- > include/linux/serial_core.h | 38 - > kernel/debug/debug_core.c | 2 +- > kernel/smp.c| 4 ++- > 18 files changed, 80 insertions(+), 53 deletions(-) > > -- > 2.19.1.568.g152ad8e336-goog > -- With Best Regards, Andy Shevchenko
Re: [PATCH 0/7] serial: Finish kgdb on qcom_geni; fix many lockdep splats w/ kgdb
On Tue, Oct 30, 2018 at 11:56:28AM +, Daniel Thompson wrote: > On Mon, Oct 29, 2018 at 11:07:00AM -0700, Douglas Anderson wrote: > > Looking back, this is pretty much two series squashed that could be > > treated indepdently. The first is a serial series and the second is a > > kgdb series. > > Indeed. > > I couldn't work out the link between the first 5 patches and the last 2 > until I read this... > > Is anything in the 01->05 patch set even related to kgdb? From the stack > traces it looks to me like the lock dep warning would trigger for any > sysrq. I think separating into two threads for v2 would be sensible. I'm concerned about calling smp_call_function() from IRQ context with IRQs disabled - that will block the ability of the _calling_ CPU to process IPIs from other CPUs in the system. If we have other CPUs waiting on their IPIs to complete on _this_ CPU, we could end up deadlocking while trying to grab the CSD lock. This is the intention of warnings in smp_call_function*() - to catch cases where deadlocks _can_ occur, but do not reliably show up. The exceptions to the warning (disregarding oops_in_progress) are chosen to allow IRQs-disabled calls when we're sure that the rest of the system isn't going to be sending the calling CPU an IPI (eg, because the CPU isn't marked online, and we only send IPIs to online CPUs, or if we're still early in the kernel boot and hence have no other CPUs running.) The exception is oops_in_progress, which can occur at any time - even with the current CPU already holding some CSD locks (eg, oops while trying to send an IPI.) -- RMK's Patch system: http://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up According to speedtest.net: 11.9Mbps down 500kbps up
Re: [PATCH 2/2] selftests/powerpc: Skip test instead of failing
Thiago Jung Bauermann writes: > Breno Leitao writes: > >> Hi Tyrel, >> >> On 10/23/2018 05:41 PM, Tyrel Datwyler wrote: --- a/tools/testing/selftests/powerpc/ptrace/core-pkey.c +++ b/tools/testing/selftests/powerpc/ptrace/core-pkey.c @@ -352,17 +352,11 @@ static int write_core_pattern(const char *core_pattern) FILE *f; f = fopen(core_pattern_file, "w"); - if (!f) { - perror("Error writing to core_pattern file"); - return TEST_FAIL; - } + SKIP_IF(!f); ret = fwrite(core_pattern, 1, len, f); fclose(f); - if (ret != len) { - perror("Error writing to core_pattern file"); - return TEST_FAIL; - } + SKIP_IF(ret != len); >> >>> If we don't have proper privileges we should fail on the open, right? >>> So wouldn't we still want to fail on the write if something goes >>> wrong? >> >> That is a good point. Should the test fail or skip if it is not possible >> to create the infrastructure to run the core test? >> >> Trying to find the answer in the current test sets, I find tests where >> the self test skips if the test environment is not able to be set up, as >> for example, when a memory allocation fails. >> >> File: tools/testing/selftests/powerpc/alignment/alignment_handler.c >> >> ci1 = mmap(NULL, bufsize, PROT_WRITE, MAP_SHARED, >>fd, bufsize); >> if ((ci0 == MAP_FAILED) || (ci1 == MAP_FAILED)) { >> printf("\n"); >> perror("mmap failed"); >> SKIP_IF(1); >> } > > I think TEST_FAIL means the test was able to exercise the feature > and found a problem with it. In this case, the test wasn't able to > exercise the feature so it's not appropriate. > > Ideally, there should be a TEST_ERROR result for a case like this where > an unexpected problem prevented the testcase from exercising the > feature. > > If we're to use the an existing result then I vote for SKIP_IF. Yeah I agree. See for example some of the TM tests, which skip if TM is not available. Or the alignment test which skips if it can't open /dev/fb0. In this case it should print "you need to be root to run this" and then skip. cheers
Re: NXP P50XX/e5500: SMP doesn't work anymore with the latest Git kernel
On 30 October 2018 at 08:27AM, Christian Zigotzky wrote: On 30 October 2018 at 02:59AM, Benjamin Herrenschmidt wrote: On Tue, 2018-10-30 at 02:42 +0100, Christian Zigotzky wrote: OF patch for the latest Git kernel: http://www.xenosoft.de/of_v2.patch This just seems to revert a whole bunch of stuff, not really the right way to go. Why is of_get_next_cpu_node() not finding your CPUs ? There must be something wrong with the device-tree... Maybe we need a new dtb for QEMU ppce500 and P50XX machines. I don't know. For us it is good to have this patch because we can test further. (other updates) With this patch, SMP works again on virtual QEMU ppce500 and P50XX machines. If you have another code, then I will test it with QEMU and my P5020 board. Thanks, Christian We use the 'cyrus_p5020.dts' file for creating the dtb file for our P5020 board. You can find it in the official kernel source code. (arch/powerpc/boot/dts/cyrus_p5020.dts) For the virtual ppce500 machine I don't need to add a dtb file to the QEMU command. But maybe there is a solution: On 30 October 2018 at 3:18PM, Michael Ellerman wrote: Hi Rob, Sorry I missed this when you posted it. Rob Herring writes: Iterating thru cpu nodes is a common pattern. Create a common iterator which can find child nodes either by node name or device_type == cpu. Using the former will allow for eventually dropping device_type properties which are deprecated for FDT. Device trees we see on powerpc generally don't (never?) use "cpu" as the node name for CPU nodes. And many of those device trees come from firmware, so we can't update them. So dropping support for device_type is a non-starter from our POV. cheers - of_v2.patch - diff -rupN a/drivers/of/base.c b/drivers/of/base.c --- a/drivers/of/base.c 2018-10-30 02:19:30.827089495 +0100 +++ b/drivers/of/base.c 2018-10-30 02:18:51.666856715 +0100 @@ -395,7 +395,7 @@ struct device_node *of_get_cpu_node(int { struct device_node *cpun; - for_each_of_cpu_node(cpun) { + for_each_node_by_type(cpun, "cpu") { if (arch_find_n_match_cpu_physical_id(cpun, cpu, thread)) return cpun; } @@ -750,45 +750,6 @@ struct device_node *of_get_next_availabl EXPORT_SYMBOL(of_get_next_available_child); /** - * of_get_next_cpu_node - Iterate on cpu nodes - * @prev: previous child of the /cpus node, or NULL to get first - * - * Returns a cpu node pointer with refcount incremented, use of_node_put() - * on it when done. Returns NULL when prev is the last child. Decrements - * the refcount of prev. - */ -struct device_node *of_get_next_cpu_node(struct device_node *prev) -{ - struct device_node *next = NULL; - unsigned long flags; - struct device_node *node; - - if (!prev) - node = of_find_node_by_path("/cpus"); - - raw_spin_lock_irqsave(_lock, flags); - if (prev) - next = prev->sibling; - else if (node) { - next = node->child; - of_node_put(node); - } - for (; next; next = next->sibling) { - if (!(of_node_name_eq(next, "cpu") || - (next->type && !of_node_cmp(next->type, "cpu" - continue; - if (!__of_device_is_available(next)) - continue; - if (of_node_get(next)) - break; - } - of_node_put(prev); - raw_spin_unlock_irqrestore(_lock, flags); - return next; -} -EXPORT_SYMBOL(of_get_next_cpu_node); - -/** * of_get_compatible_child - Find compatible child node * @parent: parent node * @compatible: compatible string diff -rupN a/include/linux/of.h b/include/linux/of.h --- a/include/linux/of.h 2018-10-30 02:19:32.047096634 +0100 +++ b/include/linux/of.h 2018-10-30 02:18:51.666856715 +0100 @@ -347,7 +347,6 @@ extern const void *of_get_property(const const char *name, int *lenp); extern struct device_node *of_get_cpu_node(int cpu, unsigned int *thread); -extern struct device_node *of_get_next_cpu_node(struct device_node *prev); #define for_each_property_of_node(dn, pp) \ for (pp = dn->properties; pp != NULL; pp = pp->next) @@ -757,11 +756,6 @@ static inline struct device_node *of_get return NULL; } -static inline struct device_node *of_get_next_cpu_node(struct device_node *prev) -{ - return NULL; -} - static inline int of_n_addr_cells(struct device_node *np) { return 0; @@ -1239,10 +1233,6 @@ static inline int of_property_read_s32(c for (child = of_get_next_available_child(parent, NULL); child != NULL; \ child = of_get_next_available_child(parent, child)) -#define for_each_of_cpu_node(cpu) \ - for (cpu = of_get_next_cpu_node(NULL); cpu != NULL; \ - cpu = of_get_next_cpu_node(cpu)) - #define for_each_node_with_property(dn, prop_name) \ for (dn = of_find_node_with_property(NULL, prop_name); dn; \ dn =
Re: [PATCH 1/2] kbuild: replace cc-name test with CONFIG_CC_IS_CLANG
Masahiro Yamada writes: > On Tue, Oct 30, 2018 at 9:36 PM Michael Ellerman wrote: >> >> Masahiro Yamada writes: >> > diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile >> > index 17be664..338e827 100644 >> > --- a/arch/powerpc/Makefile >> > +++ b/arch/powerpc/Makefile >> > @@ -96,7 +96,7 @@ aflags-$(CONFIG_CPU_BIG_ENDIAN) += $(call >> > cc-option,-mabi=elfv1) >> > aflags-$(CONFIG_CPU_LITTLE_ENDIAN) += -mabi=elfv2 >> > endif >> > >> > -ifneq ($(cc-name),clang) >> > +ifneq ($(CONFIG_CC_IS_CLANG),y) >> >cflags-$(CONFIG_CPU_LITTLE_ENDIAN) += -mno-strict-align >> > endif >> > >> > @@ -175,7 +175,7 @@ endif >> > # Work around gcc code-gen bugs with -pg / -fno-omit-frame-pointer in gcc >> > <= 4.8 >> > # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44199 >> > # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52828 >> > -ifneq ($(cc-name),clang) >> > +ifneq ($(CONFIG_CC_IS_CLANG),y) >> > CC_FLAGS_FTRACE += $(call cc-ifversion, -lt, 0409, -mno-sched-epilog) >> > endif >> > endif >> >> Does this behave like other CONFIG variables, ie. it will not be defined >> when it's false? > > Right. > >> And if so can't we use ifdef/ifndef? eg: >> >> ifndef CONFIG_CC_IS_CLANG >> CC_FLAGS_FTRACE += $(call cc-ifversion, -lt, 0409, -mno-sched-epilog) >> >> That reads cleaner to me. > > > OK, will do respin if you prefer ifdef/ifndef style. I do prefer it, but I can also fix it up later if you're in a hurry to get this merged. cheers
Re: [PATCH] powerpc/npu-dma: Remove NPU DMA ops
Christoph Hellwig writes: > On Wed, Oct 31, 2018 at 12:09:29AM +1100, Benjamin Herrenschmidt wrote: >> On Tue, 2018-10-30 at 13:58 +0100, Christoph Hellwig wrote: >> > Please take my patch instead. We have a kernel polcity to not keep >> > dead code around, and everyone including Linus and the attending IBMers >> > confirmed this. >> >> Let's call a cat a cat ... ;-) >> >> It's not *dead* code. It's code that is used by an out of tree driver >> (which also happen not to be open source). (Actually that part of the driver is MIT licensed, but yes other parts are not open source.) > Which clearly makes it a derived work of the kernel if it uses an > interface just create for it, and thus even more important to remove > it to not get anyone into legal trouble. We don't want driver code calling into firmware, that's the job of platform code, so we would need the same API (or something similar) for a GPL driver. cheers
Re: [PATCH 01/21] of: Add cpu node iterator for_each_of_cpu_node()
On Tue, Oct 30, 2018 at 9:20 AM Michael Ellerman wrote: > > Michael Ellerman writes: > > Hi Rob, > > > > Sorry I missed this when you posted it. > > > > Rob Herring writes: > >> Iterating thru cpu nodes is a common pattern. Create a common iterator > >> which can find child nodes either by node name or device_type == cpu. > >> Using the former will allow for eventually dropping device_type > >> properties which are deprecated for FDT. > > > > Device trees we see on powerpc generally don't (never?) use "cpu" as the > > node name for CPU nodes. And many of those device trees come from > > firmware, so we can't update them. > > > > So dropping support for device_type is a non-starter from our POV. > > ps. presumably that's what you meant by deprecated *for FDT*. Right. > But anyway just wanted to make sure we are on the same page. Yes, I was aware at least older powerpc DTs don't use 'cpu' for node names. Rob
Re: [PATCH 01/21] of: Add cpu node iterator for_each_of_cpu_node()
Michael Ellerman writes: > Hi Rob, > > Sorry I missed this when you posted it. > > Rob Herring writes: >> Iterating thru cpu nodes is a common pattern. Create a common iterator >> which can find child nodes either by node name or device_type == cpu. >> Using the former will allow for eventually dropping device_type >> properties which are deprecated for FDT. > > Device trees we see on powerpc generally don't (never?) use "cpu" as the > node name for CPU nodes. And many of those device trees come from > firmware, so we can't update them. > > So dropping support for device_type is a non-starter from our POV. ps. presumably that's what you meant by deprecated *for FDT*. But anyway just wanted to make sure we are on the same page. cheers
Re: [PATCH 01/21] of: Add cpu node iterator for_each_of_cpu_node()
Hi Rob, Sorry I missed this when you posted it. Rob Herring writes: > Iterating thru cpu nodes is a common pattern. Create a common iterator > which can find child nodes either by node name or device_type == cpu. > Using the former will allow for eventually dropping device_type > properties which are deprecated for FDT. Device trees we see on powerpc generally don't (never?) use "cpu" as the node name for CPU nodes. And many of those device trees come from firmware, so we can't update them. So dropping support for device_type is a non-starter from our POV. cheers > Cc: Frank Rowand > Signed-off-by: Rob Herring > --- > drivers/of/base.c | 39 +++ > include/linux/of.h | 11 +++ > 2 files changed, 50 insertions(+) > > diff --git a/drivers/of/base.c b/drivers/of/base.c > index a055cd1ef96d..4807db0a35b3 100644 > --- a/drivers/of/base.c > +++ b/drivers/of/base.c > @@ -741,6 +741,45 @@ struct device_node *of_get_next_available_child(const > struct device_node *node, > } > EXPORT_SYMBOL(of_get_next_available_child); > > +/** > + * of_get_next_cpu_node - Iterate on cpu nodes > + * @prev: previous child of the /cpus node, or NULL to get first > + * > + * Returns a cpu node pointer with refcount incremented, use of_node_put() > + * on it when done. Returns NULL when prev is the last child. Decrements > + * the refcount of prev. > + */ > +struct device_node *of_get_next_cpu_node(struct device_node *prev) > +{ > + struct device_node *next = NULL; > + unsigned long flags; > + struct device_node *node; > + > + if (!prev) > + node = of_find_node_by_path("/cpus"); > + > + raw_spin_lock_irqsave(_lock, flags); > + if (prev) > + next = prev->sibling; > + else if (node) { > + next = node->child; > + of_node_put(node); > + } > + for (; next; next = next->sibling) { > + if (!(of_node_name_eq(next, "cpu") || > + (next->type && !of_node_cmp(next->type, "cpu" > + continue; > + if (!__of_device_is_available(next)) > + continue; > + if (of_node_get(next)) > + break; > + } > + of_node_put(prev); > + raw_spin_unlock_irqrestore(_lock, flags); > + return next; > +} > +EXPORT_SYMBOL(of_get_next_cpu_node); > + > /** > * of_get_compatible_child - Find compatible child node > * @parent: parent node > diff --git a/include/linux/of.h b/include/linux/of.h > index 99b0ebf49632..1aca0dbd35df 100644 > --- a/include/linux/of.h > +++ b/include/linux/of.h > @@ -353,6 +353,8 @@ extern const void *of_get_property(const struct > device_node *node, > const char *name, > int *lenp); > extern struct device_node *of_get_cpu_node(int cpu, unsigned int *thread); > +extern struct device_node *of_get_next_cpu_node(struct device_node *prev); > + > #define for_each_property_of_node(dn, pp) \ > for (pp = dn->properties; pp != NULL; pp = pp->next) > > @@ -754,6 +756,11 @@ static inline struct device_node *of_get_cpu_node(int > cpu, > return NULL; > } > > +static inline struct device_node *of_get_next_cpu_node(struct device_node > *prev) > +{ > + return NULL; > +} > + > static inline int of_n_addr_cells(struct device_node *np) > { > return 0; > @@ -1217,6 +1224,10 @@ static inline int of_property_read_s32(const struct > device_node *np, > for (child = of_get_next_available_child(parent, NULL); child != NULL; \ >child = of_get_next_available_child(parent, child)) > > +#define for_each_of_cpu_node(cpu) \ > + for (cpu = of_get_next_cpu_node(NULL); cpu != NULL; \ > + cpu = of_get_next_cpu_node(cpu)) > + > #define for_each_node_with_property(dn, prop_name) \ > for (dn = of_find_node_with_property(NULL, prop_name); dn; \ >dn = of_find_node_with_property(dn, prop_name)) > -- > 2.17.1
[PATCH] powerpc/powernv: remove dead npu-dma code
This code has never been unused in the kernel since it was merged, and there has been no attempt that I could find to even submit users for it. Besides the general policy of not keep 1000+ lines of dead code, it helps cleaning up the DMA code and making powerpc user common infrastructure. This effectively reverts commit 5d2aa710 ("powerpc/powernv: Add support for Nvlink NPUs"). Signed-off-by: Christoph Hellwig --- [this is a follow up to the brief maintainer summit discussion] arch/powerpc/include/asm/pci.h| 3 - arch/powerpc/include/asm/powernv.h| 23 - arch/powerpc/platforms/powernv/Makefile | 2 +- arch/powerpc/platforms/powernv/npu-dma.c | 989 -- arch/powerpc/platforms/powernv/pci-ioda.c | 243 -- arch/powerpc/platforms/powernv/pci.h | 11 - 6 files changed, 1 insertion(+), 1270 deletions(-) delete mode 100644 arch/powerpc/platforms/powernv/npu-dma.c diff --git a/arch/powerpc/include/asm/pci.h b/arch/powerpc/include/asm/pci.h index 2af9ded80540..a01d2e3d6ff9 100644 --- a/arch/powerpc/include/asm/pci.h +++ b/arch/powerpc/include/asm/pci.h @@ -127,7 +127,4 @@ extern void pcibios_scan_phb(struct pci_controller *hose); #endif /* __KERNEL__ */ -extern struct pci_dev *pnv_pci_get_gpu_dev(struct pci_dev *npdev); -extern struct pci_dev *pnv_pci_get_npu_dev(struct pci_dev *gpdev, int index); - #endif /* __ASM_POWERPC_PCI_H */ diff --git a/arch/powerpc/include/asm/powernv.h b/arch/powerpc/include/asm/powernv.h index 2f3ff7a27881..4848a6b3c6b2 100644 --- a/arch/powerpc/include/asm/powernv.h +++ b/arch/powerpc/include/asm/powernv.h @@ -11,33 +11,10 @@ #define _ASM_POWERNV_H #ifdef CONFIG_PPC_POWERNV -#define NPU2_WRITE 1 extern void powernv_set_nmmu_ptcr(unsigned long ptcr); -extern struct npu_context *pnv_npu2_init_context(struct pci_dev *gpdev, - unsigned long flags, - void (*cb)(struct npu_context *, void *), - void *priv); -extern void pnv_npu2_destroy_context(struct npu_context *context, - struct pci_dev *gpdev); -extern int pnv_npu2_handle_fault(struct npu_context *context, uintptr_t *ea, - unsigned long *flags, unsigned long *status, - int count); - void pnv_tm_init(void); #else static inline void powernv_set_nmmu_ptcr(unsigned long ptcr) { } -static inline struct npu_context *pnv_npu2_init_context(struct pci_dev *gpdev, - unsigned long flags, - struct npu_context *(*cb)(struct npu_context *, void *), - void *priv) { return ERR_PTR(-ENODEV); } -static inline void pnv_npu2_destroy_context(struct npu_context *context, - struct pci_dev *gpdev) { } - -static inline int pnv_npu2_handle_fault(struct npu_context *context, - uintptr_t *ea, unsigned long *flags, - unsigned long *status, int count) { - return -ENODEV; -} static inline void pnv_tm_init(void) { } static inline void pnv_power9_force_smt4(void) { } diff --git a/arch/powerpc/platforms/powernv/Makefile b/arch/powerpc/platforms/powernv/Makefile index b540ce8eec55..2b13e9dd137c 100644 --- a/arch/powerpc/platforms/powernv/Makefile +++ b/arch/powerpc/platforms/powernv/Makefile @@ -6,7 +6,7 @@ obj-y += opal-msglog.o opal-hmi.o opal-power.o opal-irqchip.o obj-y += opal-kmsg.o opal-powercap.o opal-psr.o opal-sensor-groups.o obj-$(CONFIG_SMP) += smp.o subcore.o subcore-asm.o -obj-$(CONFIG_PCI) += pci.o pci-ioda.o npu-dma.o pci-ioda-tce.o +obj-$(CONFIG_PCI) += pci.o pci-ioda.o pci-ioda-tce.o obj-$(CONFIG_CXL_BASE) += pci-cxl.o obj-$(CONFIG_EEH) += eeh-powernv.o obj-$(CONFIG_PPC_SCOM) += opal-xscom.o diff --git a/arch/powerpc/platforms/powernv/npu-dma.c b/arch/powerpc/platforms/powernv/npu-dma.c deleted file mode 100644 index 6f60e0931922.. --- a/arch/powerpc/platforms/powernv/npu-dma.c +++ /dev/null @@ -1,989 +0,0 @@ -/* - * This file implements the DMA operations for NVLink devices. The NPU - * devices all point to the same iommu table as the parent PCI device. - * - * Copyright Alistair Popple, IBM Corporation 2015. - * - * This program is free software; you can redistribute it and/or - * modify it under the terms of version 2 of the GNU General Public - * License as published by the Free Software Foundation. - */ - -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#include "powernv.h" -#include "pci.h" - -#define npu_to_phb(x) container_of(x, struct pnv_phb, npu) - -/* - * spinlock to protect initialisation of an npu_context for a particular - * mm_struct. - */ -static DEFINE_SPINLOCK(npu_context_lock);
[PATCH v2 1/2] kbuild: replace cc-name test with CONFIG_CC_IS_CLANG
Evaluating cc-name invokes the compiler every time even when you are not compiling anything, like 'make help'. This is not efficient. The compiler type has been already detected in the Kconfig stage. Use CONFIG_CC_IS_CLANG, instead. Signed-off-by: Masahiro Yamada Acked-by: Michael Ellerman (powerpc) --- Changes in v2: - Use ifdef/ifndef insteaed of ifeq/ifneq Makefile | 2 +- arch/mips/Makefile | 2 +- arch/mips/vdso/Makefile| 2 +- arch/powerpc/Makefile | 4 ++-- scripts/Makefile.extrawarn | 2 +- 5 files changed, 6 insertions(+), 6 deletions(-) diff --git a/Makefile b/Makefile index 0a42d06..1c0696d 100644 --- a/Makefile +++ b/Makefile @@ -707,7 +707,7 @@ stackp-flags-$(CONFIG_STACKPROTECTOR_STRONG) := -fstack-protector-strong KBUILD_CFLAGS += $(stackp-flags-y) -ifeq ($(cc-name),clang) +ifdef CONFIG_CC_IS_CLANG KBUILD_CPPFLAGS += $(call cc-option,-Qunused-arguments,) KBUILD_CFLAGS += $(call cc-disable-warning, format-invalid-specifier) KBUILD_CFLAGS += $(call cc-disable-warning, gnu) diff --git a/arch/mips/Makefile b/arch/mips/Makefile index 15a84cf..6841049 100644 --- a/arch/mips/Makefile +++ b/arch/mips/Makefile @@ -128,7 +128,7 @@ cflags-y += -ffreestanding # clang's output will be based upon the build machine. So for clang we simply # unconditionally specify -EB or -EL as appropriate. # -ifeq ($(cc-name),clang) +ifdef CONFIG_CC_IS_CLANG cflags-$(CONFIG_CPU_BIG_ENDIAN)+= -EB cflags-$(CONFIG_CPU_LITTLE_ENDIAN) += -EL else diff --git a/arch/mips/vdso/Makefile b/arch/mips/vdso/Makefile index 34605ca..58a0315 100644 --- a/arch/mips/vdso/Makefile +++ b/arch/mips/vdso/Makefile @@ -10,7 +10,7 @@ ccflags-vdso := \ $(filter -march=%,$(KBUILD_CFLAGS)) \ -D__VDSO__ -ifeq ($(cc-name),clang) +ifdef CONFIG_CC_IS_CLANG ccflags-vdso += $(filter --target=%,$(KBUILD_CFLAGS)) endif diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile index 17be664..8a2ce14 100644 --- a/arch/powerpc/Makefile +++ b/arch/powerpc/Makefile @@ -96,7 +96,7 @@ aflags-$(CONFIG_CPU_BIG_ENDIAN) += $(call cc-option,-mabi=elfv1) aflags-$(CONFIG_CPU_LITTLE_ENDIAN) += -mabi=elfv2 endif -ifneq ($(cc-name),clang) +ifndef CONFIG_CC_IS_CLANG cflags-$(CONFIG_CPU_LITTLE_ENDIAN) += -mno-strict-align endif @@ -175,7 +175,7 @@ endif # Work around gcc code-gen bugs with -pg / -fno-omit-frame-pointer in gcc <= 4.8 # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44199 # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52828 -ifneq ($(cc-name),clang) +ifndef CONFIG_CC_IS_CLANG CC_FLAGS_FTRACE+= $(call cc-ifversion, -lt, 0409, -mno-sched-epilog) endif endif diff --git a/scripts/Makefile.extrawarn b/scripts/Makefile.extrawarn index 24b2fb1..800a10f 100644 --- a/scripts/Makefile.extrawarn +++ b/scripts/Makefile.extrawarn @@ -64,7 +64,7 @@ endif KBUILD_CFLAGS += $(warning) else -ifeq ($(cc-name),clang) +ifdef CONFIG_CC_IS_CLANG KBUILD_CFLAGS += $(call cc-disable-warning, initializer-overrides) KBUILD_CFLAGS += $(call cc-disable-warning, unused-value) KBUILD_CFLAGS += $(call cc-disable-warning, format) -- 2.7.4
Re: [PATCH] powerpc/npu-dma: Remove NPU DMA ops
On Wed, Oct 31, 2018 at 12:09:29AM +1100, Benjamin Herrenschmidt wrote: > On Tue, 2018-10-30 at 13:58 +0100, Christoph Hellwig wrote: > > Please take my patch instead. We have a kernel polcity to not keep > > dead code around, and everyone including Linus and the attending IBMers > > confirmed this. > > Let's call a cat a cat ... ;-) > > It's not *dead* code. It's code that is used by an out of tree driver > (which also happen not to be open source). Which clearly makes it a derived work of the kernel if it uses an interface just create for it, and thus even more important to remove it to not get anyone into legal trouble.
Re: [PATCH] powerpc/npu-dma: Remove NPU DMA ops
On Tue, 2018-10-30 at 13:58 +0100, Christoph Hellwig wrote: > Please take my patch instead. We have a kernel polcity to not keep > dead code around, and everyone including Linus and the attending IBMers > confirmed this. Let's call a cat a cat ... ;-) It's not *dead* code. It's code that is used by an out of tree driver (which also happen not to be open source). Just making things clear for everybody. Cheers, Ben. > On Tue, Oct 30, 2018 at 10:02:03PM +1100, Alistair Popple wrote: > > The NPU IOMMU is setup to mirror the parent PCIe device IOMMU > > setup. Therefore it does not make sense to call dma operations such as > > dma_map_page, etc. directly on these devices. The existing dma-ops > > simply print a warning if they are ever called, however this is > > unnecessary and the warnings are likely to go unnoticed. > > > > It is instead simpler to remove these operations and let the generic > > DMA code print warnings (eg. via a NULL pointer deref) in cases of > > buggy drivers attempting dma operations on NVLink devices. > > > > Signed-off-by: Alistair Popple > > --- > > arch/powerpc/platforms/powernv/npu-dma.c | 64 > > ++-- > > 1 file changed, 4 insertions(+), 60 deletions(-) > > > > diff --git a/arch/powerpc/platforms/powernv/npu-dma.c > > b/arch/powerpc/platforms/powernv/npu-dma.c > > index 6f60e0931922..75b935252981 100644 > > --- a/arch/powerpc/platforms/powernv/npu-dma.c > > +++ b/arch/powerpc/platforms/powernv/npu-dma.c > > @@ -102,63 +102,6 @@ struct pci_dev *pnv_pci_get_npu_dev(struct pci_dev > > *gpdev, int index) > > } > > EXPORT_SYMBOL(pnv_pci_get_npu_dev); > > > > -#define NPU_DMA_OP_UNSUPPORTED() \ > > - dev_err_once(dev, "%s operation unsupported for NVLink devices\n", \ > > - __func__) > > - > > -static void *dma_npu_alloc(struct device *dev, size_t size, > > - dma_addr_t *dma_handle, gfp_t flag, > > - unsigned long attrs) > > -{ > > - NPU_DMA_OP_UNSUPPORTED(); > > - return NULL; > > -} > > - > > -static void dma_npu_free(struct device *dev, size_t size, > > -void *vaddr, dma_addr_t dma_handle, > > -unsigned long attrs) > > -{ > > - NPU_DMA_OP_UNSUPPORTED(); > > -} > > - > > -static dma_addr_t dma_npu_map_page(struct device *dev, struct page *page, > > - unsigned long offset, size_t size, > > - enum dma_data_direction direction, > > - unsigned long attrs) > > -{ > > - NPU_DMA_OP_UNSUPPORTED(); > > - return 0; > > -} > > - > > -static int dma_npu_map_sg(struct device *dev, struct scatterlist *sglist, > > - int nelems, enum dma_data_direction direction, > > - unsigned long attrs) > > -{ > > - NPU_DMA_OP_UNSUPPORTED(); > > - return 0; > > -} > > - > > -static int dma_npu_dma_supported(struct device *dev, u64 mask) > > -{ > > - NPU_DMA_OP_UNSUPPORTED(); > > - return 0; > > -} > > - > > -static u64 dma_npu_get_required_mask(struct device *dev) > > -{ > > - NPU_DMA_OP_UNSUPPORTED(); > > - return 0; > > -} > > - > > -static const struct dma_map_ops dma_npu_ops = { > > - .map_page = dma_npu_map_page, > > - .map_sg = dma_npu_map_sg, > > - .alloc = dma_npu_alloc, > > - .free = dma_npu_free, > > - .dma_supported = dma_npu_dma_supported, > > - .get_required_mask = dma_npu_get_required_mask, > > -}; > > - > > /* > > * Returns the PE assoicated with the PCI device of the given > > * NPU. Returns the linked pci device if pci_dev != NULL. > > @@ -270,10 +213,11 @@ static void pnv_npu_dma_set_32(struct pnv_ioda_pe > > *npe) > > rc = pnv_npu_set_window(npe, 0, gpe->table_group.tables[0]); > > > > /* > > -* We don't initialise npu_pe->tce32_table as we always use > > -* dma_npu_ops which are nops. > > +* NVLink devices use the same TCE table configuration as > > +* their parent device so drivers shouldn't be doing DMA > > +* operations directly on these devices. > > */ > > - set_dma_ops(>pdev->dev, _npu_ops); > > + set_dma_ops(>pdev->dev, NULL); > > } > > > > /* > > -- > > 2.11.0 > > ---end quoted text---
Re: [PATCH] powerpc/npu-dma: Remove NPU DMA ops
Please take my patch instead. We have a kernel polcity to not keep dead code around, and everyone including Linus and the attending IBMers confirmed this. On Tue, Oct 30, 2018 at 10:02:03PM +1100, Alistair Popple wrote: > The NPU IOMMU is setup to mirror the parent PCIe device IOMMU > setup. Therefore it does not make sense to call dma operations such as > dma_map_page, etc. directly on these devices. The existing dma-ops > simply print a warning if they are ever called, however this is > unnecessary and the warnings are likely to go unnoticed. > > It is instead simpler to remove these operations and let the generic > DMA code print warnings (eg. via a NULL pointer deref) in cases of > buggy drivers attempting dma operations on NVLink devices. > > Signed-off-by: Alistair Popple > --- > arch/powerpc/platforms/powernv/npu-dma.c | 64 > ++-- > 1 file changed, 4 insertions(+), 60 deletions(-) > > diff --git a/arch/powerpc/platforms/powernv/npu-dma.c > b/arch/powerpc/platforms/powernv/npu-dma.c > index 6f60e0931922..75b935252981 100644 > --- a/arch/powerpc/platforms/powernv/npu-dma.c > +++ b/arch/powerpc/platforms/powernv/npu-dma.c > @@ -102,63 +102,6 @@ struct pci_dev *pnv_pci_get_npu_dev(struct pci_dev > *gpdev, int index) > } > EXPORT_SYMBOL(pnv_pci_get_npu_dev); > > -#define NPU_DMA_OP_UNSUPPORTED() \ > - dev_err_once(dev, "%s operation unsupported for NVLink devices\n", \ > - __func__) > - > -static void *dma_npu_alloc(struct device *dev, size_t size, > -dma_addr_t *dma_handle, gfp_t flag, > -unsigned long attrs) > -{ > - NPU_DMA_OP_UNSUPPORTED(); > - return NULL; > -} > - > -static void dma_npu_free(struct device *dev, size_t size, > - void *vaddr, dma_addr_t dma_handle, > - unsigned long attrs) > -{ > - NPU_DMA_OP_UNSUPPORTED(); > -} > - > -static dma_addr_t dma_npu_map_page(struct device *dev, struct page *page, > -unsigned long offset, size_t size, > -enum dma_data_direction direction, > -unsigned long attrs) > -{ > - NPU_DMA_OP_UNSUPPORTED(); > - return 0; > -} > - > -static int dma_npu_map_sg(struct device *dev, struct scatterlist *sglist, > - int nelems, enum dma_data_direction direction, > - unsigned long attrs) > -{ > - NPU_DMA_OP_UNSUPPORTED(); > - return 0; > -} > - > -static int dma_npu_dma_supported(struct device *dev, u64 mask) > -{ > - NPU_DMA_OP_UNSUPPORTED(); > - return 0; > -} > - > -static u64 dma_npu_get_required_mask(struct device *dev) > -{ > - NPU_DMA_OP_UNSUPPORTED(); > - return 0; > -} > - > -static const struct dma_map_ops dma_npu_ops = { > - .map_page = dma_npu_map_page, > - .map_sg = dma_npu_map_sg, > - .alloc = dma_npu_alloc, > - .free = dma_npu_free, > - .dma_supported = dma_npu_dma_supported, > - .get_required_mask = dma_npu_get_required_mask, > -}; > - > /* > * Returns the PE assoicated with the PCI device of the given > * NPU. Returns the linked pci device if pci_dev != NULL. > @@ -270,10 +213,11 @@ static void pnv_npu_dma_set_32(struct pnv_ioda_pe *npe) > rc = pnv_npu_set_window(npe, 0, gpe->table_group.tables[0]); > > /* > - * We don't initialise npu_pe->tce32_table as we always use > - * dma_npu_ops which are nops. > + * NVLink devices use the same TCE table configuration as > + * their parent device so drivers shouldn't be doing DMA > + * operations directly on these devices. >*/ > - set_dma_ops(>pdev->dev, _npu_ops); > + set_dma_ops(>pdev->dev, NULL); > } > > /* > -- > 2.11.0 ---end quoted text---
Re: [PATCH 1/2] kbuild: replace cc-name test with CONFIG_CC_IS_CLANG
On Tue, Oct 30, 2018 at 9:36 PM Michael Ellerman wrote: > > Masahiro Yamada writes: > > diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile > > index 17be664..338e827 100644 > > --- a/arch/powerpc/Makefile > > +++ b/arch/powerpc/Makefile > > @@ -96,7 +96,7 @@ aflags-$(CONFIG_CPU_BIG_ENDIAN) += $(call > > cc-option,-mabi=elfv1) > > aflags-$(CONFIG_CPU_LITTLE_ENDIAN) += -mabi=elfv2 > > endif > > > > -ifneq ($(cc-name),clang) > > +ifneq ($(CONFIG_CC_IS_CLANG),y) > >cflags-$(CONFIG_CPU_LITTLE_ENDIAN) += -mno-strict-align > > endif > > > > @@ -175,7 +175,7 @@ endif > > # Work around gcc code-gen bugs with -pg / -fno-omit-frame-pointer in gcc > > <= 4.8 > > # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44199 > > # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52828 > > -ifneq ($(cc-name),clang) > > +ifneq ($(CONFIG_CC_IS_CLANG),y) > > CC_FLAGS_FTRACE += $(call cc-ifversion, -lt, 0409, -mno-sched-epilog) > > endif > > endif > > Does this behave like other CONFIG variables, ie. it will not be defined > when it's false? Right. > And if so can't we use ifdef/ifndef? eg: > > ifndef CONFIG_CC_IS_CLANG > CC_FLAGS_FTRACE += $(call cc-ifversion, -lt, 0409, -mno-sched-epilog) > > That reads cleaner to me. OK, will do respin if you prefer ifdef/ifndef style. > Still this patch is fine as is: > > Acked-by: Michael Ellerman (powerpc) > > cheers -- Best Regards Masahiro Yamada
Re: [PATCH 1/2] kbuild: replace cc-name test with CONFIG_CC_IS_CLANG
Masahiro Yamada writes: > diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile > index 17be664..338e827 100644 > --- a/arch/powerpc/Makefile > +++ b/arch/powerpc/Makefile > @@ -96,7 +96,7 @@ aflags-$(CONFIG_CPU_BIG_ENDIAN) += $(call > cc-option,-mabi=elfv1) > aflags-$(CONFIG_CPU_LITTLE_ENDIAN) += -mabi=elfv2 > endif > > -ifneq ($(cc-name),clang) > +ifneq ($(CONFIG_CC_IS_CLANG),y) >cflags-$(CONFIG_CPU_LITTLE_ENDIAN) += -mno-strict-align > endif > > @@ -175,7 +175,7 @@ endif > # Work around gcc code-gen bugs with -pg / -fno-omit-frame-pointer in gcc <= > 4.8 > # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44199 > # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52828 > -ifneq ($(cc-name),clang) > +ifneq ($(CONFIG_CC_IS_CLANG),y) > CC_FLAGS_FTRACE += $(call cc-ifversion, -lt, 0409, -mno-sched-epilog) > endif > endif Does this behave like other CONFIG variables, ie. it will not be defined when it's false? And if so can't we use ifdef/ifndef? eg: ifndef CONFIG_CC_IS_CLANG CC_FLAGS_FTRACE += $(call cc-ifversion, -lt, 0409, -mno-sched-epilog) That reads cleaner to me. Still this patch is fine as is: Acked-by: Michael Ellerman (powerpc) cheers
Re: [PATCH 0/7] serial: Finish kgdb on qcom_geni; fix many lockdep splats w/ kgdb
On Mon, Oct 29, 2018 at 11:07:00AM -0700, Douglas Anderson wrote: > Looking back, this is pretty much two series squashed that could be > treated indepdently. The first is a serial series and the second is a > kgdb series. Indeed. I couldn't work out the link between the first 5 patches and the last 2 until I read this... Is anything in the 01->05 patch set even related to kgdb? From the stack traces it looks to me like the lock dep warning would trigger for any sysrq. I think separating into two threads for v2 would be sensible. Daniel. > > For all serial patches I'd expect them to go through the tty tree once > they've been reviewed. > > If folks are OK w/ the 'smp' patch it probably should go in some core > kernel tree. The kgdb patch won't work without it, though, so to land > that we'd need coordination between the folks landing that and the > folks landing the 'smp' patch. > > > Douglas Anderson (7): > serial: qcom_geni_serial: Finish supporting sysrq > serial: core: Allow processing sysrq at port unlock time > serial: qcom_geni_serial: Process sysrq at port unlock time > serial: core: Include console.h from serial_core.h > serial: 8250: Process sysrq at port unlock time > smp: Don't yell about IRQs disabled in kgdb_roundup_cpus() > kgdb: Remove irq flags and local_irq_enable/disable from roundup > > arch/arc/kernel/kgdb.c | 4 +-- > arch/arm/kernel/kgdb.c | 4 +-- > arch/arm64/kernel/kgdb.c| 4 +-- > arch/hexagon/kernel/kgdb.c | 11 ++ > arch/mips/kernel/kgdb.c | 4 +-- > arch/powerpc/kernel/kgdb.c | 2 +- > arch/sh/kernel/kgdb.c | 4 +-- > arch/sparc/kernel/smp_64.c | 2 +- > arch/x86/kernel/kgdb.c | 9 ++--- > drivers/tty/serial/8250/8250_aspeed_vuart.c | 6 +++- > drivers/tty/serial/8250/8250_fsl.c | 6 +++- > drivers/tty/serial/8250/8250_omap.c | 6 +++- > drivers/tty/serial/8250/8250_port.c | 8 ++--- > drivers/tty/serial/qcom_geni_serial.c | 10 -- > include/linux/kgdb.h| 9 ++--- > include/linux/serial_core.h | 38 - > kernel/debug/debug_core.c | 2 +- > kernel/smp.c| 4 ++- > 18 files changed, 80 insertions(+), 53 deletions(-) > > -- > 2.19.1.568.g152ad8e336-goog >
Re: [PATCH 7/7] kgdb: Remove irq flags and local_irq_enable/disable from roundup
On Mon, Oct 29, 2018 at 11:07:07AM -0700, Douglas Anderson wrote: > The function kgdb_roundup_cpus() was passed a parameter that was > documented as: > > > the flags that will be used when restoring the interrupts. There is > > local_irq_save() call before kgdb_roundup_cpus(). > > Nobody used those flags. Anyone who wanted to temporarily turn on > interrupts just did local_irq_enable() and local_irq_disable() without > looking at them. So we can definitely remove the flags. On the whole I'd rather that this change... > Speaking of calling local_irq_enable(), it seems like a bad idea and > it caused a nice splat on my system with lockdep turned on. > Specifically it hit: > DEBUG_LOCKS_WARN_ON(current->hardirq_context) ... and fixes for this this were in separate patches. They don't appear especially related. Daniel. > See the previous patch in this series ("smp: Don't yell about IRQs > disabled in kgdb_roundup_cpus()") for more details, but the last few > things on the stack were this on my arm64 board: > lockdep_hardirqs_on+0xf0/0x160 > trace_hardirqs_on+0x188/0x1ac > kgdb_roundup_cpus+0x14/0x3c > > As agrued in the the commit text of ("smp: Don't yell about IRQs > disabled in kgdb_roundup_cpus()"), it seems better to make > smp_call_function() lenient about kgdb than to locally turn on IRQs > here. Thus let's totally remove all the local_irq_enable() and > local_irq_disable() calls from all of the kgdb_roundup_cpus() calls. > > Signed-off-by: Douglas Anderson > --- > > arch/arc/kernel/kgdb.c | 4 +--- > arch/arm/kernel/kgdb.c | 4 +--- > arch/arm64/kernel/kgdb.c | 4 +--- > arch/hexagon/kernel/kgdb.c | 11 ++- > arch/mips/kernel/kgdb.c| 4 +--- > arch/powerpc/kernel/kgdb.c | 2 +- > arch/sh/kernel/kgdb.c | 4 +--- > arch/sparc/kernel/smp_64.c | 2 +- > arch/x86/kernel/kgdb.c | 9 ++--- > include/linux/kgdb.h | 9 ++--- > kernel/debug/debug_core.c | 2 +- > 11 files changed, 14 insertions(+), 41 deletions(-) > > diff --git a/arch/arc/kernel/kgdb.c b/arch/arc/kernel/kgdb.c > index 9a3c34af2ae8..d94d3cb7f9e8 100644 > --- a/arch/arc/kernel/kgdb.c > +++ b/arch/arc/kernel/kgdb.c > @@ -197,11 +197,9 @@ static void kgdb_call_nmi_hook(void *ignored) > kgdb_nmicallback(raw_smp_processor_id(), NULL); > } > > -void kgdb_roundup_cpus(unsigned long flags) > +void kgdb_roundup_cpus(void) > { > - local_irq_enable(); > smp_call_function(kgdb_call_nmi_hook, NULL, 0); > - local_irq_disable(); > } > > struct kgdb_arch arch_kgdb_ops = { > diff --git a/arch/arm/kernel/kgdb.c b/arch/arm/kernel/kgdb.c > index caa0dbe3dc61..a80e9259f7e9 100644 > --- a/arch/arm/kernel/kgdb.c > +++ b/arch/arm/kernel/kgdb.c > @@ -175,11 +175,9 @@ static void kgdb_call_nmi_hook(void *ignored) > kgdb_nmicallback(raw_smp_processor_id(), get_irq_regs()); > } > > -void kgdb_roundup_cpus(unsigned long flags) > +void kgdb_roundup_cpus(void) > { > - local_irq_enable(); > smp_call_function(kgdb_call_nmi_hook, NULL, 0); > - local_irq_disable(); > } > > static int __kgdb_notify(struct die_args *args, unsigned long cmd) > diff --git a/arch/arm64/kernel/kgdb.c b/arch/arm64/kernel/kgdb.c > index a20de58061a8..5d171c26788f 100644 > --- a/arch/arm64/kernel/kgdb.c > +++ b/arch/arm64/kernel/kgdb.c > @@ -289,11 +289,9 @@ static void kgdb_call_nmi_hook(void *ignored) > kgdb_nmicallback(raw_smp_processor_id(), get_irq_regs()); > } > > -void kgdb_roundup_cpus(unsigned long flags) > +void kgdb_roundup_cpus(void) > { > - local_irq_enable(); > smp_call_function(kgdb_call_nmi_hook, NULL, 0); > - local_irq_disable(); > } > > static int __kgdb_notify(struct die_args *args, unsigned long cmd) > diff --git a/arch/hexagon/kernel/kgdb.c b/arch/hexagon/kernel/kgdb.c > index 16c24b22d0b2..30fbc491cf45 100644 > --- a/arch/hexagon/kernel/kgdb.c > +++ b/arch/hexagon/kernel/kgdb.c > @@ -119,17 +119,12 @@ void kgdb_arch_set_pc(struct pt_regs *regs, unsigned > long pc) > > /** > * kgdb_roundup_cpus - Get other CPUs into a holding pattern > - * @flags: Current IRQ state > * > * On SMP systems, we need to get the attention of the other CPUs > * and get them be in a known state. This should do what is needed > * to get the other CPUs to call kgdb_wait(). Note that on some arches, > * the NMI approach is not used for rounding up all the CPUs. For example, > - * in case of MIPS, smp_call_function() is used to roundup CPUs. In > - * this case, we have to make sure that interrupts are enabled before > - * calling smp_call_function(). The argument to this function is > - * the flags that will be used when restoring the interrupts. There is > - * local_irq_save() call before kgdb_roundup_cpus(). > + * in case of MIPS, smp_call_function() is used to roundup CPUs. > * > * On non-SMP systems, this is not called. > */ > @@ -139,11 +134,9 @@ static void hexagon_kgdb_nmi_hook(void *ignored) >
[PATCH] powerpc/npu-dma: Remove NPU DMA ops
The NPU IOMMU is setup to mirror the parent PCIe device IOMMU setup. Therefore it does not make sense to call dma operations such as dma_map_page, etc. directly on these devices. The existing dma-ops simply print a warning if they are ever called, however this is unnecessary and the warnings are likely to go unnoticed. It is instead simpler to remove these operations and let the generic DMA code print warnings (eg. via a NULL pointer deref) in cases of buggy drivers attempting dma operations on NVLink devices. Signed-off-by: Alistair Popple --- arch/powerpc/platforms/powernv/npu-dma.c | 64 ++-- 1 file changed, 4 insertions(+), 60 deletions(-) diff --git a/arch/powerpc/platforms/powernv/npu-dma.c b/arch/powerpc/platforms/powernv/npu-dma.c index 6f60e0931922..75b935252981 100644 --- a/arch/powerpc/platforms/powernv/npu-dma.c +++ b/arch/powerpc/platforms/powernv/npu-dma.c @@ -102,63 +102,6 @@ struct pci_dev *pnv_pci_get_npu_dev(struct pci_dev *gpdev, int index) } EXPORT_SYMBOL(pnv_pci_get_npu_dev); -#define NPU_DMA_OP_UNSUPPORTED() \ - dev_err_once(dev, "%s operation unsupported for NVLink devices\n", \ - __func__) - -static void *dma_npu_alloc(struct device *dev, size_t size, - dma_addr_t *dma_handle, gfp_t flag, - unsigned long attrs) -{ - NPU_DMA_OP_UNSUPPORTED(); - return NULL; -} - -static void dma_npu_free(struct device *dev, size_t size, -void *vaddr, dma_addr_t dma_handle, -unsigned long attrs) -{ - NPU_DMA_OP_UNSUPPORTED(); -} - -static dma_addr_t dma_npu_map_page(struct device *dev, struct page *page, - unsigned long offset, size_t size, - enum dma_data_direction direction, - unsigned long attrs) -{ - NPU_DMA_OP_UNSUPPORTED(); - return 0; -} - -static int dma_npu_map_sg(struct device *dev, struct scatterlist *sglist, - int nelems, enum dma_data_direction direction, - unsigned long attrs) -{ - NPU_DMA_OP_UNSUPPORTED(); - return 0; -} - -static int dma_npu_dma_supported(struct device *dev, u64 mask) -{ - NPU_DMA_OP_UNSUPPORTED(); - return 0; -} - -static u64 dma_npu_get_required_mask(struct device *dev) -{ - NPU_DMA_OP_UNSUPPORTED(); - return 0; -} - -static const struct dma_map_ops dma_npu_ops = { - .map_page = dma_npu_map_page, - .map_sg = dma_npu_map_sg, - .alloc = dma_npu_alloc, - .free = dma_npu_free, - .dma_supported = dma_npu_dma_supported, - .get_required_mask = dma_npu_get_required_mask, -}; - /* * Returns the PE assoicated with the PCI device of the given * NPU. Returns the linked pci device if pci_dev != NULL. @@ -270,10 +213,11 @@ static void pnv_npu_dma_set_32(struct pnv_ioda_pe *npe) rc = pnv_npu_set_window(npe, 0, gpe->table_group.tables[0]); /* -* We don't initialise npu_pe->tce32_table as we always use -* dma_npu_ops which are nops. +* NVLink devices use the same TCE table configuration as +* their parent device so drivers shouldn't be doing DMA +* operations directly on these devices. */ - set_dma_ops(>pdev->dev, _npu_ops); + set_dma_ops(>pdev->dev, NULL); } /* -- 2.11.0
Re: [RFC PATCH] seccomp: Add protection keys into seccomp_data
On 10/29/2018 11:33 PM, Dave Hansen wrote: But, that's really an implementation detail. The effect on the ABI and how this might constrain future pkeys use is my bigger worry. I'd also want to make sure that your specific use-case is compatible with all the oddities of pkeys, like the 'clone' and signal behavior. Some of that is spelled out here: http://man7.org/linux/man-pages/man7/pkeys.7.html I think my specific use case is compatible with these oddities since the untrusted code is not allowed to install signal handlers or spawn threads himself but only through the trusted code, which ensures, that the PKRU has the correct value when control passes back to the untrusted code. But I agree that these oddities are something a programmer needs to take into account if he uses pkeys in general (and this patch adds new considerations to it if the program installs a seccomp filter and e.g. wants to do system calls from a signal handler). One thing that's a worry is that we have never said that you *can't* write to arbitrary permissions in PKRU. I can totally see some really paranoid code saying, "I'm about to do something risky, so I'll turn off access to *all* pkeys", or " turn off all access except my current stack". If they did that, they might also inadvertently disable access to certain seccomp-restricted syscalls. We can fix that up by documenting restrictions like "code should never change the access rights of any pkey other than those that it allocated", but that doesn't help any old code (of which I hope there is relatively little). I can also see paranoid code wanting to do something like you describe and I think, that we should try to not forbid this behaviour. Documenting restrictions on code which writes to the PKRU as you describe would be one way to go, but this would disallow this paranoid use case if I understand it correctly because the code would not be allowed to disable access to all except of one pkey. My idea would be to not put the blame on the code which writes to the PKRU but on the code which installs the seccomp filter based on the PKRU: It has to make sure, that it allows the system calls which the paranoid code should be allowed to do when the pkey of the paranoid code is the (only) enabled pkey. This could be ensured e.g. by the paranoid code providing the pkey it intends to use to the code which installs the seccomp filter. This of course means, that you need to know what you are doing, when you install a seccomp filter which depends on the PKRU, but I think this is already the case when you install a seccomp filter without this patch (but maybe someone with more seccomp experience than me can say better how much reasoning is needed to install a seccomp filter without and with this patch). -- Michael
Re: [PATCH 6/7] smp: Don't yell about IRQs disabled in kgdb_roundup_cpus()
On Mon, Oct 29, 2018 at 11:07:06AM -0700, Douglas Anderson wrote: > In kgdb_roundup_cpus() we've got code that looks like: > local_irq_enable(); > smp_call_function(kgdb_call_nmi_hook, NULL, 0); > local_irq_disable(); > > In certain cases when we drop into kgdb (like with sysrq-g on a serial > console) we'll get a big yell that looks like: > > sysrq: SysRq : DEBUG > [ cut here ] > DEBUG_LOCKS_WARN_ON(current->hardirq_context) > WARNING: CPU: 0 PID: 0 at .../kernel/locking/lockdep.c:2875 > lockdep_hardirqs_on+0xf0/0x160 > CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.19.0 #27 > pstate: 604003c9 (nZCv DAIF +PAN -UAO) > pc : lockdep_hardirqs_on+0xf0/0x160 > ... > Call trace: >lockdep_hardirqs_on+0xf0/0x160 >trace_hardirqs_on+0x188/0x1ac >kgdb_roundup_cpus+0x14/0x3c >kgdb_cpu_enter+0x53c/0x5cc >kgdb_handle_exception+0x180/0x1d4 >kgdb_compiled_brk_fn+0x30/0x3c >brk_handler+0x134/0x178 >do_debug_exception+0xfc/0x178 >el1_dbg+0x18/0x78 >kgdb_breakpoint+0x34/0x58 >sysrq_handle_dbg+0x54/0x5c >__handle_sysrq+0x114/0x21c >handle_sysrq+0x30/0x3c >qcom_geni_serial_isr+0x2dc/0x30c > ... > ... > irq event stamp: ...45 > hardirqs last enabled at (...44): [...] __do_softirq+0xd8/0x4e4 > hardirqs last disabled at (...45): [...] el1_irq+0x74/0x130 > softirqs last enabled at (...42): [...] _local_bh_enable+0x2c/0x34 > softirqs last disabled at (...43): [...] irq_exit+0xa8/0x100 > ---[ end trace adf21f830c46e638 ]--- > > Let's add kgdb to the list of reasons not to warn in > smp_call_function_many(). That will allow us (in a future patch) to > stop calling local_irq_enable() which will get rid of the original > splat. > > NOTE: with this change comes the obvious question: will we start > deadlocking more often now when we drop into the debugger. I can't > say that for sure one way or the other, but the fact that we do the > same logic for "oops_in_progress" makes me feel like it shouldn't > happen too often. Also note that the old logic of turning on > interrupts temporarily wasn't exactly safe since (I presume) that > could have violated spin_lock_irqsave() semantics and ended up with a > deadlock of its own. This is part of the code to bring all the cores to a halt and since the other cores are still running kgdb isn't yet able to use the fact all the CPUs are halted to bend the rules. It is better for this code to play by the rules if it can. Is is possible to get the roundup functions to use a private csd alongside smp_call_function_single_async()? We could add a helper function to the debug core to avoid having to add cpu_online loops into every kgdb_roundup_cpus() implementaton. Daniel. > > Signed-off-by: Douglas Anderson > --- > > kernel/smp.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/kernel/smp.c b/kernel/smp.c > index 163c451af42e..bb581e58c8dc 100644 > --- a/kernel/smp.c > +++ b/kernel/smp.c > @@ -19,6 +19,7 @@ > #include > #include > #include > +#include > > #include "smpboot.h" > > @@ -413,7 +414,8 @@ void smp_call_function_many(const struct cpumask *mask, >* can't happen. >*/ > WARN_ON_ONCE(cpu_online(this_cpu) && irqs_disabled() > - && !oops_in_progress && !early_boot_irqs_disabled); > + && !oops_in_progress && !early_boot_irqs_disabled > + && !in_dbg_master()); > > /* Try to fastpath. So, what's a CPU they want? Ignoring this one. */ > cpu = cpumask_first_and(mask, cpu_online_mask); > -- > 2.19.1.568.g152ad8e336-goog >
Re: NXP P50XX/e5500: SMP doesn't work anymore with the latest Git kernel
On 30 October 2018 at 02:59AM, Benjamin Herrenschmidt wrote: On Tue, 2018-10-30 at 02:42 +0100, Christian Zigotzky wrote: OF patch for the latest Git kernel: http://www.xenosoft.de/of_v2.patch This just seems to revert a whole bunch of stuff, not really the right way to go. Why is of_get_next_cpu_node() not finding your CPUs ? There must be something wrong with the device-tree... Maybe we need a new dtb for QEMU ppce500 and P50XX machines. I don't know. For us it is good to have this patch because we can test further. (other updates) With this patch, SMP works again on virtual QEMU ppce500 and P50XX machines. If you have another code, then I will test it with QEMU and my P5020 board. Thanks, Christian - of_v2.patch - diff -rupN a/drivers/of/base.c b/drivers/of/base.c --- a/drivers/of/base.c2018-10-30 02:19:30.827089495 +0100 +++ b/drivers/of/base.c2018-10-30 02:18:51.666856715 +0100 @@ -395,7 +395,7 @@ struct device_node *of_get_cpu_node(int { struct device_node *cpun; -for_each_of_cpu_node(cpun) { +for_each_node_by_type(cpun, "cpu") { if (arch_find_n_match_cpu_physical_id(cpun, cpu, thread)) return cpun; } @@ -750,45 +750,6 @@ struct device_node *of_get_next_availabl EXPORT_SYMBOL(of_get_next_available_child); /** - *of_get_next_cpu_node - Iterate on cpu nodes - *@prev:previous child of the /cpus node, or NULL to get first - * - *Returns a cpu node pointer with refcount incremented, use of_node_put() - *on it when done. Returns NULL when prev is the last child. Decrements - *the refcount of prev. - */ -struct device_node *of_get_next_cpu_node(struct device_node *prev) -{ -struct device_node *next = NULL; -unsigned long flags; -struct device_node *node; - -if (!prev) -node = of_find_node_by_path("/cpus"); - -raw_spin_lock_irqsave(_lock, flags); -if (prev) -next = prev->sibling; -else if (node) { -next = node->child; -of_node_put(node); -} -for (; next; next = next->sibling) { -if (!(of_node_name_eq(next, "cpu") || - (next->type && !of_node_cmp(next->type, "cpu" -continue; -if (!__of_device_is_available(next)) -continue; -if (of_node_get(next)) -break; -} -of_node_put(prev); -raw_spin_unlock_irqrestore(_lock, flags); -return next; -} -EXPORT_SYMBOL(of_get_next_cpu_node); - -/** * of_get_compatible_child - Find compatible child node * @parent:parent node * @compatible:compatible string diff -rupN a/include/linux/of.h b/include/linux/of.h --- a/include/linux/of.h2018-10-30 02:19:32.047096634 +0100 +++ b/include/linux/of.h2018-10-30 02:18:51.666856715 +0100 @@ -347,7 +347,6 @@ extern const void *of_get_property(const const char *name, int *lenp); extern struct device_node *of_get_cpu_node(int cpu, unsigned int *thread); -extern struct device_node *of_get_next_cpu_node(struct device_node *prev); #define for_each_property_of_node(dn, pp) \ for (pp = dn->properties; pp != NULL; pp = pp->next) @@ -757,11 +756,6 @@ static inline struct device_node *of_get return NULL; } -static inline struct device_node *of_get_next_cpu_node(struct device_node *prev) -{ -return NULL; -} - static inline int of_n_addr_cells(struct device_node *np) { return 0; @@ -1239,10 +1233,6 @@ static inline int of_property_read_s32(c for (child = of_get_next_available_child(parent, NULL); child != NULL; \ child = of_get_next_available_child(parent, child)) -#define for_each_of_cpu_node(cpu) \ -for (cpu = of_get_next_cpu_node(NULL); cpu != NULL; \ - cpu = of_get_next_cpu_node(cpu)) - #define for_each_node_with_property(dn, prop_name) \ for (dn = of_find_node_with_property(NULL, prop_name); dn; \ dn = of_find_node_with_property(dn, prop_name))
Re: [mainline][bisected 55469b][ppc] build error at drivers/net/ethernet/mellanox/mlx4/en_rx.c
On Mon, Oct 29, 2018 at 11:43 PM Abdul Haleem wrote: > > Greeting's > > mainline build fails on my powerpc with commit 55469b : drivers: net: > remove inclusion when not needed > > Machine type: PowerPC power 8 bare-metal > kernel : 4.19.0 > gcc : 4.8.5 > config attached > Thanks for the report, but this patch is not in 4.19 kernel ? It is scheduled for 4.20 only. Anyway, mlx4 needs to include directly, instead of assuming it is indirectly included by another file.
[mainline][bisected 55469b][ppc] build error at drivers/net/ethernet/mellanox/mlx4/en_rx.c
Greeting's mainline build fails on my powerpc with commit 55469b : drivers: net: remove inclusion when not needed Machine type: PowerPC power 8 bare-metal kernel : 4.19.0 gcc : 4.8.5 config attached build errors: drivers/net/ethernet/mellanox/mlx4/en_rx.c:582:18: warning: ‘struct iphdr’ declared inside parameter list [enabled by default] struct iphdr *iph) ^ drivers/net/ethernet/mellanox/mlx4/en_rx.c:582:18: warning: its scope is only this definition or declaration, which is probably not what you want [enabled by default] drivers/net/ethernet/mellanox/mlx4/en_rx.c: In function ‘get_fixed_ipv4_csum’: drivers/net/ethernet/mellanox/mlx4/en_rx.c:586:20: error: dereferencing pointer to incomplete type __u8 ipproto = iph->protocol; ^ Kernel builds fine when the patch is reverted. -- Regard's Abdul Haleem IBM Linux Technology Centre # # Automatically generated file; DO NOT EDIT. # Linux/powerpc 4.13.0-rc2 Kernel Configuration # CONFIG_PPC64=y # # Processor support # CONFIG_PPC_BOOK3S_64=y # CONFIG_PPC_BOOK3E_64 is not set # CONFIG_POWER7_CPU is not set CONFIG_POWER8_CPU=y CONFIG_PPC_BOOK3S=y CONFIG_PPC_FPU=y CONFIG_ALTIVEC=y CONFIG_VSX=y # CONFIG_PPC_ICSWX is not set CONFIG_PPC_STD_MMU=y CONFIG_PPC_STD_MMU_64=y CONFIG_PPC_RADIX_MMU=y CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION=y CONFIG_PPC_MM_SLICES=y CONFIG_PPC_HAVE_PMU_SUPPORT=y CONFIG_PPC_PERF_CTRS=y CONFIG_FORCE_SMP=y CONFIG_SMP=y CONFIG_NR_CPUS=2048 CONFIG_PPC_DOORBELL=y # CONFIG_CPU_BIG_ENDIAN is not set CONFIG_CPU_LITTLE_ENDIAN=y CONFIG_PPC64_BOOT_WRAPPER=y CONFIG_64BIT=y CONFIG_ARCH_PHYS_ADDR_T_64BIT=y CONFIG_ARCH_DMA_ADDR_T_64BIT=y CONFIG_MMU=y CONFIG_ARCH_MMAP_RND_BITS_MAX=29 CONFIG_ARCH_MMAP_RND_BITS_MIN=14 CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MAX=13 CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MIN=7 CONFIG_HAVE_SETUP_PER_CPU_AREA=y CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y CONFIG_NR_IRQS=512 CONFIG_NMI_IPI=y CONFIG_STACKTRACE_SUPPORT=y CONFIG_TRACE_IRQFLAGS_SUPPORT=y CONFIG_LOCKDEP_SUPPORT=y CONFIG_RWSEM_XCHGADD_ALGORITHM=y CONFIG_GENERIC_HWEIGHT=y CONFIG_ARCH_HAS_DMA_SET_COHERENT_MASK=y CONFIG_PPC=y # CONFIG_GENERIC_CSUM is not set CONFIG_EARLY_PRINTK=y CONFIG_PANIC_TIMEOUT=180 CONFIG_COMPAT=y CONFIG_SYSVIPC_COMPAT=y CONFIG_SCHED_OMIT_FRAME_POINTER=y CONFIG_ARCH_MAY_HAVE_PC_FDC=y CONFIG_PPC_UDBG_16550=y # CONFIG_GENERIC_TBSYNC is not set CONFIG_AUDIT_ARCH=y CONFIG_GENERIC_BUG=y CONFIG_EPAPR_BOOT=y # CONFIG_DEFAULT_UIMAGE is not set CONFIG_ARCH_HIBERNATION_POSSIBLE=y CONFIG_ARCH_SUSPEND_POSSIBLE=y # CONFIG_PPC_DCR_NATIVE is not set # CONFIG_PPC_DCR_MMIO is not set # CONFIG_PPC_OF_PLATFORM_PCI is not set CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y CONFIG_ARCH_SUPPORTS_UPROBES=y CONFIG_PPC_EMULATE_SSTEP=y CONFIG_ZONE_DMA32=y CONFIG_PGTABLE_LEVELS=4 CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config" CONFIG_IRQ_WORK=y CONFIG_BUILDTIME_EXTABLE_SORT=y # # General setup # CONFIG_INIT_ENV_ARG_LIMIT=32 CONFIG_CROSS_COMPILE="" # CONFIG_COMPILE_TEST is not set CONFIG_LOCALVERSION="" CONFIG_LOCALVERSION_AUTO=y CONFIG_HAVE_KERNEL_GZIP=y CONFIG_HAVE_KERNEL_XZ=y CONFIG_KERNEL_GZIP=y # CONFIG_KERNEL_XZ is not set CONFIG_DEFAULT_HOSTNAME="(none)" CONFIG_SWAP=y CONFIG_SYSVIPC=y CONFIG_SYSVIPC_SYSCTL=y CONFIG_POSIX_MQUEUE=y CONFIG_POSIX_MQUEUE_SYSCTL=y CONFIG_CROSS_MEMORY_ATTACH=y CONFIG_FHANDLE=y # CONFIG_USELIB is not set CONFIG_AUDIT=y CONFIG_HAVE_ARCH_AUDITSYSCALL=y CONFIG_AUDITSYSCALL=y CONFIG_AUDIT_WATCH=y CONFIG_AUDIT_TREE=y # # IRQ subsystem # CONFIG_GENERIC_IRQ_SHOW=y CONFIG_GENERIC_IRQ_SHOW_LEVEL=y CONFIG_GENERIC_IRQ_MIGRATION=y CONFIG_HARDIRQS_SW_RESEND=y CONFIG_IRQ_DOMAIN=y CONFIG_GENERIC_MSI_IRQ=y CONFIG_IRQ_DOMAIN_DEBUG=y CONFIG_IRQ_FORCED_THREADING=y CONFIG_SPARSE_IRQ=y # CONFIG_GENERIC_IRQ_DEBUGFS is not set CONFIG_GENERIC_TIME_VSYSCALL=y CONFIG_GENERIC_CLOCKEVENTS=y CONFIG_ARCH_HAS_TICK_BROADCAST=y CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y CONFIG_GENERIC_CMOS_UPDATE=y # # Timers subsystem # CONFIG_TICK_ONESHOT=y CONFIG_NO_HZ_COMMON=y # CONFIG_HZ_PERIODIC is not set CONFIG_NO_HZ_IDLE=y # CONFIG_NO_HZ_FULL is not set CONFIG_NO_HZ=y CONFIG_HIGH_RES_TIMERS=y # # CPU/Task time and stats accounting # CONFIG_VIRT_CPU_ACCOUNTING=y # CONFIG_TICK_CPU_ACCOUNTING is not set CONFIG_VIRT_CPU_ACCOUNTING_NATIVE=y # CONFIG_VIRT_CPU_ACCOUNTING_GEN is not set # CONFIG_BSD_PROCESS_ACCT is not set CONFIG_TASKSTATS=y CONFIG_TASK_DELAY_ACCT=y CONFIG_TASK_XACCT=y CONFIG_TASK_IO_ACCOUNTING=y # # RCU Subsystem # CONFIG_TREE_RCU=y CONFIG_RCU_EXPERT=y CONFIG_SRCU=y CONFIG_TREE_SRCU=y CONFIG_TASKS_RCU=y CONFIG_RCU_STALL_COMMON=y CONFIG_RCU_NEED_SEGCBLIST=y CONFIG_RCU_FANOUT=64 CONFIG_RCU_FANOUT_LEAF=16 # CONFIG_RCU_FAST_NO_HZ is not set CONFIG_RCU_NOCB_CPU=y CONFIG_BUILD_BIN2C=y CONFIG_IKCONFIG=y CONFIG_IKCONFIG_PROC=y CONFIG_LOG_BUF_SHIFT=18 CONFIG_LOG_CPU_MAX_BUF_SHIFT=13 CONFIG_PRINTK_SAFE_LOG_BUF_SHIFT=13 CONFIG_ARCH_SUPPORTS_NUMA_BALANCING=y CONFIG_NUMA_BALANCING=y CONFIG_NUMA_BALANCING_DEFAULT_ENABLED=y CONFIG_CGROUPS=y
Re: [RFC PATCH v1 1/4] kvmppc: HMM backend driver to manage pages of secure guest
On Mon, Oct 29, 2018 at 11:31:55PM -0700, Ram Pai wrote: > On Tue, Oct 30, 2018 at 04:03:00PM +1100, Paul Mackerras wrote: > > On Mon, Oct 22, 2018 at 10:48:34AM +0530, Bharata B Rao wrote: > > > HMM driver for KVM PPC to manage page transitions of > > > secure guest via H_SVM_PAGE_IN and H_SVM_PAGE_OUT hcalls. > > > > > > H_SVM_PAGE_IN: Move the content of a normal page to secure page > > > H_SVM_PAGE_OUT: Move the content of a secure page to normal page > > > > Comments below... > > > > > Signed-off-by: Bharata B Rao > > > --- > > > /* pSeries hypervisor opcodes */ > > > > #define H_REMOVE 0x04 > > > #define H_ENTER 0x08 > > > @@ -295,7 +298,9 @@ > > > #define H_INT_ESB 0x3C8 > > > #define H_INT_SYNC 0x3CC > > > #define H_INT_RESET 0x3D0 > > > -#define MAX_HCALL_OPCODE H_INT_RESET > > > +#define H_SVM_PAGE_IN0x3D4 > > > +#define H_SVM_PAGE_OUT 0x3D8 > > > +#define MAX_HCALL_OPCODE H_SVM_PAGE_OUT > > > > We should define hcall numbers in the implementation-specific range. > > We can't use numbers in this range without first getting them > > standardized in PAPR. Since these hcalls are not actually used by > > the guest but are just a private interface between KVM and the > > ultravisor, it's probably not worth putting them in PAPR. We should > > pick a range somewhere in the 0xf000 - 0xfffc area and use that. > > We are using that range for Ucalls. For hcalls we were told to reserve > a range between 1024(0x400) to 2047(0x7FF). Have to reserve them in the > appropriate database. Who gave you that advice? Paul.
Re: [RFC PATCH v1 1/4] kvmppc: HMM backend driver to manage pages of secure guest
On Tue, Oct 30, 2018 at 04:03:00PM +1100, Paul Mackerras wrote: > On Mon, Oct 22, 2018 at 10:48:34AM +0530, Bharata B Rao wrote: > > HMM driver for KVM PPC to manage page transitions of > > secure guest via H_SVM_PAGE_IN and H_SVM_PAGE_OUT hcalls. > > > > H_SVM_PAGE_IN: Move the content of a normal page to secure page > > H_SVM_PAGE_OUT: Move the content of a secure page to normal page > > Comments below... > > > Signed-off-by: Bharata B Rao > > --- > > /* pSeries hypervisor opcodes */ > > #define H_REMOVE 0x04 > > #define H_ENTER0x08 > > @@ -295,7 +298,9 @@ > > #define H_INT_ESB 0x3C8 > > #define H_INT_SYNC 0x3CC > > #define H_INT_RESET 0x3D0 > > -#define MAX_HCALL_OPCODE H_INT_RESET > > +#define H_SVM_PAGE_IN 0x3D4 > > +#define H_SVM_PAGE_OUT 0x3D8 > > +#define MAX_HCALL_OPCODE H_SVM_PAGE_OUT > > We should define hcall numbers in the implementation-specific range. > We can't use numbers in this range without first getting them > standardized in PAPR. Since these hcalls are not actually used by > the guest but are just a private interface between KVM and the > ultravisor, it's probably not worth putting them in PAPR. We should > pick a range somewhere in the 0xf000 - 0xfffc area and use that. We are using that range for Ucalls. For hcalls we were told to reserve a range between 1024(0x400) to 2047(0x7FF). Have to reserve them in the appropriate database. RP