Re: [powerpc:next-test 70/71] include/asm-generic/memory_model.h:55:54: error: 'vmemmap' undeclared; did you mean 'mem_map'?
This one is already fixed in v2 and v3 by the addition of asm/pgtable.h Christophe Le 12/12/2018 à 06:36, kbuild test robot a écrit : tree: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next-test head: 13fd174be1903cc069bb73b0d53a0420b8f6d778 commit: 3a0f466ce2402cc090c97ccc1285508f688d8a8e [70/71] powerpc: Implement CONFIG_DEBUG_VIRTUAL config: powerpc-ppc64e_defconfig (attached as .config) compiler: powerpc64-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0 reproduce: wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross git checkout 3a0f466ce2402cc090c97ccc1285508f688d8a8e # save the attached .config to linux build tree GCC_VERSION=7.2.0 make.cross ARCH=powerpc All errors (new ones prefixed by >>): In file included from arch/powerpc/include/asm/page.h:339:0, from arch/powerpc/include/asm/thread_info.h:29, from include/linux/thread_info.h:38, from include/asm-generic/preempt.h:5, from ./arch/powerpc/include/generated/asm/preempt.h:1, from include/linux/preempt.h:81, from include/linux/spinlock.h:51, from include/linux/seqlock.h:36, from include/linux/time.h:6, from include/linux/stat.h:19, from include/linux/module.h:10, from init/do_mounts.c:1: arch/powerpc/include/asm/io.h: In function 'page_to_phys': include/asm-generic/memory_model.h:55:54: error: 'vmemmap' undeclared (first use in this function); did you mean 'mem_map'? #define __page_to_pfn(page) (unsigned long)((page) - vmemmap) ^ include/asm-generic/memory_model.h:81:21: note: in expansion of macro '__page_to_pfn' #define page_to_pfn __page_to_pfn ^ arch/powerpc/include/asm/io.h:835:22: note: in expansion of macro 'page_to_pfn' unsigned long pfn = page_to_pfn(page); ^~~ include/asm-generic/memory_model.h:55:54: note: each undeclared identifier is reported only once for each function it appears in #define __page_to_pfn(page) (unsigned long)((page) - vmemmap) ^ include/asm-generic/memory_model.h:81:21: note: in expansion of macro '__page_to_pfn' #define page_to_pfn __page_to_pfn ^ arch/powerpc/include/asm/io.h:835:22: note: in expansion of macro 'page_to_pfn' unsigned long pfn = page_to_pfn(page); ^~~ vim +55 include/asm-generic/memory_model.h 8f6aac41 Christoph Lameter 2007-10-16 52 af901ca1 André Goddard Rosa 2009-11-14 53 /* memmap is virtually contiguous. */ 8f6aac41 Christoph Lameter 2007-10-16 54 #define __pfn_to_page(pfn) (vmemmap + (pfn)) 32272a26 Martin Schwidefsky 2008-12-25 @55 #define __page_to_pfn(page) (unsigned long)((page) - vmemmap) 8f6aac41 Christoph Lameter 2007-10-16 56 :: The code at line 55 was first introduced by commit :: 32272a26974d2027384fd4010cd1780fca425d94 [S390] __page_to_pfn warnings :: TO: Martin Schwidefsky :: CC: Martin Schwidefsky --- 0-DAY kernel test infrastructureOpen Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation
Re: [PATCH v3] powerpc: implement CONFIG_DEBUG_VIRTUAL
Le 12/12/2018 à 01:23, Michael Ellerman a écrit : Christophe Leroy writes: This patch implements CONFIG_DEBUG_VIRTUAL to warn about incorrect use of virt_to_phys() and page_to_phys() Below is the result of test_debug_virtual: [1.438746] WARNING: CPU: 0 PID: 1 at ./arch/powerpc/include/asm/io.h:808 test_debug_virtual_init+0x3c/0xd4 [1.448156] CPU: 0 PID: 1 Comm: swapper Not tainted 4.20.0-rc5-00560-g6bfb52e23a00-dirty #532 [1.457259] NIP: c066c550 LR: c0650ccc CTR: c066c514 [1.462257] REGS: c900bdb0 TRAP: 0700 Not tainted (4.20.0-rc5-00560-g6bfb52e23a00-dirty) [1.471184] MSR: 00029032 CR: 48000422 XER: 2000 [1.477811] [1.477811] GPR00: c0650ccc c900be60 c60d 006000c0 c900 9032 c7fa0020 [1.477811] GPR08: 2400 0001 0900 c07b5d04 c00037d8 [1.477811] GPR16: c076 c074 0092 c0685bb0 [1.477811] GPR24: c065042c c068a734 c0685b8c 0006 c076 c075c3c0 [1.512711] NIP [c066c550] test_debug_virtual_init+0x3c/0xd4 [1.518315] LR [c0650ccc] do_one_initcall+0x8c/0x1cc [1.523163] Call Trace: [1.525595] [c900be60] [c0567340] 0xc0567340 (unreliable) [1.530954] [c900be90] [c0650ccc] do_one_initcall+0x8c/0x1cc [1.536551] [c900bef0] [c0651000] kernel_init_freeable+0x1f4/0x2cc [1.542658] [c900bf30] [c00037ec] kernel_init+0x14/0x110 [1.547913] [c900bf40] [c000e1d0] ret_from_kernel_thread+0x14/0x1c [1.553971] Instruction dump: [1.556909] 3ca50100 bfa10024 54a5000e 3fa0c076 7c0802a6 3d454000 813dc204 554893be [1.564566] 7d294010 7d294910 90010034 39290001 <0f09> 7c3e0b78 955e0008 3fe0c062 [1.572425] ---[ end trace 6f6984225b280ad6 ]--- [1.577467] PA: 0x0900 for VA: 0xc900 [1.581799] PA: 0x061e8f50 for VA: 0xc61e8f50 Signed-off-by: Christophe Leroy --- v3: Added missing linux/mm.h I realised that a driver may use DMA on stack after checking with virt_addr_valid(), so the new verification might induce false positives. I remove it for now, will add it again later in a more controled way. What is this comment referring to? I can't see any difference to v2 except the linux/mm.h include. v2 was: @@ -804,6 +806,11 @@ extern void __iounmap_at(void *ea, unsigned long size); */ static inline unsigned long virt_to_phys(volatile void * address) { + if (IS_ENABLED(CONFIG_DEBUG_VIRTUAL) && + !WARN_ON(IS_ENABLED(CONFIG_HAVE_ARCH_VMAP_STACK) && current->pid && +object_is_on_stack((const void*)address))) + WARN_ON(!virt_addr_valid(address)); + return __pa((unsigned long)address); } v3 is: (same as v1) @@ -804,6 +806,8 @@ extern void __iounmap_at(void *ea, unsigned long size); */ static inline unsigned long virt_to_phys(volatile void * address) { + WARN_ON(IS_ENABLED(CONFIG_DEBUG_VIRTUAL) && !virt_addr_valid(address)); + return __pa((unsigned long)address); } The idea in v2 was to detect objects on stack used for DMA before activating CONFIG_VMAP_STACK, but if the driver uses virt_addr_valid() to decide if it can DMA map it, then we'll get false positives. So I think this should be added with a dedicated DEBUG CONFIG option, not implicitely. Christophe
Re: use generic DMA mapping code in powerpc V4
> On 12. Dec 2018, at 01:47, Benjamin Herrenschmidt > wrote: > >> On Tue, 2018-12-11 at 19:17 +0100, Christian Zigotzky wrote: >> X5000 (P5020 board): U-Boot loads the kernel and the dtb file. Then the >> kernel starts but it doesn't find any hard disks (partitions). That >> means this is also the bad commit for the P5020 board. > > What are the disks hanging off ? A PCIe device of some sort ? > > Can you send good & bad dmesg logs ? > > Ben. > > Unfortunately not. It doesn’t detect any hard disk. That means the kernel ring buffer won’t save to log files. I don’t have a serial null modem cable for seeing all output. We use SATA disks. I will investigate more in this problem with rollback the files in the bad commit. — Christian
[powerpc:next-test 70/71] include/asm-generic/memory_model.h:55:54: error: 'vmemmap' undeclared; did you mean 'mem_map'?
tree: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next-test head: 13fd174be1903cc069bb73b0d53a0420b8f6d778 commit: 3a0f466ce2402cc090c97ccc1285508f688d8a8e [70/71] powerpc: Implement CONFIG_DEBUG_VIRTUAL config: powerpc-ppc64e_defconfig (attached as .config) compiler: powerpc64-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0 reproduce: wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross git checkout 3a0f466ce2402cc090c97ccc1285508f688d8a8e # save the attached .config to linux build tree GCC_VERSION=7.2.0 make.cross ARCH=powerpc All errors (new ones prefixed by >>): In file included from arch/powerpc/include/asm/page.h:339:0, from arch/powerpc/include/asm/thread_info.h:29, from include/linux/thread_info.h:38, from include/asm-generic/preempt.h:5, from ./arch/powerpc/include/generated/asm/preempt.h:1, from include/linux/preempt.h:81, from include/linux/spinlock.h:51, from include/linux/seqlock.h:36, from include/linux/time.h:6, from include/linux/stat.h:19, from include/linux/module.h:10, from init/do_mounts.c:1: arch/powerpc/include/asm/io.h: In function 'page_to_phys': >> include/asm-generic/memory_model.h:55:54: error: 'vmemmap' undeclared (first >> use in this function); did you mean 'mem_map'? #define __page_to_pfn(page) (unsigned long)((page) - vmemmap) ^ include/asm-generic/memory_model.h:81:21: note: in expansion of macro '__page_to_pfn' #define page_to_pfn __page_to_pfn ^ arch/powerpc/include/asm/io.h:835:22: note: in expansion of macro 'page_to_pfn' unsigned long pfn = page_to_pfn(page); ^~~ include/asm-generic/memory_model.h:55:54: note: each undeclared identifier is reported only once for each function it appears in #define __page_to_pfn(page) (unsigned long)((page) - vmemmap) ^ include/asm-generic/memory_model.h:81:21: note: in expansion of macro '__page_to_pfn' #define page_to_pfn __page_to_pfn ^ arch/powerpc/include/asm/io.h:835:22: note: in expansion of macro 'page_to_pfn' unsigned long pfn = page_to_pfn(page); ^~~ vim +55 include/asm-generic/memory_model.h 8f6aac41 Christoph Lameter 2007-10-16 52 af901ca1 André Goddard Rosa 2009-11-14 53 /* memmap is virtually contiguous. */ 8f6aac41 Christoph Lameter 2007-10-16 54 #define __pfn_to_page(pfn) (vmemmap + (pfn)) 32272a26 Martin Schwidefsky 2008-12-25 @55 #define __page_to_pfn(page) (unsigned long)((page) - vmemmap) 8f6aac41 Christoph Lameter 2007-10-16 56 :: The code at line 55 was first introduced by commit :: 32272a26974d2027384fd4010cd1780fca425d94 [S390] __page_to_pfn warnings :: TO: Martin Schwidefsky :: CC: Martin Schwidefsky --- 0-DAY kernel test infrastructureOpen Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation .config.gz Description: application/gzip
Re: [Resend PATCH V5 7/10] KVM: Make kvm_set_spte_hva() return int
On Thu, Dec 06, 2018 at 09:21:10PM +0800, lantianyu1...@gmail.com wrote: > From: Lan Tianyu > > The patch is to make kvm_set_spte_hva() return int and caller can > check return value to determine flush tlb or not. It would be helpful if the patch description told the reader which return value(s) mean that the caller should flush the tlb. I would guess that non-zero means to do the flush, but you should make that explicit. > Signed-off-by: Lan Tianyu For the powerpc bits: Acked-by: Paul Mackerras
Re: [PATCH 10/10] perf/doc: update design.txt for exclude_{host|guest} flags
Andrew Murray writes: > On Tue, Dec 11, 2018 at 10:06:53PM +1100, Michael Ellerman wrote: >> [ Reviving old thread. ] >> >> Andrew Murray writes: >> > On Tue, Nov 20, 2018 at 10:31:36PM +1100, Michael Ellerman wrote: >> >> Andrew Murray writes: >> >> >> >> > Update design.txt to reflect the presence of the exclude_host >> >> > and exclude_guest perf flags. >> >> > >> >> > Signed-off-by: Andrew Murray >> >> > --- >> >> > tools/perf/design.txt | 4 >> >> > 1 file changed, 4 insertions(+) >> >> > >> >> > diff --git a/tools/perf/design.txt b/tools/perf/design.txt >> >> > index a28dca2..7de7d83 100644 >> >> > --- a/tools/perf/design.txt >> >> > +++ b/tools/perf/design.txt >> >> > @@ -222,6 +222,10 @@ The 'exclude_user', 'exclude_kernel' and >> >> > 'exclude_hv' bits provide a >> >> > way to request that counting of events be restricted to times when the >> >> > CPU is in user, kernel and/or hypervisor mode. >> >> > >> >> > +Furthermore the 'exclude_host' and 'exclude_guest' bits provide a way >> >> > +to request counting of events restricted to guest and host contexts >> >> > when >> >> > +using virtualisation. >> >> >> >> How does exclude_host differ from exclude_hv ? >> > >> > I believe exclude_host / exclude_guest are intented to distinguish >> > between host and guest in the hosted hypervisor context (KVM). >> >> OK yeah, from the perf-list man page: >> >>u - user-space counting >>k - kernel counting >>h - hypervisor counting >>I - non idle counting >>G - guest counting (in KVM guests) >>H - host counting (not in KVM guests) >> >> > Whereas exclude_hv allows to distinguish between guest and >> > hypervisor in the bare-metal type hypervisors. >> >> Except that's exactly not how we use them on powerpc :) >> >> We use exclude_hv to exclude "the hypervisor", regardless of whether >> it's KVM or PowerVM (which is a bare-metal hypervisor). >> >> We don't use exclude_host / exclude_guest at all, which I guess is a >> bug, except I didn't know they existed until this thread. >> >> eg, in a KVM guest: >> >> $ perf record -e cycles:G /bin/bash -c "for i in {0..10}; do :;done" >> $ perf report -D | grep -Fc "dso: [hypervisor]" >> 16 >> >> >> > In the case of arm64 - if VHE extensions are present then the host >> > kernel will run at a higher privilege to the guest kernel, in which >> > case there is no distinction between hypervisor and host so we ignore >> > exclude_hv. But where VHE extensions are not present then the host >> > kernel runs at the same privilege level as the guest and we use a >> > higher privilege level to switch between them - in this case we can >> > use exclude_hv to discount that hypervisor role of switching between >> > guests. >> >> I couldn't find any arm64 perf code using exclude_host/guest at all? > > Correct - but this is in flight as I am currently adding support for this > see [1]. OK, so at least that will be consistent across arm64 & x86. >> And I don't see any x86 code using exclude_hv. > > I can't find any either. I think that's because they don't need it, because they don't let guests program the PMU directly. It's all handled by the host and the host doesn't let the guest count host cycles anyway. But I could be wrong I'm no x86 expert. >> But maybe that's OK, I just worry this is confusing for users. > > There is some extra context regarding this where exclude_guest/exclude_host > was added, see [2] Good find. I had looked at that commit, but the thread on the list is more informative. In fact there was even a man page update! Never occurred to me look there :P http://man7.org/linux/man-pages/man2/perf_event_open.2.html exclude_host (since Linux 3.2) When conducting measurements that include processes running VM instances (i.e., have executed a KVM_RUN ioctl(2)), only mea‐ sure events happening inside a guest instance. This is only meaningful outside the guests; this setting does not change counts gathered inside of a guest. Currently, this function‐ ality is x86 only. exclude_guest (since Linux 3.2) When conducting measurements that include processes running VM instances (i.e., have executed a KVM_RUN ioctl(2)), do not measure events happening inside guest instances. This is only meaningful outside the guests; this setting does not change counts gathered inside of a guest. Currently, this function‐ ality is x86 only. Which makes things much clearer. Perhaps you want to add a reference to the man page in your text, something like? Furthermore the 'exclude_host' and 'exclude_guest' bits provide a way to request counting of events restricted to guest and host contexts when using virtualisation. See the perf_event_open(2) man page for more detail. cheers
Re: [PATCH v4] kbuild: Add support for DT binding schema checks
On Wed, Dec 12, 2018 at 5:24 AM Rob Herring wrote: > > This adds the build infrastructure for checking DT binding schema > documents and validating dts files using the binding schema. > > Check DT binding schema documents: > make dt_binding_check > > Build dts files and check using DT binding schema: > make dtbs_check > > Optionally, DT_SCHEMA_FILES can be passed in with a schema file(s) to > use for validation. This makes it easier to find and fix errors > generated by a specific schema. > > Currently, the validation targets are separate from a normal build to > avoid a hard dependency on the external DT schema project and because > there are lots of warnings generated. > > Cc: Jonathan Corbet > Cc: Mark Rutland > Cc: Masahiro Yamada > Cc: Michal Marek > Cc: linux-...@vger.kernel.org > Cc: devicet...@vger.kernel.org > Cc: linux-kbu...@vger.kernel.org > Signed-off-by: Rob Herring > --- > v4: > - Rework libyaml check and error message with Masahiro's version > - Simplify build rules and dependencies > Acked-by: Masahiro Yamada Thanks. -- Best Regards Masahiro Yamada
Re: [PATCH v3] kbuild: Add support for DT binding schema checks
On Wed, Dec 12, 2018 at 3:36 AM Rob Herring wrote: > > On Tue, Dec 11, 2018 at 10:03 AM Masahiro Yamada > wrote: > > > > On Wed, Dec 12, 2018 at 12:13 AM Rob Herring wrote: > > > > > > > > > > +$(obj)/%.example.dts: $(src)/%.yaml FORCE > > > > > + $(call if_changed,chk_binding) > > > > > + > > > > > +DT_TMP_SCHEMA := .schema.yaml.tmp > > > > > > > > > > > > BTW, why does this file start with a period? > > > > What is the meaning of '.tmp' extension? > > > > > > Nothing really. Just named it something so it gets cleaned and ignored by > > > git. > > > > > > It is cleaned whatever file name you use. > > > > > > See scripts/Makefile.clean > > > > __clean-files := $(extra-y) $(extra-m) $(extra-) \ > >$(always) $(targets) $(clean-files) \ > >$(hostprogs-y) $(hostprogs-m) $(hostprogs-) \ > >$(hostlibs-y) $(hostlibs-m) $(hostlibs-) \ > >$(hostcxxlibs-y) $(hostcxxlibs-m) > > > > > > $(extra-y) is cleaned. > > True. > > > > > > > You are adding *.example.dts to .gitignore > > > > Why not "schema.yaml" ? > > Okay. I'll do "processed-schema.yaml" to give a bit better name of > what it contains. > > > > > > > > +extra-y += $(DT_TMP_SCHEMA) > > > > > + > > > > > +quiet_cmd_mk_schema = SCHEMA $@ > > > > > + cmd_mk_schema = mkdir -p $(obj); \ > > > > > + rm -f $@; \ > > > > > + $(DT_MK_SCHEMA) $(DT_MK_SCHEMA_FLAGS) -o $@ > > > > > $(filter-out FORCE, $^) > > > > > > > > > > > > "mkdir -p $(obj)" is redundant. > > > > > > > > > > > > Why is 'rm -f $@' necessary ? > > > > Can't dt-mk-schema overwrite the output file? > > > > > > It is for error case when the output file is not generated. I can > > > handle this within dt-mk-schema instead. > > > > > +DT_DOCS = $(shell cd $(srctree)/$(src) && find * -name '*.yaml') > > > > > +DT_SCHEMA_FILES ?= $(addprefix $(src)/,$(DT_DOCS)) > > > > > + > > > > > +extra-y += $(patsubst $(src)/%.yaml,%.example.dts, > > > > > $(DT_SCHEMA_FILES)) > > > > > +extra-y += $(patsubst $(src)/%.yaml,%.example.dtb, > > > > > $(DT_SCHEMA_FILES)) > > > > > > > > > > > > > > > > I assume you intentionally did not do like this: > > > > > > > > extra-y += $(patsubst %.yaml,%.example.dtb, $(DT_DOCS)) > > > > > > > > From the commit description, DT_SCHEMA_FILES might be overridden by a > > > > user. > > > > So, I think this is OK. > > > > > > > > > > > > > > > > > > > > > +$(obj)/$(DT_TMP_SCHEMA): | $(addprefix $(obj)/,$(patsubst > > > > > $(src)/%.yaml,%.example.dtb, $(DT_SCHEMA_FILES))) > > > > > > > > I do not understand this line. > > > > Why is it necessary? > > > > > > > > *.example.dtb files are generated anyway > > > > since they are listed in extra-y. > > > > > > It is enforcing the ordering. Without it, the binding checks and > > > building .schema.yaml.tmp happen in parallel because both only have > > > the source files as dependencies. The '|' keeps the dependencies out > > > of the dependency list($^). > > > > > > What kind problem would you see if > > the binding checks and building .schema.yaml.tmp > > happen in parallel? > > In case of no errors in the binding docs, it doesn't matter. If there > are errors, I don't want the dtbs validation to run if any schema > doesn't validate. However, I played around with this a bit more and it > seems like having the examples' dts/dtb in extra-y prevents that from > happening. Does that match your expections? Exactly. If any error occurs in Documentation/devicetree/bindings/Makefile, Make terminates before proceeding to the dtbs_check stage. -- Best Regards Masahiro Yamada
Re: [PATCH] PCI: Add no-D3 quirk for Mellanox ConnectX-[45]
On Tue, Dec 11, 2018 at 6:38 PM David Gibson wrote: > > On Tue, Dec 11, 2018 at 08:01:43AM -0600, Bjorn Helgaas wrote: > > Hi David, > > > > I see you're still working on this, but if you do end up going this > > direction eventually, would you mind splitting this into two patches: > > 1) rename the quirk to make it more generic (but not changing any > > behavior), and 2) add the ConnectX devices to the quirk. That way > > the ConnectX change is smaller and more easily > > understood/reverted/etc. > > Sure. Would it make sense to send (1) as an independent cleanup, > while I'm still working out exactly what (if anything) we need for > (2)? You could, but I don't think there's really much benefit in doing the first without the second, and I think there is some value in handling both patches at the same time.
Re: [PATCH v2] ocxl: Fix endiannes bug in read_afu_name()
On 12/12/18 4:58 am, Greg Kurz wrote: The AFU Descriptor Template in the PCI config space has a Name Space field which is a 24 Byte ASCII character string of descriptive name space for the AFU. The OCXL driver read the string four characters at a time with pci_read_config_dword(). This optimization is valid on a little-endian system since this is PCI, but a big-endian system ends up with each subset of four characters in reverse order. This could be fixed by switching to read characters one by one. Another option is to swap the bytes if we're big-endian. Go for the latter with le32_to_cpu(). Cc: sta...@vger.kernel.org # v4.16 Signed-off-by: Greg Kurz Acked-by: Frederic Barrat Acked-by: Andrew Donnellan --- v2: - silence sparse with (__force __le32) cast - new changelog --- drivers/misc/ocxl/config.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/misc/ocxl/config.c b/drivers/misc/ocxl/config.c index 57a6bb1fd3c9..8f2c5d8bd2ee 100644 --- a/drivers/misc/ocxl/config.c +++ b/drivers/misc/ocxl/config.c @@ -318,7 +318,7 @@ static int read_afu_name(struct pci_dev *dev, struct ocxl_fn_config *fn, if (rc) return rc; ptr = (u32 *) >name[i]; - *ptr = val; + *ptr = le32_to_cpu((__force __le32) val); } afu->name[OCXL_AFU_NAME_SZ - 1] = '\0'; /* play safe */ return 0; -- Andrew Donnellan OzLabs, ADL Canberra andrew.donnel...@au1.ibm.com IBM Australia Limited
RE: [PATCH 1/2 v3] powerpc/fsl: Use new clockgen binding
> -Original Message- > From: Scott Wood > Sent: 2018年11月26日 9:19 > To: Andy Tang > Cc: mturque...@baylibre.com; sb...@kernel.org; robh...@kernel.org; > mark.rutl...@arm.com; b...@kernel.crashing.org; pau...@samba.org; > m...@ellerman.id.au; linux-...@vger.kernel.org; > devicet...@vger.kernel.org; linux-ker...@vger.kernel.org; > linuxppc-dev@lists.ozlabs.org > Subject: Re: [PATCH 1/2 v3] powerpc/fsl: Use new clockgen binding > > On Wed, 2018-10-31 at 14:57 +0800, Yuantian Tang wrote: > > From: Scott Wood > > > > The driver retains compatibility with old device trees, but we don't > > want the old nodes lying around to be copied, or used as a reference > > (some of the mux options are incorrect), or even just being clutter. > > > > > > +sysclk: sysclk { > > + compatible = "fixed-clock"; > > + #clock-cells = <0>; > > + clock-frequency = <1>; > > + clock-output-names = "sysclk"; > > +}; > > + > > clockgen: global-utilities@e1000 { > > The U-Boot fixup won't work with this. U-Boot patches the frequency > directly into the clockgen node (BTW, this is another reason to preserve > the generic > 1.0/2.0 compatible string). The new binding does not require an input > clock node when it is provided as clock-frequency directly in the clockgen > node -- and the sysclk node was not in my original patch (nor did you note > that you made changes from that original). Why did you add it? > > I would just remove it when applying, but I'm concerned that this indicates > a lack of testing (and I don't have the hardware access to test it myself, > except on t4240) -- unless the 100 MHz sysclk just happened to be correct > on the machines you tested (which would also be a test coverage > problem)? [Andy] You are right. Sysclk may not be useful anymore. Uboot will fixup the clockgen node correctly. Please apply this patch without sysclk. We will test it and catch the error if the clock is not fixed correctly. BTW, which git tree are you going to apply it on? This one? https://git.kernel.org/pub/scm/linux/kernel/git/scottwood/linux.git/log/?h=next BR, Andy > > -Scott >
Re: use generic DMA mapping code in powerpc V4
On Tue, 2018-12-11 at 19:17 +0100, Christian Zigotzky wrote: > X5000 (P5020 board): U-Boot loads the kernel and the dtb file. Then the > kernel starts but it doesn't find any hard disks (partitions). That > means this is also the bad commit for the P5020 board. What are the disks hanging off ? A PCIe device of some sort ? Can you send good & bad dmesg logs ? Ben.
Re: [PATCH] PCI: Add no-D3 quirk for Mellanox ConnectX-[45]
On Tue, Dec 11, 2018 at 08:01:43AM -0600, Bjorn Helgaas wrote: > Hi David, > > I see you're still working on this, but if you do end up going this > direction eventually, would you mind splitting this into two patches: > 1) rename the quirk to make it more generic (but not changing any > behavior), and 2) add the ConnectX devices to the quirk. That way > the ConnectX change is smaller and more easily > understood/reverted/etc. Sure. Would it make sense to send (1) as an independent cleanup, while I'm still working out exactly what (if anything) we need for (2)? -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson signature.asc Description: PGP signature
Re: [PATCH v3] powerpc: implement CONFIG_DEBUG_VIRTUAL
Christophe Leroy writes: > This patch implements CONFIG_DEBUG_VIRTUAL to warn about > incorrect use of virt_to_phys() and page_to_phys() > > Below is the result of test_debug_virtual: > > [1.438746] WARNING: CPU: 0 PID: 1 at ./arch/powerpc/include/asm/io.h:808 > test_debug_virtual_init+0x3c/0xd4 > [1.448156] CPU: 0 PID: 1 Comm: swapper Not tainted > 4.20.0-rc5-00560-g6bfb52e23a00-dirty #532 > [1.457259] NIP: c066c550 LR: c0650ccc CTR: c066c514 > [1.462257] REGS: c900bdb0 TRAP: 0700 Not tainted > (4.20.0-rc5-00560-g6bfb52e23a00-dirty) > [1.471184] MSR: 00029032 CR: 48000422 XER: 2000 > [1.477811] > [1.477811] GPR00: c0650ccc c900be60 c60d 006000c0 c900 > 9032 c7fa0020 > [1.477811] GPR08: 2400 0001 0900 c07b5d04 > c00037d8 > [1.477811] GPR16: c076 c074 > 0092 c0685bb0 > [1.477811] GPR24: c065042c c068a734 c0685b8c 0006 c076 > c075c3c0 > [1.512711] NIP [c066c550] test_debug_virtual_init+0x3c/0xd4 > [1.518315] LR [c0650ccc] do_one_initcall+0x8c/0x1cc > [1.523163] Call Trace: > [1.525595] [c900be60] [c0567340] 0xc0567340 (unreliable) > [1.530954] [c900be90] [c0650ccc] do_one_initcall+0x8c/0x1cc > [1.536551] [c900bef0] [c0651000] kernel_init_freeable+0x1f4/0x2cc > [1.542658] [c900bf30] [c00037ec] kernel_init+0x14/0x110 > [1.547913] [c900bf40] [c000e1d0] ret_from_kernel_thread+0x14/0x1c > [1.553971] Instruction dump: > [1.556909] 3ca50100 bfa10024 54a5000e 3fa0c076 7c0802a6 3d454000 813dc204 > 554893be > [1.564566] 7d294010 7d294910 90010034 39290001 <0f09> 7c3e0b78 > 955e0008 3fe0c062 > [1.572425] ---[ end trace 6f6984225b280ad6 ]--- > [1.577467] PA: 0x0900 for VA: 0xc900 > [1.581799] PA: 0x061e8f50 for VA: 0xc61e8f50 > > Signed-off-by: Christophe Leroy > --- > v3: Added missing linux/mm.h > I realised that a driver may use DMA on stack after checking with > virt_addr_valid(), so the new > verification might induce false positives. I remove it for now, will add > it again later in a more > controled way. What is this comment referring to? I can't see any difference to v2 except the linux/mm.h include. cheers
Re: Snowpatch
On Wed, Dec 12, 2018, at 8:41 AM, Christophe Leroy wrote: > snowpatch is pretty surprising on > https://patchwork.ozlabs.org/patch/1003954/ > > How can such a trivial patch for 8xx alter compilation of all other > targets that much (ie add or remove sparse warnings) ? It was a bug I fixed as soon as I noticed and shouldn't happen any more. > > Christophe
[PATCH v3] powerpc: implement CONFIG_DEBUG_VIRTUAL
This patch implements CONFIG_DEBUG_VIRTUAL to warn about incorrect use of virt_to_phys() and page_to_phys() Below is the result of test_debug_virtual: [1.438746] WARNING: CPU: 0 PID: 1 at ./arch/powerpc/include/asm/io.h:808 test_debug_virtual_init+0x3c/0xd4 [1.448156] CPU: 0 PID: 1 Comm: swapper Not tainted 4.20.0-rc5-00560-g6bfb52e23a00-dirty #532 [1.457259] NIP: c066c550 LR: c0650ccc CTR: c066c514 [1.462257] REGS: c900bdb0 TRAP: 0700 Not tainted (4.20.0-rc5-00560-g6bfb52e23a00-dirty) [1.471184] MSR: 00029032 CR: 48000422 XER: 2000 [1.477811] [1.477811] GPR00: c0650ccc c900be60 c60d 006000c0 c900 9032 c7fa0020 [1.477811] GPR08: 2400 0001 0900 c07b5d04 c00037d8 [1.477811] GPR16: c076 c074 0092 c0685bb0 [1.477811] GPR24: c065042c c068a734 c0685b8c 0006 c076 c075c3c0 [1.512711] NIP [c066c550] test_debug_virtual_init+0x3c/0xd4 [1.518315] LR [c0650ccc] do_one_initcall+0x8c/0x1cc [1.523163] Call Trace: [1.525595] [c900be60] [c0567340] 0xc0567340 (unreliable) [1.530954] [c900be90] [c0650ccc] do_one_initcall+0x8c/0x1cc [1.536551] [c900bef0] [c0651000] kernel_init_freeable+0x1f4/0x2cc [1.542658] [c900bf30] [c00037ec] kernel_init+0x14/0x110 [1.547913] [c900bf40] [c000e1d0] ret_from_kernel_thread+0x14/0x1c [1.553971] Instruction dump: [1.556909] 3ca50100 bfa10024 54a5000e 3fa0c076 7c0802a6 3d454000 813dc204 554893be [1.564566] 7d294010 7d294910 90010034 39290001 <0f09> 7c3e0b78 955e0008 3fe0c062 [1.572425] ---[ end trace 6f6984225b280ad6 ]--- [1.577467] PA: 0x0900 for VA: 0xc900 [1.581799] PA: 0x061e8f50 for VA: 0xc61e8f50 Signed-off-by: Christophe Leroy --- v3: Added missing linux/mm.h I realised that a driver may use DMA on stack after checking with virt_addr_valid(), so the new verification might induce false positives. I remove it for now, will add it again later in a more controled way. v2: Using asm/pgtable.h to avoid build failure on ppc64e. Added a verification that the object is not in stack to catch problems before activing VMAP_STACK. arch/powerpc/Kconfig | 1 + arch/powerpc/include/asm/io.h | 13 - arch/powerpc/mm/pgtable_32.c | 2 +- 3 files changed, 14 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index e312e92e3381..94b46624068d 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -128,6 +128,7 @@ config PPC # # Please keep this list sorted alphabetically. # + select ARCH_HAS_DEBUG_VIRTUAL select ARCH_HAS_DEVMEM_IS_ALLOWED select ARCH_HAS_DMA_SET_COHERENT_MASK select ARCH_HAS_ELF_RANDOMIZE diff --git a/arch/powerpc/include/asm/io.h b/arch/powerpc/include/asm/io.h index e746becd9d6f..7f19fbd3ba55 100644 --- a/arch/powerpc/include/asm/io.h +++ b/arch/powerpc/include/asm/io.h @@ -29,12 +29,14 @@ extern struct pci_dev *isa_bridge_pcidev; #include #include +#include #include #include #include #include #include #include +#include #ifdef CONFIG_PPC64 #include @@ -804,6 +806,8 @@ extern void __iounmap_at(void *ea, unsigned long size); */ static inline unsigned long virt_to_phys(volatile void * address) { + WARN_ON(IS_ENABLED(CONFIG_DEBUG_VIRTUAL) && !virt_addr_valid(address)); + return __pa((unsigned long)address); } @@ -827,7 +831,14 @@ static inline void * phys_to_virt(unsigned long address) /* * Change "struct page" to physical address. */ -#define page_to_phys(page) ((phys_addr_t)page_to_pfn(page) << PAGE_SHIFT) +static inline phys_addr_t page_to_phys(struct page *page) +{ + unsigned long pfn = page_to_pfn(page); + + WARN_ON(IS_ENABLED(CONFIG_DEBUG_VIRTUAL) && !pfn_valid(pfn)); + + return PFN_PHYS(pfn); +} /* * 32 bits still uses virt_to_bus() for it's implementation of DMA diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c index 4fc77a99c9bf..68d204a45cd0 100644 --- a/arch/powerpc/mm/pgtable_32.c +++ b/arch/powerpc/mm/pgtable_32.c @@ -143,7 +143,7 @@ __ioremap_caller(phys_addr_t addr, unsigned long size, pgprot_t prot, void *call * Don't allow anybody to remap normal RAM that we're using. * mem_init() sets high_memory so only do the check after that. */ - if (slab_is_available() && (p < virt_to_phys(high_memory)) && + if (slab_is_available() && virt_addr_valid(p) && page_is_ram(__phys_to_pfn(p))) { printk("__ioremap(): phys addr 0x%llx is RAM lr %ps\n", (unsigned long long)p, __builtin_return_address(0)); -- 2.13.3
Re: [powerpc:next-test 70/71] arch/powerpc/include/asm/io.h:835:22: note: in expansion of macro 'page_to_pfn'
Missing linux/mm.h Will fix it in the coming patch. Christophe Le 11/12/2018 à 22:55, kbuild test robot a écrit : tree: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next-test head: 13fd174be1903cc069bb73b0d53a0420b8f6d778 commit: 3a0f466ce2402cc090c97ccc1285508f688d8a8e [70/71] powerpc: Implement CONFIG_DEBUG_VIRTUAL config: powerpc-ps3_petitboot_nfs_defconfig (attached as .config) compiler: powerpc64-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0 reproduce: wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross git checkout 3a0f466ce2402cc090c97ccc1285508f688d8a8e # save the attached .config to linux build tree GCC_VERSION=7.2.0 make.cross ARCH=powerpc All warnings (new ones prefixed by >>): In file included from arch/powerpc/include/asm/page.h:339:0, from arch/powerpc/include/asm/thread_info.h:29, from include/linux/thread_info.h:38, from include/asm-generic/preempt.h:5, from ./arch/powerpc/include/generated/asm/preempt.h:1, from include/linux/preempt.h:81, from include/linux/spinlock.h:51, from include/linux/seqlock.h:36, from include/linux/time.h:6, from include/linux/jiffies.h:9, from drivers/net/loopback.c:32: arch/powerpc/include/asm/io.h: In function 'page_to_phys': include/asm-generic/memory_model.h:64:14: error: implicit declaration of function 'page_to_section'; did you mean '__nr_to_section'? [-Werror=implicit-function-declaration] int __sec = page_to_section(__pg); \ ^ include/asm-generic/memory_model.h:81:21: note: in expansion of macro '__page_to_pfn' #define page_to_pfn __page_to_pfn ^ arch/powerpc/include/asm/io.h:835:22: note: in expansion of macro 'page_to_pfn' unsigned long pfn = page_to_pfn(page); ^~~ In file included from include/linux/scatterlist.h:8:0, from include/linux/dma-mapping.h:11, from include/linux/skbuff.h:34, from include/net/net_namespace.h:36, from include/linux/inet.h:46, from drivers/net/loopback.c:46: include/linux/mm.h: At top level: include/linux/mm.h:1121:29: error: conflicting types for 'page_to_section' static inline unsigned long page_to_section(const struct page *page) ^~~ In file included from arch/powerpc/include/asm/page.h:339:0, from arch/powerpc/include/asm/thread_info.h:29, from include/linux/thread_info.h:38, from include/asm-generic/preempt.h:5, from ./arch/powerpc/include/generated/asm/preempt.h:1, from include/linux/preempt.h:81, from include/linux/spinlock.h:51, from include/linux/seqlock.h:36, from include/linux/time.h:6, from include/linux/jiffies.h:9, from drivers/net/loopback.c:32: include/asm-generic/memory_model.h:64:14: note: previous implicit declaration of 'page_to_section' was here int __sec = page_to_section(__pg); \ ^ include/asm-generic/memory_model.h:81:21: note: in expansion of macro '__page_to_pfn' #define page_to_pfn __page_to_pfn ^ arch/powerpc/include/asm/io.h:835:22: note: in expansion of macro 'page_to_pfn' unsigned long pfn = page_to_pfn(page); ^~~ cc1: some warnings being treated as errors -- In file included from arch/powerpc/include/asm/page.h:339:0, from arch/powerpc/include/asm/book3s/64/mmu.h:5, from arch/powerpc/include/asm/mmu.h:328, from arch/powerpc/include/asm/lppaca.h:36, from arch/powerpc/include/asm/paca.h:21, from arch/powerpc/include/asm/current.h:16, from include/linux/mutex.h:14, from include/linux/kernfs.h:13, from include/linux/sysfs.h:16, from include/linux/kobject.h:20, from include/linux/device.h:16, from drivers/of/irq.c:19: arch/powerpc/include/asm/io.h: In function 'page_to_phys': include/asm-generic/memory_model.h:64:14: error: implicit declaration of function 'page_to_section'; did you mean '__nr_to_section'? [-Werror=implicit-function-declaration] int __sec = page_to_section(__pg); \ ^ include/asm-generic/memory_model.h:81:21: note: in expansion of macro
Re: [RFC PATCH v2 11/11] powerpc/book3s32: Implement Kernel Userspace Access Protection
On 12/11/2018 05:25 AM, Russell Currey wrote: On Wed, 2018-11-28 at 09:27 +, Christophe Leroy wrote: This patch implements Kernel Userspace Access Protection for book3s/32. Due to limitations of the processor page protection capabilities, the protection is only against writing. read protection cannot be achieved using page protection. In order to provide the protection, Ku and Ks keys are modified in Userspace Segment registers, and different PP bits are used to: PP01 provides RW for Key 0 and RO for Key 1 PP10 provides RW for all PP11 provides RO for all Today PP10 is used for RW pages and PP11 for RO pages. This patch modifies page protection to PP01 for RW pages. Then segment registers are set to Ku 0 and Ks 1. When kernel needs to write to RW pages, the associated segment register is changed to Ks 0 in order to allow write access to the kernel. In order to avoid having the read all segment registers when locking/unlocking the access, some data is kept in the thread_struct and saved on stack on exceptions. The field identifies both the first unlocked segment and the first segment following the last unlocked one. When no segment is unlocked, it contains value 0. Signed-off-by: Christophe Leroy Hey Christophe, I tried to test this and got a machine check after the kernel starts init. A program check you mean ? Vector: 700 (Program Check) at [ef0b5e70] pc: 0ca4 lr: b7e1a030 sp: ef0b5f30 msr: 81002 current = 0xef0b8000 pid = 1, comm = init Testing with mac99 model in qemu. That's pretty surprising. At 0xca4 there is nothing particular for me. This is a handler for system call. Do you have the same ? How can this trigger a program check ? According to the MSR, the check is due to an illegal instruction (bit 12). An we are with MMU off. cc00 : cc00: 7d 50 43 a6 mtsprg 0,r10 cc04: 7d 71 43 a6 mtsprg 1,r11 cc08: 7d 40 00 26 mfcrr10 cc0c: 7d 7b 02 a6 mfsrr1 r11 cc10: 71 6b 40 00 andi. r11,r11,16384 cc14: 3d 61 40 00 addis r11,r1,16384 cc18: 41 82 00 14 beq cc2c cc1c: 7d 73 42 a6 mfsprg r11,3 cc20: 81 6b fb d8 lwz r11,-1064(r11) cc24: 39 6b 20 00 addir11,r11,8192 cc28: 3d 6b 40 00 addis r11,r11,16384 cc2c: 39 6b ff 40 addir11,r11,-192 cc30: 91 4b 00 a8 stw r10,168(r11) cc34: 91 8b 00 40 stw r12,64(r11) cc38: 91 2b 00 34 stw r9,52(r11) cc3c: 7d 50 42 a6 mfsprg r10,0 cc40: 91 4b 00 38 stw r10,56(r11) cc44: 7d 91 42 a6 mfsprg r12,1 cc48: 91 8b 00 3c stw r12,60(r11) cc4c: 7d 48 02 a6 mflrr10 cc50: 91 4b 00 a0 stw r10,160(r11) cc54: 7d 9a 02 a6 mfsrr0 r12 cc58: 7d 3b 02 a6 mfsrr1 r9 cc5c: 90 2b 00 14 stw r1,20(r11) cc60: 90 2b 00 00 stw r1,0(r11) cc64: 3c 2b c0 00 addis r1,r11,-16384 cc68: 39 40 10 02 li r10,4098 cc6c: 7d 40 01 24 mtmsr r10 cc70: 90 0b 00 10 stw r0,16(r11) cc74: 3d 40 72 65 lis r10,29285 cc78: 39 4a 67 73 addir10,r10,26483 cc7c: 91 4b 00 08 stw r10,8(r11) cc80: 90 6b 00 1c stw r3,28(r11) cc84: 90 8b 00 20 stw r4,32(r11) cc88: 90 ab 00 24 stw r5,36(r11) cc8c: 90 cb 00 28 stw r6,40(r11) cc90: 90 eb 00 2c stw r7,44(r11) cc94: 91 0b 00 30 stw r8,48(r11) cc98: 39 40 0c 01 li r10,3073 cc9c: 91 4b 00 b0 stw r10,176(r11) cca0: 39 40 10 32 li r10,4146 cca4: 51 2a 04 20 rlwimi r10,r9,0,16,16 cca8: 48 01 13 5d bl c0012004 Christophe - Russell
Re: [PATCH] pseries/hotplug: Add more delay in pseries_cpu_die while waiting for rtas-stop
Note from Scott Mayes on latest crash: Michael, Since the partition crashed, I was able to get the last .2 seconds worth of RTAS call trace leading up to the crash. Best I could tell from that bit of trace was that the removal of a processor involved the following steps: -- Call to stop-self for a given thread -- Repeated calls to query-cpu-stopped-state (which eventually indicated the thread was stopped) -- Call to get-sensor-state for the thread to check its entity-state (9003) sensor which returned 'dr-entity-present' -- Call to set-indicator to set the isolation-state (9001) indicator to ISOLATE state -- Call to set-indicator to set the allocation-state (9003) indicator to UNUSABLE state I noticed one example of thread x28 getting through all of these steps just fine, but for thread x20, although the query-cpu-stopped state returned 0 status (STOPPED), a subsequent call to set-indicator to ISOLATE failed. This failure was near the end of the trace, but was not the very last RTAS call made in the trace. The set-indicator failure reported to Linux was a -9001 (Valid outstanding translation) which was mapped from a 0x502 (Invalid thread state) return code from PHYP's H_SET_DR_STATE h-call. On 12/10/2018 02:31 PM, Thiago Jung Bauermann wrote: > > Hello Michael, > > Michael Bringmann writes: > >> I have asked Scott Mayes to take a look at one of these crashes from >> the phyp side. I will let you know if he finds anything notable. > > Thanks! It might make sense to test whether booting with > cede_offline=off makes the bug go away. Scott is looking at the system. I will try once he is finished. > > One suspicion I have is regarding the code handling CPU_STATE_INACTIVE. >>From what I understand, it is a powerpc-specific CPU state and from the > perspective of the generic CPU hotplug state machine, inactive CPUs are > already fully offline. Which means that the locking performed by the > generic code state machine doesn't apply to transitioning CPUs from > INACTIVE to OFFLINE state. Perhaps the bug is that there is more than > one CPU making that transition at the same time? That would cause two > CPUs to call RTAS stop-self. > > I haven't checked whether this is really possible or not, though. It's > just a conjecture. Michael > > -- > Thiago Jung Bauermann > IBM Linux Technology Center > > -- Michael W. Bringmann Linux Technology Center IBM Corporation Tie-Line 363-5196 External: (512) 286-5196 Cell: (512) 466-0650 m...@linux.vnet.ibm.com
[RFC 3/3] powerpc/numa: Apply mapping between HW and kernel cpus
Apply new interface to map external powerpc cpus across multiple nodes to a range of kernel cpu values. Mapping is intended to prevent confusion within the kernel about the cpu+node mapping, and the changes in configuration that may happen due to powerpc LPAR migration or other associativity changes during the lifetime of a system. These interfaces exchange the thread_index provided by the 'ibm,ppc-interrupt-server#s' properties, for an internal index to be used by kernel scheduling interfaces. Signed-off-by: Michael Bringmann --- arch/powerpc/mm/numa.c | 45 +- arch/powerpc/platforms/pseries/hotplug-cpu.c | 15 +++-- 2 files changed, 41 insertions(+), 19 deletions(-) diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c index 7d6bba264..59d7cd9 100644 --- a/arch/powerpc/mm/numa.c +++ b/arch/powerpc/mm/numa.c @@ -1063,7 +1063,8 @@ u64 memory_hotplug_max(void) struct topology_update_data { struct topology_update_data *next; - unsigned int cpu; + unsigned int old_cpu; + unsigned int new_cpu; int old_nid; int new_nid; }; @@ -1253,13 +1254,13 @@ static int update_cpu_topology(void *data) for (update = data; update; update = update->next) { int new_nid = update->new_nid; - if (cpu != update->cpu) + if (cpu != update->new_cpu) continue; - unmap_cpu_from_node(cpu); - map_cpu_to_node(cpu, new_nid); - set_cpu_numa_node(cpu, new_nid); - set_cpu_numa_mem(cpu, local_memory_node(new_nid)); + unmap_cpu_from_node(update->old_cpu); + map_cpu_to_node(update->new_cpu, new_nid); + set_cpu_numa_node(update->new_cpu, new_nid); + set_cpu_numa_mem(update->new_cpu, local_memory_node(new_nid)); vdso_getcpu_init(); } @@ -1283,7 +1284,7 @@ static int update_lookup_table(void *data) int nid, base, j; nid = update->new_nid; - base = cpu_first_thread_sibling(update->cpu); + base = cpu_first_thread_sibling(update->new_cpu); for (j = 0; j < threads_per_core; j++) { update_numa_cpu_lookup_table(base + j, nid); @@ -1305,7 +1306,7 @@ int numa_update_cpu_topology(bool cpus_locked) struct topology_update_data *updates, *ud; cpumask_t updated_cpus; struct device *dev; - int weight, new_nid, i = 0; + int weight, new_nid, i = 0, ii; if (!prrn_enabled && !vphn_enabled && topology_inited) return 0; @@ -1349,12 +1350,16 @@ int numa_update_cpu_topology(bool cpus_locked) continue; } + ii = 0; for_each_cpu(sibling, cpu_sibling_mask(cpu)) { ud = [i++]; ud->next = [i]; - ud->cpu = sibling; ud->new_nid = new_nid; ud->old_nid = numa_cpu_lookup_table[sibling]; + ud->old_cpu = sibling; + ud->new_cpu = cpuremap_map_cpu( + get_hard_smp_processor_id(sibling), + ii++, new_nid); cpumask_set_cpu(sibling, _cpus); } cpu = cpu_last_thread_sibling(cpu); @@ -1370,9 +1375,10 @@ int numa_update_cpu_topology(bool cpus_locked) pr_debug("Topology update for the following CPUs:\n"); if (cpumask_weight(_cpus)) { for (ud = [0]; ud; ud = ud->next) { - pr_debug("cpu %d moving from node %d " - "to %d\n", ud->cpu, - ud->old_nid, ud->new_nid); + pr_debug("cpu %d, node %d moving to" +" cpu %d, node %d\n", +ud->old_cpu, ud->old_nid, +ud->new_cpu, ud->new_nid); } } @@ -1409,13 +1415,20 @@ int numa_update_cpu_topology(bool cpus_locked) cpumask_of(raw_smp_processor_id())); for (ud = [0]; ud; ud = ud->next) { - unregister_cpu_under_node(ud->cpu, ud->old_nid); - register_cpu_under_node(ud->cpu, ud->new_nid); + unregister_cpu_under_node(ud->old_cpu, ud->old_nid); + register_cpu_under_node(ud->new_cpu, ud->new_nid); - dev = get_cpu_device(ud->cpu); + dev = get_cpu_device(ud->old_cpu); if (dev) kobject_uevent(>kobj, KOBJ_CHANGE); - cpumask_clear_cpu(ud->cpu, _associativity_changes_mask); + cpumask_clear_cpu(ud->old_cpu, _associativity_changes_mask); + if (ud->old_cpu
[RFC 2/3] powerpc/numa: Define mapping between HW and kernel cpus
Define interface to map external powerpc cpus across multiple nodes to a range of kernel cpu values. Mapping is intended to prevent confusion within the kernel about the cpu+node mapping, and the changes in configuration that may happen due to powerpc LPAR migration or other associativity changes during the lifetime of a system. These interfaces will be used entirely within the powerpc kernel code to maintain separation between the machine and kernel contexts. Signed-off-by: Michael Bringmann --- arch/powerpc/include/asm/topology.h | 31 +++ arch/powerpc/platforms/pseries/Kconfig| 10 ++ arch/powerpc/platforms/pseries/Makefile |1 arch/powerpc/platforms/pseries/cpuremap.c | 131 + 4 files changed, 173 insertions(+) create mode 100644 arch/powerpc/platforms/pseries/cpuremap.c diff --git a/arch/powerpc/include/asm/topology.h b/arch/powerpc/include/asm/topology.h index 4621f40..db11969 100644 --- a/arch/powerpc/include/asm/topology.h +++ b/arch/powerpc/include/asm/topology.h @@ -135,5 +135,36 @@ static inline void shared_proc_topology_init(void) {} #endif #endif +#define CPUREMAP_NO_CPU(~0) +#define CPUREMAP_NO_THREAD (~0) + +#ifdef CONFIG_CPUREMAP +extern int cpuremap_thread_to_cpu(int thread_index); + /* Return CPUREMAP_NO_CPU if not found */ +extern int cpuremap_map_cpu(int thread_index, int in_core_ndx, int node); + /* Return CPUREMAP_NO_CPU if fails */ +extern int cpuremap_reserve_cpu(int cpu); + /* Return CPUREMAP_NO_CPU if fails */ +extern int cpuremap_release_cpu(int cpu); + /* Return CPUREMAP_NO_CPU if fails */ +extern int cpuremap_cpu_to_thread(int cpu); + /* Return CPUREMAP_NO_THREAD if not found */ +extern void cpuremap_init(void); + /* Identify necessary constants & alloc memory at boot */ +#else +static inline int cpuremap_thread_to_cpu(int thread_index) +{ + return thread_index; +} +static inline int cpuremap_map_cpu(int thread_index, int in_core_ndx, int node) +{ + return thread_index; +} +static inline int cpuremap_reserve_cpu(int cpu) { return cpu; } +static inline int cpuremap_release_cpu(int cpu) { return cpu; } +static inline int cpuremap_cpu_to_thread(int cpu) { return cpu; } +static inline void cpuremap_init(void) {} +#endif + #endif /* __KERNEL__ */ #endif /* _ASM_POWERPC_TOPOLOGY_H */ diff --git a/arch/powerpc/platforms/pseries/Kconfig b/arch/powerpc/platforms/pseries/Kconfig index 2e4bd32..c35009f 100644 --- a/arch/powerpc/platforms/pseries/Kconfig +++ b/arch/powerpc/platforms/pseries/Kconfig @@ -145,3 +145,13 @@ config PAPR_SCM tristate "Support for the PAPR Storage Class Memory interface" help Enable access to hypervisor provided storage class memory. + Enable access to hypervisor provided storage class memory. + +config CPUREMAP +bool "Support for mapping hw cpu+node to kernel index" +depends on SMP && (PPC_PSERIES) +---help--- + Say Y here to be able to remap hw cpu+node to standardized + kernel CPUs at runtime on Pseries machines. + + Say N if you are unsure. diff --git a/arch/powerpc/platforms/pseries/Makefile b/arch/powerpc/platforms/pseries/Makefile index a43ec84..ad49d8e 100644 --- a/arch/powerpc/platforms/pseries/Makefile +++ b/arch/powerpc/platforms/pseries/Makefile @@ -13,6 +13,7 @@ obj-$(CONFIG_KEXEC_CORE) += kexec.o obj-$(CONFIG_PSERIES_ENERGY) += pseries_energy.o obj-$(CONFIG_HOTPLUG_CPU) += hotplug-cpu.o +obj-$(CONFIG_CPUREMAP) += cpuremap.o obj-$(CONFIG_MEMORY_HOTPLUG) += hotplug-memory.o pmem.o obj-$(CONFIG_HVC_CONSOLE) += hvconsole.o diff --git a/arch/powerpc/platforms/pseries/cpuremap.c b/arch/powerpc/platforms/pseries/cpuremap.c new file mode 100644 index 000..86fdf12 --- /dev/null +++ b/arch/powerpc/platforms/pseries/cpuremap.c @@ -0,0 +1,131 @@ +// SPDX-License-Identifier: GPL-2.0 +#include +#include +#include +#include +#include +#include + +struct cpuremap_cpu { + int thread_index; + /* Set to thread_index from ibm,ppc-interrupt-server#s arrays +* Don't clear when release'ed +*/ + int node; + bool in_use; + /* Set to true when reserve'ed +* Don't clear when release'ed + */ +}; + +struct cpuremap_struct { + int num_nodes; + int num_cores; + int num_threads_per_core; + struct cpuremap_cpu *threads; +} cpuremap_data; + + +void cpuremap_init(void) +{ + int i, k; + + /* Identify necessary constants & alloc memory at boot */ + cpuremap_data.num_threads_per_core = 8; + cpuremap_data.num_cores = 32; + cpuremap_data.num_nodes = + nr_cpu_ids / + (cpuremap_data.num_threads_per_core * cpuremap_data.num_cores); + cpuremap_data.threads = kcalloc(nr_cpu_ids,
[RFC 1/3] powerpc/numa: Conditionally online new nodes
Add argument to allow caller to determine whether the node identified for a cpu after an associativity / affinity change should be inited. Signed-off-by: Michael Bringmann --- arch/powerpc/include/asm/topology.h |2 +- arch/powerpc/mm/numa.c |6 +++--- arch/powerpc/platforms/pseries/hotplug-cpu.c |2 +- 3 files changed, 5 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/include/asm/topology.h b/arch/powerpc/include/asm/topology.h index a4a718d..4621f40 100644 --- a/arch/powerpc/include/asm/topology.h +++ b/arch/powerpc/include/asm/topology.h @@ -90,7 +90,7 @@ static inline void update_numa_cpu_lookup_table(unsigned int cpu, int node) {} extern int start_topology_update(void); extern int stop_topology_update(void); extern int prrn_is_enabled(void); -extern int find_and_online_cpu_nid(int cpu); +extern int find_and_online_cpu_nid(int cpu, bool must_online); extern int timed_topology_update(int nsecs); extern void __init shared_proc_topology_init(void); #else diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c index 87f0dd0..7d6bba264 100644 --- a/arch/powerpc/mm/numa.c +++ b/arch/powerpc/mm/numa.c @@ -1197,7 +1197,7 @@ static long vphn_get_associativity(unsigned long cpu, return rc; } -int find_and_online_cpu_nid(int cpu) +int find_and_online_cpu_nid(int cpu, bool must_online) { __be32 associativity[VPHN_ASSOC_BUFSIZE] = {0}; int new_nid; @@ -1210,7 +1210,7 @@ int find_and_online_cpu_nid(int cpu) if (new_nid < 0 || !node_possible(new_nid)) new_nid = first_online_node; - if (NODE_DATA(new_nid) == NULL) { + if (must_online && (NODE_DATA(new_nid) == NULL)) { #ifdef CONFIG_MEMORY_HOTPLUG /* * Need to ensure that NODE_DATA is initialized for a node from @@ -1337,7 +1337,7 @@ int numa_update_cpu_topology(bool cpus_locked) continue; } - new_nid = find_and_online_cpu_nid(cpu); + new_nid = find_and_online_cpu_nid(cpu, true); if (new_nid == numa_cpu_lookup_table[cpu]) { cpumask_andnot(_associativity_changes_mask, diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c index 2f8e621..620cb57 100644 --- a/arch/powerpc/platforms/pseries/hotplug-cpu.c +++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c @@ -366,7 +366,7 @@ static int dlpar_online_cpu(struct device_node *dn) != CPU_STATE_OFFLINE); cpu_maps_update_done(); timed_topology_update(1); - find_and_online_cpu_nid(cpu); + find_and_online_cpu_nid(cpu, true); rc = device_online(get_cpu_device(cpu)); if (rc) goto out;
[RFC 0/3] powerpc/pseries: Remap hw to kernel cpu indexes
Define and apply new interface to map hardware-specific powerpc cpu ids to a kernel specific range of cpu values. Mapping is intended to prevent confusion within the kernel about the cpu+node mapping, and the changes in configuration that may happen due to powerpc LPAR migration or other associativity changes during the lifetime of a system. These interfaces exchange the thread_index provided by the 'ibm,ppc-interrupt-server#s' properties, for an internal index to be used by kernel scheduling interfaces. Signed-off-by: Michael Bringmann Michael Bringmann (3): powerpc/numa: Conditionally online new nodes powerpc/numa: Define mapping between HW and kernel cpus powerpc/numa: Apply mapping between HW and kernel cpu
[powerpc:next-test 70/71] arch/powerpc/include/asm/io.h:835:22: note: in expansion of macro 'page_to_pfn'
tree: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next-test head: 13fd174be1903cc069bb73b0d53a0420b8f6d778 commit: 3a0f466ce2402cc090c97ccc1285508f688d8a8e [70/71] powerpc: Implement CONFIG_DEBUG_VIRTUAL config: powerpc-ps3_petitboot_nfs_defconfig (attached as .config) compiler: powerpc64-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0 reproduce: wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross git checkout 3a0f466ce2402cc090c97ccc1285508f688d8a8e # save the attached .config to linux build tree GCC_VERSION=7.2.0 make.cross ARCH=powerpc All warnings (new ones prefixed by >>): In file included from arch/powerpc/include/asm/page.h:339:0, from arch/powerpc/include/asm/thread_info.h:29, from include/linux/thread_info.h:38, from include/asm-generic/preempt.h:5, from ./arch/powerpc/include/generated/asm/preempt.h:1, from include/linux/preempt.h:81, from include/linux/spinlock.h:51, from include/linux/seqlock.h:36, from include/linux/time.h:6, from include/linux/jiffies.h:9, from drivers/net/loopback.c:32: arch/powerpc/include/asm/io.h: In function 'page_to_phys': include/asm-generic/memory_model.h:64:14: error: implicit declaration of function 'page_to_section'; did you mean '__nr_to_section'? [-Werror=implicit-function-declaration] int __sec = page_to_section(__pg); \ ^ include/asm-generic/memory_model.h:81:21: note: in expansion of macro '__page_to_pfn' #define page_to_pfn __page_to_pfn ^ >> arch/powerpc/include/asm/io.h:835:22: note: in expansion of macro >> 'page_to_pfn' unsigned long pfn = page_to_pfn(page); ^~~ In file included from include/linux/scatterlist.h:8:0, from include/linux/dma-mapping.h:11, from include/linux/skbuff.h:34, from include/net/net_namespace.h:36, from include/linux/inet.h:46, from drivers/net/loopback.c:46: include/linux/mm.h: At top level: include/linux/mm.h:1121:29: error: conflicting types for 'page_to_section' static inline unsigned long page_to_section(const struct page *page) ^~~ In file included from arch/powerpc/include/asm/page.h:339:0, from arch/powerpc/include/asm/thread_info.h:29, from include/linux/thread_info.h:38, from include/asm-generic/preempt.h:5, from ./arch/powerpc/include/generated/asm/preempt.h:1, from include/linux/preempt.h:81, from include/linux/spinlock.h:51, from include/linux/seqlock.h:36, from include/linux/time.h:6, from include/linux/jiffies.h:9, from drivers/net/loopback.c:32: include/asm-generic/memory_model.h:64:14: note: previous implicit declaration of 'page_to_section' was here int __sec = page_to_section(__pg); \ ^ include/asm-generic/memory_model.h:81:21: note: in expansion of macro '__page_to_pfn' #define page_to_pfn __page_to_pfn ^ >> arch/powerpc/include/asm/io.h:835:22: note: in expansion of macro >> 'page_to_pfn' unsigned long pfn = page_to_pfn(page); ^~~ cc1: some warnings being treated as errors -- In file included from arch/powerpc/include/asm/page.h:339:0, from arch/powerpc/include/asm/book3s/64/mmu.h:5, from arch/powerpc/include/asm/mmu.h:328, from arch/powerpc/include/asm/lppaca.h:36, from arch/powerpc/include/asm/paca.h:21, from arch/powerpc/include/asm/current.h:16, from include/linux/mutex.h:14, from include/linux/kernfs.h:13, from include/linux/sysfs.h:16, from include/linux/kobject.h:20, from include/linux/device.h:16, from drivers/of/irq.c:19: arch/powerpc/include/asm/io.h: In function 'page_to_phys': include/asm-generic/memory_model.h:64:14: error: implicit declaration of function 'page_to_section'; did you mean '__nr_to_section'? [-Werror=implicit-function-declaration] int __sec = page_to_section(__pg); \ ^ include/asm-generic/memory_model.h:81:21: note: in expansion of macro '__page_to_pfn' #define page_to_pfn __page_to_pfn ^ >> arch/powerpc/include/asm/io.h:835:22: note: in expansion of macro >> 'page_to_pfn'
Snowpatch
snowpatch is pretty surprising on https://patchwork.ozlabs.org/patch/1003954/ How can such a trivial patch for 8xx alter compilation of all other targets that much (ie add or remove sparse warnings) ? Christophe
Re: [PATCH] i2c: powermac: Use of_node_name_eq for node name comparisons
On Wed, Dec 05, 2018 at 01:50:24PM -0600, Rob Herring wrote: > Convert string compares of DT node names to use of_node_name_eq helper > instead. This removes direct access to the node name pointer. > > Cc: Benjamin Herrenschmidt > Cc: Paul Mackerras > Cc: Michael Ellerman > Cc: linux-...@vger.kernel.org > Cc: linuxppc-dev@lists.ozlabs.org > Signed-off-by: Rob Herring Applied to for-next, thanks! signature.asc Description: PGP signature
[PATCH v4] kbuild: Add support for DT binding schema checks
This adds the build infrastructure for checking DT binding schema documents and validating dts files using the binding schema. Check DT binding schema documents: make dt_binding_check Build dts files and check using DT binding schema: make dtbs_check Optionally, DT_SCHEMA_FILES can be passed in with a schema file(s) to use for validation. This makes it easier to find and fix errors generated by a specific schema. Currently, the validation targets are separate from a normal build to avoid a hard dependency on the external DT schema project and because there are lots of warnings generated. Cc: Jonathan Corbet Cc: Mark Rutland Cc: Masahiro Yamada Cc: Michal Marek Cc: linux-...@vger.kernel.org Cc: devicet...@vger.kernel.org Cc: linux-kbu...@vger.kernel.org Signed-off-by: Rob Herring --- v4: - Rework libyaml check and error message with Masahiro's version - Simplify build rules and dependencies .gitignore | 1 + Documentation/Makefile | 2 +- Documentation/devicetree/bindings/.gitignore | 2 ++ Documentation/devicetree/bindings/Makefile | 27 Makefile | 13 +++--- scripts/Makefile.lib | 24 +++-- scripts/dtc/Makefile | 4 +++ 7 files changed, 67 insertions(+), 6 deletions(-) create mode 100644 Documentation/devicetree/bindings/.gitignore create mode 100644 Documentation/devicetree/bindings/Makefile diff --git a/.gitignore b/.gitignore index 97ba6b79834c..a20ac26aa2f5 100644 --- a/.gitignore +++ b/.gitignore @@ -15,6 +15,7 @@ *.bin *.bz2 *.c.[012]*.* +*.dt.yaml *.dtb *.dtb.S *.dwo diff --git a/Documentation/Makefile b/Documentation/Makefile index 2ca77ad0f238..9786957c6a35 100644 --- a/Documentation/Makefile +++ b/Documentation/Makefile @@ -2,7 +2,7 @@ # Makefile for Sphinx documentation # -subdir-y := +subdir-y := devicetree/bindings/ # You can set these variables from the command line. SPHINXBUILD = sphinx-build diff --git a/Documentation/devicetree/bindings/.gitignore b/Documentation/devicetree/bindings/.gitignore new file mode 100644 index ..ef82fcfcccab --- /dev/null +++ b/Documentation/devicetree/bindings/.gitignore @@ -0,0 +1,2 @@ +*.example.dts +processed-schema.yaml diff --git a/Documentation/devicetree/bindings/Makefile b/Documentation/devicetree/bindings/Makefile new file mode 100644 index ..6e5cef0ed6fb --- /dev/null +++ b/Documentation/devicetree/bindings/Makefile @@ -0,0 +1,27 @@ +# SPDX-License-Identifier: GPL-2.0 +DT_DOC_CHECKER ?= dt-doc-validate +DT_EXTRACT_EX ?= dt-extract-example +DT_MK_SCHEMA ?= dt-mk-schema +DT_MK_SCHEMA_FLAGS := $(if $(DT_SCHEMA_FILES), -u) + +quiet_cmd_chk_binding = CHKDT $(patsubst $(srctree)/%,%,$<) + cmd_chk_binding = $(DT_DOC_CHECKER) $< ; \ +$(DT_EXTRACT_EX) $< > $@ + +$(obj)/%.example.dts: $(src)/%.yaml FORCE + $(call if_changed,chk_binding) + +DT_TMP_SCHEMA := processed-schema.yaml +extra-y += $(DT_TMP_SCHEMA) + +quiet_cmd_mk_schema = SCHEMA $@ + cmd_mk_schema = $(DT_MK_SCHEMA) $(DT_MK_SCHEMA_FLAGS) -o $@ $(filter-out FORCE, $^) + +DT_DOCS = $(shell cd $(srctree)/$(src) && find * -name '*.yaml') +DT_SCHEMA_FILES ?= $(addprefix $(src)/,$(DT_DOCS)) + +extra-y += $(patsubst $(src)/%.yaml,%.example.dts, $(DT_SCHEMA_FILES)) +extra-y += $(patsubst $(src)/%.yaml,%.example.dtb, $(DT_SCHEMA_FILES)) + +$(obj)/$(DT_TMP_SCHEMA): $(DT_SCHEMA_FILES) FORCE + $(call if_changed,mk_schema) diff --git a/Makefile b/Makefile index 2f36db897895..a3e2db2a3119 100644 --- a/Makefile +++ b/Makefile @@ -1232,10 +1232,13 @@ ifneq ($(dtstree),) %.dtb: prepare3 scripts_dtc $(Q)$(MAKE) $(build)=$(dtstree) $(dtstree)/$@ -PHONY += dtbs dtbs_install -dtbs: prepare3 scripts_dtc +PHONY += dtbs dtbs_install dt_binding_check +dtbs dtbs_check: prepare3 scripts_dtc $(Q)$(MAKE) $(build)=$(dtstree) +dtbs_check: export CHECK_DTBS=1 +dtbs_check: dt_binding_check + dtbs_install: $(Q)$(MAKE) $(dtbinst)=$(dtstree) @@ -1249,6 +1252,9 @@ PHONY += scripts_dtc scripts_dtc: scripts_basic $(Q)$(MAKE) $(build)=scripts/dtc +dt_binding_check: scripts_dtc + $(Q)$(MAKE) $(build)=Documentation/devicetree/bindings + # --- # Modules @@ -1611,7 +1617,8 @@ clean: $(clean-dirs) $(call cmd,rmfiles) @find $(if $(KBUILD_EXTMOD), $(KBUILD_EXTMOD), .) $(RCS_FIND_IGNORE) \ \( -name '*.[aios]' -o -name '*.ko' -o -name '.*.cmd' \ - -o -name '*.ko.*' -o -name '*.dtb' -o -name '*.dtb.S' \ + -o -name '*.ko.*' \ + -o -name '*.dtb' -o -name '*.dtb.S' -o -name '*.dt.yaml' \ -o -name '*.dwo' -o -name '*.lst' \ -o -name '*.su' \ -o -name '.*.d' -o -name '.*.tmp' -o -name '*.mod.c' \ diff --git
Re: [PATCH v3] kbuild: Add support for DT binding schema checks
On Tue, Dec 11, 2018 at 10:03 AM Masahiro Yamada wrote: > > On Wed, Dec 12, 2018 at 12:13 AM Rob Herring wrote: > > > > > > > +$(obj)/%.example.dts: $(src)/%.yaml FORCE > > > > + $(call if_changed,chk_binding) > > > > + > > > > +DT_TMP_SCHEMA := .schema.yaml.tmp > > > > > > > > > BTW, why does this file start with a period? > > > What is the meaning of '.tmp' extension? > > > > Nothing really. Just named it something so it gets cleaned and ignored by > > git. > > > It is cleaned whatever file name you use. > > > See scripts/Makefile.clean > > __clean-files := $(extra-y) $(extra-m) $(extra-) \ >$(always) $(targets) $(clean-files) \ >$(hostprogs-y) $(hostprogs-m) $(hostprogs-) \ >$(hostlibs-y) $(hostlibs-m) $(hostlibs-) \ >$(hostcxxlibs-y) $(hostcxxlibs-m) > > > $(extra-y) is cleaned. True. > > > You are adding *.example.dts to .gitignore > > Why not "schema.yaml" ? Okay. I'll do "processed-schema.yaml" to give a bit better name of what it contains. > > > > > +extra-y += $(DT_TMP_SCHEMA) > > > > + > > > > +quiet_cmd_mk_schema = SCHEMA $@ > > > > + cmd_mk_schema = mkdir -p $(obj); \ > > > > + rm -f $@; \ > > > > + $(DT_MK_SCHEMA) $(DT_MK_SCHEMA_FLAGS) -o $@ > > > > $(filter-out FORCE, $^) > > > > > > > > > "mkdir -p $(obj)" is redundant. > > > > > > > > > Why is 'rm -f $@' necessary ? > > > Can't dt-mk-schema overwrite the output file? > > > > It is for error case when the output file is not generated. I can > > handle this within dt-mk-schema instead. > > > > +DT_DOCS = $(shell cd $(srctree)/$(src) && find * -name '*.yaml') > > > > +DT_SCHEMA_FILES ?= $(addprefix $(src)/,$(DT_DOCS)) > > > > + > > > > +extra-y += $(patsubst $(src)/%.yaml,%.example.dts, $(DT_SCHEMA_FILES)) > > > > +extra-y += $(patsubst $(src)/%.yaml,%.example.dtb, $(DT_SCHEMA_FILES)) > > > > > > > > > > > > I assume you intentionally did not do like this: > > > > > > extra-y += $(patsubst %.yaml,%.example.dtb, $(DT_DOCS)) > > > > > > From the commit description, DT_SCHEMA_FILES might be overridden by a > > > user. > > > So, I think this is OK. > > > > > > > > > > > > > > > > +$(obj)/$(DT_TMP_SCHEMA): | $(addprefix $(obj)/,$(patsubst > > > > $(src)/%.yaml,%.example.dtb, $(DT_SCHEMA_FILES))) > > > > > > I do not understand this line. > > > Why is it necessary? > > > > > > *.example.dtb files are generated anyway > > > since they are listed in extra-y. > > > > It is enforcing the ordering. Without it, the binding checks and > > building .schema.yaml.tmp happen in parallel because both only have > > the source files as dependencies. The '|' keeps the dependencies out > > of the dependency list($^). > > > What kind problem would you see if > the binding checks and building .schema.yaml.tmp > happen in parallel? In case of no errors in the binding docs, it doesn't matter. If there are errors, I don't want the dtbs validation to run if any schema doesn't validate. However, I played around with this a bit more and it seems like having the examples' dts/dtb in extra-y prevents that from happening. Does that match your expections? If so, then I think we can remove the dependency. Here's an example. SCHEMA Documentation/devicetree/bindings/.schema.yaml.tmp CHKDT Documentation/devicetree/bindings/arm/vt8500.yaml CHKDT Documentation/devicetree/bindings/arm/zte.yaml CHKDT Documentation/devicetree/bindings/arm/altera/socfpga-clk-manager.yaml CHKDT Documentation/devicetree/bindings/arm/ti/ti,davinci.yaml CHKDT Documentation/devicetree/bindings/arm/ti/nspire.yaml CHKDT Documentation/devicetree/bindings/arm/spear.yaml CHKDT Documentation/devicetree/bindings/arm/altera.yaml warning: no schema found in file: ../Documentation/devicetree/bindings/arm/xilinx.yaml /home/rob/proj/git/linux-2.6/Documentation/devicetree/bindings/arm/xilinx.yaml: ignoring, error in schema 'onOf' CHKDT Documentation/devicetree/bindings/arm/sti.yaml CHKDT Documentation/devicetree/bindings/arm/qcom.yaml CHKDT Documentation/devicetree/bindings/arm/primecell.yaml CHKDT Documentation/devicetree/bindings/arm/cpus.yaml CHKDT Documentation/devicetree/bindings/arm/sirf.yaml CHKDT Documentation/devicetree/bindings/arm/calxeda.yaml CHKDT Documentation/devicetree/bindings/arm/xilinx.yaml CHKDT Documentation/devicetree/bindings/example-schema.yaml /home/rob/proj/git/linux-2.6/Documentation/devicetree/bindings/arm/xilinx.yaml: properties:compatible:onOf: 'onOf' is not one of ['$ref', 'additionalItems', 'allOf', 'const', 'contains', 'default', 'dependencies', 'description', 'enum', 'items', 'minItems', 'minimum', 'maxItems', 'maximum', 'not', 'oneOf', 'pattern', 'patternProperties', 'properties', 'required', 'type', 'typeSize'] make[2]: *** [../Documentation/devicetree/bindings/Makefile:12: Documentation/devicetree/bindings/arm/xilinx.example.dts] Error 1 Rob
Re: use generic DMA mapping code in powerpc V4
Next step: 7decbcfc656805603ab97206b3f816f26cd2cf7d (powerpc/dma: use generic direct and swiotlb ops) git checkout 7decbcfc656805603ab97206b3f816f26cd2cf7d We have the bad commit! :-) The PASEMI onboard ethernet doesn't work with this commit anymore. Error messages: [ 367.627623] pci :00:1a.0: dma_direct_map_page: overflow 0x00026bcb5002+110 of device mask bus mask 0 [ 367.627631] pci :00:1a.0: dma_direct_map_page: overflow 0x00026bcb5002+110 of device mask bus mask 0 [ 367.627639] pci :00:1a.0: dma_direct_map_page: overflow 0x00026bcb5002+110 of device mask bus mask 0 [ 367.627647] pci :00:1a.0: dma_direct_map_page: overflow 0x00026bcb5002+110 of device mask bus mask 0 pci :00:1a.0 = 00:1a.0 DMA controller: PA Semi, Inc PWRficient DMA Controller (rev 12) X5000 (P5020 board): U-Boot loads the kernel and the dtb file. Then the kernel starts but it doesn't find any hard disks (partitions). That means this is also the bad commit for the P5020 board. Link to the bad commit: http://git.infradead.org/users/hch/misc.git/commit/7decbcfc656805603ab97206b3f816f26cd2cf7d Link to the Git: http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/powerpc-dma.5 The commit before (977706f9755d2d697aa6f45b4f9f0e07516efeda - powerpc/dma: remove dma_nommu_mmap_coherent) works without any problems. -- Christian
[PATCH v2] ocxl: Fix endiannes bug in read_afu_name()
The AFU Descriptor Template in the PCI config space has a Name Space field which is a 24 Byte ASCII character string of descriptive name space for the AFU. The OCXL driver read the string four characters at a time with pci_read_config_dword(). This optimization is valid on a little-endian system since this is PCI, but a big-endian system ends up with each subset of four characters in reverse order. This could be fixed by switching to read characters one by one. Another option is to swap the bytes if we're big-endian. Go for the latter with le32_to_cpu(). Cc: sta...@vger.kernel.org # v4.16 Signed-off-by: Greg Kurz Acked-by: Frederic Barrat --- v2: - silence sparse with (__force __le32) cast - new changelog --- drivers/misc/ocxl/config.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/misc/ocxl/config.c b/drivers/misc/ocxl/config.c index 57a6bb1fd3c9..8f2c5d8bd2ee 100644 --- a/drivers/misc/ocxl/config.c +++ b/drivers/misc/ocxl/config.c @@ -318,7 +318,7 @@ static int read_afu_name(struct pci_dev *dev, struct ocxl_fn_config *fn, if (rc) return rc; ptr = (u32 *) >name[i]; - *ptr = val; + *ptr = le32_to_cpu((__force __le32) val); } afu->name[OCXL_AFU_NAME_SZ - 1] = '\0'; /* play safe */ return 0;
Re: [PATCH v03] powerpc/mobility: Fix node detach/rename problem
--- Snip --- >> >> The mobility.c code continues on during the second migration, accepts >> the definitions of the new nodes from the PHYP and ends up renaming >> the new properties e.g. >> >> [ 4565.827296] Duplicate name in base, renamed to "ibm,platform-facilities#1" >> >> There is no check like 'of_node_check_flag(np, OF_DETACHED)' within >> of_find_node_by_phandle to skip nodes that are detached, but still >> present due to caching or use count considerations. Also, note that >> of_find_node_by_phandle also uses a 'phandle_cache' which does not >> appear to be updated when of_detach_node() is invoked. > > This seems like the real bug. Since the phandle cache was added we can > now find detached nodes when we shouldn't be able to. > > Does the patch below work? > > cheers > > diff --git a/drivers/of/base.c b/drivers/of/base.c > index 09692c9b32a7..d8e4534c0686 100644 > --- a/drivers/of/base.c > +++ b/drivers/of/base.c > @@ -1190,6 +1190,10 @@ struct device_node *of_find_node_by_phandle(phandle > handle) > if (phandle_cache[masked_handle] && > handle == phandle_cache[masked_handle]->phandle) > np = phandle_cache[masked_handle]; > + > + /* If we find a detached node, remove it */ > + if (of_node_check_flag(np, OF_DETACHED)) > + np = phandle_cache[masked_handle] = NULL; > } > > if (!np) { > > I think this would be a bit better for cases where masked values overlap: diff --git a/drivers/of/base.c b/drivers/of/base.c index 09692c9..ec79129 100644 --- a/drivers/of/base.c +++ b/drivers/of/base.c @@ -1188,8 +1188,13 @@ struct device_node *of_find_node_by_phandle(phandle handle) if (phandle_cache) { if (phandle_cache[masked_handle] && - handle == phandle_cache[masked_handle]->phandle) - np = phandle_cache[masked_handle]; + handle == phandle_cache[masked_handle]->phandle) { + np = phandle_cache[masked_handle]; + + /* If we find a detached node, remove it */ + if (of_node_check_flag(np, OF_DETACHED)) + np = phandle_cache[masked_handle] = NULL; + } } if (!np) { Will try it out. Wouldn't it be better to do this when the node is detached in drivers/of/dynamic.c:__of_detach_node()? Thanks. Michael -- Michael W. Bringmann Linux Technology Center IBM Corporation Tie-Line 363-5196 External: (512) 286-5196 Cell: (512) 466-0650 m...@linux.vnet.ibm.com
Re: [PATCH v03] powerpc/mobility: Fix node detach/rename problem
On Tue, Dec 11, 2018 at 7:29 AM Michael Ellerman wrote: > > Hi Michael, > > Please Cc the device tree folks on device tree patches, and also the > original author of the patch that added the code you're modifying. > > So I've added: > robh...@kernel.org > frowand.l...@gmail.com > devicet...@vger.kernel.org > linux-ker...@vger.kernel.org > > Michael Bringmann writes: > > The PPC mobility code receives RTAS requests to delete nodes with > > platform-/hardware-specific attributes when restarting the kernel > > after a migration. My example is for migration between a P8 Alpine > > and a P8 Brazos. Nodes to be deleted include 'ibm,random-v1', > > 'ibm,platform-facilities', 'ibm,sym-encryption-v1', and, > > 'ibm,compression-v1'. > > > > The mobility.c code calls 'of_detach_node' for the nodes and their > > children. This makes calls to detach the properties and to remove > > the associated sysfs/kernfs files. > > > > Then new copies of the same nodes are next provided by the PHYP, > > local copies are built, and a pointer to the 'struct device_node' > > is passed to of_attach_node. Before the call to of_attach_node, > > the phandle is initialized to 0 when the data structure is alloced. > > During the call to of_attach_node, it calls __of_attach_node which > > pulls the actual name and phandle from just created sub-properties > > named something like 'name' and 'ibm,phandle'. > > > > This is all fine for the first migration. The problem occurs with > > the second and subsequent migrations when the PHYP on the new system > > wants to replace the same set of nodes again, referenced with the > > same names and phandle values. > > > > On the second and subsequent migrations, the PHYP tells the system > > to again delete the nodes 'ibm,platform-facilities', 'ibm,random-v1', > > 'ibm,compression-v1', 'ibm,sym-encryption-v1'. It specifies these > > nodes by its known set of phandle values -- the same handles used > > by the PHYP on the source system are known on the target system. > > The mobility.c code calls of_find_node_by_phandle() with these values > > and ends up locating the first instance of each node that was added > > during the original boot, instead of the second instance of each node > > created after the first migration. The detach during the second > > migration fails with errors like, > > > > [ 4565.030704] WARNING: CPU: 3 PID: 4787 at drivers/of/dynamic.c:252 > > __of_detach_node+0x8/0xa0 > > [ 4565.030708] Modules linked in: nfsv3 nfs_acl nfs tcp_diag udp_diag > > inet_diag unix_diag af_packet_diag netlink_diag lockd grace fscache sunrpc > > xts vmx_crypto sg pseries_rng binfmt_misc ip_tables xfs libcrc32c sd_mod > > ibmveth ibmvscsi scsi_transport_srp dm_mirror dm_region_hash dm_log dm_mod > > [ 4565.030733] CPU: 3 PID: 4787 Comm: drmgr Tainted: GW > > 4.18.0-rc1-wi107836-v05-120+ #201 > > [ 4565.030737] NIP: c07c1ea8 LR: c07c1fb4 CTR: > > 00655170 > > [ 4565.030741] REGS: c003f302b690 TRAP: 0700 Tainted: GW > > (4.18.0-rc1-wi107836-v05-120+) > > [ 4565.030745] MSR: 80010282b033 > > CR: 22288822 XER: 000a > > [ 4565.030757] CFAR: c07c1fb0 IRQMASK: 1 > > [ 4565.030757] GPR00: c07c1fa4 c003f302b910 c114bf00 > > c0038e68 > > [ 4565.030757] GPR04: 0001 80c008e0b4b8 > > > > [ 4565.030757] GPR08: 0001 8003 > > 2843 > > [ 4565.030757] GPR12: 8800 c0001ec9ae00 4000 > > > > [ 4565.030757] GPR16: 0008 > > f6ff > > [ 4565.030757] GPR20: 0007 c003e9f1f034 > > 0001 > > [ 4565.030757] GPR24: > > > > [ 4565.030757] GPR28: c1549d28 c1134828 c0038e68 > > c003f302b930 > > [ 4565.030804] NIP [c07c1ea8] __of_detach_node+0x8/0xa0 > > [ 4565.030808] LR [c07c1fb4] of_detach_node+0x74/0xd0 > > [ 4565.030811] Call Trace: > > [ 4565.030815] [c003f302b910] [c07c1fa4] > > of_detach_node+0x64/0xd0 (unreliable) > > [ 4565.030821] [c003f302b980] [c00c33c4] > > dlpar_detach_node+0xb4/0x150 > > [ 4565.030826] [c003f302ba10] [c00c3ffc] > > delete_dt_node+0x3c/0x80 > > [ 4565.030831] [c003f302ba40] [c00c4380] > > pseries_devicetree_update+0x150/0x4f0 > > [ 4565.030836] [c003f302bb70] [c00c479c] > > post_mobility_fixup+0x7c/0xf0 > > [ 4565.030841] [c003f302bbe0] [c00c4908] > > migration_store+0xf8/0x130 > > [ 4565.030847] [c003f302bc70] [c0998160] > > kobj_attr_store+0x30/0x60 > > [ 4565.030852] [c003f302bc90] [c0412f14] > > sysfs_kf_write+0x64/0xa0 > > [ 4565.030857] [c003f302bcb0] [c0411cac] > >
Re: [PATCH v3] kbuild: Add support for DT binding schema checks
On Wed, Dec 12, 2018 at 12:13 AM Rob Herring wrote: > > > > +$(obj)/%.example.dts: $(src)/%.yaml FORCE > > > + $(call if_changed,chk_binding) > > > + > > > +DT_TMP_SCHEMA := .schema.yaml.tmp > > > > > > BTW, why does this file start with a period? > > What is the meaning of '.tmp' extension? > > Nothing really. Just named it something so it gets cleaned and ignored by git. It is cleaned whatever file name you use. See scripts/Makefile.clean __clean-files := $(extra-y) $(extra-m) $(extra-) \ $(always) $(targets) $(clean-files) \ $(hostprogs-y) $(hostprogs-m) $(hostprogs-) \ $(hostlibs-y) $(hostlibs-m) $(hostlibs-) \ $(hostcxxlibs-y) $(hostcxxlibs-m) $(extra-y) is cleaned. You are adding *.example.dts to .gitignore Why not "schema.yaml" ? > > > +extra-y += $(DT_TMP_SCHEMA) > > > + > > > +quiet_cmd_mk_schema = SCHEMA $@ > > > + cmd_mk_schema = mkdir -p $(obj); \ > > > + rm -f $@; \ > > > + $(DT_MK_SCHEMA) $(DT_MK_SCHEMA_FLAGS) -o $@ > > > $(filter-out FORCE, $^) > > > > > > "mkdir -p $(obj)" is redundant. > > > > > > Why is 'rm -f $@' necessary ? > > Can't dt-mk-schema overwrite the output file? > > It is for error case when the output file is not generated. I can > handle this within dt-mk-schema instead. > > > +DT_DOCS = $(shell cd $(srctree)/$(src) && find * -name '*.yaml') > > > +DT_SCHEMA_FILES ?= $(addprefix $(src)/,$(DT_DOCS)) > > > + > > > +extra-y += $(patsubst $(src)/%.yaml,%.example.dts, $(DT_SCHEMA_FILES)) > > > +extra-y += $(patsubst $(src)/%.yaml,%.example.dtb, $(DT_SCHEMA_FILES)) > > > > > > > > I assume you intentionally did not do like this: > > > > extra-y += $(patsubst %.yaml,%.example.dtb, $(DT_DOCS)) > > > > From the commit description, DT_SCHEMA_FILES might be overridden by a user. > > So, I think this is OK. > > > > > > > > > > > +$(obj)/$(DT_TMP_SCHEMA): | $(addprefix $(obj)/,$(patsubst > > > $(src)/%.yaml,%.example.dtb, $(DT_SCHEMA_FILES))) > > > > I do not understand this line. > > Why is it necessary? > > > > *.example.dtb files are generated anyway > > since they are listed in extra-y. > > It is enforcing the ordering. Without it, the binding checks and > building .schema.yaml.tmp happen in parallel because both only have > the source files as dependencies. The '|' keeps the dependencies out > of the dependency list($^). What kind problem would you see if the binding checks and building .schema.yaml.tmp happen in parallel? -- Best Regards Masahiro Yamada
Re: [PATCH v03] powerpc/mobility: Fix node detach/rename problem
On 12/11/2018 07:29 AM, Michael Ellerman wrote: > Hi Michael, > > Please Cc the device tree folks on device tree patches, and also the > original author of the patch that added the code you're modifying. > > So I've added: > robh...@kernel.org > frowand.l...@gmail.com > devicet...@vger.kernel.org > linux-ker...@vger.kernel.org Thanks. > > Michael Bringmann writes: >> The PPC mobility code receives RTAS requests to delete nodes with >> platform-/hardware-specific attributes when restarting the kernel >> after a migration. My example is for migration between a P8 Alpine >> and a P8 Brazos. Nodes to be deleted include 'ibm,random-v1', >> 'ibm,platform-facilities', 'ibm,sym-encryption-v1', and, >> 'ibm,compression-v1'. >> >> The mobility.c code calls 'of_detach_node' for the nodes and their >> children. This makes calls to detach the properties and to remove >> the associated sysfs/kernfs files. >> >> Then new copies of the same nodes are next provided by the PHYP, >> local copies are built, and a pointer to the 'struct device_node' >> is passed to of_attach_node. Before the call to of_attach_node, >> the phandle is initialized to 0 when the data structure is alloced. >> During the call to of_attach_node, it calls __of_attach_node which >> pulls the actual name and phandle from just created sub-properties >> named something like 'name' and 'ibm,phandle'. >> >> This is all fine for the first migration. The problem occurs with >> the second and subsequent migrations when the PHYP on the new system >> wants to replace the same set of nodes again, referenced with the >> same names and phandle values. >> >> On the second and subsequent migrations, the PHYP tells the system >> to again delete the nodes 'ibm,platform-facilities', 'ibm,random-v1', >> 'ibm,compression-v1', 'ibm,sym-encryption-v1'. It specifies these >> nodes by its known set of phandle values -- the same handles used >> by the PHYP on the source system are known on the target system. >> The mobility.c code calls of_find_node_by_phandle() with these values >> and ends up locating the first instance of each node that was added >> during the original boot, instead of the second instance of each node >> created after the first migration. The detach during the second >> migration fails with errors like, >> >> [ 4565.030704] WARNING: CPU: 3 PID: 4787 at drivers/of/dynamic.c:252 >> __of_detach_node+0x8/0xa0 >> [ 4565.030708] Modules linked in: nfsv3 nfs_acl nfs tcp_diag udp_diag >> inet_diag unix_diag af_packet_diag netlink_diag lockd grace fscache sunrpc >> xts vmx_crypto sg pseries_rng binfmt_misc ip_tables xfs libcrc32c sd_mod >> ibmveth ibmvscsi scsi_transport_srp dm_mirror dm_region_hash dm_log dm_mod >> [ 4565.030733] CPU: 3 PID: 4787 Comm: drmgr Tainted: GW >> 4.18.0-rc1-wi107836-v05-120+ #201 >> [ 4565.030737] NIP: c07c1ea8 LR: c07c1fb4 CTR: >> 00655170 >> [ 4565.030741] REGS: c003f302b690 TRAP: 0700 Tainted: GW >>(4.18.0-rc1-wi107836-v05-120+) >> [ 4565.030745] MSR: 80010282b033 >> CR: 22288822 XER: 000a >> [ 4565.030757] CFAR: c07c1fb0 IRQMASK: 1 >> [ 4565.030757] GPR00: c07c1fa4 c003f302b910 c114bf00 >> c0038e68 >> [ 4565.030757] GPR04: 0001 80c008e0b4b8 >> >> [ 4565.030757] GPR08: 0001 8003 >> 2843 >> [ 4565.030757] GPR12: 8800 c0001ec9ae00 4000 >> >> [ 4565.030757] GPR16: 0008 >> f6ff >> [ 4565.030757] GPR20: 0007 c003e9f1f034 >> 0001 >> [ 4565.030757] GPR24: >> >> [ 4565.030757] GPR28: c1549d28 c1134828 c0038e68 >> c003f302b930 >> [ 4565.030804] NIP [c07c1ea8] __of_detach_node+0x8/0xa0 >> [ 4565.030808] LR [c07c1fb4] of_detach_node+0x74/0xd0 >> [ 4565.030811] Call Trace: >> [ 4565.030815] [c003f302b910] [c07c1fa4] >> of_detach_node+0x64/0xd0 (unreliable) >> [ 4565.030821] [c003f302b980] [c00c33c4] >> dlpar_detach_node+0xb4/0x150 >> [ 4565.030826] [c003f302ba10] [c00c3ffc] delete_dt_node+0x3c/0x80 >> [ 4565.030831] [c003f302ba40] [c00c4380] >> pseries_devicetree_update+0x150/0x4f0 >> [ 4565.030836] [c003f302bb70] [c00c479c] >> post_mobility_fixup+0x7c/0xf0 >> [ 4565.030841] [c003f302bbe0] [c00c4908] >> migration_store+0xf8/0x130 >> [ 4565.030847] [c003f302bc70] [c0998160] >> kobj_attr_store+0x30/0x60 >> [ 4565.030852] [c003f302bc90] [c0412f14] sysfs_kf_write+0x64/0xa0 >> [ 4565.030857] [c003f302bcb0] [c0411cac] >> kernfs_fop_write+0x16c/0x240 >> [ 4565.030862] [c003f302bd00] [c0355f20] __vfs_write+0x40/0x220 >> [
Re: [PATCH v2.2 24/34] dt-bindings: arm: Convert Rockchip board/soc bindings to json-schema
On Mon, Dec 10, 2018 at 4:45 PM Heiko Stuebner wrote: > > From: Rob Herring > > Convert Rockchip SoC bindings to DT schema format using json-schema. > > Cc: Mark Rutland > Cc: Heiko Stuebner > Cc: devicet...@vger.kernel.org > Cc: linux-arm-ker...@lists.infradead.org > Cc: linux-rockc...@lists.infradead.org > Signed-off-by: Rob Herring > [move to per-board entries and added recently added boards] > Signed-off-by: Heiko Stuebner > --- > Hi Rob, > > thanks for the libyaml hint, now dtc does check my dts nicely and > emits quite a number of little complaints ;-) . > > Also that suggestion to move the original firefly release+beta boards > together was great and I just did that. > > Should look ok now, if so I'll apply it tomorrow. LGTM
Re: [PATCH v3] kbuild: Add support for DT binding schema checks
On Mon, Dec 10, 2018 at 10:39 PM Masahiro Yamada wrote: > > Hi Rob, > > On Tue, Dec 11, 2018 at 5:50 AM Rob Herring wrote: > > > > This adds the build infrastructure for checking DT binding schema > > documents and validating dts files using the binding schema. > > > > Check DT binding schema documents: > > make dt_binding_check > > > > Build dts files and check using DT binding schema: > > make dtbs_check > > > > Optionally, DT_SCHEMA_FILES can passed in with a schema file(s) to use > > > Perhaps, "can be passed" ? Yes. [...] > > +quiet_cmd_chk_binding = CHKDT $< > > > In convention, the short log displays the target name > instead of the prerequisite. Yes, but here there is no target. We're just running a check on the source. > If O=foo/bar is given, $< shows the full path, > then log will look like follows: > > > CHKDT > /home/masahiro/ref/linux-devicetree/Documentation/devicetree/bindings/arm/calxeda.yaml > DTC Documentation/devicetree/bindings/arm/calxeda.example.dtb > CHKDT > /home/masahiro/ref/linux-devicetree/Documentation/devicetree/bindings/arm/qcom.yaml > DTC Documentation/devicetree/bindings/arm/qcom.example.dtb > CHKDT > /home/masahiro/ref/linux-devicetree/Documentation/devicetree/bindings/arm/xilinx.yaml > DTC Documentation/devicetree/bindings/arm/xilinx.example.dtb > CHKDT > /home/masahiro/ref/linux-devicetree/Documentation/devicetree/bindings/arm/ti/nspire.yaml > DTC Documentation/devicetree/bindings/arm/ti/nspire.example.dtb > CHKDT > /home/masahiro/ref/linux-devicetree/Documentation/devicetree/bindings/arm/ti/ti,davinci.yaml > DTC Documentation/devicetree/bindings/arm/ti/ti,davinci.example.dtb > CHKDT > /home/masahiro/ref/linux-devicetree/Documentation/devicetree/bindings/arm/altera.yaml > > You might think it is ugly. I've changed it to this: quiet_cmd_chk_binding = CHKDT $(patsubst $(srctree)/%,%,$<) [...] > > +$(obj)/%.example.dts: $(src)/%.yaml FORCE > > + $(call if_changed,chk_binding) > > + > > +DT_TMP_SCHEMA := .schema.yaml.tmp > > > BTW, why does this file start with a period? > What is the meaning of '.tmp' extension? Nothing really. Just named it something so it gets cleaned and ignored by git. > > +extra-y += $(DT_TMP_SCHEMA) > > + > > +quiet_cmd_mk_schema = SCHEMA $@ > > + cmd_mk_schema = mkdir -p $(obj); \ > > + rm -f $@; \ > > + $(DT_MK_SCHEMA) $(DT_MK_SCHEMA_FLAGS) -o $@ > > $(filter-out FORCE, $^) > > > "mkdir -p $(obj)" is redundant. > > > Why is 'rm -f $@' necessary ? > Can't dt-mk-schema overwrite the output file? It is for error case when the output file is not generated. I can handle this within dt-mk-schema instead. > > +DT_DOCS = $(shell cd $(srctree)/$(src) && find * -name '*.yaml') > > +DT_SCHEMA_FILES ?= $(addprefix $(src)/,$(DT_DOCS)) > > + > > +extra-y += $(patsubst $(src)/%.yaml,%.example.dts, $(DT_SCHEMA_FILES)) > > +extra-y += $(patsubst $(src)/%.yaml,%.example.dtb, $(DT_SCHEMA_FILES)) > > > > I assume you intentionally did not do like this: > > extra-y += $(patsubst %.yaml,%.example.dtb, $(DT_DOCS)) > > From the commit description, DT_SCHEMA_FILES might be overridden by a user. > So, I think this is OK. > > > > > > +$(obj)/$(DT_TMP_SCHEMA): | $(addprefix $(obj)/,$(patsubst > > $(src)/%.yaml,%.example.dtb, $(DT_SCHEMA_FILES))) > > I do not understand this line. > Why is it necessary? > > *.example.dtb files are generated anyway > since they are listed in extra-y. It is enforcing the ordering. Without it, the binding checks and building .schema.yaml.tmp happen in parallel because both only have the source files as dependencies. The '|' keeps the dependencies out of the dependency list($^). [...] > > +cmd_dtc_yaml_chk = \ > > + if ! echo "/dts-v1/; /{};" | $(DTC) -I dts -O yaml > /dev/null ; > > then \ > > + echo "*"; \ > > + echo "* dtc needs libyaml for DT schema validation > > support."; \ > > + echo "* Install the necessary libyaml development package > > from your distro."; \ > > + echo "*"; \ > > + false; \ > > + fi; \ > > + touch $@ > > + > > +$(objtree)/scripts/dtc/.dtc-yaml-chk.tmp: $(DTC) FORCE > > + $(call if_changed,dtc_yaml_chk) > > > Hmm, this is kind of ugly. > > I'd rather check this in scripts/dtc/Makefile > > > > How about something like below? Looks good. > ifeq ($(wildcard /usr/include/yaml.h),) > +ifeq ($(CHECK_DTBS),1) I'm going to change this to "ifneq ($CHECK_DTBS),)" in case we start supporting more than 1 value such as warning levels. > +$(error dtc needs libyaml for DT schema validation support. \ > +Install the necessary libyaml development package.) > +endif > HOST_EXTRACFLAGS += -DNO_YAML > else > dtc-objs += yamltree.o > > > > > > One drawback of this approach is, > this is not checked if you are using out-of-tree DTC. > Probably, Rob is the only person that overrides DTC.
Re: use generic DMA mapping code in powerpc V4
Next step: 977706f9755d2d697aa6f45b4f9f0e07516efeda (powerpc/dma: remove dma_nommu_mmap_coherent) Result: The P5020 board boots and the PASEMI onboard ethernet works. -- Christian On 10 December 2018 at 4:54PM, Christian Zigotzky wrote: Next step: 64ecd2c160bbef31465c4d34efc0f076a2aad4df (powerpc/dma: use phys_to_dma instead of get_dma_offset) The P5020 board boots and the PASEMI onboard ethernet works. -- Christian On 09 December 2018 at 7:26PM, Christian Zigotzky wrote: Next step: c1bfcad4b0cf38ce5b00f7ad880d3a13484c123a (dma-mapping, powerpc: simplify the arch dma_set_mask override) Result: No problems with the PASEMI onboard ethernet and with booting the X5000 (P5020 board). -- Christian On 09 December 2018 at 3:20PM, Christian Zigotzky wrote: Next step: 602307b034734ce77a05da4b99333a2eaf6b6482 (powerpc/fsl_pci: simplify fsl_pci_dma_set_mask) git checkout 602307b034734ce77a05da4b99333a2eaf6b6482 The PASEMI onboard ethernet works and the X5000 boots. -- Christian On 08 December 2018 at 2:47PM, Christian Zigotzky wrote: Next step: e15cd8173ef85e9cc3e2a9c7cc2982f5c1355615 (powerpc/dma: fix an off-by-one in dma_capable) git checkout e15cd8173ef85e9cc3e2a9c7cc2982f5c1355615 The PASEMI onboard ethernet also works with this commit and the X5000 boots without any problems. -- Christian On 08 December 2018 at 11:29AM, Christian Zigotzky wrote: Next step: 7ebc44c535f6bd726d553756d38b137acc718443 (powerpc/dma: remove max_direct_dma_addr) git checkout 7ebc44c535f6bd726d553756d38b137acc718443 OK, the PASEMI onboard ethernet works and the P5020 board boots. -- Christian On 07 December 2018 at 7:33PM, Christian Zigotzky wrote: Next step: 13c1fdec5682b6e13257277fa16aa31f342d167d (powerpc/dma: move pci_dma_dev_setup_swiotlb to fsl_pci.c) git checkout 13c1fdec5682b6e13257277fa16aa31f342d167d Result: The PASEMI onboard ethernet works and the P5020 board boots. — Christian
Re: [PATCH] powerpc/ipic: Fix a bounds check in ipic_set_priority()
On Thu, Dec 06, 2018 at 09:12:12AM +0100, Julia Lawall wrote: > > > On Thu, 6 Dec 2018, Christophe LEROY wrote: > > > > > > > Le 05/12/2018 à 04:26, Michael Ellerman a écrit : > > > Hi Dan, > > > > > > Thanks for the patch. > > > > > > Dan Carpenter writes: > > > > The ipic_info[] array only has 95 elements so I have made the bounds > > > > check smaller to prevent a read overflow. It was Smatch that found > > > > this issue: > > > > > > > > arch/powerpc/sysdev/ipic.c:784 ipic_set_priority() > > > > error: buffer overflow 'ipic_info' 95 <= 127 > > > > > > > > Signed-off-by: Dan Carpenter > > > > --- > > > > I wasn't able to find any callers of this code. Maybe we removed the > > > > last one in commit b9f0f1bb2bca ("[POWERPC] Adapt ipic driver to new > > > > host_ops interface, add set_irq_type to set IRQ sense"). So perhaps we > > > > should just remove it. I'm not really comfortable doing that myself, > > > > because I don't know the code well enough and can't build test > > > > it properly. > > > > > > Hah wow, last usage removed in 2006! > > > > > > I don't see any mention of it since then, so I'll remove it. If it > > > breaks something we can put it back. > > > > > > Can smatch help us find things like this that are defined non-static but > > > never used? > > > > > > > I think we have to do that carrefully. Some of those functions might be used > > by out-of-tree boards. > > > > I'm thinking especially at ipic_get_mcp_status() and ipic_set_mcp_status(). > > They are used in my 832x boards's machine check handler to know when a > > machine > > check is a timeout from the 832x watchdog. > > The message I have gotten in the past is that the Linux kernel doesn't > support code that is not used in the Linux kernel. However, if I were to > do this, I would send the code to the individual maintainers, who > presumably would know what is actually needed and what is not. > > Perhaps a good sanity check would be if the code has been used in the > past. If there was a use in the past that has been removed, then perhaps > it is more likely that the function was intended for internal kernel use > rather than the case that you are describing. Yeah. That's a good idea. I've been encouraging people who remove "written to but not used" variables to say in the commit message "variable foo has not been used since commit 123412341243 ("blah blah")." It helps me review the patch as well. regards, dan carpenter
Re: [PATCH] PCI: Add no-D3 quirk for Mellanox ConnectX-[45]
Hi David, I see you're still working on this, but if you do end up going this direction eventually, would you mind splitting this into two patches: 1) rename the quirk to make it more generic (but not changing any behavior), and 2) add the ConnectX devices to the quirk. That way the ConnectX change is smaller and more easily understood/reverted/etc. On Thu, Dec 06, 2018 at 03:19:51PM +1100, David Gibson wrote: > Mellanox ConnectX-5 IB cards (MT27800) seem to cause a call trace when > unbound from their regular driver and attached to vfio-pci in order to pass > them through to a guest. > > This goes away if the disable_idle_d3 option is used, so it looks like a > problem with the hardware handling D3 state. To fix that more permanently, > use a device quirk to disable D3 state for these devices. > > We do this by renaming the existing quirk_no_ata_d3() more generally and > attaching it to the ConnectX-[45] devices (0x15b3:0x1013). > > Signed-off-by: David Gibson > --- > drivers/pci/quirks.c | 17 +++-- > 1 file changed, 11 insertions(+), 6 deletions(-) > > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c > index 4700d24e5d55..add3f516ca12 100644 > --- a/drivers/pci/quirks.c > +++ b/drivers/pci/quirks.c > @@ -1315,23 +1315,24 @@ static void quirk_ide_samemode(struct pci_dev *pdev) > } > DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82801CA_10, > quirk_ide_samemode); > > -/* Some ATA devices break if put into D3 */ > -static void quirk_no_ata_d3(struct pci_dev *pdev) > +/* Some devices (including a number of ATA cards) break if put into D3 */ > +static void quirk_no_d3(struct pci_dev *pdev) > { > pdev->dev_flags |= PCI_DEV_FLAGS_NO_D3; > } > + > /* Quirk the legacy ATA devices only. The AHCI ones are ok */ > DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_SERVERWORKS, PCI_ANY_ID, > - PCI_CLASS_STORAGE_IDE, 8, quirk_no_ata_d3); > + PCI_CLASS_STORAGE_IDE, 8, quirk_no_d3); > DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_ATI, PCI_ANY_ID, > - PCI_CLASS_STORAGE_IDE, 8, quirk_no_ata_d3); > + PCI_CLASS_STORAGE_IDE, 8, quirk_no_d3); > /* ALi loses some register settings that we cannot then restore */ > DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_AL, PCI_ANY_ID, > - PCI_CLASS_STORAGE_IDE, 8, quirk_no_ata_d3); > + PCI_CLASS_STORAGE_IDE, 8, quirk_no_d3); > /* VIA comes back fine but we need to keep it alive or ACPI GTM failures > occur when mode detecting */ > DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_VIA, PCI_ANY_ID, > - PCI_CLASS_STORAGE_IDE, 8, quirk_no_ata_d3); > + PCI_CLASS_STORAGE_IDE, 8, quirk_no_d3); > > /* > * This was originally an Alpha-specific thing, but it really fits here. > @@ -3367,6 +3368,10 @@ static void mellanox_check_broken_intx_masking(struct > pci_dev *pdev) > DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_MELLANOX, PCI_ANY_ID, > mellanox_check_broken_intx_masking); > > +/* Mellanox MT27800 (ConnectX-5) IB card seems to break with D3 > + * In particular this shows up when the device is bound to the vfio-pci > driver */ Follow usual multiline comment style, i.e., /* * text ... * more text ... */ > +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_MELLANOX, > PCI_DEVICE_ID_MELLANOX_CONNECTX4, quirk_no_d3) > + > static void quirk_no_bus_reset(struct pci_dev *dev) > { > dev->dev_flags |= PCI_DEV_FLAGS_NO_BUS_RESET; > -- > 2.19.2 >
Re: [PATCH 10/10] perf/doc: update design.txt for exclude_{host|guest} flags
On Tue, Dec 11, 2018 at 10:06:53PM +1100, Michael Ellerman wrote: > [ Reviving old thread. ] > > Andrew Murray writes: > > On Tue, Nov 20, 2018 at 10:31:36PM +1100, Michael Ellerman wrote: > >> Andrew Murray writes: > >> > >> > Update design.txt to reflect the presence of the exclude_host > >> > and exclude_guest perf flags. > >> > > >> > Signed-off-by: Andrew Murray > >> > --- > >> > tools/perf/design.txt | 4 > >> > 1 file changed, 4 insertions(+) > >> > > >> > diff --git a/tools/perf/design.txt b/tools/perf/design.txt > >> > index a28dca2..7de7d83 100644 > >> > --- a/tools/perf/design.txt > >> > +++ b/tools/perf/design.txt > >> > @@ -222,6 +222,10 @@ The 'exclude_user', 'exclude_kernel' and > >> > 'exclude_hv' bits provide a > >> > way to request that counting of events be restricted to times when the > >> > CPU is in user, kernel and/or hypervisor mode. > >> > > >> > +Furthermore the 'exclude_host' and 'exclude_guest' bits provide a way > >> > +to request counting of events restricted to guest and host contexts when > >> > +using virtualisation. > >> > >> How does exclude_host differ from exclude_hv ? > > > > I believe exclude_host / exclude_guest are intented to distinguish > > between host and guest in the hosted hypervisor context (KVM). > > OK yeah, from the perf-list man page: > >u - user-space counting >k - kernel counting >h - hypervisor counting >I - non idle counting >G - guest counting (in KVM guests) >H - host counting (not in KVM guests) > > > Whereas exclude_hv allows to distinguish between guest and > > hypervisor in the bare-metal type hypervisors. > > Except that's exactly not how we use them on powerpc :) > > We use exclude_hv to exclude "the hypervisor", regardless of whether > it's KVM or PowerVM (which is a bare-metal hypervisor). > > We don't use exclude_host / exclude_guest at all, which I guess is a > bug, except I didn't know they existed until this thread. > > eg, in a KVM guest: > > $ perf record -e cycles:G /bin/bash -c "for i in {0..10}; do :;done" > $ perf report -D | grep -Fc "dso: [hypervisor]" > 16 > > > > In the case of arm64 - if VHE extensions are present then the host > > kernel will run at a higher privilege to the guest kernel, in which > > case there is no distinction between hypervisor and host so we ignore > > exclude_hv. But where VHE extensions are not present then the host > > kernel runs at the same privilege level as the guest and we use a > > higher privilege level to switch between them - in this case we can > > use exclude_hv to discount that hypervisor role of switching between > > guests. > > I couldn't find any arm64 perf code using exclude_host/guest at all? Correct - but this is in flight as I am currently adding support for this see [1]. > > And I don't see any x86 code using exclude_hv. I can't find any either. > > But maybe that's OK, I just worry this is confusing for users. There is some extra context regarding this where exclude_guest/exclude_host was added, see [2] and where exclude_hv was added, see [3] Generally it seems that exclude_guest/exclude_host relies upon switching counters off/on on guest/host switch code (which works well in the nested virt case). Whereas exclude_hv tends to rely solely on hardware capability based on privilege level (which works well in the bare metal case where the guest doesn't run at same privilege as the host). I think from the user perspective exclude_hv allows you to see your overhead if you are a guest (i.e. work done by bare metal hypervisor associated with you as the guest). Whereas exclude_guest/exclude_host doesn't allow you to see events above you (i.e. the kernel hypervisor) if you are the guest... At least that's how I read this, I've copied in others that may provide more authoritative feedback. [1] https://lists.cs.columbia.edu/pipermail/kvmarm/2018-December/033698.html [2] https://www.spinics.net/lists/kvm/msg53996.html [3] https://lore.kernel.org/patchwork/patch/143918/ Thanks, Andrew Murray > > cheers
Re: [v5] powerpc/ptrace: replace ptrace_report_syscall() with a tracehook call
On Fri, 2018-12-07 at 15:56:05 UTC, "Dmitry V. Levin" wrote: > From: Elvira Khabirova > > Arch code should use tracehook_*() helpers, as documented > in include/linux/tracehook.h, > ptrace_report_syscall() is not expected to be used outside that file. > > The patch does not look very nice, but at least it is correct > and opens the way for PTRACE_GET_SYSCALL_INFO API. > > Co-authored-by: Dmitry V. Levin > Fixes: 5521eb4bca2d ("powerpc/ptrace: Add support for PTRACE_SYSEMU") > Cc: Michael Ellerman > Cc: Benjamin Herrenschmidt > Cc: Paul Mackerras > Cc: Oleg Nesterov > Cc: Breno Leitao > Cc: Andy Lutomirski > Cc: Eugene Syromyatnikov > Cc: linuxppc-dev@lists.ozlabs.org > Signed-off-by: Elvira Khabirova > Signed-off-by: Dmitry V. Levin Applied to powerpc fixes, thanks. https://git.kernel.org/powerpc/c/a225f1567405558fb5410e9b2b9080 cheers
Re: [PATCH v03] powerpc/mobility: Fix node detach/rename problem
Hi Michael, Please Cc the device tree folks on device tree patches, and also the original author of the patch that added the code you're modifying. So I've added: robh...@kernel.org frowand.l...@gmail.com devicet...@vger.kernel.org linux-ker...@vger.kernel.org Michael Bringmann writes: > The PPC mobility code receives RTAS requests to delete nodes with > platform-/hardware-specific attributes when restarting the kernel > after a migration. My example is for migration between a P8 Alpine > and a P8 Brazos. Nodes to be deleted include 'ibm,random-v1', > 'ibm,platform-facilities', 'ibm,sym-encryption-v1', and, > 'ibm,compression-v1'. > > The mobility.c code calls 'of_detach_node' for the nodes and their > children. This makes calls to detach the properties and to remove > the associated sysfs/kernfs files. > > Then new copies of the same nodes are next provided by the PHYP, > local copies are built, and a pointer to the 'struct device_node' > is passed to of_attach_node. Before the call to of_attach_node, > the phandle is initialized to 0 when the data structure is alloced. > During the call to of_attach_node, it calls __of_attach_node which > pulls the actual name and phandle from just created sub-properties > named something like 'name' and 'ibm,phandle'. > > This is all fine for the first migration. The problem occurs with > the second and subsequent migrations when the PHYP on the new system > wants to replace the same set of nodes again, referenced with the > same names and phandle values. > > On the second and subsequent migrations, the PHYP tells the system > to again delete the nodes 'ibm,platform-facilities', 'ibm,random-v1', > 'ibm,compression-v1', 'ibm,sym-encryption-v1'. It specifies these > nodes by its known set of phandle values -- the same handles used > by the PHYP on the source system are known on the target system. > The mobility.c code calls of_find_node_by_phandle() with these values > and ends up locating the first instance of each node that was added > during the original boot, instead of the second instance of each node > created after the first migration. The detach during the second > migration fails with errors like, > > [ 4565.030704] WARNING: CPU: 3 PID: 4787 at drivers/of/dynamic.c:252 > __of_detach_node+0x8/0xa0 > [ 4565.030708] Modules linked in: nfsv3 nfs_acl nfs tcp_diag udp_diag > inet_diag unix_diag af_packet_diag netlink_diag lockd grace fscache sunrpc > xts vmx_crypto sg pseries_rng binfmt_misc ip_tables xfs libcrc32c sd_mod > ibmveth ibmvscsi scsi_transport_srp dm_mirror dm_region_hash dm_log dm_mod > [ 4565.030733] CPU: 3 PID: 4787 Comm: drmgr Tainted: GW > 4.18.0-rc1-wi107836-v05-120+ #201 > [ 4565.030737] NIP: c07c1ea8 LR: c07c1fb4 CTR: > 00655170 > [ 4565.030741] REGS: c003f302b690 TRAP: 0700 Tainted: GW > (4.18.0-rc1-wi107836-v05-120+) > [ 4565.030745] MSR: 80010282b033 > CR: 22288822 XER: 000a > [ 4565.030757] CFAR: c07c1fb0 IRQMASK: 1 > [ 4565.030757] GPR00: c07c1fa4 c003f302b910 c114bf00 > c0038e68 > [ 4565.030757] GPR04: 0001 80c008e0b4b8 > > [ 4565.030757] GPR08: 0001 8003 > 2843 > [ 4565.030757] GPR12: 8800 c0001ec9ae00 4000 > > [ 4565.030757] GPR16: 0008 > f6ff > [ 4565.030757] GPR20: 0007 c003e9f1f034 > 0001 > [ 4565.030757] GPR24: > > [ 4565.030757] GPR28: c1549d28 c1134828 c0038e68 > c003f302b930 > [ 4565.030804] NIP [c07c1ea8] __of_detach_node+0x8/0xa0 > [ 4565.030808] LR [c07c1fb4] of_detach_node+0x74/0xd0 > [ 4565.030811] Call Trace: > [ 4565.030815] [c003f302b910] [c07c1fa4] of_detach_node+0x64/0xd0 > (unreliable) > [ 4565.030821] [c003f302b980] [c00c33c4] > dlpar_detach_node+0xb4/0x150 > [ 4565.030826] [c003f302ba10] [c00c3ffc] delete_dt_node+0x3c/0x80 > [ 4565.030831] [c003f302ba40] [c00c4380] > pseries_devicetree_update+0x150/0x4f0 > [ 4565.030836] [c003f302bb70] [c00c479c] > post_mobility_fixup+0x7c/0xf0 > [ 4565.030841] [c003f302bbe0] [c00c4908] > migration_store+0xf8/0x130 > [ 4565.030847] [c003f302bc70] [c0998160] kobj_attr_store+0x30/0x60 > [ 4565.030852] [c003f302bc90] [c0412f14] sysfs_kf_write+0x64/0xa0 > [ 4565.030857] [c003f302bcb0] [c0411cac] > kernfs_fop_write+0x16c/0x240 > [ 4565.030862] [c003f302bd00] [c0355f20] __vfs_write+0x40/0x220 > [ 4565.030867] [c003f302bd90] [c0356358] vfs_write+0xc8/0x240 > [ 4565.030872] [c003f302bde0] [c03566cc] ksys_write+0x5c/0x100 > [ 4565.030880]
Re: [PATCH] bpf: fix overflow of bpf_jit_limit when PAGE_SIZE >= 64K
Daniel Borkmann writes: < snip > > > I would actually just like to get rid of the BPF_JIT_LIMIT_DEFAULT > define also given for 4.21 arm64 will have its own dedicated area for > JIT allocations where neither the above limit nor the MODULES_END/ > MODULES_VADDR one would fit and I don't want to make this even more > ugly with adding further cases into the core. Would the below variant > work for you? I haven't actually hit the bug so I won't ack/tested-by this, but it looks fine to me. cheers > From da9daf462d41ce5506c6b6318a9fa3d6d8a64f6c Mon Sep 17 00:00:00 2001 > From: Daniel Borkmann > Date: Mon, 10 Dec 2018 14:30:27 +0100 > Subject: [PATCH bpf] bpf: fix bpf_jit_limit knob for PAGE_SIZE >= 64K > > Michael and Sandipan report: > > Commit ede95a63b5 introduced a bpf_jit_limit tuneable to limit BPF > JIT allocations. At compile time it defaults to PAGE_SIZE * 4, > and is adjusted again at init time if MODULES_VADDR is defined. > > For ppc64 kernels, MODULES_VADDR isn't defined, so we're stuck with > the compile-time default at boot-time, which is 0x9c40 when > using 64K page size. This overflows the signed 32-bit bpf_jit_limit > value: > > root@ubuntu:/tmp# cat /proc/sys/net/core/bpf_jit_limit > -1673527296 > > and can cause various unexpected failures throughout the network > stack. In one case `strace dhclient eth0` reported: > > setsockopt(5, SOL_SOCKET, SO_ATTACH_FILTER, {len=11, filter=0x105dd27f8}, > 16) = -1 ENOTSUPP (Unknown error 524) > > and similar failures can be seen with tools like tcpdump. This doesn't > always reproduce however, and I'm not sure why. The more consistent > failure I've seen is an Ubuntu 18.04 KVM guest booted on a POWER9 > host would time out on systemd/netplan configuring a virtio-net NIC > with no noticeable errors in the logs. > > Given this and also given that in near future some architectures like > arm64 will have a custom area for BPF JIT image allocations we should > get rid of the BPF_JIT_LIMIT_DEFAULT fallback / default entirely. For > 4.21, we have an overridable bpf_jit_alloc_exec(), bpf_jit_free_exec() > so therefore add another overridable bpf_jit_alloc_exec_limit() helper > function which returns the possible size of the memory area for deriving > the default heuristic in bpf_jit_charge_init(). > > Like bpf_jit_alloc_exec() and bpf_jit_free_exec(), the new > bpf_jit_alloc_exec_limit() assumes that module_alloc() is the default > JIT memory provider, and therefore in case archs implement their custom > module_alloc() we use MODULES_{END,_VADDR} for limits and otherwise for > vmalloc_exec() cases like on ppc64 we use VMALLOC_{END,_START}. > > Fixes: ede95a63b5e8 ("bpf: add bpf_jit_limit knob to restrict unpriv > allocations") > Reported-by: Sandipan Das > Reported-by: Michael Roth > Signed-off-by: Daniel Borkmann > --- > kernel/bpf/core.c | 19 ++- > 1 file changed, 14 insertions(+), 5 deletions(-) > > diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c > index b1a3545..6c2332e 100644 > --- a/kernel/bpf/core.c > +++ b/kernel/bpf/core.c > @@ -365,13 +365,11 @@ void bpf_prog_kallsyms_del_all(struct bpf_prog *fp) > } > > #ifdef CONFIG_BPF_JIT > -# define BPF_JIT_LIMIT_DEFAULT (PAGE_SIZE * 4) > - > /* All BPF JIT sysctl knobs here. */ > int bpf_jit_enable __read_mostly = IS_BUILTIN(CONFIG_BPF_JIT_ALWAYS_ON); > int bpf_jit_harden __read_mostly; > int bpf_jit_kallsyms __read_mostly; > -int bpf_jit_limit__read_mostly = BPF_JIT_LIMIT_DEFAULT; > +int bpf_jit_limit__read_mostly; > > static __always_inline void > bpf_get_prog_addr_region(const struct bpf_prog *prog, > @@ -580,16 +578,27 @@ int bpf_get_kallsym(unsigned int symnum, unsigned long > *value, char *type, > > static atomic_long_t bpf_jit_current; > > +/* Can be overridden by an arch's JIT compiler if it has a custom, > + * dedicated BPF backend memory area, or if neither of the two > + * below apply. > + */ > +u64 __weak bpf_jit_alloc_exec_limit(void) > +{ > #if defined(MODULES_VADDR) > + return MODULES_END - MODULES_VADDR; > +#else > + return VMALLOC_END - VMALLOC_START; > +#endif > +} > + > static int __init bpf_jit_charge_init(void) > { > /* Only used as heuristic here to derive limit. */ > - bpf_jit_limit = min_t(u64, round_up((MODULES_END - MODULES_VADDR) >> 2, > + bpf_jit_limit = min_t(u64, round_up(bpf_jit_alloc_exec_limit() >> 2, > PAGE_SIZE), INT_MAX); > return 0; > } > pure_initcall(bpf_jit_charge_init); > -#endif > > static int bpf_jit_charge_modmem(u32 pages) > { > -- > 2.9.5
Re: [PATCH] powerpc: Fix bogus usage of MSR_RI on BookE and 40x
Le 11/12/2018 à 02:57, Benjamin Herrenschmidt a écrit : BookE and 40x processors lack the MSR:RI bit. However, we have a few common code places that rely on it. This fixes it by not defining MSR_RI on those processor types and using the appropriate ifdef's in those locations. Signed-off-by: Benjamin Herrenschmidt --- arch/powerpc/include/asm/reg.h | 2 ++ arch/powerpc/include/asm/reg_booke.h | 4 ++-- arch/powerpc/kernel/process.c| 2 +- arch/powerpc/kernel/traps.c | 8 ++-- arch/powerpc/lib/sstep.c | 2 ++ 5 files changed, 13 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h index de52c31..41d0b2e 100644 --- a/arch/powerpc/include/asm/reg.h +++ b/arch/powerpc/include/asm/reg.h @@ -110,7 +110,9 @@ #ifndef MSR_PMM #define MSR_PMM __MASK(MSR_PMM_LG) /* Performance monitor */ #endif +#if !defined(CONFIG_BOOKE) && !defined(CONFIG_4xx) #define MSR_RI__MASK(MSR_RI_LG) /* Recoverable Exception */ +#endif #define MSR_LE__MASK(MSR_LE_LG) /* Little Endian */ #define MSR_TM __MASK(MSR_TM_LG) /* Transactional Mem Available */ diff --git a/arch/powerpc/include/asm/reg_booke.h b/arch/powerpc/include/asm/reg_booke.h index eb2a33d..06967f1 100644 --- a/arch/powerpc/include/asm/reg_booke.h +++ b/arch/powerpc/include/asm/reg_booke.h @@ -46,10 +46,10 @@ #define MSR_USER32(MSR_ | MSR_PR | MSR_EE) #define MSR_USER64(MSR_USER32 | MSR_64BIT) #elif defined (CONFIG_40x) -#define MSR_KERNEL (MSR_ME|MSR_RI|MSR_IR|MSR_DR|MSR_CE) +#define MSR_KERNEL (MSR_ME|MSR_IR|MSR_DR|MSR_CE) #define MSR_USER (MSR_KERNEL|MSR_PR|MSR_EE) #else -#define MSR_KERNEL (MSR_ME|MSR_RI|MSR_CE) +#define MSR_KERNEL (MSR_ME|MSR_CE) #define MSR_USER (MSR_KERNEL|MSR_PR|MSR_EE) #endif diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c index 96f3473..77679a7 100644 --- a/arch/powerpc/kernel/process.c +++ b/arch/powerpc/kernel/process.c @@ -1359,7 +1359,7 @@ static struct regbit msr_bits[] = { {MSR_IR,"IR"}, {MSR_DR,"DR"}, {MSR_PMM, "PMM"}, -#ifndef CONFIG_BOOKE +#if !defined(CONFIG_BOOKE) && !defined(CONFIG_40x) {MSR_RI,"RI"}, {MSR_LE,"LE"}, #endif diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c index 9a86572..2d00696 100644 --- a/arch/powerpc/kernel/traps.c +++ b/arch/powerpc/kernel/traps.c @@ -429,10 +429,11 @@ void system_reset_exception(struct pt_regs *regs) if (get_paca()->in_nmi > 1) nmi_panic(regs, "Unrecoverable nested System Reset"); #endif +#ifdef MSR_RI /* Must die if the interrupt is not recoverable */ if (!(regs->msr & MSR_RI)) nmi_panic(regs, "Unrecoverable System Reset"); - +#endif Could we have a helper in .h file instead of #ifdefs ? For instance something like #if defined(CONFIG_4xx) || defined(CONFIG_BOOKE) static inline bool exception_is_recoverable(u32 msr) { return msr & MSR_RI; } #else static inline bool exception_is_recoverable(u32 msr) { return true; } #endif Or reuse the function unrecoverable_excp() defined in xmon.c ? if (!nested) nmi_exit(); @@ -478,7 +479,9 @@ static inline int check_io_access(struct pt_regs *regs) printk(KERN_DEBUG "%s bad port %lx at %p\n", (*nip & 0x100)? "OUT to": "IN from", regs->gpr[rb] - _IO_BASE, nip); +#ifdef MSR_RI regs->msr |= MSR_RI; +#endif Same here, a helper, something like that, to also be used in arch/powerpc/sysdev/fsl_rio.c ? #if defined(CONFIG_4xx) || defined(CONFIG_BOOKE) static inline void exception_set_recoverable(struct pt_regs *regs) { regs->msr |= MSR_RI; } #else static inline void exception_set_recoverable(struct pt_regs *regs) { } #endif regs->nip = extable_fixup(entry); return 1; } @@ -763,10 +766,11 @@ void machine_check_exception(struct pt_regs *regs) if (check_io_access(regs)) goto bail; +#ifdef MSR_RI /* Must die if the interrupt is not recoverable */ if (!(regs->msr & MSR_RI)) nmi_panic(regs, "Unrecoverable Machine check"); - +#endif Same if (!nested) nmi_exit(); diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c index d81568f..c03c453 100644 --- a/arch/powerpc/lib/sstep.c +++ b/arch/powerpc/lib/sstep.c @@ -3059,9 +3059,11 @@ int emulate_step(struct pt_regs *regs, unsigned int instr) case MTMSR: val = regs->gpr[op.reg]; +#ifdef MSR_RI if ((val & MSR_RI) == 0) Could be something like the following instead of this #ifdef stuff ? if (!IS_ENABLED(CONFIG_4xx) &&
[RFC PATCH v1 3/3] powerpc/8xx: Enable CONFIG_VMAP_STACK
This patch enables CONFIG_VMAP_STACK. For that, a few changes are done in head_8xx.S to re-activation DATA MMU Translation before accessing to the stack. Due to the growing of exception prolog, a few rearrangement is also done in a few exception handlers. Signed-off-by: Christophe Leroy --- arch/powerpc/Kconfig | 1 + arch/powerpc/kernel/head_8xx.S | 87 +- 2 files changed, 79 insertions(+), 9 deletions(-) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 94b46624068d..323b8a1efb3e 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -180,6 +180,7 @@ config PPC select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT select HAVE_ARCH_SECCOMP_FILTER select HAVE_ARCH_TRACEHOOK + select HAVE_ARCH_VMAP_STACK if PPC_8xx select HAVE_CBPF_JITif !PPC64 select HAVE_STACKPROTECTOR if PPC64 && $(cc-option,-mstack-protector-guard=tls -mstack-protector-guard-reg=r13) select HAVE_STACKPROTECTOR if PPC32 && $(cc-option,-mstack-protector-guard=tls -mstack-protector-guard-reg=r2) diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S index 48996a424075..ded66a6fdfeb 100644 --- a/arch/powerpc/kernel/head_8xx.S +++ b/arch/powerpc/kernel/head_8xx.S @@ -136,6 +136,53 @@ instruction_counter: EXCEPTION_PROLOG_1; \ EXCEPTION_PROLOG_2 +#ifdef CONFIG_VMAP_STACK +#define EXCEPTION_PROLOG_1 \ + mtspr SPRN_SPRG_SCRATCH2, r12;\ + mfspr r12, SPRN_SPRG_THREAD; \ + mfspr r11, SPRN_SRR0; \ + stw r11, SRR0(r12); \ + mfspr r11, SPRN_DAR; \ + stw r11, DAR(r12); \ + mfspr r11,SPRN_SRR1; /* check whether user or kernel */ \ + stw r11, SRR1(r12); \ + andi. r11,r11,MSR_PR + +#define EXCEPTION_PROLOG_2 \ + li r11, MSR_KERNEL & ~(MSR_IR | MSR_RI); /* can take DTLB miss */ \ + mtmsr r11;\ + tovirt(r12, r12); \ + subir11, r1, INT_FRAME_SIZE;/* use r1 if kernel */ \ + beq 1f; \ + lwz r11, TASK_STACK-THREAD(r12);\ + addir11, r11, THREAD_SIZE - INT_FRAME_SIZE; \ +1: stw r10,_CCR(r11); /* save registers */ \ + stw r9,GPR9(r11); \ + mfspr r10,SPRN_SPRG_SCRATCH0; \ + stw r10,GPR10(r11); \ + mfspr r10,SPRN_SPRG_SCRATCH1; \ + stw r10,GPR11(r11); \ + mfspr r10,SPRN_SPRG_SCRATCH2; \ + stw r10,GPR12(r11); \ + mflrr10;\ + stw r10,_LINK(r11); \ + lwz r10, DAR(r12); \ + stw r10, _DAR(r11); \ + lwz r9, SRR1(r12); \ + lwz r12, SRR0(r12); \ + stw r1,GPR1(r11); \ + stw r1,0(r11); \ + mr r1, r11;/* set new kernel sp */ \ + li r10,MSR_KERNEL & ~MSR_IR; /* can take exceptions */ \ + mtmsr r10;\ + stw r0,GPR0(r11); \ + lis r10, STACK_FRAME_REGS_MARKER@ha; /* exception frame marker */ \ + addir10, r10, STACK_FRAME_REGS_MARKER@l; \ + stw r10, 8(r11);\ + SAVE_4GPRS(3, r11); \ + SAVE_2GPRS(7, r11) + +#else #define EXCEPTION_PROLOG_1 \ mfspr r11,SPRN_SRR1; /* check whether user or kernel */ \ andi. r11,r11,MSR_PR; \ @@ -172,6 +219,8 @@ instruction_counter: SAVE_4GPRS(3, r11); \ SAVE_2GPRS(7, r11) +#endif + /* * Note: code which follows this uses cr0.eq (set if from kernel), * r11, r12 (SRR0), and r9 (SRR1). @@ -226,8 +275,12 @@ i##n: \ . = 0x200 MachineCheck: EXCEPTION_PROLOG +#ifdef CONFIG_VMAP_STACK + lwz r4, _DAR(r11) +#else mfspr r4,SPRN_DAR stw r4,_DAR(r11) +#endif li r5,RPN_PATTERN mtspr SPRN_DAR,r5 /* Tag DAR, to be used in DTLB Error */ mfspr r5,SPRN_DSISR @@ -254,8 +307,12 @@ InstructionAccess: . = 0x600 Alignment: EXCEPTION_PROLOG +#ifdef CONFIG_VMAP_STACK + lwz r4, _DAR(r11) +#else mfspr r4,SPRN_DAR stw r4,_DAR(r11) +#endif li r5,RPN_PATTERN mtspr SPRN_DAR,r5 /* Tag DAR, to be used in DTLB Error */ mfspr r5,SPRN_DSISR @@ -573,20 +630,31 @@ DataTLBError: beq-FixupDAR/* must be a buggy dcbX, icbi insn. */ DARFixed:/* Return from dcbx instruction bug workaround */ EXCEPTION_PROLOG_1 +#ifdef CONFIG_VMAP_STACK + li r11, RPN_PATTERN + mtspr SPRN_DAR, r11 /* Tag DAR, to be used in DTLB Error */ +#endif EXCEPTION_PROLOG_2 mfspr r5,SPRN_DSISR stw r5,_DSISR(r11) +#ifdef CONFIG_VMAP_STACK + lwz r4, _DAR(r11) +#else mfspr r4,SPRN_DAR +#endif andis.
[RFC PATCH v1 2/3] powerpc/8xx: Use alternative scratch registers in DTLB miss handler
In preparation of handling CONFIG_VMAP_STACK, we need DTLB miss handler to use different scratch registers than other exception handlers in order to not jeopardise exception entry on stack DTLB misses. Signed-off-by: Christophe Leroy --- arch/powerpc/kernel/head_8xx.S | 27 ++- arch/powerpc/perf/8xx-pmu.c| 12 2 files changed, 22 insertions(+), 17 deletions(-) diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S index 5f5f89e87e3a..48996a424075 100644 --- a/arch/powerpc/kernel/head_8xx.S +++ b/arch/powerpc/kernel/head_8xx.S @@ -421,8 +421,8 @@ ITLBMissLinear: . = 0x1200 DataStoreTLBMiss: - mtspr SPRN_SPRG_SCRATCH0, r10 - mtspr SPRN_SPRG_SCRATCH1, r11 + mtspr SPRN_DAR, r10 + mtspr SPRN_M_TW, r11 mfcrr11 /* If we are faulting a kernel address, we have to use the @@ -487,10 +487,10 @@ DataStoreTLBMiss: mtspr SPRN_MD_RPN, r10/* Update TLB entry */ /* Restore registers */ - mtspr SPRN_DAR, r11 /* Tag DAR */ -0: mfspr r10, SPRN_SPRG_SCRATCH0 - mfspr r11, SPRN_SPRG_SCRATCH1 +0: mfspr r10, SPRN_DAR + mtspr SPRN_DAR, r11 /* Tag DAR */ + mfspr r11, SPRN_M_TW rfi patch_site 0b, patch__dtlbmiss_exit_1 @@ -499,8 +499,9 @@ DataStoreTLBMiss: 0: lwz r10, (dtlb_miss_counter - PAGE_OFFSET)@l(0) addir10, r10, 1 stw r10, (dtlb_miss_counter - PAGE_OFFSET)@l(0) - mfspr r10, SPRN_SPRG_SCRATCH0 - mfspr r11, SPRN_SPRG_SCRATCH1 + mfspr r10, SPRN_DAR + mtspr SPRN_DAR, r11 /* Tag DAR */ + mfspr r11, SPRN_M_TW rfi #endif @@ -516,10 +517,10 @@ DTLBMissIMMR: mtspr SPRN_MD_RPN, r10/* Update TLB entry */ li r11, RPN_PATTERN - mtspr SPRN_DAR, r11 /* Tag DAR */ -0: mfspr r10, SPRN_SPRG_SCRATCH0 - mfspr r11, SPRN_SPRG_SCRATCH1 +0: mfspr r10, SPRN_DAR + mtspr SPRN_DAR, r11 /* Tag DAR */ + mfspr r11, SPRN_M_TW rfi patch_site 0b, patch__dtlbmiss_exit_2 @@ -534,10 +535,10 @@ DTLBMissLinear: mtspr SPRN_MD_RPN, r10/* Update TLB entry */ li r11, RPN_PATTERN - mtspr SPRN_DAR, r11 /* Tag DAR */ -0: mfspr r10, SPRN_SPRG_SCRATCH0 - mfspr r11, SPRN_SPRG_SCRATCH1 +0: mfspr r10, SPRN_DAR + mtspr SPRN_DAR, r11 /* Tag DAR */ + mfspr r11, SPRN_M_TW rfi patch_site 0b, patch__dtlbmiss_exit_3 diff --git a/arch/powerpc/perf/8xx-pmu.c b/arch/powerpc/perf/8xx-pmu.c index e38f74e9e7a4..4556c8837575 100644 --- a/arch/powerpc/perf/8xx-pmu.c +++ b/arch/powerpc/perf/8xx-pmu.c @@ -161,10 +161,6 @@ static void mpc8xx_pmu_read(struct perf_event *event) static void mpc8xx_pmu_del(struct perf_event *event, int flags) { - /* mfspr r10, SPRN_SPRG_SCRATCH0 */ - unsigned int insn = PPC_INST_MFSPR | __PPC_RS(R10) | - __PPC_SPR(SPRN_SPRG_SCRATCH0); - mpc8xx_pmu_read(event); /* If it was the last user, stop counting to avoid useles overhead */ @@ -177,6 +173,10 @@ static void mpc8xx_pmu_del(struct perf_event *event, int flags) break; case PERF_8xx_ID_ITLB_LOAD_MISS: if (atomic_dec_return(_miss_ref) == 0) { + /* mfspr r10, SPRN_SPRG_SCRATCH0 */ + unsigned int insn = PPC_INST_MFSPR | __PPC_RS(R10) | + __PPC_SPR(SPRN_SPRG_SCRATCH0); + patch_instruction_site(__itlbmiss_exit_1, insn); #ifndef CONFIG_PIN_TLB_TEXT patch_instruction_site(__itlbmiss_exit_2, insn); @@ -185,6 +185,10 @@ static void mpc8xx_pmu_del(struct perf_event *event, int flags) break; case PERF_8xx_ID_DTLB_LOAD_MISS: if (atomic_dec_return(_miss_ref) == 0) { + /* mfspr r10, SPRN_DAR */ + unsigned int insn = PPC_INST_MFSPR | __PPC_RS(R10) | + __PPC_SPR(SPRN_DAR); + patch_instruction_site(__dtlbmiss_exit_1, insn); patch_instruction_site(__dtlbmiss_exit_2, insn); patch_instruction_site(__dtlbmiss_exit_3, insn); -- 2.13.3
[RFC PATCH v1 1/3] powerpc/32: prepare for CONFIG_VMAP_STACK
To support CONFIG_VMAP_STACK, the kernel must be able to activate Data MMU Translation for accessing the stack. Before doing that it must save SRR0, SRR1 and DAR in order to not loose them in case there is a Data TLB Miss once the translation is reactivated. This patch defines fields in the thread struct for saving those registers. It also prepares entry_32.S to handle exception entry with Data MMU Translation enabled. Signed-off-by: Christophe Leroy --- arch/powerpc/include/asm/processor.h | 5 + arch/powerpc/include/asm/thread_info.h | 5 + arch/powerpc/kernel/asm-offsets.c | 5 + arch/powerpc/kernel/entry_32.S | 16 4 files changed, 31 insertions(+) diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h index 8179b64871ed..e839a1231b17 100644 --- a/arch/powerpc/include/asm/processor.h +++ b/arch/powerpc/include/asm/processor.h @@ -218,6 +218,11 @@ struct thread_struct { #ifdef CONFIG_PPC32 void*pgdir; /* root of page-table tree */ unsigned long ksp_limit; /* if ksp <= ksp_limit stack overflow */ +#ifdef CONFIG_VMAP_STACK + unsigned long dar; + unsigned long srr0; + unsigned long srr1; +#endif #endif /* Debug Registers */ struct debug_reg debug; diff --git a/arch/powerpc/include/asm/thread_info.h b/arch/powerpc/include/asm/thread_info.h index 8e1d0195ac36..488d5c4670ff 100644 --- a/arch/powerpc/include/asm/thread_info.h +++ b/arch/powerpc/include/asm/thread_info.h @@ -10,10 +10,15 @@ #define _ASM_POWERPC_THREAD_INFO_H #include +#include #ifdef __KERNEL__ +#if defined(CONFIG_VMAP_STACK) && CONFIG_THREAD_SHIFT < PAGE_SHIFT +#define THREAD_SHIFT PAGE_SHIFT +#else #define THREAD_SHIFT CONFIG_THREAD_SHIFT +#endif #define THREAD_SIZE(1 << THREAD_SHIFT) diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c index 03439785c2ea..985523ef23e8 100644 --- a/arch/powerpc/kernel/asm-offsets.c +++ b/arch/powerpc/kernel/asm-offsets.c @@ -128,6 +128,11 @@ int main(void) OFFSET(KSP_VSID, thread_struct, ksp_vsid); #else /* CONFIG_PPC64 */ OFFSET(PGDIR, thread_struct, pgdir); +#ifdef CONFIG_VMAP_STACK + OFFSET(SRR0, thread_struct, srr0); + OFFSET(SRR1, thread_struct, srr1); + OFFSET(DAR, thread_struct, dar); +#endif #ifdef CONFIG_SPE OFFSET(THREAD_EVR0, thread_struct, evr[0]); OFFSET(THREAD_ACC, thread_struct, acc); diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S index 52a061f14c7d..6e2c45fdd2c0 100644 --- a/arch/powerpc/kernel/entry_32.S +++ b/arch/powerpc/kernel/entry_32.S @@ -150,8 +150,13 @@ transfer_to_handler: stw r12,_CTR(r11) stw r2,_XER(r11) mfspr r12,SPRN_SPRG_THREAD +#ifdef CONFIG_VMAP_STACK + tovirt(r12, r12) +#endif addir2,r12,-THREAD +#ifndef CONFIG_VMAP_STACK tovirt(r2,r2) /* set r2 to current */ +#endif beq 2f /* if from user, fix up THREAD.regs */ addir11,r1,STACK_FRAME_OVERHEAD stw r11,PT_REGS(r12) @@ -179,9 +184,13 @@ transfer_to_handler: stw r12,4(r11) #endif #ifdef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE +#ifdef CONFIG_VMAP_STACK + ACCOUNT_CPU_USER_ENTRY(r2, r11, r12) +#else tophys(r9, r2) ACCOUNT_CPU_USER_ENTRY(r9, r11, r12) #endif +#endif b 3f @@ -193,8 +202,12 @@ transfer_to_handler: ble-stack_ovf /* then the kernel stack overflowed */ 5: #if defined(CONFIG_PPC_BOOK3S_32) || defined(CONFIG_E500) +#ifdef CONFIG_VMAP_STACK + lwz r12, TI_LOCAL_FLAGS(r2) +#else tophys(r9,r2) /* check local flags */ lwz r12,TI_LOCAL_FLAGS(r9) +#endif mtcrf 0x01,r12 bt- 31-TLF_NAPPING,4f bt- 31-TLF_SLEEPING,7f @@ -203,6 +216,9 @@ transfer_to_handler: transfer_to_handler_cont: 3: mflrr9 +#ifdef CONFIG_VMAP_STACK + tovirt(r9,r9) +#endif lwz r11,0(r9) /* virtual address of handler */ lwz r9,4(r9)/* where to go when done */ #if defined(CONFIG_PPC_8xx) && defined(CONFIG_PERF_EVENTS) -- 2.13.3
[RFC PATCH v1 0/3] Enable CONFIG_VMAP_STACK on the 8xx
The purpose of this serie is to enable CONFIG_VMAP_STACK on the 8xx. I'm sending it now to get early feedback if any. For the time being, there is no proper handling of stack overflow. This serie applies on the top of the serie "powerpc: Switch to CONFIG_THREAD_INFO_IN_TASK" Christophe Leroy (3): powerpc/32: prepare for CONFIG_VMAP_STACK powerpc/8xx: Use alternative scratch registers in DTLB miss handler powerpc/8xx: Enable CONFIG_VMAP_STACK arch/powerpc/Kconfig | 1 + arch/powerpc/include/asm/processor.h | 5 ++ arch/powerpc/include/asm/thread_info.h | 5 ++ arch/powerpc/kernel/asm-offsets.c | 5 ++ arch/powerpc/kernel/entry_32.S | 16 + arch/powerpc/kernel/head_8xx.S | 114 ++--- arch/powerpc/perf/8xx-pmu.c| 12 ++-- 7 files changed, 132 insertions(+), 26 deletions(-) -- 2.13.3
Re: [PATCH 10/10] perf/doc: update design.txt for exclude_{host|guest} flags
[ Reviving old thread. ] Andrew Murray writes: > On Tue, Nov 20, 2018 at 10:31:36PM +1100, Michael Ellerman wrote: >> Andrew Murray writes: >> >> > Update design.txt to reflect the presence of the exclude_host >> > and exclude_guest perf flags. >> > >> > Signed-off-by: Andrew Murray >> > --- >> > tools/perf/design.txt | 4 >> > 1 file changed, 4 insertions(+) >> > >> > diff --git a/tools/perf/design.txt b/tools/perf/design.txt >> > index a28dca2..7de7d83 100644 >> > --- a/tools/perf/design.txt >> > +++ b/tools/perf/design.txt >> > @@ -222,6 +222,10 @@ The 'exclude_user', 'exclude_kernel' and 'exclude_hv' >> > bits provide a >> > way to request that counting of events be restricted to times when the >> > CPU is in user, kernel and/or hypervisor mode. >> > >> > +Furthermore the 'exclude_host' and 'exclude_guest' bits provide a way >> > +to request counting of events restricted to guest and host contexts when >> > +using virtualisation. >> >> How does exclude_host differ from exclude_hv ? > > I believe exclude_host / exclude_guest are intented to distinguish > between host and guest in the hosted hypervisor context (KVM). OK yeah, from the perf-list man page: u - user-space counting k - kernel counting h - hypervisor counting I - non idle counting G - guest counting (in KVM guests) H - host counting (not in KVM guests) > Whereas exclude_hv allows to distinguish between guest and > hypervisor in the bare-metal type hypervisors. Except that's exactly not how we use them on powerpc :) We use exclude_hv to exclude "the hypervisor", regardless of whether it's KVM or PowerVM (which is a bare-metal hypervisor). We don't use exclude_host / exclude_guest at all, which I guess is a bug, except I didn't know they existed until this thread. eg, in a KVM guest: $ perf record -e cycles:G /bin/bash -c "for i in {0..10}; do :;done" $ perf report -D | grep -Fc "dso: [hypervisor]" 16 > In the case of arm64 - if VHE extensions are present then the host > kernel will run at a higher privilege to the guest kernel, in which > case there is no distinction between hypervisor and host so we ignore > exclude_hv. But where VHE extensions are not present then the host > kernel runs at the same privilege level as the guest and we use a > higher privilege level to switch between them - in this case we can > use exclude_hv to discount that hypervisor role of switching between > guests. I couldn't find any arm64 perf code using exclude_host/guest at all? And I don't see any x86 code using exclude_hv. But maybe that's OK, I just worry this is confusing for users. cheers
Re: [PATCH v2] cpufreq: powernv: add of_node_put()
On Wednesday, November 21, 2018 5:02:04 AM CET Viresh Kumar wrote: > On 20-11-18, 11:05, Yangtao Li wrote: > > The of_find_node_by_path() returns a node pointer with refcount > > incremented,but there is the lack of use of the of_node_put() when > > done.Add the missing of_node_put() to release the refcount. > > > > Signed-off-by: Yangtao Li > > --- > > Changes in v2 > > -update changelog > > --- > > drivers/cpufreq/powernv-cpufreq.c | 17 +++-- > > 1 file changed, 11 insertions(+), 6 deletions(-) > > Acked-by: Viresh Kumar Patch applied, thanks!
Re: [PATCH v2] cpufreq: pmac64: add of_node_put()
On Monday, November 26, 2018 7:02:26 AM CET Viresh Kumar wrote: > On 23-11-18, 08:33, Yangtao Li wrote: > > of_find_node_by_path() acquires a reference to the node > > returned by it and that reference needs to be dropped by its caller. > > g5_neo2_cpufreq_init() doesn't do that, so fix it. > > > > Signed-off-by: Yangtao Li > > Acked-by: Viresh Kumar > > --- > > Changes in v2: > > -update changelog > > --- > > drivers/cpufreq/pmac64-cpufreq.c | 1 + > > 1 file changed, 1 insertion(+) > > > > diff --git a/drivers/cpufreq/pmac64-cpufreq.c > > b/drivers/cpufreq/pmac64-cpufreq.c > > index be623dd7b9f2..1d32a863332d 100644 > > --- a/drivers/cpufreq/pmac64-cpufreq.c > > +++ b/drivers/cpufreq/pmac64-cpufreq.c > > @@ -411,6 +411,7 @@ static int __init g5_neo2_cpufreq_init(struct > > device_node *cpunode) > > pfunc_set_vdnap0 = pmf_find_function(root, "set-vdnap0"); > > pfunc_vdnap0_complete = > > pmf_find_function(root, "slewing-done"); > > + of_node_put(root); > > if (pfunc_set_vdnap0 == NULL || > > pfunc_vdnap0_complete == NULL) { > > pr_err("Can't find required platform function\n"); > > Acked-by: Viresh Kumar Patch applied, thanks!
Re: [PATCH] ocxl: Simplify free_spa()
On Tue, 11 Dec 2018 20:13:21 +1100 Andrew Donnellan wrote: > On 11/12/18 7:57 pm, Greg Kurz wrote: > > I now realize that I should have mentioned the real motivation for this > > change. I'm working on refactoring the code so that we can use ocxl in a > > KVM guest. The concept of link can be shared by both powernv and pseries > > variants but the SPA is definitely a powernv only thingy. The benefit > > of this patch is hence to kick 'struct link' out of free_spa() so that > > it can be utimately moved to powernv specific code. > > > > The initial version of this change was just moving the link->spa check > > and link->spa = NULL to the callers, where it was quite obvious they're > > not needed... Should I re-post this as two patches for clarity ? > > Ah, that makes much more sense. > > If that's the case, then why not go all the way and change the function > signature while you're at it? > Sure, will do.
Re: use generic DMA mapping code in powerpc V4
On Mon, 10 Dec 2018 at 20:49, Benjamin Herrenschmidt wrote: > [snip] > > AGP is a gigantic nightmare :-) It's not just cache coherency issues > (some implementations are coherent, some aren't, Apple's is ... weird). > > Apple has all sort of bugs, and Darwin source code only sheds light on > some of them. Some implementation can only read, not write I think, for > example. There are issues with transfers crossing some boundaries I > beleive, but it's all unclear. > > Apple makes this work with a combination of hacks in the AGP "driver" > and the closed source GPU driver, which we don't see. > > I have given up trying to make that stuff work reliably a decade ago :) > > Cheers, > Ben. That's what I was afraid of… what a mess. At least now I have a definitive answer from one of the original authors of the code, thanks a lot, Ben. :) I have an unresearched belief that AGP support was hacked in the Mac series as an afterthought (weren't they supposed to be PCI/PCI-X only?), and your explanation surely seems to corroborate. :/
Re: [PATCH] crypto: talitos - fix ablkcipher for CONFIG_VMAP_STACK
On Tue, Dec 11, 2018 at 07:29:40AM +, Christophe Leroy wrote: > > diff --git a/drivers/crypto/talitos.c b/drivers/crypto/talitos.c > index 6988012deca4..385ec970b639 100644 > --- a/drivers/crypto/talitos.c > +++ b/drivers/crypto/talitos.c > @@ -1668,8 +1668,11 @@ static struct talitos_edesc > *ablkcipher_edesc_alloc(struct ablkcipher_request * > struct talitos_ctx *ctx = crypto_ablkcipher_ctx(cipher); > unsigned int ivsize = crypto_ablkcipher_ivsize(cipher); > > + if (ivsize) > + memcpy(ctx->iv, areq->info, ivsize); The ctx is per-tfm, not per-request. So you cannot write to it. This needs to go into a pre-request area. > return talitos_edesc_alloc(ctx->dev, areq->src, areq->dst, > -areq->info, 0, areq->nbytes, 0, ivsize, 0, > +ctx->iv, 0, areq->nbytes, 0, ivsize, 0, > areq->base.flags, encrypt); > } Thanks, -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [PATCH] ocxl: Simplify free_spa()
On 11/12/18 7:57 pm, Greg Kurz wrote: I now realize that I should have mentioned the real motivation for this change. I'm working on refactoring the code so that we can use ocxl in a KVM guest. The concept of link can be shared by both powernv and pseries variants but the SPA is definitely a powernv only thingy. The benefit of this patch is hence to kick 'struct link' out of free_spa() so that it can be utimately moved to powernv specific code. The initial version of this change was just moving the link->spa check and link->spa = NULL to the callers, where it was quite obvious they're not needed... Should I re-post this as two patches for clarity ? Ah, that makes much more sense. If that's the case, then why not go all the way and change the function signature while you're at it? -- Andrew Donnellan OzLabs, ADL Canberra andrew.donnel...@au1.ibm.com IBM Australia Limited
Re: [PATCH] ocxl: Simplify free_spa()
On Tue, 11 Dec 2018 12:05:49 +1100 Andrew Donnellan wrote: > On 11/12/18 2:15 am, Greg Kurz wrote: > > The only users of free_spa() are alloc_link() and free_link(), and > > in both cases: > > > > - link->spa != NULL > > > > - free_spa(link) is immediatly followed by kfree(link) > > > > The check isn't needed, and it doesn't bring much to clear the link->spa > > pointer. Drop both. > > > > Signed-off-by: Greg Kurz > > I like defensive programming but for this case I don't really care too > much either way > I now realize that I should have mentioned the real motivation for this change. I'm working on refactoring the code so that we can use ocxl in a KVM guest. The concept of link can be shared by both powernv and pseries variants but the SPA is definitely a powernv only thingy. The benefit of this patch is hence to kick 'struct link' out of free_spa() so that it can be utimately moved to powernv specific code. The initial version of this change was just moving the link->spa check and link->spa = NULL to the callers, where it was quite obvious they're not needed... Should I re-post this as two patches for clarity ? > Acked-by: Andrew Donnellan > > > --- > > drivers/misc/ocxl/link.c |7 ++- > > 1 file changed, 2 insertions(+), 5 deletions(-) > > > > diff --git a/drivers/misc/ocxl/link.c b/drivers/misc/ocxl/link.c > > index 31695a078485..eed92055184d 100644 > > --- a/drivers/misc/ocxl/link.c > > +++ b/drivers/misc/ocxl/link.c > > @@ -352,11 +352,8 @@ static void free_spa(struct link *link) > > pr_debug("Freeing SPA for %x:%x:%x\n", link->domain, link->bus, > > link->dev); > > > > - if (spa && spa->spa_mem) { > > - free_pages((unsigned long) spa->spa_mem, spa->spa_order); > > - kfree(spa); > > - link->spa = NULL; > > - } > > + free_pages((unsigned long) spa->spa_mem, spa->spa_order); > > + kfree(spa); > > } > > > > static int alloc_link(struct pci_dev *dev, int PE_mask, struct link > > **out_link) > > >
Re: [PATCH] ocxl: Fix endiannes bug in read_afu_name()
On Tue, 11 Dec 2018 11:24:08 +1100 Andrew Donnellan wrote: > On 11/12/18 11:05 am, Andrew Donnellan wrote: > > On 11/12/18 2:10 am, Greg Kurz wrote: > >> The double word returned by read_afu_info(OCXL_DVSEC_TEMPL_NAME) contains > >> four characters of the AFU name, read from the PCI config space, hence > >> with a little-endian ordering. When composing the string, a big-endian > >> system must swap the bytes so that the characters appear in the right > >> order. > >> > >> Do this with le32_to_cpu(). > >> > >> Signed-off-by: Greg Kurz > > > > snowpatch reports the following sparse warning: > > > > +drivers/misc/ocxl/config.c:321:24: warning: cast to restricted __le32 > > > > You probably need to change val from a u32 to a __le32. > You might be right, I'll look into this. > Also does this need to go to stable? > Oops... this bug has been there since the beginning, so yes it does. I simply forgot to add the Cc: stable tag... :-\ Cheers, -- Greg