Re: [PATCH 1/1] serial/uuc_uart: Set shutdown timeout to CONFIG_HZ independent 2ms
On Monday 05 December 2016 10:04:27, Timur Tabi wrote:
> Alexander Stein wrote:
> > -		schedule_timeout(2);
> > +		schedule_timeout(msecs_to_jiffies(2));
>
> NACK.
>
> So I don't remember why I wrote this code, but I don't think I was
> expecting it to be 2ms. Instead, I think I just wanted it to be some
> delay, but I believed that schedule_timeout(1) was too short or would be
> "optimized" out somehow.
>
> Note that right below this, I do:
>
> 	if (qe_port->wait_closing) {
> 		/* Wait a bit longer */
> 		set_current_state(TASK_UNINTERRUPTIBLE);
> 		schedule_timeout(qe_port->wait_closing);
> 	}
>
> And wait_closing is a number of jiffies, so I knew that
> schedule_timeout() took jiffies as a parameter.
>
> So I think I'm going to NACK this patch, since I believe I knew what I
> was doing when I wrote it five years ago.

Okay, I was just wondering why the timeout is dependent on the timer tick. That didn't seem obvious to me. Rethinking this, I would rather replace those lines with msleep() instead.

Best regards,
Alexander
Re: [PATCH 1/2] powerpc/powernv/opal-dump : Handles opal_dump_info properly
Hi Michael,

Can you please have a look at this patchset, as there are no functional changes involved?

Thanks,
Mukesh

On Thursday 01 December 2016 02:38 PM, Mukesh Ojha wrote:

Moves the return value check of 'opal_dump_info' to a proper place; previously, all the dump info was unnecessarily filled in even on failure.

Signed-off-by: Mukesh Ojha
---
 arch/powerpc/platforms/powernv/opal-dump.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/opal-dump.c b/arch/powerpc/platforms/powernv/opal-dump.c
index 4c82782..ae32212 100644
--- a/arch/powerpc/platforms/powernv/opal-dump.c
+++ b/arch/powerpc/platforms/powernv/opal-dump.c
@@ -225,13 +225,16 @@ static int64_t dump_read_info(uint32_t *dump_id, uint32_t *dump_size, uint32_t *dump_type)
 	if (rc == OPAL_PARAMETER)
 		rc = opal_dump_info(&id, &size);

+	if (rc) {
+		pr_warn("%s: Failed to get dump info (%d)\n",
+			__func__, rc);
+		return rc;
+	}
+
 	*dump_id = be32_to_cpu(id);
 	*dump_size = be32_to_cpu(size);
 	*dump_type = be32_to_cpu(type);

-	if (rc)
-		pr_warn("%s: Failed to get dump info (%d)\n",
-			__func__, rc);
 	return rc;
 }
Re: [PATCH] PPC: sstep.c: Add modsw, moduw instruction emulation
By the way, I missed mentioning this previously: please use a 'powerpc: ' prefix for the subject, rather than 'PPC'.

On 2016/12/04 10:25PM, PrasannaKumar Muralidharan wrote:
> Add modsw and moduw instruction emulation support to analyse_instr.

Also, it would be better if you could briefly describe what these instructions do, for the benefit of others.

- Naveen
Re: [PATCH v3 2/3] powerpc: get hugetlbpage handling more generic
Le 06/12/2016 à 02:18, Scott Wood a écrit :
> On Wed, 2016-09-21 at 10:11 +0200, Christophe Leroy wrote:
>> Today there are two implementations of hugetlbpages which are managed
>> by exclusive #ifdefs:
>> * FSL_BOOKE: several directory entries point to the same single hugepage
>> * BOOK3S: one upper-level directory entry points to a table of hugepages
>>
>> In preparation for implementing hugepage support on the 8xx, we need
>> a mix of the two above solutions, because the 8xx needs both cases
>> depending on the size of pages:
>> * In 4k page size mode, each PGD entry covers a 4M area. This means
>>   that 2 PGD entries will be necessary to cover an 8M hugepage, while a
>>   single PGD entry will cover 8x 512k hugepages.
>> * In 16k page size mode, each PGD entry covers a 64M area. This means
>>   that 8x 8M hugepages will be covered by one PGD entry, and 64x 512k
>>   hugepages will be covered by one PGD entry.
>>
>> This patch:
>> * removes #ifdefs in favor of if/else based on the range sizes
>> * merges the two huge_pte_alloc() functions as they are pretty similar
>> * merges the two hugetlbpage_init() functions as they are pretty similar
>
> [snip]
>
>> @@ -860,16 +803,34 @@ static int __init hugetlbpage_init(void)
>>  		 * if we have pdshift and shift value same, we don't
>>  		 * use pgt cache for hugepd.
>>  		 */
>> -		if (pdshift != shift) {
>> +		if (pdshift > shift) {
>>  			pgtable_cache_add(pdshift - shift, NULL);
>>  			if (!PGT_CACHE(pdshift - shift))
>>  				panic("hugetlbpage_init(): could not create "
>>  				      "pgtable cache for %d bit pagesize\n", shift);
>>  		}
>> +#ifdef CONFIG_PPC_FSL_BOOK3E
>> +		else if (!hugepte_cache) {
>
> This else never triggers on book3e, because the way this function
> calculates pdshift is wrong for book3e (it uses PyD_SHIFT instead of
> HUGEPD_PxD_SHIFT). We later get OOMs because huge_pte_alloc() calculates
> pdshift correctly, tries to use hugepte_cache, and fails.

Ok, I'll check it again. I was expecting it to still work properly on book3e, because after applying patch 3 it works properly on the 8xx.
> If the point of this patch is to remove the compile-time decision on
> whether to do things the book3e way, why are there still ifdefs such as
> the ones controlling the definition of HUGEPD_PxD_SHIFT? How does what
> you're doing on 8xx (for certain page sizes) differ from book3e?

Some of the things done for book3e are common to the 8xx, but differ from book3s. For that reason, in the following patch (3/3), there is in several places:

-#ifdef CONFIG_PPC_FSL_BOOK3E
+#if defined(CONFIG_PPC_FSL_BOOK3E) || defined(CONFIG_PPC_8xx)

Christophe
Re: [PATCH] PPC: sstep.c: Add modsw, moduw instruction emulation
On 2016/12/06 01:21AM, PrasannaKumar Muralidharan wrote:
> >> +
> >> +	case 267:	/* moduw */
> >
> > Please move this case further up so that the extended opcodes are in
> > numerical order.
>
> Placed it after the divide instruction as it appeared logical. Also placed
> 267 below 779 as that is the order in which the instructions are
> documented in the ISA book. This may help in finding related
> instructions together. If this style is not preferred I can arrange them
> in numerical order.

I guessed as much, but if you look at the existing function, you'll see that things have been arranged in numerical order. As such, it's best to stick to that convention.

- Naveen
[PATCH 3/3] powerpc: enable support for GCC plugins
Enable support for GCC plugins on powerpc.

Add an additional version check in gcc-plugins-check to advise users to upgrade to gcc 5.2+ on powerpc to avoid issues with header files (gcc <= 4.6) or missing copies of rs6000-cpus.def (4.8 to 5.1 on 64-bit targets).

Signed-off-by: Andrew Donnellan
---
Open to bikeshedding on the gcc version check.

Compile tested with all plugins enabled on gcc 4.6-6.2, x86->ppc{32,64,64le} and 4.8-6.2 ppc64le->ppc{32,64,64le}. Thanks to Chris Smart for help with this.

I think it's best to take this through powerpc#next with an ACK from Kees/Emese?
---
 arch/powerpc/Kconfig         | 1 +
 scripts/Makefile.gcc-plugins | 8 ++++++++
 2 files changed, 9 insertions(+)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 65fba4c..6efbc08 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -92,6 +92,7 @@ config PPC
 	select HAVE_DYNAMIC_FTRACE_WITH_REGS	if MPROFILE_KERNEL
 	select HAVE_FUNCTION_TRACER
 	select HAVE_FUNCTION_GRAPH_TRACER
+	select HAVE_GCC_PLUGINS
 	select SYSCTL_EXCEPTION_TRACE
 	select VIRT_TO_BUS if !PPC64
 	select HAVE_IDE
diff --git a/scripts/Makefile.gcc-plugins b/scripts/Makefile.gcc-plugins
index 26c67b7..9835a75 100644
--- a/scripts/Makefile.gcc-plugins
+++ b/scripts/Makefile.gcc-plugins
@@ -47,6 +47,14 @@ gcc-plugins-check: FORCE
 ifdef CONFIG_GCC_PLUGINS
   ifeq ($(PLUGINCC),)
     ifneq ($(GCC_PLUGINS_CFLAGS),)
+      # Various gccs between 4.5 and 5.1 have bugs on powerpc due to missing
+      # header files. gcc <= 4.6 doesn't work at all, gccs from 4.8 to 5.1 have
+      # issues with 64-bit targets.
+      ifeq ($(ARCH),powerpc)
+        ifeq ($(call cc-ifversion, -le, 0501, y), y)
+	@echo "Cannot use CONFIG_GCC_PLUGINS: plugin support on gcc <= 5.1 is buggy on powerpc, please upgrade to gcc 5.2 or newer" >&2 && exit 1
+        endif
+      endif
       ifeq ($(call cc-ifversion, -ge, 0405, y), y)
 	$(Q)$(srctree)/scripts/gcc-plugin.sh --show-error "$(__PLUGINCC)" "$(HOSTCXX)" "$(CC)" || true
 	@echo "Cannot use CONFIG_GCC_PLUGINS: your gcc installation does not support plugins, perhaps the necessary headers are missing?" >&2 && exit 1

--
Andrew Donnellan              OzLabs, ADL Canberra
andrew.donnel...@au1.ibm.com  IBM Australia Limited
[PATCH 2/3] powerpc: correctly disable latent entropy GCC plugin on prom_init.o
Commit 38addce8b600 ("gcc-plugins: Add latent_entropy plugin") excludes certain powerpc early boot code from the latent entropy plugin by adding appropriate CFLAGS. It looks like this was supposed to cover prom_init.o, but ended up saying init.o (which doesn't exist) instead. Fix the typo.

Fixes: 38addce8b600 ("gcc-plugins: Add latent_entropy plugin")
Signed-off-by: Andrew Donnellan
---
I think that we potentially could get rid of some of these disables, but it's safer to leave it for now.
---
 arch/powerpc/kernel/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index 1925341..adb52d1 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -15,7 +15,7 @@ CFLAGS_btext.o		+= -fPIC
 endif

 CFLAGS_cputable.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
-CFLAGS_init.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
+CFLAGS_prom_init.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
 CFLAGS_btext.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
 CFLAGS_prom.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)

--
Andrew Donnellan              OzLabs, ADL Canberra
andrew.donnel...@au1.ibm.com  IBM Australia Limited
[PATCH 1/3] gcc-plugins: fix definition of DISABLE_LATENT_ENTROPY_PLUGIN
The variable DISABLE_LATENT_ENTROPY_PLUGIN is only defined when CONFIG_PAX_LATENT_ENTROPY is set. That config option is left over from the original PaX version of the plugin code and doesn't actually exist in mainline, so the variable is never defined. Change the condition to depend on CONFIG_GCC_PLUGIN_LATENT_ENTROPY instead.

Fixes: 38addce8b600 ("gcc-plugins: Add latent_entropy plugin")
Signed-off-by: Andrew Donnellan
---
 scripts/Makefile.gcc-plugins | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/Makefile.gcc-plugins b/scripts/Makefile.gcc-plugins
index 060d2cb..26c67b7 100644
--- a/scripts/Makefile.gcc-plugins
+++ b/scripts/Makefile.gcc-plugins
@@ -8,7 +8,7 @@ ifdef CONFIG_GCC_PLUGINS
   gcc-plugin-$(CONFIG_GCC_PLUGIN_LATENT_ENTROPY)	+= latent_entropy_plugin.so
   gcc-plugin-cflags-$(CONFIG_GCC_PLUGIN_LATENT_ENTROPY)	+= -DLATENT_ENTROPY_PLUGIN

-  ifdef CONFIG_PAX_LATENT_ENTROPY
+  ifdef CONFIG_GCC_PLUGIN_LATENT_ENTROPY
     DISABLE_LATENT_ENTROPY_PLUGIN += -fplugin-arg-latent_entropy_plugin-disable
   endif

--
Andrew Donnellan              OzLabs, ADL Canberra
andrew.donnel...@au1.ibm.com  IBM Australia Limited
Re: [PATCH 1/3] KVM: PPC: Book3S: Change interrupt call to reduce scratch space use on HV
On Thu, Dec 01, 2016 at 06:18:10PM +1100, Nicholas Piggin wrote: > Change the calling convention to put the trap number together with > CR in two halves of r12, which frees up HSTATE_SCRATCH2 in the HV > handler, and r9 free. Cute idea! Some comments below... > The 64-bit PR handler entry translates the calling convention back > to match the previous call convention (i.e., shared with 32-bit), for > simplicity. > > Signed-off-by: Nicholas Piggin > --- > arch/powerpc/include/asm/exception-64s.h | 28 +++- > arch/powerpc/kvm/book3s_hv_rmhandlers.S | 15 +++ > arch/powerpc/kvm/book3s_segment.S| 27 --- > 3 files changed, 42 insertions(+), 28 deletions(-) > > diff --git a/arch/powerpc/include/asm/exception-64s.h > b/arch/powerpc/include/asm/exception-64s.h > index 9a3eee6..bc8fc45 100644 > --- a/arch/powerpc/include/asm/exception-64s.h > +++ b/arch/powerpc/include/asm/exception-64s.h > @@ -233,7 +233,7 @@ END_FTR_SECTION_NESTED(ftr,ftr,943) > > #endif > > -#define __KVM_HANDLER_PROLOG(area, n) > \ > +#define __KVM_HANDLER(area, h, n)\ > BEGIN_FTR_SECTION_NESTED(947) \ > ld r10,area+EX_CFAR(r13); \ > std r10,HSTATE_CFAR(r13); \ > @@ -243,30 +243,32 @@ END_FTR_SECTION_NESTED(ftr,ftr,943) > std r10,HSTATE_PPR(r13);\ > END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,948);\ > ld r10,area+EX_R10(r13); \ > - stw r9,HSTATE_SCRATCH1(r13);\ > - ld r9,area+EX_R9(r13); \ > std r12,HSTATE_SCRATCH0(r13); \ > - > -#define __KVM_HANDLER(area, h, n)\ > - __KVM_HANDLER_PROLOG(area, n) \ > - li r12,n; \ > + li r12,(n);\ > + sldir12,r12,32; \ > + or r12,r12,r9; \ Did you consider doing it the other way around, i.e. with r12 containing (cr << 32) | trap? That would save 1 instruction in each handler: + sldir12,r9,32; \ + ori r12,r12,(n);\ > + ld r9,area+EX_R9(r13); \ > + std r9,HSTATE_SCRATCH1(r13);\ Why not put this std in kvmppc_interrupt[_hv] rather than in each handler? 
> b kvmppc_interrupt > > #define __KVM_HANDLER_SKIP(area, h, n) > \ > cmpwi r10,KVM_GUEST_MODE_SKIP;\ > - ld r10,area+EX_R10(r13); \ > beq 89f;\ > - stw r9,HSTATE_SCRATCH1(r13);\ > BEGIN_FTR_SECTION_NESTED(948) \ > - ld r9,area+EX_PPR(r13);\ > - std r9,HSTATE_PPR(r13); \ > + ld r10,area+EX_PPR(r13); \ > + std r10,HSTATE_PPR(r13);\ > END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,948);\ > - ld r9,area+EX_R9(r13); \ > + ld r10,area+EX_R10(r13); \ > std r12,HSTATE_SCRATCH0(r13); \ > - li r12,n; \ > + li r12,(n);\ > + sldir12,r12,32; \ > + or r12,r12,r9; \ > + ld r9,area+EX_R9(r13); \ > + std r9,HSTATE_SCRATCH1(r13);\ Same comment again, of course. > b kvmppc_interrupt; \ > 89: mtocrf 0x80,r9;\ > ld r9,area+EX_R9(r13); \ > + ld r10,area+EX_R10(r13); \ > b kvmppc_skip_##h##interrupt > > #ifdef CONFIG_KVM_BOOK3S_64_HANDLER > diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S > b/arch/powerpc/kvm/book3s_hv_rmhandlers.S > index c3c1d1b..0536c73 100644 > --- a/arch/powerpc/kvm/book
Re: [PATCH v8 2/3] perf annotate: Support jump instruction with target as second operand
Hi Arnaldo,

Hmm, so it's difficult to find an example of this when we use debuginfo, because jump__parse tries to look for two things: an 'offset' and a 'target address'. objdump with debuginfo will include the offset in the assembly, e.g. annotate of 'smp_call_function_single' with the perf.data and vmlinux I shared:

│    c016d6ac:   cmpwi  cr7,r9,0
│    c016d6b0: ↑ bne    cr7,c016d59c <.smp_call_function_single+0x8c>
│    c016d6b4:   addis  r10,r2,-15

objdump of the same function with kcore:

│    c016d6ac:   cmpwi  cr7,r9,0
│    c016d6b0: ↓ bne    cr7,0xc016d59c
│    c016d6b4:   addis  r10,r2,-15

Annotating in the first case won't show any issue, because we directly get the offset; but even in this case we are parsing the wrong target address into ops->target.addr. While we don't have the offset in the second case, we use the target address to find it, and thus it shows wrong output, something like:

│      cmpwi  cr7,r9,0
│    ↓ bne    3fe92afc
│      addis  r10,r2,-15

BTW, we have a lot of such instructions in the kernel.

Thanks,
-Ravi

On Monday 05 December 2016 09:26 PM, Ravi Bangoria wrote:
> Arch like powerpc has jump instructions that include the target address
> as the second operand. For example, 'bne cr7,0xc00f6154'. Add
> support for such instructions in perf annotate.
>
> objdump o/p:
>   c00f6140:	ld     r9,1032(r31)
>   c00f6144:	cmpdi  cr7,r9,0
>   c00f6148:	bne    cr7,0xc00f6154
>   c00f614c:	ld     r9,2312(r30)
>   c00f6150:	std    r9,1032(r31)
>   c00f6154:	ld     r9,88(r31)
>
> Corresponding perf annotate o/p:
>
> Before patch:
>        ld     r9,1032(r31)
>        cmpdi  cr7,r9,0
>      v bne    3ff09f2c
>        ld     r9,2312(r30)
>        std    r9,1032(r31)
>  74:   ld     r9,88(r31)
>
> After patch:
>        ld     r9,1032(r31)
>        cmpdi  cr7,r9,0
>      v bne    74
>        ld     r9,2312(r30)
>        std    r9,1032(r31)
>  74:   ld     r9,88(r31)
>
> Signed-off-by: Ravi Bangoria
> ---
> Changes in v8:
>   - v7: https://lkml.org/lkml/2016/9/21/436
>   - Rebase to acme/perf/core
>   - Little change in patch description.
>   - No logical changes. (Cross arch annotate patches are in. This patch
>     is for hardening annotate for powerpc.)
>
>  tools/perf/util/annotate.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
> index ea7e0de..590244e 100644
> --- a/tools/perf/util/annotate.c
> +++ b/tools/perf/util/annotate.c
> @@ -223,8 +223,12 @@ bool ins__is_call(const struct ins *ins)
>  static int jump__parse(struct arch *arch __maybe_unused, struct ins_operands *ops, struct map *map __maybe_unused)
>  {
>  	const char *s = strchr(ops->raw, '+');
> +	const char *c = strchr(ops->raw, ',');
>
> -	ops->target.addr = strtoull(ops->raw, NULL, 16);
> +	if (c++ != NULL)
> +		ops->target.addr = strtoull(c, NULL, 16);
> +	else
> +		ops->target.addr = strtoull(ops->raw, NULL, 16);
>
>  	if (s++ != NULL)
>  		ops->target.offset = strtoull(s, NULL, 16);
Re: [PATCH] cxl: prevent read/write to AFU config space while AFU not configured
Acked-by: Ian Munsie

Looks like a reasonable solution.

> Pradipta found this while doing testing for cxlflash. I've tested this
> patch and I'm satisfied that it solves the issue, but I've asked Pradipta
> to test it a bit further. :)
[PATCH RFC v2 3/3] powerpc/64: Enable use of radix MMU under hypervisor on POWER9
To use radix as a guest, we first need to tell the hypervisor via the ibm,client-architecture call first that we support POWER9 and architecture v3.00, and that we can do either radix or hash and that we would like to choose later using an hcall (the H_REGISTER_PROC_TBL hcall). Then we need to check whether the hypervisor agreed to us using radix. We need to do this very early on in the kernel boot process before any of the MMU initialization is done. If the hypervisor doesn't agree, we can't use radix and therefore clear the radix MMU feature bit. Later, when we have set up our process table, which points to the radix tree for each process, we need to install that using the H_REGISTER_PROC_TBL hcall. Signed-off-by: Paul Mackerras --- arch/powerpc/include/asm/book3s/64/mmu.h | 2 ++ arch/powerpc/include/asm/hvcall.h| 11 +++ arch/powerpc/include/asm/prom.h | 9 + arch/powerpc/kernel/prom_init.c | 18 +- arch/powerpc/mm/init_64.c| 29 + arch/powerpc/mm/pgtable-radix.c | 2 ++ arch/powerpc/platforms/pseries/lpar.c| 29 + 7 files changed, 99 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h index 8afb0e0..e8cbdc0 100644 --- a/arch/powerpc/include/asm/book3s/64/mmu.h +++ b/arch/powerpc/include/asm/book3s/64/mmu.h @@ -138,5 +138,7 @@ static inline void setup_initial_memory_limit(phys_addr_t first_memblock_base, extern int (*register_process_table)(unsigned long base, unsigned long page_size, unsigned long tbl_size); +extern void radix_init_pseries(void); + #endif /* __ASSEMBLY__ */ #endif /* _ASM_POWERPC_BOOK3S_64_MMU_H_ */ diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h index 77ff1ba..54d11b3 100644 --- a/arch/powerpc/include/asm/hvcall.h +++ b/arch/powerpc/include/asm/hvcall.h @@ -276,6 +276,7 @@ #define H_GET_MPP_X0x314 #define H_SET_MODE 0x31C #define H_CLEAR_HPT0x358 +#define H_REGISTER_PROC_TBL0x37C #define H_SIGNAL_SYS_RESET 0x380 #define MAX_HCALL_OPCODE 
H_SIGNAL_SYS_RESET @@ -313,6 +314,16 @@ #define H_SIGNAL_SYS_RESET_ALL_OTHERS -2 /* >= 0 values are CPU number */ +/* Flag values used in H_REGISTER_PROC_TBL hcall */ +#define PROC_TABLE_OP_MASK 0x18 +#define PROC_TABLE_DEREG 0x10 +#define PROC_TABLE_NEW 0x18 +#define PROC_TABLE_TYPE_MASK 0x06 +#define PROC_TABLE_HPT_SLB 0x00 +#define PROC_TABLE_HPT_PT 0x02 +#define PROC_TABLE_RADIX 0x04 +#define PROC_TABLE_GTSE0x01 + #ifndef __ASSEMBLY__ /** diff --git a/arch/powerpc/include/asm/prom.h b/arch/powerpc/include/asm/prom.h index e6d83d0..8af2546 100644 --- a/arch/powerpc/include/asm/prom.h +++ b/arch/powerpc/include/asm/prom.h @@ -121,6 +121,8 @@ struct of_drconf_cell { #define OV1_PPC_2_06 0x02/* set if we support PowerPC 2.06 */ #define OV1_PPC_2_07 0x01/* set if we support PowerPC 2.07 */ +#define OV1_PPC_3_00 0x80/* set if we support PowerPC 3.00 */ + /* Option vector 2: Open Firmware options supported */ #define OV2_REAL_MODE 0x20/* set if we want OF in real mode */ @@ -155,6 +157,13 @@ struct of_drconf_cell { #define OV5_PFO_HW_842 0x1140 /* PFO Compression Accelerator */ #define OV5_PFO_HW_ENCR0x1120 /* PFO Encryption Accelerator */ #define OV5_SUB_PROCESSORS 0x1501 /* 1,2,or 4 Sub-Processors supported */ +#define OV5_XIVE_EXPLOIT 0x1701 /* XIVE exploitation supported */ +#define OV5_MMU_RADIX_300 0x1880 /* ISA v3.00 radix MMU supported */ +#define OV5_MMU_HASH_300 0x1840 /* ISA v3.00 hash MMU supported */ +#define OV5_MMU_SEGM_RADIX 0x1820 /* radix mode (no segmentation) */ +#define OV5_MMU_PROC_TBL 0x1810 /* hcall selects SLB or proc table */ +#define OV5_MMU_SLB0x1800 /* always use SLB */ +#define OV5_MMU_GTSE 0x1808 /* Guest translation shootdown */ /* Option Vector 6: IBM PAPR hints */ #define OV6_LINUX 0x02/* Linux is our OS */ diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c index ec47a93..358d43f 100644 --- a/arch/powerpc/kernel/prom_init.c +++ b/arch/powerpc/kernel/prom_init.c @@ -649,6 +649,7 @@ static void __init 
early_cmdline_parse(void) struct option_vector1 { u8 byte1; u8 arch_versions; + u8 arch_versions3; } __packed; struct option_vector2 { @@ -691,6 +692,9 @@ struct option_vector5 { u8 reserved2; __be16 reserved3; u8 subprocessors; + u8 byte22; + u8 intarch; + u8 mmu; } __packed; struct option_vector6 { @@ -700,7 +704,7 @@ struct option_vector6 { } __packed; struct ibm_arch_vec { - struct { u32 mask, val; } pvrs[10]; +
Re: [PATCH v8 2/6] powerpc: pSeries/Kconfig: Add qspinlock build config
On 2016/12/6 09:24, Pan Xinhui wrote:
> On 2016/12/6 08:58, Boqun Feng wrote:
>> On Mon, Dec 05, 2016 at 10:19:22AM -0500, Pan Xinhui wrote:
>>> pSeries/powerNV will use qspinlock from now on.
>>>
>>> Signed-off-by: Pan Xinhui
>>> ---
>>>  arch/powerpc/platforms/pseries/Kconfig | 8 ++++++++
>>>  1 file changed, 8 insertions(+)
>>>
>>> diff --git a/arch/powerpc/platforms/pseries/Kconfig b/arch/powerpc/platforms/pseries/Kconfig
>>> index bec90fb..8a87d06 100644
>>> --- a/arch/powerpc/platforms/pseries/Kconfig
>>> +++ b/arch/powerpc/platforms/pseries/Kconfig
>>
>> Why here? Not arch/powerpc/platforms/Kconfig?
>>
>>> @@ -23,6 +23,14 @@ config PPC_PSERIES
>>>  	select PPC_DOORBELL
>>>  	default y
>>>
>>> +config ARCH_USE_QUEUED_SPINLOCKS
>>> +	default y
>>> +	bool "Enable qspinlock"
>>
>> I think you just enable qspinlock by default for all PPC platforms. I
>> guess you need to put "depends on PPC_PSERIES || PPC_POWERNV" here to
>> achieve what you mean in your commit message.

oh, yes, it needs "depends on PPC_PSERIES || PPC_POWERNV".

> yes, another good way. I prefer to put it in pseries/Kconfig, the same as
> the pv-qspinlock config. When we build for powernv, it still includes
> pSeries's config anyway.
>
> thanks
> xinhui
>
>> Regards,
>> Boqun
>>
>>> +	help
>>> +	  Enabling this option will let the kernel use qspinlock, which is a kind of
>>> +	  fair lock. It has shown a good performance improvement on x86 and also ppc,
>>> +	  especially in high contention cases.
>>> +
>>>  config PPC_SPLPAR
>>>  	depends on PPC_PSERIES
>>>  	bool "Support for shared-processor logical partitions"
>>> --
>>> 2.4.11
[RFC][PATCH] powerpc/64s: use start, size rather than start, end for exception handlers
start,size has the benefit of being easier to search for (start,end usually gives you the preceeding vector from the one you want, as first result). Suggested-by: Benjamin Herrenschmidt Signed-off-by: Nicholas Piggin --- arch/powerpc/include/asm/head-64.h | 158 ++-- arch/powerpc/kernel/exceptions-64s.S | 195 ++- 2 files changed, 185 insertions(+), 168 deletions(-) diff --git a/arch/powerpc/include/asm/head-64.h b/arch/powerpc/include/asm/head-64.h index c691fc2..a475711 100644 --- a/arch/powerpc/include/asm/head-64.h +++ b/arch/powerpc/include/asm/head-64.h @@ -38,8 +38,8 @@ * li r10,128 * mv r11,r10 - * FIXED_SECTION_ENTRY_BEGIN_LOCATION(section_name, label2, start_address) - * FIXED_SECTION_ENTRY_END_LOCATION(section_name, label2, end_address) + * FIXED_SECTION_ENTRY_BEGIN_LOCATION(section_name, label2, start_address, size) + * FIXED_SECTION_ENTRY_END_LOCATION(section_name, label2, start_address, size) * CLOSE_FIXED_SECTION(section_name) * * ZERO_FIXED_SECTION can be used to emit zeroed data. @@ -102,9 +102,15 @@ end_##sname: #define FIXED_SECTION_ENTRY_BEGIN(sname, name) \ __FIXED_SECTION_ENTRY_BEGIN(sname, name, IFETCH_ALIGN_BYTES) -#define FIXED_SECTION_ENTRY_BEGIN_LOCATION(sname, name, start) \ +#define FIXED_SECTION_ENTRY_BEGIN_LOCATION(sname, name, start, size) \ USE_FIXED_SECTION(sname); \ name##_start = (start); \ + .if ((start) % (size) != 0);\ + .error "Fixed section exception vector misalignment"; \ + .endif; \ + .if ((size) != 0x20) && ((size) != 0x80) && ((size) != 0x100); \ + .error "Fixed section exception vector bad size"; \ + .endif; \ .if (start) < sname##_start;\ .error "Fixed section underflow"; \ .abort; \ @@ -113,16 +119,16 @@ end_##sname: .global name; \ name: -#define FIXED_SECTION_ENTRY_END_LOCATION(sname, name, end) \ - .if (end) > sname##_end;\ +#define FIXED_SECTION_ENTRY_END_LOCATION(sname, name, start, size) \ + .if (start) + (size) > sname##_end; \ .error "Fixed section overflow";\ .abort; \ .endif; \ - .if (. 
- name > end - name##_start);\ + .if (. - name > (start) + (size) - name##_start); \ .error "Fixed entry overflow"; \ .abort; \ .endif; \ - . = ((end) - sname##_start);\ + . = ((start) + (size) - sname##_start); \ /* @@ -191,17 +197,17 @@ end_##sname: * and OOL handlers are implemented as types of TRAMP and TRAMP_VIRT handlers. */ -#define EXC_REAL_BEGIN(name, start, end) \ - FIXED_SECTION_ENTRY_BEGIN_LOCATION(real_vectors, exc_real_##start##_##name, start) +#define EXC_REAL_BEGIN(name, start, size) \ + FIXED_SECTION_ENTRY_BEGIN_LOCATION(real_vectors, exc_real_##start##_##name, start, size) -#define EXC_REAL_END(name, start, end) \ - FIXED_SECTION_ENTRY_END_LOCATION(real_vectors, exc_real_##start##_##name, end) +#define EXC_REAL_END(name, start, size)\ + FIXED_SECTION_ENTRY_END_LOCATION(real_vectors, exc_real_##start##_##name, start, size) -#define EXC_VIRT_BEGIN(name, start, end) \ - FIXED_SECTION_ENTRY_BEGIN_LOCATION(virt_vectors, exc_virt_##start##_##name, start) +#define EXC_VIRT_BEGIN(name, start, size) \ + FIXED_SECTION_ENTRY_BEGIN_LOCATION(virt_vectors, exc_virt_##start##_##name, start, size) -#define EXC_VIRT_END(name, start, end) \ - FIXED_SECTION_ENTRY_END_LOCATION(virt_vectors, exc_virt_##start##_##name, end) +#define EXC_VIRT_END(name, start, size)\ + FIXED_SECTION_ENTRY_END_LOCATION(virt_vectors, exc_virt_##start##_##name, start, size) #define EXC_COMMON_BEGIN(name) \ USE_TEXT_SECTION(); \ @@ -223,140 +229,140 @@ end_##sname: #define TRAMP_KVM_BEGIN(name) #endif -#define EXC_REAL_NONE(start, end) \ - FIXED_SECTION_ENTRY_BEGIN_LOCATION(real_vectors, exc_real_##start##_##unused, start); \ - FIXED_SEC
[PATCH] powerpc/64s: tidy up after exception handler rework
Somewhere along the line, search/replace left some naming garbled, and untidy alignment. Might as well fix them all up now while git blame history doesn't extend too far. Signed-off-by: Nicholas Piggin --- arch/powerpc/include/asm/head-64.h | 160 +-- arch/powerpc/kernel/exceptions-64s.S | 2 +- 2 files changed, 81 insertions(+), 81 deletions(-) diff --git a/arch/powerpc/include/asm/head-64.h b/arch/powerpc/include/asm/head-64.h index fca7033..c691fc2 100644 --- a/arch/powerpc/include/asm/head-64.h +++ b/arch/powerpc/include/asm/head-64.h @@ -102,7 +102,7 @@ end_##sname: #define FIXED_SECTION_ENTRY_BEGIN(sname, name) \ __FIXED_SECTION_ENTRY_BEGIN(sname, name, IFETCH_ALIGN_BYTES) -#define FIXED_SECTION_ENTRY_BEGIN_LOCATION(sname, name, start) \ +#define FIXED_SECTION_ENTRY_BEGIN_LOCATION(sname, name, start) \ USE_FIXED_SECTION(sname); \ name##_start = (start); \ .if (start) < sname##_start;\ @@ -113,7 +113,7 @@ end_##sname: .global name; \ name: -#define FIXED_SECTION_ENTRY_END_LOCATION(sname, name, end) \ +#define FIXED_SECTION_ENTRY_END_LOCATION(sname, name, end) \ .if (end) > sname##_end;\ .error "Fixed section overflow";\ .abort; \ @@ -147,12 +147,12 @@ end_##sname: * Following are the BOOK3S exception handler helper macros. * Handlers come in a number of types, and each type has a number of varieties. 
* - * EXC_REAL_*- real, unrelocated exception vectors - * EXC_VIRT_*- virt (AIL), unrelocated exception vectors + * EXC_REAL_* - real, unrelocated exception vectors + * EXC_VIRT_* - virt (AIL), unrelocated exception vectors * TRAMP_REAL_* - real, unrelocated helpers (virt can call these) - * TRAMP_VIRT_* - virt, unreloc helpers (in practice, real can use) - * TRAMP_KVM - KVM handlers that get put into real, unrelocated - * EXC_COMMON_* - virt, relocated common handlers + * TRAMP_VIRT_* - virt, unreloc helpers (in practice, real can use) + * TRAMP_KVM - KVM handlers that get put into real, unrelocated + * EXC_COMMON_* - virt, relocated common handlers * * The EXC handlers are given a name, and branch to name_common, or the * appropriate KVM or masking function. Vector handler verieties are as @@ -194,20 +194,20 @@ end_##sname: #define EXC_REAL_BEGIN(name, start, end) \ FIXED_SECTION_ENTRY_BEGIN_LOCATION(real_vectors, exc_real_##start##_##name, start) -#define EXC_REAL_END(name, start, end) \ +#define EXC_REAL_END(name, start, end) \ FIXED_SECTION_ENTRY_END_LOCATION(real_vectors, exc_real_##start##_##name, end) #define EXC_VIRT_BEGIN(name, start, end) \ FIXED_SECTION_ENTRY_BEGIN_LOCATION(virt_vectors, exc_virt_##start##_##name, start) -#define EXC_VIRT_END(name, start, end) \ +#define EXC_VIRT_END(name, start, end) \ FIXED_SECTION_ENTRY_END_LOCATION(virt_vectors, exc_virt_##start##_##name, end) -#define EXC_COMMON_BEGIN(name) \ - USE_TEXT_SECTION(); \ - .balign IFETCH_ALIGN_BYTES; \ - .global name; \ - DEFINE_FIXED_SYMBOL(name); \ +#define EXC_COMMON_BEGIN(name) \ + USE_TEXT_SECTION(); \ + .balign IFETCH_ALIGN_BYTES; \ + .global name; \ + DEFINE_FIXED_SYMBOL(name); \ name: #define TRAMP_REAL_BEGIN(name) \ @@ -217,7 +217,7 @@ end_##sname: FIXED_SECTION_ENTRY_BEGIN(virt_trampolines, name) #ifdef CONFIG_KVM_BOOK3S_64_HANDLER -#define TRAMP_KVM_BEGIN(name) \ +#define TRAMP_KVM_BEGIN(name) \ TRAMP_REAL_BEGIN(name) #else #define TRAMP_KVM_BEGIN(name) @@ -232,132 +232,132 
@@ end_##sname: FIXED_SECTION_ENTRY_END_LOCATION(virt_vectors, exc_virt_##start##_##unused, end); -#define EXC_REAL(name, start, end) \ - EXC_REAL_BEGIN(name, start, end); \ +#define EXC_REAL(name, start, end) \ + EXC_REAL_BEGIN(name, start, end); \ STD_EXCEPTION_PSERIES(start, name##_common);\ EXC_REAL_END(name, start, end);
Re: [PATCH v8 3/6] powerpc: lib/locks.c: Add cpu yield/wake helper function
On 2016/12/6 09:23, Boqun Feng wrote:
> On Mon, Dec 05, 2016 at 10:19:23AM -0500, Pan Xinhui wrote:
>> Add two corresponding helper functions to support pv-qspinlock.
>>
>> For normal use, __spin_yield_cpu will confer current vcpu slices to the
>> target vcpu (say, a lock holder). If the target vcpu is not specified, or
>> it is in the running state, whether such conferring to the lpar happens
>> depends on circumstances, because the hcall itself will introduce latency
>> and a little overhead. And we do NOT want to suffer any latency in some
>> cases, e.g. in an interrupt handler. The second parameter *confer* can
>> indicate such a case.
>>
>> __spin_wake_cpu is simpler; it will wake up one vcpu regardless of its
>> current vcpu state.
>>
>> Signed-off-by: Pan Xinhui
>> ---
>>  arch/powerpc/include/asm/spinlock.h |  4 +++
>>  arch/powerpc/lib/locks.c            | 59 +
>>  2 files changed, 63 insertions(+)
>>
>> diff --git a/arch/powerpc/include/asm/spinlock.h b/arch/powerpc/include/asm/spinlock.h
>> index 954099e..6426bd5 100644
>> --- a/arch/powerpc/include/asm/spinlock.h
>> +++ b/arch/powerpc/include/asm/spinlock.h
>> @@ -64,9 +64,13 @@ static inline bool vcpu_is_preempted(int cpu)
>>  /* We only yield to the hypervisor if we are in shared processor mode */
>>  #define SHARED_PROCESSOR (lppaca_shared_proc(local_paca->lppaca_ptr))
>>  extern void __spin_yield(arch_spinlock_t *lock);
>> +extern void __spin_yield_cpu(int cpu, int confer);
>> +extern void __spin_wake_cpu(int cpu);
>>  extern void __rw_yield(arch_rwlock_t *lock);
>>  #else /* SPLPAR */
>>  #define __spin_yield(x)	barrier()
>> +#define __spin_yield_cpu(x, y) barrier()
>> +#define __spin_wake_cpu(x) barrier()
>>  #define __rw_yield(x)	barrier()
>>  #define SHARED_PROCESSOR	0
>>  #endif
>> diff --git a/arch/powerpc/lib/locks.c b/arch/powerpc/lib/locks.c
>> index 6574626..bd872c9 100644
>> --- a/arch/powerpc/lib/locks.c
>> +++ b/arch/powerpc/lib/locks.c
>> @@ -23,6 +23,65 @@
>>  #include
>>  #include
>>
>> +/*
>> + * confer our slices to a specified cpu and return. If it is in running state
>> + * or cpu is -1, then we will check confer. If confer is NULL, we will return
>> + * otherwise we confer our slices to lpar.
>> + */
>> +void __spin_yield_cpu(int cpu, int confer)
>> +{
>> +	unsigned int holder_cpu = cpu, yield_count;
>
> As I said at:
>
> https://marc.info/?l=linux-kernel&m=147455748619343&w=2
>
> @holder_cpu is not necessary and doesn't help anything.
>
>> +
>> +	if (cpu == -1)
>> +		goto yield_to_lpar;
>> +
>> +	BUG_ON(holder_cpu >= nr_cpu_ids);
>> +	yield_count = be32_to_cpu(lppaca_of(holder_cpu).yield_count);
>> +
>> +	/* if cpu is running, confer slices to lpar conditionally*/
>> +	if ((yield_count & 1) == 0)
>> +		goto yield_to_lpar;
>> +
>> +	plpar_hcall_norets(H_CONFER,
>> +		get_hard_smp_processor_id(holder_cpu), yield_count);
>> +	return;
>> +
>> +yield_to_lpar:
>> +	if (confer)
>> +		plpar_hcall_norets(H_CONFER, -1, 0);
>> +}
>> +EXPORT_SYMBOL_GPL(__spin_yield_cpu);
>> +
>> +void __spin_wake_cpu(int cpu)
>> +{
>> +	unsigned int holder_cpu = cpu;
>
> And it's even wrong to call the parameter of _wake_cpu() a holder_cpu,
> because it's not the current lock holder.

oh, its name is really misleading. thanks

> Regards,
> Boqun
>
>> +
>> +	BUG_ON(holder_cpu >= nr_cpu_ids);
>> +	/*
>> +	 * NOTE: we should always do this hcall regardless of
>> +	 * the yield_count of the holder_cpu.
>> +	 * as there might be a case like below;
>> +	 * CPU 1                CPU 2
>> +	 *                      yielded = true
>> +	 * if (yielded)
>> +	 *	__spin_wake_cpu()
>> +	 * __spin_yield_cpu()
>> +	 *
>> +	 * So we might lose a wake if we check the yield_count and
>> +	 * return directly if the holder_cpu is running.
>> +	 * IOW. do NOT code like below.
>> +	 *	yield_count = be32_to_cpu(lppaca_of(holder_cpu).yield_count);
>> +	 *	if ((yield_count & 1) == 0)
>> +	 *		return;
>> +	 *
>> +	 * a PROD hcall marks the target_cpu prodded, which causes the next cede
>> +	 * or confer called on the target_cpu to be invalid.
>> +	 */
>> +	plpar_hcall_norets(H_PROD,
>> +		get_hard_smp_processor_id(holder_cpu));
>> +}
>> +EXPORT_SYMBOL_GPL(__spin_wake_cpu);
>> +
>>  #ifndef CONFIG_QUEUED_SPINLOCKS
>>  void __spin_yield(arch_spinlock_t *lock)
>>  {
>> --
>> 2.4.11
Re: [PATCH v8 2/6] powerpc: pSeries/Kconfig: Add qspinlock build config
On 2016/12/6 08:58, Boqun Feng wrote: On Mon, Dec 05, 2016 at 10:19:22AM -0500, Pan Xinhui wrote: pSeries/powerNV will use qspinlock from now on. Signed-off-by: Pan Xinhui --- arch/powerpc/platforms/pseries/Kconfig | 8 1 file changed, 8 insertions(+) diff --git a/arch/powerpc/platforms/pseries/Kconfig b/arch/powerpc/platforms/pseries/Kconfig index bec90fb..8a87d06 100644 --- a/arch/powerpc/platforms/pseries/Kconfig +++ b/arch/powerpc/platforms/pseries/Kconfig Why here? Not arch/powerpc/platforms/Kconfig? @@ -23,6 +23,14 @@ config PPC_PSERIES select PPC_DOORBELL default y +config ARCH_USE_QUEUED_SPINLOCKS + default y + bool "Enable qspinlock" I think you just enable qspinlock by default for all PPC platforms. I guess you need to put depends on PPC_PSERIES || PPC_POWERNV here to achieve what you mean in your commit message. Yes, that is another good way. I prefer to put it in pseries/Kconfig, the same as the pv-qspinlock config. When we build powernv, it still includes pSeries's config anyway. thanks xinhui Regards, Boqun + help + Enabling this option will let kernel use qspinlock which is a kind of + fairlock. It has shown a good performance improvement on x86 and also ppc + especially in high contention cases. + config PPC_SPLPAR depends on PPC_PSERIES bool "Support for shared-processor logical partitions" -- 2.4.11
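Boqun's suggestion — gate the option on the platforms the commit message actually names instead of enabling it everywhere — would look roughly like the fragment below. This is a sketch of the review suggestion, not the final patch; the exact file placement (pseries/Kconfig vs. platforms/Kconfig) and help text are still under discussion in the thread:

```kconfig
config ARCH_USE_QUEUED_SPINLOCKS
	bool "Enable qspinlock"
	depends on PPC_PSERIES || PPC_POWERNV
	default y
	help
	  Use the queued spinlock implementation instead of the classic
	  test-and-set spinlock. qspinlock is a fair lock and has shown
	  performance improvements under high contention on x86 and ppc.
```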
Re: [PATCH v8 3/6] powerpc: lib/locks.c: Add cpu yield/wake helper function
On Mon, Dec 05, 2016 at 10:19:23AM -0500, Pan Xinhui wrote: > Add two corresponding helper functions to support pv-qspinlock. > > For normal use, __spin_yield_cpu will confer current vcpu slices to the > target vcpu(say, a lock holder). If target vcpu is not specified or it > is in running state, such conferging to lpar happens or not depends. > > Because hcall itself will introduce latency and a little overhead. And we > do NOT want to suffer any latency on some cases, e.g. in interrupt handler. > The second parameter *confer* can indicate such case. > > __spin_wake_cpu is simpiler, it will wake up one vcpu regardless of its > current vcpu state. > > Signed-off-by: Pan Xinhui > --- > arch/powerpc/include/asm/spinlock.h | 4 +++ > arch/powerpc/lib/locks.c| 59 > + > 2 files changed, 63 insertions(+) > > diff --git a/arch/powerpc/include/asm/spinlock.h > b/arch/powerpc/include/asm/spinlock.h > index 954099e..6426bd5 100644 > --- a/arch/powerpc/include/asm/spinlock.h > +++ b/arch/powerpc/include/asm/spinlock.h > @@ -64,9 +64,13 @@ static inline bool vcpu_is_preempted(int cpu) > /* We only yield to the hypervisor if we are in shared processor mode */ > #define SHARED_PROCESSOR (lppaca_shared_proc(local_paca->lppaca_ptr)) > extern void __spin_yield(arch_spinlock_t *lock); > +extern void __spin_yield_cpu(int cpu, int confer); > +extern void __spin_wake_cpu(int cpu); > extern void __rw_yield(arch_rwlock_t *lock); > #else /* SPLPAR */ > #define __spin_yield(x)barrier() > +#define __spin_yield_cpu(x, y) barrier() > +#define __spin_wake_cpu(x) barrier() > #define __rw_yield(x) barrier() > #define SHARED_PROCESSOR 0 > #endif > diff --git a/arch/powerpc/lib/locks.c b/arch/powerpc/lib/locks.c > index 6574626..bd872c9 100644 > --- a/arch/powerpc/lib/locks.c > +++ b/arch/powerpc/lib/locks.c > @@ -23,6 +23,65 @@ > #include > #include > > +/* > + * confer our slices to a specified cpu and return. If it is in running state > + * or cpu is -1, then we will check confer. 
If confer is NULL, we will return > + * otherwise we confer our slices to lpar. > + */ > +void __spin_yield_cpu(int cpu, int confer) > +{ > + unsigned int holder_cpu = cpu, yield_count; As I said at: https://marc.info/?l=linux-kernel&m=147455748619343&w=2 @holder_cpu is not necessary and doesn't help anything. > + > + if (cpu == -1) > + goto yield_to_lpar; > + > + BUG_ON(holder_cpu >= nr_cpu_ids); > + yield_count = be32_to_cpu(lppaca_of(holder_cpu).yield_count); > + > + /* if cpu is running, confer slices to lpar conditionally*/ > + if ((yield_count & 1) == 0) > + goto yield_to_lpar; > + > + plpar_hcall_norets(H_CONFER, > + get_hard_smp_processor_id(holder_cpu), yield_count); > + return; > + > +yield_to_lpar: > + if (confer) > + plpar_hcall_norets(H_CONFER, -1, 0); > +} > +EXPORT_SYMBOL_GPL(__spin_yield_cpu); > + > +void __spin_wake_cpu(int cpu) > +{ > + unsigned int holder_cpu = cpu; And it's even wrong to call the parameter of _wake_cpu() a holder_cpu, because it's not the current lock holder. Regards, Boqun > + > + BUG_ON(holder_cpu >= nr_cpu_ids); > + /* > + * NOTE: we should always do this hcall regardless of > + * the yield_count of the holder_cpu. > + * as thers might be a case like below; > + * CPU 1 CPU 2 > + * yielded = true > + * if (yielded) > + * __spin_wake_cpu() > + * __spin_yield_cpu() > + * > + * So we might lose a wake if we check the yield_count and > + * return directly if the holder_cpu is running. > + * IOW. do NOT code like below. > + * yield_count = be32_to_cpu(lppaca_of(holder_cpu).yield_count); > + * if ((yield_count & 1) == 0) > + * return; > + * > + * a PROD hcall marks the target_cpu proded, which cause the next cede > + * or confer called on the target_cpu invalid. > + */ > + plpar_hcall_norets(H_PROD, > + get_hard_smp_processor_id(holder_cpu)); > +} > +EXPORT_SYMBOL_GPL(__spin_wake_cpu); > + > #ifndef CONFIG_QUEUED_SPINLOCKS > void __spin_yield(arch_spinlock_t *lock) > { > -- > 2.4.11 > signature.asc Description: PGP signature
Re: [PATCH v3 2/3] powerpc: get hugetlbpage handling more generic
On Wed, 2016-09-21 at 10:11 +0200, Christophe Leroy wrote: > Today there are two implementations of hugetlbpages which are managed > by exclusive #ifdefs: > * FSL_BOOKE: several directory entries point to the same single hugepage > * BOOK3S: one upper level directory entry points to a table of hugepages > > In preparation for the implementation of hugepage support on the 8xx, we > need a mix of the two above solutions, because the 8xx needs both cases > depending on the size of pages: > * In 4k page size mode, each PGD entry covers a 4M bytes area. It means > that 2 PGD entries will be necessary to cover an 8M hugepage while a > single PGD entry will cover 8x 512k hugepages. > * In 16k page size mode, each PGD entry covers a 64M bytes area. It means > that 8x 8M hugepages will be covered by one PGD entry and 64x 512k > hugepages will be covered by one PGD entry. > > This patch: > * removes #ifdefs in favor of if/else based on the range sizes > * merges the two huge_pte_alloc() functions as they are pretty similar > * merges the two hugetlbpage_init() functions as they are pretty similar [snip] > @@ -860,16 +803,34 @@ static int __init hugetlbpage_init(void) > * if we have pdshift and shift value same, we don't > * use pgt cache for hugepd. > */ > - if (pdshift != shift) { > + if (pdshift > shift) { > pgtable_cache_add(pdshift - shift, NULL); > if (!PGT_CACHE(pdshift - shift)) > panic("hugetlbpage_init(): could not create > " > "pgtable cache for %d bit > pagesize\n", shift); > } > +#ifdef CONFIG_PPC_FSL_BOOK3E > + else if (!hugepte_cache) { This else never triggers on book3e, because the way this function calculates pdshift is wrong for book3e (it uses PyD_SHIFT instead of HUGEPD_PxD_SHIFT). We later get OOMs because huge_pte_alloc() calculates pdshift correctly, tries to use hugepte_cache, and fails. 
If the point of this patch is to remove the compile-time decision on whether to do things the book3e way, why are there still ifdefs such as the ones controlling the definition of HUGEPD_PxD_SHIFT? How does what you're doing on 8xx (for certain page sizes) differ from book3e? -Scott
Re: [PATCH v8 1/6] powerpc/qspinlock: powerpc support qspinlock
correct waiman's address. 在 2016/12/6 08:47, Boqun Feng 写道: On Mon, Dec 05, 2016 at 10:19:21AM -0500, Pan Xinhui wrote: This patch add basic code to enable qspinlock on powerpc. qspinlock is one kind of fairlock implementation. And seen some performance improvement under some scenarios. queued_spin_unlock() release the lock by just one write of NULL to the ::locked field which sits at different places in the two endianness system. We override some arch_spin_XXX as powerpc has io_sync stuff which makes sure the io operations are protected by the lock correctly. There is another special case, see commit 2c610022711 ("locking/qspinlock: Fix spin_unlock_wait() some more") Signed-off-by: Pan Xinhui --- arch/powerpc/include/asm/qspinlock.h | 66 +++ arch/powerpc/include/asm/spinlock.h | 31 +-- arch/powerpc/include/asm/spinlock_types.h | 4 ++ arch/powerpc/lib/locks.c | 59 +++ 4 files changed, 147 insertions(+), 13 deletions(-) create mode 100644 arch/powerpc/include/asm/qspinlock.h diff --git a/arch/powerpc/include/asm/qspinlock.h b/arch/powerpc/include/asm/qspinlock.h new file mode 100644 index 000..4c89256 --- /dev/null +++ b/arch/powerpc/include/asm/qspinlock.h @@ -0,0 +1,66 @@ +#ifndef _ASM_POWERPC_QSPINLOCK_H +#define _ASM_POWERPC_QSPINLOCK_H + +#include + +#define SPIN_THRESHOLD (1 << 15) +#define queued_spin_unlock queued_spin_unlock +#define queued_spin_is_locked queued_spin_is_locked +#define queued_spin_unlock_wait queued_spin_unlock_wait + +extern void queued_spin_unlock_wait(struct qspinlock *lock); + +static inline u8 *__qspinlock_lock_byte(struct qspinlock *lock) +{ + return (u8 *)lock + 3 * IS_BUILTIN(CONFIG_CPU_BIG_ENDIAN); +} + +static inline void queued_spin_unlock(struct qspinlock *lock) +{ + /* release semantics is required */ + smp_store_release(__qspinlock_lock_byte(lock), 0); +} + +static inline int queued_spin_is_locked(struct qspinlock *lock) +{ + smp_mb(); + return atomic_read(&lock->val); +} + +#include + +/* we need override it as ppc has 
io_sync stuff */ +#undef arch_spin_trylock +#undef arch_spin_lock +#undef arch_spin_lock_flags +#undef arch_spin_unlock +#define arch_spin_trylock arch_spin_trylock +#define arch_spin_lock arch_spin_lock +#define arch_spin_lock_flags arch_spin_lock_flags +#define arch_spin_unlock arch_spin_unlock + +static inline int arch_spin_trylock(arch_spinlock_t *lock) +{ + CLEAR_IO_SYNC; + return queued_spin_trylock(lock); +} + +static inline void arch_spin_lock(arch_spinlock_t *lock) +{ + CLEAR_IO_SYNC; + queued_spin_lock(lock); +} + +static inline +void arch_spin_lock_flags(arch_spinlock_t *lock, unsigned long flags) +{ + CLEAR_IO_SYNC; + queued_spin_lock(lock); +} + +static inline void arch_spin_unlock(arch_spinlock_t *lock) +{ + SYNC_IO; + queued_spin_unlock(lock); +} +#endif /* _ASM_POWERPC_QSPINLOCK_H */ diff --git a/arch/powerpc/include/asm/spinlock.h b/arch/powerpc/include/asm/spinlock.h index 8c1b913..954099e 100644 --- a/arch/powerpc/include/asm/spinlock.h +++ b/arch/powerpc/include/asm/spinlock.h @@ -60,6 +60,23 @@ static inline bool vcpu_is_preempted(int cpu) } #endif +#if defined(CONFIG_PPC_SPLPAR) +/* We only yield to the hypervisor if we are in shared processor mode */ +#define SHARED_PROCESSOR (lppaca_shared_proc(local_paca->lppaca_ptr)) +extern void __spin_yield(arch_spinlock_t *lock); +extern void __rw_yield(arch_rwlock_t *lock); +#else /* SPLPAR */ +#define __spin_yield(x)barrier() +#define __rw_yield(x) barrier() +#define SHARED_PROCESSOR 0 +#endif + +#ifdef CONFIG_QUEUED_SPINLOCKS +#include +#else + +#define arch_spin_relax(lock) __spin_yield(lock) + static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock) { return lock.slock == 0; @@ -114,18 +131,6 @@ static inline int arch_spin_trylock(arch_spinlock_t *lock) * held. Conveniently, we have a word in the paca that holds this * value. 
*/ - -#if defined(CONFIG_PPC_SPLPAR) -/* We only yield to the hypervisor if we are in shared processor mode */ -#define SHARED_PROCESSOR (lppaca_shared_proc(local_paca->lppaca_ptr)) -extern void __spin_yield(arch_spinlock_t *lock); -extern void __rw_yield(arch_rwlock_t *lock); -#else /* SPLPAR */ -#define __spin_yield(x)barrier() -#define __rw_yield(x) barrier() -#define SHARED_PROCESSOR 0 -#endif - static inline void arch_spin_lock(arch_spinlock_t *lock) { CLEAR_IO_SYNC; @@ -203,6 +208,7 @@ static inline void arch_spin_unlock_wait(arch_spinlock_t *lock) smp_mb(); } +#endif /* !CONFIG_QUEUED_SPINLOCKS */ /* * Read-write spinlocks, allowing multiple readers * but only one writer. @@ -338,7 +344,6 @@ static inline void arch_write_unlock(arch_rwlock_t *rw) #define arch_read_lock_flags(lock, flags) arch_read_lock(lock) #define arch_write_lock_flags(lock, flags) arch_write_lock(lock
Re: [PATCH v8 2/6] powerpc: pSeries/Kconfig: Add qspinlock build config
On Mon, Dec 05, 2016 at 10:19:22AM -0500, Pan Xinhui wrote: > pSeries/powerNV will use qspinlock from now on. > > Signed-off-by: Pan Xinhui > --- > arch/powerpc/platforms/pseries/Kconfig | 8 > 1 file changed, 8 insertions(+) > > diff --git a/arch/powerpc/platforms/pseries/Kconfig > b/arch/powerpc/platforms/pseries/Kconfig > index bec90fb..8a87d06 100644 > --- a/arch/powerpc/platforms/pseries/Kconfig > +++ b/arch/powerpc/platforms/pseries/Kconfig Why here? Not arch/powerpc/platforms/Kconfig? > @@ -23,6 +23,14 @@ config PPC_PSERIES > select PPC_DOORBELL > default y > > +config ARCH_USE_QUEUED_SPINLOCKS > + default y > + bool "Enable qspinlock" I think you just enable qspinlock by default for all PPC platforms. I guess you need to put depends on PPC_PSERIES || PPC_POWERNV here to achieve what you mean in you commit message. Regards, Boqun > + help > + Enabling this option will let kernel use qspinlock which is a kind of > + fairlock. It has shown a good performance improvement on x86 and > also ppc > + especially in high contention cases. > + > config PPC_SPLPAR > depends on PPC_PSERIES > bool "Support for shared-processor logical partitions" > -- > 2.4.11 > signature.asc Description: PGP signature
Re: [PATCH v8 1/6] powerpc/qspinlock: powerpc support qspinlock
On Mon, Dec 05, 2016 at 10:19:21AM -0500, Pan Xinhui wrote: > This patch add basic code to enable qspinlock on powerpc. qspinlock is > one kind of fairlock implementation. And seen some performance improvement > under some scenarios. > > queued_spin_unlock() release the lock by just one write of NULL to the > ::locked field which sits at different places in the two endianness > system. > > We override some arch_spin_XXX as powerpc has io_sync stuff which makes > sure the io operations are protected by the lock correctly. > > There is another special case, see commit > 2c610022711 ("locking/qspinlock: Fix spin_unlock_wait() some more") > > Signed-off-by: Pan Xinhui > --- > arch/powerpc/include/asm/qspinlock.h | 66 > +++ > arch/powerpc/include/asm/spinlock.h | 31 +-- > arch/powerpc/include/asm/spinlock_types.h | 4 ++ > arch/powerpc/lib/locks.c | 59 +++ > 4 files changed, 147 insertions(+), 13 deletions(-) > create mode 100644 arch/powerpc/include/asm/qspinlock.h > > diff --git a/arch/powerpc/include/asm/qspinlock.h > b/arch/powerpc/include/asm/qspinlock.h > new file mode 100644 > index 000..4c89256 > --- /dev/null > +++ b/arch/powerpc/include/asm/qspinlock.h > @@ -0,0 +1,66 @@ > +#ifndef _ASM_POWERPC_QSPINLOCK_H > +#define _ASM_POWERPC_QSPINLOCK_H > + > +#include > + > +#define SPIN_THRESHOLD (1 << 15) > +#define queued_spin_unlock queued_spin_unlock > +#define queued_spin_is_locked queued_spin_is_locked > +#define queued_spin_unlock_wait queued_spin_unlock_wait > + > +extern void queued_spin_unlock_wait(struct qspinlock *lock); > + > +static inline u8 *__qspinlock_lock_byte(struct qspinlock *lock) > +{ > + return (u8 *)lock + 3 * IS_BUILTIN(CONFIG_CPU_BIG_ENDIAN); > +} > + > +static inline void queued_spin_unlock(struct qspinlock *lock) > +{ > + /* release semantics is required */ > + smp_store_release(__qspinlock_lock_byte(lock), 0); > +} > + > +static inline int queued_spin_is_locked(struct qspinlock *lock) > +{ > + smp_mb(); > + return atomic_read(&lock->val); > 
+} > + > +#include > + > +/* we need override it as ppc has io_sync stuff */ > +#undef arch_spin_trylock > +#undef arch_spin_lock > +#undef arch_spin_lock_flags > +#undef arch_spin_unlock > +#define arch_spin_trylock arch_spin_trylock > +#define arch_spin_lock arch_spin_lock > +#define arch_spin_lock_flags arch_spin_lock_flags > +#define arch_spin_unlock arch_spin_unlock > + > +static inline int arch_spin_trylock(arch_spinlock_t *lock) > +{ > + CLEAR_IO_SYNC; > + return queued_spin_trylock(lock); > +} > + > +static inline void arch_spin_lock(arch_spinlock_t *lock) > +{ > + CLEAR_IO_SYNC; > + queued_spin_lock(lock); > +} > + > +static inline > +void arch_spin_lock_flags(arch_spinlock_t *lock, unsigned long flags) > +{ > + CLEAR_IO_SYNC; > + queued_spin_lock(lock); > +} > + > +static inline void arch_spin_unlock(arch_spinlock_t *lock) > +{ > + SYNC_IO; > + queued_spin_unlock(lock); > +} > +#endif /* _ASM_POWERPC_QSPINLOCK_H */ > diff --git a/arch/powerpc/include/asm/spinlock.h > b/arch/powerpc/include/asm/spinlock.h > index 8c1b913..954099e 100644 > --- a/arch/powerpc/include/asm/spinlock.h > +++ b/arch/powerpc/include/asm/spinlock.h > @@ -60,6 +60,23 @@ static inline bool vcpu_is_preempted(int cpu) > } > #endif > > +#if defined(CONFIG_PPC_SPLPAR) > +/* We only yield to the hypervisor if we are in shared processor mode */ > +#define SHARED_PROCESSOR (lppaca_shared_proc(local_paca->lppaca_ptr)) > +extern void __spin_yield(arch_spinlock_t *lock); > +extern void __rw_yield(arch_rwlock_t *lock); > +#else /* SPLPAR */ > +#define __spin_yield(x)barrier() > +#define __rw_yield(x) barrier() > +#define SHARED_PROCESSOR 0 > +#endif > + > +#ifdef CONFIG_QUEUED_SPINLOCKS > +#include > +#else > + > +#define arch_spin_relax(lock) __spin_yield(lock) > + > static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock) > { > return lock.slock == 0; > @@ -114,18 +131,6 @@ static inline int arch_spin_trylock(arch_spinlock_t > *lock) > * held. 
Conveniently, we have a word in the paca that holds this > * value. > */ > - > -#if defined(CONFIG_PPC_SPLPAR) > -/* We only yield to the hypervisor if we are in shared processor mode */ > -#define SHARED_PROCESSOR (lppaca_shared_proc(local_paca->lppaca_ptr)) > -extern void __spin_yield(arch_spinlock_t *lock); > -extern void __rw_yield(arch_rwlock_t *lock); > -#else /* SPLPAR */ > -#define __spin_yield(x) barrier() > -#define __rw_yield(x)barrier() > -#define SHARED_PROCESSOR 0 > -#endif > - > static inline void arch_spin_lock(arch_spinlock_t *lock) > { > CLEAR_IO_SYNC; > @@ -203,6 +208,7 @@ static inline void arch_spin_unlock_wait(arch_spinlock_t > *lock) > smp_mb(); > } > > +#endif /* !CONFIG_QUEUED_SPINLOCKS */ > /* > * Read-write spinlocks, allowing multiple readers > * but o
[GIT PULL 00/20] perf/core improvements and fixes
Hi Ingo, Please consider pulling, - Arnaldo Test results at the end of this message, as usual. The following changes since commit e7af7b15121ca08c31a0ab9df71a41b4c53365b4: Merge tag 'perf-core-for-mingo-20161201' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2016-12-02 10:08:03 +0100) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-20161205 for you to fetch changes up to bec60e50af83741cde1786ab475d4bf472aed6f9: perf annotate: Show raw form for jump instruction with indirect target (2016-12-05 17:21:57 -0300) perf/core improvements and fixes: Fixes: - Do not show a bogus target address in 'perf annotate' for targetless powerpc jump instructions such as 'bctr' (Ravi Bangoria) - tools/build fixes related to race conditions with the fixdep utility (Jiri Olsa) - Fix building objtool with clang (Peter Foley) Infrastructure: - Support linking perf with clang and LLVM libraries, initially statically, but this limitation will be lifted and shared libraries, when available, will be preferred to the static build, that should, as with other features, be enabled explicitly (Wang Nan) Signed-off-by: Arnaldo Carvalho de Melo Jiri Olsa (7): tools build: Make fixdep parsing wait for last target tools build: Make the .cmd file more readable tools build: Move tabs to spaces where suitable perf tools: Move install-gtk target into rules area perf tools: Move python/perf.so target into rules area perf tools: Cleanup build directory before each test perf tools: Add non config targets Peter Foley (1): tools build: Fix objtool build with clang Ravi Bangoria (1): perf annotate: Show raw form for jump instruction with indirect target Wang Nan (11): perf tools: Pass context to perf hook functions perf llvm: Extract helpers in llvm-utils.c tools build: Add feature detection for LLVM tools build: Add feature detection for clang perf build: Add clang and llvm compile and linking support 
perf clang: Add builtin clang support ant test case perf clang: Use real file system for #include perf clang: Allow passing CFLAGS to builtin clang perf clang: Update test case to use real BPF script perf clang: Support compile IR to BPF object and add testcase perf clang: Compile BPF script using builtin clang support tools/build/Build.include | 20 ++-- tools/build/Makefile.feature | 138 +- tools/build/feature/Makefile | 120 +-- tools/build/feature/test-clang.cpp | 21 tools/build/feature/test-llvm.cpp | 8 ++ tools/build/fixdep.c | 5 +- tools/perf/Makefile.config | 62 +--- tools/perf/Makefile.perf | 56 +++ tools/perf/tests/Build | 1 + tools/perf/tests/builtin-test.c| 9 ++ tools/perf/tests/clang.c | 46 + tools/perf/tests/llvm.h| 7 ++ tools/perf/tests/make | 4 +- tools/perf/tests/perf-hooks.c | 14 ++- tools/perf/tests/tests.h | 3 + tools/perf/util/Build | 2 + tools/perf/util/annotate.c | 3 + tools/perf/util/bpf-loader.c | 19 +++- tools/perf/util/c++/Build | 2 + tools/perf/util/c++/clang-c.h | 43 tools/perf/util/c++/clang-test.cpp | 62 tools/perf/util/c++/clang.cpp | 195 + tools/perf/util/c++/clang.h| 26 + tools/perf/util/llvm-utils.c | 76 +++ tools/perf/util/llvm-utils.h | 6 ++ tools/perf/util/perf-hooks.c | 10 +- tools/perf/util/perf-hooks.h | 6 +- tools/perf/util/util-cxx.h | 26 + 28 files changed, 795 insertions(+), 195 deletions(-) create mode 100644 tools/build/feature/test-clang.cpp create mode 100644 tools/build/feature/test-llvm.cpp create mode 100644 tools/perf/tests/clang.c create mode 100644 tools/perf/util/c++/Build create mode 100644 tools/perf/util/c++/clang-c.h create mode 100644 tools/perf/util/c++/clang-test.cpp create mode 100644 tools/perf/util/c++/clang.cpp create mode 100644 tools/perf/util/c++/clang.h create mode 100644 tools/perf/util/util-cxx.h # uname -a Linux jouet 4.8.8-300.fc25.x86_64 #1 SMP Tue Nov 15 18:10:06 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux # perf test 1: vmlinux symtab matches kallsyms: Ok 2: Detect openat syscall event: Ok 3: 
Detect openat syscall event on all cpus: Ok 4: Read samples using the mmap interface : Ok 5: Parse event definition strings : Ok 6: PERF_RECORD_* events & pe
[PATCH 20/20] perf annotate: Show raw form for jump instruction with indirect target
From: Ravi Bangoria For jump instructions that does not include target address as direct operand, show the original disassembled line for them. This is needed for certain powerpc jump instructions that use target address in a register (such as bctr, btar, ...). Before: ld r12,32088(r12) mtctr r12 v bctr ca2c stdr2,24(r1) addis r12,r2,-1 After: ld r12,32088(r12) mtctr r12 v bctr stdr2,24(r1) addis r12,r2,-1 Committer notes: Testing it using a perf.data file and vmlinux for powerpc64, cross-annotating it on a x86_64 workstation: Before: .__bpf_prog_run vmlinux.powerpc │stdr10,512(r9) ▒ │lbzr9,0(r31)▒ │rldicr r9,r9,3,60 ▒ │ldxr9,r30,r9▒ │mtctr r9 ▒ 100.00 │ ↓ bctr 3fe01510 ▒ │lwar10,4(r31) ▒ │lwzr9,0(r31)▒ Invalid jump offset: 3fe01510 After: .__bpf_prog_run vmlinux.powerpc │stdr10,512(r9) ▒ │lbzr9,0(r31)▒ │rldicr r9,r9,3,60 ▒ │ldxr9,r30,r9▒ │mtctr r9 ▒ 100.00 │ ↓ bctr▒ │lwar10,4(r31) ▒ │lwzr9,0(r31)▒ Invalid jump offset: 3fe01510 This, in turn, uncovers another problem with jumps without operands, the ENTER/-> operation, to jump to the target, still continues using the bogus target :-) BTW, this was the file used for the above tests: [acme@jouet ravi_bangoria]$ perf report --header-only -i perf.data.f22vm.powerdev # # captured on: Thu Nov 24 12:40:38 2016 # hostname : pdev-f22-qemu # os release : 4.4.10-200.fc22.ppc64 # perf version : 4.9.rc1.g6298ce # arch : ppc64 # nrcpus online : 48 # nrcpus avail : 48 # cpudesc : POWER7 (architected), altivec supported # cpuid : 74,513 # total memory : 4158976 kB # cmdline : /home/ravi/Workspace/linux/tools/perf/perf record -a # event : name = cycles:ppp, , size = 112, { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|CPU|PERIOD, disabled = 1, inherit = 1, mmap = 1, c # HEADER_CPU_TOPOLOGY info available, use -I to display # HEADER_NUMA_TOPOLOGY info available, use -I to display # pmu mappings: cpu = 4, software = 1, tracepoint = 2, breakpoint = 5 # missing features: HEADER_TRACING_DATA HEADER_BRANCH_STACK 
HEADER_GROUP_DESC HEADER_AUXTRACE HEADER_STAT HEADER_CACHE # # [acme@jouet ravi_bangoria]$ Suggested-by: Michael Ellerman Signed-off-by: Ravi Bangoria Tested-by: Arnaldo Carvalho de Melo Cc: Alexander Shishkin Cc: Chris Riyder Cc: Kim Phillips Cc: Markus Trippelsdorf Cc: Masami Hiramatsu Cc: Naveen N. Rao Cc: Peter Zijlstra Cc: Taeung Song Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/1480953407-7605-1-git-send-email-ravi.bango...@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/util/annotate.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c index 4012b1de2813..ea7e0de4b9c1 100644 --- a/tools/perf/util/annotate.c +++ b/tools/perf/util/annotate.c @@ -237,6 +237,9 @@ static int jump__parse(struct arch *arch __maybe_unused, struct ins_operands *op static int jump__scnprintf(struct ins *ins, char *bf, size_t size, struct ins_operands *ops) { + if (!ops->target.addr) + return ins__raw_scnprintf(ins, bf, size, ops); + return scnprintf(bf, size, "%-6.6s %" PRIx64, ins->name, ops->target.offset); } -- 2.9.3
Re: [PATCH v8 1/3] perf annotate: Show raw form for jump instruction with indirect target
Em Mon, Dec 05, 2016 at 05:21:42PM -0300, Arnaldo Carvalho de Melo escreveu: > Em Mon, Dec 05, 2016 at 09:26:45PM +0530, Ravi Bangoria escreveu: > > For jump instructions that does not include target address as direct > > operand, show the original disassembled line for them. This is needed > > for certain powerpc jump instructions that use target address in a > > register (such as bctr, btar, ...). > > Found it, .__bpf_prog_run, that is present in that perf.data file you > sent me, has it, will use it in my committer notes for this patch. So, I've added these committer notes while testing it, will continue processing your patches later/tomorrow, thanks! Committer notes: Testing it using a perf.data file and vmlinux for powerpc64, cross-annotating it on a x86_64 workstation: Before: .__bpf_prog_run vmlinux.powerpc │stdr10,512(r9) ▒ │lbzr9,0(r31)▒ │rldicr r9,r9,3,60 ▒ │ldxr9,r30,r9▒ │mtctr r9 ▒ 100.00 │ ↓ bctr 3fe01510 ▒ │lwar10,4(r31) ▒ │lwzr9,0(r31)▒ Invalid jump offset: 3fe01510 After: .__bpf_prog_run vmlinux.powerpc │stdr10,512(r9) ▒ │lbzr9,0(r31)▒ │rldicr r9,r9,3,60 ▒ │ldxr9,r30,r9▒ │mtctr r9 ▒ 100.00 │ ↓ bctr▒ │lwar10,4(r31) ▒ │lwzr9,0(r31)▒ Invalid jump offset: 3fe01510 This, in turn, uncovers another problem with jumps without operands, the ENTER/-> operation, to jump to the target, still continues using the bogus target :-) BTW, this was the file used for the above tests: [acme@jouet ravi_bangoria]$ perf report --header-only -i perf.data.f22vm.powerdev # # captured on: Thu Nov 24 12:40:38 2016 # hostname : pdev-f22-qemu # os release : 4.4.10-200.fc22.ppc64 # perf version : 4.9.rc1.g6298ce # arch : ppc64 # nrcpus online : 48 # nrcpus avail : 48 # cpudesc : POWER7 (architected), altivec supported # cpuid : 74,513 # total memory : 4158976 kB # cmdline : /home/ravi/Workspace/linux/tools/perf/perf record -a # event : name = cycles:ppp, , size = 112, { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|CPU|PERIOD, disabled = 1, inherit = 1, mmap = 1, 
c # HEADER_CPU_TOPOLOGY info available, use -I to display # HEADER_NUMA_TOPOLOGY info available, use -I to display # pmu mappings: cpu = 4, software = 1, tracepoint = 2, breakpoint = 5 # missing features: HEADER_TRACING_DATA HEADER_BRANCH_STACK HEADER_GROUP_DESC HEADER_AUXTRACE HEADER_STAT HEADER_CACHE # # [acme@jouet ravi_bangoria]$ Suggested-by: Michael Ellerman Signed-off-by: Ravi Bangoria Tested-by: Arnaldo Carvalho de Melo > - Arnaldo > > > > > Before: > > ld r12,32088(r12) > > mtctr r12 > > v bctr ca2c > > stdr2,24(r1) > > addis r12,r2,-1 > > > > After: > > ld r12,32088(r12) > > mtctr r12 > > v bctr > > stdr2,24(r1) > > addis r12,r2,-1 > > > > Suggested-by: Michael Ellerman > > Signed-off-by: Ravi Bangoria > > --- > > Changes in v8: > > - v7: https://lkml.org/lkml/2016/9/21/436 > > - Rebase to acme/perf/core > > - No logical changes. (Cross arch annotate patches are in. This patch > > is for hardening annotate for powerpc.) > > > > tools/perf/util/annotate.c | 3 +++ > > 1 file changed, 3 insertions(+) > > > > diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c > > index 4012b1d..ea7e0de 100644 > > --- a/tools/perf/util/annotate.c > > +++ b/tools/perf/util/annotate.c > > @@ -237,6 +237,9 @@ static int jump__parse(struct arch *arch > > __maybe_unused, struct ins_operands *op > > static int jump__scnprintf(struct ins *ins, char *bf, size_t size, > >struct ins_operands *ops) > > { > > + if (!ops->target.addr) > > + return ins__raw_scnprintf(ins, bf, size, ops); > > + > > return scnprintf(bf, size, "%-6.6s %" PRIx64, ins->name, > > ops->target.offset); > > } > > > > -- > > 2.4.11
Re: [PATCH v8 1/3] perf annotate: Show raw form for jump instruction with indirect target
Em Mon, Dec 05, 2016 at 09:26:45PM +0530, Ravi Bangoria escreveu: > For jump instructions that does not include target address as direct > operand, show the original disassembled line for them. This is needed > for certain powerpc jump instructions that use target address in a > register (such as bctr, btar, ...). Found it, .__bpf_prog_run, that is present in that perf.data file you sent me, has it, will use it in my committer notes for this patch. - Arnaldo > > Before: > ld r12,32088(r12) > mtctr r12 > v bctr ca2c > stdr2,24(r1) > addis r12,r2,-1 > > After: > ld r12,32088(r12) > mtctr r12 > v bctr > stdr2,24(r1) > addis r12,r2,-1 > > Suggested-by: Michael Ellerman > Signed-off-by: Ravi Bangoria > --- > Changes in v8: > - v7: https://lkml.org/lkml/2016/9/21/436 > - Rebase to acme/perf/core > - No logical changes. (Cross arch annotate patches are in. This patch > is for hardening annotate for powerpc.) > > tools/perf/util/annotate.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c > index 4012b1d..ea7e0de 100644 > --- a/tools/perf/util/annotate.c > +++ b/tools/perf/util/annotate.c > @@ -237,6 +237,9 @@ static int jump__parse(struct arch *arch __maybe_unused, > struct ins_operands *op > static int jump__scnprintf(struct ins *ins, char *bf, size_t size, > struct ins_operands *ops) > { > + if (!ops->target.addr) > + return ins__raw_scnprintf(ins, bf, size, ops); > + > return scnprintf(bf, size, "%-6.6s %" PRIx64, ins->name, > ops->target.offset); > } > > -- > 2.4.11
Re: [PATCH] PPC: sstep.c: Add modsw, moduw instruction emulation
Hi Naveen, Thanks for the review. >> --- >> arch/powerpc/lib/sstep.c | 9 + >> 1 file changed, 9 insertions(+) >> >> diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c >> index 9c78a9c..5acef72 100644 --- a/arch/powerpc/lib/sstep.c >> +++ b/arch/powerpc/lib/sstep.c >> @@ -1148,6 +1148,15 @@ int __kprobes analyse_instr(struct instruction_op >> *op, struct pt_regs *regs, >> (int) regs->gpr[rb]; >> goto arith_done; >> >> + case 779: /* modsw */ >> + regs->gpr[rd] = (int) regs->gpr[ra] % >> + (int) regs->gpr[rb]; >> + goto arith_done; > > Since these instructions don't update CR, you can directly goto > instr_done. Sure. Will use that. >> + >> + case 267: /* moduw */ > > Please move this case further up so that the extended opcodes are in > numerical order. I placed it after the divide instruction, as that seemed logical. I also placed 267 below 779, since that is the order in which the instructions are documented in the ISA book; this may help keep related instructions together. If this style is not preferred, I can arrange them in numerical order. Regards, PrasannaKumar
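The semantics being emulated can be sketched in plain C. This is a hedged model of the casts in the hunk above; the emu_* names are illustrative, not kernel symbols, and the sign extension models what assigning a signed int to a 64-bit GPR implies.

```c
#include <assert.h>
#include <stdint.h>

/* modsw: signed 32-bit modulo, result sign-extended into the 64-bit GPR. */
static uint64_t emu_modsw(uint64_t ra, uint64_t rb)
{
	return (uint64_t)(int64_t)((int32_t)ra % (int32_t)rb);
}

/* moduw: unsigned 32-bit modulo, result zero-extended. */
static uint64_t emu_moduw(uint64_t ra, uint64_t rb)
{
	return (uint32_t)ra % (uint32_t)rb;
}
```

As with the ISA definition, the signed remainder takes the sign of the dividend: emu_modsw(-7, 3) leaves -1 (sign-extended) in the register, while emu_moduw treats the same low 32 bits as 0xfffffff9 and yields 0.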
Re: [PATCH v8 1/3] perf annotate: Show raw form for jump instruction with indirect target
Em Mon, Dec 05, 2016 at 09:26:45PM +0530, Ravi Bangoria escreveu: > For jump instructions that does not include target address as direct > operand, show the original disassembled line for them. This is needed > for certain powerpc jump instructions that use target address in a > register (such as bctr, btar, ...). Please, mention the name of the function where you copy annotated examples from, so that I can reproduce it here, using the files you provided (perf.data and vmlinux for powerpc). Searching one such function now... > Before: > ld r12,32088(r12) > mtctr r12 > v bctr ca2c > stdr2,24(r1) > addis r12,r2,-1 > > After: > ld r12,32088(r12) > mtctr r12 > v bctr > stdr2,24(r1) > addis r12,r2,-1 > > Suggested-by: Michael Ellerman > Signed-off-by: Ravi Bangoria > --- > Changes in v8: > - v7: https://lkml.org/lkml/2016/9/21/436 > - Rebase to acme/perf/core > - No logical changes. (Cross arch annotate patches are in. This patch > is for hardening annotate for powerpc.) > > tools/perf/util/annotate.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c > index 4012b1d..ea7e0de 100644 > --- a/tools/perf/util/annotate.c > +++ b/tools/perf/util/annotate.c > @@ -237,6 +237,9 @@ static int jump__parse(struct arch *arch __maybe_unused, > struct ins_operands *op > static int jump__scnprintf(struct ins *ins, char *bf, size_t size, > struct ins_operands *ops) > { > + if (!ops->target.addr) > + return ins__raw_scnprintf(ins, bf, size, ops); > + > return scnprintf(bf, size, "%-6.6s %" PRIx64, ins->name, > ops->target.offset); > } > > -- > 2.4.11
Re: [PATCH v2] of/irq: improve error report on irq discovery process failure
On 12/05/2016 12:28 PM, Rob Herring wrote: > On Mon, Dec 5, 2016 at 7:59 AM, Guilherme G. Piccoli > wrote: >> On PowerPC machines some PCI slots might not have level triggered >> interrupts capability (also know as level signaled interrupts), >> leading of_irq_parse_pci() to complain by presenting error messages >> on the kernel log - in this case, the properties "interrupt-map" and >> "interrupt-map-mask" are not present on device's node in the device >> tree. >> >> This patch introduces a different message for this specific case, >> and also reduces its level from error to warning. Besides, we warn >> (once) that possibly some PCI slots on the system have no level >> triggered interrupts available. >> We changed some error return codes too on function of_irq_parse_raw() >> in order other failure's cases can be presented in a more precise way. >> >> Before this patch, when an adapter was plugged in a slot without level >> interrupts capabilitiy on PowerPC, we saw a generic error message >> like this: >> >> [54.239] pci 002d:70:00.0: of_irq_parse_pci() failed with rc=-22 >> >> Now, with this applied, we see the following specific message: >> >> [16.154] pci 0014:60:00.1: of_irq_parse_pci: no interrupt-map found, >> INTx interrupts not available >> >> Finally, we standardize the error path in of_irq_parse_raw() by always >> taking the fail path instead of returning directly from the loop. >> >> Signed-off-by: Guilherme G. Piccoli >> --- >> >> v2: >> * Changed function return code to always return negative values; > > Are you sure this is safe? This is tricky because of differing values > of NO_IRQ (0 or -1). Thanks Rob, but this is purely bad wording from myself. I'm sorry - I meant to say that I changed only my positive return code (that was suggested to be removed in the prior revision) to negative return code! 
So, I changed only code I added myself in v1 =) > >> * Improved/simplified warning outputs; >> * Changed some return codes and some error paths in of_irq_parse_raw() >> in order to be more precise/consistent; > > This too could have some side effects on callers. > > Not saying don't do these changes, just need some assurances this has > been considered. Thanks for your attention. I performed a quick investigation before changing this: all the places that use the return values only extract "true/false" information from them, meaning they basically just compare against 0. So changing -EINVAL to -ENOENT wouldn't hurt any user of these return values; it only becomes more informative IMHO. Now, regarding the only error path that was changed: for some reason, this was the only place in which we didn't goto the fail label in case of failure - it was added by a legacy commit from Ben, dating from 2006: 006b64de60 ("[POWERPC] Make OF irq map code detect more error cases"). It was then carried over by Grant Likely's commit 7dc2e1134a ("of/irq: merge irq mapping code"), a 6-year-old commit. I wasn't able to imagine a scenario in which changing this would break something; I believe the change improves consistency, but I'd drop it if you or somebody else thinks it's worth removing. Cheers, Guilherme > > Rob >
Re: [PATCH 1/1] serial/uuc_uart: Set shutdown timeout to CONFIG_HZ independent 2ms
Alexander Stein wrote: - schedule_timeout(2); + schedule_timeout(msecs_to_jiffies(2)); NACK. So I don't remember why I wrote this code, but I don't think I was expecting it to be 2ms. Instead, I think I just wanted it to be some delay, but I believed that schedule_timeout(1) was too short or would be "optimized" out somehow. Note that right below this, I do: if (qe_port->wait_closing) { /* Wait a bit longer */ set_current_state(TASK_UNINTERRUPTIBLE); schedule_timeout(qe_port->wait_closing); } And wait_closing is a number of jiffies, so I knew that schedule_timeout() took jiffies as a parameter. So I think I'm going to NACK this patch, since I believe I knew what I was doing when I wrote it five years ago.
[PATCH 1/1] serial/uuc_uart: Set shutdown timeout to CONFIG_HZ independent 2ms
schedule_timeout takes a timeout in jiffies resolution. So pass 2ms as a converted jiffies value. This makes the timeout independent of CONFIG_HZ. Signed-off-by: Alexander Stein --- drivers/tty/serial/ucc_uart.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/tty/serial/ucc_uart.c b/drivers/tty/serial/ucc_uart.c index 481eb29..c6c01a4 100644 --- a/drivers/tty/serial/ucc_uart.c +++ b/drivers/tty/serial/ucc_uart.c @@ -827,7 +827,7 @@ static void qe_uart_shutdown(struct uart_port *port) break; } set_current_state(TASK_UNINTERRUPTIBLE); - schedule_timeout(2); + schedule_timeout(msecs_to_jiffies(2)); } if (qe_port->wait_closing) { -- 2.7.3
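The HZ arithmetic under discussion can be sketched in userspace. The rounding below is an assumed, simplified model of the kernel's msecs_to_jiffies() (round up so a non-zero delay never truncates to zero ticks), not the kernel implementation itself:

```c
#include <assert.h>

/* Hedged, simplified model of msecs_to_jiffies(): round up so a
 * non-zero delay never becomes zero ticks. Illustrative only. */
static unsigned long ms_to_jiffies(unsigned int ms, unsigned int hz)
{
	return (ms * (unsigned long)hz + 999) / 1000;
}

/* What the original schedule_timeout(2) sleeps: two timer ticks,
 * i.e. 2 * (1000 / HZ) milliseconds, hence the CONFIG_HZ dependence. */
static unsigned int raw_two_ticks_ms(unsigned int hz)
{
	return 2 * 1000 / hz;
}
```

At HZ=100 a raw count of 2 ticks is about 20ms while msecs_to_jiffies(2) rounds up to a single 10ms tick; at HZ=1000 both are 2ms. That spread is the HZ dependence the patch targeted, and msleep(2), as suggested in the follow-up, expresses the same intent even more directly.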
[PATCH v8 3/3] perf annotate: Fix jump target outside of function address range
If the jump target is outside of the function's address range, perf does not handle it correctly. In particular, when the target address is less than the function start address, the target offset will be negative. But since the target offset is declared as unsigned, the negative number is converted into its 2's complement. See the example below. Here the target of the 'jmpq' instruction at 34cf8 is 34ac0, which is less than the function start address (34cf0). 34ac0 - 34cf0 = -0x230 = 0xfffffffffffffdd0 Objdump output: 00034cf0 <__sigaction>: __GI___sigaction(): 34cf0: lea -0x20(%rdi),%eax 34cf3: cmp $0x1,%eax 34cf6: jbe 34d00 <__sigaction+0x10> 34cf8: jmpq 34ac0 <__GI___libc_sigaction> 34cfd: nopl (%rax) 34d00: mov 0x386161(%rip),%rax # 3bae68 <_DYNAMIC+0x2e8> 34d07: movl $0x16,%fs:(%rax) 34d0e: mov $0xffffffff,%eax 34d13: retq perf annotate before applying patch: __GI___sigaction /usr/lib64/libc-2.22.so lea -0x20(%rdi),%eax cmp $0x1,%eax v jbe 10 v jmpq fffffffffffffdd0 nop 10: mov _DYNAMIC+0x2e8,%rax movl $0x16,%fs:(%rax) mov $0xffffffff,%eax retq perf annotate after applying patch: __GI___sigaction /usr/lib64/libc-2.22.so lea -0x20(%rdi),%eax cmp $0x1,%eax v jbe 10 ^ jmpq 34ac0 <__GI___libc_sigaction> nop 10: mov _DYNAMIC+0x2e8,%rax movl $0x16,%fs:(%rax) mov $0xffffffff,%eax retq Signed-off-by: Ravi Bangoria --- Changes in v8: - v7: https://lkml.org/lkml/2016/9/21/436 - Rebased to acme/perf/core. - No logical changes. (Cross arch annotate patches are in. This patch is for hardening annotate.)
tools/perf/ui/browsers/annotate.c | 5 +++-- tools/perf/util/annotate.c| 14 +- tools/perf/util/annotate.h| 5 +++-- 3 files changed, 15 insertions(+), 9 deletions(-) diff --git a/tools/perf/ui/browsers/annotate.c b/tools/perf/ui/browsers/annotate.c index ec7a30f..ba36aac 100644 --- a/tools/perf/ui/browsers/annotate.c +++ b/tools/perf/ui/browsers/annotate.c @@ -215,7 +215,7 @@ static void annotate_browser__write(struct ui_browser *browser, void *entry, int ui_browser__set_color(browser, color); if (dl->ins.ops && dl->ins.ops->scnprintf) { if (ins__is_jump(&dl->ins)) { - bool fwd = dl->ops.target.offset > (u64)dl->offset; + bool fwd = dl->ops.target.offset > dl->offset; ui_browser__write_graph(browser, fwd ? SLSMG_DARROW_CHAR : SLSMG_UARROW_CHAR); @@ -245,7 +245,8 @@ static bool disasm_line__is_valid_jump(struct disasm_line *dl, struct symbol *sy { if (!dl || !dl->ins.ops || !ins__is_jump(&dl->ins) || !disasm_line__has_offset(dl) - || dl->ops.target.offset >= symbol__size(sym)) + || dl->ops.target.offset < 0 + || dl->ops.target.offset >= (s64)symbol__size(sym)) return false; return true; diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c index 590244e..c81a395 100644 --- a/tools/perf/util/annotate.c +++ b/tools/perf/util/annotate.c @@ -230,10 +230,12 @@ static int jump__parse(struct arch *arch __maybe_unused, struct ins_operands *op else ops->target.addr = strtoull(ops->raw, NULL, 16); - if (s++ != NULL) + if (s++ != NULL) { ops->target.offset = strtoull(s, NULL, 16); - else - ops->target.offset = UINT64_MAX; + ops->target.offset_avail = true; + } else { + ops->target.offset_avail = false; + } return 0; } @@ -241,7 +243,7 @@ static int jump__parse(struct arch *arch __maybe_unused, struct ins_operands *op static int jump__scnprintf(struct ins *ins, char *bf, size_t size, struct ins_operands *ops) { - if (!ops->target.addr) + if (!ops->target.addr || ops->target.offset < 0) return ins__raw_scnprintf(ins, bf, size, ops); return scnprintf(bf, size, 
"%-6.6s %" PRIx64, ins->name, ops->target.offset); @@ -1209,9 +1211,11 @@ static int symbol__parse_objdump_line(struct symbol *sym, struct map *map, if (dl == NULL) return -1; - if (dl->ops.target.offset == UINT64_MAX) + if (!disasm_line__has_offset(dl)) { dl->ops.target.offset = dl->ops.target.addr - map__rip_2objdump(map, sym->start); + dl->ops.target.offset_avail = true; + } /* kcore has no symbols, so add the call target name */ if (dl->ins.ops && ins__is_call(&dl->ins) && !dl->ops.target.name) { diff --git a/tools
[PATCH v8 1/3] perf annotate: Show raw form for jump instruction with indirect target
For jump instructions that do not include the target address as a direct operand, show the original disassembled line. This is needed for certain powerpc jump instructions that take the target address from a register (such as bctr, btar, ...). Before: ld r12,32088(r12) mtctr r12 v bctr ca2c std r2,24(r1) addis r12,r2,-1 After: ld r12,32088(r12) mtctr r12 v bctr std r2,24(r1) addis r12,r2,-1 Suggested-by: Michael Ellerman Signed-off-by: Ravi Bangoria --- Changes in v8: - v7: https://lkml.org/lkml/2016/9/21/436 - Rebase to acme/perf/core - No logical changes. (Cross arch annotate patches are in. This patch is for hardening annotate for powerpc.) tools/perf/util/annotate.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c index 4012b1d..ea7e0de 100644 --- a/tools/perf/util/annotate.c +++ b/tools/perf/util/annotate.c @@ -237,6 +237,9 @@ static int jump__parse(struct arch *arch __maybe_unused, struct ins_operands *op static int jump__scnprintf(struct ins *ins, char *bf, size_t size, struct ins_operands *ops) { + if (!ops->target.addr) + return ins__raw_scnprintf(ins, bf, size, ops); + return scnprintf(bf, size, "%-6.6s %" PRIx64, ins->name, ops->target.offset); } -- 2.4.11
[PATCH v8 2/3] perf annotate: Support jump instruction with target as second operand
Arches like powerpc have jump instructions that include the target address as the second operand, for example 'bne cr7,0xc00f6154'. Add support for such instructions in perf annotate. objdump o/p: c00f6140: ld r9,1032(r31) c00f6144: cmpdi cr7,r9,0 c00f6148: bne cr7,0xc00f6154 c00f614c: ld r9,2312(r30) c00f6150: std r9,1032(r31) c00f6154: ld r9,88(r31) Corresponding perf annotate o/p: Before patch: ld r9,1032(r31) cmpdi cr7,r9,0 v bne 3ff09f2c ld r9,2312(r30) std r9,1032(r31) 74: ld r9,88(r31) After patch: ld r9,1032(r31) cmpdi cr7,r9,0 v bne 74 ld r9,2312(r30) std r9,1032(r31) 74: ld r9,88(r31) Signed-off-by: Ravi Bangoria --- Changes in v8: - v7: https://lkml.org/lkml/2016/9/21/436 - Rebase to acme/perf/core - Little change in patch description. - No logical changes. (Cross arch annotate patches are in. This patch is for hardening annotate for powerpc.) tools/perf/util/annotate.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c index ea7e0de..590244e 100644 --- a/tools/perf/util/annotate.c +++ b/tools/perf/util/annotate.c @@ -223,8 +223,12 @@ bool ins__is_call(const struct ins *ins) static int jump__parse(struct arch *arch __maybe_unused, struct ins_operands *ops, struct map *map __maybe_unused) { const char *s = strchr(ops->raw, '+'); + const char *c = strchr(ops->raw, ','); - ops->target.addr = strtoull(ops->raw, NULL, 16); + if (c++ != NULL) + ops->target.addr = strtoull(c, NULL, 16); + else + ops->target.addr = strtoull(ops->raw, NULL, 16); if (s++ != NULL) ops->target.offset = strtoull(s, NULL, 16); -- 2.4.11
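The parsing change boils down to: if the raw operand string contains a comma, take the hex target from after it; otherwise parse the whole string. A userspace mirror of that logic (the function name is illustrative):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* If the raw operands contain a comma, the target address is the
 * second operand (e.g. "cr7,0xc00f6154"); otherwise the whole
 * string is the target (e.g. "34d00" on x86). */
static unsigned long long parse_jump_target(const char *raw)
{
	const char *c = strchr(raw, ',');

	return strtoull(c ? c + 1 : raw, NULL, 16);
}
```

strtoull() with base 16 accepts both the 0x-prefixed powerpc form and the bare hex offsets x86 objdump emits, so the same helper covers both cases.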
Re: [PATCH v2] of/irq: improve error report on irq discovery process failure
On Mon, Dec 5, 2016 at 7:59 AM, Guilherme G. Piccoli wrote: > On PowerPC machines some PCI slots might not have level triggered > interrupts capability (also know as level signaled interrupts), > leading of_irq_parse_pci() to complain by presenting error messages > on the kernel log - in this case, the properties "interrupt-map" and > "interrupt-map-mask" are not present on device's node in the device > tree. > > This patch introduces a different message for this specific case, > and also reduces its level from error to warning. Besides, we warn > (once) that possibly some PCI slots on the system have no level > triggered interrupts available. > We changed some error return codes too on function of_irq_parse_raw() > in order other failure's cases can be presented in a more precise way. > > Before this patch, when an adapter was plugged in a slot without level > interrupts capabilitiy on PowerPC, we saw a generic error message > like this: > > [54.239] pci 002d:70:00.0: of_irq_parse_pci() failed with rc=-22 > > Now, with this applied, we see the following specific message: > > [16.154] pci 0014:60:00.1: of_irq_parse_pci: no interrupt-map found, > INTx interrupts not available > > Finally, we standardize the error path in of_irq_parse_raw() by always > taking the fail path instead of returning directly from the loop. > > Signed-off-by: Guilherme G. Piccoli > --- > > v2: > * Changed function return code to always return negative values; Are you sure this is safe? This is tricky because of differing values of NO_IRQ (0 or -1). > * Improved/simplified warning outputs; > * Changed some return codes and some error paths in of_irq_parse_raw() > in order to be more precise/consistent; This too could have some side effects on callers. Not saying don't do these changes, just need some assurances this has been considered. Rob
[PATCH v2] of/irq: improve error report on irq discovery process failure
On PowerPC machines some PCI slots might not have level-triggered interrupt capability (also known as level-signaled interrupts), leading of_irq_parse_pci() to complain by printing error messages to the kernel log - in this case, the properties "interrupt-map" and "interrupt-map-mask" are not present on the device's node in the device tree. This patch introduces a different message for this specific case, and also reduces its level from error to warning. Besides, we warn (once) that possibly some PCI slots on the system have no level-triggered interrupts available. We also changed some error return codes in of_irq_parse_raw() so that other failure cases can be reported more precisely. Before this patch, when an adapter was plugged into a slot without level-interrupt capability on PowerPC, we saw a generic error message like this: [54.239] pci 002d:70:00.0: of_irq_parse_pci() failed with rc=-22 Now, with this applied, we see the following specific message: [16.154] pci 0014:60:00.1: of_irq_parse_pci: no interrupt-map found, INTx interrupts not available Finally, we standardize the error path in of_irq_parse_raw() by always taking the fail path instead of returning directly from the loop. Signed-off-by: Guilherme G. Piccoli --- v2: * Changed function return code to always return negative values; * Improved/simplified warning outputs; * Changed some return codes and some error paths in of_irq_parse_raw() in order to be more precise/consistent; drivers/of/irq.c| 19 --- drivers/of/of_pci_irq.c | 10 +- 2 files changed, 21 insertions(+), 8 deletions(-) diff --git a/drivers/of/irq.c b/drivers/of/irq.c index 393fea8..9deee86 100644 --- a/drivers/of/irq.c +++ b/drivers/of/irq.c @@ -104,7 +104,7 @@ int of_irq_parse_raw(const __be32 *addr, struct of_phandle_args *out_irq) const __be32 *match_array = initial_match_array; const __be32 *tmp, *imap, *imask, dummy_imask[] = { [0 ...
MAX_PHANDLE_ARGS] = ~0 }; u32 intsize = 1, addrsize, newintsize = 0, newaddrsize = 0; - int imaplen, match, i; + int imaplen, match, i, rc = -EINVAL; #ifdef DEBUG of_print_phandle_args("of_irq_parse_raw: ", out_irq); @@ -134,7 +134,7 @@ int of_irq_parse_raw(const __be32 *addr, struct of_phandle_args *out_irq) pr_debug("of_irq_parse_raw: ipar=%s, size=%d\n", of_node_full_name(ipar), intsize); if (out_irq->args_count != intsize) - return -EINVAL; + goto fail; /* Look for this #address-cells. We have to implement the old linux * trick of looking for the parent here as some device-trees rely on it @@ -153,8 +153,10 @@ int of_irq_parse_raw(const __be32 *addr, struct of_phandle_args *out_irq) pr_debug(" -> addrsize=%d\n", addrsize); /* Range check so that the temporary buffer doesn't overflow */ - if (WARN_ON(addrsize + intsize > MAX_PHANDLE_ARGS)) + if (WARN_ON(addrsize + intsize > MAX_PHANDLE_ARGS)) { + rc = -EFAULT; goto fail; + } /* Precalculate the match array - this simplifies match loop */ for (i = 0; i < addrsize; i++) @@ -240,10 +242,11 @@ int of_irq_parse_raw(const __be32 *addr, struct of_phandle_args *out_irq) newintsize, newaddrsize); /* Check for malformed properties */ - if (WARN_ON(newaddrsize + newintsize > MAX_PHANDLE_ARGS)) - goto fail; - if (imaplen < (newaddrsize + newintsize)) + if (WARN_ON(newaddrsize + newintsize > MAX_PHANDLE_ARGS) + || (imaplen < (newaddrsize + newintsize))) { + rc = -EFAULT; goto fail; + } imap += newaddrsize + newintsize; imaplen -= newaddrsize + newintsize; @@ -271,11 +274,13 @@ int of_irq_parse_raw(const __be32 *addr, struct of_phandle_args *out_irq) ipar = newpar; newpar = NULL; } + rc = -ENOENT; /* No interrupt-map found */ + fail: of_node_put(ipar); of_node_put(newpar); - return -EINVAL; + return rc; } EXPORT_SYMBOL_GPL(of_irq_parse_raw); diff --git a/drivers/of/of_pci_irq.c b/drivers/of/of_pci_irq.c index 2306313..c175d9c 100644 --- a/drivers/of/of_pci_irq.c +++ b/drivers/of/of_pci_irq.c @@ -93,7 +93,15 @@ int 
of_irq_parse_pci(const struct pci_dev *pdev, struct of_phandle_args *out_irq goto err; return 0; err: - dev_err(&pdev->dev, "of_irq_parse_pci() failed with rc=%d\n", rc); + if (rc == -ENOENT) { + dev_warn(&pdev->dev, + "%s: no interrupt-map found, INTx interrupts not available\n", + __func__); + pr_warn_once("%s: possibly
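The error-path convention the patch standardizes on (pre-set rc, overwrite it at each failure site, then goto one fail label for common cleanup) can be sketched as follows. Names and conditions are illustrative, not the kernel's:

```c
#include <assert.h>
#include <errno.h>

/* Single-exit idiom: rc carries the most specific error code set
 * before jumping to the shared cleanup label. */
static int parse_example(int have_map, int malformed)
{
	int rc = -EINVAL;		/* default: bad input */

	if (have_map < 0)
		goto fail;		/* falls out with -EINVAL */
	if (malformed) {
		rc = -EFAULT;		/* malformed property */
		goto fail;
	}
	if (!have_map) {
		rc = -ENOENT;		/* no interrupt-map found */
		goto fail;
	}
	return 0;
fail:
	/* common cleanup (of_node_put() in the real code) goes here */
	return rc;
}
```

Callers that only test for non-zero keep working unchanged, while callers that care can now distinguish -ENOENT (no interrupt-map) from genuinely malformed input.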
[PATCH] cxl: prevent read/write to AFU config space while AFU not configured
During EEH recovery, we deconfigure all AFUs whilst leaving the corresponding vPHB and virtual PCI device in place. If something attempts to interact with the AFU's PCI config space (e.g. running lspci) after the AFU has been deconfigured and before it's reconfigured, cxl_pcie_{read,write}_config() will read invalid values from the deconfigured struct cxl_afu and proceed to Oops when they try to dereference pointers that have been set to NULL during deconfiguration. Add a rwsem to struct cxl_afu so we can prevent interaction with config space while the AFU is deconfigured. Reported-by: Pradipta Ghosh Suggested-by: Frederic Barrat Cc: sta...@vger.kernel.org # 4.4+ Signed-off-by: Andrew Donnellan --- Pradipta found this while doing testing for cxlflash. I've tested this patch and I'm satisfied that it solves the issue, but I've asked Pradipta to test it a bit further. --- drivers/misc/cxl/cxl.h | 2 ++ drivers/misc/cxl/main.c | 3 ++- drivers/misc/cxl/pci.c | 2 ++ drivers/misc/cxl/vphb.c | 11 ++- 4 files changed, 16 insertions(+), 2 deletions(-) diff --git a/drivers/misc/cxl/cxl.h b/drivers/misc/cxl/cxl.h index a144073..379c463 100644 --- a/drivers/misc/cxl/cxl.h +++ b/drivers/misc/cxl/cxl.h @@ -418,6 +418,8 @@ struct cxl_afu { struct dentry *debugfs; struct mutex contexts_lock; spinlock_t afu_cntl_lock; + /* Used to block access to AFU config space while deconfigured */ + struct rw_semaphore configured_rwsem; /* AFU error buffer fields and bin attribute for sysfs */ u64 eb_len, eb_offset; diff --git a/drivers/misc/cxl/main.c b/drivers/misc/cxl/main.c index 62e0dfb..2a6bf1d 100644 --- a/drivers/misc/cxl/main.c +++ b/drivers/misc/cxl/main.c @@ -268,7 +268,8 @@ struct cxl_afu *cxl_alloc_afu(struct cxl *adapter, int slice) idr_init(&afu->contexts_idr); mutex_init(&afu->contexts_lock); spin_lock_init(&afu->afu_cntl_lock); - + init_rwsem(&afu->configured_rwsem); + down_write(&afu->configured_rwsem); afu->prefault_mode = CXL_PREFAULT_NONE; afu->irqs_max = 
afu->adapter->user_irqs; diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c index c4d79b5d..c7b2121 100644 --- a/drivers/misc/cxl/pci.c +++ b/drivers/misc/cxl/pci.c @@ -1129,6 +1129,7 @@ static int pci_configure_afu(struct cxl_afu *afu, struct cxl *adapter, struct pc if ((rc = cxl_native_register_psl_irq(afu))) goto err2; + up_write(&afu->configured_rwsem); return 0; err2: @@ -1141,6 +1142,7 @@ static int pci_configure_afu(struct cxl_afu *afu, struct cxl *adapter, struct pc static void pci_deconfigure_afu(struct cxl_afu *afu) { + down_write(&afu->configured_rwsem); cxl_native_release_psl_irq(afu); if (afu->adapter->native->sl_ops->release_serr_irq) afu->adapter->native->sl_ops->release_serr_irq(afu); diff --git a/drivers/misc/cxl/vphb.c b/drivers/misc/cxl/vphb.c index 3519ace..d79aba5 100644 --- a/drivers/misc/cxl/vphb.c +++ b/drivers/misc/cxl/vphb.c @@ -88,9 +88,16 @@ static int cxl_pcie_config_info(struct pci_bus *bus, unsigned int devfn, return PCIBIOS_DEVICE_NOT_FOUND; afu = (struct cxl_afu *)phb->private_data; + + /* Grab a reader lock on afu. We rely on the caller to release this! */ + if (!down_read_trylock(&afu->configured_rwsem)) + return PCIBIOS_DEVICE_NOT_FOUND; + record = cxl_pcie_cfg_record(bus->number, devfn); - if (record > afu->crs_num) + if (record > afu->crs_num) { + up_read(&afu->configured_rwsem); return PCIBIOS_DEVICE_NOT_FOUND; + } *_afu = afu; *_record = record; @@ -127,6 +134,7 @@ static int cxl_pcie_read_config(struct pci_bus *bus, unsigned int devfn, WARN_ON(1); } + up_read(&afu->configured_rwsem); /* locked in cxl_pcie_config_info() */ if (rc) return PCIBIOS_DEVICE_NOT_FOUND; @@ -157,6 +165,7 @@ static int cxl_pcie_write_config(struct pci_bus *bus, unsigned int devfn, WARN_ON(1); } + up_read(&afu->configured_rwsem); /* locked in cxl_pcie_config_info() */ if (rc) return PCIBIOS_SET_FAILED; -- Andrew Donnellan OzLabs, ADL Canberra andrew.donnel...@au1.ibm.com IBM Australia Limited
Re: [RFC PATCH] PCI: designware: add host_init() error handling
At 11:51 AM on 12/2/2016, Srinivas Kandagatla wrote: > > > On 02/12/16 10:32, Joao Pinto wrote: >> >> Hi Srinivas, >> >> At 11:51 AM on 12/1/2016, Srinivas Kandagatla wrote: >>> drivers/pci/host/pci-dra7xx.c | 4 +++- >>> drivers/pci/host/pci-exynos.c | 4 +++- >>> drivers/pci/host/pci-imx6.c | 4 +++- >>> drivers/pci/host/pci-keystone.c | 4 +++- >>> drivers/pci/host/pci-layerscape.c | 12 >>> drivers/pci/host/pcie-armada8k.c| 4 +++- >>> drivers/pci/host/pcie-designware-plat.c | 4 +++- >>> drivers/pci/host/pcie-designware.c | 4 +++- >>> drivers/pci/host/pcie-designware.h | 2 +- >>> drivers/pci/host/pcie-qcom.c| 6 -- >>> drivers/pci/host/pcie-spear13xx.c | 4 +++- >>> 11 files changed, 37 insertions(+), 15 deletions(-) >>> >> >> Thanks for the patch! >> >> In my opinion your idea is good, but only the qcom driver is able to detect >> failure in its specific host init routine; all the others 'return 0' even if >> something was not initialized properly. I would recommend that we take this >> issue a bit further and add the error checking to all the specific pci drivers >> in order to make them as robust as qcom's. > I totally agree with you, I can give this a go in next version. Sure, but I think it would be better to finish it now, since we are on top of the task. I can help you if you need. Thanks Joao > > Thanks, > srini >
[PATCH v8 6/6] powerpc/pv-qspinlock: Optimise native unlock path
Avoid a function call in the native version of qspinlock. On PowerNV, before applying this patch, every unlock was expensive. This small optimization enhances the performance. We use a static_key with jump_label, which removes unnecessary loads of the lppaca and related data. Signed-off-by: Pan Xinhui --- arch/powerpc/include/asm/qspinlock_paravirt.h | 18 +- arch/powerpc/kernel/paravirt.c| 4 2 files changed, 21 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/include/asm/qspinlock_paravirt.h b/arch/powerpc/include/asm/qspinlock_paravirt.h index d87cda0..8d39446 100644 --- a/arch/powerpc/include/asm/qspinlock_paravirt.h +++ b/arch/powerpc/include/asm/qspinlock_paravirt.h @@ -6,12 +6,14 @@ #define _ASM_QSPINLOCK_PARAVIRT_H #include +#include extern void pv_lock_init(void); extern void native_queued_spin_lock_slowpath(struct qspinlock *lock, u32 val); extern void __pv_init_lock_hash(void); extern void __pv_queued_spin_lock_slowpath(struct qspinlock *lock, u32 val); extern void __pv_queued_spin_unlock(struct qspinlock *lock); +extern struct static_key_true sharedprocessor_key; static inline void pv_queued_spin_lock(struct qspinlock *lock, u32 val) { @@ -20,7 +22,21 @@ static inline void pv_queued_spin_lock(struct qspinlock *lock, u32 val) static inline void pv_queued_spin_unlock(struct qspinlock *lock) { - pv_lock_op.unlock(lock); + /* +* on powerNV and pSeries with jump_label, code will be +* PowerNV:pSeries: +* nop;b 2f; +* native unlock 2: +* pv unlock; +* In this way, we can do unlock quick in native case. +* +* IF jump_label is not enabled, we fall back into +* if condition, IOW, ld && cmp && bne. 
+*/ + if (static_branch_likely(&sharedprocessor_key)) + native_queued_spin_unlock(lock); + else + pv_lock_op.unlock(lock); } static inline void pv_wait(u8 *ptr, u8 val) diff --git a/arch/powerpc/kernel/paravirt.c b/arch/powerpc/kernel/paravirt.c index e697b17..a0a000e 100644 --- a/arch/powerpc/kernel/paravirt.c +++ b/arch/powerpc/kernel/paravirt.c @@ -140,6 +140,9 @@ struct pv_lock_ops pv_lock_op = { }; EXPORT_SYMBOL(pv_lock_op); +struct static_key_true sharedprocessor_key = STATIC_KEY_TRUE_INIT; +EXPORT_SYMBOL(sharedprocessor_key); + void __init pv_lock_init(void) { if (SHARED_PROCESSOR) { @@ -149,5 +152,6 @@ void __init pv_lock_init(void) pv_lock_op.unlock = __pv_queued_spin_unlock; pv_lock_op.wait = __pv_wait; pv_lock_op.kick = __pv_kick; + static_branch_disable(&sharedprocessor_key); } } -- 2.4.11
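The dispatch being optimized can be modeled with a plain boolean standing in for the static key. This is an illustrative sketch only: the real jump_label patches the branch into a nop (PowerNV) or an unconditional jump (pSeries), so the native path pays no conditional at all, whereas the bool below is a real compare:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static bool shared_processor;	/* true on shared-processor guests */
static int pv_unlock_calls;	/* instrumentation for this sketch */

static void native_unlock(atomic_int *lock)
{
	/* release semantics, as in native_queued_spin_unlock() */
	atomic_store_explicit(lock, 0, memory_order_release);
}

static void pv_unlock(atomic_int *lock)
{
	atomic_store_explicit(lock, 0, memory_order_release);
	pv_unlock_calls++;	/* the real slowpath may also kick a waiter */
}

static void queued_spin_unlock(atomic_int *lock)
{
	if (!shared_processor)
		native_unlock(lock);	/* fast inline store on PowerNV */
	else
		pv_unlock(lock);	/* pv slowpath on shared processors */
}

/* Helper for exercising the dispatch. */
static int demo_unlock(bool shared)
{
	atomic_int lock;

	atomic_store(&lock, 1);
	shared_processor = shared;
	queued_spin_unlock(&lock);
	return atomic_load(&lock);
}
```

Without jump_label the kernel falls back to exactly this kind of load-compare-branch sequence, which is the "ld && cmp && bne" cost the comment in the hunk above describes.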
[PATCH v8 5/6] powerpc: pSeries: Add pv-qspinlock build config/make
pSeries runs as a guest and might need pv-qspinlock. Signed-off-by: Pan Xinhui --- arch/powerpc/kernel/Makefile | 1 + arch/powerpc/platforms/pseries/Kconfig | 8 2 files changed, 9 insertions(+) diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile index 1925341..4780415 100644 --- a/arch/powerpc/kernel/Makefile +++ b/arch/powerpc/kernel/Makefile @@ -53,6 +53,7 @@ obj-$(CONFIG_PPC_970_NAP) += idle_power4.o obj-$(CONFIG_PPC_P7_NAP) += idle_book3s.o procfs-y := proc_powerpc.o obj-$(CONFIG_PROC_FS) += $(procfs-y) +obj-$(CONFIG_PARAVIRT_SPINLOCKS) += paravirt.o rtaspci-$(CONFIG_PPC64)-$(CONFIG_PCI) := rtas_pci.o obj-$(CONFIG_PPC_RTAS) += rtas.o rtas-rtc.o $(rtaspci-y-y) obj-$(CONFIG_PPC_RTAS_DAEMON) += rtasd.o diff --git a/arch/powerpc/platforms/pseries/Kconfig b/arch/powerpc/platforms/pseries/Kconfig index 8a87d06..0288c78 100644 --- a/arch/powerpc/platforms/pseries/Kconfig +++ b/arch/powerpc/platforms/pseries/Kconfig @@ -31,6 +31,14 @@ config ARCH_USE_QUEUED_SPINLOCKS fairlock. It has shown a good performance improvement on x86 and also ppc especially in high contention cases. +config PARAVIRT_SPINLOCKS + bool "Paravirtialization support for qspinlock" + depends on PPC_SPLPAR && QUEUED_SPINLOCKS + default y + help + If kernel need run as a guest then enable this option. + Generally it can let kernel have a better performace. + config PPC_SPLPAR depends on PPC_PSERIES bool "Support for shared-processor logical partitions" -- 2.4.11
[PATCH v8 4/6] powerpc/pv-qspinlock: powerpc support pv-qspinlock
The default pv-qspinlock uses qspinlock (the native version of pv-qspinlock). pv_lock initialization should be done at boot stage with irqs disabled. And if we run as a guest with powerKVM/pHyp shared_processor mode, restore the pv_lock_ops callbacks to pv-qspinlock (the pv version), which makes full use of virtualization. There is a hash table; we store the cpu number into it, keyed by the lock. So every time, pv_wait can find out who the lock holder is by looking up the lock. Also store the lock in a per_cpu struct, and remove it once we own the lock; then pv_wait can know which lock we are spinning on. But the cpu in the hash table might not be the correct lock holder, as, for performance reasons, we do not handle hash conflicts. Also introduce spin_lock_holder, which tells who owns the lock now. Currently the only user is spin_unlock_wait. Signed-off-by: Pan Xinhui --- arch/powerpc/include/asm/qspinlock.h | 29 +++- arch/powerpc/include/asm/qspinlock_paravirt.h | 36 + .../powerpc/include/asm/qspinlock_paravirt_types.h | 13 ++ arch/powerpc/kernel/paravirt.c | 153 + arch/powerpc/lib/locks.c | 8 +- arch/powerpc/platforms/pseries/setup.c | 5 + 6 files changed, 241 insertions(+), 3 deletions(-) create mode 100644 arch/powerpc/include/asm/qspinlock_paravirt.h create mode 100644 arch/powerpc/include/asm/qspinlock_paravirt_types.h create mode 100644 arch/powerpc/kernel/paravirt.c diff --git a/arch/powerpc/include/asm/qspinlock.h b/arch/powerpc/include/asm/qspinlock.h index 4c89256..8fd6349 100644 --- a/arch/powerpc/include/asm/qspinlock.h +++ b/arch/powerpc/include/asm/qspinlock.h @@ -15,7 +15,7 @@ static inline u8 *__qspinlock_lock_byte(struct qspinlock *lock) return (u8 *)lock + 3 * IS_BUILTIN(CONFIG_CPU_BIG_ENDIAN); } -static inline void queued_spin_unlock(struct qspinlock *lock) +static inline void native_queued_spin_unlock(struct qspinlock *lock) { /* release semantics is required */ smp_store_release(__qspinlock_lock_byte(lock), 0); @@ -27,6 +27,33 @@ static inline int
queued_spin_is_locked(struct qspinlock *lock) return atomic_read(&lock->val); } +#ifdef CONFIG_PARAVIRT_SPINLOCKS +#include +/* + * try to know who is the lock holder, however it is not always true + * Return: + * -1, we did not know the lock holder. + * other value, likely is the lock holder. + */ +extern int spin_lock_holder(void *lock); + +static inline void queued_spin_lock_slowpath(struct qspinlock *lock, u32 val) +{ + pv_queued_spin_lock(lock, val); +} + +static inline void queued_spin_unlock(struct qspinlock *lock) +{ + pv_queued_spin_unlock(lock); +} +#else +#define spin_lock_holder(l) (-1) +static inline void queued_spin_unlock(struct qspinlock *lock) +{ + native_queued_spin_unlock(lock); +} +#endif + #include /* we need override it as ppc has io_sync stuff */ diff --git a/arch/powerpc/include/asm/qspinlock_paravirt.h b/arch/powerpc/include/asm/qspinlock_paravirt.h new file mode 100644 index 000..d87cda0 --- /dev/null +++ b/arch/powerpc/include/asm/qspinlock_paravirt.h @@ -0,0 +1,36 @@ +#ifndef CONFIG_PARAVIRT_SPINLOCKS +#error "do not include this file" +#endif + +#ifndef _ASM_QSPINLOCK_PARAVIRT_H +#define _ASM_QSPINLOCK_PARAVIRT_H + +#include + +extern void pv_lock_init(void); +extern void native_queued_spin_lock_slowpath(struct qspinlock *lock, u32 val); +extern void __pv_init_lock_hash(void); +extern void __pv_queued_spin_lock_slowpath(struct qspinlock *lock, u32 val); +extern void __pv_queued_spin_unlock(struct qspinlock *lock); + +static inline void pv_queued_spin_lock(struct qspinlock *lock, u32 val) +{ + pv_lock_op.lock(lock, val); +} + +static inline void pv_queued_spin_unlock(struct qspinlock *lock) +{ + pv_lock_op.unlock(lock); +} + +static inline void pv_wait(u8 *ptr, u8 val) +{ + pv_lock_op.wait(ptr, val); +} + +static inline void pv_kick(int cpu) +{ + pv_lock_op.kick(cpu); +} + +#endif diff --git a/arch/powerpc/include/asm/qspinlock_paravirt_types.h b/arch/powerpc/include/asm/qspinlock_paravirt_types.h new file mode 100644 index 000..83611ed 
--- /dev/null +++ b/arch/powerpc/include/asm/qspinlock_paravirt_types.h @@ -0,0 +1,13 @@ +#ifndef _ASM_QSPINLOCK_PARAVIRT_TYPES_H +#define _ASM_QSPINLOCK_PARAVIRT_TYPES_H + +struct pv_lock_ops { + void (*lock)(struct qspinlock *lock, u32 val); + void (*unlock)(struct qspinlock *lock); + void (*wait)(u8 *ptr, u8 val); + void (*kick)(int cpu); +}; + +extern struct pv_lock_ops pv_lock_op; + +#endif diff --git a/arch/powerpc/kernel/paravirt.c b/arch/powerpc/kernel/paravirt.c new file mode 100644 index 000..e697b17 --- /dev/null +++ b/arch/powerpc/kernel/paravirt.c @@ -0,0 +1,153 @@ +/* + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundatio
[PATCH v8 3/6] powerpc: lib/locks.c: Add cpu yield/wake helper function
Add two helper functions to support pv-qspinlock. In the normal case, __spin_yield_cpu confers the current vcpu's slices to the target vcpu (say, a lock holder). If the target vcpu is not specified, or it is in the running state, whether we confer to the lpar is conditional: the hcall itself introduces latency and a little overhead, and in some cases, e.g. in an interrupt handler, we do NOT want to suffer any latency. The second parameter *confer* indicates such cases. __spin_wake_cpu is simpler: it wakes up the target vcpu regardless of its current state. Signed-off-by: Pan Xinhui --- arch/powerpc/include/asm/spinlock.h | 4 +++ arch/powerpc/lib/locks.c | 59 + 2 files changed, 63 insertions(+) diff --git a/arch/powerpc/include/asm/spinlock.h b/arch/powerpc/include/asm/spinlock.h index 954099e..6426bd5 100644 --- a/arch/powerpc/include/asm/spinlock.h +++ b/arch/powerpc/include/asm/spinlock.h @@ -64,9 +64,13 @@ static inline bool vcpu_is_preempted(int cpu) /* We only yield to the hypervisor if we are in shared processor mode */ #define SHARED_PROCESSOR (lppaca_shared_proc(local_paca->lppaca_ptr)) extern void __spin_yield(arch_spinlock_t *lock); +extern void __spin_yield_cpu(int cpu, int confer); +extern void __spin_wake_cpu(int cpu); extern void __rw_yield(arch_rwlock_t *lock); #else /* SPLPAR */ #define __spin_yield(x) barrier() +#define __spin_yield_cpu(x, y) barrier() +#define __spin_wake_cpu(x) barrier() #define __rw_yield(x) barrier() #define SHARED_PROCESSOR 0 #endif diff --git a/arch/powerpc/lib/locks.c b/arch/powerpc/lib/locks.c index 6574626..bd872c9 100644 --- a/arch/powerpc/lib/locks.c +++ b/arch/powerpc/lib/locks.c @@ -23,6 +23,65 @@ #include #include +/* + * confer our slices to a specified cpu and return. If it is in running state + * or cpu is -1, then we will check confer. If confer is NULL, we will return + * otherwise we confer our slices to lpar. 
+ */ +void __spin_yield_cpu(int cpu, int confer) +{ + unsigned int holder_cpu = cpu, yield_count; + + if (cpu == -1) + goto yield_to_lpar; + + BUG_ON(holder_cpu >= nr_cpu_ids); + yield_count = be32_to_cpu(lppaca_of(holder_cpu).yield_count); + + /* if cpu is running, confer slices to lpar conditionally*/ + if ((yield_count & 1) == 0) + goto yield_to_lpar; + + plpar_hcall_norets(H_CONFER, + get_hard_smp_processor_id(holder_cpu), yield_count); + return; + +yield_to_lpar: + if (confer) + plpar_hcall_norets(H_CONFER, -1, 0); +} +EXPORT_SYMBOL_GPL(__spin_yield_cpu); + +void __spin_wake_cpu(int cpu) +{ + unsigned int holder_cpu = cpu; + + BUG_ON(holder_cpu >= nr_cpu_ids); + /* +* NOTE: we should always do this hcall regardless of +* the yield_count of the holder_cpu. +* as thers might be a case like below; +* CPU 1 CPU 2 +* yielded = true +* if (yielded) +* __spin_wake_cpu() +* __spin_yield_cpu() +* +* So we might lose a wake if we check the yield_count and +* return directly if the holder_cpu is running. +* IOW. do NOT code like below. +* yield_count = be32_to_cpu(lppaca_of(holder_cpu).yield_count); +* if ((yield_count & 1) == 0) +* return; +* +* a PROD hcall marks the target_cpu proded, which cause the next cede +* or confer called on the target_cpu invalid. +*/ + plpar_hcall_norets(H_PROD, + get_hard_smp_processor_id(holder_cpu)); +} +EXPORT_SYMBOL_GPL(__spin_wake_cpu); + #ifndef CONFIG_QUEUED_SPINLOCKS void __spin_yield(arch_spinlock_t *lock) { -- 2.4.11
[PATCH v8 2/6] powerpc: pSeries/Kconfig: Add qspinlock build config
pSeries/powerNV will use qspinlock from now on.

Signed-off-by: Pan Xinhui
---
 arch/powerpc/platforms/pseries/Kconfig | 8
 1 file changed, 8 insertions(+)

diff --git a/arch/powerpc/platforms/pseries/Kconfig b/arch/powerpc/platforms/pseries/Kconfig
index bec90fb..8a87d06 100644
--- a/arch/powerpc/platforms/pseries/Kconfig
+++ b/arch/powerpc/platforms/pseries/Kconfig
@@ -23,6 +23,14 @@ config PPC_PSERIES
 	select PPC_DOORBELL
 	default y

+config ARCH_USE_QUEUED_SPINLOCKS
+	default y
+	bool "Enable qspinlock"
+	help
+	  Enabling this option will let the kernel use qspinlock, which is a
+	  kind of fair lock. It has shown a good performance improvement on
+	  x86 and also on ppc, especially in high contention cases.
+
 config PPC_SPLPAR
 	depends on PPC_PSERIES
 	bool "Support for shared-processor logical partitions"
-- 
2.4.11
[PATCH v8 1/6] powerpc/qspinlock: powerpc support qspinlock
This patch adds the basic code to enable qspinlock on powerpc. qspinlock is one kind of fair-lock implementation, and it has shown some performance improvement under certain scenarios. queued_spin_unlock() releases the lock with just one write of 0 to the ::locked field, which sits at a different offset in the lock word depending on endianness. We override some arch_spin_XXX functions, as powerpc has io_sync logic which makes sure the I/O operations are correctly protected by the lock. There is another special case; see commit 2c610022711 ("locking/qspinlock: Fix spin_unlock_wait() some more"). Signed-off-by: Pan Xinhui --- arch/powerpc/include/asm/qspinlock.h | 66 +++ arch/powerpc/include/asm/spinlock.h | 31 +-- arch/powerpc/include/asm/spinlock_types.h | 4 ++ arch/powerpc/lib/locks.c | 59 +++ 4 files changed, 147 insertions(+), 13 deletions(-) create mode 100644 arch/powerpc/include/asm/qspinlock.h diff --git a/arch/powerpc/include/asm/qspinlock.h b/arch/powerpc/include/asm/qspinlock.h new file mode 100644 index 000..4c89256 --- /dev/null +++ b/arch/powerpc/include/asm/qspinlock.h @@ -0,0 +1,66 @@ +#ifndef _ASM_POWERPC_QSPINLOCK_H +#define _ASM_POWERPC_QSPINLOCK_H + +#include + +#define SPIN_THRESHOLD (1 << 15) +#define queued_spin_unlock queued_spin_unlock +#define queued_spin_is_locked queued_spin_is_locked +#define queued_spin_unlock_wait queued_spin_unlock_wait + +extern void queued_spin_unlock_wait(struct qspinlock *lock); + +static inline u8 *__qspinlock_lock_byte(struct qspinlock *lock) +{ + return (u8 *)lock + 3 * IS_BUILTIN(CONFIG_CPU_BIG_ENDIAN); +} + +static inline void queued_spin_unlock(struct qspinlock *lock) +{ + /* release semantics is required */ + smp_store_release(__qspinlock_lock_byte(lock), 0); +} + +static inline int queued_spin_is_locked(struct qspinlock *lock) +{ + smp_mb(); + return atomic_read(&lock->val); +} + +#include + +/* we need override it as ppc has io_sync stuff */ +#undef arch_spin_trylock +#undef arch_spin_lock +#undef arch_spin_lock_flags +#undef arch_spin_unlock 
+#define arch_spin_trylock arch_spin_trylock +#define arch_spin_lock arch_spin_lock +#define arch_spin_lock_flags arch_spin_lock_flags +#define arch_spin_unlock arch_spin_unlock + +static inline int arch_spin_trylock(arch_spinlock_t *lock) +{ + CLEAR_IO_SYNC; + return queued_spin_trylock(lock); +} + +static inline void arch_spin_lock(arch_spinlock_t *lock) +{ + CLEAR_IO_SYNC; + queued_spin_lock(lock); +} + +static inline +void arch_spin_lock_flags(arch_spinlock_t *lock, unsigned long flags) +{ + CLEAR_IO_SYNC; + queued_spin_lock(lock); +} + +static inline void arch_spin_unlock(arch_spinlock_t *lock) +{ + SYNC_IO; + queued_spin_unlock(lock); +} +#endif /* _ASM_POWERPC_QSPINLOCK_H */ diff --git a/arch/powerpc/include/asm/spinlock.h b/arch/powerpc/include/asm/spinlock.h index 8c1b913..954099e 100644 --- a/arch/powerpc/include/asm/spinlock.h +++ b/arch/powerpc/include/asm/spinlock.h @@ -60,6 +60,23 @@ static inline bool vcpu_is_preempted(int cpu) } #endif +#if defined(CONFIG_PPC_SPLPAR) +/* We only yield to the hypervisor if we are in shared processor mode */ +#define SHARED_PROCESSOR (lppaca_shared_proc(local_paca->lppaca_ptr)) +extern void __spin_yield(arch_spinlock_t *lock); +extern void __rw_yield(arch_rwlock_t *lock); +#else /* SPLPAR */ +#define __spin_yield(x)barrier() +#define __rw_yield(x) barrier() +#define SHARED_PROCESSOR 0 +#endif + +#ifdef CONFIG_QUEUED_SPINLOCKS +#include +#else + +#define arch_spin_relax(lock) __spin_yield(lock) + static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock) { return lock.slock == 0; @@ -114,18 +131,6 @@ static inline int arch_spin_trylock(arch_spinlock_t *lock) * held. Conveniently, we have a word in the paca that holds this * value. 
*/ - -#if defined(CONFIG_PPC_SPLPAR) -/* We only yield to the hypervisor if we are in shared processor mode */ -#define SHARED_PROCESSOR (lppaca_shared_proc(local_paca->lppaca_ptr)) -extern void __spin_yield(arch_spinlock_t *lock); -extern void __rw_yield(arch_rwlock_t *lock); -#else /* SPLPAR */ -#define __spin_yield(x)barrier() -#define __rw_yield(x) barrier() -#define SHARED_PROCESSOR 0 -#endif - static inline void arch_spin_lock(arch_spinlock_t *lock) { CLEAR_IO_SYNC; @@ -203,6 +208,7 @@ static inline void arch_spin_unlock_wait(arch_spinlock_t *lock) smp_mb(); } +#endif /* !CONFIG_QUEUED_SPINLOCKS */ /* * Read-write spinlocks, allowing multiple readers * but only one writer. @@ -338,7 +344,6 @@ static inline void arch_write_unlock(arch_rwlock_t *rw) #define arch_read_lock_flags(lock, flags) arch_read_lock(lock) #define arch_write_lock_flags(lock, flags) arch_write_lock(lock) -#define arch_spin_relax(lock) __spin_yield(lock) #define arch_read_relax(lock) __rw_yield(lock) #define arch_writ
[PATCH v8 0/6] Implement qspinlock/pv-qspinlock on ppc
Hi All,

This is the fairlock (qspinlock) patchset. You can apply the patches and build successfully; they are based on linux-next.

qspinlock avoids the waiter-starvation issue. It has about the same speed in the single-thread case, and it can be much faster in high contention situations, especially when the spinlock is embedded within the data structure to be protected.

v7 -> v8:
	add one patch to drop a function call in the native qspinlock unlock path.
	Enabling qspinlock or not is a compile-time option now.
	rebase onto linux-next (4.9-rc7)
v6 -> v7:
	rebase onto 4.8-rc4
v1 -> v6:
	too many details. snip.

Some benchmark results below.

perf bench: these numbers are ops per sec, so the higher the better.

*** on pSeries with 32 vcpus, 32Gb memory, pHyp.

test case     | pv-qspinlock | qspinlock | current-spinlock
futex hash    | 618572       | 552332    | 553788
futex lock-pi | 364          | 364       | 364
sched pipe    | 78984        | 76060     | 81454

unix bench: these numbers are scores, so the higher the better.

on PowerNV with 16 cores (cpus) (smt off), 32Gb memory:
- pv-qspinlock and qspinlock have very similar results because pv-qspinlock uses the native version, which only adds one callback's overhead.

test case                             | pv-qspinlock and qspinlock | current-spinlock
Execl Throughput                         761.1    761.4
File Copy 1024 bufsize 2000 maxblocks   1259.8   1286.6
File Copy 256 bufsize 500 maxblocks      782.2    790.3
File Copy 4096 bufsize 8000 maxblocks   2741.5   2817.4
Pipe Throughput                         1063.2   1036.7
Pipe-based Context Switching             284.7    281.1
Process Creation                         679.6    649.1
Shell Scripts (1 concurrent)            1933.2   1922.9
Shell Scripts (8 concurrent)            5003.3   4899.8
System Call Overhead                     900.6    896.8
==
System Benchmarks Index Score           1139.3   1133.0
---

*** on pSeries with 32 vcpus, 32Gb memory, pHyp.
test case                             | pv-qspinlock | qspinlock | current-spinlock
Execl Throughput                         877.1    891.2    872.8
File Copy 1024 bufsize 2000 maxblocks   1390.4   1399.2   1395.0
File Copy 256 bufsize 500 maxblocks      882.4    889.5    881.8
File Copy 4096 bufsize 8000 maxblocks   3112.3   3113.4   3121.7
Pipe Throughput                         1095.8   1162.6   1158.5
Pipe-based Context Switching             194.9    192.7    200.7
Process Creation                         518.4    526.4    509.1
Shell Scripts (1 concurrent)            1401.9   1413.9   1402.2
Shell Scripts (8 concurrent)            3215.6   3246.6   3229.1
System Call Overhead                     833.2    892.4    888.1
System Benchmarks Index Score           1033.7   1052.5   1047.8

** on pSeries with 32 vcpus, 16Gb memory, KVM.

test case                             | pv-qspinlock | qspinlock | current-spinlock
Execl Throughput                         497.4    518.7    497.8
File Copy 1024 bufsize 2000 maxblocks   1368.8   1390.1   1343.3
File Copy 256 bufsize 500 maxblocks      857.7    859.8    831.4
File Copy 4096 bufsize 8000 maxblocks   2851.7   2838.1   2785.5
Pipe Throughput                         1221.9   1265.3   1250.4
Pipe-based Context S
[GIT PULL] Please pull powerpc/linux.git powerpc-4.9-7 tag
Hi Linus, Please pull what is hopefully the last batch of powerpc fixes for 4.9. The following changes since commit 984d7a1ec67ce3a46324fa4bcb4c745bbc266cf2: powerpc/mm: Fixup kernel read only mapping (2016-11-25 14:18:25 +1100) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git tags/powerpc-4.9-7 for you to fetch changes up to dadc4a1bb9f0095343ed9dd4f1d9f3825d7b3e45: powerpc/64: Fix placement of .text to be immediately following .head.text (2016-12-01 22:26:31 +1100) powerpc fixes for 4.9 #7 Four fixes, the first for code we merged this cycle and three that are also going to stable: - On 64-bit Book3E we were not placing the .text section where we said we would in the asm. - We broke building the boot wrapper on some 32-bit toolchains. - Lazy icache flushing was broken on pre-POWER5 machines. - One of the error paths in our EEH code would lead to a deadlock. Thanks to: Andrew Donnellan, Ben Hutchings, Benjamin Herrenschmidt, Nicholas Piggin. Andrew Donnellan (1): powerpc/eeh: Fix deadlock when PE frozen state can't be cleared Ben Hutchings (1): powerpc/boot: Fix build failure in 32-bit boot wrapper Benjamin Herrenschmidt (1): powerpc/mm: Fix lazy icache flush on pre-POWER5 Nicholas Piggin (1): powerpc/64: Fix placement of .text to be immediately following .head.text arch/powerpc/boot/Makefile| 3 ++- arch/powerpc/boot/opal.c | 2 +- arch/powerpc/kernel/eeh_driver.c | 4 +++- arch/powerpc/kernel/vmlinux.lds.S | 9 + arch/powerpc/mm/hash64_4k.c | 2 +- arch/powerpc/mm/hash64_64k.c | 4 ++-- 6 files changed, 18 insertions(+), 6 deletions(-) signature.asc Description: PGP signature
Re: [RFC][PATCH] powerpc/oops: Provide disassembly on OOPS
Balbir Singh writes: > This patch is tied to xmon, it can be refactored out > better later if required. The idea is to provide > disassembly using xmon so that when we get an OOPS > we see something like the following below > > ... > NIP [c063a230] lkdtm_WARNING+0x0/0x10 > LR [c063986c] lkdtm_do_action+0x3c/0x80 > Call Trace: > [c000ef1bbbc0] [c09a5804] printk+0x50/0x64 (unreliable) > [c000ef1bbbe0] [c0639cc0] direct_entry+0x100/0x1b0 > [c000ef1bbc70] [c043eb4c] full_proxy_write+0x8c/0x100 > [c000ef1bbcd0] [c028fe24] __vfs_write+0x54/0x1c0 > [c000ef1bbd80] [c0291138] vfs_write+0xc8/0x260 > [c000ef1bbdd0] [c0292c98] SyS_write+0x78/0x120 > [c000ef1bbe30] [c000b220] system_call+0x38/0xfc > Instruction dump: > c063a200 38630618 addir3,r3,1560 > c063a204 f8010010 std r0,16(r1) > c063a208 f821ffa1 stdur1,-96(r1) > c063a20c 4836b0f1 bl c09a52fc# panic+0x8/0x304 > c063a210 6000 nop > c063a214 6000 nop > c063a218 6000 nop > c063a21c 6042 ori r2,r2,0 > c063a220 0fe0 twi 31,r0,0 > c063a224 6000 nop > c063a228 6000 nop > c063a22c 6042 ori r2,r2,0 > c063a230 0fe0 twi 31,r0,0 > > NOTE: That the <> around the instruction that caused the > OOPS is now replaced with a following the disassembly > in the output. I think I'd prefer: c063a22c 6042 ori r2,r2,0 c063a230 0fe0 twi 31,r0,0 # <- nip c063a234 4e800020 blr Or maybe: c063a22c 6042 ori r2,r2,0 c063a230 0fe0 twi 31,r0,0 # Faulting instruction c063a234 4e800020 blr ? > An issue was raised if as to whether calling > xmon during OOPS can cause further issues? xmon has been used > robustly in the past to look at OOPS and disassemble them > and moreover the OOPS output is at the end, so we've already > captured the GPR's and stack trace already. Once it's refactored properly you won't be calling xmon at all, you'll just be calling the disassembly code. The problem we have is that currently print_insn_powerpc() is built using nonstdio.h, which means it is calling xmon_printf(), and that's not what we want to do for an oops. 
An oops is printed using printk. So that will need more work. > NOTE2: If CONFIG_XMON_DISASSEMBLY is turned off, the disassembly > will be printed as a list of .long(s). It is highly recommended > to have both CONFIG_XMON_DISASSEMBLY and CONFIG_XMON for usable > output. So once it's refactored CONFIG_XMON_DISASSEMBLY would become CONFIG_PPC_DISASSEMBLY (or something like that), and there'd be no dependency on CONFIG_XMON. And so there'd be no fallback to printing longs, you'd either print the old compact format, or the disassembled format. cheers
Re: [PATCH RFC 3/3] powerpc/64: Enable use of radix MMU under hypervisor on POWER9
On Mon, Dec 05, 2016 at 07:55:32PM +1100, Benjamin Herrenschmidt wrote: > On Mon, 2016-12-05 at 19:04 +1100, Paul Mackerras wrote: > > + vec5 = of_get_flat_dt_prop(chosen, "ibm,architecture-vec-5", &size); > > + if (!vec5 || size <= OV5_INDX(OV5_MMU_RADIX_300)) > > + return; > > Could be bike shedding but shouldn't we first check if > we are in an LPAR and bail out of we are not, then > if we *are* and the above size is too small to contain > the ARCH 3.00 options, also disable radix as obviously > the hypervisor doesn't know about it ? This is *very* early on, so early that we haven't yet decided what platform we're on. If we're not in an LPAR then we won't have a /chosen/ibm-architecture-vec-5 property. Any hypervisor that is too old to have that property will also be too old to set the radix bit in the ibm,pa-features property, so we won't use radix. If we do have the property but it's short then yes that's a good indication that the hypervisor can't do radix, though in that case it's strange that it set the radix bit in the ibm,pa-features property (which must have been set otherwise we wouldn't have got here). I'll do a new patch. > > + if (!(vec5[OV5_INDX(OV5_MMU_RADIX_300)] & > > OV5_FEAT(OV5_MMU_RADIX_300))) > > + /* Hypervisor doesn't support radix */ > > + cur_cpu_spec->mmu_features &= ~MMU_FTR_TYPE_RADIX; > > +} > > + Paul.
Re: [PATCH RFC 3/3] powerpc/64: Enable use of radix MMU under hypervisor on POWER9
On Mon, 2016-12-05 at 19:04 +1100, Paul Mackerras wrote:
> + vec5 = of_get_flat_dt_prop(chosen, "ibm,architecture-vec-5", &size);
> + if (!vec5 || size <= OV5_INDX(OV5_MMU_RADIX_300))
> + return;

Could be bike shedding, but shouldn't we first check whether we are in an LPAR and bail out if we are not? Then, if we *are* and the above size is too small to contain the ARCH 3.00 options, also disable radix, as the hypervisor obviously doesn't know about it.

> + if (!(vec5[OV5_INDX(OV5_MMU_RADIX_300)] &
> OV5_FEAT(OV5_MMU_RADIX_300)))
> + /* Hypervisor doesn't support radix */
> + cur_cpu_spec->mmu_features &= ~MMU_FTR_TYPE_RADIX;
> +}
> +
Re: [PATCH] PPC: sstep.c: Add modsw, moduw instruction emulation
On 2016/12/04 10:25PM, PrasannaKumar Muralidharan wrote: > Add modsw and moduw instruction emulation support to analyse_instr. > > Signed-off-by: PrasannaKumar Muralidharan Hi Prasanna, Thanks for the patch! A few minor comments below... > --- > arch/powerpc/lib/sstep.c | 9 + > 1 file changed, 9 insertions(+) > > diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c > index 9c78a9c..5acef72 100644 > --- a/arch/powerpc/lib/sstep.c > +++ b/arch/powerpc/lib/sstep.c > @@ -1148,6 +1148,15 @@ int __kprobes analyse_instr(struct instruction_op *op, > struct pt_regs *regs, > (int) regs->gpr[rb]; > goto arith_done; > > + case 779: /* modsw */ > + regs->gpr[rd] = (int) regs->gpr[ra] % > + (int) regs->gpr[rb]; > + goto arith_done; Since these instructions don't update CR, you can directly goto instr_done. > + > + case 267: /* moduw */ Please move this case further up so that the extended opcodes are in numerical order. - Naveen
[PATCH RFC 3/3] powerpc/64: Enable use of radix MMU under hypervisor on POWER9
To use radix as a guest, we first need to tell the hypervisor via the ibm,client-architecture-support call that we support POWER9 and architecture v3.00, and that we can do either radix or hash and that we would like to choose later using an hcall (the H_REGISTER_PROC_TBL hcall). Then we need to check whether the hypervisor agreed to us using radix. We need to do this very early on in the kernel boot process before any of the MMU initialization is done. If the hypervisor doesn't agree, we can't use radix and therefore clear the radix MMU feature bit. Later, when we have set up our process table, which points to the radix tree for each process, we need to install that using the H_REGISTER_PROC_TBL hcall. Signed-off-by: Paul Mackerras --- arch/powerpc/include/asm/book3s/64/mmu.h | 2 ++ arch/powerpc/include/asm/hvcall.h | 11 +++ arch/powerpc/include/asm/prom.h | 9 + arch/powerpc/kernel/prom_init.c | 18 +- arch/powerpc/mm/init_64.c | 28 arch/powerpc/mm/pgtable-radix.c | 2 ++ arch/powerpc/platforms/pseries/lpar.c | 29 + 7 files changed, 98 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h index 8afb0e0..e8cbdc0 100644 --- a/arch/powerpc/include/asm/book3s/64/mmu.h +++ b/arch/powerpc/include/asm/book3s/64/mmu.h @@ -138,5 +138,7 @@ static inline void setup_initial_memory_limit(phys_addr_t first_memblock_base, extern int (*register_process_table)(unsigned long base, unsigned long page_size, unsigned long tbl_size); +extern void radix_init_pseries(void); + #endif /* __ASSEMBLY__ */ #endif /* _ASM_POWERPC_BOOK3S_64_MMU_H_ */ diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h index 77ff1ba..54d11b3 100644 --- a/arch/powerpc/include/asm/hvcall.h +++ b/arch/powerpc/include/asm/hvcall.h @@ -276,6 +276,7 @@ #define H_GET_MPP_X 0x314 #define H_SET_MODE 0x31C #define H_CLEAR_HPT 0x358 +#define H_REGISTER_PROC_TBL 0x37C #define H_SIGNAL_SYS_RESET 0x380 #define MAX_HCALL_OPCODE 
H_SIGNAL_SYS_RESET @@ -313,6 +314,16 @@ #define H_SIGNAL_SYS_RESET_ALL_OTHERS -2 /* >= 0 values are CPU number */ +/* Flag values used in H_REGISTER_PROC_TBL hcall */ +#define PROC_TABLE_OP_MASK 0x18 +#define PROC_TABLE_DEREG 0x10 +#define PROC_TABLE_NEW 0x18 +#define PROC_TABLE_TYPE_MASK 0x06 +#define PROC_TABLE_HPT_SLB 0x00 +#define PROC_TABLE_HPT_PT 0x02 +#define PROC_TABLE_RADIX 0x04 +#define PROC_TABLE_GTSE0x01 + #ifndef __ASSEMBLY__ /** diff --git a/arch/powerpc/include/asm/prom.h b/arch/powerpc/include/asm/prom.h index e6d83d0..8af2546 100644 --- a/arch/powerpc/include/asm/prom.h +++ b/arch/powerpc/include/asm/prom.h @@ -121,6 +121,8 @@ struct of_drconf_cell { #define OV1_PPC_2_06 0x02/* set if we support PowerPC 2.06 */ #define OV1_PPC_2_07 0x01/* set if we support PowerPC 2.07 */ +#define OV1_PPC_3_00 0x80/* set if we support PowerPC 3.00 */ + /* Option vector 2: Open Firmware options supported */ #define OV2_REAL_MODE 0x20/* set if we want OF in real mode */ @@ -155,6 +157,13 @@ struct of_drconf_cell { #define OV5_PFO_HW_842 0x1140 /* PFO Compression Accelerator */ #define OV5_PFO_HW_ENCR0x1120 /* PFO Encryption Accelerator */ #define OV5_SUB_PROCESSORS 0x1501 /* 1,2,or 4 Sub-Processors supported */ +#define OV5_XIVE_EXPLOIT 0x1701 /* XIVE exploitation supported */ +#define OV5_MMU_RADIX_300 0x1880 /* ISA v3.00 radix MMU supported */ +#define OV5_MMU_HASH_300 0x1840 /* ISA v3.00 hash MMU supported */ +#define OV5_MMU_SEGM_RADIX 0x1820 /* radix mode (no segmentation) */ +#define OV5_MMU_PROC_TBL 0x1810 /* hcall selects SLB or proc table */ +#define OV5_MMU_SLB0x1800 /* always use SLB */ +#define OV5_MMU_GTSE 0x1808 /* Guest translation shootdown */ /* Option Vector 6: IBM PAPR hints */ #define OV6_LINUX 0x02/* Linux is our OS */ diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c index ec47a93..358d43f 100644 --- a/arch/powerpc/kernel/prom_init.c +++ b/arch/powerpc/kernel/prom_init.c @@ -649,6 +649,7 @@ static void __init 
early_cmdline_parse(void) struct option_vector1 { u8 byte1; u8 arch_versions; + u8 arch_versions3; } __packed; struct option_vector2 { @@ -691,6 +692,9 @@ struct option_vector5 { u8 reserved2; __be16 reserved3; u8 subprocessors; + u8 byte22; + u8 intarch; + u8 mmu; } __packed; struct option_vector6 { @@ -700,7 +704,7 @@ struct option_vector6 { } __packed; struct ibm_arch_vec { - struct { u32 mask, val; } pvrs[10]; +
[PATCH RFC 2/3] powerpc/64: Always enable radix support for 64-bit Book 3S kernels
This removes the ability for the user to choose whether or not to include support for the radix MMU in kernels built to run on 64-bit Book 3S machines. Excluding radix support saves only about 25kiB of text and 13kiB of data, a total of little over half a page. Having the option expands the space of option combinations that need to be tested, which is an ongoing burden on developers, as well as increasing the number of #ifdefs in the code. Given that the space savings are small, let's remove the option. Signed-off-by: Paul Mackerras --- arch/powerpc/platforms/Kconfig.cputype | 7 +-- 1 file changed, 1 insertion(+), 6 deletions(-) diff --git a/arch/powerpc/platforms/Kconfig.cputype b/arch/powerpc/platforms/Kconfig.cputype index ca2da30..52a71ca 100644 --- a/arch/powerpc/platforms/Kconfig.cputype +++ b/arch/powerpc/platforms/Kconfig.cputype @@ -333,13 +333,8 @@ config PPC_STD_MMU_64 depends on PPC_STD_MMU && PPC64 config PPC_RADIX_MMU - bool "Radix MMU Support" + def_bool y depends on PPC_BOOK3S_64 - default y - help - Enable support for the Power ISA 3.0 Radix style MMU. Currently this - is only implemented by IBM Power9 CPUs, if you don't have one of them - you can probably disable this. config PPC_MMU_NOHASH def_bool y -- 2.7.4
[PATCH 1/3] powerpc/64: Fixes for the ibm,client-architecture-support options
This fixes the values for some of the option vector 5 bits in the ibm,client-architecture-support vector 5. The "platform facilities options" bits are in byte 17 not byte 14, so the upper 8 bits of their definitions need to be 0x11 not 0x0E. The "sub processor support" option is in byte 21 not byte 15. When checking whether option bits are set, we should check that the offset of the byte being checked is less than the vector length that we got from the hypervisor. Signed-off-by: Paul Mackerras --- arch/powerpc/include/asm/prom.h | 8 arch/powerpc/platforms/pseries/firmware.c | 2 +- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/include/asm/prom.h b/arch/powerpc/include/asm/prom.h index 5e57705..e6d83d0 100644 --- a/arch/powerpc/include/asm/prom.h +++ b/arch/powerpc/include/asm/prom.h @@ -151,10 +151,10 @@ struct of_drconf_cell { #define OV5_XCMO 0x0440 /* Page Coalescing */ #define OV5_TYPE1_AFFINITY 0x0580 /* Type 1 NUMA affinity */ #define OV5_PRRN 0x0540 /* Platform Resource Reassignment */ -#define OV5_PFO_HW_RNG 0x0E80 /* PFO Random Number Generator */ -#define OV5_PFO_HW_842 0x0E40 /* PFO Compression Accelerator */ -#define OV5_PFO_HW_ENCR0x0E20 /* PFO Encryption Accelerator */ -#define OV5_SUB_PROCESSORS 0x0F01 /* 1,2,or 4 Sub-Processors supported */ +#define OV5_PFO_HW_RNG 0x1180 /* PFO Random Number Generator */ +#define OV5_PFO_HW_842 0x1140 /* PFO Compression Accelerator */ +#define OV5_PFO_HW_ENCR0x1120 /* PFO Encryption Accelerator */ +#define OV5_SUB_PROCESSORS 0x1501 /* 1,2,or 4 Sub-Processors supported */ /* Option Vector 6: IBM PAPR hints */ #define OV6_LINUX 0x02/* Linux is our OS */ diff --git a/arch/powerpc/platforms/pseries/firmware.c b/arch/powerpc/platforms/pseries/firmware.c index ea7f09b..7d67623 100644 --- a/arch/powerpc/platforms/pseries/firmware.c +++ b/arch/powerpc/platforms/pseries/firmware.c @@ -126,7 +126,7 @@ static void __init fw_vec5_feature_init(const char *vec5, unsigned long len) index = 
OV5_INDX(vec5_fw_features_table[i].feature); feat = OV5_FEAT(vec5_fw_features_table[i].feature); - if (vec5[index] & feat) + if (index < len && (vec5[index] & feat)) powerpc_firmware_features |= vec5_fw_features_table[i].val; } -- 2.7.4